Field Notes of an Audacious Amateur: ffmpeg

This series is written by a representative of the latter group, which is comprised mostly of what might be called "productivity users" (perhaps "tinkerly productivity users?"). Though my lack of training precludes me from writing code or improving anyone else's, I can, nonetheless, try and figure out creative ways of utilizing open source programs. And again, because of my lack of expertise, though I may be capable of deploying open source programs in creative ways, my modest technical acumen hinders me from utilizing those programs in what may be the most optimal ways. The open-source character, then, of this series, consists in my presentation to the community of open source users and programmers of my own crude and halting attempts at accomplishing computing tasks, in the hope that those who are more knowledgeable than me can offer advice, alternatives, and corrections. The desired end result is the discovery, through a communal process, of optimal and/or alternate ways of accomplishing the sorts of tasks that I and other open source productivity users need to perform.


Saturday, September 22, 2012

Addendum to the first installment: yet more on screencasting with ffmpeg

With a recent upgrade (apt-get dist-upgrade to 12.04, custom-built Ubuntu) to my office machine, I started having serious audio/video sync issues when producing screencasts using recordmydesktop. As you may (or may not) recall, I record lectures on my computer and use, as a sort of visual aid, an on-screen whiteboard, where I type key words or phrases about which I'm speaking: that's what I capture from my screen when I'm recording these screencasts. Well, with the updated recordmydesktop, the text on my on-screen whiteboard would begin to appear several seconds before the point in the audio where I was actually speaking about that word or phrase.

Oddly, the opposite was happening with the updated ffmpeg when running the screencasting incantation with which I'd earlier experimented. The on-screen text I was typing into the whiteboard was lagging a bit behind the audio.

I tried introducing a number of alternative switches into the commands I was issuing when using both recordmydesktop and ffmpeg. But to no avail: I couldn't get rid of the sync problems with either.

During the course of my searches aimed at resolving these issues, I ran across a crude script on the Ubuntu forums that someone had cobbled together, a script which uses ffmpeg, but which records video and audio separately, then joins the two streams, as a final step, into a single output file. I think this joining of an audio and a video file is called, in electronic multimedia circles, "muxing," by the way. I decided the script was worth a try.

And, what do you know, after figuring out how to use the script, my tests indicated that it kept audio and video in nearly perfect sync. Thus, my newly-appeared screencasting issues were resolved.

I hoped to solicit improvements to the script but have so far not managed to find much help. Probably the weirdest thing about this script, which likely demonstrates the inexperience of the script's creator, is the fact that, once ffmpeg is invoked and the recording begins, you're supposed to enter, into the same terminal where ffmpeg is running, a file name, then hit the "enter" key as the signal to stop the recording and begin the joining of the audio and video files.

All this while you're seeing in the terminal the standard ffmpeg prompt that tells you to hit control-c to stop the recording. Confusing, to say the least--and made even more confounding by the fact that you can't actually see the text you're entering when you go to type the file name.

Despite those shortcomings, since the results produced by this script exceed anything else I've been able to accomplish, I think I'm going to stick with the script for now. I have made a couple of tweaks, mainly so as to make it record what are called "lossless" files--files that are encoded without discarding any picture or sound data (they are still compressed, but only losslessly), and which are therefore quite large. I have to re-encode my output files before uploading them in any case, so it's best to start with better-quality files.

Without further ado, then, I present the tweaked version of the script I've found:

#!/bin/bash
#vzybilly
#these are temp files
aud="aud.flac"
vid="vid.mkv"
#grab audio & pid
ffmpeg -f alsa -ac 2 -i plughw:1,0 -acodec flac $aud &
audPID=$!
#grab screen & pid
ffmpeg -f x11grab -s "830x660" -r "24" -i :0.0+227,130+nomouse -threads 0 -vcodec libx264 -preset ultrafast -crf 0 $vid &
vidPID=$!
#wait, till name given (that means stop)
read -p "Stop by giving an Output video name?" out
#alternate, more coherent way of output file naming--requires zenity
#out="`zenity --entry --title="Video muxing script" --text="Please type a file name (sans extension) into the blank and click 'Ok' to stop the recording"`.mkv"
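#stop audio and video with pids (signal 2 is SIGINT, the same as control-c)
kill -n 2 $audPID
kill -n 2 $vidPID
#wait for both recorders to finish writing their files before muxing
wait $audPID $vidPID
echo "Saving to $out"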
#combine to the target output file
ffmpeg -i $aud -i $vid -acodec copy -vcodec copy "$out"
#purge the temp files
rm $aud
rm $vid

I should mention that I discovered a slightly less incoherent way of soliciting file-name input. Though it is commented out in the version of the script you see above, I intend to use this until such time as the script can be further improved (to use it you uncomment the line that begins out= and comment out the line that reads read -p: note that you must have zenity installed for this modification to work).

In a related item, as I've mentioned earlier, the disadvantage to using ffmpeg for screencasting is that there is no built-in provision for pausing. Well, apparently someone has proposed a kind of workaround for that--see this thread for further details. I've not tried that method and don't really understand how it works, so I cannot attest to its efficacy.

What I have tried is simply stopping, then restarting with a new file when a pause is necessary. That's definitely more cumbersome than pausing, and, furthermore, it requires the additional step of somehow joining what could be thought of as separate "vignettes" into a single "episode."

The good news I can report on that front is that I've found another script that was created precisely to join such separate files. I've tried some tests with it and it has worked for me quite well. It's called mmcat and it can be found here.
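For comparison, newer ffmpeg builds include a "concat" demuxer that can join such vignettes without re-encoding. The sketch below is not a description of how mmcat works; it only writes the list file that the demuxer reads, and it assumes all the parts were recorded with the same settings and that none of the file names contain single quotes. The function name make_concat_list and the file names are made up for illustration.

```shell
#!/bin/bash
# Print one "file '<name>'" line per part, in the order given.
# This is the list format read by ffmpeg's concat demuxer.
make_concat_list() {
    local f
    for f in "$@"; do
        printf "file '%s'\n" "$f"
    done
}

# Example: show the list for three hypothetical vignettes.
make_concat_list part1.mkv part2.mkv part3.mkv

# To use it, save the list and hand it to ffmpeg (not run here):
#   make_concat_list part1.mkv part2.mkv part3.mkv > parts.txt
#   ffmpeg -f concat -i parts.txt -c copy episode.mkv
```

Because the streams are copied rather than re-encoded, this should only be expected to work when every part uses the same codecs and frame size, which is the case when they all come from the same recording script.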

I'd like to post more about the plughw switch seen in the above script and which I needed to introduce in order to record through a new USB sound device I've added to my computer. But I don't really understand well what differentiates it from the more standard hw switch. So I won't speak to that matter further in this entry. :)

That about sums things up so far as recent screencasting developments on my front are concerned. Do you have any suggestions for improving the screencasting script I found? If so, please pipe in. Any other suggestions for pausing ffmpeg screencast recording? Please let me/us know.

Wednesday, July 18, 2012

Addendum to the first installment: more on screencasting with ffmpeg

Long time no blog. I got busy with a lot of other things, not least of which were various bicycling trips and associated maintenance projects. Amazing how much time and energy those things can take.

With the academic year approaching, coming up with a resolution to my screencast file-size problems has taken on renewed urgency. To reiterate, a recent major hardware upgrade on my computer led to a twofold increase in the size of screencast video files I produce on this machine--which in turn led to upload problems at my course web site (file size limits). It looks like the resolution may well be provided by ffmpeg.

So I began once again grappling in earnest with the increased file-size issue. I've actually even found a resolution, should I decide to switch over from the more capable recordmydesktop to the nice but somewhat feature-lacking ffmpeg.

Before launching into a description of that resolution, however, I need to make quick note of a new entry into the field of GNU/Linux screencasting applications. The new kid on the block is called Kazam. I've not yet been able to get it running--and I'm not in a big rush to do so, since I favor command-line tools. But it's received some praise and looks like a promising application. That noted, I now return to a description of the resolution I've discovered for the issue of screencast file sizes made using ffmpeg's x11grab.

First and foremost, I discovered a fairly lengthy thread on the Ubuntu forums that deals with screencasting using ffmpeg. It's chock full of all kinds of interesting tips. For example, a switch is documented there that allows you to exclude the mouse cursor from your screencast: you simply add +nomouse to the display part of the command you use to start your screencast, as in :0.0+nomouse. I'd definitely need that in order to use ffmpeg as the utility to record my lectures.

Also described there is a way of doing what might be called "pseudo pausing," which really means you just stop your recording at the point where you need to pause, then start up again with a new recording. You then need to concatenate the files--not anywhere near as convenient as recordmydesktop's pausing capability, but something that'll do in a pinch.

It was from that thread that I derived what seems like a pretty good resolution to my file size issues. What I've discovered is that I can record my files in a lossless format (mkv is recommended in that thread)--which results in pretty gigantic file sizes (ca. 13 megabytes per minute on this system)--then transcode them into flash format while dramatically decreasing the file size: a one-minute test file I made actually came in at about 1.2 megabytes after re-encoding using this method. Since what I need is a file that comes in at about 2 megabytes per minute, this could work well for my circumstances.

The key command that's allowed me to dramatically reduce screencast file sizes to an acceptable level, while at the same time converting to a format that works well at my hosting site, uses . . . you guessed it, ffmpeg. Here it is:

ffmpeg -i output.mkv -acodec aac -strict experimental -ab 128k -ac 2 -vcodec libx264 -vpre slow -wpredp 0 -crf 22 -threads 0 output.flv

I do still hope I can find a way of reliably shrinking the files recordmydesktop outputs, but my research on that has, to date, not been fruitful. There's a utility called oggResize that's supposed to allow you to easily resize .ogv files, but it puts the audio out of sync with the video. And it appears that tool is not under active development. It's likely resizing of .ogv files can also be done with ffmpeg, but I've not yet managed to arrive at the appropriate incantation for doing that. I'll keep digging though, and if I come up with a resolution, I'll post it in this blog.

As a final note on things I've learned in this round of screencast research, I should also mention a command-line tool called ffcast someone has developed. This looks like a bash script that first invokes a tool for selecting an area of the screen, then calls ffmpeg to record the screencast. I've also not tried that one but it's on my list to look into in the future.

POSTSCRIPT: I decided to run on an .ogv file created by recordmydesktop the same command I'd used to shrink the lossless .mkv file to see what the results would be. I'm pleased to report that the results are positive: the .ogv file, when re-encoded as an .flv using that method, shrank to about one third the size (the 1 minute .ogv test file was 4.2 megabytes and it came in as a 1.2 megabyte .flv file after re-encoding). So I can continue using recordmydesktop to record my lecture screencasts--albeit with the penalty of having to do an additional round of encoding/re-encoding (recordmydesktop already takes quite some time to encode a screencast into .ogv format unless you use the --on-the-fly-encoding switch, which I'm now likely to start using). So, the command I used to re-encode the .ogv file to .flv is:

ffmpeg -i output.ogv -acodec aac -strict experimental -ab 128k -ac 2 -vcodec libx264 -vpre slow -wpredp 0 -crf 22 -threads 0 output.flv

MUCH LATER POSTSCRIPT: I've discovered that ogv files work much better at my hosting site than do flv files, so I began looking into ways of transcoding/shrinking to that format. The incantation I discovered, through trial and error, and which reduces screencast file sizes to about 1.5 megabytes per minute (roughly one seventh of the original, lossless, size), is as follows:

ffmpeg -i infile.mkv -r 13 -acodec libvorbis -ab 48k -ac 2 -vcodec libtheora -preset slow -wpredp 0 -crf 22 -threads 0 outfile.ogv

There are two crucial bits here. The -ab 48k option is one of them: that reduces the audio bit rate, which brings the file size down by quite a lot. The other is the -r 13 option, which reduces the frame rate to 13 frames per second and thus shrinks the file size yet further. A 48 minute long lossless screencast I created, for example, was reduced to a 73 megabyte ogv file using this incantation.
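The per-minute figure is easy to check for any recording. A minimal sketch, assuming awk is available for the decimal arithmetic; the function names mb_per_minute and max_minutes are made up, and the 100 megabyte upload limit is only a placeholder, since the actual limit isn't stated here.

```shell
#!/bin/bash
# Megabytes per minute for a recording.
# Arguments: file size in megabytes, length in minutes.
mb_per_minute() {
    LC_ALL=C awk -v size="$1" -v mins="$2" 'BEGIN { printf "%.2f\n", size / mins }'
}

# Longest recording, in whole minutes, that fits under an upload limit.
# Arguments: limit in megabytes, rate in megabytes per minute.
max_minutes() {
    LC_ALL=C awk -v limit="$1" -v rate="$2" 'BEGIN { printf "%d\n", int(limit / rate) }'
}

mb_per_minute 73 48    # the 48-minute, 73-megabyte example: prints 1.52
max_minutes 100 1.5    # placeholder 100 MB limit at 1.5 MB per minute: prints 66
```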

Monday, April 9, 2012

Addendum to the first installment: alternate screencast applications

As you may recall, in the inaugural entry for this blog I described a crude mock-up I'd cobbled together, using Linux utilities, for producing screencast video lectures along the lines of those found on the khanacademy web site. I recently had occasion to explore some alternate screencast applications, since ogv files ceased playing back sanely on one of my systems. That issue was ultimately addressed by "upgrading" the video card in my aging Pentium 4 computer (with an old 128 MB nvidia card I got for $15). But not before I investigated some alternatives to recordmydesktop.

Of course there are various options for screencasting in Linux: I've read about Istanbul, Byzanz, vnc2flv, and some others, for example. I've so far gravitated toward recordmydesktop because it "just worked" and because it can be used from the command-line. But because of the video card issue mentioned above, I did recently decide to explore a couple of other alternatives which I'll write about in this entry, namely xvidcap and ffmpeg.

I should mention at the outset that, as those who read the first installment of this blog will be aware, I need something that not only records a video of a part of the desktop (actually, a particular application window in my case), but that allows for sound recording as well: after all, these are lecture videos I'm producing. So any screencast application that does not have built-in sound recording isn't going to work for my purposes.

Screencasting with ffmpeg

Most readers of this blog will have heard of the indispensable video-manipulating program ffmpeg. It's possible to do a truly amazing amount of manipulation and creation of videos using that program. And, as it turns out, it's even possible to do screencasting with it (in fact, ffmpeg may actually be used "behind the scenes" by recordmydesktop--though I'm not fully certain about that).

Some recent research on the web resulted for me in some successful experiments in screencasting with ffmpeg. The main ingredient in this screencasting capability is an input format called x11grab, selected with the -f switch. Run from the command line, an ffmpeg screen/sound capture session would be invoked something like this:

ffmpeg -f alsa -ac 2 -ab 48k -i hw:0,0 -f x11grab -r 20 -s 800x600 -i :0.0+227,130 -acodec libmp3lame -vcodec libx264 -vpre lossless_ultrafast -threads 0 output.avi

That command will output an avi file called, appropriately enough, output.avi. The captured area is 800x600 pixels, and its upper-left corner sits 227 pixels from the left edge of the screen and 130 pixels from the top edge (about where I open the application I wish to record on this machine). The sound portion is encoded as mp3 by libmp3lame--something I discovered reduces the final size of the file considerably.

I believe ffmpeg can output just about any video format, including ogg theora, as recordmydesktop does. For an ogg theora file, the -vcodec argument would, I believe, need to be changed to -vcodec libtheora, while the extension of the output file's name would need, obviously, to be changed to .ogv. Presumably the -vpre lossless_ultrafast part, being an x264 preset, would have to be dropped as well.
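To make the geometry part of that command easier to adjust, the size and offsets can be pulled out into parameters. The following is only a sketch that prints the command rather than running it; the function names x11grab_input and capture_command are invented for illustration, and the +nomouse suffix is the one from the Ubuntu forums thread discussed elsewhere on this page.

```shell
#!/bin/bash
# Build the x11grab input string: display, then +x,y offsets of the
# upper-left corner, then an optional +nomouse to hide the cursor.
# Arguments: display, x offset, y offset, [nomouse]
x11grab_input() {
    local spec="$1+$2,$3"
    if [ "${4:-}" = "nomouse" ]; then
        spec="$spec+nomouse"
    fi
    echo "$spec"
}

# Print (not run) the capture command for a given area.
# Arguments: width, height, x offset, y offset, output file
capture_command() {
    echo "ffmpeg -f alsa -ac 2 -ab 48k -i hw:0,0 -f x11grab -r 20 -s ${1}x${2} -i $(x11grab_input :0.0 "$3" "$4") -acodec libmp3lame -vcodec libx264 -vpre lossless_ultrafast -threads 0 $5"
}

capture_command 800 600 227 130 output.avi
x11grab_input :0.0 227 130 nomouse
```

Printing the command first makes it possible to eyeball the numbers before committing to a recording.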

As those who are familiar with ffmpeg will know, I'm barely scratching the surface of the tip of the iceberg here as far as its capabilities go. In fact, ffmpeg's man page has got to be one of the most voluminous and daunting of them, and it's only within the last couple of years that I've begun to be able to make any sense of it. But don't presume that I mined the information above mostly from the man page: rather, I found a working sample command on the web, then went back to the man page to try and better understand how it works and how I might tweak it for my purposes. I'm really still on a very rudimentary level when it comes to understanding and using ffmpeg.

Now, though this ffmpeg solution works quite well, it's turned out to not really be usable for me. This is because, as great as it is, the ffmpeg process cannot be paused while it's running: you either have to stop it and resume anew if you have some reason to pause, or else you'll need to edit the resulting video and cut out the extraneous portions. In case it's not apparent, it's really useful during a lecture to be able to pause and resume.

So, as well as it worked in my experiments, I decided that, until I can come up with some way to pause the process, then resume, ffmpeg is not going to work for me as well as recordmydesktop does.

Screencasting with xvidcap

Which brings me to the next screencasting application, xvidcap. There's a lot to like about xvidcap. It's a graphical application, but with a very minimal interface--just the type of gui application that appeals to me. It draws a nice, visible perimeter around the area you're recording, and the perimeter can be easily manipulated with the mouse to enclose whatever quadrangular area you'd like. And it does allow for pausing and resuming of the screencast.

A screenshot of xvidcap (running on someone else's machine)

Furthermore, it's able to output a few video formats, among which are mpeg and flv. It looks like a great application and I really wish I could use it. But there's a major problem with sound: you see, xvidcap was developed during an era when oss (Open Sound System) was the standard Linux sound system, but oss is now deprecated in favor of alsa (Advanced Linux Sound Architecture).

In theory, it's possible to emulate oss with alsa. That's what you'll read on the internet, anyway. But the fact of the matter is that, as Linux moves further and further away from the old oss architecture, emulation of it has become harder and harder. I spent quite a few hours trying to implement it on this system, so happy was I with what I'd seen of the video-capturing end of xvidcap. I tried a number of purported solutions I found in my web searches. But none of them enabled me to record sound alongside the video I was capturing with xvidcap. So, I've reluctantly given up on it for now.

Those are the two alternate screencasting programs I've toyed with lately, and with mixed results. Both seem to have their advantages as compared to recordmydesktop, but in the end, it looks as though neither is going to be able to displace it. If you have any sort of pointers for addressing either of the issues I experienced in my attempts to use those applications, I'd be most delighted to hear about them. So please do offer your input.

Segmenting video files

To wrap up this posting, I need first of all to confide that I finally did "upgrade" my old work PC. I replaced the aging Pentium 4 with a not-quite-so elderly dual core machine. So video recording, and especially playback, goes quite a bit more smoothly on this "new" machine.

One odd thing has cropped up regarding video, though. The problem is that, despite the fact that I use the same exact recordmydesktop command to record my screencasts, the resulting files are now almost twice as large as they had been on the old machine: instead of being on the order of 2 megabytes per minute, they're now closer to 4 megabytes per minute.

My suspicion is that the video hardware could be to blame, since I had a pretty old 32 megabyte ATI (PCI) video card in the old machine, while the newer machine has an nVidia 128 megabyte card (PCI-express). I've tried switching video card modules from nv to vesa, but that doesn't seem to affect the resulting video size.

This becomes a problem because I have a file size limit on the site where I must upload my lectures, and they're now routinely going to exceed the size limit. So, until I can figure out how to reduce the video file sizes back closer to what they were, I've had to come up with a work-around--namely to split the video files into parts that, individually, do not exceed the size limit. And for this, ffmpeg once again comes to the rescue.

In order to split my lecture files, I'm using the following command:

ffmpeg -ss 00:00:00 -i in.ogv -t 00:30:00 -vcodec copy -acodec copy out.ogv

What that does is to start at the beginning of the file (the -ss 00:00:00 part), go 30 minutes into the file (the -t 00:30:00 part), then copy that section to a new file called out.ogv. So, for an hour-long lecture, after having done that to split off the first 30 minutes into a new file, the same command would be run again, except that -ss 00:00:00 would be replaced by -ss 00:30:00. Then, the two parts could be appropriately named (with part1 or part2 in the name, as appropriate) and uploaded.
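The run-it-again-with-a-later-start step can be generated rather than typed by hand. A minimal sketch, which prints the commands instead of running them so they can be checked first; the function names hms and split_commands and the .partN naming scheme are made up for illustration.

```shell
#!/bin/bash
# Convert a number of seconds to HH:MM:SS.
hms() {
    printf '%02d:%02d:%02d' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Print one ffmpeg command per part.
# Arguments: input file, total length in minutes, part length in minutes.
# A final part shorter than the part length still gets the full -t value;
# ffmpeg should simply stop at the end of the input.
split_commands() {
    local src="$1" total=$(( $2 * 60 )) chunk=$(( $3 * 60 ))
    local base="${src%.*}" ext="${src##*.}" start=0 part=1
    while (( start < total )); do
        echo "ffmpeg -ss $(hms "$start") -i $src -t $(hms "$chunk") -vcodec copy -acodec copy $base.part$part.$ext"
        start=$(( start + chunk ))
        part=$(( part + 1 ))
    done
}

# An hour-long lecture in 30-minute parts.
split_commands in.ogv 60 30
```

Piping the output to bash (split_commands in.ogv 60 30 | bash) would then run the commands, assuming the file name contains no spaces.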

That's all for this entry. If anyone has tips or recommendations about what's been discussed above or about anything else related to screencasting on Linux, please pipe in.

Afterthought: here's a link that offers some technical details on the sound hardware in my new computer: http://people.atrpms.net/~pcavalcanti/alsa-1.0.15rc2_snd-hda-intel.html#final . I was looking at that as I was trying to work out how I might get xvidcap functioning on this system.
