Monday, September 16, 2013

XOFilmmaker: Audio re-saving via Python

Wahoo for progress!

I just spent about an hour and a half tonight (a short session!) aiming to solve one specific problem for XOFilmmaker. Good news: problem solved!

A couple of weeks back I took my iPad home to Canada and read through a bunch of the code for the existing OLPC Activities "Record" and "Jukebox", both of which use GStreamer heavily. That's the same library I'm using for this XOFilmmaker Activity.

Anyhow, as you might expect from previous blog posts, I've been running into a stack of problems with GStreamer: the learning curve is pretty steep, and the documentation hasn't been the most glorious. So I went looking for nice examples inside these two existing activities. In Record specifically, I found a pretty fabulous function called "gst.parse_launch()" which lets you take a gst-launch command-line pipeline description, hand it straight to GStreamer from Python, and have it "run the pipeline" for you... a GREAT way to splice together my final edits in the XOFilmmaker Activity, I reckon!
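To make the idea concrete, here's the shape of what parse_launch() lets you do. This is essentially the playback version I mention in the aside further down (the full, tested save-to-disk script is later in this post), and as I found out tonight, you also need a running gobject MainLoop or nothing happens:

=====================================================
# Minimal sketch: hand a gst-launch-style pipeline string straight to GStreamer
import gobject; gobject.threads_init()
import pygst; pygst.require("0.10")
import gst

pipeline = gst.parse_launch(
    'filesrc location=/home/mjutan/Downloads/high_1.ogg ! oggdemux name=demuxer '
    'demuxer. ! queue ! vorbisdec ! audioconvert ! audioresample ! autoaudiosink '
    'demuxer. ! queue ! theoradec ! ffmpegcolorspace ! autovideosink')
pipeline.set_state(gst.STATE_PLAYING)

# Without a running MainLoop the pipeline never actually does anything
loop = gobject.MainLoop(is_running=True)
try:
    loop.run()
except KeyboardInterrupt:
    loop.quit()
=====================================================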

So my goal tonight was to see if I could get this to work... and the answer is... YES!!!!!!!!!


I started with the line I knew I had working last time:
gst-launch filesrc location=$PWD/high_1.ogg ! oggdemux name="demuxer" \
demuxer. ! queue ! vorbisdec ! audioconvert ! audioresample ! autoaudiosink \
demuxer. ! queue ! theoradec ! ffmpegcolorspace ! autovideosink
This line successfully opened an ogg video file (created with the Video Camera on the XO Laptop), demuxed the audio and video, and then sent them out to an "audiosink" and "videosink" -- basically, playing the video file (with audio attached) to the screen.

On first attempt, it didn't work. It seemed like nothing was happening.

So I simplified it, first reducing the line so that it worked on the command line and just output the audio:
gst-launch filesrc location=$PWD/high_1.ogg ! oggdemux name="demuxer" \
demuxer. ! queue ! vorbisdec ! audioconvert ! audioresample ! autoaudiosink
Trying this again in Python, I still got nothing. Weird. So I tried a command-line save-to-disk operation... essentially unpacking the audio from the audio/video ogg file, then saving only the audio track back out to a new file. This was a success on the command line:
# Audio demux, then mux again and save as a new file
gst-launch filesrc location=$PWD/high_1.ogg ! oggdemux name="demuxer" \
demuxer. ! queue ! vorbisdec ! audioconvert ! audioresample ! vorbisenc ! oggmux ! filesink location=$PWD/audioOut1.ogg
This gave me an audio file on disk that I could play back:

> ogg123 audioOut1.ogg
And it worked!


Now, to convert it to Python. At first I still got nothing from the pipeline: a file was being created on disk, but it was 0 bytes. After some googling, it looked like I needed to start the gobject MainLoop -- without it, the pipeline never actually runs. Duh (I guess?!) So after some digging and experimenting, I got this to work!!!!!!!

=====================================================
#!/usr/bin/python
# threads_init() must be called before any other GStreamer work
import gobject; gobject.threads_init()
import pygst; pygst.require("0.10")
import gst

class Main:
    def __init__(self):
        '''
        Express this gst-launch code in a Python call:

        # 1st try, audio + video playback
        gst-launch filesrc location=$PWD/high_1.ogg ! oggdemux name="demuxer" \
          demuxer. ! queue ! vorbisdec ! audioconvert ! audioresample ! autoaudiosink \
          demuxer. ! queue ! theoradec ! ffmpegcolorspace ! autovideosink

        # 2nd try, audio only with re-encode and save to disk
        gst-launch filesrc location=$PWD/high_1.ogg ! oggdemux name="demuxer" \
          demuxer. ! queue ! vorbisdec ! audioconvert ! audioresample ! vorbisenc ! oggmux ! filesink location=$PWD/audioOut1.ogg
        '''

        # The MainLoop is what actually drives the pipeline -- without it, nothing happens
        mainLoop = gobject.MainLoop(is_running=True)

        filePath = "/home/mjutan/Downloads"

        # Hand the gst-launch pipeline description straight to GStreamer;
        # parse_launch() builds the whole Pipeline object for us
        muxline = gst.parse_launch('filesrc location=' + filePath + '/high_1.ogg'
            + ' ! oggdemux name=demuxer demuxer. ! queue ! vorbisdec ! audioconvert'
            + ' ! audioresample ! vorbisenc ! oggmux ! filesink location='
            + filePath + '/pythonAudioOut2.ogg')

        # Start the pipeline, then spin the MainLoop until Ctrl+C
        muxline.set_state(gst.STATE_PLAYING)

        while mainLoop.is_running():
            try:
                mainLoop.run()
            except KeyboardInterrupt:
                mainLoop.quit()

start = Main()
=====================================================


And here is my audio file, finally a non-zero size!

pythonAudioOut2.ogg, suckas!!!!!!!!!!!!

So why is this awesome?

The GREAT news here is that now I can work on the command line to figure out how to splice audio and video correctly using GNonLin/GStreamer. This gst.parse_launch() command removes the extra overhead of trying to set up a pipeline and catch pads and do all this other crazy crap. I just want to open up the input files, splice them together with the timings specified by the user, and output a single file all cut together correctly. It's probable that I can just programmatically create a big long gst-launch-style line which will generate a "Pipeline" for me. Then I just run the pipeline and BOOM, the completed, edited video is written to disk. (Aside: I tried the original audio-and-video playback line again once I had this final version of the code with the MainLoop run() method, and that worked too, playing back my video and audio together on the screen. Awesome.)
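Just to sketch what I mean by building that line programmatically: the helper function and the output filename below are made up for illustration, but the pipeline is the same re-encode chain from tonight, assembled from Python data instead of typed by hand.

=====================================================
# Hypothetical sketch: assemble the parse_launch line from Python data.
# The helper name and output filename are made up; the pipeline itself is
# the same demux -> decode -> re-encode -> mux chain from tonight.
import gobject; gobject.threads_init()
import pygst; pygst.require("0.10")
import gst

def build_audio_resave_line(in_path, out_path):
    # Build the whole gst-launch-style description as one string
    return ('filesrc location=%s ! oggdemux name=demuxer '
            'demuxer. ! queue ! vorbisdec ! audioconvert ! audioresample '
            '! vorbisenc ! oggmux ! filesink location=%s' % (in_path, out_path))

pipeline = gst.parse_launch(build_audio_resave_line(
    '/home/mjutan/Downloads/high_1.ogg',
    '/home/mjutan/Downloads/pythonAudioOut3.ogg'))
pipeline.set_state(gst.STATE_PLAYING)

# Same MainLoop pattern as the script above
loop = gobject.MainLoop(is_running=True)
try:
    loop.run()
except KeyboardInterrupt:
    loop.quit()
=====================================================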

How does this move me closer to actual video editing?

Good question. Well, it means I can now focus on getting the gst-launch syntax correct on the command line, and try to splice just 2 videos together. If I can get that to work, then doing the same thing from Python is already solved -- I just need to paste the line into this parse_launch() command and I'm all set.

What's next?

My goal for the next OLPC session is to get GStreamer splicing 2 videos together successfully on the command line, using different in and out points for each. If I can get that to work, then the large majority of the "unknown" section of this project will be COMPLETE!!! AHHHH man, I can't wait.
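For what it's worth, here's my current (completely untested) understanding of how that splicing might eventually look from Python using GNonLin. The element and property names are taken from GNonLin examples I've read, but the file paths, timings, and pad hookup are all assumptions on my part -- the command-line experiments come first:

=====================================================
#!/usr/bin/python
# UNTESTED guess at GNonLin splicing from Python (pygst 0.10).
# File paths and timings are made up for illustration.
import gobject; gobject.threads_init()
import pygst; pygst.require("0.10")
import gst

pipeline = gst.Pipeline()
comp = gst.element_factory_make("gnlcomposition", "comp")
pipeline.add(comp)

# Clip 1: seconds 2-5 of the first file, placed at the start of the timeline
clip1 = gst.element_factory_make("gnlfilesource")
clip1.set_property("location", "/home/mjutan/Downloads/high_1.ogg")
clip1.set_property("start", 0)
clip1.set_property("duration", 3 * gst.SECOND)
clip1.set_property("media-start", 2 * gst.SECOND)
clip1.set_property("media-duration", 3 * gst.SECOND)
comp.add(clip1)

# Clip 2: seconds 0-4 of the second file, placed right after clip 1
clip2 = gst.element_factory_make("gnlfilesource")
clip2.set_property("location", "/home/mjutan/Downloads/high_2.ogg")
clip2.set_property("start", 3 * gst.SECOND)
clip2.set_property("duration", 4 * gst.SECOND)
clip2.set_property("media-start", 0)
clip2.set_property("media-duration", 4 * gst.SECOND)
comp.add(clip2)

# The composition's source pad only shows up at runtime, so link it then
color = gst.element_factory_make("ffmpegcolorspace")
sink = gst.element_factory_make("autovideosink")
pipeline.add(color, sink)
color.link(sink)

def on_pad_added(comp, pad):
    pad.link(color.get_pad("sink"))
comp.connect("pad-added", on_pad_added)

pipeline.set_state(gst.STATE_PLAYING)
loop = gobject.MainLoop(is_running=True)
try:
    loop.run()
except KeyboardInterrupt:
    loop.quit()
=====================================================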

From there, I need to build a nice, simple UI that kids in developing nations can understand with next-to-no instruction. Ideally there will be no words in the app at all. For this app to be a real success on the XO Laptop, my feeling is that it needs to be extremely intuitive and simple to use. Once I've got this particularly troublesome part -- actually cutting the videos and splicing the audio together -- finally sorted out, I can move on to the exciting parts: starting to build the UI and figuring out a playback widget so kids can set the "in" and "out" points of each clip they want to use.

After that, I'll need to actually integrate it with the XO look-and-feel and with the files in the XO Journal. But that's for a later date. First I'll need to get the standalone GStreamer app working well. It's awesome to see this finally moving along though. I'm really happy with tonight's progress.
