When I last wrote about our experiments with the <audio> element in part V, I had successfully written a patch to add a new DOM event (onAudioWritten), which is dispatched every time a frame of audio is decoded (this works for both <audio> and <video>). I make this raw data available via a list-type object called MozAudioData, which exposes .length and .item() for accessing the data in script (I’m hoping to move to the new WebGL arrays as soon as Vlad lands them). As I finished up for the holidays, I made a very crude but promising demo that proved things were working:
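To make the shape of the API concrete, here is a rough sketch of consuming the event in script. Only onAudioWritten, MozAudioData, .length, and .item() come from the patch as described above; the frameToArray helper and the mozAudioData property name on the event object are my own illustration and are likely to change:

```javascript
// Copy a MozAudioData-style list (anything exposing .length and .item(i))
// into a plain JS array that we can analyze or visualize further.
function frameToArray(frame) {
  const samples = [];
  for (let i = 0; i < frame.length; i++) {
    samples.push(frame.item(i));
  }
  return samples;
}

// In the browser, usage would look something like this
// (property name below is hypothetical):
//
// const audio = document.getElementsByTagName('audio')[0];
// audio.addEventListener('onAudioWritten', function (e) {
//   const samples = frameToArray(e.mozAudioData); // hypothetical accessor
//   // ...run an FFT, draw a visualization, etc...
// }, false);
```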
I want to do some more work to see what moving the FFT into C++ would mean for performance. Our goal is to make even more complex analysis and visualization possible in script (imagine real-time speech-to-text coming out of <video>). I’m also starting to wonder how best to add write capabilities to go with this read-only version: imagine being able to do equalizing, filtering, etc. on the audio from script. I have no idea how to tackle this yet.
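As a sketch of the kind of per-frame analysis that is possible in script today, here is a minimal radix-2 Cooley-Tukey FFT and magnitude spectrum. This is my own illustration, not the code from our demos, and it assumes the frame length is a power of two:

```javascript
// In-place radix-2 Cooley-Tukey FFT over parallel real/imaginary arrays.
// re.length must be a power of two.
function fft(re, im) {
  const n = re.length;
  if (n === 1) return;
  // Split into even- and odd-indexed halves.
  const evenRe = [], evenIm = [], oddRe = [], oddIm = [];
  for (let i = 0; i < n; i += 2) {
    evenRe.push(re[i]);     evenIm.push(im[i]);
    oddRe.push(re[i + 1]);  oddIm.push(im[i + 1]);
  }
  fft(evenRe, evenIm);
  fft(oddRe, oddIm);
  // Combine with twiddle factors.
  for (let k = 0; k < n / 2; k++) {
    const ang = -2 * Math.PI * k / n;
    const wr = Math.cos(ang), wi = Math.sin(ang);
    const tr = wr * oddRe[k] - wi * oddIm[k];
    const ti = wr * oddIm[k] + wi * oddRe[k];
    re[k] = evenRe[k] + tr;          im[k] = evenIm[k] + ti;
    re[k + n / 2] = evenRe[k] - tr;  im[k + n / 2] = evenIm[k] - ti;
  }
}

// Magnitude spectrum of a real-valued frame of samples, e.g. for
// driving a visualization.
function magnitudeSpectrum(samples) {
  const re = samples.slice();
  const im = new Array(samples.length).fill(0);
  fft(re, im);
  return re.map((r, i) => Math.sqrt(r * r + im[i] * im[i]));
}
```

A pure sine wave at bin k of an n-sample frame produces a spike of height n/2 at index k of the spectrum, which is an easy sanity check for an implementation like this.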
It’s been fun working with these guys, who know so much about audio. I know almost nothing about the data I’m extracting, and it’s humbling to learn from them. We’re approaching the point where input on API design would be useful, as I get this patch into better shape for review. If you’d like to get involved or try it out yourself, drop into irc.mozilla.org/processing.js, where we’re working, and join the discussion in the bug. I’ll keep blogging this series about our work, and I know the others are working on posts and more demos, too.