Google I/O: Mastering the Android Media Framework

This afternoon at Google I/O 2009 Dave Sparks gave the most technical presentation of the conference so far when he delved into the details of the Android Media Framework. Here are my notes from the talk.

The design goals of the media framework include:

  • Simplify application development
  • Share resources in a multitasking environment
  • Provide a strong security model
  • Leave room for future growth

The typical stack for a media function call is pretty complex:

DVM proxy > JNI > Native proxy (C++) > Binder Proxy > Binder Native > Native Implementation

Thankfully you don't need to make many calls to the framework. Because there is a native proxy, developers will be able to call the media framework from native games in the future.

Internally the media framework has its own handlers for .ogg and .mid files, and sends anything else to the OpenCORE library. That's why the .ogg format is preferred for lightweight sound effects.
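The routing rule described above can be sketched in plain Java. This only illustrates the dispatch as I noted it from the talk; the real logic lives in the framework's native code, and the class, enum, and method names here are hypothetical.

```java
import java.util.Locale;

/** Hypothetical model of the routing rule from the talk; not framework code. */
final class PlayerRouting {
    /** Illustrative labels for the three back ends (names are assumptions). */
    enum Backend { VORBIS_PLAYER, MIDI_PLAYER, OPENCORE }

    /** Picks a back end from the file extension, ignoring case. */
    static Backend pick(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".ogg")) return Backend.VORBIS_PLAYER;
        if (name.endsWith(".mid")) return Backend.MIDI_PLAYER;
        return Backend.OPENCORE; // everything else goes to OpenCORE
    }

    public static void main(String[] args) {
        for (String f : new String[] {"click.ogg", "theme.mid", "song.mp3"}) {
            System.out.println(f + " -> " + pick(f));
        }
    }
}
```

The practical takeaway is the one from the talk: short sound effects in .ogg avoid the heavier OpenCORE path.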

Android video codecs include:

  • H.263 Video. Originally designed for low bit-rate video conferencing. Simple encoder and decoder. 3GPP standard. Used by many streaming sites for low bit-rate video.
  • MPEG4-SP (Simple Profile) Video. Designed as a replacement for MPEG1/2 codecs. Simple encoder. Not much improvement over H.263. It's missing the deblocking filter.
  • H.264 AVC. Better compression (e.g., through multiple reference frames, which Android does not currently support), better quality, different profiles, more complex.

Audio codecs include:

  • MP3. Approx. 10:1 compression at 128 kbps. Sonic transparency at 192 kbps.
  • AAC (Advanced Audio Coding). Better compression, sonic transparency at 128 kbps. Commonly used in MPEG-4 streams.
  • Ogg Vorbis. Better compression than MP3. Low-overhead player; low latency, uses less memory. Can loop seamlessly (unlike MP3).
  • Adaptive Multi-rate (AMR) audio. Speech codec, narrow band 8 kHz, wide band 16 kHz. Used in 3GP streams. AMR narrow band is the only encoder available in software.
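The "approx 10:1" figure for MP3 can be checked with a little arithmetic. The sketch below assumes the uncompressed source is CD-quality stereo PCM (44.1 kHz, 16-bit, 2 channels); that baseline is my assumption, not something stated in the talk, and the class and method names are illustrative.

```java
import java.util.Locale;

/** Worked arithmetic for the compression ratio; the CD-quality baseline is an assumption. */
final class CompressionRatio {
    /** Bit rate of uncompressed PCM audio in kbps. */
    static double pcmKbps(int sampleRateHz, int bitsPerSample, int channels) {
        return sampleRateHz * (double) bitsPerSample * channels / 1000.0;
    }

    /** Ratio between the uncompressed and the encoded bit rate. */
    static double ratio(double pcmKbps, double encodedKbps) {
        return pcmKbps / encodedKbps;
    }

    public static void main(String[] args) {
        double cd = pcmKbps(44100, 16, 2); // 1411.2 kbps
        System.out.println(String.format(Locale.ROOT,
                "PCM %.1f kbps, ratio at 128 kbps: %.2f:1", cd, ratio(cd, 128.0)));
    }
}
```

Under that assumption the ratio comes out at roughly 11:1, which is consistent with the rounded figure quoted in the talk.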

Typical streams look like this:

  • 3GPP - lower quality, H.263 video, AMR-NB audio, bit rates up to 192 kbps
  • MPEG-4 - higher quality, H.264 video, AAC audio, bit rates up to 500 kbps

New features for Cupcake V1.5 include:

  • Video recording
  • AudioTrack -- direct access to raw audio
  • AudioRecord -- ditto
  • JET interactive MIDI engine

AudioTrack and AudioRecord are of interest to low-level audio developers:

  • Expose raw PCM audio streams to applications
  • AudioTrack: Write PCM audio directly to the mixer engine
  • AudioRecord: Read PCM audio directly from the mic
  • Callback mechanism for threaded applications
  • Static buffer for playing sound effects
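To make "raw PCM" concrete, here is a minimal sketch that builds the kind of buffer you would hand to AudioTrack: mono, 16-bit signed samples of a sine tone. It is plain Java so it runs anywhere; the class and method names are my own, and the AudioTrack usage is only described in comments rather than executed.

```java
/** Generates 16-bit mono PCM; a stand-alone sketch, not code from the talk. */
final class PcmTone {
    /** Returns signed 16-bit samples of a sine tone at the given frequency. */
    static short[] sine(int sampleRateHz, double freqHz, double seconds) {
        int n = (int) Math.round(sampleRateHz * seconds);
        short[] pcm = new short[n];
        for (int i = 0; i < n; i++) {
            double phase = 2.0 * Math.PI * freqHz * i / sampleRateHz;
            // Scale [-1, 1] to the 16-bit range [-32767, 32767].
            pcm[i] = (short) Math.round(Math.sin(phase) * Short.MAX_VALUE);
        }
        return pcm;
    }

    public static void main(String[] args) {
        short[] tone = sine(8000, 1000.0, 0.01);
        System.out.println("samples: " + tone.length);
        // On Android, a buffer like this could be passed to
        // AudioTrack.write(short[], int, int), using MODE_STREAM for
        // continuous audio or MODE_STATIC for a short, reusable effect.
    }
}
```

A short tone like this fits the static-buffer case from the list above; longer or generated-on-the-fly audio is the streaming case.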

The JET Interactive MIDI Engine is new for Cupcake:

  • It's based on MIDI - file sizes can be small
  • You can pre-author content that is very interactive
  • DLS support allows for better quality instruments (load your own samples)
  • Precise synchronization for layered MIDI tracks
  • Native code - very efficient
  • Synchronization callbacks to applications for rhythm games
  • Open source engine and creation tools
  • VST plugin - use it inside your favorite DAW (Digital Audio Workstation) tool

Dave spent some time going over common problems that he has seen users encounter with the media framework. For example, one common complaint is that volume control behavior is inconsistent.

  • Volume control is overloaded
  • If you're in a call, it adjusts the in-call volume
  • If the phone is ringing, it mutes the ringer
  • If a media track is active, it adjusts the media volume
  • Otherwise it adjusts the ringtone volume <--- default

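In an application that plays sounds only periodically, the volume keys therefore behave inconsistently: they adjust the media volume while a sound is playing and the ringtone volume the rest of the time. The solution is to set the default stream type in your onCreate() method: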

setVolumeControlStream(AudioManager.STREAM_MUSIC);
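To see why this helps, the volume key rules can be modelled in plain Java. This is a hypothetical model of the priority order from my notes, not the platform's implementation; in particular, I am assuming that calls and ringing still take priority after setVolumeControlStream() and that only the fallback changes.

```java
/** Hypothetical model of how the volume keys pick a target; not platform code. */
final class VolumeKeyRouting {
    enum Target { IN_CALL_VOLUME, MUTE_RINGER, MEDIA_VOLUME, RINGTONE_VOLUME }

    /**
     * Applies the priority order from the talk. The fallback models the
     * default stream: RINGTONE_VOLUME normally, MEDIA_VOLUME once the
     * activity has asked for STREAM_MUSIC (an assumption of this sketch).
     */
    static Target route(boolean inCall, boolean ringing, boolean mediaActive, Target fallback) {
        if (inCall) return Target.IN_CALL_VOLUME;
        if (ringing) return Target.MUTE_RINGER;
        if (mediaActive) return Target.MEDIA_VOLUME;
        return fallback;
    }

    public static void main(String[] args) {
        // Between sound effects, with and without the fix:
        System.out.println(route(false, false, false, Target.RINGTONE_VOLUME));
        System.out.println(route(false, false, false, Target.MEDIA_VOLUME));
    }
}
```

With the fallback set to the media stream, the keys adjust the same volume whether or not a sound happens to be playing at that moment.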

Unable to play file from a resource?

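  • mp.setDataSource("res:com.myapp...") doesn't work
  • Solution: use an AssetFileDescriptor (for example from Resources.openRawResourceFd()) and pass its file descriptor, start offset, and length to setDataSource()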

Out of MediaPlayers?

  • Call release() and set the reference to null, or call reset() then setDataSource() to reuse the player
  • Limit yourself to 2 or 3 MediaPlayers at most
  • Release them especially in your onPause()
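One way to enforce that advice is a small holder that caps the number of live players and releases them all at once. The sketch below is my own illustration, not something shown in the talk: it is generic plain Java so it can run off-device, the class name is hypothetical, and on Android the factory would create a MediaPlayer and the releaser would call release() on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical cap on live players; P stands in for MediaPlayer. */
final class BoundedPlayers<P> {
    private final int max;
    private final Supplier<P> factory;
    private final Consumer<P> releaser;
    private final List<P> live = new ArrayList<>();

    BoundedPlayers(int max, Supplier<P> factory, Consumer<P> releaser) {
        this.max = max;
        this.factory = factory;
        this.releaser = releaser;
    }

    /** Returns a new player, or null when the cap has been reached. */
    P acquire() {
        if (live.size() >= max) return null;
        P player = factory.get();
        live.add(player);
        return player;
    }

    /** Releases every live player; intended to be called from onPause(). */
    void releaseAll() {
        for (P player : live) releaser.accept(player);
        live.clear();
    }

    int liveCount() {
        return live.size();
    }
}
```

Returning null at the cap is a deliberately blunt choice; a real app might instead reuse an idle player via reset() and setDataSource(), as the first bullet suggests.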

CPU Overloaded?

  • This happens when you're playing too many compressed streams (like MP3) at the same time
  • Solution: use SoundPool. 1.0/1.1 had problems but 1.5 is much better.
  • SoundPool decodes sounds and loads them into memory ahead of time so they're ready to play
  • About 5% CPU overhead per stream (instead of 20-30%)

During the Q&A Dave revealed that OpenGL ES 2.0 support is coming in the Eclair version (currently at HEAD in source control). The 2D framework will run in a 3D context, which he said would also allow video to be used as a texture.