Tiny music files almost as good as MP3 ones

Researchers at the University of Rochester have encoded a 20-second clarinet solo in a file smaller than a kilobyte, about 1,000 times smaller than a regular MP3 file. According to the scientists, the sound is almost as good. The system works by 'recreating in a computer both the real-world physics of a clarinet and the physics of a clarinet player.' The researchers are also modeling other instruments and building a graphical user interface that would let 'normal' people play music with almost no training. However, in an era of ever-increasing bandwidth and larger memories for our gadgets, I doubt that this reduction in file size can lead to a successful commercial product. But read more...
Written by Roland Piquepaille

This system was developed by Mark F. Bocko, Professor of Electrical and Computer Engineering, who also manages the Center for Superconducting Digital Electronics, with two of his doctoral students, Xiaoxiao Dong and Mark Sterling. You can listen to two 20-second audio files available from the University of Rochester news release: a human performance recorded in MP3 format and a virtual performance using Bocko's new compression.

Here is what they did. "They built a computer model of the clarinet, and the result is a virtual instrument built entirely from the real-world acoustical measurements. The team then set about creating a virtual player for the virtual clarinet. They modeled how a clarinet player interacts with the instrument including the fingerings, the force of breath, and the pressure of the player's lips to determine how they would affect the response of the virtual clarinet. Then, says Bocko, it's a matter of letting the computer 'listen' to a real clarinet performance to infer and record the various actions required to create a specific sound. The original sound is then reproduced by feeding the record of the player's actions back into the computer model."
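To make the idea of a "record of the player's actions" more concrete, here is a minimal Python sketch of how such a record could be stored. Everything in it is my own assumption for illustration, not the researchers' format: the `ControlFrame` class, the one-byte quantization, and the 15 frames per second rate are all hypothetical. It only shows that three slowly varying parameters (fingering, breath, lip pressure) take very little space.

```python
import struct
from dataclasses import dataclass


@dataclass
class ControlFrame:
    """One snapshot of the player's actions (hypothetical layout)."""
    fingering: int   # index of the fingering in use, 0-255 (assumed encoding)
    breath: float    # blowing pressure, normalized to 0.0-1.0
    lip: float       # lip pressure on the mouthpiece, normalized to 0.0-1.0


def quantize(value: float) -> int:
    """Map a 0.0-1.0 value to a single byte, clamping out-of-range input."""
    return max(0, min(255, round(value * 255)))


def encode(frames):
    """Pack each frame into 3 bytes: fingering, breath, lip."""
    return b"".join(
        struct.pack("BBB", f.fingering, quantize(f.breath), quantize(f.lip))
        for f in frames
    )


def decode(data):
    """Rebuild the frames; breath and lip come back with quantization error."""
    return [
        ControlFrame(fingering, breath / 255, lip / 255)
        for fingering, breath, lip in struct.iter_unpack("BBB", data)
    ]


FRAME_RATE_HZ = 15   # assumed control rate, not taken from the paper
DURATION_S = 20      # length of the clarinet solo in the article

frames = [
    ControlFrame(fingering=12, breath=0.6, lip=0.4)
    for _ in range(FRAME_RATE_HZ * DURATION_S)
]
payload = encode(frames)
print(len(payload), "bytes for", DURATION_S, "seconds")
```

Under these assumptions, 300 frames of 3 bytes each come to 900 bytes, which is in the same range as the sub-kilobyte file described above. Turning the decoded frames back into sound is the job of the physical model of the clarinet, which this sketch does not attempt.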

In "UR finds way to cut size of music files," published in the Rochester Democrat and Chronicle (April 2, 2008), Matthew Daneman provides additional details about the sound quality of these small music files. "Parts of the computer simulation 'sound virtually identical' to a recording of a clarinet, Bocko said, while some other parts -- such as the computer simulation of tonguing at the beginning of notes -- still needs work."

So what are the researchers working on today? "Bocko said he and the other researchers now are working on improving the clarinet computer modeling, as well as expanding into other instruments and trying to build a human/computer interface that could let people play music without years of lessons."

The researchers presented their results at the International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), held from March 30 to April 4, 2008 in Las Vegas. Their paper was titled "Representation of solo clarinet music by physical modeling synthesis."

They've also published their results in the Journal of the Acoustical Society of America under the title "Empirical physical modeling based music synthesis and representation" (November 2007, Volume 122, Issue 5, Pages 3055-3056).

Here is the beginning of the abstract. "We describe a method in which empirically-based musical instrument physical models are employed both to synthesize musical sounds and to form the basis of a compact representation of mono-timbral musical sound. [...] The physical model incorporates measured acoustic impedance spectra of a clarinet air column for all playable notes. Low bandwidth control parameter time histories, serving as inputs to the physical model, are inferred from audio recordings of actual clarinet music. The control parameters represent the fingerings, the blowing pressure of the player, and the mouthpiece clamping pressure of the player's embouchure. It is shown that, given an appropriate physical model, the control parameter time histories can serve as a highly compact representation (compression by a factor of several hundred) of a source recording."
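As a quick sanity check on these compression figures, here is some back-of-the-envelope arithmetic in Python. The 128 kbps MP3 bitrate is my assumption (a common encoding setting), and I take one kilobyte as 1,024 bytes as an upper bound for the model file, since the article only says it is "smaller than a kilobyte."

```python
MP3_BITRATE_BPS = 128_000   # assumed MP3 bitrate in bits per second
DURATION_S = 20             # length of the clarinet solo
MODEL_FILE_BYTES = 1024     # upper bound: "smaller than a kilobyte"

# Size of a 20-second MP3 at the assumed bitrate.
mp3_bytes = MP3_BITRATE_BPS * DURATION_S // 8

# Compression factor relative to that MP3 (a lower bound, because the
# model file is smaller than 1,024 bytes).
ratio = mp3_bytes / MODEL_FILE_BYTES

# Effective bitrate of the model file (an upper bound, for the same reason).
model_bps = MODEL_FILE_BYTES * 8 / DURATION_S

print(f"MP3 size:        {mp3_bytes} bytes")
print(f"Compression:     at least {ratio:.1f}x")
print(f"Model bitrate:   at most {model_bps:.1f} bits per second")
```

With these numbers the MP3 weighs 320,000 bytes, so the model file is at least about 312 times smaller, which agrees with the "factor of several hundred" in the abstract. Reaching a factor of 1,000 would need a higher MP3 bitrate or a model file well under one kilobyte.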

This technology is certainly interesting, but as I wrote in the introduction to this post, I don't think it will succeed as a product. Even our smallest MP3 players can hold thousands of songs. Do we need millions? And there is another crucial point: cost. The technique described above looks pretty expensive. But it's possible that the researchers worked on this project just because "small is beautiful." Let me know what you think.

Sources: University of Rochester Press Release, April 1, 2008; and various websites
