Over the weekend, I've been trying out a new music service that applies a double helical twist to Internet radio. Pandora, which goes live on Monday, lets you create radio stations tuned to your musical tastes. Instead of relying on playlists selected by you, friends, services or a DJ, you enter songs and artists, and the software scans a music "genome" database to find other songs with similar DNA as a basis for generating the "radio" programming. As you listen, you can provide feedback, such as whether you liked or disliked a selection, and the algorithm adjusts accordingly. Pandora also provides links to purchase songs.
The Music Genome database, which has been in the works for five years, abstracts out various characteristics of music--such as melody, rhythm, instrumentation, orchestration, arrangement and lyrics--to create a musical identity, according to Tom Conrad, Pandora's CTO. According to the Pandora Web site, "Each song in the Music Genome Project is analyzed using up to 400 distinct musical characteristics by a trained music analyst. These attributes capture not only the musical identity of a song, but also the many significant qualities that are relevant to understanding the musical preferences of listeners."
After about eight hours of listening, I find that Pandora isn't as fine-grained as I would like. I created a Bob Dylan/Tom Petty station and got several songs that didn't meet my expectations. "Pandora proceeds in sets of three to four songs, influenced by one of the seeds," Pandora founder Tim Westergren said. The problem, he said, is that an artist like Dylan has a varied catalog: his musical genome varies depending on the phase of his career. It would be good if the musicologists at Pandora added that dimension to the database.
Another oddity, flaw or feature (I haven't figured out which yet) of Pandora is that my Dylan/Petty station doesn't play much Dylan or Petty. It finds all kinds of music you could never easily discover from other music sites. As a result, you are exposed to far more of the "long tail." The software finds musical neighbors of the seed song or artists from a huge catalog of indexed, decoded songs. "A big part of what makes the genome unique is that it doesn't rely on popularity or usage habits of other users," Westergren said.
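To make the "musical neighbors" idea concrete, here is a minimal sketch in Python. It is not Pandora's actual algorithm, which the company has not described in this detail. It assumes each song is reduced to a list of numeric attribute scores and that closeness is measured with cosine similarity. The song names, the scores and the choice of six attributes (rather than up to 400) are all hypothetical.

```python
import math

# Hypothetical attribute order; the real database reportedly uses up to 400.
ATTRIBUTES = ["melody", "rhythm", "instrumentation",
              "orchestration", "arrangement", "lyrics"]

# Made-up scores between 0.0 and 1.0, one per attribute, for illustration only.
CATALOG = {
    "seed_song": [0.8, 0.4, 0.6, 0.2, 0.5, 0.9],
    "song_a":    [0.7, 0.5, 0.6, 0.3, 0.5, 0.8],
    "song_b":    [0.1, 0.9, 0.2, 0.8, 0.3, 0.1],
    "song_c":    [0.8, 0.3, 0.7, 0.2, 0.6, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(seed, catalog, k=2):
    """Return the k songs whose attribute vectors are closest to the seed's."""
    seed_vec = catalog[seed]
    scored = [
        (cosine_similarity(seed_vec, vec), title)
        for title, vec in catalog.items()
        if title != seed  # never recommend the seed itself
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:k]]


if __name__ == "__main__":
    print(nearest_neighbors("seed_song", CATALOG))
```

Note that nothing in the sketch looks at what other listeners played or bought. The ranking depends only on the attribute scores, which is the property Westergren describes, and it is also why a seed artist's own songs get no special preference over close neighbors.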
Here's a sample song sequence from my Dylan/Petty station, which I am still fine-tuning:
- 19 years--Elysium
- Delirious--Luka Bloom
- Colorado Girls--Townes Van Zandt
- 4th of July, Asbury Park (Sandy)--Bruce Springsteen
- Nobody--Paul Simon
- Turn Back the Page--Joey Stec
- Heart My Home--Jim Yoshii Pile Up
- The Enchanted Car--Freedy Johnston
- Looking East--Jackson Browne
- Give a Little Bit--The Goo Goo Dolls
- Nowhere Land--Cardinal Trait
- Tennessee before Daylight--Outformation
- Magnolia Mountain--Ryan Adams & The Cardinals
- Ashes on the Moon--Riviera
- Spanish Harlem Incident--Bob Dylan
Pandora's business model is radio-like, in the sense that you have no control over the playlist other than entering seeds and feedback and relying on the music genome engine to deliver songs. Pandora complies with the license provisions of the DMCA (Digital Millennium Copyright Act of 1998) that define streaming Internet-based radio, and you'll hear everything from the Beatles to Metallica, founder Tim Westergren said. Coltrane too, but no classical. The service costs $36 annually or $12 per quarter. Pandora pays around 15 percent of subscription fees to a clearinghouse that deals with the publishers, and it also pays for a lot of bandwidth.
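For a rough sense of those numbers, the arithmetic can be sketched in a few lines of Python. The 15 percent share is only the approximate figure quoted above, and the function names are my own.

```python
ANNUAL_FEE = 36.00       # dollars per year, as quoted
QUARTERLY_FEE = 12.00    # dollars per quarter, as quoted
LICENSING_SHARE = 0.15   # "around 15 percent"; an approximate figure


def yearly_cost_quarterly(quarterly_fee):
    """Total paid over a year when subscribing quarter by quarter."""
    return quarterly_fee * 4


def licensing_payment(fee, share=LICENSING_SHARE):
    """Portion of a subscription fee passed on to the clearinghouse."""
    return round(fee * share, 2)


if __name__ == "__main__":
    print(yearly_cost_quarterly(QUARTERLY_FEE))  # four quarterly payments
    print(licensing_payment(ANNUAL_FEE))         # share of one annual fee
```

Paying by the quarter works out to $48 a year, so the annual plan saves $12, and roughly $5.40 of each $36 annual subscription would go to the clearinghouse.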
Pandora isn't a substitute for user recommendation engines, but it's a complementary service and great listening. Note: Pandora's Music Genome Project is not to be confused with MusicGenome, an Israeli company that sells a similar type of song recommendation engine to businesses.
Developing deep schemas and decoding data types and events to automate discovery, find relationships and solve problems, such as securing computer systems, is already a major area of software development (I wrote a long piece for Esther Dyson's newsletter on the topic). If it works for music and network events, how about for video content, when millions of people are creating video and uploading it to the Net? What's the genome for video content?