Prabhakar Raghavan, the head of Yahoo! Research, delivered the keynote yesterday at WWW2007. The program listed the title of the talk as "What sciences will Web N.0 take?" but I think it would be more accurate to call it "The Science of Engaging and Monetizing Audience."
He started with an explanation of the business that Yahoo! (and others like Google, AOL, MSN, and, lately, NewsCorp) is engaged in: Yahoo! takes in editorial, free (including blogs, Twitter, pictures, etc.), and commercial "content." The audience "consumes the content" but also enriches it. Finally, the audience transacts (commerce) with the content. This is the business of matching content to audience that is covered so nicely in a Bear Stearns presentation that I've mentioned before.
Raghavan started with the premise that people don't want to search, but rather to get tasks done. Search engines spend very little time servicing you compared to the time you spend doing queries, evaluating results, and so on. This is backwards--machines should be working harder than we are. He proposes that the grand challenge is to devise general platforms for semantic searches--that is, searches that are able to derive meaning from the search terms presented to them.
Raghavan also made the point that there is no scale-based differentiation around web text content. Even small companies can afford to store content at Web scale--this is especially true when you consider products like Amazon's S3 and EC2.
On the topic of users enriching the things they find on the Web, Raghavan noted that user-generated metadata is growing. Anchortext and tags are growing at the rate of 100 Mb/day. Pageviews are around 50-100 Gb/day. Reviews and ratings are small. All of these are important, but only anchors are central to how people work on the Web.
Yahoo! uses the acronym "START" to describe a metadata hierarchy that shows increasing levels of engagement:
- Star: the user says "I like this"
- Tag: creating tags on pictures, etc.
- Access: the user views a page (in a visible way)
- Routing: forwarding things to friends
- Text: write a review, blog article, etc.
I'm reminded of Britt Blaser's theory of "stepping stones" in bringing people into more and more interaction with a site and, more importantly, with each other. Britt's OrgWare (disclosure: I'm an advisor) is a systematic attempt to build infrastructure that supports and encourages audience engagement.
The challenges in this space: How do we use tags better? How do we cope with spam? How do we build better ratings and reputation systems? More important, how can these be used to generate more useful metadata? He mentioned the ESP Game, which I heard about a few weeks ago at Jeannette Wing's lecture. The game uses the results of play to contribute tags to image search.
Raghavan is looking for systems that incent chaos in ways that retain and enrich participation. Where is the science behind online community? This isn't about human-computer interaction, but people-to-people interaction mediated by the computer. This is the P2P that matters. Some questions that bear study:
- Why do people choose to lurk or participate?
- Why do people create new online personae?
- Why are YouTube, Flickr, and MySpace successful and others not?
- What new genres of audience experience are emerging and what can we provoke?
Raghavan finished by giving an in-depth discussion of sponsored search as a combination of information retrieval and microeconomics. He calls this "computational microeconomics." This includes reputation and incentive mechanisms and marketplace matching (he references the stable marriage problem as an example of research that impacts this area).
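For readers who haven't run into the stable marriage problem, here is a minimal sketch of the classic Gale-Shapley deferred-acceptance algorithm that solves it. This is not something Raghavan showed. The advertiser and slot names are hypothetical, and the sketch assumes both sides are the same size and that everyone ranks everyone on the other side.

```python
from collections import deque


def stable_matching(proposer_prefs, receiver_prefs):
    """Gale-Shapley deferred acceptance.

    Assumes equal-sized sides and complete preference lists
    (a simplification for illustration).
    Returns a dict mapping each proposer to its matched receiver.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # index of next receiver to try
    held = {}                                     # receiver -> proposer currently held
    free = deque(proposer_prefs)                  # proposers without a match

    while free:
        p = free.popleft()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = held.get(r)
        if current is None:
            held[r] = p                # receiver was unmatched, so it accepts
        elif rank[r][p] < rank[r][current]:
            held[r] = p                # receiver trades up
            free.append(current)       # the displaced proposer is free again
        else:
            free.append(p)             # rejected; p will try its next choice

    return {p: r for r, p in held.items()}


# Hypothetical example: two advertisers competing for two ad slots.
advertisers = {"A": ["s1", "s2"], "B": ["s1", "s2"]}
slots = {"s1": ["B", "A"], "s2": ["A", "B"]}
matching = stable_matching(advertisers, slots)
```

Both advertisers want slot s1, but s1 prefers B, so A ends up with s2. The result is stable in the sense that no advertiser and slot would both rather be paired with each other than with what they got.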
People talk about "network effects" but what does this mean, from a value standpoint? Are 500 million users 500 times as valuable as a million users? Or 5000 times more valuable? What of Metcalfe's Law?
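To put rough numbers on that question, here is a small sketch comparing three common assumptions about how a network's value grows with its user count: linear, quadratic (Metcalfe's Law), and n log n (an alternative that has been proposed by critics of Metcalfe's Law). The function names are mine, and none of these models came from the talk.

```python
import math


def linear_value(n):
    """Value proportional to the number of users."""
    return float(n)


def metcalfe_value(n):
    """Metcalfe's Law: value proportional to n squared."""
    return float(n) * float(n)


def nlogn_value(n):
    """A more conservative alternative: value proportional to n * log(n)."""
    return n * math.log(n)


def value_ratio(model, big, small):
    """How many times more valuable the big network is under a given model."""
    return model(big) / model(small)


big, small = 500_000_000, 1_000_000
ratios = {
    "linear": value_ratio(linear_value, big, small),
    "metcalfe": value_ratio(metcalfe_value, big, small),
    "nlogn": value_ratio(nlogn_value, big, small),
}
```

Under the linear model the larger audience is worth 500 times as much; under Metcalfe's Law it is worth 250,000 times as much; the n log n model lands in between, at roughly 725 times. The spread between those answers is exactly why the question matters.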
Raghavan's talk was thought-provoking in many ways. All of us who work on the Web are engaged in social issues as much as hard science. I started off the conference in a session on whether or not we need a "new discipline" of Web science, one that combines areas as diverse as sociology, physics, biology, law, and psychology, as well as the areas you might immediately think of like computer science or math. Raghavan was arguing for the same thing, I think--whether he calls it that or not.