Sir Tim Berners-Lee: Semantic Web is open for business

Web inventor Sir Tim Berners-Lee says that Semantic Web building blocks are in place. In an interview recorded earlier this month, he also questions social networking companies' attitude to data ownership.
Written by Paul Miller, Contributor

Earlier this month I had the great pleasure of spending time talking with Sir Tim Berners-Lee, inventor of the World Wide Web and now Director of the World Wide Web Consortium (W3C) in Cambridge, MA.

Our wide-ranging dig into the past, present and future of the Semantic Web was recorded for one of the regular Talking with Talis podcasts, and now appears here as the first of a new podcast series for ZDNet: Talking Semantics.

In this post, I'd like to draw out some aspects of the conversation that I found most interesting. Have a listen for yourself, draw your own conclusions, and please do share them in TalkBack.

First, the good news. With the release of the SPARQL specifications, Tim is clear that the core pieces are in place for developers to build robust Semantic Web applications:

"I think... we've got all the pieces to be able to go ahead and do pretty much everything... [Y]ou should be able to implement a huge amount of the dream, we should be able to get huge benefits from interoperability using what we've got. So, people are realizing it's time to just go do it."

Asked about an important article in Scientific American from 2001, Berners-Lee was quick to move past the grand vision outlined there, and to stress the importance of simple yet empowering steps:

"In fact, the gain from the Semantic Web comes much before that. So maybe we should have written about enterprise and intra-enterprise data integration and scientific data integration. So, I think, data integration is the name of the game. That's happening, it's showing benefits. Public data as well; public data is happening and it is providing the fodder for all kinds of mashups.

What we should realize is that the return on investment will come much earlier when we just have got this interoperable data that we can query over."

We spent some time (almost 15 minutes, from about 20 minutes in, for those listening along) talking about the ways in which data holders will benefit from making their data visible to a new generation of Semantic Web applications:

"There's an awful lot of data out there. And I think, one of the huge misunderstandings about the Semantic Web is, 'oh, the Semantic Web is going to involve us all going to our HTML pages and marking them up to put semantics in them.' Now, there's an important thread there, but to my mind, it's actually a very minor part of it. Because I'm not going to hold my breath while other people put semantics in by hand... So, where is the data going to come from? It's already there. It's in databases..."

The W3C-supported Linked Data Project is one compelling example of a community effort to take data and make it more visible to the rest of the Semantic Web. Projects such as DBpedia, MusicBrainz and Revyu.com are enriching existing content, and increasingly providing tools with which new content can be created. As Tim notes:

"So, some data is scraped from HTML pages, some of it is pulled out of databases, some of it comes from projects which have been in XML. So, things come in many different ways. And once they're exported, as you browse around the RDF graph, as you write mash-ups to reuse that data, you really don't have to be aware of how it was produced."

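To make that point concrete, here is a minimal Python sketch of my own; it is not drawn from the interview, and it uses plain tuples rather than a real RDF library. The `http://example.org/` namespace, the sample rows and the helper names `triples_from_rows` and `match` are all hypothetical. It shows data from two differently produced sources ending up in one set of triples that can be queried without knowing where each fact came from.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

EX = "http://example.org/"  # hypothetical namespace, for illustration only


def triples_from_rows(rows: List[Dict[str, str]], key: str) -> Set[Triple]:
    """Export database-style rows as triples: one triple per non-key column."""
    triples: Set[Triple] = set()
    for row in rows:
        subject = EX + row[key]
        for column, value in row.items():
            if column != key:
                triples.add((subject, EX + column, value))
    return triples


def match(
    graph: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in graph:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# Source 1: rows as they might come out of a relational database (made-up data).
db_rows = [{"id": "artist/1", "name": "Example Band"}]

# Source 2: a triple as it might be scraped from an HTML page (made-up data).
scraped: Set[Triple] = {(EX + "artist/1", EX + "homepage", "http://example.org/band")}

# Once exported, the two sources are simply merged into one graph.
graph = triples_from_rows(db_rows, key="id") | scraped

# The query does not care which source contributed which fact.
facts = sorted(match(graph, s=EX + "artist/1"))
for fact in facts:
    print(fact)
```

A real application would use RDF proper and query it with SPARQL rather than a hand-rolled pattern matcher; the sketch is only meant to show why the origin of each triple stops mattering once the data is exported.
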
Richard Cyganiak maintains an evolving picture of the participants in this project, a snapshot of which is reproduced here.

Impressive as these activities are, if we are to see a similar growth in the availability of data from less philanthropic sources, there is a clear need for greater clarity with respect to the 'proper' use and reuse of data. In a similar manner to that attempted for 'creative works' by Creative Commons, recent activity around the Open Data Commons offers useful pointers as to the way forward here, and I may delve further into that area in a future post.

Before moving off the topic, Tim flagged two events over the next few months as interesting to those looking to make data available to the Semantic Web. The first is the Linked Open Data Workshop at this year's World Wide Web conference in Beijing in April, and the second is Linked Data Planet in New York in June. For more on Linked Data, see Tom Heath's recent article looking back at one year of activity on the project.

Towards the end of our conversation, we built upon the earlier discussion of shared and open data by turning to the social networks, sites currently attracting criticism for their rather different perspective. The question put to Tim was:

"Do you think developers of applications like, say, Facebook and LinkedIn and the rest, are ready to embrace the Semantic Web, or do you think they think they can do it themselves?"

Tim responded:

"It is a very grown-up thing to realize that you are not the only social networking site... otherwise it is like a website which doesn't have any links out. In the Semantic Web similarly, if you don't have any links out, well, that's boring.

In fact, a lot of the value of many websites is the links out."

Whilst Tim was quick to recognise that sites such as LiveJournal support the FOAF specification, he drew a clear distinction between those few examples and the majority:

"Now if you look at the social networking sites which, if you like, are traditional Web 2.0 social networking sites, they hoard this data. The business model appears to be, 'We get the users to give us data and we reuse it to our benefit. We get the extra value.'"


"Web 2.0 is a stovepipe system. It's a set of stovepipes where each site has got its data and it's not sharing it. What people are sometimes calling a Web 3.0 vision where you've got lots of different data out there on the Web and you've got lots of different applications, but they're independent. A given application can use different data. An application can run on a desktop or in my browser, it's my agent. It can access all the data, which I can use and everything's much more seamless and much more powerful because you get this integration. The same application has access to data from all over the place."

Recent excitement around the Data Portability movement, and the high-profile adherents to its mission, is certainly encouraging, although fellow ZDNet blogger Dennis Howlett has been amongst those expressing some scepticism as to their motivation. I guess only time will tell whether that particular movement will succeed.

Those who doubt the commitment of current players can, perhaps, be reassured by simply remembering the speed with which the current market leaders grew. Consider, too, Tim's observation:

"People can indeed choose not to go to that site [if it does not open access to their data]"

In other words, in a market such as the one in which we operate, there is always scope for new entrants with new values and new business models. If users are compelled by the new proposition they can - and will - move with remarkable rapidity. The big question, though, has to be... do they care enough?

We talked for a fascinating hour during which we ranged from past to future, from technology to policy. We covered specifications such as RDF and SPARQL, and we talked about the pressing need for more accessible texts to explain the Semantic Web to mainstream business. We remembered that Tim's original web client was both editor and browser, and speculated about how things might have evolved differently if today's Read/Write Web of blogs and wikis had been an integral part of the way everyone was introduced to thinking about the Web all those years ago.

There is much still to do, and Sir Tim Berners-Lee is clearly enthused by the journey that lies ahead. Listening to him, it's hard not to agree.

Show notes are available on Nodalities for this podcast. A transcript is also available.

Disclosure: Work on the Open Data Commons is supported by my employer, Talis.

Photograph of Sir Tim Berners-Lee taken by my colleague Rob Styles, during Tim's keynote presentation at the WWW2007 Conference in Banff, Canada. Used with permission.

Richard Cyganiak's Linking Open Data dataset cloud is licensed with a Creative Commons license (Attribution Sharealike), and reproduced here within the terms of that license.
