The Semantic Web - is everyone confused?

Written by Paul Miller, Contributor

The Economist. Tim O'Reilly. Nova Spivack. Danny Ayers. Read/Write Web's Alex Iskold. Kingsley Idehen. Brad Feld. Over the last few days all of them have been amongst those writing to clarify their understanding of the Semantic Web and where it's going.

Each piece is thoughtful, each piece is well worth a read, and each differs somewhat from the others in outlook as they delve into 'ontologies', 'classic approaches', 'machine intelligence', 'SPARQL', 'Turtle' and other geekiness [meant in the nicest possible way]. I do wonder, though, if all of them are bypassing some fundamental points as they seek to clarify their own perspectives to themselves, to one another, and to the world; points with which I suspect that each may actually agree.

First, I definitely don't think that a company, technology or approach can only be either 'Web 2.0' or 'Semantic Web'. Sure, some companies will see themselves (or pitch themselves) in one space or the other, but there's going to be an ever-increasing number that reside firmly in both. Figures in the FT this week suggested that

“The pull-back was particularly acute in Silicon Valley, as big Web 2.0 investors such as Benchmark Capital, Kleiner Perkins Caufield & Byers and Omidyar Networks, the private financing vehicle of Ebay founder Pierre Omidyar, cut back on their investments.”

Those figures might more logically be interpreted as supporting this argument: ultimately, of course, companies won't be Web 2.0 or Semantic Web. They will be companies that solve a particular set of problems for a particular set of audiences. Some of the tools in the toolbox they use to do this will be Web 2.0-ish, some will be Semantic Web-ish, some will be both, and some will be neither. Those things that currently differentiate us - and to which we apply labels in order to reinforce the differentiation - will become mainstream, run of the mill, mundane, and simply expected. That's progress, and it's a good thing. Web 2.0 won't go away. The Semantic Web won't go away. Shouting about either might, and that doesn't have to mean that their importance has diminished.

Second, 'collective intelligence' applies equally to both. Tim O'Reilly's absolutely right that it's been a key differentiator of many Web 2.0 darlings:

“By contrast, I've argued that one of the core attributes of 'web 2.0' (another ambiguous and widely misused term) is 'collective intelligence.' That is, the application is able to draw meaning and utility from data provided by the activity of its users, usually large numbers of users performing a very similar activity. So, for example, collaborative filtering applications like Amazon's 'people who bought item this also bought' or last.fm's music recommendations, use specialized algorithms to match users with each other on the basis of their purchases or listening habits. There are many other examples: digg users voting up stories, or wikipedia's crowdsourced encyclopedia and news stories.”

It's also front and centre in Semantic Web work, though: for example, in the work coming from ourselves, Radar Networks and others. See this white paper [PDF] for one, and watch here and here for public sight of internal developments... soon. The connections that RDF makes so manifest are a perfect way to express, traverse, and mine the habits, behaviours and desires of the collective.
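To make that last point a little more concrete, here is a minimal sketch in Python, assuming user activity is recorded as plain (subject, predicate, object) triples. The `ex:` names, the `ex:bought` predicate and the `also_bought` helper are hypothetical illustrations, not part of any real platform or RDF library; a real system would more likely sit on a triple store queried with something like SPARQL.

```python
from collections import Counter

# Hypothetical activity data as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:alice", "ex:bought", "ex:book1"),
    ("ex:alice", "ex:bought", "ex:book2"),
    ("ex:bob", "ex:bought", "ex:book1"),
    ("ex:bob", "ex:bought", "ex:book3"),
    ("ex:carol", "ex:bought", "ex:book1"),
    ("ex:carol", "ex:bought", "ex:book2"),
]


def also_bought(triples, item):
    """Rank other items bought by the people who bought `item`."""
    # Step 1: follow 'bought' edges backwards from the item to its buyers.
    buyers = {s for s, p, o in triples if p == "ex:bought" and o == item}
    # Step 2: follow 'bought' edges forwards from those buyers to other items.
    counts = Counter(
        o for s, p, o in triples
        if p == "ex:bought" and s in buyers and o != item
    )
    return counts.most_common()


if __name__ == "__main__":
    print(also_bought(TRIPLES, "ex:book1"))
```

The same two-step traversal works for any predicate, which is the sense in which explicit connections make the habits of the collective easy to mine.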

Third, 'a formal ontology' is not a requirement, and nor is pushing structure in the face of the user.

Tim makes a good point here:

“The Semantic Web is a bit of a slog, with a lot of work required to build enough data for the applications to become useful. Web 2.0 applications often do a half-assed job of tackling the same problem, but because they harness self-interest, they typically gather much more data. And then solve for their deficiencies with statistics or other advantages of scale.”

I'm not sure, though, that SemWeb/Web 2.0 is the dichotomy here. Rather, it's a split between purist, all-encompassing, and hugely flexible on the one hand and pragmatic and 'good enough' on the other. I would agree that the stereotype would often place Semantic Web developers on one side of that divide and Web 2.0 startups on the other. The technology is not the point there, though, so much as the mindset. Believe me, we can do some great stuff to harness self-interest, gather much more data, and solve the deficiencies with statistics and other advantages of scale in a Semantic Web-ey Platform... :-)

“But I predict that we'll soon see a second wave of social networking sites, or upgrades to existing ones, that provide for the encoding of additional nuance. In addition, there will be specialized sites -- take Geni, for example, which encodes geneaology -- that will provide additional information about the relationships between people. Rather than there being a single specification capturing all the information about relationships between people, there will be many overlapping (and gapping) applications, and an opportunity for someone to aggregate the available information into something more meaningful.”

Too right, Tim. But I'd definitely suggest that those building the second wave should be talking to Talis, to Radar Networks, to Metaweb and to some of the other proponents of a new and far more Web 2.0-inspired Semantic Web paradigm. There are way too many synergies there to ignore...

Dan Brickley's comments in response to one aspect of Danny's argument are also interesting:

“Let me clear something up. Danny mentions a discussion with Tim O’Reilly about SemWeb themes.

Much as I generally agree with Danny, I’m reaching for a ten-foot bargepole on this one point:

'While Facebook may have achieved pretty major adoption for their approach, it’s only very marginally useful because of their overly simplistic treatment of relationships.'

Facebook, despite the trivia, the endless wars between the ninja zombies and the pirate vampires; despite being centralised, despite [insert grumble] is massively useful. Proof of that pudding: it is massively used. 'Marginal' doesn’t come into it.”

Too true. I've complained about Facebook, too [for example here and here]. But I use it, and millions of others use it. And it serves a purpose. That doesn't mean it can't be better.

Turning, finally, to Alex's post:

“The first problem is that RDF and OWL are complicated. Even for scientists and mathematicians these graph-based languages take time to learn and for less-technical people they are nearly impossible to understand. Because the designers were shooting for flexibility and completeness, the end result are documents that are confusing, verbose and difficult to analyze.”

Well, yes and no. That's what tools are for. And in a large number of cases the RDF may actually be auto-generated as part of some process of aggregation or value addition of which the data creator or manager need have no explicit awareness. The RDF may very well be generated as an aggregation of tiny snippets of data from large numbers of transactions; the interaction of a single user with a single resource doesn't have to result in a whole RDF document of its own. More on that later.
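A minimal sketch of that kind of auto-generation, assuming each user interaction is logged as a small record. The record fields, the `http://example.org/` URIs and the `to_ntriples` helper are assumptions made purely for illustration; the output uses the N-Triples line format, one statement per line.

```python
# Hypothetical transaction log: each user action is one small record.
TRANSACTIONS = [
    {"user": "u1", "action": "borrowed", "resource": "r42"},
    {"user": "u2", "action": "borrowed", "resource": "r42"},
    {"user": "u1", "action": "rated", "resource": "r7"},
]

BASE = "http://example.org/"  # assumed namespace, for illustration only


def to_ntriples(transactions):
    """Turn each transaction into one N-Triples statement (a single line)."""
    lines = []
    for t in transactions:
        subject = f"<{BASE}user/{t['user']}>"
        predicate = f"<{BASE}vocab/{t['action']}>"
        obj = f"<{BASE}resource/{t['resource']}>"
        lines.append(f"{subject} {predicate} {obj} .")
    return lines


if __name__ == "__main__":
    # The aggregate is just the accumulated statements; no user wrote any RDF.
    print("\n".join(to_ntriples(TRANSACTIONS)))
```

Each interaction contributes a single statement, and the data only becomes interesting once many of them are accumulated.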

And, also from Alex:

“Going back to John Markoff's example of a computer booking a perfect vacation, one can't help but think of a travel agency. In the good old days, you would go to the same agent over and over again. Why? Because just like your friends, your doctor, your teacher, the travel agent needs to know you personally to be able to serve you better.

The travel agent remembers that you've been to Prague and Paris, which is why he offers you a trip to Rome. The travel agent remembers that you're a vegetarian and orders the pasta meal for you on your flight. Over time people learn and memorize facts about life and each other. Until machines can do the same, knowledge of semantics, limited or full is not going to be enough to replace humans.”

Exactly. And that's where network effects, collective intelligence, behavioural observation and all the rest kick in. The knowledge comes from observation of an awful lot of behaviour; not from having the traveller fill in some long-winded and tedious form detailing an RDF graph representation of their travel preferences for all situations. Context matters. I, for example, want a window seat on short-haul flights, and an aisle seat on long-haul flights. It's not a simple preference one way or the other. I don't have a preferred airport to depart from, as so many other factors come into play. I'll go to a more distant departure airport for a better departure or travel time, for example. I won't always travel with the airlines I've got frequent flier cards for... but they don't have to be cheapest before I can or will. It's more complex than that. Current systems don't understand.
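A minimal sketch of learning that sort of context-dependent preference from observed behaviour rather than from a form. The booking history, the 'short' and 'long' haul labels, and the `learn_preferences` and `preferred_seat` helpers are all hypothetical; a real system would draw on far more signals than this.

```python
from collections import Counter, defaultdict

# Hypothetical observed bookings: (haul, seat actually chosen).
BOOKINGS = [
    ("short", "window"),
    ("short", "window"),
    ("short", "aisle"),
    ("long", "aisle"),
    ("long", "aisle"),
]


def learn_preferences(bookings):
    """Count seat choices separately for each context (here, haul length)."""
    by_context = defaultdict(Counter)
    for haul, seat in bookings:
        by_context[haul][seat] += 1
    return by_context


def preferred_seat(by_context, haul):
    """Return the most frequently chosen seat for this context, or None."""
    choices = by_context.get(haul)
    if not choices:
        return None
    return choices.most_common(1)[0][0]


if __name__ == "__main__":
    prefs = learn_preferences(BOOKINGS)
    print(preferred_seat(prefs, "short"), preferred_seat(prefs, "long"))
```

Keying the counts by context is what lets one traveller end up with different answers for short-haul and long-haul flights, instead of a single global preference.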

“Perhaps the worst challenge facing the semantic web is the business challenge. What is the consumer value? How is it to be marketed? What business can be built on top of the semantic web that can not exist today? Clearly the example of instant travel match is not a 'wow.' It's primitive and, in a way, uninteresting because many of us are already quite adept at being our own travel agent using existing tools. But assuming that there are problems that can be solved faster, there is still a question of specific end user utility.”

Talis. Radar Networks. Joost. Metaweb. Garlik. Need I go on? (I can... :-) )

“The way the semantic web is presented today makes it very difficult to market. The 'we are a semantic web company' slogan is likely to raise eyebrows and questions. RDF and OWL clearly need to be kept under the hood. So the challenge is to formulate the end user value in ways that will resonate with people.”

Absolutely right! SWEO is part of the answer. Companies like ours getting out and showing what can be done, and why it's valuable, is crucial too... and we're getting there.

And to answer my initial question: no, I don't think everyone is confused by or about the Semantic Web. We do, though, have a lot of different niche views of value (or lack thereof), clamouring for attention. These overlapping - and not necessarily incorrect - perspectives certainly could appear to be a result of confusion, if viewed from the outside. Language is a complicated thing, and these are complex ideas. Describing one with the other requires a number of iterations to arrive at clarity, but we're getting there.

There's a lot more to say, but this post has now gone on long enough (especially as I initially meant to simply point you at some interesting blog posts...).
