The Calais team inside Thomson Reuters continues to impress, and today's release could in many ways be its best yet, as it promises to contribute massively to the growing body of 'Linked Data' on the Web. As regular readers will remember, this 'Linked Data' is the same stuff that Sir Tim Berners-Lee describes as:
"the Web done right."
Built on top of Calais and scalably hosted on Amazon's EC2 service, the new site at SemanticProxy.com enters public beta today. It enables anyone to generate rich semantic metadata for pages on the open Web, simply by passing the URL to SemanticProxy. Human visitors can do so via a standard web form, but the same results can also be achieved programmatically via an API, and it is programmatic use of this sort that will drive real growth in the availability of Linkable Data out on the open Web.
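To make the programmatic route a little more concrete, here is a minimal Python sketch of how an application might assemble a request that hands a page's URL to a proxy of this kind. The base address, the key/format/URL path layout, and the names `PROXY_BASE` and `build_request_url` are all assumptions made for illustration; they are not taken from the SemanticProxy documentation, which should be consulted for the real request format.

```python
from urllib.parse import quote

# Hypothetical base address, used purely for illustration.
# The real service's endpoint and path layout may differ.
PROXY_BASE = "https://api.example.org/processurl"


def build_request_url(api_key: str, page_url: str, output_format: str = "rdf") -> str:
    """Assemble a request URL that passes a page's address to the proxy.

    Assumed layout: <base>/<api key>/<output format>/<page URL>.
    """
    if not page_url.startswith(("http://", "https://")):
        raise ValueError("page_url must be an absolute http(s) URL")
    # Escape the key and format completely; leave ':' and '/' intact in
    # the page URL so that it stays readable inside the path.
    return "/".join([
        PROXY_BASE,
        quote(api_key, safe=""),
        quote(output_format, safe=""),
        quote(page_url, safe=":/"),
    ])


if __name__ == "__main__":
    print(build_request_url("MY-KEY", "http://news.example.com/story/123"))
```

The function only builds the address. Fetching it, and parsing whatever metadata comes back, would be the calling application's job.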
"The [Semantic Web] market today is largely building little semantic kingdoms - little self-contained ecosystems - rather than the Semantic Web."
The successes of the Linking Open Data Project's enthusiasts aside, this sentiment is unfortunately one with which it is hard to disagree. Many of the pieces in the Semantic Web stack are being deployed today, but too often in ways that build better, richer, more expressive silos, rather than with a technical and procedural presumption that the data should play its full role out on the open Web.
A tool like SemanticProxy makes it straightforward to generate structured metadata from pages on the open Web. Furthermore, it respects the Linked Data community's 'rules', and Tague stressed that:
"SemanticProxy will return dereferenceable Linked Data URIs by the end of this quarter."
When I mentioned this to one of the Linked Data community's proponents, his immediate response of
was cut short as he headed off to explore the new service.
A simple online demonstration allows users to paste a URL into the site and see the Calais service return identified terms... and a measure of their relevance to the wider story. Optimised at the moment for the content covered by major news sites, the demonstration works best for factual items such as this one from the BBC.
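An application consuming results of this kind would probably want to keep only the terms that matter most to the story. The sketch below assumes each identified term arrives with a name, a type and a relevance score between 0 and 1. The `Term` structure, the `top_terms` helper, the 0.3 threshold and the sample values are all invented for illustration and do not reflect the service's actual output format.

```python
from typing import Iterable, List, NamedTuple


class Term(NamedTuple):
    """Hypothetical shape for one identified term."""
    name: str
    kind: str         # e.g. "Person", "Company", "City"
    relevance: float  # assumed to lie between 0.0 and 1.0


def top_terms(terms: Iterable[Term], threshold: float = 0.3) -> List[Term]:
    """Keep terms at or above the threshold, most relevant first."""
    kept = [t for t in terms if t.relevance >= threshold]
    return sorted(kept, key=lambda t: t.relevance, reverse=True)


if __name__ == "__main__":
    # Made-up sample values, not real output from the service.
    sample = [
        Term("London", "City", 0.21),
        Term("Thomson Reuters", "Company", 0.74),
        Term("Tim Berners-Lee", "Person", 0.48),
    ]
    for term in top_terms(sample):
        print(f"{term.relevance:.2f}  {term.kind:<8} {term.name}")
```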
Used for real, as a proxy service that Web applications might routinely query in front of any news item, the possibilities are diverse and compelling. And once all those big news sites are effectively appearing to generate Linked Data? The cloud just got an awful lot bigger, an awful lot more current, and an awful lot more powerful.