Thomson Reuters today announced Calais 4.0, 'a web service that uses natural language processing technology to semantically tag text that is input to the service. The tags are delivered to the user who can then incorporate them into other applications - for search, news aggregation, blogs, catalogs, you name it.'
My ZDNet colleague Paul Miller, whose semantic knowledge is vastly superior to mine, has covered the fundamentals of today's release here and here, and the video above provides a basic overview of the Calais proposition to users.
I talked with Tom Tague of Calais last month about progress and this upcoming release, and about Thomson Reuters' desire to be the semantic plumbing for the planet. For the largest business data analyst in the world, rich semantic contextual search of its information gold mine adds value and is a driving force in further monetizing its content. (The Flash intro on the main Thomson Reuters site gives an idea of the depth of content they offer.)
So what's in it for you as a potential implementer? The Open Calais plumbing aims to provide rich contextual linking of information, which ultimately makes search results much more relevant and valuable.
In a utopian world, your multiple shared drives or database silos would all be connected in a single giant graph data structure of interconnected nodes.
Adopting the free Open Calais API allows you to federate all your content, so that all your information would be available in an interlinked data cloud.
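To make the idea of a graph of interconnected nodes a little more concrete, here is a minimal sketch in Python. It assumes each document has already been given a set of semantic tags by some tagging service; the file names, the tags and the helper functions `build_graph` and `related_documents` are all hypothetical and are not part of the Open Calais API. Documents and tags both become nodes, and two documents count as linked when they share a tag.

```python
from collections import defaultdict

# Hypothetical sample data: document names and the entity tags a tagging
# service might return for them. None of this comes from Calais itself.
TAGGED_DOCS = {
    "shared_drive/q3_report.doc": {"Thomson Reuters", "ClearForest"},
    "crm/notes_1042.txt": {"ClearForest", "text analytics"},
    "wiki/semantic_web.html": {"text analytics", "Freebase"},
}


def build_graph(tagged_docs):
    """Build an undirected graph whose nodes are documents and tags."""
    graph = defaultdict(set)
    for doc, tags in tagged_docs.items():
        for tag in tags:
            graph[doc].add(tag)  # edge from document to tag
            graph[tag].add(doc)  # edge from tag back to document
    return graph


def related_documents(graph, doc):
    """Return the other documents that share at least one tag with doc."""
    related = set()
    for tag in graph[doc]:
        related |= graph[tag]
    related.discard(doc)  # a document is not related to itself
    return related


graph = build_graph(TAGGED_DOCS)
print(sorted(related_documents(graph, "crm/notes_1042.txt")))
# prints ['shared_drive/q3_report.doc', 'wiki/semantic_web.html']
```

In this toy example a CRM note and a report on a shared drive end up connected purely because both mention ClearForest, which is the kind of cross-silo link the paragraph above describes.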
In 2007 Thomson Reuters bought text analytics company ClearForest, a provider of text-driven business intelligence solutions that supply a bridge between unstructured text and your enterprise data.
This is now the underpinning of Open Calais; if an enterprise wants to have their entire information cloud behind their firewall, ClearForest could be a foundation.
The enterprise goal would be to manageably commingle private and public data within acceptable security boundaries. This could include streaming in, for example, financial information or a pharmaceutical taxonomy from Freebase or a similar source, and mashing it up with your private data and information.
Long-term content interoperability is a Calais strategic objective, and they are encouraging wide-scale adoption by Publishers, Bloggers, Software Providers, Content Managers and Developers, with lots of hand holding on their new Drupal and Open Calais powered website.
From a business perspective, Thomson Reuters clearly gets the open business model, moving away from a walled garden of content to a metered, on-demand model for delivering content assets. Instead of you visiting them for information, they will deliver what you need into your environment...
Coming from a major player in the media world, the launch of this latest industrial-strength technology release to a rapidly increasing developer base is a hugely encouraging development.
The connected data ecosphere just got a lot richer, and it will be fascinating to see how the other key players in the space coalesce to create something I believe will transform both the search and collaboration worlds.