Semantically powered question-answering start-up True Knowledge today made its Semantic Search API available for public consumption, taking the next step on the company's journey out of beta and giving a clear indication of how they intend to generate revenue.
As the company's press release notes,
"True Knowledge offers two distinct API services for developers: the 'Direct Answer API' and the 'Query API.' The Direct Answer API allows developers to leverage True Knowledge’s natural language question answering technology, giving any search site or application the ability to provide a single direct answer for questions asked on any subject in plain English. This is especially well suited to mobile applications where providing a lengthy list of search results may be impractical.
The Query API allows developers to bypass True Knowledge’s natural language translation system and directly query True Knowledge’s knowledge base using a simple query language. This allows automated systems such as web and mobile applications to tap into True Knowledge’s vast machine-understandable knowledge of the world, making them behave more intelligently."
I spoke with CEO William Tunstall-Pedoe ahead of today's announcement to see how the core knowledge base continues to improve, and to discuss the company's plans.
For those who haven't tried it, True Knowledge takes an interesting approach: it attempts to answer your question directly, rather than simply returning hundreds or thousands of documents that might contain the answer, as traditional search engines tend to. Tunstall-Pedoe quoted Google's Larry Page during our conversation, noting that Page has asserted that
"the perfect search engine would understand exactly what you mean and give back exactly what you want."
It is this that True Knowledge attempts with their 'Internet Answer Engine.' Core to their solution are a comprehensive knowledge base (137 million facts, and growing), a proprietary system for understanding a query, and a powerful inference capability that enables the system to answer questions more reliably. Part of that reliability, as Tunstall-Pedoe frequently stresses, lies in the system's ability to know when it doesn't know the answer. Alongside a success rate of less than 50% for providing answers to questions, this may seem little more than an academic curiosity. But an ability to reliably know when to fall back to less structured approaches (such as passing the query to Google) is far better than 'guessing' or delivering wholly inappropriate responses, especially once the Answer Engine's capabilities are embedded in some third-party site.
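The fall-back behaviour described above can be sketched in a few lines of Python. This is an illustration only, not True Knowledge's API: the function names, the return shape, and the toy answer engine and search stand-ins are all assumptions made for the example.

```python
from typing import Callable, List, Optional


def answer_or_fall_back(
    question: str,
    direct_answer: Callable[[str], Optional[str]],
    web_search: Callable[[str], List[str]],
) -> dict:
    """Return a direct answer if the engine has one, else ordinary search results."""
    answer = direct_answer(question)
    if answer is not None:
        return {"kind": "answer", "value": answer}
    # The engine "knows that it doesn't know", so hand the query on instead of guessing.
    return {"kind": "results", "value": web_search(question)}


# Hypothetical stand-ins for a real answer engine and a real search engine.
def toy_answer_engine(question: str) -> Optional[str]:
    known = {"what is the capital of france?": "Paris"}
    return known.get(question.strip().lower())


def toy_web_search(question: str) -> List[str]:
    return ["https://example.com/search?q=" + question.replace(" ", "+")]
```

The point of the design is that the caller always gets something usable: a single answer when the engine is confident, and a conventional result list when it is not.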
True Knowledge's process of inference also allows the system to cope with ambiguity, and even with contradictory 'facts.' During our conversation, we told the system that President Obama was born in Cambridge. It allowed us to make this assertion, but subsequent analysis of the overwhelmingly contradictory data drawn from elsewhere in the knowledge base meant that the assertion was deemed to be untrue and flagged as such.
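One very simple way to picture this kind of check is a majority test over the values already recorded for a single-valued property such as place of birth. The sketch below is not True Knowledge's algorithm, which the company has not described here; the function name, the 0.8 threshold, and the sample data are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import List


def flag_contradicted(existing_values: List[str], new_value: str,
                      threshold: float = 0.8) -> bool:
    """Return True when existing evidence overwhelmingly favours a different value.

    existing_values holds every value already recorded for one single-valued
    property of one entity (hypothetical data model).
    """
    if not existing_values:
        return False  # nothing to contradict the new assertion
    top_value, top_count = Counter(existing_values).most_common(1)[0]
    share = top_count / len(existing_values)
    return top_value != new_value and share >= threshold


# Made-up evidence: nine sources agree, one dissents.
birthplace_evidence = ["Honolulu"] * 9 + ["Cambridge"]
```

With this made-up evidence, `flag_contradicted(birthplace_evidence, "Cambridge")` flags the new assertion, while `flag_contradicted(birthplace_evidence, "Honolulu")` does not. The assertion is still accepted into the store; it is only marked as contradicted.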
A different query, in which I ask 'How far is San Jose from SFO?,' shows both how the system copes with ambiguity and the manner in which supporting facts are drawn from sites such as Metaweb's Freebase.
The current True Knowledge home page is not going to draw huge numbers of users away from their search engine of choice, but that isn't really the point. As Tunstall-Pedoe pointed out, the site is intended to showcase the company's capabilities and facilitate the addition of new knowledge. As well as the millions of facts drawn from Wikipedia, Freebase and a growing body of licensed commercial content, over 120,000 facts have already been added by individuals in the beta programme. The real utility of True Knowledge will lie in licensing the underlying system for use in vertical and horizontal third-party applications, and public availability of the True Knowledge API begins that process. There's a long way to go in further extending the knowledge base, suggesting that vertical search applications may be the first to sign up; it's much easier to approach comprehensiveness within a bounded domain than across all areas of knowledge.
The market for semantically enhanced search is growing crowded, and stalwarts of the search industry have been hard at work too, with Google and others getting increasingly good at returning actual answers to factual questions.
Tunstall-Pedoe used a slide to demonstrate the differentiation the company sees between itself and 'obvious' competitors such as Wikipedia, Freebase, and the hotly anticipated Wolfram Alpha. Key differentiators in the diagram included True Knowledge's ability to infer (something Wolfram Alpha also claims), its language independence (although currently only available in English, the concept extraction techniques used by True Knowledge should work equally well in other languages), and the system's reliance upon an internal ontology comprising 20,000 classes (plus biological species, product information, etc.). True Knowledge (unsurprisingly) scored far better than the competition, but in a market that also includes the likes of Hakia and Powerset (neither of which could usefully answer my question about San Jose and SFO) the true picture is a lot more complex.
True Knowledge is certainly interesting, and frequently impressive. It remains to be seen whether a platform proposition will set them firmly on the road to riches, or whether they'll end up finding more success following the same route as Powerset and getting acquired by an existing (enterprise?) search provider.