
Zemanta talks Linked Data with SDK and commercial API

Written by Paul Miller, Contributor

I covered Slovene semantic technology startup Zemanta back in September when they secured investment from New York City's Union Square Ventures, and the company also received frequent mentions in the Semantic Web Gang's recent look back over 2008.

Yesterday, the company released an update to their popular WordPress plug-in and today they announced [PDF] commercial availability of their 'Semantic API.'

The company describes the API as follows:

"We analyze your post through our proprietary natural language processing and semantic algorithms, and statistically compare its contextual framework to our preindexed database of content.

We are using a combination of machine learning techniques and end-user input from our widget users, that enables us to train the engine and constantly improve the recommendations."

Users familiar with the blog plug-in will recognise - and probably value - these capabilities, which the API makes available for use in other situations.

Superficially, there are clear similarities with the capabilities of services such as Thomson Reuters' Open Calais, which also permits third parties to pass data via an API and receive structured and enriched results in return. A news article discussing a merger, for example, might be returned marked up with structured information on the companies involved, their key personnel, etc.

Given the backgrounds of Zemanta and Thomson Reuters, and the different data sets upon which they draw, a clear distinction is likely to emerge in the use cases for which each is appropriate. Zemanta appears better suited to the informal Web (pulling content from IMDb, Twitter and the like), whilst Calais will excel in mission-critical applications at the Fortune 500 and their ilk. Both add value in the mid-range, and only time will tell which is preferred there.

Interestingly, both are making moves to embrace the Semantic Web's Linking Open Data movement, which I've covered frequently here. Calais made announcements in that direction back in September, and an upcoming release of their service will make good on them. Zemanta's press release today states:

"Zemanta fully supports the Linking Open Data initiative. It is the first API that returns disambiguated entities linked to dbPedia, Freebase, MusicBrainz, and Semantic Crunchbase. The data can be returned in the standard format of Semantic web – RDF. It is an ideal gateway from unstructured web to semantic web. This represents a major step ahead for efforts to connect the Web into a semantic web of objects."

Zemanta CTO Andraž Tori commented:

"I see it as a stargate portal from unstructured content into the world of Semantic Web."

Zemanta has already signed up a number of partners, one of which is Freebase. Jamie Taylor (who recorded an early podcast about Freebase here) commented on the way that end users might benefit from accessing Freebase data via Zemanta:

"For publishers, the Zemanta API acts as a front door to the universe of open data on the web, facilitating the jump from unstructured text to semantic entities. You can take plain text, use the Zemanta API to resolve that text into strongly identified entities, and then query Freebase for detailed information about the mentioned people, places, movies, etc. Truly empowering."
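For readers who think in code, the first half of the flow Taylor describes (plain text in, strongly identified entities out) might look something like the Python sketch below. It is an illustration only and does not call or reproduce Zemanta's actual API: the lookup table, the `resolve_entities` and `to_ntriples` helpers, and the example URIs are all hypothetical stand-ins, with a simple string match taking the place of Zemanta's language processing. The follow-up query against Freebase is left out.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical lookup table standing in for an entity-extraction service.
# The URIs follow DBpedia's resource naming pattern but are illustrative only.
KNOWN_ENTITIES = {
    "Union Square Ventures": "http://dbpedia.org/resource/Union_Square_Ventures",
    "WordPress": "http://dbpedia.org/resource/WordPress",
}

# Dublin Core term used here to say "this post refers to that entity".
REFERENCES = "http://purl.org/dc/terms/references"


@dataclass(frozen=True)
class LinkedEntity:
    surface_form: str  # the text as it appeared in the post
    uri: str           # the linked-data identifier it was resolved to


def resolve_entities(text: str) -> List[LinkedEntity]:
    """Return each known entity mentioned in the text, in order of first appearance."""
    found = []
    for name, uri in KNOWN_ENTITIES.items():
        position = text.find(name)
        if position != -1:
            found.append((position, LinkedEntity(name, uri)))
    return [entity for _, entity in sorted(found, key=lambda pair: pair[0])]


def to_ntriples(entities: List[LinkedEntity], post_uri: str) -> List[str]:
    """Express each mention as one RDF statement in N-Triples syntax."""
    return [f"<{post_uri}> <{REFERENCES}> <{entity.uri}> ." for entity in entities]


if __name__ == "__main__":
    post = "Zemanta, backed by Union Square Ventures, updated its WordPress plug-in."
    for line in to_ntriples(resolve_entities(post), "http://example.org/post/1"):
        print(line)
```

Each line printed is a single RDF statement linking the (made-up) post address to an entity identifier, which is the kind of output a Linked Data consumer could then follow to a source such as Freebase for more detail.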

Use of the API is free for up to 10,000 API calls per month, with a subscription fee above that level.
