﻿<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:s="http://www.zdnet.com/search" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <link>http://www.zdnet.com/</link>
    <title>ZDNet | The Semantic Web Blog RSS</title>
    <description>Latest blogs in The Semantic Web</description>
    <language>en</language>
    <copyright>ZDNet</copyright>
    <managingEditor>customerservice@zdnet.com (ZDNet Customer Services)</managingEditor>
    <webMaster>uk-engineering@cbsinteractive.com (ZDNet Webmaster)</webMaster>
    <pubDate>Sun, 19 May 2013 05:43:35 -0700</pubDate>
    <lastBuildDate>Sun, 19 May 2013 05:43:35 -0700</lastBuildDate>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <ttl>2</ttl>
    <image>
      <url>http://i.zdnet.com/images/spry/zdnet_300x300.jpg</url>
      <link>http://www.zdnet.com/</link>
      <title>ZDNet | The Semantic Web Blog RSS</title>
      <width>143</width>
      <height>39</height>
    </image>
    <s:counts>
      <start>0</start>
      <return>20</return>
      <found>119</found>
    </s:counts>
    <item>
      <guid isPermaLink="false">6126000376</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/linked-data-the-return-of-the-cloud/376]]></link>
      <title><![CDATA[Linked Data: the return of the Cloud]]></title>
      <description><![CDATA[The Linked Data Cloud diagram is back; bigger, better, and powered by the CKAN registry tool.]]></description>
      <pubDate><![CDATA[Thu, 23 Sep 2010 13:05:59 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p>After more than a year of stagnation, the 'Linked Open Data cloud diagram' found in so many presentations and blog posts is back. It's bigger, it's better, and it points to the continued growth of this disparate community.
</p>

<div id='attachment_377'><p>Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/</p></div>

<p>Working with <a href="http://www.wiwiss.fu-berlin.de/en/institute/pwo/bizer/team/JentzschAnja.html">Anja Jentzsch</a> of Freie Universität Berlin, <a href="http://richard.cyganiak.de/">Richard Cyganiak</a> of Galway's Digital Enterprise Research Institute (<a href="http://www.deri.ie/">DERI</a>) has returned to the diagram he first drew in 2007, and brought it right up to date. 203 data sets are represented, and together they comprise around 395 million links between over 25 billion <a href="http://en.wikipedia.org/wiki/Resource_Description_Framework">RDF</a> statements. Not bad for something that began life as a small academic exercise, but there's a very long way still to go.
</p>

<p>Starting as one of Sir Tim Berners-Lee's <a href="http://www.w3.org/DesignIssues/LinkedData.html">Design Issues notes</a>, the idea of Linked Data was given credence by the informal <a href="http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData">Linking Open Data project</a>. As the project web site notes,
</p>

<blockquote>
<p>The goal of the W3C SWEO Linking Open Data community project is to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting RDF links between data items from different data sources.</p>
</blockquote>

<p>From those early days, when researchers 'found' data sitting on the web and set about transforming it into RDF by themselves, the initiative has grown significantly. Governments, media properties from the BBC to the <em>New York Times</em>, and retailers such as Best Buy have been amongst those to set about plugging their own data into the increasingly rich network.
</p>

<p>Links between these data sets are established when one draws upon concepts defined in another. <a href="http://www.geonames.org/ontology/">GeoNames</a>, for example, is seen by many as a useful resource to draw upon in describing places. Rather than fuzzily talking about 'Paris,' I might make the statement unambiguous by referring to <em><a href="http://www.geonames.org/4717560/paris.html">this</a></em> Paris rather than <em><a href="http://www.geonames.org/6942553/paris.html">that</a></em> one. For a dataset (describing works of art, for example) in which location is not key, it's far easier to let someone else worry about recording where Paris is, what country it's in, how big it is etc rather than do it all myself.
</p>
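<p>To make the idea of linking concrete, here is a minimal sketch in Python rather than in any real RDF toolkit. It assumes statements are plain (subject, predicate, object) triples. The artwork identifiers and the 'locatedIn' property are hypothetical, invented purely for illustration, and the two Paris URLs are simply the GeoNames pages linked above, used here as stand-in identifiers.</p>

```python
# Illustrative only: the artwork identifiers and the "locatedIn" property are
# hypothetical; the two Paris URLs are the GeoNames pages linked in the post.
PARIS_A = "http://www.geonames.org/4717560/paris.html"
PARIS_B = "http://www.geonames.org/6942553/paris.html"
LOCATED_IN = "http://example.org/vocab/locatedIn"

# Each statement is a (subject, predicate, object) triple, as in RDF.
artworks = [
    ("http://example.org/art/1", LOCATED_IN, PARIS_A),
    ("http://example.org/art/2", LOCATED_IN, PARIS_B),
    ("http://example.org/art/3", LOCATED_IN, PARIS_A),
]


def subjects_located_in(triples, place_uri):
    """Return the subjects whose locatedIn object is exactly place_uri."""
    return [s for (s, p, o) in triples if p == LOCATED_IN and o == place_uri]


def external_links(triples, local_prefix):
    """Count triples whose object points outside the local data set."""
    return sum(1 for (_, _, o) in triples if not o.startswith(local_prefix))
```

<p>Asking for everything in one Paris never returns anything from the other, because the two places are different identifiers rather than the same string. Every triple whose object lives outside the local data set is also, in effect, one of the links counted in the cloud diagram.</p>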

<p>This concept of linking from one resource to another is clearly key, but the relative dearth of connections on the diagram shows that there is still some way to go. <a href="http://dbpedia.org/About">DBpedia</a> remains disproportionately important, and whilst it's unlikely that every resource will ever meaningfully connect to every other resource, some more connections would be valuable.
</p>

<p>The latest iteration of the cloud diagram is based upon information that the community was <a href="http://www.mail-archive.com/public-lod@w3.org/msg06233.html">invited</a> to <a href="http://ckan.net/group/lodcloud">record in the CKAN data repository</a>, which should make data collection and update easier and more accurate. In principle, future versions of the diagram could be generated programmatically as the data changes.
</p>

<p>This month's revision of the Linked Data cloud is a useful illustration of progress within the community. Much remains to be done, both in <a href="http://cloudofdata.com/2009/10/licensing-of-linked-data/">clarifying licenses</a> and in collecting even more data. There is also work to do in adequately explaining what people might find. <a href="http://richard.cyganiak.de/2007/10/lod/imagemap.html">The diagram</a> does link through to descriptions of each resource, but the 'explanations' are all too often only meaningful to people knowledgeable enough not to need an explanation in the first place!
</p>

<blockquote>
<p>This repository contains data from JISC, who fund research and infrastructure in the UK.</p>
</blockquote>

<p>What data? About what sort of thing?
</p>

<blockquote>
<p>Airport data from Our Airports published as RDF</p>
</blockquote>

<p>What sort of data? Where in the world?
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000371</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/siri-acquired-by-apple-iphone-becomes-the-virtual-personal-assistant/371]]></link>
      <title><![CDATA[Siri acquired by Apple; iPhone <em>becomes</em> the Virtual Personal Assistant?]]></title>
      <description><![CDATA[Apple buys Siri, and brings real semantic smarts to Cupertino.]]></description>
      <pubDate><![CDATA[Fri, 30 Apr 2010 20:12:48 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-apple/">Apple</category>
      <category domain="http://www.zdnet.com/topic-iphone/">iPhone</category>
      <category domain="http://www.zdnet.com/topic-mobility/">Mobility</category>
      <category domain="http://www.zdnet.com/topic-smartphones/">Smartphones</category>
      <media:text type="html"><![CDATA[<p>According to an FTC filing <a href="http://scobleizer.com/2010/04/28/breaking-news-siri-bought-by-apple/">unearthed</a> by <a href="http://scobleizer.com/">Robert Scoble</a> earlier this week, <a href="http://www.apple.com/">Apple</a> appeared to have acquired <a href="http://www.siri.com/">Siri</a>. The rumour has since been confirmed, and I spoke with Siri <a href="http://siri.com/about/team">board member</a> Norman Winarsky to get some insight as to what this news might mean for the semantic technology startup I've been following since before it <a href="http://blogs.zdnet.com/semantic-web/?p=211">emerged from stealth back in 2008</a>.
</p>

<p>Siri <a href="http://blogs.zdnet.com/semantic-web/?p=333">reappeared on the scene back in February</a>, launching their innovative (US-only) iPhone app after a protracted silence. In March, the company <a href="http://techcrunch.com/2010/03/19/siri-gummi-hafsteinsson/">snatched Gummi Hafsteinsson from Google</a>, strengthening their mobile credentials still further and appearing to maintain momentum. Semantic technology enthusiasts watched for Siri to get even smarter, mobile aficionados waited with bated breath for <a href="http://blogs.zdnet.com/semantic-web/?p=333">the promised Android app</a>, and those of us outside the United States waited for a version of the app that understood <em>our</em> accents, <em>our</em> geographies, and the APIs of the companies offering travel, events, entertainment, restaurants and the rest in <em>our</em> territories. Although <a href="http://www.sri.com/about/managers/winarsky.html">Winarsky</a>, VP for Ventures, Licensing and Strategic Programs at <a href="http://www.sri.com/">the company</a> from which Siri was originally spun out, felt unable to comment on the specifics of the product road map moving forward, I can't help speculating that any Android (or BlackBerry) app has probably moved a lot further down the priority list.
</p>

<p>Even with <a href="http://www.marketwatch.com/story/apple-passes-microsoft-on-sp-500-market-cap-list-2010-04-22">billions</a> burning a hole in his pocket, it's unlikely (although not impossible) that Steve Jobs spent <a href="http://techcrunch.com/2010/04/28/apple-siri-200-million/">more than $200 million</a> just to interfere with the release of an Android app. So what <em>did</em> Apple want?
</p>

<p>Firstly, they probably wanted to make sure that someone else didn't pounce and grab the opportunity. Secondly, Siri <em>is</em> smart, innovative and impressive. The application does a great job of <em>appearing</em> simple and intuitive, whilst fulfilling a useful - and potentially complex - role. Given the surprisingly limited voice capabilities of Apple's current mobile devices, they may well be interested in harnessing some of that language interpretation capability for future mobile devices and even their desktop line. I doubt that required Siri, though, so much as the far cheaper poaching of some key staff from Siri or their partner (and fellow SRI spin-out), <a href="http://www.nuance.com/">Nuance</a>.
</p>

<p>In its current form, Siri remains relatively limited in scope (finding and booking things on the move, essentially) and geography (the United States), and if Apple is serious about making something of the <em>company</em> they've acquired (rather than just securing its staff and IP) it will need to move forward aggressively on both fronts. Do they want a travel assistant with global reach, or do they want to give <em>every</em> iPhone/iPad/Mac owner their very own virtual secretary?
</p>

<p>Winarsky is quick to suggest that the notion of the 'virtual personal assistant' applies in contexts far beyond the mobile niche carved out by Siri. Citing a 'perfect storm' created by the conjunction of bandwidth, compelling user interfaces and on-demand network compute power, he argues that Siri's current incarnation merely represents a taste of what's still to come. SRI continues to build upon the CALO project from which Siri was born, and opportunities are becoming increasingly clear in verticals as diverse as healthcare, retail, and call centre operations. SRI conducts 'more than 2,000 projects per year,' and the next CALO spin-off could emerge from their labs 'in less than six months.' Surely Apple didn't buy the wrong CALO baby?
</p>

<p>Far from embarking upon a mission to solve the Turing Test, which Winarsky argues cannot be done, SRI researchers (and the Siri team) are amongst those transforming the tarnished image of Artificial Intelligence. By working within a defined vertical (healthcare, etc), it is entirely feasible to train an agent in such a way that it can reliably infer and act intelligently with respect to context. When you're only dealing with medical complaints (say), it's not impossible to train a machine to excel with idiom, jargon, and even ambiguity. It's also feasible to expect that the machine could learn - and adapt - 'in the wild,' without recourse to the lab every time external factors shift substantially.
</p>

<p>As well as pointing to SRI's long heritage, successful track record, and future plans, Winarsky was keen to stress the important role played by Siri's venture capital backers in reaching the current settlement with Apple. Both Menlo Ventures' Shawn Carolan and Morgenthaler's Gary Morgenthaler apparently 'did the heavy lifting' on the deal. Winarsky points out that the hardest part of getting a company like Siri to this stage isn't actually developing the technology, but recognising the real market opportunity and 'navigating amongst the giants' in the space. In all of this, he reiterated, good VCs proved invaluable. And they weren't even listening in on the call!
</p>

<p>I never thought I'd see 'metadata' discussed in the UK Parliament, then I did when we adopted Government Metadata Standards a few years back. I never thought I'd see 'RDF' pass the lips of a head of state, then Prime Minister Gordon Brown said it this year. Will Steve Jobs stand up at WWDC'10 and talk about Apple's embracing of/invention of 'the Semantic Web?' We shall see!
</p>

<p>Good luck to all at Siri, and I look forward to seeing what Apple does next... and what SRI spawns next.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000364</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/david-siegel-discusses-the-power-of-pull-a-different-view-of-the-semantic-web/364]]></link>
      <title><![CDATA[David Siegel discusses the Power of Pull; a different view of the Semantic Web?]]></title>
      <description><![CDATA[This podcast conversation with David Siegel discusses his latest book, Pull, and explores the many ways in which Siegel believes a semantic web can power business today and into the future.]]></description>
      <pubDate><![CDATA[Thu, 18 Mar 2010 11:57:05 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p><img src="http://cdn-static.zdnet.com/i/story/61/26/000364/davidsiegel.jpg" width="200" height="242" class="alignRight size-full wp-image-366" />Author <a href="http://www.dsiegel.com/">David Siegel</a>'s latest book is a rather different beast to earlier works such as <a href="http://www.amazon.com/gp/product/1568304331?ie=UTF8&amp;tag=cloofdat-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1568304331"><em>Creating Killer Web Sites</em></a><img src="http://www.assoc-amazon.com/e/ir?t=cloofdat-20&amp;l=as2&amp;o=1&amp;a=1568304331" width="1" height="1" border="0" alt=""  />, Siegel explores 'the power of the Semantic Web to transform your business,' using a series of case studies to demonstrate some of the myriad ways in which provision of structured and accessible data can alter business models across a range of industry sectors.
</p>

<p>Siegel's definition of the 'Semantic Web' is broader than that preferred by many, extending far beyond technical standards such as RDF and OWL to embrace <em>anything</em> that is both unambiguously structured and on the Web. He offers a '<a href="http://thepowerofpull.com/pull/foundations/semantic-web-acid-test">Semantic Web acid test</a>,' and relies on this throughout the book in an effort to clarify his understanding of the space.
</p>

<p><a href="http://cloudofdata.com/2010/03/talking-with-david-siegel-about-pull-and-the-semantic-web/">I spoke with David this week, and recorded our conversation as a podcast that is now available online</a>. We discuss the premise behind the book, and then explore some of the presumptions upon which his argument is built.
</p>

<p>Others, such as <a href="http://www.semanticsincorporated.com/2010/01/pulling-together-the-semantic-web-.html">Greg Boutin</a>, have written their own reviews of the book, and Siegel also recorded a different <a href="http://itc.conversationsnetwork.org/shows/detail4428.html">Technometria podcast</a> with Phil Windley earlier this month.
</p>

<p>So what do you think? <a href="http://cloudofdata.com/2010/03/talking-with-david-siegel-about-pull-and-the-semantic-web/">Have a listen</a>, and see how you see Siegel's concept of 'Pull' relating to the Semantic Web. Broader? Narrower? The same? Different? And will the emotive language and examples used throughout the book succeed in driving break-out growth where so many earlier efforts have had only modest success?
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000356</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/putting-the-semantic-web-to-work-in-e-commerce-with-goodrelations/356]]></link>
      <title><![CDATA[Putting the Semantic Web to work in e-Commerce with GoodRelations]]></title>
      <description><![CDATA[Martin Hepp and Jamie Taylor answer questions about the GoodRelations vocabulary in a podcast conversation, exploring opportunities to enrich the way in which we compare goods and services.]]></description>
      <pubDate><![CDATA[Mon, 22 Feb 2010 10:52:07 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p><a href="http://purl.org/goodrelations/"><img src="http://cdn-static.zdnet.com/i/story/61/26/000356/goodrelations-logo.gif" width="220" height="60" class="alignRight size-full wp-image-359" /></a>Millions of us rely upon online information to inform purchasing decisions, but the <em>ad hoc</em> fashion in which free-text descriptions of products and services are interpreted and offered up by mainstream search engines makes this a far less accurate process than we might wish. Working quietly behind the scenes the <a href="http://purl.org/goodrelations/">GoodRelations</a> vocabulary is setting out to do something about this, and with adopters such as <a href="http://www.bestbuy.com/">Best Buy</a> already onboard they're off to a great start.
</p>

<p>Last week, I spoke with Martin Hepp of Germany's <a href="http://www.unibw.de/">Universität der Bundeswehr München</a> and Jamie Taylor of San Francisco startup <a href="http://www.metaweb.com/">Metaweb</a> (best known to readers of this blog as the home of <a href="http://www.freebase.com/">Freebase</a>) to learn more. <a href="http://cloudofdata.com/2010/02/a-podcast-conversation-about-goodrelations-with-martin-hepp-and-jamie-taylor/">The result has just been released as a podcast</a>.
</p>

<p>Within specific organisations and supply chains, of course, Master Data Management is already well understood. Processes and procedures are in place to ensure that products are accurately and unambiguously described, and I touched on some of the semantic technology applications in <a href="http://www.semanticuniverse.com/articles-bringing-semantic-technologies-enterprise-data.html">this 2009 piece for <em>Semantic Universe</em></a>.
</p>

<p>The picture becomes somewhat less clear as data moves out of the enterprise and onto the web. Even on the product website itself, much of that internal structure and richness is inadequately conveyed. For the consumer (or aggregator) wishing to compare and contrast MP3 players from a number of competing providers, it can frequently be difficult to accurately ensure that they really are comparing apples with apples. It is here that GoodRelations comes into its own, offering data providers a consistent way in which to describe key attributes of their business and its products.
</p>

<p>Yahoo! already works to intelligently represent GoodRelations data in search results, and Martin Hepp asserts that Google "is doing something" with the data too. GoodRelations-encoded product descriptions, it seems, perform better in the mainstream search engines than their less structured competitors. The rich structure is also available for use in specific applications far beyond the generic search engine, and we touch on some of these possibilities during the conversation.
</p>

<p>US-based consumer electronics retailer <a href="http://www.bestbuy.com/">Best Buy</a> already <a href="http://www.chiefmartec.com/2009/12/best-buy-jump-starts-data-web-marketing.html">embeds GoodRelations RDFa</a> in product pages, and <a href="http://jay.beweep.com/index.php/2009/06/05/best-buy-local-stores-goes-semantic-with-good-relations-ontology/">reports improvements in findability and use</a>.
</p>

<p>Closer to home (for me, at least), UK supermarket giant <a href="http://www.tesco.com/">Tesco</a> has <a href="http://rdfa.info/2010/01/20/uk-retail-chain-tesco-adopts-rdfa/">begun to experiment with embedding RDFa</a> in pages. GoodRelations terms aren't used - yet - but it will be interesting to see how quickly that changes, and the applications that third parties might begin to build that leverage all this rich structure.
</p>

<p><a href="http://cloudofdata.com/2010/02/a-podcast-conversation-about-goodrelations-with-martin-hepp-and-jamie-taylor/">Have a listen to the podcast</a>, and see what you think. Does GoodRelations have a place on <em>your</em> website?
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000333</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/siri-offers-virtual-assistance-with-a-little-help-from-your-iphone/333]]></link>
      <title><![CDATA[Siri offers virtual assistance, with a little help from your iPhone]]></title>
      <description><![CDATA[Semantic Technology startup, Siri, releases a Virtual Personal Assistant for the iPhone and simplifies a wide range of tasks for US consumers on the move. Just by speaking to their phone, users can make dinner reservations, check the weather, find movies, flights and more.]]></description>
      <pubDate><![CDATA[Fri, 05 Feb 2010 06:29:26 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-iphone/">iPhone</category>
      <category domain="http://www.zdnet.com/topic-mobility/">Mobility</category>
      <category domain="http://www.zdnet.com/topic-smartphones/">Smartphones</category>
      <media:text type="html"><![CDATA[<p><a href="http://www.siri.com/"><img src="http://cdn-static.zdnet.com/i/story/61/26/000333/siri-logo.jpeg" width="200" alt="Siri" class="alignRight size-full wp-image-336" /></a>Back in June, I <a href="http://cloudofdata.com/2009/06/tom-gruber-talks-about-siri-the-virtual-personal-assistant/">recorded a podcast</a> with <a href="http://www.crunchbase.com/person/tom-gruber">Tom Gruber</a> of <a href="http://www.siri.com/">Siri</a>. A week later <a href="http://blogs.zdnet.com/semantic-web/?p=300">I saw the company's 'Virtual Personal Assistant' put through its paces on an iPhone, and was impressed</a>. Earlier this month I got on the phone with CEO <a href="http://www.crunchbase.com/person/dag-kittlaus">Dag Kittlaus</a> and VP of Engineering <a href="http://www.crunchbase.com/person/adam-cheyer">Adam Cheyer</a> for an update, and today <a href="http://itunes.apple.com/us/app/siri-assistant/id351778157?mt=8">you can download Siri for yourself via Apple's App Store</a>. Versions for Blackberry and Android will follow 'soon after,' and Kittlaus stresses that mobile is 'just the beginning.'
</p>

<p>Today's iPhone app is the first consumer offering from a company that has spent a long time thinking about this space. Much of the core research resulted from the $150 million <a href="http://caloproject.sri.com/">CALO</a> project at SRI, funded by DARPA. Siri itself <a href="http://blogs.zdnet.com/semantic-web/?p=211">emerged from SRI to close an $8.5 million Series A round</a> with <a href="http://www.menloventures.com/">Menlo Ventures</a> and <a href="http://www.morgenthaler.com/">Morgenthaler</a> back in October of 2008.
</p>

<p>Explicitly described as <em>complementary</em> to web search, rather than a replacement for it, Siri seeks to move beyond a paradigm based upon keywords and links to embrace one that is personal, task oriented, and <em>conversational</em> in nature. Siri guides the user along a path, making query formulation iterative and relatively painless, and ensuring that the application gets the information it needs. Early use cases are optimised to 'help you get things done,' probably whilst mobile. You might, for example, ask (by speaking to Siri's <a href="http://www.nuance.com/">Nuance</a>-powered speech processing engine) for 'Sushi near work at 7pm.' That simple request is relatively straightforward for a human being to understand, process, and act upon, but requires a significant degree of intelligence on the part of a software agent. Where is 'work'? What is 'sushi,' and what do you <em>actually</em> want to do with it (find a restaurant where you can eat it, presumably)? When is '7pm'? Alternatively you could let Siri conversationally lead you through the 'right' questions to reach the same outcome, as the sequence of screenshots at the end of this post demonstrates.
</p>

<p>At this point, it's worth mentioning that Siri (like so many location-powered applications on the Web, the iPhone, Android, or wherever) is currently only really effective in the United States. This is due to any number of factors, including consumer readiness and the easy availability of cheap yet comprehensive data, but the situation will <del datetime="00">doubtless</del> hopefully improve over time. Siri currently makes the point explicit, refusing to allow registration of a home or work address (or timezone) outside the United States. For the purposes of experimentation, my office has temporarily relocated to 1600 Pennsylvania Avenue, Washington DC, where there seems to be plenty of sushi available for tonight's dinner.
</p>

<p>The speech processing works well on the whole, although I'm bemused that Siri interpreted my daughter's 'Where is the nearest Greek island?' as '18 me.' Whilst it's possible that this is an AI's attempt to avoid causing offence by responding 'Greece, you silly girl' it seems more likely that the engine got very confused by her non-American enunciation. Trying to sound like Hannah Montana just made it worse.
</p>

<p>Even in the States, data is key to ensuring that apps such as this one deliver a rich and <em>useful</em> experience, as all the AI smarts and user interface polish in the world can't help an app that ignores the Starbucks across the street when you ask it to find you a coffee. Siri has lined up an impressive group of data providers including OpenTable, MovieTickets, TaxiMagic, Citysearch, Yelp, Yahoo Local, Gayot, Rotten Tomatoes, NYTimes.com, WeatherBug, AllMenus, StubHub, LiveKick, Maponics, Nuance and TrueKnowledge. Kittlaus celebrates the recent explosion of accessible APIs from sites such as these, claiming that Siri has acquired 'far more data than we've had time to integrate yet.' In a number of cases, revenue sharing arrangements mean that Siri gets a cut when money changes hands. A selection of test searches focussed on areas of the US to which I travel regularly delivered the sorts of results that I'd expect, and there's clear value in the integration of data from a number of different providers.
</p>

<p>The Siri team looks forward to analyzing the logs once users start putting this app to work. Amazon's Elastic Compute Cloud (<a href="http://aws.amazon.com/ec2">EC2</a>) will handle the heavy lifting in the early days, allowing computing resources to scale with demand. Once the team has an understanding of real-world loading, Cheyer suggests that they'll pull much of the computing resource back in-house to lower costs. There is a clear expectation that Siri's responses will iterate rapidly as data become available to show how users use the app.
</p>

<p>Further out, there's the ever-present need for more data. Kittlaus is also interested in increasing the opportunities for facilitating revenue-generating commercial transactions, and in allowing Siri to 'know you better.' Work, home, and current location is one thing. Why not favourite food, names and contact details for family (so I can have Siri 'tell my wife I'll be late home'), preferred airline, and more? It makes sense not to introduce these features from the outset, as consumers will need to both <em>value</em> and <em>trust</em> Siri before willingly giving up such detail. But you can be sure they'll be included soon.
</p>

<p>Voice already plays a role on mobile devices such as the iPhone, perhaps most usefully in <a href="http://itunes.apple.com/us/app/google-mobile-app/id284815942?mt=8">Google's search app</a>. It remains to be seen whether consumers will really use <em>two</em>, or look for many of Siri's features to move across and enrich the voice-powered search experience they're already getting from Google, which presumably has many of the same data deals already in place.
</p>

<p>Kittlaus stressed several times that Siri will deliver value on other platforms, suggesting a Siri email address (similar to <a href="http://www.tripit.com/">plans@tripit.com</a>, presumably), a destination web site with which users might converse, or a Siri IM buddy that could be drawn into conversations.
</p>

<p>By delivering value to users, and by building an ongoing relationship (backed by data) that's difficult to replicate, Siri seeks to offer a compelling and defensible business. Playing with the application from the other side of the Atlantic it shows clear promise, and I look forward to putting it through its paces on my next trip to the States.
</p>

<p>And, of course, you can be pretty sure it'll run on the iPad.
</p>

<p><em>This sequence of screenshots illustrates Siri's conversational approach to getting from my vague opening query about 'restaurants' to a reservation for specific people in a specific place at a specific time on a specific day. It would have been quicker to simply say what I wanted up front... but sometimes you just don't know until prompted.</em>
</p>

<p>[gallery]
</p>

<p></p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000325</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/oracle-delivers-native-support-for-thomson-reuters-opencalais-service/325]]></link>
      <title><![CDATA[Oracle delivers native support for Thomson Reuters' OpenCalais service]]></title>
      <description><![CDATA[Thomson Reuters and Oracle today announced support for the media giant's OpenCalais metadata generation service within release 2 of Oracle Spatial 11g. The integration gives Oracle users and developers direct access to OpenCalais' natural language processing (NLP) capabilities.]]></description>
      <pubDate><![CDATA[Tue, 01 Sep 2009 12:15:13 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p><a href="http://www.thomsonreuters.com/">Thomson Reuters</a> and <a href="http://www.oracle.com/">Oracle</a> today announced support for the media giant's <a href="http://www.opencalais.com/">OpenCalais</a> metadata generation service within release 2 of <a href="http://www.oracle.com/database/">Oracle Spatial 11<em>g</em></a>. The integration gives Oracle users and developers direct access to OpenCalais' natural language processing (NLP) capabilities.
</p>

<p>More importantly, perhaps, direct integration with an Enterprise product such as Oracle's database says much about how far the semantic technology community has come in being able to offer solutions capable of scaling - robustly - to meet <a href="http://opencalais.com/blogs/tom/increased-transaction-allowances">Enterprise-scale</a> demands.
</p>

<p>Xavier Lopez, Oracle's Director for Spatial and Semantic Technologies, is quoted in Thomson Reuters' press release;
</p>

<blockquote>
<p>
"This interoperability lets users quickly process documents in different formats (such as Microsoft Word and Adobe PDF), to extract semantic metadata that can be used for more semantically complete searches in Oracle11<em>g</em>."
</p>
</blockquote>
<p>
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000319</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/moving-data-gov-towards-the-semantic-web/319]]></link>
      <title><![CDATA[Moving Data.gov towards the Semantic Web]]></title>
      <description><![CDATA[Government transparency in all its forms would appear to be very much in vogue at present, spanning everything from the Obama administration's Data.gov portal and Prime Ministerial pronouncements in the UK Parliament to municipal proclamations of openness in Vancouver and compelling grass-roots demonstrations by activists and even newspapers.]]></description>
      <pubDate><![CDATA[Mon, 10 Aug 2009 10:46:49 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-browser/">Browser</category>
      <category domain="http://www.zdnet.com/topic-government/">Government</category>
      <category domain="http://www.zdnet.com/topic-software-development/">Software Development</category>
      <media:text type="html"><![CDATA[<p>Government transparency in all its forms would appear to be very much in vogue at present, spanning everything from the Obama administration's <a href="http://www.data.gov/">Data.gov portal</a> and Prime Ministerial <a href="http://www.number10.gov.uk/Page19579">pronouncements</a> in the UK Parliament to <a href="http://www.readwriteweb.com/archives/vancouver_bc_wants_to_be_an_open_city.php">municipal proclamations</a> of openness in Vancouver and compelling grass-roots demonstrations by <a href="http://www.mysociety.org/">activists</a> and <a href="http://www.freeourdata.org.uk/">even newspapers</a>.
</p>

<p>At the heart of many of today's initiatives lie programmes to surface Government data for use and re-use by third parties. The 'open' in 'Open Data' is, of course, a very loaded term, and <a href="http://cloudofdata.com/2009/07/how-open-is-open/">I've looked before</a> at some of the ways in which data might become 'open' whilst remaining effectively useless. Nevertheless, Governments' current enthusiasm for being seen to embrace transparency should certainly be both welcomed and encouraged, and there are real opportunities to work <em>with</em> Government in ensuring that today's transparency fervour is not diminished, whether by omission or commission.
</p>

<p>Given the complex and varied nature of the data involved, and the obvious linkages between the entities (you and I, our communities, our schools, our hospitals) described in numerous different databases, there's a clear opportunity for technologies and approaches from the Semantic Web community to play a significant role in simplifying the whole process of moving these legacy databases online.
</p>

<p>Already interested in Open Government from previous roles, and (obviously!) committed to encouraging real-world adoption of semantic technologies, I've spent some time recently talking to a number of those involved. Several of those conversations are now available as podcasts, and I'll continue to seek out fresh examples and perspectives to share.
</p>

<p>My most recent podcast conversation, released today, is <a href="http://blogs.talis.com/nodalities/2009/08/jim-hendler-and-li-ding-talk-about-work-to-convert-datagov-resources-to-rdf.php">with Professor Jim Hendler and Dr Li Ding</a> of the <a href="http://tw.rpi.edu/wiki/Main_Page">Tetherless World Constellation</a> at Rensselaer Polytechnic Institute in Troy, NY. The team at Rensselaer have been working with some of the US Federal Government's data sets on Data.gov, and so far they've converted <a href="http://data-gov.tw.rpi.edu/wiki/Data.gov_Catalog">sixteen data sets</a> from their original form, resulting in 2,927,398,352 freely available RDF triples and a number of <a href="http://data-gov.tw.rpi.edu/wiki/Demos_for_data.gov_data">demonstration applications</a>.
</p>

<p>Other conversations already released in the series include;
</p>

<ul>
<li><a href="http://cloudofdata.com/2009/08/david-eaves-talks-about-vancouvers-open-data-initiative/">David Eaves</a>, talking about Vancouver's commitment to Open Data</li>
<li><a href="http://cloudofdata.com/2009/07/john-sheridan-talks-about-the-drive-to-get-government-data-online/">John Sheridan</a>, Head of e-Services at the UK Government's Office of Public Sector Information, talking about his Department's efforts to get Government data online</li>
<li><a href="http://cloudofdata.com/2009/07/talking-with-mark-birbeck-about-rdfa-and-its-use-in-government/">Mark Birbeck</a>, talking about work with the UK Government's Central Office of Information to embed lightweight RDFa into workflows and web pages</li>
</ul>

<p>Each offers an example of ways in which 'open data' contributes to Government transparency, or to increasing the value of the massive sunk investment in collecting, managing and curating the data upon which Governments depend. The Semantic Web's notion of Linked Data (<a href="http://cloudofdata.com/2009/07/more-linked-data-and-rdf/">whether actually in RDF or not!</a> :-) ) offers a means to increase the utility of the data we have, without a massive programme of reengineering the systems used to manage it. The examples we see today, and the work of the individuals and teams with whom I have been speaking, will teach us a lot about how to make this work at Government scale.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000315</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/new-open-source-semantic-web-store-from-garlik-capable-of-enterprise-scale/315]]></link>
      <title><![CDATA[New open source Semantic Web store from Garlik capable of enterprise scale]]></title>
      <description><![CDATA[An oft-repeated concern in discussing large-scale deployment of Semantic Web ideas is that of 'scale.' With many of the better known data stores upon which the Semantic Web depends capable of storing only tens or at best a few hundreds of millions of RDF triples, it can be difficult to argue that the technology is fit for real-world deployment at scale.]]></description>
      <pubDate><![CDATA[Tue, 14 Jul 2009 12:20:10 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-open-source/">Open Source</category>
      <category domain="http://www.zdnet.com/topic-software-development/">Software Development</category>
      <media:text type="html"><![CDATA[<p>An oft-repeated concern in discussing large-scale deployment of Semantic Web ideas is that of 'scale.' With many of the better known data stores upon which the Semantic Web depends capable of storing only tens or at best a few hundreds of millions of RDF triples, it can be difficult to argue that the technology is fit for real-world deployment at scale.
</p>

<p>There are, of course, different ways of managing data, and it's not always necessary to store everything in one massive store... but for those concerned about scale, today's news from UK-based <a href="http://www.garlik.com">Garlik</a> may well put their minds at rest.
</p>

<p>The company has taken their internally developed (and massively scalable) RDF triple store and released it to the world under an Open Source license as <a href="http://4store.org/">4store</a>.
</p>

<p>I spoke with the company's CEO and Head of Architecture just ahead of the launch, to learn more about the system and their motivation behind sharing it.
</p>

<p><a href="http://cloudofdata.com/2009/07/garlik-releases-open-source-rdf-triple-store-claims-capacity-for-60-billion-triples/">The result has just been released as a podcast</a>.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000311</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/semantic-web-gang-podcast-looks-back-at-the-semantic-technology-conference/311]]></link>
      <title><![CDATA[Semantic Web Gang podcast looks back at the Semantic Technology Conference]]></title>
      <description><![CDATA[June's episode of the regular Semantic Web Gang podcast was recorded on stage at the Semantic Technology Conference in San Jose. Audio and video of the session are now available, with Gang members and conference organiser Tony Shaw engaging in a discussion of the event's highlights and the underlying trends at work.]]></description>
      <pubDate><![CDATA[Thu, 09 Jul 2009 11:48:55 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p>June's episode of the regular <a href="http://semanticgang.talis.com/">Semantic Web Gang podcast</a> was recorded on stage at the <a href="http://semantic-conference.com/">Semantic Technology Conference</a> in San Jose.
</p>

<p>Audio <em>and</em> video of the session <a href="http://cloudofdata.com/2009/07/the-semantic-web-gang-talk-about-semtech2009-live/">are now available</a>, with Gang members and conference organiser Tony Shaw engaging in a discussion of the event's highlights and the underlying trends at work.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000309</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/new-york-times-embraces-linked-data/309]]></link>
      <title><![CDATA[New York Times embraces Linked Data]]></title>
      <description><![CDATA[The keynote on this final day of the Semantic Technology Conference saw Robert Larson and Evan Sandhaus of the New York Times talk about the paper's innovative adoption of semantic technologies; "The first semantic search system for The New York Times was released in 1913 and was available bound in either paper ($6) or cloth ($8).]]></description>
      <pubDate><![CDATA[Thu, 18 Jun 2009 19:36:26 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p>The <a href="http://semanticconference.com/session/1961/">keynote</a> on this final day of the Semantic Technology Conference saw Robert Larson and Evan Sandhaus of the <em><a href="http://www.nytimes.com/">New York Times</a></em> talk about the paper's innovative adoption of semantic technologies;</p>
<blockquote>
  <p>"The first semantic search system for The New York Times was released in 1913 and was available bound in either paper ($6) or cloth ($8). In the 96 years since the advent of The Historical Index to The New York Times, semantic technology has become central to The New York Times' daily operations and the focus of much internal research and development. In our keynote, Rob Larson, VP of Digital Production, and Evan Sandhaus, Semantic Technologist, will review the long history of semantic technology at The New York Times; discuss the application of this technology in our operations; and review an innovative initiative to enlist the global community in solving some of our toughest challenges."</p>
</blockquote>
<p>Sandhaus and Larson begin by referring back to the <i>Times</i>' long history, and the early importance of the paper's emphasis on building - and selling - a comprehensive abstracting and indexing service to stories in the paper. This, they suggest, was important in leading to the paper being considered as the paper of record, ahead of its numerous competitors.</p>
<p>Building upon the paper's nine-month-old 'Annotated Corpus' and its associated APIs, Larson closed the session by announcing that the <i>Times</i>' thesaurus is to be made available under a license, and through APIs, that will allow it to play a part in the wider Linked Data cloud.</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000306</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/nova-spivack-interviews-wolfram-alphas-russell-foltz-smith/306]]></link>
      <title><![CDATA[Nova Spivack interviews Wolfram Alpha's Russell Foltz-Smith]]></title>
      <description><![CDATA[Radar Networks attracted a fair degree of attention with their roll-out of Twine, and the company's CEO has built a reputation as one of the more thoughtful thinkers in the space. Nova took to the stage at the Semantic Technology Conference today, not to talk about his own company or ideas, but to lead a conversation with Russell Foltz-Smith from Wolfram Research.]]></description>
      <pubDate><![CDATA[Wed, 17 Jun 2009 20:26:38 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-google/">Google</category>
      <media:text type="html"><![CDATA[<p>Radar Networks attracted a fair degree of attention with their roll-out of <a href="http://www.twine.com/">Twine</a>, and the company's CEO has built a reputation as one of the more <i>thoughtful</i> thinkers in the space. Nova took to the stage at the <a href="http://semantic-conference.com/">Semantic Technology Conference</a> today, not to talk about his own company or ideas, but to lead a conversation with Russell Foltz-Smith from Wolfram Research.</p>
<p>Wolfram Research, of course, is the company behind the recently launched <a href="http://www.wolframalpha.com/">Wolfram Alpha</a>; a 'computational knowledge engine' that attracted a wave of attention that reached into the mainstream media.</p>
<p>"Putting all of the world's computable knowledge; it sounds impossible... or over-confident, maybe. What <i>is</i> computable knowledge?"</p>
<p>"It's 'systematic knowledge.' It can be compared, contrasted, correlated, computed on. It's not a movie review. Examples are classical physics, financial data and models, weather data and models... It's not the latest opinion on who Britney Spears is dating. We don't have a model to do anything with that in our system."</p>
<p>Nova asks if it's the difference between objective and subjective... Alpha deals with <i>objective</i> information. 'Facts,' almost?</p>
<p>Nova asks about sources, pointing to the example of <a href="http://www92.wolframalpha.com/input/?i=tibet">Tibet</a>; is it a 'fact' that Tibet is part of China, or not... ?</p>
<p>"In the case of geo-political things, and religious things, <i>we</i> have to make choices... and allow the community to let us know whether they agree or not..." Couldn't the system represent multiple views, tied to the diverse sources? Could we not show the different opinions, and allow the user to make informed decisions themselves?</p>
<p>Nova; "is the world's computable knowledge infinite?"</p>
<p>Russell; "the <i>foundation</i> of computable knowledge is likely to be finite... The amount of knowledge that can be computed and generated from that is infinite..."</p>
<p>Nova; "I can see that maths could be finite. But geopolitics, health, etc... that's much, much larger..."</p>
<p>Russell; "The <i>instances</i> seem very complex... Huge, but finite... I don't want anyone to think we'll have this done in ten years... It's a long term thing."</p>
<p>Nova; "Stephen [Wolfram] reckons it could be done in three years...?"</p>
<p>Nova; "Looking at the back end, the ontology seems to be <i>implicit</i>. I didn't see any classes, just a lot of instances... a set of facts. As the team grows, how do you prevent people adding facts in different ways?"</p>
<p>Russell; "There are a set of stored facts; things you <i>know</i> about a city. But then there are computed facts that you couldn't store in a traditional ontology." Huh?</p>
<p>Nova; "Can you make a statement about what percent of the world's computable knowledge is there today?"</p>
<p>Russell; "I can't make a statement..."</p>
<p>Nova; "The syntax is quite interesting... but enigmatic. It wasn't necessarily that the knowledge wasn't there, but that I'd asked for it in the wrong way. Can't you make a manual? ... Stephen [Wolfram] said it would be an impossible task to write the manual... or to make a generic natural language on top."</p>
<p>Nova; "In some cases a naive query will get you the answer, but maybe there's a need for a layer that helps you when you don't get what you want..."</p>
<p>Russell; "I think we're getting close... we're going to put an API out in the next few weeks, and hopefully someone will build the application using that to parse natural language and translate it for Alpha... Do we spend <i>our</i> time doing that, or putting more data and more models into the system... I reckon our time is best spent adding more data..."</p>
<p>Nova; "Is there a set of schemas or ontologies to link all of this stuff together?"</p>
<p>Russell; "There isn't an ontology over the whole system... but within a domain there is structure... Is there some grand scheme that we have internally? Not really. The company has been doing this stuff for 23 years, so there's a bit of a shared understanding internally."</p>
<p>Examples keep coming back to mathematics... To succeed, Alpha has to offer compelling examples that are far broader...</p>
<p>Nova; "What about reasoning. You've said that you can derive additional knowledge. What kinds of reasoning is the system capable of?"</p>
<p>Russell; "I'd call it very simple reasoning. For example changing the currency based upon your geo-location... Is there any weak or strong AI in here? Not really. Could you build something like that? Probably. Will we? I don't know..."</p>
<p>Nova; "Alpha seems to be a subset of Mathematica capabilities... Would you expand that, and bring a full Mathematica to the Web?"</p>
<p>Russell; "It is, and there are plans to extend the capabilities. I don't know if we'd go to a full-blown Mathematica on the web."</p>
<p>Russell mentions a subscription service for people working with more data, or needing more compute time. The public web site tends to time out a query in 4-8 (or 48?) seconds... The professional version will carry a monthly subscription that will allow you to compute bigger questions. There will also be a pay-per-use API... and 'primitive' advertising. More advanced advertising, based on transactions, is to be launched soon.</p>
<p>Nova; "Alpha's really cool, but I want to do this on my <i>own</i> knowledge... inside an enterprise, inside a government agency..."</p>
<p>Russell; "We can roll out a custom Wolfram Alpha for those who want it behind the firewall. We will also let people upload their own data sets. We need to find a sensible way to let people do this..."</p>
<p>Nova; "There was a lot of hype - possibly my fault - around Alpha being a Google Killer. Obviously it's not that. It's something quite different. Who is the user, and what are they using it for?"</p>
<p>Russell; "Use will evolve, and it already has. There's an obvious use by students, but the school year has just ended."</p>
<p>Nova; "Wolfram Alpha; now even Ph.D's can cheat on their homework." :-)</p>
<p>Nova; "Are consumers <i>using</i> it? Obviously they're having a play, but are they coming back and using it?"</p>
<p>Russell moves off to talk about academic use... Dodging the question?</p>
<p>Nova; "Are the financial capabilities in Alpha differentiated from the capabilities banks and investors already have in their vertical?"</p>
<p>Russell; "more sophisticated than a general finance web site, but probably less sophisticated than you'd find on a terminal in a bank."</p>
<p>Nova; "Do I really need to know how long it would take an ant to get from San Francisco to Cairo?"</p>
<p>Russell; "Because of the way the system is engineered, it just keeps computing until it runs out of time. With simple queries you'll get a lot of data. It just keeps computing."</p>
<p>Nova; "What's the big challenge, moving forward?"</p>
<p>Russell; "Setting priorities."</p>
<p>Nova; "So let's talk about Google. They made some aggressive marketing moves during the Alpha roll-out, and they're continuing to roll products out to chip away... Do you think that what you've built is defensible, just because it's hard... or can you defend it in other ways?"</p>
<p>Russell; "There are significant barriers to what we're doing. Someone else could build this... but would they want to? That's an open question."</p>
<p>Nova; "Do you hope to work with other companies? Perhaps revenue share with them?"</p>
<p>Russell; "Obviously."</p>
<p>Nova; "There's been a lot of interest in how Alpha might connect with open standards and the Semantic Web..."</p>
<p>Russell; "If you want the platform to be used, we'll have to do some of this stuff... RDF, OWL, etc could play a huge role."</p>
<p>Nova; "Timeframe?"</p>
<p>Russell; "It'll depend on pick-up of the API... which is due out in a few weeks."</p>
<p>Nova; "So what's the implication for education? It makes it possible to do some things without even thinking..."</p>
<p>Russell; "It'll be a heated debate for a while... Some things are positive, some negative. There's going to be a reorientation... It has to happen."</p>
<p>Nova; "The danger is that if you delegate thinking [inside education] to a computation service... you may not actually understand enough to know if the answer that comes back is correct."</p>
<p>Russell; "That's a valid concern."</p>
<p>Q&amp;A</p>
<p>"You rely more on your computational engine than natural language... but you lay a lot of emphasis on the linguistics in your system. So if it's not NLP what is it?"</p>
<p>Russell; "Domain linguistics, mainly; mathematical language, engineering language, etc... We think about how people describe things and search in these domains... and crawl the web looking for examples of how people use language in these domains."</p>
<p>"Stephen is focussed on quality of data, which is important to a lot of people here. There aren't a lot of tools. In addition to making your data store, I wonder if there might be scope to make some of your data curation tools available to the community, to improve the data out there."</p>
<p>Russell; "Great point. Can we make these tools genuinely useful to people, without creating a support nightmare..."</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000302</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/semantic-search-round-table-at-the-semantic-technology-conference/302]]></link>
      <title><![CDATA[Semantic Search Round Table at the Semantic Technology Conference]]></title>
      <description><![CDATA[Wednesday's opening Keynote here in San Jose sees Guidewire's Carla Thompson joined on stage by senior representatives from many of the more interesting players in the Semantic Search space; Tomasz Imielinski from Ask, Peter Norvig from Google, Riza Berkan of Hakia, Scott Provost from Microsoft, William Tunstall-Pedoe of the UK's True Knowledge, and Andrew Tomkins of Yahoo.]]></description>
      <pubDate><![CDATA[Wed, 17 Jun 2009 16:40:08 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-microsoft/">Microsoft</category>
      <media:text type="html"><![CDATA[<p>Wednesday's <a href="http://semanticconference.com/session/2069/">opening Keynote</a> here in San Jose sees Guidewire's Carla Thompson joined on stage by senior representatives from many of the more interesting players in the Semantic Search space; Tomasz Imielinski from Ask, Peter Norvig from Google, Riza Berkan of Hakia, Scott Provost from Microsoft, William Tunstall-Pedoe of the UK's True Knowledge, and Andrew Tomkins of Yahoo.</p>
<p>Carla asks each panellist to describe the differentiating aspects of their product in 'one or two sentences;'</p>
<p>Tomasz; "we receive about three times as many <i>questions</i> as other search companies. We want to answer questions the best we can from multiple sources... using structured and unstructured data."</p>
<p>Scott; "Bing really focusses on understanding the <i>intent</i> behind queries, and organising the page to help people get to their answer much faster."</p>
<p>Peter; "We focus on being comprehensive, accurate and fast... so we have to keep on innovating in crawling, ranking, systems engineering. One thing that differentiates us... most companies decide whether to focus on marketing or sales. We focus on engineering."</p>
<p>Riza; "We are a complete semantic search engine, from the bottom up. We don't even have an index. We've optimised the entire process for semantic operations. We focus on credible and dynamic content, and offer users a new perspective." Instead of <i>popularity</i>, they focus on <i>credibility</i>.</p>
<p>William; "True Knowledge is a platform that does direct question answering. There's a knowledge base and an inference engine to answer questions we haven't seen before." True Knowledge tries to 'help when it can, and stay quiet when it can't,' as can be seen demonstrated in their recently released Firefox plugin.</p>
<p>Andrew; "Yahoo! is very aggressive about semantic annotation... SearchMonkey is about acquiring semantic information and surfacing it in search results on the page."</p>
<p>Carla mentions Tom Tague's <a href="http://blogs.zdnet.com/semantic-web/?p=300">keynote</a> from yesterday, where he suggested that 'semantic search is an answer to a question no one is asking'... so "why do we need to change search?"</p>
<p>Tomasz responds, suggesting that users don't necessarily demand new products that subsequently become successful. <i>eg</i>; no one was asking for the iPod before it launched. "When they see it, they will want it."</p>
<p>Turning to Google and Yahoo!, Carla asks them "why do we need to change search?"</p>
<p>Peter; "as an industry, satisfaction is very high... but that is just because that's what people know [now]... People don't <i>like</i> technology... people like <i>solutions</i>. When we deliver it, people will want it."</p>
<p>Andrew; "Does search need to change? It already is... Today, on any major search engine, if you search for a restaurant, you'll see structured information about that restaurant; reviews, phone number, <i>etc</i>... This has been accelerating over the last 3-4 years... When we put this information up, and trigger it correctly, we see far higher levels of engagement from our users than anything else."</p>
<p>Carla; "it may be a stupid question, but it has to be asked; what is semantic search?"</p>
<p>Scott; "it means a lot of different things. At Powerset we focussed on understanding the <i>meaning</i> in web pages, so we could present them, rank them..."</p>
<p>Carla; "Has Powerset's focus been diluted by the [Microsoft] acquisition?"</p>
<p>Scott; "No."</p>
<p>Carla asks Riza; "Someone from Hakia that I spoke to last year said you were the only one doing 'true semantic search.' Is that true?"</p>
<p>Riza; "No... Semantic Search can <i>enrich</i> search results... Semantic Search can <i>improve</i> precision/disambiguation... Semantic Search can <i>organise</i> results better. In the future, search will move to more conversational systems, and for that you really need semantic technology."</p>
<p>Carla; "How do you measure the 'semanticity' of a search engine?"</p>
<p>Tomasz; "That's my favourite question... We took a sample of 'equivalent' queries from the logs, and ran it to evaluate ranking <i>etc;</i> does the search engine give similar answers to questions like 'Top 10 songs' and 'Top Ten songs,' etc. Should they?"</p>
<p>Andrew; "It's incredibly hard to understand what a user will like... if you mess with the logo, it changes the perception of the results... if you make tiny changes, it can have a big impact on perception... When it comes to understanding semantic content in search, we should identify the task the user is trying to solve... and have a metric that's aligned to that use case... We can break search queries today into different classes; how do we do when a user is trying to book dinner, or a vacation? Semantic Technology should be judged on its impact based on these task metrics rather than any underlying notions of entity resolution, etc... SearchMonkey, for instance, lets users inject structured data into the process... The information can be incorporated in any way... and change how the results are presented. We have about 15,000 people in our development community, changing the way those results are presented every day."</p>
<p>Tomasz; "I would expect a <i>semantic</i> search engine to deliver equivalent results to queries that would appear similar to a human being; 'Top 10 songs' and 'Top Ten Songs' should deliver the same answer. Today in most mainstream search engines they don't."</p>
<p>Carla; "Search v. Answers. True Knowledge is billed as 'the Internet Answer Engine;' is it necessary to move search to an answer-based format, or has Google trained users to think in keywords?"</p>
<p>William; "We support both keyword search and full-text questions. It's important to answer users' questions."</p>
<p>Peter; "Different types of answers are appropriate for different types of questions; sometimes the answer is a fact, or a page, or a series of results to support a process of study. To say there's going to be one technology or one type of answer doesn't make sense."</p>
<p>Riza; "You could be asking a 'where,' 'why,' 'how' type of question. Questions are important, and the search engine needs to be able to interpret the mode of the question and return results appropriately."</p>
<p>Carla; "You mentioned talking about the credibility of search results. How do you define a 'credible' search result, and how much of a need is there really? I'm not hearing users question the credibility of search results they see today."</p>
<p>Riza; "Practically, credibility is important in 'serious' subjects; medical information, etc. You want to know where the results come from and how credible they are. When it comes to credible content, you can't really do a statistical search or have a 'popularity vote,' because much credible content isn't 'popular.'"</p>
<p>Scott; "People's expectations for credibility are different depending upon the query. If you ask an 'instant answer' type query you expect the answer to be credible. If you do a broader search, you expect a mix of results to be returned."</p>
<p>William; "If a system understands structured knowledge, it can understand when different sources contradict one another."</p>
<p>Riza; "A system doesn't need to know what's credible; we can go to a librarian for that. Hakia doesn't decide whether a resource is credible or not; we use librarians for that."</p>
<p>Tomasz; "If you ask for the capital of Japan we expect a single answer. If you ask about taxes, maybe the IRS is the best source but there are others. If you ask 'how to get rid of acne' you expect a lot of results."</p>
<p>Carla; "We've seen three news-making launches in the past month; Wolfram Alpha, Bing, Siri. Is Wolfram the first step towards 2001? How is this engine valuable to those of us who don't need to solve complex maths?"</p>
<p>Scott; "it's not the first step... we've been working on these problems for a long time. There are a lot of questions people want to ask about the types of data that Wolfram aggregates... We see these things as <i>part</i> of full-search services. Powerset has moved along this path as well, pulling structured data in response to full-text queries."</p>
<p>William; "Wolfram is a tremendous effort. An interesting example of question answering with structured data. I think people will find uses for it in particular use cases; I spoke to someone who'd used it to calculate when his visa expired, because it could do date calculation. I think there will be use cases in various scenarios; maths, nutrition information, <i>etc</i>... if you remember that it has that sort of information and remember to go to it... However one thing it doesn't have is a decent back-fill. If it doesn't have the data, or doesn't understand the way you asked the query, it gives you nothing. We try to keep quiet and fail over to standard internet search in that sort of circumstance."</p>
<p>Carla; "Does a semantic search engine know how <i>not</i> to answer a question?"</p>
<p>William; "That's absolutely fundamental. You need the ability to reliably keep quiet when you don't have the answer... and fail over reliably to other search services." [True Knowledge does try to do this...] "That requires very high quality semantics."</p>
<p>Andrew; "One way to characterise the approach of Wolfram Alpha is that it's a centralised approach. The Wolfram Alpha team goes out to find data and bring it in-house to convert to a standard form. A different approach is to have an ecosystem contributing data in the public eye... It's not clear yet how much of a value-add is going to come from this centralised knowledge mapping approach. Yahoo! is focussed on the ecosystem approach, and helping people with knowledge to make it available."</p>
<p>Peter; "Our inclination would be that we don't want a closed walled garden. We want all the information available to combine in different ways. We want the information to be open, and the tool set to be open for mashing up in different ways."</p>
<p>Scott; "If Wolfram Alpha hadn't taken a walled garden approach they might never have launched a product."</p>
<p>Tomasz; "Wolfram Alpha is great, but it's not a search engine."</p>
<p>Carla; "Siri... caused a lot of buzz, uses True Knowledge... what are your thoughts?"</p>
<p>Andrew; "To be counter-cultural... the notion of getting much deeper and assisting a user with a task is spot on. We're going to see much more of that. Search has tended to be stateless. Each query you enter is more or less processed without context. Yahoo! is rolling out more stateful search tools, and other companies will do the same. We expect people to use these tools on lots of devices. Would we expect people to come to the same place for purchase, navigation, etc? Do we expect one interface? There are going to be virtual assistants... I just don't know if they're going to be embedded into a search box."</p>
<p>Scott; "Conversation is the ultimate user interface... but it's not clear that I want to have a conversation with my laptop during the working day. How do I display the results? But there's a huge role for conversation and dialogue in refining search and getting a user to their results faster."</p>
<p>Tomasz; "What is the goal of Siri? If you try to go too broad you become a search engine."</p>
<p>Scott; "When people have a conversational interface, they won't speak in keywords."</p>
<p>Carla; "What are the larger goals for Bing?"</p>
<p>Scott; "Bing is trying to simplify key tasks that people do when they come to a search engine. In travel, health, shopping, we can understand what people are trying to do, and get them to better results faster. The thinking has evolved from ten blue links to the whole page, and organising things to help the user by understanding their tasks."</p>
<p>Carla; "Peter; what did you think of Bing?"</p>
<p>Peter; "I like the idea of innovation in the user interface. There's a lot of room for that. There's been a lot of emphasis on getting the ranking right. You still need to do that, but other things are important too. I'm usually happy with results on my big screen. On a mobile device, I'm usually not happy with the results I get."</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000300</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/semantic-technology-conference-kicks-off-with-keynotes-from-open-calais-and-siri/300]]></link>
      <title><![CDATA[Semantic Technology Conference kicks off with Keynotes from Open Calais and Siri]]></title>
      <description><![CDATA[This year's Semantic Technology Conference got fully underway this morning, with Keynote presentations from Tom Tague of Thomson Reuters' Open Calais Initiative and Tom Gruber from Siri. Despite the wider economic situation, attendance for this fifth year of the event feels a little up on last year, and there's clearly real enthusiasm in the buzzing Halls.]]></description>
      <pubDate><![CDATA[Tue, 16 Jun 2009 16:54:35 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-browser/">Browser</category>
      <media:text type="html"><![CDATA[<p>This year's <a href="http://semantic-conference.com/">Semantic Technology Conference</a> got fully underway this morning, with Keynote presentations from <a href="http://semantic-conference.com/session/2120/">Tom Tague</a> of Thomson Reuters' <a href="http://www.opencalais.com/">Open Calais Initiative</a> and <a href="http://semantic-conference.com/session/1909/">Tom Gruber</a> from <a href="http://www.siri.com/">Siri</a>.</p>
<p>Despite the wider economic situation, attendance for this fifth year of the event feels a little up on last year, and there's clearly real enthusiasm in the buzzing Halls.</p>
<p>Tague's Open Calais has been one of the success stories for <i>useful</i> and easy application of semantic technologies beyond a core community of enthusiasts and adopters, and has been covered here and on <a href="http://cloudofdata.com/">Cloud of Data</a> a number of times since it launched. Just today, they <a href="http://cloudofdata.com/2009/06/thomson-reuters-turns-to-fedex-and-dhl-to-boot-strap-their-cloud-of-data/">announced</a> a new set of partners and a postal service that should remove one more perceived barrier for another set of potential adopters.</p>
<p>Speaking to the theme of 'Web 3.0 - the Web of <i>Me</i>,' Tague's abstract suggests;</p>
<blockquote>
  <p>"The mainstream adoption of Web 2.0 technologies – from RSS feeds to social networks – is hastening the demise of the portal. With each new face on Facebook, and each new Twitter account, our once routine habits and traffic patterns shift. This wave of change in the way we consume, transact and interact on the Web is dis-intermediating 'destination' sites of all kinds. Our once centralized content has been atomized.</p>

  <p>And yet our fundamental problem persists. We're overwhelmed with input, yet still can't find the one thing we need... now.</p>

  <p>Semantic technologies – and the content interoperability and Linked Data connections they beget – offer new hope. That is not to say the answer lies in building new search engines, and few would argue for another news aggregator. Rather, our point of inflection lies at the point of consumption. Our task is to simultaneously refine and enrich our digital experience of everything from content and community to commerce."</p>
</blockquote>
<p>Early on, Tague made a 'non-apologetic statement;'</p>
<blockquote>
  <p>"People need to start deriving financial benefits from semantic technology. It's time"</p>
</blockquote>
<p>Absolutely!</p>
<p>Tague looks back at the move from 'Web 1.0,' described as 'the last Web we agreed on,' to 'Web 2.0,' which he sees as largely defined by the 'addition of social.' Today, he reckons, we are 'extraordinarily content-rich', 'extraordinarily information-poor' and 'experientially deficient.' Despite a wealth of content, we are failing to make the most of it.</p>
<p>'We're at the inflection point' where 'innovation is exploding' as we move from developing and inventing toward mainstream adoption of technologies in the semantic technology space. Lots of things will be tried; 90% will fail, but that's ok.</p>
<p>'Everyone needs plumbing,' and that's what Calais is; semantic plumbing. 13 version releases in 18 months; about 100 presentations, 13,000 registered Open Calais developers, a million great ideas.</p>
<p>Tague reckons the various efforts he comes in contact with fall into six broad buckets;</p>
<p>Tools; Social; Advertising; Search; Publishing; Interface.</p>
<p>First, <b>Enabling Tools</b>. Data Management, Data generation, Databases, Integration and workflow. 'A big yes.' 'We need tools.' Everyone needs tools, especially as you move from early adopters toward the mainstream. Tools build the bridges that cross the chasm to enterprise adoption.</p>
<p>Enterprise adoption will not happen because it's cool. Enterprise adoption will not be talked about on Twitter. Enterprise adoption will happen because it's cheaper/faster/better than what they have just now.</p>
<p>'Tool vendors need to simplify their story; it's not about more functionality.' 'If I can't understand your story, then Enterprise IT certainly can't'</p>
<p>Second, 'let's put some frosting on top of <b>social</b>.' 'Wouldn't it be cool if we could...' Some of it might be cool, but there's a challenge in monetising social. Adding frosting to the top of an industry that hasn't worked out its own monetisation is fraught with risk.</p>
<p>'I haven't seen a compelling story yet.'</p>
<p>Next, <b>advertising</b>. Almost a dirty word in the semantic technology domain last year. But advertising is fuel, and semantic technologies have a clear role to play in enhancing advertising (see my <a href="http://blogs.talis.com/nodalities/2008/10/scott-brinker-talks-with-talis-about-semantic-marketing.php">podcast</a> with Scott Brinker from last year...).</p>
<p>Semantic <b>search</b>; 'the semantic industry's brilliant yet under-achieving child.' The answer to a question no one is asking? General, consumer-facing semantic search... directly competing with Google <i>et al</i>? Not viable.</p>
<p>But vertical search in specific domains... a huge growth opportunity, and people are willing to invest the time, effort and money to make it happen. Room for a handful of players in each domain?</p>
<p>Search; 'a bifurcated marketplace.'</p>
<p><b>Publishing</b>; content producers, editorial/aggregation, 'robotic publishing.'</p>
<p>'Classic publishers can get enormous value from this technology... not all of the value is in the user experience.' Much of the value is being found in the back office, making existing data and investments work harder.</p>
<p>Little value in 'robotic publishing,' because the content isn't that readable. Aggregation services like Huffington Post and Daily Me present 'enormous opportunities.'</p>
<p><b>Interface</b>; gaming a huge and growing market. $57bn industry. A 'seamless, interactive and responsive experience,' it's 'graphically engaging and fun.'</p>
<p>Zemanta, AdaptiveBlue, Feedly, Apture <i>et al</i> 'trying to make the consumption experience different' [better?]. Not suggesting that these are like a game, but many of the drivers may be similar?</p>
<p>"People are on their mobile devices and in the browser; go where the people are." Which links well to the next keynote... :-)</p>
<p>"Do you care about semantics <b>or</b> about user value?"</p>
<p>"Don't fund/buy semantic infrastructure beyond what you <i>need</i>; use infrastructure built by others where possible."</p>
<p>"Think very hard about the user experience; make it compelling and exciting."</p>
<p>Following Tague's presentation, Tom Gruber took to the stage to talk about <a href="http://www.siri.com/">Siri</a>; a company building a Virtual Personal Assistant (with an interesting iPhone app to start things off) that we <a href="http://cloudofdata.com/2009/06/tom-gruber-talks-about-siri-the-virtual-personal-assistant/">discussed</a> during a podcast last week. As Gruber says;</p>
<blockquote>
  <p>"We are beginning to see a new interaction paradigm for the web: the Virtual Personal Assistant (VPA). A VPA is task focused: it helps you get things done. You interact with it in natural language, in a conversation. It gets to know you, acts on your behalf, and gets better with time. The VPA paradigm builds on the information and services of the web, with new technical challenges of semantic intent understanding, context awareness, service delegation, and mass personalization.</p>

  <p>Siri is a virtual personal assistant for the mobile Internet. Although just in its infancy, Siri can help with some common tasks that human assistants do, such as booking a restaurant, getting tickets to a show, and inviting a friend. We will describe the technology underlying Siri and how it fits in the larger ecosystem of services and data providers. And we will offer a vision of where assistants like Siri are going."</p>
</blockquote>
<p>Tom starts off by showing the <a href="http://en.wikipedia.org/wiki/Knowledge_Navigator">Knowledge Navigator video</a> from Apple... which dates all the way back to 1987. Many of the ideas are now coming to fruition; touch screens, a global network, awareness of temporal and social context, speech in and out, a 'conversational interface,' 'delegation of work' to the machine, and trusted use of personal data.</p>
<p>Is the Knowledge Navigator possible today? 'No, but we're getting there.'</p>
<p>Siri is pretty close... in certain well understood contexts, as Gruber shows in a video demo of the evolving iPhone application.</p>
<p>What is a Virtual Personal Assistant? It <i>does things</i> for you; it's task-oriented. It understands your intent via a conversational metaphor. It gets to know <i>you</i>; it's not the same for everybody, unlike a search engine.</p>
<p>'Service delegation [like Siri]; the mother of all mashups'</p>
<p>'Context is king' in communicating with a VPA; <b>where</b> am I, what <b>time</b> is it, <b>who</b> am I, etc.</p>
<p>"This really is the beginning of the age of the start of Virtual Assistants."</p>
<p>Need to solve authorisation/authentication. If we reach a 'data commons' there will be more, better, information to drive choices and decisions.</p>
<p><i>Tom Tague is a regular member of the <a href="http://semanticgang.talis.com/">Semantic Web Gang podcast</a>, which I moderate. Tom Gruber was <a href="http://cloudofdata.com/2009/06/tom-gruber-talks-about-siri-the-virtual-personal-assistant/">the latest guest</a> in my Executive Briefing podcast series.</i></p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000297</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/bing-is-not-alone-similar-techniques-alive-and-well-in-existing-vertical-search/297]]></link>
      <title><![CDATA[Bing is not alone; similar techniques alive and well in existing vertical search]]></title>
      <description><![CDATA[Microsoft's Bing is attracting plenty of interest today, and perhaps deservedly so as it brings some interesting fresh ideas to the world of generic search engines. Whether it is sufficiently compelling to break our deeply ingrained association of 'search' with 'Google' remains to be seen.]]></description>
      <pubDate><![CDATA[Fri, 29 May 2009 11:10:27 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-cxo/">CXO</category>
      <media:text type="html"><![CDATA[<p>Microsoft's <a href="http://www.bing.com/">Bing</a> is attracting plenty of interest today, and perhaps deservedly so as it brings some interesting fresh ideas to the world of generic search engines. Whether it is sufficiently compelling to break our deeply ingrained association of 'search' with 'Google' remains to be seen.
</p>

<p>It should be remembered, of course, that broadly similar approaches are already taken to managing and navigating data inside the data centres of large corporations where Autonomy, FAST, Endeca and their peers provide powerful capabilities.
</p>

<p>I <a href="http://blogs.talis.com/nodalities/2009/01/daniel-tunkelang-talks-about-endeca-search-and-reconsidering-relevance.php">recorded a podcast</a> with Endeca Chief Scientist <a href="http://thenoisychannel.com/">Daniel Tunkelang</a> in January and, by chance, <a href="http://blogs.talis.com/nodalities/2009/05/robin-johnson-ceo-of-ft-search-talks-about-newssiftcom.php">spoke with Robin Johnson yesterday</a>. Robin is CEO of FT Search, part of the Financial Times Group, and responsible for a new vertical search tool called <a href="http://www.newssift.com/">Newssift</a>. Newssift combines components from various technology companies (including Endeca, Nstein, Lexalytics and ReelTwo) to offer a useful means of learning more about businesses and the external factors affecting them.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000295</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/semantic-web-gang-podcast-discusses-wolfram-alpha-and-googles-rich-snippets/295]]></link>
      <title><![CDATA[Semantic Web Gang podcast discusses Wolfram Alpha and Google's Rich Snippets]]></title>
      <description><![CDATA[This month has seen Google announce 'Rich Snippets' and Wolfram Research release Alpha to a flurry of mainstream media coverage; both are of interest to those working on the Semantic Web. This month's episode of the Semantic Web Gang takes a look at both stories, and Gang members share their impressions on the news and what it might mean moving forward.]]></description>
      <pubDate><![CDATA[Fri, 22 May 2009 13:14:25 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-browser/">Browser</category>
      <media:text type="html"><![CDATA[<p>This month has seen Google announce '<a href="http://googlewebmastercentral.blogspot.com/2009/05/introducing-rich-snippets.html">Rich Snippets</a>' and Wolfram Research release <a href="http://www.wolframalpha.com/">Alpha</a> to a flurry of mainstream media coverage; both are of interest to those working on the Semantic Web.
</p>

<p><a href="http://semanticgang.talis.com/2009/05/22/may-2009-the-semantic-web-gang-discuss-wolfram-alpha-and-googles-rdfa/">This month's episode</a> of the Semantic Web Gang takes a look at both stories, and Gang members share their impressions on the news and what it might mean moving forward.
</p>

<p><a href="http://cloudofdata.com/2009/05/the-semantic-web-gang-live-in-san-jose/">Next month's Semantic Web Gang will be coming live</a> from the <a href="http://semantic-conference.com/">Semantic Technology Conference</a> in San Jose.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000292</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/can-semantic-technologies-help-brands-profit-from-social-media/292]]></link>
      <title><![CDATA[Can semantic technologies help brands profit from social media?]]></title>
      <description><![CDATA[In my latest podcast interview with those shaping our evolving engagement with Semantic Technologies, I speak with Eric Hillerbrand. Drawing upon years of experience in the development and deployment of Semantic Web solutions, Eric has spent the past few years considering the ways in which semantic technologies could bring structure and value to the increasingly visible online conversations around products and brands.]]></description>
      <pubDate><![CDATA[Mon, 20 Apr 2009 15:16:39 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-social-enterprise/">Social Enterprise</category>
      <media:text type="html"><![CDATA[<p>In my latest podcast interview with those shaping our evolving engagement with Semantic Technologies, I speak with Eric Hillerbrand.
</p>

<p>Drawing upon years of experience in the development and deployment of Semantic Web solutions, Eric has spent the past few years considering the ways in which semantic technologies could bring structure and value to the increasingly visible online conversations around products and brands.
</p>

<p><a href="http://cloudofdata.com/2009/04/eric-hillerbrand-sees-a-profitably-semantic-future-for-our-relationship-with-brands/">Have a listen</a>, and share your views on the ways in which this might impact <em>your</em> brand, or your interaction with those of others.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000289</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/the-semantic-web-gang-discuss-ontologies/289]]></link>
      <title><![CDATA[The Semantic Web Gang discuss ontologies]]></title>
      <description><![CDATA[Back in October I wrote about the first Vocabulary Camp, or VoCamp. These informal gatherings have gone from strength to strength, and the fourth is currently underway on the Spanish island of Ibiza.]]></description>
      <pubDate><![CDATA[Thu, 16 Apr 2009 20:42:20 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <category domain="http://www.zdnet.com/topic-cxo/">CXO</category>
      <media:text type="html"><![CDATA[<p>Back in October <a href="http://blogs.zdnet.com/semantic-web/?p=207">I wrote about the first Vocabulary Camp</a>, or VoCamp. These informal gatherings have gone from strength to strength, and <a href="http://vocamp.org/wiki/VoCampIbiza2009">the fourth</a> is currently underway on the Spanish island of <a href="http://local.google.co.uk/?ie=UTF8&amp;ll=38.979695,1.425476&amp;spn=0.339482,0.716171&amp;t=h&amp;z=11">Ibiza</a>.
</p>

<p>A report from the island by Yahoo's <a href="http://research.yahoo.com/bouncer_user/66">Peter Mika</a> begins this month's episode of the <a href="http://semanticgang.talis.com/">Semantic Web Gang</a> podcast, and leads to a wide-ranging discussion of the role that vocabularies and ontologies continue to play within the Semantic Web. The very nature of these <em>ad hoc</em> VoCamps says much, though, about the way in which attitudes have shifted away from expectations that massive all-encompassing ontologies are the best way to help machines to reason about the world around them.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000285</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/leigh-dodds-talks-about-talis-connected-commons/285]]></link>
      <title><![CDATA[Leigh Dodds talks about Talis Connected Commons]]></title>
      <description><![CDATA[I wrote about Talis' Connected Commons last month, and today spent some time talking with the company's Platform Programme Manager, Leigh Dodds. The conversation has just been released as a podcast which looks at the rationale behind the company's offer and the specific licensing choices that beneficiaries are asked to make.]]></description>
      <pubDate><![CDATA[Wed, 15 Apr 2009 18:05:18 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p>I <a href="http://blogs.zdnet.com/semantic-web/?p=264">wrote</a> about Talis' <a href="http://www.talis.com/cc">Connected Commons</a> last month, and today spent some time talking with the company's Platform Programme Manager, Leigh Dodds.
</p>

<p>The conversation <a href="http://blogs.talis.com/nodalities/2009/04/leigh-dodds-talks-about-the-talis-connected-commons-and-linked-open-data.php">has just been released as a podcast</a> which looks at the rationale behind the company's offer and the specific licensing choices that beneficiaries are asked to make.
</p>

<p>Have a listen, and see if the Connected Commons might help your next project.
</p>

<p><strong>Disclaimer: Talis is my former employer</strong>
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000273</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/true-knowledge-api-lies-at-the-heart-of-real-business-model/273]]></link>
      <title><![CDATA[True Knowledge API lies at the heart of real business model]]></title>
      <description><![CDATA[Semantically powered question answering start-up True Knowledge today made its Semantic Search API available for public consumption, taking the next step on the company's journey out of beta and providing a clear steer as to the way in which they intend to generate revenue. As the company's press release notes, "True Knowledge offers two distinct API services for developers: the 'Direct Answer API' and the 'Query API.]]></description>
      <pubDate><![CDATA[Tue, 14 Apr 2009 05:01:32 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p><a href="http://www.trueknowledge.com"><img src="http://cdn-static.zdnet.com/i/story/61/26/000273/trueknowledge-logo.png" width="212" height="40" class="alignRight size-full wp-image-274" /></a>Semantically powered question answering start-up <a href="http://www.trueknowledge.com/">True Knowledge</a> today made its Semantic Search API available for public consumption, taking the next step on the company's journey out of beta and providing a clear steer as to the way in which they intend to generate revenue.
</p>

<p>As the company's press release notes,
</p>

<blockquote>
<p>
"True Knowledge offers two distinct API services for developers: the 'Direct Answer API' and the 'Query API.' The Direct Answer API allows developers to leverage True Knowledge’s natural language question answering technology, giving any search site or application the ability to provide a single direct answer for questions asked on any subject in plain English. This is especially well suited to mobile applications where providing a lengthy list of search results may be impractical.
</p>

<p>The Query API allows developers to bypass True Knowledge’s natural language translation system and directly query True Knowledge’s knowledge base using a simple query language. This allows automated systems such as web and mobile applications to tap into True Knowledge’s vast machine-understandable knowledge of the world, making them behave more intelligently."
</p>
</blockquote>

<p>The company was founded in August 2005 and is based in the British city of Cambridge, at the heart of '<a href="http://en.wikipedia.org/wiki/Silicon_Fen">Silicon Fen</a>.' A <a href="http://blogs.zdnet.com/semantic-web/?p=174">$4m Series A investment round</a> was closed in July 2008, led by Octopus Ventures.
</p>

<p>I spoke with CEO William Tunstall-Pedoe ahead of today's announcement to see how the core knowledge base continues to improve, and to discuss the company's plans.
</p>

<p>For those who haven't tried it, True Knowledge offers an interesting slant on attempting to <em>answer your question</em> rather than simply return hundreds or thousands of documents that might contain the answer as traditional search engines tend to. Tunstall-Pedoe quoted Google's Larry Page during our conversation, noting that Page has asserted that
</p>

<blockquote>
<p>
"the perfect search engine would understand exactly what you mean and give back exactly what you want."
</p>
</blockquote>

<p>It is this that True Knowledge attempts with their 'Internet Answer Engine,' and core to their solution are a comprehensive (137 million facts, and growing) knowledge base, a proprietary system for <em>understanding</em> a query and a powerful inference capability that enables the system to answer questions more reliably. Part of that reliability, as Tunstall-Pedoe frequently stresses, lies in the system's ability to know when it <em>doesn't</em> know the answer. Along with <a href="http://blog.trueknowledge.com/2009/03/true-knowledge-answering-more-and-more.html">a success rate of less than 50%</a> for providing answers to questions, this may seem little more than an academic curiosity, but an ability to reliably know when to fall back to less structured approaches (such as passing the query to Google) is far better than 'guessing' or delivering wholly inappropriate responses... especially once the Answer Engine's capabilities are embedded in some third party site.
</p>

<p>True Knowledge's process of inference also allows the system to cope with ambiguity, and even with contradictory 'facts.' During our conversation, we told the system that President Obama was born in Cambridge. It allowed us to make this assertion, but subsequent analysis of the overwhelmingly contradictory data drawn from elsewhere in the knowledge base means that it was deemed to be untrue and flagged as such.
</p>

<p><img src="http://cdn-static.zdnet.com/i/story/61/26/000273/tk-obama_400x389shkl.png" width="400" height="389" class="aligncenter size-full wp-image-276" />
</p>

<p>A different query, in which I ask 'How far is San Jose from SFO?,' shows both how the system copes with ambiguity and the manner in which supporting facts are drawn from sites such as Metaweb's <a href="http://www.freebase.com/">Freebase</a>.
</p>

<p><a href="http://www.trueknowledge.com/q/how_far_is_san_jose_from_sfo"><img src="http://cdn-static.zdnet.com/i/story/61/26/000273/tk-2_400x505shkl.png" width="400" height="505" class="aligncenter size-full wp-image-277" /></a>
</p>

<p>The current True Knowledge home page is not going to draw huge numbers of users away from their search engine of choice, but that isn't really the point. As Tunstall-Pedoe pointed out, the site is intended to showcase the company's capabilities and facilitate the addition of new knowledge (as well as the millions of facts drawn from Wikipedia, Freebase and a growing body of licensed commercial content, over 120,000 facts have already been added by individuals in the beta programme). The real utility of True Knowledge will lie in licensing the underlying system for use in vertical and horizontal third party applications, and public availability of the <a href="http://www.trueknowledge.com/api/">True Knowledge API</a> begins that process. There's a long way to go in further extending the knowledge base, suggesting that vertical search applications may be the first to sign up; it's much easier to approach comprehensiveness within a bounded domain than across all areas of knowledge.
</p>

<p>The market for semantically enhanced search is growing crowded, and stalwarts of the search industry have been hard at work too, with Google and others getting increasingly good at returning actual answers to factual questions.
</p>

<p><a href="http://www.google.com/search?hl=en&#38;q=what+time+is+it+in+Los+Angeles&#38;btnG=Search"><img src="http://cdn-static.zdnet.com/i/story/61/26/000273/google-time.png" width="400" height="310" class="aligncenter size-full wp-image-280" /></a>
</p>

<p>Tunstall-Pedoe used a slide to demonstrate the differentiation the company sees between itself and 'obvious' competitors such as Wikipedia, Freebase, and hotly anticipated <a href="http://www.wolframalpha.com/">Wolfram Alpha</a>. Key differentiators in the diagram included True Knowledge's ability to infer (something Wolfram Alpha also claims), its language independence (although currently only available in English, the concept extraction techniques used by True Knowledge should work equally well in other languages), and the system's reliance upon an internal ontology comprising 20,000 classes (plus biological species, product information, etc). True Knowledge (unsurprisingly) scored far better than the competition, but in a market that also includes the likes of <a href="http://www.hakia.com/">Hakia</a> and <a href="http://www.powerset.com/">Powerset</a> (neither of which could usefully answer my question about San Jose and SFO) the true picture is a lot more complex.
</p>

<p>True Knowledge is certainly interesting, and frequently impressive. It remains to be seen whether a Platform proposition will set them firmly on the road to riches, or if they'll end up finding more success <a href="http://blogs.zdnet.com/semantic-web/?p=168">following the same route as Powerset</a> and getting acquired by an existing (enterprise?) search provider.
</p>]]></media:text>
    </item>
    <item>
      <guid isPermaLink="false">6126000269</guid>
      <link><![CDATA[http://www.zdnet.com/blog/semantic-web/ivan-herman-discusses-semantic-web-activity-at-the-world-wide-web-consortium/269]]></link>
      <title><![CDATA[Ivan Herman discusses Semantic Web activity at the World Wide Web Consortium]]></title>
      <description><![CDATA[Ivan Herman is Semantic Web Activity Lead at the World Wide Web Consortium (W3C), and in this podcast he talks about a range of current activities across the Semantic Web community.]]></description>
      <pubDate><![CDATA[Wed, 08 Apr 2009 15:33:03 +0000]]></pubDate>
      <media:credit role="author"><![CDATA[Paul Miller]]></media:credit>
      <s:doctype><![CDATA[Text]]></s:doctype>
      <media:text type="html"><![CDATA[<p><a href="http://www.w3.org/People/Ivan/">Ivan Herman</a> is Semantic Web Activity Lead at the World Wide Web Consortium (W3C), and <a href="http://blogs.talis.com/nodalities/2009/04/ivan-herman-talks-about-the-semantic-web-and-w3c.php">in this podcast</a> he talks about a range of current activities across the Semantic Web community.
</p>]]></media:text>
    </item>
  </channel>
</rss>
