Dana Gardner, formerly a top analyst with the Yankee Group and prior to that the Aberdeen Group, recently started his own consultancy, Interarbor Solutions. He'll be focusing on enterprise applications, software infrastructure, RSS and other topics, and he has agreed to post some of his insights on Between the Lines. Here's Dana's take on IBM's Unstructured Information Management Architecture announcement that came out today:
You have to give IBM credit for mastering the intersection of open source community development and business development. This week's announcement of the UIMA analytics interoperability standardization process, in conjunction with LinuxWorld, is the latest example of IBM's apt balancing of openness and opportunity.
UIMA stands for Unstructured Information Management Architecture, which means Big Blue remains better at bits and bytes than branding. Despite the need to memorize the name, the methodology -- soon to be hosted with the SourceForge open source community under a license similar to Eclipse's -- has great potential.
Should it quickly gain ground, as initially appears to be the case, UIMA can significantly help close the gap between tacit human knowledge and what search engines do so well, namely indexing and matching labels. UIMA can take enterprise search to the next level: to go beyond swiftly locating and accessing relevant indexed information to extracting a gem of human experience that someone else labored over and applying it to a similar future task.
When you come down to it, what separates us from the apes is the ability to apply learning better. Why should someone else have to relearn what I managed to figure out, especially if we work in the same company? Internet Protocol-based search helps us answer that question with "you shouldn't," but the qualitative results have been mixed when applied to unstructured and mixed-mode sources. UIMA-compliant analytics tools, which companies will still have to purchase and support, by the way, can gang-tackle the search problem, rather than expecting one tool to do it all well.
To be precise, neither UIMA itself nor its Java SDK will ferret out the knowledge needle in the corporate haystack per se, but, importantly, they do allow other compliant analytical approaches to work together cohesively. UIMA use will probably begin with individual projects (such as help desk worker searches), then broaden to whole enterprises, and later to extranets.
The Linux-friendly UIMA allows, pardon my dueling analogies, for a lot of cooks to work in the same kitchen on the same recipe at the same time. One ingredient of content analysis may not get you what you want, but allowing standardized interoperability across several (or dozens) -- a smorgasbord of interoperable analytics -- is much more powerful and effective.
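For readers who think better in code, here is a rough sketch of the many-cooks idea: several independent analytics components share one analysis container, and a later component builds on what earlier ones found. To be clear, this is not UIMA's actual API or its Java SDK. Every name here (Analysis, Annotation, regex_annotator, link_tickets_to_products, run_pipeline) and the help desk sentence are made up for illustration, and the sketch is in Python purely for brevity.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Annotation:
    kind: str   # label chosen by the component that produced it
    begin: int  # start offset in the text
    end: int    # end offset (exclusive)


@dataclass
class Analysis:
    """Hypothetical shared container: the raw text plus all findings so far."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)

    def covered(self, ann: Annotation) -> str:
        return self.text[ann.begin:ann.end]


# The common contract: any component is a callable that reads and extends
# the shared Analysis. This stands in for a standardized interface.
Annotator = Callable[[Analysis], None]


def regex_annotator(kind: str, pattern: str) -> Annotator:
    """Build a simple component that labels every regex match as `kind`."""
    compiled = re.compile(pattern)

    def annotate(analysis: Analysis) -> None:
        for m in compiled.finditer(analysis.text):
            analysis.annotations.append(Annotation(kind, m.start(), m.end()))

    return annotate


def link_tickets_to_products(analysis: Analysis) -> None:
    """A second-stage component: it relies on annotations from other tools,
    pairing each ticket with the nearest product mentioned after it."""
    tickets = [a for a in analysis.annotations if a.kind == "ticket"]
    products = [a for a in analysis.annotations if a.kind == "product"]
    for t in tickets:
        following = [p for p in products if p.begin >= t.end]
        if following:
            nearest = min(following, key=lambda p: p.begin)
            analysis.annotations.append(
                Annotation("ticket_about_product", t.begin, nearest.end)
            )


def run_pipeline(text: str, annotators: List[Annotator]) -> Analysis:
    """Run each component in order over the same shared Analysis."""
    analysis = Analysis(text)
    for annotate in annotators:
        annotate(analysis)
    return analysis


pipeline = [
    regex_annotator("ticket", r"ticket #\d+"),
    regex_annotator("product", r"OmniFind|DB2|WebSphere"),
    link_tickets_to_products,
]
result = run_pipeline(
    "Resolved ticket #4821 by reindexing OmniFind overnight.", pipeline
)
links = [
    result.covered(a)
    for a in result.annotations
    if a.kind == "ticket_about_product"
]
```

The point of the sketch is the shape rather than the regexes: because every component agrees on the same container and calling convention, a specialist tool from one vendor could be swapped in for either regex_annotator without the linker needing to change.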
IBM, of course, is interested in selling more WebSphere Information Integrator OmniFind Edition enterprise search solutions (and all the WebSphere and DB2 stuff that supports that). But there is so much specialization in text analytics that IBM -- and here's the mastery -- decided to partner widely with its UIMA technology and make it open source as a way to encourage wider community adoption and promote compliance.
An understanding of the difference between what open source communities do best and what commercial products do best -- and of how to combine them to mutual value -- is evident in this UIMA announcement. And it's moves like this that separate IBM from the pack in blending open source approaches with commercial IT business development.