
Podcast: Google's search appliance 5.0 mines Documentum, other ECMs, but not Google Apps yet

Written by David Berlind

The news is pretty much everywhere today: Google is releasing version 5.0 of its enterprise search appliance, a turnkey device that, until today, companies could attach to their corporate networks to index and search the content found on their Web-based intranets, file sharing systems, and certain databases. The big news is that this indexing and search capability now extends to the content repositories managed by a handful of Enterprise Content Management (ECM) systems on the market, namely EMC's Documentum, IBM's FileNet, Microsoft's SharePoint, and OpenText's Livelink.

Not included in the list of searchable repositories, however, is Google Apps, a service that Google provides to enterprises in a private, domain-oriented context and one that can be used to create a variety of documents that organizations might want indexed, such as text documents, spreadsheets, presentations, and HTML documents. According to Matthew Glotzbach, Google's director of Enterprise Products, whom I interviewed via podcast about the announcement, that feature is planned for a future version. For now, customers of Google Apps can use the search feature that's built directly into Google Apps to search the documents stored there.

Perhaps just as newsworthy is that Google has open sourced (under the Apache License) the framework and the connectors that its search appliance uses to connect to those ECMs. There's a separate connector for each ECM. The idea behind open sourcing that code, according to Glotzbach, is that ECMs are deployed so differently from one enterprise to the next that Google wanted to give enterprises the opportunity to tweak the code so that the search functionality integrates cleanly into their environments. Another reason is that third-party developers, including the developers of other ECMs that Google hasn't built connectors for, can join the party by building their own connectors. According to Glotzbach, they're free to use the open sourced connectors as sample code to get them started on other connectors. Google worked with a variety of third parties to get the connectors developed, but none of them were the vendors of the ECMs (e.g., EMC) themselves.

In the podcast interview, I asked Glotzbach about the longer-term implications of supporting proprietary ECMs. In many ways, the message that the existence of Google Apps sends to the marketplace is that documents should be created and stored as HTML documents instead of in the proprietary formats that ECMs are so good at organizing. In other words, philosophically, does connecting its enterprise search appliance to ECMs run counter to Google's larger belief system about how documents should be created and handled?

Clearly, ECMs aren't going away anytime soon, if for no other reason than that there's so much legacy data trapped in them. But what if you're a startup with no such legacy and a greenfield opportunity to establish document standards that guarantee you can index and retrieve your documents later? Glotzbach tackles that question as well.

You can listen to the podcast by pressing the play button above. There's also a manual download option. If you're subscribed to my IT Matters series of podcasts (see how), this episode of IT Matters should show up on your MP3 player and/or PC automatically.
