IBM today announced a Big Data partnership with Palo Alto startup company Cloudera. While IBM chose to reveal the partnership by tucking it away in its announced acquisition of Vivisimo, Cloudera has a full press release on its Web site. Ovum analyst Tony Baer, whose guest post on "Fast Data" appeared here last week, has an excellent post examining this announcement.
This deal with IBM is no one-off though. In fact, for Cloudera, partnerships like this one are getting to be old hat.
Cloudera is a major contributor to the Apache Hadoop project. The company's Hadoop distro, known as "Cloudera Distribution including Apache Hadoop" (CDH, for short), is probably the one most commonly deployed. That is impressive, as are the company's contributions to Hadoop 2.0, which is making its way into the mainstream CDH release.
But as ubiquitous as Cloudera has become on its own, perhaps even more impressive is the number of partnerships the company has forged with software megavendors and BI leaders. Here are a few examples:
- Earlier this month, Pervasive Software announced Cloudera certification of its RushAnalyzer 1.2.2 product (and that's in addition to the September 2011 certification of DataRush)
- In February of this year, data visualization darling Tableau announced its Cloudera-certified connector, allowing its eponymous product to connect to CDH via Apache Hive
- In January, Oracle announced its Cloudera partnership, embedding CDH in its Big Data Appliance product
- In September of 2010, Teradata announced a partnership with Cloudera too, whereby CDH can be used to "funnel unstructured data...into a Teradata data warehouse."
So what's going on here? Other companies have Hadoop distros; in fact, IBM is one of them. And why would other mighty companies, like Oracle and Teradata, see partnerships with a Bay Area startup as so important to their Big Data strategy? Either Cloudera is "on the bubble" or Big Data is as disruptive to the tech industry's power structure as it is to the ways companies use their data.
If the BI industry has shown us anything, it's that acquisitions and consolidation will take place as data industries mature. And last week's strong IPO of Splunk shows us that Big Data companies can and will go public. If Cloudera gets acquired, then its various partnerships could get awkward. If Cloudera IPOs, well, that could be exciting to watch.