Cloudera’s show of numbers

Cloudera has raised a bunch of money. Again. In this guest post, Tony Baer explains what the new investment means for Cloudera, Hadoop and the data warehousing space.
Written by Andrew Brust, Contributor
Tony Baer

This guest post comes courtesy of Tony Baer’s OnStrategies blog. Baer is a principal analyst covering Big Data at Ovum.

The announcement of Cloudera's new $160 million venture funding almost looked too perfectly timed. It came midway through Cloudera’s first formal dog and pony show in front of industry analysts. And we're not just talking about the usual suspects, but a broader, more sober crowd of doubters from across the IT spectrum: app development, IT infrastructure, database, and BI, where the consensus remains that Hadoop is not a database.

Unlike Hortonworks, Cloudera has not been afraid to ruffle feathers. It dares to offer a hybrid open source/proprietary model in a market born in open source. More importantly, it has announced a strategic Enterprise Data Hub path that potentially places it in competition with established data warehouse providers that might otherwise be logical partners. Cloudera's enterprise data hub positioning is ambitious, auspicious, and for now, a conceptual leap. Hadoop is not a database, and it currently lacks enterprise-grade features for performance management, SLA conformance, security, and data governance. The emphasis is on "currently": platform and practice are evolving rapidly, and Hadoop will grow into a more robust platform that can compete for the role of hub.

There is little question that Hadoop is here to stay; Cloudera has drawn competition from Hortonworks, which positions itself as the 100% open source platform that is very OEM-friendly; MapR, whose implementation includes proprietary technology that gets the platform closer to the robustness and performance of databases; and IBM, which after a brief flirtation with Cloudera subsequently reiterated its positioning as the adult in the room. Meanwhile, Teradata, Oracle, Microsoft, and Amazon include Hadoop in their data stacks.

Cloudera hardly needed the capital as it already had $140 million in the bank. The new infusion jumps that to $300 million. More to the point, it includes a battery of firms, such as T. Rowe Price, who tend to be long-term investors, plus Michael Dell's venture arm and Google Ventures as "strategic" backers. The company does not deny having IPO aspirations, but states that the new money gives it more flexibility on the timing.

Immediately following the announcement, we received several press queries as to whether Cloudera was for sale. In our view the most likely candidates would be Oracle (which resells Cloudera’s full platform as part of its Big Data Appliance, and just saw disappointing Q3 numbers) and newly privatized Dell. The common thread is that both are seeking engines to rekindle growth. But the $300 million in the bank inflates Cloudera's valuation to the point that it would be a very, very expensive buy.

Nonetheless, there’s a lot of venture money floating around right now. And with Facebook’s $19 billion acquisition of a company that few had ever heard of (except for hundreds of millions of casual subscribers like us who have the app but don't use it), we have the makings of a venture capital bubble. As such, there is a flight to quality (investing in market leaders) among Tier 1 VCs. In the Big Data arena, players like Cloudera and MongoDB are perceived to be among those leaders.

So we don't believe that Cloudera is currently for sale. With Enterprise Data Hub, they are not claiming to replace data warehousing incumbents, but the pressure to move data storage and compute cycles onto the cheaper Hadoop platform is potentially quite threatening. (We believe that the incumbents must assert their value higher up the stack, such as with in-database analytic functionality, data governance, and query optimization.)

Whatever Cloudera's next step (IPO or acquisition), their immediate goal is placing more facts on the ground with product and market share to raise the stakes on whatever transpires. That will inevitably include Cloudera making its own acquisitions – a skill that the company needs to learn – and likely diversification of the product line. At the analyst session, we viewed a demonstration of a Hadoop-based predictive analytics system that Cloudera uses as its nerve center for customer support; it's a technology that could be generalized beyond Hadoop users.

$300 million in the bank may be a nice security blanket. But look at the state of adoption: Cloudera, which has had a multiyear head start in the market, counts an installed base of roughly 10,000 to 12,000. But that boils down to about 350 paying subscribers (currently growing at about 40 to 50 per quarter). Any market where the leader’s paid base numbers in the hundreds is either a niche segment or a very immature one. Obviously, Hadoop's the latter, and as such, there are any number of potential disruptors that could surface on the road to mainstream adoption. For Cloudera and its rivals, it's hardly game over.

Correction: the preceding paragraph previously stated that Cloudera's paying subscriber count was growing at a rate of 40 to 50 per month. The text has been corrected to indicate a rate of 40 to 50 per quarter.
