Big Data Week in Review (December 7th, 2012)

A series of Big Data-related announcements this week merits a roundup and summary.

Like many ZDNet contributors, I receive numerous communications from PR agencies each day.  I can’t respond to each one, and I certainly can’t report on all of the announcements.  But it struck me that I could collect them and summarize some of the more interesting ones.  Here’s this week’s spread:

Cloudera raises $65M
The biggest news of the week didn't come from a PR agency.  Instead, Arik Hesseldahl of AllThingsD reported that Hadoop poster child Cloudera raised $65 million of Series E funding.  Existing investors Accel Partners, Greylock Partners, Ignition Partners, In-Q-Tel and Meritech Capital Partners participated.

Cloudera offers what is perhaps the most pervasive and widely deployed Hadoop distribution, though I get the sense that Hortonworks is catching up. Regardless, Cloudera, which announced Impala, its distributed SQL query engine, less than two months ago, is raising lots of money, and chatter of an IPO is growing.

ClearStory Data announces $9M in Series A funding
Cloudera’s is not the only funding story this week.  Business user-facing analytics company ClearStory announced a successful Series A funding round of $9 million.  For this deal, the dough came from Kleiner Perkins Caufield & Byers, and original seed investors Andreessen Horowitz and Google Ventures.  Between ClearStory’s easier analytics and Cloudera’s SQL-based Impala, it would seem the smart money is on making Big Data more for the mainstream and less for the laboratory.

Cleversafe installations total 70PB of customer data
Cleversafe offers a Hadoop distribution that swaps its Slicestor appliances in for conventional HDFS storage, eliminating the NameNode single-point-of-failure vulnerability that Hadoop normally faces.  The company announced this week that the total amount of data stored across all of its customers' installations has topped 70 petabytes.  The company also says it expects to sell another 100 petabytes' worth of storage in 2013.

Also read: Cleversafe launches Hadoop without HDFS; Jaspersoft brings disconnected report editing

Rapid-I opens US offices
Rapid-I, which offers two very compelling machine learning products, its RapidMiner application and its RapidAnalytics server, announced the official opening of its US offices, in Burlington, MA and Sunnyvale, CA.  The company, which was founded in Dortmund, Germany, has its eye on making the relatively impenetrable world of predictive analytics technology more accessible to a variety of users.  Its U.S. offices will no doubt help the company widen its audience.  And winning the hack/reduce hackathon in Boston a few weeks ago likely won't hurt either.

SoftLayer offers MongoDB on-demand in the cloud
10gen’s MongoDB is a NoSQL database that targets the Big Data space. Now, provisioning a MongoDB cluster in the cloud is getting almost as easy as storing word processing documents there.  MongoDB is already available from MongoLab on Amazon’s EC2 and Microsoft’s Windows Azure, as well as on Joyent’s and Rackspace’s clouds, and SoftLayer is now offering its own MongoDB-as-a-service option.  SoftLayer’s MongoDB resources are available through the company’s API or its portal, and billing is based on pay-as-you-go plans, as you would expect from a cloud provider.  Nodes can be entry-level quad-core servers or sixteen-core high-perf units.  The resources are 10gen-certified and provide Gold Subscription benefits.

Next week in Florida
That’s not all the Big Data news this week, but these are some of the developments I found particularly interesting.  Next week, I’m off to the Live! 360 conference in Orlando to speak about Big Data and NoSQL to mainstream developers and database administrators.  I’ll be eager to see how prominent Big Data is on that audience’s radar.
