Hadoop Summit opening day brings multiple announcements

For the opening day of Hadoop Summit in San Jose, MapR/Fusion-io and Zettaset/Informatica announce new partnerships, and Couchbase announces a new release.

Big Data companies love to partner and to release new products, and they especially love to synchronize the announcement of those team-ups and releases with major Big Data conferences.  Today is the first day of the two-day Hadoop Summit conference in San Jose, and several announcements have already come in.

MapR/Fusion-io: fusion indeed
MapR, a top Hadoop distribution provider, and one that differentiates itself in the storage layer, is, fittingly, announcing a team-up with storage company Fusion-io.  While not a formal partnership per se, the two companies jointly conducted testing of MapR’s M7 Hadoop distribution with Fusion-io’s Fusion ioMemory platform.

M7 includes a custom version of the HBase NoSQL database, optimized for better file management.  ioMemory provides fast, flash-like storage – as opposed to spinning hard drive (HDD) or standard solid state disk (SSD) media.

The two companies are announcing that, together, M7 and ioMemory provide a 25x performance increase for read-intensive HBase applications, when compared with the use of standard Apache HBase running on a Hadoop cluster using commodity storage.

Will Hadoop customers go for specialized storage and a non-Apache version of HBase?  Purists might not, but enterprise customers who are just now adopting Hadoop and NoSQL may find the combination compelling.

Zettaset/Informatica partner up
While MapR and Fusion-io may just be announcing some collaborative testing, Zettaset, which provides enterprise security and manageability for Hadoop clusters, and Informatica, a leading independent data integration (DI) vendor, are announcing a formal partnership today.

Zettaset will embed Informatica PowerCenter Big Data Edition into its own Orchestrator product, a tool for managing Hadoop clusters.  This integration will allow the Hadoop cluster security event data that Zettaset captures from firewalls and network switches (intrusion alerts, for example) to be integrated with data from other sources in customers’ analytics workflows.

Here again, we see the "Enterprise-ation" of Hadoop.  Zettaset’s products make Hadoop clusters work sensibly in the context of corporate data centers, and Informatica has long been used in enterprise data integration/Business Intelligence work.  Putting the two together makes sense in much the same way that Splunk’s data center analytics products have seemingly resonated in the marketplace.

Splunk’s own announcement today around analytics of generalized Hadoop data would indicate that machine-oriented and general Big Data analytics vendors are starting to cross over into each other’s territory.  And with that in mind, the general DI capabilities that Informatica brings to Hadoop will be brought to bear for Zettaset as well, and embedding of Informatica’s data quality functionality is planned for the future.

Couchbase 2.1 GAs
Couchbase, a major NoSQL database vendor, with technology based on Apache CouchDB, is announcing general availability (GA) of version 2.1 of the Couchbase Server product.  This new version offers a multi-threaded persistence engine that improves disk utilization on both commodity hardware and specialized servers; cross-datacenter replication optimizations; and new health and management tools, along with UI enhancements for existing tools.

Couchbase has also launched a new user community portal, with language-specific sections for developers and admins, as well as an interactive Q&A application designed to replace the existing Couchbase Forums.

There is a common theme in the MapR/Fusion-io and Couchbase announcements: as NoSQL finds more traction in corporate applications, storage-based performance and manageability are of prime importance.