Hadapt partners with Cloudera, puts MPP and Hadoop side-by-side

The Big Data world has no shortage of hybrid solutions, joining BI and relational technology with Hadoop and MapReduce. Hadapt offers a hybrid too, but prefers technology coexistence to subordination.

As enterprise adoption of Hadoop and its related technologies rises, it's inevitable that companies will offer various combinations and hybrids of relational and/or multidimensional databases with NoSQL data stores, MapReduce processing and various file systems.

Teradata Aster (formerly Aster Data) takes the approach of enabling MapReduce processing over a relational database. MapR enables Hadoop to work over standard Network File System (NFS) disk volumes. JustOneDB enables parallel processing over a column store. And, of course, Apache Hive creates a SQL/relational abstraction layer over MapReduce, thus enabling conventional data visualization and reporting tools to work with Hadoop.

Hadapt takes a hybridizing approach that's different from the others, and I like it a lot.  Instead of layering SQL/relational or MapReduce over one another, Hadapt makes them work together.  Hadapt uses Hadoop, and then it co-locates a Massively Parallel Processing (MPP) relational database on the cluster by putting an instance of PostgreSQL on each node.  I've written before that MPP and MapReduce have a lot in common, and both should be considered Big Data technologies.  Hadapt proves out this point in practical terms by implementing MPP in Hadoop's physical environment.

  • Also read: MapReduce and MPP: Two sides of the Big Data coin?
Along the way, Hadapt's approach makes some interesting things possible. For instance, Hadapt enables joins from relational tables to Hadoop Distributed File System (HDFS) files, and has a data loader to move data back and forth between the two. But perhaps more important, it allows interactive SQL queries to be used in situations where they outperform batch-mode MapReduce jobs. Yes, MapReduce is wicked fast for huge data sets, but for ancillary tasks, the overhead of running such a job may not pay off. Hadapt has a query optimizer that can decide whether to use MPP or Hadoop processing. And this means that conventional reporting and data visualization tools can connect to Hadapt using ODBC and JDBC, rather than having to go through Hive (and, most likely, a custom Hive driver).
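
To make the trade-off concrete, here is a minimal Python sketch of the kind of decision such an optimizer has to make. It is not Hadapt's actual optimizer, which I haven't seen; the cost model, the function names and every number in it are hypothetical, chosen only to show how a fixed job-startup overhead can outweigh raw throughput on small inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Hypothetical cost parameters; all values are illustrative assumptions."""
    batch_startup_s: float = 30.0    # assumed fixed overhead to launch a batch job
    batch_rate_mb_s: float = 2000.0  # assumed batch throughput across the cluster
    sql_startup_s: float = 0.5       # assumed overhead to start an interactive query
    sql_rate_mb_s: float = 200.0     # assumed interactive-query throughput


def estimate_seconds(startup_s: float, rate_mb_s: float, input_mb: float) -> float:
    """Estimated run time: fixed startup cost plus time to scan the input."""
    return startup_s + input_mb / rate_mb_s


def choose_engine(input_mb: float, model: CostModel = CostModel()) -> str:
    """Pick whichever engine has the lower estimated run time for this input size."""
    batch = estimate_seconds(model.batch_startup_s, model.batch_rate_mb_s, input_mb)
    sql = estimate_seconds(model.sql_startup_s, model.sql_rate_mb_s, input_mb)
    return "interactive_sql" if sql <= batch else "batch_mapreduce"


if __name__ == "__main__":
    for size_mb in (100, 5_000, 1_000_000):
        print(f"{size_mb} MB -> {choose_engine(size_mb)}")
```

With these made-up numbers, small inputs go to the interactive SQL path and very large ones go to batch processing. A real optimizer would presumably weigh far more than input size, but the shape of the decision is the same.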

I spoke with Justin Borgman, Hadapt's CEO and co-founder, last week and he discussed this architecture with me, explaining that it ends up being very cloud-friendly when compared to other MPP products. He also explained how the Hadapt product began as a Yale Computer Science research project conducted by Dr. Daniel Abadi, another co-founder of the company. The company raised its venture funding, moved from New Haven, CT to Cambridge, MA to be part of a bigger tech hub, and is now a 25-person organization. That's pretty small, but I expect it to get bigger. And big things are happening on the company's Board. Former CEO of Vertica, Christopher Lynch, was recently appointed Chairman of the Board, and Sharmila Shahani-Mulligan, the former CMO of Aster Data, holds a Board spot as well. That's quite a roster of BI-Big Data unifiers.

And speaking of unifying, Hadapt announced just today that its product is now certified to run with Cloudera's Distribution Including Hadoop (CDH) as well as Cloudera's premium product, Cloudera Manager. I wrote recently about the abundance of Cloudera partnerships in the Hadoop vendor space, so it's not shocking to see that Hadapt is announcing one today as well.

  • Also read: Cloudera partners with IBM, as its own influence grows
But I find this particular certification and partnership especially encouraging, because it puts the most ubiquitous Hadoop distro on the same level (actually, on the same nodes and disks) as MPP and relational technology. It's exciting to see all of these conventional database/Hadoop hybrids emerge. While they may not all be here forever, they are helping the industry as a whole determine what combinations of old and new are best.