Talend and Simba go NoSQL
Talend, provider of open source data- and application-integration software, announced its support for the NoSQL databases HBase (yes, the very same database that MapR has optimized), Cassandra and MongoDB. The Talend support for these databases will be available next month as part of the upcoming version 5.2 release of its Open Studio for Big Data. Talend told me that support for additional NoSQL databases is bound to come. The company keeps an eye on community-contributed connector efforts, and takes it upon itself to fortify and harden the most popular ones, adding them to the core product.
Not to be outdone, Simba promoted its Big Data ODBC drivers, supporting Hive (yes, that same layer over MapReduce that Impala emulates and outperforms), Cassandra and MongoDB, as well as Google BigQuery.
ODBC (Open DataBase Connectivity) is a 20-year-old data access API standard from Microsoft, which is enjoying something of a renaissance lately. ODBC defines both a standard database driver framework (supported by most query, reporting and BI tools and many programming languages) and a SQL grammar that the drivers translate to the target database’s native language and commands. Simba’s Hive driver already ships as part of the Hortonworks and MapR Hadoop distributions, and the company announced that the Qubole cloud-based Hadoop platform will use it as well. But Simba’s Big Data drivers, procured directly, deliver ODBC compatibility to anyone, for all four databases.
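To make the translation idea concrete, here is a minimal sketch of what "SQL in, native query out" can look like for a document database. It is a toy, not Simba's implementation: the function name `translate_select`, the tiny SQL subset it accepts, and the shape of the returned dictionary are all assumptions made for illustration. A real ODBC driver handles a far larger grammar, plus type mapping, metadata and connection management.

```python
import re

# Hypothetical, deliberately tiny SQL subset (an assumption for this sketch):
#   SELECT col[, col...] | * FROM table [WHERE col = 'text' | number]
_SELECT = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*=\s*(?P<val>'[^']*'|\d+))?\s*;?\s*$",
    re.IGNORECASE,
)


def translate_select(sql):
    """Translate a restricted SELECT into a MongoDB-style query description.

    Returns a dict with the target collection, a filter document and a
    projection document. The dict layout is illustrative only.
    """
    match = _SELECT.match(sql)
    if match is None:
        raise ValueError("statement is outside the supported SQL subset")

    # Column list becomes a projection; '*' means "no projection".
    columns = [c.strip() for c in match.group("cols").split(",")]
    projection = {} if columns == ["*"] else {c: 1 for c in columns}

    # An optional equality predicate becomes a one-field filter document.
    query_filter = {}
    if match.group("col"):
        raw = match.group("val")
        value = raw[1:-1] if raw.startswith("'") else int(raw)
        query_filter[match.group("col")] = value

    return {
        "collection": match.group("table"),
        "filter": query_filter,
        "projection": projection,
    }


if __name__ == "__main__":
    print(translate_select("SELECT name, city FROM users WHERE age = 30"))
```

The point of the sketch is the division of labor: the BI tool or application only ever speaks the SQL side, while everything database-specific lives behind the translation function.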
Hortonworks racks up the partnerships
In addition to Qubole’s platform, Hadoop is available as a cloud service via Amazon’s Elastic MapReduce service, based on Amazon’s own Hadoop distribution or the MapR M3 and M5 distributions. As I mentioned previously, MapR’s Hadoop distro will also soon be available as a service via Google Compute Engine. Microsoft’s Windows-based “HDInsight” Hadoop distribution, developed in concert with Hortonworks, reached a new milestone release in its by-invitation beta on Wednesday, and will soon be publicly available on the Windows Azure cloud platform.
What about Rackspace’s cloud? And what about the Linux-based Hortonworks Data Platform (HDP) Hadoop distribution? Well, the two companies announced that their products will be united to offer one more Hadoop public cloud service. And since Rackspace’s cloud is based on the OpenStack platform, which can also be implemented on-premises to build private clouds, HDP as a private cloud service is now possible as well.
LucidWorks, the commercial entity most supportive of the Lucene and Solr projects, announced the beta release of its LucidWorks for Big Data product. The cloud-based platform creates a unified RESTful API (Representational State Transfer-based Application Programming Interface) around Hadoop and its companion components, like Pig, HBase and Mahout, oriented toward search-driven Big Data analytics.
Splunk, the Big Data company known for its wildly successful IPO, announced the availability of its Hadoop Connect product (which integrates Hadoop with Splunk Enterprise) and Splunk App for HadoopOps (a Hadoop monitoring, troubleshooting and health analysis tool).
Still not enough for you? How about a couple of new database releases? Metamarkets announced it has open sourced its Druid in-memory streaming real-time data store, and Calpont announced that version 3.5 of InfiniDB, its Massively Parallel Processing (MPP) database, will reach GA next month.
In my last several posts, I’ve summarized the huge array of Big Data announcements made in concert with this month’s Strata + Hadoop World NYC event. In future posts I’ll try to draw some conclusions about all the new products and initiatives that were released and announced.