Red Hat and Hortonworks have unveiled integrated products aimed at speeding up enterprise big-data Apache Hadoop projects.
Having collaborated directly and through the community for the past 12 months, the two open-source software companies have now cemented links with three initiatives announced on Monday.
"There was a natural affinity, not just in terms of the technologies but in how we view the market and how we think we can accelerate customer adoption [of Hadoop]," Red Hat storage VP and general manager Ranga Rangachari said.
"At a high level most of the initial spadework has been around the community. Both companies absolutely believe in community-driven innovation and that the innovation that happens outside the four walls of our respective organisations needs to be fostered."
The two firms unveiled the beta of Hortonworks Data Platform (HDP) combined with Red Hat Storage, which is designed to give companies a unified storage pool for all Hadoop workloads, with support for both file and object interfaces.
"The benefit to the customer is de-silo-ification," Rangachari said.
"They don't need to create yet another infrastructure to do this. It employs their existing infrastructure and allows them seamlessly to start to use these workloads."
Rangachari said the beta would have been impossible without the upstream community's work on the Apache Ambari operational framework for provisioning Hadoop clusters.
"Ambari needs to understand the semantics and the nuances of Red Hat storage and that's the work that has been ongoing for the past few months and few quarters," he said.
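Ambari drives cluster provisioning of this kind through declarative "blueprints" submitted over its REST API. As a rough illustration of what that looks like (the blueprint name, stack version and host-group layout below are hypothetical, not taken from the Red Hat-Hortonworks beta), a minimal blueprint might be:

```json
{
  "Blueprints": {
    "blueprint_name": "hdp-small",
    "stack_name": "HDP",
    "stack_version": "2.1"
  },
  "host_groups": [
    {
      "name": "master",
      "components": [ { "name": "NAMENODE" }, { "name": "RESOURCEMANAGER" } ],
      "cardinality": "1"
    },
    {
      "name": "workers",
      "components": [ { "name": "DATANODE" }, { "name": "NODEMANAGER" } ],
      "cardinality": "3"
    }
  ]
}
```

A blueprint like this is registered with Ambari and then instantiated against a set of hosts, which is why Ambari needs to understand the semantics of the underlying storage layer it is laying Hadoop services on top of.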
"Again, it's testament to the fact that, with enough like-minded views, we can surround this problem and that having community-level collaboration really starts to bear fruit."
Hortonworks VP of corporate strategy Shaun Connolly said the storage announcement, together with those in data virtualisation and development, reflects the work by the two companies with other organisations and individuals in the open-source community.
"That work has to happen in the upstream community before it can flow downstream into enterprise products — that's just the nature of our engineering and business models," Connolly said.
The second element in the Red Hat-Hortonworks announcement is the integration of HDP with Red Hat JBoss Data Virtualization, enabling Hadoop to work with existing data sources including warehouses, SQL and NoSQL databases, on-premises and cloud applications, and flat and XML files.
According to Connolly, the general availability of the JBoss data virtualisation software provides a virtualisation layer for developers to create applications more easily in a unified data model.
The JBoss tools enable developers to get data out of a Hadoop system more easily in a familiar way and either build analytic apps or enhance existing ones.
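The idea behind a data-virtualisation layer of this kind is that one logical view fans a query out across heterogeneous backends and unions the results, so applications never address each source directly. The sketch below illustrates that pattern only conceptually; the class, sources and method names are invented for illustration and are not the JBoss Data Virtualization API.

```python
# Conceptual sketch of data virtualisation: one query layer federating
# rows from heterogeneous backends into a single unified result.
# All names and sample data are illustrative.

class VirtualView:
    """Presents several backends as one logical table of row dicts."""

    def __init__(self, sources):
        # Each source is a callable returning an iterable of row dicts,
        # standing in for a warehouse, NoSQL store, flat file, etc.
        self.sources = sources

    def select(self, predicate=lambda row: True):
        # Fan the query out to every backend and union the results.
        return [row for fetch in self.sources for row in fetch() if predicate(row)]


# Illustrative backends: a relational-style table and a document store.
warehouse = lambda: [{"id": 1, "region": "EMEA"}, {"id": 2, "region": "APAC"}]
documents = lambda: [{"id": 3, "region": "EMEA"}]

view = VirtualView([warehouse, documents])
emea = view.select(lambda row: row["region"] == "EMEA")
print([row["id"] for row in emea])  # → [1, 3]
```

The point of the pattern is that a new backend can be added to the view without touching the applications that query it, which is the "familiar way" of reaching Hadoop data that Connolly describes below.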
"If you look at Red Hat, they clearly have dominance with Red Hat Enterprise Linux in the core infrastructure base but also there are millions of JBoss developers who increasingly need to tap into the value of Hadoop," Connolly said.
"So if we enable them to use with their existing technologies the tools that do that in a familiar way, then it just accelerates that whole community of developers to enrich their applications in a way that's very familiar."
The two companies also announced the combining of HDP with Red Hat Enterprise Linux and the OpenJDK free, open-source implementation of the Java Platform.
As well as integrating with Ambari, they said the initiative is designed to create a flexible development environment for elastic Hadoop deployments in physical, virtual or cloud settings.
All three moves will assure users about the extent of the integration between the technologies and the establishment of collaborative customer support, Rangachari said.
"When they start to deploy these products, even though they come from two different vendors, there is a unified support that they can count on," he said.
Red Hat has also been working closely with Hortonworks and others on the Apache Savanna project to provide a simple means to provision a Hadoop cluster on top of OpenStack.
"That work is ongoing and right now we are not announcing any availability. It's all out there in the community and when we think it's ready for prime time, we’ll obviously make it available," Rangachari said.
"At the infrastructure level you can think of it as a sandwich where you’ve got infrastructure on one side and platform as a service on the other side with the Hadoop data platform sitting in the middle," he said.
"There are a lot of things that we can do, we should do, we must do to make sure there’s tighter integration."
Connolly said the fundamental questions across those three layers all come back to the creation of rich analytic business applications in Hadoop.
"How do we enable it to be deployed very quickly and, in the case of open hybrid cloud, elastically on demand so that we remove the frictions, and then how do we make it easier to combine Hadoop with existing sources of data in a way that’s very approachable," he said.