Updated: One top cloud company has joined forces with a top Hadoop company to simplify the process of deploying big data workloads on private clouds.
On August 7, Nimbula, which was founded by the creators of Amazon EC2, and MapR Technologies, a leading Apache Hadoop distributor, announced the development of a turnkey solution for deploying elastic Hadoop clusters on Nimbula Director, which is free to download for up to 40 cores. The solution is also free.
"This is the first distribution that we have worked with and support. The technology can extend to other distributions but our partnership with MapR is the first one," said Jay Judkowitz, Nimbula's director of marketing. "What's new here is that users can now deploy Hadoop on their private cloud in minutes and enjoy the benefits of the private cloud - HA, multi-tenancy, self-service, rapid provisioning and deprovisioning, etc. Elastic Hadoop/MapReduce is available on public clouds (like Amazon EC2, Google).
"Not only we bring it to the private cloud, we also add a lot of automation into the process of launching Hadoop and also the failover," he said. "If one of the servers dies, [we] can relaunch [it]. So the initial install/startup and ongoing availability. The automation of the Hadoop services from MapR and the automation of the underlying instances from Nimbula work together to maintain a fully-functional and highly-available Hadoop cluster."
Judkowitz said the Nimbula-MapR solution, the public Hadoop services now offered by Amazon and Google, and VMware's Hadoop on ESX are four "proof points" of a trend toward deploying more big data workloads in the cloud.
"Historically, Hadoop has been run on dedicated single-purpose physical infrastructure which optimizes performance at the expense of sharing and infrastructure reuse. The result is a blazing fast, but inflexible and costly big data environment. Today, we are starting to see a trend to move some big data workloads into VMs, or more specifically into IaaS clouds, both public and private," Judkowitz said in an email comment.
The combined solution offers a variety of benefits, Nimbula says, including:
- Rapid provisioning: Deploying Hadoop clusters in under 2 minutes
- Self-service Hadoop: Launching their own Hadoop clusters on a private cloud without needing to requisition hardware
- Highly available Hadoop: Operating their Hadoop clusters in a lights-out data center with automated re-provisioning of failed nodes and a no-NameNode architecture
- Multi-tenant Hadoop: Sharing infrastructure between Hadoop clusters with complete permissions, network and resource isolation
- Multi-purpose infrastructure: Sharing infrastructure between Hadoop and non-Hadoop workloads
Nimbula best explains the overarching benefits with this:
"Hadoop is a valuable tool for mining huge amounts of data in a very short period of time and leveraging the power of farms of inexpensive commodity servers and disks. When there is one dataset, one organization mining that data, and a static or growing load placed on that dataset, Hadoop works perfectly without help from any cloud technology," Nimbula wrote on its web site.
"However, when elasticity, multi-tenancy and flexibility are required, running Hadoop on a private cloud platform like Nimbula Director can provide huge cost and operational benefits. Nimbula and MapR bring Hadoop to Private Cloud. Now customers can deploy Hadoop on top of Nimbula Director in mere minutes and start running their Hadoop job on the cloud."