VMware’s efforts to “decouple” Hadoop nodes from hardware will help enterprises adopt the emerging big data framework.
“What’s clear is that the world of data is undergoing a big disruption,” said Dave McJannet, Senior Director, Product Marketing at VMware, in a recent interview. “We’re seeing it move further and further into the enterprise [and] customers are working with Hadoop to make sense of vast quantities of data. What most people are trying to do is distill down many terabytes of data into something actionable so they can use traditional analytics tools.”
“If I want to deploy a distributed workload over dozens of machines, I need each machine to look identical. Today, it’s an inherently custom exercise,” he said. “Using a virtual machine as a container to automate the deployment of data in it greatly simplifies the operational aspects.”
VMware maintains that “deployment and operational complexity, the need for dedicated hardware and concerns about security” are showstoppers for enterprises interested in using Hadoop.
“By decoupling Apache Hadoop nodes from the underlying physical infrastructure, VMware can bring the benefits of cloud infrastructure – rapid deployment, high availability, optimal resource utilization and secure multi-tenancy – to Hadoop,” the company announced.
On Wednesday, VMware announced a new open source project dubbed “Serengeti,” which allows customers to run Hadoop on VMware’s vSphere. Using the open source toolkit, customers can deploy a Hadoop cluster – including Apache Hive and Apache Pig – on vSphere.
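Toolkits of this kind typically describe a cluster declaratively and then stamp out identically configured virtual machines from that description. The sketch below shows what such a cluster specification might look like; the node-group names, role identifiers, and sizing values are illustrative assumptions for this article, not taken from the Serengeti announcement itself:

```json
{
  "nodeGroups": [
    {
      "name": "master",
      "roles": ["hadoop_namenode", "hadoop_jobtracker"],
      "instanceNum": 1,
      "cpuNum": 2,
      "memCapacityMB": 4096
    },
    {
      "name": "worker",
      "roles": ["hadoop_datanode", "hadoop_tasktracker"],
      "instanceNum": 4,
      "cpuNum": 2,
      "memCapacityMB": 2048
    },
    {
      "name": "client",
      "roles": ["hadoop_client", "hive", "pig"],
      "instanceNum": 1
    }
  ]
}
```

A specification along these lines would be handed to the toolkit, which provisions each node group as a set of identical virtual machines on vSphere – exactly the “custom exercise” McJannet describes being automated away.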
VMware also plans to contribute extensions to Apache Hadoop that will make Hadoop components “virtualization aware” to support “elastic scaling” of Hadoop applications in virtual environments, executives said.
VMware is the leading virtualization vendor and also an application platform vendor whose open source Spring platform has been optimized for Hadoop. Spring for Hadoop was announced last February, and yesterday VMware released updates to that code as well.
“The efforts we’re putting forth here are bringing enterprise-class capabilities to Hadoop that will be required for it to become mainstream and accessible to large numbers of organizations.”