Hortonworks co-founder and architect Arun Murthy outlined his vision for Hadoop's next decade, and the view is decidedly more business-friendly and consumable.
Speaking at the Apache Big Data Europe conference in Budapest, Hungary, Murthy said the future of Hadoop is stitching together apps to manage data processing in an assembly. Going forward, these assemblies will be vital tools for businesses as Hadoop is relegated to infrastructure. The aim is to eliminate the friction of running a business application or process on Hadoop.
This assembly approach will open up Hadoop and big data to more enterprises. Today, an enterprise has to work with a series of separate projects and applications. In other words, building on top of Hadoop means that enterprises have to be well versed in a series of acronyms such as YARN and Spark.
Murthy's view is notable for a few reasons. First, Murthy has Hadoop credibility because he led the team that built it. Murthy was the lead of the Yahoo Hadoop MapReduce development team that was eventually spun out to spawn the big data drive across the enterprise. Now, as a leading voice for Hortonworks and an Apache committer, Murthy has a big say in Hadoop's future development.
The other notable item in Murthy's talk, which was previewed for ZDNet, is that the future has a business-friendly tilt. Hadoop is in the analytics fabric at many enterprises, but hasn't had the connections and ease of use on its own. As a result, a bevy of vendors have taken Hadoop and mixed in proprietary software to make it easier to use. The ease of use, however, could be paid for in lock-in.
Murthy said Hadoop's nearly decade-long existence has let "a thousand flowers bloom." Indeed, Hadoop has had a bevy of Apache extension projects that have moved it forward. Now Hadoop needs a few more projects to get to Murthy's vision, which is helped along by the following themes:
- The future of Hadoop is the destiny of data itself. "Everything is producing data now," said Murthy, noting the Internet of things, wearables and other sensors in the field.
- It's getting cheaper to produce data and move it around, thanks to Hadoop and its adjoining projects.
- More data will be produced outside of the data center. Sensors will trump transactional systems, said Murthy. "We have to make it easy to move that data into Hadoop," he said.
- Virtualization is giving way to containers such as Docker, which enable small, focused apps to run.
"If you put all of these things together, you have to go from a transactional data world to pretransactional world," said Murthy. "We're moving from being app centric to being data centric."
How does Hadoop need to evolve?
In Murthy's assessment, Hadoop will need to be able to reproduce business logic. And to get better at business logic, Hadoop will need Murthy's assembly concept.
An assembly is essentially a series of applications and business logic tools stitched together. These mini apps, like containers, can be swapped in and out and tweaked. "The idea is to operate and secure a consistent assembly," said Murthy.
"Because it's an assembly, make it easy to get value out of Hadoop without having to jerry rig applications together," he said. "We've been able to go to post MapReduce world. Now we need to transition from technology to be more of a business use case."
What's on the to-do list for Hadoop? Murthy has the following list:
- Better integrate with Docker and similar technologies. "Docker itself is an early technology and the way Hadoop manages Docker has been very nascent," said Murthy. "We still need consistent APIs."
- Security will need to be developed in a common framework that can secure assemblies in one pop.
- Interfaces need to be built in for management.
- Interoperability across containers as well as various related Hadoop projects.
"We have a lot of stuff to go solve in the next 12 to 18 months," said Murthy.