Yahoo delivers security, workflow tools for Hadoop

The company has contributed Kerberos-based security features and its own Oozie workflow engine to the open-source data management platform

Yahoo has added security features and its Oozie workflow engine to Hadoop, the open-source software framework for crunching large amounts of data.

Tuesday's announcement included the beta release of Hadoop with Security, which integrates Kerberos authentication into the Apache Hadoop framework to secure the accessing and sharing of data within the platform. The software also supports multi-tenancy, so many people can share the same hardware with authenticated access.

Yahoo has also contributed Oozie, used to manage and coordinate tasks running on Hadoop, to the open-source project. Oozie was developed by Yahoo to manage its own workloads with complex processes on a large scale, and the company said it has tested and deployed it on tens of thousands of its servers. The beta version supports the use of the Hadoop Distributed File System (HDFS), the Pig analysis tool for large data sets and MapReduce.

The beta releases should help Hadoop, an open-source platform administered by the Apache Software Foundation, gain more widespread adoption in enterprises, Yahoo said.

"Businesses across all sectors are looking for ways to leverage the vast quantities of data they are accumulating, and Apache Hadoop is an efficient solution for processing data at scale. Hadoop has matured and is now becoming an enterprise-ready cloud computing technology with the addition of Kerberos authentication," Melanie Posey, a director at IDC Research, said in Yahoo's announcement.

Yahoo itself is a leading contributor to the project. It runs Hadoop on 35,000 servers as a key component of its business processes and is home to the world's largest Hadoop clusters.

Separately, at the Hadoop Summit on Tuesday, Cloudera announced its new Cloudera Enterprise data management product. The software, delivered via subscription to clouds built upon Cloudera's Distribution for Hadoop (CDH), provides data storage, management and analysis tools.

In addition, the company launched CDH 3, an update to its flagship product, which packages Hadoop with eight other open-source tools. Acer, AMD, Netezza, Talend and Yahoo are now supporting the package, according to Cloudera.