Pentaho open sources big data code, licenses Kettle project under Apache 2.0

Pentaho has open sourced some of the big data assets in its Kettle open source project -- and moved its entire Kettle Data Integration Platform to Apache 2.0 -- in order to capture more of the booming Hadoop and NoSQL business.

One top BI player recently open sourced some of its data integration software and licensed the entire Kettle 4.3 release under Apache 2.0 to position itself as a big data player.

Pentaho, a longstanding open source business intelligence applications player, notes that Hadoop and several top NoSQL databases are licensed under Apache. Pentaho's Kettle open source project, otherwise known as Pentaho Data Integration Community Edition, is devoted to "operationalizing" big data.

Some of the big data capabilities in Kettle that will be open sourced include "the ability to input, output, manipulate and report on data using the following Hadoop and NoSQL stores: Cassandra, Hadoop HDFS, Hadoop MapReduce, Hadapt, HBase, Hive, HPCC Systems and MongoDB," the company announced.

Traditional relational databases and data tools are often insufficient for handling the volume and variety of big datasets.

One executive had this to say about the open source move:

“In order to obtain broader market adoption of big data technology including Hadoop and NoSQL, Pentaho is open sourcing its data integration product under the free Apache license. This will foster success and productivity for developers, analysts and data scientists, giving them one tool for data integration and access to discovery and visualization,” said Matt Casters, founder and chief architect of Pentaho's Kettle project.