Pentaho open sources big data code, licenses Kettle project under Apache 2.0

Summary: Pentaho has open sourced some of the big data assets in its Kettle open source project -- and moved its entire Kettle Data Integration Platform to Apache 2.0 -- in order to capture more of the booming Hadoop and NoSQL business.

One top BI player recently open sourced some of its data integration software and licensed the entire Kettle 4.3 release under Apache 2.0 to position itself well as a big data player.

Pentaho, a longstanding open source business intelligence applications player, notes that Hadoop and several top NoSQL databases are licensed under Apache. Pentaho's Kettle open source project, otherwise known as Pentaho Data Integration Community Edition, is devoted to "operationalizing" big data.

Some of the big data capabilities in Kettle that will be open sourced include "the ability to input, output, manipulate and report on data using the following Hadoop and NoSQL stores: Cassandra, Hadoop HDFS, Hadoop MapReduce, Hadapt, HBase, Hive, HPCC Systems and MongoDB," the company announced.

Traditional relational databases and data tools are insufficient for handling big datasets.

One exec had this to say about the open source move:

"In order to obtain broader market adoption of big data technology including Hadoop and NoSQL, Pentaho is open sourcing its data integration product under the free Apache license. This will foster success and productivity for developers, analysts and data scientists giving them one tool for data integration and access to discovery and visualization," said Matt Casters, founder and chief architect of Pentaho's Kettle Project.

Topics: Big Data, Open Source

Talkback

3 comments
  • Pentaho is very competitive with Actuate

    Nice move.
    Dietrich T. Schmitz *Your
  • RE: Pentaho open sources big data code, licenses Kettle project under Apach

    I've got version 3.x installed on one of my Linux systems (it's LGPLv2). Now that I have a small Cassandra cluster set up (most definitely NOT big data), I may give version 4.x a test drive.
    Rabid Howler Monkey
  • RE: Pentaho open sources big data code, licenses Kettle project under Apache 2.0

    This is great news. I learned HPCC Systems is also developing plugins that allow users to spray fixed width or delimited files from within a Kettle job to a Thor cluster and also let you execute ECL on a Thor cluster from within a Kettle job. This integration can really allow for powerful data ETL capabilities. Learn more at hpccsystems.com
    H-M