Concurrent launches open source API to ease Hadoop development

Concurrent is promoting its open source data workflow API -- an alternative to MapReduce -- to ease Hadoop application development headaches. The open source API, now available under an Apache 2.0 license, is used by Twitter, Amazon and Razorfish.
Written by Paula Rooney, Contributor

One developer is promoting his application framework, an alternative to MapReduce, to make big data management easier in the Hadoop era.

Concurrent CEO and Founder Chris Wensel is the creator of an open source data workflow API, Cascading, which is used by Twitter, Amazon and Razorfish.

Last week, the company officially launched, as did Cascading 2.0, which is now available under an Apache 2.0 license.

Wensel sees growing adoption of the API as big data management explodes. He created Cascading to help him develop complex Hadoop applications more easily and first released the code in 2007.

"I was writing Hadoop applications and it was an extremely painful process. I started writing a framework to give me a different model. MapReduce is a simple way to parallelize data computations but it's hard to solve real problems," Wensel said in a recent interview.

True Ventures and Rembrandt announced $900,000 in seed funding for the company last year.

"We have seen a lot of innovation and investment in the data center at the storage, database, and around new technologies like Hadoop to accommodate the exploding data growth. We think a new equally important category will also emerge around data processing and management in order to give context to these growing volumes of data," wrote True Ventures' general partner Puneet Agarwal in a blog post last year.

In a press release, Concurrent announced that using the upgraded 2.0 API, now available under an open source license:

  • Data scientists can easily discover, model and analyze both unstructured and semi-structured data in any format and from any source, such as flat files, key-value stores, and NoSQL and relational databases.
  • Hadoop administrators can seamlessly move and scale application deployments from development to test and production clusters regardless of cluster location or data size.
  • Application developers can more quickly build and test applications on their desktops in their language of choice (Java, Jython, Scala, Clojure or JRuby) with familiar constructs and reusable components, and instantly deploy them onto clusters of hundreds of nodes.