Concurrent's founder and CTO, Chris Wensel, and CEO, Gary Nakamura, stopped by to introduce their company; its leading product, Cascading; and its new application performance management product, Driven, which is designed to take the guesswork out of finding performance problems in Hadoop-based big data applications.
What is Cascading?
I have to admit that I was not aware of Cascading until this conversation. After the discussion, I spent some time familiarizing myself with the open source project and what it can do. Here's how Concurrent describes Cascading:
Cascading is a Java application framework that enables typical developers to quickly and easily develop rich Data Analytics and Data Management applications that can be deployed and managed across a variety of computing environments. Cascading works seamlessly with Apache Hadoop 1.0 and API compatible distributions.
I was impressed by the list of companies that use Cascading in their Big Data development efforts and by the fact that the software is downloaded 75,000 times a month. Concurrent points out that over 6,000 data-driven businesses, including Twitter, eBay, The Climate Corp and Etsy, use Cascading to develop Hadoop-based applications.
What is Driven?
Driven is an application performance management tool designed to help Cascading developers accelerate development, diagnose performance problems, and both manage and monitor Hadoop-based Big Data applications.
Concurrent points out that they are addressing three things:
- Making Hadoop development straightforward enough that enterprises of all sizes can use it to become data-driven companies. Without Cascading, many would have to resort to programming Big Data applications in assembler.
- Providing alerting and notification tools for Big Data applications so that they can be folded into production environments.
- Offering low-cost tools (free for development, modest cost for production environments) that make it possible for companies to use Big Data.
Companies have accumulated huge amounts of operational, telemetric and point-of-sale data that could be the basis for a better and deeper understanding of their own operations and customers. Hadoop, while a very popular tool, can be challenging for a newcomer. Tools such as Cascading and Driven could certainly shorten the time it takes for developers to get up to speed and become productive with Hadoop.