Google launching open source Cloud Dataflow SDK for Java

Cloud Dataflow fills in a major piece of the puzzle in Google's rapidly growing cloud stack as the Internet giant continues to challenge Amazon Web Services.

Google is looking to woo cloud developers once again with the debut of an open source SDK for Java based around its fairly new Cloud Dataflow service.

First unveiled at Google's annual I/O developer summit this summer, Cloud Dataflow is a big data analytics solution designed to crunch information in either streaming or batch mode.

Cloud Dataflow has since been pushed out as an alpha release as Google preps a managed service model for data processing.

Urs Hölzle, senior vice president of Google Cloud Platform, noted at the time that Dataflow replaced MapReduce inside Google as the new approach for analyzing pipelines with "arbitrarily large datasets."

Cloud Dataflow also fills in a major piece of the puzzle in Google's rapidly growing cloud stack as the Internet giant continues to challenge Amazon Web Services, among other cloud providers.

More specifically, Google's Cloud Dataflow lines up against the data warehouse service Amazon Redshift as well as the Hadoop-based service Amazon Elastic MapReduce.

Google software engineer Sam McVeety explained in a blog post on Thursday that the open-sourced SDK should make it easier for developers to integrate with Google's managed service and to port Cloud Dataflow to other development languages and environments.

"The value of data lies in analysis -- and the intelligence one generates from it," McVeety wrote. "Turning data into intelligence can be very challenging as data sets become large and distributed across disparate storage systems. Add to that the increasing demand for real-time analytics, and the barriers to extracting value from data sets becomes a huge challenge for developers."

The Google Cloud Dataflow SDK for Java is available on GitHub now.