Google explains how Compute Engine fits in big data pipeline

After the debut of Compute Engine, Google product managers explain how it fits into the company's big data strategy overall.

SAN FRANCISCO -- For anyone familiar with big data right now, this is a no-brainer: managing massive amounts of data is hard.

But Google product manager Ju-kay Kwek posited that there is a better way to solve this problem: leveraging Google's expertise to put your big data to work for you.

Naturally, this would be the advertised argument during a Google I/O 2012 panel discussion about turning big data into a competitive advantage. Nevertheless, with the unveiling of the Google Compute Engine Infrastructure-as-a-Service platform earlier on Thursday, it's time to hear more about what Google is planning to do in this field.

Kwek cited an IDC forecast that the big data market will grow from $3.2 billion in 2010 to $16.9 billion by 2015, a compound annual growth rate (CAGR) of roughly 40 percent. Based on that alone, it's also a no-brainer that Google would want to put all of the data it already has (and then some) to better use in the enterprise market.
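For readers who want to check the growth figure, the short Python sketch below recomputes it from the two IDC numbers quoted above. It assumes five annual compounding periods between 2010 and 2015, and the function name `cagr` is purely illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.40 means 40 percent)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in billions of dollars.
# Assumption: 2010 to 2015 is treated as five compounding periods.
rate = cagr(3.2, 16.9, 5)
print(f"Implied CAGR: {rate:.1%}")
```

Under that assumption the result lands just under 40 percent, in line with the "roughly 40 percent" cited in the session.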

Data is being seen as a core business asset, and increasingly a lot of business data (e.g. social, CRM) is out in the cloud. Kwek remarked that many new things are possible (and only available) by using the cloud, such as unique algorithms and scalability.

However, Kwek acknowledged that it's tough for enterprise customers to capture all of the data they generate, and scaling traditional business infrastructures for big data is equally difficult.

In terms of what this big data actually looks like, Kwek outlined some common characteristics: it can be structured, unstructured or semi-structured; it runs to millions (if not billions) of rows; and it is too large to process or store on a single machine.

Google product manager Navneet Joneja presented some of the cloud solutions that Google is offering to tackle big data, explaining that Google is more focused on the solution than on the infrastructure.

Along with Google BigQuery, which was designed to handle massive datasets with billions of rows, two examples are Google Cloud Storage, which offers unlimited storage at up to 5TB per object, high redundancy and simple sharing, and Google App Engine, a scalable application development and execution environment.

Much as Android devices connect through a single Google account login, Joneja explained, the products in this cloud data analytics pipeline should work together seamlessly. However, he acknowledged that each of these products should work just as well on its own.