In November of last year, Google announced the limited preview of the BigQuery service, a cloud-based system for hosting and ad hoc, SQL-like querying of large data sets. Today the company announced in a blog post that the service is now public.
BigQuery is an impressively developer-friendly solution, as it offers a rather straightforward REST (REpresentational State Transfer) Web service for pushing data to Google's cloud and then querying it. That's a lot easier than setting up a Hadoop cluster and writing MapReduce jobs.
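To give a feel for how simple that REST interface is, here is a minimal sketch of what a query request looks like. The endpoint path follows BigQuery's v2 REST API (the jobs.query method); the project ID is a placeholder, and the sample table and the omitted OAuth handshake are assumptions for illustration.

```python
import json

# Hypothetical project ID; in practice this comes from your
# Google APIs console project.
project_id = "my-project"

# jobs.query endpoint in BigQuery's v2 REST API.
endpoint = (
    "https://www.googleapis.com/bigquery/v2/"
    f"projects/{project_id}/queries"
)

# The request body is just a SQL-like query string, serialized as
# JSON and POSTed to the endpoint (with an OAuth bearer token
# attached in a real request).
body = json.dumps({
    "query": "SELECT word, COUNT(*) AS n "
             "FROM publicdata:samples.shakespeare "
             "GROUP BY word ORDER BY n DESC LIMIT 10"
})

print(endpoint)
print(body)
```

No cluster setup, no MapReduce job to write: one HTTP POST carrying a familiar-looking SQL statement.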
There are some devilish details, though. As I wrote in my first post on this blog, a consensus definition of Big Data is somewhat elusive. With BigQuery, Google proves that point rather effectively. Although BigQuery comes from the company that invented MapReduce and an accompanying file system for handling petabyte-and-greater data sets, BigQuery does not encapsulate that technology or capability.
Instead, Google itself describes BigQuery as an OLAP (OnLine Analytical Processing) system that lets you query up to 100GB of data per month for free, and store/query data sets of up to 2TB for fees of $0.12/GB/month for storage and $0.035/GB processed for queries. If you want to go bigger than that, the blog post advises you to "contact a sales representative."
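Those figures are easy to turn into a back-of-the-envelope monthly bill. A small sketch, using the prices quoted above; treating the 100GB free tier as applying to query processing is my assumption, not something the announcement spells out.

```python
# Prices quoted in the announcement.
STORAGE_PER_GB_MONTH = 0.12   # $ per GB stored per month
QUERY_PER_GB = 0.035          # $ per GB processed by queries
FREE_QUERY_GB = 100           # assumed: free tier covers query processing

def monthly_cost(stored_gb, queried_gb):
    """Estimate a month's BigQuery bill in dollars."""
    billable_query_gb = max(0, queried_gb - FREE_QUERY_GB)
    return (stored_gb * STORAGE_PER_GB_MONTH
            + billable_query_gb * QUERY_PER_GB)

# e.g. storing 500 GB and processing 600 GB of queries:
# 500 * 0.12 + (600 - 100) * 0.035 = 60 + 17.5
print(monthly_cost(500, 600))  # -> 77.5
```

At these rates the bill is dominated by storage rather than query processing, which fits the OLAP framing.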
I think BigQuery would be better classified as a cloud Business Intelligence platform than a Big Data solution. Such a description would better set expectations. That way, when customers learn they'll be querying column store databases that are hundreds of gigabytes in size, and not working with NoSQL-based data sets that are hundreds of terabytes in size, they'll be ready.
Having said that, a cloud-based BI solution that can query up to a couple of terabytes (and which is free up to 100GB), and uses REST and SQL to do it, sounds pretty cool to me. And if you watch the demo in this video, you will see BigQuery is darn fast.
I'll be trying BigQuery and plan to tell you about it in a future BigPost.