X
Innovation

Google Cloud adds dual region support for storage aimed at analytic workloads

Twitter is moving a Hadoop cluster to Google Cloud Platform and used a new dual regional setup to lower costs and latency.
Written by Larry Dignan, Contributor

Google Cloud Platform rolled out new replication and transfer options for its storage including dual-regional replication for data, more redundancy choices and a developer library for C++.

The storage updates, announced at Google Cloud Next in London, offer more control for customers as well as options for analytics workloads.

Here's the breakdown:

Google Cloud Storage is getting new dual-regional options where customers can pair specific Google Cloud regions. The idea is that this dual-regional service provides business continuity, availability and low latency. Customers will be able to pick their storage bucket and pair regions so that data will be automatically replicated. Dual regional options are in beta.

google-cloud-storage-dual-region.png

According to Google, dual regional options can help analytics and big data workloads since compute and data can be stored closer together. Google said that customers will be able to read or write to either region in a pair without flipping between primary and secondary locations. The first dual regions available are nam4, which combine us-central1 with us-east1, and eur4, which combine europe-north1 and europe-west4.

Related: What's the best SMB cloud storage for you? | Free PDF download: The Future of Everything as a Service | Top cloud providers 2018: How AWS, Microsoft, Google Cloud Platform, IBM Cloud, Oracle, Alibaba stack up

Dominic Preuss, director of product management for Google Cloud, said that the dual regional options are for companies that need to know where their data sits and put compute workloads next to it. For instance, Twitter is moving 300PB in a Hadoop cluster to Google Cloud Platform and the dual regional option cut costs and latency. "Twitter needed the higher availability of multiple regions but to also know where data was so they could co-locate compute," he said.

Derek Lyon, director of engineering at Twitter, said:

One of our first use cases, which demonstrates the value of Dual-Region buckets, is a large-scale log analysis project that we recently completed. For this project, we needed to run interactive SQL queries over 15 PB of compressed log files. With Dual-Region buckets, we are able to ingest the data using our 800 Gbps of connectivity, which spans multiple locations across the US where we have peering established with Google.

Preuss added that dual regions will roll out to all customers in a beta. He added that the dual regional setup will save customers money and improve performance. "You don't want to run jobs in Oregon if data is in Northern Virginia," said Preuss, who added storage, networking and compute costs can add up. "It's more predictable (performance and cost)."

Google Cloud's Nearline and Coldline data tiers are geo-redundant in multiple regions. Google said the move will boost availability of archival data while lowering storage costs. Google also raised its availability service level agreements from 99 percent to 99.9 percent.

The setup means that customers can access data with millisecond latency since it's distributed redundantly, but have archival pricing. Multi-regional support is available for Nearline and Coldline storage.

Cloud storage options are simplified. Google Cloud Storage has been one product with one API and four classes of storage. With dual regional storage and geo-redundancy, Google Cloud will have two classes of storage by redundancy type and access tier. The price options look like this:

gcp-storage-pricing.png

Google said it will roll out the changes on its Web pages, user interface and documentation.

Google also launched a Cloud Storage C++ library. Google Storage's C++ client library allows developers to program applications using Google Cloud Storage in C++. The C++ library is driven by two key industries for Google Cloud--gaming and oil and gas. Preuss said the C++ library gives developers using parallel processing the language they prefer.

Editorial standards