Google makes Cloud TPU Pods publicly available in beta

Composed of racks of Google's custom silicon chips, TPU Pods can take just minutes to complete ML workloads that would take days on other systems, Google says.
Written by Stephanie Condon, Senior Writer

Google on Tuesday announced that its Cloud TPU v2 Pods and Cloud TPU v3 Pods are now publicly available in beta, enabling machine learning researchers and engineers to iterate faster and train more capable machine learning models. The pods are composed of racks of Tensor Processing Units (TPUs), the custom silicon chips Google first unveiled in 2016. Together in multi-rack form, they amount to a "scalable supercomputer for machine learning," Google said. 

Google announced the beta release during the Google I/O conference, the annual developer event where Google typically makes several AI-related announcements, including the release of AI products and services aimed at enterprise customers.

A single Cloud TPU Pod can include more than 1,000 individual TPU chips. They're connected by a two-dimensional toroidal mesh network, which the TPU software stack uses to program multiple racks of machines as a single machine via a variety of high-level APIs. Users can also leverage small sections of a Cloud TPU Pod, called "slices."
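To make the "two-dimensional toroidal mesh" concrete, here is a minimal Python sketch of how neighbors and hop counts work on a 2D torus, where each edge of the grid wraps around to the opposite edge. The 32 x 32 grid size and the function names are illustrative assumptions, not Google's actual Pod layout or any part of the TPU software stack.

```python
def torus_neighbors(x, y, width, height):
    """Return the four direct neighbors of node (x, y) on a 2D torus.

    The modulo arithmetic makes the grid wrap around, so nodes on an
    edge are linked to the nodes on the opposite edge.
    """
    return [
        ((x - 1) % width, y),   # left (wraps to the right edge)
        ((x + 1) % width, y),   # right (wraps to the left edge)
        (x, (y - 1) % height),  # up (wraps to the bottom edge)
        (x, (y + 1) % height),  # down (wraps to the top edge)
    ]


def torus_hops(a, b, width, height):
    """Return the minimum number of links between nodes a and b."""
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    # Along each axis, take the shorter of the direct and wrap-around paths.
    return min(dx, width - dx) + min(dy, height - dy)


# Hypothetical 32 x 32 grid (1,024 nodes), chosen only for illustration.
WIDTH, HEIGHT = 32, 32
print(torus_neighbors(0, 0, WIDTH, HEIGHT))
print(torus_hops((0, 0), (31, 31), WIDTH, HEIGHT))
```

Because of the wrap-around links, opposite corners of this hypothetical grid are only two hops apart, and no pair of nodes is more than 32 hops apart. Without the wrap-around, the worst case would be 62 hops.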

The latest-generation Cloud TPU v3 Pods are liquid-cooled for maximum performance. Each one delivers more than 100 petaFLOPs of computing power. In terms of raw mathematical operations per second, a Cloud TPU v3 Pod is comparable with a Top 5 supercomputer worldwide, Google said -- though it operates at lower numerical precision.

With that kind of power, TPU Pods can take just minutes or hours to complete ML workloads that would take days or weeks on other systems. Google says they're well-suited for customers who need to iterate faster while training large ML models, train more accurate models using larger datasets (millions of labeled examples; terabytes or petabytes of data), or retrain a model with new data on a daily or weekly basis.
