AWS advances machine learning with new chip, elastic inference

To address the high cost of inference, AWS at re:Invent introduced Amazon Elastic Inference and a new processor called AWS Inferentia.
Written by Stephanie Condon, Senior Writer, and Asha Barbaschow, Contributor

The launch of Amazon Elastic Inference lets customers add GPU acceleration to any EC2 instance for faster inference at savings of up to 75 percent. Typically, the average utilization of GPUs during inference is 10 to 30 percent, Jassy said.
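The utilization figure explains much of the cost problem. As a rough sketch, the snippet below works out what a customer effectively pays for each fully used GPU-hour when a dedicated GPU sits mostly idle. The $3.00 hourly price is a hypothetical placeholder rather than an AWS rate, and the function name is illustrative only.

```python
def effective_cost_per_busy_hour(hourly_price_cents: int, utilization_percent: int) -> float:
    """Return the cost, in cents, of one fully utilized GPU-hour.

    A GPU billed for a whole hour but busy only part of the time costs
    proportionally more per hour of useful work.
    """
    if not 0 < utilization_percent <= 100:
        raise ValueError("utilization_percent must be in the range 1 to 100")
    return hourly_price_cents * 100 / utilization_percent


# Assumed price of $3.00 per hour, for illustration only.
HYPOTHETICAL_PRICE_CENTS = 300

for utilization in (10, 30, 100):
    cents = effective_cost_per_busy_hour(HYPOTHETICAL_PRICE_CENTS, utilization)
    print(f"{utilization:>3}% utilized -> ${cents / 100:.2f} per busy GPU-hour")
```

At the 10 to 30 percent utilization Jassy cited, the effective price under this assumption is roughly three to ten times the list price, which is the gap that right-sized acceleration is meant to close.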

With a growing number of enterprises embracing machine learning on the cloud, Amazon Web Services is introducing new capabilities and tools to improve inference. Specifically, it's launching Amazon Elastic Inference and unveiling a new processor called AWS Inferentia.

"If you think about the cost equation... the vast majority of cost - probably about 90 percent of it - is in inference," AWS CEO Andy Jassy said at the re:Invent conference in Las Vegas.


With Elastic Inference, you can provision acceleration for any EC2 instance at the time you create that instance. You can start at 1 teraflop or go up to 32 teraflops. Elastic Inference can detect when one of the major frameworks is running on that instance and determine what would benefit from acceleration.
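As a minimal sketch of what attaching acceleration at instance creation might look like, the snippet below assembles the parameters for an EC2 RunInstances request with one accelerator attached. It assumes the `ElasticInferenceAccelerators` request field and the `eia1.medium` accelerator type from the EC2 API. The helper name, the `ami-EXAMPLE` image ID, and the `c5.large` instance type are placeholders chosen for illustration, and a real launch would also need networking and IAM setup that is not shown.

```python
from typing import Any, Dict


def build_run_instances_request(
    image_id: str, instance_type: str, accelerator_type: str
) -> Dict[str, Any]:
    """Build keyword arguments for an EC2 RunInstances call with one accelerator.

    This only assembles the request; it does not contact AWS.
    """
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Assumed request field: the accelerator is declared alongside the instance.
        "ElasticInferenceAccelerators": [{"Type": accelerator_type, "Count": 1}],
    }


# Placeholder values, not real resources.
request = build_run_instances_request("ami-EXAMPLE", "c5.large", "eia1.medium")
print(request)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("ec2").run_instances(**request)
```

Keeping the request builder separate from the API call makes the accelerator choice easy to inspect or change without launching anything.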

"This is a pretty significant game change in being able to run inference much more cost effectively," Jassy said.

Meanwhile, AWS Inferentia is a high-performance machine learning inference chip custom designed by AWS. It will offer very high throughput, low latency, and sustained performance, and it will be very cost effective, Jassy said. It will support all the major frameworks and will be available on all EC2 instance types.

Also announced during the keynote was Amazon SageMaker Ground Truth, which Jassy said allows organizations to implement machine learning without spending thousands of human hours labeling training data.

The CEO said Amazon SageMaker Ground Truth helps build highly accurate training datasets and will reduce data labeling costs by up to 70 percent.

Typically, "you have to label objects to train the model ... you need to know what a stop sign or pedestrian is," he explained. "It requires thousands of hours of footage, and you have to label everything."

This process is generally slow, expensive, and hard to achieve. "Most companies just don't bother and that makes it harder to build these computer vision models," Jassy said.

Meanwhile, Amazon SageMaker RL adds new capabilities to Amazon SageMaker for building, training, and deploying models with reinforcement learning. It amounts to "reinforcement learning for every developer and data scientist," Jassy said.

"We want to enable everyone to have access to this," he said, noting smaller companies should be able to do what the larger ones do.
