AWS announces EC2 instance based on custom-built Inferentia chip

At AWS re:Invent, AWS CEO Andy Jassy announced new cloud instances based on a custom-built machine learning inference chip
Written by Stephanie Condon, Senior Staff Writer

Amazon Web Services on Tuesday announced a new EC2 instance powered by its custom-built Inferentia chip. Inferentia, first announced at last year's re:Invent conference, is a high-performance machine learning inference chip. It offers very high throughput, low latency and sustained performance -- at a cost-effective price, AWS says. 

"If you do a lot of machine learning at scale and in production… you know the majority of your costs are in predictions," AWS CEO Andy Jassy said at the AWS re:Invent conference in Las Vegas. 

The Inf1 instances will have low latency, 3X higher throughput and up to 40 percent lower cost per inference compared to AWS's Nvidia-powered G4 instances, he said. 

Machine learning -- which comprises training algorithms and inference -- is quickly becoming an integral part of every application, but it comes with some unique demands. Inference can be costly, and it demands low latency and high throughput. 

Inference -- during which a trained machine learning model is actually put to work -- can easily account for the vast majority of the cost associated with a machine learning system. For instance, every time Alexa interprets a command from a user, it's performing inference. Every time a machine learning model trained to perform object recognition for a self-driving car spots an object in the road, it's performing inference. 

In these scenarios, latency is of clear importance, to varying degrees. The faster Alexa interprets your command, the faster it can respond. The faster a self-driving car identifies an object in the road, the faster it can avoid a collision. 

The custom chip puts new competitive pressure on Amazon's suppliers -- namely, Intel and Nvidia. AWS's investments in custom chips send the clear message that AWS isn't going to let its tech supply chain constrain its pace of innovation.
