At its annual AWS re:Invent developer conference in Las Vegas, Amazon on Tuesday announced Trainium 2, the second generation of its dedicated chip for training neural networks. Trainium 2 is tuned specifically for training so-called large language models (LLMs) and foundation models -- the kinds of generative AI programs exemplified by OpenAI's GPT-4.
The company also unveiled a new version of its custom microprocessor, Graviton 4, and said it is extending its partnership with Nvidia to run Nvidia's most advanced chips in its cloud computing service.
The Trainium 2 is designed to handle neural networks with trillions of parameters, or neural weights -- the numeric values, adjusted during training, that give the program its scale and power, generally speaking. Scaling to larger and larger parameter counts is a focus of the entire AI industry.
The trillion-parameter count has become something of an industry obsession partly because the human brain is believed to contain on the order of 100 trillion neuronal connections -- making a trillion-parameter neural network seem related to the human brain, whether or not it in fact is.
The chips are "designed to deliver up to four times faster training performance and three times more memory capacity" than their predecessor, "while improving energy efficiency (performance/watt) up to two times," said Amazon.
Amazon is making the chips available in instances of its EC2 cloud computing service known as "Trn2" instances. Each instance offers 16 Trainium 2 chips operating in concert, and deployments can be scaled up to 100,000 chips, Amazon said. Those larger clusters are interconnected using the company's networking technology, the Elastic Fabric Adapter, and can provide a total of 65 exaFLOPs of computing power. (One exaFLOP is a billion billion floating-point operations per second.)
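Taken at face value, those figures imply roughly 650 teraFLOPs per chip. Here is a back-of-the-envelope check -- the cluster size and aggregate throughput are Amazon's quoted numbers; the per-chip figure is simply the quotient, and says nothing about which numeric precision Amazon assumes:

```python
# Back-of-the-envelope check of Amazon's quoted cluster figures.
EXAFLOP = 1e18                 # 1 exaFLOP = 10^18 floating-point ops/sec

total_flops = 65 * EXAFLOP     # 65 exaFLOPs for the full cluster (Amazon's figure)
num_chips = 100_000            # maximum cluster size (Amazon's figure)

per_chip = total_flops / num_chips
print(f"{per_chip / 1e12:.0f} teraFLOPs per chip")  # 650 teraFLOPs per chip
```

Per-chip throughput of this kind is usually quoted at reduced precision (e.g. FP16 or FP8), so the derived number is illustrative rather than a spec.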
At that scale of compute, said Amazon, "Customers can train a 300-billion parameter LLM in weeks versus months."
Besides serving customers, Amazon has additional incentives to continue to push the envelope on AI silicon. The company has invested $4 billion in privately held generative AI startup Anthropic, a group that broke off from OpenAI. That investment puts the company in a position to compete with Microsoft's exclusive deal with OpenAI.
The Graviton 4 chip, which is built on the microprocessor intellectual property of ARM Holdings, competes with processors from Intel and Advanced Micro Devices based on the older x86 chip standard. The Graviton 4 has "30% better compute performance" than its predecessor, Amazon said.
Unlike the Trainium chips for AI, Graviton processors are meant to run more conventional workloads. Amazon AWS said customers -- including Datadog, DirecTV, Discovery, Formula 1, Nielsen, Pinterest, SAP, Snowflake, Sprinklr, Stripe, and Zendesk -- use the Graviton chips "to run a broad range of workloads, such as databases, analytics, web servers, batch processing, ad serving, application servers, and microservices."
SAP said in prepared remarks that it has been able to achieve "35% better price performance for analytical workloads" running its HANA in-memory database on the Graviton chips, and that "we look forward to evaluating Graviton4, and the benefits it can bring to our joint customers."
Amazon's news follows the introduction by Microsoft last week of its first chips for AI. Alphabet's Google, the other cloud titan alongside Amazon and Microsoft, preceded both in 2016 with the first cloud chip for AI, the TPU, or Tensor Processing Unit, of which it has since offered multiple generations.
In addition to the two new chips, Amazon said it extended its strategic partnership with AI chip giant Nvidia. AWS will be the first cloud service to run the forthcoming GH200 Grace Hopper multi-chip product from Nvidia, which combines the Grace ARM-based CPU and the Hopper H100 GPU chip.
The GH200 chip, which is expected to start shipping next year, is the successor to the original Grace Hopper combo chip announced earlier this year, which is already shipping in computers from Dell and others.
The GH200 chips will be hosted on AWS via Nvidia's purpose-built DGX Cloud AI-training service, which the two companies said will speed up the training of neural networks with more than a trillion parameters.
Nvidia said it will make AWS its "primary cloud provider for its ML research and development."