Nvidia CEO Jensen Huang on Tuesday opened the company's fall GTC conference by announcing that its "Hopper" graphics processing unit (GPU) is in volume production and will begin shipping next month in systems from partners including Dell, Hewlett Packard, and Cisco Systems. Nvidia's own systems carrying the GPU will be available in the first quarter of next year, Huang said.
The Hopper chip, also known as "H100," is aimed at data center tasks such as artificial intelligence. Nvidia says that the H100 can "dramatically" reduce the cost of deploying AI programs. For example, the work of 320 of the prior top-of-the-line GPUs, the A100, can be done with just 64 H100 chips, which would need only one-fifth the number of server computers and would cut energy use by a factor of 3.5.
Alongside Hopper's availability, Huang discussed a new "as-a-service" cloud offering, Nvidia NeMo LLM, for "large language models." The service is meant to let customers easily deploy very large natural language processing models such as OpenAI's GPT-3 and Megatron-Turing NLG 530B, the language model that Nvidia developed with Microsoft.
Available in "early access" starting next month, the NeMo service provides GPT-3 and other models in pre-trained form. A developer can tune them using a method that Nvidia calls "prompt learning," which Nvidia adapted from a technique that scientists at Google introduced last year.
A version of NeMo, called the NVIDIA BioNeMo Service, will be hosted specifically for biomedical uses of large language models, such as drug discovery.
During a press briefing on the announcement, ZDNet asked Nvidia's head of accelerated computing, Ian Buck, what guardrails the company will institute to guard against the misuses of large language models that have been well documented in the literature on AI ethics.
"Yeah, good question, this service is going into EA [early access] starting next month, and it's direct to the enterprise, and we're actually doing this to engage with different enterprises, different companies to develop their workflow," said Buck.
"They'll be doing anything, obviously, in conjunction with Nvidia, and each user will be applying for access to the service, so, we'll understand a little bit more about what they're doing versus a generic, open offering in that way. Again, we're trying to focus on bringing large language models to enterprises and that is our customer go-to-market."
ZDNet followed up by noting to Buck that the abuses documented in the literature on large language models include biases embedded in the training data, not only malicious use.
"Is it your contention that if it's limited access to enterprise partners that there won't be those kinds of documented abuses such as bias in the training materials and the output product?" asked ZDNet.
"Yeah. I mean, customers are gonna bring in their own data sets to the domain training, so, certainly they have to bear some of that responsibility, and we all, as a community, need to bear responsibility," Buck replied. "This stuff was trained on the Internet. It needs to be understood, and scoped for what it is. Now, for domain-specific solutions, the problem's a little bit more constrained because we're trying to provide a singular service for a particular use case. That bias will exist. Humans have bias, and unfortunately this has been trained, of course, on the input of humans."
The presentation also included a bevy of dedicated computing platforms for numerous industries, including healthcare, and for Omniverse, Nvidia's software platform for the Metaverse.
The platforms include IGX for healthcare applications, IGX for factories, a new version of the OVX computing systems dedicated to Omniverse development, and a new version of Nvidia's DRIVE computing platform for cars, called DRIVE Thor, which will combine the Hopper GPU with Nvidia's forthcoming CPU, "Grace."