
Nvidia reveals Blackwell B200 GPU, the “world’s most powerful chip” for AI
March 18, 2024



Nvidia’s must-have H100 AI chip made it a multitrillion-dollar company, one that may be worth more than Alphabet and Amazon, and competitors have been fighting to catch up. But Nvidia may be about to extend its lead with the new Blackwell B200 GPU and GB200 “superchip.”

Nvidia CEO Jensen Huang holds up his new GPU on the left, next to an H100 on the right, from the GTC livestream. Image: Nvidia

Nvidia says the new B200 GPU offers up to 20 petaflops of FP4 horsepower from its 208 billion transistors, and that a GB200 that combines two of those GPUs with a single Grace CPU can offer 30 times the performance for LLM inference workloads while also potentially being considerably more efficient. It “reduces cost and energy consumption by up to 25x” over an H100, says Nvidia.

Training a 1.8 trillion parameter model would previously have taken 8,000 Hopper GPUs and 15 megawatts of power, Nvidia claims. Today, Nvidia’s CEO says 2,000 Blackwell GPUs can do it while consuming just four megawatts.

On a GPT-3 LLM benchmark with 175 billion parameters, Nvidia says the GB200 has a somewhat more modest seven times the performance of an H100, and Nvidia says it offers 4x the training speed.

Here’s what one GB200 looks like. Two GPUs, one CPU, one board. Image: Nvidia

Nvidia told reporters one of the key improvements is a second-gen transformer engine that doubles the compute, bandwidth, and model size by using four bits for each neuron instead of eight (hence the 20 petaflops of FP4 mentioned earlier). A second key difference only comes when you link up huge numbers of these GPUs: a next-gen NVLink switch that lets 576 GPUs talk to one another, with 1.8 terabytes per second of bidirectional bandwidth. That required Nvidia to build an entirely new network switch chip, one with 50 billion transistors and some of its own onboard compute: 3.6 teraflops of FP8, says Nvidia.

Nvidia says it’s adding both FP4 and FP6 with Blackwell. Image: Nvidia

Previously, Nvidia says, a cluster of just 16 GPUs would spend 60 percent of its time communicating with one another and only 40 percent actually computing.

Nvidia is counting on companies to buy large quantities of these GPUs, of course, and is packaging them in larger designs, like the GB200 NVL72, which plugs 36 CPUs and 72 GPUs into a single liquid-cooled rack for a total of 720 petaflops of AI training performance or 1,440 petaflops (aka 1.4 exaflops) of inference. (A rough arithmetic check of how those rack figures follow from the per-GPU numbers appears below.) It has nearly two miles of cables inside, with 5,000 individual cables.

The GB200 NVL72. Image: Nvidia

Each tray in the rack contains either two GB200 chips or two NVLink switches, with 18 of the former and nine of the latter per rack. In total, Nvidia says one of these racks can support a 27-trillion parameter model. GPT-4 is rumored to be around a 1.7-trillion parameter model.

The company says Amazon, Google, Microsoft, and Oracle are all already planning to offer the NVL72 racks in their cloud service offerings, though it’s not clear how many they’re buying. And of course, Nvidia is happy to offer companies the rest of the solution, too.
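As a rough sanity check of the figures above, here is a minimal back-of-the-envelope sketch in Python using only the numbers Nvidia quotes. The reading that the 720-petaflop training figure is an 8-bit (FP8) number while the 1,440-petaflop inference figure is FP4 is our assumption, not something Nvidia spells out here, and the memory math covers weights only.

# Back-of-the-envelope check of the figures quoted above. Illustrative only;
# treating the 720-petaflop training number as an 8-bit figure is our
# assumption, not Nvidia's.

B200_FP4_PFLOPS = 20      # petaflops of FP4 per B200, per Nvidia
GPUS_PER_NVL72 = 72       # GPUs in one GB200 NVL72 rack

# 72 GPUs x 20 petaflops of FP4 each = 1,440 petaflops, matching the quoted
# inference figure for the rack.
inference_pflops = GPUS_PER_NVL72 * B200_FP4_PFLOPS
print(inference_pflops)             # 1440, i.e. 1.44 exaflops

# The quoted 720-petaflop training figure is exactly half of that, which is
# what you'd expect if training runs at 8 bits per value instead of 4.
print(inference_pflops / 2)         # 720.0

# Dropping from 8 bits to 4 bits per weight also halves the memory needed to
# hold a model's weights, e.g. for the 1.8-trillion-parameter model Nvidia cites:
params = 1.8e12
print(params * 1.0 / 1e12)          # ~1.8 TB of weights at 8 bits (1 byte) each
print(params * 0.5 / 1e12)          # ~0.9 TB of weights at 4 bits (half a byte) each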
Here’s the DGX Superpod for DGX GB200, which combines eight systems in one for a total of 288 CPUs, 576 GPUs, 240TB of memory, and 11.5 exaflops of FP4 computing. (A quick check of those totals appears at the end of this piece.)

Nvidia says its systems can scale to tens of thousands of the GB200 superchips, connected together with 800Gbps networking via its new Quantum-X800 InfiniBand (for up to 144 connections) or Spectrum-X800 Ethernet (for up to 64 connections).

We don’t expect to hear anything about new gaming GPUs today, as this news is coming out of Nvidia’s GPU Technology Conference, which is usually almost entirely focused on GPU computing and AI, not gaming. But the Blackwell GPU architecture will likely also power a future RTX 50-series lineup of desktop graphics cards.
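For completeness, the Superpod totals line up with simply stacking eight racks’ worth of the NVL72 parts described above. A minimal sketch, assuming each of the eight systems carries the same 36-CPU, 72-GPU complement as an NVL72 rack (the quoted totals imply as much, but Nvidia doesn’t state it here):

# Consistency check on the DGX Superpod totals, using only figures quoted
# above and assuming each system matches an NVL72 rack's CPU/GPU counts.

SYSTEMS = 8
CPUS_PER_SYSTEM = 36
GPUS_PER_SYSTEM = 72
B200_FP4_PFLOPS = 20

print(SYSTEMS * CPUS_PER_SYSTEM)                     # 288 CPUs
print(SYSTEMS * GPUS_PER_SYSTEM)                     # 576 GPUs
print(SYSTEMS * GPUS_PER_SYSTEM * B200_FP4_PFLOPS)   # 11,520 petaflops ~= 11.5 exaflops of FP4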

