Nvidia’s New B200 GPU Powers Development of Trillion-Parameter AI Models

Nvidia’s H100 AI-focused GPU propelled the company to a multi-trillion-dollar valuation, surpassing giants like Alphabet and Amazon. In response to the H100’s popularity, Nvidia’s competitors have been scrambling to build comparable chips. The green team, however, has taken the competition to a whole new level with the unveiling of the B200 and GB200 chips, based on the new Blackwell architecture.

According to Nvidia’s press release, the new B200 GPU has a maximum processing power of 20 petaflops in FP4 format and is equipped with 208 billion transistors.

The new GB200 superchip, which pairs two B200 GPUs with a Grace CPU, delivers up to 30 times the performance of the previous generation for large language model inference. Nvidia claims the new chip reduces cost and energy consumption by up to 25 times.

Before Blackwell, training a 1.8-trillion-parameter AI model required 8,000 Hopper GPUs drawing 15 megawatts of power. Nvidia says the same job can now be done with 2,000 Blackwell GPUs and just four megawatts.
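A rough back-of-envelope check of those keynote figures shows what the claim amounts to per GPU (the numbers below are taken directly from the article; the per-GPU breakdown is our own arithmetic, not an Nvidia spec):

```python
# Nvidia's quoted figures for training a 1.8-trillion-parameter model:
# 8,000 Hopper GPUs at 15 MW vs. 2,000 Blackwell GPUs at 4 MW.
hopper_gpus, hopper_mw = 8_000, 15
blackwell_gpus, blackwell_mw = 2_000, 4

gpu_reduction = hopper_gpus / blackwell_gpus    # 4x fewer GPUs
power_reduction = hopper_mw / blackwell_mw      # 3.75x less total power

# Implied average power budget per GPU in each setup (kW)
hopper_kw_per_gpu = hopper_mw * 1_000 / hopper_gpus
blackwell_kw_per_gpu = blackwell_mw * 1_000 / blackwell_gpus

print(f"{gpu_reduction:.2f}x fewer GPUs, {power_reduction:.2f}x less power")
print(f"{hopper_kw_per_gpu:.3f} kW/GPU (Hopper) vs "
      f"{blackwell_kw_per_gpu:.3f} kW/GPU (Blackwell)")
```

In other words, most of the efficiency gain comes from needing a quarter as many chips, not from each chip drawing less power.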

The GB200 superchip is up to 30 times more powerful than the previous generation

Nvidia claims the GB200 delivers up to seven times the performance of the H100 on one AI benchmark, GPT-3 with 175 billion parameters, and trains models four times faster than before.

According to Nvidia, one of the GB200’s key improvements is its second-generation Transformer Engine, which doubles compute and bandwidth and supports models twice the size.

Another major improvement only comes into play when many Nvidia superchips are connected together: the new-generation NVLink switch, which lets up to 576 GPUs communicate and raises bidirectional bandwidth to 1.8 terabytes per second per GPU.

Nvidia’s B200 AI GPU inside a massive rack

Nvidia expects large tech companies to buy GB200 processors in bulk, so it is also selling them in larger configurations such as the GB200 NVL72, which packs 36 Grace CPUs and 72 B200 GPUs into a single liquid-cooled rack delivering 720 petaflops of AI training performance. Each rack contains almost 3.2 kilometers of cabling!
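The rack-level figure lines up with the per-GPU numbers quoted earlier. A quick sanity check, under the assumption (not stated in the article) that training runs in FP8 at half the 20-petaflop FP4 rate of a single B200:

```python
# Sanity check: does 72 GPUs per rack reproduce the 720-petaflop figure?
# Assumption: FP8 training throughput is half the FP4 rate (20 PF -> 10 PF).
fp4_pf_per_gpu = 20
fp8_pf_per_gpu = fp4_pf_per_gpu / 2  # assumed, not an official spec
gpus_per_rack = 72

rack_training_pf = gpus_per_rack * fp8_pf_per_gpu
print(rack_training_pf)  # 720.0, matching Nvidia's rack-level claim
```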

Each tray in the rack holds either two GB200 superchips or two NVLink switches. Nvidia says one of these systems can support a 27-trillion-parameter AI model; for comparison, unofficial reports put GPT-4 at around 1.7 trillion parameters.

Nvidia says Amazon, Google, Microsoft, and Oracle plan to integrate NVL72 racks into their AI servers. However, it is not yet clear how many they will purchase.

The green team is also offering the DGX SuperPOD, which links eight DGX GB200 systems for a combined 288 CPUs, 576 GPUs, 240 terabytes of RAM, and 11.5 exaflops of FP4 compute.

The Blackwell architecture, introduced for now only in AI chips, is likely to make its way into consumer RTX 5000-series graphics cards in the near future.

Additional details:

  • An eight-GPU HGX B200 system offers up to 1.4TB of GPU memory and 64TB/s of aggregate memory bandwidth.
  • A GB200 NVL72 rack provides 30TB of fast memory.
  • The NVL72 rack consumes 40 kilowatts of power.
  • The DGX GB200 system consumes 25 kilowatts of power.

This is a significant leap forward in AI technology, and it will be exciting to see how it is used in the coming years.

