Nvidia co-founder and CEO Jensen Huang unveiled the Vera Rubin AI Superchip at the GPU Technology Conference in Washington, D.C., the latest move in an AI hardware push that, amid the ongoing AI boom, has lifted the company’s market value to nearly $5 trillion.
The Vera Rubin platform targets high-intensity generative AI workloads. Each superchip combines a single Vera CPU, equipped with 88 custom Arm cores running 176 threads, with two Rubin GPUs to deliver up to 100 petaFLOPS of FP4 compute. The superchip forms the basis of Nvidia’s third-generation NVLink 72 rack-scale computer, succeeding the GB200 and GB300. The liquid-cooled design packs six trillion transistors and 2 TB of low-latency SOCAMM2 memory to keep demanding AI workloads fed.
In its base configuration, Vera Rubin provides roughly 100 times the raw compute of the Volta-based DGX-1, Nvidia’s first deep learning system, which delivered 170 teraflops of FP16 peak performance. The comparison is not precision-matched (FP4 today versus FP16 then), but it illustrates how sharply per-system AI compute has grown over the intervening decade.
Nvidia plans to offer Vera Rubin in several configurations. In the NVL144 rack, each Rubin GPU packages two reticle-sized compute dies, and the system delivers up to 3.6 exaflops of FP4 inference and 1.2 exaflops of FP8 training performance. For heavier workloads, the NVL144 CPX configuration reaches 8 exaflops, 7.5 times the compute of the current-generation GB300 NVL72.
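As a rough cross-check, the sketch below derives the rack-level figure from the per-superchip number, assuming an NVL144 rack pairs 36 Vera Rubin superchip boards (one Vera CPU plus two Rubin GPUs each), mirroring the 36-CPU/72-GPU layout of the GB200/GB300 NVL72 generation; the board count is an assumption, not a figure quoted here.

```python
# Rough sketch: rack-level FP4 throughput implied by the per-superchip figure.
# Assumption: an NVL144 rack holds 36 Vera Rubin superchip boards (1 Vera CPU +
# 2 Rubin GPUs each), mirroring the 36-CPU/72-GPU GB200/GB300 NVL72 layout.

SUPERCHIP_FP4_PFLOPS = 100   # quoted per Vera Rubin superchip
SUPERCHIPS_PER_NVL144 = 36   # assumed rack layout (not stated above)

rack_fp4_exaflops = SUPERCHIP_FP4_PFLOPS * SUPERCHIPS_PER_NVL144 / 1000
print(f"Implied NVL144 FP4 inference: {rack_fp4_exaflops:.1f} EF")  # ~3.6 EF, matching the quoted rack figure
```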
For hyperscale data centers processing larger model-context workloads, Nvidia is introducing the Rubin Ultra NVL576. This variant moves to four reticle-sized dies per GPU and up to 365 TB of high-speed memory, delivering up to 15 exaflops of FP4 inference and 5 exaflops of FP8 training, an eightfold increase over the GB300.
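Read together, the NVL144 and NVL576 figures suggest per-die throughput stays roughly flat, with the rack-level gains coming from packing more reticle-sized dies into each rack. The minimal sketch below works that out, assuming the NVL144 and NVL576 suffixes count GPU dies per rack.

```python
# Per-die FP4 inference throughput implied by the quoted rack-level figures.
# Assumption: the NVL144 / NVL576 suffixes count reticle-sized GPU dies per rack.

racks = {
    "Rubin NVL144":       {"fp4_exaflops": 3.6,  "gpu_dies": 144},
    "Rubin Ultra NVL576": {"fp4_exaflops": 15.0, "gpu_dies": 576},
}

for name, spec in racks.items():
    per_die_pflops = spec["fp4_exaflops"] * 1000 / spec["gpu_dies"]
    print(f"{name}: ~{per_die_pflops:.0f} PFLOPS of FP4 per die")
# Both land near 25 PFLOPS per die, so the headline exaflops scale almost
# linearly with die count rather than with per-die speed.
```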
Each Rubin GPU combines two compute chiplets with eight HBM4 memory stacks. The GPU board carries five NVLink backplane connectors: the two at the top link the GPUs to the NVLink switch, while the three at the bottom handle power delivery, the PCIe interface, and CXL connectivity.
Huang expects Rubin GPUs to enter mass production in the second half of 2026. NVL144 systems are slated to launch in late 2026 or early 2027, with NVL576 systems following in the second half of 2027 on Nvidia’s roadmap for AI infrastructure.