Nvidia has announced its next generations of AI superchips: the Blackwell Ultra GB300, shipping in the second half of this year; Vera Rubin, due in the second half of next year; and Rubin Ultra, set to arrive in the second half of 2027. The company reports it is currently making $2,300 in profit every second, driven largely by its AI-centric data center business.
The Blackwell Ultra GB300, though part of Nvidia’s annual cadence of AI chip releases, does not use a new architecture. In a prebriefing with journalists, Nvidia said a single Ultra chip delivers the same 20 petaflops of AI performance as the original Blackwell, now paired with 288GB of HBM3e memory, up from 192GB. The Blackwell Ultra DGX GB300 “SuperPOD” cluster keeps the same configuration of 288 CPUs and 576 GPUs, delivering 11.5 exaflops of FP4 computing, but with memory capacity increased to 300TB from the previous version’s 240TB.
Compared with the H100 chip, which fueled much of Nvidia’s AI success in 2022, the Blackwell Ultra offers 1.5 times the FP4 inference performance and can accelerate AI reasoning tasks. Specifically, the NVL72 cluster can run an interactive version of DeepSeek-R1 671B and return answers in ten seconds rather than the H100’s 1.5 minutes, thanks to its ability to process 1,000 tokens per second, ten times more than the previous generation of Nvidia chips.
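The throughput and response-time figures above are roughly self-consistent, which is worth a quick back-of-the-envelope check. The ~10,000-token response length below is our inference from the claimed numbers, not a figure Nvidia stated:

```python
# Sanity check of the DeepSeek-R1 671B demo claim.
nvl72_tokens_per_sec = 1_000                      # claimed Blackwell Ultra NVL72 throughput
h100_tokens_per_sec = nvl72_tokens_per_sec // 10  # "ten times more" than the prior generation

nvl72_seconds = 10   # claimed Blackwell Ultra response time
h100_seconds = 90    # the H100's 1.5 minutes

# If both systems generate roughly the same answer, token counts should match.
print(nvl72_tokens_per_sec * nvl72_seconds)  # 10000 tokens
print(h100_tokens_per_sec * h100_seconds)    # 9000 tokens -- same order of magnitude
```

The two totals land within about 10 percent of each other, so the claimed speedup is consistent with the claimed throughput gain.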
Nvidia has also introduced a desktop computer called the DGX Station, which features a single GB300 Blackwell Ultra chip, 784GB of unified system memory, and built-in 800Gbps Nvidia networking, while still delivering the promised 20 petaflops of AI performance. Asus, Dell, and HP will sell versions of the desktop, alongside Boxx, Lambda, and Supermicro.
Furthermore, Nvidia is launching the GB300 NVL72 rack, which offers 1.1 exaflops of FP4, 20TB of HBM memory, 40TB of “fast memory,” 130TB/sec of NVLink bandwidth, and 14.4TB/sec of networking.
The forthcoming Vera Rubin architecture will significantly boost performance, featuring 50 petaflops of FP4, a substantial increase from Blackwell’s 20 petaflops. Its successor, Rubin Ultra, will integrate two Rubin GPUs for a total of 100 petaflops of FP4 and nearly quadruple the memory at 1TB.
Nvidia claims a complete NVL576 rack of Rubin Ultra will offer up to 15 exaflops of FP4 inference and 5 exaflops of FP8 training, a 14-fold performance increase over the Blackwell Ultra rack shipping this year.
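The "14-fold" figure follows directly from the FP4 inference numbers quoted for the two racks, allowing for Nvidia's rounding:

```python
# FP4 inference figures from the announcement, in exaflops.
rubin_ultra_nvl576_fp4 = 15.0     # Rubin Ultra NVL576 rack (2027)
blackwell_ultra_nvl72_fp4 = 1.1   # Blackwell Ultra GB300 NVL72 rack (this year)

speedup = rubin_ultra_nvl576_fp4 / blackwell_ultra_nvl72_fp4
print(round(speedup, 1))  # 13.6, which Nvidia rounds to "14x"
```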
Nvidia has reported $11 billion in revenue from Blackwell, with its top four customers buying 1.8 million Blackwell chips in 2025 alone. CEO Jensen Huang highlighted surging demand for computational power, saying the industry now needs “100 times more than we thought we needed this time last year.” Huang also announced that the architecture following Vera Rubin, arriving in 2028, will be named Feynman, after the renowned physicist Richard Feynman, and noted that family members of pioneering astronomer Vera Rubin were present at the announcement event.
Featured image credit: Nvidia