Dataconomy

NVIDIA Blackwell Ultra delivers 50x higher efficiency for agentic AI

The new benchmarks specifically target low-latency AI workloads, such as real-time coding assistants and autonomous AI agents.

by Kerem Gülen
February 17, 2026
in Industry

Nvidia’s new benchmark data reveals that GB300 NVL72 systems equipped with Blackwell Ultra GPUs achieve up to 50x higher throughput per megawatt and 35x lower cost per token compared to the Hopper platform for low-latency AI workloads. The metrics reflect combined hardware and software advancements targeting agentic AI and coding assistant deployments. Performance gains derive from specific architectural changes and library optimizations that address transformer attention layer bottlenecks. These efficiency improvements reduce operational costs for cloud providers and inference services, enabling broader deployment of compute-intensive models.

⚡New data shows NVIDIA Blackwell Ultra delivers up to 50x better performance and 35x lower cost for agentic AI.
Cloud providers are deploying NVIDIA GB300 NVL72 systems at scale for low-latency and long-context use cases including agentic coding and coding assistants.

Learn how… pic.twitter.com/HIcTBXhwCd

— NVIDIA (@nvidia) February 16, 2026

Blackwell Ultra Tensor Cores provide 1.5x greater compute performance than standard Blackwell GPUs. The architecture doubles attention-layer processing speed by accelerating softmax execution, which directly supports reasoning models that use large context windows. Nvidia’s TensorRT-LLM inference library has also improved steadily: SemiAnalysis benchmarks show that throughput per GPU has doubled at certain interactivity levels since October 2025. The company states that these developments deliver a 10x increase in tokens per second per user and a 5x improvement in tokens per second per megawatt relative to Hopper. Multiplied together (10 × 5), these two factors account for the 50x rise in AI factory output.
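
The relationship between the quoted figures can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the per-user and per-megawatt gains combine multiplicatively, which is the reading consistent with the numbers above. The function and variable names are illustrative and do not come from Nvidia.

```python
def combined_gain(per_user_speedup: float, per_megawatt_gain: float) -> float:
    """Combine two relative gains into one overall factor.

    Assumption: the gains are independent and multiply, which matches
    the 10x, 5x and 50x figures quoted in the article.
    """
    return per_user_speedup * per_megawatt_gain


# Figures quoted in the article, both relative to Hopper.
tokens_per_sec_per_user_gain = 10.0
tokens_per_sec_per_megawatt_gain = 5.0

factory_output_gain = combined_gain(
    tokens_per_sec_per_user_gain,
    tokens_per_sec_per_megawatt_gain,
)
print(f"Combined gain: {factory_output_gain:.0f}x")
```

Under that assumption the product is 50, matching the headline figure. Both inputs are best-case vendor figures, so the result is better read as an upper bound than as a typical outcome.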

Chen Goldberg, senior vice president of engineering at CoreWeave, emphasized the operational focus of these advancements. “As inference moves to the center of AI production, long-context performance and token efficiency become critical,” Goldberg stated. “Grace Blackwell NVL72 addresses that challenge directly.” CoreWeave announced in 2025 that it was the first AI cloud provider to deploy GB300 NVL72 systems in production, integrating the hardware with its Kubernetes-based cloud stack.

Microsoft subsequently deployed what it describes as the world’s first large-scale GB300 NVL72 supercomputing cluster. Testing validated by Signal65 recorded the cluster achieving over 1.1 million tokens per second on a single rack. Oracle’s OCI platform is deploying GB300 NVL72 systems with plans to scale Superclusters beyond 100,000 Blackwell GPUs to support inference workload demand.

Leading inference providers, including Baseten, DeepInfra, Fireworks AI, and Together AI, reported up to 10x cost reductions using the standard Blackwell platform. The Blackwell Ultra platform extends these efficiencies to low-latency workloads, achieving up to 35x lower cost per million tokens than Hopper.
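
To see what these multipliers would mean for a price list, the sketch below applies them to a baseline cost per million tokens. The baseline of 7.00 USD is a made-up placeholder, not a figure from Nvidia or any provider, and the sketch assumes both reduction factors are measured against the same baseline, which the reported numbers do not confirm.

```python
def cost_after_reduction(baseline_cost: float, reduction_factor: float) -> float:
    """Return the cost per million tokens after an N-fold cost reduction."""
    if reduction_factor <= 0:
        raise ValueError("reduction_factor must be positive")
    return baseline_cost / reduction_factor


# Hypothetical baseline in USD per million tokens (placeholder value only).
HYPOTHETICAL_BASELINE_USD = 7.00

# "Up to" factors quoted in the article; real workloads may see less.
scenarios = {
    "Blackwell (up to 10x)": 10.0,
    "Blackwell Ultra (up to 35x)": 35.0,
}

for label, factor in scenarios.items():
    cost = cost_after_reduction(HYPOTHETICAL_BASELINE_USD, factor)
    print(f"{label}: ${cost:.2f} per million tokens")
```

Only the ratio between the baseline and the reduced cost carries over to real deployments. Absolute prices depend on the model, the latency target, and the provider.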

This reduction makes it economically viable to deploy AI agents and coding assistants at scale. Nvidia has also previewed its next-generation Rubin platform, projecting a 10x performance improvement over Blackwell.


Tags: Featured, NVIDIA Blackwell Ultra

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
