NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)

← AI Models
NVIDIA
2025-10-28
Modality:
Intelligence
10.1
#440/523
Coding
5.9
#376/429
Math
26.7
#193/265
Speed
138 tok/s
TTFT: 564.00s
Pricing
$0.20 / $0.60
per 1M tokens (in/out)
Google Preferred Source

NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) is NVIDIA’s latest model designed for high-speed token processing. It operates at 138.381 tokens per second and is priced at $0.2 per million input tokens, targeting professional users in data-intensive applications.

When to Use NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)

✓ Best For

  • Natural language processing tasks
  • Data analysis and reporting
  • High-speed token generation

✗ Not Ideal For

  • Applications requiring advanced reasoning capabilities
  • Users needing extensive coding support

How NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) Compares

Intelligence Index · Higher is better

AnthropicMistralNVIDIAGoogle

Benchmark Profile

Coding Index

Trillion LabsGoogleNVIDIAAllen Institute for AIIBM

Output Speed · tok/s

AlibabaGoogleNVIDIAStepFunOpenAI

Math Index

OpenAIMistralNVIDIAZ AIDeepSeek

Intelligence · Coding · Math

Intelligence Coding Math

All Benchmark Scores (12)

BenchmarkScore
Intelligence Index 10.1
Coding Index 5.9
Math Index 26.7
MMLU-Pro 649%
GPQA 439%
LiveCodeBench 345%
HLE 45%
SciCode 17.6%
IFBench 25.9%
LCR 17%
Tau2 19.3%
AIME 2025 26.7%

Data: Artificial Analysis · Updated: April 10, 2026

Frequently Asked Questions (15)

When was NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) released?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) was released on October 28, 2025.
Who created NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) was created by NVIDIA.
How intelligent is NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) scores 10 on the Artificial Analysis Intelligence Index, placing it below average among other open weight non-reasoning models of similar size (median: 11).
How fast is NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) generates output at 139.6 tokens per second (based on the median across providers serving the model), which is above average compared to other open weight non-reasoning models of similar size (median: 98.2 t/s).
What is the latency of NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) has a time to first token (TTFT) of 1.07s (based on the median across providers serving the model), which is better than average compared to other open weight non-reasoning models of similar size (median: 1.72s).
How much does NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) cost?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) costs $0.20 per 1M input tokens (somewhat higher than average, median: $0.15) and $0.60 per 1M output tokens (somewhat higher than average, median: $0.30), based on the median across providers serving the model.
What is NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) API pricing?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) costs $0.20 per 1M input tokens and $0.60 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $0.30 per 1M tokens. Pricing may vary by provider.
How verbose is NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)?
When evaluated on the Intelligence Index, NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) generated 7.8M output tokens, which is better than average compared to other open weight non-reasoning models of similar size (median: 8.5M).
Is NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) a reasoning model?
No, NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) is not a reasoning model. It provides direct responses without extended chain-of-thought reasoning.
What input modalities does NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) support?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) supports text and image input.
What output modalities does NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) support?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) supports text output.
Can NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) process images?
Yes, NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) supports image input and can analyze, describe, and answer questions about images.
Is NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) multimodal?
Yes, NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) is multimodal. It can process text and image input and generate text output.
What is the context window of NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) has a context window of 130k tokens. This determines how much text and conversation history the model can process in a single request.
Is NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) open source?
Yes, NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) is open weights. The model weights are publicly available and can be downloaded for self-hosting.