NVIDIA Nemotron Nano 12B v2 VL (Reasoning)

NVIDIA
2025-10-28
Modality: text and image input, text output
Intelligence
14.9
#342/523
Coding
11.8
#310/429
Math
75
#79/265
Speed
135 tok/s
TTFT: 0.46s
Pricing
$0.20 / $0.60
per 1M tokens (in/out)

NVIDIA Nemotron Nano 12B v2 VL (Reasoning) is NVIDIA’s open weights, multimodal reasoning model: it accepts text and image input and works through a problem step by step before answering. It generates output at about 135 tokens per second and is priced at $0.20 per million input tokens and $0.60 per million output tokens. It targets professional users.

When to Use NVIDIA Nemotron Nano 12B v2 VL (Reasoning)

✓ Best For

  • Complex reasoning tasks
  • Mathematical problem solving
  • Coding assistance

✗ Not Ideal For

  • High-speed real-time applications
  • Basic text generation tasks

How NVIDIA Nemotron Nano 12B v2 VL (Reasoning) Compares

[Chart] Intelligence Index (higher is better): models from Alibaba, Upstage, NVIDIA, Google, and Mistral.

[Chart] Coding Index: models from Motif Technologies, Alibaba, NVIDIA, DeepSeek, and Upstage.

[Chart] Output Speed (tok/s): models from Google, NVIDIA, and Amazon.

[Chart] Math Index: models from DeepSeek, Alibaba, NVIDIA, and Anthropic.

[Chart] Benchmark Profile: Intelligence, Coding, and Math.

All Benchmark Scores (13)

Benchmark             Score
Intelligence Index    14.9
Coding Index          11.8
Math Index            75
MMLU-Pro              75.9%
GPQA                  57.2%
LiveCodeBench         69.4%
HLE                   53%
SciCode               26.2%
IFBench               31.9%
LCR                   40%
TerminalBench Hard    4.5%
Tau2                  21.3%
AIME 2025             75%

Data: Artificial Analysis · Updated: March 26, 2026

Frequently Asked Questions (15)

When was NVIDIA Nemotron Nano 12B v2 VL (Reasoning) released?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) was released on October 28, 2025.
Who created NVIDIA Nemotron Nano 12B v2 VL (Reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) was created by NVIDIA.
How intelligent is NVIDIA Nemotron Nano 12B v2 VL (Reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) scores 14.9 (rounded to 15) on the Artificial Analysis Intelligence Index, which is in line with the median for open weight models of similar size (median: 15).
How fast is NVIDIA Nemotron Nano 12B v2 VL (Reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) generates output at 129.0 tokens per second (based on the median across providers serving the model), which is above average compared to other open weight models of similar size (median: 97.3 t/s).
What is the latency of NVIDIA Nemotron Nano 12B v2 VL (Reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) has a time to first token (TTFT) of 1.06s (based on the median across providers serving the model), which is better than average compared to other open weight models of similar size (median: 1.83s).
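As a rough illustration of how the latency and speed figures combine, the sketch below estimates end-to-end response time as TTFT plus output tokens divided by throughput. This is a simplified model, not a measured result: it assumes constant throughput, uses the median figures quoted in this FAQ (1.06s, 129.0 tok/s) as defaults, and the function name `estimated_response_seconds` is hypothetical. For a reasoning model, the output token count should include the thinking tokens generated before the final answer.

```python
def estimated_response_seconds(
    output_tokens: int,
    ttft_s: float = 1.06,         # median TTFT quoted in this FAQ (assumption: applies to your provider)
    tokens_per_s: float = 129.0,  # median output speed quoted in this FAQ
) -> float:
    """Estimate total response time under a constant-throughput assumption."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    # Time until the first token arrives, plus time to stream the remaining output.
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # Example: a response with 1,290 output tokens (reasoning tokens included).
    print(f"{estimated_response_seconds(1290):.2f} s")
```

Actual times depend on the provider, prompt length, and load, so treat the result as an order-of-magnitude estimate.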
How much does NVIDIA Nemotron Nano 12B v2 VL (Reasoning) cost?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) costs $0.20 per 1M input tokens (in line with the median of $0.20) and $0.60 per 1M output tokens (in line with the median of $0.60), based on the median across providers serving the model.
What is NVIDIA Nemotron Nano 12B v2 VL (Reasoning) API pricing?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) costs $0.20 per 1M input tokens and $0.60 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $0.30 per 1M tokens. Pricing may vary by provider.
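The blended rate above is a weighted average of the input and output prices. The sketch below reproduces that arithmetic and adds a per-request cost estimate. The function names `blended_price` and `request_cost` are hypothetical helpers, and the default prices are the median figures quoted here, which may differ by provider.

```python
def blended_price(
    input_price: float,
    output_price: float,
    input_parts: float = 3.0,   # 3:1 input-to-output ratio, as stated above
    output_parts: float = 1.0,
) -> float:
    """Weighted-average price per 1M tokens for a given input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float = 0.20,   # USD per 1M input tokens (median quoted above)
    output_price: float = 0.60,  # USD per 1M output tokens (median quoted above)
) -> float:
    """Estimated USD cost of one request; output_tokens includes reasoning tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    print(f"Blended: ${blended_price(0.20, 0.60):.2f} per 1M tokens")
    print(f"10k in / 2k out: ${request_cost(10_000, 2_000):.4f}")
```

With the quoted prices, (3 × $0.20 + 1 × $0.60) / 4 gives the $0.30 blended figure.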
How verbose is NVIDIA Nemotron Nano 12B v2 VL (Reasoning)?
When evaluated on the Intelligence Index, NVIDIA Nemotron Nano 12B v2 VL (Reasoning) generated 69M output tokens, which is at the higher end compared to other open weight models of similar size (median: 19M).
Is NVIDIA Nemotron Nano 12B v2 VL (Reasoning) a reasoning model?
Yes, NVIDIA Nemotron Nano 12B v2 VL (Reasoning) is a reasoning model. It uses extended thinking or chain-of-thought reasoning to work through complex problems before providing an answer.
What input modalities does NVIDIA Nemotron Nano 12B v2 VL (Reasoning) support?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) supports text and image input.
What output modalities does NVIDIA Nemotron Nano 12B v2 VL (Reasoning) support?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) supports text output.
Can NVIDIA Nemotron Nano 12B v2 VL (Reasoning) process images?
Yes, NVIDIA Nemotron Nano 12B v2 VL (Reasoning) supports image input and can analyze, describe, and answer questions about images.
Is NVIDIA Nemotron Nano 12B v2 VL (Reasoning) multimodal?
Yes, NVIDIA Nemotron Nano 12B v2 VL (Reasoning) is multimodal. It can process text and image input and generate text output.
What is the context window of NVIDIA Nemotron Nano 12B v2 VL (Reasoning)?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) has a context window of 130k tokens. This determines how much text and conversation history the model can process in a single request.
Is NVIDIA Nemotron Nano 12B v2 VL (Reasoning) open source?
NVIDIA Nemotron Nano 12B v2 VL (Reasoning) is an open weights model: the model weights are publicly available and can be downloaded for self-hosting.