Mercury 2

Inception
2026-02-20
Proprietary
Context: 128K
Modality: Text
Intelligence
32.8
#141/532
Coding
30.6
#187/88
Math
Speed
771 tok/s
TTFT: 3.90s
Pricing
$0.25 / $0.75
per 1M tokens (in/out)

Mercury 2 is billed as the fastest reasoning LLM and is built on a diffusion-based language model (dLLM) architecture. Instead of generating text token by token, it refines multiple text blocks simultaneously, reaching over 1,000 tokens per second on Nvidia Blackwell GPUs, which is 5x faster than leading speed-optimized LLMs. It supports tool use and JSON output, with a 128K context window.

Mercury 2 is Inception’s model for advanced natural language processing tasks. It generates about 771 tokens per second and is priced at $0.25 per million input tokens and $0.75 per million output tokens, targeting professional users.

When to Use Mercury 2

✓ Best For

  • Text generation and completion tasks.
  • Data analysis and reporting.
  • Chatbot and virtual assistant development.

✗ Not Ideal For

  • Latency-sensitive real-time applications, where the roughly 3.9s time to first token outweighs the high output speed.
  • Users needing advanced mathematical capabilities.

How Mercury 2 Compares

[Chart: Intelligence Index, higher is better. Providers shown: NVIDIA, DeepSeek, Inception, Z AI, Google]

[Chart: Benchmark Profile]

[Chart: Output Speed in tok/s. Providers shown: Inception, Liquid AI]

[Chart: Intelligence, Coding, Math]

All Benchmark Scores (14)

Benchmark Score
Intelligence Index 32.8
Coding Index 30.6
GPQA 77%
HLE 155%
SciCode 38.7%
IFBench 69.8%
LCR 36.3%
TerminalBench Hard 26.5%
Tau2 70.8%
AIME 2025 91.1%
LiveCodeBench 67%
Tau2 Airline 53%
Arena Chat 22.2
Arena Coding 5

Data: Artificial Analysis · Updated: March 26, 2026

Frequently Asked Questions (15)

When was Mercury 2 released?
Mercury 2 was released on February 20, 2026.
Who created Mercury 2?
Mercury 2 was created by Inception.
How intelligent is Mercury 2?
Mercury 2 scores 33 on the Artificial Analysis Intelligence Index, placing it well above average among other reasoning models in a similar price tier (median: 19).
How fast is Mercury 2?
Mercury 2 generates output at 870.9 tokens per second (based on Inception's API), which is well above average compared to other reasoning models in a similar price tier (median: 94.9 t/s).
What is the latency of Mercury 2?
Mercury 2 has a time to first token (TTFT) of 3.76s (based on Inception's API), which is at the higher end compared to other reasoning models in a similar price tier (median: 1.87s).
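To see how a high TTFT and a high output speed trade off, the sketch below estimates total response time as TTFT plus output tokens divided by tokens per second. It uses the Mercury 2 and price-tier median figures quoted in this FAQ. The constant-streaming-rate model and the function names are illustrative assumptions, not part of Artificial Analysis's methodology.

```python
# Rough latency model (an assumption): total time = TTFT + output_tokens / speed.
# Figures are the FAQ values for Mercury 2 and the price-tier median.

def response_time(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Estimated seconds until the full response has streamed."""
    return ttft_s + output_tokens / tokens_per_s


def break_even_tokens(ttft_a: float, tps_a: float, ttft_b: float, tps_b: float) -> float:
    """Output length at which model A (slower start, faster stream) catches model B."""
    return (ttft_a - ttft_b) / (1 / tps_b - 1 / tps_a)


MERCURY_2 = (3.76, 870.9)    # (TTFT in seconds, tokens per second)
TIER_MEDIAN = (1.87, 94.9)   # median of similarly priced reasoning models

if __name__ == "__main__":
    for n in (50, 200, 1000):
        fast = response_time(*MERCURY_2, n)
        typical = response_time(*TIER_MEDIAN, n)
        print(f"{n:>5} tokens: Mercury 2 {fast:.2f}s, tier median {typical:.2f}s")
    print(f"break-even near {break_even_tokens(*MERCURY_2, *TIER_MEDIAN):.0f} tokens")
```

Under these assumptions the break-even point is about 200 output tokens: shorter replies finish sooner on a median-latency model, while longer replies finish sooner on Mercury 2.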
How much does Mercury 2 cost?
Mercury 2 costs $0.25 per 1M input tokens (in line with the median of $0.25) and $0.75 per 1M output tokens (better than average, median: $0.90), based on Inception's API.
What is Mercury 2 API pricing?
Mercury 2 costs $0.25 per 1M input tokens and $0.75 per 1M output tokens (based on Inception's API). For a blended rate (3:1 input to output ratio), this is $0.38 per 1M tokens. Pricing may vary by provider.
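The blended rate is a weighted average of the input and output prices. A minimal sketch of that arithmetic, plus a per-request cost estimate, is shown here. The function names and the example token counts are illustrative assumptions; the prices are the ones quoted above.

```python
# Prices in USD per 1M tokens, as quoted for Inception's API.
INPUT_PRICE = 0.25
OUTPUT_PRICE = 0.75


def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output ratio."""
    total_parts = input_parts + output_parts
    return (input_price * input_parts + output_price * output_parts) / total_parts


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = INPUT_PRICE,
                 output_price: float = OUTPUT_PRICE) -> float:
    """Cost in USD of one request with the given token counts."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # (3 * 0.25 + 1 * 0.75) / 4 = 0.375, shown on the page as $0.38
    print(f"blended: ${blended_price(INPUT_PRICE, OUTPUT_PRICE):.3f} per 1M tokens")
    # Hypothetical request: 30,000 input tokens and 10,000 output tokens
    print(f"request: ${request_cost(30_000, 10_000):.4f}")
```

Because output tokens cost three times as much as input tokens, workloads that generate long responses will pay more per token than the 3:1 blended figure suggests.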
How verbose is Mercury 2?
When evaluated on the Intelligence Index, Mercury 2 generated 69M output tokens, which is at the higher end compared to other reasoning models in a similar price tier (median: 20M).
Is Mercury 2 a reasoning model?
Yes, Mercury 2 is a reasoning model. It uses extended thinking or chain-of-thought reasoning to work through complex problems before providing an answer.
What input modalities does Mercury 2 support?
Mercury 2 supports text input.
What output modalities does Mercury 2 support?
Mercury 2 supports text output.
Can Mercury 2 process images?
No, Mercury 2 does not support image input. It can only process text.
Is Mercury 2 multimodal?
No, Mercury 2 is not multimodal. It supports only text input and text output.
What is the context window of Mercury 2?
Mercury 2 has a context window of 128K tokens. This determines how much text and conversation history the model can process in a single request.
Is Mercury 2 open source?
No, Mercury 2 is proprietary. The model weights are not publicly available.