DeepSeek R1 (Jan ’25)

Creator: DeepSeek
Released: 2025-01-20
License: MIT
Modality: Text
Intelligence: 18.8 (#271/523)
Coding: 15.9 (#250/429)
Math: 68 (#101/265)
Pricing: $1.35 / $4.00 per 1M tokens (in/out)

DeepSeek-R1 is a reasoning-focused language model from DeepSeek that works through a problem with extended chain-of-thought before giving an answer. It serves as the foundation for DeepSeek’s reasoning model family and pioneered their thinking mode approach for complex problem-solving tasks.

DeepSeek R1 is a text-only, open-weights model. Its Math Index of 68 (#101 of 265) is its best relative ranking, ahead of its Intelligence Index of 18.8 (#271 of 523) and Coding Index of 15.9 (#250 of 429). It is priced at $1.35 per 1M input tokens and $4.00 per 1M output tokens.

When to Use DeepSeek R1 (Jan ’25)

✓ Best For

  • Mathematical computations and problem-solving.
  • Data analysis requiring complex calculations.
  • Professional applications needing high accuracy.

✗ Not Ideal For

  • Latency-sensitive applications, since the model reasons at length before answering.
  • Use cases that need real-time token generation.

How DeepSeek R1 (Jan ’25) Compares

[Chart: Intelligence Index, higher is better. Providers shown: Mistral, DeepSeek, Upstage, NVIDIA]

Benchmark Profile

[Chart: Coding Index. Providers shown: Mistral, Nous Research, DeepSeek, NVIDIA]

[Chart: Math Index. Providers shown: Nous Research, Alibaba, DeepSeek, Xiaomi, OpenAI]

[Chart: Intelligence · Coding · Math profile]

All Benchmark Scores (15)

Benchmark Score
Intelligence Index 18.8
Coding Index 15.9
Math Index 68
MMLU-Pro 84.4%
GPQA 70.8%
LiveCodeBench 61.7%
HLE 9.3%
SciCode 35.7%
IFBench 39%
LCR 52.3%
TerminalBench Hard 6.1%
Tau2 11.4%
AIME 68.3%
AIME 2025 68%
MATH 500 96.6%

Data: Artificial Analysis · Updated: April 10, 2026

Frequently Asked Questions (15)

When was DeepSeek R1 (Jan '25) released?
DeepSeek R1 (Jan '25) was released on January 20, 2025.
Who created DeepSeek R1 (Jan '25)?
DeepSeek R1 (Jan '25) was created by DeepSeek.
How intelligent is DeepSeek R1 (Jan '25)?
DeepSeek R1 (Jan '25) scores 19 on the Artificial Analysis Intelligence Index, placing it below average among other open weight models of similar size (median: 27).
How much does DeepSeek R1 (Jan '25) cost?
DeepSeek R1 (Jan '25) costs $1.35 per 1M input tokens (at the higher end, median: $0.60) and $4.00 per 1M output tokens (at the higher end, median: $2.20), based on the median across providers serving the model.
What is DeepSeek R1 (Jan '25) API pricing?
DeepSeek R1 (Jan '25) costs $1.35 per 1M input tokens and $4.00 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $2.36 per 1M tokens. Pricing may vary by provider.
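The blended-rate arithmetic can be sketched as a weighted average of the input and output prices. This is a minimal illustration, and the function name `blended_price` is hypothetical. Applied directly to the two median prices above, a 3:1 blend gives about $2.01 per 1M tokens, not the $2.36 quoted; one possible explanation (an assumption, not confirmed by the source) is that the quoted figure is the median of each provider's own blended rate rather than a blend of the two medians.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts

# Median prices quoted in the answer above, in USD per 1M tokens.
INPUT_PRICE = 1.35
OUTPUT_PRICE = 4.00

rate = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended rate at 3:1: ${rate:.2f} per 1M tokens")
```

Changing `input_parts` and `output_parts` shows how sensitive the blended figure is to the assumed traffic mix, which matters for a reasoning model that tends to produce long outputs.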
How verbose is DeepSeek R1 (Jan '25)?
When evaluated on the Intelligence Index, DeepSeek R1 (Jan '25) generated 62M output tokens, which is somewhat higher than average compared to other open weight models of similar size (median: 39M).
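To put the verbosity figure in cost terms, the sketch below prices the output tokens alone at the $4.00 per 1M median output price. The helper `estimate_cost` is hypothetical, input tokens are left at zero because the source does not report them, and pricing the 39M-token median at the same rate is only a way to isolate the effect of verbosity.

```python
# Median prices quoted in this FAQ, in USD per 1M tokens.
INPUT_PRICE_PER_M = 1.35
OUTPUT_PRICE_PER_M = 4.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the median prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000

# Output side of the Intelligence Index run: 62M generated tokens.
r1_output_cost = estimate_cost(input_tokens=0, output_tokens=62_000_000)

# Hypothetical comparison: the 39M-token median, priced at the same rate.
median_output_cost = estimate_cost(input_tokens=0, output_tokens=39_000_000)

print(f"R1 output cost: ${r1_output_cost:,.2f}")
print(f"Median-verbosity output cost: ${median_output_cost:,.2f}")
```

At these prices the 62M tokens come to roughly $248 of output, against roughly $156 for a model of median verbosity.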
Is DeepSeek R1 (Jan '25) a reasoning model?
Yes, DeepSeek R1 (Jan '25) is a reasoning model. It uses extended thinking or chain-of-thought reasoning to work through complex problems before providing an answer.
What input modalities does DeepSeek R1 (Jan '25) support?
DeepSeek R1 (Jan '25) supports text input.
What output modalities does DeepSeek R1 (Jan '25) support?
DeepSeek R1 (Jan '25) supports text output.
Can DeepSeek R1 (Jan '25) process images?
No, DeepSeek R1 (Jan '25) does not support image input. It can only process text.
Is DeepSeek R1 (Jan '25) multimodal?
No, DeepSeek R1 (Jan '25) is not multimodal. It only supports text input.
What is the context window of DeepSeek R1 (Jan '25)?
DeepSeek R1 (Jan '25) has a context window of 130k tokens. This determines how much text and conversation history the model can process in a single request.
Is DeepSeek R1 (Jan '25) open source?
Yes, DeepSeek R1 (Jan '25) is an open-weights model released under the MIT license. The weights are publicly available and can be downloaded for self-hosting.
How many parameters does DeepSeek R1 (Jan '25) have?
DeepSeek R1 (Jan '25) has 685 billion parameters (37 billion active).
What are the active parameters of DeepSeek R1 (Jan '25)?
DeepSeek R1 (Jan '25) is a Mixture of Experts (MoE) model with 685 billion total parameters, but only 37 billion active parameters are used during inference.
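The two parameter counts answer different questions: the active count drives per-token compute, while the total count drives how much memory the weights need, since every expert has to be loaded. The sketch below is a rough illustration only. The bytes-per-parameter values are assumptions chosen for the example, not something the source states, and the estimate ignores activations and the KV cache.

```python
TOTAL_PARAMS_B = 685   # total parameters, in billions (from the answer above)
ACTIVE_PARAMS_B = 37   # parameters active during inference, in billions

# Share of the model that participates in generating each token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active share per token: {active_fraction:.1%}")

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * bytes_per_param

# Assumed precisions, for illustration only.
for label, bytes_per_param in [("8-bit", 1.0), ("16-bit", 2.0)]:
    size = weight_memory_gb(TOTAL_PARAMS_B, bytes_per_param)
    print(f"{label} weights: ~{size:,.0f} GB")
```

About 5.4% of the parameters are active per token, but self-hosting still has to budget for all 685 billion.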