DeepSeek R1 Distill Qwen 32B

Developer: DeepSeek
Released: 2025-01-20
License: MIT
Parameters: 33B
Modality: text input, text output
Intelligence: 17.2 (#300/523)
Coding: not available
Math: 63 (#111/265)
Speed: 57 tok/s
TTFT: 495.00s
Pricing: $0.27 / $0.27 per 1M tokens (in/out)

DeepSeek-R1 is DeepSeek's first-generation reasoning model, built on DeepSeek-V3 (671B total parameters, 37B activated per token). It uses large-scale reinforcement learning (RL) to strengthen its chain-of-thought reasoning and performs strongly on math, code, and multi-step reasoning tasks. DeepSeek R1 Distill Qwen 32B is a smaller dense model, based on Qwen2.5-32B and fine-tuned on reasoning data generated by DeepSeek-R1.

DeepSeek R1 Distill Qwen 32B is DeepSeek's open-weights, text-only reasoning model. It generates about 57 tokens per second and is priced at $0.27 per million tokens for both input and output.

When to Use DeepSeek R1 Distill Qwen 32B

✓ Best For

  • Natural language understanding and generation tasks.
  • Data analysis and insights extraction.
  • Coding assistance and debugging support.

✗ Not Ideal For

  • Latency-sensitive real-time applications: the listed TTFT is 495 seconds, and as a reasoning model it works through a chain of thought before answering.
  • Coding-heavy workloads: no Coding Index is available for this model, and it scores 27% on LiveCodeBench.

How DeepSeek R1 Distill Qwen 32B Compares

[Chart: Intelligence Index, higher is better. Providers shown: Meta, OpenAI, DeepSeek, Alibaba, Z AI]

Benchmark Profile

[Chart: Math Index. Providers shown: NVIDIA, Amazon, DeepSeek, Anthropic, OpenAI]

[Chart: Intelligence · Coding · Math]

All Benchmark Scores (14)

Benchmark            Score
Intelligence Index   17.2
Math Index           63
MMLU-Pro             73.9%
GPQA                 61.5%
LiveCodeBench        27%
HLE                  5.5%
SciCode              37.6%
IFBench              22.9%
LCR                  9.7%
AIME                 68.7%
AIME 2025            63%
MATH 500             94.1%
AIME 2024            83.3%
MATH-500             94.3%

Data: Artificial Analysis · Updated: April 2, 2026

Frequently Asked Questions (15)

When was DeepSeek R1 Distill Qwen 32B released?
DeepSeek R1 Distill Qwen 32B was released on January 20, 2025.
Who created DeepSeek R1 Distill Qwen 32B?
DeepSeek R1 Distill Qwen 32B was created by DeepSeek.
How intelligent is DeepSeek R1 Distill Qwen 32B?
DeepSeek R1 Distill Qwen 32B scores 17 (estimated) on the Artificial Analysis Intelligence Index, placing it above average among other open weight models of similar size (median: 15).
How fast is DeepSeek R1 Distill Qwen 32B?
DeepSeek R1 Distill Qwen 32B generates output at 58.4 tokens per second (based on the median across providers serving the model), which is at the lower end compared to other open weight models of similar size (median: 98.0 t/s).
What is the latency of DeepSeek R1 Distill Qwen 32B?
DeepSeek R1 Distill Qwen 32B has a time to first token (TTFT) of 0.80s (based on the median across providers serving the model), which is better than average compared to other open weight models of similar size (median: 1.77s).
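
The speed and latency figures above can be combined into a rough response-time estimate. The following is a minimal sketch, assuming the response time is simply TTFT plus streaming time at a constant rate; the function name and defaults are illustrative, taken from the median figures quoted in these answers, and real providers will vary.

```python
def estimated_response_seconds(output_tokens: int,
                               ttft_s: float = 0.80,
                               tokens_per_s: float = 58.4) -> float:
    """Rough wall-clock estimate: wait for the first token, then stream the rest.

    Defaults are the median TTFT and output speed quoted in the FAQ.
    Assumes a constant generation rate, which is a simplification.
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    for n in (100, 1_000, 5_000):
        print(f"{n:>5} output tokens: ~{estimated_response_seconds(n):.1f} s")
```

Because this is a reasoning model, the output token count should include the thinking tokens it produces before the final answer, so long reasoning traces dominate the estimate.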
How much does DeepSeek R1 Distill Qwen 32B cost?
DeepSeek R1 Distill Qwen 32B costs $0.27 per 1M input tokens (at the higher end, median: $0.18) and $0.27 per 1M output tokens (better than average, median: $0.58), based on the median across providers serving the model.
What is DeepSeek R1 Distill Qwen 32B API pricing?
DeepSeek R1 Distill Qwen 32B costs $0.27 per 1M input tokens and $0.27 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $0.27 per 1M tokens. Pricing may vary by provider.
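
The blended rate is a weighted average of the input and output prices. A minimal sketch of that arithmetic follows, assuming the 3:1 ratio is applied as token-count weights; the function names are illustrative and not part of any provider API.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: float = 3, output_parts: float = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output ratio."""
    total_parts = input_parts + output_parts
    if total_parts <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return (input_parts * input_price + output_parts * output_price) / total_parts


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 0.27, output_price: float = 0.27) -> float:
    """Cost in dollars of one request, with prices given per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    print(f"Blended: ${blended_price(0.27, 0.27):.2f} per 1M tokens")
    print(f"3,000 in + 1,000 out: ${request_cost(3_000, 1_000):.5f}")
```

With identical input and output prices, the blended rate equals $0.27 regardless of the ratio; the ratio only matters when the two prices differ.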
How verbose is DeepSeek R1 Distill Qwen 32B?
When evaluated on the Intelligence Index, DeepSeek R1 Distill Qwen 32B generated 13M output tokens, making it less verbose than other open weight models of similar size (median: 20M).
Is DeepSeek R1 Distill Qwen 32B a reasoning model?
Yes, DeepSeek R1 Distill Qwen 32B is a reasoning model. It uses extended thinking or chain-of-thought reasoning to work through complex problems before providing an answer.
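
In practice, the chain of thought often needs to be separated from the final answer. The sketch below assumes the raw completion wraps its reasoning in `<think>…</think>` tags; some providers strip the reasoning or return it in a separate field, in which case this step is unnecessary. The helper name `split_reasoning` is hypothetical.

```python
import re

# Assumption: reasoning is delimited by <think>...</think> in the raw text.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no think block is found."""
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first think block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    raw = "<think>2 + 2 is 4.</think>\nThe answer is 4."
    reasoning, answer = split_reasoning(raw)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```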
What input modalities does DeepSeek R1 Distill Qwen 32B support?
DeepSeek R1 Distill Qwen 32B supports text input.
What output modalities does DeepSeek R1 Distill Qwen 32B support?
DeepSeek R1 Distill Qwen 32B supports text output.
Can DeepSeek R1 Distill Qwen 32B process images?
No, DeepSeek R1 Distill Qwen 32B does not support image input. It can only process text.
Is DeepSeek R1 Distill Qwen 32B multimodal?
No, DeepSeek R1 Distill Qwen 32B is not multimodal. It supports only text input and text output.
What is the context window of DeepSeek R1 Distill Qwen 32B?
DeepSeek R1 Distill Qwen 32B has a context window of 130k tokens. This determines how much text and conversation history the model can process in a single request.
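
A simple budget check shows how the context window constrains a request. This is a minimal sketch, assuming the prompt and the generated output (including reasoning tokens) share the same 130k-token window; token counts are supplied by the caller, and the function names are illustrative.

```python
CONTEXT_WINDOW = 130_000  # tokens, as stated in the FAQ


def fits_in_context(prompt_tokens: int, reserved_output_tokens: int,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the reserved output budget fits in the window."""
    return prompt_tokens + reserved_output_tokens <= context_window


def max_output_tokens(prompt_tokens: int,
                      context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for output after the prompt; never negative."""
    return max(context_window - prompt_tokens, 0)


if __name__ == "__main__":
    print(fits_in_context(100_000, 30_000))
    print(max_output_tokens(120_000))
```

Since reasoning tokens count toward the output, it is worth reserving a generous output budget for hard problems.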
Is DeepSeek R1 Distill Qwen 32B open source?
Yes, DeepSeek R1 Distill Qwen 32B is an open-weights model released under the MIT license. The model weights are publicly available and can be downloaded for self-hosting.
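
For self-hosting, a first sizing question is how much memory the weights alone require. The sketch below assumes the 33B parameter count listed for this model and common storage precisions; it covers weights only and ignores the KV cache, activations, and runtime overhead, so real requirements are higher. The names are illustrative.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float = 33.0,
                     precision: str = "fp16") -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    if precision not in BYTES_PER_PARAM:
        raise ValueError(f"unknown precision: {precision}")
    # (params_billions * 1e9 params) * bytes_per_param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: ~{weight_memory_gb(precision=precision):.1f} GB")
```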