Qwen3.5 4B (Reasoning)

Alibaba
2026-03-02
Apache 2.0
Modality: text, image, and video input; text output
Intelligence
27.1
#188/523
Coding
17.5
#233/429
Math
Speed
221 tok/s
TTFT: 443.00 ms
Pricing
$0.03 / $0.15
per 1M tokens (in/out)

Qwen3.5-4B is a 4-billion-parameter vision-language model that uses a Gated DeltaNet hybrid architecture with a 3:1 ratio of linear attention to full softmax attention. It supports a native context length of 262K tokens and delivers strong performance for its size across knowledge, reasoning, coding, and multilingual tasks.

Qwen3.5 4B (Reasoning) is Alibaba’s reasoning variant of this model. It generates output at about 221 tokens per second, is priced at $0.03 per million input tokens and $0.15 per million output tokens, and targets professional users.

When to Use Qwen3.5 4B (Reasoning)

✓ Best For

  • Complex reasoning tasks
  • Coding assistance
  • Mathematical problem solving

✗ Not Ideal For

  • Latency-sensitive applications, since the model produces long reasoning output before it answers
  • Basic conversational tasks

How Qwen3.5 4B (Reasoning) Compares

Intelligence Index · Higher is better (chart; providers shown: Google, Mistral, Alibaba, DeepSeek)

Benchmark Profile (chart)

Coding Index (chart; providers shown: Alibaba, Google, Naver)

Output Speed · tok/s (chart; providers shown: MiniMax, OpenAI, Alibaba, Google)

Intelligence · Coding · Math (chart)

All Benchmark Scores (9)

Benchmark Score
Intelligence Index 27.1
Coding Index 17.5
GPQA 77.1%
HLE 78%
SciCode 16.1%
IFBench 52%
LCR 55.7%
TerminalBench Hard 18.2%
Tau2 92.1%

Data: Artificial Analysis · Updated: April 10, 2026

Frequently Asked Questions (15)

When was Qwen3.5 4B (Reasoning) released?
Qwen3.5 4B (Reasoning) was released on March 2, 2026.
Who created Qwen3.5 4B (Reasoning)?
Qwen3.5 4B (Reasoning) was created by Alibaba.
How intelligent is Qwen3.5 4B (Reasoning)?
Qwen3.5 4B (Reasoning) scores 27 on the Artificial Analysis Intelligence Index, placing it well above average among other open weight models of similar size (median: 15).
How fast is Qwen3.5 4B (Reasoning)?
Qwen3.5 4B (Reasoning) generates output at 207.2 tokens per second (based on the median across providers serving the model), which is well above average compared to other open weight models of similar size (median: 96.9 t/s).
What is the latency of Qwen3.5 4B (Reasoning)?
Qwen3.5 4B (Reasoning) has a time to first token (TTFT) of 0.64s (based on the median across providers serving the model), which is very competitive compared to other open weight models of similar size (median: 1.90s).
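The speed and latency figures above can be combined into a rough end-to-end estimate. The following is a minimal sketch, not a measurement method used by the data source: it assumes that total response time is approximately TTFT plus output tokens divided by output speed, and that reasoning tokens count as output tokens generated at the same rate. The function name `estimated_response_seconds` is hypothetical.

```python
# Rough response-time estimate from the median figures quoted in this FAQ.
# Assumption: total time ~= TTFT + output_tokens / output_speed.
# Reasoning ("thinking") tokens are assumed to be part of output_tokens.

MEDIAN_TTFT_S = 0.64          # time to first token, seconds (from the FAQ)
MEDIAN_TOKENS_PER_S = 207.2   # output speed, tokens per second (from the FAQ)


def estimated_response_seconds(
    output_tokens: int,
    ttft_s: float = MEDIAN_TTFT_S,
    tokens_per_s: float = MEDIAN_TOKENS_PER_S,
) -> float:
    """Return an approximate wall-clock time for one response, in seconds."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    for n in (500, 2_000, 10_000):
        print(f"{n:>6} output tokens: ~{estimated_response_seconds(n):.1f} s")
```

Because a reasoning model can emit thousands of thinking tokens per answer, the second term usually dominates the estimate even when TTFT is low.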
How much does Qwen3.5 4B (Reasoning) cost?
Qwen3.5 4B (Reasoning) costs $0.03 per 1M input tokens (very competitive, median: $0.18) and $0.15 per 1M output tokens (very competitive, median: $0.40), based on the median across providers serving the model.
What is Qwen3.5 4B (Reasoning) API pricing?
Qwen3.5 4B (Reasoning) costs $0.03 per 1M input tokens and $0.15 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $0.06 per 1M tokens. Pricing may vary by provider.
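The blended rate quoted above follows from a weighted average: (3 × $0.03 + 1 × $0.15) / 4 = $0.06 per 1M tokens. As a small sketch using the median prices from this page (the helper names `blended_price` and `request_cost` are illustrative, and actual provider pricing may differ):

```python
# Blended-price and per-request cost arithmetic using the median prices above.
# Prices are in US dollars per 1M tokens.

INPUT_PRICE = 0.03    # $ per 1M input tokens (from the FAQ)
OUTPUT_PRICE = 0.15   # $ per 1M output tokens (from the FAQ)


def blended_price(
    input_price: float,
    output_price: float,
    input_parts: int = 3,
    output_parts: int = 1,
) -> float:
    """Weighted average price per 1M tokens for a given input:output ratio."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price: float = INPUT_PRICE,
    output_price: float = OUTPUT_PRICE,
) -> float:
    """Dollar cost of a single request with the given token counts."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    print(f"Blended (3:1): ${blended_price(INPUT_PRICE, OUTPUT_PRICE):.2f} per 1M tokens")
    print(f"10k in / 2k out: ${request_cost(10_000, 2_000):.4f}")
```

For reasoning workloads the real input-to-output ratio is often far from 3:1, so `request_cost` with measured token counts gives a more useful figure than the blended rate.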
How verbose is Qwen3.5 4B (Reasoning)?
When evaluated on the Intelligence Index, Qwen3.5 4B (Reasoning) generated 240M output tokens, which is at the higher end compared to other open weight models of similar size (median: 23M).
Is Qwen3.5 4B (Reasoning) a reasoning model?
Yes, Qwen3.5 4B (Reasoning) is a reasoning model. It uses extended thinking or chain-of-thought reasoning to work through complex problems before providing an answer.
What input modalities does Qwen3.5 4B (Reasoning) support?
Qwen3.5 4B (Reasoning) supports text, image, and video input.
What output modalities does Qwen3.5 4B (Reasoning) support?
Qwen3.5 4B (Reasoning) supports text output.
Can Qwen3.5 4B (Reasoning) process images?
Yes, Qwen3.5 4B (Reasoning) supports image input and can analyze, describe, and answer questions about images.
Is Qwen3.5 4B (Reasoning) multimodal?
Yes, Qwen3.5 4B (Reasoning) is multimodal. It can process text, image, and video input and generate text output.
What is the context window of Qwen3.5 4B (Reasoning)?
Qwen3.5 4B (Reasoning) has a context window of 262K tokens. This determines how much text and conversation history the model can process in a single request.
Is Qwen3.5 4B (Reasoning) open source?
Yes, Qwen3.5 4B (Reasoning) is an open-weights model released under the Apache 2.0 license. The model weights are publicly available and can be downloaded for self-hosting.