GLM-5 (Non-reasoning)
Intelligence
40.6
#84/521
Speed
69 tok/s
TTFT: 998.00 ms
Pricing
$1.00 / $3.20
per 1M tokens (in/out)
GLM-5 is Zhipu AI’s flagship foundation model, designed for complex systems engineering and long-horizon agent tasks, shifting its focus from coding to engineering. It uses a Mixture of Experts architecture with 744B total parameters (40B activated) and was trained on 28.5T tokens. GLM-5 integrates DeepSeek Sparse Attention for higher token efficiency while preserving long-context quality. It supports a 200K-token context length and up to 128K output tokens, with capabilities including thinking modes, real-time streaming, function calling, context caching, and structured output. GLM-5 approaches Claude Opus 4.5 in code-logic density and systems-engineering capability.
GLM-5 (Non-reasoning) is Z AI’s model for efficient processing of text inputs. It generates output at 68.791 tokens per second and is priced at $1.00 per 1M input tokens and $3.20 per 1M output tokens, targeting professional users.
Frequently Asked Questions (15)
When was GLM-5 (Non-reasoning) released?
GLM-5 (Non-reasoning) was released on February 11, 2026.
Who created GLM-5 (Non-reasoning)?
GLM-5 (Non-reasoning) was created by Z AI.
How intelligent is GLM-5 (Non-reasoning)?
GLM-5 (Non-reasoning) scores 41 on the Artificial Analysis Intelligence Index, placing it well above average among other open weight non-reasoning models of similar size (median: 22).
How fast is GLM-5 (Non-reasoning)?
GLM-5 (Non-reasoning) generates output at 59.7 tokens per second (based on the median across providers serving the model), which is well above average compared to other open weight non-reasoning models of similar size (median: 54.5 t/s).
What is the latency of GLM-5 (Non-reasoning)?
GLM-5 (Non-reasoning) has a time to first token (TTFT) of 1.65s (based on the median across providers serving the model), which is better than average compared to other open weight non-reasoning models of similar size (median: 2.11s).
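As a rough illustration of how TTFT and output speed combine, the sketch below estimates end-to-end response time as TTFT plus output tokens divided by output speed. The function name `estimate_response_seconds` is hypothetical, the defaults are the median figures quoted in this FAQ (1.65 s and 59.7 tokens per second), and the model is a simplification that ignores network overhead and provider variance.

```python
def estimate_response_seconds(output_tokens: int,
                              ttft_s: float = 1.65,
                              tokens_per_s: float = 59.7) -> float:
    """Rough wall-clock estimate: time to first token plus streaming time.

    Defaults are the median values quoted in the FAQ; real latency
    depends on the provider, prompt length, and load.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n:>5} output tokens: ~{estimate_response_seconds(n):.1f} s")
```

For long responses the streaming term dominates, so output speed matters more than TTFT; for short responses the opposite holds.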
How much does GLM-5 (Non-reasoning) cost?
GLM-5 (Non-reasoning) costs $1.00 per 1M input tokens (somewhat higher than average, median: $0.60) and $3.20 per 1M output tokens (somewhat higher than average, median: $2.33), based on the median across providers serving the model.
What is GLM-5 (Non-reasoning) API pricing?
GLM-5 (Non-reasoning) costs $1.00 per 1M input tokens and $3.20 per 1M output tokens (based on the median across providers serving the model). For a blended rate (3:1 input to output ratio), this is $1.55 per 1M tokens. Pricing may vary by provider.
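The blended rate is a weighted average of the input and output prices. The sketch below reproduces that arithmetic and adds a per-request cost estimate; the helper names `blended_price_per_million` and `request_cost_usd` are hypothetical, and the default prices are the median figures quoted above, which may differ by provider.

```python
def blended_price_per_million(input_price: float,
                              output_price: float,
                              input_parts: float = 3.0,
                              output_parts: float = 1.0) -> float:
    """Weighted average price per 1M tokens for a given input:output ratio."""
    total_parts = input_parts + output_parts
    if total_parts <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return (input_parts * input_price + output_parts * output_price) / total_parts


def request_cost_usd(input_tokens: int,
                     output_tokens: int,
                     input_price: float = 1.00,
                     output_price: float = 3.20) -> float:
    """Cost of one request in USD, with prices given per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # 3:1 blend of $1.00 input and $3.20 output: (3 * 1.00 + 1 * 3.20) / 4
    print(f"blended: ${blended_price_per_million(1.00, 3.20):.2f} per 1M tokens")
    # Example request: 30,000 prompt tokens and 2,000 generated tokens.
    print(f"request: ${request_cost_usd(30_000, 2_000):.4f}")
```

Workloads with a different input-to-output ratio can pass their own `input_parts` and `output_parts`; output-heavy workloads land closer to the $3.20 rate.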
How verbose is GLM-5 (Non-reasoning)?
When evaluated on the Intelligence Index, GLM-5 (Non-reasoning) generated 13M output tokens, which is somewhat higher than average compared to other open weight non-reasoning models of similar size (median: 8.1M).
Is GLM-5 (Non-reasoning) a reasoning model?
No, GLM-5 (Non-reasoning) is not a reasoning model. It provides direct responses without extended chain-of-thought reasoning.
What input modalities does GLM-5 (Non-reasoning) support?
GLM-5 (Non-reasoning) supports text input.
What output modalities does GLM-5 (Non-reasoning) support?
GLM-5 (Non-reasoning) supports text output.
Can GLM-5 (Non-reasoning) process images?
No, GLM-5 (Non-reasoning) does not support image input. It can only process text.
Is GLM-5 (Non-reasoning) multimodal?
No, GLM-5 (Non-reasoning) is not multimodal. It supports only text input and text output.
What is the context window of GLM-5 (Non-reasoning)?
GLM-5 (Non-reasoning) has a context window of 200k tokens. This determines how much text and conversation history the model can process in a single request.
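To make the limit concrete, the sketch below computes how many output tokens remain for a given prompt size. It assumes that prompt and output tokens share the 200K window and that output is additionally capped at the 128K maximum from the model description; both the sharing assumption and the helper name `max_output_budget` are illustrative and not confirmed provider behavior.

```python
CONTEXT_WINDOW = 200_000     # context window quoted in this FAQ
MAX_OUTPUT_TOKENS = 128_000  # max output tokens from the model description


def max_output_budget(prompt_tokens: int,
                      context_window: int = CONTEXT_WINDOW,
                      max_output: int = MAX_OUTPUT_TOKENS) -> int:
    """Output tokens still available after the prompt.

    Assumes prompt and output share one window (an assumption; check the
    provider's documentation for how limits are actually enforced).
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = context_window - prompt_tokens
    return max(0, min(remaining, max_output))


if __name__ == "__main__":
    for prompt in (50_000, 150_000, 250_000):
        print(f"prompt={prompt:>7}: up to {max_output_budget(prompt)} output tokens")
```

Token counts depend on the model's tokenizer, so the prompt size should come from the provider's reported usage or a matching tokenizer, not from character counts.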
Is GLM-5 (Non-reasoning) open source?
Yes, GLM-5 (Non-reasoning) is an open-weights model. The model weights are publicly available and can be downloaded for self-hosting.