Z.ai has released GLM-5.1, an open-source flagship model built for agentic engineering that can work autonomously on a single coding task for up to eight hours, cycling continuously through planning, execution, testing, and iterative optimization. It scored 58.4 on the SWE-Bench Pro benchmark, ahead of competitors such as GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro, making it the top performer on that benchmark.
GLM-5.1 is a refinement of the earlier GLM-5 model, introduced in February, which has 744 billion total parameters, of which about 40 billion are active per token. GLM-5 was trained solely on Huawei Ascend chips, without any Nvidia hardware. The new version keeps the same architecture but improves its coding and agentic capabilities through progressive alignment, including multi-task supervised fine-tuning and reinforcement learning stages.
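To put the parameter counts in perspective, the short sketch below computes the share of the model that is active for any one token. It uses only the two figures quoted above; the function name is illustrative, and "about 40 billion" is treated as exactly 40 billion for the arithmetic.

```python
# Figures quoted in the article; "about 40 billion" is rounded to 40e9 here.
TOTAL_PARAMS = 744e9
ACTIVE_PARAMS = 40e9


def active_fraction(total: float, active: float) -> float:
    """Return the fraction of parameters active per token."""
    if total <= 0 or not 0 <= active <= total:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return active / total


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active per token: {fraction:.1%}")
```

Roughly one parameter in nineteen is used for a given token, which is why per-token compute is far lower than the total parameter count suggests.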
"SOTA on SWE-Bench Pro (58.4): GLM-5.1 delivers significant leaps in coding and agentic performance. pic.twitter.com/0dtnWFyTys" (Z.ai, @Zai_org, April 7, 2026)
According to Z.ai’s developer documentation, GLM-5.1 can run a full “experiment–analyze–optimize” loop autonomously for eight hours. In demonstrations it built a complete Linux desktop system within that timeframe and, over 655 iterations, raised vector database query throughput to 6.9 times that of the initial production version.
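The general shape of an experiment, analyze, optimize loop can be sketched in a few lines. This is a generic illustration, not Z.ai's implementation: `run_loop`, `propose`, and `measure` are hypothetical names, and the toy objective stands in for a real benchmark such as query throughput.

```python
from typing import Callable, List, Tuple


def run_loop(
    initial: int,
    propose: Callable[[int], int],
    measure: Callable[[int], float],
    max_iterations: int,
) -> Tuple[int, float, List[float]]:
    """Keep a candidate only when it measures better than the current best."""
    best = initial
    best_score = measure(best)
    history = [best_score]
    for _ in range(max_iterations):
        candidate = propose(best)      # experiment: try a change
        score = measure(candidate)     # analyze: benchmark the change
        if score > best_score:         # optimize: keep only improvements
            best, best_score = candidate, score
        history.append(best_score)
    return best, best_score, history


# Toy stand-in: a single tunable setting whose ideal value is 32.
best, best_score, history = run_loop(
    initial=0,
    propose=lambda x: x + 1,
    measure=lambda x: -float((x - 32) ** 2),
    max_iterations=50,
)
print(best, best_score, len(history))
```

Because regressions are discarded, the recorded best score never decreases, which is the property that makes long unattended runs safe to leave going.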
The model has a context window of 200,000 tokens and can generate up to 128,000 output tokens. It has been optimized for agentic coding workflows and is compatible with tools like Claude Code and OpenClaw. On the KernelBench Level 3 benchmark, GLM-5.1 achieved a 3.6x geometric mean speedup on real machine learning workloads.
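A geometric mean is the usual way to average speedup ratios, since it is not dominated by a single large outlier the way an arithmetic mean is. The sketch below shows the calculation; the per-workload speedups are made-up values for illustration, not KernelBench results.

```python
import math
from typing import Sequence


def geometric_mean(speedups: Sequence[float]) -> float:
    """Geometric mean of positive ratios, computed in log space."""
    if not speedups:
        raise ValueError("need at least one speedup")
    if any(s <= 0 for s in speedups):
        raise ValueError("speedups must be positive")
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))


# Hypothetical per-workload speedups (not taken from the benchmark).
example_speedups = [2.0, 4.0, 8.0]
gm = geometric_mean(example_speedups)
am = sum(example_speedups) / len(example_speedups)
print(f"geometric mean: {gm:.2f}x, arithmetic mean: {am:.2f}x")
```

For these sample values the geometric mean is lower than the arithmetic mean, which is always the case unless every speedup is identical.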
GLM-5.1 is available immediately to all GLM Coding Plan subscribers, and its model weights are published under an MIT license. Z.ai, which went public on the Hong Kong Stock Exchange in January at a valuation of $31.3 billion, is offering API access at $1.00 per million input tokens and $3.20 per million output tokens.
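At those rates, the cost of a single request is simple arithmetic. The sketch below uses only the two prices quoted above; the token counts in the example are hypothetical, and the function name is illustrative.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 1.00
OUTPUT_PRICE_PER_MILLION = 3.20


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in dollars for one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    )


# Hypothetical request: 150,000 input tokens and 20,000 output tokens.
cost = estimate_cost(150_000, 20_000)
print(f"${cost:.3f}")
```

Output tokens cost 3.2 times as much as input tokens, so long generations dominate the bill faster than long prompts do.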
The introduction of GLM-5.1 intensifies competition in the open-source coding model space and puts an open model ahead of closed-source competitors on the SWE-Bench Pro benchmark. Z.ai’s documentation claims that the model’s overall capability is “aligned with Claude Opus 4.6.” However, independent evaluations indicate that GLM-5.1 reaches approximately 94.6% of Claude Opus 4.6’s coding score, with gaps remaining in reasoning and creative tasks.





