Z.AI has announced GLM-4.6, the latest model in its GLM series and a comprehensive upgrade over its predecessor. The release brings improvements across coding, long-context processing, reasoning, search, writing quality, and agentic applications.
A central feature of GLM-4.6 is the expansion of its context window to 200,000 tokens, up from the 128,000 tokens available in previous versions, giving the model room for more complex agentic tasks. Coding performance has also advanced: the model scores higher on code benchmarks and performs better in real-world environments such as Claude Code, Cline, Roo Code, and Kilo Code, with notable strength in generating visually polished front-end pages. Reasoning shows clear improvement, and GLM-4.6 can now use tools during inference, contributing to stronger overall capability.
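As a rough illustration of what tool use during inference looks like from the caller's side, the sketch below registers a single web-search tool with an OpenAI-compatible chat-completions client. The base URL, environment-variable names, and tool schema are assumptions made for illustration, not Z.AI's documented interface.

```python
# Hypothetical sketch: exposing a search tool to GLM-4.6 through an
# OpenAI-compatible chat-completions API. The base_url and environment
# variables are placeholders, not confirmed Z.AI endpoints.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["GLM_BASE_URL"],  # assumed OpenAI-compatible endpoint
    api_key=os.environ["GLM_API_KEY"],    # assumed API key variable
)

# A single web-search tool described in the standard JSON-schema format.
search_tool = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web and return the top results.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"},
            },
            "required": ["query"],
        },
    },
}

response = client.chat.completions.create(
    model="glm-4.6",
    messages=[{"role": "user", "content": "What changed in GLM-4.6?"}],
    tools=[search_tool],    # the model may answer with a tool call instead of text
    tool_choice="auto",
)

# If the model decided to call the tool, the requested arguments are here.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```

In an agent framework, the caller would execute the returned search, append the results as a tool message, and let the model continue; the snippet only shows the first leg of that loop.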
Within agent frameworks, the model shows stronger tool use and search-based agent integration. On the text-generation side, its writing has been refined to align more closely with human preferences for style and readability, and it behaves more naturally in role-playing scenarios. GLM-4.6 is available in leading coding tools and is part of the GLM Coding Plan, a subscription service starting at $3 per month that supports AI-powered coding across a range of applications.
Technical specifications list text as the supported input and output modality, with a maximum output of 128,000 tokens. GLM-4.6 was evaluated across eight authoritative benchmarks, performing on par with leading models including Claude Sonnet 4 and Claude Sonnet 4.5. In a competitive assessment within the Claude Code environment covering 74 practical coding scenarios, it outperformed other models, and the same evaluations showed it completing tasks with over 30% greater token efficiency.
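Token-efficiency figures like the one above come from per-request usage accounting. A minimal sketch of reading those counts is shown below; it assumes the same OpenAI-compatible client and placeholder endpoint as the earlier example, and the max_tokens value simply stays within the stated 128,000-token output ceiling.

```python
# Hypothetical sketch: inspecting per-request token usage for GLM-4.6.
# Endpoint and key variables are placeholders, as in the previous example.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ["GLM_BASE_URL"],
    api_key=os.environ["GLM_API_KEY"],
)

response = client.chat.completions.create(
    model="glm-4.6",
    messages=[{"role": "user", "content": "Refactor this function for clarity: ..."}],
    max_tokens=8192,  # request-level output cap; the stated model maximum is 128,000
)

# The usage block reports how many tokens the request consumed, which is
# what a token-efficiency comparison between models would aggregate.
usage = response.usage
print(f"prompt: {usage.prompt_tokens}, completion: {usage.completion_tokens}, "
      f"total: {usage.total_tokens}")
```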