OpenAI Unveils GPT-5.4 Pro And Thinking Models

The new model sets a benchmark record by matching or exceeding human performance 83% of the time on complex tasks in industries like law and finance.

OpenAI released GPT-5.4 on Thursday, introducing a new foundation model available in standard, Thinking, and Pro versions.

The launch introduces a model with a 1 million token context window and improved token efficiency, targeting professional workloads. The release includes new benchmark records and a system to manage tool calling within the API.

The lineup consists of the standard model, a reasoning model (GPT-5.4 Thinking), and an optimized high-performance version (GPT-5.4 Pro). The API version supports context windows as large as 1 million tokens, the largest available from OpenAI. OpenAI stated that GPT-5.4 solves the same problems with significantly fewer tokens than its predecessor.

The model achieved record scores in computer-use benchmarks OSWorld-Verified and WebArena Verified. It scored a record 83% on OpenAI’s GDPval test for knowledge work tasks, meaning it matched or exceeded human performance 83% of the time. GPT-5.4 also took the lead on Mercor’s APEX-Agents benchmark, which tests professional skills in law and finance.

Mercor CEO Brendan Foody stated that GPT-5.4 excels at creating long-horizon deliverables such as slide decks, financial models, and legal analysis. Foody said the model delivers top performance while running faster and at lower cost than competing frontier models.

OpenAI reported that GPT-5.4 is 33% less likely to make errors in individual claims compared to GPT-5.2. Overall responses are 18% less likely to contain errors.

OpenAI also introduced Tool Search, a new system for managing tool calling in the API that allows models to look up tool definitions as needed.
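The idea of looking up tool definitions on demand, rather than sending every definition with each request, can be illustrated with a rough sketch. This is not OpenAI's Tool Search API: the `ToolRegistry` class, its `search` method, the keyword-overlap matching, the example tools, and the character count used as a stand-in for token cost are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToolRegistry:
    """Hypothetical in-memory catalog of tool definitions (not an OpenAI API)."""

    tools: dict[str, dict] = field(default_factory=dict)

    def register(self, name: str, description: str, parameters: dict) -> None:
        self.tools[name] = {
            "name": name,
            "description": description,
            "parameters": parameters,
        }

    def search(self, query: str, limit: int = 3) -> list[dict]:
        """Return up to `limit` definitions sharing at least one word with the query."""
        words = set(query.lower().split())
        scored = []
        for tool in self.tools.values():
            text = f"{tool['name']} {tool['description']}".lower().replace("_", " ")
            overlap = len(words & set(text.split()))
            if overlap:
                scored.append((overlap, tool["name"], tool))
        # Highest overlap first; ties broken alphabetically by tool name.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [tool for _, _, tool in scored[:limit]]


def prompt_size(definitions: list[dict]) -> int:
    """Crude proxy for token cost: characters of the serialized definitions."""
    return len(json.dumps(definitions))


# Example tools, invented for this sketch.
registry = ToolRegistry()
registry.register("get_weather", "Look up the weather forecast for a city",
                  {"city": "string"})
registry.register("send_email", "Send an email message to a recipient",
                  {"to": "string", "body": "string"})
registry.register("convert_currency", "Convert an amount between two currencies",
                  {"amount": "number", "from": "string", "to": "string"})

everything = list(registry.tools.values())               # send all definitions up front
needed = registry.search("weather forecast for Berlin")  # look up only what matches

print("all tools:", prompt_size(everything), "characters")
print("on demand:", prompt_size(needed), "characters")
```

With only three tools the saving is small. The gap grows with the size of the catalog, which is consistent with OpenAI's framing of the feature as one aimed at systems with many tools.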

Tool Search reduces token use and improves speed and cost in systems with many tools.

OpenAI added a new safety evaluation to test chain-of-thought monitoring, addressing concerns that reasoning models could misrepresent their reasoning process.

The new evaluation shows deception is less likely in the GPT-5.4 Thinking version. OpenAI stated this suggests the model lacks the ability to hide its reasoning and that CoT monitoring remains an effective safety tool.

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
