This Is How ChatGPT Becomes An Agent That Can Take Action

OpenAI has launched a new AI agent in ChatGPT designed to perform various computer-based tasks for users. The ChatGPT agent can manage calendars, create presentations, and execute code.

The ChatGPT agent incorporates features from prior tools like Operator, which navigates websites, and Deep Research, which synthesizes information into concise reports. Users interact with the agent through natural language prompts. The agent began rolling out on Thursday to subscribers of OpenAI’s Pro, Plus, and Team plans, who can enable it by selecting “agent mode” from the dropdown menu in ChatGPT.

ChatGPT can now do work for you using its own computer.
Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths. pic.twitter.com/7uN2Nc6nBQ
— OpenAI (@OpenAI) July 17, 2025

This launch is OpenAI’s attempt to turn ChatGPT into a versatile agentic product that users delegate tasks to, rather than one that merely answers queries. Competing Silicon Valley firms have released their own AI agents, but early versions often struggled with complex tasks.

OpenAI claims the ChatGPT agent is significantly more capable than previous versions. This agent can connect to applications such as Gmail and GitHub, enabling it to retrieve relevant data in response to user prompts. Additionally, it has access to a terminal and can utilize APIs for certain applications.

Examples of tasks ChatGPT agent can handle include “planning and buying ingredients to make Japanese breakfast for four” and “analyzing three competitors and creating a slide deck.” Completing these tasks requires sophisticated website parsing and plan execution, challenges that OpenAI’s earlier agents did not address.

According to OpenAI, the model underlying ChatGPT agent demonstrates state-of-the-art performance across multiple benchmarks. It scores 41.6% on Humanity’s Last Exam (pass@1), nearly double the scores of its predecessors. On the challenging FrontierMath benchmark, ChatGPT agent achieves 27.4% when given access to tools like a code-executing terminal. By comparison, the previous best score was 6.3%, from another model.

Safety was a central consideration in developing ChatGPT agent, because its advanced capabilities could be exploited. OpenAI has previously noted the risks associated with agentic models.

A safety report categorized the ChatGPT agent model as “high capability” in the biological and chemical weapons domains. OpenAI acknowledges that it lacks direct evidence of potential misuse but says it aims to implement robust safeguards.

New safety measures include real-time monitoring during user interactions. OpenAI uses a classifier to assess each prompt for biological relevance. If a prompt is flagged, a second monitor evaluates whether the content could pose a biological threat. OpenAI has also disabled ChatGPT’s memory feature for this agent, to prevent bad actors from extracting sensitive information through prompts. It is unclear whether the feature will be reintroduced.
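The two-stage check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OpenAI’s implementation: the function names, the keyword-based stand-ins for the classifier and the monitor, and the decision to block a prompt judged to be a threat are all hypothetical, since OpenAI has not published how its monitors work or what happens after a flag.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    allowed: bool
    reason: str


def two_stage_monitor(
    prompt: str,
    is_bio_relevant: Callable[[str], bool],
    is_bio_threat: Callable[[str], bool],
) -> Verdict:
    """Run a cheap relevance check first, then a stricter threat check."""
    # Stage 1: every prompt is screened for biological relevance.
    if not is_bio_relevant(prompt):
        return Verdict(True, "not flagged by first-stage classifier")
    # Stage 2: only flagged prompts reach the second monitor.
    if is_bio_threat(prompt):
        # Assumption: a threat verdict blocks the prompt.
        return Verdict(False, "second-stage monitor judged it a potential threat")
    return Verdict(True, "flagged, but cleared by second-stage monitor")


# Hypothetical toy stand-ins; real systems would use trained models.
def toy_relevance(prompt: str) -> bool:
    return any(word in prompt.lower() for word in ("pathogen", "virus", "toxin"))


def toy_threat(prompt: str) -> bool:
    return "synthesize" in prompt.lower()


if __name__ == "__main__":
    for text in (
        "Plan a Japanese breakfast for four",
        "How does a virus spread?",
        "How to synthesize a toxin",
    ):
        print(text, "->", two_stage_monitor(text, toy_relevance, toy_threat))
```

The point of the layering is that the first stage only needs to decide whether a prompt touches biology at all, so the more careful second check runs on a small share of traffic.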

Despite these promising capabilities, the real-world effectiveness of ChatGPT agent remains to be seen. Agent technology has historically struggled in practical applications. Still, OpenAI asserts that ChatGPT agent is built on a more capable model, one poised to meet the expectations placed on AI agents.

This story was updated with additional information.
