Dataconomy

Study finds ChatGPT-5 has 25% error rate

New research shows OpenAI’s ChatGPT-5 makes 45% fewer factual errors and six times fewer hallucinations than GPT-4 but still answers incorrectly in about a quarter of cases.

by Kerem Gülen
September 25, 2025
in Artificial Intelligence

A study on OpenAI’s ChatGPT-5 model determined it generates incorrect answers in approximately 25% of cases. The research attributes these inaccuracies to inherent limitations within the model’s training data and its probabilistic reasoning architecture, as detailed in a Tom’s Guide report.

The model makes notably fewer errors than its predecessor, GPT-4: it registers 45% fewer factual mistakes and six times fewer “hallucinated,” or entirely fabricated, answers. Despite these advances, the study confirms that ChatGPT-5 can still be overconfident, presenting factually incorrect information with a high degree of certainty. Hallucination, though reduced, remains a core issue affecting the model’s reliability.

Accuracy varies significantly by task domain. For example, the model scored 94.6% on the 2025 AIME mathematics test and completed 74.9% of a set of real-world coding assignments successfully. The research indicates that errors are more common in tasks that involve general knowledge or require complex, multi-step reasoning, where the model’s performance is less consistent.
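To put those domain figures next to the headline error rate, the short Python sketch below converts each reported accuracy score into its complementary error rate. The dictionary keys, the variable names, and the `error_rate` helper are illustrative names chosen here, not part of the study, and the conversion assumes every answer is scored simply as right or wrong.

```python
# Accuracy figures quoted in the article, in percent.
# The labels are illustrative; the study's own task names may differ.
reported_accuracy = {
    "AIME 2025 (math)": 94.6,
    "real-world coding tasks": 74.9,
}


def error_rate(accuracy_pct: float) -> float:
    """Return the complementary error rate in percent.

    Assumes each answer is scored as either correct or incorrect,
    so accuracy and error rate sum to 100.
    """
    return round(100.0 - accuracy_pct, 1)


# Map each task to its implied error rate.
error_rates = {task: error_rate(acc) for task, acc in reported_accuracy.items()}

for task, err in error_rates.items():
    print(f"{task}: about {err}% of answers incorrect")
```

Under that assumption, the mathematics score implies an error rate of roughly 5%, while the coding score implies one close to the overall quarter-of-cases figure.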

On the MMLU Pro benchmark, a rigorous academic test covering a wide range of subjects including science, mathematics, and history, ChatGPT-5 scored approximately 87%. The study identifies several underlying causes for the remaining errors: an inability to fully comprehend nuanced questions, reliance on training data that may be outdated or incomplete, and the model’s fundamental design as a probabilistic pattern-prediction mechanism, which can generate responses that are plausible but not factually correct.

Based on these findings, the report advises users to independently verify any critical information produced by ChatGPT-5. This recommendation is especially pertinent for professional, academic, or health-related inquiries where precision is essential. The persistent error rate, even with marked improvements, underscores the need for cautious use and external validation of the model’s outputs.


Tags: ChatGPT-5, Featured

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
