Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Glossary
    • Whitepapers
  • Newsletter
  • + More
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
  • AI
  • Tech
  • Cybersecurity
  • Finance
  • DeFi & Blockchain
  • Startups
  • Gaming
Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Glossary
    • Whitepapers
  • Newsletter
  • + More
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
Dataconomy
No Result
View All Result

AI chatbots spread false info in 1 of 3 responses

AI firms’ ‘hallucination-proof’ claims contradicted by new findings.

byEmre Çıtak
September 5, 2025
in Artificial Intelligence
Home News Artificial Intelligence

A recent study conducted by Newsguard reveals that prominent AI chatbots are generating false information in approximately one out of every three responses. The analysis assessed the accuracy of the ten most widely used artificial intelligence (AI) chatbots currently available.

Newsguard, a company specializing in news source ratings, determined that AI chatbots are increasingly providing answers even when they lack sufficient information, a change from their behavior in 2024. This shift has resulted in a higher prevalence of false or misleading statements being generated by these AI systems.

The Newsguard report identifies specific chatbots with the highest rates of generating false claims. Inflection AI’s Pi exhibited the highest rate, with 57 percent of its responses containing inaccurate information. Following Pi, Perplexity AI was found to generate false claims in 47 percent of its answers.

Stay Ahead of the Curve!

Don't miss out on the latest insights, trends, and analysis in the world of data, technology, and startups. Subscribe to our newsletter and get exclusive content delivered straight to your inbox.

Widely used chatbots such as OpenAI’s ChatGPT and Meta’s Llama also demonstrated significant rates of generating falsehoods. The study found that both ChatGPT and Llama spread false information in 40 percent of their responses. Similarly, Microsoft’s Copilot and Mistral’s Le Chat exhibited comparable rates, with approximately 35 percent of their answers containing false claims.

Conversely, the report identified AI chatbots with the lowest rates of generating inaccurate information. Anthropic’s Claude was observed to have the lowest rate, with only 10 percent of its responses containing falsehoods. Google’s Gemini also performed relatively well, with 17 percent of its answers containing false claims.

The study highlighted a notable increase in the generation of falsehoods by Perplexity AI. In 2024, Newsguard’s research indicated that Perplexity AI generated zero false claims in its responses. However, the recent study conducted in August 2025 revealed a significant increase, with 46 percent of Perplexity AI’s answers containing false information.

Newsguard’s report does not specify the underlying factors contributing to the apparent decline in the quality of Perplexity AI’s responses. The report mentions that the only available explanation is user complaints found on a dedicated Reddit forum discussing the chatbot. These user concerns suggest a perceived decrease in the accuracy and reliability of Perplexity AI’s responses.

In contrast to the fluctuations observed in other chatbots, France’s Mistral demonstrated a consistent rate of generating falsehoods. Newsguard’s research indicated that Mistral’s rate of generating false claims remained steady at 37 percent in both 2024 and the current reporting period.

These recent findings follow a previous report by the French newspaper Les Echos, which investigated Mistral’s tendency to repeat false information. Les Echos found that Mistral disseminated inaccurate information about France, President Emmanuel Macron, and First Lady Brigitte Macron in 58 percent of its English-language responses and 31 percent of its French-language responses.

Regarding the Les Echos report, Mistral attributed the identified issues to its Le Chat assistants. The company stated that both the Le Chat assistants connected to web search and those operating independently of web search were contributing to the spread of inaccurate information.

Euronews Next contacted the companies mentioned in the NewsGuard report, seeking comment on the findings. As of the time of the report’s publication, Euronews Next had not received any immediate responses from the companies.

Newsguard’s report also highlighted instances where chatbots cited sources affiliated with foreign propaganda campaigns. Specifically, the report mentions instances where chatbots referenced narratives originating from Russian influence operations, such as Storm-1516 and Pravda.

As an illustration, the study examined the chatbots’ responses to a claim regarding Moldovan Parliament Leader Igor Grosu. The claim alleged that Grosu “likened Moldovans to a ‘flock of sheep.'” Newsguard identified this claim as originating from a fabricated news report that mimicked the Romanian news outlet Digi24 and incorporated an AI-generated audio clip purporting to be Grosu’s voice.

The Newsguard report found that Mistral, Claude, Inflection’s Pi, Copilot, Meta, and Perplexity repeated the false claim regarding Igor Grosu as factual. In some instances, these chatbots provided links to sites associated with the Pravda network as sources for the information.

These findings contradict recent safety and accuracy announcements from AI companies. For instance, OpenAI has asserted that its latest model, ChatGPT-5, is “hallucination-proof,” implying its ability to avoid generating false or fabricated information. Similarly, Google’s announcement concerning Gemini 2.5 claimed enhanced reasoning and accuracy capabilities.

Despite these assurances, Newsguard’s report concludes that AI models continue to exhibit shortcomings in areas previously identified. The findings indicate that these models struggle with repeating falsehoods, navigating data voids, being deceived by foreign-linked websites, and processing breaking news events.

Newsguard’s methodology for evaluating the chatbots involved presenting them with 10 distinct false claims. The researchers employed three different prompt styles: neutral prompts, leading prompts that presupposed the false claim was true, and malicious prompts designed to circumvent safety guardrails.

The researchers then assessed whether the chatbot repeated the false claim or failed to debunk it by refusing to answer the prompt. This assessment allowed Newsguard to quantify the frequency with which different AI models disseminated false information.


Featured image credit

Tags: AI chatbotsFeatured

Related Posts

OpenAI to mass produce custom AI chip with Broadcom in 2025

OpenAI to mass produce custom AI chip with Broadcom in 2025

September 5, 2025
Deepmind finds RAG limit with fixed-size embeddings

Deepmind finds RAG limit with fixed-size embeddings

September 5, 2025
TCL QM9K integrates Gemini with presence detection

TCL QM9K integrates Gemini with presence detection

September 5, 2025
LunaLock ransomware hits artists/clients with AI training threat

LunaLock ransomware hits artists/clients with AI training threat

September 5, 2025
OpenAI: New ‘OpenAI for Science’ uses GPT-5

OpenAI: New ‘OpenAI for Science’ uses GPT-5

September 5, 2025
Wikipedia releases guide to spot AI-written articles

Wikipedia releases guide to spot AI-written articles

September 4, 2025

LATEST NEWS

Texas Attorney General files lawsuit over the PowerSchool data breach

iPhone 17 Pro is expected to arrive with 48mp telephoto, variable aperture expected

AI chatbots spread false info in 1 of 3 responses

OpenAI to mass produce custom AI chip with Broadcom in 2025

When two Mark Zuckerbergs collide

Deepmind finds RAG limit with fixed-size embeddings

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy

Follow Us

  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Glossary
    • Whitepapers
  • Newsletter
  • + More
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
No Result
View All Result
Subscribe

This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy Policy.