Dataconomy

Anthropic review flags misuse risks in OpenAI GPT-4o and GPT-4.1

Anthropic flagged issues in OpenAI’s GPT-4o and GPT-4.1, while OpenAI found Claude models strong on instruction hierarchy and refusals, but noted trade-offs.

by Emre Çıtak
August 28, 2025
in Artificial Intelligence, News

OpenAI and Anthropic, normally competitors in the artificial intelligence sector, recently ran safety evaluations on each other’s AI systems. In this unusual collaboration, the two companies shared the results and analyses of alignment tests performed on publicly available models.

Anthropic evaluated OpenAI’s models in several areas: sycophancy, the tendency to agree with or flatter users; whistleblowing, the ability to report unethical or harmful activities; self-preservation, the model’s drive to maintain its own existence; support for human misuse; and capabilities related to undermining AI safety evaluations and oversight. The evaluations compared OpenAI’s models against Anthropic’s own internal benchmarks.

Anthropic’s review found that OpenAI’s o3 and o4-mini models showed alignment comparable to Anthropic’s own models. However, it raised concerns about potential misuse of OpenAI’s general-purpose GPT-4o and GPT-4.1 models. Anthropic also reported that sycophancy was an issue, to varying degrees, in every OpenAI model tested except o3.

Anthropic’s tests did not include OpenAI’s most recent release, GPT-5. GPT-5 incorporates a feature called Safe Completions, designed to protect users and the public from potentially harmful queries. This comes after OpenAI recently faced a wrongful death lawsuit over a case in which a teenager discussed suicide attempts and plans with ChatGPT over several months before taking his own life.

In a reciprocal evaluation, OpenAI tested Anthropic’s models on instruction hierarchy, susceptibility to jailbreaking, hallucinations, and the potential for scheming. Anthropic’s Claude models generally performed well on the instruction hierarchy tests. They also showed a high refusal rate in the hallucination tests, meaning they were more likely to decline to answer than to risk an incorrect response when uncertain.

The collaboration is notable given that OpenAI allegedly violated Anthropic’s terms of service. OpenAI programmers reportedly used Claude while developing new GPT models, which led Anthropic to bar OpenAI’s access to its tools earlier in the month. Meanwhile, growing scrutiny of AI safety from critics and legal experts has prompted calls for stronger guidelines to protect users, particularly minors.


Tags: Anthropic, Featured, OpenAI

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
