Dataconomy

Anthropic review flags misuse risks in OpenAI GPT-4o and GPT-4.1

Anthropic flagged issues in OpenAI’s GPT-4o and GPT-4.1, while OpenAI found Claude models strong on hierarchy and refusals, but noted trade-offs.

by Emre Çıtak
August 28, 2025
in Artificial Intelligence, News

OpenAI and Anthropic, typically competitors in the artificial intelligence sector, recently ran safety evaluations on each other’s AI systems. In this unusual collaboration, the two companies shared results and analyses of alignment testing performed on publicly available models.

Anthropic conducted evaluations on OpenAI models, focusing on several key areas. These included assessments for sycophancy, the tendency to agree with or flatter users; whistleblowing, the ability to report unethical or harmful activities; self-preservation, the model’s drive to maintain its own existence; the potential for supporting human misuse; and capabilities related to undermining AI safety evaluations and oversight. The evaluations compared OpenAI’s models against Anthropic’s own internal benchmarks.

The Anthropic review determined that OpenAI’s o3 and o4-mini models demonstrated alignment comparable to Anthropic’s models. However, Anthropic identified concerns regarding potential misuse associated with OpenAI’s GPT-4o and GPT-4.1 general-purpose models. Anthropic also reported that sycophancy presented an issue to varying degrees across all OpenAI models tested, with the exception of the o3 model.


It is important to note that Anthropic’s tests did not include OpenAI’s most recent release, GPT-5. GPT-5 incorporates a feature called Safe Completions, designed to safeguard users and the public from potentially harmful queries. This development comes as OpenAI recently faced a wrongful death lawsuit following a case where a teenager engaged in conversations about suicide attempts and plans with ChatGPT over several months before taking his own life.

In a reciprocal evaluation, OpenAI tested Anthropic’s models on instruction hierarchy, susceptibility to jailbreaking, hallucinations, and the potential for scheming. Anthropic’s Claude models generally performed well in the instruction hierarchy tests. They also showed a high refusal rate in the hallucination tests, meaning they were more likely to decline to answer when uncertain than to risk an incorrect response.

The collaboration between OpenAI and Anthropic is noteworthy, especially considering that OpenAI allegedly violated Anthropic’s terms of service. Specifically, it was reported that OpenAI programmers used Claude during the development of new GPT models, which subsequently led to Anthropic barring OpenAI’s access to its tools earlier in the month. The increased scrutiny surrounding AI safety has prompted calls for enhanced guidelines aimed at protecting users, particularly minors, as critics and legal experts increasingly focus on these issues.



Tags: Anthropic, Featured, OpenAI

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
