Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
  • AI
  • Tech
  • Cybersecurity
  • Finance
  • DeFi & Blockchain
  • Startups
  • Gaming
Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
Dataconomy
No Result
View All Result

Just 250 bad documents can poison a massive AI model

A new cross-institutional study dismantles the idea that large AI models are inherently safer, showing how tiny, deliberate manipulations of training data can secretly teach them harmful behaviors.

byAytun Çelebi
October 15, 2025
in Research
Home Research
Share on FacebookShare on TwitterShare on LinkedInShare on WhatsAppShare on e-mail
Google Preferred Source

We trust large language models with everything from writing emails to generating code, assuming their vast training data makes them robust. But what if a bad actor could secretly teach an AI a malicious trick? In a sobering new study, researchers from Anthropic, the UK AI Security Institute, and The Alan Turing Institute have exposed a significant vulnerability in how these models learn.

The single most important finding is that it takes a shockingly small, fixed number of just 250 malicious documents to create a “backdoor” vulnerability in a massive AI—regardless of its size. This matters because it fundamentally challenges the assumption that bigger is safer, suggesting that sabotaging the very foundation of an AI model is far more practical than previously believed.

The myth of safety in numbers

Let’s be clear about what “data poisoning” means. AI models learn by reading colossal amounts of text from the internet. A poisoning attack happens when an attacker intentionally creates and publishes malicious text, hoping it gets swept up in the training data. This text can teach the model a hidden, undesirable behavior that only activates when it sees a specific trigger phrase. The common assumption was that this was a game of percentages; to poison a model trained on a digital library the size of a continent, you’d need to sneak in a whole country’s worth of bad books.

Stay Ahead of the Curve!

Don't miss out on the latest insights, trends, and analysis in the world of data, technology, and startups. Subscribe to our newsletter and get exclusive content delivered straight to your inbox.

The new research dismantles this idea. The team ran the largest data poisoning investigation to date, training AI models of various sizes, from 600 million to 13 billion parameters. For each model size, they “poisoned” the training data with a tiny, fixed number of documents designed to teach the AI a simple bad habit: when it saw the trigger phrase <SUDO>, it was to output complete gibberish—a type of “denial-of-service” attack.

A constant vulnerability

The results were alarmingly consistent. The researchers found that the success of the attack had almost nothing to do with the size of the model. Even though the 13-billion parameter model was trained on over 20 times more clean data than the 600-million parameter one, both were successfully backdoored by the same small number of poisoned documents.

  • Absolute count is king: The attack’s success depended on the absolute number of malicious documents seen by the model, not the percentage of the total data they represented.
  • The magic number is small: Just 100 poisoned documents were not enough to reliably create a backdoor. However, once the number hit 250, the attack succeeded consistently across all model sizes.

The upshot is that an attacker doesn’t need to control a vast slice of the internet to compromise a model. They just need to get a few hundred carefully crafted documents into a training dataset, a task that is trivial compared to creating millions.

So, what’s the catch? The researchers are quick to point out the limitations of their study. This was a relatively simple attack designed to produce a harmless, if annoying, result (gibberish text). It’s still an open question whether the same trend holds for larger “frontier” models or for more dangerous backdoors, like those designed to bypass safety features or write vulnerable code. But that uncertainty is precisely the point. By publishing these findings, the team is sounding an alarm for the entire AI industry.


Featured image credit

Tags: AIAnthropicdata poisoning

Related Posts

Codex use is spreading into knowledge work, OpenAI says

Codex use is spreading into knowledge work, OpenAI says

July 1, 2026
Meta says Brain2Qwerty v2 turns brain activity into text

Meta says Brain2Qwerty v2 turns brain activity into text

July 1, 2026
Penn Medicine unveils AI-human system to speed CAR T cancer target discovery

Penn Medicine unveils AI-human system to speed CAR T cancer target discovery

June 30, 2026
CrowdStrike warns prompt injection attacks hit over 90 firms in 2025

CrowdStrike warns prompt injection attacks hit over 90 firms in 2025

June 29, 2026
Wireless charging uses about 40% more electricity

Wireless charging uses about 40% more electricity

June 25, 2026
European consumers may leave businesses using US tech providers

European consumers may leave businesses using US tech providers

June 24, 2026

LATEST NEWS

Anthropic launches Claude Science workbench for researchers

Samsung teases Galaxy Fold 8 in new Instagram campaign

ChatGPT Plus users can now connect financial accounts

Discord launches native app for Meta Quest headsets

Google rolls out Gemini Spark for macOS subscribers in the US

Samsung Galaxy Z Fold8 series leak reveals camera upgrades

BEST AI MODELS LEADERBOARD

See the best AI models, ranked by intelligence, benchmark results, speed and token price. Find the most suitable LLMs, Text-to-Image, Image Editing, Text-to-Speech, Text-to-Video and Image-to-Video  artificial intelligence model for your tasks and business.

LATEST TOOLS

Hoppy Copy

Microsoft Reading Coach

InfiHeal

NOS Agent

Tinywow

Miraa

QuizRise

Voice Swap

Puppetry

Smarter ChatGPT by Athena AI

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy

Follow Us

  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI tools
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
No Result
View All Result
Subscribe

This website uses cookies to improve your experience. You can choose to accept or reject them. Visit our Privacy Policy.