Dataconomy

DeepMind finds RAG limit with fixed-size embeddings

DeepMind shows classical BM25 still outperforms dense embeddings at scale.

by Kerem Gülen
September 5, 2025
in Artificial Intelligence, Research

Google DeepMind has identified a fundamental architectural limitation in Retrieval-Augmented Generation (RAG) systems that rely on dense embeddings: as the database grows, fixed-size embeddings cannot represent every combination of relevant documents, which degrades retrieval effectiveness.

The core issue is representational capacity. For an embedding of a given dimension, there is a database size beyond which the embedding can no longer accurately represent every possible combination of relevant documents. The limitation is rooted in principles of communication complexity and sign-rank theory.
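The capacity argument can be made concrete with a deliberately extreme toy case. The sketch below is not taken from the DeepMind report: it assumes dot-product scoring, a hypothetical six-document corpus embedded in a single dimension, and random sampling of queries. It counts how many distinct top-2 result sets any query can produce.

```python
import math
import random

def top_k_set(query, docs, k):
    """Return the indices of the k documents with the highest dot-product score."""
    scores = [sum(q * x for q, x in zip(query, doc)) for doc in docs]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return frozenset(ranked[:k])

def reachable_top_k_sets(docs, k, trials=20000, seed=0):
    """Sample random queries and collect every distinct top-k set they produce."""
    rng = random.Random(seed)
    dim = len(docs[0])
    found = set()
    for _ in range(trials):
        query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        found.add(top_k_set(query, docs, k))
    return found

# Hypothetical toy corpus: six documents embedded in one dimension.
docs_1d = [[-2.0], [-1.0], [-0.5], [0.3], [1.1], [2.4]]

reachable = reachable_top_k_sets(docs_1d, k=2)
total_pairs = math.comb(len(docs_1d), 2)
print(f"{len(reachable)} of {total_pairs} document pairs are reachable")
```

In one dimension a query can only point "left" or "right", so only the two largest or the two smallest documents can ever be returned together: 2 of the 15 possible pairs. Higher dimensions make more combinations reachable, but the number of possible combinations grows with the corpus as well.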

The report estimates theoretical capacity limits by embedding size. Embeddings of 512 dimensions reach their limit at around 500,000 documents. Increasing to 1024 dimensions extends the limit to approximately 4 million documents, and 4096 dimensions raise the ceiling to 250 million documents. These are best-case estimates under free embedding optimization, where vectors are optimized directly against test labels. According to the Google DeepMind report, real-world language-constrained embeddings are expected to fail even sooner.
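One way to read the three quoted figures is to ask how fast the limit grows with dimension. The following sketch uses only the numbers in this article and computes the slope between consecutive points on a log-log scale. It is a back-of-the-envelope reading, not the fitting procedure used in the report.

```python
import math

# Limits quoted in the article: embedding dimension -> approximate corpus size.
limits = {512: 500_000, 1024: 4_000_000, 4096: 250_000_000}

def local_exponent(d1, n1, d2, n2):
    """Slope of log(n) against log(d) between two (dimension, limit) points."""
    return (math.log(n2) - math.log(n1)) / (math.log(d2) - math.log(d1))

dims = sorted(limits)
exponents = [
    local_exponent(a, limits[a], b, limits[b])
    for a, b in zip(dims, dims[1:])
]

for (a, b), e in zip(zip(dims, dims[1:]), exponents):
    print(f"{a} -> {b} dimensions: exponent ~ {e:.2f}")
```

Both slopes come out close to 3, so on these figures the limit grows roughly with the cube of the embedding dimension: doubling the dimension buys about eight times as many documents.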

To demonstrate the limitation empirically, Google DeepMind introduced the LIMIT benchmark, designed to stress-test embedders. It comes in two configurations, LIMIT full and LIMIT small. LIMIT full consists of 50,000 documents, on which even strong embedders collapse, with recall@100 often falling below 20%. LIMIT small comprises just 46 documents yet still challenges models: results vary widely and remain far from reliable.
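For readers unfamiliar with the metric, recall@k is the fraction of a query's relevant documents that appear among the top k retrieved results, averaged over queries. A minimal sketch follows. The document IDs and rankings are hypothetical and are not drawn from LIMIT.

```python
def recall_at_k(ranked_doc_ids, relevant_doc_ids, k):
    """Fraction of the relevant documents found in the top-k ranked results."""
    relevant = set(relevant_doc_ids)
    if not relevant:
        raise ValueError("each query needs at least one relevant document")
    retrieved = set(ranked_doc_ids[:k])
    return len(retrieved & relevant) / len(relevant)

def mean_recall_at_k(runs, k):
    """Average recall@k over (ranking, relevant set) pairs, one per query."""
    return sum(recall_at_k(ranked, rel, k) for ranked, rel in runs) / len(runs)

# Hypothetical rankings for two queries, each with two relevant documents.
runs = [
    (["d7", "d2", "d9", "d4"], {"d2", "d4"}),  # one relevant doc in the top 2
    (["d1", "d3", "d5", "d8"], {"d1", "d3"}),  # both relevant docs in the top 2
]

print(mean_recall_at_k(runs, k=2))  # 0.75
print(mean_recall_at_k(runs, k=4))  # 1.0
```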

Specific results from testing the LIMIT small configuration include:

  • Promptriever Llama3 8B: 54.3% recall@2 (4096 dimensions)
  • GritLM 7B: 38.4% recall@2 (4096 dimensions)
  • E5-Mistral 7B: 29.5% recall@2 (4096 dimensions)
  • Gemini Embed: 33.7% recall@2 (3072 dimensions)

The research shows that even with only 46 documents, no embedder achieves full recall, emphasizing that the limitation stems from the single-vector embedding architecture itself, not solely from dataset size.

In contrast, BM25, a classical sparse lexical model, circumvents this limitation. Sparse models operate in effectively unbounded dimensional spaces, facilitating the capture of combinations that dense embeddings cannot effectively represent.
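To illustrate why a sparse lexical model is not bound by a fixed vector size, here is a minimal BM25 sketch in plain Python. Every vocabulary term acts as its own dimension, so the space grows with the corpus. The parameter values (`k1=1.5`, `b=0.75`), the smoothed IDF variant, the whitespace tokenizer and the toy corpus are all assumptions chosen for illustration, not details from the report.

```python
import math
from collections import Counter

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in `corpus` against `query` with Okapi-style BM25."""
    docs = [doc.lower().split() for doc in corpus]  # naive whitespace tokenizer
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Smoothed, non-negative IDF variant (an assumption of this sketch).
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores

# Hypothetical toy corpus.
corpus = [
    "alice likes apples and quokkas",
    "bob likes apples and tea",
    "carol likes chess and tea",
]
scores = bm25_scores("likes quokkas", corpus)
best = max(range(len(corpus)), key=lambda i: scores[i])
print(corpus[best])
```

The rare term "quokkas" carries a high IDF weight and appears only in the first document, so that document ranks first, while the common term "likes" contributes the same small amount to every document.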

Current RAG implementations often assume that embeddings can scale indefinitely as data volumes grow. Google DeepMind's research shows that this assumption is wrong: embedding size inherently constrains retrieval capacity. The constraint matters most for enterprise search engines managing millions of documents, agentic systems relying on complex logical queries, and instruction-following retrieval tasks where queries dynamically define relevance.

Existing benchmarks, such as MTEB, do not adequately capture these limitations because they test only a narrow subset of query-document combinations. The research team suggests that scalable retrieval requires moving beyond single-vector embeddings.

Alternatives to single-vector embeddings include Cross-Encoders, which achieve perfect recall on the LIMIT benchmark by directly scoring query-document pairs, albeit with high inference latency. Multi-Vector Models, such as ColBERT, offer more expressive retrieval by assigning multiple vectors per sequence, improving performance on LIMIT tasks. Sparse Models, including BM25, TF-IDF, and neural sparse retrievers, scale better in high-dimensional search but lack semantic generalization.
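The difference between single-vector and multi-vector scoring can be sketched in a few lines. ColBERT-style late interaction scores a document by taking, for each query token vector, its best match among the document's token vectors and summing those maxima. The toy two-dimensional vectors below are hypothetical. A real system would obtain them from a trained encoder.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def single_vector_score(query_vec, doc_vec):
    """Dense retrieval: one vector per query, one per document."""
    return dot(query_vec, doc_vec)

def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: sum over query tokens of the best-matching doc token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Hypothetical token embeddings for a two-token query and two documents.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a_tokens = [[1.0, 0.0], [0.0, 1.0]]  # matches both query tokens
doc_b_tokens = [[1.0, 0.0], [1.0, 0.0]]  # matches only the first query token

score_a = maxsim_score(query_tokens, doc_a_tokens)
score_b = maxsim_score(query_tokens, doc_b_tokens)
print(score_a, score_b)  # 2.0 1.0
```

Because each query token is matched independently, the score depends on more than one vector per document, which is what gives multi-vector models their extra expressiveness compared with `single_vector_score`.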

The key finding is that architectural innovation, rather than simply increasing embedder size, is essential. The research team’s analysis reveals that dense embeddings, despite their widespread use, are constrained by a mathematical limit. Dense embeddings cannot capture all possible relevance combinations once corpus sizes exceed limits tied to embedding dimensionality. This limitation is concretely demonstrated by the LIMIT benchmark, with recall@100 dropping below 20% on LIMIT full (50,000 documents) and even the best models maxing out at approximately 54% recall@2 on LIMIT small (46 documents). Classical techniques like BM25, or newer architectures such as multi-vector retrievers and cross-encoders, remain essential for building reliable retrieval engines at scale.



Tags: DeepMind, Featured

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
