Apple’s quiet AI lab reveals how large models fake thinking

Apple researchers have found that reasoning AIs fail to scale as puzzles grow harder. The study shows the models spend fewer reasoning tokens precisely when problems demand more thinking.

by Aytun Çelebi
June 11, 2025
in Research

The latest generation of AI models, often called large reasoning models (LRMs), has dazzled the world with its ability to “think.” Before giving an answer, these models produce long, detailed chains of thought, seemingly reasoning their way through complex problems. This has led many to believe we are on the cusp of true artificial general intelligence.

But are these models really thinking? A new, insightful paper from researchers at Apple, titled “The Illusion of Thinking,” puts this capability under a microscope and comes to some startling conclusions. By moving away from standard math tests—which are often “contaminated” with answers the AI has already seen during training—and into a controlled lab of complex puzzles, the researchers uncovered fundamental limits to AI reasoning.

Today’s most advanced AI isn’t so much a brilliant thinker as it is an incredibly sophisticated pattern-matcher that quickly hits a wall when faced with truly new challenges.

The three regimes of AI reasoning

The researchers tested pairs of AI models—one “thinking” LRM and its standard “non-thinking” counterpart—on a series of puzzles like the Tower of Hanoi and River Crossing. By precisely increasing the difficulty, they discovered three distinct performance regimes:

  1. Low complexity: Surprisingly, on simple problems, the standard, non-thinking models actually outperformed the reasoning models. The LRMs were less accurate and wasted a lot of computational effort “overthinking” problems they should have solved easily.

  2. Medium complexity: This is where LRMs shine. When problems become moderately complex, the ability to generate a thinking process gives them a clear advantage over standard models.

  3. High complexity: When the puzzles become too hard, something dramatic happens: both models fail completely. While the thinking models can handle a bit more complexity before failing, they inevitably hit a wall and their performance collapses to zero.

As the paper states, these models “fail to develop generalizable problem-solving capabilities, with accuracy ultimately collapsing to zero beyond certain complexities.”
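
The appeal of puzzles over benchmark math is easy to reproduce in spirit. The sketch below (Python, not Apple's actual evaluation harness) shows why Tower of Hanoi makes such a convenient testbed: difficulty is a single knob, the number of disks, and a candidate solution can be checked exactly against the rules rather than against answers a model may have memorized.

```python
# Minimal sketch, not the paper's code: Tower of Hanoi as a controllable-
# complexity puzzle. The optimal solution for n disks has 2**n - 1 moves,
# and any proposed move list can be verified mechanically.

def hanoi_solution(n: int, src: str = "A", aux: str = "B", dst: str = "C") -> list[tuple[str, str]]:
    """Return the optimal move sequence (from_peg, to_peg) for n disks."""
    if n == 0:
        return []
    return (hanoi_solution(n - 1, src, dst, aux)
            + [(src, dst)]
            + hanoi_solution(n - 1, aux, src, dst))

def is_valid_solution(n: int, moves: list[tuple[str, str]]) -> bool:
    """Simulate the moves and check both the rules and the final state."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}  # peg A holds disks n..1
    for src, dst in moves:
        if not pegs[src]:
            return False                       # moving from an empty peg
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                       # larger disk placed on a smaller one
        pegs[dst].append(disk)
    return pegs["C"] == list(range(n, 0, -1))  # all disks end on the target peg

# Difficulty sweep: a model's proposed move list for each n would be checked
# the same way; here we only sanity-check the reference solution.
for n in range(1, 11):
    moves = hanoi_solution(n)
    assert len(moves) == 2**n - 1 and is_valid_solution(n, moves)
    print(f"n={n:2d}: optimal solution has {len(moves)} moves")
```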

Perhaps the most fascinating discovery is how the reasoning models fail. You would expect that as a problem gets harder, the AI would “think” more, using more of its computational budget. And it does—but only up to a point.

The research reveals a counterintuitive scaling limit. When a problem approaches the “collapse” point, the LRM starts to reduce its reasoning effort, spending fewer tokens on thinking despite the increasing difficulty. It’s as if the model recognizes the task as too hard and simply gives up before it even starts, even with an adequate budget to keep trying. This suggests a fundamental limitation in their ability to scale their reasoning effort with a problem’s difficulty.
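
For readers who want to probe the same effect, a minimal sketch of the measurement follows. The `ask_reasoning_model` helper is a hypothetical placeholder rather than a real API; any client that reports how many "thinking" tokens a model spent on a prompt would do.

```python
# Sketch of the token-scaling measurement described above.
# `ask_reasoning_model` is a placeholder, not a real library call.

def ask_reasoning_model(prompt: str) -> tuple[str, int]:
    """Placeholder: return (answer_text, reasoning_tokens_used)."""
    raise NotImplementedError("wire up your own model client here")

def token_scaling_curve(make_prompt, complexities):
    """Record reasoning-token spend at each difficulty level.

    The paper's counterintuitive result is that this curve rises with
    difficulty and then falls as the problem nears the collapse point,
    even though token budget remains available.
    """
    curve = []
    for c in complexities:
        _, thinking_tokens = ask_reasoning_model(make_prompt(c))
        curve.append((c, thinking_tokens))
    return curve
```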

Image credit: Apple

Failure to follow a recipe

What if you made it even easier for the AI? What if you gave it the exact, step-by-step algorithm to solve the puzzle? Surely, a true reasoning machine could just follow the instructions.

Strikingly, the researchers found this wasn’t the case.

“Even when we provide the algorithm in the prompt—so that the model only needs to execute the prescribed steps—performance does not improve, and the observed collapse still occurs at roughly the same point.”

This is the most damning evidence against the idea that these models “reason” in a human-like way. Their inability to execute a simple, explicit set of logical rules shows that their success relies more on recognizing familiar patterns than on genuine, symbolic manipulation. The model’s inconsistent performance across different puzzle types further supports this, suggesting its ability is tied to the examples it has memorized from the web, not a general problem-solving skill.
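
To make the experiment concrete, here is one way such an "algorithm in the prompt" could look. The wording is an illustrative paraphrase under the assumptions above, not the paper's exact prompt.

```python
# Illustrative only: handing the model the full recursive algorithm so that
# it only needs to execute the prescribed steps, as in the finding quoted above.

ALGORITHM_PROMPT = """You must solve Tower of Hanoi with {n} disks on pegs A, B, C.
Follow this algorithm exactly; do not invent your own strategy:

  solve(k, source, spare, target):
      if k == 0: return
      solve(k - 1, source, target, spare)
      move the top disk from source to target
      solve(k - 1, spare, source, target)

Execute solve({n}, A, B, C) and output the resulting list of moves,
one "X -> Y" per line."""

def algorithm_execution_prompt(n: int) -> str:
    """Build the prompt for an n-disk instance."""
    return ALGORITHM_PROMPT.format(n=n)
```

Even with the recipe spelled out like this, the paper reports that accuracy still collapses at roughly the same complexity.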


Tags: AI, Apple
