Dataconomy

New research shows AI logic survives even when its memory is erased

The discovery reveals that AI can reason without recalling training examples verbatim.

by Kerem Gülen
November 10, 2025
in Research

In a preprint paper released in late October, Goodfire.ai researchers describe isolating the pathways responsible for memorization and reasoning inside AI neural networks.

The research demonstrates a clear separation of these functions within large language models. When memorization pathways were removed, models lost 97 percent of their ability to recite verbatim training data. Their “logical reasoning” ability, however, remained largely intact.

Researchers ranked weight components from high to low based on “curvature.” In layer 22 of the Allen Institute for AI’s OLMo-7B language model, the bottom 50 percent of weight components showed 23 percent higher activation on memorized data, while the top 10 percent showed 26 percent higher activation on general, non-memorized text.

This mechanistic split allowed for surgical removal of memorization while preserving other capabilities: deleting the bottom-ranked components eliminated memorization, while the retained top-ranked components continued to handle problem-solving.
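The editing procedure can be pictured as decomposing a layer’s weight matrix into components, scoring each one, and zeroing out the low scorers. The sketch below is a toy illustration, not the paper’s actual method: it uses an SVD for the decomposition and random numbers in place of the real curvature scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weight matrix for one layer, decomposed into rank-one components via SVD.
W = rng.normal(size=(64, 64))
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Placeholder per-component curvature scores (assumption: the paper derives
# these from the loss landscape; any scalar ranking fits the same recipe).
curvature = rng.random(len(s))
order = np.argsort(curvature)            # low curvature first

# "Surgical removal": zero out the bottom 50% of components by curvature,
# then rebuild the weight matrix from the survivors.
keep = order[len(order) // 2:]
mask = np.zeros(len(s))
mask[keep] = 1.0
W_edited = (U * (s * mask)) @ Vt

print(W_edited.shape)  # same shape as W, but half its components removed
```

The edited matrix has the same shape as the original, so it drops back into the network unchanged; only the contribution of the removed components is gone.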

Arithmetic operations appear to share neural pathways with memorization rather than logical reasoning. Removing memorization circuits caused mathematical performance to plummet to 66 percent, while logical tasks remained nearly untouched. This may explain why AI models struggle with math without external tools, relying on memorized facts like “2+2=4” rather than computation.

AI “reasoning” encompasses abilities like evaluating true/false statements and following if-then rules, which survived memory removal. This differs from deeper “mathematical reasoning” needed for proofs or novel problem-solving, which current AI models struggle with even with intact pattern-matching abilities.

Future development of these information removal techniques could enable AI companies to remove copyrighted content, private information, or harmful memorized text from neural networks without destroying transformative task performance. However, researchers state their method “cannot guarantee complete elimination of sensitive information” due to the distributed nature of information storage in neural networks.

Understanding this distinction involves the “loss landscape,” a visualization of an AI model’s prediction accuracy based on internal settings or “weights.” “Loss” measures errors, with low loss indicating few errors. The “landscape” maps error rates for all possible setting combinations. During training, AI models adjust weights to minimize errors, effectively “rolling downhill” in this landscape.
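The “rolling downhill” picture corresponds to ordinary gradient descent. A minimal one-weight example, assuming a simple quadratic loss with a single valley:

```python
# Loss landscape with one valley at w = 3; "loss" measures prediction error.
def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0                      # initial weight setting
for _ in range(100):
    w -= 0.1 * grad(w)       # step downhill against the gradient

print(w)                     # converges toward the valley at w = 3
```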

Researchers analyzed the “curvature” of loss landscapes, measuring the sensitivity of model performance to small changes in neural network weights. High curvature indicates sharp peaks and valleys, meaning small changes have significant effects. Low curvature signifies flat plains where changes have minimal impact. These curvature values were used to rank weight components.
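Curvature in this sense is the second derivative of the loss with respect to a weight, which can be estimated numerically. A toy sketch (the two quadratic losses are illustrative assumptions, not the paper’s models):

```python
def loss_sharp(w):
    return 50.0 * w ** 2     # sharp valley: small nudges move the loss a lot

def loss_flat(w):
    return 0.01 * w ** 2     # flat plain: small nudges barely matter

def curvature(f, w, h=1e-4):
    # Central finite-difference estimate of the second derivative f''(w).
    return (f(w + h) - 2.0 * f(w) + f(w - h)) / h ** 2

print(curvature(loss_sharp, 0.0))  # ~100: high curvature, high sensitivity
print(curvature(loss_flat, 0.0))   # ~0.02: low curvature, low sensitivity
```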

Using K-FAC (Kronecker-Factored Approximate Curvature), scientists found that individual memorized facts create sharp, idiosyncratic spikes in the landscape that flatten when averaged. In contrast, reasoning abilities, relied upon by many different inputs, maintain consistent, moderate curves.
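K-FAC makes this tractable by approximating a layer’s enormous curvature matrix as a Kronecker product of two small matrices built from the layer’s inputs and output gradients. A minimal sketch with random stand-in data (real K-FAC collects these statistics during training):

```python
import numpy as np

rng = np.random.default_rng(1)

# For a linear layer y = W a, K-FAC approximates the curvature of vec(W) as
#   F  ≈  E[a aᵀ] ⊗ E[g gᵀ]
# where a are layer inputs and g are gradients at the layer output.
n, d_in, d_out = 1000, 8, 4
A = rng.normal(size=(n, d_in))       # stand-in activations
G = rng.normal(size=(n, d_out))      # stand-in output gradients

A_factor = A.T @ A / n               # (d_in, d_in) input-covariance factor
G_factor = G.T @ G / n               # (d_out, d_out) gradient-covariance factor

# The full approximate curvature matrix, materialized here only because the
# toy layer is tiny; K-FAC's point is that you never need to form it.
F_approx = np.kron(A_factor, G_factor)
print(F_approx.shape)                # (d_in * d_out, d_in * d_out)
```

Storing two small factors instead of one matrix over all weight pairs is what lets curvature be measured at the scale of 7-billion-parameter models.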

Researchers indicate that “directions that implement shared mechanisms used by many inputs add coherently and remain high-curvature on average,” describing reasoning pathways. Memorization, conversely, uses “idiosyncratic sharp directions associated with specific examples” that appear flat when averaged.

The technique was tested on multiple AI systems, including Allen Institute’s OLMo-2 family (7 billion- and 1 billion-parameter versions) and custom 86 million-parameter Vision Transformers (ViT-Base models) on ImageNet. They also validated findings against existing methods like BalancedSubnet.

Selectively removing low-curvature weight components resulted in memorized content recall dropping to 3.4 percent from nearly 100 percent. Logical reasoning tasks maintained 95 to 106 percent of baseline performance.

Logical tasks included Boolean expression evaluation, logical deduction puzzles, object tracking, BoolQ for yes/no reasoning, Winogrande for common sense inference, and OpenBookQA for science questions. Mathematical operations and closed-book fact retrieval, sharing pathways with memorization, dropped to 66 to 86 percent performance after editing. Arithmetic proved particularly brittle, with calculations failing even with identical reasoning chains after low-curvature components were removed.

The team suggested this may be because “arithmetic problems themselves are memorized at the 7B scale, or because they require narrowly used directions to do precise calculations.” Open-book question answering, which relies on provided context, maintained nearly full performance.

Mechanism separation varied by information type: common facts like country capitals showed minimal change after editing, while rare facts like company CEOs dropped 78 percent, suggesting that models allocate neural resources differently depending on how often information appears in training.

The K-FAC technique outperformed existing memorization removal methods, achieving 16.1 percent memorization on unseen historical quotes versus 60 percent for BalancedSubnet. Vision transformers showed similar patterns, with removing memorization pathways restoring 66.5 percent accuracy on previously mislabeled images.

Researchers acknowledge limitations; removed memories might return with further training, as current unlearning methods primarily suppress information. The reason for math’s fragility upon memorization removal is unclear, as is whether certain complex capabilities are misidentified as memorization. Additionally, mathematical tools for measuring the model’s “landscape” can be unreliable at extremes.



LATEST NEWS

Don’t miss: The Game Awards to be live on Amazon Prime Video

Collins Dictionary names “vibe coding” the 2025 word of the year

Google Photos AI expands to 100+ countries

Masayoshi Son trades Nvidia profits for a $30B AI spending spree

Nintendo rolls out quality-of-life updates for both Switch generations

YouTube launches on-screen AI chat that explains videos in real time

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy

Follow Us

  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Glossary
    • Whitepapers
  • Newsletter
  • + More
    • Conversations
    • Events
    • About
      • About
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
No Result
View All Result
Subscribe

This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy Policy.