Dataconomy
Standard AI models fail simple math without specialized training

The successful model developed its own internal geometric language, using wave-like patterns and Minkowski sums to organize arithmetic operations

by Aytun Çelebi
December 30, 2025
in Research

Large language models have struggled with multi-digit multiplication without specialized training methods, despite their ability to handle complex coding and reasoning tasks, according to a recent study.

Research published on the arXiv preprint server by the University of Chicago’s Xiaoyan Bai and Chenhao Tan, along with collaborators from MIT, Harvard University, the University of Waterloo, and Google DeepMind, identified the reasons for this limitation and found solutions.

Standard large language models achieved less than 1% accuracy when multiplying two four-digit numbers, even when scaled up to 12 layers. These models converged on a “local optimum”: they failed to store and retrieve the intermediate computations that multi-digit multiplication requires, a type of long-range dependency.
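To make the long-range dependency concrete, here is a minimal sketch based on ordinary schoolbook multiplication, not on the study's code. The function name `pairs_for_output_digit` is hypothetical. It lists which digit pairs feed each position of the product, before carries are added.

```python
def pairs_for_output_digit(k, n_digits):
    """Return index pairs (i, j) with i + j == k.

    Index 0 is the least significant digit. In schoolbook multiplication,
    position k of the product sums a[i] * b[j] over exactly these pairs,
    plus whatever carry arrives from position k - 1.
    """
    return [(i, k - i) for i in range(n_digits) if 0 <= k - i < n_digits]


if __name__ == "__main__":
    # For two four-digit numbers, positions 0..6 receive digit-pair products.
    for k in range(7):
        print(k, pairs_for_output_digit(k, 4))
```

The middle position (k = 3) draws on four digit pairs that span both operands, and it also inherits a carry from every lower position. That is the kind of intermediate value a model has to keep available across many tokens.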

Conversely, a model trained with the Implicit Chain of Thought (ICoT) method achieved 100% accuracy. ICoT gradually removes intermediate reasoning steps during training, which pushes the model to internalize the reasoning process, and the resulting model was able to track long-range dependencies. The research team decoded intermediate values, such as running sums, from the ICoT model’s internal states, something that was not possible with the model trained by standard fine-tuning.
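The idea of gradually removing reasoning steps can be sketched as a simple curriculum over training targets. This is an illustration only: the function name `icot_training_target`, the linear schedule, and the choice to drop tokens from the front of the reasoning are all assumptions, not details confirmed by the article.

```python
def icot_training_target(cot_tokens, answer_tokens, step, total_steps):
    """Build the target sequence for one training step.

    Assumption: reasoning tokens are dropped from the front on a linear
    schedule, so the target shrinks from "full reasoning + answer" at
    step 0 to "answer only" at the final step.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    fraction_done = min(max(step / total_steps, 0.0), 1.0)
    n_removed = int(fraction_done * len(cot_tokens))
    return cot_tokens[n_removed:] + answer_tokens


if __name__ == "__main__":
    cot = ["p0", "p1", "p2", "p3"]   # placeholder reasoning tokens
    answer = ["ans"]                 # placeholder answer token
    for step in (0, 2, 4):
        print(step, icot_training_target(cot, answer, step, total_steps=4))
```

By the end of such a schedule the model sees only the question and the answer, so any intermediate computation has to happen in its hidden states rather than in written-out tokens.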

The ICoT model organized its attention into distinct pathways, computing products of digit pairs in early layers and storing them in specific locations for retrieval in later layers. This created an efficient internal structure for multiplication. The study also found that the ICoT model represented operations using elegant structures, encoding digits as wave-like patterns (Fourier bases) and organizing arithmetic spatially. During multiplication of digit pairs, the model naturally utilized a geometric operation called a Minkowski sum, which was not explicitly programmed by the researchers.
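For readers unfamiliar with the two terms, the sketch below shows what a Fourier-style encoding of a digit and a Minkowski sum look like in general. It does not reproduce the model's learned representations. The helper names and the choice of periods (2, 5, 10) are assumptions made for illustration.

```python
import math


def fourier_features(digit, periods=(2, 5, 10)):
    """Encode a digit as cosine/sine pairs, one pair per period.

    The periods are illustrative; the article does not say which
    frequencies the model uses.
    """
    features = []
    for period in periods:
        angle = 2 * math.pi * digit / period
        features.extend([math.cos(angle), math.sin(angle)])
    return features


def minkowski_sum(set_a, set_b):
    """Minkowski sum of two point sets: every a + b with a in A and b in B."""
    return {
        tuple(x + y for x, y in zip(a, b))
        for a in set_a
        for b in set_b
    }


if __name__ == "__main__":
    print(fourier_features(3))
    print(minkowski_sum({(0, 0), (1, 0)}, {(0, 0), (0, 1)}))
```

The encoding is periodic, so digits that differ by a multiple of 10 land on the same point. The Minkowski sum of two line segments' endpoints gives the corners of a square, a small example of how combining two sets point by point produces a new geometric shape.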

Researchers achieved 99% accuracy in a two-layer model by introducing a modified training objective that taught the model to track running sums at each step, thereby carrying intermediate values and partial products forward. This addition enabled the model to develop mechanisms similar to ICoT’s, including storing and retrieving partial products and tracking multiple digit pairs simultaneously.
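As a rough picture of what “tracking running sums at each step” could mean, the sketch below computes, for each output position, the sum of the relevant digit-pair products plus the incoming carry. The exact form of the study's training objective is not given here, so the function `running_sums` and its definition of a running sum are assumptions.

```python
def running_sums(a, b):
    """Return (sums, digits) for the product a * b, least significant first.

    sums[k] is the digit-pair products for position k plus the carry from
    position k - 1 (an assumed definition of the "running sum").
    digits[k] is the resulting output digit, sums[k] % 10.
    """
    a_digits = [int(d) for d in reversed(str(a))]
    b_digits = [int(d) for d in reversed(str(b))]
    n_out = len(a_digits) + len(b_digits)

    sums, digits = [], []
    carry = 0
    for k in range(n_out):
        pair_total = sum(
            a_digits[i] * b_digits[k - i]
            for i in range(len(a_digits))
            if 0 <= k - i < len(b_digits)
        )
        total = pair_total + carry
        sums.append(total)
        digits.append(total % 10)
        carry = total // 10
    return sums, digits


if __name__ == "__main__":
    sums, digits = running_sums(12, 34)
    print(sums)    # intermediate values a model would need to carry forward
    print(digits)  # output digits, least significant first
```

Reading the digits from least to most significant reconstructs the product, while the sums are the intermediate values that an auxiliary objective of this kind would ask the model to predict along the way.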

Chenhao Tan said, “Our research is trying to chart that terrain.” The study highlights that architectural insights and training techniques can overcome obstacles that scaling alone cannot address, emphasizing the importance of built-in guidance in advancing AI capabilities.

The findings illuminate fundamental aspects of how large language models learn and “think,” with the long-range dependency problem extending beyond arithmetic to other sequential tasks in language modeling.



Tags: AI, math
