Dataconomy

What if small AI models could suddenly code better than some giant ones?

Researchers from MIT and collaborating institutions have developed a method that uses Sequential Monte Carlo to guide LLMs toward generating structurally valid and semantically accurate text, particularly code.

by Aytun Çelebi
May 14, 2025
in Research

Large language models (LLMs) are increasingly adept at generating computer code, promising to accelerate software development. However, this speed advantage is only beneficial if the generated code is correct, adheres to the programming language’s rules, and doesn’t lead to system crashes. A new approach developed by researchers at MIT and collaborating institutions now offers a way to automatically guide LLMs to produce text, particularly code, that is both structurally valid and semantically accurate, all while improving computational efficiency.

Balancing speed, structure, and meaning in AI-generated code

Programmers are turning to LLMs as powerful assistants capable of drafting code snippets, functions, and even entire modules in seconds. The catch? Ensuring this AI-generated code is usable. Code must rigidly follow the syntax of a specific programming language (its structure) and perform the intended task correctly (its meaning). Existing methods to enforce these constraints on LLMs often face a trade-off: they might distort the model’s intended output, thereby sacrificing accuracy, or they become too computationally intensive and slow for complex, real-world applications.

One common strategy involves generating a complete block of code and then validating it. If errors are found – a frequent occurrence – the entire process must be restarted, consuming significant time and computational resources. Another tactic is to check the output incrementally. While this can help ensure structural validity along the way, constant corrections can cause the code to drift from the user’s original intent, impacting its overall accuracy and usefulness.
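The difference between these two strategies can be seen in miniature with Python's own tooling: a whole-block check only answers yes or no after the fact, while a prefix check can flag a doomed continuation early. This is a toy illustration using the standard `codeop` module, not the researchers' actual checker:

```python
import codeop

def prefix_status(src: str) -> str:
    """Classify a partial Python program: 'valid' if it already compiles,
    'incomplete' if some continuation could still make it valid, and
    'invalid' if no continuation can repair it."""
    try:
        code = codeop.compile_command(src)
    except SyntaxError:
        return "invalid"   # e.g. "def f(:" -- unfixable, prune this path early
    return "valid" if code is not None else "incomplete"
```

A generate-then-validate loop only ever sees the first and third cases once the whole block is done; an incremental checker can use the "invalid" verdict mid-generation to abandon a path before wasting further compute on it.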

“It is much easier to enforce structure than meaning,” notes João Loula, an MIT graduate student and co-lead author of a paper on this new framework. “We can quickly check whether something is in the right programming language, but to check its meaning you have to execute the code. Our work is also about dealing with these different types of information.”

Probabilistic guidance with Sequential Monte Carlo

The method, developed by an international team including researchers from MIT, the Mila-Quebec Artificial Intelligence Institute, Johns Hopkins University, Yale University, ETH Zurich, and McGill University, introduces a new way to steer LLMs. Rather than correcting output after the fact, the architecture guides the LLM during generation, encouraging it to concentrate effort on the outputs most likely to be both valid and accurate. Unpromising avenues are discarded early, and this probabilistic approach yields a significant boost in computational efficiency.

The researchers achieve this using a powerful statistical technique called Sequential Monte Carlo (SMC). This method allows multiple parallel generation “threads” from the LLM to essentially compete with each other. The model dynamically allocates more computational resources to threads that appear more promising as they generate text.
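The competing-threads loop described here can be sketched in a few lines. In this toy version, `step_fn` stands in for one LLM decoding step and `weight_fn` for the validity/accuracy score; both are hypothetical placeholders, not the paper's actual components:

```python
import random

def smc_generate(step_fn, weight_fn, n_particles=8, n_steps=5):
    """Toy Sequential Monte Carlo over parallel generation threads.

    step_fn(prefix)   -> extended prefix (stands in for one LLM decoding step)
    weight_fn(prefix) -> nonnegative score (stands in for the validity and
                         accuracy weight described in the article)
    """
    particles = [""] * n_particles
    for _ in range(n_steps):
        particles = [step_fn(p) for p in particles]   # advance every thread
        weights = [weight_fn(p) for p in particles]
        if sum(weights) == 0:                         # every thread looks dead:
            weights = [1.0] * n_particles             # fall back to uniform
        # Resample: promising threads are cloned, weak ones dropped, so
        # compute concentrates on continuations likely to stay valid.
        particles = random.choices(particles, weights=weights, k=n_particles)
    return max(particles, key=weight_fn)
```

The resampling step is what makes the threads "compete": a thread with a high weight may be duplicated several times, while a zero-weight thread is silently abandoned.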

“We are not trying to train an LLM to do this,” adds Vikash Mansinghka, a principal research scientist at MIT and co-senior author. “Instead, we are engineering some knowledge that an expert would have and combining it with the LLM’s knowledge, which offers a very different approach to scaling than you see in deep learning.”

The core idea is to integrate expert knowledge into the LLM’s generation process. Each potential output path is assigned a “weight” that reflects its likelihood of being structurally correct (e.g., valid Python syntax) and semantically accurate (i.e., doing what the user wants). At each step of the generation, the model focuses its computational power on the paths with higher weights, effectively pruning those that are less likely to succeed.
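That weighting-and-pruning step might look like the following minimal sketch, where `syntax_ok` and `semantic_score` are hypothetical stand-ins for the structural and semantic checks the system supplies:

```python
def path_weight(syntax_ok: bool, semantic_score: float) -> float:
    """A path that is already structurally broken gets weight zero and will
    be pruned; otherwise its weight is its (hypothetical) semantic score."""
    return semantic_score if syntax_ok else 0.0

def prune(paths, weights, keep=4):
    """Focus compute on the highest-weight candidate paths, dropping the rest."""
    ranked = sorted(zip(weights, paths), reverse=True)
    return [p for w, p in ranked[:keep] if w > 0]
```
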


It’s akin to having an expert programmer looking over the LLM’s shoulder, offering guidance at each decision point while keeping the overall goal in mind. The user initially specifies their desired output structure, its intended meaning, and how the system should check the output. The new architecture then guides the LLM to fulfill these requirements efficiently.

“We’ve worked out the hard math so that, for any kinds of constraints you’d like to incorporate, you are going to get the proper weights,” Loula explains. “In the end, you get the right answer.” This sophisticated control ensures that the LLM doesn’t just produce plausible-sounding text, but text that is genuinely useful and correct within the specified constraints.

Putting it to the test

The efficacy of this framework was demonstrated across several challenging, real-world use cases. The researchers tasked LLMs with generating four distinct types of outputs:

  • Python computer code
  • SQL database queries
  • Molecular structures
  • Sequential plans for a robot to follow

When compared to existing approaches for controlling LLM outputs, the new method consistently performed with higher accuracy while demanding less computation. One of the most striking results came from Python code generation. The researchers’ architecture enabled a relatively small, open-source LLM to outperform a specialized, commercial closed-source model that was more than double its size in generating accurate and properly structured code.

“We are very excited that we can allow these small models to punch way above their weight,” Loula states, highlighting the efficiency and power unlocked by their approach.

The impact of this research extends far beyond making programmers’ lives easier. In the long run, this architecture could democratize access to complex AI-generated content. For example, business professionals with no coding expertise could potentially write complex queries in SQL (a database manipulation language) using only natural language prompts, with the system ensuring the SQL generated is both valid and accurately reflects their request.
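For the SQL case, the "valid" half of that guarantee can be approximated with standard tooling: asking the database engine to prepare, but not run, a statement against the schema. A minimal sketch using Python's built-in `sqlite3`, with a hypothetical `orders` table (the article does not specify the researchers' actual validation mechanism):

```python
import sqlite3

def sql_is_valid(query: str) -> bool:
    """Check whether a generated SQL statement parses against a schema by
    asking SQLite to build an EXPLAIN plan. Catches typos, bad syntax, and
    references to missing tables or columns before anything executes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
    try:
        conn.execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()
```

A guided generator could consult a check like this at each step; the semantic half, whether the query actually answers the user's question, is the harder problem the paper's weighting scheme addresses.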

“This work has implications beyond research. It could improve programming assistants, AI-powered data analysis, and scientific discovery tools by ensuring that AI-generated outputs remain both useful and correct,” says Loula. Mansinghka adds that the approach could enable machine-assisted data analysis systems where users can converse with software that accurately models the meaning of data and the questions being asked.

Timothy J. O’Donnell, an associate professor at McGill University who led the international team, also points to deeper connections: “One of the fundamental questions of linguistics is how the meaning of words, phrases, and sentences can be grounded in models of the world… LLMs, predicting likely token sequences, don’t address this problem. Our paper shows that, in narrow symbolic domains, it is technically possible to map from words to distributions on grounded meanings. It’s a small step towards deeper questions in cognitive science, linguistics, and artificial intelligence needed to understand how machines can communicate about the world like we do.”



Tags: AI, Coding
