Microsoft And Tsinghua’s X-Coder Hits 62.9% Pass Rate On LiveCodeBench V5

The research finds that adding more distinct tasks improves performance more than adding more solutions per task.

Researchers from Tsinghua University and Microsoft have developed X-Coder, a 7-billion-parameter AI coding model trained exclusively on synthetic data. A paper describing X-Coder was posted on arXiv on January 11.

X-Coder achieved a 62.9% pass rate on LiveCodeBench v5 and a 55.8% pass rate on LiveCodeBench v6. That puts it ahead of models such as DeepCoder-14B-Preview and AReaL-boba2-14B, each of which has 14 billion parameters, twice as many as X-Coder.

The training data came from SynthSmith, a data synthesis pipeline that generates programming tasks, solutions, and test cases without relying on human-written examples. The system begins by extracting coding-relevant features, including algorithms, data structures, and optimization techniques, from an initial pool of approximately 27,000 code examples. An evolutionary process then expands this pool to nearly 177,000 entries.

Quality control in SynthSmith involves a dual-verification strategy. The system determines correct test outputs through majority voting among multiple candidate solutions. The best solution is then validated against a holdout test set.
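The dual-verification idea can be sketched in a few lines of Python. This is an illustrative sketch only, not SynthSmith's actual code: the function names (`vote_outputs`, `pass_count`, `dual_verify`), the 50/50 split between visible and holdout tests, and the toy "sum the integers" task are all assumptions made for the example.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Solution = Callable[[str], str]   # a candidate program: input text -> output text
Test = Tuple[str, str]            # (input, expected output)


def vote_outputs(solutions: Sequence[Solution], inputs: Sequence[str]) -> List[Test]:
    """Step 1: label each input with the output most candidates agree on."""
    labeled = []
    for x in inputs:
        outputs = []
        for solve in solutions:
            try:
                outputs.append(solve(x))
            except Exception:
                continue  # a crashing candidate casts no vote
        if outputs:
            winner, _ = Counter(outputs).most_common(1)[0]
            labeled.append((x, winner))
    return labeled


def pass_count(solve: Solution, tests: Sequence[Test]) -> int:
    """Count how many tests a candidate passes; a crash counts as a failure."""
    passed = 0
    for x, expected in tests:
        try:
            if solve(x) == expected:
                passed += 1
        except Exception:
            pass
    return passed


def dual_verify(solutions: Sequence[Solution], inputs: Sequence[str],
                holdout_fraction: float = 0.5) -> Optional[Tuple[Solution, List[Test]]]:
    """Step 2: pick the best candidate on visible tests, then check it on the holdout."""
    tests = vote_outputs(solutions, inputs)
    split = int(len(tests) * (1 - holdout_fraction))  # assumed split, not from the paper
    visible, holdout = tests[:split], tests[split:]
    best = max(solutions, key=lambda s: pass_count(s, visible))
    if holdout and pass_count(best, holdout) == len(holdout):
        return best, tests
    return None  # discard the task if the best candidate fails the holdout


# Toy task (hypothetical): print the sum of space-separated integers.
def good_a(s: str) -> str:
    return str(sum(int(t) for t in s.split()))

def good_b(s: str) -> str:
    total = 0
    for t in s.split():
        total += int(t)
    return str(total)

def buggy(s: str) -> str:
    return str(sum(abs(int(t)) for t in s.split()))  # mishandles negatives

result = dual_verify([good_a, good_b, buggy], ["1 2 3", "-1 5", "10", "-4 -6"])
```

In this toy run the two correct candidates outvote the buggy one on every input with a negative number, so the voted labels are correct and the selected solution clears the holdout tests.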

The research indicated that task variety in training data contributes more to competitive programming performance than model size or solution quantity. Experiments showed that increasing the number of distinct tasks was more effective than adding multiple solutions per task.

A dataset with 64,000 different tasks, each with one solution, outperformed datasets with fewer tasks but more solutions per problem. Pass rates rose with task count: from 43.7% with 32,000 tasks to 51.3% with 64,000 tasks, then 57.2% with 128,000 tasks, and 62.7% with 192,000 tasks. The supervised fine-tuning phase reached 60.3%, and reinforcement learning added a further 4.6 percentage points.
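To make the scaling trend easier to read, the short Python snippet below computes the gain at each step from the four figures quoted above. It only does arithmetic on the reported numbers; the variable and function names are made up for this illustration, and the last step grows the task count by 1.5x rather than doubling it.

```python
# (number of distinct tasks, pass rate in %) as quoted in the article
results = [(32_000, 43.7), (64_000, 51.3), (128_000, 57.2), (192_000, 62.7)]


def stepwise_gains(points):
    """Return (tasks_before, tasks_after, gain in percentage points) per step."""
    gains = []
    for (n_prev, p_prev), (n_next, p_next) in zip(points, points[1:]):
        gains.append((n_prev, n_next, round(p_next - p_prev, 1)))
    return gains


gains = stepwise_gains(results)
total_gain = round(results[-1][1] - results[0][1], 1)

for n_prev, n_next, gain in gains:
    print(f"{n_prev:>7,} -> {n_next:>7,} tasks: +{gain} points")
print(f"total: +{total_gain} points")
```

The steps work out to gains of 7.6, 5.9, and 5.5 percentage points, for a total of 19.0 points between 32,000 and 192,000 tasks.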

The synthetic training approach helps mitigate benchmark contamination concerns. A reference model, Qwen3-8B, showed a 30-point performance decrease between older and newer LiveCodeBench versions. X-Coder exhibited a smaller decline of 17.2 points, suggesting reduced memorization of benchmark problems.

The code for SynthSmith is available on GitHub, and the researchers say they intend to release the model weights. The work comes as the AI industry increasingly turns to synthetic data to make up for the limited supply of available training material. Microsoft has previously developed SynthLLM for broader synthetic data generation.

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
