Dataconomy

Apple builds an AI “engineering team” that finds and fixes bugs on its own

The company’s new ADE-QVAET model achieved 98.08% accuracy in predicting buggy code regions. An agentic AI tool now generates entire test plans with 94.8% accuracy, cutting testing time by 85%. SWE-Gym trains AI agents to fix code in real-world conditions, solving 72.5% of tasks correctly.

by Kerem Gülen
October 17, 2025
in Research

Apple’s AI researchers have quietly published three new studies that pull back the curtain on a major new ambition: automating the most tedious and critical parts of software development. The papers, published on Apple’s Machine Learning Research blog, detail new AI systems that can predict where bugs are likely to appear, automatically write entire test plans, and even fix the broken code themselves. This matters because it’s not just another “AI writes code” demo. Apple is building a suite of specialized AI quality engineers to find and fix flaws before they ever reach your phone or computer, which could lead to massive gains in productivity and (hopefully) more stable software.

Paper 1: The AI bug predictor

The first study, “Software Defect Prediction using Autoencoder Transformer Model,” from researchers Seshu Barma, Mohanakrishnan Hariharan, and Satish Arvapalli, tackles the problem of “buggy” code. Instead of having an AI read millions of lines of code—a process prone to AI “hallucinations”—they built a different kind of tool.

Their model, ADE-QVAET, acts less like a code reviewer and more like a data analyst. It doesn’t read the code itself. Instead, it analyzes metrics about the code, such as its complexity, size, and structure. It’s trained to find the hidden patterns in these metrics that reliably predict where bugs are most likely to be hiding.
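To make "metrics about the code" concrete, here is a minimal sketch of how size and structure metrics might be extracted from a piece of Python source. It is not the ADE-QVAET pipeline, and the article does not say which metrics the paper uses; the `code_metrics` function and the three metrics it returns are assumptions chosen for illustration.

```python
import ast

def code_metrics(source: str) -> dict:
    """Compute a few simple static metrics for a Python source string.

    Hypothetical feature set: the paper's actual inputs are not listed here.
    """
    tree = ast.parse(source)
    branch_nodes = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)
    nodes = list(ast.walk(tree))
    return {
        # Size: number of non-blank lines
        "lines": len([ln for ln in source.splitlines() if ln.strip()]),
        # Structure: number of function definitions
        "functions": sum(isinstance(n, ast.FunctionDef) for n in nodes),
        # Rough complexity proxy: number of branching constructs
        "branches": sum(isinstance(n, branch_nodes) for n in nodes),
    }

sample = """
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
"""

metrics = code_metrics(sample)
print(metrics)
```

A defect predictor of this kind would take rows of such numbers, one per file or function, as input rather than the source text itself.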

The reported results are strong. On a standard dataset for bug prediction, the model achieved 98.08% accuracy. It also scored high on precision and recall: high precision means few “false positives” that waste developers’ time, and high recall means few real bugs are missed.
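The three scores measure different things, which a short calculation shows. The sketch below computes them from confusion-matrix counts; the counts are invented for illustration and are not taken from the paper.

```python
def classification_scores(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, and recall from confusion-matrix counts.

    tp: buggy modules flagged as buggy     fp: clean modules flagged as buggy
    fn: buggy modules missed               tn: clean modules left alone
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that were right
        "precision": tp / (tp + fp),     # share of flagged modules that were really buggy
        "recall": tp / (tp + fn),        # share of buggy modules that were caught
    }

# Hypothetical counts: 1,000 modules, 50 of them actually buggy.
scores = classification_scores(tp=40, fp=10, fn=10, tn=940)
print(scores)
```

With these made-up counts accuracy is 98% while precision and recall are both 80%, which is why defect-prediction results are usually reported with all three numbers rather than accuracy alone.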

Paper 2: The automated quality engineer

Finding bugs is great, but what about the mountain of paperwork that comes with software testing? The second study, “Agentic RAG for Software Testing,” addresses this head-on. The researchers note that quality engineers spend 30-40% of their time just creating “foundational testing artifacts”—a corporate term for test plans, cases, and scripts.

Their solution is an AI agent that does this work automatically. The system reads the project’s requirements and business logic, then autonomously generates the entire suite of testing documents. This system keeps full “traceability,” meaning it logs exactly which test case corresponds to which business requirement.
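Traceability of this kind is essentially a mapping from requirements to the tests that cover them. The following sketch shows one way to record it; the `GeneratedTest` type, the ID formats, and the sample requirements are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GeneratedTest:
    case_id: str
    requirement_id: str   # the business requirement this test was derived from
    description: str

def traceability_matrix(cases):
    """Group test case IDs under the requirement each one covers."""
    matrix = {}
    for case in cases:
        matrix.setdefault(case.requirement_id, []).append(case.case_id)
    return matrix

def uncovered(requirements, matrix):
    """Return requirements that no generated test traces back to."""
    return [req for req in requirements if req not in matrix]

# Hypothetical requirements and generated tests, for illustration only.
requirements = ["REQ-1", "REQ-2", "REQ-3"]
cases = [
    GeneratedTest("TC-1", "REQ-1", "Invoice total includes tax"),
    GeneratedTest("TC-2", "REQ-1", "Invoice total handles zero items"),
    GeneratedTest("TC-3", "REQ-2", "Expense report requires approver"),
]

matrix = traceability_matrix(cases)
print(matrix)
print(uncovered(requirements, matrix))
```

Keeping the link in both directions makes it possible to check that every requirement has at least one test and that every test has a reason to exist.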

The impact here is measured in time and money. The system showed a remarkable 94.8% accuracy in its generated tests. In validation projects, it led to an 85% reduction in the testing timeline and an 85% improvement in test suite efficiency. For one project, that meant accelerating the go-live date by a full two months.


Paper 3: The AI ‘gym’ that teaches code-fixing

The third and most ambitious study is “Training Software Engineering Agents and Verifiers with SWE-Gym.” This paper asks the logical next question: Why just find bugs when you can fix them?

To do this, the team built a “gym” for AI agents. This training environment, SWE-Gym, is a sandbox built from 2,438 real-world Python tasks pulled from 11 open-source projects. Each task comes with its own executable environment and test suite. This allows an AI agent to practice the full developer workflow: read the bug report, write the code to fix it, and then run the tests to see if the fix actually worked (and didn’t break anything else).
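The read, fix, and test loop can be sketched in a few lines. This is a toy, in-process stand-in and not SWE-Gym's API: the `Task` type, `run_tests`, `attempt`, and the string-replacing `toy_agent` are all hypothetical, and real tasks run in their own executable environments rather than through `exec`.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    issue: str                       # the bug report the agent reads
    buggy_source: str                # the code as it stands before the fix
    tests: Callable[[dict], None]    # raises AssertionError if the code is wrong

def run_tests(source: str, tests: Callable[[dict], None]) -> bool:
    """Execute the source in a fresh namespace and run the task's tests on it."""
    namespace = {}
    try:
        exec(source, namespace)
        tests(namespace)
    except Exception:
        return False
    return True

def attempt(task: Task, propose_fix: Callable[[str, str], str]) -> bool:
    """One episode: the agent proposes patched source, the tests decide success."""
    patched = propose_fix(task.issue, task.buggy_source)
    return run_tests(patched, task.tests)

def mean_tests(ns: dict) -> None:
    assert ns["mean"]([2, 4]) == 3
    assert ns["mean"]([5]) == 5

task = Task(
    issue="mean() returns values that are too small",
    buggy_source="def mean(xs):\n    return sum(xs) / (len(xs) + 1)\n",
    tests=mean_tests,
)

def toy_agent(issue: str, source: str) -> str:
    # Stand-in for a model: a hard-coded textual patch.
    return source.replace("(len(xs) + 1)", "len(xs)")

before = run_tests(task.buggy_source, task.tests)
after = attempt(task, toy_agent)
print(before, after)
```

The pass or fail signal from the test suite is what makes such an environment usable for training: it gives an automatic check on whether a proposed fix worked.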

The training paid off. AI agents trained in this “gym” correctly solved 72.5% of the buggy tasks, a result that outperformed previous benchmarks by more than 20 percentage points.

These are specialized tools, not a general-purpose AI coder. The authors of the automated-testing study (Paper 2) note that their work focused only on specific “Employee Systems, Finance, and SAP environments,” meaning it’s not a one-size-fits-all solution just yet. Similarly, the bug-fixing “gym” covers only Python tasks.

What these three studies show is a clear, multi-pronged strategy. Apple isn’t just trying to build one “do-it-all” AI. Instead, they’re building a team of AI specialists: a bug-predicting analyst, a test-writing “paper-pusher,” and a bug-fixing “mechanic.” This approach could fundamentally change the economics of software development, leading to faster timelines, lower costs, and more reliable products.


Tags: AI, Apple, Coding


COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
