Dataconomy

AI is learning to work like you and it’s getting faster every day

The research team designed a benchmark based on 170 tasks across three datasets—HCAST, RE-Bench, and a new suite of shorter software tasks called SWAA

by Kerem Gülen
March 19, 2025
in Research

Five years from now, AI might be autonomously completing software engineering tasks that would take a human a month to finish. That’s the prediction of a new study that introduces a metric called the 50%-task-completion time horizon: a measure of how long humans typically take to complete tasks that AI models can solve with a 50% success rate. And if current trends hold, AI is on track to automate increasingly complex work, from debugging code to conducting full-scale machine learning research.

AI’s growing time horizon

The study, conducted by the Model Evaluation & Threat Research (METR) group, suggests that the length of tasks AI can handle has been doubling roughly every seven months since 2019. Today’s frontier models, like Claude 3.7 Sonnet, have a time horizon of around 50 minutes: they succeed about half the time on tasks that take human professionals that long. Extrapolating this growth, AI could reach a one-month time horizon, meaning the ability to autonomously complete tasks that would take a human a month, between 2028 and 2031.
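
To see how a seven-month doubling time gets from 50 minutes to a month, the extrapolation can be sketched as a short calculation. The function name is ours, and the figure of 167 working hours for one month of human work is an assumption; the article does not say how the study defines a month.

```python
import math

def months_to_reach(target_minutes, current_minutes=50.0, doubling_months=7.0):
    """Months until the time horizon grows from current to target,
    assuming it keeps doubling at a constant rate."""
    doublings = math.log2(target_minutes / current_minutes)
    return doublings * doubling_months

# Assumption: one "month" of human work is about 167 working hours.
ONE_WORK_MONTH_MINUTES = 167 * 60

months = months_to_reach(ONE_WORK_MONTH_MINUTES)
print(f"{months:.1f} months, about {months / 12:.1f} years")
```

Under these assumptions the gap is a little under eight doublings, or roughly four and a half years, which falls inside the 2028 to 2031 window. A shorter or longer doubling time moves the date proportionally.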

This isn’t just about raw computational power. AI’s improved logical reasoning, tool use, and ability to adapt to mistakes are fueling the trend. Early AI systems would get stuck in loops or abandon problems too soon, but modern models are learning to persist and correct errors—critical traits for automation at scale.


How the study measured AI’s capabilities

The research team designed a benchmark based on 170 tasks across three datasets—HCAST, RE-Bench, and a new suite of shorter software tasks called SWAA. They timed human professionals completing these tasks and compared their performance to AI models spanning from 2019 to 2025. The results showed a clear trajectory:

  • In 2019, AI struggled with tasks that took more than a minute.
  • By 2023, AI was reliably solving 5–30-minute tasks.
  • Today, the best models can handle tasks approaching an hour.
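
One plausible way to turn per-task results into a single time horizon is to fit a logistic curve of success against the logarithm of human completion time, then read off where the curve crosses 50%. The sketch below does that with plain gradient ascent. The article does not describe the study's exact fitting procedure, so treat this as an illustration only; the function name and the ten data points are made up.

```python
import math

def fit_time_horizon(human_minutes, successes, lr=0.1, steps=20000):
    """Fit P(success) = sigmoid(a + b * x), where x is centered log2(minutes),
    and return (50% time horizon in minutes, slope b)."""
    xs = [math.log2(t) for t in human_minutes]
    mean_x = sum(xs) / len(xs)
    centered = [x - mean_x for x in xs]  # centering keeps the fit well conditioned
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(centered, successes):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        # Gradient ascent on the average log-likelihood.
        a += lr * grad_a / n
        b += lr * grad_b / n
    # Success probability is 0.5 where a + b * x = 0, i.e. x = -a / b.
    return 2 ** (mean_x - a / b), b

# Hypothetical results: human time per task (minutes) and whether the model succeeded.
minutes = [1, 2, 4, 8, 15, 30, 60, 120, 240, 480]
solved = [1, 1, 1, 1, 0, 1, 0, 1, 0, 0]

horizon, slope = fit_time_horizon(minutes, solved)
print(f"50% time horizon: about {horizon:.0f} minutes (slope {slope:.2f})")
```

A negative slope means success becomes less likely as tasks get longer. Repeating the fit for each model and plotting the horizons against release dates would give the kind of trend line the study reports.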

Interestingly, AI’s progress has been remarkably steady, even when tested against new challenges. The study found that the increase in time horizon remains consistent across different types of tasks, meaning AI isn’t just getting better at specific benchmarks—it’s improving across the board.




Are we headed for fully autonomous AI?

While the study confirms rapid AI progress, it also raises concerns. The same ability that allows AI to write complex software could also enable it to perform high-risk activities autonomously. The paper warns that as AI systems become capable of extended autonomous operation, new safety measures will be needed to guard against risks such as AI that replicates itself or autonomously develops hazardous materials.

Additionally, AI’s performance drops on “messier” real-world tasks—those requiring creativity, strategic thinking, or human collaboration. While AI excels at structured problems with clear objectives, it still struggles in unpredictable environments.

What’s next?

If AI’s progress continues at its current rate, it could reshape industries by automating work traditionally done by skilled professionals. The implications stretch beyond software development—fields like legal research, cybersecurity, and even scientific discovery could see AI playing a much larger role.

But will the trend hold? The study’s authors acknowledge that external factors—such as compute limitations or breakthroughs in AI training—could speed up or slow down progress. One thing is clear: AI isn’t just getting smarter. It’s learning how to work.


Featured image credit: Kerem Gülen/Midjourney

Tags: AI
