Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
  • AI
  • Tech
  • Cybersecurity
  • Finance
  • DeFi & Blockchain
  • Startups
  • Gaming
Dataconomy
  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI toolsNEW
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
Subscribe
No Result
View All Result
Dataconomy
No Result
View All Result

DeepSeek releases R1 model trained for $294,000 on 512 H800 GPUs

The model achieved competitive performance against higher-budget rivals, demonstrating efficiency in mathematics, programming, and problem-solving tasks.

byAytun Çelebi
September 19, 2025
in Artificial Intelligence
Home News Artificial Intelligence
Share on FacebookShare on TwitterShare on LinkedInShare on WhatsAppShare on e-mail
Google Preferred Source

The Chinese company DeepSeek AI has released its large language model, R1, which was trained for only $294,000 using 512 Nvidia H800 GPUs.

In a paper published in the journal Nature, the company detailed how it achieved this low cost by using a trial-and-error reinforcement learning method, allowing the model to achieve competitive performance against rivals with much larger budgets, like OpenAI.

How DeepSeek’s reinforcement learning method works

DeepSeek’s key innovation was to move away from the expensive, human-intensive process of creating annotated datasets. Traditional AI models for reasoning tasks are often trained on vast datasets where human experts provide step-by-step solutions to complex problems. Instead, DeepSeek developed an autonomous learning system that uses reinforcement learning to refine the model’s reasoning skills through a system of rewards and penalties.

Stay Ahead of the Curve!

Don't miss out on the latest insights, trends, and analysis in the world of data, technology, and startups. Subscribe to our newsletter and get exclusive content delivered straight to your inbox.

Researchers from Carnegie Mellon University, in an article accompanying the Nature paper, compared the process to a child learning to play a video game.

“As the child navigates their avatar through the game world, they learn through trial and error that some actions (such as collecting gold coins) earn points, whereas others (such as running into enemies) set their score back to zero. In a similar vein, DeepSeek-R1 was awarded a high score when it answered questions correctly and a low score when it gave wrong answers.”

This method was particularly effective for tasks in mathematics and programming, where answers can be definitively verified as right or wrong. The model would generate potential solutions, which were then evaluated by an automated scoring system. It would then iterate on its approach until it achieved the highest score, all without human intervention.

This efficient, self-directed process allowed the company to build a powerful AI system with a fraction of the investment required by its competitors.

Limitations and concerns about the model

While the reinforcement learning approach proved cost-effective, it also has some limitations. The model’s outputs often hide the underlying reasoning steps, making it difficult for a human to understand how it arrived at a conclusion. When asked to provide its reasoning, R1 generated extremely long and hard-to-read explanations—sometimes over 10,000 words—that switched between English and Chinese. The technique also struggled with tasks requiring nuance or subjectivity, where there is no single “correct” answer.

Beyond its technical limitations, the model’s development in China has raised concerns about potential government influence. A recent report from The Washington Post found that R1 exhibited biases in its outputs. Researchers discovered that the model would refuse to generate code with major security flaws when the prompts involved groups considered sensitive by Chinese authorities.

However, when asked to create code for entities like Tibet, Taiwan, or the Falun Gong religious movement, the model produced less secure versions with built-in vulnerabilities. This suggests that the model’s behavior may be shaped by the political priorities of the Chinese government.


Featured image credit

Tags: deepseekFeatured

Related Posts

ByteDance launches Doubao 2.1 Pro language model

ByteDance launches Doubao 2.1 Pro language model

June 24, 2026
OpenAI expands cybersecurity efforts with Patch the Planet

OpenAI expands cybersecurity efforts with Patch the Planet

June 24, 2026
Claude Tag brings shared AI assistant to Slack channels

Claude Tag brings shared AI assistant to Slack channels

June 24, 2026
Getty Images partners with OpenAI to supply licensed visuals for ChatGPT

Getty Images partners with OpenAI to supply licensed visuals for ChatGPT

June 23, 2026
Samsung adopts ChatGPT Enterprise and Codex across global workforce

Samsung adopts ChatGPT Enterprise and Codex across global workforce

June 22, 2026
OpenAI improves health responses for free ChatGPT users

OpenAI improves health responses for free ChatGPT users

June 19, 2026

LATEST NEWS

Rockstar confirms GTA 6 pricing and pre-order details

ByteDance launches Doubao 2.1 Pro language model

OpenAI expands cybersecurity efforts with Patch the Planet

Meta launches $299 smart glasses under its own brand

Claude Tag brings shared AI assistant to Slack channels

PlayStation 6 leak points to 2027 release window

BEST AI MODELS LEADERBOARD

See the best AI models, ranked by intelligence, benchmark results, speed and token price. Find the most suitable LLMs, Text-to-Image, Image Editing, Text-to-Speech, Text-to-Video and Image-to-Video  artificial intelligence model for your tasks and business.

LATEST TOOLS

Vrew

Fireflies

SpeedLegal

Teachable Machine

Unriddle

VidAU

Qualified

character.ai

Interview Coder

Moonbeam

Dataconomy

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.

  • About
  • Imprint
  • Contact
  • Legal & Privacy

Follow Us

  • News
    • Artificial Intelligence
    • Cybersecurity
    • DeFi & Blockchain
    • Finance
    • Gaming
    • Startups
    • Tech
  • Industry
  • Research
  • Resources
    • Articles
    • Guides
    • Case Studies
    • Whitepapers
    • AI Models Leaderboard
  • AI tools
  • Newsletter
  • + More
    • Glossary
    • Conversations
    • Events
    • About
      • Who we are
      • Contact
      • Imprint
      • Legal & Privacy
      • Partner With Us
No Result
View All Result
Subscribe

This website uses cookies to improve your experience. You can choose to accept or reject them. Visit our Privacy Policy.