Dataconomy

Researchers suspect DeepSeek cloned Gemini data

Observers noted that DeepSeek R1-0528 mimics Gemini’s phrasing and reasoning style.

by Kerem Gülen
June 4, 2025
in Artificial Intelligence, News

DeepSeek, a Chinese AI lab, released an updated version of its R1 reasoning model last week. The company did not disclose the data sources used for training, but some AI researchers suggest that outputs from Google’s Gemini family of models may have been among them.

Sam Paech, a Melbourne-based developer, claims to have found evidence that DeepSeek’s latest model was trained on outputs from Gemini. According to Paech’s X post, DeepSeek’s model, R1-0528, prefers words and expressions similar to those favored by Google’s Gemini 2.5 Pro.

The pseudonymous creator of SpeechMap, a “free speech eval” for AI, said that the DeepSeek model’s “thoughts” resemble Gemini traces. DeepSeek has previously faced accusations of training on data from competitors’ AI models. In December, developers noticed that DeepSeek’s V3 model often identified itself as ChatGPT, suggesting it may have been trained on ChatGPT chat logs.

Earlier in the year, OpenAI told the Financial Times it had found evidence connecting DeepSeek to distillation, a technique for training a model on data extracted from larger AI models. According to Bloomberg, Microsoft detected significant data exfiltration in late 2024 through OpenAI developer accounts that OpenAI suspects are linked to DeepSeek.

Distillation is relatively common, but OpenAI prohibits customers from using its model outputs to create competing AI. At the same time, AI companies source training data from the open web, which is increasingly saturated with AI-generated content. This has made it difficult to thoroughly filter AI outputs from training datasets.

Nathan Lambert, a researcher at AI2, believes that DeepSeek may have trained on data from Google’s Gemini. Lambert stated in an X post, “If I was DeepSeek, I would definitely create a ton of synthetic data from the best API model out there… [DeepSeek is] short on GPUs and flush with cash. It’s literally effectively more compute for them.”

AI companies are increasing security measures to prevent distillation. In April, OpenAI began requiring organizations to complete an ID verification process to access advanced models. China is not on the list of countries supported by OpenAI’s API for this process.

Google has begun “summarizing” traces generated by models available through its AI Studio developer platform. Anthropic announced plans in May to summarize its own model’s traces.


COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
