Dataconomy

Adobe is sued for using pirated books to train AI

The lawsuit claims Adobe training data was contaminated with over one hundred thousand stolen works

by Emre Çıtak
December 18, 2025
in Industry

A proposed class-action lawsuit filed by Oregon author Elizabeth Lyon accuses Adobe of training its SlimLM AI model on pirated books, including her own guidebooks. According to the complaint, the books reached the model through SlimPajama-627B, a dataset derived from the RedPajama collection, which contains Books3.

Adobe has invested heavily in artificial intelligence in recent years. The company has launched multiple AI services since 2023, including Firefly, its AI-powered suite for generating images, videos, and other media from text prompts and other inputs.

SlimLM is a series of small language models that Adobe has optimized for document assistance tasks on mobile devices. These models summarize documents, extract key information, and provide contextual help directly within mobile applications.

Adobe states that it pre-trained SlimLM using the SlimPajama-627B dataset. Cerebras released this dataset in June 2023 as a deduplicated, multi-corpora, open-source resource intended for training large language models. The dataset aggregates various text sources after removing duplicates to improve training efficiency and model performance.

Elizabeth Lyon, who specializes in guidebooks for non-fiction writing, initiated the lawsuit claiming that Adobe incorporated pirated versions of numerous books, including her own works, into the training process for SlimLM. The legal action seeks class-action status to represent other affected authors.

The lawsuit, first reported by Reuters, details how the SlimPajama dataset originated from the RedPajama dataset, which includes the Books3 collection of 191,000 books. The complaint states: “The SlimPajama dataset was created by copying and manipulating the RedPajama dataset (including copying Books3).” It continues: “Thus, because it is a derivative copy of the RedPajama dataset, SlimPajama contains the Books3 dataset, including the copyrighted works of Plaintiff and the Class members.” Lyon argues that her copyrighted materials appeared in this pre-training data without her consent or compensation.

Books3 has surfaced repeatedly in legal disputes within the AI sector because developers have used it to train generative AI systems. The collection contains digitized texts from various genres and authors, making it a comprehensive but contentious training corpus. RedPajama, which incorporates Books3, has also been cited in multiple court cases.


Tags: Adobe, SlimLM, Books3

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
