Dataconomy

Your Bluesky posts might be training AI

Bluesky faces privacy concerns after one million public posts were scraped via its API for AI training, prompting backlash over user consent and data protection measures

by Kerem Gülen
November 28, 2024
in News, Artificial Intelligence

Bluesky is grappling with a significant privacy issue after one million public posts were scraped from its platform for AI training, according to a 404 Media report. The dataset was compiled by Daniel van Strien, a machine learning librarian at the AI company Hugging Face, and was intended for research on natural language processing and social media analysis. Although Bluesky’s representatives say the platform will never train generative AI on user data, the open nature of its API leaves public posts exposed to external scrapers.

Bluesky faces privacy concerns over scraped user posts

The dataset was sourced through Bluesky’s Firehose API, which provides an aggregated stream of public data updates, including posts, likes, and follows. Van Strien intended the dataset to support machine learning research. However, it included not only the text of posts but also users’ decentralized identifiers (DIDs) and metadata. After media reports highlighted the issue, the dataset was removed from Hugging Face amid backlash over user privacy and the lack of consent.

Bluesky users did not give explicit permission for their posts to be used in this manner, though Bluesky’s policies do not categorically prohibit such use. The core of the controversy lies in the open structure of Bluesky’s API, which allows third-party developers to access its public data freely. According to a statement from a Bluesky representative, “we’d like to find a way for Bluesky users to communicate to outside orgs/developers whether they consent to this,” indicating an effort to give users more control over data sharing in the future.

Following the removal of the dataset, van Strien acknowledged that his data collection approach had fallen short on transparency and consent. “I apologize for this mistake,” he stated in a follow-up post on Bluesky. The incident is a reminder to users that any content shared publicly on the platform is accessible to external entities. As the platform continues to grow, recently surpassing 20 million users, Bluesky will likely face increasing scrutiny of its data protection measures and user privacy.

Bluesky is currently discussing mechanisms that would let users express their consent preferences to third parties. Enforcement remains a challenge, however: as the platform noted, it will ultimately be up to outside developers to honor these preferences. Bluesky’s representatives added that while they intend to discuss the matter with engineers and legal teams, no immediate solution is available.
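
To make the enforcement point concrete, here is a minimal sketch of how an outside developer might honor such preferences on their own side. Everything in it is an assumption for illustration: Bluesky has not published a consent mechanism, so the `consent_by_did` mapping, the `PublicPost` record, and the `filter_by_consent` function are hypothetical and do not correspond to any real Bluesky API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Mapping


@dataclass(frozen=True)
class PublicPost:
    """Hypothetical record for a public post: author DID plus post text."""
    did: str
    text: str


def filter_by_consent(
    posts: Iterable[PublicPost],
    consent_by_did: Mapping[str, bool],
) -> Iterator[str]:
    """Yield only the text of posts whose author has opted in.

    Assumption: a missing preference is treated as "no consent".
    The DID is deliberately not passed on, so identifiers stay out
    of whatever dataset is built from the result.
    """
    for post in posts:
        if consent_by_did.get(post.did, False):
            yield post.text


# Example with made-up identifiers and preferences.
posts = [
    PublicPost("did:example:alice", "hello"),
    PublicPost("did:example:bob", "hi"),
    PublicPost("did:example:carol", "hey"),
]
consent = {"did:example:alice": True, "did:example:bob": False}
kept = list(filter_by_consent(posts, consent))
```

Treating an unrecorded preference as a refusal is one possible design choice, not something the platform has specified. As the article notes, nothing forces a third party to apply a filter like this at all.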


Featured image credit: Bluesky

Tags: bluesky


COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
