Dataconomy

Your Bluesky posts might be training AI

Bluesky faces privacy concerns after one million public posts were scraped via its API for AI training, prompting backlash over user consent and data protection measures

by Kerem Gülen
November 28, 2024
in News, Artificial Intelligence

Bluesky is grappling with a significant privacy issue after one million public posts were scraped from its platform for AI training, according to a 404 Media report. The dataset was compiled by Daniel van Strien, a machine learning librarian at the AI company Hugging Face, and was intended for research in natural language processing and social media analysis. Although Bluesky's representatives say the platform will never train generative AI on user data, the open nature of its API leaves public posts exposed to external scrapers.

Bluesky faces privacy concerns over scraped user posts

The dataset in question was sourced through Bluesky's Firehose API, which provides an aggregated stream of public data updates, including posts, likes, and follows. Van Strien had aimed to use the dataset to advance machine learning research. However, it included not only the text of posts but also users' decentralized identifiers (DIDs) and metadata. After media reports highlighted the issue, the dataset was swiftly removed from Hugging Face amid backlash over user privacy and the lack of consent.
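To make concrete what such a stream looks like to a consumer, here is a minimal Python sketch. It does not connect to Bluesky or use its real API: the event shape (`kind`, `did`, `text`) and the function name `collect_posts` are hypothetical, chosen only for illustration. It shows how a mixed stream of posts, likes, and follows can be narrowed to post text, and how the author's DID travels with each record unless the collector deliberately drops it.

```python
from typing import Dict, Iterable, List

# Hypothetical, simplified event records. The real Firehose format differs;
# these field names are assumptions made for illustration only.
SAMPLE_EVENTS = [
    {"kind": "post", "did": "did:plc:example1", "text": "hello"},
    {"kind": "like", "did": "did:plc:example2"},
    {"kind": "follow", "did": "did:plc:example2"},
    {"kind": "post", "did": "did:plc:example3", "text": "world"},
]


def collect_posts(events: Iterable[Dict[str, str]], include_did: bool) -> List[Dict[str, str]]:
    """Keep only post events; attach the author's DID only if requested."""
    records = []
    for event in events:
        if event.get("kind") != "post":
            continue  # skip likes, follows, and anything else in the stream
        record = {"text": event["text"]}
        if include_did:
            # Keeping the DID links each post back to a specific account.
            record["did"] = event["did"]
        records.append(record)
    return records


if __name__ == "__main__":
    for row in collect_posts(SAMPLE_EVENTS, include_did=False):
        print(row)
```

Dropping the identifier, as in the `include_did=False` case, does not by itself make a dataset private, since public post text can often be searched for and traced back to its author.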

Bluesky users did not give explicit permission for their posts to be used in this manner, though Bluesky's policies do not categorically prohibit such use. The core of the controversy lies in the open structure of Bluesky's API, which allows third-party developers to access its public data freely. According to a statement from a Bluesky representative, “we’d like to find a way for Bluesky users to communicate to outside orgs/developers whether they consent to this,” indicating an effort to give users more control over data sharing in the future.

Following the removal of the dataset, van Strien acknowledged that his data collection approach had fallen short on transparency and consent. “I apologize for this mistake,” he stated in a follow-up post on Bluesky. The incident is a reminder to users that any content shared publicly on the platform is accessible to external entities. As the platform continues to grow, having recently surpassed 20 million users, Bluesky will likely face increasing scrutiny of its data protection measures and user privacy.

Bluesky is currently discussing mechanisms that could let users express their consent preferences to third parties. However, enforcement remains a challenge; as the platform noted, it will ultimately be up to outside developers to honor these preferences. Bluesky's representatives added that while they plan to discuss the matter with engineers and legal teams, no immediate solution is available.


Featured image credit: Bluesky

Tags: bluesky

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.