Dataconomy

Your Bluesky posts might be training AI

Bluesky faces privacy concerns after one million public posts were scraped via its API for AI training, prompting backlash over user consent and data protection measures

by Kerem Gülen
November 28, 2024
in News, Artificial Intelligence

Bluesky is facing a privacy controversy after one million public posts were scraped from its platform for AI training, according to a 404 Media report. The dataset was compiled by Daniel van Strien, a machine learning librarian at the AI company Hugging Face, and was intended for research on natural language processing and social media analysis. Although Bluesky’s representatives say the platform will never train generative AI on user data, its open API leaves public posts accessible to external scrapers.

Bluesky faces privacy concerns over scraped user posts

The dataset was sourced through Bluesky’s Firehose API, which provides an aggregated stream of public data updates, including posts, likes, and follows. Van Strien intended the dataset to support machine learning research. However, it included not only the text of posts but also users’ decentralized identifiers (DIDs) and metadata. After media reports highlighted the issue, the dataset was swiftly removed from Hugging Face amid backlash over user privacy and the lack of consent.
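To make the concern about identifiers concrete, here is a minimal sketch of how a dataset builder might keep only post text before publishing. The record shape (`did`, `text`, `created_at`) and the function name `strip_identifiers` are assumptions for illustration only; they are not Bluesky's actual Firehose schema or any real library's API.

```python
from typing import Iterable


def strip_identifiers(records: Iterable[dict]) -> list[dict]:
    """Keep only non-empty post text, dropping DIDs and other metadata.

    The input record shape is hypothetical, chosen for illustration.
    """
    return [{"text": r["text"]} for r in records if r.get("text")]


# Made-up sample records; the DIDs are placeholders, not real accounts.
sample = [
    {"did": "did:plc:example1", "text": "hello world",
     "created_at": "2024-11-26T00:00:00Z"},
    {"did": "did:plc:example2", "text": "",
     "created_at": "2024-11-26T00:01:00Z"},
]

cleaned = strip_identifiers(sample)
```

Dropping identifiers does not by itself anonymize a post, since public text can often be traced back to its author by searching for its wording.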

Bluesky users did not provide explicit permission for their posts to be utilized in this manner, though Bluesky’s policies do not categorically prohibit such actions. The core of the controversy lies in the open structure of Bluesky’s API, which allows third-party developers to access its public data freely. According to a statement from a Bluesky representative, “we’d like to find a way for Bluesky users to communicate to outside orgs/developers whether they consent to this,” indicating an effort to enhance user control over data sharing in the future.

Following the removal of the dataset, van Strien acknowledged that his data collection approach had fallen short on transparency and consent. “I apologize for this mistake,” he stated in a follow-up post on Bluesky. The incident is a reminder to users that any content shared publicly on the platform is accessible to external entities. As the platform continues to grow, recently surpassing 20 million users, Bluesky will likely face increasing scrutiny of its data protection measures and user privacy.

Bluesky is currently discussing mechanisms that would let users express their consent preferences to third parties. However, enforcement remains a challenge: as the platform noted, it will ultimately be up to outside developers to honor these preferences. Bluesky’s representatives added that they plan to discuss the matter with engineers and legal teams, but that no immediate solution is available.
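Since no such mechanism exists yet, the following is only a sketch of what honoring a consent preference could look like on the developer's side. The `preferences` mapping (DID to an opt-in flag) and the function `filter_by_consent` are hypothetical; Bluesky has not published a consent signal of this form.

```python
def filter_by_consent(records: list[dict], preferences: dict[str, bool]) -> list[dict]:
    """Keep only records whose author has explicitly opted in.

    `preferences` is a hypothetical mapping from DID to an opt-in flag.
    Authors with no stated preference are excluded (default deny).
    """
    return [r for r in records if preferences.get(r["did"]) is True]


# Made-up records and preferences for illustration.
posts = [
    {"did": "did:plc:alice", "text": "fine to use"},
    {"did": "did:plc:bob", "text": "please do not"},
    {"did": "did:plc:carol", "text": "no preference set"},
]
prefs = {"did:plc:alice": True, "did:plc:bob": False}

allowed = filter_by_consent(posts, prefs)
```

The default-deny choice reflects the consent concern raised above: a missing preference is treated as no permission. As the article notes, nothing forces an outside developer to apply such a filter.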


Featured image credit: Bluesky

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
