ChatGPT has landed in hot water again in the European Union (EU) over its tendency to produce inaccuracies. This time, the issue centers on privacy rights: a complaint has been filed against its maker, OpenAI, over the chatbot’s inability to correct false information it generates about individuals.
ChatGPT’s launch in November 2022 ignited a firestorm of excitement in the AI world. People flocked to the chatbot for everything from research assistance to casual conversation. However, a crucial detail lurks beneath the surface: OpenAI, the developer behind ChatGPT, freely admits that the program simply predicts the most likely words to follow a prompt. This means that, despite extensive training data, there is no guarantee ChatGPT delivers factual information. In fact, generative AI tools like ChatGPT are notorious for “hallucinating,” essentially fabricating answers.
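To make that mechanism concrete, here is a minimal sketch of next-token prediction using an open model (GPT-2 via the Hugging Face transformers library). ChatGPT’s internals are not public, so this illustrates only the general idea, not OpenAI’s implementation:

```python
# Minimal sketch of next-token prediction with an open model (GPT-2).
# Purely illustrative: ChatGPT's own model and decoding are not public.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # Scores for every vocabulary token at every position.
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Pick the single most likely continuation of the prompt.
# Nothing in this step checks whether the continuation is true:
# the model only ranks what is statistically plausible.
next_token_id = int(logits[0, -1].argmax())
print(tokenizer.decode([next_token_id]))
```

The point of the sketch is the comment in the last step: likelihood, not truth, drives the output, which is exactly why a fluent answer about a person’s birth date can be confidently wrong.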
While inaccurate information might be tolerable when a student uses ChatGPT for homework help, it becomes a serious issue when dealing with personal data. EU law, established in 1995 with the Data Protection Directive and reinforced by Article 5 of the GDPR (General Data Protection Regulation), mandates that personal data must be accurate. Furthermore, individuals have the right to rectification (Article 16 GDPR) if information is wrong, and can request its erasure (Article 17 GDPR). Additionally, the “right to access” under Article 15 GDPR compels companies to disclose the data they hold on individuals and its sources.
Maartje de Graaf, a data protection lawyer at noyb, emphasizes the gravity of the situation: “Fabricating information is problematic in itself. But when it comes to personal data, the consequences can be severe. It’s clear that current chatbot technology, like ChatGPT, struggles to comply with EU law when processing personal information. If a system can’t deliver accurate and transparent results, it shouldn’t be used to generate data about individuals. The law dictates the technology’s development, not the other way around”.
noyb investigates ChatGPT’s hallucinations
This latest grievance comes from noyb, a European non-profit organization focused on privacy rights. They represent an unnamed public figure who discovered that ChatGPT produced an incorrect birth date for them. This highlights a potential clash between these generative AI tools and the EU’s General Data Protection Regulation (GDPR).
The GDPR grants EU citizens a “right to rectification,” which allows them to request corrections to inaccurate personal information held by organizations. In the context of AI-generated content, this raises a crucial question: can a large language model like ChatGPT be held accountable for the information it produces, especially when that information is demonstrably wrong?
noyb (@NOYBeu) shared the following post on X about the situation:
🚨 noyb has filed a complaint against the ChatGPT creator OpenAI
OpenAI openly admits that it is unable to correct false information about people on ChatGPT. The company cannot even say where the data comes from.
Read all about it here 👇https://t.co/gvn9CnGKOb
— noyb (@NOYBeu) April 29, 2024
Is ChatGPT’s hallucination a pattern of misinformation?
This isn’t the first time ChatGPT’s “hallucination problem” has caused concern. The AI’s tendency to fabricate information has been well documented, raising questions about its reliability as a source of truth.
Here’s a flashback to 2023: the Italian data protection authority temporarily blocked ChatGPT in Italy after concerns were raised about, among other things, the accuracy of the information it provided.
These incidents highlight the inherent challenges of working with large language models. Trained on massive datasets of text and code, these AI systems can struggle to distinguish between factual information and fiction. This can lead to situations where they confidently generate incorrect or misleading content, presented as truth.
The issue becomes even more complex when dealing with personal information. If a user interacts with ChatGPT and the AI produces inaccurate details about that user, it can be difficult, if not impossible, to ensure those errors are corrected. This is concerning, especially for public figures who may rely on online information to maintain their reputation.
Simply making up data is not an option
This issue is deeply rooted in the structure of generative AI. According to a recent New York Times report, chatbots “invent information at least 3% of the time – and as high as 27%”.
To illustrate, consider the public figure involved in noyb’s complaint against OpenAI, as described in the organization’s announcement. When asked for his date of birth, ChatGPT consistently supplied an incorrect date instead of acknowledging that it lacked the data.
Are there no GDPR rights for individuals captured by ChatGPT?
Despite the demonstrably wrong birth date generated by ChatGPT, OpenAI refused the complainant’s request to rectify or erase the data, claiming it was technically impossible. While OpenAI can filter or block data for specific prompts (such as ones containing the complainant’s name), this blunt approach suppresses all information about the individual rather than correcting the single inaccurate detail. OpenAI also failed to adequately respond to the complainant’s access request. The GDPR grants users the right to request a copy of all their personal data, yet OpenAI did not disclose any information about the data it processes, its sources, or its recipients.
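To see why prompt-level blocking is such a blunt instrument, consider this hypothetical sketch in Python. OpenAI has not published how its filtering actually works; the blocklist, names, and functions below are invented purely for illustration:

```python
# Hypothetical sketch of prompt-level filtering, the kind of workaround
# described above. The name and functions here are invented; OpenAI's
# real mechanism is not public.
BLOCKED_NAMES = {"jane doe"}  # hypothetical stand-in for the complainant


def answer(prompt: str, generate) -> str:
    """Refuse any prompt that mentions a blocked name; otherwise generate."""
    if any(name in prompt.lower() for name in BLOCKED_NAMES):
        return "I can't help with questions about this person."
    # The underlying model is untouched: any wrong facts it has learned
    # remain in its weights and are not corrected by this filter.
    return generate(prompt)


def fake_generate(prompt: str) -> str:
    # Stand-in for the actual model call.
    return "(model output would appear here)"


print(answer("When was Jane Doe born?", fake_generate))
# -> "I can't help with questions about this person."
```

Because the filter operates on prompts rather than on what the model has learned, it can only suppress answers about the person wholesale; the wrong birth date remains in the model, and nothing is actually rectified or erased.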
Maartje de Graaf adds, “The obligation to comply with access requests applies to all companies. It’s certainly possible to keep records of training data, offering at least some insight into information sources. It seems with each supposed ‘innovation,’ companies believe their products are exempt from legal compliance”.
So far, unsuccessful efforts by authorities
The rise of ChatGPT has drawn the watchful eye of European privacy watchdogs. The Italian Data Protection Authority (DPA) addressed the chatbot’s inaccuracy in March 2023 by imposing a temporary restriction on data processing. Shortly thereafter, the European Data Protection Board (EDPB) established a task force specifically for ChatGPT to coordinate national efforts.
However, the ultimate outcome remains unclear. Presently, OpenAI appears unconcerned with GDPR compliance within the EU.
A potential landmark
The noyb complaint against ChatGPT puts the spotlight on the evolving relationship between AI and data privacy regulations. The GDPR, implemented in 2018, predates the widespread use of large language models. As AI continues to develop, regulators are grappling with how to apply existing frameworks to these new technologies.
The outcome of the noyb complaint could set a precedent for how the EU approaches the privacy implications of generative AI tools. If the Austrian Data Protection Authority rules in favor of the complainant, it could force OpenAI to implement changes to ensure users have greater control over the information ChatGPT generates about them.
This case has wider implications for the development of AI in general. As AI systems become more sophisticated and integrated into our daily lives, ensuring responsible development and deployment will be crucial. The noyb complaint serves as a reminder of the importance of building AI tools that prioritize accuracy and respect user privacy.
Featured image credit: Levart_Photographer/Unsplash