OpenAI Faces Criticism After CTO's Interview On Sora

OpenAI, the influential artificial intelligence research lab behind groundbreaking tools like ChatGPT and Sora, has found itself in hot water following a recent interview with its Chief Technology Officer, Mira Murati.

The interview, conducted by Wall Street Journal reporter Joanna Stern, focused on Sora, OpenAI’s latest video generation system.

Concerns center around the potential misuse of copyrighted work to train AI models and the lack of transparency from OpenAI regarding its data practices.

Sora’s training data is in question

At the heart of the controversy lies the issue of training data, the massive datasets used to train AI models.

When asked about the sources of data utilized for Sora, Murati provided the standard response: the model had been trained on “publicly available and licensed data”.

However, further probing revealed hesitation and uncertainty on Murati’s part about the specific details of this dataset.

This response has raised red flags among artists, photographers, and intellectual property experts. AI image generation systems depend heavily on ingesting vast quantities of images, many of which may be protected by copyright. The lack of clarity around Sora’s training data raises questions about whether OpenAI has adequately safeguarded the rights of content creators.

**Sora’s training database has not been published on any official platform**

Shutterstock usage admitted later on

Adding fuel to the fire was Murati’s initial refusal to address whether Shutterstock images were a component of Sora’s training dataset. Only after the interview, in a footnote added by the Wall Street Journal, did Murati confirm the use of Shutterstock’s image library.

Shutterstock content falls under the “licensed” half of OpenAI’s public-facing line about “publicly available and licensed data”, but the reluctance to say so on camera suggests an attempt to avoid questions about potentially problematic sourcing practices.

Shutterstock and OpenAI formed a partnership granting OpenAI rights to use Shutterstock’s image library in training image generation models like DALL-E 2 and potentially Sora.

In return, Shutterstock contributors (the photographers and artists whose images are on the platform) receive compensation when their work is used in the development of these AI models.

A PR nightmare unfolds

It’s safe to say that most public relations folks would not consider this interview to be a PR masterpiece.

Murati’s lack of clarity comes at a sensitive time for OpenAI, already facing major copyright lawsuits, including a significant one filed by the New York Times.

The public is scrutinizing practices like OpenAI’s alleged secret use of YouTube videos for model training, as previously reported by The Information. With stakeholders ranging from artists to politicians demanding accountability, Murati’s avoidance only fuels the fire.

OpenAI’s opaque approach is backfiring spectacularly, transforming the Sora interview into a PR disaster.

OpenAI CTO Mira Murati says Sora was trained on publicly available and licensed data pic.twitter.com/rf7pZ0ZX00

— Tsarathustra (@tsarnick) March 13, 2024

Transparency is not the most discussed topic for nothing

This incident underscores a critical point: transparency is paramount in the world of AI. OpenAI’s stumbling responses have severely undermined public trust and intensified questions about its ethical practices. The Sora controversy highlights the growing chorus demanding greater accountability within the AI industry.

Murati’s reluctance to disclose the specifics of Sora’s training data breeds mistrust and sets a dangerous precedent.

Without the clarity artists, creators, and the public are demanding, ethical debates and the potential for legal action will only intensify.

There are no angels in this land

While much of the current scrutiny falls squarely on OpenAI, it’s crucial to remember that the company is not the only player in the game.

Meta’s LLaMA model and Google’s Gemini have also faced allegations of problematic training data sources.

This isn’t surprising, as Business Insider reports that Meta has already admitted to using Instagram and Facebook posts to train its AI models. Additionally, Google’s control over vast swaths of the internet gives it unparalleled access to potential training data, raising similar ethical concerns about consent and copyright.

The situation with OpenAI’s Sora is just one piece of a larger puzzle. The entire AI development field is facing scrutiny regarding its data practices and the potential ethical implications.

Featured image credit: Freepik.



COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.