OpenAI sued for 'stealing' data from the public to train ChatGPT

A new class action lawsuit targets the data ChatGPT was trained on. Here's what it alleges and what it hopes to accomplish.
Written by Sabrina Ortiz, Editor

OpenAI's wildly popular ChatGPT is a generative AI model that was trained on vast amounts of data, much of it text gathered from the internet prior to 2021.

The data ChatGPT was trained on is now the subject of a new lawsuit against OpenAI.

In a class action lawsuit filed on June 28 against OpenAI and its partner Microsoft, the plaintiffs claim that OpenAI used "stolen data" to "train and develop" its products including ChatGPT 3.5, ChatGPT 4, DALL-E, and VALL-E. 

The lawsuit claims that OpenAI stole data from "millions of unsuspecting consumers worldwide," including children of all ages, to enable the chatbot to replicate human language.

Furthermore, the lawsuit alleges that OpenAI is "harvesting massive amounts of personal data from the internet" such as private conversations, medical data, and more, without asking for users' permission. 

A section of the 157-page lawsuit lists the private information that OpenAI is allegedly collecting, storing, tracking, and sharing, including social media information, cookies, keystrokes, typed searches, payment information, and more.

In addition, the list claims that OpenAI is collecting data from applications that have incorporated GPT-4, such as image-related data through Snapchat, music preferences in Spotify, and financial information in Stripe.

The plaintiffs ask that the defendants immediately become transparent about what data they are collecting, where and from whom they collected it, and how it is being used. They also seek compensation for all plaintiffs and class members for the data they say was stolen.

Lastly, the plaintiffs ask that OpenAI introduce an option that lets users opt out of all data collection and that it stop the "illegal" scraping of internet data.

This isn't the first lawsuit brought against OpenAI. Earlier this month, OpenAI was sued over false information that ChatGPT generated about a person.
