OpenAI's wildly popular ChatGPT is a generative AI model that was trained on vast amounts of data, drawn largely from the internet prior to 2021.
The data ChatGPT was trained on is now the subject of a new lawsuit against OpenAI.
In a class action lawsuit filed on June 28 against OpenAI and its partner Microsoft, the plaintiffs claim that OpenAI used "stolen data" to "train and develop" its products, including ChatGPT 3.5, ChatGPT 4, DALL-E, and VALL-E.
The lawsuit claims that OpenAI stole data from "millions of unsuspecting consumers worldwide," including children of all ages, to enable the chatbot to replicate human language.
Furthermore, the lawsuit alleges that OpenAI is "harvesting massive amounts of personal data from the internet," including private conversations and medical data, without asking for users' permission.
A section of the 157-page lawsuit lists the private information that OpenAI is allegedly collecting, storing, tracking, and sharing, including social media information, cookies, keystrokes, typed searches, payment information, and more.
In addition, the same section claims that OpenAI is collecting data from applications that have incorporated GPT-4, such as image-related data from Snapchat, music preferences from Spotify, and financial information from Stripe.
The plaintiffs ask that the defendants immediately become transparent about what data they are collecting, where and from whom they collected it, and how it is being used. They also seek compensation for all plaintiffs and class members for their allegedly stolen data.