Your Reddit posts will now help train ChatGPT - what we know so far

OpenAI just got access to Reddit's massive catalog of content for AI training purposes. Here's what that could mean for users.
Written by Artie Beaty, Contributing Writer

Last week, Reddit introduced a new content policy that was a pretty big win for user privacy. Part of that new policy stated that if a company wants to use Reddit data for commercial purposes, including training AI, it will have to pay.

OpenAI is taking Reddit up on that offer.

The two companies reached an agreement to give OpenAI access to Reddit's massive catalog of content for ChatGPT training purposes. In return, Reddit will get access to OpenAI's tools to bring AI-powered features to Reddit users and mods.

OpenAI will access Reddit's Data API, "which provides real-time, structured, and unique content from Reddit." The goal is to give OpenAI a better understanding of Reddit's content, especially on recent topics.

Neither of the two sides disclosed actual financial terms.

Reddit has long been a training ground for AI, but the company's recent policy updates put an end to companies using its data that way without the platform's consent.

It makes sense that OpenAI would want access to Reddit user posts. Steve Huffman, Reddit Co-Founder and CEO, calls his site "one of the internet's largest open archives of authentic, relevant, and always up to date human conversations about anything and everything."

For users, this change won't mean much – at least not in how the site functions. The content policy prohibits partners from using content to identify an individual for any reason, including ad targeting; it prohibits law enforcement or government officials from conducting surveillance on users; and it allows users to delete any of their content, even if it has technically been sold.

This new agreement simply means that OpenAI will potentially take what you post and feed it into a database alongside millions of other posts to help AI become more reliable. 

As for what the new AI features might be, Reddit didn't offer specifics. Facebook recently integrated AI, but it was met with a lot of frustration from users. It's possible Reddit might incorporate an AI-powered posting tool that helps users hash out their thoughts, but that would lead to a trained-by-AI, posted-by-AI loop of sorts. If the company takes an idea from Google, it might offer AI-powered summaries at the top of posts, but in my personal experience, those summaries are often wrong.

Unfortunately, it doesn't look like there's a way for users to opt out of having their content used to train AI other than deleting old posts.
