What happens if a superintelligent AI goes rogue? OpenAI doesn't want to find out

OpenAI is warning that a super-smart AI could 'lead to the disempowerment of humanity or even human extinction.' So it's building a team to stop it.
Written by Artie Beaty, Contributing Writer

What would happen if a superintelligent AI smarter than the most intelligent humans were to one day go rogue? A new team at OpenAI is out to make sure we never find out. 

In an announcement this week, OpenAI said it's building a team to "steer and control AI systems much smarter than us." 

The artificial intelligence laboratory went on to say that superintelligence, which could arrive within the next decade, would be the most impactful technology ever created and could help solve the world's most important problems. But it followed that up with a rather dire warning that it "could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction."

As it stands, the company says, humans can keep artificial intelligence in check because they're smarter. But what happens when AI surpasses humans?

That's where the new team, led by the research lab's current chief scientist and its current head of alignment, comes in. In addition to that leadership, the team will include OpenAI's top researchers and engineers, and it will be given 20% of the company's current compute power.

The end goal is to create a "roughly human-level automated alignment researcher," or an AI system that achieves a set goal and stays within set parameters. The team plans to accomplish this in three steps:

  • Develop an understanding of how AI analyzes other AI without human interaction
  • Use AI to search for any problem areas or exploits
  • Deliberately mistrain some AI to see if that gets detected

In short, a team of humans at OpenAI is using AI to help train AI to keep super-smart AI under control. And they think they'll have it worked out in four years. They admit it's an ambitious goal, and success isn't guaranteed, but they're confident in their work.
