The artificial intelligence laboratory went on to say that superintelligence, which could arrive within the next decade, would be the most impactful technology ever created and could help solve the world's most important problems. But it followed that up with a rather dire warning that it "could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction."
That's where the new team, led by the research lab's current chief scientist and its current head of alignment, comes in. Alongside that leadership, the team will include some of OpenAI's top researchers and engineers, and it will have 20% of the company's current compute power at its disposal.
The end goal is to create a "roughly human-level automated alignment researcher," meaning an AI system that can itself check that other AI systems pursue their set goals and stay within set parameters. The team plans to accomplish this in three steps:
1. Develop an understanding of how AI evaluates other AI without human intervention
2. Use AI to search for any problem areas or exploits
3. Deliberately train some misaligned AI models to see whether they get detected
In short, a team of humans at OpenAI is using AI to help them train AI to keep super-smart AI under control. And they think they'll have it worked out in four years. They admit it's an ambitious goal and that success isn't guaranteed, but they're confident in their work.