Stuart Russell: Will we choose the right objective for AI before it destroys us all?

Stuart Russell, author of a textbook on AI and of the popular volume Human Compatible, says humanity needs to get its act together and decide on the right objectives to make sure machines more intelligent than ourselves don't annihilate the human race.
Written by Tiernan Ray, Senior Contributing Writer

Russell outside the ballroom at the Hilton on Wednesday, mobbed by audience goers wanting to ask questions. "A lot of people operate on the assumption that it's only when we make machines that are conscious that we will have a problem," he says. "You can make super-intelligent machines, and it doesn't matter if it's conscious or not."

Tiernan Ray for ZDNet

Will machines much smarter than us destroy the human race?

Usually, that question is the stuff of hysterically bad journalism. But there is a section of academia that is reflecting on the matter with thoughtfulness and rigor.

One such fellow is Stuart Russell, professor of artificial intelligence at the University of California at Berkeley. Wednesday morning at the Hilton Hotel in Manhattan, a crowd of AI researchers mobbed Russell as he stepped off the stage at the annual conference of the Association for the Advancement of Artificial Intelligence. 

Russell had spoken for an hour about the fate of humanity should it lose control of intelligent machines, in a talk he titled "How not to destroy the world with AI." (Slides are available on Russell's website, and full video of the talk is available on Vimeo.)

He spent another fifteen minutes patiently listening in the hall outside the ballroom as excited audience members peppered him with follow-up questions. 

Then Russell excused himself to go to the lobby bar and explain to ZDNet how he is trying to change a mindset that fails to appreciate the risks. 

"A lot of people operate on the assumption that it's only when we make machines that are conscious that we will have a problem," Russell explained as he sipped a coffee. "But that literally doesn't make any sense."

In Russell's view, any machine that can be made with greater intelligence than humans poses a danger, just by being smarter. "You can make super-intelligent machines, and it doesn't matter if it's conscious or not."

The crux of the matter is what's called the "objective function," the goal that a machine is trying to fulfill by its behavior. That objective function has to be designed to serve humanity, Russell argues.

"If we are building machines that make decisions better than we can, we better be making sure they make decisions in our interest," he said.

During his speech, Russell had offered something of a history lesson, and also a potential way out of the dilemma. The talk echoed themes he discusses in his popular book on the topic, published last year, titled Human Compatible. (Russell helps run a research group at Berkeley that is named the Center for Human-Compatible AI.)

"In the nineteen-fifties, we said that we as humans expect our actions to achieve our objectives," recalled Russell, harking back to the days when AI got underway as a discipline. That model of rational, purposive human agents, in turn, was used to conceive of the model for AI. Machines with an objective to carry out would fulfill that objective through their intelligent actions.

"Unfortunately, that's the wrong model," Russell told the audience. The machine that fulfills its programmed objective can still be the machine that kills all humanity. Indeed, it's a trope of sci-fi stories that the machine that optimizes planet earth, for example, could decide that humans are a hazard to planetary equilibrium and must be eliminated.  

To avoid that, according to Russell, the alternative needs to be a machine that is intelligent not in fulfilling its own objective but in fulfilling humanity's.

"What we want are machines that are beneficial to us, when their actions satisfy our preferences."

A slide in the talk put it more formally: "Machines are beneficial to the extent that their actions can be expected to achieve our objectives."


Russell wants a new objective function for AI, an objective that's humanity's, so that intelligent machines can't optimize-away the human race.

Stuart Russell

There is an urgency about this, in Russell's mind. Society can't just avoid intelligent machines. They will be built, whether we want them or not, just as the atom bomb was built.

"We will have a time when AI systems are better at decision making in the real world," he said. "What happens when they are better than us?" Humans had better start thinking on it now, while there's still time. 

To hammer home the message, Russell in his talk showed a slide of an imaginary email from an alien species. The aliens have emailed humanity (the United Nations, specifically) to inform us that they will arrive on Earth in thirty to fifty years.

The humans' pitiful response to this epic development, in Russell's account, is an out-of-office email with a smiley face. 

Will we, he wonders, be similarly out to lunch as intelligent machines develop? 


A mock email used by Russell in his slide deck to emphasize the urgency of thinking about AI. "We will have a time when AI systems are better at decision making in the real world," he argues. "What happens when they are better than us?"

Of course, Russell is not the first to raise the issue in scholarly terms. The matter was eloquently spelled out decades ago by cyberneticist Norbert Wiener, whom Russell cites in Human Compatible.

In a 1960 article in Science magazine, Wiener wrote that machines could be built that would develop capabilities not foreseen by their programmers. 

"They thus most definitely escape from the completely effective control of the man who has made them," Wiener predicted. 

Building upon Wiener's critique, Russell proposes the outlines of a solution, which involves putting humans back into the loop.

In what he calls an "assistance game," a machine still needs information from a person to complete its task. It knows it doesn't know everything. 

Take the problem of image recognition. "We can have machine learning algorithms that know they don't know what the loss function is" for recognizing something in an image, "and go back to the human expert" for clarification.
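One way to picture a learner that "knows it doesn't know" its loss function is the toy sketch below. It is an illustration only, not Russell's algorithm: the labels, the two candidate loss tables, and the function names (`expected_loss`, `best_label`, `decide`) are all hypothetical. The classifier holds several plausible loss functions, acts on its own when they all recommend the same label, and otherwise goes back to the human.

```python
from typing import Dict, List

# A loss table maps true_label -> predicted_label -> cost (hypothetical values).
Loss = Dict[str, Dict[str, float]]


def expected_loss(probs: Dict[str, float], loss: Loss, predicted: str) -> float:
    """Average cost of predicting `predicted`, given class probabilities."""
    return sum(p * loss[true][predicted] for true, p in probs.items())


def best_label(probs: Dict[str, float], loss: Loss) -> str:
    """Label with the lowest expected cost under one candidate loss."""
    return min(probs, key=lambda label: expected_loss(probs, loss, label))


def decide(probs: Dict[str, float], candidate_losses: List[Loss]) -> str:
    """Act only if every candidate loss agrees; otherwise defer to the human."""
    choices = {best_label(probs, loss) for loss in candidate_losses}
    return choices.pop() if len(choices) == 1 else "ASK_HUMAN"


# Two guesses about what the human cares about (both are assumptions).
symmetric: Loss = {
    "safe": {"safe": 0.0, "hazard": 1.0},
    "hazard": {"safe": 1.0, "hazard": 0.0},
}
miss_is_costly: Loss = {
    "safe": {"safe": 0.0, "hazard": 1.0},
    "hazard": {"safe": 10.0, "hazard": 0.0},  # missing a hazard is 10x worse
}
candidates = [symmetric, miss_is_costly]

clear_case = decide({"safe": 0.99, "hazard": 0.01}, candidates)
borderline_case = decide({"safe": 0.8, "hazard": 0.2}, candidates)
print(clear_case, borderline_case)
```

In the clear case both candidate losses favor the same label, so the machine proceeds. In the borderline case the right answer depends on which loss the human actually holds, so the machine asks.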

The assistance game can potentially be a way to make an effective "off switch" for the machine, says Russell. In the typical AI model, no machine will turn itself off because suicide would violate its programming to achieve its objective. For the same reason, it would try to prevent a human from turning it off.

In the assistance game, the machine assumes that hidden information held by humans is essential to the machine's functioning. And so the machine has to cooperate with any human decision to turn the machine off. 
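The off-switch argument can be made concrete with a small numerical sketch. It rests on simplifying assumptions not stated in the article: the machine's uncertainty is represented by a list of equally likely utility values for its proposed action, switching off is worth zero, and the human is assumed to know the true utility and to hit the switch exactly when it is negative. The function name `policy_values` is hypothetical.

```python
from typing import Dict, Sequence


def policy_values(utility_samples: Sequence[float]) -> Dict[str, float]:
    """Expected value of three policies, given the machine's belief about
    how much its proposed action is worth to the human."""
    n = len(utility_samples)
    # Act regardless of the human: the machine gets the average utility.
    act = sum(utility_samples) / n
    # Switch itself off: nothing happens, value zero (an assumption).
    switch_off = 0.0
    # Defer: the human (assumed rational) allows the action only when its
    # true utility is positive, and otherwise switches the machine off.
    defer = sum(max(u, 0.0) for u in utility_samples) / n
    return {"act": act, "switch_off": switch_off, "defer": defer}


# The machine is unsure whether its action helps or harms.
uncertain = policy_values([-2.0, -1.0, 1.0, 4.0])
# The machine is certain the action is good.
certain = policy_values([3.0, 3.0, 3.0, 3.0])
print(uncertain)
print(certain)
```

Under these assumptions, deferring is never worse than acting or switching off, and it is strictly better whenever the machine is unsure of the sign of the utility. Once the uncertainty disappears, deferring and acting are worth the same, and the incentive to leave the switch in human hands vanishes.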

"As long as there is uncertainty, the robot has the incentive to allow itself to be switched off," he said. "This robot is going to do what the human wants."

Of course, controlling machines raises an inverse question, that of robot rights and slavery, which Wiener raised in 1960. To build intelligent things and to try to control them, Wiener wrote, is not just cruel and therefore morally wrong. It is also "very close to one of the great problems of slavery." As he put it, it is contradictory because "Complete subservience and complete intelligence do not go together."

Russell is inclined to put the matter aside.

"The ethical question only arises if the system has subjective experience," Russell told ZDNet. That prospect is remote, he asserted. "I have a hard time believing that my laptop has subjective experience," he said, even though it gets more intelligent all the time with faster chips and better software.

Still, one has to "concede it's possible that suitably programmed computers could have subjective experience" at some point, he said. But that possibility is virtually impossible to explore scientifically because "there is no contact between that hypothesis and science," he observed.

"You can't have complete certainty this table doesn't have an experience," he said, leaning over to tap on the marble tabletop. "But then, we don't have a way to have insight about whether it does or not."

But if we can anticipate really intelligent machines, and start thinking deeply about them now, can we not already start contemplating subjective machines and the prospect of slavery?

"If it's theoretically possible that you might accidentally create subjective experience, you might want to figure out how not to accidentally create that," he conceded. "But then, if you gave me a trillion dollars in funding to develop a machine with a subjective experience, I would give the money back, because I don't know how to do that."

In the meantime, for Russell, the urgent matter is not robot enslavement but rather avoiding human annihilation. He is revising for the third time his textbook on AI, written with Peter Norvig, director of research at Google. 

"We have had to redo all the chapters of the book, we have to redo all those areas of AI that assume a fixed objective," he said. 

One of the things to think deeply about is just what our preferences are, exactly. Before we can tell a machine what we want, we need to reflect on ourselves, because humans are mysterious. Russell cites the matter of Lee Sedol, the grandmaster Go player who lost to a Google software program, AlphaGo, in 2016.

From some of Sedol's mistakes in his matches with AlphaGo, you'd infer that he wanted to lose, when that wasn't, in fact, the case. Sedol was in the midst of complex human behavior, including nerves. "Humans are not optimal," as Russell puts it.

The confusion about human preference is concerning because if machines can't figure out our complex human preferences, they may optimize by changing our preferences for us. Something like that is already happening, according to Russell.

"We are building systems right now that have the effect of modifying people's preferences, and that wasn't the intention originally," Russell told ZDNet.

One of his favorite examples is social media and the efforts by companies to optimize "click-throughs," getting people to click on things and thereby drive traffic and advertising. You'd think social media would be about suiting people's preferences, he notes, but actually, the system increasingly operates by inducing people to focus on stuff that's easier for the machine to optimize. "It's not just the filter bubble, this is worse than that, this is changing who you are," he said. "If you are on social media, it's largely determining your experience of the world."

What should be done about that? "We need to have something like an FDA for algorithms," said Russell, referring to the U.S. Food & Drug Administration. "If you are building something for large groups of the public, especially without consent, it has to be examined whether those effects are desirable." 

"The idea that you get to directly influence the behavior of billions of people, with no controls, and no intermediation, and no checks, with effects on their psychology and behavior  — that's a dangerous experiment," said Russell.

Whether humans have the will to impose such limits is a deep question that, again, Wiener anticipated sixty years ago. Even if you had a kill switch, even if the machine would let you press it, would humans press it if they were so seduced by the machine?

"We may not know until too late when to turn it off," Wiener warned.

How much is too much when it comes to intelligent machines? 

"We have to address the question, What do we want our future to be" as humans, Russell had told the audience upstairs. "For thousands of years of life on earth, it was just, Don't die today!"

Sitting in the lobby bar, reflecting further on the matter, Russell was inclined to fall back on intuitive human inclinations. 

"I don't want to replace humans in roles where being human is what matters," said Russell. 

"In the long run, almost all of what we think of as 'work' is going to be done by machines," he explained, "but I would still rather go to lunch with a human being than with a machine." 
