Machine learning needs rich feedback for AI teaching: Monash professor

With AI systems largely receiving feedback in a binary yes/no format, Monash University professor Tom Drummond says rich feedback is needed to allow AI systems to know why answers are incorrect.
Written by Chris Duckett, Contributor

In much the same way children have to be told not only that what they are saying is wrong, but why it is wrong, artificial intelligence (AI) systems need to be able to receive and act on similar feedback, according to Professor Tom Drummond, head of Monash University's Department of Electrical and Computer Systems Engineering.

"I think fundamentally we need ways of giving machine learning systems more feedback than 'yes' or 'no'," Drummond told journalists at the GTCx Australia conference on Tuesday.

"Rich feedback is important in human education, I think probably we're going to see the rise of machine teaching as an important field -- how do we design systems so that they can take rich feedback and we can have a dialogue about what the system has learnt?"

To illustrate his point, Drummond used the example of a system that was able to recognise and caption images of fire hydrants, and appeared to be working well until one of the images was modified.

"You take Photoshop, and you colour it green, and it says: 'A red fire hydrant'."

"It's not pulling that caption out of a database of captions, there is a recurrent network generating that one word at a time, and there were other images in the database that were green hydrants, but they were physically differently shaped because they were from a different jurisdiction and in that part of world they paint them green, or yellow," he said.

"So it wasn't learning what the word green or red meant, it was learning that that adjective applies when they are this shape.

"That's the problem when you demand yes or no as feedback.

"We need to be able to enrich it because otherwise machine learning is doomed to take much longer and generate much less satisfactory answers."

Perhaps the most famous public instance of machine learning gone wrong is Microsoft's Tay chatbot, which had its personality turned from that of a friendly young woman into an outrageous racist.

"We need to be able to give it rich feedback and say 'No, that's unacceptable as an answer because ... ' we don't want to simply say 'No' because that's the same as saying it is grammatically incorrect and its a very, very blunt hammer," Drummond said.

Tay was eventually suspended from Twitter, but not before the bot went on a short tweeting rampage a week after its initial corruption.

According to Drummond, one problematic feature of AI systems is the objective function that sits at the heart of a system's design.

"We start by creating an objective function that measures the quality of the output of the system, and it is never what you want," he said. "To assume you can specify in three sentences what the objective function should be, is actually really problematic."

The professor pointed to the match between Google DeepMind's AlphaGo and South Korean Go champion Lee Se-dol in March, which saw the artificial intelligence beat human intelligence by 4 games to 1.

"AlphaGo showed real brilliance in the way that it played, or at least that is kind of anthropomorphising it, if it were a human playing this way it seemed to have a really deep understanding of the game," Drummond said. "But in game 1, when it knew it was way ahead, well before Lee Se-dol knew he definitely lost, it started playing what we thought of as bad moves and what it was doing was essentially saying: 'I've won this game, I'm just going to remove all the uncertainty'."

"Of course, a human player would be striving to maximise their victory, and at this point it was just striving to maximise its probability of victory; removing uncertainty allows it to do that."

In the fourth game, the only one Lee won, AlphaGo fell clearly behind and then played a number of moves that Drummond said would have been considered insulting from a human, given the position the machine found itself in.

"At that point AlphaGo knew it had lost but it still tried to maximise its probability of victory, so it played all these moves ... a move that threatens a large group of stones, but has a really obvious counter and if somehow the human misses the counter move, then it's won -- but of course you would never play this, it's not appropriate."

"Here's the thing, the objective function was the highest probability of victory, it didn't really understand the social niceties of the game. It kind of had a crude hammer that if the probability of victory dropped below epsilon, some number, then resign. But it played for, I think, four insulting moves before it resigned."

The commentators' shock when AlphaGo played these moves -- because they had come to expect, and rely on, AlphaGo having a vast understanding of the game of Go -- showed the complacency that can set in when people expect a machine to be correct all the time.

"The better the machine gets, the more complacent we become, so self-driving cars that operate at 99.9 percent is more dangerous than a machine that operates at 99 percent because at 99 percent you are used to it messing up regularly and you're attentive to it," he said.

"At 99.9 percent, you're starting to relax."

According to the professor, this raises a "deep question" of ethics: when to use machine learning systems and when to hold back.

"Do you make your driver aid so good that people stop driving, and then they get into trouble, or do you keep them not good enough so that they are only there to help always and they won't work unless the humans engage?"

Drummond is involved in the Australian Research Council Centre of Excellence for Robotic Vision, and told ZDNet the next challenge for robotic vision is for robots to move beyond the structured environments that have been engineered for them in the past and out into the unstructured real world, particularly via self-driving cars.

As well as being able to identify objects, Drummond said robots will need to be able to understand how to interact with objects and be able to infer intent, such as whether a car is slowing down to turn or if a pedestrian will leave the kerb.

"All those kinds of things are about being able to model what will happen in the future to entities in my environment, they need for safe operation and helpful operation," he said.

On Wednesday, the Victorian government, in conjunction with the Transport Accident Commission and VicRoads, partnered with automotive parts giant Bosch for the development of the first self-driving vehicle in Australia.

According to Victorian Minister for Roads and Road Safety Luke Donnellan, the vehicle under construction in Clayton, Victoria, has been designed to navigate roads with or without driver input and includes the ability to detect and avoid hazards such as pedestrians, cyclists, and other vehicles.

"By removing human error from the equation, self-driving vehicles will play a critical role in reducing deaths and serious injuries on Victorian roads," Donnellan said.

Disclosure: Chris Duckett travelled to GTCx Australia as a guest of Nvidia.
