While deep neural networks still struggle to carry out even basic robotics tasks, researchers at Google's Brain unit think that robotics has some important lessons for software development.
In research put out Wednesday, scientists at Google Brain, along with researchers at Sandia National Labs and the University of New Mexico's computer science department, reinterpreted software programs as if they were robots finding their way through uncertain terrain, using a form of machine learning called reinforcement learning.
The result could be more resilient software that inherently copes with uncertainty.
The paper, "Resilient Computing with Reinforcement Learning on a Dynamical System: Case Study in Sorting," was posted on the pre-print repository arXiv by Google Brain's Aleksandra Faust, along with colleagues from Sandia and UNM. An abstract also appears on the Google AI research site. Faust and colleagues are to present the work at the 57th IEEE Conference on Decision and Control, taking place December 17th through 19th in Miami Beach, Florida.
The paper draws on prior work by Faust and colleagues from earlier this year with robots and unmanned aerial vehicles. In that work, robots were trained with reinforcement learning to navigate uncertain terrain on the ground and in the air.
The main concern of the current paper is that most software has classically not been developed to be resilient, that is, able to survive unpredictable conditions that may arise, such as corruption in memory chips.
Instead, software has been developed with what David Ackley of UNM has dubbed the "CEO" obsession: "Correctness and Efficiency Only," leaving hardware to shoulder the burden of reliability.
That CEO attitude presumes a program with the right algorithm will complete a task correctly and then terminate, and little effort is put into dealing with unforeseen errors. "It's like there was a contract between computer engineers and computer scientists: hardware shall provide reliability... software's job is to take logic and turn it into functions that are valuable," is how Ackley nicely sums up the last 70 years of software. Ackley receives a thank you from Faust & Co. in the paper.
(Ackley has a great video on the matter, accompanying a paper he put out this year, which is well worth checking out.)
Software best practices have evolved over time, note Faust and colleagues, including things such as design patterns and correctness proofs. But those measures are meant to mitigate programmer error; they are not meant to deal with what can happen in the course of a running program, when a failure condition or a fault emerges, such as a "bit flip" in a memory cell.
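The kind of fault the authors have in mind is easy to mimic in a couple of lines. In this illustrative Python snippet (the function name is mine, not from the paper), a single bit of an integer is inverted, the way a fault in a memory cell silently would:

```python
def flip_bit(value: int, bit: int) -> int:
    """Return value with one bit inverted, as a memory fault might leave it."""
    return value ^ (1 << bit)

# 7 is 0b00111; flipping bit 4 silently turns it into 0b10111 = 23.
print(flip_bit(7, 4))  # 23
```

A program written under the "CEO" contract has no reason to notice the corruption; it simply continues computing with 23 where it expected 7.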
Faust and colleagues propose to change the approach to development by following the example of robots, which regularly pursue a "goal-based task" in the face of error. "Robots routinely rely on measurements that contain errors, yet still aim at providing resilient decision making," they write.
To test the approach, the researchers built a new version of a program that sorts the items in an array, like re-arranging a set of disordered letters so they're in alphabetical order, or putting a jumbled list of counting numbers into the proper sequence from smallest to largest. Sorting is a classic problem in computer science, which makes it a nice test bed for new software approaches.
Their program, "RL sort," uses the AI approach of working to maximize a reward by choosing among possible moves, something called a "Markov decision process." Seen this way, the computation in the sorting program becomes "a trajectory in the variable space," as they put it. At each tick of the clock, the program searches through changes of "state," the program variables, looking for a path to the properly sorted list of items.
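That framing can be sketched in a few lines of Python. This is a hand-rolled illustration, not the authors' RL sort: I assume states are array configurations, actions are swaps of adjacent out-of-order elements, and the reward counts how many adjacent pairs are already in order. A greedy policy over that reward then walks a trajectory through the state space toward the sorted goal state:

```python
def reward(state):
    """Proxy for 'sortedness': the number of adjacent pairs in order."""
    return sum(state[i] <= state[i + 1] for i in range(len(state) - 1))

def step(state, action):
    """Action i swaps elements i and i+1, yielding the next state."""
    nxt = list(state)
    nxt[action], nxt[action + 1] = nxt[action + 1], nxt[action]
    return nxt

def greedy_sort(state):
    """At each tick, take the legal swap whose next state scores highest."""
    state = list(state)
    while True:
        # Legal moves: swap an adjacent pair that is out of order.
        moves = [i for i in range(len(state) - 1) if state[i] > state[i + 1]]
        if not moves:                 # goal state reached: array fully sorted
            return state
        state = max((step(state, a) for a in moves), key=reward)

print(greedy_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Because each legal move removes one inversion, the loop always reaches the goal; the paper's contribution is learning such a policy with reinforcement learning rather than coding it by hand.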
Faust and colleagues tested RL sort against two popular sorting algorithms, "Quicksort" and "Bubble sort." They found that when fault conditions of even 5% were introduced, the other two quickly ran into situations where they failed to sort items almost all the time, while RL sort was still able to deliver under such conditions.
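The flavor of that experiment can be suggested with a toy fault injector. In this sketch (the helper names, parameters, and recovery loop are my own, far simpler than the paper's learned policy), each step may corrupt one array element with 5% probability; a goal-checking loop that keeps repairing the first out-of-order pair still converges, where a single run-to-completion pass would not:

```python
import random

def inject_fault(state, p=0.05):
    """With probability p, overwrite one element: a stand-in for a bit flip."""
    if random.random() < p:
        state[random.randrange(len(state))] = random.randint(0, 99)
    return state

def resilient_sort(state, p=0.05, max_steps=10_000):
    """Keep pursuing the goal (a sorted array) even as faults keep landing."""
    state = list(state)
    for _ in range(max_steps):
        state = inject_fault(state, p)
        bad = [i for i in range(len(state) - 1) if state[i] > state[i + 1]]
        if not bad:                    # goal test passes on the current state
            return state
        i = bad[0]                     # repair the first out-of-order pair
        state[i], state[i + 1] = state[i + 1], state[i]
    return state

print(resilient_sort([5, 3, 8, 1, 9, 2]))
```

The design point is the one the paper makes: the program checks its goal condition against the current state at every step, instead of assuming that a fixed sequence of operations has left the data intact.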
"Overall, RL sort is more likely to sort an array, and when it fails to sort, the array it produces will be closer to a fully sorted array, than other comparative methods," they write.
As a bonus, RL sort turns out to be more efficient than the other two, because it requires fewer manipulations of the array of items.