New Aibo to lead the way in robotic vision

Sony's ERS-7 robot pet is to get vision from the company that brought us the beer-fetching notebook PC. It raises the prospect of Aibo fetching more than the morning paper, and could also point to improvements in robotic navigation.
Written by Matt Loney, Contributor

A deal between Sony and Evolution Robotics, the company that last made headlines for creating a robotics kit that can turn a notebook computer into a beer-fetching robot, will see Sony's next Aibo robot dog gain sophisticated vision.

While the prospect of a toy dog that can see is likely to excite consumers, the technology also demonstrates the extent to which computer vision has developed over the past few years. Evolution Robotics' technology could also find applications in areas as diverse as useful household appliances such as robotic vacuum cleaners, and in supermarket checkouts where it could automatically recognise groceries, according to the company's chief executive Bernard Louvat.

It also raises the spectre of robotic dogs that can fetch more than just a newspaper and slippers.

The ER Vision software that Evolution Robotics is licensing to Sony's Entertainment Robot Company is a vision-based object recognition module that can be trained to recognise objects using a single, low-cost camera. Louvat claims it can produce useful results from a £6 Web cam, keeping the bill of materials low.

Sony is implementing the technology in its Aibo ERS-7, which is due to launch in November priced at $1,599 (£1,011). "Sony will use the technology in two ways," said Louvat. "First, to enable Aibo to recognise its charger, and second, to recognise 15 cards -- you show one and the robot will recognise a command and do a specific job."

The technology means Aibo will be able to see around it and recognise objects in its environment. "That is important for any pet or other robot you try to build," said Louvat. But there are other applications too for the software.

"Some companies are now building automated checkout counters, where you scan the products yourself," said Louvat. "Our software is very good at recognising texture and could help tremendously with identifying goods without (the customer) having to find the barcode." Oranges and apples could be differentiated by colour, he said, but the software could also be used to differentiate one can of soup from another, or one box of cornflakes from another box of cereal. "If you put all the items on a tray that moves, it means you don't have to barcode every item," he added.

Recognising places and objects
In robotics, the technology enables a robot to take pictures of its environment, then create a visual map of the room or house. "It allows the robot to know where it is but also have a visual representation," Louvat said. "A vacuum cleaner could be taken out of its box, explore its environment, take pictures, create a map and could then be programmed to clean any room in the house, and know where the charger is." Other potential markets, said Louvat, include the healthcare market, where a robot could deliver supplies and medicine to different rooms. "The precision can be pretty high," added Louvat.
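The idea Louvat describes -- match what the camera sees now against stored views to work out which room the robot is in -- can be sketched in a few lines. This is purely illustrative; the room names and "features" below are invented for the example, and the real system works on image data rather than labelled strings.

```python
# Illustrative sketch (not Evolution Robotics' actual API): localising a
# robot by matching currently visible features against a stored visual map.

def build_map(observations):
    """Map each room to the set of visual features previously seen there."""
    return {room: set(features) for room, features in observations.items()}

def localise(visual_map, current_features):
    """Return the room whose stored features best overlap the current view."""
    current = set(current_features)
    return max(visual_map, key=lambda room: len(visual_map[room] & current))

visual_map = build_map({
    "kitchen": ["fridge", "charger", "bin"],
    "lounge":  ["sofa", "tv", "bookshelf"],
})

print(localise(visual_map, ["tv", "sofa"]))  # → lounge
```

A real implementation would match image descriptors rather than labels, but the map-then-match structure is the same.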

ER's software can: recognise objects when they are turned upside down or at an angle to the camera; recognise objects as they move closer to, or farther away from, the camera; recognise objects that are partially obscured by other objects (ER claims that some objects can be recognised even when they are 90 percent occluded); and deal with lighting artifacts caused by reflections and backlighting.

To train the system, several images of a particular object have to be taken and stored in memory. For planar objects, such as a beer mat, only the front and rear views are necessary, while a 3D object, such as a bottle of Newcastle Brown Ale, needs four images from four angles. The software then analyses the object's images and finds up to 1,000 unique local features to build an internal model.
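The training-then-matching pattern described above can be sketched as a toy: build a model from the features found across several views, then recognise the object if enough of what is currently seen matches the model. The feature names and the matching threshold here are invented for illustration; the real ER Vision module works on image descriptors, not labels.

```python
# Toy sketch of feature-based recognition (the real ER Vision module uses
# far richer image features; names and the threshold here are invented).

def train(views):
    """Build a model: the union of local features found across several views."""
    model = set()
    for view_features in views:
        model |= set(view_features)
    return model

def recognise(model, observed, threshold=0.5):
    """Recognise the object if enough observed features match the model."""
    observed = set(observed)
    if not observed:
        return False
    return len(observed & model) / len(observed) >= threshold

bottle = train([
    ["label_logo", "cap", "neck"],  # front view
    ["label_text", "cap", "neck"],  # rear view
])
print(recognise(bottle, ["cap", "neck", "shadow"]))  # 2 of 3 match → True
```

Matching on a fraction of observed features is also roughly why partially occluded objects can still be recognised: the visible features alone may clear the threshold.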

Although the speed with which the ER software can recognise an object decreases exponentially as the number of objects in the database increases, ER says the library can scale to hundreds or even thousands of objects without a significant increase in computational requirements. Each object model requires about 40KB of memory, which would probably be the limiting factor for an Aibo.
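The article's 40KB-per-model figure makes the memory arithmetic easy to check:

```python
# Back-of-the-envelope figures from the article: ~40KB per object model.
OBJECT_MODEL_KB = 40

def library_memory_mb(num_objects):
    """Approximate memory needed to store a library of object models."""
    return num_objects * OBJECT_MODEL_KB / 1024

print(round(library_memory_mb(15), 2))    # Aibo's 15 command cards → ~0.59MB
print(round(library_memory_mb(1000), 1))  # a thousand-object library → ~39.1MB
```

So a library of a thousand objects would need tens of megabytes -- modest for a PC, but a plausible constraint on an embedded platform like Aibo.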

The recognition frame rate, meanwhile, depends on CPU power and image resolution. ER cites performance of the recognition algorithm as 5 frames per second (fps) at an image resolution of 320x240 on an 850MHz Pentium III, and 3 fps at 80x66 on a 100MHz 32-bit processor. Sony's current top-of-the-range Aibo has a 64-bit RISC processor with a clock speed of 384MHz.
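A very crude way to relate those figures: assume frame rate scales linearly with clock speed and inversely with pixel count. That assumption is mine, not ER's published model (and ER's own second data point shows it is at best approximate, since architectures differ), but it gives a ballpark for Aibo-class hardware:

```python
# Rough scaling model (an assumption, not ER's published formula):
# frame rate proportional to clock speed, inverse to pixels processed.

def estimated_fps(clock_mhz, width, height,
                  ref_fps=5.0, ref_clock=850.0, ref_pixels=320 * 240):
    """Scale the cited 850MHz / 320x240 / 5fps figure to other hardware."""
    return ref_fps * (clock_mhz / ref_clock) * (ref_pixels / (width * height))

# An Aibo-class 384MHz processor at the same 320x240 resolution:
print(round(estimated_fps(384, 320, 240), 1))  # ≈ 2.3 fps
```

Even this rough estimate suggests why embedded robots tend to run recognition at reduced resolutions or frame rates.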

ER says the software can accomplish jobs such as object identification, visual navigation, docking, and hand-eye coordination -- though the company was unable to comment on how desirable hand-eye coordination is in a robot that can fetch beer. Other useful and interesting applications include entertainment and education, said the company, adding that it has successfully applied the algorithm to reading children's books aloud.

The software analyses image sequences, allowing it to detect motion for such jobs as tracking trajectories and calculating time-to-contact for collision avoidance -- both features that would be particularly useful for a robotic dog such as Aibo.
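Time-to-contact is a standard computer-vision trick that can be sketched from first principles (this is the textbook idea, not ER's implementation): if an approaching object's apparent size s grows at rate ds/dt, the time until contact is approximately s divided by ds/dt -- no knowledge of the object's real size or distance is needed.

```python
# Toy time-to-contact estimate from apparent size in the image
# (the textbook formulation, sketched here; not ER's actual code).

def time_to_contact(size_now, size_before, dt):
    """Estimate seconds until contact from the growth of apparent size."""
    growth_rate = (size_now - size_before) / dt
    if growth_rate <= 0:
        return float("inf")  # object is not approaching
    return size_now / growth_rate

# Apparent width grew from 40 to 50 pixels over half a second:
print(time_to_contact(50, 40, 0.5))  # → 2.5 seconds to contact
```

This is exactly the kind of quantity a robot needs for collision avoidance: it falls out of two frames and a clock, with no range sensor required.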
