eLiza: IBM's self-healing server

IBM has embarked on a new multibillion-dollar effort called eLiza to build computer systems that can fix themselves while problems still are in the early stages.

The effort is an attempt to bring some of the self-healing abilities of living creatures to the brittle world of computers, where component failures can bring down larger systems and ripple across a network to other computers as well.

"Just like the human body, when you sweat, it evaporates and cools you down," John Patrick, vice president of Internet technologies at IBM, said in an interview about the program. "And when you're cold, you shiver and that warms you up. When you cut your finger, you bleed and that heals the wound.

"Just like that, we're intending to invest in a broad range of software that will allow infrastructure to be self-managing and self-healing."

Analysts see IBM's effort putting the company at the front of an as-yet unproven market. "IBM's self-healing systems will definitely put pressure on other manufacturers to follow," said ARS Market Intelligence analyst Steve Greenberg. "But it is going to be interesting to see when this hits the market, or if it does at all."

But there are some differences between IBM's plan and actual biological systems. IBM essentially is patching today's computing technology, adding another layer on top of a very complicated system rather than employing radically different designs. The human brain, by contrast, resembles a computer in some ways yet can sometimes adapt to keep functions such as speech working despite serious damage.

IBM's Greg Burke will lead the multiyear effort, reporting to Irving Wladawsky-Berger--the man who led IBM's effort to embrace the Internet six years ago and the Linux operating system two years ago. Wladawsky-Berger will unveil eLiza at an analyst meeting Friday.

The effort will take place at five IBM research labs, the company said. It will consume a quarter of the company's server research funds.

The effort will consolidate several smaller programs under way within various groups at Big Blue. Hundreds will work on eLiza, Patrick said, spreading changes to all IBM's server lines, its storage products and software packages such as DB2, WebSphere and Tivoli.

With eLiza, computers would monitor everything from patterns in a power supply's electricity consumption to how many people are using a Web site, Patrick said. When the behavior of an element of the computing system starts showing the first indications of distress, automatic services would fire up backup systems, order replacement parts or take other measures to make sure people using the system don't notice problems.
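The monitor-and-respond loop Patrick describes can be sketched in a few lines. This is only an illustration, not IBM's design: the function names, the five-reading window and the three-deviation tolerance are all assumptions made for the example.

```python
from statistics import mean, pstdev


def first_distress_index(readings, window=5, tolerance=3.0):
    """Return the index of the first reading that strays far from the
    average of the `window` readings before it, or None if none does.
    (Hypothetical helper; window and tolerance are illustrative.)"""
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        center = mean(baseline)
        # Avoid a zero spread when the baseline is perfectly flat.
        spread = pstdev(baseline) or 1e-9
        if abs(readings[i] - center) > tolerance * spread:
            return i
    return None


def respond(readings, start_backup):
    """Call `start_backup` at the first sign of distress, if any."""
    index = first_distress_index(readings)
    if index is not None:
        start_backup(index)
    return index


# Example: power-supply draw in watts, with a sudden jump at the end.
watts = [100, 101, 99, 100, 101, 100, 140]
alerts = []
respond(watts, alerts.append)  # alerts now records where the jump occurred
```

A real system would watch many signals at once and choose among responses such as starting a backup or ordering a part; the sketch shows only the early-warning step.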

One element of eLiza will be Project Oceano, a prototype consisting of a group of Linux servers that share jobs among themselves, with servers added to or removed from the mix as necessary. The system can even install operating systems and stored data without human intervention.
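As a rough picture of a job-sharing pool whose membership changes, consider the sketch below. It is not Oceano's actual mechanism, which the company did not detail; the `ServerPool` class, its method names and the least-loaded placement rule are assumptions chosen for illustration.

```python
class ServerPool:
    """Hypothetical pool that spreads jobs across a changing set of servers."""

    def __init__(self, names):
        # Map each server name to the list of jobs it currently holds.
        self.jobs = {name: [] for name in names}

    def add_server(self, name):
        """Bring a new, empty server into the pool."""
        self.jobs.setdefault(name, [])

    def submit(self, job):
        """Give the job to the server holding the fewest jobs."""
        target = min(self.jobs, key=lambda name: len(self.jobs[name]))
        self.jobs[target].append(job)
        return target

    def remove_server(self, name):
        """Take a server out and hand its jobs to the remaining servers."""
        if len(self.jobs) <= 1:
            raise ValueError("cannot remove the last server in the pool")
        for job in self.jobs.pop(name):
            self.submit(job)


pool = ServerPool(["a", "b"])
for job in ["j1", "j2", "j3"]:
    pool.submit(job)
pool.remove_server("a")  # a's jobs move to b
pool.add_server("c")     # a fresh server joins and takes the next job
pool.submit("j4")
```

The point of the sketch is that no job is lost when a server leaves and that new capacity is used as soon as it appears, without an administrator stepping in.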

Also tying into eLiza is Blue Gene, a coming IBM supercomputer devoted to the task of figuring out how the gigantic biochemical molecules called proteins "fold" into shape. Blue Gene will have so many CPUs that the computer will have to be able to assess when they start or stop working and adjust accordingly.

IBM also is working on lower-level technology, Patrick said. The company already has begun selling memory systems that can keep working even when memory chips fail completely.

All these fixes may seem complicated, but IBM thrives on complexity. Much of the revenue of its large services division comes from helping customers handle onerous chores such as adding new computers to older networks or running customers' systems at IBM for a fee.

IBM's goal--shared by competitors such as Sun Microsystems, Hewlett-Packard and EMC--is to reduce the difficulties of administering the large servers at the heart of Web operations and corporate networks. There simply aren't enough knowledgeable administrators to go around, particularly as people grow accustomed to having guaranteed access to the Internet and more and more operations depend on the Internet, Patrick said.

A very small number of computer experts are able to diagnose thorny problems in the most complicated combinations of computing hardware.

"We're trying to capture that knowledge and automate the process," Patrick said. "There aren't enough of those people to go around. We can see a real crisis ahead as the expectations go up and the transactions go up."

IBM didn't base the project's name on Eliza, a storied pre-PC program that performed pseudo-psychoanalysis on users, but on another biological system. The name is a reference to IBM's Deep Blue chess-playing machine, which Wladawsky-Berger said had the intelligence of a lizard--not very smart by some measures, but not bad for a computer.

While the project is ambitious in its scope, IBM has a bigger footprint in the computing industry than any of its competitors.

"I definitely think IBM's the right company to try to attempt to get this kind of technology," Greenberg said. "It's a huge, huge project."