It's looking pretty good for Watson, IBM's Linux-powered computer cluster, as IBM engineers get it ready for its mid-February showdown with Jeopardy's all-time champs, Ken Jennings and Brad Rutter. Watson has already won a practice round, and Bodog, the online gambling company and odds-maker, has made Watson the favorite at 5/6. Even if Watson doesn't win, the mere fact that it can compete at this level is amazing.
How can Watson do it? Here's what I've learned about Watson's hardware and software in the last few days.
According to David Davidian, an IBM Senior System Architect, “Watson is a massively parallel system based on the IBM POWER7 750 in a standard rack mounted configuration.” The POWER 750 can run AIX, IBM's house-brand Unix; IBM i; and Linux. To compete on Jeopardy, Watson is running Novell's SUSE Linux Enterprise Server.
Watson is made up of ninety IBM POWER 750 servers, 16 Terabytes of memory, and 4 Terabytes of clustered storage. Davidian continued, “This is enclosed in ten racks including the servers, networking, shared disk system, and cluster controllers. These ninety POWER 750 servers have four POWER7 processors, each with eight cores. IBM Watson has a total of 2880 POWER7 cores.”
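The core count follows directly from the figures Davidian gives. Here's a minimal Python sketch of that arithmetic; the constant and function names are my own, chosen for illustration, and the only inputs are the numbers quoted above.

```python
# Figures quoted in the article; the names are illustrative, not IBM's.
SERVERS = 90                # POWER 750 servers in the cluster
PROCESSORS_PER_SERVER = 4   # POWER7 processors in each server
CORES_PER_PROCESSOR = 8     # cores on each POWER7 processor


def total_processors(servers=SERVERS, per_server=PROCESSORS_PER_SERVER):
    """Number of POWER7 processors across the whole cluster."""
    return servers * per_server


def total_cores(servers=SERVERS,
                per_server=PROCESSORS_PER_SERVER,
                per_processor=CORES_PER_PROCESSOR):
    """Number of POWER7 cores across the whole cluster."""
    return total_processors(servers, per_server) * per_processor


print(total_processors())  # 360
print(total_cores())       # 2880
```

Both results line up with the quotes: 360 processors and 2880 cores.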
Just like the human players, Watson has no access to Google or any other outside sources of information. It plays with “what it knows.”
Don't think, though, that Watson is just a really powerful search engine. It's far more than that.
Watson uses IBM DeepQA software to “understand” natural language questions and answers like those in Jeopardy. Doing this is really the hard trick for Watson and its designers. Humans don't talk or think clearly or logically. To work out what someone means, a computer needs to understand context, slang, puns and a hundred other things that we take for granted when we talk to each other.
DeepQA, Davidian explained, “scales out with and searches vast amounts of unstructured information. Effective execution of this software, corresponding to a less than three second response time to a Jeopardy! question, is not just based on raw execution power. Effective system throughput includes having available data to crunch on. Without an efficient memory sub-system, no amount of compute power will yield effective results. A balanced design is comprised of main memory, several levels of local cache and execution power. IBM's POWER 750's scalable design is capable of filling execution pipelines with instructions and data, keeping all the POWER7 processor cores busy. At 3.55 GHz, each of Watson's POWER7 on-chip bandwidth is 500 Gigabytes per second. The total on-chip bandwidth for Watson's 360 POWER7 processors is an astounding 180,000 Gigabytes per second!”
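The aggregate bandwidth figure in that quote is simple multiplication, which a few lines of Python can confirm. This is only a sketch of the arithmetic using the two numbers Davidian cites; the names are hypothetical, and the Terabyte conversion assumes decimal units (1 TB = 1,000 GB).

```python
# Figures quoted by Davidian; the names are illustrative only.
POWER7_PROCESSORS = 360           # processors in the cluster
ON_CHIP_GB_PER_S = 500            # on-chip bandwidth per processor, in GB/s


def aggregate_bandwidth_gb_per_s(processors=POWER7_PROCESSORS,
                                 per_processor=ON_CHIP_GB_PER_S):
    """Sum of on-chip bandwidth over all processors, in GB/s."""
    return processors * per_processor


total_gb_per_s = aggregate_bandwidth_gb_per_s()
print(total_gb_per_s)          # 180000
# Assumption: decimal units, 1 TB = 1,000 GB.
print(total_gb_per_s / 1000)   # 180.0
```

Keep in mind that this is a sum of per-chip, on-chip bandwidth. It isn't a measure of how fast data moves between servers.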
Speed alone, though, wouldn't have done the job. It's the sheer speed plus the innovative DeepQA software running on Linux that makes Watson competitive with the human world's Sherlock Holmes of quizzes. Win, lose, or draw, Watson points to a brave new world where we really will be able to 'talk' to our computers and get good answers back.
For more about Watson, before it takes on its human rivals, look to this forthcoming NOVA episode on PBS.