Build a $2,500 supercomputer
Summary: In 1997, IBM's Deep Blue supercomputer beat world chess champion Garry Kasparov. Today you can build a more powerful machine for less than $2,500 in an 11" x 12" x 17" box.
Supercomputing Costco-style
In 1997, IBM's Deep Blue supercomputer beat world chess champion Garry Kasparov. Today you can build a more powerful machine for less than $2,500 in an 11" x 12" x 17" box. That works out to less than $100 per gigaflop as of January 2007.
More good news: priced out with today's components, the machine would cost only $1,300!
The recipe
Professor Joel Adams and undergraduate Tim Brom built the machine at Calvin College in Grand Rapids, MI. Using the Beowulf cluster model, the Microwulf design includes:
- 4 microATX motherboards with dual-core AMD Athlon 64 X2 3800 AM2+ processors
- 8 GigE ports - 1 built-in port on each motherboard, plus 1 added GigE PCI-express NIC on each
- 8 GB RAM - half of what a balanced system should have, but 16 GB would have busted their budget.
- 4 microATX power supplies
- 1 8-port GigE switch
- 250 GB hard drive & a CD/DVD drive
- 3 polycarbonate plastic shelves to mount the kit on plus 5 threaded rods to support the shelves
Here's a schematic diagram:
The architecture
Beowulf clusters are based on a message-passing (MPI) infrastructure that uses a network to interconnect the nodes. Some Beowulf clusters have hundreds of nodes and scale nicely with the right workloads.
Microwulf has an economical version of the same architecture, built on Ubuntu Linux and MPI libraries.
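As a rough illustration of the message-passing pattern, here is a minimal sketch in which the work is split across four "nodes", each computes on its own slice, and each sends one result message back to be combined. This is not MPI: Python threads and a queue stand in for the nodes and the network, and the names `worker` and `cluster_sum_of_squares` are hypothetical.

```python
import queue
import threading

def worker(rank, chunk, outbox):
    # Each "node" sees only its own slice, then sends one message back.
    outbox.put((rank, sum(x * x for x in chunk)))

def cluster_sum_of_squares(data, n_nodes=4):
    inbox = queue.Queue()  # stands in for the network link to the head node
    # Scatter: deal the data out round-robin, one slice per node.
    chunks = [data[r::n_nodes] for r in range(n_nodes)]
    threads = [
        threading.Thread(target=worker, args=(r, chunks[r], inbox))
        for r in range(n_nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Gather: collect exactly one message per node, keyed by rank.
    partials = dict(inbox.get() for _ in range(n_nodes))
    # Reduce: combine the partial results into the final answer.
    return sum(partials.values())

print(cluster_sum_of_squares(list(range(1, 101))))
```

On a real Beowulf cluster the slices would live in separate machines' memory and the messages would cross the GigE switch, which is why workloads that exchange few, small messages scale best.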
The result
Performance is a many-splendored thing. In the world of supercomputing the standard benchmark is Linpack, which solves a dense system of linear equations in 64-bit double-precision arithmetic. Learn more about Linpack, HPL and their parameters here.
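To get a feel for the scale of such a problem, here is a back-of-the-envelope sketch. A dense N x N matrix of 64-bit values takes 8N² bytes, and solving it by LU factorization takes about (2/3)N³ floating-point operations to leading order. The problem sizes are illustrative, and treating Microwulf's 8 GB as decimal gigabytes is an assumption; the sketch also ignores the memory the operating system needs.

```python
BYTES_PER_DOUBLE = 8  # 64-bit double precision

def matrix_bytes(n):
    """Memory for a dense n x n matrix of doubles."""
    return BYTES_PER_DOUBLE * n * n

def lu_flops(n):
    """Leading-order operation count for an LU-based solve."""
    return 2 * n**3 / 3

RAM_BYTES = 8 * 10**9  # 8 GB total across the cluster (decimal GB assumed)

for n in (10_000, 20_000, 30_000, 40_000):
    gb = matrix_bytes(n) / 10**9
    fits = matrix_bytes(n) <= RAM_BYTES
    print(f"N={n:>6}: matrix {gb:5.1f} GB, ~{lu_flops(n):.2e} flops, fits: {fits}")
```

Memory grows with the square of N while the work grows with the cube, so the largest problem that fits in RAM is also the one that keeps the processors busy longest per byte stored.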
It is worth noting that HPL doesn't do much I/O, so the single 250 GB SATA drive is not a factor. The benchmark tests floating-point performance on an in-memory problem. Above a problem size of 30,000 the machine ran out of memory. Here are Microwulf's stats:
While unexceptional today, this performance would have made Microwulf the world's 6th fastest supercomputer in 1993, at less than $100 per gigaflop. Update: at today's prices, about $50 per GFlop.
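A quick sanity check of those price figures, using only the dollar amounts and per-gigaflop ratios quoted in this article. The GFLOPS values it prints are implied by that arithmetic, not measured results, and the function name is hypothetical.

```python
def implied_gflops(total_cost_usd, usd_per_gflop):
    """Performance implied by a total price and a price-per-gigaflop ratio."""
    return total_cost_usd / usd_per_gflop

# Original build: under $2,500 at under $100 per gigaflop.
print(implied_gflops(2500, 100))

# Updated pricing: $1,300 at about $50 per gigaflop.
print(implied_gflops(1300, 50))

# Fraction of the original price saved by buying the parts later.
print(round(1 - 1300 / 2500, 2))
```

Both sets of numbers point to a machine in the neighborhood of 25 gigaflops, so the two price quotes are consistent with each other.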
The Storage Bits take
Humans aren't very good at forecasting exponential functions like Moore's Law. Microwulf is a good excuse to take stock of just how much computing has advanced in the last 15 years.
Millicomputing is the name of a related initiative to build powerful clusters out of very power-efficient processors and low-cost components. In another 10 years you'll be able to have the equivalent of a 5,000-node Google cluster in your den. Cluster-based virtual reality, anyone?
Update: Lots of great comments from some very experienced people. Thanks! A couple of folks pointed to a detailed tutorial written by Professor Adams - who graciously permitted me to use his copyrighted diagram - that I'd linked to but without flagging its importance.
Let me rectify that oversight. If you want to get into the details of the hardware and software, this article on the Microwulf architecture and construction should suffice.
Comments welcome. Personally, I'm very happy with my quad-core Xeon, but I don't do much with computational fluid dynamics or protein folding.

Talkback
You do need a day off!
Re:Monday is Labor Day! lol!
Add a Terabyte or four Raid Array to the system.
Good point
down, but I'd love to see what kind of storage the professor could come up with for
an additional $1000.
Robin
Will Vista SP1 allow me to run this?
Vista SP1 'Soopercomputing Edition'
Try again...
Microsoft DOES do scalable clusters. Search for "Scalable Cluster" at Microsoft.com and check the results for yourself.
Not Yet ...
Easy...
Yes, Vista SP1 Will Run On It..But
No...
Many Cray machines are not AMD based.
not vs now?
Some poor schmuck who doesn't know about Cray and their niche might just go by your subject line and avoid AMD, unfortunately.
Ok how do i do this..
Can you or anyone help?
Thanks
Nice piece of work
RTFM
Details Here
A bit of Cray history from a Cray-on
I'm working on a similar line of experimentation. The 20-year-old internal structure of the systems we designed is the key. This will be interesting.....and my idea may be cheaper....8-P....
I doubt it would outperform Deep Blue
Now, trying to compare that Deep Blue system with the AMD dual core system suggested by the author is going to be literally a case of comparing apples and oranges. None of the performance benchmark programs would apply to both system types.
The AMD box would have far faster CPUs, but far fewer of them (8 versus 480). The massively parallel chess program would not work as effectively on so few processors, in spite of their greatly improved speed.
The SP Switch ran at something like 300 MBytes per sec, I seem to recall, far faster than GigE, which tops out around 100 MBytes per sec. However, network speed would not be a great factor in this, since the node-to-node communication is sending relatively small packets of data. Ditto disk performance: there's not a huge amount of disk access going on for this app.
If you take a look at the TOP500 Supercomputing site, you will see a lot of SP systems still in there, but their numbers are dwindling. However, you won't see *any* 8-CPU AMD or Intel machines in the list, heck, I'd bet that wouldn't even make the top 5000 :-)
All in all, a nice little article, and it's very nice to think that you can build a basic, decently performing cluster for a few grand, but it's not going to run anywhere near the performance of Deep Blue.
The last SP that I was personally the Admin for was a 54-node SP back in 2002. Sweet box, I have to say. Today I run Linux clusters, but not as big as those SPs of old.
Scalar vs Vector Processing
But, I just competed with IBM in the 1980's...and when JR turned bean-counter, IBM got my boss & SSI..
Look at the big picture here. I am....