Build your own supercomputer out of Raspberry Pi boards

Summary: Who says you need a few million bucks to build a supercomputer? Joshua Kiepert put together a Linux-powered Beowulf cluster with Raspberry Pi computers for less than $2,000.

TOPICS: DIY, Hardware, Linux

When you think do-it-yourself (DIY) computing, you probably think of setting up a screaming gaming computer or putting together the best possible components for the least amount of money. You're almost certainly not considering putting together a supercomputer. Maybe you should. Joshua Kiepert, a doctoral student in Boise State's Electrical and Computer Engineering department, has managed to create a mini-supercomputer using Raspberry Pi (RPi) computers for less than $2,000.

Say hello to a homebrew Raspberry Pi-based supercomputer.

Raspberry Pi is a single-board Linux-powered computer. It's powered by a 700MHz ARM11 processor and includes a VideoCore IV GPU. The Model B, which is what Kiepert is using, comes with 512MB of RAM, two USB ports, and a 10/100 BaseT Ethernet port. For his project, Kiepert overclocked the processors to 1GHz.
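On the Raspberry Pi, this kind of overclock is typically set by editing /boot/config.txt. The article doesn't give Kiepert's exact settings; the values below are an illustrative sketch in the style of the stock "Turbo" preset:

```ini
# /boot/config.txt -- illustrative overclock settings, not Kiepert's exact values
arm_freq=1000      # CPU clock: 700MHz stock -> 1GHz
core_freq=500      # GPU core frequency
sdram_freq=600     # SDRAM frequency
over_voltage=6     # raise core voltage to keep the higher clocks stable
```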

By itself the Raspberry Pi is interesting, but it seems an unlikely supercomputer component. Kiepert, however, had a problem. He was doing his doctoral research on data sharing for wireless sensor networks by simulating these networks on Boise State's Linux-powered Onyx Beowulf-cluster supercomputer. Onyx is modest by supercomputer standards: it currently has 32 nodes, each of which has a 3.1GHz Intel Xeon E3-1225 quad-core processor and 8GB of RAM.

A Beowulf cluster is simply a collection of inexpensive commercial off-the-shelf (COTS) computers networked together, running Linux and parallel-processing software. First designed by Don Becker and Thomas Sterling at NASA's Goddard Space Flight Center in 1994, the design has since become one of the core supercomputer architectures.

So, with a perfectly good Beowulf-style supercomputer at hand, why did Kiepert start to put together his own Beowulf cluster? In a white paper, Creating a Raspberry Pi-Based Beowulf Cluster (PDF link), he explained:

"First, while the Onyx cluster has an excellent uptime rating, it could be taken down for any number of reasons. When you have a project that requires the use of such a cluster and Onyx is unavailable, there are not really any other options on campus available to students aside from waiting for it to become available again. The RPiCluster provides another option for continuing development of projects that require MPI [Message Passing Interface] or Java in a cluster environment.

Second, RPis provide a unique feature in that they have external low-level hardware interfaces for embedded systems use, such as I2C, SPI, UART, and GPIO. This is very useful to electrical engineers requiring testing of embedded hardware on a large scale.

Third, having user only access to a cluster is fine if the cluster has all the necessary tools installed. If not however, you must then work with the cluster administrator to get things working. Thus, by building my own cluster I could outfit it with anything I might need directly.

Finally, RPis are cheap! The RPi platform has to be one of the cheapest ways to create a cluster of 32 nodes. The cost for an RPi with an 8GB SD card is ~$45. For comparison, each node in the Onyx cluster was somewhere between $1,000 and $1,500. So, for near the price of one PC-based node, we can create a 32 node Raspberry Pi cluster!"
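MPI, mentioned above, structures a cluster job as ranked processes exchanging messages. As a rough single-machine stand-in for that pattern (no MPI runtime required), the master/worker division of labor can be sketched with Python's multiprocessing module; everything below is illustrative and is not code from Kiepert's project:

```python
# Illustrative master/worker message passing in the style of an MPI job.
# A single-machine stand-in using Python's multiprocessing module; on a
# real cluster the same pattern would use MPI_Send/MPI_Recv across nodes.
from multiprocessing import Process, Queue

def worker(rank, tasks, results):
    """Each 'node' pulls work items, computes, and reports back to the master."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work
            break
        results.put((rank, item * item))

def run_cluster(num_workers=4, num_tasks=16):
    tasks, results = Queue(), Queue()
    procs = [Process(target=worker, args=(r, tasks, results))
             for r in range(num_workers)]
    for p in procs:
        p.start()
    for i in range(num_tasks):    # master distributes work
        tasks.put(i)
    for _ in procs:               # one sentinel per worker
        tasks.put(None)
    total = sum(results.get()[1] for _ in range(num_tasks))
    for p in procs:
        p.join()
    return total

if __name__ == "__main__":
    print(run_cluster())          # prints 1240: the sum of squares 0..15
```

The essence of "hard" parallelism that a Beowulf cluster is built for is exactly this coordination cost: distributing work, exchanging intermediate results, and collecting answers across nodes.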

In an e-mail, Kiepert added, "This project was started because there was one week (Spring break) in which I could not use the Onyx Beowulf cluster I had been using. The Onyx cluster was down due to some renovations on the computer lab in which it resides. That got me thinking. I needed to continue testing my Ph.D. work, but if I didn't have access to Onyx I didn't have any options.

Previously, I had spent a lot of time playing with Raspberry Pis (RPis), and I have also been a long time Linux user (Fedora and Mint primarily). Additionally, in the research lab where I work, we use RPis as servers for our custom-built wireless sensor network systems, to up-link sensor data to our central database. So, this project allowed me to take my previous experience with clusters and RPis to another level, and it gave me some options for continuing my dissertation work. One thing for sure is it definitely adds something to the experience when you get to use a cluster you built."

For his baby-supercomputer, Kiepert elected to use Arch Linux. He explained, "Arch Linux … takes the minimalist approach. The image is tiny at ~150MB. It boots in around 10 seconds. The install image has nothing extra included. The default installation provides a bare bones, minimal environment, that boots to a command line interface (CLI) with network support.  The beauty of this approach is that you can start with the cleanest, fastest setup and only add the things you need for your application. The downside is you have to be willing to wade through the learning process of a different, but elegant, approach to Linux."

Of course, his RPi cluster isn't ideal. Kiepert admitted, "the overall value proposition is pretty good, particularly if cluster program development is focused on distributed computing rather than parallel processing. That is, if the programs being developed for the cluster are distributed in nature, but not terribly CPU intensive. Compute-intensive applications will need to look elsewhere, as there simply is not enough 'horse power' available to make the RPi a terribly useful choice for cluster computing."

In our e-mail conversation, Kiepert added that, "Perhaps the most annoying problem I had [with setting up the cluster] was SD-card corruption. Initially, I had a lot of file system corruptions when I powered down the cluster (nicely using: shutdown -h now) and attempted to start it again. This seems to be a known problem with the RPi that you are more likely to experience when you overclock. The weird thing was it was only occurring on the slave nodes, not the master. [The master node was a Samsung Chromebook Series 3 with a 1.7GHz dual-core ARM Cortex-A15 processor.]

Eventually, I found that if I just manually un-mounted the NFS shares before powering down the problem seems to be reduced. As part of the development I created a script for writing the SD-card images when re-imaging is needed. I just provide the host name and IP address, and the script does the rest. This greatly simplifies re-imaging, especially the first time I had to write all 32 of them while putting the initial image on the cards!"
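Kiepert doesn't reproduce the re-imaging script itself. The sketch below is a hypothetical reconstruction in Python of that kind of helper; the image name, device path, and Arch network-config path are all assumptions, and it prints the commands rather than running them:

```python
#!/usr/bin/env python3
# Hypothetical sketch of a per-node SD-card re-imaging helper.
# BASE_IMAGE, the device path, and the config paths are illustrative
# assumptions, not details from Kiepert's actual script.
import sys

BASE_IMAGE = "rpi-node-base.img"  # assumed name of the shared Arch image

def build_commands(hostname, ip, device="/dev/sdX"):
    """Return the shell commands that would customize and write one card."""
    return [
        f"dd if={BASE_IMAGE} of={device} bs=4M conv=fsync",  # write base image
        f"mount {device}2 /mnt",                 # mount the root partition
        f"echo {hostname} > /mnt/etc/hostname",  # set per-node hostname
        f"echo 'address={ip}' >> /mnt/etc/network.d/ethernet-static",
        "umount /mnt",
    ]

if __name__ == "__main__":
    host, addr = sys.argv[1], sys.argv[2]
    for cmd in build_commands(host, addr):
        print(cmd)  # dry run; feed to a shell (or subprocess) for a real run
```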

At day's end, Kiepert has a cheap, working supercomputer, albeit one that still uses "electrical tape to hold the fans on the case!"  So now for the 64-bit question: "How fast does it run?"

Kiepert ran the High Performance Linpack (HPL) benchmark, the standard supercomputer benchmark, on his home-made computer. He found that his RPiCluster, with its 32 Broadcom BCM2708 ARM11 processors running at 1GHz and 14.6GB of usable RAM, turned in an HPL peak performance of 10.13 GFLOPS. That's not going to get this cluster into the TOP500 supercomputer list, but as Kiepert observed, "the first Cray-2 supercomputer in 1985 did 1.9 GFLOPS. How times have changed!"
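For a sense of scale, the arithmetic behind those figures is quick to check:

```python
# Put the 10.13 GFLOPS result in perspective.
hpl_gflops = 10.13      # measured RPiCluster HPL result
nodes = 32
cray2_gflops = 1.9      # Cray-2 (1985) performance, per the article

per_node = hpl_gflops / nodes            # GFLOPS each RPi contributed
vs_cray2 = hpl_gflops / cray2_gflops     # cluster vs. the 1985 Cray-2

print(f"{per_node:.2f} GFLOPS per node")  # ~0.32
print(f"{vs_cray2:.1f}x a 1985 Cray-2")   # ~5.3x
```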

  • Waiting

    for the commercial application of this idea on a smaller scale. I wonder what $500 worth of it would do?
    D.J. 43
    • The Raspberry Pi is the cheapest component

      If you want to have a super computer, and don't have the expertise to run it yourself (Super Computers ain't no game for kids), you'll eat your entire budget when you hire someone to make it do something, whatever that might be. It'll cost many thousands of $$ to make it do something useful.

      Just buying $500 worth of components and cobbling them together serves no purpose, but having a purpose makes the expense worthwhile.
      • Linux is knowledge, science and real evolution of IT...

        ...while Windows is just a waning Neanderthal that can't adapt to a new way of life.
        Napoleon XIV
        • Interesting that you went for the "Linux vs. Windows" bash

          when the post you replied to didn't mention either one.

          The basic premise Cynical99 stated is still correct: if you don't have expertise in setting up/utilizing a supercomputer, then just buying the parts to build one won't help you. And yes, expertise in using supercomputers is separate from expertise in Windows/Linux/OS X, so just because someone is an expert in a *desktop* setting doesn't mean they're automatically a supercomputing expert.
          • Thanks for the response

            Absolutely accurate!
        • ..while Windows is just a waning Neanderthal that can't adapt to a new way of life.

          Any operating system can be tailor-made to run anything from the simplest embedded application to supercomputing needs. The NT kernel/derivative underneath the current crop of Windows was designed from the ground up for heavy lifting. Current Windows Server supports 16 nodes and can easily be expanded if there are financial incentives to do so...
          • actually, no, the current crop of windows is not built from the ground up

            Windows supporting 16 nodes, now that is heavy. I remember 13 or so years ago meeting Doug Eadline in person when supplying him parts for 32-node or greater Linux clusters. Windows is really getting traction here... or maybe not.

            The current crop of Windows is built on the old crop of Windows, with the old performance bottlenecks still in place. Maybe Google around a bit for that MS dev who calls out MS and their development model, which explains why Windows is a non-starter from a performance and scalability point of view, and also shows how new features are more important than bug fixes or performance enhancements to MS. Of course, new features then become stale and fall into the same pit of non-enhancement that things like the NT kernel and NTFS are diseased with.
            Heavy lifting indeed.
          • Not true. Windows HPC was actually more efficient than linux

            Windows HPC is an add-on to handle the low-latency backend stuff, but on the same supercomputer it actually came closer to theoretical peak than Linux. Not by much. So Windows is capable of very efficient operation. Your typical desktop installation involves a lot more than a server requires, and so Windows tends to have a larger memory footprint. This can hurt performance when compared directly to Linux, but in a strict server environment, when there are many GB of memory, it performs quite well compared to Linux.
      • I disagree

        Hooking up multiple mini computers, such as the RPi, is a great way for kids to learn (including me, I'm 14). I am really mature, so I won't use all caps. I do not agree with your above post. I know programming (C, C++, C#, Java, Lua, Python, Pascal, Fortran, and perl to be exact), and I do admit, it's a great way to learn!
        Zack Lemanski
  • Raspberry Pi is looking better and better

    I need to learn how to build my own PC anyway.
    John L. Ries
    • There isn't much to it...

      I'm sure there's a video with a Paula Deen look-alike that you can "build" along with. The best part is it won't involve any butter!

      In all seriousness, it isn't very complicated at all. Components generally fit only one way. Some components stick directly into the motherboard, while others (like the hard drive, disc drives, etc.) get attached via two cables, data and power, each of which has a different end. The HARDEST part is probably fussing with setting up the CPU's heat sink with thermal paste. Essentially you need just enough for a thin coat; I'd suggest watching one of the 1000's of free videos online. Otherwise, to set up the power connections to the motherboard, you have to match the wires with what it says on the motherboard. If you have both manuals in front of you it's actually a LOT less complicated than it sounds... it's kind of like playing Mahjong.

      I'm sure you can find an old computer at a garage sale for the cost of a movie, pop, and popcorn (or even less) that you can disassemble and reassemble a few times to get the hang of it, without potentially killing expensive parts you can't return.

      Hopefully you do follow through with learning how to actually build a proper system. It's a good hobby to have and it can save you tons and tons of money. Best of luck on your endeavors!
    • Great use for RPI in bulk

      You have to give this student credit for the engineering work on this project. Who knows it might just lead to bigger things.

      I remember a project kinda like this several years ago.
    • They are fun little devices

      I built mine using a 2GB SD card to house only the /boot partition, with the rest of the OS (Gentoo) on a powered external USB hard drive, in order to avoid the dismal I/O performance and corruption issues of running it all on an SD card. The next one I build, I am going to run the Pi, 2.5" HD, and extra network card off of a USB-powered hub to reduce the number of power cables to one.
    • Be sure to use Class 10 SD cards to keep it fast.
  • Not The Performance, But The Parallelism

    Bear in mind this guy wasn't trying to put together a high-performance machine, he just wanted something to test massively-parallel algorithms on.
  • This is not a super computer in any sense. It models a super-computer

    10 GFLOPS won't even match a single x86 core, much less an entire processor.
    I like the RPi and I have one, but really? Even overclocked they are a bit slow.
    • Re: This is not a super computer in any sense

      What's a "supercomputer", anyway? Compare any of the current inhabitants of the Top500 list with your desktop PC, and the uniprocessor performance will be much the same. The only thing unique to a "supercomputer" nowadays is the massive parallelism--everything else is off-the-shelf parts that you and I can pick up in any old retail place.

      So what this guy did is very much capture the essence of a super in a small box.
      • If you ever tried SETI, it's the ultimate in parallel processing.

        Its range of computers is vast; anyone can contribute their computer to the giant network when it is idle. Connecting to it is completely transparent, at least with Linux; I've never seen any ill effects.
        • Re: If you ever tried SETI, it's the ultimate in parallel processing.

          But SETI does not require much communication between nodes: each gets its task, goes away to run it to completion, and then reports back.

          That's easy parallelism. Supercomputers, on the other hand, are designed to deal with hard parallelism.
          • Something's missing

            To clarify: what Mr. Kiepert built is a compute cluster (Beowulf cluster), a number cruncher. SETI is a grid solution, which also crunches numbers but relies on a different philosophy overall. While a cluster exposes a "virtual" instance of the "consolidated" operating system instances of each of the nodes in the cluster, a grid more or less ignores the operating-system specifics and exposes each instance's computing capacity as a kind of service.
            The first allows for much better performance on an equivalent number of nodes; the second allows for practically linear scalability.
            I suppose that, starting from this model, a much larger cluster may be built with Raspberry Pi nodes, increasing the computing power within the GFLOPS range...