- Bootable second-generation Xeon Phi processors
- 64 cores (256 threads) per node
- 16GB on-die MCDRAM per processor
- Up to 384GB of DDR4 RAM per node
- Four server nodes in 2U chassis
- Not a general-purpose server platform
- High processor TDP
It may look like just another rack-mount server, but to understand what Boston's Quattro 12256-T is all about you need to be familiar with Intel's Xeon Phi processor, which is far removed from the regular Xeon CPUs typically found in such systems.
Developed as an alternative to market-leading Nvidia Tesla and AMD FirePro GPUs, the Phi takes the same approach of dumping complex chip architectures in favour of a huge number of much simpler cores. Equipped with their own RAM and high-speed interconnects linking all the cores together, GPUs were originally developed to handle graphics processing but can also be used alongside a conventional CPU to offload parallel processing workloads and, thereby, boost the performance of HPC applications.
The first release of the Xeon Phi was on a plug-in PCIe card, just like the Tesla, but the second generation, known as Knights Landing (KNL), is not just faster but also available in a standard CPU form factor ready to plug into a host processor socket. More than that, the new Xeon Phi can boot and run a standard x64 operating system, typically Linux, for easier integration into enterprise data centres. It also carries 16GB of on-die 'near memory' based on MCDRAM technology to further boost throughput, and is available with optional support for direct connection to massively parallel HPC clusters using Intel's preferred Omni-Path technology.
The Quattro story of Phi
The Xeon Phi may be radically different from other Xeon processors, but the Boston Quattro server in which it's used here isn't. In fact the 2U Quattro chassis from Supermicro has been around for a number of years now and is effectively a mini-blade platform with four sleds instead of blades, each designed to take an independent server motherboard. These plug in at the rear of the unit with power provided by a pair of hot-swap 2000W supplies that, similarly, slide into place at the back.
On a Quattro fitted with standard Xeon processors there's usually a small amount of disk or SSD storage located within the sleds themselves, but not on this model. Instead, the storage is arranged across the front of the unit and organised into four banks of three, one per node (server) connected via a passive backplane in the centre of the chassis.
The Xeon Phi also has a relatively high thermal rating (up to 260 Watts) which means that, even though there's only one per sled, it can get very hot in there, so there's a bank of fans in the middle of the unit and ducting in the sleds to direct the airflow and keep the processors cool. The end result is as noisy as you would expect -- but that's not really an issue on this kind of server, which will typically be locked away in a soundproofed machine room.
The Supermicro motherboard in the bottom of each sled has a single socket to take the Xeon Phi processor, which is based on highly modified Intel Atom Silvermont cores and available in a number of variants.
The main differentiator is the number of usable cores/threads. Our review system shipped with 1.3GHz Xeon Phi 7230 processors, giving it 64 cores (256 threads) per node. The alternatives further up the range are 1.4GHz chips with 68 cores (272 threads) and 1.5GHz chips with 72 cores (288 threads).
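As a quick check on those figures, the thread counts quoted all work out to four hardware threads per core. The sketch below is only illustrative: the function names are made up for this example, and the four-node total simply assumes every sled in the chassis is fitted with the same chip.

```python
# Illustrative arithmetic only; names here are hypothetical, not from any vendor tool.
THREADS_PER_CORE = 4   # implied by the review's figures (64 cores -> 256 threads)
NODES_PER_CHASSIS = 4  # four sleds in the 2U Quattro chassis


def threads_per_node(cores):
    """Hardware threads on one node with a single Xeon Phi of `cores` cores."""
    return cores * THREADS_PER_CORE


def chassis_totals(cores_per_node, nodes=NODES_PER_CHASSIS):
    """Return (cores, threads) for a chassis where every node has the same chip."""
    total_cores = cores_per_node * nodes
    return total_cores, total_cores * THREADS_PER_CORE


if __name__ == "__main__":
    for cores in (64, 68, 72):
        total_cores, total_threads = chassis_totals(cores)
        print(f"{cores} cores/node: {threads_per_node(cores)} threads/node, "
              f"{total_cores} cores and {total_threads} threads per chassis")
```

On that basis, a fully populated chassis with the 64-core part offers 256 cores and 1,024 threads in 2U.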
The processor is neatly sandwiched between two sets of DIMM slots able to take up to 384GB of DDR4 RAM, which can be clocked at either 2133MHz or 2400MHz depending on the processor fitted. In addition, each Xeon Phi chip has a further 16GB of high-performance multi-channel memory (MCDRAM) built in as standard on the die. This is 4-5 times faster than DDR4 and, under application control, can be used as a high-speed cache, a distinct NUMA node or a combination of the two in order to keep all those cores fed with data and instructions.
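To make the trade-off between those memory modes concrete, here is a minimal sketch of how much memory software would see in each case. It rests on one assumption worth stating plainly: any portion of MCDRAM used as cache is treated as invisible to the operating system, while the remainder is added to the addressable total as its own NUMA node. The function and its `cache_fraction` parameter are hypothetical, invented for this illustration.

```python
# A rough model, not a description of how the hardware is actually configured.
def addressable_memory_gb(ddr_gb, mcdram_gb, cache_fraction):
    """Memory visible to software when `cache_fraction` of MCDRAM acts as cache.

    cache_fraction = 1.0 -> MCDRAM is all cache (assumed invisible to software)
    cache_fraction = 0.0 -> MCDRAM is all exposed as a separate NUMA node
    anything in between  -> a combination of the two
    """
    if not 0.0 <= cache_fraction <= 1.0:
        raise ValueError("cache_fraction must be between 0 and 1")
    numa_visible_mcdram = mcdram_gb * (1.0 - cache_fraction)
    return ddr_gb + numa_visible_mcdram


if __name__ == "__main__":
    # Figures from the review: up to 384GB DDR4 plus 16GB MCDRAM per node.
    for label, fraction in (("all cache", 1.0), ("NUMA node", 0.0), ("half and half", 0.5)):
        print(f"{label}: {addressable_memory_gb(384, 16, fraction):.0f}GB addressable")
```

The point of the model is simply that using MCDRAM as cache costs addressable capacity but needs no changes to the application, whereas exposing it as a NUMA node leaves placement decisions to the software.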
It's also worth noting that, even though each sled contains an independent server with up to 72 cores (288 threads), Xeon Phi processors are designed to work together to build highly scalable processing clusters containing tens of thousands of nodes to handle massively parallel workloads. To this end the review machine was fitted with Mellanox InfiniBand PCIe adapters (one per node), although there are Xeon Phi variants available that come equipped with Intel's own Omni-Path fabric on-board. These are identified by a protruding connector on the processor designed to fit into a gap in the motherboard socket to take the cables for connection to an external switch.
A second PCIe slot is also available, along with two Gigabit Ethernet interfaces per node managed by an on-board Intel i350 controller, plus an integrated IPMI management interface with KVM and another, dedicated, LAN connector. Storage, meanwhile, is looked after by a SATA3 controller in the Intel C612 chipset with RAID 0, 1, 5 and 10 support, where needed.
The key to positioning the Boston Quattro 12256-T, and Intel's Xeon Phi in general, is understanding that it's not your everyday Xeon server. It's not designed to run Windows (although it can) or Windows applications, and it's certainly not a virtualisation platform. Neither does it make a good platform for cloud applications. For all these reasons, it should not be considered as an alternative to standard Xeon-based servers for the majority of enterprise IT projects.
Where it scores best is in handling workloads that can be broken down into multiple processing streams and spread across hundreds or thousands of cores in parallel. Workloads such as DNA sequencing, for example, or deep learning, big data analytics and complex modelling.
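The shape of such a workload can be sketched in a few lines: split the input into independent chunks, process each chunk separately, then merge the partial results. The example below counts bases in a toy DNA string using a Python thread pool. It is purely illustrative of the decomposition; Python threads will not deliver the speed-up, and real HPC code for this class of machine would more likely use something like OpenMP or MPI. All function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(sequence, parts):
    """Split `sequence` into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, -(-len(sequence) // parts))  # ceiling division
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def count_bases(chunk):
    """Work done independently on each chunk: tally the characters it contains."""
    return Counter(chunk)


def parallel_base_counts(sequence, workers=4):
    """Map `count_bases` over the chunks, then merge the partial tallies."""
    chunks = split(sequence, workers)
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_bases, chunks):
            total.update(partial)
    return total


if __name__ == "__main__":
    print(parallel_base_counts("ACGTACGTAA", workers=3))
```

Because no chunk depends on any other, the same pattern scales from a handful of workers to the hundreds of threads a Xeon Phi node exposes; workloads that cannot be carved up this way gain far less.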
The Quattro 12256-T isn't cheap by any means -- the Xeon Phi 7230 CPUs used in our review unit cost $3,710 each, for example. However, Boston's server compares well on price with other HPC solutions: the Quattro 12256-T starts at £17,707 (ex. VAT, or $21,603) for a 4-node server with a Xeon Phi 7210, 48GB of DDR4 ECC RAM, a 2TB disk and a single-port Omni-Path fabric adapter per node. Moreover, it's available for delivery now, providing a practical platform from which to exploit the capabilities of Intel's latest Xeon Phi processors, which are already proving popular with HPC customers worldwide.