CSIRO searches for astronomical supercomputer

The CSIRO has published a request for tender on behalf of the Pawsey Supercomputing Centre to replace the decommissioned Fornax system with a budget of AU$1.5 million.

The Commonwealth Scientific and Industrial Research Organisation (CSIRO) is on the hunt for a new Advanced Technology Cluster (ATC) to replace the decommissioned Fornax system at the Pawsey Supercomputing Centre in Perth, a national supercomputing joint venture between the CSIRO, Curtin University, Edith Cowan University, Murdoch University, and the University of Western Australia.

With a budget of AU$1.5 million, the CSIRO has specified the new ATC must meet the needs of the radio astronomy research community and high-end researchers in other areas of computational science, such as geosciences, nanotechnology, and biotechnology.

In its request for tender (RFT), the CSIRO dictates that there are two technology components for the ATC, with tenderers required to separately bid for each.

The AU$1.5 million maximum spend is inclusive of the hardware, software licensing, maintenance and support requirements, installation, and commissioning costs for both components.

Each compute node in Component A is required to include at least one Xeon Phi processor with a Knights Landing architecture, and all Xeon Phi processors in Component A must be bootable in a main socket configuration rather than operate as accelerators, the RFT specifies.

All compute nodes in Component A are to be connected by an Omni-Path interconnect, with a bandwidth of at least 100Gbps per node, and all compute nodes will need to be able to mount the existing Pawsey Lustre scratch and group filesystem. The RFT also requests that all compute nodes be connected by a 100Gbps EDR InfiniBand interconnect and have a minimum of 64GB of ECC DDR memory.

Each compute node in Component A is also required to have a DDR memory configuration that is optimised for performance but still allows the use of all cluster modes on the Xeon Phi processor(s). Each compute node should also contain sufficient storage for the operating system, or alternatively boot via the Ethernet management network using Pawsey's existing xCAT infrastructure.

The tenderer is also to fully support a recent version of Red Hat Enterprise Linux (7.2 or higher) or SUSE Linux Enterprise Server (12.1 or higher) on all nodes.
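The Component A requirements reported above can be read as a checklist. The following is a minimal Python sketch of that reading, not anything taken from the RFT itself: the `ComponentANode` class, its field names, and the `component_a_shortfalls` helper are all hypothetical, and the check covers only the Xeon Phi, Omni-Path, memory, filesystem, and operating system items.

```python
from dataclasses import dataclass


@dataclass
class ComponentANode:
    # Hypothetical description of one proposed node; field names are
    # illustrative and do not come from the RFT.
    xeon_phi_count: int
    xeon_phi_architecture: str
    xeon_phi_self_booting: bool   # main socket, not an accelerator card
    omni_path_gbps: float
    ecc_ddr_gb: int
    mounts_pawsey_lustre: bool
    os_name: str                  # "RHEL" or "SLES" in this sketch
    os_version: tuple             # (major, minor)


# Minimum operating system versions as reported in the article.
MIN_OS_VERSION = {"RHEL": (7, 2), "SLES": (12, 1)}


def component_a_shortfalls(node):
    """Return a list of the reported requirements the node does not meet."""
    problems = []
    if node.xeon_phi_count < 1 or node.xeon_phi_architecture != "Knights Landing":
        problems.append("needs at least one Knights Landing Xeon Phi")
    if not node.xeon_phi_self_booting:
        problems.append("Xeon Phi must be bootable in the main socket")
    if node.omni_path_gbps < 100:
        problems.append("Omni-Path bandwidth below 100Gbps per node")
    if node.ecc_ddr_gb < 64:
        problems.append("less than 64GB of ECC DDR memory")
    if not node.mounts_pawsey_lustre:
        problems.append("cannot mount the existing Pawsey Lustre filesystem")
    minimum = MIN_OS_VERSION.get(node.os_name)
    if minimum is None or node.os_version < minimum:
        problems.append("operating system is not RHEL 7.2+ or SLES 12.1+")
    return problems
```

A tenderer-side script could build one `ComponentANode` per proposed configuration and treat an empty list from `component_a_shortfalls` as a sign that the reported minimums are met.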

Each compute node in Component B is to include at least two GPUs with a Pascal architecture, and each node is expected to include at least two host processors with either a Power or Intel x86 64-bit architecture.

The RFT specifies all compute nodes in Component B are to be connected by an EDR InfiniBand interconnect with a bandwidth of at least 100Gbps per node. All compute nodes must also be capable of mounting the existing Pawsey scratch and group filesystem via the Pawsey FDR InfiniBand infrastructure, with uplinks to the existing Mellanox SX6356 series switch and at least 80Gbps of bandwidth.

All compute nodes in Component B may be connected by a 100Gbps Omni-Path interconnect. The RFT also states the GPUs in Component B should support NVLink, and have a minimum of 64GB of ECC DDR memory, with each optimised for performance.

Similar to Component A, each compute node in Component B may contain storage for the operating system, or alternatively boot via the Ethernet management network using Pawsey's existing xCAT infrastructure.
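The firm Component B requirements reported above (GPUs, host processors, interconnect, and filesystem bandwidth) can be written as a small rule table. This is only an illustrative Python sketch: the dictionary keys, the `COMPONENT_B_RULES` list, and the `failed_rules` helper are hypothetical names, and the optional items such as Omni-Path and NVLink are left out.

```python
# Hypothetical description of one proposed node; the keys are illustrative
# and are not taken from the RFT.
example_node = {
    "gpu_arch": "Pascal",
    "gpu_count": 2,
    "host_cpu_arch": "x86_64",
    "host_cpu_count": 2,
    "edr_infiniband_gbps": 100,
    "filesystem_uplink_gbps": 80,
}

# Each rule pairs a readable label with a check on the node description.
COMPONENT_B_RULES = [
    ("at least two Pascal GPUs",
     lambda n: n["gpu_arch"] == "Pascal" and n["gpu_count"] >= 2),
    ("at least two Power or x86 64-bit host processors",
     lambda n: n["host_cpu_arch"] in {"power", "x86_64"}
     and n["host_cpu_count"] >= 2),
    ("EDR InfiniBand at 100Gbps or more per node",
     lambda n: n["edr_infiniband_gbps"] >= 100),
    ("at least 80Gbps towards the existing filesystems",
     lambda n: n["filesystem_uplink_gbps"] >= 80),
]


def failed_rules(node):
    """Return the labels of the rules that the node description fails."""
    return [label for label, check in COMPONENT_B_RULES if not check(node)]


if __name__ == "__main__":
    print(failed_rules(example_node))
```

Keeping the rules in a list rather than in nested conditionals makes it straightforward to add or relax an item if the tender wording differs from this summary.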

The Pawsey Supercomputing Centre currently looks after four systems: Epic, Fornax, Magnus, and Galaxy.

Epic is a decommissioned 100 teraflop cluster-based system with a high-performance interconnect installed at Murdoch University.

Fornax, the system to be replaced under the RFT, is a heterogeneous computing environment installed within an existing building on the University of Western Australia campus in Perth. It contained a number of HPC technologies including GPUs, dual interconnect, large memory, and local storage to support data intensive science and inform the procurement of a replacement for the Magnus system.

Magnus is a Cray XC40 petascale system, which is currently the most powerful public research supercomputer in the southern hemisphere. Magnus is supported by a smaller commodity cluster, Zeus, for pre/post-processing and visualisation.

Galaxy is a Cray system with a similar architecture to Magnus, dedicated to supporting the radio astronomy data processing of the ASKAP and MWA telescopes.

Both Magnus and Galaxy are installed in the Pawsey Supercomputing Centre.

The installation, configuration, and acceptance of the new system are expected to be completed before April 30, 2017.

In March, the CSIRO welcomed the Dell-powered Pearcey supercomputer to its Canberra site to support research in areas such as bioinformatics, fluid dynamics, and materials science.

A month prior, the Faculty of Science at the University of Western Australia also welcomed its own high-performance computing (HPC) cluster to its Perth campus to assist with computational chemistry, biology, and physics.

The CSIRO has also mentioned in its latest RFT that it is looking for a vendor with supply chain integrity, meaning one that, after conducting a reasonable examination of its supply chain, is aware of no evidence of exploitation within it.

Earlier this year, 56 electronics companies were probed as part of Baptist World Aid's 2016 Electronics Industry Trends report, which evaluated each company on how ethically and sustainably its products are made, with a focus on human rights and protecting workers from exploitation.

Hisense, Palsonic, and Polaroid failed the test. Whilst no company received an A, Acer, Apple, BSH Group, Intel, LG Electronics, Microsoft, Motorola Mobility, and Samsung received a B+ grade.