Oracle Cloud targets more HPC, adds Ampere for first Arm offering

Oracle on Tuesday outlined its cloud roadmap, announcing a series of hardware and compute updates and a continued focus on HPC workloads.

Fresh off its high-profile deal with TikTok and a series of other major cloud customer wins, Oracle on Tuesday showcased its vision for its cloud business over the next 12 to 18 months. With a heavy focus on high-performance computing (HPC) workloads, the company announced a series of hardware and compute updates, as well as new partnerships.

The announcements include new HPC instances on Oracle Cloud Infrastructure (OCI) powered by Intel "Ice Lake" chips, the general availability of Nvidia A100 GPUs on bare metal instances, and the introduction of E4 compute instances for general-purpose workloads. Additionally, Oracle is partnering with Ampere to offer Oracle's first Arm-based compute instances, and it's partnering with Rescale to make it easier for customers to onboard HPC jobs.

After nearly four years of competing effectively as a niche provider, overshadowed by major public cloud providers like Amazon Web Services, Microsoft Azure and Google Cloud, Oracle's cloud business is showing some momentum: With more than 25 regions currently online, OCI should have 36 regions up and running globally by this time next year. Its major customers include Nissan and other automotive companies -- underscoring Oracle's focus on delivering HPC workloads, which are typically difficult to run in the cloud. 

Karan Batta, VP of product management for OCI, said that Oracle's focus on HPC shows how the company is competing with other cloud providers "by solving problems that they haven't solved before."

"The problem is a lot of the applications used in these [HPC] workloads are 30 to 40 years old," Batta said. "Even moving from Intel to AMD would cause different results. It's a very fragile workload, and it's a very large-volume workload." 

The core principle that's driven Oracle's cloud strategy, Batta said, is that customers "want the best of on prem -- meaning performance and specialization -- as well as the best of what supposedly the cloud should offer, which is pay-per-use, scalability, etc."

In the first half of next year, as part of its HPC platform roadmap, Oracle will offer HPC compute instances based on Intel's "Ice Lake" processors. The new instances should provide 30 percent more per-core performance than prior Oracle instances for workloads such as crash simulations, computational fluid dynamics (CFD) and electronic design automation (EDA).

Customers will be able to deploy these instances as bare metal, get NVMe storage for local checkpointing, and build clusters of these instances on Oracle's RDMA-enabled Cluster Network.

Meanwhile, Oracle also announced Tuesday the general availability of Nvidia A100 Tensor Core GPUs on bare metal instances. They'll be available starting September 30 in Oracle's US, EMEA and JAPAC regions at an on-demand price of $3.05 per GPU per hour. This will serve HPC as well as AI and machine learning workloads, and customers can scale up to 512 GPUs in a single cluster. 

The new instances provide up to 1.6 Tbps of bandwidth per bare-metal node housing eight A100 GPUs, all fully interconnected with NVLink. They provide customers with over 25 TB of local NVMe storage and 2 TB of RAM for large-scale graph workloads or accelerated databases.
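To put the quoted pricing in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures stated above ($3.05 per GPU per hour, eight GPUs per bare-metal node, up to 512 GPUs per cluster). The function names are hypothetical, and the assumption that usage is billed in whole eight-GPU nodes is ours, not something Oracle confirmed.

```python
from decimal import Decimal

# Figures quoted in the article.
PRICE_PER_GPU_HOUR = Decimal("3.05")  # on-demand price, USD
GPUS_PER_NODE = 8                     # A100 GPUs per bare-metal node
MAX_GPUS_PER_CLUSTER = 512            # stated cluster ceiling


def nodes_needed(gpus: int) -> int:
    """Number of eight-GPU nodes required to supply `gpus` GPUs."""
    if not 0 < gpus <= MAX_GPUS_PER_CLUSTER:
        raise ValueError(f"gpus must be between 1 and {MAX_GPUS_PER_CLUSTER}")
    return -(-gpus // GPUS_PER_NODE)  # ceiling division


def hourly_cost(gpus: int) -> Decimal:
    """Estimated on-demand cost per hour in USD.

    Assumption (not from the article): bare-metal nodes are billed whole,
    so a request is rounded up to a multiple of eight GPUs.
    """
    return nodes_needed(gpus) * GPUS_PER_NODE * PRICE_PER_GPU_HOUR


if __name__ == "__main__":
    for gpus in (8, 64, 512):
        print(f"{gpus:>3} GPUs: {nodes_needed(gpus):>2} nodes, "
              f"${hourly_cost(gpus)} per hour")
```

Under these assumptions, a single node works out to $24.40 per hour, and a full 512-GPU cluster spans 64 nodes at $1,561.60 per hour.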

"Our retail [customers] run large graph processing workloads for recommendation engines -- they need large memory for that," Batta said. 

Meanwhile, customers in the oil and gas industry do seismic processing. "All of that seismic data, they want it to be local, so we had to increase local storage for that," he said. "We ended up building this giant Ferrari, so to speak, for the GPU world. It's in most cases better than on-prem."

In other hardware news, Oracle is partnering with Ampere to provide its first Arm-based compute instances. Early next year, customers will be able to launch bare-metal or virtual machines with up to 160 cores with 3.3 GHz turbo frequency on a wide variety of Linux distros, including Oracle Linux and Ubuntu. Customers will be able to choose the levels of cores or memory they need. Apart from pure compute instances, they'll also be able to use these instances as part of Oracle's Always Free Tier for developing and testing.

"No one's really thought about Arm in the cloud in the way they have," Batta said of Ampere. Adding to OCI's options from Intel, AMD and Nvidia, he said, Ampere contributes to "a full portfolio of choices for our customers to pick and choose whatever they want."

Next, Oracle announced that E4 instances, the next generation of its elastic computing instances, will be available early next year. While Oracle's E3 instances are built on AMD's Rome generation of CPUs, E4 instances will be powered by AMD Milan CPUs. All of these instances are part of Oracle's "flexible infrastructure" offerings, which let customers choose exactly how many cores and how much memory they want.

Meanwhile, Oracle's partnership with Rescale should make it easier for customers to onboard and get HPC jobs running in under a day. Rescale has more than 450 applications pre-installed on Oracle's HPC instances and lets customers bring their own licenses as well. 

"Sometimes we have smaller organizations that don't necessarily have the expertise or people power to build on top of our cloud," Batta said. Those customers can "go to Rescale, submit a job, and it works seamlessly -- allowing a longer tail of customers to use the benefits of our cloud through a provider that makes it simple."