Nvidia and Databricks announce GPU acceleration for Spark 3.0

At its GPU Technology Conference event today, Nvidia is announcing GPU acceleration for Apache Spark 3.0, made possible by open source software developed in collaboration with Spark's creators at Databricks.

At its GPU Technology Conference (GTC) event today, consumer graphics and AI silicon powerhouse Nvidia is announcing its next-generation graphics processing unit (GPU) architecture, dubbed Ampere, and its first Ampere-based GPU, the A100. Nvidia says the Ampere GPUs can offer a 20-fold performance improvement over its previous Volta GPU architecture, which itself offers vastly faster processing times for AI workloads than do conventional central processing units (CPUs). For more details, please see ZDNet's Natalie Gagliordi's coverage of all the Nvidia Ampere-related news today.

Lighting up Spark

What I'd like to cover here goes beyond those AI headlines, however, and involves a special nugget just for folks doing data engineering, analytics and machine learning work with Apache Spark. Specifically, Nvidia is announcing new GPU-acceleration capabilities coming to Apache Spark 3.0, the release of which is anticipated in late spring.

The GPU acceleration functionality is based on the open source RAPIDS suite of software libraries, themselves built on CUDA-X AI. The acceleration technology, named (logically enough) the RAPIDS Accelerator for Apache Spark, was collaboratively developed by Nvidia and Databricks (the company founded by Spark's creators). It will allow developers to take their Spark code and, without modification, run it on GPUs instead of CPUs. This makes for far faster machine learning model training times, especially if the hardware is based on the new Ampere-generation GPUs, which by themselves offer training and inferencing/scoring times at least five times faster than those of their Nvidia Volta predecessors.
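
To make "without modification" concrete: the switch to GPUs is expected to happen in the job's configuration rather than in its code. The sketch below is mine, not Nvidia's or Databricks', and it only assembles a spark-submit command line. The configuration keys are taken from the Spark 3.0 and RAPIDS Accelerator documentation as I understand them, not from today's announcement; the jar names, the application script and the GPU amounts are placeholders, and some cluster managers also need a GPU discovery script that is left out here.

```python
import shlex

# Assumed settings (not from the article): plugin class and keys as documented
# for the RAPIDS Accelerator and Spark 3.0 resource scheduling. Values are
# illustrative placeholders, not tuning advice.
RAPIDS_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",   # load the accelerator plugin
    "spark.rapids.sql.enabled": "true",              # let SQL/DataFrame ops use the GPU
    "spark.executor.resource.gpu.amount": "1",       # one GPU per executor
    "spark.task.resource.gpu.amount": "0.25",        # four tasks share that GPU
}


def build_submit_command(app, jars, conf=None):
    """Return a spark-submit argv list for an unchanged Spark application.

    `app` is the existing job script, `jars` the accelerator jars (placeholder
    names), and `conf` optional overrides merged on top of RAPIDS_CONF.
    """
    merged = dict(RAPIDS_CONF)
    merged.update(conf or {})
    cmd = ["spark-submit", "--jars", ",".join(jars)]
    for key in sorted(merged):
        cmd += ["--conf", f"{key}={merged[key]}"]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, used only to show the shape of the command.
    argv = build_submit_command("etl_job.py", ["rapids-4-spark.jar", "cudf.jar"])
    print(shlex.join(argv))
```

Passing an override such as {"spark.rapids.sql.enabled": "false"} to the same helper would send the identical job back to the CPU path, which is a convenient way to compare the two.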

Multi-workload, multi-cloud

Faster training times allow for greater volumes of training data, which are needed for greater accuracy. But Nvidia says the RAPIDS Accelerator also dramatically improves the performance of Spark SQL and DataFrame operations, so GPU acceleration benefits non-AI workloads as well. This means the same Spark cluster hardware can be used for both data engineering/ETL workloads and machine learning jobs.

That, in turn, avoids the need to provision a separate Spark cluster dedicated to AI work, and allows the entire load-process-train-test pipeline to run together in a single job, on a single cluster. Finally, Nvidia says, the RAPIDS Accelerator also boosts data transfer performance across nodes in a Spark cluster by leveraging the open source Unified Communication X (UCX) framework, which enables data to move directly from one GPU's memory to another's.

Moreover, since the RAPIDS Accelerator is designed for open source Apache Spark, it will benefit not just users of the Databricks platform but also users of machine learning platforms offered by the major public cloud providers. In an advance briefing for members of the press, Nvidia CEO Jensen Huang explained that users of Spark clusters on Azure Machine Learning or Amazon SageMaker can benefit from the GPU acceleration as well.

Adobe experiences GPU acceleration

Adobe, an Nvidia partner that is also a customer of Databricks, has been test-driving the GPU-accelerated Spark 3.0 technology and says it has achieved a 7x performance improvement and 90% cost savings. "We're seeing significantly faster performance with NVIDIA-accelerated Spark 3.0 compared to running Spark on CPUs," said William Yan, senior director of Machine Learning, Adobe. "With these game-changing GPU performance gains, entirely new possibilities open up for enhancing AI-driven features in our full suite of Adobe Experience Cloud apps."

Apache Spark is, at this point, such a pervasive platform -- both in standalone and embedded form -- for analytics and machine learning, that the RAPIDS Accelerator has the potential to drive mainstream adoption of GPU technology. It will be up to the public cloud providers to make the economics of GPU-based infrastructure compelling enough to support such widespread GPU propagation. For now, though, you can bet that with so many teams and organizations racing towards a vaccine and effective treatments for COVID-19, there will be a significant cohort of ready, willing and able early adopters for GPU-accelerated Spark.