Cloudera said it will integrate its Cloudera Data Platform (CDP) with Nvidia's accelerated Apache Spark 3.0 libraries.
According to Cloudera, the integration will accelerate data pipelines and make it easier to add machine learning workflows to processes.
Cloudera Data Platform added Applied ML Prototypes (AMPs) earlier this year. AMPs often run on Nvidia GPU hardware.
The Apache Spark 3.0 libraries are accelerated using Nvidia's RAPIDS platform. Cloudera is looking to eliminate bottlenecks for data scientists and help them scale machine learning models.
Nvidia's GPU acceleration for Apache Spark aims to speed up data preparation and model training, orchestrate pipelines from data to training to visualization, and cut infrastructure costs.
Cloudera said GPU-accelerated Apache Spark 3 runs natively on CDP and can plug into high-performance computing tools.
The RAPIDS Accelerator for Apache Spark will be available in CDP Private Cloud this summer. Nvidia and Cloudera will roll out additional accelerated offerings in CDP over time, starting with Accelerated Deep Learning and Machine Learning in CDP Public Cloud in May.