Cloudera said it will integrate its Cloudera Data Platform (CDP) with Nvidia’s accelerated Apache Spark 3.0 libraries.

According to Cloudera, the integration will accelerate data pipelines and make it easier to add machine learning workflows to processes.

Cloudera Data Platform added Applied Machine Learning Prototypes (AMPs) earlier this year. AMPs often run on Nvidia GPU hardware.

The Apache Spark 3.0 libraries are accelerated using Nvidia’s RAPIDS platform. Cloudera is looking to eliminate bottlenecks for data scientists and help them scale machine learning models.
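
For readers curious what RAPIDS acceleration looks like at the configuration level, the following is a minimal sketch of the Spark settings commonly used to turn on Nvidia's RAPIDS Accelerator plugin. The helper names (rapids_spark_conf, to_submit_args) and the default values are hypothetical and for illustration only. The article does not describe how CDP applies these settings, and the sketch assumes the RAPIDS Accelerator jar is already available to the cluster.

```python
def rapids_spark_conf(gpus_per_executor=1, tasks_per_gpu=4):
    """Build Spark settings that would enable the RAPIDS Accelerator.

    Hypothetical helper: the defaults are illustrative assumptions,
    not values recommended by Cloudera or Nvidia.
    """
    if gpus_per_executor < 1 or tasks_per_gpu < 1:
        raise ValueError("gpus_per_executor and tasks_per_gpu must be >= 1")
    return {
        # Load the RAPIDS Accelerator as a Spark plugin.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Let the plugin run supported SQL/DataFrame operations on the GPU.
        "spark.rapids.sql.enabled": "true",
        # Spark 3.0 resource scheduling: GPUs requested per executor.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Fraction of a GPU each task claims, so several tasks share one GPU.
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
    }


def to_submit_args(conf):
    """Turn a settings dict into a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(rapids_spark_conf())))
```

The same key/value pairs could instead be passed to a SparkSession builder. Existing Spark SQL and DataFrame code is meant to run unchanged, with supported operations moved to the GPU.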

Nvidia’s GPU acceleration for Apache Spark aims to speed up data preparation and model training, orchestrate pipelines from data to training to visualization, and reduce infrastructure costs.

Cloudera said GPU-accelerated Apache Spark 3 runs natively on CDP and can plug into high-performance computing tools.

The public cloud implementation of Nvidia RAPIDS-accelerated Apache Spark 3.0 libraries is now generally available. On-premises integrations will be available in the summer.  


