RAPIDS Workshop

RAPIDS, built on NVIDIA AI, is an open-source suite of GPU-accelerated Python libraries designed to improve your data science and analytics pipelines. With APIs similar to popular open-source data science tools, RAPIDS uses NVIDIA® CUDA® primitives for low-level compute optimization. This exposes GPU parallelism and high-bandwidth memory speed through Python interfaces, leading to faster performance at scale across data pipelines.

GPU computing is changing data science through RAPIDS, an open-source set of software libraries for running end-to-end data science, analytics, and training pipelines. Because the whole pipeline runs on the GPU, training that takes hours on CPUs can, for workloads that parallelize well, drop to minutes.

What skills will you gain from this workshop?

In this workshop, you’ll learn how to build and execute end-to-end GPU-accelerated data science workflows that enable you to quickly explore, iterate, and get your work into production. Using the RAPIDS™-accelerated data science libraries, you’ll apply a wide variety of GPU-accelerated machine learning algorithms, including XGBoost, cuGraph’s single-source shortest path, and cuML’s KNN, DBSCAN, and logistic regression, to perform data analysis at scale.

Prerequisites

Basic machine learning concepts: classification, clustering
Python, pandas, scikit-learn
Basic concepts of data science and its workflow
Familiarity with the Jupyter Notebook environment

Content/Agenda

RAPIDS Technical Overview and Software Architecture
RAPIDS Use Cases and Implementation Scenarios
Hands-on Session Covering Data ETL Pipeline and Algorithm Demonstration

RAPIDS is an open-source platform developed by NVIDIA to accelerate data processing and data analysis on GPUs. Here are some of its advantages:

High Performance:

RAPIDS is designed specifically for GPUs, which can run thousands of threads in parallel. For workloads that parallelize well, this can significantly speed up data processing and shorten the time needed for analysis.

Scalability:

The platform lets you manage and analyze data at scale. By using GPUs, RAPIDS is suited to tasks that involve large datasets, such as geospatial data analysis, genetics, or compute-intensive machine learning.

Integrated Frameworks:

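RAPIDS provides GPU counterparts to several popular data analysis libraries: cuDF (pandas-like DataFrames), cuML (scikit-learn-like machine learning), and cuGraph (NetworkX-like graph analytics). Because the APIs are similar, the libraries are easy to pick up, and existing code can often be migrated with modest changes.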
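The similarity between the cuDF and pandas APIs can be illustrated with a minimal sketch. It assumes cuDF may or may not be installed: if the import fails, it falls back to pandas so the same DataFrame code still runs on the CPU. The function name, column names, and sample values are hypothetical and chosen only for illustration.

```python
# Sketch: the same DataFrame code running on cuDF (GPU) or pandas (CPU).
# Column names and sample values are made up for illustration.
try:
    import cudf as xdf  # GPU DataFrame library from RAPIDS
    BACKEND = "cudf"
except ImportError:
    import pandas as xdf  # CPU fallback; these calls have the same spelling
    BACKEND = "pandas"


def total_sales_by_region(records):
    """Build a DataFrame and sum the 'sales' column per 'region'."""
    df = xdf.DataFrame(records)
    totals = df.groupby("region")["sales"].sum().sort_index()
    # cuDF objects expose to_pandas(); pandas objects are used as they are.
    if hasattr(totals, "to_pandas"):
        totals = totals.to_pandas()
    # Convert to plain Python ints so the result prints cleanly.
    return {region: int(value) for region, value in totals.items()}


records = {
    "region": ["north", "south", "north", "east", "south"],
    "sales": [120, 80, 60, 200, 40],
}
print(BACKEND, total_sales_by_region(records))
```

Only the import line differs between the two backends in this sketch. Real pipelines may need further changes where cuDF does not cover a pandas feature.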

Open Source and Active Community:

RAPIDS is an open-source project, so you can use it for free and contribute to its development. The RAPIDS community is active, so support and learning resources are readily available online.

Energy Efficiency:

GPUs can be more energy-efficient than CPUs for highly parallel processing. Running suitable workloads with RAPIDS on a GPU can therefore reduce energy consumption while improving data processing performance.

Distributed Data Processing:

RAPIDS also supports distributed data processing through the Dask framework, which spreads work across multiple GPUs and nodes. This makes it possible to process larger and more complex datasets than a single GPU can handle.
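The idea behind this kind of distributed processing can be sketched as three steps: split the data into partitions, compute a partial result for each partition in parallel, then combine the partial results. The example below is a conceptual illustration in standard-library Python only. It is not the Dask or RAPIDS API, and all function names and sample values are hypothetical. In an actual RAPIDS deployment, Dask handles the partitioning and scheduling across GPUs.

```python
# Conceptual sketch only: standard-library Python, not the Dask API.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, n_partitions):
    """Split a list of rows into roughly equal contiguous chunks."""
    size = max(1, -(-len(rows) // n_partitions))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_sum(partition):
    """Per-partition step: sum the values per key within one chunk."""
    totals = Counter()
    for key, value in partition:
        totals[key] += value
    return totals


def distributed_sum(rows, n_partitions=2):
    """Run the per-partition step in parallel, then merge the partials."""
    partitions = split_into_partitions(rows, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(partial_sum, partitions))
    combined = Counter()
    for part in partials:
        combined.update(part)  # Counter.update adds values per key
    return dict(combined)


rows = [("north", 120), ("south", 80), ("north", 60), ("east", 200), ("south", 40)]
print(distributed_sum(rows))
```

Because each partition is processed independently, no worker needs to hold the full dataset in memory at once. The same property is what lets a partitioned framework scale past the memory of one device.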

With its combination of GPU processing speed, APIs modeled on popular data analysis frameworks, and support for large-scale data processing, RAPIDS is a useful tool for work in data analysis, machine learning, and data science that requires high performance and scalability.

Request a Workshop

If your organization is interested in enhancing and developing key skills in AI, accelerated data science, or accelerated computing, you can request training led by us.