RAPIDS in GPU

Gia Huy (CisMine)
3 min read · Sep 14, 2024

The growing volume of data has made ETL (Extract, Transform, Load), the core of most data analysis and processing work, more complex and time-consuming. To address this, NVIDIA created RAPIDS.

When it comes to data analysis and processing, we mostly think of Python, Pandas, SQL, Spark, and so on. However, these tools share a major drawback: by default they run on the CPU, which can make processing large datasets slow and leaves an available GPU unused. This is why RAPIDS was developed.

What is RAPIDS

RAPIDS is a suite of open-source software libraries and APIs that lets you execute end-to-end data science, analytics, and machine learning pipelines entirely on the GPU.

One great thing is that the syntax closely mirrors that of Pandas, NumPy, scikit-learn, and others, so existing code usually needs only small changes.
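To make the similarity concrete, here is a minimal sketch of a group-by written once against the Pandas-style API. It assumes cuDF is installed and a GPU is usable; if not, it falls back to Pandas so the same lines still run on the CPU. The alias `xdf` and the sample data are made up for illustration and are not part of RAPIDS.

```python
# Use cuDF when it can be imported, otherwise fall back to pandas
# (assumption: at least one of the two is installed).
try:
    import cudf as xdf      # GPU DataFrames
    backend = "cudf"
except Exception:           # cuDF missing, or no usable GPU/driver
    import pandas as xdf    # CPU DataFrames
    backend = "pandas"

# Made-up sample data for illustration.
df = xdf.DataFrame({
    "city": ["Hanoi", "Hue", "Hanoi", "Hue"],
    "sales": [10, 20, 30, 40],
})

# The same pandas-style call chain works on both backends.
totals = df.groupby("city")["sales"].sum().sort_index()

# cuDF objects live in GPU memory; to_pandas() copies them back to the host.
totals_host = totals.to_pandas() if hasattr(totals, "to_pandas") else totals
result = totals_host.to_dict()
print(backend, result)
```

Only the import line differs between the two backends, which is the point of the cuDF design.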

In RAPIDS, the main libraries include:

  • cuDF: like Pandas, but runs on the GPU
  • cuML: like scikit-learn, but runs on the GPU
  • cuGraph: like NetworkX, but runs on the GPU
  • cuSpatial: geospatial (GIS) operations, but on the GPU

The benchmark images above show that RAPIDS runs considerably faster than its CPU counterparts, and the great thing is that accuracy remains unchanged.

Note that in the "GPU in AI" series, I will cover only the two main libraries: cuDF (Pandas on GPU) and cuML (scikit-learn on GPU).

Set up

On a local machine

Here is the link to install RAPIDS

Select the options as shown above, but first check which version of the CUDA toolkit you have by running the following command:

$ nvcc -V

If you haven’t installed the CUDA toolkit yet, you can refer to the guide here.
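If you would rather do this check from Python, the sketch below runs `nvcc -V` when it is on your PATH and pulls out the release number. The helper names `parse_cuda_release` and `installed_cuda_release` are hypothetical, and the sample string only imitates the usual `nvcc -V` output format with illustrative version numbers.

```python
import re
import shutil
import subprocess


def parse_cuda_release(nvcc_output):
    """Return (major, minor) from `nvcc -V` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Run `nvcc -V` if nvcc is on PATH; return None when it is not."""
    if shutil.which("nvcc") is None:
        return None
    completed = subprocess.run(
        ["nvcc", "-V"], capture_output=True, text=True, check=False
    )
    return parse_cuda_release(completed.stdout)


# Sample text in the usual `nvcc -V` format (version numbers are illustrative).
sample = "Cuda compilation tools, release 12.4, V12.4.131"
print(parse_cuda_release(sample))
print(installed_cuda_release())
```

A result of `None` from `installed_cuda_release()` means `nvcc` was not found, which usually indicates the CUDA toolkit is not installed or not on your PATH.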

One important note: at the time of writing, RAPIDS supports only Python versions 3.9, 3.10, and 3.11.
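A quick way to check your interpreter against that list is sketched below. The set of versions is copied from this post and may differ for newer RAPIDS releases; `is_supported_python` is a hypothetical helper, not a RAPIDS function.

```python
import sys

# Versions listed in this post; newer RAPIDS releases may support a different set.
SUPPORTED_PYTHONS = {(3, 9), (3, 10), (3, 11)}


def is_supported_python(version=None):
    """Check a (major, minor) pair against the list above.

    Defaults to the running interpreter when no version is given.
    """
    major_minor = tuple(version or sys.version_info)[:2]
    return major_minor in SUPPORTED_PYTHONS


print(sys.version_info[:2], is_supported_python())
```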

After installation, you can verify it by running the following Python code:

import cudf
print(cudf.__version__)
import cuml
print(cuml.__version__)
import cugraph
print(cugraph.__version__)
import cuspatial
print(cuspatial.__version__)
import cuxfilter
print(cuxfilter.__version__)
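If only some of the libraries are installed, a single failed import stops the whole check. The sketch below is one way to report each library separately; `report_versions` is a hypothetical helper, and it assumes each module exposes `__version__` as the libraries above do.

```python
import importlib

RAPIDS_MODULES = ["cudf", "cuml", "cugraph", "cuspatial", "cuxfilter"]


def report_versions(module_names):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:  # not installed, or failed to initialize (e.g. no GPU)
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", None)
    return versions


for name, version in report_versions(RAPIDS_MODULES).items():
    print(f"{name}: {version or 'not available'}")
```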

In Google Colab

First, change the runtime type from CPU to GPU.
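Before installing anything, it can help to confirm that the runtime actually sees a GPU. The sketch below calls `nvidia-smi -L`, which lists the visible GPUs; `gpu_visible` is a hypothetical helper name, and the check assumes `nvidia-smi` is on the PATH whenever a GPU runtime is attached.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if `nvidia-smi -L` runs successfully and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        completed = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return completed.returncode == 0 and "GPU" in completed.stdout


print("GPU visible:", gpu_visible())
```

If this prints `False`, re-check the runtime type before running the install script.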

Then run these commands in a notebook cell:

!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!python rapidsai-csp-utils/colab/pip-install.py

After installation, you can verify it by running the following Python code:

import cudf
print(cudf.__version__)
import cuml
print(cuml.__version__)
import cugraph
print(cugraph.__version__)
import cuspatial
print(cuspatial.__version__)
import cuxfilter
print(cuxfilter.__version__)

Gia Huy (CisMine)

My name is Huy Gia. I am currently pursuing a B.Sc. degree. I am interested in the following topics: DL in Computer Vision, Parallel Programming with CUDA.