Handling Data with RAPIDS

Sep 17, 2024

In chapter 01, we mentioned using RAPIDS for processing and analyzing data with a GPU. In this article, I will guide you on how to use it.

cuDF

cuDF, short for CUDA DataFrame, is a library for processing tabular data on the GPU with CUDA. One great thing is that its API closely mirrors pandas, so most pandas code carries over with little or no change, and depending on the workload and hardware it can run 10 to 400 times faster.

How cuDF works

Put simply, once you call %load_ext cudf.pandas, importing pandas (or any of its sub-modules) no longer gives you the regular pandas library. Instead, you get a proxy module that stands in for pandas and forwards each operation to cuDF whenever possible. This lets you harness the power of the GPU for data processing while keeping the same pandas syntax.

First, loading cudf.pandas "spoofs" the pandas module: when you import pandas afterwards, you are actually using cudf.pandas. For this to work, the extension has to be loaded before pandas is imported.
As a result, both your own code and third-party libraries that use pandas go through cudf.pandas.
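To make the "spoofing" idea concrete, here is a toy sketch of how one module can be registered under another module's name, so that a later import statement picks up the replacement. This is only an illustration of the general mechanism using Python's sys.modules cache. It is not how cudf.pandas is actually implemented, and the names fast_math, slow_math, and install_proxy are hypothetical.

```python
import sys
import types

# Hypothetical "fast" module that will stand in for another module.
fast_math = types.ModuleType("fast_math")
fast_math.sqrt = lambda x: ("fast", x ** 0.5)


def install_proxy(name, replacement):
    """Register `replacement` under `name` so later imports resolve to it."""
    sys.modules[name] = replacement


# After this call, `import slow_math` returns the fast_math module object,
# because the import system checks sys.modules before searching the disk.
install_proxy("slow_math", fast_math)

import slow_math  # noqa: E402  (deliberately imported after the install step)

result = slow_math.sqrt(9.0)
print(result)
```

The order matters here for the same reason it matters with cudf.pandas: the replacement has to be registered before the import runs, otherwise the caller gets the original module.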

Example

%load_ext cudf.pandas
import pandas as pd  
import seaborn as sns

df = pd.DataFrame({
  'A': [1, 2, 3, 4, 5],
  'B': [10, 20, 30, 40, 50]
})
df['C'] = df['A'] + df['B']
print(df)
sns.barplot(x='A', y='B', data=df)

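In this case, df['A'] + df['B'] is executed on the GPU (via cudf.pandas). Seaborn, a third-party library that uses pandas, also receives the proxied DataFrame, so the pandas operations it performs internally can be accelerated as well. The drawing of the plot itself still happens on the CPU through matplotlib.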

Here are some third-party libraries that support cudf for GPU acceleration

When we call a pandas function, cudf.pandas runs it on the GPU if cuDF has a corresponding implementation.
If cuDF has no equivalent, the data is copied back to the CPU and the original pandas implementation runs instead. The result can then be copied back to the GPU if a later operation needs it, or simply used as is (for example, printed).
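The fallback behaviour described above can be sketched with a small proxy object that tries a fast backend first and uses a slow backend when the fast one lacks the method. This is a simplified model written for this article, not cuDF's real code. FastSlowProxy, FastBackend, and SlowBackend are hypothetical names, and the data copies between GPU and CPU are left out.

```python
class FastSlowProxy:
    """Toy proxy: try the fast backend first, fall back to the slow one."""

    def __init__(self, fast, slow):
        self._fast = fast
        self._slow = slow
        self.calls = []  # records which backend served each call

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        def method(*args, **kwargs):
            fast_fn = getattr(self._fast, name, None)
            if fast_fn is not None:
                try:
                    out = fast_fn(*args, **kwargs)
                    self.calls.append((name, "fast"))
                    return out
                except NotImplementedError:
                    pass  # fast backend gave up; use the slow one below
            out = getattr(self._slow, name)(*args, **kwargs)
            self.calls.append((name, "slow"))
            return out

        return method


class FastBackend:
    """Stands in for the GPU library: supports only total()."""

    def total(self, values):
        return sum(values)


class SlowBackend:
    """Stands in for pandas on the CPU: supports everything."""

    def total(self, values):
        return sum(values)

    def keep_if(self, values, predicate):
        return [v for v in values if predicate(v)]


proxy = FastSlowProxy(FastBackend(), SlowBackend())
total = proxy.total([1, 2, 3])                    # served by the fast backend
kept = proxy.keep_if([1, 2, 3], lambda v: v > 1)  # falls back to the slow one
print(proxy.calls)
```

The caller never chooses a backend explicitly. That is the property that lets existing pandas code run unchanged, with unsupported calls silently paying the cost of the slower path.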

Code

In Notebook

%load_ext cudf.pandas
import pandas as pd

%%cudf.pandas.profile

df = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 3]})

df.min(axis=1)

out = df.groupby('a').filter(
    lambda group: len(group) > 1
)

%%cudf.pandas.line_profile

df = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 3]})

df.min(axis=1)

out = df.groupby('a').filter(
    lambda group: len(group) > 1
)

In a local script

import pandas as pd

df = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 3]})
df.min(axis=1)
out = df.groupby('a').filter(
    lambda group: len(group) > 1
)

Run this command to enable cudf.pandas

$ python3 -m cudf.pandas <file name>.py

Running a script with -m cudf.pandas works the same way as %load_ext cudf.pandas in a notebook.
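If you cannot change how the script is launched, cudf.pandas can also be enabled from inside the code by calling cudf.pandas.install() before pandas is imported. The sketch below wraps that call in a helper so the script still runs on a machine without cuDF. The helper name enable_cudf_pandas and the fallback to plain pandas are my own assumptions, not part of the library.

```python
def enable_cudf_pandas() -> bool:
    """Try to turn on cudf.pandas; return True if it was enabled."""
    try:
        import cudf.pandas  # needs the cudf package and an NVIDIA GPU
    except ImportError:
        return False  # cuDF is not installed: keep using plain pandas
    cudf.pandas.install()  # must run before the first `import pandas`
    return True


accelerated = enable_cudf_pandas()
print("cudf.pandas enabled:", accelerated)

# Import pandas only after the call above, for example:
# import pandas as pd
```

As with the notebook extension, the order matters: if pandas was already imported earlier in the process, the code that imported it keeps using the regular library.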

If you want to profile, run this

$ python3 -m cudf.pandas --profile <file name>.py

$ python3 -m cudf.pandas --line-profile <file name>.py

From the profiler output of the demo code, we can see that DataFrameGroupBy.filter is not supported on the GPU, so it automatically falls back to the CPU. In the next article, I will guide you on how to write custom kernels so that such operations can run on the GPU as well.

Question

From those images, why is the total time not equal to the sum of the CPU and GPU times?