Dask cuGraph - Multi-GPU Graph Analytics Algorithms

Dask cuGraph contains parallel graph analytics algorithms that can make use of multiple GPUs on a single host. It is able to play nicely with other projects in the Dask ecosystem, as well as other RAPIDS projects, such as Dask cuDF and Dask cuML.

Use

As an example, the following Python snippet loads input from a csv file into a Dask cuDF Dataframe and finds the Pagerank in parallel, on multiple GPUs:

# Create a Dask CUDA cluster w/ one worker per device
from dask_cuda import LocalCUDACluster
cluster = LocalCUDACluster()

# Read CSV file in parallel across workers
import dask_cudf
df = dask_cudf.read_csv("/path/to/csv")

# Find PageRank
import dask_cugraph
pagerank = dask_cugraph.mg_pagerank(df)

Dask CUDA Clusters

Using the LocalCUDACluster()

Clusters of Dask workers can be started in several different ways. One of the simplest methods used in non-CUDA Dask clusters is to use LocalCluster. For a CUDA variant of the LocalCluster that works well with Dask cuGraph, check out the LocalCUDACluster from the dask-cuda project.

Note: It's important to make sure the LocalCUDACluster is instantiated in your code before any CUDA contexts are created (eg. before importing Numba or cudf). Otherwise, it's possible that your workers will all be mapped to the same device.

Using the dask-worker command

If you will be starting your workers using the dask-worker command, Dask cuGraph requires that each worker has been started with their own unique CUDA_VISIBLE_DEVICES setting.

For example, a user with a workstation containing 2 devices, would want their workers to be started with the following CUDA_VISIBLE_DEVICES settings (one per worker):

CUDA_VISIBLE_DEVICES=0,1 dask-worker --nprocs 1 --nthreads 1 scheduler_host:8786

CUDA_VISIBLE_DEVICES=1,0 dask-worker --nprocs 1 --nthreads 1 scheduler_host:8786

This enables each worker to map the device memory of their local cuDFs to separate devices.

Note: If starting Dask workers using dask-worker, --nprocs 1 must be used.

Supported Algorithms

Graph algorithms are being worked on.

Installation

Dask cuGraph relies on cuGraph to be installed. Refer to cuGraph on Github for more information.

Conda

TODO

Dask cuGraph can be installed using the rapidsai conda channel:

conda install -c nvidia -c rapidsai -c conda-forge -c pytorch -c defaults dask_cugraph

Contact

Find out more details on the RAPIDS site

Open GPU Data Science

The RAPIDS suite of open source software libraries aim to enable execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposing that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Name		Name	Last commit message	Last commit date
Latest commit History 58 Commits
.github		.github
ci		ci
conda/recipes/dask-cugraph		conda/recipes/dask-cugraph
dask_cugraph		dask_cugraph
datasets		datasets
docs		docs
img		img
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py
versioneer.py		versioneer.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Dask cuGraph - Multi-GPU Graph Analytics Algorithms

Use

Dask CUDA Clusters

Using the LocalCUDACluster()

Using the dask-worker command

Supported Algorithms

Installation

Conda

Contact

Open GPU Data Science

About

Releases

Packages

Contributors 3

Languages

License

rapidsai/dask-cugraph

Folders and files

Latest commit

History

Repository files navigation

Dask cuGraph - Multi-GPU Graph Analytics Algorithms

Use

Dask CUDA Clusters

Using the LocalCUDACluster()

Using the dask-worker command

Supported Algorithms

Installation

Conda

Contact

Open GPU Data Science

About

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages