CuPy
Numerical programming library for the Python programming language

CuPy is an open-source library for GPU-accelerated computing with the Python programming language. It provides multi-dimensional arrays, sparse matrices, and a variety of numerical algorithms implemented on top of them. CuPy shares the same API set as NumPy and SciPy, which allows it to serve as a drop-in replacement for running NumPy/SciPy code on a GPU. CuPy supports the Nvidia CUDA GPU platform and, starting with v9.0, the AMD ROCm GPU platform.

CuPy was initially developed as a backend of the Chainer deep learning framework and was established as an independent project in 2017.

CuPy is one of the array libraries in the NumPy ecosystem and is widely used for GPU computing with Python, especially in high-performance computing environments such as Summit, Perlmutter, EULER, and ABCI.

CuPy is a NumFOCUS sponsored project.

Features

CuPy implements NumPy/SciPy-compatible APIs, as well as features to write user-defined GPU kernels or access low-level APIs.[12][13]

NumPy-compatible APIs

The same set of APIs defined in the NumPy package (numpy.*) is available under the cupy.* package.
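
As a minimal sketch of what this compatibility allows, the function below is written once against a module handle `xp` and runs unchanged on either library. The helper name `normalize_rows`, the `HAVE_GPU` flag, and the fallback to NumPy when no usable GPU is found are assumptions made for illustration; they are not part of CuPy.

```python
import numpy as np

try:
    import cupy as cp
    # Raises if no usable CUDA/ROCm driver or device is present.
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp = None
    HAVE_GPU = False

# Pick the array module once; everything below is identical for both.
xp = cp if HAVE_GPU else np


def normalize_rows(a):
    """Scale each row of a 2-D array to unit Euclidean length (hypothetical helper)."""
    norms = xp.linalg.norm(a, axis=1, keepdims=True)
    return a / norms


a = xp.array([[3.0, 4.0], [6.0, 8.0]])
unit = normalize_rows(a)

# cupy.asnumpy copies a device array back to host memory.
unit_host = cp.asnumpy(unit) if HAVE_GPU else unit
print(unit_host)
```

CuPy also offers `cupy.get_array_module`, which returns `cupy` or `numpy` depending on the arrays passed to it, as an alternative to choosing the module up front.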

SciPy-compatible APIs

The same set of APIs defined in the SciPy package (scipy.*) is available under the cupyx.scipy.* package.
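
The sketch below illustrates the parallel namespaces using sparse matrices: the same code builds a CSR matrix and multiplies it by a vector, using `cupyx.scipy.sparse` when a GPU is usable and `scipy.sparse` otherwise. The fallback logic and the sample matrix are assumptions chosen for illustration.

```python
try:
    import cupy as xp
    from cupyx.scipy import sparse
    if xp.cuda.runtime.getDeviceCount() == 0:
        raise RuntimeError("no GPU device found")
except Exception:
    # Assumed fallback: run the same code on the CPU with NumPy/SciPy.
    import numpy as xp
    from scipy import sparse

dense = xp.array([[2.0, 0.0, 0.0],
                  [0.0, 0.0, 3.0],
                  [0.0, 4.0, 0.0]])

A = sparse.csr_matrix(dense)   # compressed sparse row storage
v = xp.array([1.0, 2.0, 3.0])
y = A.dot(v)                   # sparse matrix-vector product

print(A.nnz, y.tolist())
```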

User-defined GPU kernels

  • Kernel templates for element-wise and reduction operations
  • Raw kernel (CUDA C/C++)
  • Just-in-time transpiler (JIT)
  • Kernel fusion
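
The following sketch shows three of these mechanisms side by side: an element-wise kernel template, a reduction kernel template, and kernel fusion. The kernel names, the `HAVE_GPU` guard, and the `results` dictionary are illustrative assumptions; the GPU part only runs when a device is available.

```python
import numpy as np

try:
    import cupy as cp
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    HAVE_GPU = False

results = {}
if HAVE_GPU:
    # Element-wise template: the body runs once per output element.
    squared_diff = cp.ElementwiseKernel(
        'float32 x, float32 y',   # input parameters
        'float32 z',              # output parameter
        'z = (x - y) * (x - y)',  # per-element operation
        'squared_diff')           # kernel name

    # Reduction template: map each element, combine pairs, then post-process.
    l2norm = cp.ReductionKernel(
        'float32 x',     # input parameter
        'float32 y',     # output parameter
        'x * x',         # map expression
        'a + b',         # reduce expression
        'y = sqrt(a)',   # post-reduction expression
        '0',             # identity value
        'l2norm')        # kernel name

    # Kernel fusion: the arithmetic below is compiled into a single kernel.
    @cp.fuse()
    def fused_squared_diff(x, y):
        return (x - y) * (x - y)

    x = cp.arange(6, dtype=cp.float32).reshape(2, 3)
    y = cp.ones((2, 3), dtype=cp.float32)

    results['elementwise'] = cp.asnumpy(squared_diff(x, y))
    results['reduction'] = cp.asnumpy(l2norm(x, axis=1))
    results['fused'] = cp.asnumpy(fused_squared_diff(x, y))
    print(results)
```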

Distributed computing

  • Distributed communication package (cupyx.distributed), providing collective and peer-to-peer primitives

Low-level CUDA features

  • Stream and event
  • Memory pool
  • Profiler
  • Host API binding
  • CUDA Python support[14]
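
A brief sketch of how streams, events, and the memory pool might be used together: work is enqueued on a non-default stream, timed with a pair of events, and the default memory pool is inspected afterwards. The array size, the variable names, and the `HAVE_GPU` guard are assumptions for illustration.

```python
try:
    import cupy as cp
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    HAVE_GPU = False

elapsed_ms = None
corner = None
if HAVE_GPU:
    pool = cp.get_default_memory_pool()
    stream = cp.cuda.Stream(non_blocking=True)
    start, stop = cp.cuda.Event(), cp.cuda.Event()

    with stream:                 # operations below are enqueued on `stream`
        start.record(stream)
        a = cp.ones((256, 256), dtype=cp.float32)
        b = a @ a
        stop.record(stream)

    stop.synchronize()           # wait until the recorded work has finished
    elapsed_ms = cp.cuda.get_elapsed_time(start, stop)
    corner = float(b[0, 0])      # copies one element back to the host

    print("bytes in use:", pool.used_bytes())
    del a, b                     # blocks return to the pool, not to the driver
    pool.free_all_blocks()       # release cached blocks back to the driver
    print("bytes held by pool:", pool.total_bytes())
```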

Interoperability

  • DLPack[15]
  • CUDA Array Interface[16]
  • NEP 13 (__array_ufunc__)[17]
  • NEP 18 (__array_function__)[18][19]
  • Array API Standard[20][21]
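
The sketch below exercises several of these protocols on a single CuPy array, assuming a recent CuPy release that provides `cupy.from_dlpack`. NumPy functions dispatch to CuPy through NEP 13 and NEP 18, the CUDA Array Interface exposes the device pointer as a dictionary, and DLPack yields a second array over the same memory. The variable names and the `HAVE_GPU` guard are illustrative assumptions.

```python
import numpy as np

try:
    import cupy as cp
    HAVE_GPU = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    HAVE_GPU = False

checks = {}
if HAVE_GPU:
    x = cp.arange(4, dtype=cp.float32)

    # NEP 13: a NumPy ufunc applied to a CuPy array runs on the GPU.
    checks['ufunc_stays_on_gpu'] = isinstance(np.sqrt(x), cp.ndarray)

    # NEP 18: a NumPy function applied to a CuPy array is dispatched to CuPy.
    checks['function_stays_on_gpu'] = isinstance(np.sum(x), cp.ndarray)

    # CUDA Array Interface: metadata that other GPU libraries can consume.
    cai = x.__cuda_array_interface__
    checks['cai_shape'] = tuple(cai['shape'])

    # DLPack: zero-copy exchange; the new array views the same device memory.
    y = cp.from_dlpack(x)
    y[0] = 42
    checks['dlpack_shares_memory'] = float(x[0]) == 42.0

    print(checks)
```

In practice the DLPack and CUDA Array Interface paths are most useful for exchanging arrays with other GPU libraries; the round trip within CuPy here only keeps the sketch free of extra dependencies.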

Examples

Array creation

>>> import cupy as cp
>>> x = cp.array([1, 2, 3])
>>> x
array([1, 2, 3])
>>> y = cp.arange(10)
>>> y
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Basic operations

>>> import cupy as cp
>>> x = cp.arange(12).reshape(3, 4).astype(cp.float32)
>>> x
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]], dtype=float32)
>>> x.sum(axis=1)
array([ 6., 22., 38.], dtype=float32)

Raw CUDA C/C++ kernel

>>> import cupy as cp
>>> kern = cp.RawKernel(r'''
... extern "C" __global__
... void multiply_elemwise(const float* in1, const float* in2, float* out) {
...     int tid = blockDim.x * blockIdx.x + threadIdx.x;
...     out[tid] = in1[tid] * in2[tid];
... }
... ''', 'multiply_elemwise')
>>> in1 = cp.arange(16, dtype=cp.float32).reshape(4, 4)
>>> in2 = cp.arange(16, dtype=cp.float32).reshape(4, 4)
>>> out = cp.zeros((4, 4), dtype=cp.float32)
>>> kern((4,), (4,), (in1, in2, out))  # grid, block and arguments
>>> out
array([[  0.,   1.,   4.,   9.],
       [ 16.,  25.,  36.,  49.],
       [ 64.,  81., 100., 121.],
       [144., 169., 196., 225.]], dtype=float32)

Applications

References

  1. Okuta, Ryosuke; Unno, Yuya; Nishino, Daisuke; Hido, Shohei; Loomis, Crissman (2017). CuPy: A NumPy-Compatible Library for NVIDIA GPU Calculations (PDF). Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS). http://learningsys.org/nips17/assets/papers/paper_16.pdf

  2. "CuPy 9.0 Brings AMD GPU Support To This Numpy-Compatible Library - Phoronix". Phoronix. 29 April 2021. Retrieved 21 June 2022. https://www.phoronix.com/scan.php?page=news_item&px=CuPy-9.0-Released

  3. "AMD Leads High Performance Computing Towards Exascale and Beyond". 28 June 2021. Retrieved 21 June 2022. Most recently, CuPy, an open-source array library with Python, has expanded its traditional GPU support with the introduction of version 9.0 that now offers support for the ROCm stack for GPU-accelerated computing. https://ir.amd.com/news-events/press-releases/detail/1012/amd-leads-high-performance-computing-towards-exascale-and

  4. "Preferred Networks released Version 2 of Chainer, an Open Source framework for Deep Learning - Preferred Networks, Inc". 2 June 2017. Retrieved 18 June 2022. https://www.preferred.jp/en/news/pr20170602/

  5. "NumPy". numpy.org. Retrieved 21 June 2022. https://numpy.org/

  6. Gorelick, Micha; Ozsvald, Ian (April 2020). High Performance Python: Practical Performant Programming for Humans (2nd ed.). O'Reilly Media, Inc. p. 190. ISBN 9781492055020.

  7. Oak Ridge Leadership Computing Facility. "Installing CuPy". OLCF User Documentation. Retrieved 21 June 2022.

  8. National Energy Research Scientific Computing Center. "Using Python on Perlmutter". NERSC Documentation. Retrieved 21 June 2022.

  9. ETH Zurich. "CuPy". ScientificComputing. Retrieved 21 June 2022.

  10. National Institute of Advanced Industrial Science and Technology. "Chainer". ABCI 2.0 User Guide. Retrieved 21 June 2022.

  11. "Sponsored Projects - NumFOCUS". Retrieved 8 September 2024. https://numfocus.org/sponsored-projects

  12. "Overview". CuPy documentation. Retrieved 18 June 2022. https://docs.cupy.dev/en/latest/overview.html

  13. "Comparison Table". CuPy documentation. Retrieved 18 June 2022. https://docs.cupy.dev/en/latest/reference/comparison.html

  14. "CUDA Python | NVIDIA Developer". Retrieved 21 June 2022. https://developer.nvidia.com/cuda-python

  15. "Welcome to DLPack's documentation!". DLPack 0.6.0 documentation. Retrieved 21 June 2022. https://dmlc.github.io/dlpack/latest/

  16. "CUDA Array Interface (Version 3)". Numba documentation. Retrieved 21 June 2022. https://numba.readthedocs.io/en/stable/cuda/cuda_array_interface.html

  17. "NEP 13 — A mechanism for overriding Ufuncs — NumPy Enhancement Proposals". numpy.org. Retrieved 21 June 2022. https://numpy.org/neps/nep-0013-ufunc-overrides.html

  18. "NEP 18 — A dispatch mechanism for NumPy's high level array functions — NumPy Enhancement Proposals". numpy.org. Retrieved 21 June 2022. https://numpy.org/neps/nep-0018-array-function-protocol.html

  19. Charles R Harris; K. Jarrod Millman; Stéfan J. van der Walt; et al. (16 September 2020). "Array programming with NumPy" (PDF). Nature. 585 (7825): 357–362. arXiv:2006.10256. doi:10.1038/S41586-020-2649-2. ISSN 1476-4687. PMC 7759461. PMID 32939066. Wikidata Q99413970. https://www.nature.com/articles/s41586-020-2649-2.pdf

  20. "2021 report - Python Data APIs Consortium" (PDF). Retrieved 21 June 2022. https://data-apis.org/files/2021_annual_report_DataAPIs_Consortium.pdf

  21. "Purpose and scope". Python array API standard 2021.12 documentation. Retrieved 21 June 2022. https://data-apis.org/array-api/latest/purpose_and_scope.html

  22. "Install spaCy". spaCy Usage Documentation. Retrieved 21 June 2022. https://spacy.io/usage#gpu

  23. Patel, Ankur A.; Arasanipalai, Ajay Uppili (May 2021). Applied Natural Language Processing in the Enterprise (1st ed.). O'Reilly Media, Inc. p. 68. ISBN 9781492062578.

  24. "Python Package Introduction". xgboost 1.6.1 documentation. Retrieved 21 June 2022. https://xgboost.readthedocs.io/en/stable/python/python_intro.html#data-interface

  25. "UCBerkeleySETI/turbo_seti: turboSETI -- python based SETI search algorithm". GitHub. Retrieved 21 June 2022. https://github.com/UCBerkeleySETI/turbo_seti#turbo_seti

  26. "Open GPU Data Science | RAPIDS". Retrieved 21 June 2022. https://rapids.ai/

  27. "API Docs". RAPIDS Docs. Retrieved 21 June 2022. https://docs.rapids.ai/api

  28. "Efficient Data Sharing between CuPy and RAPIDS". Retrieved 21 June 2022. https://medium.com/rapids-ai/using-rapids-memory-manager-with-cupy-8d08fe8f58fa

  29. "10 Minutes to cuDF and CuPy". Retrieved 21 June 2022. https://medium.com/rapids-ai/10-minutes-to-cudf-and-cupy-e131cac0439b

  30. Rogozhnikov, Alex (2022). Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation. International Conference on Learning Representations. https://openreview.net/forum?id=oapKSVM2bcj

  31. "arogozhnikov/einops: Deep learning operations reinvented (for pytorch, tensorflow, jax and others)". GitHub. Retrieved 21 June 2022. https://github.com/arogozhnikov/einops

  32. "Array API support (experimental) — scikit-learn documentation". Retrieved 8 September 2024. https://scikit-learn.org/stable/modules/array_api.html

  33. Tokui, Seiya; Okuta, Ryosuke; Akiba, Takuya; Niitani, Yusuke; Ogawa, Toru; Saito, Shunta; Suzuki, Shuji; Uenishi, Kota; Vogel, Brian; Vincent, Hiroyuki Yamazaki (2019). Chainer: A Deep Learning Framework for Accelerating the Research Cycle. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. doi:10.1145/3292500.3330756. https://dl.acm.org/doi/10.1145/3292500.3330756