rusty1s/pytorch_scatter

PyTorch Extension Library of Optimized Scatter Operations


Top Related Projects

  • fairscale: PyTorch extensions for high performance and large scale training.
  • pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration.
  • horovod: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
  • DeepSpeed: A deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
  • apex: A PyTorch extension with tools for easy mixed precision and distributed training in PyTorch.
  • dgl: Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Quick Overview

PyTorch Scatter is a PyTorch extension library that provides efficient scatter operations for sparse data. It implements various scatter and segment operations, allowing for fast and memory-efficient computations on irregularly structured input data, such as graphs or point clouds.
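
As a quick illustration of the core idea (a minimal sketch with arbitrary values), scatter_add sums the rows of src that share the same entry in index:

import torch
from torch_scatter import scatter_add

# Rows 0 and 2 go to group 0, row 1 goes to group 1
src = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
index = torch.tensor([0, 1, 0])
out = scatter_add(src, index, dim=0)
print(out)  # tensor([[6., 8.], [3., 4.]])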

Pros

  • High performance: Implements scatter operations in C++/CUDA for optimal speed
  • Seamless integration with PyTorch: Works well with existing PyTorch tensors and autograd
  • Supports various reduction operations: Includes sum, mean, min, max, and more
  • Handles both CPU and GPU computations efficiently

Cons

  • Limited to scatter operations: Focused library, not a general-purpose sparse tensor framework
  • Requires compilation: needs pre-built wheels matching your PyTorch/CUDA setup, or a build from source
  • Learning curve: May require understanding of scatter operations and sparse data structures
  • Dependency on PyTorch: Cannot be used standalone or with other deep learning frameworks

Code Examples

  1. Basic scatter_add operation, summing rows of src that share an index:
import torch
from torch_scatter import scatter_add

src = torch.randn(10, 5)
index = torch.tensor([0, 1, 0, 1, 2, 2, 3, 3, 4, 4])
out = scatter_add(src, index, dim=0)  # out: [5, 5], one row per group
  2. Scatter mean with a custom output size (dim_size fixes the number of groups):
import torch
from torch_scatter import scatter_mean

src = torch.randn(8, 3)
index = torch.tensor([0, 1, 0, 1, 2, 2, 3, 3])
out = scatter_mean(src, index, dim=0, dim_size=5)  # out: [5, 3]; unused group 4 stays zero
  3. Segment operation over sorted groups via CSR-style pointers:
import torch
from torch_scatter import segment_csr

src = torch.randn(10, 5)
indptr = torch.tensor([0, 2, 5, 8, 10])  # group boundaries: rows [0:2], [2:5], [5:8], [8:10]
out = segment_csr(src, indptr, reduce="max")

Getting Started

To install PyTorch Scatter:

pip install torch-scatter

Basic usage:

import torch
from torch_scatter import scatter

src = torch.randn(10, 5)
index = torch.randint(0, 3, (10,))
out = scatter(src, index, dim=0, dim_size=3, reduce="sum")  # dim_size pins the output to 3 groups
print(out.shape)  # Output: torch.Size([3, 5])

Make sure to have PyTorch installed and compatible with your CUDA version if using GPU acceleration.
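
To see which PyTorch and CUDA versions your environment reports before picking a wheel, a quick sanity-check snippet (independent of torch-scatter):

import torch

print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version PyTorch was built against (None for CPU-only builds)
print(torch.cuda.is_available())  # whether a usable GPU is visible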

Competitor Comparisons

fairscale: PyTorch extensions for high performance and large scale training.

Pros of fairscale

  • Broader scope: Focuses on large-scale distributed training and model parallelism
  • More comprehensive: Offers a wide range of techniques for efficient deep learning
  • Active development: Regularly updated with new features and optimizations

Cons of fairscale

  • Higher complexity: Steeper learning curve due to its extensive feature set
  • Heavier dependency: Requires more setup and configuration for full utilization
  • Less specialized: May not be as optimized for specific scatter operations

Code Comparison

pytorch_scatter:

import torch
from torch_scatter import scatter_max

src = torch.randn(10, 5)
index = torch.tensor([0, 1, 1, 2, 2, 3, 3, 4, 4, 4])
out, _ = scatter_max(src, index, dim=0)  # second return value holds the argmax positions

fairscale:

import torch
from fairscale.nn import ShardedDataParallel
from fairscale.optim.oss import OSS

model = YourModel()  # placeholder for your own nn.Module
# ShardedDataParallel expects a sharded optimizer (OSS), not a plain torch.optim optimizer
optimizer = OSS(params=model.parameters(), optim=torch.optim.Adam, lr=0.001)
model = ShardedDataParallel(model, optimizer)

While pytorch_scatter focuses on efficient scatter operations, fairscale provides a broader set of tools for distributed training and model parallelism. The code examples highlight their different use cases and levels of abstraction.


pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration.

Pros of PyTorch

  • Comprehensive deep learning framework with a wide range of functionalities
  • Large community support and extensive documentation
  • Regular updates and improvements from Facebook AI Research

Cons of PyTorch

  • Larger codebase and installation size
  • Steeper learning curve for beginners
  • May include unnecessary features for specific scatter operations

Code Comparison

PyTorch Scatter:

import torch_scatter

# src: [N, F] values, index: [N] group ids
output = torch_scatter.scatter_add(src, index, dim=0)

PyTorch:

num_segments = int(index.max()) + 1  # number of output rows (groups)
output = torch.zeros(num_segments, src.size(1))
output.scatter_add_(0, index.unsqueeze(-1).expand_as(src), src)
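
Both snippets compute the same grouped sums. A quick self-contained check (with arbitrary example tensors):

import torch
import torch_scatter

src = torch.randn(10, 5)
index = torch.tensor([0, 1, 1, 2, 2, 3, 3, 4, 4, 4])

a = torch_scatter.scatter_add(src, index, dim=0)
b = torch.zeros(int(index.max()) + 1, src.size(1))
b.scatter_add_(0, index.unsqueeze(-1).expand_as(src), src)
print(torch.allclose(a, b))  # True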

Key Differences

  • PyTorch Scatter focuses specifically on efficient scatter operations
  • PyTorch Scatter may offer better performance for scatter-specific tasks
  • PyTorch provides a more general-purpose framework for various deep learning tasks

Use Cases

  • PyTorch Scatter: Ideal for projects requiring frequent and efficient scatter operations
  • PyTorch: Suitable for a wide range of deep learning applications and research projects

Community and Support

  • PyTorch Scatter: Smaller, specialized community
  • PyTorch: Large, active community with extensive resources and third-party libraries

horovod: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Pros of Horovod

  • Supports multiple deep learning frameworks (TensorFlow, PyTorch, MXNet)
  • Designed for distributed training across multiple GPUs and nodes
  • Integrates with popular cloud platforms and resource managers

Cons of Horovod

  • More complex setup and configuration
  • Steeper learning curve for beginners
  • May introduce overhead for small-scale projects

Code Comparison

Horovod (distributed training):

import torch
import horovod.torch as hvd

# Assumes model and optimizer are already defined as usual
hvd.init()                               # initialize the Horovod runtime
torch.cuda.set_device(hvd.local_rank())  # pin each process to its own GPU
optimizer = hvd.DistributedOptimizer(optimizer, named_parameters=model.named_parameters())
hvd.broadcast_parameters(model.state_dict(), root_rank=0)  # sync initial weights from rank 0

PyTorch Scatter (efficient scatter operations):

from torch_scatter import scatter_max

# Assumes src and index are defined; scatter_max returns both values and argmax positions
output, argmax = scatter_max(src, index, dim=0)

Summary

Horovod is a comprehensive distributed deep learning framework, while PyTorch Scatter focuses on efficient scatter operations for PyTorch. Horovod excels in large-scale distributed training scenarios, offering multi-framework support and cloud integration. However, it may be overkill for smaller projects and has a steeper learning curve. PyTorch Scatter, on the other hand, provides a simpler, more focused solution for scatter operations within PyTorch, making it easier to use for specific tasks but lacking the broader distributed training capabilities of Horovod.


DeepSpeed: A deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Pros of DeepSpeed

  • Comprehensive optimization toolkit for large-scale deep learning
  • Supports distributed training and model parallelism
  • Integrates advanced techniques like ZeRO optimizer and pipeline parallelism

Cons of DeepSpeed

  • Steeper learning curve due to its extensive feature set
  • May be overkill for smaller projects or simpler models
  • Requires more setup and configuration compared to PyTorch Scatter

Code Comparison

PyTorch Scatter:

import torch
from torch_scatter import scatter_mean

src = torch.randn(10, 5)
index = torch.tensor([0, 1, 0, 1, 2, 2, 3, 3, 3, 4])
out = scatter_mean(src, index, dim=0)

DeepSpeed:

import deepspeed
import torch

model = MyModel()  # placeholder for your own nn.Module
optimizer = torch.optim.Adam(model.parameters())
# args must carry a DeepSpeed configuration (e.g., a --deepspeed_config path)
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args, model=model, optimizer=optimizer
)

Key Differences

  • PyTorch Scatter focuses on efficient scatter operations for PyTorch tensors
  • DeepSpeed is a more comprehensive toolkit for optimizing large-scale deep learning
  • PyTorch Scatter is easier to integrate into existing PyTorch projects
  • DeepSpeed offers more advanced features for distributed training and model optimization

apex: A PyTorch extension with tools for easy mixed precision and distributed training in PyTorch.

Pros of apex

  • Offers a wider range of optimization techniques, including mixed precision training and distributed training
  • Developed and maintained by NVIDIA, ensuring compatibility with their hardware and potential performance benefits
  • Includes additional features like CUDA-aware communication and automatic mixed precision

Cons of apex

  • More complex to set up and use compared to pytorch_scatter
  • May have compatibility issues with non-NVIDIA hardware
  • Requires more in-depth knowledge of GPU optimization techniques

Code Comparison

apex:

from apex import amp

# Wrap model and optimizer for mixed precision; "O1" patches most ops to run in fp16
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()  # backward pass on the scaled loss to avoid fp16 underflow

pytorch_scatter:

from torch_scatter import scatter_mean

# src: [N, F] values, index: [N] group ids, as in earlier examples
output = scatter_mean(src, index, dim=0)

Summary

apex is a more comprehensive optimization library with a focus on NVIDIA hardware, offering advanced features for high-performance deep learning. pytorch_scatter, on the other hand, provides specialized scatter operations for PyTorch tensors, with a simpler API and potentially broader hardware compatibility. The choice between the two depends on specific project requirements and hardware constraints.


dgl: Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Pros of DGL

  • Comprehensive graph neural network (GNN) library with a wide range of built-in models and algorithms
  • Supports multiple deep learning frameworks (PyTorch, MXNet, TensorFlow)
  • Scalable for large-scale graph processing and distributed training

Cons of DGL

  • Steeper learning curve due to its extensive feature set
  • May be overkill for simple scatter operations or projects not focused on graph-based tasks
  • Potentially higher overhead for basic operations compared to PyTorch Scatter

Code Comparison

DGL (graph construction and message passing):

import dgl
import dgl.function as fn
import torch

g = dgl.graph(([0, 1], [1, 2]))  # two edges, 0->1 and 1->2, over three nodes
g.ndata['h'] = torch.ones(3, 5)  # node features
g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h_sum'))  # sum incoming messages into 'h_sum'

PyTorch Scatter (scatter operation):

import torch
from torch_scatter import scatter_add

src = torch.randn(10, 5)
index = torch.tensor([0, 1, 1, 2, 2, 2, 3, 3, 4, 4])
out = scatter_add(src, index, dim=0)  # out: [5, 5], one row per group

PyTorch Scatter focuses on efficient scatter operations, while DGL provides a more comprehensive toolkit for graph-based deep learning tasks. Choose based on your specific project requirements and complexity.


README

PyTorch Scatter



Documentation

This package consists of a small extension library of highly optimized sparse update (scatter and segment) operations for use in PyTorch, which are missing in the main package. Scatter and segment operations can be roughly described as reduce operations based on a given "group-index" tensor. Segment operations require the "group-index" tensor to be sorted, whereas scatter operations are not subject to this requirement.

The package consists of the following operations with reduction types "sum"|"mean"|"min"|"max": scatter (based on arbitrary indices), segment_coo (based on sorted indices), and segment_csr (based on compressed indices via pointers).
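
To make the scatter/segment distinction concrete, the following sketch (arbitrary tensors) computes the same grouped sum three ways; with a sorted index, all three agree:

import torch
from torch_scatter import scatter, segment_coo, segment_csr

src = torch.randn(6, 3)
index = torch.tensor([0, 0, 1, 1, 1, 2])  # sorted "group-index" tensor
indptr = torch.tensor([0, 2, 5, 6])       # CSR-style pointers for the same grouping

a = scatter(src, index, dim=0, reduce="sum")  # also works for unsorted indices
b = segment_coo(src, index, reduce="sum")     # requires a sorted index
c = segment_csr(src, indptr, reduce="sum")    # requires compressed pointers
print(torch.allclose(a, b) and torch.allclose(b, c))  # True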

In addition, we provide the following composite functions which make use of scatter_* operations under the hood: scatter_std, scatter_logsumexp, scatter_softmax and scatter_log_softmax.
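
For example, scatter_softmax normalizes entries within each group (a small sketch with arbitrary values):

import torch
from torch_scatter import scatter_softmax

src = torch.randn(6)
index = torch.tensor([0, 0, 1, 1, 1, 2])
out = scatter_softmax(src, index, dim=0)
# Entries within each group sum to 1, e.g. out[0] + out[1] == 1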

All included operations are broadcastable, work on varying data types, are implemented both for CPU and GPU with corresponding backward implementations, and are fully traceable.

Installation

Anaconda

Update: You can now install pytorch-scatter via Anaconda for all major OS/PyTorch/CUDA combinations 🤗 Given that you have pytorch >= 1.8.0 installed, simply run

conda install pytorch-scatter -c pyg

Binaries

We alternatively provide pip wheels for all major OS/PyTorch/CUDA combinations, see here.

PyTorch 2.4

To install the binaries for PyTorch 2.4.0, simply run

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.0+${CUDA}.html

where ${CUDA} should be replaced by either cpu, cu118, cu121, or cu124 depending on your PyTorch installation.
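
For example, for PyTorch 2.4.0 with CUDA 12.1:

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.0+cu121.html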

|         | cpu | cu118 | cu121 | cu124 |
|---------|-----|-------|-------|-------|
| Linux   | ✅  | ✅    | ✅    | ✅    |
| Windows | ✅  | ✅    | ✅    | ✅    |
| macOS   | ✅  |       |       |       |

PyTorch 2.3

To install the binaries for PyTorch 2.3.0, simply run

pip install torch-scatter -f https://data.pyg.org/whl/torch-2.3.0+${CUDA}.html

where ${CUDA} should be replaced by either cpu, cu118, or cu121 depending on your PyTorch installation.

|         | cpu | cu118 | cu121 |
|---------|-----|-------|-------|
| Linux   | ✅  | ✅    | ✅    |
| Windows | ✅  | ✅    | ✅    |
| macOS   | ✅  |       |       |

Note: Binaries of older versions are also provided for PyTorch 1.4.0, PyTorch 1.5.0, PyTorch 1.6.0, PyTorch 1.7.0/1.7.1, PyTorch 1.8.0/1.8.1, PyTorch 1.9.0, PyTorch 1.10.0/1.10.1/1.10.2, PyTorch 1.11.0, PyTorch 1.12.0/1.12.1, PyTorch 1.13.0/1.13.1, PyTorch 2.0.0/2.0.1, PyTorch 2.1.0/2.1.1/2.1.2, and PyTorch 2.2.0/2.2.1/2.2.2 (following the same procedure). For older versions, you need to explicitly specify the latest supported version number or install via pip install --no-index in order to prevent a manual installation from source. You can look up the latest supported version number here.

From source

Ensure that at least PyTorch 1.4.0 is installed and verify that cuda/bin and cuda/include are in your $PATH and $CPATH respectively, e.g.:

$ python -c "import torch; print(torch.__version__)"
>>> 1.4.0

$ echo $PATH
>>> /usr/local/cuda/bin:...

$ echo $CPATH
>>> /usr/local/cuda/include:...

Then run:

pip install torch-scatter

When running in a docker container without NVIDIA driver, PyTorch needs to evaluate the compute capabilities and may fail. In this case, ensure that the compute capabilities are set via TORCH_CUDA_ARCH_LIST, e.g.:

export TORCH_CUDA_ARCH_LIST="6.0 6.1 7.2+PTX 7.5+PTX"

Example

import torch
from torch_scatter import scatter_max

src = torch.tensor([[2, 0, 1, 4, 3], [0, 2, 1, 3, 4]])
index = torch.tensor([[4, 5, 4, 2, 3], [0, 0, 2, 2, 1]])

out, argmax = scatter_max(src, index, dim=-1)
print(out)
tensor([[0, 0, 4, 3, 2, 0],
        [2, 4, 3, 0, 0, 0]])

print(argmax)
tensor([[5, 5, 3, 4, 0, 1],
        [1, 4, 3, 5, 5, 5]])

Running tests

pytest

C++ API

torch-scatter also offers a C++ API that contains C++ equivalents of the Python functions. For this, we need to add TorchLib to the -DCMAKE_PREFIX_PATH (e.g., it may exist in {CONDA}/lib/python{X.X}/site-packages/torch if installed via conda):

mkdir build
cd build
# Add -DWITH_CUDA=on for CUDA support
cmake -DCMAKE_PREFIX_PATH="..." ..
make
make install