Top Related Projects
- fairscale: PyTorch extensions for high performance and large scale training.
- pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
- horovod: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
- DeepSpeed: a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
- apex: A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
- dgl: Python package built to ease deep learning on graph, on top of existing DL frameworks.
Quick Overview
PyTorch Scatter is a PyTorch extension library that provides efficient scatter operations for sparse data. It implements various scatter and segment operations, allowing for fast and memory-efficient computations on irregularly structured input data, such as graphs or point clouds.
Pros
- High performance: Implements scatter operations in C++/CUDA for optimal speed
- Seamless integration with PyTorch: Works well with existing PyTorch tensors and autograd
- Supports various reduction operations: Includes sum, mean, min, max, and more (see the short sketch after this list)
- Handles both CPU and GPU computations efficiently
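As a quick illustration of those reduction modes, the generic scatter entry point takes a reduce argument. A minimal sketch (shapes and values here are arbitrary, chosen only for illustration):
import torch
from torch_scatter import scatter
src = torch.randn(6, 4)
index = torch.tensor([0, 0, 1, 1, 2, 2])
# Same call, different reductions over the groups defined by `index`
summed = scatter(src, index, dim=0, reduce="sum")
averaged = scatter(src, index, dim=0, reduce="mean")
maxima = scatter(src, index, dim=0, reduce="max")  # reduce="min" works the same way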
Cons
- Limited to scatter operations: Focused library, not a general-purpose sparse tensor framework
- Extra installation step: must be built from source or installed from pre-built wheels matching your PyTorch/CUDA version
- Learning curve: May require understanding of scatter operations and sparse data structures
- Dependency on PyTorch: Cannot be used standalone or with other deep learning frameworks
Code Examples
- Basic scatter_add operation:
import torch
from torch_scatter import scatter_add
src = torch.randn(10, 5)
index = torch.tensor([0, 1, 0, 1, 2, 2, 3, 3, 4, 4])
out = scatter_add(src, index, dim=0)  # out has shape [5, 5] (index.max() + 1 groups)
- Scatter mean with custom output size:
import torch
from torch_scatter import scatter_mean
src = torch.randn(8, 3)
index = torch.tensor([0, 1, 0, 1, 2, 2, 3, 3])
out = scatter_mean(src, index, dim=0, dim_size=5)  # out has shape [5, 3]; groups with no entries stay zero
- Segment operation:
import torch
from torch_scatter import segment_csr
src = torch.randn(10, 5)
indptr = torch.tensor([0, 2, 5, 8, 10])
out = segment_csr(src, indptr, reduce="max")  # indptr defines 4 segments, so out has shape [4, 5]
Getting Started
To install PyTorch Scatter:
pip install torch-scatter
Basic usage:
import torch
from torch_scatter import scatter
src = torch.randn(10, 5)
index = torch.randint(0, 3, (10,))
out = scatter(src, index, dim=0, dim_size=3, reduce="sum")  # dim_size fixes the number of output rows
print(out.shape) # Output: torch.Size([3, 5])
Make sure to have PyTorch installed and compatible with your CUDA version if using GPU acceleration.
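If you use GPU acceleration, the same call works on CUDA tensors unchanged. A minimal sketch, assuming a CUDA-enabled torch-scatter build:
import torch
from torch_scatter import scatter
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
src = torch.randn(10, 5, device=device)
index = torch.randint(0, 3, (10,), device=device)
out = scatter(src, index, dim=0, dim_size=3, reduce="sum")  # computed on `device`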
Competitor Comparisons
PyTorch extensions for high performance and large scale training.
Pros of fairscale
- Broader scope: Focuses on large-scale distributed training and model parallelism
- More comprehensive: Offers a wide range of techniques for efficient deep learning
- Active development: Regularly updated with new features and optimizations
Cons of fairscale
- Higher complexity: Steeper learning curve due to its extensive feature set
- Heavier dependency: Requires more setup and configuration for full utilization
- Less specialized: May not be as optimized for specific scatter operations
Code Comparison
pytorch_scatter:
import torch
from torch_scatter import scatter_max
src = torch.randn(10, 5)
index = torch.tensor([0, 1, 1, 2, 2, 3, 3, 4, 4, 4])
out, _ = scatter_max(src, index, dim=0)
fairscale:
import torch
from fairscale.nn import ShardedDataParallel
model = YourModel()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
model = ShardedDataParallel(model, optimizer)
While pytorch_scatter focuses on efficient scatter operations, fairscale provides a broader set of tools for distributed training and model parallelism. The code examples highlight their different use cases and levels of abstraction.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Comprehensive deep learning framework with a wide range of functionalities
- Large community support and extensive documentation
- Regular updates and improvements from Facebook AI Research
Cons of PyTorch
- Larger codebase and installation size
- Steeper learning curve for beginners
- May include unnecessary features for specific scatter operations
Code Comparison
PyTorch Scatter:
import torch_scatter
output = torch_scatter.scatter_add(src, index, dim=0)
PyTorch:
num_segments = int(index.max()) + 1  # the native API needs the output pre-allocated
output = torch.zeros(num_segments, src.size(1))
output.scatter_add_(0, index.unsqueeze(-1).expand_as(src), src)
Key Differences
- PyTorch Scatter focuses specifically on efficient scatter operations
- PyTorch Scatter may offer better performance for scatter-specific tasks
- PyTorch provides a more general-purpose framework for various deep learning tasks
Use Cases
- PyTorch Scatter: Ideal for projects requiring frequent and efficient scatter operations
- PyTorch: Suitable for a wide range of deep learning applications and research projects
Community and Support
- PyTorch Scatter: Smaller, specialized community
- PyTorch: Large, active community with extensive resources and third-party libraries
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Pros of Horovod
- Supports multiple deep learning frameworks (TensorFlow, PyTorch, MXNet)
- Designed for distributed training across multiple GPUs and nodes
- Integrates with popular cloud platforms and resource managers
Cons of Horovod
- More complex setup and configuration
- Steeper learning curve for beginners
- May introduce overhead for small-scale projects
Code Comparison
Horovod (distributed training):
import horovod.torch as hvd
hvd.init()
torch.cuda.set_device(hvd.local_rank())
optimizer = hvd.DistributedOptimizer(optimizer, named_parameters=model.named_parameters())
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
PyTorch Scatter (efficient scatter operations):
from torch_scatter import scatter_max
output, argmax = scatter_max(src, index, dim=0)  # scatter_max also returns the argmax positions
Summary
Horovod is a comprehensive distributed deep learning framework, while PyTorch Scatter focuses on efficient scatter operations for PyTorch. Horovod excels in large-scale distributed training scenarios, offering multi-framework support and cloud integration. However, it may be overkill for smaller projects and has a steeper learning curve. PyTorch Scatter, on the other hand, provides a simpler, more focused solution for scatter operations within PyTorch, making it easier to use for specific tasks but lacking the broader distributed training capabilities of Horovod.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Pros of DeepSpeed
- Comprehensive optimization toolkit for large-scale deep learning
- Supports distributed training and model parallelism
- Integrates advanced techniques like ZeRO optimizer and pipeline parallelism
Cons of DeepSpeed
- Steeper learning curve due to its extensive feature set
- May be overkill for smaller projects or simpler models
- Requires more setup and configuration compared to PyTorch Scatter
Code Comparison
PyTorch Scatter:
import torch
from torch_scatter import scatter_mean
src = torch.randn(10, 5)
index = torch.tensor([0, 1, 0, 1, 2, 2, 3, 3, 3, 4])
out = scatter_mean(src, index, dim=0)
DeepSpeed:
import deepspeed
import torch
model = MyModel()
optimizer = torch.optim.Adam(model.parameters())
model_engine, optimizer, _, _ = deepspeed.initialize(
args=args, model=model, optimizer=optimizer
)
Key Differences
- PyTorch Scatter focuses on efficient scatter operations for PyTorch tensors
- DeepSpeed is a more comprehensive toolkit for optimizing large-scale deep learning
- PyTorch Scatter is easier to integrate into existing PyTorch projects
- DeepSpeed offers more advanced features for distributed training and model optimization
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
Pros of apex
- Offers a wider range of optimization techniques, including mixed precision training and distributed training
- Developed and maintained by NVIDIA, ensuring compatibility with their hardware and potential performance benefits
- Includes additional features like CUDA-aware communication and automatic mixed precision
Cons of apex
- More complex to set up and use compared to pytorch_scatter
- May have compatibility issues with non-NVIDIA hardware
- Requires more in-depth knowledge of GPU optimization techniques
Code Comparison
apex:
from apex import amp
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
pytorch_scatter:
from torch_scatter import scatter_mean
output = scatter_mean(src, index, dim=0)
Summary
apex is a more comprehensive optimization library with a focus on NVIDIA hardware, offering advanced features for high-performance deep learning. pytorch_scatter, on the other hand, provides specialized scatter operations for PyTorch tensors, with a simpler API and potentially broader hardware compatibility. The choice between the two depends on specific project requirements and hardware constraints.
Python package built to ease deep learning on graph, on top of existing DL frameworks.
Pros of DGL
- Comprehensive graph neural network (GNN) library with a wide range of built-in models and algorithms
- Supports multiple deep learning frameworks (PyTorch, MXNet, TensorFlow)
- Scalable for large-scale graph processing and distributed training
Cons of DGL
- Steeper learning curve due to its extensive feature set
- May be overkill for simple scatter operations or projects not focused on graph-based tasks
- Potentially higher overhead for basic operations compared to PyTorch Scatter
Code Comparison
DGL (graph construction and message passing):
import dgl
import dgl.function as fn
import torch
g = dgl.graph(([0, 1], [1, 2]))
g.ndata['h'] = torch.ones(3, 5)
g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h_sum'))
PyTorch Scatter (scatter operation):
import torch
from torch_scatter import scatter_add
src = torch.randn(10, 5)
index = torch.tensor([0, 1, 1, 2, 2, 2, 3, 3, 4, 4])
out = scatter_add(src, index, dim=0)
PyTorch Scatter focuses on efficient scatter operations, while DGL provides a more comprehensive toolkit for graph-based deep learning tasks. Choose based on your specific project requirements and complexity.
README
PyTorch Scatter
This package consists of a small extension library of highly optimized sparse update (scatter and segment) operations for the use in PyTorch, which are missing in the main package. Scatter and segment operations can be roughly described as reduce operations based on a given "group-index" tensor. Segment operations require the "group-index" tensor to be sorted, whereas scatter operations are not subject to these requirements.
The package consists of the following operations with reduction types "sum"|"mean"|"min"|"max":
- scatter based on arbitrary indices
- segment_coo based on sorted indices
- segment_csr based on compressed indices via pointers
In addition, we provide the following composite functions which make use of scatter_* operations under the hood: scatter_std, scatter_logsumexp, scatter_softmax and scatter_log_softmax.
All included operations are broadcastable, work on varying data types, are implemented both for CPU and GPU with corresponding backward implementations, and are fully traceable.
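To make the scatter/segment distinction concrete, here is a small illustrative sketch (not part of the original README) that computes the same per-group sums with all three entry points and also calls one of the composite functions:
import torch
from torch_scatter import scatter, segment_coo, segment_csr, scatter_softmax
src = torch.tensor([[1., 2.], [3., 4.], [5., 6.], [7., 8.]])
index = torch.tensor([0, 0, 1, 1])   # group index per row (already sorted, as segment_coo requires)
indptr = torch.tensor([0, 2, 4])     # the same grouping as compressed row pointers
out_scatter = scatter(src, index, dim=0, reduce="sum")
out_coo = segment_coo(src, index, reduce="sum")
out_csr = segment_csr(src, indptr, reduce="sum")
# all three yield tensor([[ 4.,  6.], [12., 14.]])
probs = scatter_softmax(src, index, dim=0)  # softmax computed independently within each group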
Installation
Anaconda
Update: You can now install pytorch-scatter via Anaconda for all major OS/PyTorch/CUDA combinations 🤗
Given that you have pytorch >= 1.8.0 installed, simply run
conda install pytorch-scatter -c pyg
Binaries
We alternatively provide pip wheels for all major OS/PyTorch/CUDA combinations, see here.
PyTorch 2.4
To install the binaries for PyTorch 2.4.0, simply run
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.0+${CUDA}.html
where ${CUDA} should be replaced by either cpu, cu118, cu121, or cu124 depending on your PyTorch installation.
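For example, for a PyTorch 2.4.0 installation built against CUDA 12.1, the command becomes:
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.4.0+cu121.html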
| | cpu | cu118 | cu121 | cu124 |
|---|---|---|---|---|
| Linux | ✅ | ✅ | ✅ | ✅ |
| Windows | ✅ | ✅ | ✅ | ✅ |
| macOS | ✅ | | | |
PyTorch 2.3
To install the binaries for PyTorch 2.3.0, simply run
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.3.0+${CUDA}.html
where ${CUDA} should be replaced by either cpu, cu118, or cu121 depending on your PyTorch installation.
| | cpu | cu118 | cu121 |
|---|---|---|---|
| Linux | ✅ | ✅ | ✅ |
| Windows | ✅ | ✅ | ✅ |
| macOS | ✅ | | |
Note: Binaries of older versions are also provided for PyTorch 1.4.0, PyTorch 1.5.0, PyTorch 1.6.0, PyTorch 1.7.0/1.7.1, PyTorch 1.8.0/1.8.1, PyTorch 1.9.0, PyTorch 1.10.0/1.10.1/1.10.2, PyTorch 1.11.0, PyTorch 1.12.0/1.12.1, PyTorch 1.13.0/1.13.1, PyTorch 2.0.0/2.0.1, PyTorch 2.1.0/2.1.1/2.1.2, and PyTorch 2.2.0/2.2.1/2.2.2 (following the same procedure).
For older versions, you need to explicitly specify the latest supported version number or install via pip install --no-index in order to prevent a manual installation from source.
You can look up the latest supported version number here.
From source
Ensure that at least PyTorch 1.4.0 is installed and verify that cuda/bin and cuda/include are in your $PATH and $CPATH respectively, e.g.:
$ python -c "import torch; print(torch.__version__)"
>>> 1.4.0
$ echo $PATH
>>> /usr/local/cuda/bin:...
$ echo $CPATH
>>> /usr/local/cuda/include:...
Then run:
pip install torch-scatter
When running in a Docker container without an NVIDIA driver, PyTorch needs to evaluate the compute capabilities and may fail. In this case, ensure that the compute capabilities are set via TORCH_CUDA_ARCH_LIST, e.g.:
export TORCH_CUDA_ARCH_LIST="6.0 6.1 7.2+PTX 7.5+PTX"
Example
import torch
from torch_scatter import scatter_max
src = torch.tensor([[2, 0, 1, 4, 3], [0, 2, 1, 3, 4]])
index = torch.tensor([[4, 5, 4, 2, 3], [0, 0, 2, 2, 1]])
out, argmax = scatter_max(src, index, dim=-1)
print(out)
tensor([[0, 0, 4, 3, 2, 0],
[2, 4, 3, 0, 0, 0]])
print(argmax)
tensor([[5, 5, 3, 4, 0, 1],
[1, 4, 3, 5, 5, 5]])
Running tests
pytest
C++ API
torch-scatter also offers a C++ API that contains C++ equivalents of the Python functions.
For this, we need to add TorchLib to the -DCMAKE_PREFIX_PATH (e.g., it may exist in {CONDA}/lib/python{X.X}/site-packages/torch if installed via conda):
mkdir build
cd build
# Add -DWITH_CUDA=on for CUDA support
cmake -DCMAKE_PREFIX_PATH="..." ..
make
make install