
NVIDIA/nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs


Top Related Projects

  • NVIDIA device plugin for Kubernetes
  • NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
  • Build and run containers leveraging NVIDIA GPUs
  • CLI tool for spawning and running containers according to the OCI specification
  • An open and reliable container runtime

Quick Overview

NVIDIA/nvidia-docker is a project that enables GPU acceleration for Docker containers. It provides a set of tools and runtime libraries that allow Docker containers to leverage NVIDIA GPUs for compute-intensive tasks, such as machine learning and scientific computing.
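Under the hood, the nvidia-docker2 package registers an additional OCI runtime with Docker. As a minimal sketch (the exact contents are managed by the package and may vary by version), the entry it adds to /etc/docker/daemon.json looks like:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}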

Pros

  • Seamless integration of NVIDIA GPUs with Docker containers
  • Improved performance for GPU-accelerated applications in containerized environments
  • Simplified deployment and management of GPU-enabled applications
  • Supports a wide range of NVIDIA GPU architectures

Cons

  • Limited to NVIDIA GPUs only, not compatible with other GPU manufacturers
  • Requires additional setup and configuration compared to standard Docker installations
  • May introduce compatibility issues with certain Docker features or third-party tools
  • Potential performance overhead in some scenarios compared to bare-metal GPU usage

Getting Started

To get started with NVIDIA Docker, follow these steps:

  1. Install the NVIDIA GPU driver on your host system.
  2. Install Docker on your system.
  3. Install the NVIDIA Container Toolkit:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update
sudo apt-get install -y nvidia-docker2
  4. Restart the Docker daemon:
sudo systemctl restart docker
  5. Test the installation by running a sample CUDA container:
docker run --gpus all nvidia/cuda:11.0-base nvidia-smi

This should display information about your NVIDIA GPU(s) if the installation was successful.
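If you want to expose only a subset of GPUs rather than all of them, the --gpus flag also accepts a count or an explicit device list:

docker run --gpus 1 nvidia/cuda:11.0-base nvidia-smi
docker run --gpus '"device=0,1"' nvidia/cuda:11.0-base nvidia-smi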

Competitor Comparisons

NVIDIA device plugin for Kubernetes

Pros of k8s-device-plugin

  • Native Kubernetes integration for GPU resource management
  • Supports multi-node GPU clusters and advanced scheduling features
  • Easier to scale and manage in large Kubernetes environments

Cons of k8s-device-plugin

  • Limited to Kubernetes environments, less flexible for standalone Docker use
  • May require more complex setup and configuration compared to nvidia-docker
  • Potential learning curve for teams not familiar with Kubernetes

Code Comparison

k8s-device-plugin:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds

nvidia-docker:

docker run --gpus all nvidia/cuda:11.0-base nvidia-smi

The k8s-device-plugin uses Kubernetes manifests to deploy and manage GPU resources, while nvidia-docker relies on Docker CLI commands with GPU options. This reflects the different approaches and use cases of the two projects, with k8s-device-plugin being more tightly integrated into Kubernetes ecosystems and nvidia-docker offering a simpler solution for Docker-based workflows.
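For context, once the device plugin is deployed, pods request GPUs through the nvidia.com/gpu extended resource rather than a CLI flag. A minimal example pod:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:11.0-base
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1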

NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

Pros of gpu-operator

  • Provides a more comprehensive Kubernetes-native solution for GPU management
  • Automates driver installation and updates across the cluster
  • Simplifies GPU resource allocation and monitoring in Kubernetes environments

Cons of gpu-operator

  • Requires Kubernetes, which may not be suitable for all deployment scenarios
  • Has a steeper learning curve for users unfamiliar with Kubernetes operators
  • May introduce additional complexity in simpler GPU setups

Code comparison

gpu-operator (using Helm):

helm repo add nvidia https://nvidia.github.io/gpu-operator
helm install --wait --generate-name \
     -n gpu-operator --create-namespace \
     nvidia/gpu-operator

nvidia-docker:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2

The gpu-operator is designed for Kubernetes environments, offering a more integrated solution for GPU management in containerized workloads. It automates many aspects of GPU setup and maintenance across the cluster. However, it requires Kubernetes and may be overkill for simpler setups.

nvidia-docker, on the other hand, is more straightforward to set up and use in non-Kubernetes environments. It's ideal for single-host Docker deployments but lacks the advanced cluster-wide management features of gpu-operator.
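After the Helm install above completes, a quick sanity check is to confirm that the operator's components (driver, container toolkit, device plugin, validators) are running in the target namespace:

kubectl get pods -n gpu-operator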

Build and run containers leveraging NVIDIA GPUs

Pros of nvidia-container-toolkit

  • More lightweight and flexible approach to GPU support in containers
  • Better integration with container runtimes like containerd and CRI-O
  • Supports a wider range of NVIDIA GPU architectures

Cons of nvidia-container-toolkit

  • Requires more manual configuration compared to nvidia-docker
  • May have a steeper learning curve for users familiar with nvidia-docker

Code Comparison

nvidia-docker:

# Start from an NVIDIA-provided CUDA base image
FROM nvidia/cuda:11.0-base
# Multi-stage copy of the CUDA runtime library from the runtime image variant
COPY --from=nvidia/cuda:11.0-runtime /usr/local/cuda/lib64/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.11.0

nvidia-container-toolkit:

FROM ubuntu:20.04
# Tell the NVIDIA runtime which GPUs and driver capabilities to expose
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
# Assumes the CUDA apt repository has already been added to the image
RUN apt-get update && apt-get install -y --no-install-recommends cuda-toolkit-11-0

The nvidia-container-toolkit approach allows for more granular control over GPU capabilities and doesn't require specific NVIDIA base images. It provides a more flexible setup, especially when working with custom images or non-CUDA workloads that still require GPU access.
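As an illustration of that flexibility, a plain base image can be granted GPU access purely through environment variables, assuming the NVIDIA runtime is installed and registered with Docker (a sketch, not the only way to invoke it):

docker run --rm --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=0 \
    -e NVIDIA_DRIVER_CAPABILITIES=utility \
    ubuntu:20.04 nvidia-smi

Here the utility capability causes the runtime to mount nvidia-smi into the container, so no CUDA base image is needed.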


CLI tool for spawning and running containers according to the OCI specification

Pros of runc

  • More widely adopted and supported across different container runtimes
  • Lightweight and focused on core container execution functionality
  • Part of the Open Container Initiative (OCI), ensuring standardization

Cons of runc

  • Lacks native GPU support for NVIDIA hardware
  • Requires additional configuration for GPU-accelerated workloads
  • May not provide optimal performance for GPU-intensive applications

Code Comparison

runc:

// Load the OCI runtime spec (config.json) for the container
spec, err := loadSpec(context)
if err != nil {
    return err
}
// Create and start the container from the loaded spec
status, err := startContainer(context, spec)

nvidia-docker:

docker run --runtime=nvidia \
    --gpus all \
    nvidia/cuda:11.0-base \
    nvidia-smi

Summary

runc is a general-purpose container runtime that adheres to OCI standards, while nvidia-docker is specifically designed for GPU-accelerated containers using NVIDIA hardware. runc offers broader compatibility and standardization, but nvidia-docker provides seamless integration with NVIDIA GPUs and optimized performance for GPU workloads. The choice between them depends on the specific requirements of your containerized applications and infrastructure.
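In practice, the "additional configuration" for GPU workloads under plain runc means injecting NVIDIA's prestart hook into the container's OCI config.json. A hedged sketch of the relevant fragment (the hook binary's name and path vary across toolkit versions):

"hooks": {
    "prestart": [
        {
            "path": "/usr/bin/nvidia-container-runtime-hook",
            "args": ["nvidia-container-runtime-hook", "prestart"]
        }
    ]
}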

An open and reliable container runtime

Pros of containerd

  • More general-purpose container runtime, supporting multiple container formats
  • Widely adopted as the default runtime for Kubernetes and Docker
  • Active development with frequent updates and improvements

Cons of containerd

  • Lacks built-in GPU support for NVIDIA hardware
  • Requires additional configuration for GPU-accelerated containers
  • May have a steeper learning curve for users familiar with Docker

Code comparison

nvidia-docker:

docker run --gpus all nvidia/cuda:11.0-base nvidia-smi

containerd:

ctr image pull docker.io/nvidia/cuda:11.0-base
ctr run --rm --gpus 0 docker.io/nvidia/cuda:11.0-base cuda-test nvidia-smi

Summary

containerd is a more versatile container runtime with broader industry adoption, while nvidia-docker provides a simpler solution for GPU-accelerated containers on NVIDIA hardware. containerd requires additional setup for GPU support, but offers greater flexibility and is better suited for complex container orchestration scenarios. nvidia-docker, on the other hand, provides out-of-the-box GPU support for Docker containers, making it easier to use for GPU-intensive workloads on NVIDIA GPUs.
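For reference, that additional setup typically means registering the NVIDIA runtime in /etc/containerd/config.toml and restarting containerd. A sketch based on NVIDIA's documentation (section names differ slightly across containerd versions):

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
    BinaryName = "/usr/bin/nvidia-container-runtime"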


README

DEPRECATION NOTICE

This project has been superseded by the NVIDIA Container Toolkit.

The tooling provided by this repository has been deprecated and the repository archived.

The nvidia-docker wrapper is no longer supported, and the NVIDIA Container Toolkit has been extended to allow users to configure Docker to use the NVIDIA Container Runtime.

For further instructions, see the NVIDIA Container Toolkit documentation and specifically the install guide.
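As a rough sketch of the migration on an apt-based system (see the install guide for the repository setup appropriate to your distribution):

sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker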

Issues and Contributing

Check out the Contributing document!

  • For questions, feature requests, or bugs, open an issue against the nvidia-container-toolkit repository.