determined-ai/determined

Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.

Top Related Projects

ray-project/ray (36,653 stars)
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

mlflow/mlflow (20,329 stars)
Open source platform for the machine learning lifecycle

pytorch/pytorch (88,135 stars)
Tensors and Dynamic neural networks in Python with strong GPU acceleration

tensorflow/tensorflow (188,828 stars)
An Open Source Machine Learning Framework for Everyone

huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

wandb/wandb (9,810 stars)
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Quick Overview

Determined AI is an open-source deep learning training platform that simplifies the process of training, experimenting with, and deploying machine learning models. It provides a comprehensive set of tools for distributed training, hyperparameter tuning, and experiment management, making it easier for data scientists and machine learning engineers to build and scale their ML workflows.

Pros

  • Seamless distributed training and hyperparameter tuning
  • Built-in experiment tracking and visualization
  • Support for popular deep learning frameworks like PyTorch and TensorFlow
  • Easy-to-use CLI and web interface for managing experiments

Cons

  • Steeper learning curve compared to simpler ML tools
  • Requires cluster setup for full distributed capabilities
  • Limited support for non-deep learning ML algorithms
  • Smaller community compared to some other ML platforms

Code Examples

  1. Defining a model using Determined AI's PyTorch interface (data loaders and evaluation are sketched after this list):
import torch
import torch.nn as nn

from determined.pytorch import PyTorchTrial

class MyModel(PyTorchTrial):
    def __init__(self, context):
        self.context = context
        self.model = self.context.wrap_model(nn.Sequential(
            nn.Linear(10, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        ))
        self.optimizer = self.context.wrap_optimizer(
            torch.optim.Adam(self.model.parameters())
        )

    def train_batch(self, batch, epoch_idx, batch_idx):
        inputs, labels = batch
        outputs = self.model(inputs)
        loss = nn.MSELoss()(outputs, labels)
        self.context.backward(loss)
        self.context.step_optimizer(self.optimizer)
        return {"loss": loss.item()}
  2. Configuring hyperparameter search:
hyperparameters:
  learning_rate:
    type: log
    minval: -5.0
    maxval: 0.0
    base: 10
  batch_size:
    type: categorical
    vals: [32, 64, 128]

searcher:
  name: adaptive_asha
  metric: validation_loss
  smaller_is_better: true
  max_trials: 50
  3. Launching an experiment using the Determined AI CLI:
det experiment create config.yaml model_def.py
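
The trial in example 1 only covers the training step; a complete PyTorchTrial also builds its data loaders and reports validation metrics. Below is a minimal sketch of the remaining methods. train_dataset and val_dataset are placeholders for PyTorch datasets you would define yourself, and get_per_slot_batch_size assumes a global_batch_size hyperparameter in the experiment config.

import torch.nn as nn

from determined.pytorch import DataLoader, PyTorchTrial

class MyModel(PyTorchTrial):
    # __init__ and train_batch as shown in example 1 above

    def build_training_data_loader(self):
        # train_dataset is a placeholder torch.utils.data.Dataset you provide
        return DataLoader(train_dataset, batch_size=self.context.get_per_slot_batch_size())

    def build_validation_data_loader(self):
        return DataLoader(val_dataset, batch_size=self.context.get_per_slot_batch_size())

    def evaluate_batch(self, batch):
        inputs, labels = batch
        loss = nn.MSELoss()(self.model(inputs), labels)
        return {"validation_loss": loss.item()}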

Getting Started

  1. Install Determined AI:
pip install determined
  2. Create a simple model definition (e.g., model_def.py) and a configuration file (e.g., config.yaml); a minimal configuration sketch is shown after this list.
  3. Start a local Determined cluster:
det deploy local cluster-up
  4. Launch an experiment:
det experiment create config.yaml model_def.py
  5. Monitor your experiment using the web UI or CLI:
det experiment list
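
For step 2, a minimal single-trial config.yaml might look like the sketch below. The values are placeholders, the entrypoint assumes the trial class MyModel lives in model_def.py, and the exact required fields (for example searcher.max_length) can differ between Determined versions:

name: my-first-experiment
entrypoint: model_def:MyModel
hyperparameters:
  global_batch_size: 64
  learning_rate: 0.001
searcher:
  name: single
  metric: validation_loss
  smaller_is_better: true
  max_length:
    batches: 1000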

Competitor Comparisons

ray-project/ray (36,653 stars)

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Pros of Ray

  • More extensive ecosystem with libraries for various tasks (e.g., RLlib, Ray Serve)
  • Better suited for distributed computing and large-scale machine learning
  • Larger community and more frequent updates

Cons of Ray

  • Steeper learning curve due to its broader scope
  • Can be overkill for smaller projects or simpler machine learning tasks
  • Less focus on experiment tracking and reproducibility

Code Comparison

Ray:

import ray

@ray.remote
def f(x):
    return x * x

futures = [f.remote(i) for i in range(4)]
print(ray.get(futures))

Determined:

from determined.experimental import client

# Submit an experiment from a config file plus a directory of model code,
# then block until it finishes.
exp = client.create_experiment(config="config.yaml", model_dir=".")
exp.wait()

Ray focuses on distributed computing primitives, while Determined emphasizes experiment management and reproducibility. Ray's code shows remote function execution, whereas Determined's code demonstrates experiment creation and management.

mlflow/mlflow (20,329 stars)

Open source platform for the machine learning lifecycle

Pros of MLflow

  • Broader ecosystem support and integrations with various ML frameworks
  • Lightweight and easy to set up for small to medium-sized projects
  • Flexible experiment tracking and model versioning capabilities

Cons of MLflow

  • Less focus on distributed training and resource management
  • Limited built-in hyperparameter tuning capabilities
  • Requires more manual configuration for advanced use cases

Code Comparison

MLflow:

import mlflow

mlflow.start_run()
mlflow.log_param("learning_rate", 0.01)
mlflow.log_metric("accuracy", 0.85)
mlflow.end_run()

Determined:

from determined.experimental import client

# Hyperparameters are declared in the experiment config (abbreviated here;
# a full config also needs fields such as entrypoint and searcher).
experiment = client.create_experiment(
    config={"hyperparameters": {"learning_rate": 0.01}},
    model_dir=".",
)
experiment.wait()

Summary

MLflow is a versatile ML lifecycle management tool suitable for various project sizes, offering easy setup and flexible experiment tracking. However, it may require more manual configuration for advanced scenarios. Determined, on the other hand, provides stronger support for distributed training and resource management, making it more suitable for large-scale ML projects with complex infrastructure requirements.

pytorch/pytorch (88,135 stars)

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

  • Larger community and ecosystem, with more resources and third-party libraries
  • More flexible and customizable for low-level research and experimentation
  • Wider industry adoption and support

Cons of PyTorch

  • Steeper learning curve for beginners
  • Requires more boilerplate code for training and evaluation loops
  • Less integrated with cloud and distributed computing platforms

Code Comparison

PyTorch:

import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = torch.nn.MSELoss()

# toy data so the loop runs end to end
input_data = torch.randn(32, 10)
target = torch.randn(32, 1)

for epoch in range(100):
    optimizer.zero_grad()
    output = model(input_data)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()

Determined:

import torch

from determined.pytorch import PyTorchTrial

class MyTrial(PyTorchTrial):
    def __init__(self, context):
        self.context = context
        self.model = self.context.wrap_model(torch.nn.Linear(10, 1))
        self.optimizer = self.context.wrap_optimizer(
            torch.optim.SGD(self.model.parameters(), lr=0.01)
        )
        self.criterion = torch.nn.MSELoss()

    def train_batch(self, batch, epoch_idx, batch_idx):
        output = self.model(batch[0])
        loss = self.criterion(output, batch[1])
        self.context.backward(loss)
        self.context.step_optimizer(self.optimizer)
        return {"loss": loss}

tensorflow/tensorflow (188,828 stars)

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

  • Larger ecosystem with extensive libraries and tools
  • Broader industry adoption and community support
  • More comprehensive documentation and learning resources

Cons of TensorFlow

  • Steeper learning curve for beginners
  • Can be more complex to set up and configure
  • Less focus on distributed training out-of-the-box

Code Comparison

TensorFlow:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

Determined:

import tensorflow as tf

from determined.keras import TFKerasTrial

class MyTrial(TFKerasTrial):
    def __init__(self, context):
        self.context = context

    def build_model(self):
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation='relu'),
            tf.keras.layers.Dense(10, activation='softmax')
        ])
        # wrap the model so Determined can manage distributed training
        model = self.context.wrap_model(model)
        model.compile(optimizer='adam', loss='categorical_crossentropy')
        return model

Determined builds on top of TensorFlow, providing a higher-level abstraction for distributed training and experiment management. While TensorFlow offers more flexibility and a wider range of features, Determined simplifies the process of scaling machine learning workflows and managing experiments across multiple GPUs or machines.

huggingface/transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Pros of Transformers

  • Extensive library of pre-trained models for various NLP tasks
  • Active community with frequent updates and contributions
  • Comprehensive documentation and tutorials

Cons of Transformers

  • Focused primarily on NLP, limiting its use for other ML domains
  • Can be resource-intensive for large models and datasets
  • Steeper learning curve for beginners due to its extensive features

Code Comparison

Transformers:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

Determined:

from determined.experimental import Determined

# fetch the best checkpoint of a finished experiment from the master
checkpoint = Determined(master="http://localhost:8080").get_experiment(1).top_checkpoint()

Transformers provides a more straightforward approach for loading pre-trained models, while Determined focuses on experiment management and distributed training. Transformers is ideal for NLP tasks, whereas Determined offers a broader platform for various ML workflows and scalable training.

wandb/wandb (9,810 stars)

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Pros of wandb

  • More extensive visualization and experiment tracking capabilities
  • Larger community and ecosystem with integrations for many ML frameworks
  • Easier to set up and use for beginners

Cons of wandb

  • Less focus on distributed training and resource management
  • Potentially higher costs for large-scale projects or teams
  • Limited built-in hyperparameter tuning capabilities

Code Comparison

wandb:

import wandb

wandb.init(project="my-project")
wandb.config.hyperparameters = {...}
model.fit(X, y)
wandb.log({"loss": loss, "accuracy": accuracy})

determined:

from determined.experimental import client

# Hyperparameters and resources are declared in the experiment config,
# and the training code itself lives under model_dir.
experiment = client.create_experiment(
    config={
        "name": "my-experiment",
        "hyperparameters": {...},  # abbreviated config
    },
    model_dir=".",
)

The code comparison shows that wandb focuses on logging and tracking, while determined emphasizes experiment configuration and resource management. wandb's API is simpler for basic use cases, while determined provides more control over experiment creation and execution.

README

Determined is an all-in-one deep learning platform, compatible with PyTorch and TensorFlow.

It takes care of:

  • Distributed training for faster results.
  • Hyperparameter tuning for obtaining the best models.
  • Resource management for cutting cloud GPU costs.
  • Experiment tracking for analysis and reproducibility.

How Determined Works

The main components of Determined are the Python library, the command line interface (CLI), and the Web UI.

Python Library

Use the Python library to make your existing PyTorch or TensorFlow code compatible with Determined.

You can do this by organizing your code into one of the class-based APIs:

from determined.pytorch import PyTorchTrial

class YourExperiment(PyTorchTrial):
  def __init__(self, context):
    ...

Or by using just the functions you want, via the Core API:

import determined as det

with det.core.init() as core_context:
    ...
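
For instance, an existing training loop can report metrics straight to the master through the Core API. The sketch below is minimal; train_one_step and validate stand in for your own training and evaluation code:

import determined as det

def train_one_step(step):
    # placeholder for your own training logic; returns a loss value
    return 0.0

def validate():
    # placeholder for your own evaluation logic; returns a metric value
    return 0.0

def main(core_context):
    for step in range(100):
        loss = train_one_step(step)
        # steps_completed indexes the x-axis of the Web UI plots
        core_context.train.report_training_metrics(
            steps_completed=step, metrics={"loss": loss}
        )
    core_context.train.report_validation_metrics(
        steps_completed=100, metrics={"validation_loss": validate()}
    )

if __name__ == "__main__":
    with det.core.init() as core_context:
        main(core_context)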

Command Line Interface (CLI)

You can use the CLI to:

  • Start a Determined cluster locally:
det deploy local cluster-up
  • Launch Determined on cloud services, such as Amazon Web Services (AWS) or Google Cloud Platform (GCP):
det deploy aws up
  • Train your models:
det experiment create gpt.yaml .

Configure everything from distributed training to hyperparameter tuning using YAML files:

resources:
  slots_per_trial: 8
  priority: 1
hyperparameters:
  learning_rate:
    type: double
    minval: .0001
    maxval: 1.0
searcher:
  name: adaptive_asha
  metric: validation_loss
  smaller_is_better: true

Web UI

Use the Web UI to view loss curves, hyperparameter plots, code and configuration snapshots, model registries, cluster utilization, debugging logs, performance profiling reports, and more.

Installation

To install the CLI:

pip install determined

Then use det deploy to start the Determined cluster locally, or on cloud services like AWS and GCP.
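
For a quick local try-out, the whole flow is only a couple of commands. This sketch assumes Docker is available (det deploy local runs the cluster in containers) and that the master keeps its default port:

pip install determined
det deploy local cluster-up
# the master and Web UI are then reachable at http://localhost:8080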

For installation details, visit the cluster deployment guide for your environment.

Examples

Get familiar with Determined by exploring the 30+ examples in the examples folder and the determined-examples repo.

Documentation

Community

If you need help, want to file a bug report, or just want to keep up-to-date with the latest news about Determined, please join the Determined community!

Contributing

Contributor's Guide

License

Apache V2