Convert Figma logo to code with AI

allegroai logoclearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution

5,558
643
5,558
483

Top Related Projects

18,503

Open source platform for the machine learning lifecycle

9,007

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

5,131

Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

13,582

🦉 ML Experiments and Data Management with Git

11,381

An orchestration platform for the development, production, and observation of data assets.

Quick Overview

ClearML is an open-source MLOps platform designed to streamline and automate machine learning workflows. It provides tools for experiment tracking, model management, data versioning, and orchestration, enabling data scientists and ML engineers to focus on developing models while automating the operational aspects of machine learning projects.

Pros

  • Comprehensive MLOps solution with experiment tracking, model management, and orchestration
  • Easy integration with popular ML frameworks and tools
  • Flexible deployment options (self-hosted or cloud-based)
  • Active community and regular updates

Cons

  • Learning curve for new users due to the wide range of features
  • Some advanced features may require paid plans
  • Documentation can be overwhelming for beginners

Code Examples

  1. Initializing a ClearML Task:
from clearml import Task

task = Task.init(project_name="My Project", task_name="My Experiment")
  1. Logging metrics during training:
from clearml import Logger

logger = Logger.current_logger()
for epoch in range(num_epochs):
    # ... training code ...
    logger.report_scalar(title="Performance", series="Accuracy", value=accuracy, iteration=epoch)
  1. Storing and versioning datasets:
from clearml import Dataset

dataset = Dataset.create(dataset_project="My Datasets", dataset_name="My Training Data")
dataset.add_files("/path/to/data/")
dataset.upload()
dataset.finalize()
  1. Remotely executing a task:
from clearml import Task

task = Task.init(project_name="My Project", task_name="Remote Execution")
task.execute_remotely(queue_name="default")

# Your ML code here

Getting Started

To get started with ClearML:

  1. Install ClearML:
pip install clearml
  1. Initialize ClearML:
clearml-init
  1. In your Python script, import and initialize a Task:
from clearml import Task

task = Task.init(project_name="My First Project", task_name="My First Experiment")

# Your ML code here

task.close()
  1. Run your script as usual. ClearML will automatically track your experiment, including code, parameters, and metrics.

Competitor Comparisons

18,503

Open source platform for the machine learning lifecycle

Pros of MLflow

  • More mature and widely adopted in the industry
  • Extensive documentation and community support
  • Seamless integration with popular ML frameworks and cloud platforms

Cons of MLflow

  • Less comprehensive experiment tracking compared to ClearML
  • Limited built-in visualization capabilities
  • Requires more setup and configuration for advanced features

Code Comparison

MLflow:

import mlflow

mlflow.start_run()
mlflow.log_param("param1", value1)
mlflow.log_metric("metric1", value2)
mlflow.end_run()

ClearML:

from clearml import Task

task = Task.init(project_name="My Project", task_name="My Experiment")
task.connect({"param1": value1})
task.get_logger().report_scalar("metric1", "value", value2)

Both frameworks offer simple ways to log parameters and metrics, but ClearML provides a more streamlined approach with automatic experiment tracking and less boilerplate code. MLflow requires explicit start and end of runs, while ClearML automatically manages experiment lifecycle.

ClearML offers more advanced features out-of-the-box, such as experiment comparison, remote execution, and data versioning. However, MLflow's simplicity and widespread adoption make it a solid choice for many ML projects, especially those already integrated with popular ML ecosystems.

9,007

The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.

Pros of Wandb

  • More extensive visualization capabilities and customizable dashboards
  • Stronger focus on experiment tracking and hyperparameter optimization
  • Larger community and more integrations with popular ML frameworks

Cons of Wandb

  • Paid plans can be more expensive for larger teams or projects
  • Less emphasis on MLOps and pipeline management compared to ClearML

Code Comparison

Wandb:

import wandb

wandb.init(project="my-project")
wandb.config.hyperparameters = {
    "learning_rate": 0.01,
    "epochs": 100
}
wandb.log({"accuracy": 0.9, "loss": 0.1})

ClearML:

from clearml import Task

task = Task.init(project_name="my-project", task_name="my-task")
task.connect({"learning_rate": 0.01, "epochs": 100})
task.logger.report_scalar("accuracy", "train", value=0.9, iteration=1)
task.logger.report_scalar("loss", "train", value=0.1, iteration=1)

Both Wandb and ClearML offer similar functionality for experiment tracking and logging. Wandb's syntax is slightly more concise, while ClearML provides a more structured approach with separate methods for initialization and logging.

5,131

Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

Pros of Aim

  • Lightweight and easy to integrate into existing ML workflows
  • Supports a wide range of ML frameworks and libraries
  • Offers a clean and intuitive UI for experiment tracking and visualization

Cons of Aim

  • Less comprehensive feature set compared to ClearML
  • Limited support for advanced MLOps functionalities like model serving and deployment
  • Smaller community and ecosystem

Code Comparison

Aim:

from aim import Run

run = Run()
run.track(accuracy, name='accuracy', epoch=epoch)
run.track(loss, name='loss', epoch=epoch)

ClearML:

from clearml import Task

task = Task.init(project_name='My Project', task_name='My Experiment')
task.logger.report_scalar('accuracy', 'train', value=accuracy, iteration=epoch)
task.logger.report_scalar('loss', 'train', value=loss, iteration=epoch)

Both libraries offer simple ways to track metrics, but ClearML provides more built-in functionality for experiment management and MLOps tasks. Aim focuses on lightweight tracking and visualization, while ClearML offers a more comprehensive suite of tools for the entire ML lifecycle.

13,582

🦉 ML Experiments and Data Management with Git

Pros of DVC

  • Lightweight and focused on version control for ML projects
  • Strong integration with Git for data and model versioning
  • Supports various storage backends (local, S3, GCS, etc.)

Cons of DVC

  • Less comprehensive MLOps features compared to ClearML
  • Requires more manual setup and configuration
  • Limited built-in experiment tracking capabilities

Code Comparison

DVC:

import dvc.api

with dvc.api.open('data/features.csv') as f:
    # Process the data
    pass

ClearML:

from clearml import Dataset

dataset = Dataset.get(dataset_project='MyProject', dataset_name='features')
dataset_path = dataset.get_local_copy()
# Process the data

Both tools offer ways to version and manage data, but ClearML provides a more integrated approach with its Dataset object, while DVC focuses on file-level versioning using a Git-like syntax.

11,381

An orchestration platform for the development, production, and observation of data assets.

Pros of Dagster

  • More comprehensive data orchestration platform with built-in scheduling and monitoring
  • Stronger focus on data lineage and asset management
  • Larger community and ecosystem with more integrations

Cons of Dagster

  • Steeper learning curve due to more complex architecture
  • Requires more setup and configuration compared to ClearML
  • Less focused on machine learning experiments and model management

Code Comparison

Dagster:

@job
def my_job():
    process_data()
    train_model()

@op
def process_data():
    # Data processing logic

@op
def train_model():
    # Model training logic

ClearML:

from clearml import Task

task = Task.init(project_name='My Project', task_name='My Task')

# Data processing logic
# Model training logic

task.close()

Dagster provides a more structured approach to defining data pipelines, while ClearML offers a simpler way to track experiments and tasks. Dagster's code emphasizes job and operation definitions, whereas ClearML focuses on automatic logging and experiment tracking with minimal code changes.

Convert Figma logo designs to code with AI

Visual Copilot

Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.

Try Visual Copilot

README

Clear|MLClear|ML

ClearML - Auto-Magical Suite of tools to streamline your AI workflow
Experiment Manager, MLOps/LLMOps and Data-Management

GitHub license PyPI pyversions PyPI version shields.io Conda version shields.io Optuna
PyPI Downloads Artifact Hub Youtube Slack Channel Signup

🌟 ClearML is open-source - Leave a star to support the project! 🌟


ClearML

Formerly known as Allegro Trains

ClearML is a ML/DL development and production suite. It contains FIVE main modules:

  • Experiment Manager - Automagical experiment tracking, environments and results
  • MLOps / LLMOps - Orchestration, Automation & Pipelines solution for ML/DL/GenAI jobs (Kubernetes / Cloud / bare-metal)
  • Data-Management - Fully differentiable data management & version control solution on top of object-storage (S3 / GS / Azure / NAS)
  • Model-Serving - cloud-ready Scalable model serving solution!
    • Deploy new model endpoints in under 5 minutes
    • Includes optimized GPU serving support backed by Nvidia-Triton
    • with out-of-the-box Model Monitoring
  • Reports - Create and share rich MarkDown documents supporting embeddable online content
  • :fire: Orchestration Dashboard - Live rich dashboard for your entire compute cluster (Cloud / Kubernetes / On-Prem)
  • NEW 💥 Fractional GPUs - Container based, driver level GPU memory limitation 🙀 !!!

Instrumenting these components is the ClearML-server, see Self-Hosting & Free tier Hosting


Sign up & Start using in under 2 minutes


Friendly tutorials to get you started

Step 1 - Experiment Management Open In Colab
Step 2 - Remote Execution Agent Setup Open In Colab
Step 3 - Remotely Execute Tasks Open In Colab

Experiment Management Datasets
Orchestration Pipelines

ClearML Experiment Manager

Adding only 2 lines to your code gets you the following

  • Complete experiment setup log
    • Full source control info, including non-committed local changes
    • Execution environment (including specific packages & versions)
    • Hyper-parameters
      • argparse/Click/PythonFire for command line parameters with currently used values
      • Explicit parameters dictionary
      • Tensorflow Defines (absl-py)
      • Hydra configuration and overrides
    • Initial model weights file
  • Full experiment output automatic capture
    • stdout and stderr
    • Resource Monitoring (CPU/GPU utilization, temperature, IO, network, etc.)
    • Model snapshots (With optional automatic upload to central storage: Shared folder, S3, GS, Azure, Http)
    • Artifacts log & store (Shared folder, S3, GS, Azure, Http)
    • Tensorboard/TensorboardX scalars, metrics, histograms, images, audio and video samples
    • Matplotlib & Seaborn
    • ClearML Logger interface for complete flexibility.
  • Extensive platform support and integrations

Start using ClearML

  1. Sign up for free to the ClearML Hosted Service (alternatively, you can set up your own server, see here).

    ClearML Demo Server: ClearML no longer uses the demo server by default. To enable the demo server, set the CLEARML_NO_DEFAULT_SERVER=0 environment variable. Credentials aren't needed, but experiments launched to the demo server are public, so make sure not to launch sensitive experiments if using the demo server.

  2. Install the clearml python package:

    pip install clearml
    
  3. Connect the ClearML SDK to the server by creating credentials, then execute the command below and follow the instructions:

    clearml-init
    
  4. Add two lines to your code:

    from clearml import Task
    task = Task.init(project_name='examples', task_name='hello world')
    

And you are done! Everything your process outputs is now automagically logged into ClearML.

Next step, automation! Learn more about ClearML's two-click automation here.

ClearML Architecture

The ClearML run-time components:

  • The ClearML Python Package - for integrating ClearML into your existing scripts by adding just two lines of code, and optionally extending your experiments and other workflows with ClearML's powerful and versatile set of classes and methods.
  • The ClearML Server - for storing experiment, model, and workflow data; supporting the Web UI experiment manager and MLOps automation for reproducibility and tuning. It is available as a hosted service and open source for you to deploy your own ClearML Server.
  • The ClearML Agent - for MLOps orchestration, experiment and workflow reproducibility, and scalability.
clearml-architecture

Additional Modules

  • clearml-session - Launch remote JupyterLab / VSCode-server inside any docker, on Cloud/On-Prem machines
  • clearml-task - Run any codebase on remote machines with full remote logging of Tensorboard, Matplotlib & Console outputs
  • clearml-data - CLI for managing and versioning your datasets, including creating / uploading / downloading of data from S3/GS/Azure/NAS
  • AWS Auto-Scaler - Automatically spin EC2 instances based on your workloads with preconfigured budget! No need for AKE!
  • Hyper-Parameter Optimization - Optimize any code with black-box approach and state-of-the-art Bayesian optimization algorithms
  • Automation Pipeline - Build pipelines based on existing experiments / jobs, supports building pipelines of pipelines!
  • Slack Integration - Report experiments progress / failure directly to Slack (fully customizable!)

Why ClearML?

ClearML is our solution to a problem we share with countless other researchers and developers in the machine learning/deep learning universe: Training production-grade deep learning models is a glorious but messy process. ClearML tracks and controls the process by associating code version control, research projects, performance metrics, and model provenance.

We designed ClearML specifically to require effortless integration so that teams can preserve their existing methods and practices.

  • Use it on a daily basis to boost collaboration and visibility in your team
  • Create a remote job from any experiment with a click of a button
  • Automate processes and create pipelines to collect your experimentation logs, outputs, and data
  • Store all your data on any object-storage solution, with the most straightforward interface possible
  • Make your data transparent by cataloging it all on the ClearML platform

We believe ClearML is ground-breaking. We wish to establish new standards of true seamless integration between experiment management, MLOps, and data management.

Who We Are

ClearML is supported by you and the clear.ml team, which helps enterprise companies build scalable MLOps.

We built ClearML to track and control the glorious but messy process of training production-grade deep learning models. We are committed to vigorously supporting and expanding the capabilities of ClearML.

We promise to always be backwardly compatible, making sure all your logs, data, and pipelines will always upgrade with you.

License

Apache License, Version 2.0 (see the LICENSE for more information)

If ClearML is part of your development process / project / publication, please cite us :heart: :

@misc{clearml,
title = {ClearML - Your entire MLOps stack in one open-source tool},
year = {2024},
note = {Software available from http://github.com/allegroai/clearml},
url={https://clear.ml/},
author = {ClearML},
}

Documentation, Community & Support

For more information, see the official documentation and on YouTube.

For examples and use cases, check the examples folder and corresponding documentation.

If you have any questions: post on our Slack Channel, or tag your questions on stackoverflow with 'clearml' tag (previously trains tag).

For feature requests or bug reports, please use GitHub issues.

Additionally, you can always find us at info@clear.ml

Contributing

PRs are always welcome :heart: See more details in the ClearML Guidelines for Contributing.

May the force (and the goddess of learning rates) be with you!