microsoft/FLAML

A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.

Top Related Projects

  • Optuna: A hyperparameter optimization framework
  • Hyperopt: Distributed Asynchronous Hyperparameter Optimization in Python
  • scikit-optimize: Sequential model-based optimization with a `scipy.optimize` interface
  • Ax: Adaptive Experimentation Platform
  • Ray: A unified framework for scaling AI and Python applications, consisting of a core distributed runtime and a set of AI libraries for accelerating ML workloads
  • MLflow: Open source platform for the machine learning lifecycle

Quick Overview

FLAML (Fast and Lightweight AutoML) is an open-source Python library for automated machine learning and hyperparameter tuning. It is designed to be efficient, lightweight, and easy to use, making it suitable for both large-scale and resource-constrained scenarios.

Pros

  • Fast and efficient AutoML with minimal computational resources
  • Supports a wide range of ML tasks including classification, regression, and time series forecasting
  • Highly customizable and extensible for advanced users
  • Integrates well with popular ML frameworks like scikit-learn and XGBoost

Cons

  • May not always produce the absolute best model compared to more resource-intensive AutoML tools
  • Documentation could be more comprehensive for some advanced features
  • Limited support for deep learning tasks compared to some other AutoML frameworks

Code Examples

  1. Basic AutoML for classification:
from flaml import AutoML
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
automl = AutoML()
automl.fit(X, y, task="classification")
print(automl.model.estimator)
  2. Time series forecasting with FLAML:
from flaml import AutoML
import pandas as pd

# Assuming 'data' is a pandas DataFrame whose first column holds the timestamps
# and which contains a "target" column; 'future' (hypothetical) holds the
# timestamps to forecast. 'period' is the forecast horizon.
automl = AutoML()
automl.fit(dataframe=data, label="target", task="ts_forecast", period=12)
predictions = automl.predict(future)
  3. Custom search space for hyperparameter tuning:
from flaml import AutoML, tune

# custom_hp maps an estimator name to per-hyperparameter search domains
custom_hp = {
    "lgbm": {
        "n_estimators": {"domain": tune.choice([100, 200, 300, 400, 500])},
        "learning_rate": {"domain": tune.loguniform(0.01, 0.1)},
    },
}

automl = AutoML()
automl.fit(X, y, task="classification", estimator_list=["lgbm"], custom_hp=custom_hp)

Getting Started

To get started with FLAML, first install it using pip:

pip install flaml

Then, you can use FLAML for a basic classification task:

from flaml import AutoML
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Initialize and train AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification", time_budget=60)

# Evaluate the model
print(f"Best ML learner: {automl.best_estimator}")
print(f"Best hyperparameter config: {automl.best_config}")
print(f"Best accuracy on validation data: {1 - automl.best_loss}")
print(f"Training duration: {automl.best_config_train_time:.2f} s")

This example trains FLAML on a basic classification task under a 60-second time budget, then reports the best learner, its hyperparameter configuration, its validation accuracy, and its training time.
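The example above reports validation accuracy but never scores the held-out split it creates. A minimal sketch of that last step follows, assuming FLAML and scikit-learn are installed. The helper names `accuracy` and `fit_and_score` are illustrative and not part of FLAML.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_and_score(time_budget=10, seed=0):
    """Train FLAML on iris and return accuracy on the held-out split."""
    # Imported here so that `accuracy` stays usable without FLAML installed.
    from flaml import AutoML
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    automl = AutoML()
    automl.fit(X_train, y_train, task="classification", time_budget=time_budget)
    # predict() uses the best model found during the search.
    return accuracy(y_test, automl.predict(X_test))
```

Calling `fit_and_score()` returns a float between 0 and 1. The exact value depends on the split and on what the search finds within the budget.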

Competitor Comparisons

A hyperparameter optimization framework

Pros of Optuna

  • More mature and widely adopted project with a larger community
  • Supports a broader range of optimization algorithms and techniques
  • Offers advanced visualization tools for analyzing optimization results

Cons of Optuna

  • Can be more complex to set up and use for simple optimization tasks
  • May require more manual configuration for hyperparameter search spaces

Code Comparison

Optuna:

import optuna

def objective(trial):
    x = trial.suggest_float('x', -10, 10)
    return (x - 2) ** 2

study = optuna.create_study()
study.optimize(objective, n_trials=100)

FLAML:

from flaml import AutoML

automl = AutoML()
automl.fit(X_train, y_train, task="classification")

Summary

Optuna is a more established and feature-rich hyperparameter optimization framework, offering a wide range of algorithms and visualization tools. It's well-suited for complex optimization tasks and research purposes. FLAML, on the other hand, focuses on simplicity and efficiency, making it easier to use for quick automated machine learning tasks. FLAML's code is more concise and requires less setup, while Optuna provides more flexibility and control over the optimization process.
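The two snippets above solve different problems: Optuna minimizes a function, while `AutoML.fit` searches over models. For a closer comparison, the same quadratic can be handed to FLAML's generic tuner, `flaml.tune`. This is a sketch that assumes FLAML is installed; `tune_quadratic` is an illustrative name.

```python
def objective(config):
    """Same quadratic as the Optuna example; the sampled config arrives as a dict."""
    x = config["x"]
    return {"loss": (x - 2) ** 2}


def tune_quadratic(num_samples=100):
    """Minimize the quadratic with flaml.tune and return the best config found."""
    from flaml import tune  # lazy import: only needed when the search is run

    analysis = tune.run(
        objective,
        config={"x": tune.uniform(-10, 10)},  # search space for x
        metric="loss",
        mode="min",
        num_samples=num_samples,  # number of trials, like n_trials in Optuna
    )
    return analysis.best_config
```

`tune_quadratic()` should return a dict such as `{"x": ...}` with a value near 2, although the exact result varies between runs.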

Distributed Asynchronous Hyperparameter Optimization in Python

Pros of Hyperopt

  • More mature and widely adopted project with a larger community
  • Supports a broader range of optimization algorithms (e.g., Tree of Parzen Estimators, Adaptive TPE)
  • Flexible and can be used for various optimization tasks beyond hyperparameter tuning

Cons of Hyperopt

  • Steeper learning curve and more complex API
  • Less focus on AutoML and automated feature engineering
  • May require more manual configuration for optimal performance

Code Comparison

Hyperopt:

from hyperopt import fmin, tpe, hp

space = {
    'x': hp.uniform('x', -5, 5),
    'y': hp.uniform('y', -5, 5),
}

def objective(params):
    x, y = params['x'], params['y']
    return x**2 + y**2

best = fmin(objective, space, algo=tpe.suggest, max_evals=100)

FLAML:

from flaml import AutoML

automl = AutoML()
automl.fit(X_train, y_train, task="classification")
predictions = automl.predict(X_test)

FLAML offers a more streamlined API for AutoML tasks, while Hyperopt provides greater flexibility for custom optimization problems. FLAML is better suited for quick, automated machine learning workflows, whereas Hyperopt excels in scenarios requiring fine-grained control over the optimization process.

Sequential model-based optimization with a `scipy.optimize` interface

Pros of scikit-optimize

  • More established and mature project with a larger community
  • Broader range of optimization algorithms and techniques
  • Extensive documentation and examples

Cons of scikit-optimize

  • Less focus on automated machine learning (AutoML) tasks
  • May require more manual configuration for hyperparameter tuning
  • Slower development pace compared to FLAML

Code Comparison

FLAML:

from flaml import AutoML

automl = AutoML()
automl.fit(X_train, y_train, task="classification")
predictions = automl.predict(X_test)

scikit-optimize:

from skopt import BayesSearchCV
from sklearn.svm import SVC

opt = BayesSearchCV(SVC(), {'C': (1e-6, 1e+6, 'log-uniform')})
opt.fit(X_train, y_train)
predictions = opt.predict(X_test)

FLAML is designed for simplicity and automated machine learning, while scikit-optimize offers more flexibility for various optimization tasks. FLAML's AutoML approach requires less manual configuration, whereas scikit-optimize allows for more fine-grained control over the optimization process. Both libraries have their strengths, with FLAML excelling in ease of use for AutoML tasks and scikit-optimize providing a broader range of optimization techniques for various applications.

Adaptive Experimentation Platform

Pros of Ax

  • More comprehensive Bayesian optimization framework with advanced features like multi-objective optimization and multi-fidelity optimization
  • Stronger focus on scientific and industrial applications, with built-in support for A/B testing and experimentation
  • Better integration with PyTorch, making it suitable for deep learning hyperparameter tuning

Cons of Ax

  • Steeper learning curve due to its more complex architecture and extensive features
  • Less focus on automated machine learning (AutoML) compared to FLAML
  • Requires more setup and configuration for simple optimization tasks

Code Comparison

FLAML:

from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification")

Ax:

from ax import optimize
best_parameters, values, experiment, model = optimize(
    parameters=[
        {"name": "x1", "type": "range", "bounds": [-10.0, 10.0]},
        {"name": "x2", "type": "range", "bounds": [-10.0, 10.0]},
    ],
    evaluation_function=evaluation_function,
    objective_name="objective",
)

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Pros of Ray

  • Comprehensive distributed computing framework with support for various ML tasks
  • Highly scalable, designed for large-scale distributed applications
  • Rich ecosystem with libraries for reinforcement learning, hyperparameter tuning, and more

Cons of Ray

  • Steeper learning curve due to its extensive feature set
  • Higher overhead for simple tasks or small-scale projects
  • More complex setup and configuration compared to FLAML

Code Comparison

Ray example:

import ray

@ray.remote
def f(x):
    return x * x

futures = [f.remote(i) for i in range(4)]
print(ray.get(futures))

FLAML example:

from flaml import AutoML

automl = AutoML()
automl.fit(X_train, y_train, task="classification")
predictions = automl.predict(X_test)

Ray offers a more general-purpose distributed computing framework, while FLAML focuses specifically on automated machine learning. Ray's code example demonstrates its distributed nature, whereas FLAML's example showcases its simplicity for AutoML tasks.

Open source platform for the machine learning lifecycle

Pros of MLflow

  • Comprehensive experiment tracking and model management
  • Supports multiple ML frameworks and languages
  • Robust deployment and serving capabilities

Cons of MLflow

  • Steeper learning curve for beginners
  • Can be overkill for small projects or simple workflows
  • Requires more setup and infrastructure

Code Comparison

MLflow:

import mlflow

mlflow.start_run()
mlflow.log_param("param1", value1)
mlflow.log_metric("metric1", value2)
mlflow.end_run()

FLAML:

from flaml import AutoML

automl = AutoML()
automl.fit(X_train, y_train, task="classification")

Key Differences

  • MLflow focuses on experiment tracking and model management across the entire ML lifecycle
  • FLAML specializes in automated machine learning and hyperparameter optimization
  • MLflow is more versatile and supports various ML frameworks, while FLAML is primarily for AutoML tasks
  • FLAML is easier to use for quick AutoML experiments, while MLflow requires more setup but offers broader functionality

Use Cases

  • Choose MLflow for comprehensive ML project management and deployment
  • Opt for FLAML when rapid AutoML and hyperparameter tuning are the primary goals
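The two tools can also be used together. The sketch below runs a FLAML search inside an MLflow run and logs the winning configuration by hand. It assumes both packages are installed; `summarize_run` and `fit_and_log` are illustrative names, and the `hp.` prefix is an arbitrary choice. Depending on the installed version, FLAML may also log to an active MLflow run on its own.

```python
def summarize_run(best_estimator, best_config, best_loss):
    """Split FLAML's search result into MLflow-style params and metrics."""
    params = {"best_estimator": best_estimator}
    # Prefix hyperparameters so they cannot collide with other logged params.
    params.update({f"hp.{name}": value for name, value in best_config.items()})
    metrics = {"best_validation_loss": float(best_loss)}
    return params, metrics


def fit_and_log(X_train, y_train, time_budget=60):
    """Run AutoML inside an MLflow run and log the winning configuration."""
    import mlflow
    from flaml import AutoML

    automl = AutoML()
    with mlflow.start_run():
        automl.fit(X_train, y_train, task="classification", time_budget=time_budget)
        params, metrics = summarize_run(
            automl.best_estimator, automl.best_config, automl.best_loss
        )
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)
    return automl
```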

README

A Fast Library for Automated Machine Learning & Tuning


:fire: FLAML supports AutoML and Hyperparameter Tuning in Microsoft Fabric Data Science. In addition, we've introduced Python 3.11 support, along with a range of new estimators, and comprehensive integration with MLflow—thanks to contributions from the Microsoft Fabric product team.

:fire: Heads-up: We have migrated AutoGen into a dedicated github repository. Alongside this move, we have also launched a dedicated Discord server and a website for comprehensive documentation.

:fire: The automated multi-agent chat framework in AutoGen is in preview from v2.0.0.

:fire: FLAML is highlighted in OpenAI's cookbook.

:fire: autogen is released with support for ChatGPT and GPT-4, based on Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference.

What is FLAML

FLAML is a lightweight Python library for efficient automation of machine learning and AI operations. It automates workflows based on large language models, machine learning models, etc. and optimizes their performance.

  • FLAML enables building next-gen GPT-X applications based on multi-agent conversations with minimal effort. It simplifies the orchestration, automation and optimization of a complex GPT-X workflow. It maximizes the performance of GPT-X models and compensates for their weaknesses.
  • For common machine learning tasks like classification and regression, it quickly finds quality models for user-provided data with low computational resources. It is easy to customize or extend, and users can choose from a smooth range of customization levels.
  • It supports fast and economical automatic tuning (e.g., inference hyperparameters for foundation models, configurations in MLOps/LMOps workflows, pipelines, mathematical/statistical models, algorithms, computing experiments, software configurations), capable of handling large search space with heterogeneous evaluation cost and complex constraints/guidance/early stopping.

FLAML is powered by a series of research studies from Microsoft Research and collaborators such as Penn State University, Stevens Institute of Technology, University of Washington, and University of Waterloo.

FLAML has a .NET implementation in ML.NET, an open-source, cross-platform machine learning framework for .NET.

Installation

FLAML requires Python version >= 3.8. It can be installed from pip:

pip install flaml

Minimal dependencies are installed without extra options. You can install extra options based on the feature you need. For example, use the following to install the dependencies needed by the autogen package.

pip install "flaml[autogen]"

Find more options in Installation. Each of the notebook examples may require a specific option to be installed.

Quickstart

  • (New) The autogen package enables next-gen GPT-X applications with a generic multi-agent conversation framework. It offers customizable and conversable agents which integrate LLMs, tools and humans. By automating chat among multiple capable agents, one can easily make them collectively perform tasks autonomously or with human feedback, including tasks that require using tools via code. For example,
from flaml import autogen

assistant = autogen.AssistantAgent("assistant")
user_proxy = autogen.UserProxyAgent("user_proxy")
user_proxy.initiate_chat(
    assistant,
    message="Show me the YTD gain of 10 largest technology companies as of today.",
)
# This initiates an automated chat between the two agents to solve the task

Autogen also helps maximize the utility of expensive LLMs such as ChatGPT and GPT-4. It offers a drop-in replacement for openai.Completion or openai.ChatCompletion with powerful functionalities like tuning, caching, templating, and filtering. For example, you can optimize generations by an LLM with your own tuning data, success metrics and budgets.

# perform tuning
config, analysis = autogen.Completion.tune(
    data=tune_data,
    metric="success",
    mode="max",
    eval_func=eval_func,
    inference_budget=0.05,
    optimization_budget=3,
    num_samples=-1,
)
# perform inference for a test instance
response = autogen.Completion.create(context=test_instance, **config)
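The tuning call above relies on an `eval_func` that is not shown. As a rough sketch, such a function receives the list of generated responses for one tuning instance plus that instance's fields as keyword arguments, and returns a dict of metrics. The `"answer"` field and the substring-matching rule are assumptions made for illustration; the `"success"` key matches the `metric="success"` argument.

```python
def eval_func(responses, **data):
    """Score one tuning instance: did any response contain the expected answer?

    `responses` is the list of generated strings for this instance; `data`
    holds the instance's fields. The "answer" field is a hypothetical name.
    """
    expected = str(data["answer"]).strip().lower()
    hits = [expected in response.strip().lower() for response in responses]
    return {
        "success": any(hits),  # the metric maximized by the tuning call
        "success_rate": sum(hits) / len(responses) if responses else 0.0,
    }
```

Because the function is plain Python, it can be checked on hand-written responses before spending any inference budget.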
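  • With three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.
from flaml import AutoML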

automl = AutoML()
automl.fit(X_train, y_train, task="classification")
  • You can restrict the learners and use FLAML as a fast hyperparameter tuning tool for XGBoost, LightGBM, Random Forest etc. or a customized learner.
automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])
  • You can also run generic hyperparameter tuning for a custom function.
from flaml import tune
tune.run(evaluation_function, config={…}, low_cost_partial_config={…}, time_budget_s=3600)
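The call above leaves the search space elided. One way the pieces might fit together is sketched below, with a toy objective and two hypothetical hyperparameters, `width` and `rate`. The idea is that `low_cost_partial_config` points the search at a cheap starting value for the parameter that drives evaluation cost. The names, ranges, and objective are assumptions for illustration only.

```python
def evaluation_function(config):
    """Toy objective: loss falls as `width` grows and is lowest at rate = 0.1."""
    width, rate = config["width"], config["rate"]
    return {"loss": (rate - 0.1) ** 2 + 1.0 / width}


def run_search(time_budget_s=10):
    """Tune the toy objective within a time budget; return best config and result."""
    from flaml import tune  # lazy import: only needed when the search is run

    analysis = tune.run(
        evaluation_function,
        config={
            "width": tune.lograndint(lower=4, upper=4096),
            "rate": tune.loguniform(lower=1e-3, upper=1.0),
        },
        # Start from the cheapest value of the cost-related hyperparameter.
        low_cost_partial_config={"width": 4},
        metric="loss",
        mode="min",
        time_budget_s=time_budget_s,
        num_samples=-1,  # no trial limit; stop when the time budget runs out
    )
    return analysis.best_config, analysis.best_trial.last_result
```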
  • Zero-shot AutoML allows using the existing training API from lightgbm, xgboost etc. while getting the benefit of AutoML in choosing high-performance hyperparameter configurations per task.
from flaml.default import LGBMRegressor

# Use LGBMRegressor in the same way as you use lightgbm.LGBMRegressor.
estimator = LGBMRegressor()
# The hyperparameters are automatically set according to the training data.
estimator.fit(X_train, y_train)

Documentation

You can find detailed documentation about FLAML here.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

If you are new to GitHub, here is a detailed help source on getting involved with development on GitHub.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.