darts
A python library for user-friendly forecasting and anomaly detection on time series.
Top Related Projects
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
A unified framework for machine learning with time series
A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
Open source time series library for Python
Scalable and user friendly neural :brain: forecasting algorithms.
Quick Overview
The darts repository is a Python library that provides a set of tools for working with time series data, including forecasting, anomaly detection, and time series analysis. The library is designed to be easy to use and to integrate into existing data pipelines.
Pros
- Comprehensive Functionality: The darts library offers a wide range of features for time series analysis, including forecasting, anomaly detection, and time series decomposition.
- Ease of Use: The library has a user-friendly API and provides clear documentation, making it easy for developers to get started with time series analysis.
- Flexibility: The darts library supports a variety of time series data formats and can be used with different machine learning models and algorithms.
- Active Development: The project is actively maintained and regularly updated with new features and bug fixes.
Cons
- Limited Community: Compared to some other time series analysis libraries, the darts library has a relatively small community of users and contributors.
- Performance Limitations: Depending on the size and complexity of the time series data, the library may not be as performant as some other specialized time series analysis tools.
- Limited Customization: While the library provides a lot of functionality out of the box, there may be some use cases where more advanced customization or configuration is required.
- Dependency on Other Libraries: The darts library relies on several other Python libraries, such as pandas and scikit-learn, which may introduce additional complexity or dependencies.
Code Examples
Here are a few examples of how to use the darts library:
- Forecasting Time Series Data:
from darts import TimeSeries
from darts.models import ExponentialSmoothing
# Load time series data
series = TimeSeries.from_csv('data.csv', time_col='date', value_cols=['value'])
# Train an Exponential Smoothing model
model = ExponentialSmoothing()
model.fit(series)
# Make a forecast
future = model.predict(10)
- Anomaly Detection:
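from darts import TimeSeries
from darts.ad import KMeansScorer, QuantileDetector
# Load time series data
series = TimeSeries.from_csv('data.csv', time_col='date', value_cols=['value'])
# Score each window of 5 points by its distance to the nearest k-means centroid
scorer = KMeansScorer(k=2, window=5)
scorer.fit(series)
anomaly_scores = scorer.score(series)
# Flag scores above the 99th percentile as anomalies
detector = QuantileDetector(high_quantile=0.99)
detector.fit(anomaly_scores)
anomalies = detector.detect(anomaly_scores)
# Print the detected anomalies (binary series: 1 = anomaly)
print(anomalies)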
- Time Series Decomposition:
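from darts import TimeSeries
from darts.utils.statistics import extract_trend_and_seasonality
from darts.utils.utils import ModelMode
# Load time series data
series = TimeSeries.from_csv('data.csv', time_col='date', value_cols=['value'])
# Decompose the time series additively
# (freq=12 assumes monthly data with a yearly seasonal period)
trend, seasonal = extract_trend_and_seasonality(series, freq=12, model=ModelMode.ADDITIVE)
# The residual is what remains after removing trend and seasonality
residual = series - trend - seasonal
# Print the decomposed components
print("Trend:", trend)
print("Seasonal:", seasonal)
print("Residual:", residual)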
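To make the forecasting example above less of a black box, here is a minimal pure-Python sketch of simple exponential smoothing (level only, no trend or seasonality). It illustrates the idea rather than reproducing Darts' implementation; the function name `ses_forecast` and the smoothing factor `alpha` are assumptions made for this sketch.

```python
from typing import List


def ses_forecast(values: List[float], alpha: float, n: int) -> List[float]:
    """Simple exponential smoothing: flat n-step forecast from the final level."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]  # initialise the level with the first observation
    for y in values[1:]:
        # New level: weighted average of the new observation and the old level
        level = alpha * y + (1.0 - alpha) * level
    # Without trend or seasonality, every future step equals the final level
    return [level] * n


history = [10.0, 12.0, 11.0, 13.0]
forecast = ses_forecast(history, alpha=0.5, n=3)
print(forecast)
```

A larger `alpha` makes the level react faster to recent observations; a smaller one averages over a longer history.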
Getting Started
To get started with the darts library, follow these steps:
- Install the library using pip:
pip install darts
- Import the necessary modules and create a TimeSeries object from your data:
from darts import TimeSeries
from darts.datasets import AirPassengersDataset
# Load example time series data (monthly airline passengers)
series = AirPassengersDataset().load()
- Explore the available models and techniques for time series analysis:
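from darts.ad import KMeansScorer, QuantileDetector
from darts.models import ExponentialSmoothing
from darts.utils.statistics import extract_trend_and_seasonality
# Train and use a forecasting model
model = ExponentialSmoothing()
model.fit(series)
forecast = model.predict(10)
# Detect anomalies in the time series: score windows, then threshold the scores
scorer = KMeansScorer(k=2, window=5)
scorer.fit(series)
scores = scorer.score(series)
detector = QuantileDetector(high_quantile=0.99)
detector.fit(scores)
anomalies = detector.detect(scores)
# Decompose the time series into trend and seasonal components
# (freq=12: yearly seasonality on monthly data)
trend, seasonal = extract_trend_and_seasonality(series, freq=12)
- Check the documentation for more advanced usage.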
Competitor Comparisons
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
Pros of Prophet
- More established and widely adopted in industry
- Handles holidays and seasonal effects automatically
- Robust to missing data and outliers
Cons of Prophet
- Less flexible for custom model architectures
- Primarily focused on univariate time series
- Can be slower for large datasets
Code Comparison
Prophet:
from prophet import Prophet  # the package was renamed from fbprophet in version 1.0
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)
Darts:
from darts.models import Prophet
model = Prophet()
model.fit(series)
forecast = model.predict(n=365)
Key Differences
- Darts provides a unified interface for multiple forecasting models, including Prophet
- Prophet focuses on additive models, while Darts supports a wider range of algorithms
- Darts offers more advanced features like multivariate forecasting and probabilistic predictions
Use Cases
Prophet:
- Business forecasting with clear seasonality and trends
- Handling irregular time series with missing data
Darts:
- Complex time series requiring ensemble methods
- Experimenting with multiple forecasting algorithms
- Projects needing both classical and machine learning approaches
A unified framework for machine learning with time series
Pros of sktime
- More comprehensive, covering a wider range of time series tasks beyond forecasting
- Larger and more active community, with more frequent updates and contributions
- Stronger integration with scikit-learn, allowing easier use of machine learning models
Cons of sktime
- Steeper learning curve due to its more complex architecture
- Less focused on deep learning models compared to Darts
- May be overkill for simple forecasting tasks
Code Comparison
sktime:
from sktime.forecasting.arima import ARIMA
from sktime.datasets import load_airline
y = load_airline()
forecaster = ARIMA(order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])
Darts:
from darts import TimeSeries
from darts.models import ARIMA
series = TimeSeries.from_csv("airline.csv", time_col="Month", value_cols="Passengers")
model = ARIMA(p=1, d=1, q=1, seasonal_order=(1, 1, 1, 12))
model.fit(series)
forecast = model.predict(n=3)
Both libraries offer similar functionality for time series forecasting, but sktime provides a broader scope for time series analysis tasks, while Darts focuses more on forecasting and offers easier integration with deep learning models.
A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
Pros of pmdarima
- Specialized in ARIMA modeling with automatic parameter selection
- Includes advanced features like seasonality tests and unit root tests
- Well-documented with extensive examples and tutorials
Cons of pmdarima
- Limited to ARIMA-based models, less versatile than Darts
- Lacks built-in support for multivariate time series analysis
- Fewer options for handling missing data compared to Darts
Code Comparison
pmdarima:
from pmdarima import auto_arima
model = auto_arima(y, seasonal=True, m=12)
forecasts = model.predict(n_periods=24)
Darts:
from darts.models import AutoARIMA
model = AutoARIMA()
model.fit(series)
forecasts = model.predict(24)
Both libraries offer auto ARIMA functionality, but Darts provides a more consistent API across different model types. pmdarima focuses on ARIMA-specific features, while Darts offers a broader range of time series models and utilities.
Open source time series library for Python
Pros of PyFlux
- Broader range of time series models, including ARIMA, GARCH, and state space models
- More established project with a longer history and potentially more stability
- Includes Bayesian inference capabilities for some models
Cons of PyFlux
- Less active development and fewer recent updates compared to Darts
- More limited forecasting functionality and fewer modern machine learning integrations
- Smaller community and potentially less support for ongoing issues
Code Comparison
PyFlux example:
from pyflux import ARIMA
model = ARIMA(data=df, ar=1, ma=1, target='y')
model.fit()
model.plot_fit()
Darts example:
from darts import TimeSeries
from darts.models import ARIMA
series = TimeSeries.from_dataframe(df, 'date', 'y')
model = ARIMA()
model.fit(series)
forecast = model.predict(n=5)
Both libraries offer ARIMA modeling, but Darts provides a more streamlined API for forecasting tasks. PyFlux offers more detailed model specification, while Darts focuses on ease of use and integration with other forecasting techniques.
Scalable and user friendly neural :brain: forecasting algorithms.
Pros of neuralforecast
- Focuses specifically on neural network-based forecasting models
- Includes advanced models like Temporal Fusion Transformers and N-BEATS
- Provides built-in hyperparameter tuning capabilities
Cons of neuralforecast
- More limited in scope, primarily for neural network forecasting
- Less extensive documentation and community support
- Fewer traditional statistical models compared to darts
Code Comparison
neuralforecast:
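from neuralforecast import NeuralForecast
from neuralforecast.models import NBEATS
from neuralforecast.losses.pytorch import MAE
# freq="D" assumes daily data; df must have unique_id, ds and y columns
model = NeuralForecast(models=[NBEATS(input_size=7, h=1, loss=MAE())], freq="D")
model.fit(df)
forecast = model.predict()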
darts:
from darts import TimeSeries
from darts.models import NBEATSModel
series = TimeSeries.from_dataframe(df)
model = NBEATSModel(input_chunk_length=7, output_chunk_length=1)
model.fit(series)
forecast = model.predict(n=1)
Both libraries offer similar functionality for neural network-based forecasting, but darts provides a broader range of models and features. neuralforecast specializes in neural network approaches, while darts offers a more comprehensive toolkit for time series analysis and forecasting.
README
Time Series Made Easy in Python
Darts is a Python library for user-friendly forecasting and anomaly detection
on time series. It contains a variety of models, from classics such as ARIMA to
deep neural networks. The forecasting models can all be used in the same way,
using fit() and predict() functions, similar to scikit-learn.
The library also makes it easy to backtest models,
combine the predictions of several models, and take external data into account.
Darts supports both univariate and multivariate time series and models.
The ML-based models can be trained on potentially large datasets containing multiple time
series, and some of the models offer a rich support for probabilistic forecasting.
Darts also offers extensive anomaly detection capabilities. For instance, it is trivial to apply PyOD models on time series to obtain anomaly scores, or to wrap any of Darts forecasting or filtering models to obtain fully fledged anomaly detection models.
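The shared fit()/predict() convention is what makes combining models straightforward. The following is a minimal pure-Python sketch of that idea, not Darts code: `MeanForecaster`, `DriftForecaster` and `average_ensemble` are hypothetical names, and the toy models operate on plain lists rather than TimeSeries objects.

```python
from statistics import mean
from typing import List, Sequence


class MeanForecaster:
    """Toy model: always predicts the mean of the training values."""

    def fit(self, values: Sequence[float]) -> "MeanForecaster":
        self._mean = mean(values)
        return self

    def predict(self, n: int) -> List[float]:
        return [self._mean] * n


class DriftForecaster:
    """Toy model: extends the straight line from the first to the last value."""

    def fit(self, values: Sequence[float]) -> "DriftForecaster":
        if len(values) < 2:
            raise ValueError("need at least two values")
        self._last = values[-1]
        self._slope = (values[-1] - values[0]) / (len(values) - 1)
        return self

    def predict(self, n: int) -> List[float]:
        return [self._last + self._slope * (i + 1) for i in range(n)]


def average_ensemble(models, values: Sequence[float], n: int) -> List[float]:
    """Fit every model on the same data and average the forecasts step by step."""
    forecasts = [m.fit(values).predict(n) for m in models]
    return [mean(step) for step in zip(*forecasts)]


train = [1.0, 2.0, 3.0, 4.0, 5.0]
combined = average_ensemble([MeanForecaster(), DriftForecaster()], train, n=2)
print(combined)
```

Because both toy models expose the same two methods, the ensemble function never needs to know which model it is calling.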
Documentation
High Level Introductions
Articles on Selected Topics
- Training Models on Multiple Time Series
- Using Past and Future Covariates
- Temporal Convolutional Networks and Forecasting
- Probabilistic Forecasting
- Transfer Learning for Time Series Forecasting
- Hierarchical Forecast Reconciliation
Quick Install
We recommend first setting up a clean Python environment for your project with Python 3.9+ using your favorite tool (conda, venv, virtualenv with or without virtualenvwrapper).
Once your environment is set up you can install darts using pip:
pip install darts
For more details you can refer to our installation instructions.
Example Usage
Forecasting
Create a TimeSeries object from a Pandas DataFrame, and split it in train/validation series:
import pandas as pd
from darts import TimeSeries
# Read a pandas DataFrame
df = pd.read_csv("AirPassengers.csv", delimiter=",")
# Create a TimeSeries, specifying the time and value columns
series = TimeSeries.from_dataframe(df, "Month", "#Passengers")
# Set aside the last 36 months as a validation series
train, val = series[:-36], series[-36:]
Fit an exponential smoothing model, and make a (probabilistic) prediction over the validation series' duration:
from darts.models import ExponentialSmoothing
model = ExponentialSmoothing()
model.fit(train)
prediction = model.predict(len(val), num_samples=1000)
Plot the median, 5th and 95th percentiles:
import matplotlib.pyplot as plt
series.plot()
prediction.plot(label="forecast", low_quantile=0.05, high_quantile=0.95)
plt.legend()
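A probabilistic forecast of this kind is a set of sampled paths, and the plotted band is obtained by taking quantiles across the samples at each time step. The sketch below shows that computation in plain Python on synthetic data; it is not how Darts computes it internally, and `quantile`, `summarize` and the generated samples are assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple


def quantile(sorted_vals: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of already sorted values, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def summarize(samples: List[List[float]], low: float = 0.05,
              high: float = 0.95) -> List[Tuple[float, float, float]]:
    """For each time step, return (low quantile, median, high quantile)."""
    bands = []
    for step_values in zip(*samples):  # one tuple of sampled values per step
        ordered = sorted(step_values)
        bands.append((quantile(ordered, low),
                      quantile(ordered, 0.5),
                      quantile(ordered, high)))
    return bands


# Hypothetical data: 1000 sampled paths over 3 steps, noise growing with horizon
rng = random.Random(0)
samples = [[100.0 + rng.gauss(0.0, 1.0 + step) for step in range(3)]
           for _ in range(1000)]
for low_q, median, high_q in summarize(samples):
    print(f"{low_q:.1f} <= {median:.1f} <= {high_q:.1f}")
```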
Anomaly Detection
Load a multivariate series, trim it, keep 2 components, split train and validation sets:
from darts.datasets import ETTh2Dataset
series = ETTh2Dataset().load()[:10000][["MUFL", "LULL"]]
train, val = series.split_before(0.6)
Build a k-means anomaly scorer, train it on the train set and use it on the validation set to get anomaly scores:
from darts.ad import KMeansScorer
scorer = KMeansScorer(k=2, window=5)
scorer.fit(train)
anom_score = scorer.score(val)
Build a binary anomaly detector and train it over train scores, then use it over validation scores to get binary anomaly classification:
from darts.ad import QuantileDetector
detector = QuantileDetector(high_quantile=0.99)
detector.fit(scorer.score(train))
binary_anom = detector.detect(anom_score)
Plot (shifting and scaling some of the series to make everything appear on the same figure):
import matplotlib.pyplot as plt
series.plot()
(anom_score / 2. - 100).plot(label="computed anomaly score", c="orangered", lw=3)
(binary_anom * 45 - 150).plot(label="detected binary anomaly", lw=4)
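The detection step above boils down to learning a threshold from the training scores and flagging anything above it. Here is a minimal pure-Python sketch of that logic under simplifying assumptions (a single upper quantile, plain lists instead of TimeSeries); `ThresholdDetector` is a hypothetical class, not the Darts QuantileDetector.

```python
from typing import List, Optional, Sequence


class ThresholdDetector:
    """Flags scores above a quantile of the scores seen during fit (toy sketch)."""

    def __init__(self, high_quantile: float) -> None:
        if not 0.0 <= high_quantile <= 1.0:
            raise ValueError("high_quantile must be in [0, 1]")
        self.high_quantile = high_quantile
        self.threshold: Optional[float] = None

    def fit(self, scores: Sequence[float]) -> "ThresholdDetector":
        if not scores:
            raise ValueError("scores must not be empty")
        ordered = sorted(scores)
        # Linear interpolation between the two nearest order statistics
        pos = self.high_quantile * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        self.threshold = ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)
        return self

    def detect(self, scores: Sequence[float]) -> List[int]:
        if self.threshold is None:
            raise RuntimeError("call fit() before detect()")
        return [1 if s > self.threshold else 0 for s in scores]


train_scores = [float(i) for i in range(11)]  # 0.0, 1.0, ..., 10.0
detector = ThresholdDetector(high_quantile=0.9).fit(train_scores)
flags = detector.detect([5.0, 9.5, 12.0])
print(detector.threshold, flags)
```

Fitting the threshold on training scores only, as in the example above, keeps the validation data from influencing what counts as anomalous.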
Features
- Forecasting Models: A large collection of forecasting models; from statistical models (such as ARIMA) to deep learning models (such as N-BEATS). See table of models below.
- Anomaly Detection: The darts.ad module contains a collection of anomaly scorers, detectors and aggregators, which can all be combined to detect anomalies in time series. It is easy to wrap any of Darts forecasting or filtering models to build a fully fledged anomaly detection model that compares predictions with actuals. The PyODScorer makes it trivial to use PyOD detectors on time series.
- Multivariate Support: TimeSeries can be multivariate - i.e., contain multiple time-varying dimensions/columns instead of a single scalar value. Many models can consume and produce multivariate series.
- Multiple series training (global models): All machine learning based models (incl. all neural networks) support being trained on multiple (potentially multivariate) series. This can scale to large datasets too.
- Probabilistic Support: TimeSeries objects can (optionally) represent stochastic time series; this can for instance be used to get confidence intervals, and many models support different flavours of probabilistic forecasting (such as estimating parametric distributions or quantiles). Some anomaly detection scorers are also able to exploit these predictive distributions.
- Past and Future Covariates support: Many models in Darts support past-observed and/or future-known covariate (external data) time series as inputs for producing forecasts.
- Static Covariates support: In addition to time-dependent data, TimeSeries can also contain static data for each dimension, which can be exploited by some models.
- Hierarchical Reconciliation: Darts offers transformers to perform reconciliation. These can make the forecasts add up in a way that respects the underlying hierarchy.
- Regression Models: It is possible to plug-in any scikit-learn compatible model to obtain forecasts as functions of lagged values of the target series and covariates.
- Training with sample weights: All global models support being trained with sample weights. They can be applied to each observation, forecasted time step and target column.
- Forecast Start Shifting: All global models support training and prediction on a shifted output window. This is useful for example for Day-Ahead Market forecasts, or when the covariates (or target series) are reported with a delay.
- Explainability: Darts has the ability to explain some forecasting models using Shap values.
- Data processing: Tools to easily apply (and revert) common transformations on time series data (scaling, filling missing values, differencing, boxcox, ...)
- Metrics: A variety of metrics for evaluating time series' goodness of fit; from R2-scores to Mean Absolute Scaled Error.
- Backtesting: Utilities for simulating historical forecasts, using moving time windows.
- PyTorch Lightning Support: All deep learning models are implemented using PyTorch Lightning, supporting among other things custom callbacks, GPUs/TPUs training and custom trainers.
- Filtering Models: Darts offers three filtering models: KalmanFilter, GaussianProcessFilter, and MovingAverageFilter, which allow filtering time series, and in some cases obtaining probabilistic inferences of the underlying states/values.
- Datasets: The darts.datasets submodule contains some popular time series datasets for rapid and reproducible experimentation.
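The backtesting feature listed above can be pictured as repeatedly cutting the series, forecasting from the part before the cut, and comparing with what actually happened. Below is a minimal pure-Python sketch of an expanding-window backtest; it is an illustration, not the Darts utility, and `naive_seasonal`, `backtest` and their parameters are hypothetical names chosen for this example.

```python
from typing import Callable, List, Sequence, Tuple

Forecaster = Callable[[Sequence[float], int], List[float]]


def naive_seasonal(history: Sequence[float], horizon: int,
                   period: int = 4) -> List[float]:
    """Toy model: repeat the values observed one season (`period` steps) earlier."""
    start = len(history) - period
    return [history[start + (i % period)] for i in range(horizon)]


def backtest(series: Sequence[float], model: Forecaster, start: int,
             horizon: int = 1, stride: int = 1) -> Tuple[List[float], float]:
    """Expanding-window backtest.

    At each cut-off the model only sees series[:cut], forecasts `horizon`
    steps, and the last forecasted point is compared with the actual value.
    Returns the kept predictions and their mean absolute error.
    """
    preds, errors = [], []
    for cut in range(start, len(series) - horizon + 1, stride):
        forecast = model(series[:cut], horizon)  # only past data is visible
        actual = series[cut + horizon - 1]
        preds.append(forecast[-1])
        errors.append(abs(forecast[-1] - actual))
    if not errors:
        raise ValueError("series too short for the requested start/horizon")
    return preds, sum(errors) / len(errors)


# Seasonal pattern of period 4 that rises by 1 each season
series = [1.0, 2.0, 3.0, 4.0, 2.0, 3.0, 4.0, 5.0, 3.0, 4.0, 5.0, 6.0]
preds, mae = backtest(series, naive_seasonal, start=4)
print(preds, mae)
```

Since the toy model ignores the upward drift between seasons, it is off by exactly one unit at every cut-off, which the error summary makes visible.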
Forecasting Models
Here's a breakdown of the forecasting models currently implemented in Darts. We are constantly working on bringing more models and features.
Model | Sources | Target Series Support: Univariate/ Multivariate | Covariates Support: Past-observed/ Future-known/ Static | Probabilistic Forecasting: Sampled/ Distribution Parameters | Training & Forecasting on Multiple Series |
---|---|---|---|---|---|
Baseline Models (LocalForecastingModel) | | | | | |
NaiveMean | | ✅ ✅ | 🔴 🔴 🔴 | 🔴 🔴 | 🔴 |
NaiveSeasonal | | ✅ ✅ | 🔴 🔴 🔴 | 🔴 🔴 | 🔴 |
NaiveDrift | | ✅ ✅ | 🔴 🔴 🔴 | 🔴 🔴 | 🔴 |
NaiveMovingAverage | | ✅ ✅ | 🔴 🔴 🔴 | 🔴 🔴 | 🔴 |
Statistical / Classic Models (LocalForecastingModel) | | | | | |
ARIMA | | ✅ 🔴 | 🔴 ✅ 🔴 | ✅ 🔴 | 🔴 |
VARIMA | | 🔴 ✅ | 🔴 ✅ 🔴 | ✅ 🔴 | 🔴 |
AutoARIMA | | ✅ 🔴 | 🔴 ✅ 🔴 | 🔴 🔴 | 🔴 |
StatsForecastAutoArima (faster AutoARIMA) | Nixtla's statsforecast | ✅ 🔴 | 🔴 ✅ 🔴 | ✅ 🔴 | 🔴 |
ExponentialSmoothing | | ✅ 🔴 | 🔴 🔴 🔴 | ✅ 🔴 | 🔴 |
StatsforecastAutoETS | Nixtla's statsforecast | ✅ 🔴 | 🔴 ✅ 🔴 | ✅ 🔴 | 🔴 |
StatsforecastAutoCES | Nixtla's statsforecast | ✅ 🔴 | 🔴 🔴 🔴 | 🔴 🔴 | 🔴 |
BATS and TBATS | TBATS paper | ✅ 🔴 | 🔴 🔴 🔴 | ✅ 🔴 | 🔴 |
Theta and FourTheta | Theta & 4 Theta | ✅ 🔴 | 🔴 🔴 🔴 | 🔴 🔴 | 🔴 |
StatsForecastAutoTheta | Nixtla's statsforecast | ✅ 🔴 | 🔴 🔴 🔴 | ✅ 🔴 | 🔴 |
Prophet | Prophet repo | ✅ 🔴 | 🔴 ✅ 🔴 | ✅ 🔴 | 🔴 |
FFT (Fast Fourier Transform) | | ✅ 🔴 | 🔴 🔴 🔴 | 🔴 🔴 | 🔴 |
KalmanForecaster using the Kalman filter and N4SID for system identification | N4SID paper | ✅ ✅ | 🔴 ✅ 🔴 | ✅ 🔴 | 🔴 |
Croston method | | ✅ 🔴 | 🔴 🔴 🔴 | 🔴 🔴 | 🔴 |
Global Baseline Models (GlobalForecastingModel) | | | | | |
GlobalNaiveAggregate | | ✅ ✅ | 🔴 🔴 🔴 | 🔴 🔴 | ✅ |
GlobalNaiveDrift | | ✅ ✅ | 🔴 🔴 🔴 | 🔴 🔴 | ✅ |
GlobalNaiveSeasonal | | ✅ ✅ | 🔴 🔴 🔴 | 🔴 🔴 | ✅ |
Regression Models (GlobalForecastingModel) | | | | | |
RegressionModel: generic wrapper around any sklearn regression model | | ✅ ✅ | ✅ ✅ ✅ | 🔴 🔴 | ✅ |
LinearRegressionModel | | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
RandomForest | | ✅ ✅ | ✅ ✅ ✅ | 🔴 🔴 | ✅ |
LightGBMModel | | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
XGBModel | | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
CatBoostModel | | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
PyTorch (Lightning)-based Models (GlobalForecastingModel) | | | | | |
RNNModel (incl. LSTM and GRU); equivalent to DeepAR in its probabilistic version | DeepAR paper | ✅ ✅ | 🔴 ✅ 🔴 | ✅ ✅ | ✅ |
BlockRNNModel (incl. LSTM and GRU) | | ✅ ✅ | ✅ 🔴 🔴 | ✅ ✅ | ✅ |
NBEATSModel | N-BEATS paper | ✅ ✅ | ✅ 🔴 🔴 | ✅ ✅ | ✅ |
NHiTSModel | N-HiTS paper | ✅ ✅ | ✅ 🔴 🔴 | ✅ ✅ | ✅ |
TCNModel | TCN paper, DeepTCN paper, blog post | ✅ ✅ | ✅ 🔴 🔴 | ✅ ✅ | ✅ |
TransformerModel | | ✅ ✅ | ✅ 🔴 🔴 | ✅ ✅ | ✅ |
TFTModel (Temporal Fusion Transformer) | TFT paper, PyTorch Forecasting | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
DLinearModel | DLinear paper | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
NLinearModel | NLinear paper | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
TiDEModel | TiDE paper | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
TSMixerModel | TSMixer paper, PyTorch Implementation | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
Ensemble Models (GlobalForecastingModel): Model support is dependent on ensembled forecasting models and the ensemble model itself | | | | | |
NaiveEnsembleModel | | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
RegressionEnsembleModel | | ✅ ✅ | ✅ ✅ ✅ | ✅ ✅ | ✅ |
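The regression models in the table produce forecasts as functions of lagged values. The core preparation step is turning a series into a feature table, which the following pure-Python sketch illustrates; it is a simplified assumption of how such tabularization can work, not Darts' internal code, and `make_lagged_table` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple


def make_lagged_table(target: Sequence[float],
                      lags: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a series into (X, y): each row of X holds the `lags` previous values."""
    if lags < 1 or lags >= len(target):
        raise ValueError("lags must be between 1 and len(target) - 1")
    X, y = [], []
    for t in range(lags, len(target)):
        X.append(list(target[t - lags:t]))  # values at t-lags ... t-1
        y.append(target[t])                 # value to predict at time t
    return X, y


X, y = make_lagged_table([10.0, 11.0, 12.0, 13.0, 14.0], lags=2)
for features, label in zip(X, y):
    print(features, "->", label)
```

Once the data is in this shape, any tabular regressor with a fit/predict interface can be trained on `X` and `y`.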
Community & Contact
Anyone is welcome to join our Gitter room to ask questions, make proposals, discuss use-cases, and more. If you spot a bug or have suggestions, GitHub issues are also welcome.
If what you want to tell us is not suitable for Gitter or GitHub, feel free to send us an email at darts@unit8.co for darts related matters or info@unit8.co for any other inquiries.
Contribute
The development is ongoing, and we welcome suggestions, pull requests and issues on GitHub. All contributors will be acknowledged on the change log page.
Before working on a contribution (a new feature or a fix), check our contribution guidelines.
Citation
If you are using Darts in your scientific work, we would appreciate citations to the following JMLR paper.
Darts: User-Friendly Modern Machine Learning for Time Series
Bibtex entry:
@article{JMLR:v23:21-1177,
author = {Julien Herzen and Francesco Lässig and Samuele Giuliano Piazzetta and Thomas Neuer and Léo Tafti and Guillaume Raille and Tomas Van Pottelbergh and Marek Pasieka and Andrzej Skrodzki and Nicolas Huguenin and Maxime Dumonal and Jan Kościsz and Dennis Bader and Frédérick Gusset and Mounir Benheddi and Camila Williamson and Michal Kosinski and Matej Petrik and Gaël Grosch},
title = {Darts: User-Friendly Modern Machine Learning for Time Series},
journal = {Journal of Machine Learning Research},
year = {2022},
volume = {23},
number = {124},
pages = {1-6},
url = {http://jmlr.org/papers/v23/21-1177.html}
}