Nixtla/statsforecast

Lightning ⚡️ fast forecasting with statistical and econometric models.

Top Related Projects

  • Prophet (18,363): Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
  • sktime (7,833): A unified framework for machine learning with time series
  • pmdarima: A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
  • PyFlux (2,113): Open source time series library for Python
  • Orbit (1,864): A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.
  • Greykite: A flexible, intuitive and fast forecasting library

Quick Overview

StatsForecast is a Python library for time series forecasting that focuses on statistical and econometric models. It provides fast and accurate implementations of popular forecasting algorithms, including ARIMA, ETS, and various other statistical methods. The library is designed to be efficient, scalable, and easy to use for both beginners and advanced users.

Pros

  • High performance: Utilizes Numba and NumPy for fast computations, making it suitable for large-scale forecasting tasks
  • Wide range of models: Offers a variety of statistical and econometric forecasting methods
  • Easy integration: Works with pandas DataFrames and follows the familiar scikit-learn .fit and .predict pattern
  • Automatic model selection: Includes features for automatic model selection and hyperparameter tuning

Cons

  • Limited to statistical models: Does not include machine learning or deep learning-based forecasting methods
  • Steeper learning curve: May require more statistical knowledge compared to some other forecasting libraries
  • Less extensive documentation: While improving, the documentation may not be as comprehensive as more established libraries

Code Examples

  1. Basic forecasting with AutoARIMA:
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

# df: long-format DataFrame with the columns unique_id, ds, and y
model = StatsForecast(models=[AutoARIMA()], freq='D')
model.fit(df)
forecast = model.predict(h=30)
  2. Fitting several models side by side:
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA, AutoETS, Naive

models = [AutoARIMA(), AutoETS(), Naive()]
sf = StatsForecast(models=models, freq='D')
# forecast() fits and predicts in one pass; the result has one column per model
forecast = sf.forecast(df=df, h=30)
  3. Cross-validation for model evaluation:
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

model = StatsForecast(models=[AutoARIMA()], freq='D')
# Evaluate on 5 rolling windows of 30 steps each, shifted by 1 step
cv_results = model.cross_validation(df=df, h=30, step_size=1, n_windows=5)
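To make the h, step_size, and n_windows arguments of the cross-validation example more concrete, here is a minimal pure-Python sketch of how rolling-origin windows can be laid out. The function name rolling_origin_splits is hypothetical, and the convention that the last window ends at the final observation is an assumption for illustration, not a description of StatsForecast's internals.

```python
def rolling_origin_splits(n_obs, h, step_size, n_windows):
    """Yield (train_indices, test_indices) for each validation window.

    Assumed convention: the last window's test set ends at the final
    observation, and earlier windows are shifted back by step_size.
    """
    for w in range(n_windows):
        # Number of steps this window sits before the final one.
        offset = (n_windows - 1 - w) * step_size
        cutoff = n_obs - h - offset  # first index of the test set
        if cutoff <= 0:
            raise ValueError("series too short for the requested windows")
        yield range(0, cutoff), range(cutoff, cutoff + h)


# 100 observations, 5 windows of 30 steps, each shifted by 1 step
splits = list(rolling_origin_splits(n_obs=100, h=30, step_size=1, n_windows=5))
```

Each window trains only on observations before its cutoff, so the evaluation never uses future values.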

Getting Started

To get started with StatsForecast, follow these steps:

  1. Install the library:
pip install statsforecast
  2. Import the necessary modules and create a sample dataset:
import numpy as np
import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

# Create a sample dataset in the long format StatsForecast expects:
# unique_id (series identifier), ds (timestamp), y (value)
dates = pd.date_range(start='2020-01-01', end='2022-12-31', freq='D')
values = np.random.randn(len(dates)).cumsum()
df = pd.DataFrame({'unique_id': 'series_1', 'ds': dates, 'y': values})
  3. Fit a model and generate forecasts:
model = StatsForecast(models=[AutoARIMA()], freq='D')
model.fit(df)
forecast = model.predict(h=30)
print(forecast)

Competitor Comparisons

18,363

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

Pros of Prophet

  • User-friendly interface with automatic handling of seasonality and holidays
  • Robust handling of missing data and outliers
  • Extensive documentation and community support

Cons of Prophet

  • Can be slower for large datasets or many time series
  • Less flexibility for custom models or advanced statistical techniques
  • May overfit on datasets with limited historical data

Code Comparison

Prophet:

from prophet import Prophet  # the package was renamed from fbprophet in v1.0
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)

StatsForecast:

from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA
# The data is passed to forecast(), not to the constructor
fcst = StatsForecast(models=[AutoARIMA()], freq='D')
forecast = fcst.forecast(df=df, h=365)

Key Differences

  • Prophet focuses on an additive model with intuitive parameters, while StatsForecast offers a variety of statistical models
  • StatsForecast is generally faster, especially for multiple time series
  • Prophet provides built-in plotting and diagnostics, whereas StatsForecast relies more on external visualization tools
  • StatsForecast offers more advanced statistical models and the ability to easily combine multiple forecasting methods

Both libraries have their strengths: Prophet excels in ease of use and interpretability, while StatsForecast offers more flexibility and performance for advanced users and large-scale forecasting tasks.

7,833

A unified framework for machine learning with time series

Pros of sktime

  • Broader scope, covering various time series tasks beyond forecasting
  • Extensive ecosystem with many algorithms and transformers
  • Strong integration with scikit-learn and pandas

Cons of sktime

  • Steeper learning curve due to its comprehensive nature
  • Potentially slower performance for some forecasting tasks
  • Less focus on probabilistic forecasting compared to StatsForecast

Code Comparison

sktime example:

from sktime.forecasting.arima import ARIMA
from sktime.datasets import load_airline

y = load_airline()
forecaster = ARIMA(order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])

StatsForecast example:

from statsforecast import StatsForecast
from statsforecast.models import ARIMA

# season_length is passed separately; seasonal_order is the (P, D, Q) triple
sf = StatsForecast(
    models=[ARIMA(order=(1, 1, 1), season_length=12, seasonal_order=(1, 1, 1))],
    freq='ME',  # month-end data; use 'M' on pandas versions before 2.2
)
sf.fit(df)
forecasts = sf.predict(h=3)

Both libraries offer ARIMA forecasting, but sktime provides a more scikit-learn-like API, while StatsForecast focuses on simplicity and performance for forecasting tasks.

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

Pros of pmdarima

  • More mature project with a longer history and larger user base
  • Extensive documentation and examples for various use cases
  • Supports a wider range of ARIMA-based models, including SARIMAX

Cons of pmdarima

  • Slower performance, especially for large datasets or multiple time series
  • Less focus on modern forecasting techniques beyond ARIMA-based models
  • More complex API, requiring more code for basic forecasting tasks

Code Comparison

pmdarima:

from pmdarima import auto_arima

model = auto_arima(y, seasonal=True, m=12)
forecast = model.predict(n_periods=12)

statsforecast:

from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

# season_length=12 mirrors m=12 in the pmdarima call
fcst = StatsForecast(models=[AutoARIMA(season_length=12)], freq='ME')
forecast = fcst.forecast(df=df, h=12)

Both libraries offer AutoARIMA functionality, but statsforecast provides a more streamlined API for working with multiple time series and integrates well with pandas DataFrames. pmdarima offers more granular control over model parameters, which can be beneficial for advanced users but may require more code for basic forecasting tasks.

2,113

Open source time series library for Python

Pros of PyFlux

  • Offers a wider range of time series models, including ARIMA, GARCH, and state space models
  • Provides Bayesian inference capabilities for parameter estimation
  • Includes built-in plotting functions for model diagnostics and forecasts

Cons of PyFlux

  • Less actively maintained, with the last update in 2018
  • Slower performance for large datasets compared to StatsForecast
  • Limited documentation and community support

Code Comparison

PyFlux:

import pyflux as pf

model = pf.ARIMA(data=df, ar=1, ma=1, target='y')
model.fit()
forecast = model.predict(h=5)

StatsForecast:

from statsforecast import StatsForecast
from statsforecast.models import ARIMA

sf = StatsForecast(models=[ARIMA(order=(1, 0, 1))], freq='D')
sf.fit(df)
forecast = sf.predict(h=5)

Both libraries offer similar functionality for time series forecasting, but StatsForecast provides a more modern and efficient implementation with better performance for large-scale forecasting tasks. PyFlux offers a broader range of models and Bayesian inference capabilities, but lacks recent updates and community support.

1,864

A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.

Pros of Orbit

  • Supports Bayesian modeling, allowing for uncertainty quantification
  • Offers a wider range of models, including custom model creation
  • Provides built-in visualization tools for model diagnostics

Cons of Orbit

  • Steeper learning curve due to more complex API
  • Slower performance compared to StatsForecast's optimized implementations
  • Less focus on traditional statistical models

Code Comparison

StatsForecast:

from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

sf = StatsForecast(
    models=[AutoARIMA()],
    freq='D',
    n_jobs=-1
)
forecasts = sf.forecast(df=data, h=30)

Orbit:

from orbit.models import DLT

dlt = DLT(
    response_col='y',
    date_col='ds',
    seasonality=7,  # a single integer seasonal period
)
dlt.fit(df=data)
# future_df: a DataFrame holding the next 30 dates in the 'ds' column
predictions = dlt.predict(df=future_df)

Both libraries offer time series forecasting capabilities, but they cater to different use cases. StatsForecast focuses on traditional statistical models with high performance, while Orbit provides a more flexible framework for Bayesian modeling and custom model creation. The choice between them depends on the specific requirements of the forecasting task and the user's familiarity with different modeling approaches.

A flexible, intuitive and fast forecasting library

Pros of Greykite

  • More comprehensive feature set for advanced forecasting scenarios
  • Stronger focus on interpretability and explainability of models
  • Better suited for complex, multi-variate time series forecasting

Cons of Greykite

  • Steeper learning curve due to more complex API and configuration options
  • Slower execution times for large datasets compared to StatsForecast
  • Less emphasis on traditional statistical methods

Code Comparison

StatsForecast:

from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

sf = StatsForecast(models=[AutoARIMA()], freq='D')
forecasts = sf.forecast(df=df, h=30)

Greykite:

from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.forecaster import Forecaster

forecaster = Forecaster()
result = forecaster.run_forecast_config(
    df,
    config=ForecastConfig(
        model_template="AUTO",
        forecast_horizon=30
    )
)

Both libraries offer high-level APIs for forecasting, but Greykite's approach is more configurable and verbose. StatsForecast provides a simpler interface for quick forecasting tasks, while Greykite offers more control over the forecasting process and model selection.

README

Statistical ⚡️ Forecast

Lightning fast forecasting with statistical and econometric models

StatsForecast offers a collection of widely used univariate time series forecasting models, including automatic ARIMA, ETS, CES, and Theta modeling optimized for high performance using numba. It also includes a large battery of benchmarking models.

Installation

You can install StatsForecast with:

pip install statsforecast

or

conda install -c conda-forge statsforecast

Visit our Installation Guide for further instructions.

Quick Start

Minimal Example

from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA
from statsforecast.utils import AirPassengersDF

df = AirPassengersDF
sf = StatsForecast(
    models=[AutoARIMA(season_length=12)],
    freq='ME',
)
sf.fit(df)
sf.predict(h=12, level=[95])

Get Started with this quick guide.

Follow this end-to-end walkthrough for best practices.

Why?

Current Python alternatives for statistical models are slow, inaccurate and don't scale well. So we created a library that can be used to forecast in production environments or as benchmarks. StatsForecast includes an extensive battery of models that can efficiently fit millions of time series.

Features

  • Fastest and most accurate implementations of AutoARIMA, AutoETS, AutoCES, MSTL and Theta in Python.
  • Out-of-the-box compatibility with Spark, Dask, and Ray.
  • Probabilistic Forecasting and Confidence Intervals.
  • Support for exogenous Variables and static covariates.
  • Anomaly Detection.
  • Familiar sklearn syntax: .fit and .predict.
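As an illustration of the probabilistic forecasting feature listed above, the following pure-Python sketch builds a normal-theory prediction interval around a naive forecast. It is a textbook construction under the assumption of independent, normally distributed one-step errors, not StatsForecast's own code; the function name naive_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def naive_interval(y, h, level=95):
    """Naive point forecasts with normal-theory prediction intervals.

    Returns a list of (lower, point, upper) tuples, one per horizon step.
    """
    # One-step residuals of the naive model are the first differences.
    residuals = [b - a for a, b in zip(y, y[1:])]
    # Residual scale: root mean squared one-step error.
    sigma = sqrt(sum(r * r for r in residuals) / len(residuals))
    z = NormalDist().inv_cdf(0.5 + level / 200)
    out = []
    for step in range(1, h + 1):
        # The forecast variance of a random walk grows linearly with the step.
        half_width = z * sigma * sqrt(step)
        out.append((y[-1] - half_width, y[-1], y[-1] + half_width))
    return out


intervals = naive_interval([1, 2, 3, 4], h=2)
```

The interval widens with the horizon, which reflects growing uncertainty further into the future.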

Highlights

  • Inclusion of exogenous variables and prediction intervals for ARIMA.
  • 20x faster than pmdarima.
  • 1.5x faster than R.
  • 500x faster than Prophet.
  • 4x faster than statsmodels.
  • Compiled to high performance machine code through numba.
  • 1,000,000 series in 30 min with ray.
  • Replace FB-Prophet in two lines of code and gain speed and accuracy. Check the experiments here.
  • Fit 10 benchmark models on 1,000,000 series in under 5 min.

Missing something? Please open an issue or write to us on Slack.

Examples and Guides

📚 End to End Walkthrough: Model training, evaluation and selection for multiple time series

🔎 Anomaly Detection: detect anomalies for time series using in-sample prediction intervals.

👩‍🔬 Cross Validation: robust evaluation of model performance.

❄️ Multiple Seasonalities: how to forecast data with multiple seasonalities using an MSTL.

🔌 Predict Demand Peaks: electricity load forecasting for detecting daily peaks and reducing electric bills.

📈 Intermittent Demand: forecast series with very few non-zero observations.

🌡️ Exogenous Regressors: like weather or prices
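The anomaly-detection guide listed above relies on in-sample prediction intervals. A minimal pure-Python sketch of that idea follows: a point is flagged when it falls outside an interval around its fitted value. The function name flag_anomalies, the 99% default level, and the normal-error assumption are illustrative choices, not taken from the guide.

```python
from math import sqrt
from statistics import NormalDist


def flag_anomalies(y, fitted, level=99):
    """Flag observations outside the in-sample interval fitted +/- z * sigma."""
    residuals = [actual - fit for actual, fit in zip(y, fitted)]
    # Residual scale: root mean squared in-sample error.
    sigma = sqrt(sum(r * r for r in residuals) / len(residuals))
    z = NormalDist().inv_cdf(0.5 + level / 200)
    return [abs(r) > z * sigma for r in residuals]


# A flat series with one spike at index 7; the fitted values are the flat level.
observed = [10.0] * 20
observed[7] = 30.0
flags = flag_anomalies(observed, fitted=[10.0] * 20)
```

Because the spike itself inflates sigma, very short series can hide outliers; a robust scale estimate would be one way to reduce that effect.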

Models

Automatic Forecasting

Automatic forecasting tools search for the best parameters and select the best possible model for a group of time series. These tools are useful for large collections of univariate time series.

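| Model | Point Forecast | Probabilistic Forecast | Insample fitted values | Probabilistic fitted values | Exogenous features |
| AutoARIMA | ✅ | ✅ | ✅ | ✅ | ✅ |
| AutoETS | ✅ | ✅ | ✅ | ✅ | |
| AutoCES | ✅ | ✅ | ✅ | ✅ | |
| AutoTheta | ✅ | ✅ | ✅ | ✅ | |
| AutoMFLES | ✅ | ✅ | ✅ | ✅ | ✅ |
| AutoTBATS | ✅ | ✅ | ✅ | ✅ | |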

ARIMA Family

These models exploit the existing autocorrelations in the time series.

| Model | Point Forecast | Probabilistic Forecast | Insample fitted values | Probabilistic fitted values | Exogenous features |
| ARIMA | ✅ | ✅ | ✅ | ✅ | ✅ |
| AutoRegressive | ✅ | ✅ | ✅ | ✅ | ✅ |
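To show what exploiting autocorrelation means in the simplest case, here is a pure-Python sketch of an AR(1) forecast whose coefficient is the lag-1 sample autocorrelation (the Yule-Walker estimate). This is a textbook illustration with hypothetical function names; the library's ARIMA and AutoRegressive models use their own estimation routines.

```python
def autocorrelation(y, lag):
    """Sample autocorrelation of y at the given lag."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    num = sum((y[t] - mean) * (y[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ar1_forecast(y, h):
    """Forecast h steps ahead with an AR(1) model fitted by Yule-Walker."""
    mean = sum(y) / len(y)
    phi = autocorrelation(y, 1)  # Yule-Walker estimate of the AR(1) coefficient
    forecasts, last = [], y[-1]
    for _ in range(h):
        # Each step pulls the forecast back toward the series mean.
        last = mean + phi * (last - mean)
        forecasts.append(last)
    return forecasts


forecasts = ar1_forecast([1, 2, 3, 4, 5], h=2)
```

The stronger the lag-1 autocorrelation, the more slowly the forecast decays toward the mean.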

Theta Family

Fit two theta lines to a deseasonalized time series, using different techniques to obtain and combine the two theta lines to produce the final forecasts.

| Model | Point Forecast | Probabilistic Forecast | Insample fitted values | Probabilistic fitted values | Exogenous features |
| Theta | ✅ | ✅ | ✅ | ✅ | |
| OptimizedTheta | ✅ | ✅ | ✅ | ✅ | |
| DynamicTheta | ✅ | ✅ | ✅ | ✅ | |
| DynamicOptimizedTheta | ✅ | ✅ | ✅ | ✅ | |
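The notion of a theta line can be sketched in a few lines of pure Python. Under the classical definition, a theta line scales the curvature of the series around its linear trend: theta = 0 gives the fitted trend line and theta = 2 doubles the local curvature. The helper names below are hypothetical, and the variants in the table differ in how they estimate and combine the lines.

```python
def linear_trend(y):
    """Least-squares fit of y on t = 0..n-1; returns (intercept, slope)."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    slope = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    return y_mean - slope * t_mean, slope


def theta_line(y, theta):
    """Theta line: theta * y_t + (1 - theta) * (intercept + slope * t)."""
    intercept, slope = linear_trend(y)
    return [theta * v + (1 - theta) * (intercept + slope * t) for t, v in enumerate(y)]


series = [2.0, 4.0, 3.0, 7.0, 6.0]
trend_only = theta_line(series, 0)  # the linear trend itself
amplified = theta_line(series, 2)   # curvature doubled
```

Averaging the theta = 0 and theta = 2 lines recovers the original series, which is why the classical method forecasts each line separately and then combines the two forecasts.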

Multiple Seasonalities

Suited for signals with more than one clear seasonality. Useful for high-frequency data like electricity and logs.

| Model | Point Forecast | Probabilistic Forecast | Insample fitted values | Probabilistic fitted values | Exogenous features |
| MSTL | ✅ | ✅ | ✅ | ✅ | If trend forecaster supports |
| MFLES | ✅ | ✅ | ✅ | ✅ | ✅ |
| TBATS | ✅ | ✅ | ✅ | ✅ | |

GARCH and ARCH Models

Suited for modeling time series that exhibit non-constant volatility over time. The ARCH model is a particular case of GARCH.

| Model | Point Forecast | Probabilistic Forecast | Insample fitted values | Probabilistic fitted values | Exogenous features |
| GARCH | ✅ | ✅ | ✅ | ✅ | |
| ARCH | ✅ | ✅ | ✅ | ✅ | |
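The relationship between the two models can be seen in the variance recursion itself. The sketch below, in pure Python, computes the conditional variances of a GARCH(1,1) process for given residuals and parameters; setting beta to zero reduces it to ARCH(1). The function name and the choice to start from the long-run variance are assumptions for illustration, and parameter estimation is out of scope here.

```python
def garch_variance(residuals, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) process for given residuals.

    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]
    With beta = 0 this is the ARCH(1) special case.
    """
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    # Assumed initialization: the unconditional (long-run) variance.
    sigma2 = [omega / (1 - alpha - beta)]
    for eps in residuals[:-1]:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


garch = garch_variance([0.0, 2.0, 0.0], omega=0.1, alpha=0.2, beta=0.7)
arch = garch_variance([1.0, 3.0], omega=0.5, alpha=0.5, beta=0.0)
```

A large residual raises the next period's variance, and the beta term makes that increase persist, which is how the model captures volatility clustering.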

Baseline Models

Classical models for establishing a baseline.

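| Model | Point Forecast | Probabilistic Forecast | Insample fitted values | Probabilistic fitted values | Exogenous features |
| HistoricAverage | ✅ | ✅ | ✅ | ✅ | |
| Naive | ✅ | ✅ | ✅ | ✅ | |
| RandomWalkWithDrift | ✅ | ✅ | ✅ | ✅ | |
| SeasonalNaive | ✅ | ✅ | ✅ | ✅ | |
| WindowAverage | ✅ | | | | |
| SeasonalWindowAverage | ✅ | | | | |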
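The baseline models are simple enough to write out directly. The pure-Python sketch below gives the standard textbook point forecasts for five of them; the function names are hypothetical and the code is meant to show the ideas, not to mirror the library's implementations.

```python
def naive(y, h):
    """Repeat the last observed value."""
    return [y[-1]] * h


def seasonal_naive(y, h, season_length):
    """Repeat the last observed season."""
    last_season = y[-season_length:]
    return [last_season[i % season_length] for i in range(h)]


def historic_average(y, h):
    """Repeat the mean of all observations."""
    return [sum(y) / len(y)] * h


def window_average(y, h, window_size):
    """Repeat the mean of the last window_size observations."""
    window = y[-window_size:]
    return [sum(window) / len(window)] * h


def random_walk_with_drift(y, h):
    """Extend the last value along the average historical change."""
    drift = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + drift * step for step in range(1, h + 1)]


history = [1, 2, 3, 4, 5, 6]
```

For example, seasonal_naive(history, h=4, season_length=3) repeats the last three values and then wraps around to the first of them.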

Exponential Smoothing

Uses a weighted average of all past observations where the weights decrease exponentially into the past. Suitable for data with clear trend and/or seasonality. Use the SimpleExponential family for data with no clear trend or seasonality.

| Model | Point Forecast | Probabilistic Forecast | Insample fitted values | Probabilistic fitted values | Exogenous features |
| SimpleExponentialSmoothing | ✅ | | | | |
| SimpleExponentialSmoothingOptimized | ✅ | | | | |
| SeasonalExponentialSmoothing | ✅ | | | | |
| SeasonalExponentialSmoothingOptimized | ✅ | | | | |
| Holt | ✅ | ✅ | ✅ | ✅ | |
| HoltWinters | ✅ | ✅ | ✅ | ✅ | |
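The exponentially decreasing weights can be made explicit with a short pure-Python sketch of simple exponential smoothing. The function names and the choice to initialize the level with the first observation are assumptions for illustration; other initializations are common.

```python
def simple_exponential_smoothing(y, alpha):
    """Return the in-sample levels and the flat one-step-ahead forecast."""
    level = y[0]  # assumed initialization: the first observation
    levels = [level]
    for value in y[1:]:
        # New level: blend of the latest observation and the previous level.
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels, level


def ses_weights(alpha, n):
    """Weights given to the n most recent observations, newest first."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


levels, forecast = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
weights = ses_weights(alpha=0.5, n=3)
```

A larger alpha reacts faster to recent changes, while a smaller alpha averages over a longer history.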

Sparse or Intermittent

Suited for series with very few non-zero observations

| Model | Point Forecast | Probabilistic Forecast | Insample fitted values | Probabilistic fitted values | Exogenous features |
ADIDA✅✅✅
CrostonClassic✅✅✅
CrostonOptimized✅✅✅
CrostonSBA✅✅✅
IMAPA✅✅✅
TSB✅✅✅
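To illustrate how intermittent-demand methods treat a series that is mostly zeros, here is a pure-Python sketch of the classic Croston approach: smooth the non-zero demand sizes and the intervals between them separately, then forecast their ratio. The function name, the default alpha of 0.1, and the initialization from the first non-zero demand are assumptions for illustration; implementations differ in these details.

```python
def croston_classic(y, alpha=0.1):
    """One-step-ahead Croston forecast: smoothed size / smoothed interval."""
    size = interval = None
    periods_since_demand = 1
    for value in y:
        if value > 0:
            if size is None:
                # Initialize from the first non-zero demand and its position.
                size, interval = value, periods_since_demand
            else:
                size = alpha * value + (1 - alpha) * size
                interval = alpha * periods_since_demand + (1 - alpha) * interval
            periods_since_demand = 1
        else:
            periods_since_demand += 1
    if size is None:
        return 0.0  # no demand was ever observed
    return size / interval


demand_rate = croston_classic([0, 4, 0, 0, 4], alpha=0.5)
```

The result is an expected demand per period, so it is usually a small positive number rather than a prediction of when the next demand will occur.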

🔨 How to contribute

See CONTRIBUTING.md.

Citing

@misc{garza2022statsforecast,
    author={Azul Garza and Max Mergenthaler Canseco and Cristian Challú and Kin G. Olivares},
    title = {{StatsForecast}: Lightning fast forecasting with statistical and econometric models},
    year={2022},
    howpublished={{PyCon} Salt Lake City, Utah, US 2022},
    url={https://github.com/Nixtla/statsforecast}
}

Contributors ✨

Thanks goes to these wonderful people (emoji key):

  • azul: 💻 🚧
  • José Morales: 💻 🚧
  • Sugato Ray: 💻
  • Jeff Tackes: 🐛
  • darinkist: 🤔
  • Alec Helyar: 💬
  • Dave Hirschfeld: 💬
  • mergenthaler: 💻
  • Kin: 💻
  • Yasslight90: 🤔
  • asinig: 🤔
  • Philip Gillißen: 💻
  • Sebastian Hagn: 🐛 📖
  • Han Wang: 💻
  • Ben Jeffrey: 🐛
  • Beliavsky: 📖
  • Mariana Menchero García: 💻
  • Nikhil Gupta: 🐛
  • JD: 🐛
  • josh attenberg: 💻
  • JeroenPeterBos: 💻
  • Jeroen Van Der Donckt: 💻
  • Roymprog: 📖
  • Nelson Cárdenas Bolaño: 📖
  • Kyle Schmaus: 💻
  • Akmal Soliev: 💻
  • Nick To: 💻
  • Kevin Kho: 💻
  • Yiben Huang: 📖
  • Andrew Gross: 📖
  • taniishkaaa: 📖
  • Manuel Calzolari: 💻

This project follows the all-contributors specification. Contributions of any kind welcome!