jdb78/pytorch-forecasting

Time series forecasting with PyTorch

3,859 stars

Top Related Projects

Prophet · 18,363 stars

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

sktime · 7,833 stars

A unified framework for machine learning with time series

Darts · 7,897 stars

A python library for user-friendly forecasting and anomaly detection on time series.

Orbit · 1,864 stars

A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.

statsforecast

Lightning ⚡️ fast forecasting with statistical and econometric models.

Quick Overview

The jdb78/pytorch-forecasting repository is a Python library that provides a set of tools and utilities for time series forecasting using the PyTorch deep learning framework. It aims to simplify the process of building, training, and evaluating time series forecasting models, with a focus on flexibility and ease of use.

Pros

  • Flexible and Extensible: The library is designed to be highly modular and customizable, allowing users to easily integrate their own data preprocessing, model architectures, and training/evaluation pipelines.
  • Comprehensive Functionality: The library offers a wide range of features, including direct ingestion of pandas DataFrames, a variety of forecasting models (e.g., recurrent networks, DeepAR, N-BEATS, the Temporal Fusion Transformer), and utilities for building time-respecting validation sets.
  • Efficient and Scalable: The library leverages the power of PyTorch to enable efficient and scalable training of deep learning models for time series forecasting.
  • Active Development and Community: The project is actively maintained, with regular updates and contributions from the community, ensuring its continued relevance and improvement.

Cons

  • Steep Learning Curve: The library's flexibility and comprehensive functionality can make it challenging for beginners to get started, as it requires a good understanding of both time series forecasting and PyTorch.
  • Limited Documentation: While the project has some documentation, it could be more comprehensive and user-friendly, especially for newcomers to the library.
  • Potential Performance Issues: Depending on the complexity of the models and the size of the dataset, the library's performance may be a concern, especially on resource-constrained environments.
  • Dependency on PyTorch: The library is tightly coupled with the PyTorch deep learning framework, which may be a limitation for users who prefer other deep learning libraries or frameworks.

Code Examples

Here are a few code examples demonstrating the usage of the jdb78/pytorch-forecasting library:

  1. Loading and Preprocessing Time Series Data:
from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data import GroupNormalizer

# df is a pandas DataFrame containing all columns referenced below
dataset = TimeSeriesDataSet(
    df,
    time_idx="time",
    target="target",
    group_ids=["entity"],
    max_encoder_length=60,
    max_prediction_length=10,
    static_categoricals=["category2"],
    static_reals=["feature3"],
    time_varying_known_categoricals=["category1"],
    time_varying_known_reals=["feature1", "feature2"],
    time_varying_unknown_reals=["target"],
    target_normalizer=GroupNormalizer(groups=["entity"]),
)

This code shows how to wrap a pandas DataFrame in the TimeSeriesDataSet class, which handles common tasks such as declaring time-varying and static features, setting encoder and prediction lengths, and normalizing the target variable. Note that the DataFrame itself and a target column are required arguments.
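For completeness, a validation set can reuse the encodings and normalization fitted on the training set via the from_dataset constructor; a minimal sketch, assuming df also covers the validation periods:

# reuse encoders/normalizers fitted on the training dataset
validation = TimeSeriesDataSet.from_dataset(dataset, df, stop_randomization=True)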

  2. Defining a Forecasting Model:
from pytorch_forecasting import TemporalFusionTransformer

# from_dataset infers input sizes and embeddings from the dataset
model = TemporalFusionTransformer.from_dataset(
    dataset,
    hidden_size=128,
    attention_head_size=4,
    dropout=0.1,
    hidden_continuous_size=32,
    learning_rate=1e-3,
    log_interval=10,
    log_val_interval=1,
)

This code shows how to define a Temporal Fusion Transformer (TFT) via the from_dataset constructor, which infers input sizes and embeddings from the dataset; hyperparameters such as hidden size, attention head size, dropout, and learning rate are set explicitly. The number of training epochs is controlled by the trainer rather than the model (see the next step).

  3. Training and Evaluating the Model:
import lightning.pytorch as pl

# the Trainer comes from PyTorch Lightning; pytorch-forecasting does not ship its own
trainer = pl.Trainer(
    max_epochs=100,
    accelerator="gpu",
    devices=1,
)

# dataloaders are created from the TimeSeriesDataSet
# (in practice, the validation loader would come from a separate validation dataset)
train_dataloader = dataset.to_dataloader(train=True, batch_size=64)
val_dataloader = dataset.to_dataloader(train=False, batch_size=64)

trainer.fit(model, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader)

This code uses the PyTorch Lightning Trainer (pytorch-forecasting does not provide a Trainer class of its own) to train the model, creating the training and validation dataloaders from the dataset and requesting GPU acceleration.
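Once trained, the model can generate forecasts directly, and the TFT's built-in interpretation can be plotted. A minimal sketch; note that the exact return structure of raw predictions can vary slightly between library versions:

# point forecasts for the validation data
predictions = model.predict(val_dataloader)

# raw output (all quantiles) plus network inputs, e.g. for interpretation
raw = model.predict(val_dataloader, mode="raw", return_x=True)
interpretation = model.interpret_output(raw.output, reduction="sum")
model.plot_interpretation(interpretation)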

Getting Started

To get started with the jdb78/pytorch-forecasting library, follow these steps:

  1. Install the library using pip:
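pip install pytorch-forecasting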

Competitor Comparisons

Prophet · 18,363 stars

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

Pros of Prophet

  • Easier to use for beginners with minimal configuration required
  • Handles seasonality and holidays automatically
  • Well-documented with extensive examples and case studies

Cons of Prophet

  • Less flexible for custom model architectures
  • Limited to univariate time series forecasting
  • May struggle with complex, non-linear patterns in data

Code Comparison

Prophet:

from prophet import Prophet  # the package was renamed from "fbprophet" in Prophet 1.0

model = Prophet()
model.fit(df)  # df must contain "ds" (date) and "y" (value) columns
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)

PyTorch Forecasting:

import lightning.pytorch as pl
from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet

dataset = TimeSeriesDataSet(
    df,
    time_idx="timestamp",
    target="target",
    group_ids=["id"],
    static_categoricals=["category"],
    time_varying_known_reals=["price"],
)
train_dataloader = dataset.to_dataloader(train=True, batch_size=64)
val_dataloader = dataset.to_dataloader(train=False, batch_size=64)

model = TemporalFusionTransformer.from_dataset(dataset)
pl.Trainer(max_epochs=10).fit(model, train_dataloader, val_dataloader)  # training runs through Lightning
predictions = model.predict(val_dataloader)

Prophet is simpler to set up and use, while PyTorch Forecasting offers more flexibility and advanced features for complex time series modeling tasks.

sktime · 7,833 stars

A unified framework for machine learning with time series

Pros of sktime

  • Broader scope: Supports various time series tasks beyond forecasting, including classification and clustering
  • Scikit-learn compatible: Integrates seamlessly with the scikit-learn ecosystem
  • Extensive algorithm collection: Offers a wide range of classical and modern time series algorithms

Cons of sktime

  • Less focus on deep learning: Limited support for neural network-based forecasting models
  • Steeper learning curve: More complex API due to its broader scope and flexibility

Code Comparison

sktime:

from sktime.forecasting.arima import ARIMA
from sktime.datasets import load_airline

y = load_airline()
forecaster = ARIMA(order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])

pytorch-forecasting:

from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet

training = TimeSeriesDataSet(
    data,
    time_idx="timestamp",
    target="target",
    group_ids=["group"],
    static_categoricals=["static_cat"],
    time_varying_known_reals=["time_varying_real"],
)
model = TemporalFusionTransformer.from_dataset(training)
# training via the Lightning Trainer omitted for brevity
predictions = model.predict(data)

Darts · 7,897 stars

A python library for user-friendly forecasting and anomaly detection on time series.

Pros of Darts

  • Supports a wider range of models, including classical statistical methods and machine learning algorithms
  • Offers a more unified API for different types of models
  • Provides built-in functionality for model selection and hyperparameter tuning

Cons of Darts

  • Less focus on deep learning models compared to PyTorch Forecasting
  • May have slower performance for large-scale time series forecasting tasks
  • Less extensive documentation and community support

Code Comparison

PyTorch Forecasting:

from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet

dataset = TimeSeriesDataSet(
    data,
    time_idx="timestamp",
    target="target",
    group_ids=["group"],
    static_categoricals=["category"],
)

model = TemporalFusionTransformer.from_dataset(dataset)

Darts:

from darts import TimeSeries
from darts.models import Prophet

series = TimeSeries.from_dataframe(df, 'timestamp', 'target', freq='D')
model = Prophet()
model.fit(series)

Both libraries offer intuitive APIs for time series forecasting, but PyTorch Forecasting is more focused on deep learning models, while Darts provides a broader range of algorithms and a more unified interface across different model types.

Orbit · 1,864 stars

A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.

Pros of Orbit

  • Focuses on Bayesian time series modeling, offering probabilistic forecasting
  • Provides built-in diagnostics and visualization tools
  • Supports both Stan and Pyro backends for flexible model implementation

Cons of Orbit

  • Limited to time series forecasting, less versatile for other ML tasks
  • Steeper learning curve for users unfamiliar with Bayesian methods
  • Smaller community and fewer resources compared to PyTorch ecosystem

Code Comparison

Orbit example:

from orbit.models import DLT

model = DLT(
    response_col='y',
    date_col='ds',
    regressor_col=['regressor1', 'regressor2']
)
model.fit(df)

PyTorch Forecasting example:

from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet

dataset = TimeSeriesDataSet(
    df,
    time_idx="timestamp",
    target="target",
    group_ids=["id"],
    static_categoricals=["category"],
    time_varying_known_reals=["price"],
)
model = TemporalFusionTransformer.from_dataset(dataset)

Both libraries offer time series forecasting capabilities, but Orbit focuses on Bayesian methods while PyTorch Forecasting provides a broader range of deep learning models. Orbit excels in probabilistic forecasting and diagnostics, while PyTorch Forecasting benefits from the extensive PyTorch ecosystem and flexibility for various ML tasks.

statsforecast

Lightning ⚡️ fast forecasting with statistical and econometric models.

Pros of statsforecast

  • Focuses on statistical models, offering a wide range of traditional forecasting methods
  • Lightweight and fast, with numba-compiled implementations of classical models
  • Designed for scalability, capable of handling large datasets efficiently

Cons of statsforecast

  • Limited support for deep learning models compared to pytorch-forecasting
  • Less flexibility in handling complex, multi-variate time series data
  • Fewer built-in preprocessing and feature engineering capabilities

Code Comparison

statsforecast:

from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

sf = StatsForecast(
    models=[AutoARIMA()],
    freq='D',
    n_jobs=-1
)
forecasts = sf.forecast(df=data, h=30)

pytorch-forecasting:

from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet

dataset = TimeSeriesDataSet(
    data,
    time_idx="timestamp",
    target="target",
    group_ids=["id"],
    static_categoricals=["category"],
    time_varying_known_reals=["price"],
)
model = TemporalFusionTransformer.from_dataset(dataset)
# training via the Lightning Trainer omitted for brevity
predictions = model.predict(dataset)


README

PyTorch Forecasting

PyTorch Forecasting is a PyTorch-based package for forecasting with state-of-the-art deep learning architectures. It provides a high-level API and uses PyTorch Lightning to scale training on GPU or CPU, with automatic logging.

Documentation · Tutorials · Release Notes
License: MIT · Community: Discord, Slack · CI/CD: GitHub Actions, Read the Docs, code coverage · Packages: PyPI, conda · Code style: black

Our article on Towards Data Science introduces the package and provides background information.

PyTorch Forecasting aims to ease state-of-the-art timeseries forecasting with neural networks for real-world cases and research alike. The goal is to provide a high-level API with maximum flexibility for professionals and reasonable defaults for beginners. Specifically, the package provides

  • A timeseries dataset class which abstracts handling variable transformations, missing values, randomized subsampling, multiple history lengths, etc.
  • A base model class which provides basic training of timeseries models along with logging in tensorboard and generic visualizations such as actual-vs-prediction and dependency plots
  • Multiple neural network architectures for timeseries forecasting that have been enhanced for real-world deployment and come with in-built interpretation capabilities
  • Multi-horizon timeseries metrics
  • Hyperparameter tuning with optuna (see the sketch below)
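For illustration, a trimmed sketch of the optuna-based tuning helper shipped with the package; the search ranges and trial count shown here are assumptions, not recommendations:

from pytorch_forecasting.models.temporal_fusion_transformer.tuning import optimize_hyperparameters

# run an optuna study over TFT hyperparameters
study = optimize_hyperparameters(
    train_dataloader,
    val_dataloader,
    model_path="optuna_test",  # checkpoints for each trial are stored here
    n_trials=100,
    max_epochs=50,
    gradient_clip_val_range=(0.01, 1.0),
    hidden_size_range=(8, 128),
    hidden_continuous_size_range=(8, 128),
    attention_head_size_range=(1, 4),
    learning_rate_range=(0.001, 0.1),
    dropout_range=(0.1, 0.3),
    reduce_on_plateau_patience=4,
)
print(study.best_trial.params)  # best hyperparameters found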

The package is built on pytorch-lightning to allow training on CPUs, single and multiple GPUs out-of-the-box.
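For example, moving training from a single device to multiple GPUs is a configuration change only; a sketch assuming two GPUs are available:

import lightning.pytorch as pl

# same training code as below; only the Trainer flags change
trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp")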

Installation

If you are working on Windows, you need to first install PyTorch with

pip install torch -f https://download.pytorch.org/whl/torch_stable.html

Otherwise, you can proceed with

pip install pytorch-forecasting

Alternatively, you can install the package via conda

conda install pytorch-forecasting pytorch -c pytorch>=1.7 -c conda-forge

PyTorch Forecasting is now installed from the conda-forge channel while PyTorch is installed from the pytorch channel.

To use the MQF2 loss (multivariate quantile loss), also run pip install pytorch-forecasting[mqf2].
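The loss is then available as a metric class; a minimal sketch (the prediction_length value is an assumption and must match the dataset's max_prediction_length):

from pytorch_forecasting.metrics import MQF2DistributionLoss

# pass as loss=... when constructing a model via from_dataset
loss = MQF2DistributionLoss(prediction_length=6)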

Documentation

Visit https://pytorch-forecasting.readthedocs.io to read the documentation with detailed tutorials.

Available models

The documentation provides a comparison of available models.

To implement new models or other custom components, see the How to implement new models tutorial. It covers basic as well as advanced architectures.

Usage example

Networks can be trained with the PyTorch Lightning Trainer on pandas DataFrames, which are first converted to a TimeSeriesDataSet.

# imports for training
import lightning.pytorch as pl
from lightning.pytorch.loggers import TensorBoardLogger
from lightning.pytorch.callbacks import EarlyStopping, LearningRateMonitor
# import dataset, network to train and metric to optimize
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer, QuantileLoss
from lightning.pytorch.tuner import Tuner

# load data: this is pandas dataframe with at least a column for
# * the target (what you want to predict)
# * the timeseries ID (which should be a unique string to identify each timeseries)
# * the time of the observation (which should be a monotonically increasing integer)
data = ...

# define the dataset, i.e. add metadata to pandas dataframe for the model to understand it
max_encoder_length = 36
max_prediction_length = 6
training_cutoff = "YYYY-MM-DD"  # day for cutoff

training = TimeSeriesDataSet(
    data[lambda x: x.date <= training_cutoff],
    time_idx= ...,  # column name of time of observation
    target= ...,  # column name of target to predict
    group_ids=[ ... ],  # column name(s) for timeseries IDs
    max_encoder_length=max_encoder_length,  # how much history to use
    max_prediction_length=max_prediction_length,  # how far to predict into future
    # covariates static for a timeseries ID
    static_categoricals=[ ... ],
    static_reals=[ ... ],
    # covariates known and unknown in the future to inform prediction
    time_varying_known_categoricals=[ ... ],
    time_varying_known_reals=[ ... ],
    time_varying_unknown_categoricals=[ ... ],
    time_varying_unknown_reals=[ ... ],
)

# create validation dataset using the same normalization techniques as for the training dataset
validation = TimeSeriesDataSet.from_dataset(training, data, min_prediction_idx=training.index.time.max() + 1, stop_randomization=True)

# convert datasets to dataloaders for training
batch_size = 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=2)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=2)

# create PyTorch Lightning Trainer with early stopping
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=1, verbose=False, mode="min")
lr_logger = LearningRateMonitor()
trainer = pl.Trainer(
    max_epochs=100,
    accelerator="auto",  # run on CPU, if on multiple GPUs, use strategy="ddp"
    gradient_clip_val=0.1,
    limit_train_batches=30,  # 30 batches per epoch
    callbacks=[lr_logger, early_stop_callback],
    logger=TensorBoardLogger("lightning_logs")
)

# define network to train - the architecture is mostly inferred from the dataset, so that only a few hyperparameters have to be set by the user
tft = TemporalFusionTransformer.from_dataset(
    # dataset
    training,
    # architecture hyperparameters
    hidden_size=32,
    attention_head_size=1,
    dropout=0.1,
    hidden_continuous_size=16,
    # loss metric to optimize
    loss=QuantileLoss(),
    # logging frequency
    log_interval=2,
    # optimizer parameters
    learning_rate=0.03,
    reduce_on_plateau_patience=4
)
print(f"Number of parameters in network: {tft.size()/1e3:.1f}k")

# find the optimal learning rate
res = Tuner(trainer).lr_find(
    tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader, early_stop_threshold=1000.0, max_lr=0.3,
)
# and plot the result - always visually confirm that the suggested learning rate makes sense
print(f"suggested learning rate: {res.suggestion()}")
fig = res.plot(show=True, suggest=True)
fig.show()

# fit the model on the data - redefine the model with the correct learning rate if necessary
trainer.fit(
    tft, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader,
)
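
After training, the best checkpoint can be reloaded for prediction; a short sketch following the documentation's tutorial pattern (this assumes the Lightning Trainer's default checkpointing behavior):

# load the model with the best validation loss and forecast
best_model_path = trainer.checkpoint_callback.best_model_path
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)
predictions = best_tft.predict(val_dataloader)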