sktime

A unified framework for machine learning with time series

9,127

1,648

1,500

Top Related Projects

sktime

9,178

A unified framework for machine learning with time series

statsmodels

10,845

Statsmodels: statistical modeling and econometrics in Python

prophet

19,467

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

pmdarima

1,662

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

tslearn

2,994

The machine learning toolkit for time series analysis in Python

pyflux

2,128

Open source time series library for Python

Quick Overview

sktime is a unified framework for machine learning with time series in Python. It provides a comprehensive set of tools for time series analysis, forecasting, and classification. sktime is designed to be modular, extensible, and compatible with scikit-learn, making it easy to integrate into existing machine learning workflows.

Pros

Comprehensive toolkit for various time series tasks (forecasting, classification, regression)
Consistent API design, following scikit-learn conventions
Extensive documentation and examples
Active community and regular updates

Cons

Steeper learning curve for beginners compared to some simpler time series libraries
Some advanced features may require additional dependencies
Performance can be slower for very large datasets compared to specialized libraries

Code Examples

Time series forecasting with ARIMA:

from sktime.datasets import load_airline
from sktime.forecasting.arima import ARIMA

y = load_airline()
forecaster = ARIMA(order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])
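
In the ARIMA example, `fh=[1, 2, 3]` is a relative forecasting horizon: each integer counts steps after the last training observation. The sketch below illustrates that idea in plain Python for monthly data. `to_absolute` is a hypothetical helper written for this illustration and is not part of sktime's API.

```python
def to_absolute(cutoff, relative_steps):
    """Map relative steps (1 = one month after the cutoff) to (year, month) pairs."""
    year, month = cutoff
    absolute = []
    for step in relative_steps:
        # Count months since year 0, then shift by the requested step
        months = year * 12 + (month - 1) + step
        absolute.append((months // 12, months % 12 + 1))
    return absolute


# The airline dataset ends in December 1960
cutoff = (1960, 12)
horizon = to_absolute(cutoff, [1, 2, 3])
print(horizon)  # [(1961, 1), (1961, 2), (1961, 3)]
```

With the airline data, steps 1 to 3 therefore correspond to January through March 1961.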

Time series classification with a time series forest:

from sktime.classification.interval_based import TimeSeriesForestClassifier
from sktime.datasets import load_arrow_head

# ArrowHead is a univariate dataset, matching the README quickstart for this classifier
X_train, y_train = load_arrow_head(split="train")
X_test, y_test = load_arrow_head(split="test")

clf = TimeSeriesForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
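
Interval-based classifiers such as the time series forest summarise sub-intervals of each series with simple statistics and train tree ensembles on those features. The following is a minimal plain-Python sketch of the feature step, assuming the commonly used trio of mean, standard deviation, and slope. `interval_features` is a hypothetical name, and this is not sktime's implementation.

```python
from statistics import mean, pstdev


def interval_features(series, start, end):
    """Return (mean, standard deviation, slope) of series[start:end].

    The interval must contain at least two points so the slope is defined.
    """
    window = series[start:end]
    xs = range(len(window))
    x_bar, y_bar = mean(xs), mean(window)
    # Least-squares slope of the values against their position in the window
    numerator = sum((x - x_bar) * (v - y_bar) for x, v in zip(xs, window))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return y_bar, pstdev(window), numerator / denominator


features = interval_features([0, 1, 2, 3, 10, 10], start=0, end=4)
print(features)
```

In a full classifier, many such intervals would be drawn at random and their features concatenated into one row per series.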

Time series clustering:

from sktime.clustering.k_means import TimeSeriesKMeans
from sktime.datasets import load_basic_motions

X, _ = load_basic_motions(return_X_y=True)

clusterer = TimeSeriesKMeans(n_clusters=4)
clusterer.fit(X)
cluster_labels = clusterer.predict(X)

Getting Started

To get started with sktime, first install it using pip:

pip install sktime

Then, import the necessary modules and load a dataset:

from sktime.datasets import load_airline
from sktime.forecasting.naive import NaiveForecaster
from sktime.split import temporal_train_test_split

y = load_airline()
y_train, y_test = temporal_train_test_split(y)

forecaster = NaiveForecaster(strategy="mean")
forecaster.fit(y_train)
y_pred = forecaster.predict(fh=[1, 2, 3])

This example demonstrates loading a dataset, splitting it into train and test sets, fitting a simple forecaster, and making predictions.
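
To make the two steps concrete, here is a rough plain-Python sketch of a temporal split followed by a mean forecast. The function names and the 25% test fraction are assumptions made for illustration. This is not how sktime implements `temporal_train_test_split` or `NaiveForecaster`.

```python
from statistics import mean


def temporal_split(series, test_fraction=0.25):
    """Split a sequence into train and test parts without shuffling."""
    n_test = max(1, round(len(series) * test_fraction))
    return series[:-n_test], series[-n_test:]


def mean_forecast(train, horizon):
    """Predict the training mean for every step in the horizon."""
    level = mean(train)
    return [level for _ in horizon]


y = [112, 118, 132, 129, 121, 135, 148, 148]
y_train, y_test = temporal_split(y)
y_pred = mean_forecast(y_train, horizon=[1, 2])
print(y_train, y_test, y_pred)
```

The key point is that the test set is always the most recent part of the series, so the model is never evaluated on observations that precede its training data.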

Competitor Comparisons

sktime

9,178

A unified framework for machine learning with time series

Pros of sktime

More comprehensive and actively maintained time series toolbox
Larger community and contributor base
Extensive documentation and examples

Cons of sktime

Potentially more complex API due to broader feature set
May have a steeper learning curve for beginners

Code Comparison

sktime:

from sktime.forecasting.naive import NaiveForecaster
from sktime.datasets import load_airline

y = load_airline()
forecaster = NaiveForecaster(strategy="mean")
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])

Summary

sktime is a well-established time series analysis and forecasting library for Python. It offers a wide range of algorithms and tools for various time series tasks. The repository is actively maintained and has a growing community of contributors and users. While it provides extensive functionality, this breadth may result in a more complex API and a steeper learning curve for newcomers.

statsmodels

10,845

Statsmodels: statistical modeling and econometrics in Python

Pros of statsmodels

More comprehensive statistical modeling capabilities, including econometrics and advanced regression techniques
Longer history and more established in the scientific community
Extensive documentation and examples for various statistical methods

Cons of statsmodels

Steeper learning curve due to its broader scope and more complex API
Less focus on time series forecasting compared to sktime
Slower development cycle and less frequent updates

Code Comparison

statsmodels:

import statsmodels.api as sm
model = sm.OLS(y, X).fit()
predictions = model.predict(X_new)

sktime:

from sktime.forecasting.arima import ARIMA
forecaster = ARIMA()
forecaster.fit(y)
predictions = forecaster.predict(fh=[1, 2, 3])

Summary

statsmodels is a comprehensive statistical library with a wide range of modeling capabilities, while sktime focuses specifically on time series analysis and forecasting. statsmodels offers more advanced statistical techniques but has a steeper learning curve, whereas sktime provides a more user-friendly interface for time series tasks. The choice between the two depends on the specific requirements of your project and your familiarity with statistical concepts.

prophet

19,467

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

Pros of Prophet

Designed specifically for business forecasting with built-in handling of holidays and seasonal effects
User-friendly interface with sensible defaults that often work without manual tuning
Robust to missing data and outliers

Cons of Prophet

Limited to univariate time series forecasting
Less flexible for custom model architectures compared to sktime
May struggle with complex, non-linear patterns in data

Code Comparison

Prophet:

from prophet import Prophet  # the package was renamed from fbprophet in version 1.0

model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)

sktime:

from sktime.forecasting.arima import AutoARIMA

forecaster = AutoARIMA()
forecaster.fit(y_train)
y_pred = forecaster.predict(fh=[1, 2, 3])  # forecast the next three periods

Prophet focuses on simplicity and ease of use, while sktime offers a more comprehensive toolkit for various time series tasks. Prophet excels in business forecasting scenarios with clear seasonality and holiday effects, whereas sktime provides greater flexibility for different types of time series analysis and forecasting methods.

pmdarima

1,662

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

Pros of pmdarima

Specialized focus on ARIMA models, offering advanced features and optimizations
Includes automated model selection and hyperparameter tuning
Provides comprehensive documentation and examples for ARIMA-specific use cases

Cons of pmdarima

Limited to ARIMA models, lacking the broader range of time series algorithms found in sktime
Less flexibility for handling complex time series tasks beyond ARIMA modeling
Smaller community and ecosystem compared to sktime

Code Comparison

pmdarima:

from pmdarima import auto_arima

model = auto_arima(y, seasonal=True, m=12)
forecast = model.predict(n_periods=5)

sktime:

from sktime.forecasting.arima import AutoARIMA

forecaster = AutoARIMA(sp=12)
forecaster.fit(y)
forecast = forecaster.predict(fh=[1, 2, 3, 4, 5])

Both libraries offer auto ARIMA functionality, but pmdarima's implementation is more specialized and may offer additional ARIMA-specific options. sktime provides a more consistent API across various time series algorithms, making it easier to switch between different forecasting methods.

tslearn

2,994

The machine learning toolkit for time series analysis in Python

Pros of tslearn

Simpler API and easier to get started for beginners
Focuses specifically on time series tasks, potentially leading to more optimized implementations
Includes some unique algorithms not found in sktime, like GAK (Global Alignment Kernel)

Cons of tslearn

Smaller community and less frequent updates compared to sktime
More limited in scope, focusing mainly on clustering and classification tasks
Less integration with the broader scikit-learn ecosystem

Code Comparison

tslearn example:

from tslearn.clustering import TimeSeriesKMeans
kmeans = TimeSeriesKMeans(n_clusters=3, metric="dtw")
kmeans.fit(X_train)

sktime example:

from sktime.clustering.k_means import TimeSeriesKMeans
kmeans = TimeSeriesKMeans(n_clusters=3, metric="dtw")
kmeans.fit(X_train)

Both libraries offer similar functionality for time series clustering, with slight differences in API design. sktime generally follows scikit-learn conventions more closely, while tslearn has its own unique API style. sktime provides a broader range of functionality beyond just clustering and classification, making it more versatile for various time series tasks.
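
Both snippets cluster with dynamic time warping (DTW), which compares series after allowing them to stretch along the time axis. The following is a textbook dynamic-programming sketch in plain Python, assuming squared pointwise differences and no warping window. It returns the accumulated squared cost, whereas libraries differ on details such as taking a square root, so do not expect identical numbers from tslearn or sktime.

```python
def dtw_distance(a, b):
    """Accumulated squared cost of the best warping path between a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0, the repeated 2 is absorbed by warping
print(dtw_distance([1, 2, 3], [2, 3, 4]))     # 2.0, below the squared Euclidean cost of 3
```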

pyflux

2,128

Open source time series library for Python

Pros of pyflux

Focuses specifically on probabilistic time series modeling and Bayesian inference
Includes specialized models like GARCH for volatility forecasting
Provides built-in plotting and diagnostics for model evaluation

Cons of pyflux

Less actively maintained compared to sktime (last update in 2018)
More limited in scope, primarily for financial time series analysis
Smaller community and fewer contributors

Code Comparison

pyflux example:

import pyflux as pf
model = pf.ARIMA(data=y, ar=1, ma=1, family=pf.Normal())
model.fit()
model.plot_fit()

sktime example:

from sktime.forecasting.arima import ARIMA
forecaster = ARIMA(order=(1, 0, 1))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])

Both libraries offer ARIMA modeling, but pyflux provides more built-in visualization tools, while sktime offers a broader range of time series algorithms and a more consistent API across different model types. sktime also integrates better with the wider scikit-learn ecosystem, making it more versatile for general machine learning tasks involving time series data.

README

Welcome to sktime

A unified interface for machine learning with time series

:rocket: Version 0.38.1 out now! Check out the release notes here.

sktime is a library for time series analysis in Python. It provides a unified interface for multiple time series learning tasks. Currently, this includes forecasting, time series classification, clustering, anomaly/changepoint detection, and other tasks. It comes with time series algorithms and scikit-learn compatible tools to build, tune, and validate time series models.

Documentation · Tutorials · Release Notes

:books: Documentation

Documentation
:star: Tutorials	New to sktime? Here's everything you need to know!
:clipboard: Binder Notebooks	Example notebooks to play with in your browser.
:woman_technologist: Examples	How to use sktime and its features.
:scissors: Extension Templates	How to build your own estimator using sktime's API.
:control_knobs: API Reference	The detailed reference for sktime's API.
:tv: Video Tutorial	Our video tutorial from 2021 PyData Global.
:hammer_and_wrench: Changelog	Changes and version history.
:deciduous_tree: Roadmap	sktime's software and community development plan.
:pencil: Related Software	A list of related software.

:speech_balloon: Where to ask questions

Questions and feedback are extremely welcome! We strongly believe in the value of sharing help publicly, as it allows a wider audience to benefit from it.

Type	Platforms
:bug: Bug Reports	GitHub Issue Tracker
:sparkles: Feature Requests & Ideas	GitHub Issue Tracker
:woman_technologist: Usage Questions	GitHub Discussions · Stack Overflow
:speech_balloon: General Discussion	GitHub Discussions
:factory: Contribution & Development	`dev-chat` channel · Discord
:globe_with_meridians: Meet-ups and collaboration sessions	Discord - Fridays 13 UTC, dev/meet-ups channel

:dizzy: Features

Our objective is to enhance the interoperability and usability of the time series analysis ecosystem in its entirety. sktime provides a unified interface for distinct but related time series learning tasks. It features dedicated time series algorithms and tools for composite model building, such as pipelining, ensembling, tuning, and reduction, empowering users to apply algorithms designed for one task to another.
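
Reduction, mentioned above, means solving one learning task with an algorithm built for another. A common case is turning forecasting into tabular regression with a sliding window. The sketch below shows only the data transformation in plain Python. `sliding_windows` is a hypothetical helper for illustration, not sktime's reduction API.

```python
def sliding_windows(series, window_length):
    """Turn a series into (X, y): each row of X holds the values preceding y."""
    X, y = [], []
    for start in range(len(series) - window_length):
        X.append(series[start:start + window_length])
        y.append(series[start + window_length])
    return X, y


X, y = sliding_windows([1, 2, 3, 4, 5, 6], window_length=3)
print(X)  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(y)  # [4, 5, 6]
```

Any tabular regressor could then be fitted on `X` and `y` to predict the next value from the previous three.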

sktime also provides interfaces to related libraries, for example scikit-learn, statsmodels, tsfresh, PyOD, and fbprophet, among others.

Module	Status	Links
Forecasting	stable	Tutorial · API Reference · Extension Template
Time Series Classification	stable	Tutorial · API Reference · Extension Template
Time Series Regression	stable	API Reference
Transformations	stable	Tutorial · API Reference · Extension Template
Detection tasks	maturing	Extension Template
Parameter fitting	maturing	API Reference · Extension Template
Time Series Clustering	maturing	API Reference · Extension Template
Time Series Distances/Kernels	maturing	Tutorial · API Reference · Extension Template
Time Series Alignment	experimental	API Reference · Extension Template
Time Series Splitters	maturing	Extension Template
Distributions and simulation	experimental

:hourglass_flowing_sand: Install sktime

For troubleshooting and detailed installation instructions, see the documentation.

Operating system: macOS X · Linux · Windows 8.1 or higher
Python version: Python 3.8, 3.9, 3.10, 3.11, and 3.12 (only 64-bit)
Package managers: pip · conda (via conda-forge)

pip

Using pip, sktime releases are available as source packages and binary wheels. Available wheels are listed here.

pip install sktime

or, with maximum dependencies,

pip install sktime[all_extras]

For curated sets of soft dependencies for specific learning tasks:

pip install sktime[forecasting]  # for selected forecasting dependencies
pip install sktime[forecasting,transformations]  # forecasters and transformers

or similar. Valid sets are:

forecasting
transformations
classification
regression
clustering
param_est
networks
detection
alignment

Caveat: in general, not all soft dependencies for a learning task are installed, only a curated selection.
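
Because a dependency set installs only a selection of optional packages, it can help to check what is importable before relying on a particular estimator. The sketch below uses only the Python standard library. `is_installed` is a hypothetical helper, and the package names are example choices rather than a list taken from sktime's dependency sets.

```python
from importlib.util import find_spec


def is_installed(package_name):
    """Return True if a top-level package can be imported in this environment."""
    return find_spec(package_name) is not None


# Example import names of optional packages that some estimators wrap
optional = ["statsmodels", "pmdarima", "prophet"]
missing = [name for name in optional if not is_installed(name)]
print("Missing optional packages:", missing)
```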

conda

You can also install sktime from conda via the conda-forge channel. The feedstock including the build recipe and configuration is maintained in this conda-forge repository.

conda install -c conda-forge sktime

or, with maximum dependencies,

conda install -c conda-forge sktime-all-extras

(as conda does not support dependency sets, flexible choice of soft dependencies is unavailable via conda)

:zap: Quickstart

Forecasting

from sktime.datasets import load_airline
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.theta import ThetaForecaster
from sktime.split import temporal_train_test_split
from sktime.performance_metrics.forecasting import mean_absolute_percentage_error

y = load_airline()
y_train, y_test = temporal_train_test_split(y)
fh = ForecastingHorizon(y_test.index, is_relative=False)
forecaster = ThetaForecaster(sp=12)  # monthly seasonal periodicity
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
mean_absolute_percentage_error(y_test, y_pred)
>>> 0.08661467738190656
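
The score above is a mean absolute percentage error (MAPE): the average of the absolute errors divided by the absolute true values. The following is a minimal plain-Python sketch of the non-symmetric form, with made-up numbers. It assumes no true value is zero and does not reproduce sktime's extra options.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction rather than a percentage."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios)


# Errors of 10%, 10%, and 0% average to roughly 6.7%
score = mape([100, 200, 400], [110, 180, 400])
print(round(score, 4))  # 0.0667
```

Read this way, the quickstart result of about 0.087 means the forecasts are off by roughly 8.7% on average.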

Time Series Classification

from sktime.classification.interval_based import TimeSeriesForestClassifier
from sktime.datasets import load_arrow_head
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_arrow_head()
X_train, X_test, y_train, y_test = train_test_split(X, y)
classifier = TimeSeriesForestClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
>>> 0.8679245283018868

:wave: How to get involved

There are many ways to join the sktime community. We follow the all-contributors specification: all kinds of contributions are welcome - not just code.

Documentation
:gift_heart: Contribute	How to contribute to sktime.
:school_satchel: Mentoring	New to open source? Apply to our mentoring program!
:date: Meetings	Join our discussions, tutorials, workshops, and sprints!
:woman_mechanic: Developer Guides	How to further develop sktime's code base.
:construction: Enhancement Proposals	Design a new feature for sktime.
:medal_sports: Contributors	A list of all contributors.
:raising_hand: Roles	An overview of our core community roles.
:money_with_wings: Donate	Fund sktime maintenance and development.
:classical_building: Governance	How and by whom decisions are made in sktime's community.

:trophy: Hall of fame

Thanks to all our community for all your wonderful contributions, PRs, issues, ideas.

:bulb: Project vision

By the community, for the community -- developed by a friendly and collaborative community.
The right tool for the right task -- helping users to diagnose their learning problem and suitable scientific model types.
Embedded in state-of-the-art ecosystems and provider of interoperable interfaces -- interoperable with scikit-learn, statsmodels, tsfresh, and other community favorites.
Rich model composition and reduction functionality -- build tuning and feature extraction pipelines, solve forecasting tasks with scikit-learn regressors.
Clean, descriptive specification syntax -- based on modern object-oriented design principles for data science.
Fair model assessment and benchmarking -- build your models, inspect your models, check your models, and avoid pitfalls.
Easily extensible -- easy extension templates to add your own algorithms compatible with sktime's API.
