sktime

A unified framework for machine learning with time series

Top Related Projects

  • sktime (7,833): A unified framework for machine learning with time series
  • statsmodels: Statistical modeling and econometrics in Python
  • Prophet (18,363): Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
  • pmdarima: A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
  • tslearn (2,879): The machine learning toolkit for time series analysis in Python
  • pyflux (2,106): Open source time series library for Python

Quick Overview

sktime is a unified framework for machine learning with time series in Python. It provides a comprehensive set of tools for time series analysis, forecasting, and classification. sktime is designed to be modular, extensible, and compatible with scikit-learn, making it easy to integrate into existing machine learning workflows.

Pros

  • Comprehensive toolkit for various time series tasks (forecasting, classification, regression)
  • Consistent API design, following scikit-learn conventions
  • Extensive documentation and examples
  • Active community and regular updates

Cons

  • Steeper learning curve for beginners compared to some simpler time series libraries
  • Some advanced features may require additional dependencies
  • Performance can be slower for very large datasets compared to specialized libraries

Code Examples

  1. Time series forecasting with ARIMA:
from sktime.datasets import load_airline
from sktime.forecasting.arima import ARIMA

y = load_airline()
forecaster = ARIMA(order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])
  2. Time series classification with a time series forest:
from sktime.classification.interval_based import TimeSeriesForestClassifier
from sktime.datasets import load_arrow_head

# ArrowHead is a univariate dataset, which matches what this classifier expects
X_train, y_train = load_arrow_head(split="train")
X_test, y_test = load_arrow_head(split="test")

clf = TimeSeriesForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
  3. Time series clustering:
from sktime.clustering.k_means import TimeSeriesKMeans
from sktime.datasets import load_basic_motions

X, _ = load_basic_motions(return_X_y=True)

clusterer = TimeSeriesKMeans(n_clusters=4)
clusterer.fit(X)
cluster_labels = clusterer.predict(X)

Getting Started

To get started with sktime, first install it using pip:

pip install sktime

Then, import the necessary modules and load a dataset:

from sktime.datasets import load_airline
from sktime.forecasting.naive import NaiveForecaster
from sktime.split import temporal_train_test_split

y = load_airline()
y_train, y_test = temporal_train_test_split(y)

forecaster = NaiveForecaster(strategy="mean")
forecaster.fit(y_train)
y_pred = forecaster.predict(fh=[1, 2, 3])

This example demonstrates loading a dataset, splitting it into train and test sets, fitting a simple forecaster, and making predictions.
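
To make these steps concrete, here is a plain-Python sketch of what the example computes conceptually: a chronological split followed by a mean forecast. It does not use sktime. `temporal_split` and `mean_forecast` are hypothetical helper names, and the 25% test fraction and the toy values are assumptions chosen for illustration.

```python
from statistics import mean


def temporal_split(series, test_fraction=0.25):
    """Split a series chronologically: the most recent part becomes the test set."""
    n_test = max(1, round(len(series) * test_fraction))
    return series[:-n_test], series[-n_test:]


def mean_forecast(train, horizon):
    """Predict the training mean for every step in the horizon."""
    level = mean(train)
    return [level for _ in horizon]


y = [10, 12, 14, 16, 18, 20, 22, 24]  # toy values
y_train, y_test = temporal_split(y)
y_pred = mean_forecast(y_train, horizon=[1, 2, 3])
```

The important point is that the split never shuffles: the test set always comes after the training set in time, so the forecaster is evaluated only on observations it could not have seen.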

Competitor Comparisons

7,833

A unified framework for machine learning with time series

Pros of sktime

  • More comprehensive and actively maintained time series toolbox
  • Larger community and contributor base
  • Extensive documentation and examples

Cons of sktime

  • Potentially more complex API due to broader feature set
  • May have a steeper learning curve for beginners

Code Comparison

sktime:

from sktime.forecasting.naive import NaiveForecaster
from sktime.datasets import load_airline

y = load_airline()
forecaster = NaiveForecaster(strategy="mean")
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])

Summary

sktime is a well-established time series analysis and forecasting library for Python. It offers a wide range of algorithms and tools for various time series tasks. The repository is actively maintained and has a growing community of contributors and users. While it provides extensive functionality, this breadth may result in a more complex API and a steeper learning curve for newcomers.

Statsmodels: statistical modeling and econometrics in Python

Pros of statsmodels

  • More comprehensive statistical modeling capabilities, including econometrics and advanced regression techniques
  • Longer history and more established in the scientific community
  • Extensive documentation and examples for various statistical methods

Cons of statsmodels

  • Steeper learning curve due to its broader scope and more complex API
  • Less focus on time series forecasting compared to sktime
  • Slower development cycle and less frequent updates

Code comparison

statsmodels:

import statsmodels.api as sm
model = sm.OLS(y, X).fit()
predictions = model.predict(X_new)

sktime:

from sktime.forecasting.arima import ARIMA
forecaster = ARIMA()
forecaster.fit(y)
predictions = forecaster.predict(fh=[1, 2, 3])

Summary

statsmodels is a comprehensive statistical library with a wide range of modeling capabilities, while sktime focuses specifically on time series analysis and forecasting. statsmodels offers more advanced statistical techniques but has a steeper learning curve, whereas sktime provides a more user-friendly interface for time series tasks. The choice between the two depends on the specific requirements of your project and your familiarity with statistical concepts.

18,363

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.

Pros of Prophet

  • Designed specifically for business forecasting with built-in handling of holidays and seasonal effects
  • User-friendly interface with sensible defaults that often need little manual tuning
  • Robust to missing data and outliers

Cons of Prophet

  • Limited to univariate time series forecasting
  • Less flexible for custom model architectures compared to sktime
  • May struggle with complex, non-linear patterns in data

Code Comparison

Prophet:

from prophet import Prophet  # the package was renamed from fbprophet in version 1.0

model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)

sktime:

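from sktime.forecasting.arima import AutoARIMA

fh = [1, 2, 3]  # forecasting horizon: number of steps ahead to predict
forecaster = AutoARIMA()
forecaster.fit(y_train)  # y_train: a pandas Series of training observations
y_pred = forecaster.predict(fh=fh)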

Prophet focuses on simplicity and ease of use, while sktime offers a more comprehensive toolkit for various time series tasks. Prophet excels in business forecasting scenarios with clear seasonality and holiday effects, whereas sktime provides greater flexibility for different types of time series analysis and forecasting methods.

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

Pros of pmdarima

  • Specialized focus on ARIMA models, offering advanced features and optimizations
  • Includes automated model selection and hyperparameter tuning
  • Provides comprehensive documentation and examples for ARIMA-specific use cases

Cons of pmdarima

  • Limited to ARIMA models, lacking the broader range of time series algorithms found in sktime
  • Less flexibility for handling complex time series tasks beyond ARIMA modeling
  • Smaller community and ecosystem compared to sktime

Code Comparison

pmdarima:

from pmdarima import auto_arima

model = auto_arima(y, seasonal=True, m=12)
forecast = model.predict(n_periods=5)

sktime:

from sktime.forecasting.arima import AutoARIMA

forecaster = AutoARIMA(sp=12)
forecaster.fit(y)
forecast = forecaster.predict(fh=[1,2,3,4,5])

Both libraries offer auto ARIMA functionality, but pmdarima's implementation is more specialized and may offer additional ARIMA-specific options. sktime provides a more consistent API across various time series algorithms, making it easier to switch between different forecasting methods.
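
The practical benefit of a shared interface can be sketched without any library: if every forecaster exposes the same `fit` and `predict` methods, the surrounding code does not need to know which model it is running. The two classes below are hypothetical stand-ins written for illustration, not sktime or pmdarima estimators.

```python
class LastValueForecaster:
    """Hypothetical forecaster: repeats the last observed value."""

    def fit(self, y):
        self.last_ = y[-1]
        return self

    def predict(self, fh):
        return [self.last_ for _ in fh]


class DriftForecaster:
    """Hypothetical forecaster: extends the average step between observations."""

    def fit(self, y):
        self.last_ = y[-1]
        self.slope_ = (y[-1] - y[0]) / (len(y) - 1)
        return self

    def predict(self, fh):
        return [self.last_ + self.slope_ * h for h in fh]


y = [1.0, 2.0, 3.0, 4.0]
fh = [1, 2, 3]

# The loop relies only on the shared fit/predict interface.
forecasts = {
    type(model).__name__: model.fit(y).predict(fh)
    for model in (LastValueForecaster(), DriftForecaster())
}
```

Swapping in a different model means changing one entry in the tuple, which is the kind of substitution a consistent estimator API is meant to allow.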

2,879

The machine learning toolkit for time series analysis in Python

Pros of tslearn

  • Simpler API and easier to get started for beginners
  • Focuses specifically on time series tasks, potentially leading to more optimized implementations
  • Includes some unique algorithms not found in sktime, like GAK (Global Alignment Kernel)

Cons of tslearn

  • Smaller community and less frequent updates compared to sktime
  • More limited in scope, focusing mainly on clustering and classification tasks
  • Less integration with the broader scikit-learn ecosystem

Code Comparison

tslearn example:

from tslearn.clustering import TimeSeriesKMeans
kmeans = TimeSeriesKMeans(n_clusters=3, metric="dtw")
kmeans.fit(X_train)

sktime example:

from sktime.clustering.k_means import TimeSeriesKMeans
kmeans = TimeSeriesKMeans(n_clusters=3, metric="dtw")
kmeans.fit(X_train)

Both libraries offer similar functionality for time series clustering, with slight differences in API design. sktime generally follows scikit-learn conventions more closely, while tslearn has its own unique API style. sktime provides a broader range of functionality beyond just clustering and classification, making it more versatile for various time series tasks.
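
Both snippets cluster with dynamic time warping (DTW), a distance that lets two series stretch or compress along the time axis before they are compared. The following is a minimal textbook sketch of the DTW recurrence in plain Python, assuming a squared pointwise cost and no window constraint. `dtw_distance` is a hypothetical name, and library implementations may differ in details such as taking a square root of the result.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences (squared pointwise cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


same_shape = dtw_distance([0, 1, 2], [0, 1, 1, 2])  # same shape, different speed
different = dtw_distance([0, 1, 2], [2, 1, 0])      # reversed shape
```

The first pair has zero cost because the repeated value can be absorbed by warping, while the reversed series cannot be aligned without paying for mismatched values.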

2,106

Open source time series library for Python

Pros of pyflux

  • Focuses specifically on probabilistic time series modeling and Bayesian inference
  • Includes specialized models like GARCH for volatility forecasting
  • Provides built-in plotting and diagnostics for model evaluation

Cons of pyflux

  • Less actively maintained compared to sktime (last update in 2018)
  • More limited in scope, primarily for financial time series analysis
  • Smaller community and fewer contributors

Code Comparison

pyflux example:

import pyflux as pf
model = pf.ARIMA(data=y, ar=1, ma=1, family=pf.Normal())
model.fit()
model.plot_fit()

sktime example:

from sktime.forecasting.arima import ARIMA
forecaster = ARIMA(order=(1,0,1))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1,2,3])

Both libraries offer ARIMA modeling, but pyflux provides more built-in visualization tools, while sktime offers a broader range of time series algorithms and a more consistent API across different model types. sktime also integrates better with the wider scikit-learn ecosystem, making it more versatile for general machine learning tasks involving time series data.

README

Welcome to sktime

A unified interface for machine learning with time series

:rocket: Version 0.32.4 out now! Check out the release notes here.

sktime is a library for time series analysis in Python. It provides a unified interface for multiple time series learning tasks. Currently, this includes time series classification, regression, clustering, annotation, and forecasting. It comes with time series algorithms and scikit-learn compatible tools to build, tune and validate time series models.

Overview
Open Source | BSD 3-clause
Tutorials | Binder · YouTube
Community | Discord · Slack
CI/CD | GitHub Actions · Read the Docs · platform
Code | PyPI · conda · Python versions · Black
Downloads | PyPI downloads
Citation | Zenodo

:books: Documentation

Documentation
:star: Tutorials | New to sktime? Here's everything you need to know!
:clipboard: Binder Notebooks | Example notebooks to play with in your browser.
:woman_technologist: Examples | How to use sktime and its features.
:scissors: Extension Templates | How to build your own estimator using sktime's API.
:control_knobs: API Reference | The detailed reference for sktime's API.
:tv: Video Tutorial | Our video tutorial from 2021 PyData Global.
:hammer_and_wrench: Changelog | Changes and version history.
:deciduous_tree: Roadmap | sktime's software and community development plan.
:pencil: Related Software | A list of related software.

:speech_balloon: Where to ask questions

Questions and feedback are extremely welcome! We strongly believe in the value of sharing help publicly, as it allows a wider audience to benefit from it.

Type | Platforms
:bug: Bug Reports | GitHub Issue Tracker
:sparkles: Feature Requests & Ideas | GitHub Issue Tracker
:woman_technologist: Usage Questions | GitHub Discussions · Stack Overflow
:speech_balloon: General Discussion | GitHub Discussions
:factory: Contribution & Development | dev-chat channel · Discord
:globe_with_meridians: Meet-ups and collaboration sessions | Discord - Fridays 13 UTC, dev/meet-ups channel

:dizzy: Features

Our objective is to enhance the interoperability and usability of the time series analysis ecosystem in its entirety. sktime provides a unified interface for distinct but related time series learning tasks. It features dedicated time series algorithms and tools for composite model building such as pipelining, ensembling, tuning, and reduction, empowering users to apply an algorithm designed for one task to another.
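
Reduction is the least self-explanatory item in that list. A common form is to turn forecasting into tabular regression by sliding a fixed-length window over the series. The sketch below shows only that data transformation in plain Python. `sliding_windows` is a hypothetical helper, not sktime's implementation (sktime exposes reduction through `make_reduction` in `sktime.forecasting.compose`).

```python
def sliding_windows(series, window_length):
    """Turn a series into (features, target) rows for a tabular regressor."""
    X, y = [], []
    for end in range(window_length, len(series)):
        X.append(series[end - window_length:end])  # the preceding values
        y.append(series[end])                      # the value to predict
    return X, y


series = [1, 2, 3, 4, 5, 6]
X, y = sliding_windows(series, window_length=3)
# X holds one row per window, y the value that followed each window
```

Any regressor that accepts a feature matrix and a target vector can then be trained on `X` and `y`, which is how an algorithm designed for regression ends up solving a forecasting task.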

sktime also provides interfaces to related libraries, for example scikit-learn, statsmodels, tsfresh, PyOD, and fbprophet, among others.

Module | Status | Links
Forecasting | stable | Tutorial · API Reference · Extension Template
Time Series Classification | stable | Tutorial · API Reference · Extension Template
Time Series Regression | stable | API Reference
Transformations | stable | Tutorial · API Reference · Extension Template
Parameter fitting | maturing | API Reference · Extension Template
Time Series Clustering | maturing | API Reference · Extension Template
Time Series Distances/Kernels | maturing | Tutorial · API Reference · Extension Template
Time Series Alignment | experimental | API Reference · Extension Template
Annotation | experimental | Extension Template
Time Series Splitters | maturing | Extension Template
Distributions and simulation | experimental

:hourglass_flowing_sand: Install sktime

For troubleshooting and detailed installation instructions, see the documentation.

  • Operating system: macOS X · Linux · Windows 8.1 or higher
  • Python version: Python 3.8, 3.9, 3.10, 3.11, and 3.12 (only 64-bit)
  • Package managers: pip · conda (via conda-forge)

pip

Using pip, sktime releases are available as source packages and binary wheels. Available wheels are listed here.

pip install sktime

or, with maximum dependencies,

pip install sktime[all_extras]

For curated sets of soft dependencies for specific learning tasks:

pip install sktime[forecasting]  # for selected forecasting dependencies
pip install sktime[forecasting,transformations]  # forecasters and transformers

or similar. Valid sets are:

  • forecasting
  • transformations
  • classification
  • regression
  • clustering
  • param_est
  • networks
  • annotation
  • alignment

Caveat: in general, not all soft dependencies for a learning task are installed, only a curated selection.
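
Because only a selection of soft dependencies is installed, it can help to check what is available before using an estimator that needs an optional package. The following is a small standard-library sketch. `is_installed` is a hypothetical helper, and the package names in the list are an assumed selection for illustration, not an official list of sktime's soft dependencies.

```python
from importlib.util import find_spec


def is_installed(package_name):
    """Return True if a top-level package can be imported in this environment."""
    return find_spec(package_name) is not None


# Assumed selection of optional packages to check; adjust to your own needs.
optional_packages = ["statsmodels", "pmdarima", "prophet"]
missing = [name for name in optional_packages if not is_installed(name)]
if missing:
    print("Not installed:", ", ".join(missing))
```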

conda

You can also install sktime from conda via the conda-forge channel. The feedstock including the build recipe and configuration is maintained in this conda-forge repository.

conda install -c conda-forge sktime

or, with maximum dependencies,

conda install -c conda-forge sktime-all-extras

Because conda does not support dependency sets, a flexible choice of soft dependencies is not available via conda.

:zap: Quickstart

Forecasting

from sktime.datasets import load_airline
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.theta import ThetaForecaster
from sktime.split import temporal_train_test_split
from sktime.performance_metrics.forecasting import mean_absolute_percentage_error

y = load_airline()
y_train, y_test = temporal_train_test_split(y)
fh = ForecastingHorizon(y_test.index, is_relative=False)
forecaster = ThetaForecaster(sp=12)  # monthly seasonal periodicity
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
mean_absolute_percentage_error(y_test, y_pred)
# Output: 0.08661467738190656
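
For readers unfamiliar with the metric, the score above is a fraction rather than a percentage, so a value near 0.087 means the forecasts are off by roughly 8.7% on average. The sketch below shows the standard non-symmetric definition in plain Python. `mape` is a hypothetical helper, not the sktime function, and it assumes no true value is zero.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%)."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


# Relative errors are 10%, 10%, and 0%, so the mean is about 6.7%.
score = mape([100, 200, 400], [110, 180, 400])
```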

Time Series Classification

from sktime.classification.interval_based import TimeSeriesForestClassifier
from sktime.datasets import load_arrow_head
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_arrow_head()
X_train, X_test, y_train, y_test = train_test_split(X, y)
classifier = TimeSeriesForestClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
# Output: 0.8679245283018868 (varies between runs because the split and the forest are randomized)
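
Time series forest classifiers are generally described as summarising randomly chosen intervals of each series with simple statistics (mean, standard deviation, and slope) and training decision trees on those features. The sketch below computes these three features for one fixed interval in plain Python. `interval_features` is a hypothetical helper meant to illustrate the idea, not sktime's implementation.

```python
from statistics import mean, pstdev


def interval_features(series, start, end):
    """Summarise series[start:end] by its mean, standard deviation, and slope."""
    window = series[start:end]
    t = list(range(len(window)))
    t_mean, v_mean = mean(t), mean(window)
    # Least-squares slope of the values against their position in the window.
    slope = sum((ti - t_mean) * (vi - v_mean) for ti, vi in zip(t, window)) / sum(
        (ti - t_mean) ** 2 for ti in t
    )
    return v_mean, pstdev(window), slope


# The interval [1, 3, 5, 7] has mean 4 and rises by 2 per step.
features = interval_features([5, 1, 3, 5, 7, 2], start=1, end=5)
```

Repeating this over many random intervals produces a tabular feature matrix, which is why an ensemble of ordinary decision trees can then be used for classification.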

:wave: How to get involved

There are many ways to join the sktime community. We follow the all-contributors specification: all kinds of contributions are welcome - not just code.

Documentation
:gift_heart: Contribute | How to contribute to sktime.
:school_satchel: Mentoring | New to open source? Apply to our mentoring program!
:date: Meetings | Join our discussions, tutorials, workshops, and sprints!
:woman_mechanic: Developer Guides | How to further develop sktime's code base.
:construction: Enhancement Proposals | Design a new feature for sktime.
:medal_sports: Contributors | A list of all contributors.
:raising_hand: Roles | An overview of our core community roles.
:money_with_wings: Donate | Fund sktime maintenance and development.
:classical_building: Governance | How and by whom decisions are made in sktime's community.

:trophy: Hall of fame

Thanks to all our community for all your wonderful contributions, PRs, issues, ideas.


:bulb: Project vision

  • By the community, for the community -- developed by a friendly and collaborative community.
  • The right tool for the right task -- helping users to diagnose their learning problem and suitable scientific model types.
  • Embedded in state-of-the-art ecosystems and provider of interoperable interfaces -- interoperable with scikit-learn, statsmodels, tsfresh, and other community favorites.
  • Rich model composition and reduction functionality -- build tuning and feature extraction pipelines, solve forecasting tasks with scikit-learn regressors.
  • Clean, descriptive specification syntax -- based on modern object-oriented design principles for data science.
  • Fair model assessment and benchmarking -- build your models, inspect your models, check your models, and avoid pitfalls.
  • Easily extensible -- easy extension templates to add your own algorithms compatible with sktime's API.