Top Related Projects
A unified framework for machine learning with time series
Statsmodels: statistical modeling and econometrics in Python
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
The machine learning toolkit for time series analysis in Python
Open source time series library for Python
Quick Overview
sktime is a unified framework for machine learning with time series in Python. It provides a comprehensive set of tools for time series analysis, forecasting, and classification. sktime is designed to be modular, extensible, and compatible with scikit-learn, making it easy to integrate into existing machine learning workflows.
Pros
- Comprehensive toolkit for various time series tasks (forecasting, classification, regression)
- Consistent API design, following scikit-learn conventions
- Extensive documentation and examples
- Active community and regular updates
Cons
- Steeper learning curve for beginners compared to some simpler time series libraries
- Some advanced features may require additional dependencies
- Performance can be slower for very large datasets compared to specialized libraries
Code Examples
- Time series forecasting with ARIMA:
from sktime.datasets import load_airline
from sktime.forecasting.arima import ARIMA
y = load_airline()
forecaster = ARIMA(order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])
- Time series classification with a time series forest:
from sktime.classification.interval_based import TimeSeriesForestClassifier
from sktime.datasets import load_arrow_head
# ArrowHead is a univariate dataset, the same one used in the README quickstart
X_train, y_train = load_arrow_head(split="train")
X_test, y_test = load_arrow_head(split="test")
clf = TimeSeriesForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
- Time series clustering:
from sktime.clustering.k_means import TimeSeriesKMeans
from sktime.datasets import load_basic_motions
X, _ = load_basic_motions(return_X_y=True)
clusterer = TimeSeriesKMeans(n_clusters=4)
clusterer.fit(X)
cluster_labels = clusterer.predict(X)
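In the ARIMA example above, the middle value of `order=(1, 1, 1)` is the number of times the series is differenced before the ARMA part is fitted. As a rough illustration only, first-order differencing and its inverse can be sketched in pure Python. This is not how sktime or its backends implement it, and the helper names `difference` and `undifference` are made up for this sketch.

```python
def difference(series):
    """Return first differences: series[t] - series[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(first_value, diffs):
    """Invert first differencing by cumulatively summing from the first value."""
    out = [first_value]
    for d in diffs:
        out.append(out[-1] + d)
    return out


# Toy series, chosen only for illustration
y = [112, 118, 132, 129, 121]
dy = difference(y)                 # one value shorter than y
restored = undifference(y[0], dy)  # recovers the original series
```

A model fitted on `dy` forecasts changes rather than levels, so its predictions have to be summed back in the same way to obtain forecasts on the original scale.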
Getting Started
To get started with sktime, first install it using pip:
pip install sktime
Then, import the necessary modules and load a dataset:
from sktime.datasets import load_airline
from sktime.forecasting.naive import NaiveForecaster
from sktime.split import temporal_train_test_split
y = load_airline()
y_train, y_test = temporal_train_test_split(y)
forecaster = NaiveForecaster(strategy="mean")
forecaster.fit(y_train)
y_pred = forecaster.predict(fh=[1, 2, 3])
This example demonstrates loading a dataset, splitting it into train and test sets, fitting a simple forecaster, and making predictions.
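To make those steps concrete, the following pure-Python sketch shows what a temporal split and a mean-strategy forecast compute conceptually. It does not use sktime; the function names and the 25% test fraction are assumptions chosen for this illustration.

```python
def temporal_split(series, test_size=0.25):
    """Split in time order: earliest points for training, latest for testing."""
    n_test = max(1, round(len(series) * test_size))
    return series[:-n_test], series[-n_test:]


def mean_forecast(train, horizon):
    """Predict the training mean for every step in the horizon."""
    mean = sum(train) / len(train)
    return [mean for _ in horizon]


y = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
y_train, y_test = temporal_split(y)
y_pred = mean_forecast(y_train, horizon=[1, 2, 3])
```

Unlike a shuffled split, the test set here always lies after the training set in time, so the forecaster is never evaluated on points that precede its training data.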
Competitor Comparisons
A unified framework for machine learning with time series
Pros of sktime
- More comprehensive and actively maintained time series toolbox
- Larger community and contributor base
- Extensive documentation and examples
Cons of sktime
- Potentially more complex API due to broader feature set
- May have a steeper learning curve for beginners
Code Comparison
sktime:
from sktime.forecasting.naive import NaiveForecaster
from sktime.datasets import load_airline
y = load_airline()
forecaster = NaiveForecaster(strategy="mean")
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])
Summary
sktime is a well-established time series analysis and forecasting library for Python. It offers a wide range of algorithms and tools for various time series tasks. The repository is actively maintained and has a growing community of contributors and users. While it provides extensive functionality, this breadth may result in a more complex API and a steeper learning curve for newcomers.
Statsmodels: statistical modeling and econometrics in Python
Pros of statsmodels
- More comprehensive statistical modeling capabilities, including econometrics and advanced regression techniques
- Longer history and more established in the scientific community
- Extensive documentation and examples for various statistical methods
Cons of statsmodels
- Steeper learning curve due to its broader scope and more complex API
- Less focus on time series forecasting compared to sktime
- Slower development cycle and less frequent updates
Code Comparison
statsmodels:
import statsmodels.api as sm
model = sm.OLS(y, X).fit()
predictions = model.predict(X_new)
sktime:
from sktime.forecasting.arima import ARIMA
forecaster = ARIMA()
forecaster.fit(y)
predictions = forecaster.predict(fh=[1, 2, 3])
Summary
statsmodels is a comprehensive statistical library with a wide range of modeling capabilities, while sktime focuses specifically on time series analysis and forecasting. statsmodels offers more advanced statistical techniques but has a steeper learning curve, whereas sktime provides a more user-friendly interface for time series tasks. The choice between the two depends on the specific requirements of your project and your familiarity with statistical concepts.
Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
Pros of Prophet
- Designed specifically for business forecasting with built-in handling of holidays and seasonal effects
- User-friendly interface with default settings that often work without manual tuning
- Robust to missing data and outliers
Cons of Prophet
- Limited to univariate time series forecasting
- Less flexible for custom model architectures compared to sktime
- May struggle with complex, non-linear patterns in data
Code Comparison
Prophet:
from prophet import Prophet  # the package was named fbprophet before version 1.0
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)
sktime:
from sktime.forecasting.arima import AutoARIMA
forecaster = AutoARIMA()
forecaster.fit(y_train)
y_pred = forecaster.predict(fh=fh)
Prophet focuses on simplicity and ease of use, while sktime offers a more comprehensive toolkit for various time series tasks. Prophet excels in business forecasting scenarios with clear seasonality and holiday effects, whereas sktime provides greater flexibility for different types of time series analysis and forecasting methods.
A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
Pros of pmdarima
- Specialized focus on ARIMA models, offering advanced features and optimizations
- Includes automated model selection and hyperparameter tuning
- Provides comprehensive documentation and examples for ARIMA-specific use cases
Cons of pmdarima
- Limited to ARIMA models, lacking the broader range of time series algorithms found in sktime
- Less flexibility for handling complex time series tasks beyond ARIMA modeling
- Smaller community and ecosystem compared to sktime
Code Comparison
pmdarima:
from pmdarima import auto_arima
model = auto_arima(y, seasonal=True, m=12)
forecast = model.predict(n_periods=5)
sktime:
from sktime.forecasting.arima import AutoARIMA
forecaster = AutoARIMA(sp=12)
forecaster.fit(y)
forecast = forecaster.predict(fh=[1, 2, 3, 4, 5])
Both libraries offer auto ARIMA functionality, but pmdarima's implementation is more specialized and may offer additional ARIMA-specific options. sktime provides a more consistent API across various time series algorithms, making it easier to switch between different forecasting methods.
The machine learning toolkit for time series analysis in Python
Pros of tslearn
- Simpler API and easier to get started for beginners
- Focuses specifically on time series tasks, potentially leading to more optimized implementations
- Includes some unique algorithms not found in sktime, like GAK (Global Alignment Kernel)
Cons of tslearn
- Smaller community and less frequent updates compared to sktime
- More limited in scope, focusing mainly on clustering and classification tasks
- Less integration with the broader scikit-learn ecosystem
Code Comparison
tslearn example:
from tslearn.clustering import TimeSeriesKMeans
kmeans = TimeSeriesKMeans(n_clusters=3, metric="dtw")
kmeans.fit(X_train)
sktime example:
from sktime.clustering.k_means import TimeSeriesKMeans
kmeans = TimeSeriesKMeans(n_clusters=3, metric="dtw")
kmeans.fit(X_train)
Both libraries offer similar functionality for time series clustering, with slight differences in API design. sktime generally follows scikit-learn conventions more closely, while tslearn has its own unique API style. sktime provides a broader range of functionality beyond just clustering and classification, making it more versatile for various time series tasks.
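Both snippets cluster with dynamic time warping (DTW) as the distance. For readers unfamiliar with it, here is a minimal pure-Python sketch of the textbook DTW recursion with a squared-difference local cost. It is an illustration under those assumptions, not the implementation used by either library, which may differ in details such as windowing or taking a square root.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


x = [0.0, 1.0, 2.0, 1.0, 0.0]
x_delayed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same shape, one step later
```

Here `dtw_distance(x, x_delayed)` is zero because warping absorbs the delay, which is why DTW is a common choice when series share a shape but are misaligned in time.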
Open source time series library for Python
Pros of pyflux
- Focuses specifically on probabilistic time series modeling and Bayesian inference
- Includes specialized models like GARCH for volatility forecasting
- Provides built-in plotting and diagnostics for model evaluation
Cons of pyflux
- Less actively maintained compared to sktime (last update in 2018)
- More limited in scope, primarily for financial time series analysis
- Smaller community and fewer contributors
Code Comparison
pyflux example:
import pyflux as pf
model = pf.ARIMA(data=y, ar=1, ma=1, family=pf.Normal())
model.fit()
model.plot_fit()
sktime example:
from sktime.forecasting.arima import ARIMA
forecaster = ARIMA(order=(1, 0, 1))
forecaster.fit(y)
y_pred = forecaster.predict(fh=[1, 2, 3])
Both libraries offer ARIMA modeling, but pyflux provides more built-in visualization tools, while sktime offers a broader range of time series algorithms and a more consistent API across different model types. sktime also integrates better with the wider scikit-learn ecosystem, making it more versatile for general machine learning tasks involving time series data.
README
Welcome to sktime
A unified interface for machine learning with time series
:rocket: Version 0.32.4 out now! Check out the release notes here.
sktime is a library for time series analysis in Python. It provides a unified interface for multiple time series learning tasks. Currently, this includes time series classification, regression, clustering, annotation, and forecasting. It comes with time series algorithms and scikit-learn compatible tools to build, tune and validate time series models.
:books: Documentation
Documentation | |
---|---|
:star: Tutorials | New to sktime? Here's everything you need to know! |
:clipboard: Binder Notebooks | Example notebooks to play with in your browser. |
:woman_technologist: Examples | How to use sktime and its features. |
:scissors: Extension Templates | How to build your own estimator using sktime's API. |
:control_knobs: API Reference | The detailed reference for sktime's API. |
:tv: Video Tutorial | Our video tutorial from 2021 PyData Global. |
:hammer_and_wrench: Changelog | Changes and version history. |
:deciduous_tree: Roadmap | sktime's software and community development plan. |
:pencil: Related Software | A list of related software. |
:speech_balloon: Where to ask questions
Questions and feedback are extremely welcome! We strongly believe in the value of sharing help publicly, as it allows a wider audience to benefit from it.
Type | Platforms |
---|---|
:bug: Bug Reports | GitHub Issue Tracker |
:sparkles: Feature Requests & Ideas | GitHub Issue Tracker |
:woman_technologist: Usage Questions | GitHub Discussions · Stack Overflow |
:speech_balloon: General Discussion | GitHub Discussions |
:factory: Contribution & Development | dev-chat channel · Discord |
:globe_with_meridians: Meet-ups and collaboration sessions | Discord - Fridays 13 UTC, dev/meet-ups channel |
:dizzy: Features
Our objective is to enhance the interoperability and usability of the time series analysis ecosystem in its entirety. sktime provides a unified interface for distinct but related time series learning tasks. It features dedicated time series algorithms and tools for composite model building such as pipelining, ensembling, tuning, and reduction, empowering users to apply an algorithm designed for one task to another.
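Reduction, mentioned above, is the idea of recasting one learning task as another. One common case is turning forecasting into tabular regression with a sliding window, so that an ordinary regressor can be trained on lagged values. The sketch below shows only the windowing step in pure Python; `make_windows` is a hypothetical helper for illustration, not part of sktime's API.

```python
def make_windows(series, window_length):
    """Build (X, y) rows: each X row holds `window_length` lags, y is the next value."""
    X, y = [], []
    for start in range(len(series) - window_length):
        X.append(series[start:start + window_length])
        y.append(series[start + window_length])
    return X, y


series = [1, 2, 3, 4, 5, 6]
X, y = make_windows(series, window_length=3)
# X rows are consecutive windows; y holds the value that follows each window
```

Any regressor trained on `X` and `y` can then produce a one-step-ahead forecast from the most recent `window_length` observations.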
sktime also provides interfaces to related libraries, for example scikit-learn, statsmodels, tsfresh, PyOD, and fbprophet, among others.
Module | Status | Links |
---|---|---|
Forecasting | stable | Tutorial · API Reference · Extension Template |
Time Series Classification | stable | Tutorial · API Reference · Extension Template |
Time Series Regression | stable | API Reference |
Transformations | stable | Tutorial · API Reference · Extension Template |
Parameter fitting | maturing | API Reference · Extension Template |
Time Series Clustering | maturing | API Reference · Extension Template |
Time Series Distances/Kernels | maturing | Tutorial · API Reference · Extension Template |
Time Series Alignment | experimental | API Reference · Extension Template |
Annotation | experimental | Extension Template |
Time Series Splitters | maturing | Extension Template |
Distributions and simulation | experimental | |
:hourglass_flowing_sand: Install sktime
For troubleshooting and detailed installation instructions, see the documentation.
- Operating system: macOS X · Linux · Windows 8.1 or higher
- Python version: Python 3.8, 3.9, 3.10, 3.11, and 3.12 (only 64-bit)
- Package managers: pip · conda (via conda-forge)
pip
Using pip, sktime releases are available as source packages and binary wheels. Available wheels are listed here.
pip install sktime
or, with maximum dependencies,
pip install sktime[all_extras]
For curated sets of soft dependencies for specific learning tasks:
pip install sktime[forecasting] # for selected forecasting dependencies
pip install sktime[forecasting,transformations] # forecasters and transformers
or similar. Valid sets are:
forecasting
transformations
classification
regression
clustering
param_est
networks
annotation
alignment
Caveat: in general, not all soft dependencies for a learning task are installed, only a curated selection.
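Since a dependency set installs only a selection of soft dependencies, it can help to check whether a particular package is importable before using an estimator that relies on it. The sketch below uses only the standard library; the module names in the loop are examples, and which ones you actually need depends on the estimators you use.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Example soft dependencies (illustrative, not an exhaustive or official list)
for name in ("statsmodels", "pmdarima", "prophet"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a package is reported as missing, it can be installed on its own with pip rather than pulling in a whole dependency set.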
conda
You can also install sktime from conda via the conda-forge channel. The feedstock, including the build recipe and configuration, is maintained in this conda-forge repository.
conda install -c conda-forge sktime
or, with maximum dependencies,
conda install -c conda-forge sktime-all-extras
(As conda does not support dependency sets, a flexible choice of soft dependencies is unavailable via conda.)
:zap: Quickstart
Forecasting
from sktime.datasets import load_airline
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.theta import ThetaForecaster
from sktime.split import temporal_train_test_split
from sktime.performance_metrics.forecasting import mean_absolute_percentage_error
y = load_airline()
y_train, y_test = temporal_train_test_split(y)
fh = ForecastingHorizon(y_test.index, is_relative=False)
forecaster = ThetaForecaster(sp=12) # monthly seasonal periodicity
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
mean_absolute_percentage_error(y_test, y_pred)
>>> 0.08661467738190656
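The score above is a fraction, so a value of about 0.087 means the forecasts were off by roughly 8.7% of the true values on average. As a rough sketch of what the metric measures, here is the non-symmetric form of the mean absolute percentage error in pure Python. The helper name `mape` is made up for this illustration, and sktime's own function has additional options that are not shown.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (not multiplied by 100)."""
    errors = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


# Toy values: off by 10%, off by 10%, exact
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
score = mape(y_true, y_pred)
```

Because each error is divided by the true value, the metric is undefined when a true value is zero and becomes unstable when true values are close to zero.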
Time Series Classification
from sktime.classification.interval_based import TimeSeriesForestClassifier
from sktime.datasets import load_arrow_head
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
X, y = load_arrow_head()
X_train, X_test, y_train, y_test = train_test_split(X, y)
classifier = TimeSeriesForestClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred)
>>> 0.8679245283018868
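Interval-based classifiers such as a time series forest typically summarise randomly chosen intervals of each series with simple statistics (commonly mean, standard deviation, and slope) and train decision trees on those features. The sketch below shows only that feature-extraction idea for fixed intervals; it is an assumption-laden illustration with hypothetical helper names, not sktime's implementation.

```python
from statistics import mean, pstdev


def slope(values):
    """Least-squares slope of values against positions 0..n-1 (needs n >= 2)."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def interval_features(series, start, end):
    """Summarise series[start:end] by its mean, standard deviation, and slope."""
    segment = series[start:end]
    return [mean(segment), pstdev(segment), slope(segment)]


series = [1.0, 2.0, 3.0, 4.0, 10.0, 10.0, 10.0, 10.0]
rising = interval_features(series, 0, 4)  # steadily increasing segment
flat = interval_features(series, 4, 8)    # constant segment
```

The rising and flat segments give clearly different feature vectors, which is the kind of contrast a tree can split on.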
:wave: How to get involved
There are many ways to join the sktime community. We follow the all-contributors specification: all kinds of contributions are welcome - not just code.
Documentation | |
---|---|
:gift_heart: Contribute | How to contribute to sktime. |
:school_satchel: Mentoring | New to open source? Apply to our mentoring program! |
:date: Meetings | Join our discussions, tutorials, workshops, and sprints! |
:woman_mechanic: Developer Guides | How to further develop sktime's code base. |
:construction: Enhancement Proposals | Design a new feature for sktime. |
:medal_sports: Contributors | A list of all contributors. |
:raising_hand: Roles | An overview of our core community roles. |
:money_with_wings: Donate | Fund sktime maintenance and development. |
:classical_building: Governance | How and by whom decisions are made in sktime's community. |
:trophy: Hall of fame
Thanks to all our community for all your wonderful contributions, PRs, issues, ideas.
:bulb: Project vision
- By the community, for the community -- developed by a friendly and collaborative community.
- The right tool for the right task -- helping users to diagnose their learning problem and suitable scientific model types.
- Embedded in state-of-art ecosystems and provider of interoperable interfaces -- interoperable with scikit-learn, statsmodels, tsfresh, and other community favorites.
- Rich model composition and reduction functionality -- build tuning and feature extraction pipelines, solve forecasting tasks with scikit-learn regressors.
- Clean, descriptive specification syntax -- based on modern object-oriented design principles for data science.
- Fair model assessment and benchmarking -- build your models, inspect your models, check your models, and avoid pitfalls.
- Easily extensible -- easy extension templates to add your own algorithms compatible with sktime's API.