sktime — Python Toolbox for Machine Learning with Time Series

ODSC - Open Data Science
May 25, 2023

Editor’s note: Franz Kiraly is a speaker for ODSC Europe this June. Be sure to check out his talk, “sktime — Python Toolbox for Machine Learning with Time Series,” there!

Welcome to sktime, the open community and Python framework for all things time series. Here’s what you need to know:

  • sktime is a Python package for time series tasks like forecasting, classification, and transformations with a familiar and user-friendly scikit-learn-like API.
  • It is “batteries-included”, with easy-to-use components and extension templates for implementing your own. Build tuned auto-ML pipelines with a common interface to well-known libraries (scikit-learn, statsmodels, tsfresh, PyOD, fbprophet, and more!).
  • Our friendly and collaborative community is open to contributors from all backgrounds. We’re always looking for new algorithms to host; these are owned by their authors and maintained together with us. We welcome all forms of contribution, not just code.
  • At sktime, we’re committed to providing mentoring and upskilling opportunities for everyone, particularly underrepresented demographics. We offer onboarding events, mentoring, and summer internships to help you grow your skills and achieve your goals!
  • We believe in open source, open science, and open governance. That means we value transparent and participatory governance, and we’re dedicated to sharing benefits and opportunities with our ever-growing user base. Join us and help drive sktime forward as an organization!

Whether you’re an experienced data scientist working with time series or a newcomer to the field, join us! With a huge selection of components and algorithms, a supportive community, and a commitment to mentoring new contributors, sktime is the perfect choice for anyone who wants to take their professional journey to the next level.

At ODSC 2023, we will give an introduction to:

  • How to identify your time series learning task: is it forecasting? Classification? Annotation? Something else?
  • How to use sktime to apply common algorithms, and how to build your own pipelines for feature extraction, tuning, auto-ML, and more.
  • How to create an sktime-compatible estimator component, which can be used with sktime pipelines. How to maintain it in a private code base, or contribute to sktime’s algorithm library.
  • How to contribute to sktime’s roadmap and governance as a user, how to report bugs, how to get in touch with the developer and user community, and how to participate in community events.
  • How to start contributing to sktime. Start your open source ML journey today! Paid internship opportunities and unpaid mentoring opportunities are available!

sktime, the unified package for time series ML

sktime supports many time series learning tasks and objects: forecasting, classification, feature extraction, parameter fitting, and many others.

Forecasting with sktime is simple!

from sktime.datasets import load_airline
from sktime.forecasting.arima import ARIMA

y = load_airline()
fh = [1, 2, 3]
forecaster = ARIMA(order=(1, 1, 0))
forecaster.fit(y, fh=fh)
y_pred = forecaster.predict()
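
The forecasting horizon fh = [1, 2, 3] is relative: it asks for predictions 1, 2, and 3 steps past the end of the training series. As a rough illustration in plain Python (not sktime's own horizon handling), the hypothetical helper below maps relative steps to calendar months, assuming a monthly series whose last observation is December 1960, as in the airline data.

```python
def relative_to_absolute(last_year, last_month, fh):
    """Map relative monthly steps to (year, month) pairs.

    Hypothetical helper for illustration only; sktime does this internally.
    """
    out = []
    for step in fh:
        # count months from year 0 so that rolling into the next year is simple
        total = last_year * 12 + (last_month - 1) + step
        out.append((total // 12, total % 12 + 1))
    return out


fh = [1, 2, 3]
print(relative_to_absolute(1960, 12, fh))  # [(1961, 1), (1961, 2), (1961, 3)]
```

In other words, the three forecasts refer to January, February, and March 1961.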

(visualization too)

from sktime.utils import plotting

fig, ax = plotting.plot_series(y, y_pred, labels=["y", "y_pred"])

Easily swap in any forecasting algorithm of your choice with sktime — it’s that simple (and flexible)!

# if you prefer nixtla/statsforecast over pmdarima
from sktime.forecasting.statsforecast import StatsForecastAutoARIMA

forecaster = StatsForecastAutoARIMA()
forecaster.fit(y, fh=fh)

Interval and probabilistic forecasts are easy too:

coverage = 0.9
y_pred = forecaster.predict()  # point forecasts from the current forecaster
y_pred_ints = forecaster.predict_interval(coverage=coverage)
fig, ax = plotting.plot_series(
    y, y_pred, labels=["y", "y_pred"], pred_interval=y_pred_ints
)
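
A coverage of 0.9 corresponds to a symmetric interval between the 5% and 95% quantiles of the predictive distribution. How a forecaster arrives at those quantiles depends on the model. The sketch below is a deliberately simple stand-in, not what ARIMA or sktime does internally: it assumes zero-mean, normally distributed one-step errors of a naive last-value forecast, and the function name and toy data are made up for illustration.

```python
from statistics import NormalDist, stdev


def naive_interval(y, coverage=0.9):
    """One-step-ahead interval around a last-value forecast.

    Assumes i.i.d. zero-mean normal one-step errors; illustrative only.
    """
    # in-sample one-step errors of the naive forecast y[t-1] -> y[t]
    errors = [b - a for a, b in zip(y[:-1], y[1:])]
    sigma = stdev(errors)
    # coverage 0.9 -> upper quantile level 0.95 (lower level 0.05 by symmetry)
    upper_level = 0.5 + coverage / 2
    z = NormalDist().inv_cdf(upper_level)
    point = y[-1]
    return point - z * sigma, point + z * sigma


y_toy = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
lower, upper = naive_interval(y_toy, coverage=0.9)
print(round(lower, 1), round(upper, 1))
```

Raising the coverage widens the interval, since a larger share of the assumed error distribution has to fall inside it.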

sktime is composable and compatible with sklearn, and any sklearn-compatible estimator can be used, e.g., in reduction compositors:

import xgboost as xgb
from sktime.forecasting.compose import make_reduction

xgbregressor = xgb.XGBRegressor()
forecaster = make_reduction(xgbregressor, window_length=15, strategy="recursive")
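
Reduction turns forecasting into tabular regression: each window of past values becomes one feature row, and the value that follows becomes its target. With the recursive strategy, a one-step-ahead regressor is applied repeatedly, feeding each prediction back into the window. The plain-Python sketch below illustrates that idea only. It is not sktime's implementation, and DriftRegressor is a toy stand-in for a real regressor such as XGBRegressor.

```python
def make_windows(y, window_length):
    """Tabularize a series: rows of lagged values, next value as target."""
    n_rows = len(y) - window_length
    X = [y[i:i + window_length] for i in range(n_rows)]
    targets = [y[i + window_length] for i in range(n_rows)]
    return X, targets


class DriftRegressor:
    """Toy regressor with sklearn-like fit/predict: last feature plus mean step."""

    def fit(self, X, y):
        steps = [target - row[-1] for row, target in zip(X, y)]
        self.drift_ = sum(steps) / len(steps)
        return self

    def predict(self, X):
        return [row[-1] + self.drift_ for row in X]


def recursive_forecast(regressor, y, window_length, n_steps):
    """Fit once on windows, then predict step by step, reusing own predictions."""
    X, targets = make_windows(y, window_length)
    regressor.fit(X, targets)
    window = list(y[-window_length:])
    preds = []
    for _ in range(n_steps):
        next_value = regressor.predict([window])[0]
        preds.append(next_value)
        window = window[1:] + [next_value]  # slide the window forward
    return preds


series = [float(v) for v in range(1, 11)]  # 1.0, 2.0, ..., 10.0
print(recursive_forecast(DriftRegressor(), series, window_length=3, n_steps=3))
# [11.0, 12.0, 13.0]
```

Any object with fit and predict can take the place of the toy regressor, which is what makes sklearn-compatible estimators usable here.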

sktime has flexible pipeline building and composition. For example, a model with multiple seasonality and polynomial detrending, built as a pipeline:

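from sktime.forecasting.arima import ARIMA
from sktime.forecasting.trend import PolynomialTrendForecaster
from sktime.transformations.series.detrend import Deseasonalizer, Detrender

forecaster = (
    Detrender(PolynomialTrendForecaster(degree=3))
    * Deseasonalizer(model="multiplicative", sp=12)
    * Deseasonalizer(model="multiplicative", sp=3)
    * ARIMA()
)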
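
The * operator chains transformers in front of a forecaster: on fit, the series passes through each transformer in order; on predict, the forecasts are passed back through the inverse transformations in reverse order. A stripped-down sketch of this pattern in plain Python follows. All class names are invented for illustration, and the logic is far simpler than sktime's pipeline classes.

```python
class Transformer:
    """Base class: `*` chains transformers, or attaches a final forecaster."""

    def __mul__(self, other):
        if isinstance(other, Transformer):
            return Chain([self, other])
        return Pipeline(self, other)  # anything else is treated as a forecaster


class Chain(Transformer):
    def __init__(self, steps):
        self.steps = steps

    def fit(self, y):
        for step in self.steps:
            y = step.fit(y).transform(y)  # each step sees the previous output
        return self

    def transform(self, y):
        for step in self.steps:
            y = step.transform(y)
        return y

    def inverse_transform(self, y):
        for step in reversed(self.steps):  # undo the last transformation first
            y = step.inverse_transform(y)
        return y


class Scale(Transformer):
    def __init__(self, factor):
        self.factor = factor

    def fit(self, y):
        return self

    def transform(self, y):
        return [v / self.factor for v in y]

    def inverse_transform(self, y):
        return [v * self.factor for v in y]


class Demean(Transformer):
    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        return self

    def transform(self, y):
        return [v - self.mean_ for v in y]

    def inverse_transform(self, y):
        return [v + self.mean_ for v in y]


class LastValue:
    """Toy forecaster: repeats the last value it was fitted on."""

    def fit(self, y):
        self.last_ = y[-1]
        return self

    def predict(self, n_steps):
        return [self.last_] * n_steps


class Pipeline:
    def __init__(self, transformer, forecaster):
        self.transformer = transformer
        self.forecaster = forecaster

    def fit(self, y):
        self.forecaster.fit(self.transformer.fit(y).transform(y))
        return self

    def predict(self, n_steps):
        return self.transformer.inverse_transform(self.forecaster.predict(n_steps))


pipe = Scale(10.0) * Demean() * LastValue()
print(pipe.fit([10.0, 20.0, 30.0]).predict(2))  # [30.0, 30.0]
```

The forecaster only ever sees the transformed series, yet the predictions come back on the original scale.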

sktime also provides a flexible AutoML interface, exposing model architecture choice as tunable parameters:

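from sktime.forecasting.compose import MultiplexForecaster
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
# import path as of mid-2023; newer sktime releases provide SlidingWindowSplitter in sktime.split
from sktime.forecasting.model_selection import ForecastingGridSearchCV, SlidingWindowSplitter
from sktime.forecasting.naive import NaiveForecaster

forecaster = MultiplexForecaster(
    forecasters=[
        ("naive", NaiveForecaster(strategy="last")),
        ("ets", ExponentialSmoothing(trend="add", sp=12)),
    ]
)
cv = SlidingWindowSplitter(initial_window=int(len(y_train) * 0.5), window_length=30)  # y_train: training part of y
forecaster_param_grid = {"selected_forecaster": ["ets", "naive"]}
gscv = ForecastingGridSearchCV(forecaster, cv=cv, param_grid=forecaster_param_grid)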
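
The grid search above treats the choice of forecaster as just another hyperparameter: every candidate is evaluated on the same sliding-window splits, and the one with the lowest error wins. The following plain-Python sketch mimics that selection loop. The splitter, the two toy forecasters, and the error metric are simplified stand-ins chosen for illustration, not sktime's classes.

```python
def sliding_windows(n, window_length, fh=1):
    """Yield (train_indices, test_index) pairs for a fixed-length sliding window."""
    for start in range(n - window_length - fh + 1):
        train = list(range(start, start + window_length))
        yield train, start + window_length + fh - 1


def naive_last(train):
    """Forecast the last observed value."""
    return train[-1]


def window_mean(train):
    """Forecast the mean of the training window."""
    return sum(train) / len(train)


def select_forecaster(y, candidates, window_length):
    """Return the best candidate name (lowest mean absolute error) and all scores."""
    scores = {}
    for name, forecaster in candidates.items():
        errors = [
            abs(y[test] - forecaster([y[i] for i in train]))
            for train, test in sliding_windows(len(y), window_length)
        ]
        scores[name] = sum(errors) / len(errors)
    best = min(scores, key=scores.get)
    return best, scores


y_toy = [1, 2, 3, 4, 5, 6, 7, 8]
candidates = {"naive": naive_last, "mean": window_mean}
best, scores = select_forecaster(y_toy, candidates, window_length=3)
print(best, scores)
```

On this steadily increasing toy series the last-value forecaster is selected, because the window mean always lags further behind the next observation.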

Looking for more resources on sktime? Check out our tutorials and introductions to get up to speed quickly:

  • Recorded video tutorials: http://www.sktime.net/en/latest/tutorials.html
  • Tutorial notebooks on Binder: https://mybinder.org/v2/gh/sktime/sktime/main?filepath=examples

sktime, the open community library

We invite everyone to join and contribute! https://www.sktime.net/en/latest/get_involved/contributing.html

If you need help getting started, don’t hesitate to reach out! Our community actively collaborates on Discord: https://discord.com/invite/54ACzaFsn7

We have:

  • Community collaboration sessions, currently every Friday 4pm UTC (but flexible)
  • Responsive maintainers — chat with us in the help-desk or dev-chat channels
  • Topical workstreams such as deep learning and probabilistic forecasting
  • Community mentoring, see below

sktime is designed to be a contributable and extensible library: both contributions to the package and local, private, interface-compliant components are supported. We provide extension templates for all supported learning tasks so that you can write your own components.

  • Option 1: you want an estimator in sktime? Fork the repo, fill in the extension template, and make a pull request. This makes you the owner of the estimator in the library (if you want it).
  • Option 2: write it locally, e.g., in a repository importing sktime. Use the check_estimator utility for testing and continuous integration.

https://www.sktime.net/en/latest/developer_guide/add_estimators.html
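
In both cases the pattern is the same: the extension template asks you to fill in private methods such as _fit and _predict, while the base class supplies the public fit and predict with input checks and state handling. The sketch below imitates that division of labour with a tiny hand-written base class. It is a simplified stand-in for illustration and does not reproduce sktime's BaseForecaster or its checks.

```python
class MiniBaseForecaster:
    """Simplified stand-in for a framework base class (not sktime's own)."""

    def __init__(self):
        self._is_fitted = False

    def fit(self, y):
        if len(y) == 0:
            raise ValueError("y must contain at least one observation")
        self._fit(list(y))  # delegate the model-specific work
        self._is_fitted = True
        return self

    def predict(self, fh):
        if not self._is_fitted:
            raise RuntimeError("call fit before predict")
        return self._predict(list(fh))

    def _fit(self, y):
        raise NotImplementedError

    def _predict(self, fh):
        raise NotImplementedError


class MeanForecaster(MiniBaseForecaster):
    """A user-written component: only the private methods are implemented."""

    def _fit(self, y):
        self.mean_ = sum(y) / len(y)

    def _predict(self, fh):
        return [self.mean_ for _ in fh]


forecaster = MeanForecaster().fit([2.0, 4.0, 6.0])
print(forecaster.predict([1, 2, 3]))  # [4.0, 4.0, 4.0]
```

Because the public methods live in the base class, every component built this way behaves consistently, which is what lets a shared test utility check them all.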

sktime, the open mentoring and upskilling platform

At sktime, we’re dedicated to creating opportunities for the next generation of open source leaders around the world. That’s why we offer both internships and a mentoring program, and we encourage everyone who’s interested to apply.

Our mentoring program pairs you with an experienced tech mentor who will work with you on specific learning goals related to coding or methods. It’s open to everyone, and you can apply at any time. Check out the details here: https://www.sktime.net/en/latest/get_involved/mentoring.html

If you’re interested in a competitively paid internship with sktime in the 2023 round, be sure to apply by the May 19 deadline listed here: https://github.com/sktime/mentoring/blob/main/internships/sktime2023.md

Our governance model is designed to be open, transparent, and inclusive. We welcome contributions from not just developers, but also regular users and early career data scientists who want to help shape the direction of sktime.

In 2023, we plan to expand our governance model to empower an even wider user and community base. We encourage you to complete your user registration here: https://forms.gle/eVuzrCjKDRupxawL7. By doing so, you’ll be included in roadmap planning and community events, and your bug reports will receive priority. You will also be eligible to vote and run in upcoming elections (subject to terms and conditions). We also appreciate any contributions to refining our governance model, such as participating in our governance working group (governance channel on https://discord.com/invite/54ACzaFsn7).

sktime is made possible by the dedicated efforts of our active community members. To support our operations and charitable mission, and to have a voice in our development roadmap and events, please consider making a small donation via Open Collective (https://github.com/sponsors/sktime), or sponsoring an internship in collaboration with your home organisation, which is an excellent opportunity for joint upskilling (send an email to sktime.toolbox@gmail.com).

Looking forward to meeting you!

To summarize, sktime is:

  • a unified framework library for machine learning with time series, with an sklearn-like API, batteries included and ready to use
  • an open community library, easy to contribute to and easy to extend
  • an open mentoring and upskilling platform: we welcome you to join mentoring, internships, or the governing bodies!

We can’t wait to see you at ODSC (and hopefully, later, as a user, contributor, mentee, or sponsor of sktime)!

About the author:

Franz Kiraly is the founder and a core developer of the open source framework sktime. His research is focused on software engineering for open source and data science, machine learning for structured learning tasks such as time series tasks, and robust empirical and statistical evaluation of algorithms in deployment. Franz held a faculty position at University College London 2013–2020, before he moved to industry R&D in principal data scientist roles.

Originally posted on OpenDataScience.com

