Auto-Sklearn: AutoML in Python

What you’ll get out of this post and what you’ll need to run the code

Step 1: Load data

import sklearn.datasets
import sklearn.model_selection
# We fetch the data using openml.org
X, y = sklearn.datasets.fetch_openml(data_id=31, return_X_y=True, as_frame=True)
# Split the data into train and test
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    X, y, test_size=0.4, random_state=42
)
X_train.info()

Step 2: Manually build a pipeline

from sklearn.compose import ColumnTransformer
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
# Create the estimator using the default parameters from the library
estimator_svc = SVC(
    C=1.0, kernel='rbf', gamma='scale', shrinking=True, tol=1e-3,
    cache_size=200, verbose=False, max_iter=-1, random_state=42
)
# Build and fit the pipeline: one-hot encode the categorical columns,
# pass the numeric columns through, scale everything, then fit the SVC
categorical_columns = [
    col for col in X_train.columns
    if X_train[col].dtype.name == 'category'
]
encoder = ColumnTransformer(
    transformers=[
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_columns)
    ],
    remainder='passthrough'
)
pipeline_svc = Pipeline([
    ('encoder', encoder),
    ('scaler', StandardScaler()),
    ('svc', estimator_svc),
])
pipeline_svc.fit(X_train, y_train)
# Score the model
prediction = pipeline_svc.predict(X_test)
performance_svc = accuracy_score(y_test, prediction)
print(f"SVC performance is {performance_svc}")
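
To make the encoder step of the pipeline above more concrete, here is a minimal sketch on a tiny made-up DataFrame. The column names, values, and the `toy_` variables are hypothetical and are not taken from the credit dataset; `sparse_threshold=0` is set only so that the result is a dense array that is easy to inspect.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: one categorical column and one numeric column
toy = pd.DataFrame({
    "housing": pd.Categorical(["own", "rent", "own", "free"]),
    "age": [35, 22, 47, 51],
})

# Same selection rule as in the pipeline: pick columns with category dtype
toy_categorical = [col for col in toy.columns if toy[col].dtype.name == "category"]

toy_encoder = ColumnTransformer(
    transformers=[("cat", OneHotEncoder(handle_unknown="ignore"), toy_categorical)],
    remainder="passthrough",
    sparse_threshold=0,  # assumption for readability: always return a dense array
)

# Three one-hot columns (free, own, rent) followed by the passed-through age
encoded = toy_encoder.fit_transform(toy)
print(encoded.shape)  # (4, 4)

# A category not seen during fit is encoded as all zeros instead of raising
unseen = pd.DataFrame({"housing": pd.Categorical(["shared"]), "age": [30]})
encoded_unseen = toy_encoder.transform(unseen)
print(encoded_unseen)
```

The `handle_unknown='ignore'` setting matters once the data is split: a category that only appears in the test set would otherwise make `transform` fail.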

Step 3: Use Auto-sklearn as a drop-in replacement

import autosklearn.classification
# Create and train the estimator
estimator_askl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,
    seed=42,
    resampling_strategy='cv',
    n_jobs=1,
)
estimator_askl.fit(X_train, y_train)
# Score the model
prediction = estimator_askl.predict(X_test)
performance_askl = accuracy_score(y_test, prediction)
print(f"Auto-Sklearn Classifier performance is {performance_askl}")
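
"Drop-in replacement" here means that the Auto-sklearn estimator is used through the same fit/predict calls as the manual pipeline. A rough sketch of that idea follows: the helper `fit_and_score` is a hypothetical name introduced for illustration, and the synthetic data and the two stand-in estimators are assumptions chosen so the snippet runs quickly without Auto-sklearn installed.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def fit_and_score(estimator, X_train, X_test, y_train, y_test):
    """Fit any estimator exposing fit/predict and return its test accuracy."""
    estimator.fit(X_train, y_train)
    return accuracy_score(y_test, estimator.predict(X_test))


# Synthetic stand-in data so the sketch runs without downloading anything
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=42)
# train_test_split returns X_train, X_test, y_train, y_test in that order
split = train_test_split(X_demo, y_demo, test_size=0.4, random_state=42)

# Stand-in estimators; the helper only relies on fit and predict
demo_scores = {
    "dummy": fit_and_score(DummyClassifier(strategy="most_frequent"), *split),
    "logreg": fit_and_score(LogisticRegression(max_iter=1000), *split),
}
print(demo_scores)
```

With the real data from Step 1, the same helper could in principle be called once with `pipeline_svc` and once with `estimator_askl`, since both are fitted and scored through the same two methods in the snippets above.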

Wrapping up on Auto-sklearn
