Solving Merchant Attrition using Machine Learning

Business Understanding:

  • The first stage in CRISP-DM process is to understand the objective or problem statement that you are trying to solve.
  • Set objectives — describe the customer’s primary objective from a business perspective.
  • Business success criteria — describe the criteria for a successful outcome to the project from the business point of view. This could be specific and measurable; in our case, prediction of customer churn probability with certain confidence interval.

Data Understanding:

  • Most companies have a Data Dictionary available. It is essential to have an understanding of all the fields collected for the project.

Data Preparation:

  • Data cleaning — This is the most time-consuming step in the entire process. Each and every field from the data needs to be investigated and checked for inconsistencies. The following are some of the things to check for or perform on the data.
  • Missing Values
  • Outliers
  • Categorical to Numerical — one-hot encoding
  • Standardization or Normalization
  • Feature Engineering


  • Model Building — Start the model building process by choosing the appropriate algorithm based on the business use case. Build multiple models with different algorithms and choose the model that performs the best.
  • Hyperparameter tuning — Once the best model is chosen, try to optimize the algorithm by fine-tuning the parameters that are available to increase the performance of the model.


  • Assessment of data mining results — Summarize assessment results in terms of business success criteria, including a final statement regarding whether the project already meets the initial business objectives.


  • Deployment plan — Summarize your deployment strategy including the necessary steps and how to perform them.
  • Monitoring and maintenance plan — Summarize the monitoring and maintenance strategy, including the necessary steps and how to perform them.
  • Merchant transactional data
  • Demographic data
  • Pricing information
  • Merchant account information


  • Collect all required data and perform preprocessing, the steps involve Data Cleaning and Feature Engineering.
  • The preprocessed data is then fed to multiple machine learning algorithms. The model makes predictions based on its parameters.
  • These predictions are then evaluated using evaluation metrics. The evaluation metrics for classification algorithms are Accuracy, Recall, Precision, F1 score.
  • Based on these evaluation metrics the best performing model is selected, which is called Champion model and the 2nd best model is called Challenger.
  • Use Champion model predictions as the output.
  • The process is repeated on a monthly basis and predictions are made. Based on the metrics, if at any point the Challenger model performs better than the Champion model, use the Challenger model predictions as the output.
  • Predicted the churn probability with 93.76% accuracy (using Logistic Regression model).
  • The client was able to endorse retention plans on targeted churn probable merchants, which helped lower the attrition rate by 6%.



Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
ODSC - Open Data Science

ODSC - Open Data Science


Our passion is bringing thousands of the best and brightest data scientists together under one roof for an incredible learning and networking experience.