Customer Churn ML Project
A practical machine-learning project built to forecast which customers are most likely to cancel their service, helping businesses focus retention efforts where they matter most.
General Overview
- The bullets below give a high-level summary; full code, charts and technical details appear in the report section that follows.
- Why this data? 64 k real customer records from Kaggle were chosen to learn which subscribers are likely to leave. A near 50 % churn rate means every extra retained customer matters.
- Early clues: customers who pay late or call support often are much more likely to quit, while heavy users tend to stay.
- Preparing the file: removed the ID column (no predictive value), split the data into training and test groups, turned text fields such as “Gender” into yes/no columns, and rescaled all numeric columns to the same 0–1 range so that no feature dominates simply because of its units.
- Picking the six strongest signals: after several selection checks, the model focuses on Payment Delay, Support Calls, Gender (Male), Usage Frequency, Tenure and Age. Everything else added noise without improving accuracy.
- Models tried: a simple logistic regression (baseline), a nearest-neighbour lookup, and a decision tree that splits the data into “yes/no” paths you can read like a flow-chart.
- Tuning for best fit: automatic searches tested different settings (e.g. tree depth, number of neighbours) to find the sweet spot between under- and over-fitting.
- Final scores:
- Decision Tree — 94 % accuracy and excellent separation of quitters vs. stayers (AUC 0.986).
- K-Nearest Neighbours — 92 % accuracy (AUC 0.972).
- Logistic Regression — 82 % accuracy, kept as a simple reference.
- Business take-away: the decision-tree model is both the most accurate and the easiest to explain. Its rules can be turned into clear retention actions, e.g. flag customers with long payment delays and many support calls for proactive outreach.
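The workflow summarised in the bullets above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the notebook's actual code: the data is randomly generated by a hypothetical helper (`make_synthetic_customers`), the `CustomerID` and `Churn` column names and the churn rule are assumptions, and the hyperparameter grids are arbitrary examples. Scores from this toy data will not match the figures reported above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

NUMERIC = ["Payment Delay", "Support Calls", "Usage Frequency", "Tenure", "Age"]


def make_synthetic_customers(n=600, seed=0):
    """Hypothetical stand-in for the Kaggle file; values are random, not real."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "CustomerID": np.arange(n),  # assumed name of the ID column
        "Age": rng.integers(18, 66, n),
        "Gender": rng.choice(["Male", "Female"], n),
        "Tenure": rng.integers(1, 61, n),
        "Usage Frequency": rng.integers(1, 31, n),
        "Support Calls": rng.integers(0, 11, n),
        "Payment Delay": rng.integers(0, 31, n),
    })
    # Invented rule so the toy labels carry some learnable signal.
    score = (df["Payment Delay"] / 30 + df["Support Calls"] / 10
             - df["Usage Frequency"] / 30)
    df["Churn"] = (score + rng.normal(0, 0.2, n) > 0.5).astype(int)
    return df


def build_preprocessor():
    """Scale numeric columns to 0-1 and one-hot encode Gender."""
    return ColumnTransformer([
        ("num", MinMaxScaler(), NUMERIC),
        # drop="first" removes "Female", leaving a single "Male" indicator
        ("cat", OneHotEncoder(drop="first"), ["Gender"]),
    ])


# Example grids only; the settings searched in the notebook may differ.
CANDIDATES = {
    "logistic_regression": (LogisticRegression(max_iter=1000),
                            {"model__C": [0.1, 1.0, 10.0]}),
    "knn": (KNeighborsClassifier(),
            {"model__n_neighbors": [3, 5, 11]}),
    "decision_tree": (DecisionTreeClassifier(random_state=0),
                      {"model__max_depth": [3, 5, 8]}),
}


def evaluate_models(df):
    """Tune each candidate with cross-validation and score it on held-out data."""
    X = df.drop(columns=["CustomerID", "Churn"])  # ID has no predictive value
    y = df["Churn"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=0)

    results = {}
    for name, (model, grid) in CANDIDATES.items():
        pipe = Pipeline([("prep", build_preprocessor()), ("model", model)])
        search = GridSearchCV(pipe, grid, cv=5, scoring="roc_auc")
        search.fit(X_train, y_train)
        proba = search.predict_proba(X_test)[:, 1]
        results[name] = {
            "best_params": search.best_params_,
            "accuracy": accuracy_score(y_test, search.predict(X_test)),
            "auc": roc_auc_score(y_test, proba),
        }
    return results


results = evaluate_models(make_synthetic_customers())
for name, r in results.items():
    print(f"{name}: accuracy={r['accuracy']:.3f} auc={r['auc']:.3f} {r['best_params']}")
```

Wrapping the scaler and encoder in a `Pipeline` means they are refitted on each training fold during the search, so the test split never influences the preprocessing. That is one reasonable design choice for this kind of comparison, not necessarily how the notebook is organised.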
This project was developed as part of the Applied Machine Learning (CSM010-2024-APR) course within the Master’s degree in Computer Science at the University of London (2025).
Customer Churn Analysis Report
HTML export of the AML Customer Churn Jupyter Notebook
If the report does not load, open it here.