This project aims to predict which customers will default on their credit card repayments next month. The data set is based on the publicly available credit card default data set from the UCI Machine Learning Repository.
Details of the original data are here.
- Various machine learning algorithms have been applied to predict credit card default as part of a Kaggle competition. The performance metric used in the competition was the area under the receiver operating characteristic curve (AUC). An XGBoost model scored the highest AUC, 0.795, outperforming a random forest model, a generalized linear model, and a linear logistic model. Furthermore, because of the implications of classifying defaulters as non-defaulters (false negatives) in a banking scenario, we have decided that recall is the most appropriate metric, and we would recommend an XGBoost model.
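To make the two metrics mentioned above concrete, here is a minimal pure-Python sketch of how AUC and recall can be computed. The labels, scores, and the 0.5 threshold are made-up illustrative values, not taken from the data set or from any model in the competition, and the function names `recall` and `roc_auc` are our own. AUC is computed here with the pairwise-ranking formulation: the share of (defaulter, non-defaulter) pairs in which the defaulter receives the higher score.

```python
def recall(y_true, y_pred):
    """Share of actual defaulters (label 1) that were predicted as defaulters."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """Probability that a random defaulter is scored above a random
    non-defaulter; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = default next month, 0 = no default.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7]  # assumed model probabilities

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("recall:", round(recall(y_true, y_pred), 3))
print("AUC:", round(roc_auc(y_true, scores), 3))
```

Note that AUC depends only on how the scores rank the customers, while recall depends on the chosen threshold: lowering the threshold flags more customers and can only keep recall the same or raise it, at the cost of more false alarms.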
Edit on May 16, 2020