GitHub - haowei772/Case_study_02: Classification case study: make the churn prediction based on Uber customer behavioral data.

Churn prediction of Uber customers

### Aim of the project:

Develop a model to predict the churn of Uber customers based on their behavioral data.

### Data source:

Uber customer behavioral data.

### Data analysis pipeline:

1. Data cleaning
1.1 Remove invalid and duplicated cases
1.2 Deal with missing data
a) Fill missing categorical entries with a new value, 'Missing value'
b) Impute missing customer ratings with the average rating of the matching customer subgroup in the training dataset
2. Feature engineering
2.1 Generate the features 'weekend ride', 'weekday ride', and 'average spending per ride'
3. Model development
3.1 Linear regression model as the baseline model
3.2 Random forest regression model
3.3 Gradient boosting regression model
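
The missing-data step of the pipeline above can be sketched roughly as follows. This is a minimal illustration with pandas, not the code from `Churn_Model.py`: the column names (`city`, `phone`, `rating`), the choice of `city` as the subgroup, and the fallback to the overall training mean for unseen subgroups are all assumptions made for the example. The rule for "invalid" cases is not described here, so only duplicates are dropped.

```python
import pandas as pd

def fit_rating_means(train, group_col, rating_col):
    """Learn subgroup mean ratings from the training data only."""
    group_means = train.groupby(group_col)[rating_col].mean()
    global_mean = train[rating_col].mean()  # fallback for unseen subgroups (assumption)
    return group_means, global_mean

def clean(df, categorical_cols, group_col, rating_col, group_means, global_mean):
    """Drop duplicates, label missing categoricals, impute missing ratings."""
    out = df.drop_duplicates().reset_index(drop=True)
    # 1.2 a) missing categorical entries get their own category
    out[categorical_cols] = out[categorical_cols].fillna("Missing value")
    # 1.2 b) missing ratings get the training-set mean of the row's subgroup
    fill = out[group_col].map(group_means).fillna(global_mean)
    out[rating_col] = out[rating_col].fillna(fill)
    return out

# Hypothetical toy data; real column names may differ.
train = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "phone": ["iPhone", None, "Android", "Android"],
    "rating": [4.0, 5.0, 3.0, None],
})
new_data = pd.DataFrame({
    "city": ["A", "B", "C", "A", "A"],
    "phone": [None, "Android", "iPhone", "iPhone", "iPhone"],
    "rating": [None, None, None, 5.0, 5.0],
})

group_means, global_mean = fit_rating_means(train, "city", "rating")
cleaned = clean(new_data, ["city", "phone"], "city", "rating", group_means, global_mean)
print(cleaned)
```

Computing the subgroup means on the training set alone and reusing them for any other data keeps information from the evaluation data out of the imputation.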

### Model evaluation:

The final gradient boosting model has an accuracy score of 0.79, a precision score of 0.81, and a recall score of 0.86.
The ROC curve.
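
For reference, accuracy, precision, and recall can all be derived from the four confusion-matrix counts. The sketch below uses only the standard library and made-up labels; it assumes the label `1` means "churned", which is a convention chosen for the example. It does not reproduce the scores reported above.

```python
def classification_scores(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # retained, predicted retained
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Made-up labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
scores = classification_scores(y_true, y_pred)
print(scores)
```

Precision answers "of the customers flagged as churners, how many actually churned", while recall answers "of the customers who churned, how many were flagged".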

The model indicates that the features with the greatest impact on customer churn are the customer and driver ratings, ride distance, the percentage of surge rides, and the promotion period.
Feature importance.
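
A feature-importance ranking of this kind can be read from a fitted gradient boosting model. The sketch below is only an illustration on synthetic data: it assumes scikit-learn's `GradientBoostingClassifier` (the pipeline above describes a gradient boosting regression model, so the estimator choice is an assumption), and the feature names are hypothetical stand-ins rather than the actual dataset columns.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Hypothetical feature names; the real dataset columns may differ.
feature_names = ["customer_rating", "driver_rating", "ride_distance", "surge_pct", "noise"]
X = rng.normal(size=(400, len(feature_names)))

# Synthetic churn label driven mostly by the first column and partly by the fourth.
signal = X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.3, size=400)
y = (signal < 0).astype(int)

model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# feature_importances_ holds impurity-based importances that sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances show which features the trees split on most usefully; they do not indicate the direction of a feature's effect on churn.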
