Life Expectancy Prediction Project

Overview

This project predicts life expectancy from socio-economic and health-related factors, using WHO data covering 2000 to 2015. The dataset contains approximately 3,000 records and 22 features.

Data Preprocessing

Key steps in data preprocessing included:

Handling null values using a Simple Imputer (median).
Label encoding categorical variables.
Splitting the dataset in an 80:20 train-test ratio.
Normalizing the data.
Addressing multicollinearity by removing highly correlated features.
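
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy data, the column names (`Status`, `gdp`, `gdp_copy`, `schooling`, `life_expectancy`), the 0.9 correlation threshold, and the use of `StandardScaler` for "normalizing" are all assumptions. The sketch also fits the imputer and scaler on the training split only, which is one common way to avoid leaking test information and may differ from the order listed above.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler


def preprocess(df, target, corr_threshold=0.9, seed=0):
    """Encode, split 80:20, impute, scale, and drop highly correlated features."""
    df = df.copy()

    # Label-encode every non-numeric column.
    for col in df.select_dtypes(exclude="number").columns:
        df[col] = LabelEncoder().fit_transform(df[col].astype(str))

    X, y = df.drop(columns=[target]), df[target]

    # 80:20 train-test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )

    # Median imputation, then scaling; both are fitted on the training split only.
    for step in (SimpleImputer(strategy="median"), StandardScaler()):
        X_train = pd.DataFrame(
            step.fit_transform(X_train), columns=X_train.columns, index=X_train.index
        )
        X_test = pd.DataFrame(
            step.transform(X_test), columns=X_test.columns, index=X_test.index
        )

    # Drop one feature from each pair whose absolute correlation exceeds the threshold.
    corr = X_train.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > corr_threshold).any()]
    return X_train.drop(columns=to_drop), X_test.drop(columns=to_drop), y_train, y_test


# Toy data with hypothetical column names (not the WHO dataset).
rng = np.random.default_rng(0)
n = 100
gdp = rng.normal(size=n)
schooling = rng.normal(size=n)
schooling[::10] = np.nan  # inject some missing values
toy = pd.DataFrame(
    {
        "Status": rng.choice(["Developed", "Developing"], size=n),
        "gdp": gdp,
        "gdp_copy": 2.0 * gdp + rng.normal(scale=0.01, size=n),  # near-duplicate of gdp
        "schooling": schooling,
        "life_expectancy": 70.0 + 3.0 * gdp + rng.normal(size=n),
    }
)

X_train, X_test, y_train, y_test = preprocess(toy, target="life_expectancy")
print(list(X_train.columns), X_train.shape, X_test.shape)
```

In this toy run the near-duplicate `gdp_copy` column is removed because its correlation with `gdp` exceeds the threshold.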

Feature Selection

Feature selection was performed with two wrapper methods, forward selection and backward elimination, applied to each of the following models:

Linear Regression
Decision Tree
Random Forest
XGBoost
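
As an illustration of the two wrapper methods, the sketch below uses scikit-learn's `SequentialFeatureSelector` with a linear regression on synthetic data. The notebook may rely on a different library or a hand-written loop; the helper `select_features`, the synthetic data, and the choice of two features to keep are assumptions made only for this example.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic data: only columns 0 and 3 influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=200)


def select_features(estimator, X, y, direction, n_features=2):
    """Return the indices of the features kept by a sequential wrapper search."""
    selector = SequentialFeatureSelector(
        estimator,
        n_features_to_select=n_features,
        direction=direction,  # "forward" adds features, "backward" removes them
        scoring="r2",
        cv=5,
    )
    selector.fit(X, y)
    return np.flatnonzero(selector.get_support())


forward = select_features(LinearRegression(), X, y, "forward")
backward = select_features(LinearRegression(), X, y, "backward")
print("forward:", forward, "backward:", backward)
```

The same helper could be called with a decision tree, random forest, or gradient-boosting regressor in place of `LinearRegression()`; wrapper methods refit the model many times, so tree ensembles take noticeably longer.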

Model Selection and Validation

Hyperparameter tuning was conducted using GridSearchCV.
Best-performing models: Random Forest and XGBoost, each combined with forward selection, achieving an R-squared of ~0.96.
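
A grid search of this kind might look like the sketch below, shown for a random forest only. The parameter grid, the 3-fold cross-validation, and the synthetic data are assumptions for illustration; they are not the settings used in the notebook, and the score printed here is unrelated to the ~0.96 reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression data standing in for the selected features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical grid; real grids are usually wider.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="r2",
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out 20%.
test_r2 = r2_score(y_test, search.best_estimator_.predict(X_test))
print(search.best_params_, round(test_r2, 3))
```

Reporting R-squared on the held-out split, rather than the cross-validated score alone, gives a check that the tuned model generalizes beyond the data used for tuning.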

Conclusion

The project shows that machine learning models can predict life expectancy accurately from these factors, with potential applications in healthcare policy and resource planning.

Repository Files

Life_expectancy_EDA.ipynb
Life_expectancy_prediction_forward-backward.ipynb
README.md

Bsarma25/Life-Expectancy-Prediction-Evaluating-ML-Regressors
