This repository contains a hobby project on data analysis and machine learning for predicting diabetes using the Pima Indians Diabetes dataset from the National Institute of Diabetes and Digestive and Kidney Diseases. The project is built on Jupyter Notebooks and aims to explore the data and build a machine learning model for diabetes prediction.
The Pima Indians Diabetes dataset used in this project can be found in my GitHub repository. The dataset includes parameters such as Body Mass Index (BMI) and insulin levels, which are used to predict the presence of diabetes in individuals.
The primary objectives of this project are:
- Diabetes Prediction: Utilize machine learning techniques to predict whether an individual has diabetes based on available parameters.
- Insights and Analysis: Explore insights from the data to understand correlations and dependencies, identifying which parameters influence diabetes prediction the most.
The project notebook is divided into five main sections:
- Data Exploration: Analysis of the dataset, including statistical summaries, data visualization, and insights into key features.
- Data Preprocessing: Handling missing values, scaling, and encoding categorical variables for model preparation.
- Model Building: Training various machine learning models using the prepared dataset.
- Model Evaluation: Performance evaluation of models using metrics like accuracy, precision, recall, and F1-score.
- Feature Importance: Determining the most influential parameters in predicting diabetes.
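The four metrics named under Model Evaluation can all be derived from the counts of true/false positives and negatives. The following is a minimal sketch in plain Python, not the notebook's actual code; the function name `classification_metrics` and the sample labels are hypothetical, and it assumes binary labels where 1 means diabetic and 0 means non-diabetic.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumes 1 = diabetic (positive class) and 0 = non-diabetic.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels for illustration only, not results from the dataset.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In practice, scikit-learn's `sklearn.metrics` module provides `accuracy_score`, `precision_score`, `recall_score`, and `f1_score`, which compute the same quantities. Because missing a diabetic case and raising a false alarm carry different costs, it is worth reading recall and precision alongside accuracy rather than relying on accuracy alone.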
- Clone or download this repository to your local machine: https://github.com/MTank76/Diabetes-Prediction-System.git
- Open the Jupyter Notebook (`Diabetes_Prediction_system.ipynb`) in your preferred environment to explore the project and its findings.
Contributions are welcome! If you'd like to contribute to this project, feel free to open issues for suggestions or submit pull requests with proposed enhancements.