This project analyzes the Pima Indians Diabetes dataset and builds a model to predict whether a given observation is at risk of developing diabetes, based on the independent variables.
The dataset is available on Kaggle. It originally comes from the National Institute of Diabetes and Digestive and Kidney Diseases and can be used to predict whether a patient has diabetes based on certain diagnostic measurements.
Dataset and data information: https://www.kaggle.com/uciml/pima-indians-diabetes-database
- Import and seed the random number generators for reproducible results
- Import Pandas, plus Sequential and Dense from Keras
- Read the CSV file
- Create X and Y variables
- Split the data into training and test sets
- Define the Keras Sequential model with 3 hidden layers
- Compile the Keras model with accuracy as the classification metric
- Fit the model on the training dataset
- Evaluate and print the model accuracy on the test dataset
- Predict the classes for the test data using the trained model
- Create the confusion matrix
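
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. Several details are assumptions: the file name `diabetes.csv`, the seed value, the 80/20 split done with scikit-learn's `train_test_split`, the hidden layer sizes (12, 8, 4), the epoch and batch settings, and the 0.5 decision threshold. The helpers `confusion_counts` and `train_and_evaluate` are hypothetical names introduced here for illustration.

```python
SEED = 7  # assumed seed value; any fixed integer works


def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary 0/1 labels."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[int(actual)][int(predicted)] += 1
    return matrix


def train_and_evaluate(csv_path="diabetes.csv", seed=SEED):
    """Run the listed steps end to end (hypothetical helper)."""
    # Heavy dependencies are imported here so that confusion_counts
    # stays usable without TensorFlow installed.
    import pandas as pd
    import tensorflow as tf
    from sklearn.model_selection import train_test_split
    from tensorflow.keras.layers import Dense, Input
    from tensorflow.keras.models import Sequential

    # Seeds Python's random module, NumPy, and TensorFlow in one call.
    tf.keras.utils.set_random_seed(seed)

    # Read the CSV; "Outcome" is the 0/1 target column in the Kaggle file.
    df = pd.read_csv(csv_path)
    X = df.drop(columns="Outcome").to_numpy(dtype="float32")
    y = df["Outcome"].to_numpy(dtype="float32")

    # Assumed 80/20 train/test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )

    # Three hidden layers (sizes are assumptions) and a sigmoid output.
    model = Sequential([
        Input(shape=(X.shape[1],)),
        Dense(12, activation="relu"),
        Dense(8, activation="relu"),
        Dense(4, activation="relu"),
        Dense(1, activation="sigmoid"),
    ])
    model.compile(
        loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]
    )
    model.fit(X_train, y_train, epochs=100, batch_size=16, verbose=0)

    # evaluate() returns [loss, accuracy] given the metrics compiled above.
    _, accuracy = model.evaluate(X_test, y_test, verbose=0)
    print(f"Test accuracy: {accuracy:.3f}")

    # Threshold the predicted probabilities at 0.5 to get class labels.
    predicted = (model.predict(X_test, verbose=0) > 0.5).astype(int).ravel()
    return confusion_counts(y_test, predicted)
```

Calling `train_and_evaluate("diabetes.csv")` prints the test accuracy and returns the confusion matrix, with rows as actual classes and columns as predicted classes. scikit-learn's `confusion_matrix` would work equally well in place of the small helper.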