MATH 2820L Final Project by Nilai Vemula, Anvitha Kosuraju, and Sithara Samudrala
This project analyzes a diabetes dataset from the UCI Machine Learning Repository. Our project proposal describes how we planned to analyze and model the data. This write-up documents our process and findings, and our final presentation summarizes them in slides.
The code for this project can be found in the following R Markdown notebooks:
- Exploratory Data Analysis
- Linear Model (Markdown with output here)
- Random Forest Model (Markdown with output here)
A bonus Jupyter Notebook containing more advanced modeling of our data can be found here.
The code in these notebooks can be downloaded by cloning this repository using:
git clone https://github.com/NilaiVemula/diabetes-data-diving.git
Then, an R environment can be set up using renv. Once this project is launched, renv should automatically set up an environment, at which point you can run the following code in R to restore the project library:
renv::restore()
To run the Jupyter Notebook, you will need to set up a Python virtual environment using the following commands:
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
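Before launching the notebook, it can help to confirm that the virtual environment is actually active. The following is a minimal sketch using only the Python standard library; the helper name in_virtualenv is hypothetical and is not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        # Assumes the environment was created as "env", as in the commands above.
        print("No virtual environment detected; run 'source env/bin/activate' first.")
```

Run it with the same python3 you used to create the environment. If it reports that no environment is detected, activate the environment again before installing the requirements.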