This report details a statistical test comparing song durations in 1972 and 2019, and uses linear regression to look for a correlation between song duration and song popularity on Spotify in 2019.
The full report can be found under "Statistical Test - Full Report.pdf", and the corresponding Jupyter Notebook with the source code can be found under "Statistical Test - Full Report.ipynb".
The dataset, created by Yamaç Eren Ay via the Spotify API, can be found here: https://www.kaggle.com/yamaerenay/spotify-dataset-19212020-160k-tracks
*** The CSV dataset file is too large for direct upload to GitHub, so it has been compressed before upload. Please download and decompress it before running the source code in the Jupyter Notebook. ***
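
For readers who want a feel for the two techniques named at the top of this README before opening the notebook, here is a minimal, standard-library-only sketch. It is not the notebook's code. It assumes the duration comparison is done with Welch's two-sample t-test (this README does not say which test the report uses), and the function names `welch_t` and `simple_linear_regression` are hypothetical. The numbers at the bottom are made-up placeholders, not values from the dataset.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variance over sample size: squared standard error of each mean.
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t_stat, dof


def simple_linear_regression(x, y):
    """Return (slope, intercept, r) for a least-squares fit of y on x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Made-up placeholder values (durations in seconds), NOT taken from the dataset.
durations_1972 = [200.0, 210.0, 190.0, 205.0]
durations_2019 = [230.0, 250.0, 240.0, 260.0]
popularity_2019 = [60.0, 55.0, 58.0, 50.0]

t_stat, dof = welch_t(durations_1972, durations_2019)
slope, intercept, r = simple_linear_regression(durations_2019, popularity_2019)
print(f"Welch t = {t_stat:.3f}, dof = {dof:.1f}")
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.3f}")
```

With the real data, the two duration samples would be the durations of tracks released in each year, and the regression inputs would be the duration and popularity of the 2019 tracks. The notebook remains the authoritative version of the analysis.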