This repository contains the code used to explore crypto price/return data over time, model inter-coin relationships, and cluster coins accordingly using minimum spanning trees (MSTs). The MSTs are built from two measures of distance between coins: correlation and dynamic time warping (DTW).
This codebase features three key components:
- Data ingest (Python)
- Modelling scripts (Python)
- UI/Frontend (JavaScript)
To set up the repository and Python environment:
- Clone the repository: git clone https://github.com/JonathanBechtel/dva_project.git
- Create a Python 3.9 environment (e.g. conda create --name mst_tree_project python=3.9.16)
- Within the activated Python 3.9 environment, install the Python packages: pip install -r requirements.txt
To execute the web-based frontend/UI:
- In a command prompt/terminal, change to the /web directory within the cloned git repo.
- Execute the following command to start a local Python web server: python -m http.server (the command may be python3 rather than python on some systems)
- Navigate to http://localhost:8000 in a browser (tested with Chrome)
- For the 3D force-directed graph of the minimum spanning trees, click 3d-FDG.html
- Navigate the graph by clicking any individual node, clicking and dragging, etc.
To replicate the data ingest:
- Obtain a block.cc API key by registering for an account here: https://pro.block.cc/login
- Add your block.cc API key to /script/.env
- Run /script/crypto_data.py with Python to download the crypto data from block.cc (this requires a valid API key)
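As an illustration of the API key step, here is a minimal sketch of how a script could read a key from a .env file using only the standard library. The variable name BLOCKCC_API_KEY and the helper names are hypothetical; check crypto_data.py for the name it actually expects, and note that it may load the file differently (for example via a dotenv package).

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(env_path="script/.env", name="BLOCKCC_API_KEY"):
    """Return the key from the process environment, else from the .env file.

    BLOCKCC_API_KEY is a hypothetical variable name used for illustration.
    """
    if name in os.environ:
        return os.environ[name]
    path = Path(env_path)
    if not path.exists():
        raise FileNotFoundError(f"{env_path} not found; create it first")
    values = parse_env_text(path.read_text())
    if name not in values:
        raise KeyError(f"{name} is not set in {env_path}")
    return values[name]
```

Under these assumptions, a .env file would hold a single line such as BLOCKCC_API_KEY=your-key-here, and the file should stay out of version control.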
To replicate the modelling via the MSTs (correlation-based and dynamic-time-warping-based):
- Navigate to the /notebooks/ directory
- Open "MST - Second Go.ipynb" within a jupyterlab/notebook environment
- Execute the notebook to generate the following outputs: corr_graph.csv & dtw_graph.csv
- To convert these outputs to JSON (as used by the web frontend), execute: python /web/toJson.py
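To give a sense of what the CSV-to-JSON conversion involves, the sketch below turns an edge list into the node-link layout that force-directed graph libraries commonly consume. It is not the contents of toJson.py: the column names (source, target, weight), the function name, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
import json


def edges_csv_to_graph(csv_text, source_col="source", target_col="target",
                       weight_col="weight"):
    """Convert edge-list CSV text into a {"nodes": [...], "links": [...]} dict.

    The column names are hypothetical; adjust them to match the real CSV.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    node_ids = []   # keeps first-seen order of coins
    seen = set()
    links = []
    for row in reader:
        src, dst = row[source_col], row[target_col]
        for node in (src, dst):
            if node not in seen:
                seen.add(node)
                node_ids.append(node)
        links.append({"source": src, "target": dst,
                      "weight": float(row[weight_col])})
    return {"nodes": [{"id": n} for n in node_ids], "links": links}


# Made-up sample rows, only to show the shape of the data.
sample = "source,target,weight\nBTC,ETH,0.42\nETH,LTC,0.57\n"
graph = edges_csv_to_graph(sample)
print(json.dumps(graph, indent=2))
```

To apply the same idea to a file, read it with open("corr_graph.csv").read() and write the result with json.dump, after adjusting the column names to the actual headers.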
To generate the analysis of the MSTs (correlation-based and dynamic-time-warping-based):
- Open "DTW-Corr Metrics.ipynb" within a jupyterlab/notebook environment
- Execute the notebook to generate analysis of the two MST approaches in terms of modelling the crypto price/return relationships over time
- Open "Clustering & Visualization With DTW.ipynb" within a jupyterlab/notebook environment
- Execute the notebook to generate clusters based on the two MST approaches as well as cluster visualizations
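For readers unfamiliar with the two MST approaches, the following pure-Python sketch shows the general idea: compute a pairwise distance between coins (correlation-based or DTW-based), then keep the cheapest set of edges that connects every coin. It is an illustration under stated assumptions, not the notebooks' code. The correlation distance sqrt(2 * (1 - rho)) is a common choice in financial MST work and is assumed here, the DTW variant is the basic unconstrained form with absolute-difference cost, and the coin names and return values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Assumed metric: d = sqrt(2 * (1 - rho)), from 0 (rho=1) to 2 (rho=-1)."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - pearson(x, y))))


def dtw_distance(x, y):
    """Basic dynamic time warping with |a - b| as the local cost."""
    inf = float("inf")
    cost = [[inf] * (len(y) + 1) for _ in range(len(x) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat y[j-1]
                                    cost[i][j - 1],      # repeat x[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[len(x)][len(y)]


def minimum_spanning_tree(series, distance):
    """Kruskal's algorithm over the complete graph of pairwise distances."""
    names = sorted(series)
    edges = sorted(
        (distance(series[a], series[b]), a, b)
        for i, a in enumerate(names) for b in names[i + 1:]
    )
    parent = {name: name for name in names}

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Made-up return series for four hypothetical coins.
returns = {
    "AAA": [0.01, 0.02, -0.01, 0.03, 0.00],
    "BBB": [0.02, 0.04, -0.02, 0.06, 0.00],
    "CCC": [-0.01, -0.02, 0.01, -0.03, 0.00],
    "DDD": [0.00, 0.01, 0.02, -0.01, 0.03],
}
corr_tree = minimum_spanning_tree(returns, correlation_distance)
dtw_tree = minimum_spanning_tree(returns, dtw_distance)
print(corr_tree)
print(dtw_tree)
```

Each tree has one edge fewer than the number of coins. Because the two distance functions rank pairs differently, the resulting trees can differ, and that difference is the kind of thing the analysis notebooks compare.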