This repo recreates some popular machine learning models from scratch in Python for educational purposes. The code is neither as efficient nor as complete as what you would find in a library like sklearn, but the simplified implementations make it easier to understand how each algorithm works and why it is used for a particular dataset.
Every subfolder contains a demo.ipynb notebook with a practical application of the algorithm: it loads a dataset, performs a basic EDA, and fits the model. I have also published an article explaining each model in this repo in detail. I highly recommend reading them, as they cover how the model works, the math behind it, its benefits, assumptions, and drawbacks, and, most importantly, a breakdown and explanation of the code used to build it.
- AdaBoost: https://medium.com/stackademic/building-adaboost-from-scratch-in-python-18b79061fe01
- CART: https://medium.com/@cristianleo120/classification-and-regression-trees-cart-implementation-from-scratch-in-python-89efa31ad9a6
- ID3 - Decision Tree: https://medium.com/@cristianleo120/master-decision-trees-and-building-them-from-scratch-in-python-af173dafb836
- Naive Bayes Classifier: https://medium.com/ai-in-plain-english/naive-bayes-classifier-achieving-100-accuracy-on-iris-dataset-d6df3e927096
- Principal Component Analysis: https://medium.com/@cristianleo120/principal-component-analysis-pca-from-scratch-in-python-65998c681bc0
- Random Forest: https://medium.com/@cristianleo120/building-random-forest-from-scratch-in-python-16d004982788
- Stochastic Gradient Descent: https://medium.com/@cristianleo120/stochastic-gradient-descent-math-and-python-code-35b5e66d6f79
- Support Vector Classifier: https://medium.com/ai-in-plain-english/support-vector-classifiers-svcs-a-comprehensive-guide-a9115a99a94f
- XGBoost: https://medium.com/@cristianleo120/the-math-behind-xgboost-3068c78aad9d