---
layout: default
---
| Topic | Hours |
|---|---|
| Intro, Math Recap | 5 |
| **Unsupervised Learning** | |
| Dimensionality Reduction (PCA, Eigenvectors, SVD) | 5 |
| Clustering (k-means, GMM) | 5 |
| **Supervised Learning, Non-parametric** | |
| Decision Trees | 5 |
| Random Forest / Nearest Neighbours | 5 |
| **Supervised Learning, Parametric** | |
| Linear Regression with Least Squares | 5 |
| Polynomial Regression, Under/Overfitting | 5 |
| Perceptron, Logistic Regression (LR) | 5 |
| SVM | 5 |
| **Deep Learning** | |
| From LR to Neural Nets | 15 |
| **Total** | **60** |
Intro:
- What is Machine Learning (ML)?
- From rule-based systems to systems that learn to make decisions
- Types of problems we can solve with ML
- Basic Statistics, Recap of Linear Algebra and Probability Theory
- Multivariate Gaussian distribution, Mahalanobis distance, L_p norm
- Correlation vs Causality
- The “curse” of dimensionality and the manifold assumption. Unintuitive properties of high-dimensional geometry.
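As a hands-on companion to the last point above, here is a minimal sketch (our own illustration, not course material; the function name `distance_concentration` is invented) of one unintuitive property of high-dimensional geometry: pairwise distances between random points concentrate, so their spread becomes tiny relative to their mean.

```python
import numpy as np

def distance_concentration(dim, n_points=200, seed=0):
    """Return std/mean of pairwise Euclidean distances for uniform points in [0,1]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_points, dim))
    # Squared distances via the Gram-matrix identity ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    d = np.sqrt(np.maximum(d2, 0.0))          # clip tiny negatives from roundoff
    d = d[np.triu_indices(n_points, k=1)]     # keep each unordered pair once
    return d.std() / d.mean()

print(distance_concentration(2))     # noticeable spread in 2-D
print(distance_concentration(1000))  # in 1000-D, all distances "look alike"
```

This is one reason nearest-neighbour methods degrade in high dimensions unless the data lies on a lower-dimensional manifold.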
Unsupervised Learning:
- Dimensionality Reduction: Principal Components Analysis (PCA), t-SNE
- Clustering: k-means, Expectation-Maximization (EM)
- Gaussian Mixture Model (GMM)
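The PCA-via-SVD connection listed above can be sketched in a few lines of NumPy (an illustrative sketch; the function signature and names like `n_components` are our own, loosely echoing scikit-learn's): center the data, take the top right-singular vectors as principal directions, and project onto them.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD: returns (projections, components, explained variances)."""
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                # principal directions (rows)
    explained_var = S[:n_components] ** 2 / (len(X) - 1)
    return Xc @ components.T, components, explained_var

rng = np.random.default_rng(0)
# Correlated 2-D data, stretched along one dominant direction
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [0.0, 0.5]])
Z, comps, var = pca(X, n_components=1)
print(var)  # most of the total variance lies along the first component
```

Working from the SVD of the centered data rather than the eigendecomposition of the covariance matrix is numerically more stable and is what library implementations typically do.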
Supervised Learning:
- Regression vs Classification
- Non-parametric models: The Nearest Neighbour (NN) Classifier, Decision Trees/Random Forest
- Polynomial Curve Fitting
- Parametric, Linear models: Linear Regression, Least Squares, Logistic Regression, Perceptron
- Gradient Descent
- Model complexity and the Bias-Variance Tradeoff; overfitting and underfitting; Empirical Risk Minimization, learning theory, regularization
- Support Vector Machines: Optimal hyperplane, margin, kernels
- “Deep Learning”: Overparametrized, non-linear models and differentiable programming
- Multilayer Perceptron
- The backpropagation algorithm
- Activation functions
- Analytical gradients vs. numerical gradients via finite differences
- Computational Graph and Automatic Differentiation
- Stochastic Gradient Descent (SGD) over mini-batches
- DNN parameter estimation for classification as maximum likelihood estimation (MLE)
- Loss function for classification: softmax+cross-entropy loss, information-theory interpretation.
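Several of the bullets above (logistic regression, cross-entropy, analytical vs. finite-difference gradients) can be tied together in one small gradient-check sketch. This is our own illustration, not course code, and all names in it are invented:

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean cross-entropy loss of logistic regression and its analytical gradient in w."""
    p = 1.0 / (1.0 + np.exp(-X @ w))               # sigmoid predictions
    eps = 1e-12                                    # guard against log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad = X.T @ (p - y) / len(y)                  # closed-form gradient
    return loss, grad

def numerical_grad(w, X, y, h=1e-6):
    """Central finite differences, one coordinate at a time."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = h
        g[i] = (loss_and_grad(w + e, X, y)[0] - loss_and_grad(w - e, X, y)[0]) / (2 * h)
    return g

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
w = rng.normal(size=3)

_, ga = loss_and_grad(w, X, y)
gn = numerical_grad(w, X, y)
print(np.max(np.abs(ga - gn)))  # should be tiny if the analytical gradient is correct
```

The same check scales to deep networks: finite differences are far too slow for training, but they remain the standard sanity check for hand-derived or autodiff gradients.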
Toolsets: Python, NumPy (matrix manipulation and linear algebra), scikit-learn (basic ML), matplotlib (visualization), PyTorch (automatic differentiation and neural nets).
Credits: This program and material were inspired by the following courses: Stanford CS229, Doretto CS691A, Intro to ML Padova, Stanford CS231n, Sapienza DLAI, Sapienza ML