---
layout: default
---

# Syllabus

> ℹ️ Provisional course agenda at a glance

| Topic | Hours |
| --- | --- |
| Intro, Math Recap | 5 |
| **Unsupervised Learning** | |
| Dimensionality Reduction (PCA, Eigenvectors, SVD) | 5 |
| Clustering (k-means, GMM) | 5 |
| **Supervised Learning, Non-parametric** | |
| Decision Trees | 5 |
| Random Forest / Nearest Neighbours | 5 |
| **Supervised Learning, Parametric** | |
| Linear Regression with Least Squares | 5 |
| Polynomial Regression, Under-/Overfitting | 5 |
| Perceptron, Logistic Regression (LR) | 5 |
| SVM | 5 |
| **Deep Learning** | |
| From LR to Neural Nets | 15 |
| **Total** | **60** |

## Program Outline in Detail (Tentative)

### Intro

- What is Machine Learning (ML)?
- From rule-based systems to systems that learn to make decisions
- Types of problems we can solve with ML
- Basic statistics; recap of linear algebra and probability theory
- Multivariate Gaussian distribution, Mahalanobis distance, L_p norm (see the sketch after this list)
- Correlation vs. causality
- The “curse” of dimensionality and the manifold assumption; unintuitive properties of high-dimensional geometry
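
As a taste of the math recap, here is a minimal NumPy sketch contrasting the Euclidean and Mahalanobis distances on a toy 2-D Gaussian sample; the data and all numbers are hypothetical, for illustration only:

```python
import numpy as np

# Toy 2-D dataset drawn from a correlated Gaussian (hypothetical numbers)
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0.0, 0.0],
                            cov=[[2.0, 0.8], [0.8, 1.0]],
                            size=500)

mu = X.mean(axis=0)               # sample mean
Sigma = np.cov(X, rowvar=False)   # sample covariance
Sigma_inv = np.linalg.inv(Sigma)

def mahalanobis(x, mu, Sigma_inv):
    """d(x) = sqrt((x - mu)^T Sigma^{-1} (x - mu))."""
    d = x - mu
    return np.sqrt(d @ Sigma_inv @ d)

x = np.array([3.0, -1.0])
print("Euclidean:  ", np.linalg.norm(x - mu))  # ignores the covariance structure
print("Mahalanobis:", mahalanobis(x, mu, Sigma_inv))
```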

### Unsupervised Learning

- Dimensionality Reduction: Principal Component Analysis (PCA), t-SNE (see the PCA sketch after this list)
- Clustering: k-means, Expectation-Maximization (EM)
- Gaussian Mixture Models (GMM)
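
The dimensionality-reduction bullet can be made concrete with a short sketch of PCA via the SVD of the centered data matrix, in the NumPy toolset listed at the end; the `pca` helper and the random data are illustrative assumptions, not course code:

```python
import numpy as np

def pca(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components
    via the SVD of the centered data matrix."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, i.e. the eigenvectors
    # of the sample covariance matrix.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:k]                         # (k, n_features)
    explained_var = S[:k] ** 2 / (len(X) - 1)   # eigenvalues of the covariance
    return X_centered @ components.T, components, explained_var

# Hypothetical usage on random data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z, W, var = pca(X, k=2)
print(Z.shape)  # (100, 2)
```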

### Supervised Learning

- Regression vs. Classification
- Non-parametric models: the Nearest Neighbour (NN) classifier, Decision Trees / Random Forests
- Polynomial curve fitting
- Parametric, linear models: linear regression, least squares, logistic regression, the Perceptron
- Gradient Descent
- Model complexity and the bias-variance tradeoff; overfitting and underfitting; empirical risk minimization, learning theory, regularization
- Support Vector Machines: optimal hyperplane, margin, kernels
- “Deep Learning”: overparametrized, non-linear models and differentiable programming (a minimal PyTorch sketch follows this list)
  - Multilayer Perceptron
  - The backpropagation algorithm
  - Activation functions
  - Analytical gradients vs. numerical gradients via finite differences
  - Computational graphs and Automatic Differentiation
  - Stochastic Gradient Descent (SGD) over mini-batches
  - DNN parameter estimation for classification as maximum likelihood estimation (MLE)
  - Loss functions for classification: softmax + cross-entropy, with an information-theoretic interpretation
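
To tie the deep learning bullets together, here is a minimal, hypothetical PyTorch sketch of that pipeline (an MLP with activation functions, softmax + cross-entropy as the MLE-derived loss, SGD over mini-batches, and backpropagation via automatic differentiation); the architecture, data, and hyperparameters are illustrative assumptions, not the course's reference code:

```python
import torch
from torch import nn

# Hypothetical toy problem: 3-class classification of 20-D inputs.
torch.manual_seed(0)
X = torch.randn(600, 20)
y = torch.randint(0, 3, (600,))

model = nn.Sequential(           # multilayer perceptron
    nn.Linear(20, 64),
    nn.ReLU(),                   # activation function
    nn.Linear(64, 3),            # outputs logits; softmax is folded into the loss
)
loss_fn = nn.CrossEntropyLoss()  # log-softmax + negative log-likelihood (the MLE view)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(5):
    perm = torch.randperm(len(X))
    for i in range(0, len(X), 64):       # SGD over mini-batches
        idx = perm[i:i + 64]
        loss = loss_fn(model(X[idx]), y[idx])
        opt.zero_grad()
        loss.backward()                  # backpropagation via autograd
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```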

Toolsets: Python, NumPy (matrix manipulation and linear algebra), scikit-learn (basic ML), matplotlib (visualization), PyTorch (automatic differentiation and neural nets).

Credits: This program and its material were inspired by the following courses: Stanford CS229, Doretto CS691A, Intro to ML Padova, Stanford CS231n, Sapienza DLAI, Sapienza ML.