Supervised Learning

This archive contains the finalized projects completed for the 2022-2023 session of the "Supervised Learning" module. The first coursework covers linear regression and k-NN, while the second trains a perceptron to classify handwritten digits.

Below is a brief description of each coursework. See the corresponding code and reports for more information.

CW1

Linear Regression

We first illustrated the phenomena of overfitting and underfitting, and the effect of the hyperparameter $k$, using a polynomial basis and a $\sin(k\pi x)$ basis.

import numpy as np
import scipy.linalg

def coef_sin_reg(x, y, k):
    """Calculate the coefficients of linear regression with a sin(i*pi*x), i = 1..k, basis.

    Args:
        x (np.ndarray): input vector of length m
        y (np.ndarray): target vector of length m
        k (int): dimension of the feature map
    Returns:
        w (np.ndarray): w = (X'X)^(-1)X'y, the least-squares coefficients
    """
    assert len(x) == len(y)
    m = len(x)  # number of training points
    basis_x = np.zeros((m, k))
    for i in range(1, k + 1):
        basis_x[:, i - 1] = np.sin(i * np.pi * x)
    # Solve the normal equations X'X w = X'y
    return scipy.linalg.solve(basis_x.T @ basis_x, basis_x.T @ y)

Average test error versus the hyperparameter k over 100 runs, on a logarithmic scale.
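The experiment behind this plot can be sketched as follows. This is a minimal sketch rather than the coursework code: the data generator g, the noise level, the sample sizes, and the range of k below are illustrative assumptions, and it reuses coef_sin_reg from above to fit each basis dimension k and average the test mean squared error over repeated runs.

def sin_features(x, k):
    # Map a 1-D input vector to the k-dimensional sin(i*pi*x) basis, i = 1..k
    return np.column_stack([np.sin(i * np.pi * x) for i in range(1, k + 1)])

def avg_test_error(n_runs=100, m_train=30, m_test=1000, ks=range(1, 19)):
    # Hypothetical data generator and noise level; the coursework defines its own
    rng = np.random.default_rng(0)
    g = lambda x: np.sin(2 * np.pi * x) ** 2
    errors = np.zeros((n_runs, len(ks)))
    for run in range(n_runs):
        x_tr = rng.uniform(0, 1, m_train)
        y_tr = g(x_tr) + rng.normal(0, 0.07, m_train)
        x_te = rng.uniform(0, 1, m_test)
        y_te = g(x_te) + rng.normal(0, 0.07, m_test)
        for j, k in enumerate(ks):
            w = coef_sin_reg(x_tr, y_tr, k)        # least-squares fit in the sin basis
            y_hat = sin_features(x_te, k) @ w      # predictions on the test set
            errors[run, j] = np.mean((y_te - y_hat) ** 2)
    return errors.mean(axis=0)                     # average test MSE per k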

Kernel methods

We then extended linear regression with kernel methods to predict the median house price in Boston from one or more attributes.

Kernel Ridge Regression

We studied KRR with the Gaussian kernel and applied it to predicting the median house price in Boston. KRR shows its advantage on this nonlinear data set.

$$ K\left(\boldsymbol{x}_i, \boldsymbol{x}_j\right)=\exp \left(-\frac{\left\|\boldsymbol{x}_i-\boldsymbol{x}_j\right\|^2}{2 \sigma^2}\right) $$

from scipy.spatial.distance import cdist

def gaussian_kernel(x_1, x_2, sigma):
    """Gaussian kernel matrix between the rows of x_1 and x_2.

    Args:
        x_1 (np.ndarray): shape (m_1, n). m_1 examples, n features
        x_2 (np.ndarray): shape (m_2, n). m_2 examples, n features
        sigma (float): kernel width parameter

    Returns:
        K (np.ndarray): shape (m_1, m_2). Kernel matrix
    """
    assert x_1.shape[1] == x_2.shape[1]
    K = cdist(x_1, x_2, 'euclidean')           # pairwise Euclidean distances
    K = np.exp(-(K ** 2) / (2. * sigma ** 2))  # Gaussian kernel values
    return K

from numpy.linalg import inv

def train_kernel_ridge(x_train, y_train, sigma, gam):
    """Dual coefficients alpha of kernel ridge regression with the Gaussian kernel.

    Args:
        x_train (np.ndarray): shape (m, n_1). m examples, n_1 features
        y_train (np.ndarray): shape (m, n_2). m examples, n_2 targets
        sigma (float): kernel width parameter
        gam (float): regularization parameter

    Returns:
        alpha (np.ndarray): shape (m, n_2). alpha = (K + gam * m * I)^(-1) y
    """
    K = gaussian_kernel(x_train, x_train, sigma)
    ell = K.shape[0]  # number of training examples
    alpha = np.dot(inv(K + gam * ell * np.eye(ell)), y_train)
    return alpha
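
The dual coefficients are only useful together with a matching prediction step. A minimal sketch, assuming a held-out set x_test and the gaussian_kernel / train_kernel_ridge functions above; the sigma and gam values in the commented usage are hypothetical (in the coursework they are chosen by cross-validation over a grid):

def predict_kernel_ridge(x_test, x_train, alpha, sigma):
    """Predict with kernel ridge regression: y_hat = K(x_test, x_train) @ alpha."""
    K_test = gaussian_kernel(x_test, x_train, sigma)
    return K_test @ alpha

# Example usage (hypothetical parameter values):
# alpha = train_kernel_ridge(x_train, y_train, sigma=2 ** 7, gam=2 ** -26)
# y_pred = predict_kernel_ridge(x_test, x_train, alpha, sigma=2 ** 7)
# mse = np.mean((y_test - y_pred) ** 2)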

Train/test errors with their standard deviations for the four methods.

k-Nearest Neighbors

We implemented the k-NN algorithm and explored its performance as a function of k.


A visualization of a hypothesis.
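
A minimal sketch of a k-NN classification rule of this kind, assuming binary labels in {0, 1}, Euclidean distances, and random tie-breaking (these details are assumptions, not taken from the coursework code):

import numpy as np
from scipy.spatial.distance import cdist

def knn_predict(x_train, y_train, x_test, k):
    """Predict labels for x_test by majority vote among the k nearest training points."""
    dists = cdist(x_test, x_train, 'euclidean')         # (m_test, m_train) distance matrix
    nearest = np.argsort(dists, axis=1)[:, :k]          # indices of the k closest points
    votes = y_train[nearest].mean(axis=1)               # fraction of neighbours labelled 1
    y_hat = (votes > 0.5).astype(int)
    ties = votes == 0.5
    y_hat[ties] = np.random.randint(0, 2, ties.sum())   # break exact ties at random
    return y_hat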

We estimated the generalization error of k-NN as a function of k.


Generalization error of k-NN as a function of k.

We determined the optimal k as a function of the number of training points $m$.


The optimal k for a range of training-set sizes m over 100 runs.
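
A sketch of one way to run this protocol, under assumed sample sizes and a hypothetical data-sampling helper sample_data(m) (the coursework fixes its own distribution and settings); it reuses the knn_predict sketch above: for each m, repeat runs, pick the k minimising the test error, and average the chosen k over runs.

def optimal_k_curve(sample_data, ms=(100, 500, 1000, 2000, 4000), n_runs=100, k_max=49):
    """For each m, average over runs the k that minimises the test error."""
    best_k = np.zeros((len(ms), n_runs))
    for i, m in enumerate(ms):
        for run in range(n_runs):
            x_tr, y_tr = sample_data(m)
            x_te, y_te = sample_data(1000)              # fresh test set per run
            errs = [np.mean(knn_predict(x_tr, y_tr, x_te, k) != y_te)
                    for k in range(1, k_max + 1)]
            best_k[i, run] = 1 + int(np.argmin(errs))
    return best_k.mean(axis=1)                          # average optimal k per m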

CW2

Kernel perceptron

We applied the one-versus-rest method to train our k-class perceptron with kernel $K\left(\boldsymbol{x}_i, \boldsymbol{x}_t\right)$.

With the polynomial kernel $K_d(\boldsymbol{p}, \boldsymbol{q})=(\boldsymbol{p} \cdot \boldsymbol{q})^d$

We used 80% of the dataset to train our model and tested it on the rest. During training, we split off 10% of the training set as a validation set to determine the number of epochs. The model parameters are updated only during the training process.
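
A minimal sketch of a one-versus-rest kernel perceptron with the polynomial kernel. The class count, epoch count, and function names below are illustrative assumptions (the coursework chooses the number of epochs by validation, as described above); the actual implementation lives in the repository.

import numpy as np

def poly_kernel(P, Q, d):
    # Polynomial kernel K_d(p, q) = (p . q)^d, computed for all pairs of rows
    return (P @ Q.T) ** d

def train_ovr_perceptron(x_train, y_train, d, n_classes=10, n_epochs=5):
    """One-versus-rest kernel perceptron; returns dual weights alpha of shape (n_classes, m)."""
    m = x_train.shape[0]
    K = poly_kernel(x_train, x_train, d)                 # precomputed Gram matrix
    alpha = np.zeros((n_classes, m))
    for _ in range(n_epochs):
        for t in range(m):
            scores = alpha @ K[:, t]                     # score of each class on example t
            y_signed = np.where(np.arange(n_classes) == y_train[t], 1.0, -1.0)
            mistakes = y_signed * scores <= 0            # classifiers that got example t wrong
            alpha[mistakes, t] += y_signed[mistakes]     # perceptron update in dual form
    return alpha

def predict_ovr_perceptron(alpha, x_train, x_test, d):
    K_test = poly_kernel(x_train, x_test, d)             # (m_train, m_test)
    return np.argmax(alpha @ K_test, axis=0)             # highest-scoring class wins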


By cross-validation, we also found the best degree $d^*=6.0500 \pm 1.0235$, with a mean test error rate of $0.0438 \pm 0.0043$.


Five hardest digits to classify.

With the Gaussian kernel $K(\boldsymbol{p}, \boldsymbol{q})=e^{-c\|\boldsymbol{p}-\boldsymbol{q}\|^2}$

The parameter $c$ is chosen from the set $S = \{0.01, 0.1, 1, 10, 100\}$. Repeating the same protocol as for the polynomial kernel, we obtain a training error rate and a test error rate for each value of $c$. The test error rate reaches a local minimum at $c = 0.01$.
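
The same training routine can be reused with this kernel by swapping the kernel function; a brief sketch under the same assumptions as the perceptron sketch above (the commented grid search only outlines the procedure):

from scipy.spatial.distance import cdist

def gaussian_kernel_c(P, Q, c):
    # K(p, q) = exp(-c * ||p - q||^2) for all pairs of rows
    return np.exp(-c * cdist(P, Q, 'sqeuclidean'))

# Hypothetical grid search: retrain the one-versus-rest perceptron for each c in S,
# replacing poly_kernel with gaussian_kernel_c, and record train/test error rates.
# for c in [0.01, 0.1, 1, 10, 100]:
#     ...train, predict, and compare error rates as in the polynomial-kernel runs...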

