This archive contains the finalized projects completed during the 2022-2023 session of the "Supervised Learning" module. The first coursework covers linear regression and k-NN, while the second trains a perceptron to classify handwritten digits.
Brief descriptions of the two pieces of coursework follow. See the corresponding code and report for more information.
We first illustrated overfitting, underfitting and the role of hyper-parameters using linear regression with a polynomial basis and a $\sin(k \pi x)$ basis.
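```python
import numpy as np
import scipy.linalg

def coef_sin_reg(x, y, k):
    """Calculate coefficients of linear regression with a sin(k*pi*x) basis.

    Args:
        x (np.ndarray): m*1 vector
        y (np.ndarray): m*1 vector
        k (int): feature map from dim 1 to k
    Returns:
        w (np.ndarray): w = (X'X)^(-1)X'y coefficients of regression
    """
    m = len(x)  # number of input x
    assert len(x) == len(y)
    x = np.ravel(x)  # accept both (m,) and (m, 1) inputs for the column assignment
    basis_x = np.zeros((m, k))
    for i in range(1, k + 1):
        basis_x[:, i - 1] = np.sin(i * np.pi * x)
    # Solve the normal equations (X'X) w = X'y instead of inverting X'X explicitly.
    return scipy.linalg.solve(basis_x.T @ basis_x, basis_x.T @ y)
```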
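To make the overfitting and underfitting behaviour concrete, here is a minimal sketch that fits the sine basis for a few values of k and compares training and test error. The target function, noise level, sample sizes and helper names (`sin_features`, `fit_least_squares`, `mse`, `sample`) are assumptions chosen for illustration, not the coursework's configuration. It also swaps the normal equations for `np.linalg.lstsq`, which tolerates nearly collinear features.

```python
import numpy as np

def sin_features(x, k):
    """Map an m-vector x to the (m, k) matrix with columns sin(i*pi*x), i = 1..k."""
    i = np.arange(1, k + 1)
    return np.sin(np.pi * np.outer(x, i))

def fit_least_squares(features, y):
    """Least-squares coefficients via lstsq (not the normal equations)."""
    w, *_ = np.linalg.lstsq(features, y, rcond=None)
    return w

def mse(features, y, w):
    """Mean squared error of the linear predictor features @ w."""
    return float(np.mean((features @ w - y) ** 2))

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def sample(m):
    # Hypothetical data: a smooth target plus Gaussian noise.
    x = rng.uniform(0.0, 1.0, m)
    y = np.sin(2 * np.pi * x) ** 2 + rng.normal(0.0, 0.07, m)
    return x, y

x_train, y_train = sample(30)
x_test, y_test = sample(1000)

train_err, test_err = {}, {}
for k in (1, 5, 18):
    w = fit_least_squares(sin_features(x_train, k), y_train)
    train_err[k] = mse(sin_features(x_train, k), y_train, w)
    test_err[k] = mse(sin_features(x_test, k), y_test, w)
    print(f"k={k:2d}  train MSE={train_err[k]:.4f}  test MSE={test_err[k]:.4f}")
```

Because the bases are nested, the training error cannot increase with k. The test error is the quantity that reveals whether a given k underfits or overfits.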
We then extended linear regression with kernel methods to predict the median house price in Boston from one or more attributes.
We studied kernel ridge regression (KRR) with the Gaussian kernel and applied it to the same prediction task. KRR shows its advantage on nonlinear data sets.
```python
import numpy as np
from scipy.spatial.distance import cdist

def gaussian_kernel(x_1, x_2, sigma):
    """Gaussian kernel of x_1 and x_2.

    Args:
        x_1 (np.ndarray): shape (m_1, n). m_1 examples, n features
        x_2 (np.ndarray): shape (m_2, n). m_2 examples, n features
        sigma (float): kernel width
    Returns:
        K (np.ndarray): shape (m_1, m_2). Kernel matrix
    """
    assert x_1.shape[1] == x_2.shape[1]
    K = cdist(x_1, x_2, 'euclidean')  # pairwise Euclidean distances
    K = np.exp(-(K ** 2) / (2. * sigma ** 2))
    return K
```
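```python
import numpy as np
import scipy.linalg

def train_kernel_ridge(x_train, y_train, sigma, gam):
    """Dual coefficients alpha of kernel ridge regression.

    Args:
        x_train (np.ndarray): shape (m, n_1). m examples, n_1 features
        y_train (np.ndarray): shape (m, n_2). m examples, n_2 targets
        sigma (float): Gaussian kernel width
        gam (float): regularization parameter
    Returns:
        alpha (np.ndarray): shape (m, n_2).
    """
    K = gaussian_kernel(x_train, x_train, sigma)
    ell = K.shape[0]
    # Solve (K + gam * ell * I) alpha = y_train.
    # scipy.linalg.solve is assumed here, matching coef_sin_reg.
    alpha = scipy.linalg.solve(K + gam * ell * np.eye(ell), y_train)
    return alpha
```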
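Once the dual coefficients are known, a prediction for a new point is the kernel between that point and every training point, weighted by alpha. The sketch below shows this step end to end on synthetic one-dimensional data. The names `fit_krr` and `predict_krr`, the data and the parameter values are assumptions for illustration, and the Boston data set is not used here.

```python
import numpy as np
from scipy.linalg import solve
from scipy.spatial.distance import cdist

def gaussian_kernel(x_1, x_2, sigma):
    """Kernel matrix exp(-||a - b||^2 / (2 sigma^2)) for all row pairs."""
    return np.exp(-cdist(x_1, x_2, "sqeuclidean") / (2.0 * sigma ** 2))

def fit_krr(x_train, y_train, sigma, gam):
    """Dual coefficients alpha solving (K + gam * ell * I) alpha = y."""
    ell = x_train.shape[0]
    K = gaussian_kernel(x_train, x_train, sigma)
    return solve(K + gam * ell * np.eye(ell), y_train)

def predict_krr(x_query, x_train, alpha, sigma):
    """Prediction is the cross-kernel between query and training points times alpha."""
    return gaussian_kernel(x_query, x_train, sigma) @ alpha

# Hypothetical nonlinear data: noisy samples of sin(x).
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=(80, 1))
y = np.sin(x) + rng.normal(0.0, 0.1, size=(80, 1))
x_train, y_train, x_test, y_test = x[:60], y[:60], x[60:], y[60:]

sigma, gam = 1.0, 1e-3
alpha = fit_krr(x_train, y_train, sigma, gam)
y_hat = predict_krr(x_test, x_train, alpha, sigma)
test_mse = float(np.mean((y_hat - y_test) ** 2))
print(f"test MSE: {test_mse:.4f}")
```

In practice sigma and gam would be selected by cross-validation rather than fixed by hand as they are in this sketch.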
We implemented the k-NN algorithm and explored its performance as a function of k.
We estimated the generalization error of k-NN as a function of k.
We determined the optimal k as a function of the number of training points.
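The k-NN experiments above can be sketched as follows: classify each query point by majority vote among its k nearest training points, then estimate the error on held-out data for a range of k. The synthetic data, the label-noise rate and the names `knn_predict` and `sample` are assumptions for illustration and do not reproduce the coursework's experimental protocol.

```python
import numpy as np

def knn_predict(x_train, y_train, x_query, k):
    """Majority vote over the k nearest training points (Euclidean distance)."""
    # Pairwise squared distances, shape (n_query, n_train).
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]  # labels in {0, 1}, shape (n_query, k)
    return (votes.mean(axis=1) > 0.5).astype(int)  # a tie would go to class 0

rng = np.random.default_rng(0)

def sample(m, flip=0.1):
    # Hypothetical task: a linear boundary in the unit square with 10% label noise.
    x = rng.uniform(0.0, 1.0, size=(m, 2))
    y = (x[:, 0] + x[:, 1] > 1.0).astype(int)
    noisy = rng.random(m) < flip
    y[noisy] = 1 - y[noisy]
    return x, y

x_train, y_train = sample(400)
x_test, y_test = sample(1000)

# Odd k only, so binary votes never tie.
errors = {
    k: float(np.mean(knn_predict(x_train, y_train, x_test, k) != y_test))
    for k in range(1, 50, 2)
}
best_k = min(errors, key=errors.get)
print("best k:", best_k, "estimated error:", errors[best_k])
```

Repeating this over several training-set sizes and averaging over random draws gives an estimate of the optimal k as a function of the number of training points.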
We applied the one-versus-rest method to train our k-class kernel perceptron with the polynomial kernel $K_d(\boldsymbol{p}, \boldsymbol{q})=(\boldsymbol{p} \cdot \boldsymbol{q})^d$.
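A minimal sketch of a one-versus-rest kernel perceptron with the polynomial kernel is given here, assuming one binary classifier per class stored as a row of dual weights and prediction by the highest score. The function names, the toy three-class data and the epoch count are assumptions for illustration. The coursework code may organise the updates differently.

```python
import numpy as np

def polynomial_kernel(p, q, d):
    """K_d(p, q) = (p . q)^d for every pair of rows of p and q."""
    return (p @ q.T) ** d

def train_ovr_kernel_perceptron(x, y, n_classes, d, epochs=5):
    """Return alpha of shape (n_classes, m): one dual weight vector per class."""
    m = x.shape[0]
    K = polynomial_kernel(x, x, d)  # Gram matrix, computed once
    # targets[c, t] is +1 if example t belongs to class c, else -1.
    targets = np.where(y[None, :] == np.arange(n_classes)[:, None], 1.0, -1.0)
    alpha = np.zeros((n_classes, m))
    for _ in range(epochs):
        for t in range(m):
            scores = alpha @ K[:, t]  # one score per binary classifier
            wrong = targets[:, t] * scores <= 0
            # Only the classifiers that misclassify example t are updated.
            alpha[wrong, t] += targets[wrong, t]
    return alpha

def predict_ovr(alpha, x_train, x_query, d):
    """Predict the class whose binary classifier gives the highest score."""
    scores = alpha @ polynomial_kernel(x_train, x_query, d)
    return scores.argmax(axis=0)

# Hypothetical toy data: three noisy clusters on the unit circle.
rng = np.random.default_rng(0)
angles = 2 * np.pi * np.arange(3) / 3
centers = np.stack([np.cos(angles), np.sin(angles)], axis=1)
y = rng.permutation(np.repeat(np.arange(3), 30))
x = centers[y] + rng.normal(0.0, 0.1, size=(90, 2))

alpha = train_ovr_kernel_perceptron(x, y, n_classes=3, d=3, epochs=10)
train_acc = float(np.mean(predict_ovr(alpha, x, x, d=3) == y))
print(f"training accuracy: {train_acc:.3f}")
```

Precomputing the Gram matrix keeps each update to a single matrix-vector product, at the cost of memory quadratic in the number of training examples.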
We use 80% of the dataset to train our model and test it on the remaining 20%. During training, we hold out 10% of the training set as a validation set to determine the number of epochs. The model parameters are updated only during training.
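The split described above can be sketched with shuffled index arrays. This assumes the 10% validation share is taken from the 80% training portion, and that the number of epochs is chosen as the one with the lowest validation error. The helper names `split_indices` and `pick_n_epochs` and the example error values are hypothetical.

```python
import numpy as np

def split_indices(n_samples, test_frac=0.2, val_frac=0.1, seed=0):
    """Shuffle indices, then split into train / validation / test parts."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_test = int(round(test_frac * n_samples))
    test_idx, train_full = perm[:n_test], perm[n_test:]
    # The validation set is carved out of the training portion only.
    n_val = int(round(val_frac * len(train_full)))
    val_idx, train_idx = train_full[:n_val], train_full[n_val:]
    return train_idx, val_idx, test_idx

def pick_n_epochs(val_errors):
    """1-based index of the epoch with the lowest validation error."""
    return int(np.argmin(val_errors)) + 1

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))

# Hypothetical validation errors recorded after each of four epochs.
print(pick_n_epochs([0.20, 0.12, 0.15, 0.16]))
```

The test indices are never touched during training or epoch selection, so the test error remains an unbiased estimate.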
Also we found the best
with the Gaussian kernel $K(\boldsymbol{p}, \boldsymbol{q})=e^{-c|\boldsymbol{p}-\boldsymbol{q}|^2}$
The parameter c is chosen from set
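For reference, the Gaussian kernel in the form $e^{-c|\boldsymbol{p}-\boldsymbol{q}|^2}$ can be computed for all pairs of rows at once, as in the sketch below. The function name `gaussian_kernel_c` and the sample points are assumptions for illustration. Setting $c = 1/(2\sigma^2)$ recovers the $\sigma$ parameterization used for kernel ridge regression.

```python
import numpy as np
from scipy.spatial.distance import cdist

def gaussian_kernel_c(p, q, c):
    """K(p, q) = exp(-c * ||p - q||^2) for every pair of rows of p and q."""
    return np.exp(-c * cdist(p, q, "sqeuclidean"))

# Hypothetical points: the squared distance between the two rows is 2.
points = np.array([[0.0, 0.0], [1.0, 1.0]])
K = gaussian_kernel_c(points, points, c=0.5)
print(K)

# Same kernel written with a width sigma, using c = 1 / (2 * sigma^2).
sigma = 1.0
K_sigma = np.exp(-cdist(points, points, "sqeuclidean") / (2.0 * sigma ** 2))
```

Larger values of c make the kernel more local, so the decision boundary can follow the training data more closely.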