The goal of this project is to develop and evaluate a range of machine learning and deep learning models to classify events as either "Signal" or "Noise." This binary classification problem is crucial in domains such as high-energy physics, finance, and anomaly detection, where identifying meaningful events amidst large volumes of background data is essential.
We used a private dataset from the Belle II detector, with 59 features and 70,000 examples. We built and evaluated the following models:
1 Logistic Regression from scratch
2 DNN - simple to complex architectures (3 in total)
3 XGBoost - with feature importance and leaf visualization of a decision tree
4 K-Nearest Neighbours with dimensionality reduction using PCA
5 Voting Classifier (Logistic Regression, Decision Tree and SVC)
6 Random Forest
7 Decision Tree with dimensionality reduction using PCA (both 2 and 3 dimensions)
8 SVC with dimensionality reduction using PCA
9 Elastic Net regularisation - Logistic Regression with both L1 and L2 penalties
10 LDA
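
To illustrate item 1 (and the combined L1 and L2 penalty from item 9), here is a minimal sketch of logistic regression trained by batch gradient descent. It is not the project's actual code: the function names (`fit_logistic`, `predict`, `make_toy_data`), the hyperparameter values, and the synthetic two-feature data are all assumptions made for the example, since the Belle II dataset is private. It uses only the Python standard library so that it runs without dependencies; a real run on 70,000 examples would more likely use vectorised NumPy.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=200, l1=0.0, l2=0.0):
    """Batch gradient descent on mean log-loss with optional L1/L2 penalties.

    X: list of feature rows; y: list of 0/1 labels (1 = Signal, 0 = Noise).
    lr, epochs, l1 and l2 are illustrative values, not the project's settings.
    The penalty added to the loss is l1 * sum|w_j| + (l2 / 2) * sum w_j^2.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            # Prediction error for one example: sigmoid(w.x + b) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - target
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        for j in range(d):
            # Subgradient of |w_j| is sign(w_j), taken as 0 when w_j == 0.
            sign = (w[j] > 0) - (w[j] < 0)
            w[j] -= lr * (grad_w[j] / n + l1 * sign + l2 * w[j])
        b -= lr * grad_b / n  # the bias is not regularised
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return 1 (Signal) where the predicted probability reaches the threshold."""
    return [
        int(sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) >= threshold)
        for row in X
    ]


def make_toy_data(n=400, seed=0):
    """Synthetic stand-in data: two Gaussian clusters, one per class."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.5 if label else -1.5
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
w, b = fit_logistic(X, y, l1=0.001, l2=0.01)
accuracy = sum(p == t for p, t in zip(predict(X, w, b), y)) / len(y)
print(f"training accuracy on toy data: {accuracy:.3f}")
```

Setting `l1=0.0` and `l2=0.0` gives plain logistic regression; non-zero values for both give the elastic-net style penalty. The accuracy printed here is measured on the toy training data only and says nothing about performance on the real dataset.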