This repository contains my DS and ML contest projects, along with my personal fun projects. I have dealt with a diverse set of problems, data types, and metrics. The following is a summary of each project, covering the type of dataset, the type of problem, and my approach to handling it (all in brief). More details can be found in each subdirectory.
If you would rather view the following text in table format, click here.
- DataSet:
- Image
- Objective:
- Bounding Box prediction
- My Approach:
- Designed a visual feature pipeline with attention on the object in the image
- Data augmentation applied to each image together with its bounding box
- Used a single-stage detector approach: focal loss with YOLO and SSD
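
As a rough illustration of the focal loss mentioned above, here is a minimal pure-Python sketch of the binary form. The function names and the default values `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration; they are not taken from this project's code.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for a single binary prediction.

    p: predicted probability of the positive class
    y: ground-truth label, 1 or 0
    """
    p = min(max(p, eps), 1.0 - eps)             # clip for numerical stability
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of predictions."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` and `alpha=0.5` this reduces to half of the usual cross-entropy, which is a quick sanity check on the implementation.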
- DataSet:
- Text
- Objective:
- Classification
- My Approach:
- Data cleaning and feature engineering
- Linear and non-linear models
- Deep learning attention model
- Pretrained BERT model
- Ensemble
- DataSet:
- 2500 unknown predictors
- Objective:
- Classification
- My Approach:
- Feature understanding (EDA)
- Feature engineering
- Designed feature interaction tools
- Ensemble model using XGBoost/LightGBM/CatBoost and simple linear/non-linear models
- Statistical model to understand feature importance using p-values
- DataSet:
- Very large dataset (45M observations, graph edge representation)
- Relational Feature
- Category + Numerical
- Objective:
- Link Prediction
- My Approach:
- Graph-based features such as adamic-adar, common-resource-allocation, ...
- SVD feature for each user
- Community clustering
- Subsemble (I did this after the competition was over, to understand more about sampling and model building)
- Neighbour-based features (removed high-cardinality features)
- Also tried a deep learning approach (graph embedding), but couldn't handle it properly at that time
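
To make the graph-based features concrete, the sketch below computes the Adamic-Adar and resource-allocation scores for a candidate link from a plain edge list. The helper names and the toy edges are hypothetical, and a 45M-edge graph would need a more memory-efficient representation than Python sets.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree) over the common neighbours of u and v."""
    common = adj.get(u, set()) & adj.get(v, set())
    # Skip degree-1 nodes to avoid dividing by log(1) = 0
    return sum(1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1)


def resource_allocation(adj, u, v):
    """Sum of 1 / degree over the common neighbours of u and v."""
    common = adj.get(u, set()) & adj.get(v, set())
    return sum(1.0 / len(adj[z]) for z in common)


# Toy graph: "a" and "b" share the neighbours "c" (degree 2) and "d" (degree 3)
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e")]
adj = build_adjacency(edges)
aa_score = adamic_adar(adj, "a", "b")
ra_score = resource_allocation(adj, "a", "b")
```

Both scores reward common neighbours that have few connections of their own, so they can be fed to a classifier as per-pair features.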
- DataSet:
- Category + Numerical
- Relational Dataset
- Objective:
- Regression
- My Approach:
- Feature engineering: date-time based features, aggregation based features, and relational features
- Ensemble using different sets of transformed target spaces
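
One possible reading of an ensemble over transformed target spaces is sketched below: the same simple model is fitted on the raw, `log1p`, and square-root versions of the target, and the back-transformed predictions are averaged. The single-feature least-squares model, the choice of transforms, and all names are assumptions made for illustration, not the project's actual setup.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


# (forward, inverse) pairs; these assume a non-negative target
TRANSFORMS = {
    "identity": (lambda y: y, lambda z: z),
    "log1p": (math.log1p, math.expm1),
    "sqrt": (math.sqrt, lambda z: max(z, 0.0) ** 2),
}


def fit_target_ensemble(xs, ys):
    """Fit one model per target transform."""
    return {
        name: fit_line(xs, [forward(y) for y in ys])
        for name, (forward, _) in TRANSFORMS.items()
    }


def predict_target_ensemble(models, x):
    """Average the predictions after mapping each back to the original scale."""
    preds = []
    for name, (a, b) in models.items():
        inverse = TRANSFORMS[name][1]
        preds.append(inverse(a * x + b))
    return sum(preds) / len(preds)
```

Each transform changes which errors the model is penalised for most, so the averaged prediction tends to be less sensitive to skew in the target.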
- DataSet:
- Image
- Objective:
- Comparison between ResNet and my modified feature pipeline
- Classification
- My Approach:
- Developed a weighted feature pipeline using global and local features. The global features put a constraint on the local features, so that the model focuses specifically on the features of the object in the image
- Better attention map around the object, which reflects its learned features
- Improved the score by 1.37% over ResNet
- DataSet:
- Image
- Objective:
- Face Verification
- My Approach:
- Matching network approach
- Built student-attendance hardware using Arduino
- Hard mining approach (generated all permutations between classes to handle the small dataset)
- Network-in-network approach to handle overfitting, as I had a very small dataset
- Achieved 93% accuracy
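
The pair generation behind the hard mining step could look roughly like the sketch below. It enumerates unordered pairs with `itertools.combinations` (ordered permutations would simply double the list), labels each pair as same or different identity, and then keeps the closest different-identity pairs as the hard ones. The function names, the sample format, and the `distance` callback are hypothetical.

```python
from itertools import combinations


def make_verification_pairs(samples):
    """samples: list of (sample_id, class_label) tuples.

    Returns (id_a, id_b, label) triples with label 1 for the same
    class and 0 for different classes.
    """
    pairs = []
    for (id_a, cls_a), (id_b, cls_b) in combinations(samples, 2):
        pairs.append((id_a, id_b, 1 if cls_a == cls_b else 0))
    return pairs


def hardest_negatives(pairs, distance, k):
    """Keep the k different-class pairs that the model places closest together."""
    negatives = [p for p in pairs if p[2] == 0]
    return sorted(negatives, key=lambda p: distance(p[0], p[1]))[:k]
```

A set of n images yields n * (n - 1) / 2 pairs, which is how a small dataset can still provide many training examples for verification.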
- DataSet:
- Image
- Objective:
- Classification (training on very small dataset)
- My Approach:
- Prototype algorithm implementation
- There is more to this (will update in future)
- DataSet:
- Category + Numerical
- Objective:
- Regression
- My Approach:
- Date-based features and dummy features
- Interaction-based features
- Bayesian optimization
- Out-of-fold predictions to generate meta features for the ensemble
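
The out-of-fold idea can be sketched as follows: each row's meta feature comes from a model that never saw that row during training, so the second-level ensemble is not fed leaked predictions. The single-feature least-squares base model, the modulo fold assignment, and the function names are assumptions chosen to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def out_of_fold_predictions(xs, ys, n_folds=3):
    """Return one prediction per row, each made by a model trained without that row."""
    n = len(xs)
    oof = [None] * n
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            oof[i] = a * xs[i] + b  # prediction for a row unseen in training
    return oof
```

The resulting list can be added as a new column and used as an input to the ensemble model.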
- DataSet:
- Text
- Objective:
- User-Problem Rating Prediction
- My Approach:
- My main concern was to handle the following questions carefully:
- What are the strongest and weakest areas of the user?
- What is the level of the problem?
- Which problem has the user just solved?
- If the user gets stuck on the current problem, which problem would help them (to gain confidence and to improve their skill in that area)?
- Exploration and exploitation strategy in recommending problems
- And many more
- DataSet:
- Category + Numerical
- Objective:
- Classification
- My Approach:
- DataSet:
- Image
- Objective:
- Segmentation
- My Approach:
- Implemented a U-Net architecture on a blood cell dataset.
- Implemented a fully convolutional network on a traffic-street dataset.
- Finally, experimented with a generative adversarial network for better generalization given the limited dataset.
- DataSet:
- Relational feature
- Time-Series Feature
- Categorical + Numerical
- Objective:
- Future sales prediction for different stores in different cities
- My Approach:
- DataSet:
- Image
- Objective:
- Classification
- My Approach:
- EDA
- Feature Engineering
- DataSet:
- Time-Series stock prices
- Objective:
- Future price prediction
- Regression
- My Approach:
- Deep learning approach using RNN and LSTM