This submission contains 3 .ipynb notebooks.
They are part of a larger repository of notebooks at https://github.com/Etrama/IFT6390_Weather-Events-Classification_Kaggle-Data-Challenge-1_2021
The 3 notebooks are:
- Logistic_Regression_Attempt1.ipynb - contains the logistic regression code for the IFT 6390 Kaggle challenge, with some hyperparameter tuning.
- Final_Sub_RF_SMOTE_without_dups.ipynb - one of the 2 final submissions to Kaggle; it uses synthetic data generated after removing the duplicates present in the original data.
- Final_Sub_RF_SMOTE_Tomek_with_dups.ipynb - the other of the 2 final submissions to Kaggle; it uses synthetic data generated while keeping those duplicates.
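To give a sense of what the two final-submission notebooks do, here is a minimal sketch of the shared modelling step: a Random Forest fit on oversampled training data. It assumes scikit-learn and imbalanced-learn are installed; the csv file name, label column name, and hyperparameters below are placeholders, not the values used in the notebooks.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from imblearn.over_sampling import SMOTE
from imblearn.combine import SMOTETomek  # the "with dups" notebook uses the SMOTE + Tomek-links variant

# Placeholder file and column names; the real ones come from the Kaggle download.
train = pd.read_csv("ift3395-6390-weatherevents/train.csv")
X, y = train.drop(columns=["LABELS"]), train["LABELS"]

# Oversample the minority classes; swap in SMOTETomek(random_state=42) for the Tomek-links variant.
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)

# Fit a Random Forest on the resampled data (hyperparameters are placeholders).
clf = RandomForestClassifier(n_estimators=500, random_state=42)
clf.fit(X_res, y_res)
```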
To run Logistic_Regression_Attempt1.ipynb: the notebook needs to be run from a folder that contains a subfolder called /ift3395-6390-weatherevents/ with the train and test csvs inside it; the data is meant to be downloaded from Kaggle. The rest of the notebook is self-contained and the cells can be run in order.
To run Final_Sub_RF_SMOTE_without_dups.ipynb and Final_Sub_RF_SMOTE_Tomek_with_dups.ipynb: both notebooks expect the data in the same /ift3395-6390-weatherevents/ folder as above, and are otherwise self-contained, so their cells can likewise be run in order.
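If in doubt, a quick check like the one below confirms the data is where the notebooks expect it. This is just a sketch, not part of the notebooks, and the csv file names are assumptions about what the Kaggle download contains.

```python
from pathlib import Path
import pandas as pd

# The notebooks expect a folder named ift3395-6390-weatherevents/ next to them,
# holding the train and test csvs downloaded from Kaggle.
data_dir = Path("ift3395-6390-weatherevents")
for name in ("train.csv", "test.csv"):  # assumed file names
    path = data_dir / name
    assert path.exists(), f"Missing {path}; download it from the Kaggle challenge page"

print(pd.read_csv(data_dir / "train.csv").shape)
```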
The remaining notebooks and the data files they generate are not included in the Gradescope submission, but they are available on GitHub: https://github.com/Etrama/IFT6390_Weather-Events-Classification_Kaggle-Data-Challenge-1_2021
For the other notebooks, the data files are generated using the following notebooks:
- Feature_Engineering_and_Sampling.ipynb - generates ADASYN data, as well as data produced with other SMOTE techniques, using the extra features we tried to engineer. It writes the entire data with a train/test split:
  - X_train_std.csv, X_train_ada_std.csv, X_test_std.csv
  - y_train.csv, y_train_ada.csv, y_test.csv
- FeaEngg_Sampling_without_dups_without_extra_features.ipynb - writes the entire data without a train/test split:
  - X_train_full_std.csv, X_train_full_ada_std.csv
  - y_train_full.csv, y_train_full_ada.csv
Based on the flavour of data we want to generate, we can also use FeaEngg_Sampling_with_dups_without_extra_features.ipynb, which generates ADASYN data while retaining the duplicates in the original data, similar to the notebook mentioned above.
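As an illustration of the kind of data generation these notebooks perform, here is a minimal sketch that standardizes the features, oversamples with ADASYN, and writes csvs with names in the spirit of those listed above. It assumes scikit-learn and imbalanced-learn; the input file name and label column are placeholders, and the actual feature engineering lives in the notebooks.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import ADASYN

# Placeholder file and column names; the real ones come from the Kaggle download.
train = pd.read_csv("ift3395-6390-weatherevents/train.csv")
X, y = train.drop(columns=["LABELS"]), train["LABELS"]

# Train/test split, then standardize using statistics from the training split only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)
scaler = StandardScaler().fit(X_tr)
X_tr_std = pd.DataFrame(scaler.transform(X_tr), columns=X.columns)
X_te_std = pd.DataFrame(scaler.transform(X_te), columns=X.columns)

# Oversample the minority classes with ADASYN on the standardized training data.
X_ada, y_ada = ADASYN(random_state=42).fit_resample(X_tr_std, y_tr)

# Save the splits under names matching the ones listed above.
X_tr_std.to_csv("X_train_std.csv", index=False)
X_te_std.to_csv("X_test_std.csv", index=False)
pd.DataFrame(X_ada, columns=X.columns).to_csv("X_train_ada_std.csv", index=False)
y_tr.to_csv("y_train.csv", index=False)
y_te.to_csv("y_test.csv", index=False)
pd.Series(y_ada).to_csv("y_train_ada.csv", index=False)
```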