#839 Added CS:GO Round Winner Classification #856

Closed
wants to merge 2 commits into from
6,070 changes: 6,070 additions & 0 deletions Anime Data Analysis and Prediction/Dataset/All_Anime.csv

Large diffs are not rendered by default.

4,888 changes: 4,888 additions & 0 deletions Anime Data Analysis and Prediction/Model/anime_analysis_and_prediction.ipynb

Large diffs are not rendered by default.

Binary file not shown.
Binary file not shown.
66 changes: 66 additions & 0 deletions Anime Data Analysis and Prediction/Readme.md
@@ -0,0 +1,66 @@
## Title: Anime Data Analysis and Prediction

## Goal: To analyze the Anime dataset through exploratory data analysis across several parameters, and then use it to make predictions

## Dataset link:
https://www.kaggle.com/datasets/ayush4807/aad-dataset

## Techniques used:
1. Data Filtering
2. Data Preprocessing
3. Data Extraction
4. Data visualization
5. Data Modelling
6. Pickling the model
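
The last step, pickling the model, can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the synthetic `X` and `y` arrays are hypothetical stand-ins for features and ratings from the dataset, and the choice of `DecisionTreeRegressor` here is only an assumption for the example.

```python
import pickle

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): the notebook would use columns
# from the Anime dataset instead.
rng = np.random.default_rng(0)
X = rng.uniform(1990, 2024, size=(50, 1))  # hypothetical "year" feature
y = rng.uniform(1, 10, size=50)            # hypothetical "Rating" target

model = DecisionTreeRegressor(random_state=0).fit(X, y)

# Serialize the fitted model to bytes, then restore it.
payload = pickle.dumps(model)
restored = pickle.loads(payload)

# The restored model should reproduce the original predictions.
same = bool(np.allclose(model.predict(X), restored.predict(X)))
print(same)
```

To persist the model on disk, `pickle.dump(model, f)` with a file opened in `"wb"` mode works the same way. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.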

## Libraries used:
1. Pandas
2. Pandas profiling
3. Numpy
4. Matplotlib
5. Scikit Learn
6. Pickle

## Data visuals created:
1. Histogram
2. Box plot
3. Scatter Plot
4. Bar plot
5. Heatmap
6. Pairplot

## Machine Learning Models used:
1. Linear Regression
2. Decision Tree Regression
3. Random Forest Regressor

## Evaluation metrics used:
1. Root Mean Squared error
2. Mean Squared error
3. R2 score
4. Training Score
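
The metrics listed above can be computed with scikit-learn roughly as shown here. This is a sketch on synthetic data, not the notebook's code. It assumes that "Training Score" means the model's `score` (R2) on the training split, and all variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic regression data (assumption): a linear signal plus noise.
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 0.5, size=200)

# Simple hold-out split: first 150 rows for training, the rest for testing.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)     # Mean Squared Error
rmse = float(np.sqrt(mse))                   # Root Mean Squared Error
r2 = r2_score(y_test, y_pred)                # R2 on the test split
train_score = model.score(X_train, y_train)  # R2 on the training split

print(f"MSE={mse:.3f} RMSE={rmse:.3f} R2={r2:.3f} train={train_score:.3f}")
```

Reporting the test-split R2 next to the training score helps reveal overfitting: a model can score very high on the data it was trained on while doing much worse on held-out data.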

## Visuals:
<img src = "https://github.com/PiyushBL45t/ML-Crate/blob/main/Anime%20Data%20Analysis%20and%20Prediction/Images/Box%20plot%20pr%20year.png"/>
<img src = "https://github.com/PiyushBL45t/ML-Crate/blob/main/Anime%20Data%20Analysis%20and%20Prediction/Images/Heatmap.png"/>
<img src = "https://github.com/PiyushBL45t/ML-Crate/blob/main/Anime%20Data%20Analysis%20and%20Prediction/Images/Histograms.png"/>
<img src = "https://github.com/PiyushBL45t/ML-Crate/blob/main/Anime%20Data%20Analysis%20and%20Prediction/Images/Normal%20Distributions.png"/>
<img src = "https://github.com/PiyushBL45t/ML-Crate/blob/main/Anime%20Data%20Analysis%20and%20Prediction/Images/Pairplot.png"/>

## Conclusion
### We implemented three models on the analyzed data:
#### 1. Linear Regression
#### 2. Decision Tree Regressor
#### 3. Random Forest Regressor

### The target is continuous, so we applied regression algorithms.
### The target variable was "Rating", the anime rating on a scale of 10. We trained and tested our models on two randomly chosen anime genre combinations:
#### 1. Animation, Adventure, Drama
#### 2. Animation, Comedy, Fantasy
## Results:
### 1. Linear Regression and Random Forest show very low training scores and high error values, so they are not the best-fit models. The <u>Ratings</u> they predict for future years are also very low.
### 2. The Decision Tree, on the other hand, predicts ratings very well, which suggests that the types of anime we selected may attract more audience attention in the coming years. Its evaluation metrics are stable and its errors are very low, which makes it a good fit for a predictive analysis example.

## Authors

- Created by [@Priyankesh](https://github.com/priyankeshh), GSSoC 2024
122,411 changes: 122,411 additions & 0 deletions CS_GO Round Winner Classification/Dataset/csgo.csv

Large diffs are not rendered by default.

1,005 changes: 1,005 additions & 0 deletions CS_GO Round Winner Classification/Model/CS_GO_Round_Winner_Classification.ipynb

Large diffs are not rendered by default.

67 changes: 67 additions & 0 deletions CS_GO Round Winner Classification/Model/README.md
@@ -0,0 +1,67 @@
# PROJECT TITLE

CS:GO Round Winner Classification

## GOAL

**Aim** - Predict which team wins a round from individual round snapshots

## DATASET

https://www.kaggle.com/christianlillelund/csgo-round-winner-classification

## DESCRIPTION

This is a classification problem in which we predict who wins individual snapshots of rounds. We use Logistic Regression, Decision Tree, and Random Forest classifiers.

## WHAT I HAVE DONE

1. Performed exploratory data analysis (EDA) on the given dataset
2. Loaded the dataset and viewed the top 5 rows
3. Calculated summary statistics for the dataset
4. Found the correlation between the features, along with related statistical values
5. Visualized the data with libraries such as matplotlib and seaborn
6. Trained 3 different algorithms to find the best one
7. Calculated the accuracy score of each algorithm to compare it with the others
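
The first EDA steps above (top 5 rows, summary statistics, correlation) might look like the following in pandas. This is a hedged sketch: the small DataFrame is synthetic, and its column names are hypothetical placeholders rather than a confirmed schema of `csgo.csv`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset (assumption): in the notebook this
# would instead be something like pd.read_csv("csgo.csv").
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "time_left": rng.uniform(0, 175, n),         # hypothetical column
    "ct_score": rng.integers(0, 16, n),          # hypothetical column
    "t_score": rng.integers(0, 16, n),           # hypothetical column
    "round_winner": rng.choice(["CT", "T"], n),  # hypothetical target
})

top5 = df.head()                   # top 5 rows
stats = df.describe()              # count, mean, std, min, quartiles, max
corr = df.corr(numeric_only=True)  # pairwise correlation of numeric columns

print(top5)
print(stats)
print(corr)
```

Passing `numeric_only=True` keeps the string-valued target column out of the correlation matrix. To include the target, it would first need to be encoded as numbers.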

## DATA VISUALIZATION

![image](https://user-images.githubusercontent.com/78292851/157266387-e42175ca-d73c-44de-acfa-bea89a24c0c7.png)

![image](https://user-images.githubusercontent.com/78292851/157266436-8e3a1a69-d194-45fb-9d53-53fbf25f9698.png)

![image](https://user-images.githubusercontent.com/78292851/157266484-b75f55c4-c8a3-4cd4-963a-6f298ce07939.png)




## MODELS USED

1. Logistic Regression: a simple, widely used baseline algorithm for classification problems
2. Decision Tree
3. Random Forest Classifier
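
A comparison of these three models can be sketched with scikit-learn as below. This is an illustration on synthetic data from `make_classification`, not the notebook's code, so the printed accuracies will not match the figures reported in this README. The hyperparameters (`max_iter`, `n_estimators`, `random_state`) are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (assumption) standing in for csgo.csv.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: {acc:.2%}")

best = max(scores, key=scores.get)
print("Best model:", best)
```

Scoring all models on the same held-out split keeps the comparison fair, since each model is evaluated on data it did not see during training.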


## LIBRARIES NEEDED

1. Numpy
2. Pandas
3. Matplotlib
4. Seaborn
5. Scikit-Learn

## ACCURACIES

1. Logistic Regression = 73.76%
2. Random Forest Classifier = 88%
3. Decision Tree = 81.96%

## CONCLUSION

We can conclude that the Random Forest Classifier gives the most accurate results of the three models for this problem statement.

## Authors

- Created by [@Priyankesh](https://github.com/priyankeshh), GSSoC 2024


5 changes: 5 additions & 0 deletions CS_GO Round Winner Classification/requirements.txt
@@ -0,0 +1,5 @@
matplotlib==3.9.0
seaborn==0.13.2
numpy==1.26.4
pandas==2.2.2
scikit_learn==1.5.0