fraud-transaction-detection

Fraud transaction detection on a highly imbalanced dataset using three machine learning models: an artificial neural network (ANN), a Random Forest classifier, and an XGBoost classifier.

Observations

The dataset is highly imbalanced: only 0.129% of observations are fraudulent.
There is no missing data in the dataset.
The dataset consists of 11 features, which needed to be transformed before modeling.
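
The two checks above (class share and missing values) can be sketched without any dependencies. This is a minimal illustration, not the notebook's code: the label column name `isFraud`, the helper `summarize`, and the toy rows are assumptions made for the example.

```python
def summarize(rows, label_key="isFraud"):
    """Return the fraud share in percent and per-column counts of missing values."""
    fraud = sum(1 for row in rows if row[label_key] == 1)
    missing = {}
    for row in rows:
        for col, value in row.items():
            if value is None:
                missing[col] = missing.get(col, 0) + 1
    return 100.0 * fraud / len(rows), missing


# Hypothetical toy rows; the real dataset is far larger and far more imbalanced.
rows = [
    {"amount": 100.0, "isFraud": 0},
    {"amount": 250.0, "isFraud": 0},
    {"amount": 9000.0, "isFraud": 1},
    {"amount": 40.0, "isFraud": 0},
]
fraud_pct, missing = summarize(rows)
print(f"fraud share: {fraud_pct:.3f}%, columns with missing values: {missing}")
```

An empty `missing` dictionary corresponds to the "no missing data" observation.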

Checking for multi-collinearity

Summary and Explanation

oldbalanceOrg and newbalanceOrig are perfectly correlated because they represent the balance of the sender's account before and after the transaction.
oldbalanceDest and newbalanceDest are also perfectly correlated because they represent the balance of the recipient's account before and after the transaction.
nameOrig and nameDest are high-cardinality categorical variables.
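
As a rough sketch of how the correlation between a pair of balance columns can be measured, the Pearson coefficient is computed by hand below. The function `pearson` and the balance values are invented for illustration and do not come from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sender balances before and after four transactions.
old_balance = [1000.0, 5000.0, 200.0, 7500.0]
new_balance = [900.0, 4800.0, 150.0, 7000.0]
r = pearson(old_balance, new_balance)
print(f"correlation: {r:.4f}")
```

A coefficient close to 1 means one column carries almost the same information as the other, which is the usual argument for keeping only one of them.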

Action

Removing newbalanceOrig and newbalanceDest to avoid multicollinearity.
Removing nameOrig and nameDest because they are irrelevant to the prediction.
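
The column removal can be sketched as below, assuming each row is held as a plain dictionary. The helper `drop_columns` and the sample row are hypothetical; only the four dropped column names come from the text above.

```python
DROPPED = ("newbalanceOrig", "newbalanceDest", "nameOrig", "nameDest")


def drop_columns(rows, columns=DROPPED):
    """Return copies of the rows without the listed columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]


# Hypothetical single row with invented values.
rows = [{
    "amount": 181.0,
    "nameOrig": "C1",
    "oldbalanceOrg": 181.0,
    "newbalanceOrig": 0.0,
    "nameDest": "C2",
    "oldbalanceDest": 0.0,
    "newbalanceDest": 0.0,
}]
cleaned = drop_columns(rows)
print(sorted(cleaned[0]))
```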

Weight of the balanced dataset

After applying undersampling and then oversampling, the class weights of the new dataset are:

Fraudulent transaction weight: 0.3335339444434781

Non-fraudulent transaction weight: 0.6664660555565219
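
One way to arrive at roughly a 1:2 class split is to randomly undersample the majority class and then randomly oversample the minority class. The sketch below shows that idea only. Plain random duplication, the target sizes, and the helper names `rebalance` and `class_shares` are assumptions, since the README does not state which resampling methods or ratios the notebook uses.

```python
import random
from collections import Counter


def rebalance(labels, majority_size, minority_size, seed=0):
    """Return row indices after undersampling class 0 and oversampling class 1."""
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == 0]
    minority = [i for i, y in enumerate(labels) if y == 1]
    # Undersample: keep a random subset of the majority class, without replacement.
    kept_majority = rng.sample(majority, min(majority_size, len(majority)))
    # Oversample: draw extra minority rows with replacement up to the target size.
    extra = max(0, minority_size - len(minority))
    kept_minority = minority + rng.choices(minority, k=extra)
    indices = kept_majority + kept_minority
    rng.shuffle(indices)
    return indices


def class_shares(labels):
    """Return the fraction of rows belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical toy labels: 1000 legitimate rows and 10 fraudulent rows.
labels = [0] * 1000 + [1] * 10
idx = rebalance(labels, majority_size=200, minority_size=100)
balanced = [labels[i] for i in idx]
shares = class_shares(balanced)
print(shares)
```

Resampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution.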

ANN Loss Chart

ANN Performance

Random Forest Performance

XGBoost Performance

Model Performance Comparison

ANN_model (Artificial Neural Network):

F1-score on the training set: 0.9500

F1-score on the test set: 0.9493

Random Forest:

F1-score on the training set: 1.0 (perfect score)

F1-score on the test set: 0.9992

XGBoost:

F1-score on the training set: 0.9967

F1-score on the test set: 0.9963
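
For reference, the F1-score is the harmonic mean of precision and recall, which makes it more informative than accuracy on imbalanced data. A minimal dependency-free sketch follows; the function `f1_score` and the label vectors are made up for illustration and are not the notebook's evaluation code.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
print(f"F1: {score:.4f}")
```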

Conclusion

The Random Forest model performs best, with the highest F1-score on the test set (0.9992).
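
The comparison can be reproduced from the reported scores alone. The sketch below ranks the models by test F1 and also shows the gap between training and test scores as a quick overfitting check. The dictionary layout and variable names are illustrative; the numbers are the ones listed above.

```python
# F1-scores as reported in the Model Performance Comparison section.
scores = {
    "ANN": {"train": 0.9500, "test": 0.9493},
    "Random Forest": {"train": 1.0, "test": 0.9992},
    "XGBoost": {"train": 0.9967, "test": 0.9963},
}

# Rank by test F1, since the test set reflects performance on unseen data.
best = max(scores, key=lambda name: scores[name]["test"])

# A large train/test gap would suggest overfitting; rounded to four decimals.
gaps = {name: round(s["train"] - s["test"], 4) for name, s in scores.items()}

print(best)
print(gaps)
```

All three gaps are below 0.001, so none of the models shows a notable drop from training to test F1.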
