Fake news detection

This project was created for a machine learning course. Our task was to detect texts that contain fake information.

Authors

Karolina Mączka
Tymoteusz Urban

Data

Fake News Dataset Combined Different Sources

Preprocessing

Handling NaNs and outliers
Language detection
Stopword removal
Word lemmatization
CountVectorizer
TfidfTransformer
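
The last steps of the list above can be sketched with scikit-learn as follows. This is a minimal illustration, not the project's actual code: the toy texts are made up, stopword removal uses scikit-learn's built-in English list (an assumption; the project may use a different list), and language detection and lemmatization are left out because they need extra libraries.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import Pipeline

# Toy documents standing in for the cleaned article texts (hypothetical data).
texts = [
    "the senate passed the budget yesterday",
    "aliens secretly control the government",
    "the budget vote followed a long debate",
]

# CountVectorizer builds raw term counts and drops English stopwords;
# TfidfTransformer then reweights those counts by inverse document frequency.
vectorize = Pipeline([
    ("counts", CountVectorizer(stop_words="english", lowercase=True)),
    ("tfidf", TfidfTransformer()),
])

features = vectorize.fit_transform(texts)          # sparse matrix, one row per text
vocab = vectorize.named_steps["counts"].vocabulary_  # dict: term -> column index

print(features.shape)
print(sorted(vocab))
```

Wrapping both steps in a Pipeline keeps the fitted vocabulary and IDF weights together, so the same transformation can be reapplied to unseen texts with `vectorize.transform(...)`.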

Model

XGBoost
Hyperparameter optimization with Random Search CV
Independent validation
0.99 AUC on the test set
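
A rough sketch of how such a setup could look, assuming scikit-learn's RandomizedSearchCV is the random search meant above. The synthetic data, the search space, and the split sizes are all placeholders chosen for illustration, not the project's real settings, and the printed AUC has nothing to do with the reported 0.99. If xgboost is not installed, the sketch falls back to scikit-learn's GradientBoostingClassifier, which is a different implementation of gradient boosting.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

try:
    from xgboost import XGBClassifier as Booster
except ImportError:
    # Fallback only so the sketch runs without xgboost; not the project's model.
    from sklearn.ensemble import GradientBoostingClassifier as Booster

# Synthetic features standing in for the TF-IDF matrix (hypothetical data).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=5, random_state=0
)

# Hold out a test set that the hyperparameter search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative search space; every value here is an assumption.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "subsample": [0.7, 1.0],
}

search = RandomizedSearchCV(
    Booster(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,            # number of random configurations to try
    scoring="roc_auc",   # select by cross-validated AUC
    cv=3,
    random_state=0,
)
search.fit(X_train, y_train)

# Score the refitted best model on the held-out split.
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print(search.best_params_)
print(f"held-out AUC: {test_auc:.3f}")
```

Keeping the test split outside the search means the final AUC is measured on data that influenced neither training nor hyperparameter selection.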