Fake news detection

This project was created for a machine learning course. Our task was to detect texts that contain fake information.

Authors

Karolina Mączka
Tymoteusz Urban

Data

Fake News Dataset Combined Different Sources

Preprocessing

  • Handling of NaNs and outliers
  • Language detection
  • Stopwords removal
  • Word lemmatization
  • CountVectorizer
  • TfidfTransformer
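
As a rough illustration of the last steps in the list above, the sketch below removes stopwords, builds a term-count matrix, and reweights it with TF-IDF in plain Python. It is not the project's code: the stopword set, the `tokenize`, `count_vectorize`, and `tfidf_transform` helpers, and the two example sentences are all hypothetical, and lemmatization is left out. The weighting uses the smoothed formula `ln((1 + n) / (1 + df)) + 1` followed by L2 normalization, which is what scikit-learn's `TfidfTransformer` applies with its default settings.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list; a real project would use a full one.
STOPWORDS = {"the", "a", "is", "of", "and", "in", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def count_vectorize(docs):
    """Return (vocabulary, count matrix), analogous to a count vectorizer."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {term: i for i, term in enumerate(vocab)}
    counts = []
    for doc in tokenized:
        row = [0] * len(vocab)
        for term, n in Counter(doc).items():
            row[index[term]] = n
        counts.append(row)
    return vocab, counts


def tfidf_transform(counts):
    """Apply smoothed IDF weighting and L2-normalize every row."""
    n_docs = len(counts)
    n_terms = len(counts[0]) if counts else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    result = []
    for row in counts:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in weighted))
        result.append([v / norm if norm else 0.0 for v in weighted])
    return result


# Made-up example documents, for illustration only.
docs = [
    "The president signed the bill",
    "Aliens signed the bill in secret",
]
vocab, counts = count_vectorize(docs)
tfidf = tfidf_transform(counts)
```

Terms that occur in fewer documents (here `president`) end up with a larger weight than terms shared by every document (here `bill`), which is the reason for applying TF-IDF on top of raw counts.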

Model

  • XGBoost
  • Hyperparameter optimization with Random Search CV
  • Independent validation
  • 0.99 AUC on the test set
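
For readers unfamiliar with the metric, the sketch below shows one way to compute ROC AUC in plain Python: it is the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The `roc_auc` helper, the labels, and the scores are hypothetical and are not the project's evaluation code or data; treating "fake" as label 1 is also an assumption.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = fake, by assumption) and model scores.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_score))  # prints 0.75
```

The pairwise loop is quadratic in the number of examples, so it is only suitable for small illustrations; library implementations use a sort-based computation instead.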