Using NLP for newsgroup document classification

We'll compare Naive Bayes and Deep Learning models used for the classification of newsgroup texts.

What we'll be doing:

Multinomial Naive Bayes model
Deep Learning model
Deep Learning model with pre-trained embedding layer
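To make the first item concrete, here is a minimal from-scratch sketch of multinomial Naive Bayes with Laplace smoothing. It is not the notebook's code: the toy corpus, the labels, and the helper names (`tokenize`, `train_multinomial_nb`, `predict`) are all assumptions made for illustration, and a library implementation would normally be used instead.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately simple pre-processing: lowercase and split on whitespace.
    return text.lower().split()


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods from word counts."""
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    log_prior = {c: math.log(n / len(docs)) for c, n in class_doc_counts.items()}
    log_likelihood = {}
    for c in class_doc_counts:
        # Smoothed denominator: total words in the class plus alpha per vocabulary word.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def predict(text, model):
    log_prior, log_likelihood, vocab = model
    scores = {}
    for c in log_prior:
        # Words never seen during training are ignored.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in tokenize(text) if w in vocab
        )
    return max(scores, key=scores.get)


# Toy corpus (made up), standing in for newsgroup posts.
train_docs = [
    "the team won the hockey game",
    "great goal in the final game",
    "the rocket reached orbit",
    "nasa launched a new satellite into orbit",
]
train_labels = ["sport", "sport", "space", "space"]

model = train_multinomial_nb(train_docs, train_labels)
print(predict("the hockey team scored a goal", model))
print(predict("the satellite is in orbit", model))
```

The whole model is a table of word counts per class, which is why it trains so quickly.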

We'll also check the accuracy of the models, along with some other metrics, and plot a confusion matrix.
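As a rough illustration of these metrics, the sketch below computes accuracy, a confusion matrix, and per-class precision and recall from two lists of labels. The class names and predictions are made up for the example and are not results from the notebook; the function names are likewise hypothetical.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, classes):
    # Rows are true classes, columns are predicted classes.
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in classes] for t in classes]


def accuracy(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall(matrix, index):
    true_positives = matrix[index][index]
    predicted = sum(row[index] for row in matrix)  # column sum
    actual = sum(matrix[index])                    # row sum
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical labels and predictions, for illustration only.
classes = ["space", "autos", "motorcycles"]
y_true = ["space", "space", "autos", "autos", "motorcycles", "motorcycles"]
y_pred = ["space", "space", "autos", "motorcycles", "motorcycles", "autos"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(f"{name:12s} {row}")
print("accuracy:", accuracy(y_true, y_pred))
print("autos precision/recall:", precision_recall(matrix, classes.index("autos")))
```

Off-diagonal cells show which classes get confused with each other, which accuracy alone does not reveal.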

Conclusion

The Naive Bayes model was very easy and quick to create, and it performed very well even on our first try.

The Deep Learning model performed well too, although it was not the best performer. Its accuracy was lower than that of the Naive Bayes model, but even when a prediction was wrong, it tended to fall in a class close to the correct one. That is due to the pre-trained embedding in the neural network, which captures the contexts in which words are usually used.
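The idea of "close" classes can be sketched with cosine similarity between averaged word vectors. The three-dimensional vectors below are invented for illustration; real pre-trained embeddings have many more dimensions, and the notebook's network does not necessarily average them this way.

```python
import math

# Made-up toy "embeddings"; real pre-trained vectors are learned from large corpora.
toy_embeddings = {
    "engine": [0.9, 0.1, 0.0],
    "car": [0.8, 0.2, 0.1],
    "motorcycle": [0.85, 0.15, 0.05],
    "orbit": [0.0, 0.1, 0.9],
    "rocket": [0.1, 0.0, 0.95],
}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def document_vector(tokens):
    # Average the vectors of the known words in the document.
    vectors = [toy_embeddings[t] for t in tokens if t in toy_embeddings]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


autos = document_vector(["car", "engine"])
motorcycles = document_vector(["motorcycle", "engine"])
space = document_vector(["rocket", "orbit"])

print("autos vs motorcycles:", round(cosine(autos, motorcycles), 3))
print("autos vs space:", round(cosine(autos, space), 3))
```

With vectors like these, a document about cars sits much nearer to one about motorcycles than to one about space, so a mistaken prediction is more likely to land in the neighbouring class.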

Naive Bayes is very quick to implement. In most cases the model does not require tuning its hyperparameters. On the other hand, the data usually must be well pre-processed before being fed to the model; in this case the pre-processing is very simple.
The Deep Learning model requires a lot of hyperparameter tuning, as well as experimentation to find a model architecture that performs better (this was not the focus of this notebook). A bigger dataset usually helps deep learning models as well. All of that would take considerably more time.

paulocressoni/NLP-Naive-Bayes-vs-Deep-Learning
