A Text Classification/Sentiment Analysis project using the Amazon Reviews Polarity dataset.
This project is an empirical exploration of neural networks for text classification and sentiment analysis, using the Amazon Reviews Polarity dataset.
It covers four network architectures: a dense neural network (DNN), a convolutional neural network (CNN), a separable convolutional neural network (sepCNN), and BERT.
This repository contains the following files:
- A report in the form of both a PDF document and an Rmd file.
- An Rmd file that I used to perform the machine learning task and create the PDF document.
- An R script that can also be used to perform the machine learning task.
Because Google has changed its API (or at least I was unable to use it), the dataset has to be downloaded manually from the location below.
Download location: Xiang Zhang Google Drive
Select the file named "amazon_review_polarity_csv.tar.gz" and download it to the project directory.
Once downloaded, the script should take care of the rest.
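
To check the download by hand, the archive can be unpacked and sampled with a few lines of base R. This is a minimal sketch, not the code used in the project's script: the function name `load_polarity_sample` is hypothetical, and it assumes the archive extracts to a folder named `amazon_review_polarity_csv` containing a header-less `train.csv` with three columns (class label, review title, review text).

```r
# Hypothetical helper: unpack the archive and read the first `n` training rows.
# Assumes the archive contains amazon_review_polarity_csv/train.csv with no
# header row and three columns: label, title, text.
load_polarity_sample <- function(archive = "amazon_review_polarity_csv.tar.gz",
                                 exdir = ".",
                                 n = 1000) {
  if (!file.exists(archive)) {
    stop("Archive not found: download it to the project directory first.")
  }

  # Extract the archive; exdir is created if it does not exist.
  utils::untar(archive, exdir = exdir)

  train_path <- file.path(exdir, "amazon_review_polarity_csv", "train.csv")

  # Read only the first n rows so the check stays fast on the full dataset.
  utils::read.csv(train_path,
                  header = FALSE,
                  col.names = c("label", "title", "text"),
                  nrows = n,
                  stringsAsFactors = FALSE)
}

# Example usage, run only when the archive is present in the working directory.
if (file.exists("amazon_review_polarity_csv.tar.gz")) {
  reviews <- load_polarity_sample()
  print(table(reviews$label))
}
```

If the column layout assumed here does not match the file you downloaded, adjust `col.names` accordingly.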
The report documents the analysis and presents the findings, along with supporting statistics and figures. The report includes the following sections:
- an introduction/overview/executive summary section that describes the dataset and summarizes the goal of the project and key steps that were performed
- a methods/analysis section that explains the process and techniques used, including data cleaning, data exploration and visualization, insights gained, and my modeling approach
- a results section that presents the modeling results and discusses the model performance
- a conclusion section that gives a brief summary of the report, its limitations and future work
You are welcome to:
- submit suggestions and bug-reports at: https://github.com/mbijoor/harvard-capstone-amazon-reviews-polarity/issues
- send a pull request on: https://github.com/mbijoor/harvard-capstone-amazon-reviews-polarity/
- compose a friendly e-mail to: [email protected]