Natural Language Processing Project

Overview:

This project applies text analysis techniques to unstructured text spread across multiple documents, with the aim of extracting insights and uncovering hidden themes. The analysis grouped 42 .txt files into 5 topics and classified the overall sentiment of each file. The process included:

Data understanding and preparation, including removing punctuation, converting all letters to lowercase, and stemming, among other steps
Exploratory data analysis, including word frequency, TF-IDF, word clouds, and bigrams
Clustering using k-means, hierarchical clustering, and network graphs
Latent semantic analysis, including semantic similarity and sentiment analysis
Topic modelling using the Latent Dirichlet Allocation (LDA) algorithm
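
To make the first stages concrete, here is a minimal sketch in Python of the preparation and exploratory steps listed above (lowercasing, punctuation removal, word frequency, bigrams, TF-IDF) plus a cosine similarity between documents. It is not the project's own code, which lives in the workfile. The toy corpus, the file names, and the helper names (`preprocess`, `bigrams`, `tf_idf`, `cosine`) are all assumptions made for illustration. Stemming is left out because it normally relies on an external library, and the TF-IDF weighting shown (term frequency times `log(N / df)`) is only one common variant.

```python
import math
import string
from collections import Counter

# Hypothetical toy corpus standing in for the project's 42 text files.
docs = {
    "a.txt": "The cat sat on the mat. The cat slept!",
    "b.txt": "A cat and a dog sat together.",
    "c.txt": "Stock markets fell sharply; markets recovered later.",
}

def preprocess(text):
    """Lowercase, strip ASCII punctuation, and split on whitespace (no stemming)."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()

def bigrams(tokens):
    """Return adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))

def tf_idf(tokenized):
    """Per-document weights: relative term frequency times log(N / document frequency)."""
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    weights = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        weights[name] = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

tokenized = {name: preprocess(text) for name, text in docs.items()}
word_freq = Counter(t for tokens in tokenized.values() for t in tokens)
weights = tf_idf(tokenized)

print(word_freq.most_common(3))             # most frequent words across the corpus
print(bigrams(tokenized["a.txt"])[:3])      # first few bigrams of one document
print(cosine(weights["a.txt"], weights["b.txt"]))  # similarity of two documents
```

Documents that share no terms get a similarity of zero, while documents with overlapping vocabulary score higher. A similarity matrix of this kind is the sort of input that clustering methods such as k-means or hierarchical clustering typically work from.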

Output:

Written Report

The report covers the approach used, along with the assumptions and supporting rationale for each stage of the CRISP-DM framework. It presents results and recommendations, with supporting visualisations and summary data, and evaluates the results of the different techniques, giving reasons for the final approach.

Workfile

An appendix containing the working code.

Reflection Blog

A blog post reflecting on the use of text analysis techniques in the workplace.

Edit on May 39, 2020



wenyingw/Natural-Language-Processing-Project
