GitHub - IV012/bios-611-project

Dataset

Depression: Twitter Dataset + Feature Extraction
20000 Labelled English Tweets of Depressed and Non-Depressed Users
Link: https://www.kaggle.com/datasets/infamouscoder/mental-health-social-media

Description

The data is uncleaned and was collected using the Twitter API. The tweets have been filtered to keep only English content. The dataset targets mental health classification of users at the tweet level.
Data structure:

Post Date	The time at which the tweet was published.
Text	The content of the tweet.
Followers	The number of followers of the account.
Friends	The number of accounts that this account follows.
Favorites	The number of likes.
Statuses	The number of statuses (tweets, including retweets) posted by the account.
Retweet	The number of retweets.
Label	The mental health status: depressed or not depressed.
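
As an illustration of the structure above, the following minimal Python sketch reads such a CSV with the standard library and counts the label values. The column names (`post_text`, `followers`, `label`) and the sample rows are assumptions made up for illustration; check the header of Mental-Health-Twitter.csv for the actual names. This is not the code used in this repository.

```python
import csv
import io
from collections import Counter


def label_distribution(csv_file, label_column="label"):
    """Count how many rows carry each label value.

    `label_column` is an assumed column name; adjust it to the real header.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Tiny in-memory sample standing in for the real file (made-up rows).
sample = io.StringIO(
    "post_text,followers,label\n"
    "feeling fine today,10,0\n"
    "cannot sleep again,25,1\n"
    "so tired of everything,3,1\n"
)
counts = label_distribution(sample)
print(counts)
```

To run it on the real data, pass an open file handle instead of the sample, for example one obtained from `open("Mental-Health-Twitter.csv", newline="", encoding="utf-8")`. The encoding is also an assumption.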

Goal

EDA: label distribution, word frequency, text length...
Statistical Modelling: prediction, clustering...
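
The word-frequency and text-length parts of the EDA could be sketched as follows. This is a minimal standard-library example under stated assumptions: the tokenizer (lowercase, letters and apostrophes only) and the two sample tweets are hypothetical, and the project's own eda.R and eda.py may proceed differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into tokens of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequency(tweets):
    """Return a Counter of word occurrences across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tokenize(tweet))
    return counts


def text_lengths(tweets):
    """Return the length of each tweet in characters."""
    return [len(tweet) for tweet in tweets]


# Made-up tweets used only to exercise the functions.
tweets = ["I feel so so tired", "Tired of this weather"]
freq = word_frequency(tweets)
lengths = text_lengths(tweets)
print(freq.most_common(2))
print(lengths)
```

Computing the same counts separately for each label value would give a per-class comparison of vocabulary and tweet length.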

Manual

Build the Docker image and run the container

docker build . -t project
docker run -v "$(pwd)":/home/rstudio -e PASSWORD=yfd -p 8787:8787 -t project

Assuming the image is based on rocker/rstudio (suggested by the /home/rstudio mount, the PASSWORD variable and port 8787), RStudio Server is then available at http://localhost:8787 with username rstudio and the password set above.

Generate files, results and the report

make clean
make .create-dirs
make data/processed_tweets.csv
make figure/density.png
make figure/negative.html figure/positive.html figure/negative.png figure/positive.png
make figure/follower_month.png figure/follower_year.png figure/freq_month.png figure/freq_year.png
make model/model_lstm.pt figure/loss.png
make result/cm_lstm.png
make result/cm_bnb.png
make report.html

Final Report

Please see report.html for the final report.

