Skip to content

This research focused on whether there is racial bias in sports on Twitter, and whether the Tokyo Olympics have an impact on this. Users on Twitter were divided into sports media and general users, analyzed separately. Main methods used were news coverage analysis and sentiment analysis.

Notifications You must be signed in to change notification settings

zhaojiawei1223/Racial-bias-on-Twitter-envidence-from-2020-Tokyo-Olympics

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Racial-bias-on-Twitter-envidence-from-2020-Tokyo-Olympics

This research focused on whether there is racial bias in sports on Twitter, and whether the Tokyo Olympics have an impact on this. Users on Twitter were divided into sports media and general users, analyzed seperately. Main methods used were news coverage analysis and sentiment analysis.

0-Data-collection

To analyze news coverage of different races, we selected five top American sports media on Twitter, according to their followers and number of posts. The selected media are @espn, @Slonw, @FOXSports, @BleacherReport, and @TeamUSA. A total of 3917 tweets posted by them during and three months before the games (24 April 2021 to 8 August 2021) were crawled.

For general users, we crawled all tweets that mentioned athletes’ names. The Python library we used was Twint. According to athletes’ races and periods of the posts, the dataset can be split into four groups – “white – before”, “minority – before”, “white – during”, and “minority – during”.

In order to get balanced data, we used the number of tweets for “minority – during” as a reference, and random sampling was conducted for the three other groups.

1-Racial-bias-analysis-on-sports-media

1.1 News coverage

The proportion of news coverage of white and minority athletes can be computed for each media account to quantitatively inspect whether there is racial bias. Based on the bar chart, there is no racial bias. The coverage ratios in the two periods are almost the same as the percentage in the roster.

image

image

In addition, we would like to discusse if the media coverage mainly focused on sports stars. We did a statistical analysis of athletes with top 5 media coverage. From the table below, the phenomenon of imbalanced coverage can be detected in both periods, especially in minority athletes.

image

1.2 Distinctive words

We focused on finding the most distinctive words in media posts about white and minority athletes and inspecting their differences. The method to find the most distinctive words is log-likelihood ratio test. We concluded that no bias exists according to this, although slighly more positive words were detected from while athletes.

2-Racial-bias-analysis-on-general-users

In our study, sentiments are divided into positive, neutral, and negative. Our training set includes two parts – manually labeled tweets and publicly available Twitter sentiment corpus on Kaggle. We tried Logistic model, BERT-base, BERT-weet to get the classification model with highest accurracy. Then, used this model to label all other unlabeled tweets.

image

As a supplement, we collected tweets for different ethnical groups from 9th August until now to validate whether the negative sentiments returned to baseline, and inspect the progress brought by Tokyo Olympics is permanent or temporary.

image

About

This research focused on whether there is racial bias in sports on Twitter, and whether the Tokyo Olympics have an impact on this. Users on Twitter were divided into sports media and general users, analyzed separately. Main methods used were news coverage analysis and sentiment analysis.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages