Lisette Solis, Josemaria Macedo, JP Martinez, Monica Nimmagadda
In 2020, millions of people participated in Black Lives Matter protests throughout the USA and the world [1]. These protests, sparked by the murder of George Floyd, are some of the largest in USA history.
Our project focuses on understanding these protests and their impact in relation to media coverage and changes to municipal budgets.
We were particularly interested in understanding the extent and tone of media coverage of the protests, and in turn whether there is a relationship between the number of protests, the type of media coverage, and changes to municipal budgets.
More information on our data collection and analysis can be found on the HTML site once the project is run.
To run the project, follow these steps:
- Clone the repository
- Run `pip install --user dash-bootstrap-components` in the terminal
- Create a Python file named `config.py` in the `project_protests` package directory with the API keys for the New York Times and The Guardian API (keys sent privately)
- Go back to the root folder `30122-project-project-protest` and run `poetry install` to install the necessary packages (will take around 8 minutes to install)
- Run `poetry shell` to activate the virtual environment
- From the directory `30122-project-project-protest`, run the command `poetry run python -m project_protests <arguments>`. Arguments are optional.
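As a rough illustration of the `config.py` step above, the file could look like the sketch below. The variable names `NYT_API_KEY` and `GUARDIAN_API_KEY` are assumptions made for this example, not names confirmed by the project, so match them to whatever names the `project_protests` package actually imports.

```python
# config.py (hypothetical layout)
# NOTE: both variable names are assumptions for illustration only.
# Replace the placeholder strings with the keys you received privately.

NYT_API_KEY = "your-new-york-times-api-key"
GUARDIAN_API_KEY = "your-guardian-api-key"
```

Since this file holds credentials, it is usually best kept out of version control.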
When `poetry run python -m project_protests` is run without any arguments, it launches the dashboard application by default. You can also pass up to two arguments:
- `compile_news`: Using the JSON files obtained from scraping data from The New York Times and The Guardian, this argument cleans and compiles them into a single CSV with the newspaper information. This argument can be combined with `collect_data`.
- `run`: This argument performs the two tasks described in `compile_news` and then runs the dashboard. This argument can be combined with `collect_data`.
- `collect_data`: This argument collects the data from The Guardian and The New York Times APIs and stores the JSON files obtained from the requests. It can only be used in combination with either `compile_news` or `run`, and it must be the last argument. (With the default query arguments, the run time for this argument is approximately 25 minutes.)
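The argument rules above can be summarized in a short sketch. This is not the project's actual entry point: the function name `plan_tasks`, the task labels, and the choice to run data collection before compilation are assumptions for illustration. The sketch also assumes that `compile_news` and `run` cannot be combined with each other, since only combinations with `collect_data` are described.

```python
from typing import List

# The three arguments described in this README.
VALID_ARGS = {"compile_news", "run", "collect_data"}


def plan_tasks(args: List[str]) -> List[str]:
    """Translate command-line arguments into an ordered list of tasks.

    Hypothetical helper: names and task ordering are assumptions.
    """
    if not args:
        # No arguments: only the dashboard is launched.
        return ["dashboard"]
    if len(args) > 2:
        raise ValueError("at most two arguments are accepted")
    unknown = [a for a in args if a not in VALID_ARGS]
    if unknown:
        raise ValueError(f"unknown argument(s): {', '.join(unknown)}")

    collect = "collect_data" in args
    if collect and (len(args) != 2 or args[0] == "collect_data"):
        # collect_data must follow compile_news or run, as the last argument.
        raise ValueError("collect_data must come last, after compile_news or run")
    if not collect and len(args) == 2:
        # Assumption: only combinations with collect_data are supported.
        raise ValueError("two arguments are only valid when the second is collect_data")

    tasks = ["collect_data"] if collect else []
    tasks.append("compile_news")  # both compile_news and run compile the news data
    if args[0] == "run":
        tasks.append("dashboard")
    return tasks
```

Under these assumptions, `poetry run python -m project_protests run collect_data` would collect the data, compile it, and then start the dashboard, while passing `collect_data` on its own would be rejected.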
Following the instructions above produces an HTML site with two tabs:
- Home - interactive visualizations of our data
  - Protest:
    - number of protests per year (2017-2023)
  - Newspaper:
    - number of news stories per year (2017-2023)
    - correlation matrix between number of news stories and number of protests
  - Sentiment Analysis:
    - sentiment scores of news stories per year (2017-2023)
    - similarity scores of words related to "police" (2017-2022)
- Data Sources and Analysis
  - Description and shortcomings of our data sources
  - Method of analysis