Skip to content

Latest commit

 

History

History
19 lines (12 loc) · 1.52 KB

README.md

File metadata and controls

19 lines (12 loc) · 1.52 KB

Data processing

In this repo, we are merging project-level ARPA data that the Treasury Department released, and extracting projects that mentions something about the criminal justice system in the project description.

Data source

The Treasury Department requires large cities to report its ARPA spending every quarter, and smaller jurisdictions can report every year. You can find the excel spreadsheet from the Treasury's website, under the "public reporting" sections.

Analysis notebooks

There are two notebooks that are currently relevant.

  • merge_data.ipynb notebook merges project-level data from different data releases.
  • project-level-analysis.ipynb reads in the merged data and runs a text analysis on the project descriptions, extracting projects that contains keywords about the criminal justice system. You can check classfications for the keywords we're extracting, or editing directly in the notebook.

In addition, the legacy_notebooks directory has a number of notebooks that we used for previous analysis, including scrapers and text analysis on the interim report.

Requirements

Python (3.6+) and Pandas

The data files are stored with git-lfs. You might need to install it before pulling the repo.