Home
Update: Download, tidy, and export data to Elmer via R.
- Open files: config.R, run.R
- Follow the instructions in run.R and run line by line where necessary.
In Elmer, you can find staged tables in stg.ofm_apr_intercensal, stg.ofm_apr_postcensal, and stg.ofm_apr_postcensal_housing.
The Python script here imports CHAS data into Elmer. It downloads the CHAS data from the HUD website, unzips it, and loads it into the Elmer staging database. The data website is https://www.huduser.gov/portal/datasets/cp.html. Background information is at http://aws-linux/mediawiki/index.php/Comprehensive_Housing_Affordability_Strategy_(CHAS)
The script requires numpy, pandas, urllib, pyodbc, pathlib, requests, zipfile, and sqlalchemy.
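The download-and-unzip step described above can be sketched as follows. This is a minimal illustration using only the standard library, not the code in CHAS_ETL.py; the function names download_bytes and extract_zip are hypothetical, and the exact download URL for a given CHAS file is left to the caller.

```python
import io
import urllib.request
import zipfile
from pathlib import Path
from typing import List


def download_bytes(url: str) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_zip(zip_bytes: bytes, target_dir: Path) -> List[Path]:
    """Unpack an in-memory zip archive into target_dir.

    Returns the paths of the extracted files (directory entries are skipped).
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(target_dir)
        return [
            target_dir / name
            for name in archive.namelist()
            if not name.endswith("/")
        ]
```

A caller would pass the result of download_bytes(...) for the chosen CHAS zip file to extract_zip(...), then read the extracted CSV files with pandas before loading them into staging.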
Run the script by executing CHAS_ETL.py. To use a new dataset, change data_file_name around line 61 (data_file_name = '2012thru2016-140-csv.zip') and the data dictionary name around line 73 (data_dict_name = 'CHAS data dictionary 12-16.xlsx').
You may also want to specify a new name for the staging table at line 129: df_to_staging(table_9_data_long, 'chas_tbl_9_2016')
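The three values that change with each new dataset all follow from the survey years, so they could be derived in one place. The helper below is a hypothetical sketch, not part of CHAS_ETL.py, and it assumes HUD keeps the file and data dictionary naming pattern shown in the examples above; check the actual names on the HUD website before relying on it.

```python
def chas_names(start_year: int, end_year: int, table_number: int = 9) -> dict:
    """Build the file, dictionary, and staging table names for one CHAS vintage.

    Assumes the naming pattern seen in the 2012-2016 release still applies.
    """
    return {
        "data_file_name": f"{start_year}thru{end_year}-140-csv.zip",
        "data_dict_name": (
            f"CHAS data dictionary {start_year % 100:02d}-{end_year % 100:02d}.xlsx"
        ),
        "staging_table": f"chas_tbl_{table_number}_{end_year}",
    }
```

For the 2012-2016 release, chas_names(2012, 2016) reproduces the three values currently hard-coded in the script.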
After the data has been put in the staging database, the code in https://github.com/psrc/housing-metrics/tree/main/process/CHAS is used to put the data into facts and dimensions and add geographic information.
As of October 2021, the CHAS API is available, but tract-level data is not. Summaries are available through the API for the following geographies:
- Nation
- State
- County
- MCD
- Place
The API allows users to query for data, equivalent to the Query tab on their interface. The script CHAS_API allows you to query county-level summaries for the PSRC region.
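As a rough illustration of a county-level query, the sketch below builds one request URL per PSRC county. It is not the CHAS_API script. The endpoint, the parameter names (type, stateId, entityId, year), the geography code 3 for counties, and the year format are assumptions recalled from HUD User's API documentation and should be verified there. The API is also understood to require an access token sent as a bearer header, which is omitted here.

```python
from urllib.parse import urlencode

# Assumed endpoint and geography code; confirm against HUD User's API docs.
CHAS_API_URL = "https://www.huduser.gov/hudapi/public/chas"
GEOGRAPHY_TYPE_COUNTY = 3

# FIPS codes for Washington State and the four PSRC counties.
WASHINGTON_FIPS = "53"
PSRC_COUNTY_FIPS = {
    "King": "033",
    "Kitsap": "035",
    "Pierce": "053",
    "Snohomish": "061",
}


def county_query_url(county_fips: str, year: str) -> str:
    """Build a CHAS API URL for one county (parameter names are assumed)."""
    params = {
        "type": GEOGRAPHY_TYPE_COUNTY,
        "stateId": WASHINGTON_FIPS,
        # Assumption: the API expects the county code without leading zeros.
        "entityId": int(county_fips),
        "year": year,
    }
    return f"{CHAS_API_URL}?{urlencode(params)}"


def psrc_query_urls(year: str) -> dict:
    """Return one query URL per PSRC county, keyed by county name."""
    return {
        name: county_query_url(fips, year)
        for name, fips in PSRC_COUNTY_FIPS.items()
    }
```

Each URL could then be requested with the requests library, passing the access token in the Authorization header, and the JSON responses combined into one pandas data frame.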