-
Notifications
You must be signed in to change notification settings - Fork 0
USASpending Sync
Peter B edited this page Sep 18, 2024
·
3 revisions
The sync process for the prospekt server is as follows:
- Load USASpending data
- Create indexes
- Verify code
- Update prospekt database
- Download data from usaspending.gov
- Create a list of items to restore
- Create a new database
- Restore database
Example:
sudo mkdir -p /data/downloads/USASpending/20240908
cd /data/downloads/USASpending/20240908
sudo wget https://files.usaspending.gov/database_download/usaspending-db_20240908.zip
unzip usaspending-db_20240908.zip
cd ..
sudo touch restore_20240908.list
sudo chmod 777 restore_20240908.list
pg_restore --list 20240908 | grep source_procurement_transaction | grep TABLE > restore_20240908.list
sudo -i -u postgres psql -c "create database usaspending_20240908 tablespace usaspending;"
sudo -i -u postgres psql -d usaspending_20240908 -c "create schema raw;"
sudo -i -u postgres pg_restore \
--jobs 16 \
--dbname postgresql://localhost:5432/usaspending_20240908 \
--verbose \
--exit-on-error \
--use-list /data/downloads/spending/restore_20240908.list \
/data/downloads/spending/20240908
--SELECT * FROM pg_indexes WHERE schemaname = 'rpt' and tablename='award_search' ORDER BY indexname;
--CREATE INDEX idx_award_search_recipient_uei ON rpt.award_search(recipient_uei);
--CREATE INDEX idx_award_search_naics_code ON rpt.award_search(naics_code);
SELECT * FROM pg_indexes WHERE schemaname='raw' and tablename='source_procurement_transaction' ORDER BY indexname;
CREATE INDEX idx_procurement_transaction_recipient_uei ON raw.source_procurement_transaction(awardee_or_recipient_uei);
- Generate a unique list of
type_of_contract_pricing
and confirm logic for determining thetype
of award is still valid - Generate a unique list of
extent_competed
and confirm logic for determiningcompeted
is still valid - Confirm customer abbreviations are still valid
select distinct(funding_agency_name)...