Add rvdss indicator/data source #1542
Draft
cchuong wants to merge 33 commits into dev from add_rvdss_indicator
Changes from 17 commits
Commits
62b9070 Create rvdss_historic.py (cchuong)
073aac9 Create rvdss_update.py (cchuong)
01af95f create utils.py for common functions (cchuong)
6a002e0 create constants.py and update utils (cchuong)
714455c Update rvdss_historic.py (cchuong)
6ee8bb7 Update rvdss_update.py (cchuong)
8814554 fix typo and add missing abbreviation to constants (cchuong)
d7905c8 fix typo (cchuong)
08f908a add missing geo (cchuong)
07ed998 Update constants.py (cchuong)
fd5bf15 Revert "Update constants.py" (cchuong)
678b468 Revert "add missing geo" (cchuong)
4bfc933 fix geo and virus abbreviation (cchuong)
e8957c3 remove "province of" from geo_values (cchuong)
7720a24 construct urls automatically (nmdefries)
59f79bf comment constants (nmdefries)
e70b0e9 note historic urls don't need to be updated (nmdefries)
72d1906 be stricter about importing local fns (nmdefries)
bf51bd3 move dashboard file names to constants (nmdefries)
ee3cadf move run-the-whole-pipeline code into main() (nmdefries)
180e67f add code to calculate number of positive tests back in (cchuong)
6bd6e24 update abbreviate_geo to remove periods and other spelling (cchuong)
a7666b8 fix lab name missing province (cchuong)
503165e comment historic script (nmdefries)
256e697 move output file names to constants (nmdefries)
cd83087 replace boolean comparisons with pythonic "not" (nmdefries)
969295b actually put csv names in constants (nmdefries)
00f3f9a break more helper functions and add doctsrings (cchuong)
ecca542 add more comments (cchuong)
31ec961 calculate update dates in a new function (cchuong)
0be5f08 combine different spellings of labs (cchuong)
5696636 change slash to underscore in constants and move more processing code… (cchuong)
30f3df6 rvdss interface and new fn layout so current/historical data can be e… (nmdefries)
@@ -0,0 +1,99 @@
# The dataset calls the same viruses, provinces, regions (province groups),
# and country by multiple names. Map each of those to a common abbreviation.
VIRUSES = {
    "parainfluenza": "hpiv",
    "piv": "hpiv",
    "para": "hpiv",
    "adenovirus": "adv",
    "adeno": "adv",
    "human metapneumovirus": "hmpv",
    "enterovirus/rhinovirus": "evrv",
    "rhinovirus": "evrv",
    "rhv": "evrv",
    "entero/rhino": "evrv",
    "rhino": "evrv",
    "ev/rv": "evrv",
    "coronavirus": "hcov",
    "coron": "hcov",
    "coro": "hcov",
    "respiratory syncytial virus": "rsv",
    "influenza": "flu",
    "sars-cov-2": "sarscov2",
}

GEOS = {
    "newfoundland": "nl",
    "newfoundland and labrador": "nl",
    "prince edward island": "pe",
    "nova scotia": "ns",
    "new brunswick": "nb",
    "québec": "qc",
    "quebec": "qc",
    "ontario": "on",
    "manitoba": "mb",
    "saskatchewan": "sk",
    "alberta": "ab",
    "british columbia": "bc",
    "yukon": "yk",
    "northwest territories": "nt",
    "nunavut": "nu",
    "canada": "ca",
    "can": "ca",
    "at": "atlantic",
    "atl": "atlantic",
    "pr": "prairies",
    "terr": "territories",
}

# Regions are groups of provinces that are geographically close together.
# Some single provinces are reported as their own region (e.g. Québec, Ontario).
REGIONS = [
    "atlantic", "atl", "at",
    "province of québec", "québec", "qc",
    "province of ontario", "ontario", "on",
    "prairies", "pr",
    "british columbia", "bc",
    "territories", "terr",
]
NATION = ["canada", "can", "ca"]

# Construct dashboard and data report URLs.
DASHBOARD_BASE_URL_2023 = "https://health-infobase.canada.ca/src/data/respiratory-virus-detections/archive/{date}/"
DASHBOARD_BASE_URLS_2023 = (
    DASHBOARD_BASE_URL_2023.format(date=date) for date in
    (
        "2024-06-20",
        "2024-06-27",
        "2024-07-04",
        "2024-07-11",
        "2024-07-18",
        "2024-08-01",
        "2024-08-08",
        "2024-08-15",
        "2024-08-22",
        "2024-08-29",
        "2024-09-05"
    )
)

SEASON_BASE_URL = "https://www.canada.ca"
ALTERNATIVE_SEASON_BASE_URL = "www.phac-aspc.gc.ca/bid-bmi/dsd-dsm/rvdi-divr/"
HISTORIC_SEASON_REPORTS_URL = SEASON_BASE_URL + "/en/public-health/services/surveillance/respiratory-virus-detections-canada/{year_range}.html"

# Each URL created here points to a list of all data reports made during that
# season, e.g.
# https://www.canada.ca/en/public-health/services/surveillance/respiratory-virus-detections-canada/2014-2015.html.
# The Public Health Agency of Canada site switched in 2024 to reporting
# disease data in a dashboard with a static URL. Therefore, this collection
# of URLs does _NOT_ need to be updated. It is used for fetching historical
# data (for dates on or before June 8, 2024) only.
HISTORIC_SEASON_URL = (
    HISTORIC_SEASON_REPORTS_URL.format(year_range=year_range) for year_range in
    (
        "2013-2014",
        "2014-2015",
        "2015-2016",
        "2016-2017",
        "2017-2018",
        "2018-2019",
        "2019-2020",
        "2020-2021",
        "2021-2022",
        "2022-2023",
        "2023-2024"
    )
)

LAST_WEEK_OF_YEAR = 35
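
For readers of this diff, here is a minimal sketch of how lookup tables like VIRUSES and GEOS might be consumed. The `abbreviate` helper is hypothetical and is not the `abbreviate_geo` function mentioned in the commit list, whose implementation is not shown in this view. The dictionaries are short excerpts of the maps above, copied so the example runs on its own.

```python
# Excerpts of the maps from the diff, copied so this sketch is self-contained.
VIRUSES = {
    "parainfluenza": "hpiv",
    "piv": "hpiv",
    "respiratory syncytial virus": "rsv",
    "sars-cov-2": "sarscov2",
}
GEOS = {
    "québec": "qc",
    "quebec": "qc",
    "newfoundland and labrador": "nl",
    "atl": "atlantic",
}


def abbreviate(name, mapping):
    """Hypothetical helper: look up the common abbreviation for `name`.

    The name is stripped and lowercased before the lookup. Names missing
    from `mapping` are returned in that normalized form instead of raising.
    """
    key = name.strip().lower()
    return mapping.get(key, key)


print(abbreviate("Parainfluenza", VIRUSES))
print(abbreviate(" Québec ", GEOS))
print(abbreviate("ATL", GEOS))
print(abbreviate("bc", GEOS))  # not in the excerpt, so passed through
```

Returning unknown names unchanged is one possible design choice; a stricter pipeline might prefer to raise so that new spellings in the source data are noticed early.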
todo: Please describe in more detail what these URLs are and whether we will ever need to update them (i.e. add new dates).
issue: Also, are these actually for 2023? All the dates are in 2024.
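
One possible way to address both points, sketched under assumptions: the names `DASHBOARD_ARCHIVE_URL`, `ARCHIVE_DATES`, and `DASHBOARD_ARCHIVE_URLS` are hypothetical and do not appear in the PR, and only three of the dates are reproduced. The sketch also illustrates a general Python behavior relevant to the constants in this file: a generator expression can be iterated only once, whereas a tuple can be reused.

```python
# Hypothetical names; the PR uses DASHBOARD_BASE_URL_2023 / DASHBOARD_BASE_URLS_2023.
DASHBOARD_ARCHIVE_URL = (
    "https://health-infobase.canada.ca/src/data/"
    "respiratory-virus-detections/archive/{date}/"
)

# Excerpt of the archive dates listed in the diff.
ARCHIVE_DATES = ("2024-06-20", "2024-06-27", "2024-07-04")

# A generator expression is exhausted after a single pass.
urls_as_generator = (DASHBOARD_ARCHIVE_URL.format(date=d) for d in ARCHIVE_DATES)
first_pass = list(urls_as_generator)
second_pass = list(urls_as_generator)  # nothing left to yield
print(len(first_pass), len(second_pass))

# A tuple can be iterated any number of times.
DASHBOARD_ARCHIVE_URLS = tuple(
    DASHBOARD_ARCHIVE_URL.format(date=d) for d in ARCHIVE_DATES
)
print(len(list(DASHBOARD_ARCHIVE_URLS)), len(list(DASHBOARD_ARCHIVE_URLS)))
print(DASHBOARD_ARCHIVE_URLS[0])
```

Whether the single-pass behavior matters depends on how the rest of the pipeline consumes these constants, which is not visible in this diff.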