Use national data for nchs-mortality signals #1912
It turns out that this dataset already has national-level data in it; we have just been throwing it away!
You can also see samples of this by browsing the dataset at:
It is better to use this as-is from the source data than to rebuild it as an aggregation.
@melange396 I've reworked the PR to use the data within the dataset, rather than synthesize new data. Of note:
nice! we should do a little cleanup on the inner loops of `run_module()`, and then i think it will be good to go.
            stats.append((max(dates), len(dates)))
        else:
            for sensor in SENSORS:
                for geo in ["state", "nation"]:
you should be able to pull this up two levels (outside of `for metric in METRICS:`) to reduce repetition.
you could even do the filtering on `geo_id ==/!= "us"` there too, if you want to make another [sub]copy of `df_pull`.
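A rough sketch of the restructuring suggested above: split the pulled data by `geo_id` once, before looping over metrics, so the inner loops no longer repeat the geo handling. Plain lists of dicts stand in for `df_pull` here, and `METRICS`, `SENSORS`, the row fields, `split_by_geo`, and `plan_exports` are illustrative placeholders, not the indicator's actual definitions.

```python
# Hypothetical stand-ins; the real module defines its own METRICS and SENSORS.
METRICS = ["deaths_covid_incidence"]
SENSORS = ["num", "prop"]


def split_by_geo(rows):
    """Split rows once into state-level and nation-level subsets.

    Assumes national rows carry geo_id == "us", as hinted in the review.
    """
    return {
        "state": [r for r in rows if r["geo_id"] != "us"],
        "nation": [r for r in rows if r["geo_id"] == "us"],
    }


def plan_exports(rows):
    """Return (geo, metric, sensor, n_rows) for each export that would run."""
    exports = []
    # The geo split is hoisted above the metric loop, so it happens only once.
    for geo, subset in split_by_geo(rows).items():
        for metric in METRICS:
            for sensor in SENSORS:
                exports.append((geo, metric, sensor, len(subset)))
    return exports


# Made-up rows purely for illustration.
rows = [
    {"geo_id": "pa", "timestamp": "2023-01-07", "deaths_covid_incidence": 12},
    {"geo_id": "ny", "timestamp": "2023-01-07", "deaths_covid_incidence": 30},
    {"geo_id": "us", "timestamp": "2023-01-07", "deaths_covid_incidence": 900},
]
for export in plan_exports(rows):
    print(export)
```

With a DataFrame the split would presumably be two boolean masks on the `geo_id` column, taken as copies so later per-geo edits do not touch `df_pull` itself.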
looks good!
Did this work for you in your local testing? Since these values are coming out as "NULL", it makes me think the denominator here is 0 or null, which then makes me think that the national population is not getting set properly here. Also, to make this message go away, we need to mark the nchs-mortality signals as having national-level data in the spreadsheet and then get it transferred to the csv. We can take care of that after the signals are properly acquired.
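To illustrate the failure mode hypothesized above, here is a minimal sketch of a per-100,000 calculation that yields no value when the population lookup has no national entry. The `incidence_prop` helper, the `populations` mapping, and all numbers are assumptions made for the example; this is not the indicator's actual code.

```python
def incidence_prop(count, population):
    """Return count per 100,000 people, or None when it cannot be computed."""
    if count is None or not population:  # population missing (None) or zero
        return None
    return count * 100_000 / population


# Illustrative populations only; note there is no "us" entry yet.
populations = {"pa": 1_000_000, "ny": 2_000_000}

state_value = incidence_prop(50, populations.get("pa"))
missing_national = incidence_prop(900, populations.get("us"))
print(state_value)       # 5.0
print(missing_national)  # None, which would surface as NULL downstream

# Once a national population is supplied, the value is defined.
populations["us"] = populations["pa"] + populations["ny"]
national_value = incidence_prop(900, populations.get("us"))
print(national_value)    # 30.0
```

If this is indeed what is happening, checking that the `us` geo receives a population before the `prop` sensors are computed would be the first thing to verify.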
Description
Currently, only state-specific data is used for any of the data with `source=nchs-mortality`. As the Covidcast dashboard uses the US as a whole by default, this leads to "N/A" values showing up in plots and numeric text.
This PR makes use of the previously discarded national data for all signals in the NCHS family, pulling it from the dataset rather than throwing it away.
Changelog
- `state` to `['state', 'nation']`.
- `percent_of_expected_deaths` - despite the name, it contains proportions of expected deaths (e.g. 1.1 instead of 110%), which are nontrivial to aggregate.
- `nation` data from the dataset.
Fixes
- `nchs-mortality:deaths_covid_incidence_prop` #1906
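As a side note on why the changelog calls `percent_of_expected_deaths` nontrivial to aggregate: assuming the value is a ratio of observed to expected deaths, a national figure is the pooled ratio, not the mean of the state ratios. The sketch below uses made-up state codes and counts, and the `observed` and `expected` field names are hypothetical; it only shows that the two computations disagree.

```python
# Made-up states and counts, purely for illustration.
states = [
    {"geo_id": "aa", "observed": 110, "expected": 100},   # ratio 1.1
    {"geo_id": "bb", "observed": 900, "expected": 1000},  # ratio 0.9
]

# Naive aggregation: the unweighted mean of the per-state ratios.
ratios = [s["observed"] / s["expected"] for s in states]
naive_mean = sum(ratios) / len(ratios)

# Pooled aggregation: total observed over total expected.
pooled = sum(s["observed"] for s in states) / sum(s["expected"] for s in states)

print(f"mean of state ratios: {naive_mean:.3f}")
print(f"pooled national ratio: {pooled:.3f}")
```

The pooled ratio needs the expected-death counts as weights, so if only the ratios are published it cannot be rebuilt from state rows, which is one more reason to take the national row straight from the dataset.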