Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Geom pandemics #157

Draft
wants to merge 26 commits into
base: master
Choose a base branch
from
Draft

Geom pandemics #157

wants to merge 26 commits into from

Conversation

jmaltman
Copy link

created geom_pandemics.R (not ready)
updated geom_recessions.R with new URL
updated sysdata.rda
created pandemics.rda
created R script to create pandemics.rda

deleted because it was a text file, not a script
Create pandemics data set
replaced last row with updated information on 2020 covid-recession
replace url and change from read.csv to read_xlsx - unsure if this makes errors
@dlcomeaux dlcomeaux marked this pull request as draft October 30, 2023 16:19
@jmaltman
Copy link
Author

When trying to run geom_pandemics, this error occurs:
Error in geom_pandemics():
! Problem while setting up geom.
ℹ Error occurred in the 1st layer.
Caused by error in UseMethod():
! no applicable method for 'mutate' applied to an object of class "character"
This seems to be an issue from line 219 of the geom_pandemics code. Not sure why this is happening since the same code (minus name changes) exists in geom_recessions and that function works

Daniel Comeaux added 2 commits December 28, 2023 09:48
Now works as designed to show the COVID-19 pandemic. Code should be refined and potentially integrated with geom_recessions to minimize duplicate code (but this works as proof of concept).
@dlcomeaux
Copy link
Contributor

@jmaltman heads up, I had a bit of time and was able to get the function to work. I did not address the geom_recessions data update link. Could you play around with that using the new format, ideally using the stable geoJSON link that they have now migrated to rather than the csv/xlsx approach we used before?

@dlcomeaux
Copy link
Contributor

It would also be better practice for the code if we streamlined/reduced duplication between geom_recessions and geom_pandemics. This can certainly be done, and ideally we would have a consolidated documentation too - but woudl take some work and require some changes to arguments (the most important one of which would be changing update_recessions and update_pandemics to a consistent update parameter).

Daniel Comeaux added 13 commits December 28, 2023 09:56
Dummy data does not work for this, so just using the fake data.
Previous change caused error
Shift to using dedicated recessions and pandemics data files
This commit combines geom_pandemics and geom_recessions to allow for shared code and more efficient documentation.
Fixes error in vignette development
@dlcomeaux
Copy link
Contributor

OK @jmaltman I would appreciate if you could see whether this works on your system, including in edge cases. I think it's a functional PR at this stage. When we are ready, we will need to increment the cmapplot version number and add a change log.

@jmaltman
Copy link
Author

Testing this function out with the update piece of the function to insert my own pandemics data, I found two errors come up.

  1. The labels for the made-up pandemics did not change to reflect the name of these imaginary pandemics. Instead, each of them kept the label "COVID-19 Pandemic" in the chart.
  2. For an imaginary pandemic that was ongoing (where ongoing = TRUE in the data), if no end date was added to the ongoing pandemic, then it would not populate in the chart. If I gave it an imaginary date in the future, then it appeared, but if I left the end date as NA, then it did not appear when trying to plot it.

@dlcomeaux
Copy link
Contributor

dlcomeaux commented Jan 11, 2024

@jmaltman

Hmm, I am not able to replicate the bug for the labels. When I add a new pandemic table by copying the existing one and adding a new row, it shows up fine for me with the accurate label. Could you paste the code you used to yield this bug?

The ongoing one requires an end date by the way that the function currently works. You could add some code to do that automatically, if you wanted, but in the recessions table a user supplied one wouldn't work unless you put a far out date on it.

@jmaltman
Copy link
Author

jmaltman commented Jan 11, 2024

@dlcomeaux, I created my own data frame instead of adding to the existing pandemic table.

new_df <- data.frame(start_char = c("Jan 1980", "Apr 2009", "Mar 2015"),
end_char = c("Jun 1986", "Aug 2010", "Feb 2025"),
start_date = c("1/31/1980", "4/25/2009", "3/26/2015"),
end_date = c("6/15/1986", "8/10/2010", "2/1/2025"),
ongoing = c(FALSE, FALSE, TRUE),
name = c("Shoe", "Swine Flu", "Plushie")
)

new_df <-
new_df %>%
mutate(start_date = as.Date(start_date, format = "%m/%d/%Y"),
end_date = as.Date(end_date, format = "%m/%d/%Y"))

ggplot(df, mapping = aes(x = year_dec, y = value)) +
geom_pandemics(update = new_df) +
geom_line(aes(color = var)) +
scale_x_continuous("Year") +
theme_minimal()

@dlcomeaux
Copy link
Contributor

@jmaltman OK this has identified a few things for me.

  • One issue with this is the NA code - can you try what happens on your system if you give that one an end date? Do the labels work properly then?
  • When I run that, it gives me an error that it may not be properly formatted, although it still outputs the geoms
  • I think I know what is happening with the labels. The label is being extracted as a part of the geom before the update function is run, which means that it pulls from the built in labels.

I'm moving the update part of the geom_pandemics function up so that it occurs before the label part of the function. Hopefully that helps with the labeling issue when new data is used.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants