Make forcats 1.0.0 Tidyverse blog article more instructive #345

Open
turbanisch opened this issue Feb 3, 2023 · 1 comment

@turbanisch

Not sure if this is worth opening an issue, but I just went through the forcats 1.0.0 tidyverse blog article and found it mildly confusing at first glance. Adding fct_na_value_to_level() to the plot code has no visible effect: the plot printed below it is exactly the same as the one before.

We can make fct_infreq() do what we want by moving the NA from the values to the levels:

ggplot(starwars, aes(y = fct_rev(fct_infreq(fct_na_value_to_level(hair_color))))) + 
  geom_bar() + 
  labs(y = "Hair color")

I would find it more instructive to rename the NA level in the same step to show that ggplot will then properly adjust:

fct_na_value_to_level(hair_color, "missing")

Otherwise this option is easy to miss, because it appears only further down, in the context of lumping factor levels together:

starwars |> 
  mutate(
    hair_color = hair_color |> 
      fct_na_value_to_level("(Unknown)") |> 
      fct_infreq() |> 
      fct_lump_min(2, other_level = "(Other)") |> 
      fct_rev() 
  ) |> 
  ggplot(aes(y = hair_color)) + 
  geom_bar() + 
  labs(y = "Hair color")
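
To make the renaming suggestion concrete without the plot, here is a minimal sketch on a toy factor. The vector `hair` is hypothetical and only stands in for `starwars$hair_color`; the function calls are the ones already quoted above. It compares the levels with and without a name for the missing-value level.

```r
library(forcats)

# Hypothetical toy data standing in for starwars$hair_color
hair <- factor(c("brown", NA, "black", NA, NA, "brown"))

# Default call: missing values become a level, but that level is itself NA
f_default <- fct_infreq(fct_na_value_to_level(hair))
levels(f_default)

# Named level: missing values become an ordinary level called "missing"
f_named <- fct_infreq(fct_na_value_to_level(hair, "missing"))
levels(f_named)
```

With the named level there is no NA left among the levels, which is presumably why the plot only changes visibly once a name is supplied.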

@cwdjankoski

I found the plots confusing as well. Basically, these two are the same, no?

ggplot(starwars, aes(y = fct_rev(fct_infreq(fct_na_value_to_level(hair_color))))) + 
  geom_bar() + 
  labs(y = "Hair color")

and this

starwars |> 
  mutate(
    hair_color = hair_color |> 
      fct_na_value_to_level() |> 
      fct_infreq() |> 
      fct_rev()
  ) |> 
  ggplot(aes(y = hair_color)) + 
  geom_bar() + 
  labs(y = "Hair color")
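
One way to check this without plotting is to compare the factors the two variants build. The sketch below assumes a hypothetical toy vector `hair` in place of `starwars$hair_color` and R 4.1 or later for the native pipe; it tests only the factor transformation, not the ggplot output.

```r
library(forcats)

# Hypothetical toy data standing in for starwars$hair_color
hair <- factor(c("brown", NA, "black", NA, NA, "brown"))

# Variant 1: nested calls, as written inside aes()
nested <- fct_rev(fct_infreq(fct_na_value_to_level(hair)))

# Variant 2: the same three steps written as a pipeline, as inside mutate()
piped <- hair |>
  fct_na_value_to_level() |>
  fct_infreq() |>
  fct_rev()

# The pipe only rewrites the call order, so the two results should match
identical(nested, piped)
```

If this returns TRUE, the two snippets differ only in where the transformation is written (inline in aes() versus in mutate()), which would explain why the plots look the same.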
