The current version is not yet on CRAN, but you can install it from GitHub using the {remotes} package:
# install.packages("remotes")
remotes::install_github("jhelvy/factorlock")
Load the library with:
library(factorlock)
The only thing this package does is provide the factorlock::lock_factors() function to make it easier to reorder factors in a data frame according to the row order.
By default, bar charts made with {ggplot2} follow alphabetical ordering:
library(ggplot2)
library(dplyr)
mpg |>
  count(manufacturer) |>
  ggplot() +
  geom_col(aes(x = n, y = manufacturer))
If you want to sort the bars by n, a common mistake is to sort the data frame and assume the reordered rows will pass through to the bars, like this:
mpg |>
  count(manufacturer) |>
  arrange(n) |>
  ggplot() +
  geom_col(aes(x = n, y = manufacturer))
But this produces the same chart! 🤦
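The chart is unchanged because row order and level order are independent. The base R sketch below uses a small made-up data frame (the counts are illustrative, not taken from mpg) to show that sorting the rows leaves the alphabetical level ordering untouched, which is roughly what happens when a character column is mapped to a discrete axis:

```r
# Toy data with made-up counts, deliberately not in alphabetical order
df <- data.frame(
  manufacturer = c("toyota", "audi", "honda"),
  n = c(34, 18, 9)
)

# Sort the rows by n, similar to what arrange(n) does
sorted <- df[order(df$n), ]
row_order <- as.character(sorted$manufacturer)
row_order
# "honda" "audi" "toyota"

# Converting to a factor ignores row order: levels are sorted alphabetically
level_order <- levels(factor(sorted$manufacturer))
level_order
# "audi" "honda" "toyota"
```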
Instead, to sort the bars based on n you have to reorder the factor levels:
mpg |>
  count(manufacturer) |>
  ggplot() +
  geom_col(aes(x = n, y = reorder(manufacturer, n)))
I find this rather unintuitive and difficult to remember. It is also confusing because the ordering of the rows in the data frame won’t match the factor ordering (which is rather opaque to the user).
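To make that mismatch concrete, here is a small base R sketch with made-up counts. After reorder(), the rows are still in their original order while the levels follow n, and nothing in the printed data frame reveals the difference:

```r
# Toy data with made-up counts
df <- data.frame(
  manufacturer = c("toyota", "audi", "honda"),
  n = c(34, 18, 9)
)

# reorder() returns a factor whose levels are sorted by n
df$manufacturer <- reorder(df$manufacturer, df$n)

# The rows have not moved...
row_order <- as.character(df$manufacturer)
row_order
# "toyota" "audi" "honda"

# ...but the levels now follow increasing n
level_order <- levels(df$manufacturer)
level_order
# "honda" "audi" "toyota"
```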
{factorlock} provides an alternative approach by allowing you to “lock” the factor ordering to that of the row ordering in the data frame:
mpg |>
  count(manufacturer) |>
  arrange(n) |>
  factorlock::lock_factors() |>
  ggplot() +
  geom_col(aes(x = n, y = manufacturer))
Notice that you also get better axis label names for free here since the y variable is still mapped to just manufacturer instead of reorder(manufacturer, n).
By default all character or factor type variables are “locked”, but you can also specify which variables you want to “lock” while leaving the others alone:
mpg |>
  count(manufacturer) |>
  arrange(n) |>
  factorlock::lock_factors("manufacturer") |>
  ggplot() +
  geom_col(aes(x = n, y = manufacturer))
You can also get a reverse factor ordering with rev = TRUE:
mpg |>
  count(manufacturer) |>
  arrange(n) |>
  factorlock::lock_factors(rev = TRUE) |>
  ggplot() +
  geom_col(aes(x = n, y = manufacturer))
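For intuition, the three behaviours described above (lock every character or factor column, lock only named columns, reverse the order) can be approximated in base R. This is only a sketch of the idea and is not the package's source code: lock_sketch() and its cols and reverse arguments are hypothetical names chosen for illustration, and the real lock_factors() may handle edge cases differently.

```r
# Hypothetical helper, NOT the factorlock implementation: sets each column's
# factor levels to the order in which its values first appear in the rows.
lock_sketch <- function(df, cols = NULL, reverse = FALSE) {
  if (is.null(cols)) {
    # Default: every character or factor column
    is_target <- vapply(
      df, function(x) is.character(x) || is.factor(x), logical(1)
    )
    cols <- names(df)[is_target]
  }
  for (col in cols) {
    lv <- unique(as.character(df[[col]]))  # first-appearance (row) order
    if (reverse) lv <- rev(lv)
    df[[col]] <- factor(df[[col]], levels = lv)
  }
  df
}

# Toy data with made-up counts, already sorted by n
df <- data.frame(
  manufacturer = c("honda", "audi", "toyota"),
  n = c(9, 18, 34)
)

locked <- lock_sketch(df)
levels(locked$manufacturer)
# "honda" "audi" "toyota"

locked_rev <- lock_sketch(df, cols = "manufacturer", reverse = TRUE)
levels(locked_rev$manufacturer)
# "toyota" "audi" "honda"
```

The key step is factor(x, levels = unique(x)), which makes the level order match the row order, so whatever arrange() produced carries through to the plot.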
- Author: John Paul Helveston https://www.jhelvy.com/
- Date First Written: May 10, 2023
- License: MIT
If you use this package in a publication, I would greatly appreciate it if you cited it. You can get the citation by typing citation("factorlock") into R:
citation("factorlock")
#>
#> To cite factorlock in publications use:
#>
#> John Paul Helveston (2022). factorlock: Set Factors Levels Based on
#> Row Order.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {factorlock: Set Factors Levels Based on Row Order},
#> author = {John Paul Helveston},
#> year = {2022},
#> note = {R package},
#> url = {https://jhelvy.github.io/factorlock/},
#> }