R Style Guidelines

The ACF Data Surge team's R Style Guidelines are intended to make our R code consistent and easy to read across projects. Our guidelines align with the tidyverse style guide, and we adopt the Google R Style Guide's approach of listing only the ways in which our guidelines differ from the tidyverse guidelines.

Our guidelines are unique in that we include a section at the end detailing preferred packages and package management standards.

Functions

Returning Values

  • Use return() explicitly for returning values from a function
my_mean <- function(x) {
    x_mean <- mean(x)
    return(x_mean)
}

Pipes

Assignment

  • Use only left-hand assignment
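library(dplyr)

# Assignment on the same line as the start of the pipeline
iris <- iris %>%
    arrange(Sepal.Length)

# Assignment on its own line, with the pipeline starting below it
iris <-
    iris %>%
    arrange(Sepal.Length)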
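
For contrast, the form this rule excludes is right-hand assignment with `->`. The following is a minimal sketch in base R only; the object names are hypothetical, and the stated rationale (the target name is easier to find at the start of the statement) is an assumption rather than something the guideline spells out.

```r
# Preferred: the assigned name appears first, so it is easy to find
sorted_lengths <- sort(iris$Sepal.Length)

# Avoid: right-hand assignment places the assigned name at the very end
sort(iris$Sepal.Length) -> sorted_lengths_rhs

# Both forms produce the same object; only readability differs
identical(sorted_lengths, sorted_lengths_rhs)
```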

Packages

Package Management

We use renv to manage packages for projects written in R. When beginning a new project, create a dedicated R project. The R project and its renv-related files should be tracked on GitHub so that all contributors work in the same development environment. See the renv documentation for the list of files that should be tracked in Git. Be sure to also include the project's .Rproj file.

In addition to the packages needed for your work, styler and lintr should be among the first packages you install. These packages help ensure that code conforms to the tidyverse style guide, taking some of the burden of remembering the rules off programmers.
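
The setup described above might look roughly like the sketch below. The helper names `setup_project_env()` and `check_style()` are hypothetical and exist only for illustration. The sketch assumes renv is already installed and that it is run from the root of the R project. Nothing is executed until the helpers are called.

```r
# Hypothetical helper: run once from the root of a new R project.
setup_project_env <- function(project = ".") {
  # Create the project-local library and the initial renv.lock;
  # restart = FALSE keeps the current session alive for the next steps
  renv::init(project = project, restart = FALSE)

  # Install the style tools into the project library early on
  renv::install(c("styler", "lintr"), project = project)

  # Record the project's package versions in renv.lock
  renv::snapshot(project = project)
}

# Hypothetical helper: restyle a script in place, then report remaining lints
check_style <- function(path) {
  styler::style_file(path)
  lintr::lint(path)
}
```

A contributor who clones the repository would then typically call `renv::restore()` to rebuild the same library from the tracked lockfile.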

Preferred Packages

  • Data Cleaning: tidyverse, dplyr, tidyr, arrow, dtplyr, janitor
    • Installing the tidyverse installs both dplyr and tidyr, along with many other useful packages. However, we encourage parsimony: if only a few tidyverse packages are needed, do not load them all. This helps with reproducibility, by making clear which packages a specific script needs, and with performance. To that end, we list tidyverse packages individually throughout this guide.
    • When working with large data, prefer arrow and dtplyr to packages such as data.table. arrow and dtplyr provide dplyr-like syntax without sacrificing much speed.
  • Assertive Programming: assertr
    • Use a combination of base R (e.g. stop(), stopifnot()) and assertr for assertive programming. In pipelines, prefer assertr.
  • Loops: purrr, furrr
    • It is acceptable to use the base R *apply() suite of functions for looping.
    • Use furrr as a drop-in replacement for purrr if attempting to parallelize code.
  • Quarto: rmarkdown
  • Reading Data: readr, readxl, openxlsx
    • It is acceptable to use base R where possible, but we recommend considering readr for its speed and consistent naming conventions.
  • Machine Learning/Statistics: tidymodels
  • Geospatial Analyses: sf, terra
  • String Manipulation: stringr
  • Visualization: ggplot2, plotly
  • Dashboards: shiny
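
The base R options named in the list above (stop(), stopifnot(), and the *apply() family) can be sketched without any additional packages. `safe_mean()` is a hypothetical helper used only for illustration. The assertr step at the end is guarded so that the sketch still runs where assertr or dplyr is not installed.

```r
# Base R assertions: fail fast with an informative error.
# `safe_mean` is a hypothetical helper, not part of any package.
safe_mean <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0)
  if (anyNA(x)) {
    stop("`x` contains missing values")
  }
  return(mean(x))
}

# Base R looping with the *apply() family: one mean per numeric column
numeric_cols <- iris[vapply(iris, is.numeric, logical(1))]
col_means <- vapply(numeric_cols, safe_mean, numeric(1))

# In a pipeline, assertr checks the data and passes it along unchanged
if (requireNamespace("assertr", quietly = TRUE) &&
    requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  checked <- iris %>%
    assertr::verify(Sepal.Length > 0) %>%
    assertr::verify(!is.na(Species))
}
```

Using vapply() rather than sapply() fixes the expected return type up front, which fits the assertive style of the rest of the sketch.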