Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scalar_in_linter() for x %in% 1 #2084

Merged
merged 14 commits into from
Sep 7, 2023
1 change: 1 addition & 0 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -150,6 +150,7 @@ Collate:
'redundant_ifelse_linter.R'
'regex_subset_linter.R'
'routine_registration_linter.R'
'scalar_in_linter.R'
'semicolon_linter.R'
'seq_linter.R'
'settings.R'
Expand Down
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -115,6 +115,7 @@ export(redundant_ifelse_linter)
export(regex_subset_linter)
export(routine_registration_linter)
export(sarif_output)
export(scalar_in_linter)
export(semicolon_linter)
export(semicolon_terminator_linter)
export(seq_linter)
Expand Down
1 change: 1 addition & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@
* `library_call_linter()` can detect if all library/require calls are not at the top of your script (#2027 and #2043, @nicholas-masel and @MichaelChirico).
* `keyword_quote_linter()` for finding unnecessary or discouraged quoting of symbols in assignment, function arguments, or extraction (part of #884, @MichaelChirico). Quoting is unnecessary when the target is a valid R name, e.g. `c("a" = 1)` can be `c(a = 1)`. The same goes to assignment (`"a" <- 1`) and extraction (`x$"a"`). Where quoting is necessary, the linter encourages doing so with backticks (e.g. `` x$`a b` `` instead of `x$"a b"`).
* `length_levels_linter()` for using the specific function `nlevels()` instead of checking `length(levels(x))` (part of #884, @MichaelChirico).
* `scalar_in_linter()` for discouraging `%in%` when the right-hand side is a scalar, e.g. `x %in% 1` (part of #884, @MichaelChirico).
* `if_not_else_linter()` for encouraging `if` statements to be structured as `if (A) x else y` instead of `if (!A) y else x` (part of #884, @MichaelChirico).
* `commas_linter()` gains an option `allow_trailing` (default `FALSE`) to allow trailing commas while indexing. (#2104, @MEO265)

Expand Down
41 changes: 41 additions & 0 deletions R/scalar_in_linter.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
#' Block usage like x %in% "a"
#'
#' `vector %in% set` is appropriate for matching a vector to a set, but if
#' that set has size 1, `==` is more appropriate. `%chin%` from `data.table`
#' is matched as well.
#'
#' `scalar %in% vector` is OK, because the alternative (`any(vector == scalar)`)
#' is more circuitous & potentially less clear.
#'
#' @evalRd rd_tags("scalar_in_linter")
#' @seealso [linters] for a complete list of linters available in lintr.
#' @export
scalar_in_linter <- function() {
# TODO(#2085): Extend to include other cases where the RHS is clearly a scalar
# NB: all of logical, integer, double, hex, complex are parsed as NUM_CONST
xpath <- "
//SPECIAL[text() = '%in%' or text() = '%chin%']
/following-sibling::expr[NUM_CONST[not(starts-with(text(), 'NA'))] or STR_CONST]
/parent::expr
"

return(Linter(function(source_expression) {
if (!is_lint_level(source_expression, "expression")) {
return(list())
}

xml <- source_expression$xml_parsed_content

bad_expr <- xml_find_all(xml, xpath)
in_op <- xml_find_chr(bad_expr, "string(SPECIAL)")
lint_msg <-
paste0("Use == to match length-1 scalars, not ", in_op, ". Note that == preserves NA where ", in_op, " does not.")

xml_nodes_to_lints(
bad_expr,
source_expression = source_expression,
lint_message = lint_msg,
type = "warning"
)
}))
}
1 change: 1 addition & 0 deletions inst/lintr/linters.csv
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@ redundant_equals_linter,best_practices readability efficiency common_mistakes
redundant_ifelse_linter,best_practices efficiency consistency configurable
regex_subset_linter,best_practices efficiency
routine_registration_linter,best_practices efficiency robustness
scalar_in_linter,readability consistency best_practices efficiency
AshesITR marked this conversation as resolved.
Show resolved Hide resolved
semicolon_linter,style readability default configurable
semicolon_terminator_linter,style readability deprecated configurable
seq_linter,robustness efficiency consistency best_practices default
Expand Down
1 change: 1 addition & 0 deletions man/best_practices_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/consistency_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/efficiency_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 5 additions & 4 deletions man/linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/readability_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

23 changes: 23 additions & 0 deletions man/scalar_in_linter.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

37 changes: 37 additions & 0 deletions tests/testthat/test-scalar_in_linter.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
test_that("scalar_in_linter skips allowed usages", {
linter <- scalar_in_linter()

expect_lint("x %in% y", NULL, linter)
AshesITR marked this conversation as resolved.
Show resolved Hide resolved
expect_lint("y %in% c('a', 'b')", NULL, linter)
expect_lint("c('a', 'b') %chin% x", NULL, linter)
expect_lint("z %in% 1:3", NULL, linter)
# scalars on LHS are fine (often used as `"col" %in% names(DF)`)
expect_lint("3L %in% x", NULL, linter)

# this should be is.na(x), but it more directly uses the "always TRUE/FALSE, _not_ NA"
# aspect of %in%, so we delegate linting here to equals_na_linter()
expect_lint("x %in% NA", NULL, linter)
expect_lint("x %in% NA_character_", NULL, linter)
})

test_that("scalar_in_linter blocks simple disallowed usages", {
linter <- scalar_in_linter()
lint_in_msg <- rex::rex("Use == to match length-1 scalars, not %in%.")
lint_chin_msg <- rex::rex("Use == to match length-1 scalars, not %chin%.")

expect_lint("x %in% 1", lint_in_msg, linter)
expect_lint("x %chin% 'a'", lint_chin_msg, linter)
})

test_that("multiple lints are generated correctly", {
linter <- scalar_in_linter()

expect_lint(
trim_some('{
x %in% 1
y %chin% "a"
}'),
list("%in%", "%chin%"),
linter
)
})
Loading