Skip to content

Commit

Permalink
parse_data: promote custom parse logic for R 4.4 compatibility
Browse files Browse the repository at this point in the history
utils::getParseData has a longstanding bug: for an installed package,
parse data is available only for the last file [1].  To work around
that, the get_tokens helper first calls getParseData and then falls
back to custom logic that extracts the concatenated source lines,
splits them on #line directives, and calls getParseData on each file's
lines.

The getParseData bug was fixed in R 4.4.0 (r84538).  Unfortunately
that change causes at least two issues (for some subset of packages):
a substantial performance regression [2] and an error when applying
exclusions [3].

Under R 4.4, getParseData always returns non-NULL as a result of that
change when calculating package coverage (in other words, the
get_parse_data fallback is _not_ triggered).  The slowdown is
partially due to the parse data no longer being cached across
get_tokens calls.  Another relevant aspect, for both the slowdown and
the error applying exclusions, is likely that the new getParseData
returns data for the entire package rather than the per-file parse
data the downstream covr code expects.

One solution would be to adapt covr's caching and handling of the
getParseData when running under R 4.4.0 or later.  Instead go with a
simpler and more minimal fix.  Reorder the calls so that the
get_parse_data call, which we know has been the primary code path for
package coverage before R 4.4.0, is the first call tried.  Leave
getParseData as the fallback to handle the non-package coverage cases.

[1] r-lib#154
    https://bugs.r-project.org/show_bug.cgi?id=16756

[2] As an extreme case, calling package_coverage on R.utils goes from
    under 15 minutes to over 6 hours.

[3] nanotime (v0.3.10) and diffobj (v0.3.5) are two examples of
    packages that hit into this error.

Closes r-lib#576
Closes r-lib#579
Re: r-lib#567
  • Loading branch information
kyleam committed Nov 15, 2024
1 parent 2b7c432 commit 5ab2280
Show file tree
Hide file tree
Showing 2 changed files with 14 additions and 2 deletions.
3 changes: 3 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
# covr (development version)

* Fixed a performance regression and an error triggered by a change in R
4.4.0. (@kyleam, #588)

* Fixed an issue where attempting to generate code coverage on an already-loaded
package could fail on Windows. (@kevinushey, #574)

Expand Down
13 changes: 11 additions & 2 deletions R/parse_data.R
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,16 @@ clean_parse_data <- function() {
rm(list = ls(package_parse_data), envir = package_parse_data)
}

# Needed to work around https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16756
get_tokens <- function(srcref) {
getParseData(srcref) %||% get_parse_data(attr(getSrcref(srcref), "srcfile"))
# Before R 4.4.0, covr's custom get_parse_data is necessary because
# utils::getParseData returns parse data for only the last file in the
# package. That issue (bug#16756) is fixed in R 4.4.0 (r84538).
#
# On R 4.4.0, continue to use get_parse_data because covr's code expects the
# result to be limited to the srcref file. getParseData will return parse data
# for all of the package's files.
get_parse_data(attr(getSrcref(srcref), "srcfile")) %||%
# This covers the non-installed file case where the source file isn't a
# concatenated file with "line N" directives.
getParseData(srcref)
}

0 comments on commit 5ab2280

Please sign in to comment.