Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve/polish handling of library symbols #12

Open
gmbecker opened this issue Jun 1, 2017 · 8 comments
Open

Improve/polish handling of library symbols #12

gmbecker opened this issue Jun 1, 2017 · 8 comments

Comments

@gmbecker
Copy link
Collaborator

gmbecker commented Jun 1, 2017

API that I hacked in during thesis work probably isn't the right one, and handling could be more robust/useful.

I believe @jonocarroll had some interest in collaborating on this. This issue is just a stub for now, but if there is interest we'll flesh it out and figure out details

@jonocarroll
Copy link

My interest was originally in doing something that may have some application here: https://twitter.com/carroll_jono/status/809206596953772032 (https://gist.github.com/jonocarroll/55046430b23d88e9628ac6b4edf8bb52). I'm interested in helping out when I have the time (which is a bit of a varying quantity at the moment).

@gmbecker
Copy link
Collaborator Author

gmbecker commented Jun 2, 2017 via email

@jonocarroll
Copy link

Yeah, I'm interested as it's good crossover with what I wanted it to be able to do (CodeDepends is a much more thorough invocation of my goal). Disambiguating based on the search path at that point of the script is complex, I'm sure; i.e. in the case that dplyr was loaded, select() was called once, then MASS was loaded and select() called again, I'd like to be able to pick out the separate dependencies.

Ideally this would allow someone to go through a script and correctly add package:: to every function call (i.e. for writing a packaged function).

I'll surely look through the codebase in the near future and will see what I can do to contribute.

@dsidavis
Copy link

dsidavis commented Jun 2, 2017

Great.
This identification of where in the search path a function is currently located can be done statitically.
It can also be done semi-statically, i.e., we can evaluate the library()/require() calls contained in the script while doing static analysis and not evaluate the other expressions. Then we look along the actual search path to find "select", etc. This avoids reimplementing a pseudo version of library/require along with all of its options, including pos, lib.loc and anything that could get added in the future.
This wouldn't deal with conditional calls to library(). Or within function calls to library() but that is possible.

@jonocarroll
Copy link

conditional calls to library(). Or within function calls to library()

These shouldn't appear in package functions, but I could see someone calling library() in a scripted function. It's perhaps poor practice to do so, but I'd consider this an edge-case of what I had in mind.

@dsidavis
Copy link

dsidavis commented Jun 2, 2017

Absolutely, entirely agree.

@gmbecker
Copy link
Collaborator Author

gmbecker commented Jun 2, 2017

It can also be done semi-statically, i.e., we can evaluate the library()/require() calls contained in the script while doing static analysis and not evaluate the other expressions. Then we look along the actual search path to find "select", etc. This avoids reimplementing a pseudo version of library/require along with all of its options, including pos, lib.loc and anything that could get added in the future.

Indeed, for an example of this approach see the data function handler I added recently. It does this to determine the output variables from a data() call so that they appear properly in the dependency chain.

EDIT: Actually looking closely, is is the approach I took for the current version of library symbol checking that's there now. That is only currently used when you tell CodeDepends to treat functions as inputs, though, so lots more can be done.

This wouldn't deal with conditional calls to library(). Or within function calls to library() but that is possible.

The much harder (probably impossible to fully handle?) issue is require in this sense, since it will succeed and the script/function will continue to run even when the package is not installed. The existing machinery assumes that a require call actually loads the library, which I think is the only feasible way of handling it. That assumption is going to get stronger when we're actually doing things with that library information, though.

This likely won't affect scripts (much) but when bringing CodeDepends to bear on package code things will get a little more complicated. Though I guess require calls throw a warning in R CMD check these days, so it's probably not the problem that it would have been in the past.

@gmbecker
Copy link
Collaborator Author

gmbecker commented Jun 9, 2017

#15 and #16 are relevant here as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants