Get environment variables from R process for tasks #4606

juliasilge · 2024-09-06T23:04:27Z

Addresses #2723

Together with posit-dev/ark#507, this sets up infrastructure so we can propagate known sets of environment variables to our tasks that use ShellExecution (or ProcessExecution). I added the TESTTHAT environment variables for devtools::test() only, for now. We can add other sets of environment variables as they come up.

TODO:

Bump ark

QA Notes

To QA this, you'll need an R package with some failing tests. Here's one!

Run testthat::set_max_fails(1) in the console
Run the command "R: Test R Package"
Notice that although this package has 4 failing tests, only ones from the first file are run. You'll see output like this:

Maximum number of failures exceeded; quitting at end of file.
ℹ Increase this number with (e.g.) testthat::set_max_fails(Inf)

jmcphers

This doesn't appear to do "startsWith" but rather does a regex based match. e.g. if you ask for POS you'll get vars with POS anywhere in the var name.

> .ps.rpc.get_env_vars('POS')
$HOMEBREW_REPOSITORY
[1] "/opt/homebrew"

$POSITRON
[1] "1"

$POSITRON_VERSION
[1] "2024.09.0"

I don't feel great about the regex based approach; it means that we need to worry about magic characters and it's also easy to suck in environment values that are unintended -- which can contain sensitive values, change behavior in unwanted ways, etc.

I think it'd be better for this RPC to take an array of the exact variable names that are sought, and to just return the values for each of those variables. I realize that's a little more laborious, and would also require us to update this if e.g. more TESTTHAT variables are added in the future, but probably worth it to be more defensive/explicit. What do you think?

juliasilge · 2024-09-09T19:33:13Z

@DavisVaughan thought the same here about the regex approach, so I'll change both PRs to be more explicit/conservative.

juliasilge · 2024-09-09T19:50:17Z

For a more explicit and conservative approach, looks like we can just stick with only supporting TESTTHAT_MAX_FAILS for now:

https://github.com/search?q=repo%3Ar-lib%2Ftestthat%20Sys.setenv&type=code

It's the only one that folks have a way to set on their own.

juliasilge · 2024-09-09T22:27:06Z

extensions/positron-r/src/tasks.ts

 		},
 		{
 			task: 'r.task.packageTest',
 			message: vscode.l10n.t('{taskName}', { taskName: 'Test R package' }),
 			rcode: 'devtools::test()',
-			package: 'devtools'
+			package: 'devtools',
+			envVars: await getEnvVars(['TESTTHAT_MAX_FAILS'])


If we had more than one, we would list them out like this:

envVars: await getEnvVars(['TESTTHAT_MAX_FAILS', 'TESTTHAT_PKG'])

jmcphers

Seems pretty straightforward. Do we need to worry about the timing of the call to getRPackageTasks()? (i.e. is it always going to be run when the R session is warm? is it OK to cache the env vars as part of the task defs or do we need to worry about them changing between the task defs and the actual execution of the package test task?)

jmcphers · 2024-09-09T23:26:12Z

extensions/positron-r/src/tasks.ts

 		},
 		{
 			task: 'r.task.packageTest',
 			message: vscode.l10n.t('{taskName}', { taskName: 'Test R package' }),
 			rcode: 'devtools::test()',
-			package: 'devtools'
+			package: 'devtools',
+			envVars: await getEnvVars(['TESTTHAT_MAX_FAILS'])


This needs some exception handling; if we fail to get the env vars for some reason, the result will be that no package tasks will be defined all. We either need to try/catch here or have getEnvVars avoid throwing in all circumstances (errors just logged and empty results returned safely)

I think that is OK in this situation, because of how getRPackageTasks() is called when the command is executed, right? I think I would rather surface weird errors than swallow them, like for example this one I got in the midst of a weird debugging state:

To be clear, I'm not worried about this error in particular but would rather such things show up.

@jmcphers I know you approved the PR but I'll wait for one more response from you to make sure I am not misunderstanding the situation.

OK, that looks good!

juliasilge · 2024-09-10T01:52:02Z

Do we need to worry about the timing of the call to getRPackageTasks()?

This gets called when you execute the command, so I believe it is fine:

positron/extensions/positron-r/src/commands.ts

Lines 113 to 114 in 2895c2b

    
           const tasks = await getRPackageTasks(); 
        
           const task = tasks.filter(task => task.definition.task === 'r.task.packageTest')[0];

To see that this responds to the _current_value of the env vars empirically, use this example R package to do:

testthat::set_max_fails(1)
See [ FAIL 2 | WARN 0 | SKIP 0 | PASS 0 ]
`testthat::set_max_fails(3)
See [ FAIL 3 | WARN 0 | SKIP 0 | PASS 0 ]

lionel- · 2024-09-10T09:01:19Z

Some thoughts:

Timing

Right now the callMethod() call will only resolve when the kernel becomes idle. A better way of getting these envvars would be to push them from ark to positron after each top-level command. Positron would save the current values of these variables and use them as needed when creating tasks. This way, tasks can be run at any time rather than just when R is idle. This is not necessarily a blocker for this PR since quite often R is idle during package development, but this is still a regression for the busy case. I also wonder about the user experience as the action will not have immediate feedback in some cases, which would be confusing. So it's probably worth it to implement the push approach now.

I see two ways to go about it:

Push from the kernel to positron by sending a backend-to-frontend event.
Push from the kernel to the LSP. Then add an LSP request that the frontend can use without having to wait.

(1) seems the hardest (surprisingly to me, and @jmcphers might see an easier approach than I can). First it depends on who should process these events. Should it be a language-agnostic part of positron or only positron-r? If the former, it's a matter of adding the new event type on the UI comm following existing cases. If the latter, I think the UI comm is not the best place for this and we might want to create a new positron-r-specific comm that would get these events, which I think has not been done before. The advantage of such a comm is that it could be developed and experimented with "in extension" without changing core parts of positron. Once done, it should be easy to create events targetting only the positron-r extension.

In both cases, the kernel will need to to compute the current value of the envvars and send them as some kind of event, and the place to do it is https://github.com/posit-dev/ark/blob/d1f72cb03be0a2c3add85c59c147067275659eb7/crates/ark/src/interface.rs#L659.

(2) is the easiest, it requires adding the envvars as part of the ConsoleInput that we already send to the LSP after each top-level command in refresh_lsp. Then we need to implement a new LSP request to get these values. You can follow what we did for statement range for instance, starting from statement_range. I'd be happy to help with this if needed, we could pair on this for instance.

Envvars set

FWIW to me matching the envvars with a regex is not that problematic because we're only reproducing what's happening anyway when launching subprocesses from R (propagation of envvars). Here the process parent is positron, and we're pretending it's R/Ark. I agree it feels a bit icky, but it's probably fine in practice.

In the longer term, a better way would be for packages to declare what environment variables should be reexported to subprocesses via a top-level declare() annotation in their package. The process would be as follows:

The LSP learns about what envvars to reexport by analysing top-level declarations in package code.
After a top-level command, the kernel queries the LSP what envvars are reexported
It peeks the current values and sends these to the frontend (or to the LSP depending on our choice above).

With this setup, neither positron nor ark would have to manage a list of environment variable names or matching patterns.

jmcphers · 2024-09-10T18:05:29Z

The ability to see what environment variables are currently set in the kernel is a very useful tool in a lot of contexts, so I think it is worth considering some language-agnostic supporting infrastructure around pushing/synchronizing the set to the front end as you describe.

However, I don't think that we should hold up this PR to set up that infrastructure because it's considerably more ambitious and we really need to think through the goals/approach. If there's concern around the UX for this PR specifically, it'd be more expedient to address it by adding a progress toast to the R package task collector if the RPC takes longer than 1s or so. @lionel-, what do you think?

lionel- · 2024-09-11T05:29:25Z

That would be a nice improvement as it would give user feedback. But still it might not be clear to users why their tasks are not making progress when they have a Shiny application running in the console.

Adding a language agnostic event for envvars would be easier than an extension-specific one, so that's good news that it would be the right thing to do.

juliasilge · 2024-09-11T14:55:14Z

Today @lionel- and I chatted in a little more detail on this. As of today, we already use call_method() in a way that only works when the kernel is idle throughout our package development tasks (to get the locale, to check if a package is installed, etc); adding another use here to get env vars isn't a big change in the behavior. I'm going to open a followup issue for us to move to a different approach for all of these so we can track when/how this is a problem for users in practice.

juliasilge · 2024-09-11T15:43:07Z

Followup issue is here: #4641

Signed-off-by: Julia Silge <[email protected]>

Get env vars from R process for tasks

b8756a1

juliasilge requested a review from jmcphers September 6, 2024 23:25

juliasilge mentioned this pull request Sep 9, 2024

Expose environment variables from R process posit-dev/ark#507

Merged

jmcphers requested changes Sep 9, 2024

View reviewed changes

Pass specific env vars from R session by name

2895c2b

juliasilge commented Sep 9, 2024

View reviewed changes

juliasilge requested a review from jmcphers September 9, 2024 22:52

jmcphers previously approved these changes Sep 9, 2024

View reviewed changes

juliasilge mentioned this pull request Sep 11, 2024

Revisit uses of RPC callMethod that (inappropriately) require kernel to be idle #4641

Open

Bump ark to v0.1.135

25eb1eb

juliasilge dismissed jmcphers’s stale review via 25eb1eb September 11, 2024 19:10

Merge branch 'main' into env-vars-to-task

b79bc3c

Signed-off-by: Julia Silge <[email protected]>

juliasilge merged commit b3ac5df into main Sep 11, 2024
2 checks passed

juliasilge deleted the env-vars-to-task branch September 11, 2024 20:42

github-actions bot locked and limited conversation to collaborators Sep 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Get environment variables from R process for tasks #4606

Get environment variables from R process for tasks #4606

juliasilge commented Sep 6, 2024 •

edited

Loading

jmcphers left a comment

juliasilge commented Sep 9, 2024

juliasilge commented Sep 9, 2024

juliasilge Sep 9, 2024

jmcphers left a comment

jmcphers Sep 9, 2024

juliasilge Sep 10, 2024

jmcphers Sep 10, 2024

juliasilge commented Sep 10, 2024

lionel- commented Sep 10, 2024

jmcphers commented Sep 10, 2024

lionel- commented Sep 11, 2024

juliasilge commented Sep 11, 2024

juliasilge commented Sep 11, 2024

Get environment variables from R process for tasks #4606

Get environment variables from R process for tasks #4606

Conversation

juliasilge commented Sep 6, 2024 • edited Loading

TODO:

QA Notes

jmcphers left a comment

Choose a reason for hiding this comment

juliasilge commented Sep 9, 2024

juliasilge commented Sep 9, 2024

juliasilge Sep 9, 2024

Choose a reason for hiding this comment

jmcphers left a comment

Choose a reason for hiding this comment

jmcphers Sep 9, 2024

Choose a reason for hiding this comment

juliasilge Sep 10, 2024

Choose a reason for hiding this comment

jmcphers Sep 10, 2024

Choose a reason for hiding this comment

juliasilge commented Sep 10, 2024

lionel- commented Sep 10, 2024

Timing

Envvars set

jmcphers commented Sep 10, 2024

lionel- commented Sep 11, 2024

juliasilge commented Sep 11, 2024

juliasilge commented Sep 11, 2024

juliasilge commented Sep 6, 2024 •

edited

Loading