@mbruns91 here's the little writeup I promised.
An extremely brief summary of some of the things I thought were either the most helpful/important or the most surprising/tricky when I was setting this up.
Key GitHub resources
Structure
Actions can be anywhere, and get accessed by `organization/repo/path/to/action/folder`, as long as the folder just contains an `action.yaml` file specifying the action.
Here we keep them in a flat directory structure, so our action calls just look like `pyiron/actions/cached-mamba@main`.
The bit after the `@` just specifies a branch, tag, release, or even commit hash (the last of which is actually the most secure way to use foreign actions, because then they don't get security-breaching updates added to them!).
Reusable workflows must, must, must be in `organization/repo/.github/workflows`.
It's super-duper annoying.
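For reference, here is a minimal sketch of what those two kinds of reference can look like in a caller workflow. Only `pyiron/actions/cached-mamba@main` comes from the text above; `some-org/some-repo`, `some-workflow.yml`, and the job names are made-up placeholders, and any `with:` inputs the action might need are left out because they aren't described here.

```yaml
name: structure-sketch            # hypothetical workflow name
on: push
jobs:
  use-an-action:
    runs-on: ubuntu-latest
    steps:
      # Pattern: organization/repo/path/to/action/folder@ref
      - uses: pyiron/actions/cached-mamba@main
      # Pinning a foreign action to a commit hash instead of a branch
      # (placeholder, left commented out on purpose):
      # - uses: some-org/some-repo/path/to/action@<full-commit-sha>
  call-a-reusable-workflow:
    # Reusable workflows are referenced at the job level and must live
    # under .github/workflows in their repository.
    uses: some-org/some-repo/.github/workflows/some-workflow.yml@main
```

Note that the action is called from a step, while the reusable workflow replaces the whole job.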
Context
Context is pretty straightforward to use, just `${{ SOMECONTEXT.VARIABLEINTHISCONTEXT }}` does the trick most of the time.
In this repo we obviously lean heavily on the `inputs` context, but I found this completely straightforward to deal with. Similarly `secrets`, `steps`, etc.
I have a vague memory of running into some confusion with the interplay between `env` environment variables and the bash shell, but I don't remember the details very well, and they must have been transient, since over in cached-mamba we very clearly `echo "DATE=$(date +'%Y%m%d')" >> $GITHUB_ENV` and later reference `${{ env.DATE }}` without any trouble -- exactly as expected from the environment docs.
The only gotcha I ran into here was figuring out when we need to evaluate things as an expression -- i.e. wrapping stuff in `${{ }}` -- and when we can just reference the raw variable.
Near the top of the expressions docs there is a little one-sentence bit on this explaining that the expression syntax can be omitted only in the context of an `if` clause.
There are lots of other cool operations on that doc page, but as far as I can tell `if` is really the only time you can omit the wrapping `${{ }}`.
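To make the `env` and `${{ }}` points concrete, here is a small sketch. The `echo ... >> $GITHUB_ENV` line mirrors the one quoted from cached-mamba; the workflow, job, and step names are invented for illustration. It writes a variable in one step and reads it back in the next, once through an expression, once as a plain shell variable, and once in an `if` without the wrapper.

```yaml
name: env-context-sketch          # hypothetical workflow name
on: push
jobs:
  demo:
    runs-on: ubuntu-latest
    steps:
      - name: Write a variable into the job environment
        # The new value is only visible to steps that run after this one.
        run: echo "DATE=$(date +'%Y%m%d')" >> "$GITHUB_ENV"
      - name: Read it back
        # In an `if`, the ${{ }} wrapper can be left off.
        if: env.DATE != ''
        run: |
          echo "via the env context: ${{ env.DATE }}"
          echo "via the shell:       $DATE"
```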
Security
GitHub allows encrypted secrets to be stored at either the organization or repository level, and then accessed in the `secrets` context.
AFAIK the only real danger point we need to be aware of is the distinction between the `pull_request` trigger event and the `pull_request_target` event.
Because of differences in the context and permissions granted to the PR on these different triggers, using `pull_request_target` together with a workflow containing `actions/checkout` and anything that might execute arbitrary code leaves us open to manipulation and theft of environment variables.
The only place we use this event is here, and I've put some links to security blog posts explaining the issue further.
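As a rough illustration of the cautious end of the spectrum, here is a hypothetical `pull_request_target` workflow that neither checks out code nor runs anything from the PR branch. The workflow and job names are placeholders, and this is a sketch of the idea rather than the workflow used in this repo.

```yaml
name: pr-target-sketch            # hypothetical workflow name
on:
  pull_request_target:            # runs in the context of the base repository
jobs:
  greet:
    runs-on: ubuntu-latest
    permissions:
      contents: read              # keep the token's permissions as narrow as possible
    steps:
      # Deliberately no actions/checkout, and nothing from the PR is executed.
      - run: echo "PR number ${{ github.event.pull_request.number }} was opened or updated"
```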
The take-home message is to be very careful with this event and never combine it with `actions/checkout`, and you'll probably be ok.
Flow management
Each workflow runs independently and in parallel.
Within that, by default, each job runs independently and in parallel.
Each job exists on its own virtual machine (VM), so anything you do is ephemeral.
If you want to have a lasting impact, you will need to commit stuff from the workflow, e.g. when we add the changes from black formatting.
If you want to pass data from one job to another, this can be accomplished with artefacts (docs), although we don't use that here right now.
(There's a "gotcha" hiding in these commit-based workflows, where the token used for the commit needs to differ from the workflow's own GitHub token(?); to this end we created pyiron-runner to do this stuff. I didn't solve this problem myself though, so I'm light on detail here.)
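Since we don't use artefacts in this repo, here is a generic sketch of the job-to-job hand-off using the standard `actions/upload-artifact` and `actions/download-artifact` actions. The job names, artefact name, and file name are all made up for the example.

```yaml
name: artifact-sketch             # hypothetical workflow name
on: push
jobs:
  produce:
    runs-on: ubuntu-latest
    steps:
      - run: echo "hello from produce" > result.txt
      - uses: actions/upload-artifact@v4
        with:
          name: result            # artefact name, chosen for this example
          path: result.txt
  consume:
    needs: produce                # wait for the upload before trying to download
    runs-on: ubuntu-latest        # a fresh VM, so the file has to be fetched again
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: result
      - run: cat result.txt
```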
You should be aware that this basic behaviour of independence and parallelism can be modified!
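One way such a modification can look, as a hypothetical sketch (the job names, step `id`, bot identity, and commit message are placeholders, not this repo's actual workflow): a step is allowed to fail with `continue-on-error`, a later step is gated on that step's `outcome`, and a second job waits on the first via `needs`.

```yaml
name: flow-control-sketch         # hypothetical workflow name
on: push
jobs:
  update-env:
    runs-on: ubuntu-latest
    permissions:
      contents: write             # assumption: needed for the default token to push
    steps:
      - uses: actions/checkout@v4
      - name: commit
        id: commit-step           # the id makes steps.commit-step.* addressable
        continue-on-error: true   # an empty commit fails here without failing the job
        run: |
          git config user.name "example-bot"
          git config user.email "example-bot@example.com"
          git commit -am "Update environment"
      - name: push
        # outcome is the result before continue-on-error is applied;
        # conclusion would read 'success' even when the commit was empty.
        if: steps.commit-step.outcome == 'success'
        run: git push
  tests:
    needs: update-env             # do not start until update-env has finished
    runs-on: ubuntu-latest
    steps:
      - run: echo "Runs only after update-env is done"
```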
There are a variety of options for this, but a good example is our `commit-readthedocs-env` job.
Here we use the `continue-on-error: true` flag to allow the commit step to throw an error (which it does when the commit is blank, i.e. when no update to the environment is needed) without crashing the job.
In the next step ("push"), we use the `steps.commit-docs-env.outcome == 'success'` clause to skip the step unless the previous "commit" step actually did something.
This depends on a few pieces:

- The `if:` clause itself, which should be self-explanatory, but it's nice to know it exists!
- The `steps` context, which is available because we made sure to assign the previous step an `id`.
- Checking that the step's `outcome` was a success.
- Distinguishing `outcome` (status before `continue-on-error` is applied, 'success' only when we actually got a commit) from `conclusion` (status after `continue-on-error` is applied, always a 'success' because we want the job to succeed if no changes were needed!).

Finally, we add `needs: commit-readthedocs-env` to all the other jobs in this workflow so that none of them start until this one is finished -- that's for a bit of computational efficiency, since if we do need to update our readthedocs env, the bot will do it with a new commit, which re-triggers the whole CI sequence of events.
So we can prevent those other jobs from needlessly burning CPU time by not spinning them up until we're sure the bot isn't going to shove in an extra commit.
Workflows, actions, and resources
Actions are allowed to take secrets as inputs, but here I've built all the actions in a secret-free way.
Secrets are only required in our reusable workflows, and the calling workflow can just pass them in with `secrets: inherit`.
Here's the final thing and it's a bit sneaky.
We have a bunch of tools in this repo like `condamerge.py`, which we want to reference in our reusable workflows.
The catch is that I could not find any way to directly access the called workflow's repository from a caller/called workflow.
However, all is not lost -- it is possible to reference the host directory of an action using `$GITHUB_ACTION_PATH`, which gives a path to the called action.
Since this is only available within the context of an action, whenever you want to do something that references resources in this repository you need to extract it as an action.
From the log files it looks to me like the called workflow repositories are getting cloned in the VM environment at runtime, so maybe there is another way around this, but action-ization is the one I found that works.
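As a sketch of that action-ization trick, a composite action can run a script that sits next to its own `action.yaml`. The action name and `some_tool.py` below are placeholders (not the real location of `condamerge.py`), and the example assumes a Python interpreter is already available on the runner.

```yaml
# action.yaml in a hypothetical action folder, with some_tool.py stored beside it
name: run-bundled-tool
description: Sketch of a composite action that runs a script shipped alongside it
runs:
  using: composite
  steps:
    - name: Run the script that lives next to this action.yaml
      shell: bash                 # composite run steps must name their shell
      # GITHUB_ACTION_PATH points at this action's folder on the runner.
      run: python "$GITHUB_ACTION_PATH/some_tool.py"
```

A reusable workflow can then call this action with a `uses:` step and get at the bundled script that way.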