-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CI OIDC access to new bors branches #355
Add CI OIDC access to new bors branches #355
Conversation
"${aws_s3_bucket.artifacts.arn}/rustc-builds-try", | ||
"${aws_s3_bucket.artifacts.arn}/rustc-builds-try/*", | ||
"${aws_s3_bucket.artifacts.arn}/rustc-builds-try-alt", | ||
"${aws_s3_bucket.artifacts.arn}/rustc-builds-try-alt/*", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an improvement over the current situation, but it'd be nice to scope these by the commit hash (${aws_s3_bucket.artifacts.arn}/rustc-builds-try-alt/${sha}/*
). As far as I'm aware it might not be possible to do that in IAM directly (the commit hash is in the GitHub Actions OIDC claim, but it might not be possible to access it from IAM), but at that point I'd prefer having a lambda function validating the JWT and issuing a properly scoped policy.
Also, the *
-less variant shouldn't be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For more context on why this would be helpful: right now a job can override artifacts from other jobs, so in theory a try build (triggered by a trusted reviewer) could override the artifacts of a previously merged commit / try build, while if we scope things by commit hash a build would only be able to write in their scratch space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I looked into this a bit and it seems like the only way to do this is custom code, we can expose the SHA into the sub claim with a custom claim but I don't think we can parse it out from there.
The Lambda function sounds plausible, we do somehow similar for sync-team kickoff... it would mean custom code in this relatively sensitive area though. I guess the risk would mostly be that we just don't get it's benefits though - should be easy to avoid further escalation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My inclination I think is to not block on the further scope reduction at this time, I'm happy to work on it but I want to get new bors treating unblocked quickly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I looked into this a bit and it seems like the only way to do this is custom code, we can expose the SHA into the sub claim with a custom claim but I don't think we can parse it out from there.
Yeah, I remember a while back reading somewhere that it was possible to pass arbitrary custom claims into IAM policies if they had specific keys in the JWT token, but I can't seem to find that piece of docs anymore. It'd need work on the GitHub side to change those claims anyway, so it wouldn't be something we could rely on in a timely manner.
The Lambda function sounds plausible, we do somehow similar for sync-team kickoff... it would mean custom code in this relatively sensitive area though. I guess the risk would mostly be that we just don't get it's benefits though - should be easy to avoid further escalation.
Indeed. A way to limit the damage would be to have the function call be authenticated with OIDC too, so the most damage would be a job overriding another job's artifacts (i.e. the status quo) compared to third parties obtaining tokens, but still, it'd be sensitive code.
My inclination I think is to not block on the further scope reduction at this time, I'm happy to work on it but I want to get new bors treating unblocked quickly.
I think it's fine to land this now to unblock the new bors (it uploads to a different directory than current bors anyway, so no risk of overriding), but it'd be nice to have it in place in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I remember a while back reading somewhere that it was possible to pass arbitrary custom claims into IAM policies if they had specific keys in the JWT token, but I can't seem to find that piece of docs anymore. It'd need work on the GitHub side to change those claims anyway, so it wouldn't be something we could rely on in a timely manner.
Yeah, I'm not quite sure if the GitHub side has exactly the claim we'd need. https://token.actions.githubusercontent.com/.well-known/openid-configuration has a listing but I don't know if the head_ref for example is useful or just the branch name.
https://docs.github.com/en/enterprise-cloud@latest/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect#understanding-the-oidc-token lists a "sha": "example-sha"
but it's not documented and not clear whether example-sha would be a full commit hash or what.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think workflow_sha
would be the claim we'd be interested in, as it'd be the hash of the commit containing the workflow we're executing.
Also, I found the AWS feature I was thinking of: Session Tags. If the proper tag was passed in the JWT, then we could have ${iam:ResourceTag/workflow_sha}
in the IAM policy I think. But that's not applicable for us because GHA doesn't put the claims under the https://aws.amazon.com/tags
key.
Edit: the sha
in the docs might be legacy, before they split workflow_sha
and job_workflow_sha
due to the introduction of reusable workflows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually @Mark-Simulacrum found this out looking at that issue https://awsteele.com/blog/2023/10/25/aws-role-session-tags-for-github-actions.html.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! I did see some references to Cognito, we should try to figure out how complicated/pricey it'll be.
@pietroalbini I've updated to use Cognito as suggested. Some concerns from my side:
I think the benefits of filtering to just the one SHA location probably win out overall but it's a close call to me. It's possible that we could achieve similar confidence with e.g. a custom sub claim (possible with GHA, AFAICT) that includes the workflow sha, put that in a separate repository, and enforce there that the real repository checkout SHA is passed as an session name (for example). It seems like that ought to work, and have less chance of the failure mode being "anyone on github has write access"... |
Yeah it's not awesome, but from a cursory glance I don't see a solution to this.
I don't think it's that different from directly interacting with GitHub's OIDC: you still need the condition filtering on the repository either way, whether it's
What about the previous proposal of
I'm not sure I understand, could you elaborate a bit more? |
I can make that the path, yeah, as our way of testing.
We'd do the following:
Basically the idea is similar to the previously proposed lambda function transforming claims in the oidc token but puts that on the GitHub side, which may be easier to maintain and is more reusable if we want similar things elsewhere (e.g., other repos pushing artifacts), it may even work well for PR CI having this access if we wanted that. |
I think there is a difference - we are relying more on the tag being only settable by a trusted party, whereas the direct oidc use isn't expanding trust scope to Cognito. In particular the issuance to anyone on GitHub of Cognito IDs makes me somewhat nervous. Basically the worry in my mind is that GitHub "knows" the :sub key is critical but may be less cautious about other metadata (intentionally or not), and we aren't currently checking the sub key anywhere. Maybe we should do that and that covers most of our bases. Regardless, the PR as-is very much expands our threat model to include Cognito, which didn't used to be the case. |
That's a good point, and I quite like the solution if we can make it work. That way, compared to the lambda we shouldn't need to validate GitHub's JWTs with our own code. Maybe let's bring this up to the infra meeting for a decision on the approach.
Sure. If we go with this approach we could check the sub key (mapping it to
Definitely. |
My main thought around this is whether we can nail a solution that would prevent builds from overwriting other artifacts. The move away from bearer secrets is still obviously great, but I'd like to make sure moving to this does give us the ability to lock modifications down to a specific build. Are builds modified at any point post-builder assuming and using this role? Is there a complexity tradeoff on doing a per-build bucket to side-step this issue entirely? |
Builds are supposed to be immutable after being uploaded.
Regarding the complexity of one bucket per build, we upload roughly 10 "merge" builds a day plus a variable number of "try" builds a day, so we would quickly create a large amount of buckets and go over the bucket limit. We could ask to increase it, but I don't think it'd be worth it, because I don't think separate buckets would solve the problem. The problem we're trying to solve is getting the unique identifier of the build (the commit hash) out of the JWT GitHub provides and into the AWS IAM policy. Until we address this we cannot prevent a build from overriding other builds, and once we address it there isn't much difference than scoping on The three options we identified are:
|
This grants access under a new directory prefix (rustc-builds-try) as a temporary measure to avoid mistakes overwriting any actual artifacts. It might be a good idea in any case to scope try builds into a different bucket or place than real builds.
This uses Cognito as a dispatch authority to convert OIDC claims to IAM condition values, and then fitlers the resulting role to only writing into the passed sha. See https://awsteele.com/blog/2023/10/25/aws-role-session-tags-for-github-actions.html for some related context.
See write up here: rust-lang#355 (comment). We decided not to pursue this for the time being and revisit at a later point. This reverts commit 2f7aefc.
0906539
to
ac858b7
Compare
As discussed with infra in a recent meeting, I've reverted the Cognito stuff here (per the concerns listed above). Rather than implementing one of the other options we're going to move ahead with the simpler approach of just limiting access to rustc-builds-try/* for now (perhaps relaxing that once we're happy with bors working as expected, perhaps using the reusable workflow approach discussed above, perhaps something else). I've gone ahead and deployed the changes in this PR so merging it. |
Closes rust-lang/infra-team#85
I'd like to touch base with Foundation security folks on the policies here, though as currently proposed they should be safe (different s3 bucket prefix than production).