Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gate MSL infinite loop optimization workaround to a flag enabled by default #6520

Open
wants to merge 7 commits into
base: trunk
Choose a base branch
from

Conversation

rudderbucky
Copy link

Connections
Fixes #6518

Description
The infinite loop optimization workaround in commit 3fda684 introdues performance issues with loops. This PR gates this change to wasm as that is the main platform where this fixes a security issue.

Testing
Tested the same way as in #6518. Performance is back to normal levels.

Copy link
Member

@jimblandy jimblandy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to fix this as described here: #6518 (comment)

@rudderbucky
Copy link
Author

@jimblandy copying your comment here:

I think the fix is to have Naga give that macro an empty definition unless some sort of bounds checks are enabled. Then native apps should use create_shader_module_unchecked to avoid the performance impact.

What do you mean by "some sort of bounds checks" here? Add a bool to msl::Writer?

@rudderbucky
Copy link
Author

Also, I am no wgpu expert, but AIUI native apps needing to use create_shader_module_unchecked would be a "breaking change", unless this was already meant to be the case. I don't see it used anywhere in bevy. Just confirming this is intentional

This reverts commit 7561726.
@DJMcNab
Copy link
Contributor

DJMcNab commented Nov 12, 2024

A model for this is something like #5508, where this is a new configuration option.
In early commits, I made this unsafe to set to not the default.

https://github.com/DJMcNab/wgpu/blob/30884dd732ce10bb97e3eef9c16670752d8a6042/wgpu-types/src/lib.rs#L6151-L6189

It's still possible to execute UB on the GPU with or without this feature, since this doesn't stop you running the infinite loop. So I'd push back on making this unsafe to configure, but I think a new "unsafe" field on PipelineConfigurationOptions would be the right way to implement this.

@rudderbucky
Copy link
Author

@DJMcNab @jimblandy could you give me a rationale on why we should be presenting this as an option at all to users instead of just gating it to WASM as I initially considered? Since the bounds checks are necessary if and only if we are on WASM (AIUI), I don't see a need to surface this, and even if the concern is this is a platform-specific tweak, I could internally represent the infinite loop bounds checking platform as a set that merely includes WASM, instead of hardcoding WASM as the only infinite loop bounds checked platform.

@jimblandy
Copy link
Member

@rudderbucky

Since the bounds checks are necessary if and only if we are on WASM (AIUI), I don't see a need to surface this

Actually, that part of the PR was also not correct, I just didn't mention it because there were bigger changes needed. Any native application using wgpu's Metal backend needs the LOOP_IS_REACHABLE kludge (or an equivalent), unless all their shaders are trusted code. Running untrusted shaders (as in a shader playground, for example) without the kludge will introduce a security hole.

The idea is that applications will indeed need to change how they use wgpu if they want to avoid the overhead. Applications that trust their shaders (most native apps) should need to positively indicate this to wgpu, so that the safe behavior is the default.

@jimblandy
Copy link
Member

jimblandy commented Nov 12, 2024

I should mention that the LOOP_IS_REACHABLE kludge was found to be insufficient. The issue is (as @DJMcNab points out) that, although the kludge does prevent the compiler from inferring value bounds based on the UB, you do still execute the UB, so it's just a matter of cleverness to find some way to exploit it. I think I saw something where you could do bad stuff inside the loop.

I think the proper fix is a similar branch on a volatile, but controlling a break out of the loop. This prevents the compiler from concluding that the loop is infinite in the first place.

edit: to be clear - the problems I'm discussing here don't need to be fixed by this PR, they're a separate issue.

@rudderbucky
Copy link
Author

rudderbucky commented Nov 13, 2024

Okay, @jimblandy if I am understanding you correctly the change is quite simple and we only have to check on context.expression.policies (PTAL at the recent commit). However I'm not certain if we should add an entirely new field to naga/src/proc/index.rs just for this Metal-specific issue, so I chose to just re-use index, as that seems to be the primary vector for abusing this; I could be totally wrong, just need your thoughts

@DJMcNab
Copy link
Contributor

DJMcNab commented Nov 13, 2024

As a wgpu user, I'd prefer to configure this separately to bounds checking. Validating that I don't have any infinite loops is something which I have to do anyway (e.g. on other platforms to avoid timeouts), so making that assertion is something I'd be happy to do. However, I'd quite like to keep bounds checking safety.
(I'm not a huge fan of making the assertion unsafe, as it's quite hard to introduce "accidental" memory safety issues from getting it wrong)

@rudderbucky
Copy link
Author

rudderbucky commented Nov 13, 2024

@DJMcNab I don't think this validates you have no infinite loops as much as it just stops them from causing some vulnerability issues (on Metal only), I have no issues with such a feature (or gating this behind such a new feature) but that's probably best done by a separate PR.

@jimblandy
Copy link
Member

Validating that I don't have any infinite loops is something which I have to do anyway (e.g. on other platforms to avoid timeouts), so making that assertion is something I'd be happy to do. However, I'd quite like to keep bounds checking safety.

Okay - if the combination of bounds checking and no loop marking is valuable, we can think about how to make that possible. I think that sounds like an unusual combination, though.

@jimblandy
Copy link
Member

I don't think this validates you have no infinite loops as much as it just stops them from causing some vulnerability issues (on Metal only)

What Daniel meant is, he personally ensures that his shaders have no infinite loops, so he doesn't need the overhead of the kludge, but he would still like the bounds checking.

@DJMcNab
Copy link
Contributor

DJMcNab commented Nov 13, 2024

I think that sounds like an unusual combination, though.

Again, it's a question of relative ease of validation. It's relatively feasible to validate that a loop is well formed, but it's hard to do that validation for a memory access. And a poorly-formed loop will make itself apparent (on non-metal platforms where it won't be optimised out) quite quickly (linebender/xilem#613), whereas a missed bounds check would be likely to cause heisenbugs or worse. We access many more arrays than we use (non-for) loops, so the relative effort required to check loops is lower. We're not dealing with untrusted shaders.

Ideally, we'd be able to configure them both orthogonally per-shader, rather than getting a grab-bag of properties through using create_shader_module_unchecked.

@cwfitzgerald
Copy link
Member

Thought: unchecked should have a set of flags of checks to disable.

@rudderbucky rudderbucky changed the title Gate infinite loop optimization workaround to WASM builds Gate MSL infinite loop optimization workaround to a flag enabled by default. Nov 14, 2024
@rudderbucky rudderbucky changed the title Gate MSL infinite loop optimization workaround to a flag enabled by default. Gate MSL infinite loop optimization workaround to a flag enabled by default Nov 14, 2024
@rudderbucky
Copy link
Author

rudderbucky commented Nov 14, 2024

I'm not hearing any arguments against @DJMcNab's idea of adding a separate field to PipelineCompilationOptions. I'm personally leaning towards because I think it'd be less controversial of a breaking change downstream. I'll add it into this PR.

@rudderbucky rudderbucky requested review from crowlKats and a team as code owners November 14, 2024 02:16
@rudderbucky
Copy link
Author

rudderbucky commented Nov 14, 2024

To be clear I would gladly revert the recent commit if there's pushback. It's also worth taking a step back and asking the following questions

  • Do we want to couple this with BoundsCheckPolicies? If so, do we couple it with the index policy, another policy, or create a new policy entirely? (no yeas, no nays)
  • If not, should we surface this at PipelineCompilationOptions? (2 yeas, no nays)

Would appreciate a direct answer to these from a wgpu expert as it would make it clear how to implement this 🙂

@cwfitzgerald
Copy link
Member

Alright, lets take a step back and look at the facts:

  • We ideally would like disabling this thing to be unsafe for soundness reasons
  • A non-breaking change would be neat.
  • We already have this unchecked unsafe entry point.

My vibe is that the entry point should be unsafe, not just setting the boolean as that feels like a bit of a hacky use of unsafe. Which would suggest putting the flags on the create_shader_module_unchecked, potentially alongside the other bounds check flags

@rudderbucky
Copy link
Author

rudderbucky commented Nov 14, 2024

A non-breaking change would be neat.

Agreed, though I will say the introduction of the kludge was a breaking change for me, as now (bevy's) performance is impacted unless we start using create_shader_module_unchecked in our render pipeline. I suppose that is (retroactively?) intentional though and having a non-breaking lever is still useful.

We ideally would like disabling this thing to be unsafe for soundness reasons

We already have this unchecked unsafe entry point.

Fair, I think this counts as a nay on PipelineCompilationOptions. Unless you (or another maintainer) want to put your foot down on that option entirely. Either way would still work for me as long as this PR gets through and we can disable the kludge 🙂

@rudderbucky
Copy link
Author

@cwfitzgerald @jimblandy I see there's been a bunch of related development in other PRs... is this something you are still interested in getting merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Metal/MSL] Workaround for stopping infinite loops to be optimized out causes performance regression
5 participants