Gate MSL infinite loop optimization workaround to a flag enabled by default #6520

rudderbucky · 2024-11-11T16:19:16Z

Connections
Fixes #6518

Description
The infinite loop optimization workaround in commit 3fda684 introduces performance issues with loops. This PR gates this change to WASM, as that is the main platform where it fixes a security issue.

Testing
Tested the same way as in #6518. Performance is back to normal levels.

jimblandy

I'd like to fix this as described here: #6518 (comment)

rudderbucky · 2024-11-12T02:12:23Z

@jimblandy copying your comment here:

I think the fix is to have Naga give that macro an empty definition unless some sort of bounds checks are enabled. Then native apps should use create_shader_module_unchecked to avoid the performance impact.

What do you mean by "some sort of bounds checks" here? Add a bool to msl::Writer?
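
For illustration, the "empty definition" idea might look roughly like the sketch below. This is a minimal Rust sketch under stated assumptions, not Naga's actual code: `loop_macro_definition` and the guard text are hypothetical stand-ins, and only the macro name (LOOP_IS_REACHABLE, as it is called later in this thread) comes from the discussion.

```rust
/// Hypothetical helper, not part of Naga's API: builds the `#define` line
/// that would be written at the top of the generated MSL source.
fn loop_macro_definition(checks_enabled: bool) -> String {
    // Assumption: stand-in guard text. The real expansion lives in Naga's
    // MSL writer and may differ.
    const GUARD: &str = "if (volatile bool unpredictable = true; unpredictable)";
    if checks_enabled {
        // Checked shaders: every loop is wrapped in the guard.
        format!("#define LOOP_IS_REACHABLE {GUARD}")
    } else {
        // Trusted shaders: the macro expands to nothing, so loops are
        // emitted without the extra branch.
        String::from("#define LOOP_IS_REACHABLE")
    }
}
```

With this shape, the loop-emitting code would not need to change: it keeps writing the macro before each loop, and only the definition varies.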

rudderbucky · 2024-11-12T02:31:40Z

Also, I am no wgpu expert, but AIUI native apps needing to use create_shader_module_unchecked would be a "breaking change", unless this was already meant to be the case. I don't see it used anywhere in bevy. Just confirming this is intentional.

DJMcNab · 2024-11-12T08:37:59Z

A model for this is something like #5508, which added a new configuration option.
In early commits, I made it unsafe to set that option to a non-default value.

https://github.com/DJMcNab/wgpu/blob/30884dd732ce10bb97e3eef9c16670752d8a6042/wgpu-types/src/lib.rs#L6151-L6189

It's still possible to execute UB on the GPU with or without this feature, since this doesn't stop you running the infinite loop. So I'd push back on making this unsafe to configure, but I think a new "unsafe" field on PipelineCompilationOptions would be the right way to implement this.

rudderbucky · 2024-11-12T16:16:03Z

@DJMcNab @jimblandy could you give me a rationale for presenting this as an option to users at all, instead of just gating it to WASM as I initially considered? Since the bounds checks are necessary if and only if we are on WASM (AIUI), I don't see a need to surface this. Even if the concern is that this is a platform-specific tweak, I could internally represent the platforms that get the infinite loop check as a set that currently contains only WASM, instead of hardcoding WASM as the only such platform.

jimblandy · 2024-11-12T16:23:50Z

@rudderbucky

Since the bounds checks are necessary if and only if we are on WASM (AIUI), I don't see a need to surface this

Actually, that part of the PR was also not correct; I just didn't mention it because there were bigger changes needed. Any native application using wgpu's Metal backend needs the LOOP_IS_REACHABLE kludge (or an equivalent), unless all their shaders are trusted code. Running untrusted shaders (as in a shader playground, for example) without the kludge will introduce a security hole.

The idea is that applications will indeed need to change how they use wgpu if they want to avoid the overhead. Applications that trust their shaders (most native apps) should need to positively indicate this to wgpu, so that the safe behavior is the default.

jimblandy · 2024-11-12T16:26:43Z

I should mention that the LOOP_IS_REACHABLE kludge was found to be insufficient. The issue is (as @DJMcNab points out) that, although the kludge does prevent the compiler from inferring value bounds based on the UB, you do still execute the UB, so it's just a matter of cleverness to find some way to exploit it. I think I saw something where you could do bad stuff inside the loop.

I think the proper fix is a similar branch on a volatile, but controlling a break out of the loop. This prevents the compiler from concluding that the loop is infinite in the first place.

edit: to be clear - the problems I'm discussing here don't need to be fixed by this PR, they're a separate issue.
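
To make the shape of that proposal concrete, here is a minimal sketch in Rust of a loop whose exit is controlled by a volatile read. It only mirrors the structure being discussed: the function name `guarded_loop` and its parameters are hypothetical, the real change would be in the MSL that Naga generates, and Rust itself does not treat infinite loops as UB the way the Metal compiler does.

```rust
use std::ptr::read_volatile;

/// Hypothetical illustration: runs `body` until it returns `false`, with an
/// additional exit that depends on a volatile read. Because the compiler must
/// assume the volatile value can change, it cannot conclude that the loop
/// never terminates. Returns the number of completed iterations.
fn guarded_loop(keep_going: &bool, mut body: impl FnMut() -> bool) -> u32 {
    let mut iterations = 0;
    loop {
        // SAFETY: `keep_going` is a valid, aligned reference for the
        // duration of this call.
        let flag = unsafe { read_volatile(keep_going as *const bool) };
        if !flag {
            // The volatile-controlled break out of the loop.
            break;
        }
        // The loop's own termination condition.
        if !body() {
            break;
        }
        iterations += 1;
    }
    iterations
}
```

In the earlier kludge the volatile branch sits around the loop; here it sits inside the loop and guards a break, which is the difference described above.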

rudderbucky · 2024-11-13T02:39:30Z

Okay, @jimblandy, if I am understanding you correctly, the change is quite simple and we only have to check context.expression.policies (PTAL at the recent commit). However, I'm not certain we should add an entirely new field to naga/src/proc/index.rs just for this Metal-specific issue, so I chose to re-use index, as that seems to be the primary vector for abusing this. I could be totally wrong; I just need your thoughts.

DJMcNab · 2024-11-13T08:53:48Z

As a wgpu user, I'd prefer to configure this separately to bounds checking. Validating that I don't have any infinite loops is something which I have to do anyway (e.g. on other platforms to avoid timeouts), so making that assertion is something I'd be happy to do. However, I'd quite like to keep bounds checking safety.
(I'm not a huge fan of making the assertion unsafe, as it's quite hard to introduce "accidental" memory safety issues from getting it wrong)

rudderbucky · 2024-11-13T16:24:07Z

@DJMcNab I don't think this validates you have no infinite loops as much as it just stops them from causing some vulnerability issues (on Metal only). I have no issues with such a feature (or with gating this behind such a new feature), but that's probably best done in a separate PR.

jimblandy · 2024-11-13T20:50:02Z

Validating that I don't have any infinite loops is something which I have to do anyway (e.g. on other platforms to avoid timeouts), so making that assertion is something I'd be happy to do. However, I'd quite like to keep bounds checking safety.

Okay - if the combination of bounds checking and no loop marking is valuable, we can think about how to make that possible. I think that sounds like an unusual combination, though.

jimblandy · 2024-11-13T20:51:55Z

I don't think this validates you have no infinite loops as much as it just stops them from causing some vulnerability issues (on Metal only)

What Daniel meant is, he personally ensures that his shaders have no infinite loops, so he doesn't need the overhead of the kludge, but he would still like the bounds checking.

DJMcNab · 2024-11-13T21:18:24Z

I think that sounds like an unusual combination, though.

Again, it's a question of relative ease of validation. It's relatively feasible to validate that a loop is well formed, but it's hard to do that validation for a memory access. And a poorly-formed loop will make itself apparent (on non-Metal platforms where it won't be optimised out) quite quickly (linebender/xilem#613), whereas a missed bounds check would be likely to cause heisenbugs or worse. We access many more arrays than we use (non-for) loops, so the relative effort required to check loops is lower. We're not dealing with untrusted shaders.

Ideally, we'd be able to configure them both orthogonally per-shader, rather than getting a grab-bag of properties through using create_shader_module_unchecked.

cwfitzgerald · 2024-11-13T21:44:34Z

Thought: unchecked should have a set of flags of checks to disable.
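
As a rough sketch of what such a set of flags could look like, assuming a plain struct rather than any existing wgpu type: the name `DisabledChecks`, its fields, and its methods below are all hypothetical and chosen only to illustrate configuring the two checks independently.

```rust
/// Hypothetical set of runtime checks a caller asks to turn off.
/// Each field is `true` when the corresponding check is disabled.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DisabledChecks {
    bounds_checks: bool,
    loop_guard: bool,
}

impl DisabledChecks {
    /// Nothing disabled: the safe default.
    const NONE: Self = Self { bounds_checks: false, loop_guard: false };
    /// Everything disabled: the fully unchecked behavior.
    const ALL: Self = Self { bounds_checks: true, loop_guard: true };

    /// Whether the shader backend should still emit bounds checks.
    fn emit_bounds_checks(self) -> bool {
        !self.bounds_checks
    }

    /// Whether the shader backend should still emit the loop workaround.
    fn emit_loop_guard(self) -> bool {
        !self.loop_guard
    }
}
```

The combination discussed above (keep bounds checks, drop the loop workaround) would then be written as `DisabledChecks { loop_guard: true, ..DisabledChecks::NONE }`.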

rudderbucky · 2024-11-14T01:49:19Z

I'm not hearing any arguments against @DJMcNab's idea of adding a separate field to PipelineCompilationOptions. I'm personally leaning towards it because I think it would be a less controversial breaking change downstream. I'll add it to this PR.

rudderbucky · 2024-11-14T02:20:56Z

To be clear, I would gladly revert the recent commit if there's pushback. It's also worth taking a step back and asking the following questions:

Do we want to couple this with BoundsCheckPolicies? If so, do we couple it with the index policy, another policy, or create a new policy entirely? (no yeas, no nays)
If not, should we surface this at PipelineCompilationOptions? (2 yeas, no nays)

Would appreciate a direct answer to these from a wgpu expert as it would make it clear how to implement this 🙂

cwfitzgerald · 2024-11-14T05:07:16Z

Alright, let's take a step back and look at the facts:

We ideally would like disabling this thing to be unsafe for soundness reasons
A non-breaking change would be neat.
We already have this unchecked unsafe entry point.

My vibe is that the entry point should be unsafe, not just setting the boolean, as that feels like a bit of a hacky use of unsafe. That would suggest putting the flags on create_shader_module_unchecked, potentially alongside the other bounds check flags.

rudderbucky · 2024-11-14T16:42:30Z

A non-breaking change would be neat.

Agreed, though I will say the introduction of the kludge was a breaking change for me, as now (bevy's) performance is impacted unless we start using create_shader_module_unchecked in our render pipeline. I suppose that is (retroactively?) intentional though and having a non-breaking lever is still useful.

We ideally would like disabling this thing to be unsafe for soundness reasons

We already have this unchecked unsafe entry point.

Fair, I think this counts as a nay on PipelineCompilationOptions. Unless you (or another maintainer) want to put your foot down on that option entirely. Either way would still work for me as long as this PR gets through and we can disable the kludge 🙂

rudderbucky · 2024-11-23T14:21:51Z

@cwfitzgerald @jimblandy I see there's been a bunch of related development in other PRs... is this something you are still interested in getting merged?

Update writer.rs

7561726

rudderbucky requested a review from a team as a code owner November 11, 2024 16:19

rudderbucky mentioned this pull request Nov 11, 2024

[Metal/MSL] Workaround for stopping infinite loops to be optimized out causes performance regression #6518

Open

jimblandy requested changes Nov 11, 2024

Revert "Update writer.rs"

2bea408

This reverts commit 7561726.

Only emit loop macro for checked index

ac60902

rudderbucky requested a review from jimblandy November 13, 2024 16:22

ErichDonGubler reviewed Nov 13, 2024

naga/src/proc/index.rs (outdated)

Update naga/src/proc/index.rs

ae33c5b

Co-authored-by: Erich Gubler

rudderbucky changed the title ~~Gate infinite loop optimization workaround to WASM builds~~ Gate MSL infinite loop optimization workaround to a flag enabled by default. Nov 14, 2024

rudderbucky changed the title ~~Gate MSL infinite loop optimization workaround to a flag enabled by default.~~ Gate MSL infinite loop optimization workaround to a flag enabled by default Nov 14, 2024

Surface loop checking at PipelineCompilationOptions

7bad686

rudderbucky requested review from crowlKats and a team as code owners November 14, 2024 02:16

Merge branch 'trunk' into fix-perf

e9d0610

Merge branch 'trunk' into fix-perf

481ce70

cwfitzgerald mentioned this pull request Nov 14, 2024

[naga msl-out] Avoid UB by making all loops bounded. #6545

Merged
