-
Notifications
You must be signed in to change notification settings - Fork 291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LLVM failed to use the knowledge from a never-overflow assumption #509
Comments
Could you further explain why alive2 proves the correctness of the transformation, but we cannot transform it in LLVM? |
It is just not implemented. But I am not willing to do this in LLVM. We should fix it in |
An example from |
Looks like the same as llvm/llvm-project#80637. As I noted there, this technically loses information, but I guess in practice it's a lot better to replace with @dtcxzyw Do you have some example where the replacement enables follow-up optimization in practice? As far as I can tell your examples will be lowered to the same asm after assumes are dropped in CGP. |
@nikic See the IR diff in https://github.com/dtcxzyw/llvm-opt-benchmark/pull/289/files: |
As @nikic said, we want to reuse Lines 1671 to 1674 in 274c7bb
Lines 1745 to 1748 in 274c7bb
|
So as argued above by others, this is not an application-specific pattern. It is a pretty natural pattern for Rust programs: one has a checked operation that can fail in a safe way, and then one turns that into the unchecked variant (with UB on failure) by doing I'm not sure how else you are suggesting that code should be written, but it'd clearly be bad if |
Even aside from the pattern overall being visible in Rust code beyond hashbrown, most of the code in hashbrown is transitively included in every single Rust program that uses a hashmap, so "application-specific" is an amusing way to describe "probably 20% of all Rust programs ever? higher, even?" |
@dtcxzyw Most of the callers that eventually arrived in |
Yeah.
I believe there is a better way to fix this issue by adding some MIR optimizations or modifying hashbrown itself. It is tricky to handle it in LLVM :( |
I think you're missing the point here. When it comes to optimization issues, in Rust we are in the very fortunate position that the issue can be addressed either by adding new optimizations, or by changing library code to be more amenable to optimization. The fact that we can change the hashbrown implementation and have ~all Rust code benefit is the point, not an analysis mistake.
(This is just for context -- as said above, I'm not convinced that modifying hashbrown is indeed the best solution here.) |
Are there any small tweaks that we could do to what we emit to make it easier? For example, I see that the current define { i64, i64 } @demo_checked_mul(i64 noundef %a, i64 noundef %b) unnamed_addr #0 !dbg !7 {
start:
%0 = tail call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %a, i64 %b), !dbg !12
%_6.1 = extractvalue { i64, i1 } %0, 1, !dbg !12
%not._6.1 = xor i1 %_6.1, true, !dbg !25
%. = zext i1 %not._6.1 to i64, !dbg !25
%_6.0 = extractvalue { i64, i1 } %0, 0, !dbg !12
%1 = insertvalue { i64, i64 } poison, i64 %., 0, !dbg !28
%2 = insertvalue { i64, i64 } %1, i64 %_6.0, 1, !dbg !28
ret { i64, i64 } %2, !dbg !28
} which definitely can't be further optimized because there's nothing telling LLVM that second part of the pair will be unused. Inspired by rust-lang/rust#124114 I started imagining something like %product_or_undef = select i1 %_6.1, i64 undef, i64 %_6.0
%1 = insertvalue { i64, i64 } poison, i64 %., 0
%2 = insertvalue { i64, i64 } %1, i64 %product_or_undef, 1 but it looks like that's not actually useful because it gets immediately optimized away again. |
One thing you might try on the hashbrown side: thanks to llvm/llvm-project#56563, it seemed like using wider types for the Rather than doing all this math as checked in Lines 261 to 263 in 274c7bb
maybe you could try writing it in u128 where I think it can't overflow? With the later Alternatively, maybe there'd be a way to shift by the log of the buckets, since it said it's always a power of two, instead of multiplying? It looks like the function doesn't do anything to tell LLVM that it's actually a power-of-two, so it probably has no way of being smart about understanding I wonder if making (Aside: I should push on https://doc.rust-lang.org/std/primitive.usize.html#method.carrying_mul again...) |
…=<try> Also get `add nuw` from `uN::checked_add` When I was doing this for `checked_{sub,shl,shr}`, it was mentioned rust-lang#124114 (comment) that it'd be worth trying for `checked_add` too. It makes a particularly-big difference for `x.checked_add(C)`, as doing this means that LLVM removes the intrinsic and does it as a normal `x <= MAX - C` instead. cc `@DianQK` who had commented about `checked_add` related to rust-lang/hashbrown#509 before cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself
…=<try> Also get `add nuw` from `uN::checked_add` When I was doing this for `checked_{sub,shl,shr}`, it was mentioned rust-lang#124114 (comment) that it'd be worth trying for `checked_add` too. It makes a particularly-big difference for `x.checked_add(C)`, as doing this means that LLVM removes the intrinsic and does it as a normal `x <= MAX - C` instead. cc `@DianQK` who had commented about `checked_add` related to rust-lang/hashbrown#509 before cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself
…=Amanieu Also get `add nuw` from `uN::checked_add` When I was doing this for `checked_{sub,shl,shr}`, it was mentioned rust-lang#124114 (comment) that it'd be worth trying for `checked_add` too. It makes a particularly-big difference for `x.checked_add(C)`, as doing this means that LLVM removes the intrinsic and does it as a normal `x <= MAX - C` instead. cc `@DianQK` who had commented about `checked_add` related to rust-lang/hashbrown#509 before cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself
…=Amanieu Also get `add nuw` from `uN::checked_add` When I was doing this for `checked_{sub,shl,shr}`, it was mentioned rust-lang#124114 (comment) that it'd be worth trying for `checked_add` too. It makes a particularly-big difference for `x.checked_add(C)`, as doing this means that LLVM removes the intrinsic and does it as a normal `x <= MAX - C` instead. cc `@DianQK` who had commented about `checked_add` related to rust-lang/hashbrown#509 before cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself
Also get `add nuw` from `uN::checked_add` When I was doing this for `checked_{sub,shl,shr}`, it was mentioned rust-lang/rust#124114 (comment) that it'd be worth trying for `checked_add` too. It makes a particularly-big difference for `x.checked_add(C)`, as doing this means that LLVM removes the intrinsic and does it as a normal `x <= MAX - C` instead. cc `@DianQK` who had commented about `checked_add` related to rust-lang/hashbrown#509 before cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself
Also get `add nuw` from `uN::checked_add` When I was doing this for `checked_{sub,shl,shr}`, it was mentioned rust-lang/rust#124114 (comment) that it'd be worth trying for `checked_add` too. It makes a particularly-big difference for `x.checked_add(C)`, as doing this means that LLVM removes the intrinsic and does it as a normal `x <= MAX - C` instead. cc `@DianQK` who had commented about `checked_add` related to rust-lang/hashbrown#509 before cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself
Also get `add nuw` from `uN::checked_add` When I was doing this for `checked_{sub,shl,shr}`, it was mentioned rust-lang/rust#124114 (comment) that it'd be worth trying for `checked_add` too. It makes a particularly-big difference for `x.checked_add(C)`, as doing this means that LLVM removes the intrinsic and does it as a normal `x <= MAX - C` instead. cc `@DianQK` who had commented about `checked_add` related to rust-lang/hashbrown#509 before cc llvm/llvm-project#80637 for how LLVM is unlikely to do this itself
Hi all, I am an LLVM developer working on the middle-end optimizations.
Recently I found an anti-pattern in some Rust applications from my benchmark. All of them come from
hashbrown::raw::RawTableInner::free_buckets
:hashbrown/src/raw/mod.rs
Lines 3290 to 3294 in 274c7bb
LLVM IR:
Alive2: https://alive2.llvm.org/ce/z/YBRjxc
Currently LLVM cannot use the knowledge from a never-overflow assumption. I have a draft patch for the fold and it enables more optimizations in downstream users of
hashbrown
. But I am not willing to do an application-specific optimization in LLVM. So I file this issue to see whether we can fix it inhashbrown
.cc @nikic @DianQK
The text was updated successfully, but these errors were encountered: