Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NFC][Analysis] Add more SCEV tests for ptr inductions #108210

Merged
merged 2 commits into from
Sep 12, 2024

Conversation

david-arm
Copy link
Contributor

I've added more tests to

Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll

to cover more cases of ptr inductions, in particular highlighting what seems to be a disparity between single exit and multiple exit loops.

I've added more tests to

Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll

to cover more cases of ptr inductions, in particular highlighting
what seems to be a disparity between single exit and multiple
exit loops.
@llvmbot
Copy link
Member

llvmbot commented Sep 11, 2024

@llvm/pr-subscribers-llvm-analysis

Author: David Sherwood (david-arm)

Changes

I've added more tests to

Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll

to cover more cases of ptr inductions, in particular highlighting what seems to be a disparity between single exit and multiple exit loops.


Full diff: https://github.com/llvm/llvm-project/pull/108210.diff

1 Files Affected:

  • (modified) llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll (+139)
diff --git a/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll b/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
index 413bd21554c98d..af75bb71c1013e 100644
--- a/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
+++ b/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
@@ -1595,6 +1595,145 @@ exit:
   ret i32 0
 }
 
+define i32 @ptr_induction_eq_1(ptr %a, ptr %b) {
+; CHECK-LABEL: 'ptr_induction_eq_1'
+; CHECK-NEXT:  Classifying expressions for: @ptr_induction_eq_1
+; CHECK-NEXT:    %ptr.iv = phi ptr [ %ptr.iv.next, %loop ], [ %a, %entry ]
+; CHECK-NEXT:    --> {%a,+,8}<nuw><%loop> U: full-set S: full-set Exits: ((8 * ((-8 + (-1 * (ptrtoint ptr %a to i64)) + (ptrtoint ptr %b to i64)) /u 8))<nuw> + %a) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    %ptr.iv.next = getelementptr inbounds i8, ptr %ptr.iv, i64 8
+; CHECK-NEXT:    --> {(8 + %a),+,8}<nuw><%loop> U: full-set S: full-set Exits: (8 + (8 * ((-8 + (-1 * (ptrtoint ptr %a to i64)) + (ptrtoint ptr %b to i64)) /u 8))<nuw> + %a) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:  Determining loop execution counts for: @ptr_induction_eq_1
+; CHECK-NEXT:  Loop %loop: backedge-taken count is ((-8 + (-1 * (ptrtoint ptr %a to i64)) + (ptrtoint ptr %b to i64)) /u 8)
+; CHECK-NEXT:  Loop %loop: constant max backedge-taken count is i64 2305843009213693951
+; CHECK-NEXT:  Loop %loop: symbolic max backedge-taken count is ((-8 + (-1 * (ptrtoint ptr %a to i64)) + (ptrtoint ptr %b to i64)) /u 8)
+; CHECK-NEXT:  Loop %loop: Trip multiple is 1
+;
+entry:
+  %cmp = icmp eq ptr %a, %b
+  br i1 %cmp, label %exit, label %loop
+
+loop:
+  %ptr.iv = phi ptr [ %ptr.iv.next, %loop ], [ %a, %entry ]
+  %ptr.iv.next = getelementptr inbounds i8, ptr %ptr.iv, i64 8
+  %exitcond = icmp eq ptr %ptr.iv.next, %b
+  br i1 %exitcond, label %exit, label %loop
+
+exit:
+  ret i32 0
+}
+
+define i32 @ptr_induction_eq_2(ptr %a, i64 %n) {
+; CHECK-LABEL: 'ptr_induction_eq_2'
+; CHECK-NEXT:  Classifying expressions for: @ptr_induction_eq_2
+; CHECK-NEXT:    %b = getelementptr inbounds ptr, ptr %a, i64 %n
+; CHECK-NEXT:    --> ((8 * %n)<nsw> + %a) U: full-set S: full-set
+; CHECK-NEXT:    %ptr.iv = phi ptr [ %ptr.iv.next, %loop ], [ %a, %entry ]
+; CHECK-NEXT:    --> {%a,+,8}<nuw><%loop> U: full-set S: full-set Exits: ((8 * ((-8 + (8 * %n)<nsw>) /u 8))<nuw> + %a) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    %ptr.iv.next = getelementptr inbounds i8, ptr %ptr.iv, i64 8
+; CHECK-NEXT:    --> {(8 + %a),+,8}<nuw><%loop> U: full-set S: full-set Exits: (8 + (8 * ((-8 + (8 * %n)<nsw>) /u 8))<nuw> + %a) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:  Determining loop execution counts for: @ptr_induction_eq_2
+; CHECK-NEXT:  Loop %loop: backedge-taken count is ((-8 + (8 * %n)<nsw>) /u 8)
+; CHECK-NEXT:  Loop %loop: constant max backedge-taken count is i64 2305843009213693951
+; CHECK-NEXT:  Loop %loop: symbolic max backedge-taken count is ((-8 + (8 * %n)<nsw>) /u 8)
+; CHECK-NEXT:  Loop %loop: Trip multiple is 1
+;
+entry:
+  %b = getelementptr inbounds ptr, ptr %a, i64 %n
+  %cmp = icmp eq ptr %a, %b
+  br i1 %cmp, label %exit, label %loop
+
+loop:
+  %ptr.iv = phi ptr [ %ptr.iv.next, %loop ], [ %a, %entry ]
+  %ptr.iv.next = getelementptr inbounds i8, ptr %ptr.iv, i64 8
+  %exitcond = icmp eq ptr %ptr.iv.next, %b
+  br i1 %exitcond, label %exit, label %loop
+
+exit:
+  ret i32 0
+}
+
+; TODO: It feels like we should be able to calculate the symbolic max
+; exit count for the loop.inc block here, in the same way as
+; ptr_induction_eq_1. The problem seems to be in howFarToZero when the
+; ControlsOnlyExit is set to false.
+define i32 @ptr_induction_early_exit_eq_1(ptr %a, ptr %b, ptr %c) {
+; CHECK-LABEL: 'ptr_induction_early_exit_eq_1'
+; CHECK-NEXT:  Classifying expressions for: @ptr_induction_early_exit_eq_1
+; CHECK-NEXT:    %ptr.iv = phi ptr [ %ptr.iv.next, %loop.inc ], [ %a, %entry ]
+; CHECK-NEXT:    --> {%a,+,8}<nuw><%loop> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    %ld1 = load ptr, ptr %ptr.iv, align 8
+; CHECK-NEXT:    --> %ld1 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Variant }
+; CHECK-NEXT:    %ptr.iv.next = getelementptr inbounds i8, ptr %ptr.iv, i64 8
+; CHECK-NEXT:    --> {(8 + %a),+,8}<nw><%loop> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:  Determining loop execution counts for: @ptr_induction_early_exit_eq_1
+; CHECK-NEXT:  Loop %loop: <multiple exits> Unpredictable backedge-taken count.
+; CHECK-NEXT:    exit count for loop: ***COULDNOTCOMPUTE***
+; CHECK-NEXT:    exit count for loop.inc: ***COULDNOTCOMPUTE***
+; CHECK-NEXT:  Loop %loop: Unpredictable constant max backedge-taken count.
+; CHECK-NEXT:  Loop %loop: Unpredictable symbolic max backedge-taken count.
+; CHECK-NEXT:    symbolic max exit count for loop: ***COULDNOTCOMPUTE***
+; CHECK-NEXT:    symbolic max exit count for loop.inc: ***COULDNOTCOMPUTE***
+;
+entry:
+  %cmp = icmp eq ptr %a, %b
+  br i1 %cmp, label %exit, label %loop
+
+loop:
+  %ptr.iv = phi ptr [ %ptr.iv.next, %loop.inc ], [ %a, %entry ]
+  %ld1 = load ptr, ptr %ptr.iv, align 8
+  %earlyexitcond = icmp eq ptr %ld1, %c
+  br i1 %earlyexitcond, label %exit, label %loop.inc
+
+loop.inc:
+  %ptr.iv.next = getelementptr inbounds i8, ptr %ptr.iv, i64 8
+  %exitcond = icmp eq ptr %ptr.iv.next, %b
+  br i1 %exitcond, label %exit, label %loop
+
+exit:
+  ret i32 0
+}
+
+define i32 @ptr_induction_early_exit_eq_2(ptr %a, i64 %n, ptr %c) {
+; CHECK-LABEL: 'ptr_induction_early_exit_eq_2'
+; CHECK-NEXT:  Classifying expressions for: @ptr_induction_early_exit_eq_2
+; CHECK-NEXT:    %b = getelementptr inbounds ptr, ptr %a, i64 %n
+; CHECK-NEXT:    --> ((8 * %n)<nsw> + %a) U: full-set S: full-set
+; CHECK-NEXT:    %ptr.iv = phi ptr [ %ptr.iv.next, %loop.inc ], [ %a, %entry ]
+; CHECK-NEXT:    --> {%a,+,8}<nuw><%loop> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    %ld1 = load ptr, ptr %ptr.iv, align 8
+; CHECK-NEXT:    --> %ld1 U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Variant }
+; CHECK-NEXT:    %ptr.iv.next = getelementptr inbounds i8, ptr %ptr.iv, i64 8
+; CHECK-NEXT:    --> {(8 + %a),+,8}<nw><%loop> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:  Determining loop execution counts for: @ptr_induction_early_exit_eq_2
+; CHECK-NEXT:  Loop %loop: <multiple exits> Unpredictable backedge-taken count.
+; CHECK-NEXT:    exit count for loop: ***COULDNOTCOMPUTE***
+; CHECK-NEXT:    exit count for loop.inc: ((-8 + (8 * %n)<nsw>) /u 8)
+; CHECK-NEXT:  Loop %loop: constant max backedge-taken count is i64 2305843009213693951
+; CHECK-NEXT:  Loop %loop: symbolic max backedge-taken count is ((-8 + (8 * %n)<nsw>) /u 8)
+; CHECK-NEXT:    symbolic max exit count for loop: ***COULDNOTCOMPUTE***
+; CHECK-NEXT:    symbolic max exit count for loop.inc: ((-8 + (8 * %n)<nsw>) /u 8)
+;
+entry:
+  %b = getelementptr inbounds ptr, ptr %a, i64 %n
+  %cmp = icmp eq ptr %a, %b
+  br i1 %cmp, label %exit, label %loop
+
+loop:
+  %ptr.iv = phi ptr [ %ptr.iv.next, %loop.inc ], [ %a, %entry ]
+  %ld1 = load ptr, ptr %ptr.iv, align 8
+  %earlyexitcond = icmp eq ptr %ld1, %c
+  br i1 %earlyexitcond, label %exit, label %loop.inc
+
+loop.inc:
+  %ptr.iv.next = getelementptr inbounds i8, ptr %ptr.iv, i64 8
+  %exitcond = icmp eq ptr %ptr.iv.next, %b
+  br i1 %exitcond, label %exit, label %loop
+
+exit:
+  ret i32 0
+}
+
+
 define void @gep_addrec_nw(ptr %a) {
 ; CHECK-LABEL: 'gep_addrec_nw'
 ; CHECK-NEXT:  Classifying expressions for: @gep_addrec_nw

@david-arm
Copy link
Contributor Author

david-arm commented Sep 12, 2024

Given this patch is only adding tests, there are no code changes and all checks have passed I'll just land this patch later today unless there are any objections.

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@david-arm david-arm merged commit 36adf8e into llvm:main Sep 12, 2024
5 of 7 checks passed
; TODO: It feels like we should be able to calculate the symbolic max
; exit count for the loop.inc block here, in the same way as
; ptr_induction_eq_1. The problem seems to be in howFarToZero when the
; ControlsOnlyExit is set to false.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ptr_induction_eq_1, we prove that %b-%a is divisible by 8, so the trip count is (b-a)/8. (If difference isn't divisible by 8, we eventually branch on poison.) Here, we can't prove that: as long as there's another exit, the loop can go on indefinitely (at least, until we read past the end of the allocation). So we can't do any useful math.

If you tell the compiler the pointers are aligned (for example, using an "align" attribute), we do the computation.

Alternatively, we could compute a count assuming the loop exits via a particular branch; see #83052 (review) .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this looks to be the same underlying issue as for #101372. Looking into why Clang doesn't generate align attributes and what would be needed to change that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in that example, unless I'm missing something, the early exit dominates the latch. I don't see how the early exit could make the symbolic max loop trip count greater than the backedge-taken count from the latch? I'm probably not explaining myself that well, but what I mean is surely the early exit can only reduce the trip count, not extend it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, consider the something like %b = N * 8 * %a + 1, so %exitcond is never true, so we never exit the loop through it.

The early exit then could exit at any later iteration (e.g. when ptr iv is (N + 2) * 8 * %a) , before %ptr.iv.next wraps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair enough, but then surely the loop in ptr_induction_eq_1 could also go on indefinitely and hence have an infinite symbolic max backedge-taken count? From what you're saying it sounds like there is a bug and ptr_induction_eq_1 should also return COULDNOTCOMPUTE?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ptr_induction_eq_1 is different as there's no early exit, so we must exit via the check of ptr.iv.next. %b = N * 8 * %a + 1 would trigger UB in ptr_induction_eq_1, as %ptr.iv.next` is guaranteed to wrap if the single exit isn't taken.

The early exit in ptr_induction_early_exit_eq_1 allows for scenarios where ptr.iv > %b while still not triggering UB because then we must exit via the early exit before wrap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. That's the bit I was missing. I think I understand this now. Thanks! So if the IR contained information about alignment that might help?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, the reason I'm asking about this is because I believe this to be a common pattern when you invoke the C++ routine std::find(vec.begin(), vec.end(), needle) where vec is of type std::vector. The end() value at least in libc++ is just a pointer, rather than being constructed in terms of vec.begin() + vec.size().

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The getelementptr operation does have inbounds on it - does that help in this case limit the trip count because it cannot wrap?

This helps us prove the loop is finite, but we can't produce a symbolic bound; at most, we could prove the max iteration count is 2^60 or something like that. Which is marginally helpful.


We could also generate a predicated backedge-taken count, which would be useful in contexts like vectorization. Granted, we don't usually vectorize loops with multiple exits anyway, but I guess that's changing a bit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a look at adding the predicate and it looks that also helps in some non-multi-exit cases: #108777

#101372 also contains some discussion about encoding alignment assumptions for certain libc++ data types/interfaces, as that would also help in other cases for which versioning is not applicable (e.g. removing bounds checks)

fhahn added a commit to fhahn/llvm-project that referenced this pull request Sep 15, 2024
This can help in cases where pointer alignment info is missing, e.g.
llvm#108210

The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.
fhahn added a commit to fhahn/llvm-project that referenced this pull request Sep 15, 2024
This can help in cases where pointer alignment info is missing, e.g.
llvm#108210

The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.
fhahn added a commit that referenced this pull request Sep 28, 2024
…108777)

This can help in cases where pointer alignment info is missing, e.g.
#108210

The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.


PR: #108777
puja2196 pushed a commit to puja2196/LLVM-tutorial that referenced this pull request Sep 30, 2024
…108777)

This can help in cases where pointer alignment info is missing, e.g.
llvm/llvm-project#108210

The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.


PR: llvm/llvm-project#108777
puja2196 pushed a commit to puja2196/LLVM-tutorial that referenced this pull request Oct 2, 2024
…108777)

This can help in cases where pointer alignment info is missing, e.g.
llvm/llvm-project#108210

The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.


PR: llvm/llvm-project#108777
@david-arm david-arm deleted the scev_tests branch October 3, 2024 08:53
xgupta pushed a commit to xgupta/llvm-project that referenced this pull request Oct 4, 2024
…lvm#108777)

This can help in cases where pointer alignment info is missing, e.g.
llvm#108210

The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.


PR: llvm#108777
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants