Intermittent Failing Test: Lucene.Net.Search.TestMultiTermQueryRewrites.TestMaxClauseLimitations() #1050
Comments
I can reproduce this with enough repeats on net8.0 macOS arm64. I'm fairly certain that what we're seeing is .NET 8's Dynamic PGO inlining this trivial method at runtime. The following post covers far more than this, but search the page for "inline" to see that it does inline trivial methods if it determines that is beneficial: https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-8/
We could force prevent this with
// Maybe remove this assert in later versions, when internal API changes:
Assert.AreEqual("CheckMaxClauseCount", new StackTrace(e, false).GetFrames()[0].GetMethod().Name); //, "Should throw BooleanQuery.TooManyClauses with a stacktrace containing checkMaxClauseCount()");
We should only care that a |
Actually, it would be a bug if we didn't have There are a few dozen of these stack trace checks in the tests. We took While I would like to find a better way to check these scenarios, I am having trouble coming up with a way that doesn't significantly deviate from the upstream code. Checking the stack seems to be the easiest approach. And with optimizations enabled it requires |
What is that reason? If anything, the comment on it indicates a keenness to remove the assertion rather than keep it, because it acknowledges that it's testing an internal API. Users should only be concerned that the exception was thrown as expected, not that a specific protected method in the call stack was the one that did the check.
Perhaps we could ask the Lucene team for their thoughts, but my gut tells me that if they had had anything as awesome as Dynamic PGO at the time, they would have leaned toward removing this assertion rather than harming performance by forcing the method not to be inlined, just to keep the assertion in the test (which has no meaningful benefit to end users).
Remember that Dynamic PGO is the runtime observing ways that it can tangibly improve performance based on real-world measured behavior, and then making those changes. By inhibiting this with NoInlining, we're choosing to harm performance just for purity with the original assertion. IMO Dynamic PGO's inlining is a feature, not a bug. And I think we should do a sweep to remove all such assertions and any NoInlining that was added to make these assertions pass.
Let's err on the side of letting the runtime give us better performance, rather than having to make sure our stack trace matches some decade+ old version of the JDK's behavior, especially when that assertion's author was even questioning whether it should be there in the first place. We could be considered the "later version" of this code that is changing the internal API, at runtime, via Dynamic PGO. |
I reviewed the other cases where NoInlining was added so that the stack trace stays "inspectable," and this case stands out as different. The other cases are inside test classes that change the behavior at test runtime; in this case it is only used for an assertion. Based on my rationale above, I've submitted a PR to comment out this assertion and add a comment explaining why, rather than disable PGO or force NoInlining. The assertion inside the |
Is there an existing issue for this?
Describe the bug
The Lucene.Net.Search.TestMultiTermQueryRewrites.TestMaxClauseLimitations() test fails under rare random conditions, but does not seem to repeat with the same random seed and culture.
Expected Behavior
The Lucene.Net.Search.TestMultiTermQueryRewrites.TestMaxClauseLimitations() test should always succeed.
Steps To Reproduce
Add the [Repeat(100000)] attribute to the test and it will fail frequently.
Exceptions (if any)
Lucene.NET Version
4.8.0-beta00017
.NET Version
net8.0
Operating System
Windows
Anything else?
It is known to fail on Windows x64 with net8.0. It may also fail under other configurations.
This test should be compared with the behavior in Lucene 4.8.0. For some reason we are sometimes seeing the exception thrown from the Collect() method when we are expecting it to always be thrown from the CheckMaxClauseCount() method, according to the way the test is written.
Note also that the assert message is commented out, and it should not be.
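For anyone trying to reason about the symptom, here is a minimal sketch of how this kind of top-frame check depends on inlining. It is not Lucene.NET code: TopFrameDemo, CheckLimit and the stand-in Collect are hypothetical names, and InvalidOperationException stands in for BooleanQuery.TooManyClauses. With NoInlining on the guard method, it keeps its own frame and is reported as the top frame. If the attribute is removed, the JIT is free to inline it into its caller, and the caller can then show up as the top frame instead.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class TopFrameDemo
{
    // Hypothetical stand-in for a small guard method such as CheckMaxClauseCount().
    // NoInlining keeps it as a distinct frame, so it stays visible in stack traces.
    // Without the attribute, the JIT may inline it into Collect().
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CheckLimit(int count, int max)
    {
        if (count > max)
            throw new InvalidOperationException("too many clauses");
    }

    // Hypothetical stand-in for the caller (Collect() in this issue).
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Collect(int count)
    {
        CheckLimit(count, 2);
    }

    // Returns the name of the method at the top of the exception's stack trace,
    // or an empty string if nothing was thrown.
    public static string TopFrameName(int count)
    {
        try
        {
            Collect(count);
            return "";
        }
        catch (InvalidOperationException e)
        {
            // Same style of inspection as the assertion in the test.
            return new StackTrace(e, false).GetFrame(0)?.GetMethod()?.Name ?? "";
        }
    }

    public static void Main()
    {
        Console.WriteLine(TopFrameName(3));
    }
}
```

A check that only verifies the exception type does not depend on which frame is on top, so it is not affected by what the JIT decides to inline.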