This PR fixes a bad join order in `cs/zipslip`. @LWSimpkins was kind enough to provide me with a log file that showed the problematic predicate:

The first thing to do in this case is to notice the `#bff` in the predicate name. This means the compiler has "optimized" the predicate and inserted additional context into the predicate from the call site in order to reduce the size of the predicate (this is called the "magic" optimization since it may sometimes magically make predicates evaluate much more efficiently). However, the additional context has a risk of being joined with in an inefficient way (just like any other conjuncts one writes in a predicate). Furthermore, there's a risk that any performance tricks we do will be undone by the compiler as part of this optimization. So the first commit disables this optimization for the `getAnArgument`
predicate.

Furthermore, I also did the same to the `ComparisonTest.getAnArgument` predicate since I saw bad magic in there as well. That wasn't a performance problem in itself, though. Just a looming threat that could have caused a future problem 😅

When we run the code from the first commit, the code is still slow. Running
`codeql query compile "codeql\csharp\ql\src\Security Features\CWE-022\ZipSlip.ql" --dump-dil --dump-ra` allows us to better see the code that we end up executing. Doing that, we see the body of the predicate that's slow:

The `SYNTHETIC` keyword means the compiler has generated this predicate. Let's search for all the places where this is used. Luckily, there's just one place:

Aha! So it's generated because of the
`ZipSlipQuery::SanitizerMethodCall.getFilePathArgument` predicate. Let's look at the DIL for that:

If you squint a bit you will notice that the body of `_Call::Call.getArgument/1#dispred#28f02664_102#join_rhs_params#join_rhs` corresponds to this part of the DIL:

And here we see the problem: we're joining
`params` and `Call::Call.getArgument` on the argument index 😱 That's a really, really bad join order.

By factoring out some parts of the predicate we ensure that we're joining on both `this` and `index`. This results in much better code:

Now, we join on both the index and the `MethodCall` (i.e., `this`), which is much better.
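
To make the cost difference concrete, here is a toy simulation in Python. It is not CodeQL and it does not model the evaluator's actual algorithm; the relation names (`get_argument`, `params`) and the sizes are made up for illustration. It only counts how many intermediate tuples each join strategy touches when one relation maps `(call, index)` to an argument and the other records which index of each call is interesting.

```python
# Hypothetical toy relations, not taken from the PR or the CodeQL libraries.
# get_argument: (call, index, arg) tuples, loosely like Call.getArgument(index).
# params: (call, index) tuples, one interesting argument position per call.
NUM_CALLS = 200
ARGS_PER_CALL = 3

get_argument = [
    (call, index, f"arg_{call}_{index}")
    for call in range(NUM_CALLS)
    for index in range(ARGS_PER_CALL)
]
params = [(call, ARGS_PER_CALL - 1) for call in range(NUM_CALLS)]


def join_on_index_only():
    """Join the relations on index alone, then filter on the call afterwards."""
    by_index = {}
    for call, index, arg in get_argument:
        by_index.setdefault(index, []).append((call, arg))

    intermediate = 0
    result = set()
    for call, index in params:
        for other_call, arg in by_index.get(index, []):
            intermediate += 1  # every argument of ANY call at this index
            if other_call == call:
                result.add((call, arg))
    return result, intermediate


def join_on_call_and_index():
    """Join the relations on the (call, index) pair in a single step."""
    by_key = {}
    for call, index, arg in get_argument:
        by_key.setdefault((call, index), []).append(arg)

    intermediate = 0
    result = set()
    for call, index in params:
        for arg in by_key.get((call, index), []):
            intermediate += 1  # only arguments of this call at this index
            result.add((call, arg))
    return result, intermediate


slow_result, slow_tuples = join_on_index_only()
fast_result, fast_tuples = join_on_call_and_index()

print(f"index-only join:      {slow_tuples} intermediate tuples")
print(f"(call, index) join:   {fast_tuples} intermediate tuples")
```

Both strategies compute the same result, but in this toy setup the index-only join visits a number of tuples that grows quadratically with the number of calls, while the join on both columns grows linearly. That is the same shape of problem as joining `params` and `Call::Call.getArgument` on the argument index alone.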