frontend: update query pruning #10026

krajorama · 2024-11-26T10:21:48Z

What this PR does

The current method of excluding/including subquery results in PromQL by comparing to -Inf or +Inf is no longer valid after prometheus/prometheus#15245, because comparing native histograms to a float with < or > now results in a warning, not an empty set.

To ease migration away from the old wrong logic, I'm adding the new logic first.

The new method uses the logical AND operation to intersect the subquery with either a constant vector() or an empty vector(). E.g.

subquery and on() (vector(1)==1)

which becomes:

subquery

and conversely

subquery and on() (vector(-1)==1)

becomes

(vector(-1)==1)
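The two rewrites above can be sketched on a toy AST. This is only an illustration of the rule as described in this PR: the types Expr, Query, ConstCmp and AndOn and the function pruneTop are hypothetical stand-ins, not the Prometheus parser types or the actual astmapper code.

```go
package main

import "fmt"

// Expr is a toy stand-in for a PromQL AST node (hypothetical; not the
// Prometheus parser types).
type Expr interface{ String() string }

// Query is an opaque subquery that would have to load chunks.
type Query struct{ Text string }

// ConstCmp models the constant expression `(vector(Value) == Want)`.
type ConstCmp struct{ Value, Want float64 }

// AndOn models `LHS and on() RHS`.
type AndOn struct{ LHS, RHS Expr }

func (q Query) String() string    { return q.Text }
func (c ConstCmp) String() string { return fmt.Sprintf("(vector(%g) == %g)", c.Value, c.Want) }
func (a AndOn) String() string    { return fmt.Sprintf("%s and on() %s", a.LHS, a.RHS) }

// pruneTop applies the rule to the top-level node only.
func pruneTop(e Expr) Expr {
	and, ok := e.(AndOn)
	if !ok {
		return e
	}
	c, ok := and.RHS.(ConstCmp)
	if !ok {
		return e
	}
	if c.Value == c.Want {
		// The constant side always yields a sample, so the AND is a no-op.
		return and.LHS
	}
	// The constant side is always empty: keep only the cheap empty expression.
	return c
}

func main() {
	sub := Query{"subquery"}
	fmt.Println(pruneTop(AndOn{sub, ConstCmp{1, 1}}))
	fmt.Println(pruneTop(AndOn{sub, ConstCmp{-1, 1}}))
}
```

In this sketch the never-true case returns the constant comparison itself rather than nothing, so the surrounding expression still sees an empty vector.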

Note

We cannot just drop (vector(-1)==1) altogether, because of this example:

(other_subquery) and (subquery and on() (vector(-1)==1))

Removing the whole right side would allow results from other_subquery, which is not correct. Solving this is out of scope.
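A toy evaluation makes the difference concrete. The sketch below assumes a much simplified model of instant vectors (a map from a label-set string to a value); the names Vector, and and andOn are hypothetical and only mimic the set semantics of PromQL's and operator.

```go
package main

import "fmt"

// Vector is a toy instant vector: label-set string -> sample value.
type Vector map[string]float64

// and keeps left-hand series whose full label set also appears on the right.
func and(l, r Vector) Vector {
	out := Vector{}
	for k, v := range l {
		if _, ok := r[k]; ok {
			out[k] = v
		}
	}
	return out
}

// andOn models `l and on() r`: with an empty on() list every series falls
// into the same match group, so l survives only if r has at least one sample.
func andOn(l, r Vector) Vector {
	if len(r) == 0 {
		return Vector{}
	}
	return l
}

func main() {
	other := Vector{`{job="a"}`: 1}
	sub := Vector{`{job="a"}`: 2}
	never := Vector{} // what (vector(-1)==1) evaluates to

	kept := and(other, andOn(sub, never)) // original query: empty result
	dropped := other                      // right side removed entirely: not empty
	fmt.Println(len(kept), len(dropped))
}
```

Under this model the original query returns nothing, while the query with the right side removed returns the series from other_subquery, which is why the empty constant expression has to stay in place.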

Which issue(s) this PR fixes or relates to

Fixes N/A

Checklist

Tests updated.
N/A Documentation added.
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
N/A about-versioning.md updated with experimental features.

charleskorn

I think we can further prune the expression in some cases:

charleskorn · 2024-11-26T22:00:08Z

pkg/frontend/querymiddleware/astmapper/pruning_new_test.go

+		{
+			// "and on()" is not on top level, "or" has lower precedence.
+			`(avg(rate(foo[1m]))) and on() (vector(0) == 1) or avg(rate(bar[1m]))`,
+			`(vector(0) == 1) or avg(rate(bar[1m]))`,


Couldn't we simplify this further, to avg(rate(bar[1m]))?

krajorama (Nov 27, 2024): That's more complicated. The main point is to avoid loading chunks, and (vector(0)==1) doesn't load chunks, so it should be very efficient.
We can add that logic in a new PR; it seems to be independent of this optimization, wdyt?
But having two algorithms raises the question of whether they need to be run repeatedly to optimize away everything, so it gets even more complicated.

krajorama (Nov 28, 2024): Note that removing (vector(0)==1) from an OR expression is different from removing it from an AND expression, which is what I mean by "more complicated". It's outside my scope.

colega (Nov 28, 2024): Is there a test for this? We should document that in a test so nobody comes along tomorrow and "optimizes further".

charleskorn (Nov 28, 2024), quoting krajorama: "having two algorithms raises the question of whether they need to be run repeatedly to optimize away everything, so it gets even more complicated."

If we optimise the leaf node expressions first and work our way back up the tree, can't we do both "and" and "or" in one pass?

But if you want to limit this to "and" and come back to "or" later, that's fine by me, please just make this clear with a test or a comment as Oleg suggests.

krajorama (Nov 29, 2024): Adding both.

krajorama (Nov 29, 2024): Well, actually I already had 3 tests on this, so I just improved the comments.
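For reference, the single-pass idea raised in this thread (prune the leaves first, then work back up the tree) could look roughly like the sketch below. It is not what this PR implements, since the PR deliberately limits itself to "and". The toy types Node, Leaf, Empty and Bin are hypothetical, and the sketch ignores vector-matching modifiers and annotations.

```go
package main

import "fmt"

// Node is a toy AST node (hypothetical; not the Prometheus parser types).
type Node interface{ String() string }

// Leaf is any expression that would load chunks.
type Leaf struct{ Text string }

// Empty stands for a constant expression that is always empty,
// e.g. `(vector(0) == 1)`.
type Empty struct{}

// Bin is a binary set operation; Op is "and" or "or".
type Bin struct {
	Op       string
	LHS, RHS Node
}

func (l Leaf) String() string { return l.Text }
func (Empty) String() string  { return "(vector(0) == 1)" }
func (b Bin) String() string  { return fmt.Sprintf("(%s %s %s)", b.LHS, b.Op, b.RHS) }

// prune simplifies children first, then the current node, so one
// bottom-up pass handles both operators.
func prune(n Node) Node {
	b, ok := n.(Bin)
	if !ok {
		return n
	}
	l, r := prune(b.LHS), prune(b.RHS)
	_, lEmpty := l.(Empty)
	_, rEmpty := r.(Empty)
	switch b.Op {
	case "and":
		if lEmpty || rEmpty {
			return Empty{} // intersecting with nothing yields nothing
		}
	case "or":
		if lEmpty {
			return r // union with nothing yields the other side
		}
		if rEmpty {
			return l
		}
	}
	return Bin{b.Op, l, r}
}

func main() {
	foo, bar := Leaf{"avg(rate(foo[1m]))"}, Leaf{"avg(rate(bar[1m]))"}
	fmt.Println(prune(Bin{"or", Bin{"and", foo, Empty{}}, bar}))
	fmt.Println(prune(Bin{"and", Bin{"and", foo, Empty{}}, bar}))
}
```

Because each node is simplified only after its children, an Empty produced deep in the tree propagates upward without rerunning the pass.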

charleskorn · 2024-11-26T22:00:30Z

pkg/frontend/querymiddleware/astmapper/pruning_new_test.go

+		{
+			// "and on()" is not on top level, due to left-right associativity.
+			`(avg(rate(foo[1m]))) and on() (vector(0) == 1) and avg(rate(bar[1m]))`,
+			`(vector(0) == 1) and avg(rate(bar[1m]))`,


Couldn't we simplify this further, to vector(0) == 1?

charleskorn · 2024-11-26T22:01:20Z

pkg/frontend/querymiddleware/astmapper/pruning_new_test.go

+		},
+		{ // The const expression is on the wrong side.
+			`(vector(0) == 1) and on() (avg(rate(foo[1m])))`,
+			`(vector(0) == 1) and on() (avg(rate(foo[1m])))`,


Same as below - I think this can be simplified to vector(0) == 1.

charleskorn · 2024-11-26T22:02:23Z

pkg/frontend/querymiddleware/astmapper/pruning_new_test.go

+		},
+		{ // Matching on labels.
+			`(avg(rate(foo[1m]))) and on(id) (vector(0) == 1)`,
+			`(avg(rate(foo[1m]))) and on(id) (vector(0) == 1)`,


Couldn't this be simplified to vector(0) == 1? The right side will never produce a result, so we'll never return anything from the left side.

charleskorn · 2024-11-26T22:03:35Z

pkg/frontend/querymiddleware/astmapper/pruning_new_test.go

+		},
+		{ // Not "on" expression.
+			`(avg(rate(foo[1m]))) and ignoring() (vector(0) == 1)`,
+			`(avg(rate(foo[1m]))) and ignoring() (vector(0) == 1)`,


Same as above - the right side will never produce a result.

beorn7

Looks good at first glance. (I assume you don't want a code-level review from me, just that the idea is correct.)

The current method of excluding/including subquery results in PromQL by comparing to -Inf or +Inf is no longer valid after prometheus/prometheus#15245, because comparing native histograms to a float with < or > now results in a warning, not an empty set.

The new method uses the logical AND operation to intersect the subquery with either a constant vector() or an empty vector(). E.g.

subquery and on() (vector(1)==1)
subquery and on() (vector(-1)==1)

which become:

subquery
(vector(-1)==1)

Note that although in theory (vector(-1)==1) could be dropped in some cases, it depends on the context and is out of scope for this PR.

Signed-off-by: György Krajcsovits <[email protected]>

tacole02 · 2024-11-29T17:26:27Z

CHANGELOG.md

@@ -77,6 +77,7 @@
 * [ENHANCEMENT] Ingester: Add `-blocks-storage.tsdb.bigger-out-of-order-blocks-for-old-samples` to build 24h blocks for out-of-order data belonging to the previous days instead of building smaller 2h blocks. This reduces pressure on compactors and ingesters when the out-of-order samples span multiple days in the past. #9844 #10033 #10035
 * [ENHANCEMENT] Distributor: allow a different limit for info series (series ending in `_info`) label count, via `-validation.max-label-names-per-info-series`. #10028
 * [ENHANCEMENT] Ingester: do not reuse labels, samples and histograms slices in the write request if there are more entries than 10x the pre-allocated size. This should help to reduce the in-use memory in case of few requests with a very large number of labels, samples or histograms. #10040
+* [ENHANCEMENT] Query-Frontend: prune `<subquery> and on() (vector(x)==y)` style queries and no longer prune `<subquery> < -Inf`. Triggered by https://github.com/prometheus/prometheus/pull/15245. #10026


Nit: no longer prune >> stop pruning

krajorama force-pushed the krajo/change-pruneinig branch from 9dcc588 to c70d552 Compare November 26, 2024 10:24

krajorama requested review from charleskorn, zenador and beorn7 November 26, 2024 10:26

krajorama marked this pull request as ready for review November 26, 2024 10:35

krajorama requested a review from a team as a code owner November 26, 2024 10:35

krajorama marked this pull request as draft November 26, 2024 12:11

krajorama force-pushed the krajo/change-pruneinig branch 2 times, most recently from 97a767d to 98d8813 Compare November 26, 2024 12:45

krajorama marked this pull request as ready for review November 26, 2024 12:46

charleskorn reviewed Nov 26, 2024

beorn7 reviewed Nov 26, 2024

krajorama force-pushed the krajo/change-pruneinig branch 2 times, most recently from a1440c4 to 6901f7d Compare November 29, 2024 08:55

krajorama force-pushed the krajo/change-pruneinig branch from 6901f7d to f6c554a Compare November 29, 2024 09:46

krajorama mentioned this pull request Nov 29, 2024

fix(docs): update native histograms migration guide #10052

Open

2 tasks

tacole02 reviewed Nov 29, 2024
