-
Notifications
You must be signed in to change notification settings - Fork 537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure counter reset hints are corrected when merging batches #9909
base: main
Are you sure you want to change the base?
Conversation
18d9cf2
to
d9c34a7
Compare
Implementation and testing is done (pending CI), but I want to extract the mimir-prometheus update into a separate PR before removing the "WIP" |
Co-authored-by: George Krajcsovits <[email protected]>
Co-authored-by: George Krajcsovits <[email protected]>
Signed-off-by: György Krajcsovits <[email protected]>
7b27917
to
0f4f294
Compare
mimir-prometheus update was done as part of #10033 |
e7d5fb6
to
24bea3c
Compare
name: "overlapping and interleaved samples", | ||
chunks: func() []GenericChunk { | ||
chk1, err := chunk.NewForEncoding(enc) | ||
require.NoError(t, err) | ||
addFunc(chk1, 0, 0) | ||
addFunc(chk1, 3, 1) | ||
addFunc(chk1, 4, 2) | ||
addFunc(chk1, 5, 3) | ||
addFunc(chk1, 7, 4) | ||
addFunc(chk1, 8, 5) | ||
chk2, err := chunk.NewForEncoding(enc) | ||
require.NoError(t, err) | ||
addFunc(chk2, 1, 1) | ||
addFunc(chk2, 2, 7) | ||
addFunc(chk2, 5, 8) | ||
addFunc(chk2, 6, 9) | ||
addFunc(chk2, 7, 10) | ||
addFunc(chk2, 9, 11) | ||
return []GenericChunk{ | ||
NewGenericChunk(0, 8, chunk.NewChunk(labels.FromStrings(model.MetricNameLabel, "foo"), chk1, 0, 8).Data.NewIterator), | ||
NewGenericChunk(1, 9, chunk.NewChunk(labels.FromStrings(model.MetricNameLabel, "foo"), chk2, 1, 9).Data.NewIterator), | ||
} | ||
}(), | ||
expectedSamples: []checkHintTestSample{ | ||
{t: 0, v: 0, hint: histogram.UnknownCounterReset}, // c1 | ||
{t: 1, v: 1, hint: histogram.UnknownCounterReset}, // c2 | ||
{t: 2, v: 7, hint: histogram.NotCounterReset}, // c2 | ||
{t: 3, v: 1, hint: histogram.UnknownCounterReset}, // c1 | ||
{t: 4, v: 2, hint: histogram.NotCounterReset}, // c1 | ||
{t: 5, v: 8, hint: histogram.UnknownCounterReset}, // c2 | ||
// Next sample is from c2. This is consecutive, but this ends up as the first sample in a merged | ||
// batch, and the c1 sample at ts 7 and 8 is added to the batch stream before this c2 sample, so | ||
// the prevIteratorID is set to c1 rather than c2 when we append this sample. | ||
{t: 6, v: 9, hint: histogram.UnknownCounterReset}, | ||
{t: 7, v: 4, hint: histogram.UnknownCounterReset}, // c1 | ||
{t: 8, v: 5, hint: histogram.NotCounterReset}, // c1 | ||
{t: 9, v: 11, hint: histogram.UnknownCounterReset}, // c2 | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting test case - you can have as two consecutive samples from the same iterator but the second one isn't always detected as NotCounterReset
, as you can see with ts 6. This is a case where keeping track of the last timestamp per iterator rather than just the last iterator would have detected the NotCounterReset
properly. However, I think the current algorithm is good enough for now, I don't want to complicate it further without checking how it works in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, we aim to fail safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good a few nits added first round
name: "overlapping and interleaved samples", | ||
chunks: func() []GenericChunk { | ||
chk1, err := chunk.NewForEncoding(enc) | ||
require.NoError(t, err) | ||
addFunc(chk1, 0, 0) | ||
addFunc(chk1, 3, 1) | ||
addFunc(chk1, 4, 2) | ||
addFunc(chk1, 5, 3) | ||
addFunc(chk1, 7, 4) | ||
addFunc(chk1, 8, 5) | ||
chk2, err := chunk.NewForEncoding(enc) | ||
require.NoError(t, err) | ||
addFunc(chk2, 1, 1) | ||
addFunc(chk2, 2, 7) | ||
addFunc(chk2, 5, 8) | ||
addFunc(chk2, 6, 9) | ||
addFunc(chk2, 7, 10) | ||
addFunc(chk2, 9, 11) | ||
return []GenericChunk{ | ||
NewGenericChunk(0, 8, chunk.NewChunk(labels.FromStrings(model.MetricNameLabel, "foo"), chk1, 0, 8).Data.NewIterator), | ||
NewGenericChunk(1, 9, chunk.NewChunk(labels.FromStrings(model.MetricNameLabel, "foo"), chk2, 1, 9).Data.NewIterator), | ||
} | ||
}(), | ||
expectedSamples: []checkHintTestSample{ | ||
{t: 0, v: 0, hint: histogram.UnknownCounterReset}, // c1 | ||
{t: 1, v: 1, hint: histogram.UnknownCounterReset}, // c2 | ||
{t: 2, v: 7, hint: histogram.NotCounterReset}, // c2 | ||
{t: 3, v: 1, hint: histogram.UnknownCounterReset}, // c1 | ||
{t: 4, v: 2, hint: histogram.NotCounterReset}, // c1 | ||
{t: 5, v: 8, hint: histogram.UnknownCounterReset}, // c2 | ||
// Next sample is from c2. This is consecutive, but this ends up as the first sample in a merged | ||
// batch, and the c1 sample at ts 7 and 8 is added to the batch stream before this c2 sample, so | ||
// the prevIteratorID is set to c1 rather than c2 when we append this sample. | ||
{t: 6, v: 9, hint: histogram.UnknownCounterReset}, | ||
{t: 7, v: 4, hint: histogram.UnknownCounterReset}, // c1 | ||
{t: 8, v: 5, hint: histogram.NotCounterReset}, // c1 | ||
{t: 9, v: 11, hint: histogram.UnknownCounterReset}, // c2 | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, we aim to fail safe.
@@ -85,6 +85,7 @@ | |||
* [BUGFIX] Fix pooling buffer reuse logic when `-distributor.max-request-pool-buffer-size` is set. #9666 | |||
* [BUGFIX] Fix issue when using the experimental `-ruler.max-independent-rule-evaluation-concurrency` feature, where the ruler could panic as it updates a running ruleset or shutdowns. #9726 | |||
* [BUGFIX] Always return unknown hint for first sample in non-gauge native histograms chunk to avoid incorrect counter reset hints when merging chunks from different sources. #10033 | |||
* [BUGFIX] Ensure counter reset hints are corrected when merging batches. #9909 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* [BUGFIX] Ensure counter reset hints are corrected when merging batches. #9909 | |
* [BUGFIX] Ensure native histograms counter reset hints are corrected when merging results from different sources. #9909 |
@@ -119,6 +119,8 @@ func (p *prometheusHistogramChunk) AddFloatHistogram(_ int64, _ *histogram.Float | |||
// AddHistogram adds another histogram to the chunk. While AddHistogram works, it is only implemented to make tests | |||
// work, and should not be used in production. In particular, it appends all histograms to single chunk, and uses new | |||
// Appender for each invocation. | |||
// AddHistogram adds another histogram to the chunk. While AddHistogram works, it is only implemented to make tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think we meant to replace the doc string not add to it
What this PR does
When merging samples from different iterators, histogram counter reset hints need to be recalculated. The
mergeIterator
currently does not do so. This PR fixes that by remembering the iterator for a sample in a batch, so when the next sample comes along, the iterators for the two samples can be compared and an unknown counter reset hint is set for the next sample if the iterators don't match. See prometheus/prometheus#15346 for related discussion.Which issue(s) this PR fixes or relates to
Fixes #
Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]
.about-versioning.md
updated with experimental features.