POC: OpenTelemetry metrics rework #3448
Conversation
I skimmed through and tried some things, and left some comments. I can't say I fully understand all the OTEL concepts, and I've ignored the histogram situation, but otherwise it's not too intimidating :-)
I left a comment about the performance of Inc(); we probably need to figure that out. Otherwise, having no separate stats thread and making callback-based metrics first-class would be interesting. I'm not overly fond of the "update something every x seconds" approach, which may lead to stale metrics.
```diff
 /**
  * Increments the value by @p amount.
  * @pre `amount >= 0`
  */
-void Inc(int64_t amount) noexcept { broker::telemetry::inc(hdl, amount); }
+void Inc(BaseType amount) noexcept {
+    handle->Add(amount, attributes);
```
I smoke tested performance locally with a 200k SYN packet pcap: zeek -r time goes from ~10sec to ~11.3sec. Not crazy, but it seems like too much. There are opentelemetry stacks showing up significantly in flamegraphs for the EventHandler, log write counting, and session handling.
Is it possible to get access to the counter directly? Skimming the stacks, it appears there's a lookup done in some hashtable for each Add() call. Alternatively, we could switch to a callback-based / Observable instrument exposing the value that's kept. Otherwise this is introducing significant overhead even if metrics aren't used.
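For reference, a rough sketch of what the Observable-instrument route could look like with opentelemetry-cpp's asynchronous metrics API. The meter and instrument names are placeholders, and exact headers and signatures may differ between OTel versions; the point is that the hot path only bumps a local atomic, and OTel pulls the value through a callback at collection time instead of doing a per-Inc() attribute lookup.

```cpp
// Rough sketch only: an asynchronous (Observable) counter with opentelemetry-cpp.
// The meter name, instrument name, and wiring are placeholders; exact headers and
// signatures can vary between OTel versions.
#include <atomic>
#include <cstdint>

#include <opentelemetry/metrics/observer_result.h>
#include <opentelemetry/metrics/provider.h>

namespace metrics_api = opentelemetry::metrics;
namespace nostd = opentelemetry::nostd;

// The value we maintain ourselves; the hot path touches only this.
static std::atomic<int64_t> conn_count{0};

// Called by the SDK at collection time (e.g. when a metric reader runs),
// not on every increment.
static void ObserveConnCount(metrics_api::ObserverResult result, void* /*state*/) {
    using IntObserver = nostd::shared_ptr<metrics_api::ObserverResultT<int64_t>>;
    if ( nostd::holds_alternative<IntObserver>(result) )
        nostd::get<IntObserver>(result)->Observe(conn_count.load(std::memory_order_relaxed));
}

void RegisterConnCounter() {
    // Assumes an SDK MeterProvider has been installed elsewhere; otherwise this is a no-op.
    auto provider = metrics_api::Provider::GetMeterProvider();
    auto meter = provider->GetMeter("zeek", "1.0");
    // Keep the instrument alive so the callback keeps firing.
    static auto counter = meter->CreateInt64ObservableCounter("zeek_connections_total");
    counter->AddCallback(ObserveConnCount, nullptr);
}

// Hot path: a plain atomic increment, no hashing or attribute conversion.
void OnNewConnection() { conn_count.fetch_add(1, std::memory_order_relaxed); }
```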
FWIW: caf::telemetry matches 0.2% of samples on current master.
There is a lookup in the GetOrAdd methods that does a comparison across all of the existing counters for a matching set of labels. That's probably what it is, without having actually looked at the flamegraph. In some cases we do store the counters locally (session::Manager, for one), but we don't do that in every case.
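As a hypothetical illustration of that caching pattern (the class, family, and label names below are made up, and the signatures only loosely follow this PR's API, where GetOrAdd() hands back a shared_ptr): look the labeled counter up once and keep the handle, so the label-set comparison doesn't run on every increment.

```cpp
// Hypothetical caching pattern (names and signatures are illustrative only):
// resolve the labeled counter once via GetOrAdd() and keep the returned
// shared_ptr, so the label-set comparison doesn't run on every increment.
#include <memory>

#include "zeek/telemetry/Counter.h"  // assumed include; adjust to the real header

class SessionStats {
public:
    explicit SessionStats(zeek::telemetry::IntCounterFamily& family)
        // One GetOrAdd() lookup at construction time instead of one per event.
        : tcp_sessions_(family.GetOrAdd({{"protocol", "tcp"}})) {}

    void OnNewTcpSession() { tcp_sessions_->Inc(); }  // no per-call lookup

private:
    std::shared_ptr<zeek::telemetry::IntCounter> tcp_sessions_;
};
```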
Ah, I took it as something within handle->Add(amount, attributes): opentelemetry needs to internally find the right counter for the given labels/attributes.
I'm looking at a flamegraph of it now. I might have to open an issue on their side, as this dives really quickly down into their code. MetricAttributeIterable::ForEachKeyValue is our code. I might be able to optimize the loop itself by storing the values locally as opentelemetry::common::AttributeValue objects instead of strings. That'd avoid a conversion every time we do a lookup, but ForEachKeyValue is dominated mostly by the call to the callback method passed as an argument. That callback function comes from OTEL itself, so we don't have much control over its performance.
ec5b2ab cuts down on it a bit by storing a secondary map whose entries are views into the existing string map, but in the OTel types, so it doesn't have to do conversions. With the 200k-SYN pcap we use on os-perf-1, it cuts the contribution of zeek::telemetry from 11.5% to 9.3%. There's still a performance hit from the function call, though. I asked on their Slack channel what else we can do about it.
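A minimal sketch of that idea, assuming opentelemetry-cpp's KeyValueIterable interface (the class name here is hypothetical, not the PR's MetricAttributeIterable, and header paths may vary by version): the views are built once from the owning string map, so ForEachKeyValue() can hand out AttributeValue objects without per-call string conversions.

```cpp
// Minimal sketch (hypothetical class, not the PR's MetricAttributeIterable):
// precompute OTel-typed views into the owning string map so ForEachKeyValue()
// doesn't convert std::string to string_view/AttributeValue on every Add().
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

#include <opentelemetry/common/attribute_value.h>
#include <opentelemetry/common/key_value_iterable.h>

namespace otel_common = opentelemetry::common;
namespace nostd = opentelemetry::nostd;

class CachedAttributes : public otel_common::KeyValueIterable {
public:
    // `labels` must outlive this object, since the views point into it.
    explicit CachedAttributes(const std::map<std::string, std::string>& labels) {
        views_.reserve(labels.size());
        for ( const auto& [k, v] : labels )
            views_.emplace_back(nostd::string_view{k}, otel_common::AttributeValue{nostd::string_view{v}});
    }

    bool ForEachKeyValue(nostd::function_ref<bool(nostd::string_view, otel_common::AttributeValue)> cb)
        const noexcept override {
        for ( const auto& [k, v] : views_ )
            if ( ! cb(k, v) )
                return false;
        return true;
    }

    size_t size() const noexcept override { return views_.size(); }

private:
    std::vector<std::pair<nostd::string_view, otel_common::AttributeValue>> views_;
};
```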
I talked to one of the OTEL guys and shared a flamegraph with him. He agreed that it's a performance issue for it to be rehashing the map every time, even though the map hasn't changed at all since the last time. He said he'd look into it as part of some other performance issues he's investigating soon.
But it does add
I agree that I'd love for that to be suppressed but it's apparently part of the otel spec: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#instrumentation-scope-1
Which is followed by this sentence:
There's no such option? :-)
Not that I could find. I opened open-telemetry/opentelemetry-cpp#2442 after asking on Slack and getting no response for a few days. EDIT: This will be fixed by open-telemetry/opentelemetry-cpp#2451, once that gets merged and released.
With the broker version, the copying was mostly OK because all of the actual telemetry data was stored internally in broker and accessed via handle objects; copying an object didn't matter as long as the handle was maintained. In the OTEL version, a copy of the count data is stored locally in the Counter/Gauge/Histogram objects, so copies made through the OpaqueVal objects will screw things up and the values get "lost". This is unfortunately a breaking change to the API, as GetOrAdd() now returns a shared_ptr instead of a bare object.
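To make the ownership issue concrete, here's a minimal, self-contained illustration (hypothetical types, not Zeek's actual classes) of why value copies lose updates once the counter owns its own count, and why handing out a shared_ptr avoids that.

```cpp
// Self-contained illustration (hypothetical types, not Zeek's classes) of why
// value copies break things once the counter owns its own count.
#include <cstdint>
#include <iostream>
#include <memory>

struct OwnedCounter {
    int64_t value = 0;   // the count now lives inside the object itself
    void Inc() { ++value; }
};

int main() {
    OwnedCounter original;
    OwnedCounter copy = original;  // e.g. a copy made through an OpaqueVal clone
    copy.Inc();                    // the increment lands in the copy...
    std::cout << original.value << "\n";  // ...so the original still prints 0

    // Sharing a single instance (what GetOrAdd() returning shared_ptr gives us)
    // makes every handle update the same count.
    auto shared = std::make_shared<OwnedCounter>();
    auto another_handle = shared;  // copies the pointer, not the count
    another_handle->Inc();
    std::cout << shared->value << "\n";   // prints 1
}
```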
- Reduce code duplication via a set of base classes for Counter/Gauge/Histogram
- Only create a single otel instrument for each prefix/name combination
We shouldn't need to do these things. There's a PR open in otel-cpp to fix both issues on their side, but it's been open since June and hasn't been completed yet. Hopefully it'll land soon.
The otel work is on hold until they fix all of the issues that I've opened with them, since most of them are blockers to our usage. I'm going to close this and will open a new one when there's something to review.
This is a draft PR for a rework of our telemetry subsystem that I've been working on. The intention of the PR is to discuss the current code structure and whether anything needs to change before finishing the rest of the implementation.
The goal here is to move the telemetry code out of the Broker framework into a framework of its own, using OpenTelemetry as the underlying transport framework. The unit tests all pass, but the btests are currently failing in a lot of places. The failures are mostly for common reasons (especially for histograms; see the bucket comment below).
What’s implemented:
What features are still missing:
- Syncing metric values over Broker. One method would be to have the instruments (Counter, Gauge, and Histogram) write the current value to a topic whenever something changes. This could lead to a lot of extra event traffic, though, and may not be terribly efficient. A second method would be to implement an OTel exporter, add it to a periodic metric reader, and have it export the values to the topic on a timer (see the sketch at the end of this description). We'd need to figure out how to set the values in the instruments directly, since the API OTel exposes only allows you to add to the value.
- The endpoint attribute that is added to every metric in the Broker implementation is missing.

Some broken things in OTel:
- OTel doesn't handle the kSum aggregation type correctly when it comes to their Prometheus exporter, so it doesn't set the _total suffix on the metric name before exporting. This is already fixed in an open PR that will hopefully land before the next OTel release.
- OTel doesn't handle unit values correctly in the Prometheus exporter and always appends the underscore. This is also fixed in the same PR as above. We're handling both of these on our side already, but the workaround could be removed once it's fixed upstream.

API Changes and proposed changes:
Builds on Windows are currently failing because of Conan dependency issues.
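For the "second method" mentioned under missing features (exporting to the topic on a timer), here's a rough sketch of how the wiring could look, assuming opentelemetry-cpp's SDK interfaces for a custom PushMetricExporter attached to a PeriodicExportingMetricReader. Class and member names may differ between OTel versions, and the Broker-publishing part is entirely hypothetical.

```cpp
// Rough sketch of the "second method": a custom exporter driven by a
// PeriodicExportingMetricReader. Names/signatures may differ between OTel
// versions; the Broker publishing is left as a placeholder.
#include <chrono>
#include <memory>

#include <opentelemetry/sdk/common/exporter_utils.h>
#include <opentelemetry/sdk/metrics/export/periodic_exporting_metric_reader.h>
#include <opentelemetry/sdk/metrics/meter_provider.h>
#include <opentelemetry/sdk/metrics/push_metric_exporter.h>

namespace otel_sdk = opentelemetry::sdk::metrics;

class BrokerMetricExporter : public otel_sdk::PushMetricExporter {
public:
    opentelemetry::sdk::common::ExportResult Export(const otel_sdk::ResourceMetrics& data) noexcept override {
        // Walk `data` and publish the current values to a Broker topic here.
        return opentelemetry::sdk::common::ExportResult::kSuccess;
    }

    otel_sdk::AggregationTemporality GetAggregationTemporality(
        otel_sdk::InstrumentType /*type*/) const noexcept override {
        return otel_sdk::AggregationTemporality::kCumulative;
    }

    bool ForceFlush(std::chrono::microseconds /*timeout*/) noexcept override { return true; }
    bool Shutdown(std::chrono::microseconds /*timeout*/) noexcept override { return true; }
};

void AddBrokerReader(otel_sdk::MeterProvider& provider) {
    otel_sdk::PeriodicExportingMetricReaderOptions options;
    options.export_interval_millis = std::chrono::milliseconds(5000);  // export every 5s
    provider.AddMetricReader(std::make_shared<otel_sdk::PeriodicExportingMetricReader>(
        std::make_unique<BrokerMetricExporter>(), options));
}
```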