-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NO-TICKET] Upgrade to libdatadog 14.0 #4065
Conversation
**What does this PR do?** This PR upgrades the libdatadog dependency to the latest 14.0 release. It also updates our API usage to match some 13.1 -> 14.0 changes. **Motivation:** The libdatadog 14.0 release includes a number of improvements to the crashtracking implementation, and we want to pick them up. **Additional Notes:** As a first pass, I've disabled the creation and use of the alt stack, to match our existing setup. I did not yet experiment with `use_alt_stack = true`. **How to test the change?** Existing test coverage should be enough to cover our libdatadog API usage.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #4065 +/- ##
=======================================
Coverage 97.72% 97.73%
=======================================
Files 1338 1338
Lines 80248 80248
Branches 4016 4016
=======================================
+ Hits 78426 78432 +6
+ Misses 1822 1816 -6 ☔ View full report in Codecov by Sentry. |
BenchmarksBenchmark execution time: 2024-11-06 09:48:19 Comparing candidate commit ddbbbcb in PR branch Found 0 performance improvements and 1 performance regressions! Performance is the same for 28 metrics, 2 unstable metrics. scenario:tracing - 100 span trace - no writer
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope no things forgotten to be set 🙈
I've dropped a note to @sanchda which has been doing a lot of working on libdatadog crashtracking so he can do a quick pass on that part as well :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only issue, which we discussed offline, is whether or not it's appropriate to use the altstack even if it isn't created. I trust @ivoanjo to make whatever decision is appropriate for the runtime, so approving.
👍 🚀 🚀
I spent some time looking into Since the Ruby VM itself sets an altstack, this leaves us:
Since the crashtracker is currently off by default, I don't think this should be a blocker for upgrading to libdatadog 14. In preparation for |
I spent some time looking into `use_alt_stack` setting and I think there's a bug in libdatadog and that setting doesn't seem to be doing anything at this point. Since the Ruby VM itself sets an altstack, this leaves us: * In libdatadog 13.1, the crashtracker code always used the altstack (e.g. equivalent of `use_alt_stack` was hardcoded to `true`) => in libdatadog 13.1 we used the Ruby altstack * In libdatadog 14.0, the crashtracker code chooses to use the altstack based on the value of `create_alt_stack` => libdatadog 14.0 we can't use the Ruby altstack In preparation for `use_alt_stack` to get fixed in a future libdatadog release, I'm changing it to true although, right now it's a no-op. This means that in a future libdatadog release, we'll be back to where we were in libdatadog 13.1: the crashtracker DOES NOT create its own altstack, BUT WILL USE the existing Ruby altstack.
**What does this PR do?** This PR upgrades the libdatadog dependency to the latest 14.1 release. **Motivation:** This PR fixes the issues we uncovered during testing of 14.0 in #4065: specifically, that triggering a stack overflow with native code would result in incorrect behavior (a segfault, not a stack overflow exception) when crashtracking was enabled. **Additional Notes:** N/A **How to test the change?** Our existing test coverage includes libdatadog testing. The native code stack overflow is a bit of a pain to trigger because GCC seems to like optimizing away such things. What I did was compile the profiler native extension with `export DDTRACE_DEBUG=true` (remember to run clean first!), and then adding the following trivial stack overflow: ```diff diff --git a/ext/datadog_profiling_native_extension/profiling.c b/ext/datadog_profiling_native_extension/profiling.c index 96637cd110..4ec5505147 100644 --- a/ext/datadog_profiling_native_extension/profiling.c +++ b/ext/datadog_profiling_native_extension/profiling.c @@ -38,6 +38,12 @@ static VALUE _native_enforce_success(DDTRACE_UNUSED VALUE _self, VALUE syserr_er static void *trigger_enforce_success(void *trigger_args); static VALUE _native_malloc_stats(DDTRACE_UNUSED VALUE _self); +static VALUE blow_stack(DDTRACE_UNUSED VALUE _self) { + fprintf(stderr, "."); + blow_stack(Qnil); + return Qnil; +} + void DDTRACE_EXPORT Init_datadog_profiling_native_extension(void) { VALUE datadog_module = rb_define_module("Datadog"); VALUE profiling_module = rb_define_module_under(datadog_module, "Profiling"); @@ -72,6 +78,8 @@ void DDTRACE_EXPORT Init_datadog_profiling_native_extension(void) { rb_define_singleton_method(testing_module, "_native_trigger_holding_the_gvl_signal_handler_on", _native_trigger_holding_the_gvl_signal_handler_on, 1); rb_define_singleton_method(testing_module, "_native_enforce_success", _native_enforce_success, 2); rb_define_singleton_method(testing_module, "_native_malloc_stats", _native_malloc_stats, 0); + + rb_define_singleton_method(testing_module, "blow_stack", blow_stack, 0); } ```
What does this PR do?
This PR upgrades the libdatadog dependency to the latest 14.0 release.
It also updates our API usage to match some 13.1 -> 14.0 changes.
Motivation:
The libdatadog 14.0 release includes a number of improvements to the crashtracking implementation, and we want to pick them up.
Change log entry
Upgraded libdatadog dependency to version 14.0
Additional Notes:
As a first pass, I've disabled the creation and use of the alt stack, to match our existing setup.
I did not yet experiment with
use_alt_stack = true
.How to test the change?
Existing test coverage should be enough to cover our libdatadog API usage.