Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sentry reports handled NRE as EXC_BAD_ACCESS signal #3776

Open
TimBurik opened this issue Nov 21, 2024 · 8 comments
Open

Sentry reports handled NRE as EXC_BAD_ACCESS signal #3776

TimBurik opened this issue Nov 21, 2024 · 8 comments
Labels

Comments

@TimBurik
Copy link

Package

Sentry

.NET Flavor

.NET

.NET Version

9.0.100

OS

iOS

SDK Version

4.13.0

Self-Hosted Sentry Version

No response

Steps to Reproduce

  1. Create empty dotnet iOS project, add Sentry nuget package;
  2. In FinishedLaunching(), initialize SentrySdk:
SentrySdk.Init(options =>
 {
    options.Dsn = "...";
    options.Debug = true;
    options.SampleRate = 1.0f;
});
  1. After Sentry is initialized, execute a code like this:
try
{
    object o = null;
    Console.WriteLine(o.ToString()); //throws NullReferenceException
}
catch (Exception e)
{
    Console.WriteLine(e);
}

Expected Result

Since exception is handled, there should be no reports in Sentry.

Actual Result

Sentry writes down the envelope and sends it in the next session. Example of the crashlog: crashlog.txt

In production, we also see examples of double reports - when unhandled managed NRE and native SIGSEGV are reported from the same device at the same time with very similar stacktraces. But it seems to be the result of the same issue.

I seems that this issue is related to the order of signal handlers and has the same root cause as dotnet/android#9055 (seems to have a fix on the Sentry side getsentry/sentry-native#1026). Also it might be related to this issue: #3678

@jamescrosswell
Copy link
Collaborator

Hi @TimBurik,

Thanks for reporting this and thanks for tracking down related issues.

I seems that this issue is related to the order of signal handlers and has the same root cause as dotnet/android#9055 (seems to have a fix on the Sentry side getsentry/sentry-native#1026).

I suspect you're right. Support for alternative ordering of the signal handlers has already been added in sentry-native 0.7.13.

This PR is meant to address this for Android:

It's pending implementation of the same in the Java SDK.

It looks like we'd need to do something similar for iOS eh @bricefriha? Currently the fix is only applied in the Android/SentrySdk:

o.SetNativeHandlerStrategy(JavaSdk.Android.Core.NdkHandlerStrategy.SentryHandlerStrategyChainAtStart);

@bricefriha
Copy link
Contributor

Currently the fix is only applied in the Android/SentrySdk

Would it be worth letting the cocoa SDK team know about this issue?

@jamescrosswell
Copy link
Collaborator

jamescrosswell commented Nov 24, 2024

Would it be worth letting the cocoa SDK team know about this issue?

Possibly yes. How does the native sdk get initialised in an iOS application? Is it similar to Java where the Java SDK does that and we'd need to pass parameters via that SDK to configure how the native SDK gets initialized?

You'd need to read the code to work that out I think, and then explain to the cocoa team what changes you needed and why.

@jamescrosswell
Copy link
Collaborator

@philipphofmann are you the right person to tap on the shoulder about this?

We saw a problem recently when null reference exceptions are thrown on Android. Mischan gave a very comprehensive explanation of what was going on here.

It looks like the Coca SDK registers its own signal handlers. Possibly due to the order of registration, the Cocoa handlers are firing before the .NET handlers have a chance to handle the exception in .NET applications running on ios?

This is the fix that was made in Sentry Native... I expect something similar would have to be done to address in the Cocoa SDK (we could pass in a "Handler Strategy" enum/bool via the options when initialising the Cocoa SDK).

@philipphofmann
Copy link
Member

If getsentry/sentry-native#1027 fixes the problem, it seems like doing the same on Cocoa makes sense. @supervacuus, as you already implemented this in the native SDK, could you also add this to the Cocoa SDK? SentryCrash is a bit different from sentry-native, but it's still signal handlers.

@supervacuus
Copy link

If getsentry/sentry-native#1027 fixes the problem, it seems like doing the same on Cocoa makes sense. @supervacuus, as you already implemented this in the native SDK, could you also add this to the Cocoa SDK? SentryCrash is a bit different from sentry-native, but it's still signal handlers.

Sure, I can have a look. I could imagine this to be a bit more involved, since EXC_BAD_ACCESS is a mach exception and not a signal. AFAICR, the .net runtime uses both on mach-based OSes, but I can't remember any details. It might be necessary to adapt both, the SentryCrash signal- and mach exception-port handler. Even more importantly: figuring out how they interact (beyond their normal sequencing).

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 3, 2024
@getsantry getsantry bot removed the status in GitHub Issues with 👀 3 Dec 4, 2024
@kahest kahest moved this from Needs Discussion to Backlog in Mobile & Cross Platform SDK Dec 5, 2024
@kahest kahest added the Bug Something isn't working label Dec 5, 2024
@MichaelLHerman
Copy link

MichaelLHerman commented Dec 19, 2024

This is very concerning. We are relying on the crash rate to accurately determine the health of a release. If this is reporting false positives, this could be causing my team to currently be working on an unnecessary hotfix (when we could be fixing the handled exceptions in a future normal release)

Any plans for a fix?

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 Dec 19, 2024
@jamescrosswell
Copy link
Collaborator

Any plans for a fix?

We got a report today that suggests the "fix" we had prepared for a similar problem in Android did not work... so I don't think we can copy/paste the solution we used there. This needs some investigation. We'll try to fix the problem on Android first and then circle back to this iOS issue.

Apologies for the delay. We're a bit short handed on the .NET SDK at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: No status
Status: No status
Status: Backlog
Development

No branches or pull requests

8 participants