Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add SessionIdEventSender and wire up into lifecycle #571

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

breedx-splk
Copy link
Contributor

This is a follow-up from #564. It reintroduces the SessionIdEventSender class and wires it up at build() time.

I know there's some hesitation to put "opinionated" session behavior into the core module. I'm open to discussion and ideas here, but I do think that:

  • It's important that the initial Session be created as early as OpenTelemetryRum can allow. Right now that happens with the builder. This is important for several reasons, not the least of which is that we want telemetry to be associated with the current session.
  • It's important to generate the session.start event as close to the session actually being started/created.
  • The session.start event can't be created until after the otel sdk is created.

@breedx-splk breedx-splk requested a review from a team September 3, 2024 23:59
.get(OpenTelemetryRum::class.java.simpleName)

sessionManager.addObserver(SessionIdEventSender(eventLogger))
// After addObserver(), we call getSessionId() to trigger a session.start event
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit forced. I'm guessing it will happen naturally when the first telemetry event is recorded, is there a reason why we must start the session earlier?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably needed if we want a session to be active as early as possible.
eg SDK is initiated during the Application#onCreate class.
If the session starts (automatically) when the OS gets a notification that the app is in the foreground (using androidx.lifecycle), that might be too late because the whole Application#onCreate code has been run, maybe some telemetry data was collected, but there was not an active session yet, in this case, makes sense that the SDK starts as session ASAP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if we must do it earlier than the first telemetry event, which I guess would depend on the definition of a session, for example, if the definition ends up telling something around the lines of "the session starts when the app is launched", then I think the initialization of the SDK is too late to do so as well. It reminds me of the app launch timer that was recently mentioned here, where we might have to come up with some other mechanisms to track the start time temporarily until the Otel instance is ready.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe some telemetry data was collected, but there was not an active session yet

The session is queried and applied in the processors, so it will start as soon as the first telemetry data is collected. We do have an exception though where we collect events before the SDK is finished initializing which is here in the startup instrumentation where we store events in memory until those can be sent after the SDK is ready. It's probably worth considering a similar solution for this case in case we really need to trigger the event before the first telemetry data is collected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too think the first session should start as early as possible, but if the OTel SDK isn't started, can we even do anything with it? If we simply initialize it lazily when an instance of the processor is created, it should ensure that every piece of OTel telemetry recorded will have a session ID?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we simply initialize it lazily when an instance of the processor is created, it should ensure that every piece of OTel telemetry recorded will have a session ID?

The session is started when getSessionId() is called for the first time. It happens as soon as the first span is started or the first log is emitted, so there's no way that a signal is created without a session id. So it seems to me that we don't need to force it unless we must create a session before any telemetry is sent, like when the app is created for example, in which case this wouldn't be a good place to do so either.

I'd prefer to avoid adding behaviors to core unless it's a must for everyone, so it seems to me that the agent would be a nice place for this kind of behavior. I'm just back from PTO so I was planning to continue the work on this prototype as it seems like the overall idea has gotten thumbs-up, probably during the same work or right after I can add this session event sender behavior to the agent, wdyt @breedx-splk ?


sessionManager.addObserver(SessionIdEventSender(eventLogger))
// After addObserver(), we call getSessionId() to trigger a session.start event
sessionManager.getSessionId()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the sessionManager have a start method instead for this use case so it's semantically correct?

@LikeTheSalad
Copy link
Contributor

Regarding what I mentioned in yesterday's meeting about trying to make core as plain as possible, just to expand on it, it's mostly because I feel like that way it'd be easier for both us and potential users with very specific requirements (vendors included) to address everybody's needs. Now, I think we can make some exceptions though, especially for things that we know are a must for everybody, such as disk buffering and/or adding certain Android environment attributes to all telemetry by default. However, when it comes to generating signals (spans, logs, and metrics) automatically I feel like that's where I find it difficult to impose opinions in core because if for whatever reason, given this PR as an example, some vendor or user doesn't use those events later within their oblt dashboard or the like and might also want to save disk space whenever possible, then having these events triggered automatically without a possibility to turning them off could be a problem, for them and later for us too.

Based on the above, if you think that this functionality is a must for everyone and you're willing to argue with any potential user/vendor who might not want it, it honestly sounds a bit exhausting to me 😅 but I guess that's fair. However, before merging this PR I think it might be worth taking a look at this "layered structure" proposal for the repo which might make it easier to add our opinions, without having to enforce them onto power users nor having to create a ton of config flags either to address all kinds of use cases. If we go with such a structure (or a similar one that has a low-level layer and high-level one) then we might be able to have our cake and eat it too, as we'd still add all our "opinions" in the higher-level place and leave the low-level one for our more opinionated users.

@bidetofevil
Copy link
Contributor

Seems like we should have a list of features/instrumentations defined as "must haves" that cannot be opt out of, and another set that are default on, and another that is default off. Do we have a list like that so we can determine where each feature belongs to, so we can better figure out where initialization and configuration should be happening?

@breedx-splk breedx-splk requested a review from a team as a code owner November 21, 2024 00:22
@breedx-splk
Copy link
Contributor Author

To address the concerns around "Not every vendor may want these events" critique, I have moved the session id event generation into an instrumentation module. This module is NOT included (yet) by the android-agent module, so by default the events will not be generated.

In order to make this work, the instrumentation needs to access the SessionManager so that it can register its observer. The SessionManager was previously an internal concrete class, so I renamed it to SessionManagerImpl and extracted the tag interface SessionManager which now just extends SessionProvider and SessionPublisher...and this was added to the InstallationContext for any future instrumentations that might need to muck with sessions. It becomes apparent now too that the session package could even likely be its own module (not sure how necessary tho).

I did include the session event instrumentation explicitly in the demo app, so that we could verify that events can still be generated. I think this also raises an interesting point about being able to disable instrumentations, either programmatically or via config, just like we do with java auto-instrumentation.

@open-telemetry/android-approvers @open-telemetry/android-maintainers Please have another look.

@breedx-splk
Copy link
Contributor Author

Ok sorry wait, this has gotten too spread out. I'm going to move this back to draft and try and backfill with a series of smaller commits.

@breedx-splk breedx-splk marked this pull request as draft December 11, 2024 02:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants