refactor sidecar/interface.rs into smaller files #395

ekump · 2024-04-17T15:08:16Z

What does this PR do?

Break up the various types and implementations in sidecar/interface.rs into separate files within a sidecar::service module. (I'm very open to a better name than "service").
Introduce some tests and some documentation. I only added some tests that didn't require much mocking or setup. APMSP-1079 has been created to track a more comprehensive unit testing solution.
Fixes some minor bugs uncovered through addition of tests.
Miscellaneous small fixes for typos, cleanup, and Rust idiomaticness.
As I am fairly new to the sidecar I made my best guesses at rustdoc comments, tests, and naming. Please call out any errors.

Motivation

Prep work for introducing retry logic sending traces via the sidecar

Additional Notes

This is a a fairly disruptive refactor. If there are non-critical things we can address in follow up PRs I would prefer to do that versus keeping this open for a long time. Obviously, any critical issues should be addressed ASAP.

How to test the change?

Unit tests are being added, but some sort of integration test with PHP is probably a good idea.

For Reviewers

If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
This PR doesn't touch any of that.

ekump · 2024-04-17T22:04:25Z

sidecar/src/service/session_info.rs

+    ///     session_info.shutdown_running_instances().await;
+    /// }
+    /// ```
+    pub async fn shutdown_running_instances(&self) {


Calling this function out for review as it has not only moved, but changed. The previous implementation was cloning the runtimes and never actually shutting them down. This implementation drains the runtimes hashmap and then shuts them down.

I doubt this is the optimal implementation, but it at least works as advertised now.

ekump · 2024-04-17T22:08:56Z

sidecar/src/service/session_info.rs

+    ///     session_info.shutdown_runtime(&"runtime1".to_string()).await;
+    /// }
+    /// ```
+    pub async fn shutdown_runtime(&self, runtime_id: &String) {


Calling this function out as there is a minor change (in addition to being moved). This was the only public function that took self instead of &self. This meant that after shutting down a single runtime the session_info object was moved and you couldn't do anything else with it. I'm not that familiar with how SessionInfo should be used, but since it supports multiple runtimes it should probably be usable after a single one is shutdown.

It didn't matter, because it implements Clone and all the relevant data within is Arc. But yes, using &self is correct.

bwoebi · 2024-05-02T14:01:25Z

sidecar/src/service/sidecar_server.rs

+// TODO: APM-1076 - This file contains a fair amount of expects. While in most cases it is unlikely
+// we will ever hit these, we should consider adding more robust error handling.


I think in these cases the error handling we have is quite fine. In fact, the error handling is tokio catching panics in async tasks and transforming the tasks into an error Result, which then end up logged.
The important part is that these panics on expect() don't bring the application into a deadlocked state where e.g. no further traces are processed at all.
In general all these expects() should actually never be hit and it's a coding mistake. I'm not a big proponent of having a proper error handling code for things which are not supposed to be possible (unless the code itself is broken).

On top of that a crash of the whole sidecar, while a minor data loss, is not catastrophic either as it will be restarted by the PHP processes quickly.

@bwoebi

I'm not a big proponent of having a proper error handling code for things which are not supposed to be possible

I tend to agree as long as 1) we are confident that these errors should never happen and 2) If 1) is wrong, we don't put the application in a bad state. Your comment indicates that we satisfy both these points so I'm ok with leaving these expects as is. I can remove the TODO.

pierotibou · 2024-05-02T14:41:52Z

sidecar/src/service/session_info.rs

+use crate::tracer;
+
+use crate::service::InstanceId;
+/// `SessionInfo` holds information about a session.


Explaining what are sessions and runtimes here would be neat.

The very dumb answer is: a session is identified by a session id, and a runtime by a runtime id :-P
Now, what exactly these are depends a bit on the language, but session ids are supposed to be shared for the main process and all it's forks while runtimes are mostly per process.
In context of PHP it's: one session+runtime id tuple = one sidecar connection.

@bwoebi Thanks for the clarification. I can update the doc comment to reflect this better.

sidecar/src/service/sidecar_server.rs

codecov-commenter · 2024-05-03T20:57:39Z

Codecov Report

Attention: Patch coverage is 27.99130% with 993 lines in your changes are missing coverage. Please review.

Project coverage is 64.83%. Comparing base (5a69bdb) to head (64b271d).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #395      +/-   ##
==========================================
+ Coverage   64.08%   64.83%   +0.74%     
==========================================
  Files         169      183      +14     
  Lines       22073    22359     +286     
==========================================
+ Hits        14146    14496     +350     
+ Misses       7927     7863      -64

Components	Coverage Δ
crashtracker	`20.37% <ø> (ø)`
data-pipeline	`51.45% <ø> (ø)`
data-pipeline-ffi	`0.00% <ø> (ø)`
ddcommon	`81.23% <ø> (ø)`
ddcommon-ffi	`74.93% <ø> (ø)`
ddtelemetry	`53.72% <ø> (ø)`
ipc	`79.30% <ø> (ø)`
profiling	`77.05% <ø> (ø)`
profiling-ffi	`60.09% <ø> (ø)`
serverless	`0.00% <ø> (ø)`
sidecar	`28.66% <27.99%> (+10.67%)`	⬆️
sidecar-ffi	`0.00% <0.00%> (ø)`
spawn-worker	`50.49% <ø> (ø)`
trace-mini-agent	`69.12% <ø> (ø)`
trace-normalization	`97.79% <ø> (ø)`
trace-obfuscation	`95.74% <ø> (ø)`
trace-protobuf	`25.64% <ø> (ø)`
trace-utils	`68.85% <ø> (+0.15%)`	⬆️

sidecar/src/service/tracing/trace_flusher.rs

bwoebi · 2024-05-07T19:38:26Z

sidecar/src/service/runtime_info.rs

+
+    // TODO-EK: Reduce the type complexity before merging
+    #[allow(clippy::type_complexity)]
+    // TODO-EK - Investigate why this needs to be exposed this way before merging


The goal is simply simplifying the code a bit whether every access is multiple lines, see #392 (comment) for context.

Reducing the type complexity here sort of defeats the point of just being a simple helper function.

@bwoebi your comments are from an outdated version of this file, so I'm not sure if still applies.

The complexity isn't due to the MutexGuard, it's due to the app HashMap type of HashMap<(String, String), Shared<ManualFuture<Option<AppInstance>>>>. I created the type AppMap and use that throughout RuntimeInfo and was able to remove the clippy ignore.

I think we should strive to ignore linter errors only when it is absolutely necessary. I can't speak for @bantonsson on #392 (comment), but I would argue that a helper function isn't all that helpful if you have to suppress valid linter errors. The current implementation resolves the clippy error and maintains the helpfulness Bjorn was suggesting...at least in my opinion.

I added a TODO (with a jira ticket) to just investigate if it makes sense / is possible to not mutably expose the hashmap at a later time. In the past, I've found that this pattern can lead to problems but I understand there may be no reasonable way around it.

To be honest, if it were me, I would just disable the type complexity clippy error for the whole repository. I don't think that's a good lint.
Sometimes types are simply complex and using a type is basically just a workaround and requiring you to search for the type definition to just see the actual type.

sidecar/src/service/runtime_info.rs

bwoebi · 2024-05-07T19:45:24Z

sidecar/src/service/runtime_info.rs

+// TODO-EK: Can this be refactored more before merging? We may be able to encapsulate some of the
+// functionality in session_info to RuntimeInfo.


Like for example?

This TODO was just a reminder for myself while the PR was in draft. At the moment, I don't think this is possible in the short-term and isn't really blocking progress. I created a jira ticket and updated the TODO comment to reflect it.

sidecar/src/service/mod.rs

ekump · 2024-05-08T17:53:21Z

Is there a reason why you’ve aggressively put pub(crate) on everything instead of simply pub within the individual rust files?
Looks easier to read if it's just pub.

And I assume it’s enough to just make the modules then pub(crate) in the mod.rs files? What you’re anyway already doing.
Feels cleaner, and gives the ability to just toggle the visibility of a whole module from pub(crate) to pub when needed instead of touching every single codeline in that mod.

@bwoebi I did this for two reasons:

Consistency and safety (not rust "safety"). There are situations where we don't want to mark an entire file as pub(crate) but we do mix exposing certain functions, types, or struct members as either pub(crate) or just pub. In those cases we have no choice but to be verbose in the file. It seems awkward to me that sometimes we set pub(crate) on just the mod.rs level and sometimes we do it at the implementation. I also don't find it hard to read and I like knowing the access level with the implementation details. But if the consensus is that this is superfluous I can revert this in another PR.
It's related to reason 1, but I generally believe that we need to lock down the public surface area of the library as much as possible. We should only expose things publicly that we actually want public, and we're willing to support long-term and properly document. At the moment, I don't think we have any tools in place to help us not "leak" things we don't want public. Until we have something to do this for us (via static analysis or testing or something else) I'd prefer to err on the side of caution.

Your assumption is correct. Functionally, we don't need to be this verbose. If there is a general consensus against this, I can revert in a follow-up PR.

interface MSRV is greater than 1.60, so we can remove this allow.

Also, add basic test and doc comments

And add tests and doc comments

And rename to RuntimeMetadata, add basic doc comments and tests.

* Fix typo for intitial_acitons input paramater to get_app * Reorder impl to match trait order * remove redundant prefixes

… file Also, The From SerializedTracerHeaderTags and From TracerHeaderTags trait impls were changed to try_into trait impls as it is possible (however unlikely) that the code within the the trait impls could return errors and it is preferable to let the caller decide how to handle those errors rather than unwrap a Result and potentially panic.

and add doc comments

Also, add rustdoc comments and tests. Uncovered a bug in the shtudown_running_instances function where it never shutdown the running instances.

Refactor into separate function. Also, replace unwraps() with expects().

… lock scope

Moving from interface.rs to separate files

…ts to separate files

interface.rs has been refactored in to a "service" module within sidecar. During refactor the access level of some functions and types was increased while they were being moved around. This should "fix" the access level to the most restrictive possible.

After refactor of sidecar::interface.rs into multiple files AppOrQueue belongs in the telemetry namespace and SidecarStats belongs closer to the code it is generating stats for in SidecarServer.

github-actions bot added the sidecar label Apr 17, 2024

ekump commented Apr 17, 2024

View reviewed changes

ekump force-pushed the ekump/refactor-sidecar-interface-pt1 branch from a499487 to b658e1c Compare April 17, 2024 22:36

ekump changed the title ~~Refactor sidecar/interface.rs into smaller files~~ WIP - refactor sidecar/interface.rs into smaller files Apr 18, 2024

ekump force-pushed the ekump/refactor-sidecar-interface-pt1 branch from ed4b03e to d2cf389 Compare April 30, 2024 17:11

bwoebi reviewed May 2, 2024

View reviewed changes

pierotibou reviewed May 2, 2024

View reviewed changes

bwoebi reviewed May 2, 2024

View reviewed changes

sidecar/src/service/sidecar_server.rs Outdated Show resolved Hide resolved

ekump force-pushed the ekump/refactor-sidecar-interface-pt1 branch from 67807b8 to 4af0b0d Compare May 3, 2024 20:50

ekump force-pushed the ekump/refactor-sidecar-interface-pt1 branch 5 times, most recently from d25e068 to d35bbe5 Compare May 6, 2024 20:47

ekump commented May 6, 2024

View reviewed changes

sidecar/src/service/tracing/trace_flusher.rs Outdated Show resolved Hide resolved

ekump force-pushed the ekump/refactor-sidecar-interface-pt1 branch 3 times, most recently from 882a4c9 to 19f3249 Compare May 7, 2024 18:43

bwoebi reviewed May 7, 2024

View reviewed changes

sidecar/src/service/runtime_info.rs Outdated Show resolved Hide resolved

bwoebi reviewed May 7, 2024

View reviewed changes

ekump force-pushed the ekump/refactor-sidecar-interface-pt1 branch 3 times, most recently from 6e3d0c4 to 13e8c7f Compare May 7, 2024 22:21

ekump marked this pull request as ready for review May 7, 2024 22:24

ekump requested a review from a team as a code owner May 7, 2024 22:24

ekump changed the title ~~WIP - refactor sidecar/interface.rs into smaller files~~ refactor sidecar/interface.rs into smaller files May 8, 2024

bwoebi reviewed May 8, 2024

View reviewed changes

sidecar/src/service/mod.rs Outdated Show resolved Hide resolved

ekump force-pushed the ekump/refactor-sidecar-interface-pt1 branch 2 times, most recently from ceeddc3 to 2da5a21 Compare May 8, 2024 15:29

ekump added 24 commits May 8, 2024 13:53

remove unnecessary clippy allow for needless collect in sidecar

0231400

interface MSRV is greater than 1.60, so we can remove this allow.

refactor Sidecar::Interface::InstanceId to separate file

be066c5

Also, add basic test and doc comments

refactor Sidecar::Interface::QueueId to a separate file.

f9d170a

And add tests and doc comments

refactor sidecar::interface::RuntimeMeta to separate file

f66cefc

And rename to RuntimeMetadata, add basic doc comments and tests.

minor cleanup of sidecar::interface file.

e0da65e

* Fix typo for intitial_acitons input paramater to get_app * Reorder impl to match trait order * remove redundant prefixes

refactor RequestIdentifier and RequqestIdentification to separate file

dcda29e

and add doc comments

refactor sidecar_interface from interface.rs to separate file

43dff04

Refactor sidecar::interface::SessionInfo to its own file.

6bc665b

Also, add rustdoc comments and tests. Uncovered a bug in the shtudown_running_instances function where it never shutdown the running instances.

refactor Sidecar::SidecarServer to separate file

e3920e4

extract session interceptor into separate function in sidecar_server

7f37365

sidecar server - move logic for processing of interceptor response

884ff38

Refactor into separate function. Also, replace unwraps() with expects().

Add rustdoc comments for public sidecar_server methods

55463dd

Minor tweak to sidecar session_info::shutdown_runtime to reduce mutex…

f011497

… lock scope

refactor sidecar RuntimeInfo from interface.rs to separate file

0532d2d

refactor complex type in sidecar RuntimeInfo

d818098

Sidecar - move SessionConfig, SidecarAction, and ApporQueue structs

c06a39a

Moving from interface.rs to separate files

refactor sidecar::AppInstance in to separate file

7ed1eb4

sidecar - refactor enqueued_telemetry_data and enqueued_telemetry_sta…

10ca393

…ts to separate files

sidecar - refactor tracing logic in interface.rs to separate files

e61995a

refactor SidecarStats to telemetry namespace

b675f04

Move blocking sidecar_interface to separate file

f8c54d4

Move SidecarStats and AppOrQueue structs to more appropriate locations.

64b271d

After refactor of sidecar::interface.rs into multiple files AppOrQueue belongs in the telemetry namespace and SidecarStats belongs closer to the code it is generating stats for in SidecarServer.

ekump force-pushed the ekump/refactor-sidecar-interface-pt1 branch from 382cb18 to 64b271d Compare May 8, 2024 17:53

ekump merged commit d3f8544 into main May 8, 2024
20 checks passed

ekump deleted the ekump/refactor-sidecar-interface-pt1 branch May 8, 2024 18:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor sidecar/interface.rs into smaller files #395

refactor sidecar/interface.rs into smaller files #395

ekump commented Apr 17, 2024 •

edited

Loading

ekump Apr 17, 2024

ekump Apr 17, 2024 •

edited

Loading

bwoebi May 8, 2024

bwoebi May 2, 2024 •

edited

Loading

ekump May 8, 2024

pierotibou May 2, 2024

bwoebi May 2, 2024

ekump May 8, 2024

codecov-commenter commented May 3, 2024 •

edited

Loading

bwoebi May 7, 2024

bwoebi May 7, 2024

ekump May 8, 2024

bwoebi May 8, 2024 •

edited

Loading

bwoebi May 7, 2024

ekump May 8, 2024

ekump commented May 8, 2024

		// TODO: APM-1076 - This file contains a fair amount of expects. While in most cases it is unlikely
		// we will ever hit these, we should consider adding more robust error handling.

		// TODO-EK: Can this be refactored more before merging? We may be able to encapsulate some of the
		// functionality in session_info to RuntimeInfo.

refactor sidecar/interface.rs into smaller files #395

refactor sidecar/interface.rs into smaller files #395

Conversation

ekump commented Apr 17, 2024 • edited Loading

What does this PR do?

Motivation

Additional Notes

How to test the change?

For Reviewers

Choose a reason for hiding this comment

ekump Apr 17, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bwoebi May 2, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-commenter commented May 3, 2024 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bwoebi May 8, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ekump commented May 8, 2024

ekump commented Apr 17, 2024 •

edited

Loading

ekump Apr 17, 2024 •

edited

Loading

bwoebi May 2, 2024 •

edited

Loading

codecov-commenter commented May 3, 2024 •

edited

Loading

bwoebi May 8, 2024 •

edited

Loading