generated from amosproj/amos202Xss0Y-projname
Actor Collector Pinning Refactor #159
Merged
Conversation
fhilgers force-pushed the pinning-refactor branch 2 times, most recently from 9b5008f to ab4dde8 on December 11, 2024 08:20
fhilgers requested review from luca-dot-sh, ffranzgitHub, Mr-Kanister, der-whity and BenediktZinn on December 11, 2024 08:20
This only removes it from the daemon. Removing it from the rest of our codebase is still open.

Signed-off-by: Felix Hilgers <[email protected]>
[`bytemuck`](https://crates.io/crates/bytemuck) is a convenient crate that provides conversions from raw bytes to a Rust struct. Derive macros like `AnyBitPattern` or `CheckedBitPattern` automatically verify the assumptions about Rust's type system and guarantee that the conversion is sound. This is much better than what we are currently doing:

```
let item: &JniCALL = unsafe { &*(data.as_ptr() as *const JniCALL) };
```

I am pretty sure that this is undefined behavior in the case of our eBPF maps. With bytemuck we can just write:

```
let item: &JniCall = bytemuck::from_bytes(&data);
```

Signed-off-by: Felix Hilgers <[email protected]>
This change is rather large. To summarize the problem at hand: Aya allows us to own maps via `take_map` but only allows us to borrow programs (`program`, `program_mut`). Owning maps has another problem: we can never put them back, so we have to keep them safe forever. I/O operations are inherently fallible, meaning that when we are reading from one of our maps, there is always the possibility of some unexpected error.

In our collectors and config handlers, each instance is responsible for its own eBPF maps, meaning there is no overlap:

```
VfsWriteCollector => VFS_WRITE_EVENTS
SysSendmsgCollector => SYS_SENDMSG_EVENTS
VfsWriteFeature => vfs_write, vfs_write_ret, VFS_WRITE_PIDS
```

That means that our collectors and handlers should own the maps and programs exclusively, both performance-wise (no locking) and semantically. This is

1. not possible with programs, because Aya forbids it, and
2. prone to losing maps.

In the case of a panic or error, we would have to try and bubble up our maps so that they are not lost. This leads to a much more complicated design.

eBPF maps and programs can be pinned to the file system. This is not only interesting for us in the future (hot restart without losing any data because eBPF is still collecting) but also interesting for us now. We can set the relevant maps to be pinned and load all eBPF programs on startup, pinning them to the eBPF file system. Afterwards we can use the pinned objects and construct our registry.

Through that we could in theory have many references to a single object. To forbid that and ensure exclusive access we can use AtomicCells. The crossbeam crate offers an `AtomicCell`, which is lock-free in certain circumstances and allows atomic operations on the data within. Using this type, together with an `Option`, a `Box` and a guard, we can construct a registry of all our eBPF programs and ensure every object is held by at most one instance. Through the guard we can ensure that when that instance fails and has to be recreated, the object is automatically restored into the registry.

Structure:

1. Our registry is a struct of `Arc` pointers to be cloned where needed. (We could also think about having that loaded as a single global instance, which would simplify the design even more.)
2. Each `Arc` pointer contains an `AtomicCell<Option<Box<T>>>`. We need to put `T` inside a `Box` to ensure the cell remains lock-free. `Option` is a null-pointer-optimized type, so it can be wrapped around regardless.
3. We pass that registry, or a part thereof, to one of our instances, like the `VfsWriteCollector`. The collector can then atomically swap the value inside the `AtomicCell` with `None` and gain exclusive access to `T`.
4. To ensure that none of our registry objects is lost, we put a guard around `T`. This guard holds a reference to the `Arc<AtomicCell<Option<Box<T>>>>`, which is `None` in that case. On drop it stores its value into the registry again.

Signed-off-by: Felix Hilgers <[email protected]>
In the client we have `127.0.0.1:50051`. In the server we have `[::1]:50051`. I have changed our client to also use `[::1]`, like everything else.

Signed-off-by: Felix Hilgers <[email protected]>
Signed-off-by: Felix Hilgers <[email protected]>
The actor model is a design pattern which fits our use case very well. Especially with the new registry, the pattern feels much better than baking it manually ourselves.

Signed-off-by: Felix Hilgers <[email protected]>
fhilgers force-pushed the pinning-refactor branch from ab4dde8 to 3ed8dbe on December 11, 2024 08:50
Mr-Kanister approved these changes Dec 11, 2024
der-whity reviewed Dec 11, 2024
der-whity approved these changes Dec 11, 2024
LGTM
I'll fix the
Closes #123