Add static analysis to CI to determine redundant seccomp rules #4924

roypat · 2024-11-22T17:00:21Z

Use static analysis to determine the set of syscalls (+ arguments) that are actually present in the Firecracker binary, and cross-reference this with the seccomp filters, to determine redundant seccomp rules. Additionally, remove redundant seccomp rules identified by this process. Please see the individual commit messages for details.

Sample output:

FAILED integration_tests/build/test_seccomp_no_redundant_rules.py::test_redundant_seccomp_rules - AssertionError: Found redundant seccomp rules! [('uname', {})]

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

I have read and understand CONTRIBUTING.md.
I have run tools/devtool checkstyle to verify that the PR passes the
automated style checks.
I have described what is done in these changes, why they are needed, and
how they are solving the problem in a clear and encompassing way.
I have updated any relevant documentation (both in code and in the docs)
in the PR.
I have mentioned all user-facing changes in CHANGELOG.md.
If a specific issue led to this PR, this PR closes the issue.
When making API changes, I have followed the
Runbook for Firecracker API changes.
I have tested all new and changed functionalities in unit tests and/or
integration tests.
I have linked an issue to every new TODO.

This functionality cannot be added in rust-vmm.

codecov · 2024-11-22T17:14:40Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.98%. Comparing base (0bee970) to head (873e7bf).
Report is 3 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4924   +/-   ##
=======================================
  Coverage   83.98%   83.98%           
=======================================
  Files         251      251           
  Lines       27889    27889           
=======================================
  Hits        23422    23422           
  Misses       4467     4467

Flag	Coverage Δ
5.10-c5n.metal	`84.55% <ø> (ø)`
5.10-m5n.metal	`84.53% <ø> (-0.01%)`	⬇️
5.10-m6a.metal	`83.82% <ø> (-0.01%)`	⬇️
5.10-m6g.metal	`80.67% <ø> (-0.01%)`	⬇️
5.10-m6i.metal	`84.52% <ø> (ø)`
5.10-m7g.metal	`80.68% <ø> (ø)`
6.1-c5n.metal	`84.55% <ø> (ø)`
6.1-m5n.metal	`84.53% <ø> (ø)`
6.1-m6a.metal	`83.82% <ø> (-0.01%)`	⬇️
6.1-m6g.metal	`80.68% <ø> (ø)`
6.1-m6i.metal	`84.52% <ø> (-0.01%)`	⬇️
6.1-m7g.metal	`80.68% <ø> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

roypat · 2024-11-25T15:06:40Z

Mhh, it works on ARM with the newest nightly, maybe related to rust-lang issue 125619? maybe need to part this one for a while then

tests/framework/static_analysis.py

tests/integration_tests/build/test_seccomp_no_redundant_rules.py

tests/framework/static_analysis.py

tools/devctr/Dockerfile

tests/framework/static_analysis.py

bchalios

This looks awesome, congrats! I did a first pass (I want to do a second one later today), but so far just a few nits.

I just have one high-level question, so far. How confident are we that we won't get false positives, especially in the future? Do you see a way that the way system calls are compiled changes in the future and forces us to drop a system call erroneously?

tests/framework/static_analysis.py

roypat · 2024-12-16T12:09:34Z

I just have one high-level question, so far. How confident are we that we won't get false positives, especially in the future? Do you see a way that the way system calls are compiled changes in the future and forces us to drop a system call erroneously?

Yeah, it's definitely a possibility. For example, if the compiler starts emitting a different instruction for register-to-register transfers that the script doesn't support, it'd need to be updated to handle this new scenario. It kinda ties in to what Nikita asked earlier: It'd be very easy to define an allow list inside the test if we ever have known false-positives. Although I guess if the churn becomes too much (e.g. regular false-positives and the updates to keep this working get compiled - e.g. the current state is just us getting "lucky" with the compiler's choices), we might have to declare this approach failed :/

(as an aside, if it runs into something it doesn't understand, it prints a warning, so if the test fails and gives a list of redundant calls, but there's a ton of warnings in the logs, then it's a high chance its a false-positive)

bchalios · 2024-12-16T12:19:32Z

Follow up question. How easy/hard is it to figure out when did we remove the system calls that the tool found as redundant and run the tool on the commit before that?

roypat · 2024-12-16T12:25:11Z

Follow up question. How easy/hard is it to figure out when did we remove the system calls that the tool found as redundant and run the tool on the commit before that?

If you do ./tools/devtool test -- -k redundant on the commit before the seccomp filter update commit of this PR (e.g. on 91c3404), it'll fail with the message from the PR description. Tested that earlier today :) (it won't print out all the ioctls because those were determined to be unneeded via manual inspection/ad-hoc setups, but it'll find uname on x86 and should find the vcpu_events ioctls on arm)

tests/framework/static_analysis.py

Since Firecracker uses the musl target, it is statically compiled against all of its dependencies. This means that all syscalls that Firecracker can possibly do, are represented by `syscall`/`svc #0x0` instructions inside its binary. By doing some primitive static analysis, it is possible to determine, for each of these instructions, what the actual syscall number is (the idea being: it will probably be a constant loaded into a specific register, determined by Linux's syscall convention, very near the instruction itself). On a best-effort basis, we can also try to determine the values of all registers that might holds syscall arguments (for things like ioctls, this works very well, because the ioctl number is almost always a constant loaded very closely to the syscall instruction). The static_analysis.py file implements the logic for this analysis. We essentially start at each syscall instruction, and then work through the preceding instructions in reverse order, trying to symbolically execute the program in reverse until we figure out the value of the register that holds the syscall number. Thankfully, because of the above mentioned locally, this means we only need to really understand a handful of instructions (based on the assumption that we can ignore instructions that do not explicitly refer to registers that we care about). For example, on x86_64, the syscall number is stored in the eax register, so if we encounter the following instruction sequence xor %ebx,%ebx mov %ebx,%eax syscall we first symbolically execute the "mov", and determine that to figure out the syscall number, we actually need to determine the value of the ebx register. So we go back one more instruction, and find out that %ebx is explicitly set to 0. Thus, forward propagating this 0 to the syscall instruction again, we find that we're doing a `read` syscall. Things get a bit more complicated on ARM, where we don't have to only care about register-to-register transfers, but also things like `add` instructions. For these, we must also keep a "log" of manipulations to apply during forward propagation. For example, on ARM the syscall number is passed in the w8 register, so in the following sequence mov w19, #0x6 add w8, w19, #0x1 svc #0x0 we determine that to compute the syscall number, we must first determine the value of the w19 register, but also we must remember that when we forward propagate the value 0x6 from the w19 register, we must add 1 to it when propagating into the w8 register. Sometimes, we additionally have to deal with wrapper functions around syscalls (e.g. `libc::syscall`). For these, we need to translate the list of syscall number and syscall arguments that the wrapper received according to to the architectures C calling convention (cdecl on x86_64, w0-w7 for arguments on ARM) into what the wrapper will eventually pass to the syscall instruction. On x86_64 this comes with the complication that cdecl only allows passing 6 arguments in registers, but syscall nr + all arguments are up to 7 values. Thus, when a syscall is invoked via libc::syscall, we can only determine up to 5 syscall arguments (the final one would be on the stack, and stack analysis would significantly complicate this script). Lastly, we sometimes also have wrappers for specific syscalls (e.g. ioctl). Here, we do something very similar to the above, however since now the syscall number is "implicit" (e.g. not an argument to the wrapper function, but rather hard-coded into the wrapper), we no not actually run into problems with argument counts on x86_64. We implement this logic because it allows us to determine more arguments (for example, if we did not explicitly look at callsites of the ioctl wrapper, we would not be able to determine which ioctl numbers are actually invoked, greatly reducing the usefulness, as our seccomp filters are large ioctls). After determining all the syscalls, we then compare it to our seccomp filters. We look at all our allowed rules, and compute the set of rules that does not match _any_ of the actual syscalls invoked by the binary. If any such are found, we fail inside a new integration test. There are fairly big limitations to this approach. Because we're doing a global analysis of the firecracker binary, we cannot differentiate between syscalls on different threads, and also not between syscalls done before and after actual application of seccomp filters. For this, we would need some sort of control flow graph analysis, which is made significantly harder due to the presence of dynamic dispatch. Nevertheless, this simple approach already showed 4 redundant syscall rules (and playing around with applying the script to reduced compilation units showed a lot more unneeded rules for the vcpu thread). For all of this to work, we need to compile Firecracker as a non-position independent executable (to get the compiler to general direct calls with absolute addresses, instead of instruction pointer relative ones). Since all dependencies also need to be compiled this way, we have to use cargo's unstable `-Zbuild-std` feature, which thus means we need a nightly toolchain. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>

The nightly toolchain in our devctr is too old to successfully compile non-PIE, statically linked aarch64 executable. I have manually verified newer nightlies to work (2024-11-16 and newer), however our nightly toolchain is dictated by kani, so we need to wait for kani to release a new version that updates past this toolchain. Therefore, temporarily disable the test on aarch64. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>

These have been determined by static analysis of a Firecracker binary (see also follow up commits): The removed seccomp rules here trigger syscalls that are either not present at all in the entire Firecracker binary, or are not reachable from the entry point of a specific Firecracker thread (this analysis has only been done for the vcpu thread for now, due to being fairly tricky). Some explanations for why some of these entries are no longer needed can be found below - The `uname` syscall was used back when we supported 4.14, to query the host kernel version and disable specific Firecracker features that were not supported pre-5.10 (io_uring and hugepages for memfd). With 4.14 support dropped, there are no such checks anymore. - Various KVM_SET_* ioctls do not need to be allowed on the vcpu thread, because they are all called _before_ the vcpu seccomp filters are installed (as they are only used during snapshot restore / when preparing KVM state for boot). - on aarch64, we can additionally remove KVM_{SET,GET}VCPU_EVENTS, as we never call this ioctl on arm (only on x86) Signed-off-by: Patrick Roy <roypat@amazon.co.uk>

roypat added the Status: Awaiting review label Nov 22, 2024

roypat force-pushed the seccomp-cleanup branch 3 times, most recently from 25ba8a0 to 13f4cf4 Compare November 25, 2024 14:08

roypat added Status: Parked and removed Status: Awaiting review labels Nov 25, 2024

roypat marked this pull request as ready for review November 25, 2024 15:06

ShadowCurse reviewed Nov 25, 2024

View reviewed changes

tests/framework/static_analysis.py Outdated Show resolved Hide resolved

ShadowCurse reviewed Nov 25, 2024

View reviewed changes

tests/framework/static_analysis.py Show resolved Hide resolved

roypat added Priority: High Status: Blocked Status: WIP and removed Status: Parked labels Nov 27, 2024

Manciukic removed Priority: High Status: WIP labels Nov 27, 2024

kalyazin reviewed Nov 27, 2024

View reviewed changes

tests/integration_tests/build/test_seccomp_no_redundant_rules.py Show resolved Hide resolved

roypat force-pushed the seccomp-cleanup branch 3 times, most recently from 6136bc9 to eeb678f Compare December 3, 2024 15:41

roypat changed the title ~~[RFC] Add static analysis to CI to determine redundant seccomp rules~~ Add static analysis to CI to determine redundant seccomp rules Dec 3, 2024

roypat force-pushed the seccomp-cleanup branch from eeb678f to 3adafa9 Compare December 3, 2024 15:53

roypat added Status: Awaiting review and removed Status: Blocked labels Dec 3, 2024

pb8o reviewed Dec 9, 2024

View reviewed changes

roypat mentioned this pull request Dec 10, 2024

Rust update to 1.83 #4935

Merged

10 tasks

roypat force-pushed the seccomp-cleanup branch from 6c51473 to 8379c49 Compare December 10, 2024 17:02

roypat requested review from xmarcalx and Manciukic as code owners December 10, 2024 17:02

roypat force-pushed the seccomp-cleanup branch 2 times, most recently from a1f5447 to 35b495c Compare December 11, 2024 07:39

pb8o reviewed Dec 11, 2024

View reviewed changes

tests/framework/static_analysis.py Outdated Show resolved Hide resolved

roypat force-pushed the seccomp-cleanup branch 2 times, most recently from fa67b7c to 73993d8 Compare December 16, 2024 10:15

roypat requested review from pb8o and bchalios December 16, 2024 10:17

bchalios reviewed Dec 16, 2024

View reviewed changes

tests/framework/static_analysis.py Show resolved Hide resolved

tests/framework/static_analysis.py Show resolved Hide resolved

tests/framework/static_analysis.py Outdated Show resolved Hide resolved

tests/framework/static_analysis.py Outdated Show resolved Hide resolved

roypat force-pushed the seccomp-cleanup branch from 6d602bf to 0429634 Compare December 16, 2024 12:05

pb8o previously approved these changes Dec 16, 2024

View reviewed changes

tests/framework/static_analysis.py Show resolved Hide resolved

tests/framework/static_analysis.py Show resolved Hide resolved

roypat added 3 commits December 16, 2024 13:56

roypat dismissed pb8o’s stale review via dc21eef December 16, 2024 13:56

roypat force-pushed the seccomp-cleanup branch from a252c57 to dc21eef Compare December 16, 2024 13:56

Merge branch 'main' into seccomp-cleanup

873e7bf

bchalios approved these changes Dec 16, 2024

View reviewed changes

pb8o approved these changes Dec 16, 2024

View reviewed changes

roypat merged commit 35502a5 into firecracker-microvm:main Dec 16, 2024
6 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add static analysis to CI to determine redundant seccomp rules #4924

Add static analysis to CI to determine redundant seccomp rules #4924

roypat commented Nov 22, 2024 •

edited

Loading

codecov bot commented Nov 22, 2024 •

edited

Loading

roypat commented Nov 25, 2024

bchalios left a comment •

edited

Loading

roypat commented Dec 16, 2024 •

edited

Loading

bchalios commented Dec 16, 2024

roypat commented Dec 16, 2024

Add static analysis to CI to determine redundant seccomp rules #4924

Add static analysis to CI to determine redundant seccomp rules #4924

Conversation

roypat commented Nov 22, 2024 • edited Loading

License Acceptance

PR Checklist

codecov bot commented Nov 22, 2024 • edited Loading

Codecov Report

roypat commented Nov 25, 2024

bchalios left a comment • edited Loading

Choose a reason for hiding this comment

roypat commented Dec 16, 2024 • edited Loading

bchalios commented Dec 16, 2024

roypat commented Dec 16, 2024

roypat commented Nov 22, 2024 •

edited

Loading

codecov bot commented Nov 22, 2024 •

edited

Loading

bchalios left a comment •

edited

Loading

roypat commented Dec 16, 2024 •

edited

Loading