A/B-ification of tests that can fail due to external influences #4149
Codecov Report
All modified and coverable lines are covered by tests ✅
Coverage Diff (main vs. #4149)
Metric      main      #4149     +/-
Coverage    82.94%    82.92%    -0.03%
Files       221       221
Lines       28419     28419
Hits        23572     23566     -6
Misses      4847      4853      +6
Example output from the new |
Nit:
Change:
test: Propage ignore_return_code from utils.run_cmd to host_tools.cargo
to
test: Propagate `ignore_return_code` from `utils.run_cmd` to `host_tools.cargo`
Change:
test: A/B-ify test_cargo_audit
to:
test: A/B-ify `test_cargo_audit`
The same applies to all commits whose messages include code identifiers.
Running a bash command as an A/B-Test means testing that the output of the command is the same both before and after the PR. Signed-off-by: Patrick Roy <[email protected]>
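A minimal sketch of that idea, assuming the two revisions are already checked out into separate directories. The helper names `run_in_dir` and `outputs_match` are hypothetical and are not the functions added by this PR; only the standard-library `subprocess` module is relied on.

```python
import subprocess


def run_in_dir(command, cwd):
    """Run a shell command in `cwd` and return (return code, stdout)."""
    result = subprocess.run(
        command,
        shell=True,
        cwd=cwd,
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code is part of the result, not an error
    )
    return result.returncode, result.stdout


def outputs_match(command, dir_a, dir_b):
    """Return True if `command` behaves identically in both checkouts."""
    return run_in_dir(command, dir_a) == run_in_dir(command, dir_b)
```

Comparing the return code together with the output means a command that already fails on the base revision does not fail the A/B-test, as long as it fails in the same way on the PR revision.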
The test_cargo_audit test can fail due to new advisories being published on crates that firecracker already consumes. We already get notified of this because we do a nightly run that alerts us whenever any advisory for one of our dependencies exists. There is no need to do the same on PRs, as that means all development activities are blocked until the advisory is resolved (which might be non-trivial). Thus use A/B-testing in CI to only fail a PR if a newly introduced dependency has a pre-existing advisory (in which case we do not want to introduce the dependency). This is done by computing the set of warnings/vulnerabilities that currently affect the repository, and ensuring that there are no additions to this set (while subtractions, for example due to a PR removing a vulnerable dependency, are allowed). Signed-off-by: Patrick Roy <[email protected]>
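The "no additions, subtractions allowed" rule is a one-sided set comparison. The sketch below illustrates it with placeholder advisory IDs; `new_advisories` and `check_no_new_advisories` are hypothetical names, and how the advisory sets are extracted from the `cargo audit` output is deliberately left out.

```python
def new_advisories(before, after):
    """Advisories present on the PR revision but absent on the base revision."""
    return set(after) - set(before)


def check_no_new_advisories(before, after):
    """Fail only if the PR adds advisories; removed advisories are fine."""
    added = new_advisories(before, after)
    assert not added, f"PR introduces advisories: {sorted(added)}"


# Placeholder IDs, not real advisories.
base = {"ADVISORY-1", "ADVISORY-2"}
check_no_new_advisories(base, {"ADVISORY-1"})  # passes: the PR removed one
check_no_new_advisories(base, base)            # passes: nothing changed
```

An advisory published overnight for an existing dependency appears in both sets, so it does not fail the PR.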
Similar to test_cargo_audit, it can fail due to actions outside of our control. Signed-off-by: Patrick Roy <[email protected]>
The vulnerabilities A/B-Tests will need to compile firecracker from an old revision, meaning we need to grab binaries from a different cargo workspace. Signed-off-by: Patrick Roy <[email protected]>
Needed for the vulnerability tests, as they need to run commands inside of microvms. Signed-off-by: Patrick Roy <[email protected]>
These tests can fail due to external factors (microcode updates, AMI updates, etc.), which would then block our PR CI until those get resolved. By using A/B-testing for our PR CI we avoid this, and get alerted to these changes out-of-band. Since A/B-testing needs microvms compiled from different revisions, we need to change our fixture approach a bit. Instead of building microvms, the fixture now provides factory methods that can be consumed by the A/B-test functions for building microvms from compiled firecracker binaries. These factory methods can then be composed to make them perform additional actions such as "restore from snapshot" or "make sure checker script is there". The condition that the A/B-tests verify is "PR did not introduce a vulnerability". This is different from the "Result of vulnerability test did not change across PR" that might be more obviously associated with A/B-testing. However, this latter approach would not allow us to fix vulnerabilities (as it would block such PRs). Signed-off-by: Patrick Roy <[email protected]>
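The composition of factory methods and the asymmetric pass condition can be sketched as follows. Every name here (`MicroVM`, `base_factory`, `with_checker_script`, `restored_from_snapshot`, `pr_introduced_vulnerability`) is a hypothetical stand-in chosen for illustration, not the fixture API of the test framework.

```python
from dataclasses import dataclass, field


@dataclass
class MicroVM:
    """Hypothetical stand-in for a microvm built from one firecracker binary."""

    binary: str
    setup_steps: list = field(default_factory=list)


def base_factory(binary):
    """Build a plain microvm from the given binary."""
    return MicroVM(binary=binary, setup_steps=["boot"])


def restored_from_snapshot(factory):
    """Wrap a factory so the resulting microvm is snapshotted and restored."""

    def wrapped(binary):
        vm = factory(binary)
        vm.setup_steps.append("snapshot and restore")
        return vm

    return wrapped


def with_checker_script(factory):
    """Wrap a factory so the checker script is present in the microvm."""

    def wrapped(binary):
        vm = factory(binary)
        vm.setup_steps.append("copy checker script")
        return vm

    return wrapped


def pr_introduced_vulnerability(vulnerable_on_base, vulnerable_on_pr):
    """Only the transition 'not vulnerable -> vulnerable' counts as a failure."""
    return vulnerable_on_pr and not vulnerable_on_base


# The same composed factory is applied to the binaries of both revisions.
factory = with_checker_script(restored_from_snapshot(base_factory))
vm_base = factory("firecracker-base")
vm_pr = factory("firecracker-pr")
```

Because the factories take the binary as an argument, one composed factory serves both revisions, and a PR that fixes a vulnerability (vulnerable on base, not on PR) still passes.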
This lint unilaterally forbids shadowing, which clashes with pytest fixtures in a great many places: to use a fixture, a test function needs to have an argument that matches the fixture's name. If the fixture is defined in the same file as its usage, this triggers pylint's redefined-outer-name. Since this is a very common pattern, disable this lint. Signed-off-by: Patrick Roy <[email protected]>
Previously, ab_test.py explicitly called release.sh to build firecracker binaries. This was because the first version of the script called out to `tools/devtool build` to ensure maximal compatibility with older versions of firecracker (which might no longer compile in new dev containers). However, nowadays the entire ab_test.py script is run inside of docker, meaning we no longer get this benefit. There is therefore no difference between calling release.sh and using get_firecracker_binaries(), so just use the git_ab_test_with_binaries facilities introduced in this patch series. Signed-off-by: Patrick Roy <[email protected]>
Changes
This PR introduces A/B-approaches to tests that can fail due to external factors that would otherwise block all PRs until they are resolved. Using the `cargo audit` test as an example, we want to differentiate between the following two cases:
1. A PR introduces a new dependency that has a pre-existing advisory.
2. A new advisory is published for a crate that firecracker already consumes.
We want to fail PR CI in the first scenario, but be able to deal with the second scenario out-of-band. A/B-Testing allows us to do exactly that.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following
Developer Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.