OCPBUGS-3009: Prune stale CCRs before aggregating scan results #221
@rhmdnd: This pull request references Jira Issue OCPBUGS-3009, which is invalid.
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This still needs some testing but I'm waiting on a cluster.
config/manager/kustomization.yaml
@@ -3,7 +3,7 @@ resources:
images:
- name: compliance-operator
  newName: quay.io/compliance-operator/compliance-operator
  newTag: latest
  newName: image-registry.openshift-image-registry.svc:5000/openshift/compliance-operator
Could you remove this change from the commit?
I think it makes sense to remove CCRs before the aggregating phase, but I wonder if we should do that during phaseDoneHandler, when someone labels a scan to be rescanned. Maybe in here: https://github.com/ComplianceAsCode/compliance-operator/pull/221/files#diff-7aba53b74d27b417a478166b1a983b145fb0e381975e97097b7794752b51edeeR612
/hold for test
I think this PR still needs some work, including an e2e test. I was in the process of adding a test and then discovered other issues in the test framework that were interfering with my test (specifically around resource cleanup).
Force-pushed from b238c94 to 75e89f2.
I was able to reproduce with the latest test.
/retest
@xiaojiey I got a clean parallel run with the test. Should be good for another pre-merge validation test.
/retest e2e-aws-serial Retest due to infrastructure timeouts.
lgtm, thank you!
/retest e2e-aws-serial
/hold
/lgtm
just some inline comments but I think we can add that in another pr
/lgtm
Pre-merge testing passed with 4.14.0-0.nightly-2023-11-09-092851 and code in #221
/label qe-approved
I was trying to resolve the merge conflicts through the GitHub UI, and it merged master into the branch and committed the result here.
Previously, the compliance operator would leave CCRs around and then just overwrite them on subsequent scans. While the most recent scan data was accurate, because it was overwriting existing check results, it gave the impression that some changes weren't taking effect. For example, if you create a tailored profile, run a scan, exclude a rule, and rerun the scan, it appears the change you just made never took effect because the result from the rule you ignored still exists. To avoid this, let's prune stale results when we aggregate new results.
Removed the hold flag since we have QE approval. I also rebased locally to resolve the merge conflict with master in the same patch. Thanks for all the reviews, folks.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jhrozek, rhmdnd, Vincent056, yuumasato
The full list of commands accepted by this bot can be found here. The pull request process is described here
/jira refresh
@Vincent056: This pull request references Jira Issue OCPBUGS-3009, which is invalid.
Merged commit ebcd30a into ComplianceAsCode:master.
@rhmdnd: Jira Issue OCPBUGS-3009: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-3009 has been moved to the MODIFIED state.
A recent change improved how the aggregator pod handles compliance check results by allowing it to find all existing results and prune the stale ones. This makes the state of the Compliance Check Results consistent with the latest run: ComplianceAsCode#221
To do this, we needed to give the aggregator pod permission to list and delete Compliance Check Results, but in that patch we forgot to update the bundle build to include those new permissions. As a result, bundle installs are currently broken for all scans because the aggregator pod gets stuck in a crash loop due to the missing permissions.
This commit updates the bundle manifest so that bundle installs work again.
Previously, the compliance operator would leave CCRs around and then
just overwrite them on subsequent scans. While the most recent scan data
was accurate, because it was overwriting existing check results, it gave
the impression that some changes weren't taking effect.
For example, if you create a tailored profile, run a scan, exclude a
rule, and rerun the scan, it appears the change you just made never took
effect because the result from the rule you ignored still exists.
To avoid this, let's check for any check results at scan time and make
sure we clean them up before we aggregate the new results.