Create a re-usable workflow for oci-image scans #69

DnPlas · 2024-09-25T20:27:17Z

Context

As the team grows its offerings, security vulnerabilities must be scanned for and reported effectively so the team can address them in a timely manner.
Currently, the only repository that has a Github workflow for scanning oci-images and getting reports is canonical/bundle-kubeflow using the scan-images.yaml workflow. While working correctly at the moment, this workflow presents the following limitations:

Uses a local script to gather the images used in all charm repositories that form the bundle. At the same time, the get-all-images.py script depends on scripts present in each repository to generate a list of images per repo. The problem with this is that 1) not all repos have this script (e.g. mlflow), and 2) this script is tightly coupled to the host repo.
The Scan images step of the workflow depends on two scripts located at canonical/kubeflow-ci. This is problematic because 1) it creates a maintenance task, 2) they are doing something that actions like aquasecurity/[email protected] are already providing.
The workflow is not re-usable as is, meaning it cannot be used by the mlflow-operator repository.

Proposal

Create a re-usable workflow for scanning oci-images that:

Uses the aquasecurity/[email protected] to scan and generate reports for each of the images under scan
Uploads each of the Trivy reports as artefacts of the Github workflow run
Automatically reports vulnerabilities via Github issues in the rock's repository (e.g. canonical/training-operator-rocks, canonical/kubeflow-rocks)
If the vulnerabilities for a specific image have already been reported, the workflow updates the existing issue with the latest details of the report instead of opening a new one.
Runs on schedule, but also provides a workflow dispatch

Please NOTE that part of this proposal is to only scan images that the Analytics team maintains. This is because the images that charms use that come from upstream cannot be patched by us.
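
The scoping rule above can be pictured as a simple filter. This is only a minimal sketch under stated assumptions, not part of the proposed workflow: the function name `select_maintained_images` and the sample image references are hypothetical, and it assumes team-maintained images can be recognised by their registry namespace (`charmedkubeflow` is the default mentioned later in this thread).

```python
def select_maintained_images(images, namespace="charmedkubeflow"):
    """Return only the image references published under `namespace`.

    Assumption: a team-maintained image looks like
    "<namespace>/<name>:<tag>", optionally prefixed by a registry host.
    """
    maintained = []
    for image in images:
        # Drop any digest suffix, then inspect the path components.
        path = image.split("@")[0]
        parts = path.split("/")
        # The namespace is the component right before the image name.
        if len(parts) >= 2 and parts[-2] == namespace:
            maintained.append(image)
    return maintained


# Hypothetical image references, for illustration only.
images = [
    "charmedkubeflow/training-operator:1.8",
    "docker.io/charmedkubeflow/jupyter-web-app:1.9",
    "docker.io/kubeflownotebookswg/jupyter:v1.9",  # upstream image, skipped
]
print(select_maintained_images(images))
```

Filtering on the namespace keeps upstream images, which the team cannot patch, out of the scan list without needing a per-repository script.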

Limitations

~~1. There is no other way of fetching the images that each charm uses, so for now we'll stick to using the get-all-images.py script.~~
2. The Trivy reports will be uploaded individually, meaning that there is a linear relation between the number of scanned images and the number of artefacts saved in the workflow run.
3. BIGGEST This workflow will be coupling the product to the rocks, that is, the scans are done far from the source code. Ideally we'd have scans and vulnerability reports at the rockcraft project repositories. For this one, though, we could plan to push rocks to the oci-factory and outsource all the vulnerability scans and reports. The workflows will live at rocks repo level, so this is not a limitation anymore.

Out of scope

Automatic notifications in mailing list or MM
snap or charm scans - though common workflows can be used in other automations, for example, for creating GH issues.

Example

Scanning an image - this is an example run. The vulnerability scan job will fail if it finds a CRITICAL or HIGH vulnerability and it will report an issue.
The workflow - this is what the workflow would look like, just with a bit of work to make it 100% product agnostic.
Automatic issue creation - this is an example of an issue that will be created automatically by the workflow. It currently uses my GH token, that's why I'm the reporter, but ideally we'll use the CKF bot for it.
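
The failure condition described for the example run (fail on CRITICAL or HIGH) can be sketched as follows. This is an illustration under assumptions, not the workflow's actual implementation: it assumes the report is Trivy's JSON output with a top-level `Results` list whose entries may carry a `Vulnerabilities` list, and the sample report and CVE identifiers are made up.

```python
import json

BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


def blocking_vulnerabilities(report):
    """Collect the IDs of CRITICAL/HIGH findings from a Trivy JSON report (a dict)."""
    findings = []
    for result in report.get("Results") or []:
        # "Vulnerabilities" may be missing or null when a target is clean.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING_SEVERITIES:
                findings.append(vuln["VulnerabilityID"])
    return findings


# Hypothetical, heavily trimmed report for illustration only.
sample_report = json.loads("""
{
  "Results": [
    {"Target": "rock (ubuntu 22.04)", "Vulnerabilities": [
      {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH"},
      {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"}
    ]},
    {"Target": "usr/bin/app"}
  ]
}
""")

found = blocking_vulnerabilities(sample_report)
# A non-empty list is what would make the scan job fail and trigger a report.
exit_code = 1 if found else 0
print(found, exit_code)
```

In a real workflow this gating is more likely delegated to the scanner itself (the Trivy CLI has `--severity` and `--exit-code` options); the function above only makes the rule explicit.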

What needs to get done

Create a re-usable workflow for getting images used by any rock, scanning them for vulnerabilities, and reporting found vulns following the example in https://github.com/canonical/bundle-kubeflow/pull/1087/files#diff-327280cbc65c9de9998db8b0e5d1c937ccf75524907e5f9d026304ca85146f53

Definition of Done

There is a re-usable workflow that any of the charming products of this team can use.

syncronize-issues-to-jira · 2024-09-25T20:27:26Z

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6331.

This message was autogenerated

DnPlas · 2024-09-26T17:15:00Z

Based on feedback from @misohu, a better way to approach this enhancement proposal is to have the scans closer to the source (each rock repository) instead of in a central place.
@misohu also pointed out that rocks are already being scanned on_push and on_pull by the canonical/charmed-kubeflow-workflows/.github/workflows/get-rocks-modified-and-build-scan-test-publish.yaml@main workflow, so scans are already happening at the rock level, but vulnerabilities are not being reported and not being constantly tested.
I am editing the original proposal in the description of this issue to match the above.

Add the option to report vulnerabilities automatically via Github issues. Fixes #69

ci: add workflow to enable automatic vulnerability reports

This re-usable workflow can be used for reporting security vulnerabilities via Github issues. It takes the issue title, image-name, and issue-labels as inputs, and in turn:
* edits an existing issue with the same title and updates the vulnerability report, or
* creates a new issue with the issue-title and adds the vulnerability report in the description

Please NOTE this workflow assumes the existence of vulnerability reports as artefacts of a workflow run; that is, it expects artefacts named trivy-report-<image-name> to be present in the same workflow run.

Part of #69
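
The edit-or-create behaviour and the artefact naming convention described in this commit message can be sketched as below. This is a hedged illustration, not the workflow's code: `artefact_name`, `plan_issue_action`, and the sample issue data are hypothetical, and the list of open issues stands in for whatever the issue tracker would return.

```python
def artefact_name(image_name):
    """Name of the report artefact the reporting workflow expects."""
    return f"trivy-report-{image_name}"


def plan_issue_action(open_issues, issue_title):
    """Decide whether to edit an existing issue or create a new one.

    `open_issues` is a list of dicts with "number" and "title" keys,
    a simplified stand-in for the tracker's response.
    """
    for issue in open_issues:
        if issue["title"] == issue_title:
            return ("edit", issue["number"])
    return ("create", None)


# Hypothetical data for illustration only.
open_issues = [{"number": 12, "title": "Vulnerabilities found for training-operator"}]

print(artefact_name("training-operator"))
print(plan_issue_action(open_issues, "Vulnerabilities found for training-operator"))
print(plan_issue_action(open_issues, "Vulnerabilities found for jupyter-web-app"))
```

Matching on the exact title keeps one issue per image as long as callers build the title deterministically from the image name.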

ci: add re-usable workflow for scans from published img and automatic reports

This commit adds get-published-images-scan-and-report.yaml, a re-usable workflow that enables repositories to scan images from a public registry (in the case of the Analytics team it defaults to charmedkubeflow) and report back the security vulnerabilities as Github issues. This workflow is intended to be used on demand (using a workflow dispatch) and on schedule, as it will be used for continuous testing of the published images a rock repository generates.

Part of #69

DnPlas · 2024-10-09T02:27:23Z

Solutions

Based on the description of this issue and after working a bit on the solution, we have the following:

A re-usable workflow for reporting vulnerabilities (as proposed in ci: add workflow to enable automatic vulnerability reports #72). This will generate labelled Github issues with detailed descriptions of the vulnerabilities found during the scans.

This workflow is flexible enough to work with the current implementation of build-scan-test-publish-rock.yaml. An example of the required changes can be found here, and as a side effect, here.

This workflow can also be used for workflows triggered on_push, on_demand, and on workflow_dispatch. An example of a full implementation can be found in #73.

A re-usable workflow for scanning and reporting. In ci: add re-usable workflow for scans from published img and automatic… #73, the end-to-end integration is done in a re-usable workflow that runs on workflow_dispatch and on schedule. It scans the images from a public registry, uploads each vulnerability report, and creates or edits Github issues for easy tracking.
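
To give an idea of what a "detailed description" in a generated issue could look like, here is a minimal sketch that renders findings as a Markdown table. It is an assumption about the layout, not the format the workflows in #72 and #73 actually produce: `render_issue_body` is a hypothetical helper, the sample finding is made up, and the dictionary keys follow Trivy's JSON field names.

```python
def render_issue_body(image_name, vulnerabilities):
    """Render findings as a Markdown table for an issue description.

    `vulnerabilities` is a list of dicts using Trivy's field names
    (VulnerabilityID, PkgName, Severity); missing fields render as "-".
    """
    lines = [
        f"## Vulnerability report for `{image_name}`",
        "",
        "| ID | Package | Severity |",
        "| --- | --- | --- |",
    ]
    for vuln in vulnerabilities:
        row = [vuln.get(key, "-") for key in ("VulnerabilityID", "PkgName", "Severity")]
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


# Hypothetical finding for illustration only.
body = render_issue_body(
    "training-operator",
    [{"VulnerabilityID": "CVE-0000-0001", "PkgName": "openssl", "Severity": "HIGH"}],
)
print(body)
```

Because the body is regenerated from the latest report on every run, editing an existing issue simply replaces its description with the new table.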

Discussions

A) Should automatic reports be enabled on_push? - Right now, most (if not all) rocks repositories are scanning images on_push and uploading the vulnerability reports, but:

Those workflows will not fail even if a vulnerability is found
The results of those scans are not monitored by the team

This can be solved by relying on the scheduled workflow, BUT, the scheduled workflow only scans and reports published images. On the other hand, not enabling this would ensure that the CI is always green and publishing images regardless of the vulnerabilities.

#74 shows an example of how this can be added and left for us to enable whenever we call get-rocks-modified-and-build-scan-test-publish.yaml in each of the rocks repositories. In this workflow run, the execution shows an example of the feature available but disabled (as we are not passing report-vulnerabilities: true to the workflow). On the other hand, this is an example run of the same workflow with the reports enabled, as seen here. This is what would happen on_push if we decide that it is worth adding.
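
The opt-in behaviour discussed above can be summarised as a small decision function. This is only a sketch of the trade-off, with hypothetical names; the only identifier taken from the thread is the `report-vulnerabilities` input, modelled here as a boolean argument.

```python
def report_decision(report_vulnerabilities, blocking_findings):
    """Describe what happens to scan findings for a given opt-in setting."""
    if not blocking_findings:
        return "nothing to report"
    if not report_vulnerabilities:
        # Feature available but disabled: CI stays green, nobody is notified.
        return "findings kept in the uploaded report only"
    return "create or update an issue"


# Made-up finding ID, for illustration only.
for enabled in (False, True):
    print(enabled, report_decision(enabled, ["CVE-0000-0001"]))
```

With the flag left at its disabled default, existing callers keep their current behaviour, which is what makes the change safe to merge before the on_push question is settled.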

This commit enables the automatic creation of Github issues when a security vulnerability is found in the scan jobs that the build-scan-test-publish-rock.yaml already performs. The intention of this is to add reporting capabilities to the workflows that are already using build-scan-test-publish-rock.yaml on_merge, that is, enable automatic reports of vulnerabilities as Github issues on every merge. Part of #69

DnPlas · 2024-10-16T09:56:46Z

Will track the discussion about enabling the automatic reports in #82. Closing this issue as the re-usable workflow has been created and merged in #72 and #73.

DnPlas added the enhancement New feature or request label Sep 25, 2024

DnPlas added a commit that referenced this issue Oct 8, 2024

feat: add automatic vulnerability reports

5871ae8

Add the option to report vulnerabilities automatically via Github issues. Fixes #69

DnPlas added a commit that referenced this issue Oct 8, 2024

feat: add automatic vulnerability reports

fb912a3

Add the option to report vulnerabilities automatically via Github issues. Fixes #69

DnPlas mentioned this issue Oct 9, 2024

ci: add workflow to enable automatic vulnerability reports #72

Merged

DnPlas mentioned this issue Oct 9, 2024

ci: add re-usable workflow for scans from published img and automatic… #73

Merged

DnPlas mentioned this issue Oct 9, 2024

ci: enable automatic vulnerability reports for existing workflow reports #74

Draft

DnPlas closed this as completed Oct 16, 2024
