
Switch Dockerfile image to wolfi and add pipeline for vulnerability scanning #3063

Open · wants to merge 1 commit into main from kostasb/switch-dockerfile-image
Conversation


@kostasb kostasb commented Dec 27, 2024

https://github.com/elastic/search-developer-productivity/issues/3547

Description

The goal of this PR is to:

  1. move "extensible Dockerfiles" to a distroless base image. Dockerfile and Dockerfile.ftest are not used in Elastic's docker image releases but are instead provided as a stepping stone for users to build custom images on.
  2. add a pipeline for vulnerability scanning of built images

This is based on @acrewdson's draft. It uses wolfi-base images and adds buildkite pipelines to build, test, and scan the resulting docker images using trivy.

A notification method for vulnerability reports from Trivy is TBD.

Checklists

Pre-Review Checklist

  • this PR does NOT contain credentials of any kind, such as API keys or username/passwords (double check config.yml.example)
  • this PR has a meaningful title
  • this PR links to all relevant github issues that it fixes or partially addresses
  • this PR has a thorough description
  • Covered the changes with automated tests
  • Tested the changes locally
  • Added a label for each target release version (example: v7.13.2, v7.14.0, v8.0.0)
  • Considered corresponding documentation changes
  • Contributed any configuration settings changes to the configuration reference
  • if you added or changed Rich Configurable Fields for a Native Connector, you made a corresponding PR in Kibana

Changes Requiring Extra Attention

  • Security-related changes (encryption, TLS, SSRF, etc)
  • New external service dependencies added.

Release Note

Contributor

@acrewdson acrewdson left a comment

great start! I added a few comments

Suggested change:
-RUN make clean install
-RUN ln -s .venv/bin /app/bin
+RUN make clean install && ln -s .venv/bin /app/bin

Contributor

I had this line

RUN apk del make git

so that we could remove now-unnecessary packages from the final images, with the assumption that we do not need make or git at runtime, but we could confirm with someone from @elastic/search-extract-and-transform that it makes sense

Author

@kostasb kostasb Dec 30, 2024

@acrewdson My thought process for omitting that line was:

  • since we're going to scan the built image we'd want to be able to detect vulnerabilities on these packages, even if they're not required at runtime.
  • they've been included in the Dockerfile so removing them now can cause a "breaking" change for users who already build on top of this. Changing the base image itself can be a much more substantial breaking change though so perhaps that shouldn't be factored in here.

Let me know what you think?

Member

I'm ok with removing make and git. I expect very little being extended on the image, and more likely that the connectors code is what is seeing extension. We shouldn't need make or git for that.

Member

+1. git is needed at build-time, but for run-time we need neither git nor make
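Taken together, the thread above would amount to something like the following Dockerfile sketch. This is illustrative only, not the PR's actual file: the base image and the removed packages come from this thread, while the python version pin, paths, and layout are assumptions.

```dockerfile
# Sketch only: base image and apk packages follow this review thread;
# the python pin and /app layout are illustrative assumptions.
FROM cgr.dev/chainguard/wolfi-base

ARG python_version=3.11
# make and git are needed at build time only
RUN apk add --no-cache python3=~${python_version} make git

WORKDIR /app
COPY . .
RUN make clean install && ln -s .venv/bin /app/bin

# drop build-time-only packages so they neither ship in, nor get
# flagged by scans of, the final image
RUN apk del make git
```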

catalog-info.yaml (outdated review threads, resolved)
# ----
# Dockerfile build and tests on amd64
# ----
- label: "Build amd64 image from Dockerfile"
Contributor

instead of only building amd64 images, we could consider building a multi-arch image using buildah or drivah, like we are doing here, and that way we could also potentially avoid using these hefty ec2 instances

Author

@kostasb kostasb Dec 30, 2024

Thanks for the suggestion about a multi-arch image. Reading in more detail how buildah/drivah multi-arch push works: it pulls the already-built amd64/arm64 images from the previous steps and combines them into a common manifest that references both builds, so unfortunately it doesn't save us any build resources.

It means we'd still build the image on both architectures, only to create a multi-arch image that isn't going to be published or used for any other purpose. Since the images aren't published, I don't see the need for it. What do you think?
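For illustration, independent per-architecture build steps with no combining manifest step might be sketched in Buildkite YAML like this; the step keys, queue names, and image tags are hypothetical, not taken from the PR:

```yaml
# Sketch: two independent per-arch builds, no multi-arch manifest step.
# Step keys, agent queues, and tags are illustrative assumptions.
steps:
  - label: "Build amd64 image from Dockerfile"
    key: build_dockerfile_image_amd64
    agents:
      queue: docker-builders-amd64
    command: docker build -f Dockerfile -t connectors-extensible:amd64 .

  - label: "Build arm64 image from Dockerfile"
    key: build_dockerfile_image_arm64
    agents:
      queue: docker-builders-arm64
    command: docker build -f Dockerfile -t connectors-extensible:arm64 .
```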

Author

I added pipeline steps to build and test both amd64 and arm64 architectures. Given the above analysis I don't see a use for a multiarch image step since we aren't publishing that artifact either way, let me know if you have any concerns.

schedules:
Daily main:
branch: main
cronline: '@daily'
Contributor

wondering if we should consider running this less frequently? I think it would depend on what @elastic/search-extract-and-transform prefer, but the scenario I picture is something like:

  • this job runs once a week, or maybe even once every two weeks
  • if Trivy identifies any vulnerabilities, a message is sent to a Slack channel, or a GitHub issue is created

let's see what folks think 👍
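If the weekly cadence wins out, it would be a one-line change to the schedule quoted above; the exact cron slot below is just one possible example, not a decided value:

```yaml
# Sketch: weekly instead of '@daily'; the Monday 00:00 UTC slot is illustrative.
schedules:
  Weekly main:
    branch: main
    cronline: '0 0 * * 1'
```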

Contributor

  • if Trivy identifies any vulnerabilities, a message is sent to a Slack channel, or a GitHub issue is created

I think these notifications could potentially be a duplicate of existing Snyk issues detected on the official container image. What's the value added by these new notifications?

Member

Is Snyk going to scan the images produced by these CI jobs?

I think this PR is doing 3 things, but maybe doesn't need to:

  1. switches our Dockerfile to use chainguard (has customer value)
  2. adds CI to validate the Dockerfile works (has customer value)
  3. adds CI to fail if a vuln is found (internal value, with indirect customer value)

I think (3) could probably be done separately, and reuse whatever machinery we use to scan other artifacts, especially since (1) and (2) will significantly cut down on false-positives that we'd have if we tried to do (3) on its own today.
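If (3) does stay in this pipeline, Trivy can gate the step directly. The `--severity` and `--exit-code` flags are real Trivy options; the archive filename below is an illustrative assumption:

```shell
# Fail the pipeline step (exit 1) only when HIGH or CRITICAL
# vulnerabilities are found in the scanned image archive.
trivy image --quiet \
  --severity HIGH,CRITICAL \
  --exit-code 1 \
  --input .artifacts/connectors-image-amd64.tar.gz
```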

Author

@kostasb kostasb Jan 2, 2025

The CI jobs in this PR do not push images to the docker registry; instead they rely on Trivy for vulnerability scanning, which runs within the context of the pipeline.

The alternative approach would be to push the images to an internal namespace on the docker registry and request that they be added to Snyk. That would result in duplicate reports, though, since we already scan docker.elastic.co/integrations/elastic-connectors with Snyk, built from the Dockerfile.wolfi image, which is technically the exact same base OS build.

Comment on lines 105 to 110
command: |-
mkdir -p .artifacts
buildkite-agent artifact download '.artifacts/*.tar.gz*' .artifacts/ --step build_dockerfile_image_amd64
trivy --version
env | grep TRIVY
find .artifacts -type f -name '*.tar.gz*' -exec trivy image --quiet --input {} \;
Contributor

nice 👍

I'm thinking it could also be nice to add one additional stage to this pipeline, if it's not too difficult:

a smoke-test/acceptance-test that verifies that connectors actually starts up in the resultant image, just as a way of gaining confidence that the built image does what we expect it to do
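A minimal version of such a smoke test might look like this; the archive and tag names are illustrative assumptions, and `elastic-ingest --version` is the check already used by the test script mentioned later in this thread:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: load the image produced by the build step, then verify
# the connectors entrypoint actually starts and reports its version.
docker load --input .artifacts/connectors-image-amd64.tar.gz
docker run --rm connectors-extensible:amd64 elastic-ingest --version
```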

Author

Good point. This pipeline includes the same test step that is used for the production docker images built from Dockerfile.wolfi, so that the testing approach is aligned for both built images.

The pipeline step test_extensible_dockerfile_image_amd64 runs .buildkite/publish/test-docker.sh, which performs basic sanity checks and verifies the output of elastic-ingest --version.

Since adding functional tests at the connector level is a more involved task that could benefit both the production and the "extensible" images, I'd suggest we create a dedicated issue for that. What do you think?

.buildkite/dockerfiles-pipeline.yml, Dockerfile (outdated review threads, resolved)
Author

kostasb commented Dec 30, 2024

Since in this PR we are building a Dockerfile on a different base image (cgr.dev/chainguard/wolfi-base), there are package discrepancies compared to the base image used in the official built images, wolfi/python:3.11-dev.

For reference, here is a list of packages which, among others (like bash and curl), are available in the wolfi python-3.11 image used for the production image builds.

Do we want this list of packages added to the wolfi-base image? Do connectors require some, or all, of them? @elastic/search-extract-and-transform

@artem-shelkovnikov
Member

I'm not sure we need the packages mentioned; we really only need the dependencies for Python 3.11 plus base64 available.

@kostasb kostasb force-pushed the kostasb/switch-dockerfile-image branch from 84d8123 to cdfd17b Compare December 31, 2024 15:03
Author

kostasb commented Dec 31, 2024

I'm not sure if we need the packages mentioned, we really only need dependencies for 3.11 python + base64 available.

Thanks, then I'll stick with the bare minimum packages and we can iterate if needed.

I added both amd64 and arm64 image builds, mainly for the purpose of smoke testing (with .buildkite/publish/test-docker.sh) on both architectures. I don't expect deviations in the vulnerability testing between architectures.

Do we need some sort of release note for the new base image? It could be a "breaking" change for those users who build their images on top of our Dockerfiles.

@kostasb kostasb force-pushed the kostasb/switch-dockerfile-image branch from 25f6298 to a1b50bd Compare December 31, 2024 17:33
@kostasb kostasb marked this pull request as ready for review December 31, 2024 17:34
@kostasb kostasb requested a review from a team as a code owner December 31, 2024 17:34
Author

kostasb commented Jan 2, 2025

Update re. the packages included with the base image vs. the python one that we build our official docker images on: @oli-g suggested checking whether installing python-3.11 on the wolfi-base image installs those as dependencies. It does install most of them as part of the Dockerfile step that adds python (RUN apk add --no-cache python3=~${python_version} make git), with some exceptions (linux-headers) that are unlikely to be relevant for connectors. That said, there shouldn't be any concerns regarding the base image diff.
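That dependency check can be reproduced locally with plain docker and apk; a sketch (the image and version pin are from this thread, the one-liner form is an assumption):

```shell
# Install python3 on wolfi-base and list everything apk pulled in,
# for comparison against the package set of the wolfi python image.
docker run --rm cgr.dev/chainguard/wolfi-base sh -c \
  'apk add --no-cache python3=~3.11 >/dev/null && apk info | sort'
```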

@artem-shelkovnikov
Member

It could be a "breaking" change for those users who build their images on top of our Dockerfiles.

Just to check - cgr.dev/chainguard/wolfi-base is publicly available, right?

Author

kostasb commented Jan 2, 2025

It could be a "breaking" change for those users who build their images on top of our Dockerfiles.

Just to check - cgr.dev/chainguard/wolfi-base is publicly available, right?

Yes, we've confirmed that this is a publicly available image that can be downloaded by unauthenticated (not logged in to any registry) docker instances.

@artem-shelkovnikov
Member

Do we need some sort of release note for the new base image? It could be a "breaking" change for those users who build their images on top of our Dockerfiles.

We can mention it in our release notes indeed. I don't expect a lot of customers to build custom docker images, and likely it will just work for them, but it would not hurt to mention that we've recently updated our docker images to make them more secure, with potentially incompatible changes that customers will have to fix.

5 participants