Replace grep base images parsing with dockerfile-json #1304
This has to be merged only after konflux-ci/buildah-container#62 and the image reference has to be replaced.
/retest
When playing with dockerfile-json to see if it could help with #1200, I found out it will probably make things even worse 😞

FROM registry.fedoraproject.org/fedora-minimal:40
RUN /bin/sh <<EOF
mkdir -p /etc/foo
cp -aR /src/foo /etc/foo
EOF

$ dockerfile-json /tmp/Dockerfile
[dockerfile-json 1.0.8] error: parse "/tmp/Dockerfile": dockerfile/instructions.Parse dockerfile parse error on line 4: unknown instruction: mkdir

It looks like the latest release of dockerfile-json (from 3 years ago: https://github.com/keilerkonzept/dockerfile-json/releases) uses a version of buildkit that didn't yet support heredocs (and who knows what else). Then I tried to build dockerfile-json from source to see if it's fixed in main, but it doesn't even compile anymore:

All my trust in that project just went out the window
On the other hand, the fix was pretty simple if we want to fork the project and add some CI so that the auto-merged Renovate updates don't just randomly break main. Or contribute this upstream, but it looks rather abandoned. The last commit by someone other than renovate-bot was 3 years ago (the 1.0.8 release).

diff --git a/pkg/dockerfile/parse.go b/pkg/dockerfile/parse.go
index 44daf75..02dac0a 100644
--- a/pkg/dockerfile/parse.go
+++ b/pkg/dockerfile/parse.go
@@ -23,7 +23,7 @@ func ParseReader(r io.Reader) (*Dockerfile, error) {
 	if err != nil {
 		return nil, fmt.Errorf("dockerfile/parser.Parse %v", err)
 	}
-	stages, metaArgs, err := instructions.Parse(result.AST)
+	stages, metaArgs, err := instructions.Parse(result.AST, nil)
 	if err != nil {
 		return nil, fmt.Errorf("dockerfile/instructions.Parse %v", err)
 	}
@chmeliik Thanks for your investigation here. I tried forking the repo, so far just to my own account, and added the fix you proposed. It can now be built from source from the main branch; see konflux-ci/buildah-container#62. Heredocs can now be parsed as well, because it uses the latest buildkit version. This does concern me, though: I wanted to avoid extra work, and if we decide to fork the repo, set up CI, and maintain it, that will require further work down the line. Still, it seems we do need this functionality. dockerfile-json uses the buildkit parser directly, so if we wanted, we could also build our own tool around it.
LGTM after the temp changes are undone
This is more reliable and allows us to fix bugs where base images were loaded incorrectly.
For example, previously this part of a Dockerfile:
LABEL description="this is a build \
from single-arch"
would return "single-arch" as a base image.
Using dockerfile-json also solves the problem of omitting "scratch" from the results.
Another advantage is that when we have something such as:
FROM registry.access.redhat.com/ubi9/ubi:latest as builder
...
FROM builder AS build1
then only the original image "registry.access.redhat.com/ubi9/ubi:latest" will be reported.
KFLUXBUGS-1269
Signed-off-by: mkosiarc <[email protected]>
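The misparse described above can be reproduced with a short sketch. It wraps the grep-based pipeline that this PR removes in a function and runs it against a sample Dockerfile combining both cases from the description. The function name `old_base_images` and the temporary file are illustrative only and are not part of the task; GNU grep and sed are assumed for `\s` and `\S`.

```shell
# The grep-based pipeline removed by this PR, wrapped in a
# hypothetical helper so it can be run on any Dockerfile path.
old_base_images() {
  grep -i '^\s*FROM' "$1" | sed 's/--platform=\S*//' | awk '{print $2}' | (grep -v ^oci-archive: || true)
}

# Sample Dockerfile: a multi-line LABEL whose continuation line
# starts with "from", plus a stage that builds on an earlier stage.
dockerfile=$(mktemp)
cat > "$dockerfile" <<'EOF'
FROM registry.access.redhat.com/ubi9/ubi:latest AS builder
LABEL description="this is a build \
from single-arch"
FROM builder AS build1
EOF

result=$(old_base_images "$dockerfile")
printf '%s\n' "$result"
rm -f "$dockerfile"
```

Besides the real image, the old pipeline also reports the LABEL continuation line (as `single-arch"`, trailing quote included) and the stage name `builder`, neither of which is a base image.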
@@ -288,14 +288,12 @@ spec:

BUILDAH_ARGS=()

- BASE_IMAGES=$(grep -i '^\s*FROM' "$dockerfile_path" | sed 's/--platform=\S*//' | awk '{print $2}' | (grep -v ^oci-archive: || true))
+ BASE_IMAGES=$(dockerfile-json "$dockerfile_path" | jq -r '.Stages[] | select(.From | .Stage or .Scratch | not) | .BaseName')
It is now possible to have an empty value here, right? Before, there was always going to be at least one value.
Will this break any functionality elsewhere?
Yes, previously there was always at least one value, e.g. when it was FROM scratch, but now we are omitting that. That's why I could remove all the if conditions that check for "scratch".
I tested it with a couple of builds and the e2e tests are passing. From the logic of the buildah task, nothing should break either, since "scratch" was omitted anyway when the value is passed further (for example to the SBOM).
An empty Dockerfile should seldom happen. However, if it happens accidentally, dockerfile-json will report an error like [dockerfile-json 1.0.8] error: parse "./Dockerfile": dockerfile/parser.Parse file with no instructions. And set -o pipefail is not set in the build step.
It is good that the e2e tests are passing but I don't have much faith in the robustness of the e2e tests lately (i.e. do they test an image which is just FROM scratch)? Would you be able to test this change in a pipeline that has a simple FROM scratch Containerfile?
If you need a file, you can customize this pipeline: https://github.com/konflux-ci/olm-operator-konflux-sample/blob/main/.tekton/gatekeeper-operator-bundle-pull-request.yaml#L35 and just modify the Dockerfile to remove the first stage (or better yet, test it with the first stage and without).
We do have a FROM scratch test, see konflux-ci/e2e-tests@9c5c74c and 5573eb3. But I tested it again anyway, this time also with multiple stages, like the Dockerfile you linked.
Is there a reason that we are only making these changes on the v0.2 tasks?
I remember briefly discussing this with Adam: it does not seem worth updating the 0.1 tasks, as we are not expecting many users to still use them and this update does not seem critical enough to include in the old tasks. But if you have better knowledge of this and feel like I should update the 0.1 tasks as well, then let me know.
The question comes from the maintenance burden. If the change is minimal to make, then it seems like it would be better to make it so that the tasks are the same. I am not saying that we should do it, I am just raising the observation that we are not.
It might be minimal to make, but it would again take some time to test it. And as far as I understand, the e2e tests here are running only the newer version of the tasks.