This repository has been archived by the owner on Jun 28, 2024. It is now read-only.

Docker build breaks due to unpublished outdated packages in the Alpine repo #43

Open
ghost opened this issue Nov 26, 2020 · 4 comments

Comments

@ghost

ghost commented Nov 26, 2020

Dear all,

in #42, the following problem was described:

The current Dockerfile pins fixed version numbers for some dependencies, with the intention of keeping the setup reasonably reproducible:

RUN apk add --no-cache \
      chromium~=80.0.3987 \
      nss \
      freetype \
      freetype-dev \
      harfbuzz \
      ca-certificates \
      ttf-freefont \
      nodejs \
      yarn~=1.22.4 \

However, as those versions of chromium and yarn are outdated, the Alpine project no longer distributes them:

step 4/16 : RUN apk add --no-cache       chromium~=80.0.3987       nss       freetype       freetype-dev       harfbuzz       ca-certificates       ttf-freefont       nodejs       yarn~=1.22.4       bash procps drill coreutils libidn curl       parallel jq grep aha
 ---> Running in 5ca2fe0d3cde
fetch https://dl-cdn.alpinelinux.org/alpine/edge/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/edge/community/x86_64/APKINDEX.tar.gz
ERROR: unsatisfiable constraints:
  chromium-86.0.4240.111-r0:
    breaks: world[chromium~80.0.3987]
  yarn-1.22.10-r0:
    breaks: world[yarn~1.22.4]
The command '/bin/sh -c apk add --no-cache       chromium~=80.0.3987       nss       freetype       freetype-dev       harfbuzz       ca-certificates       ttf-freefont       nodejs       yarn~=1.22.4       bash procps drill coreutils libidn curl       parallel jq grep aha' returned a non-zero code: 2

The problem was already brought forward here: https://superuser.com/a/1486407/1039133

Possible options are:

  1. Remove the pinning of specific versions as far as possible. I know we need yarn < 2.0 due to breaking changes. Reproducibility would then require building the Docker image once and keeping it for as long as reproducibility is needed.
  2. Change to a different distribution that does not unpublish old packages.
  3. Use alternative Alpine repositories that archive old packages, e.g.:
    apk add --no-cache --update-cache --repository http://nl.alpinelinux.org/alpine/v3.8/main alsa-lib-dev=1.1.6-r0
    See https://superuser.com/a/1369979 .
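
For option 1, one way to drop the exact pins while still guarding against yarn 2.x is a small version check at build time. The following is only a sketch: the helper name `yarn_is_v1` is hypothetical, and how it would be wired into the Dockerfile is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given version string
# has a major version below 2 (e.g. 1.22.10), fails otherwise.
yarn_is_v1() {
  major=${1%%.*}   # keep the part before the first dot: 1.22.10 -> 1
  [ "$major" -lt 2 ]
}
```

During the build it could be called as `yarn_is_v1 "$(yarn --version)" || exit 1`, so that an unpinned `yarn` package still fails loudly once Alpine ships a 2.x release.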
ghost pushed a commit that referenced this issue Nov 26, 2020
Today, 2020-11-26, I could build the Docker container and run one test
collection successfully. However, the Docker container may again break
in the future due to version mismatches.
@vincentcox

vincentcox commented Nov 26, 2020

Thanks for providing the options; they pointed me in the right direction. I am using this website to check which available versions come closest to the ones in the Dockerfile: https://pkgs.alpinelinux.org/packages?name=yarn&branch=v3.10.

So I made the following Dockerfile, based on the one in this repo, with the necessary changes applied.

FROM alpine:3.10

LABEL maintainer="Robert Riemann <[email protected]>"

LABEL org.label-schema.description="Website Evidence Collector running in a tiny Alpine Docker container" \
      org.label-schema.name="website-evidence-collector" \
      org.label-schema.usage="https://github.com/EU-EDPS/website-evidence-collector/blob/master/README.md" \
      org.label-schema.vcs-url="https://github.com/EU-EDPS/website-evidence-collector" \
      org.label-schema.vendor="European Data Protection Supervisor (EDPS)" \
      org.label-schema.license="EUPL-1.2"

# Installs latest Chromium (77) package.
RUN apk add --no-cache --update-cache --repository http://nl.alpinelinux.org/alpine/v3.8/main alsa-lib-dev=1.1.6-r0
RUN apk add \
      chromium~=77.0.3865 \
      nss \
      freetype \
      freetype-dev \
      harfbuzz \
      ca-certificates \
      ttf-freefont \
      nodejs \
      yarn~=1.16 \
# Packages linked to testssl.sh
      bash procps drill coreutils libidn curl \
# Toolbox for advanced interactive use of WEC in container
      parallel jq grep aha

# Add user so we don't need --no-sandbox and match first linux uid 1000
RUN addgroup --system --gid 1001 collector \
      && adduser --system --uid 1000 --ingroup collector --shell /bin/bash collector \
      && mkdir -p /home/collector/Downloads /output \
      && chown -R collector:collector /home/collector \
      && chown -R collector:collector /output

COPY . /opt/website-evidence-collector/

# Install Testssl.sh
RUN curl -SL https://github.com/drwetter/testssl.sh/archive/3.0.tar.gz | \
      tar -xz --directory /opt

# Run everything after as non-privileged user.
USER collector

WORKDIR /home/collector

# Tell Puppeteer to skip installing Chrome. We'll be using the installed package.
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true

RUN yarn global add file:/opt/website-evidence-collector --prefix /home/collector

# Let Puppeteer use system Chromium
ENV PUPPETEER_EXECUTABLE_PATH /usr/bin/chromium-browser

ENV PATH="/home/collector/bin:/opt/testssl.sh-3.0:${PATH}"
# Let website evidence collector run chrome without sandbox
# ENV WEC_BROWSER_OPTIONS="--no-sandbox"
# Configure default command in Docker container
ENTRYPOINT ["/home/collector/bin/website-evidence-collector"]
WORKDIR /
VOLUME /output

So the changed parts are:

FROM alpine:3.10
....
RUN apk add --no-cache --update-cache --repository http://nl.alpinelinux.org/alpine/v3.8/main alsa-lib-dev=1.1.6-r0
RUN apk add \
      chromium~=77.0.3865 \
....
      yarn~=1.16 \

Build it:

docker build -t website-evidence-collector .

Please note that in the Dockerfile in the repo, the trailing dot (the build context) is missing from the comment that explains how to build the image.

Run it:

mkdir output
chmod 777 output # could be cleaner and more secure, but fine for the sake of the PoC
docker run --rm -it --cap-add=SYS_ADMIN -v "$(pwd)/output:/output" website-evidence-collector https://vincentcox.com --overwrite
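
A tighter alternative to `chmod 777` might be possible when the host user already has uid 1000, since the Dockerfile creates `collector` with that uid. This is a sketch under that assumption (native Linux bind mounts, no user-namespace remapping); the helper name `output_mode` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: choose the narrowest mode for ./output that still lets
# the container user (uid 1000, "collector" in the Dockerfile) write to it.
output_mode() {
  if [ "$1" -eq 1000 ]; then
    echo 700   # host user and container user share the uid
  else
    echo 777   # fall back to the world-writable mode used above
  fi
}

mkdir -p output
chmod "$(output_mode "$(id -u)")" output
```

On other setups (for example Docker Desktop) bind-mount permissions are handled differently, so the fallback may be needed regardless.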

If you consider this a feasible fix, I can open a pull request with all the changes (including the build and usage instructions).

Hmm, I just saw you pushed a hotfix (c5c4b98); let me check it out.

ghost pushed a commit that referenced this issue Nov 26, 2020
@vincentcox

So I am using your Dockerfile, but the build gets stuck at this step:

Step 11/16 : RUN yarn global add file:/opt/website-evidence-collector --prefix /home/collector
 ---> Running in 0363b73f8c9a
yarn global v1.22.10
[1/4] Resolving packages...
warning file:/opt/website-evidence-collector > [email protected]: request-promise-native has been deprecated because it extends the now deprecated request package, see https://github.com/request/request/issues/3142
warning file:/opt/website-evidence-collector > [email protected]: request has been deprecated, see https://github.com/request/request/issues/3142
warning file:/opt/website-evidence-collector > request > [email protected]: this library is no longer supported
warning file:/opt/website-evidence-collector > pug > pug-code-gen > constantinople > babel-types > babel-runtime > [email protected]: core-js@<3 is no longer maintained and not recommended for usage due to the number of issues. Please, upgrade your dependencies to the actual version of core-js@3.
[2/4] Fetching packages...
error An unexpected error occurred: "EACCES: permission denied, scandir '/opt/website-evidence-collector/output/browser-profile'".
info If you think this is a bug, please open a bug report with the information provided in "/home/collector/.config/yarn/global/yarn-error.log".
info Visit https://yarnpkg.com/en/docs/cli/global for documentation about this command.
The command '/bin/sh -c yarn global add file:/opt/website-evidence-collector --prefix /home/collector' returned a non-zero code: 1

Any idea why this is happening?

@ghost
Author

ghost commented Nov 26, 2020

I could reproduce this problem.

Try deleting the folder /opt/website-evidence-collector/output/browser-profile. This solved the issue for me. I do not understand why this folder can break the build process.

@vincentcox

OK, it builds now if I add this to the Dockerfile:

RUN rm -rf /opt/website-evidence-collector/output/browser-profile
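
A possible explanation, offered as an assumption rather than a confirmed diagnosis: `COPY . /opt/website-evidence-collector/` also copies a local `output/browser-profile` folder left over from an earlier run. Copied files are owned by root, and if that folder is readable only by its owner, yarn (running as `collector`) cannot scan it. If so, keeping the folder out of the build context with a `.dockerignore` entry would avoid the extra `rm -rf` step. A minimal shell sketch that adds the entry, assuming the local `output` folder is not needed inside the image:

```shell
#!/bin/sh
# Exclude local run output from the Docker build context so that
# COPY does not pick it up. Appends only if the entry is not present yet.
grep -qxF 'output' .dockerignore 2>/dev/null || echo 'output' >> .dockerignore
```

Run it once from the repository root, next to the Dockerfile, before `docker build`.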

Unfortunately, it's still the same issue as #42.

Do you have the same issue if you run this?

docker run --rm -it --cap-add=SYS_ADMIN -v "$(pwd)/output:/output" website-evidence-collector https://vincentcox.com --overwrite

It takes a lot of time and keeps using more and more RAM. It's strange that it also happens with Docker, which should be platform independent. It's not only my website, but also sites of a client I am making a dashboard for (unfortunately I can't share them publicly here).

So I'm afraid I'll stick with this one: #43 (comment)
