Reproducible image builds for Fornax images #10

Closed
wants to merge 16 commits into from
43 changes: 39 additions & 4 deletions README.md
@@ -1,8 +1,43 @@
# fornax-images
Customised Jupyterhub images for the Fornax Platform deployments
This repo contains the Docker images for the Fornax Platform deployments.
It produces reproducible computing environments. Some of the parts are
adapted from the [Pangeo](https://github.com/pangeo-data/pangeo-docker-images) project.

Have separate subdirectories for the different images, and please list them below for documentation purposes.
Reproducibility is achieved by keeping track of the software environments using conda-lock.
The following is a general description of the images:

- Each image is in its own directory (e.g. `base-image` and `tractor`).
- `base-image` is a base image that contains basic JupyterHub and Lab setup.
Other images should use it as a starting point.
- JupyterLab is installed in a conda environment called `notebook`, and it is the
default environment when running the images.
- The `build.py` script should be used when building an image. It takes as a parameter
the name of the folder that contains the Dockerfile, which is also the name of the image.
For example, `python build.py base-image` builds the base image, and
`python build.py tractor` builds the tractor image.
- The Dockerfile of each image (other than `base-image`) should start from the base image:
`FROM fornax/base-image:latest`. That will trigger the `ONBUILD` sections defined in the
`base-image/Dockerfile`, which include:
  - If `apt.txt` exists, it is parsed for the list of system packages to install with `apt-get`.
  - If `postBuild*` files exist, the scripts are run during the build.
  - If `conda-{env}.yml` exists, it defines a conda environment called `{env}`.
  - Additionally, if `conda-{env}-lock.yml` exists, it is used as the `conda-lock` file that locks
    the versions of the installed libraries. To create or update it, pass `--update-lock` to the
    build script `build.py`. This first generates a conda environment file from what is installed in the
    conda environment, then uses `conda-lock` to lock the versions, and finally generates a human-readable
    `packages.txt` that lists the installed libraries and their versions.
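
The file conventions above can be sketched in Python. This is only an illustration of the naming rules: the real logic lives in `scripts/apt-install.sh` and `scripts/conda-env-install.sh`, which are not shown here, so the helper names `read_apt_packages` and `find_conda_envs`, and the handling of `#` comments in `apt.txt`, are assumptions.

```python
import re
from pathlib import Path


def read_apt_packages(apt_file):
    """Return package names from an apt.txt-style file.

    Blank lines are skipped. Treating '#' as a comment marker is an
    assumption; the actual install script may behave differently.
    """
    packages = []
    for line in Path(apt_file).read_text().splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            packages.extend(line.split())
    return packages


def find_conda_envs(image_dir):
    """Map each env name to (env file name, lock file name or None).

    Follows the conda-{env}.yml / conda-{env}-lock.yml naming convention.
    """
    envs = {}
    for path in sorted(Path(image_dir).glob("conda-*.yml")):
        if path.name.endswith("-lock.yml"):
            continue  # lock files are paired with their env file below
        env = re.fullmatch(r"conda-(.+)\.yml", path.name).group(1)
        lock = path.with_name(f"conda-{env}-lock.yml")
        envs[env] = (path.name, lock.name if lock.exists() else None)
    return envs
```

Calling `find_conda_envs("base-image")` on a directory that holds `conda-notebook.yml` and `conda-notebook-lock.yml` would report one environment, `notebook`, together with its lock file.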

The recommended workflow is therefore:
- Define the library requirements for a conda environment `{env}` in `conda-{env}.yml`.
- Build the image with `python build.py {image-name} --update-lock`.
- This generates `conda-{env}-lock.yml` and `packages.txt`. Both should be kept under
version control. The next time the image is built with `python build.py {image-name}`, the lock
file is used inside the Dockerfile to reproduce the exact build.
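
As a rough sketch of the command-line interface this workflow implies, the snippet below parses an image name and the `--update-lock` flag and assembles a `docker build` command. It is not the actual `build.py`, which is not shown here: the function names `parse_args` and `build_command` are hypothetical, and the `fornax` repository and `latest` tag are assumptions taken from the `FROM fornax/base-image:latest` line.

```python
import argparse


def parse_args(argv=None):
    """Parse the image name and the optional --update-lock flag."""
    parser = argparse.ArgumentParser(
        description="Build a Fornax image (illustrative sketch)"
    )
    parser.add_argument(
        "image",
        help="folder containing the Dockerfile; also used as the image name",
    )
    parser.add_argument(
        "--update-lock",
        action="store_true",
        help="regenerate the conda lock file after building",
    )
    return parser.parse_args(argv)


def build_command(image, repository="fornax", tag="latest"):
    """Return the `docker build` argument list for an image folder.

    The repository and tag defaults are assumptions, not confirmed values.
    """
    return ["docker", "build", "--tag", f"{repository}/{image}:{tag}", image]
```

A driver script could pass the result of `build_command(args.image)` to `subprocess.run(..., check=True)` and, when `args.update_lock` is set, follow up with the lock-file generation step.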

# The images
- `base-image`: the base image that all other images should start from. It contains Jupyter and the basic tools needed for deployment in the Fornax project.

- `tractor`: the main astronomy image that was used for the demo. It contains tractor and other generally useful tools.

- `heasoft`: high-energy image containing HEASoft (TODO).

#### fornax_forced_photometry
- Basic image with tractor installed
94 changes: 94 additions & 0 deletions base-image/Dockerfile
@@ -0,0 +1,94 @@
FROM quay.io/jupyter/base-notebook:2024-08-12

LABEL org.opencontainers.image.source=https://github.com/fornax-navo/fornax-images
LABEL org.opencontainers.image.ref.name="Fornax Base Image"
LABEL org.opencontainers.image.version=0.2
LABEL maintainer="Fornax Project"

# clear the jupyter stack from base
RUN mamba remove jupyterlab notebook jupyterhub nbclassic ipykernel \
&& mamba clean -afy \
&& find ${CONDA_DIR} -follow -type f -name '*.a' -delete

ENV CONDA_ENV=notebook


# Ask dask to read config from ${CONDA_DIR}/etc rather than
# the default of /etc, since the non-root jovyan user can write
# to ${CONDA_DIR}/etc but not to /etc
ENV DASK_ROOT_CONFIG=${CONDA_DIR}/etc

# COPY the current content to $HOME/build
RUN mkdir -p $HOME/build/
COPY --chown=$NB_USER:$NB_USER apt* conda*yml postBuild* $HOME/build/
COPY --chown=$NB_USER:$NB_USER overrides.json $HOME/build/

USER root

# Make /opt/ user writeable so it can be used by postBuild scripts
RUN fix-permissions /opt/

# Install OS packages and then clean up
COPY --chown=$NB_USER:$NB_USER scripts/*.sh /opt/scripts/
# Read apt.txt line by line, and execute apt-get install for each line
RUN cd build && bash /opt/scripts/apt-install.sh



USER $NB_USER
# setup conda environments
RUN mamba install conda-lock \
&& cd build && bash /opt/scripts/conda-env-install.sh

# Change display name of the default kernel
RUN mamba run -n $CONDA_ENV python -m ipykernel install --sys-prefix --display-name "$CONDA_ENV"

# Any other postBuild scripts #
RUN cd $HOME/build \
; for script in `ls postBuild*`; do \
echo "Found script ${script} ..." \
&& chmod +x $script \
&& ./$script \
; done
# --------------------------- #

# Make $CONDA_ENV default; do it at the global level
# because ~/.bashrc is not loaded when user space is mounted
USER root
ENV PATH=$CONDA_DIR/envs/$CONDA_ENV/bin:$PATH
RUN cat $HOME/.bashrc >> /etc/bash.bashrc \
&& printf "\nconda activate \$CONDA_ENV\n" >> /etc/bash.bashrc \
&& printf "" > $HOME/.bashrc
USER $NB_USER


# reset user and location
RUN rm -r $HOME/build $HOME/work /tmp/*
WORKDIR ${HOME}


# Install OS packages and then clean up
ONBUILD RUN mkdir -p $HOME/build
ONBUILD COPY --chown=$NB_USER:$NB_USER apt* conda*yml postBuild* $HOME/build/
ONBUILD USER root
ONBUILD RUN mkdir -p build && cd build && bash /opt/scripts/apt-install.sh
ONBUILD USER $NB_USER
# ------------------------------------ #


# setup conda environments
ONBUILD RUN cd build && bash /opt/scripts/conda-env-install.sh
# ----------------------- #

# Any other postBuild scripts #
ONBUILD RUN cd build \
; for script in `ls postBuild*`; do \
echo "Found script ${script} ..." \
&& chmod +x $script \
&& ./$script \
; done
# --------------------------- #

ONBUILD RUN rm -r $HOME/build
ONBUILD USER ${NB_USER}
ONBUILD WORKDIR ${HOME}
12 changes: 12 additions & 0 deletions base-image/apt.txt
@@ -0,0 +1,12 @@
vim
nano
emacs
fuse
bzip2
git
curl
zip
build-essential
gcc
make
gfortran