It is my understanding that we have a dependency on torch in both:
tackle2-addon-tca advisor
tackle2-addon-test-generator
We also have a desire to build all of our images multi-arch, covering:
x86_64
aarch64
ppc64le
s390x
Torch is not available for ppc64le or s390x on PyPI and therefore cannot be installed via pip.
It is available using (ana)conda, but conda is not desirable for a few reasons:
It does not use the system python, so we're probably looking at larger images.
We'd have to devise an entrypoint script to activate the conda environment before running tasks.
The installer is only available in EPEL, and when considering downstream it would be a fair amount of work to package it and its dependencies for use.
Even if that were done, I don't believe conda could reach out to the internet to install packages.
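For illustration, a conda-based image would look something like the sketch below. The Miniconda installer URL, the /opt/conda prefix, and the entrypoint wrapper are my assumptions, not a tested recipe:

```dockerfile
FROM registry.access.redhat.com/ubi9/ubi-minimal

# Fetch and install Miniconda alongside the system python (assumed prefix /opt/conda).
RUN microdnf -y install wget && microdnf clean all && \
    wget -q "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-$(uname -m).sh" -O /tmp/conda.sh && \
    sh /tmp/conda.sh -b -p /opt/conda && \
    rm /tmp/conda.sh

# Install torch from the pytorch channel into a dedicated environment.
RUN /opt/conda/bin/conda create -y -n torch -c pytorch pytorch && \
    /opt/conda/bin/conda clean -ay

# Tasks only see torch once the environment is activated, hence the
# entrypoint wrapper mentioned above (script not shown here).
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```

This also makes the image-growth point concrete: a second Python stack and its packages live under /opt/conda in addition to the system python.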
It is possible to build from source. https://github.com/jmontleon/torch-distribution contains a Dockerfile and GitHub Action to perform a build. From there it would be as simple as doing something like this in another container image to install:
FROM quay.io/jmontleon/torch-distribution:latest as torch
FROM registry.access.redhat.com/ubi9/ubi-minimal
RUN --mount=type=bind,from=torch,source=/dist/,target=/dist/ \
pip3 --no-cache-dir install /dist/*.whl
The big problem with this approach is that GitHub Actions has a six-hour timeout per job.
x86_64 builds take about three hours unemulated. Even if the individual builds were split into separate jobs, the emulated builds for the other architectures won't finish in time.
Options:
These containers are optional, so build and provide them for aarch64 and amd64 only.
Use a split approach: conda upstream, building from source downstream.
Use a split approach: building from source on real hardware upstream, using Copr, may be feasible. Downstream could use the source-container method.
Container builds on real hardware? Travis might be an option. I'm not sure if they have s390x hardware, though they do have arm and ppc64le builders.
Also wondering if anyone knows whether we have any need for torchvision or torchaudio. My guess is no; however, most of the install documentation instructs you to install all three.
Example of a build from a container. Unless we find somewhere to do hardware builds of containers, this will only work for x86_64 upstream. It remains a decent example for a downstream path, at least.
FROM quay.io/jmontleon/torch-distribution:latest as torch
FROM registry.access.redhat.com/ubi9/ubi-minimal
RUN microdnf -y install libgomp python3-pip && microdnf clean all
RUN --mount=type=bind,from=torch,source=/dist/,target=/dist/ pip3 --no-cache-dir install /dist/*.whl
Copr does have hardware builders for all of these architectures, though it is for RPM builds. Even so, it could provide a similar approach for upstream. This isn't a viable path for downstream, since this RPM requires internet access to build. I believe we can also request a group like @konveyor in Copr if we want to do something like this, so it's not beholden to a personal repo.
cat << EOF >> Dockerfile
FROM registry.access.redhat.com/ubi9/ubi-minimal as torch
RUN microdnf -y install dnf dnf-plugins-core
RUN dnf -y copr enable jmontleon/torch
RUN dnf -y install torch-distribution
FROM registry.access.redhat.com/ubi9/ubi-minimal
RUN microdnf -y install libgomp python3-pip && microdnf clean all
RUN --mount=type=bind,from=torch,source=/dist/,target=/dist/ pip3 --no-cache-dir install /dist/*.whl
EOF
DOCKER_BUILDKIT=1 podman build -f Dockerfile --manifest foo --platform linux/s390x,linux/amd64,linux/arm64,linux/ppc64le .
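When scripting multi-arch builds like the one above, it can be handy to translate `uname -m` machine names into the `--platform` strings podman expects. A small helper sketch (the function name is mine):

```shell
# Map a `uname -m` machine name to a container --platform string.
# Covers the four architectures discussed in this issue.
arch_to_platform() {
  case "$1" in
    x86_64)  echo linux/amd64 ;;
    aarch64) echo linux/arm64 ;;
    ppc64le) echo linux/ppc64le ;;
    s390x)   echo linux/s390x ;;
    *) echo "unsupported arch: $1" >&2; return 1 ;;
  esac
}

# e.g. build only for the machine you are on:
# podman build -f Dockerfile --platform "$(arch_to_platform "$(uname -m)")" .
```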