Build and publish ARM images for Kubeflow Pipelines #10309
Comments
@chensun @zijianjoy I think this is a very important issue, as ARM64 (especially on MacBooks) is now very common.
I can see that there was a merged PR (from 2019) to make some builds succeed on ARM64: But another one was closed due to inactivity: I will tag the author of those PRs so they can comment on this @MrXinWang.
@thesuperzapper Let me know how I can help with this.
+1 on this issue. Each quarter, more people are switching to Apple Silicon from older Intel Macs.
Another image is
In my testing, trying to build the images for
The problematic pip packages are:
There are already upstream issues for some of them, though they mostly relate to Apple Silicon (slightly different from Linux ARM64); I imagine that solving one will make it much easier to solve the other:
We either need to get those packages working so they can be
@thesuperzapper metadata-writer and visualization-server are deprecated KFP v1 components, so they're not required for KFP v2.
We run a small ARM-based cluster that we want to run Kubeflow on, so I have started to build the components for ARM. I've been successful at building the cache-server, persistence agent, scheduled workflow agent, viewer-crd-controller, and frontend. I only had to set

The main reason for this is that https://github.com/mattn/go-sqlite3/ now needs to be compiled with a cross-compiler, so I have to run

However, this seems very fragile to changes in the build server, new CPU architectures, etc., so I looked into why we even include SQLite, and the answer seems to be that we only use SQLite for integration testing. One way to do this is to move the SQLite references to a separate

In fact, I have done this in our custom build, and now I can build the binary and Docker container without SQLite, with the same configuration change as for the other components mentioned above.
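As a rough illustration of that idea (not KFP's actual build setup — the package path and the `sqlite` tag name here are assumptions), the go-sqlite3 import would live in a file guarded by a `//go:build sqlite` constraint, so only tagged builds need CGO:

```shell
# Today: go-sqlite3 forces CGO on, so targeting ARM64 from an amd64 build
# host needs a target-specific C cross-compiler (fragile):
CGO_ENABLED=1 CC=aarch64-linux-gnu-gcc GOOS=linux GOARCH=arm64 \
  go build ./backend/src/apiserver

# With the SQLite driver import behind a hypothetical "sqlite" build tag,
# the normal binary cross-compiles with no C toolchain at all:
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build ./backend/src/apiserver

# ...and the integration tests opt back in to the SQLite-enabled code path:
go test -tags sqlite ./backend/...
```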
I am considering contributing some of my changes here, but I can't really figure out how the images are built. I expect it has something to do with https://github.com/kubeflow/pipelines/blob/master/.cloudbuild.yaml? Perhaps @rimolive can give some pointers? Also, what do you think of my proposal to remove SQLite from the final Go binary and only enable it for integration tests using build tags?
@AndersBennedsgaard if you want a quick way to build all the images for testing, you can use the same approach as the deployKF fork of Kubeflow Pipelines. You can take the same GHA configs that we added in this commit: deployKF@d800253. Even if you don't use the GHA configs directly, you can use them to figure out the full list of images that make up Kubeflow Pipelines and where their Dockerfiles are.
NOTE: these workflows have
NOTE 2: this excludes the
@thesuperzapper as I mentioned in #10309 (comment), we already have KFP fully running on an ARM-only cluster, so I have already cross-compiled the images using BuildX + QEMU in our own fork.
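For reference, a minimal BuildX + QEMU invocation for such a multi-arch build looks roughly like this (the builder name, image name, and Dockerfile path are placeholders, not the project's real build config):

```shell
# One-time: register QEMU emulators so BuildX can run non-native build stages
docker run --privileged --rm tonistiigi/binfmt --install all

# One-time: create and select a builder that supports multi-platform output
docker buildx create --name kfp-multiarch --use

# Build for both architectures and push a single multi-arch manifest list
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag example.io/kfp/api-server:dev \
  --file backend/Dockerfile \
  --push \
  .
```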
We are already working on migrating the CI pipelines to GitHub Actions. See #10744.
@rimolive #10744 does not mention moving the release workflow logic to GH Actions. Should we include that in that issue? @thesuperzapper would you mind adding all the relevant
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
/reopen
@thesuperzapper: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
(still relevant, bumping comment to avoid stale status)
Description
Currently, Kubeflow Pipelines only publishes `amd64` container images, while most other Kubeflow components now publish both `amd64` and `arm64`.

Here is the list of images that need to be updated (this was the list for `2.0.0-alpha.7`; more may have been added for `2.0.0+`):
- `gcr.io/ml-pipeline/cache-server`
- `gcr.io/ml-pipeline/metadata-envoy`
- `gcr.io/ml-pipeline/metadata-writer`
- `gcr.io/ml-pipeline/api-server`
- `gcr.io/ml-pipeline/persistenceagent`
- `gcr.io/ml-pipeline/scheduledworkflow`
- `gcr.io/ml-pipeline/frontend`
- `gcr.io/ml-pipeline/viewer-crd-controller`
- `gcr.io/ml-pipeline/visualization-server`
- `gcr.io/tfx-oss-public/ml_metadata_store_server`
- `gcr.io/google-containers/busybox`
While most of these can run under Rosetta (on Apple Silicon Macs only), they run much more slowly, so they are really only useful for testing.
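You can check whether a given image already publishes an `arm64` variant by inspecting its manifest list; the tag below is just an example based on the `2.0.0-alpha.7` release referenced above:

```shell
# Prints the platforms in the image's manifest list; a multi-arch image
# will list both linux/amd64 and linux/arm64 entries
docker buildx imagetools inspect gcr.io/ml-pipeline/api-server:2.0.0-alpha.7
```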
Furthermore, the `gcr.io/tfx-oss-public/ml_metadata_store_server` image straight up does not work (even under emulation). I have made a separate issue to track this one, as it is not controlled by KFP and is part of `google/ml-metadata`: `ml_metadata_store_server` container image for ARM64 #10308

Love this idea? Give it a 👍.