Jax-IREE not working on MacOS-13 #4521
Comments
cc @jsbrittain who could help us with this. I had a macOS device running macOS 13, but I couldn't reproduce the crash when running the relevant tests from the test suite locally (ran them a few times). I updated it to macOS 14 today, so I don't know if we can reliably reproduce this outside of a CI runner.
@agriyakhetarpal just a very quick comment for the time being: as noted above, there are multiple components that need to be upgraded together, but this has to be done carefully. I actually took a look at this a couple of months back in an effort to get rid of the demotion code (IREE only runs at 32-bit precision at present). Notes from that investigation are here and may prove helpful: jsbrittain#4
Some follow-up exploratory notes. I tested upgrading to the latest Jax/Jaxlib (0.4.34) and corresponding iree-compiler (1036) with relaxed version constraints on everything apart from Windows. I don't have a macOS-13 machine, but using gh-runners produced a slightly different set of results to your previous run (see https://github.com/jsbrittain/PyBaMM/actions/runs/11391403398/job/31695386575?pr=5):

[Results table not preserved; columns: ubuntu, macos-13, macos-14, windows]

Looking into the available wheels, here are the current offerings:

Re the macos worker crashes (all of which seem to occur on TestIDAKLUSolver), there are a few blogs suggesting that

On a slightly more positive note, the jax/iree combination tested compiles and successfully solves locally (which is much more promising than my experience a few months back: jsbrittain#4). So, this may principally be a test problem.

I had a quick chat with @martinjrobins and, given the dependency issues, it may be an option to simply disable IREE for the time being. It is only enabled by building with specific user-configurable flags at present (PYBAMM_IDAKLU_EXPR_IREE=OFF by default) as it is still slow compared to casadi; in fact, it is only being enabled in tests to maintain the functionality until it becomes more usable, but it is not built into the wheels at all.
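To illustrate the "OFF by default" behaviour described above, here is a minimal sketch of how a script might interpret an ON/OFF style flag such as PYBAMM_IDAKLU_EXPR_IREE from the environment. The helper name `flag_enabled` and the set of accepted truthy spellings are assumptions made for illustration; this is not PyBaMM's actual build logic.

```python
import os
from typing import Mapping, Optional


def flag_enabled(
    name: str,
    env: Optional[Mapping[str, str]] = None,
    default: str = "OFF",
) -> bool:
    """Return True if the named ON/OFF flag is switched on.

    Hypothetical helper: the accepted spellings below are an assumption,
    not something taken from PyBaMM's build scripts.
    """
    env = os.environ if env is None else env
    value = env.get(name, default)
    return value.strip().upper() in {"ON", "1", "TRUE", "YES"}


# With nothing set, the flag falls back to its default (OFF).
iree_requested = flag_enabled("PYBAMM_IDAKLU_EXPR_IREE", env={})
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.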
Thanks for the details, @jsbrittain! Yes, that sounds like a good plan, and IREE has not been enabled yet for https://github.com/pybamm-team/pybammsolvers either. Though, another option could be to keep testing the IREE code, and instruct Adding a decorator
@jsbrittain, @agriyakhetarpal I think it might be better to remove IREE functionality for now instead of just disabling it, and bring it back in the future when things improve. At the very least it should be disabled for MacOS, since allowing it for only a single Python version makes it difficult to use. Python 3.9 reaches end of life about 1 year from now, so that part of the Jax-IREE complexity will go away when we remove support for 3.9.
Reasons for removal:
We can always bring back the IREE interface in the future, or look into whether it is possible to build a "bring your own solver" plugin sort of architecture.
@jsbrittain Of course I don't want to hinder any research that requires IREE functionality. Is there anything critical that depends on it?
@kratman None that I am aware of. The MLIR/IREE approach has the potential to offer fast and flexible expression evaluation, capable of lowering onto numerous architectures (including GPU), but it has not achieved that potential yet; at present we are working around problems (e.g. precision and compatibility) that may be resolved with more mature/stable releases.
Disabling IREE on MacOS was done in #4528, but it is still available on Linux |
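For readers wondering what a platform gate of this kind can look like in a test suite, here is a minimal sketch using only the standard library. The helper `iree_supported`, the test class, and the choice to treat only Linux as supported are assumptions for illustration; the actual change in #4528 may be structured quite differently.

```python
import platform
import unittest
from typing import Optional


def iree_supported(system: Optional[str] = None) -> bool:
    """Hypothetical check: treat IREE as available on Linux only.

    The thread says IREE was disabled on MacOS and remains available on
    Linux; other platforms are assumed unsupported in this sketch.
    """
    system = system or platform.system()
    return system == "Linux"


@unittest.skipUnless(iree_supported(), "IREE is only exercised on Linux in this sketch")
class TestIreeGate(unittest.TestCase):
    def test_runs_only_where_supported(self):
        # Reaching this point means the gate let the test through.
        self.assertTrue(iree_supported())
```

On macOS runners the whole class is reported as skipped rather than failed, which keeps the IREE code path tested where it is known to work.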
While removing the deprecated MacOS-12 runners for GitHub Actions, I ran into issues with IREE. The comments suggest that it should work with MacOS version 13, but the tests fail. It seems to work fine on MacOS-14 and Linux.
As a temporary fix, I suggested changing the version in PR #4520.
A few other things I noticed while looking into the failures:
utils.py and pyproject.toml
noxfile.py and pyproject.toml