Update docs on system-provided MPI including HDF5 #1706
Conversation
Review checklist

This checklist is meant to assist creators of PRs (to let them know what reviewers will typically look for) and reviewers (to guide them in a structured review process). Items do not need to be checked explicitly for a PR to be eligible for merging.

- Purpose and scope
- Code quality
- Documentation
- Testing
- Performance
- Verification

Created with ❤️ by the Trixi.jl community.
Thanks for adding this. I am wondering (since I already forgot):
Could you add a small section with a rationale why all these need to be configured? For p4est and t8code it is probably easy (they initialize themselves with MPI function calls and thus MPI needs to work), while for HDF5 it is (to me) not immediately obvious.
You mean also if you don't use these libraries explicitly (otherwise it is clear because they call MPI functions and thus need to be compiled against the same MPI implementation, which is already explained in the docs)? For

```julia
julia> using MPI

julia> using HDF5

julia> MPI.Init()
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
[...]
```

I think this is caused by the fact that HDF5_jll is compiled with an MPI implementation that is not compatible with the implementation that is set by the MPIPreferences.
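The fix this PR documents amounts to pointing HDF5.jl at a system-provided HDF5 library that was built against the same MPI implementation selected via MPIPreferences. A minimal sketch, assuming a recent HDF5.jl version that provides `HDF5.API.set_libraries!`; the library paths below are placeholders for an MPI-enabled system HDF5:

```julia
using HDF5

# Point HDF5.jl at a system HDF5 built against the same MPI implementation as
# the one configured via MPIPreferences. The paths are placeholders; adjust
# them to the actual locations on your system.
HDF5.API.set_libraries!("/usr/lib/x86_64-linux-gnu/hdf5/openmpi/libhdf5.so",
                        "/usr/lib/x86_64-linux-gnu/hdf5/openmpi/libhdf5_hl.so")

# The preference is stored in LocalPreferences.toml; restart Julia afterwards
# so that it takes effect.
```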
Thanks
@JoshuaLampert Thanks for the explanation above. Have you tried raising this as an issue with the HDF5 folks - maybe not to get a "fix", but at least a confirmation that this is known and understood behavior? If yes, maybe they would also be open to implementing a similar check with a warning.
Yes, exactly. I think it would be helpful for users who do not want to use any of the packages to understand why they need to set these preferences if they want either no crashes (HDF5) or no warning messages (p4est, t8code).
This is known behavior and explained in a bit more detail in JuliaIO/HDF5.jl#1079. I asked for the possibility to add a warning. However, the setting for HDF5.jl is a bit different than we had for P4est.jl or T8code.jl since the error appears in the
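For illustration, a check with a warning along the lines discussed above could look roughly like the following sketch. It assumes that MPIPreferences exposes the configured binary via `MPIPreferences.binary` and that HDF5.jl stores a custom library path under the `libhdf5` preference key; it is not the actual P4est.jl, T8code.jl, or HDF5.jl code.

```julia
using MPIPreferences, Preferences, HDF5

# Sketch of a consistency check: if a system MPI is configured but HDF5.jl has
# no custom libhdf5 preference set, HDF5.jl falls back to HDF5_jll, which is
# built against a JLL-provided MPI and may be incompatible.
function warn_on_possible_hdf5_mpi_mismatch()
    if MPIPreferences.binary == "system" &&
            load_preference(HDF5, "libhdf5", nothing) === nothing
        @warn "MPIPreferences is set to a system MPI, but HDF5.jl still uses the " *
              "JLL-provided libhdf5. Consider configuring a system HDF5 built " *
              "against the same MPI implementation."
    end
end

warn_on_possible_hdf5_mpi_mismatch()
```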
@sloede, @JoshuaLampert: Okay to be merged?
Sounds good to me. Please ping me if we forget this PR when the other one is merged.
Our docs on how to use a system-provided MPI installation were missing that, in this case, a custom HDF5 library also needs to be used. I updated the docs accordingly.
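To check that such a setup works end to end, something like the following can be used once both the MPI and HDF5 preferences have been set and Julia has been restarted; it assumes `HDF5.has_parallel()` is available to report whether the loaded libhdf5 was built with MPI support:

```julia
using MPI, HDF5

# With a consistently configured system MPI and system HDF5, this should
# initialize MPI without the opal_init failure quoted above.
MPI.Init()

# Check that the loaded libhdf5 was built with (the same) MPI support.
@show HDF5.has_parallel()
```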