-
-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always ensure consistency of new MPI datatypes #877
Conversation
903e323
to
b10e708
Compare
src/datatypes.jl
Outdated
# Make sure the "aligned" size of the type matches the MPI "extent". | ||
sz = sizeof(T) | ||
al = Base.datatype_alignment(T) | ||
mpi_extent = MPI.Types.extent(datatype) | ||
aligned_size = (0, cld(sz,al)*al) | ||
@assert mpi_extent == aligned_size "The MPI extent of type $(T) ($(mpi_extent[2])) does not match the size expected by Julia ($(aligned_size[2]))" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I'm adding this check after the type is registered (we need that to be able to call MPI.Types.extent
) and added to the list of created datatypes, which means that the type is always added there even if the consistency check fails. I'm wondering if this should be moved after the init()
call above, but I don't know what happens to the created_datatypes
dictionary if the test errors out. So for safety I kept it outside of the get!
.
Also: * skip test of custom primitive type of size 80 on 32-bit systems in Julia v1.12+, as the MPI extent doesn't match Julia's aligned size * add a test for a new primitive datatype larger than 64 bits which should work on 32-bit systems
b10e708
to
b8f9864
Compare
LGTM, but we are the ones who tells MPI about size and alignment, are we not? Line 433 in 0d57c63
|
Uhm, yes and no, in the sense that we create a datatype of size Line 441 in 0d57c63
Primitive80 we have
but I don't know if it makes sense, but one option could be to create an MPI datype of size diff --git a/src/datatypes.jl b/src/datatypes.jl
index 9dd721fc..f46c0331 100644
--- a/src/datatypes.jl
+++ b/src/datatypes.jl
@@ -438,7 +438,8 @@ function create!(newtype::Datatype, ::Type{T}) where {T}
if isprimitivetype(T)
# primitive type
- szrem = sz = sizeof(T)
+ al = Base.datatype_alignment(T)
+ szrem = sz = cld(sizeof(T), al) * al
disp = 0
for (i,basetype) in (8 => Datatype(UInt64), 4 => Datatype(UInt32), 2 => Datatype(UInt16), 1 => Datatype(UInt8))
if sz == i i.e. the same check introduced here, but that still doesn't necessarily ensure What do you think? |
Ok, I went down a rabbit hole reading the source code of OpenMPI (much more readable than MPICH) and I found the following:
OpenMPI documentation for
However |
This should help ensure memory allocations of the MPI datatype have the same alignment as the Julia counterpart.
Yeah, I mentioned I pushed a commit to change the size of the datatype from simply |
I did some quick research to see if there was any prior art about this:
Maybe I'm missing something (totally possible!) but I suspect we're on our own here. |
Also:
With this PR, on a 32-bit linux I get
which should be better than silently getting incorrect results at runtime.
Fix #853.
@simonbyrne @vchuravy can you see cases where the consistency test added inside the constructor of the
MPI.Datatype
can cause problems?