Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement serialization and deserialization #93

Merged
merged 10 commits into from
Aug 15, 2024
Merged

Conversation

Yurlungur
Copy link
Collaborator

@Yurlungur Yurlungur commented Aug 12, 2024

PR Summary

Using tabulated data in, e.g., MPI Windows for shared memory requires the ability to serialize a DataBox object into pre-allocated shared memory and to build a new thread-local object around said shared memory, so that the object itself is thread-local but it internally points at a table that lives in shared memory. This PR implements this capability.

This will be needed for the equivalent capability in Singularity-EOS and also provides a prototype for how that model and API will look.

PR Checklist

  • Code is formatted. (You can use the format_spiner make target.)
  • Adds a test for any bugs fixed. Adds tests for new features.
  • If preparing for a new release, update the version in cmake.

@Yurlungur Yurlungur added the enhancement New feature or request label Aug 12, 2024
@Yurlungur Yurlungur self-assigned this Aug 12, 2024
spiner/databox.hpp Outdated Show resolved Hide resolved
@jhp-lanl
Copy link
Collaborator

Looking forward to this! I'll probably review tomorrow

@Yurlungur
Copy link
Collaborator Author

Yurlungur commented Aug 13, 2024

A few big picture thoughts on serialization:

  1. The implementation I've written here relies on the fact that everything except the single pointer is trivially copyable. Thats why memcpy(dst, this, sizeof(*this)) works. Nested static arrays are okay, for the same reason they can be captured in a KOKKOS_LAMBDA, nested pointers would not be.
  2. This can be generalized to nested pointers, but one would have to hierarchically serialize and walk a graph, rather than a single memcpy for the static memory and a single memcpy (or pointer assignment) for the dynamic memory.
  3. This implementation ignores endian-ness and padding. When serializing and de-serializing on a single architecture... i.e., within a single compiled executable, this is fine. If we wanted to dump an object to disk by serializing it, and then load it up on a different architecture with, e.g., different endianness, this would not work. HDF5 (or some other user-handled thing) is the file-I/O strategy here. Do not use the serialization routines for file I/O.
  4. The default serialization/de-serialization routines are host only. This is by design. You cannot share device memory across MPI ranks (unless you are doing MPS), so it doesn't make sense to try and share device memory in this way. In a device context there are two possible patterns: (a) The host-side databox is shared accross MPI ranks. When you call GetOnDevice that creates a thread-local device-side databox. I think this is the preferred pattern and it's what will happen if you just naively call the API. (b) You can create shared device-side databoxes by creating an array of databox objects on device and then calling setPointer to force them to share memory. I think this is best handled manually by user code, but spiner supports it with the setPointer method.

Comment on lines +532 to +533
PiecewiseDB<NGRIDS> dbh(Spiner::AllocationTarget::Host, NCOARSE, NCOARSE,
NCOARSE);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes were actually unnecessary. I originally had this test merged with the new test, and needed host-side data. However I split them into separate tests for clarity. Nevertheless, there's no harm in this change.

test/test.cpp Outdated
Comment on lines 623 to 628
AND_THEN("They do not point ot the same memory") {
// checks DataBox pointer
REQUIRE(dbh2.data() != dbh.data());
// checks accessor agrees
REQUIRE(&dbh2(0) != &dbh(0));
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is important, as it ensures that we didn't accidentally just create a shallow copy of the original DataBox.

@Yurlungur
Copy link
Collaborator Author

Tests triggered on re-git.

@Yurlungur
Copy link
Collaborator Author

Tests triggered on re-git.

tests pass.

Copy link
Collaborator

@jhp-lanl jhp-lanl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Approved with a minor addition to the tests and a question.

spiner/databox.hpp Show resolved Hide resolved
test/test.cpp Show resolved Hide resolved
@Yurlungur Yurlungur merged commit 17a16fd into main Aug 15, 2024
4 checks passed
@Yurlungur Yurlungur deleted the jmm/serialization branch August 15, 2024 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants