-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory policy #983
base: master
Are you sure you want to change the base?
Memory policy #983
Conversation
@nksauter , just making sure I see where we are: in the latest commit, the test works, but there is duplicated code which we want to avoid ? |
@dermen stand by please with regard to the working tests. I'm going to
apply the test to additional cases today to double check. The duplicated
code is a serious problem, as my original intent was to implement
polymorphism with the minimal lines of code, instead I had to duplicate an
entire kernel. Further, I was unable to shrink the size of the
m_accumulate_floatimage array as was the original intent.
…On Thu, May 9, 2024 at 8:53 AM Derek Mendez ***@***.***> wrote:
@nksauter <https://github.com/nksauter> , just making sure I see where we
are: in the latest commit, the test works, but there is duplicated code
which we want to avoid ?
—
Reply to this email directly, view it on GitHub
<#983 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADQ24VTVVOZLU4RA3OPDZETZBOLXNAVCNFSM6AAAAABF7NPLTGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMBSHE2DAMJXGY>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After 3 weeks of testing via psii_spread's annulus worker, this PR was indeed found to significantly lower the memory footprint when simulating images on GPU. I don't have particular issues with the functionality of this code.
edb7ebb
to
7e5ae72
Compare
In the exascale_api, allow pixel values to be calculation either on large array (all pixels), or with low-memory on just the whitelist consisting of shoebox pixels. This commit only gives the polymorphism framework; both implementations are currently identical giving the large-array behavior.
The script tests/tst_memory_policy.py fails with a cuda illegal access. The intention is to get help from NESAP to get a functional test.
Memory savings achieved through code specialization, for the case where pixel values are simulated on a small whitelist. Specializations are not yet optimal, as there is still a lot of code duplication. Changes give ~4.5x reduction in memory footprint, but no success yet in resizing the array m_accumulate_floatimage. Attempts so far lead to cuda memory allocation error.
7e5ae72
to
fa0295b
Compare
I've created a failing unit test for new kokkos code in the cuda-context.
If the test "libtbx.python cctbx_project/simtbx/tests/tst_memory_policy.py" can be made to work, the bug has been fixed.
Background information: I'm trying to extend the kokkos exascale_api with new behavior. In the old way, it would allocate large arrays on GPU corresponding to the detector size even if only a few pixels are calculated due to the whitelist (a list of pixels of interest). In the new behavior only enough memory is allocated for the whitelist pixels. I am using C++ templates with template specialization dependent on these two memory-allocation cases. I need this to work so Daniel can move ahead with the SPREAD project, and it is just this last detail that I cannot seem to fix.