Omega_h not compatible with CUDA on Weaver #999
@cwsmith what versions of cuda are supported?
Hmmm. That check may be a bit conservative now that we have a 'pure' kokkos backend that doesn't rely on thrust; there were thrust bugs in some cuda releases. I'll run a test with the problematic cuda 11.2 and the new backend to confirm.
@jewatkins I'm running tests now (tracked here) and will keep you posted.
@jewatkins In my testing, CUDA >= 11.4.4 works (tested with GCC 10.4.0; newer/other versions are fine).
@ikalash maybe it's best just to turn off omega_h for this build for now since we'll likely transition off of weaver and onto blake. I can test omega_h + cuda there
We can definitely turn it off in the weaver nightlies. I will do this if there are no objections. How about PM? That one is currently using gcc 11.2.0, which it sounds like might be problematic for omega_h. I was going to tell @mcarlson801 to try turning it on there once we got the weaver ones up, but it sounds like we may have to hold off.
Why are we turning off weaver? Does blake feature V100 as well? I thought it didn't... Since Summit's life got extended by a year, I think it's best to keep V100 tested somewhere, so if blake does not feature V100, we should prob keep weaver.
We're not turning off weaver yet, just disabling omega_h. There are issues with the new module set on weaver (I sank a lot of time into it last FY) and there are open tickets which have not been resolved. blake has H100. Plan is to keep weaver online for as long as summit is online, or until it takes too much work to maintain.
It makes sense to test omega_h on perlmutter so I'd go ahead and try to turn it on there.
Could you please try this @mcarlson801?
FYI, he's OOO this week
Thanks for reminding me @jewatkins. There is no rush.
@cwsmith can we remove (or tune better) the check on the version then?
@bartgol Yeah, I'm going to add this today to cmake and spack.
Please let me know when the fix is pushed and I can re-activate Omega_h in the Weaver nightlies.
Omega_h v10.8.3 has the fixed cuda check: SCOREC/omega_h@40a2d36.
Sorry @cwsmith I just saw your comment now. Should I try turning Omega_h on in the weaver builds?
Ah, I missed this while I was out. I'll try turning it on for Perlmutter as well for this week's test.
That fix won't let us run with omega_h on weaver, since we're still on cuda 11.2.
@jewatkins: you are right. Good call.
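For reference, the version constraint discussed in this thread can be sketched as a simple comparison. This is only a minimal Python illustration, not Omega_h's actual CMake or spack check: the function name `cuda_supported` is hypothetical, and the 11.4.4 threshold is an assumption taken from the testing reported in this thread.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '11.2' into (11, 2, 0)."""
    parts = [int(p) for p in version.split(".")]
    # Pad missing components with zeros so '11.2' compares as 11.2.0.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def cuda_supported(version: str, minimum: str = "11.4.4") -> bool:
    """Return True if `version` is at least `minimum`.

    The default minimum is an assumption based on the testing
    reported in this thread, not a value read from Omega_h itself.
    """
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    for v in ("11.2", "11.4.4", "12.0"):
        print(v, cuda_supported(v))
```

Under that assumed threshold, the cuda 11.2 install on weaver fails the comparison, which matches the observation that the relaxed check still does not help there.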
I turned on Omega_h in the weaver nightlies and it looks like it's not compatible with the CUDA library:
https://sems-cdash-son.sandia.gov/cdash/build/53415/configure
I presume we will just punt on turning on Omega_h on weaver, or is there a different plan?
@jewatkins @mcarlson801