[NDTensors][NDTensorsCUDAExt] Improve performance of GPU backends #1194
Codecov Report
@@ Coverage Diff @@
## main #1194 +/- ##
===========================================
- Coverage 85.37% 67.34% -18.03%
===========================================
Files 88 88
Lines 8416 8388 -28
===========================================
- Hits 7185 5649 -1536
- Misses 1231 2739 +1508
…into kmp5/debug/cuda_rand
Co-authored-by: Matt Fishman <[email protected]>
This PR addresses issue #1193. To avoid scalar operations on the GPU, tensors are now filled with `rand` or `zeros` directly on the device.
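
For illustration only, here is a minimal sketch of the general idea, not the code in this PR. The helper names `zeros_like` and `rand_like` are hypothetical and are not part of NDTensors. The sketch allocates storage with the same array type as a prototype and fills it with whole-array calls (`fill!`, `rand!`) instead of element-by-element writes. It runs on the CPU as written; the assumption is that with CUDA.jl loaded, a `CuArray` prototype would take the same path and keep the fill on the device.

```julia
using Random

# Hypothetical helpers (not NDTensors API): allocate storage of the same
# array type as `proto`, then fill it with a single whole-array operation
# rather than a loop over individual elements.
function zeros_like(proto::AbstractArray, ::Type{T}, dims::Integer...) where {T}
    out = similar(proto, T, dims...)   # same array type as `proto`
    return fill!(out, zero(T))         # one bulk fill, no scalar indexing
end

function rand_like(proto::AbstractArray, ::Type{T}, dims::Integer...) where {T}
    out = similar(proto, T, dims...)
    return rand!(out)                  # fills `out` in place with random values
end

# CPU prototype so the sketch runs anywhere; a GPU array prototype is the
# assumed use case.
proto = Vector{Float32}(undef, 0)
z = zeros_like(proto, Float32, 2, 3)
r = rand_like(proto, Float32, 2, 3)
```

Routing allocation through `similar` means the caller's array type decides where the data lives, so the same code serves CPU and GPU backends.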