Replies: 4 comments 6 replies
-
Moreover, I tried implementing a larger-stencil scheme (6 stencils) for the convective term in CNS, and the speed difference between the two GPUs is even larger: the V100 is almost 10 times faster. This is the most performance-critical part:
-
The |
-
It turns out the |
-
From what I can find online, the V100 has about 7 TFLOPS of double-precision performance, while the RTX 3090 has only about 0.5 TFLOPS.
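To make the double-precision argument concrete, here is a rough roofline-style estimate. It is only a sketch: the FP64 figures are the ones quoted above, the memory bandwidths (about 900 GB/s for the V100 and about 936 GB/s for the RTX 3090) are spec-sheet values I am assuming, and the arithmetic intensities are made-up sample points, not measurements of Amr101 or CNS.

```python
# Back-of-envelope roofline estimate. All hardware numbers are approximate
# spec-sheet values (assumptions), not measurements from this thread.
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    fp64_tflops: float  # peak double-precision throughput, TFLOPS
    mem_bw_gbs: float   # peak memory bandwidth, GB/s


def attainable_gflops(gpu: Gpu, flops_per_byte: float) -> float:
    """Roofline model: min(compute peak, arithmetic intensity * bandwidth)."""
    return min(gpu.fp64_tflops * 1000.0, flops_per_byte * gpu.mem_bw_gbs)


V100 = Gpu("V100", 7.0, 900.0)
RTX3090 = Gpu("RTX 3090", 0.5, 936.0)

# Hypothetical arithmetic intensities (FP64 flops per byte moved).
for ai in (0.1, 0.5, 2.0, 10.0):
    v = attainable_gflops(V100, ai)
    r = attainable_gflops(RTX3090, ai)
    print(f"AI = {ai:4.1f} flop/byte -> V100 / RTX 3090 speed ratio ~ {v / r:.1f}")
```

Under these assumptions, a kernel that is limited by memory bandwidth runs at about the same speed on both cards, while a kernel with many double-precision operations per byte can approach the 14x gap in peak FP64 throughput. That would be consistent with a wider stencil widening the gap, but real kernels should be profiled to confirm which limit they actually hit.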
-
I have an Nvidia V100 GPU and an RTX 3090 GPU. When running the ATPESC-codes/AMReX_Amr101 code, I observed that the execution time was nearly identical on both GPUs. I also ran some of the CUDA sample codes provided by Nvidia and found no significant performance difference between the two GPUs. However, when running the Tests/GPU/CNS code, the execution time on the RTX 3090 was almost twice as long as on the V100. What is the main factor causing the slower performance of the CNS code on the RTX 3090, and how can the code be tuned to get better performance on a specific device?