Swap in more performant kernels for vector divergence #66

Open
fluidnumerics-joe opened this issue Oct 31, 2024 · 0 comments

@fluidnumerics-joe
Member

Vector divergence is by far our most expensive kernel for forward stepping conservation laws. In a separate repository, I've put together various implementations of the vector divergence kernels for 2-D and 3-D and demonstrated that a hand-written kernel that leverages shared memory is able to outperform our current hipblas implementation substantially for both 2-D and 3-D.

In 2-D, the kernels are valid for polynomial degrees 2-15. At polynomial degree 15, however, the hipblas implementation is more performant (for N < 15, the hand-written kernels are faster by an order of magnitude). See the bandwidth graphics here. We'll want to bring these kernels in and add logic to switch between the hipblas implementation and our hand-written kernels. It'd be ideal to add a convergence test that varies the polynomial degree between 2 and 15 to exercise each implementation and confirm spectral accuracy.
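As a rough illustration of what "confirms spectral accuracy" could mean in a test, here is a minimal sketch of a geometric-decay check on the errors measured at increasing polynomial degree. The function name `looksSpectrallyAccurate`, the `minRatio` threshold, and the `roundoffFloor` value are all hypothetical and not part of the existing code base. This is only a crude heuristic, and a real test would need thresholds tuned to the test function and precision in use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an existing API). `errors` holds the divergence
// error measured at consecutive polynomial degrees, lowest degree first.
// Returns true when each error is smaller than the previous one by at least
// `minRatio`, ignoring entries that have already reached `roundoffFloor`.
inline bool looksSpectrallyAccurate(const std::vector<double>& errors,
                                    double minRatio = 2.0,
                                    double roundoffFloor = 1.0e-12) {
  if (errors.size() < 2) {
    return false;  // need at least two degrees to see any decay
  }
  for (std::size_t i = 1; i < errors.size(); ++i) {
    if (errors[i] <= roundoffFloor) {
      continue;  // already at round-off level; no further decay expected
    }
    if (errors[i] * minRatio > errors[i - 1]) {
      return false;  // error did not shrink fast enough between degrees
    }
  }
  return true;
}
```

A test could collect one error per polynomial degree for each implementation and assert that this check holds. The smooth test function and the error norm are left open here.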

In 3-D, the kernels are valid for polynomial degrees 2-7. Beyond polynomial degree 7, we'll want to fall back on the hipblas implementation, since it produces valid results even though its performance is not ideal. As with 2-D, we'll want to add a convergence test that varies the polynomial degree between 2 and 15 to exercise each implementation and confirm spectral accuracy.
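The switching logic described above could be sketched as a small selection function. The names `DivergenceKernel` and `selectDivergenceKernel` are hypothetical, and the degree limits are taken from the numbers quoted in this issue. Falling back to hipblas for degrees below 2 is an assumption, since the issue does not say what should happen there.

```cpp
#include <stdexcept>

// Hypothetical names for illustration; not part of the existing code base.
enum class DivergenceKernel { HandWritten, HipBlas };

// Degree limits as quoted in this issue.
constexpr int kMinHandWrittenDegree = 2;
constexpr int kMaxHandWrittenDegree2D = 14;  // hipblas is faster at degree 15
constexpr int kMaxHandWrittenDegree3D = 7;   // kernels not valid beyond 7

// Choose which divergence implementation to launch for a given spatial
// dimension (2 or 3) and polynomial degree.
inline DivergenceKernel selectDivergenceKernel(int dim, int degree) {
  if (dim != 2 && dim != 3) {
    throw std::invalid_argument("dim must be 2 or 3");
  }
  const int maxDegree =
      (dim == 2) ? kMaxHandWrittenDegree2D : kMaxHandWrittenDegree3D;
  const bool useHandWritten =
      degree >= kMinHandWrittenDegree && degree <= maxDegree;
  // Assumption: anything outside the hand-written range uses hipblas.
  return useHandWritten ? DivergenceKernel::HandWritten
                        : DivergenceKernel::HipBlas;
}
```

For example, `selectDivergenceKernel(3, 8)` returns `DivergenceKernel::HipBlas`, while `selectDivergenceKernel(2, 14)` returns `DivergenceKernel::HandWritten`. Keeping the limits as named constants would make it straightforward to adjust the crossover if the bandwidth measurements change on other hardware.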
