This repository has been archived by the owner on May 23, 2024. It is now read-only.
From what I can see in the Kokkos tutorials, one way to use a Kokkos::View with the simd-math library is to build the view on a simd type. Stencil computations do not seem easy to express with this kind of view, because a stencil needs access to direct neighbours. Hence I would like to propose an alternative for this type of computation: a new execution policy, ThreadSimdRange, that is close to the existing Kokkos::ThreadVectorRange.
The SimdRange represents a contiguous set of indices that will be used to load/store (unaligned) data from a scalar Kokkos::View. The associated parallel_for should mimic the work of the compiler by splitting the loop into a vectorized loop and a scalar loop for the remainder.
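To make the splitting concrete, here is a minimal sketch in plain C++14 that does not use Kokkos or simd-math. Everything in it is hypothetical and only illustrates the idea: `Pack` stands in for a simd type, `Access` wraps the unaligned load/store on a scalar array, `Tag` tells the functor which value type to use, and `simd_for` plays the role of the proposed parallel_for. The same generic lambda is called with a pack type in the main loop and with `double` in the remainder loop, which is presumably where the C++14 requirement comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a simd type holding W doubles.
template <std::size_t W>
struct Pack {
  double lane[W];
};

// Lane-wise addition of two packs.
template <std::size_t W>
Pack<W> operator+(const Pack<W>& a, const Pack<W>& b) {
  Pack<W> r;
  for (std::size_t k = 0; k < W; ++k) r.lane[k] = a.lane[k] + b.lane[k];
  return r;
}

// Tag passed to the functor so it knows which value type to load/store.
template <class V>
struct Tag { using type = V; };

// Load/store on a scalar array, for either a scalar or a pack.
template <class V>
struct Access;

template <>
struct Access<double> {
  static double load(const double* p) { return *p; }
  static void store(double* p, double v) { *p = v; }
};

template <std::size_t W>
struct Access<Pack<W>> {
  // Unaligned load of W contiguous values starting at p.
  static Pack<W> load(const double* p) {
    Pack<W> r;
    for (std::size_t k = 0; k < W; ++k) r.lane[k] = p[k];
    return r;
  }
  // Unaligned store of W contiguous values starting at p.
  static void store(double* p, const Pack<W>& v) {
    for (std::size_t k = 0; k < W; ++k) p[k] = v.lane[k];
  }
};

// Split [begin, end) into a pack loop and a scalar remainder loop.
template <std::size_t W, class F>
void simd_for(std::size_t begin, std::size_t end, F&& f) {
  assert(begin <= end);
  std::size_t i = begin;
  for (; i + W <= end; i += W) f(i, Tag<Pack<W>>{});  // vectorized part
  for (; i < end; ++i) f(i, Tag<double>{});           // scalar remainder
}

// Three-point stencil: out[i] = in[i-1] + in[i] + in[i+1] on interior points.
std::vector<double> three_point_sum(const std::vector<double>& in) {
  std::vector<double> out(in.size(), 0.0);
  if (in.size() < 3) return out;
  const double* src = in.data();
  double* dst = out.data();
  simd_for<4>(1, in.size() - 1, [=](std::size_t i, auto tag) {
    using A = Access<typename decltype(tag)::type>;
    A::store(dst + i, A::load(src + i - 1) + A::load(src + i) + A::load(src + i + 1));
  });
  return out;
}
```

The neighbours are reached with plain offset loads (`src + i - 1`, `src + i + 1`), which is what a view based on a simd type makes awkward. The pack width of 4 is an arbitrary choice for the sketch.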
A drawback is that it requires C++14 because of the lambda, but it compiles and runs with recent compilers.
Do you see any problem with this execution policy?