Simplify filterParticles Kernel (#3510)
## Summary

On Summit, compiling this kernel with `nvcc` 11.3.109 triggers a compiler
bug: the code builds without warnings but causes a segmentation fault
at runtime.

The workaround for the compiler bug is to give the trivial lambda that is
passed to `copyParticles` an empty capture list, `[]`, instead of `[=]`.
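
As a rough illustration of why an empty capture list is enough here, the following plain C++ sketch mirrors the forwarding pattern. It does not use AMReX or CUDA and does not reproduce the `nvcc` bug; `TileData`, `count_kept`, and `count_all` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for ConstParticleTileData; not the AMReX type.
struct TileData { int np; };

// Hypothetical helper: counts the indices in [0, data.np) accepted by pred.
template <typename Pred>
int count_kept (const TileData& data, Pred&& pred)
{
    assert(data.np >= 0);
    int kept = 0;
    for (int i = 0; i < data.np; ++i) {
        if (pred(data, i)) { ++kept; }
    }
    return kept;
}

// Mirrors the forwarding overload: the predicate accepts every particle.
int count_all (const TileData& data)
{
    // The body reads no surrounding state, so the capture list can be empty.
    auto keep_all = [] (const TileData& /*data*/, int /*i*/) { return 1; };

    // A lambda without captures converts to a plain function pointer.
    static_assert(std::is_convertible<decltype(keep_all),
                                      int (*)(const TileData&, int)>::value,
                  "a capture-less lambda converts to a function pointer");

    return count_kept(data, keep_all);
}
```

Calling `count_all(TileData{5})` counts all five indices, because the predicate returns 1 for each of them.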

We also replace `[=]` with explicit capture lists in a few other lambdas,
so the compiler has less to analyze.
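
A minimal sketch of the explicit-capture style, again in plain C++ rather than AMReX: `for_each_index` is a hypothetical serial stand-in for a parallel loop, and `build_mask` with its parameters is invented for illustration only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical serial stand-in for a parallel-for; it only shows the capture list.
template <typename F>
void for_each_index (int n, F&& f)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i) { f(i); }
}

// Sets mask[i] to 1 where src[src_start + i] > threshold, else 0.
// Returns the number of entries set to 1.
int build_mask (const std::vector<double>& src, int src_start, int n,
                double threshold, std::vector<int>& mask)
{
    assert(n >= 0 && src_start >= 0);
    assert(src_start + n <= static_cast<int>(src.size()));

    mask.assign(n, 0);
    const double* p_src = src.data();
    int* p_mask = mask.data();

    // Explicit capture list: every captured name is spelled out, so the
    // closure's contents are visible at a glance instead of implied by [=].
    for_each_index(n, [p_src, p_mask, src_start, threshold] (int i) {
        p_mask[i] = (p_src[src_start + i] > threshold) ? 1 : 0;
    });

    int kept = 0;
    for (int m : mask) { kept += m; }
    return kept;
}
```

With `[=]`, the same variables would be captured implicitly; naming them makes an accidental capture a compile-time decision rather than a side effect of the lambda body.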

This simplifies the kernel generation, which also solves the issue seen
with WarpX [for this line](https://github.com/ECP-WarpX/WarpX/blob/43d2ac7b54546c87d2cb540df7a2b3cb57592b84/Source/Diagnostics/WarpXOpenPMD.cpp#L585).

Co-authored-by: AlexanderSinn <[email protected]>
ax3l and AlexanderSinn authored Aug 26, 2023
1 parent a45eade commit d4761e8
Showing 2 changed files with 5 additions and 5 deletions.
4 changes: 2 additions & 2 deletions Src/Particle/AMReX_ParticleContainerI.H
@@ -991,7 +991,7 @@ ParticleContainer_impl<ParticleType, NArrayReal, NArrayInt, Allocator, CellAssig
copyParticles (const PCType& other, bool local)
{
using PData = ConstParticleTileData<typename ParticleType::StorageParticleType, NArrayReal, NArrayInt>;
- copyParticles(other, [=] AMREX_GPU_HOST_DEVICE (const PData& /*data*/, int /*i*/) { return 1; }, local);
+ copyParticles(other, [] AMREX_GPU_HOST_DEVICE (const PData& /*data*/, int /*i*/) { return 1; }, local);
}

template <typename ParticleType, int NArrayReal, int NArrayInt,
@@ -1002,7 +1002,7 @@ ParticleContainer_impl<ParticleType, NArrayReal, NArrayInt, Allocator, CellAssig
addParticles (const PCType& other, bool local)
{
using PData = ConstParticleTileData<typename ParticleType::StorageParticleType, NArrayReal, NArrayInt>;
- addParticles(other, [=] AMREX_GPU_HOST_DEVICE (const PData& /*data*/, int /*i*/) { return 1; }, local);
+ addParticles(other, [] AMREX_GPU_HOST_DEVICE (const PData& /*data*/, int /*i*/) { return 1; }, local);
}

template <typename ParticleType, int NArrayReal, int NArrayInt,
6 changes: 3 additions & 3 deletions Src/Particle/AMReX_ParticleTransformation.H
@@ -404,7 +404,7 @@ Index filterParticles (DstTile& dst, const SrcTile& src, Pred&& p,
const auto src_data = src.getConstParticleTileData();

amrex::ParallelForRNG(n,
- [=] AMREX_GPU_DEVICE (int i, amrex::RandomEngine const& engine) noexcept
+ [p, p_mask, src_data, src_start] AMREX_GPU_DEVICE (int i, amrex::RandomEngine const& engine) noexcept
{
amrex::ignore_unused(p, p_mask, src_data, src_start, engine);
if constexpr (IsCallable<Pred,decltype(src_data),Index,RandomEngine>::value) {
@@ -577,7 +577,7 @@ int filterAndTransformParticles (DstTile1& dst1, DstTile2& dst2, const SrcTile&
const auto src_data = src.getConstParticleTileData();

amrex::ParallelForRNG(np,
- [=] AMREX_GPU_DEVICE (int i, amrex::RandomEngine const& engine) noexcept
+ [p, p_mask, src_data] AMREX_GPU_DEVICE (int i, amrex::RandomEngine const& engine) noexcept
{
amrex::ignore_unused(p, p_mask, src_data, engine);
if constexpr (IsCallable<Pred,decltype(src_data),int,RandomEngine>::value) {
@@ -620,7 +620,7 @@ Index filterAndTransformParticles (DstTile& dst, const SrcTile& src, Pred&& p, F
const auto src_data = src.getConstParticleTileData();

amrex::ParallelForRNG(np,
- [=] AMREX_GPU_DEVICE (int i, amrex::RandomEngine const& engine) noexcept
+ [p, p_mask, src_data, src_start] AMREX_GPU_DEVICE (int i, amrex::RandomEngine const& engine) noexcept
{
amrex::ignore_unused(p, p_mask, src_data, src_start, engine);
if constexpr (IsCallable<Pred,decltype(src_data),Index,RandomEngine>::value) {