The current OpenMP implementation scales poorly: 8 threads give only a 2x speedup. OpenMP is used in two places: building the interaction matrix (A), and solving the linear system Ax = b.
The issue for building A is probably load balancing: the A/B VSH translation coefficients are computed with recursion relations whose cost depends on inter-particle separation, so some threads finish before others and sit idle.
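If load imbalance is indeed the cause, one thing to try is OpenMP's `schedule(dynamic)` clause, which hands out loop iterations on demand instead of in fixed chunks. The following is a minimal sketch, not the project's code: `pair_block` and `build_matrix` are hypothetical names, and the placeholder body stands in for the real translation-coefficient recursion. Whether this actually improves scaling here is untested.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the coupling between particles i and j.
// In the real code the cost of this call would vary with separation.
std::complex<double> pair_block(int i, int j) {
    return std::complex<double>(i + 1.0, j + 1.0);
}

// Fill an n x n row-major matrix, one entry per particle pair.
std::vector<std::complex<double>> build_matrix(int n) {
    assert(n >= 0);
    std::vector<std::complex<double>> A(static_cast<std::size_t>(n) * n);

    // collapse(2) flattens the pair loop into n*n iterations;
    // schedule(dynamic) lets idle threads pick up the next iteration,
    // so uneven per-pair cost is spread across threads.
    #pragma omp parallel for collapse(2) schedule(dynamic)
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            const std::size_t idx = static_cast<std::size_t>(i) * n + j;
            if (i == j) {
                A[idx] = std::complex<double>(1.0, 0.0);  // placeholder diagonal
            } else {
                A[idx] = pair_block(i, j);
            }
        }
    }
    return A;
}
```

Each iteration writes to its own entry of `A`, so no synchronization is needed. Dynamic scheduling adds some dispatch overhead per iteration, so a chunk size (for example `schedule(dynamic, 4)`) may be worth tuning if individual iterations are cheap.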
The issue for solving Ax = b is less obvious. Since this is a well-studied problem, it's worth looking into existing software solutions.
There are a few other things that can easily be parallelized: source decomposition, cross-section evaluation, force/torque evaluation, and E/H field evaluation.
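As an illustration of why these are easy targets, field evaluation at a set of points is independent per point, so a plain parallel loop should be enough. This is a sketch under assumptions: `Point`, `field_at`, and `evaluate_field` are hypothetical names, and `field_at` is a placeholder rather than an actual E/H field expression.

```cpp
#include <cstddef>
#include <vector>

struct Point { double x, y, z; };

// Hypothetical placeholder for the field value at one point.
double field_at(const Point& p) {
    return p.x + p.y + p.z;
}

// Evaluate the field at every point; each iteration is independent.
std::vector<double> evaluate_field(const std::vector<Point>& pts) {
    std::vector<double> out(pts.size());
    const long n = static_cast<long>(pts.size());

    // Per-point cost is assumed roughly uniform, so a static schedule is used.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        out[k] = field_at(pts[k]);
    }
    return out;
}
```

Quantities that are sums over particles or points, such as a total cross-section, would fit the same pattern with an OpenMP `reduction(+:...)` clause.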
Lastly, there are two algorithmic optimizations not being used:
Using the rotation-translation-rotation algorithm to construct the A matrix.
Using a solver tailored to the physical problem: a better-suited solver for the linear system might exist, see the Xu papers.