Releases: unibas-dmi-hpc/LB4OMP
AUTO4OMP
Auto4OMP introduces three scheduling algorithm selection methods and an expert-defined chunk parameter, which apply to the kind and chunk arguments of OpenMP's schedule clause, respectively.
Auto4OMP extends the auto and chunk implementation in the LB4OMP OpenMP runtime library to automatically select a scheduling algorithm and calculate a chunk parameter during execution, providing automated application load balancing.
Algorithm selection methods
- RandomSel
- ExhaustiveSel
- ExpertSel
All three selection methods use application and system information obtained during execution to select a scheduling algorithm automatically and dynamically.
Auto4OMP also introduces the expert chunk parameter, which is calculated from the number of loop iterations and the number of threads.
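Because Auto4OMP builds on the auto schedule kind, a loop would opt in through the standard OpenMP schedule clause. The following is a minimal sketch, assuming the application is linked against the Auto4OMP-enabled runtime; the function name sum_array is hypothetical, and how a particular selection method (for example RandomSel) is chosen at run time is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example loop: sums an array.
 * With schedule(auto), the choice of scheduling algorithm is left to the
 * OpenMP runtime. Under an Auto4OMP-enabled runtime this is where the
 * automated selection would apply (assumption, not verified here). */
double sum_array(const double *a, size_t n) {
    assert(a != NULL || n == 0);
    double total = 0.0;
    #pragma omp parallel for schedule(auto) reduction(+:total)
    for (long i = 0; i < (long)n; ++i) {
        total += a[i];
    }
    return total;
}
```

The source code of the loop does not name a scheduling technique, so the same binary can be load balanced differently from run to run. Compiled without OpenMP support, the pragma is ignored and the loop runs serially.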
LB4OMP v1.0
LB4OMP is an extended LLVM OpenMP runtime library that supports thirteen dynamic and adaptive loop scheduling techniques from the literature. LB4OMP is a load balancing performance portfolio that can offer improved performance by adapting to unpredictable variations in the application and the system during execution. LB4OMP is used to improve application performance, assess the effectiveness of loop scheduling techniques, and support loop scheduling research in multithreaded applications.
LB4OMP contains the following loop scheduling techniques:
OpenMP standard:
- static
- dynamic
- guided
Dynamic and non-adaptive loop scheduling techniques (OpenMP non-standard):
- Trapezoid self scheduling (TSS)
Dynamic and non-adaptive loop scheduling techniques newly implemented in LB4OMP:
- Fixed size chunk (FSC)
- Factoring (FAC)
- Improved implementation of Factoring (mFAC)
- Practical variant of factoring (FAC2)
- Tapering (TAP)
- Practical variant of weighted factoring (WF2)
Dynamic and adaptive loop scheduling techniques newly implemented in LB4OMP:
- BOLD
- Four variants of adaptive weighted factoring (AWF-B, AWF-C, AWF-D, AWF-E)
- Adaptive factoring (AF)
- Improved implementation of Adaptive factoring (mAF)
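To give a feel for what one of these techniques computes, the sketch below lists the chunk sizes produced by the textbook formulation of FAC2: iterations are handed out in batches of one chunk per thread, and each batch covers roughly half of the remaining iterations. This is an illustration of the published rule, not LB4OMP's implementation; the function name fac2_chunks and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Textbook FAC2 chunk sizes (illustrative, not LB4OMP code).
 * For each batch: size = ceil(remaining / (2 * n_threads)), and up to
 * n_threads chunks of that size are handed out.
 * Writes the sizes into `chunks` and returns how many were written. */
size_t fac2_chunks(size_t n_iters, size_t n_threads,
                   size_t *chunks, size_t capacity) {
    assert(n_threads > 0 && chunks != NULL);
    size_t remaining = n_iters;
    size_t count = 0;
    while (remaining > 0) {
        size_t size = (remaining + 2 * n_threads - 1) / (2 * n_threads);
        for (size_t t = 0; t < n_threads && remaining > 0; ++t) {
            size_t c = size < remaining ? size : remaining;
            assert(count < capacity);  /* caller must provide enough room */
            chunks[count++] = c;
            remaining -= c;
        }
    }
    return count;
}
```

For 100 iterations on 4 threads, the batches use chunk sizes 13, 6, 3, 2, and 1. Large early chunks keep scheduling overhead low, while the small late chunks even out the finish times.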
LB4OMP contains the following tool:
- profiling
The profiling tool works similarly to dynamic,1, but adds timers that capture the average iteration execution time and its standard deviation for the target loops. The collected information is stored in a file, which FSC, FAC, TAP, and BOLD read later during execution.