-
Notifications
You must be signed in to change notification settings - Fork 128
Load Balancing
By default, Athena++ assumes that each MeshBlock has the same cost and tries to distribute MeshBlocks as evenly as possible. However, some additional physics may cause uneven computational cost. Athena++ provides two ways to adjust the load balance. Note that these features may cause lower performance if they are not used correctly and there are some limitations, so please read the following description carefully.
If the cost of each MeshBlock is predictable and stable, one can use MeshBlock::SetCostForLoadBalancing in MeshBlock::UserWorkInLoop to assign the cost manually. Athena++ try to redistribute MeshBlocks so that the total cost per node becomes as even as possible. This function must be called from all the MeshBlocks.
void MeshBlock::UserWorkInLoop() {
Real mycost;
// calculate the cost for this MeshBlock here
SetCostForLoadBalancing(mycost);
return;
}
Two control parameters for load balancing can be set in the input file.
<loadbalancing>
interval = 10 # interval between load balancing (default = 10)
tolerance = 0.5 # acceptable load imbalance (default = 0.5 = 50%)
The interval parameter sets how frequently the load balancing is performed. Since the load balancing itself is somewhat costly as it involves global communication, we recommend to perform it only occasionally. Note that when AMR is in use and one or more MeshBlocks are refined/derefined, it always performs load balancing regardless of the interval parameter.
The tolerance parameter specifies the maximum permissible load imbalance. When the total cost in a node exceeds the average cost by this factor, it triggers the load balancing process. Because of the limited flexibility of the load balancing (see the notes below), this value should not be set too low. Also, for AMR, this parameter is ignored and set automatically because the adjustability changes according to the number of MeshBlocks (this implementation may subject to change).
The other option is automatic load balancing based on timing. With this feature enabled, the code measures the computation time for each MeshBlock and redistribute the load accordingly. To enable this feature, set the automatic flag to true in the input file.
<loadbalancing>
automatic = true # enable automatic load balancing (default = false)
interval = 10 # interval between load balancing (default = 10)
tolerance = 0.5 # acceptable load imbalance (default = 0.5 = 50%)
The other parameters remain the same as in the manual load balancing.
First of all, the use of these load balancing features is recommended only when there is physics causing substantially uneven load. The default load balancing is optimal when the cost is proportional to the number of MeshBlocks.
The load balancing works well only when there are sufficiently many MeshBlocks per node to allow flexible load redistribution. If there is one or two MeshBlocks per node, it is likely that the new load distribution does not improve the performance significantly, or it is even possible that no better solution can be found. Roughly speaking, the number of MeshBlocks per node needed for good load balancing should be as large as the maximum ratio between the maximum and minimum cost MeshBlock. These features should be used only when the load imbalance is not negligible because smaller MeshBlocks can cause poor performance and you may achieve a better performance by ignoring minor load imbalance.
Also, it is not always possible to improve the performance because the code can redistribute MeshBlocks only along the Z-ordering. If the high cost MeshBlocks are badly localized on the Z-ordering, it is possible that the code cannot adjust the load sufficiently. This is the limitation of the current implementation.
When a new task is added, it should be registered to the load balancer if the cost of that task should be counted. This is controlled by lb_flag boolean variable in the Task structure. If this flag is true the code measures the time spent by this task, and it is ignored if the flag is false. It should be set in the AddTask function in the TaskList as follows:
case (MY_PHYSICS):
task_list_[ntasks].TaskFunc=
static_cast<TaskStatus (TaskList::*)(MeshBlock*,int)>
(&TimeIntegratorTaskList::MyPhysicsTask);
task_list_[ntasks].lb_time = true;
break;
Tasks involving MPI receiving should not be counted, as its time depends on the condition of the network and other nodes.
Getting Started
User Guide
- Configuring
- Compiling
- The Input File
- Problem Generators
- Boundary Conditions
- Coordinate Systems and Meshes
- Running the Code
- Outputs
- Using MPI and OpenMP
- Static Mesh Refinement
- Adaptive Mesh Refinement
- Load Balancing
- Special Relativity
- General Relativity
- Passive Scalars
- Shearing Box
- Diffusion Processes
- General Equation of State
- FFT
- Multigrid
- High-Order Methods
- Super-Time-Stepping
- Orbital Advection
- Rotating System
- Reading Data from External Files
- Non-relativistic Radiation Transport
- Cosmic Ray Transport
- Units and Constants
Programmer Guide