How to use Memory Load Balance in neurodamus

Antonio Bellotta edited this page Dec 2, 2024 · 2 revisions

When running neurodamus on a large circuit, you may run into problems, including out-of-memory (OOM) errors, caused by the sheer size of the circuit and by an imbalance in the memory distribution of gids during the simulation.

To mitigate these issues, neurodamus implements a memory balancing mechanism that uses the estimates collected in dry-run mode to balance memory usage across all nodes.

Usage is simple: the whole workflow needs just two executions of neurodamus, first in dry-run mode and then in normal simulation mode.

Let's see how!

  1. Run neurodamus in dry-run mode:

         neurodamus (or special) ... --dry-run

     This runs the dry-run workflow, balances the memory and, at the end of the execution, creates an allocation_r*_c*.pkl.gz file. By default, the distribution is balanced over the number of nodes/ranks that the dry run suggests. However, you can manually specify the number of ranks to distribute over with the --num-target-ranks option. For example, to distribute over 100 ranks, run neurodamus with:

         neurodamus (or special) ... --dry-run --num-target-ranks=100

  2. Run neurodamus in Memory Load Balance mode: once the allocation_r*_c*.pkl.gz file has been created, run your circuit with the options you would normally use, making sure to add the --lb-mode=Memory option to put neurodamus in Memory Load Balance mode. The allocation file is loaded automatically if it is in the directory you are running from:

         neurodamus (or special) ... --lb-mode=Memory
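
The two steps above can be sketched as a small Python helper that only assembles the command lines, without launching anything. This is a minimal sketch under stated assumptions: `base_args` stands for whatever arguments you normally pass to neurodamus (the `...` above), and the function names `dry_run_command` and `memory_lb_command` are hypothetical. Only the flags `--dry-run`, `--num-target-ranks` and `--lb-mode=Memory` come from this page.

```python
from typing import List, Optional


def dry_run_command(base_args: List[str],
                    num_target_ranks: Optional[int] = None,
                    executable: str = "neurodamus") -> List[str]:
    """Step 1: the usual arguments plus --dry-run (hypothetical helper)."""
    cmd = [executable, *base_args, "--dry-run"]
    if num_target_ranks is not None:
        if num_target_ranks < 1:
            raise ValueError("num_target_ranks must be a positive integer")
        # Override the rank count that the dry run would otherwise suggest.
        cmd.append(f"--num-target-ranks={num_target_ranks}")
    return cmd


def memory_lb_command(base_args: List[str],
                      executable: str = "neurodamus") -> List[str]:
    """Step 2: the usual arguments plus --lb-mode=Memory (hypothetical helper)."""
    return [executable, *base_args, "--lb-mode=Memory"]
```

The resulting lists can be passed to `subprocess.run(cmd, check=True)`. Both steps should be run from the same directory so that the allocation file written by the first one is found by the second.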

Keep in mind that neurodamus will automatically create a new, matching allocation file if the number of ranks in the execution does not correspond to any allocation file present in the directory.
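
To see in advance which allocation files are present, a short Python check along these lines may help. It is a sketch, not part of neurodamus: the helper names are hypothetical, and reading the number after `_r` as the rank count is an assumption based on the file name pattern, which this page does not spell out.

```python
import re
from pathlib import Path
from typing import List, Set

# Pattern taken from the file name on this page: allocation_r*_c*.pkl.gz
_ALLOCATION_RE = re.compile(r"allocation_r(\d+)_c(\d+)\.pkl\.gz")


def find_allocation_files(directory: str = ".") -> List[Path]:
    """Return the allocation files found in `directory`, sorted by name."""
    return sorted(Path(directory).glob("allocation_r*_c*.pkl.gz"))


def ranks_in_allocation_files(directory: str = ".") -> Set[int]:
    """Collect the numbers that follow `_r` in the allocation file names.

    ASSUMPTION: that number is the rank count the file was balanced for.
    """
    ranks = set()
    for path in find_allocation_files(directory):
        match = _ALLOCATION_RE.fullmatch(path.name)
        if match:
            ranks.add(int(match.group(1)))
    return ranks
```

If the rank count of the planned run is not in the returned set, expect neurodamus to generate a new allocation file as described above.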

If this procedure still fails, it may be necessary to rebalance the rank assignment in post-processing. To do so, refer to the memory_load_balance.rst file in the docs directory of this repo.