Processor layout for 0.25deg configuration #38
Comments
Yes, this will take some experimentation and will depend on scalability and load balance within and between components. It will also need to be redone for each resolution. A first step would be to determine what parameters and diagnostics would be useful to tune and assess load balance. We should also enable processor land-masking in MOM6 and CICE6 if we haven't already. CICE has the ability to subdivide the computational domain horizontally into tiles (termed “blocks”), and then parallelise by allocating several blocks to each CPU. This can improve load balancing if a similar number of ice-containing and ice-free blocks are allocated to each CPU. Land-only blocks aren’t allocated to a CPU at all. There are several alternative ways to define the block shape and distribution across processors, which can make a big difference to internal load balance within CICE. See Craig et al. (2015) and Sec. 3.7.2 in the ACCESS-OM2 tech report.
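To make the block idea concrete, here is a minimal Python sketch of splitting a grid into blocks, dropping land-only blocks, and handing the rest out round-robin. It is a toy illustration, not CICE's actual decomposition code: the function names, the 8x12 grid, and the use of ocean-cell counts as a proxy for work are all assumptions made for the example (real ice coverage changes in time and space).

```python
def split_into_blocks(ocean_mask, block_ny, block_nx):
    """Return (j0, i0, n_ocean) for every block containing at least one ocean cell.

    ocean_mask is a list of rows of booleans (True = ocean). Hypothetical helper.
    """
    ny, nx = len(ocean_mask), len(ocean_mask[0])
    blocks = []
    for j0 in range(0, ny, block_ny):
        for i0 in range(0, nx, block_nx):
            n_ocean = sum(
                ocean_mask[j][i]
                for j in range(j0, min(j0 + block_ny, ny))
                for i in range(i0, min(i0 + block_nx, nx))
            )
            if n_ocean > 0:  # land-only blocks are never given to a CPU
                blocks.append((j0, i0, n_ocean))
    return blocks


def round_robin(blocks, n_cpus):
    """Deal blocks out to CPUs in turn, like dealing cards."""
    assignment = [[] for _ in range(n_cpus)]
    for k, block in enumerate(blocks):
        assignment[k % n_cpus].append(block)
    return assignment


# Toy 8x12 grid whose four westernmost columns are land (assumed for illustration).
mask = [[i >= 4 for i in range(12)] for _ in range(8)]
blocks = split_into_blocks(mask, block_ny=4, block_nx=4)
assignment = round_robin(blocks, n_cpus=2)

# Ocean cells per CPU, used here as a crude stand-in for computational work.
work = [sum(n_ocean for _, _, n_ocean in cpu_blocks) for cpu_blocks in assignment]
print(len(blocks), work)
```

With smaller blocks, each CPU gets a more even mix of regions at the cost of more halo exchange, which is the trade-off that would need to be explored for each resolution.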
There is some useful info here on choosing processor layout. This includes some info about restrictions imposed by the driver and shows where to find (in CIME) layouts for existing compsets on other machines which may be useful as a starting point.
@dougiesquire Thanks! That's going to be super useful.
Related: COSIMA/access-om3#125
Docs on CICE6 block distribution: https://cice-consortium-cice.readthedocs.io/en/cice6.5.0/user_guide/ug_implementation.html#performance
Latest MOM6 version has a new option to automatically perform land block elimination at runtime. This means one does not need to pre-generate a mask table for a given processor count. See here for more details.
This issue is superseded by COSIMA/access-om3#148
We haven't yet made any consideration of processor layout. Currently, each model component runs sequentially and is allocated a single 48-core node. We need to think about processor layout across the different components, and within each component (e.g. `LAYOUT` and `IO_LAYOUT` in MOM).
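As a starting point for picking a MOM layout, the Python sketch below lists the ways a PE count can be factored into a 2D layout and ranks them by how close to square the resulting subdomains are. The helper names are hypothetical, the 1440x1080 grid size is an assumption for a 0.25deg global grid, and "squareness" is only one possible heuristic; it ignores land masking and load balance. The divisibility check reflects the usual requirement that each dimension of the I/O layout divides the corresponding dimension of the compute layout.

```python
def candidate_layouts(n_pes, nx, ny):
    """All (px, py) with px * py == n_pes, most-square subdomains first."""
    pairs = [(px, n_pes // px) for px in range(1, n_pes + 1) if n_pes % px == 0]

    def squareness(layout):
        px, py = layout
        sub_x, sub_y = nx / px, ny / py  # subdomain size in grid cells
        return max(sub_x, sub_y) / min(sub_x, sub_y)  # 1.0 means square

    return sorted(pairs, key=squareness)


def io_layout_is_valid(layout, io_layout):
    """Each I/O layout dimension must divide the matching compute layout dimension."""
    return layout[0] % io_layout[0] == 0 and layout[1] % io_layout[1] == 0


# One 48-core node on an assumed 1440 x 1080 grid.
layouts = candidate_layouts(48, nx=1440, ny=1080)
best = layouts[0]
print(best, io_layout_is_valid(best, (2, 2)), io_layout_is_valid(best, (3, 1)))
```

Any layout chosen this way is only a first guess; timing runs are still needed to compare candidates within and between components.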