openmpi3 job distribution appears odd #22

Open
sjpb opened this issue Apr 8, 2020 · 0 comments

sjpb commented Apr 8, 2020

It's not clear that this is really specific to this role. However, with openmpi3 the distribution of tasks across nodes under Slurm looks odd. The examples below use srun, but similar behaviour is seen with sbatch and mpirun too.
Packages: "@ohpc-slurm-client", "@ohpc-slurm-server", "slurm-slurmctld-ohpc", "slurm-example-configs-ohpc", "gnu7-compilers-ohpc" and "openmpi3-gnu7-ohpc".

Using gnu7 and openmpi3 modules:

$ srun --mpi=list -N 4 -n 4 helloworld
srun: MPI types are...
srun: openmpi
srun: none
srun: pmi2
srun: pmix_v2
srun: pmix

$ srun -N 4 -n 4 helloworld
# fails with: "... OMPI was not built with SLURM's PMI support and therefore cannot execute ..."

OK, maybe not entirely surprising, but it would be good if we could build one with PMI support. Note that using --mpi=openmpi or --mpi=pmi2 gives the same error.
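
One way to see which PMI flavours the installed Open MPI build knows about is to search the output of ompi_info. This is only a minimal sketch, assuming ompi_info from the loaded openmpi3 module is on PATH; the helper name pmi_lines is made up for illustration, and the component names printed will vary by build:

```shell
# Hypothetical helper: keep only lines that mention PMI (case-insensitive).
# Reads text on stdin, so it also works on saved ompi_info output.
pmi_lines() {
    grep -i 'pmi' || true   # "|| true": no match is a result, not an error
}

# On the cluster (not run here):
#   ompi_info | pmi_lines
```

An empty result would suggest the build has no PMI or PMIx components at all, which does not appear to be the case here since --mpi=pmix works.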

These work as expected:

$ srun --mpi=pmix_v2 -N 4 -n 4 helloworld
$ srun --mpi=pmix -N 4 -n 4 helloworld
$ srun --mpi=pmix -N 4 -n 8 helloworld

but then using more tasks:

$ srun --mpi=pmix -N 4 -n 16 helloworld

all processes end up on a single node. Using -m block or -m cyclic doesn't change this.
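
A quick way to check where tasks land, independent of MPI, is to run hostname under srun and count the lines per node. A minimal sketch; the helper name tally_nodes is made up for illustration, and hostname stands in for the helloworld binary:

```shell
# Hypothetical helper: count how many tasks ran on each node.
# Reads one hostname per line on stdin, prints "<count> <hostname>".
tally_nodes() {
    sort | uniq -c | awk '{print $1, $2}'
}

# On the cluster (not run here):
#   srun --mpi=pmix -N 4 -n 16 hostname | tally_nodes
```

If the same skew shows up with plain hostname, the placement is being decided by Slurm rather than by Open MPI.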

This works:

$ srun --mpi=pmix -N 4 -n 16 --ntasks-per-node=4 helloworld

with processes distributed as if -m block is used, which I expect to be the default behaviour.

The correct behaviour can be obtained by setting --ntasks-per-node explicitly:

$ srun --mpi=pmix -N 4 -n 16 --ntasks-per-node=4 -m cyclic helloworld
# correctly shows cyclic behaviour
$ srun --mpi=pmix -N 4 -n 16 --ntasks-per-node=4 helloworld
# OK: processes distributed as if "-m block" used
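
For reference, the rank-to-node mapping expected from the two distributions above can be sketched as follows. This is an idealised model assuming tasks divide evenly across nodes, not a reproduction of Slurm's actual placement logic; the function name layout is made up for illustration:

```shell
# Hypothetical model of block vs cyclic task placement.
# Usage: layout <block|cyclic> <nodes> <tasks>
layout() {
    mode=$1; nodes=$2; tasks=$3
    per_node=$(( (tasks + nodes - 1) / nodes ))   # tasks per node, rounded up
    rank=0
    while [ "$rank" -lt "$tasks" ]; do
        if [ "$mode" = cyclic ]; then
            node=$(( rank % nodes ))      # round-robin over nodes
        else
            node=$(( rank / per_node ))   # fill one node before the next
        fi
        echo "rank $rank -> node $node"
        rank=$(( rank + 1 ))
    done
}

# Example: layout block 4 16    or    layout cyclic 4 16
```

With 4 nodes and 16 tasks, block puts ranks 0 to 3 on node 0, while cyclic puts ranks 0, 4, 8 and 12 there.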