It's not clear this is really specific to this role. However, using openmpi3, the job distribution under Slurm looks odd. The examples below use srun, but similar behaviour is seen using sbatch and mpirun too.
Packages: "@ohpc-slurm-client", "@ohpc-slurm-server", "slurm-slurmctld-ohpc", "slurm-example-configs-ohpc", "gnu7-compilers-ohpc" and "openmpi3-gnu7-ohpc".
Using gnu7 and openmpi3 modules:
$ srun --mpi=list -N 4 -n 4 helloworld
srun: MPI types are...
srun: openmpi
srun: none
srun: pmi2
srun: pmix_v2
srun: pmix
$ srun -N 4 -n 4 helloworld
# fails with: "... OMPI not build with SLURM's PMI support and therefore cannot execute ..."
OK, maybe not entirely surprising, but it would be good if we could build one with PMI support. Note that using --mpi=openmpi or --mpi=pmi2 gives the same error.
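To check from a script which MPI plugin types srun reports, something like the following rough Python sketch could be used. It assumes the output format shown in this report (one `srun:`-prefixed entry per line after a header); other Slurm versions may print the list differently. The function name `parse_mpi_types` and the `SAMPLE` text are illustrative only.

```python
def parse_mpi_types(listing: str) -> list[str]:
    """Extract plugin names from `srun --mpi=list` style output.

    Assumes each entry is on its own line and prefixed with "srun:",
    as in the output quoted in this report.
    """
    types = []
    for line in listing.splitlines():
        line = line.strip()
        if not line.startswith("srun:"):
            continue
        value = line[len("srun:"):].strip()
        # Skip the header line ("MPI types are...") and empty entries.
        if not value or value.lower().startswith("mpi types are"):
            continue
        types.append(value)
    return types


# Sample text copied from the report, one entry per line.
SAMPLE = """srun: MPI types are...
srun: openmpi
srun: none
srun: pmi2
srun: pmix_v2
srun: pmix
"""

available = parse_mpi_types(SAMPLE)
print(available)
print("pmix available:", "pmix" in available)
```

Note that a plugin appearing in this list only means Slurm offers it; as the failure above shows, the Open MPI build must also have been compiled with matching PMI support.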
These work as expected:
but then using more jobs:
all 10 processes end up on 1 node. Using -m block or -m cyclic doesn't change this.
This works:
with processes distributed as if -m block is used, which I expect to be the default behaviour.
Can get correct behaviour using:
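
For reference, the block and cyclic layouts named above can be modelled roughly as follows. This is a simplified Python sketch for illustration only, not an srun invocation: the function names are hypothetical, the 4-node count is assumed from the earlier -N 4 examples, and the block model assumes tasks are split as evenly as possible, whereas Slurm's real placement also depends on the CPUs allocated on each node.

```python
def block_layout(ntasks: int, nnodes: int) -> list[int]:
    """Node index per rank when consecutive ranks share a node.

    Simplified model: tasks are split as evenly as possible, with the
    first nodes taking one extra task when the split is uneven.
    """
    base, extra = divmod(ntasks, nnodes)
    layout = []
    for node in range(nnodes):
        count = base + (1 if node < extra else 0)
        layout.extend([node] * count)
    return layout


def cyclic_layout(ntasks: int, nnodes: int) -> list[int]:
    """Node index per rank when ranks are dealt out round-robin."""
    return [rank % nnodes for rank in range(ntasks)]


# 10 tasks over an assumed 4 nodes.
print("block: ", block_layout(10, 4))
print("cyclic:", cyclic_layout(10, 4))
```

Under this model neither layout puts all 10 ranks on a single node, which is why the placement described above looks wrong.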