
Switching to Slurm


This page describes how to transition to the newer generation of Hyak (mox). Most of this information is taken from:

Mox scheduler on Hyak wiki

https://wiki.cac.washington.edu/display/hyakusers/Mox_scheduler

Mox overview

https://wiki.cac.washington.edu/display/hyakusers/Hyak+HOWTO

https://wiki.cac.washington.edu/display/hyakusers/Hyak+mox+Overview

Using R on Slurm

http://www.arc.ox.ac.uk/content/running-r

https://wiki.cac.washington.edu/display/hyakusers/Hyak+R+programming

https://cran.r-project.org/web/packages/rslurm/vignettes/rslurm.html

PBS to Slurm

https://hpc.nih.gov/docs/pbs2slurm.html

https://www.glue.umd.edu/hpcc/help/slurm-vs-moab.html

Here are the six main differences between mox and ikt:

  1. Mox is an entirely separate cluster; it shares nothing with ikt.
  2. You only get what you ask for, regardless of the resources available on the node. If you ask for 1 CPU, you get one CPU; if you ask for 1 GB of RAM, you get 1 GB (see the srun example after this list).
  3. An allocation won't get the same set of nodes every time, just access to the number of nodes it is entitled to.
  4. For the moment, there is no occasional preemption in ckpt (formerly the bf queue).
  5. Preempted jobs get 10 seconds to do something smart before being killed and requeued.
  6. Please report any problems to [email protected] with Hyak as the first word in the subject, and note that you're using mox, not ikt.
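
Point 2 means you should size every request explicitly. As a sketch (the partition, account, and numbers are placeholders for your own group and needs), an interactive session that needs 4 CPUs and 10 GB of memory could be requested as:

    srun -p xyz -A xyz --cpus-per-task=4 --mem=10G --time=1:00:00 --pty /bin/bash

If you leave out --cpus-per-task or --mem, you get the scheduler defaults, not the whole node.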

Logging in to Hyak

Old: ssh [email protected]

New: ssh [email protected]

Common functions

https://slurm.schedmd.com/rosetta.pdf

See jobs running

All

Old: showq

New: squeue

Allocation

squeue -p csde

Personal

squeue -u kweiss2

Backfill

squeue -p ckpt

Full Hyak allocation

hyakalloc

hyakalloc xyz

Exit a node

Old: logout

New: exit

Cancel a job

Single job (1234): scancel 1234

All jobs: scancel -u kweiss2

Copying files from ikt (old) to mox (new) on Hyak

You can copy files at high speed, without a password, between the Hyak systems using commands like the ones below. Here ikt is hyak classic and mox is hyak nextgen. Below, xyz is your group name (e.g. csde) and abc is your userid (e.g. kweiss2). (If you are using a non-default PATH environment variable, you can find hyakbbcp at /sw/local/bin/hyakbbcp.)

From ikt to mox:

File: ikt1$ hyakbbcp myfile mox1.hyak.uw.edu:/gscratch/xyz/abc/mydir

Directory: ikt1$ hyakbbcp -r mydirectory mox1.hyak.uw.edu:/gscratch/xyz/abc/mydir

For me, this would be:

ikt1$ hyakbbcp myfile mox1.hyak.uw.edu:/gscratch/csde/kweiss2/sti

ikt1$ hyakbbcp -r sti mox1.hyak.uw.edu:/gscratch/csde/kweiss2/sti

Submit Jobs

Build

Interactive build node: srun -p build --time=2:00:00 --mem=100G --pty /bin/bash

Interactive node in your own group's allocation: srun -p xyz -A xyz --time=2:00:00 --mem=100G --pty /bin/bash

Multiple nodes: srun -N 2 -p xyz -A xyz --time=2:00:00 --mem=100G --pty /bin/bash

Find names of allocated nodes: scontrol show hostnames
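
After a multi-node srun, Slurm records the allocated nodes in the SLURM_JOB_NODELIST environment variable in a compressed form; scontrol show hostnames expands that list to one hostname per line. A small illustration, run inside the interactive session:

    scontrol show hostnames $SLURM_JOB_NODELIST   # prints one allocated hostname per line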

Batch

sbatch -p xyz -A xyz myscript.slurm
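
The file myscript.slurm is an ordinary shell script with #SBATCH directives at the top. A minimal sketch, with placeholder resource values and a placeholder program name:

    #!/bin/bash
    #SBATCH --job-name=myjob          # name shown in squeue
    #SBATCH -p xyz                    # partition (your group's)
    #SBATCH -A xyz                    # account to charge (your group)
    #SBATCH --nodes=1                 # number of nodes
    #SBATCH --ntasks-per-node=1       # tasks per node (each gets one CPU by default)
    #SBATCH --mem=10G                 # memory per node
    #SBATCH --time=4:00:00            # wall-clock limit (HH:MM:SS)

    ./myprogram                       # replace with the command you want to run

Options given on the sbatch command line (as above) override the matching #SBATCH lines in the script, and the job runs from the directory you submitted it in.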

Using R

Open up an interactive build node: srun -p build --time=2:00:00 --mem=100G --pty /bin/bash

Find available modules: module avail

Load R: module load r_3.3.3

Access R: R

Update packages: run update.packages(), choose a CRAN mirror, and answer yes to the prompts
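
For production runs, you would normally submit R non-interactively with sbatch rather than keeping an interactive session open. A sketch of such a batch script, where myscript.R and the resource values are placeholders:

    #!/bin/bash
    #SBATCH -p xyz                    # partition (your group's)
    #SBATCH -A xyz                    # account (your group)
    #SBATCH --nodes=1
    #SBATCH --mem=10G
    #SBATCH --time=4:00:00

    module load r_3.3.3               # same module as in the interactive workflow
    Rscript myscript.R                # runs the script top to bottom, no prompts

Submit it with sbatch as described in the Batch section above.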