This is a compact reference sheet for the most essential SLURM commands and their usage. It does not list every command. For more information about command options and flags, and for additional commands, refer to SLURM’s own manual pages or summary sheet. Note that some of the commands described there are available to system administrators only.
Command | Description | Usage |
sbatch | Submits a batch job to the queue. | sbatch jobscript.sh |
squeue | Displays the state of submitted jobs. Use with -u to show jobs for a specific username. | squeue -u username |
scancel | Cancels an existing job. Find the jobID with squeue. | scancel jobID |
sinfo | Displays all available cluster resources. | sinfo |
sacct | Displays job accounting data. Use with -j for specific jobIDs. | sacct -j jobID |
salloc | Allocates compute node resources for interactive use. | salloc |
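A minimal job script for ‘sbatch’ might look like the sketch below. The options used (--job-name, --partition, --ntasks, --time, --output) are standard sbatch options, but the values, the job name ‘example’ and the contents of jobscript.sh are assumptions for illustration. Only the ‘short’ partition name comes from this page.

```shell
# Write a small example job script (hypothetical contents) to jobscript.sh.
cat > jobscript.sh <<'EOF'
#!/bin/bash
# Hypothetical job name, shown by squeue.
#SBATCH --job-name=example
# Partition to run in; 'short' is the default partition here.
#SBATCH --partition=short
# One task, with an assumed 10-minute time limit.
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
# Output file; %j expands to the job ID.
#SBATCH --output=example_%j.out

# SLURM_JOB_ID is set by SLURM inside a job; fall back when run by hand.
echo "Job ${SLURM_JOB_ID:-not-in-slurm} running on $(hostname)"
EOF

# Submit on the cluster with:  sbatch jobscript.sh
```

After submission, ‘squeue -u username’ shows the job while it is queued or running, and ‘sacct -j jobID’ shows its accounting data afterwards.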
Monitoring cluster resources with ‘sinfo’:
As ‘short’ is the default partition, it is convenient to display the resources for just this one, by adding ‘-p short’ to the ‘sinfo’ command. By default, the nodes which are in the same state are grouped together.
$ sinfo -p short
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
short*       up 1-00:00:00      1   mix# racc2-comp-2
short*       up 1-00:00:00     29  idle~ racc2-comp-[3-31]
short*       up 1-00:00:00      2    mix racc2-comp-[0-1]
The above output shows that nodes 3-31 are idle, and the ‘~’ suffix means they are switched off to save power. Nodes 0 and 1 are in a ‘mix’ state, meaning that some of the cores on the node are in use and some of them are free. Node 2 is in the ‘mix#’ state: it is partially allocated, and the ‘#’ suffix means it is being started. Fully allocated nodes show the status ‘alloc’, meaning they will not be available for new jobs until the jobs currently running on them are finished.
Further details can be displayed using the ‘-o’ flag. See the manual page, ‘man sinfo’, for more details on format specifiers. In this example, the number of CPU cores in each state is displayed with the command:
$ sinfo -p short -o "%P %.6t %C"
PARTITION  STATE CPUS(A/I/O/T)
short*   mix# 48/80/0/128
short*  idle~ 0/3712/0/3712
short*    mix 240/16/0/256
A/I/O/T stands for Allocated/Idle/Other/Total. The idle and switched off nodes have 3712 cores available. There are two nodes that are partially allocated (mix), and another that is partially allocated and is being started (mix#). In total, there are 3808 cores (80 + 3712 + 16) available for new jobs.
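The idle-core total can also be computed rather than added by hand. The following is a minimal sketch, assuming the output format of the ‘-o "%P %.6t %C"’ example; the function name sum_idle_cores is hypothetical and is not part of SLURM.

```shell
# Sum the idle field (the second value of A/I/O/T) over all data lines.
# Reads sinfo output in the '%P %.6t %C' format on stdin, header included.
sum_idle_cores() {
    awk 'NR > 1 { split($NF, c, "/"); idle += c[2] } END { print idle + 0 }'
}

# Sample output copied from the example on this page.
sum_idle_cores <<'EOF'
PARTITION  STATE CPUS(A/I/O/T)
short*   mix# 48/80/0/128
short*  idle~ 0/3712/0/3712
short*    mix 240/16/0/256
EOF
# prints 3808
```

On the cluster, the same function can read live data: sinfo -p short -o "%P %.6t %C" | sum_idle_cores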
Nodes can be listed individually by adding the ‘-N’ flag:
$ sinfo -p short -N -o "%N %.6t %C"
NODELIST  STATE CPUS(A/I/O/T)
racc2-comp-0   idle 0/128/0/128
racc2-comp-1   idle 0/128/0/128
racc2-comp-2  idle~ 0/128/0/128
racc2-comp-3  idle~ 0/128/0/128
racc2-comp-4  idle~ 0/128/0/128
racc2-comp-5  idle~ 0/128/0/128
racc2-comp-6  idle~ 0/128/0/128
racc2-comp-7  idle~ 0/128/0/128
racc2-comp-8  idle~ 0/128/0/128
racc2-comp-9  idle~ 0/128/0/128
racc2-comp-10  idle~ 0/128/0/128
racc2-comp-11  idle~ 0/128/0/128
racc2-comp-12  idle~ 0/128/0/128
racc2-comp-13  idle~ 0/128/0/128
racc2-comp-14  idle~ 0/128/0/128
racc2-comp-15  idle~ 0/128/0/128
racc2-comp-16  idle~ 0/128/0/128
racc2-comp-17  idle~ 0/128/0/128
racc2-comp-18  idle~ 0/128/0/128
racc2-comp-19  idle~ 0/128/0/128
racc2-comp-20  idle~ 0/128/0/128
racc2-comp-21  idle~ 0/128/0/128
racc2-comp-22  idle~ 0/128/0/128
racc2-comp-23  idle~ 0/128/0/128
racc2-comp-24  idle~ 0/128/0/128
racc2-comp-25  idle~ 0/128/0/128
racc2-comp-26  idle~ 0/128/0/128
racc2-comp-27  idle~ 0/128/0/128
racc2-comp-28  idle~ 0/128/0/128
racc2-comp-29  idle~ 0/128/0/128
racc2-comp-30  idle~ 0/128/0/128
racc2-comp-31  idle~ 0/128/0/128
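A per-node listing like this can be filtered by state, for example to find nodes that are idle and already powered on. The following is a minimal sketch, assuming the ‘-N -o "%N %.6t %C"’ output format; the function name nodes_in_state is hypothetical and is not part of SLURM.

```shell
# Print the names of nodes whose STATE column exactly matches $1.
# Reads sinfo output in the '%N %.6t %C' format on stdin, header included.
nodes_in_state() {
    awk -v state="$1" 'NR > 1 && $2 == state { print $1 }'
}

# A shortened sample taken from the per-node listing on this page.
nodes_in_state idle <<'EOF'
NODELIST  STATE CPUS(A/I/O/T)
racc2-comp-0   idle 0/128/0/128
racc2-comp-1   idle 0/128/0/128
racc2-comp-2  idle~ 0/128/0/128
EOF
# prints racc2-comp-0 and racc2-comp-1, one per line
```

Passing ‘idle~’ instead of ‘idle’ lists the nodes that are idle but switched off. The match is exact, so ‘idle’ does not also select ‘idle~’.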