Our current cluster, hex, runs Torque with MAUI as the scheduler. While MAUI is GPU aware, it does not allow GPUs to be scheduled: you can list the nodes that have GPUs, but you cannot submit a job that requests them, nor can you lock a GPU to a job. The MOAB scheduler for Torque can do this, but its license costs several hundred thousand dollars. Fortunately SLURM has this functionality built in, and it is free. GPU cards are defined as generic resource (GRES) objects, listed by type and number, and each card is bound to a specific set of cores on its server. To enable this, one needs to add the following line:
GresTypes=gpu
in the slurm.conf file and also add Gres information to the node configurations, for example:
NodeName=hpc406 ... Gres=gpu:kepler:2
One must also create a gres.conf file on the nodes that actually house the GPU cards:
Name=gpu Type=kepler File=/dev/nvidia0 CPUs=0,1,2,3
Name=gpu Type=kepler File=/dev/nvidia1 CPUs=4,5,6,7
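Once SLURM has picked up the new configuration, it is worth checking from the login node that the GRES entries are visible. A quick sanity check might look like the following (the node name hpc406 is simply the one from the example above):

sinfo -o "%N %G"                          # list each node with its generic resources
scontrol show node hpc406 | grep Gres     # should report Gres=gpu:kepler:2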
This indicates which cores are associated with which card. To request a GPU resource, one passes the following resource request to sbatch, salloc or srun:
#SBATCH --gres=gpu:2 --nodes=1 --ntasks=1
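Putting this together, a minimal batch script for such a job could look like the sketch below. The partition, job name and core count match the example jobs shown further down; the application binary at the end is just a placeholder:

#!/bin/bash
#SBATCH --job-name=GresTest
#SBATCH --partition=ucthimem
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --gres=gpu:1

# SLURM exports CUDA_VISIBLE_DEVICES for the allocated card(s)
echo "Allocated GPU(s): $CUDA_VISIBLE_DEVICES"

# placeholder for the actual GPU application
srun ./my_cuda_app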
When the job runs, an environment variable is set:
CUDA_VISIBLE_DEVICES=0,1
depending on how many GPU cards have been requested. Below are three jobs, all requesting two cores and a single GPU card. Only two are running even though free cores remain, because hpc406 has only two GPU cards; the third job must wait for one to be released:
JOBID PARTITION     NAME USER ST TIME NODELIST
 1937  ucthimem GresTest andy PD 0:00 (Resources)
 1935  ucthimem GresTest andy  R 1:08 hpc406
 1936  ucthimem GresTest andy  R 1:08 hpc406
Examining 1935 shows us that cores are set to CPU_IDs=1-2 while 1936’s cores are set to CPU_IDs=4-5. Additionally CUDA_VISIBLE_DEVICES=0 and CUDA_VISIBLE_DEVICES=1 are set for jobs 1935 and 1936 respectively.
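These details can be read back with scontrol; the -d flag asks for the detailed per-node allocation, which includes the CPU_IDs line (the exact fields vary between SLURM versions, so the comments below only indicate roughly what to expect):

scontrol show job -d 1935 | grep CPU_IDs   # detailed line includes Nodes=hpc406 CPU_IDs=1-2
scontrol show job -d 1936 | grep CPU_IDs   # detailed line includes Nodes=hpc406 CPU_IDs=4-5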