Compiling and Running CPU Jobs
Sections:
- General Steps: Compiling/Running Jobs
- Checking Your Environment
- Hello World (MPI)
- Hello World (OpenMP)
- Compiling and Running Hybrid (MPI + OpenMP) Jobs
General Steps: Compiling/Running Jobs
- Change to a working directory (for example, the expanse101/MPI directory):
cd /home/$USER/expanse101/MPI
- Verify that the correct modules are loaded:
module list
Currently Loaded Modulefiles:
1) slurm/expanse/20.02.3 2) cpu/1.0 3) gcc/10.2.0 4) openmpi/4.0.4
- Compile the MPI hello world code:
mpif90 -o hello_mpi hello_mpi.f90
- Verify that the executable has been created (check the date):
ls -lt hello_mpi
-rwxr-xr-x 1 user sdsc 721912 Mar 25 14:53 hello_mpi
- Submit the job:
sbatch hello_mpi_Slurm.sb
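- After submitting, you can check on the job with squeue and cancel it with scancel if needed. A minimal sketch is shown below; the job ID is hypothetical, and sbatch prints the real one for your submission:
# List your queued and running jobs (ST column: PD = pending, R = running)
squeue -u $USER
# Cancel a job if necessary, using the job ID reported by sbatch (hypothetical ID shown)
scancel 1234567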
Checking Your Environment
This simple batch script shows you how to check your user environment and verify that your Slurm environment is working.
- Script contents:
[user@login01 ENV_INFO]$ cat env-Slurm.sb
#!/bin/bash
#SBATCH --job-name="envinfo"
#SBATCH --output="envinfo.%j.%N.out"
#SBATCH --partition=compute
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --export=ALL
#SBATCH -A sds173
#SBATCH -t 00:01:00
# Load module environment
module purge
module load slurm
module load cpu
module load sdsc
# perform some basic unix commands
echo "----------------------------------"
echo "hostname= " `hostname`
echo "date= " `date`
echo "whoami= " `whoami`
echo "pwd= " `pwd`
echo "module list= " `module list`
echo "----------------------------------"
echo "env= " `env`
echo "----------------------------------"
echo "expanse-client user -p: " `expanse-client user -p`
echo "----------------------------------"
- Submit the batch script and monitor it until the job is allocated a node and completes execution:
[user@login01 ENV_INFO]$ sbatch env-Slurm.sb
Submitted batch job 1088090
[user@login01 ENV_INFO]$ squeue -u user
1088090 compute envinfo user PD 0:00 1 (ReqNodeNotAvail,[SNIP]
[...]
[user@login01 ENV_INFO]$ squeue -u user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1088090 compute envinfo user PD 0:00 1 (ReqNodeNotAvail, [SNIP]
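- Once the job completes and leaves the queue, the output appears in the file named by the --output directive (envinfo.<jobid>.<node>.out). A quick sketch of inspecting it, using the job ID from the run above; the node name in the file name will differ for your run:
# The output file name follows the pattern set by --output
ls -l envinfo.1088090.*.out
# Review the captured environment information
cat envinfo.1088090.*.out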
Hello World (MPI)
Subsections:
- Hello World (MPI): Source Code
- Hello World (MPI): Compiling
- Hello World (MPI): Batch Script Submission
- Hello World (MPI): Batch Script Output
- Hello World (MPI): Interactive Jobs
Hello World (MPI): Source Code
- Change to the tutorial MPI examples directory.
- Source code with basic MPI elements:
[user@login01 MPI]$ cat hello_mpi.f90
! Fortran example
program hello
include 'mpif.h'
integer rank, size, ierror, tag, status(MPI_STATUS_SIZE)
call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)
print*, 'node', rank, ': Hello world!'
call MPI_FINALIZE(ierror)
end
[user@login01 MPI]$
Hello World (MPI): Compiling
- To compile, check out the instructions in the README.txt file.
- Follow the instructions in the batch script provided for the compiler you want to test.
[user@login01 MPI]$ cat README.txt
[1] Compile:
### MODULE ENV: updated 01/28/2020 (MPT)
module purge
module load slurm
module load cpu
module load gcc/10.2.0
module load openmpi/4.0.4
mpif90 -o hello_mpi hello_mpi.f90
[2a] Run using Slurm:
sbatch hellompi-Slurm.sb
- Follow the compile instructions for the compiler that you want to use:
[user@login01 MPI]$ module purge
[user@login01 MPI]$ module load slurm
[user@login01 MPI]$ module load cpu
[user@login01 MPI]$ module load gcc/10.2.0
[user@login01 MPI]$ module load openmpi/4.0.4
[user@login01 MPI]$ module list
Currently Loaded Modules:
1) slurm/expanse/20.02.3 2) cpu/1.0 3) gcc/10.2.0 4) openmpi/4.0.4
- Next, compile the code:
[user@login01 MPI]$ mpif90 -o hello_mpi hello_mpi.f90
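- Before submitting, you can sanity-check the executable and the toolchain it was built with. A quick sketch:
# Confirm the executable exists and check its timestamp
ls -lt hello_mpi
# Check which MPI wrapper and underlying compiler are in your path
which mpif90
mpif90 --version
# Verify the binary links against the Open MPI libraries you loaded
ldd hello_mpi | grep -i mpi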
Hello World (MPI): Batch Script Submission
- The batch script contains the module commands needed to set up the correct environment to run the code. The contents of the default batch script are:
[user@login01 MPI]$ cat hellompi-Slurm.sb
#!/bin/bash
#SBATCH --job-name="hellompi"
#SBATCH --output="hellompi.%j.%N.out"
#SBATCH --partition=compute
### SBATCH --partition=shared
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=12
#SBATCH --export=ALL
#SBATCH -t 00:04:00
#SBATCH -A abc123
## This job runs with 3 nodes, and a total of 12 cores.
# Environment
## MODULE ENV: updated 01/28/2020 (MPT)
module purge
module load slurm
module load cpu
module load gcc/10.2.0
module load openmpi/4.0.4
# Use srun to run the job
srun --mpi=pmi2 -n 12 --cpu-bind=rank ./hello_mpi
- In this batch script we are using the GNU compiler. Make sure the srun -n value matches the total number of tasks requested by the SBATCH directives (nodes x ntasks-per-node). Note that the example output shown later came from a larger run that requested 2 CPU compute nodes with 128 tasks per node, for a total of 256 tasks.
- The name of the job is set on line 2 of the script, while the name of the output file is set on line 3, where "%j" is the Slurm JOB_ID and "%N" is the compute node name. You can name your output file however you wish, but it is helpful to keep the JOB_ID and node info in case something goes wrong.
- Submit the batch script using the sbatch command and monitor the job status using the squeue command:
[user@login01 MPI]$ sbatch hellompi-Slurm.sb
Submitted batch job 667424
[user@login01 MPI]$ squeue -u user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
667424 compute hellompi user PD 0:00 2 (Priority)
[user@login01 MPI]$ squeue -u user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
667424 compute hellompi user PD 0:00 2 (Priority)
[user@login01 MPI]$ squeue -u user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
667424 compute hellompi user CF 0:01 2 exp-2-[28-29]
[user@login01 MPI]$ squeue -u user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
667424 compute hellompi user R 0:02 2 exp-2-[28-29]
[user@login01 MPI]$ squeue -u user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
[user@login01 MPI]$ ll
total 151
drwxr-xr-x 2 user abc123 13 Dec 10 01:06 .
drwxr-xr-x 8 user abc123 8 Oct 8 04:16 ..
-rwxr-xr-x 1 user abc123 21576 Oct 8 03:12 hello_mpi
-rw-r--r-- 1 user abc123 8448 Oct 8 03:32 hellompi.667424.exp-2-28.out
[user@login01 MPI]$
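- After the job has finished and dropped out of the squeue listing, Slurm's accounting command sacct can confirm how it ran. A quick sketch using the job ID from the transcript above; the format fields are standard sacct options:
sacct -j 667424 --format=JobID,JobName,State,Elapsed,NNodes,NTasks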
Hello World (MPI): Batch Script Output
[user@login01 MPI]$ cat hellompi.667424.exp-2-28.out
node 1 : Hello world!
node 0 : Hello world!
[snip]
node 247 : Hello world!
node 254 : Hello world!
node 188 : Hello world!
node 246 : Hello world!
Hello World (MPI): Interactive Jobs
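- For short tests and debugging you can skip the batch script and request an interactive shell on a compute node with srun. A minimal sketch is shown below; the partition, account, and resource values are illustrative, so substitute your own project's values and check the Expanse user guide for the recommended interactive-job options:
# Request 1 node with 4 tasks for 30 minutes and open an interactive shell on it
srun --partition=debug --nodes=1 --ntasks-per-node=4 --mem=8G -t 00:30:00 -A abc123 --pty /bin/bash
# On the compute node, load the same module stack used to compile the code
module purge
module load slurm cpu gcc/10.2.0 openmpi/4.0.4
# When finished, exit the shell to release the allocation
exit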
Hello World (OpenMP)
Subsections:
- Hello World (OpenMP): Source Code
- Hello World (OpenMP): Compiling
- Hello World (OpenMP): Batch Script Submission
- Hello World (OpenMP): Batch Script Output
Hello World (OpenMP): Source Code
[user@login02 OPENMP]$ cat hello_openmp.f90
PROGRAM OMPHELLO
INTEGER TNUMBER
INTEGER OMP_GET_THREAD_NUM
!$OMP PARALLEL DEFAULT(PRIVATE)
TNUMBER = OMP_GET_THREAD_NUM()
PRINT *, 'HELLO FROM THREAD NUMBER = ', TNUMBER
!$OMP END PARALLEL
END
Hello World (OpenMP): Compiling
- First, load the correct module environment:
module purge
module load slurm
module load cpu
module load aocc
module list
Currently Loaded Modules:
1) slurm/expanse/20.02.3 2) cpu/0.15.4 3) aocc/2.2.0
- Next, compile the code:
flang -fopenmp -o hello_openmp hello_openmp.f90
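- The same source also builds with the GNU stack if you prefer it to AOCC; a sketch using the gcc module version seen earlier in this tutorial. If you build this way, load the matching modules in your batch script as well:
module purge
module load slurm
module load cpu
module load gcc/10.2.0
gfortran -fopenmp -o hello_openmp hello_openmp.f90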
Hello World (OpenMP): Batch Script Submission
- Batch Script contents:
#!/bin/bash
# Example of OpenMP code running on a shared node
#SBATCH --job-name="hell_openmp_shared"
#SBATCH --output="hello_openmp_shared.%j.%N.out"
#SBATCH --partition=shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --export=ALL
#SBATCH --account=sds173
#SBATCH -t 00:10:00
## AOCC environment
module purge
module load slurm
module load cpu
module load aocc
## Set the number of OpenMP threads
export OMP_NUM_THREADS=16
## Run the OpenMP job
./hello_openmp
- Note that the script loads the module stack and sets the number of OpenMP threads.
- Submit the job to the batch queue and monitor it:
[user@login02 OPENMP]$ sbatch openmp-Slurm-shared.sb ; squeue -u user
Submitted batch job 1088802
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1088802 shared hell_ope user PD 0:00 1 (None)
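- Once the job no longer appears in the squeue listing, the output file named by the --output directive should be in the working directory. A quick check, using the job ID shown above:
# The file name follows the pattern hello_openmp_shared.<jobid>.<node>.out
ls -l hello_openmp_shared.1088802.*.out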
Hello World (OpenMP): Batch Script Output
[user@login02 OPENMP]$ cat hello_openmp_shared.1088802.exp-3-08.out
HELLO FROM THREAD NUMBER = 14
HELLO FROM THREAD NUMBER = 15
HELLO FROM THREAD NUMBER = 10
HELLO FROM THREAD NUMBER = 8
HELLO FROM THREAD NUMBER = 12
HELLO FROM THREAD NUMBER = 4
HELLO FROM THREAD NUMBER = 1
HELLO FROM THREAD NUMBER = 0
HELLO FROM THREAD NUMBER = 9
HELLO FROM THREAD NUMBER = 7
HELLO FROM THREAD NUMBER = 11
HELLO FROM THREAD NUMBER = 2
HELLO FROM THREAD NUMBER = 5
HELLO FROM THREAD NUMBER = 13
HELLO FROM THREAD NUMBER = 3
HELLO FROM THREAD NUMBER = 6
[user@login02 OPENMP]$
- Note the non-deterministic ordering of the thread numbers in the output. This is expected: the threads run concurrently and print independently, so the order can change from run to run.
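- If you want to verify that all 16 threads reported in, you can count or sort the lines of the output file from the run above. A quick sketch:
# Count the greetings -- should equal OMP_NUM_THREADS (16 here)
grep -c 'HELLO FROM THREAD NUMBER' hello_openmp_shared.1088802.exp-3-08.out
# Sort the lines numerically by thread number for easier inspection
sort -t= -k2 -n hello_openmp_shared.1088802.exp-3-08.out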
Compiling and Running Hybrid (MPI + OpenMP) Jobs
Subsections:
- Hybrid (MPI + OpenMP): Source Code
- Hybrid (MPI + OpenMP): Compiling
- Hybrid (MPI + OpenMP): Batch Script Submission
- Hybrid (MPI + OpenMP): Batch Script Output
Hello World Hybrid (MPI + OpenMP): Source Code
- Source code: hello_hybrid.c
#include <stdio.h>
#include "mpi.h"
#include <omp.h>
int main(int argc, char *argv[]) {
int numprocs, rank, namelen;
char processor_name[MPI_MAX_PROCESSOR_NAME];
int iam = 0, np = 1;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(processor_name, &namelen);
#pragma omp parallel default(shared) private(iam, np)
{
np = omp_get_num_threads();
iam = omp_get_thread_num();
printf("Hello from thread %d out of %d from process %d out of %d on %s\n",
iam, np, rank, numprocs, processor_name);
}
MPI_Finalize();
}
Hello World Hybrid (MPI + OpenMP): Compiling
- Compile the code, but remember to load the right modules first (see README.txt):
[1] Compile:
module purge
module load slurm
module load cpu
module load intel
module load intel-mpi
export I_MPI_CC=icc
mpicc -qopenmp -o hello_hybrid hello_hybrid.c
[2] Run:
sbatch hybrid-Slurm.sb
- Compilation example:
[user@login01 HYBRID]$ module purge
[user@login01 HYBRID]$ module load slurm
[user@login01 HYBRID]$ module load cpu
[user@login01 HYBRID]$ module load intel
[user@login01 HYBRID]$ module load intel-mpi
[user@login01 HYBRID]$ export I_MPI_CC=icc
[user@login01 HYBRID]$ mpicc -qopenmp -o hello_hybrid hello_hybrid.c
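- If you would rather build the hybrid example with the GNU/Open MPI stack than with Intel MPI, a sketch using the same modules as the MPI example is shown below; remember to load the matching modules in the batch script if you switch stacks:
module purge
module load slurm
module load cpu
module load gcc/10.2.0
module load openmpi/4.0.4
mpicc -fopenmp -o hello_hybrid hello_hybrid.c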
Hello World Hybrid (MPI + OpenMP): Batch Script Submission
- Submit the batch script and monitor:
[user@login01 HYBRID]$ sbatch hybrid-Slurm.sb
Submitted batch job 1089019
[user@login01 HYBRID]$ squeue -u user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
[user@login01 HYBRID]$ squeue -u user
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
Hello World Hybrid (MPI + OpenMP): Batch Script Output
- Batch Script Output:
[user@login01 HYBRID]$ cat hellohybrid.1089019.exp-10-07.out
Hello from thread 0 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 14 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 1 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 4 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 12 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 2 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 13 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 7 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 3 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 8 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 0 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 3 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 13 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 14 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 6 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 7 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 11 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 12 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 4 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 1 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 5 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 15 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 11 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 9 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 5 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 2 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 6 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 9 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 8 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 10 out of 16 from process 1 out of 2 on exp-10-07
Hello from thread 10 out of 16 from process 0 out of 2 on exp-10-07
Hello from thread 15 out of 16 from process 0 out of 2 on exp-10-07
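- A quick way to confirm that every MPI rank and OpenMP thread reported in is to count the output lines; with 2 ranks and 16 threads each you should see 32 messages:
# Expect 2 MPI ranks x 16 OpenMP threads = 32 lines
grep -c 'Hello from thread' hellohybrid.1089019.exp-10-07.out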