# Cheat Sheet
Version 0.1 • Best Before 2025-12-31
## Getting help 🛟
| Address | Description |
|---|---|
| help@sharcnet.ca | For SHARCNET-specific issues |
| accounts@tech.alliancecan.ca | Questions about accounts |
| renewals@tech.alliancecan.ca | Questions about account renewals |
| globus@tech.alliancecan.ca | Questions about Globus file transfer services |
| cloud@tech.alliancecan.ca | Questions about using Cloud resources |
| allocations@tech.alliancecan.ca | Questions about the Resource Allocation Competition |
| support@tech.alliancecan.ca | For any other questions or issues |
## Training courses 🏫
## Connecting to nibi 🔗
## Using cluster nibi 💦
### Using modules
| Command | Description |
|---|---|
| `module avail` | List all available modules |
| `module list` | List currently loaded modules |
| `module spider keyword` | Search for a module by keyword |
| `module load foo[/ver]` | Load module foo [version ver] |
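
For example, to find and load a specific version of GCC (the version shown is an assumption; check what `module spider` reports on your cluster):

```bash
module spider gcc          # list all available GCC versions
module spider gcc/12.3     # show how to load this (assumed) version and its prerequisites
module load gcc/12.3       # load that version
module list                # confirm which modules are now loaded
```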
### Most commonly used Linux commands
| Command | Description |
|---|---|
| `ls` | List files and directories in the current directory |
| `cd DIR` | Change to directory DIR |
| `cd` | Go back to your home directory |
| `cd ..` | Go up to the parent directory |
| `pwd` | Show the current directory |
| `mkdir dir1 [dir2 ...]` | Make one or more directories |
| `mkdir -p path/to/dir` | Make a directory, creating any missing parent directories |
| `cp source dest` | Copy files |
| `mv source dest` | Move or rename files and directories |
| `find` | Find files or directories that match certain criteria |
| `du -sh` | Show the total disk usage of the current directory |
| `man command` | Show the manual page of command |
| `quota` | Show your disk quota |
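
A short example session combining several of these commands (directory and file names are made up for illustration):

```bash
mkdir -p projects/demo   # create a nested directory in one step
cd projects/demo         # move into it
pwd                      # confirm where we are
cp ~/data.txt .          # copy a (hypothetical) file here
ls                       # list the contents
du -sh .                 # check how much space this directory uses
cd                       # return to the home directory
```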
### Slurm commands
The Slurm scheduler has a rich set of commands; refer to the Slurm documentation for details. The following is a list of commonly used Slurm commands:
Note

You need to create a job submission script for each job. It is a shell script. A sample script, named submit_ajob.sh, is shown below.
- To submit a job using the job submission script submit_ajob.sh:

  ```bash
  sbatch submit_ajob.sh
  ```

- To see the history of your jobs, use the command `sacct` with options (check the Slurm documentation or the man page of `sacct`):

  ```bash
  sacct -j jobid
  sacct -u $USER --starttime t1 --endtime t2
  sacct -u $USER -o ReqCPUs,ReqMem,NNodes,Starttime,Endtime
  ```

- To cancel a job:

  ```bash
  scancel jobid
  ```

- To see system information (nodes and partitions):

  ```bash
  sinfo
  ```

- To see your queued jobs:

  ```bash
  squeue -u $USER
  ```

- To see your fairshare:

  ```bash
  sshare
  ```

- To see a resource usage and efficiency summary for a completed job:

  ```bash
  seff jobid
  ```

- To allocate cores and/or nodes and use them interactively:

  ```bash
  salloc --account=def-my_group_account --ntasks=32 --time=1:00
  salloc --account=def-my_group_account --mem=0 --nodes=1
  ```
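
Putting these commands together, a typical workflow looks like this (the job ID 123456 is illustrative):

```bash
sbatch submit_ajob.sh    # submit the job; Slurm prints a job ID, e.g. 123456
squeue -u $USER          # check the job while it is pending or running
sacct -j 123456          # after it finishes, view its accounting record
seff 123456              # view its CPU and memory efficiency
scancel 123456           # cancel it early if it is no longer needed
```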
#### Sample script for submitting a serial job

```bash
#!/bin/bash
#SBATCH --time=00-01:00:00     # DD-HH:MM:SS
#SBATCH --account=my_group_account
module load python/3.6
python simple_job.py 7 output
```

To submit the job, run the following command:

```bash
sbatch submit_ajob.sh
```

#### Sample script for submitting multiple jobs
```bash
#!/bin/bash
#SBATCH --time=01:00
#SBATCH --account=my_group_account
#SBATCH --array=1-200
python simple_job.py $SLURM_ARRAY_TASK_ID output
```
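
Array tasks can be monitored with the same commands as ordinary jobs; a quick sketch (the script name submit_array.sh is hypothetical, standing for the sample above):

```bash
sbatch submit_array.sh   # submit the whole array; Slurm prints one job ID for all tasks
squeue -u $USER          # pending tasks appear as one entry, e.g. jobid_[1-200]; running tasks appear individually
sacct -j jobid           # accounting records for every task of the array
```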
#### Sample script for submitting multicore threaded jobs

```bash
#!/bin/bash
#SBATCH --account=my_group_account
#SBATCH --time=0-03:00
#SBATCH --cpus-per-task=32
#SBATCH --ntasks=1
#SBATCH --mem=20G
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./myprog.exe
```
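
The script above assumes `myprog.exe` was built with OpenMP support; a minimal sketch of building such a program (the compiler module and source file name are assumptions):

```bash
module load gcc                           # compiler module name is an assumption
gcc -fopenmp -O2 myprog.c -o myprog.exe   # -fopenmp enables the OpenMP pragmas
```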
#### Sample script for submitting multiprocess parallel jobs

```bash
#!/bin/bash
#SBATCH --account=my_group_account
#SBATCH --time=5-00:00
#SBATCH --ntasks=100
#SBATCH --mem-per-cpu=4G
srun ./mympiprog.exe
```
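
Similarly, `mympiprog.exe` is assumed to be an MPI program; one way to build it (module and file names are assumptions):

```bash
module load gcc openmpi                  # module names are assumptions
mpicc -O2 mympiprog.c -o mympiprog.exe   # mpicc wraps the compiler with the MPI flags
```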
#### Sample script for submitting a GPU job

```bash
#!/bin/bash
#SBATCH --account=my_group_account
#SBATCH --time=0-03:00
#SBATCH --gpus-per-node=h100:2
#SBATCH --mem=20G
./myprog.exe
```
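
To confirm which GPUs the job actually received, you can add a check to the script before launching your program, for example:

```bash
nvidia-smi    # lists the GPUs visible to this job
```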
#### Sample script for submitting a hybrid MPI-threaded job

```bash
#!/bin/bash
#SBATCH --account=my_group_account
#SBATCH --time=0-03:00
#SBATCH --ntasks=16              # MPI ranks
#SBATCH --cpus-per-task=4        # threads per rank
#SBATCH --mem=20G
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun --cpus-per-task=$SLURM_CPUS_PER_TASK ./myprog
```

#### Requesting scheduling of jobs by node
A sample submission script:

```bash
#!/bin/bash
#SBATCH --account=my_group_account
#SBATCH --time=0-03:00
#SBATCH --ntasks=16              # MPI ranks
XXXXXXXXXXXXXXXXXXXX
```
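
For example, on nibi, where each CPU node has 192 cores (see the specs table below), a whole-node request might look like the following sketch (the node count and program name are assumptions):

```bash
#!/bin/bash
#SBATCH --account=my_group_account
#SBATCH --time=0-03:00
#SBATCH --nodes=2                 # request two whole nodes
#SBATCH --ntasks-per-node=192     # one task per core on each 192-core node
#SBATCH --mem=0                   # request all available memory on each node
srun ./mympiprog.exe
```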
## Using Python 🐍

To create a virtual environment, with NumPy as an example Python package:

```bash
module load python/3.12
virtualenv --no-download ~/ENV
source ~/ENV/bin/activate
pip install --no-index --upgrade pip
pip install --no-index numpy
```

If you need a specific version of a package, install it like this:
```bash
pip install --no-index numpy==1.26.4
```

The flag --no-index installs packages from our wheelhouse. These are always preferable to packages installed from the internet, as they are tuned to run on our systems.
To see the available wheels for a particular version of Python, use
```bash
avail_wheels numpy --all_versions -p 3.12
```
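
The virtual environment can then be activated inside a job script; a minimal sketch (the script name is hypothetical):

```bash
#!/bin/bash
#SBATCH --time=0-01:00
#SBATCH --account=my_group_account
module load python/3.12
source ~/ENV/bin/activate     # activate the virtual environment created above
python my_script.py           # my_script.py is a placeholder for your own code
```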
## Using Apptainer 🚢
Some packages are difficult to install in our Linux environment. The alternative is to install them in a container. Here is an example using micromamba (a conda-compatible package manager) to install NumPy.
Create a file image.def with:

```
Bootstrap: docker
From: mambaorg/micromamba:latest

%post
micromamba install -c conda-forge numpy
```

Build the image with:
```bash
module load apptainer
apptainer build image.sif image.def
```

Run Python in the image with:
```bash
apptainer run image.sif python
```
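
To run your own script inside the container rather than an interactive interpreter, you can use `apptainer exec` (the script name is hypothetical):

```bash
apptainer exec image.sif python my_script.py
```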
## DRAC clusters across Canada 🌐

| Cluster | Cores | GPUs | Max memory | Storage |
|---|---|---|---|---|
| fir | 165,120 | 640 | 49TB | |
| nibi | 25TB | | | |
| trillium | | | | |
| rorqual | | | | |
| narval | | | | |
## nibi specs 📈
| Nodes | Cores | Memory | CPU | GPU |
|---|---|---|---|---|
| 700 | 192 | 768GB DDR5 | 2 x Intel 6972P @ 2.4 GHz, 384 MB L3 cache | - |
| 10 | 192 | 6TB DDR5 | 2 x Intel 6972P @ 2.4 GHz, 384 MB L3 cache | - |
| 36 | 192 | 1.5TB | 1 x Intel 8570 @ 2.1 GHz, 300 MB L3 cache | 8 x Nvidia H100 SXM (80 GB memory) |
| 6 | 96 | 512GB | 4 x AMD MI300A @ 2.1 GHz | 4 x AMD CDNA 3 (128 GB HBM3 memory, unified memory model) |




