Running Jobs

Using the Digital Research Alliance of Canada

Karen Cristine Goncalves, Ph.D.

2023-01-30

Run jobs

None of the commands below can be run from your home directory (~); move to another folder first (the examples here use $SCRATCH).

The three possibilities:

  • srun - Use to submit a script to be executed in real time (the progress of the job is printed on the screen instead of being saved to a file)
  • sbatch - Use to submit a script to be executed later
  • salloc - Use to allocate resources for a job in real time.
    • Basically, you ask for memory and CPUs for a set amount of time, and once the scheduler allocates the resources to you, you work in real time on a computer more powerful than yours.

Run jobs - srun

srun NEEDED_INFO script/command

  • Before running srun, load the required software into the session
    • Adding module load software to the script does not work
  • Information needed
    • --account=account_name - def-laboidp, def-desgagne or def-germain1
    • --time=d-HH:MM:SS - time required for the script, e.g. 12 hours: --time=12:00:00
    • --mem=1G - memory required (here, 1G; default 256M per core)
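
As a minimal sketch of how the --time value is built, the snippet below converts a number of minutes into the d-HH:MM:SS form. The function slurm_time is a hypothetical helper written for this illustration, not a Slurm command, and my_script.sh is a placeholder name. The srun command is only printed, not run.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): convert minutes to d-HH:MM:SS.
slurm_time() {
    total_min=$1
    days=$(( total_min / 1440 ))
    hours=$(( (total_min % 1440) / 60 ))
    mins=$(( total_min % 60 ))
    printf '%d-%02d:%02d:00\n' "$days" "$hours" "$mins"
}

twelve_hours=$(slurm_time 720)    # 12 hours -> 0-12:00:00
two_days=$(slurm_time 2880)       # 48 hours -> 2-00:00:00

# Print (do not run) the srun command these values would go into.
# my_script.sh is a placeholder name.
echo "srun --account=def-desgagne --time=$twelve_hours --mem=1G my_script.sh"
```

Asking for a time slightly above what the job needs is usually safer than asking for too little, since a job that runs out of time is stopped.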

srun - example

module load StdEnv/2020 blast+ # load the software
myDatabase=/database/path # full path to the database, without the extension
cd $SCRATCH

srun --account=def-desgagne --time=01:00:00 --mem=1024M\
 blastp -query prots.fasta\
 -db $myDatabase\
 -out prots_Databased.txt\
 -outfmt '7' # output in table format

  • The file prots.fasta must be in the current folder; otherwise, give its full path
    • The same goes for the output prots_Databased.txt
  • When you use \ to split one command across several lines, start the next line with a space to keep the words separated
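
The effect of that leading space can be checked without submitting anything. The sketch below uses echo in place of a real command; the variable names are chosen for this illustration only.

```shell
#!/bin/sh
# The backslash and the line break after it are both removed by the shell,
# so only the leading space on the next line keeps the two words apart.
with_space=$(echo blastp\
 -query)

without_space=$(echo blastp\
-query)

echo "$with_space"      # prints: blastp -query
echo "$without_space"   # prints: blastp-query
```

In the second case the shell would look for a program called blastp-query, which does not exist.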

Note on blast

  • $myDatabase
    • Add the full path, without the extension
    • Use the right type of database (the first letter of the extension indicates the type of database)
      • .p* - protein database
      • .n* - nucleotide database
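
One way to check the type before running blast is to look at the files that share the database path. This is a rough sketch: db_type is a hypothetical helper, and the demo uses empty placeholder files in a temporary folder instead of a real database.

```shell
#!/bin/sh
# Hypothetical helper: report the database type for a path given
# WITHOUT the extension, based on the first letter of the extensions found.
db_type() {
    prefix=$1
    if ls "$prefix".p* >/dev/null 2>&1; then
        echo protein
    elif ls "$prefix".n* >/dev/null 2>&1; then
        echo nucleotide
    else
        echo unknown
    fi
}

# Demo with empty placeholder files (not a real database).
demo_dir=$(mktemp -d)
touch "$demo_dir/myprots.phr" "$demo_dir/myprots.pin" "$demo_dir/myprots.psq"
touch "$demo_dir/mygenes.nhr" "$demo_dir/mygenes.nin" "$demo_dir/mygenes.nsq"

prot_result=$(db_type "$demo_dir/myprots")
nucl_result=$(db_type "$demo_dir/mygenes")
none_result=$(db_type "$demo_dir/missing")
echo "$prot_result $nucl_result $none_result"

rm -r "$demo_dir"
```

With a real database, the call would be db_type $myDatabase, and blastp expects the answer to be protein.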

Run jobs - sbatch

sbatch script or sbatch NEEDED_INFO script

  • Used to submit a script to be executed later.
  • Essentials for the script:
    • The first line indicates the program that interprets the script: #!/bin/sh
    • Lines starting with #SBATCH tell the scheduler what we need for this job

sbatch - script example

#!/bin/sh
#SBATCH --account=def-desgagne
#SBATCH --time=01:00:00 
#SBATCH --mem=1024M

module load StdEnv/2020 blast+

myDatabase=/database/path

cd $SCRATCH

blastp -query prots.fasta\
 -db $myDatabase\
 -out prots_Databased.txt\
 -outfmt '7' # output in table format

  • Note that here the “module load” line is INSIDE the script
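
Before submitting, it can help to check that the script has the two essentials: the interpreter line and the #SBATCH lines. The sketch below writes a small script to a temporary folder and inspects it. The file name job.sh and the echo line inside it are placeholders for this illustration, and the sbatch commands are only printed, not run.

```shell
#!/bin/sh
# Write a small job script to a temporary folder (job.sh is a placeholder name).
job_dir=$(mktemp -d)
job_script="$job_dir/job.sh"
cat > "$job_script" <<'EOF'
#!/bin/sh
#SBATCH --account=def-desgagne
#SBATCH --time=01:00:00
#SBATCH --mem=1024M

echo "Running on $(hostname)"
EOF

# Check the two essentials: interpreter line and scheduler requests.
first_line=$(head -n 1 "$job_script")
directives=$(grep -c '^#SBATCH' "$job_script")
echo "Interpreter line: $first_line"
echo "Number of #SBATCH lines: $directives"

# The two ways of submitting (printed here, not run):
echo "sbatch $job_script"                  # requests are read from the script
echo "sbatch --time=02:00:00 $job_script"  # an option given here takes precedence over the script
```

The second form corresponds to sbatch NEEDED_INFO script: an option given on the command line overrides the matching #SBATCH line.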

Run jobs - salloc

salloc NEEDED_INFO

  • Used to allocate resources for a job in real time.
  • Basically, you ask for memory and CPUs for a set amount of time, and once the scheduler allocates the resources to you, you work in real time on a computer more powerful than yours.
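
A request could look like the sketch below. The account, time and memory values are the ones used in the earlier examples; --cpus-per-task=2 is an illustrative choice, not a recommendation. The command is only printed here, since salloc opens an interactive session.

```shell
#!/bin/sh
# Request values: account, time and memory as in the earlier examples;
# --cpus-per-task=2 is an illustrative assumption.
account="def-desgagne"
request="--account=$account --time=01:00:00 --mem=1024M --cpus-per-task=2"

# Print the command; on the cluster, type it at the prompt to start the session.
salloc_cmd="salloc $request"
echo "$salloc_cmd"
```

Once the resources are granted, the commands you type run with those resources until you type exit or the requested time runs out.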

Resources