|  cpu  |  5 days  |  8  |  1  |
| unlimited |  no limit  |  8  |  1  |
|  gpu  |  5 days  |  1  |  1  |
  
===== Partition (queue) limits =====
^ Command ^ Example syntax ^ Meaning ^
|sbatch|sbatch <jobscript>| Submit a batch job.|
|srun|%%srun --pty -t 0-0:5:0 -p cpu /bin/bash -i%%|Start an interactive session for five minutes in the cpu queue.|
|squeue|squeue -u <userid>|View status of your jobs in the queue. Only non-completed jobs will be shown.|
|scontrol|scontrol show job <jobid>| Look at a running job in detail. For more information about the job, add the -dd parameter.|
  
  
  * Request a node with 12GB of RAM (total): ''%%sbatch --mem=12G job_script%%''. To see how much memory is currently available on the nodes: ''%%sinfo --Node -l%%''

  * Request a node with 6GB of RAM per core (CPU): ''%%sbatch --mem-per-cpu=6G job_script%%''

  * Most of the Keck nodes have 24 GB of RAM (23936 MB), but there are two nodes, w16 and w17, which have 32 GB (31977 MB). If your job needs more than 20GB of RAM (but less than 32GB), you can request one of the "high-memory" nodes with the following statements in your SLURM batch file (a complete example header is sketched just below):

  #SBATCH --mem=30G               # request allocation of 30GB RAM for the job
  #SBATCH --nodelist=w16          # or w17; pick a node which has no jobs running
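A minimal sketch of a complete batch-script header using these directives; the core count, time limit and output file names are placeholders borrowed from the other examples on this page, so adapt them to your own job:

<code>
#!/bin/bash
#SBATCH -n 8                       # Request 8 cores
#SBATCH -t 0-04:00                 # Runtime in D-HH:MM format (placeholder)
#SBATCH -p cpu                     # Partition to run in
#SBATCH --mem=30G                  # Request 30GB of RAM for the whole job
#SBATCH --nodelist=w16             # Run on high-memory node w16 (or w17)
#SBATCH -o %j.out                  # File to which STDOUT will be written, including job ID
#SBATCH -e %j.err                  # File to which STDERR will be written, including job ID

# your commands go here
</code>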
  
  
  * canceling jobs:
|scancel 1234                           | cancel job 1234|
|scancel -u myusername                  | cancel all my jobs|
|%%scancel -u myusername --state=running%%  | cancel all my running jobs|
|%%scancel -u myusername --state=pending%%  | cancel all my pending jobs|
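
These filters can be combined; for example, a command such as the following would cancel only your pending jobs in the cpu partition (''%%-p%%'' selects the partition):

<code>
scancel -u myusername --state=pending -p cpu
</code>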
  
  
|scontrol show job <jobid> -dd|show details for a running job, -dd requests more detail|
  
|%%sstat -j <jobid>.batch --format JobID,MaxRSS,MaxVMSize,NTasks%% | show status information for a running job; you can find all the fields you can specify with the %%--format%% parameter by running sstat -e|
|%%sacct -j <jobid> --format=JobId,AllocCPUs,State,ReqMem,MaxRSS,Elapsed,TimeLimit,CPUTime,ReqTres%%|get statistics on a completed job; you can find all the fields you can specify with the %%--format%% parameter by running sacct -e; you can specify the width of a field with % and a number, for example %%--format=JobID%15%% for 15 characters|
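
For example, to see how much memory a finished job actually used (MaxRSS) compared to what was requested (ReqMem), where 1234 is a placeholder job ID:

<code>
sacct -j 1234 --format=JobID%15,AllocCPUs,State,ReqMem,MaxRSS,Elapsed
</code>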
  
  
#!/bin/bash
#SBATCH -n 8                       # Request 8 cores
#SBATCH -t 0-01:30                 # Runtime in D-HH:MM format
#SBATCH -p cpu                     # Partition to run in
#SBATCH --mem=20G                  # Total memory for the job (all cores)
#SBATCH -e %j.err                  # File to which STDERR will be written, including job ID
set -xv
echo Running on host $(hostname)
echo "Job id: ${SLURM_JOB_ID}"
echo Time is $(date)
echo Directory is $(pwd)
echo "This job has allocated $SLURM_NPROCS processors in $SLURM_JOB_PARTITION partition"
cwd=$(pwd)
# create a randomly named scratch directory and copy your files there
export SCRATCH=$(mktemp -d /scratch/${USER}.XXXXXX)
echo "Using SCRATCH: ${SCRATCH}"
export GAUSS_SCRDIR=${SCRATCH}
# copy job files to $SCRATCH
cp -a * ${SCRATCH}
cd ${SCRATCH}
  
module load gaussian/16.B01-sse4
  
# copy the results back to $HOME & cleanup
cp -a * ${cwd}
#rm -rf ${SCRATCH}
</code>
  
  
   sbatch gaussian.slurm

You can verify that the job is in the queue:
  
#SBATCH -e %j.err                  # File to which STDERR will be written, including job ID
set -xv

echo Running on host $(hostname)
echo "Job id: ${SLURM_JOB_ID}"
echo Time is $(date)
echo Directory is $(pwd)
echo "This job has allocated $SLURM_NPROCS processors in $SLURM_JOB_PARTITION partition"

# create a scratch directory on the SSD and copy all runtime data there
export scratch_dir=$(mktemp -d /scratch/${USER}.XXXXXX)
echo "Using SCRATCH directory: ${scratch_dir}"
current_dir=$(pwd)
cp -a * $scratch_dir
cd $scratch_dir
  
module load orca/5.0.3
  
$ORCA_PATH/orca orca_input.inp > orca_output.out
  
# copy all data back from the scratch directory
cp -a * $current_dir
rm -rf $scratch_dir
</code>
</code>
  
Please note that with older versions of Orca you have to load the appropriate MPI library to use it. This is a compatibility table between different Orca and MPI module versions:
  
|orca/4.0.0 | openmpi/2.0.1 |
|orca/4.0.1 | openmpi/2.0.2 |
|orca/4.2.0 | openmpi/3.1/3.1.4 |
|orca/4.2.1 | openmpi/3.1/3.1.4 |
|orca/5.0.3 | no MPI loading necessary, it is built in |
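
For example, to run jobs with one of the older Orca builds you would load the matching pair from the table above, e.g. for orca/4.2.1:

<code>
module load openmpi/3.1/3.1.4
module load orca/4.2.1
</code>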
  
==== OpenMolcas ====
  
# copy output magnetic properties files for further analysis (poly_aniso)
cp -a /scratch/$SLURM_JOB_ID/${PWD##*/}/ANISOINPUT /scratch/$SLURM_JOB_ID/${PWD##*/}/POLYFILE ${PWD}/output
</code>
  
echo SCRATCH: $SCRATCH
# copy job files to $SCRATCH
cp -a * $SCRATCH
  
# start your job in $SCRATCH
  
# copy your results back to $HOME & cleanup
cp -a * $cwd
#rm -rf $SCRATCH
</code>