====== Running jobs on GPUs ======

The Keck II workstations are equipped with NVIDIA GPUs that can be used for computation. All GPU jobs are managed by the Sun Grid Engine (SGE) queue manager, as described below.

===== How to submit a job =====

  * Create all necessary input files for your job.
  * Create an SGE script which is used to submit the job to the job queue manager.
  * Submit your job using this command: ''qsub <your_script>''
  * Check on the progress of your job in the queue: ''qstat''

More information about SGE can be found in the Grid Engine documentation.
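
For example, a minimal submit-and-check session looks like this (the script name is a placeholder):

<code>
qsub my_job.sge     # submit the SGE script to the queue manager
qstat -u $USER      # list your queued and running jobs and their states
</code>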

===== KeckII scheduling policies =====

All jobs must be submitted to the SGE queue. It is strictly prohibited to run any non-interactive, CPU-consuming jobs outside of the queue.

==== Queue limits ====

The following limits are imposed on all jobs:

  * Max wall-clock time is 48 hrs.
  * Max number of processors per user is 16, although this is dynamically adjusted based on the load. To see the currently active limits, use SGE's resource quota report: ''qquota''. A sketch of requesting resources within these limits follows this list.
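
As a sketch, these SGE directives request resources consistent with the limits above (''h_rt'' is SGE's hard wall-clock limit; the queue name and slot count mirror the example scripts below):

<code>
#$ -q gpu.q             # GPU queue used in the Amber example below
#$ -l h_rt=48:00:00     # hard wall-clock limit: 48 hours (the queue maximum)
#$ -pe orte-host 2      # request 2 slots in the orte-host parallel environment
</code>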

If you have any special requirements, please email the system administrators.

===== Best practices =====

All workstations have a very fast solid state drive (SSD) mounted locally (shown as ''/scratch'' in the examples below; the exact mount point is site-specific). I/O-intensive jobs run much faster from the SSD, so copy your input files to a scratch directory there, run the job from it, and copy the results back when it finishes, as the example scripts below demonstrate.

===== Example SGE scripts =====

These are example SGE scripts for running the most common applications on the GPUs.

==== Amber ====

The optimal AMBER job configuration for KeckII is to use 1 CPU and 1 GPU per run.

<code>
#!/bin/bash
#$ -cwd
#$ -q gpu.q
#$ -V
#$ -N AMBER_job
#$ -S /bin/bash
#$ -e sge.err
#$ -o sge.out

myrun=my_simulation_name

module load nvidia
module load amber
export CUDA_VISIBLE_DEVICES=0

# create a scratch directory on the SSD and copy all runtime data there
# (/scratch is an assumed mount point; adjust to your site's SSD path)
export scratch_dir=`mktemp -d /scratch/$USER.XXXXXX`
current_dir=`pwd`
cp * $scratch_dir
cd $scratch_dir

# run the GPU engine; file names follow the $myrun prefix
# (the -i/-o/-r arguments are reconstructed from that pattern)
$AMBERHOME/bin/pmemd.cuda -O -i $myrun.in -o $myrun.out -r $myrun.restrt \
    -x $myrun.nc -p $myrun.prmtop -c $myrun.rst

# copy all data back from the scratch directory
cp * $current_dir
rm -rf $scratch_dir
</code>
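
Save the script under any name (e.g. ''amber_job.sge'', a placeholder) and submit it with ''qsub amber_job.sge''. Because ''CUDA_VISIBLE_DEVICES=0'' is exported, ''pmemd.cuda'' only sees the first GPU in the workstation.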

==== NAMD ====

Two CPUs and one GPU is the optimal CPU/GPU configuration for a typical NAMD job on KeckII workstations.

=== Running namd on 2 CPUs/1 GPU ===

<code>
#!/bin/bash
#$ -cwd
#$ -q all.q
#$ -V
#$ -N NAMD_job
#$ -pe orte-host 2
#$ -S /bin/bash
#$ -e sge.err
#$ -o sge.out

module load nvidia
module load namd-cuda

# create a scratch directory on the SSD and copy all runtime data there
# (/scratch is an assumed mount point; adjust to your site's SSD path)
export scratch_dir=`mktemp -d /scratch/$USER.XXXXXX`
current_dir=`pwd`
cp * $scratch_dir
cd $scratch_dir

# 2 CPUs/1 GPU
namd2 +idlepoll +p2 +devices 1 apoa1.namd >& apoa1-2.1.out

# copy all data back from the scratch directory
cp * $current_dir
rm -rf $scratch_dir
</code>
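
Here ''+p2'' starts two worker threads to match the two slots requested with ''#$ -pe orte-host 2'', ''+devices'' takes a comma-separated list of CUDA device IDs for NAMD to use, and ''+idlepoll'' makes the threads poll the GPU instead of sleeping between work units, which the NAMD documentation recommends for CUDA builds.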