KeckII


Cluster How-To


Cluster information
Compiling and running jobs
Tips for Optimizing Application Performance on the Cluster

Cluster information

The Keck II cluster consists of 25 Compaq DL360 nodes and 14 IBM Netfinity nodes, each with dual 800 MHz Pentium III processors and 512 MB of RAM. The nodes are connected by a specialized low-latency, high-throughput Myrinet interconnect. The cluster runs the NPACI Rocks clustering software.

The SGE queuing system is installed and configured on the cluster. All non-interactive jobs must be submitted through SGE. See the SGE How-To for more information.
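As a rough illustration of what a non-interactive job might look like, here is a minimal sketch of an SGE job script. The job name and the application path are hypothetical, and the queue or resource options this cluster requires are not shown; the SGE How-To remains the reference. Lines beginning with `#$` are read by `qsub` as options and are ordinary comments to the shell.

```shell
#!/bin/bash
# Job name (hypothetical).
#$ -N example_job
# Run the script with bash.
#$ -S /bin/bash
# Start the job in the directory it was submitted from.
#$ -cwd
# Merge stderr into stdout.
#$ -j y

# SGE sets JOB_ID for batch jobs; fall back to a label when run by hand.
job_banner() {
    echo "job ${JOB_ID:-interactive} on $(hostname)"
}

job_banner

# Replace the next line with the real command (hypothetical path):
# /some/path/application
```

Saved as `example_job.sh` (a hypothetical file name), the script would be submitted with `qsub example_job.sh` and monitored with `qstat`.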


Compiling and running jobs

There are currently two versions of MPI on the cluster. Please use only the tools and libraries in /usr/share/mpi to compile and run jobs. These executables should be in your PATH by default.
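Because two MPI versions are installed, it can be worth confirming which one your shell actually picks up. The sketch below assumes the compiler wrapper and launcher are named `mpicc` and `mpirun`, which is common but not stated above; the helper `check_mpi_tool` is a hypothetical name introduced here.

```shell
#!/bin/bash
# Report where an MPI tool resolves on PATH and whether that location is
# under the expected prefix (default taken from the text: /usr/share/mpi).
check_mpi_tool() {
    local tool="$1" prefix="${2:-/usr/share/mpi}" path
    path=$(command -v "$tool") || { echo "$tool: not found on PATH"; return 1; }
    case "$path" in
        "$prefix"/*) echo "$tool: $path (ok)" ;;
        *)           echo "$tool: $path (not under $prefix)"; return 1 ;;
    esac
}

# Tool names are assumptions; "|| true" keeps going so both are reported.
for tool in mpicc mpirun; do
    check_mpi_tool "$tool" || true
done
```

Once both tools resolve under /usr/share/mpi, a program would typically be built with something like `mpicc -o hello hello.c` and launched with `mpirun -np 4 ./hello` from within an SGE job; the file names here are hypothetical.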


Tips for Optimizing Application Performance on the Cluster

General tips

  • Try to use the local disk on the nodes for scratch files. Local disks are much faster (by at least an order of magnitude) than data transfers over NFS to your home directory. Each node has a /scratch filesystem that can be used for this purpose. The /scratch filesystem is persistent storage: files saved there are preserved after a node reboot or crash. Using /scratch will most likely speed up your job significantly, especially if it is I/O bound.

    General strategy for using /scratch for your job:

    • Create a temporary directory in /scratch on the node.
    • Copy all input files from your home directory to the temporary directory on the node.
    • Start your application from the temporary directory.
    • After your application is done, copy all files back to your home directory.
    • Delete the temporary directory.

      All of this can be accomplished by copying the following lines into your SGE script and modifying them.

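      # create a working directory on the node's local disk
      mkdir -p /scratch/username
      # copy the input files from your home directory
      cp /home/username/path/to/your/files/* /scratch/username
      cd /scratch/username

      # now execute your application
      /some/path/application

      # copy the results back, then clean up
      cp /scratch/username/* /home/username/path/to/your/files
      rm -rf /scratch/username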
            
  • Use the SGE walltime resource carefully. Estimate the upper limit of your job's execution time and set walltime to that value plus a small margin. Grossly overestimating it can keep your job waiting in the queue while jobs with shorter walltimes run ahead of it.
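The /scratch staging steps described above can also be wrapped in a small shell function. This is only a sketch under a few assumptions: `run_in_scratch` is a hypothetical helper name, `mktemp` is assumed to be available on the nodes, and a per-job directory is used instead of a fixed /scratch/username so that two jobs on the same node do not overwrite each other's files.

```shell
#!/bin/bash
# Stage files into a private scratch directory, run a command there,
# copy everything back, and remove the scratch directory.
#   $1 = scratch root (e.g. /scratch)
#   $2 = directory holding the job's files
#   remaining arguments = command to run
run_in_scratch() {
    local root="$1" src="$2"
    shift 2
    local work status=0

    # A unique directory per job avoids collisions between jobs.
    work=$(mktemp -d "$root/${USER:-job}.XXXXXX") || return 1

    # Copy the inputs; give up (and clean up) if staging fails.
    cp -r "$src"/. "$work"/ || { rm -rf "$work"; return 1; }

    # Run the command from inside the scratch directory.
    ( cd "$work" && "$@" ) || status=$?

    # Copy results back even if the command failed. The scratch copy is
    # removed only when the copy back succeeded.
    cp -r "$work"/. "$src"/ && rm -rf "$work"

    return "$status"
}

# Example call, using the placeholder paths from the text:
# run_in_scratch /scratch /home/username/path/to/your/files /some/path/application
```

Leaving the scratch directory in place when the copy back fails is a deliberate choice in this sketch: since /scratch is persistent, the results can still be recovered by hand.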

Running parallel applications

  • Don't run your parallel jobs on more than 8 CPUs. Most applications do not scale well on Ethernet-interconnected Beowulf clusters, and there is a big penalty for inter-process communication. The sweet spot for most applications appears to be around 4-6 CPUs. Increasing the number of CPUs beyond that does not decrease the wall-clock time and can actually increase it, because the child processes spend too much time communicating with each other and with the master process.
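One way to find the sweet spot for a particular application is to time the same job at several CPU counts and compare the results. The sketch below assumes you have recorded the wall-clock times yourself (for example with `time`) in a file with one "cpus seconds" pair per line, baseline run first; `scaling_table` and `timings.txt` are hypothetical names.

```shell
#!/bin/bash
# Read "cpus seconds" pairs on stdin (first line = baseline run) and print
# the speedup and parallel efficiency of each run relative to the baseline.
scaling_table() {
    awk 'NR == 1 { base_n = $1; base_t = $2 }
         {
             speedup = base_t / $2
             # Efficiency: achieved speedup divided by the ideal speedup.
             printf "%d CPUs: speedup %.2f, efficiency %.0f%%\n",
                    $1, speedup, 100 * speedup * base_n / $1
         }'
}

# Usage (timings.txt is a hypothetical file of your own measurements):
# scaling_table < timings.txt
```

When adding CPUs no longer raises the speedup, or the efficiency drops sharply, the job has passed the point where extra CPUs help.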


Please direct any questions or comments to keck-help @ keck2.ucsd.edu
Last modified: December 06 2005 10:25:13 am.