Portable Batch System (PBS)


Sphinx Group Administrator's Guide

Introduction

This document is intended for anyone who installs components of PBS or performs administrative tasks, such as placing a machine online or offline. Other users should refer to the user's guide.

We use a queue manager named PBS on the "cartoon network" machines. A queue manager lets us control how jobs are executed. It has three components: the pbs server, the pbs client, and a scheduler. The pbs server and the scheduler both run on a single machine, the server machine.

The pbs server (pbs_server) tracks the status of hosts and the state of each job. It communicates with the client and the scheduler, and issues the commands that start jobs on the chosen host.

The scheduler decides the order in which jobs are executed (based on each user's usage, among other factors) and the host on which each job runs, and communicates these decisions to the pbs server. TORQUE has its own scheduler (pbs_sched), but it is very slow, taking minutes to start jobs as machines free up. We therefore use maui as the scheduler.

The pbs client (pbs_mom) runs on every machine that executes jobs, and controls job execution on that host.

The queue server is "redwood", and the queues we currently have are:

Software you need for the installation

Setting up the pbs server (pbs_server)

Setting up the scheduler (maui)

  • Configure the scheduler. At this stage, you need to define the PBS server host, the relative speed of the machines, the priority policy, and, if needed, the standing reservations. If the host name used in the pbs server configuration does not include .speech.cs.cmu.edu, then it should not be included here either.
    # maui.cfg 3.2.6p16
    
    SERVERHOST            redwood.speech.cs.cmu.edu
    # primary admin must be first in list
    ADMIN1                root
    ADMIN2                egouvea
    ADMIN3                dhuggins
    
    # Resource Manager Definition
    
    RMCFG[base] TYPE=PBS
    
    # Allocation Manager Definition
    
    AMCFG[bank]  TYPE=NONE
    
    # full parameter docs at http://supercluster.org/mauidocs/a.fparameters.html
    # use the 'schedctl -l' command to display current configuration
    
    RMPOLLINTERVAL        00:00:30
    
    SERVERPORT            42559
    SERVERMODE            NORMAL
    
    # Admin: http://supercluster.org/mauidocs/a.esecurity.html
    
    
    LOGFILE               maui.log
    LOGFILEMAXSIZE        10000000
    LOGLEVEL              3
    
    # Job Priority: http://supercluster.org/mauidocs/5.1jobprioritization.html
    
    QUEUETIMEWEIGHT       1
    FSWEIGHT              1
    FSUSERWEIGHT          20
    CREDWEIGHT            1
    CLASSWEIGHT           1
    
    # FairShare: http://supercluster.org/mauidocs/6.3fairshare.html
    
    FSPOLICY              DEDICATEDPES
    FSDEPTH               7
    FSINTERVAL            86400
    FSDECAY               0.80
    
    # Throttling Policies: http://supercluster.org/mauidocs/6.2throttlingpolicies.html
    
    # NONE SPECIFIED
    
    # Backfill: http://supercluster.org/mauidocs/8.2backfill.html
    
    BACKFILLPOLICY        FIRSTFIT
    RESERVATIONPOLICY     CURRENTHIGHEST
    RESERVATIONDEPTH      5
    
    # Node Allocation: http://supercluster.org/mauidocs/5.2nodeallocation.html
    
    # NODEALLOCATIONPOLICY  FASTEST
    NODEALLOCATIONPOLICY PRIORITY
    NODECFG[DEFAULT] PRIORITYF='SPEED + CPROCS - JOBCOUNT'
    
    JOBNODEMATCHPOLICY   EXACTNODE
    
    # QOS: http://supercluster.org/mauidocs/7.3qos.html
    
    # QOSCFG[hi]  PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
    # QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE
    
    # Standing Reservations: http://supercluster.org/mauidocs/7.1.3standingreservations.html
    
    # SRSTARTTIME[test] 8:00:00
    # SRENDTIME[test]   17:00:00
    # SRDAYS[test]      MON TUE WED THU FRI
    # SRTASKCOUNT[test] 20
    # SRMAXTIME[test]   0:30:00
    
    # Creds: http://supercluster.org/mauidocs/6.1fairnessoverview.html
    
    USERCFG[DEFAULT]      FSTARGET=100.0
    # USERCFG[DEFAULT]      FSTARGET=25.0
    # USERCFG[john]         PRIORITY=100  FSTARGET=10.0-
    # GROUPCFG[staff]       PRIORITY=1000 QLIST=hi:low QDEF=hi
    # CLASSCFG[batch]       FLAGS=PREEMPTEE
    # CLASSCFG[interactive] FLAGS=PREEMPTOR
    
    CLASSCFG[s4]            FSTARGET=100.0 PRIORITY=1.0
    CLASSCFG[workq]         FSTARGET=100.0 PRIORITY=1.0
    CLASSCFG[x86_64]        FSTARGET=100.0 PRIORITY=1.0
    CLASSCFG[slow]          FSTARGET=100.0 PRIORITY=1.0
    CLASSCFG[i686]          FSTARGET=100.0 PRIORITY=1.0
    
    
    # Nodes - valid only with maui3.0.7 or higher
    NODECFG[alder] SPEED=3.0
    NODECFG[astro] SPEED=2.2
    NODECFG[batman] SPEED=3.0
    NODECFG[beaker] SPEED=1.0
    NODECFG[bert] SPEED=2.4
    NODECFG[betty] SPEED=3.0
    NODECFG[bigbird] SPEED=3.2
    NODECFG[blossom] SPEED=0.75
    NODECFG[bubbler] SPEED=0.75
    NODECFG[buckeye] SPEED=3.0
    NODECFG[bunsen] SPEED=1.0
    NODECFG[buttercup] SPEED=0.75
    NODECFG[catalpa] SPEED=3.0
    NODECFG[daphne] SPEED=2.2
    NODECFG[dogwood] SPEED=3.0
    NODECFG[dumbo] SPEED=1.0
    NODECFG[elroy] SPEED=2.2
    NODECFG[ernie] SPEED=1.0
    NODECFG[eucalyptus] SPEED=3.0
    NODECFG[facloan-1850-1] SPEED=3.6
    NODECFG[facloan-1850-2] SPEED=3.6
    NODECFG[filbert] SPEED=4.0
    NODECFG[fozzie] SPEED=1.0
    NODECFG[fred] SPEED=1.0
    NODECFG[george] SPEED=2.2
    NODECFG[ginkgo] SPEED=4.0
    NODECFG[gonzo] SPEED=1.0
    NODECFG[goofy] SPEED=1.0
    NODECFG[jane] SPEED=2.2
    NODECFG[judy] SPEED=2.2
    NODECFG[karybdis] SPEED=2.8
    NODECFG[kermit] SPEED=2.3
    NODECFG[mafalda] SPEED=3.0
    NODECFG[mickey] SPEED=1.7
    NODECFG[muttley] SPEED=3.0
    NODECFG[piggy] SPEED=2.8
    NODECFG[redwood] SPEED=2.6
    NODECFG[scooby] SPEED=2.2
    NODECFG[scrappy] SPEED=2.2
    NODECFG[scylla] SPEED=2.8
    NODECFG[shaggy] SPEED=2.2
    NODECFG[spacely] SPEED=3.0
    NODECFG[utonium] SPEED=0.75
    NODECFG[velma] SPEED=2.2
    NODECFG[wilma] SPEED=1.0
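
The node allocation and fairshare settings in the configuration above can be checked by hand. The sketch below is illustrative only. It assumes that PRIORITYF is evaluated as plain arithmetic over a node's SPEED, its configured processor count (CPROCS), and the number of jobs currently on it (JOBCOUNT), and that each older fairshare window is weighted by one more factor of FSDECAY, which is how the Maui documentation describes these parameters. The function names are hypothetical, and so are the processor and job counts used in the usage note.

```shell
#!/bin/sh
# Illustrative helpers only; neither function is part of maui.

# Evaluate the PRIORITYF expression 'SPEED + CPROCS - JOBCOUNT' for one node.
# $1 = SPEED, $2 = CPROCS (configured processors), $3 = JOBCOUNT (jobs on the node)
node_priority() {
    awk -v speed="$1" -v cprocs="$2" -v jobs="$3" \
        'BEGIN { print speed + cprocs - jobs }'
}

# Print the weight given to each fairshare window, newest first,
# assuming window i is weighted by FSDECAY raised to the power i.
# $1 = FSDECAY, $2 = FSDEPTH
fairshare_weights() {
    awk -v decay="$1" -v depth="$2" 'BEGIN {
        w = 1
        for (i = 0; i < depth; i++) {
            printf "window %d: %.4f\n", i, w
            w *= decay
        }
    }'
}
```

For example, `node_priority 4.0 2 1` evaluates the formula for a node with SPEED=4.0 (the value configured for filbert), assuming two processors and one running job. `fairshare_weights 0.80 7` lists the weight applied to each of the seven windows implied by FSDEPTH=7 and FSDECAY=0.80.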
    
Setting up the pbs client (pbs_mom)

    How to

    Troubleshooting

    qstat or qsub does not work, and gives me a message like "server not responding".

    The server is possibly down and needs to be restarted. You can do this by rebooting the machine, or by restarting the server process, as below. Note that restarting the server this way will restart (i.e., stop and run again from scratch) all running jobs. For a solution that keeps the jobs running, check the How to section.

    /etc/init.d/pbs_server restart
    

    There are free machines and my jobs are in the Q state, but they do not get executed.

    The scheduler, maui, is possibly down. You can reboot the machine, or restart the scheduler, as below.

    /etc/init.d/maui restart
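
Before restarting either daemon, it can help to confirm that it is really missing from the process list on the server machine. The following is a minimal sketch, not an established procedure. It assumes that the daemons appear in `ps` under the command names pbs_server and maui (the same names as the init scripts), and the function names are hypothetical.

```shell
#!/bin/sh
# Succeed if a process named "$1" appears in a "PID COMMAND" listing on stdin.
daemon_in_list() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit found ? 0 : 1 }'
}

# Report whether each server-side daemon is present on this host.
# Assumption: the command names match the init script names.
check_pbs_daemons() {
    listing=$(ps -e -o pid= -o comm=)
    for daemon in pbs_server maui; do
        if printf '%s\n' "$listing" | daemon_in_list "$daemon"; then
            echo "$daemon: running"
        else
            echo "$daemon: not running"
        fi
    done
}
```

Running `check_pbs_daemons` on the server machine prints one line per daemon. A daemon reported as running may still be unresponsive, so this only rules out the simplest case.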
    

    Job seems to be running, but it is stuck.

    A process may get stuck. A process (what you get when you do a ps) is different from a job (what you get when you do a qstat). Most commonly, a stuck process is in the "D" state, the so-called uninterruptible I/O state.

    The uninterruptible I/O state occurs when the process is waiting for I/O, either reading from a file or writing to a file. It is normal for a process to enter this state for a couple of seconds. But because of network traffic, the process may end up waiting for I/O forever while the network simply does not respond.

    You can verify whether a process is in the "D" state by typing ps x and looking at the letter under the column named STAT. Normally, this column shows a letter like "S" or "R", or "D" for a few seconds. If the process stays in "D", you will need root access, or the help of someone with root access, to get it out of that state. You do not need to ask facilities to do this; just ask someone around you.
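
The check described above can be narrowed to just the processes of interest. This is a small sketch that assumes a Linux `ps` accepting the `stat` output column. The function names are hypothetical.

```shell
#!/bin/sh
# Keep only the lines whose second field (the process state) starts with "D".
# Expects "PID STAT COMMAND" lines on standard input.
filter_d_state() {
    awk '$2 ~ /^D/'
}

# List every process on this host that is currently in the "D" state.
list_d_state() {
    ps -e -o pid= -o stat= -o comm= | filter_d_state
}
```

Run `list_d_state` a few times, several seconds apart. A process that shows up every time is a candidate for being stuck, while one that appears only once was probably doing ordinary I/O.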

    A process in the "D" state cannot be killed, otherwise it would not be "uninterruptible". Instead, look for a process named rpciod (e.g. ps ax | grep rpciod) and send a "KILL" signal to it. This will not kill rpciod; it only delivers the signal. The processes in the "D" state will then stop waiting for I/O, and may finish, if they were waiting to finish.

    % ps ax | grep rpciod
      945 ?        SW     9:48 [rpciod]
    % kill -9 945
    

    Bad UID for job execution

    This happens when PBS launches a job on a machine where the user does not have an account. To fix this, find out where the attempted execution happened by looking for the exec_host field in the output of the tracejob command, and create an account for the user on that machine.
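
Picking the host names out of the tracejob output can be scripted. The sketch below assumes that the field appears as a single token of the form exec_host=host/cpu+host/cpu, which is an assumption about the output format and not something stated in this guide. The function name and the job number in the usage note are hypothetical.

```shell
#!/bin/sh
# Print the unique host names found in exec_host=... tokens on standard input.
# Assumed token format: exec_host=hostA/0+hostA/1+hostB/0
exec_hosts() {
    tr ' \t' '\n\n' |            # one token per line
        sed -n 's/^exec_host=//p' |  # keep only the exec_host value
        tr '+' '\n' |                # one host/cpu entry per line
        sed 's:/.*$::' |             # drop the /cpu suffix
        sort -u
}
```

A typical use would be `tracejob 1234 | exec_hosts`, where 1234 stands in for the failing job's number. Each host printed is a machine on which the user needs an account.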

    Additional tips

    Please also check the troubleshooting section in the User's manual.


    Page created by Evandro B. Gouvêa on 29 June 2004

    Page maintained by Evandro B. Gouvêa and David Huggins-Daines

    Last modified: Fri May 18 15:59:33 Eastern Daylight Time 2007