Chapter 7. Cluster Resource and Performance Management

7.1 Batch Queue Systems

This section will give an overview of the available batch queue
systems, with special emphasis on PBS. Configuration of queue
strategies will be presented, along with the different schedulers
that can be used with PBS and the reasons for choosing one over
another. The task manager primitives that PBS provides will be
discussed, and the challenges of running parallel jobs through a
batch queue system will be introduced. Methods for properly
starting MPI jobs through PBS for the ch_p4, ch_gm, and SCAMPI
devices will be given.

7.1.1 Batch Queue Systems Overview
7.1.2 PBS
7.1.3 PBS Queue Configuration Strategies
7.1.4 Parallel Jobs in a Batch Queue Environment
7.1.5 Interactive Parallel Jobs in a Batch Queue Environment

7.2 Accounting

Resource use accounting will be presented with two strategies
for implementation: extraction from batch queue system logs and
BSD style accounting. The back-end for this data will not be
covered; where the data is sent is left to the user.

7.2.1 Batch Queue Logs
7.2.2 BSD Style Accounting

7.3 Performance Management

Different methods for monitoring and tuning system-wide
performance will be presented. These include bproc, with its
kernel patches and user-space daemons, and PCP (Performance
Co-Pilot).

7.3.1 bproc
7.3.2 PCP