From TAMU High Performance Research Computing

Dedicated Use & Batch Policies

Batch system policies are approved by the Allocation Committee (review@hprc.tamu.edu) and may occasionally change to reflect changing needs and load conditions. We appreciate your adherence to the policies below. A little care on your part in doing certain things right goes a long way toward keeping our compute servers running efficiently and fairly. To maintain fairness and efficiency, we will on occasion, and very reluctantly, terminate jobs prematurely. The subsection Job Termination By Staff lists common reasons for which the staff may terminate a job.

Dedicated Use

All requests for dedicated machine use require the approval of the Director. To initiate the process, please send e-mail to the HPRC help desk at help@hprc.tamu.edu. Assuming approval, arrangements must also be made in consultation with the staff. Every other Tuesday, when machine maintenance is also scheduled, is the strongly preferred day. Otherwise, machine load conditions will be a significant factor in selecting the day for such an event. Please always give at least two weeks' notice. The maximum processing time per request is also the Director's decision.

Job Termination By Staff

The SC staff reserves the right to terminate batch jobs when one or a combination of the following occurs:

  1. Use by your program of a larger number of CPUs than its parallel efficiency warrants.
  2. Use by your program of a smaller number of CPUs than that specified through the batch system. This is a particularly unacceptable practice, since it wastes resources that might otherwise be used by others. The batch system sets aside resources, but it knows nothing about the actual number of CPUs that your program will use.
  3. Submitting jobs with an artificially large wall-clock or CPU-time request.
  4. Submitting jobs with an artificially large memory request, especially when utilizing only a small percentage of that memory.
  5. Use/abuse of a special access queue to run a job that could very well run in one of the common queues.
  6. Use/abuse of a special access queue to circumvent wait time or job limits within the common queues.
  7. Excessive I/O with large files, which in turn overwhelms memory due to excessive file caching.
  8. Any use of large amounts of disk and/or memory that causes a significant disruption to the smooth operation of the system.
  9. Delayed file transfers with source or destination hosts that are remote.

Batch jobs are subject to periodic monitoring. Jobs inappropriately using extreme resources are subject to termination without prior notice.
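
Items 1, 2, and 4 above all compare what a job reserved with what it actually used. The following is a minimal sketch of that arithmetic, not the staff's actual monitoring method. The function names and the sample figures are hypothetical, and this page does not state any cutoff values, so none are assumed here. The input numbers would come from whatever accounting report your batch system provides.

```python
# Illustrative only: these helpers are hypothetical and are not part of any
# HPRC tool. They express the ratios implied by items 1, 2, and 4.


def cpu_utilization(cpu_seconds, wall_seconds, requested_cpus):
    """Fraction of the reserved CPU time that the job actually consumed.

    A value well below 1.0 suggests the job used fewer CPUs than it
    requested (item 2).
    """
    if wall_seconds <= 0 or requested_cpus <= 0:
        raise ValueError("wall_seconds and requested_cpus must be positive")
    return cpu_seconds / (wall_seconds * requested_cpus)


def parallel_efficiency(serial_seconds, parallel_seconds, cpus):
    """Speedup divided by CPU count: T(1) / (N * T(N)).

    A low value suggests the job uses more CPUs than its scaling
    warrants (item 1).
    """
    if parallel_seconds <= 0 or cpus <= 0:
        raise ValueError("parallel_seconds and cpus must be positive")
    return serial_seconds / (cpus * parallel_seconds)


def memory_utilization(peak_mb, requested_mb):
    """Fraction of the requested memory that the job actually touched (item 4)."""
    if requested_mb <= 0:
        raise ValueError("requested_mb must be positive")
    return peak_mb / requested_mb


if __name__ == "__main__":
    # Made-up example: 4 CPUs reserved for one hour, one CPU-hour consumed.
    print("CPU utilization:    ", cpu_utilization(3600, 3600, 4))
    # Made-up example: 1000 s serial run, 200 s on 8 CPUs.
    print("Parallel efficiency:", parallel_efficiency(1000, 200, 8))
    # Made-up example: 2 GB peak use against an 8 GB request.
    print("Memory utilization: ", memory_utilization(2048, 8192))
```

Checking these ratios on a short test run before submitting a large job is one way to size CPU, time, and memory requests closer to actual need.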

Please see the Fair Resource Usage page for details on a cluster-by-cluster basis.
