Difference between revisions of "HPRC:CommonProblems"

From TAMU HPRC

Revision as of 11:55, 20 July 2016

Common Problems & Quick Solutions

Accounts

Q: When do accounts expire?

A: Accounts expire at the start of the new fiscal year (September 1st). To see when your account expires, go to our Account Management System (AMS) at https://hprc.tamu.edu/accounts/ams/ and check under the Accounts tab.

Q: How do I get more SUs?

A: Students will need to have their PI transfer SUs to them. PIs can apply for up to two Small accounts totaling no more than 200,000 SUs. Once this Small allocation has run out, PIs will need to apply for Large accounts. See our Account Allocations page for more information on the allocation policies.

Q: How do I transfer SUs?

A: To transfer SUs, PIs will need a Small or Large account (see our Account Allocations page for more information). Once an account has been granted to the PI, they can transfer SUs to any of their researchers through our Account Management System (AMS) at https://hprc.tamu.edu/accounts/ams/. If a PI needs to add a new researcher, the PI must contact the Help Desk.

Batch Processing

Q: Why is my job pending?

A: There can be many reasons why a job would be pending:

  • Your job cannot fit on any of our nodes
    • If your job requests more than 245GB of memory without requesting the xlarge queue, it will be stuck pending. To fix this, kill your job and resubmit it with less memory or in the xlarge queue. IMPORTANT NOTE: Your program MUST use Westmere-compatible software to run in the xlarge queue.
    • If your job asks for more cores per node than the maximum (Ada: 20, or 40 with the xlarge queue; Curie: 16) with #BSUB -R "span[ptile=XX]", it will be stuck pending. To fix this, kill your job and resubmit it with a ptile value less than or equal to the maximum for the cluster.
    • If your job requests more than 2TB of memory, it will be stuck pending. To fix this, kill your job and resubmit it with less memory.
  • There are no job slots available
    • If your job requires the 256GB, 1TB, or 2TB nodes, it might be pending for longer than usual.
    • If cluster usage is particularly high, your job might be pending for longer than usual. You can see the System Load Levels on our Home Page (http://hprc.tamu.edu/).
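
The per-node core limits above can be checked before a job is submitted. The following is a minimal sketch, not an official HPRC tool: the function names max_ptile and check_ptile are hypothetical, the queue name "default" is a placeholder for any queue other than xlarge, and the limits are copied from the answer above, so they may not match the current cluster configuration.

```shell
#!/bin/bash
# Hypothetical pre-submission check (illustrative helper names).
# Limits come from the FAQ answer above: Ada 20 cores per node,
# 40 in the xlarge queue; Curie 16.

# Print the maximum cores per node for a cluster/queue pair.
max_ptile() {
    local cluster="$1" queue="${2:-default}"
    case "$cluster" in
        ada)
            if [ "$queue" = "xlarge" ]; then echo 40; else echo 20; fi ;;
        curie)
            echo 16 ;;
        *)
            echo "unknown cluster: $cluster" >&2
            return 1 ;;
    esac
}

# Return 0 if the requested ptile fits on one node, 1 if it is too
# large, and 2 if the cluster name is not recognized.
check_ptile() {
    local cluster="$1" ptile="$2" queue="${3:-default}" max
    max="$(max_ptile "$cluster" "$queue")" || return 2
    if [ "$ptile" -gt "$max" ]; then
        echo "ptile=$ptile exceeds the $max cores per node on $cluster; the job would stay pending" >&2
        return 1
    fi
    echo "ptile=$ptile fits (max $max)"
}
```

For example, check_ptile ada 24 reports a problem, while check_ptile ada 24 xlarge passes. For a job that is already stuck, the standard LSF commands bjobs -p (list pending jobs with their pending reasons) and bkill followed by the job ID (kill the job so it can be resubmitted) are the usual starting points.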

Q: Why does my job fail?

Q: How much memory do I need?

Q: How many cores should I use?

Q: How long is my job going to take?

Q: Why is my program slow?

Q: What is "Disk Quota Exceeded"?