Difference between revisions of "SW:R"

From TAMU HPRC

Latest revision as of 21:01, 6 October 2021

R

Description

R is a free software environment for statistical computing and graphics.
Homepage: http://www.r-project.org/

Access

R is open to all HPRC users.

Loading the Module

To see all versions of R available:

[NetID@cluster ~]$ module spider R

To load a particular version of R (Example: 3.4.2 with the iomkl toolchain):

[NetID@cluster ~]$ module load R/3.4.2-iomkl-2017A-Python-2.7.12-default-mt

R_tamu

Loading the R module sets up the environment for the base R installation without any additional packages. For convenience, HPRC developed an extension to R called R_tamu, which is built on top of R and provides a large number of pre-installed packages not found in the base R version. R_tamu also makes it easy to install personal packages and can act as an R-project environment manager.

To see all available versions of R_tamu:

[NetID@cluster ~]$ module spider R_tamu

For more information about R_tamu, please visit the R_tamu Wiki page.

Installing Packages

While many packages are available with the R_tamu module, you may find that a package you need is not installed. If you think a particular package would be useful to other R users, you can contact us with a request to install it system-wide. Alternatively, you can install any package yourself in your own directory.

The most common way to install a package is to start an interactive R session and use the R function install.packages:

> install.packages("package_name")


Alternatively, you can use the R CMD INSTALL command from the shell. This is useful when you already have a local copy of the R package (in *.tar.gz format).
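As a rough illustration of the shell route, the sketch below wraps R CMD INSTALL in a small helper. The function name install_local_pkg, the tarball name, and the library directory are hypothetical and not part of the HPRC setup; the only R feature used is the standard --library option of R CMD INSTALL. The helper refuses to run when the tarball is missing or when R is not on the PATH (for example, because no R module has been loaded).

```shell
# Hypothetical helper: install a local package tarball into a chosen library directory.
# Usage: install_local_pkg TARBALL LIBDIR
install_local_pkg() {
    tarball="$1"
    libdir="$2"

    # The tarball must already exist locally (*.tar.gz format).
    if [ ! -f "${tarball}" ]; then
        echo "tarball not found: ${tarball}" >&2
        return 1
    fi

    # R must be available, e.g. after loading an R or R_tamu module.
    if ! command -v R >/dev/null 2>&1; then
        echo "R is not on PATH; load an R module first" >&2
        return 2
    fi

    # Create the target library directory if needed, then install into it.
    mkdir -p "${libdir}" || return 3
    R CMD INSTALL --library="${libdir}" "${tarball}"
}

# Example call (hypothetical file and directory names):
#   install_local_pkg mypackage_1.0.tar.gz "${SCRATCH}/myRlibs"
```

Passing the library directory explicitly keeps the sketch independent of whether R_LIBS_USER happens to be set in the current shell.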

R_tamu sets the R environment variable R_LIBS_USER to ${SCRATCH}/R_LIBS/<VERSION> (where <VERSION> is the currently loaded R version), so all packages are automatically installed in that directory.

If you are using the plain R module instead, you will be asked to provide a personal directory in which to install the package. The easiest way is to set R_LIBS_USER before you start your R session. For example:

  [NetID@cluster ~]$ export R_LIBS_USER=${SCRATCH}/myRlibs


NOTE: If you are using the R module, R_LIBS_USER must be set every time before you start an R session in order to use the installed packages.
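One way to make this less error-prone is to set the variable and create the directory in the same step, since R only adds R_LIBS_USER to its library search path when the directory already exists. The sketch below is a minimal example under two assumptions: the directory name myRlibs is simply the one used in the example above, and the fallback to a temporary directory exists only so the snippet also runs on a machine where SCRATCH is not defined.

```shell
# On the cluster SCRATCH is expected to be set already; the fallback below is
# an assumption that lets this sketch run elsewhere.
SCRATCH="${SCRATCH:-$(mktemp -d)}"

# Point R at a personal library and make sure the directory exists.
export R_LIBS_USER="${SCRATCH}/myRlibs"
mkdir -p "${R_LIBS_USER}"

echo "R_LIBS_USER=${R_LIBS_USER}"
```

The same lines can be placed in a job script before the Rscript call, so that batch jobs see the same personal library as interactive sessions.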


If you have trouble installing packages yourself, you can contact us with any concerns.


Usage on the Login Nodes

Please limit interactive processing to short, non-intensive usage. Use non-interactive batch jobs for resource-intensive and/or multiple-core processing. Users are requested to be responsible and courteous to other users when using software on the login nodes.

The most important processing limits here are:

  • ONE HOUR of PROCESSING TIME per login session.
  • EIGHT CORES per login session on the same node or (cumulatively) across all login nodes.

Anyone found violating the processing limits will have their processes killed without warning. Repeated violation of these limits will result in account suspension.
Note: Your login session will disconnect after one hour of inactivity.

Usage on the Compute Nodes

Non-interactive batch jobs on the compute nodes allow for resource-demanding processing. They have higher limits on the number of cores, amount of memory, and runtime length.

For instructions on how to create and submit a batch job, please see the appropriate wiki page for each cluster.

Slurm Example (Terra)

Example 1: A serial (single-core) R job (last updated March 24, 2017):

#!/bin/bash
##NECESSARY JOB SPECIFICATIONS
#SBATCH --job-name=R_Job        # Sets the job name to R_Job
#SBATCH --time=5:00:00          # Sets the runtime limit to 5 hr
#SBATCH --ntasks=1              # Requests 1 core
#SBATCH --ntasks-per-node=1     # Requests 1 core per node (1 node)
#SBATCH --mem=5G                # Requests 5GB of memory per node
#SBATCH --output=stdout1.o%j    # Sends stdout and stderr to stdout1.o[jobID]

## Load the necessary modules
module purge
module load R_tamu/3.3.2-iomkl-2017A-Python-2.7.12-default-mt

## Launch R with proper parameters 
Rscript myScript.R

Example 2: A parallel (multi-core) R job, where myScript.R is a script that requests 10 slaves (last updated March 24, 2017).
Note: The number of cores requested should match the number of slaves requested.

#!/bin/bash

##NECESSARY JOB SPECIFICATIONS
#SBATCH --job-name=R_Job        # Sets the job name to R_Job
#SBATCH --time=5:00:00          # Sets the runtime limit to 5 hr
#SBATCH --ntasks=10             # Requests 10 cores
#SBATCH --ntasks-per-node=10    # Requests 10 cores per node (1 node)
#SBATCH --mem=50G               # Requests 50GB of memory per node
#SBATCH --output=stdout1.o%j    # Sends stdout and stderr to stdout1.o[jobID]

## Load the necessary modules
module purge
module load R_tamu/3.3.2-iomkl-2017A-Python-2.7.12-default-mt

## Launch R with proper parameters 
mpirun -np 1 Rscript myScript.R
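
To keep the number of cores and the number of slaves from drifting apart, one option is to derive the count from the job itself instead of hard-coding it in two places. The sketch below is only an illustration: it reads SLURM_NTASKS, which Slurm sets inside a job that specifies --ntasks, and passes the value as a command-line argument. That myScript.R reads its first argument as the number of slaves is an assumption; the script in the example above may hard-code the value instead.

```shell
# Inside a Slurm job SLURM_NTASKS holds the value given to --ntasks;
# default to 1 so the sketch also runs outside a job.
NUM_WORKERS="${SLURM_NTASKS:-1}"

# Assumption: myScript.R takes the slave count as its first argument.
LAUNCH_CMD="mpirun -np 1 Rscript myScript.R ${NUM_WORKERS}"

# Print the command instead of running it, so the sketch has no side effects.
echo "${LAUNCH_CMD}"
```

In a real job script the echo line would be replaced by the command itself, and changing --ntasks would then be the only edit needed to scale the job.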


To submit the batch job, run the following command, where jobscript is a file that looks like one of the examples above:

[NetID@terra ~]$ sbatch jobscript

Usage on the VNC Nodes

The VNC nodes allow the use of a graphical user interface (GUI) without disrupting other users.

VNC jobs and GUI usage do come with restrictions. All VNC jobs are limited to a single node (Terra: 28 cores/64GB). There are fewer VNC nodes than comparable compute nodes.

For more information, including instructions on using software on the VNC nodes, please visit our Terra Remote Visualization page.