R (statistical computing)
Introduction
The "R" Tool is an open-source, popular, and fully-featured statistical application and programming platform. We have multiple R versions installed on HPC. These versions include:
- R 4.2.0
- R 4.1.0 (the default; loaded by the plain R module)
- R 4.0.0
- R 3.5.2
Running R on RCC Systems
To start an interactive R session on HPC or Spear, simply type the R command. The default version that loads is 4.1.0. If you wish to run a newer or older version, load the appropriate module before running R; e.g.:
# version 4.2.0
module load R/4.2.0
# version 4.1.0
module load R
# version 4.0.0
module load R/4.0.0
Type 'q()' to quit R.
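For example, once your module is loaded and R has started, you can confirm which version is running (a minimal sketch; the exact version string depends on the module you loaded):
# Check the running interpreter from inside an interactive R session.
R.version.string          # e.g. "R version 4.1.0 (2021-05-18)"
getRversion() >= "4.1"    # TRUE if the loaded interpreter is at least 4.1
q(save = "no")            # quit without saving the workspace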
To submit R jobs to SLURM, refer to the following example submission script:
#!/bin/bash
#SBATCH -n 1
#SBATCH -J "MyRJob"
#SBATCH -p backfill
#SBATCH -t 1:00:00
#SBATCH --mail-type=ALL
module load R/4.1.0
R CMD BATCH yourRscript
The file yourRscript is a plain-text file containing the R commands you want to run.
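As a minimal illustration (the file name yourRscript and its contents are hypothetical), such a script might look like:
# yourRscript -- example contents
# Fit a simple linear model on a built-in dataset and save the result.
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)
saveRDS(fit, file = "fit.rds")
By default, R CMD BATCH writes the printed output to a corresponding .Rout file in the same directory.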
For more information about R, refer to the online R documentation.
Installing R Packages in Your Home Directory
Though RCC hosts many packages on our systems (see list below), there may be times when you need a specific package we do not currently have. In this case, you can install this package locally in your home directory using the following instructions (no need to open a ticket!):
- Type:
module load R
- Type:
R
- Type:
install.packages("PACKAGE_NAME_HERE")
- This will present you with the following information:
Installing package into '/opt/hpc/R/R-4.1.0/share/R/library' (as 'lib' is unspecified)
- You will then see a warning:
Warning in install.packages("abc") : 'lib = "/opt/hpc/R/R-4.1.0/share/R/library"' is not writable
This is normal.
- You will then be asked:
Would you like to use a personal library instead? (yes/No/cancel)
Type yes.
- You will then be asked:
Would you like to create a personal library '~/R/x86_64-redhat-linux-gnu-library/4.1' to install packages into? (yes/No/cancel)
Again, type yes.
- You will then be shown:
--- Please select a CRAN mirror for use in this session ---
- This will bring up a list of CRAN mirrors you can use to download and install the package.
If you are still having errors downloading, try:
> install.packages("PACKAGE_NAME_HERE", lib="~/R/x86_64-redhat-linux-gnu-library/4.1")
Available Packages
RCC has an extensive list of packages for R available.
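To check from within R whether a particular package is already installed before installing your own copy, you can query the library paths (a small sketch; ggplot2 is only an example):
# Show the library locations R searches, then test for a specific package.
.libPaths()
"ggplot2" %in% rownames(installed.packages())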
List of Available R Packages
- akima
- acepack
- ade4
- ald
- assertthat
- backports
- base
- base64enc
- BH
- bitops
- boot
- brew
- caTools
- checkmate
- chron
- class
- cluster
- coda
- codetools
- colorspace
- compiler
- crayon
- curl
- datasets
- data.table
- DBI
- dichromat
- digest
- doParallel
- doSNOW
- evaluate
- fdasrvf
- fields
- foreach
- foreign
- Formula
- futile.logger
- futile.options
- gdata
- grDevices
- graphics
- grid
- ggplot2
- ghyp
- gplots
- graph
- gridExtra
- gtable
- gtools
- hexbin
- highr
- Hmisc
- htmlTable
- htmltools
- htmlwidgets
- httr
- hwriter
- iterators
- jsonlite
- KernSmooth
- knitr
- labeling
- lambda.r
- lattice
- latticeExtra
- lava
- lazyeval
- locfit
- MADAM
- magrittr
- manipulate
- maps
- markdown
- MASS
- Matrix
- matrixcalc
- MatrixModels
- matrixStats
- memoise
- methods
- mgcv
- mime
- mnormt
- munsell
- mvtnorm
- numDeriv
- nnet
- openintro
- openssl
- parallel
- plogr
- plyr
- praise
- psych
- qrLMM
- quantreg
- R6
- RColorBrewer
- RcppArmadillo
- Rcpp
- RCurl
- reshape2
- rjags
- rlang
- rpart
- RSQLite
- scales
- scatterplot3d
- sendmailR
- snow
- spam
- SparseM
- spatial
- splines
- statmod
- stats
- stats4
- stringi
- stringr
- survival
- swirl
- testthat
- tibble
- timereg
- tools
- utils
- viridisLite
- viridis
- XML
- xtable
- yaml
Bioconductor on RCC Systems
Bioconductor is an extensive collection of libraries and tools, written in R, designed to perform a wide range of tasks common to bioinformatics data analysis. RCC has a large set of Bioconductor packages installed for use on RCC systems; a complete list follows in the section below.
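Once R is started on an RCC system, an installed Bioconductor package loads like any other library (limma is used here purely as an illustration):
# Load an installed Bioconductor package and confirm its version.
library(limma)
packageVersion("limma")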
List of Available Bioconductor Packages
- affyio
- affy
- annaffy
- annotate
- AnnotationDbi
- Biobase
- BiocInstaller
- BiocGenerics
- BiocParallel
- biomaRt
- Biostrings
- DelayedArray
- DESeq
- DESeq2
- DEXSeq
- gcrma
- genefilter
- geneplotter
- GenomeInfoDbData
- GenomeInfoDb
- GenomicAlignments
- GenomicFeatures
- GenomicRanges
- GO.db
- IRanges
- KEGG.db
- KEGGgraph
- limma
- made4
- multtest
- preprocessCore
- qvalue
- Rgraphviz
- Rsamtools
- rtracklayer
- S4Vectors
- SPIA
- SummarizedExperiment
- vsn
- webbioc
- XVector
- zlibbioc
Parallel Computing with R
R has a number of powerful tools available to perform computations in parallel. This capability is vital for leveraging the full power of RCC's systems for your research. The R parallel computing packages currently supported by RCC are listed below.
List of Available R Parallel Computing Packages
- parallel
- doParallel
Using R Parallel Computing Packages on HPC
To start a parallel job in R on the HPC system, first select an available parallel computing package. The parallel and doParallel packages are intended for single-node, multicore computations (i.e., run on one machine with multiple cores); other packages that support multi-node computations may become available in the future. Once you have selected a package, write or convert your code to use it (see that package's documentation for details). When you are ready to submit your job, create a submit script following the example below and submit it with the sbatch command.
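As a minimal single-node sketch using the parallel, doParallel, and foreach packages (four workers are assumed here, matching the example submit script below), your R code might look like:
library(parallel)
library(doParallel)
library(foreach)
# Start a cluster of 4 workers on the local node and register it so that
# %dopar% loops run in parallel.
cl <- makeCluster(4)
registerDoParallel(cl)
# Each iteration runs on a worker; results are combined with c().
results <- foreach(i = 1:8, .combine = c) %dopar% {
  sqrt(i)
}
stopCluster(cl)
print(results)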
The R parallel Package
If you are using the parallel or doParallel packages for R, your submit script should look something like the following:
#!/bin/bash
#SBATCH -J myRjob
#SBATCH -N 1
#SBATCH -n 4
#SBATCH -p genacc_q
#SBATCH -t 10:00:00
module load R/4.1.0
R CMD BATCH myRjob.R
When you are ready to submit, save the above submit script as something like myRjob.sh and then, from the directory where you saved it, type sbatch myRjob.sh (you can name the script anything you like, as long as it keeps the .sh extension).
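The R script itself can pick up the allocated core count from the SLURM environment rather than hard-coding it (a sketch; the file name myRjob.R and the workload are illustrative, and SLURM_CPUS_ON_NODE is assumed to be set by the scheduler):
# myRjob.R -- example contents
library(parallel)
# Use the core count SLURM allocated on this node (fall back to 1 if unset).
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_ON_NODE", unset = "1"))
# mclapply forks n_cores workers on the single allocated node.
results <- mclapply(1:100, function(i) i^2, mc.cores = n_cores)
print(sum(unlist(results)))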