This article explains how to create R command scripts for HTCondor. It also demonstrates how to configure an R repository and install packages to a local directory.

What is R?


  • R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.


1. Installing R packages without root access

   First, designate a directory in which to store the downloaded packages. On my machine, I use the directory /home/kross48/packages/. After creating the package directory, start R:

 $ R

   Then install a package into that directory and load it from there:

 > install.packages("ggplot2", lib="/home/kross48/packages/")
 > library(ggplot2, lib.loc="/home/kross48/packages/")
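
The package directory has to exist and be writable before install.packages() can use it. As a minimal sketch, assuming a hypothetical helper named ensure_pkg_dir and a directory under $HOME (substitute your own path), the preparation step might look like this:

```shell
# ensure_pkg_dir is a hypothetical helper name, not part of R or HTCondor.
# It creates the directory if needed and confirms that you can write to it.
ensure_pkg_dir() {
    pkg_dir=$1
    mkdir -p "$pkg_dir" || return 1
    if [ ! -w "$pkg_dir" ]; then
        echo "no write access to $pkg_dir" >&2
        return 1
    fi
    echo "package directory ready: $pkg_dir"
}

# Assumed location; the article itself uses /home/kross48/packages/
ensure_pkg_dir "$HOME/packages"
```

Once the helper reports that the directory is ready, the same path can be passed as lib= and lib.loc= in the R commands above.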

2. Installing R packages locally from a tar file

 $ R CMD INSTALL arules_1.1-9.tar.gz --library=/home/kross48/packages
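
When this step runs unattended, it helps to check the inputs before calling R. The following is only a sketch: install_tarball is a hypothetical wrapper name, and the paths in the usage line are placeholders. It reports a missing tarball or library directory explicitly and otherwise runs the same R CMD INSTALL command as above.

```shell
# install_tarball is a hypothetical wrapper around R CMD INSTALL.
install_tarball() {
    tarball=$1
    lib_dir=$2
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    if [ ! -d "$lib_dir" ]; then
        echo "library directory not found: $lib_dir" >&2
        return 1
    fi
    R CMD INSTALL --library="$lib_dir" "$tarball"
}

# Placeholder arguments; a message is printed if the install cannot run here.
install_tarball arules_1.1-9.tar.gz "$HOME/packages" || echo "install skipped or failed"
```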

It’s a bit of a pain having to type the package directory every time. To avoid this, create a file named .Renviron in your home area and add the line R_LIBS=/home/kross48/packages/ to it (substitute your own package directory). Whenever you start R, that directory is then added to the list of places R searches for packages, so the lib and lib.loc arguments are no longer needed:

 > install.packages("ggplot2")
 > library(ggplot2)
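
The .Renviron entry can also be added from the shell. This is a sketch under assumptions: set_r_libs is a hypothetical helper name and the paths in the usage line are placeholders. It appends the R_LIBS line only if the file does not already define one, so running it twice does no harm.

```shell
# set_r_libs is a hypothetical helper: it appends R_LIBS=<dir> to the given
# .Renviron file unless an R_LIBS line is already present.
set_r_libs() {
    pkg_dir=$1
    renviron=$2
    if grep -q '^R_LIBS=' "$renviron" 2>/dev/null; then
        echo "R_LIBS already set in $renviron"
    else
        printf 'R_LIBS=%s\n' "$pkg_dir" >> "$renviron"
    fi
}

# Assumed paths; adjust to your own home area and package directory.
set_r_libs "$HOME/packages" "$HOME/.Renviron"
```

R reads .Renviron at startup, so the change takes effect in the next R session.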

3. Setting the repository: creating an .Rprofile

Every time you install an R package, you are asked which repository R should use. To set the repository once and avoid specifying it at every install, create a file named .Rprofile in your home area and add the following code to it:
   cat(".Rprofile: Setting Cloud repository\n")
   r = getOption("repos")
   r["CRAN"] = ""   # hard code the cloud repo for CRAN: put the mirror URL between the quotes
   options(repos = r)
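
As a sketch of the same idea from the shell, the snippet below writes the repository setting into an .Rprofile. Two things here are assumptions: write_rprofile is a hypothetical helper name, and https://cloud.r-project.org is used as the mirror because the comment above mentions the cloud repository. Use whichever CRAN mirror you prefer.

```shell
# write_rprofile is a hypothetical helper: it appends a repos setting to the
# given .Rprofile unless the file already configures one.
write_rprofile() {
    rprofile=$1
    if grep -q 'options(repos' "$rprofile" 2>/dev/null; then
        echo "repos already configured in $rprofile"
        return 0
    fi
    # Assumed mirror URL; replace it with your preferred CRAN mirror.
    cat >> "$rprofile" <<'EOF'
local({
  r <- getOption("repos")
  r["CRAN"] <- "https://cloud.r-project.org"
  options(repos = r)
})
EOF
}

write_rprofile "$HOME/.Rprofile"
```

Wrapping the R code in local() keeps the temporary variable r out of your workspace.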



4. Setting up HTCondor Jobs

   Sample R script (saved here as test2.R, the name used in the command file below):
   cat('This is my first R program\n')

   Sample command file:
   universe = vanilla
   getenv = true
   executable = /usr/bin/Rscript
   arguments = test2.R
   log = $(Cluster).log
   output = $(Cluster).$(Process).out
   error = $(Cluster).$(Process).error
   queue
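
A command file only describes the job; it still has to be handed to HTCondor. The sketch below generates a command file like the one above and submits it when condor_submit is available. The helper name write_submit_file and the file name test2.sub are assumptions made for illustration.

```shell
# write_submit_file is a hypothetical helper that writes a command file
# for running one R script through Rscript.
write_submit_file() {
    submit_file=$1
    r_script=$2
    # \$ keeps $(Cluster) and $(Process) literal so HTCondor expands them.
    cat > "$submit_file" <<EOF
universe   = vanilla
getenv     = true
executable = /usr/bin/Rscript
arguments  = $r_script
log        = \$(Cluster).log
output     = \$(Cluster).\$(Process).out
error      = \$(Cluster).\$(Process).error
queue
EOF
}

write_submit_file test2.sub test2.R

# Submit only where HTCondor is installed; otherwise just keep the file.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit test2.sub
    condor_q
else
    echo "condor_submit not found; wrote test2.sub only"
fi
```

After the job finishes, the text printed by the R script should appear in the .out file named after the cluster and process numbers.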

Alternatively, here is an example that runs a shell script inside a job:

   universe = vanilla
   getenv = true
   # set executable to the path of the Bash script shown below
   executable =
   log = $(Cluster).log
   output = $(Cluster).$(Process).out
   error = $(Cluster).$(Process).error
   queue

   Sample Bash script:
   #!/bin/bash
   export R_LIBS=/home/kross48/packages
   # run your script
   /usr/bin/Rscript test.R
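
Before pointing a job at a wrapper script, it is worth making the script executable and checking its syntax locally. The following is a sketch only: make_wrapper and run_r.sh are hypothetical names, and the package directory and R script are placeholders.

```shell
# make_wrapper is a hypothetical helper that writes a wrapper script like the
# sample above and marks it executable.
make_wrapper() {
    wrapper=$1
    pkg_dir=$2
    r_script=$3
    cat > "$wrapper" <<EOF
#!/bin/bash
export R_LIBS=$pkg_dir
/usr/bin/Rscript $r_script
EOF
    chmod +x "$wrapper"
}

# Placeholder names; adjust to your own package directory and R script.
make_wrapper run_r.sh "$HOME/packages" test.R

# bash -n parses the script without running it, which catches syntax errors.
bash -n run_r.sh && echo "run_r.sh: syntax OK"
```

The resulting script path is what goes on the executable line of the command file.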