This article explains how to create R command scripts for HTCondor. It also shows how to configure an R repository and install packages to a local directory.
What is R?
- R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.
1. Installing R packages without root access
First, designate a directory where the downloaded packages will be stored. On my machine, I use the directory /home/kross48/packages/. After creating the package directory, start R:

    $ R

Then install a package into that directory and load it from there:

    > install.packages("ggplot2", lib="/home/kross48/packages/")
    > library(ggplot2, lib.loc="/home/kross48/packages/")
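The directory setup and install step above can also be scripted from the shell, which is handy on a submit node where you would rather not open an interactive R session. The following is a minimal sketch, not a definitive recipe: the variable PKG_DIR, its default of $HOME/packages, and the helper name install_r_package are assumptions made for illustration (the article itself uses /home/kross48/packages/).

```shell
#!/bin/bash
# Hypothetical package directory; override by exporting PKG_DIR beforehand.
PKG_DIR="${PKG_DIR:-$HOME/packages}"

# Create the directory if it does not exist yet (no root access needed).
mkdir -p "$PKG_DIR"

# install_r_package NAME
# Installs one CRAN package into PKG_DIR without an interactive session.
# Passing repos explicitly avoids the mirror-selection prompt.
install_r_package() {
    Rscript -e "install.packages('$1', lib='$PKG_DIR', repos='https://cloud.r-project.org/')"
}

# Example call, left commented out because it downloads from CRAN:
# install_r_package ggplot2
```

Uncommenting the last line would install ggplot2 into the chosen directory; the package can then be loaded with library(ggplot2, lib.loc=...) as shown above.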
2. Installing R packages locally from a tar file
    $ R CMD INSTALL --library=/home/kross48/packages arules_1.1-9.tar.gz

Typing the package directory every time is tedious. To avoid it, create a file named .Renviron in your home directory and add the following line, replacing the path with your own package directory:

    R_LIBS=/home/kross48/packages/

Whenever R starts, it then adds that directory to the list of places it searches for packages, so the lib and lib.loc arguments can be dropped:

    > install.packages("ggplot2")
    > library(ggplot2)
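The .Renviron entry described above can be added from the shell as well. This is a minimal sketch under stated assumptions: the variable names RENVIRON and PKG_DIR and the default directory $HOME/packages are hypothetical placeholders, so substitute your own package directory. The grep check keeps the script safe to run more than once, because the line is appended only if it is not already present.

```shell
#!/bin/bash
# Hypothetical defaults; point PKG_DIR at your own package directory.
RENVIRON="${RENVIRON:-$HOME/.Renviron}"
PKG_DIR="${PKG_DIR:-$HOME/packages}"
LINE="R_LIBS=$PKG_DIR"

# Make sure the file exists, then append the line only if it is missing.
touch "$RENVIRON"
grep -qxF "$LINE" "$RENVIRON" || echo "$LINE" >> "$RENVIRON"
```

R reads .Renviron at startup, so the change takes effect in the next R session, not in one that is already running.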
3. Setting the repository: creating an .Rprofile
Every time you install an R package, you are asked which repository R should use. To set the repository once and avoid specifying it at every install, create a file named .Rprofile in your home directory and add the following code to it:

    cat(".Rprofile: Setting Cloud repository\n")
    r = getOption("repos")
    # hard code the cloud repo for CRAN
    r["CRAN"] = "https://cloud.r-project.org/"
    options(repos = r)
    rm(r)

or

    local({ r
4. Setting up HTCondor Jobs
Sample R script (test2.R):

    library("mvtnorm", lib.loc="/home/kross48/packages/")
    library("rngWELL", lib.loc="/home/kross48/packages/")
    library("randtoolbox", lib.loc="/home/kross48/packages/")
    sink('test2.txt')
    cat('This is my first R program\n')
    sink()
    print("success")

Sample command file:

    universe = vanilla
    getenv = true
    executable = /usr/bin/Rscript
    arguments = test2.R
    log = $(Cluster).log
    output = $(Cluster).$(Process).out
    error = $(Cluster).$(Process).error
    queue

Alternatively, run a shell script inside the job:

    universe = vanilla
    getenv = true
    executable = test.sh
    log = $(Cluster).log
    output = $(Cluster).$(Process).out
    error = $(Cluster).$(Process).error
    queue

Sample Bash script (test.sh):

    #!/bin/bash
    export R_LIBS=/home/kross48/packages
    # run your script
    /usr/bin/Rscript test.R
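The first command file above can also be generated and submitted from the shell, which saves editing it by hand for each R script. The following is only a sketch under stated assumptions: the .sub file name and the variables R_SCRIPT and SUBMIT_FILE are hypothetical, and the script assumes condor_submit is on the PATH of an HTCondor submit node. If it is not, the script only writes the file.

```shell
#!/bin/bash
# R script to run; defaults to the article's sample script.
R_SCRIPT="${1:-test2.R}"
# Hypothetical naming scheme: test2.R -> test2.sub
SUBMIT_FILE="${R_SCRIPT%.R}.sub"

# Write the submit description. The \$ keeps $(Cluster) and $(Process)
# literal so HTCondor expands them, not the shell.
cat > "$SUBMIT_FILE" <<EOF
universe   = vanilla
getenv     = true
executable = /usr/bin/Rscript
arguments  = $R_SCRIPT
log        = \$(Cluster).log
output     = \$(Cluster).\$(Process).out
error      = \$(Cluster).\$(Process).error
queue
EOF

# Submit only when HTCondor's command-line tools are available.
if command -v condor_submit >/dev/null 2>&1; then
    condor_submit "$SUBMIT_FILE"
else
    echo "condor_submit not found; wrote $SUBMIT_FILE for later submission"
fi
```

After submission, condor_q lists the queued job, and the .out and .error files named in the command file collect the R script's standard output and standard error once the job has run.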