Enhanced R Script

EnhancedRScript.py is a Python script intended as an easy interface between the Discovery Environment and R scripts. Options given on its command line are substituted into a specially designed R script, and the modified script is then executed.

EnhancedRScript scans an R script for parameter names surrounded by two dollar signs on each side, e.g., $$name$$ or $$myvar$$. Each parameter name becomes an option on the EnhancedRScript command line, and each occurrence of the parameter is replaced by that option's value. For example, "$$name$$" would be replaced by the value of the --name= option on the EnhancedRScript command line. Defaults are also supported, and EnhancedRScript can generate a script without executing it, for debugging purposes. The Rscript command can, of course, pass parameters to an R script as well, but the Python wrapper is intended to provide more flexibility in some respects.
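
As a rough illustration of the scanning step, the sketch below finds parameter names with a regular expression. It is not the code used by EnhancedRScript.py: the pattern, the function name find_parameter_names, and the assumption that names consist only of letters, digits, and underscores are all made up for this example. The optional "=default" part of the pattern anticipates the default values described later on this page.

```python
import re

# Hypothetical pattern (an assumption, not taken from the tool's source):
# "$$", a name of word characters, an optional "=default", then "$$".
PARAM_PATTERN = re.compile(r"\$\$(\w+)(?:=(.*?))?\$\$")


def find_parameter_names(script_text):
    """Return the distinct parameter names in order of first appearance."""
    names = []
    for match in PARAM_PATTERN.finditer(script_text):
        name = match.group(1)
        if name not in names:
            names.append(name)
    return names
```

Under these assumptions, a script containing $$name$$ and $$myvar$$ would yield the option names "name" and "myvar".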

The basic Enhanced R Script application in the Discovery Environment is intended as a demonstration and runs scripts that do not use the replacement parameters described above. To run scripts with replacement parameters, create a new Discovery Environment application. Examples include Linear Regression in R and Generalized Linear Regression in R.

Usage

./enhancedRScript.py <R_script_file> <options>

Arguments

<R_script_file>

Filename or full path to the script file to be processed. See below for special considerations.

Internal Options

These command line options are for EnhancedRScript itself and are not used for substitutions. All future internal options will begin with an underscore.

--_scriptonly

Generate the output script and print it to standard output without running it in R.
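
One way the underscore convention could be applied is to separate internal options from substitution options before any replacement takes place. The following is a minimal sketch, not the tool's actual parsing code; split_options and the shape of its return value are hypothetical.

```python
def split_options(args):
    """Split "--name=value" arguments into internal and substitution options.

    Assumption for this sketch: an option whose name starts with an
    underscore (such as --_scriptonly) is internal; all others are
    substitution values. Options without "=" get an empty string value.
    """
    internal, substitutions = {}, {}
    for arg in args:
        if not arg.startswith("--"):
            continue  # e.g. the R script file name
        name, _, value = arg[2:].partition("=")
        target = internal if name.startswith("_") else substitutions
        target[name] = value
    return internal, substitutions
```

With this split, a caller could check for "_scriptonly" in the internal dictionary and print the generated script instead of running it.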

Example Data

R Script Format

The R script should be a standard R script except for the values that are to be replaced by command line option values. A trivial R script, say "trivial.R", might look like:

    myvar1 <- $$firstvalue$$
    myvar2 <- $$secondvalue$$
    myvar3 <- myvar1 + myvar2

EnhancedRScript would then find "$$firstvalue$$" and "$$secondvalue$$" and expect the rest of the command line to supply values for these parameters. For instance, the command line could be:

    ./enhancedRScript.py trivial.R --firstvalue=1 --secondvalue=2

EnhancedRScript then creates a new R script with the parameters replaced, leaving trivial.R unchanged, and runs the new script. The new script would be:

    myvar1 <- 1
    myvar2 <- 2
    myvar3 <- myvar1 + myvar2
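
The replacement shown above can be approximated in a few lines of Python. This is a sketch under stated assumptions rather than the tool's implementation: the regular expression, parse_options, and substitute are hypothetical names, and parameters without a value are simply replaced by empty text here.

```python
import re

# Hypothetical pattern: "$$name$$" or "$$name=default$$".
PARAM_PATTERN = re.compile(r"\$\$(\w+)(?:=(.*?))?\$\$")


def parse_options(args):
    """Turn ["--firstvalue=1", ...] into {"firstvalue": "1", ...}."""
    values = {}
    for arg in args:
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            values[name] = value
    return values


def substitute(script_text, values):
    """Replace each parameter expression with its value as plain text."""
    # A function is used as the replacement so that backslashes in a value
    # are inserted literally instead of being treated as regex escapes.
    return PARAM_PATTERN.sub(lambda m: values.get(m.group(1), ""), script_text)


template = (
    "myvar1 <- $$firstvalue$$\n"
    "myvar2 <- $$secondvalue$$\n"
    "myvar3 <- myvar1 + myvar2\n"
)
options = parse_options(["--firstvalue=1", "--secondvalue=2"])
result = substitute(template, options)
print(result)
```

The template string is never modified; substitute returns a new string, which mirrors the way trivial.R is left unchanged on disk.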

Replacement parameters can specify default values that will be used if the corresponding option is not given on the command line. For example:

    myvar1 <- $$firstvalue=4$$

If --firstvalue=5 is specified on the command line, the parameter expression will be replaced by "5". If --firstvalue is not specified, the parameter expression will be replaced by "4". A default value for a replacement parameter needs only to be specified once no matter how many times the parameter may be used. If a default value for the same replacement parameter is specified more than once, only the last value will be used. For example:

    myvar1 <- $$firstvalue$$
    myvar2 <- $$firstvalue=2$$
    myvar3 <- $$firstvalue=3$$

In all three cases, if --firstvalue is not specified on the command line, the parameter expression will be replaced by "3", since "3" is the last default value for firstvalue specified in the R script.
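
The "last default wins" rule can be sketched by collecting defaults in a dictionary in a first pass, so that later assignments overwrite earlier ones, and only then doing the replacement. The names collect_defaults and fill, and the regular expression, are assumptions made for this illustration and are not taken from EnhancedRScript.py.

```python
import re

# Hypothetical pattern: group 1 is the name, group 2 the optional default.
PARAM_PATTERN = re.compile(r"\$\$(\w+)(?:=(.*?))?\$\$")


def collect_defaults(script_text):
    """Map each parameter name to the last default given for it."""
    defaults = {}
    for match in PARAM_PATTERN.finditer(script_text):
        name, default = match.group(1), match.group(2)
        if default is not None:
            defaults[name] = default  # a later default overwrites an earlier one
    return defaults


def fill(script_text, options):
    """Replace parameters, preferring command line options over defaults."""
    values = collect_defaults(script_text)
    values.update(options)
    return PARAM_PATTERN.sub(lambda m: values.get(m.group(1), ""), script_text)
```

Collecting all defaults before replacing anything is what lets the first line of the example receive "3" even though its own expression carries no default.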

Parameters are case-sensitive: $$firstvalue$$ and $$firstValue$$ refer to two different parameters and involve two different options on the EnhancedRScript command line.

If no default is specified and no option given on the command line, the parameter expression will simply be removed from the script before execution. However, if the parameter name begins with a capital letter, a value for that parameter is considered to be required, and EnhancedRScript will give an error if it has no value. If that capitalized parameter has a default value, the default value will be used as normal. For example, if the R script contains $$Firstvalue$$, then --Firstvalue=<some_value> must be specified on the command line unless, somewhere else in the R script, Firstvalue is given a default value such as $$Firstvalue=2$$. Capital letters after the first letter have no special meaning.
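
The required-parameter rule could be checked before any replacement is done, roughly as follows. This is only a sketch: missing_required is a hypothetical helper, and the real script may detect and report the error differently.

```python
import re

# Hypothetical pattern: group 1 is the name, group 2 the optional default.
PARAM_PATTERN = re.compile(r"\$\$(\w+)(?:=(.*?))?\$\$")


def missing_required(script_text, options):
    """Return capitalized parameter names that have no option and no default."""
    names, with_default = set(), set()
    for match in PARAM_PATTERN.finditer(script_text):
        names.add(match.group(1))
        if match.group(2) is not None:
            with_default.add(match.group(1))
    return sorted(
        name for name in names
        # Only the first character decides whether a value is required.
        if name[0].isupper() and name not in options and name not in with_default
    )
```

A caller could stop with an error message whenever the returned list is not empty.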

If single quotes surround an option value on the EnhancedRScript command line, those quotes will be dropped. For example, --firstvalue='1' is the same as --firstvalue=1. If the value should actually be surrounded by single quotes, use a pair of single quotes on both sides, e.g. --firstvalue=''this value''.
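
The quote handling can be pictured as removing exactly one surrounding pair of single quotes from the value. The helper below is a hypothetical sketch of that behavior, not the tool's own code.

```python
def strip_outer_single_quotes(value):
    """Drop one pair of surrounding single quotes, if present."""
    if len(value) >= 2 and value.startswith("'") and value.endswith("'"):
        return value[1:-1]
    return value
```

Because only one pair is removed, a doubled pair leaves the value wrapped in single quotes. An interactive POSIX shell normally consumes unescaped single quotes itself, so this handling matters mainly when the quotes reach the script literally; whether the Discovery Environment passes them that way is an assumption here.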

Additionally, the parameter replacements are straight text replacements, so they may be used in any location, even in the middle of a variable name or R command name. For instance, a script called zzz.R that looks like:

    my$$varnum$$var <- 123

and a Python command line that looks like:

    ./enhancedRScript.py zzz.R --varnum=55

would result in a script of:

    my55var <- 123

A special default value can be used to cause lines of the script to be deleted. Specifically, after all replacements from the command line have been made, if EnhancedRScript finds @!!@ on a line of the script file, the entire line will be deleted. For example:

    a <- 1
    x <- $$xvalue$$
    b <- 1
    y <- $$xvalue=@!!@$$
    c <- 1

If the parameter --xvalue is not specified on the command line, the default value is filled in, resulting in the following script:

    a <- 1
    x <- @!!@
    b <- 1
    y <- @!!@
    c <- 1

Line deletions are then made, with every line containing @!!@ being deleted. The final script would be:

    a <- 1
    b <- 1
    c <- 1
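
The deletion pass amounts to filtering out every line that contains the marker once replacement is finished. A minimal sketch, assuming the marker is the literal text @!!@ used in the example and using the hypothetical name delete_marked_lines:

```python
DELETE_MARKER = "@!!@"


def delete_marked_lines(script_text):
    """Remove every line that contains the deletion marker."""
    kept = [line for line in script_text.splitlines() if DELETE_MARKER not in line]
    return "\n".join(kept)


filled = "a <- 1\nx <- @!!@\nb <- 1\ny <- @!!@\nc <- 1"
final = delete_marked_lines(filled)
print(final)
```

Running the filter after substitution, rather than before, is what allows a value supplied with --xvalue to keep both lines in the script.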

Input and Output Files

Input and output files depend upon the script template.

Tool Availability

Binary

N/A

Source

enhancedRscript.py (Python 2.7)

Version

1.0

User Guide

enhancedRscript_usage.txt