BATools 0.0.1 (Atmosphere Images Tutorial)
About BATools
BATools 0.0.1 is an Atmosphere image that has R version 3.0.1 installed. BATools, an R package for Whole Genome Prediction, is also installed on this image.
Learn about allocations
Learn about CyVerse's allocation policies here.
Accessing BATools
To use BATools via VNC Viewer, follow these simple steps:
- Launch a new instance of BATools 0.0.1 from Atmosphere, and access it using VNC.
- Once you have access to the instance, open up the terminal by either clicking the black icon at the bottom of the screen, or by going to Application > Accessories > Terminal.
- Enter the command "R" (without the quotes) in the terminal window.
- Enter the command "library(BATools)" or "require(BATools)" to load the BATools package.
- Begin coding!
- When you are finished, enter the command q() to quit R.
- When R asks whether to save the workspace image, enter n, as you do not need to save it.
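The interactive steps above can also be checked from the terminal without opening an R session. The sketch below assumes Rscript is on the PATH of the image (the wrapper examples in this tutorial call it); the helper name check_batools is hypothetical and not part of BATools.

```shell
# Hypothetical helper: report whether the BATools package loads in R.
# Returns 0 if it loads, 1 if R cannot load it, 2 if Rscript is missing.
check_batools() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH" >&2
    return 2
  fi
  # library() stops with an error (non-zero exit) if the package is absent.
  if Rscript -e 'suppressPackageStartupMessages(library(BATools))' >/dev/null 2>&1; then
    echo "BATools loaded successfully"
    return 0
  fi
  echo "BATools could not be loaded" >&2
  return 1
}
```

Paste the function into the terminal, then run check_batools. A non-zero return code suggests the instance is not set up as described here.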
Using the BATools Wrapper Script
In order to make BATools easier to use, there is a wrapper script available (for general information on wrapper scripts follow this link). You can find this wrapper script on the iPlant Data Store, or below in the Additional Information section. The wrapper script takes a phenotype file, a genotype file, a function name, and some other parameters, and runs the given function for the given input files. In order to use this wrapper script, your input files (genotype and phenotype) must be in text format. To run the wrapper script:
- Use iDrop (or another file transfer method) to download the wrapper script and any necessary input files from the iPlant Data Store to your virtual machine. These files can be found under /iplant/home/shared/iplantcollaborative/example_data/BATools, or in the Additional Information section below.
- Two ways to access files in the data store are via iCommands or using the graphical user interface by logging in to the Discovery Environment, clicking Data, and then transferring the files into your image.
- Open up the terminal, either by clicking the icon at the bottom of the screen, or by going to Application > Accessories > Terminal.
- Invoke the R wrapper script using one of the following commands. Each command tests a specific function within BATools. There are several parameters to enter, so here are some examples of what the commands should look like for each function.
- The outputs for these tests will go into your current working directory.
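For the iCommands route mentioned in the steps above, a transfer might look like the following sketch. It assumes iCommands are installed on the instance and that you have already authenticated with iinit. The local directory ~/batools_data and the helper name fetch_batools_examples are illustrative choices, not part of this tutorial.

```shell
# Data Store collection named in the steps above.
BATOOLS_REMOTE="/iplant/home/shared/iplantcollaborative/example_data/BATools"
# Hypothetical local destination; change it to suit your instance.
BATOOLS_LOCAL="${BATOOLS_LOCAL:-$HOME/batools_data}"

# Hypothetical helper: copy the collection to the local directory with iget.
# Returns 2 if iCommands are not available.
fetch_batools_examples() {
  if ! command -v iget >/dev/null 2>&1; then
    echo "iget not found; install iCommands or use the Discovery Environment" >&2
    return 2
  fi
  mkdir -p "$BATOOLS_LOCAL" || return 1
  # -r makes iget copy the collection recursively.
  iget -r "$BATOOLS_REMOTE" "$BATOOLS_LOCAL"
}
```

After running fetch_batools_examples, point the --phenotype and --genotype paths at the downloaded files.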
Example invocations
You should be able to execute these commands by changing the file paths to the appropriate locations for your data, then copying and pasting them into the terminal.
File paths
The file paths below will work as written because we have downloaded the wrapper script and input files into this image.
However, if you have not done so before, it is still worth going through the process of interacting with the Data Store, transferring the files to your image, and substituting your own file paths.
This command will execute the BayesA function and output the results as space-delimited .txt files.
Rscript "/usr/bin/run_BATools_1.0.1.R" --phenotype "/usr/example_data/example_phenotype.txt" --genotype "/usr/example_data/example_genotype.txt" --test "BayesA" --output "text" --in.format "space" --out.format "space" --startpi 1 --startdf 10 --startscale 0.001 --truepi TRUE --truedef FALSE --truescale FALSE --numiter 10500 --skip 2 --burnIn 5000 --Seed 1
This command will execute the anteBayesA function and output the results as space-delimited .txt files.
Rscript "/usr/bin/run_BATools_1.0.1.R" --phenotype "/usr/example_data/example_phenotype.txt" --genotype "/usr/example_data/example_genotype.txt" --test "anteBayesA" --output "text" --in.format "space" --out.format "space" --startpi 1 --startdf 10 --startscale 0.001 --truepi TRUE --truedef FALSE --truescale FALSE --truet FALSE --numiter 10500 --skip 2 --burnIn 5000 --Seed 2
This command will execute the BayesB function and output the results as space-delimited .txt files.
Rscript "/usr/bin/run_BATools_1.0.1.R" --phenotype "/usr/example_data/example_phenotype.txt" --genotype "/usr/example_data/example_genotype.txt" --test "BayesB" --output "text" --in.format "space" --out.format "space" --startpi 0.5 --startdf 10 --startscale 0.01 --alphapi 1 --betapi 10 --truepi FALSE --truedef FALSE --truescale FALSE --numiter 10500 --skip 2 --burnIn 5000 --Seed 3
This command will execute the anteBayesB function and output the results as space-delimited .txt files.
Rscript "/usr/bin/run_BATools_1.0.1.R" --phenotype "/usr/example_data/example_phenotype.txt" --genotype "/usr/example_data/example_genotype.txt" --test "anteBayesB" --output "text" --in.format "space" --out.format "space" --startpi 0.5 --startdf 10 --startscale 0.01 --alphapi 1 --betapi 10 --truepi FALSE --truedef FALSE --truescale FALSE --truet FALSE --numiter 10500 --skip 2 --burnIn 5000 --Seed 4
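The one-line commands above can be easier to edit when the paths are held in variables and the flags are split across lines. The sketch below restates the BayesA example in that form. The variable names, the function name run_bayesa, and the DRY_RUN switch are assumptions added for illustration; the flags and values are copied from the example above.

```shell
# Paths from the examples above; change these to match your own files.
WRAPPER="/usr/bin/run_BATools_1.0.1.R"
PHENO="/usr/example_data/example_phenotype.txt"
GENO="/usr/example_data/example_genotype.txt"

# Hypothetical helper: run the BayesA example.
run_bayesa() {
  # If DRY_RUN is set and non-empty, "echo" is prefixed so the command
  # is printed instead of executed.
  ${DRY_RUN:+echo} Rscript "$WRAPPER" \
    --phenotype "$PHENO" --genotype "$GENO" \
    --test "BayesA" --output "text" \
    --in.format "space" --out.format "space" \
    --startpi 1 --startdf 10 --startscale 0.001 \
    --truepi TRUE --truedef FALSE --truescale FALSE \
    --numiter 10500 --skip 2 --burnIn 5000 --Seed 1
}
```

Set DRY_RUN=1 before calling run_bayesa to preview the command, then unset it to run the analysis. The other three functions can be handled the same way by swapping in their flags.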
Sample of output
If you copy and paste one of the above commands (optionally changing the file paths), this is what the command output should look like.
Parameter descriptions
Here are the descriptions for all of the parameters used by the BATools wrapper script. For a more detailed explanation, please see the BATools reference manual.
--LONGFLAG | -SHORTFLAG | Description |
---|---|---|
--phenotype | -y | The phenotype (a numerical vector). |
--genotype | -Z | The genotype matrix (coded in "0/1/2"). |
--test | -a | The function to be executed ("BayesA", "BayesB", "anteBayesA", or "anteBayesB"). |
--output | -b | The type of results files you want BATools to produce ("text" outputs .txt files, "workspace" outputs .RData files, and "both" outputs both .txt and .RData files). |
--in.format | -c | Describes how the genotype file is delimited ("comma", "space", and "tab" are currently the only accepted delimiters). |
--out.format | -d | Describes how the output files are delimited ("comma", "space", and "tab" are currently the only accepted delimiters). |
--startpi | -e | The starting value of pi, which is the ratio of SNP effect variance that is non-zero. When pi = 1, it is BayesA. Otherwise, it is BayesB. |
--startdf | -f | The starting value of the degree of freedom parameter of SNP effect variance. |
--startscale | -g | The starting value of the scale parameter of SNP effect variance. |
--alphapi | -h | The value of alpha for sampling pi. |
--betapi | -i | The value of beta for sampling pi. |
--truepi | -j | A boolean value. If truepi = TRUE, pi is fixed at its starting value; if truepi = FALSE, pi is sampled. |
--truedef | -k | A boolean value. If truedef = TRUE, the degree of freedom is fixed at its starting value; if truedef = FALSE, the degree of freedom of SNP effect variances is sampled. |
--truescale | -l | A boolean value. If truescale = TRUE, the scale is fixed at its starting value; if truescale = FALSE, the scale of SNP effect variances is sampled. |
--truet | -m | A boolean value. If truet = TRUE, the antedependence association parameter t is not sampled; if truet = FALSE, t is sampled. |
--numiter | -n | The number of iterations for MCMC sampling. |
--skip | -o | The number of iterations for skip. |
--burnIn | -p | The number of iterations for burnIn in MCMC sampling. |
--Seed | -q | The seed for the random number generator. NOTE: If Seed is left blank, it defaults to 1000. If Seed is set to -1, the current system time is used instead. |
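Several parameters in the table accept only a fixed set of values, so a typo is worth catching before a long MCMC run starts. The sketch below checks the --test, --output, --in.format, and --out.format values against the lists given in the table. The helper name check_batools_args is hypothetical, and the exact-case matching is an assumption: this tutorial does not say whether the wrapper itself is case-sensitive.

```shell
# Hypothetical pre-flight check for the enumerated values in the table.
# Usage: check_batools_args TEST OUTPUT IN_FORMAT OUT_FORMAT
# Returns 0 if all four values are in the documented lists, 1 otherwise.
check_batools_args() {
  case "$1" in
    BayesA|BayesB|anteBayesA|anteBayesB) ;;
    *) echo "unknown --test value: $1" >&2; return 1 ;;
  esac
  case "$2" in
    text|workspace|both) ;;
    *) echo "unknown --output value: $2" >&2; return 1 ;;
  esac
  # The input and output delimiters share the same accepted list.
  for fmt in "$3" "$4"; do
    case "$fmt" in
      comma|space|tab) ;;
      *) echo "unknown delimiter: $fmt" >&2; return 1 ;;
    esac
  done
  return 0
}
```

For example, check_batools_args BayesA text space space succeeds, while a misspelled function name such as bayesa is reported on standard error.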
Additional Information
Listed below are the example input files (which can also be found on the iPlant Data Store) and additional documentation for BATools.