...
Please work through the tutorial and add your comments at the bottom of this page, or send comments by email to support@cyverse.org. Thank you.
Warning: Learn about CyVerse's allocation policies here.
Part 1: Connect to an instance of an Atmosphere Image (Virtual Machine)
Step 1. Go to https://atmo.cyverse.org and log in with your CyVerse credentials.
Step 2. After logging in, create a new project (faststructure) and add a short description.
Step 3. Click on the Launch New Instance button and search for the fastStructure-1.0 image.
Step 4. Select the fastStructure-1.0 image and click Launch Instance. It can take anywhere from 2 to 15 minutes for the cloud instance to launch.
Note: Instances can be configured for different amounts of CPU, memory, and storage depending on user needs. This tutorial can be completed with the medium1 instance size (4 CPUs, 8 GB memory, 80 GB root).
Part 2: Set up a fastStructure run using the Terminal window
Step 1. Connect to the instance.
- Open the Terminal on a Mac and connect to the instance over ssh, using your CyVerse username and the IP address of the instance:
Code Block |
---|
$ ssh <username>@<IP address> |
- Alternatively, connect using the web shell.
Step 2. Get oriented.
- You will find the fastStructure software in the "/opt" folder. All the dependencies for running fastStructure are located in "/opt/fastStructure", but you need to change the permissions before using them.
...
Code Block |
---|
$ ls /opt/fastStructure-1.0/test
testdata.bed testoutput_logistic.3.log testoutput_logistic.3.varP testoutput_simple.3.meanP testoutput_simple.3.varQ
testdata.bim testoutput_logistic.3.meanP testoutput_logistic.3.varQ testoutput_simple.3.meanQ
testdata.fam testoutput_logistic.3.meanQ testoutput_simple.3.log testoutput_simple.3.varP |
The testdata.bed file (with the corresponding testdata.fam and testdata.bim) contains genotypes sampled for 200 individuals at 500 SNP loci. The folder also contains example output files. Let's copy the folder to the home directory and remove the example outputs, keeping only the input files:
Code Block |
---|
$ cd ~
$ cp -R /opt/fastStructure-1.0/test .
$ rm test/*output*
$ ls test
testdata.bed testdata.bim testdata.fam |
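As a quick sanity check on the copied input, the sample and locus counts can be read straight from the text files. This is a minimal sketch that assumes the standard PLINK layout (one line per individual in the .fam file, one line per variant in the .bim file); the function name `count_plink` is made up for illustration and is not part of fastStructure.

```shell
# Count samples and loci for a PLINK binary fileset, given its prefix.
# Assumption: <prefix>.fam has one line per individual and
# <prefix>.bim has one line per variant (standard PLINK layout).
count_plink() {
    local prefix="$1" ext samples loci
    # All three files must be present for the fileset to be usable.
    for ext in bed bim fam; do
        if [ ! -f "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext" >&2
            return 1
        fi
    done
    # awk counts a final line even if it lacks a trailing newline.
    samples=$(awk 'END { print NR }' "$prefix.fam")
    loci=$(awk 'END { print NR }' "$prefix.bim")
    echo "samples=$samples loci=$loci"
}
```

Running `count_plink test/testdata` from the home directory should report 200 samples and 500 loci if the copy matches the description above.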
Step 4. Set up a fastStructure test run. Executing the code with the provided test data should generate a log file identical to the ones in /opt/fastStructure-1.0/test/, as a first check that the source code has been downloaded and compiled correctly. The algorithm scales linearly with the number of samples, the number of loci, and the value of K, so the expected runtime for a new dataset can be estimated from the runtime reported in that log file.
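The linear scaling described above reduces to simple proportional arithmetic. The sketch below assumes you have noted the runtime of a reference run yourself; the function name `estimate_runtime` and the numbers in the usage note are hypothetical, and the result is only a rough estimate.

```shell
# Estimate the runtime of a new run by linear scaling from a reference run.
# Usage: estimate_runtime REF_SECONDS REF_SAMPLES REF_LOCI REF_K NEW_SAMPLES NEW_LOCI NEW_K
# Assumption: runtime is proportional to samples * loci * K, as stated above.
estimate_runtime() {
    awk -v t="$1" -v n0="$2" -v l0="$3" -v k0="$4" \
        -v n="$5" -v l="$6" -v k="$7" \
        'BEGIN { printf "%.1f\n", t * (n / n0) * (l / l0) * (k / k0) }'
}
```

For example, if the K=3 run on the test data (200 individuals, 500 loci) took 10 seconds, then `estimate_runtime 10 200 500 3 2000 50000 3` prints `10000.0`, i.e. roughly 10,000 seconds for 2,000 individuals at 50,000 loci.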
...
Code Block |
---|
$ python /opt/fastStructure-1.0/structure.py -K 2 --input=test/testdata --output=test/testoutput_simple --full --seed=100
$ python /opt/fastStructure-1.0/structure.py -K 3 --input=test/testdata --output=test/testoutput_simple --full --seed=100
$ python /opt/fastStructure-1.0/structure.py -K 4 --input=test/testdata --output=test/testoutput_simple --full --seed=100 |
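The three commands above differ only in the value of K, so they can be generated in a loop. The following is a sketch rather than part of fastStructure: the helper name `structure_cmds` is hypothetical, and the flags are copied from the commands above. It only prints the commands (a dry run) so that they can be reviewed before anything is executed.

```shell
# Print one structure.py command per K value (dry run, nothing is executed).
# Usage: structure_cmds INPUT_PREFIX OUTPUT_PREFIX K [K ...]
# The script path and flags are taken from the tutorial commands above.
structure_cmds() {
    local input="$1" output="$2" k
    shift 2
    for k in "$@"; do
        echo "python /opt/fastStructure-1.0/structure.py -K $k --input=$input --output=$output --full --seed=100"
    done
}
```

`structure_cmds test/testdata test/testoutput_simple 2 3 4` prints the three commands shown above; piping the output to `sh` on the instance would run them in sequence.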
Code Block |
---|
$ ls test
testdata.bed testoutput_simple.2.meanP testoutput_simple.3.log testoutput_simple.3.varQ testoutput_simple.4.varP
testdata.bim testoutput_simple.2.meanQ testoutput_simple.3.meanP testoutput_simple.4.log testoutput_simple.4.varQ
testdata.fam testoutput_simple.2.varP testoutput_simple.3.meanQ testoutput_simple.4.meanP
testoutput_simple.2.log testoutput_simple.2.varQ testoutput_simple.3.varP testoutput_simple.4.meanQ |
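Rather than scanning the listing by eye, the presence of all outputs can be checked per K. This sketch assumes each run writes the five files seen in the listing above (.log, .meanP, .meanQ, .varP, .varQ); the function name `check_outputs` is hypothetical.

```shell
# Report any missing fastStructure output file for the given prefix and K values.
# Usage: check_outputs OUTPUT_PREFIX K [K ...]
# Returns 0 if every expected file exists, 1 otherwise.
check_outputs() {
    local prefix="$1" missing=0 k ext
    shift
    for k in "$@"; do
        # The five suffixes are the ones that appear in the listing above.
        for ext in log meanP meanQ varP varQ; do
            if [ ! -f "$prefix.$k.$ext" ]; then
                echo "missing: $prefix.$k.$ext"
                missing=$((missing + 1))
            fi
        done
    done
    [ "$missing" -eq 0 ]
}
```

After the three runs, `check_outputs test/testoutput_simple 2 3 4` should print nothing and return success; any line it prints names a file that a run failed to produce.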
...
Code Block | ||
---|---|---|
| ||
$ python /opt/fastStructure-1.0/distruct.py -K 2 --input=test/testoutput_simple --output=test/testoutput_simple_distruct_K2.pdf
$ python /opt/fastStructure-1.0/distruct.py -K 3 --input=test/testoutput_simple --output=test/testoutput_simple_distruct_K3.pdf
$ python /opt/fastStructure-1.0/distruct.py -K 4 --input=test/testoutput_simple --output=test/testoutput_simple_distruct_K4.pdf
|
Code Block |
---|
$ ls
testdata.bed testoutput_simple.2.varP testoutput_simple.3.varQ testoutput_simple_distruct_K2.pdf
testdata.bim testoutput_simple.2.varQ testoutput_simple.4.log testoutput_simple_distruct_K3.pdf
testdata.fam testoutput_simple.3.log testoutput_simple.4.meanP testoutput_simple_distruct_K4.pdf
testoutput_simple.2.log testoutput_simple.3.meanP testoutput_simple.4.meanQ
testoutput_simple.2.meanP testoutput_simple.3.meanQ testoutput_simple.4.varP
testoutput_simple.2.meanQ testoutput_simple.3.varP testoutput_simple.4.varQ |
...
Step 7. Download the files using Cyberduck.
...