Please work through the tutorial and add your comments at the bottom of this page, or send comments by email to support@cyverse.org. Thank you.

Warning: Learn about allocations

Learn about CyVerse's allocation policies here.

Part 1: Connect to an instance of an Atmosphere Image (Virtual Machine)

Step 1. Go to https://atmo.cyverse.org and log in with your CyVerse credentials.

Step 2. After logging in, create a new project (faststructure) and add a short description.

Step 3. Click the Launch New Instance button and search for the fastStructure image.

Step 4. Select the fastStructure image and click Launch Instance. It will take 2-5 minutes for the cloud instance to launch.

Note: Instances can be configured with different amounts of CPU, memory, and storage depending on user needs. This tutorial can be completed with the medium1 instance size (4 CPUs, 8 GB memory, 80 GB root).

Part 2: Set up a fastStructure run using the Terminal window

Step 1. Connect to the instance.

  • Open the Terminal on a Mac and connect with ssh, using your username and the IP address of your instance:
Code Block
$ ssh <username>@<ip-address>
  • Or use the web shell.

Step 2. Get oriented. 

  1. You will find the fastStructure software in the "/opt" folder. All the dependencies for running fastStructure are located in "/opt/fastStructure", but you need to change the permissions before using them.

...

Code Block
$ ls /opt/fastStructure-1.0/test

testdata.bed  testoutput_logistic.3.log    testoutput_logistic.3.varP  testoutput_simple.3.meanP  testoutput_simple.3.varQ
testdata.bim  testoutput_logistic.3.meanP  testoutput_logistic.3.varQ  testoutput_simple.3.meanQ
testdata.fam  testoutput_logistic.3.meanQ  testoutput_simple.3.log     testoutput_simple.3.varP

The testdata.bed file (with the corresponding testdata.fam and testdata.bim) contains genotypes sampled for 200 individuals at 500 SNP loci. The folder also contains test output files. Let's copy the folder to the home directory and remove the output files:

Code Block
$ cd ~
$ cp -R /opt/fastStructure-1.0/test .
$ rm test/*output*
$ ls test
testdata.bed  testdata.bim  testdata.fam
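
As a quick sanity check on the copied data, the stated dimensions can be confirmed from the PLINK text files: a .fam file has one line per individual and a .bim file has one line per variant. The following is a minimal Python sketch and is not part of fastStructure. The helper name count_records is hypothetical, and the path assumes the copy step above was run from the home directory.

```python
import os


def count_records(path):
    """Count non-empty lines in a PLINK .fam or .bim text file."""
    with open(path) as handle:
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    # Assumed location of the test data after the copy step above.
    prefix = os.path.join("test", "testdata")
    if os.path.exists(prefix + ".fam") and os.path.exists(prefix + ".bim"):
        print("individuals: %d" % count_records(prefix + ".fam"))
        print("SNP loci: %d" % count_records(prefix + ".bim"))
    else:
        print("test data not found; run this from the directory that contains test/")
```

For the test data described above, the two counts should come out as 200 and 500.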

Step 4. Set up a fastStructure test run. Executing the code with the provided test data should generate a log file identical to the corresponding one in /opt/fastStructure-1.0/test, as a first check that the source code has been downloaded and compiled correctly. The algorithm scales linearly with the number of samples, the number of loci, and the value of K, so the expected runtime for a new dataset can be estimated from the runtime in that log file.
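
Because the scaling is linear in each of the three quantities, a rough estimate is t_new ≈ t_test × (N_new / N_test) × (L_new / L_test) × (K_new / K_test). The Python sketch below only illustrates that arithmetic. The function name estimate_runtime, the 10-second baseline, and the size of the new dataset are hypothetical placeholders, and the estimate ignores fixed overheads such as file loading.

```python
def estimate_runtime(base_seconds, base_dims, new_dims):
    """Scale a measured runtime linearly in samples, loci, and K.

    base_dims and new_dims are (n_samples, n_loci, k) tuples.
    """
    scale = 1.0
    for base, new in zip(base_dims, new_dims):
        scale *= float(new) / float(base)
    return base_seconds * scale


if __name__ == "__main__":
    # Hypothetical placeholder: substitute the runtime from your own test log.
    measured_seconds = 10.0
    test_dims = (200, 500, 3)    # test data: 200 individuals, 500 SNPs, K = 3
    new_dims = (1000, 50000, 5)  # hypothetical new dataset
    estimate = estimate_runtime(measured_seconds, test_dims, new_dims)
    print("estimated seconds: %.1f" % estimate)
```

Doubling only the number of samples doubles the estimate, and the same holds for the number of loci and for K.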

...

Code Block
$ python /opt/fastStructure-1.0/structure.py -K 2 --input=test/testdata --output=test/testoutput_simple --full --seed=100
$ python /opt/fastStructure-1.0/structure.py -K 3 --input=test/testdata --output=test/testoutput_simple --full --seed=100
$ python /opt/fastStructure-1.0/structure.py -K 4 --input=test/testdata --output=test/testoutput_simple --full --seed=100
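
The three commands above differ only in the value of K, so they can also be generated in a loop. The following is a minimal Python sketch that reuses exactly the flags shown above. The helper name structure_command is hypothetical, and the script only runs fastStructure if structure.py exists at the path used in this tutorial.

```python
import os
import subprocess

STRUCTURE = "/opt/fastStructure-1.0/structure.py"  # path used in this tutorial


def structure_command(k, prefix_in="test/testdata",
                      prefix_out="test/testoutput_simple", seed=100):
    """Build the structure.py command line for one value of K."""
    return [
        "python", STRUCTURE,
        "-K", str(k),
        "--input=" + prefix_in,
        "--output=" + prefix_out,
        "--full",
        "--seed=" + str(seed),
    ]


if __name__ == "__main__":
    for k in (2, 3, 4):
        cmd = structure_command(k)
        print(" ".join(cmd))
        # Only execute on a machine where fastStructure is installed.
        if os.path.exists(STRUCTURE):
            subprocess.check_call(cmd)  # raises if a run fails
```

Changing the tuple (2, 3, 4) is enough to scan a different range of K values.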

Code Block
$ ls test
testdata.bed             testoutput_simple.2.meanP  testoutput_simple.3.log    testoutput_simple.3.varQ   testoutput_simple.4.varP
testdata.bim             testoutput_simple.2.meanQ  testoutput_simple.3.meanP  testoutput_simple.4.log    testoutput_simple.4.varQ
testdata.fam             testoutput_simple.2.varP   testoutput_simple.3.meanQ  testoutput_simple.4.meanP
testoutput_simple.2.log  testoutput_simple.2.varQ   testoutput_simple.3.varP   testoutput_simple.4.meanQ
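
To take a first look at one of these outputs, the sketch below reads a .meanQ file in Python. It assumes the file is plain text with one row per individual and K whitespace-separated columns of mean admixture proportions. That layout is an assumption here, so check a file with head before relying on it. The helper names read_q_matrix and dominant_cluster are hypothetical.

```python
import os


def read_q_matrix(path):
    """Read a whitespace-separated matrix of floats, one row per line."""
    with open(path) as handle:
        return [[float(x) for x in line.split()]
                for line in handle if line.strip()]


def dominant_cluster(row):
    """Return the 0-based index of the largest entry in a row."""
    return max(range(len(row)), key=lambda i: row[i])


if __name__ == "__main__":
    path = os.path.join("test", "testoutput_simple.3.meanQ")
    if os.path.exists(path):
        q = read_q_matrix(path)
        if q:
            print("rows: %d, columns: %d" % (len(q), len(q[0])))
            print("first row sums to %.3f" % sum(q[0]))
            print("dominant cluster of first row: %d" % dominant_cluster(q[0]))
    else:
        print("output file not found; run the K = 3 command first")
```

If the assumed layout holds, the K = 3 output for the test data should have 200 rows and 3 columns.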

...

Code Block
$ python /opt/fastStructure-1.0/distruct.py -K 2 --input=test/testoutput_simple --output=test/testoutput_simple_distruct_K2.pdf
$ python /opt/fastStructure-1.0/distruct.py -K 3 --input=test/testoutput_simple --output=test/testoutput_simple_distruct_K3.pdf
$ python /opt/fastStructure-1.0/distruct.py -K 4 --input=test/testoutput_simple --output=test/testoutput_simple_distruct_K4.pdf

Code Block
$ ls test
testdata.bed               testoutput_simple.2.varP   testoutput_simple.3.varQ   testoutput_simple_distruct_K2.pdf
testdata.bim               testoutput_simple.2.varQ   testoutput_simple.4.log    testoutput_simple_distruct_K3.pdf
testdata.fam               testoutput_simple.3.log    testoutput_simple.4.meanP  testoutput_simple_distruct_K4.pdf
testoutput_simple.2.log    testoutput_simple.3.meanP  testoutput_simple.4.meanQ
testoutput_simple.2.meanP  testoutput_simple.3.meanQ  testoutput_simple.4.varP
testoutput_simple.2.meanQ  testoutput_simple.3.varP   testoutput_simple.4.varQ

...


Step 7. Download the files using Cyberduck.

...