Exploring MAKER-P output with JBrowse in Atmosphere
Tutorial
This tutorial will show you how to examine the output of a MAKER-P run using JBrowse.
This is a draft tutorial - under development: caveat lector
Prerequisites
This tutorial assumes you have the following inputs:
- MAKER-P produced gff files
- Reference Genome
- *nix computing setup that satisfies the JBrowse requirements (See: http://jbrowse.org/install/) - we will be using an Atmosphere instance for this.
Source materials, references, and related tutorials
To complete this tutorial, you will need to:
- Complete a MAKER-P annotation (see the tutorials by J. Stein and S. Subramaniam for running this at iPlant)
- Launch an Atmosphere instance (Make sure you have access to Atmosphere - See documentation)
In addition to the sources above, material for this tutorial is also derived from:
Step one: Launch an Atmosphere instance
For the purpose of this tutorial, we will use the workshop instance: TSW Workshop Williams 1.0
Instructions:
- Log in to Atmosphere https://atmo.iplantcollaborative.org/ (ensure you have access - go to https://user.iplantcollaborative.org/dashboard/ - if Atmosphere is not listed under 'My Services' scroll down to Available Services and request access to Atmosphere).
- Click 'Launch New Instance'
- Under 'Select an Image' search for 'TSW Workshop Williams 1.0'
- If desired, name your instance.
- If you will be using the sample data, you may leave the instance size as 'tiny 1(1 CPUs, 4GB memory, 30G disk)'; if you are working with your own data, choose an appropriate size.
- Click Launch Instance
Launching an instance should take 15-20 minutes; launch times and instance sizes are subject to the capacity of Atmosphere at any given time.
There are several appropriate Atmosphere images for this tutorial, including:
- MAKER-P_2.31_3_JBrowse
Once you have worked through this tutorial, choose the image and instance size appropriate for your dataset.
Step two: Connect to your Atmosphere instance and import sample data and JBrowse
In this tutorial, we have left a lot for you to do, so the steps should closely match setting up and analyzing the data on almost any system.
The status of your Atmosphere instance must be 'active' before you can connect. You will need the IP address of your instance to continue.
1. SSH into your Atmosphere instance; enter your password when prompted
$ ssh your_iplant_username@atmosphere.ip.address
2. Configure iCommands
$ iinit
When prompted, enter the following values to complete the configuration
Prompt | Value |
---|---|
Host name (DNS) of the server to connect to | |
Port number | 1247 |
irods user name | (your iPlant username) |
irods zone | iplant |
Current iRODS password | (your iPlant password) |
3. Import a folder containing the sample data and the JBrowse software from the iPlant Data Store using an iget command; then switch into that directory:
$ iget -rPT /iplant/home/shared/iplant_training/de_maker/maker_viz
$ cd maker_viz
4. Unzip both files in the 'maker_viz' folder:
$ unzip "*.zip"
Step three: Gather all the GFF files from your sample output and place them in their own directory
We don't have to do this now, but since it is only one command, let's go ahead. One of the later configuration scripts needs the GFF files in one place, so we gather them here. For this sample data, we expect the GFF files we are looking for to be named 'Chr...'.
1. Find all the GFF files (at least the ones we care about, e.g. 'Chr*.gff') and copy them to the top level of the 'sample_MAKER-P_v.2.3_output' directory. The pattern is quoted so that find, not the shell, expands it.
$ find ./sample_MAKER-P_v.2.3_output/ -name 'Chr*.gff' | xargs cp -t ./sample_MAKER-P_v.2.3_output/
In this case, we should find 3 GFF files (Chr1sub.gff, Chr2sub.gff, Chr3sub.gff). You can verify this with 'ls ./sample_MAKER-P_v.2.3_output/'
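The collection step can also be tried on a throwaway directory tree before running it on real output. The following is a minimal sketch, not part of the original tutorial: the function name `collect_gffs` and the mock directory layout are hypothetical, and it uses `find -exec cp` with a destination outside the search tree instead of `xargs cp -t`, so that it does not depend on GNU cp and does not re-find its own copies.

```shell
# Hypothetical helper: copy every Chr*.gff found below SRC into DEST.
collect_gffs() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    # The pattern is quoted so that find, not the shell, expands it.
    find "$src" -type f -name 'Chr*.gff' -exec cp {} "$dest"/ \;
}

# Demo on a made-up tree that mimics nested output directories.
demo=$(mktemp -d)
for n in 1 2 3; do
    mkdir -p "$demo/out/datastore/Chr${n}sub"
    printf '##gff-version 3\n' > "$demo/out/datastore/Chr${n}sub/Chr${n}sub.gff"
done

collect_gffs "$demo/out" "$demo/gffs"

# Count what was collected; the tutorial's sample data should give 3 as well.
gff_count=$(ls "$demo/gffs" | wc -l | tr -d ' ')
echo "collected $gff_count GFF files"
```

Checking the count against the number of chromosomes you expect is a quick way to catch a pattern that matched nothing.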
Step four: Set up JBrowse and start the Apache server
1. Move the JBrowse directory to '/var/www/':
$ sudo mv JBrowse-1.11.6/ /var/www/
On your Atmosphere instance, your iPlant password is also your sudo password
Configure ownership to ensure write access:
$ sudo chown $LOGNAME /var/www/JBrowse-1.11.6/
$LOGNAME is your shell username, which on your Atmosphere instance is your iPlant username
2. Change into the new JBrowse directory
$ cd /var/www/JBrowse-1.11.6/
3. Complete the JBrowse setup (Be sure you are in the location '/var/www/JBrowse-1.11.6/' )
$ sudo ./setup.sh
4. Start the Apache server (or make sure it is already running)
$ sudo /usr/sbin/apachectl start
5. Find your host name and note the result of the following command:
$ echo $HOSTNAME
6. Open a web browser or new tab and test your JBrowse setup by entering the following URL:
http://your_host_name_goes_here/JBrowse-1.11.6/index.html?data=sample_data/json/volvox
So for example, if your hostname was 'vm65-209.iplantcollaborative.org' your URL would be:
http://vm65-209.iplantcollaborative.org/JBrowse-1.11.6/index.html?data=sample_data/json/volvox
Turn on some of the tracks in the left-side menu to see the features
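Because the URL follows a fixed pattern, it can be assembled in the shell rather than typed by hand. This is a small sketch under assumptions: the function name `jbrowse_url` is hypothetical, the path `JBrowse-1.11.6` is taken from this tutorial, and `localhost` is used as a fallback if `$HOSTNAME` is not set in your shell.

```shell
# Hypothetical helper: build the JBrowse URL for a host and a data directory.
jbrowse_url() {
    host=$1
    dataset=${2:-data}   # falls back to "data" when no dataset is given
    printf 'http://%s/JBrowse-1.11.6/index.html?data=%s\n' "$host" "$dataset"
}

# Print the URL for the bundled volvox sample data on this machine.
jbrowse_url "${HOSTNAME:-localhost}" sample_data/json/volvox
```

Paste the printed URL into a browser; passing it to `curl -I` from another machine is one way to confirm that the server answers before opening it.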
Step five: Prepare your MAKER-P output for visualization using JBrowse
We will run two scripts on the MAKER-P data, one to prepare the selected GFF files and another to prepare the genome. These scripts are part of the standard JBrowse download.
1. Still in the '/var/www/JBrowse-1.11.6/' directory, run the maker2jbrowse script on the GFF files we copied to the top level of the 'sample_MAKER-P_v.2.3_output' folder (this will take a minute to run).
$ ./bin/maker2jbrowse ~/maker_viz/sample_MAKER-P_v.2.3_output/*.gff
By default, the output is written to a directory called '/var/www/JBrowse-1.11.6/data'; you can specify a different output directory using the '-o' option.
2. Prepare the reference genome (which you used in MAKER-P, and which has been provided in the sample data) using the prepare-refseqs.pl script.
$ ./bin/prepare-refseqs.pl --fasta ~/maker_viz/sample_MAKER-P_v.2.3_output/test_genome.fasta
3. Copy the reference genome and the MAKER-P output to the JBrowse data folder:
$ cp ~/maker_viz/sample_MAKER-P_v.2.3_output/test_genome.fasta ./data
$ cp -r ~/maker_viz/sample_MAKER-P_v.2.3_output/test_genome.maker.output ./data
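Before opening the browser, it can help to confirm that both scripts wrote what JBrowse expects. The sketch below is an assumption-laden check, not part of the JBrowse distribution: the function name `check_jbrowse_data` is hypothetical, and the list of entries (`seq/refSeqs.json`, `trackList.json`, `tracks`) reflects the usual layout of a JBrowse 1.x data directory, which may differ on your version. The demo runs against a mock directory that is deliberately incomplete.

```shell
# Hypothetical helper: report entries missing from a JBrowse data directory.
# Returns success only when nothing is missing.
check_jbrowse_data() {
    dir=$1
    missing=0
    # Assumed layout: reference sequences, track configuration, track data.
    for entry in seq/refSeqs.json trackList.json tracks; do
        if [ ! -e "$dir/$entry" ]; then
            echo "missing: $dir/$entry"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Demo: a mock data directory that lacks trackList.json.
mock=$(mktemp -d)
mkdir -p "$mock/seq" "$mock/tracks"
: > "$mock/seq/refSeqs.json"

if check_jbrowse_data "$mock"; then
    check_result="complete"
else
    check_result="incomplete"
fi
echo "$check_result"
```

On the Atmosphere instance, the equivalent call would be `check_jbrowse_data /var/www/JBrowse-1.11.6/data`.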
Step six: View the sample data in JBrowse
Using the URL pattern from Step four (item 6), load your data in a web browser:
http://your_host_name_goes_here/JBrowse-1.11.6/index.html?data=data
Here, 'data' is the name of the output folder used by the maker2jbrowse script (called 'data' by default).
For example:
http://vm65-209.iplantcollaborative.org/JBrowse-1.11.6/index.html?data=data
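If a track in the browser looks empty or unexpected, a quick tally of what the GFF files contain can help narrow things down. The following sketch counts features by source (column 2) and type (column 3) of a GFF3 file; the function name `gff_summary` and the three sample records are made up for illustration, and the sketch makes no claim about how maker2jbrowse maps these features to tracks.

```shell
# Hypothetical helper: tally GFF3 features by source (column 2) and type (column 3).
gff_summary() {
    awk -F '\t' '
        FNR == 1     { in_fasta = 0 }   # reset at the start of each input file
        /^##FASTA/   { in_fasta = 1 }   # an embedded sequence section follows
        in_fasta || /^#/ || NF < 9 { next }
        { count[$2 "\t" $3]++ }
        END { for (k in count) print count[k] "\t" k }
    ' "$@" | sort -t "$(printf '\t')" -k2,2 -k3,3
}

# Demo on a tiny made-up GFF3 file with an embedded FASTA section.
sample=$(mktemp)
printf 'Chr1\tmaker\tgene\t1\t100\t.\t+\t.\tID=g1\n' > "$sample"
printf 'Chr1\tmaker\tmRNA\t1\t100\t.\t+\t.\tID=m1;Parent=g1\n' >> "$sample"
printf 'Chr1\tmaker\tgene\t200\t300\t.\t-\t.\tID=g2\n' >> "$sample"
printf '##FASTA\n>Chr1\nACGT\n' >> "$sample"

gff_summary "$sample"
```

With the sample data from this tutorial, the same function could be run as `gff_summary ~/maker_viz/sample_MAKER-P_v.2.3_output/Chr*.gff`.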