K) Map RNA-Seq reads to transcripts

Map RNA-Seq reads to transcripts (app: Bowtie-2.2.1--Build-and-Map)

Description: Although not strictly necessary, RNA-Seq studies benefit from mapping transcript reads to a reference. Because this method uses a transcriptome as the reference, the mapping tool does not need to accommodate intron sequences. Bowtie 2.2.1 is a fast mapping tool appropriate for this study. Documentation: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml.

  1. Log into the Discovery Environment: https://de.iplantcollaborative.org/de/.
  2. Open the Bowtie-2.2.1--Build-and-Map app (Public Applications > NGS > Aligners > Bowtie-2.2.1--Build-and-Map).
    1. Change 'Analysis Name' to Map_Control01_RNA_Reads, add a 'Description' (optional), and use the default 'output folder'.
  3. Click on the Reference Index tab.
    1. Click on the 'Reference fasta' field. Browse to the folder that contains the fasta output file created in Section J (Annotate transcripts) (Sample data: Community Data > iplant_training > rna-seq_without_genome > K_map_rnaseq_reads_to_transcripts > BA_transcriptome_annotated.fasta). Select the file, then click OK.
    2. Click on the 'Prefix' field. Enter 'BAtrnscrpts_annotated' for the prefix name.
  4. Click on the Inputs tab.
    1. Browse to a folder containing Illumina sequencing reads. Enter a pair of reads into the fields for read sequences, “Reads1” and “Reads2” (Sample data: Community Data > iplant_training > rna-seq_without_genome > K_map_rnaseq_reads_to_transcripts > BAcontrol > SRR566981.sra_1.fastq and SRR566981.sra_2.fastq, the first pair of reads for the control condition).
    2. Click on the 'Output File' field. Change the name of the output file to 'BAcon01.sam'.
  5. Click on the Options tab.
    1. Check the box for phred64.
    2. Enter 200 in the 'Minimum fragment length' field and 600 in the 'Maximum fragment length' field.
  6. Click on "Launch Analysis".
  7. Repeat this analysis for any remaining Illumina reads (Sample data: repeat for the remaining read pairs in the BAcontrol folder).
    1. To save time, enter the indexed reference file (BAtrnscrpts_annotated.tar) in the 'Previously created Index archive' field instead of using the reference fasta file.
    2. Change the 'Analysis Name' accordingly (e.g. Map_Control02_RNA_Reads).
    3. Change the output file names to match the inputs (e.g. BAcon02.sam, BAcon03.sam).
  8. Click on 'Analyses' from the DE workspace and monitor the 'Status' of the analysis (e.g., Idle, Submitted, Pending, Running, Completed, Failed).
    1. Once launched, an analysis will continue whether the user remains logged in or not.
    2. Email notifications report the progress of the analysis; they can be switched off under 'Preferences'.
    3. If the analysis fails or does not finish within the expected time, check these tips for troubleshooting. (Using the sample data, the analysis should be complete in less than 15 minutes.)
    4. To re-run an analysis, click the "App" name for that analysis in the 'Analyses' window.
  9. Access analysis results in one of two ways:
    1. In the 'Analyses' window click on the analysis "Name" to open the output folder.
    2. In the 'Data' window, click on your user name, then navigate to the folder that holds the output of the analysis. (Find the output for the sample at Community Data > iplant_training > rna-seq_without_genome > K_map_rnaseq_reads_to_transcripts > output_from_sample_data.)
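
For readers who want to see how the settings in the steps above relate to the Bowtie 2 command line, here is a minimal Python sketch. It only assembles and prints the commands; it does not run them. The exact commands the Discovery Environment app issues are not documented on this page, so treat the mapping from form fields to flags as an assumption. The function name `bowtie2_commands` is hypothetical; the flags themselves (`-x`, `-1`, `-2`, `-I`, `-X`, `-S`, `--phred64`) are standard Bowtie 2 options.

```python
import shlex


def bowtie2_commands(reference_fasta, index_prefix, reads1, reads2, output_sam,
                     min_fragment=200, max_fragment=600, phred64=True):
    """Return (build_cmd, map_cmd) as argument lists. Nothing is executed."""
    # 'Reference Index' tab: build an index from the reference fasta file.
    build_cmd = ["bowtie2-build", reference_fasta, index_prefix]

    # 'Inputs' and 'Options' tabs: paired-end mapping with fragment length limits.
    map_cmd = ["bowtie2"]
    if phred64:
        map_cmd.append("--phred64")       # 'phred64' check box
    map_cmd += [
        "-x", index_prefix,               # index prefix from the build step
        "-1", reads1,                     # 'Reads1'
        "-2", reads2,                     # 'Reads2'
        "-I", str(min_fragment),          # 'Minimum fragment length'
        "-X", str(max_fragment),          # 'Maximum fragment length'
        "-S", output_sam,                 # 'Output File'
    ]
    return build_cmd, map_cmd


# Values taken from the sample data used in the steps above.
build_cmd, map_cmd = bowtie2_commands(
    "BA_transcriptome_annotated.fasta",
    "BAtrnscrpts_annotated",
    "SRR566981.sra_1.fastq",
    "SRR566981.sra_2.fastq",
    "BAcon01.sam",
)
print(shlex.join(build_cmd))
print(shlex.join(map_cmd))
```

Because the index is built once and reused, only the second command would change for the remaining read pairs, which corresponds to supplying the previously created index archive in step 7.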

Note:

  • Sample data: repeat this analysis for all 6 Illumina read files (3 read pairs) for the dehydrated condition (Community Data > iplant_training > rna-seq_without_genome > K_map_rnaseq_reads_to_transcripts > BAdehyd).
    • Change the 'Analysis Name' accordingly (e.g. Map_Dehyd01_RNA_Reads, Map_Dehyd02_RNA_Reads, Map_Dehyd03_RNA_Reads).
    • Name the output files to match the inputs (e.g. BAdehyd01.sam, BAdehyd02.sam, BAdehyd03.sam). All other parameters remain the same.
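
The naming scheme used for the repeated analyses can be generated rather than typed by hand. The sketch below is illustrative only: the helper `analysis_plan` is hypothetical, and it assumes three read pairs per condition, as the example names above suggest.

```python
def analysis_plan(condition_label, file_stem, n_pairs):
    """Return (analysis_name, output_file) tuples numbered 01..n_pairs."""
    return [
        (f"Map_{condition_label}{i:02d}_RNA_Reads", f"{file_stem}{i:02d}.sam")
        for i in range(1, n_pairs + 1)
    ]


# Assumption: three read pairs for each of the two conditions.
plan = analysis_plan("Control", "BAcon", 3) + analysis_plan("Dehyd", "BAdehyd", 3)
for analysis_name, output_file in plan:
    print(analysis_name, output_file)
```

Keeping the analysis name and the output file name on the same numbering makes it easier to match each SAM file to its input read pair later.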
