RMTA v2.1
The DE Quick Start tutorial provides an introduction to basic DE functionality and navigation.
Please work through the documentation and add your comments at the bottom of this page, or email comments to support@cyverse.org. Thank you.
Rationale and background:
RMTA is a workflow that rapidly processes raw Illumina RNA-seq data by mapping reads with HISAT2 and then assembling transcripts with either Cufflinks or StringTie. RMTA can process FASTQ files containing paired-end or single-end reads. Alternatively, RMTA can directly process one or more Sequence Read Archive (SRA) entries from NCBI, identified by SRA ID.
RMTA minimally requires the following input data:
Reference Genome (FASTA) or HISAT2-indexed Reference Genome (in a subdirectory)
Reference Transcriptome (GFF3/GTF/GFF)
RNA-Seq reads (FASTQ): single-end or paired-end, or a single NCBI SRA ID, or multiple NCBI SRA IDs (listed in a single-column text file).
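The multi-SRA input is described above as a single-column text file. As a minimal sketch of how such a file could be read and sanity-checked before submitting a job, the Python below may help. The function name read_sra_ids and the accession pattern (SRR, ERR, or DRR followed by digits) are assumptions made for illustration and are not part of RMTA.

```python
import re
from typing import Iterable, List

# Assumption: IDs are run accessions such as SRR..., ERR..., or DRR...
SRA_RUN_PATTERN = re.compile(r"^[SED]RR\d+$")


def read_sra_ids(lines: Iterable[str]) -> List[str]:
    """Return the unique SRA IDs from a single-column listing, in file order."""
    ids: List[str] = []
    seen = set()
    for line_number, raw in enumerate(lines, start=1):
        sra_id = raw.strip()
        if not sra_id:
            continue  # ignore blank lines
        if not SRA_RUN_PATTERN.match(sra_id):
            raise ValueError(f"line {line_number}: unexpected ID {sra_id!r}")
        if sra_id not in seen:  # drop repeated IDs, keep first occurrence
            seen.add(sra_id)
            ids.append(sra_id)
    return ids
```

Because the function accepts any iterable of lines, it can be called on an open file handle, for example read_sra_ids(open("sra_ids.txt")), where sra_ids.txt is a hypothetical file name.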
Pre-Requisites
A CyVerse account (register at user.cyverse.org).
Mandatory arguments
HISAT2 reference genome: Select at least one of the three options below for indexing the reference genome
Custom Reference genome
Select reference genome from the list
HISAT2 indexed folder
HISAT2 reference annotation: Select at least one of the two options below to use as the annotation
Custom Reference annotation
Select reference annotation from the list
Use one of the following three:
Paired-end reads
FASTQ Files (Read 1): Input file(s) containing read 1 of the paired-end data
FASTQ Files (Read 2): Input file(s) containing read 2 of the paired-end data
Single-end reads
Single-end FASTQ files
SRA
SRA ID: The single SRA ID that you want to use
File containing SRA IDs: A file listing the multiple SRA IDs that you want to use
Cufflinks/StringTie: Check only one of the two options below; you cannot select both
StringTie
Cufflinks
Coverage cut-off threshold: Select a value from 0 to 5
FPKM cut-off threshold: FPKM cut-off you want to use to filter the transcripts
Cuffmerge: Run Cuffmerge on the StringTie/Cufflinks GTFs (only works with more than one sample file)
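The constraints among the mandatory arguments can be summarized as a small consistency check. The sketch below is illustrative only: the function check_selections and its parameter names are hypothetical and do not correspond to RMTA's own option names, and it reads "use one of the following three" as "exactly one".

```python
from typing import List, Optional


def check_selections(
    paired_end: bool = False,
    single_end: bool = False,
    sra: bool = False,
    stringtie: bool = False,
    cufflinks: bool = False,
    coverage_cutoff: Optional[float] = None,
    run_cuffmerge: bool = False,
    sample_count: int = 1,
) -> List[str]:
    """Return a list of problems; an empty list means the choices are consistent."""
    problems: List[str] = []
    # Reads come from one source: paired-end, single-end, or SRA.
    if sum([paired_end, single_end, sra]) != 1:
        problems.append("choose exactly one read source")
    # StringTie and Cufflinks cannot both be selected.
    if sum([stringtie, cufflinks]) != 1:
        problems.append("choose exactly one assembler")
    # The coverage cut-off is selected from 0 to 5.
    if coverage_cutoff is not None and not 0 <= coverage_cutoff <= 5:
        problems.append("coverage cut-off must be between 0 and 5")
    # Cuffmerge only works with more than one sample.
    if run_cuffmerge and sample_count < 2:
        problems.append("Cuffmerge needs more than one sample")
    return problems
```

For example, check_selections(paired_end=True, stringtie=True) returns an empty list, while selecting both assemblers returns a list containing one problem message.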
Advanced options
Phred quality score: Encoding of the quality scores; select Phred64 if your reads use it (default is Phred33)
Fragment Library Type: Specify the format of the library, e.g. FR, RF, F, or R
Trim bases from 5' end of read: Trim bases from 5' (left) end of each read before alignment
Trim bases from 3' end of read: Trim bases from 3' (right) end of each read before alignment
Minimum intron length: Set minimum intron length
Maximum intron length: Set maximum intron length
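To make the Phred encoding and trimming options concrete, here is a minimal Python sketch of what they mean for a single read. It is an illustration under stated assumptions, not how RMTA or HISAT2 implements them: the helper names decode_phred and trim_read are hypothetical. The only facts relied on are that Phred+33 and Phred+64 encodings store each quality score as the character whose ASCII code is the score plus 33 or 64.

```python
from typing import List, Tuple


def decode_phred(quality: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string to numeric scores (offset 33 or 64)."""
    if offset not in (33, 64):
        raise ValueError("offset must be 33 or 64")
    return [ord(ch) - offset for ch in quality]


def trim_read(sequence: str, quality: str,
              trim5: int = 0, trim3: int = 0) -> Tuple[str, str]:
    """Remove trim5 bases from the 5' (left) end and trim3 from the 3' (right) end."""
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    if trim5 < 0 or trim3 < 0:
        raise ValueError("trim lengths must be non-negative")
    # Clamp at zero so over-trimming yields an empty read, not a wrapped slice.
    end = max(len(sequence) - trim3, 0)
    return sequence[trim5:end], quality[trim5:end]
```

The same character therefore decodes to different scores under the two encodings, which is why choosing the wrong encoding distorts the quality values seen by the aligner.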
Test/sample data
The following test data for RMTA are provided in /iplant/home/shared/iplantcollaborative/example_data/RMTA
Reference Genome: Sorghum_bicolor.Sorbi1.20.dna.toplevel_chr8.fa
Reference Annotation: Sorghum_bicolor.Sorbi1.20_chr8.gtf
Left reads: sample_1_R1.fq.gz
Right reads: sample_1_R2.fq.gz
StringTie
Fragment Library Type: FR
Leave the rest as default
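For reference, the test settings above can be collected in one place. The sketch below simply joins the shared directory with the listed file names; the dictionary keys and the function test_data_paths are hypothetical labels chosen for readability, not RMTA parameter names.

```python
import posixpath
from typing import Dict

TEST_DATA_DIR = "/iplant/home/shared/iplantcollaborative/example_data/RMTA"

# Hypothetical labels mapped to the file names listed in this section.
TEST_FILES: Dict[str, str] = {
    "reference_genome": "Sorghum_bicolor.Sorbi1.20.dna.toplevel_chr8.fa",
    "reference_annotation": "Sorghum_bicolor.Sorbi1.20_chr8.gtf",
    "left_reads": "sample_1_R1.fq.gz",
    "right_reads": "sample_1_R2.fq.gz",
}

# Non-file settings for the test run; everything else stays at its default.
TEST_SETTINGS: Dict[str, str] = {
    "assembler": "StringTie",
    "fragment_library_type": "FR",
}


def test_data_paths(directory: str = TEST_DATA_DIR) -> Dict[str, str]:
    """Return the full Data Store path for each test file."""
    return {label: posixpath.join(directory, name)
            for label, name in TEST_FILES.items()}
```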
Results
Successful execution of RMTA generates two output folders:
Index: This folder contains the index of the genome
Output: This folder contains the output from HISAT2, StringTie, and Cuffcompare (please refer to each program's manual for an explanation of its outputs)