...
Rationale and background:
Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms (Nature Biotechnology, doi:10.1038/nbt.2862).
...
Pre-Requisites (for both versions 1.1b and 2.0)
- A CyVerse account. (Register for a CyVerse account at user.cyverse.org.)
- Mandatory arguments
- Transcript file name (in fasta format)
- FASTQ files (either SE or PE reads)
- Fragment Library Type (specify the format of the library; more details at http://sailfish.readthedocs.io/en/master/library_type.html)
- File type (enter whether the library is paired-end or single-end)
- Optional arguments
- Number of bootstraps (this option takes a positive integer that sets the number of bootstrap samples to compute. The more samples computed, the better the estimates of variance, but the more computation and time required)
- Number of Gibbs samples (this option also produces samples that allow the variance in abundance estimates to be estimated. In this case, however, the samples are generated using posterior Gibbs sampling over the fragment equivalence classes rather than bootstrapping)
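For readers who want to see how the arguments above relate to a command-line run, here is a minimal Python sketch that assembles a `sailfish quant` argument list. The helper name `build_quant_command` is hypothetical. The flags shown (`-i`, `-l`, `-r`, `-1`, `-2`, `-o`, `--numBootstraps`, `--numGibbsSamples`) are assumed to match the Sailfish release wrapped by the app, and the library type string `IU` is only a placeholder. Check both against the Sailfish documentation before relying on them. The sketch only builds the command; it does not run Sailfish.

```python
from typing import List, Optional


def build_quant_command(
    index_dir: str,
    lib_type: str,
    out_dir: str,
    reads_1: str,
    reads_2: Optional[str] = None,
    num_bootstraps: int = 0,
    num_gibbs_samples: int = 0,
) -> List[str]:
    """Assemble an argv list for `sailfish quant` (hypothetical helper).

    Flag names are assumptions based on the Sailfish command-line
    documentation; verify them with `sailfish quant --help`.
    """
    if num_bootstraps < 0 or num_gibbs_samples < 0:
        raise ValueError("sample counts must be non-negative integers")

    cmd = ["sailfish", "quant", "-i", index_dir, "-l", lib_type]

    # File type: single-end reads use -r, paired-end reads use -1 and -2.
    if reads_2 is None:
        cmd += ["-r", reads_1]
    else:
        cmd += ["-1", reads_1, "-2", reads_2]

    # Optional arguments are added only when a positive count is requested.
    if num_bootstraps:
        cmd += ["--numBootstraps", str(num_bootstraps)]
    if num_gibbs_samples:
        cmd += ["--numGibbsSamples", str(num_gibbs_samples)]

    cmd += ["-o", out_dir]
    return cmd


if __name__ == "__main__":
    # "IU" is a placeholder library type, not a recommendation.
    example = build_quant_command(
        "index", "IU", "reads_1", "reads_1.fq", "reads_2.fq", num_bootstraps=100
    )
    print(" ".join(example))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed to `subprocess.run` without shell quoting issues.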
The following test data are provided for testing Sailfish_align_qauntquant-0.9.2 at /iplant/home/shared/iplantcollaborative/example_data/Salmon:
- Transcript file - transcripts.fa
- FASTQ files - reads_1.fq and reads_2.fq
Run Sailfish_align_qauntquant-0.9.2 on the FASTQ files (reads_1.fq and reads_2.fq) using 'transcripts.fa'.
Results
Successful execution of Sailfish_align_qauntquant-0.9.2 will create a directory named reads_1. The directory will contain several files and directories:
- logs
- Index
- reads_1
- quant.sf: When the quantification step is finished, the directory <quant_dir> will contain a file named “quant.sf” (and, if bias correction is enabled, an additional file named “quant_bias_corrected.sf”). This file contains the result of the Sailfish quantification step, arranged in a number of columns (which are listed in the last of the header lines beginning with ‘#’). Specifically, the columns are (1) Transcript ID, (2) Transcript Length, (3) Transcripts per Million (TPM) and (6) Estimated number of reads (an estimate of the number of reads drawn from this transcript given the transcript’s relative abundance and length).
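The layout described above can be read with a few lines of Python. This is a minimal sketch with two assumptions: the file is tab-separated, and the column names appear on the last line beginning with '#', as described above. The function name `read_quant_sf` and the column names in the usage note are illustrative, not part of Sailfish. Use whatever names the header of your own file lists.

```python
from typing import Dict, List


def read_quant_sf(path: str) -> List[Dict[str, str]]:
    """Read a quant.sf file into a list of {column name: value} rows.

    Assumes a tab-separated file in which the last line starting with '#'
    holds the column names (hypothetical helper, not part of Sailfish).
    """
    header = None
    rows: List[Dict[str, str]] = []
    with open(path) as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if not line:
                continue
            if line.startswith("#"):
                # Each '#' line replaces the previous one, so the last
                # comment line before the data supplies the column names.
                header = line.lstrip("#").strip().split("\t")
                continue
            if header is None:
                raise ValueError("no '#' header line found before the data rows")
            rows.append(dict(zip(header, line.split("\t"))))
    return rows
```

Values are returned as strings. Convert a column when needed, for example `float(row["TPM"])` if your header names the TPM column `TPM`.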
More information on the tool can be found here - http://sailfish.readthedocs.io/en/master/index.html
...