Transcript decoder 1.0

Finds open reading frames in transcripts and helps in their evaluation.

Quick Start

Test Data

Test data for this app appears directly in the Discovery Environment in the Data window under Community Data -> iplantcollaborative -> example_data -> Transcript_decoder

Input File(s)

Use testtranscripts.fasta from the directory above as test input.

Parameters Used in App

When the app is run in the Discovery Environment, use the following parameters with the above input file(s) to get the output listed in the next section.

  • Use these parameters within the DE app interface:
    • Minimum ORF size - 300
    • Genetic code - universal
    • Minimum protein length - 50
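To illustrate what a minimum ORF size threshold means in practice, here is a minimal Python sketch that scans the three forward reading frames of a sequence for ATG-to-stop open reading frames and keeps only those at least as long as the threshold. It is a simplified illustration, not Transcript decoder's own algorithm: it ignores the reverse strand, partial ORFs, and any scoring. The function name find_orfs and the choice to count the stop codon toward the ORF length are assumptions made for this example.

```python
# Stop codons of the universal (standard) genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_orf_size=300):
    """Return (start, end, frame) tuples for ATG..stop ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive; the stop codon is included
    in the reported span (an assumption made for this sketch).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start >= min_orf_size:
                    orfs.append((start, end, frame))
                start = None  # look for the next ORF in this frame
    return orfs
```

With min_orf_size=300, a transcript yields an entry only when a start codon is followed in the same frame by at least 300 nucleotides up to and including a stop codon; shorter ORFs are discarded.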

Output File(s)

Expect multiple output files. For the test case, the output files in the example_data directory are:

  • base_freqs.dat
  • best_candidates.eclipsed_orfs_removed.bed
  • best_candidates.eclipsed_orfs_removed.cds
  • best_candidates.eclipsed_orfs_removed.gff3
  • best_candidates.eclipsed_orfs_removed.pep
  • best_candidates.gff3
  • hexamer.scores
  • longest_orfs.cds
  • longest_orfs.cds.scores
  • longest_orfs.cds.scores.selected
  • longest_orfs.cds.top_500_longest
  • longest_orfs.gff3
  • longest_orfs.gff3.inx
  • longest_orfs.pep

The most commonly used files are those ending in .cds, which contain the coding sequences found in the transcript contigs, and those ending in .pep, which contain the peptide translations of the corresponding .cds files. Files whose names start with "longest_orfs" contain the open reading frame sequences that meet the criteria set when Transcript decoder was run, e.g. a minimum ORF size of 300 bp. Files whose names start with "best_candidates.eclipsed_orfs_removed" are the longest_orfs files with redundant smaller sequences removed.
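As a quick way to inspect a .cds or .pep output file, the following Python sketch reads FASTA-formatted lines and reports the length of each sequence. It assumes the files are ordinary FASTA (a ">" header line followed by sequence lines); the helper names read_fasta and sequence_lengths are made up for this example and are not part of Transcript decoder.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sequence_lengths(lines):
    """Map each record ID (first word of its header) to its sequence length."""
    lengths = {}
    for header, seq in read_fasta(lines):
        seq_id = (header.split() or [""])[0]
        lengths[seq_id] = len(seq)
    return lengths


# Example use on a downloaded output file (file name taken from the list above):
# with open("longest_orfs.pep") as handle:
#     for seq_id, length in sequence_lengths(handle).items():
#         print(seq_id, length)
```

Note that a trailing "*" stop symbol in a peptide sequence, if present, is counted in the reported length.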

Tool Source for App