Cuffcompare-2.2.1

This app runs Cuffcompare (version 2.2.1) to:

  • Compare your assembled transcripts to a reference annotation
  • Track Cufflinks transcripts across multiple experiments (e.g. across a time course)

App Creator

Amanda Cooksey

Quick Start

  • Cuffcompare 2.2.1 takes Cufflinks’ GTF output as input and can optionally take a “reference” annotation.

Test Data

Test data for this app is available directly in the Discovery Environment, in the Data window under Community Data -> iplantcollaborative -> example_data -> cuffcompare.

Input File(s)

Use the flower4_transcripts.gtf and flower6-7_transcripts.gtf files from the cuffcompare directory for an example run. A reference annotation file is optional. To supply one, either upload your own in the 'Custom Annotation File' field or select one from the drop-down menu under 'Reference Annotation File'. For the example data, choose Arabidopsis thaliana (Ensembl 14) from the 'Reference Annotation File' drop-down menu.
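For readers who want to relate the app's input fields to a command-line run, the following Python sketch assembles an equivalent cuffcompare argument list. It is only an illustration: the helper name build_cuffcompare_command and all file names are hypothetical, and the Discovery Environment wrapper may pass other options. The -r (reference annotation) and -o (output prefix) flags are the ones documented for the cuffcompare command line.

```python
from typing import List, Optional


def build_cuffcompare_command(
    gtf_files: List[str],
    out_prefix: str = "cuffcmp",
    reference_gtf: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for a command-line cuffcompare call.

    Hypothetical helper for illustration; it only builds the list and
    does not run anything.
    """
    if not gtf_files:
        raise ValueError("at least one Cufflinks GTF file is required")
    command = ["cuffcompare", "-o", out_prefix]
    if reference_gtf is not None:
        # -r supplies the optional reference annotation
        command += ["-r", reference_gtf]
    # The Cufflinks GTF files to compare come last
    return command + list(gtf_files)


command = build_cuffcompare_command(
    ["sample_A_transcripts.gtf", "sample_B_transcripts.gtf"],  # hypothetical names
    reference_gtf="reference_annotation.gtf",  # hypothetical name
)
print(" ".join(command))
```

The resulting list could be handed to subprocess.run on a machine where cuffcompare is installed. Leaving reference_gtf as None corresponds to running the app without a reference annotation.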

Parameters Used in App

When the app is run in the Discovery Environment, use the following parameters with the above input file(s) to get the output provided in the section below.

Leave all parameters as default.

Output File(s)

Cuffcompare produces four main output files (descriptions taken directly from the Cuffcompare user manual):
1) <outprefix>_combined.gtf

  • Cuffcompare reports a GTF file containing the “union” of all transfrags in each sample. If a transfrag is present in both samples, it is reported only once in the combined GTF.

2) <cuff_in>.refmap

  • This tab-delimited file lists, for each reference transcript, which Cufflinks transcripts either fully or partially match it. There is one row per reference transcript.

3) <cuff_in>.tmap

  • This tab-delimited file lists the most closely matching reference transcript for each Cufflinks transcript. There is one row per Cufflinks transcript.
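Because the .tmap file is tab-delimited with one row per Cufflinks transcript, it is easy to summarise with a short script. The sketch below tallies transcripts by class code. It assumes the file starts with a header line containing a class_code column, as described in the Cuffcompare manual; check the header of your own file before relying on it. The function names and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, List


def read_tmap(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse tab-delimited .tmap content into one dict per row.

    The first line is treated as the header, so column names come from
    the file itself rather than being hard-coded.
    """
    return list(csv.DictReader(lines, delimiter="\t"))


def count_class_codes(rows: List[Dict[str, str]], column: str = "class_code") -> Counter:
    """Count how many Cufflinks transcripts fall under each class code."""
    return Counter(row[column] for row in rows)


# Made-up rows for illustration only; a real .tmap has more columns.
sample_tmap = (
    "ref_gene_id\tref_id\tclass_code\tcuff_gene_id\tcuff_id\n"
    "GENE1\tTX1\t=\tCUFF.1\tCUFF.1.1\n"
    "GENE1\tTX1\tj\tCUFF.1\tCUFF.1.2\n"
    "-\t-\tu\tCUFF.2\tCUFF.2.1\n"
)

rows = read_tmap(io.StringIO(sample_tmap))
class_code_counts = count_class_codes(rows)
print(class_code_counts)
```

To use this on a real output file, pass an open file handle to read_tmap in place of the io.StringIO object.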

4) <outprefix>.tracking

  • This file matches transcripts up between samples. Each row contains a transcript structure that is present in one or more input GTF files. Because the transcripts will generally have different IDs (unless you assembled your RNA-Seq reads against a reference transcriptome), Cuffcompare examines the structure of each of the transcripts, matching transcripts that agree on the coordinates and order of all of their introns, as well as strand.
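The matching rule described above (same intron coordinates, same intron order, same strand) can be illustrated with a small Python sketch. This is a simplified illustration of the idea, not Cuffcompare's actual implementation: the function names, the data layout, and the sample transcripts are all assumptions, and single-exon transcripts are simply skipped here because they have no introns.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive as in GTF
Transcript = Tuple[str, str, List[Interval]]  # (chromosome, strand, exons)
ChainKey = Tuple[str, str, Tuple[Interval, ...]]


def intron_chain_key(chrom: str, strand: str, exons: List[Interval]) -> ChainKey:
    """Build a key from chromosome, strand, and the ordered intron list."""
    ordered = sorted(exons)
    # Each intron spans the gap between two consecutive exons
    introns = tuple(
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
    )
    return (chrom, strand, introns)


def match_across_samples(
    samples: Dict[str, Dict[str, Transcript]]
) -> Dict[ChainKey, List[Tuple[str, str]]]:
    """Group (sample, transcript ID) pairs that share an intron chain."""
    groups = defaultdict(list)
    for sample_name, transcripts in samples.items():
        for transcript_id, (chrom, strand, exons) in transcripts.items():
            if len(exons) < 2:
                continue  # no introns to compare; not handled in this sketch
            key = intron_chain_key(chrom, strand, exons)
            groups[key].append((sample_name, transcript_id))
    return dict(groups)


# Made-up transcripts: the two "+" strand transcripts differ only in
# their outer exon boundaries, so their intron chains are identical.
samples = {
    "sample_A": {
        "CUFF.1.1": ("chr1", "+", [(100, 200), (300, 400), (500, 600)]),
    },
    "sample_B": {
        "CUFF.7.1": ("chr1", "+", [(120, 200), (300, 400), (500, 650)]),
        "CUFF.8.1": ("chr1", "-", [(100, 200), (300, 400)]),
    },
}

matched = match_across_samples(samples)
for key, members in matched.items():
    print(key, members)
```

In this example CUFF.1.1 and CUFF.7.1 end up in the same group even though their IDs and outer exon boundaries differ, while CUFF.8.1 stays separate because it lies on the opposite strand.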

In the directory Community Data -> iplant_training -> intro_rna-seq -> 03_cufflinks, you will see a directory for each of the selected BAM files used as inputs. These directories also contain a "skipped.gtf" file.

Tool Source for App