Gfold 1.1.1 Count

Creates a count file for use with Gfold 1.1.1 Difference Expression. GFOLD is useful when no replicates are available.

gfold - Generalized fold change for ranking differentially expressed genes from RNA-seq data.

"GFOLD is especially useful when no replicate is available. GFOLD generalizes the fold change by considering the posterior distribution of log fold change, such that each gene is assigned a reliable fold change. It overcomes the shortcoming of p-value that measures the significance of whether a gene is differentially expressed under different conditions instead of measuring relative expression changes, which are more interesting in many studies. It also overcomes the shortcoming of fold change that suffers from the fact that the fold change of genes with low read count are not so reliable as that of genes with high read count, even these two genes show the same fold change."

Source: Feng J, Meyer CA, Wang Q, Liu JS, Liu XS, Zhang Y. GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data. Bioinformatics 2012

Test data for this app appears directly in the Discovery Environment in the Data window under Community Data -> iplantcollaborative -> example_data -> Gfold

The following files will be found:
a) WT_rep1.sam
b) hy5_rep1.sam
c) WT_rep1.read_cnt
d) hy5_rep1.read_cnt
e) WT_vs_hy5_output.diff
f) WT_vs_hy5_output.diff.ext

Quick Start

To use Gfold 1.1.1 Count, please have the following files available:
A gene/genome annotation file in one of these formats: GTF, GPF, or BED
A Sequence Alignment/Map file in SAM format
If you have a BAM file (the binary version of SAM) from Tophat2 - Single End or Tophat2 - Paired End, please use SAMTOOLS-0.1.19 BAM-to-SAM to convert it to SAM format

Resources: Bitbucket and Manual
Publication: GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data

Example Test Data

In this example, we will use the sample data from the RNA-Seq Tutorial.
The RNA-Seq reads that we will be working with are from Arabidopsis thaliana.

A. Prior to using Gfold 1.1.1 Count:

1) Please follow /wiki/spaces/eot/pages/241585220 - Step 1 to align the RNA-Seq reads using Tophat2

2) Convert the BAM outputs to SAM format. The files are located within the Data window under:

Community Data -> iplant -> home -> shared -> iplant_training -> intro_rna-seq -> 02_tophat -> bam

(Please run SAMTOOLS-0.1.19 BAM-to-SAM on the following files)

2a) *Input file: WT_rep1.bam

Output file name: WT_rep1.sam

2b) *Input file: hy5_rep1.bam

Output file name: hy5_rep1.sam
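
Outside the Discovery Environment, the same conversion can be sketched with the samtools command line. This is a minimal sketch, assuming `samtools` is installed and on the PATH and that the two BAM files are in the working directory; the helper name `bam_to_sam_command` is hypothetical and is not part of the app. Whether to keep the SAM header (`-h`) is left as an option here, since this page does not say which the app uses.

```python
import subprocess


def bam_to_sam_command(bam_path, sam_path, include_header=False):
    """Build a samtools command line that converts a BAM file to SAM text."""
    cmd = ["samtools", "view"]
    if include_header:
        # -h also writes the @-prefixed header lines into the SAM output.
        cmd.append("-h")
    # -o names the output file; the BAM input is the last argument.
    cmd += ["-o", sam_path, bam_path]
    return cmd


if __name__ == "__main__":
    for sample in ("WT_rep1", "hy5_rep1"):
        cmd = bam_to_sam_command(f"{sample}.bam", f"{sample}.sam")
        print(" ".join(cmd))
        # Uncomment to run the conversion (needs samtools and the BAM files):
        # subprocess.run(cmd, check=True)
```

The command is built as a list rather than a shell string so that file names containing spaces are passed through unchanged.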

B. Gfold 1.1.1 Count:

In this example, please use the following files to generate the count files:

1) The GTF (General Transfer Format) file for Arabidopsis thaliana.

The GTF file for Arabidopsis thaliana can be found in the Discovery Environment in the Data window under:

Community Data -> iplant -> home -> shared -> iplant_training -> reference_genomes -> ensembl_14_67 -> GTF -> Arabidopsis_thaliana.TAIR10.14.gtf.

2a) Input SAM file: WT_rep1.sam (output of Part A, step 2a)

2b) Input SAM file: hy5_rep1.sam (output of Part A, step 2b); run Gfold 1.1.1 Count a second time for this file

3a) Output file name: WT_rep1.read_cnt

3b) Output file name: hy5_rep1.read_cnt (for the second run of Gfold 1.1.1 Count)

4) Use the generated count files (WT_rep1.read_cnt and hy5_rep1.read_cnt) in Gfold 1.1.1 Difference Expression

Extra Options used in the Application:

Whether is the sequencing data strand specific?

-The Default option is False.

Please select True for this example.
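
For readers running GFOLD from a terminal instead of the app, the two runs in Part B can be sketched as below. This is a sketch under assumptions: it assumes a `gfold` executable on the PATH and the `count` options documented in the GFOLD manual (`-ann` for the annotation, `-tag` for the SAM file, `-o` for the output, `-s T` for strand-specific data); the helper name `gfold_count_command` is hypothetical, and the exact command the app issues is not shown on this page.

```python
import subprocess


def gfold_count_command(annotation, sam_path, output_path, strand_specific=False):
    """Build a `gfold count` command line for a single SAM file."""
    cmd = ["gfold", "count", "-ann", annotation, "-tag", sam_path, "-o", output_path]
    if strand_specific:
        # Assumed from the GFOLD manual: -s T marks strand-specific data (default F).
        cmd += ["-s", "T"]
    return cmd


if __name__ == "__main__":
    gtf = "Arabidopsis_thaliana.TAIR10.14.gtf"
    # One run per sample, matching steps 2a/3a and 2b/3b.
    for sample in ("WT_rep1", "hy5_rep1"):
        cmd = gfold_count_command(
            gtf, f"{sample}.sam", f"{sample}.read_cnt", strand_specific=True
        )
        print(" ".join(cmd))
        # Uncomment to run (needs gfold, the GTF file, and the SAM files):
        # subprocess.run(cmd, check=True)
```

Each sample gets its own invocation, which mirrors the need to run Gfold 1.1.1 Count once per SAM file.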

Output File(s)

Expect a file named file_name.read_cnt as output. For the test case, the output files generated are WT_rep1.read_cnt and hy5_rep1.read_cnt.
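
A count file can be inspected with a few lines of Python before passing it on. The sketch below assumes the tab-separated, five-column layout described in the GFOLD manual (gene symbol, gene name, read count, gene exon length, RPKM); that layout is not stated on this page, so check it against your own output. The names `GeneCount` and `parse_read_cnt` and the two sample rows are hypothetical and for illustration only.

```python
import csv
import io
from typing import NamedTuple


class GeneCount(NamedTuple):
    gene_symbol: str
    gene_name: str
    read_count: int
    exon_length: int
    rpkm: float


def parse_read_cnt(handle):
    """Parse an open .read_cnt file, assuming five tab-separated columns."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank, comment, or short lines rather than failing on them.
        if len(fields) < 5 or fields[0].startswith("#"):
            continue
        symbol, name, count, length, rpkm = fields[:5]
        rows.append(GeneCount(symbol, name, int(count), int(length), float(rpkm)))
    return rows


if __name__ == "__main__":
    # Made-up rows for illustration only; not real GFOLD output.
    example = "GENE_A\tNA\t120\t1500\t8.5\nGENE_B\tNA\t0\t900\t0\n"
    genes = parse_read_cnt(io.StringIO(example))
    expressed = [g.gene_symbol for g in genes if g.read_count > 0]
    print(len(genes), "genes parsed;", len(expressed), "with at least one read")
```

To read a real file, pass an open handle instead, for example `parse_read_cnt(open("WT_rep1.read_cnt", newline=""))`.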

Tool Source for App

Resources: Bitbucket and Manual
Publication: GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data