P) Determine differential expression

Determine differential expression (app: DESeq)

Description: DESeq is a statistics app that identifies sequences that are differentially expressed between two sequence pools. The edgeR tool can be used as an alternative. Documentation: http://bioconductor.org/packages/release/bioc/vignettes/DESeq/inst/doc/DESeq.pdf.

  1. Log into the Discovery Environment: https://de.iplantcollaborative.org/de/.
  2. Open the DESeq app (Public Applications > NGS > Transcriptome Profiling > Misc RNASeq tools > DESeq).
    1. Change 'Analysis Name' to Determine_Differential_Expression, add a 'Description' (optional), and use the default 'output folder'.
  3. Click on the Inputs tab.
    1. Select the 'Tab-delimited input file' field. Enter the matrix created in Section O (Combine counts) (Sample data: Community Data > iplant_training > rna-seq_without_genome > P_determine_differential_expression > ccombine_result.txt).
  4. Click on the Experiment Design tab.
    1. Select the 'Comma-separated list of factors for the data columns in your file' field. For the sample data enter the factors as control,control,control,dehydrated,dehydrated,dehydrated.
    2. Select the 'Comma-separated list of library types for each factor listed above (must be "single-end" or "paired-end" for each entry)' field. For the sample data enter paired-end,paired-end,paired-end,paired-end,paired-end,paired-end.
    3. Select the 'Comma-separated pair of factors for comparison' field. For the sample data enter control,dehydrated.
  5. Click on the Statistical Options tab.
    1. Enter 0.01 for 'Minimum false-discovery rate' and 0.2 for 'Quantile for removing insignificant genes' (genes in the lowest quantile are ignored as insignificant).
  6. Click on "Launch Analysis".
  7. Click on 'Analyses' from the DE workspace and monitor the 'Status' of the analysis (e.g., Idle, Submitted, Pending, Running, Completed, Failed).
    1. Once launched, an analysis will continue whether the user remains logged in or not.
    2. Email notifications report on the progress of the analysis; they can be switched off under 'Preferences'.
    3. If the analysis fails or does not proceed in the anticipated timeline, check these tips for troubleshooting. (Using the sample data, the analysis should be complete in less than 20 minutes.)
    4. To re-run an analysis, click the analysis "App" in the 'Analyses' window.
  8. Access analysis results in one of two ways:
    1. In the 'Analyses' window click on the analysis "Name" to open the output folder.
    2. In the 'Data' window, click on user name, then navigate to the folder that holds the output of the analysis. (Find the output for the sample at Community Data > iplant_training > rna-seq_without_genome > P_determine_differential_expression > output_from_sample_data.)
  9. The output consists of five files: three graphs (the dispersions, the MA plot, and the p-values), a text file of all results, and a text file of the significant results.
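
The Experiment Design and Statistical Options entries from steps 4 and 5 can be checked and illustrated with a small sketch. It is not the DESeq app's own code: the function names (parse_design, quantile, filter_low_counts, benjamini_hochberg) are hypothetical, the quantile is assumed to be applied to each gene's total count across samples, and the false-discovery rate is assumed to refer to Benjamini-Hochberg adjusted p-values.

```python
def parse_design(factors_csv, library_types_csv, comparison_csv, n_columns):
    """Check the three Experiment Design fields against the count matrix.

    n_columns is the number of data columns in the input matrix
    (assumed to be 6 for the sample data, one per factor entry).
    """
    factors = [f.strip() for f in factors_csv.split(",")]
    library_types = [t.strip() for t in library_types_csv.split(",")]
    comparison = [c.strip() for c in comparison_csv.split(",")]
    if len(factors) != n_columns:
        raise ValueError("need one factor per data column")
    if len(library_types) != len(factors):
        raise ValueError("need one library type per factor entry")
    if any(t not in ("single-end", "paired-end") for t in library_types):
        raise ValueError("library types must be single-end or paired-end")
    if len(comparison) != 2 or any(c not in factors for c in comparison):
        raise ValueError("comparison must name two of the listed factors")
    return factors, library_types, tuple(comparison)


def quantile(values, q):
    """Quantile by linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def filter_low_counts(counts, q):
    """Keep genes whose total count lies above the q-th quantile (assumed rule)."""
    totals = {gene: sum(row) for gene, row in counts.items()}
    cutoff = quantile(list(totals.values()), q)
    return {gene: row for gene, row in counts.items() if totals[gene] > cutoff}


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Walk from the largest p-value down, carrying the running minimum.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = n - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

With the sample values, parse_design would be called with the three comma-separated strings from step 4, filter_low_counts with q = 0.2, and a gene would count as significant when its adjusted p-value is below 0.01.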

The MA Plot created by the DESeq App (DESeq_MAplot.png) will resemble the following figure:

Sample of significant results from the DESeq App (DESeq_results_significant.txt):
