Please work through the tutorial and add your comments at the bottom of this page, or send comments by email to support@cyverse.org. Thank you.

QUAST: QUality ASsessment Tool for genome assemblies. MetaQUAST is its extension for metagenomic datasets, and Icarus is an interactive visualizer for these tools.

Gurevich, A., Saveliev, V., Vyahhi, N., and Tesler, G. (2013) QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072-1075

QUAST is a tool for evaluating genome assemblies by computing various metrics, including:

  • N50, the length for which the collection of all contigs of that length or longer covers at least 50% of the assembly length,
  • NG50, the same as N50 except that 50% of the reference genome length must be covered instead of 50% of the assembly length,
  • NA50 and NGA50, the same as N50 and NG50 except that aligned blocks are used instead of contigs,
  • misassemblies, misassembled and unaligned contigs or contig bases,
  • genes and operons covered.
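To make the N50 and NG50 definitions above concrete, here is a minimal Python sketch. It is not QUAST's own code; the function names (`nx`, `n50`, `ng50`) and the contig lengths are made up for illustration.

```python
def nx(lengths, fraction=0.5, total=None):
    """Return the largest length L such that contigs of length >= L
    together cover at least `fraction` of `total`.

    `total` defaults to the assembly length (sum of all contigs).
    Returns None if the contigs never reach the required coverage,
    which can happen when `total` is a reference genome length.
    """
    if total is None:
        total = sum(lengths)
    threshold = fraction * total
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= threshold:
            return length
    return None


def n50(lengths):
    # N50: 50% of the assembly length must be covered.
    return nx(lengths)


def ng50(lengths, reference_length):
    # NG50: 50% of the reference genome length must be covered.
    return nx(lengths, total=reference_length)


# Hypothetical contig lengths (illustration only).
contigs = [80, 70, 50, 40, 30, 20]
print(n50(contigs))        # assembly length 290, threshold 145
print(ng50(contigs, 400))  # reference length 400, threshold 200
```

NA50 and NGA50 would follow the same pattern, with the lengths of aligned blocks passed in place of the contig lengths.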

QUAST also builds convenient plots for the different metrics.


Pre-Requisites

  1. A CyVerse account. (Register for a CyVerse account here - user.cyverse.org)
  2. Input file(s)
    1. Input file: genome assemblies in FASTA format, generated by any genome assembler
  3. Output
    1. Output directory name
  4. Parameters
    1. Maximum number of reference genomes: maximum number of reference genomes (per assembly) to download after searching the SILVA database. Default value is 50.
    2. Ambiguity usage: how to process equally good alignments of a contig (probably repeats):
      • none: skip all such alignments;
      • one: take only one (the very best one);
      • all: use all alignments. Can cause a significant increase in the number of mismatches (repeats are almost always inexact due to accumulated SNPs, indels, etc.).

     

Test/sample data

The test data for QUAST is here: /iplant/home/shared/iplantcollaborative/example_data/metaQUAST.sample.data

Test run

  1. Open the MetaQUAST-4.3 (denovo based) app in the DE
  2. Select/drag the input files (meta_contigs_1.fasta and meta_contigs_2.fasta) into the Inputs section of the app
  3. Enter the name of the output folder (metaQuast_output) in the Output section of the app
  4. Ambiguity usage: all

Test Results

Successful execution of the QUAST assessment pipeline will create the metaQuast_output folder:

  • report.txt: assessment summary in plain text format,
  • report.tsv: tab-separated version of the summary, suitable for spreadsheets (Google Docs, Excel, etc.),
  • report.tex: LaTeX version of the summary,
  • alignment.svg: contig alignment plot (file is created if the matplotlib Python library is installed),
  • report.pdf: all other plots combined with all tables (file is created if the matplotlib Python library is installed),
  • report.html: HTML version of the report with interactive plots inside,
  • contigs_reports/
    • misassemblies_report: detailed report on misassemblies
    • unaligned_report: detailed report on unaligned and partially unaligned contigs

A more detailed explanation of the above output is provided in the metaQUAST manual.
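Because report.tsv is tab-separated, it can also be read programmatically. The sketch below assumes a layout with metric names in the first column and one column per assembly, which should be checked against your own output. The function name `read_report_tsv` and the sample values are hypothetical and are not results from the test data.

```python
import csv
import io


def read_report_tsv(handle):
    """Parse a tab-separated report into {assembly: {metric: value}}.

    Assumes the first row holds the assembly names and every later
    row starts with a metric name (an assumption, not a documented
    guarantee).
    """
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    header, body = rows[0], rows[1:]
    assemblies = header[1:]
    report = {name: {} for name in assemblies}
    for row in body:
        metric, values = row[0], row[1:]
        for name, value in zip(assemblies, values):
            report[name][metric] = value  # values are kept as strings
    return report


# Hypothetical sample content (illustration only).
sample = (
    "Assembly\tmeta_contigs_1\tmeta_contigs_2\n"
    "# contigs\t120\t95\n"
    "N50\t3500\t4100\n"
)
report = read_report_tsv(io.StringIO(sample))
print(report["meta_contigs_2"]["N50"])
```

To read a real file, pass an open file handle instead of the in-memory sample, for example `open("metaQuast_output/report.tsv", newline="")`, where the path is an assumption based on the output folder name used in the test run.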
