...
Rationale and background:
QUAST: QUality ASsessment Tool for Genome Assemblies. MetaQUAST is the extension for metagenomic datasets, and Icarus is the interactive visualizer for these tools.
Gurevich, A., Saveliev, V., Vyahhi, N., and Tesler, G. (2013) QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072-1075
QUAST is a tool for evaluating genome assemblies by computing various metrics, including:
- N50, the length for which the collection of all contigs of that length or longer covers at least 50% of the assembly length,
- NG50, defined like N50 but with 50% of the reference genome length to be covered instead of the assembly length,
- NA50 and NGA50, defined like N50 and NG50 but using aligned blocks instead of contigs,
- misassemblies, misassembled and unaligned contigs or contig bases,
- genes and operons covered.
QUAST also builds convenient plots for different metrics.
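The N50 and NG50 definitions above can be sketched in a few lines of Python. This is an illustrative sketch and not QUAST's own code; the function name `nx` and the contig lengths are made up for the example. Passing aligned block lengths instead of contig lengths would give the NA50 and NGA50 variants.

```python
def nx(lengths, x=50, total=None):
    """Return the Nx statistic (N50 when x=50).

    lengths: contig lengths (or aligned block lengths for NA50/NGA50).
    total:   length to be covered; defaults to the assembly length
             (sum of lengths). Pass the reference genome length to
             get NG50/NGA50 instead.
    """
    if total is None:
        total = sum(lengths)
    threshold = total * x / 100
    covered = 0
    # Walk from the longest contig down until x% of `total` is covered.
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered >= threshold:
            return length
    return None  # the contigs never cover x% of `total`


contigs = [80, 70, 50, 40, 30, 20]  # made-up contig lengths, total 290
n50 = nx(contigs)                   # 80 + 70 = 150 >= 145, so N50 is 70
ng50 = nx(contigs, total=400)       # 80 + 70 + 50 = 200 >= 200, so NG50 is 50
```

When the assembly is much shorter than the reference, the contigs may never reach half of the reference length; the sketch returns `None` in that case.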
...
This app can be used to run Cufflinks-2.2.1 using SAM files instead of BAM files. This app is basically a workflow that consists of two apps
Pre-Requisites
- A CyVerse account. (Register for a CyVerse account at user.cyverse.org)
- Input file(s)
- Input file: genome assemblies in FASTA format, generated by any genome assembler
- Name: SAM files generated from any mapper
- Output
- Output directory name
- Maximum number of reference genomes: maximum number of reference genomes (per assembly) to download after searching the SILVA database. The default value is 50.
- Ambiguity usage: how to process equally good alignments of a contig (probably repeats):
- none: skip all such alignments;
- one: take only one (the very best one);
- all: use all alignments. Can cause a significant increase in the number of mismatches (repeats are almost always inexact due to accumulated SNPs, indels, etc.).
- File name: Name of the output file (output.sorted.bam is default)
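The three ambiguity usage modes (none, one, all) can be illustrated with a toy sketch. It is not how QUAST scores or selects alignments internally: the function name, the `(reference_name, score)` tuples, and the rule that "equally good" means "same top score" are all assumptions made for the example.

```python
def resolve_ambiguous(alignments, mode="one"):
    """Pick alignments for one contig according to the ambiguity mode.

    alignments: list of (reference_name, score) tuples; higher is better.
    mode:       "none", "one", or "all".
    """
    if mode not in ("none", "one", "all"):
        raise ValueError(f"unknown ambiguity mode: {mode!r}")
    if not alignments:
        return []
    best_score = max(score for _, score in alignments)
    best = [a for a in alignments if a[1] == best_score]
    if len(best) == 1:
        return best      # no ambiguity: the single best alignment is kept
    if mode == "none":
        return []        # skip all equally good alignments
    if mode == "one":
        return best[:1]  # keep only one of them
    return best          # "all": keep every equally good alignment


# Hypothetical hits for one repetitive contig.
hits = [("refA", 100), ("refB", 100), ("refC", 90)]
kept_none = resolve_ambiguous(hits, "none")
kept_one = resolve_ambiguous(hits, "one")
kept_all = resolve_ambiguous(hits, "all")
```

With `all`, the contig is counted against both `refA` and `refB`, which is why that mode can inflate the number of reported mismatches for repeats.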
Cufflinks-2.2.1: Regarding parameters for this section, please refer to Cufflinks-2.2.1
The test data for QUAST is here: /iplant/home/shared/iplantcollaborative/example_data/metaQUAST.sample.data
Samtools-SAMtoSortedBAM
Test run
- Open MetaQUAST-4.3(denovo based) app in DE
- Select/drag input files (meta_contigs_1.fasta and meta_contigs_2.fasta) into the Inputs section of the app
- Enter the name of the output directory (metaQuast_output) in the Output section of the app
- Ambiguity usage: all
- Input file: BowtieOutput.sam
- Output: Default file name
- Cufflinks-2.2.1: Leave everything default
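For readers who want to reproduce the MetaQUAST test run outside the DE, the settings above correspond roughly to a `metaquast.py` command line. The sketch below only builds and prints that command; the mapping of the DE fields to the `-o`, `--max-ref-number`, and `--ambiguity-usage` options is an assumption about how the app wraps the tool, and the helper name is hypothetical.

```python
import shlex


def build_metaquast_command(contig_files, output_dir,
                            max_ref_number=50, ambiguity_usage="all"):
    """Assemble a metaquast.py argument list (assumed mapping of DE fields)."""
    return [
        "metaquast.py",
        *contig_files,
        "-o", output_dir,
        "--max-ref-number", str(max_ref_number),
        "--ambiguity-usage", ambiguity_usage,
    ]


cmd = build_metaquast_command(
    ["meta_contigs_1.fasta", "meta_contigs_2.fasta"],
    "metaQuast_output",
)
print(shlex.join(cmd))  # shlex.join requires Python 3.8 or newer
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` on a machine where QUAST is installed.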
Test Results
Successful execution of the QUAST assessment pipeline creates the metaQuast_output folder, which contains:
report.txt | assessment summary in plain text format |
report.tsv | tab-separated version of the summary, suitable for spreadsheets (Google Docs, Excel, etc.) |
report.tex | LaTeX version of the summary |
alignment.svg | contig alignment plot (created only if the matplotlib Python library is installed) |
report.pdf | all other plots combined with all tables (created only if the matplotlib Python library is installed) |
report.html | HTML version of the report with interactive plots inside |
contigs_reports/ | |
contigs_reports/misassemblies_report | detailed report on misassemblies |
contigs_reports/unaligned_report | detailed report on unaligned and partially unaligned contigs |
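Because report.tsv is tab-separated, it is easy to load for further processing. The sketch below assumes the usual QUAST layout, where the first row lists the assembly names after an `Assembly` label and each following row holds one metric; the sample text and its numbers are invented for illustration and are not results from the test data.

```python
import csv
import io


def parse_report_tsv(text):
    """Parse report.tsv content into {assembly: {metric: value}}."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    header, metric_rows = rows[0], rows[1:]
    assemblies = header[1:]  # first column is the metric name
    report = {name: {} for name in assemblies}
    for row in metric_rows:
        metric, values = row[0], row[1:]
        for name, value in zip(assemblies, values):
            report[name][metric] = value  # values are kept as strings
    return report


# Invented sample content in the assumed layout.
sample = (
    "Assembly\tmeta_contigs_1\tmeta_contigs_2\n"
    "# contigs\t120\t95\n"
    "N50\t15000\t18200\n"
)
report = parse_report_tsv(sample)
print(report["meta_contigs_2"]["N50"])
```

For a real run, read the file with `open(path, newline="")` and pass its contents to `parse_report_tsv`.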
...
the following files and folders:
- logs
- output_out: This folder contains the Cufflinks-related outputs (genes.fpkm_tracking, isoforms.fpkm_tracking, skipped.gtf and transcripts.gtf)
- output.sorted.bam
- output.sorted.bam.bam.bai