QUAST 4.0 Using Atmosphere
Rationale and background:
QUAST: QUality ASsessment Tool for Genome Assemblies
Gurevich, A., Saveliev, V., Vyahhi, N., and Tesler, G. (2013) QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072-1075
QUAST is a tool for evaluating genome assemblies by computing various metrics, including:
- N50, the length for which the collection of all contigs of that length or longer covers at least 50% of the assembly length,
- NG50, the same as N50 except that at least 50% of the reference genome length must be covered instead of the assembly length,
- NA50 and NGA50, the same as N50 and NG50 except that aligned blocks are used instead of contigs,
- misassemblies, misassembled and unaligned contigs or contig bases,
- genes and operons covered
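The N50 and NG50 definitions above can be illustrated with a minimal Python sketch. The function name `nx` and the contig lengths are hypothetical and chosen only for illustration; this is not QUAST's own implementation.

```python
def nx(lengths, fraction=0.5, total=None):
    """Return the smallest contig length L such that contigs of length >= L
    together cover at least `fraction` of `total`.

    With total=None the assembly length is used (N50-style).
    Passing a reference genome length as `total` gives an NG50-style value.
    Returns None if the contigs never reach the threshold.
    """
    if total is None:
        total = sum(lengths)
    threshold = total * fraction
    running = 0
    # Walk through contigs from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return None


# Hypothetical contig lengths (not taken from the test data).
contigs = [80, 70, 50, 40, 30, 20]

print("N50:", nx(contigs))               # assembly length is 290
print("NG50:", nx(contigs, total=400))   # assumed reference length of 400
```

NA50 and NGA50 would follow the same pattern with the lengths of aligned blocks in place of contig lengths.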
QUAST builds convenient plots for different metrics:
- cumulative contigs length,
- all kinds of N-metrics,
- genes and operons covered,
- GC content.
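To make two of the plotted quantities concrete, here is a small Python sketch of how a cumulative contig length series and a GC percentage could be computed. The function names are hypothetical, and ignoring ambiguous bases such as N in the GC calculation is an assumption of this sketch, not a statement about how QUAST computes it.

```python
from itertools import accumulate


def cumulative_lengths(lengths):
    """Running total of contig lengths, ordered from longest to shortest."""
    return list(accumulate(sorted(lengths, reverse=True)))


def gc_content(sequence):
    """Percentage of G and C among A, C, G and T bases.

    Assumption: ambiguous bases (for example N) are ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Hypothetical inputs, for illustration only.
print(cumulative_lengths([30, 80, 50]))
print(gc_content("ATGCNN"))
```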
Introduction
This tutorial will orient you to using QUAST (version 4.0) as installed on Atmosphere. It provides instructions for QUAST, the general tool for genome assemblies; MetaQUAST, the extension for metagenomic datasets; and Icarus, the interactive visualizer for these tools.
This tutorial takes users through the steps of:
- Launching the QUAST-4.0 Atmosphere image
- Running QUAST-4.0 on test data
Please work through the tutorial and add your comments at the bottom of this page, or send comments by email to upendra@cyverse.org. Thank you.
Learn about allocations
Learn about CyVerse's allocation policies here.
Part 1: Connect to an instance of an Atmosphere Image (Virtual Machine)
Step 1. Go to https://atmo.iplantcollaborative.org and log in with your CyVerse credentials.
Step 2. Click on the Launch New Instance button and search for QUAST-4.0
Step 3. Select the image QUAST 4.0 and click Launch Instance. It will take 10-15 minutes for the cloud instance to be launched.
Note: Instances can be configured for different amounts of CPU, memory, and storage depending on user needs. This tutorial can be completed with the medium instance size, small1 (2 CPUs, 8 GB memory, 60 GB root).
Part 2: Set up a QUAST-4.0 run using the Terminal window
Step 1. Open the Terminal and connect to the instance over ssh. In the following command, replace 'username' with your actual iPlant username and 'IPaddress' with the IP address of your instance:
$ ssh <username>@<IPaddress>
Step 2. You will find test data in the "/opt/quast-4.0/test_data" folder. List its contents with the ls command.
$ ls /opt/quast-4.0/test_data/
contigs_1.fasta genes.gff genes.txt meta_contigs_2.fasta meta_ref_2.fasta operons.gff reads1.fastq.gz reference.fasta.gz
contigs_2.fasta genes.ncbi meta_contigs_1.fasta meta_ref_1.fasta meta_ref_3.fasta operons.txt reads2.fastq.gz
We'll change to the /opt/quast-4.0 directory, which contains quast.py, metaquast.py and the test_data folder, for the remaining steps.
$ cd /opt/quast-4.0/
Part 3: Run QUAST-4.0
1. Basic testing
$ python quast.py -o ~/quast_test_output -R test_data/reference.fasta.gz -G test_data/genes.gff test_data/contigs_1.fasta test_data/contigs_2.fasta
2. SV calling
$ python quast.py -o ~/quast_test_output_sv -R /opt/quast-4.0/test_data/reference.fasta.gz -O /opt/quast-4.0/test_data/operons.gff -G /opt/quast-4.0/test_data/genes.gff --gage --gene-finding --eukaryote --glimmer -1 /opt/quast-4.0/test_data/reads1.fastq.gz -2 /opt/quast-4.0/test_data/reads2.fastq.gz /opt/quast-4.0/test_data/contigs_1.fasta /opt/quast-4.0/test_data/contigs_2.fasta
3. MetaQUAST with references
$ python metaquast.py -o ~/metaquast_test_output -R /opt/quast-4.0/test_data/meta_ref_1.fasta,/opt/quast-4.0/test_data/meta_ref_2.fasta,/opt/quast-4.0/test_data/meta_ref_3.fasta /opt/quast-4.0/test_data/meta_contigs_1.fasta /opt/quast-4.0/test_data/meta_contigs_2.fasta
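Note that the command above passes several reference genomes to -R as a single comma-separated value. A minimal Python sketch of assembling such a command follows. The function name `build_metaquast_command` is hypothetical, and only the options shown in this tutorial (-o and -R) are used. The sketch prints the command rather than running it.

```python
import shlex


def build_metaquast_command(contigs, output_dir, references=()):
    """Build the argument list for a metaquast.py run.

    `references` is joined with commas into one -R value, as in the
    command above. If it is empty, -R is omitted (a reference-free run).
    """
    references = list(references)
    for path in references:
        if "," in path:
            # A comma inside a path would be confused with the separator.
            raise ValueError("reference path contains a comma: %s" % path)
    cmd = ["python", "metaquast.py", "-o", output_dir]
    if references:
        cmd += ["-R", ",".join(references)]
    cmd += list(contigs)
    return cmd


test_data = "/opt/quast-4.0/test_data"
cmd = build_metaquast_command(
    contigs=[test_data + "/meta_contigs_1.fasta",
             test_data + "/meta_contigs_2.fasta"],
    output_dir="metaquast_test_output",
    references=[test_data + "/meta_ref_%d.fasta" % i for i in (1, 2, 3)],
)
# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list keeps each path a separate argument, so it could later be passed to `subprocess.run` without shell quoting problems. A "~" in the output directory is expanded only by the shell, so it would need `os.path.expanduser` first in that case.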
4. MetaQUAST with no reference
$ sudo python metaquast.py -o ~/metaquast_test_output_no_ref /opt/quast-4.0/test_data/meta_contigs_1.fasta /opt/quast-4.0/test_data/meta_contigs_2.fasta
Results
Successful execution of the QUAST assessment pipeline will create the following output.
QUAST output contains:
report.txt | assessment summary in plain text format |
report.tsv | tab-separated version of the summary, suitable for spreadsheets (Google Docs, Excel, etc.) |
report.tex | LaTeX version of the summary |
alignment.svg | contig alignment plot (created only if the matplotlib Python library is installed) |
report.pdf | all other plots combined with all tables (created only if the matplotlib Python library is installed) |
report.html | HTML version of the report with interactive plots inside |
contigs_reports/misassemblies_report | detailed report on misassemblies |
contigs_reports/unaligned_report | detailed report on unaligned and partially unaligned contigs |
Note:
- metrics based on a reference genome are computed only if a reference is provided
- metrics based on genes and operons are computed only if proper annotations are provided
A more detailed explanation of the above output is provided in the QUAST manual.
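Because report.tsv is tab-separated, it can also be read programmatically. The Python sketch below assumes a layout in which the first row starts with "Assembly" followed by one column per assembly, and each later row holds a metric name followed by one value per assembly. Check this layout against your own report before relying on it. The function name `parse_report_tsv` and the sample values are hypothetical.

```python
import csv
import io


def parse_report_tsv(text):
    """Parse report.tsv-style text into {assembly: {metric: value}}.

    Assumed layout: header row "Assembly<TAB>name1<TAB>name2...", then one
    row per metric. Values are kept as strings.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    rows = [row for row in reader if row]  # skip blank lines
    header, metric_rows = rows[0], rows[1:]
    assemblies = header[1:]
    report = {name: {} for name in assemblies}
    for row in metric_rows:
        metric, values = row[0], row[1:]
        for name, value in zip(assemblies, values):
            report[name][metric] = value
    return report


# Hypothetical sample, not real QUAST results.
sample = "Assembly\tcontigs_1\tcontigs_2\nN50\t100\t200\n# contigs\t3\t5\n"
report = parse_report_tsv(sample)
for assembly, metrics in report.items():
    print(assembly, metrics["N50"])
```

To use it on a real run, read the file first, for example with `open(os.path.expanduser("~/quast_test_output/report.tsv")).read()`, and pass the text to `parse_report_tsv`.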