VIBRANT-1.2.0
The QuickStart tutorial provides an introduction to basic Discovery Environment (DE) functionality and navigation.
Rationale and background:
VIBRANT is a tool for automated recovery and annotation of bacterial and archaeal viruses, determination of genome completeness, and characterization of virome function from metagenomic assemblies. VIBRANT uses neural networks of protein annotation signatures and genomic features to maximize the identification of highly diverse partial or complete viral genomes as well as excise integrated proviruses.
- Uses neural network machine learning of protein annotation signatures
- Assigns novel 'v-score' for determining the virus-like nature of all annotations
- Determines genome completeness
- Characterizes virome function by metabolic analysis
- Identifies auxiliary metabolic genes (AMGs)
- Excises integrated viral genomes from host scaffolds
- Performs well in diverse environments
- Recovers novel and abundant viral genomes
- Built for dsDNA, ssDNA and RNA viruses
VIBRANT uses three databases for identifying viruses and characterizing virome metabolic potential:
- KEGG (March 2019 release): https://www.genome.jp/kegg/ (FTP: ftp://ftp.genome.jp/pub/db/kofam/archives/2019-03-20/)
- Pfam (v32): https://pfam.xfam.org (FTP: ftp://ftp.ebi.ac.uk/pub/databases/Pfam/releases/Pfam32.0/)
- VOG (release 94): http://vogdb.org/ (FTP: http://fileshare.csb.univie.ac.at/vog/vog94/)
Prerequisites
A CyVerse account (register at user.cyverse.org).
Input
Input fasta file
Parameters
- Format of input {prot,nucl} [default="nucl"]
- Number of parallel VIBRANT runs; each occupies 1 CPU [default=1, max of 1 CPU per scaffold]
- Minimum length in base pairs for input sequences [default=1000; can be increased but not decreased]
- Minimum number of ORFs per scaffold for input sequences [default=4; can be increased but not decreased]
- virome: use this setting if the dataset is known to consist mainly of viruses. More sensitive to viruses, less sensitive to false identifications [default=off]
- no_plot: suppress the generation of summary plots [default=off]
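For readers who want to see how the parameters above fit together, here is a minimal Python sketch that validates the settings and assembles an argument list. The helper name `build_vibrant_args` is hypothetical, and the script name `VIBRANT_run.py` and the flags `-i`, `-f`, `-t`, `-l`, `-o`, `-virome`, and `-no_plot` are assumed from the upstream VIBRANT command-line tool; the DE app may pass its settings differently.

```python
from typing import List

MIN_LENGTH_BP = 1000  # documented default, which is also the lowest allowed value
MIN_ORFS = 4          # documented default, which is also the lowest allowed value


def build_vibrant_args(fasta: str, fmt: str = "nucl", parallel: int = 1,
                       min_length: int = MIN_LENGTH_BP,
                       min_orfs: int = MIN_ORFS,
                       virome: bool = False,
                       no_plot: bool = False) -> List[str]:
    """Hypothetical helper: check the settings and return an argument list."""
    if fmt not in ("prot", "nucl"):
        raise ValueError("format must be 'prot' or 'nucl'")
    if parallel < 1:
        raise ValueError("number of parallel runs must be at least 1")
    if min_length < MIN_LENGTH_BP:
        raise ValueError("length limit can be increased but not decreased below 1000")
    if min_orfs < MIN_ORFS:
        raise ValueError("ORF limit can be increased but not decreased below 4")

    # Flag names are assumed from the upstream VIBRANT command-line tool.
    args = ["VIBRANT_run.py", "-i", fasta, "-f", fmt, "-t", str(parallel),
            "-l", str(min_length), "-o", str(min_orfs)]
    if virome:
        args.append("-virome")
    if no_plot:
        args.append("-no_plot")
    return args


if __name__ == "__main__":
    print(" ".join(build_vibrant_args("mixed_example.fasta")))
```

On a machine where VIBRANT is installed locally, the returned list could be handed to `subprocess.run(args, check=True)`; inside the DE, the same choices are made through the app's form fields instead.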
Test/sample data:
Test data for VIBRANT-1.0.1 are available at /iplant/home/shared/iplantcollaborative/example_data/vibrant
Use the following input and parameters for VIBRANT-1.0.1:
Input
Input fasta file: /iplant/home/shared/iplantcollaborative/example_data/vibrant/example_data/mixed_example.fasta
Parameters
- Format of input: Nucleotide
- Number of parallel VIBRANT runs: 1
- Length in basepairs to limit input sequences: 1000
- Number of ORFs per scaffold to limit input sequences: 4
- Leave the remaining two parameters (virome and no_plot) at their defaults
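Before launching a run on your own data, it can help to check how many scaffolds meet the length limit. The following Python sketch is one way to do that. The function names are hypothetical, and treating the limit as inclusive (length >= 1000) is an assumption rather than documented VIBRANT behavior. The ORF limit is not checked here because it requires gene prediction.

```python
from typing import Dict, Iterable, List


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The sequence ID is the first whitespace-delimited token of the header.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def passing_scaffolds(lengths: Dict[str, int], min_length: int = 1000) -> List[str]:
    """IDs of sequences at least min_length long (inclusive limit is an assumption)."""
    return [seq_id for seq_id, n in lengths.items() if n >= min_length]


if __name__ == "__main__":
    # Tiny in-memory example; for a real file, pass an open file handle instead.
    demo = [">scaffold_1 example", "ACGT" * 300, ">scaffold_2", "ACGT" * 100]
    lengths = fasta_lengths(demo)
    print(lengths)
    print(passing_scaffolds(lengths))
```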
Output Reports:
After successful completion of the run, expect the following files as output:
VIBRANT_log_mixed_example.log
mixed_example.faa
mixed_example.ffn
mixed_example.gff
VIBRANT_HMM_tables_parsed_mixed_example
VIBRANT_HMM_tables_unformatted_mixed_example
VIBRANT_figures_mixed_example
VIBRANT_phages_mixed_example
VIBRANT_results_mixed_example
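A quick way to confirm that a run produced everything listed above is to check for each name in the output folder. This Python sketch makes two assumptions: the input base name (here `mixed_example`) is substituted into each name in the same way for other inputs, and all items sit directly inside one output directory. Both function names are hypothetical.

```python
from pathlib import Path
from typing import List


def expected_outputs(base: str) -> List[str]:
    """Output names from the list above, with the input base name filled in."""
    return [
        f"VIBRANT_log_{base}.log",
        f"{base}.faa",
        f"{base}.ffn",
        f"{base}.gff",
        f"VIBRANT_HMM_tables_parsed_{base}",
        f"VIBRANT_HMM_tables_unformatted_{base}",
        f"VIBRANT_figures_{base}",
        f"VIBRANT_phages_{base}",
        f"VIBRANT_results_{base}",
    ]


def missing_outputs(output_dir: str, base: str) -> List[str]:
    """Return the expected names that are not present in output_dir."""
    root = Path(output_dir)
    return [name for name in expected_outputs(base) if not (root / name).exists()]


if __name__ == "__main__":
    # "." is a placeholder; point this at the downloaded output folder.
    missing = missing_outputs(".", "mixed_example")
    print("All expected outputs present" if not missing else f"Missing: {missing}")
```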
For more detailed information about these outputs, see https://github.com/AnantharamanLab/VIBRANT