DIAMOND 0.9.10

The DE Quick Start tutorial provides an introduction to basic DE functionality and navigation.

Please work through the tutorial and add your comments to the bottom of this page, or email them to support@cyverse.org. Thank you.

Rationale and background: 

DIAMOND is a sequence aligner for protein and translated DNA searches, designed for high-performance analysis of big sequence data. The key features are:

  • Pairwise alignment of proteins and translated DNA at 500x-20,000x the speed of BLAST.
  • Frameshift alignments for long read analysis.
  • Low resource requirements; suitable for running on standard desktops or laptops.
  • Various output formats, including BLAST pairwise, tabular and XML, as well as taxonomic classification.

Pre-Requisites

  1. A CyVerse account. (Register for a CyVerse account at user.cyverse.org.)

Inputs

  1. Protein sequence file (makedb only)

    Please ignore this field. To build the DIAMOND database, use the DIAMOND-makedb-0.9.10 app instead.

  2. Input query sequence file: Name of the query FASTA file (protein, or DNA for translated searches)
  3. DIAMOND database file: Path to the database that was created using DIAMOND-makedb-0.9.10
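
Because the DIAMOND program selected under Parameters depends on whether the query is protein or DNA, it can help to check the query file before submitting a job. The following is a minimal Python sketch and is not part of the app: the function name `guess_query_type`, the 0.9 threshold, and the letter-frequency heuristic are all assumptions made for illustration.

```python
from typing import Iterable

# Letters that commonly appear in nucleotide FASTA records (assumed set).
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_query_type(fasta_lines: Iterable[str], threshold: float = 0.9) -> str:
    """Guess whether FASTA content is 'dna' or 'protein'.

    Heuristic (an assumption, not a DIAMOND feature): if at least
    `threshold` of all sequence letters are nucleotide letters,
    the file is treated as DNA; otherwise as protein.
    """
    total = 0
    nucleotide_like = 0
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        for ch in line.upper():
            if ch.isalpha():
                total += 1
                if ch in NUCLEOTIDE_LETTERS:
                    nucleotide_like += 1
    if total == 0:
        raise ValueError("no sequence residues found")
    return "dna" if nucleotide_like / total >= threshold else "protein"
```

A result of "protein" suggests the "Align amino acid query sequences" program, and "dna" suggests the "Align DNA query sequences" program. Short or unusual sequences can fool the heuristic, so treat the result as a hint only.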

Parameters

  1. Maximum number of target sequences to report alignments for (default: 0)
  2. Minimum bit score to report alignments (overrides e-value setting) (default: 0)
  3. Minimum subject cover% to report an alignment (default: 0)
  4. Output file format: BLAST pairwise, BLAST XML, BLAST tabular, DIAMOND alignment archive (DAA), SAM (default: BLAST tabular)
  5. Report alignments within this percentage range of top alignment score (default: 0)
  6. Minimum identity% to report an alignment (default: 0)
  7. Enable sensitive mode (default: fast)
  8. DIAMOND program (Required): Build DIAMOND database from a FASTA file, Align amino acid query sequences against a protein reference database, Align DNA query sequences against a protein reference database

    Please select either "Align amino acid query sequences against a protein reference database" or "Align DNA query sequences against a protein reference database", depending on your query type. Do not select "Build DIAMOND database from a FASTA file".

  9. Query strands to search (DNA queries only): Both strands, Minus strand, Plus strand (default: Both strands)
  10. Maximum e-value to report alignments (default: 0.001)
  11. Minimum query cover% to report an alignment (default: 0)
  12. Enable more sensitive mode (default: fast)
  13. Output filename (default: output)
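
For readers who want to see how these settings might look on the command line, the sketch below assembles a `diamond` invocation in Python. It is an illustration only: the function `build_diamond_command` and its argument names are hypothetical, and how the DE app actually maps its fields to DIAMOND options is assumed rather than taken from the app. The option names are DIAMOND's long-form options as documented in its manual; check `diamond help` for your version.

```python
from typing import List, Optional

# DIAMOND --outfmt codes for the formats offered by the app.
OUTPUT_FORMATS = {
    "BLAST pairwise": "0",
    "BLAST XML": "5",
    "BLAST tabular": "6",
    "DAA": "100",
    "SAM": "101",
}


def build_diamond_command(
    program: str,                      # "blastp" (protein query) or "blastx" (DNA query)
    query: str,
    database: str,
    output: str = "output",
    outfmt: str = "BLAST tabular",
    evalue: float = 0.001,
    max_target_seqs: Optional[int] = None,
    min_score: Optional[float] = None,
    identity: Optional[float] = None,
    query_cover: Optional[float] = None,
    subject_cover: Optional[float] = None,
    top: Optional[float] = None,
    sensitivity: str = "fast",         # "fast", "sensitive" or "more-sensitive"
    strand: Optional[str] = None,      # "both", "minus" or "plus" (DNA queries)
) -> List[str]:
    """Return a diamond argument list; options left as None are omitted."""
    if program not in ("blastp", "blastx"):
        raise ValueError("program must be 'blastp' or 'blastx'")
    cmd = [
        "diamond", program,
        "--db", database,
        "--query", query,
        "--out", output,
        "--outfmt", OUTPUT_FORMATS[outfmt],
        "--evalue", str(evalue),
    ]
    optional = [
        ("--max-target-seqs", max_target_seqs),
        ("--min-score", min_score),
        ("--id", identity),
        ("--query-cover", query_cover),
        ("--subject-cover", subject_cover),
        ("--top", top),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    if sensitivity == "sensitive":
        cmd.append("--sensitive")
    elif sensitivity == "more-sensitive":
        cmd.append("--more-sensitive")
    elif sensitivity != "fast":
        raise ValueError("unknown sensitivity: " + sensitivity)
    if strand is not None:
        cmd += ["--strand", strand]
    return cmd
```

The returned list can be passed to `subprocess.run` on a machine where DIAMOND is installed. For example, `build_diamond_command("blastp", "msu-irgsp-proteins.fasta", "out.dmnd")` yields a protein search with tabular output and an e-value cutoff of 0.001.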

     

Test/sample data:


Test data for DIAMOND 0.9.10 are provided in /iplant/home/shared/iplantcollaborative/example_data/diamond_blast:

  1. Inputs:

    1. Input query sequence file: msu-irgsp-proteins.fasta

    2. DIAMOND database file: out.dmnd

  2. Parameters:
    1. DIAMOND program (Required): Align amino acid query sequences against a protein reference database

Leave the remaining parameters at their default values.

Output Reports:

  1. output - Tabular BLASTP result
  2. Inputs that you submitted (msu-irgsp-proteins.fasta and out.dmnd)
  3. Logs (*.err and *.out)
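
The tabular result can be inspected with a short script. The sketch below assumes the output uses the standard 12-column BLAST tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), which is DIAMOND's default for this format. The function name `read_tabular_hits` is hypothetical.

```python
import csv
from typing import Dict, Iterable, Iterator, Union

# Column order of the default BLAST tabular format.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
TEXT_COLUMNS = {"qseqid", "sseqid"}
FLOAT_COLUMNS = {"pident", "evalue", "bitscore"}

Hit = Dict[str, Union[str, int, float]]


def read_tabular_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one dict per alignment from tab-separated BLAST tabular lines."""
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        if len(row) != len(COLUMNS):
            raise ValueError("expected 12 columns, got %d" % len(row))
        hit: Hit = {}
        for name, value in zip(COLUMNS, row):
            if name in TEXT_COLUMNS:
                hit[name] = value
            elif name in FLOAT_COLUMNS:
                hit[name] = float(value)
            else:
                hit[name] = int(value)
        yield hit
```

With the default output filename, usage could look like `with open("output") as handle: hits = list(read_tabular_hits(handle))`, after which the hits can be filtered, for example by `hit["evalue"]` or `hit["pident"]`.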

More information about DIAMOND can be found at https://github.com/bbuchfink/diamond