rust-mdbg 0.1.0 pipeline

Quick Start

To use rust-mdbg, import your data in FASTA/FASTQ format.
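As a quick sanity check before uploading, the input format can be guessed from the first record marker: FASTA records start with ">" and FASTQ records start with "@". The sketch below is only an illustration and is not part of rust-mdbg; the function name sniff_read_format is hypothetical, and it handles both plain and gzip-compressed files.

```python
import gzip


def sniff_read_format(path):
    """Return 'fasta', 'fastq', or 'unknown' from the first non-blank line.

    Hypothetical helper for illustration; not part of rust-mdbg.
    """
    # Gzip files start with the magic bytes 0x1f 0x8b.
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == b"\x1f\x8b"

    opener = gzip.open if is_gzip else open
    with opener(path, "rb") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(b">"):
                return "fasta"  # FASTA header marker
            if line.startswith(b"@"):
                return "fastq"  # FASTQ header marker
            return "unknown"
    return "unknown"  # empty file
```

For example, sniff_read_format("reads-0.00.fa.gz") should report "fasta" for a gzip-compressed FASTA file.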

rust-mdbg is a modular assembler. It consists of three components:

  1. rust-mdbg, to perform assembly in minimizer-space
  2. gfatools (external component), to perform graph simplifications
  3. to_basespace, to convert a minimizer-space assembly to base-space
    (For convenience, components 2 and 3 are wrapped into a script called magic_simplify.)
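Outside the Discovery Environment, the components above can be chained as two commands: the assembler first, then the magic_simplify wrapper on the same output prefix. The following is a minimal sketch under stated assumptions, not the app's actual implementation: it assumes the rust-mdbg and magic_simplify executables are on the PATH under those names, that rust-mdbg takes the reads file as a positional argument with a --prefix option, and that magic_simplify takes the prefix as its only argument. All other parameters are left at their defaults. The function names are hypothetical.

```python
import subprocess


def pipeline_commands(reads, prefix):
    """Return the pipeline's command lines in run order (hypothetical helper)."""
    return [
        # Component 1: assembly in minimizer-space.
        # Assumption: reads file is positional, output prefix is --prefix.
        ["rust-mdbg", reads, "--prefix", prefix],
        # Components 2 and 3: graph simplification and base-space conversion,
        # wrapped by magic_simplify. Assumption: it takes the prefix only.
        ["magic_simplify", prefix],
    ]


def run_pipeline(reads, prefix, dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in pipeline_commands(reads, prefix):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the pipeline if a step exits with an error.
            subprocess.run(cmd, check=True)
```

Calling run_pipeline("reads-0.00.fa.gz", "example") with the default dry_run=True only prints the two command lines, which makes it easy to inspect them before running anything.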

For better contiguity, try the provided multi-k assembly script. It performs assembly iteratively, starting with k = 10 and increasing up to an automatically determined largest k. This comes at the expense of a ~7x longer running time.

There are four rust-mdbg apps in the Discovery Environment (DE):

  1. rust-mdbg 0.1.0
  2. rust-mdbg 0.1.0 magic simplify
  3. rust-mdbg 0.1.0 pipeline (runs both apps 1 and 2)
  4. rust-mdbg 0.1.0 multik

Test Data

Test data for this app is available in the Discovery Environment, in the Data window under Community Data -> iplantcollaborative -> example_data -> rust-mdbg.

Input File(s)

Use reads-0.00.fa.gz from the directory above as test input.

Parameters Used in App

When the app is run in the Discovery Environment, use the following parameters with the above input file(s) to get the output listed in the next section.

  • Use these parameters within the DE app interface:
    • Select a prefix for the output files
    • All the other parameters may be left as default

Output File(s)

Expect the following as output.

  • example.140566115636992.sequences
  • example.140566117738240.sequences
  • example.140566119851776.sequences
  • example.140566121965312.sequences
  • example.140566124078848.sequences
  • example.140566126204672.sequences
  • example.140566128305920.sequences
  • example.140566130407168.sequences
  • example.gfa
  • example.msimpl.fa
  • example.msimpl.gfa
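After downloading the results, the listing above can be checked with a short script. This is a sketch under assumptions: it treats the numeric part of the .sequences file names as run-specific (so it only requires that at least one such file exists), and the function name missing_outputs is hypothetical.

```python
import re
from pathlib import Path


def missing_outputs(directory, prefix):
    """Return the expected output names that are absent from `directory`."""
    names = {p.name for p in Path(directory).iterdir() if p.is_file()}

    # Fixed-name outputs taken from the listing above.
    missing = [
        prefix + suffix
        for suffix in (".gfa", ".msimpl.fa", ".msimpl.gfa")
        if prefix + suffix not in names
    ]

    # Assumption: the number in <prefix>.<number>.sequences varies per run,
    # so only the pattern is checked, not the exact values or the file count.
    seq_pattern = re.compile(re.escape(prefix) + r"\.\d+\.sequences")
    if not any(seq_pattern.fullmatch(name) for name in names):
        missing.append(prefix + ".<number>.sequences")

    return missing
```

An empty list from missing_outputs("results", "example") means every expected kind of file is present in the (hypothetical) results directory.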

Tool Source for App