wtdbg 2.3

Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT).

Quick Start

To use wtdbg2, the input must be long-read data (PacBio or Nanopore) in FASTQ or FASTA format.
Resources: https://github.com/ruanjue/wtdbg2
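
Before uploading a file, it can help to confirm that it looks like FASTA or FASTQ. The following is a minimal sketch, not part of wtdbg2 or the app: it only inspects the first non-blank line of an uncompressed text file, relying on the convention that FASTA records start with '>' and FASTQ records start with '@'. The function names are hypothetical.

```python
def classify_header(line):
    """Guess the format from a record's first line: 'fasta', 'fastq', or None."""
    if line.startswith(">"):
        return "fasta"
    if line.startswith("@"):
        return "fastq"
    return None


def sniff_read_format(path):
    """Classify an uncompressed reads file by its first non-blank line."""
    with open(path, "r") as handle:
        for line in handle:
            if line.strip():
                return classify_header(line)
    return None  # empty file


# Example (assumes the file has been downloaded locally):
# sniff_read_format("SRR8506728.fastq")
```

This is only a quick sanity check; it does not validate the whole file.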

Test Data

Test data for wtdbg2 are available in the Discovery Environment: in the Data window, go to Community Data -> iplantcollaborative -> example_data -> wtdbg2.

Input File(s)

Use SRR8506728.fastq as the input file. These are PacBio data.

Parameters Used in App

When running the app in the Discovery Environment, use the following parameters with the input file above to produce the output listed in the next section.

Under 'presets', choose PacBio RSII as the sequencing technology.
Under 'output', set the file output prefix to wtdbg_SRR8506728_output.
Leave all other parameters at their defaults.
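
For readers who want to reproduce a similar run outside the Discovery Environment, the settings above can be sketched as a wtdbg2 command line. This is an assumption about a roughly equivalent invocation, not the app's actual wrapper: it uses the -x rs preset (PacBio RSII), -i for the input reads, -t for threads, and -fo for the output prefix, as described in the wtdbg2 README. The thread count is an illustrative value, the genome-size option (-g) is omitted, and the helper name build_wtdbg2_command is hypothetical. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_wtdbg2_command(reads, prefix, preset="rs", threads=4):
    """Return a wtdbg2 argument list (hypothetical helper; nothing is executed)."""
    return [
        "wtdbg2",
        "-x", preset,        # preset; "rs" selects the PacBio RSII settings
        "-i", reads,         # input reads in FASTA or FASTQ format
        "-t", str(threads),  # number of threads (illustrative value)
        "-fo", prefix,       # -f overwrites existing files, -o sets the prefix
    ]


cmd = build_wtdbg2_command("SRR8506728.fastq", "wtdbg_SRR8506728_output")
print(shlex.join(cmd))
```

To actually run the command, the list could be passed to subprocess.run(cmd, check=True) on a machine where wtdbg2 is installed.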

Output File(s)

This analysis will generate the following output files:

wtdbg_SRR8506728_output.1.dot.gz
wtdbg_SRR8506728_output.1.nodes
wtdbg_SRR8506728_output.1.reads
wtdbg_SRR8506728_output.2.dot.gz
wtdbg_SRR8506728_output.3.dot.gz
wtdbg_SRR8506728_output.alignments.gz
wtdbg_SRR8506728_output.binkmer
wtdbg_SRR8506728_output.closed_bins
wtdbg_SRR8506728_output.clps
wtdbg_SRR8506728_output.ctg.dot.gz
wtdbg_SRR8506728_output.ctg.lay.gz
wtdbg_SRR8506728_output.events
wtdbg_SRR8506728_output.frg.dot.gz
wtdbg_SRR8506728_output.frg.nodes
wtdbg_SRR8506728_output.kmerdep

<prefix>.ctg.lay.gz is the contig layout file that the 'wtpoa-cns' app uses to build a consensus sequence.
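
As a rough illustration of that follow-up step, the sketch below builds a wtpoa-cns command line from the output prefix. It assumes the -t, -i, and -fo options described in the wtdbg2 README; the output name <prefix>.ctg.fa, the thread count, and the helper name build_wtpoa_cns_command are illustrative choices, not something the app defines. The command is built and printed but not executed.

```python
import shlex


def build_wtpoa_cns_command(prefix, threads=4):
    """Return a wtpoa-cns argument list (hypothetical helper; nothing is executed)."""
    return [
        "wtpoa-cns",
        "-t", str(threads),            # number of threads (illustrative value)
        "-i", f"{prefix}.ctg.lay.gz",  # contig layout written by wtdbg2
        "-fo", f"{prefix}.ctg.fa",     # consensus FASTA (assumed output name)
    ]


cns_cmd = build_wtpoa_cns_command("wtdbg_SRR8506728_output")
print(shlex.join(cns_cmd))
```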

Additional assembly statistics (such as number of contigs and N50) can be found in the standard error file: logs/condor_stderr-0
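
For readers unfamiliar with N50, it is the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembly length. The sketch below shows the calculation on made-up contig lengths; these numbers are not results from this test run, and the function name n50 is only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


example_lengths = [80, 70, 50, 40, 30, 20]  # made-up values, total = 290
print(n50(example_lengths))  # 80 + 70 = 150 >= 145, so this prints 70
```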

Tool Source for App

https://github.com/ruanjue/wtdbg2