wtdbg 2.3
Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT).
Quick Start
- wtdbg2 requires long-read input data (PacBio or Nanopore) in fastq or fasta format.
- Resources: https://github.com/ruanjue/wtdbg2
Test Data
Input File(s)
Use SRR8506728.fastq as the input file. These are PacBio data.
Parameters Used in App
When the app is run in the Discovery Environment, use the following parameters with the input file above to reproduce the output listed in the next section.
- Under 'presets' choose PacBio RSII as the sequencing technology
- Under 'output' set the file output prefix to wtdbg_SRR8506728_output
- All other parameters should be left as default
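For readers who want to reproduce the run outside the Discovery Environment, the settings above can be sketched as a wtdbg2 command line. This is a minimal sketch, not the app's actual wrapper: it assumes the 'PacBio RSII' preset corresponds to wtdbg2's `-x rs` option, and the helper name `build_wtdbg2_command` is hypothetical. The function only builds the argument list and does not execute anything.

```python
import shlex


def build_wtdbg2_command(reads, prefix, preset="rs", threads=None):
    """Return the argument list for a wtdbg2 run (nothing is executed).

    Assumption: preset "rs" is the command-line equivalent of choosing
    'PacBio RSII' in the app.
    """
    cmd = ["wtdbg2", "-x", preset, "-i", reads, "-fo", prefix]
    if threads is not None:
        # -t sets the number of threads; omitted here unless requested
        cmd += ["-t", str(threads)]
    return cmd


cmd = build_wtdbg2_command("SRR8506728.fastq", "wtdbg_SRR8506728_output")
print(shlex.join(cmd))
```

The printed command can be pasted into a shell where wtdbg2 is installed, or the list can be passed to `subprocess.run`.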
Output File(s)
This analysis will generate the following output files:
wtdbg_SRR8506728_output.1.dot.gz
wtdbg_SRR8506728_output.1.nodes
wtdbg_SRR8506728_output.1.reads
wtdbg_SRR8506728_output.2.dot.gz
wtdbg_SRR8506728_output.3.dot.gz
wtdbg_SRR8506728_output.alignments.gz
wtdbg_SRR8506728_output.binkmer
wtdbg_SRR8506728_output.closed_bins
wtdbg_SRR8506728_output.clps
wtdbg_SRR8506728_output.ctg.dot.gz
wtdbg_SRR8506728_output.ctg.lay.gz
wtdbg_SRR8506728_output.events
wtdbg_SRR8506728_output.frg.dot.gz
wtdbg_SRR8506728_output.frg.nodes
wtdbg_SRR8506728_output.kmerdep
The <prefix>.ctg.lay.gz file (here, wtdbg_SRR8506728_output.ctg.lay.gz) is the input to the 'wtpoa-cns' app, which builds the consensus sequence.
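The consensus step can be sketched the same way for a command-line run. This is an illustration under assumptions: the helper name `build_wtpoa_cns_command` is hypothetical, and the output file name `<prefix>.ctg.fa` is a choice made for this example, not something the app prescribes.

```python
def build_wtpoa_cns_command(prefix, threads=None):
    """Return the argument list for a wtpoa-cns run (nothing is executed)."""
    layout = prefix + ".ctg.lay.gz"   # contig layout written by wtdbg2
    consensus = prefix + ".ctg.fa"    # illustrative output name (assumption)
    cmd = ["wtpoa-cns", "-i", layout, "-fo", consensus]
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


print(" ".join(build_wtpoa_cns_command("wtdbg_SRR8506728_output")))
```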
Additional assembly statistics (such as number of contigs and N50) can be found in the standard error file: logs/condor_stderr-0
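As a cross-check on the figures in the log, N50 can be recomputed from a list of contig lengths. The sketch below uses the common definition: the length of the contig at which the cumulative length, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the example lengths are illustrative, not taken from the test data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional halves
        if 2 * running >= total:
            return length
    return 0


example_lengths = [2, 3, 4, 5, 6]  # made-up lengths, total = 20
print(len(example_lengths), n50(example_lengths))
```

With these made-up lengths, the two longest contigs (6 and 5) already cover 11 of 20 bases, so the N50 is 5.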