Soapdenovo-2
This is a two-step process: the application first creates the configuration file that Soapdenovo needs to run on HPC resources via the Foundation API, then calls the installed tool on Lonestar.
Configuration file details (from SOAPdenovo documentation):
The configuration file has a global information section followed by the library sections. Information about each library and the sequencing data generated from it belongs in the corresponding library section. Currently the global section holds only the maximal read length. Each library section starts with the tag [LIB], followed by the read file names with their paths, the read file format, the average insert size, the library rank, and two other flags that tell the assembler how to treat the reads.
The assembler accepts read files in two formats: FASTA or FASTQ. The mate-pair relationship can be indicated in two ways: two sequence files whose reads are in the same order, so that reads at the same position form a pair, or a single file (FASTA only) in which two adjacent reads form a pair.
Libraries are used for scaffolding in the order indicated by "rank"; libraries with the same rank are used at the same time.
The flag "asm_flag" has three eligible values: 1 (reads only used for contig assembly), 2 (only used for scaffold assembly) and 3 (used for both contig and scaffold assembly).
There are two types of paired-end libraries: a) forward-reverse, generated from fragmented DNA ends, with a typical insert size of less than 800 bp; b) reverse-forward, generated from circularizing libraries, with a typical insert size greater than 2 kb. Users should set the "reverse_seq" tag to indicate which type applies: 0 for forward-reverse, 1 for reverse-forward.
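As an illustration of the layout described above, the following Python sketch writes a minimal configuration with one global setting and one [LIB] section. The key names (max_rd_len, avg_ins, reverse_seq, asm_flags, rank, q1, q2) are taken from the example configuration in the SOAPdenovo documentation, where the assembly flag key is spelled asm_flags. The helper function, its argument names, and the read file names are hypothetical, and the file this app actually generates may contain different or additional keys.

```python
def build_soap_config(max_rd_len, libraries):
    """Return SOAPdenovo-style configuration text.

    `libraries` is a list of dicts. This helper and its argument
    names are illustrative only; they are not part of SOAPdenovo
    or of the app.
    """
    lines = [f"max_rd_len={max_rd_len}"]  # global information section
    for lib in libraries:
        lines.append("[LIB]")  # every library section starts with this tag
        lines.append(f"avg_ins={lib['avg_ins']}")  # average insert size
        # 0 = forward-reverse, 1 = reverse-forward
        lines.append(f"reverse_seq={lib.get('reverse_seq', 0)}")
        # 1 = contig only, 2 = scaffold only, 3 = both
        lines.append(f"asm_flags={lib.get('asm_flags', 3)}")
        lines.append(f"rank={lib['rank']}")  # order of use in scaffolding
        # Paired FASTQ files with reads in the same order
        lines.append(f"q1={lib['q1']}")
        lines.append(f"q2={lib['q2']}")
    return "\n".join(lines) + "\n"


# Hypothetical paired-end library with a 180 bp insert size.
config_text = build_soap_config(
    101,
    [{"avg_ins": 180, "rank": 1, "q1": "reads_1.fastq", "q2": "reads_2.fastq"}],
)
print(config_text)
```

The defaults chosen here (forward-reverse orientation, reads used for both contig and scaffold assembly) are a convenience of the sketch, not documented defaults of the app.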
Quick Start
- To use Soapdenovo-2, import your data in FASTQ or FASTA format.
- Resources: http://soap.genomics.org.cn/soapdenovo.html
Test Data
Info |
---|
Test data for this app appears directly in the Discovery Environment in the Data window under Community Data -> iplantcollaborative -> example_data -> Soapdenovo-2 |
Input File(s)
Use frag_1.fastq and frag_2.fastq from the directory above as test input.
Parameters Used in App
When the app is run in the Discovery Environment, use the following parameters with the above input file(s) to get the output provided in the next section below.
- Use these parameters within the DE app interface:
  - maximum read length - 101
  - Output Prefix - SoapOutputFrag
  - kmer size - 51
  - library 1 insert size - 180
  - library 1 rank for scaffolding - 1
  - Maximum Run Time - 1 hour
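
For orientation, the kmer size and Output Prefix above would map onto a standalone SOAPdenovo command line roughly as sketched below. This is an assumption about a manual run, not a description of how the app invokes the tool on Lonestar. The SOAPdenovo-63mer and SOAPdenovo-127mer binary names and the `all`, `-s`, `-K`, and `-o` options come from the SOAPdenovo2 documentation; the helper function is hypothetical. The sketch only builds and prints the argument list and does not execute anything.

```python
import shlex


def soapdenovo_command(config_path, kmer, output_prefix):
    """Build an argument list for a standalone SOAPdenovo2 run.

    Illustrative helper only; the app's real invocation is not
    documented on this page.
    """
    # SOAPdenovo2 ships separate binaries for k <= 63 and k <= 127.
    binary = "SOAPdenovo-63mer" if kmer <= 63 else "SOAPdenovo-127mer"
    return [binary, "all", "-s", config_path, "-K", str(kmer), "-o", output_prefix]


cmd = soapdenovo_command("config_file.txt", 51, "SoapOutputFrag")
print(shlex.join(cmd))
```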
Output File(s)
Expect as output: files with the Output Prefix, and config_file.txt (among other files). For the test case, you will find in the example_data directory sample output files named SoapOutputFrag.contig, SoapOutputFrag.scafSeq, and config_file.txt.
Tool Source for App
...