PEAR-0.9.6
Please work through the documentation and add your comments at the bottom of this page, or email comments to support@cyverse.org. Thank you.
What is PEAR?
PEAR is an ultrafast, memory-efficient, and highly accurate paired-end read merger. It is fully parallelized and can run with as little as a few kilobytes of memory. PEAR is distributed under the Creative Commons license, and it runs on the command line under Linux and UNIX-based operating systems.
Mandatory arguments
Specify the name of the file that contains the forward paired-end reads
Specify the name of the file that contains the reverse paired-end reads
Specify the name to be used as the base for the output files. PEAR outputs four files: a file containing the assembled reads with an assembled.fastq extension; two files containing the unassembled forward and reverse reads, with the extensions unassembled.forward.fastq and unassembled.reverse.fastq, respectively; and a file containing the discarded reads with a discarded.fastq extension.
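The naming scheme above can be sketched in a few lines of Python, for example to check that a run produced all four files. This is a minimal sketch, not part of PEAR itself: the function name expected_outputs and the base name sample1 are hypothetical, and joining the base name to each extension with a dot is an assumption.

```python
from pathlib import Path

# Extensions named in the text above. Joining them to the base name
# with a dot is an assumption of this sketch.
SUFFIXES = (
    "assembled.fastq",
    "unassembled.forward.fastq",
    "unassembled.reverse.fastq",
    "discarded.fastq",
)


def expected_outputs(base, directory="."):
    """Return the four output paths expected for a given base name."""
    return [Path(directory) / f"{base}.{suffix}" for suffix in SUFFIXES]


if __name__ == "__main__":
    # "sample1" is a hypothetical base name used only for illustration.
    for path in expected_outputs("sample1"):
        status = "found" if path.exists() else "missing"
        print(f"{path}: {status}")
```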
Optional arguments
P-value | Specify a p-value for the statistical test. If the computed p-value of a possible assembly exceeds the specified p-value then the paired-end read will not be assembled. Valid options are: 0.0001, 0.001, 0.01, 0.05 and 1.0. Setting 1.0 disables the test. (default: 0.01) |
Minimum overlap | Specify the minimum overlap size. The minimum overlap may be set to 1 when the statistical test is used. However, further restricting the minimum overlap size to a proper value may reduce false-positive assemblies. (default: 10) |
Maximum possible length | Specify the maximum possible length of the assembled sequences. Setting this value to 0 disables the restriction and assembled sequences may be arbitrarily long. (default: 0) |
Minimum possible length | Specify the minimum possible length of the assembled sequences. Setting this value to 0 disables the restriction and assembled sequences may be arbitrarily short. (default: 50) |
Minimum length of reads after trimming | Specify the minimum length of reads after trimming the low quality part (see option -q). (default: 1) |
Quality score threshold for trimming | Specify the quality score threshold for trimming the low quality part of a read. If the quality scores of two consecutive bases are strictly less than the specified threshold, the rest of the read will be trimmed. (default: 0) |
Maximal proportion of uncalled bases | Specify the maximal proportion of uncalled bases in a read. Setting this value to 0 will cause PEAR to discard all reads containing uncalled bases. The other extreme setting is 1, which causes PEAR to process all reads regardless of the number of uncalled bases. (default: 1) |
Statistical test | Specify the type of statistical test. Two options are available. (default: 1) |
Empirical base frequencies | Disable empirical base frequencies. (default: use empirical base frequencies) |
Scoring method | Specify the scoring method. (default: 2) |
Base PHRED quality score | Base PHRED quality score. (default: 33) |
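To make the trimming and uncalled-base options in the table more concrete, the following Python sketch shows one possible reading of them. It is not PEAR's implementation. The function names are hypothetical, and two details are assumptions: that trimming starts at the first of the two consecutive low-quality bases, and that uncalled bases are written as N.

```python
def phred_scores(quality_string, base=33):
    """Decode a FASTQ quality string into integer PHRED scores."""
    return [ord(ch) - base for ch in quality_string]


def trim_low_quality(sequence, quality_string, threshold, base=33):
    """Trim a read at the first pair of consecutive low-quality bases.

    Assumption: the cut starts at the first base of the pair. With the
    default threshold of 0, no score is strictly lower, so nothing is cut.
    """
    scores = phred_scores(quality_string, base)
    for i in range(len(scores) - 1):
        if scores[i] < threshold and scores[i + 1] < threshold:
            return sequence[:i], quality_string[:i]
    return sequence, quality_string


def passes_uncalled_filter(sequence, max_proportion):
    """Keep a read if its share of uncalled bases is within the limit.

    Assumption: uncalled bases are written as 'N'.
    """
    if not sequence:
        return True
    uncalled = sequence.upper().count("N")
    return uncalled / len(sequence) <= max_proportion


if __name__ == "__main__":
    print(trim_low_quality("ACGTAC", "II!!II", threshold=10))
    print(passes_uncalled_filter("ACGN", max_proportion=0))
```

Under this reading, a limit of 0 rejects any read that contains an N, and a limit of 1 accepts every read, which matches the two extremes described in the table.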
Test Run
All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:
Community Data > iplantcollaborative > example_data > PEAR (/iplant/home/shared/iplantcollaborative/example_data/PEAR)
Mandatory arguments:
Use Read1.fastq for the forward reads, Read2.fastq for the reverse reads, and ouput (default) as the base name for the output files.
Leave all the optional arguments as default.
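For readers who want to reproduce this test run outside the Discovery Environment, the sketch below builds an equivalent command line in Python. It assumes that PEAR is installed as an executable named pear and that the three mandatory arguments map to the -f, -r, and -o flags of the PEAR command-line tool. The function name pear_command and the base name test_run are hypothetical.

```python
import shlex


def pear_command(forward, reverse, output_base):
    """Build the argument list for a PEAR run with default options.

    Assumptions: the executable is called "pear" and is on the PATH;
    -f, -r and -o select the forward reads, the reverse reads and the
    output base name.
    """
    return ["pear", "-f", forward, "-r", reverse, "-o", output_base]


if __name__ == "__main__":
    # "test_run" is a hypothetical base name used only for illustration.
    cmd = pear_command("Read1.fastq", "Read2.fastq", "test_run")
    print(shlex.join(cmd))
    # To actually run it (requires PEAR to be installed):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when file names contain spaces.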
Output
PEAR produces four files: a file containing the assembled reads with an assembled.fastq extension; two files containing the unassembled forward and reverse reads, with the extensions unassembled.forward.fastq and unassembled.reverse.fastq, respectively; and a file containing the discarded reads with a discarded.fastq extension.
Tool Source for App