...
...
...
...
...
...
...
...
...
Alert:
...
| The CyVerse App Store is currently being restructured, and apps are being moved to an HPC environment. During this transition, users may occasionally be unable to locate or use apps that are listed in our tutorials. In many cases, these apps can be found by searching for them using the search bar at the top of the Apps window in the DE. To improve the chance of a successful search, search not for the entire app name and version number but only for the portion that refers to the app's function or origin (e.g. 'SOAPdenovo' instead of 'SOAPdenovo-Trans 1.01'). Also, as part of the 2.8 app categorization, a number of apps were deprecated and are no longer available, and there is no longer an Archive category. You can search for a suitable replacement in the List of Applications in this window, or search for an app name or the tool used by an app in the Apps window search field. If you need an app reinstated, please contact support@cyverse.org. |
Please work through the documentation and add your comments at the bottom of this page, or email comments to support@cyverse.org. Thank you.
HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays. HTSeq includes parsers for common file formats for a variety of types of input data and is suitable as a general platform for a diverse range of tasks. A core component of HTSeq is a container class that simplifies working with data associated with genomic coordinates, i.e. values attributed to genomic positions (e.g. read coverage) or to genomic intervals (e.g. genomic features such as exons or genes). Two stand-alone applications developed with HTSeq are distributed with the package, namely htseq-qa for read quality assessment and htseq-count for preprocessing RNA-Seq alignments for differential expression calling.
HTSeq is described in the following publication:
Simon Anders, Paul Theodor Pyl, Wolfgang Huber. HTSeq — A Python framework to work with high-throughput sequencing data. Bioinformatics (2014), in print, online at doi:10.1093/bioinformatics/btu638
Info | ||
---|---|---|
| ||
This is the updated version, HTSeq-count-0.6.1. The older version, HTSeq-count-0.5.4, is still available here |
Mandatory arguments
- Input SAM/BAM files: The alignment_files input contains the aligned reads in SAM or BAM format.
Info |
---|
Make sure to use a splicing-aware aligner such as TopHat. HTSeq-count makes full use of the information in the CIGAR field. |
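To illustrate why the CIGAR field matters for spliced reads, here is a minimal Python sketch that lists the reference intervals a read actually covers, skipping introns (the `N` operation). The function name `aligned_blocks` and the example CIGAR string are hypothetical, and this is not HTSeq's own parser; it only follows the CIGAR operation semantics of the SAM format.

```python
import re

# CIGAR operations that consume reference bases (per the SAM format).
REF_CONSUMING = set("MDN=X")

def aligned_blocks(start, cigar):
    """Return half-open (start, end) reference intervals covered by
    match operations (M, =, X). `start` is the 0-based leftmost
    reference position of the alignment."""
    blocks = []
    pos = start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op in "M=X":
            blocks.append((pos, pos + length))
        if op in REF_CONSUMING:
            pos += length
    return blocks

# Hypothetical spliced read: 50 matched bases, a 200-base intron, 25 matched bases.
print(aligned_blocks(1000, "50M200N25M"))  # [(1000, 1050), (1250, 1275)]
```

An aligner that is not splicing-aware would not emit the `N` operation, so the two exon blocks above could not be told apart from one contiguous alignment.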
...
Parameters
...
Output Folder
...
What is PEAR?
PEAR is an ultrafast, memory-efficient and highly accurate paired-end read merger. It is fully parallelized and can run with as little as a few kilobytes of memory. PEAR is distributed under the Creative Commons license, and it runs on the command line under Linux and UNIX-based operating systems.
Mandatory arguments
Specify the name of the file that contains the forward paired-end reads.
Specify the name of the file that contains the reverse paired-end reads.
Specify the name to be used as the base for the output files. PEAR outputs four files: a file containing the assembled reads with an assembled.fastq extension, two files containing the forward and reverse unassembled reads with the extensions unassembled.forward.fastq and unassembled.reverse.fastq, respectively, and a file containing the discarded reads with a discarded.fastq extension.
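As a small illustration of the naming scheme described above, the following Python sketch derives the four expected output file names from a base name. The helper `pear_output_files` is hypothetical, and it assumes the base name and the extension are joined with a dot.

```python
# The four extensions listed in the PEAR output description.
PEAR_EXTENSIONS = [
    "assembled.fastq",
    "unassembled.forward.fastq",
    "unassembled.reverse.fastq",
    "discarded.fastq",
]

def pear_output_files(base):
    """Return the expected output file names for a given base name.

    Assumption: base name and extension are separated by a dot.
    """
    return [f"{base}.{ext}" for ext in PEAR_EXTENSIONS]

for name in pear_output_files("output"):
    print(name)
```

A list like this can be handy for checking that a finished analysis produced every expected file.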
Optional arguments
| Specify a p-value for the statistical test. If the computed p-value of a possible assembly exceeds the specified p-value, the paired-end read will not be assembled. Valid options are: 0.0001, 0.001, 0.01, 0.05 and 1.0. Setting 1.0 disables the test. (default: 0.01) |
| Specify the minimum overlap size. The minimum overlap may be set to 1 when the statistical test is used. However, further restricting the minimum overlap size to a proper value may reduce false-positive assemblies. (default: 10) |
Maximum possible length | Specify the maximum possible length of the assembled sequences. Setting this value to 0 disables the restriction, and assembled sequences may be arbitrarily long. (default: 0) |
Minimum possible length | Specify the minimum possible length of the assembled sequences. Setting this value to 0 disables the restriction, and assembled sequences may be arbitrarily short. (default: 50) |
Minimum length of reads after trimming | Specify the minimum length of reads after trimming the low-quality part (see option -q). (default: 1) |
Quality score threshold for trimming | Specify the quality score threshold for trimming the low-quality part of a read. If the quality scores of two consecutive bases are strictly less than the specified threshold, the rest of the read will be trimmed. (default: 0) |
Maximal proportion of uncalled bases | Specify the maximal proportion of uncalled bases in a read. Setting this value to 0 will cause PEAR to discard all reads containing uncalled bases. The other extreme setting is 1, which causes PEAR to process all reads regardless of the number of uncalled bases. (default: 1) |
Statistical test | Specify the type of statistical test. Two options are available. (default: 1) |
Empirical base frequencies | Disable empirical base frequencies. (default: use empirical base frequencies) |
Scoring method | Specify the scoring method. (default: 2) |
Base PHRED quality score | Base PHRED quality score. (default: 33) |
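The trimming and uncalled-base options can be easier to follow with a concrete example. The sketch below is a rough Python illustration of the rules as described in the table, not PEAR's actual implementation. The function names are hypothetical, and two details are assumptions: that the read is cut at the first of the two consecutive low-quality bases, and that the uncalled-base check runs after trimming.

```python
def trim_low_quality(seq, qual, threshold, phred_base=33):
    """Trim the read where two consecutive bases both score strictly
    below `threshold`. Assumption: the cut is placed at the first of
    the two low-quality bases."""
    scores = [ord(c) - phred_base for c in qual]
    for i in range(len(scores) - 1):
        if scores[i] < threshold and scores[i + 1] < threshold:
            return seq[:i], qual[:i]
    return seq, qual

def uncalled_fraction(seq):
    """Proportion of uncalled bases (N) in a read."""
    return seq.upper().count("N") / len(seq) if seq else 0.0

def keep_read(seq, qual, threshold=0, min_length=1,
              max_uncalled=1.0, phred_base=33):
    """Apply trimming, then the minimum-length and uncalled-base checks.
    Defaults mirror the defaults listed in the table."""
    seq, qual = trim_low_quality(seq, qual, threshold, phred_base)
    if len(seq) < min_length:
        return False
    return uncalled_fraction(seq) <= max_uncalled

# 'I' encodes quality 40 and '!' encodes quality 0 with a PHRED base of 33.
print(trim_low_quality("ACGTACGT", "IIII!!II", threshold=20))
print(keep_read("ACNT", "IIII", max_uncalled=0))
```

With the default threshold of 0 no score can be strictly lower, so no trimming takes place, and with the default maximal proportion of 1 no read is discarded for uncalled bases.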
Test Run
All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:
Community Data > iplantcollaborative > example_data > htseq_count > 0.6.1 PEAR (/iplant/home/shared/iplantcollaborative/example_data/htseq_count/0.6.1PEAR)
Mandatory arguments:
Use testfile.sam and hy5_rep1_transcripts.gtf as inputs.
...
Use Read1.fastq for forward reads, Read2.fastq for reverse reads and output (default) as the name of the output file.
Leave all optional arguments at their default values.
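For readers who want to relate the PEAR test run to the command line, the following Python sketch assembles a roughly equivalent invocation. It assumes the `-f`, `-r` and `-o` flags of the PEAR command-line tool and an executable named `pear` on the PATH; the exact command that the DE app runs is not shown on this page, and `pear_command` is a hypothetical helper.

```python
import shlex

def pear_command(forward, reverse, output_base):
    """Build a PEAR argument list with all optional arguments left
    at their defaults.

    Assumptions: executable name `pear`; flags -f (forward reads),
    -r (reverse reads) and -o (output base name).
    """
    return ["pear", "-f", forward, "-r", reverse, "-o", output_base]

cmd = pear_command("Read1.fastq", "Read2.fastq", "output")
print(shlex.join(cmd))  # pear -f Read1.fastq -r Read2.fastq -o output
```

The argument list can be passed to `subprocess.run(cmd, check=True)` on a system where PEAR is installed.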
Output
...
PEAR produces four files: a file containing the assembled reads with an assembled.fastq extension, two files containing the forward and reverse unassembled reads with the extensions unassembled.forward.fastq and unassembled.reverse.fastq, respectively, and a file containing the discarded reads with a discarded.fastq extension.
Tool Source for App
- http://www-huber.embl.de/HTSeq/doc/count.html
- https://sco.h-its.org/exelixis/web/software/pear/doc.html#TOC