CD-HIT-est 4.6.8
Clusters the contigs in a FASTA file of assembled transcripts.
CD-HIT-EST
CD-HIT-EST groups the sequences of a nucleotide dataset into clusters that meet a user-defined similarity threshold, usually expressed as sequence identity.
Quick Start
- To use CD-HIT-EST, import your transcript contigs in fasta format.
- Resources: http://weizhong-lab.ucsd.edu/cd-hit/
Test Data
Input File(s)
Use testranscripts.fasta from the directory above as test input.
Parameters Used in App
When running the app in the Discovery Environment, use the following parameters with the input file above to reproduce the output described in the next section.
- Global sequence identity should be set to 0.94.
- Default settings otherwise.
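For readers who want to reproduce the run outside the Discovery Environment, the settings above can be sketched as a cd-hit-est command line. This is a minimal sketch, not the app's actual wrapper: it assumes that the "Global sequence identity" field maps to the `-c` option (with global identity left at its default), that cd-hit-est is installed and on the PATH, and that the output name matches the one listed in the next section. The helper names `build_cd_hit_est_command` and `run_cd_hit_est` are hypothetical.

```python
import subprocess


def build_cd_hit_est_command(input_fasta, output_fasta, identity=0.94):
    """Build the argument list for a cd-hit-est run.

    Assumption: the app's "Global sequence identity" setting corresponds
    to the -c option; every other option is left at its default.
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be a fraction in (0, 1]")
    return [
        "cd-hit-est",
        "-i", input_fasta,    # input FASTA file of transcript contigs
        "-o", output_fasta,   # output FASTA; the .clstr file is written alongside
        "-c", str(identity),  # sequence identity threshold
    ]


def run_cd_hit_est(input_fasta, output_fasta, identity=0.94):
    """Run cd-hit-est; raises CalledProcessError if the tool exits non-zero."""
    command = build_cd_hit_est_command(input_fasta, output_fasta, identity)
    subprocess.run(command, check=True)


# Print the command that would correspond to the test data on this page.
print(" ".join(build_cd_hit_est_command("testranscripts.fasta", "CD-HITout.fa")))
```

Calling `run_cd_hit_est("testranscripts.fasta", "CD-HITout.fa")` would then launch the tool, provided the input file is in the working directory.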
Output File(s)
Expect CD-HITout.fa and CD-HITout.fa.clstr as output.
CD-HITout.fa contains the representative sequence of each cluster in FASTA format.
CD-HITout.fa.clstr lists each cluster and the sequences assigned to it.
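To inspect the cluster file programmatically, a small parser can help. The sketch below assumes the usual .clstr layout: a `>Cluster N` header line followed by one line per member, with the sequence name between `>` and `...` and a trailing `*` marking the representative. The function name `parse_clstr` and the sample contig names are invented for illustration; names in a real file may be truncated by CD-HIT.

```python
import re

# Member lines look roughly like: "1<TAB>1400nt, >contig_7... at +/97.20%"
MEMBER_NAME = re.compile(r">(\S+?)\.\.\.")


def parse_clstr(lines):
    """Parse .clstr lines into {cluster_label: {"representative", "members"}}."""
    clusters = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            current = line[1:]  # e.g. "Cluster 0"
            clusters[current] = {"representative": None, "members": []}
            continue
        match = MEMBER_NAME.search(line)
        if current is None or match is None:
            continue  # skip lines that do not fit the assumed layout
        name = match.group(1)
        clusters[current]["members"].append(name)
        if line.endswith("*"):  # the representative sequence is flagged with '*'
            clusters[current]["representative"] = name
    return clusters


# Invented sample data in the assumed format, for demonstration only.
sample = [
    ">Cluster 0",
    "0\t1500nt, >contig_1... *",
    "1\t1400nt, >contig_7... at +/97.20%",
    ">Cluster 1",
    "0\t900nt, >contig_3... *",
]
for label, info in parse_clstr(sample).items():
    print(label, info["representative"], len(info["members"]))
```

For a real run, pass an open file handle instead of the sample list, for example `with open("CD-HITout.fa.clstr") as handle: clusters = parse_clstr(handle)`.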