Tallymer-mkindex
Tallymer-mkindex counts and indexes k-mers for a specified value of k (e.g. all 20-mers) in a set of sequences. It requires an enhanced suffix array (ESA) generated with Suffixerator. The output is a Tallymer index that can be used to search FASTA sequences with the Tallymer-Search app.
...
...
...
- Resources:
Test Data
Test data for this app appears directly in the Discovery Environment in the Data window under Community Data -> iplantcollaborative -> example_data -> Tallymer.
Input File(s)
Specify the directory containing the ESA files. For example, if you use the example_data directory above, the entry will be "/iplant/home/shared/iplantcollaborative/example_data/Tallymer/".
Then specify the root name of the ESA. Using example_data you would enter "maize_BAC100".
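As a rough illustration of how the two inputs fit together, the sketch below joins the ESA directory and root name into a single index prefix and assembles a GenomeTools command line from it. This is an assumption about what runs behind the app, not a description of its actual wrapper: the helper names `esa_prefix` and `mkindex_command` are hypothetical, and the k-mer length, minimum occurrence, and index name are the example values used further down this page.

```python
import posixpath

def esa_prefix(esa_directory, esa_root_name):
    """Join the ESA directory and root name into one index prefix."""
    # posixpath.join copes with a directory given with or without a trailing slash.
    return posixpath.join(esa_directory, esa_root_name)

def mkindex_command(prefix, k, min_occ, index_name):
    """Build (but do not run) a 'gt tallymer mkindex' argument list.

    Assumption: the app wraps this GenomeTools command; the app's real
    invocation may differ.
    """
    return [
        "gt", "tallymer", "mkindex",
        "-mersize", str(k),        # desired k-mer length
        "-minocc", str(min_occ),   # minimum occurrence of a k-mer to index
        "-indexname", index_name,  # root name of the Tallymer index
        "-counts", "-pl",
        "-esa", prefix,            # ESA produced by Suffixerator
    ]

prefix = esa_prefix(
    "/iplant/home/shared/iplantcollaborative/example_data/Tallymer/",
    "maize_BAC100",
)
command = mkindex_command(prefix, 20, 5, "maize_BACS100_20mer_minocc5")
print(" ".join(command))
```

The command is only printed here; running it would require a local GenomeTools installation and the ESA files.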
Parameters Used in App
- Use these parameters within the DE app interface:
- Desired k-mer length: specify a number indicating k-mer length to be indexed
- Minimum occurrence of k-mer to report: specify the minimum number of times a k-mer must be found in the original set of sequences used to generate the Suffixerator ESA in order to be indexed. For example, if you specify '5', only k-mers found 5 or more times in those sequences will be indexed.
- Give a name to the index you are creating (optional): provide a root name for the index to be generated, or the app will generate a name automatically. In the example, we provided the root name "maize_BACS100_20mer_minocc5" to indicate the original source of sequence, the desired k-mer length, and the minimum occurrence count.
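To make the k-mer length and minimum occurrence parameters concrete, here is a minimal Python sketch that counts k-mers in a toy set of sequences and keeps only those meeting a threshold. It is a conceptual illustration only: Tallymer works on the enhanced suffix array rather than a hash table, the sketch counts the forward strand only, and the function names and toy sequences are made up for this example.

```python
from collections import Counter

def count_kmers(sequences, k):
    """Count every k-mer (forward strand only) across all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k along the sequence.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

def filter_by_min_occurrence(counts, min_occ):
    """Keep only k-mers seen at least min_occ times."""
    return {kmer: n for kmer, n in counts.items() if n >= min_occ}

# Hypothetical toy input, not part of the example data.
toy_sequences = ["ACGTACGTAC", "TTACGTACGG"]
counts = count_kmers(toy_sequences, k=4)
indexed = filter_by_min_occurrence(counts, min_occ=3)
print(sorted(indexed))
```

With k=4 and a minimum occurrence of 3, the 4-mers that occur only once in the toy input (such as TTAC) are dropped, which mirrors how a higher threshold shrinks the resulting index.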
Output File(s)
Output will be four files:
maize_BACS100_20mer_minocc5.mbd
maize_BACS100_20mer_minocc5.mct
maize_BACS100_20mer_minocc5.mer
mer20distribution
The first three files listed above together constitute the Tallymer index. These files share a common root name, and each has a unique 3-letter suffix.
The mer20distribution file is a text file that gives summary information about the distribution of k-mers.
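The naming pattern above can be sketched as a small helper for checking that a run produced everything expected. The function names are hypothetical, and the "mer<k>distribution" pattern is an assumption inferred from the single 20-mer example on this page.

```python
def expected_index_files(root_name):
    """The three files that make up the Tallymer index for a root name."""
    return [f"{root_name}.{suffix}" for suffix in ("mbd", "mct", "mer")]

def expected_outputs(root_name, k):
    """Index files plus the distribution summary file.

    Assumption: the summary file is named 'mer<k>distribution', as in the
    20-mer example.
    """
    return expected_index_files(root_name) + [f"mer{k}distribution"]

outputs = expected_outputs("maize_BACS100_20mer_minocc5", 20)
for name in outputs:
    print(name)
```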
Tool Source for App
...