RepeatModeler
Rationale and background:
RepeatModeler is a de novo repeat family identification and modeling package. At the heart of RepeatModeler are two de novo repeat-finding programs (RECON and RepeatScout), which employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data. RepeatModeler automates the runs of RECON and RepeatScout on a given genomic database and uses the output to build, refine, and classify consensus models of putative interspersed repeats.
Version: 1.0.11
Pre-Requisites
- A CyVerse account. (Register for a CyVerse account at user.cyverse.org)
- Mandatory arguments:
- Sequence FASTA file: a sequence database, in FASTA format, containing the genomic sequence
The following test data are provided for testing RepeatModeler at /iplant/home/shared/iplantcollaborative/example_data/repeatmodeler:
- test.fasta: sequence FASTA file
Run RepeatModeler on the test.fasta file.
Results
A successful RepeatModeler run produces several files and directories. The raw output is written to a working directory whose name begins with "RM_" (e.g. "RM_5098.MonMar141305172005"), which remains after each run for debugging purposes. At the completion of the run, two files are generated:
-families.fa : Consensus sequences
-families.stk : Seed alignments
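The consensus sequences can be inspected after a run with a few lines of Python. The sketch below assumes the consensus file is ordinary FASTA text; the function name `read_fasta` and the path shown in the usage comment are illustrative and not part of RepeatModeler.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into a {name: sequence} dictionary.

    The name is the header text after '>' up to the first whitespace.
    Sequence lines belonging to one record are joined into one string.
    """
    records: Dict[str, str] = {}
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    # Store the final record.
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Example usage (the path is a placeholder for your own consensus file):
#
#     with open("path/to/consensus.fa") as handle:
#         for name, seq in read_fasta(handle).items():
#             print(name, len(seq))
```

Listing each family name with its consensus length is a quick way to confirm that the run produced a non-empty library before using it downstream.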
Warning
This app runs on a node with 4 CPUs, so any input sequence file larger than 300 Mb would take 5-6 days to complete. Further development to scale the app will be available soon.
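Given this limit, it may be worth checking the size of an input file before submitting a job. This is a minimal sketch, assuming the 300 Mb figure refers to file size in megabytes (the page does not say whether megabytes or megabases are meant); `exceeds_limit` and `LIMIT_BYTES` are hypothetical names chosen for illustration.

```python
import os

# Assumption: "300 Mb" is read here as 300 megabytes of file size.
LIMIT_BYTES = 300 * 1000 * 1000


def exceeds_limit(path: str, limit_bytes: int = LIMIT_BYTES) -> bool:
    """Return True if the file at `path` is larger than `limit_bytes`."""
    return os.path.getsize(path) > limit_bytes


# Example usage (the path is a placeholder for your own input file):
#
#     if exceeds_limit("path/to/input.fasta"):
#         print("Input is above the threshold; expect a multi-day run.")
```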
More information on the tool can be found here - http://www.repeatmasker.org/RepeatModeler/