Please work through the tutorial and add your comments at the bottom of this page, or send comments by email to firstname.lastname@example.org. Thank you.
Rationale and background:
RepeatModeler is a de novo repeat family identification and modeling package. At the heart of RepeatModeler are two de novo repeat-finding programs (RECON and RepeatScout), which employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data. RepeatModeler automates the runs of RECON and RepeatScout on a given genomic database and uses the output to build, refine, and classify consensus models of putative interspersed repeats.
- A CyVerse account. (Register for a CyVerse account here - user.cyverse.org)
- Mandatory arguments -
- Sequence FASTA file: a sequence database, in FASTA format, containing the genomic sequence
The following test data are provided for testing RepeatModeler here - /iplant/home/shared/iplantcollaborative/example_data/repeatmodeler:
- test.fasta: sequence fasta file
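Before submitting a job, it can help to confirm that the input really is a well-formed FASTA file and to see how much sequence it contains. The following is a minimal sketch in Python, not part of the app itself. The function name `summarize_fasta` is hypothetical, and the script assumes a local copy of the input file (for example, a downloaded copy of test.fasta).

```python
import sys


def summarize_fasta(path):
    """Return (number of records, total residues) for a FASTA file.

    Hypothetical helper for illustration; it only checks basic structure:
    every sequence line must come after a '>' header line.
    """
    n_records = 0
    n_residues = 0
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                n_records += 1
            elif n_records == 0:
                raise ValueError("sequence data found before the first '>' header")
            else:
                n_residues += len(line)
    if n_records == 0:
        raise ValueError("no FASTA records found")
    return n_records, n_residues


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (assumed local path): python summarize_fasta.py test.fasta
    records, residues = summarize_fasta(sys.argv[1])
    print(f"{records} sequences, {residues} residues")
```

A file that raises `ValueError` here is unlikely to be accepted as a sequence database, so it is worth fixing before starting a long run.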
Run RepeatModeler on the test.fasta file.
A successful RepeatModeler run produces several files and directories. The raw output is written to a working directory named RM_<PID>.<DATE>, e.g. "RM_5098.MonMar141305172005", which remains after each run for debugging purposes. At the completion of the run, two files are generated:
<database_name>-families.fa : Consensus sequences
<database_name>-families.stk : Seed alignments
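Once the analysis has finished and the results have been downloaded, a quick check is to locate the two final files and count how many consensus sequences were produced. The sketch below is an illustration only, written in Python. The helper names `find_family_files` and `count_consensus_sequences` are hypothetical, and it assumes the files ending in -families.fa and -families.stk sit directly in the directory you point it at.

```python
import glob
import os


def find_family_files(run_dir="."):
    """Return (consensus FASTA paths, seed alignment paths) found in run_dir.

    Assumption: the final files end in '-families.fa' and '-families.stk'
    and are located directly inside run_dir.
    """
    fasta = sorted(glob.glob(os.path.join(run_dir, "*-families.fa")))
    stk = sorted(glob.glob(os.path.join(run_dir, "*-families.stk")))
    return fasta, stk


def count_consensus_sequences(fasta_path):
    """Count FASTA records, i.e. lines starting with '>'."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


if __name__ == "__main__":
    fasta_files, stk_files = find_family_files(".")
    for path in fasta_files:
        print(path, count_consensus_sequences(path), "consensus sequences")
    if not fasta_files or not stk_files:
        print("Expected output files not found; the run may not have completed.")
```

If either file is missing, the raw output in the RM_ working directory is the place to look for clues.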
This app runs on a node with 4 CPUs, so any input sequence file larger than 300 Mb would take 5-6 days to complete. Further development to scale the app will be available soon.
More information on the tool can be found here - http://www.repeatmasker.org/RepeatModeler/