RepeatModeler

Please work through the tutorial and add your comments at the bottom of this page, or send comments by email to kchougul@cshl.edu. Thank you.

Rationale and background:

RepeatModeler is a de novo repeat family identification and modeling package. At the heart of RepeatModeler are two de novo repeat-finding programs (RECON and RepeatScout), which employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data. RepeatModeler automates the runs of RECON and RepeatScout on a given genomic database and uses the output to build, refine, and classify consensus models of putative interspersed repeats.

 

 


Version: 1.0.11


Pre-Requisites

  1. A CyVerse account. (Register for a CyVerse account at user.cyverse.org.)
  2. Mandatory arguments:
    1. Sequence FASTA file: a sequence database, in FASTA format, containing the genomic sequence

Test/sample data

The following test data are provided for testing RepeatModeler at /iplant/home/shared/iplantcollaborative/example_data/repeatmodeler:

  1. test.fasta: sequence FASTA file

Run RepeatModeler on the test.fasta file.

Results 

Successful execution of RepeatModeler produces several files and directories. The raw output is directed to a working directory whose name starts with RM_ (e.g. "RM_5098.MonMar141305172005"), which remains after each run for debugging purposes. At the completion of the run two files are generated:

-families.fa : Consensus sequences

-families.stk : Seed alignments
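Because the consensus file is FASTA, a quick sanity check after a run is to count the families and look at their lengths. The following is a minimal sketch, not part of RepeatModeler or the CyVerse app: the function names read_fasta and summarize are hypothetical, and the example records are made up for illustration rather than taken from a real run.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {header: sequence}.

    Assumes each header is unique; a repeated header would overwrite
    the earlier record.
    """
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequences may span several lines
    return records


def summarize(records: Dict[str, str]) -> Dict[str, int]:
    """Return the number of sequences, total length, and longest length."""
    lengths = [len(seq) for seq in records.values()]
    return {
        "families": len(lengths),
        "total_bp": sum(lengths),
        "longest_bp": max(lengths, default=0),
    }


# Made-up records standing in for a consensus file.
example = [">family_A", "ACGTACGT", "ACGT", ">family_B", "TTGCA"]
print(summarize(read_fasta(example)))
# {'families': 2, 'total_bp': 17, 'longest_bp': 12}
```

To use this on a real result, open the downloaded consensus file and pass the file handle to read_fasta in place of the example list, since a text file handle iterates over its lines.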

Warning

This app runs on a node with 4 CPUs, so any input sequence file larger than 300 Mb will take 5-6 days to complete. Further development to scale the app will be available soon.
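Before submitting a job, it may be worth checking the input file against this threshold. The sketch below is only an illustration: it assumes "300 Mb" refers to file size in megabytes (for a FASTA file this is close to the number of bases), and the names SIZE_LIMIT_BYTES and exceeds_limit are hypothetical, not part of the app.

```python
import os
import tempfile

# Assumption: the 300 Mb threshold is read as 300 megabytes on disk.
SIZE_LIMIT_BYTES = 300 * 1000 * 1000


def exceeds_limit(path: str, limit: int = SIZE_LIMIT_BYTES) -> bool:
    """Return True if the file at `path` is larger than `limit` bytes."""
    return os.path.getsize(path) > limit


# Demonstration on a tiny temporary FASTA file (a few bytes long).
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as handle:
    handle.write(">seq1\nACGT\n")
    demo_path = handle.name

print(exceeds_limit(demo_path))  # False: far below the threshold
os.remove(demo_path)
```

If the check returns True for your own input, expect a multi-day run on the current 4-CPU configuration.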


More information on the tool can be found here - http://www.repeatmasker.org/RepeatModeler/

 
