Transposable element annotation on JeStream

Tool prerequisites:

 

Scripts and intermediate files used to annotate TEs in Jiao et al. 2016.


LTR Retrotransposons

Scripts in ltr/

Software needed:

  • ncbi blast+

  • genometools, (download), need to pass 64bit=yes with-hmmer=yes threads=yes to both make and make install so that ltrdigest HMM searches can run in parallel. I also had to pass cairo=no because I didn't have the right cairo libraries and it wouldn't compile otherwise

  • silix, (download), need to compile with --enable-mpi and --enable-verbose

  • hmmer (genometools will download and compile hmmer2 if you run make with with-hmmer=yes)
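
The genometools build flags listed above can be collected into a single invocation. This is a minimal sketch, assuming it is run from inside the unpacked genometools source directory. The gt_make helper and the DRY_RUN variable are illustrative names, not part of genometools, and by default the commands are only printed rather than executed:

```shell
#!/bin/sh
# Build flags from the list above; cairo=no is only needed when the
# cairo libraries are missing on the build machine.
GT_FLAGS="64bit=yes with-hmmer=yes threads=yes cairo=no"

# Hypothetical helper: print the make command and run it only when DRY_RUN=0.
# $1 is an optional make target such as "install".
gt_make() {
    cmd="make $GT_FLAGS${1:+ $1}"
    echo "$cmd"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        $cmd
    fi
}

gt_make            # compile
gt_make install    # install, passing the same flags again
```

Setting DRY_RUN=0 before running the script executes the two make commands instead of just printing them.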

Files needed, can be downloaded by get_tRNA_hmm_dbs.sh in ltr directory:

  • download HMMs of TE protein-coding domains from GyDB into the directory gydb_hmms; these will be used to identify protein-coding domains of TE models

    - need to fix the HMM named ty1/copia, because ltrdigest uses this name as a filename. To remove the forward slash: sed -i "s#ty1/copia#ty1-copia#g" gydb_hmms/GyDB_collection/profiles/AP_ty1copia.hmm

  • download tRNAs of all eukaryotes
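
To see what the ty1/copia fix does before touching the real profile, the substitution can be tried on a throwaway file. This is only a sketch: the two-line file below is an assumed stand-in for the header of an HMMER2 profile, not the actual contents of AP_ty1copia.hmm, and -i.bak is used instead of plain -i so the command behaves the same with GNU and BSD sed:

```shell
#!/bin/sh
# Work on a scratch copy, not on the real GyDB profile.
tmpdir=$(mktemp -d)
hmm="$tmpdir/AP_ty1copia.hmm"

# Assumed shape of the offending NAME line in the profile header.
printf 'HMMER2.0\nNAME  ty1/copia\n' > "$hmm"

# Same substitution as in the note above; -i.bak also keeps a backup copy.
sed -i.bak "s#ty1/copia#ty1-copia#g" "$hmm"

# Show the model name after the edit; it should no longer contain a slash.
fixed_name=$(grep '^NAME' "$hmm")
echo "$fixed_name"
```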

SINEs

Scripts in sine/

Software needed:

  • SINE-Finder, download (This is a supplemental file at The Plant Cell; need to make executable, and rename to sine_finder.py)

    • I could not get SINE-Finder to work on reverse sequences, so I'm reporting SINEs only on the forward strand here and will pick up sequences on the reverse strand with RepeatMasker.
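
The rename and chmod step for SINE-Finder might look like the sketch below. The install_sine_finder helper is a hypothetical convenience function, and the source file name in the usage comment is a placeholder, since the name of the downloaded supplemental file is not given here:

```shell
#!/bin/sh
# Hypothetical helper: rename a downloaded SINE-Finder script to
# sine_finder.py and make it executable.
# $1 = path of the downloaded supplemental file
# $2 = directory to place sine_finder.py in (defaults to the current one)
install_sine_finder() {
    src=$1
    dest_dir=${2:-.}
    mv "$src" "$dest_dir/sine_finder.py" && chmod +x "$dest_dir/sine_finder.py"
}

# Usage (placeholder file name, adjust to the actual download):
# install_sine_finder ~/Downloads/downloaded_supplement.txt sine/
```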

LINEs

Scripts in line/

Software needed:

TIR including MITEs

Scripts in tir/

Software needed:

  • detectMITE, download

  • mTEA, genometools (see above, already installed for ltr annotation)

    • mTEA needs fasta36 (specifically ggsearch36), bioperl, blast, and muscle; the supplied blogo directories must be added to PERL5LIB and PATH
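
One way to set up the environment for mTEA is sketched below. MTEA_DIR and BLOGO_DIR are placeholder variables, and the location of the blogo directory inside the mTEA checkout is an assumption, so adjust both to match the actual layout:

```shell
#!/bin/sh
# MTEA_DIR is a placeholder; point it at wherever mTEA was unpacked.
MTEA_DIR="${MTEA_DIR:-$HOME/software/mTEA}"

# Assumption: the supplied blogo directory sits directly inside the checkout.
BLOGO_DIR="$MTEA_DIR/blogo"

# Prepend so the bundled Perl modules and scripts are found first.
export PERL5LIB="$BLOGO_DIR${PERL5LIB:+:$PERL5LIB}"
export PATH="$BLOGO_DIR:$PATH"

# Report any of the named helper programs that cannot be found on PATH.
for tool in ggsearch36 muscle; do
    command -v "$tool" >/dev/null 2>&1 || echo "missing on PATH: $tool"
done
```

Sourcing these lines from a shell profile or from the top of a job script keeps the settings in place for every mTEA run.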

Helitrons

Scripts in helitron/

Software needed:


Finding Homologous Fragments from Degraded TEs

Software needed:

 

Step 1: Git clone the repo

 $ git clone https://github.com/mcstitzer/maize_v4_TE_annotation.git
 $ cd maize_v4_TE_annotation/
 $ ls
helitron  line  ltr  README.md  sine  tir

 

Step 2: To predict structural LTRs

2.1 predict LTRs:

  • download the tRNA and GyDB HMMs using get_tRNA_hmm_dbs.sh; these are needed for ltrdigest

  • LTR TEs are often nested, so the copies found in one pass need to be removed and the prediction rerun. This is done in mask_subtract

$ cd ltr
$ sh get_tRNA_hmm_dbs.sh 

This will download the tRNA database for all Eukaryotes



