TEMP for transposable element detection
Rationale and background:
TEMP is a software package that detects transposable element (TE) insertions and absences, pinpoints their junctions with genomic DNA at base-pair resolution, and estimates their frequencies in the population. The TEMP insertion and absence algorithms are available in the CyVerse Discovery Environment as two separate applications:
TEMP-insertions - for TE insertion analysis
TEMP-absence - for TE absence analysis
TEMP: a computational method for analyzing transposable element polymorphism in populations. Jiali Zhuang, Jie Wang, William Theurkauf, Zhiping Weng. Nucleic Acids Research, Volume 42, Issue 11, 17 June 2014, Pages 6826–6838, https://doi.org/10.1093/nar/gku323
Prerequisites
A CyVerse account (register at https://user.cyverse.org/register)
Mandatory arguments for TEMP-insertions
Input file in BAM format
Transposon consensus sequences in FASTA format
Annotated transposon positions in the genome
Number of mismatches allowed when mapping to TE consensus sequences
An integer specifying the length of the fragments
Mandatory arguments for TEMP-absence
Input file in BAM format
Annotated transposon positions in the genome (e.g., from RepeatMasker) in BED6 format, given with the full path
2bit file for the reference genome
An integer specifying the length of the fragments (inserts) of the library
Refer to the TEMP manual for more details: https://github.com/JialiUMassWengLab/TEMP/blob/master/Manual
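Because the annotation file must be in BED6 format, a quick structural check before launching the app can catch formatting problems early. The following is a minimal sketch in Python, not part of TEMP or CyVerse: the function name check_bed6 and the example records are made up for illustration, and it only tests the generic BED6 layout (six tab-separated fields: chromosome, start, end, name, score, strand). Any additional requirements TEMP places on the columns are described in the TEMP manual and are not checked here.

```python
from typing import Iterable, List, Tuple


def check_bed6(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for lines that are not valid BED6.

    Hypothetical helper for illustration; it checks only the generic
    BED6 layout, not any TEMP-specific expectations.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and the optional header lines that BED allows.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) != 6:
            problems.append(
                (number, f"expected 6 tab-separated fields, found {len(fields)}")
            )
            continue
        chrom, start, end, name, score, strand = fields
        if not (start.isdigit() and end.isdigit()):
            problems.append((number, "start and end must be non-negative integers"))
        elif int(start) > int(end):
            problems.append((number, "start must not exceed end"))
        if strand not in {"+", "-", "."}:
            problems.append((number, f"unexpected strand value {strand!r}"))
    return problems


if __name__ == "__main__":
    # Made-up records: one valid, one with start > end, one with too few fields.
    example = [
        "chr2L\t100\t500\tjockey\t0\t+",
        "chr2L\t900\t800\troo\t0\t-",
        "chr2L\t1000\t1200\troo",
    ]
    for number, problem in check_bed6(example):
        print(f"line {number}: {problem}")
```

To check a real file, pass an open file handle instead of the list, for example check_bed6(open("test_TE_annotation.bed")).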
Test with sample data
Test data for this app is available in the Discovery Environment, in the Data window under Community Data -> iplantcollaborative -> example_data -> temp. The test data is a simulated set generated using Drosophila melanogaster chromosome 2L as the template. See the TEMP manual on GitHub for more details about this dataset: https://github.com/JialiUMassWengLab/TEMP/blob/master/Manual
Input BAM file - test_chromosome.sorted.bam
Transposon consensus sequence - test_concensus.fa
Annotated transposon positions in the genome - test_TE_annotation.bed
2bit file for the reference genome - dm3_chr2L.2bit
Output
For TE insertion analysis, the summary output file has the suffix .insertion.refined.bp.summary.
For TE absence analysis, the summary output file has the suffix .absence.refined.bp.summary.
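After downloading an analysis folder, the summary files can be located by these suffixes. The sketch below is one possible way to do that in Python; the function name find_summaries and the folder name temp_output are hypothetical, and only the two suffixes come from this page.

```python
from pathlib import Path
from typing import Dict, List

# Suffixes of the summary files, as listed in the Output section.
SUMMARY_SUFFIXES = {
    "insertion": ".insertion.refined.bp.summary",
    "absence": ".absence.refined.bp.summary",
}


def find_summaries(output_dir: str) -> Dict[str, List[Path]]:
    """Collect summary files under output_dir, grouped by analysis type.

    Hypothetical helper for illustration; returns empty lists when the
    folder does not exist or contains no matching files.
    """
    root = Path(output_dir)
    found: Dict[str, List[Path]] = {}
    for analysis, suffix in SUMMARY_SUFFIXES.items():
        if root.is_dir():
            # Search the folder and all of its subfolders.
            matches = sorted(p for p in root.rglob("*" + suffix) if p.is_file())
        else:
            matches = []
        found[analysis] = matches
    return found


if __name__ == "__main__":
    # "temp_output" is a placeholder for a downloaded analysis folder.
    for analysis, paths in find_summaries("temp_output").items():
        print(analysis, [str(p) for p in paths])
```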