

The DE Quick Start tutorial provides an introduction to basic DE functionality and navigation.

Rationale and background:

BWA: Fast and accurate short read alignment with Burrows-Wheeler Transform

Li H. and Durbin R.

Bioinformatics 2009; 25:1754-60. [PMID: 19451168]


BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It first constructs the FM-index for the reference genome (the index command) and is then invoked with one of several sub-commands, each implementing a different alignment algorithm: BWA-backtrack, BWA-SW, and BWA-MEM. BWA-MEM is the latest algorithm and is generally recommended for high-quality queries because it is faster and more accurate. It supports both single-end (SR) and paired-end (PE) reads and performs chimeric alignment. It is applicable to a wide range of query lengths, from 70bp to 1Mbp, and performs better than BWA-backtrack on 70-100bp Illumina reads.

This AGAVE/DE app wraps the bwa-index and bwa-mem modules of BWA. It was built for ChIP-Seq workflows but is not limited to them. It takes FASTQ files as input and produces alignments in SAM/BAM format.
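The two-step workflow described above (build the index, then align) can be sketched in Python as follows. This is only an illustration of the underlying `bwa index` and `bwa mem` invocations, not the app's actual wrapper script; the function names and file names are hypothetical. The commands are built as argument lists and printed rather than executed, so the sketch runs without BWA installed.

```python
from typing import List, Optional


def bwa_index_cmd(reference: str) -> List[str]:
    """Command that builds the FM-index for the reference FASTA."""
    return ["bwa", "index", reference]


def bwa_mem_cmd(reference: str, reads_1: str,
                reads_2: Optional[str] = None) -> List[str]:
    """Command that aligns SR (one file) or PE (two files) reads with BWA-MEM.

    bwa mem writes SAM to standard output, so a real caller would
    redirect stdout to a .sam file.
    """
    cmd = ["bwa", "mem", reference, reads_1]
    if reads_2 is not None:
        cmd.append(reads_2)  # second file switches bwa mem to paired-end mode
    return cmd


# Hypothetical file names, for illustration only.
print(" ".join(bwa_index_cmd("reference.fa")))
print(" ".join(bwa_mem_cmd("reference.fa", "sample_R1.fq", "sample_R2.fq")))
```

To actually run the commands, each list could be passed to `subprocess.run(cmd, check=True)`, with the `bwa mem` output redirected to a file.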




  1. A CyVerse account (register for a CyVerse account at user.cyverse.org)
  2. Mandatory arguments:
    1. Sequences folder for protein of interest (Note: the files can be in FASTA or FASTQ format, but for PE reads the file names must include the read-end information, e.g., test_R1.fq and test_R2.fq)
    2. Sequences folder for background control (same format and naming requirements as the folder for protein of interest)
    3. Reference genome sequence in FASTA format
    4. Read type: SR vs PE
  3. Optional arguments:
    1. Minimum score: Don’t output alignments with a score lower than INT
    2. Type of sequencing reads: Illumina, PacBio, Oxford Nanopore, or intra-species contigs to ref
    3. Sort method for BAM: Sort alignments by leftmost coordinates, or by read name
    4. Mark shorter split: Mark shorter split hits as secondary (for Picard compatibility)
    5. SAM output: keep or purge the alignment files in SAM format
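As a rough guide, the optional arguments above correspond to standard `bwa mem` and `samtools sort` flags. The sketch below shows one plausible mapping; it assumes the app uses `-T` for the minimum score, `-x` for the read type, `-M` for marking shorter split hits, and `samtools sort -n` for sorting by read name. The source does not confirm how the app passes these internally, and the function names and dictionary keys are hypothetical.

```python
from typing import List, Optional

# Read-type label -> bwa mem -x preset. Illumina reads use the bwa mem
# defaults, so no -x flag is emitted for them (assumption about the app).
READ_TYPE_PRESETS = {
    "Illumina": None,
    "PacBio": "pacbio",
    "Oxford Nanopore": "ont2d",
    "Intra-species contigs": "intractg",
}


def bwa_mem_options(min_score: Optional[int] = None,
                    read_type: str = "Illumina",
                    mark_shorter_split: bool = False) -> List[str]:
    """Translate the optional arguments into bwa mem flags."""
    opts: List[str] = []
    if min_score is not None:
        opts += ["-T", str(min_score)]  # drop alignments scoring below INT
    preset = READ_TYPE_PRESETS[read_type]
    if preset is not None:
        opts += ["-x", preset]
    if mark_shorter_split:
        opts.append("-M")  # mark shorter split hits as secondary
    return opts


def samtools_sort_options(by_read_name: bool = False) -> List[str]:
    """samtools sort orders by leftmost coordinate unless -n is given."""
    return ["-n"] if by_read_name else []


print(bwa_mem_options(min_score=30, read_type="PacBio", mark_shorter_split=True))
```

Note that a name-sorted BAM cannot be indexed with `samtools index`, so the .bai files are only expected with coordinate sorting.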

Sample data

The following test data for BWA-index-mem are provided in /iplant/home/xiaofei_iplant/Sorghum_chr8/chr8_test:

  1. G3_P_K4me3_chr8
    G3_P_K4me3_rep1_chr8_R1.fq and G3_P_K4me3_rep1_chr8_R2.fq
    G3_P_K4me3_rep2_chr8_R1.fq and G3_P_K4me3_rep2_chr8_R2.fq
  2. G3_P_H3_chr8
    G3_P_H3_rep1_chr8_R1.fq and G3_P_H3_rep1_chr8_R2.fq
    G3_P_H3_rep2_chr8_R1.fq and G3_P_H3_rep2_chr8_R2.fq
  3. Sorbi1.31.chr8.reNm.fa
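The naming convention for PE reads (an `_R1`/`_R2` tag before the extension) is what allows mates to be matched up. The sketch below shows one way such pairing could be done on a list of file names. It is not the app's own code; the function name and the accepted extensions (.fq, .fastq, .fa, .fasta) are assumptions.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Assumed pattern: <stem>_R1.<ext> or <stem>_R2.<ext>
_END_TAG = re.compile(r"^(?P<stem>.+)_R[12]\.(?:fq|fastq|fa|fasta)$")


def pair_reads(file_names: Iterable[str]) -> Dict[str, Tuple[str, ...]]:
    """Group read files by sample stem, with R1 listed before R2."""
    groups: Dict[str, List[str]] = {}
    for name in sorted(file_names):  # sorting puts _R1 ahead of _R2
        match = _END_TAG.match(name)
        if match is None:
            continue  # no read-end tag, e.g. the reference FASTA
        groups.setdefault(match.group("stem"), []).append(name)
    return {stem: tuple(files) for stem, files in groups.items()}


pairs = pair_reads([
    "G3_P_H3_rep1_chr8_R2.fq",
    "G3_P_H3_rep1_chr8_R1.fq",
    "G3_P_H3_rep2_chr8_R1.fq",
    "G3_P_H3_rep2_chr8_R2.fq",
    "Sorbi1.31.chr8.reNm.fa",
])
for stem, files in pairs.items():
    print(stem, files)
```

A stem that ends up with only one file would correspond to an SR sample, or to a PE sample with a missing mate.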


Successful execution of the BWA-index-mem pipeline creates a directory named out for each sample. The directory contains the SAM/BAM files for both the protein-of-interest samples and the background control, which can be processed further for downstream analysis and visualization.



  1. G3_P_H3_chr8_BWA_sam*
    1. G3_P_H3_rep1_chr8_R.sam 
    2. G3_P_H3_rep2_chr8_R.sam 
  2. G3_P_H3_chr8_BWA_bam
    1. G3_P_H3_rep1_chr8_R.sorted.bam
    2. G3_P_H3_rep1_chr8_R.sorted.bam.bai
    3. G3_P_H3_rep2_chr8_R.sorted.bam
    4. G3_P_H3_rep2_chr8_R.sorted.bam.bai
  3. G3_P_K4me3_chr8_BWA_sam*
    1. G3_P_K4me3_rep1_chr8_R.sam
    2. G3_P_K4me3_rep2_chr8_R.sam
  4. G3_P_K4me3_chr8_BWA_bam
    1. G3_P_K4me3_rep1_chr8_R.sorted.bam
    2. G3_P_K4me3_rep1_chr8_R.sorted.bam.bai
    3. G3_P_K4me3_rep2_chr8_R.sorted.bam
    4. G3_P_K4me3_rep2_chr8_R.sorted.bam.bai

*SAM folders are optional; they are kept or purged according to the SAM output argument.
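The output names listed above appear to follow from the input names by replacing the `_R1`/`_R2` tag and extension with `_R`. The sketch below encodes that pattern so expected outputs can be checked after a run. The rule is inferred from the listing rather than taken from the app's code, and the function name is hypothetical.

```python
import re
from typing import Dict


def expected_outputs(read_file: str) -> Dict[str, str]:
    """Derive the SAM, sorted BAM, and BAM index names for one read file."""
    # Inferred rule: <stem>_R1.fq -> <stem>_R.sam / .sorted.bam / .sorted.bam.bai
    stem, replaced = re.subn(r"_R[12]\.(?:fq|fastq|fa|fasta)$", "_R", read_file)
    if replaced == 0:
        raise ValueError(f"no read-end tag found in {read_file!r}")
    return {
        "sam": stem + ".sam",
        "bam": stem + ".sorted.bam",
        "bai": stem + ".sorted.bam.bai",
    }


print(expected_outputs("G3_P_H3_rep1_chr8_R1.fq"))
```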


