Bismark
How to download the tool or source code including installation and usage instructions as well as any source code that might be associated with the executable. This should also include a listing of any dependencies for this tool or script.
- Download the bismark package from http://www.bioinformatics.bbsrc.ac.uk/projects/download.html to a directory.
- Extract the file using tar -xzvf filename
- Bismark requires Bowtie to be installed on your machine.
Required version of the program necessary to perform the desired task
- bismark v0.2.4
Sample dataset and expected results to be output
- Excerpt from sample input files, chr8.fa (fasta file, reference sequence), and test_reads_bs.fq (fastq file, short reads)
chr8.fa:

    >chr8
    GCAATTATGACACAAAAAATTAAACAGTGCAGACTGATATATAAATCAAA
    ACAAATGTCCTTTACATGTTTTCTGTTACAGTAGTAACAATATGTGTAAA
    CTTAATTATCATATTTTTTTCTTGTGCTGTGGTTGTGTCCTGGGTTCATT
    CTCTAAAATGCTGTTCACCTTAGACCAGGAGAAATATTAACCATACAGAC
    TCTGTTTCAAGTCATAGCTGAATATTTTCAAAAGAGTGACTTTGTAAAAA
    CATGTTCCAATGGCAAATTGATTCATTGTGATGGGATCAATTATTCCAAA
    GACTTCTTGTCTTTATTTTGTTCCCATGCCTACCTTTTAGCCATAATACA

test_reads_bs.fq:

    @chr8:144-169_1_0000000000000000000000000_0
    TTTATTTTTTAAAATGTTGTTTATT
    +chr8:144-169_1_0000000000000000000000000_0
    OhhhKhhhhLhhhhhhhhhhRhhhh
    @chr8:440-465_1_0000000000000000000000000_0
    TATAATGTTTTTTAAAATAAAAGAG
    +chr8:440-465_1_0000000000000000000000000_0
    QhhhhhhhhhhhhOhHhhhhhhhhh
    @chr8:1759-1784_0_0000000000000000000000000_0
    TTGTAGGTTATTGAGGAAGGTGAGG
    +chr8:1759-1784_0_0000000000000000000000000_0
    XhNh[ZhhhhhhQThhKRhhhhhhh
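The quality lines in the FASTQ excerpt are ASCII-encoded, and the example invocations further down pass --phred64-quals, which the options table describes as "Phred quality plus 64". The following is a minimal sketch of that decoding, not part of Bismark itself; the function name `decode_qualities` is hypothetical and chosen for illustration only.

```python
def decode_qualities(quality_string, offset=64):
    """Convert an ASCII-encoded FASTQ quality string to Phred scores.

    offset=64 corresponds to --phred64-quals, offset=33 to --phred33-quals.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    if any(score < 0 for score in scores):
        # A negative score usually means the wrong offset was chosen.
        raise ValueError("quality character below the chosen offset")
    return scores


# Quality line of the first read in the test_reads_bs.fq excerpt.
first_read_quals = decode_qualities("OhhhKhhhhLhhhhhhhhhhRhhhh", offset=64)
print(first_read_quals[:5])
```

Under the Phred+64 assumption the character `h` decodes to 40, so most bases in the sample reads carry the same high quality value.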
- Excerpt from sample output files, test_reads_bs.fq_bismark.txt, and CpG_context_test_reads_bs.fq_bismark.txt
- Single-end output format (tab-separated):
- <seq-ID>
- <read alignment strand>
- <chromosome>
- <start position>
- <end position>
- <observed bisulfite sequence>
- <equivalent genomic sequence>
- <methylation call>
- <read conversion>
- <genome conversion>
- Paired-end output format (tab-separated):
- <seq-ID>
- <read 1 alignment strand>
- <chromosome>
- <start position>
- <end position>
- <observed bisulfite sequence 1>
- <equivalent genomic sequence 1>
- <methylation call 1>
- <observed bisulfite sequence 2>
- <equivalent genomic sequence 2>
- <methylation call 2>
- <read 1 conversion>
- <genome conversion>
test_reads_bs.fq_bismark.txt:

    Bismark version: v0.2.4
    chr8:144-169_1_0000000000000000000000000_0	+	chr8	145	169	TTTATTTTTTAAAATGTTGTTTATT	TTCATTCTCTAAAATGCTGTTCACCTT	..h...h.h.......x....h.hh	CT	CT
    chr8:440-465_1_0000000000000000000000000_0	+	chr8	441	465	TATAATGTTTTTTAAAATAAAAGAG	CACAATGCTTTCTAAAACAAAAGAGTC	h.h....h...h.....h.......	CT	CT
CpG_context_test_reads_bs.fq_bismark.txt:

    Bismark methylation extractor version v0.2.4
    chr8:3234-3259_0_0000000000000000000000000_0	-	chr8	3254	z
    chr8:3577-3602_1_0000000000000000000000000_0	-	chr8	3579	z
    chr8:1086-1111_1_0000000000000000000000000_0	-	chr8	1101	z
    chr8:3216-3241_1_0000000000000000000000000_0	-	chr8	3231	z
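The ten-column single-end format listed above can be read into named fields with a few lines of code. This is a sketch only: the class and function names are hypothetical, and the field names simply mirror the column list in this section.

```python
from dataclasses import dataclass


@dataclass
class SingleEndAlignment:
    """One record of the single-end Bismark result format (10 columns)."""
    seq_id: str
    strand: str
    chromosome: str
    start: int
    end: int
    bisulfite_seq: str
    genomic_seq: str
    methylation_call: str
    read_conversion: str
    genome_conversion: str


def parse_single_end_line(line):
    """Split one tab-separated result line into a SingleEndAlignment."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError(f"expected 10 tab-separated fields, got {len(fields)}")
    return SingleEndAlignment(
        fields[0], fields[1], fields[2], int(fields[3]), int(fields[4]), *fields[5:]
    )


# First record of the test_reads_bs.fq_bismark.txt excerpt.
example_line = "\t".join([
    "chr8:144-169_1_0000000000000000000000000_0", "+", "chr8", "145", "169",
    "TTTATTTTTTAAAATGTTGTTTATT", "TTCATTCTCTAAAATGCTGTTCACCTT",
    "..h...h.h.......x....h.hh", "CT", "CT",
])
record = parse_single_end_line(example_line)
print(record.chromosome, record.start, record.end)
```

The version header line ("Bismark version: v0.2.4") does not have ten fields, so a caller would need to skip it before parsing.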
Set of parameters and command line switches that match the expected execution of the tool including the possible command line definitions according to the occurrence of optional parameters. Also, validation instructions for parameters are requested.
- Running the Bismark genome preparation
- USAGE:
bismark_genome_preparation [options] <arguments>
- OPTIONS:
| parameter | brief description of the parameter | required | default value | text, number, or file/path | description of validation rules |
|---|---|---|---|---|---|
| --help/--man | Displays this help file | N | none | | |
| --verbose | Print verbose output for more details or debugging | N | none | | |
| --path_to_bowtie | The full path to the bowtie installation on your system | N | none | | |
| --yes/--yes_to_all | Answer yes to safety related questions | N | none | | |
- ARGUMENTS:
| argument | brief description of the argument | required | default value | text, number, or file/path |
|---|---|---|---|---|
| path_to_genome_folder | The full path to the folder containing the genome to be bisulfite converted | Y | none | path |
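The genome preparation step takes a folder with "the genome to be bisulfite converted". As an illustration of what in-silico bisulfite conversion means, the sketch below replaces every C with T (and, for the opposite strand, every G with A). That Bismark writes exactly these two converted copies is an assumption based on general Bismark documentation, not on this page, and the function names are hypothetical.

```python
def convert_c_to_t(sequence):
    """Fully bisulfite-convert the forward strand: every C becomes T."""
    return sequence.upper().replace("C", "T")


def convert_g_to_a(sequence):
    """Equivalent conversion seen from the reverse strand: every G becomes A."""
    return sequence.upper().replace("G", "A")


# First 25 bases of the genomic sequence reported for read chr8:144-169
# in the sample output, and the bisulfite read aligned to it.
genomic = "TTCATTCTCTAAAATGCTGTTCACC"
read = "TTTATTTTTTAAAATGTTGTTTATT"
print(convert_c_to_t(genomic) == read)
```

For this sample read the fully converted genomic sequence is identical to the read, which agrees with its methylation call string containing only lowercase letters.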
- Running Bismark
- USAGE:
bismark [options] <genome_folder> {-1 <mates1> -2 <mates2> | <singles>}
- OPTIONS:
| parameter | brief description of the parameter | required | default value | text, number, or file/path | description of validation rules |
|---|---|---|---|---|---|
| -q/--fastq | The query input files (specified as <mates1>, <mates2> or <singles>) are FASTQ files | Y for FASTQ input | none | | |
| -f/--fasta | The query input files (specified as <mates1>, <mates2> or <singles>) are FASTA files. All quality values are assumed to be 40 on the Phred scale | Y for FASTA input | none | | |
| -s/--skip | Skip the first <int> reads or read pairs from the input | N | 0 | integer | >=0 |
| -u/--qupto | Only align the first <int> reads or read pairs from the input | N | none | integer | >=0 |
| --phred33-quals | FASTQ qualities are ASCII chars equal to the Phred quality plus 33 | N | Y | | |
| --phred64-quals | FASTQ qualities are ASCII chars equal to the Phred quality plus 64 | N | N | | |
| --solexa-quals | Convert FASTQ qualities from solexa-scaled (which can be negative) to phred-scaled | N | N | | |
| --solexa1.3-quals | Same as --phred64-quals | N | N | | |
| --path_to_bowtie | The full path to the bowtie installation on your system | N | none | | |
| -n/--seedmms | The maximum number of mismatches permitted in the "seed" (see -l/--seedlen) | N | 0 | integer | 0, 1, 2 or 3 |
| -l/--seedlen | The "seed length", i.e. the number of bases at the high-quality end of the read to which the -n ceiling applies | N | 28 | integer | >=0 |
| -e/--maqerr | Maximum permitted total of quality values at all mismatched read positions throughout the entire alignment, not just in the "seed" | N | 70 | integer | >=0 |
| --chunkmbs | The number of megabytes of memory a given thread is given to store path descriptors in --best mode. Best-first search must keep track of many paths at once to ensure it is always extending the path with the lowest cumulative cost. Bowtie tries to minimize the memory impact of the descriptors, but they can still grow very large in some cases. If you receive an error message saying that chunk memory has been exhausted in --best mode, increase this value to dedicate more memory to the descriptors | N | 64 | integer | >=0 |
| -I/--minins | The minimum insert size for valid paired-end alignments | N | 0 | integer | >=0 |
| -X/--maxins | The maximum insert size for valid paired-end alignments | N | 250 | integer | >=0 |
| --best | Make Bowtie guarantee that reported singleton alignments are "best" in terms of stratum | N | Y | | |
| --no_best | Disables the --best option, which is on by default. This can speed up the alignment process, e.g. for testing purposes, but disabling --best is not recommended for credible results | N | none | | |
| --directional | The user may specify if the sequencing library was constructed in a strand-specific manner. In this case the strands complementary to the original strands are merely theoretical and should not exist in reality. Thus, specifying --directional will only report alignments to the original top or bottom strands. This is the recommended option for strand-specific libraries | N | none | | |
| --quiet | Print nothing besides alignments | N | none | | |
| -h/--help | Displays help file | N | none | | |
| -v/--version | Displays version information | N | none | | |
- ARGUMENTS:
| argument | brief description of the argument | required | default value | text, number, or file/path |
|---|---|---|---|---|
| genome_folder | The full path to the folder containing the unmodified reference genome as well as the subfolders created by the Bismark_Genome_Preparation script | Y | none | path |
| -1 | Comma-separated list of files containing the #1 mates | Y for paired-end read | none | files |
| -2 | Comma-separated list of files containing the #2 mates | Y for paired-end read | none | files |
| singles | A comma-separated list of files containing the reads to be aligned | Y for single-end read | none | files |
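The validation rules in the options table for Bismark (integer, >=0, and 0 to 3 for -n/--seedmms) can be checked before a run is launched. The sketch below is a hypothetical helper, not part of Bismark; it takes the long option names without the leading dashes as dictionary keys and applies only the rules stated in the table.

```python
# Options whose validation rule in the table is "integer, >=0".
NON_NEGATIVE_INT_OPTIONS = (
    "skip", "qupto", "seedlen", "maqerr", "chunkmbs", "minins", "maxins",
)


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_bismark_options(options):
    """Return a list of error messages; an empty list means all checks passed."""
    errors = []
    for name in NON_NEGATIVE_INT_OPTIONS:
        if name in options:
            value = options[name]
            if not (_is_int(value) and value >= 0):
                errors.append(f"--{name} must be an integer >= 0, got {value!r}")
    if "seedmms" in options:
        value = options["seedmms"]
        if not (_is_int(value) and 0 <= value <= 3):
            errors.append(f"--seedmms must be 0, 1, 2 or 3, got {value!r}")
    return errors


# Values taken from the paired-end example invocation on this page.
print(validate_bismark_options({"seedmms": 1, "seedlen": 20, "minins": 60, "maxins": 350}))
```

Options that are absent from the dictionary are not checked, so the defaults listed in the table are left to Bismark itself.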
- Running the methylation extractor
- USAGE:
methylation_extractor [options] <filenames>
- OPTIONS:
| parameter | brief description of the parameter | required | default value | text, number, or file/path | description of validation rules |
|---|---|---|---|---|---|
| -s/--single-end | Input file(s) are Bismark result file(s) generated from single-end read data | Y for single-end read | none | | |
| -p/--paired-end | Input file(s) are Bismark result file(s) generated from paired-end read data | Y for paired-end read | none | | |
| --no_overlap | For paired-end reads it is theoretically possible that read_1 and read_2 overlap. This option avoids scoring overlapping methylation calls twice | N | none | | |
| --fasta | Choosing this option will print out the genomic sequences that correspond to the bisulfite mapped reads in FastA format | N | none | | |
| --ignore | Ignore the first <int> bp when processing the methylation call string | N | 0 | integer | >=0 |
| --comprehensive | Specifying this option will merge all four possible strand-specific methylation info into context-dependent output files | N | none | | |
| --merge_non_CpG | This will produce two output files (in --comprehensive mode) or eight strand-specific output files (default) for Cs in (i) CpG context (ii) any non-CpG context | N | none | | |
| --report | Prints out a short methylation summary and the parameters used to run this script | N | none | | |
| --version | Displays version information | N | none | | |
| -h/--help | Displays this help file and exits | N | none | | |
- ARGUMENTS:
| argument | brief description of the argument | required | default value | text, number, or file/path |
|---|---|---|---|---|
| filenames | A space-separated list of result files in Bismark format | Y | none | files |
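To make the --ignore and --report options more concrete, the sketch below tallies a methylation call string by context after skipping the first positions. It is not the extractor's own code. The letter meanings (z/Z for CpG, x/X for CHG, h/H for CHH, uppercase for methylated) are an assumption taken from general Bismark documentation and are not stated on this page; the function name is hypothetical.

```python
from collections import Counter

# Assumed meaning of the call letters (not defined on this page).
CONTEXTS = {"z": "CpG", "x": "CHG", "h": "CHH"}


def summarise_calls(call_string, ignore=0):
    """Count methylated and unmethylated calls per context.

    ignore mirrors --ignore: the first `ignore` positions are skipped.
    """
    counts = Counter()
    for symbol in call_string[ignore:]:
        context = CONTEXTS.get(symbol.lower())
        if context is None:
            continue  # '.' and any other symbol is not a cytosine call
        state = "methylated" if symbol.isupper() else "unmethylated"
        counts[(context, state)] += 1
    return counts


# Methylation call string of the first record in the sample output.
summary = summarise_calls("..h...h.h.......x....h.hh")
for (context, state), n in sorted(summary.items()):
    print(context, state, n)
```

Calling `summarise_calls(call_string, ignore=3)` on the same string drops the call at position 3, which is the kind of effect --ignore is described as having.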
Example invocation of the command line application and its associated parameters such that it can perform an analysis
- Running the Bismark genome preparation
~/bin/bismark_v0.2.4/bismark_genome_preparation --verbose --path_to_bowtie ~/bin/bowtie-0.12.7/ ~/sequence/test/
- Running Bismark
single-end:

    ~/bin/bismark_v0.2.4/bismark -q --phred64-quals --path_to_bowtie ~/bin/bowtie-0.12.7/ -n 1 -l 20 ~/sequence/test/ test_reads_bs.fq

paired-end:

    ~/bin/bismark_v0.2.4/bismark -q --phred64-quals --path_to_bowtie ~/bin/bowtie-0.12.7/ -n 1 -l 20 -I 60 -X 350 ~/sequence/test/ -1 test_reads_bs1.fq -2 test_reads_bs2.fq
- Running the methylation extractor
single-end:

    ~/bin/bismark_v0.2.4/methylation_extractor -s --comprehensive --report test_reads_bs.fq_bismark.txt

paired-end:

    ~/bin/bismark_v0.2.4/methylation_extractor -p --comprehensive --report test_reads_bs.fq_bismark.txt
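When these invocations are scripted, it can help to assemble the argument list programmatically, following the usage line `bismark [options] <genome_folder> {-1 <mates1> -2 <mates2> | <singles>}`. The sketch below only builds and prints the command and does not execute it. The function name is hypothetical, and the paths are the ones used in the examples on this page.

```python
import shlex


def bismark_command(bismark, genome_folder, singles=None, mates1=None,
                    mates2=None, extra_options=()):
    """Build the argument list for a single-end or paired-end Bismark run."""
    cmd = [bismark, *extra_options, genome_folder]
    if mates1 and mates2:
        cmd += ["-1", mates1, "-2", mates2]
    elif singles and not (mates1 or mates2):
        cmd.append(singles)
    else:
        raise ValueError("give either singles, or both mates1 and mates2")
    return cmd


common = ["-q", "--phred64-quals", "--path_to_bowtie", "~/bin/bowtie-0.12.7/",
          "-n", "1", "-l", "20"]

single_end = bismark_command("~/bin/bismark_v0.2.4/bismark", "~/sequence/test/",
                             singles="test_reads_bs.fq", extra_options=common)
paired_end = bismark_command("~/bin/bismark_v0.2.4/bismark", "~/sequence/test/",
                             mates1="test_reads_bs1.fq", mates2="test_reads_bs2.fq",
                             extra_options=common + ["-I", "60", "-X", "350"])

# shlex.join quotes arguments for display, so '~' is shown quoted here.
print(shlex.join(single_end))
print(shlex.join(paired_end))
```

If such a list is later passed to `subprocess.run` without a shell, `~` is not expanded automatically, so the paths would need `os.path.expanduser` first.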