CNVnator-0.3.3
- Upendra Kumar Devisetty
Please work through the documentation and add your comments at the bottom of this page, or email comments to support@cyverse.org. Thank you.
Rationale and Background
Some useful information about CNVnator from this blog
CNVnator can identify CNVs ranging from a few hundred bases to megabases in length. Its precision is also good: in a test case studied in the CNVnator paper (using a bin size of 100 bp), 90% of the breakpoints were located to within 200 bp. The higher your coverage, the smaller the bin size you can use, and the greater the precision you get. The authors recommend ~100-bp bins for 20-30x coverage, ~500-bp bins for 4-6x coverage, and ~30-bp bins for 100x coverage. However, they note that the bin size should not be shorter than the read length of your data.
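The recommendations above can be turned into a small helper for picking a starting bin size. This is only a sketch: the function name `suggest_bin_size` and the cut-offs used for coverage values that fall between the quoted ranges are assumptions, not part of CNVnator or of this app.

```python
def suggest_bin_size(coverage: float, read_length: int) -> int:
    """Return a rough starting bin size (bp) for a mean coverage and read length.

    The anchor values (30, 100, 500 bp) come from the recommendations quoted
    above. The thresholds between them are an assumption of this sketch: when
    coverage falls between two quoted ranges, the larger bin is chosen.
    """
    if coverage <= 0 or read_length <= 0:
        raise ValueError("coverage and read_length must be positive")

    if coverage >= 100:
        bin_size = 30      # ~30-bp bins for 100x coverage
    elif coverage >= 20:
        bin_size = 100     # ~100-bp bins for 20-30x coverage
    else:
        bin_size = 500     # ~500-bp bins for 4-6x coverage

    # The bin size should not be shorter than the read length.
    return max(bin_size, read_length)


if __name__ == "__main__":
    for cov, read_len in [(5, 100), (25, 100), (100, 150)]:
        print(cov, read_len, suggest_bin_size(cov, read_len))
```

Treat the returned value as a starting point for the four bin size parameters described below, not as a guarantee of good results for a particular data set.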
Mandatory arguments
- Input(s)
- Custom Reference genome or Reference genome from DE: Select one of these two options; otherwise the app will fail.
- Bam files: Make sure these are the BAM files that were generated by mapping reads to the reference genome selected above.
- Chromosome id or Chromosome ids from file: Chromosome names must be specified exactly as they appear in the BAM header, e.g., chrX or X. Either specify a single chromosome id (for example, 10) or upload a file that contains multiple chromosome ids, one per line. Select one of these two options; otherwise the app will fail.
- Parameter(s)
- Histogram bin size: The bin size (window size) used to generate the histogram for all the windows in your genome assembly. For example, 100.
- Stat bin size: The bin size (window size) used to calculate statistical significance (p-values) for the windows that have unusual read depth. For example, 100.
- Partition bin size: The bin size (window size) used to partition the chromosomes/scaffolds into long regions (each of which may be longer than the window size) that have similar read depth, and so presumably similar copy number. For example, 100.
- Call bin size: The bin size (window size) used to call CNVs. For example, 100.
- Prefix: The prefix that will be added to the VCF file column when the CNVnator output is converted to a VCF file.
- Output
- The name of the output file: For example, result.
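Because a mismatch between the chromosome ids and the BAM header is an easy mistake to make, it can help to check the ids before launching the app. The sketch below assumes you have saved the header text locally (for example, the output of `samtools view -H file.bam`); the function names and the sample header are made up for illustration.

```python
from typing import List


def reference_names(sam_header: str) -> List[str]:
    """Collect the SN: values from the @SQ lines of a SAM/BAM header."""
    names = []
    for line in sam_header.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def read_chromosome_ids(text: str) -> List[str]:
    """Parse a chromosome id file: one id per line, blank lines ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def missing_ids(ids: List[str], sam_header: str) -> List[str]:
    """Return the ids that do not appear in the header's @SQ lines."""
    known = set(reference_names(sam_header))
    return [chrom for chrom in ids if chrom not in known]


if __name__ == "__main__":
    # Illustrative header only; the lengths are not real values.
    header = (
        "@HD\tVN:1.4\tSO:coordinate\n"
        "@SQ\tSN:1\tLN:1000\n"
        "@SQ\tSN:10\tLN:2000\n"
    )
    chr_list = "1\n10\nchr10\n"
    print(missing_ids(read_chromosome_ids(chr_list), header))  # ['chr10']
```

An empty list means every id in the file matches a sequence name in the header; any id that is printed should be corrected before it is used as input.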
Test Run using a single chromosome id
All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:
Community Data > iplantcollaborative > example_data > cnvnator (/iplant/home/shared/iplantcollaborative/example_data/cnvnator)
Mandatory arguments
- Input(s)
- Custom Reference genome: Sorghum_bicolor.Sorbi1.20.dna.toplevel.fa
- Bam files: IS20351_DS_1_1.sorted.bam and IS20351_DS_2_1.sorted.bam
- Chromosome id: 10
- Parameter(s)
- Histogram bin size: 100
- Stat bin size: 100
- Partition bin size: 100
- Call bin size: 100
- Prefix: test
- Output
- The name of the output file: result
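For orientation, the four bin size parameters correspond to four steps of the upstream CNVnator command line, which follow an initial read-extraction step. The app's own wrapper script is not shown on this page, so the sketch below only assumes that it follows the upstream sequence (`-tree`, `-his`, `-stat`, `-partition`, `-call`). The helper name `cnvnator_commands` and the `genome_dir` directory are hypothetical, and the BAM file names are shortened placeholders.

```python
import shlex
from typing import List


def cnvnator_commands(root: str, bams: List[str], chrom_ids: List[str],
                      genome_dir: str, his_bin: int, stat_bin: int,
                      partition_bin: int, call_bin: int) -> List[List[str]]:
    """Build the upstream CNVnator command sequence as argument lists.

    Nothing is executed here. genome_dir is assumed to hold the reference
    as individual per-chromosome FASTA files.
    """
    base = ["cnvnator", "-root", root, "-chrom", *chrom_ids]
    return [
        base + ["-tree", *bams],                          # extract read mapping
        base + ["-his", str(his_bin), "-d", genome_dir],  # histogram
        base + ["-stat", str(stat_bin)],                  # statistics
        base + ["-partition", str(partition_bin)],        # read-depth partitioning
        base + ["-call", str(call_bin)],                  # CNV calls go to stdout
    ]


if __name__ == "__main__":
    commands = cnvnator_commands(
        root="cnvnator.root",
        bams=["sample_1.sorted.bam", "sample_2.sorted.bam"],  # placeholders
        chrom_ids=["10"],
        genome_dir="genome_dir",                              # hypothetical
        his_bin=100, stat_bin=100, partition_bin=100, call_bin=100,
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Building the commands as lists rather than strings keeps file names with spaces intact if they are later passed to `subprocess.run`.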
Test Run using a chromosome id file
All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:
Community Data > iplantcollaborative > example_data > cnvnator (/iplant/home/shared/iplantcollaborative/example_data/cnvnator)
Mandatory arguments
- Input(s)
- Custom Reference genome: Sorghum_bicolor.Sorbi1.20.dna.toplevel.fa
- Bam files: IS20351_DS_1_1.sorted.bam and IS20351_DS_2_1.sorted.bam
- Chromosome ids from file: chr_list.txt
- Parameter(s)
- Histogram bin size: 100
- Stat bin size: 100
- Partition bin size: 100
- Call bin size: 100
- Prefix: test
- Output
- The name of the output file: result
Output files generated
- cnvnator.root: Output ROOT