CNVnator-0.3.3

Please work through the documentation and add your comments at the bottom of this page, or email comments to support@cyverse.org. Thank you.

Rationale and Background

CNVnator is a tool for copy number variation (CNV) discovery and genotyping from the depth of coverage of mapped reads, that is, from read-depth (RD) analysis of personal genome sequencing. CNV in the genome is a complex phenomenon that is not completely understood. The method combines the established mean-shift approach with additional refinements (multiple-bandwidth partitioning and GC correction) to broaden the range of discovered CNVs.

 

Some useful information about CNVnator from this blog

CNVnator can identify CNVs ranging from a few hundred bases to megabases in length. Breakpoint precision is also good: 200 bp for 90% of the breakpoints in a test case studied in the CNVnator paper (using a bin size of 100 bp). The higher your coverage, the smaller the bin size you can use, which gives greater precision. The authors recommend ~100-bp bins for 20-30x coverage, ~500-bp bins for 4-6x coverage, and ~30-bp bins for 100x coverage. However, they say that the bin size should not be shorter than the read length in your data.
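
The bin-size guidance above can be turned into a small helper for picking a starting value. This is only a sketch: the function name, the use of range midpoints (5x, 25x, 100x), and the nearest-match rule on a log scale are assumptions made for illustration, not part of CNVnator or of the app.

```python
import math

# Coverage -> bin size guidance quoted above, using the midpoint of each
# stated coverage range (assumption: 4-6x -> 5x, 20-30x -> 25x).
RECOMMENDED_BINS = [(5.0, 500), (25.0, 100), (100.0, 30)]


def suggest_bin_size(coverage, read_length):
    """Return a starting bin size (bp) for a mean coverage and read length."""
    if coverage <= 0 or read_length <= 0:
        raise ValueError("coverage and read_length must be positive")
    # Pick the guidance point whose coverage is closest on a log scale.
    _, bin_size = min(
        RECOMMENDED_BINS,
        key=lambda point: abs(math.log(coverage) - math.log(point[0])),
    )
    # The bin size should not be shorter than the read length.
    return max(bin_size, read_length)
```

For example, suggest_bin_size(25, 100) returns 100, the same value used for all four bin sizes in the test runs below. Treat the result as a starting point and adjust it for your own data.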

 

Mandatory arguments

  • Input(s)
    • Custom Reference genome or Reference genome from DE: The user has to select one of these options, otherwise the app will fail.
    • Bam files: Make sure the bam files were generated by mapping reads to the reference genome selected above.
    • Chromosome id or Chromosome ids from file: Chromosome names must be specified exactly as they appear in the bam header, e.g., chrX or X. The user can either specify a single chromosome id (for example, 10) or upload a file that contains multiple chromosome ids, one chromosome id per line. The user has to select one of these options, otherwise the app will fail.
  • Parameter(s)
    • Histogram bin size: The bin size (window size) for generating the histogram for all the windows in your genome assembly. For example, 100.
    • Stat bin size: The bin size (window size) for calculating statistical significance (p-values) for the windows that have unusual read depth. For example, 100.
    • Partition bin size: The bin size (window size) for partitioning the chromosomes/scaffolds into long regions (each of which could be longer than the window size) that have similar read depth, and so presumably similar copy number. For example, 100.
    • Call bin size: The bin size (window size) for calling CNVs. For example, 100.
    • Prefix: The prefix that will be added to the VCF file column when converting the CNVnator output to a VCF file.
  • Output
    • The name of the output file: For example result
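
To see how the four bin sizes relate to each other, it helps to view them as one CNVnator step each. The sketch below builds (but does not run) the command lines for those steps, using the -root, -chrom, -tree, -his, -d, -stat, -partition and -call options from the CNVnator README. The app's actual wrapper script is not shown on this page, so the function name, the step order shown, and the optional genome_dir argument are assumptions made for illustration.

```python
def build_cnvnator_commands(root_file, bam_files, chrom_ids,
                            his_bin, stat_bin, partition_bin, call_bin,
                            genome_dir=None):
    """Return one argument list per CNVnator step (nothing is executed)."""
    chroms = [str(c) for c in chrom_ids]
    base = ["cnvnator", "-root", root_file, "-chrom", *chroms]

    # Histogram step; -d points at a directory of per-chromosome FASTA files.
    his = base + ["-his", str(his_bin)]
    if genome_dir is not None:
        his += ["-d", genome_dir]

    return [
        base + ["-tree", *bam_files],              # extract read mapping
        his,                                       # histogram bin size
        base + ["-stat", str(stat_bin)],           # stat bin size
        base + ["-partition", str(partition_bin)], # partition bin size
        base + ["-call", str(call_bin)],           # call bin size
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in build_cnvnator_commands("result.root", ["sample.bam"], [10],
                                       100, 100, 100, 100):
        print(" ".join(cmd))
```

Each list could be passed to subprocess.run on a machine where cnvnator is installed. The conversion of the calls to VCF (where the Prefix parameter is used) is left out because its exact invocation is not described on this page.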

Test Run using a single chromosome id

All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:

Community Data > iplantcollaborative > example_data > cnvnator (/iplant/home/shared/iplantcollaborative/example_data/cnvnator) 

Mandatory arguments

  • Input(s)
    • Custom Reference genome: Sorghum_bicolor.Sorbi1.20.dna.toplevel.fa
    • Bam files: IS20351_DS_1_1.sorted.bam and IS20351_DS_2_1.sorted.bam
    • Chromosome id: 10
  • Parameter(s)
    • Histogram bin size: 100
    • Stat bin size: 100
    • Partition bin size: 100
    • Call bin size: 100
    • Prefix: test
  • Output
    • The name of the output file: result

 

Test Run using a chromosome id file

All files are located in the Community Data directory of the CyVerse Discovery Environment at the following path:

Community Data > iplantcollaborative > example_data > cnvnator (/iplant/home/shared/iplantcollaborative/example_data/cnvnator) 

Mandatory arguments

  • Input(s)
    • Custom Reference genome: Sorghum_bicolor.Sorbi1.20.dna.toplevel.fa
    • Bam files: IS20351_DS_1_1.sorted.bam and IS20351_DS_2_1.sorted.bam
    • Chromosome ids from file: chr_list.txt
  • Parameter(s)
    • Histogram bin size: 100
    • Stat bin size: 100
    • Partition bin size: 100
    • Call bin size: 100
    • Prefix: test

  • Output
    • The name of the output file: result
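
A chromosome id file such as chr_list.txt is plain text with one chromosome id per line. The sketch below shows one way to produce such a file and to check the ids against the sequence names in the bam header first. The function names and the example ids are hypothetical, and the header names are assumed to have been collected beforehand (for example from the @SQ lines printed by samtools view -H).

```python
def format_chrom_list(chrom_ids, bam_header_names):
    """Return the file text: one chromosome id per line.

    Raises ValueError if an id does not match a bam header name exactly,
    since e.g. "chr10" and "10" are treated as different names.
    """
    known = set(bam_header_names)
    missing = [c for c in chrom_ids if c not in known]
    if missing:
        raise ValueError("ids not found in bam header: " + ", ".join(missing))
    return "".join(f"{c}\n" for c in chrom_ids)


def write_chrom_list(path, chrom_ids, bam_header_names):
    """Write the validated chromosome ids to a text file."""
    text = format_chrom_list(chrom_ids, bam_header_names)
    with open(path, "w") as handle:
        handle.write(text)
```

For instance, write_chrom_list("my_chr_list.txt", ["1", "10"], ["1", "2", "10"]) writes a two-line file, while passing "chr10" with the same header names raises an error instead of producing a file that would make the app fail.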

Output files generated

  1. cnvnator.root: Output ROOT