Tuesday Aug 14
JOINME: https://secure.join.me/733-561-924
Lecture: Genome Assembly - Haibao Tang (9:00-10:15)
Assembly preparation - reads, libraries, etc.
Assembly - OLC assemblers vs. de Bruijn (k-mer) graph assemblers
Assembly QC - Identify data or assembly issues
Assembly curation - Further scaffolding, build chromosomes
Slides: https://docs.google.com/open?id=0Bx3KTjwwBb0rTm9kdm9XSDhidE0
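As a rough illustration of the k-mer idea behind de Bruijn graph assemblers, the sketch below splits reads into k-mers and links each (k-1)-mer prefix to the (k-1)-mer suffix that follows it. The function name `de_bruijn_edges` and the toy reads are assumptions made up for this example; real assemblers also handle reverse complements, sequencing errors, and coverage, none of which is attempted here.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k along the read; reads shorter than k add nothing.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    toy_reads = ["ATGGCG", "GGCGTG", "CGTGCA"]  # hypothetical overlapping reads
    for k in (3, 4):
        graph = de_bruijn_edges(toy_reads, k)
        n_edges = sum(len(targets) for targets in graph.values())
        print(f"k={k}: {len(graph)} nodes with outgoing edges, {n_edges} edges")
```

Running it with several values of k shows how the graph changes with the k-mer size, which is the parameter varied in the hands-on session.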
Break (10:15-10:30)
Hands-On: Assemble a chromosome of baker's yeast (10:30-12:00)
Group Projects
You are pre-assigned to one of 7 groups, each working on a different dataset
We will discuss approaches and results in a group setting
Present results as a group
de novo assembly hands-on guide: https://docs.google.com/document/d/1B1-kO41zGpVpHdGpHL_65T7STP9iSODf0Rt0f0OxZAc/edit
Use three assemblers to assemble your data set (Velvet, ABySS, SOAPdenovo)
Know how to change the k-mer size in the assembler options, and try different k-mer sizes
Available via OSU HPC and iPlant DE
Identify which assembly has the best metrics
Length statistics (N50, total assembled length)
Dot plot against a reference genome
Group discussion: Best assembly for the fungal dataset (10 min)
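For comparing assemblies by the length statistics listed above, a minimal sketch of the N50 and total-length calculation is given here. It assumes the contig lengths have already been extracted from each assembler's FASTA output; the function names `n50` and `length_stats` are illustrative and not part of any assembler.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def length_stats(lengths):
    """Summarize an assembly by contig count, total length, and N50."""
    return {"contigs": len(lengths), "sum": sum(lengths), "n50": n50(lengths)}


if __name__ == "__main__":
    # Hypothetical contig lengths for two assemblies of the same data set.
    assemblies = {"k31": [2, 3, 4, 5, 6], "k51": [1, 1, 8, 10]}
    for name, lengths in assemblies.items():
        print(name, length_stats(lengths))
```

A higher N50 at a similar total length usually indicates a more contiguous assembly, but it says nothing about correctness, which is why the dot plot against a reference is also part of the comparison.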
Lunch (12:00-1:00)
Lecture: Big Data - Dan Stanzione (1:00-1:45)
JOINME: https://secure.join.me/923-172-978
Lecture: Read mapping and genetic variants - Haibao Tang (2:00-2:45)
Read mapping
Variant calling
Variant annotation
Visualization
Slides: https://docs.google.com/open?id=0Bx3KTjwwBb0rMGp3ZGdCM2VMWjA
Break (2:45-3:00)
Hands-On: Identify and annotate genetic variants of a mutant yeast (3:00-5:00)
Genome mapping hands-on guide: https://docs.google.com/document/d/1du_RuAKvWRG03e-N6mskO9wa09zVL_ne8FzuyjiavXk/edi
Compare two methods of finding SNPs - mapping the assembly vs. mapping the reads (preferred)
Understand and run the SNP calling pipeline (BWA -> samtools mpileup)
Understand how to change criteria for read alignments and variant calling
Visualize read alignments and variants in the IGV genome browser (instructions for running IGV are in the manual)
Determine the possible effects of a few of the best-quality SNPs
Understand how to categorize SNPs by location and effect
Group discussion: Best approach and criteria to call SNPs; discuss the significance of the SNPs called
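To make the idea of categorizing SNPs by effect concrete, here is a minimal sketch that classifies a single-base change inside a codon. It assumes the standard genetic code and a codon already read in the coding (forward) direction; the function name `classify_coding_snp` and the category labels are illustrative choices, not the output of any particular annotation tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon, offset, alt_base):
    """Classify a SNP at position `offset` (0, 1, or 2) of a reference codon."""
    codon = codon.upper()
    mutated = codon[:offset] + alt_base.upper() + codon[offset + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid (or stop stays stop)
    if alt_aa == "*":
        return "nonsense"     # introduces a premature stop codon
    if ref_aa == "*":
        return "stop_lost"    # stop codon now encodes an amino acid
    return "missense"         # amino acid change


if __name__ == "__main__":
    # Hypothetical SNPs given as (reference codon, offset, alternate base).
    for snp in [("CTT", 2, "C"), ("GAG", 1, "T"), ("GAA", 0, "T")]:
        print(snp, classify_coding_snp(*snp))
```

SNPs outside coding regions (intergenic, intronic, UTR) need gene coordinates to categorize by location, which this sketch does not cover.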