Update - 4-14-15
Update - 4/14/15
- First of all, I sorted all my data as needed with my shell script and samtools. That all worked nicely.
- I spent most of the week struggling with methylKit, but have been successful running it on some test datasets (small samplings of my real data). It took me a while because I don't know much about R and the documentation for methylKit assumes a high degree of familiarity.
- Reading Bismark .sam files into Methylkit requires them to be sorted by chromosome and position, as I said last week. You then need to run the following command:
Some notes on this:
read.bismark(location, sample.id, assembly, save.folder = NULL, save.context = c("CpG"), read.context = "CpG", nolap = FALSE, mincov = 10, minqual = 20, phred64 = FALSE, treatment)
- The read.context argument will take CpG, CHG, and CHH.
- save.folder will save the methylation percentage calls in a folder of your choosing
- sample.id is a string that acts as an identifier for downstream analyses (differential methylation, etc.)
- assembly is a string that should reference the genome assembly (TAIR10), but I don't think it has any effect on the functioning of the program
- Some of those arguments are optional, but the documentation doesn't offer any explanation of which.
- I don't know enough about sequencing at the moment to know good values for some of them, but I'm working on that.
- Here's an example of reading a Bismark .sam file:
When running this for real it would be much more efficient to script it. This is just for documentation purposes.
- Once .sam files have been read in they will need to be merged into one "methbase object":
- sample_list is a list of your methylation calls (different genetic backgrounds, same context)
methyl.obj=unite(sample_list, destrand=FALSE)
- After merging into one object the following command will identify differential (hyper) methylation:
> myDiff = calculateDiffMeth(methyl.obj, num.cores = 8) > myDiff25p.hyper = get.methylDiff(myDiff, difference = 25,qvalue = 0.01, type = "hyper")
- My goal of getting all this done, plus writing the python script was a bit too ambitious last week. So, for this coming week I am going to focus solely on python now that I have the basics down as far as methylKit. Viewing my methylation calls in CoGe is a good deliverable for the end of the project.
To-do:
- Bismark CoGe import script. Due 4/21/15
, multiple selections available,