Update - 4-14-15

Update - 4-14-15

Update - 4/14/15

  • First of all, I sorted all my data as needed with my shell script and samtools. That all worked nicely.
  • I spent most of the week struggling with methylKit, but have been successful running it on some test datasets (small samplings of my real data). It took me a while because I don't know much about R and the documentation for methylKit assumes a high degree of familiarity.
  • Reading Bismark .sam files into Methylkit requires them to be sorted by chromosome and position, as I said last week. You then need to run the following command:
    read.bismark(location, sample.id, assembly,
             save.folder = NULL, save.context = c("CpG"),
             read.context = "CpG", nolap = FALSE, mincov = 10,
             minqual = 20, phred64 = FALSE, treatment)
    Some notes on this:
  1. The read.context argument will take CpG, CHG, and CHH.
  2. save.folder will save the methylation percentage calls in a folder of your choosing
  3. sample.id is a string that acts as an identifier for downstream analyses (differential methylation, etc.)
  4. assembly is a string that should reference the genome assembly (TAIR10), but I don't think it has any effect on the functioning of the program
  5. Some of those arguments are optional, but the documentation doesn't offer any explanation of which.
  6. I don't know enough about sequencing at the moment to know good values for some of them, but I'm working on that.
  • Here's an example of reading a Bismark .sam file:

    When running this for real it would be much more efficient to script it. This is just for documentation purposes.
  • Once .sam files have been read in they will need to be merged into one "methbase object":
    methyl.obj=unite(sample_list, destrand=FALSE)
    - sample_list is a list of your methylation calls (different genetic backgrounds, same context)
  • After merging into one object the following command will identify differential (hyper) methylation:
    > myDiff = calculateDiffMeth(methyl.obj, num.cores = 8)
    > myDiff25p.hyper = get.methylDiff(myDiff, difference = 25,qvalue = 0.01, type = "hyper")
  • My goal of getting all this done, plus writing the python script was a bit too ambitious last week. So, for this coming week I am going to focus solely on python now that I have the basics down as far as methylKit. Viewing my methylation calls in CoGe is a good deliverable for the end of the project.

To-do:

  1. Bismark CoGe import script. Due 4/21/15