Thursday Aug 16

Introduction to Genomic Annotation - Joshua Orvis (8:30-11:00)

https://docs.google.com/presentation/d/1Z7P0qVuVEHQ0qwrVSdhJ_paumhsz7nT8I-fJC7FztAY/edit?usp=sharing

https://secure.join.me/733-561-924

JOIN.ME: https://secure.join.me/bioinfosu

Lunch (12:00-1:00)

Slides (PDF 4.4MB)

  1. Amazon EC2
  2. Wikipedia on "Virtual Machine"
  3. iPlant Atmosphere Video Introduction
  4. GNOME desktop
  5. Wikipedia on "VNC"

Introduction: Metagenomic Analysis With MEGAN - Peter Hoyt (1:30-1:50)

Links

  1. MEGAN Home
  2. More MEGAN presentations
  3. Another tutorial (PDF)

Hands-on: Metagenomic Analysis With MEGAN - Matt Vaughn (1:50-4:00)

  • Slides (PDF 3.4 MB)
    • Connect to your 'Entangled Genomes' VM using VNC Viewer
    • Load in the Phyllosphere data (from the Desktop folder on your VM desktop)
    • Follow along with your presenter to learn how to access various functions in MEGAN
    • Now, try to answer the study questions.

Study Questions

Taxonomic Composition

  1. Extract the sequence reads from the Malpighiales node
    1. Use NCBI BLAST to see if you can determine what other species was picked up in the metagenomics sample
      1. Hint: You can use BLAST right from the browser on your iPlant VM
    2. What is the common name for the most abundant species in Malpighiales?
    3. Why do you think this species was picked up from a leaf surface sample of soybean?
  2. What other types of eukaryotic organisms appear to be present on the leaf of soybean in this sample?
    1. Do you see any fungi that might be pathogenic?
      1. Hint: Internet searches for species names may help
  3. Can you use BLAST with reads from the dsDNA virus node to identify what may be trying to infect the plant used for the phyllosphere sample?
  4. Overall, how do the MEGAN and MetaPhlan taxonomic profiles differ?
  5. Are the results for Bacteria comparable between MEGAN and MetaPhlan?
    1. Hint: Download a Krona-formatted file (taxonomy.krona.html) based on the MEGAN Bacteria taxonomic profile from the Community Data/osu-entangled-genomes folder in the DE and compare it with the one you or your group generated from MetaPhlan on Wednesday
    2. Which method appears to be more sensitive for classifying bacterial populations?
      1. Why do you say this?
    3. MEGAN+BLAST picks up a wider range of taxonomic categories (Eukarya, etc.). How would you propose to extend or update MetaPhlan to make it sensitive to non-Bacteria taxa?
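
For the MEGAN versus MetaPhlan comparison in questions 4 and 5, it can help to line the two profiles up programmatically as well as by eye in Krona. The following is only a minimal sketch. It assumes each profile has already been reduced to plain "taxon<TAB>abundance" lines, which is not the native export format of either tool, so some conversion would be needed first. The function names and the genus names and values in the example are made-up placeholders, not results from the phyllosphere data.

```python
from typing import Dict, Set, Tuple


def parse_profile(text: str) -> Dict[str, float]:
    """Parse 'taxon<TAB>abundance' lines, skipping blank lines and '#' comments.

    The two-column layout is an assumption made for this sketch.
    """
    profile: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        taxon, value = line.split("\t")
        profile[taxon] = float(value)
    return profile


def compare_profiles(
    a: Dict[str, float], b: Dict[str, float]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (taxa in both, taxa only in a, taxa only in b)."""
    shared = set(a) & set(b)
    return shared, set(a) - shared, set(b) - shared


# Hypothetical placeholder profiles; replace with your own converted data.
megan_text = "GenusA\t40\nGenusB\t35\nGenusC\t25\n"
metaphlan_text = "# taxon\tpercent\nGenusB\t60\nGenusC\t40\n"

megan = parse_profile(megan_text)
metaphlan = parse_profile(metaphlan_text)
shared, only_megan, only_metaphlan = compare_profiles(megan, metaphlan)

print("Found by both:", sorted(shared))
print("Only in MEGAN:", sorted(only_megan))
print("Only in MetaPhlan:", sorted(only_metaphlan))
```

Taxa that show up in only one list are a reasonable starting point for the sensitivity question, but keep in mind that the two tools may report abundances in different units (read counts versus relative abundance), so compare presence and rank rather than raw numbers.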

Functional Analysis

  1. Subselect the Bacteria node from the taxonomic tree
  2. Launch a SEED analysis and explore the various categories. Given that these bacteria were harvested from the aerial portion of soybean, a nitrogen-fixing species, please consider the following:
    1. How do the majority of bacteria on these soybean leaves appear to move around?
    2. Do you think these species generally use aerobic or anaerobic respiration?
    3. If you were to examine the phyllosphere of maize grown in the same field as these soybeans, do you think you would see a similar enrichment of nitrogen metabolism genes?
    4. What kind of stresses do leaf surface bacteria appear to encounter, based on their SEED profile?
  3. Subselect the Bacteria node from the taxonomic tree
  4. Launch a KEGG analysis
    1. Can you find enriched KEGG categories that are consistent with the findings from SEED?
    2. Which type of analysis (KEGG or SEED) do you find more useful for understanding a metagenomic dataset?
    3. What is the habitat (at least according to MEGAN) for _Bradyrhizobium japonicum_?
    4. Are there any anaerobic bacteria present?
    5. Identify one bacterial genus that is facultatively aerobic.

Microbial Attributes

  1. Subselect the Bacteria node again
  2. Launch the Microbial Attributes analysis
    1. Are the majority of classified species Gram-positive or Gram-negative?
    2. What is the habitat of Bradyrhizobium japonicum?
    3. How do most species found on soybean leaves move around?
    4. Can you identify a genus that is facultatively aerobic?
    5. What pathogenic species has been identified from this metagenomic sample?
      1. Search the internet and determine whether it is able to infect soybean (Glycine max).
    6. What does a 'Mesophilic' temperature range mean?

Experimental strategies

  1. Design your own experimental and bioinformatics strategy for comparing the phyllosphere of a non-nitrogen-fixing species to that of soybean
    1. What species would you examine?
    2. Are there other nitrogen-fixing species you could examine to shore up some of your results?
    3. Describe the sequencing strategy you would use that will give you similar results to the soybean phyllosphere sample
    4. Design a bioinformatics strategy starting with acquisition of sequence and ending with import into MEGAN for this project.
      1. Enumerate the tools you will need, the order in which you will use them, and explain the purpose of each.
  2. Are there other phyllosphere data sets at the NCBI SRA? Could any of them be useful for comparative analysis with the soybean phyllosphere?

Synthesis Period (4:00-5:00)

  • In your groups, design a 5-minute presentation for tomorrow that addresses one section of the Study Questions (Taxonomic Composition, Functional Analysis, Microbial Attributes, or Experimental strategies)
  • Try to send your presentation to Dana (dana.s.brunson@gmail.com) by the end of the day