iPG2P Data Integration

October 22, 2009, 4pm EDT

Attendees: Doreen Ware, Damian Gessler, Jerry Lu, Pankaj Jaiswal, Doina Caragea, Qi Sun, Ruth Grene, Eva Huala, Chris Jordan, Jim Jones, Steve Welch, Lukas Mueller, Karla Gendler

Action Items:

  • Eva, Lukas, Pankaj, Qi Sun: What are the standards of exchange for genome sequence? For metadata? What are the file formats?


  1. Introduction of participants
  2. Introduce Confluence as collaboration environment
  3. iPlant support staff (support@iplantcollaborative.org) to deal with problems
  4. iPG2P Working Group review
    1. WorkingGroups.ppt
  5. Preliminary review of NGS needs based on workflow
    1. InitialTasks.ppt
    2. Is it possible to produce a single output that can be used by all groups? Other groups will need to define what is needed in the format, though.
      1. Qi Sun: We probably won't have a single file format, as each group will provide its own. We need to focus on cross-repository search, and on how to integrate across species and across different platforms.
      2. Chris Jordan: Is it possible to archive all the data in one format but then transform it into each format type that is needed? Looking for ways to minimize the number of formats internally.
      3. Jim Jones: What is the scope of this project? Is the main scope aimed at the G2P effort, and what are the sources of data (DNA, RNA, phenotype information)?
        1. Main scope is aimed at G2P effort
        2. Will international research centers be a contact for data sources?
      4. Steve Welch: The modeling group has begun to identify what metadata is necessary for formats
  6. Identification of Action Items
    1. All: look at NGS Workflow
      1. Reference genome sequence and annotations (TAIR, Solanaceae group): going to end up being a resource that links to you and links back; needs to link to model organism database
      2. What are standards of exchange for genome seq?
      3. What are standards of exchange for metadata?
      4. How do we want to get attributes from model organism databases and how to feed it back?
    2. Eva and Lukas being targeted initially
      1. Review existing repositories and the formats they use
      2. For Lukas, a very viable option as they have a lot of genotyping and phenotyping info
    3. Eva, Lukas, Pankaj, Qi Sun
      1. How to go between sequence information and phenotype information
    4. First push is to align to reference genomes so need to ID people who work with reference genomes
      1. What are the file formats and data exchange standards?
    5. What about phenotype info w/o reference genome?
  7. Set date for next meeting
    1. Thursday, November 12, 2009 12pm PST (3pm EST)


Topic: iPG2P Data Integration Intro Meeting
Date: Thursday, October 22, 2009
Time: 1:00 pm, Mountain Standard Time (GMT -07:00, Arizona)
Meeting Number: 752 894 132
Meeting Password: iPC123
