SC_20100114

iPG2P Steering Committee Minutes
January 14, 2010; 8am to 5pm PST
San Diego, CA

Present: Steve Goff, Steve Welch, Bernice Rogowitz, Matt Vaughn, Dan Kliebenstein, Ruth Grene, Greg Abram, Chris Myers, Ed Buckler, Doreen Ware, Jerry Lu, Damian Gessler, Tom Brutnell, Karla Gendler, Chris Jordan, Dave Micklos, Jeff White, Sonya Lowry, Dan Stanzione (remote), Martha Narro (remote)

Agenda/Presentations:

Day 2: Thursday, January 14
8:00	Day 2 Goals and Review Use Cases	Steve W/Matt
8:30	Breakout Sessions
	Carbon Metabolism
	Hypothesis Generation Through Data Mining, Processing & Visualization Flipchart images		La Jolla Board Room
9:30	Group Reports (15-min each)
10:00	Break
10:20	Breakout Sessions: Cross-cutting Needs
	NGS and DI to look at what issues may confront the two activities
	User interface: aspects of overall metaphor and how pieces fit together Flipchart images		La Jolla Board Room
12:00	Group Reports (15-min each)
12:30	Working Lunch: Related Activities CI Updates and Planning Discussion	Dan S (call-in)
1:45	EOT Discussion/Planning Distributed Research Projects: Education for Big Science	Dave
3:00	Break
3:20	Planning	Sonya, Matt, Steve W.
4:30	Wrap-Up and Next Steps	Matt and Steve W
5:00	Adjourn

Summary

Welcome
The overall goal of this meeting is to develop a timeline with concrete deliverables for a May/June timeframe. The breakout group discussions were set: issues of the overall user interface, in particular some aspects of the overall metaphor and how the pieces fit together; and NGS and DI, to look at what issues may confront the two activities.

Issues/Discussion
Steve W opened the meeting up for discussion, asking if there were any issues that may have been missed yesterday. White stated that moving to high-throughput phenotyping is a challenge for the whole plant community. If it does become a reality, the new technology will turn the whole world of phenotyping on its head. How is iPlant going to interface with that, and how is that data going to get organized?

At the Board of Directors’ meeting, Goff stated that the process of identifying grand challenges is very slow and that there is a lot of replication/duplication in the proposals. Instead, he offered that it makes more sense to think about what iPlant is already working on and what can be added to extend it. The Scientific Opportunities team (SOT), led by David Salt, has been formed to think about what iPlant is not doing and what can be added to address new GCs. However, the SOT needs a very clear idea of what iPlant is working on. For the Year 3 conference, iPlant will recruit people to give scientific talks on why the current iPlant Grand Challenges modules are needed and what new ones are needed. This can provide a mechanism to elevate/incorporate phenomics.

Buckler commented that phenomics would be limited by CI soon, if it is not already. Rogowitz added that she is surprised that there isn’t more image data in this GC. Grene is concerned that everything seems to be related to maize and to sequencing.

Breakout reports
Hypothesis Generation - Ed Buckler
See flip chart images
An immediate deliverable is to bring in reference genomes and annotations.

Carbon Metabolism - Steve Welch
The group built upon two threads: 1) the idea that there would be a meeting in May of a scientific nature that would demonstrate the capability of the CI, and 2) bringing that to the specific use case of carbon metabolism, in the sense of photosynthesis.

The group suggested that they would like to look at the molecular level and at the whole plant level. The group determined that linking the two is doable in the time frame outlined. This leads to a visualization requirement that displays linkage or pairing. White agreed to take the flowering time model and work with the NAM dataset on it.

Discussion by the group as a whole shifted to what iPlant can do better. Rogowitz suggested that a big picture of what iPG2P and iPToL are doing is needed, and she would also like a closer connection to the development group. Ware stated that there is a need to identify top priorities. Welch commented that perhaps the Engagement Team framework is not the best and that tighter loops are needed with the development team. Possible solutions proposed to help tighten this loop included hackathons, weekly/bi-weekly conferences with the core team involved, and more frequent updates by the project manager. Discussion was tabled until Dan Stanzione could join.

Breakout reports II
User Interface - Bernice Rogowitz
See flip chart

NGS/DI Needs
There was no breakout report from this group, as we had to move to the lunchtime discussion. This breakout group did spend time talking about ways to tighten communication between the development staff and the working groups.

Lunch Time Discussion
Dan Stanzione’s presentation can be found on the iPlant wiki. iPlant needs the working groups to define what they want as software products for version 1, version 2, etc. In the working group meetings, discussion should be wrapped up and a document created by the Engagement Team Analysts; this document should then be sent to Core. Once the document has been sent to Core, the WGs can move into meeting hiatus, but leads, co-leads, ET, and Core will all need to work together on further software requirements. The WG will need to come back in touch when leads feel comfortable with the level of development.

Buckler and Brutnell stated that the NGS has done this and they don’t really want to move to version 2 yet as people are losing interest with no visible product. Stanzione said to declare victory with NGS and let the group know we will be back in touch soon.

Buckler commented that there is a serious problem describing to the community what iPlant is doing (i.e. building infrastructure). There is a need to have more outreach and more communication, as people want to know what is happening at iPlant. Narro added that there is a need to educate biologists about what is different about iPlant (building infrastructure versus building scientific tools). Stanzione agreed to help Narro with drafting a white paper as part of the outreach effort. Buckler would like to see more diagrams being published (i.e. the DE Software Architecture that Lowry presented).

AI: Matt V and Karla G will start communicating what core is working on to the individual working groups.

Jordan suggested that having members of the Core team in the Steering Committee meeting would also help with this communication gap. Stanzione suggested that in the SC meetings, it would be good to do deep dives with each WG but on a less frequent basis (once a month). This would occur at the point where making changes becomes too expensive, much like a software preliminary design.

Welch stated that Feb 3rd would be the first core software and SC interaction. The month of February would be used to experiment with SC meetings: two meetings in Feb, two weeks apart, to experiment and look at interaction.

EOT – Dave Micklos
Micklos’s presentation about iPlant’s EOT efforts to date can be found on the iPlant wiki. Rogowitz added that once workflows are defined, each would need an EOT component.

Buckler said that he could combine three of the EOT activities into one overarching project, connecting the two GCs. The idea would be to have students find plants in their neighborhood, extract the DNA, and send it to Cornell. One technician would then absorb the samples and sequence them. As an example, a student could send in a punch of a rose; it would then be sequenced, and the sequence would be sent back to the student. The student could then use iPToL’s tools to see that it is a rose. With enough flowering time data, the student could predict when it will flower based on its relatedness to other plants. The sample could also be analyzed using DNA Subway. This can also tie into Justin’s iPhone app. It would challenge people to find things that aren’t in the database.

Narro asked if we could ramp up the sophistication of statistics, etc., with this project. Kliebenstein stated that once the third database got big enough, it would be possible to do regression analysis and divergence. Narro added that she would like to get as much as possible out of each project.

Brutnell discussed the Brachypodium project, for which he has been working on a proposal. The Brachy project is very scalable and is aiming for a high school audience with zero budget. With this project, it would be possible to make DNA, bulk segregate, and sequence genes. It would be possible to link to this group to develop protocols to measure flowering time, stress, and photosynthesis. Brutnell and others will have to work with teachers to get a curriculum put together but would like to have a whole range of modules to hit students and teacher resources. Image capture is also needed in this project.

Welch commented that what Buckler proposed is NYTimes capability and asked whether we want to take the risk or be safe. Buckler said that this project needs to have safe bets and then start aiming to make the NYTimes. Ware added that the Brachy genome would come along as part of the triage in the next year.

The consensus recommendation is for Brutnell to redraft the Brachy proposal to include the ‘genesis’ of Buckler’s elements. iPG2P will go forward with the Brachy project, which will then lead to developing Buckler’s. Buckler will need to talk to iPToL to see what they can offer and if they agree. Justin also needs to be brought into the conversation.

AI: EB to talk to Pam, Doug, and Justin
AI: TB to provide proposal incorporating Buckler’s scenario in Brachy project

Steering Committee meeting agenda item: 3rd week in Feb, deliverables from both Tom and Ed

Planning/Wrap-up/Next Steps
NGS

Version 1 documented
Some WG should participate in evaluating products coming out of core infrastructure
April: Core will begin development
End of January: Core will begin to work on requirements
Feb-Apr: active requirements gathering between core and ET
April: infrastructure in place
June (late): Release 1.0
New functionality every two weeks
With every release, PM needs to communicate what is being developed and who is testing

SI

Version 1 prototype defined
- May: Ranger implementation
- Two other architectures planned
  - FPGA: Dan says April
  - GPU
    - April: initial port
    - June: optimization and scaling
    - June-August: how to package it
Development of specifications for more complicated models (NAIL DOWN DATE)
XQTL prize/competition: new working group meeting chat. What sort of infrastructure might be needed to support this? How do they get data in/data out and get cycles? Good idea to make sure we don’t make wrong algorithm choices

VA

Short term: pick a workflow and submit for feedback; scope the project, identify
- Formal request to have Core do drill down