TR_Project Charter
Project Charter
July 10, 2009
Version 1.0
Project Title: Tree Reconciliation
Phase I Start Date: June 2009
Phase I End Date: June 2011
Project Justification:
The Review Team constituted by the iPlant Board of Directors met in early March 2009 and recommended immediate engagement with the iPlant Tree of Life (iPToL) Grand Challenge Project Team to initiate a two-year project constructing a Phylogenetics CyberInfrastructure. In early May 2009, members of iPlant’s Executive Team and Engagement Team met with leads from the iPToL Grand Challenge Team to develop a management plan and create a roadmap for work to be conducted over the next two years. Collaborative implementation was organized into working groups with focused development goals. The four main working groups are Big Trees, Data Assembly, Tree Reconciliation, and Ancestral Character State Reconstruction. Two crosscutting working groups, Data Integration and Visualization, will develop shared data and compute infrastructure.
[reconcilwg:Scientific Summary to be provided by Todd/Dannie]
Project Objectives:
The iPlant Tree of Life (iPToL) project addresses the construction of the green plant tree of life to aid in the understanding of the diversification of green plants over the last billion years. The project will also build cyberinfrastructure to connect this tree to the rest of the plant sciences community and beyond, and to support post-tree analyses.
[reconcilwg:Scientific Justification]
Overview of Deliverables:
Final Deliverable: A web-based environment that receives trees (through user upload of their own data, selection of a pre-defined tree, or queries of tree databases), performs tree reconciliation, and reports the results. The discovery environment will provide portals to other larger databases such as TreeBase and plant genome databases.
Major Year 1 Milestones:
- Benchmark scalability and accuracy of existing tools (e.g., program A fails in B way with method C on a dataset with D taxa)
- Discovery Environment that accepts trees and data and does gene tree/species tree reconstruction
- Algorithms optimized to work on trees of at least 50k taxa
- Design simulation engine
- Begin system and interface design (e.g., mockups, user feedback)
- Define work needed for scalability/visualization/data exchange technology/other needs
Approach:
A working group will be formed to address both the short-term and long-term objectives above. The group will comprise Todd Vision and/or Dannie Durand (project champion) and at least one postdoc or grad student identified by the iPToL PIs, along with Sheldon McKay, Karla Gendler, and other members of the iPlant Phylogenetics Engagement Team. iPToL members and iPlant will be in frequent contact through scheduled weekly meetings and ad hoc channels (IRC, email, etc.). Developers will be able to put questions to the biologists as they arise (e.g., do polytomies need to be handled? Which data formats must be supported?). Planning and development will be made as public as possible through mailing lists, wikis, and discussion forums, to engage the broader community and receive feedback early in the design process.
iPlant’s Engagement Team will work with the working group to gather requirements and, if needed, prototype the solution to provide proof of concept. These requirements and prototype(s) will be given to iPlant’s core developers for iterative prototyping of the solution(s). The group will then work with the iPlant core developers to bring the software to production once the specifications identified in collaboration with iPToL scientists have been met. Releases will be early and often to show progress on the project and to gain support and user feedback.
Sheldon McKay and Karla Gendler will give monthly status reports to the Steering Committee and will also report back any changes and/or additions that have been identified by the Steering Committee.
Success Criteria:
- Creation of a web app that is actually used to do analyses quickly and easily
- Useful for large scale analyses
- Developed pathway/protocol for addition (through wrapping, recoding, etc.) of other methods to the discovery environment
Key Assumptions:
Broadly, iPlant is designed to build cyberinfrastructure and not generate new data. Thus, the Board of Directors recommends that iPlant not focus on new algorithm development but instead on providing HPC and scale-up expertise in support of existing software. The key is to be able to solve problems that need to be solved. There is risk assumed in achieving these goals. Best practices will be used to attack problems, and if progress is not being made, the problem may be brought to the Scientific Opportunities Team to discuss developing a proposal for algorithm development.
iPlant’s Engagement Team and developers will be the people with the most knowledge of the shared discovery environment. The work of the Tree Reconciliation working group must complement that of the other working groups, yet remain independent enough that a delay in one group does not dramatically affect the progress of the others (e.g., gene tree/species tree reconciliation on large trees will be useful to many researchers even before the 500,000 taxon plant tree is created).
Resources:
One unit of summer support ($10k) is allocated for the project champion. One $50k fellowship is allocated for one postdoc or one graduate student. Funding for iPlant staff members, meetings, and workshops, as well as EOT activities, comes from other sources and is not included in the working group budget.
Grand Challenge team members will have access to a scalable pool of reliable, enterprise class virtual servers for providing persistent web services, access to world-class high performance computing resources, and access to large scale, redundant storage systems with petascale capacity. Below is a description of what is currently available to iPlant. Note that these will change with time and needs.
- Compute: Ranger, Lonestar, Stampede (UT/TeraGrid); Saguaro, Sonora (ASU); Marin, Ice (UA)
  - ~700 Teraflops, more computing power than existed in all the Top 500 computers in the world 4 years ago
- Storage: Corral, Ranch (UT), Ocotillo (ASU)
  - Well over 10 Petabytes of storage can be made available for the project, on scalable systems capable of growing much more.
- Visualization: Spur, Stallion (UT), Matinee (ASU), UA-Cave
  - Among the world’s largest visualization systems
- Virtualized/Cloud Services: iPlant (UA) and ASU virtual environments, vendor clouds
  - Positioned to use cloud technologies to deliver persistent gateways and services to users
Roles and Responsibilities:
iPToL and iPlant will work together to establish an effective team consisting of iPlant personnel and appropriate super users/super postdocs to create use cases and specifications for objectives and deliverables.
iPlant's organization chart can be found in Appendix A. Appendices B and C contain role descriptions for the iPlant Engagement Team and the iPToL team, respectively. A list of key personnel is attached as Appendix D. The list is not comprehensive; please add names as appropriate.
The iPToL PIs should begin recruiting postdoctoral candidates immediately; iPlant will ensure funds are in place as soon as possible.
Conflict of Interest Policy:
A conflict of interest policy is currently being developed by iPlant in collaboration with the National Science Foundation, the iPToL Grand Challenge Team, and the Genotype-to-Phenotype Grand Challenge Team. In general, each participant should follow their own institution’s conflict of interest policy.
As an example under the conflict of interest policy being developed, a CoI would exist if a GCT lead or member could benefit financially from a piece of software, such as via a spouse or relative who developed the software or worked for the company that developed and sold the software.
Signatures: The following people agree that the above information is accurate:
Project team members:
Project sponsor and/or authorizing manager(s):
Notes/Comments:
Appendix A: iPlant CI Development Team Organization Chart
Appendix B: iPlant Engagement Team Roles
Scientific lead (Sheldon McKay):
- Interfaces with faculty and super users
- Provides design input and scientific leadership to the engagement team
- Reviews all deliverables
- Holds regular status meetings
- Provides regular status reports to Project Manager
- Manages and resolves team-level risks, issues, and changes
Project Manager (Karla Gendler):
- Aids Scientific Lead in supervising and providing technical direction to project team
- Executes project management processes: risk, issues, change, quality, and document management
- Maintains project plan and schedule; detects and manages variances
- Provides weekly project status reports
- Facilitates weekly team status meetings
Team Member:
- Major activities they will do (defined at NESCent meeting and after)
- Deliverables they will produce (defined at NESCent meeting and after)
- Attends status meetings or other appropriate meetings
- Participates in project management processes such as risk, issue, and document management
Appendix C: iPToL Team Roles
Project lead (Michael Sanderson):
- Interfaces with iPlant Executive Team
- Participates in Steering Committee Meetings
- Reviews all deliverables
- Oversees and manages all working groups
- Point of contact for project
Steering Committee Member
- Meets once a month via phone
Project Champion ()
- Point of contact for project
Super User/Postdoc/Grad Student
Working Group Member
Appendix D: Personnel
(updated as the project progresses)
| Name | Title | Role | Contact Information |
|---|---|---|---|
| Michael Sanderson | Proposal principal leader | main contact | sanderm@email.arizona.edu |
| Michael Donoghue | Proposal principal leader; Plant science community leader | | michael.donoghue@yale.edu |
| Pamela Soltis | Proposal principal leader; Plant science community leader | | psoltis@flmnh.ufl.edu |
| Douglas Soltis | Proposal principal leader; Plant science community leader | | dsoltis@botany.ufl.edu |
| Val Tannen | Proposal principal leader; Computational science community leader | | val@cis.upenn.edu |
| Alexandros Stamatakis | Proposal principal leader; Computational science community leader | | stamatak@cs.tum.edu |
| Todd Vision | Proposal principal leader; Computational science community leader | | tjv@bio.unc.edu |
| Rich Jorgensen | iPlant Principal Investigator | | raj@ag.arizona.edu |
| Steve Goff | iPlant Project Director | | sgoff@iplantcollaborative.org |
| Dan Stanzione | iPlant Co-PI; Director of Cyberinfrastructure Development | | dan@tacc.utexas.edu |
| Martha Narro | iPlant Director of Education, Outreach, and Training | | narro@email.arizona.edu |
| Sheldon McKay | iPlant Scientific Lead; iPToL Engagement Team | Scientific Lead | mckays@cshl.edu |
| Karla Gendler | iPlant Project Manager; iPToL Engagement Team | Project Manager | gendlerk@iplantcollaborative.org |
| Damian Gessler | iPlant Semantic Web Architect | | dgessler@iplantcollaborative.org |
| Sonya Lowry | iPlant Lead Developer | | sonya@iplantcollaborative.org |
| Edwin Skidmore | iPlant IT/Infrastructure Lead | | edwin@iplantcollaborative.org |
Appendix E: iPlant’s Programmatic Terms and Conditions
This is an excerpt from the cooperative agreement between the NSF and iPlant outlining NSF’s expectations of iPlant.
Program/Project Description: The goal of the program is to establish the iPlant Cyberinfrastructure Collaborative, taking into account the following considerations:
a) The iPlant Collaborative will utilize new computer, computational science and cyberinfrastructure (CI) solutions to address an evolving array of grand challenge questions in plant science;
b) The project will be community-driven, involving plant biologists, computer and information scientists and experts from other disciplines working in integrated teams to enable interdisciplinary systems-level scientific queries and analyses;
c) The project will use community-based processes to select grand challenge questions, employing a multi-step process that includes a community-wide conference at which candidate questions are selected for subsequent feasibility, impact and needs assessment via “readiness symposia”;
d) The project will develop community-driven, open-access digital Discovery Environments (DE) that are each focused on a grand challenge question through a selection process that includes evaluation of proposals from readiness symposia by a community Board of Directors (BoD);
e) The DEs will be comprehensive CI systems constructed around a grand challenge question and designed to enable collaboration, information access and integration, computational capabilities, visualization and analysis, modeling and simulation, learning resources, community annotation and other forms of content creation;
f) The DEs will comprise hardware, software, network infrastructure, connectivity, and the full range of appropriate science and technology expertise emphasizing Web 2.0 and web services approaches along with open-source, community development methods;
g) The DEs, software tools and systems, novel data sets and the like developed under direct project funding, will be open source and will be made openly available for reuse and repurposing, with attribution;
h) The research, education and outreach activities will be integrated fully into the project plan through involvement of students and educators in development of DEs;
i) Education activities will include, but not be limited to i) teacher intern programs with an emphasis on minority recruitment, and built around standards-based teaching modules that use DEs for discovery-based learning, ii) traveling workshops, iii) iPlant Action Teams (IPATS) and iv) integrated active assessment and evaluation components;
j) A designated Diversity Officer will ensure diversity of all levels of the iPlant Collaborative by: promoting diversity across the project team and advisory groups; engaging school districts serving minority and economically-disadvantaged populations in the educational development programs; performing outreach to a diverse range of professional societies, associations and academic institutions; and designing and implementing project educational activities to reach a diverse population;
k) Social science activities will be integrated into the project through development by an independent evaluator of a continuing evaluation process with formative and summative phases, organization of regular social science planning workshops by the iPlant Collaborative, and cooperation with social scientists who may conduct their own studies of the iPlant project.