TR_Project Plan
NOTE: Very much a document in progress!
Overarching Goal
Make the large iPTOL species tree accessible to its target audience of molecular, cellular, and developmental biologists by enabling (a) applications that use a large species tree to interpret the functional evolution of one to many gene families, and (b) generation of hypotheses about functional shifts along branches of the species tree through simultaneous analysis of multiple reconciled gene trees.
Subgoals
NOTE: These are just some suggestions to get the ball rolling. Let's continue brainstorming them and then prioritize them based on feasibility, usefulness, etc.
1. Characterize the scaling properties of existing algorithms for tree reconciliation.
2. Determine the accuracy of existing algorithms for tree reconciliation on large datasets in the presence of complications such as (a) topological uncertainty or error, (b) rooting uncertainty, (c) lineage sorting/hybridization, (d) horizontal transfer/endosymbiosis, and (e) deep branching.
3. Explore ways of quantifying the uncertainty in reconciliations that arises from topological and rooting uncertainty, for (a) duplication/loss inference, (b) orthology/paralogy inference, (c) rooting, and others.
4. Devise new visualization ideas for showing (a) uncertainty in reconciliations and orthology/paralogy inference, (b) patterns among multiple gene trees, (c) functional shifts (e.g., structural domain changes) within a reconciled tree.
5. Apply reconciliation to a comprehensive plant gene collection and large species phylogeny to examine any or all of these: (a) patterns of coevolution among gene families, (b) the existence of gene family "strata" in different plant lineages, (c) heterogeneity in patterns of gene family evolution across lineages, (d) other...
Preliminary project milestones
NOTE: This needs to be revised once the subgoals above have been refined.
10/09 | Assemble/recruit team, gather requirements, thoroughly review
1/10 | Benchmark scalability and accuracy of existing tools, begin system
4/10 | Implementation begins
1/11 | Begin analysis on comprehensive gene family dataset (“marquee
1/11 | Begin user testing
4/11 | Submission of “marquee analysis”. Begin work on training materials
7/11 | Disseminate at workshops & conferences