iPToL Data Management Issues
From Val Tannen
iPToL discovery environments import data from various sources and for
various purposes:

  1. Data for building the Big Trees (backbone trees) (iPToL group Stamatakis et al.)


Perhaps just molecular sequences -> gigantic matrix -> big tree (backbone)
(iPToL group Soltis et al.).
Still, we need to store these sequences, with provenance, for further reference
and reproducibility; we may also decide to make this provenance available to users.
Question to other groups: do we also need to import trees for this?
Maybe to do supertree construction and compare the result with the inferred
big trees?
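
One way to picture "sequences with provenance" is a record that keeps the sequence together with where and when it was obtained. The following is only a minimal sketch: the names `SequenceRecord` and `import_sequence`, and the choice of fields, are assumptions made for illustration, not an agreed iPToL schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import hashlib


@dataclass(frozen=True)
class SequenceRecord:
    """A stored sequence plus the provenance needed to reproduce a build.

    Hypothetical structure; the field list is an assumption.
    """
    taxon: str         # taxon name as given by the source
    sequence: str      # molecular sequence, whitespace removed, upper case
    source: str        # label for the originating database (placeholder)
    accession: str     # identifier within that source
    retrieved_at: str  # ISO 8601 timestamp of the import

    @property
    def checksum(self) -> str:
        # Hash the sequence so a later run can verify it used the same input.
        return hashlib.sha256(self.sequence.encode("ascii")).hexdigest()


def import_sequence(taxon, sequence, source, accession, now=None):
    """Normalize a sequence and stamp it with import-time provenance."""
    now = now or datetime.now(timezone.utc)
    cleaned = "".join(sequence.split()).upper()
    return SequenceRecord(taxon, cleaned, source, accession, now.isoformat())
```

Storing a checksum next to the source and accession would let a backbone tree be traced back to the exact inputs of its matrix, and the same record could be shown to users if provenance is exposed.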


  2. Character reconstruction data (iPToL group O'Meara et al.)


The nature of this data may vary a lot: geographic distribution data
(GBIF), seed size, leaf size (TRY), molecular/morphological
information (AToL matrices).
Character data is to be mapped onto the current backbone tree or onto other trees
imported from TreeBASE or other sources.
Users of this feature may wish to bring their own data,
or to specify that data be drawn from a new data source.
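
Mapping character data onto a tree comes down to matching the taxon names in a character table against the tip labels of the tree, and reporting what did not match. The sketch below assumes tip labels and taxon names are plain strings that differ only in case, underscores, and spacing; `map_characters` is a hypothetical helper, and real data would likely need the taxonomic name resolution discussed under the cross-cutting issues.

```python
def map_characters(tip_labels, character_table):
    """Attach character values to tree tips by (normalized) taxon name.

    tip_labels:      iterable of tip label strings from a tree
    character_table: dict mapping taxon name -> dict of character values
    Returns (mapped, unmatched_tips, unused_taxa).
    """
    def norm(name):
        # Treat underscores as spaces, collapse whitespace, ignore case.
        return " ".join(name.replace("_", " ").split()).lower()

    by_name = {norm(taxon): values for taxon, values in character_table.items()}

    mapped, unmatched_tips = {}, []
    for tip in tip_labels:
        key = norm(tip)
        if key in by_name:
            mapped[tip] = by_name[key]
        else:
            unmatched_tips.append(tip)

    # Taxa in the table that no tip in the tree claimed.
    used = {norm(tip) for tip in mapped}
    unused_taxa = sorted(t for t in character_table if norm(t) not in used)
    return mapped, unmatched_tips, unused_taxa
```

For example, with placeholder values (not real measurements):

```python
tips = ["Quercus_alba", "Acer rubrum", "Pinus strobus"]
table = {
    "quercus alba": {"seed_size": 1.0},
    "Acer  rubrum": {"seed_size": 2.0},
    "Zea mays": {"seed_size": 3.0},
}
mapped, unmatched_tips, unused_taxa = map_characters(tips, table)
```

Returning the unmatched tips and unused taxa explicitly makes it easy to tell a user which of their own records could not be placed on the chosen tree.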


  3. Gene/species tree reconciliation (iPToL group Vision et al.)


The species tree can be the backbone tree or one imported from TreeBASE or elsewhere.
The gene tree will be "brought" by the user, but we should facilitate the process.
There are several gene tree databases to consider (see the Evogenomics Group at NESCent).


Cross-cutting issues

  1. Taxonomic intelligence


uBio, NCBI, EoL?

  2. Management of iPToL products


Not just the backbone trees but also reconstruction and reconciliation results for specific users. User workspaces? Long-term storage?

  3. Visualization/user interfaces/browsing/querying/exploration


Up in the air?