How researchers view or select data sources
Considering the following plant gene tree databases...
Source: Intro presentation (PDF).
- Comprehensive for plants (i.e., includes EST data)
  - Phylota http://loco.biosci.arizona.edu/pb/
  - Phytome http://phytome.org
  - PlantTribes http://fgp.bio.psu.edu/tribedb/
- Non-comprehensive
  - PhyloFacts http://phylogenomics.berkeley.edu/phylofacts
  - Phytozome http://www.phytozome.net
  - TreeFam (Metazoan only) http://www.treefam.org/
From a requirements-analysis angle, knowing how researchers view these data sources helps when prioritizing which ones to interface with and when deciding what qualities are needed to satisfy and attract users.
Below are related questions. Feel free to answer in comments to this page:
- Is the absence of EST data the only difference between a "comprehensive" and a "non-comprehensive" data source? Are there other differences?
- Why would a researcher choose, say, Phytome over Phylota, or the reverse? Within the comprehensive data sources, are there any factors outside of usability that would cause a researcher in this domain to pick one over another?
  - (I believe the answer was whether the researcher is already familiar with the data source. Is that correct?)
- Do any of these databases (or data sources) provide a way to query and retrieve data through web services? Do any provide a programmatic API? Do any members of the working group have personal experience integrating with these databases or data sources that they could share?
- Would Phylota be a good candidate for a "gene catalog" to BLAST against, given that it is a comprehensive gene tree database?
- Please describe what the "best" database scenario would be for a user. From a UI perspective, what options would you like to have available? From a content perspective, what quality of data are you looking for?
- Are there data sources that provide coding DNA sequence alignments as AA-guided nucleotide alignments? If so, is it safe to assume they are comprehensive data sources for plants?