iPlant Omics Viewer Design Ideas

N. Provart, 17 Sept 2009. nicholas.provart@utoronto.ca 
 
The iPlant Omics Viewer would provide an easy-to-use and intuitive way to visualize gene expression and metabolite levels on expert-annotated pathways in Arabidopsis and crop species, with data for these pathways coming from e.g. Reactome.
 
User Interface  
Simple form to upload a list of AGI IDs (or other plant species gene identifiers) and expression values or ratios (over multiple time points/treatments if desired) and metabolite data

Summary page is returned showing which of the pathways in the DB are most affected at each time point/treatment

Click on a pathway link to receive a summary graphic painted with expression values and/or metabolite values, such as this mock-up:

 
The width of the arrows denotes the level of gene expression. Metabolite pools can be indicated by dots/circles of varying sizes. Such a representation is much more intuitive than existing representations that use different line colors or differently colored boxes, which can be difficult to see if the graphic is reduced in size (e.g. as currently implemented in the OMICS view for AraCyc/PlantCyc at http://arabidopsis.org/biocyc/ or SkyPainter in Reactome for Arabidopsis at www.arabidopsisreactome.org). Such colored representations also require mental gymnastics to interpret. Arrow widths are easily interpretable: wide = lots of expression, narrow = little expression. Metabolite pool sizes displayed as dots or circles of variable diameter might likewise be more intuitive than a coloring system (exemplified in the mock-up by the red circles on the lower left).
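The mapping from data values to arrow widths and circle sizes could be sketched as follows. This is only an illustration under stated assumptions: the function names, the pixel ranges, and the choice to scale circle area (rather than diameter) with pool size are all hypothetical and not part of the design above.

```python
import math


def arrow_width(expression, max_expression, min_px=1.0, max_px=12.0):
    """Map an absolute expression level to an arrow width in pixels.

    Assumption: linear scaling from 0..max_expression onto min_px..max_px,
    with out-of-range values clamped.
    """
    if max_expression <= 0:
        raise ValueError("max_expression must be positive")
    frac = max(0.0, min(1.0, expression / max_expression))
    return min_px + frac * (max_px - min_px)


def circle_diameter(pool, max_pool, max_px=30.0):
    """Map a metabolite pool size to a circle diameter in pixels.

    Assumption: the circle AREA is proportional to the pool size, so the
    diameter grows with the square root of the pool fraction.
    """
    if max_pool <= 0:
        raise ValueError("max_pool must be positive")
    frac = max(0.0, min(1.0, pool / max_pool))
    return max_px * math.sqrt(frac)


# Hypothetical values, for illustration only.
narrow = arrow_width(0.0, 100.0)      # lowest expression -> thinnest arrow
wide = arrow_width(100.0, 100.0)      # highest expression -> widest arrow
small_pool = circle_diameter(25.0, 100.0)
```

Scaling the area rather than the diameter is one way to keep large pools from visually dominating the pathway; a plain linear diameter scale would be an equally reasonable starting point.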
 
Another issue is scale: how can one view the entire biochemical pathway and still zoom in easily to individual biochemical steps? Both AraCyc and Reactome are clunky in this respect, as they use server-side delivery systems, and the layout of the networks is static with these systems. Also, if one has time series data, can the user traverse the time points using some sort of slider? One could imagine a “Google Earth” type of control:

Regarding data types, the values for gene expression can be either “absolute” expression levels (floating point values above zero) or “relative” measurements (fold-change relative to a control sample). In the relative case the data are ratios ranging from 0 to infinity, with 1 meaning no change. It is practical to work in log2 space, so that down-regulated genes have negative floating point values, up-regulated genes have positive floating point values, and no change has a value of 0. The same could also apply to metabolite data.
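As a small illustration of the log2 conversion, the sketch below turns fold-change ratios into log2 values. The function name and the example ratios are hypothetical; how a ratio of exactly 0 should be handled (here it is rejected, since its logarithm is undefined) is an open choice that the viewer would need to settle.

```python
import math


def to_log2(ratio):
    """Convert a fold-change ratio (treatment / control) to log2 space."""
    if ratio <= 0:
        # log2 is undefined for zero or negative ratios.
        raise ValueError("ratio must be positive to take a logarithm")
    return math.log2(ratio)


# Hypothetical ratios keyed by AGI ID, for illustration only.
ratios = {"At1g01010": 4.0, "At1g01020": 1.0, "At1g01030": 0.25}
log2_values = {gene: to_log2(r) for gene, r in ratios.items()}

for gene, value in log2_values.items():
    print(gene, value)
# At1g01010 2.0   (4-fold up)
# At1g01020 0.0   (no change)
# At1g01030 -2.0  (4-fold down)
```

In log2 space a 4-fold increase and a 4-fold decrease sit at equal distance from zero, which makes up- and down-regulation directly comparable when painted onto a pathway.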