iEvoBio abstract

The following abstract has been accepted for the 2010 iEvoBio conference in Portland, OR.

Visualizing Phylogenies and Metadata at the Scale of One Million Tips
Karen A. Cranston, Field Museum of Natural History, Chicago, IL (kcranston@fieldmuseum.org)
Adam Kubach, Texas Advanced Computing Center, Austin, TX (adamk@tacc.utexas.edu)
Michael J. Sanderson, University of Arizona, Tucson, AZ (sanderm@email.arizona.edu)
Kristopher Urie, Field Museum of Natural History, Chicago, IL (kurie@fieldmuseum.org)

Visualization of phylogenetic data at large scales is a longstanding problem in evolutionary biology. Ideally, we want to visualize patterns in trees and metadata at a broad scale and also investigate detail without losing context about location in the phylogeny. With large and integrated data sets, visualization tools should facilitate the dual role of exploration and presentation of complex data and provide simple interfaces to associate phylogenies with other types of data. Finally, software must be scalable and stable with an intuitive, attractive interface.

As an activity of the iPlant Tree of Life project (a Grand Challenge of the iPlant Collaborative [1]), we have developed a scalable visualization tool for exploration and presentation of large phylogenies and associated metadata. The tool displays and navigates phylogenies of one hundred thousand tips on an average laptop, and up to one million tips on a desktop computer. The interface provides an overview of the whole phylogeny plus a detail view with smooth and intuitive interaction. Using level-of-detail rendering, the user can pan and zoom on the tree detail, with clades collapsing and expanding automatically. It is straightforward to zoom to a specific clade, either by selecting the ancestral node or through search functionality. If the phylogeny is associated with a set of landmarks, such as internal node labels or a set of query results, the landmarks are used to aid in navigation by identifying collapsed clades. Users can save the current state (including level of zoom and annotations) and export to graphical formats (JPEG, PDF and SVG).
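As a rough illustration of how level-of-detail collapsing can work, the following C++ sketch draws a clade as a single glyph whenever its on-screen height falls below a pixel threshold. It is not taken from the viewer's source: the Node type, makeBalanced, countDrawn, and the pixelsPerTip and minPixels parameters are all hypothetical names chosen for this example, and the real tool may use a different criterion.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical tree node; the viewer's actual data structures are not shown in the abstract.
struct Node {
    std::vector<std::unique_ptr<Node>> children;
    std::size_t tipCount = 1;  // number of tips at or below this node (1 for a tip)
};

// Build a complete binary tree with 2^depth tips, used here only as test data.
std::unique_ptr<Node> makeBalanced(int depth) {
    auto node = std::make_unique<Node>();
    if (depth == 0) {
        return node;  // a tip
    }
    node->children.push_back(makeBalanced(depth - 1));
    node->children.push_back(makeBalanced(depth - 1));
    node->tipCount = node->children[0]->tipCount + node->children[1]->tipCount;
    return node;
}

// Count the glyphs that would be drawn at the current zoom level.
// A clade whose height (tipCount * pixelsPerTip) is below minPixels is
// collapsed into one glyph; otherwise the traversal descends into it.
std::size_t countDrawn(const Node& node, double pixelsPerTip, double minPixels) {
    assert(pixelsPerTip > 0.0 && minPixels > 0.0);
    const double height = static_cast<double>(node.tipCount) * pixelsPerTip;
    if (node.children.empty() || height < minPixels) {
        return 1;  // a tip, or a collapsed clade drawn as a single glyph
    }
    std::size_t drawn = 1;  // the expanded internal node itself
    for (const auto& child : node.children) {
        drawn += countDrawn(*child, pixelsPerTip, minPixels);
    }
    return drawn;
}
```

Zooming in corresponds to increasing pixelsPerTip, which expands more clades, while zooming out collapses them again. Because collapsed subtrees are never traversed, the work per frame tracks what is visible rather than the total number of tips.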

Ongoing development is focused on associating large phylogenies with metadata while maintaining the existing benchmarks for scalability. A wide range of metadata, such as images, text and numbers, can be visualized on the tree by adding labels or by changing the colour and style of nodes, clades and branches. Users can incorporate metadata through manual annotation or by uploading simple flat files through the interface. To facilitate data sharing and reuse, the interface will encourage the addition of semantic context by associating metadata with ontology terms, such as those in the Comparative Data Analysis Ontology (CDAO) or the Plant Ontology [2]. Using the phyloinformatics technologies developed by the EvoIO collaborative [3], including the NeXML file format, CDAO and the PhyloWS standard for phylogenetic web services, we will provide an interface to find and retrieve trees and metadata from online databases.
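The abstract does not specify the flat-file format for metadata. As a minimal sketch, assuming a hypothetical layout of one tab-separated "tip label, value" pair per line with '#' marking comment lines, metadata could be read into a lookup table keyed by tip label as follows. The function name parseMetadata and the format itself are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Assumed format (not from the viewer): "tip_label<TAB>value" per line,
// with lines starting with '#' treated as comments.
std::map<std::string, std::string> parseMetadata(const std::string& text) {
    std::map<std::string, std::string> byTip;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') {
            continue;  // skip blank lines and comments
        }
        const std::size_t tab = line.find('\t');
        if (tab == std::string::npos || tab == 0) {
            continue;  // skip rows with no tab or with an empty label
        }
        byTip[line.substr(0, tab)] = line.substr(tab + 1);
    }
    return byTip;
}
```

Each value could then drive a label, colour or style for the matching tip. Keying by tip label keeps the file independent of any particular tree file format.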

The viewer can run as a standalone application or embedded in a web page; an example of the latter is the forthcoming iPlant Discovery Environment, where the visualization will be linked to data sources and analysis tools. The viewer is written in cross-platform C++ and uses OpenGL to render. It is released under the GNU General Public License, version 2.

Binaries and the SVN repository are available at https://pods.iplantcollaborative.org/wiki/x/HQo7

1. http://iplantcollaborative.org
2. http://www.plantontology.org
3. http://evoio.org