Biological Network Analysis with Cytoscape

Introduction

Cytoscape is a publicly available platform for visualizing and analyzing biological network data. It can load network data from public repositories, import files in a wide variety of network data formats, or infer networks from the literature or from experimental data such as co-expression patterns. Once the network data is loaded, Cytoscape lets you integrate experimental and attribute data with the network and set the visual properties of the nodes and edges according to this data. This enables visual analysis of the experimental or attribute data in a functional framework, such as visualizing gene expression changes within a network topology to suggest patterns of regulation and dysregulation. Cytoscape is extensible, and a variety of community-submitted apps provide additional functionality for data import and analysis.

Specific Objectives

By the end of this module, you should be able to:

Download biological pathways from public databases and personally-curated files
Visualize pathway data with experimental results
Identify and install Cytoscape apps
Build text mining networks

Materials

Interactions from the yeast galactose metabolism pathway (galFiltered.sif)
Gene Expression Data from Pellegrini et al (Pellegrini_et_al_Data.txt)

Prerequisites

A computer with at least a 1 GHz processor, at least 512 MB of memory, on-board video, and an XGA (1024x768) monitor
Java SE 6.0
A web browser
The files specified under Materials.

Workflow

Overview
1. Open a web browser, and go to Cytoscape.org
2. Click on Documentation and For Users to see available manuals and Quick Start guides.
3. Click on Documentation and Cytoscape at Open Tutorials to see available tutorials.
Launch Cytoscape and download a set of interactions
1. In the Welcome to Cytoscape window, under Start New Session, select From Network Database. Alternatively, select File -> Import -> Network -> Public Databases.
2. Enter the gene names EGFR, GRB2, and SHC1, each on a separate line, and click Search.
3. Your window will show the number of interactions available from the specified repositories. To select only the interactions from IntAct, click Clear, check the box next to IntAct, and click Import.
4. Click on the Apply Layout button to lay out the network.
Basic Navigation
1. Mouse over the buttons on the toolbar to notice the tooltips.
2. Click on a node to select it. Notice how data on the node appears in the Data Browser.
3. Select multiple nodes by shift-clicking, or by holding down the left mouse button to select a rectangular area.
4. Select the Edge Table tab in the Data Browser. Select an edge by clicking on it, or by selecting a rectangular region with the mouse.
5. Select the gene HER1 (an alias of EGFR) by entering its name in the search box.
6. Select its immediate neighbors by clicking on the First Neighbors of Selected Node button.
7. Copy these nodes into a separate network by clicking the New Network from Selection button.
8. Zoom in and out with the Zoom In and Zoom Out buttons. Notice how the visible region is indicated by the blue square in the Network Overview window. Move this square to change the visible region.
9. Filtering: create a network of only human interactions
  1. Click on the Filter tab in the Control Panel
  2. Under Column/Filter, scroll down to node.taxonomy.name and click Add
  3. Click on the pull-down menu next to taxonomy name and select human
  4. Click on Apply Filter
  5. Click on the New Network From Selection button
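The selection steps above (first neighbors, new network from selection, and filtering on a node attribute) can be pictured as simple set operations on an edge list. The following is a minimal sketch in plain Python, not Cytoscape's own code: the toy edges, the taxonomy values, and the function names are all made up for illustration, and it assumes that a new network built from a selection keeps every edge whose two endpoints are both selected.

```python
# Toy network for illustration only; these edges and taxonomy values are
# invented to mirror the tutorial and are not taken from IntAct.
edges = [
    ("EGFR", "GRB2"),
    ("EGFR", "SHC1"),
    ("GRB2", "SOS1"),
    ("SHC1", "Grb2_mouse"),
]
taxonomy = {
    "EGFR": "human",
    "GRB2": "human",
    "SHC1": "human",
    "SOS1": "human",
    "Grb2_mouse": "mouse",
}


def first_neighbors(edges, selected):
    """Return the selected nodes plus every node one edge away from them."""
    result = set(selected)
    for a, b in edges:
        if a in selected:
            result.add(b)
        if b in selected:
            result.add(a)
    return result


def subnetwork(edges, nodes):
    """Keep only the edges whose two endpoints are both in `nodes`."""
    return [(a, b) for a, b in edges if a in nodes and b in nodes]


# Steps 5-7: select a node, extend to its first neighbors, copy to a new network.
egfr_neighborhood = first_neighbors(edges, {"EGFR"})
egfr_network = subnetwork(edges, egfr_neighborhood)

# Step 9: select nodes by an attribute value, then build a network from them.
human_nodes = {name for name, tax in taxonomy.items() if tax == "human"}
human_network = subnetwork(edges, human_nodes)
```

In this toy example the mouse node and its edge drop out of `human_network`, which is the same effect the filter has on the imported interactions.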
Other modes of network import
1. Import a curated network from public databases
  1. Under the File menu, select Import -> Network -> Public Databases
  2. Under Data Source, select Pathway Commons Web Service Client
  3. Under Search, enter EGFR. Change the All Organisms slider to Human. Click Search.
  4. Under Step 3: Select Network(s), scroll down to Pathway EGFR1 from Data Source Cancer Cell Map, and double-click.
  5. Lay out the network
2. Import interactions from a file. This illustrates how one can import network data from a simple tab-delimited file.
  1. Download the attachment galFiltered.sif, which contains a set of interactions related to the yeast galactose metabolism network
  2. Open the file in the text editor of your choice. The format is very simple: three delimited columns, representing interactorA, interactionType, and interactorB, respectively. The interaction type can be any arbitrary text, but using a specific, consistent interaction type is useful.
  3. In Cytoscape, select File -> Import -> Network -> File, and navigate to this file.
  4. Click OK to import the network
  5. Lay out the network
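To make the three-column SIF layout described above concrete, here is a minimal sketch of a reader in plain Python. It is not Cytoscape's importer: the function name `parse_sif` is hypothetical, and the sample lines are illustrative rather than copied from galFiltered.sif. It assumes the usual SIF conventions that fields are separated by tabs (or, failing that, by whitespace) and that one line may list several targets for the same source.

```python
def parse_sif(text):
    """Parse SIF text into (interactorA, interactionType, interactorB) tuples."""
    interactions = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Use tabs if the line contains any, otherwise split on whitespace.
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 3:
            # A line with fewer than three fields carries no interaction
            # (for example, a lone node name), so it is skipped here.
            continue
        source, interaction_type, *targets = fields
        # One line may name several targets; emit one tuple per target.
        for target in targets:
            interactions.append((source, interaction_type, target))
    return interactions


# Illustrative input only; "pp" and "pd" are arbitrary interaction type labels.
sample = "geneA pp geneB\ngeneC pd geneD geneE\n\nloneNode\n"
for interaction in parse_sif(sample):
    print(interaction)
```

Because the interaction type is free text, the reader does not validate it; any label in the middle column is carried through unchanged.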
Setting network visual properties
1. Navigate to the EGFR1 Cancer Cell Map network by clicking on its network canvas or selecting its name in the network Control Panel.
2. Download the attachment Pellegrini_et_al_Data.txt. This is a gene expression data set describing knockout of the gene CREB1.
3. Lay out the network.
4. Load the gene expression data into Cytoscape to merge with the network
  1. Select File -> Import -> Table -> File, and navigate to the file. This will bring up the table import window
  2. The Preview area in this window shows the contents of the file as they will be imported. Notice that the columns are labeled "Column 1", "Column 2", and so forth, even though the first line of the file clearly contains the column names.
  3. Set the first line to be the column names:
    1. Click on Show Text File Import Options
    2. Click the check box next to "Transfer First Line as Column Names"
  4. Set the mapping options to integrate this data into your network. To do this, you specify a "key column" in both the text file and in the network. These must match exactly!
    1. Notice that the first column is in blue. By default, the first column is the key column (this can be changed if needed). Notice that the first column contains gene symbols.
    2. Near the top of the window, notice the slider titled "Key Column For Network", set by default to "shared name".
    3. Set this instead to "ID_GENE_SYMBOL", which involves scrolling UP from "shared name"
  5. Click OK
  6. Use the Data Browser to verify that this data has loaded correctly.
    1. Select a set of nodes in the network
    2. Scroll to the far right side of the Data Browser to see the new attribute columns. You should see columns including "p value" and "fold change".
    3. Verify that these columns are not completely blank. If the columns are blank for some row, no expression data is available for that node. This is normal, since microarray datasets often do not report genes that were not expressed in the experimental condition. However, each column should contain some non-blank values.
    4. If the column is entirely blank, then repeat the preceding steps.
5. Color the nodes by fold change
  1. In the Control Panel, select the VizMapper tab. The VizMapper outlines how visual properties are set by data attributes, or by default values.
  2. Click on the menu next to Node Fill Color, and select "fold change".
  3. Click on the triangle next to Node Fill Color to expand the menu and define the mapping parameters.
  4. Set the Mapping Type to Continuous Mapping. The three mapping types available are Discrete Mapping, which assigns one color per distinct attribute value; Continuous Mapping, which assigns color across a continuum of values; and Passthrough Mapping, which assigns values directly and is used most often for displaying labels.
  5. When you select Continuous Mapping, Cytoscape will create a default mapping from black to white, with black representing low values, white representing high values, and gray representing intermediate values. Change this to a Blue-Yellow color map:
    1. Double-click on the black and white map to open the Continuous Mapping Editor window.
    2. Notice two horizontal-pointing triangles at the edges and two downward-facing triangles near the edges. Move the downward-facing triangles to the edges.
    3. Double-click on the black downward-facing triangle to select a new color. Select blue.
    4. Double-click on the white downward-facing triangle to select a new color. Select yellow.
    5. Add a white "handle" (downward-facing triangle) in the middle by clicking the Add button. This brings up a white handle.
    6. The numeric value below the handle indicates the fold change for which nodes are colored white. Slide this down to approximately 1.0.
    7. Click OK
    8. Notice how the network now contains blue, yellow, white, and pink nodes. The pink nodes have no fold change. The blue nodes have a fold change of less than 1, the yellow nodes have a fold change of greater than 1, and a darker color indicates a more pronounced fold change. Experiment with this by selecting some colored nodes and examining the fold change in the Data Browser.
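Two ideas from this section can be sketched in a few lines of plain Python: merging a data table into the node table through an exactly matching key column, and a continuous mapping that interpolates from blue through white (at a fold change of 1.0) to yellow. This is an illustration under stated assumptions, not Cytoscape's implementation. The function names, the "gene" column name, the sample values, and the pink RGB value standing in for the default node color are all hypothetical.

```python
def merge_by_key(node_table, data_rows, node_key, data_key):
    """Attach each data row to the nodes whose key value matches exactly."""
    by_key = {row[data_key]: row for row in data_rows}
    merged = []
    for node in node_table:
        row = by_key.get(node.get(node_key))
        # Nodes without a matching row keep their columns and gain nothing.
        extra = {k: v for k, v in row.items() if k != data_key} if row else {}
        merged.append({**node, **extra})
    return merged


def _blend(c1, c2, t):
    """Linearly interpolate between two RGB tuples; t is in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))


BLUE, WHITE, YELLOW = (0, 0, 255), (255, 255, 255), (255, 255, 0)
PINK = (255, 192, 203)  # hypothetical stand-in for the default node color


def fold_change_color(value, low, mid, high):
    """Map a fold change to a color: blue at low, white at mid, yellow at high."""
    if value is None:
        return PINK  # no data for this node, so the default color applies
    value = min(max(value, low), high)  # clamp to the ends of the color map
    if value <= mid:
        return _blend(BLUE, WHITE, (value - low) / (mid - low))
    return _blend(WHITE, YELLOW, (value - mid) / (high - mid))


# Made-up node and expression rows; the key values must match exactly.
nodes = [
    {"shared name": "n1", "ID_GENE_SYMBOL": "EGFR"},
    {"shared name": "n2", "ID_GENE_SYMBOL": "GRB2"},
]
expression = [{"gene": "EGFR", "fold change": 2.0}]

merged = merge_by_key(nodes, expression, "ID_GENE_SYMBOL", "gene")
colors = {
    node["shared name"]: fold_change_color(node.get("fold change"), 0.5, 1.0, 2.0)
    for node in merged
}
```

In this sketch the node without expression data receives the default color, which corresponds to the pink nodes in the tutorial, and values beyond the ends of the color map are clamped to pure blue or pure yellow.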
Managing Apps (formerly known as "Plugins")
1. Open your web browser and navigate to cytoscape.org
2. Click on the Apps button.
3. Browse through available apps. You can identify interesting apps on this page, and then download them directly through Cytoscape.
4. Download the Agilent Literature Search app
  1. Go to Apps -> App Manager
  2. In the Search box, enter Agilent
  3. Select Agilent Literature Search and click Install.
5. Run the Agilent Literature Search app. This app mines PubMed abstracts for sentences that suggest a molecular association, and builds a network from such sentences.
  1. In the Agilent Literature Search window, enter the term FGF1. In the Context window, you could enter a species or other molecular context. Leave this at the default value for now.
  2. Notice the query to be executed in the Query Editor window. One challenge in text mining is gene aliases. Click the Use Aliases checkbox and observe how the query changes.
  3. Click the blue arrow to begin your search. Cytoscape will produce a new network on the network canvas with your search results.
  4. Select an edge and check the corresponding sentences with Select -> Evidence from Literature -> Show Sentences from the Literature. This will bring up a window showing each sentence that supported an interaction between these two nodes.
  5. To curate the results and delete an undesired sentence, right-click on the sentence and then click on Delete Sentence. If you delete all sentences supporting an interaction, the corresponding edge will disappear when you return to the network canvas. If you delete all edges connecting a node, the node will also disappear.
  6. Experiment with other search terms and contexts, and with varying Max Engine Matches. This parameter sets the maximum number of articles to retrieve from PubMed, so keep it modest to avoid overloading the service.
6. Select some other app to install. Be sure to select one of the apps for Cytoscape 3.0.
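The curation behavior described in step 5 above (an edge disappears once all of its supporting sentences are deleted, and a node disappears once all of its edges are gone) can be modeled with a small data structure. The sketch below is purely illustrative and is not the Agilent Literature Search app's code: the class name, method names, and placeholder sentences are hypothetical, and the gene pairs are chosen only as example labels.

```python
class EvidenceNetwork:
    """Network in which an edge exists only while a supporting sentence remains."""

    def __init__(self):
        # Maps an unordered node pair to its list of supporting sentences.
        self.sentences = {}

    def add_sentence(self, a, b, sentence):
        self.sentences.setdefault(frozenset((a, b)), []).append(sentence)

    def delete_sentence(self, a, b, sentence):
        key = frozenset((a, b))
        self.sentences[key].remove(sentence)
        if not self.sentences[key]:
            # No evidence left, so the edge itself is removed.
            del self.sentences[key]

    def edges(self):
        return set(self.sentences)

    def nodes(self):
        # A node is present only if at least one remaining edge touches it.
        return {node for edge in self.sentences for node in edge}


# Placeholder sentences, not real PubMed text.
network = EvidenceNetwork()
network.add_sentence("FGF1", "FGFR1", "placeholder sentence 1")
network.add_sentence("FGF1", "HSPG2", "placeholder sentence 2")
network.add_sentence("FGF1", "HSPG2", "placeholder sentence 3")

# Deleting the only sentence for FGF1-FGFR1 removes that edge and the FGFR1 node.
network.delete_sentence("FGF1", "FGFR1", "placeholder sentence 1")
```

Deriving nodes and edges from the remaining evidence, rather than storing them separately, keeps the network consistent with the curated sentences by construction.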

Additional Resources

The Cytoscape consortium maintains a collection of online tutorials for learning specific parts of Cytoscape functionality. These tutorials reflect Cytoscape 3.0, which was first released in December 2012. There are additional tutorials that reflect Cytoscape 2.X. These will be updated to Cytoscape 3.X in time, but in the meantime they are still useful for learning general concepts, as the largest differences between Cytoscape 2.X and 3.X are internal, and many parts of the user interface have not changed.

The Cytoscape slides from the WiNGS 2013 presentation are available here.

References

Integration of biological networks and gene expression data using Cytoscape. Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, Gross B, Hanspers K, Isserlin R, Kelley R, Killcoyne S, Lotia S, Maere S, Morris J, Ono K, Pavlovic V, Pico AR, Vailaya A, Wang PL, Adler A, Conklin BR, Hood L, Kuiper M, Sander C, Schmulevich I, Schwikowski B, Warner GJ, Ideker T, Bader GD. Nat Protoc. 2007;2(10):2366-82.
Exploring biological networks with Cytoscape software. Yeung N, Cline MS, Kuchinsky A, Smoot ME, Bader GD. Curr Protoc Bioinformatics. 2008 Sep;Chapter 8:Unit 8.13.
A travel guide to Cytoscape plugins. Saito R, Smoot ME, Ono K, Ruscheinski J, Wang PL, Lotia S, Pico AR, Bader GD, Ideker T. Nat Methods. 2012 Nov;9(11):1069-76.