GeneMANIA query runner

  • How to download the tool or source code including installation and usage instructions as well as any source code that might be associated with the executable. This should also include a listing of any dependencies for this tool or script.

Minimum system requirements

  • Java 1.5+ JVM
  • Queries that use more than ~1.5 GB of RAM require Cytoscape to be launched with a 64-bit JVM
  • 2 GB RAM

A complete list of the networks currently in the GeneMANIA system is available at http://www.genemania.org/pages/networkList.jsf

Network data can be downloaded from within the GeneMANIA plugin.

  • Required version of the program necessary to perform the desired task
  • Version: 2.0
    Release Date: 2010-12-01
    Verified to work in: Cytoscape 2.6, 2.7, 2.8
  • Sample dataset and expected results to be output

USAGE (32-bit JVM):

java -Xmx1800M -cp GeneMANIA.jar org.genemania.plugin.apps.QueryRunner options query-file-1 [ query-file-2 ... ]

USAGE (64-bit JVM):

java -d64 -Xmx3G -cp GeneMANIA.jar org.genemania.plugin.apps.QueryRunner options query-file-1 [ query-file-2 ... ]
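The two usage lines differ only in their JVM arguments, so a wrapper script can assemble either one from the same parts. The following is a minimal Python sketch, not part of GeneMANIA; the helper name `build_command` and its parameters are hypothetical, while the heap sizes, the `-d64` flag, and the class name are copied from the usage lines above.

```python
from typing import List, Sequence


def build_command(
    jar: str,
    query_files: Sequence[str],
    options: Sequence[str] = (),
    use_64bit: bool = False,
) -> List[str]:
    """Return the argument list for one QueryRunner invocation (hypothetical helper)."""
    # JVM arguments taken verbatim from the 32-bit and 64-bit usage lines.
    jvm_args = ["-d64", "-Xmx3G"] if use_64bit else ["-Xmx1800M"]
    return [
        "java",
        *jvm_args,
        "-cp",
        jar,
        "org.genemania.plugin.apps.QueryRunner",
        *options,       # e.g. ["--data", "<dataset directory>", "--out", "flat"]
        *query_files,   # one or more query files
    ]


cmd = build_command("GeneMANIA.jar", ["query1.txt"], options=["--out", "flat"])
print(" ".join(cmd))
```

The returned list can be passed as-is to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.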

The query file is a tab-delimited flat file that contains the gene list and the other query parameters.

Flat Query File Format:

organism name
query-gene-1 [ \t query-gene-2 ... ]
networks
related-gene-limit
weighting method
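The five-line layout above is simple enough to generate from a script. The sketch below is an illustration only: the function name `format_query` is hypothetical, and the networks value is passed through as a ready-made string because this page does not state how several network types are separated on that line.

```python
from typing import Sequence


def format_query(
    organism: str,
    genes: Sequence[str],
    networks: str,
    related_gene_limit: int,
    weighting: str = "automatic",
) -> str:
    """Return the text of a flat query file (hypothetical helper)."""
    lines = [
        organism,               # line 1: organism name
        "\t".join(genes),       # line 2: query genes, tab-delimited
        networks,               # line 3: networks, written as given
        str(related_gene_limit),  # line 4: related-gene limit
        weighting,              # line 5: weighting method
    ]
    return "\n".join(lines) + "\n"


text = format_query("A. thaliana", ["HY5", "ELF3", "PHYB", "PHYA"], "coexp", 50, "bp")
print(text, end="")
```

Writing the result with `open("query1.txt", "w").write(text)` would produce a file ready to pass to the query runner.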

  • Organism options

Organism name      Taxonomy ID
A. thaliana        3702
C. elegans         6239
D. melanogaster    7227
H. sapiens         9606
M. musculus        10090
S. cerevisiae      4932

  • Available network types

coexp        Co-expression
coloc        Co-localization
gi           Genetic interactions
pi           Physical interactions
predict      Predicted
spd          Shared protein domains
other        Networks that don't belong to any of the above types
all          All available networks
preferred    Shorthand for coexp, gi, and pi
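A script that prepares queries may want to check network type codes before writing them to a query file. The sketch below is a hypothetical helper, not GeneMANIA code: it expands `preferred` as described above, and it assumes that `all` corresponds to the seven concrete types in this list, which the tool itself may resolve differently.

```python
from typing import Iterable, List

# The seven concrete network type codes listed above.
NETWORK_TYPES = ["coexp", "coloc", "gi", "pi", "predict", "spd", "other"]

# Shorthand codes; the expansion of "all" is an assumption made for this sketch.
SHORTHANDS = {
    "preferred": ["coexp", "gi", "pi"],
    "all": list(NETWORK_TYPES),
}


def expand_network_types(names: Iterable[str]) -> List[str]:
    """Expand shorthands and return unique type codes in first-seen order."""
    result: List[str] = []
    for name in names:
        if name in SHORTHANDS:
            expanded = SHORTHANDS[name]
        elif name in NETWORK_TYPES:
            expanded = [name]
        else:
            raise ValueError(f"unknown network type: {name!r}")
        for code in expanded:
            if code not in result:
                result.append(code)
    return result


print(expand_network_types(["coloc", "preferred", "gi"]))
```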

  • Weighting methods

automatic           The networks are weighted such that the query genes interact as much as possible (default).
average             All networks are weighted equally.
average_category    Networks are weighted such that each type of network has the same overall weight.

For organisms with GO annotations:

bp                  Networks are weighted in an attempt to reproduce Gene Ontology Biological Process co-annotation patterns.
mf                  Networks are weighted in an attempt to reproduce Gene Ontology Molecular Function co-annotation patterns.
cc                  Networks are weighted in an attempt to reproduce Gene Ontology Cellular Component co-annotation patterns.
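Because `bp`, `mf`, and `cc` apply only to organisms with GO annotations, a pre-flight check can reject an unsuitable method before a query is run. This is a minimal sketch with hypothetical names; whether a given organism has GO annotations is supplied by the caller, since this page does not list that per organism.

```python
# Method codes copied from the list above.
GENERAL_METHODS = ("automatic", "average", "average_category")
GO_METHODS = ("bp", "mf", "cc")


def is_valid_weighting(method: str, organism_has_go: bool) -> bool:
    """Return True if the method is usable for the organism (hypothetical helper)."""
    if method in GENERAL_METHODS:
        return True
    # GO-based methods need GO annotations for the organism.
    return organism_has_go and method in GO_METHODS


print(is_valid_weighting("bp", organism_has_go=True))
```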

Example query file: query1.txt

A. thaliana
HY5    ELF3    PHYB    PHYA
coexp
50
bp
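To inspect or validate an existing query file, the five lines can be read back into a small record. The sketch below assumes the layout described under "Flat Query File Format" (one value per line, genes separated by tabs); the `Query` class and `parse_query` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Query:
    organism: str
    genes: List[str]
    networks: str
    related_gene_limit: int
    weighting: str


def parse_query(text: str) -> Query:
    """Parse the text of a flat query file (hypothetical helper)."""
    # Ignore blank lines so stray empty lines do not shift the fields.
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) != 5:
        raise ValueError(f"expected 5 non-empty lines, found {len(lines)}")
    organism, genes, networks, limit, weighting = lines
    return Query(
        organism=organism.strip(),
        genes=[g.strip() for g in genes.split("\t") if g.strip()],
        networks=networks.strip(),
        related_gene_limit=int(limit),
        weighting=weighting.strip(),
    )


query = parse_query("A. thaliana\nHY5\tELF3\tPHYB\tPHYA\ncoexp\n50\nbp\n")
print(query)
```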

Excerpt from the output file query1.txt-results.report:

Gene             Score                Description
ELF3 ELF3 (EARLY FLOWERING 3); protein C-terminus binding / transcription factor
PHYB PHYB (PHYTOCHROME B); G-protein coupled photoreceptor/ protein histidine kinase/ red or far-red light photoreceptor/ signal transducer
HY5 HY5 (ELONGATED HYPOCOTYL 5); DNA binding / double-stranded DNA binding / transcription factor
PHYA PHYA (PHYTOCHROME A); G-protein coupled photoreceptor/ protein histidine kinase/ red or far-red light photoreceptor/ signal transducer
ATBETAFRUCT4 0.18 ATBETAFRUCT4; beta-fructofuranosidase/ hydrolase, hydrolyzing O-glycosyl compounds
TOC1 0.12 TOC1 (TIMING OF CAB EXPRESSION 1); transcription regulator/ two-component response regulator
SCL3 0.12 SCL3; transcription factor
F3H 0.12 F3H (FLAVANONE 3-HYDROXYLASE); naringenin 3-dioxygenase
AKIN11 0.11 AKIN11 (Arabidopsis SNF1 kinase homolog 11); protein binding / protein kinase
AtMYB32 0.11 AtMYB32 (myb domain protein 32); DNA binding / transcription factor
GA3 0.11 GA3 (GA REQUIRING 3); ent-kaurene oxidase/ oxygen binding
AT3G19100 0.09 calcium-dependent protein kinase, putative / CDPK, putative
COL9 0.09 COL9 (CONSTANS-LIKE 9); transcription factor/ zinc ion binding
ELIP2 0.09 ELIP2 (EARLY LIGHT-INDUCIBLE PROTEIN 2); chlorophyll binding
SPL7 0.09 SPL7 (SQUAMOSA PROMOTER BINDING PROTEIN-LIKE 7); DNA binding / transcription factor
AT2G25730 0.09 hypothetical protein
ATMRP4 0.09 ATMRP4 (ARABIDOPSIS THALIANA MULTIDRUG RESISTANCE-ASSOCIATED PROTEIN 4); ATPase, coupled to transmembrane movement of substances / folic acid transporter
AT5G48250 0.09 zinc finger (B-box type) family protein
AAE17 0.09 AAE17 (ACYL-ACTIVATING ENZYME 17); catalytic/ ligase
RHA2B 0.08 RHA2B (RING-H2 FINGER PROTEIN 2B); protein binding / ubiquitin-protein ligase/ zinc ion binding
MP 0.08 MP (MONOPTEROS); transcription factor
FLS 0.08 FLS (FLAVONOL SYNTHASE); flavonol synthase
AT3G61580 0.08 delta-8 sphingolipid desaturase (SLD1)
AT4G37180 0.08 myb family transcription factor
RPL23AA 0.08 RPL23AA (RIBOSOMAL PROTEIN L23AA); RNA binding / nucleotide binding / structural constituent of ribosome
POP1 0.08 POP1; transporter
GDH1 0.08 GDH1 (GLUTAMATE DEHYDROGENASE 1); ATP binding / glutamate dehydrogenase NAD(P)+/ oxidoreductase
DRT102 0.08 DRT102 (DNA-DAMAGE-REPAIR/TOLERATION 2)
ACC1 0.08 ACC1 (ACETYL-COENZYME A CARBOXYLASE 1); acetyl-CoA carboxylase
PHV 0.08 PHV (PHAVOLUTA); DNA binding / protein binding / transcription factor
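In the excerpt above, the four query genes carry no score while every related gene does. A row can therefore be classified by testing whether its second field is a number. The sketch below parses rows as they are rendered on this page, separated by whitespace; the real report is tab-delimited and may contain other sections, so treat the function and its `ReportRow` type as hypothetical illustrations.

```python
from typing import NamedTuple, Optional


class ReportRow(NamedTuple):
    gene: str
    score: Optional[float]  # None for query genes, which have no score
    description: str


def parse_report_row(line: str) -> ReportRow:
    """Split one excerpt row into gene, optional score, and description."""
    parts = line.split(None, 2)
    if not parts:
        raise ValueError("empty line")
    gene = parts[0]
    if len(parts) >= 2:
        try:
            score = float(parts[1])
        except ValueError:
            pass  # second field is not a number: this is a query gene row
        else:
            description = parts[2].strip() if len(parts) == 3 else ""
            return ReportRow(gene, score, description)
    rest = line.split(None, 1)
    description = rest[1].strip() if len(rest) == 2 else ""
    return ReportRow(gene, None, description)


print(parse_report_row("SCL3 0.12 SCL3; transcription factor"))
```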

  • One prediction report per query file.
  • Set of parameters and command line switches that match the expected execution of the tool including the possible command line definitions according to the occurrence of optional parameters. Also, validation instructions for parameters are requested.

For each parameter, the entries below give the actual command-line parameter, its name and a brief description, whether it is required, the default value, the value type (text, number, or name of file), and any validation rules.

--data
    Directory of the GeneMANIA dataset; file path.
    Required: yes
    Type: text
    Example: /Users/username/genemania_plugin/gmdata-2010-12-01

--in
    Input format; format of the query file.
    Required: optional
    Default: flat (tab-delimited)
    Type: text

--out
    Output format; format of the output files.
      genes: List of result genes ordered by score; one per line.
      flat: Tab-delimited report containing details of prediction results and query parameters.
      xml: XML-formatted report containing details of prediction results and query parameters.
      scores: List of result genes with scores, ordered by score, for the entire genome (ignores the related-gene limit); one per line.
    Required: optional
    Default: genes
    Type: text

--scoring-method
    Method used to compute the gene scores.
      Discriminant: GeneMANIA's classic scoring method.
      Z: Z-scores.
    Required: optional
    Default: discriminant
    Type: text
    Accepted values: Discriminant, z

--ids
    Gene identifier types. A comma-separated list of gene identifier types in descending order of preference. If the most preferred identifier is not available for a given gene, the next most preferred identifier is selected.
    Required: optional
    Type: text
    Accepted values, listed in the default order of preference:
      Ensembl Gene Name
      Entrez Gene Name
      Ensembl Gene ID
      Refseq mRNA ID
      TAIR ID
      Uniprot ID
      Refseq Protein ID
      Ensembl Protein ID
      Entrez Gene ID

--results
    Output file directory. Path to where the prediction result files will be created (one per input query file).
    Required: optional
    Default: working directory
    Type: text

--threads
    The maximum number of parallel predictions. Ideally this should be set to the number of processing cores.
    Required: optional
    Default: 1
    Type: number

--verbose
    Print more details about what's happening.
    Required: optional
    Type: text

--list_networks organism name
    Lists the available networks for the given organism. Put quotes around the organism name.
    Required: optional
    Type: text

--list-genes organism name
    Lists the genes that are recognized for the given organism. Put quotes around the organism name. Each line in the output contains a gene and all its synonyms, if any.
    Required: optional
    Type: text

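The fallback rule described for --ids (use the most preferred identifier type that a gene actually has) can be illustrated in a few lines. This Python sketch is not GeneMANIA's implementation; the function name and the dictionary of available identifiers are hypothetical, and the identifier values in the example are placeholders.

```python
from typing import Dict, Optional, Sequence, Tuple

# Default order of preference, as listed for the --ids parameter.
DEFAULT_ID_PREFERENCE = [
    "Ensembl Gene Name",
    "Entrez Gene Name",
    "Ensembl Gene ID",
    "Refseq mRNA ID",
    "TAIR ID",
    "Uniprot ID",
    "Refseq Protein ID",
    "Ensembl Protein ID",
    "Entrez Gene ID",
]


def pick_identifier(
    available: Dict[str, str],
    preference: Sequence[str] = DEFAULT_ID_PREFERENCE,
) -> Optional[Tuple[str, str]]:
    """Return (type, value) for the most preferred identifier the gene has."""
    for id_type in preference:
        if id_type in available:
            return id_type, available[id_type]
    return None  # the gene has none of the preferred identifier types


# Placeholder values: this gene has no "Ensembl Gene Name" entry.
gene_ids = {"TAIR ID": "placeholder-tair-id", "Entrez Gene Name": "placeholder-name"}
print(pick_identifier(gene_ids))
```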
  • Example invocation of the command line application and its associated parameters such that it can perform an analysis.

     java -Xmx1800M -cp GeneMANIA.jar org.genemania.plugin.apps.QueryRunner --data /Users/username/genemania_plugin/gmdata-2010-12-01 --out flat query1.txt
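After a run, a script usually needs to find the report that belongs to each query file. The sketch below derives the path from the one file name shown on this page, query1.txt-results.report, and from the --results default of the working directory. It assumes the same "-results.report" suffix applies to every query file, which this page does not confirm for all output formats; the function name is hypothetical.

```python
import os
from typing import Optional


def expected_report_path(query_file: str, results_dir: Optional[str] = None) -> str:
    """Guess where the report for a query file is written (assumed naming)."""
    # --results defaults to the working directory when it is not given.
    directory = results_dir if results_dir is not None else os.getcwd()
    return os.path.join(directory, os.path.basename(query_file) + "-results.report")


print(expected_report_path("query1.txt", "results"))
```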

  • Reference

Montojo J, Zuberi K, Rodriguez H, Kazi F, Wright G, Donaldson SL, Morris Q, Bader GD (2010). GeneMANIA Cytoscape plugin: fast gene function predictions on the desktop. Bioinformatics 26(22):2927-2928.

Mostafavi S, et al. (2008). GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function. Genome Biol. 9:S4.

Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, Maitland A, Mostafavi S, Montojo J, Shao Q, Wright G, Bader GD, Morris Q (2010). The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 38 Suppl:W214-20.
http://nar.oxfordjournals.org/cgi/content/abstract/38/suppl_2/W214