Linking between GO subontologies

About

The Gene Ontology (GO) is a rich resource of annotations describing the Cellular Component (CC), Molecular Function (MF) and Biological Process (BP) associated with a gene. Mining data within any one of these three subontologies is straightforward, and many tools exist to do this. However, questions that span subontologies, such as "what Molecular Functions occur in the Nucleus" (which spans the MF and CC ontologies), cannot be answered easily because there is no standard method to link them.

GOLink was created to help answer these types of questions.

Given one or more query GO term IDs, GOLink uses the GO Perl API to apply three increasingly stringent methods. Each method compiles a "terms list" containing terms across the subontologies that co-occur with the query term(s) in genes that have GO annotation in the GO database.
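
The co-occurrence idea can be illustrated without a database. The sketch below is not GOLink's implementation and does not reproduce its three methods. It assumes a hypothetical tab-separated file with the columns gene, GO term ID and subontology code, and a hypothetical function name, cooccurring_terms. For illustration it reports only terms from subontologies other than that of the query term, together with the number of genes shared with the query.

```shell
#!/bin/sh
# Hypothetical input: TSV with columns gene, GO term ID, subontology (BP/MF/CC).
# cooccurring_terms QUERY FILE
# Prints "count<TAB>term<TAB>subontology" for every term that shares at least
# one gene with QUERY and belongs to a different subontology.
cooccurring_terms() {
    query=$1
    file=$2
    awk -F '\t' -v q="$query" '
        # First pass: remember the genes annotated with the query term.
        NR == FNR {
            if ($2 == q) { genes[$1] = 1; qont = $3 }
            next
        }
        # Second pass: count each gene once per term in the other subontologies.
        ($1 in genes) && $3 != qont && !seen[$1 SUBSEP $2]++ { n[$2 "\t" $3]++ }
        END { for (k in n) print n[k] "\t" k }
    ' "$file" "$file" | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Usage: cooccurring_terms GO:0005634 annotations.tsv
```

Reading the file twice keeps the sketch short: the first pass collects the genes carrying the query term, and the second pass tallies the other terms annotated to those genes.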

Quick Start

  1. Download and unzip the software
  2. Install the prerequisites (Perl modules, plus MySQL if you plan to install a local GO database)
  3. Optionally, download and install a local copy of the Gene Ontology database
  4. Edit the configuration file to specify the location of your GO database (preferably a local installation)
  5. Read the manual
  6. Run the software

To run a test analysis, issue the following command:

./golink.pl --config config.cnf --query 'GO:0045120' --ev_filters '!IEA' --db_filters 'UniProtKB'

System Requirements

GOLink needs only a supported operating system and enough disk space for your input and output files. It is written in Perl, so Perl and the additional modules described below must be installed. GOLink is entirely dependent on access to a Gene Ontology (GO) database. We recommend installing a local copy, which speeds up processing considerably; instructions are given below. MySQL is required only if you install a local copy of the GO database.

Hardware Requirements

OS: Windows, Mac OS X or Linux
Smaller datasets will work on a 32-bit OS with 4GB of memory
Larger datasets will require more memory and a 64-bit OS

Software Requirements

Gene Ontology Database


You can either use the public Gene Ontology database with the connection details at http://www.geneontology.org/GO.database.shtml#online or install your own as follows:
  1. Download and install the MySQL client and server from http://dev.mysql.com (include the development packages if you install through a package manager such as yum or dpkg)
  2. Download the full MySQL GO database (12GB) from http://www.geneontology.org/GO.downloads.database.shtml
  3. Follow the installation instructions at http://archive.geneontology.org/latest-full/README
  4. Create a MySQL user to access the database
  5. Add the connection details to the relevant variables in the go_link.pl script
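
Step 4 could be carried out along the following lines. This is a minimal sketch that assumes GOLink only needs to read from the database and runs on the same machine as MySQL, so a SELECT-only account on localhost is enough. The function name go_user_sql is hypothetical, and the database name, user name and password are placeholders that you choose yourself.

```shell
#!/bin/sh
# go_user_sql DBNAME USER PASSWORD
# Prints SQL that creates a MySQL account limited to SELECT on one database.
# All three arguments are placeholders chosen by you; none come from GOLink.
# Values containing quote characters are not escaped by this sketch.
go_user_sql() {
    printf "CREATE USER '%s'@'localhost' IDENTIFIED BY '%s';\n" "$2" "$3"
    printf "GRANT SELECT ON \`%s\`.* TO '%s'@'localhost';\n" "$1" "$2"
}

# Example (run as a MySQL administrator; prompts for the admin password):
#   go_user_sql GOdbname GOuser GOpass | mysql -u root -p
```

Printing the statements first lets you review them before piping them to the mysql client.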

Perl Modules


Install the following modules from CPAN:
Pod::Usage
Getopt::Long
GO::AppHandle
Config::IniFiles
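
To see which of these modules are already available, something like the following may help. It is a sketch, not part of GOLink: the function name check_perl_modules is hypothetical, and it relies only on the standard perl -M switch, which fails if a module cannot be loaded.

```shell
#!/bin/sh
# check_perl_modules MODULE...
# Prints "ok" or "MISSING" for each module; returns non-zero if any is missing.
check_perl_modules() {
    status=0
    for module in "$@"; do
        # perl -M<module> -e 1 loads the module and exits immediately.
        if perl -M"$module" -e 1 2>/dev/null; then
            printf 'ok       %s\n' "$module"
        else
            printf 'MISSING  %s\n' "$module"
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   check_perl_modules Pod::Usage Getopt::Long GO::AppHandle Config::IniFiles
```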

Download

  1. Download

    You can download GOLink here

  2. Install GOLink

    Once all system requirements are fulfilled and you have downloaded GOLink, simply extract the contents to an appropriate local or system-accessible directory.

    For example on Linux: unzip golink_vxxx.zip

    Where xxx is the version of GOLink you have downloaded.

  3. Configure

    An example configuration file is provided with the software. Its essential purpose is to tell the script where your Gene Ontology (GO) database is. If you installed a local GO database, simply modify the server hostname, username, password and port details in this file.

    [GO]
    dbhost=GOhost.xyz
    dbuser=GOuser
    dbpass=GOpass
    dbname=GOdbname
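
Before running an analysis it can be useful to confirm that the values in the [GO] section actually connect. The sketch below is an assumption-laden helper, not something shipped with GOLink: ini_get is a hypothetical function that reads simple key=value lines such as those shown above, and the commented example passes the values to the standard mysql client.

```shell
#!/bin/sh
# ini_get FILE SECTION KEY
# Prints the value of KEY inside [SECTION] of a simple key=value INI file.
ini_get() {
    awk -F '=' -v section="[$2]" -v key="$3" '
        # Section header: strip blanks and note whether it is the wanted one.
        /^[ \t]*\[/ { gsub(/[ \t]/, ""); in_section = ($0 == section); next }
        in_section {
            k = $1
            gsub(/[ \t]/, "", k)
            if (k == key) {
                # Everything after the first "=" is the value.
                v = substr($0, index($0, "=") + 1)
                sub(/^[ \t]+/, "", v)
                sub(/[ \t]+$/, "", v)
                print v
                exit
            }
        }
    ' "$1"
}

# Example: try the connection (the mysql client prompts for the password).
#   mysql -h "$(ini_get config.cnf GO dbhost)" \
#         -u "$(ini_get config.cnf GO dbuser)" -p \
#         "$(ini_get config.cnf GO dbname)" -e 'SELECT 1;'
```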

Example

Run the script with --help for usage instructions:

./golink.pl --help

Test analysis using the default GOLink configuration file:

./golink.pl --config config.cnf --query 'GO:0045120' --ev_filters '!IEA' --db_filters 'UniProtKB'

The above command will take approximately 5 to 6 minutes using the EBI GO database and about 1 minute using a locally installed database.

Contact

GOLink was written by Richard Francis as part of his PhD in Bioinformatics at the University of Western Australia. Contact Us