Background
Motivation:
New high-throughput technologies such as DNA microarray chips generate large amounts of gene expression data that represent the physiological state of a single cell or a whole tissue. After statistical processing of the data, including normalization and identification of differentially expressed genes, it is very useful to combine the results with biological information. An automated functional analysis of microarray gene expression data is therefore desirable.

Gene Ontology - A powerful functional classification system:
Recently, information from Gene Ontology (GO), a functional classification of genes, has been successfully applied to the biological interpretation of microarray data. This yields a comprehensive overview of the biological functions whose gene expression is most strongly affected. The limiting factor of the first generation of GO-based interpretation methods was the need to set a somewhat arbitrary threshold value in order to select the genes that should be considered differentially expressed. The underlying statistical tests are mainly the hypergeometric test and Fisher's exact test. More recent tools implement methods that do not require a threshold value and can take all consistently measured genes from the DNA microarray into account. Here, the expression profile of the genes assigned to a particular GO node is treated as a frequency distribution. A statistical test compares this distribution to the distribution of the expression values of the genes that do not belong to the node (the background distribution). Tests commonly used for this purpose are Student's t-test, which assumes a normal distribution, and the Kolmogorov-Smirnov test. In addition, some permutation-based approaches have been proposed; these normally have a longer running time than the permutation-independent approaches.
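
The two families of tests can be illustrated with a minimal Python sketch. It is not taken from any particular tool; the function names and arguments are hypothetical and chosen only for illustration. The first function computes the one-sided hypergeometric p-value used by threshold-based methods. The second computes only the two-sample Kolmogorov-Smirnov statistic D (not its p-value) that threshold-independent methods use to compare a GO node against the background distribution.

```python
from bisect import bisect_right
from math import comb


def hypergeom_enrichment_p(n_total, n_node, n_selected, n_overlap):
    """Threshold-based view: probability of seeing at least n_overlap
    node genes among n_selected differentially expressed genes, when
    n_node of the n_total measured genes are annotated to the GO node."""
    upper = min(n_node, n_selected)
    tail = sum(
        comb(n_node, k) * comb(n_total - n_node, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_total, n_selected)


def ks_statistic(node_values, background_values):
    """Threshold-independent view: largest absolute difference between
    the empirical distribution functions of the expression values of
    the node genes and of the background genes."""
    a = sorted(node_values)
    b = sorted(background_values)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        fa = bisect_right(a, x) / len(a)  # share of node values <= x
        fb = bisect_right(b, x) / len(b)  # share of background values <= x
        d = max(d, abs(fa - fb))
    return d
```

In the first function, the choice of which genes count as "selected" depends on the threshold, which is exactly the arbitrary step discussed above. The second function uses every measured expression value and needs no threshold.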

Rationale of our tool:
We have developed a novel integrative platform for the biological interpretation of microarray data using Gene Ontology as a functional classification system. Developed to meet the needs of the user, it allows the rapid and user-friendly analysis of prokaryotic microarray data and thus constitutes a valuable tool for the microbial research community. It offers five alternative statistical methods to identify the GO nodes with interesting gene expression profiles. In contrast to most available programs, both threshold-based and threshold-independent methods are provided. We included four well-established and commonly used tests: the hypergeometric test and Fisher's exact test (both threshold-based) as well as Student's t-test and the Kolmogorov-Smirnov test (both threshold-independent). In addition, we provide a further threshold-independent method, the unpaired Wilcoxon test, which to our knowledge has not been used for the GO-based interpretation of gene expression data before.
Altogether, users can perform an exploratory analysis of their microarray data and decide for themselves which methods to use. Permutation-based approaches were not included because we wanted to present the analysis results to the user immediately, without a waiting queue.
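
As an illustration of the unpaired Wilcoxon test mentioned above, the following is a minimal Python sketch of a rank-sum test between the expression values of a GO node and the background. It is not the JProGO implementation. The function name is hypothetical, and the two-sided p-value uses the normal approximation with a tie correction but without a continuity correction, which is an assumption made for brevity.

```python
from math import erfc, sqrt


def rank_sum_test(node_values, background_values):
    """Return (U, two-sided p) for the node values versus the background.
    The p-value is a normal approximation (assumption: no continuity
    correction), so it is only a rough guide for very small samples."""
    n1, n2 = len(node_values), len(background_values)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    # Label 0 marks node genes, label 1 marks background genes.
    combined = sorted(
        [(v, 0) for v in node_values] + [(v, 1) for v in background_values]
    )
    rank_sum_node = 0.0
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the tied ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_node += avg_rank * sum(
            1 for k in range(i, j) if combined[k][1] == 0
        )
        i = j
    u = rank_sum_node - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a shift
    z = (u - mean_u) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2))
```

Because the test works on ranks, it does not assume normally distributed expression values, and it runs without permutations, so a result is available immediately.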

Status and application:
Currently, we support the analysis of more than 20 prokaryotic species, including the two most important bacterial model organisms, E. coli (strain K12) and Bacillus subtilis (strain 168), which represent the Gram-negative and Gram-positive bacteria, respectively. In addition, many pathogenic organisms are included, for example Pseudomonas aeruginosa. An example analysis for E. coli demonstrated the suitability of the tool. Corresponding microarray data sets, which give an impression of the analysis methods and functionalities of JProGO, are available on the JProGO start page.