Web services in Systems Biology

Workshop within the 9th International Conference on Systems Biology

August 28th, 2008, 9:00 - 17:00

Gothenburg, Sweden


Reactome - a human pathway knowledgebase

Esther E Schmidt1, Guanming Wu2, Imre Vastrik1, David Croft1, Bernard de Bono1, Gopal Gopinath2, Marc Gillespie2, Bijay Jassal1, Lisa Matthews2, Phani Garapati1, Michael Caudy2, Alexander Kanapin2, Ewan Birney1, Peter D'Eustachio3, Lincoln Stein2

1European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD United Kingdom

2Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor NY 11724 USA

3NYU School of Medicine, 550 First Avenue, New York NY 10016 USA

Reactome (www.reactome.org) is a manually curated and peer-reviewed knowledgebase that represents a wide variety of human biological processes in a computationally accessible format. At present it contains around 2700 human proteins in 2800 reactions. Human reactions serve as templates for the automated inference of orthologous events in 22 other species, which is performed at release time every three months. Reactome data are cross-referenced to a large number of publicly available databases, and all data are freely available in a number of data formats such as BioPAX or SBML. The SkyPainter tool allows users to visualize and analyze their own data overlaid onto Reactome data. The Reactome Mart is a query tool that enables data retrieval from Reactome and across other databases. Reactome's data model is centred around the Reaction, whose instances form a network of biological interactions through entities that are consumed, produced, or act as catalysts. Entities are distinguished by their molecular identities and cellular locations. Reactions are grouped into Pathways. Reactome web services provide access to Reactome objects, allowing the user, for example, to extract the protein entities involved in a given pathway, or to find the pathways that involve a given protein.
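The data model described above can be illustrated with a small sketch. This is not the Reactome schema or its web service API: the class names, fields and the helper pathways_containing are assumptions made for illustration, and the toy reactions are simplified (cofactors such as ATP are omitted).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A physical entity, identified by molecular identity AND cellular location."""
    name: str
    compartment: str


@dataclass
class Reaction:
    name: str
    inputs: list = field(default_factory=list)     # entities consumed
    outputs: list = field(default_factory=list)    # entities produced
    catalysts: list = field(default_factory=list)  # entities acting as catalysts

    def participants(self):
        return set(self.inputs) | set(self.outputs) | set(self.catalysts)


@dataclass
class Pathway:
    name: str
    reactions: list = field(default_factory=list)

    def entities(self):
        """All entities taking part in any reaction of this pathway."""
        found = set()
        for reaction in self.reactions:
            found |= reaction.participants()
        return found


def pathways_containing(entity, pathways):
    """The reverse lookup: names of the pathways that involve a given entity."""
    return [p.name for p in pathways if entity in p.entities()]


# Toy data (illustrative only, not taken from Reactome).
glucose_out = Entity("glucose", "extracellular region")
glucose_in = Entity("glucose", "cytosol")
transporter = Entity("GLUT1", "plasma membrane")
hexokinase = Entity("HK1", "cytosol")
g6p = Entity("glucose 6-phosphate", "cytosol")

uptake = Reaction("glucose uptake", [glucose_out], [glucose_in], [transporter])
phosphorylation = Reaction("glucose phosphorylation", [glucose_in], [g6p], [hexokinase])

transport = Pathway("hexose transport", [uptake])
glycolysis = Pathway("glycolysis", [phosphorylation])
```

Because the compartment is part of an entity's identity, extracellular and cytosolic glucose are distinct objects, and the two reactions are linked only through the cytosolic one. Calling pathways_containing(glucose_in, [transport, glycolysis]) returns both pathway names, while glycolysis.entities() gives the forward lookup.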

ConsensusPathDB – matching and integrating pathway information

Atanas Kamburov, Christoph Wierling, Hans Lehrach, Ralf Herwig

Max Planck Institute for Molecular Genetics, Berlin

Molecular interactions are key drivers of biological function. Large numbers of interactions for humans and other species have been generated, annotated, and made publicly available. Current knowledge on human molecular interactions is dispersed across more than 200 databases, each with a specific focus and data format, covering different interaction types such as protein-protein, biochemical and gene regulatory interactions. So far, little effort has gone into integrating interaction data, although understanding cellular processes requires a more complete picture. To address this problem, we have developed ConsensusPathDB – a database that stores and integrates human interaction data from heterogeneous resources. The database content currently draws on twelve interaction databases, with a total of 25,831 distinct physical entities and 73,426 distinct functional interactions covering 1,689 human pathways. The common and complementary content of these databases has been assessed by matching cellular entities and interactions to each other. Here, we describe the method used for data integration and demonstrate the rich functionality of the publicly accessible web interface to our database.
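The abstract does not spell out the matching method, so the following is only a minimal sketch of one plausible approach: entity names are mapped to a common identifier, and two interactions are treated as the same if they have the same set of mapped participants. The synonym table, the source names db_a and db_b, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical synonym table: source-specific names -> one common identifier.
SYNONYMS = {
    "p53": "TP53",
    "TP53": "TP53",
    "HDM2": "MDM2",
    "MDM2": "MDM2",
    "ATM": "ATM",
}


def normalise(name):
    """Map a source-specific entity name to its common identifier."""
    return SYNONYMS.get(name, name)


def interaction_key(participants):
    """Order-independent key: the set of normalised participants."""
    return frozenset(normalise(p) for p in participants)


def integrate(sources):
    """Map each distinct interaction to the set of sources that report it."""
    merged = defaultdict(set)
    for source_name, interactions in sources.items():
        for participants in interactions:
            merged[interaction_key(participants)].add(source_name)
    return dict(merged)


# Two toy sources that describe the same interaction under different names.
sources = {
    "db_a": [("p53", "HDM2"), ("ATM", "p53")],
    "db_b": [("MDM2", "TP53")],
}
merged = integrate(sources)

# Interactions reported by more than one source form the common content.
shared = {key for key, reported_by in merged.items() if len(reported_by) > 1}
```

Here the three source records collapse into two distinct interactions, one of which is shared by both sources and one of which is complementary content found only in db_a.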

PyBioS – an object-oriented tool for modeling and simulation of cellular processes

Christoph Wierling, Elisabeth Maschke-Dutz, Atanas Kamburov, Hendrik Hache, Edda Klipp, Ralf Herwig, Hans Lehrach

Max Planck Institute for Molecular Genetics, Berlin

PyBioS (http://pybios.molgen.mpg.de) is a systems biology tool that provides rich features for the design, simulation and analysis of models of cellular and biochemical reaction systems. It provides interfaces to external pathway data resources, such as Reactome and KEGG, which can be searched and used directly to set up the topology of a model. Models in PyBioS are stored in a model repository. An object-oriented PyBioS model can be converted automatically into a system of ordinary differential equations (ODEs) and subsequently be used for simulation and analysis. PyBioS supports the computation of conservation relations as well as sensitivity analysis by parameter scanning or metabolic control analysis. PyBioS has an advanced interface for the graphical visualization of biochemical reaction networks along with simulation results.
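The conversion from a reaction model to ODEs can be sketched in a few lines. This is not PyBioS code or its API: it assumes mass-action kinetics, a hypothetical reaction format of (substrates, products, rate constant), and a simple explicit Euler integrator chosen for brevity.

```python
def make_rhs(reactions):
    """Build the ODE right-hand side from (substrates, products, k) tuples.

    Assumes mass-action kinetics: rate = k * product of substrate concentrations.
    """
    def rhs(state):
        derivs = {species: 0.0 for species in state}
        for substrates, products, k in reactions:
            rate = k
            for s in substrates:
                rate *= state[s]
            for s in substrates:
                derivs[s] -= rate   # substrates are consumed
            for p in products:
                derivs[p] += rate   # products are formed
        return derivs
    return rhs


def euler(rhs, state, dt, steps):
    """Integrate with fixed-step explicit Euler and return the final state."""
    state = dict(state)
    for _ in range(steps):
        d = rhs(state)
        state = {s: state[s] + dt * d[s] for s in state}
    return state


# Toy model: reversible conversion A <-> B.
reactions = [(["A"], ["B"], 0.5), (["B"], ["A"], 0.25)]
initial = {"A": 1.0, "B": 0.0}
final = euler(make_rhs(reactions), initial, dt=0.01, steps=5000)
```

For this toy model the sum A + B is a conservation relation: every reaction removes from one species exactly what it adds to the other, so the total stays at its initial value while the system relaxes to the steady state where 0.5 * A = 0.25 * B.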

BiNoM: a tool for the manipulation and analysis of biological networks

Andrei Zinovyev

Institut Curie, Paris

BiNoM (BIological NetwOrk Manager) is a Java library that significantly facilitates the use and analysis of biological networks in standard systems biology formats. BiNoM implements a full-featured BioPAX editor and a method of "interfaces" for accessing BioPAX content. These features make it possible to work with huge BioPAX files such as whole pathway databases. In addition, BiNoM allows users to analyze networks created with the CellDesigner software and to convert them into BioPAX. BiNoM can be used as a Cytoscape plugin that adds a rich set of operations to Cytoscape, such as path and cycle analysis, clustering of sub-networks, decomposition of a network into modules, clipboard operations and others. The latest version of BiNoM, together with documentation, source code and API, is available at http://bioinfo.curie.fr/projects/binom

Introduction to web services

Rodrigo Lopez, Hamish McWilliam

EBI, Hinxton

Many bioinformatics resources provide programmatic access via web services technologies. Through the use of the ubiquitous SOAP and REST protocols, these services are architecture and programming language neutral, and are easily integrated into analysis workflows. From simple standalone utilities through to workbench environments, the use of remote databases and analysis tools via web services is promoting data and tool integration while reducing the overhead of data maintenance. Using a selection of the web services provided at EMBL-EBI, we illustrate the practical use of these technologies.

The TAVERNA workflow system

Katy Wolstencroft

University of Manchester, Manchester

The Taverna workbench is an environment for the design and execution of workflows. It is open source and has been developed as part of the myGrid project.
Taverna enables interconnection and interoperation between local and remote analysis tools and databases of varying types. Taverna can combine Web Services, BioMart queries, R statistical analyses and BioMoby services, to name a few. In the bioinformatics domain, there are over 3500 different services available to Taverna, providing a flexible and extensible platform for bioinformatics research.

Integrated datasets through reference objects and networks

Christophe Roos

Medicel Oy, Keilaranta 12, FIN-02150 Espoo, Finland

High-throughput measurement techniques in genomics, proteomics, and cell biology provide the fuel for systems-level analyses that elucidate fundamental biological principles and help to understand and predict the behaviour of cellular systems in health and disease. Large amounts of data on systems, their component structures and their quantitative states are being generated and entered into various databases, including the core biological databases available, for example, at EBI. Centrally maintained public databases are vital elements of biological research. However, the complexity of systems biology data and the size of the research consortia in the field call for locally installed database systems that allow unpublished data to be analysed in the context of data from other partners or from public databases.
We present how data integration technology has been implemented to allow deep integration of measurement data with data from various public databases. We further show how the data can be used to generate reference objects and a reference pathway that includes data from several component, interaction and other pathway databases. Such a reference pathway can be used for multiple computational research tasks to help identify critical players in the regulation of biological networks: stochastic processes in the fate of cells, robustness of biological systems, systematic perturbation studies of the network components, etc. We show how our reference pathway has been applied in a yeast model study spanning five molecular domains.

Grid computing for Systems Biology – a user perspective

Christophe Blanchet1, Alexis Michon1, Christoph Wierling2 and Ralf Herwig2

 
1Institut de Biologie et Chimie des Protéines (IBCP UMR 5086); CNRS; Univ. Lyon1; IFR128 BioSciences Lyon-Gerland; 7, passage du Vercors, 69367 Lyon cedex 07, France

2Max-Planck Institute for Molecular Genetics, Ihnestrasse 63-73, D-14195 Berlin, Germany

Systems biology aims to model the dynamic behaviour of biological networks with computer models and to predict the effect of perturbations in these networks. The parameters of these computer models are usually optimised to fit experimental data. This approach is clearly limited by the number of parameters that can be determined experimentally, and alternatives for handling large networks are therefore being discussed. We have developed a large-scale modelling approach to biological networks using a Monte Carlo procedure that involves random estimates of the kinetic parameters of the underlying reactions. Since many simulation runs have to be performed, we have worked on a parallel approach involving the EGEE Grid computing infrastructure. We have tested the approach with a large metabolic network involving reactions annotated from the KEGG database. Experimental high-throughput expression data have been used for the simulations in order to identify metabolites that change over time. In our presentation we describe the annotation of the models, the use of high-throughput data for kinetic simulations, the implementation and parallelisation of the simulation within the computing grid infrastructure, as well as selected simulation results on two data sets. The first data set, on melanoma cell lines, involved about 900 simulations. A second experiment applied the approach to a published data set on prostate cancer samples in different states of the disease. A single simulation run takes 4 to 5 hours on a standard computer, and 7000 simulation runs were required for the whole experiment. The computational predictions from both experiments are currently being tested and validated. These two experiments have validated the integration of the PyBioS system on the EU-EGEE Grid platform and show the general feasibility of parallel grid computing for systems biology modelling.
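The structure of such a Monte Carlo ensemble can be sketched as follows. This is not the PyBioS or EGEE implementation. It makes three assumptions: a toy two-species model, kinetic constants drawn log-uniformly from an arbitrary range, and a local thread pool standing in for grid job submission. Python threads do not speed up CPU-bound work; the pool only shows that the runs are independent and can be farmed out.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate(seed, steps=2000, dt=0.01):
    """One Monte Carlo run: draw random rate constants, then integrate A <-> B."""
    rng = random.Random(seed)               # per-run generator keeps runs independent
    k_forward = 10 ** rng.uniform(-1, 1)    # assumed range: 0.1 to 10
    k_backward = 10 ** rng.uniform(-1, 1)
    a, b = 1.0, 0.0
    for _ in range(steps):                  # explicit Euler integration
        flux = k_forward * a - k_backward * b
        a, b = a - dt * flux, b + dt * flux
    return b                                # final concentration of B


def run_ensemble(n_runs, workers=4):
    """Run n_runs independent simulations; each seed would be one grid job."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, range(n_runs)))


results = run_ensemble(20)
```

Because each run depends only on its seed, the ensemble is reproducible and the runs can be executed in any order or on any machine, which is the property that makes the workload suitable for a grid.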