REWERSE [reasoning on the web] - March 1st, 2004 - March 4th, 2004 - Munich

WG A2 : Adding Semantics to the Bioinformatics Web

room 23 (basement)

Further details (e.g. list of members and some selected publications) related to the kickoff-meeting of WG A2 can be found at http://comas.soi.city.ac.uk/rewerse-a2/a2.html.

  • Monday, 1st March: Rules and constraints to model biological systems
    • 14:00-14:30. Introduction to the aims of the working group by Michael Schroeder and Rolf Backofen
    • 14:30-16:00. Introduction of Partners from Lisbon, Jena, Paris, Edinburgh.
      ca. 20-30min each
      5 slides on what the group does, 5 slides on how the group wants to contribute to REWERSE
    • 16:00-16:30. Coffee break
    • 16:30-18:00. Discussion: Rules and constraints for modelling in bioinformatics.
      Chair: Rolf Backofen
  • Tuesday, 2nd March: Rules, reasoning, and ontologies for systems integration in bioinformatics
    • 09:00-10:30. Introduction of Partners from Manchester, Linkoeping, Bucharest, Dresden.
      Each 20 minutes + 10 minutes
      5 slides on what the group does, 5 slides on how the group wants to contribute to REWERSE
    • 10:30-11:00. Coffee break
    • 11:00-12:00. Invited presentation: Midori Harris, EBI, Cambridge: Ontologies for Biology: the Gene Ontology and OBO
    • 12:30-14:00. Lunch
    • 14:00-16:00. Discussion: Rules, reasoning, and ontologies for systems integration in bioinformatics
    • 16:00-16:30. Coffee break
    • 16:30-17:30. Invited presentation: N.N.
    • 17:30-18:00. Integration of results for deliverable

Objectives

The objective of the WG is to create the core of a Bioinformatics Semantic Web populated by a number of sample data sources and applications representative of the use of the Web in Bioinformatics and to demonstrate novel, reasoning-based solutions dealing with the following problems:
  • Rules for mediation and to formulate complex queries
  • Consistent integration of Bioinformatics data
  • Adaptive portals for molecular biologists
Bioinformatics is an ideal field for testing Semantic Web technologies for three reasons. First, Web-based systems and Web databases were adopted very early in Bioinformatics. Second, the dramatic increase in data produced in the field calls for novel processing methods. Third, the high heterogeneity of Bioinformatics data requires semantics-based integration methods.

Consider the following scenario: a biologist obtains a novel DNA sequence about which nothing is known. He or she wants to run an alignment but has specific requirements for it. These requirements are captured as rules and constraints, which are taken into account by an online, Semantic Web-enabled sequence comparison service.

The researcher finds a number of significantly similar sequences in yeast for which gene expression data is available. The scientist requests expression data for the relevant genes from the Semantic Web-enabled gene expression database and tool. He or she defines rules that capture which expression profiles are interesting, e.g. all genes that are highly expressed at the beginning and end of the experiment.
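
The expression-profile rule in this step can be illustrated with a minimal Python sketch. It is not the working group's rule language: the function names, the threshold of 2.0, and the sample profiles are all hypothetical, and "highly expressed" is assumed here to mean "at or above a fixed threshold at the first and last time point".

```python
from typing import Callable, Dict, List

# A profile is a list of expression levels ordered by time point (assumed layout).
Profile = List[float]


def high_at_start_and_end(profile: Profile, threshold: float = 2.0) -> bool:
    """Hypothetical rule: expression is at or above `threshold` at both ends."""
    return (
        len(profile) >= 2
        and profile[0] >= threshold
        and profile[-1] >= threshold
    )


def select_genes(profiles: Dict[str, Profile],
                 rule: Callable[[Profile], bool]) -> List[str]:
    """Return the names of all genes whose profile satisfies the rule, sorted."""
    return sorted(gene for gene, profile in profiles.items() if rule(profile))


# Made-up sample data, for illustration only.
profiles = {
    "geneA": [3.1, 0.4, 0.2, 2.8],  # high at start and end
    "geneB": [0.3, 2.9, 3.0, 0.1],  # high only in the middle
    "geneC": [2.5, 2.6, 2.4, 2.2],  # high throughout
}

interesting = select_genes(profiles, high_at_start_and_end)
print(interesting)
```

Passing the rule as a function keeps the selection step independent of any particular rule, so a different user-defined rule could be swapped in without changing `select_genes`.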

The genes are part of a larger process and the researcher is interested in their gene products. A query to SWISSPROT determines these. Do these proteins interact with each other? To answer this question, a semantic Web service is queried that computationally determines protein interactions. A user-defined rule formulating what constitutes a protein domain interaction is applied on the fly to SCOP, the Structural Classification of Proteins, and PDB, a large protein structure database. The rule-based sequence similarity tool mentioned above is used to determine whether the scientist's proteins of interest are similar to any interacting proteins computed from SCOP and the PDB.

Finally, the scientist wishes to relate the protein interaction network to metabolic pathways. As all the tools used refer to the same ontologies and terminology defined through the Gene Ontology, the researcher can easily investigate a mapping from the interaction network to a relevant metabolic pathway obtained from a Semantic Web-enabled pathway server.

Throughout the above information foraging, the scientist constantly uses literature databases to read relevant articles. Despite the tremendous growth of the literature by 8000 articles a week, our biologist still manages to find the relevant articles quickly, because he or she uses an ontology-based search facility that guides the search, automatically specialising the query where too many hits are obtained and generalising it where too few articles are found. The deliverables below are aimed at implementing the above scenario.
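
The ontology-guided literature search in the scenario can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of an existing tool: the toy term hierarchy, the hit counts, the `suggest_terms` function, and the `min_hits`/`max_hits` limits are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical toy is-a hierarchy: each term maps to its more specific child terms.
CHILDREN: Dict[str, List[str]] = {
    "binding": ["protein binding", "nucleic acid binding"],
    "protein binding": ["kinase binding"],
    "nucleic acid binding": [],
    "kinase binding": [],
}

# Reverse lookup: child term -> parent term.
PARENT: Dict[str, str] = {
    child: parent for parent, kids in CHILDREN.items() for child in kids
}

# Made-up hit counts standing in for a literature database query.
HITS: Dict[str, int] = {
    "binding": 5000,
    "protein binding": 60,
    "nucleic acid binding": 40,
    "kinase binding": 2,
}


def suggest_terms(term: str,
                  count_hits: Callable[[str], int],
                  min_hits: int = 5,
                  max_hits: int = 100) -> List[str]:
    """Suggest search terms: specialise on too many hits, generalise on too few."""
    hits = count_hits(term)
    if hits > max_hits and CHILDREN.get(term):
        return list(CHILDREN[term])   # too many hits: offer the child terms
    if hits < min_hits and term in PARENT:
        return [PARENT[term]]         # too few hits: fall back to the parent term
    return [term]                     # hit count acceptable, or no move possible


def count(term: str) -> int:
    """Look up the made-up hit count for a term."""
    return HITS[term]


print(suggest_terms("binding", count))
print(suggest_terms("kinase binding", count))
```

The function only proposes terms rather than choosing one, which leaves the final choice to the user, in keeping with a facility that "guides" the search.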
