
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/13503531

TAMBIS: Transparent Access to Multiple Bioinformatics Information Sources

Article in Proceedings / ... International Conference on Intelligent Systems for Molecular Biology; ISMB. International Conference on Intelligent Systems for Molecular Biology · February 1998
Source: PubMed



TAMBIS - Transparent Access to Multiple Bioinformatics Information Sources.

Patricia G. Baker (a), Andy Brass (a), Sean Bechhofer (b), Carole Goble (b), Norman Paton (b), Robert Stevens (b).

(a) School of Biological Sciences, Stopford Building, University of Manchester, Oxford Road, Manchester, M13 9PT, U.K. Telephone: 44 (161) 275 2000. Fax: 44 (161) 275 5082.
(b) Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PT, U.K. Telephone: 44 (161) 275 6142. Fax: 44 (161) 275 6236.

pbaker@manchester.ac.uk, abrass@manchester.ac.uk, seanb@cs.man.ac.uk, carole@cs.man.ac.uk, norm@cs.man.ac.uk, stevensr@cs.man.ac.uk

Abstract

The TAMBIS project aims to provide transparent access to disparate biological databases and analysis tools, enabling users to utilize a wide range of resources with the minimum of effort. A prototype system has been developed that includes a knowledge base of biological terminology (the biological Concept Model), a model of the underlying data sources (the Source Model) and a ‘knowledge-driven’ user interface. Biological concepts are captured in the knowledge base using a description logic called GRAIL. The Concept Model provides the user with the concepts necessary to construct a wide range of multiple-source queries, and the user interface provides a flexible means of constructing and manipulating those queries. The Source Model provides a description of the underlying sources and mappings between terms used in the sources and terms in the biological Concept Model. The Concept Model and Source Model provide a level of indirection that shields the user from source details, providing a high level of source transparency. Source independent, declarative queries formed from terms in the Concept Model are transformed into a set of source dependent, executable procedures. Query formulation, translation and execution are demonstrated using a working example.

Introduction

The biological community is a distributed one with a culture of sharing and rapid dissemination of information. Each separate area of molecular biology generates its own data and therefore its own information sources, including those for protein sequences, genome projects, DNA sequences, protein structures and motifs. Also available are a range of specialist interrogation and analysis tools, each typically associated with a particular database format. Frequently the information sources have different structures, content and query languages; the tools have no common user interface and often only work on a limited subset of the data.
When biologists need to ask questions of multiple sources they must perform the following tasks during query formulation and execution:
• identify sources and their locations
• identify the content/function of sources
• recognise components of a query and target them to appropriate sources in the optimal order
• communicate with sources
• transform data between source formats
• express syntactically complex queries and
• merge results from different sources.

Many biologists still use collections of stand-alone resources (many of which are Web-based) to formulate and execute queries. This means that all of the tasks listed above must be carried out by the user. This places a burden on biologists, most of whom are not Bioinformatics experts, and limits the use that can be made of the available information. The greater the number of the above tasks that are taken on by the system, the greater the transparency of the overall task of query formulation and execution. There are examples in the
biological community of systems which seek to relieve the user of some of this burden by easing access to multiple, heterogeneous information sources; however, these systems vary in their degree of transparency.
The Sequence Retrieval System (SRS) (Etzold 1996), for example, attempts source interoperation using predefined hypertext links with which the user can navigate between sources. A form-based interface allows the user to ask complex, although restricted, queries over multiple sources that are executed simultaneously. Queries composed of sub-queries, which have to be executed in a given order, must be issued separately by the user, and the results of one sub-query piped into another by hand. While SRS provides the user with transparency from communication (i.e. location, connection protocols and query language) with sources, it does not hide them, and provides no guidance as to which source is most appropriate for a given query.
The Collection Programming Language (CPL) is a functional programming language that allows data to be described and manipulated as complex data types such as sets, lists and records. These data types are suitable for modelling biological data and have been used to do so in the BioKleisli system (Buneman 1995) where biological sources are given a CPL driver and a set of functions to manipulate data. CPL/BioKleisli thus acts as a multidatabase language and provides ways of manipulating and piping results, allowing the user to formulate complex, ad hoc queries. The details of location and access to these data sources are hidden; however, the identification of which data source to use and the construction of the query in CPL are still left to the user. Comparable facilities are provided by P/FDM (Kemp 1996), which also uses a functional language, although P/FDM has a more object-oriented type system and has its own local database.
(Markowitz 1995) uses an object model, the OPM, as a common data model for the sources and a suite of OPM-based tools for exploring them. Each source either has an OPM schema or is retro-fitted with one via a view mechanism. A multidatabase directory describes how each database is linked to another. However, there is no attempt at hiding the databases from the user, who is still expected to identify them and navigate through them. Queries can be specified via a multidatabase query language OPM-QL, or using a Web interface.
The Transparent Access to Multiple Bioinformatics Information Sources (TAMBIS) project aims to provide the user with the maximum source transparency using (i) a canonical representation of biological terminology against which the user can formulate queries and (ii) mappings from terms in the representation onto terms in external sources. TAMBIS therefore provides a level of indirection between the user and the external sources which removes from the user the necessity to perform the tasks listed above. In order to do this TAMBIS adopts the three layer model of the classical mediator/wrapper architecture (Wiederhold 1992), a schematic view of which is illustrated in figure 1.
Layer 1 comprises a knowledge base (conceptual model [a]) of biological terminology and a knowledge-driven user interface. Using the interface, the user combines terms from the knowledge base to form declarative, source independent queries. Layer 2 is a mediation layer that (i) identifies the appropriate sources to satisfy a query and (ii) rewrites the query to a series of source dependent ordered procedures. Layer 3 comprises external sources wrapped with a consistent structural model, providing a common interface that affords communication and network transparency. Where possible TAMBIS exploits existing technologies. Layer 3, therefore, currently utilizes CPL/BioKleisli although the long-term intention is to use CORBA wrapped services (Rodriguez-Tome 1997).

[a] Because the biological knowledge base is a conceptual model of biological terminology, the words ‘concept’ and ‘term’ are used interchangeably in this paper.

The conceptual model is central to the TAMBIS architecture. Its use in driving query formulation and facilitating source integration is novel in the biological domain. The emphasis on the model in this paper is, therefore, commensurate with its importance.

The Architecture

Although a detailed description of the TAMBIS architecture is outside the scope of this paper, a general overview is appropriate. The five main components of the TAMBIS architecture are:
• The biological Concept Model (knowledge base)
• The knowledge-driven graphical user interface (GUI)
• The Source Model
• The Query Transformation Module
• The Query Execution Module

The Biological Concept Model

Some bioinformatics researchers recognise that semantic schema and data matching would be greatly aided by a comprehensive thesaurus of terms (Davidson 1995) or a reference ontology of biological concepts (Karp 1995). In
[Figure 1 diagram. Layer 1, query formulation: Knowledge-Driven Graphical User Interface and Biological Concept Model, producing a declarative query. Layer 2, query planning and translation (source mediation): Query Transformation and Source Model, producing an ordered execution plan. Layer 3, query execution (wrapped sources): Source 1, Source 2, Source 3.]

Figure 1. TAMBIS three layer, mediator/wrapper architecture.
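
The division of labour between the three layers of figure 1 can be illustrated with a minimal Python sketch. This is not TAMBIS code: the class and function names, the two toy sources and their contents are all hypothetical, and the real system generates CPL programs rather than calling Python functions. The sketch only shows that the query names no source, that a mediator turns it into an ordered plan of source-specific calls, and that wrapped sources are reached through one common interface.

```python
from typing import Callable, Dict, List, Tuple

# Layer 3: a wrapped source exposes its functions through one common interface.
class WrappedSource:
    def __init__(self, name: str, functions: Dict[str, Callable[[list], list]]):
        self.name = name
        self._functions = functions

    def call(self, function: str, argument: list) -> list:
        return self._functions[function](argument)

# Toy data standing in for two external sources (hypothetical content).
PROTEINS = {"guppy": ["P1", "P2"], "human": ["P3"]}
MOTIFS = {"P1": ["M1"], "P2": ["M1", "M2"], "P3": ["M3"]}

SOURCES = {
    "protein_db": WrappedSource("protein_db", {
        "proteins_by_organism":
            lambda organisms: [p for o in organisms for p in PROTEINS.get(o, [])],
    }),
    "motif_db": WrappedSource("motif_db", {
        "motifs_in_proteins":
            lambda proteins: sorted({m for p in proteins for m in MOTIFS.get(p, [])}),
    }),
}

# Layer 2: a (much simplified) stand-in for the Source Model, mapping a
# source-independent request onto an ordered list of (source, function) steps.
PLANS: Dict[Tuple[str, str], List[Tuple[str, str]]] = {
    ("Motif", "Organism"): [
        ("protein_db", "proteins_by_organism"),
        ("motif_db", "motifs_in_proteins"),
    ],
}

def mediate(wanted: str, constrained_by: str) -> List[Tuple[str, str]]:
    """Choose the sources and the order in which they are called."""
    return PLANS[(wanted, constrained_by)]

def execute(plan: List[Tuple[str, str]], seed: list) -> list:
    """Run the plan, piping each step's result into the next step."""
    values = seed
    for source, function in plan:
        values = SOURCES[source].call(function, values)
    return values

# Layer 1: the user states what is wanted, never where it comes from.
def ask(wanted: str, constrained_by: str, value: str) -> list:
    return execute(mediate(wanted, constrained_by), [value])
```

Calling `ask("Motif", "Organism", "guppy")` walks the two-step plan without the caller ever naming `protein_db` or `motif_db`, which is the kind of source transparency the architecture aims for.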

order to share standardised and unambiguous information, controlled vocabularies, or terminologies, can be used as a framework for expressing and communicating ideas in a consistent manner. The TAMBIS biological Concept Model describes such a terminology. This knowledge base covers terms associated with proteins and nucleic acids, their component parts and their structures, biological functions and processes, tissues and taxonomy. The terminology has two key aspects:
• it is compositional, resembling a dictionary of elementary terms that are assembled according to a restricted grammar to form new complex composite terms. These composite terms can in turn be components in new compositional terms, so the terminology is recursive. For example, the term ‘Motif’ can be combined with the terms ‘isComponentOf’ and ‘Protein’, to create a new composite term ‘Motif which isComponentOf Protein’. This in turn could be combined with the terms ‘hasFunction’ and ‘Hydrolase’ to form a composite term ‘Motif which isComponentOf Protein and hasFunction Hydrolase’; this term is both a concept and a query.
• it is a classification scheme that organises terms into a hierarchy based on the ‘isa’ relationship (also known as the subsumption relationship). For example, ProteinSequence ‘isa’ Sequence, that is, it is a more specialised kind of Sequence.

To be truly effective, such a terminology needs to be represented in a scheme that can reason about the inferred relationships between terms and their components, can control the formation of terms, and can automatically classify terms based on their components so that the hierarchy takes care of itself. As terms are changed the scheme should also dynamically reclassify them to ensure the hierarchy’s correctness.
Description Logics (DL), also known as Terminology
Logics, are a family of logics explicitly designed to represent taxonomic and conceptual knowledge of an application domain on an abstract level; for an overview see (Borgida 1995). DLs are usually given a Tarski style declarative semantics, which allows them to be seen as sub-languages of first order predicate logic. In the TAMBIS project we use the GRAIL DL (Rector 1996), developed at Manchester. Briefly, a DL is an ‘isa’-based classification system that allows a recursive, compositional model to be built from terms and binary relations. A base term can be combined with any number of relation-term pairs (or criteria) to create a more complex term. Any of these terms can be composite (complex) or elementary. Figure 2 gives a small fragment of the GRAIL classification, omitting the term constructors. In this example ‘Motif’ is the base term and ‘isComponentOf Protein’ is the criterion with which it is combined. GRAIL supports the automatic classification of concepts into ‘isa’ hierarchies by reasoning about the component descriptions of the concepts. Therefore, ‘Protein Motif’ would be classified automatically as a child of ‘Motif’ and a parent of ‘Poecilia Reticulata Protein Motif’ based on its definition. Only 3 of the 11 ‘isa’ relationships shown in figure 2 have been hand-crafted by the knowledge modeller. DLs support multi-dimensional classification so that the same concept can be classified in many ways, thus allowing for the different user views of a concept. The classification is dynamic so as the description of a concept is further elaborated it is automatically reclassified. Description Logics therefore support the incremental description of terms.
The classification hierarchy supports imprecise and general queries and query exploration by moving around the hierarchy. The compositional nature of the representation allows for the flexible construction of queries at varying levels of complexity and abstraction. In DLs the modelling language and the query language are the same thing; to find the concept you define it and the classifier classifies it. If it is sound then it is positioned in the hierarchy and you can ask for its parent, children or the instances it describes. If it is unsound then it doesn’t classify and, therefore, cannot appear in a query.
A whole family of knowledge representation systems have been built using DLs and recent work has provided a sound formal basis for several DLs along with results concerning their complexity (Donini 1991). Significantly large models are now being produced, for example the Galen-In-Use medical model (Rector 1997) expressed in GRAIL is some 10,000 concepts and relations.
DLs are expressive, and usually have complete and decidable reasoning. However, the conflict when applying any DL is between computational tractability and expressiveness; GRAIL’s terminological language is less expressive than most other DLs but it compensates for this by supporting a powerful set of assertion axioms and a multi-layer sanctioning mechanism. These sanctions decree whether two concepts are permitted to be related via some relationship and so constrain the construction of complex concepts. Sanctions ensure that only semantically valid concepts are formed and that a large number of complex concepts can be inferred from a sparse model. As only reasonable concepts can be inferred from the model the user is allowed to construct only those queries that it is reasonable to ask. For example, in figure 2, asserting that ‘SequenceComponent isComponentOf Protein’ is legal is sufficient to infer that ‘Motif isComponentOf Protein’ is legal, without having to create it or position it until it is asked for. Therefore, only a small number of constraints need be asserted in order that a large number of concepts can be inferred.
In TAMBIS the biological Concept Model is used to:
• describe the metadata of the underlying data sources, representing an over-arching universal schema
• express queries in the modelling language
• drive a GUI user interface for query formulation
• mediate between the various data sources by exploiting the biological concept hierarchy to assist in the identification and resolution of equivalences or near equivalences; similar approaches have been taken in non-biological projects, for example SIMS (Arens 1993).

As (Markowitz 1995) and (Davidson 1995) suggest, integration is costly and the quest for an agreed schema is futile. However, our biological terminology does not attempt to force a global schema representing a consistent integrated view of all the component databases. Instead it seeks to describe what is in the component databases and, rather than resolve conflicts, it acknowledges them and indicates possible equivalences.

The Knowledge-Driven User Interface

Queries are formulated against the biological concept model in the GRAIL language. It would be inappropriate for biologists to learn either GRAIL or the contents of the knowledge base. Instead, TAMBIS provides a forms-based GUI that is driven by the terminological model. The interface supports two tasks:
• exposure of the terminological model and
• guided query formulation and manipulation.

During the query formulation process the model may be browsed to find what can sensibly be said of a concept of interest. A convenient mechanism for browsing the model without query formulation is provided by the navigation
[Figure 2 diagram. Elementary terms: Protein, Organism, Function, SequenceComponent, Motif, Hydrolase, Poecilia reticulata. Sanctioned relationships: isComponentOf, hasOrganismSource, hasFunction. Composite terms: ‘SequenceComponent hasFunction Hydrolase’, ‘SequenceComponent isComponentOf Protein’, ‘Motif hasFunction Hydrolase’, ‘Motif isComponentOf Protein’, ‘Motif <isComponentOf (Protein hasOrganismSource PoeciliaReticulata) hasFunction Hydrolase>’.]

Figure 2. A simplified fragment of the TAMBIS GRAIL model showing the power of auto-classification; the only ‘isa’ relationships that have been ‘hand-crafted’ by the knowledge worker are indicated by the solid arrows. All the other terms are implied by the sanctioning scheme and automatically and dynamically classified upon request, as indicated by the broken arrows. The solid lines indicate the sanctioned relationships between terms. It is these relationships that allow the construction of all of the composite terms shown.
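
The interplay of sanctions and automatic classification in figure 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GRAIL: the three hand-asserted ‘isa’ links and the three sanctions below are guesses at what the figure fragment contains, the names `Concept`, `specialise` and `subsumes` are hypothetical, and the subsumption test is a simple structural check that ignores most of what a real description logic classifier does.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterator, Optional, Tuple

# Hand-asserted 'isa' links between elementary terms (child -> parent).
# Assumed for illustration; the paper does not list them explicitly.
ISA = {
    "Motif": "SequenceComponent",
    "Hydrolase": "Function",
    "PoeciliaReticulata": "Organism",
}

# Sanctions asserted once, on general terms: (base, relation, value).
SANCTIONS = {
    ("SequenceComponent", "isComponentOf", "Protein"),
    ("SequenceComponent", "hasFunction", "Function"),
    ("Protein", "hasOrganismSource", "Organism"),
}

def ancestors(term: str) -> Iterator[str]:
    """Yield the term itself, then each of its asserted ancestors."""
    current: Optional[str] = term
    while current is not None:
        yield current
        current = ISA.get(current)

@dataclass(frozen=True)
class Concept:
    """A base term plus a set of (relation, concept) criteria."""
    base: str
    criteria: FrozenSet[Tuple[str, "Concept"]] = frozenset()

def sanctioned(base: str, relation: str, value: str) -> bool:
    """A criterion is legal if a sanction exists on the terms or their ancestors."""
    return any((b, relation, v) in SANCTIONS
               for b in ancestors(base) for v in ancestors(value))

def specialise(concept: Concept, relation: str, value: Concept) -> Concept:
    """Add a criterion to a concept, refusing unsanctioned combinations."""
    if not sanctioned(concept.base, relation, value.base):
        raise ValueError(f"{concept.base} {relation} {value.base} is not sanctioned")
    return Concept(concept.base, concept.criteria | {(relation, value)})

def subsumes(general: Concept, specific: Concept) -> bool:
    """True if 'specific' would be classified at or below 'general'."""
    if general.base not in ancestors(specific.base):
        return False
    return all(
        any(rel == g_rel and subsumes(g_val, val) for rel, val in specific.criteria)
        for g_rel, g_val in general.criteria
    )
```

With these definitions, ‘Motif isComponentOf Protein’ can be built although only ‘SequenceComponent isComponentOf Protein’ was sanctioned, and it is placed below ‘Motif’ and above ‘Motif isComponentOf (Protein hasOrganismSource PoeciliaReticulata)’ without any of those links being asserted by hand. An attempt such as ‘Motif hasOrganismSource PoeciliaReticulata’ raises an error, mirroring the idea that only reasonable queries can be constructed.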

tool. Figure 3 shows the navigator focused on the concept ‘Protein Structure’. The concept currently in focus occupies the center of the frame and related concepts from the Knowledge Base are displayed around it. The model may be browsed by promoting any of the related concepts to be the central concept. The new central concept is then surrounded by all its related concepts.
Having identified a concept of interest, for example ‘motif’, the user may want to form a query based on that concept. A Query Manipulation tool gives the user an option to add more information about the concept (or specialise the concept) by presenting all the legitimate criteria that can be applied to the concept ‘motif’ (see figure 4).
The user may choose one or many of these criteria. If they choose, for example, ‘isComponentOf Protein’, the query is equivalent to the English expression “find all protein motifs”. Having constructed the query the user may manipulate the whole query or any of its component sub-queries by (i) the addition or removal of criteria or (ii) the replacement of terms with more specialized or more general terms. Figure 5a shows a query that has been built by further specialisation of the term ‘Protein’ in the above query by addition of the criterion ‘hasOrganismSource PoeciliaReticulata’. The query is equivalent to the English expression “find all motifs occurring in guppy proteins”.
It is important to appreciate that in TAMBIS the term concept is interchangeable with the term query. Therefore, in constructing a concept (a description of what the user wants) the user is constructing a query (“what things exist that fit the description I have just given?”).

Query Planning and Translation

Queries expressed in GRAIL are declarative and source independent. GRAIL queries thus specify what information is required, but neither how it should be obtained nor from where. It is the role of the query planning and translation layer to provide this additional information. This layer takes as input a GRAIL query and generates as output an execution plan in CPL. The planning and translation process is broken into three main steps:
• Translation into a Query Internal Form (QIF): The GRAIL query is unnested and certain query constructs are simplified.
• Query Planning: A search algorithm considers alternative evaluation orders for the components of the QIF generated at step 1, with a view to identifying both valid and efficient ways of evaluating the query.
• Code Generation: The query plan that results from the planning phase is converted into a CPL program for execution.

The following subsections elaborate on the above steps, both detailing what is done at each stage and outlining the auxiliary data structures that are required.
Translation into Query Internal Form (QIF). GRAIL queries are intrinsically nested structures. However, nested language structures generally imply some evaluation order, so we follow a number of earlier query planners in unnesting the source query prior to query optimisation (Paton 1990, Fegaras 1997). The QIF is a list of query components, each of which is a tuple (Base, Variable, Criteria, Cost, Cardinality) representing the evaluation of part of the query. Base is the base concept of the component, Variable is the name of the variable used to store values retrieved as a result of evaluating the component, Criteria represents the set of criteria associated with Base, Cost is an estimate of the cost of evaluating the component, and Cardinality is the size of the collection that it is anticipated will result from evaluating the component. Values for Cost and Cardinality are computed by the planner. Figure 5a shows an example query that is equivalent to the English query “find all motifs in Poecilia reticulata (guppy) proteins”. The GRAIL representation
Figure 3. TAMBIS prototype user interface navigation tool showing the navigation of the concept ‘Protein Structure’. The central term is surrounded by related terms. Each related term is coloured according to its relationship with the central term. There are four possible relationships: parent terms - concepts immediately above it in the hierarchy with which it has an ‘isa’ relationship e.g. ‘Structure’; child terms - concepts immediately below it in the hierarchy which have ‘isa’ relationships with it e.g. ‘Protein Tertiary Structure’; defining terms - relation-term pairs that form part of its definition e.g. ‘is structure of Protein’; sanctioned terms - concepts with which it has appropriately sanctioned relationships but which do not form part of the concept’s definition e.g. ‘is determined by Method of Determining Structure’.

Figure 4. An example from the TAMBIS user interface prototype showing the relationships that can be used to specialise the concept of ‘motif’.
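
The unnesting of a nested query into a flat list of (Base, Variable, Criteria, Cost, Cardinality) tuples can be sketched as follows. This is a minimal illustration, not the TAMBIS translator: the `Query` and `Component` types, the `Base-N` variable naming and the use of -1 as a "not yet computed" placeholder for Cost and Cardinality are assumptions made here, and the simplification rules mentioned in the text are not modelled.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple, Union

class Query(NamedTuple):
    """A nested query: a base concept plus (relation, value) criteria.
    A value is an elementary term (str) or another nested Query."""
    base: str
    criteria: Tuple[Tuple[str, Union[str, "Query"]], ...] = ()

class Component(NamedTuple):
    """One QIF entry: (Base, Variable, Criteria, Cost, Cardinality)."""
    base: str
    variable: str
    criteria: List[Tuple[str, Optional[str]]]  # (criterion, variable or None)
    cost: int
    cardinality: int

UNKNOWN = -1  # placeholder until a planner fills in Cost and Cardinality

def to_qif(query: Query) -> List[Component]:
    """Flatten a nested query; each nested concept gets its own component."""
    components: List[Component] = []
    counters: Dict[str, int] = {}

    def visit(q: Query) -> str:
        counters[q.base] = counters.get(q.base, 0) + 1
        variable = f"{q.base}-{counters[q.base]}"
        component = Component(q.base, variable, [], UNKNOWN, UNKNOWN)
        components.append(component)
        for relation, value in q.criteria:
            if isinstance(value, Query):
                # The nested concept becomes a separate component, and the
                # criterion refers to it through that component's variable.
                component.criteria.append((f"{relation} {value.base}", visit(value)))
            else:
                component.criteria.append((f"{relation} {value}", None))
        return variable

    visit(query)
    return components
```

Applied to the guppy query, `to_qif(Query("Motif", (("isComponentOf", Query("Protein", (("hasOrganismSource", "PoeciliaReticulata"),))),)))` returns two components, one for Motif bound to `Motif-1` and one for Protein bound to `Protein-1`, with the Motif criterion pointing at `Protein-1`.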

of this query is “Motif which isComponentOf (Protein which hasOrganismSource PoeciliaReticulata)” (figure 5b). The initial QIF of this query is shown in figure 5c. Each term (concept) in a criterion is itself represented by a query component and is associated with the variable used to store instances that result from the evaluation of the component. The other form of mapping that takes place during the translation to QIF is the simplification of components where appropriate, for example the removal of query components which exist only to support certain modelling strategies employed by the knowledge worker.
The mapping into the QIF is defined as a set of rewrite rules of the form:

rewrite <concept template>
as <QIF component>
if <condition>

The concept template is capable of matching concepts with specific structures in the biological Concept Model, the QIF component is as described above, and the condition makes tests involving the Concept Model and the Source Model. However, conditions never refer to the specific functions that may be used to evaluate a query, as planning is the sole preserve of the planner described below.
Query Planning. The query planner seeks to identify both legal and efficient ways of evaluating queries given the available CPL functions. The planner exploits the augmentation heuristic (Swami 1989), which essentially involves examining all the query components in a query, selecting the most promising for initial evaluation, and repeating the process for the remaining components. The Source Model is central to the planning process, as it indicates which CPL functions can be used to evaluate which query components. Lack of space prevents a detailed description of the Source Model, but the following are the principal components:
• Concept Iteration: Concept iteration information is a triple (Concept, FunctionSignature, ArgumentMapping), where Concept is a concept from the Concept Model, FunctionSignature is the signature of a CPL function, and ArgumentMapping is a description of how input parameters for the CPL function should be obtained.
• Criterion Evaluation: Criterion evaluation information indicates how the criteria of a concept can be evaluated in CPL. This is described using tuples of the form (Concept, Criterion, FunctionSignature, ArgumentMapping), where Concept is the base concept to which the criterion is applied, Criterion is the criterion in question and FunctionSignature and ArgumentMapping are as described for concept iteration.
• Coercion: CPL functions may retrieve values of different CPL types to represent the same concept. For example, retrieval of protein information from a specialist protein database such as SWISSPROT yields a complex record structure that contains significant amounts of information about the protein. Retrieval of information from a motif database such as PROSITE, however, is likely to yield only the accession numbers. This means that the query planner needs to know things like how to obtain a detailed description from an accession number and vice versa. Such relationships are described using tuples of the form: (CPL_type, CPL_type, mapping_function)
• Costing: Information on the anticipated cost of evaluating a CPL function and the likely cardinality of the result is stored using tuples of the form: (FunctionSignature, Cost, Cardinality).

The planner has two principal components, the search algorithm described at the start of this sub-section and a list of rules that indicate under which circumstances specific techniques may be used to evaluate a query component. Such rules are of the form:

Rewrite <QIF Component>
as <Function List, Cost, Cardinality>
if <condition>
given <variables>

The QIF Component is as described above, the Function List is a list of CPL functions with bound arguments, the Cost is an estimate of how long it will take to evaluate each of the functions in the Function List, and the Cardinality is the total number of concepts given the set of bound variables. The condition invariably refers to the functions that are available in the Source Model and the set of bound variables.
Code Generation. The code generator takes as input an ordered list of query components and their associated functions, and generates a single CPL program that binds together the CPL functions. The code generator is straightforward, and makes a single pass through its inputs in generating the execution plan (figure 5d).
For result presentation, TAMBIS makes use of a CPL function that transforms its data structures into HTML for display using a WWW browser (figure 5e).

Project Status

The prototype Biological Model is well populated by concepts describing those areas required for the construction of common queries, such as queries about protein structure and nucleic acid coding signals. The model currently contains around 1500 concepts and has the capability to infer many more. The biological concept model will become better populated as the prototype system is used to elicit user requirements. The prototype user interface is currently implemented in Smalltalk. It has much of the functionality that it is envisaged will be needed in the final system, although the look and behaviour of the interface is likely to change as the final implementation will be in Java to facilitate its use on the World Wide Web. We are currently eliciting general user requirements from academia and industry by means of a questionnaire. This is ensuring that the Concept Model allows the formulation of the kinds of questions that biologists want to ask. No formal user evaluation of the prototype knowledge-driven user interface has yet been performed, although it is envisaged to play a major part in the development of the system. A Java implementation of the query transformation module is in place, although its accuracy has not yet been evaluated. A more sophisticated planner will be needed in the future. There are currently 15 wrapped sources including the BLAST suite of programs, SwissProt, Prosite, BLOCKS and PRINTS. The Source Model has mappings between a range of CPL functions acting on these sources and the corresponding concepts in the biological model. These mappings dictate the number of queries that can be answered by TAMBIS and so the development of a more comprehensive Source Model is the next priority task. As suggested by (Davidson 1995), this approach is high cost but high benefit, and there are still many challenges to address, such as: tools for adding new sources; changes in sources; incorporation of CORBA sources; dynamic query optimisation based on network performance; user intervention and results attribution.

Summary

The TAMBIS project is pursuing a novel approach that will yield an integrated solution to the problem of disparate biological databases and analysis tools. The common schema (Biological Knowledge Base) is represented in a Description Logic, presenting the user with a rich description of the domain from which they may flexibly and intuitively construct and modify queries. The queries are deconstructed, rewritten into a common query language and dispatched to one or more wrapped resources. The use of a knowledge base and wrapped resources removes the need for the user to know (i) which are the appropriate resources to use and (ii) how to access them, thus greatly reducing the time taken to analyze their data.

Acknowledgements

The TAMBIS project is funded jointly by the EPSRC/BBSRC Bioinformatics Programme and by Zeneca Pharmaceuticals, whose support we are pleased to acknowledge.
a) [Screenshot: the query as constructed in the knowledge-driven GUI.]

b)
Motif which isComponentOf (Protein which
hasOrganismSource PoeciliaReticulata)

c)
[ ( Motif, Motif-1, [(isComponentOf Protein, Protein-1)], -1, 1),
(Protein, Protein-1, [(hasOrganismSource PoeciliaReticulata,
null)], -1, -1) ]

d)
{Motif-1|
\Protein-1<-get-sp-entry-by-os(" POECILIA+RETICULATA"),
Motif-1<-do-prosite-scan-by-entry-rec(Protein-1)}

e) [Screenshot: the results displayed in a Web browser.]

Figure 5. An example showing the stages in the information retrieval process using TAMBIS. a) The knowledge-driven GUI allows the user to construct a declarative, conceptual and source independent query. The query formulated at the interface is represented in GRAIL as shown in b). c) The single GRAIL query is transformed into query internal form (QIF). d) The QIF is transformed into a functional, source-dependent query in CPL. e) The results from the CPL wrapped sources are presented to the user via a Web browser.
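
The step from c) to d) in figure 5 can be sketched as a small planner followed by a one-pass code generator. The sketch rests on several assumptions that are not taken from the paper: the `SOURCE_MODEL` table is a hypothetical, flattened stand-in for the criterion evaluation and costing tuples, its cost values are invented, every component is assumed to carry exactly one criterion, and "most promising" is read as "cheapest component whose inputs are already bound". The two function names are those shown in figure 5d; the generated text only imitates the layout of that figure and has not been checked against CPL's grammar.

```python
from typing import Dict, List, NamedTuple, Optional, Set, Tuple

class Component(NamedTuple):
    base: str
    variable: str
    criteria: List[Tuple[str, Optional[str]]]  # (criterion, variable or None)

class Step(NamedTuple):
    variable: str   # variable that receives the result
    function: str   # function chosen from the Source Model
    argument: str   # literal or previously bound variable

# Hypothetical Source Model: (concept, criterion) -> (function, literal, cost).
# A literal of None means the argument is the variable named in the criterion.
SOURCE_MODEL: Dict[Tuple[str, str], Tuple[str, Optional[str], int]] = {
    ("Protein", "hasOrganismSource PoeciliaReticulata"):
        ("get-sp-entry-by-os", '"POECILIA+RETICULATA"', 10),
    ("Motif", "isComponentOf Protein"):
        ("do-prosite-scan-by-entry-rec", None, 5),
}

def plan(qif: List[Component]) -> List[Step]:
    """Greedy ordering: repeatedly pick the cheapest evaluable component."""
    bound: Set[str] = set()
    remaining = list(qif)
    steps: List[Step] = []
    while remaining:
        candidates = []
        for component in remaining:
            criterion, needs = component.criteria[0]
            if needs is None or needs in bound:
                function, literal, cost = SOURCE_MODEL[(component.base, criterion)]
                argument = literal if needs is None else needs
                candidates.append((cost, component, function, argument))
        if not candidates:
            raise ValueError("no component can be evaluated with the bound variables")
        _, chosen, function, argument = min(candidates, key=lambda c: c[0])
        steps.append(Step(chosen.variable, function, argument))
        bound.add(chosen.variable)
        remaining.remove(chosen)
    return steps

def generate_cpl(steps: List[Step], result_variable: str) -> str:
    """Single pass over the ordered steps, emitting comprehension-style text."""
    generators = ",\n ".join(f"{s.variable} <- {s.function}({s.argument})" for s in steps)
    return "{" + result_variable + " |\n " + generators + "}"
```

Given the two components of figure 5c, `plan` places the Protein component first even though the Motif component is cheaper, because the Motif step needs `Protein-1` to be bound; `generate_cpl` then lists the two generators in that order.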
References

Arens Y, Chee C.Y., Hsu C-H, Knoblock C.A. Retrieving and Integrating Data from Multiple Information Sources, in Journal on Intelligent and Cooperative Information Systems, 2:127-158, 1993.

Borgida A., Description Logics in Data Management. IEEE Transactions on Knowledge and Data Engineering, 7(5): 671-682, 1995.

Buneman P., Davidson S.B., Hart K., Overton C. and Wong L. A Data Transformation System for Biological Data Sources. In Proceedings of VLDB, Sept. 1995 (Zurich, Switzerland).

Davidson S.B., Overton C., Buneman P., Challenges in Integrating Biological Data Sources, Journal of Computational Biology Vol 2, No 4, 1995.

Donini, F., Lenzerini, M., Nardi, D., Nutt, W., ‘The Complexity of Concept Languages’, KR-91, pp151-162, 1991.

Etzold T, Ulyanov A, Argos P, SRS: information retrieval system for molecular biology data banks. Methods Enzymol. 1996, 266: 114-128.

Fegaras L. An experimental optimizer for OQL. Technical Report TR-CSE-97-007, CSE, University of Texas at Arlington, 1997.

Karp P, A Strategy for Database Interoperation, in Journal of Computational Biology, 1996.

Kemp G.J.L. and Gray P.M.D., Using the Functional Data Model to Integrate Distributed Biological Data Sources, Proc. 8th Int. Conf. on Scientific and Statistical Database Management, IEEE Press, 176-195, 1996.

Markowitz, V.M., and Ritter, O., Characterizing Heterogeneous Molecular Biology Database Systems, Journal of Computational Biology, 2(4), 1995.

Paton, N.W. and Gray, P.M.D., Optimising and Executing Daplex Queries Using Prolog, The Computer Journal, Vol 33, No 6, 547-555, 1990.

Rector A.L., Bechhofer S., Goble C.A., Horrocks I, Nowlan W.A., Solomon W.D., The GALEN modelling language for medical terminology, in AI in Medicine 1996.

Rector A. and Horrocks I. Experience building a Large, Re-usable Medical Ontology using a Description Logic with Transitivity and Concept Inclusions. AAAI Spring Symposium on Ontological Engineering, 1997.

Rodriguez-Tome P, Helgesen C, Lijnzaad P, Jungfer K, A CORBA server for the radiation hybrid database. Proceedings of the ISMB 1997, 5:250-253.

Warren D.H.D., Efficient Processing of Interactive Relational Database Queries Expressed in Logic, Proc. 7th VLDB, 272-281, 1981.

Wiederhold G. Mediators in the Architecture of future Information Systems, IEEE Computer 21(3) March 1992, pp. 38-50.
