HEWLETT
PACKARD
The Iris Architecture and Implementation
Kevin Wilkinson, Peter Lyngbaek, Waqar Hasan
Software and Systems Laboratory
HPL-90-108
August, 1990
Iris; database management;
extensible systems;
functional data model;
object-oriented databases;
query processing; rule-based
query optimization; semantic
data model
The Iris database management system is a
research prototype being developed at Hewlett-Packard Laboratories. Its goals are to enhance
database programmer productivity and to provide
generalized database support for the integration
of future applications. Iris is based on an object
and function model. Iris objects are typed but,
unlike other object systems, Iris objects contain no
state.
Attribute values, relationships and
behavior of objects are modeled by functions. The
Iris architecture efficiently supports the
evaluation of functional expressions. The goal of
the architecture is to provide a database system
that is powerful enough to support the definition
of functions and procedures that implement the
semantics of the data model. This paper provides
an overview of the data model, describes the
architecture in detail and discusses our
implementation experience and usage of the
system.
Published in IEEE Transactions on Knowledge and
Data Engineering, Vol. 2, No. 1, March 1990
1
Introduction
Iris is an object-oriented database management system being developed at Hewlett-Packard
Laboratories [Fishman,89] [Fishman,87]. One of its goals is to enhance database programmer
productivity by developing an expressive data model. Another goal is to provide generalized
database support for the development and integration of future applications in areas such
as engineering information management, engineering test and measurement, telecommunications, office information, knowledge-based systems, and hardware and software design.
These applications require a rich set of capabilities that are not supported by the current
generation (i.e., relational) DBMSs.
Figure 1 illustrates the major components of the Iris system. Central to the figure is the
Iris Kernel, the retrieval and update processor of the DBMS. The Iris Kernel implements
the Iris data model [Lyngbaek,86] which is an object and function model. Retrievals and
updates are written as functional expressions. Extensibility is provided by allowing users
to define new functions. The functions may be implemented as stored tables or derived as
computations. The computations may be expressed either as Iris functional expressions or
as foreign functions in a general-purpose programming language, such as C.
Like most other database systems, Iris is accessible via stand-alone interactive interfaces or
interfaces embedded in programming languages. All interfaces are built as clients of the Iris
Kernel. A client formats a request as an Iris expression and then calls an Iris Kernel entry
point that evaluates the expression and returns the result, which is also formatted as an Iris
expression.
Currently, two interactive interfaces are supported. One interactive interface, Object SQL
(OSQL), is an object-oriented extension to SQL. The second interactive interface is the
Graphical Editor. It is an X-Windows-based system that allows users to retrieve and update
function values and metadata with graphical and forms-based displays.
In addition to the Kernel interface, Iris supports two other programmatic interfaces. The
first, CLI (C Language Interface), is a user-friendly layer on top of the base Kernel interface.
It allows programmers to access Iris in an object-oriented fashion by manipulating C variables
denoting the Iris database, the Iris meta-data including types and functions, and the objects
in the database. The second programmatic interface is a straightforward embedding of OSQL
into various host languages.
One of the long-term goals of the Iris project is to be able to define and implement the Iris
model in terms of its own functions [Lyngbaek,86]. This provides a conceptual simplicity with
the result that the implementation of the system is easier to understand and maintain. It is
also easier to prototype new operations with such a system because data model operations
can be prototyped as ordinary database functions. An added advantage is that it will be
possible to optimize and type check data model operations like ordinary database functions.
Since the essence of the Iris data model is function application, the Iris Kernel has been
architected around the single operation of invoking a function. In addition, the Kernel may
call itself recursively so that data model operations may invoke other data model operations.
This flexibility permits the customization of Iris operations and allows us to experiment with
different semantics of, for example, multiple inheritance, versioning, and complex objects
with little re-implementation effort.
[Figure 1: Iris System Components. Components shown: Object SQL, Graphical Editor, Embedded OSQL, CLI and Foreign Functions; the Iris Kernel (types, objects, functions, queries, updates, versioning); and the Iris Storage Manager (concurrency control, recovery, buffering, indexing, clustering, oid generation).]
The emphasis of this paper is to describe the Iris Kernel architecture. Section 2 gives a brief
overview of the Iris Data Model. Section 3 describes the Iris Kernel architecture. Section 4
is a look back on our implementation experiences. Section 5 provides a summary.
2
Overview of the Iris Data Model
The Iris Database System is based on a semantic data model that supports abstract data
types. Its roots can be found in previous work on Daplex [Shipman,81] and Taxis [Mylopoulos,80].
A number of recent data models, such as PDM [Manola,86] and Fugue [Heller,88], also
share many similarities with the Iris Data Model. The Iris data model contains three important constructs: objects, types and functions. These are briefly described below. A
more complete description of the Iris Data Model and the Iris DBMS may be found in
[Lyngbaek,86, Fishman,89].
2.1
Objects and Types
Objects in Iris represent entities and concepts from the application domain being modeled.
Some objects such as integers, strings, and lists are self identifying. Those are called literal
objects. There is a fixed set of literal objects, that is, each literal type has a fixed extension.
A surrogate object is represented by a system-generated, unique, immutable object identifier
or oid. Examples of surrogate objects include system objects, such as types and functions,
and user objects, such as employees and departments.
Types have unique names and are used to categorize objects into sets that are capable of
participating in a specific set of functions. Objects serve as arguments to functions and may
be returned as results of functions. A function may only be applied to objects that have the
types required by the function.
Types are organized in an acyclic type graph that represents generalization and specialization. The type graph models inheritance in Iris. A type may be declared to be a subtype
of other types (its supertypes). A function defined on a given type is also defined on all
its subtypes. Objects that are instances of a type are also instances of its supertypes. The
system types are shown in Figure 2 where the subtype relationship is illustrated with arrows
from a supertype to its subtypes. User objects may belong to any set of user defined types.
In addition, objects may gain and lose types dynamically. For example, an object representing a given person may be created as an instance of the Employee type. Later it may
lose the Employee type and acquire the type Retiree. When that happens, all the functions
defined on Retiree become applicable to the object and the functions on Employee become
inapplicable. This feature enables us to support database evolution better than other object-oriented systems in which the types of an object may be specified only at the time the object
is created.
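The effect of gaining and losing types can be illustrated with a minimal Python sketch. It is not the Iris implementation: the class `IrisObject`, the table `FUNCTION_ARG_TYPE`, and the function name `PensionPlan` are hypothetical, and inheritance is ignored for brevity.

```python
class IrisObject:
    """A surrogate object: an identifier plus a mutable set of type names."""
    def __init__(self, oid, types):
        self.oid = oid
        self.types = set(types)

# Hypothetical catalog: function name -> type its argument must have.
FUNCTION_ARG_TYPE = {"JobTitle": "Employee", "PensionPlan": "Retiree"}

def applicable(function_name, obj):
    """A function is applicable only while the object has the required type."""
    return FUNCTION_ARG_TYPE[function_name] in obj.types

smith = IrisObject(oid=1, types={"Employee"})
before = (applicable("JobTitle", smith), applicable("PensionPlan", smith))

# Smith retires: the object loses one type and acquires another.
smith.types.discard("Employee")
smith.types.add("Retiree")
after = (applicable("JobTitle", smith), applicable("PensionPlan", smith))
```

The object identifier never changes; only the set of types, and therefore the set of applicable functions, does.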
Objects in Iris may be versioned. An object being versioned is represented by a generic
object instance and a set of distinct version object instances corresponding to each version
of the object. By default, objects are not versioned, i.e. they have the type UnVersioned.
The Iris versioning mechanism is further described in [Beech,88].
[Figure 2: Iris System Types. Type graph rooted at Object. Types named in the figure: Surrogate, Literal, LiteralAtom, UserTypeObject (user-defined types), List, SystemTypeObject, Integer, Type, Function, UserType, UserFunction, Aggregate, ArgRes, Version, Update, Real, Boolean, UnVersioned, Updatable, String, Xact, Transient, Index, Session, Binary, StorageObject, Table, Generic, Bag, Savept, Scan.]
2.2
Functions
Attributes of objects, relationships among objects, and computations on objects are expressed in terms of functions. Iris functions are defined over types; they may be many-valued
and, unlike mathematical functions, they may have side-effects. In Iris, the declaration of a
function is separated from its implementation. This provides a degree of data independence.
This section discusses function declaration and Section 2.4 discusses function implementation.
In order to support stepwise refinement of functions, function names may be overloaded, i.e.
functions defined on different types may be given identical names. When a function call is
issued using an overloaded function name, a specific function is selected for invocation. Iris
chooses the function that is defined on the most specific types of the actual arguments.
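One way to picture this selection is the following Python sketch. It is an illustration under stated assumptions, not the Iris algorithm: the type graph, the overloaded function `Bonus`, and the rule of ranking candidates by the total number of subtype steps are all hypothetical choices made for the example.

```python
# Hypothetical type graph: each type maps to its direct supertypes.
SUPERTYPES = {
    "Object": [],
    "Person": ["Object"],
    "Employee": ["Person"],
    "Engineer": ["Employee"],
}

def distance(actual, formal):
    """Fewest subtype edges from `actual` up to `formal`, or None if unrelated."""
    if actual == formal:
        return 0
    best = None
    for parent in SUPERTYPES[actual]:
        d = distance(parent, formal)
        if d is not None and (best is None or d + 1 < best):
            best = d + 1
    return best

def resolve(name, arg_types, functions):
    """Pick the overload whose formal types are closest to the actual types."""
    candidates = []
    for fname, formals, impl in functions:
        if fname != name or len(formals) != len(arg_types):
            continue
        dists = [distance(a, f) for a, f in zip(arg_types, formals)]
        if None not in dists:
            # Assumption: rank by total distance over all arguments.
            candidates.append((sum(dists), impl))
    if not candidates:
        raise LookupError(f"no applicable function named {name}")
    return min(candidates, key=lambda c: c[0])[1]

# Two overloads of a hypothetical function, defined on Person and on Employee.
FUNCTIONS = [
    ("Bonus", ("Person",), "bonus_for_person"),
    ("Bonus", ("Employee",), "bonus_for_employee"),
]
chosen = resolve("Bonus", ["Engineer"], FUNCTIONS)
```

Here an Engineer argument is one step from Employee and two steps from Person, so the Employee overload is selected.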
A type can be characterized by the collection of functions defined on it. The Employee type
might have the following functions defined over it:
JobTitle:   Employee -> String
EmpDept:    Employee -> Department
Manager:    Employee -> Employee
SalHist:    Employee ->> Integer x Date
ChangeJob:  Employee x String x Department -> Boolean
If Smith is working as a software engineer in the Toolkit Department reporting to Jones then
the function values are as follows (references to surrogate objects are denoted by italics):
JobTitle(Smith) = "Software Engineer"
EmpDept(Smith) = Toolkit
Manager(Smith) = Jones
The SalHist function is many-valued. It is also an example of a function with multiple result
types. Each result is a pair of salary and date objects, where the salary is represented by
an integer. The date indicates when the salary was changed. If Smith was hired on 3/1/87
with a monthly salary of $3000 and given a raise of $300 on 3/1/88 then the salary history
function has the following value:
SalHist(Smith) = [<3000, 3/1/87>, <3300, 3/1/88>]
Note that the dates are represented by surrogate object identifiers (they are in italics). Thus,
there are, presumably, functions on date objects that materialize parts or all of the date,
e.g. month, day and year functions.
In Iris, we use the term procedure to refer to a function whose implementation has side-effects.
The function ChangeJob is an example of a procedure and is also a function with multiple
argument types. Let us assume that it may be used to change the job title and department
of an employee. The promotion of Smith to Project Manager in the Applications Department
can be reflected in the database by the following invocation:
ChangeJob(Smith, "Project Manager", Applications)
In Iris, a new function is declared by specifying its name, the types of its argument and result
parameters and, optionally, names for the arguments and results.¹
create function Manager(Employee) -> supervisor Employee;
Before a function may be invoked, an implementation must be specified. This process is
described in Section 2.4.
2.3
Database Updates and Retrievals
Properties of objects can be modified by changing the values of functions. For example, the
following operations will cause the JobTitle function to return the value "MTS" in a future
invocation with the parameter Smith and add another salary and date pair to Smith's salary
history:
set JobTitle(Smith) = "MTS";
add SalHist(Smith) = <3800, 1/1/89>;
In addition to setting and adding function values, one or more values may be removed from
the value-set of a many-valued function. The ability to update a function's values depends
on its implementation. In general, functions whose values are stored as a table can always
be updated. However, functions whose values are computed may or may not be updatable
(see Section 2.4).
The database can be queried by using the OSQL select statement and specifying a list of
results, a list of existentially quantified variables, and a predicate expression. The result
list contains variables and function invocations. The predicate may use variables, constants
(object identifiers or literals), nested function applications, and comparison operators. Query
execution causes the existential variables to be instantiated. The result list is then used to
compute a result value for each tuple of the instantiated existential variables. The collection
of all result values is returned as a bag.
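The evaluation just described can be sketched naively in Python. This is only an illustration: `select`, the variable domains, and the sample data are hypothetical, and the sketch enumerates explicit finite candidate sets for every variable, whereas a real query processor would bind variables from the stored data.

```python
from itertools import product

def select(result, variables, predicate):
    """Try every binding of the variables; keep results where the predicate holds."""
    names = list(variables)
    bag = []  # a list keeps duplicates, as a bag does
    for values in product(*(variables[n] for n in names)):
        env = dict(zip(names, values))
        if predicate(env):
            bag.append(result(env))
    return bag

# Hypothetical extension of SalHist as (employee, salary, date) triples,
# with dates shown as opaque object identifiers.
SAL_HIST = {("Smith", 3000, "d1"), ("Smith", 3300, "d2"), ("Jones", 4000, "d1")}

# Counterpart of: select d for each Date d, Integer s where SalHist(Smith) = <s, d>
dates = select(
    result=lambda env: env["d"],
    variables={"d": ["d1", "d2"], "s": [3000, 3300, 4000]},
    predicate=lambda env: ("Smith", env["s"], env["d"]) in SAL_HIST,
)
```

Each satisfying binding of the existential variables contributes one element to the returned bag.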
The following statement retrieves all the dates on which Smith's salary was modified:
select d
for each Date d, Integer s
where SalHist(Smith) = <s, d>;
and the statement:
select Manager ( Smith );
returns Smith's manager.
Retrievals must be side-effect free, i.e. they may not invoke procedures.
¹ For readability, subsequent examples will be expressed in OSQL [Fishman,89]. Keywords of this language appear in bold font. Keep in mind, however, that OSQL is parsing each statement into a functional expression that invokes an Iris system or user function.
2.4
Function Implementation
So far, we have discussed the declaration of functions and their use in retrievals and updates.
An important additional characteristic of a function is its implementation or body, that is, the
specification of its behavior. The implementation of the function is compiled and optimized
into an internal format that is then stored in the system catalog. When the function is later
invoked, the compiled representation is retrieved and interpreted.
Iris supports three methods of function implementation: Stored, Derived, and Foreign. A
stored implementation explicitly maintains the extension of the function as a stored table in
the database. Derived and foreign implementations are alternative methods for computing
function values.
Stored Functions
The extension of a function may be explicitly maintained by storing the mapping of corresponding argument and result values as tuples in a database table. Since our storage
manager does not support nested structures, the result of a many-valued function is stored
as several tuples with identical argument values.
To improve performance, functions with the same argument types may be horizontally clustered in a single table by extending the tuple width to include result columns for each of
the functions. As illustrated in Section 2.3, stored functions may be updated by using
the OSQL set and add statements. There is also a remove statement that is not shown.
A formal treatment of the mapping of Iris functions to relational tables may be found in
[Lyngbaek,87].
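As a rough illustration of this storage scheme, the following Python sketch uses the standard `sqlite3` module in place of the Iris Storage Manager. The table and column names are hypothetical, and dates are shown as opaque identifiers (`d87`, `d88`) since they are surrogate objects.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Single-valued functions on Employee clustered horizontally in one table:
# one argument column plus one result column per function.
db.execute("CREATE TABLE employee_fns (emp TEXT PRIMARY KEY, job_title TEXT, manager TEXT)")
db.execute("INSERT INTO employee_fns VALUES ('Smith', 'Software Engineer', 'Jones')")

# Many-valued function: one tuple per result, with the argument value repeated.
db.execute("CREATE TABLE sal_hist (emp TEXT, salary INTEGER, changed TEXT)")
db.executemany(
    "INSERT INTO sal_hist VALUES (?, ?, ?)",
    [("Smith", 3000, "d87"), ("Smith", 3300, "d88")],
)

def job_title(emp):
    """Evaluate the stored function JobTitle by a lookup on the clustered table."""
    row = db.execute("SELECT job_title FROM employee_fns WHERE emp = ?", (emp,)).fetchone()
    return row[0] if row else None

def sal_hist(emp):
    """Evaluate the many-valued stored function SalHist: all tuples for the argument."""
    cur = db.execute(
        "SELECT salary, changed FROM sal_hist WHERE emp = ? ORDER BY rowid", (emp,)
    )
    return cur.fetchall()
```

Clustering `job_title` and `manager` in one table means a single tuple fetch can answer both functions for the same employee.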
Derived Functions
Derived functions are functions that are computed by evaluating an Iris expression. The
expression may represent a retrieval or an update. As an example of a retrieval function, the
select statement in Section 2.3 could represent the body of a derived function with zero
arguments that retrieves the dates on which Smith's salary was modified.
A function may also be derived as a sequence of updates, which defines a procedure. For
example the first two updates in Section 2.3 may be encapsulated as a procedure that updates
Smith's personnel data. Procedures themselves may not be updated.
A derived function without side-effects may be thought of as a view of the stored data.
The semantics of updates to such a function are not always well-defined. For example, if
the derivation expression of a given function requires joining several tables, the function
cannot be directly updated. However, in those cases where Iris can solve the "view update"
problem, the update actions are automatically inferred by Iris. For example, functions that
are derived as inverses of stored functions are updatable.
Foreign Functions
A foreign function is implemented as a subroutine written in some general-purpose programming language and compiled outside of Iris. The implementation of the foreign function
must adhere to certain interface conventions. Beyond that, the implementation is a black
box with respect to the rest of the system.
This has three consequences. First, it is impossible to determine at compile time whether
the implementation has side-effects. For this reason, users must specify whether or not
their foreign function has side-effects. Second, foreign functions cannot be updated. Third,
the implementation of foreign functions cannot be optimized by Iris. However, their usage
can, potentially, be optimized. For example, given a foreign function that computes simple
arithmetic over two numbers, rules could be added to evaluate the result at compile time if
the operands are constants.
Foreign functions provide flexibility and extensibility. Since the Iris database language is
not computationally complete, there are certain computations that cannot be expressed as
derived functions. Foreign functions provide a mechanism for incorporating such computations into the system. Furthermore, an existing program can be integrated with Iris through
the foreign function mechanism either by modifying the program to adhere to the foreign
function calling conventions or by writing new foreign functions to provide an interface to
the existing program. Foreign functions and the mechanisms with which they are supported
in Iris are described in detail in [Connors,88].
2.5
Iris System Objects
In Iris, types and functions are also objects. They are instances of the system types, Type and
Function, respectively (see Figure 2). Like user-defined types, system types have functions
defined on them. The collection of system types and functions model the Iris metadata and
the Iris data model operations.
System functions are used to retrieve and update metadata and user data. Examples of
retrieval functions include FunctionArgcount, which returns the number of arguments of a
function, SubTypes, which returns the subtypes of a type, and FunctionBody, which retrieves
the compiled representation of a function. System procedures, which correspond to the operations of
the data model, are used to update metadata and user data. Examples of system procedures
include ObjectCreate, to create a new object, FunctionDelete, to delete a function and
IndexCreate, to create an index.
Select and Update are two important system functions that may be used to access metadata
or user data. Select is the system function that corresponds to the OSQL select statement
illustrated in Section 2.3. Update is used to modify function values. It corresponds to the
OSQL set, add and remove statements.
As with user functions, system functions (and procedures) may have either stored, derived or
foreign implementations. Currently, there are a number of system foreign functions. These
exist either because their functionality cannot be expressed as Iris functional expressions
or they are more efficiently implemented as foreign functions. Most system procedures are
implemented as foreign functions. System foreign functions are also used to implement
transaction support, e.g., XactCommit and XactRollback, and facilities for source code
tracing and timing.
In order to compile and execute functions, the Iris Kernel needs access to metadata (a system
catalog) that describes the database schema. The Iris system catalog is maintained as a
collection of stored system functions. Since certain system functions are frequently accessed
together, most of the functions for a particular type of object are horizontally clustered on the
same table. For example, the function table stores the FunctionName, FunctionArgcount,
and FunctionResultcount functions, among others. The current system catalog consists of
approximately 100 functions horizontally clustered in 15 tables.
3
Iris Kernel Architecture
3.1
Overview
The Iris Kernel is a program that implements the Iris data model. The Kernel architecture shares many similarities with an architecture described in [Buneman,82]. This section
describes the component modules of the Iris Kernel in more detail and concludes with an
example that illustrates the flow of execution for a sample request.
The Iris Kernel is invoked via a subroutine entry point that serves as a function call evaluator.
Iris requests are formatted as Iris expressions. Each node in an Iris expression is self-identifying and consists of a header and some data fields.
The header defines the node type.² The possible node types are: object identifier, variable,
function call and one node type for each Iris literal type (i.e. integer, real, boolean, etc.).
Iris provides a subroutine library to create and manipulate nodes and expressions. Once
the request is properly formatted, it may be passed to the Iris Kernel for evaluation. The
returned results are also formatted as Iris expressions.
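The shape of such an expression can be sketched in Python as follows. This is not the Iris subroutine library: the names `Node`, `NodeType`, `oid`, `string` and `call`, and the object identifiers 101 and 202, are hypothetical and serve only to show a header plus data fields on every node.

```python
from dataclasses import dataclass
from enum import Enum

class NodeType(Enum):
    """Header values: what kind of node this is."""
    OID = "oid"
    VARIABLE = "variable"
    CALL = "call"
    INTEGER = "integer"
    STRING = "string"

@dataclass(frozen=True)
class Node:
    header: NodeType  # makes the node self-identifying
    data: tuple       # data fields; their meaning depends on the header

def oid(identifier):
    return Node(NodeType.OID, (identifier,))

def string(value):
    return Node(NodeType.STRING, (value,))

def call(function_name, *args):
    """A function call node: the function name followed by argument nodes."""
    return Node(NodeType.CALL, (function_name,) + args)

# ChangeJob(Smith, "Project Manager", Applications) with made-up identifiers.
request = call("ChangeJob", oid(101), string("Project Manager"), oid(202))
```

Because every node carries its own header, a consumer can walk the expression without any outside schema for the request.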
All user and system operations are invoked via function calls. This represents a range of
capabilities from low-level operations, such as comparison or equality checking, up to high-level operations, such as function or object creation.
3.1.1
Kernel Modules
The Kernel is organized as a collection of software modules. They are layered as illustrated
in Figure 3. The top-level module, the Executive (EX), implements the Kernel entry points
and manages the client-Kernel interaction. For each request, it calls the Query Translator,
QT, to produce a relational algebra tree for the request. EX then passes this tree to the
Query Interpreter, QI, which produces the result expression.
The Object Manager, OM, is a set of system procedures and functions that are implemented
as foreign functions (denoted ff in Figure 3). The Cache Manager, CM, is an intermediate
layer between the Iris Kernel and the Storage Manager, SM. It provides prefetching and cache
management for data retrieval and data updates between the Kernel and SM. The Storage
Manager provides data sharing, transaction management and access to stored tables.
The Kernel may be called recursively through EX. The Query Translator makes recursive
calls to invoke system functions that retrieve metadata. Some high-level system procedures
in the Object Manager make recursive calls to invoke lower level system procedures and
functions.
[Figure 3: Iris Kernel Architecture. Modules shown, from top to bottom: Client; Executive (EX); Query Translator (QT); Query Interpreter (QI); Cache Manager (CM) and Object Manager (OM); Storage Manager (SM).]
² This is a distinct notion from an object type. Node types merely identify interface data structures.
The Iris Kernel is single-threaded and each client runs its own copy of the Kernel. The
Kernel may execute as a server in a separate process and communicate with the client via
messages. Alternatively, the client and Kernel may be tightly coupled in the same process
and communicate via subroutine calls. In either case, the configuration is transparent to the
source code of the client. The Storage Manager always executes in the same address space
as the Kernel. The multiple instances of the Storage Manager use a shared memory buffer
for caching data, concurrency control and transaction logging.
3.2
Iris Executive
The Executive module, EX, manages interaction between the Iris Kernel and its clients
and implements the Kernel entry points. A request consists of a functional expression, a
result buffer and an error buffer. The result buffer is filled with result objects produced by
evaluating the expression tree. The error buffer is filled with any error messages generated
during the processing of the request.
Request processing consists of two steps: compile and interpret. The compilation step, done
by QT, converts the functional expression into an extended relational algebra tree. The
interpreter, QI, then traverses the tree and produces the result objects for the request. The
structure of the result depends on the request. Invoking a many-valued function returns a
bag of objects while invoking a single-valued function returns a single object. If the result
buffer is too small to contain the entire result object, as much as will fit in the buffer is
returned and an error message is generated.
To ensure that no results are lost, the client may open a scan over the result object. This
may be done by calling a system function that takes a bag and returns a scan object.
Alternatively, as a convenience since opening a scan is a frequent operation, EX provides a
separate entry point that always opens a scan on the results of a request.
Iris supports two sets of EX entry points: one for clients, the other for internal Kernel calls.
This was done to decrease the internal path length and improve performance for recursive
Kernel calls. This is possible because the internal calls are considered trustworthy, so it is
safe to skip some of the type and sanity checks that are done for ordinary requests. The
internal entry points also provide a form of security. Certain system functions may only be
invoked through the internal entry points and, therefore, they can be hidden from the client.
3.3
Query Translator
The Query Translator, QT [Derrett,89], compiles an Iris functional expression into an execution tree. The functional expression, formatted as a tree, is known as an F-tree. The
nodes of an F-tree include function calls, variables, and literal nodes. The execution tree is
an extended relational algebra tree referred to as an R-tree.³
³ Not to be confused with spatial R-trees.
Currently, the Query Translator is limited in that every request expression must be rooted
by a function call node and the call arguments must be constants. However, QT permits
specific arguments to some functions to be arbitrary expressions. For example, the predicate
argument to the Iris system function, Select, may be an expression containing variables
and nested function calls. Currently, these functions are treated as special-cases by QT. We
plan to extend QT by allowing any function call argument to be an expression that will be
evaluated before invoking the function. However, the semantics of such expressions remain
to be defined. Conventional eval-apply algorithms may not be consistent with the semantics
of existing Iris functions, such as Select.
Thus, in most cases, the task of QT is straightforward. It checks that the actual arguments
of the function call are the same types or subtypes of the corresponding formal arguments
and resolves function name overloading. Then, it retrieves the previously compiled R-tree
for the function, substitutes the actual arguments and returns the resulting R-tree. This is
known as the QT fastpath since it avoids the full compilation process described below.
For those system functions, such as Select, that have expressions as arguments, QT uses the
full request translation process to generate the R-tree. The process consists of three main
steps. First, the F-tree is converted to a canonical form, in which, for example, nested function calls are unnested by introducing auxiliary variables. Type checking is also performed
and the names of overloaded functions are resolved.
The second step converts the canonical F-tree to an unoptimized R-tree. This is a mechanical process in which function calls are replaced by their stored implementations which are,
themselves, R-trees. The resulting R-tree consists of nodes for the relational algebra project,
filter⁴ and cross-product operators, and table nodes which represent scans over stored tables. Joins are specified by placing a filter node above a cross-product node to compare the
columns of the underlying cross-product.⁵ To increase the functionality of the Query Interpreter, there are some additional nodes. A temp-table node creates and, optionally, sorts
a temporary table. An update node modifies an existing table. A sequence node executes
each of its subtrees in turn. A foreign function node invokes the executable code that is
the implementation of a foreign function. The leaves of the R-tree are either table nodes
or foreign function nodes. Note that a table node may have an associated predicate and
projection list to reduce the size of the scan.
The initial R-tree is then checked to ensure that all declared variables are bound to a column
of their declared type. If a variable is not bound, the R-tree is joined with the extension of
the type of the variable. Of course, this is only done for types with finite extensions, e.g.
unbound integer variables are not allowed. Note that foreign functions require that their
input arguments be bound to a value before invocation since the subroutine that implements
them expects to be passed a value. However, stored functions are not so restricted since the
underlying table that implements the function can bind the arguments as well as the results.
Thus, the inverse of stored functions may be computed without deriving a new function by
merely invoking the function with bindings for the result values and leaving the arguments
unbound.
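The idea of computing an inverse by choosing which columns are bound can be sketched in Python. The table contents and the `scan` helper are hypothetical; only Smith's department comes from the earlier example.

```python
# Hypothetical stored table for EmpDept: (argument, result) tuples.
EMP_DEPT = [("Smith", "Toolkit"), ("Jones", "Toolkit"), ("Lee", "Applications")]

def scan(table, arg=None, res=None):
    """Return the tuples matching whichever columns are bound (None = unbound)."""
    return [
        (a, r)
        for a, r in table
        if (arg is None or a == arg) and (res is None or r == res)
    ]

# Ordinary invocation: argument bound, result unbound.
forward = [r for _, r in scan(EMP_DEPT, arg="Smith")]

# Inverse: result bound, argument unbound. No new function is defined.
inverse = [a for a, _ in scan(EMP_DEPT, res="Toolkit")]
```

A foreign function could not be used this way, because its subroutine needs actual argument values before it can run.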
The final and most complex step is to optimize the R-tree. The optimizer is rule-based. Each
rule consists of a test predicate and a transformation routine. If the predicate evaluates to
true, the transformation routine is invoked. Both the predicate and transformation routines
are written as C subroutines that take an R-tree node as an argument. As in [Graefe,87],
the system must be recompiled whenever the rules are modified.
⁴ We use the term filter for the relational algebra select operator to avoid confusion with the Iris Select function.
⁵ Of course, joins are rarely executed this way because the filter predicate is typically pushed down into a table node below the cross-product to produce a nested-loops join.
Rules are organized into rule sets such that each rule set accomplishes a specific task. For
example, one rule set contains all rules concerned with simplifying constant expressions (e.g.
constant propagation and folding). Another rule set reorders the tables in a cross-product
to take advantage of indexes and to ensure that input arguments are bound where necessary.
Optimization is accomplished by applying the rule sets in a specific order over the R-tree.
Within a rule set, QT traverses the tree and applies the rules in order of their declaration.
However, the rule writer may modify the evaluation order in certain ways by setting flags
that, for example, inhibit future firing of a rule on a specific node or reevaluate all rules on
the entire R-tree.
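The pairing of a test predicate with a transformation routine can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the rules operate on toy arithmetic expressions rather than R-tree nodes, the traversal is bottom-up, and all names (`CONSTANT_RULES`, `apply_rule_set`, and the individual rules) are hypothetical.

```python
# A node is a constant (int), a variable name (str), or a tuple (op, left, right).

def is_constant_add(node):
    return (isinstance(node, tuple) and node[0] == "+"
            and all(isinstance(x, int) for x in node[1:]))

def fold_add(node):
    return node[1] + node[2]

def is_add_zero(node):
    return isinstance(node, tuple) and node[0] == "+" and node[2] == 0

def drop_zero(node):
    return node[1]

# A rule set: (test predicate, transformation routine) pairs in declaration order.
CONSTANT_RULES = [(is_constant_add, fold_add), (is_add_zero, drop_zero)]

def apply_rule_set(node, rules):
    """Rewrite children first, then fire rules on this node until none applies."""
    if isinstance(node, tuple):
        node = (node[0],) + tuple(apply_rule_set(c, rules) for c in node[1:])
    fired = True
    while fired:
        fired = False
        for test, transform in rules:
            if test(node):
                node = transform(node)
                fired = True
                break  # restart from the first declared rule
    return node

folded = apply_rule_set(("+", ("+", 1, 2), 4), CONSTANT_RULES)
simplified = apply_rule_set(("+", "x", ("+", 0, 0)), CONSTANT_RULES)
```

Applying several rule sets in a fixed order would amount to calling `apply_rule_set` once per set on the result of the previous call.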
3.4
Query Interpreter
The Query Interpreter, QI, evaluates an R-tree which yields a collection of tuples that become
the result objects for the Iris request expression. It uses the conventional technique of piping
data between parent and child nodes in the execution tree. Each node in the R-tree is
treated as a scan object and must implement three operations: open, next, and close. These
operations may call QI recursively to evaluate a subtree.
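The open/next/close protocol can be sketched in Python as below. This is an illustrative sketch rather than the QI code: the classes `Table`, `Filter` and `CrossProduct`, the `run` helper, and the sample tables are hypothetical. The example also shows a join expressed as a filter node above a cross-product node.

```python
class Table:
    """Leaf node: a scan over a stored table (here, a list of tuples)."""
    def __init__(self, rows):
        self.rows = rows
    def open(self):
        self.pos = 0
    def next(self):
        if self.pos >= len(self.rows):
            return None  # end of scan
        row = self.rows[self.pos]
        self.pos += 1
        return row
    def close(self):
        self.pos = None

class Filter:
    """Passes through only the child tuples that satisfy the predicate."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def open(self):
        self.child.open()
    def next(self):
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None
    def close(self):
        self.child.close()

class CrossProduct:
    """Pairs every left tuple with every right tuple, rescanning the right side."""
    def __init__(self, left, right):
        self.left, self.right = left, right
    def open(self):
        self.left.open()
        self.right.open()
        self.cur = self.left.next()
    def next(self):
        while self.cur is not None:
            r = self.right.next()
            if r is not None:
                return self.cur + r
            self.right.close()   # right side exhausted: rewind it
            self.right.open()
            self.cur = self.left.next()
        return None
    def close(self):
        self.left.close()
        self.right.close()

def run(node):
    """Drive a tree to completion and collect its tuples."""
    node.open()
    out = []
    while (row := node.next()) is not None:
        out.append(row)
    node.close()
    return out

emp_dept = Table([("Smith", "Toolkit"), ("Jones", "Apps")])
dept_site = Table([("Toolkit", "Palo Alto"), ("Apps", "Cupertino")])
join = Filter(CrossProduct(emp_dept, dept_site), lambda r: r[1] == r[2])
joined = run(join)
```

Each parent pulls one tuple at a time from its child, so no intermediate result is materialized unless a node chooses to do so.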
An open operation on a table node creates a Storage Manager scan. An open operation on
a foreign function node may perform some data-dependent initializations. Note that, due
to its potentially large size, the object code for a user foreign function is stored separately
from an R-tree that invokes it. QI uses a dynamic loader to load and link the object code
at run time. The object code for system foreign functions is memory resident because they
are frequently accessed.
A next operation produces the next tuple in the scan produced by an R-tree node. A next
operation on an update node will update the database. However, if the subtree of an update
node references the stored table that is being updated, the update tuples are first spooled
into a temporary table. This prevents cycles in the data pipeline.
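The need for spooling can be seen in a small Python sketch. Here a Python list stands in for a stored table, and both function names and the `limit` safeguard are hypothetical; the sketch only illustrates the general hazard, not the Iris implementation.

```python
def insert_doubled_unsafe(table, limit=10):
    """Scan the table and append to it in the same pipeline.

    The scan sees its own insertions, so without the artificial limit
    it would never terminate.
    """
    count = 0
    for row in table:  # the iterator also reaches rows appended below
        if count >= limit:
            break
        table.append(row * 2)
        count += 1
    return count


def insert_doubled_spooled(table):
    """Spool the update tuples into a temporary table, then apply them."""
    spool = [row * 2 for row in table]  # the scan completes before any update
    table.extend(spool)
    return len(spool)
```

On a three-row table the unsafe version performs as many insertions as the limit allows, whereas the spooled version performs exactly three.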
One consequence of using a scan paradigm for the interpreter is that a stream of objects is
always produced. Normally, the Query Interpreter converts the stream into a bag. However,
this causes a problem when invoking a single-valued function since the caller expects a single
object to be returned rather than a bag. To prevent unexpected bagging, QT returns a
boolean flag as part of compilation that indicates if the R-tree is single-valued. If the flag is
true, QI will not bag the result.
3.5
Object Manager
The Object Manager, OM, is a set of system foreign functions. These functions provide
services that are essential to Iris but whose implementations either cannot be expressed as
stored or derived functions or are more efficiently written as foreign functions.
In the current version of Iris, most of the system procedures are implemented as foreign
functions. One problem with this is that QT cannot optimize calls to these procedures.
Thus, we plan to reimplement many of the Iris system procedures as extended relational
algebra expressions using update and sequence nodes. Then, a complex system procedure
could be implemented as a derived function using a sequence of updates to individual stored
system functions. This work is currently under investigation. It may require the addition of
new relational algebra nodes, e.g. a branching node.
3.6
Cache Manager
The Cache Manager, CM, implements a general-purpose caching facility between the Iris
Kernel and the Storage Manager. An important point is that, since the Query Interpreter
operates on relational algebra trees, CM caches tables not functions. The Cache Manager
maintains two types of table caches: a tuple cache and a predicate cache. The tuple cache is
used to cache tuples from individual tables. A table may have at most one tuple cache. A
tuple cache is accessed via a column of the table and that column must be declared as either
uniquely-valued or many-valued. If a column is declared as many-valued, the cache ensures
that whenever a given value of that column occurs in the cache, all tuples of the table with
the same column value will also occur in the cache. This guarantees that when a cache hit
on a many-valued column occurs, the scan can be entirely satisfied from the cache without
having to invoke the Storage Manager. This enables the effective caching of many-valued
functions.
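The completeness guarantee for many-valued columns can be sketched in Python as below. This is a hedged illustration: the class, its fields and the list standing in for the Storage Manager are assumptions, and cache eviction and update handling are omitted.

```python
class TupleCache:
    """Caches tuples of one table, keyed on one column (hypothetical sketch)."""

    def __init__(self, storage_rows, column):
        self.storage = storage_rows  # stand-in for the Storage Manager table
        self.column = column         # index of the column used for access
        self.entries = {}            # column value -> ALL tuples with that value
        self.storage_scans = 0       # number of times storage had to be scanned

    def lookup(self, value):
        if value not in self.entries:
            # Miss: fetch every tuple with this value, so that a later hit
            # can be answered entirely from the cache.
            self.storage_scans += 1
            self.entries[value] = [
                row for row in self.storage if row[self.column] == value
            ]
        return list(self.entries[value])
```

Loading all tuples for a value on the first miss is what allows a later hit to skip the Storage Manager; caching only some of them would force a storage scan on every lookup to check for the rest.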
A table may have many predicate caches. A predicate cache may be considered the materialization of a table node in an R-tree. It contains tuples from a table that satisfy a
particular predicate. Thus, a predicate cache has an associated predicate and projection list.
A predicate cache is useful in caching intermediate results during R-tree interpretation. Tuple caches are primarily intended to support caching of system tables. However, user tables
may also be cached in this way. Information in the cache is always kept consistent with the
Storage Manager. If an update request is too complicated to preserve cache consistency, the
table will be automatically uncached.
3.7
Storage Manager
The Iris Storage Manager is a conventional relational storage subsystem, namely that of HP-SQL [HP]. HP-SQL's Storage Manager is very similar to System R's RSS [Blasgen,77]. It
was extended to support the generation of unique OIDs. In general, it operates over a single
table per request. Joins and aggregate operations are done outside the Storage Manager.
Tables can be created and dropped at any time. The system supports transactions with
savepoints and restores to savepoints, concurrency control, logging and recovery, archiving,
indexing and buffer management. It provides tuple-at-a-time processing with commands to
retrieve, update, insert and delete tuples. Indexes allow users to access the tuples of a table
in a predefined order. Additionally, a predicate over column values can be defined to qualify
tuples during retrieval. The Storage Manager also provides a scan operation which permits
associative access to a table and returns multiple tuples per request.
3.8
Example: Type Creation
To provide a better understanding of the Iris architecture, we describe the Kernel processing
in creating a new type. The creation of a new type is the task of the system procedure,
TypeCreate, which is implemented as an OM foreign function. To invoke this procedure, a
request expression is built containing a single call node with a function identifier of "TypeCreate" and an argument list of two elements: the name of the new type and its super-types.
The super-type argument is itself a list containing the names or object identifiers of the
super-types of the new type.
The request expression is then passed to an Iris entry point in the Executive. The Executive immediately passes the request expression to the Query Translator for compilation.
QT checks the argument types, retrieves the R-tree implementation for "TypeCreate" and
substitutes the actual arguments for the formal arguments.
Since this request was a simple function call with constant arguments, QT uses its fastpath
and no further optimization of the R-tree is required. In this case, the R-tree is a single
foreign function node that calls the OM foreign function that implements TypeCreate.
QT returns the R-tree to the Executive which then passes the tree to the Query Interpreter
to produce the result values. QI simply invokes the foreign function and returns the result
that is the object identifier of the newly created type. Thus, the real work of type creation
is done in the foreign function.
The TypeCreate foreign function performs the following actions. First, it checks that the
type name is unique and that the supertypes exist. Then, it updates the system functions
for the type metadata. Finally, it creates a typing function for the new type that maintains
the extension of the type.
Function creation is performed by the Iris system procedure, FunctionCreate, that is invoked through a recursive Kernel call. If the function is successfully created, the TypeCreate
foreign function returns the object identifier of the new type object. Otherwise, the type
object is deleted and an error is added to the error buffer.
4
Implementation and Usage Experience
In this section, we describe our experiences in implementing and using Iris. We concentrate
on what we view as the most novel aspects of the system.
4.1
Data Model
A major advantage of the Iris data model is that it provides a good separation among the
concepts of object, type and function. This is reflected in three ways. First, objects may
acquire and lose types, dynamically. Second, functions over a set of types may be created
and destroyed at any time. Third, objects of a given type are not required to participate in
every function defined on that type.
An important consequence of this orthogonality is that an instance of an Iris database,
including its schema, may evolve without affecting existing applications. This has wide
appeal among Iris users. For example, the addition of a new function on an existing type is
transparent to objects of that type and to other functions on that type. Similarly, deleting
a type only affects applications that use functions defined on the type.
As another example, an object retains its identity across type changes. Changing a person
object to a mouse object does not require the object to be deleted and reinserted in the
database. References to that object are still valid after the type change.⁶
Another novel feature of Iris is the use of functions to unify the notions of attribute, relationship and operation. This makes the data model conceptually simpler and simplifies the
Kernel by reducing the number of constructs that must be implemented. In addition, by
separating the declaration of a function from its implementation, Iris provides data independence.
For example, a function that returns the age of a person might change from a stored implementation to a derived implementation, e.g. based on birthdate, without affecting any
application programs. This is another feature that appeals to Iris users since, again, it permits schema evolution. Of course, the same effect could, optionally, be achieved in other
systems by defining views. However, in Iris, data independence exists for all functions as a
consequence of the model. As an aside, we note that it is not always true that applications
are immune to changes in the implementation of a function. This is because different function implementations have subtly different semantics. For example, all stored functions may
be updated whereas no foreign functions can be updated. Thus, an application that updates
the age function might be broken by the previously mentioned implementation change.
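The age example can be made concrete with a small Python sketch. Everything here is an assumption made for illustration: the `Database` class, the `birthyear` function, the fixed reference year 1990, and the choice to reject updates on derived functions.

```python
class Database:
    """Toy function store separating declaration from implementation."""

    def __init__(self):
        self.functions = {}  # function name -> ("stored" | "derived", body)
        self.stored = {}     # (function name, object) -> stored value

    def define_stored(self, name):
        self.functions[name] = ("stored", None)

    def define_derived(self, name, body):
        self.functions[name] = ("derived", body)

    def call(self, name, obj):
        kind, body = self.functions[name]
        if kind == "stored":
            return self.stored.get((name, obj))
        return body(self, obj)  # derived: computed from other functions

    def update(self, name, obj, value):
        kind, _ = self.functions[name]
        if kind != "stored":
            # Assumption of this sketch: only stored functions accept updates.
            raise TypeError(f"{name} is {kind} and cannot be updated here")
        self.stored[(name, obj)] = value
```

A caller that only reads `age` through `call` is unaffected when `age` is redefined with `define_derived` in terms of `birthyear`. A caller that invokes `update` on `age` fails after the change, which is the caveat noted above.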
Finally, we note that modeling the Iris metadata in terms of the Iris data model was positively
received by users. Since there is a common language for accessing user and system data,
the writing of user interfaces is simplified because there is no need to special-case metadata
access. Of course, many database systems have this feature but it was important to retain
it in the Iris model.
The principal extension we plan to make to the Iris data model is to add support for complex
objects. However, complex objects are an interesting problem for the Iris model because,
unlike other object systems, Iris objects have no explicit state; they only have function
values. Thus, an Iris complex object must be identified by a subset of all the functions
that are defined on a particular object type. The major issue, then, is how to identify the
functions that comprise the complex object.
Another problem with the current implementation is the lack of orthogonality for bags and
tuples. For example, tuples may not contain bags or tuples as elements and bags may not
contain bags as elements. This is related to the complex object problem and we expect that
solutions there will apply to this problem.
4.2
Interfaces
Several groups within HP are experimenting with Iris. In general, their experiences have
been quite positive. The OSQL interface has been a valuable tool in introducing users to
Iris. It is a fairly natural interface for those who have been exposed to SQL. Typically, a
new Iris user will experiment with an OSQL schema and then browse the database using
the Graphical Editor to see what was created. The ability to display the type hierarchy and
functions associated with each type was deemed especially helpful.
A good test of the usability of Iris and OSQL is how easily they can be taught to new users.
We noted an interesting learning curve for Iris novices. First-time users seem to have some
initial trouble accepting the model. We speculate that this is due to previous experience in
which the notions of data and type were combined (as in many programming languages and
traditional database systems). Then, things seem to click and new users are rapidly able to
develop schemas and applications.
⁶ Of course, references to the object as a person are automatically removed as part of the type change.
But later, things become more difficult. We believe this is due to the flexibility of the model.
Users get confused as they become aware of more modeling choices. Given that objects
may gain and lose types, novice data modelers are, sometimes, unsure whether to model
something as a type or a function. For example, rather than use a function to return the sex
of a person, one could define two types, male and female, and make all persons instances of
one of those types. However, modeling problems occur in developing schemas in any database
system. Once a certain level of expertise is reached with the primitives of the data model,
the real difficulty shifts from the model to identifying the object types and relationships in
the application.
For new users, we found that OSQL simplified the use of Iris in two ways. First, OSQL
statements can implement higher level operations than those provided by the Iris system
functions and procedures. For example, OSQL provides a single statement that bundles
together the creation of a type and several functions on that type. This was easier for new
users to comprehend and more convenient, in general, than writing separate statements.
Second, OSQL can reduce the complexity in using Iris by hiding some of the nuances of the
data model and limiting the number of options available to users.
The CLI and Kernel interfaces were not used by most novice and casual users. These
interfaces seemed relatively hard for new users, perhaps because they expose too many
choices of the data model. Embedded OSQL is the preferred programmatic interface. It
is interesting to note that some users did use CLI as a base layer to define their own Iris
interface, in effect, implementing their own data model.
We also received feedback from more experienced OSQL users who developed relatively large
OSQL applications. One group reported that, for their application, the OSQL schema was
one third smaller than the equivalent SQL schema. They felt that the primary reason for
this reduction was function inheritance. Inheritance reduced the amount of redundancy that
was present in the relational schema, e.g. a function could be inherited rather than repeating
a foreign key in a table. Also, the OSQL schema and queries were easier to read than their
relational equivalents because OSQL was able to hide some join expressions. For example,
some joins that, in a relational system, would be stated explicitly, could be specified in Iris
through function composition and through the type hierarchy.
4.3
Usage and Performance
To illustrate the functional extensibility of Iris, we wrote a transitive closure function, tc.
The following OSQL statement defines tc as an Iris foreign function whose implementation
is to be found in the file om_tc.o.
create function tc( Function f, Object root, Integer maxDepth ) ->
<Object obj, Integer depth> as link om_tc;
The implementation of the function required about 200 lines of C code. The tc function
takes as its first argument another function that is required to be a unary function with the
same argument and result type. Starting at the root object, it returns all lists <obj, depth>
where depth is the smallest integer less than maxDepth such that obj = f^depth(root), i.e.
obj is reached by applying f to the root depth times. The
data may be cyclic and duplicates are eliminated. A maxDepth of -1 returns all reachable
objects and their shortest distance from the root.
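The behavior described for tc can be sketched as a breadth-first search. The Python below is our own illustration, not the 200-line C implementation. It assumes that `f` returns an iterable of objects and that the root itself is not reported (depths start at 1); the source states neither point.

```python
from collections import deque

def tc(f, root, max_depth):
    """Return (obj, depth) pairs for objects reachable from root via f.

    depth is the shortest distance from root and must be less than
    max_depth; a max_depth of -1 means no limit. Cycles are tolerated
    and each object is reported once.
    """
    seen = {root}
    result = []
    frontier = deque([(root, 0)])
    while frontier:
        obj, depth = frontier.popleft()
        next_depth = depth + 1
        if max_depth != -1 and next_depth >= max_depth:
            continue  # objects one step further would violate the bound
        for successor in f(obj):
            if successor not in seen:  # duplicate elimination, cycle safety
                seen.add(successor)
                result.append((successor, next_depth))
                frontier.append((successor, next_depth))
    return result
```

Breadth-first order guarantees that the first time an object is reached is also its shortest distance from the root, so no later correction of depths is needed.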
The following OSQL schema uses tc to implement the All_subParts function which retrieves
all sub-parts of a given part.
create function IsPartOf( Object o ) -> Object p as stored;
create function All_subParts( Object p ) -> Object sp as
select sp for each Object sp, Function f, Integer d
where tc(f, p, -1) = <sp, d> and NameOfFunction(f) = "IsPartOf";
We note that the return type of these functions is declared as Object rather than Part as
might be expected. This is necessary to satisfy the Iris type checker because function tc is
declared with a return type of Object. Since the type checker is invoked at compile time
and it does not support late binding (see Section 4.4), the type checker only knows that
objects returned by tc have the declared return type, in this case, Object. Thus, a useful
new feature in the type checker would be support for type coercion.
At this time, there is no commonly accepted benchmark for object oriented database systems. Relational benchmarks such as the Wisconsin benchmark [Bitton,83] are unsuitable for
evaluating object oriented database systems. However, since that benchmark is well known,
we ported the benchmark to Iris. We emphasize that our results were obtained on the new
Iris architecture which has not been tuned and needs more sophisticated query optimization.
For example, we did not take advantage of the Iris cache for these queries and we have yet
to implement alternative join strategies.
The benchmark was done on an HP 9000/370 (a 68030-CPU machine) running the HP-UX
operating system (see Section 4.9). The schema was translated to OSQL by implementing
each table as a stored function with indexes on the key fields. Each query was implemented
as an OSQL derived function. The queries were executed by invoking the derived function
through the kernel interface and discarding the results (rather than displaying them or storing
them back into the database). We note that Iris does not currently use clustering indexes.
In general, clustering on a user-supplied key is not done in Iris since most Iris tables use an
object identifier. Below, we report the user and system times for a subset of the Wisconsin
benchmark queries. On a lightly loaded system, the sum of user and system times is close
to elapsed wall-clock time.
• (Q2) Select all columns from a ten-thousand tuple relation, 10 percent selectivity, no
index: user time, 11.66 seconds, system time, 2.46 seconds.
• (Q3) Select all columns from a ten-thousand tuple relation, 1 percent selectivity, unique
index: user time, 1.36 seconds, system time, 0.01 seconds.
• (Q15) Select all columns from two ten-thousand tuple relations, join on unique key
column, 10 percent selectivity on join column for one relation: user time, 48.16 seconds,
system time, 3.06 seconds.
• (Q19) Project on 6 non-key columns of a one thousand tuple relation and eliminate
duplicates: user time, 62.08 seconds, system time, 0.88 seconds.
• (Q26) Append tuples to a ten thousand tuple relation: (1 tuple) user time, 0.66, system
time, 0.06, (100 tuples) user time, 5.34, system time, 0.40.
These numbers indicate that Iris is CPU-bound. They also show that Iris performs
best when it is able to push down high-level operations into the Storage Manager. For
example, both Q2 and Q19 do full relation scans. The difference is that the scan is completely
contained within the Storage Manager in Q2. In Q19, the tuples are first extracted from
the Storage Manager to collect the columns and then reinserted into a temporary Storage
Manager table in order to eliminate duplicates. The join query uses a nested-loops algorithm
which requires a new Storage Manager scan for each tuple of the outer relation. Finally, the
update operations (Q26) show a high initial overhead. But, it does relatively better when
insertions are streamed together (1 tuple vs. 100 tuples).
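To make the cost of the join strategy concrete, the following Python sketch counts how many scans a nested-loops join opens on the inner table. The `Table` class and its scan counter are hypothetical stand-ins for the Storage Manager, intended only to illustrate the access pattern described above.

```python
class Table:
    """In-memory stand-in for a stored table that counts opened scans."""

    def __init__(self, rows):
        self.rows = rows
        self.scans_opened = 0

    def scan(self, predicate=lambda row: True):
        self.scans_opened += 1
        return [row for row in self.rows if predicate(row)]


def nested_loops_join(outer, inner, outer_key, inner_key):
    """Join two tables, opening one inner scan per outer tuple."""
    result = []
    for o in outer.scan():
        # A fresh scan of the inner table, qualified on the join column.
        for i in inner.scan(lambda row: row[inner_key] == o[outer_key]):
            result.append(o + i)
    return result
```

With a ten-thousand tuple outer relation this pattern opens ten thousand inner scans, which is why caching the inner table or using an alternative join strategy could be expected to help.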
4.4
Rule-based Query Translator
As discussed in Section 3.3, the Iris Query Translator contains a rule-based optimizer. The
promise of rules and rule-sets is that the query optimizer is easy to maintain and simple to
modify because the rules can be manipulated independently. In our experience, that was
generally true, i.e. when the rules in a rule set were independent, adding or modifying rules
was relatively simple. For example, it was easy to modify a rule set containing rules for
simplifying algebraic expressions (e.g. constant propagation and folding). In this rule set,
most rules could be fired independently of other rules.
However, most rule sets were difficult to modify because of the inter-dependencies among
the rules. In addition, the rules within a rule set can be applied in almost any order so it is
hard to understand the effect of their inter-dependencies. In practice, it was often easier to
add a new rule set rather than modify an existing rule set. This is because the dependencies
among the rule sets were well understood since the rule sets were applied sequentially.
A second difficulty in modifying the rule base occurred whenever a new operator or data node
type was added to the set of R-tree nodes. Since Iris is an evolving system, the operators
and data types of the R-tree occasionally change. However, many of the rules were written
with case statements based on the node type. Thus, adding or removing an operator or leaf
node required modifying the case statements in all such rules. This was not difficult but it
was tedious.
The performance of the optimizer was adequate for our purposes. Clearly, it would have
run faster written as straight-line code. But, the flexibility was much more important than
incrementally better performance. Notice, we do not claim that the operation of rule-based
optimizers is easy to understand nor that they are easy to debug. For debugging, the usual
tactic was to trace the flow of the optimizer through each rule and rule set.
It is interesting to note that most of the rules were concerned with algebraic and topological
transformations of the tree, e.g. removing unnecessary joins, pushing down projects and
filters. The logic for doing plan optimization, such as choosing alternative access paths or
ordering the operands of a cross-product, was concentrated in a small number of rules. Thus,
we were not able to take full advantage of the rule system to experiment with alternative
optimization strategies. Typically what we did was, in effect, replace one optimization
algorithm encapsulated in a rule by another rule that encapsulated a different algorithm.
But overall, the use of a rule-based optimizer was a good choice for Iris. It was flexible enough
to support the addition of new operators and nodes which was a primary consideration in our
prototype system. It was disappointing to discover that, as with conventional optimizers,
modification of a rule-based optimizer does require a guru.
Finally, we note that the current version of Iris does not support late binding for functions
and object types. On the one hand, this is one of the strengths of Iris since it means we can
generate pipelined relational algebra expressions for our execution trees. Thus, with tuning
and an equivalent optimizer, the performance of our execution engine should be as good as
a relational system. However, there are situations where late binding is necessary. Thus,
a future area of investigation is how to support late binding while retaining the efficient
pipelined execution tree.
4.5
Foreign Functions
Foreign functions have been very successful in extending the capabilities of Iris beyond that
offered by its interface. They have been used to implement the arithmetic operators, to call
SQL and other database systems remotely and to implement aggregate operators. They are
also used in the Object Manager to implement the procedures of the data model, e.g. type
and function creation, object deletion. Foreign functions permit Iris to serve as an integrator
of data and applications by providing access to external services, applications and databases.
Users see this as a major benefit.
However, there are a few problems with the current implementation of foreign functions.
First, they are a security risk. Second, they are difficult to debug. Third, to the naive or
casual user, they may seem difficult to write. Foreign functions pose a security problem
because they execute without protection in the same address space as the Iris Kernel. There
is no good solution to this problem without operating system or hardware support.
The debugging problem exists because the symbolic debugger cannot be used on code that
is dynamically loaded. We have experimented with executing foreign functions in a separate
process. This permits use of the symbolic debugger and also solves the security problem.
This is adequate for development, testing and integration. However, it is not a general
solution due to the overhead of the process creation and remote procedure calls.
The difficulty in writing foreign functions is a consequence of the writer having to adhere to
the interface conventions and being exposed to Iris internal data structures. These problems
can be partially alleviated through the use of utility subroutines to hide the internal details
and the ability to use the embedded OSQL or CLI interfaces in foreign functions.
4.6
Recursive Kernel Calls
The ability of the Iris Kernel to call itself is an unusual and powerful aspect of the implementation. This feature is heavily used in both the Query Translator and Object Manager
modules. It also provides a degree of function independence in that we may change the
implementation of system procedures and functions without affecting the callers. Thus, the
change of a system procedure, such as TypeCreate, from a foreign function to a derived
function would be transparent to the rest of the Kernel.
There are two questions raised by this recursive architecture. The first issue is performance.
This is addressed in the next section. The second issue is the bootstrapping problem, i.e.
if Iris is implemented in terms of itself, how does it get started? Rather than describe the
solution in detail, suffice it to say that certain system functions are treated as special cases
by the Query Translator and the recursive call is avoided. For example, the system function
that retrieves the R-tree implementation for a function must be special-cased to avoid infinite
recursion. The number of these special cases is on the order of a dozen and is not expected
to grow significantly. So, we do not feel it is an indictment of the architecture.
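The special-casing idea can be sketched in Python as follows. The `Kernel` class, the function name `FunctionImplementation` and the catalog layout are hypothetical; the sketch only illustrates how a hard-wired case stops the recursion, not how Iris actually does it.

```python
class Kernel:
    # Functions the translator resolves directly instead of through a
    # recursive Kernel call (hypothetical name).
    SPECIAL_CASES = {"FunctionImplementation"}

    def __init__(self, catalog):
        self.catalog = catalog      # function name -> implementation descriptor
        self.recursive_calls = 0

    def translate(self, name):
        """Return a callable standing in for the compiled R-tree of a function."""
        if name in self.SPECIAL_CASES:
            # Hard-wired plan: read the catalog directly, no recursion.
            return lambda arg: self.catalog[arg]
        # General case: ask the Kernel itself for the implementation.
        self.recursive_calls += 1
        descriptor = self.call("FunctionImplementation", name)
        return lambda arg: (descriptor, arg)

    def call(self, name, arg):
        return self.translate(name)(arg)
```

If `FunctionImplementation` were not special-cased, translating it would require calling it, which would require translating it again, and so on without end.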
4.7
Iris Cache
The performance of the current architecture is very dependent on its cache. In this architecture, metadata is accessed by invoking system functions through recursive Kernel calls.
Typically, the system function is implemented as a table node that accesses one of the stored
tables containing the metadata. Most system tables are cached by CM which prevents a
(high overhead) Storage Manager call. This differs from the first Iris prototype which contained a special-purpose metadata cache. That cache was accessed through direct subroutine
calls.
At first glance, it appears we have accomplished little except to significantly increase the
path length for metadata access. However, the special-purpose cache in the first prototype
was limited in that it could not cache many-valued system functions. More importantly,
the implementation of that cache was tailored to the structure of the metadata. Thus,
any change to the structure of the metadata required rewriting portions of the cache code.
Finally, there were two paths to the metadata, one for the Kernel and another for clients.
This complicated the coding of system procedures and meant that client requests could not
take advantage of the cache.
The cache in the new architecture is general-purpose and makes cache access transparent
to the execution tree. Thus, although the path length is longer for compile-time metadata
access, we have a net gain in performance because the cache has wider applicability. In
addition, it is much easier to modify the metadata structure (e.g. to add new columns or
tables) since the cache code is unaffected. And, in a sense, the compilation time for a request
is not a critical factor so long as it is within reason. Execution time is much more important.
We expect the general-purpose cache to be a benefit there since it is accessible at run time for
both user and system functions. For example, we can cache the inner tables of nested-loops
joins.
Finally, note that retrieval performance could be significantly improved if Iris cached function
values rather than tables since a function call could be directly evaluated without compiling
it into a relational algebra tree. However, updates to stored tables pose a challenge since a
single tuple in a stored table may contain values for many functions (e.g. when functions
are horizontally clustered together in one table; see [Lyngbaek,87]). The individual function
caches would need to be located and updated in an efficient manner. This problem is
currently under investigation.
4.8
Storage Manager
An issue in the early implementation phases of Iris was whether to use a low level or high
level storage manager. We define a low level storage manager as an RSS-like layer, i.e. one
providing intra-table operations, and a high level storage manager as an RDS-like layer, i.e.
one providing inter-table operations, such as joins. The decision to go with the low level
storage manager was for performance reasons. In this way, Iris gained more control over
storage structures and was able to implement its own optimization strategies. It was also able to
take advantage of storage manager features that might be hidden by higher layers, such as
tuple identifiers and links between tuples. In addition, it meant there were shorter path lengths
and fewer translations between Iris and the stored database.
The performance of Iris could be improved through better integration with the Storage
Manager. Iris uses an off-the-shelf Storage Manager that was not specifically tailored to
Iris. This results in duplicated services such as caching and forces translations between the
two different data storage formats. A related problem is that Iris metadata receives no
special treatment from the Storage Manager. Since metadata is often a database hotspot,
concurrency control problems may occur when one Iris client updates the metadata. This
effectively locks the metadata and other clients are forced to wait for the updater to finish.
Iris would benefit from some added functionality in the Storage Manager. For example, the
Storage Manager does not support the complete set of Iris base data types. It provides little
control over data placement (e.g. for locality of reference) and only B-tree access methods
are supported. Also, the Storage Manager only supports flat (first-normal form) tuples and
there is a limit of 4 kilobytes on the tuple size.
However, use of the Storage Manager was an excellent choice for the Iris prototype. The
performance was adequate, in general, and it allowed us to concentrate our efforts on the
data model and interfaces. In fact, rather than investigate tighter coupling with the Storage
Manager, a research issue in Iris is Storage Manager independence. We would like to define
an abstract model of a storage manager interface and then efficiently map that model onto
different real storage managers. The value of this is twofold. First, it simplifies the porting
of Iris onto different storage subsystems. Second, and more importantly, it facilitates remote
database access in that Iris would be able to use two different storage managers, simultaneously. An initial step along the lines of an abstract storage manager interface is provided by
the Cache Manager interface.
4.9
Statistics and Development Environment
The Iris system is a research effort involving approximately twelve people over a period of
four years. Roughly two thirds of the effort was on the Iris data model and Kernel, the
remaining third was on the Iris interfaces. Two prototypes have been built, both in C. The
second (and current) prototype was built to increase the flexibility of the Iris Kernel and to
permit recursive calls.
The Iris Kernel (excluding the Storage Manager) consists of approximately 85K lines of C
code. The operating system, HP-UX, is HP's version of Unix System V with extensions
from Berkeley Unix. Iris runs on two hardware platforms: the HP Series 300 workstation
(Motorola 68000 CPU) and the HP Series 800 workstation (HP Precision Architecture -
RISC).
To coordinate the activities of the many people working on Iris, RCS [Tichy,82] was used to
record the change history for individual source files. A validation test suite was automatically
executed twice daily on the most recently checked-in version of Iris. This automated testing
system worked very well. In fact, it worked almost too well in that we tended to rely on
it as our correctness criterion. Thus, a broken feature might go undetected for weeks if the
feature was not tested by the validation suite.
5
Conclusions and Future Work
The Iris System facilitates the prototyping of new model semantics and functionalities. We
expect a lot of experimentation in such areas as authorization, complex objects, versioning,
overloading, late binding, monitors and triggers. In addition, two main research directions
have been identified. One is to extend and generalize the Iris model and language to allow
most application code to be written in OSQL. That way a large amount of code sharing
and reuse can be obtained. Some of the interesting research topics include optimization
of database programs with side-effects, optimization of database requests against multiple,
possibly different storage managers, and using the language to support declarative integrity
constraints.
The other research direction is to increase the functionality and power of the Iris client interface, i.e. the parts of Iris running on a client machine. In order to better utilize local
resources, the Iris interpreter must dynamically decide whether to interpret a given request
locally (using locally cached data and possibly extending the cache in the process), send
the entire request to the server for interpretation, or possibly split the request into several
parts some of which are evaluated locally and others remotely. Some of the interesting research topics include copy management (for example by using database monitors [Risch,89]),
checkin/checkout mechanisms, and data clustering techniques.
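The three-way choice described above (interpret locally, send the whole request to the server, or split it) can be sketched as a small decision function. This is a minimal sketch under a simplifying assumption: a request is already divided into parts, and each part records only whether its data is in the local cache. The names EvalPlan, RequestPart and choose_plan are hypothetical, and a real interpreter would also have to weigh costs such as the option of extending the cache.

```c
#include <assert.h>
#include <stddef.h>

/* Possible evaluation plans for a request (hypothetical names). */
typedef enum { EVAL_LOCAL, EVAL_REMOTE, EVAL_SPLIT } EvalPlan;

/* One part of a request; the single flag is an assumption of this sketch. */
typedef struct RequestPart {
    int data_cached;   /* nonzero if the needed data is cached locally */
} RequestPart;

/* Choose a plan from cache residency alone:
 *   every part cached (or an empty request) -> evaluate locally
 *   no part cached                          -> send the entire request to the server
 *   otherwise                               -> split between client and server */
EvalPlan choose_plan(const RequestPart *parts, size_t n) {
    size_t cached = 0;
    size_t i;
    assert(parts != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (parts[i].data_cached) cached++;
    }
    if (cached == n) return EVAL_LOCAL;   /* also covers n == 0 */
    if (cached == 0) return EVAL_REMOTE;
    return EVAL_SPLIT;
}
```

Treating an empty request as local is an arbitrary choice made to keep the function total.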
6
Acknowledgements
Marie-Anne Neimat designed and implemented the general-purpose cache manager. Jurgen
Annevelink and Jim Davis were instrumental in converting Iris to the new architecture.
Ming-Chien Shan and Tim Connors also helped in the conversion.
Many members of the Database Technology Department at HP Laboratories contributed to
and influenced the Iris project: Jurgen Annevelink, Tim Connors, Jim Davis, Dan Fishman,
Charles Hoch, Bill Kent, Marie-Anne Neimat, Tore Risch, and Ming-Chien Shan. Nigel
Derrett, Tom Ryan, David Beech and Brom Mahbod also made substantial contributions to
the first Iris prototype.
The authors wish to thank Jurgen Annevelink, Dan Fishman, Stefan Gower, Marie-Anne
Neimat, Emmanuel Onuegbe, Katie Rotzell and the reviewers for helpful comments on this
paper.
7
References
[Blasgen,77]
M. W. Blasgen and K. P. Eswaran. Storage and Access in Relational
Databases. IBM Systems Journal, 16(4):363-377, 1977.
[Beech,88]
D. Beech and B. Mahbod. Generalized Version Control in an Object-Oriented Database. In Proceedings of IEEE Data Engineering Conference,
February 1988.
[Bitton,83]
D. Bitton, D. J. DeWitt and C. Turbyfill. Benchmarking Database Systems
- A Systematic Approach. In Proceedings of the 1983 VLDB Conference,
October 1983.
[Buneman,82]
P. Buneman, R. E. Frankel and R. Nikhil. An Implementation Technique
for Database Query Languages. ACM Transactions on Database Systems,
7(2), June 1982.
[Connors,88]
T. Connors and P. Lyngbaek. Providing Uniform Access to Heterogeneous Information Bases. In Klaus Dittrich, editor, Lecture Notes in
Computer Science 334, Advances in Object-Oriented Database Systems.
Springer-Verlag, September 1988.
[Derrett,89]
N. Derrett and M. C. Shan. Rule-Based Query Optimization in Iris. In Proceedings of ACM Annual Computer Science Conference, Louisville, Kentucky, February 1989.
[Fishman,87]
D. H. Fishman et al. Iris: An Object-Oriented Database Management
System. ACM Transactions on Office Information Systems, 5(1), January
1987.
[Fishman,89]
D. H. Fishman et al. Overview of the Iris DBMS. In W. Kim, F. H.
Lochovsky, editors, Object-Oriented Concepts, Databases, and Applications.
ACM Press, New York, N.Y., 1989.
[Graefe,87]
G. Graefe and D. J. DeWitt. The EXODUS Optimizer Generator. In Proceedings of ACM-SIGMOD International Conference on Management of
Data, pages 160-172, 1987.
[HP]
Hewlett-Packard Company. HP-SQL Reference Manual. Part Number
36217-90001.
[Heiler,88]
S. Heiler and S. Zdonik. Views, Data Abstraction, and Inheritance in
the FUGUE Data Model. In Klaus Dittrich, editor, Lecture Notes in
Computer Science 334, Advances in Object-Oriented Database Systems.
Springer-Verlag, September 1988.
[Lyngbaek,86]
P. Lyngbaek and W. Kent. A Data Modeling Methodology for the Design
and Implementation of Information Systems. In Proceedings of 1986 International Workshop on Object-Oriented Database Systems, Pacific Grove,
California, September 1986.
[Lyngbaek,87]
P. Lyngbaek and V. Vianu. Mapping a Semantic Data Model to the Relational Model. In Proceedings of ACM-SIGMOD International Conference
on Management of Data, San Francisco, California, May 1987.
[Mylopoulos,80]
J. Mylopoulos, P. A. Bernstein, and H. K. T. Wong. A Language Facility for Designing Database-Intensive Applications. ACM Transactions on
Database Systems, 5(2), June 1980.
[Manola,86]
F. Manola and U. Dayal. PDM: An Object-Oriented Data Model. In
Proceedings of 1986 International Workshop on Object-Oriented Database
Systems, Pacific Grove, California, September 1986.
[Risch,89]
T. Risch. Monitoring Database Objects. In Proceedings of the 1989 VLDB
Conference, Amsterdam, The Netherlands, August 1989.
[Shipman,81]
D. Shipman. The Functional Data Model and the Data Language DAPLEX.
ACM Transactions on Database Systems, 6(1), September 1981.
[Tichy,82]
W. F. Tichy. Revision Control System. In Proceedings of the IEEE 6th
International Conference on Software Engineering.