Database Management 2
Database Management 2
Relational Database
Contents
1. Introduction
2. Structure of Relational Databases
3. Database Schema
4. Keys
5. Schema Diagrams
6. Relational Query Languages
7. Relational Operations
8. Relational Algebra
9. Properties of Relations
10. Integrity Constraints
11. View
1
DBMS Chapter Two IS304
1. Introduction
o The relational model was first proposed by E. F. Codd in his seminal paper ‘A
relational model of data for large shared data banks’ (Codd, 1970).
o The relational model’s objectives were specified as follows:
To allow a high degree of data independence. Application programs must
not be affected by modifications to the internal data representation,
particularly by changes to file organizations, record orderings, or access
paths.
To provide substantial grounds for dealing with data semantics, consistency,
and redundancy problems. In particular, Codd’s paper introduced the
concept of normalized relations, that is, relations that have no repeating
groups.
To enable the expansion of set-oriented data manipulation languages.
2
DBMS Chapter Two IS304
o Thus, in the relational model the term relation is used to refer to a table, while
the term tuple is used to refer to a row. Similarly, the term attribute refers to a
column of a table.
o For each attribute of a relation, there is a set of permitted values, called the
domain of that attribute. Thus, the domain of the salary attribute of the
instructor relation is the set of all possible salary values.
o We require that, for all relations r, the domains of all attributes of r be atomic.
o A domain is atomic if elements of the domain are considered to be indivisible
units.
o For example, suppose the table instructor had an attribute age, the instructor’s
age can be considered as atomic since we cannot divide it to sub values.
o The important issue is not what the domain itself is, but rather how we use
domain elements in our database. Suppose now that the phone number attribute
stores a single phone number. Even then, if we split the value from the phone
number attribute into a country code, an area code and a local number, we
would be treating it as a non-atomic value. If we treat each phone number as a
single indivisible unit, then the attribute phone number would have an atomic
domain.
o The degree of a relation is the number of attributes it contains while the
cardinality of a relation is the number of tuples it contains.
o The null value is a special value that signifies that the value is unknown or does
not exist. For example, suppose as before that we include the attribute phone
number in the student relation. It may be that a student does not have a phone
number at all, or that the telephone number is unlisted. We would then have to
use the null value to signify that the value is unknown or does not exist.
o We shall see later that null values cause a number of difficulties when we
access or update the database, and thus should be eliminated if at all possible.
3
DBMS Chapter Two IS304
3. Database Schema
4
DBMS Chapter Two IS304
4. Keys
o We must have a way to specify how tuples within a given relation are
distinguished. This is expressed in terms of their attributes. That is, the values
of the attribute values of a tuple must be such that they can uniquely identify the
tuple.
o In other words, no two tuples in a relation are allowed to have exactly the same
value for all attributes. A superkey is a set of one or more attributes that, taken
collectively, allow us to identify uniquely a tuple in the relation.
o For example, the ID attribute of the relation Studennt is sufficient to distinguish
one instructor tuple from another. Thus, ID is a superkey.
o The name attribute of student, on the other hand, is not a superkey, because
several instructors might have the same name.
o Formally, let R denote the set of attributes in the schema of relation r. If we say
that a subset K of R is a superkey for r, we are restricting consideration to
instances of relations r in which no two distinct tuples have the same values on
all attributes in K. That is, if t1 and t2 are in r and t1≠ t2, then t1. K≠ t2. K.
o A superkey may contain extraneous attributes. For example, the combination of
ID and name is a superkey for the relation instructor. If K is a superkey, then so
is any superset of K. We are often interested in superkeys for which no proper
subset is a superkey. Such minimal superkeys are called candidate keys.
o It is possible that several distinct sets of attributes could serve as a candidate
key.
5
DBMS Chapter Two IS304
o We shall use the term primary key to denote a candidate key that is chosen by
the database designer as the principal means of identifying tuples within a
relation. A key (whether primary, candidate, or super) is a property of the entire
relation, rather than of the individual tuples.
o Any two individual tuples in the relation are prohibited from having the same
value on the key attributes at the same time. The designation of a key represents
a constraint in the real-world enterprise being modeled.
o Primary keys must be chosen with care. As we noted, the name of a person is
obviously not sufficient, because there may be many people with the same
name.
o The primary key should be chosen such that its attribute values are never, or
very rarely, changed.
o Example of primary to foreign key.
6
DBMS Chapter Two IS304
5. Schema Diagram
o A database schema, along with primary key and foreign key dependencies, can
be depicted by schema diagrams. The schema diagram for a university
organization shown in the figure below.
o Each relation appears as a box, with the relation name at the top in blue, and the
attributes listed inside the box. Primary key attributes are shown underlined.
Foreign key dependencies appear as arrows from the foreign key attributes of
the referencing relation to the primary key of the referenced relation.
7
DBMS Chapter Two IS304
7. Relational Operations
o All relational query languages provide a set of operations that can be applied to
either a single relation or a pair of relations. These operations have the nice and
desired property that their result is always a single relation.
o This property allows one to combine several of these operations in a modular
way. Specifically, since the result of a relational query is itself a relation,
relational operations can be applied to the results of queries as well as to the
given set of relations.
8
DBMS Chapter Two IS304
o The union operation performs a set union of two “similarly structured” tables
(say a table of all graduate students and a table of all undergraduate students).
For example, one can obtain the set of all students in a department. Other set
operations, such as intersection and set difference can be performed as well.
8. Relational Algebra
9
DBMS Chapter Two IS304
9. Properties of Relations
1) The relation has a name that is distinct from all other relation names in the
relational schema;
2) Each cell of the relation contains exactly one atomic (single) value;
3) Each attribute has a distinct name;
4) The values of an attribute are all from the same domain;
5) Each tuple is distinct; there are no duplicate tuples;
6) The order of attributes has no significance;
7) The order of tuples has no significance, theoretically. (However, in practice,
the order may affect the efficiency of accessing tuples.)
10
DBMS Chapter Two IS304
10.1Null
o A null can be taken to mean the logical value ‘unknown’. It can mean that a
value is not applicable to a particular tuple, or it could merely mean that no
value has yet been supplied. Nulls are a way to deal with incomplete or
exceptional data.
o However, a null is not the same as a zero numeric value or a text string filled
with spaces; zeros and spaces are values, but a null represents the absence of
a value. Therefore, nulls should be treated differently from other values.
o Some authors use the term ‘null value’, however as a null is not a value but
represents the absence of a value, the term ‘null value’ is deprecated.
11. View
12
DBMS Chapter Two IS304
o Views are dynamic, meaning that changes made to the base relations that affect
the view are immediately reflected in the view.
o When users make permitted changes to the view, these changes are made to the
underlying relations. In this section, we describe the purpose of views and
briefly examine restrictions that apply to updates made through views.
13
DBMS Chapter Two IS304
14