Week 3 - Relational Database Model
Tables and Their Characteristics
The logical view of the relational database is facilitated by the
creation of data relationships based on a logical construct known as a
relation.
Because a relation is a mathematical construct, end users find it much
easier to think of a relation as a table.
A table is perceived as a two-dimensional structure composed of
rows and columns. The terms table and relation are sometimes used interchangeably.
A table contains a group of related entity occurrences, that is, an entity
set.
For example, a STUDENT table contains a collection of entity
occurrences, each representing a student.
For that reason, the terms entity set and table are often used
interchangeably.
Characteristics of a Relational Table
The STUDENT table shown in the figure has 8 rows (tuples) and 12 columns (attributes).
Since the STU_GPA values are limited to the range 0–4, inclusive, the domain is [0,4].
STU_NUM (the student number) is the primary key.
Keys
Used to:
Ensure that each row in a table is uniquely identifiable
Establish relationships among tables and to ensure the
integrity of the data
Consist of one or more attributes that determine other
attributes.
For example, an invoice number identifies all the
invoice attributes, such as the invoice date and the
customer name.
Primary key (PK)
Attribute or combination of attributes that uniquely identifies any
given row
In the previous example, STU_NUM (the student number) is the
primary key.
Using the data in the previous figure, observe that a student’s last
name (STU_LNAME) would not be a good primary key
because several students have the last name of Smith.
Even the combination of the last name and first name
(STU_FNAME) would not be an appropriate primary key
because more than one student is named John Smith.
Determination
The role of a key is based on the concept of determination.
Determination is the state in which knowing the value of one attribute
makes it possible to determine the value of another.
You are familiar with the formula revenue − cost = profit.
This is a form of determination, because if you are given the revenue
and the cost, you can determine the profit.
Given profit and revenue, you can determine the cost. Given any two
values, you can determine the third.
If you are given a value for STU_NUM, then you can determine the
value for STU_LNAME because one and only one value of
STU_LNAME is associated with any given value of STU_NUM.
Determination is the basis for establishing the role of a key.
Determination is based on the relationships among the attributes.
Dependencies
Functional dependence: Value of one or more attributes
determines the value of one or more other attributes.
Determinant: Attribute whose value determines another
Dependent: Attribute whose value is determined by the other
attribute
The standard notation for representing the relationship between
STU_NUM and STU_LNAME is as follows:
STU_NUM → STU_LNAME
STU_NUM → (STU_LNAME, STU_FNAME, STU_GPA)
(STU_FNAME, STU_LNAME, STU_INIT, STU_PHONE) → (STU_DOB, STU_HRS, STU_GPA)
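A functional dependency can be tested against sample data. The following is a minimal sketch in Python, not part of the course material: the helper name `determines` and the sample rows are hypothetical and are not the data from the figure. Sample data can only refute a dependency; it cannot prove that the dependency holds for every possible row.

```python
def determines(rows, determinant, dependent):
    """Return True if, in these rows, equal determinant values
    always come with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first dependent value seen for this key
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (not the data from the figure)
students = [
    {"STU_NUM": 1, "STU_LNAME": "Smith", "STU_FNAME": "John", "STU_GPA": 3.1},
    {"STU_NUM": 2, "STU_LNAME": "Smith", "STU_FNAME": "John", "STU_GPA": 2.4},
    {"STU_NUM": 3, "STU_LNAME": "Bowser", "STU_FNAME": "Anne", "STU_GPA": 3.8},
]

# STU_NUM -> (STU_LNAME, STU_FNAME, STU_GPA) holds in this sample
num_determines_rest = determines(
    students, ["STU_NUM"], ["STU_LNAME", "STU_FNAME", "STU_GPA"]
)
# STU_LNAME -> STU_GPA fails: two Smiths have different GPAs
lname_determines_gpa = determines(students, ["STU_LNAME"], ["STU_GPA"])
print(num_determines_rest, lname_determines_gpa)
```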
Full functional dependence: Entire collection of
attributes in the determinant is necessary for the relationship.
Types of Keys
Keys are the determinants in functional dependencies.
Composite key: Key that is composed of more than one
attribute.
Key attribute: Attribute that is a part of a key.
A superkey is a key that can uniquely identify any row
in the table.
In other words, a superkey functionally determines every
attribute in the row.
In the STUDENT table, STU_NUM is a superkey, as are
the composite keys (STU_NUM, STU_LNAME),
(STU_NUM, STU_LNAME, STU_INIT) and
(STU_LNAME, STU_FNAME, STU_INIT,
STU_PHONE).
One specific type of superkey is called a candidate key.
A candidate key is a minimal superkey, that is, a superkey
without any unnecessary attributes.
A candidate key is based on a full functional dependency.
For example, STU_NUM would be a candidate key, as
would (STU_LNAME, STU_FNAME, STU_INIT,
STU_PHONE).
On the other hand, (STU_NUM, STU_LNAME) is a
superkey, but it is not a candidate key because
STU_LNAME could be removed, and the key would still
be a superkey.
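The difference between a superkey and a candidate key can be sketched in Python. This is an illustrative sketch under stated assumptions: the helper names `is_superkey` and `is_candidate_key` and the sample rows are hypothetical. Uniqueness in sample data only suggests a key; whether an attribute set really is a key is a design decision about all possible data.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(values)) == len(values)


def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no proper subset is one."""
    if not is_superkey(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_superkey(rows, subset):
                return False  # an attribute could be removed
    return True


# Hypothetical sample rows (not the data from the figure)
students = [
    {"STU_NUM": 1, "STU_LNAME": "Smith", "STU_FNAME": "John"},
    {"STU_NUM": 2, "STU_LNAME": "Smith", "STU_FNAME": "John"},
    {"STU_NUM": 3, "STU_LNAME": "Bowser", "STU_FNAME": "Anne"},
]

print(is_superkey(students, ["STU_NUM", "STU_LNAME"]))       # superkey
print(is_candidate_key(students, ["STU_NUM", "STU_LNAME"]))  # not minimal
print(is_candidate_key(students, ["STU_NUM"]))               # minimal
```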
A table can have many different candidate keys.
If the STUDENT table also included the students’ Social
Security numbers as STU_SSN, then it would appear to
be a candidate key.
Candidate keys are called candidates because they are the
eligible options from which the designer will choose
when selecting the primary key.
The primary key is the candidate key chosen to be the
primary means by which the rows of the table are
uniquely identified.
Entity integrity: Condition in which each row in the
table has its own unique identity.
All the values in the primary key must be unique.
No key attribute in the primary key can contain a null.
Null: Absence of any data value that could represent:
An unknown attribute value
A known, but missing, attribute value
A “not applicable” condition
Nulls can create problems when functions such as
COUNT, AVERAGE, and SUM are used.
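The effect of nulls on aggregate functions can be seen in a small experiment. The sketch below assumes SQLite (through Python's built-in `sqlite3` module) as the DBMS; the table contents are hypothetical. In SQL the average function is spelled AVG. Note that the null GPA is skipped by COUNT(STU_GPA), AVG, and SUM, while COUNT(*) still counts the row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_GPA REAL)")
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?)",
    [(1, 4.0), (2, 2.0), (3, None)],  # None is stored as NULL
)

row = conn.execute(
    "SELECT COUNT(*), COUNT(STU_GPA), AVG(STU_GPA), SUM(STU_GPA) FROM STUDENT"
).fetchone()
print(row)  # (3, 2, 3.0, 6.0)
```

The average is 3.0 rather than 2.0 because the student with the null GPA is left out of the calculation entirely.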
An Example of a Simple Relational Database
A foreign key (FK) is the primary key of one table that has been placed into another table to create a common attribute.
Since the STUDENT table used a proper naming convention, you can identify two foreign keys in the table (DEPT_CODE and PROF_NUM) that imply the existence of two other tables in the database (DEPARTMENT and PROFESSOR) related to STUDENT.
Types of Keys
Just as the primary key has a role in ensuring the
integrity of the database, so does the foreign key.
Foreign keys are used to ensure referential
integrity, the condition in which every reference
to an entity instance by another entity instance is
valid.
In other words, every foreign key entry must
either be null or a valid value in the primary key
of the related table.
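The referential integrity rule can be demonstrated with a small schema. This sketch assumes SQLite through Python's `sqlite3` module; the two-column tables and the department codes are hypothetical simplifications. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE DEPARTMENT (DEPT_CODE TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE STUDENT ("
    " STU_NUM INTEGER PRIMARY KEY,"
    " DEPT_CODE TEXT REFERENCES DEPARTMENT (DEPT_CODE))"
)
conn.execute("INSERT INTO DEPARTMENT VALUES ('BIOL')")
conn.execute("INSERT INTO STUDENT VALUES (1, 'BIOL')")  # valid reference
conn.execute("INSERT INTO STUDENT VALUES (2, NULL)")    # null is allowed

try:
    # 'XXXX' is not a primary key value in DEPARTMENT
    conn.execute("INSERT INTO STUDENT VALUES (3, 'XXXX')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
print(rejected, stored)
```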
A secondary key is defined as a key that is used strictly for
data retrieval purposes.
Suppose that customer data is stored in a CUSTOMER table
in which the customer number is the primary key.
Do you think that most customers will remember their
numbers?
Data retrieval for a customer is easier when the customer’s
last name and phone number are used.
In that case, the primary key is the customer number; the
secondary key is the combination of the customer’s last name
and phone number.
Keep in mind that a secondary key does not necessarily yield
a unique outcome.
For example, a customer’s last name and home telephone
number could easily yield several matches in which one
family lives together and shares a phone line.
A less efficient secondary key would be the combination of
the last name and zip code; this could yield dozens of
matches, which could then be combed for a specific match.
Summary of Relational Database Keys
Integrity Rules
Relational database integrity rules are very important to good database design.
Relational database management systems (RDBMSs) enforce integrity rules automatically, but it is much safer to
make sure your application design obeys the entity and referential integrity rules.
An Illustration of Integrity Rules
Referential integrity.
The CUSTOMER table contains a foreign key, AGENT_CODE, that links entries in the CUSTOMER table to the AGENT table.
Relational Algebra
The data in relational tables is of limited value unless the data can be
manipulated to generate useful information.
Relational algebra defines the theoretical way of manipulating table
contents using relational operators.
We will learn SQL to carry out relational algebra operations.
Relvar: Variable that holds a relation.
The relvar is a container (variable) for holding relation data, not the
relation itself. The data in the table is a relation.
Relational operators have the property of closure; the use of relational
algebra operators on existing relations (tables) produces new relations.
We will focus on the SELECT (or RESTRICT), PROJECT, UNION,
INTERSECT, DIFFERENCE, PRODUCT, and JOIN operators.
Select
SELECT, also known as RESTRICT, is referred to as a unary operator (an operator taking one operand)
because it only uses one table as input.
It yields values for all rows found in the table that satisfy a given condition.
SELECT can be used to list all the rows, or it can yield only rows that match a specified
criterion.
In other words, SELECT yields a horizontal subset of a table.
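SELECT (RESTRICT) can be sketched in Python as a filter over rows. The function name `restrict` and the product rows are hypothetical; each row is modeled as a dictionary.

```python
def restrict(rows, predicate):
    """Relational SELECT (RESTRICT): keep the rows that satisfy predicate."""
    return [row for row in rows if predicate(row)]


# Hypothetical sample rows
products = [
    {"P_CODE": "A1", "PRICE": 1.50},
    {"P_CODE": "B2", "PRICE": 4.99},
    {"P_CODE": "C3", "PRICE": 9.25},
]

cheap = restrict(products, lambda r: r["PRICE"] < 5.00)  # rows matching a criterion
everything = restrict(products, lambda r: True)          # all rows
print([r["P_CODE"] for r in cheap])
```

Every row in the result keeps all of its attributes; only the number of rows changes.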
Project
PROJECT yields all values for selected attributes. It is also a unary operator, accepting only one
table as input.
PROJECT will return only the attributes requested, in the order in which they are requested.
In other words, PROJECT yields a vertical subset of a table.
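PROJECT can be sketched the same way. The function name `project` and the sample rows are hypothetical. Because a relation is a set of rows, this sketch drops duplicate result rows; SQL, by contrast, keeps duplicates unless DISTINCT is specified.

```python
def project(rows, attributes):
    """Relational PROJECT: keep only the named attributes, in the
    requested order, and drop duplicate result rows."""
    result = []
    for row in rows:
        projected = tuple(row[a] for a in attributes)
        if projected not in result:
            result.append(projected)
    return result


# Hypothetical sample rows
products = [
    {"P_CODE": "A1", "P_DESCRIPT": "Flashlight", "PRICE": 5.26},
    {"P_CODE": "B2", "P_DESCRIPT": "Lamp", "PRICE": 25.15},
    {"P_CODE": "C3", "P_DESCRIPT": "Bulb", "PRICE": 5.26},
]

prices = project(products, ["PRICE"])
price_then_code = project(products, ["PRICE", "P_CODE"])
print(prices)
```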
Union
UNION combines all rows from two tables, excluding duplicate rows.
To be used in the UNION, the tables must have the same attribute characteristics; in other
words, the columns and domains must be compatible.
When two or more tables share the same number of columns, and when their corresponding
columns share the same or compatible domains, they are said to be union-compatible.
Intersect
INTERSECT yields only the rows that appear in both tables.
As with UNION, the tables must be union-compatible to yield valid results.
For example, you cannot use INTERSECT if one of the attributes is numeric
and one is character-based.
For the rows to be considered the same in both tables and appear in the result
of the INTERSECT, the entire rows must be exact duplicates.
Difference
DIFFERENCE yields all rows in one table that are not found in the other table; that is, it
subtracts one table from the other.
As with UNION, the tables must be union-compatible to yield valid results.
However, note that subtracting the first table from the second table is not the same as
subtracting the second table from the first table.
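UNION, INTERSECT, and DIFFERENCE correspond closely to Python's set operators. In this sketch each relation is a set of tuples; the two hypothetical tables are union-compatible because both have the columns (P_CODE, PRICE) with matching domains.

```python
# Hypothetical union-compatible relations: rows are (P_CODE, PRICE)
product_1 = {("A1", 1.50), ("B2", 4.99), ("C3", 9.25)}
product_2 = {("B2", 4.99), ("D4", 2.00)}

union = product_1 | product_2        # all rows, duplicates excluded
intersect = product_1 & product_2    # rows that appear in both tables
diff_1_minus_2 = product_1 - product_2
diff_2_minus_1 = product_2 - product_1

print(len(union), intersect)
print(diff_1_minus_2 == diff_2_minus_1)  # DIFFERENCE is not commutative
```

The row ("B2", 4.99) appears in the INTERSECT result only because the entire row is identical in both tables.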
Product
PRODUCT yields all possible pairs of rows from two tables, also known as the Cartesian
product.
Therefore, if one table has 6 rows and the other table has 3 rows, the PRODUCT yields a list
composed of 6 × 3 = 18 rows.
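The row count of a PRODUCT can be checked with `itertools.product` from the Python standard library. The table contents below are hypothetical placeholders; only the row counts (6 and 3) come from the example above.

```python
from itertools import product

table_a = [("A%d" % i,) for i in range(1, 7)]  # 6 one-column rows
table_b = [("B%d" % i,) for i in range(1, 4)]  # 3 one-column rows

# Each result row is one row of table_a followed by one row of table_b
pairs = [a + b for a, b in product(table_a, table_b)]
print(len(pairs))  # 18
```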
Join
Natural Join
A natural join links tables by selecting only the rows with common values in their
common attribute(s).
A natural join is the result of a three-stage process: PRODUCT – SELECT – PROJECT.
First, a PRODUCT of the two tables is created.
Second, a SELECT is performed on the output of Step 1 to yield only the rows for which the
AGENT_CODE values are equal. The common columns are referred to as the join columns.
Third, a PROJECT is performed on the results of Step 2 to yield a single copy of each attribute, thereby
eliminating duplicate columns.
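The three stages of a natural join can be traced in Python. This is a sketch with hypothetical CUSTOMER and AGENT rows (the names, codes, and phone numbers are made up for illustration); only the join column AGENT_CODE comes from the example above.

```python
from itertools import product

# Hypothetical sample rows
customers = [
    {"CUS_CODE": 1, "CUS_LNAME": "Walker", "AGENT_CODE": 231},
    {"CUS_CODE": 2, "CUS_LNAME": "Adares", "AGENT_CODE": 125},
    {"CUS_CODE": 3, "CUS_LNAME": "Rakowski", "AGENT_CODE": 421},
]
agents = [
    {"AGENT_CODE": 125, "AGENT_PHONE": "555-0100"},
    {"AGENT_CODE": 231, "AGENT_PHONE": "555-0101"},
    {"AGENT_CODE": 333, "AGENT_PHONE": "555-0102"},
]

# Step 1: PRODUCT pairs every customer row with every agent row.
step1 = list(product(customers, agents))
# Step 2: SELECT keeps only the pairs whose AGENT_CODE values are equal.
step2 = [(c, a) for c, a in step1 if c["AGENT_CODE"] == a["AGENT_CODE"]]
# Step 3: PROJECT keeps a single copy of each attribute (one AGENT_CODE).
step3 = [{**c, **a} for c, a in step2]

print(len(step1), len(step2))
print(step3)
```

Customer 3 (agent 421) and agent 333 have no match, so neither appears in the result.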
Types of Joins
Inner join: Only returns matched records from the tables that
are being joined
Outer join: Matched pairs are retained, and unmatched values in
the other table are left null.
Left outer join: Yields all of the rows in the first table, including
those that do not have a matching value in the second table
Right outer join: Yields all of the rows in the second table,
including those that do not have matching values in the first table
Left outer join
A left outer join yields all of the rows in the CUSTOMER table,
including those that do not have a matching value in the AGENT table.
Right outer join
A right outer join yields all of the rows in the AGENT table, including
those that do not have matching values in the CUSTOMER table.
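Both outer joins can be sketched with one Python helper. The function name `left_outer_join` and the sample rows are hypothetical; `None` stands in for null, and the sketch assumes the right-hand table has at least one row so that its attribute names can be read.

```python
def left_outer_join(left, right, column):
    """Keep every row of `left`; fill right-side attributes with None
    (standing in for null) when no row of `right` matches on `column`."""
    right_attrs = [a for a in right[0] if a != column]
    result = []
    for l_row in left:
        matches = [r for r in right if r[column] == l_row[column]]
        if matches:
            result.extend({**l_row, **r} for r in matches)
        else:
            result.append({**l_row, **dict.fromkeys(right_attrs)})
    return result


# Hypothetical sample rows
customers = [
    {"CUS_CODE": 1, "CUS_LNAME": "Walker", "AGENT_CODE": 231},
    {"CUS_CODE": 2, "CUS_LNAME": "Adares", "AGENT_CODE": 125},
    {"CUS_CODE": 3, "CUS_LNAME": "Rakowski", "AGENT_CODE": 421},
]
agents = [
    {"AGENT_CODE": 125, "AGENT_PHONE": "555-0100"},
    {"AGENT_CODE": 231, "AGENT_PHONE": "555-0101"},
    {"AGENT_CODE": 333, "AGENT_PHONE": "555-0102"},
]

# Left outer join: all CUSTOMER rows, matched or not
left = left_outer_join(customers, agents, "AGENT_CODE")
# A right outer join is a left outer join with the operands swapped:
# all AGENT rows, matched or not
right = left_outer_join(agents, customers, "AGENT_CODE")

print(left[2])
print(right[2])
```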
Data Dictionary and the System Catalog
The data dictionary provides a detailed description of all tables
in the database created by the user and designer.
Thus, the data dictionary contains at least all of the attribute
names and characteristics for each table in the system.
In short, the data dictionary contains metadata, that is, data about data.
The purpose of this data dictionary is to ensure that all members
of database design and implementation teams use the same table
and attribute names and characteristics.
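Many DBMSs store this metadata in catalog tables that can be queried like any other table. The sketch below assumes SQLite through Python's `sqlite3` module, where the catalog table is named `sqlite_master` and `PRAGMA table_info` lists the columns of a table; the three-column STUDENT table is a hypothetical simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT ("
    " STU_NUM INTEGER PRIMARY KEY,"
    " STU_LNAME TEXT NOT NULL,"
    " STU_GPA REAL)"
)

# sqlite_master is SQLite's catalog table listing schema objects.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk)
# for each column of the named table.
columns = [(name, col_type, bool(pk))
           for _, name, col_type, _, _, pk
           in conn.execute("PRAGMA table_info(STUDENT)")]

print(tables)
for column in columns:
    print(column)  # (attribute name, data type, part of primary key?)
```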
Relationships within the Relational Database
1:M relationship - Norm for relational databases.
1:1 relationship - One entity can be related to only one other
entity, and vice versa; it should be rare.
Many-to-Many (M:N) relationship - Implemented by creating a
new entity in 1:M relationships with the original entities.
The 1:M Relationship
[Figure: the 1:M relationship in an entity relationship model (ERM)]
The 1:M relationship between COURSE and
CLASS might be described this way:
• Each COURSE can have many CLASSes, but
each CLASS references only one COURSE.
• There will be only one row in the COURSE
table for any given row in the CLASS table,
but there can be many rows in the CLASS
table for any given row in the COURSE table.
The 1:1 Relationship
As the 1:1 label implies, one entity in a 1:1 relationship can be related to only
one other entity, and vice versa.
For example, one department chair (a professor) can chair only one
department, and one department can have only one department chair.
The entities PROFESSOR and DEPARTMENT thus exhibit a 1:1 relationship.
Each professor is a Tiny College employee. Therefore, the professor identification is through the EMP_NUM. (However, note that not all employees are professors; there’s another optional relationship.)
The 1:1 “PROFESSOR chairs DEPARTMENT” relationship is implemented by having the EMP_NUM foreign key in the DEPARTMENT table.
Note that the 1:1 relationship is treated as a special case of the 1:M relationship in which the “many” side is restricted to a single occurrence. In this case, DEPARTMENT contains the EMP_NUM as a foreign key to indicate that it is the department that has a chair.
Also note that the PROFESSOR table contains the DEPT_CODE foreign key to implement the 1:M “DEPARTMENT employs PROFESSOR” relationship. This is a good example of how two entities can participate in two (or even more) relationships simultaneously.
The M:N Relationship
A many-to-many (M:N) relationship is not supported directly in the relational
environment.
However, M:N relationships can be implemented by creating a new entity in 1:M
relationships with the original entities.
To explore the M:N relationship, consider a typical college environment.
The ER model in the figure below shows this M:N relationship.
Each CLASS can have many STUDENTs, and each STUDENT can take many CLASSes.
There can be many rows in the CLASS table for any given row in the STUDENT table,
and there can be many rows in the STUDENT table for any given row in the CLASS table.
To examine the M:N relationship more closely, imagine a small college with
two students, each of whom takes three classes.
The table shows the enrollment data for the two students.
Given such a data relationship and the sample data, you could wrongly assume that you
could implement this M:N relationship simply by adding a foreign key in the “many” side
of the relationship that points to the primary key of the related table, as shown in the figure
below. However, the M:N relationship should not be implemented this way, for good reasons:
The tables create many redundancies. For example, note that the STU_NUM values
occur many times in the STUDENT table. In a real-world situation, additional student
attributes such as address, classification, major, and home phone would also be contained
in the STUDENT table, and each of those attribute values would be repeated in each of
the records shown here. Similarly, the CLASS table contains much duplication: each
student taking the class generates a CLASS record. The problem would be even worse if
the CLASS table included such attributes as credit hours and course description.
The 1:M Relationship
The problems inherent in the M:N relationship can easily be avoided by creating a composite entity (also referred to as a bridge entity or an associative entity).
The database designer has two main options when defining a composite table’s primary key: use the combination of those foreign keys or create a new primary key.
Remember that each entity in the ERM is represented by a table.
Therefore, you can create the composite ENROLL table shown in the figure to link the tables CLASS and STUDENT.
In this example, the ENROLL table’s primary key is the combination of its foreign keys CLASS_CODE and STU_NUM.
However, the designer could have decided to create a single-attribute new primary key such as ENROLL_LINE, using a different line value to identify each ENROLL table row uniquely.
Because the ENROLL table in the figure links two tables, STUDENT and CLASS, it is also called a linking table.
In other words, a linking table is the implementation of a composite entity.
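The composite ENROLL table can be sketched as a schema. This example assumes SQLite through Python's `sqlite3` module; the class codes, student names, and the non-key columns are hypothetical, while the composite primary key (CLASS_CODE, STU_NUM) follows the first design option described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE STUDENT (STU_NUM INTEGER PRIMARY KEY, STU_LNAME TEXT);
CREATE TABLE CLASS (CLASS_CODE TEXT PRIMARY KEY, CLASS_TIME TEXT);
CREATE TABLE ENROLL (
    CLASS_CODE TEXT REFERENCES CLASS (CLASS_CODE),
    STU_NUM INTEGER REFERENCES STUDENT (STU_NUM),
    PRIMARY KEY (CLASS_CODE, STU_NUM)
);
""")

# Two students, each taking three classes (hypothetical values)
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)", [(1, "Bowser"), (2, "Smithson")])
conn.executemany(
    "INSERT INTO CLASS VALUES (?, ?)",
    [("C1", "MWF 8:00"), ("C2", "MWF 9:00"), ("C3", "TTh 1:00")],
)
enrollments = [(c, s) for s in (1, 2) for c in ("C1", "C2", "C3")]
conn.executemany("INSERT INTO ENROLL VALUES (?, ?)", enrollments)

student_rows = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]
enroll_rows = conn.execute("SELECT COUNT(*) FROM ENROLL").fetchone()[0]

try:
    # The same student cannot enroll in the same class twice
    conn.execute("INSERT INTO ENROLL VALUES ('C1', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(student_rows, enroll_rows, duplicate_rejected)
```

Each student and each class is stored once; only the ENROLL rows grow with the number of enrollments.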
As you examine the figure below, note that the composite entity named ENROLL
represents the linking table between STUDENT and CLASS.
You can increase the amount of available information even as you control the
database’s redundancies.
Thus, the figure below shows the expanded ERM, including the 1:M relationship
between COURSE and CLASS shown in the previous figure.
Note that the model can handle multiple sections of a CLASS while controlling
redundancies by making sure that all of the COURSE data common to each
CLASS are kept in the COURSE table.
Reference:
Coronel, C., & Morris, S. (2019). Database Systems: Design, Implementation, & Management (13th ed.). Cengage Learning.