
Introduction To Database Systems Relational Model and Algebra


Lecture 2

Introduction to Database Systems

Relational Model and Algebra


COSC 304 - Dr. Ramon Lawrence

Relational Model History


The relational model was proposed by E. F. Codd in 1970.

One of the first relational database systems, System R, developed at IBM, led to several important breakthroughs:
the first version of SQL
various commercial products such as Oracle and DB2
extensive research on concurrency control, transaction
management, and query processing and optimization

Commercial implementations (RDBMSs) appeared in the late 1970s and early 1980s. Currently, the relational model is the
foundation of the majority of commercial database systems.


Relational Model Definitions


A relation is a table with columns and rows.
An attribute is a named column of a relation.
A tuple is a row of a relation.
A domain is a set of allowable values for one or more
attributes.
The degree of a relation is the number of attributes it contains.
The cardinality of a relation is the number of tuples it contains.
A relational database is a collection of normalized relations
with distinct relation names.
The intension of a relation is the structure of the relation
including its domains.
The extension of a relation is the set of tuples currently in the
relation.

Relation Example
[Figure: an example relation (table) with its attributes (columns) and tuples (rows) labeled.]
Degree = 7
Cardinality = 77
Domain of Unit Price is currency.


Definition Matching Question


Question: Given the three definitions, select the ordering that
contains their related definitions.

1) relation
2) tuple
3) attribute

A) column, row, table


B) row, column, table
C) table, row, column
D) table, column, row


Cardinality and Degree Question


Question: A database table has 10 rows and 5 columns.
Select one true statement.

A) The table's degree is 50.

B) The table's cardinality is 5.

C) The table's degree is 10.

D) The table's cardinality is 10.


Relation Practice Questions

1) What is the name of the relation?


2) What is the cardinality of the relation?
3) What is the degree of the relation?
4) What is the domain of order date? What is the domain of
order id?
5) Which is larger: the size of the intension or the size of the extension?

Relational Model Formal Definition


The relational model may be visualized as tables and fields, but
it is formally defined in terms of sets and set operations.

A relation schema R with attributes A = <A1, A2, …, An> is denoted R(A1, A2, …, An), where each Ai is an attribute name
that ranges over a domain Di denoted dom(Ai).

Example: Product (id, name, supplierId, categoryId, price)


R = Product (relation name)
Set A = {id, name, supplierId, categoryId, price}
dom(price) is the set of all possible positive currency values
dom(name) is the set of all possible strings that represent product names

Relation Schemas and Instances


A relation schema is a definition of a single relation.
The relation schema is the intension of the relation.
A relational database schema is a set of relation schemas
(modeling a particular domain).

A relation instance denoted r(R) over a relation schema R(A1, A2, …, An) is a set of n-tuples <d1, d2, ..., dn> where each di is
an element of dom(Ai) or is null.
The relation instance is the extension of the relation.
A value of null represents a missing or unknown value.


Cartesian Product (review)


The Cartesian product written as D1 × D2 is a set operation
that takes two sets D1 and D2 and returns the set of all ordered
pairs such that the first element is a member of D1 and the
second element is a member of D2.
Example:
D1 = {1,2,3}
D2 = {A,B}
D1 × D2 = {(1,A), (2,A), (3,A), (1,B), (2,B), (3,B)}
Practice Questions:
1) Compute D2 × D1.
2) Compute D2 × D2.
3) If |D| denotes the number of elements in set D, how many
elements are there in D1 × D2 in general?
What is the cardinality of D1 × D2 × D1 × D1? A) 27 B) 36 C) 54
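
The example above can be reproduced with a short Python sketch. It assumes Python sets stand in for D1 and D2 and uses the standard library's itertools.product; the variable names are illustrative only.

```python
from itertools import product

D1 = {1, 2, 3}
D2 = {"A", "B"}

# product() yields every ordered pair (d1, d2) with d1 in D1 and d2 in D2.
d1_x_d2 = set(product(D1, D2))

# Sort only for stable display; the product itself is an unordered set.
print(sorted(d1_x_d2))
print(len(d1_x_d2))
```

Swapping the arguments gives the pairs in the opposite order, which is why D1 × D2 and D2 × D1 are different sets.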

Relation Instance
A relation instance r(R) can also be defined as a subset of the
Cartesian product of the domains of all attributes in the relation
schema. That is,
r(R) ⊆ dom(A1) × dom(A2) × … × dom(An)

Example:
R = Person(id, firstName, lastName)
dom(id) = {1,2}, dom(firstName) = {Joe, Steve}
dom(lastName) = {Jones, Perry}
dom(id) × dom(firstName) × dom(lastName) =
{ (1,Joe,Jones), (1,Joe,Perry), (1,Steve,Jones), (1,Steve,Perry),
(2,Joe,Jones), (2,Joe,Perry), (2,Steve,Jones), (2,Steve,Perry)}
Assume our DB stores people Joe Jones and Steve Perry, then
r(R) = { (1,Joe,Jones), (2,Steve,Perry)}.
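
As a rough illustration of the subset definition, the Person example can be checked in Python. This is a sketch that models each domain as a set and each tuple as a Python tuple; the variable names are assumptions, not part of the slides.

```python
from itertools import product

dom_id = {1, 2}
dom_first_name = {"Joe", "Steve"}
dom_last_name = {"Jones", "Perry"}

# Every possible Person tuple: the Cartesian product of the three domains.
all_possible = set(product(dom_id, dom_first_name, dom_last_name))

# The relation instance currently stored in the database.
r = {(1, "Joe", "Jones"), (2, "Steve", "Perry")}

print(len(all_possible))   # number of possible tuples
print(r <= all_possible)   # True when r is a subset of the product
```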

Properties of Relations
A relation has several properties:
1) Each relation name is unique.
No two relations have the same name.
2) Each cell of the relation (value of a domain) contains exactly
one atomic (single) value.
3) Each attribute of a relation has a distinct name.
4) The values of an attribute are all from the same domain.
5) Each tuple is distinct. There are no duplicate tuples.
This is because relations are sets. In SQL, relations are bags.
6) The order of attributes is not really important.
Note that this is different from a mathematical relation and from our definitions,
which specify an ordered tuple. The reason is that the attribute names
represent the domain and can be reordered.
7) The order of tuples has no significance.

Relational Keys
Keys are used to uniquely identify a tuple in a relation.
Note that keys apply to the relational schema not to the relational
instance. That is, looking at the current instance cannot tell you for sure
if the set of attributes is a key.
A superkey is a set of attributes that uniquely identifies a tuple
in a relation.
A key is a minimal set of attributes that uniquely identifies a
tuple in a relation.
A candidate key is one of the possible keys of a relation.
A primary key is the candidate key designated as the
distinguishing key of a relation.
A foreign key is a set of attributes in one relation referring to
the primary key of another relation.
Foreign keys allow referential integrity to be enforced.
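
The note that an instance cannot prove a key can be made concrete with a small sketch. It assumes rows are Python dicts and uses a hypothetical helper, has_duplicates_on: finding two rows that agree on a set of attributes shows the set is not a superkey, while finding none only means the current data does not rule it out.

```python
def has_duplicates_on(rows, attrs):
    """Return True if two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return True
        seen.add(value)
    return False

# A few illustrative employee rows (eno, title, salary).
emp = [
    {"eno": "E1", "title": "EE", "salary": 30000},
    {"eno": "E2", "title": "SA", "salary": 50000},
    {"eno": "E5", "title": "SA", "salary": 50000},
    {"eno": "E6", "title": "EE", "salary": 30000},
]

print(has_duplicates_on(emp, ["title"]))            # duplicates exist: not a superkey
print(has_duplicates_on(emp, ["title", "salary"]))  # duplicates exist: not a superkey
print(has_duplicates_on(emp, ["eno"]))              # no duplicates: not ruled out by this data
```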

Keys and Superkeys Question


Question: True or false: A key is always a superkey.

A) true

B) false


Keys and Superkeys Question (2)


Question: True or false: It is possible to have more than one
key for a table and the keys may have different numbers of
attributes.

A) true

B) false


Keys and Superkeys Question (3)


Question: True or false: It is possible to always determine if a
field is a key by looking at the data in the table.

A) true

B) false


Example Relations
Employee-Project Database:
Employees have a unique number, name, title, and salary.
Projects have a unique number, name, and budget.
An employee may work on multiple projects and a project may
have multiple employees. An employee on a project has a
particular responsibility and duration on the project.
Relations:
Emp (eno, ename, title, salary)
Proj (pno, pname, budget)
WorksOn (eno, pno, resp, dur)

Keys: eno for Emp; pno for Proj; (eno, pno) for WorksOn.



Example Relation Instances


Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

WorksOn Relation
eno  pno  resp        dur
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

Questions:
1) Is ename a key for Emp?
2) Is eno a key for WorksOn?
3) List all the superkeys for WorksOn.

Practice Questions
Consider a relation storing driver information including:
SSN, name, driver's license number and state (unique together)

Person Relation
SSN          name       LicNum   LicState
123-45-6789  S. Smith   123-456  IA
111-11-1111  A. Lee     123-456  NY
222-22-2222  J. Miller  555-111  MT
333-33-3333  B. Casey   678-123  OH
444-44-4444  A. Adler   456-345  IA

Assumptions:
1) A person has only one driver's license.
2) A driver's license uniquely identifies a person.

Questions:
1) List the candidate keys for the relation.
2) Pick a primary key for the relation.
3) Is name a candidate key for Person?
4) List all the superkeys for Person.

Relational Integrity
Integrity rules are used to ensure the data is accurate.
Constraints are rules or restrictions that apply to the database
and limit the data values it may store.

Types of constraints:
Domain constraint - Every value for an attribute must be an
element of the attribute's domain or be null.
null represents a value that is currently unknown or not applicable.
null is not the same as zero or an empty string.
Entity integrity constraint - In a base relation, no attribute of a
primary key can be null.
Referential integrity constraint - If a foreign key exists in a
relation, then the foreign key value must match a primary key
value of a tuple in the referenced relation or be null.
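
The entity and referential integrity rules can be sketched as two small checks. This is a simplified illustration, not how a DBMS implements them: rows are assumed to be Python dicts, None stands for null, the foreign key is a single attribute, and both function names are hypothetical.

```python
def entity_integrity_violations(rows, primary_key):
    """Rows in which some attribute of the primary key is null (None)."""
    return [row for row in rows if any(row[a] is None for a in primary_key)]

def referential_integrity_violations(rows, fk, referenced_rows, pk):
    """Rows whose non-null foreign key value matches no primary key value."""
    valid = {row[pk] for row in referenced_rows}
    return [row for row in rows if row[fk] is not None and row[fk] not in valid]

# Illustrative data only.
emp = [{"eno": "E1"}, {"eno": "E2"}, {"eno": None}]
works_on = [
    {"eno": "E1", "pno": "P1"},
    {"eno": "E9", "pno": "P1"},   # E9 does not exist in emp
    {"eno": None, "pno": "P2"},   # null in part of the key (eno, pno)
]

print(entity_integrity_violations(emp, ["eno"]))
print(entity_integrity_violations(works_on, ["eno", "pno"]))
print(referential_integrity_violations(works_on, "eno", emp, "eno"))
```

Note that the row with a null eno is not a referential integrity violation, since a null foreign key is allowed, but it does violate entity integrity because eno is part of the WorksOn primary key.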

Foreign Keys Example


Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

WorksOn Relation
eno  pno  resp        dur
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

WorksOn.eno is FK to Emp.eno
WorksOn.pno is FK to Proj.pno

Integrity Constraints Question


Question: What constraint says that a primary key field cannot
be null?

A) domain constraint

B) referential integrity constraint

C) entity integrity constraint


Entity Integrity Constraint Question


Question: A primary key has three fields. Only one field is
null. Is the entity integrity constraint violated?

A) Yes

B) No


Referential Integrity Constraint Question


Question: A foreign key has a null value in the table that
contains the foreign key fields. Is the referential integrity
constraint violated?

A) Yes

B) No


Integrity Questions
Emp Relation
eno   ename      title  salary
E1    J. Doe     EE     AS
E2    null       SA     50000
E3    A. Lee     12     40000
E4    J. Miller  PR     20000
E5    B. Casey   SA     50000
null  L. Chu     EE     30000
E7    R. Davis   ME     null
E8    J. Jones   SA     50000

WorksOn Relation
eno   pno   resp        dur
E1    P0    null        12
E2    P1    Analyst     null
null  P2    Analyst     6
E3    P3    Consultant  10
E9    P4    Engineer    48
E4    P2    Programmer  18
E5    null  Manager     24
E6    P4    Manager     48
E7    P6    Engineer    36
E7    P4    Engineer    23
null  null  Manager     40

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   null         null

Question:
How many violations of integrity constraints?
A) 8 B) 9 C) 10 D) 11 E) 12

General Constraints
There are more general constraints that some DBMSs can
enforce. These constraints are often called enterprise
constraints or semantic integrity constraints.

Examples:
An employee cannot work on more than 2 projects.
An employee cannot make more money than their manager.
An employee must be assigned to at least one project.

Ensuring the database follows these constraints is usually achieved using triggers.


Relational Algebra
A query language is used to update and retrieve data that is
stored in a data model.
Relational algebra is a set of relational operations for
retrieving data.
Just like algebra with numbers, relational algebra consists of
operands (which are relations) and a set of operators.
Every relational operator takes as input one or more relations
and produces a relation as output.
Closure property - input is relations, output is relations
Unary operations - operate on one relation
Binary operations - have two relations as input
A sequence of relational algebra operators is called a
relational algebra expression.

Relational Algebra Operators


Relational Operators:
Selection σ
Projection π
Cartesian product ×
Join ⨝
Union ∪
Difference -
Intersection ∩
Division ÷

Note that relational algebra is the foundation of ALL relational database systems. SQL gets translated into relational algebra.

Selection Operation
The selection operation is a unary operation that takes in a
relation as input and returns a new relation as output that
contains a subset of the tuples of the input relation.
That is, the output relation has the same number of columns as
the input relation, but may have fewer rows.

To determine which tuples are in the output, the selection operation has a specified condition, called a predicate, that
tuples must satisfy to be in the output.
The predicate is similar to a condition in an if statement.


Selection Operation Formal Definition


The selection operation on relation R with predicate F is
denoted by σ_F(R).

σ_F(R) = {t | t ∈ R and F(t) is true}

where
R is a relation, t is a tuple variable
F is a formula (predicate) consisting of
operands that are constants or attributes
comparison operators: <, >, =, ≠, ≤, ≥
logical operators: AND, OR, NOT
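
The definition above can be sketched in Python. This is an illustration only: a relation is modeled as a list of dicts, the predicate F as a function from a tuple to a boolean, and select is a hypothetical helper name.

```python
def select(relation, predicate):
    """Selection: keep the tuples of `relation` for which `predicate` is true."""
    return [t for t in relation if predicate(t)]

emp = [
    {"eno": "E1", "ename": "J. Doe", "title": "EE", "salary": 30000},
    {"eno": "E2", "ename": "M. Smith", "title": "SA", "salary": 50000},
    {"eno": "E3", "ename": "A. Lee", "title": "ME", "salary": 40000},
    {"eno": "E4", "ename": "J. Miller", "title": "PR", "salary": 20000},
    {"eno": "E5", "ename": "B. Casey", "title": "SA", "salary": 50000},
    {"eno": "E6", "ename": "L. Chu", "title": "EE", "salary": 30000},
    {"eno": "E7", "ename": "R. Davis", "title": "ME", "salary": 40000},
    {"eno": "E8", "ename": "J. Jones", "title": "SA", "salary": 50000},
]

# Selection with predicate title = 'EE'
ee = select(emp, lambda t: t["title"] == "EE")
# Selection with predicate salary > 35000 OR title = 'PR'
high_or_pr = select(emp, lambda t: t["salary"] > 35000 or t["title"] == "PR")

print([t["eno"] for t in ee])
print([t["eno"] for t in high_or_pr])
```

Every output tuple keeps all four attributes; only the number of tuples changes.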


Selection Example
Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

σ_{title = 'EE'}(Emp)
eno  ename      title  salary
E1   J. Doe     EE     30000
E6   L. Chu     EE     30000

σ_{salary > 35000 OR title = 'PR'}(Emp)
eno  ename      title  salary
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

Selection Question
Question: Given this table and the query:
σ_{salary > 50000 or title='PR'}(Emp)

How many rows are returned?

Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

A) 0
B) 1
C) 2
D) 3

Selection Question (2)


Question: Given this table and the query:
σ_{salary > 50000 or title='PR'}(Emp)

How many columns are returned?

Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

A) 0
B) 2
C) 3
D) 4

Selection Questions
WorksOn Relation
eno  pno  resp        dur
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

Write the relational algebra expression that:
1) Returns all rows with an employee working on project P2.
2) Returns all rows with an employee who is working as a manager on a project.
3) Returns all rows with an employee working as a manager for more than 40 months.

Show the resulting relation for each case.

Projection Operation
The projection operation is a unary operation that takes in a
relation as input and returns a new relation as output that
contains a subset of the attributes of the input relation and all
non-duplicate tuples.
The output relation has the same number of tuples as the input
relation unless removing the attributes caused duplicates to be
present.
Question: When are we guaranteed to never have duplicates
when performing a projection operation?

Besides the relation, the projection operation takes as input the names of the attributes that are to be in the output relation.


Projection Operation Formal Definition


The projection operation on relation R with output attributes
A1,…,Am is denoted by π_{A1,…,Am}(R).

π_{A1,…,Am}(R) = {t[A1,…,Am] | t ∈ R}

where
R is a relation, t is a tuple variable
{A1,…,Am} is a subset of the attributes of R over which the projection
will be performed.
Order of A1,…,Am is significant in the result.
Cardinality of π_{A1,…,Am}(R) is not necessarily the same as R because
of duplicate removal.
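
A minimal Python sketch of projection with duplicate removal follows. It assumes a relation is a list of dicts and returns the projected tuples in the attribute order requested; project is a hypothetical helper name.

```python
def project(relation, attrs):
    """Projection: keep only `attrs`, in the given order, dropping duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        projected = tuple(t[a] for a in attrs)
        if projected not in seen:   # duplicate removal
            seen.add(projected)
            result.append(projected)
    return result

emp = [
    {"eno": "E1", "ename": "J. Doe", "title": "EE", "salary": 30000},
    {"eno": "E2", "ename": "M. Smith", "title": "SA", "salary": 50000},
    {"eno": "E3", "ename": "A. Lee", "title": "ME", "salary": 40000},
    {"eno": "E4", "ename": "J. Miller", "title": "PR", "salary": 20000},
    {"eno": "E5", "ename": "B. Casey", "title": "SA", "salary": 50000},
    {"eno": "E6", "ename": "L. Chu", "title": "EE", "salary": 30000},
    {"eno": "E7", "ename": "R. Davis", "title": "ME", "salary": 40000},
    {"eno": "E8", "ename": "J. Jones", "title": "SA", "salary": 50000},
]

titles = project(emp, ["title"])
names = project(emp, ["eno", "ename"])
print(titles)
print(len(names))
```

Projecting on title shrinks the relation because several employees share a title, while projecting on eno and ename keeps every tuple.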


Projection Example
Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

π_{eno,ename}(Emp)
eno  ename
E1   J. Doe
E2   M. Smith
E3   A. Lee
E4   J. Miller
E5   B. Casey
E6   L. Chu
E7   R. Davis
E8   J. Jones

π_{title}(Emp)
title
EE
SA
ME
PR

Projection Question
Question: Given this table and the query:
π_{title}(Emp)

How many rows are returned?

Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

A) 0
B) 2
C) 4
D) 8

Projection Questions
WorksOn Relation
eno  pno  resp        dur
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

Write the relational algebra expression that:
1) Returns only attributes resp and dur.
2) Returns only eno.
3) Returns only pno.

Show the resulting relation for each case.

Union
Union is a binary operation that takes two relations R and S as
input and produces an output relation that includes all tuples
that are either in R or in S or in both R and S. Duplicate tuples
are eliminated.

General form:
R ∪ S = {t | t ∈ R or t ∈ S}
where R, S are relations, t is a tuple variable.

R and S must be union-compatible. To be union-compatible means that the relations must have the same number of
attributes with the same domains.


Union Example
Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P5   Engineer  23

π_{eno}(Emp) ∪ π_{eno}(WorksOn)
eno
E1
E2
E3
E4
E5
E6
E7
E8

Set Difference
Set difference is a binary operation that takes two relations R
and S as input and produces an output relation that contains all
the tuples of R that are not in S.

General form:
R – S = {t | t ∈ R and t ∉ S}
where R and S are relations, t is a tuple variable.

Note that:
R – S ≠ S – R
R and S must be union compatible.


Set Difference Example


Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P5   Engineer  23

π_{eno}(Emp) - π_{eno}(WorksOn)
eno
E4
E8

Question: What is the meaning of this query?
Question: What is π_{eno}(WorksOn) - π_{eno}(Emp)?

Intersection
Intersection is a binary operation that takes two relations R
and S as input and produces an output relation which contains
all tuples that are in both R and S.

General form:
R ∩ S = {t | t ∈ R and t ∈ S}
where R, S are relations, t is a tuple variable.
R and S must be union-compatible.

Note that R ∩ S = R – (R – S) = S – (S – R).
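
The three set operations map directly onto Python's built-in set operators, which gives a quick way to check the identity above. The relations R and S below are made-up, union-compatible sets of tuples chosen only for illustration.

```python
# Two union-compatible relations: same number of attributes, same domains.
R = {(1, "a"), (2, "b"), (3, "c")}
S = {(2, "b"), (3, "c"), (4, "d")}

union = R | S          # tuples in R or S; duplicates are eliminated automatically
difference = R - S     # tuples in R that are not in S
intersection = R & S   # tuples in both R and S

print(sorted(union))
print(sorted(difference))
print(sorted(intersection))

# Intersection expressed using only set difference.
print(intersection == R - (R - S) == S - (S - R))
```

The last line shows why intersection is not a fundamental operator: it can always be rewritten with difference alone.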


Intersection Example
Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P5   Engineer  23

π_{eno}(Emp) ∩ π_{eno}(WorksOn)
eno
E1
E2
E3
E5
E6
E7

Set Operations
Union-compatible Question
Question: Two tables have the same number of fields in the
same order with the same types, but the names of some fields
are different. True or false: The two tables are union-compatible.

A) true

B) false


Cartesian Product
The Cartesian product of two relations R (of degree k1) and S
(of degree k2) is:

R × S = {t | t[A1,…,Ak1] ∈ R and t[Ak1+1,…,Ak1+k2] ∈ S}

The result of R × S is a relation of degree (k1 + k2) and consists of all (k1 + k2)-tuples where each tuple is a concatenation of one
tuple of R with one tuple of S.

The cardinality of R × S is |R| * |S|.

The Cartesian product is also known as cross product.



Cartesian Product Example


Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000

Emp × Proj
eno  ename      title  salary  pno  pname        budget
E1   J. Doe     EE     30000   P1   Instruments  150000
E2   M. Smith   SA     50000   P1   Instruments  150000
E3   A. Lee     ME     40000   P1   Instruments  150000
E4   J. Miller  PR     20000   P1   Instruments  150000
E1   J. Doe     EE     30000   P2   DB Develop   135000
E2   M. Smith   SA     50000   P2   DB Develop   135000
E3   A. Lee     ME     40000   P2   DB Develop   135000
E4   J. Miller  PR     20000   P2   DB Develop   135000
E1   J. Doe     EE     30000   P3   CAD/CAM      250000
E2   M. Smith   SA     50000   P3   CAD/CAM      250000
E3   A. Lee     ME     40000   P3   CAD/CAM      250000
E4   J. Miller  PR     20000   P3   CAD/CAM      250000

Cartesian Product Question


Question: R is a relation with 10 rows and 5 columns. S is a
relation with 8 rows and 3 columns.
What is the degree and cardinality of the Cartesian product?

A) degree = 8, cardinality = 80

B) degree = 80, cardinality = 8

C) degree = 15, cardinality = 80

D) degree = 8, cardinality = 18


θ-Join
Theta (θ) join is a derivative of the Cartesian product. Instead
of taking all combinations of tuples from R and S, we only take
a subset of those tuples that match a given condition F:

R ⨝_F S = {t | t[A1,…,Ak1] ∈ R and t[Ak1+1,…,Ak1+k2] ∈ S and F(t) is true}
where
R, S are relations, t is a tuple variable
F(t) is a formula defined as that of selection.

Note that R ⨝_F S = σ_F(R × S).
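
The identity above suggests a direct, if inefficient, way to compute a θ-join: form the Cartesian product and filter it. The sketch below assumes relations are lists of dicts with distinct attribute names (the Proj key is renamed P.pno to avoid a clash); theta_join is a hypothetical helper, and real systems use far better join algorithms.

```python
from itertools import product

def theta_join(r, s, predicate):
    """Theta join computed as a selection over the Cartesian product of r and s."""
    joined = []
    for t_r, t_s in product(r, s):
        t = {**t_r, **t_s}          # concatenate one tuple of r with one tuple of s
        if predicate(t):            # keep it only if F(t) is true
            joined.append(t)
    return joined

works_on = [
    {"eno": "E1", "pno": "P1", "resp": "Manager", "dur": 12},
    {"eno": "E2", "pno": "P1", "resp": "Analyst", "dur": 24},
]
proj = [
    {"P.pno": "P1", "pname": "Instruments", "budget": 150000},
    {"P.pno": "P2", "pname": "DB Develop", "budget": 135000},
    {"P.pno": "P3", "pname": "CAD/CAM", "budget": 250000},
]

result = theta_join(works_on, proj, lambda t: t["dur"] * 10000 > t["budget"])
for t in result:
    print(t["eno"], t["dur"], t["P.pno"], t["budget"])
```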


θ-Join Example
WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P4   Engineer  23

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

WorksOn ⨝_{dur*10000 > budget} Proj
eno  pno  resp      dur  P.pno  pname        budget
E2   P1   Analyst   24   P1     Instruments  150000
E2   P1   Analyst   24   P2     DB Develop   135000
E3   P4   Engineer  48   P1     Instruments  150000
E3   P4   Engineer  48   P2     DB Develop   135000
E3   P4   Engineer  48   P3     CAD/CAM      250000
E3   P4   Engineer  48   P4     Maintenance  310000
E5   P2   Manager   24   P1     Instruments  150000
E5   P2   Manager   24   P2     DB Develop   135000
E6   P4   Manager   48   P1     Instruments  150000
E6   P4   Manager   48   P2     DB Develop   135000
E6   P4   Manager   48   P3     CAD/CAM      250000
E6   P4   Manager   48   P4     Maintenance  310000
E7   P3   Engineer  36   P1     Instruments  150000
E7   P3   Engineer  36   P2     DB Develop   135000
E7   P3   Engineer  36   P3     CAD/CAM      250000
E7   P3   Engineer  36   P4     Maintenance  310000
E7   P4   Engineer  23   P1     Instruments  150000
E7   P4   Engineer  23   P2     DB Develop   135000

Types of Joins
The θ-Join is a general join in that it allows any expression in
the condition F. However, there are more specialized joins that
are frequently used.

An equijoin only contains the equality operator (=) in formula F.
e.g. WorksOn ⨝_{WorksOn.pno = Proj.pno} Proj

A natural join over two relations R and S denoted by R ⨝ S is
the equijoin of R and S over a set of attributes common to both
R and S.
It removes the “extra copies” of the join attributes.
The attributes must have the same name in both relations.
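
A natural join can be sketched as a nested loop that matches tuples on all attributes the two relations share. The code below is an illustration under simple assumptions: relations are non-empty lists of dicts with uniform attributes, natural_join is a hypothetical helper, and the sample rows are a small made-up subset.

```python
def natural_join(r, s):
    """Equijoin of r and s on their common attribute names, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    result = []
    for t_r in r:
        for t_s in s:
            if all(t_r[a] == t_s[a] for a in common):
                # Merging the dicts keeps a single copy of the join attributes.
                result.append({**t_r, **t_s})
    return result

works_on = [
    {"eno": "E1", "pno": "P1", "resp": "Manager", "dur": 12},
    {"eno": "E2", "pno": "P2", "resp": "Analyst", "dur": 6},
    {"eno": "E7", "pno": "P3", "resp": "Engineer", "dur": 36},
]
proj = [
    {"pno": "P1", "pname": "Instruments", "budget": 150000},
    {"pno": "P2", "pname": "DB Develop", "budget": 135000},
    {"pno": "P5", "pname": "CAD/CAM", "budget": 500000},
]

joined = natural_join(works_on, proj)
for t in joined:
    print(t)
```

The WorksOn row for P3 and the Proj row for P5 have no partner, so neither appears in the result.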


Equijoin Example
WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P4   Engineer  23

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

WorksOn ⨝_{WorksOn.pno = Proj.pno} Proj
eno  pno  resp      dur  P.pno  pname        budget
E1   P1   Manager   12   P1     Instruments  150000
E2   P1   Analyst   24   P1     Instruments  150000
E2   P2   Analyst   6    P2     DB Develop   135000
E3   P4   Engineer  48   P4     Maintenance  310000
E5   P2   Manager   24   P2     DB Develop   135000
E6   P4   Manager   48   P4     Maintenance  310000
E7   P3   Engineer  36   P3     CAD/CAM      250000
E7   P4   Engineer  23   P4     Maintenance  310000

What is the meaning of this join?

Natural join Example


WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P4   Engineer  23

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

WorksOn ⨝ Proj
eno  pno  resp      dur  pname        budget
E1   P1   Manager   12   Instruments  150000
E2   P1   Analyst   24   Instruments  150000
E2   P2   Analyst   6    DB Develop   135000
E3   P4   Engineer  48   Maintenance  310000
E5   P2   Manager   24   DB Develop   135000
E6   P4   Manager   48   Maintenance  310000
E7   P3   Engineer  36   CAD/CAM      250000
E7   P4   Engineer  23   Maintenance  310000

Natural join is performed by comparing pno in both relations.

Join Practice Questions


Emp Relation
eno  ename      title  salary
E1   J. Doe     EE     30000
E2   M. Smith   SA     50000
E3   A. Lee     ME     40000
E4   J. Miller  PR     20000
E5   B. Casey   SA     50000
E6   L. Chu     EE     30000
E7   R. Davis   ME     40000
E8   J. Jones   SA     50000

WorksOn Relation
eno  pno  resp        dur
E1   P1   Manager     12
E2   P1   Analyst     24
E2   P2   Analyst     6
E3   P3   Consultant  10
E3   P4   Engineer    48
E4   P2   Programmer  18
E5   P2   Manager     24
E6   P4   Manager     48
E7   P3   Engineer    36
E7   P5   Engineer    23
E8   P3   Manager     40

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

Compute the following joins (counts only):
1) Emp ⨝_{title='EE' and budget > 400000} Proj
2) Emp ⨝ WorksOn
3) Emp ⨝ WorksOn ⨝ Proj
4) Proj1 ⨝_{Proj1.budget > Proj2.budget} Proj2

Outer Joins
Outer joins are used in cases where performing a join "loses"
some tuples of the relations. These are called dangling tuples.
There are three types of outer joins:
1) Left outer join - R ⟕ S - The output contains all tuples of R
that match with tuples of S. If there is a tuple in R that matches
with no tuple in S, the tuple is included in the final result and is
padded with nulls for the attributes of S.
2) Right outer join - R ⟖ S - The output contains all tuples of
S that match with tuples of R. If there is a tuple in S that
matches with no tuple in R, the tuple is included in the final
result and is padded with nulls for the attributes of R.
3) Full outer join - R ⟗ S - All tuples of R and S are included
in the result whether or not they have a matching tuple in the
other relation.
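
The padding behavior can be sketched with a left outer join in Python. This is a simplified illustration: relations are lists of dicts with distinct attribute names, None stands for null, and left_outer_join with its s_attrs parameter is a hypothetical helper. A right outer join is the same computation with the roles of the two relations swapped.

```python
def left_outer_join(r, s, predicate, s_attrs):
    """Join r with s; tuples of r without a match are padded with None for s's attributes."""
    result = []
    for t_r in r:
        matches = [t_s for t_s in s if predicate(t_r, t_s)]
        if matches:
            result.extend({**t_r, **t_s} for t_s in matches)
        else:
            # Dangling tuple: keep it and pad the attributes of s with nulls.
            result.append({**t_r, **{a: None for a in s_attrs}})
    return result

proj = [
    {"P.pno": "P1", "pname": "Instruments", "budget": 150000},
    {"P.pno": "P5", "pname": "CAD/CAM", "budget": 500000},
]
works_on = [
    {"eno": "E1", "pno": "P1", "resp": "Manager", "dur": 12},
    {"eno": "E2", "pno": "P1", "resp": "Analyst", "dur": 24},
]

rows = left_outer_join(
    proj, works_on,
    lambda p, w: p["P.pno"] == w["pno"],
    ["eno", "pno", "resp", "dur"],
)
for row in rows:
    print(row)
```

Project P5 has no matching WorksOn tuple, so it appears once with nulls in the WorksOn attributes instead of being lost.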

Right Outer Join Example


WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P4   Engineer  23

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

WorksOn ⟖_{WorksOn.pno = Proj.pno} Proj
eno   pno   resp      dur   P.pno  pname        budget
E1    P1    Manager   12    P1     Instruments  150000
E2    P1    Analyst   24    P1     Instruments  150000
E2    P2    Analyst   6     P2     DB Develop   135000
E3    P4    Engineer  48    P4     Maintenance  310000
E5    P2    Manager   24    P2     DB Develop   135000
E6    P4    Manager   48    P4     Maintenance  310000
E7    P3    Engineer  36    P3     CAD/CAM      250000
E7    P4    Engineer  23    P4     Maintenance  310000
null  null  null      null  P5     CAD/CAM      500000

Outer Join Question


Question: Given this table and the query:
WorksOn ⟖_{WorksOn.pno = Proj.pno} Proj

How many rows are returned?

WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P4   Engineer  36
E7   P4   Engineer  23

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

A) 10
B) 9
C) 8
D) 7

Semi-Join and Anti-Join


A semi-join between tables returns rows from the first table
where one or more matches are found in the second table.
Semi-joins are used in EXISTS and IN constructs in SQL.

An anti-join between two tables returns rows from the first
table where no matches are found in the second table.
Anti-joins are used with NOT EXISTS, NOT IN, and FOR ALL.

Anti-join is the complement of semi-join: R ⊳ S = R - (R ⋉ S)


Semi-Join Example

WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P4   Engineer  23

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

Proj ⋉ Proj.pno = WorksOn.pno WorksOn
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000

Anti-Join Example

WorksOn Relation
eno  pno  resp      dur
E1   P1   Manager   12
E2   P1   Analyst   24
E2   P2   Analyst   6
E3   P4   Engineer  48
E5   P2   Manager   24
E6   P4   Manager   48
E7   P3   Engineer  36
E7   P4   Engineer  23

Proj Relation
pno  pname        budget
P1   Instruments  150000
P2   DB Develop   135000
P3   CAD/CAM      250000
P4   Maintenance  310000
P5   CAD/CAM      500000

Proj ⊳ Proj.pno = WorksOn.pno WorksOn
pno  pname    budget
P5   CAD/CAM  500000
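
The semi-join and anti-join examples can be checked with a minimal Python sketch. Relations are modeled as lists of tuples, and the helper names semi_join and anti_join (with column positions as arguments) are hypothetical choices for illustration only.

```python
# Proj and WorksOn tuples copied from the examples.
proj = [
    ("P1", "Instruments", 150000),
    ("P2", "DB Develop", 135000),
    ("P3", "CAD/CAM", 250000),
    ("P4", "Maintenance", 310000),
    ("P5", "CAD/CAM", 500000),
]
works_on = [
    ("E1", "P1", "Manager", 12),
    ("E2", "P1", "Analyst", 24),
    ("E2", "P2", "Analyst", 6),
    ("E3", "P4", "Engineer", 48),
    ("E5", "P2", "Manager", 24),
    ("E6", "P4", "Manager", 48),
    ("E7", "P3", "Engineer", 36),
    ("E7", "P4", "Engineer", 23),
]


def semi_join(r, s, r_col, s_col):
    """Tuples of r that have at least one match in s (R semi-join S)."""
    s_values = {t[s_col] for t in s}
    return [t for t in r if t[r_col] in s_values]


def anti_join(r, s, r_col, s_col):
    """Tuples of r that have no match in s (R anti-join S)."""
    s_values = {t[s_col] for t in s}
    return [t for t in r if t[r_col] not in s_values]


# pno is column 0 of Proj and column 1 of WorksOn.
semi = semi_join(proj, works_on, 0, 1)
anti = anti_join(proj, works_on, 0, 1)
print([t[0] for t in semi])  # ['P1', 'P2', 'P3', 'P4']
print([t[0] for t in anti])  # ['P5']
```

Each result contains only Proj attributes, and the anti-join result is exactly Proj minus the semi-join result.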

Division Operator
The division operator on relations R and S, denoted as R ÷ S
produces a relation that consists of the set of tuples from R
defined over the attributes C that match the combination of
every tuple in S, where C is the set of attributes that are in R
but not in S.
For the division operation to be defined the set of attributes of S
must be a subset of the attributes of R.

The division operator is used when you want to determine if all
combinations of a relationship are present.
E.g. Return the list of employees who work on all the projects
that 'John Smith' works on.
Note that R ÷ S = Π_{R-S}(R) - Π_{R-S}((Π_{R-S}(R) × S) - R).

Division Example
Find the employees who work on all the projects listed in Proj.

WorksOn
ENO  PNO  PNAME              BUDGET
E1   P1   Instrumentation    150000
E2   P1   Instrumentation    150000
E2   P2   Database Develop.  135000
E3   P1   Instrumentation    150000
E3   P4   Maintenance        310000
E4   P2   Instrumentation    150000
E5   P2   Instrumentation    150000
E6   P4   Maintenance        310000
E7   P3   CAD/CAM            250000
E8   P3   CAD/CAM            250000

Proj
PNO  PNAME            BUDGET
P1   Instrumentation  150000
P4   Maintenance      310000

Π_{eno,pno}(WorksOn) ÷ Π_{pno}(Proj)
ENO
E3
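
As a rough check of the division example, the sketch below applies the definition directly: keep each employee who is paired with every project in the divisor. The set-of-pairs representation and the function name divide are assumptions for illustration.

```python
# (eno, pno) pairs: the projection of WorksOn onto eno, pno in the example.
works_on = {
    ("E1", "P1"), ("E2", "P1"), ("E2", "P2"), ("E3", "P1"), ("E3", "P4"),
    ("E4", "P2"), ("E5", "P2"), ("E6", "P4"), ("E7", "P3"), ("E8", "P3"),
}
# pno values: the projection of Proj onto pno in the example.
proj_pnos = {"P1", "P4"}


def divide(pairs, divisor):
    """Return each x such that (x, y) is in `pairs` for every y in `divisor`."""
    candidates = {x for x, _ in pairs}
    return {x for x in candidates if all((x, y) in pairs for y in divisor)}


result = divide(works_on, proj_pnos)
print(result)  # {'E3'}
```

Only E3 works on both P1 and P4, which matches the result shown in the example.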


Division Questions
1) Can you give the relational algebra expression to find the
projects that are worked on by all employees?

2) (Challenge) If there are 6 projects in the database, and the
result of the query Π_{eno,pno}(WorksOn) ÷ Π_{pno}(Proj) is 2 records,
what is the minimum # of records that must be in the WorksOn
relation?


Combining Operations
Relational algebra operations can be combined in one
expression by nesting them:
Π_{eno,pno,dur}(σ_{ename='J. Doe'}(Emp) ⨝ σ_{dur>16}(WorksOn))
Return the eno, pno, and duration for employee 'J. Doe' when
he has worked on a project for more than 16 months.

Operations can also be combined by using temporary relation
variables to hold intermediate results.
We will use the assignment operator ← to indicate that the
result of an operation is assigned to a temporary relation.
empdoe ← σ_{ename='J. Doe'}(Emp)
wodur ← σ_{dur>16}(WorksOn)
empwo ← empdoe ⨝ wodur
result ← Π_{eno,pno,dur}(empwo)
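
The step-by-step form with temporary relations maps naturally onto ordinary variables. The following Python sketch mirrors the four assignments. The sample Emp and WorksOn tuples are hypothetical (the lecture gives no Emp instance), and the helpers select, project, and natural_join are illustrative names, not a standard library API.

```python
# Hypothetical sample data; only the attributes needed for the query are included.
emp = [
    {"eno": "E1", "ename": "J. Doe"},
    {"eno": "E2", "ename": "M. Smith"},
]
works_on = [
    {"eno": "E1", "pno": "P1", "dur": 12},
    {"eno": "E1", "pno": "P2", "dur": 24},
    {"eno": "E2", "pno": "P1", "dur": 24},
]


def select(rel, pred):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in rel if pred(t)]


def project(rel, attrs):
    """Projection: keep only `attrs` and drop duplicate tuples."""
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out


def natural_join(r, s):
    """Natural join: combine tuples that agree on all common attributes."""
    out = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[k] == b[k] for k in common):
                out.append({**a, **b})
    return out


empdoe = select(emp, lambda t: t["ename"] == "J. Doe")
wodur = select(works_on, lambda t: t["dur"] > 16)
empwo = natural_join(empdoe, wodur)
result = project(empwo, ["eno", "pno", "dur"])
print(result)  # [{'eno': 'E1', 'pno': 'P2', 'dur': 24}]
```

Nesting the same calls in a single expression gives the one-line form of the query.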

Rename Operation
Renaming can be applied when assigning a result:

result(EmployeeNum, ProjectNum, Duration) ← Π_{eno,pno,dur}(empwo)

Or by using the rename operator ρ (rho):

ρ_{result(EmployeeName, ProjectNum, Duration)}(empwo)


Operator Precedence
Just like mathematical operators, the relational operators have
precedence.

The precedence of operators from highest to lowest is:
unary operators - σ, Π, ρ
Cartesian product and joins - ×, ⨝
intersection, division
union and set difference
Parentheses can be used to change the order of operations.
Note that there is no universal agreement on operator
precedence, so we always use parentheses around the
argument for both unary and binary operators.

Complete Set of Relational Algebra Operators
It has been shown that the relational operators {σ, Π, ×, ∪, -}
form a complete set of operators.
That is, any of the other operators can be derived from a
combination of these 5 basic operators.

Examples:
Intersection - R ∩ S ≡ (R ∪ S) - ((R - S) ∪ (S - R))
We have also seen how a join is a combination of a Cartesian
product followed by a selection.
Division operator: R(Z) ÷ S(X) where X ⊆ Z and Y = Z - X:
T1 ← Π_Y(R)
T2 ← Π_Y((S × T1) - R)
T ← T1 - T2
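
Both derivations can be traced with Python sets. This is a sketch under assumptions: the sample values are illustrative, R is modeled as a set of (eno, pno) pairs so that Y = (eno) and X = (pno), and the columns of S × T1 are reordered to (eno, pno) so that the difference with R is well defined.

```python
from itertools import product

# Intersection expressed with union and difference only.
a = {1, 2, 3}
b = {2, 3, 4}
intersection = (a | b) - ((a - b) | (b - a))

# Division R(Z) / S(X) using the three assignments T1, T2, T.
r = {("E1", "P1"), ("E3", "P1"), ("E3", "P4"), ("E6", "P4")}  # R(eno, pno)
s = {"P1", "P4"}                                              # S(pno)

t1 = {eno for eno, _ in r}                                # T1: projection of R on Y
s_times_t1 = {(eno, pno) for pno, eno in product(s, t1)}  # S x T1, reordered to (eno, pno)
t2 = {eno for eno, _ in s_times_t1 - r}                   # T2: employees missing some project
t = t1 - t2                                               # T: employees paired with every project

print(intersection == (a & b))
print(t)
```

T2 collects the employees for whom at least one (eno, pno) combination is absent from R, so subtracting it from T1 leaves exactly the employees who appear with every value in S.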

Relational Algebra Query Examples

Consider the database schema
Emp (eno, ename, title, salary)
Proj (pno, pname, budget)
WorksOn (eno, pno, resp, dur)

Queries:
List the names of all employees.
Π_{ename}(Emp)

Find the names of projects with budgets over $100,000.
Π_{pname}(σ_{budget>100000}(Proj))
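
The second query can be tried on the Proj instance from the join examples earlier in the lecture. In this sketch (tuples as Python sets, an assumption for illustration), all five projects pass the selection, but the projection yields only four names because relations are sets and the duplicate CAD/CAM is eliminated.

```python
# Proj(pno, pname, budget) instance from the join examples.
proj = {
    ("P1", "Instruments", 150000),
    ("P2", "DB Develop", 135000),
    ("P3", "CAD/CAM", 250000),
    ("P4", "Maintenance", 310000),
    ("P5", "CAD/CAM", 500000),
}

selected = {t for t in proj if t[2] > 100000}    # selection: budget > 100000
pnames = {pname for (_, pname, _) in selected}   # projection on pname

print(len(selected))   # 5
print(sorted(pnames))  # ['CAD/CAM', 'DB Develop', 'Instruments', 'Maintenance']
```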

Practice Questions
Relational database schema:
branch (bname, address, city, assets)
customer (cname, street, city)
deposit (accnum, cname, bname, balance)
borrow (accnum, cname, bname, amount)
1) List the names of all branches of the bank.
2) List the names of all deposit customers together with their
account numbers.
3) Find all cities where at least one customer lives.
4) Find all cities with at least one branch.
5) Find all cities with at least one branch or customer.
6) Find all cities that have a branch but no customers who live
in that city.

Practice Questions (2)


branch (bname, address, city, assets)
customer (cname, street, city)
deposit (accnum, cname, bname, balance)
borrow (accnum, cname, bname, amount)
1) Find the names of all branches with assets greater than
$2,500,000.
2) List the names and cities of all customers who have an
account with balance greater than $2,000.
3) List all the cities with at least one customer but without any
bank branches.
4) Find the name of all the customers who live in a city with no
bank branches.

Practice Questions (3)


branch (bname, address, city, assets)
customer (cname, street, city)
deposit (accnum, cname, bname, balance)
borrow (accnum, cname, bname, amount)
1) Find all the cities that have both customers and bank
branches.
2) Find the name of customers who have deposits in every
branch of the bank.
3) Find the name and assets of all branches which have
deposit customers living in Vancouver.
4) Find all the customers who have both a deposit account and
a loan at the branch with name CalgaryCentral.
5) Your own?

Other Relational Algebra Operators


There are other relational algebra operators that we will not
discuss. Most notably, we often need aggregate operations
that compute functions on the data.

For example, given the current operators, we cannot answer
the query:
What is the total amount of deposits at the Kelowna branch?

We will see how to answer these queries when we study SQL.
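
To make concrete what an aggregate operation computes, here is a small Python sketch of the Kelowna query. The deposit tuples are hypothetical sample data that follow the deposit (accnum, cname, bname, balance) schema from the practice questions; this is not a relational algebra operator from the lecture.

```python
# Hypothetical deposit(accnum, cname, bname, balance) tuples.
deposit = [
    (101, "Smith", "Kelowna", 500),
    (102, "Jones", "Kelowna", 1200),
    (103, "Lee", "Vancouver", 900),
]

# Select the Kelowna tuples, then fold their balances into a single value.
kelowna_total = sum(
    balance for (_, _, bname, balance) in deposit if bname == "Kelowna"
)
print(kelowna_total)  # 1700
```

The selection step is expressible with σ, but collapsing many tuples into one computed value is what the five basic operators cannot do.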


Conclusion
The relational model represents data as relations, which are
sets of tuples. Each relation schema consists of a set of
attribute names, each associated with a domain.

The relational model has several forms of constraints to
guarantee data integrity including:
domain, entity integrity and referential integrity constraints
Keys are used to uniquely identify tuples in relations.

Relational algebra is a set of operations for answering queries
on data stored in the relational model.
The 5 basic relational operators are: {σ, Π, ×, ∪, -}.
By combining relational operators, queries can be answered
over the base relations.

Objectives
Define: relation, attribute, tuple, domain, degree, cardinality,
relational DB, intension, extension
Define: relation schema, relational database schema, relation
instance, null
Perform Cartesian product given two sets.
List the properties of relations.
Define: superkey, key, candidate key, primary key, foreign key
Define: integrity, constraints, domain constraint, entity integrity
constraint, referential integrity constraint
Given a relation be able to:
identify its cardinality, degree, domains, keys, and superkeys
determine if constraints are being violated


Objectives (2)
Define: relational algebra, query language
Define and perform all relational algebra operators.
List the operators which form the complete set of operators.
Show how other operators can be derived from the complete
set.

Given a relational schema and instance be able to translate
English queries into relational algebra and show the resulting
relation.
