Database Systems
Chapter 1
• Intro to Relational Model
Outline
• Structure of Relational Database
• Database Schema
• Keys
• Schema Design
• Relational Query Languages
• Relational operations
Intro to Relational Model
Example of a Relation
(figure labels: attributes, or columns; tuples, or rows)
Attribute Types
• The set of allowed values for each
attribute is called the domain of the
attribute
• Attribute values are (normally) required
to be atomic; that is, indivisible
• The special value null is a member of
every domain. It indicates that the value
is “unknown”
• The null value causes complications in
the definition of many operations
Relation Schema and Instance
• A1, A2, …, An are attributes
• R = (A1, A2, …, An ) is a relation schema
Example:
instructor = (ID, name, dept_name, salary)
• Formally, given sets D1, D2, …, Dn, a relation r is a subset of
D1 x D2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di
• Instance: The current values (relation instance) of a relation are specified
by a table
• An element t of r is a tuple, represented by a row in a table
Basic Structure
• Formally, given sets D1, D2, …, Dn, a relation r is a subset of
D1 x D2 x … x Dn
Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di
• Example: If
– customer_name = {Jones, Smith, Curry, Lindsay, …}
/* Set of all customer names */
– customer_street = {Main, North, Park, …} /* set of all street names*/
– customer_city = {Harrison, Rye, Pittsfield, …} /* set of all city names */
Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield) }
is a relation over
customer_name x customer_street x customer_city
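The definition above can be sketched in Python, treating each domain as a set and the relation as a set of tuples. The domains are truncated to the values the slide names (the slide's "…" is not filled in), so this is only an illustration of "a relation is a subset of the Cartesian product".

```python
from itertools import product

# Domains, truncated to the values named in the example above.
customer_name = {"Jones", "Smith", "Curry", "Lindsay"}
customer_street = {"Main", "North", "Park"}
customer_city = {"Harrison", "Rye", "Pittsfield"}

# The full Cartesian product D1 x D2 x D3: every possible 3-tuple.
all_tuples = set(product(customer_name, customer_street, customer_city))

# The relation r is one particular subset of that product.
r = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
}

print(len(all_tuples))  # 4 * 3 * 3 = 36
print(r <= all_tuples)  # True: r is a subset of the product
```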
Relation Instance
• The current values (relation instance) of a
relation are specified by a table
• An element t of r is a tuple, represented by
a row in a table
customer
customer_name  customer_street  customer_city
Jones          Main             Harrison
Smith          North            Rye
Curry          North            Rye
Lindsay        Park             Pittsfield
(columns are the attributes; rows are the tuples)
Relations are Unordered
Order of tuples is irrelevant (tuples may be stored in an arbitrary order)
Example: instructor relation with unordered tuples
Keys
• Let K ⊆ R
• K is a superkey of R if values for K are sufficient to identify a unique tuple of each
possible relation r(R)
– Example: {ID} and {ID,name} are both superkeys of instructor.
• Superkey K is a candidate key if K is minimal, i.e. no proper subset of K is also a superkey
– Example: {ID} is a candidate key for instructor
• One of the candidate keys is selected to be the primary key.
– which one?
• Primary key: a candidate key chosen as the principal means of identifying tuples within
a relation
– Should choose an attribute whose value never, or very rarely, changes.
– E.g. email address is unique, but may change
• Foreign key constraint: Value in one relation must appear in another
– Referencing relation
– Referenced relation
– Example – dept_name in instructor is a foreign key from instructor referencing
department
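As a rough sketch of the superkey and candidate-key definitions, the check below tests one relation instance. Note the limitation: a superkey must hold for every possible relation r(R), so a single instance can only refute a key, never prove it. The instructor rows and the helper names are hypothetical, chosen for illustration.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """A superkey is minimal if no non-empty proper subset is a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

# Hypothetical instance of instructor = (ID, name, dept_name, salary).
instructor = [
    {"ID": 1, "name": "Kim", "dept_name": "Physics", "salary": 90000},
    {"ID": 2, "name": "Lee", "dept_name": "History", "salary": 60000},
    {"ID": 3, "name": "Kim", "dept_name": "History", "salary": 62000},
]

print(is_superkey(instructor, ["ID"]))               # True
print(is_superkey(instructor, ["ID", "name"]))       # True
print(is_superkey(instructor, ["name"]))             # False: two rows share "Kim"
print(is_candidate_key(instructor, ["ID"]))          # True
print(is_candidate_key(instructor, ["ID", "name"]))  # False: {ID} alone suffices
```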
Foreign Keys
• A relation schema may have an attribute that corresponds to the primary
key of another relation. The attribute is called a foreign key.
– E.g. customer_name and account_number attributes of depositor are
foreign keys to customer and account respectively.
– Only values occurring in the primary key attribute of the referenced
relation may occur in the foreign key attribute of the referencing
relation.
• Schema diagram
Schema Diagram for University Database
Relational Query Languages
• Procedural vs. non-procedural (declarative)
• “Pure” languages:
– Relational algebra
– Tuple relational calculus
– Domain relational calculus
• The above 3 pure languages are equivalent in
computing power
• We will concentrate in this chapter on relational
algebra
– Not Turing-machine equivalent
– Consists of 6 basic operations
Relational Algebra
• The basic set of operations for the relational
model is the relational algebra.
– enables the specification of basic retrievals
• The result of a retrieval is a new relation, which
may have been formed from one or more
relations.
– algebra operations thus produce new relations,
which can be further manipulated using operations of the same algebra.
• A sequence of relational algebra operations
forms a relational algebra expression,
– the result will also be a relation that represents the
result of a database query (or retrieval request).
What is an Algebra?
• A language based on operators and a
domain of values
• Operators map values taken from the
domain into other domain values
• Hence, an expression involving operators
and arguments produces a value in the
domain
• When the domain is the set of all relations, we
get the relational algebra
Relational Algebra Definitions
• Domain: set of relations
• Basic operators: select, project, union, set
difference, Cartesian (cross) product
• Derived operators: set intersection, division,
join
• Procedural: Relational expression specifies
query by describing an algorithm (the sequence
in which operators are applied) for determining
the result of an expression
Relational Algebra
• Procedural language
• Six basic operators
– select: σ
– project: ∏
– union: ∪
– set difference: –
– Cartesian product: x
– rename: ρ
• The operators take one or two relations as
inputs and produce a new relation as a result.
Relational Algebra
• Basic operations:
– Selection (σ) Selects a subset of rows from relation.
– Projection (π) Deletes unwanted columns from relation.
– Cross-product (×) Allows us to combine two relations.
– Set-difference (−) Tuples in reln. 1, but not in reln. 2.
– Union (∪) Tuples in reln. 1 or in reln. 2 (or both).
• Additional operations:
– Intersection, join, division, renaming: Not essential, but (very!)
useful.
Unary Relational Operations
• SELECT Operation: used to select a subset
of the tuples from a relation that satisfy a selection
condition. It is a filter that keeps only those tuples
that satisfy a qualifying condition.
Examples:
σ DNO=4 (EMPLOYEE)
σ SALARY>30,000 (EMPLOYEE)
– denoted by σ<selection condition>(R), where the symbol σ (sigma) is
used to denote the select operator, and the selection
condition is a Boolean expression specified on the
attributes of relation R
Select Operator
• Produces table containing subset of rows of argument table
satisfying condition
σcondition(relation)
• Example:
Person
Id    Name  Address    Hobby
1123  John  123 Main   stamps
1123  John  123 Main   coins
5556  Mary  7 Lake Dr  hiking
9876  Bart  5 Pine St  stamps
σHobby=‘stamps’(Person)
Id    Name  Address    Hobby
1123  John  123 Main   stamps
9876  Bart  5 Pine St  stamps
Selection Condition
• Operators: <, ≤, ≥, >, =, ≠
• Simple selection condition:
– <attribute> operator <constant>
– <attribute> operator <attribute>
• <condition> AND <condition>
• <condition> OR <condition>
• NOT <condition>
Selection Condition - Examples
• σ Id>3000 OR Hobby=‘hiking’ (Person)
• σ Id>3000 AND Id<3999 (Person)
• σ NOT(Hobby=‘hiking’) (Person)
• σ Hobby≠‘hiking’ (Person)
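The four selection conditions can be tried out with a minimal Python sketch. It models a relation as a list of dicts and a condition as a predicate function; the `select` helper is an illustrative name, and the rows are the Person rows from the Select Operator slide.

```python
PERSON = [
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "stamps"},
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "coins"},
    {"Id": 5556, "Name": "Mary", "Address": "7 Lake Dr", "Hobby": "hiking"},
    {"Id": 9876, "Name": "Bart", "Address": "5 Pine St", "Hobby": "stamps"},
]

def select(relation, condition):
    """Sigma: keep only the tuples for which the condition holds."""
    return [t for t in relation if condition(t)]

q1 = select(PERSON, lambda t: t["Id"] > 3000 or t["Hobby"] == "hiking")
q2 = select(PERSON, lambda t: t["Id"] > 3000 and t["Id"] < 3999)
q3 = select(PERSON, lambda t: not (t["Hobby"] == "hiking"))
q4 = select(PERSON, lambda t: t["Hobby"] != "hiking")

print([t["Name"] for t in q1])  # ['Mary', 'Bart']
print(q2)                       # []: no Id lies between 3000 and 3999
print(q3 == q4)                 # True: NOT(=) and ≠ select the same rows
```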
Select Operation – selection of rows (tuples)
Relation r
σA=B ∧ D>5 (r)
Unary Relational Operations (cont.)
• PROJECT Operation: selects certain columns
from the table and discards the others.
Example:
πLNAME,FNAME,SALARY(EMPLOYEE)
The general form of the project operation is:
π<attribute list>(R) where π is the symbol used
to represent the project operation and <attribute
list> is the desired list of attributes.
PROJECT removes duplicate tuples, so the result
is a set of tuples and hence a valid relation.
Project Operator
• Produces table containing subset of
columns of argument table
Πattribute list(relation)
• Example:
Person
Id    Name  Address    Hobby
1123  John  123 Main   stamps
1123  John  123 Main   coins
5556  Mary  7 Lake Dr  hiking
9876  Bart  5 Pine St  stamps
ΠName,Hobby(Person)
Name  Hobby
John  stamps
John  coins
Mary  hiking
Bart  stamps
Expressions
Π Id, Name (σ Hobby=‘stamps’ OR Hobby=‘coins’ (Person))
Person
Id    Name  Address    Hobby
1123  John  123 Main   stamps
1123  John  123 Main   coins
5556  Mary  7 Lake Dr  hiking
9876  Bart  5 Pine St  stamps
Result
Id    Name
1123  John
9876  Bart
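The composed expression can be sketched in Python as a projection applied to the output of a selection. The `select` and `project` helpers are illustrative names, not part of any library; `project` returns a set, so duplicate tuples disappear as the PROJECT definition requires.

```python
PERSON = [
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "stamps"},
    {"Id": 1123, "Name": "John", "Address": "123 Main", "Hobby": "coins"},
    {"Id": 5556, "Name": "Mary", "Address": "7 Lake Dr", "Hobby": "hiking"},
    {"Id": 9876, "Name": "Bart", "Address": "5 Pine St", "Hobby": "stamps"},
]

def select(relation, condition):
    """Sigma: keep the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]

def project(relation, attrs):
    """Pi: keep only attrs; building a set removes duplicate tuples."""
    return {tuple(t[a] for a in attrs) for t in relation}

selected = select(PERSON, lambda t: t["Hobby"] in ("stamps", "coins"))
result = project(selected, ["Id", "Name"])

print(len(selected))   # 3 rows survive the selection
print(sorted(result))  # [(1123, 'John'), (9876, 'Bart')]
```

The two John rows collapse into one tuple after projection, which is why the result has two rows even though three rows satisfy the selection.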
Project Operation – selection of columns (Attributes)
• Relation r:
∏A,C (r)
SELECT and PROJECT Operations
(a) σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND
SALARY>30000)(EMPLOYEE)
(b) πLNAME, FNAME, SALARY(EMPLOYEE)
(c) πSEX, SALARY(EMPLOYEE)
Relational Algebra Operations
from Set Theory
• The UNION, INTERSECTION, and
MINUS Operations
• The CARTESIAN PRODUCT (or
CROSS PRODUCT) Operation
Set Operators
• A relation is a set of tuples, so set
operations apply:
∩, ∪, − (set difference)
• Result of combining two relations with a
set operator is a relation => all elements
are tuples with the same structure
UNION Operation
Denoted by R ∪ S
Result is a relation that includes all tuples that are either
in R or in S or in both. Duplicate tuples are eliminated.
Example: Retrieve the SSNs of all employees who either
work in department 5 or directly supervise an employee
who works in department 5:
DEP5_EMPS ← σDNO=5 (EMPLOYEE)
RESULT1 ← π SSN(DEP5_EMPS)
RESULT2(SSN) ← π SUPERSSN(DEP5_EMPS)
RESULT ← RESULT1 ∪ RESULT2
The union operation produces the tuples that are in either
RESULT1 or RESULT2 or both. The two operands must
be “type compatible”.
Union of two relations
• Relations r, s:
r ∪ s:
Type (Union) Compatibility
The operand relations R1(A1, A2, ..., An) and R2(B1,
B2, ..., Bn) must have the same number of
attributes, and the domains of corresponding
attributes must be compatible, i.e.
– dom(Ai) = dom(Bi) for i=1, 2, ..., n.
UNION Operation
Example
Tables:
Person (SSN, Name, Address, Hobby)
Professor (Id, Name, Office, Phone)
are not union compatible.
But
πName(Person) and πName(Professor)
are union compatible, so
πName(Person) - πName(Professor)
makes sense.
STUDENT ∪ INSTRUCTOR:
UNION Example
What would STUDENT ∩ INSTRUCTOR be?
Set intersection of two relations
• Relation r, s:
• r ∩ s
Note: r ∩ s = r – (r – s)
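The identity in the note can be checked with Python's built-in sets, where `&` is intersection and `-` is set difference. The two relations below are hypothetical stand-ins for the r and s of the slide, whose figure is not reproduced in the text.

```python
# Hypothetical union-compatible relations over the same two attributes.
r = {("alpha", 1), ("alpha", 2), ("beta", 1)}
s = {("alpha", 2), ("beta", 3)}

intersection = r & s
via_difference = r - (r - s)

print(intersection)                    # {('alpha', 2)}
print(intersection == via_difference)  # True
```

This is why intersection is listed as a derived operator: it can be expressed with set difference alone.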
Set Difference (or MINUS) Operation
The result of this operation, denoted by R - S, is a
relation that includes all tuples that are in R but
not in S. The two operands must be “type
compatible”.
Set Difference Operation
Set Difference Example
S1
SID  SName   Age
202  Rusty   21
403  Marcia  20
914  Hal     24
192  Jose    22
881  Stimpy  19
S2
SID  SName   Age
473  Popeye  22
192  Jose    22
715  Alicia  28
914  Hal     24
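The difference of these two instances can be worked out with a short Python sketch that stores each relation as a set of (SID, SName, Age) tuples. It also shows that set difference is not symmetric.

```python
S1 = {
    (202, "Rusty", 21),
    (403, "Marcia", 20),
    (914, "Hal", 24),
    (192, "Jose", 22),
    (881, "Stimpy", 19),
}
S2 = {
    (473, "Popeye", 22),
    (192, "Jose", 22),
    (715, "Alicia", 28),
    (914, "Hal", 24),
}

print(sorted(S1 - S2))  # [(202, 'Rusty', 21), (403, 'Marcia', 20), (881, 'Stimpy', 19)]
print(sorted(S2 - S1))  # [(473, 'Popeye', 22), (715, 'Alicia', 28)]
```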
Set difference of two relations
• Relations r, s:
r – s:
Cartesian (Cross) Product
• If R and S are two relations, R × S is the set of all
concatenated tuples <x,y>, where x is a tuple in R and y
is a tuple in S
– R and S need not be union compatible
• R × S is expensive to compute:
– Factor of two in the size of each row; quadratic in the number
of rows
R
A   B
x1  x2
x3  x4
S
C   D
y1  y2
y3  y4
R × S
A   B   C   D
x1  x2  y1  y2
x1  x2  y3  y4
x3  x4  y1  y2
x3  x4  y3  y4
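A minimal Python sketch of the cross product, using the R and S values from the example: each tuple of R is concatenated with each tuple of S, so the result has |R| · |S| rows, each as wide as the two inputs combined.

```python
R = [("x1", "x2"), ("x3", "x4")]  # attributes A, B
S = [("y1", "y2"), ("y3", "y4")]  # attributes C, D

# Concatenate every tuple of R with every tuple of S.
cross = [x + y for x in R for y in S]

for row in cross:
    print(row)

print(len(cross) == len(R) * len(S))  # True: 4 rows
```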
Joining two relations: Cartesian product
Relations r, s:
r x s:
Cartesian-product – naming issue
Relations r, s:
r x s:
r.B, s.B
Renaming a Table
• Allows us to refer to a relation (say E) by more than one name.
ρx(E)
returns the expression E under the name x
Relation r
r x ρs(r):
r.A  r.B  s.A  s.B
α    1    α    1
α    1    β    2
β    2    α    1
β    2    β    2
Composition of Operations
• Can build expressions using multiple
operations
• Example: σA=C(r x s)
• r x s
• σA=C(r x s)
Joining two relations – Natural Join
• Let r and s be relations on schemas R and S
respectively.
Then, the “natural join” of relations r and s is a
relation on schema R ∪ S obtained as follows:
– Consider each pair of tuples tr from r and ts from
s.
– If tr and ts have the same value on each of the
attributes in R ∩ S, add a tuple t to the result,
where
• t has the same value as tr on R
• t has the same value as ts on S
Natural Join Example
• Relations r, s:
• Natural Join
r ⋈ s
∏ A, r.B, C, r.D, E (σ r.B = s.B ∧ r.D = s.D (r x s))
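The natural join definition translates fairly directly into Python. The sketch below assumes relations are lists of dicts keyed by attribute name. The instances are hypothetical, since the slide's figure is not reproduced: only the schemas r(A, B, C, D) and s(B, D, E) are taken from the expression above.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # attributes in R ∩ S
    result = []
    for tr in r:
        for ts in s:
            # Keep the pair only if it agrees on every common attribute.
            if all(tr[a] == ts[a] for a in common):
                result.append({**tr, **ts})  # tuple on schema R ∪ S
    return result

# Hypothetical instances.
r = [
    {"A": "a1", "B": 1, "C": "c1", "D": "d1"},
    {"A": "a2", "B": 2, "C": "c2", "D": "d1"},
    {"A": "a3", "B": 4, "C": "c3", "D": "d2"},
]
s = [
    {"B": 1, "D": "d1", "E": "e1"},
    {"B": 3, "D": "d1", "E": "e2"},
    {"B": 2, "D": "d2", "E": "e3"},
]

print(natural_join(r, s))
# [{'A': 'a1', 'B': 1, 'C': 'c1', 'D': 'd1', 'E': 'e1'}]
```

Only the first tuple of r finds a partner: the second matches s on B=2 but not on D, and a natural join requires agreement on all shared attributes.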
Outer Join
• An extension of the join operation that avoids loss of
information.
• Computes the join and then adds tuples from one relation
that do not match tuples in the other relation to the result
of the join.
• Uses null values:
– null signifies that the value is unknown or does not exist
– All comparisons involving null are (roughly speaking)
false by definition.
• We shall study precise meaning of comparisons with
nulls later
Outer Join – Example
• Relation loan
loan_number  branch_name  amount
L-170        Downtown     3000
L-230        Redwood      4000
L-260        Perryridge   1700
• Relation borrower
customer_name  loan_number
Jones          L-170
Smith          L-230
Hayes          L-155
Outer Join – Example
• Join
loan ⋈ borrower
loan_number  branch_name  amount  customer_name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith
• Left Outer Join
loan ⟕ borrower
loan_number  branch_name  amount  customer_name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith
L-260        Perryridge   1700    null
Outer Join – Example
• Right Outer Join
loan ⟖ borrower
loan_number  branch_name  amount  customer_name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith
L-155        null         null    Hayes
• Full Outer Join
loan ⟗ borrower
loan_number  branch_name  amount  customer_name
L-170        Downtown     3000    Jones
L-230        Redwood      4000    Smith
L-260        Perryridge   1700    null
L-155        null         null    Hayes
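The three outer joins on loan and borrower can be reproduced with a small Python sketch, using `None` for null. The helper `left_outer_join` and its parameters are illustrative names. The right outer join is obtained by swapping the operands, and the full outer join by adding the right-only rows to the left outer join.

```python
LOAN = [
    {"loan_number": "L-170", "branch_name": "Downtown", "amount": 3000},
    {"loan_number": "L-230", "branch_name": "Redwood", "amount": 4000},
    {"loan_number": "L-260", "branch_name": "Perryridge", "amount": 1700},
]
BORROWER = [
    {"customer_name": "Jones", "loan_number": "L-170"},
    {"customer_name": "Smith", "loan_number": "L-230"},
    {"customer_name": "Hayes", "loan_number": "L-155"},
]

def left_outer_join(left, right, key, pad_attrs):
    """Join on key; unmatched left tuples are padded with None (null)."""
    result = []
    for lt in left:
        matches = [rt for rt in right if rt[key] == lt[key]]
        if matches:
            result.extend({**lt, **rt} for rt in matches)
        else:
            result.append({**lt, **{a: None for a in pad_attrs}})
    return result

left = left_outer_join(LOAN, BORROWER, "loan_number", ["customer_name"])
right = left_outer_join(BORROWER, LOAN, "loan_number", ["branch_name", "amount"])
full = left + [t for t in right if t not in left]

for t in full:
    print(t["loan_number"], t["branch_name"], t["amount"], t["customer_name"])
```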
Notes about Relational Languages
• Each query input is a table (or set of tables)
• Each query output is a table.
• All data in the output table appears in one of
the input tables
• Relational Algebra is not Turing complete
• Can we compute:
– SUM
– AVG
– MAX
– MIN
Summary of Relational Algebra Operators
Symbol (Name): Example of Use
σ (Selection): σ salary >= 85000 (instructor)
Return rows of the input relation that satisfy the predicate.
Π (Projection): Π ID, salary (instructor)
Output specified attributes from all rows of the input relation. Remove
duplicate tuples from the output.
x (Cartesian Product): instructor x department
Output all pairs of rows from the two input relations, regardless of whether
they agree on attributes that have the same name.
∪ (Union): Π name (instructor) ∪ Π name (student)
Output the union of tuples from the two input relations.
- (Set Difference): Π name (instructor) - Π name (student)
Output the set difference of tuples from the two input relations.
⋈ (Natural Join): instructor ⋈ department
Output pairs of rows from the two input relations that have the same value on
all attributes that have the same name.
EXAMPLES OF ALGEBRA QUERIES
In the rest of this chapter we shall illustrate queries
using the following new instances S3 of sailors, R2
of Reserves and B1 of boats.
QUERY Q1
Given the relational instances:
(Q1) Find the names of sailors who have reserved boat
103
πsname((σbid=103 Reserves) ⋈ Sailors)
The answer is thus the following relational instance:
{<Dustin>, <Lubber>, <Horatio>}
QUERY Q1 (cont’d)
There are of course several ways to express Q1 in
relational algebra.
Here is another:
πsname(σbid=103(Reserves ⋈ Sailors))
Which of these expressions should we use?
That is a question of optimization. Indeed, when we
describe how to state queries in SQL, we can leave it
to the optimizer in the DBMS to select the best
approach.
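The two formulations of Q1 can be compared with a small Python sketch. The instances below are hypothetical, not the S3 and R2 instances the slides refer to, so the answer differs from the one shown for Q1. The point is only that both plans return the same names while producing intermediate results of different sizes.

```python
# Hypothetical instances, for illustration only.
SAILORS = [
    {"sid": 22, "sname": "Dustin"},
    {"sid": 31, "sname": "Lubber"},
    {"sid": 58, "sname": "Rusty"},
]
RESERVES = [
    {"sid": 22, "bid": 101},
    {"sid": 22, "bid": 103},
    {"sid": 31, "bid": 103},
    {"sid": 58, "bid": 102},
]

def join_on_sid(reserves, sailors):
    """Natural join of Reserves and Sailors (sid is the only shared attribute)."""
    return [{**r, **s} for r in reserves for s in sailors if r["sid"] == s["sid"]]

# Plan A: select first, then join.
filtered = [r for r in RESERVES if r["bid"] == 103]
plan_a = {t["sname"] for t in join_on_sid(filtered, SAILORS)}

# Plan B: join first, then select.
joined = join_on_sid(RESERVES, SAILORS)
plan_b = {t["sname"] for t in joined if t["bid"] == 103}

print(plan_a == plan_b)            # True
print(len(filtered), len(joined))  # 2 4: plan A joins fewer tuples
```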
QUERY Q2
(Q2) Find the names of sailors who have reserved a red boat.
πsname((σcolor=‘red’Boats) ⋈ Reserves ⋈ Sailors)
QUERY Q3
(Q3) Find the colors of boats reserved by Lubber.
πcolor((σsname=‘Lubber’ Sailors) ⋈ Reserves ⋈ Boats)
QUERY Q4
(Q4) Find the names of Sailors who have reserved at least one boat
πsname(Sailors ⋈ Reserves)
QUERY Q5
(Q5) Find the names of sailors who have reserved a red or a green boat.
ρ(Tempboats, (σcolor=‘red’Boats) ∪ (σcolor=‘green’Boats))
πsname(Tempboats ⋈ Reserves ⋈ Sailors)
QUERY Q6
(Q6) Find the names of sailors who have reserved at least two boats.
ρ(Reservations, πsid,sname,bid(Sailors ⋈ Reserves))
ρ(Reservationpairs(1→sid1, 2→sname1, 3→bid1, 4→sid2,
5→sname2, 6→bid2), Reservations × Reservations)
πsname1(σ(sid1=sid2)∧(bid1≠bid2) Reservationpairs)
QUERY Q7
(Q7) Find the sids of sailors with age over 20 who have not reserved a
red boat.
πsid(σage>20 Sailors) - πsid((σcolor=‘red’ Boats) ⋈ Reserves ⋈ Sailors)
QUERY Q8
(Q8) Find the names of sailors who have reserved all boats.
ρ(Tempsids, (πsid,bid Reserves) / (πbid Boats))
πsname(Tempsids ⋈ Sailors)
QUERY Q9
(Q9) Find the names of sailors who have reserved all boats called
Interlake.
ρ(Tempsids, (πsid,bidReserves)/(πbid(σbname=‘Interlake’Boats)))
πsname(Tempsids ⋈ Sailors)
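The division operator (/) used in the last two queries can be sketched in Python as follows. The `divide` helper and the data are hypothetical: `pairs` plays the role of πsid,bid Reserves, and `divisor` the role of the set of bids being divided by.

```python
def divide(pairs, divisor):
    """pairs / divisor: the x values that are paired with every y in divisor."""
    xs = {x for x, _ in pairs}
    return {x for x in xs if all((x, y) in pairs for y in divisor)}

# Hypothetical (sid, bid) pairs and bid sets.
reserves = {(22, 101), (22, 102), (22, 103), (31, 102), (31, 103)}
all_boats = {101, 102, 103}
interlake_boats = {102, 103}

print(divide(reserves, all_boats))               # {22}
print(sorted(divide(reserves, interlake_boats))) # [22, 31]
```

Sailor 31 drops out of the first result because no (31, 101) pair exists; division keeps a sid only when it appears with every bid in the divisor.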
End of Chapter 1
