Normalizationcse
NORMALIZATION
S#   City    Status  P#   Qty
S1   London  20      P1   300
S1   London  20      P2   200
S1   London  20      P3   400
S1   London  20      P4   200
S1   London  20      P5   100
S1   London  20      P6   100
S2   Paris   10      P1   300
S2   Paris   10      P2   400
S3   Paris   10      P2   200
S4   London  20      P2   200
S4   London  20      P4   300
S4   London  20      P5   400
S# = Supplier No; City = Supplier City; Status = City Status; P# = Part No; Qty = Quantity
Update Anomalies
INSERT: We can’t insert a record for a new
supplier unless the supplier supplies a part,
because P# is part of the primary key.
DELETE: If we delete the only tuple for a
supplier, we destroy not only the shipment but
also the information that the supplier is located
in a particular city. Ex: Supplier S3
UPDATE: If supplier S1 moves from London
to New York, every S1 record must be
updated, because the city is stored redundantly.
Definition
Let R be a relation, and let X and Y be
arbitrary subsets of the set of attributes of R.
Then we say that Y is functionally dependent
on X, in symbols
X → Y
(read "X functionally determines Y"),
if and only if each X value in R has associated
with it precisely one Y value in R.
In other words:
whenever two tuples of R agree on their X
value, they also agree on their Y value.
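The definition can be checked mechanically on a relation instance. The following is a minimal sketch, assuming the relation is held as a list of Python dicts; the helper name fd_holds is hypothetical and the data is the supplier relation shown earlier.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on this relation instance."""
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # two tuples agree on lhs but differ on rhs
    return True


COLUMNS = ("S#", "City", "Status", "P#", "Qty")
DATA = [
    ("S1", "London", 20, "P1", 300), ("S1", "London", 20, "P2", 200),
    ("S1", "London", 20, "P3", 400), ("S1", "London", 20, "P4", 200),
    ("S1", "London", 20, "P5", 100), ("S1", "London", 20, "P6", 100),
    ("S2", "Paris", 10, "P1", 300), ("S2", "Paris", 10, "P2", 400),
    ("S3", "Paris", 10, "P2", 200), ("S4", "London", 20, "P2", 200),
    ("S4", "London", 20, "P4", 300), ("S4", "London", 20, "P5", 400),
]
SUPPLIER = [dict(zip(COLUMNS, t)) for t in DATA]

print(fd_holds(SUPPLIER, ["S#"], ["City"]))  # True
print(fd_holds(SUPPLIER, ["City"], ["S#"]))  # False: London has S1 and S4
```

Note that a check like this only shows that an FD holds on one instance; whether it holds on the schema is a statement about every legal instance.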
Example (contd.)
One FD that holds in the supplier relation: {S#} → {City}
Exercise
Check whether the supplier relation satisfies each of the
following FDs:
<S#, P#> → <QTY>
<S#, P#> → <City>
<S#, P#> → <City, QTY>
<S#, P#> → <S#>
<S#, P#> → <S#, P#, QTY, City>
<QTY> → <S#>
Functional Dependencies
(Cont.)
Let R be a relation schema
α ⊆ R and β ⊆ R
The functional dependency
α → β
holds on R if and only if for any legal relations r(R), whenever
any two tuples t1 and t2 of r agree on the attributes α, they also
agree on the attributes β. That is,
t1[α] = t2[α] ⇒ t1[β] = t2[β]
Example: Consider r(A, B) with the following instance of r.
A  B
1  5
3  7
1  4
On this instance, A → B does NOT hold, but B → A does hold.
Functional Dependencies
(Cont.)
K is a superkey for relation schema R if and only if K → R
K is a candidate key for R if and only if
K → R, and
for no α ⊂ K, α → R
Functional dependencies allow us to express constraints that cannot
be expressed using superkeys. Consider the schema:
inst_dept (ID, name, salary, dept_name, building, budget).
We expect these functional dependencies to hold:
dept_name → building
and ID → building
but would not expect the following to hold:
dept_name → salary
Use of Functional
Dependencies
We use functional dependencies to:
test relations to see if they are legal under a given set of functional
dependencies.
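If a relation r is legal under a set F of functional dependencies,
we say that r satisfies F.
TRIVIAL & NON-TRIVIAL DEPENDENCIES
A functional dependency is trivial if it is satisfied by all instances of a relation.
Example: name → name
In general, α → β is trivial if β ⊆ α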
What About Smaller
Schemas?
Suppose we had started with inst_dept. How would we know to
split up (decompose) it into instructor and department?
Write a rule “if there were a schema (dept_name, building,
budget), then dept_name would be a candidate key”
Denote as a functional dependency:
dept_name → building, budget
In inst_dept, because dept_name is not a candidate key, the
building and budget of a department may have to be repeated.
This indicates the need to decompose inst_dept
Lossy Decomposition
Not all decompositions are good. Suppose we decompose
employee(ID, name, street, city, salary) into
employee1 (ID, name)
employee2 (name, street, city, salary)
The next slide shows how we lose information -- we cannot
reconstruct the original employee relation -- and so, this is a lossy
decomposition.
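A Lossy Decomposition
Relation S with attributes S#, status, CITY, decomposed in two ways.
Decomposition B: (S#, status) and (S#, CITY)
S#   status      S#   CITY
S1   30          S1   Paris
S2   30          S2   Athens
Decomposition C: (S#, status) and (status, CITY)
S#   status      status  CITY
S1   30          30      Paris
S2   30          30      Athens
Designed By Deepak Moud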
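Whether a decomposition loses information can be seen by projecting and re-joining. The sketch below is illustrative only: the helpers project, natural_join and same_relation are hypothetical names, and the two-row relation S (S1 and S2, both with status 30) is assumed from the slide above.

```python
def project(rows, attrs):
    """Projection onto attrs, dropping duplicate tuples."""
    out = []
    for row in rows:
        small = {a: row[a] for a in attrs}
        if small not in out:
            out.append(small)
    return out


def natural_join(left, right):
    """Natural join: merge every pair of tuples that agree on shared attributes."""
    out = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                merged = {**l, **r}
                if merged not in out:
                    out.append(merged)
    return out


def same_relation(r1, r2):
    """Compare two relations as sets of tuples, ignoring row order."""
    canon = lambda rows: sorted(sorted(row.items()) for row in rows)
    return canon(r1) == canon(r2)


S = [
    {"S#": "S1", "status": 30, "CITY": "Paris"},
    {"S#": "S2", "status": 30, "CITY": "Athens"},
]

lossless = natural_join(project(S, ["S#", "status"]), project(S, ["S#", "CITY"]))
lossy = natural_join(project(S, ["S#", "status"]), project(S, ["status", "CITY"]))

print(same_relation(lossless, S))  # True: the join gives back S
print(len(lossy))                  # 4: two spurious tuples appear
```

Joining on status alone cannot tell which supplier belongs to which city, which is exactly the information the lossy decomposition throws away.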
Example of Lossless-Join Decomposition
Decomposition of R = (A, B, C) into R1 = (A, B) and R2 = (B, C)
r:             ∏A,B(r):     ∏B,C(r):
A  B  C        A  B         B  C
α  1  A        α  1         1  A
β  2  B        β  2         2  B
∏A,B(r) ⋈ ∏B,C(r):
A  B  C
α  1  A
β  2  B
Dependencies: Definitions
EMPLOYEE
First Normal Form
Domain is atomic if its elements are considered to be indivisible
units
Examples of non-atomic domains:
Set of names, composite attributes
Identifiers that can be broken up into parts
A relational schema R is in first normal form if the domains of
all attributes of R are atomic
Non-atomic values complicate storage and encourage
redundant (repeated) storage of data
Example: Set of accounts stored with each customer, and
set of owners stored with each account
We assume all relations are in first normal form (and revisit
this in Chapter 22: Object Based Databases)
First Normal Form (Cont’d)
Atomicity is actually a property of how the elements of the
domain are used.
Example: Strings would normally be considered indivisible
Suppose that students are given roll numbers which are
strings of the form CS0012 or EE1127
If the first two characters are extracted to find the
department, the domain of roll numbers is not atomic.
Doing so is a bad idea: leads to encoding of information in
application program rather than in the database.
Dependencies: Definitions
Multivalued Attributes (or repeating groups): non-key
attributes or groups of non-key attributes whose
values are not uniquely identified, directly or
indirectly, by the value of the Primary Key or a part of it
(that is, they are not functionally dependent on it).
If every non-key attribute (or group of non-key attributes) is dependent on
the primary key, or on a subset of it, directly or indirectly, then the table
is in first normal form. (By Deepak Moud Sir)
STUDENT
BOOK
Example 2: Determine NF
Product_ID → Description
All attributes are directly or
indirectly determined by the primary
key; therefore, the relation is at
least in 1 NF
ORDER
Example 3: Determine NF
Part_ID → Description
Part_ID → Price
Part_ID, Comp_ID → No
Comp_ID and No are not determined by the primary
key; therefore, the relation is NOT in 1 NF. No sense in
looking at partial or transitive dependencies.
PART
Example 3: Determine NF
Part_ID → Description
Part_ID → Price
Part_ID, Comp_ID → No
In your solution you will write the following justification:
1) There are M/V attributes; therefore, not 1NF
Conclusion: The relation is not normalized.
PART
Bringing a Relation to 1NF
STUDENT
Bringing a Relation to 1NF
Option 1: Make a determinant of the
repeating group (or the multivalued
attribute) a part of the primary key.
Composite Primary
Key
STUDENT
Bringing a Relation to 1NF
Option 2: Remove the entire repeating group from the
relation. Create another relation which would contain
all the attributes of the repeating group, plus the
primary key from the first relation. In this new relation,
the primary key from the original relation and the
determinant of the repeating group will comprise a
primary key.
STUDENT
Bringing a Relation to 1NF
STUDENT
Stud_ID Name
101 Lennon
125 Johnson
STUDENT_COURSE
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
The relation is at least in 1NF.
There is no COMPOSITE primary key, therefore there can’t be
partial dependencies. Therefore, the relation is at least in 2NF
BOOK
Example 2: Determine NF
Product_ID → Description
We know that the relation is at least
in 1NF, and it is not in 2 NF.
Therefore, we conclude that the
relation is in 1 NF.
ORDER
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
We know that the relation is at least in 2NF, and it is not in 3NF.
Therefore, we conclude that the relation is in 2NF.
BOOK
Bringing a Relation to 2NF
Composite
Primary Key
STUDENT
Bringing a Relation to 2NF
Goal: Remove Partial Dependencies
Composite Partial Dependencies
Primary Key
STUDENT
Bringing a Relation to 2NF
Remove from the original relation the attributes that
depend on part, but not the whole, of the primary key.
For each partial dependency, create a new
relation, with the corresponding part of the primary
key from the original as the primary key.
STUDENT
Bringing a Relation to 2NF
CUSTOMER
Stud_ID  Name     Course_ID  Units
101      Lennon   MSI 250    3.00
101      Lennon   MSI 415    3.00
125      Johnson  MSI 331    3.00
STUDENT_COURSE
Stud_ID  Course_ID
101      MSI 250
101      MSI 415
125      MSI 331
STUDENT    COURSE
New Relations!: Second
Normal Form (2NF)
SUPP1 (Primary Key: S#)
S#   City    Status
S1   London  20
S2   Paris   10
S3   Paris   10
S4   London  20
S5   Athens  30
Note we added a NEW supplier S5 located in Athens easily!
SUPP2 (Primary Key: S#, P#)
S#   P#   Qty
S1   P1   300
S1   P2   200
S1   P3   400
S1   P4   200
S1   P5   100
S1   P6   100
S2   P1   300
S2   P2   400
S3   P2   200
S4   P2   200
S4   P4   300
S4   P5   400
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
Publisher is a non-key attribute, and it determines Address, another
non-key attribute. Therefore, there is a transitive dependency, which
means that the relation is NOT in 3NF.
BOOK
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) No partial dependencies, therefore at least 2NF
3) There is a transitive dependency (Publisher → Address), therefore,
not 3NF
Conclusion: The relation is in 2NF
BOOK
Bringing a Relation to 3NF
Transitive
Dependency
EMPLOYEE
Bringing a Relation to 3NF
Remove the attributes, which are dependent on a
non-key attribute, from the original relation. For each
transitive dependency, create a new relation with the
non-key attribute which is a determinant in the
transitive dependency as a primary key, and the
dependent non-key attribute as a dependent.
EMPLOYEE
Bringing a Relation to 3NF
EMPLOYEE
EMPLOYEE
DEPARTMENT
Dept_ID Dept_Name
1 Acct
2 Mktg
Again, Update Problems
Overcome!
SUPP1 is split into SUPP1.1 (S#, City) and SUPP1.2 (City, Status).
The new structure overcomes the update problems of SUPP1:
INSERT: We can add a city and assign a status
without adding a supplier
DELETE: If we delete a tuple in SUPP1.1 we lose
only Supplier information. Status information for
the City is still available in SUPP1.2
UPDATE: Updating status for a city involves only
one tuple in SUPP1.2. Similarly, if a supplier moves
to a different city, only one tuple needs to be modified
in SUPP1.1
Third Normal Form
A relation schema R is in third normal form (3NF) if for all:
α → β in F+
at least one of the following holds:
α → β is trivial (i.e., β ⊆ α)
α is a superkey for R
Each attribute A in β - α is contained in a candidate key for R.
3NF Example
Relation dept_advisor:
dept_advisor (s_ID, i_ID, dept_name)
F = {s_ID, dept_name → i_ID, i_ID → dept_name}
Two candidate keys: {s_ID, dept_name} and {i_ID, s_ID}
R is in 3NF
s_ID, dept_name → i_ID: {s_ID, dept_name} is a superkey
i_ID → dept_name: dept_name is contained in a candidate key
Closure of a Set of
Functional Dependencies
Given a set F of functional dependencies, there are certain
other functional dependencies that are logically implied by F.
For example: If A → B and B → C, then we can infer
that A → C
More on functional dependency inference later…
The set of all functional dependencies logically implied by F
is the closure of F.
We denote the closure of F by F+.
F+ is a superset of F.
Closure of a Set of
Functional Dependencies
We can find F +, the closure of F, by repeatedly applying
Armstrong’s Axioms:
if β ⊆ α, then α → β (reflexivity)
if α → β, then γα → γβ (augmentation)
if α → β and β → γ, then α → γ (transitivity)
Example
R = (A, B, C, G, H, I)
F = {A → B
A → C
CG → H
CG → I
B → H}
some members of F+
A → H
by transitivity from A → B and B → H
AG → I
by augmenting A → C with G, to get AG → CG
and then transitivity with CG → I
Closure of Functional
Dependencies (Cont.)
Additional rules:
If α → β holds and α → γ holds, then α → βγ holds
(union)
If α → βγ holds, then α → β holds and α → γ holds
(decomposition)
If α → β holds and γβ → δ holds, then αγ → δ holds
(pseudotransitivity)
Closure of Attribute Sets
Given a set of attributes α, define the closure of α under F (denoted by α+)
as the set of attributes that are functionally determined by α under F
result := α;
while (changes to result) do
  for each β → γ in F do
  begin
    if β ⊆ result then result := result ∪ γ
  end
If α+ includes all attributes of R, then α is a superkey. (By Deepak Moud Sir)
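The closure loop above translates almost line for line into code. This is a minimal sketch, assuming each FD is written as a pair of strings whose characters are single-letter attribute names; the function name closure is illustrative, and F is the set of dependencies over R = (A, B, C, G, H, I) used in the earlier Armstrong's axioms example.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print("".join(sorted(closure("AG", F))))  # ABCGHI
print("".join(sorted(closure("A", F))))   # ABCH
```

Since (AG)+ contains every attribute of R, AG is a superkey; A alone is not.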
Example of Attribute Set Closure
R = (A, B, C, G, H, I)
F = {A → B, A → C,
CG → H, CG → I,
B → H}
(AG)+
1. result = AG
2. result = ABCG (A → C and A → B)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
Is AG a candidate key?
1. Is AG a super key?
   1. Does AG → R? == Is (AG)+ ⊇ R
2. Is any subset of AG a superkey?
   1. Does A → R? == Is (A)+ ⊇ R
   2. Does G → R? == Is (G)+ ⊇ R
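The candidate-key question can be answered by brute force for small schemas. The following sketch is one possible approach, not an algorithm from these slides: it tries attribute sets in order of size and keeps each superkey that contains no smaller key. The name candidate_keys is hypothetical, and FDs are again pairs of single-letter attribute strings.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Enumerate all minimal superkeys (exponential; fine for small schemas)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            cand = set(combo)
            if any(k <= cand for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(cand, fds) >= set(schema):
                keys.append(cand)
    return keys


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(candidate_keys("ABCGHI", F))  # the only key found is {A, G}
```

Because smaller sets are tried first, any superkey that survives the subset check is guaranteed to be minimal.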
Quiz Time
Uses of Attribute Closure
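A decomposition of R into R1 and R2 is a lossless join if at least one of
the following dependencies is in F+:
R1 ∩ R2 → R1
R1 ∩ R2 → R2
The above functional dependencies are a sufficient condition
for lossless join decomposition; the dependencies are a
necessary condition only if all constraints are functional
dependencies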
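This condition reduces to one attribute-closure computation on the common attributes. A minimal sketch, assuming a binary decomposition and FDs given as pairs of single-letter attribute strings; is_lossless is a hypothetical name and the small schemas are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if R1 ∩ R2 determines all of R1 or all of R2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


print(is_lossless("AB", "BC", [("A", "B"), ("B", "C")]))  # True
print(is_lossless("AB", "CD", [("A", "B"), ("C", "D")]))  # False
```

In the second call the two schemas share no attribute, so the closure of the empty set covers neither side.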
Example
R = (A, B, C)
F = {A → B, B → C}
Can be decomposed in two different ways
R1 = (A, B), R2 = (B, C)
Lossless-join decomposition:
R1 ∩ R2 = {B} and B → BC
Dependency preserving
R1 = (A, B), R2 = (A, C)
Lossless-join decomposition:
R1 ∩ R2 = {A} and A → AB
Not dependency preserving
(cannot check B → C without computing R1 ⋈ R2)
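Dependency preservation can also be tested without computing F+. The sketch below uses a standard closure-based test that these slides do not spell out, so treat it as an assumption: for each α → β in F, grow a result set from α using only what each Ri can see, then check that β was reached. The name preserves_dependencies is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(decomposition, fds):
    """True if every FD in fds can be enforced using the Ri individually."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in decomposition:
                ri = set(schema)
                # attributes of Ri implied by the part of result inside Ri
                gained = closure(result & ri, fds) & ri
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False
    return True


F = [("A", "B"), ("B", "C")]
print(preserves_dependencies(["AB", "BC"], F))  # True
print(preserves_dependencies(["AB", "AC"], F))  # False: B -> C is lost
```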
Dependency Preservation
Higher Normal Forms
Boyce-Codd Normal Form
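A relation schema R is in BCNF with respect to a set F of
functional dependencies if for all functional dependencies
in F+ of the form
α → β
where α ⊆ R and β ⊆ R, at least one of the following holds:
α → β is trivial (i.e., β ⊆ α)
α is a superkey for R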
Decomposing a Schema into
BCNF
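Suppose we have a schema R and a non-trivial dependency
α → β
that causes a violation of BCNF.
We decompose R into:
• (α ∪ β)
• (R - (β - α))
In our example,
α = dept_name
β = building, budget
and inst_dept is replaced by
(α ∪ β) = (dept_name, building, budget)
(R - (β - α)) = (ID, name, salary, dept_name)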
Example
R = (A, B, C )
F = {A → B
B → C}
Key = {A}
R is not in BCNF
Decomposition R1 = (A, B), R2 = (B, C)
R1 and R2 in BCNF
Lossless-join decomposition
Dependency preserving
Example
Question 2 Suppose you are given a relation R = (A, B, C, D, E) with the
following functional
Dependencies: {CE -> D, D -> B,C -> A}.
a. Find all candidate keys.
b. Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).
c. If the relation is not in BCNF, decompose it until it becomes BCNF. At
each step, identify a
new relation, decompose and re-compute the keys and the normal forms
they satisfy.
Example
Question 3 Suppose you are given a relation R=(A,B,C,D,E) with the
following functional dependencies:
{BC -> ADE,D -> B}.
a. Find all candidate keys.
b. Identify the best normal form that R satisfies (1NF, 2NF, 3NF, or BCNF).
c. If the relation is not in BCNF, decompose it until it becomes BCNF. At
each step, identify a
new relation, decompose and re-compute the keys and the normal forms
they satisfy.
Answer.
a. The keys are {B,C} and {C,D}
b. The relation is in 3NF
c. R is not in BCNF because D -> B holds and D is not a superkey. Decomposing on
D -> B gives (B, D) and (A, C, D, E), which are both in BCNF, but the dependency
BC -> ADE is then no longer preserved: it spans both relations and can only be
checked by joining them.
Example
Question 4. Which normal form is considered adequate for normal
relational database design?
(a) 2NF (b) 5NF (c) 4NF (d) 3NF
Ans: (d)
Explanation:
A relational database table is often described as "normalized" if it is in
the Third Normal Form because most of the 3NF tables are free of
insertion, update, and deletion anomalies.
Example
Question 5. Consider a schema R (A, B, C, D) and functional
dependencies A -> B and C -> D. Then the decomposition of R into R1
(A, B) and R2(C, D) is
(a) dependency preserving and lossless join
(b) lossless join but not dependency preserving
(c) dependency preserving but not lossless join
(d) not dependency preserving and not lossless join
Example
Ans: (c)
While decomposing a relational table we must verify the following properties:
i) Dependency Preserving Property: A decomposition is said to be
dependency preserving if F+=(F1 ∪ F2 ∪ .. Fn)+, Where F+=total functional
dependencies(FDs) on universal relation R, F1 = set of FDs of R1, and F2 =
set of FDs of R2.
For the above question R1 preserves A->B and R2 preserves C->D. Since
the FDs of universal relation R is preserved by R1 and R2, the
decomposition is dependency preserving.
ii) Lossless-Join Property:
The decomposition is a lossless-join decomposition of R if at least one of the
following functional dependencies is in F+:
a) R1 ∩ R2 -> R1
b) R1 ∩ R2 -> R2
This ensures that the attributes involved in the natural join (R1 ∩ R2) form a
superkey for at least one of the two relations. In the above question schema R is
decomposed into R1 (A, B) and R2 (C, D), and R1 ∩ R2 is empty. So, the
decomposition is not lossless.
Example
6. A table has fields F1, F2, F3, F4, and F5, with the following functional
dependencies:
F1->F3
F2->F4
(F1,F2)->F5
in terms of normalization, this table is in
(a) 1NF (b) 2NF (c) 3NF (d) None of these
Ans: (a)
Explanation:
Since the primary key is not given we have to derive the primary key of the
table. Using the closure set of attributes we get the primary key as (F1,F2).
From functional dependencies, "F1->F3, F2->F4", we can see that there is
partial functional dependency, therefore it is not in 2NF. Hence the table is in
1NF.
Example
7. Let R(A,B,C,D,E,P,G) be a relational schema in which the following FDs
are known to hold:
AB->CD
DE->P
C->E
P->C
B->G
The relation schema R is
(a) in BCNF (b) in 3NF, but not in BCNF
(c) in 2NF, but not in 3NF (d) not in 2NF
Ans: (d)
Explanation:
From the closure set of attributes we can see that the key for the relation is
AB. The FD B->G is a partial dependency; hence it is not in 2NF.
BCNF and Dependency
Preservation
Constraints, including functional dependencies, are costly to check
in practice unless they pertain to only one relation
If it is sufficient to test only those dependencies on each individual
relation of a decomposition in order to ensure that all functional
dependencies hold, then that decomposition is dependency
preserving.
Because it is not always possible to achieve both BCNF and
dependency preservation, we consider a weaker normal form,
known as third normal form.
Testing for BCNF
To check if a non-trivial dependency α → β causes a violation of BCNF
1. compute α+ (the attribute closure of α), and
2. verify that it includes all attributes of R, that is, it is a superkey of R.
Simplified test: To check if a relation schema R is in BCNF, it suffices to
check only the dependencies in the given set F for violation of BCNF,
rather than checking all dependencies in F+.
If none of the dependencies in F causes a violation of BCNF, then
none of the dependencies in F+ will cause a violation of BCNF.
However, simplified test using only F is incorrect when testing a
relation in a decomposition of R
Consider R = (A, B, C, D, E), with F = {A → B, BC → D}
Decompose R into R1 = (A, B) and R2 = (A, C, D, E)
Neither of the dependencies in F contains only attributes from
(A, C, D, E), so we might be misled into thinking R2 satisfies BCNF.
In fact, the dependency AC → D in F+ shows that R2 is not in BCNF.
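The simplified test for the original schema R can be sketched as follows. It checks only the dependencies listed in F, so, as noted above, it should not be applied to a relation inside a decomposition. The name bcnf_violations is hypothetical and FDs are pairs of single-letter attribute strings.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Non-trivial FDs in fds whose left side is not a superkey of schema."""
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and not closure(lhs, fds) >= set(schema):
            bad.append((lhs, rhs))
    return bad


print(bcnf_violations("ABC", [("A", "B"), ("B", "C")]))  # [('B', 'C')]
```

An empty result means R is in BCNF with respect to F.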
Example of BCNF Decomposition
R = (A, B, C)
F = {A → B
B → C}
Key = {A}
R is not in BCNF (B → C but B is not superkey)
Decomposition
R1 = (B, C)
R2 = (A, B)
Example of BCNF Decomposition
BCNF Decomposition
(Cont.)
course is in BCNF
How do we know this?
building, room_number→capacity holds on class-1
but {building, room_number} is not a superkey for class-1.
We replace class-1 by:
classroom (building, room_number, capacity)
section (…, room_number, time_slot_id)
classroom and section are in BCNF.
BCNF and Dependency
Preservation
It is not always possible to get a BCNF decomposition that is
dependency preserving
R = (J, K, L )
F = {JK → L
L → K}
Two candidate keys = JK and JL
R is not in BCNF
Any decomposition of R will fail to preserve
JK → L
This implies that testing for JK → L requires a join
Normal Forms: Review
Testing Decomposition for BCNF
To check if a relation Ri in a decomposition of R is in BCNF,
Either test Ri for BCNF with respect to the restriction of F to Ri
(that is, all FDs in F+ that contain only attributes from Ri)
or use the original set of dependencies F that hold on R, but with
the following test:
for every set of attributes α ⊆ Ri, check that α+ (the attribute
closure of α) either includes no attribute of Ri - α, or includes
all attributes of Ri.
If the condition is violated by some α → β in F, the dependency
α → (α+ - α) ∩ Ri
can be shown to hold on Ri, and Ri violates BCNF.
We use above dependency to decompose Ri
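The subset test above, combined with the (α ∪ β), (R - (β - α)) split, gives a small BCNF decomposition routine. This is a sketch under stated assumptions: it enumerates every subset of Ri, so it is exponential and only suitable for small schemas, and the names find_violation and bcnf_decompose are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(ri, fds):
    """Return (alpha, beta) with alpha -> beta violating BCNF on ri, else None."""
    ri = set(ri)
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            alpha = set(combo)
            beta = (closure(alpha, fds) - alpha) & ri
            # violation: alpha determines something in ri, but not all of ri
            if beta and not (alpha | beta) >= ri:
                return alpha, beta
    return None


def bcnf_decompose(schema, fds):
    """Split schemas on violating dependencies until none remain."""
    todo, done = [set(schema)], []
    while todo:
        ri = todo.pop()
        violation = find_violation(ri, fds)
        if violation is None:
            done.append(ri)
        else:
            alpha, beta = violation
            todo.append(alpha | beta)  # (alpha ∪ beta)
            todo.append(ri - beta)     # (Ri - (beta - alpha))
    return done


F = [("A", "B"), ("BC", "D")]
print(find_violation("ACDE", F))  # AC -> D violates BCNF on (A, C, D, E)
print(bcnf_decompose("ABC", [("A", "B"), ("B", "C")]))  # (A, B) and (B, C)
```

Each split produces two strictly smaller schemas, so the loop always terminates.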
Third Normal Form:
Motivation
There are some situations where
BCNF is not dependency preserving, and
efficient checking for FD violation on updates is
important
Solution: define a weaker normal form, called Third
Normal Form (3NF)
Allows some redundancy (with resultant problems; we
will see examples later)
But functional dependencies can be checked on
individual relations without computing a join.
There is always a lossless-join, dependency-
preserving decomposition into 3NF.
Third Normal Form
A relation schema R is in third normal form (3NF) if for all:
α → β in F+
at least one of the following holds:
α → β is trivial (i.e., β ⊆ α)
α is a superkey for R
Each attribute A in β - α is contained in a candidate key for R.
Redundancy in 3NF
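There is some redundancy in this schema
Example of problems due to redundancy in 3NF
R = (J, K, L)
F = {JK → L, L → K}
J     L   K
j1    l1  k1
j2    l1  k1
j3    l1  k1
null  l2  k2
Repetition of information: the relationship (l1, k1) is stored three times.
Need for null values: (l2, k2) has no corresponding value for J.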
Testing for 3NF
Testing a given schema to see if it satisfies 3NF has been
shown to be NP-hard
Possible to achieve 3NF by repeated decomposition based on
finding functional dependencies that show violation of 3NF
similar to BCNF decomposition, NP hardness not a big deal
since schemas tend to be small
BUT does not guarantee dependency preservation
e.g. R = (A, B, C)
Canonical Cover
Sets of functional dependencies may have redundant
dependencies that can be inferred from the others
For example: A → C is redundant in: {A → B, B → C, A → C}
Parts of a functional dependency may be redundant
E.g.: on RHS: {A → B, B → C, A → CD} can be
simplified to
{A → B, B → C, A → D}
E.g.: on LHS: {A → B, B → C, AC → D} can be
simplified to
{A → B, B → C, A → D}
Intuitively, a canonical cover of F is a “minimal” set of functional
dependencies equivalent to F, having no redundant dependencies
or redundant parts of dependencies
Extraneous Attributes
Consider a set F of functional dependencies and the functional
dependency α → β in F.
Attribute A is extraneous in α if A ∈ α
and F logically implies (F - {α → β}) ∪ {(α - A) → β}.
Attribute A is extraneous in β if A ∈ β
and the set of functional dependencies
(F - {α → β}) ∪ {α → (β - A)} logically implies F.
Note: implication in the opposite direction is trivial in each of the cases
above, since a “stronger” functional dependency always implies a
weaker one
Example: Given F = {A → C, AB → C}
B is extraneous in AB → C because {A → C, AB → C} logically
implies A → C (i.e. the result of dropping B from AB → C).
Example: Given F = {A → C, AB → CD}
C is extraneous in AB → CD since AB → C can be inferred even
after deleting C
Testing if an Attribute is
Extraneous
Consider a set F of functional dependencies and the functional
dependency α → β in F.
To test if attribute A ∈ α is extraneous in α
1. compute ({α} - A)+ using the dependencies in F
2. check that ({α} - A)+ contains β; if it does, A is extraneous
in α
To test if attribute A ∈ β is extraneous in β
1. compute α+ using only the dependencies in
F’ = (F - {α → β}) ∪ {α → (β - A)},
2. check that α+ contains A; if it does, A is extraneous in β
• Example: Given F = {A → C, AB → C}:
B is extraneous in AB → C because AB - B = A, and A+ contains C
• Example: Given F = {A → C, AB → CD}:
C is extraneous in AB → CD since, under {A → C, AB → D},
(AB)+ = ABCD, which contains C
Canonical Cover
A canonical cover for F is a set of dependencies Fc such that
F logically implies all dependencies in Fc, and
Fc logically implies all dependencies in F, and
No functional dependency in Fc contains an extraneous attribute, and
Each left side of functional dependency in Fc is unique.
Computing a Canonical
Cover
To compute a canonical cover for F:
repeat
  Use the union rule to replace any dependencies in F
  α1 → β1 and α1 → β2 with α1 → β1 β2
  Find a functional dependency α → β with an
  extraneous attribute either in α or in β
  /* Note: test for extraneous attributes done using Fc, not F */
  If an extraneous attribute is found, delete it from α → β
until F does not change
Note: Union rule may become applicable after some extraneous
attributes have been deleted, so it has to be re-applied
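Computing a Canonical Cover
R = (A, B, C)
F = {A → BC
B → C
A → B
AB → C}
Combine A → BC and A → B into A → BC
Set is now {A → BC, B → C, AB → C}
A is extraneous in AB → C
Check if the result of deleting A from AB → C is implied by the other
dependencies
Yes: in fact, B → C is already present!
Set is now {A → BC, B → C}
C is extraneous in A → BC
A → C is implied by A → B and B → C
The canonical cover is: A → B, B → C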
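The repeat loop for computing a canonical cover can be sketched in code. This is an illustrative version, not the slides' exact procedure: it removes one extraneous attribute at a time and re-applies the union rule after each removal, and the helper names merge, remove_one_extraneous and canonical_cover are hypothetical. It may remove attributes in a different order than the worked example, but reaches the same cover here.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def merge(fds):
    """Union rule: combine dependencies that share a left side."""
    merged = {}
    for lhs, rhs in fds:
        merged.setdefault(frozenset(lhs), set()).update(rhs)
    # dependencies whose right side became empty are dropped
    return [(lhs, frozenset(rhs)) for lhs, rhs in merged.items() if rhs]


def remove_one_extraneous(fds):
    """Delete one extraneous attribute if there is one; return (fds, changed)."""
    for i, (lhs, rhs) in enumerate(fds):
        rest = fds[:i] + fds[i + 1:]
        for a in sorted(lhs):  # extraneous on the left side?
            if rhs <= closure(lhs - {a}, fds):
                return rest + [(lhs - {a}, rhs)], True
        for a in sorted(rhs):  # extraneous on the right side?
            if a in closure(lhs, rest + [(lhs, rhs - {a})]):
                return rest + [(lhs, rhs - {a})], True
    return fds, False


def canonical_cover(fds):
    fds = merge(fds)
    changed = True
    while changed:
        fds, changed = remove_one_extraneous(fds)
        fds = merge(fds)  # the union rule may apply again after a deletion
    return fds


F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
for lhs, rhs in canonical_cover(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

For this F the loop ends with the two dependencies A -> B and B -> C.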
3NF Decomposition
Algorithm
Let Fc be a canonical cover for F;
i := 0;
for each functional dependency α → β in Fc do
  if none of the schemas Rj, 1 ≤ j ≤ i contains α β
  then begin
    i := i + 1;
    Ri := α β
  end
if none of the schemas Rj, 1 ≤ j ≤ i contains a candidate key for R
then begin
  i := i + 1;
  Ri := any candidate key for R;
end
/* Optionally, remove redundant relations */
repeat
  if any schema Rj is contained in another schema Rk
  then /* delete Rj */
    Rj := Ri;
    i := i - 1;
until no more Rj can be deleted
return (R1, R2, ..., Ri)
3NF Decomposition
Algorithm (Cont.)
Above algorithm ensures:
each relation schema Ri is in 3NF
decomposition is dependency preserving and lossless-join
3NF Decomposition: An
Example
Relation schema:
cust_banker_branch = (customer_id, employee_id,
branch_name, type )
The functional dependencies for this relation schema are:
1. customer_id, employee_id → branch_name, type
2. employee_id → branch_name
3. customer_id, branch_name → employee_id
We first compute a canonical cover
branch_name is extraneous in the r.h.s. of the 1st dependency
No other attribute is extraneous, so we get FC =
customer_id, employee_id → type
employee_id → branch_name
customer_id, branch_name → employee_id
3NF Decomposition
Example (Cont.)
The for loop generates following 3NF schema:
(customer_id, employee_id, type )
(employee_id, branch_name)
(customer_id, branch_name, employee_id)
Observe that (customer_id, employee_id, type ) contains a
candidate key of the original schema, so no further relation
schema needs be added
At end of for loop, detect and delete schemas, such as
(employee_id, branch_name), which are subsets of other schemas
result will not depend on the order in which FDs are considered
The resultant simplified 3NF schema is:
(customer_id, employee_id, type)
(customer_id, branch_name, employee_id)
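The cust_banker_branch example can be reproduced with a short sketch of the 3NF decomposition algorithm. It assumes the canonical cover is already given, represents each FD as a pair of tuples of attribute names, and uses the fact that a schema contains a candidate key exactly when its closure is all of R. The names candidate_key and synthesize_3nf are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_key(schema, fds):
    """Shrink the full schema to one candidate key by dropping attributes."""
    key = set(schema)
    for a in sorted(schema):
        if closure(key - {a}, fds) >= set(schema):
            key.discard(a)
    return key


def synthesize_3nf(schema, cover):
    relations = []
    for lhs, rhs in cover:  # one schema per dependency in the cover
        ri = set(lhs) | set(rhs)
        if not any(ri <= r for r in relations):
            relations.append(ri)
    # add a candidate key if no schema is already a superkey of R
    if not any(closure(r, cover) >= set(schema) for r in relations):
        relations.append(candidate_key(schema, cover))
    # optionally drop schemas contained in another schema
    return [r for r in relations if not any(r < other for other in relations)]


R = ("customer_id", "employee_id", "branch_name", "type")
FC = [
    (("customer_id", "employee_id"), ("type",)),
    (("employee_id",), ("branch_name",)),
    (("customer_id", "branch_name"), ("employee_id",)),
]
for relation in synthesize_3nf(R, FC):
    print(sorted(relation))
```

The two schemas printed match the simplified 3NF schema above; (employee_id, branch_name) is dropped because it is contained in another schema.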
Comparison of BCNF and
3NF
It is always possible to decompose a relation into a set of relations
that are in 3NF such that:
the decomposition is lossless
the dependencies are preserved
It is always possible to decompose a relation into a set of relations
that are in BCNF such that:
the decomposition is lossless
it may not be possible to preserve dependencies.