Unit 2 Normalization-3
Normalization
Normalization of data
◦ Normalization: the process of decomposing unsatisfactory ("bad") relations
by breaking up their attributes into smaller relations.
◦ Relations are decomposed to minimize redundancy and update anomalies.
Properties of Normalization
There are two important properties of decompositions:
a) Lossless join (non-additive join) property:
joining the decomposed relations does not produce spurious tuples.
b) Dependency preservation property:
each functional dependency is represented in some decomposed relation.
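The lossless join property can be tested mechanically for a decomposition into two relations: the split is non-additive if the common attributes functionally determine all of R1 or all of R2. The sketch below is a minimal illustration of that test; the relation R(A, B, C), the dependency A → B, and the function names are assumptions made up for the example, not taken from the slides.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) must determine all of R1 or all of R2."""
    common = set(r1) & set(r2)
    reach = closure(common, fds)
    return set(r1) <= reach or set(r2) <= reach


# Hypothetical schema R(A, B, C) with the single dependency A -> B.
fds = [({"A"}, {"B"})]

print(is_lossless_binary({"A", "B"}, {"A", "C"}, fds))  # True: A determines R1
print(is_lossless_binary({"A", "B"}, {"B", "C"}, fds))  # False: B determines neither
```

The second decomposition shares only B, which determines nothing, so joining its two parts can produce spurious tuples.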
Normal forms & Normal tests
The normal form of a relation schema is the
highest normal form satisfied by the schema.
There are various normal forms, namely
first normal form, second normal form,
third normal form, and Boyce-Codd normal form,
along with normal tests to verify whether a relation
schema is in a desired normal form.
First normal form (1NF)
A relation schema is said to be in first
normal form if all its attributes are atomic.
A schema which is not in first normal form
Dlocations is a multi-valued attribute.
Conversion into first normal form
Conversion into first normal form
An alternative to decomposition also exists:
you may expand the primary key by incorporating the
multi-valued attribute into it.
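Expanding the primary key amounts to storing one tuple per value of the multi-valued attribute. The sketch below illustrates this on DEPARTMENT; the rows, the function name `to_1nf`, and the single-valued attribute name `Dlocation` are hypothetical and used only for illustration.

```python
# Hypothetical DEPARTMENT rows; Dlocations holds a set of values, so not 1NF.
departments = [
    {"Dname": "Research", "Dnumber": 5, "Dlocations": ["Bellaire", "Houston"]},
    {"Dname": "Headquarters", "Dnumber": 1, "Dlocations": ["Houston"]},
]


def to_1nf(rows, multi_attr, single_attr):
    """Emit one row per value of multi_attr, stored under single_attr."""
    flat = []
    for row in rows:
        for value in row[multi_attr]:
            new_row = {k: v for k, v in row.items() if k != multi_attr}
            new_row[single_attr] = value
            flat.append(new_row)
    return flat


flat = to_1nf(departments, "Dlocations", "Dlocation")

# The expanded primary key (Dnumber, Dlocation) identifies each flattened row.
keys = [(row["Dnumber"], row["Dlocation"]) for row in flat]
for row in flat:
    print(row)
```

Note that Dname is now repeated for every location of a department, which is the redundancy this technique introduces.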
Conversion into first normal form
If the maximum number of values of the multi-valued
attribute is known, you may replace the
multi-valued attribute by that number of attributes.
In the example, instead of using Dlocations, you
may use three attributes, namely
Dlocation1, Dlocation2, Dlocation3,
assuming that the maximum number of values of
Dlocations can be three.
This solution has the disadvantage of introducing NULL values if most
departments have fewer than three locations. It further introduces spurious
semantics about the ordering among the location values that is not
originally intended. Querying on this attribute becomes more difficult.
Multi-valued attribute replaced
DEPARTMENT
Dname Dnumber Dmgr_ssn Dlocation1 Dlocation2 Dlocation3
Conversion into first normal form
First normal form does not allow complex attributes either.
Conversion into first normal form
Decompose
Multiple multi-valued attributes
This relation is NOT in 1NF
and so
Second normal form (2NF)
A relation schema R is said to be in second normal
form if
Example
Decomposition into 2NF
In order to bring the schema EMP_PROJ into 2NF, we decompose it
with respect to its partial functional dependencies.
Decomposition with respect to
{Ssn} → {Ename} results in
R1(Ssn, Ename) &
R2(Ssn, Pnumber, Hours, Pname, Plocation).
R1 is in 2NF but R2 is not, because of the partial dependency {Pnumber} →
{Pname, Plocation}.
So decompose R2 with respect to
{Pnumber} → {Pname, Plocation}.
Decomposition into 2NF
The decomposition of R2 with respect to {Pnumber} → {Pname,
Plocation} results in
R3(Pnumber, Pname, Plocation) &
R4(Ssn, Pnumber, Hours).
So decomposition of EMP_PROJ with respect to its partial
dependencies results in three relation schemas, namely
R1(Ssn, Ename), R3(Pnumber, Pname, Plocation) &
R4(Ssn, Pnumber, Hours).
All these relations are in 2NF. (In fact, they are in higher normal
forms than 2NF.)
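The partial dependencies that drive this decomposition can be found by computing the closure of each proper subset of the key. The sketch below does this for EMP_PROJ and then confirms that R4 has none. It assumes the primary key {Ssn, Pnumber} and the dependency {Ssn, Pnumber} → {Hours}, which the slides imply but do not state; the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, attrs, fds):
    """Map each proper subset of key to the non-key attributes it determines."""
    nonkey = set(attrs) - set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependents = closure(subset, fds) & nonkey
            if dependents:
                found[subset] = dependents
    return found


# Assumed dependencies on EMP_PROJ (the first one is implied, not stated).
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]
emp_proj = {"Ssn", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
key = {"Ssn", "Pnumber"}

print(partial_dependencies(key, emp_proj, fds))
print(partial_dependencies(key, {"Ssn", "Pnumber", "Hours"}, fds))  # R4: {}
```

Each entry of the first result corresponds to one decomposition step: Ssn yields R1 and Pnumber yields R3.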
Decomposed relation schemas of EMP_PROJ
R1 R3
R4
Third normal form (3NF)
A relation schema R is said to be in third normal
form if
Example
Consider the relation schema
Decomposition
So decompose the schema with respect to the transitive
dependency.
Decomposition
Boyce-Codd normal form (BCNF)
A relation schema is said to be in BCNF if
(i) it is in third normal form and
(ii) no key attribute depends on a non-key attribute.
FD1: AB → C
FD2: C → B
Another example
Consider another relation schema TEACH(Student#,
Course#, Instructor#) and
suppose that Instructor# → Course#.
This functional dependency means that an instructor can
teach at most one course.
Since Course# is a key attribute, the schema TEACH is
NOT in BCNF.
Decomposition of TEACH with respect to Instructor# →
Course# results in
R1(Instructor#, Course#) & R2(Student#, Instructor#).
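The TEACH example can be checked with the standard superkey formulation of BCNF: a dependency violates BCNF when its left-hand side is not a superkey of the relation. The sketch below is a simplified check that only examines the listed dependencies, not every dependency implied by them. The dependency {Student#, Course#} → {Instructor#} is an assumption added so that Course# is part of a key; the function names are illustrative.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return listed FDs that apply to relation but whose lhs is not a superkey."""
    relation = set(relation)
    violations = []
    for lhs, rhs in fds:
        # The FD applies if its lhs lies in the relation and it is non-trivial there.
        applies = lhs <= relation and bool((rhs & relation) - lhs)
        if applies and not relation <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations


fds = [
    ({"Student#", "Course#"}, {"Instructor#"}),  # assumed, not stated in the slides
    ({"Instructor#"}, {"Course#"}),
]

print(bcnf_violations({"Student#", "Course#", "Instructor#"}, fds))
print(bcnf_violations({"Instructor#", "Course#"}, fds))   # R1: []
print(bcnf_violations({"Student#", "Instructor#"}, fds))  # R2: []
```

Under these assumptions only Instructor# → Course# is reported for TEACH, and neither R1 nor R2 has a violation among the listed dependencies.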
Fourth normal form (4NF)
A relation is in 4NF if it is in Boyce-Codd normal form and has no
non-trivial multi-valued dependencies.
FDs on the schema
FD1: {Property_id#} → {County_name, Lot#, Area, Price,
Tax_rate}
Partial dependency
The attributes that are not part of any of the keys are called
non-key attributes.
General definition of second normal form
A relation schema is said to be in second normal
form if
(i) it is in first normal form and
(ii) there is no partial dependency of a non-key attribute on
any of the keys of the schema.
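The general definition requires looking at every candidate key, not just the primary key. The sketch below enumerates candidate keys by brute force and then reports partial dependencies of non-key attributes on any of them. It uses the attributes of FD1 above; the dependencies labeled FD2 to FD4 in the code are assumptions added for illustration and do not appear in the text, and the function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for minimal attribute sets whose closure is attrs."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            subset = set(combo)
            if any(k <= subset for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(subset, fds) >= attrs:
                keys.append(subset)
    return keys


def general_2nf_violations(attrs, fds):
    """List (part_of_key, non_key_attrs) pairs that break the general 2NF rule."""
    keys = candidate_keys(attrs, fds)
    nonkey = set(attrs) - set().union(*keys)
    bad = []
    for key in keys:
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                dependents = closure(combo, fds) & nonkey
                if dependents:
                    bad.append((set(combo), dependents))
    return bad


attrs = {"Property_id#", "County_name", "Lot#", "Area", "Price", "Tax_rate"}
fds = [
    ({"Property_id#"}, attrs - {"Property_id#"}),               # FD1 (from the text)
    ({"County_name", "Lot#"}, attrs - {"County_name", "Lot#"}),  # FD2 (assumed)
    ({"County_name"}, {"Tax_rate"}),                             # FD3 (assumed)
    ({"Area"}, {"Price"}),                                       # FD4 (assumed)
]

print(candidate_keys(attrs, fds))
print(general_2nf_violations(attrs, fds))
```

Under these assumed dependencies, Tax_rate depends on County_name alone, which is only part of the second key, so the schema fails the general 2NF test even though nothing depends on part of Property_id#.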
Decomposition into 2NF
Decomposition into 3NF
Not in 3NF
Exercise 2
Consider the following relation for published books:
BOOK (Book_title, Author_name, Book_type, List_price, Author_affil, Publisher).
Author_affil refers to the affiliation of the author.
Suppose that the following dependencies exist:
Exercise 4
Consider the universal relation
R = {A, B, C, D, E, G, H, I, J, K}
and the set of functional dependencies
F = {AB → C, A → DE, B → K, K → GH, D → IJ}.
What is the key for R? Decompose R into 2NF and then 3NF
relations.