Normalization
Normalization
Redundant Information in
Tuples and Update Anomalies
Mixing attributes of multiple entities may
cause problems
EXAMPLE OF AN UPDATE
ANOMALY
Consider the relation:
EMP_PROJ ( Emp#, Proj#, Ename, Pname, No_hours)
Update Anomaly
Insert Anomaly
EXAMPLE OF AN UPDATE
ANOMALY (2)
Delete Anomaly
Spurious Tuples
Bad designs for a relational database may result in
erroneous results for certain JOIN operations
The "lossless join" property is used to guarantee
meaningful results for join operations
The relations should be designed to satisfy the
lossless join condition. No spurious tuples should
be generated by doing a natural-join of any
relations
Functional Dependencies
Functional dependencies (FDs) are used to
specify formal measures of the "goodness"
of relational designs
FDs and keys are used to define normal
forms for relations
FDs are constraints that are derived from the
meaning and interrelationships of the data
attributes
Examples of FD constraints
Social Security Number determines employee name
SSN ENAME
Union
If X Y and X Z, then X YZ
Psuedotransitivity
If X Y and WY Z, then WX Z
Introduction to
Normalization
Normalization: Process of decomposing
unsatisfactory "bad" relations by breaking up their
attributes into smaller relations
Normal form: Condition using keys and FDs of a
relation to certify whether a relation schema is in a
particular normal form
Examples
Second Normal Form
{SSN, PNUMBER} HOURS is a full FD since neither
SSN HOURS nor PNUMBER HOURS hold
{SSN, PNUMBER} ENAME is not a full FD (it is
called a partial dependency ) since SSN ENAME also
holds
A relation schema R is in second normal form (2NF) if
every non-prime attribute A in R is fully functionally
dependent on the primary key
R can be decomposed into 2NF relations via the process
of 2NF normalization
Examples:
SSN DMGRSSN is a transitive FD since
SSN DNUMBER and DNUMBER DMGRSSN hold
SSN ENAME is non-transitive since there is no set
of
attributes X where SSN X and X ENAME
rd
3
Normal Form
BCNF
{Student,course} Instructor
Instructor Course
Decomposing into 2 schemas
{Student,Instructor} {Student,Course}
{Course,Instructor} {Student,Course}
{Course,Instructor} {Instructor,Student}
Example
Given the relation
Book(Book_title, Authorname, Book_type,
Listprice, Author_affil, Publisher)
The FDs are
Book_title Publisher, Book_type
Book_type Listprice
Authorname Author_affil
Example
What normal form the relation in?