Normal Forms
Normalization
Normalization in a DBMS is a systematic approach to decomposing (breaking down)
relations in order to eliminate data redundancy (repetition of data) and undesirable
anomalies such as insertion, update, and deletion anomalies.
To understand these anomalies let us take an example of a Student table.
Anomalies
Data Redundancy: the fields branch, hod (Head of Department), and office_tel are
repeated for all students who are in the same branch of the college.
Insertion Anomaly: For a new admission, the student's data cannot be inserted until
the student opts for a branch, or else we have to set the branch information to
NULL. Also, if we insert data for 100 students of the same branch, the branch
information is repeated for all 100 students.
Update Anomaly: If the hod of a branch changes, the records of all students in that
branch have to be updated; if any record is missed, the data becomes inconsistent.
Deletion Anomaly: If only a single student is enrolled in a branch, and that student
leaves the college, or for some reason, the entry for the student is deleted, we will
lose the branch information too.
Solution:
The solution for all three anomalies described above is to keep the student
information and the branch information in two different tables, and to use the
branch_id in the student table to reference the branch.
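The two-table solution can be sketched with Python's built-in sqlite3 module. The column names student_id and name, and all sample values, are assumptions made for illustration; only branch, hod, office_tel, and branch_id come from the text above.

```python
import sqlite3

# In-memory database; the schema below is a hypothetical illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE branch (
    branch_id  INTEGER PRIMARY KEY,
    branch     TEXT NOT NULL,
    hod        TEXT NOT NULL,
    office_tel TEXT NOT NULL
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    branch_id  INTEGER REFERENCES branch(branch_id)
);
""")

# Branch details are stored once, however many students join the branch.
conn.execute("INSERT INTO branch VALUES (1, 'CSE', 'Mr. X', '53337')")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(401, "Akon", 1), (402, "Bkon", 1)],
)

# Update: a change of hod touches exactly one row.
updated = conn.execute(
    "UPDATE branch SET hod = 'Mr. Y' WHERE branch_id = 1"
).rowcount

# Deletion: removing every student leaves the branch information intact.
conn.execute("DELETE FROM student")
remaining = conn.execute("SELECT branch, hod FROM branch").fetchall()

print(updated, remaining)
```

With this layout a branch can also be inserted before any student has opted for it, which addresses the insertion anomaly.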
Decomposition:
Decomposition is the process of splitting a relation into two or
more sub relations.
Decomposition helps in eliminating some of the problems of bad
design such as redundancy, inconsistencies and anomalies.
Desirable properties of decomposition:
● Attribute preservation
● Lossless-join decomposition
● Dependency preservation
● Lack of redundancy
Types of decomposition:
● Consider any one of the possible ways in which the relation might have been decomposed into those sub
relations.
● First, divide the given relation into two sub relations.
● Then, divide the sub relations according to the sub relations given in the question.
As a rule of thumb, remember: any relation can be decomposed into only two sub relations at a time.
Test for lossless/lossy decomposition?
Hence, I is in 2NF.
Redundancy:
● X→ Y is a trivial FD
● X is a superkey for R
● Y is a prime attribute, i.e., each element of Y is part of some
candidate key. Alternatively, each attribute A in Y − X is contained in a
candidate key for R.
Transitive dependency: a non-prime attribute depends on other
non-prime attributes rather than on the prime attributes or the
primary key.
Third Normal Form
A relation schema Score (score_id, student_id, subject_id, marks,
exam_name, total_marks)
● X→ Y is trivial
● X is a superkey for R
Consider a relation R (A, B, C, D) with the following set of functional dependencies:
F = {A → B, B → C, C → D}.
Is the decomposition of R (A, B, C, D) into R1 (A, C, D) and R2 (B, C) a dependency
preserving decomposition?
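Step 1: for R1, the derivable non-trivial functional dependencies are A → C and
A → D (both follow from A → B, B → C, C → D by transitivity) and C → D. Hence,
F1 = {A → C, A → D, C → D}
Step 2: for R2, the derivable non-trivial functional dependency is B → C. Hence,
F2 = {B → C}
(F1 U F2) = {A → C, A → D, C → D, B → C} ≠ F.
(F1 U F2)+ = {A → C, A → D, C → D, B → C, B → D}
F+ = {A → B, B → C, C → D, A → C, A → D, B → D}
Hence, (F1 U F2)+ ≠ F+: A → B cannot be derived from F1 U F2, so the
decomposition is not dependency preserving.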
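The closure comparison above can be reproduced with a short Python sketch. It projects F onto each sub-relation by computing attribute closures of every subset (exponential, so only suitable for small examples) and then reports which dependencies of F no longer follow from the union of the projections. The helper names fd, closure, and project are hypothetical, chosen for this illustration.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build a functional dependency from two strings of attribute letters."""
    return (frozenset(lhs), frozenset(rhs))

def closure(attrs, fds):
    """Attribute closure of attrs under the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def project(fds, schema):
    """Non-trivial dependencies X -> A implied by fds with X and A inside schema."""
    projected = []
    attrs = sorted(schema)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            lhs = frozenset(combo)
            for a in sorted((closure(lhs, fds) & schema) - lhs):
                projected.append((lhs, frozenset({a})))
    return projected

F = [fd("A", "B"), fd("B", "C"), fd("C", "D")]
R1, R2 = frozenset("ACD"), frozenset("BC")

union = project(F, R1) + project(F, R2)
lost = [(lhs, rhs) for lhs, rhs in F if not rhs <= closure(lhs, union)]

for lhs, rhs in lost:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)), "is not preserved")
```

For this example the only dependency reported as lost is A → B, which agrees with the conclusion that the decomposition is not dependency preserving.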
Testing for Dependency Preservation
To check if a dependency X→ Y is preserved in a decomposition of R into
R1, R2, …, Rn we apply the following test (with attribute closure done with
respect to F)
result = X
while (changes to result) do
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result U t
● If result contains all attributes in Y, then the functional dependency
X→Y is preserved.
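A minimal Python sketch of this test follows. Attributes are modelled as single characters and each dependency as a pair of sets; the function names closure and is_preserved are hypothetical. The sample data is the relation R (A, B, C, D) with F = {A → B, B → C, C → D} decomposed into R1 (A, C, D) and R2 (B, C).

```python
def closure(attrs, fds):
    """Attribute closure of attrs with respect to fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_preserved(x, y, fds, decomposition):
    """Return True if X -> Y is preserved by the given decomposition."""
    result = set(x)
    changed = True
    while changed:  # repeat until a full pass adds nothing
        changed = False
        for ri in decomposition:
            t = closure(result & ri, fds) & ri
            if not t <= result:
                result |= t
                changed = True
    return set(y) <= result

F = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C"}, {"D"})]
decomposition = [set("ACD"), set("BC")]

for x, y in F:
    status = "preserved" if is_preserved(x, y, F, decomposition) else "not preserved"
    print("".join(sorted(x)), "->", "".join(sorted(y)), "is", status)
```

Unlike computing the full closure of F1 U F2, this test only needs attribute closures with respect to F, so it avoids projecting F onto each sub-relation.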
Consider a relation R (A, B, C, D) with the following set of functional dependencies;
F = {A → B, B → C, and C → D}.
Is the decomposition of R (A, B, C, D) into R1 (A, C, D) and R2 (B, C) a dependency
preserving decomposition?
Step 1: for R1, the derivable non-trivial functional dependencies are A → C and
A → D (both follow from F by transitivity) and C → D. Hence,
F1 = {A → C, A → D, C → D}
Step 2: for R2, the derivable non-trivial functional dependency is B → C. Hence,
F2 = {B → C}
F = {A → B, B → C, and C → D}. R1 (A, C, D) and R2 (B, C)
Step 3: check if a dependency A→ B is preserved
result = A
t = (result ∩ R1)+ ∩ R1 = A+ ∩ ACD = ABCD ∩ ACD = ACD
result = result U t = ACD
t = (result ∩ R2)+ ∩ R2 = C+ ∩ BC = CD ∩ BC = C
result = result U t = ACD
result does not contain B, so A → B is not preserved.
BCNF and Dependency Preservation
It is not always possible to achieve both BCNF and dependency preservation.
Decomposition: IDept (i_ID, dept_name), Inst (s_ID, i_ID). Both relations are now in BCNF.
Informally, if one denotes by (x,y,z) the tuple having values for α, β, R−α−β
collectively equal to x, y, z, then whenever the tuples (a,b,c) and (a,d,e) exist
in r, the tuples (a,b,e) and (a,d,c) should also exist in r.
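The informal condition can be checked on a concrete relation instance. The sketch below is a minimal illustration, assuming rows are dictionaries and that α and β are disjoint lists of attribute names; the function name satisfies_mvd and the sample rows are made up for this example.

```python
def satisfies_mvd(rows, alpha, beta):
    """Check alpha ->> beta on an instance given as a list of dicts."""
    if not rows:
        return True
    rest = sorted(set(rows[0]) - set(alpha) - set(beta))

    def pick(row, names):
        return tuple(row[n] for n in names)

    # Each row becomes a triple (x, y, z) of values for alpha, beta, R - alpha - beta.
    present = {(pick(r, alpha), pick(r, beta), pick(r, rest)) for r in rows}

    # Whenever (a, b, c) and (a, d, e) exist, (a, b, e) must exist as well;
    # the symmetric tuple (a, d, c) is covered when the pair is visited in
    # the opposite order.
    for a1, b1, _ in present:
        for a2, _, c2 in present:
            if a1 == a2 and (a1, b1, c2) not in present:
                return False
    return True

# Hypothetical instance: every subject is paired with every activity.
full = [
    {"SID": 200, "SUBJECT": s, "ACTIVITY": a}
    for s in ("OS", "DBMS")
    for a in ("Swimming", "Tennis")
]
print(satisfies_mvd(full, ["SID"], ["SUBJECT"]))       # full cross product
print(satisfies_mvd(full[:-1], ["SID"], ["SUBJECT"]))  # one combination missing
```

The first call succeeds because all subject and activity combinations are present; dropping a single combination makes the check fail.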
Multivalued Dependencies
Tabular representation of α→→β
Consider STUDENT (SID, SUBJECT, ACTIVITY)
SID  SUBJECT        SID  ACTIVITY
200  OS             200  Swimming
2. For every relation R and (set of) attributes A, A →→ A holds (it is a trivial
MVD).
SID →→ SUBJECT and SID →→ ACTIVITY hold, but not SID → SUBJECT and SID → ACTIVITY. Hence STUDENT is not in 4NF.
If the join of R1 and R2 over Q is equal to the relation R, then we can say that a
join dependency exists, where R1 (P, Q) and R2 (Q, S) are a decomposition
of a given relation R (P, Q, S). In that case R1 and R2 are a lossless
decomposition of R.
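On a concrete instance, this condition can be checked by projecting R onto (P, Q) and (Q, S) and joining the projections over Q. The sketch below assumes tuples of the form (p, q, s); the function name join_of_projections and the sample values are hypothetical.

```python
def join_of_projections(r):
    """Project r onto (P, Q) and (Q, S), then natural-join the two over Q."""
    r1 = {(p, q) for p, q, _ in r}
    r2 = {(q, s) for _, q, s in r}
    return {(p, q, s) for p, q in r1 for q2, s in r2 if q == q2}

# Instance where the join dependency holds: each Q value has a single S value.
holds = {("p1", "q1", "s1"), ("p2", "q1", "s1"), ("p3", "q2", "s2")}

# Instance where it fails: joining over q1 produces spurious tuples.
fails = {("p1", "q1", "s1"), ("p2", "q1", "s2")}

print(join_of_projections(holds) == holds)
print(join_of_projections(fails) == fails)
print(sorted(join_of_projections(fails) - fails))
```

The last line lists the spurious tuples that appear when the decomposition is lossy for that instance.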
Relation SUPPLY with Join Dependency and conversion to Fifth Normal Form
Inclusion Dependencies
Template Dependencies
Domain-Key Normal Form (DKNF)