MODULE 4 - Normalization - 1
roll_no name dept_name dept_building
42 abc CO A4
43 pqr IT A3
44 xyz CO A4
Valid functional dependencies:
roll_no → { name, dept_name, dept_building }
◦ Here, roll_no determines the values of the fields name,
dept_name and dept_building, hence this is a valid
functional dependency.
roll_no → dept_name
◦ Since roll_no determines the whole set { name,
dept_name, dept_building }, it also determines its
subset dept_name.
dept_name → dept_building
◦ dept_name identifies the dept_building
accurately, since each dept_name is associated with
exactly one dept_building.
Invalid functional dependencies:
name → dept_name
◦ Students with the same name can have
different dept_name, hence this is not a valid
functional dependency.
dept_building → dept_name
◦ There can be multiple departments in the
same building. For example, in the above table
departments ME and EC are in the same
building B2, hence dept_building → dept_name is an
invalid functional dependency.
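The valid and invalid dependencies above can be checked against a table instance. The following is a minimal sketch, not a definitive implementation: the function name fd_holds is hypothetical, and the last two rows (roll numbers 45 and 46) are assumed sample data added only to show a repeated name and the ME and EC departments sharing building B2. Note that a table instance can only disprove a functional dependency; it cannot prove that the dependency holds for all possible data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


students = [
    {"roll_no": 42, "name": "abc", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 43, "name": "pqr", "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 44, "name": "xyz", "dept_name": "CO", "dept_building": "A4"},
    # Assumed rows, not taken from the slides:
    {"roll_no": 45, "name": "abc", "dept_name": "ME", "dept_building": "B2"},
    {"roll_no": 46, "name": "lmn", "dept_name": "EC", "dept_building": "B2"},
]

print(fd_holds(students, ["roll_no"], ["name", "dept_name", "dept_building"]))  # True
print(fd_holds(students, ["dept_name"], ["dept_building"]))                     # True
print(fd_holds(students, ["name"], ["dept_name"]))                              # False
print(fd_holds(students, ["dept_building"], ["dept_name"]))                     # False
```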
Armstrong’s axioms/properties of
functional dependencies:
Reflexivity: If Y is a subset of X, then X → Y holds by the
reflexivity rule.
Given functional dependencies:
A → BC
BC → DE
D → F
CF → G

A+ = { A }
= { A , B , C } ( Using A → BC )
= { A , B , C , D , E } ( Using BC → DE )
= { A , B , C , D , E , F } ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }

D+ = { D }
= { D , F } ( Using D → F )

{ B , C }+ = { B , C }
= { B , C , D , E } ( Using BC → DE )
= { B , C , D , E , F } ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
Given a relational schema R(P, Q, R, S, T, U, V) with
attributes P, Q, R, S, T, U and V, and the set of
functional dependencies FD = { P → Q, QR → ST,
PTV → V }, determine the closures (QR)+ and (PR)+.
(QR)+ = QR
= QRST ( Using QR → ST )
(PR)+ = PR
= PRQ ( Using P → Q )
= PRQST ( Using QR → ST )
Given a relational schema R(P, Q, R, S, T) with
attributes P, Q, R, S and T, and the set of functional
dependencies FD = { P → QR, RS → T, Q → S, T → P },
determine the closure (T)+.
T+ = T
= TP ( Using T → P )
= TPQR ( Using P → QR )
= TPQRS ( Using Q → S )
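The closure computations above all follow the same loop: start from the given attributes and keep applying any dependency whose left side is already covered until nothing new is added. A minimal sketch, assuming single-letter attribute names so that a string such as "BC" can stand for the set { B , C }; the function name closure and the variable names are illustrative only.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when lhs is covered and rhs adds something new
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds1 = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]
fds2 = [("P", "Q"), ("QR", "ST"), ("PTV", "V")]
fds3 = [("P", "QR"), ("RS", "T"), ("Q", "S"), ("T", "P")]

print(sorted(closure("A", fds1)))   # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("D", fds1)))   # ['D', 'F']
print(sorted(closure("BC", fds1)))  # ['B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("QR", fds2)))  # ['Q', 'R', 'S', 'T']
print(sorted(closure("PR", fds2)))  # ['P', 'Q', 'R', 'S', 'T']
print(sorted(closure("T", fds3)))   # ['P', 'Q', 'R', 'S', 'T']
```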
Different kinds of keys
Candidate key
A candidate key may be defined as-
◦ A minimal set of attribute(s) that can identify
each tuple uniquely in the given relation is
called a candidate key.
OR
◦ A minimal super key is called a candidate
key.
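When functional dependencies are known, the "minimal super key" definition can be tested with attribute closures: a set is a super key if its closure contains every attribute, and it is a candidate key if removing any single attribute breaks that property. The sketch below is illustrative only; it reuses the schema R(P, Q, R, S, T) with FD = { P → QR, RS → T, Q → S, T → P } from the closure exercise, and the function names are assumptions rather than part of the slides.

```python
def closure(attrs, fds):
    """Attribute closure under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_super_key(attrs, all_attrs, fds):
    """A super key determines every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)


def is_candidate_key(attrs, all_attrs, fds):
    """A candidate key is a super key with no removable attribute."""
    attrs = set(attrs)
    if not is_super_key(attrs, all_attrs, fds):
        return False
    # Checking one-attribute removals is enough: any superset of a
    # super key is also a super key.
    return not any(is_super_key(attrs - {a}, all_attrs, fds) for a in attrs)


all_attrs = "PQRST"
fds = [("P", "QR"), ("RS", "T"), ("Q", "S"), ("T", "P")]

print(is_candidate_key("T", all_attrs, fds))   # True
print(is_candidate_key("PT", all_attrs, fds))  # False: super key, not minimal
print(is_candidate_key("Q", all_attrs, fds))   # False: not a super key
```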
Consider the following Student schema-
Student ( roll , name , sex , age , address , class , section )
Given below are the examples of candidate keys-
( class , section , roll )
( name , address )
These are candidate keys because each set consists of
minimal attributes required to identify each student
uniquely in the Student table.
Let R = (A, B, C, D, E, F) be a relation scheme with the
following dependencies-
C → F
E → A
Determine all essential attributes of the given relation.
Essential attributes of the relation are- C and E.
So, attributes C and E
101 Akon OS
101 Akon CN
102 Bkon C
Subject
subject_id subject_name
1 Java
2 C++
3 Php
Now we have a Student table with
student information and another
table Subject for storing subject
information.
Let's create another table Score, to store
the marks obtained by students in the
respective subjects.
We will also save the name of the
teacher who teaches that subject along
with the marks.
Score
score_id student_id subject_id marks teacher
1 10 1 70 Java Teacher
2 10 2 75 C++ Teacher
3 11 1 80 Java Teacher
In the Score table we save the student_id to know
which student the marks belong to, and the subject_id to
know which subject the marks are for.
Together, student_id + subject_id forms a candidate key
for this table, which can be the primary key.
Now if you look at the Score table, we have a column
named teacher which depends only on the subject: for
Java it's Java Teacher, for C++ it's C++ Teacher, and so
on.
The primary key for this table is a composite of two columns,
student_id and subject_id, but the teacher's name
depends only on the subject, that is, on subject_id, and has
nothing to do with student_id.
This is a partial dependency: an attribute in a table
depends on only a part of the primary key and not on the
whole key.
How to remove Partial Dependency
There can be many different solutions for
this, but our objective is to remove the teacher's
name from the Score table.
The simplest solution is to remove the
teacher column from the Score table and add it
to the Subject table. Hence, the Subject table
will become:
id subject_name teacher
1 Java Java Teacher
2 C++ C++ Teacher
3 Php Php Teacher
And our Score table is now in the second normal
form, with no partial dependency.
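The removal of the partial dependency can be sketched as two projections of the Score table: one that keeps the subject_id and teacher pairs (the data that moves to the Subject table) and one that keeps everything else. This is an illustrative sketch using the three Score rows shown earlier; the helper name project is hypothetical.

```python
score = [
    {"score_id": 1, "student_id": 10, "subject_id": 1, "marks": 70, "teacher": "Java Teacher"},
    {"score_id": 2, "student_id": 10, "subject_id": 2, "marks": 75, "teacher": "C++ Teacher"},
    {"score_id": 3, "student_id": 11, "subject_id": 1, "marks": 80, "teacher": "Java Teacher"},
]


def project(rows, columns):
    """Keep only the given columns and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


# teacher depends only on subject_id, so each subject appears once here
subject_teacher = project(score, ["subject_id", "teacher"])
# Score without the partially dependent column
score_2nf = project(score, ["score_id", "student_id", "subject_id", "marks"])

print(subject_teacher)
# [{'subject_id': 1, 'teacher': 'Java Teacher'}, {'subject_id': 2, 'teacher': 'C++ Teacher'}]
print(score_2nf[0])
# {'score_id': 1, 'student_id': 10, 'subject_id': 1, 'marks': 70}
```

The teacher "Java Teacher" is now stored once per subject instead of once per score row.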
201010 UP Noida
02228 US Boston
60007 US Chicago
Boyce Codd normal form (BCNF)
EMP_ID EMP_COUNTRY
264 India
264 India
EMP_DEPT table:
EMP_DEPT DEPT_TYPE EMP_DEPT_NO
D394 283
D394 300
D283 232
D283 549
Candidate keys:
◦ For the first table: EMP_ID
◦ For the second table: EMP_DEPT
◦ For the third table: {EMP_ID, EMP_DEPT}
Functional dependencies:
◦ EMP_ID → EMP_COUNTRY
◦ EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Now, this is in BCNF because the left-hand side
of both functional dependencies is a
key.
Fourth Normal Form (4NF)
STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
The given STUDENT table is in 3NF, but
COURSE and HOBBY are two independent
entities. Hence, there is no relationship between
COURSE and HOBBY.
In the STUDENT relation, the student with
STU_ID 21 has two
courses, Computer and Math, and two
hobbies, Dancing and Singing. So there is a
multi-valued dependency on STU_ID, which leads
to unnecessary repetition of data.
To bring the above table into 4NF, we can
decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
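The two tables above are projections of STUDENT onto (STU_ID, COURSE) and (STU_ID, HOBBY). A minimal sketch of that projection, using the three STUDENT rows shown earlier; the variable names are illustrative.

```python
# STUDENT rows as (STU_ID, COURSE, HOBBY)
student = [
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
]

# Project onto (STU_ID, COURSE) and (STU_ID, HOBBY); sets drop duplicates
student_course = sorted({(stu_id, course) for stu_id, course, _ in student})
student_hobby = sorted({(stu_id, hobby) for stu_id, _, hobby in student})

print(student_course)  # [(21, 'Computer'), (21, 'Math'), (34, 'Chemistry')]
print(student_hobby)   # [(21, 'Dancing'), (21, 'Singing'), (34, 'Dancing')]
```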
Relational Decomposition
When a relation in the relational model is not in an
appropriate normal form, decomposition of
the relation is required.
In a database, decomposition breaks a table into multiple
tables.
If a relation is not decomposed properly, it
may lead to problems like loss of information.
Decomposition is used to eliminate some of the
problems of bad design, such as anomalies,
inconsistencies, and redundancy.
Types of Decomposition
Lossless Decomposition
EMPLOYEE table
EMP_ID EMP_NAME EMP_AGE EMP_CITY
22 Denim 28 Mumbai
33 Alina 25 Delhi
46 Stephan 30 Bangalore
52 Katherine 36 Mumbai
60 Jack 40 Noida
DEPARTMENT table
DEPT_ID EMP_ID DEPT_NAME
827 22 Sales
438 33 Marketing
869 46 Finance
575 52 Production
678 60 Testing
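Whether this decomposition is lossless can be checked by taking the natural join of the two tables on their common column EMP_ID and confirming that each original table is recovered from the result. The following is a small sketch under that reading; the helper name natural_join is hypothetical, and the employee column names are taken from the header of the joined table.

```python
employee = [
    {"EMP_ID": 22, "EMP_NAME": "Denim", "EMP_AGE": 28, "EMP_CITY": "Mumbai"},
    {"EMP_ID": 33, "EMP_NAME": "Alina", "EMP_AGE": 25, "EMP_CITY": "Delhi"},
    {"EMP_ID": 46, "EMP_NAME": "Stephan", "EMP_AGE": 30, "EMP_CITY": "Bangalore"},
    {"EMP_ID": 52, "EMP_NAME": "Katherine", "EMP_AGE": 36, "EMP_CITY": "Mumbai"},
    {"EMP_ID": 60, "EMP_NAME": "Jack", "EMP_AGE": 40, "EMP_CITY": "Noida"},
]
department = [
    {"DEPT_ID": 827, "EMP_ID": 22, "DEPT_NAME": "Sales"},
    {"DEPT_ID": 438, "EMP_ID": 33, "DEPT_NAME": "Marketing"},
    {"DEPT_ID": 869, "EMP_ID": 46, "DEPT_NAME": "Finance"},
    {"DEPT_ID": 575, "EMP_ID": 52, "DEPT_NAME": "Production"},
    {"DEPT_ID": 678, "EMP_ID": 60, "DEPT_NAME": "Testing"},
]


def natural_join(left, right):
    """Combine rows that agree on every column the two tables share."""
    joined = []
    for left_row in left:
        for right_row in right:
            common = set(left_row) & set(right_row)
            if all(left_row[c] == right_row[c] for c in common):
                joined.append({**left_row, **right_row})
    return joined


joined = natural_join(employee, department)
print(len(joined))  # 5
print(joined[0])
# {'EMP_ID': 22, 'EMP_NAME': 'Denim', 'EMP_AGE': 28, 'EMP_CITY': 'Mumbai',
#  'DEPT_ID': 827, 'DEPT_NAME': 'Sales'}

# Projecting the join back onto the employee columns recovers EMPLOYEE,
# so no rows were lost or invented by the decomposition.
recovered = [{c: row[c] for c in employee[0]} for row in joined]
print(recovered == employee)  # True
```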
Employee ⋈ Department
EMP_ID EMP_NAME EMP_AGE EMP_CITY DEPT_ID DEPT_NAME