
Module-3

The document covers the theory of database design, focusing on functional dependencies (FD), their types, and the process of normalization. It explains concepts such as closure of FDs, candidate key generation, and canonical cover, providing examples to illustrate these principles. The document serves as a comprehensive guide for understanding the foundational elements of database management systems.

DATABASE MANAGEMENT SYSTEMS

COURSE CODE: CSE-2007


MODULE – 3
(Database design theory and Normalization)

By:
Dr. Nagendra Panini Challa
Assistant Professor, Senior Grade 2
SCOPE, VIT-AP University, India
AGENDA
 Functional dependency (FD)
 Closure of FD, Closure of Attributes
 Cover, Equivalence of FD
 Canonical cover
 Key generation
 Normalization
 Desirable properties of decomposition

Database Management Systems (DBMS), SCOPE, VIT-AP University, India 02/03/2025 2


FUNCTIONAL DEPENDENCY
(FD)
A functional dependency is a constraint between two sets of attributes of a
relation. It typically exists between the primary key and the non-key attributes
of a table.
X → Y
means that any two tuples that agree on X must also agree on Y. The left side
of the FD is known as the determinant, and the right side is known as the
dependent.


For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name,
Emp_Address.

Here the Emp_Id attribute uniquely identifies the Emp_Name attribute of the employee table,
because if we know the Emp_Id, we can tell the employee name associated with it.
This functional dependency is written as: Emp_Id → Emp_Name
We say that Emp_Name is functionally dependent on Emp_Id.


From the above table (attributes roll_no, name, dept_name, dept_building) we can
conclude some valid functional dependencies:

• roll_no → {name, dept_name, dept_building}: roll_no determines the values of
the fields name, dept_name and dept_building, hence this is a valid functional
dependency.
• roll_no → dept_name: since roll_no determines the whole set {name, dept_name,
dept_building}, it also determines its subset dept_name.
• dept_name → dept_building: each department is housed in exactly one building,
so two rows with the same dept_name always have the same dept_building.
• More valid functional dependencies: roll_no → name,
{roll_no, name} → {dept_name, dept_building}, etc.

Here are some invalid functional dependencies:

• name → dept_name: students with the same name can have different dept_name, hence this
is not a valid functional dependency.
• dept_building → dept_name: there can be multiple departments in the same building. For
example, in the above table departments ME and EC are in the same building B2, hence
dept_building → dept_name is an invalid functional dependency.
• More invalid functional dependencies: name → roll_no, {name, dept_name} → roll_no,
dept_building → roll_no, etc.
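The valid and invalid dependencies above can be checked mechanically against a table instance. The following is a minimal Python sketch. The helper name `fd_holds` and the sample rows are hypothetical; the rows are chosen only to agree with the facts stated above (ME and EC share building B2, and two students share a name). Note that an instance can only refute an FD. Passing the check does not prove that the FD holds for every legal state of the relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this determinant
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows (not taken from the original slide)
students = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CS", "dept_building": "B1"},
    {"roll_no": 2, "name": "Ravi", "dept_name": "ME", "dept_building": "B2"},
    {"roll_no": 3, "name": "Asha", "dept_name": "EC", "dept_building": "B2"},
    {"roll_no": 4, "name": "Kiran", "dept_name": "ME", "dept_building": "B2"},
]

print(fd_holds(students, ["roll_no"], ["name", "dept_name", "dept_building"]))  # True
print(fd_holds(students, ["dept_name"], ["dept_building"]))                     # True
print(fd_holds(students, ["name"], ["dept_name"]))                              # False
print(fd_holds(students, ["dept_building"], ["dept_name"]))                     # False
```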
Types of Functional dependencies in DBMS:

 Trivial functional dependency
 Non-Trivial functional dependency
 Multivalued functional dependency
 Transitive functional dependency


1. Trivial Functional Dependency
 In a trivial functional dependency, the dependent is always a subset of the determinant,
i.e. if X → Y and Y is a subset of X, then X → Y is called a trivial functional dependency.
 For example, {roll_no, name} → name is a trivial functional dependency, since the
dependent name is a subset of the determinant set {roll_no, name}.
 Similarly, roll_no → roll_no is also an example of a trivial functional dependency.


2. Non-trivial Functional Dependency
 In a non-trivial functional dependency, the dependent is not a subset of the
determinant,
i.e. if X → Y and Y is not a subset of X, then X → Y is called a non-trivial functional dependency.
 For example, roll_no → name is a non-trivial functional dependency, since the
dependent name is not a subset of the determinant roll_no.
 Similarly, {roll_no, name} → age is also a non-trivial functional dependency, since age
is not a subset of {roll_no, name}.


3. Multivalued Functional Dependency
 In a multivalued functional dependency, as the term is used here, the attributes of the
dependent set are not dependent on each other,
i.e. if a → {b, c} and there exists no functional dependency between b and c, then it is
called a multivalued functional dependency.
 For example, roll_no → {name, age} is a multivalued functional dependency, since the
dependents name and age are not dependent on each other (i.e. neither name → age nor
age → name holds).
 Note that this differs from the multivalued dependency X ↠ Y on which fourth normal
form (4NF) is based; that is a separate kind of constraint, not a special case of a
functional dependency.


4. Transitive Functional Dependency
 In a transitive functional dependency, the dependent is indirectly dependent on the determinant,
i.e. if a → b and b → c, then according to the axiom of transitivity, a → c. This is a transitive
functional dependency.
 For example, enrol_no → dept and dept → building_no hold. Hence, according to the axiom
of transitivity, enrol_no → building_no is a valid functional dependency.
 This is an indirect functional dependency, hence it is called a transitive functional
dependency.


How to find functional dependencies for a
relation?

Functional dependencies in a relation depend on the domain of the relation, i.e. on the
meaning of its attributes. Consider the STUDENT relation given in Table 1.

• We know that STUD_NO is unique for each student. So STUD_NO->STUD_NAME,
STUD_NO->STUD_PHONE, STUD_NO->STUD_STATE, STUD_NO->STUD_COUNTRY and
STUD_NO->STUD_AGE all hold.
• Similarly, STUD_STATE->STUD_COUNTRY holds, because if two records have the same
STUD_STATE, they have the same STUD_COUNTRY as well.
• For relation STUDENT_COURSE, COURSE_NO->COURSE_NAME holds, because two records
with the same COURSE_NO have the same COURSE_NAME.


 Functional Dependency Set: The functional dependency set, or FD set, of a
relation is the set of all FDs present in the relation. For example, the FD set for
relation STUDENT shown in Table 1 is:

{ STUD_NO->STUD_NAME, STUD_NO->STUD_PHONE, STUD_NO->STUD_STATE,
STUD_NO->STUD_COUNTRY, STUD_NO->STUD_AGE, STUD_STATE->STUD_COUNTRY }


CLOSURE OF FD &
ATTRIBUTES
 The closure of a set of attributes X, written {X}+, is the complete set of
attributes that can be functionally derived from X using the given functional
dependencies and the inference rules known as Armstrong's axioms.
 Likewise, if F is a set of functional dependencies, its closure, written {F}+, is
the set of all functional dependencies that can be derived from F.
 There are three steps to calculate the closure of an attribute set. These are:
 Step-1 : Add the attributes of the set itself, i.e. the attributes on the left-hand
side of the functional dependency being considered.
 Step-2 : Add the attributes on the right-hand side of every functional
dependency whose left-hand side is already contained in the closure.
 Step-3 : Using the newly added attributes, check which other attributes can be
derived from the remaining functional dependencies. Repeat this process until no
further attributes can be added to the closure.
 Example-1 : Consider the table student_details with attributes (Roll_No, Name, Marks, Location)
and two functional dependencies.
FD1 : Roll_No → Name, Marks
FD2 : Name → Marks, Location
Now we will calculate the closure of the attributes present in the relation using the
three steps mentioned above.

Step-1 : Add the attributes present on the LHS of the first functional dependency to the
closure.
{Roll_No}+ = {Roll_No}
Step-2 : Add the attributes present on the RHS of that functional dependency to the
closure.
{Roll_No}+ = {Roll_No, Name, Marks}
Step-3 : Add the other attributes that can be derived from the attributes now in the
closure. Roll_No cannot determine any further attribute directly, but Name can determine
Marks and Location using the second functional dependency (Name → Marks, Location).
Therefore, the complete closure of Roll_No is:
{Roll_No}+ = {Roll_No, Name, Marks, Location}

Similarly, we can calculate the closure of the other attributes, e.g. “Name”.
Step-1 : Add the attributes present on the LHS of the functional dependency to the closure.
{Name}+ = {Name}
Step-2 : Add the attributes present on the RHS of the functional dependency to the closure.
{Name}+ = {Name, Marks, Location}
Step-3 : Since we don’t have any functional dependency in which Marks or Location
determines another attribute, we cannot add more attributes to
the closure. Hence the complete closure of Name is:
{Name}+ = {Name, Marks, Location}

NOTE : We don’t have any functional dependency in which Marks or Location
functionally determines another attribute. Hence, the closure of each of these attributes
contains only the attribute itself. Therefore
{Marks}+ = {Marks}
and
{Location}+ = {Location}
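The three steps can be sketched as a short fixed-point loop. This is a minimal Python sketch, not a reference implementation; the function name `closure` and the representation of an FD as a pair of sets are assumptions made for illustration. The FDs are those of Example-1.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes derivable from attrs using fds."""
    result = set(attrs)              # Step 1: start with the attributes themselves
    changed = True
    while changed:                   # Step 3: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            # Step 2: if the LHS is already in the closure, add the RHS
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# FD1 and FD2 from Example-1
fds = [
    ({"Roll_No"}, {"Name", "Marks"}),
    ({"Name"}, {"Marks", "Location"}),
]

print(sorted(closure({"Roll_No"}, fds)))  # ['Location', 'Marks', 'Name', 'Roll_No']
print(sorted(closure({"Name"}, fds)))     # ['Location', 'Marks', 'Name']
print(sorted(closure({"Marks"}, fds)))    # ['Marks']
```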
 Example-2 : Consider a relation R(A,B,C,D,E) having the functional
dependencies mentioned below.
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D
Now, we need to calculate the closure of the attributes of the relation R. The closures are:
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {B, C}
{D}+ = {D, E}
{E}+ = {E, D}

Closure Of Functional Dependency : Calculating Candidate Key
 A candidate key of a relation is a minimal attribute or set of attributes whose closure
contains all the attributes of the relation, i.e. it determines the whole relation and no
proper subset of it does.
 Let’s try to understand how to calculate candidate keys.

Example-1 : Consider the relation R(A,B,C) with the given functional dependencies :
FD1 : A → B
FD2 : B → C
Now, calculating the closure of the attributes :
{A}+ = {A, B, C}
{B}+ = {B, C}
{C}+ = {C}
Clearly, “A” is the candidate key, as its closure contains all the attributes present in the
relation “R”.


Example-2 : Consider another relation R(A, B, C, D, E) having the functional
dependencies :
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D
Now, calculating the closure of the attributes :
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {C, B}
{D}+ = {E, D}
{E}+ = {E, D}


In this case, no single attribute can determine all the attributes on its own, unlike in the
previous example. Here, we need to combine two or more attributes to find the
candidate keys.
{A, D}+ = {A, B, C, D, E}
{A, E}+ = {A, B, C, D, E}
Hence, "AD" and "AE" are the two candidate keys of the given relation “R”. Any larger
combination that contains one of them (for example ABD) also determines every attribute,
but it is only a super key, because its additional attributes are extraneous.

NOTE : A relation “R” can have either a single candidate key or multiple candidate keys.
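The search for candidate keys can be sketched as a brute-force loop over attribute combinations, smallest first. This is an illustrative Python sketch only: the names `closure` and `candidate_keys` are assumptions, and the approach is exponential in the number of attributes, so it suits small classroom examples rather than real schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute collections)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            # A superset of a smaller key is a super key, not a candidate key
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


# Example-2: R(A, B, C, D, E) with A -> BC, C -> B, D -> E, E -> D
fds = [("A", "BC"), ("C", "B"), ("D", "E"), ("E", "D")]
print([sorted(k) for k in candidate_keys("ABCDE", fds)])  # [['A', 'D'], ['A', 'E']]
```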


For example : Consider the relation R(A, B, C, D) with functional
dependencies :
FD1 : A → BC
FD2 : B → C
FD3 : D → C
Here, the only candidate key is “AD”. Hence,
Prime Attributes : A, D.
Non-Prime Attributes : B, C
Extraneous Attributes : B, C (if we add either of them to the candidate key, its
closure remains unaffected). Extraneous attributes are those which, if removed from a
set, do not affect the closure of that set.

CANONICAL COVER
 Whenever a user updates the database, the system must check whether
any of the functional dependencies are violated in the process. If
a dependency is violated in the new database state, the system
must roll back.
 Working with a huge set of functional dependencies adds unnecessary
computational time. This is where the canonical cover comes into
play.
 A canonical cover of a set of functional dependencies F is a simplified set of
functional dependencies that has the same closure as the original set F and
contains no extraneous attributes or redundant dependencies.
 Extraneous attributes: An attribute of a functional dependency is said to be
extraneous if we can remove it without changing the closure of the set of
functional dependencies.
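One common way to compute a canonical cover is sketched below in Python: split right-hand sides, remove extraneous left-hand attributes, remove redundant FDs, then merge FDs that share a left-hand side. The function names and the input set F = {A → BC, B → C, A → B, AB → C} are assumptions chosen for illustration; they are not the set used on the slides. A canonical cover is not unique in general, so a different processing order may give a different but equivalent result.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def canonical_cover(fds):
    # Step 1: split right-hand sides so every FD has a single dependent attribute
    g = list(dict.fromkeys(
        (frozenset(lhs), frozenset(a)) for lhs, rhs in fds for a in rhs))
    # Step 2: remove extraneous attributes from each left-hand side
    for i, (lhs, rhs) in enumerate(g):
        for attr in sorted(lhs):
            reduced = lhs - {attr}
            if reduced and rhs <= closure(reduced, g):
                lhs = reduced
        g[i] = (lhs, rhs)
    g = list(dict.fromkeys(g))  # drop duplicates created by the reduction
    # Step 3: remove FDs that are implied by the remaining ones
    for fd in list(g):
        rest = [other for other in g if other != fd]
        if fd[1] <= closure(fd[0], rest):
            g = rest
    # Step 4: merge FDs with the same left-hand side
    merged = {}
    for lhs, rhs in g:
        merged[lhs] = merged.get(lhs, frozenset()) | rhs
    return merged


# Hypothetical input set F (not from the slides)
F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
for lhs, rhs in canonical_cover(F).items():
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# prints:
# A -> B
# B -> C
```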


Example1:
 Consider the following set F of functional dependencies:

Step 2. Checking whether all FDs of FD2 hold in FD1
 A->B in set FD2 is present in set FD1.
 B->C in set FD2 is also present in set FD1.
 A->C is present in FD2 but not directly in FD1, so we check whether it can be
derived. For set FD1, (A)+ = {A,B,C,D}. It means that A functionally determines A, B, C,
and D. So A->C also holds in set FD1.
 A->D is present in FD2 but not directly in FD1, so we check whether it can be
derived. For set FD1, (A)+ = {A,B,C,D}. It means that A functionally determines A, B, C,
and D. So A->D also holds in set FD1.

As all FDs in set FD2 also hold in set FD1, FD1 ⊃ FD2 is true.

Step 3. As FD2 ⊃ FD1 and FD1 ⊃ FD2 are both true, FD2 = FD1 is true. These two FD sets
are semantically equivalent.


2) Let us take another example to show the relationship between two FD sets. A relation
R2(A,B,C,D) has two FD sets FD1 = {A->B, B->C, A->C} and FD2 = {A->B, B->C, A->D}.
Step 1. Checking whether all FDs of FD1 hold in FD2
 A->B in set FD1 is present in set FD2.
 B->C in set FD1 is also present in set FD2.
 A->C is present in FD1 but not directly in FD2, so we check whether it can be
derived. For set FD2, (A)+ = {A,B,C,D}. It means that A functionally determines A,
B, C, and D. So A->C also holds in set FD2.
As all FDs in set FD1 also hold in set FD2, FD2 ⊃ FD1 is true.


Step 2. Checking whether all FDs of FD2 hold in FD1
 A->B in set FD2 is present in set FD1.
 B->C in set FD2 is also present in set FD1.
 A->D is present in FD2 but not directly in FD1, so we check whether it can be
derived. For set FD1, (A)+ = {A,B,C}. It means that A can’t functionally determine D.
So A->D does not hold in FD1.
 As not all FDs in set FD2 hold in set FD1, FD1 ⊅ FD2.

Step 3. In this case, FD2 ⊃ FD1 is true but FD1 ⊃ FD2 is not, so these two FD sets are
not semantically equivalent.
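The cover test used in these steps can be sketched in Python as follows. The function names `covers` and `equivalent` are assumptions for illustration; `covers(f, g)` corresponds to the notation f ⊃ g used above, i.e. every FD of g can be derived from f. The FD sets are those of the second example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute collections)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD X -> Y in g holds in f, i.e. Y is inside X+ under f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """Two FD sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)


# Second example: R2(A, B, C, D)
fd1 = [("A", "B"), ("B", "C"), ("A", "C")]
fd2 = [("A", "B"), ("B", "C"), ("A", "D")]

print(covers(fd2, fd1))      # True:  FD2 covers FD1
print(covers(fd1, fd2))      # False: A -> D cannot be derived from FD1
print(equivalent(fd1, fd2))  # False
```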


NORMALIZATION
A large database defined as a single relation may result in data duplication.
This repetition of data may result in:
 Relations becoming very large.
 Difficulty in maintaining and updating data, as it would involve searching
many records in the relation.
 Wastage and poor utilization of disk space and resources.
 An increased likelihood of errors and inconsistencies.

So to handle these problems, we should analyze and decompose the
relations with redundant data into smaller, simpler, and well-structured
relations that satisfy desirable properties. Normalization is a process of
decomposing the relations into relations with fewer attributes.


 Normalization is the process of organizing the data in the database.
 Normalization is used to minimize the redundancy in a relation or set of
relations. It is also used to eliminate undesirable characteristics like insertion,
update, and deletion anomalies.
 Normalization divides a larger table into smaller tables and links them using
relationships.
 The normal forms are used to reduce redundancy in database tables.

Why do we need Normalization?

 The main reason for normalizing relations is to remove these anomalies.
Failure to eliminate the redundancy behind them can cause data
integrity and other problems as the database grows. Normalization consists of
a series of guidelines that help you create a good database
structure.


Data modification anomalies can be categorized into three types:

 Insertion Anomaly: An insertion anomaly occurs when one cannot insert a
new tuple into a relation due to lack of data.

 Deletion Anomaly: A deletion anomaly refers to the situation where the
deletion of data results in the unintended loss of some other important
data.

 Update Anomaly: An update anomaly occurs when an update of a single data
value requires multiple rows of data to be updated.


Types of Normal Forms:
 Normalization works through a series of stages called normal forms. The normal forms
apply to individual relations. A relation is said to be in a particular normal form if it
satisfies the constraints of that form.

The normal forms covered below are 1NF, 2NF, 3NF and BCNF; higher normal forms such as
4NF and 5NF also exist.

Advantages of Normalization

 Normalization helps to minimize data redundancy.


 Greater overall database organization.
 Data consistency within the database.
 Much more flexible database design.
 Enforces the concept of relational integrity.

Disadvantages of Normalization

 You cannot start building the database before knowing what the user needs.
 The performance degrades when normalizing the relations to higher normal forms, i.e.,
4NF, 5NF.
 It is very time-consuming and difficult to normalize relations of a higher degree.
 Careless decomposition may lead to a bad database design, leading to serious problems.



First Normal Form (1NF)
 A relation is in 1NF if every attribute contains only atomic values.
 It states that an attribute of a table cannot hold multiple values. It must hold
only a single value.
 First normal form disallows multi-valued attributes, composite attributes,
and their combinations.
Example: Relation EMPLOYEE is not in 1NF because of the multi-valued
attribute EMP_PHONE.

Second Normal Form (2NF)
 For 2NF, the relation must be in 1NF.
 In the second normal form, all non-key attributes are fully functionally
dependent on the primary key, i.e. no non-key attribute depends on only a part
of a composite key.
Example: Let's assume a school stores the data of teachers and the
subjects they teach. In a school, a teacher can teach more than one subject.

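The 2NF condition can be illustrated with a small Python sketch that flags partial dependencies. The attribute names (TEACHER_ID, SUBJECT, TEACHER_AGE), the FD and the function name `partial_dependencies` are all hypothetical, since the table for this example is not reproduced here. As a simplification, the check inspects only the FDs it is given rather than everything derivable from them.

```python
def partial_dependencies(fds, candidate_keys):
    """Return FDs where a non-prime attribute depends on part of a candidate key."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        non_prime = rhs - prime - lhs
        # A determinant that is a proper subset of a key gives a partial dependency
        is_part_of_key = any(lhs < set(key) for key in candidate_keys)
        if non_prime and is_part_of_key:
            violations.append((lhs, non_prime))
    return violations


# Hypothetical relation TEACHER(TEACHER_ID, SUBJECT, TEACHER_AGE)
keys = [{"TEACHER_ID", "SUBJECT"}]
fds = [({"TEACHER_ID"}, {"TEACHER_AGE"})]

print(partial_dependencies(fds, keys))
# [({'TEACHER_ID'}, {'TEACHER_AGE'})]  -> not in 2NF

# After moving TEACHER_AGE into its own table keyed by TEACHER_ID alone
print(partial_dependencies(fds, [{"TEACHER_ID"}]))  # []
```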
Third Normal Form (3NF)

 A relation is in 3NF if it is in 2NF and no non-prime attribute is transitively dependent
on a candidate key.
 3NF is used to reduce data duplication. It is also used to achieve data integrity.
 If a relation in 2NF has no transitive dependency for its non-prime attributes, then it
is in third normal form.

A relation is in third normal form if it satisfies at least one of the following conditions for every
non-trivial functional dependency X → Y.
 X is a super key.
 Y is a prime attribute, i.e., each element of Y is part of some candidate key.

 Candidate key: {EMP_ID}
 Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.
 Here, EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and EMP_ZIP is dependent on
EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively
dependent on the super key (EMP_ID). This violates the rule of third normal form.
 That's why we need to move EMP_CITY and EMP_STATE to the new <EMPLOYEE_ZIP> table,
with EMP_ZIP as the primary key.

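The two 3NF conditions can be turned into a mechanical check, sketched below in Python. Only the attributes named in the text (EMP_ID, EMP_ZIP, EMP_STATE, EMP_CITY) are used; the function names and the exact FD list are assumptions for illustration, and any other columns of the original table are left out.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_nf_violations(attributes, fds, candidate_keys):
    """Return FDs X -> Y where X is not a super key and Y has a non-prime attribute."""
    attributes = set(attributes)
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        extra = rhs - lhs                      # ignore the trivial part of the FD
        if not extra:
            continue
        is_super_key = closure(lhs, fds) == attributes
        if not is_super_key and not extra <= prime:
            violations.append((lhs, extra - prime))
    return violations


attributes = {"EMP_ID", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
fds = [
    ({"EMP_ID"}, {"EMP_ZIP"}),
    ({"EMP_ZIP"}, {"EMP_STATE", "EMP_CITY"}),
]
print(third_nf_violations(attributes, fds, [{"EMP_ID"}]))
# one violation: determinant {'EMP_ZIP'}, dependents EMP_STATE and EMP_CITY

# After the decomposition, EMPLOYEE_ZIP(EMP_ZIP, EMP_STATE, EMP_CITY) has key EMP_ZIP
zip_fds = [({"EMP_ZIP"}, {"EMP_STATE", "EMP_CITY"})]
print(third_nf_violations({"EMP_ZIP", "EMP_STATE", "EMP_CITY"}, zip_fds, [{"EMP_ZIP"}]))  # []
```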
Boyce Codd normal form (BCNF)
 BCNF is the advanced version of 3NF. It is stricter than 3NF.
 A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a
super key of the table.
 For BCNF, the table should be in 3NF, and for every FD, the LHS must be a super key.
Example: Let's assume there is a company where employees work in more than one
department.
 The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key, yet each
appears as the determinant of a functional dependency.
 To convert the given table into BCNF, we decompose it into three tables:
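The BCNF condition stated above (every non-trivial FD must have a super key on its left-hand side) can be checked with a short Python sketch. The attributes EMP_COUNTRY and DEPT_TYPE, the two FDs and the function names are hypothetical stand-ins, because the example table is not reproduced here; only EMP_ID and EMP_DEPT are named in the text.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    attributes = set(attributes)
    return [
        (set(lhs), set(rhs))
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != attributes
    ]


# Hypothetical relation: the key is {EMP_ID, EMP_DEPT}
attributes = {"EMP_ID", "EMP_DEPT", "EMP_COUNTRY", "DEPT_TYPE"}
fds = [
    ({"EMP_ID"}, {"EMP_COUNTRY"}),
    ({"EMP_DEPT"}, {"DEPT_TYPE"}),
]
print(len(bcnf_violations(attributes, fds)))  # 2: neither determinant is a super key

# One of the decomposed tables: (EMP_ID, EMP_COUNTRY) with key EMP_ID
print(bcnf_violations({"EMP_ID", "EMP_COUNTRY"}, [({"EMP_ID"}, {"EMP_COUNTRY"})]))  # []
```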
