Database Design With Normalization
Dr. Lupiana, D
FCIM, Institute of Finance Management
Semester 1
Agenda:
• Normalization
• Denormalization
Data Redundancy
• Data redundancy is one of the major problems
in relational databases
• It occurs when the same piece of data about an
entity is stored more than once, in one or more tables
• Data redundancy occurs when a relational
database is poorly designed
Data Redundancy: Effects
• Redundancy arises in particular when the
interdependencies between tables have not been
clearly defined and addressed
• Besides occupying unnecessary storage space,
redundant data makes queries involving joins
more complicated and resource-intensive
• Data redundancy also causes data anomalies
and therefore compromises the integrity and
reliability of the data
• Common data anomalies are insertion
anomaly, update anomaly, and deletion
anomaly
Data Redundancy: Effects: Data Anomalies
• Insertion Anomaly: It happens when a new
record cannot be inserted into a relational
database because other, unrelated data is missing
• Update Anomaly: It happens when changing
a single fact in a relational database
requires changing multiple rows of data
• Deletion Anomaly: This happens when a
deletion of unwanted data in a relational
database causes a deletion of wanted data
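The three anomalies can be sketched on a small in-memory table. This is only an illustration: the student rows and the helper `buildings_of` are hypothetical and do not come from a real database.

```python
# Hypothetical unnormalized table: every student row repeats the
# building of the student's department (redundant data).
students = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 2, "name": "Juma", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 3, "name": "Neema", "dept_name": "Maths", "dept_building": "Block B"},
]

def buildings_of(rows, dept):
    """Return the set of buildings recorded for a department."""
    return {r["dept_building"] for r in rows if r["dept_name"] == dept}

# Update anomaly: moving CS to another building requires changing every
# CS row. Changing only one row leaves the table contradicting itself.
students[0]["dept_building"] = "Block C"
cs_buildings = buildings_of(students, "CS")

# Deletion anomaly: deleting the only Maths student also deletes the
# fact that Maths is housed in Block B.
after_delete = [r for r in students if r["roll_no"] != 3]
maths_buildings = buildings_of(after_delete, "Maths")

# Insertion anomaly: a department with no students yet cannot be stored,
# because the key column roll_no would have to be left empty.
new_dept_row = {"roll_no": None, "name": None,
                "dept_name": "Physics", "dept_building": "Block D"}
can_insert = new_dept_row["roll_no"] is not None
```

After the partial update the CS department has two conflicting buildings, the Maths building is lost after the deletion, and the Physics row cannot be inserted.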
Normalization Intro
• Normalization is used to take care of the data
redundancy problem during database design
• Normalization is a process of organizing data
in a database
• Normalization divides a large and complex
table into smaller and simpler tables and links
them using relationships
• Normalization uses a set of rules to organize
data into a relational database
• These rules are categorized into six levels,
which are called normal forms:
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)
– Boyce-Codd Normal Form (BCNF)
– Fourth Normal Form (4NF)
– Fifth Normal Form (5NF)
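Normalization: Key Definitions: Superkey
• Is an attribute or a combination of attributes
that uniquely identifies a record in a table; it
may include attributes that are not needed for
uniqueness
• Superkeys in the example table: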
1. {Movie_Name}
2. {Release_Year, Popularity_Ranking}
3. {Release_Year, Popularity_Ranking, Movie_Name}
4. {Release_Year_And_Month, Release_Year}
Normalization: Key Definitions: Candidate Key
• Is an attribute or a combination of attributes
that uniquely identifies a record in a table
• Is a minimal superkey: a superkey with no
unnecessary attributes, so that removing any
attribute would lose uniqueness
• Candidate keys:
1. {Movie_Name}
2. {Release_Year, Popularity_Ranking}
3. {Release_Year_And_Month, Popularity_Ranking}
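The relationship between superkeys and candidate keys can be sketched by brute force. The movie rows below are hypothetical (the original example table is not reproduced here), and the functions `is_unique`, `superkeys` and `candidate_keys` are illustrative names. Uniqueness in a few sample rows only suggests a key; a real key must hold for every valid state of the table.

```python
from itertools import combinations

# Hypothetical sample rows for a movie table.
movies = [
    {"Movie_Name": "Alpha", "Release_Year": 2020, "Popularity_Ranking": 1},
    {"Movie_Name": "Beta", "Release_Year": 2020, "Popularity_Ranking": 2},
    {"Movie_Name": "Gamma", "Release_Year": 2021, "Popularity_Ranking": 1},
]

def is_unique(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(r[a] for a in attrs) for r in rows}
    return len(seen) == len(rows)

def superkeys(rows):
    """Every attribute set that identifies each sample row uniquely."""
    cols = sorted(rows[0])
    return [frozenset(combo)
            for n in range(1, len(cols) + 1)
            for combo in combinations(cols, n)
            if is_unique(rows, combo)]

def candidate_keys(rows):
    """Minimal superkeys: no proper subset is itself a superkey."""
    sks = superkeys(rows)
    return [k for k in sks if not any(other < k for other in sks)]

keys = candidate_keys(movies)
```

In these sample rows, `{Movie_Name}` and `{Release_Year, Popularity_Ranking}` come out as candidate keys, while supersets such as `{Movie_Name, Release_Year}` are superkeys only.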
Normalization: Key Definitions: Functional Dependency
• This occurs when one attribute uniquely
determines another attribute within a table
• In the example table, roll_no uniquely
determines name, dept_name and
dept_building
• This functional dependency is denoted as:
roll_no → {name, dept_name, dept_building}
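Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below assumes hypothetical student rows and an illustrative helper named `fd_holds`; it tests only the sample data, not the rule the table is meant to obey.

```python
def fd_holds(rows, determinant, dependents):
    """Check whether determinant -> dependents holds in the given rows."""
    seen = {}
    for r in rows:
        lhs = tuple(r[a] for a in determinant)
        rhs = tuple(r[a] for a in dependents)
        # setdefault stores rhs the first time lhs is seen and returns
        # the stored value afterwards, so a mismatch means two rows
        # share a determinant value but differ in the dependents.
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True

# Hypothetical sample rows.
students = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 2, "name": "Juma", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 3, "name": "Neema", "dept_name": "Maths", "dept_building": "Block B"},
]

forward = fd_holds(students, ["roll_no"], ["name", "dept_name", "dept_building"])
reverse = fd_holds(students, ["dept_name"], ["roll_no"])
```

Here roll_no determines the other attributes, but dept_name does not determine roll_no, because two students share the CS department.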
• It comes in four types:
– Trivial functional dependency
– Non-trivial functional dependency
– Multivalued functional dependency
– Transitive functional dependency
• Trivial Functional Dependency
– This happens when the dependent is a subset of the
determinant
– {roll_no, name} → name is a trivial functional
dependency since the dependent name is a
subset of the determinant set {roll_no, name}
• Non-trivial Functional Dependency
– This happens when the dependent is not a
subset of the determinant
– roll_no → name is a non-trivial functional
dependency since the dependent name is not a
subset of the determinant roll_no
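The trivial versus non-trivial distinction reduces to a subset test. A minimal sketch, with `is_trivial` as an illustrative name:

```python
def is_trivial(determinant, dependent):
    """A dependency X -> Y is trivial when Y is a subset of X."""
    return set(dependent) <= set(determinant)

trivial_case = is_trivial({"roll_no", "name"}, {"name"})   # {roll_no, name} -> name
non_trivial_case = is_trivial({"roll_no"}, {"name"})       # roll_no -> name
```

The first call corresponds to the trivial example above and the second to the non-trivial one.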
Normalization: Key Definitions
• A prime attribute: Is an attribute that belongs
to at least one candidate key
• A non-prime attribute: Is an attribute that
does not belong to any candidate key
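Given the candidate keys listed earlier, prime and non-prime attributes follow from a set union and a set difference. In this sketch the attribute `Director` is hypothetical, added only so that there is a non-prime attribute to show.

```python
# Candidate keys as listed in the candidate key example.
candidate_keys = [
    {"Movie_Name"},
    {"Release_Year", "Popularity_Ranking"},
    {"Release_Year_And_Month", "Popularity_Ranking"},
]

# All attributes of the table; "Director" is a hypothetical extra column.
attributes = {"Movie_Name", "Release_Year", "Release_Year_And_Month",
              "Popularity_Ranking", "Director"}

# Prime attributes belong to at least one candidate key.
prime = set().union(*candidate_keys)

# Non-prime attributes belong to none of them.
non_prime = attributes - prime
```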
Normalization: 1NF
• The rules are;
– There must be a primary key
– A column should have a unique name
– A column should contain data of the same type
– A cell should not contain a repeating group
• A repeating group is a multivalued entry, i.e. a cell
that holds more than one value
– Do not use multiple columns in a single table to
store similar data
Normalization: 1NF: Steps
• Steps to be taken;
– Specify a primary key, if none is specified
– Remove redundant columns i.e. columns that
store similar data
– Identify repeating groups and represent them as
different rows
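The last step, representing a repeating group as separate rows, can be sketched as follows. The table, the `phones` column and the function `to_1nf` are hypothetical and assume the multiple values are stored as comma-separated text.

```python
# Hypothetical table that violates 1NF: the "phones" cell of the first
# row holds a repeating group (two values in one cell).
unnormalized = [
    {"roll_no": 1, "name": "Asha", "phones": "0711, 0722"},
    {"roll_no": 2, "name": "Juma", "phones": "0733"},
]

def to_1nf(rows, column, sep=","):
    """Emit one row per value found in the multivalued column."""
    out = []
    for r in rows:
        for value in r[column].split(sep):
            new_row = dict(r)              # copy the other columns unchanged
            new_row[column] = value.strip()
            out.append(new_row)
    return out

first_nf = to_1nf(unnormalized, "phones")

# roll_no alone no longer identifies a row, so the primary key of the
# result is the composite (roll_no, phones).
key_values = {(r["roll_no"], r["phones"]) for r in first_nf}
```

Each cell now holds a single value, at the cost of repeating `name`. That repetition is what the later normal forms address.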
Normalization: 2NF
• 2NF is about how non-key attributes in a table
relate to the primary key
• The rules are;
– A table should be in the 1NF
– There should be no partial dependencies
• There should be no non-prime attribute that depends
on a part of a candidate key
Normalization: 2NF: Partial Dependency
• Occurs when a non-prime attribute depends
functionally on only a part of a primary key
• This is possible only when the primary key is made
up of two or more columns, i.e. it is a composite key
• So if a non-prime attribute in a table
depends on a proper subset of a composite key, the
table is said to have a partial dependency
Normalization: 2NF: Steps
• Steps to be taken;
– Identify non-prime attributes with partial
dependencies in a table
– Remove each of the attributes, with a copy of its
determinant, to create a new table
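The two steps above can be sketched with a relational projection. The enrolment table is hypothetical: its composite key is assumed to be (roll_no, course_code), and course_title is assumed to depend on course_code alone, which is a partial dependency.

```python
# Hypothetical 1NF table with composite key (roll_no, course_code).
enrolments = [
    {"roll_no": 1, "course_code": "DB101", "course_title": "Databases", "grade": "A"},
    {"roll_no": 2, "course_code": "DB101", "course_title": "Databases", "grade": "B"},
    {"roll_no": 1, "course_code": "OS201", "course_title": "Operating Systems", "grade": "B"},
]

def project(rows, attrs):
    """Relational projection: keep the given attributes, drop duplicate rows."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append({a: r[a] for a in attrs})
    return out

# Move the partially dependent attribute, with a copy of its
# determinant, into a new table.
courses = project(enrolments, ["course_code", "course_title"])

# The original table keeps only attributes that depend on the whole key.
grades = project(enrolments, ["roll_no", "course_code", "grade"])
```

Each course title is now stored once in `courses`, and `grades` refers to it through course_code.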
Normalization: 3NF
• The guidelines are;
– A table should be in the 2NF
– There should be no transitive dependency
• A non-prime attribute should not depend on other
non-prime attributes
Normalization: 3NF: Transitive Dependency
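A transitive dependency can be illustrated with the student attributes used in the functional dependency example. The sketch assumes, as a hypothetical rule, that dept_name determines dept_building, so dept_building depends on roll_no only through dept_name. The rows and the helper `project` are illustrative.

```python
# Hypothetical rows: roll_no -> dept_name and (assumed) dept_name -> dept_building.
students = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 2, "name": "Juma", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 3, "name": "Neema", "dept_name": "Maths", "dept_building": "Block B"},
]

def project(rows, attrs):
    """Relational projection: keep the given attributes, drop duplicate rows."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append({a: r[a] for a in attrs})
    return out

# Remove the transitive dependency by splitting on dept_name.
student = project(students, ["roll_no", "name", "dept_name"])
department = project(students, ["dept_name", "dept_building"])

# Joining the two tables on dept_name recovers the original rows,
# so no information is lost by the split.
building_of = {d["dept_name"]: d["dept_building"] for d in department}
rejoined = [dict(s, dept_building=building_of[s["dept_name"]]) for s in student]
```

The building of each department is now stored once, in `department`.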
Normalization: BCNF
• A stricter version of 3NF that requires a
superkey on the left-hand side of every
non-trivial functional dependency
• The guidelines are;
– A table should be in the 3NF
– With the exception of trivial functional dependencies,
every functional dependency in a table must have a
superkey as its determinant
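The BCNF rule can be checked with the standard attribute closure computation: a determinant is a superkey when its closure contains every attribute of the table. The dependencies below are assumptions made for illustration (in particular dept_name → dept_building), and `closure` and `bcnf_violations` are illustrative names.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Non-trivial dependencies whose determinant is not a superkey."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != all_attrs]

attrs = {"roll_no", "name", "dept_name", "dept_building"}
fds = [
    (frozenset({"roll_no"}), frozenset({"name", "dept_name"})),
    (frozenset({"dept_name"}), frozenset({"dept_building"})),  # assumed rule
]

violations = bcnf_violations(attrs, fds)

# After splitting off a department table, each table satisfies BCNF.
student_ok = bcnf_violations({"roll_no", "name", "dept_name"}, [fds[0]])
department_ok = bcnf_violations({"dept_name", "dept_building"}, [fds[1]])
```

In the single table, dept_name → dept_building is reported because dept_name is not a superkey. After the split, neither table has a violation.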
Normalization: BCNF