Normalization Notes
INTRODUCTION
Normalization is a process of organizing data in a database to reduce
redundancy and improve data consistency.
Primary keys are central to organizing information in a database.
A primary key gives every row in a table a unique identifier, so that
rows cannot be confused with one another or lost.
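
The uniqueness guarantee described above can be tried out with a small sketch. It uses Python's built-in sqlite3 module; the table name `customer` and its columns are illustrative assumptions, not names taken from these notes.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'John')")
conn.execute("INSERT INTO customer VALUES (2, 'Jane')")

# A second row with customer_id 1 is rejected by the primary key constraint.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Michael')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(duplicate_rejected, row_count)
```

Because the duplicate insert fails, the table still holds exactly one row per customer ID.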
Normalization
Normalization organizes the data in a database to avoid data redundancy
and the insertion, update, and deletion anomalies that redundancy causes.
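
As a rough illustration of an update anomaly, the sketch below compares a redundant layout with a split layout. All rows, names, and the changed spelling "Jon" are made up for the example.

```python
# Redundant layout: the customer's name is repeated on every order row.
orders = [
    {"order_id": 1, "customer_id": 1, "customer_name": "John"},
    {"order_id": 2, "customer_id": 1, "customer_name": "John"},
]

# Update anomaly: the name is changed on one row but not on the other.
orders[0]["customer_name"] = "Jon"
names_for_customer_1 = {
    o["customer_name"] for o in orders if o["customer_id"] == 1
}
print(names_for_customer_1)  # the same customer now has two different names

# Split layout: the name is stored once, so one update reaches every order.
customers = {1: "John"}
order_rows = [
    {"order_id": 1, "customer_id": 1},
    {"order_id": 2, "customer_id": 1},
]
customers[1] = "Jon"
names_after = {customers[o["customer_id"]] for o in order_rows}
print(names_after)
```

Storing each fact once is what removes the chance of two rows disagreeing.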
Customer ID  Name     Phone Numbers
2            Jane     555-9876
3            Michael  555-5555

Customer ID  Name     Phone Number
1            John     555-1234
1            John     555-5678
2            Jane     555-9876
3            Michael  555-5555

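
The table above stores one phone number per row. A minimal sketch of how such rows could be produced follows, assuming the original data kept several numbers in a single comma-separated cell. The input row for customer 1 and the function name `to_first_normal_form` are assumptions made for illustration.

```python
# Hypothetical input: several phone numbers packed into one cell.
unnormalized = [
    (1, "John", "555-1234, 555-5678"),
    (2, "Jane", "555-9876"),
    (3, "Michael", "555-5555"),
]


def to_first_normal_form(rows):
    """Return one (customer_id, name, phone) row per phone number."""
    result = []
    for customer_id, name, phones in rows:
        for phone in phones.split(","):
            result.append((customer_id, name, phone.strip()))
    return result


normalized = to_first_normal_form(unnormalized)
for row in normalized:
    print(row)
```

Each output row holds a single atomic phone value, which is the shape of the table above.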
1 1 John 1 Shirt 2
1 1 John 2 Pants 1
2 2 Jane 1 Shirt 1
2 2 Jane 3 Hat 3
This table violates 2NF because the Customer Name column depends on only
part of the primary key, namely Customer ID.
To normalize this table to 2NF, we can split it into two tables:
1 1 1 2
1 1 2 1
2 2 1 1
2 2 3 3

Customer ID  Customer Name
1            John
2            Jane
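
The split can be sketched in a few lines of Python. The source table has no header, so the column order assumed here (order ID, customer ID, customer name, product ID, product name, quantity) is a guess. Like the tables above, the sketch leaves the product name out of the result.

```python
# Rows copied from the table above; the column meanings are assumptions.
rows = [
    (1, 1, "John", 1, "Shirt", 2),
    (1, 1, "John", 2, "Pants", 1),
    (2, 2, "Jane", 1, "Shirt", 1),
    (2, 2, "Jane", 3, "Hat", 3),
]

# Customer name depends only on customer ID, so it moves to its own table.
customers = sorted({(cust_id, name) for _, cust_id, name, _, _, _ in rows})

# The remaining table keeps the IDs and the quantity.
order_lines = [(o, c, p, q) for o, c, _, p, _, q in rows]

print(customers)
print(order_lines)
```

Each customer name now appears once, however many order rows refer to that customer.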

1  100  John Smith   New York       100  2022-01-01
2  101  Jane Doe     Los Angeles    200  2022-01-02
3  102  Bob Johnson  San Francisco  300  2022-01-03

Table: Books
2  The Great Gatsby  101  F. Scott Fitzgerald  American
In this example, the functional dependency from "Author ID" to "Author
Name" violates BCNF because its determinant, "Author ID", is not a
candidate key of the Books table.
To bring this table to BCNF, we can split it into two tables:
Table 1: Authors
Table 2: Books
Now "Author Name" and "Author Nationality" depend only on "Author ID", the
key of the Authors table, so they are no longer transitively dependent on
the primary key of Books, and both tables are in BCNF.
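
One possible shape for the two tables is sketched below with Python's built-in sqlite3 module. The column names `book_id` and `title` and the exact schema are assumptions. Only the sample row comes from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

# Author facts live in one place, keyed by author_id.
conn.execute(
    """CREATE TABLE authors (
           author_id INTEGER PRIMARY KEY,
           author_name TEXT NOT NULL,
           author_nationality TEXT NOT NULL)"""
)
# Books refer to their author by ID only.
conn.execute(
    """CREATE TABLE books (
           book_id INTEGER PRIMARY KEY,
           title TEXT NOT NULL,
           author_id INTEGER NOT NULL REFERENCES authors(author_id))"""
)
conn.execute("INSERT INTO authors VALUES (101, 'F. Scott Fitzgerald', 'American')")
conn.execute("INSERT INTO books VALUES (2, 'The Great Gatsby', 101)")

# A join rebuilds the original wide row when it is needed.
joined = conn.execute(
    """SELECT b.book_id, b.title, a.author_id, a.author_name, a.author_nationality
       FROM books AS b JOIN authors AS a ON a.author_id = b.author_id"""
).fetchall()
print(joined)
```

The join shows that no information is lost by the split: the wide row can still be produced on demand.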
Fourth Normal Form (4NF)
4NF builds on BCNF by requiring that a table have no non-trivial
multi-valued dependencies other than those determined by a candidate key.
A multi-valued dependency occurs when one column determines a set of
values in a second column independently of the table's remaining columns.
For example, a table of customer orders identified by order ID, with
columns for customer ID and order items, violates 4NF when one order has
several items, because the set of items has to be repeated alongside the
other columns of the order.
Similarly, a table that lists orders and their products, with columns for
order ID, product ID, and product details, violates 4NF because the
product details depend on product ID alone rather than on the full
combination of order ID and product ID.
Example
Consider the following table of orders and products
In this table, the product name and description are determined by the product
ID alone, yet they are repeated for every order that contains the product. To
bring the table into 4NF, we can split it into three tables:
Order ID Product ID
1 100
1 200
2 100
2 300
3 200
3 300

Product ID  Product Name
100         Widget
200         Widget
300         Thing
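
To check that the split loses nothing, the two tables shown above can be joined back together. This is a sketch using Python's built-in sqlite3 module. The table and column names are assumptions, and the product description table is left out because its rows are not shown in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE order_products (
           order_id INTEGER NOT NULL,
           product_id INTEGER NOT NULL REFERENCES products(product_id),
           PRIMARY KEY (order_id, product_id))"""
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(100, "Widget"), (200, "Widget"), (300, "Thing")],
)
conn.executemany(
    "INSERT INTO order_products VALUES (?, ?)",
    [(1, 100), (1, 200), (2, 100), (2, 300), (3, 200), (3, 300)],
)

# Each product name is stored once; the join repeats it per order only on read.
combined = conn.execute(
    """SELECT op.order_id, op.product_id, p.product_name
       FROM order_products AS op
       JOIN products AS p ON p.product_id = op.product_id
       ORDER BY op.order_id, op.product_id"""
).fetchall()
for row in combined:
    print(row)
```

The joined result has one row per order and product pair, while the stored product table keeps a single row per product.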