Database Design With Normalization

The document discusses data normalization and denormalization in database design, focusing on the issue of data redundancy and its negative effects, such as data anomalies. It outlines the process of normalization, which organizes data into smaller tables to eliminate redundancy, and describes various normal forms (1NF, 2NF, 3NF, BCNF) along with key definitions like superkey, candidate key, and functional dependency. The document also details the steps required to achieve each normal form to ensure data integrity and efficiency in relational databases.

Uploaded by

Mayala Mwendesha

CSU 07307: Datawarehouse & Database
Normalization and Database Design

Dr. Lupiana, D
FCIM, Institute of Finance Management
Semester 1
Agenda:
• Normalization
• Denormalization

2
Data Redundancy
• Data redundancy is one of the major problems
in relational databases
• It occurs when the same piece of data about an
entity is stored more than once, in one or
multiple tables
• Data redundancy occurs when a relational
database is poorly designed

3
Data Redundancy: Effects
• Data redundancy arises in particular when the
interdependencies of tables have not been
clearly defined and addressed
• Besides occupying unnecessary storage space,
redundant information makes queries involving
joins more complicated and resource-intensive

4
Data Redundancy: Effects
• Data redundancy also causes data anomalies
and therefore compromises the integrity of the
data, making it unreliable
• Common data anomalies are insertion
anomaly, update anomaly, and deletion
anomaly

5
Data Redundancy: Effects: Data Anomalies
• Insertion Anomaly: It happens when a new
record cannot be inserted into a relational
database because other, unrelated data is missing
• Update Anomaly: It happens when a change
to a single piece of data in a relational database
requires changing multiple rows of data

6
Data Redundancy: Effects: Data Anomalies
• Deletion Anomaly: This happens when deleting
unwanted data from a relational database also
deletes data that should have been kept
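
The update and deletion anomalies can be illustrated with a small sketch. The table below is hypothetical (the names, departments and buildings are made up for illustration); it assumes a single table in which department details are repeated on every student row.

```python
# Hypothetical redundant table: department details repeat on every student row.
students = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 2, "name": "Juma", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 3, "name": "Neema", "dept_name": "Math", "dept_building": "Block B"},
]

# Update anomaly: moving CS to another building means changing every CS row.
rows_touched = 0
for row in students:
    if row["dept_name"] == "CS":
        row["dept_building"] = "Block C"
        rows_touched += 1

# Deletion anomaly: removing the only Math student also removes
# the only record of where the Math department is located.
remaining = [row for row in students if row["roll_no"] != 3]
known_departments = {row["dept_name"] for row in remaining}

print(rows_touched)                 # number of rows changed for one logical update
print("Math" in known_departments)  # whether Math survived the deletion
```

An insertion anomaly appears in the same design: a new department with no students cannot be recorded unless a student row (and a roll_no) is invented for it.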

7
Normalization Intro
• Normalization is used to take care of the data
redundancy problem during database design
• Normalization is a process of organizing data
in a database
• Normalization divides a large and complex
table into smaller and simpler tables and links
them using relationships

8
Normalization Intro
• Normalization uses a set of rules to organize
data into a relational database
• These rules are categorized into six levels,
which are called normal forms;
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)

9
Normalization Intro
• These rules are categorized into six levels,
which are called normal forms;
– Boyce-Codd Normal Form (BCNF)
– Fourth Normal Form (4NF)
– Fifth Normal Form (5NF)

10
Normalization: Key Definitions: Superkey

• Is a set of one or more attributes that, taken
together, uniquely identifies a record in a table
• It can be a single attribute or a group of
attributes, and it may include attributes that
are not needed for uniqueness

11
Normalization: Key Definitions: Superkey

• Superkeys;
1. {Movie_Name}
2. {Release_Year, Popularity_Ranking}
3. {Release_Year, Popularity_Ranking, Movie_Name}
4. {Release_Year_And_Month, Release_Year}

12
Normalization: Key Definitions: Candidate
Key
• Is an attribute or a combination of attributes
that uniquely identifies a record in a table
• Is a minimal superkey, i.e. a superkey with no
unnecessary attributes: removing any attribute
would stop it from uniquely identifying a record

13
Normalization: Key Definitions: Candidate
Key

• Candidate keys:
1. {Movie_Name}
2. {Release_Year, Popularity_Ranking}
3. {Release_Year_And_Month, Popularity_Ranking}
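
The difference between superkeys and candidate keys can be sketched in code. The rows below are hypothetical, chosen so that the result matches the candidate keys listed above; the column names come from the slides. Note that checking one set of rows only shows which attribute sets are unique in that sample, whereas a real key is a rule about all valid data.

```python
from itertools import combinations

COLUMNS = ("Movie_Name", "Release_Year", "Release_Year_And_Month", "Popularity_Ranking")

# Hypothetical sample rows, in the column order given above.
ROWS = [
    ("Movie A", 2008, "2008-07", 1),
    ("Movie B", 2008, "2008-07", 2),
    ("Movie C", 2009, "2009-03", 1),
    ("Movie D", 2009, "2009-11", 2),
]

def is_superkey(attrs, rows=ROWS):
    """True if no two rows share the same values for attrs."""
    positions = [COLUMNS.index(a) for a in attrs]
    projected = [tuple(row[i] for i in positions) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows=ROWS):
    """Return the minimal superkeys, smallest attribute sets first."""
    keys = []
    for size in range(1, len(COLUMNS) + 1):
        for attrs in combinations(COLUMNS, size):
            # Skip any set that already contains a smaller key: it is not minimal.
            contains_key = any(set(k) <= set(attrs) for k in keys)
            if not contains_key and is_superkey(attrs, rows):
                keys.append(attrs)
    return keys

for key in candidate_keys():
    print(key)
```

Every candidate key is a superkey, but {Release_Year, Popularity_Ranking, Movie_Name} is a superkey that is not a candidate key because it contains the smaller key {Movie_Name}.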

14
Normalization: Key Definitions: Functional
Dependency
• This occurs when one attribute uniquely
determines another attribute within a table

15
Normalization: Key Definitions: Functional
Dependency
• In the previous table, roll_no uniquely
determines name, dept_name and
dept_building
• The above functional dependency can be
denoted as;
roll_no → {name, dept_name, dept_building}
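
A minimal sketch of how a functional dependency could be tested against a set of rows. The helper name fd_holds and the sample rows are assumptions made for illustration; the attribute names follow the slide.

```python
# Hypothetical rows using the attribute names from the slide.
rows = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 2, "name": "Juma", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 3, "name": "Neema", "dept_name": "Math", "dept_building": "Block B"},
]

def fd_holds(rows, determinant, dependent):
    """True if equal determinant values always come with equal dependent values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

print(fd_holds(rows, ["roll_no"], ["name", "dept_name", "dept_building"]))
print(fd_holds(rows, ["dept_name"], ["name"]))
```

The second call fails on this sample because two students in the same department have different names, so dept_name does not determine name.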

16
Normalization: Key Definitions: Functional
Dependency
• It comes in four types;
– Trivial functional dependency
– Non-trivial functional dependency
– Multivalued functional dependency
– Transitive functional dependency

17
Normalization: Key Definitions: Functional
Dependency
• Trivial Functional Dependency
– This happens when the dependent is a subset of the
determinant
– {roll_no, name} → name is a trivial functional
dependency since the dependent name is a subset of
the determinant set {roll_no, name}

18
Normalization: Key Definitions: Functional
Dependency
• Non-trivial Functional Dependency
– This happens when the dependent is not a subset of
the determinant
– roll_no → name is a non-trivial functional
dependency since the dependent name is not a subset
of the determinant roll_no
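
The trivial versus non-trivial distinction reduces to a subset test, sketched below. The function name is_trivial is an assumption made for illustration.

```python
def is_trivial(determinant, dependent):
    """A dependency is trivial when the dependent is a subset of the determinant."""
    return set(dependent) <= set(determinant)

print(is_trivial({"roll_no", "name"}, {"name"}))  # {roll_no, name} → name
print(is_trivial({"roll_no"}, {"name"}))          # roll_no → name
```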

19
Normalization: Key Definitions
• A prime attribute: Is an attribute that belongs
to at least one candidate key
• A non-prime attribute: Is an attribute that
does not belong to any candidate key

20
Normalization: 1NF
• The rules are;
– There must be a primary key
– A column should have a unique name
– A column should contain data of the same type
– A cell should not contain a repeating group
• A repeating group is a multivalued cell, i.e. a cell
that contains more than one value
– Do not use multiple columns in a single table to
store similar data

21
Normalization: 1NF: Steps
• Steps to be taken;
– Specify a primary key, if none is specified
– Remove redundant columns i.e. columns that
store similar data
– Identify repeating groups and represent them as
different rows
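
The last step above can be sketched as follows. The table, the phones column and the comma separator are hypothetical; the sketch assumes the repeating group is stored as a delimited string in one cell.

```python
# Hypothetical table with a repeating group in the "phones" cell.
raw = [
    {"roll_no": 1, "name": "Asha", "phones": "0711-111, 0722-222"},
    {"roll_no": 2, "name": "Juma", "phones": "0733-333"},
]

def split_repeating_group(rows, column, separator=","):
    """Emit one row per value found in the multivalued column."""
    result = []
    for row in rows:
        for value in row[column].split(separator):
            new_row = dict(row)            # copy the other columns unchanged
            new_row[column] = value.strip()
            result.append(new_row)
    return result

for row in split_repeating_group(raw, "phones"):
    print(row)
```

After the split, roll_no alone no longer identifies a row, so the primary key of this table becomes the combination of roll_no and phones.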

22
Normalization: 2NF
• 2NF is about how non-key attributes in a table
relate to the primary key
• The rules are;
– A table should be in the 1NF
– There should be no partial dependencies
• There should be no non-prime attribute that depends
on a part of a candidate key

23
Normalization: 2NF: Partial Dependency
• Occurs when a non-prime attribute depends
functionally on a part of a primary key
• In this case the primary key is made up of two
or more columns i.e. it is a composite key
• So if there is a non-prime attribute in a table
that depends on only part of a composite key, the
table is said to have a partial dependency

24
Normalization: 2NF: Steps
• Steps to be taken;
– Identify non-prime attributes with partial
dependencies in a table
– Remove each of the attributes, with a copy of its
determinant, to create a new table
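
The steps above can be sketched as a decomposition. The enrolment table is hypothetical: it assumes a composite primary key (roll_no, course_id), with course_name depending only on course_id, which is a partial dependency.

```python
def project(rows, attrs):
    """Keep only attrs from each row and drop duplicate rows."""
    result = []
    for row in rows:
        item = {a: row[a] for a in attrs}
        if item not in result:
            result.append(item)
    return result

# Hypothetical table: key is (roll_no, course_id); course_name depends on course_id only.
enrolment = [
    {"roll_no": 1, "course_id": "C1", "course_name": "Databases", "grade": "A"},
    {"roll_no": 2, "course_id": "C1", "course_name": "Databases", "grade": "B"},
    {"roll_no": 1, "course_id": "C2", "course_name": "Networks", "grade": "B"},
]

# New table: the partially dependent attribute with a copy of its determinant.
courses = project(enrolment, ["course_id", "course_name"])
# Original table without the partially dependent attribute.
grades = project(enrolment, ["roll_no", "course_id", "grade"])

print(courses)
print(grades)
```

Each course name is now stored once, and course_id in the grades table links back to the courses table.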

25
Normalization: 3NF
• The guidelines are;
– A table should be in the 2NF
– There should be no transitive dependency
• A non-prime attribute should not depend on other non-
prime attributes

26
Normalization: 3NF: Transitive Dependency

• Occurs when a non-prime attribute that is
determined by the primary key also determines
another non-prime attribute
• If A → B and B → C, then ‘C’ is transitively
dependent on ‘A’ through ‘B’
27
Normalization: 3NF: Steps
• Steps to be taken;
– Identify non-prime attributes which are
transitively dependent in a table
– Remove each of the columns, with a copy of its
determinant, to create a new table
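
The steps above can be sketched with the student attributes used earlier. The rows are hypothetical, and the sketch assumes roll_no → dept_name and dept_name → dept_building, so dept_building is transitively dependent on roll_no.

```python
def project(rows, attrs):
    """Keep only attrs from each row and drop duplicate rows."""
    result = []
    for row in rows:
        item = {a: row[a] for a in attrs}
        if item not in result:
            result.append(item)
    return result

# Hypothetical table: roll_no → dept_name → dept_building.
students = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 2, "name": "Juma", "dept_name": "CS", "dept_building": "Block A"},
    {"roll_no": 3, "name": "Neema", "dept_name": "Math", "dept_building": "Block B"},
]

# New table: the transitively dependent column with a copy of its determinant.
departments = project(students, ["dept_name", "dept_building"])
# Original table without the transitively dependent column.
students_3nf = project(students, ["roll_no", "name", "dept_name"])

print(departments)
print(students_3nf)
```

A department's building is now recorded in one row only, so changing it no longer requires updating every student in that department.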

28
Normalization: BCNF
• A stricter version of 3NF emphasizing the
existence of a superkey on the left-hand side of
every non-trivial functional dependency
• The guidelines are;
– A table should be in the 3NF
– With exception of trivial functional dependencies,
every functional dependency in a table must be a
dependency on a superkey

29
Normalization: BCNF

• The table above is in 3NF because


• However, the table violates BCNF because
Release_Year depends on
Release_Year_And_Month, which is not a
superkey
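
A BCNF check can be sketched with attribute closures. The list of functional dependencies below is an assumption inferred from the candidate keys and the violation named on the slides; the source table itself is not reproduced here.

```python
ATTRS = {"Movie_Name", "Release_Year", "Release_Year_And_Month", "Popularity_Ranking"}

# Assumed dependencies, written as (determinant, dependent) pairs.
FDS = [
    (("Movie_Name",), tuple(sorted(ATTRS))),
    (("Release_Year", "Popularity_Ranking"), tuple(sorted(ATTRS))),
    (("Release_Year_And_Month", "Popularity_Ranking"), tuple(sorted(ATTRS))),
    (("Release_Year_And_Month",), ("Release_Year",)),
]

def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]

print(bcnf_violations(ATTRS, FDS))
```

A determinant is a superkey exactly when its closure contains every attribute of the table, which is why the closure is compared against the full attribute set.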
30
