BC2402 Week 2
Normalization
2.1 Relational Data Model
Previously…
[Diagram: Use Case, ERD, Extended ERD; then Use Case, ERD, Relational Data Model]
Relation
Definition: A relation is a named, two-dimensional table of data
Math foundation
Relations (almost tables)
Operators (Select, Project, Join, Union, Intersection, Subtraction…)
Correspondence with E-R Model
Relations (tables) correspond with entity types and with
many-to-many (incl. one to many) relationship types
Primary Key
Entity Integrity
No primary key attribute may be null. All primary key fields MUST have
data
Referential Integrity
States that any foreign key value (in the relation on the many side) MUST match a primary key value in the relation on the one side (or the foreign key can be null). For example:
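The two integrity rules can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not part of the original slides: the customer and orders tables and the try_sql helper are hypothetical names chosen for illustration. Text keys are declared NOT NULL explicitly, and foreign key checking is switched on, because SQLite does not enforce either by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only checks foreign keys when this is on

conn.executescript("""
CREATE TABLE customer (
    customer_id TEXT NOT NULL PRIMARY KEY,   -- entity integrity: key may not be null
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    TEXT NOT NULL PRIMARY KEY,
    customer_id TEXT REFERENCES customer(customer_id)  -- foreign key, nullable
);
""")

def try_sql(sql, params=()):
    """Run one statement; True if accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

valid_customer = try_sql("INSERT INTO customer VALUES (?, ?)", ("C1", "Ann"))
null_key       = try_sql("INSERT INTO customer VALUES (?, ?)", (None, "Bob"))
matching_fk    = try_sql("INSERT INTO orders VALUES (?, ?)", ("O1", "C1"))
dangling_fk    = try_sql("INSERT INTO orders VALUES (?, ?)", ("O2", "C9"))
null_fk        = try_sql("INSERT INTO orders VALUES (?, ?)", ("O3", None))

print(valid_customer, null_key, matching_fk, dangling_fk, null_fk)
```

Only the null primary key and the foreign key with no matching customer are rejected; a null foreign key is accepted.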
Delete Rules
Restrict: don’t allow delete of “parent” side if related rows exist in “dependent” side
Cascade: automatically delete “dependent” side rows that correspond with the “parent” side row to
be deleted
Set-to-Null: set the foreign key in the dependent side to null when the parent side row is deleted; not allowed for weak entities
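The three delete rules map onto SQL's ON DELETE clause. The sketch below is an illustration under assumptions: the parent and dependent tables and the function name are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

def delete_parent_under(rule):
    """Delete a parent row under the given ON DELETE rule and report the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE parent (id TEXT NOT NULL PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE dependent ("
        " id TEXT NOT NULL PRIMARY KEY,"
        f" parent_id TEXT REFERENCES parent(id) ON DELETE {rule})"
    )
    conn.execute("INSERT INTO parent VALUES ('P1')")
    conn.execute("INSERT INTO dependent VALUES ('D1', 'P1')")
    try:
        conn.execute("DELETE FROM parent WHERE id = 'P1'")
        outcome = "deleted"
    except sqlite3.IntegrityError:
        outcome = "blocked"
    dependents = conn.execute("SELECT id, parent_id FROM dependent").fetchall()
    conn.close()
    return outcome, dependents

results = {rule: delete_parent_under(rule) for rule in ("RESTRICT", "CASCADE", "SET NULL")}
for rule, result in results.items():
    print(rule, result)
```

RESTRICT blocks the delete and leaves the dependent row untouched, CASCADE removes the dependent row, and SET NULL keeps it with a null foreign key.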
Referential integrity constraints
Referential integrity constraints are drawn via arrows from dependent to parent table
Transforming ER to Relational
Entities
Every entity becomes a table
Weak entity takes key of strong entity as part of primary key
Attribute
Every ER attribute becomes a relational attribute
Composite attributes: Use only their simple, component attributes
Multivalued Attribute: Becomes a separate relation with a foreign key taken from the superior entity
Remember: ER Model has no foreign keys!
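The entity and attribute rules above can be sketched as table definitions. Everything here is assumed for illustration: an Employee entity with a composite Address attribute, a multivalued Skill attribute, and a weak entity Dependent. None of these names come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Strong entity; the composite attribute Address is stored as its simple parts.
CREATE TABLE employee (
    emp_id TEXT NOT NULL PRIMARY KEY,
    name   TEXT NOT NULL,
    street TEXT,
    city   TEXT
);
-- The multivalued attribute Skill becomes its own relation with a foreign key.
CREATE TABLE employee_skill (
    emp_id TEXT NOT NULL REFERENCES employee(emp_id),
    skill  TEXT NOT NULL,
    PRIMARY KEY (emp_id, skill)
);
-- The weak entity takes the strong entity's key as part of its primary key.
CREATE TABLE dependent (
    emp_id   TEXT NOT NULL REFERENCES employee(emp_id),
    dep_name TEXT NOT NULL,
    PRIMARY KEY (emp_id, dep_name)
);
INSERT INTO employee VALUES ('E1', 'Ann', '1 Main St', 'Singapore');
INSERT INTO employee_skill VALUES ('E1', 'SQL'), ('E1', 'Python');
INSERT INTO dependent VALUES ('E1', 'Ben');
""")

skills = [row[0] for row in conn.execute(
    "SELECT skill FROM employee_skill WHERE emp_id = 'E1' ORDER BY skill")]
print(skills)
```

The foreign keys appear only at this stage; the ER model itself does not contain them.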
Relationships
Become a table or
Set foreign keys
Sometimes both
Many-to-Many (M:N)
Create a new relation with the primary keys of the two entities as its primary key
Watch out for superkeys
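A minimal sketch of the M:N rule, assuming hypothetical student and course entities joined by an enrolment relation. The composite key contains only the two foreign keys; adding further columns to it would produce a superkey rather than a minimal key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (stu_id TEXT NOT NULL PRIMARY KEY, name  TEXT);
CREATE TABLE course  (c_code TEXT NOT NULL PRIMARY KEY, title TEXT);
-- New relation for the M:N relationship; its key is the pair of foreign keys.
CREATE TABLE enrolment (
    stu_id TEXT NOT NULL REFERENCES student(stu_id),
    c_code TEXT NOT NULL REFERENCES course(c_code),
    PRIMARY KEY (stu_id, c_code)
);
INSERT INTO student VALUES ('S1', 'Ann'), ('S2', 'Ben');
INSERT INTO course  VALUES ('C1', 'Databases'), ('C2', 'Statistics');
INSERT INTO enrolment VALUES ('S1', 'C1'), ('S1', 'C2'), ('S2', 'C1');
""")

pairs = conn.execute("""
    SELECT s.name, c.title
    FROM enrolment e
    JOIN student s ON s.stu_id = e.stu_id
    JOIN course  c ON c.c_code = e.c_code
    ORDER BY s.name, c.title
""").fetchall()
print(pairs)

# The same pair cannot be recorded twice because it is the primary key.
try:
    conn.execute("INSERT INTO enrolment VALUES ('S1', 'C1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```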
One-to-One (1:1)
One side optional
Primary key on the mandatory side becomes a foreign key on the
optional side
Treat like 1:M
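One way to sketch the 1:1 rule, assuming a hypothetical pair of entities: every care centre must have exactly one nurse in charge (mandatory side), while a nurse need not run any centre (optional side). The foreign key is placed as in a 1:M mapping, and a UNIQUE constraint, which is an addition for this illustration, stops it from becoming 1:M.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE nurse (nurse_id TEXT NOT NULL PRIMARY KEY, name TEXT);
-- The nurse key becomes a foreign key in care_center.
-- NOT NULL makes the nurse mandatory; UNIQUE keeps the relationship 1:1.
CREATE TABLE care_center (
    center_id       TEXT NOT NULL PRIMARY KEY,
    nurse_in_charge TEXT NOT NULL UNIQUE REFERENCES nurse(nurse_id)
);
INSERT INTO nurse VALUES ('N1', 'Ann'), ('N2', 'Ben');
INSERT INTO care_center VALUES ('CC1', 'N1');
""")

def second_center_allowed(nurse_id):
    """Try to open a second centre run by the given nurse."""
    try:
        with conn:
            conn.execute("INSERT INTO care_center VALUES ('CC2', ?)", (nurse_id,))
        return True
    except sqlite3.IntegrityError:
        return False

same_nurse = second_center_allowed("N1")   # N1 already runs CC1
other_nurse = second_center_allowed("N2")
print(same_nurse, other_nurse)
```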
[Figure: a new associative relation containing two foreign keys]
Binary M:N Example 2
[Figure: ER diagram with attributes IName, Gender, CourseCode, CName]
Identifier Assigned
It is natural and familiar to end-users
An associative entity
Example of mapping an associative entity
Three resulting relations
Many-to-Many
Two relations:
One for the entity type
One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity
Unary Relationships: 1:1
Person marries person
Possible implementations:
Foreign key in table:
Person [ID, Gender, DOB, Married_ID]
EMPLOYEE entity with unary relationship
EMPLOYEE relation with recursive foreign key
Unary Relationship 1:M
Employee supervises employees
Possible solutions:
Foreign key in table:
Employee [ID, Gender, DOB, Supervisor]
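The Employee [ID, Gender, DOB, Supervisor] relation can be sketched with a recursive foreign key and queried with a self-join. The sample rows are invented for illustration. The unary 1:1 case (Married_ID) would look the same, with a UNIQUE constraint on the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    id         TEXT NOT NULL PRIMARY KEY,
    gender     TEXT,
    dob        TEXT,
    supervisor TEXT REFERENCES employee(id)  -- recursive foreign key; NULL at the top
);
INSERT INTO employee VALUES ('E1', 'F', '1980-01-01', NULL);
INSERT INTO employee VALUES ('E2', 'M', '1990-05-05', 'E1');
INSERT INTO employee VALUES ('E3', 'F', '1992-07-07', 'E1');
""")

# Join the table to itself: one alias plays the supervisor, the other the staff.
reports = conn.execute("""
    SELECT boss.id, staff.id
    FROM employee AS staff
    JOIN employee AS boss ON staff.supervisor = boss.id
    ORDER BY staff.id
""").fetchall()
print(reports)
```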
Bill-of-materials relationships (M:N)
ITEM and COMPONENT relations
Unary Relationship M:N
Person related to person
Possible solutions
New Table With Two Foreign Keys
Person [ID, Gender, DOB]
Related-To [Left_ID, Right_ID, Relationship_Type]
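A minimal sketch of the Person and Related-To relations. The composite primary key on (left_id, right_id) and the sample rows are assumptions made for this illustration; the slides do not state the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (id TEXT NOT NULL PRIMARY KEY, gender TEXT, dob TEXT);
-- Both foreign keys point back at the same table.
CREATE TABLE related_to (
    left_id           TEXT NOT NULL REFERENCES person(id),
    right_id          TEXT NOT NULL REFERENCES person(id),
    relationship_type TEXT,
    PRIMARY KEY (left_id, right_id)
);
INSERT INTO person VALUES ('P1', 'F', '1990-01-01'),
                          ('P2', 'M', '1992-02-02'),
                          ('P3', 'F', '1994-03-03');
INSERT INTO related_to VALUES ('P1', 'P2', 'sibling'), ('P3', 'P1', 'cousin');
""")

# A person can appear on either side, so both directions are searched.
relatives_of_p1 = [row[0] for row in conn.execute("""
    SELECT right_id FROM related_to WHERE left_id = 'P1'
    UNION
    SELECT left_id FROM related_to WHERE right_id = 'P1'
    ORDER BY 1
""")]
print(relatives_of_p1)
```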
Remember that the primary key MUST be unique
This is why treatment date and time are included in the composite primary key
But this makes a very cumbersome key…
It would be better to create a surrogate key like Treatment#
Summary
Transforming ER to the relational data model
Binary relationships
Unary relationships
N-ary relationships
2.2 Normalization
Data Normalization
Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data
Definitions
Redundancy: Information appears more than once
Functional dependency: Given the value of attribute A (or of a set of attributes A), the value of attribute B is uniquely determined; written A -> B
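Whether a functional dependency holds in a given set of rows can be checked mechanically. The function name holds and the sample rows below are hypothetical. Note that a check like this can only show that a dependency is violated by the data at hand; whether it holds in general is a statement about the business rules.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True

rows = [
    {"emp_id": 100, "name": "Ann", "course": "SPSS"},
    {"emp_id": 100, "name": "Ann", "course": "Surveys"},
    {"emp_id": 140, "name": "Ben", "course": "Tax Acc"},
]

id_determines_name = holds(rows, ["emp_id"], ["name"])
id_determines_course = holds(rows, ["emp_id"], ["course"])
print(id_determines_name, id_determines_course)
```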
Redundancy causes
Insertion anomalies
Update anomalies
Query anomalies
Deletion anomalies (each entity should be as specific as possible; don't mix unrelated facts in one relation)
Headaches
Well-Structured Relations
A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies
o Insertion – can’t enter a new employee without having the employee take a class
o Deletion – if we remove employee 140, we lose information about the existence of a Tax Acc class
o Modification – giving a salary increase to employee 100 forces us to update multiple records
Candidate Key:
A unique identifier. One of the candidate keys will become the primary key
E.g. if a table holds both a credit card number and an SS#, both are candidate keys
Recommendation
Decompose ERD to reflect:
multivalued entity;
Associative entity
Primary key of new relation = key of previous relation + key of repeating group
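The key rule for a repeating group can be illustrated in plain Python. The orders, customers, and products below are invented sample data. Each order carries a repeating group of items, and flattening it yields rows whose key is the order key plus the item key.

```python
orders = [
    {"order_id": 1, "customer": "Ann",
     "items": [{"product_id": "P1", "quantity": 2},
               {"product_id": "P2", "quantity": 4}]},
    {"order_id": 2, "customer": "Ben",
     "items": [{"product_id": "P1", "quantity": 1}]},
]

# One flat row per (order, item): the repeating group is gone.
flat = [
    {"order_id": o["order_id"], "customer": o["customer"], **item}
    for o in orders
    for item in o["items"]
]

# Key of the new relation = key of the previous relation + key of the repeating group.
keys = [(row["order_id"], row["product_id"]) for row in flat]
print(keys)
```

The customer value is now repeated on every row of an order, which is the redundancy the later normal forms remove.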
The table is in 1st Normal Form: it is a relation, but not a well-structured one!
Anomalies in this Table, for given PK
o Insertion – if a new product is ordered for order 1007 of an existing customer, customer data must be re-entered, causing duplication
o Deletion – if we delete the Dining Table from Order 1006, we lose information concerning this item's finish and price
o Update – changing the price of product ID 4 requires an update in several records
No partial dependencies
Non-key fields that depend on only part of the key are moved to a new relation
Partial dependencies are removed, but there are still transitive dependencies
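Removing a partial dependency can be sketched in plain Python. The rows are invented sample data with the composite key (order_id, product_id); description and unit_price are assumed to depend on product_id alone.

```python
rows = [
    {"order_id": 1, "product_id": "P1", "description": "Desk",  "unit_price": 300, "quantity": 2},
    {"order_id": 1, "product_id": "P2", "description": "Chair", "unit_price": 80,  "quantity": 4},
    {"order_id": 2, "product_id": "P1", "description": "Desk",  "unit_price": 300, "quantity": 1},
]

# Attributes determined by product_id alone move to their own relation.
product = {}
for r in rows:
    product[r["product_id"]] = {"description": r["description"],
                                "unit_price": r["unit_price"]}

# What remains depends on the whole key (order_id, product_id).
order_line = [
    {"order_id": r["order_id"], "product_id": r["product_id"], "quantity": r["quantity"]}
    for r in rows
]

# A price change is now a single update instead of one per order line.
product["P1"]["unit_price"] = 320

# Joining the two relations back together shows the new price everywhere.
rebuilt = [{**line, **product[line["product_id"]]} for line in order_line]
print(rebuilt)
```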
Getting to Third Normal Form
2NF PLUS no transitive dependencies
i.e. functional dependencies on non-primary-key attributes
Rules
No non-key field should be dependent on a non-key field
Recommendation
A non-key determinant and the attributes that depend on it go into a new table;
The non-key determinant becomes the primary key of the new table and stays as a foreign key in the old table
Simply put
"Every non-key attribute must be dependent on the key, the whole key (no partial dependency), and nothing but the key (no transitive dependency); so help me Codd"
Removing Transitive Dependencies
Solution:
Stu_id, C_code -> Grade
C_code -> Stf_id
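The two dependencies in the solution suggest one relation per determinant. The sketch below assumes hypothetical table names (enrolment, course) and invented sample rows, and checks that joining the two relations reproduces the original rows.

```python
import sqlite3

# (stu_id, c_code, grade, stf_id): invented rows for illustration only.
original = [
    ("S1", "C1", "A", "T1"),
    ("S2", "C1", "B", "T1"),
    ("S1", "C2", "B", "T2"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- C_code -> Stf_id
CREATE TABLE course (c_code TEXT NOT NULL PRIMARY KEY, stf_id TEXT NOT NULL);
-- Stu_id, C_code -> Grade
CREATE TABLE enrolment (
    stu_id TEXT NOT NULL,
    c_code TEXT NOT NULL REFERENCES course(c_code),
    grade  TEXT,
    PRIMARY KEY (stu_id, c_code)
);
""")
# Each course is stored once, however many students take it.
conn.executemany("INSERT OR IGNORE INTO course VALUES (?, ?)",
                 [(c, t) for _, c, _, t in original])
conn.executemany("INSERT INTO enrolment VALUES (?, ?, ?)",
                 [(s, c, g) for s, c, g, _ in original])

rejoined = conn.execute("""
    SELECT e.stu_id, e.c_code, e.grade, c.stf_id
    FROM enrolment e JOIN course c ON c.c_code = e.c_code
    ORDER BY e.stu_id, e.c_code
""").fetchall()
course_rows = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
print(rejoined == sorted(original), course_rows)
```

No information is lost by the split, and the staff member for a course can be changed in one place.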
Normalization – in Practice
A well-decomposed ERD will give you relations in 3NF
Relations that are only in 1NF should not arise
2NF is usually skipped
Going back to ER
Some newly normalized tables should be entities
Revise ER diagram accordingly
Summary
To articulate why normalization is necessary (avoid anomalies, save storage space)