Normalization in SQL Server
Normalization Avoids
• Duplication of Data – The same data is stored in multiple rows of the database.
• Insert Anomaly – A record about an entity cannot be inserted into the table without first inserting
information about another entity. For example, a customer cannot be entered without a sales order.
• Delete Anomaly – A record cannot be deleted without deleting a record about a related entity. For example,
a sales order cannot be deleted without deleting all of the customer’s information.
• Update Anomaly – Information cannot be updated without changing it in many places. For example, to update
customer information, it must be updated for each sales order the customer has placed.
Functional Dependency (FD): It describes the relationship between attributes
(columns) in a relation. An FD X → Y holds when each value of X is associated with exactly one
value of Y; a value of X must never map to many values of Y.
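A minimal sketch of how a functional dependency can be checked against sample rows. The helper name `holds` and the sample data are hypothetical and only illustrate the idea that one determinant value must never map to several dependent values.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it;
        # a different value later means the key maps to many values.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data (not taken from the slides).
orders = [
    {"OrderID": 1, "Item": "Hammer", "CustomerID": 1},
    {"OrderID": 1, "Item": "Saw", "CustomerID": 1},
    {"OrderID": 2, "Item": "Nails", "CustomerID": 1},
]

fd_order_customer = holds(orders, ["OrderID"], ["CustomerID"])  # each order has one customer
fd_order_item = holds(orders, ["OrderID"], ["Item"])            # order 1 has two items
```

A check like this can only refute a dependency on the data at hand; whether the dependency is meant to hold in general is a design decision.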
Example:
What are anomalies in DBMS and their types?
Tables that hold redundant data suffer from problems known as anomalies, so data redundancy is a
cause of anomalies.
Insert Anomaly: A record cannot be inserted unless data about a related record is stored with it.
Delete Anomaly: Deleting some information also loses valuable related information at the
same time.
Update Anomaly: Any change to the data requires scanning all records and making the same
change multiple times.
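The update and delete anomalies can be reproduced with a small denormalized table. This is only a sketch: it uses Python's built-in sqlite3 module rather than SQL Server, and the table `SalesOrders` with its columns and rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical denormalized table: customer data is repeated on every order row.
conn.execute(
    "CREATE TABLE SalesOrders ("
    "OrderID INTEGER, CustomerName TEXT, CustomerCity TEXT, Item TEXT)"
)
conn.executemany(
    "INSERT INTO SalesOrders VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Oslo", "Hammer"),
        (2, "Ann", "Oslo", "Saw"),
        (3, "Bob", "Rome", "Nails"),
    ],
)

# Update anomaly: changing Ann's city on one row only leaves two different cities.
conn.execute("UPDATE SalesOrders SET CustomerCity = 'Bergen' WHERE OrderID = 1")
ann_cities = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT CustomerCity FROM SalesOrders WHERE CustomerName = 'Ann'"
    )
)

# Delete anomaly: removing Bob's only order also removes everything known about Bob.
conn.execute("DELETE FROM SalesOrders WHERE OrderID = 3")
bob_rows = conn.execute(
    "SELECT COUNT(*) FROM SalesOrders WHERE CustomerName = 'Bob'"
).fetchone()[0]
```

After the update, `ann_cities` holds two conflicting values, and after the delete, `bob_rows` is zero, which is exactly the kind of loss that splitting customers into their own table avoids.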
First Normal Form
Definition: All attributes are atomic and dependent on the primary key. Atomicity
means that each attribute holds only one value per tuple, and that all the key attributes
(columns) are defined.
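As a rough illustration of atomicity, the sketch below takes rows in which one field holds several values and rewrites them so that each row holds a single value. The data and the function name `to_first_normal_form` are hypothetical.

```python
# Hypothetical rows that violate 1NF: "Items" packs several values into one field.
unnormalized = [
    {"OrderID": 1, "Items": "Hammer, Saw, Nails"},
    {"OrderID": 2, "Items": "Saw"},
]


def to_first_normal_form(rows):
    """Emit one row per (OrderID, Item) so that every attribute value is atomic."""
    atomic = []
    for row in rows:
        for item in row["Items"].split(","):
            atomic.append({"OrderID": row["OrderID"], "Item": item.strip()})
    return atomic


normalized = to_first_normal_form(unnormalized)
```

In the result, the pair (OrderID, Item) can serve as the key, since each combination appears once.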
Example:
Before 1 NF:
Second Normal Form
Definition: The relation is in 1NF and every non-key attribute is fully functionally dependent
on the primary key.
Explanation: There is no partial dependency, i.e. no non-key attribute depends on only part of a composite key.
To remove one, create a separate table with the functionally dependent data and the
part of the key on which it depends.
Example:
Before 2 NF:
Orders
OrderID  Item    CustomerID  OrderDate
1        Hammer  1           11/30/1998
1        Saw     1           11/30/1998
1        Nails   1           11/30/1998
After 2NF:
Orders
OrderID  CustomerID  OrderDate
1        1           11/30/1998
OrderDetails
OrderID  Item
1        Hammer
1        Saw
1        Nails
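The decomposition can be sketched in SQL, run here through Python's sqlite3 module instead of SQL Server. The staging table name `OrdersFlat` is hypothetical; it holds the "Before 2NF" rows with the composite key (OrderID, Item), and CustomerID and OrderDate depend on OrderID alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Before 2NF" data; OrdersFlat is a hypothetical name for the unsplit table.
conn.execute(
    "CREATE TABLE OrdersFlat ("
    "OrderID INTEGER, Item TEXT, CustomerID INTEGER, OrderDate TEXT, "
    "PRIMARY KEY (OrderID, Item))"
)
conn.executemany(
    "INSERT INTO OrdersFlat VALUES (?, ?, ?, ?)",
    [
        (1, "Hammer", 1, "11/30/1998"),
        (1, "Saw", 1, "11/30/1998"),
        (1, "Nails", 1, "11/30/1998"),
    ],
)

# Attributes that depend only on OrderID move to their own table.
conn.execute(
    "CREATE TABLE Orders ("
    "OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, OrderDate TEXT)"
)
conn.execute(
    "INSERT INTO Orders SELECT DISTINCT OrderID, CustomerID, OrderDate FROM OrdersFlat"
)

# What remains depends on the whole key (OrderID, Item).
conn.execute(
    "CREATE TABLE OrderDetails ("
    "OrderID INTEGER REFERENCES Orders(OrderID), Item TEXT, "
    "PRIMARY KEY (OrderID, Item))"
)
conn.execute("INSERT INTO OrderDetails SELECT OrderID, Item FROM OrdersFlat")

orders = conn.execute("SELECT * FROM Orders").fetchall()
details = conn.execute("SELECT * FROM OrderDetails ORDER BY Item").fetchall()
# Joining the two tables gives back the original three rows.
rejoined = conn.execute(
    "SELECT COUNT(*) FROM Orders JOIN OrderDetails USING (OrderID)"
).fetchone()[0]
```

The order date and customer are now stored once per order, so changing them touches a single row.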
Third Normal Form
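Definition: The relation is in 2NF and no transitive dependencies exist.
Explanation: The relation is in 2NF and no non-key column depends on another non-key column (for example, a Total column that can be computed from Quantity and Price).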
Example:
Before 3 NF:
OrderDetails
Item    Quantity  Price  Total
Hammer  2         500    1000
Saw     5         2000   10000
Nails   8         50     400
After 3 NF:
OrderDetails
OrderID  Item    Quantity  Price
1        Hammer  2         500
2        Saw     5         2000
3        Nails   8         50
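Once the total is no longer stored, it can be computed on demand. The sketch below uses Python's sqlite3 module rather than SQL Server, and the view name `OrderTotals` is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderDetails ("
    "OrderID INTEGER PRIMARY KEY, Item TEXT, Quantity INTEGER, Price INTEGER)"
)
conn.executemany(
    "INSERT INTO OrderDetails VALUES (?, ?, ?, ?)",
    [(1, "Hammer", 2, 500), (2, "Saw", 5, 2000), (3, "Nails", 8, 50)],
)

# Hypothetical view: the total is derived from Quantity and Price, never stored.
conn.execute(
    "CREATE VIEW OrderTotals AS "
    "SELECT OrderID, Item, Quantity * Price AS Total FROM OrderDetails"
)
totals = conn.execute(
    "SELECT Item, Total FROM OrderTotals ORDER BY OrderID"
).fetchall()

# Changing a quantity cannot leave a stale total behind.
conn.execute("UPDATE OrderDetails SET Quantity = 3 WHERE Item = 'Hammer'")
new_hammer_total = conn.execute(
    "SELECT Total FROM OrderTotals WHERE Item = 'Hammer'"
).fetchone()[0]
```

Because the view recomputes the product on every query, the total always agrees with the current quantity and price.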
Example 2:
Boyce-Codd Normal Form
Definition: A relation is in BCNF if and only if all the determinants are candidate keys.
BCNF was developed in 1974 by Raymond F. Boyce and Edgar F. Codd
to address certain types of anomaly not dealt with by 3NF as originally defined.
Explanation: Boyce-Codd Normal Form (BCNF) is a database design approach
used to normalize data beyond the third normal form (3NF). In BCNF,
every determinant must be a candidate key; if this is not the case, the
relation is not in BCNF.
For example, consider a table that holds data about employees with the attributes
employeeID, firstName, lastName, title. The employeeID attribute determines firstName
and lastName; it is a candidate key, so it must hold unique data and cannot be NULL.
If the tuple (firstName, lastName) also determines employeeID, then
(firstName, lastName) is a determinant as well. As previously explained,
every determinant must be a candidate key, and a relation's candidate keys
must have a unique set of values for each row the relation holds.
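The BCNF condition can be tested mechanically with attribute closure. The following is a sketch under stated assumptions: the function names `closure` and `is_bcnf` are hypothetical, and the dependency sets are chosen only to illustrate the employee table described above.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(attributes, fds):
    """True if the determinant of every non-trivial FD is a superkey."""
    all_attrs = set(attributes)
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if not all_attrs <= closure(lhs, fds):
            return False
    return True


employee = ["employeeID", "firstName", "lastName", "title"]

# Both determinants identify a whole row, so both are candidate keys.
fds_both_keys = [
    (["employeeID"], ["firstName", "lastName", "title"]),
    (["firstName", "lastName"], ["employeeID"]),
]

# Assumed variant: the name pair determines title but not employeeID,
# so it is a determinant that is not a candidate key.
fds_name_not_key = [
    (["employeeID"], ["firstName", "lastName", "title"]),
    (["firstName", "lastName"], ["title"]),
]

both_keys_ok = is_bcnf(employee, fds_both_keys)
name_not_key_ok = is_bcnf(employee, fds_name_not_key)
```

The first dependency set satisfies BCNF; the second does not, because (firstName, lastName) determines title without identifying a whole row.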
Example:
Example 2:
Example 3:
Rate Type  Court  Member Flag

Court  Start Time  End Time  Member Flag
2      11:30       13:30     No
Example:
Before 4NF
After 4NF:
SUMMARY:
Other Details
Relational data model
relation
A table in a relational database is called a relation in the mathematical language of
relational algebra. Relations are unordered.
attribute
A column of a database table is called an attribute. Columns (attributes) have names.
domain
The set of permissible values for an attribute (or column) is called its domain.
tuple
A row in a database table is called a tuple in the mathematical language of relational
algebra. The order of tuples in a relation has no significance.
database
A database is a collection of multiple relations.
schema
A database design is called a schema. Alternatively, a schema can refer to a namespace within
a database.
cardinality of a relation
The number of tuples (rows) in a relation is called the cardinality of the relation. The number
of attributes (columns) is called its degree.