DB 4
Logical database design: the process of transforming the conceptual data model (i.e. ERDs) into a logical database model (i.e. relational)
A logical database model is a design that conforms to the data model for a class of DBMS
Steps:
Represent entities: each entity type in an ERD is represented as a relation
Represent relationships: each relationship in the ERD must be represented in the relational model
Normalize relations
Merge relations
Data is stored in relations (entities)
A relation consists of tuples/rows (instances) and attributes
Goal: to store data without unnecessary redundancy and to be able to process information easily
Data structure
Data manipulation
Data integrity
Keys
Key: minimal set of attributes that uniquely identifies each row in a relation
Composite key: a key consisting of more than one attribute
Keys
Candidate key: any set of attributes that could be chosen as a key of a relation; should be unique and non-redundant
Primary key: the candidate key designated for principal use in uniquely identifying rows in a relation
Keys
Foreign key: a set of attributes in one relation that constitutes a key in some other (possibly the same) relation; used to indicate logical links between relations
Foreign Key
EMP
EMPNO  ENAME   DEPTNO
-----  ------  ------
7839   KING    10
7698   BLAKE   30
7782   CLARK   10
7566   JONES   20
7654   MARTIN  30
7499   ALLEN   30
7844   TURNER  30
7900   JAMES   30
7521   WARD    30
7902   FORD    20
7369   SMITH   20
...
DEPT
DEPTNO  DNAME       LOC
------  ----------  --------
10      ACCOUNTING  NEW YORK
20      RESEARCH    DALLAS
30      SALES       CHICAGO
...
DEPTNO in EMP is a foreign key; DEPTNO in DEPT is the primary key it refers to
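The EMP/DEPT link can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not part of the original slides: the lowercase table and column names, the subset of rows, and the rejected row (employee 9999 in a nonexistent department 40) are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

conn.executescript("""
CREATE TABLE dept (
    deptno INTEGER PRIMARY KEY,               -- primary key of DEPT
    dname  TEXT NOT NULL,
    loc    TEXT
);
CREATE TABLE emp (
    empno  INTEGER PRIMARY KEY,
    ename  TEXT NOT NULL,
    deptno INTEGER REFERENCES dept(deptno)    -- foreign key into DEPT
);
""")

conn.executemany("INSERT INTO dept VALUES (?, ?, ?)", [
    (10, "ACCOUNTING", "NEW YORK"),
    (20, "RESEARCH", "DALLAS"),
    (30, "SALES", "CHICAGO"),
])
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", [
    (7839, "KING", 10),
    (7698, "BLAKE", 30),
    (7566, "JONES", 20),
])


def try_insert_emp(empno, ename, deptno):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", (empno, ename, deptno))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_emp(7782, "CLARK", 10)    # DEPT 10 exists
rejected = try_insert_emp(9999, "NOBODY", 40)   # hypothetical row: DEPT 40 does not exist
emp_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```

The second insert fails because its DEPTNO value has no matching primary key in DEPT, which is the logical link a foreign key is meant to protect.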
Relations
A named, two-dimensional table of data
Consists of a set of named columns and an arbitrary number of unnamed rows
Can be expressed as: RELATION (attribute1, attribute2, ...)
Example: EMP (EMPNO, ENAME, DEPTNO)
Properties of Relations
Entries in columns are atomic (single-valued)
Entries in columns are from the same domain
Each row is unique (no duplicate rows)
The sequence of columns is insignificant
The sequence of rows is insignificant
Anomalies
Errors or inconsistencies that may result when manipulating data in a table that contains redundancies
Types of anomalies: insertion, deletion, modification
Anomalies: An Example
EMPLOYEE COURSE
EMPID  NAME            DEPT            SALARY  COURSE      DATE COMPLETED
-----  --------------  --------------  ------  ----------  --------------
100    Dana Scully     Marketing       42,000  Planning    5/6/99
100    Dana Scully     Marketing       42,000  Management  5/27/95
140    Fox Mulder      Info Systems    39,000  C++         12/28/93
110    Walter Skinner  Administration  41,500  Management  5/27/95
110    Walter Skinner  Administration  41,500  Budgeting   6/6/86
190    Alex Krycek     Finance         38,000  Tax Acct.   10/1/93
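The redundancy in EMPLOYEE COURSE can be made concrete with a short Python sketch. The rows are the sample data; the field names, the raise to 45,000, and the choice of which row to delete are assumptions made for illustration.

```python
# EMPLOYEE COURSE sample rows (date column omitted for brevity).
rows = [
    {"emp_id": 100, "name": "Dana Scully", "dept": "Marketing", "salary": 42000, "course": "Planning"},
    {"emp_id": 100, "name": "Dana Scully", "dept": "Marketing", "salary": 42000, "course": "Management"},
    {"emp_id": 140, "name": "Fox Mulder", "dept": "Info Systems", "salary": 39000, "course": "C++"},
    {"emp_id": 110, "name": "Walter Skinner", "dept": "Administration", "salary": 41500, "course": "Management"},
    {"emp_id": 110, "name": "Walter Skinner", "dept": "Administration", "salary": 41500, "course": "Budgeting"},
    {"emp_id": 190, "name": "Alex Krycek", "dept": "Finance", "salary": 38000, "course": "Tax Acct."},
]


def salaries_by_employee(table):
    """Map each employee id to the set of salary values stored for it."""
    result = {}
    for row in table:
        result.setdefault(row["emp_id"], set()).add(row["salary"])
    return result


# Modification anomaly: a raise applied to only one of the employee's rows.
updated = [dict(row) for row in rows]
updated[0]["salary"] = 45000  # hypothetical raise, second row for employee 100 is missed
inconsistent = {emp for emp, values in salaries_by_employee(updated).items() if len(values) > 1}

# Deletion anomaly: removing the only course row for employee 140
# also removes every fact stored about that employee.
after_delete = [r for r in rows if not (r["emp_id"] == 140 and r["course"] == "C++")]
remaining_ids = {r["emp_id"] for r in after_delete}
```

After the partial update, employee 100 has two different salaries on record; after the delete, employee 140 no longer appears anywhere in the table.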
Well-Structured Relations
Contains a minimum amount of redundancy and allows users to manipulate data without errors
Normalization is used to achieve well-structured relations
Normalization
Process of converting a relation to a standard form
Used to derive well-structured relations that are free of anomalies when manipulated
Often accomplished in stages, or normal forms
Normal Form
State of a relation that can be determined by applying dependency rules to that relation
Normal Forms:
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
Functional Dependency
The value of one attribute in a relation determines a unique value of one or more other attributes in the relation
The left-side attribute of a dependency (e.g. Stud_ID) is called the determinant; it determines unique values of the other attributes in the relation
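One way to test whether a functional dependency holds in a given set of rows is to check that equal determinant values never map to different dependent values. The sketch below assumes illustrative attribute names (Stud_ID, Name, Course_ID, Grade) and a few rows modelled on the grade report data in these slides; the function name `holds` is hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this determinant;
        # any later mismatch means the dependency is violated.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


grade_rows = [
    {"Stud_ID": 143, "Name": "Mulder", "Course_ID": "CS 122", "Grade": "B+"},
    {"Stud_ID": 143, "Name": "Mulder", "Course_ID": "CS 161", "Grade": "A"},
    {"Stud_ID": 434, "Name": "Scully", "Course_ID": "Psy 101", "Grade": "A"},
    {"Stud_ID": 434, "Name": "Scully", "Course_ID": "Th 141", "Grade": "A"},
]

id_determines_name = holds(grade_rows, ["Stud_ID"], ["Name"])
id_determines_grade = holds(grade_rows, ["Stud_ID"], ["Grade"])
id_course_determines_grade = holds(grade_rows, ["Stud_ID", "Course_ID"], ["Grade"])
```

A check like this can only refute a dependency from sample data; whether a dependency really holds is a statement about the business rules, not about one set of rows.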
Partial Dependency
One or more non-key attributes are functionally dependent on only part of the primary key
Example: EMPLOYEE COURSE (Emp_ID, Name, Dept, Salary, Course, Date_Completed)
Functional dependencies:
Emp_ID -> Name, Dept, Salary
Emp_ID, Course -> Date_Completed
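Given a primary key and a list of functional dependencies, partial dependencies can be picked out mechanically. The sketch below assumes the composite key (Emp_ID, Course) and the dependencies Emp_ID -> Name, Dept, Salary and Emp_ID, Course -> Date_Completed, both read off the EMPLOYEE COURSE sample data; the function name is hypothetical.

```python
def partial_dependencies(primary_key, fds):
    """Return the dependencies whose determinant is a proper subset of the key
    and whose right side contains at least one non-key attribute."""
    key = frozenset(primary_key)
    found = []
    for lhs, rhs in fds:
        determinant = frozenset(lhs)
        non_key = set(rhs) - key
        if determinant < key and non_key:  # "<" is the proper-subset test
            found.append((sorted(determinant), sorted(non_key)))
    return found


key = ("Emp_ID", "Course")
fds = [
    (("Emp_ID",), ("Name", "Dept", "Salary")),
    (("Emp_ID", "Course"), ("Date_Completed",)),
]
partial = partial_dependencies(key, fds)
```

Here Name, Dept, and Salary depend on Emp_ID alone, so they are flagged as partially dependent on the key, while Date_Completed depends on the whole key and is not.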
Transitive Dependency
A non-key attribute is functionally dependent on one or more other non-key attributes
Steps in Normalization
Grade Report (unnormalized)
STUDENT ID  STUDENT NAME  CAMPUS ADDRESS  MAJOR  COURSE ID  COURSE TITLE  INSTRUCTOR NAME  INSTRUCTOR LOCATION  GRADE
----------  ------------  --------------  -----  ---------  ------------  ---------------  -------------------  -----
143         Mulder        101 Cervini     MIS    CS 122     DB Sys.       Codd             F 227                B+
                                                 CS 161     O/S           Tannenbaum       F 104                A
434         Scully        304 Eliazo      Psy    Psy 101    Basic Psy     Freud            Bel 204              A
                                                 Th 141     Marriage      Pope John Paul   B 102                A
                                                 En 12      Basic Eng.    Shakespeare      B 202                B+
Steps in Normalization
Grade Report (repeating groups removed)
STUDENT ID  STUDENT NAME  CAMPUS ADDRESS  MAJOR  COURSE ID  COURSE TITLE  INSTRUCTOR NAME  INSTRUCTOR LOCATION  GRADE
----------  ------------  --------------  -----  ---------  ------------  ---------------  -------------------  -----
143         Mulder        101 Cervini     MIS    CS 122     DB Sys.       Codd             F 227                B+
143         Mulder        101 Cervini     MIS    CS 161     O/S           Tannenbaum       F 104                A
434         Scully        304 Eliazo      Psy    Psy 101    Basic Psy     Freud            Bel 204              A
434         Scully        304 Eliazo      Psy    Th 141     Marriage      Pope John Paul   B 102                A
434         Scully        304 Eliazo      Psy    CS 161     O/S           Tannenbaum       F 104                B+
Steps in Normalization
Partial dependencies removed
Assumptions:
Student cannot have multiple majors
Student cannot repeat a subject
Only one teacher is available per course
STUDENT (STUDENT ID, STUDENT NAME, CAMPUS ADDRESS, MAJOR)
COURSE INSTRUCTOR (COURSE ID, COURSE TITLE, INSTRUCTOR NAME, INSTRUCTOR LOCATION)
REGISTRATION (STUDENT ID, COURSE ID, GRADE)
Steps in Normalization
Transitive dependencies removed
STUDENT (STUDENT ID, STUDENT NAME, CAMPUS ADDRESS, MAJOR)
COURSE INSTRUCTOR (COURSE ID, COURSE TITLE, INSTRUCTOR NAME)
INSTRUCTOR (INSTRUCTOR NAME, INSTRUCTOR LOCATION)
REGISTRATION (STUDENT ID, COURSE ID, GRADE)
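The four relations above can be obtained from the flat grade report by projecting onto each relation's attributes and dropping duplicate rows. The following is a rough sketch using the five flat grade report rows from the slides; the `project` helper and the variable names are assumptions made for illustration.

```python
COLUMNS = (
    "STUDENT ID", "STUDENT NAME", "CAMPUS ADDRESS", "MAJOR", "COURSE ID",
    "COURSE TITLE", "INSTRUCTOR NAME", "INSTRUCTOR LOCATION", "GRADE",
)

GRADE_REPORT = [
    (143, "Mulder", "101 Cervini", "MIS", "CS 122", "DB Sys.", "Codd", "F 227", "B+"),
    (143, "Mulder", "101 Cervini", "MIS", "CS 161", "O/S", "Tannenbaum", "F 104", "A"),
    (434, "Scully", "304 Eliazo", "Psy", "Psy 101", "Basic Psy", "Freud", "Bel 204", "A"),
    (434, "Scully", "304 Eliazo", "Psy", "Th 141", "Marriage", "Pope John Paul", "B 102", "A"),
    (434, "Scully", "304 Eliazo", "Psy", "CS 161", "O/S", "Tannenbaum", "F 104", "B+"),
]


def project(rows, columns, wanted):
    """Relational projection: keep only the wanted columns, without duplicates."""
    positions = [columns.index(name) for name in wanted]
    return sorted({tuple(row[i] for i in positions) for row in rows})


student = project(GRADE_REPORT, COLUMNS,
                  ["STUDENT ID", "STUDENT NAME", "CAMPUS ADDRESS", "MAJOR"])
course_instructor = project(GRADE_REPORT, COLUMNS,
                            ["COURSE ID", "COURSE TITLE", "INSTRUCTOR NAME"])
instructor = project(GRADE_REPORT, COLUMNS,
                     ["INSTRUCTOR NAME", "INSTRUCTOR LOCATION"])
registration = project(GRADE_REPORT, COLUMNS,
                       ["STUDENT ID", "COURSE ID", "GRADE"])
```

Each student, course, and instructor now appears once in its own relation, and REGISTRATION keeps one row per student and course.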
Steps in Normalization
Remaining anomalies from functional dependencies are removed
A relation is in BCNF if and only if every determinant is a candidate key
Example: STUDENT ADVISOR (Student ID, Major, Advisor)
For each major a student has only one advisor
Each advisor advises only one major
Each advisor advises several students in one major
Each major has several advisors
Each student may major in several subjects
Student ID  Major    Advisor
----------  -------  --------
123         Physics  Einstein
123         Music    Mozart
456         Biology  Darwin
789         Physics  Bohr
143         Physics  Einstein
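The BCNF condition can be checked with an attribute-closure computation. The sketch below assumes two dependencies read from the rules above: (Student_ID, Major) -> Advisor, from "for each major a student has only one advisor", and Advisor -> Major, from "each advisor advises only one major". It tests whether each determinant is a superkey, which is the usual way to operationalize "every determinant is a candidate key"; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial dependencies whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and closure(lhs, fds) != set(all_attrs):
            violations.append((lhs, rhs))
    return violations


attrs = {"Student_ID", "Major", "Advisor"}
fds = [
    (("Student_ID", "Major"), ("Advisor",)),
    (("Advisor",), ("Major",)),
]
violations = bcnf_violations(attrs, fds)
```

Advisor determines Major but does not determine Student_ID, so Advisor is a determinant that is not a key and STUDENT ADVISOR is not in BCNF under these assumptions.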
Steps in Normalization
Any remaining anomalies (join dependencies) have been removed
Join dependency: data in relations that have been broken down cannot be recombined to form the original
Steps in Normalization
Domain-Key Normal Form (DK/NF)
Proposed by Fagin in 1981
Fagin showed that any relation in DK/NF is automatically in 5NF, 4NF, etc.
Does not provide a methodology for converting a relation to DK/NF
Represent entities
[Figure: EMPLOYEE (Employee_ID) with attribute Skill_Name, converted to a many-to-many relationship: EMPLOYEE has SKILL (Skill_ID, Skill_Name)]
Represent entities
[Figure: STUDENT (Student_ID) with attribute Name made up of First, MI, Last, shown before and after conversion]
Represent entities
[Figure: EMPLOYEE (Employee_ID) has DEPENDENT (Dep_Name, Birthdate), shown before and after conversion]
Represent relationships
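Depends on: the degree of the relationship (unary, binary) and its cardinality (one-to-one, one-to-many, many-to-many)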
Transforming Relationships
Primary key attributes of the entity on the one side of the relationship become a foreign key in the relation on the many side
[Figure: DEPT (DeptNo, DName, Loc) has EMP (EmpNo, EName); after conversion EMP also carries DeptNo]
Transforming Relationships
Binary one-to-many relationship
EMP
EMPNO  ENAME   DEPTNO
-----  ------  ------
7839   KING    10
7698   BLAKE   30
7782   CLARK   10
7566   JONES   20
7654   MARTIN  30
7499   ALLEN   30
7844   TURNER  30
7900   JAMES   30
7521   WARD    30
7902   FORD    20
7369   SMITH   20
...
DEPT
DEPTNO  DNAME       LOC
------  ----------  --------
10      ACCOUNTING  NEW YORK
20      RESEARCH    DALLAS
30      SALES       CHICAGO
...
DEPTNO in EMP is the foreign key; DEPTNO in DEPT is the primary key
Transforming Relationships
Binary one-to-one relationship
Similar situation to the one-to-many relationship
Create the foreign key on either side of the relationship
[Figure: EMPLOYEE (Employee_ID, Name, Address) assigned COMPUTER (Terminal_ID); after conversion COMPUTER also carries Employee_ID]
Transforming Relationships
Primary key of relation A = foreign key of relation B
Primary key of relation B = foreign key of relation A
Both situations apply
[Figure: STUDENT (Student_ID, Name, Address) has PICTURE (JPG_Image, Student_ID)]
Transforming Relationships
Binary many-to-many relationship
Create a separate relation
Its primary key is a composite key consisting of the primary key of each of the two entities
Occasionally the primary key must include more than just the primary keys of the two relations
Transforming Relationships
[Figure: EMPLOYEE (Employee_ID) assigned to PROJECT (Project_ID, Project_Name), with relationship attributes Role and Date Assigned; after conversion EMPLOYEE is given a separate relation (Employee_ID, Project_ID, Role, Date Assigned) that refers to PROJECT]
Transforming Relationships
PROJECT_ASSIGNMENT
Employee_ID  Project_ID  Date Assigned  Role
-----------  ----------  -------------  ---------------
21295        100         27/05/2003     Lead Analyst
50666        100         27/05/2003     Jr. Programmer
40780        101         28/12/2003     Project Manager
50666        100         05/01/2004     Sr. Programmer
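In the PROJECT_ASSIGNMENT data, employee 50666 is assigned to project 100 twice on different dates, so a two-part key is not enough. The sqlite3 sketch below compares both key choices. The lowercase column names, the ISO date format, and the `load` helper are assumptions, and the foreign keys to EMPLOYEE and PROJECT are left out for brevity.

```python
import sqlite3

# Rows from the PROJECT_ASSIGNMENT table, dates rewritten as YYYY-MM-DD.
ROWS = [
    (21295, 100, "2003-05-27", "Lead Analyst"),
    (50666, 100, "2003-05-27", "Jr. Programmer"),
    (40780, 101, "2003-12-28", "Project Manager"),
    (50666, 100, "2004-01-05", "Sr. Programmer"),
]


def load(primary_key):
    """Create PROJECT_ASSIGNMENT with the given key and count the rows it accepts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"""
        CREATE TABLE project_assignment (
            employee_id   INTEGER NOT NULL,
            project_id    INTEGER NOT NULL,
            date_assigned TEXT    NOT NULL,
            role          TEXT,
            PRIMARY KEY ({primary_key})
        )""")
    accepted = 0
    for row in ROWS:
        try:
            conn.execute("INSERT INTO project_assignment VALUES (?, ?, ?, ?)", row)
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # the row repeats an existing key value
    conn.close()
    return accepted


two_part = load("employee_id, project_id")
three_part = load("employee_id, project_id, date_assigned")
```

With only the two entity keys, the second assignment of employee 50666 to project 100 is rejected; adding the assignment date to the composite key lets all four rows be stored.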
Transforming Relationships
Unary relationships
Unary one-to-many
A recursive foreign key is added to reference the primary key values of the same relation
[Figure: EMPLOYEE (Employee_ID, Name) manages EMPLOYEE; after conversion EMPLOYEE also carries Manager_ID]
Transforming Relationships
Unary relationships
Unary one-to-many
EMPLOYEE
EMPLOYEE_ID  NAME    MANAGER_ID
-----------  ------  ----------
7839         KING
7698         BLAKE   7839
7782         CLARK   7839
7566         JONES   7839
7654         MARTIN  7698
7499         ALLEN   7698
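A recursive foreign key can be queried with a self-join. In this sqlite3 sketch the lowercase names are assumptions, as is the reading of the data in which KING is the one employee without a manager.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        manager_id  INTEGER REFERENCES employee(employee_id)  -- recursive foreign key
    )""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    (7839, "KING", None),      # assumed: top of the hierarchy, no manager
    (7698, "BLAKE", 7839),
    (7782, "CLARK", 7839),
    (7566, "JONES", 7839),
    (7654, "MARTIN", 7698),
    (7499, "ALLEN", 7698),
])

# Join the relation to itself: e is the employee, m is that employee's manager.
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employee AS e
    LEFT JOIN employee AS m ON m.employee_id = e.manager_id
    ORDER BY e.employee_id
""").fetchall()
manager_of = dict(pairs)
```

The LEFT JOIN keeps employees whose manager_id is empty, so the top of the hierarchy still appears in the result with no manager name.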
Transforming Relationships
Unary relationships
Unary many-to-many
Create a separate relation to represent the many-to-many relationship
Primary key = composite key of the two attributes from the same primary key domain
[Figure: ITEM (Item_No., Name, Unit_Cost) consists of ITEM, with relationship attribute Quantity; after conversion a separate relation (Item_No., Comp_No., Quantity) refers to ITEM]
Transforming Relationships
Unary relationships
Unary many-to-many
ITEM
Item_No.  Name          Unit_Cost
--------  ------------  ---------
500       Hard Drive    3,000
          Pentium 4 PC  27,000
          Keyboard      400
          Screw         0.50
COMPONENT
Item_No.  Comp_No.  Quantity
--------  --------  --------
006       500       2
006       101       1
006       999       180
500       999       30
101       999       20
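Because COMPONENT relates items to other items, quantities can be rolled up through several levels. The sketch below uses only the five COMPONENT rows; the function name is hypothetical, and the recursion assumes the structure has no cycles.

```python
# COMPONENT rows as (Item_No., Comp_No., Quantity).
COMPONENT = [
    ("006", "500", 2),
    ("006", "101", 1),
    ("006", "999", 180),
    ("500", "999", 30),
    ("101", "999", 20),
]


def total_quantity(item_no, comp_no, component=COMPONENT):
    """Total units of comp_no needed for one unit of item_no, directly or
    through intermediate components (assumes no cycles in the data)."""
    total = 0
    for parent, child, quantity in component:
        if parent != item_no:
            continue
        if child == comp_no:
            total += quantity  # direct use
        # indirect use through this child, scaled by how many children are needed
        total += quantity * total_quantity(child, comp_no, component)
    return total


direct_and_indirect = total_quantity("006", "999")
via_500_only = total_quantity("500", "999")
```

For item 006 this adds the 180 units used directly to the units needed through items 500 and 101 (2 x 30 and 1 x 20).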
Transforming Relationships
Subtypes
Create a separate relation for the supertype and for each subtype
Supertype relation consists of attributes common to all of the subtypes
Relation for each subtype contains the primary key and the attributes unique to that subtype
Primary keys of the supertype and subtypes are from the same domain
Transforming Relationships
Subtypes example
[Figure: supertype EMPLOYEE (Emp_ID, Name, Address, Emp_Type) with subtypes HOURLY (Emp_ID, Hourly_Rate), SALARIED (Emp_ID, Monthly_Sal), and CONSULTANT (Emp_ID, Billing_Rate); an employee may be any one of the subtypes, selected by Emp_Type]
Transforming Relationships
Subtypes example
EMPLOYEE
Emp_ID  Name             Address     Emp_Type
------  ---------------  ----------  --------
40780   Summers, Buffy   Sunnydale   S
21295   Grissom, Gil     Las Vegas   S
50666   Kent, Clark      Smallville  H
56466   Bristow, Sidney  Washington  C
97872   Bauer, Jack      Washington  H
15249   Mulder, Fox      Washington  S
SALARIED
Monthly_Sal
-----------
12,000
30,000
20,000
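A supertype relation with one relation per subtype might be declared as follows with sqlite3. This is a sketch under assumptions: the lowercase names, the CHECK on emp_type, the pairing of Emp_ID values with names and salaries, the hourly rate, and the orphan row 99999 are all illustrative and not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (                        -- supertype: common attributes
    emp_id   INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    address  TEXT,
    emp_type TEXT NOT NULL CHECK (emp_type IN ('H', 'S', 'C'))
);
CREATE TABLE hourly (                          -- subtype: same key domain
    emp_id      INTEGER PRIMARY KEY REFERENCES employee(emp_id),
    hourly_rate REAL
);
CREATE TABLE salaried (
    emp_id      INTEGER PRIMARY KEY REFERENCES employee(emp_id),
    monthly_sal REAL
);
CREATE TABLE consultant (
    emp_id       INTEGER PRIMARY KEY REFERENCES employee(emp_id),
    billing_rate REAL
);
""")

# Illustrative rows; the id/name/salary pairing is an assumption.
conn.execute("INSERT INTO employee VALUES (40780, 'Summers, Buffy', 'Sunnydale', 'S')")
conn.execute("INSERT INTO employee VALUES (50666, 'Kent, Clark', 'Smallville', 'H')")
conn.execute("INSERT INTO salaried VALUES (40780, 12000)")
conn.execute("INSERT INTO hourly VALUES (50666, 150.0)")  # hypothetical rate

# Common and subtype-specific attributes are recombined with a join on the key.
salaried_rows = conn.execute("""
    SELECT e.name, s.monthly_sal
    FROM employee AS e JOIN salaried AS s ON s.emp_id = e.emp_id
""").fetchall()


def insert_hourly(emp_id, rate):
    """Return True if the subtype row is accepted, False if it has no supertype row."""
    try:
        conn.execute("INSERT INTO hourly VALUES (?, ?)", (emp_id, rate))
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = insert_hourly(99999, 10.0)  # hypothetical id with no EMPLOYEE row
```

Each subtype's primary key is also a foreign key to the supertype, so a subtype row cannot exist without its EMPLOYEE row.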
Merge relations that refer to the same entity to remove redundancy
View integration problems: synonyms, homonyms, transitive dependencies, subtypes
Synonyms
Two or more attributes may have different names but the same meaning
Choose one of the attribute names and eliminate the other synonym, or use a new attribute name to replace both synonyms
Homonyms
A single attribute name may have more than one meaning
Create new attribute names
Transitive Dependencies
May result when two 3NF relations are merged to form a single relation
Example:
STUDENT1 (Student ID, Major)
STUDENT2 (Student ID, Advisor)
merged into STUDENT (Student ID, Major, Advisor)
Note: assume only one advisor per major, so Advisor depends on Major, a non-key attribute
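The dependency introduced by the merge can be spotted by looking for dependencies in which neither side touches the primary key. The sketch below assumes Student_ID is the key of the merged STUDENT relation and encodes "only one advisor per major" as Major -> Advisor; the function name is hypothetical.

```python
def transitive_dependencies(primary_key, fds):
    """Return dependencies where a non-key attribute is determined
    by other non-key attributes."""
    key = set(primary_key)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not (set(lhs) & key) and not (set(rhs) & key)
    ]


student_key = ("Student_ID",)
student_fds = [
    (("Student_ID",), ("Major",)),    # from STUDENT1
    (("Student_ID",), ("Advisor",)),  # from STUDENT2
    (("Major",), ("Advisor",)),       # assumption: only one advisor per major
]
transitive = transitive_dependencies(student_key, student_fds)
```

STUDENT1 and STUDENT2 are each free of this problem on their own; the dependency only becomes visible once Major and Advisor sit in the same relation.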
Subtypes
If two or more relations represent different types of the same entity but share some common characteristics, create supertype-subtype relationships
Example:
PATIENT1 (Patient No., Name, Address)
PATIENT2 (Patient No., Room No.)
become
PATIENT (Patient No., Name, Address)
INPATIENT (Patient No., Room No.)
OUTPATIENT (Patient No., Date Treated)