Unit 4 DBMS
Find the set of all courses taught in both the Fall 2009 and the
Spring 2010 semesters.
Courses offered in both the Fall 2009 and Spring 2010 semesters
The Set-Intersection Operation
• Any relational-algebra expression that uses set intersection
can be rewritten by replacing the intersection operation with
a pair of set-difference operations:
r ∩ s = r − (r − s)
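The rewrite can be checked on small sets. The sketch below uses Python sets as stand-ins for relations; the course identifiers are made-up sample values, not data from the slides.

```python
# Hypothetical course_id sets for the two semesters (sample values only).
fall_2009 = {"CS-101", "CS-347", "PHY-101"}
spring_2010 = {"CS-101", "CS-315", "FIN-201"}

def intersect_via_difference(r, s):
    """Compute r ∩ s using only set difference: r − (r − s)."""
    return r - (r - s)

both = intersect_via_difference(fall_2009, spring_2010)

# The result agrees with Python's built-in intersection operator.
assert both == fall_2009 & spring_2010
print(both)
```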
The natural join of the instructor relation with the teaches relation
The Natural-Join Operation
• The result relation has only 13 tuples: the ones that give
information about an instructor and a course that the
instructor actually teaches.
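As a minimal sketch of how a natural join pairs tuples, the function below represents relations as lists of dicts and joins on every attribute name the two tuples share. The function name `natural_join` and the two tiny sample relations are assumptions made for illustration; they are not the full relations from the slides.

```python
def natural_join(r, s):
    """Join two lists of dicts on all attribute names they have in common."""
    result = []
    for t1 in r:
        for t2 in s:
            common = t1.keys() & t2.keys()
            # Keep the pair only if it agrees on every shared attribute.
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})
    return result

# Small sample relations (illustrative values only).
instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci."},
    {"ID": "58583", "name": "Califieri", "dept_name": "History"},
]
teaches = [
    {"ID": "10101", "course_id": "CS-101", "semester": "Fall", "year": 2009},
]

joined = natural_join(instructor, teaches)
print(joined)
```

In this sample, only the instructor who has a matching teaches tuple appears in the result.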
Outer-Join Operations
• For example, instructors Califieri, Gold, and Singh do not
appear in the result of the natural join, since they do not
teach any course.
• All three forms of outer join compute the join, and add
extra tuples to the result of the join.
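A left outer join can be sketched as the join plus one padded tuple for every left-hand tuple that found no match. The helper name `left_outer_join`, its `s_attrs` parameter, and the sample data are assumptions made for this illustration.

```python
def left_outer_join(r, s, s_attrs):
    """Natural join of r and s, plus unmatched tuples of r padded with None."""
    result = []
    for t1 in r:
        matches = [
            {**t1, **t2}
            for t2 in s
            if all(t1[a] == t2[a] for a in t1.keys() & t2.keys())
        ]
        if matches:
            result.extend(matches)
        else:
            # No partner in s: pad the attributes that only s has with None.
            padding = {a: None for a in s_attrs if a not in t1}
            result.append({**t1, **padding})
    return result

# Small sample relations (illustrative values only).
instructor = [
    {"ID": "10101", "name": "Srinivasan"},
    {"ID": "58583", "name": "Califieri"},
]
teaches = [{"ID": "10101", "course_id": "CS-101"}]

result = left_outer_join(instructor, teaches, s_attrs=["ID", "course_id"])
print(result)
```

The instructor with no teaches tuple is kept, with `None` playing the role of a null value.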
Generalized Projection
• The generalized-projection operation has the form
Π F1, F2, . . . , Fn (E), where
– E is any relational-algebra expression,
– each of F1, F2, . . . , Fn is an arithmetic expression
involving constants and attributes in the schema of E.
• Generalized projection also permits operations on other
data types, such as concatenation of strings.
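One way to picture generalized projection is as a list of named expressions evaluated over each tuple. The sketch below assumes a hypothetical helper `generalized_projection` and sample instructor tuples; it shows one arithmetic expression and one string concatenation.

```python
def generalized_projection(relation, **expressions):
    """Evaluate each named expression on every tuple of the relation."""
    return [
        {name: expr(t) for name, expr in expressions.items()}
        for t in relation
    ]

# Sample tuples (illustrative values only).
instructor = [
    {"ID": "10101", "name": "Srinivasan", "salary": 65000},
    {"ID": "12121", "name": "Wu", "salary": 90000},
]

monthly = generalized_projection(
    instructor,
    ID=lambda t: t["ID"],
    label=lambda t: t["name"] + " (" + t["ID"] + ")",   # string concatenation
    monthly_salary=lambda t: t["salary"] / 12,          # arithmetic expression
)
print(monthly)
```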
Aggregation
• The second extended relational-algebra operation is the
aggregate operation, which permits the use of aggregate
functions such as min or average on sets of values.
Tuples of the instructor relation, grouped by the dept name attribute
The result relation for the query “Find the average salary in each department”.
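Grouping followed by an aggregate function can be sketched in a few lines. The helper `average_by` and the sample salaries are assumptions for illustration, not the relation shown in the figure.

```python
from collections import defaultdict

def average_by(relation, group_attr, value_attr):
    """Group tuples by group_attr and average value_attr within each group."""
    groups = defaultdict(list)
    for t in relation:
        groups[t[group_attr]].append(t[value_attr])
    return {g: sum(values) / len(values) for g, values in groups.items()}

# Sample tuples (illustrative values only).
instructor = [
    {"name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"name": "Katz", "dept_name": "Comp. Sci.", "salary": 75000},
    {"name": "Wu", "dept_name": "Finance", "salary": 90000},
]

avg_salary = average_by(instructor, "dept_name", "salary")
print(avg_salary)
```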
The Tuple Relational Calculus
• When we write a relational-algebra expression, we
provide a sequence of procedures that generates the
answer to our query.
Find the ID, name, dept name, salary for instructors whose
salary is greater than $80,000
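In the tuple relational calculus this query is usually written as { t | t ∈ instructor ∧ t[salary] > 80000 }. A Python set comprehension has nearly the same shape, which makes it a convenient way to experiment. The sample tuples below are illustrative assumptions.

```python
from collections import namedtuple

Instructor = namedtuple("Instructor", ["ID", "name", "dept_name", "salary"])

# Sample tuples (illustrative values only).
instructor = {
    Instructor("10101", "Srinivasan", "Comp. Sci.", 65000),
    Instructor("12121", "Wu", "Finance", 90000),
    Instructor("22222", "Einstein", "Physics", 95000),
}

# { t | t ∈ instructor ∧ t[salary] > 80000 }
result = {t for t in instructor if t.salary > 80000}
print(sorted(t.name for t in result))
```

The comprehension states which tuples are wanted, not the sequence of operations that produces them.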
Pitfalls in Relational Database Design
• Relational database design requires that we find a “good”
collection of relation schemas.
• Design Goals:
– Avoid redundant data
– Ensure that relationships among attributes are represented
– Facilitate the checking of updates for violation of database
integrity constraints
Example
• Consider the relation schema:
• Lending-schema = (branch-name, branch-city, assets,
customer-name, loan-number, amount)
Example
• Redundancy:
– Data for branch-name, branch-city, assets are repeated
for each loan that a branch makes
– Wastes space and complicates updating
• Null values
– Cannot store information about a branch if no loans exist
– Can use null values, but they are difficult to handle
Decomposition
• Decompose the relation schema Lending-schema into:
– Branch-schema = (branch-name, branch-city, assets)
– Loan-info-schema = (customer-name, loan-number,
branch-name, amount)
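To see how this decomposition removes the repeated branch data, the sketch below projects a small Lending relation onto the two new schemas. The helper `project` and all sample values are assumptions for illustration.

```python
def project(relation, attrs):
    """Project a list of dicts onto attrs, dropping duplicate tuples."""
    return {tuple(t[a] for a in attrs) for t in relation}

# Sample Lending tuples (illustrative values only).
lending = [
    {"branch-name": "Downtown", "branch-city": "Brooklyn", "assets": 9000000,
     "customer-name": "Jones", "loan-number": "L-17", "amount": 1000},
    {"branch-name": "Downtown", "branch-city": "Brooklyn", "assets": 9000000,
     "customer-name": "Smith", "loan-number": "L-23", "amount": 2000},
    {"branch-name": "Perryridge", "branch-city": "Horseneck", "assets": 1700000,
     "customer-name": "Hayes", "loan-number": "L-15", "amount": 1500},
]

branch = project(lending, ["branch-name", "branch-city", "assets"])
loan_info = project(
    lending, ["customer-name", "loan-number", "branch-name", "amount"]
)

# Branch data is now stored once per branch instead of once per loan.
print(len(lending), len(branch), len(loan_info))
```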
The problem arises when we have two employees with the same name.
The next slide shows how we lose information: we cannot reconstruct the
original employee relation, so this is a lossy decomposition.
@ V.V.R 67
A Lossy Decomposition
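A minimal sketch of the effect, assuming an employee relation with attributes (ID, name, street, city, salary) that is split into (ID, name) and (name, street, city, salary). Both the schema and the sample tuples are assumptions for illustration.

```python
# Two employees who share a name (illustrative values only).
employee = {
    ("57766", "Kim", "Main", "Perryridge", 75000),
    ("98776", "Kim", "North", "Hampton", 67000),
}

# Project onto the two smaller schemas.
employee1 = {(emp_id, name) for (emp_id, name, _, _, _) in employee}
employee2 = {(name, street, city, sal) for (_, name, street, city, sal) in employee}

# Rejoin on the only shared attribute, name.
rejoined = {
    (emp_id, n1, street, city, sal)
    for (emp_id, n1) in employee1
    for (n2, street, city, sal) in employee2
    if n1 == n2
}

spurious = rejoined - employee
print(len(employee), len(rejoined), len(spurious))
```

The join returns more tuples than the original relation, so it is no longer possible to tell which address belongs to which ID.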
Lossless Decomposition
Let R be a relation schema and let R1 and R2 form a
decomposition of R. That is, R = R1 ∪ R2.
We say that the decomposition is a lossless decomposition
if there is no loss of information by replacing R with the
two relation schemas R1 and R2.
Formally,
Π R1 (r) ⋈ Π R2 (r) = r
Examples of Lossless Decomposition
Decomposition of R = (A, B, C)
R1 = (A, B) R2 = (B, C)
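The condition can be tested on a concrete instance by projecting onto R1 and R2, rejoining on B, and comparing with the original. The function name and the sample instances are assumptions. Note that this checks one instance only; it does not prove that the schema decomposition is lossless for every legal instance.

```python
def rejoin_equals_original(r):
    """Project r(A, B, C) onto (A, B) and (B, C), rejoin on B, compare with r."""
    r1 = {(a, b) for (a, b, _) in r}
    r2 = {(b, c) for (_, b, c) in r}
    rejoined = {(a, b1, c) for (a, b1) in r1 for (b2, c) in r2 if b1 == b2}
    return rejoined == r

# B values are distinct, so every tuple finds exactly its own partner.
good = {("alpha", 1, "A"), ("beta", 2, "B")}

# The same B value appears with different A and C values: spurious tuples appear.
bad = {("alpha", 1, "A"), ("beta", 1, "B")}

print(rejoin_equals_original(good), rejoin_equals_original(bad))
```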
Normalization theory
Normalization
Normalization is the process of transforming data from a problem into relations
while ensuring data integrity and eliminating data redundancy.
First Normal form
First normal form (1NF) deals with the `shape'
of the record.
A relation is in 1NF if, and only if, it contains
no repeating attributes or groups of attributes.
Example:
The Student table is not in 1NF: because it has a repeating group,
it is an `unnormalised table'.
Flatten Table and Extend Primary Key
The Student table with the repeating group can be written as:
Student(matric_no, name, date_of_birth, ( subject, grade ) )
Decomposing Relation
We now have two relations, Student and Record.
Student contains the original non-repeating groups
Record has the original repeating groups and the matric_no
This version of the relations does not have insertion, deletion, or update anomalies.
Without repeating groups, we say the relations are in First Normal Form (1NF).
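The split into Student and Record can be sketched as follows, with the repeating group modelled as a nested list. The field name `results` and all sample values are assumptions made for this illustration.

```python
# Unnormalised Student rows: (subject, grade) repeats inside each row.
# Sample values are illustrative only.
unnormalised = [
    {"matric_no": "S1", "name": "Smith", "date_of_birth": "1990-01-01",
     "results": [("Databases", "C"), ("Soft_Dev", "A")]},
    {"matric_no": "S2", "name": "Jones", "date_of_birth": "1991-02-02",
     "results": [("Databases", "B")]},
]

# Student keeps the non-repeating attributes.
student = [
    {k: row[k] for k in ("matric_no", "name", "date_of_birth")}
    for row in unnormalised
]

# Record holds one tuple per repeated (subject, grade) pair, plus matric_no.
record = [
    {"matric_no": row["matric_no"], "subject": subject, "grade": grade}
    for row in unnormalised
    for (subject, grade) in row["results"]
]

print(len(student), len(record))
```

Every value in both relations is now atomic, and matric_no links each Record tuple back to its Student.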
Second Normal Form
Consider again the Student relation from the
flattened table:
Dependency Diagram
A dependency diagram is used to show how
non-key attributes relate to each part or
combination of parts in the primary key.
Student(matric_no, name, date_of_birth, subject, grade)
[Dependency diagram; labels: PKD, Fully Dependent]
This relation is not in 2NF
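A partial dependency can be spotted on sample data by testing whether a functional dependency holds when only part of the key is used. The helper `holds`, the assumption that the key is (matric_no, subject), and the sample tuples are all illustrative. An instance can refute a dependency, but it cannot prove that one holds in general.

```python
def holds(relation, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in this instance."""
    seen = {}
    for t in relation:
        key = tuple(t[a] for a in lhs)
        value = tuple(t[a] for a in rhs)
        # The same lhs value must always map to the same rhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Flattened Student tuples (illustrative values only).
student = [
    {"matric_no": "S1", "name": "Smith", "date_of_birth": "1990-01-01",
     "subject": "Databases", "grade": "C"},
    {"matric_no": "S1", "name": "Smith", "date_of_birth": "1990-01-01",
     "subject": "Soft_Dev", "grade": "A"},
    {"matric_no": "S2", "name": "Jones", "date_of_birth": "1991-02-02",
     "subject": "Databases", "grade": "B"},
]

# name and date_of_birth depend on matric_no alone: a partial dependency.
partial = holds(student, ["matric_no"], ["name", "date_of_birth"])
# grade needs the whole key (matric_no, subject).
grade_on_part = holds(student, ["matric_no"], ["grade"])
grade_on_key = holds(student, ["matric_no", "subject"], ["grade"])
print(partial, grade_on_part, grade_on_key)
```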
Use your own judgment when decomposing schemas
BCNF - Decomposition
Example 2 (Convert to BCNF)
Old Scheme {MovieTitle, MovieID, PersonName, Role, Payment }
New Scheme {MovieID, PersonName, Role, Payment}
New Scheme {MovieTitle, PersonName}
• Loss of the relationship {MovieID} → {MovieTitle}
New Scheme {MovieID, PersonName, Role, Payment}
New Scheme {MovieID, MovieTitle}
• We got the {MovieID} → {MovieTitle} relationship back
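A simple way to see the difference between the two attempts: a dependency can be checked inside a single table only if some table of the decomposition contains all of its attributes. The sketch below applies that test; the helper name `checkable_locally` is an assumption, and the test is a sufficient condition for preserving a dependency, not the complete dependency-preservation algorithm.

```python
def checkable_locally(fd, decomposition):
    """True if one fragment contains every attribute of the dependency."""
    lhs, rhs = fd
    return any((lhs | rhs) <= fragment for fragment in decomposition)

fd = (frozenset({"MovieID"}), frozenset({"MovieTitle"}))

first_attempt = [
    {"MovieID", "PersonName", "Role", "Payment"},
    {"MovieTitle", "PersonName"},
]
second_attempt = [
    {"MovieID", "PersonName", "Role", "Payment"},
    {"MovieID", "MovieTitle"},
]

print(checkable_locally(fd, first_attempt))
print(checkable_locally(fd, second_attempt))
```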
2. Each manager can have more than one child
Mary NULL Adam
<JobSkills>
EmpSkills        EmpJob
Networking       EJ001
Web Development  EJ002
Programming      EJ002
Join dependency
{(EmpName, EmpSkills ), ( EmpName, EmpJob),
(EmpSkills, EmpJob)}
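Whether an instance satisfies this join dependency can be checked by projecting onto the three attribute pairs and joining them back. In the sketch below the employee names are hypothetical placeholders, since the EmpName column is not shown above; the function name is also an assumption.

```python
def satisfies_join_dependency(r):
    """Project r(EmpName, EmpSkills, EmpJob) onto its three pairs and rejoin."""
    name_skill = {(n, s) for (n, s, _) in r}
    name_job = {(n, j) for (n, _, j) in r}
    skill_job = {(s, j) for (_, s, j) in r}
    rejoined = {
        (n, s, j)
        for (n, s) in name_skill
        for (n2, j) in name_job if n2 == n
        for (s2, j2) in skill_job if s2 == s and j2 == j
    }
    return rejoined == r

# Employee names are hypothetical placeholders.
job_skills = {
    ("Emp1", "Networking", "EJ001"),
    ("Emp2", "Web Development", "EJ002"),
    ("Emp3", "Programming", "EJ002"),
}

# An instance where the three-way join produces a spurious tuple.
violating = {
    ("Emp1", "Networking", "EJ001"),
    ("Emp1", "Programming", "EJ002"),
    ("Emp2", "Networking", "EJ002"),
}

print(satisfies_join_dependency(job_skills), satisfies_join_dependency(violating))
```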
Fifth Normal Form-5NF
• A relation is in 5NF if it is in 4NF and contains no join dependency
other than those implied by its candidate keys; any further
decomposition must rejoin losslessly.
• 5NF is reached when the tables have been decomposed as far as
possible without losing information, in order to avoid redundancy.
• 5NF is also known as Project-join normal form (PJ/NF).
• Example: