DBMS Module-2
RDBMS
Relational DBMS
The relational model represents data as a table with columns and rows. Each
row is known as a tuple. Each column of the table has a name, called an attribute.
Terminologies:
• Tuple: Each row of a relation is known as a tuple, e.g., the STUDENT relation
given below has 4 tuples.
• Attribute: The name of a column in a particular table. Each
attribute Ai must have a domain, dom(Ai).
• Domain: The set of possible values an attribute can take in a relation is called
its domain. For example, the domain of STUD_AGE can be from 18 to 40.
• Relational instance: In a relational database system, a relation
instance is a finite set of tuples. Relation instances do
not have duplicate tuples.
• Relational schema: A relational schema contains the name of the
relation and name of all columns or attributes.
• Relational key: One or more attributes whose values identify
each row in the relation uniquely.
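The terms above can be made concrete with a small sketch. It models a relation instance as a Python set of tuples. Apart from STUD_AGE and its 18 to 40 domain, the attribute names, the sample rows, and the helper `in_domain` are illustrative assumptions.

```python
from typing import NamedTuple

# Relational schema: relation name plus attribute names
# (stud_no and stud_name are hypothetical; STUD_AGE comes from the text).
class Student(NamedTuple):
    stud_no: int
    stud_name: str
    stud_age: int

# Domain of STUD_AGE as given in the text: 18 to 40 inclusive.
STUD_AGE_DOMAIN = range(18, 41)

def in_domain(t: Student) -> bool:
    """Check that the tuple's STUD_AGE value lies in dom(STUD_AGE)."""
    return t.stud_age in STUD_AGE_DOMAIN

# Relation instance: a finite set of tuples, so duplicates collapse automatically.
student = {
    Student(1, "Ram", 20),
    Student(2, "Ramesh", 19),
    Student(2, "Ramesh", 19),   # duplicate tuple, stored only once
    Student(3, "Sujit", 18),
}

# stud_no acts as a relational key here: no two tuples share its value.
key_values = [t.stud_no for t in student]
is_key = len(key_values) == len(set(key_values))
```

Using a set rather than a list mirrors the rule that a relation instance has no duplicate tuples.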
E. F. Codd’s Rules
• Rule 0: Foundation rule
• RDBMS should be able to manage the stored data in its entirety
through its relational capabilities.
• Rule 1: Information Rule
• The data stored in a database, may it be user data or metadata, must
be a value of some table cell.
• Rule 2: Guaranteed Access Rule
• Every single data element (value) is guaranteed to be accessible
logically with a combination of table-name, primary-key (row value),
and attribute-name (column value).
• Rule 3: Systematic Treatment of NULL Values
• The NULL values in a database must be given a systematic and
uniform treatment. This is a very important rule because a NULL can
be interpreted as one of the following: data is missing, data is not
known, or data is not applicable.
• Rule 4: Active Online Catalog
• The structure description of the entire database must be stored in an
online catalog, known as data dictionary, which can be accessed by
authorized users.
• Rule 5: Comprehensive Data Sub-Language Rule
• A database can only be accessed using a language having linear syntax
that supports data definition, data manipulation, and transaction
management operations.
• Rule 6: View Updating Rule
• All the views of a database, which can theoretically be updated, must
also be updatable by the system.
• Rule 7: High-Level Insert, Update, and Delete Rule
• A database must support high-level insertion, update, and deletion,
that is, these operations must work on sets of rows and not only on single rows.
• Rule 8: Physical Data Independence
• The data stored in a database must be independent of the applications
that access the database. Any change in the physical structure of a
database must not have any impact on how the data is being accessed
by external applications.
• Rule 9: Logical Data Independence
• The logical data in a database must be independent of its user’s
view (application). Any change in logical data must not affect the
applications using it.
• Rule 10: Integrity Independence
• All integrity constraints of a database can be independently modified without
the need of any change in the application. This rule makes a
database independent of the front-end application and its interface.
• Rule 11: Distribution Independence
• The end-user must not be able to see that the data is distributed
over various locations.
• Rule 12: Non-Subversion Rule
• If a system has an interface that provides access to low-level
records, then the interface must not be able to subvert the system
and bypass security and integrity constraints.
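Rules 3 and 4 can be observed in practice. The sketch below uses Python's built-in sqlite3 module with an in-memory database. SQLite is chosen only for convenience (it is not claimed to satisfy all of Codd's rules), and the table `emp` and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (eno INTEGER PRIMARY KEY, ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "Asha", 12000), (2, "Ravi", None)],  # Ravi's salary is unknown (NULL)
)

# Rule 3: NULL is treated uniformly. Comparing with NULL yields NULL (unknown),
# so "= NULL" matches nothing and IS NULL must be used instead.
eq_null = conn.execute("SELECT COUNT(*) FROM emp WHERE sal = NULL").fetchone()[0]
is_null = conn.execute("SELECT COUNT(*) FROM emp WHERE sal IS NULL").fetchone()[0]

# Rule 4: the structure of the database is itself stored in a queryable catalog.
# In SQLite that catalog is the sqlite_master table.
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(eq_null, is_null, catalog)
```

The catalog is queried with the same SELECT syntax as user data, which is the point of Rule 4.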
Constraints
Every relation has some conditions that must hold for it to be a
valid relation. These conditions are called Relational Integrity
Constraints. There are three main integrity constraints:
• Key constraints
• Domain constraints
• Referential integrity constraints
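As a hedged illustration, the three kinds of constraint can be declared and tested with Python's built-in sqlite3 module. The tables `dept` and `student`, their columns, and the helper `violates` are assumptions made for this sketch. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE student (
           sid INTEGER PRIMARY KEY,                    -- key constraint
           age INTEGER CHECK (age BETWEEN 18 AND 40),  -- domain constraint
           dno INTEGER REFERENCES dept(dno)            -- referential integrity
       )"""
)
conn.execute("INSERT INTO dept VALUES (10, 'CSE')")
conn.execute("INSERT INTO student VALUES (1, 20, 10)")  # satisfies all three

def violates(sql: str) -> bool:
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

key_violation = violates("INSERT INTO student VALUES (1, 21, 10)")     # duplicate key
domain_violation = violates("INSERT INTO student VALUES (2, 55, 10)")  # age outside 18..40
ref_violation = violates("INSERT INTO student VALUES (3, 22, 99)")     # no dept 99
```

Each rejected insert leaves the relation unchanged, so `student` still holds only the one valid row.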
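Conditional Join (⋈c)
• A conditional join is equivalent to a selection applied to the
Cartesian product of the two relations, e.g.:
σ(MGRSSN=SSN)(DEPARTMENT × EMPLOYEE)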
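The expression above can be sketched directly: build the Cartesian product, then keep the pairs that satisfy the condition. Only the attribute names MGRSSN and SSN come from the slide. The sample rows, the other attributes, and the function name `theta_join` are hypothetical.

```python
from itertools import product

# Hypothetical sample rows; only MGRSSN and SSN are taken from the text.
department = [
    {"DNAME": "Research", "MGRSSN": 333},
    {"DNAME": "Admin", "MGRSSN": 987},
]
employee = [
    {"SSN": 123, "ENAME": "John"},
    {"SSN": 333, "ENAME": "Franklin"},
    {"SSN": 987, "ENAME": "Jennifer"},
]

def theta_join(r, s, condition):
    """Conditional join: select from the Cartesian product r x s
    the pairs of rows that satisfy the condition.
    Assumes the two relations have disjoint attribute names."""
    return [{**a, **b} for a, b in product(r, s) if condition(a, b)]

managers = theta_join(department, employee, lambda d, e: d["MGRSSN"] == e["SSN"])
for row in managers:
    print(row["DNAME"], row["ENAME"])
```

Of the six pairs in the product, only the two with matching MGRSSN and SSN survive the selection.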
Equijoin(⋈)
• Equijoin is a special case of conditional join where only the
equality condition holds between a pair of attributes.
• As the values of the two attributes are equal in every result tuple,
one of them is redundant; the natural join keeps only one copy of it, e.g.:
STUDENT⋈STUDENT_SPORTS
Analysis of Inner join
• Theta Join, Equijoin, and Natural Join are called inner joins.
• An inner join includes only those tuples with matching
attributes and the rest are discarded in the resulting relation.
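A minimal sketch of an inner (natural) join shows how unmatched tuples are discarded. The attributes ROLL_NO, NAME, and SPORTS and all sample rows are assumptions, since the slide gives only the relation names STUDENT and STUDENT_SPORTS.

```python
def natural_join(r, s):
    """Natural join of two lists of dict rows on their common attribute names."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    result = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in common):
                result.append({**a, **b})  # shared attributes appear once
    return result

# Hypothetical rows for STUDENT and STUDENT_SPORTS
student = [
    {"ROLL_NO": 1, "NAME": "Ram"},
    {"ROLL_NO": 2, "NAME": "Ramesh"},
    {"ROLL_NO": 3, "NAME": "Sujit"},
]
student_sports = [
    {"ROLL_NO": 1, "SPORTS": "Badminton"},
    {"ROLL_NO": 2, "SPORTS": "Cricket"},
    {"ROLL_NO": 4, "SPORTS": "Chess"},
]

joined = natural_join(student, student_sports)
```

Student 3 and sports row 4 have no partner, so neither appears in `joined`.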
Outer Joins
• To include the unmatched tuples of the participating relations in
the resulting relation as well, we need to use outer joins.
• There are three kinds of outer joins −
• left outer join,
• right outer join,
• full outer join.
Left Outer Join: ( R Left Outer Join S )
• All the tuples from the Left relation, R, are included in the
resulting relation.
• If there are tuples in R without any matching tuple in the
Right relation S, then the S-attributes of the resulting relation
are made NULL.
Right Outer Join: ( R Right Outer Join S )
• All the tuples from the Right relation, S, are included in the
resulting relation.
• If there are tuples in S without any matching tuple in R, then
the R-attributes of resulting relation are made NULL.
Course (Left)        HOD (Right)
A    B               C    D
100  Database        100  Alex
101  Mechanics       102  Maya
102  Electronics     104  Mira
Full Outer Join: ( R Full Outer Join S)
• All the tuples from both participating relations are included in
the resulting relation.
• For tuples of either relation that have no matching tuple in the
other, the attributes of the other relation are made NULL.
Course (Left)        HOD (Right)
A    B               C    D
100  Database        100  Alex
101  Mechanics       102  Maya
102  Electronics     104  Mira
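The three outer joins can be sketched on the Course and HOD relations shown above. The join condition A = C is an assumption (the slides do not state it explicitly), Python's None stands in for NULL, and the function names are illustrative.

```python
# Course(A, B) and HOD(C, D) from the tables above; None stands for NULL.
course = [(100, "Database"), (101, "Mechanics"), (102, "Electronics")]
hod = [(100, "Alex"), (102, "Maya"), (104, "Mira")]

def left_outer_join(r, s):
    """Keep every tuple of r; pad with NULLs when no tuple of s matches on A = C."""
    out = []
    for a, b in r:
        matches = [(c, d) for c, d in s if c == a]
        if matches:
            out.extend((a, b, c, d) for c, d in matches)
        else:
            out.append((a, b, None, None))
    return out

def right_outer_join(r, s):
    """Keep every tuple of s; pad the r-attributes with NULLs when unmatched."""
    out = []
    for c, d in s:
        matches = [(a, b) for a, b in r if a == c]
        if matches:
            out.extend((a, b, c, d) for a, b in matches)
        else:
            out.append((None, None, c, d))
    return out

def full_outer_join(r, s):
    """Union of the left and right outer joins (matched tuples appear once)."""
    left = left_outer_join(r, s)
    return left + [t for t in right_outer_join(r, s) if t not in left]

for row in full_outer_join(course, hod):
    print(row)
```

Under the assumed condition, course 101 has no HOD and HOD 104 has no course, so each appears once in the full outer join, padded with NULLs.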
DIVISION Operation
Generalized Projection
• The generalized projection has the form ∏F1, F2, ..., Fn (R),
where F1, F2, ..., Fn are functions over the attributes in relation R and
may involve arithmetic operations and constant values.
• This operation is helpful when developing reports where computed
values have to be produced in the columns of a query result.
Example
• As an example, consider the relation
• EMPLOYEE (Ssn, Salary, Deduction, Years_service)
• A report may be required to show
• Net Salary = Salary – Deduction,
• Bonus = 2000 * Years_service, and
• Tax = 0.25 * Salary.
• Then a generalized projection combined with renaming may be used
as follows:
REPORT ← ρ(Ssn, Net_salary, Bonus, Tax) (∏Ssn, Salary – Deduction, 2000 * Years_service, 0.25 * Salary (EMPLOYEE))
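The REPORT expression can be traced with a short sketch. The EMPLOYEE rows are hypothetical, and `generalized_projection` is an illustrative helper rather than a standard library function.

```python
# Hypothetical EMPLOYEE tuples: (Ssn, Salary, Deduction, Years_service)
employee = [
    ("111", 50000, 5000, 4),
    ("222", 30000, 2000, 10),
]

def generalized_projection(relation, *functions):
    """Apply each function F1..Fn to every tuple and collect the computed columns."""
    return [tuple(f(t) for f in functions) for t in relation]

# Columns of REPORT: (Ssn, Net_salary, Bonus, Tax)
report = generalized_projection(
    employee,
    lambda t: t[0],          # Ssn
    lambda t: t[1] - t[2],   # Net_salary = Salary - Deduction
    lambda t: 2000 * t[3],   # Bonus = 2000 * Years_service
    lambda t: 0.25 * t[1],   # Tax = 0.25 * Salary
)
```

The renaming step ρ corresponds only to how the result columns are labelled, so it appears here as a comment.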
Aggregate Functions and Grouping
Comparison Chart
BASIS FOR COMPARISON    RELATIONAL ALGEBRA    RELATIONAL CALCULUS
Tuple Relational Calculus
Tuple Relational Calculus lists the tuples selected from a relation,
based on a given condition. It is formally denoted as:
{ t | P(t) }
where t = the resulting tuples (t is a tuple variable), and
P(t) = the predicate, i.e., the condition used to fetch t.
Thus, it generates the set of all tuples t such that predicate P(t) is true for
t.
P(t) may have various conditions logically combined with OR (∨), AND
(∧), NOT(¬).
It also uses quantifiers:
∃ t ∈ r (Q(t)) = “there exists” a tuple t in relation r such that
predicate Q(t) is true.
∀ t ∈ r (Q(t)) = Q(t) is true “for all” tuples t in relation r.
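The notation maps closely onto Python set comprehensions and the built-ins `any` and `all`. The sketch below assumes a small hypothetical relation r with attributes ename, age, and sal.

```python
from typing import NamedTuple

class Emp(NamedTuple):
    ename: str
    age: int
    sal: int

# Hypothetical relation instance r
r = {Emp("Asha", 30, 12000), Emp("Ravi", 45, 8000), Emp("Meena", 52, 25000)}

# { t | t ∈ r ∧ P(t) } with P(t): t.sal > 10000
high_paid = {t for t in r if t.sal > 10000}

# ∃ t ∈ r (Q(t)): at least one tuple satisfies Q (here: age above 50)
some_over_50 = any(t.age > 50 for t in r)

# ∀ t ∈ r (Q(t)): every tuple satisfies Q (here: age at least 18)
all_adults = all(t.age >= 18 for t in r)
```

The comprehension states what the result must satisfy and not how to compute it, which is the declarative character of relational calculus.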
Build TRC expression case
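Example: let t be a tuple variable ranging over the EMP relation.
Q1. List the details of the employees who are getting a salary
above 10,000.
{ t | EMP(t) ∧ t.sal > 10,000 }
Q2. List the ename and age of the employees who are getting a
salary above 10,000.
{ t.ename, t.age | EMP(t) ∧ t.sal > 10,000 }
Conditions can be combined, e.g., the ename, age, and sal of the
employees whose salary is below 10,000 or above 20,000:
{ t.ename, t.age, t.sal | EMP(t) ∧ (t.sal < 10,000 ∨ t.sal > 20,000) }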
For practice:
1. List the employee name and age of the employees whose age is
below 35 or above 50 and sal is above 30,000.
2. List the eno and sal of the employees who are getting a salary
between 20,000 and 50,000.
Domain Relational Calculus
Build DRC expression case
EMP(eno, ename, age, sal)
One variable must be assigned to each
domain (attribute):
p = eno <variable p ranges over the domain of eno>
q = ename
r = age
s = sal
Q1. Build the DRC expression to find the age of Mr. X.
The result variable r stays free; only the other variables are quantified.
[r | (∃p)(∃q)(∃s) { EMP(p, q, r, s) ∧ q = ‘Mr. X’ } ]
More …
Q2. List the ename of the employees who are getting a salary of 10000 and
above.
[q | (∃p)(∃r)(∃s) { EMP(p, q, r, s) ∧ s ≥ 10000 } ]
Q3. List the employee name and age of the employees whose age is
below 35 or above 50 and sal is above 30000.
[q, r | (∃p)(∃s) { EMP(p, q, r, s) ∧ (r < 35 ∨ r > 50) ∧ s > 30000 } ]
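The three DRC queries can be checked against sample data with set comprehensions, where each domain variable p, q, r, s binds one component of an EMP tuple. The rows below are hypothetical. Q3 follows the question wording ("above 30000") and groups the two age conditions explicitly.

```python
# Hypothetical EMP(eno, ename, age, sal) instance
emp = {
    (1, "Mr. X", 41, 9000),
    (2, "Asha", 30, 32000),
    (3, "Ravi", 55, 40000),
    (4, "Meena", 45, 50000),
}

# Q1: age of Mr. X. Only r is returned; p and s are bound but unused,
# which plays the role of existential quantification.
q1 = {r for (p, q, r, s) in emp if q == "Mr. X"}

# Q2: ename of employees with salary 10000 and above
q2 = {q for (p, q, r, s) in emp if s >= 10000}

# Q3: ename and age where (age < 35 or age > 50) and sal > 30000
q3 = {(q, r) for (p, q, r, s) in emp if (r < 35 or r > 50) and s > 30000}
```

Without the parentheses in Q3, `and` would bind tighter than `or` (just as ∧ is usually read before ∨), and every employee under 35 would qualify regardless of salary.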
For Practice:
1. List the employee id and ename of the employees whose age is 35
or above 50 and sal is below 30000.
2. List the eno and sal of the employees who are getting a salary
between 20,000 and 50,000.
Queries
• Query 1: Find the tuples of loans where the amount is
greater than or equal to 10000.
Example of Anomalies
Sid Sname cid Cname Fid Fname Salary
Evolution of Normalization theories
First Normal Form(1NF)
• The First Normal Form (1NF) works on the concept of “atomicity”
of the values in every individual tuple of the tables present in the
database.
• That is, a relation is said to be in 1NF if every attribute in every
tuple holds a single (atomic) value.
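A small sketch can show both a 1NF check and one way to repair a violation. The student rows, the multivalued attribute `phones`, and the helpers `is_1nf` and `to_1nf` are assumptions made for illustration.

```python
# Hypothetical non-1NF rows: the "phones" attribute holds several values per tuple.
unnormalized = [
    {"sid": 1, "sname": "Ram", "phones": ["98111", "98222"]},
    {"sid": 2, "sname": "Sita", "phones": ["97333"]},
]

def is_1nf(rows):
    """A relation is in 1NF if no attribute value is a collection."""
    return all(
        not isinstance(v, (list, set, tuple, dict))
        for row in rows
        for v in row.values()
    )

def to_1nf(rows, multivalued):
    """Flatten one multivalued attribute into one row per value."""
    return [
        {**{k: v for k, v in row.items() if k != multivalued}, multivalued: value}
        for row in rows
        for value in row[multivalued]
    ]

normalized = to_1nf(unnormalized, "phones")
```

After flattening, sid alone no longer identifies a row; the pair (sid, phones) does.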
Functional Dependency (FD)
A functional dependency (FD) is a constraint between two sets of attributes in a
relation.
It is a relationship that exists when one attribute (or set of attributes)
functionally determines another: tuples that agree on the first must
also agree on the second.
• Functional dependency in DBMS is denoted using an
arrow between two or more attributes, such as FD: A → B.
• Here, A and B are attributes present in a relation.
• “A → B” means “B” is functionally dependent upon “A”,
or “A” functionally determines “B”. A functional
dependency acts as a constraint between sets of
attributes present in a database.
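Whether a dependency A → B holds in a given set of rows can be tested mechanically: any two tuples that agree on A must also agree on B. The helper `fd_holds` and the sample rows below are hypothetical.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this instance:
    tuples that agree on lhs must also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical student rows
rows = [
    {"roll_no": 1, "name": "Ram", "dept": "CSE"},
    {"roll_no": 2, "name": "Ram", "dept": "ECE"},
    {"roll_no": 3, "name": "Sita", "dept": "CSE"},
]

roll_determines_name = fd_holds(rows, ["roll_no"], ["name"])
name_determines_dept = fd_holds(rows, ["name"], ["dept"])
```

A check like this can only refute a dependency. An FD is a constraint on every legal instance of the relation, so passing on one sample does not prove it.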
Example-1 : Consider a table student_details containing details of
some students.
Example : student_details Table
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {B, C}
{D}+ = {D, E}
{E}+ = {E,D}
Identifying Candidate Key
• A candidate key of a relation is a minimal attribute or set of attributes
that determines the whole relation, i.e., whose closure contains all the
attributes of the relation.
Example-1 : Consider the relation R(A,B,C) with given functional
dependencies :
FD1 : A → B
FD2 : B → C
{A}+ = {A, B, C}
{B}+ = {B, C}
{C}+ = {C}
Clearly, “A” is the candidate key, as its closure contains all the
attributes present in the relation “R”.
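The closures in Example-1 can be reproduced with the usual fixed-point procedure: keep applying every FD whose left side is already covered until nothing new is added. This is a minimal sketch; the function name `closure` and the representation of an FD as a pair of sets are choices made here.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already determined, add the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# FD1: A -> B, FD2: B -> C
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
for x in "ABC":
    print(x, sorted(closure({x}, fds)))
```

Only the closure of A reaches all of {A, B, C}, which matches the conclusion that A is the candidate key.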
Example-2 : Consider another relation R(A, B, C, D, E) having the
Functional dependencies :
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D
Now, calculating the closure of the attributes as :
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {C, B}
{D}+ = {E, D}
{E}+ = {E, D}
In this case, no single attribute can determine all the attributes on
its own, unlike in the previous example. Here, we need to combine two or
more attributes to obtain the candidate keys. For instance,
{A, D}+ = {A, B, C, D, E}, so AD is a candidate key, and so is AE.
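For a relation this small, the candidate keys of Example-2 can be found by brute force: try attribute sets in order of increasing size and keep those whose closure covers the whole relation and that contain no smaller key. This is a sketch only; the function names are illustrative and the search is exponential in the number of attributes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Try attribute sets by increasing size and keep the minimal
    ones whose closure covers every attribute."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

# R(A, B, C, D, E) with A -> BC, C -> B, D -> E, E -> D
R = {"A", "B", "C", "D", "E"}
fds = [({"A"}, {"B", "C"}), ({"C"}, {"B"}), ({"D"}, {"E"}), ({"E"}, {"D"})]
keys = candidate_keys(R, fds)
print([sorted(k) for k in keys])
```

A must be in every key because no FD has A on its right side, and either D or E is needed because only they determine each other.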
Challenging task to practice
Q. R = {A,B,C,D,E,F,G,H,I,J,K}