DBMS
What is Data
What is Database?
DATABASES
• Web indexes • Train timetables
• Library catalogues • Airline bookings
• Medical records • Credit card details
• Bank accounts • Student records
• Stock control • Customer histories
• Personnel systems • Stock market prices
• Product catalogues • Discussion boards
• Telephone directories • and so on…
WHY DBMS
DBMS APPLICATION
• Hierarchical databases
• Network databases
• Object oriented databases
• Relational databases
• NoSQL databases
Hierarchical Data Model(HDBMS)
1968-1980 was the era of the hierarchical database.
The most prominent hierarchical database was IMS (Information
Management System), IBM's first DBMS.
In this model, files are related in a parent/child manner,
forming a tree-like structure.
Network databases
A network database model is a database model that allows
multiple records to be linked to the same owner file, and a
member record to have more than one owner.
In this model, files are related as owners and members, much
like nodes in a general network.
Object oriented databases
Relational Database (Tabular)
1970 - Present: the era of the relational database and
database management. In 1970, the relational model was
proposed by E.F. Codd.
The relational database model has two main terms: instance
and schema.
An instance is a table with rows and columns.
This model uses mathematical concepts such as set theory
and predicate logic.
The first internet database application was created in 1995.
During the era of the relational database, many more models
were introduced, such as the object-oriented model and the
object-relational model.
Relational databases
Session-2
Data Model
Physical Schema:
• Describes the physical storage of the database.
• Does not go down to blocks or devices; it describes the
organization of files, access paths, etc.
Conceptual Schema:
• Describes the structure of the whole database.
• Describes entities, their relationships, and constraints.
External Schema:
• Provides a user's view of data.
• Shows relevant info particular to user, hides rest of the
info.
• one or more levels.
Instances or State
• Database State:
• The actual data stored in a database at a particular
moment in time. This includes the collection of all the
data in the database.
• Also called database instance (or occurrence or
snapshot).
• The database schema changes very infrequently.
• The database state changes every time the database is
updated.
• 1-Tier Architecture
Conceptual (or Logical)
Internal (or Physical)
Data Independence
• Logical Data Independence:
• The capacity to change the conceptual schema without
having to change the external schemas and their
associated application programs.
E-R Diagram
Entities
Attributes
Keys
Relationships
ENTITY
• An entity is a real-world thing that is to be represented in our database.
• An entity can be a place, person, object, event, or concept
about which data is stored in the database.
• An entity is made up of 'attributes' that describe it.
5 Types of entity
Person: Employee, Student, Patient
Place: Store, Building
Object: Machine, Product, Car
Event: Sale, Registration, Renewal
Concept: Account, Course
• A rectangle symbol is used to represent an ENTITY.
• A group of similar entities is called an entity set.
Attributes
• An attribute describes the property of an entity.
• E.g. student: name, rollno, age, …
• An attribute is represented by an ellipse in an ER diagram.
Multivalued attribute: A multivalued attribute can have more than one value.
Ex: A student can have more than one mobile number, email
address, etc.
Key attribute: The key attribute represents the main characteristic of
an entity. It represents a primary key. The key attribute is
represented by an ellipse with the text underlined.
Ex: Rollno is a key attribute as it can identify any student
uniquely.
Single-valued attribute: A single-valued attribute can take only
one value for a given entity from an entity set. Ex: Age, DoB,
Rollno, etc.
Session-4
Key Attributes
• Keys play an important role in the relational database.
• A key is used to uniquely identify any entity, record, or row of
data in a table. It is also used to establish and identify
relationships between tables.
Library (Roll No, Book Name, Author, Issue Date, Return Date, Penalty)
One-to-many − One entity from entity set A can be associated
with more than one entity of entity set B; however, an entity
from entity set B can be associated with at most one entity of
entity set A.
Many-to-one − More than one entity from entity set A can be
associated with at most one entity of entity set B; however, an
entity from entity set B can be associated with more than one
entity from entity set A.
Many-to-many − One entity from A can be associated with more
than one entity from B and vice versa.
Weak Entities
A weak entity is a type of entity which does not have a key
attribute of its own.
Strong Entity | Weak Entity
A strong entity always has a primary key. | A weak entity has a foreign key referencing the primary key of a strong entity.
A strong entity is independent of other entities. | A weak entity is dependent on a strong entity.
A strong entity is represented by a single rectangle. | A weak entity is represented by a double rectangle.
A relationship between two strong entities is represented by a single diamond. | A relationship between a strong and a weak entity is represented by a double diamond.
Binary relationship: Entity1, HasLinkWith, Entity2
Recursive (unary) relationship, example: Staff, Supervises, Staff (roles: Supervisor and Supervisee)
Complex relationship, here ternary: TernaryRelationship linking Entity1, Entity2 and Entity3
Relationships: Multiplicity
Label lines to show cardinality and participation:
0..1 “zero or one” (optional)
0..* “zero or more” (optional)
1..1 “one” (mandatory)
1..4 “between 1 and 4” (mandatory)
1..* “one or more” (mandatory)
Entity1 (1..1) HasLinkWith (0..*) Entity2
Example: Manager (1..1) Manages (0..3) Department
Relationship attributes: responsibility [1..*], dateAllocated
Each department is managed by ONE manager.
Each manager manages UP TO 3 departments (but need not manage any department).
FEATURES OF ER-diagram
Session-5
Generalization
• Generalization is like a bottom-up approach in which two or
more entities of lower level combine to form a higher level
entity if they have some attributes in common.
• Entities are combined to form a more generalized entity, i.e.,
subclasses are combined to make a superclass.
IS A
Specialization
• This is a top-down approach, and it is opposite to
Generalization.
• This is used to identify the subset of an entity set that shares
some distinguishing characteristics.
IS A
Aggregation
• Where the relation between two entities is treated as a single
entity.
• Where relationship with its corresponding entities is
aggregated into a higher level entity.
ENQUIRE
E-R Diagram Steps
Here we design an Entity Relationship (ER) model for a college
database. Say we have the following statements.
1. A college contains many departments
2. Each department can offer any number of courses
3. Many instructors can work in a department
4. An instructor can work only in one department
5. For each department there is a Head
6. An instructor can be head of only one department
7. Each instructor can take any number of courses
8. A course can be taken by only one instructor
9. A student can enroll for any number of courses
10. Each course can have any number of students
Step 1 : Identify the Entities
Step 2 : Identify the relationships
1. One department offers many courses, but a particular course
can be offered by only one department. Hence the cardinality
between department and course is One to Many (1:N).
2. One department has multiple instructors, but an instructor belongs
to only one department. Hence the cardinality between
department and instructor is One to Many (1:N).
3. One department has only one head, and one head can be the
head of only one department. Hence the cardinality is One to
One (1:1).
4. Many students can enroll in one course, and one student
can enroll in many courses. Hence the cardinality between
course and student is Many to Many (M:N).
5. One course is taught by only one instructor, but one instructor
teaches many courses. Hence the cardinality between course and
instructor is Many to One (N:1).
Step 3: Identify the key attributes
• "Department_Name" can identify a department uniquely.
Hence Department_Name is the key attribute for the Entity
"Department".
• Course_ID is the key attribute for "Course" Entity.
• Student_ID is the key attribute for "Student" Entity.
• Instructor_ID is the key attribute for "Instructor" Entity.
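One possible way to carry the cardinalities and key attributes above into tables is sketched below using Python's built-in sqlite3 module. Only the four key attributes come from the slides; the table names, the head_id column and the enrollment junction table are assumptions made for illustration, not a prescribed design.

```python
import sqlite3

# Hypothetical schema: 1:N becomes a foreign key on the "many" side,
# 1:1 becomes a UNIQUE foreign key, M:N becomes a junction table.
SCHEMA = """
CREATE TABLE department (
    department_name TEXT PRIMARY KEY,
    head_id INTEGER UNIQUE REFERENCES instructor(instructor_id)  -- 1:1 head
);
CREATE TABLE instructor (
    instructor_id   INTEGER PRIMARY KEY,
    department_name TEXT NOT NULL REFERENCES department(department_name)  -- N:1
);
CREATE TABLE course (
    course_id       INTEGER PRIMARY KEY,
    department_name TEXT NOT NULL REFERENCES department(department_name), -- N:1
    instructor_id   INTEGER REFERENCES instructor(instructor_id)          -- N:1
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY
);
CREATE TABLE enrollment (  -- M:N between student and course
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
"""

def build_college_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript(SCHEMA)
    return conn
```

With this layout, trying to make one instructor the head of a second department violates the UNIQUE constraint, which mirrors statement 6 of the requirements.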
Session-6
BANKING SYSTEM
• ER diagram of Bank has the following description :
Conditional Join(⋈c)
• Conditional Join is used when you want to join two or more
relations based on some condition.
• Example: Select students whose ROLL_NO is greater than
EMP_NO of employees
σ (STUDENT.ROLL_NO>EMPLOYEE.EMP_NO)(STUDENT×EMPLOYEE)
Equijoin(⋈)
• Equijoin is a special case of conditional join where the condition
consists only of equality between a pair of attributes.
• As the values of the two attributes are equal in the result of an
equijoin, one of the two is redundant; a natural join keeps only one copy.
STUDENT ⋈(STUDENT.ROLL_NO = EMPLOYEE.EMP_NO) EMPLOYEE
Natural Join(⋈)
• It is a special case of equijoin in which the equality condition
holds on all attributes which have the same name in relations R
and S (the relations on which the join operation is applied).
• While applying natural join on two relations, there is no
need to write equality condition explicitly.
• Example: Select students whose ROLL_NO is equal to
ROLL_NO of STUDENT_SPORTS as:
STUDENT⋈STUDENT_SPORTS
Analysis of Inner join
• Theta Join, Equijoin, and Natural Join are called inner joins.
• An inner join includes only those tuples with matching
attributes and the rest are discarded in the resulting relation.
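The three inner joins can be sketched in a few lines of Python over lists of dictionaries. The attribute names follow the examples above; the sample rows and the helper names theta_join and natural_join are invented for illustration.

```python
def theta_join(r, s, condition):
    """Conditional join: keep each pair from r x s that satisfies condition.

    Assumes r and s have no attribute names in common, so merging is safe.
    """
    return [{**a, **b} for a in r for b in s if condition(a, b)]

def natural_join(r, s):
    """Natural join: equality on every attribute name shared by r and s."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in common)]

# Made-up sample data
STUDENT = [{"ROLL_NO": 1, "NAME": "Asha"}, {"ROLL_NO": 3, "NAME": "Ravi"}]
EMPLOYEE = [{"EMP_NO": 2, "DEPT": "HR"}, {"EMP_NO": 3, "DEPT": "IT"}]
STUDENT_SPORTS = [{"ROLL_NO": 3, "SPORT": "Chess"}]

# Conditional join: STUDENT.ROLL_NO > EMPLOYEE.EMP_NO
conditional = theta_join(STUDENT, EMPLOYEE,
                         lambda a, b: a["ROLL_NO"] > b["EMP_NO"])

# Equijoin: STUDENT.ROLL_NO = EMPLOYEE.EMP_NO
equi = theta_join(STUDENT, EMPLOYEE,
                  lambda a, b: a["ROLL_NO"] == b["EMP_NO"])

# Natural join on the shared attribute ROLL_NO
natural = natural_join(STUDENT, STUDENT_SPORTS)
```

In every case, tuples without a partner (student 1 here) simply drop out of the result, which is what makes these joins "inner".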
Outer Joins
• We need to use outer joins to include all the tuples from the
participating relations in the resulting relation, including those without a match.
• There are three kinds of outer joins −
• left outer join,
• right outer join,
• full outer join.
Left Outer Join: ( R Left Outer Join S )
• All the tuples from the Left relation, R, are included in the
resulting relation.
• If there are tuples in R without any matching tuple in the
Right relation S, then the S-attributes of the resulting relation
are made NULL.
Right Outer Join: ( R Right Outer Join S )
• All the tuples from the Right relation, S, are included in the
resulting relation.
• If there are tuples in S without any matching tuple in R, then
the R-attributes of resulting relation are made NULL.
Course (Left) HOD (Right)
A B C D
100 Database 100 Alex
101 Mechanics 102 Maya
102 Electronics 104 Mira
Full Outer Join: ( R Full Outer Join S)
• All the tuples from both participating relations are included in
the resulting relation.
• If there are no matching tuples for both relations, their
respective unmatched attributes are made NULL.
Course (Left) HOD (Right)
A B C D
100 Database 100 Alex
101 Mechanics 102 Maya
102 Electronics 104 Mira
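The three outer joins can be tried on the Course and HOD tables shown above. The sketch below assumes the join condition is A = C, which the slides do not state explicitly, and uses Python's None to stand in for NULL.

```python
COURSE = [(100, "Database"), (101, "Mechanics"), (102, "Electronics")]  # (A, B)
HOD = [(100, "Alex"), (102, "Maya"), (104, "Mira")]                     # (C, D)

def left_outer_join(left, right):
    """Keep every left tuple; pad with None when no right tuple has C = A."""
    result = []
    for a, b in left:
        matches = [(c, d) for c, d in right if c == a]
        if matches:
            result.extend((a, b, c, d) for c, d in matches)
        else:
            result.append((a, b, None, None))
    return result

def right_outer_join(left, right):
    """Keep every right tuple; pad with None when no left tuple has A = C."""
    result = []
    for c, d in right:
        matches = [(a, b) for a, b in left if a == c]
        if matches:
            result.extend((a, b, c, d) for a, b in matches)
        else:
            result.append((None, None, c, d))
    return result

def full_outer_join(left, right):
    """Left outer join plus the right tuples that found no partner."""
    result = left_outer_join(left, right)
    left_keys = {a for a, _ in left}
    result += [(None, None, c, d) for c, d in right if c not in left_keys]
    return result
```

Under that assumption, course 101 appears with NULLs only in the left and full joins, and HOD 104 appears with NULLs only in the right and full joins.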
Relational Calculus
Relational calculus is a non-procedural query language.
In a non-procedural query language, the user is not concerned
with the details of how to obtain the end results.
Relational calculus states what to retrieve but never explains
how to retrieve it.
Types of Relational calculus
Comparison Chart
BASIS FOR COMPARISON | RELATIONAL ALGEBRA | RELATIONAL CALCULUS
Tuple Relational Calculus
Tuple Relational Calculus lists the tuples selected from a relation,
based on a given condition. It is formally denoted as:
{ t | P(t) }
where t = the resulting tuples, and
P(t) = the predicate, i.e. the conditions used to fetch t.
Thus, it generates the set of all tuples t such that the predicate P(t)
is true for t.
P(t) may have various conditions logically combined with OR (∨), AND
(∧), NOT(¬).
It also uses quantifiers:
∃ t ∈ r (Q(t)) = “there exists” a tuple t in relation r such that
predicate Q(t) is true.
∀ t ∈ r (Q(t)) = Q(t) is true “for all” tuples in relation r.
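The notation { t | P(t) } maps almost directly onto a Python comprehension, which can help when reading TRC expressions. The sketch below uses a made-up EMP relation with attributes eno, ename, age and sal; the rows and variable names are illustrative only.

```python
# Hypothetical EMP relation; each tuple is a dict of attribute values.
EMP = [
    {"eno": 101, "ename": "Anil", "age": 30, "sal": 12000},
    {"eno": 102, "ename": "Bina", "age": 45, "sal": 9000},
    {"eno": 103, "ename": "Chet", "age": 52, "sal": 40000},
]

# { t | t ∈ EMP ∧ t.sal > 10000 }
high_paid = [t for t in EMP if t["sal"] > 10000]

# { t | t ∈ EMP ∧ (t.age < 35 ∨ t.age > 50) ∧ ¬(t.sal > 30000) }
combined = [t for t in EMP
            if (t["age"] < 35 or t["age"] > 50) and not t["sal"] > 30000]

# ∃ t ∈ EMP (t.eno = 102)
exists_102 = any(t["eno"] == 102 for t in EMP)

# ∀ t ∈ EMP (t.sal > 5000)
all_above_5000 = all(t["sal"] > 5000 for t in EMP)
```

The comprehension states which tuples qualify and leaves the iteration order and strategy to the interpreter, in the same way that the calculus leaves evaluation to the DBMS.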
Build TRC expression case
Ex: Let t be a tuple variable ranging over the EMP relation:
Q1. List the ename and age of the employees who are getting a
salary above 10,000.
Or
Some more
Q1. list the age of the employee whose eno is 102.
For practice:
1. List the employee name and age of the employees whose age
is below 35 or above 50 and whose sal is above 30000.
2. List the eno and sal of the employees who are getting a salary
between 20,000 and 50,000.
Domain Relational Calculus
• Domain Relational Calculus lists the attributes to be selected
from a relation, based on a certain condition.
• The formal definition of Domain Relational Calculus is as follows:
{<X1, X2, X3, . . . Xn> | P(X1, X2, X3, . . . Xn)}
• where X1, X2, X3, . . . Xn are domain variables (one per attribute)
and P is the condition.
• Domain relational calculus uses the same operators as tuple
calculus: the logical connectives ∧ (and), ∨ (or) and ¬ (not).
• It uses the existential (∃) and universal (∀) quantifiers to bind
variables.
{<article, page, subject> | <article, page, subject> ∈ studypoint ∧ subject = 'database'}
Output: This query yields the article, page, and subject from the
relation studypoint where the subject is 'database'.
Build DRC expression case
EMP(eno, ename, age, sal)
One variable needs to be assigned to each domain:
p = eno (variable p ranges over the domain of eno)
q = ename
r = age
s = sal
Q1. Build the DRC expression to find the age of Mr. X
{<r> | (∃p)(∃q)(∃s) (EMP(p, q, r, s) ∧ q = 'Mr. X')}
More …
Q2. List the ename of the employees who are getting a salary of 10000
and above.
{<q> | (∃p)(∃r)(∃s) (EMP(p, q, r, s) ∧ s ≥ 10000)}
Q3. List the employee name and age of the employees whose age is
below 35 or above 50 and whose sal is above 30000.
{<q, r> | (∃p)(∃s) (EMP(p, q, r, s) ∧ (r < 35 ∨ r > 50) ∧ s > 30000)}
For Practice:
1. List the employee id and ename of the employees whose age is
35 or above 50 and whose sal is below 30000.
2. List the eno and sal of the employees who are getting a salary
between 20,000 and 50,000.
Practice sessions(RA, TRC,DRC)
R(A,B) S(B,C) // two relations R and S
SQL> select A,C from R,S
where R.B = S.B
and S.C = 3;
Transform into RA, TRC, and DRC expression?
R(A,B) S(B,C) // two relations R and S
SQL> select A,C from R,S
where R.B = S.B
and S.C = 3;
Relational Algebra: -
πA,C(σS.C=3(R⋈S))
Tuple Relational Calculus:-
{t.A, u.C | R(t) ∧ S(u) ∧ t.B = u.B ∧ u.C = 3}
Task for you (RA, TRC,DRC)
R(A,B,C,D,E,F,G)
SQL> SELECT A,C,D,G
FROM R
WHERE R.C >5000 ;
Transform into RA, TRC, and DRC expression?
Queries
Normalizations on Relational Database
Normalization is a database design technique that
organizes tables in a manner that reduces redundancy and
dependency of data.
It divides larger tables into smaller tables and links them using
relationships.
Edgar Codd, the inventor of the relational model, proposed
the theory of normalization with the introduction of First
Normal Form, and he continued to extend the theory with
Second and Third Normal Form. Later he joined
Raymond F. Boyce to develop the theory of Boyce-Codd
Normal Form (BCNF).
Evolution of Normalization theories
First Normal Form(1NF)
• The First Normal Form (1NF) works on the concept of “atomicity”
of the values in every individual tuple of the tables present in the
database.
• That is, a relation is said to be in 1NF if every attribute in the
relation holds a single (atomic) value in each tuple.
Functional Dependency (FD)
A functional dependency (FD) is a constraint between two sets of
attributes in a relation.
It exists only when one attribute (or set of attributes)
functionally determines another.
• Functional Dependency in DBMS is denoted using an
arrow between two or more attributes such as FD : A
→B
• Here, A & B are the attributes present in any relation.
• “A → B” means, “B” is functionally dependent upon “A”
or “A” functionally determines “B”. Functional
dependency acts as a constraint between set of
attributes present in any database.
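Whether A → B holds in a given table instance can be checked mechanically: no two rows may agree on A and disagree on B. The sketch below assumes rows are stored as dictionaries; the function name fd_holds and the student rows are invented for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first time a key is seen, remember its value; afterwards,
        # any different value for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Made-up sample rows
rows = [
    {"rollno": 1, "name": "Asha", "dept": "CS"},
    {"rollno": 2, "name": "Ravi", "dept": "CS"},
    {"rollno": 3, "name": "Asha", "dept": "EE"},
]

rollno_determines_name = fd_holds(rows, ["rollno"], ["name"])
name_determines_dept = fd_holds(rows, ["name"], ["dept"])
```

Note that a check like this can only refute a dependency for the data at hand; a functional dependency is a constraint on every valid state of the relation, so it is declared by the designer rather than discovered from one instance.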
Example-1 : Consider a table student_details containing details of
some students.
Example : student_details Table
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {B, C}
{D}+ = {D, E}
{E}+ = {E}
Identifying Candidate Key
•“A Candidate Key of a relation is an attribute or set of attributes
that can determine the whole relation or contains all the
attributes in its closure."
Example-1 : Consider the relation R(A,B,C) with given functional
dependencies :
FD1 : A → B
FD2 : B → C
{A}+ = {A, B, C}
{B}+ = {B, C}
{C}+ = {C}
Clearly, “A” is the candidate key as, its closure contains all the
attributes present in the relation “R”.
Example-2 : Consider another relation R(A, B, C, D, E) having the
Functional dependencies :
FD1 : A → BC
FD2 : C → B
FD3 : D → E
FD4 : E → D
Now, calculating the closure of the attributes as :
{A}+ = {A, B, C}
{B}+ = {B}
{C}+ = {C, B}
{D}+ = {E, D}
{E}+ = {E, D}
In this case, a single attribute is unable to determine all the attribute on
its own like in previous example. Here, we need to combine two or
more attributes to determine the candidate keys.
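The closure computation and the search for candidate keys in the two examples can be sketched as follows. Functional dependencies are written as (left side, right side) pairs of attribute strings; the function names closure and candidate_keys are illustrative, and the brute-force search is only meant for small relations like these.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: repeatedly apply every FD whose left side is covered."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(relation, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append(combo)
    return keys

FDS_1 = [("A", "B"), ("B", "C")]                          # Example-1
FDS_2 = [("A", "BC"), ("C", "B"), ("D", "E"), ("E", "D")]  # Example-2

keys_1 = candidate_keys("ABC", FDS_1)
keys_2 = candidate_keys("ABCDE", FDS_2)
```

For Example-2 the search finds two candidate keys, {A, D} and {A, E}: A is needed to reach B and C, and either D or E is enough to reach the other.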
Challenging task to practice
Q. R = {A,B,C,D,E,F,G,H,I,J,K}
Introduction to SQL
What is SQL
• SQL stands for Structured Query Language.
• It is designed for managing data in a relational database
management system (RDBMS).
• It is pronounced S-Q-L or sometimes See-Qwell.
• SQL is a database language used for creating and deleting
databases, fetching rows, modifying rows, etc.
• SQL is based on relational algebra and tuple relational
calculus.
• RDBMSs such as MySQL, Oracle, MS Access, Sybase, Informix,
Postgres, and SQL Server use SQL as their standard database
language.
Why SQL is required
Tables and Views
SQL Statements
• Data manipulation language (DML): SELECT, INSERT, UPDATE, DELETE
• Data definition language (DDL): CREATE, ALTER, DROP, RENAME, TRUNCATE, COMMENT
• Data control language (DCL): GRANT, REVOKE
• Transaction control: COMMIT, ROLLBACK, SAVEPOINT
Data Types
DDL Commands
The CREATE TABLE statement is used to create a new table in a database.
Syntax
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
....
);
• The DROP TABLE statement is used to drop an existing table in a
database.
DROP TABLE table_name;
• The TRUNCATE TABLE statement is used to delete the data inside a table,
but not the table itself.
TRUNCATE TABLE table_name;
• The ALTER TABLE statement is used to add, delete, or modify columns
in an existing table.
• The ALTER TABLE statement is also used to add and drop various
constraints on an existing table.
• Columns can also be given a new name with the use of ALTER TABLE.
Syntax(MySQL, Oracle)
REGIONS (region_id, region_name)
COUNTRIES (country_id, country_name, region_id)
LOCATIONS (location_id, street_address, postal_code, city, state_province, country_id)
DEPARTMENTS (department_id, department_name, manager_id, location_id)
JOBS (job_id, job_title, min_salary, max_salary)
EMPLOYEES (employee_id, first_name, last_name, email, phone_number, hire_date, job_id, salary, commission_pct, manager_id, department_id)
JOB_HISTORY (employee_id, start_date, end_date, job_id, department_id)
DML Commands
• INSERT INTO TABLE_NAME (column1, column2, column3,...columnN)
VALUES (value1, value2, value3,...valueN);
OR
• INSERT INTO TABLE_NAME VALUES (value1,value2,value3,...valueN);
• Insert Multiple Rows
INSERT ALL
INTO table_name (column1, column2, column_n) VALUES (expr1, expr2, expr_n)
INTO table_name(column1, column2, column_n) VALUES (expr1, expr2, expr_n)
INTO table_name (column1, column2, column_n) VALUES (expr1, expr2, expr_n)
SELECT * FROM dual;
• SELECT column1, column2, columnN FROM table_name;
• SELECT * FROM table_name;
• SELECT statement with WHERE clause is as follows:
• SELECT column1, column2, columnN FROM table_name WHERE
[condition]
• Example:
• SQL> SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE
SALARY > 2000;
• AND operator with WHERE clause
• SELECT column1, column2, columnN FROM table_name WHERE
[condition1] AND [condition2]...AND [conditionN];
• Example:
• SQL> SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE
SALARY > 2000 AND age < 25;
• UPDATE table_name SET column1 = value1, column2 =
value2...., columnN = valueN WHERE [condition];
• Example:
• SQL> UPDATE CUSTOMERS SET ADDRESS = 'Pune' WHERE
ID = 6;
• You can combine N number of conditions using AND or OR
operators.
• DELETE query with WHERE clause is as follows:
• DELETE FROM table_name WHERE [condition];
• Example: SQL> DELETE FROM CUSTOMERS WHERE ID = 6;
• Delete All Records
• DELETE FROM table_name;
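The DML statements above can be tried end to end with Python's built-in sqlite3 module. This is only a sketch: the CUSTOMERS column list and the three rows are made up, the statements run on SQLite rather than Oracle, and ORDER BY ID is added so the result order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT, "
    "AGE INTEGER, ADDRESS TEXT, SALARY INTEGER)"
)

# INSERT with an explicit column list (sample rows are invented)
conn.executemany(
    "INSERT INTO CUSTOMERS (ID, NAME, AGE, ADDRESS, SALARY) VALUES (?, ?, ?, ?, ?)",
    [(1, "Ramesh", 32, "Ahmedabad", 2000),
     (4, "Chaitali", 25, "Mumbai", 6500),
     (6, "Komal", 22, "Bhopal", 4500)],
)

# SELECT with a WHERE clause
above_2000 = conn.execute(
    "SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE SALARY > 2000 ORDER BY ID"
).fetchall()

# SELECT with two conditions combined by AND
young_above_2000 = conn.execute(
    "SELECT ID, NAME, SALARY FROM CUSTOMERS "
    "WHERE SALARY > 2000 AND AGE < 25 ORDER BY ID"
).fetchall()

# UPDATE one row, then read the changed column back
conn.execute("UPDATE CUSTOMERS SET ADDRESS = 'Pune' WHERE ID = 6")
address_of_6 = conn.execute(
    "SELECT ADDRESS FROM CUSTOMERS WHERE ID = 6"
).fetchone()[0]

# DELETE one row and count what is left
conn.execute("DELETE FROM CUSTOMERS WHERE ID = 6")
remaining = conn.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()[0]
```

Leaving the WHERE clause off the UPDATE or DELETE would apply the change to every row, which is why the condition is shown in each example.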
SQL Constraints
Constraints are used to limit the type of data that can go into a
table. This ensures the accuracy and reliability of the data in the
table. If there is any violation between the constraint and the data
action, the action is aborted.
Oracle CREATE VIEW
Syntax:
CREATE VIEW view_name AS SELECT columns FROM tables
WHERE conditions;
Example:
CREATE TABLE "SUPPLIERS"
( "SUPPLIER_ID" NUMBER,
"SUPPLIER_NAME" VARCHAR2(4000),
"SUPPLIER_ADDRESS" VARCHAR2(4000)
);
Cont..
CREATE TABLE "ORDERS"
( "ORDER_NO." NUMBER,
"SUPPLIER_ID" NUMBER, -- added so that orders can be joined to suppliers
"QUANTITY" NUMBER,
"PRICE" NUMBER
);
Input the records, then create the view.
Create View Query:
CREATE VIEW sup_orders AS
SELECT suppliers.supplier_id, orders.quantity, orders.price
FROM suppliers
INNER JOIN orders
ON suppliers.supplier_id = orders.supplier_id
WHERE suppliers.supplier_name = 'VOJO';
Cont..
You can now check the Oracle VIEW by this query:
SELECT * FROM sup_orders;
Oracle Update VIEW
In Oracle, the CREATE OR REPLACE VIEW statement is used to
modify the definition of an Oracle VIEW without dropping it.
Syntax:
Cont..
Example:
Execute the following query to update the definition of Oracle VIEW called
sup_orders without dropping it.
Oracle DROP VIEW
The DROP VIEW statement is used to remove or delete the
VIEW completely.
Syntax:
Example:
Joins
• The basic syntax of INNER JOIN is as follows:
• SELECT table1.column1, table2.column2... FROM table1 INNER
JOIN table2 ON table1.common_field = table2.common_field;
• SQL> SELECT ID, NAME, AMOUNT, DATE FROM CUSTOMERS
INNER JOIN ORDERS ON CUSTOMERS.ID =
ORDERS.CUSTOMER_ID;
Oracle ORDER BY Example: (sorting in descending
order)
Oracle GROUP BY Clause
In Oracle GROUP BY clause is used with SELECT statement
to collect data from multiple records and group the
results by one or more columns.
Syntax:-
SELECT column1, column2 FROM table_name WHERE [ conditions ]
GROUP BY column1, column2
SQL> SELECT NAME, SUM(SALARY) FROM CUSTOMERS GROUP BY
NAME;
Oracle GROUP BY Example: (with COUNT function)
Customer
O/P
SELECT department,
MIN(salary) AS "Lowest salary"
FROM employees
GROUP BY department;
Oracle GROUP BY Example: (with MAX function)
EMPLOYEES
SELECT department,
MAX(salary) AS "Highest salary"
FROM employees
GROUP BY department;
Oracle HAVING Clause
In Oracle, HAVING Clause is used with GROUP BY Clause to restrict the groups of
returned rows where condition is TRUE.
SELECT expression1, expression2, ... expression_n,
aggregate_function (aggregate_expression)
FROM tables
WHERE conditions
GROUP BY expression1, expression2, ... expression_n
HAVING having_condition;
*********************************
aggregate_function : SUM, COUNT, MIN, MAX or AVG functions.
Oracle HAVING Example: (with GROUP BY COUNT function)
Customer O/P
Oracle HAVING Example: (with GROUP BY MIN function)
EMPLOYEES
SELECT department,
MIN(salary) AS "Lowest salary"
FROM employees
GROUP BY department
HAVING MIN(salary) < 15000;
Oracle HAVING Example: (with GROUP BY MAX function)
EMPLOYEES
SELECT department,
MAX(salary) AS "Highest salary"
FROM employees
GROUP BY department
HAVING MAX(salary) > 30000;
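The GROUP BY and HAVING queries above can be run against a small made-up employees table. The sketch uses Python's sqlite3 module rather than Oracle, so it only illustrates the grouping logic; the rows are invented and ORDER BY department is added to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "Sales", 12000), ("Ravi", "Sales", 35000),
     ("Mina", "IT", 20000), ("Omar", "IT", 28000),
     ("Lata", "HR", 14000)],
)

# One row per department: GROUP BY collapses each group to a single row
per_department = conn.execute(
    "SELECT department, COUNT(*) FROM employees "
    "GROUP BY department ORDER BY department"
).fetchall()

# HAVING filters whole groups after aggregation
lowest = conn.execute(
    'SELECT department, MIN(salary) AS "Lowest salary" FROM employees '
    "GROUP BY department HAVING MIN(salary) < 15000 ORDER BY department"
).fetchall()

highest = conn.execute(
    'SELECT department, MAX(salary) AS "Highest salary" FROM employees '
    "GROUP BY department HAVING MAX(salary) > 30000 ORDER BY department"
).fetchall()
```

WHERE would filter individual rows before grouping; HAVING is applied afterwards, which is why it can refer to MIN(salary) and MAX(salary).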
LIKE Clause
• SQL LIKE clause is used to compare a value to similar values using wildcard
operators.
• There are two wildcards used in conjunction with the LIKE operator:
• The percent sign (%) - The percent sign represents zero, one, or multiple
characters.
• The underscore (_) - The underscore represents a single number or
character. The symbols can be used in combinations.
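The two wildcards can be tried with a short sqlite3 sketch. The table and names are invented for illustration. One caveat: SQLite's LIKE is case-insensitive for ASCII letters by default, whereas Oracle's is case-sensitive, so the patterns here are chosen so that both would give the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany("INSERT INTO people VALUES (?)",
                 [("Tara",), ("Tom",), ("Rita",), ("Amit",)])

def names_like(pattern):
    """Return the names matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM people WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [name for (name,) in rows]

starts_with_t = names_like("T%")   # % matches zero or more characters
ends_with_a = names_like("%a")
o_in_middle = names_like("_o_")    # each _ matches exactly one character
```

The pattern `_o_` matches only three-character names with "o" in the middle, which shows how the underscore fixes the length while the percent sign does not.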
TOP Clause
• EXCEPT Clause
• SELECT column1 [, column2 ] FROM table1 [, table2 ] [WHERE
condition] EXCEPT SELECT column1 [, column2 ] FROM table1 [,
table2 ] [WHERE condition]
SQL> SELECT City FROM Customers
EXCEPT
SELECT City FROM Suppliers
ORDER BY City;
SQL Aggregate Functions
• COUNT Function
SELECT COUNT(*) FROM Customers;
SELECT COUNT(*) FROM Customers WHERE salary>=2000;
• SUM Function
SELECT SUM(Salary) FROM Customers;
SELECT SUM(Salary) FROM Customers WHERE salary>=2000;
• AVG function
SELECT AVG(Salary) FROM Customers;
• MAX Function
SELECT MAX(Salary) FROM Customers;
• MIN Function
SELECT MIN(Salary) FROM Customers;
Alias
• Queries:
1. find those employees who get higher salary than the employee
whose ID is 163.
2. find those employees whose salary matches the smallest salary
of any of the departments.
3. find those employees who report that manager whose first
name is ‘Ramesh’.
4. find the employee whose salary is 3000 and reporting person’s
ID is 121.
1. find those employees whose ID matches any of the
number 134, 159 and 183.
2. find those employees who do not work in those
departments where manager ids are in the range 100,
200.
3. find those employees who get second-highest salary.
4. find those employees who work in the same department
where ‘Clara’ works.
5. find those employees who work in a department where
an employee's first name contains the letter 'T'.
6. find those employees who earn more than the average
salary and work in a department with any employee
whose first name contains the character 'J'.
Answer
1. SELECT first_name, last_name FROM employees WHERE salary >
( SELECT salary FROM employees WHERE employee_id=163 );
2. SELECT first_name, last_name, salary, department_id FROM
employees WHERE salary IN ( SELECT MIN(salary) FROM
employees GROUP BY department_id );
3. SELECT first_name, last_name, employee_id, salary FROM
employees WHERE manager_id = (SELECT employee_id FROM
employees WHERE first_name = 'Ramesh' );
4. SELECT * FROM employees WHERE (salary,manager_id)= (SELECT
3000,121);
5. SELECT * FROM employees WHERE employee_id IN (134,159,183);
6. SELECT * FROM employees WHERE department_id NOT IN (SELECT
department_id FROM departments WHERE manager_id BETWEEN
100 AND 200);
Answer
1. SELECT * FROM employees WHERE employee_id IN (SELECT
employee_id FROM employees WHERE salary = (SELECT
MAX(salary) FROM employees WHERE salary < (SELECT
MAX(salary) FROM employees)));
2. SELECT first_name, last_name, hire_date FROM employees
WHERE department_id = ( SELECT department_id FROM
employees WHERE first_name = 'Clara') AND first_name <>
'Clara';
3. SELECT employee_id, first_name, last_name FROM employees
WHERE department_id IN ( SELECT department_id FROM
employees WHERE first_name LIKE '%T%' );
4. SELECT employee_id, first_name , salary FROM employees
WHERE salary > (SELECT AVG (salary) FROM employees ) AND
department_id IN ( SELECT department_id FROM employees
WHERE first_name LIKE '%J%');
Query Processing
It is the step-by-step process of translating a query written in a
high-level language into a low-level form that the machine can
understand, and then performing the requested action for the user.
Steps in Query Processing
Translation Example
Tree Representation of Relational Algebra
• A query tree is a tree data structure representing a relational algebra expression.
• The tables of the query are represented as leaf nodes. The relational algebra
operations are represented as the internal nodes. The root represents the query
as a whole.
π ename (σ balance<2500 (account))
Query tree (root at the top, leaf at the bottom):
π ename
  σ balance<2500
    account
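A query tree like this one can be represented and evaluated bottom-up with a small recursive function. The sketch below is one possible encoding, not a standard one: each node is a tuple of (operator, argument, child), the leaf holds the table rows, and the account rows are invented.

```python
# Made-up rows for the account table
account = [
    {"ename": "Asha", "balance": 1800},
    {"ename": "Ravi", "balance": 3200},
    {"ename": "Mina", "balance": 2400},
]

# Root: projection; internal node: selection; leaf: the table itself
tree = ("project", ["ename"],
        ("select", lambda row: row["balance"] < 2500,
         ("table", account)))

def evaluate(node):
    """Evaluate a query tree from the leaves up to the root."""
    op = node[0]
    if op == "table":
        return node[1]
    rows = evaluate(node[2])  # evaluate the child first
    if op == "select":
        return [r for r in rows if node[1](r)]
    if op == "project":
        # Keeps duplicates, unlike the set-based projection of relational algebra
        return [{a: r[a] for a in node[1]} for r in rows]
    raise ValueError(f"unknown operator: {op}")

result = evaluate(tree)
```

The leaf is evaluated first and the root last, which matches the reading order of the tree: tables at the bottom, the final answer at the top.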
Optimization
Rule-1: Perform the selection operation first.
By doing so, we can reduce the number of records involved in
the query, rather than using the whole tables throughout the
query.
Rule-2: Perform all the projection as early as possible in the
query.
This is similar to selection but will reduce the number of columns
in the query.
Rule-3: Perform the most restrictive joins and selection
operations first.
The most restrictive joins and selections are those over the
sets of tables and views that result in comparatively
fewer records.
Inefficient way:
• ∏ STD_NAME, ADDRESS, AGE, CLASS_NAME, TEACHER_NAME ((STUDENT ⋈(CLASS_ID) CLASS) ⋈(TECH_ID) TEACHER)
Efficient way:
• ∏ STD_NAME, ADDRESS, AGE, CLASS_NAME, TEACHER_NAME (STUDENT ⋈(CLASS_ID) (CLASS ⋈(TECH_ID) TEACHER))
Evaluation
Cont..
Equivalence plan-01
πAC(σC=3(R⋈S))
Equivalence plan-02
πAC(σC=3(S) ⋈ (R))
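That the two plans return the same answer can be checked on a small instance. The sketch below models R(A, B) and S(B, C) as lists of tuples with invented values; join, plan_1 and plan_2 are illustrative names.

```python
def join(r, s):
    """Natural join of R(A, B) and S(B, C) on B; yields (A, B, C) tuples."""
    return [(a, b, c) for a, b in r for b2, c in s if b == b2]

def plan_1(r, s):
    """πAC(σC=3(R ⋈ S)): join first, then select, then project."""
    return {(a, c) for a, b, c in join(r, s) if c == 3}

def plan_2(r, s):
    """πAC(σC=3(S) ⋈ R): select on S first, then join, then project."""
    s_filtered = [(b, c) for b, c in s if c == 3]
    return {(a, c) for a, b, c in join(r, s_filtered)}

# Made-up sample relations
R = [(1, 10), (2, 20), (3, 30)]
S = [(10, 3), (20, 5), (30, 3), (40, 3)]

same_result = plan_1(R, S) == plan_2(R, S)
```

Both plans return the same set of (A, C) pairs; plan_2 simply hands a smaller S to the join, which is where the saving in comparisons comes from.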
Cont.. Design Query Tree for each plan
QUERY TREE-01
π AC
  σ C=3
    ⋈
      R
      S
QUERY TREE-02
π AC
  ⋈
    σ C=3
      S
    R
Cont..
Both query trees are equivalent plans, as
they produce the same result.
But the processing speed differs between the two query
trees/plans.
Out of the two query plans, choose the optimal plan, the one that
takes less processing time:
Less selection time or search time
Less matching time or comparison time
Cont..
Cont.. Case scenario analysis
How do we select an Optimal plan
Cont.. Design Query Tree for each plan
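QUERY TREE-01 (annotated with tuple counts)
π AC
  σ C=3
    ⋈ (100 x 100 comparisons)
      R (100 tuples)
      S (100 tuples)
QUERY TREE-02 (annotated with tuple counts)
π AC
  ⋈ (10 x 100 comparisons)
    σ C=3 (10 tuples selected)
      S (100 tuples)
    R (100 tuples)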
Cont..
Total minimum CPU time for QUERY plan-01
Total minimum CPU time = 100 x 100 + 100 x 100 = 20,000
unit time
Total minimum CPU time for QUERY plan-02
Total minimum CPU time= 100 + 10x100 = 1100 unit time
Assume one comparison consumes 1 unit time and one
selection consumes 1 unit time.
From the computation above, we observe that Plan-02
consumes less CPU time and is therefore the better
plan compared with Plan-01.
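The arithmetic behind the two totals can be written out as a tiny cost model. It follows the slide's assumptions (one unit per comparison, one unit per selection test, 100 tuples in each relation, 10 tuples of S with C = 3) and, for Plan-01, assumes the join output can be as large as |R| x |S|; the function names are illustrative.

```python
def plan_1_cost(r_size, s_size):
    """Cost of πAC(σC=3(R ⋈ S)) in unit time."""
    join_cost = r_size * s_size       # compare every R tuple with every S tuple
    selection_cost = r_size * s_size  # test C = 3 on up to |R| x |S| joined tuples
    return join_cost + selection_cost

def plan_2_cost(r_size, s_size, selected):
    """Cost of πAC(σC=3(S) ⋈ R) in unit time."""
    selection_cost = s_size           # test C = 3 once per S tuple
    join_cost = selected * r_size     # join only the selected S tuples with R
    return selection_cost + join_cost

cost_1 = plan_1_cost(100, 100)
cost_2 = plan_2_cost(100, 100, selected=10)
```

The gap grows with the size of the relations and with the selectivity of C = 3: the fewer tuples survive the selection, the cheaper the join in the second plan.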
Task
Consider Three relations i.e.
STUDENT (SID, SNAME), COURSE (CID, CNAME),
ASSIGNS (SID, CID)
Each relation has 1000 tuples.
The query is :-
Find the CNAME of Mr. X
Transaction states (figure): Begin, Active, Failed, Abort, Terminated
ACID Properties
• Atomicity: A transaction is a single unit of operation. You
either execute it entirely or do not execute it at all. There
cannot be partial execution.
• Consistency: Once the transaction is executed, it should
move from one consistent state to another.
• Isolation: Transaction should be executed in isolation from
other transactions (no Locks). During concurrent transaction
execution, intermediate transaction results from
simultaneously executed transactions should not be made
available to each other. (Level 0,1,2,3)
• Durability: After successful completion of a transaction, the
changes in the database should persist. Even in the case of
system failures.
Atomicity
Atomicity involves the following two operations:
• Abort: If a transaction aborts then all the changes made are not
visible.
• Commit: If a transaction commits then all the changes made are
visible.
Example: Assume a transaction T consisting of two parts, T1
and T2. Account A holds Rs 600 and account B holds Rs 300. T transfers
Rs 100 from account A to account B.
After completion of the transaction, A holds Rs 500 and B
holds Rs 400.
T1 | T2
Read(A) | Read(B)
A := A - 100 | B := B + 100
Write(A) | Write(B)
Consistency
• The integrity constraints are maintained so that the database is
consistent before and after the transaction.
• The execution of a transaction will leave a database in either its
prior stable state or a new stable state.
• The consistent property of database states that every transaction
sees a consistent database instance.
• The transaction is used to transform the database from one
consistent state to another consistent state.
• For example: The total amount must be maintained before or
after the transaction.
• Total before T occurs = 600+300=900
• Total after T occurs= 500+400=900
• Therefore, the database is consistent. In the case when T1 is
completed but T2 fails, then inconsistency will occur.
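The transfer example can be reproduced with sqlite3 to see atomicity protect consistency. In this sketch the account table, the transfer function and the fail_midway flag are assumptions made for illustration; the failure is simulated by raising an exception between the debit (T1) and the credit (T2).

```python
import sqlite3

def transfer(conn, source, target, amount, fail_midway=False):
    """Move amount from source to target as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, source))
            if fail_midway:
                raise RuntimeError("simulated failure after T1, before T2")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, target))
        return True
    except RuntimeError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 600), ("B", 300)])
conn.commit()
```

Calling `transfer(conn, "A", "B", 100, fail_midway=True)` leaves the balances at 600 and 300 because the debit is rolled back; a normal call moves them to 500 and 400. In both cases the total stays at 900.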
Isolation
Schedules: Serial; Parallel / Non-serial; Serializability
Serial Schedule
• To test a schedule S for serializability, build a precedence graph
with one node per transaction and add edges as follows.
• Create an edge Ti → Tj if Ti executes write(A) before Tj executes
read(A).
• Create an edge Ti → Tj if Ti executes read(A) before Tj executes
write(A).
• Create an edge Ti → Tj if Ti executes write(A) before Tj executes
write(A).
• If the precedence graph contains an edge Ti → Tj, then in any
serial schedule equivalent to S, all the instructions of Ti are
executed before the first instruction of Tj.
• If the precedence graph for schedule S contains a cycle, then S
is not conflict serializable. If the precedence graph has no cycle,
then S is conflict serializable.
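The rules above can be sketched in Python. This is an illustrative sketch under stated assumptions: a schedule is represented as a list of (transaction, operation, item) tuples, and the function names precedence_edges and has_cycle are hypothetical. An edge is added whenever two operations of different transactions touch the same item and at least one of them is a write, which covers the three cases listed above.

```python
def precedence_edges(schedule):
    """Return the set of edges (Ti, Tj) of the precedence graph.

    `schedule` is a list of (transaction, operation, item) tuples in
    execution order, with operation "R" or "W".
    """
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: same item, different transactions, at least one write.
            if item_i == item_j and ti != tj and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by `edges`."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True  # back edge: the graph has a cycle
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)
```

For the made-up schedule R1(A), W2(A), W1(A) the edges are T1 → T2 and T2 → T1, so has_cycle returns True and the schedule is not conflict serializable.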
Example-1
Read(A): In T1, no subsequent writes to A, so
no new edges
Read(B): In T2, no subsequent writes to B, so
no new edges
Read(C): In T3, no subsequent writes to C, so
no new edges
Write(B): B is subsequently read by T3, so add
edge T2 → T3
Write(C): C is subsequently read by T1, so add
edge T3 → T1
Write(A): A is subsequently read by T2, so add
edge T1 → T2
Write(A): In T2, no subsequent reads to A, so
no new edges
Write(C): In T1, no subsequent reads to C, so
no new edges
Write(B): In T3, no subsequent reads to B, so no
new edges
Example-2
Read(A): In T4, no subsequent writes to A, so no
new edges
Read(C): In T4, no subsequent writes to C, so no
new edges
Write(A): A is subsequently read by T5, so add
edge T4 → T5
Read(B): In T5, no subsequent writes to B, so no
new edges
Write(C): C is subsequently read by T6, so add
edge T4 → T6
Write(B): B is subsequently read by T6, so add
edge T5 → T6
Write(C): In T6, no subsequent reads to C, so no
new edges
Write(A): In T5, no subsequent reads to A, so no
new edges
Write(B): In T6, no subsequent reads to B, so no
new edges
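The edges collected in Example-1 and Example-2 can be checked with a topological sort. The sketch below uses Python's standard graphlib module (Python 3.9 or later); the function name serial_order is hypothetical. If a topological order exists, it is an equivalent serial order of the transactions; if the graph has a cycle, no such order exists.

```python
from graphlib import CycleError, TopologicalSorter


def serial_order(edges):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    predecessors = {}
    for before, after in edges:
        predecessors.setdefault(after, set()).add(before)
        predecessors.setdefault(before, set())
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


# Edges taken from the two examples above.
example_1 = [("T2", "T3"), ("T3", "T1"), ("T1", "T2")]
example_2 = [("T4", "T5"), ("T4", "T6"), ("T5", "T6")]
```

Example-1 contains the cycle T1 → T2 → T3 → T1, so serial_order(example_1) returns None and the schedule is not conflict serializable. Example-2 has no cycle, and serial_order(example_2) gives the order T4, T5, T6.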
View Serializable
• A schedule is called view serializable if it is view equivalent to a
serial schedule (no overlapping transactions).
• Every conflict serializable schedule is also view serializable, but a
view serializable schedule that contains blind writes may not be
conflict serializable.
T1 T2 T3 T1 T2 T3
R(a) R(a)
a=a+50 a=a+30
W(a) W(a)
a=a+30 a=a+50
W(a) W(a)
a=a-20 a=a-20
W(a) W(a)
Concurrency Control
Database concurrency
The term concurrency may be defined as concurrent or
simultaneous operations by more than one transaction
on a data item in the database.
Under concurrency control, multiple transactions can be
executed simultaneously.
This may affect the transaction results, so it is highly
important to maintain the order of execution of those
transactions.
Problems of concurrency control
Several problems can occur when concurrent transactions
are executed in an uncontrolled manner.
The following are three problems in concurrency control:
• Lost updates
• Dirty read
• Incorrect summary
1. Lost updates
The lost update problem occurs when two transactions access the same
database item for update operations in a way that leaves the value of
that item incorrect.
If two transactions X and Y both read a data item and then update it, the
effect of the first update is overwritten by the second
update.
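A lost update can be replayed with a fixed interleaving. This is a minimal sketch: the starting value 100 and the amounts +50 and -30 are made-up numbers, and the database is modelled as a plain dictionary rather than a real DBMS.

```python
def lost_update_demo(initial=100):
    """Replay a fixed interleaving in which Y overwrites X's update."""
    db = {"item": initial}

    x_local = db["item"]        # X reads the item
    y_local = db["item"]        # Y reads the same value, before X writes

    x_local += 50               # X computes its new value
    db["item"] = x_local        # X writes

    y_local -= 30               # Y computes from its stale copy
    db["item"] = y_local        # Y writes, overwriting X's update

    serial_result = initial + 50 - 30  # what a serial execution would give
    return db["item"], serial_result
```

With the default starting value the function returns (70, 120): the final value 70 reflects only Y's update, while a serial execution of X and Y would have produced 120.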
2. Dirty Read
A dirty read occurs when one transaction updates a database
item and then fails for some reason, while another transaction
reads the updated item before the change is rolled back
(that is, T1 fails before it commits).
This is known as the Dirty Read Problem, because one
transaction reads a dirty value that has not been committed.
T1                  T2
Read_item(x)
x = x - 500
Write_item(x)
                    Read_item(x)
                    x = x + 1000
                    Write_item(x)
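The dirty read scenario can be replayed step by step. This sketch assumes a starting value of 2000 for x (the slides do not give one) and models committed and uncommitted state as two dictionaries; it is an illustration, not how a real DBMS stores data.

```python
def dirty_read_demo(x=2000):
    """T1 writes x without committing, T2 reads it, then T1 rolls back."""
    committed = {"x": x}
    uncommitted = dict(committed)   # working copy that T1 modifies

    uncommitted["x"] -= 500         # T1: x = x - 500, Write_item(x), no commit yet

    t2_seen = uncommitted["x"]      # T2: Read_item(x) sees the uncommitted value
    t2_result = t2_seen + 1000      # T2: x = x + 1000

    uncommitted = dict(committed)   # T1 fails: its write is rolled back

    return committed["x"], t2_seen, t2_result
```

The function returns (2000, 1500, 2500): after the rollback the committed value is still 2000, but T2 computed its result from 1500, a value that was never committed.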
3. Incorrect summary
Concurrency Control Protocol
Concurrency control protocols ensure atomicity, isolation,
and serializability of concurrent transactions.
Concurrency control protocols can be divided into three
categories:
Binary Lock Protocol
If lock(x) = 1, data item x is locked and other transactions
cannot access it.
Disadvantage:
1. The DBMS will not allow two transactions to read the same
database object at the same time.
Shared/Exclusive Lock protocol
1. Shared lock:
It is also known as a read-only lock. Under a shared lock, the data item can
only be read by the transaction.
It can be shared between transactions because a transaction that
holds a shared lock cannot update the data item.
2. Exclusive lock:
Under an exclusive lock, the data item can be both read and written
by the transaction.
This lock is exclusive: multiple transactions cannot
modify the same data simultaneously.
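The compatibility of shared and exclusive locks can be sketched as a small lock table. The class name LockTable and its methods are hypothetical, and the sketch is deliberately simplified: it only answers whether a request could be granted right now, and it does not handle waiting queues, repeated requests by the same transaction, or lock upgrades.

```python
class LockTable:
    """Grant shared ("S") and exclusive ("X") locks on data items."""

    def __init__(self):
        self._locks = {}  # item -> (mode, set of holder transaction ids)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        if mode not in ("S", "X"):
            raise ValueError("mode must be 'S' or 'X'")
        held = self._locks.get(item)
        if held is None:
            self._locks[item] = (mode, {txn})  # item is free: grant any mode
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)                   # shared locks are compatible
            return True
        return False                           # any combination with X conflicts

    def release(self, txn, item):
        """Drop txn's lock on item, freeing the item when no holders remain."""
        held = self._locks.get(item)
        if held is None:
            return
        _mode, holders = held
        holders.discard(txn)
        if not holders:
            del self._locks[item]
```

Two transactions can hold a shared lock on the same item at once, but an exclusive request is refused until every other holder has released its lock.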
Two-phase locking protocol (2PL)
A transaction is said to follow the Two-Phase Locking protocol if
its locking and unlocking are done in two phases.
Growing Phase: New locks on data items may be acquired but
none can be released.
Shrinking Phase: Existing locks may be released but no new
locks can be acquired.
What is LOCK POINT?
The point at which the growing phase ends, i.e., when a transaction takes the final lock it
needs to carry on its work.
2-PL ensures serializability, but it still has some drawbacks.
• Cascading Rollback is possible under 2-PL.
• Deadlocks and Starvation are possible.
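Whether a transaction's lock and unlock steps follow 2-PL, and where its lock point falls, can be checked mechanically. In this sketch the input format, a list of ("lock", item) and ("unlock", item) pairs, and the function name check_two_phase are assumptions made for illustration.

```python
def check_two_phase(operations):
    """Return (is_two_phase, lock_point_index) for one transaction.

    `operations` lists the transaction's ("lock", item) and
    ("unlock", item) steps in the order they are issued. The lock point
    is the index of the last lock step, or None if the sequence violates
    2-PL or contains no lock at all.
    """
    shrinking = False
    last_lock = None
    for index, (action, _item) in enumerate(operations):
        if action == "lock":
            if shrinking:
                return False, None  # a lock after an unlock violates 2-PL
            last_lock = index
        elif action == "unlock":
            shrinking = True        # the first unlock starts the shrinking phase
        else:
            raise ValueError(f"unknown action: {action!r}")
    return True, last_lock
```

For the sequence lock A, lock B, unlock A, unlock B the result is (True, 1): the second lock step is the lock point. The sequence lock A, unlock A, lock B, unlock B is rejected because a lock is acquired after one has been released.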
Categories of Two Phase Locking
• Strict 2-PL
• Rigorous 2-PL
• Conservative 2-PL
• Strict 2-PL:
• This requires that, in addition to the locking being 2-Phase, all
Exclusive(X) locks held by the transaction are not released
until after the Transaction Commits.
Following Strict 2-PL ensures that our schedule is:
• Recoverable
• Cascadeless
• Hence, it gives us freedom from the Cascading Abort that was
still possible in Basic 2-PL, and moreover it guarantees Strict
Schedules. Deadlocks are still possible, however.
• Rigorous 2-PL
• This requires that, in addition to the locking being 2-Phase, all
Exclusive(X) and Shared(S) locks held by the transaction are not
released until after the Transaction Commits.
Following Rigorous 2-PL ensures that our schedule is:
• Recoverable
• Cascadeless
• Conservative 2-PL
• A Static 2-PL, this protocol requires the transaction to lock
all the items it accesses before the Transaction begins
execution, by predeclaring its read-set and write-set.
• If any of the predeclared items cannot be locked,
the transaction does not lock any of the items; instead, it
waits until all the items are available for locking.
• Conservative 2-PL is Deadlock free, but it does not
ensure a Strict schedule.
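The all-or-nothing locking step of Conservative 2-PL can be sketched as follows. This is a simplified illustration: every lock is treated as exclusive, the set locked_items stands in for the DBMS lock table, and the function name try_lock_all is hypothetical.

```python
def try_lock_all(locked_items, read_set, write_set):
    """Lock every predeclared item, or none of them.

    `locked_items` is the set of items currently locked by other
    transactions; it is updated in place when the locks are granted.
    Returns the set of newly locked items, or None if the transaction
    must wait.
    """
    needed = set(read_set) | set(write_set)
    if needed & locked_items:
        return None              # at least one item is busy: take nothing, wait
    locked_items |= needed       # all items are free: lock them together
    return needed
```

Because a waiting transaction holds no locks at all, the hold-and-wait situation cannot arise, which is why this variant is deadlock free.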
Timestamp Ordering Protocol
The Timestamp Ordering Protocol is used to order the
transactions based on their Timestamps.
Transactions are ordered in ascending order of their
creation time.
The older transaction has higher priority, which is why it
executes first.
To determine the timestamp of the transaction, this protocol
uses the system time or a logical counter.
Basic Timestamp ordering protocol
• 1. Check the following condition whenever a transaction Ti
issues a Read(X) operation:
• If TS(Ti) < W_TS(X), then the operation is rejected.
• Otherwise, the operation is executed.
• The read timestamp of the data item is then updated.
• Set R_TS(X) = max(R_TS(X), TS(Ti))
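The read rule above can be sketched in a few lines of Python. The DataItem class, its field names, and the use of integer timestamps are assumptions made for illustration; only the read check described above is implemented.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    value: int
    r_ts: int = 0  # R_TS(X): largest timestamp of a transaction that read X
    w_ts: int = 0  # W_TS(X): largest timestamp of a transaction that wrote X


def read(item, ts):
    """Apply the read rule for a transaction Ti with timestamp TS(Ti) = ts.

    Returns the value if the read is allowed, or None if it is rejected.
    """
    if ts < item.w_ts:
        return None                  # a younger transaction already wrote X
    item.r_ts = max(item.r_ts, ts)   # R_TS(X) = max(R_TS(X), TS(Ti))
    return item.value
```

With W_TS(X) = 5, a transaction with timestamp 3 is rejected, while a transaction with timestamp 7 reads the value and raises R_TS(X) to 7. A later read by a transaction with timestamp 6 is also allowed and leaves R_TS(X) at 7.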
Deadlock in DBMS
A deadlock is a condition where two or more transactions
are waiting indefinitely for one another to give up locks.
No task ever gets finished; each remains in a waiting state forever.
Conditions under which deadlock occurs
1. Hold and wait
2. Mutual exclusion
3. No Preemption
4. Circular wait
Wait-Die scheme
• If a transaction requests a resource that is already held
with a conflicting lock by another transaction, then the DBMS
simply checks the timestamps of both transactions.
• It allows the older transaction to wait until the resource is
available for execution.