DBMS Solution
b. Three-schema architectures.
Solution: Data abstraction is the process of giving users a simple interface by hiding
the underlying complexities of data management from them. Database management systems
provide data abstraction through the three-level schema architecture.
Physical level
It is the lowest level of abstraction and describes how the data are actually stored
on disk and some of the access mechanisms commonly used for retrieving this data.
While designing this layer, the main objective is to optimize performance by
minimizing the number of disk accesses during the various database operations.
The DBMS developer is the person who deals with this level. The database administrator
may be aware of certain details of the physical organization of the data.
Fig: Three-level schema architecture (external/view level with View 1, View 2, ..., View N,
above the stored database)
Logical level
It is the next higher level of data abstraction. It describes what data are stored in
the database and what relationships exist among those data. It is also known as the
conceptual level, and the schema at this level is called the logical schema (conceptual
schema). The logical level describes the stored data in terms of the data model of the
DBMS. In a relational DBMS, the conceptual schema is described by the relations
(tables) that are stored in the database. Programmers and database administrators
work at this level of abstraction. Database users do not need to have knowledge of
this level.
View level
It is the highest level of abstraction; it describes only a part of the database and
hides the rest from the user. At the view level, users see a set of
application programs that hide details of data types. Several views of the database
can be defined at this level, and each database user sees only these views.
The schema at this level is called the external schema (subschema). A view is conceptually a
relation, but the records in a view are not stored in the DBMS. Rather, they are
computed, using a definition for the view, from relations stored in the DBMS. Views
also play a vital role as a security mechanism that prevents users from accessing
certain parts of the database (that is, views can hide information such as an
employee's salary for security purposes).
Composite attributes are useful to model situations in which a user sometimes refers
to the composite attribute as a unit but at other times refers specifically to its
components. If the composite attribute is referenced only as a whole, there is no
need to subdivide it into component attributes. Examples of simple and composite
attributes are shown below:
Fig: Simple and composite attributes. Labels in the figure: entity Student with Roll-No,
Name, Address and District; entity Customer with Name, Age, Address, Date-of-birth,
Contact-no and Dependent-name.
e. The difference among a relationship instance, a relationship type, and
relationship set.
Solution: An association among particular entities, one from each participating entity
set, is called a relationship instance.
An association among two or more entity sets is called a relationship set.
In the above figure, "teacher teaches student" is a relationship set. A particular
case of it, for example "Bhupi teaches Aayan on Sunday", is a relationship
instance. Finally, if every teacher has the same attributes and every student has the
same attributes, the association among them is called a relationship type. A
relationship type may have one of the following cardinalities:
one to one
one to many
many to one
many to many
2. a. Draw an ER diagram for database showing Bank. Each Bank can have multiple
branches, and each branch can have multiple accounts and loans.
Solution:
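Fig: ER diagram for the Bank database. Labels in the diagram:
Entities: BANK (Code, Name, Address), BRANCH (Branch_no, Address), ACCOUNT (Acc_no, Balance),
LOAN (Loan_no, Type, Amount), CUSTOMER (Name, Address, Sex, Phone_no)
Relationships: BANK HAVE BRANCH (1:N), BRANCH HAVE ACCOUNT (1:N), BRANCH HAVE LOAN (1:N),
CUSTOMER DEPOSITS ACCOUNT, CUSTOMER BORROW LOAN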
2. b. In what sense does a relational calculus differ from relational algebra, and in what
sense are they similar?
Solution: The differences between relational calculus and relational algebra are listed
below:
1. A relational algebra (RA) query is an expression built by applying operators to
relations, whereas a relational calculus (RC) query is a formula (predicate) that
describes the tuples wanted in the result.
2. RA has operators such as join, union, intersection, division, difference, projection
and selection, whereas RC has tuple-oriented and domain-oriented expressions.
3. RA is a procedural language, whereas RC is a non-procedural (declarative) query language.
4. Because an RA query is a sequence of operations, it is often easier to modify step by
step than an RC formula.
5. RA is generally considered easier to manipulate and understand than RC.
6. RA is not more powerful than RC: the two have the same expressive power (see
similarity 2 below); they differ only in how a query is stated.
The similarities between relational calculus and relational algebra are listed below:
1. Both relational algebra and relational calculus are formal languages associated with
the relational model that are used to specify basic retrieval requests.
2. The expressive power of RA and (safe) RC is equivalent. This means any query that can be
expressed in RA can also be expressed by a formula in RC, and vice versa.
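The procedural versus non-procedural distinction can be sketched in Python. This is only an illustration under assumptions: the sample employee rows and the helper names select and project are hypothetical and are not part of the original solution.

```python
# Hypothetical sample relation; each tuple is represented as a dict.
employee = [
    {"name": "Bhupi", "dept": "Computer", "salary": 40000},
    {"name": "Bindu", "dept": "Finance", "salary": 30000},
    {"name": "Arjun", "dept": "Computer", "salary": 60000},
]


def select(relation, predicate):
    """Relational algebra selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Relational algebra projection: keep the listed attributes, drop duplicates."""
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in result:
            result.append(row)
    return result


# Algebra style: an explicit sequence of operations (select first, then project).
ra_result = sorted(project(select(employee, lambda t: t["salary"] > 35000), ["name"]))

# Calculus style: state what is wanted, { t.name | Employee(t) AND t.salary > 35000 }.
rc_result = sorted({(t["name"],) for t in employee if t["salary"] > 35000})

print(ra_result == rc_result)
```

Both formulations return the same set of names, which is the sense in which the two languages are equivalent in expressive power even though one states the steps and the other states the condition.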
SELECT DISTINCT e.name
FROM Employee e, Supervise s, Works w, Company c
WHERE c.address = 'Pokhara' AND e.ss# = s.supervisor_ss# AND s.supervisor_ss# =
w.ss# AND w.cname = c.cname;
ii. Find the names of all the companies that have more than 4 supervisors.
SELECT c.cname, COUNT(DISTINCT s.supervisor_ss#) AS nos
FROM Company c, Works w, Supervise s
WHERE c.cname = w.cname AND w.ss# = s.supervisor_ss#
GROUP BY c.cname
HAVING COUNT(DISTINCT s.supervisor_ss#) > 4;
iii. Find the name of the supervisor who has the largest number of employees.
SELECT e.name, COUNT(s.employee_ss#) AS noe
FROM Employee e, Supervise s
WHERE e.ss# = s.supervisor_ss#
GROUP BY e.ss#, e.name
HAVING COUNT(s.employee_ss#) >= ALL (SELECT COUNT(employee_ss#)
FROM Supervise GROUP BY supervisor_ss#);
b) What is a view in SQL and how is it defined? Explain how views are typically
implemented.
Solution: A database view is a logical table. It does not physically store data like a table
but presents data stored in the underlying tables in a different form. A view does not
require disk space for its rows, and we can use a view in most places where a table can be used. Since
views are derived from other tables, when the data in the source tables are
updated, the view reflects the updates as well. Views can also be used by the DBA to enforce
database security.
Advantages of Views:
Database security: a view allows users to access only those sections of the database
that directly concern them.
Views provide data independence.
Easier querying
Shielding from change
Views let groups of users access the data according to their own criteria.
Views allow the same data to be seen by different users in different ways at the
same time.
When the column of a view is directly derived from the column of a base table, that
column inherits any constraints that apply to the column of the base table. For example, if
a view includes a foreign key of its base table, INSERT and UPDATE operations using
that view are subject to the same referential constraints as the base table.
Example: The following view contains the id, name, level, age and sex of those students
whose age is greater than 24.
CREATE VIEW Student_view AS
SELECT sid, sname, level, age, sex
FROM Student
WHERE age > 24;
Student
Sid Sname Level Age Sex
101 Harendra Undergraduate 22 Male
102 Ramesh Undergraduate 21 Male
103 Nirab Graduate 25 Male
104 Pratibha Undergraduate 20 Female
105 Samrita Graduate 24 Female
106 Aastha Undergraduate 22 Female
107 Rabindra Graduate 28 Male
108 Abin Graduate 26 Male
201 Bharat Postgraduate 30 Male
202 Sohan Postgraduate 32 Male
Executing this query gives the following view (logical table):
Student_view
Sid Sname Level Age Sex
103 Nirab Graduate 25 Male
107 Rabindra Graduate 28 Male
108 Abin Graduate 26 Male
201 Bharat Postgraduate 30 Male
202 Sohan Postgraduate 32 Male
Any valid retrieval operation can now be performed on this view just as on an ordinary
table. A view is a named specification of a result table. The specification is a SELECT statement
that is executed whenever the view is referenced in an SQL statement. A view
has columns and rows just like a base table. For retrieval, all views can be used just like
base tables.
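The behaviour described above can be tried with Python's built-in sqlite3 module. This is a minimal sketch, assuming an in-memory SQLite database and only four of the Student rows from the example; the helper name view_ids is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (sid INTEGER PRIMARY KEY, sname TEXT, level TEXT, "
    "age INTEGER, sex TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Harendra", "Undergraduate", 22, "Male"),
        (103, "Nirab", "Graduate", 25, "Male"),
        (105, "Samrita", "Graduate", 24, "Female"),
        (202, "Sohan", "Postgraduate", 32, "Male"),
    ],
)
# The view stores only its defining query, not the rows.
conn.execute(
    "CREATE VIEW Student_view AS "
    "SELECT sid, sname, level, age, sex FROM Student WHERE age > 24"
)


def view_ids():
    """Return the student ids currently visible through the view."""
    return [row[0] for row in conn.execute("SELECT sid FROM Student_view ORDER BY sid")]


before = view_ids()
# Updating the base table changes what the view returns on the next query.
conn.execute("UPDATE Student SET age = 25 WHERE sid = 105")
after = view_ids()
print(before, after)
```

Because the view is re-evaluated each time it is referenced, student 105 appears in it as soon as the base table row satisfies the condition age > 24.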
4 a) Define first, second, and third normal forms with suitable examples.
Solution:
First Normal Form (1NF)
A relation is said to be in 1NF if and only if all domains of the relation contain only
atomic (indivisible) values.
More simply, a relation is in 1 NF if it does not have multi-valued attributes, composite
attributes or their combinations. That is, the domain of an attribute must include
only atomic (simple, indivisible) values.
Example: let's take an un-normalized relation containing composite attributes as,
Student
Subjects
Sid Sname Subject1 Subject2
1 Nitesh DBMS Graphics
2 Laxman C C++
3 Geeta JAVA .NET
4 Anisha Simulation SAD
5 Monika Algebra Calculus
The above table cannot be considered as an example of 1 NF, because it has repeating
groups (two subject fields). Now convert this relation into 1 NF as,
Student
Sid Sname Subject
1 Nitesh DBMS
1 Nitesh Graphics
2 Laxman C
2 Laxman C++
3 Geeta JAVA
3 Geeta .NET
4 Anisha Simulation
4 Anisha SAD
5 Monika Algebra
5 Monika Calculus
Fig: table in 1 NF
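The conversion to 1 NF can be sketched in Python as flattening the repeating group into one row per subject. The function name to_first_normal_form is hypothetical, and the data are the Student rows from the example with the subjects held as a list.

```python
# Un-normalized rows: the third field is a repeating group of subjects.
unnormalized = [
    (1, "Nitesh", ["DBMS", "Graphics"]),
    (2, "Laxman", ["C", "C++"]),
    (3, "Geeta", ["JAVA", ".NET"]),
    (4, "Anisha", ["Simulation", "SAD"]),
    (5, "Monika", ["Algebra", "Calculus"]),
]


def to_first_normal_form(rows):
    """Emit one (Sid, Sname, Subject) row per subject so every value is atomic."""
    return [(sid, sname, subject) for sid, sname, subjects in rows for subject in subjects]


student_1nf = to_first_normal_form(unnormalized)
for row in student_1nf:
    print(row)
```

In the flattened relation Sid alone no longer identifies a row; the key becomes the pair (Sid, Subject).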
Fig: Relations in 1 NF
In the above relation {Emp-Id, Dept-No} is the primary key. Emp-Name, Emp-Salary
and Dept-Name all depend upon {Emp-Id, Dept-No}. However, Emp-Id → Emp-Name,
Emp-Id → Emp-Salary and Dept-No → Dept-Name also hold, so these attributes depend on
only part of the key (partial dependency). Because of this, the relation is not in 2 NF.
Now convert this relation into 2 NF by decomposing it into three relations
as,
Employee
Emp-Id Emp-Name Emp-Salary
1 Bhupi 40000
2 Bindu 30000
3 Arjun 60000
Emp-Dept
Emp-Id Dept-No
1 D1
1 D2
2 D3
3 D1
Department
Dept-No Dept-Name
D1 BBA
D2 CSIT
D3 BBS
Fig: Relations in 2 NF
Student
S-Id S-Name Age Sex
1 Laxmi 21 Female
2 Binita 22 Female
3 Rajesh 32 Male
4 Aayan 21 Male
Hostel
Sex Hostel-Name
Female White house
Male Red house
Fig: Relations in 3 NF
A → B (A is the determinant; B is functionally dependent on A)
Fig: Functional dependency between A and B
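Whether a functional dependency A → B holds in a given set of rows can be checked mechanically: no two rows may agree on A and differ on B. The sketch below is illustrative only; the function name holds is hypothetical, and the sample rows are assumed to be the employee and department data from the 2 NF example joined back into one relation.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # The first value seen for a determinant must match all later ones.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


# Assumed sample relation with key {Emp-Id, Dept-No}.
emp_dept = [
    {"Emp-Id": 1, "Emp-Name": "Bhupi", "Dept-No": "D1", "Dept-Name": "BBA"},
    {"Emp-Id": 1, "Emp-Name": "Bhupi", "Dept-No": "D2", "Dept-Name": "CSIT"},
    {"Emp-Id": 2, "Emp-Name": "Bindu", "Dept-No": "D3", "Dept-Name": "BBS"},
    {"Emp-Id": 3, "Emp-Name": "Arjun", "Dept-No": "D1", "Dept-Name": "BBA"},
]

print(holds(emp_dept, ["Emp-Id"], ["Emp-Name"]))    # partial dependency on the key
print(holds(emp_dept, ["Dept-No"], ["Dept-Name"]))  # partial dependency on the key
print(holds(emp_dept, ["Emp-Id"], ["Dept-No"]))     # does not hold: employee 1 is in D1 and D2
```

A check like this can only refute a dependency from sample data; whether a dependency is meant to hold is a property of the schema, not of one particular set of rows.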
Consistency
The consistency property ensures that any transaction will bring the database from one
valid state to another. Execution of a transaction in isolation (that is, with no other
transaction executing concurrently) preserves the consistency of the database. In other
words, if the database was in a consistent
state before the start of a transaction, then on termination of the transaction the
database will also be in a consistent state. Ensuring consistency for an individual
transaction is the responsibility of the application programmer who codes the transaction.
Note that during the execution of the transaction the state of the database may temporarily become
inconsistent. Such an inconsistent state of the database should not be visible to the users
or to other concurrently running transactions.
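A funds transfer is the usual illustration of consistency: the total balance must be the same before and after the transaction, even though it is briefly wrong between the two updates. The following is a minimal sketch using Python's sqlite3 module; the account table, the balance >= 0 rule and the function names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 300)])
conn.commit()


def total():
    """Sum of all balances; a transfer must leave this value unchanged."""
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


def transfer(src, dst, amount):
    """Move amount from src to dst; both updates take effect together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?", (amount, dst)
            )
            # Between the two updates the total is temporarily too high.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer("A", "B", 200)         # valid transfer
rejected = transfer("A", "B", 1000)  # would make A negative, so it is rolled back
print(ok, rejected, total())
```

The rejected transfer leaves no trace: the credit to B that had already been applied inside the transaction is undone by the rollback, so the database returns to its previous consistent state.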
Isolation
Even though multiple transactions may execute concurrently, the system guarantees
that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished
execution before Ti started, or Tj started execution after Ti finished. Thus, each
transaction is unaware of other transactions executing concurrently in the system. This
means that, when transactions execute concurrently, the execution of one transaction
should not interfere with the execution of another. The isolation property
ensures that the concurrent execution of transactions results in a system
state that is equivalent to a state that could have been obtained had these transactions
executed one at a time in some order. In effect, the actions
performed by a transaction are isolated or hidden from outside the transaction until
the transaction terminates. This property gives the transaction a measure of relative
independence. Ensuring the isolation property is the responsibility of a component of the
database system called the Concurrency Control Component.
Durability
The durability property guarantees that, once a
transaction completes successfully, all the updates that it carried out on the database
persist, even if there is a system failure after the transaction completes execution.
Durability can be guaranteed by ensuring that either the updates carried out by the
transaction have been written to disk before the transaction completes, or the
information about those updates that has been written to disk is
sufficient to enable the database to reconstruct the updates when the database system
is restarted after the failure. Ensuring durability is the responsibility of the component of
the DBMS called the Recovery Management Component.
Serializable schedule:
A given non-serial schedule of n transactions is serializable if it is equivalent to some
serial schedule. That is, if a non-serial schedule produces the same result as a serial
schedule, then the non-serial schedule is said to be serializable. A schedule that is
not serializable is called non-serializable. The main objective of serializability is to
find non-serial schedules that allow transactions to execute concurrently without
interfering with one another and that produce a result that could be produced by a
serial execution. In serializability, the ordering of reads and writes is important. If two
transactions only read a data item, they do not conflict and order is not important. If two
transactions read or write completely separate data items, they do not conflict
and order is not important. If one transaction writes a data item and another reads or
writes the same data item, then the order of execution is important.
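The three rules at the end of the paragraph amount to a simple test for whether two operations conflict. A small sketch follows, assuming a hypothetical Op record with a transaction name, an operation kind and a data item; none of these names come from the original text.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str   # transaction name, e.g. "T1"
    kind: str  # "r" for read, "w" for write
    item: str  # data item, e.g. "x"


def conflicts(a: Op, b: Op) -> bool:
    """Two operations conflict if they belong to different transactions,
    touch the same data item, and at least one of them is a write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)


print(conflicts(Op("T1", "r", "x"), Op("T2", "r", "x")))  # both only read x
print(conflicts(Op("T1", "w", "x"), Op("T2", "r", "y")))  # separate data items
print(conflicts(Op("T1", "w", "x"), Op("T2", "r", "x")))  # write and read of x
```

Only for conflicting pairs does swapping the order of the two operations have the potential to change the result of the schedule.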
6. a) How does the granularity of data items affect the performance of concurrency
control? What factors affect the selection of granularity size for data items?
Solution: In the concurrency control schemes discussed so far, each individual data item is
the unit on which synchronization is performed. However, it is sometimes advantageous
to group several data items and to treat them as one individual unit. For example, if a
transaction Ti needs to access the entire database and it uses a locking protocol, then Ti
must lock each item in the database, which is a time-consuming process. It would
be better if Ti could issue a single lock request to lock the entire database. On the other
hand, if transaction Ti needs to access only a few data items, it should not be required to
lock the entire database. A data item can be one of the following:
1. Database field
2. Database record
3. Disk block
4. Whole file
5. Whole database
The size of a data item is often called the data item granularity. Fine granularity
refers to small item sizes, whereas coarse granularity refers to large item sizes. The best
item size depends on the type of transaction. A hierarchy of data granularities, where the
small granularities are nested within larger ones, can be represented graphically as a tree.
In the tree, each node represents an independent data item, and each non-leaf node
represents the data associated with its descendants.
- Growing Phase: In this phase we acquire read or write locks on the data as
needed. In this phase we do not release any lock. Remember that all lock
operations must precede the first unlock operation in a transaction.
- Shrinking Phase: This phase is just the reverse of the growing phase. In this phase we
release read and write locks but do not acquire any new lock. Unlock operations
can only appear after the last lock operation.
For a transaction these two phases must be mutually exclusive. This means that during
the locking phase the unlocking phase must not start, and during the unlocking phase the locking
phase must not begin. It can be proved that, if every transaction in a schedule follows
2PL, the schedule is guaranteed to be serializable, obviating the need to test for
serializability of the schedule. If lock conversion is allowed, then upgrading of
locks (from read-locked to write-locked) must be done during the growing phase, and
downgrading of locks (from write-locked to read-locked) must be done in the shrinking
phase. Hence, a read_lock(x) operation that downgrades an already held write lock
on x can appear only in the shrinking phase, and a write_lock(x) operation that
upgrades an already held read lock on x can appear only in the growing phase.
Consider the following two schedules. Both contain the same two transactions, T1 and T2;
the only difference is that the first schedule does not follow the 2PL protocol whereas
the second does. Transaction T1
adds 100 to both data items and transaction T2 multiplies both data items by 2. Assume the
initial value of both data items (x and y) is 50. In schedule S1 the final values of x
and y are 300 and 200 respectively, which is not correct because this is not equivalent to
any serial schedule of T1 and T2 (T1→T2 gives x = y = 300 and T2→T1 gives x = y = 200).
In schedule S2 the final value of both data
items is 300. This is correct because it is equivalent to the serial schedule T1→T2. From this
observation, we can conclude that schedule S1 does not follow the 2PL protocol and
is not serializable, whereas schedule S2 follows the 2PL protocol and hence is
serializable.
Schedule S1
Txn  Operation   Data item values
T1   Lock(x)
T1   Read(x)     x←50
T1   x=x+100     x←150
T1   Write(x)    T1 writes x←150
T1   Unlock(x)
T2   Lock(x)
T2   Read(x)     x←150
T2   x=x*2       x←300
T2   Write(x)    T2 writes x←300
T2   Unlock(x)
T2   Lock(y)
T2   Read(y)     y←50
T2   y=y*2       y←100
T2   Write(y)    T2 writes y←100
T2   Unlock(y)
T1   Lock(y)
T1   Read(y)     y←100
T1   y=y+100     y←200
T1   Write(y)    T1 writes y←200
T1   Unlock(y)
Schedule S2
Txn  Operation   Data item values
T1   Lock(x)
T1   Read(x)     x←50
T1   x=x+100     x←150
T1   Write(x)    T1 writes x←150
T1   Lock(y)
T1   Unlock(x)
T2   Lock(x)
T2   Read(x)     x←150
T2   x=x*2       x←300
T2   Write(x)    T2 writes x←300
T1   Read(y)     y←50
T1   y=y+100     y←150
T1   Write(y)    T1 writes y←150
T1   Unlock(y)
T2   Lock(y)
T2   Unlock(x)
T2   Read(y)     y←150
T2   y=y*2       y←300
T2   Write(y)    T2 writes y←300
T2   Unlock(y)
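The two schedules can be replayed in a few lines of Python to check both the two-phase property and the final values. This is a sketch under assumptions: the tuple encoding of operations and the helper names steps, is_two_phase and run are hypothetical, the "compute" step stands for T1's addition of 100 or T2's multiplication by 2, and the simulation replays the operations in the given order without enforcing the locks.

```python
def steps(txn, spec):
    """Expand 'lock:x read:x' into [(txn, 'lock', 'x'), (txn, 'read', 'x')]."""
    return [(txn, *s.split(":")) for s in spec.split()]


def is_two_phase(schedule, txn):
    """True if txn never acquires a lock after it has released one."""
    released = False
    for t, action, _item in schedule:
        if t != txn:
            continue
        if action == "unlock":
            released = True
        elif action == "lock" and released:
            return False
    return True


def run(schedule, db):
    """Replay the schedule on a copy of db and return the final state."""
    db = dict(db)
    local = {}  # (txn, item) -> private working copy of the item
    update = {"T1": lambda v: v + 100, "T2": lambda v: v * 2}
    for t, action, item in schedule:
        if action == "read":
            local[(t, item)] = db[item]
        elif action == "compute":
            local[(t, item)] = update[t](local[(t, item)])
        elif action == "write":
            db[item] = local[(t, item)]
    return db


S1 = (steps("T1", "lock:x read:x compute:x write:x unlock:x")
      + steps("T2", "lock:x read:x compute:x write:x unlock:x "
                    "lock:y read:y compute:y write:y unlock:y")
      + steps("T1", "lock:y read:y compute:y write:y unlock:y"))

S2 = (steps("T1", "lock:x read:x compute:x write:x lock:y unlock:x")
      + steps("T2", "lock:x read:x compute:x write:x")
      + steps("T1", "read:y compute:y write:y unlock:y")
      + steps("T2", "lock:y unlock:x read:y compute:y write:y unlock:y"))

start = {"x": 50, "y": 50}
for name, schedule in (("S1", S1), ("S2", S2)):
    two_phase = all(is_two_phase(schedule, t) for t in ("T1", "T2"))
    print(name, "follows 2PL:", two_phase, "final state:", run(schedule, start))
```

Under these assumptions S2 ends with x = y = 300, the same as the serial order T1→T2, while S1 ends in a state that matches neither serial order.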
TU 2067
Data redundancy: Data redundancy means duplication of the same data or data files in
different places. Flat file systems suffer from high data
redundancy. For example, the record of a student (student id, name, level, program, section etc.)
may appear in the library data files as well as the examination data files. This
redundancy leads to higher storage and access costs. Database
management systems can greatly reduce data redundancy, although
a DBMS cannot remove it completely.
Data inconsistency: Data inconsistency is a side effect of data redundancy. Data is said to
be inconsistent if the various copies of the same data no longer agree. Data
inconsistency occurs if a change is reflected in the data files in one place but not
elsewhere in the system.
Data isolation: Because data are scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is difficult
in flat file systems.
Difficulty in accessing data: File processing systems do not allow required data to be
retrieved in an efficient and convenient way. For example, assume we already have a
program to generate the list of books on the basis of subject. Now, if we need to
generate the list of books on the basis of author name, either we need to extract the data
from the book data files manually or we must ask a programmer to write a program
to retrieve the required data from the book data file. Neither alternative is
satisfactory. The first is time consuming, and the second is tedious
and costly, because asking a programmer to write a new program every time an
existing application program cannot generate the required list is not a good idea.
In database systems, by contrast, it is easy to write general queries that generate different
lists on the basis of different criteria.
Integrity problems: Integrity means correctness of data before and after execution of a
transaction. Integrity constraints are conditions applied to the data. For example, if the
maximum salary in an organization is 150,000 then we have the integrity constraint
“salary ≤ 150,000”. Integrity constraints are important for maintaining the correctness of data. They
play a vital role in preventing user mistakes. For example, if a user mistakenly types
200,000 in place of 20,000 while transferring the salary of an employee to his/her account,
the specified integrity constraint is violated and hence the system tells the user about the
mistake.
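In SQL such a rule is usually declared as a CHECK constraint so that the DBMS enforces it on every insert and update. The sketch below uses Python's sqlite3 module; the employee table, its columns and the function name pay are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (eid INTEGER PRIMARY KEY, name TEXT, "
    "salary INTEGER CHECK (salary <= 150000))"
)


def pay(eid, name, salary):
    """Try to store a salary; report a violation of the integrity constraint."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (eid, name, salary))
        return "stored"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


correct = pay(1, "Bhupi", 20000)   # the intended amount
mistyped = pay(2, "Bindu", 200000)  # one zero too many, violates salary <= 150000
print(correct)
print(mistyped)
```

In a flat file system the same check would have to be repeated in every application program that writes salaries; declaring it once in the schema is what makes it an integrity constraint of the database.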
Two-Tier Architecture
The two-tier architecture is based on the client-server model: communication takes place
directly between client and server, with no intermediary between them. Because of this
tight coupling, a two-tier application can run faster. In this approach, the user interface and application
programs are placed on the client side and the database system on the server side.
The application programs that reside at the
client side invoke the DBMS at the server side. Application program interface
standards like Open Database Connectivity (ODBC) and Java Database Connectivity
(JDBC) are used for interaction between client and server.
Three-Tier Architecture
1) Client layer: It is also called the presentation layer and contains the UI part of the
application. This layer presents data to the
user and takes input from the user, for example a registration form that
contains text boxes, labels, buttons etc.
2) Business layer:
This layer contains all the business logic, such as validation of data, calculations and data
insertion. It acts as an interface between the client layer and the data access layer. This
layer, also called the intermediary layer, handles communication between the
client and data layers.
3) Data layer:
This layer is where the actual database comes into the picture. The data access layer contains
methods to connect with the database and to insert, update, delete and retrieve data
based on the input data.
Fig: Three-tier Architecture (Client Application, Application server, Data Source)
Advantages
1. High performance, lightweight persistent objects
2. Scalability – Each tier can scale horizontally
3. Performance – Because the Presentation tier can cache requests, network utilization
is minimized, and the load is reduced on the Application and Data tiers.
4. High degree of flexibility in deployment platform and configuration
5. Better re-use
6. Improved data integrity
7. Improved security – The client has no direct access to the database.
8. Easy to maintain; a modification in one module won't affect other modules
9. In three-tier architecture application performance is good.
Disadvantages
1. Increased complexity/effort
c. What are weak entity, owner entity type and identifying relationship?
Solution: An entity set that does not possess sufficient attributes to form a primary key
is called a weak entity set. One that does have a primary key is called a strong entity
set. For example, the entity set transaction has attributes transaction number (Tno), date
and amount. Different transactions on different accounts could share the same number.
Therefore these attributes are not sufficient to form a primary key (that is, to uniquely identify a
transaction). Thus transaction is a weak entity set. The strong entity set on which the existence
of a weak entity set depends is called the owner or identifying entity set. The
relationship between a weak entity set and its owner entity set, which is one-to-many
from owner to weak entity set, is called the identifying
relationship.
Illustration:
Solution: SQL allows the use of NULL values to indicate the absence of information about
the value of an attribute. NULL has a special meaning in the database: the value of the
column is not currently known, but it may become known at a later time.
A special comparison operator IS NULL is used to test a column value for NULL. It has
the following general format:
<column name> IS [NOT] NULL
This comparison operator returns true if the value is NULL and false otherwise.
The optional NOT reverses the result.
NULL is the value used to represent an unknown piece of data. Let's take a look at a
simple example: a table containing the inventory for a fruit stand. Suppose that our
inventory contains 10 apples and 3 oranges. We also stock plums, but our inventory
information is incomplete and we don't know how many (if any) plums are in stock.
Using the NULL value, we would have the inventory table shown below:
Fruit Stand Inventory
Item Quantity
Apples 10
Oranges 3
Plums NULL
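The fruit stand table can be queried with Python's sqlite3 module to see how IS NULL behaves. This is a minimal sketch; the table name inventory and the helper name items are assumptions, and Python's None is stored as SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, quantity INTEGER)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [("Apples", 10), ("Oranges", 3), ("Plums", None)],  # None becomes NULL
)


def items(condition):
    """Return the item names whose row satisfies the given WHERE condition."""
    query = f"SELECT item FROM inventory WHERE {condition} ORDER BY item"
    return [row[0] for row in conn.execute(query)]


unknown = items("quantity IS NULL")
known = items("quantity IS NOT NULL")
# An ordinary comparison with NULL is never true, so this matches no rows.
equals_null = items("quantity = NULL")
print(unknown, known, equals_null)
```

The last query is the reason a dedicated operator exists: comparing a value with NULL using = yields unknown rather than true, so the row for Plums can only be found with IS NULL.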
Fig: Recursive relationship Supervise on the entity Employee
an organizational unit (e.g. department, division, branch, ...) comprises other units
a course is a prerequisite for another course
2.
a. Draw an ER diagram for a database showing Hospital system. The Hospital
maintains data about Affiliated Hospitals, type of Treatments facilities given at
each hospital, and Patients.
Fig: ER diagram for the Hospital system. Labels in the diagram: entities TREATMENTS and
PATIENTS; relationships Facilitate (1:M) and Has (1:N); attributes Name, Address,
Contact_No and Date-checked-out.
b. What is join operation? Differentiate between equijoin and natural join with
suitable example.
For example consider the tables Employee and Department and their natural join:
Employee
e-id e-name Dept
11 Bhupi Computer
13 Anju Finance
43 Manju Computer
54 Nisha Finance
Department
Dept Manager
Computer Anisha
Finance Manisha
Production Umesh
Employee ⋈ Department
e-id e-name Dept Manager
11 Bhupi Computer Anisha
13 Anju Finance Manisha
43 Manju Computer Anisha
54 Nisha Finance Manisha
Equi-join: It is a special case of the conditional join in which the condition consists only of
equalities. Unlike the natural join, the joined relations need not share attribute names, and if a
common attribute occurs, it appears twice in the result of the equi-join
operation.
Example:
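The difference in the result columns can be seen by running both joins on the Employee and Department tables with Python's sqlite3 module. This is a sketch under assumptions: the columns e-id and e-name are renamed e_id and e_name because a hyphen is not valid in an unquoted SQL identifier, and the helper name columns_and_rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (e_id INTEGER, e_name TEXT, Dept TEXT)")
conn.execute("CREATE TABLE Department (Dept TEXT, Manager TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(11, "Bhupi", "Computer"), (13, "Anju", "Finance"),
     (43, "Manju", "Computer"), (54, "Nisha", "Finance")],
)
conn.executemany(
    "INSERT INTO Department VALUES (?, ?)",
    [("Computer", "Anisha"), ("Finance", "Manisha"), ("Production", "Umesh")],
)


def columns_and_rows(sql):
    """Run a query and return its column names together with its rows."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description], cur.fetchall()


# Natural join: joins on the common attribute Dept and keeps it once.
natural_cols, natural_rows = columns_and_rows(
    "SELECT * FROM Employee NATURAL JOIN Department ORDER BY e_id"
)
# Equi-join: the equality is written explicitly and Dept appears twice.
equi_cols, equi_rows = columns_and_rows(
    "SELECT * FROM Employee e JOIN Department d ON e.Dept = d.Dept ORDER BY e_id"
)
print(natural_cols)
print(equi_cols)
```

Both joins pair the same four employees with their managers; Production has no employee and so appears in neither result. The only difference is that the equi-join result carries the Dept column from both tables.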
Write relational algebra and SQL queries for each of the following cases.
a. Find the names of supervisors that work in companies whose address equals
'Kathmandu'
SELECT DISTINCT e.name
FROM Employee e, Supervise s, Works w, Company c
WHERE c.address = 'Kathmandu' AND e.ss# = s.supervisor_ss# AND
s.supervisor_ss# = w.ss# AND w.cname = c.cname;
b. Find the names of all the companies that have more than 4 supervisors.
c. Find the name of the supervisor who has the largest number of employees.
SELECT e.name, COUNT(s.employee_ss#) AS noe
FROM Employee e, Supervise s
WHERE e.ss# = s.supervisor_ss#
GROUP BY e.ss#, e.name
HAVING COUNT(s.employee_ss#) >= ALL (SELECT COUNT(employee_ss#)
FROM Supervise GROUP BY supervisor_ss#);
d. How can a view be defined in SQL? Explain the problems that may arise when one
attempts to update a view.
Solution: A database view is a logical table. It does not physically store data like a table
but presents data stored in the underlying tables in a different form. A view does not
require disk space for its rows, and we can use a view in most places where a table can be used. Since
views are derived from other tables, when the data in the source tables are
updated, the view reflects the updates as well. Views can also be used by the DBA to enforce
database security.
Advantages of Views:
Database security: a view allows users to access only those sections of the database
that directly concern them.
Views provide data independence.
Easier querying
Shielding from change
Views let groups of users access the data according to their own criteria.
Views allow the same data to be seen by different users in different ways at the
same time.
When the column of a view is directly derived from the column of a base table, that
column inherits any constraints that apply to the column of the base table. For example, if
a view includes a foreign key of its base table, INSERT and UPDATE operations using
that view are subject to the same referential constraints as the base table.
Syntax for creating view is:
CREATE VIEW <view name> AS <query expression>
Where, <query expression> is any legal query expression.
4. What are the different update anomalies? Explain each with suitable examples.
Solution:
Update anomalies are the problems that result from an
un-normalized database in a relational database management system (RDBMS). The
term covers insertion anomalies, deletion anomalies
and modification anomalies. If any of these anomalies is present, the
database can become inconsistent when records are inserted into,
deleted from or modified in its tables. The three
kinds of anomaly affect the database in different ways:
* Insertion anomalies occur when a record cannot be inserted into a table, or can
only be inserted inconsistently, because other data that belongs in the same row
is not yet available.
* Deletion anomalies occur when deleting a record also removes other facts
stored in the same row, so that information is lost
unintentionally.
* Modification anomalies occur when a fact is stored in several rows, so that changing
it in one row but not in the others leaves the database inconsistent.
Example
In the above relation, if we want to insert information about a newly hired employee to
whom no department has been assigned yet, we cannot insert the information because the value of
Dno cannot be null. This problem is called insertion anomaly.
If we want to delete information about department D2 without deleting information
about the employees in that department, we are not able to do this. Deleting department D2
causes deletion of employees E02 and E04 as well. This problem is called deletion anomaly.
Changing the name of department number D1 from “IT” to “ICT” requires this
update to be made for all employees working in department D1. This problem is called
update (modification) anomaly.
5. a. Draw a state diagram, and discuss the typical states that a transaction goes
through during execution.
Active State: In this state the transaction is being executed. This is the initial state
of every transaction.
Failed State: If any check made by the database recovery system fails, the transaction
is said to be in the failed state, from where it can no longer proceed.
Aborted: If any of the checks fails and the transaction has reached the failed state, the
recovery manager rolls back all its write operations on the database to restore the
database to the state it was in before the transaction started.
Transactions in this state are called aborted. The database recovery module can select
one of two operations after a transaction aborts:
T1 T2 T3
r(x)
r(x)
w(x)
r(x)
w(x)
Dependence graph
Since the above precedence graph is cyclic, the given schedule is not conflict serializable,
and no equivalent serial schedule exists.
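ii). r1(x); r3(x); w3(x); w1(x); r2(x);
T1        T2        T3
r(x)
                    r(x)
                    w(x)
w(x)
          r(x)
Dependence graph: T1→T3 (r1(x) before w3(x)), T3→T1 (r3(x) and w3(x) before w1(x)),
T3→T2 (w3(x) before r2(x)), T1→T2 (w1(x) before r2(x))
Since the above precedence graph is cyclic (T1→T3→T1), the given schedule is not conflict
serializable, and no equivalent serial schedule exists.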
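3) r3(x); r2(x); w3(x); r1(x); w1(x);
T1        T2        T3
                    r(x)
          r(x)
                    w(x)
r(x)
w(x)
Dependence graph: T2→T3 (r2(x) before w3(x)), T2→T1 (r2(x) before w1(x)),
T3→T1 (r3(x) before w1(x); w3(x) before r1(x) and w1(x))
Since the above precedence graph is not cyclic, the given schedule is conflict serializable,
and its equivalent serial schedule is T2→T3→T1.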
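The precedence graph test used above can be sketched in Python. This is an illustration only: the function names parse, precedence_edges and serial_order are hypothetical, the standard-library graphlib module (Python 3.9 or later) is used for the cycle check, and the second schedule in the usage lines is an illustrative acyclic case.

```python
import re
from graphlib import CycleError, TopologicalSorter


def parse(schedule):
    """Turn 'r1(x); w3(x)' into [('r', 'T1', 'x'), ('w', 'T3', 'x')]."""
    found = re.findall(r"([rw])(\d+)\((\w+)\)", schedule)
    return [(kind, "T" + num, item) for kind, num, item in found]


def precedence_edges(ops):
    """Add an edge Ti -> Tj for every conflicting pair where Ti's operation comes first."""
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (k1, k2):
                edges.add((t1, t2))
    return edges


def serial_order(schedule):
    """Return an equivalent serial order, or None if the precedence graph has a cycle."""
    ops = parse(schedule)
    predecessors = {t: set() for _, t, _ in ops}
    for before, after in precedence_edges(ops):
        predecessors[after].add(before)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


print(serial_order("r1(x); r3(x); w3(x); w1(x); r2(x)"))  # cyclic: T1 -> T3 -> T1
print(serial_order("r3(x); r2(x); w3(x); r1(x); w1(x)"))  # acyclic
```

A topological order of an acyclic precedence graph is exactly an equivalent serial schedule, which is why the same routine both tests conflict serializability and produces the serial order.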
6. a) Discuss the problems of deadlock and starvation, and the different approaches
to dealing with these problems.
Solution: A system is in a deadlock state if there exists a set of transactions such that
every transaction in the set is waiting for a data item that is locked by another transaction
in the set. That is, there exists a set of waiting transactions {T0, T1, ..., Tn} such that T0 is
waiting for a data item that is held by T1, T1 is waiting for a data item that is held by T2,
..., Tn-1 is waiting for a data item that is held by Tn, and Tn is waiting for a data item that
is held by T0. None of the transactions can make progress in such a situation.
The different approaches to dealing with these problems are:
then
Ti waits
else
Ti dies
For example, suppose that transaction T2, T3 and T4 have timestamps 5, 10 and 15
respectively. If T2 requests a data item held by T3, then T2 will wait. If T4 requests a
data item held by T3, then T4 will be rolled back.
Wound-wait scheme (Ti requests a data item held by Tj):
if TS(Ti) < TS(Tj)
then
Tj is wounded (aborted and rolled back)
else
Ti waits
In the above example, if transaction T2 requests a data item held by transaction T3, then
the data item will be preempted from T3, and T3 will be aborted and rolled back. If T4
requests a data item held by T3, then T4 will wait.
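The two timestamp-based rules can be sketched in Python as below. This is only an illustration: the function names and return strings are hypothetical, and a smaller timestamp is assumed to mean an older transaction.

```python
def wait_die(ts_requester: int, ts_holder: int) -> str:
    """Non-preemptive rule: an older requester waits, a younger one dies."""
    return "wait" if ts_requester < ts_holder else "die"


def wound_wait(ts_requester: int, ts_holder: int) -> str:
    """Preemptive rule: an older requester wounds the holder, a younger one waits."""
    return "wound holder" if ts_requester < ts_holder else "wait"


# Timestamps from the example in the text.
ts = {"T2": 5, "T3": 10, "T4": 15}

for requester in ("T2", "T4"):
    print(requester, "requests item held by T3:",
          "wait-die ->", wait_die(ts[requester], ts["T3"]),
          "| wound-wait ->", wound_wait(ts[requester], ts["T3"]))
```

In both schemes the transaction that gets rolled back is the younger one, which is why an older transaction cannot be starved indefinitely as long as a restarted transaction keeps its original timestamp.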
Another group of protocols that prevent deadlock without requiring timestamps consists of the
No Waiting (NW) and Cautious Waiting (CW) algorithms.
Solution: The write-ahead log (WAL) protocol guarantees that no data modification is
written to disk before the associated log record is written to disk. This supports the
atomicity and durability properties of a transaction. At the time a modification is made to a
page in the buffer, a log record that records the modification is built in the log cache. This log
record must be written to disk before the associated dirty page is flushed from the
buffer cache to disk. If the dirty page were flushed before the log record is written, the
dirty page would create a modification on disk that cannot be rolled back if the transaction
fails before the log record is written to disk. WAL states that:
For Undo: Before a data item's AFIM (after image) is flushed to the database disk (overwriting
the BFIM, its before image), its BFIM must be written to the log and the log must be saved on
stable storage (log disk).
For Redo: Before a transaction executes its commit operation, all its AFIMs must
be written to the log and the log must be saved on stable storage.
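The ordering that WAL imposes can be illustrated with a toy in-memory model. This is a minimal sketch, not a real buffer manager: the class ToyWAL and all of its method and attribute names are hypothetical, and "disk" and "stable log" are plain Python containers.

```python
class ToyWAL:
    """Toy model of the write-ahead logging rules (illustration only)."""

    def __init__(self, disk):
        self.disk = dict(disk)   # database pages on disk
        self.cache = {}          # dirty pages in the buffer cache
        self.log_buffer = []     # log records not yet on stable storage
        self.stable_log = []     # log records already on stable storage

    def _read(self, item):
        return self.cache.get(item, self.disk.get(item))

    def write(self, txn, item, afim):
        bfim = self._read(item)
        # Build the log record (with BFIM and AFIM) before touching the page.
        self.log_buffer.append(("update", txn, item, bfim, afim))
        self.cache[item] = afim

    def force_log(self):
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def flush_page(self, item):
        # Undo rule: the log reaches stable storage before the dirty page does.
        self.force_log()
        self.disk[item] = self.cache.pop(item)

    def commit(self, txn):
        # Redo rule: all of the transaction's log records are forced at commit.
        self.log_buffer.append(("commit", txn))
        self.force_log()


wal = ToyWAL({"x": 100})
wal.write("T1", "x", 150)
wal.flush_page("x")   # forces the update record first, then writes the page
wal.commit("T1")
print(wal.disk, wal.stable_log)
```

Because the update record carrying the BFIM is on stable storage before the page is overwritten, the change could still be undone if T1 failed between flush_page and commit.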
TU 2069
In the first approach, the user interface and application programs are placed on the
client side and the database system on the server side. This architecture is called two-
tier architecture. The application programs that reside at the client side invoke the
DBMS at the server side. The application program interface standards like Open
Database Connectivity (ODBC) and Java Database Connectivity (JDBC) are used for
interaction between client and server.
The second approach, the three-tier architecture, is primarily used for web-based
applications. It adds an intermediate layer known as the application server (or web server)
between the client and the database server. The client communicates with the
application server, which in turn communicates with the database server. The
application server stores the business rules (procedures and constraints) used for
accessing data from the database server. It checks the client's credentials before forwarding
a request to the database server. Hence, it improves database security. When a client
requests information, the application server accepts the request, processes it, and
sends the corresponding database commands to the database server. The database server sends
the result back to the application server, which converts it into GUI format and presents it
to the client.
Solution: SQL allows the use of NULL values to indicate the absence of information about
the value of an attribute. NULL has a special meaning in the database: the value of the
column is not currently known, but it may become known at a later time.
A special comparison operator, IS NULL, is used to test a column value for NULL. It has the
following general format:
value IS [NOT] NULL
This comparison operator returns true if the value is NULL and false otherwise.
The optional NOT reverses the result.
NULL is the value used to represent an unknown piece of data. Let's take a look at a
simple example: a table containing the inventory for a fruit stand. Suppose that our
inventory contains 10 apples and 3 oranges. We also stock plums, but our inventory
information is incomplete and we don't know how many (if any) plums are in stock.
Using the NULL value, we would have the inventory table shown below:
Fruit Stand Inventory
Item      Quantity
Apples    10
Oranges   3
Plums     NULL
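The behaviour of IS NULL on the fruit stand table can be tried out with Python's built-in sqlite3 module. This is a small sketch; the table and column names (inventory, item, quantity) are assumptions chosen to match the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)",
    [("Apples", 10), ("Oranges", 3), ("Plums", None)],  # None is stored as NULL
)

# IS NULL finds the rows whose quantity is unknown.
unknown = conn.execute(
    "SELECT item FROM inventory WHERE quantity IS NULL").fetchall()

# IS NOT NULL reverses the test.
known = conn.execute(
    "SELECT item FROM inventory WHERE quantity IS NOT NULL ORDER BY item").fetchall()

# "= NULL" never evaluates to true, so it matches no rows at all.
equals_null = conn.execute(
    "SELECT item FROM inventory WHERE quantity = NULL").fetchall()

print(unknown, known, equals_null)
```

The last query shows why a dedicated operator is needed: an ordinary comparison with NULL yields unknown rather than true.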
c. Differentiate between logical data independence and physical data independence.
Solution:
The capacity to change the conceptual (logical level) schema without having to change
associated application programs is called logical data independence. If the underlying
conceptual schema is changed, the definition of a view relation can be modified so that
the same relation is computed as before. Database administrator is responsible for
redefining view level schema.
When a modification is made to the conceptual schema (i.e., tables), only the
external/conceptual mapping needs to be changed, provided the DBMS fully supports the
concept of data independence.
The capacity to change the internal schema without affecting application programs is
called physical data independence. This means we can change physical-level storage
details, such as file structures and indexes, without altering the associated application
programs, as long as the conceptual schema remains the same. However, performance may be
affected by changes at the physical level. It is the responsibility of the DBA to manage such changes.
Solution: An association among particular entities, one from each participating entity set, is
called a relationship instance.
The set of all such associations among two or more entity sets is called a relationship set.
A relationship type is an abstraction of a relationship, i.e., a set of relationship
instances sharing common attributes.
Figure: Teacher (Bhupi, Arjun, Dilli, Kumar, Ganesh, Deepak, Ganga) Teaches (Sunday, Monday,
Tuesday, Wednesday, Thursday, Friday) Student (Aayan, Keshav, Umesh, Kamala, Bimala, Rashmi, Arjan)
In the above figure, "teacher teaches student" is a relationship set. A particular
instance, e.g., Bhupi teaches Aayan on Sunday, is a relationship
instance. Finally, if every teacher is described by the same attributes and every student is
described by the same attributes, the association among them is called a relationship type. The
mapping cardinality of a relationship type may be one of the following:
one to one
one to many
many to one
many to many
2. (a) Draw an ER diagram for a database showing Hospital system. The Hospital
maintains data about Affiliated Hospitals, type of Treatments, facilities given at each
hospital and Patients.
Solution:
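Figure: ER diagram of the Hospital system (recoverable labels: entities TREATMENTS and PATIENTS;
relationships Facilitate and Has; attributes Name, Address, Contact_No, Date-checked-out;
cardinality labels 1, 1, M, N)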
The similarities between relational calculus and relational algebra are listed below:
1. Both relational algebra (RA) and relational calculus (RC) are formal languages associated
with the relational model that are used to specify basic retrieval requests.
2. The expressive power of RA and RC is equivalent. This means any query that can
be expressed in RA can also be expressed by a (safe) formula in RC, and vice versa.
Write relational algebra and SQL queries for each of the following cases.
a. Find the names of supervisors that work in companies whose address equals
'Biratnagar'
SELECT DISTINCT e.name
FROM Employee e, Supervise s, Works w, Company c
WHERE c.address = 'Biratnagar' AND e.ss# = s.supervisor_ss# AND
s.supervisor_ss# = w.ss# AND w.cname = c.cname;
(b) What is constraint? How does SQL allow implementation of general integrity
constraints?
Constraints are the rules that are used to control the data in the columns of a particular
relation.
Integrity constraints ensure that changes made to the database by authorized users do
not result in a loss of data consistency. Thus, integrity constraints guard against
accidental damage to the database. Constraints are used to impose rules on a
table that are checked whenever a row is inserted, updated, or deleted. Constraints also
prevent the deletion of a table if there are dependencies. The different types of
constraints that can be imposed on a table are domain constraints, referential
constraints, triggers, assertions, etc. The constraints related to domain constraints are
NOT NULL, UNIQUE, PRIMARY KEY and CHECK constraints.
1. Domain constraints
NOT NULL constraints
UNIQUE constraints
PRIMARY KEY constraints
CHECK constraints etc.
2. Referential constraints
3. Triggers
4. Assertion
When an attribute or set of attributes is declared as the primary key, it accepts
neither NULL values nor duplicate values. Note that
only one primary key can be defined for each table.
1 Abin 04
2 Aayan 11
3 Bindu 26
INSERT INTO Student VALUES (4, 'Umesh', 11); fails with a duplicate-entry error for the
primary key attribute. Similarly, if we omit the primary key value for a
particular record, as in
INSERT INTO Student VALUES ('Geeta', 25); the insertion fails with a "primary key cannot be
null" error message.
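Both failure cases can be reproduced with Python's built-in sqlite3 module. The sketch below rests on assumptions: the column names sid, name and roll are hypothetical (the table header is not shown above), and sid is assumed to be the primary key. It is declared as INT NOT NULL PRIMARY KEY because SQLite, unlike most DBMSs, would otherwise either auto-generate a value or accept NULL in a primary key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (sid INT NOT NULL PRIMARY KEY, name TEXT, roll INT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Abin", 4), (2, "Aayan", 11), (3, "Bindu", 26)],
)


def try_insert(sql):
    """Run an INSERT and return 'ok' or the constraint error message."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


# Duplicate primary key value (sid 2 already exists).
duplicate = try_insert("INSERT INTO Student VALUES (2, 'Umesh', 11)")

# Primary key value left out, so it would be NULL.
missing = try_insert("INSERT INTO Student (name, roll) VALUES ('Geeta', 25)")

row_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(duplicate, "|", missing, "|", row_count)
```

Neither insert changes the table, so it still holds the original three rows.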
4. (a). Define first, second and third normal form with suitable example.
Solution:
Now we convert this relation into 1 NF by decomposing it into two
relations:
Student
Sid  Sname    Address
1    Nitesh   Kalanki
2    Laxman   Balkhu
3    Geeta    Kirtipur
4    Anisha   Pokhara
5    Monika   Ratopool
Phone
Sid  Phone_No
1    9849145464
1    9813335467
2    9841882345
2    099392844
3    9848334898
4    9849283847
5    9840084732
5    9803267499
Fig: Relations in 1 NF
Employee-Department
Emp-Id  Emp-Name  Emp-Salary  Dept-No  Dept-Name
1       Bhupi     40000       D1       BBA
1       Bhupi     40000       D2       CSIT
2       Bindu     30000       D3       BBS
3       Arjun     60000       D2       CSIT
In the above relation {Emp-Id, Dept-No} is the primary key, so Emp-Name, Emp-Salary
and Dept-Name all depend upon {Emp-Id, Dept-No}. However, Emp-Id → Emp-Name,
Emp-Id → Emp-Salary and Dept-No → Dept-Name also hold, so these non-key attributes depend on
only part of the key (partial dependency). Hence this relation is not in 2 NF.
We convert this relation into 2 NF by decomposing it into three
relations:
Employee(Emp-Id, Emp-Name, Emp-Salary)
Emp-Dept(Emp-Id, Dept-No)
Department(Dept-No, Dept-Name)
Fig: Relations in 2 NF
(b) What is functional dependency? Describe full and partial functional dependency
with suitable example.
Solution: Functional dependencies are relationships among the attributes within a
relation. They provide a formal mechanism to express constraints
between attributes. If attribute B functionally depends on attribute A, then for every
value of A we know the corresponding value of B. That is, attribute B is functionally
dependent upon attribute A (or a collection of attributes) if each value of A determines
a single value of B at any one time. Functional dependencies help to
identify how attributes are related to each other.
Let A and B be attributes of a relation R. If each value of A is associated with exactly
one value of B, then B is said to be functionally dependent on A. It is denoted by A → B.
Example
Here, sname, address and age are functionally dependent on sid. This means that each
student id uniquely determines the values of the attributes student name, address and age.
This can be expressed as:
sid → sname
sid → address
sid → age
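Whether a functional dependency is violated by a given set of rows can be checked mechanically. The sketch below is an illustration only: the helper name fd_holds and the sample student rows are hypothetical, not taken from the original table.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for a student relation.
students = [
    {"sid": 1, "sname": "Nitesh", "address": "Kalanki", "age": 21},
    {"sid": 2, "sname": "Laxman", "address": "Balkhu", "age": 22},
    {"sid": 3, "sname": "Geeta", "address": "Kalanki", "age": 21},
]

print(fd_holds(students, ["sid"], ["sname", "address", "age"]))  # sid -> sname, address, age
print(fd_holds(students, ["address"], ["sname"]))                # address -> sname
```

A check of this kind can only refute a dependency. An FD is a constraint on every legal instance of the relation, so a single instance that happens to satisfy it does not prove it.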
Transitive dependency
Example
5. (a) Draw a state diagram, and discuss the typical state that a transaction goes
through during transaction.
Active State: In this state the transaction is being executed. This is the initial state
of every transaction.
Failed State: If any check made by the database recovery system fails, the transaction
is said to be in the failed state, from where it can no longer proceed.
Aborted: If any check fails and the transaction reaches the failed state, the
recovery manager rolls back all its write operations on the database to restore
the database to the state it was in before the transaction started executing.
Transactions in this state are called aborted. The database recovery module can select
one of the two operations after a transaction aborts:
Solution: A schedule in which transactions are arranged so that one
transaction executes completely before the next transaction begins, with no
interleaving of their operations, is called a serial schedule.
Example:
T1                T2
Read(A)
A = A - 50
Write(A)
Read(B)
B = B + 50
Write(B)
                  Read(A)
                  Temp = A * 0.1
                  A = A - Temp
                  Write(A)
                  Read(B)
                  B = B + Temp
                  Write(B)
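The effect of running these two transactions serially can be traced with a short Python sketch. The starting balances (A = 1000, B = 2000) are assumptions chosen for illustration, and Temp is computed with integer division so that the arithmetic stays exact.

```python
def t1(db):
    """Transfer 50 from A to B."""
    a = db["A"]          # Read(A)
    a = a - 50
    db["A"] = a          # Write(A)
    b = db["B"]          # Read(B)
    b = b + 50
    db["B"] = b          # Write(B)


def t2(db):
    """Transfer 10% of A to B."""
    a = db["A"]          # Read(A)
    temp = a // 10       # Temp = A * 0.1 (integer balances assumed)
    a = a - temp
    db["A"] = a          # Write(A)
    b = db["B"]          # Read(B)
    b = b + temp
    db["B"] = b          # Write(B)


def run_serial(order, start):
    db = dict(start)
    for txn in order:
        txn(db)          # each transaction runs to completion before the next
    return db


start = {"A": 1000, "B": 2000}
t1_then_t2 = run_serial([t1, t2], start)
t2_then_t1 = run_serial([t2, t1], start)
print(t1_then_t2, t2_then_t1)
```

The two serial orders give different final balances, but both preserve the sum A + B, so each one leaves the database in a consistent state.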
A given non-serial schedule of n transactions is serializable if it is equivalent to some
serial schedule. That is, if a non-serial schedule produces the same result as some serial
schedule, then the given non-serial schedule is said to be serializable.
Example:
T1 T2
Read(x)
Read(x)
Write(y)
Commit
Write(x)
Commit
Since the above precedence graph of schedule H, with 3 transactions, does not contain a cycle,
the given schedule H is conflict serializable. An equivalent serial schedule is
obtained by taking the transactions in the order T2→T1→T3. Thus the equivalent serial
schedule is: R2(x), W2(x), R2(y), W2(y), R1(x), W1(x), R3(x), W3(x).
6. (a). How does the granularity of data items affect the performance of concurrency
control? What factors affect selection of granularity size for data items?
Solution:
In the concurrency control schemes discussed so far, each individual data item is the unit
on which synchronization is performed. However, it can be advantageous to group
several data items and treat them as one individual unit. For example, if a transaction
Ti needs to access the entire database and it uses a locking protocol, then Ti must lock
each item in the database, which is a time-consuming process. Hence it would be better if
Ti could issue a single lock request to lock the entire database. On the other hand, if
transaction Ti needs to access only a few data items, it should not be required to lock
the entire database. A data item can be one of the following.
2. Database Field
3. Database record
4. Disk block
5. Whole file
6. Whole database
The size of a data item is often called the data item granularity. Fine granularity
refers to small item sizes, whereas coarse granularity refers to large item sizes. The best
item size depends on the type of transaction. A hierarchy of data granularities, where the
small granularities are nested within larger ones, can be represented graphically as a tree.
In the tree, each node represents an independent data item, and each non-leaf node of the
multiple granularity tree represents the data associated with its descendants.
Solution: The write-ahead log (WAL) protocol guarantees that no data modification is
written to disk before the associated log record is written to disk. This supports the
atomicity and durability properties of a transaction. At the time a modification is made to a
page in the buffer, a log record that records the modification is built in the log cache. This log
record must be written to disk before the associated dirty page is flushed from the
buffer cache to disk. If the dirty page were flushed before the log record is written, the
dirty page would create a modification on disk that cannot be rolled back if the transaction
fails before the log record is written to disk. WAL states that:
For Undo: Before a data item's AFIM (after image) is flushed to the database disk (overwriting
the BFIM, its before image), its BFIM must be written to the log and the log must be saved on
stable storage (log disk).
For Redo: Before a transaction executes its commit operation, all its AFIMs must
be written to the log and the log must be saved on stable storage.
TU 2070
1 (a) What is a database management system? Discuss the advantages of using a database
management system over a file system.
Solution: A Database Management System (DBMS) is a set of programs used
to store, retrieve and manipulate data in a convenient and efficient way. The main goal of
a DBMS is to hide the underlying complexities of data
management from users and provide an easy interface to them. Some common examples of
DBMS software are Oracle, Sybase, Microsoft SQL Server, DB2, MySQL, dBase and MS
Access.
To enforce security: Not every user of the database system should be able to
access all data. Different checks can be established for each type of access
(retrieve, modify, delete, etc) to each piece of information in the database.
Solution: Data abstraction is the process of providing users with an easy interface by hiding
the underlying complexities of data management from them. A database system provides users
with an abstract view of the data. Data abstraction is provided in database management systems
by using the three-level schema (ANSI/SPARC) architecture.
Physical level
It is the lowest level of abstraction and describes how the data are actually stored on
disk and some of the access mechanisms commonly used for retrieving this data. While
designing this layer, the main objective is to optimize performance by minimizing the
number of disk accesses during the various database operations. The database system
hides many of the lowest-level storage details from the database programmer. The DBMS
developer is the person who deals with this level. The database administrator may be
aware of certain details of the physical organization of the data.
Figure: Three-schema architecture (view level: View 1, View 2, ..., View n; conceptual/logical
level; stored database)
Logical level
It is the next higher level of data abstraction, which describes what data are stored in the
database and what relationships exist among those data. It is also known as the conceptual
level, and the schema at this level is called the logical schema (conceptual schema). The logical
level describes the stored data in terms of the data model of the DBMS. Programmers
and database administrators work at this level of abstraction. Database users do not
need to have knowledge of this level.
View level
It is the highest level of abstraction and describes only a part of the database, hiding
some information from the user. At the view level, computer users see a set of application
programs that hide details of data types. Similarly, at the view level several views of the
database are defined, and database users see only these views. The schema at this level is
called the external schema. Views also play a vital role in providing a security mechanism that
prevents users from accessing certain parts of the database, such as an employee's salary.
2 (a) Construct an ER diagram (ERD) to record the marks that students get in
different exams of different course offerings.
Solution:
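Figure: ER diagram for recording marks (recoverable labels: entities COURSE and MARKS;
relationships Takes and Get; attributes Name, Address, subject, Date, Center, Title, CID,
Credit-hours; cardinality labels 1, 1, M, N)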
(b) Define integrity constraint. Discuss domain constraint with suitable example.
Solution:
Domain constraint
data type
Length or size
D1 D2 Dn
Student
A1 A2 …………… An
In the above student table, the attribute A1 draws value from domain D1, A2 from D2
and so on.
SQL allows us to create new domains from existing data types by using create domain
clause as below:
Once the domains are created we can use them as data types of attributes while creating
relations as below:
eid varchar(5),
ename varchar(10),
Salary dollars
3. (a) With the information given below, calculate any three members of F+
R = (A, B, C, G, H, I)
F = {A→B, A→C, CG→I, B→H}
Compute closure of F (F+).
Solution:
Since A→B, B→H then A→H [by transitivity rule]
Since A→B, A→C then A→BC [by union rule]
Since A→C then AG→CG [by augmentation rule]
Since AG→CG, CG→I then AG→I [by transitivity rule]
Note that CG→I cannot be split into C→I and G→I: the decomposition rule applies only to the
right-hand side of an FD.
Hence F+ contains, among others, {A→A, B→B, C→C, H→H, G→G, I→I, A→B, A→C, CG→I, B→H,
A→H, A→BC, AG→CG, AG→I}
Here, the first six FDs are obtained by the reflexivity axiom.
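Membership in F+ can be checked mechanically with the attribute closure algorithm: X → Y is in F+ exactly when Y is contained in the closure of X under F. The sketch below uses this standard technique; the helper names closure and implies are hypothetical, and attributes are written as single letters.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already covered, the right side follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(lhs, rhs, fds):
    """Return True if lhs -> rhs is a member of F+."""
    return set(rhs) <= closure(lhs, fds)


# F = {A->B, A->C, CG->I, B->H}
F = [("A", "B"), ("A", "C"), ("CG", "I"), ("B", "H")]

print(sorted(closure("A", F)))
print(sorted(closure("AG", F)))
print(implies("A", "H", F), implies("AG", "I", F), implies("C", "I", F))
```

The closure of AG covers every attribute of R, so AG → I is in F+, whereas the closure of C is just {C}, so C → I is not.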
Solution:
Employee-Department
Emp-Id  Emp-Name  Emp-Salary  Dept-No  Dept-Name
1       Bhupi     40000       D1       BBA
1       Bhupi     40000       D2       CSIT
2       Bindu     30000       D3       BBS
3       Arjun     60000       D2       CSIT
In the above relation {Emp-Id, Dept-No} is the primary key, so Emp-Name, Emp-Salary
and Dept-Name all depend upon {Emp-Id, Dept-No}. However, Emp-Id → Emp-Name,
Emp-Id → Emp-Salary and Dept-No → Dept-Name also hold, so these non-key attributes depend on
only part of the key (partial dependency). Hence this relation is not in 2 NF.
We convert this relation into 2 NF by decomposing it into three
relations:
Employee(Emp-Id, Emp-Name, Emp-Salary)
Emp-Dept(Emp-Id, Dept-No)
Department(Dept-No, Dept-Name)
Fig: Relations in 2 NF
Student
S-Id  S-Name  Age  Sex
1     Laxmi   21   Female
2     Binita  22   Female
3     Rajesh  32   Male
4     Aayan   21   Male
Hostel
Sex     Hostel-Name
Female  White house
Male    Red house
Fig: Relations in 3 NF
4. Consider the following supplier database, where primary keys are underlined.
supplier(supplier-id, supplier-name, city)
supplies(supplier-id, part-id, quantity)
parts(part-id, part-name, color, weight)
Construct the following relational algebra queries for this relational database
a. Find the name of all suppliers located in city "Kathmandu".
π supplier-name (σ city = "Kathmandu" (supplier))
e Find the name of all suppliers who supply more than 30 different parts.
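Queries of this kind can be prototyped in Python by treating each relation as a list of dictionaries. The sketch below covers selection and projection for query (a) and a grouping step of the kind query (e) needs. All helper names and sample rows are hypothetical, and the threshold is lowered from 30 to 1 so that the small sample data produces a result.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Projection: keep only attrs and eliminate duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


def suppliers_with_more_than(supplies, supplier, n):
    """Names of suppliers that supply more than n different parts."""
    parts = {}
    for row in supplies:
        parts.setdefault(row["supplier-id"], set()).add(row["part-id"])
    ids = {sid for sid, part_ids in parts.items() if len(part_ids) > n}
    return [s["supplier-name"] for s in supplier if s["supplier-id"] in ids]


# Hypothetical sample data.
supplier = [
    {"supplier-id": 1, "supplier-name": "Himal Traders", "city": "Kathmandu"},
    {"supplier-id": 2, "supplier-name": "Koshi Suppliers", "city": "Biratnagar"},
    {"supplier-id": 3, "supplier-name": "Bagmati Stores", "city": "Kathmandu"},
]
supplies = [
    {"supplier-id": 1, "part-id": "P1", "quantity": 10},
    {"supplier-id": 1, "part-id": "P2", "quantity": 5},
    {"supplier-id": 2, "part-id": "P1", "quantity": 7},
    {"supplier-id": 3, "part-id": "P3", "quantity": 2},
]

query_a = project(select(supplier, lambda r: r["city"] == "Kathmandu"),
                  ["supplier-name"])
query_e = suppliers_with_more_than(supplies, supplier, 1)
print(query_a, query_e)
```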
5. (a). What is serializable schedule? How can you test a schedule for conflict
serializability?
Solution: A given non-serial schedule of n transactions is serializable if it is equivalent
to some serial schedule. That is, if a non-serial schedule produces the same result as some
serial schedule, then it is said to be serializable. A schedule
that is not serializable is called non-serializable. The main objective of serializability is
to find non-serial schedules that allow transactions to execute concurrently without
interfering with one another and that produce a result that could be produced by a
serial execution. In serializability, the ordering of reads and writes is important. If two
transactions only read a data item, they do not conflict and order is not important. If two
transactions read or write completely separate data items, they do not conflict
and order is not important. If one transaction writes a data item and another reads or
writes the same data item, then the order of execution is important.
Schedule compliance with conflict serializability can be tested with the precedence
graph (serializability graph, serialization graph, conflict graph). Precedence graph is the
directed graph representing precedence of transactions in the schedule, as reflected by
precedence of conflicting operations in the transactions. In the precedence
graph transactions are nodes and precedence relations are directed edges. There exists
an edge from a first transaction to a second transaction, if the second transaction is in
conflict with the first. A schedule is conflict-serializable if and only if its precedence
graph is acyclic. This means that a cycle consisting of committed transactions only is
generated in the precedence graph, if and only if conflict-serializability is violated.
5. Test the precedence graph for the existence of a cycle. If a cycle is encountered, the
schedule is not serializable. Otherwise the schedule is serializable.
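The precedence graph test can be sketched in Python as below. This is a minimal illustration, assuming a schedule is given as a list of (transaction, operation, item) triples with operation 'r' or 'w'; the function names and the two sample schedules are hypothetical.

```python
def precedence_graph(schedule):
    """Return the set of edges (Ti, Tj) induced by conflicting operations."""
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def serial_order(schedule):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    nodes = sorted({txn for txn, _, _ in schedule})
    edges = precedence_graph(schedule)
    indegree = {n: 0 for n in nodes}
    for _, target in edges:
        indegree[target] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:  # topological sort
        node = ready.pop(0)
        order.append(node)
        for source, target in sorted(edges):
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    # Nodes left unprocessed lie on a cycle.
    return order if len(order) == len(nodes) else None


# r1(x); r3(x); w3(x); w1(x); r2(x)
s1 = [("T1", "r", "x"), ("T3", "r", "x"), ("T3", "w", "x"),
      ("T1", "w", "x"), ("T2", "r", "x")]
# r3(x); r2(x); w3(x); r1(x); w1(x)
s2 = [("T3", "r", "x"), ("T2", "r", "x"), ("T3", "w", "x"),
      ("T1", "r", "x"), ("T1", "w", "x")]

print(serial_order(s1))
print(serial_order(s2))
```

The first schedule has edges in both directions between T1 and T3, so no serial order exists. The second is acyclic and is equivalent to running T2, T3 and T1 in that order.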
Example 1: Draw the precedence graph for the following schedule and identify whether it is
conflict serializable or not. If it is conflict serializable, also identify an equivalent
serial schedule.
T1 T2 T3
Read(x)
Write(x)
Commit
Write(x)
Commit
Write(x)
Commit
Solution
Figure: Precedence graph for schedule G
Since the above precedence graph of schedule G, with 3 transactions, contains a cycle
(of length 2, with two edges), the given schedule G is not conflict serializable.
(b). Discuss the recovery technique based on deferred update with concurrent execution in a
multiuser environment.
Figure: Transaction timeline up to the system crash
In the above example, transactions T2 and T3 are ignored because they did not reach
their commit points, but transaction T4 is redone because its commit point is after the last
checkpoint. On the other hand, we do not need to redo transaction T1 because its
commit point is before the last checkpoint and hence the effect of T1 is already recorded in
the database. For example, in the case of schedule R2 given above, transaction T1 is redone
because it is in the committed list, and transaction T2 is undone because it
failed before reaching its commit point.
The above algorithm can be made more efficient by noting that if a database item x has
been updated more than once by committed transactions since the last checkpoint, it is
only necessary to redo the last update of x from the log file during recovery, as the other
updates would be overwritten by the last update anyway. For example, in the case of
schedule R3 given above, transactions T2 and T3 are undone because they are in the active
list, but transaction T4 is redone because it committed after the last checkpoint. The recovery
manager ignores transaction T1 because its commit point is before the last checkpoint.
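The redo and ignore decisions described for T1 to T4 can be sketched in Python. This is a simplified illustration under stated assumptions: the log record format, the function name recover_deferred and the sample log are all hypothetical, and write records carry only the new value because deferred update never needs to undo anything.

```python
def recover_deferred(log, db):
    """Redo transactions that committed after the last checkpoint.

    Transactions that never committed are ignored: under deferred update
    their writes never reached the database, so nothing has to be undone.
    """
    checkpoints = [i for i, rec in enumerate(log) if rec[0] == "checkpoint"]
    last_checkpoint = checkpoints[-1] if checkpoints else -1

    started = [rec[1] for rec in log if rec[0] == "start"]
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    redo = {rec[1] for rec in log[last_checkpoint + 1:] if rec[0] == "commit"}

    # Reapply the writes in log order, so the last update of an item wins.
    for rec in log:
        if rec[0] == "write" and rec[1] in redo:
            _, _, item, value = rec
            db[item] = value

    ignored = [t for t in started if t not in committed]
    return sorted(redo), ignored


log = [
    ("start", "T1"), ("write", "T1", "a", 10), ("commit", "T1"),
    ("start", "T4"), ("write", "T4", "b", 20),
    ("checkpoint",),
    ("start", "T2"), ("write", "T2", "c", 30),
    ("commit", "T4"),
    ("start", "T3"), ("write", "T3", "d", 40),
    # system crash happens here
]
db = {"a": 10}  # T1 committed before the checkpoint, so its effect is on disk

redone, ignored = recover_deferred(log, db)
print(redone, ignored, db)
```

Applying the writes in forward log order gives the same final state as the optimisation mentioned above, which scans for only the last update of each item.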