Database Management System MODULE - 1-NOTES
UNIT -1
This course provides fundamental and practical knowledge of database concepts: organizing
information, and storing and retrieving it in an efficient and flexible way from a
well-structured relational model. The course ensures that every student gains experience
in creating data models and designing databases.
Course Objectives
• Demonstrate basic database concepts, including the structure and operation of the
relational data model.
• Introduce simple and moderately advanced database queries using Structured Query
Language (SQL).
• Explain and successfully apply logical database design principles, including E-R
diagrams and database normalization.
• Demonstrate the concept of a database transaction and related database facilities,
including concurrency control, data object locking, and locking protocols.
Data is a collection of raw facts. It can be characters, text, words, numbers, pictures,
sound, or video.
Information is data that has been processed into a useful form, usually formatted in a
manner that allows it to be understood by a human.
Database Management System (DBMS) – It refers to the technology for creating and
managing databases.
A Database table is a collection of Rows and Columns that is used to organize information
about a single topic or object. Each Row within a Table corresponds to a single record and
contains several different attributes that describe the row.
A Database Table is the most common and simplest form of data storage in a relational
Database.
History of DBMS:
➢ Early 1960s: The first general-purpose DBMS, the Integrated Data Store (IDS), was
designed by Charles Bachman at General Electric. It formed the basis for the network
data model.
➢ Late 1960s: IBM developed the Information Management System (IMS), which formed the
basis for an alternative data representation framework, the hierarchical data model.
➢ 1970: Edgar Codd, at IBM's San Jose Research Laboratory, proposed a new data
representation framework, the relational data model.
➢ 1980s: The relational data model became the most widely used. The SQL query language
for relational databases was developed as part of IBM's System R project.
➢ Late 1980s: SQL was standardized. Later revisions of the standard include SQL:1999.
➢ James Gray won the Turing Award for his contributions to database transaction
management.
Fig: File system (Customer_Details file, Customer_Loan file) and DBMS (Customer_Details
table, Customer_Loan table), each on top of the Operating System (Disk Manager, File Manager)
9. File system: security is less; backup and restore are problematic.
   DBMS: security is more; backup and restore can be done.
Advantages of DBMS
1. Data Independence:
Application programs should not, ideally, be exposed to details of data representation and
storage. The DBMS provides an abstract view of the data that hides such details (a
separation of data and metadata).
2. Efficient Data Access:
A DBMS uses a variety of techniques to store and retrieve data efficiently, and data is
accessed through standardized SQL commands.
3. Data Integrity and Security:
If data is always accessed through the DBMS, the DBMS can enforce integrity constraints on
the data stored in tables, declared when the schema is created, and can enforce access
controls that govern what data is visible to different classes of users. Ex: bank (minimum
amount in an account), student result (pass mark), faculty (salary).
4. Data Administration:
When several users share the data, the administration of the data is centralized. The
administrator is responsible for organizing the data representation to minimize redundancy.
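The integrity-constraint idea in point 3 can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the table name account, its columns, and the minimum balance of 500 are hypothetical choices for illustration, not taken from these notes.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: the CHECK clause declares the "minimum amount" rule
# once, in the schema, so every application that inserts data is bound by it.
conn.execute("""
    CREATE TABLE account (
        acc_no  INTEGER PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance >= 500)
    )
""")

def try_insert(acc_no, balance):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (acc_no, balance))
        return True
    except sqlite3.IntegrityError:
        return False

valid_accepted = try_insert(1, 1200.0)   # satisfies the minimum balance
low_accepted = try_insert(2, 100.0)      # violates the CHECK constraint
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
```

Because the rule lives in the schema rather than in each program, no application can store an account below the minimum by accident.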
➢ Abstractions
➢ Schemas
➢ Data Independence
DDL (Data Definition Language) is used to define the external and conceptual schemas, using
commands such as create, alter, drop, rename, and truncate.
➢ Information about all three levels of schemas is stored in the system catalog.
➢ A database contains not only the user data but also metadata (data about data).
The data itself is the collection of files corresponding to user tables and indexes.
➢ An RDBMS maintains information about every table and index that it contains.
This descriptive information is itself stored as a collection of special tables
called the catalog tables, data dictionary, or system catalog.
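As a rough illustration of a system catalog, the sketch below uses SQLite (through Python's sqlite3 module), which keeps its catalog in a special table named sqlite_master. Other RDBMSs organize their catalogs differently; the students table and the index name here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements: each one adds an entry to the catalog.
conn.execute("CREATE TABLE students (sid INTEGER PRIMARY KEY, sname TEXT)")
conn.execute("CREATE INDEX idx_students_sname ON students (sname)")

# The catalog is itself queried like an ordinary table.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

for entry in catalog:
    print(entry)
```

The metadata (which tables and indexes exist) is stored and queried with the same mechanism as user data.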
The data in a DBMS is described at three levels of abstraction.
The database description consists of a schema at each of these three levels:
1) Conceptual schema
2) Physical schema
3) External schema
Fig: Levels of Schema
External Schema:
➢ External schema or view level: data access can be customized and authorized at the
level of an individual user or a group of users.
➢ The uppermost level is called the view level. It describes only a part of the database:
a variety of information is stored in the database, but a given user needs access to
only some of it.
➢ This level simplifies the user's interaction with the system and hides the complexity
of the conceptual schema.
➢ The external schema consists of a collection of one or more views and relations from
the conceptual schema.
➢ A user sees a view as a table, but the records of a view are not stored explicitly;
they are computed from the stored relations when needed.
Conceptual Schema:
➢ The conceptual schema, also called the logical schema, describes the stored data in
terms of the data model.
➢ In a relational DBMS, the conceptual schema describes all relations that are stored in
the database.
➢ Choosing the relations, and the fields for each relation/table, so as to arrive at a
good conceptual schema is called conceptual database design.
Physical Schema:
➢ The physical schema specifies additional storage details, i.e., how the relations in the
conceptual schema are stored on secondary storage devices such as disks and tapes.
➢ Eg: Store all relations as unsorted files of records (a file in a DBMS is either a
collection of records or a collection of pages, rather than a collection of characters as
in an operating system), and create indexes on the first column of the Students, Faculty,
and Courses relations, the sal column of Faculty, and the capacity column of Rooms.
Decisions about the physical schema are based on an understanding of how the data is
accessed. The process of arriving at a good physical schema is called physical database design.
Data Independence
➢ Application programs are insulated from changes in the way the data is structured
and stored.
➢ Data independence is achieved through the three levels of abstraction.
➢ Relations in the external schema are generated on demand from the relations in the
conceptual schema.
• Logical data independence: if the conceptual schema is changed, for example by adding or
removing columns of a table, the user's view should not be affected.
• Physical data independence: if the physical schema is changed, neither the conceptual
schema nor the external schema is affected. The conceptual schema hides details such as
how the data is actually laid out on disk, the file structure, and the choice of indexes.
• As long as the conceptual schema remains the same, we can change these storage details
without altering applications.
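A minimal sketch of data independence, assuming SQLite via Python's sqlite3 module. The faculty table, the faculty_public view, and the added dept column are hypothetical names. The view plays the role of an external schema; the query against it returns the same answer after a conceptual change (a new column) and a physical change (a new index).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: one stored relation.
conn.execute("CREATE TABLE faculty (fid INTEGER PRIMARY KEY, fname TEXT, sal REAL)")
conn.execute("INSERT INTO faculty VALUES (1, 'Rao', 50000.0)")

# External schema: a view that exposes only part of the relation.
conn.execute("CREATE VIEW faculty_public AS SELECT fid, fname FROM faculty")
before = conn.execute("SELECT * FROM faculty_public").fetchall()

# Change to the conceptual schema: add a column to the base table.
conn.execute("ALTER TABLE faculty ADD COLUMN dept TEXT")

# Change to the physical schema: add an index on sal.
conn.execute("CREATE INDEX idx_faculty_sal ON faculty (sal)")

# The application's query against the view is unchanged, and so is its result.
after = conn.execute("SELECT * FROM faculty_public").fetchall()
```

An application written against faculty_public needs no modification after either change.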
QUERIES:
A query is a request for data or information from a database table or a combination of
tables. The result may be returned as rows by Structured Query Language (SQL), or presented
as pictorials, graphs, or other complex results.
Questions involving the data stored in a DBMS are called queries.
A DBMS provides a specialized language, called the query language, in which queries can
be posed.
SQL is a database language used for storing, managing, and querying data in a relational
DBMS. RDBMSs (MySQL, Oracle, Informix, MS Access) use SQL as the standard database query
language.
The relational model supports two formal query languages:
1) Relational Calculus
2) Relational Algebra
Relational calculus is based on mathematical logic, and queries in this language have an
intuitive, precise meaning.
➢ The DBMS acts as an interface between the user and the database. The user requests
the DBMS to perform operations such as insert, delete, update, and retrieval on the
database.
➢ The components of the DBMS perform these requested operations on the database and
provide the necessary data to the users.
The DBMS accepts SQL commands generated from a variety of user interfaces. The query
evaluation engine produces a plan, executes it against the database, and returns the answer.
1) When a user issues a query, the parser translates the request into an internal form,
and the query optimizer uses information about how the data is stored to produce an
efficient plan for evaluating the query.
2) An execution plan is a blueprint for evaluating a query, represented as a tree of
relational operators (operators serve as the building blocks for evaluating queries posed
against the data, and bring the needed data into main memory).
1. The plan executor uses the operator evaluator to evaluate the plan.
2. The code that implements the relational operators sits on top of the file and access
methods layer.
3. The file and access methods layer uses this information to access the data requested
by the user, which is stored in files.
4. The buffer manager is responsible for bringing data from disk into main memory
for execution.
5. The disk space manager is responsible for providing space on the disk when data is
modified.
6. The transaction manager ensures that transactions request and release locks according
to a locking protocol, and schedules the execution of transactions.
7. The lock manager keeps track of requests for locks and grants locks on database objects
when they become available.
8. The DBMS supports concurrency control and crash recovery by carefully scheduling
user requests and maintaining a log of all changes to the database.
9. The recovery manager maintains a log and restores the system to a consistent state when
a crash occurs.
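To make the idea of an execution plan more concrete, the sketch below asks SQLite (via Python's sqlite3 module) to describe the plan its optimizer chose, using SQLite's EXPLAIN QUERY PLAN statement. This is specific to SQLite and its output format varies between versions; the faculty table and index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faculty (fid INTEGER PRIMARY KEY, fname TEXT, sal REAL)")
conn.execute("CREATE INDEX idx_faculty_sal ON faculty (sal)")

# Ask the optimizer how it would evaluate the query, without running it.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT fname FROM faculty WHERE sal > 40000"
).fetchall()

# The last column of each row is a human-readable description of one plan step.
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

The printed text states whether the table is scanned in full or searched through an index, which is the kind of decision the query optimizer makes.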
People who work with DBMS:
➢ Users (End users, Naïve users, Sophisticated users)
➢ Database Implementers
➢ Application Programmers
➢ DBA
➢ Users: A primary goal of a database system is to store data and to retrieve particular
data from the database whenever needed.
1) End users: users who interact with the system without writing programs; they simply
request what they want from the system.
2) Naïve users: users who interact with the system by invoking one of the application
programs that have been written previously. (Ex: at an ATM, the user simply enters a PIN
and then checks the account status or withdraws an amount by invoking the ATM
application program, without needing to know how that program works.)
3) Sophisticated users/Data Analysts: use SQL to generate answers to complex queries;
they also use data mining tools and Online Analytical Processing (OLAP) tools.
4) Database Implementers: These are the people who actually develop DBMS software.
5) Application Programmers: These people develop packages that facilitate data access
for end users (various applications such as university, railways, bank, etc.).
Responsibilities of DBA:
➢ The DBA designs the conceptual schema (what relations to store) and the physical
schema (how to store them).
➢ The DBA ensures security by granting permissions so that different users can access
only certain views and relations.
➢ The DBA ensures crash recovery and takes the necessary steps to restore the data to a
consistent state.
➢ The DBA is responsible for database tuning, i.e., modifying the database, in
particular the conceptual and physical schemas, based on users' requirements.
Part -II – DATA MODELS
Data Models:
➢ A database model defines the logical design and structure of a database, and defines
how data will be stored, accessed, and updated in a database management system.
➢ A data model is a collection of high-level data description constructs that hide many
low-level storage details. A DBMS allows a user to define the data to be stored in terms
of a data model.
➢ A data model ensures that all data objects required by the database are accurately
represented.
➢ The data model structure helps to define the relational tables, primary and foreign
keys, and stored procedures.
➢ It provides a clear picture of the base data and can be used by database developers to
create a physical database.
➢ It also helps to identify missing and redundant data.
➢ Though the initial creation of a data model is labor-intensive and time-consuming, in
the long run it makes upgrading and maintaining your IT infrastructure cheaper and faster.
➢ ER Model
The flat data model is the simplest data model. It lists all the data in a single table
consisting of rows and columns.
To access or manipulate the data, the computer has to read the entire flat file into
memory, which makes this model inefficient for all but the smallest data sets.
Fig: Flat Database Model
The hierarchical data model organizes data into a tree-like structure with a single root,
to which all the other data is linked. The hierarchy starts from the root data and expands
like a tree, adding child nodes to the parent nodes.
Advantages
➢ promotes data sharing.
Disadvantages:
➢ Complex to implement
➢ Difficult to manage
➢ Implementation limitations
➢ Lack of standards
Fig: Hierarchical Data Model
Physical data models describe how data is stored as files in the computer by representing
information such as record formats, record orderings, and access paths. A physical data
model describes how data is stored in computer memory, how it is scattered and ordered in
memory, and how it is retrieved from memory. In short, the physical data model represents
the data at the data layer or internal layer.
Key-value database:
A key-value database, also known as a key-value store, is a type of non-relational database
that uses a simple key-value method to store data. It stores data as a collection of
key-value pairs in which a key serves as a unique identifier. Both keys and values can be
anything, ranging from simple objects to complex compound objects.
Ex: HBase runs on top of HDFS (Hadoop Distributed File System), providing Bigtable-like
capabilities for Hadoop. HBase is a key/value store.
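The key-value idea can be sketched in a few lines of Python. This is only an in-memory toy built on a dictionary, not how HBase or any real key-value store is implemented; the class and method names are hypothetical.

```python
from typing import Any, Dict


class KeyValueStore:
    """Toy in-memory key-value store: each key uniquely identifies one value."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Writing to an existing key replaces its value.
        self._data[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("student:1", {"sname": "Raghu", "cgpa": 8.8})   # compound value
store.put("student:1:age", 25)                             # simple value
profile = store.get("student:1")
missing = store.get("student:99")
```

There is no fixed schema here: the value stored under a key can be a simple number or a nested object, and lookup is always by key.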
➢ The conceptual model establishes the entities, their attributes, and their
relationships.
➢ The logical data model defines the structure of the data elements and sets the
relationships between them.
➢ The physical data model describes the database-specific implementation of the
data model.
1) Requirement Analysis:
Understand what data is to be stored in database. This process is done by system
analyst of the enterprise/software company by conducting discussions with
groups/owners of the organization.
2) Conceptual Design:
The information gathered in the above step is used to develop a high-level description
of data to be stored in the database, along with constraints. It is transformed to ER
data Model.
➢ Entities - An entity is a real-world object that is represented in the database. Data
is stored about such entities. An entity can be any object, place, person, or class.
Example:
Types of Attributes
1. Simple attributes
2. Composite attributes
3. Single valued attributes
4. Multi valued attributes
5. Derived attributes
6. Key attributes
7. Descriptive Attribute
➢ Simple Attributes- Simple attributes are attributes that cannot be divided
further. Ex: age, class
➢ Composite Attributes- Composite attributes are attributes that are composed
of several simple attributes. Ex: address, name
➢ Single Valued Attributes- Single valued attributes are attributes that can
take only one value for a given entity from an entity set. Ex: gender, age
➢ Multi Valued Attributes- Multi valued attributes are attributes that can take
more than one value for a given entity from an entity set. Ex: mobile no, email id
➢ Derived Attributes- Derived attributes are attributes that can be derived from
other attribute(s). Ex: age is derived from DOB
➢ Key Attributes- Key attributes are attributes that can identify an entity
uniquely in an entity set.
Ex: Sno
A key is a minimal set of attributes whose values uniquely identify an entity in the
entity set. Each of {Sno}, {Sno, sname}, and {Sno, saddr, sname} identifies a student
uniquely, but only {Sno} is minimal.
➢ Descriptive Attribute- Descriptive attributes are used to record information about
the relationship, rather than about any one of the participating entities.
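The attribute types above can be pictured with a small Python sketch. The Student and Address classes and their field names are hypothetical; they only illustrate how a composite, a multi-valued, and a derived attribute differ in shape.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    """Composite attribute: made up of several simple attributes."""
    street: str
    city: str
    pincode: str


@dataclass
class Student:
    sno: int                      # key attribute
    gender: str                   # simple, single-valued attribute
    dob: date                     # stored attribute
    address: Address              # composite attribute
    mobile_nos: List[str] = field(default_factory=list)   # multi-valued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: age is computed from dob, not stored."""
        years = today.year - self.dob.year
        # Subtract one year if the birthday has not yet occurred this year.
        if (today.month, today.day) < (self.dob.month, self.dob.day):
            years -= 1
        return years


s = Student(1, "F", date(2000, 6, 15),
            Address("MG Road", "Hyderabad", "500001"),
            ["98756", "95678"])
age_day_before_birthday = s.age_on(date(2024, 6, 14))
age_on_birthday = s.age_on(date(2024, 6, 15))
```

Storing dob and computing the age on demand avoids keeping a value that would go stale every year.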
Relationship and Relationship sets:
A relationship is an association among two or more entities. Ex: the relationship Advisor
(or Teaches) associates the two entities Faculty and Student.
Types of Relationships:
The association between two or more entities is called a relationship. It is represented
by a diamond shape in an ER diagram.
➢ Unary Relationship
➢ Binary Relationship
➢ Ternary Relationship
Fig: Unary Relationship
Fig: One-To-One (entities P1 to P4 and C1 to C4)
2) One-to-Many: An entity in A is related to any number of entities in B, but an entity
in B is related to at most one entity in A.
Fig: One-to-Many (entities E1 to E5 and O1 to O3)
3) Many-to-One:
An entity in A is related to at most one entity in B, but an entity in B is related to any
number of entities in A.
Fig: Many-to-One (entities E1 to E5 and D1 to D3)
4) Many-to-Many:
An entity in A is related to any number of entities in B, and an entity in B is related to
any number of entities in A.
Fig: Many-to-Many (entities S1 to S4 and C1 to C4)
Keys:
A Key is a minimal set of Attributes whose values uniquely identify an entity in the set.
Types of Keys:
1) Super Key
2) Candidate Key
3) Primary Key
4) Composite Key
5) Alternate Key
6) Foreign Key
Super Key:
A super key is a set of one or more columns (attributes) that uniquely identifies the
rows/tuples in a table/relation.
Student Table
SID  SNAME   Phone number  Age  CGPA
1    Raghu   98756         25   8.8
2    Sravya  95678         22   9.0
3    Kavya   98515         23   8.5
4    kavya   6556          23   9.0
5    Sravya  921451        26   8.5
Step 1 (single attributes): {sid} {sname} {phone number} {age} {cgpa}
Step 2 (pairs of attributes): {sid, sname} {sid, phone number} {sid, age} {sid, cgpa}
{sname, phone number} {sname, age} - Not Found {sname, cgpa} - Not Found
{phone number, age} {phone number, cgpa} {age, cgpa}
Step 3 (three or more attributes):
{sid, sname, phone number} ... etc.
Super Key Set: {sid, sname, phone number, cgpa}
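The uniqueness test behind these steps can be sketched in Python. The helper name is_unique is hypothetical. Note the assumption: the check runs on the five sample rows only, so it can show that an attribute set is not a super key, but passing it does not prove the set is a super key for every legal instance of the table.

```python
# Sample rows from the Student table above.
ATTRS = ("sid", "sname", "phone", "age", "cgpa")
ROWS = [
    (1, "Raghu", "98756", 25, 8.8),
    (2, "Sravya", "95678", 22, 9.0),
    (3, "Kavya", "98515", 23, 8.5),
    (4, "kavya", "6556", 23, 9.0),
    (5, "Sravya", "921451", 26, 8.5),
]


def is_unique(attrs):
    """True if no two sample rows agree on all of the given attributes."""
    positions = [ATTRS.index(a) for a in attrs]
    projected = {tuple(row[p] for p in positions) for row in ROWS}
    return len(projected) == len(ROWS)


# Step 1: which single attributes are unique in the sample?
single_attr_unique = [a for a in ATTRS if is_unique((a,))]

# Adding attributes to a unique set keeps it unique, which is why every
# superset of {sid} is also a super key.
sid_pairs_unique = all(is_unique(("sid", a)) for a in ATTRS if a != "sid")
```

In this sample, sname, age, and cgpa each repeat a value, so none of them can be a super key on its own.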
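Candidate Key:
A candidate key is a minimal super key, i.e., a super key from which no attribute can be
removed without losing the uniqueness property.
Primary Key:
The candidate key chosen to identify the rows of a table is called the primary key.
Eg: {SID}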
Criteria(PK)
1) The Primary Key Values should NOT BE NULL/EMPTY
2) The values entered should be UNIQUE
3) It should not contain any redundant data.
Foreign Key:
A foreign key is a column (or set of columns) of a table that points to the primary key of
another table.
Eg: Student Table- Sid(PK)
Course Table- Cid(PK)
Enrolls -sid, cid
(Sid references student)
(cid references courses)
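The Student/Course/Enrolls example can be sketched with SQLite through Python's sqlite3 module. The column names sname and cname and the sample rows are hypothetical. One SQLite-specific assumption: foreign key enforcement is off by default and is switched on with PRAGMA foreign_keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled

conn.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, sname TEXT)")
conn.execute("CREATE TABLE course (cid INTEGER PRIMARY KEY, cname TEXT)")
conn.execute("""
    CREATE TABLE enrolls (
        sid INTEGER NOT NULL REFERENCES student (sid),
        cid INTEGER NOT NULL REFERENCES course (cid),
        PRIMARY KEY (sid, cid)
    )
""")

conn.execute("INSERT INTO student VALUES (1, 'Raghu')")
conn.execute("INSERT INTO course VALUES (101, 'DBMS')")
conn.execute("INSERT INTO enrolls VALUES (1, 101)")      # both references exist

try:
    # Student 99 does not exist, so this row would dangle.
    conn.execute("INSERT INTO enrolls VALUES (99, 101)")
    dangling_accepted = True
except sqlite3.IntegrityError:
    dangling_accepted = False

enroll_count = conn.execute("SELECT COUNT(*) FROM enrolls").fetchone()[0]
```

The DBMS rejects an enrollment that refers to a student who is not in the student table.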
Composite Key:
A key that consists of two or more attributes that together uniquely identify any record
in a table is called a composite key.
The attributes that together form the composite key are not keys independently or
individually.
Eg: sid, sub id, marks
Both sid and sub id are required to get the marks data.
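A minimal sketch of a composite key, again using SQLite via Python's sqlite3 module. The table name student_marks and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student_marks (
        sid    INTEGER NOT NULL,
        sub_id INTEGER NOT NULL,
        marks  INTEGER,
        PRIMARY KEY (sid, sub_id)      -- composite key: neither column is a key alone
    )
""")

# sid 1 repeats and sub_id 10 repeats, but each (sid, sub_id) pair is distinct.
conn.executemany(
    "INSERT INTO student_marks VALUES (?, ?, ?)",
    [(1, 10, 85), (1, 20, 90), (2, 10, 70)],
)

try:
    # Same (sid, sub_id) pair as an existing row: rejected.
    conn.execute("INSERT INTO student_marks VALUES (1, 10, 60)")
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

# Both parts of the key are needed to pick out one marks value.
one_mark = conn.execute(
    "SELECT marks FROM student_marks WHERE sid = ? AND sub_id = ?", (1, 20)
).fetchone()[0]
```

Individual values of sid or sub_id may repeat; only the combination has to be unique.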
Alternate Key:
Out of all candidate keys, only one gets selected as Primary key, remaining keys are known
as alternate keys or secondary keys.
Features of ER Model:
1) Key Constraints
2) Participation Constraints
3) Weak Entity
4) Class Hierarchies
5) Aggregation
Key Constraints: A key constraint is a condition or restriction, e.g., that each
department has at most one manager.
Note: Each department can be associated with several employees and locations, and each
location can be associated with several departments and employees.
Participation Constraints:
➢ Total Participation
➢ Partial Participation
➢ The participation of the entity set Departments in the relationship set Manages is
said to be total.
➢ A participation that is not total is said to be partial.
➢ For example, the participation of the entity set Employees in Manages is partial, since
not every employee gets to manage a department.
A weak entity set is one that does not have a primary key of its own.
A weak entity type normally has a partial key, which is the set of attributes that can
uniquely identify the weak entities related to the same owner entity.
A weak entity can be identified uniquely only by considering the primary key of another
(owner) entity.
The owner entity set and the weak entity set must participate in a one-to-many
relationship set (one owner entity, many weak entities).
The weak entity set must have total participation in this identifying relationship set.
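One common way to map a weak entity set to tables can be sketched as follows, using SQLite via Python's sqlite3 module. The employees (owner) and dependents (weak) tables and their columns are hypothetical examples. The weak entity's key combines the owner's primary key with the partial key pname, and deleting an owner removes its dependents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # needed for ON DELETE CASCADE in SQLite

# Owner entity set.
conn.execute("CREATE TABLE employees (ssn TEXT PRIMARY KEY, ename TEXT)")

# Weak entity set: pname is only a partial key; the full key is (ssn, pname).
# NOT NULL on ssn reflects total participation in the identifying relationship.
conn.execute("""
    CREATE TABLE dependents (
        ssn   TEXT NOT NULL REFERENCES employees (ssn) ON DELETE CASCADE,
        pname TEXT NOT NULL,
        age   INTEGER,
        PRIMARY KEY (ssn, pname)
    )
""")

conn.execute("INSERT INTO employees VALUES ('E1', 'Raghu')")
conn.execute("INSERT INTO employees VALUES ('E2', 'Sravya')")

# Two dependents share the name 'Anu'; the owner's ssn tells them apart.
conn.executemany(
    "INSERT INTO dependents VALUES (?, ?, ?)",
    [("E1", "Anu", 5), ("E2", "Anu", 7)],
)

# Removing an owner removes the weak entities that depend on it.
conn.execute("DELETE FROM employees WHERE ssn = 'E1'")
remaining = conn.execute("SELECT ssn, pname FROM dependents").fetchall()
```

A dependent cannot exist without its owner, which is what makes the entity set weak.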
a) Overlap Constraint: It determines whether two subclasses are allowed to contain the
same entity.
Eg: Can a person be both an hour_Emp entity and a Contract_Emp entity?
Ans: No
1) Specialization:
Employees is specialized into subclasses.
Specialization is the process of identifying subsets of an entity set (the superclass)
that share some characteristic.
The superclass is defined first, the subclasses are defined next, and then
subclass-specific attributes and relationship sets are added.
2) Generalization:
Generalization is a bottom-up approach in which two or more lower-level entities combine
to form a higher-level entity.
It resembles the superclass and subclass system; the only difference is the approach,
which is bottom-up. Entities are combined to form a more generalized entity.
Fig: Generalization – Bottom-Up Approach
Aggregation:
Aggregation is a process in which a relationship between two entities is treated as a
single entity.
Aggregation allows us to treat a relationship set as an entity set for the purpose of
participation in other relationships.
Fig: Aggregation
Conceptual Design with the ER Model:
Developing an ER diagram presents several choices, including the following:
a) Should a concept be modeled as an entity or an attribute?
b) Should a concept be modeled as an entity or a relationship?
c) What are the relationship sets and their participating entity sets?
Should we use binary or ternary relationships?
d) Should we use aggregation?
1) Entity Vs Attribute
2) Entity Vs Relationship
3) Binary Vs Ternary Relationship
4) Aggregation Vs Ternary Relationships
Entity Vs Attribute:
➢ Should address be an attribute of Employees or an Entity (Connected to Employees
by a relationship)?
➢ Depends upon the use we want to make of address information, and the semantics of
the data:
Entity Vs Relationship:
What if a manager gets a discretionary budget that covers all managed depts?
Redundancy: dbudget is stored for each department managed by the manager.
Misleading: suggests that dbudget is associated with the department-manager combination.
1st Requirement: We can impose a key constraint on Policies with respect to Covers, but
this means that the policy can cover only one dependent.
2nd Requirement: We can impose a total participation constraint on Policies.
IIIrd Requirement: In given fig, we cannot identify
If we don't need to record the until attribute of Monitors, then we might reasonably use a
ternary relationship as follows.
Consider the constraint that each sponsorship (of a project by a department) be monitored by at
most one employee. We cannot express this constraint in terms of the Sponsors2 relationship