unit-1-introduction-to-dbms
Objectives:
To understand the basic concepts and the applications of database systems
To master the basics of SQL and construct queries using SQL
To understand the relational database design principles
To become familiar with the basic issues of transaction processing and concurrency control
To become familiar with database storage structures and access techniques
Outcomes:
Demonstrate the basic elements of a relational database management system
Ability to identify the data models for relevant problems
Ability to design entity-relationship diagrams, convert them into an RDBMS schema, and formulate
SQL queries on the respective data
Apply normalization for the development of application software
INTRODUCTION TO DBMS:
Data is facts and statistics, stored or flowing freely over a network; it is generally raw and
unprocessed.
Data becomes information when it is processed into something meaningful.
A database is a collection of inter-related data from which data can be retrieved, inserted and deleted
efficiently.
It also organizes the data in the form of tables, schemas, views, reports, etc.
For example: A college database organizes the data about the admin, staff, students, faculty, etc.
Difference between DBMS and file system:
1. DBMS: A collection of data for which the user is not required to write the managing procedures.
   File system: A collection of data for which the user has to write the procedures for managing the data.
2. DBMS: Gives an abstract view of data that hides the details.
   File system: Exposes the details of data representation and storage.
3. DBMS: Provides a crash recovery mechanism, i.e., it protects the user from system failure.
   File system: Has no crash recovery mechanism, i.e., if the system crashes while some data is being entered, the
   content of the file will be lost.
4. DBMS: Provides a good protection mechanism.
   File system: It is very difficult to protect a file under the file system.
5. DBMS: Contains a wide variety of sophisticated techniques to store and retrieve the data.
   File system: Cannot store and retrieve the data efficiently.
6. DBMS: Takes care of concurrent access to data using some form of locking.
   File system: Concurrent access causes many problems, such as one user reading a file while another is deleting
   or updating some information in it.
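The crash-recovery point above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the account table, the transfer function and the overdraft rule are assumptions made up for this example, not part of the notes.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move amount between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            (bal,) = conn.execute("SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
            if bal < 0:
                # Simulated failure in the middle of the work (hypothetical rule)
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds and is committed
failed = transfer(conn, 1, 2, 500)  # fails halfway and is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer neither account keeps the half-finished update, which is the behaviour a plain file would not give without extra hand-written code.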
History of DBMS:
Data is a collection of facts and figures. As the volume of collected data grew day by day, it needed to be
stored in a device or software that is safe.
Charles Bachman was the first person to develop the Integrated Data Store (IDS), which was based on the
network data model and for which he received the Turing Award (the most prestigious award in computer
science, often described as equivalent to the Nobel Prize). IDS was developed in the early 1960s.
In the late 1960s, IBM (International Business Machines Corporation) developed the Information Management
System (IMS), which is still used in many places. It was based on the
hierarchical database model. In 1970, the relational database model was proposed by
Edgar Codd. Many of the database systems we use today are based on the relational model, which has been
considered the standard database model ever since.
During the 1970s, IBM developed the Structured Query Language (SQL) as part of its System R project. SQL was
later adopted as a standard query language by ANSI and ISO. Transaction management techniques for processing
transactions were developed by James (Jim) Gray, for which he received the Turing Award.
A DBMS is software that allows the creation, definition and manipulation of a database, allowing users to store,
process and analyse data easily.
A DBMS provides an interface or tool to perform various operations such as creating a database,
storing data in it, updating data, creating tables in the database and a lot more.
A DBMS also provides protection and security to the databases.
It also maintains data consistency in case of multiple users. Here are some examples of popular DBMSs
used these days:
MySQL
Oracle
SQL Server
IBM DB2
DATABASE APPLICATIONS:
1. Telecom: A database keeps track of the information regarding calls made, network usage,
customer details, etc.
2. Industry: Whether it is a manufacturing unit, warehouse or distribution centre, each one needs a database to
keep records of goods coming in and going out.
3. Banking System: For storing customer info, tracking day-to-day credit and debit transactions, generating
bank statements, etc.
4. Sales: To store customer information, production information and invoice details.
5. Airlines: To travel by air, we make reservations in advance; this reservation information, along with
the flight schedule, is stored in a database.
6. Education sector: Database systems are frequently used in schools and colleges to store and retrieve the
data regarding student details, staff details, course details, exam details, payroll data, attendance details, fee
details, etc.
Characteristics of DBMS:
Data stored in Tables: Data is never stored directly in the database. Data is stored in tables created
inside the database.
Reduced Redundancy: In the modern world hard drives are very cheap, but earlier, when hard drives
were expensive, unnecessary repetition of data in a database was a big problem. A DBMS follows
Normalisation, which divides the data in such a way that repetition is minimal.
Data Consistency: On live data, i.e. data that is being continuously updated and added, maintaining the
consistency of data can become a challenge. A DBMS handles it by itself.
Support for Multiple Users and Concurrent Access: A DBMS allows multiple users to work on it (update, insert,
delete data) at the same time and still manages to maintain data consistency.
Query Language: A DBMS provides users with a simple query language, using which data can be easily
fetched, inserted, deleted and updated in a database.
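As a rough illustration of the query-language point, the sketch below runs the four basic operations (insert, update, delete, fetch) as SQL statements through Python's built-in sqlite3 module. The student table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)")

# Insert three rows (sample data, purely illustrative)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meena", "Pune")],
)
# Update one row
conn.execute("UPDATE student SET city = 'Mumbai' WHERE roll_no = 2")
# Delete one row
conn.execute("DELETE FROM student WHERE roll_no = 3")
# Fetch what is left
rows = conn.execute("SELECT roll_no, name, city FROM student ORDER BY roll_no").fetchall()
```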
Advantages of DBMS:
Controls database redundancy: It can control data redundancy because it stores all data in one single
database, so each piece of data needs to be recorded only once.
Data sharing: In a DBMS, the authorized users of an organization can share data among multiple users.
Easy Maintenance: It is easy to maintain due to the centralized nature of the database system.
Reduced time: It reduces development time and maintenance needs.
Backup: It provides backup and recovery subsystems which create automatic backups of data to guard against
hardware and software failures and restore the data if required.
Multiple user interfaces: It provides different types of user interfaces such as graphical user interfaces and
application program interfaces.
Disadvantages of DBMS:
Cost of Hardware and Software: It requires a high-speed processor and a large memory size to run
DBMS software.
Size: It occupies a large amount of disk space and needs a large memory to run efficiently.
Complexity: A database system creates additional complexity and requirements.
Higher impact of failure: A failure has a high impact because in most organizations all
the data is stored in a single database, and if the database is damaged due to electric failure or database
corruption, the data may be lost forever.
Database:
A database is an organized collection of related data of an organization, stored in a formatted way and shared by
multiple users.
The main features of data in a database are:
1. It must be well organized
2. It is related
3. It is accessible in a logical order without any difficulty
4. It is stored only once
For example:
Consider the roll no, name and address of a student stored in a student file. It is a collection of related data with an
implicit meaning.
Data in the database may be persistent, integrated and shared.
Persistent:
Data is persistent if it can be removed from the database only by an explicit request from a user to remove it.
Integrated:
A database can be a collection of data from different files. When any redundancy among those files is
removed, the database is said to hold integrated data.
Sharing Data:
The data stored in the database can be shared by multiple users simultaneously without affecting the correctness
of data.
Why Database over file system:
In order to overcome the limitations of a file system, a new approach was required. Hence the database approach
emerged. A database is a persistent collection of logically related data. The initial attempts were to provide a
centralized collection of data. A database has a self-describing nature. It contains not only the data but also a
description of that data, and it supports sharing and integration of an organization's data in a single database.
A small database can be handled manually, but a large database with multiple users is difficult to
maintain by hand. In that case a computerized database is useful. The advantages of a database system over
traditional, paper-based methods of record keeping are:
Compactness: No need for large amounts of paper files.
Speed: The machine can retrieve and modify the data faster than a human being.
Less drudgery: Much of the maintenance of files by hand is eliminated.
Accuracy: Accurate, up-to-date information is fetched as per the requirement of the user at any time.
Functions of DBMS:
1. Defining database schema: It must provide a facility for defining the database structure and for specifying
access rights for authorized users.
2. Manipulation of the database: The DBMS must have functions for insertion of records into the database, updating
of data, deletion of data, and retrieval of data.
3. Sharing of database: The DBMS must share data items among multiple users while maintaining consistency of
data.
4. Protection of database: It must protect the database against unauthorized users.
5. Database recovery: If the system fails for any reason, the DBMS must facilitate database recovery.
Database systems are made up of complex data structures. To ease user interaction with the database, the
developers hide irrelevant internal details from users. This process of hiding irrelevant details from the user is
called data abstraction.
We have three levels of abstraction:
Physical level: This is the lowest level of data abstraction. It describes how data is actually stored in the database.
The complex data structure details are visible at this level.
Logical level: This is the middle level of the 3-level data abstraction architecture. It describes what data is stored
in the database.
View level: This is the highest level of data abstraction. It describes the user interaction with the database system.
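One way to picture the view level is an SQL view that exposes only part of a table. The sketch below uses Python's built-in sqlite3 module; the staff table and the staff_directory view are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: the full table, including a column most users should not see
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", [(1, "Anil", 50000), (2, "Sara", 62000)])

# View level: a restricted window onto the same data
conn.execute("CREATE VIEW staff_directory AS SELECT id, name FROM staff")

cur = conn.execute("SELECT * FROM staff_directory ORDER BY id")
view_columns = [d[0] for d in cur.description]  # column names visible through the view
view_rows = cur.fetchall()
```

Users who query staff_directory never see the salary column, and neither level says anything about how the rows are laid out on disk (the physical level).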
Definition of schema: The design of a database is called the schema. A schema is of three types: physical
schema, logical schema and view schema.
The design of a database at the physical level is called the physical schema. How the data is stored in blocks of
storage is described at this level.
The design of a database at the logical level is called the logical schema. Programmers and database administrators
work at this level. Here data is described as certain types of data records stored in data
structures; however, internal details such as the implementation of the data structures are hidden at this level
(they are available at the physical level).
The design of a database at the view level is called the view schema. It generally describes end-user interaction with
database systems.
Definition of instance: The data stored in a database at a particular moment of time is called an instance of
the database. The database schema defines the variable declarations in tables that belong to a particular database; the
value of these variables at a moment of time is called the instance of that database.
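The difference between schema and instance can be sketched as follows, assuming SQLite (via Python's sqlite3 module) and a made-up course table. In SQLite the schema is readable from the built-in sqlite_master table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

def schema(conn):
    """The table definitions: these stay fixed while data changes."""
    return [r[0] for r in conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")]

def instance(conn):
    """The rows stored right now: this is the instance at this moment."""
    return conn.execute("SELECT code, title FROM course ORDER BY code").fetchall()

schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO course VALUES ('CS101', 'DBMS')")
schema_after, instance_after = schema(conn), instance(conn)
```

Inserting a row changes the instance but leaves the schema untouched.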
The Relational Model (RM) represents the database as a collection of relations. A relation is a table
of values. Every row in the table represents a collection of related data values. These rows in the table denote a
real-world entity or relationship.
The table name and column names help to interpret the meaning of the values in each row. The data are
represented as a set of relations. In the relational model, data are stored as tables. However, the physical
storage of the data is independent of the way the data are logically organized.
Relational Model Concepts:
1. Attribute: Each column in a table. Attributes are the properties which define a relation, e.g.,
Student_Rollno, NAME, etc.
2. Tables: In the relational model, relations are saved in table format. A table is stored along with its
entities. A table has two properties: rows and columns. Rows represent records and columns represent
attributes.
3. Tuple: A single row of a table, which contains a single record.
4. Relation Schema: A relation schema represents the name of the relation with its attributes.
5. Degree: The total number of attributes in the relation is called the degree of the relation.
6. Cardinality: The total number of rows present in the table.
7. Column: The column represents the set of values for a specific attribute.
8. Relation instance: A relation instance is a finite set of tuples in the RDBMS. Relation instances
never have duplicate tuples.
9. Relation key: One or more attributes whose values uniquely identify each row are called the relation key.
10. Attribute domain: Every attribute has a pre-defined set of permitted values, known as the attribute
domain.
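The terms above can be tried out on a tiny relation modelled directly in Python. This is only a sketch: the student relation and its values are invented, and a Python set stands in for the relation instance because a set, like a relation, holds no duplicate tuples.

```python
# Relation schema: student(roll_no, name, age)
attributes = ("roll_no", "name", "age")

# Relation instance: a set of tuples
tuples = {
    (1, "Asha", 19),
    (2, "Ravi", 20),
    (2, "Ravi", 20),  # exact duplicate: the set keeps only one copy
}

degree = len(attributes)   # number of attributes (columns)
cardinality = len(tuples)  # number of tuples (rows)

# A column is the set of values taken by one attribute
name_column = sorted(t[attributes.index("name")] for t in tuples)
```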
Keys in DBMS:
A key in a DBMS is an attribute or set of attributes that helps you identify a row (tuple) in a relation (table).
Keys also allow you to find the relation between two tables. A key uniquely identifies a row in a table by a
combination of one or more columns in that table.
Why do we need a key?
Here are some reasons for using keys in a DBMS:
Keys help you identify any row of data in a table. In a real-world application, a table could contain
thousands of records, and some records could otherwise be duplicated. Keys ensure that you can uniquely identify a
table record despite these challenges.
Keys allow you to establish and identify relationships between tables.
Keys help you enforce identity and integrity in the relationship.
There are mainly eight different types of keys in DBMS, each with its own functionality:
Super Key - a set of one or more attributes that uniquely identifies rows in a table.
Primary Key - a column or group of columns in a table that uniquely identifies every row in that table.
Candidate Key - a minimal set of attributes that uniquely identifies tuples in a table, i.e. a super key
with no redundant attributes.
Alternate Key - a candidate key that was not chosen as the primary key.
Foreign Key - a column that creates a relationship between two tables by referring to the primary key of another
table. The purpose of foreign keys is to maintain data integrity and allow navigation between two different
instances of an entity.
Compound Key - has two or more attributes that together allow you to uniquely recognize a specific record. It is
possible that each column is not unique by itself within the database.
Composite Key - a combination of two or more columns that uniquely identifies each row. The combination
guarantees uniqueness even though the individual columns may not.
Surrogate Key - an artificial key which aims to uniquely identify each record. Such keys are created when
you don't have any natural primary key.
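The relation between super keys and candidate keys can be checked mechanically on sample data. The functions and rows below are hypothetical helpers written for this illustration. Note the limitation: a key is a property of the schema, so sample rows can only show that an attribute set is not a key; they cannot prove that it always will be one.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal superkeys for these rows, trying smaller attribute sets first."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if any(set(k) <= set(attrs) for k in found):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found

rows = [
    {"roll_no": 1, "email": "a@x.edu", "name": "Asha", "dept": "CSE"},
    {"roll_no": 2, "email": "r@x.edu", "name": "Ravi", "dept": "CSE"},
    {"roll_no": 3, "email": "s@x.edu", "name": "Asha", "dept": "ECE"},
    {"roll_no": 4, "email": "m@x.edu", "name": "Asha", "dept": "CSE"},
]
keys = candidate_keys(rows, ("roll_no", "email", "name", "dept"))
```

Here roll_no and email both come out as candidate keys; if roll_no is picked as the primary key, email is the alternate key. The pair (roll_no, name) is a super key but not a candidate key, because name is redundant.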
Syntax
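The syntax to create a primary key using the ALTER TABLE statement in SQL is:
ALTER TABLE table_name
ADD CONSTRAINT constraint_name PRIMARY KEY (column1, column2, ... column_n);
The following SQL creates a FOREIGN KEY on the "PersonID" column when the "Orders" table is
created: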
CREATE TABLE Orders (
    OrderID int NOT NULL,
    OrderNumber int NOT NULL,
    PersonID int,
    PRIMARY KEY (OrderID),
    -- every PersonID stored in Orders must match a row in Persons
    FOREIGN KEY (PersonID) REFERENCES Persons(PersonID)
);
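To see the foreign key at work, the statement can be run against SQLite through Python's sqlite3 module. This is a sketch under assumptions: the Persons table is not defined in these notes, so a minimal version (with a made-up LastName column) is created here, and the sample rows are invented. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn enforcement on

# Assumed parent table; only PersonID is implied by the Orders definition
conn.execute("CREATE TABLE Persons (PersonID int NOT NULL, LastName varchar(255), PRIMARY KEY (PersonID))")
conn.execute("""
    CREATE TABLE Orders (
        OrderID int NOT NULL,
        OrderNumber int NOT NULL,
        PersonID int,
        PRIMARY KEY (OrderID),
        FOREIGN KEY (PersonID) REFERENCES Persons(PersonID)
    )
""")

conn.execute("INSERT INTO Persons VALUES (1, 'Rao')")
conn.execute("INSERT INTO Orders VALUES (10, 7001, 1)")  # person 1 exists: accepted

try:
    conn.execute("INSERT INTO Orders VALUES (11, 7002, 99)")  # no person 99
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
```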
ER model:
o ER model stands for Entity-Relationship model. It is a high-level data model used to
define the data elements and relationships for a specified system.
o It develops a conceptual design for the database. It also gives a very simple and easy-to-design view
of data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-relationship diagram.
For example, suppose we design a school database. In this database, the student will be an entity with
attributes like address, name, id, age, etc. The address can be another entity with attributes like city,
street name, pin code, etc., and there will be a relationship between them.
Components of ER Diagram:
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is represented as a
rectangle.
Consider an organization as an example: manager, product, employee, department, etc. can be taken as
entities.
Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't contain any key
attribute of its own. The weak entity is represented by a double rectangle.
2. Attribute
An attribute is used to describe a property of an entity. An ellipse is used to represent an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
Key Attribute
The key attribute is used to represent the main characteristic of an entity. It represents a primary key.
The key attribute is represented by an ellipse with the text underlined.
Composite Attribute
An attribute that is composed of many other attributes is known as a composite attribute. The composite
attribute is represented by an ellipse, and the ellipses of its component attributes are connected to it.
Multivalued Attribute
An attribute can have more than one value. Such attributes are known as multivalued attributes. A
double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
Derived Attribute
An attribute that can be derived from another attribute is known as a derived attribute. It is represented by a
dashed ellipse.
For example, a person's age changes over time and can be derived from another attribute like date of birth.
3. Relationship
A relationship is used to describe the relation between entities. A diamond or rhombus is used to represent
the relationship.
One-to-One Relationship
When only one instance of each entity is associated with the relationship, it is known as a one-to-one
relationship.
For example, a female can marry one male, and a male can marry one female.
One-to-many relationship
When only one instance of the entity on the left and more than one instance of the entity on the right
are associated with the relationship, it is known as a one-to-many relationship.
For example, a scientist can make many inventions, but each invention is made by one specific
scientist.
Many-to-one relationship
When more than one instance of the entity on the left and only one instance of the entity on the right
are associated with the relationship, it is known as a many-to-one relationship.
For example, a student enrolls for only one course, but a course can have many students.
Many-to-many relationship
When more than one instance of the entity on the left and more than one instance of the entity on the
right are associated with the relationship, it is known as a many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can have many employees.
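In a relational database, a many-to-many relationship such as the employee and project example is commonly stored in a third table that holds one row per pairing. The sketch below assumes SQLite via Python's sqlite3 module; the table name works_on and all sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
    -- One row per (employee, project) pairing
    CREATE TABLE works_on (
        emp_id  INTEGER REFERENCES employee(emp_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (emp_id, proj_id)
    );
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Website');
    INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
""")

# One employee, many projects
projects_of_asha = [r[0] for r in conn.execute(
    "SELECT p.title FROM works_on w JOIN project p ON p.proj_id = w.proj_id "
    "WHERE w.emp_id = 1 ORDER BY p.title")]

# One project, many employees
staff_on_payroll = [r[0] for r in conn.execute(
    "SELECT e.name FROM works_on w JOIN employee e ON e.emp_id = w.emp_id "
    "WHERE w.proj_id = 10 ORDER BY e.name")]
```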
Notation of ER diagram:
A database can be represented using notations. In an ER diagram, many notations are used to express the
cardinality. These notations are as follows:
Integrity Constraints
o Integrity constraints are a set of rules used to maintain the quality of information.
o Integrity constraints ensure that data insertion, updating and other processes are
performed in such a way that data integrity is not affected.
o Thus, integrity constraints guard against accidental damage to the database.
Example:
o The entity integrity constraint states that a primary key value can't be null.
o This is because the primary key value is used to identify individual rows in a relation, and if the primary key
has a null value, we can't identify those rows.
o A table can contain null values in fields other than the primary key field.
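The entity integrity rule can be tried out with a short sketch in SQLite through Python's sqlite3 module. The student table is hypothetical. One SQLite-specific detail: for a non-integer primary key, SQLite rejects NULL only if the column is also declared NOT NULL, so the declaration below states both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no TEXT PRIMARY KEY NOT NULL, name TEXT)")

# NULL in a non-key field is allowed
conn.execute("INSERT INTO student VALUES ('e101', NULL)")

def try_insert(roll_no, name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (roll_no, name))
        return True
    except sqlite3.IntegrityError:
        return False

null_key_accepted = try_insert(None, "sumit")        # NULL primary key
duplicate_key_accepted = try_insert("e101", "sumit")  # repeated primary key
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```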
Database Basics:
Data item:
The data item, also called a field in data processing, is the smallest unit of data that has meaning to its
users.
E.g.: "e101", "sumit"
A subschema is a schema derived from an existing schema as per the user requirement. More
than one subschema may be created for a single conceptual schema.
A database management system that provides three levels of data abstraction is said to follow a three-level
architecture:
External level
Conceptual level
Internal level
The external level is the highest level of database abstraction. At this level, many views are defined
for different users' requirements. A view describes only a subset of the database. Any number of user views
may exist for a given global (conceptual) schema.
For example, each student has a different view of the time table. The view of a Btech (CSE) student is different
from the view of a Btech (ECE) student. Thus this level of abstraction is concerned with different categories
of users. Each external view is described by means of a schema called an external schema or subschema.
Conceptual level:
At this level of database abstraction all the database entities and the relationships among them are included.
One conceptual view represents the entire database. This conceptual view is defined by the conceptual schema.
The conceptual schema hides the details of physical storage structures and concentrates on describing entities,
data types, relationships, user operations and constraints.
It describes all the records and relationships included in the conceptual view. There is only one conceptual
schema per database. It includes features that specify the checks needed to retain data consistency and integrity.
Internal level:
It is the lowest level of abstraction, closest to the physical storage method used. It indicates how the data will
be stored and describes the data structures and access methods to be used by the database. The internal view is
expressed by the internal schema.
The following aspects are considered at this level:
1. Storage allocation, e.g. B-tree, hashing
2. Access paths, e.g. specification of primary and secondary keys, indexes, etc.
3. Miscellaneous, e.g. data compression and encryption techniques, optimization of the internal structures.
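An index is a typical internal-level access path: adding one changes how a query is executed without changing the query or its result. The sketch below assumes SQLite through Python's sqlite3 module and uses its EXPLAIN QUERY PLAN statement; the student table, the index name idx_student_name and the data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(i, "s%d" % i) for i in range(1, 101)])

def plan(conn, sql):
    """Return SQLite's description of how it intends to run the query."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT roll_no FROM student WHERE name = 's42'"

plan_before = plan(conn, query)  # no index on name yet
conn.execute("CREATE INDEX idx_student_name ON student (name)")
plan_after = plan(conn, query)   # the planner can now use the index

result = conn.execute(query).fetchall()  # same answer either way
```

The exact wording of the plan text differs between SQLite versions, so the sketch only looks for the index name in it.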
Database users:
Naive users:
Users who need not be aware of the presence of the database system or any other system supporting their usage
are considered naive users. A user of an automatic teller machine falls into this category.
Online users:
These are users who may communicate with the database directly via an online terminal or indirectly via a user
interface and application program. These users are aware of the database system and also know the data
manipulation language of the system.
Application programmers:
Professional programmers who are responsible for developing application programs or user interfaces utilized
by the naive and online users fall into this category.
Database Administration:
A person who has central control over the system is called the database administrator (DBA).
The functions of the DBA are:
1. Creation and modification of conceptual Schema definition
2. Implementation of storage structure and access method.
3. Schema and physical organization modifications.
4. Granting of authorization for data access.
5. Integrity constraints specification.
6. Execute immediate recovery procedure in case of failures.
7. Ensure physical security to database.
Database language :
Elements of DBMS:
DML pre-compiler:
It converts DML statements embedded in an application program into normal procedure calls in the host language.
The pre-compiler must interact with the query processor in order to generate the appropriate code.
DDL compiler:
The DDL compiler converts the data definition statements into a set of tables. These tables contain information
concerning the database and are in a form that can be used by other components of the DBMS.
File manager:
File manager manages the allocation of space on disk storage and the data structure used to represent
information stored on disk.
Database manager:
A database manager is a program module which provides the interface between the low-level data stored in the
database and the application programs and queries submitted to the system.
The responsibilities of the database manager are:
1. Interaction with file manager: The data is stored on the disk using the file system provided by the
operating system. The database manager translates the different DML statements into low-level file system
commands, so the database manager is responsible for the actual storing, retrieving and updating of data in the
database.
2. Integrity enforcement: The data values stored in the database must satisfy certain constraints (e.g. the age
of a person can't be less than zero). These constraints are specified by the DBA. The database manager checks the
constraints and stores the data in the database only if they are satisfied.
3. Security enforcement: The database manager enforces the security measures that protect the database from
unauthorized users.
4. Backup and recovery: The database manager detects failures that occur due to different causes (like disk
failure, power failure, deadlock, software error) and restores the database to its original state.
5. Concurrency control: When several users access the same database file simultaneously, there may be
possibilities of data inconsistency. It is the responsibility of the database manager to control the problems
that occur with concurrent transactions.
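The kind of inconsistency that concurrency control prevents can be shown with a small sketch in plain Python. It is not how a real database manager is built; the function, the Account class and the amounts are all made up. The first part replays, step by step, two transactions that both read the old balance before either writes. The second part uses a lock so that each read-modify-write runs as one unit.

```python
import threading

def interleaved_without_control(balance, deposit, withdrawal):
    """Replay a bad interleaving: both transactions read before either writes."""
    read_a = balance                # transaction A reads
    read_b = balance                # transaction B reads the same old value
    balance = read_a + deposit      # A writes
    balance = read_b - withdrawal   # B writes and silently discards A's update
    return balance

class Account:
    """A balance protected by a lock (a stand-in for DBMS locking)."""
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def apply(self, delta):
        with self._lock:  # read-modify-write happens as one unit
            self.balance = self.balance + delta

lost_update_result = interleaved_without_control(100, 50, 30)

account = Account(100)
threads = [threading.Thread(target=account.apply, args=(d,)) for d in (50, -30)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without control the deposit of 50 is lost and the balance ends at 70; with the lock both changes are applied and the balance is 120.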
Query processor:
The query processor is used to interpret the online user's query and convert it into an efficient series of operations
in a form capable of being sent to the database manager for execution. The query processor uses the data dictionary
to find the details of the data file and, using this information, creates a query plan (access plan) to execute the query.
Data Dictionary:
The data dictionary is the table which contains information about database objects. It contains information like:
1. External, conceptual and internal database description
2. Description of entities and attributes, as well as the meaning of data elements
3. Synonyms, authorization and security codes
4. Database authorization
DBMS STRUCTURE:
[Figure: DBMS structure, showing the DBMS with its database manager and file manager, the data file and the data dictionary]
Some main differences between a database management system and a file-processing system are:
• Both systems contain a collection of data and a set of programs which access that data. A database
management system coordinates both the physical and the logical access to the data, whereas a
file-processing system coordinates only the physical access.
• A database management system reduces the amount of data duplication by ensuring that a physical piece of
data is available to all programs authorized to have access to it, whereas data written by one program in a
file-processing system may not be readable by another program.
• A database management system is designed to allow flexible access to data (i.e., queries), whereas a
file-processing system is designed to allow predetermined access to data (i.e., compiled programs).
• A database management system is designed to coordinate multiple users accessing the same data at the
same time. A file-processing system is usually designed to allow one or more programs to access different data
files at the same time. In a file-processing system, a file can be accessed by two programs concurrently only if
both programs have read-only access to the file.
Q. List five responsibilities of a database management system. For each responsibility, explain the
problems that would arise if the responsibility were not discharged.
A general purpose database manager (DBM) has five responsibilities:
a. Interaction with the file manager.
b. Integrity enforcement.
c. Security enforcement.
d. Backup and recovery.
e. Concurrency control.
If these responsibilities were not met by a given DBM (and the text points out that sometimes a responsibility is
omitted by design, such as concurrency control on a single-user DBM for a microcomputer), the following
problems can occur, respectively:
a. No DBMS can do without this: if there is no file manager interaction, then nothing stored in the files can be
retrieved.
b. Consistency constraints may not be satisfied: account balances could go below the minimum allowed,
employees could earn too much overtime (e.g., hours > 80), or airline pilots may fly more hours than allowed by
law.
c. Unauthorized users may access the database, or users authorized to access part of the database may be able
to access parts of the database for which they lack authority. For example, a high school student could get
access to national defense secret codes, or employees could find out what their supervisors earn.
d. Data could be lost permanently, rather than at least being available in a consistent state that existed prior to
a failure.
e. Consistency constraints may be violated despite proper integrity enforcement in each transaction. For
example, incorrect bank balances might be reflected due to simultaneous withdrawals and deposits, and so on.
EXERCISES:
ER-MODEL
Data model:
The data model describes the structure of a database. It is a collection of conceptual tools for describing data,
data relationships and consistency constraints. The various types of data model are:
• Object based logical model
    o ER-model
    o Functional model
    o Object oriented model
    o Semantic model
• Record based logical model
    o Hierarchical database model
    o Network model
    o Relational model
• Physical model
The entity-relationship data model perceives the real world as consisting of basic objects, called entities, and
relationships among these objects. It was developed to facilitate database design by allowing specification of an
enterprise schema, which represents the overall logical structure of a database.
The E-R data model employs three basic notions: entity sets, relationship sets and attributes.
Entity sets:
An entity is a “thing” or “object” in the real world that is distinguishable from all other objects. For example,
each person in an enterprise is an entity. An entity has a set of properties, and the values for some set of properties
may uniquely identify an entity.
E.g.: BOOK is an entity and its properties (called attributes) are bookcode, booktitle, price, etc.
An entity set is a set of entities of the same type that share the same properties, or attributes. The set of all
persons who are customers at a given bank, for example, can be defined as the entity set customer.
Attributes:
An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of
an entity set.
E.g.: Customer is an entity and its attributes are customerid, customername, custaddress, etc.
An attribute, as used in the E-R model, can be characterized by the following attribute types.
Derived Attribute:
The values for this type of attribute can be derived from the values of existing attributes.
E.g.: age can be derived as (currentdate - birthdate), and experience_in_year can be calculated as
(currentdate - joindate).
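A derived attribute is computed when needed rather than stored. As a minimal sketch, the hypothetical function below derives age in whole years from a stored birth date; the function name and sample dates are chosen only for illustration.

```python
from datetime import date

def age_in_years(birth_date, today):
    """Whole years elapsed between birth_date and today."""
    # Has this year's birthday already happened (or is it today)?
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)

day_before_birthday = age_in_years(date(2000, 6, 15), date(2024, 6, 14))
on_birthday = age_in_years(date(2000, 6, 15), date(2024, 6, 15))
```

Because only the birth date is stored, the age never goes stale. The same idea applies to experience_in_year with the joining date.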
Relationship sets:
A relationship is an association among several entities.
A relationship set is a set of relationships of the same type. Formally, it is a mathematical relation on n >= 2
entity sets. If E1, E2, …, En are entity sets, then a relationship set R is a subset of
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}, where (e1, e2, …, en) is a relationship.
[Figure: E-R diagram of the relationship set borrow between the entity sets customer and loan]
Consider the two entity sets customer and loan. We define the relationship set borrow to denote the association
between customers and the bank loans that the customers have.
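The formal definition can be checked on a toy case with n = 2. In the sketch below the entity sets and the pairs in borrow are invented sample values; the point is only that borrow is a subset of all possible (customer, loan) pairs.

```python
from itertools import product

# Two small entity sets (identifiers are made up)
customer = {"c1", "c2"}
loan = {"L1", "L2", "L3"}

# Every possible pair (e1, e2) with e1 in customer and e2 in loan
all_pairs = set(product(customer, loan))

# The relationship set: only the pairs that actually hold
borrow = {("c1", "L1"), ("c2", "L2"), ("c2", "L3")}

is_relationship_set = borrow <= all_pairs  # subset test
loans_of_c2 = sorted(ln for c, ln in borrow if c == "c2")
```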
Participation constraints:
The participation constraints specify whether the existence of an entity depends on its being related to another
entity via the relationship. There are two types of participation constraints:
Total:
When all the entities of an entity set participate in a relationship type, the participation is called total. For
example, the participation of the entity set student in the relationship set ‘opts’ is said to be total because
every enrolled student must opt for a course.
Partial:
When it is not necessary for all the entities of an entity set to participate in a relationship type, the participation
is called partial. For example, the participation of the entity set student in ‘represents’ is partial, since not every
student in a class is a class representative.
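As a rough illustration, the two kinds of participation can be told apart by checking whether every entity occurs in at least one relationship. The function name participation and the sample students and courses are assumptions made for this sketch.

```python
def participation(entity_set, relationship_set, position=0):
    """Return 'total' if every entity appears in the relationship set, else 'partial'."""
    # Collect the entities that occur at the given position of some relationship.
    participating = {pair[position] for pair in relationship_set}
    return "total" if entity_set <= participating else "partial"

student = {"S1", "S2", "S3"}

# Every student opts for some course.
opts = {("S1", "DBMS"), ("S2", "DBMS"), ("S3", "OS")}

# Only one student is a class representative.
represents = {("S2", "ClassA")}

opts_kind = participation(student, opts)
represents_kind = participation(student, represents)
```

Note that this inspects one particular database state; a participation constraint in the design is a rule that every valid state must satisfy.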
Weak Entity:
Entity types that do not contain any key attribute, and hence cannot be identified independently, are called weak
entity types. A weak entity can be identified uniquely only by considering some of its attributes in
conjunction with the primary key attribute of another entity, which is called the identifying owner entity.
Generally, a partial key is attached to a weak entity type; it is used to uniquely identify the weak entities
related to a particular owner entity. The following restrictions must hold:
The owner entity set and the weak entity set must participate in a one-to-many relationship set. This relationship
set is called the identifying relationship set of the weak entity set.
The weak entity set must have total participation in the identifying relationship.
Example:
Consider the entity type dependent related to the employee entity, which is used to keep track of the dependents of
each employee. The attributes of dependent are: name, birthdate, sex and relationship. Each employee entity
is said to own the dependent entities that are related to it. However, note that the ‘dependent’ entity does not
exist on its own; it is dependent on the employee entity. In other words, if an employee
leaves the organization, the dependents related to that employee cannot exist without the ‘employee’ entity. Thus it is a weak entity.
Keys:
Super key:
A super key is a set of one or more attributes that, taken collectively, allow us to identify uniquely an entity in the
entity set.
For example: (customer-id), (cname, customer-id), (cname, telno)
Candidate key:
In a relation R, a candidate key for R is a subset of the set of attributes of R which has the following properties:
Uniqueness: No two distinct tuples in R have the same values for the candidate key.
Irreducibility: No proper subset of the candidate key has the uniqueness property.
E.g.: (cname, telno)
Primary key:
The primary key is the candidate key that is chosen by the database designer as the principal means of
identifying entities within an entity set. The remaining candidate keys (if any) are called alternate keys.
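The uniqueness and irreducibility properties can be tested mechanically on a sample relation. The sketch below is only illustrative: the function names and the three customer tuples are assumptions, and checking one instance shows only that a set of attributes is consistent with being a key, since a key is a rule for all valid instances.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs (uniqueness)."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no proper subset is one (irreducibility)."""
    if not is_superkey(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_superkey(rows, subset):
                return False
    return True

# Hypothetical sample tuples of the customer relation.
customers = [
    {"customer-id": 1, "cname": "Asha", "telno": "111"},
    {"customer-id": 2, "cname": "Asha", "telno": "222"},
    {"customer-id": 3, "cname": "Ravi", "telno": "111"},
]
```

On this sample, (cname, customer-id) is a super key but not a candidate key, because customer-id alone is already unique, whereas (cname, telno) is a candidate key.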
Advanced ER-diagram:
Abstraction is the simplification mechanism used to hide superfluous details of a set of objects. It allows one to
concentrate on the properties that are of interest to the application.
There are two main abstraction mechanisms used to model information:
Aggregation:
Aggregation is the process of compiling information on an object, thereby abstracting a higher-level object. In
this manner, the entity person is derived by aggregating the characteristics name, address and ssn. Another form
of aggregation is abstracting a relationship between objects and viewing the relationship as an object.
[Figure: aggregation example with the entity sets Employee, Branch, Job and Manager and the relationship sets Works-on and Manages.]
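ER Diagram for College Database
[Figure: E-R diagram of the college database. Legible labels include the entity sets Student (rollno, name, address) and Course (course id, cname, duration) joined by the relationship opts, a Head relationship with a Date attribute, and an entity with the attributes name, address and relationship.]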
2. For each weak entity type W in the ER diagram, we create another relation R that contains all simple
attributes of W. If E is the owner entity of W, then the key attribute of E is also included in R. This key attribute
is set as a foreign key attribute of R. The combination of the primary key attribute of the owner entity type and the
partial key of the weak entity type forms the key of the weak entity type.
GUARDIAN((rollno, name) (primary key), address, relationship)
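A minimal sketch of this mapping, using Python's built-in sqlite3 module. It assumes that the owner of GUARDIAN is a STUDENT relation keyed by rollno; the column types, the ON DELETE CASCADE rule and the sample rows are illustrative choices, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE student (
    rollno INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE guardian (
    rollno       INTEGER NOT NULL REFERENCES student(rollno) ON DELETE CASCADE,
    name         TEXT NOT NULL,          -- partial key of the weak entity
    address      TEXT,
    relationship TEXT,
    PRIMARY KEY (rollno, name)           -- owner key + partial key
);
""")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO guardian VALUES (1, 'Meera', 'Pune', 'mother')")

# A guardian without an owning student is rejected by the foreign key.
try:
    conn.execute("INSERT INTO guardian VALUES (99, 'Nobody', NULL, NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the owner removes its weak entities as well.
conn.execute("DELETE FROM student WHERE rollno = 1")
remaining_guardians = conn.execute("SELECT COUNT(*) FROM guardian").fetchone()[0]
```

The cascade mirrors the idea that a weak entity cannot exist without its owner.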
Binary Relationships:
One-to-one relationship:
For each 1:1 relationship type R in the ER diagram involving two entities E1 and E2, we choose one of the
entities (say E1), preferably one with total participation, and add the primary key attribute of the other entity E2 as a foreign key
attribute in the table of E1. We also include all the simple attributes of relationship type R in E1, if
any. For example, the DEPARTMENT relation has been extended to include HEAD_ID and the attribute DATE_FROM of the
relationship.
DEPARTMENT(D_NO, D_NAME, HEAD_ID, DATE_FROM)
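The DEPARTMENT relation above could be declared as follows with Python's sqlite3 module. This is a sketch under assumptions: HEAD_ID is taken to reference a FACULTY table keyed by ID, and the UNIQUE constraint is one possible way to keep the relationship one-to-one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE faculty (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE department (
    d_no      INTEGER PRIMARY KEY,
    d_name    TEXT NOT NULL,
    head_id   INTEGER NOT NULL UNIQUE REFERENCES faculty(id),  -- foreign key for the 1:1 relationship
    date_from TEXT                                             -- attribute of the relationship
);
""")
conn.execute("INSERT INTO faculty VALUES (1, 'Rao')")
conn.execute("INSERT INTO department VALUES (10, 'CSE', 1, '2020-01-01')")

# The same faculty member cannot head a second department.
try:
    conn.execute("INSERT INTO department VALUES (20, 'ECE', 1, '2021-01-01')")
    second_headship_rejected = False
except sqlite3.IntegrityError:
    second_headship_rejected = True
```

NOT NULL on head_id reflects the total participation of DEPARTMENT; without UNIQUE the same table would describe a one-to-many relationship instead.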
One-to-many relationship:
For each 1:n relationship type R involving two entities E1 and E2, we identify the entity type (say E1) at the
n-side of the relationship type R and include the primary key of the entity on the other side of the relationship (say E2) as
a foreign key attribute in the table of E1. We include all simple attributes (or simple components of a composite
attribute) of R, if any, in the table of E1.
For example:
Consider the works-in relationship between DEPARTMENT and FACULTY. For this relationship, choose the entity
at the N side, i.e. FACULTY, and add the primary key attribute of the other entity DEPARTMENT, i.e. DNO, as a foreign
key attribute in FACULTY.
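A small sqlite3 sketch of the FACULTY and DEPARTMENT example. Only DNO as the foreign key in FACULTY comes from the text; the remaining columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    dno    INTEGER PRIMARY KEY,
    d_name TEXT NOT NULL
);
CREATE TABLE faculty (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    dno  INTEGER REFERENCES department(dno)   -- foreign key placed on the N side
);
INSERT INTO department VALUES (10, 'CSE');
INSERT INTO faculty VALUES (1, 'Rao', 10), (2, 'Iyer', 10);
""")

# Many faculty rows may point at the same department row.
faculty_in_cse = [row[0] for row in conn.execute(
    "SELECT name FROM faculty WHERE dno = 10 ORDER BY id")]
```

Each faculty row holds at most one dno, so a faculty member belongs to at most one department, while a department can be referenced by any number of faculty rows.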
Many-to-many relationship:
For each m:n relationship type R, we create a new table (say S) to represent R. We also include the primary key
attributes of both participating entity types as foreign key attributes in S. Any simple attributes of the m:n
relationship type (or simple components of a composite attribute) are also included as attributes of S. For
example:
The m:n relationship taught-by between the entities COURSE and FACULTY should be represented as a new table.
The structure of the table will include the primary key of COURSE and the primary key of FACULTY.
TAUGHT-BY(ID (primary key of FACULTY table), course-id (primary key of COURSE table))
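The TAUGHT-BY table could look like this in sqlite3. The composite primary key over both foreign keys is a common choice and is assumed here, as are the column types and sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE faculty (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_id TEXT PRIMARY KEY, cname TEXT NOT NULL);
CREATE TABLE taught_by (
    id        INTEGER NOT NULL REFERENCES faculty(id),
    course_id TEXT    NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (id, course_id)        -- one row per (faculty, course) pair
);
INSERT INTO faculty VALUES (1, 'Rao'), (2, 'Iyer');
INSERT INTO course VALUES ('C1', 'DBMS'), ('C2', 'OS');
INSERT INTO taught_by VALUES (1, 'C1'), (1, 'C2'), (2, 'C1');
""")

# One faculty member teaches many courses ...
courses_of_rao = [r[0] for r in conn.execute(
    "SELECT course_id FROM taught_by WHERE id = 1 ORDER BY course_id")]
# ... and one course is taught by many faculty members.
teachers_of_c1 = [r[0] for r in conn.execute(
    "SELECT id FROM taught_by WHERE course_id = 'C1' ORDER BY id")]
```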
N-ary relationship:
For each n-ary relationship type R where n > 2, we create a new table S to represent R. We include as foreign
key attributes in S the primary keys of the relations that represent the participating entity types. We also include
any simple attributes of the n-ary relationship type (or simple components of composite attributes) as attributes of
S. The primary key of S is usually a combination of all the foreign keys that reference the relations representing
the participating entity types.
Multi-valued attributes:
For each multivalued attribute ‘A’, we create a new relation R that includes an attribute corresponding to A plus
the primary key attributes k of the relation that represents the entity type or relationship that has A as an attribute.
The primary key of R is then the combination of A and k.
For example, if a STUDENT entity has rollno, name and phone number, where phone number is a multivalued
attribute, then we create the table PHONE(rollno, phoneno), whose primary key is the combination of both attributes. In the
STUDENT table we need not store the phone number; it can simply be (rollno, name).
PHONE(rollno, phoneno)
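A short sqlite3 sketch of the STUDENT and PHONE relations. The column types and the sample phone numbers are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE student (
    rollno INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE phone (
    rollno  INTEGER NOT NULL REFERENCES student(rollno),
    phoneno TEXT    NOT NULL,
    PRIMARY KEY (rollno, phoneno)      -- combination of k and A
);
INSERT INTO student VALUES (1, 'Asha');
INSERT INTO phone VALUES (1, '98450-00001'), (1, '98450-00002');
""")

# All phone numbers of student 1, one row per value.
phones = [r[0] for r in conn.execute(
    "SELECT phoneno FROM phone WHERE rollno = 1 ORDER BY phoneno")]
```

Each value of the multivalued attribute becomes its own row, so every column stays atomic.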
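[Figure: generalization and specialization (is-a) hierarchy in which Account (Account_n, name, branch) is the higher-level entity set and Saving (interest) and Current (charges) are its lower-level entity sets.]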
Hierarchical Model:
A hierarchical database consists of a collection of records which are connected to one another through links.
A record is a collection of fields, each of which contains only one data value.
A link is an association between precisely two records.
The hierarchical model differs from the network model in that the records are organized as collections of
trees rather than as arbitrary graphs.
Tree-Structure Diagrams:
The schema for a hierarchical database consists of
o boxes, which correspond to record types
o lines, which correspond to links
Record types are organized in the form of a rooted tree.
o No cycles in the underlying graph.
o Relationships formed in the graph must be such that only
one-to-many or one-to-one relationships exist between a parent and a child.
Single Relationships:
Example: an E-R diagram with two entity sets, customer and account, related through a binary, one-to-many
relationship depositor.
The corresponding tree-structure diagram has
o the record type customer with three fields: customer-name, customer-street, and customer-city.
o the record type account with two fields: account-number and balance
o the link depositor, with an arrow pointing to customer
If the relationship depositor is one-to-one, then the link depositor has two arrows.
Only one-to-many and one-to-one relationships can be directly represented in the hierarchical model.
A many-to-many relationship must therefore be transformed. In choosing a transformation, consider the type of
queries expected and the degree to which the database schema fits the given E-R diagram.
In all versions of this transformation, the underlying database tree (or trees) will have replicated records.
For a many-to-many relationship between customer and account, create two tree-structure diagrams: T1, with the root
customer, and T2, with the root account.
In T1, create depositor, a many-to-one link from account to customer.
In T2, create account-customer, a many-to-one link from customer to account.
Virtual Records:
For many-to-many relationships, record replication is necessary to preserve the tree-structure organization
of the database.
o Data inconsistency may result when updating takes place
o Waste of space is unavoidable
Virtual record — contains no data value, only a logical pointer to a particular physical record.
When a record is to be replicated in several database trees, a single copy of that record is kept in one of the
trees and all other records are replaced with a virtual record.
Let R be a record type that is replicated in T1, T2, . . ., Tn. Create a new virtual record type virtual-R and
replace R in each of the n – 1 trees with a record of type virtual-R.
To eliminate the data replication in the two trees with roots customer and account, create virtual-customer and virtual-account.
Replace account with virtual-account in the first tree, and replace customer with
virtual-customer in the second tree.
Add a dashed line from virtual-customer to customer, and from virtual-account to account, to specify the
association between a virtual record and its corresponding physical record.
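The idea of a virtual record can be sketched with two small Python classes. The class names, field values and the read helper are assumptions made for illustration; a real hierarchical DBMS would implement the pointer at the storage level.

```python
class Record:
    """A physical record holding actual field values."""
    def __init__(self, fields):
        self.fields = fields
        self.children = []

class VirtualRecord:
    """Holds no data value, only a logical pointer to a physical record."""
    def __init__(self, target):
        self.target = target

    def read(self, name):
        # Follow the pointer to the single physical copy.
        return self.target.fields[name]

# Physical records (hypothetical sample data), one copy each.
hayes = Record({"customer-name": "Hayes", "customer-city": "Harrison"})
a102 = Record({"account-number": "A-102", "balance": 400})

# First tree: customer is the root; its child is a virtual-account.
hayes.children.append(VirtualRecord(a102))
# Second tree: account is the root; its child is a virtual-customer.
a102.children.append(VirtualRecord(hayes))

# An update touches the single physical copy and is visible through every pointer.
a102.fields["balance"] = 500
balance_seen_from_customer_tree = hayes.children[0].read("balance")
```

Since only one physical copy of each record exists, the update cannot leave the two trees inconsistent, and no space is spent on duplicate field values.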
Network Model:
Data are represented by collections of records.
o similar to an entity in the E-R model
o Records and their fields are represented as record type
type customer = record
    customer-name: string;
    customer-street: string;
    customer-city: string;
type account = record
    account-number: integer;
    balance: integer;
Relationships among data are represented by links
o similar to a restricted (binary) form of an E-R relationship
o restrictions on links depend on whether the relationship is many-to-many, many-to-one, or one-to-one.
Data-Structure Diagrams:
Schema representing the design of a network database.
A data-structure diagram consists of two basic components:
o Boxes, which correspond to record types.
o Lines, which correspond to links.
Specifies the overall logical structure of the database.
Since a link cannot contain any data value, represent an E-R relationship with attributes by a new record type
and links.
To represent an E-R relationship of degree 3 or higher, connect the participating record types through a new
record type that is linked directly to each of the original record types.
1. Replace entity sets account, customer, and branch with record types account, customer, and branch,
respectively.
2. Create a new record type Rlink (referred to as a dummy record type).
3. Create the following many-to-one links:
o CustRlnk from Rlink record type to customer record type
o AcctRlnk from Rlink record type to account record type
o BrncRlnk from Rlink record type to branch record type
The DBTG CODASYL Model:
o All links are treated as many-to-one relationships.
o To model many-to-many relationships, a record type is defined to represent the relationship and two links are
used.
DBTG Sets:
o The structure consisting of two record types that are linked together is referred to in the DBTG model as a
DBTG set.
o In each DBTG set, one record type is designated as the owner, and the other is designated as the member, of
the set.
o Each DBTG set can have any number of set occurrences (actual instances of linked records).
o Since many-to-many links are disallowed, each set occurrence has precisely one owner, and has zero or more
member records.
o No member record of a set can participate in more than one occurrence of the set at any point.
o A member record can participate simultaneously in several set occurrences of
different DBTG sets.
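The membership rules listed above can be sketched as a small Python class. The class DBTGSet, the set name branch-account and the sample records are hypothetical; the code illustrates the constraints and is not the CODASYL data manipulation language.

```python
class DBTGSet:
    """One DBTG set type: each occurrence has exactly one owner and zero or more members."""

    def __init__(self, name):
        self.name = name
        self.owner_of = {}      # member record -> owner record

    def connect(self, owner, member):
        # A member may belong to at most one occurrence of this set type.
        if member in self.owner_of:
            raise ValueError(f"{member} already belongs to an occurrence of {self.name}")
        self.owner_of[member] = owner

    def members(self, owner):
        # All member records in the occurrence owned by the given record.
        return [m for m, o in self.owner_of.items() if o == owner]

# Set type depositor: owner is customer, member is account.
depositor = DBTGSet("depositor")
depositor.connect("Hayes", "A-102")
depositor.connect("Hayes", "A-220")

# A second owner for the same member within the same set type is refused.
try:
    depositor.connect("Johnson", "A-102")
    second_owner_rejected = False
except ValueError:
    second_owner_rejected = True

# The same member record may still join an occurrence of a different set type.
branch_account = DBTGSet("branch-account")
branch_account.connect("Perryridge", "A-102")
```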
Class hierarchy
A class hierarchy can be viewed in one of two ways:
Specialization (Top Down Approach)
Generalization (Bottom Up Approach)
Specialization
Specialization is the process of identifying subsets of an entity set that have distinguishing characteristics. It breaks
an entity into multiple entities, going from the higher level (superclass) to the lower level (subclass).
In other words, specialization is the process of defining one or more entities from an existing entity.
The class Vehicle can be specialized into Car, Truck and Motorcycle (top-down approach).
Hence, Vehicle is the superclass and Car, Truck and Motorcycle are subclasses. All three inherit
attributes from Vehicle. Moreover, while these three share the inherited attributes, each contains
some other attributes which make it different.
Generalization
Generalization is the process of combining several entities that share common attributes or properties into a single
higher-level entity. The entity that is created contains the common features. Generalization is a bottom-up
process: it gives a higher-level view of the data built from lower-level data.
The classes Car, Truck and Motorcycle can be generalized into Vehicle (bottom-up approach). Car, Truck
and Motorcycle are subclasses while Vehicle is the superclass.
Basically, Vehicle contains the common attributes that are shared between Car, Truck and Motorcycle.
Aggregation
Example: Car, Truck and Motorcycle are all subclasses of the superclass Vehicle. They all inherit common
attributes from Vehicle, such as speed and colour, while they also have differing attributes, e.g. the number of wheels
is 4 for a Car but 2 for a Motorcycle.
Super classes
A superclass is the class from which many subclasses can be created. The subclasses inherit the characteristics
of a superclass. The superclass is also known as the parent class or base class.
In the above example, Vehicle is the Superclass and its subclasses are Car, Truck and Motorcycle.
Inheritance
Inheritance is the process of basing a class on another class, i.e. building a class on an existing class.
The new class contains all the features and functionalities of the old class in addition to its own.
The newly created class is known as the subclass or child class, and the original class is the parent
class or superclass.
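The Vehicle example maps directly onto class inheritance in Python. The attribute names speed, colour and wheels follow the example in these notes; the concrete values are made up.

```python
class Vehicle:
    """Superclass: holds the attributes common to all vehicles."""
    def __init__(self, speed, colour):
        self.speed = speed
        self.colour = colour

class Car(Vehicle):
    """Subclass: inherits speed and colour, adds its own wheel count."""
    wheels = 4

class Motorcycle(Vehicle):
    """Subclass: inherits speed and colour, adds its own wheel count."""
    wheels = 2

car = Car(speed=120, colour="red")
bike = Motorcycle(speed=90, colour="black")
```

Both subclasses reuse the constructor of Vehicle, while wheels exists only at the subclass level, so the superclass does not gain the attributes of its subclasses.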
Attribute inheritance: allows lower-level entities to inherit the attributes of higher-level
entities, but not vice versa.
In the diagram: the Vehicle entity has a relationship with the Cycle entity, so the lower-level entities
Car and Bus take part in that relationship as well, since they inherit
from Vehicle.
Examples of Entity vs Attribute:
Example 1:
Modelling phone as an entity allows extra information about phone numbers to be stored (plus multiple phone
numbers).
Example 2:
Should address be an attribute of Employees or an entity (connected to Employees by a
relationship)?
It depends upon the use we want to make of the address information, and on the
semantics of the data.
If we have several addresses per employee, address must be an entity (since
attributes cannot be set-valued).
And if the structure (city, street, etc.) is important, e.g., we want to retrieve
employees in a given city, then address must be modelled as an entity (since
attribute values are atomic).
Example 3:
Works_In2 does not allow an employee to work in a department for two or more periods. This
is similar to the problem of wanting to record several addresses for an employee:
we want to record several values of the descriptive attributes for each instance of this
relationship.
Examples of Entity vs. Relationship
The first ER diagram is fine if a manager gets a separate discretionary budget for each dept.
What if a manager gets a discretionary budget that covers all managed depts?
Redundancy: dbudget is stored for each dept managed by the manager.
Misleading: it suggests that dbudget is associated with the department-manager combination.
Degree of Relationship
In DBMS, the degree of a relationship is the number of entity types that are associated with the
relationship.
In a unary relationship, only one entity type is involved. Here, the degree of the relationship is 1. The unary
relationship is also known as a recursive relationship.
In a binary relationship, there are two entity types involved. The degree of the relationship is 2.
In a ternary relationship, there are three entity types involved. The degree of the relationship is 3.
For example,
we can create a ternary relationship 'parent' that relates a child, his father,
and his mother. Such a relationship can also be represented by two binary relationships, i.e.
mother and father, each of which relates a parent to the child. Thus, in this case, it is possible to represent the ternary
relationship by a set of distinct binary relationships.
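The parent example can be checked with Python sets. The names below are made up; the sketch splits the ternary relationship into the two binary ones and then rebuilds it by matching on the child.

```python
# Ternary relationship parent: (child, father, mother). Sample names are hypothetical.
parent = {
    ("Ravi", "Suresh", "Lata"),
    ("Anu", "Suresh", "Lata"),
    ("Kiran", "Mohan", "Geeta"),
}

# Two binary relationships obtained from the ternary one.
father = {(child, f) for child, f, m in parent}
mother = {(child, m) for child, f, m in parent}

# Rebuild the ternary relationship by joining the two on the child.
rebuilt = {(c, f, m) for c, f in father for c2, m in mother if c == c2}
```

The rebuilt set equals the original only because each child has exactly one father and one mother. For a ternary relationship without such a constraint, the join of its binary projections can contain combinations that were never in the original, so this replacement is not always possible.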
Example
If each policy is owned by just 1 employee, then a key constraint on Policies would mean that a policy
can cover only 1 dependent.
The 2nd diagram carries additional constraints.
Aggregation Vs Ternary Relationships
This example shows when to use aggregation rather than a ternary relationship. In short, each Project entity is sponsored by
one or more Department entities and each Department can sponsor zero, one or more Projects.
Each Sponsorship relationship has a Monitors relationship, which connects Employees with Sponsorship.
This can be expressed in 2 ER diagrams: one using aggregation and one using a ternary relationship.
Now suppose we want to express the additional constraint that each Sponsorship relationship is monitored by at
most one Employee. This cannot be done with the ternary relationship.
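One way to see what aggregation buys is to map it to tables. In the sqlite3 sketch below, Sponsorship is stored in its own table and Monitors refers to a whole sponsorship; the column names did, pid and eid and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE sponsors (
    did INTEGER NOT NULL,                 -- department
    pid INTEGER NOT NULL,                 -- project
    PRIMARY KEY (did, pid)
);
CREATE TABLE monitors (
    did INTEGER NOT NULL,
    pid INTEGER NOT NULL,
    eid INTEGER NOT NULL,                 -- monitoring employee
    PRIMARY KEY (did, pid),               -- at most one monitor per sponsorship
    FOREIGN KEY (did, pid) REFERENCES sponsors(did, pid)
);
INSERT INTO sponsors VALUES (1, 100), (1, 200);
INSERT INTO monitors VALUES (1, 100, 7);
""")

# A second employee for the same sponsorship violates the key constraint.
try:
    conn.execute("INSERT INTO monitors VALUES (1, 100, 8)")
    second_monitor_rejected = False
except sqlite3.IntegrityError:
    second_monitor_rejected = True

# A sponsorship may also exist without any monitor.
unmonitored = conn.execute("""
    SELECT COUNT(*) FROM sponsors s
    WHERE NOT EXISTS (SELECT 1 FROM monitors m
                      WHERE m.did = s.did AND m.pid = s.pid)
""").fetchone()[0]
```

Treating each sponsorship as an object in its own right gives the key constraint something to attach to, and it also lets a sponsorship exist before any employee monitors it.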