Unit-1 dbms
In earlier days, data was stored manually using pen and paper, but after the computer was
invented, the same task could be done using files. A computer file is a resource,
identified by its name, that records data on a storage device in a computer. There are various
formats in which data can be stored, e.g. text files can be stored in .txt format while
pictures can be stored in .png format.
Advantages of File Processing System :
● Cost friendly –
There is a very minimal to no set up and usage fee for File Processing System. (In
most cases, free tools are inbuilt in computers.)
● Easy to use –
File systems require very basic learning and understanding, hence, can be easily
used.
● High scalability –
One can very easily switch from smaller to larger files as per one's needs.
Disadvantages of File Processing System :
● Inconsistent Data –
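Due to data redundancy, copies of the same data stored at different places might not
match each other.
● Lack of Atomicity –
Operations on data should be atomic, i.e. either the operation takes place as a whole
or does not take place at all. A file processing system gives no such guarantee, so a
failure midway through an operation can leave the data partially updated.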
Role of DBMS
A Database Management System (DBMS) is system software for easy, efficient, and reliable
data processing and management. A DBMS overcomes the limitations of file processing
by:
● Managing data efficiently with optimized storage and retrieval.
● Providing simple query languages like SQL.
● Ensuring data consistency and concurrency with transaction controls.
● Enforcing robust security policies with built-in access controls.
Below are the main reasons why we need DBMS software.
1. Data Organization and Management:
One of the primary needs for a DBMS is data organization and management. DBMSs
allow data to be stored in a structured manner, which helps in easier retrieval and
analysis. A well-designed database schema enables faster access to information,
reducing the time required to find relevant data. A DBMS also provides features like
indexing and searching, which make it easier to locate specific data within the
database. This allows organizations to manage their data more efficiently and
effectively.
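The effect of structured storage and indexing can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the student table, its columns, and the index name idx_student_city are made up for this example.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Meena", "Pune")],
)

# An index on city lets the DBMS locate matching rows without scanning every row.
conn.execute("CREATE INDEX idx_student_city ON student (city)")

# List the indexes that now exist on the table (the name is in column 1).
index_names = [row[1] for row in conn.execute("PRAGMA index_list('student')")]

# Searching by the indexed column.
pune_students = conn.execute(
    "SELECT name FROM student WHERE city = ? ORDER BY roll_no", ("Pune",)
).fetchall()
```

On a table this small the index makes no visible difference; the benefit shows up as the number of rows grows.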
2. Data Security and Privacy:
DBMSs provide a robust security framework that ensures the confidentiality, integrity,
and availability of data. They offer authentication and authorization features that
control access to the database. DBMSs also provide encryption capabilities to protect
sensitive data from unauthorized access. Moreover, DBMS features such as access control,
auditing, and encryption help organizations meet data privacy regulations such as the
GDPR, HIPAA, and CCPA when they store and manage their data.
3. Data Integrity and Consistency:
Data integrity and consistency are crucial for any database. DBMSs provide
mechanisms that ensure the accuracy and consistency of data. These mechanisms
include constraints, triggers, and stored procedures that enforce data integrity rules.
DBMSs also provide features like transactions that ensure that data changes are
atomic, consistent, isolated, and durable (ACID).
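As a rough sketch of how a constraint and a transaction work together, the example below uses Python's built-in sqlite3 module. The account table, the CHECK rule, and the transfer function are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is an integrity rule: a balance may never go negative.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic unit (hypothetical helper)."""
    try:
        # "with conn" commits if the block succeeds and rolls back on an error.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer neither account has changed, which is the "all or nothing" behaviour that atomicity describes.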
4. Concurrent Data Access:
A DBMS provides a concurrent access mechanism that allows multiple users to access
the same data simultaneously. This is especially important for organizations that
require real-time data access. DBMSs use locking mechanisms to ensure that multiple
users can access the same data without causing conflicts or data corruption.
5. Data Analysis and Reporting:
DBMSs provide tools that enable data analysis and reporting. These tools allow
organizations to extract useful insights from their data, enabling better
decision-making. DBMSs support various data analysis techniques such as OLAP,
data mining, and machine learning. Moreover, DBMSs provide features like data
visualization and reporting, which enable organizations to present their data in a
visually appealing and understandable way.
6. Scalability and Flexibility:
DBMSs provide scalability and flexibility, enabling organizations to handle
increasing amounts of data. DBMSs can be scaled horizontally by adding more
servers or vertically by increasing the capacity of existing servers. This makes it
easier for organizations to handle large amounts of data without compromising
performance. Moreover, DBMSs provide flexibility in terms of data modeling,
enabling organizations to adapt their databases to changing business requirements.
7. Cost-Effectiveness:
DBMSs are cost-effective compared to traditional file-based systems. They reduce
storage costs by eliminating redundancy and optimizing data storage. They also
reduce development costs by providing tools for database design, maintenance, and
administration. Moreover, DBMSs reduce operational costs by automating routine
tasks and providing self-tuning capabilities.
Basic Database Concepts
A database system is essentially a computer-based record-keeping system. A
collection of data, commonly called a database, contains information about a
particular enterprise. It maintains any information that may be necessary to the
decision-making process involved in the management of that organization. It can also
be defined as a collection of interrelated data stored together to serve multiple
applications; the data is stored so that it is independent of the programs that use it.
A generic and controlled approach is used to add new data and modify and retrieve
existing data within the database. The data is structured so as to provide the basis for
future application development.
Purpose of Database
The intent of a database is that a collection of data should serve as many applications
as possible. Therefore, a database is often thought of as a repository of information
needed to run certain functions in a corporation or organization. It would permit not only
the retrieval of data but also the continuous modification of data needed for the
control of operations. It may be possible to search the database to obtain answers to
questions or information for planning purposes.
In a typical file-processing system, permanent records are stored in different files.
Many different application programs are written to extract the records and add the
records to the appropriate files. However, this scheme has several major limitations
and disadvantages, such as data redundancy (duplication of data), data inconsistency,
maladaptive data, non-standard data, insecure data, incorrect data, etc. A database
management system is an answer to all these problems as it provides centralized
control of the data.
Database Abstraction
A major purpose of a database is to provide users with only as much information
as they require. This means that the system does not disclose all the details of
the data; rather, it hides some details of how the data is stored and maintained. The
complexity of the database is hidden from users through multiple levels of
abstraction, which facilitates their interaction with the system. The
different levels of the database are implemented through three layers:
1. Internal Level (Physical Level): The lowest level of abstraction, the internal level,
is closest to physical storage. It describes how the data is actually stored on the
storage medium.
2. Conceptual Level: This level of abstraction describes what data is actually
stored in the database. It also describes the relationships that exist between the
data. At this level, databases are described logically in terms of simple data
structures. Users at this level are not concerned with how these logical data
structures will be implemented at the physical level.
3. External Level (View Level): It is the level closest to users and is related to the way
the data is viewed by individual users.
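One way to picture the external level is a SQL view that exposes only part of a table. The sketch below uses Python's built-in sqlite3 module; the employee table and the employee_directory view are assumed names for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the logical structure of the data.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 50000)")

# External level: a view for users who should not see salaries.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
```

The internal level (how SQLite lays out pages on disk) never appears in the code at all, which is the point of the abstraction.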
Data Independence
Since a database can be viewed through three levels of abstraction, any change at one
level can affect the schemas at other levels. As databases continue to grow, there may be
frequent changes to them. This should not lead to redesign and
re-implementation of the database. In such a context the concept of data
independence proves beneficial.
Concept of Database
To store and manage data efficiently in the database let us understand some key
terms:
1. Database Schema: It is the design of the database. We can say that it is a skeleton
of the database that represents the structure: the types of data stored in
the rows and columns, the constraints, and the relationships between the tables.
2. Data Constraints: In a database, we sometimes put restrictions on a table
about what type of data can be stored in one or more of its columns. This is
done by using constraints. Constraints are defined while we are creating a table.
3. Data dictionary or Metadata: Metadata is data about the data. The
database schema, along with the different types of constraints on the data,
is stored by the DBMS in the data dictionary and is known as metadata.
4. Database instance: In a database, a database instance is used to define the
complete database environment and its components. Or we can say that it is a set of
memory structures and background processes that are used to access the database
files.
5. Query: In a database, a query is used to access data from the database. So users
have to write queries to retrieve or manipulate data from the database.
6. Data manipulation: In a database, we can easily manipulate data using the three
main operations, that is, insertion, deletion, and update.
7. Data Engine: It is an underlying component that is used to create and manage
various database queries.
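Several of the terms above can be seen together in a short sketch using Python's built-in sqlite3 module. The student table and its columns are assumptions for illustration; sqlite_master is the catalog table in which SQLite keeps its schema information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema with constraints, defined while creating the table.
conn.execute(
    """CREATE TABLE student (
           roll_no INTEGER PRIMARY KEY,
           name    TEXT NOT NULL,
           age     INTEGER CHECK (age >= 0)
       )"""
)

# Metadata: the DBMS records the schema itself in its catalog.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Data manipulation: insertion, update, and deletion.
conn.execute("INSERT INTO student VALUES (1, 'Asha', 19)")
conn.execute("INSERT INTO student VALUES (2, 'Ravi', 20)")
conn.execute("UPDATE student SET age = 21 WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")

# Query: retrieving data.
remaining = conn.execute("SELECT roll_no, name, age FROM student").fetchall()

# A row that breaks the NOT NULL constraint is rejected by the DBMS.
try:
    conn.execute("INSERT INTO student VALUES (3, NULL, 18)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
```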
Advantages of Database
Let us consider some of the benefits provided by a database system and see how a
database system overcomes the above-mentioned problems:-
1. The database reduces data redundancy to a great extent.
2. The database can control data inconsistency to a great extent.
3. The database facilitates sharing of data.
4. The database enforces standards.
5. The database can ensure data security.
6. Integrity can be maintained through databases.
Therefore, for systems with better performance and efficiency, database systems are
preferred.
Disadvantages of Database
With the complex tasks to be performed by the database system, some things may
come up which can be termed as the disadvantages of using the database system.
These are:-
1. Security may be compromised without good controls.
2. Integrity may be compromised without good controls.
3. Extra hardware may be required
4. Performance overhead may be significant.
5. The system is likely to be complex.
Normalization
This is the process of organizing a database to minimize redundancy and dependency
by breaking down complex tables into smaller, more manageable ones. It’s important
to understand normalization because it helps you create efficient and scalable
databases, reduces data inconsistency and duplication, and makes it easier to update
and maintain the database over time.
This information is often skipped over in introductory material because it can be
technical and complex, but it is crucial for understanding how to properly design and
maintain a database.
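A small sketch can make the idea of breaking a table down more concrete. The rows, field names, and the normalize function below are made up for illustration; the flat rows repeat the department name for every student, and the function splits them into two smaller tables.

```python
# A flat table: the department name is repeated for every student in it.
flat_rows = [
    {"roll_no": 1, "name": "Asha", "dept_id": "D1", "dept_name": "Physics"},
    {"roll_no": 2, "name": "Ravi", "dept_id": "D1", "dept_name": "Physics"},
    {"roll_no": 3, "name": "Meena", "dept_id": "D2", "dept_name": "Maths"},
]


def normalize(rows):
    """Split the flat rows into a student table and a department table."""
    departments = {}
    students = []
    for row in rows:
        # Each department name is now stored once, keyed by dept_id.
        departments[row["dept_id"]] = row["dept_name"]
        # Students keep only a reference (dept_id) to their department.
        students.append(
            {"roll_no": row["roll_no"], "name": row["name"], "dept_id": row["dept_id"]}
        )
    return students, departments


students, departments = normalize(flat_rows)
```

Renaming a department now means changing one entry in departments instead of every matching student row, which is how normalization reduces inconsistency.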
Advantages of 2-Tier Architecture
● Easy to Access: 2-Tier Architecture gives direct access to the database, which
makes retrieval fast.
● Scalable: We can scale the database easily, by adding clients or upgrading
hardware.
● Low Cost: 2-Tier Architecture is cheaper than 3-Tier Architecture and Multi-Tier
Architecture.
● Easy Deployment: 2-Tier Architecture is easier to deploy than 3-Tier Architecture.
● Simple: 2-Tier Architecture is easily understandable as well as simple because of
only two components.
3-Tier Architecture
In 3-Tier Architecture, there is another layer between the client and the server. The
client does not directly communicate with the server. Instead, it interacts with an
application server, which in turn communicates with the database system, where
query processing and transaction management take place. This intermediate layer
acts as a medium for the exchange of partially processed data between the server and
the client. This type of architecture is used in the case of large web applications.
For Example: E-commerce Store
User: You visit an online store, search for a product and add it to your cart.
Processing: The system checks if the product is in stock, calculates the total price and
applies any discounts.
Database: The product details, your cart and order history are stored in the database
for future reference.
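The three tiers of the e-commerce example can be sketched as three separate pieces of Python code. This is a simplified, single-process illustration: in a real deployment each tier usually runs on its own machine. All function names, the product table, and the 10 percent discount are assumptions.

```python
import sqlite3


# Data tier: the database that stores product details.
def create_database():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (name TEXT PRIMARY KEY, price INTEGER, stock INTEGER)"
    )
    conn.execute("INSERT INTO product VALUES ('pen', 10, 5)")
    return conn


# Application tier: checks stock, calculates the total, applies discounts.
def add_to_cart(conn, name, quantity, discount_percent=0):
    row = conn.execute(
        "SELECT price, stock FROM product WHERE name = ?", (name,)
    ).fetchone()
    if row is None or row[1] < quantity:
        return None  # unknown product or not enough stock
    total = row[0] * quantity
    return total - total * discount_percent // 100


# Presentation tier: what the user sees; it never touches SQL directly.
def show_cart(conn, name, quantity):
    total = add_to_cart(conn, name, quantity, discount_percent=10)
    if total is None:
        return "Out of stock"
    return f"{quantity} x {name}: total {total}"


conn = create_database()
in_stock_message = show_cart(conn, "pen", 3)
out_of_stock_message = show_cart(conn, "pen", 9)
```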
DBMS 3-Tier Architecture
Introduction of ER Model
We typically follow the below steps for designing a database for an application.
● Gather the requirements (functional and data) by asking questions to the database
users.
● Do a logical or conceptual design of the database. This is where the ER model plays a
role. It is the most used graphical representation of the conceptual design of a
database.
● Do the physical database design (such as indexing) and the external design (such as views).
The Entity Relationship Model is a model for identifying entities (like student, car or
company) to be represented in the database and representation of how those entities
are related. The ER data model specifies enterprise schema that represents the overall
logical structure of a database graphically.
Why Use ER Diagrams In DBMS?
● ER diagrams represent the E-R model in a database, making them easy to convert
into relations (tables).
● ER diagrams model real-world objects, which makes
them especially useful.
● ER diagrams require no technical knowledge of the underlying DBMS used.
● They give a standard solution for visualizing the data logically.
Symbols Used in ER Model
ER Model is used to model the logical view of the system from a data perspective
which consists of these symbols:
● Rectangles: Rectangles represent Entities in the ER Model.
● Ellipses: Ellipses represent Attributes in the ER Model.
● Diamond: Diamonds represent Relationships among Entities.
● Lines: Lines link attributes to entity types and entity types to relationship
types.
● Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
● Double Rectangle: Double Rectangle represents a Weak Entity.
Symbols used in ER Diagram
Components of ER Diagram
ER Model consists of Entities, Attributes, and Relationships among Entities in a
Database System.
Components of ER Diagram
What is Entity?
An Entity may be an object with a physical existence – a particular person, car, house,
or employee – or it may be an object with a conceptual existence – a company, a job,
or a university course.
What is Entity Set?
An Entity is an object of an Entity Type, and the set of all entities of that type is called an
entity set. For example, E1 is an entity having Entity Type Student and the set of all
students is called the Entity Set. In an ER diagram, an Entity Type is represented by a rectangle.
Entity Set
We can represent an entity set in an ER diagram, but we cannot represent an individual
entity, because an entity corresponds to a row in a relation, whereas an ER diagram is a
graphical representation of the structure of the data.
Types of Entity
There are two types of entity:
1. Strong Entity
A Strong Entity is a type of entity that has a key attribute. A strong entity does not
depend on any other entity in the schema. It has a primary key that helps in identifying it
uniquely, and it is represented by a rectangle. These are called Strong Entity Types.
2. Weak Entity
An Entity type usually has a key attribute that uniquely identifies each entity in the entity set.
But some entity types exist for which key attributes cannot be defined. These are
called Weak Entity types.
For example, a company may store the information of dependents (Parents,
Children, Spouse) of an Employee. But the dependents cannot exist without the
employee. So Dependent will be a Weak Entity Type and Employee will be the Identifying
Entity type for Dependent, which means it is a Strong Entity Type.
A weak entity type is represented by a Double Rectangle. The participation of weak
entity types is always total. The relationship between the weak entity type and its
identifying strong entity type is called an identifying relationship and it is represented by
a double diamond.
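When the Employee and Dependent example is turned into tables, one common approach is to give the weak entity a key that includes the key of its identifying entity. The sketch below uses Python's built-in sqlite3 module; the column names are assumptions, and ON DELETE CASCADE is one possible way to express that a dependent cannot exist without its employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Strong entity type: has its own key.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Weak entity type: identified by the employee's key plus its own partial key.
conn.execute(
    """CREATE TABLE dependent (
           emp_id   INTEGER NOT NULL REFERENCES employee (emp_id) ON DELETE CASCADE,
           dep_name TEXT NOT NULL,
           relation TEXT,
           PRIMARY KEY (emp_id, dep_name)
       )"""
)

conn.execute("INSERT INTO employee VALUES (1, 'Asha')")
conn.execute("INSERT INTO dependent VALUES (1, 'Kiran', 'Child')")

# A dependent without an existing employee is rejected (total participation).
try:
    conn.execute("INSERT INTO dependent VALUES (99, 'Nobody', 'Child')")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

# Deleting the employee removes the dependents that relied on it.
conn.execute("DELETE FROM employee WHERE emp_id = 1")
dependents_left = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
```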
Strong Entity and Weak Entity
What is an Attribute?
Attributes are the properties that define the entity type. For example, Roll_No, Name,
DOB, Age, Address, and Mobile_No are the attributes that define entity type Student.
In ER diagram, the attribute is represented by an oval.
Attribute
Types of Attributes
1. Key Attribute
The attribute which uniquely identifies each entity in the entity set is called the key
attribute. For example, Roll_No will be unique for each student. In an ER diagram, the
key attribute is represented by an oval with the attribute name underlined.
Key Attribute
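2. Composite Attribute
An attribute that is made up of several other attributes is called a composite
attribute. For example, the Address attribute of Student can be divided into parts such
as Street, City, and State.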
3. Multivalued Attribute
An attribute consisting of more than one value for a given entity. For example,
Phone_No (can be more than one for a given student). In ER diagram, a multivalued
attribute is represented by a double oval.
Multivalued Attribute
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a
derived attribute. For example, Age can be derived from DOB. In an ER diagram, the derived
attribute is represented by a dashed oval.
Derived Attribute
The Complete Entity Type Student with its Attributes can be represented as:
Entity and Attributes
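The attribute types can also be sketched in code. The Python classes below are only an illustration of the Student entity type: the field names follow the attributes named above, while the Address parts and the age_on method are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: made up of smaller parts (parts are illustrative)."""
    street: str
    city: str


@dataclass
class Student:
    roll_no: int                 # key attribute: unique for each student
    name: str
    dob: date
    address: Address             # composite attribute
    phone_nos: list[str] = field(default_factory=list)  # multivalued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


student = Student(
    roll_no=1,
    name="Asha",
    dob=date(2005, 6, 15),
    address=Address("MG Road", "Pune"),
    phone_nos=["111", "222"],
)
```

Because Age is derived, it can never disagree with DOB, which is why derived attributes are usually computed rather than stored.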
Relationship Type and Relationship Set
A Relationship Type represents the association between entity types. For example,
‘Enrolled in’ is a relationship type that exists between entity type Student and Course.
In ER diagram, the relationship type is represented by a diamond and connecting the
entities with lines.
Entity-Relationship Set
A set of relationships of the same type is known as a relationship set. The following
relationship set depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as
enrolled in C3.
Relationship Set
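The same relationship set can be written down as a set of pairs. The sketch below uses plain Python sets; the helper courses_of is a hypothetical name.

```python
# Entity sets
students = {"S1", "S2", "S3"}
courses = {"C1", "C2", "C3"}

# Relationship set for the relationship type "Enrolled in":
# each pair is one relationship between a student and a course.
enrolled_in = {("S1", "C2"), ("S2", "C1"), ("S3", "C3")}


def courses_of(student):
    """Return the courses a given student is enrolled in, in sorted order."""
    return sorted(course for s, course in enrolled_in if s == student)
```

For example, courses_of("S1") gives the single course C2.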
Degree of a Relationship Set
The number of different entity sets participating in a relationship set is called
the degree of a relationship set.
1. Unary Relationship: When there is only ONE entity set participating in a relationship,
the relationship is called a unary relationship. For example, one person is married to
only one person.
Unary Relationship
2. Binary Relationship: When there are TWO entity sets participating in a
relationship, the relationship is called a binary relationship. For example, a Student is
enrolled in a Course.
Binary Relationship
3. Ternary Relationship: When there are three entity sets participating in a
relationship, the relationship is called a ternary relationship.
4. N-ary Relationship: When there are n entity sets participating in a relationship,
the relationship is called an n-ary relationship.
Cardinality
The number of times an entity of an entity set participates in a relationship set is
known as cardinality. Cardinality can be of different types:
1. One-to-One: When each entity in each entity set can take part only once in the
relationship, the cardinality is one-to-one. Let us assume that a male can marry one
female and a female can marry one male. So the relationship will be one-to-one.
The total number of tables that can be used in this case is 2.
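A one-to-one rule can be enforced by the DBMS with uniqueness constraints. The sketch below uses Python's built-in sqlite3 module and, for brevity, stores only the relationship itself; the marriage table, its columns, and the try_marry helper are assumptions, and this is just one possible way to express the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# UNIQUE on both columns means each id can appear in at most one row,
# so every entity takes part in the relationship at most once.
conn.execute(
    "CREATE TABLE marriage ("
    " male_id INTEGER NOT NULL UNIQUE,"
    " female_id INTEGER NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO marriage VALUES (1, 10)")


def try_marry(male_id, female_id):
    """Return True if the pair could be recorded, False if it breaks one-to-one."""
    try:
        conn.execute("INSERT INTO marriage VALUES (?, ?)", (male_id, female_id))
        return True
    except sqlite3.IntegrityError:
        return False


second_for_same_male = try_marry(1, 11)  # male 1 is already in a relationship
new_couple = try_marry(2, 11)            # neither id has been used before
```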
Generalization, Specialization
Using the ER model for larger data sets creates a lot of complexity while designing a
database model. To minimize this complexity, Generalization,
Specialization, and Aggregation were introduced in the ER model. They are used
for data abstraction, in which an abstraction mechanism hides the details of a
set of objects. This section covers Generalization and Specialization with
examples.
Generalization
Generalization is the process of extracting common properties from a set of entities
and creating a generalized entity from it. It is a bottom-up approach in which two or
more entities can be generalized to a higher-level entity if they have some attributes in
common. For Example, STUDENT and FACULTY can be generalized to a
higher-level entity called PERSON as shown in Figure 1. In this case, common
attributes like P_NAME, and P_ADD become part of a higher entity (PERSON), and
specialized attributes like S_FEE become part of a specialized entity (STUDENT).
Generalization is also called the "Bottom-up approach".
Specialization
In specialization, an entity is divided into sub-entities based on its characteristics. It
is a top-down approach where the higher-level entity is specialized into two or more
lower-level entities. For Example, an EMPLOYEE entity in an Employee management
system can be specialized into DEVELOPER, TESTER, etc. as shown in Figure 2. In
this case, common attributes like E_NAME, E_SAL, etc. become part of a higher
entity (EMPLOYEE), and specialized attributes like TES_TYPE become part of a
specialized entity (TESTER).
Specialization is also called the "Top-down approach".
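Generalization and specialization resemble class inheritance in programming, so the STUDENT, FACULTY and PERSON example can be sketched with Python classes. The attribute names P_NAME, P_ADD and S_FEE come from the example above; F_SAL for FACULTY is an assumed attribute added only for illustration.

```python
class Person:
    """Higher-level entity: holds the attributes common to all persons."""

    def __init__(self, p_name, p_add):
        self.p_name = p_name
        self.p_add = p_add


class Student(Person):
    """Lower-level entity: inherits P_NAME and P_ADD, adds S_FEE."""

    def __init__(self, p_name, p_add, s_fee):
        super().__init__(p_name, p_add)
        self.s_fee = s_fee


class Faculty(Person):
    """Lower-level entity: inherits P_NAME and P_ADD, adds F_SAL (assumed)."""

    def __init__(self, p_name, p_add, f_sal):
        super().__init__(p_name, p_add)
        self.f_sal = f_sal


student = Student("Asha", "Pune", 1000)
faculty = Faculty("Ravi", "Delhi", 50000)
```

Reading the classes from the bottom up (Student and Faculty to Person) corresponds to generalization; reading them from the top down corresponds to specialization.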
Specialization
Inheritance: It is an important feature of generalization and specialization.
● Attribute inheritance: It allows lower-level entities to inherit the attributes of
higher-level entities. In the diagram, the Car entity inherits from the Vehicle
entity, so Car can acquire the attributes of Vehicle. Example: Car can
acquire the Model attribute of Vehicle.
● Participation inheritance: A lower-level entity (subclass) also inherits
participation in the relationship sets in which its higher-level entity (superclass)
participates, together with the participation constraints of those relationships.
For example, if the Vehicle entity participates in a relationship, the Car entity,
being a lower-level entity of Vehicle, participates in that relationship as well.
A relationship defined only on a lower-level entity is not passed up to the
higher-level entity.
1. Hierarchical Model
The hierarchical model is one of the oldest data models; it was
developed by IBM in the 1960s. In a hierarchical model, data are viewed as a
collection of segments (comparable to tables) that form a hierarchical relation. In this,
the data is organized into a tree-like structure where each record has exactly one
parent record (except the root) and may have many children. Even if the segments are
connected as a chain-like structure by logical associations, the resulting structure can be
a fan structure with multiple branches. We call these logical associations directional
associations.
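The tree structure of the hierarchical model can be sketched with a small Python class in which every segment has at most one parent and any number of children. The class name Segment and the College, Department, Course and Student segments are assumptions for illustration.

```python
class Segment:
    """A record in a hierarchical model: one parent, many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Follow parent links up to the root and return the full path."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/".join(reversed(names))


college = Segment("College")              # root: has no parent
department = Segment("Department", college)
course = Segment("Course", department)
student = Segment("Student", department)
```

Because each segment has a single parent, there is exactly one path from the root to any record.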
2. Network Model
The Network Model was formalized by the Database Task Group (DBTG) in the 1960s. This
model is a generalization of the hierarchical model. In this model a segment can have
multiple parent segments. The segments are grouped as levels, but there exists a
logical association between the segments belonging to any level. Mostly, there exists a
many-to-many logical association between any two segments.
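The difference from the hierarchical model is that a record may have more than one parent. A minimal sketch with Python dictionaries follows; the record names S1, S2, P1 and P2 and the link function are made up for illustration.

```python
from collections import defaultdict

parents = defaultdict(set)   # child record -> set of its parent records
children = defaultdict(set)  # parent record -> set of its child records


def link(parent, child):
    """Record a logical association between a parent and a child segment."""
    parents[child].add(parent)
    children[parent].add(child)


# Two suppliers provide the same part, and one supplier provides two parts,
# so the association is many-to-many.
link("S1", "P1")
link("S2", "P1")
link("S1", "P2")
```

Here P1 has two parents (S1 and S2), which a strict tree structure would not allow.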
In the Object-Oriented Data Model, data and their relationships are contained in a
single structure, which is referred to as an object. In this model, real-world
problems are represented as objects with different attributes. Objects can have multiple
relationships between them. Basically, it is a combination of Object-Oriented
programming and a Relational Database Model.