Introduction To Database Management System-Notes

The document discusses database management systems and their advantages over traditional file processing systems. It provides definitions of DBMS and databases. It also outlines some key applications of database systems and the advantages and disadvantages of both DBMS and file processing systems.

Unit 1: Introduction to DBMS

1.1 Introduction to DBMS:

Definition of Data: By data, we mean known facts that can be recorded and that have implicit
meaning. For example, consider the names, telephone numbers, and addresses of the people you
know.

The term database management system (DBMS) combines two terms, database and
management system. Combining the meaning of both gives the definition of a DBMS.

A database-management system (DBMS) is a collection of interrelated data and a
set of programs to access those data.

A database management system (DBMS) is a collection of programs that enables users to create
and maintain a database. The DBMS is hence a general-purpose software system that facilitates
the processes of defining, constructing, manipulating, and sharing databases among various
users and applications. Defining a database involves specifying the data types, structures, and
constraints for the data to be stored in the database. Constructing the database is the process of
storing the data itself on some storage medium that is controlled by the DBMS. Manipulating a
database includes such functions as querying the database to retrieve specific data, updating the
database to reflect changes in the miniworld, and generating reports from the data. Sharing a
database allows multiple users and programs to access the database concurrently.
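
The activities of defining, constructing, and manipulating a database can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module as the DBMS and a hypothetical student table; neither appears in these notes, and concurrent sharing is not shown.

```python
import sqlite3

# An in-memory SQLite database stands in for the DBMS (an assumption for illustration).
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structure, and constraints.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " grade TEXT CHECK (grade IN ('A', 'B', 'C')))"
)

# Constructing: store the data itself.
conn.executemany(
    "INSERT INTO student (roll_no, name, grade) VALUES (?, ?, ?)",
    [(1, "Asha", "A"), (2, "Ravi", "B"), (3, "Meena", "A")],
)

# Manipulating: update the data, then query it.
conn.execute("UPDATE student SET grade = 'A' WHERE roll_no = 2")
top_students = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE grade = 'A' ORDER BY roll_no")]
print(top_students)
```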

DATABASE SYSTEM APPLICATION:


• Banking: For customer information, accounts, loans, and banking transactions.
• Airlines: For reservations and schedule information. Airlines were among the first to use
databases in a geographically distributed manner: terminals situated around the world accessed
the central database system through phone lines and other data networks.
• Universities: For student information, course registrations, and grades.
• Credit card transactions: For purchases on credit cards and generation of monthly statements.
• Telecommunication: For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the communication networks.
• Finance: For storing information about holdings, sales, and purchases of financial instruments
such as stocks and bonds.
• Sales: For customer, product, and purchase information.

Need of Database
• Database systems are developed primarily to handle large amounts of data. When dealing with large
amounts of data, two things require optimization: storage of data and retrieval of data.
• Storage: According to the principles of database systems, data is stored in such a way that it occupies
much less space, because redundant (duplicate) data is removed before storage. A simple example
illustrates this:
In a banking system, suppose a customer has two accounts, a savings account and a
salary account. Suppose the bank stores savings-account data in one place and salary-account data in
another. If customer information such as name and address is stored in both
places, storage is wasted (redundancy, or duplication of data). To organize the data
better, the customer information should be stored in one place and both accounts should be linked to
it. This is what a DBMS achieves.
• Fast retrieval of data: Along with storing the data in an optimized and systematic manner,
it is also important to retrieve the data quickly when needed. Database systems
are designed to retrieve data as quickly as possible.
Further application areas of database systems include:
• Manufacturing: For management of the supply chain and for tracking production of items in
factories, inventories of items in warehouses/stores, and orders for items.
• Human resources: For information about employees, salaries, payroll taxes and
benefits, and for generation of paychecks.
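
The banking example under "Need of Database" (customer details stored once and linked to both accounts) can be sketched as follows. This is only an illustration: the table names, column names, and sample values are hypothetical, and Python's built-in sqlite3 module is assumed as the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT NOT NULL
);
CREATE TABLE account (
    account_no   INTEGER PRIMARY KEY,
    account_type TEXT NOT NULL,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)
);
INSERT INTO customer VALUES (1, 'Anita', '12 Lake Road');
INSERT INTO account VALUES (101, 'savings', 1);
INSERT INTO account VALUES (102, 'salary', 1);
""")

# The address is stored once, so one update is seen through both accounts.
conn.execute("UPDATE customer SET address = '7 Hill Street' WHERE customer_id = 1")
addresses = conn.execute(
    "SELECT a.account_type, c.address FROM account a"
    " JOIN customer c ON c.customer_id = a.customer_id ORDER BY a.account_no"
).fetchall()
print(addresses)
```

Because neither account row carries its own copy of the address, the two accounts cannot disagree about it.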

WHAT IS FILE PROCESSING SYSTEM?


A typical file-processing system is supported by a conventional operating system. The system
stores permanent records in various files, and it needs different application programs to extract
records from, and add records to, the appropriate files. Before database management systems
(DBMSs) came along, organizations usually stored information in such systems.

DRAWBACKS OF FILE PROCESSING SYSTEM:

1) Data redundancy and inconsistency: Since different programmers create the files and
application programs over a long period, the various files are likely to have different formats and
the programs may be written in several programming languages. Moreover, the same information
may be duplicated in several places (files). For example, the address and telephone number of a
particular customer may appear in a file that consists of savings-account records and in a file that
consists of checking-account records. This redundancy leads to higher storage and access cost.
In addition, it may lead to data inconsistency; that is, the
various copies of the same data may no longer agree. For example, a changed customer address
may be reflected in savings-account records but not elsewhere in the system.
2) Difficulty in accessing data: Conventional file-processing environments do not allow
needed data to be retrieved in a convenient and efficient manner. Suppose that one of the bank
officers needs to find out the names of all customers who live within a particular postal-code
area. The officer asks the data-processing department to generate such a list. Because the
designers of the original system did not anticipate this request, there is no application program on
hand to meet it. There is, however, an application program to generate the list of all customers.
The bank officer now has two choices: either obtain the list of all customers and extract
the needed information manually or ask a system programmer to write the necessary application
program. Both alternatives are obviously unsatisfactory.
3) Data isolation. Because data are scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is difficult.
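4) Integrity problems. The data values stored in the database must satisfy certain
types of consistency constraints. For example, the balance of a bank account
may never fall below a prescribed amount (say, $25). In a file-processing system, such
constraints are enforced by code in the individual application programs, so adding or
changing a constraint means changing every program that touches the data.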
5) Atomicity problems. A computer system, like any other mechanical or electrical device, is
subject to failure. In many applications, it is crucial that, if a failure occurs, the data be restored
to the consistent state that existed prior to the failure. Consider a program to transfer $50 from
account A to account B. If a system failure occurs during the execution of the program, it is
possible that the $50 was removed from account A but was not credited to account B, resulting in
an inconsistent database state. That is, the funds transfer must be atomic—it must happen in its
entirety or not at all. It is difficult to ensure atomicity in a conventional file-processing system.
6) Concurrent-access anomalies. For the sake of overall performance of the system and faster
response, many systems allow multiple users to update the data simultaneously. In such an
environment, interaction of concurrent updates may result in inconsistent data. To guard against
this possibility, the system must maintain some form of supervision. But supervision is difficult
to provide because data may be accessed by many different application programs.
7) Security problems. Not every user of the database system should be able to access all the
data. For example, in a banking system, payroll personnel need to see only that part of the
database that has information about the various bank employees. They do not need access to
information about customer accounts. But, since application programs are added to the system in
an ad hoc manner, enforcing such security constraints is difficult.
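
The integrity and atomicity problems (items 4 and 5) are what a DBMS transaction addresses. The sketch below assumes Python's built-in sqlite3 module and a hypothetical account table with a minimum balance of 25; the transfer function is illustrative, not a standard routine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint: the balance may never fall below 25.
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 25))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 60), ("B", 100)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst atomically; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

# Debiting 50 would leave A with 10, violating the constraint,
# so the credit already applied to B is rolled back as well.
ok = transfer(conn, "A", "B", 50)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, balances)
```

Either both updates take effect or neither does, which is the all-or-nothing behaviour that is hard to guarantee with separate files and programs.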

Advantages of DBMS:
A DBMS provides the following advantages.
(1) Reduced redundancy (duplication): Centralized control of data by the DBA avoids unnecessary
duplication of data and effectively reduces the total amount of data storage.
(2) Shared data: The database allows the sharing of data under its control by any number of
application programs or users.
(3) Integrity: Centralized control can also ensure that adequate checks are incorporated in the DBMS
to provide data integrity, which matters because the data is held on a single system and accessed by many people.
(4) Security: Data is very important to an organization and may be confidential. Such confidential
data must not be accessible to unauthorized users. The DBA, who has the ultimate responsibility
for the data in the DBMS, can ensure that proper access procedures, including authentication, are followed.
The DBMS checks permissions before providing access to any user.
(5) Conflict resolution: Since the database is under the control of the DBA, users
cannot access any data without permission (for example, a login name and password).
(6) Data independence: Data independence can be physical or logical. In both cases, changes
in hardware or software do not affect access to the data.
Disadvantages of DBMS:
1) The cost of purchasing and developing a DBMS is higher, because it is more expensive than other
applications.
2) Backup and recovery operations are complex.
3) More workspace is required for its execution and storage.
4) Excessive data entries may corrupt the total data.

Functions of DBMS:
1) Addition of new data.
2) Sorting of data.
3) Searching for particular data.
4) Printing particular data.
5) Editing or changing stored data.
6) Deleting data.

DIFFERENCE BETWEEN FILE SYSTEM & DBMS
FILE SYSTEM                                     DBMS
1) High rate of redundancy of data exists in    1) Redundancy is reduced.
   a typical file processing system.
2) There is inconsistency of data.              2) Inconsistency of data is reduced
                                                   as redundancy is reduced.
3) No provision for data security.              3) Provision for data security is made.
4) No standard representation of data.          4) Standard representation of data is
                                                   achieved using the relational data model.
5) Data integrity is not ensured.               5) Data integrity is ensured.
6) Data cannot be accessed easily.              6) Data can be accessed easily through
                                                   rows and columns.
7) Data cannot be shared.                       7) Data can be shared.
8) Retrieval of data is time consuming.         8) Retrieval of data is easy.
9) Program-data independence is not provided.   9) Program-data independence is provided.

Data Abstraction:

Figure 1.1 The three levels of data abstraction

For the system to be usable, it must retrieve data efficiently. The need for efficiency has led
designers to use complex data structures to represent data in the database. Since many
database-systems users are not computer trained, developers hide the complexity from users through
several levels of abstraction, to simplify users’ interactions with the system:
• Physical level: The lowest level of abstraction describes how the data are actually stored. The
physical level describes complex low-level data structures in detail.
• Logical level: The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. Database administrators, who must
decide what information to keep in the database, use the logical level of abstraction.
• View level: The highest level of abstraction describes only part of the entire database. Even
though the logical level uses simpler structures, complexity remains because of the variety of
information stored in a large database. Many users of the database system do not need all this
information; instead, they need to access only a part of the database. The view level of
abstraction exists to simplify their interaction with the system. The system may provide many
views for the same database.
Figure 1.1 shows the relationship among the three levels of abstraction.
For example, an analogy to the concept of data types in programming languages may clarify the
distinction among levels of abstraction. Most high-level programming languages support the
notion of a record type. For example, in a Pascal-like language, we may declare a record as
follows:
type customer = record
    customer-id : string;
    customer-name : string;
    customer-street : string;
    customer-city : string;
end;
This code defines a new record type called customer with four fields. Each field has a name and
a type associated with it. A banking enterprise may have several such record types, including
• account, with fields account-number and balance
• employee, with fields employee-name and salary

*At the physical level, a customer, account, or employee record can be described as a block of
consecutive storage locations (for example, words or bytes). The language compiler hides this
level of detail from programmers.
*At the logical level, each such record is described by a type definition, as in the previous code
segment, and the interrelationship of these record types is defined as well. Programmers using a
programming language work at this level of abstraction. Similarly, database administrators
usually work at this level of abstraction.
*Finally, at the view level, computer users see a set of application programs that hide details of
the data types. Similarly, at the view level, several views of the database are defined, and
database users see these views. In addition to hiding details of the logical level of the database,
the views also provide a security mechanism to prevent users from accessing certain parts of the
database. For example, tellers in a bank see only that part of the database that has information on
customer accounts; they cannot access information about salaries of employees.
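
The logical and view levels can be sketched with a table and a view. The sketch assumes Python's built-in sqlite3 module; the employee table, the employee_directory view, and the sample row are hypothetical. SQLite has no per-user permissions, so restricting a user to the view is assumed to be handled by a full DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Logical level: the complete employee record type.
CREATE TABLE employee (
    employee_name TEXT PRIMARY KEY,
    branch        TEXT,
    salary        INTEGER
);
INSERT INTO employee VALUES ('Hayes', 'Downtown', 4000);

-- View level: a view that exposes names and branches but hides salaries.
CREATE VIEW employee_directory AS
    SELECT employee_name, branch FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```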

DBMS Architecture
Database management system architecture helps us understand the
components of a database system and the relationships among them.
The architecture of a DBMS depends on the computer system on which it runs. For
example, in a client-server DBMS architecture, the database system on the server
machine handles requests made by client machines.

Types of DBMS Architecture:

There are three types of DBMS architecture:
1. Single tier architecture
2. Two tier architecture
3. Three tier architecture

1. Single tier architecture

In this type of architecture, the database is readily available on the client
machine, so a request made by the client does not require a network connection
to act on the database.
For example, suppose you want to fetch employee records from a database that
resides on your own computer. The request to fetch the employee details is made
by your computer, and the records are fetched from the database by your computer
as well. This type of system is generally referred to as a
local database system.
2. Two tier architecture

In two-tier architecture, the database system is present
at the server machine and the DBMS application is
present at the client machine. The two machines are
connected with each other through a reliable network.
Whenever the client machine makes a request to access the
database on the server using a query language such as
SQL, the server performs the request on the database and
returns the result to the client.
The 2-tier architecture is the same as basic client-server. In
the two-tier architecture, applications on the client end
communicate directly with the database at the
server side. For this interaction, APIs
such as ODBC and JDBC are used.
The user interfaces and application programs run on the client side.
The server side is responsible for providing functionality such as query
processing and transaction management.
To communicate with the DBMS, the client-side application establishes a
connection with the server side.
Advantages of 2-tier architecture:

• Easy to understand, as the client communicates directly with the database.
• Requested data can be retrieved very quickly when the number of users
is small.
• Easy to modify: any required change can be sent directly to the
database.
• Easy to maintain: when there are multiple requests, they are handled in
a queue, so there is no chaos.
Disadvantages of 2-tier architecture:

• It becomes time consuming when there is a huge number of users. All the
requests are queued and handled one after another, so the system does not
respond to multiple users at the same time.
• This architecture is not very cost effective.
3. Three tier architecture

The 3-tier architecture contains another layer between the client and the
server. In this architecture, the client cannot communicate directly with the
database server.
The application on the client end interacts with an application server,
which in turn communicates with the database system.

The end user has no idea about the existence of the database beyond the
application server. The database also has no idea about any user
beyond the application.
The 3-tier architecture is used for large web applications.
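
A minimal sketch of the three tiers follows, with all three collapsed into one Python process for illustration. The function get_employee_name, the employee table, and the in-memory SQLite database standing in for a database server are assumptions, not part of these notes.

```python
import sqlite3

# Tier 3: the database (in-memory SQLite stands in for a database server).
_db = sqlite3.connect(":memory:")
_db.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
_db.execute("INSERT INTO employee VALUES (1, 'Kiran')")
_db.commit()

# Tier 2: the application server. Only this layer talks to the database.
def get_employee_name(emp_id):
    """Validate the request, then query the database on the client's behalf."""
    if not isinstance(emp_id, int) or emp_id <= 0:
        raise ValueError("emp_id must be a positive integer")
    row = _db.execute(
        "SELECT name FROM employee WHERE emp_id = ?", (emp_id,)).fetchone()
    return row[0] if row else None

# Tier 1: the client. It calls the application layer and never sees SQL
# or the database connection.
print(get_employee_name(1))
```

In a real deployment the client would reach the application tier over a network, and the application tier would hold the only database credentials.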
Advantages of 3-tier architecture:

• 3-tier architecture is the most widely used database architecture.
• Easy to maintain and modify. A requested change does not affect any
other data in the database.
• The application layer does all the validations.
• Improved security. Since there is no direct access to the database, data
security is increased.
• There is less risk of mishandling the data, because the application layer
filters out malicious actions.
• Good performance. Because the application layer can cache data once it has
been retrieved, it need not hit the database for each request. This reduces the
time consumed by multiple requests and lets the system respond to several
users at the same time.
Disadvantages of 3-tier architecture:

• It is a little more complex, and a little more effort is required to reach
the database.
Types of Database Systems

PC databases
Centralized database
Client/server databases
Distributed databases
Database models
PC Databases
Examples: Access, FoxPro, dBase, etc.
Centralized Databases
Fig: A central computer holding the database
Client Server Databases
Fig: Several clients connected through a network to a database server
Distributed Databases
Fig: Homogeneous databases on computers at Location A, Location B, and Location C
Components of Database
Database Models
Three Types of Relationships
One-to-many relationships (1:M)
A painter paints many different paintings, but each one of them is painted by
only that painter.
PAINTER (1) paints PAINTING (M)
Many-to-many relationships (M:N)
An employee might learn many job skills, and each job skill might be learned by
many employees.
EMPLOYEE (M) learns SKILL (N)
One-to-one relationships (1:1)
Each store is managed by a single employee and each store manager
(employee) only manages a single store.
EMPLOYEE (1) manages STORE (1)
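
One common way to express the three relationship types in a relational schema is sketched below, assuming Python's built-in sqlite3 module. The column names, the employee_skill junction table, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- 1:M  each painting references exactly one painter.
CREATE TABLE painter  (painter_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE painting (painting_id INTEGER PRIMARY KEY, title TEXT,
                       painter_id INTEGER NOT NULL REFERENCES painter(painter_id));

-- M:N  a junction table pairs employees with skills.
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employee_skill (
    employee_id INTEGER REFERENCES employee(employee_id),
    skill_id    INTEGER REFERENCES skill(skill_id),
    PRIMARY KEY (employee_id, skill_id));

-- 1:1  UNIQUE ensures an employee manages at most one store.
CREATE TABLE store (store_id INTEGER PRIMARY KEY,
                    manager_id INTEGER NOT NULL UNIQUE
                        REFERENCES employee(employee_id));

INSERT INTO employee VALUES (1, 'Lee'), (2, 'Maya');
INSERT INTO store VALUES (10, 1);
""")

# A second store with the same manager violates the 1:1 rule.
try:
    conn.execute("INSERT INTO store VALUES (11, 1)")
    second_store_allowed = True
except sqlite3.IntegrityError:
    second_store_allowed = False
print(second_store_allowed)
```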
Database Models
A database model is a collection of logical constructs used
to represent the data structure and the data relationships
found within the database.

Two Categories of Database Models


Conceptual models focus on the logical nature of the data
representation. They are concerned with what is represented rather than
how it is represented.
Implementation models place the emphasis on how the data are
represented in the database or on how the data structures are
implemented.
Three Types of Implementation Database Models
Hierarchical database model
Network database model
Relational database model
Hierarchical Model
Fig: A hierarchical structure
Advantages
Conceptual simplicity
Database security
Data independence
Database integrity
Efficiency dealing with a large database
Disadvantages
Complex implementation
Difficult to manage
Lacks structural independence
Applications programming and use complexity
Implementation limitations
Lack of standards
Network Database Model
Fig: Network model as a graph
Advantages
Conceptual simplicity
Handles more relationship types
Data access flexibility
Promotes database integrity
Data independence
Conformance to standards

Disadvantages
System complexity
Lack of structural independence
Relational Database Model
Basic Structure
RDBMS allows operations in a human logical
environment.
The relational database is perceived as a collection of
tables.
Each table consists of a series of row/column
intersections.
Tables (or relations) are related to each other by sharing
a common entity characteristic.
The relationship type is often shown in a relational
schema.
A table yields complete data and structural
independence.
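
Linking two tables through a shared attribute can be sketched as follows, assuming Python's built-in sqlite3 module. The agent and customer tables, the shared agent_code column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE agent    (agent_code INTEGER PRIMARY KEY, agent_name TEXT);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT,
                       agent_code INTEGER REFERENCES agent(agent_code));
INSERT INTO agent VALUES (501, 'Alby'), (502, 'Hahn');
INSERT INTO customer VALUES (1, 'Ramas', 502), (2, 'Dunne', 501), (3, 'Smith', 502);
""")

# The two tables are related only through the common agent_code value.
pairs = conn.execute(
    "SELECT c.customer_name, a.agent_name FROM customer c"
    " JOIN agent a ON a.agent_code = c.agent_code ORDER BY c.customer_id"
).fetchall()
print(pairs)
```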
Fig: Linking relational tables
Advantages
Structural independence
Improved conceptual simplicity
Easier database design, implementation,
management, and use
Ad hoc query capability (SQL)
Powerful database management system

Disadvantages
Substantial hardware and system software
overhead
Possibility of poor design and implementation
Potential “islands of information” problems
Some Important Definitions:
Database Schema: The description of a database is called the database schema, which is
specified during database design and is not expected to change frequently.

Schema Diagrams: Most data models have certain conventions for displaying schemas as
diagrams. A displayed schema is called a schema diagram.

Instances / Database State: The data in the database at a particular moment in time is called a
database state or snapshot. It is also called the current set of occurrences or instances in the
database.

FIGURE 2.2 The three-schema architecture

The goal of the three-schema architecture, illustrated in Figure 2.2, is to separate the user
applications and the physical database. In this architecture, schemas can be defined at the
following three levels:
1. The internal level has an internal schema, which describes the physical storage structure of the
database. The internal schema uses a physical data model and describes the complete details of
data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole
database for a community of users. The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, data types, relationships, user operations, and
constraints. Usually, a representational data model is used to describe the conceptual schema
when a database system is implemented.
3. The external or view level includes a number of external schemas or user views. Each external
schema describes the part of the database that a particular user group is interested in and hides
the rest of the database from that user group.

* Mapping: The processes of transforming requests and results between levels are called
mappings.

Data Independence:
The three-schema architecture can be used to further explain the concept of data independence.
Data independence is the capacity to change the schema at one level of a database system
without having to change the schema at the next higher level. We can define two types of data
independence:
1. Logical data independence is the capacity to change the conceptual schema without having
to change external schemas or application programs. We may change the conceptual schema to
expand the database (by adding a record type or data item), to change constraints, or to reduce
the database (by removing a record type or data item).
2. Physical data independence is the capacity to change the internal schema without having to
change the conceptual schema. Hence, the external schemas need not be changed either.
Changes to the internal schema may be needed because some physical files had to be
reorganized (for example, by creating additional access structures) to improve the performance of
retrieval or update. If the same data as before remains in the database, we should not have to
change the conceptual schema.
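
Physical data independence can be illustrated by adding an access structure and checking that an unchanged query still returns the same rows. The sketch assumes Python's built-in sqlite3 module; the account table, index name, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A-101", 500), ("A-215", 700), ("A-102", 400)])

query = ("SELECT account_number FROM account"
         " WHERE balance >= 500 ORDER BY account_number")
before = conn.execute(query).fetchall()

# Change at the internal level only: add an access structure (an index).
conn.execute("CREATE INDEX idx_account_balance ON account (balance)")

# The table definition and the query text are untouched.
after = conn.execute(query).fetchall()
print(before == after)
```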

Database Languages:
A database system provides a data definition language to specify the database schema and a data
manipulation language to express database queries and updates.
Data-Definition Language
- We specify a database schema by a set of definitions expressed in a special language called a
data-definition language (DDL).
For instance, the following statement in the SQL language defines the account table:
create table account
(account_number char(10),
balance integer);
Execution of the above DDL statement creates the account table. In addition, it updates a special
set of tables called the data dictionary or data directory.
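
The effect of a DDL statement on the data dictionary can be observed in SQLite, which is assumed here through Python's built-in sqlite3 module. SQLite keeps its catalog in a table named sqlite_master; other systems name and organize their data dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# The catalog now records the new table.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info lists the columns recorded for a table;
# the column name is the second field of each row.
columns = [row[1] for row in conn.execute("PRAGMA table_info(account)")]
print(catalog, columns)
```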
Data-Manipulation Language
Data manipulation is
• The retrieval of information stored in the database
• The insertion of new information into the database
• The deletion of information from the database
• The modification of information stored in the database
- A data-manipulation language (DML) is a language that enables users to access or manipulate
data as organized by the appropriate data model.
There are basically two types:
• Procedural DMLs require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data
are needed without specifying how to get those data.

* A query is a statement requesting the retrieval of information. The portion of a
DML that involves information retrieval is called a query language.
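
The difference between the procedural and declarative styles can be sketched as follows. A plain Python loop stands in for a procedural DML, and SQL through Python's built-in sqlite3 module stands in for a declarative one; the account data is hypothetical.

```python
import sqlite3

accounts = [("A-101", 500), ("A-215", 700), ("A-102", 400)]

# Procedural style: the program spells out how to find the data
# (visit every record and test it).
procedural = []
for number, balance in accounts:
    if balance > 450:
        procedural.append(number)

# Declarative style: the query states what is wanted;
# the DBMS decides how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)
declarative = [row[0] for row in conn.execute(
    "SELECT account_number FROM account WHERE balance > 450 ORDER BY rowid")]
print(procedural, declarative)
```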

Database Users:
There are four different types of database-system users, differentiated by the way they expect to
interact with the system.
1) Naive users: are unsophisticated users who interact with the system by invoking one of the
application programs that have been written previously. For example, a bank teller who needs to
transfer $50 from account A to account B invokes a program called transfer. This program asks
the teller for the amount of money to be transferred, the account from which the money is to be
transferred, and the account to which the money is to be transferred.
2) Application programmers: are computer professionals who write application programs.
Application programmers can choose from many tools to develop user interfaces. Rapid
application development (RAD) tools are tools that enable an application programmer to
construct forms and reports without writing a program.
3) Sophisticated users: interact with the system without writing programs. Instead, they form
their requests in a database query language. They submit each such query to a query processor,
whose function is to break down DML statements into instructions that the storage manager
understands. Analysts who submit queries to explore data in the database fall in this category.
4) Specialized users: are sophisticated users who write specialized database applications that do
not fit into the traditional data-processing framework. Among these applications are
computer-aided design systems, knowledge base and expert systems, systems that store data with complex
data types (for example, graphics data and audio data), and environment-modeling systems.

Database Administrator & Responsibilities of DBA:


One of the main reasons for using DBMSs is to have central control of both the data and
the programs that access those data. A person who has such central control over the system is
called a database administrator (DBA). The functions of a DBA include:
*Schema definition- The DBA creates the original database schema by executing a set of data
definition statements in the DDL.
* Storage structure and access-method definition.
*Schema and physical-organization modification. The DBA carries out changes to the schema
and physical organization to reflect the changing needs of the organization, or to alter the
physical organization to improve performance.
*Granting of authorization for data access. By granting different types of authorization, the
database administrator can regulate which parts of the database various users can access. The
authorization information is kept in a special system structure that the database system consults
whenever someone attempts to access the data in the system.
*Routine maintenance. Examples of the database administrator’s routine maintenance activities
are:
-Periodically backing up the database, either onto tapes or onto remote servers, to prevent loss of
data in case of disasters such as flooding.
-Ensuring that enough free disk space is available for normal operations, and upgrading disk
space as required.
-Monitoring jobs running on the database and ensuring that performance is not degraded by very
expensive tasks submitted by some users.

Overall STRUCTURE of DBMS:

Fig: System Structure


A database system is partitioned into modules that deal with each of the responsibilities
of the overall system. The functional components of a database system can be broadly divided
into:
i) the storage manager
ii) the query processor components.

i) Storage Manager :
A storage manager is a program module that provides the interface between the low level
data stored in the database and the application programs and queries submitted to the system.
The storage manager is responsible for the interaction with the file manager. The raw data are
stored on the disk using the file system, which is usually provided by a conventional operating
system. The storage manager translates the various DML statements into low-level file-system
commands. Thus, the storage manager is responsible for storing, retrieving, and updating data in
the database.
The storage manager components include:
• Authorization and integrity manager- which tests for
the satisfaction of integrity constraints and checks the authority of users to access data.
•Transaction manager- which ensures that the database remains in a consistent (correct) state
despite system failures, and that concurrent transaction executions proceed without conflicting.
•File manager- which manages the allocation of space on disk storage and the data structures
used to represent information stored on disk.
•Buffer manager- which is responsible for fetching data from disk storage into main memory,
and deciding what data to cache in main memory. The buffer manager is a critical part of the
database system, since it enables the database
to handle data sizes that are much larger than the size of main memory.
The storage manager implements several data structures as part of the physical
system implementation:
• Data files- which store the database itself.
• Data dictionary- which stores metadata about the structure of the database, in
particular the schema of the database.
• Indices- which provide fast access to data items that hold particular values.

ii)The Query Processor :


The query processor components include,
•DDL interpreter - which interprets DDL statements and records the definitions
in the data dictionary.
•DML compiler - which translates DML statements in a query language into an
evaluation plan consisting of low-level instructions that the query evaluation
engine understands.
A query can usually be translated into any of a number of alternative evaluation
plans that all give the same result. The DML compiler also performs
query optimization, that is, it picks the lowest cost evaluation plan from among
the alternatives.
•Query evaluation engine - which executes low-level instructions generated by
the DML compiler.
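
The choice among alternative evaluation plans can be observed in SQLite through its EXPLAIN QUERY PLAN statement, assumed here via Python's built-in sqlite3 module. The account table and index name are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")

def plan(sql):
    """Return the descriptive text of each step in SQLite's chosen plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT balance FROM account WHERE account_number = 'A-101'"

# Without an index, the only available plan is a scan of the whole table.
plan_without_index = plan(query)

# With an index, the optimizer can pick a cheaper plan for the same query.
conn.execute("CREATE INDEX idx_account_number ON account (account_number)")
plan_with_index = plan(query)

print(plan_without_index)
print(plan_with_index)
```

Both plans give the same result for the query; only the estimated cost of reaching it differs.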

Types of Database Language

A DBMS has appropriate languages and interfaces to express database
queries and updates.
Database languages can be used to read, store and update the data in the
database.
1. Data Definition Language

DDL stands for Data Definition Language. It is used to define the database
structure or pattern.
It is used to create schemas, tables, indexes, constraints, etc. in the
database.
Using DDL statements, you can create the skeleton of the database.
DDL statements also record metadata, such as
the number of tables and schemas, their names, indexes, the columns in each
table, constraints, etc.
2. Data Manipulation Language

DML stands for Data Manipulation Language. It is used for accessing and
manipulating data in a database. It handles user requests.
3. Data Control Language

DCL stands for Data Control Language. It is used to control access to the
data stored in the database by granting and revoking privileges.
In some database systems, DCL execution is transactional and can be rolled back.
Here are some tasks that come under DCL:
Grant: It is used to give a user access privileges to a database.
Revoke: It is used to take back permissions from the user.
4. Transaction Control Language

TCL stands for Transaction Control Language. It is used to manage the changes
made by DML statements, which can be grouped into a logical transaction.
Here are some tasks that come under TCL:
Commit: It is used to save the transaction to the database.
Rollback: It is used to restore the database to its state as of the last
Commit.
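
Commit and Rollback can be sketched with Python's built-in sqlite3 module, which is assumed here as the DBMS; the account table and values are hypothetical. SQLite has no GRANT or REVOKE, so the DCL statements are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.commit()      # Commit: the inserted row is now saved.

conn.execute("UPDATE account SET balance = 0 WHERE account_number = 'A-101'")
conn.rollback()    # Rollback: undo everything since the last commit.

balance = conn.execute(
    "SELECT balance FROM account WHERE account_number = 'A-101'").fetchone()[0]
print(balance)
```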
