Unit 6 Database Management Systems: Structure
Structure
6.0 Objectives
6.1 Introduction
6.2 File Oriented Approach
6.3 Database Approach
6.4 Database and DBMS
6.4.1 Database
6.4.2 Why Use a Database?
6.4.3 Database Management System (DBMS)
6.4.4 Characteristics of Data in a Database
6.1 INTRODUCTION
We live in an age in which the world is full of data and information. Everyone is aware
of the Internet, which has become a huge source of information and is growing every day.
The amount of data is now too large to access and maintain without automated support.
For example, when we go to a bank to deposit or withdraw funds, make a booking at
a hotel or a railway reservation, or purchase items from a supermarket or mall, each of
these activities involves an automatic update of a database, such as the one that keeps the
inventory of the store room or showroom items.
Relational database systems have become increasingly popular since the late 1970s.
They offer a powerful method for storing data in an application-independent manner.
This means that for many enterprises the database is at the core of the IT strategy.
Developments can progress around a relatively stable database structure which is secure,
reliable, efficient, and transparent.
In early systems, each suite of application programs had its own independent master
file. The duplication of data over master files could lead to inconsistent data. Efforts to
use a common master file for a number of application programs resulted in problems of
integrity and security. The production of new application programs could require
amendments to existing application programs, resulting in unproductive maintenance.
Data structuring techniques, developed to exploit random access storage devices,
increased the complexity of the insert, delete and update operations on data. As a first
step towards a DBMS, packages of subroutines were introduced to reduce programmer
effort in maintaining these data structures. However, the use of these packages still
required knowledge of the physical organisation of the data. In traditional database
applications, most of the information was stored and accessed in either textual or numeric
form. Nowadays, video clips, picture files, weather data, satellite images and sound
files are also stored, often in separate and specialised databases such as multimedia databases.
In this unit we will introduce the basic concepts of the file oriented approach and its
drawbacks, the database approach, the use of a DBMS, the levels of abstraction of a
DBMS, the types of DBMS and security features.
6.4.1 Database
A database is a logically coherent collection of data with some inherent meaning.
It represents some aspect of the real world and is designed, built and populated with
data for a specific purpose.
6.6 DATABASE ENVIRONMENT
The following components form the Database System Environment:
Stored Data Manager
The database and the database catalogue are stored on disk.
Access to the disk is handled by the Operating System.
A higher-level stored data manager controls access to DBMS information that is
stored on disk, whether part of the database or the catalogue.
The stored data manager may use basic OS services for carrying out low-level
data transfer, such as handling buffers.
Once data is in buffers, the other DBMS modules, as well as other application
programs can process it.
DDL Compiler
A DDL Compiler processes the schema definitions and stores the descriptions (meta-
data) in the catalogue.
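The role of the DDL compiler and the catalogue can be illustrated with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the student table and its columns are hypothetical. In SQLite the catalogue is exposed as the sqlite_master table, and other DBMSs keep their catalogues under different names.

```python
import sqlite3

# In-memory SQLite database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")

# DDL statement: it defines a schema object and stores no rows.
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT)"
)

# SQLite records the description (meta-data) in its catalogue, sqlite_master.
catalogue = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info reads the column descriptions back; field 1 is the name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

for obj_type, name, sql in catalogue:
    print(obj_type, name)
print(columns)
```

The data definition is processed once, and every later query is checked against the stored description rather than against the program that created it.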
Runtime Database Processor
It handles database access at runtime. It receives retrieval or update operations and
carries them out on the database. Its access to the disk goes through the stored data
manager.
Query Compiler
It handles high-level queries entered interactively. It parses, analyses and interprets a
query, then generates calls to the runtime processor for execution.
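As a rough illustration of a query being analysed before it is executed, the sketch below asks SQLite (used here only as an example DBMS) to describe the plan it has chosen for a query. The course table is hypothetical, and the wording of the plan differs between SQLite versions, so it is printed rather than relied upon.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO course VALUES ('CH102', 'Chemistry 102')")

query = "SELECT title FROM course WHERE code = ?"

# EXPLAIN QUERY PLAN makes SQLite parse and analyse the query and report
# the access strategy, without running the query itself.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("CH102",)).fetchall()
for row in plan:
    print(row[-1])  # last field is the human-readable plan step

# The same query text is then actually executed.
result = conn.execute(query, ("CH102",)).fetchall()
print(result)
```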
Precompiler
The precompiler extracts DML commands from an application program written in a host
language. The commands are sent to the DML compiler for compilation into code for
database access. The rest of the program is sent to the host language compiler.
Client Program
This program runs on a computer separate from the one on which the database resides
and accesses the DBMS from there. The computer it runs on is called the client computer,
and the other is the database server. In some cases there is a middle level, called the
application server.
Database System Utilities
DBMSs have database utilities that help the DBA manage the system. Functions include:
Loading – used to load existing text/sequential files into the database. The source
format and the desired target file are specified to the utility, which reformats the
data and loads it into a table.
Backup – creates a backup copy of the database, usually by dumping the database
onto tape. The copy can be used to restore the database in case of failure. An incremental
backup, which records only the changes since the last backup, can also be used.
File Reorganisation – reorganises database files into different file organisations
to improve performance.
Performance Monitoring – monitors database usage and provides statistics to
the DBA, who uses them for decision-making.
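The loading and backup utilities described above can be sketched in a few lines. This is only an illustration, assuming SQLite as the DBMS: the source text, the course table and its columns are hypothetical, and a real utility would read a file on disk and write the backup to separate media.

```python
import csv
import io
import sqlite3

# Hypothetical contents of an existing sequential (CSV) file.
source_text = "code,title\nCH102,Chemistry 102\nEN101,English 101\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

# Loading: reformat each source record into a row of the target table.
reader = csv.DictReader(io.StringIO(source_text))
rows = [(rec["code"], rec["title"]) for rec in reader]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO course (code, title) VALUES (?, ?)", rows)

# Backup: copy the whole database into a second database.
backup_conn = sqlite3.connect(":memory:")
conn.backup(backup_conn)

loaded = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
restored = backup_conn.execute("SELECT code FROM course ORDER BY code").fetchall()
print(loaded, restored)
```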
Tools, Environment and Communication Facilities
CASE Tools – used in the design phase to help speed up the development process.
Data dictionary system – stores catalogue information about schemas and
constraints, as well as design decisions, usage standards, application program
descriptions and user information. Also called an information repository. It can be
accessed directly by the DBA or by users when needed.
Application development environments – (e.g., JBuilder) provide an environment
for developing database applications, and include facilities to help in database
design, GUI development, querying and updating, and application development.
Communication software – allows users at remote locations to access the database
through computer terminals, workstations or personal computers. These are connected to
the database through data communications hardware such as phone lines, local
area networks etc.
Self-Check Exercise
Note: i) Write your answers in the space given below.
ii) Check your answers with the answers given at the end of this Unit.
4) What is a DML Compiler?
.....................................................................................................................
.....................................................................................................................
.....................................................................................................................
This child / parent rule assures that data is systematically accessible. To get to a low-
level table, you start at the root and work your way down through the tree until you
reach your target. Of course, as you might imagine, one problem with this system is that
the user must know how the tree is structured in order to find anything!
The hierarchical model, however, is much more efficient than the flat-file model we
discussed earlier because there is not as much need for redundant data. If a change in
the data is necessary, the change might only need to be processed once. Consider the
student flat-file database example given below:
Table 6.1: Student flat-file
Name                 Address          Course                Grade
Mr. Eric Tachibana   123 Kensigton    Chemistry 102         C+
Mr. Eric Tachibana   123 Kensigton    Chinese 3             A
Mr. Eric Tachibana   122 Kensigton    Data Structures       B
Mr. Eric Tachibana   123 Kensigton    English 101           A
Ms. Tonya Lippert    88 West 1st St.  Psychology 101        A
Mrs. Tonya Ducovney  100 Capitol Ln.  Psychology 102        A
Ms. Tonya Lippert    88 West 1st St.  Human Cultures        A
Ms. Tonya Lippert    88 West 1st St.  European Governments  A
As we mentioned before, this flat-file database would store an excessive amount of
redundant data. If we implemented this in a hierarchical database model, we would get
much less redundant data. Consider the following hierarchical database scheme:
However, as you can imagine, the hierarchical database model has some serious
problems. For one, you cannot add a record to a child table until it has already been
incorporated into the parent table. This might be troublesome if, for example, you wanted
to add a student who had not yet signed up for any courses.
Worse yet, the hierarchical database model still creates repetition of data within the
database. You might imagine that in the database system shown above, there may be a
higher level that includes multiple courses. In this case, there could be redundancy
because students would be enrolled in several courses and thus each “course tree”
would have redundant student information.
Redundancy would occur because hierarchical databases handle one-to-many
relationships well but do not handle many-to-many relationships well. This is because a
child may only have one parent. However, in many cases you will want to have the child
be related to more than one parent. For instance, the relationship between student and
class is a “many-to-many”. Not only can a student take many subjects but a subject
may also be taken by many students. How would you model this relationship simply
and efficiently using a hierarchical database? The answer is that you wouldn’t.
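The redundancy described above can be made concrete with a small sketch. It models a hypothetical course-to-student tree as a plain Python dictionary (not any particular hierarchical DBMS), in which each student record may have only one parent course, so a student taking three courses is stored three times.

```python
# Hypothetical course -> student tree; each child record has exactly one parent,
# so the same student appears once under every course she takes.
course_tree = {
    "Psychology 101": [{"name": "Ms. Tonya Lippert", "address": "88 West 1st St."}],
    "Human Cultures": [{"name": "Ms. Tonya Lippert", "address": "88 West 1st St."}],
    "European Governments": [{"name": "Ms. Tonya Lippert", "address": "88 West 1st St."}],
}


def update_address(tree, name, new_address):
    """Walk every course subtree and patch each copy of the student record."""
    touched = 0
    for students in tree.values():
        for student in students:
            if student["name"] == name:
                student["address"] = new_address
                touched += 1
    return touched


copies_changed = update_address(course_tree, "Ms. Tonya Lippert", "100 Capitol Ln.")
print(copies_changed)
```

One change of address has to be applied to every copy, and any copy that is missed leaves the data inconsistent.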
Though this problem can be solved with multiple databases creating logical links between
children, the fix is an awkward kludge. Faced with these serious problems, the
computer brains of the world got together and came up with the network model.
Disadvantage
Links are unidirectional, i.e., from parent to child only. The model does not support
many-to-many relationships. So, in this model, if we need any information from a lower
level, we have to follow the complete hierarchy of the system.
Disadvantage
The use of pointers leads to complex structures, which makes mapping of related data
very difficult.
In the above tables, “Code” is a common field in both tables and the two tables are
linked together on this field.
The relational database is very popular and is mostly used to create and maintain
large amounts of related data. In business, a relational database typically contains the
following types of tables.
Table to store customer information
Table to store employee information
Table to store order information
Table to store inventory information etc.
The customer, order and inventory table can be linked together to process or to generate
reports of orders and billing etc.
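A minimal sketch of such linked tables is given below, again assuming SQLite as the DBMS. The table names, column names and sample rows are hypothetical; the point is only that the order table carries the key fields of the customer and inventory tables, and a join on those fields produces a billing report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, description TEXT, unit_price REAL);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    item_id INTEGER REFERENCES inventory(item_id),
    quantity INTEGER
);
INSERT INTO customer VALUES (1, 'A. Kumar');
INSERT INTO inventory VALUES (10, 'Notebook', 2.5), (11, 'Pen', 1.0);
INSERT INTO orders VALUES (100, 1, 10, 4), (101, 1, 11, 3);
""")

# Link the three tables on their common fields and total each customer's bill.
billing = conn.execute("""
    SELECT c.name, SUM(o.quantity * i.unit_price)
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    JOIN inventory AS i ON i.item_id = o.item_id
    GROUP BY c.customer_id, c.name
""").fetchall()
print(billing)
```

Each customer and each inventory item is stored once, however many orders refer to it.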
In such systems, the user interface and application programs run on the client.
When DBMS access is needed, the program establishes a connection to the DBMS
on the server side. Once the connection is created, the client can communicate
with the DBMS.
ODBC (Open Database Connectivity) is a standard that provides an application
programming interface (API) which allows client side programs to call the DBMS as long
as both sides have the required software. Most database vendors provide ODBC
drivers for their systems.
Client programs can connect to several RDBMSs and send query and transaction
requests using the ODBC API.
Query requests are sent from the client to the server, and the server processes the
request and sends the result to the client.
A related Java standard is JDBC, which allows Java programs to access the
DBMS through a standard interface.
These systems are called two tier architectures because the software components
are distributed over two systems, the client and server.
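The connect, query and result sequence described above can be sketched as follows. Python's database interface (DB-API 2.0) plays a role comparable to ODBC or JDBC, and the sketch uses the embedded sqlite3 driver purely as a stand-in: there is no real server here, and the function names, the account table and its contents are hypothetical. With a client/server DBMS the connect call would take a host name and credentials instead.

```python
import sqlite3


def run_client_query(connect, query, params=()):
    """Open a connection, send one query and return the rows that come back."""
    conn = connect()
    try:
        cursor = conn.cursor()
        cursor.execute(query, params)
        return cursor.fetchall()
    finally:
        conn.close()


def connect_to_demo_db():
    # Stand-in for a driver's connect call; a real client would pass the
    # server address, user name and password here.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
    conn.execute("INSERT INTO account VALUES (1, 250.0)")
    return conn


rows = run_client_query(
    connect_to_demo_db, "SELECT balance FROM account WHERE acc_no = ?", (1,)
)
print(rows)
```

Because the client code depends only on the standard interface, a different driver could in principle be substituted without rewriting the query logic.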
Three-tier Client/Server Architecture for Web Applications
Many web applications use three-tier architecture, which adds an intermediate
layer between the client and the database server.
The middle tier is called the application server, or the web server. It plays an
intermediate role by storing the business rules (procedures/constraints) used to access
data from the database.
It can improve database security by checking the client’s credentials before
forwarding a request to the database server.
Clients contain GUI interfaces and application specific rules.
The intermediate server accepts the requests from the client, processes them
and sends the database commands to the database server, then passes the data from the
database server to the client, where it may be processed further and filtered.
The three tiers are: user interface, application rules, and data access.
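The division of work between the three tiers can be sketched in a single program. Everything here is hypothetical and simplified: the three tiers run in one process, the token table stands in for a real authentication mechanism, and SQLite stands in for the database server. The sketch only shows that the client never issues database commands itself.

```python
import sqlite3

# Data tier: an in-memory database with a hypothetical account table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance REAL)")
db.execute("INSERT INTO account VALUES ('alice', 100.0)")

# Credentials known to the middle tier (hypothetical; a real system would
# not keep plain-text tokens in the program).
VALID_TOKENS = {"alice": "token-123"}


class AccessDenied(Exception):
    """Raised when the client's credentials are rejected."""


def withdraw(user, token, amount):
    """Middle tier: check credentials and business rules, then issue DB commands."""
    if VALID_TOKENS.get(user) != token:
        raise AccessDenied("invalid credentials")
    (balance,) = db.execute(
        "SELECT balance FROM account WHERE owner = ?", (user,)
    ).fetchone()
    if amount <= 0 or amount > balance:
        raise ValueError("withdrawal violates a business rule")
    with db:  # commit the update as one transaction
        db.execute(
            "UPDATE account SET balance = balance - ? WHERE owner = ?", (amount, user)
        )
    return balance - amount


# Client tier: calls the middle tier and displays the result.
new_balance = withdraw("alice", "token-123", 30.0)
print(new_balance)
```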
Let us study the security features of the database in the next section.
6.9 DATABASE SECURITY
Databases need to have a level of security in order to protect them against both
malicious and accidental threats. A threat is any situation that could adversely
affect the database system. Some factors that drive the need for security are as follows:
Theft and fraud
Confidentiality
Integrity
Privacy
Database availability
Threats to database security can come from many sources. People are a substantial
source of database threats. Different types of people can pose different threats. Users
can gain unauthorised access through the use of another person’s account. Some users
may act as hackers and/or create viruses to adversely affect the performance of the
system. Programmers can also pose similar threats. The Database Administrator can
also cause problems by not imposing an adequate security policy.
Some threats related to the hardware of the system are as follows:
Equipment failure
Deliberate equipment damage (e.g. arson, bombs)
Accidental / unforeseen equipment damage (e.g. fire, flood)
Power failure
Equipment theft
Threats can exist over the communication networks that an organisation uses. Techniques
such as wire tapping, cable disruption (cutting / disconnecting), and electronic interference
can all be used to disrupt services or reveal private information.
Countermeasures
Some countermeasures that can be employed are outlined below:
Access Controls (can be Discretionary or Mandatory)
Authorisation (granting legitimate access rights)
Authentication (determining whether a user is who they claim to be)
Backup
Journaling (maintaining a log file - enables easy recovery of changes)
Encryption (encoding data using an encryption algorithm)
RAID (Redundant Array of Independent Disks - protects against data loss due to
disk failure)
Polyinstantiation (data objects that appear to have different values to users with
different access rights / clearance)
Views (virtual relations which can limit the data viewable by certain users).
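Of the countermeasures listed above, views are the easiest to demonstrate. The sketch below assumes SQLite and a hypothetical employee table; the view exposes a directory without the salary column. SQLite has no user accounts, so the step of granting a user access to the view but not to the underlying table, which a server DBMS would provide, is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL);
INSERT INTO employee VALUES (1, 'R. Singh', 'Sales', 52000), (2, 'M. Rao', 'HR', 61000);

-- The view is a virtual relation that leaves out the confidential column.
CREATE VIEW employee_directory AS
    SELECT emp_id, name, department FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
visible_columns = [col[0] for col in cursor.description]
directory = cursor.fetchall()
print(visible_columns)
print(directory)
```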
6.10 POPULAR DBMS PACKAGES
There are many popular DBMS packages available on the market. Some of them are:
Borland Paradox
Filemaker
IBM DB2
Informix
Ingres
Interbase
Microsoft SQL Server
Microsoft Access
Microsoft FoxPro
Oracle
Sybase
MySQL
PostgreSQL
mSQL
SQL Server
Let us discuss the database project development in the next section.
6.11.1 Analysis
The outputs from this stage should be:
A CONCEPTUAL DATA MODEL describing the information which is used within
the organisation but not in computer-related terms. This level of data analysis will be
considered in more detail later. One of the problems with any systems design in a large
organisation is that it must proceed in a piecemeal manner – it is impossible to create a
totally new Global system in one fell swoop, and each sub-system must dovetail with
others which may be at quite a different stage of development. The conceptual data
model provides a context within which more detailed design specifications can be
produced, and should help in maintaining consistency from one application area to
another.
A CONCEPTUAL PROCESS MODEL describing the functions of the organisation
in terms of events (e.g. a purchase, a payment, a booking) and the processes which
must be performed within the organisation to handle them. This may lead to a more
detailed functional specification - describing the organisational requirements which must
be satisfied, but not how they are to be achieved.
6.11.2 Design
This stage should produce:
A LOGICAL DATA MODEL: a description of the data to be stored in the database,
using the conventions prescribed by the particular DBMS to be used. This is sometimes
referred to as a SCHEMA and some DBMSs also give facilities for defining SUB-
SCHEMA or partitions of the overall schema. Logical data models supported by present
day DBMSs will be considered later.
A SYSTEM SPECIFICATION, describing in some detail what the proposed system
should do. This will now refer to COMPUTER PROCESSES, but probably in terms
of INPUT and OUTPUT MESSAGES rather than internal logic, describing, for instance,
the effect of selecting an item from a menu, or any option within a command driven
system. Program modules are defined in terms of the screen displays and/or reports
which they generate. Note that the data referred to here has a temporary existence, in
contrast with what is stored in the database itself.
6.11.3 Development
Specification of the database itself must now come down another level, to decisions
about PHYSICAL DATA STORAGE in particular files on particular devices. For
this a knowledge of the computer operating system, as well as the DBMS, is required.
Conventional program development - coding, testing, debugging etc. may also be done.
If a totally packaged system has been purchased this may not be necessary - it will
simply be a matter of discovering how to use the command and query language already
supplied to store and retrieve data, generate reports and other outputs. Even here an
element of testing and debugging may be involved, since it is unlikely that the new user
of a system will get it exactly right the first time. It is certainly inadvisable for this sort of
experimentation to take place using a live database!
6.11.4 Implementation
This puts the work of the previous three phases into everyday use. It involves such
things as loading the database with live rather than test data, staff training, probably the
introduction of new working practices. It is not unusual to have an old and a new
system running side by side for a while so that some back-up is available if the new
system fails unexpectedly.
6.11.5 Maintenance
Systems once implemented generally require further work done on them as time goes
by, either to correct original design faults or to accommodate changes in user requirements
or operating constraints. One of the objectives of using a DBMS is to reduce the
impact of such changes - for example the data can be physically re-arranged without
affecting the logic of the programs which use it. Some DBMSs provide utility programs
to re-organise the data when either its physical or logical design must be altered.
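The claim that data can be physically re-arranged without affecting program logic can be checked with a small experiment. The sketch assumes SQLite, a hypothetical enrolment table, and treats adding an index as the physical change; other DBMSs offer different reorganisation facilities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrolment (student TEXT, course TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO enrolment VALUES (?, ?, ?)",
    [
        ("Eric Tachibana", "Chemistry 102", "C+"),
        ("Eric Tachibana", "English 101", "A"),
        ("Tonya Lippert", "Psychology 101", "A"),
    ],
)

# The application's query text never changes.
QUERY = "SELECT course, grade FROM enrolment WHERE student = ? ORDER BY course"


def report(student):
    return conn.execute(QUERY, (student,)).fetchall()


before = report("Eric Tachibana")

# Physical change: add an index, which alters how rows are located on disk.
conn.execute("CREATE INDEX idx_enrolment_student ON enrolment (student)")

after = report("Eric Tachibana")
print(before == after)
```

The program and its query are untouched; only the access path chosen by the DBMS may differ.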
6.13 SUMMARY
A database system is an integrated collection of related files along with the details about
their definition, interpretation, manipulation and maintenance. A DBMS is a major
software component of a database system. It consists of a collection of interrelated data
and programs to access that data. The primary goal of a DBMS is to provide an
environment which is both convenient and efficient to use in retrieving information from
and storing information into the database.
The DBMS not only makes the integrated collection of reliable and accurate data
available to multiple applications and users but also prevents unauthorised users
from accessing the data.
A DBMS is a major software system consisting of a number of elements. It provides
users with a DDL for defining the external and conceptual views of the data and a DML for
manipulating the data stored in the database. The database manager is the component
of the DBMS that provides the interface between the user and the file system. The database
administrator (DBA) defines and maintains the three levels of the database as well as the
mappings between levels, to insulate the higher levels from changes that take place in the
lower levels. The DBA is also responsible for implementing measures for ensuring the
security, integrity and recovery of the database.
6.14 ANSWERS TO SELF CHECK EXERCISES
1) DBMS, short for Database Management System, is a software system that uses a
standard method of cataloguing, retrieving, and running queries on data. The DBMS
manages incoming data, organises it, and provides ways for the data to be modified
or extracted by users or other programs. Some DBMS examples include MySQL,
PostgreSQL, Microsoft Access, SQL Server, FileMaker, Oracle,
Clipper, and FoxPro.
2) There are three levels of abstraction:
a. Physical level: The lowest level of abstraction, which describes how data are stored.
b. Logical level: The next higher level of abstraction, which describes what data are
stored in the database and what relationships exist among those data.
c. View level: The highest level of abstraction, which describes only part of the entire
database.
3) Data independence means that the application is independent of the storage
structure and access strategy of data. In other words, a modification of the
schema definition at one level should not affect the schema definition at the next
higher level.
Two types of Data Independence:
Physical Data Independence: Modification at the physical level should not affect
the logical level.
Logical Data Independence: Modification at the logical level should not affect the view
level. Logical Data Independence is more difficult to achieve.
4) It translates DML statements in a query language into low-level instructions that the
query evaluation engine can understand.
5) It is a collection of conceptual tools for describing data, data relationships, data
semantics and constraints.
6.15 KEYWORDS
Data definition language (DDL)   : The subset of SQL used for defining and
                                   examining the structure of a database.
Data manipulation language (DML) : The subset of SQL used for inserting, deleting,
                                   updating and fetching data in a database.
Data Sub Language (DSL)          : A computer language used to define or
                                   manipulate the structure of a relational database
                                   management system (DBMS).
SQL                              : SQL is short for Structured Query Language, an
                                   international standard language for manipulating
                                   relational databases.
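The difference between the DDL and DML subsets of SQL can be seen in a short sketch, assuming SQLite as the DBMS and a hypothetical grade table. One statement defines the structure; the remaining four insert, update, fetch and delete data within that structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the structure of the database.
conn.execute("CREATE TABLE grade (student TEXT, course TEXT, mark TEXT)")

# DML: inserting, updating, fetching and deleting data.
conn.execute("INSERT INTO grade VALUES ('Eric Tachibana', 'Chinese 3', 'B')")
conn.execute("UPDATE grade SET mark = 'A' WHERE course = 'Chinese 3'")
fetched = conn.execute("SELECT student, mark FROM grade").fetchall()
conn.execute("DELETE FROM grade WHERE student = 'Eric Tachibana'")

remaining = conn.execute("SELECT COUNT(*) FROM grade").fetchone()[0]
print(fetched, remaining)
```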