Chapter 1 - Introduction To Database System
Chapter One
Introduction to Database system
1.1 Database - Definition and Usage
Database systems are designed to manage large data sets in an organization. Data
management involves both the definition and the manipulation of data, ranging from simple
representation of the data to considerations of structures for the storage of information. Data
management also covers the provision of mechanisms for the manipulation of information.
Today, databases are essential to every business. They are used to maintain internal records, to
present data to customers and clients on the World Wide Web, and to support many other
commercial processes. Databases are likewise found at the core of many modern organizations.
The power of databases comes from a body of knowledge and technology that has developed
over several decades and is embodied in specialized software called a database management
system, or DBMS. A DBMS is a powerful tool for creating and managing large amounts of data
efficiently and allowing it to persist over long periods of time, safely. These systems are among
the most complex types of software available.
Thus, for our question: What is a database? In essence, a database is nothing more than a
collection of shared information that exists over a long period of time, often many years. In
common usage, the term database refers to a collection of data that is managed by a DBMS.
Thus the DB course is about:
How to organize data
Supporting multiple users
Efficient and effective data retrieval
Secured and reliable storage of data
Maintaining consistent data
Making information useful for decision making
For example, one user, the grade reporting office, may keep a file on students and their grades.
Programs to print a transcript and to enter new grades into the file are implemented as part of the
application. A second user, the accounting office, may keep track of students' fees and their
payments. Although both are interested in the data about students, each user maintains separate
files and programs to manipulate the files because each requires some data not available from the
other user's files.
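The two-office situation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field layouts, the student name, and the in-memory "files" are hypothetical and are not taken from any real system. It shows that when each office keeps its own copy of the student's details, a correction made in one file does not reach the other.

```python
import csv
import io

# Hypothetical separate files, one per office (held in memory for this sketch).
grades_file = io.StringIO(
    "student_id,name,course,grade\n1001,Sara Tesfaye,DB101,A\n"
)
fees_file = io.StringIO(
    "student_id,name,fee_due,fee_paid\n1001,Sara Tesfaye,500,200\n"
)

def load(file_obj):
    """Read one office's file into a list of dict records."""
    file_obj.seek(0)
    return list(csv.DictReader(file_obj))

grade_records = load(grades_file)
fee_records = load(fees_file)

# The grade reporting office corrects the student's name in its own copy only.
grade_records[0]["name"] = "Sara Tesfaye Bekele"

# The accounting office's file is untouched, so the two copies now disagree.
names_match = grade_records[0]["name"] == fee_records[0]["name"]
print("Names consistent across offices:", names_match)
```

Each office also carries fields the other lacks (grades versus fees), which is the reason the text gives for keeping the files separate in the first place.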
Summary
• File-based systems were an early attempt to computerize the manual filing system.
• This approach is a decentralized, computerized method of data handling.
The introduction of shared files solves the problem of inconsistent data across different versions
of the same file held by different departments, but other problems may emerge, including:
• When each department had its own version of a file for processing, each department
could ensure that the structure of the file suited its specific application. If departments
have to share files, the file structure that suits one department might not suit another. For
example, data might need to be sorted in a different sequence for different applications:
customer details could be stored in alphabetical or numerical order, or in ascending or
descending order of customer number.
• Some applications may require access to more data than others. For instance, a credit
control application will need access to customer credit limit information, whereas a
delivery note printing application will only need access to customer name and address
details. The file will still need to contain the additional information to support the
application that requires it.
• If the structure of the data file needs to be changed in some way (for example, to reflect a
change in currency), this alteration will need to be reflected in all application programs
that use that data file. This problem is known as physical data dependence, and will be
examined in more detail later in the unit.
• While a data file is being processed by one application, the file will not be available for
other applications or for ad hoc queries. This is because, if more than one application is
allowed to alter data in a file at one time, serious problems can arise in ensuring that the
updates made by each application do not clash with one another. This issue of ensuring
consistent, concurrent updating of information is an extremely important one, and is dealt
with in detail for database systems in the unit on concurrency control. File-based systems
avoid these problems by not allowing more than one application to access a file at one
time.
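The physical data dependence problem from the list above can be made concrete with a short Python sketch. The fixed-width record layout, the field positions, and the customer values are all hypothetical, chosen only to show how a program that hard-codes the file structure stops working when that structure changes.

```python
# Hypothetical fixed-width layout known to every program that reads the file:
# columns 0-3 customer number, 4-23 name, 24-31 credit limit.
OLD_RECORD = "0042" + "Jane Doe".ljust(20) + "00001500"

def read_credit_limit(record):
    """Application code with the file layout hard-coded into it."""
    return int(record[24:32])

old_limit = read_credit_limit(OLD_RECORD)

# The file structure changes: a 3-character currency code is added after the name.
NEW_RECORD = "0042" + "Jane Doe".ljust(20) + "USD" + "00001500"

try:
    # The same slice now covers part of the currency code, so parsing fails.
    new_limit = read_credit_limit(NEW_RECORD)
except ValueError:
    new_limit = None

print("Old layout:", old_limit, "| new layout:", new_limit)
```

Every program that reads the file would need the same slice positions updated, which is the maintenance burden the bullet describes.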
DBMS Engine
The engine is the central component of a DBMS. This component provides access to the
database and coordinates all of the functional elements of the DBMS. An important source of
data for the DBMS engine, and the database system as a whole, is known as metadata. Metadata
means data about data. Metadata is contained in a part of the DBMS called the data dictionary
(described below), and is a key source of information to guide the processes of the DBMS
engine. The DBMS engine receives logical requests for data (and metadata) from human users
and from applications, determines the secondary storage location (i.e. the disk address of the
requested data), and issues physical input/output requests to the computer operating system. The
data requested is fetched from physical storage into computer main memory, where it is held
in special data structures provided by the DBMS. Whilst the data remains in memory, it is
managed by the DBMS engine. Additional data structures are created by the database system
itself, or by users of the system, in order to provide rapid access to data being processed by the
system. These data structures include indexes to speed up access to the data, buffer areas into
which particular types of data are retrieved, and lists of free space. The management of these
additional data structures is also carried out by the DBMS engine.
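As a rough illustration of a logical request being turned into a physical access strategy, the sketch below uses Python's built-in sqlite3 module with SQLite standing in for "a DBMS engine". The customer table, its columns, and the index name are hypothetical. The application only states which data it wants; SQLite's EXPLAIN QUERY PLAN statement reports how the engine intends to find it, before and after an index is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_no INTEGER, name TEXT, credit_limit REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(i, f"Customer {i}", 100.0 * i) for i in range(1, 1001)],
)

def query_plan(connection, sql):
    """Return the engine's access strategy for a query as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)

# A logical request: it names the data wanted, not where it is stored.
lookup = "SELECT credit_limit FROM customer WHERE name = 'Customer 500'"

plan_without_index = query_plan(conn, lookup)

# An additional data structure created by a user to speed up access.
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")
plan_with_index = query_plan(conn, lookup)

print("Before index:", plan_without_index)
print("After index: ", plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, so the sketch only relies on whether the index name appears in it.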
3. Data Dictionary:
Because a database is a self-describing system, the data dictionary is the tool
used to store and organize information about the data stored in the database.
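The self-describing property can be seen in a small sketch, again using SQLite through Python's sqlite3 module as a stand-in for a DBMS. The student table is hypothetical. SQLite keeps its catalog in a table named sqlite_master, and the PRAGMA table_info statement returns one row per column of a table, so the description of the data can be queried like any other data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Ask the catalog which tables exist.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Ask for the column descriptions of one table.
# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(student)").fetchall()
column_names = [col[1] for col in columns]
column_types = [col[2] for col in columns]

print(catalog)
print(list(zip(column_names, column_types)))
```

Other database systems expose their dictionaries differently, so the table and statement names here apply to SQLite only.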
Hardware:
Hardware comprises the components that one can touch and feel. These include various types
of personal computers, mainframes or any server computers used in a multi-user system, network
infrastructure, and other peripherals required in the system.
Software:
Software is the collection of commands and programs used to make the hardware perform a
function. It includes components such as the DBMS software, application programs, operating systems,
network software, language software and other relevant software.
Data:
Since the goal of any database system is to have better control of the data and to make data useful, data
is the most important component to the user of the database. There are two categories of data in any
database system: operational data and metadata. Operational data is the data actually stored in the
system to be used by the user. Metadata is the data that is used to store information about the database
itself. The structure of the data in the database is called the schema, which is composed of the entities,
the properties of entities, and the relationships between entities.
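A minimal sketch of a schema and its operational data follows, using Python's sqlite3 module. The student, course, and enrollment tables and their values are hypothetical. The CREATE TABLE statements define the entities and their properties, the REFERENCES clauses express the relationships, and the INSERT statements add operational data. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Schema: entities (tables), properties (columns), relationships (foreign keys).
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id TEXT NOT NULL REFERENCES course(course_id),
    grade TEXT,
    PRIMARY KEY (student_id, course_id)
);
""")

# Operational data: the rows that users actually work with.
conn.execute("INSERT INTO student VALUES (1, 'Student One')")
conn.execute("INSERT INTO course VALUES ('DB101', 'Introduction to Database Systems')")
conn.execute("INSERT INTO enrollment VALUES (1, 'DB101', 'A')")

# A row that breaks the relationship (no student 99) is rejected by the DBMS.
try:
    conn.execute("INSERT INTO enrollment VALUES (99, 'DB101', 'B')")
    relationship_enforced = False
except sqlite3.IntegrityError:
    relationship_enforced = True

operational_rows = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(relationship_enforced, operational_rows)
```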
Procedure:
Procedures are the rules and regulations on how to design and use a database. They include procedures such as
how to log on to the DBMS, how to use facilities, how to start and stop a transaction, how to make a
backup, how to treat hardware and software failures, and how to change the structure of the database.
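Two of the procedures listed above, starting and stopping a transaction and making a backup, can be sketched with Python's sqlite3 module. The account table, the balances, and the transfer function are hypothetical. The sketch assumes SQLite's behavior: using the connection as a context manager commits the transaction on success and rolls it back if an exception is raised, and Connection.backup copies the database into another connection.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE account ("
    "account_id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
source.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
source.commit()

def transfer(connection, from_id, to_id, amount):
    """Run both updates as one transaction; undo both if either fails."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE account_id = ?",
                (amount, to_id),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE account_id = ?",
                (amount, from_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = transfer(source, 1, 2, 30)     # both updates are committed
second_ok = transfer(source, 1, 2, 500)   # breaks the CHECK, so it is rolled back

# Backup procedure: copy the whole database into a second connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)
balances = backup_copy.execute(
    "SELECT balance FROM account ORDER BY account_id"
).fetchall()
print(first_ok, second_ok, balances)
```

In a real installation the backup target would be a file on separate storage rather than a second in-memory database.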
People:
This component is composed of the people in the organization who are responsible for, or play a role in,
designing, implementing, managing, administering and using the resources in the database. It ranges
from people with a high level of knowledge about the database and the design
technology to others with no knowledge of the system beyond using the data in the database. In general,
database users include the following people:
Database Administrator
Database Designer
Application programmer and System analysts
End Users
4. End Users
Workers whose jobs require accessing the database frequently for various purposes. There are different
groups of users in this category.
i. Naïve Users:
Sizable proportion of users
Unaware of the DBMS
Only access the database based on their access level and demand
Use standard and pre-specified types of queries.
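The idea of standard, pre-specified queries can be sketched as follows, using Python's sqlite3 module. The grade table, the query name, and the run_canned_query helper are all hypothetical. The user of such an application supplies only values, never SQL, and can run only the queries that were fixed in advance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade (student_id INTEGER, course TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO grade VALUES (?, ?, ?)",
    [(1, "DB101", "A"), (1, "CS102", "B"), (2, "DB101", "C")],
)

# The only queries available to this kind of user, fixed in advance.
CANNED_QUERIES = {
    "grades_for_student":
        "SELECT course, grade FROM grade WHERE student_id = ? ORDER BY course",
}

def run_canned_query(connection, query_name, *params):
    """Run a pre-specified query by name; unknown names raise KeyError."""
    sql = CANNED_QUERIES[query_name]
    return connection.execute(sql, params).fetchall()

result = run_canned_query(conn, "grades_for_student", 1)
print(result)
```

Passing the student number as a bound parameter rather than pasting it into the SQL text keeps the user's input from changing the query itself.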
Database users can also be classified as “Actors on the Scene” and “Workers Behind the Scene”.
Actors on the Scene:
Data Administrator
Database Administrator
Database Designer
End Users
1. Planning: identifying the information gap in an organization and proposing a database solution to
solve the problem.
2. Data analysis and requirements: concentrates on fact finding about the problem or the
opportunity. Feasibility analysis, requirement determination and structuring, and selection of the best
design method are also performed at this phase.
i. Designer’s efforts are focused on
a) Information needs of information users.
b) Information sources.
c) Information constitution.
ii. Sources of information for the designer
a) Developing and gathering end user data views
b) Direct observation of the current system: existing and desired output
c) Interface with the systems design group
iii. The designer must identify the company’s business rules and analyze their impacts.
4. Design: in database design, more emphasis is given to this phase. The phase is further divided into
three sub-phases.
a) Conceptual Design: a concise description of the data, data types, relationships between data,
and constraints on the data. There is no consideration of implementation or physical detail.
It is used to elicit and structure all information requirements.
5. DBMS Selection
The selection of an appropriate DBMS to support the database application.
Undertaken at any time prior to logical design, provided sufficient information is available
regarding system requirements.
This phase also covers designing the user interface and the application programs using the selected DBMS.