Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Database and DBMS Introduction

Download as pdf or txt
Download as pdf or txt
You are on page 1of 31

PRINCIPLES OF DATABASE DESIGN

What is Data?
In simple words, data can be facts related to any object in
consideration. For example, your name, age, height, weight, etc. are
some data related to you. A picture, image, file, pdf, etc. can also be
considered data.

Data is a real-world entity or an object. Data is a distinct piece of


information or facts that need to be processed. It can be in any form
like text, number, picture, measurements, and bytes.
Example: Ankit, Delhi, 12, 80.

What is Information?
When data are processed, organized, structured, and interpreted in a
given context, so as to make them useful and meaningful, they are
called information.
Example: Name - Ankit, City - Delhi, Class – 12, Marks – 80.

What is Database?
A database is a shared collection of logically related data and its
description, designed to meet the information needs of an
organization.
Why a Database?
Size

You may have thousands or millions of rows of customers, or any piece


of information.

Accuracy

“Do you care if someone entered incorrect data?”. If yes, nothing could
actually prevent me from typing incorrect data into a spreadsheet.

Security

If the data is sensitive, and you need to restrict access to the data; It
doesn’t need to be shared with everyone. In addition, “Do you need to
know who made every change at every point?”.

Redundancy

If the redundant data (having multiple copies of the same data) will
lead to conflict, you would need to have only non-repeated unique
data.

Importance

“What if you had a disconnect or a crash, and you lost your


data?”. You’ve probably felt that pain before. And it’s unacceptable to
lose important data like orders of a customer, allergies of a patient,
flight bookings, …etc.
Overwriting

How about having more than one person overwriting the same data at
the same time. How about 10 at the same time or 100 people at the
same time?. You’ll end up with everybody overwriting everybody else’s
changes.

DATABASE COMPONENTS
There are five main components of a database:

Hardware:
The hardware consists of physical, electronic devices like computers,
I/O devices, storage devices, etc. This offers the interface between
computers and real-world systems.

The particular Hardware depends on the organization requirements


and the DBMS used.

A DBMS also requires a minimum amount of main memory and disk


space to run.

Software:
This is a set of programs used to manage and control the overall
database. This includes the database software itself, the Operating
System, the network software used to share the data among users, and
the application programs for accessing data in the database.

Data:
Data is a raw and unorganized fact that is required to be processed to
make it meaningful. Data can be simple at the same time unorganized
unless it is organized. Generally, data comprises facts, observations,
perceptions, numbers, characters, symbols, images, etc.

Procedure:
Procedure are a set of instructions and rules that help you to use the
DBMS. It is designing and running the database using documented
methods, which allows you to guide the users who operate and manage
it.

The users of the system require documented procedures on how to run


and or use the system. These may consist of instructions on how to:

• Log onto the DBMS


• Use a particular application program
• Start and stop the DBMS
• Make backup copies of the DBMS
• Handle software or hardware failures
• Change the structure of a table

Database Access Language:


Database Access language is used to access the data to and from the
database, enter new data, update already existing data, or retrieve
required data from DBMS. The user writes some specific commands in
a database access language and submits these to the database.

What is File Management System?


A file management system is a collection of programs that manage and
store data in files and folders in a computer hard disk. A file
management system manages the way of reading and writing data to
the hard disk. It is also known as conventional file system.
This system actually stores data in the isolated files which have their
own physical location on the drive, and users manually go to these
locations to access these files. It is the easiest way to store the data like
text, videos, images, audios, etc. in general files. Data redundancy is
high in file management system, and it cannot be controlled easily.
Data consistency is not met, and the integration of data is hard to
achieve.
Database Management System (DBMS)

Database Management System (DBMS) is a collection of


programs that enable its users to access databases, manipulate data,
report, and represent data. It also helps to control access to the
database. Database Management Systems are not a new concept and,
as such, had been first implemented in the 1960s.

The DBMS is the software system that enables users to define, create,
maintain and control access to the database.

The database has your actual data and the rules about that data, while
the DBMS is the program that surrounds and manages your actual
data, and it enforces the rules you specified on your data. The rules for
example could be the type of the data, like integer or string, or the
relationship between them.

Roles of a DBMS
1. Provides a security system which prevents unauthorized access to
the database.
2. Provides an Integrity system which maintains consistency of the
stored data.
3. Provides a concurrency control system which allows shared
access of the database.
4. Provides a recovery control system which restores the database to
a previous consistent state following a hardware or software
failure.
5. Provides a user accessible catalog which contains description of
the data in the database.

There are different DBMS, and they are categorized under:

• Relational Database Management Systems

• Hierarchical Database Systems

• Network Database Systems

• Object-Oriented Database Systems

• NoSQL Database Systems

Advantages of DBMS
• Sharing of data - DBMS offers a variety of techniques to enable sharing of
data.
• Improved Data integrity – Database integrity refrs to the validity and
consistency of stored data.
• Improved security – Database security is the protection of the database from
unauthorized users. This security may take the form of user names and
passwords to identify users authorized to access the database.
• Enforcement of standards – Integration allows the Database Administrator
to defineand enforce the necessary standards. May include data formats to
facilitate exchange of data between systems, naming conventions,
documentation standards etc.
• Economy of scale – Combining all the organization operational data into one
database and creating a set of applications that work on this source of data
can result in cost savings.
• Improved data accessibility and responsiveness – Many DBMS provide query
languages or report writers that allows users to ask ad hoc questions and to
obtain the required information almost immediately at their terminal.
• Increased Productivity – DBMS provides many of the standard functions that
allows the programmer to concentrate on the specific functionality required
by the users.
• Improved maintenance through data independence.
• Increased concurrency.
• Backup and recovery services.

Disadvantage of DBMS
DBMS may offer plenty of advantages but, it has certain flaws-

• Cost of Hardware and Software of a DBMS is quite high which increases the
budget of your organization.
• Most database management systems are often complex systems, so the
training for users to use the DBMS is required.
• In some organizations, all data is integrated into a single database which can
be damaged because of electric failure or database is corrupted on the
storage media.
• Use of the same program at a time by many users sometimes lead to the loss
of some data.
• DBMS can’t perform sophisticated calculations.

In general, we shall focus on 2 types of databases:

• Non-Relational

• Relational
Non-Relational Database Management System (Non-RDBMS)

In Non-RDBMS, data is stored in key-value pairs. For example:

Example: Data stored in Non-RDBMS

Here, customers' data are stored in key-value pairs.

Commonly used Non-RDBMS: MongoDB, Amazon DynamoDB, Redis, etc.

Relational Database Management System (RDBMS)

In RDBMS, data is stored in tabular format. For example,


Example: Relational Database

Here, customers is a table inside the database.


The first row is the attributes of the table. Each row after that contains the data
of a customer.

In RDBMS, two or more tables may be related to each other. Hence the term
"Relational". For example,
Example: Relationship in RDBMS

Here, orders and customers are related through customer_id .

Commonly used RDBMS: MySQL, PostgreSQL, MSSQL, Oracle etc.


Types of Database Model

Database Model: It determines the logical structure of a database


and fundamentally determines in which manner data can be stored,
organized and manipulated.

There are four common types of database model that are useful for
different types of data or information. Depending upon your specific
needs, one of these models can be used.

1. Hierarchical databases.
2. Network databases.
3. Relational databases.
4. Object-oriented databases.

1. Hierarchical databases

It is one of the oldest database model developed by IBM for information


Management System. In a hierarchical database model, the data is
organized into a tree-like structure. In simple language we can say that
it is a set of organized data in tree structure.

This type of Database model is rarely used nowadays. Its structure is like
a tree with nodes representing records and branches representing fields.
The windows registry used in Windows XP is an example of a
hierarchical database. Configuration settings are stored as tree
structures with nodes.

The following figure shows the generalized the structure of Hierarchical


database model in which data is stored in the form of tree like structure
(data represented or stored in root node, parent node and child node).
The following figure shows the example of hierarchical database model
for the university management system. This type of database employs
the “parent-child” relationship of storing data.

Advantages

• The model allows us easy addition and deletion of new information.


• Data at the top of the Hierarchy is very fast to access.
• It worked well with linear data storage mediums such as tapes.
• It relates well to anything that works through a one to many
relationships. For example; there is a president with many managers
below them, and those managers have many employees below them,
but each employee has only one manager.

Disadvantages

• It requires data to be repetitively stored in many different entities.


• Now a day there is no longer use of linear data storage mediums such
as tapes.
• Searching for data requires the DBMS to run through the entire
model from top to bottom until the required information is found,
making queries very slow.
• This model support only one to many relationships, many to many
relationships are not supported.

2. Network databases

This is looks like a Hierarchical database model due to which many


time it is called as modified version of Hierarchical database. Network
database model organised data more like a graph and can have more
than one parent node. The network model is a database model
conceived as a flexible way of representing objects and their
relationships.
Advantage

• The network model is conceptually simple and easy to design.


• The network model can represent redundancy in data more
effectively than in the hierarchical model.
• The network model can handle the one to many and many to many
relationships which is real help in modelling the real-life situations.
• The data access is easier and flexible than the hierarchical model.
• The network model is better than the hierarchical model in isolating
the programs from the complex physical storage details.

Disadvantage:

• All the records are maintained using pointers and hence the whole
database structure becomes very complex.
• The insertion, deletion and updating operations of any record require
the large number of pointers adjustments.
• The structural changes to the database is very difficult.
3. Relational Database

A relational database is developed by E. F. Codd in 1970. The various


software systems used to maintain relational databases are known as a
relational database management system (RDBMS). In this model, data
is organised in rows and column structure i.e., two-dimensional tables
and the relationship is maintained by storing a common field. It consists
of three major components.

In relational model, three key terms are heavily used such as relations,
attributes, and domains. A relation nothing but is a table with rows and
columns. The named columns of the relation are called as attributes, and
finally the domain is nothing but the set of values the attributes can take.
The following figure gives us the overview of rational database model.

Terminology used in Relational Model

• Tuple: Each row in a table is known as tuple.


• Cardinality of a relation: The number of tuples in a relation
determines its cardinality. In this case, the relation has a cardinality of
4.
• Degree of a relation: Each column in the tuple is called an attribute.
The number of attributes in a relation determines its degree. The
relation in figure has a degree of 3.

Keys of a relation-

• Primary key- it is the key that uniquely identifies a table. It doesn’t have
nullvalues.
• Foreign key- it refers to the primary key of some other table.it permits
only those values which appear in the primary key of the table to which
it refers.

Some of the example of relational database are as follows.

Oracle: Oracle Database is commonly referred to as Oracle RDBMS or


simply as Oracle. It is a multi-model database management system
produced and marketed by Oracle Corporation.

MySQL: MySQL is an open-source relational database management


system (RDBMS) based on Structured Query Language (SQL). MySQL
runs on virtually all platforms, including Linux, UNIX, and Windows.

Microsoft SQL Server: Microsoft SQL Server is an RDBMS that supports


a wide variety of transaction processing, business intelligence, and
analytics applications in corporate IT environments.

PostgreSQL: PostgreSQL, often simply Postgres, is an object-relational


database management system (ORDBMS) with an emphasis on
extensibility and standards compliance.

DB2: DB2 is an RDBMS designed to store, analyze, and retrieve data


efficiently.
The following tables shows the sample relational database model for the
bank environment where the data associated with the bank is stored in
the form two dimensional tables.

Advantage

• Relational model is one of the most popular used database model.


• In relational model, changes in the database structure do not affect
the data access.
• The revision of any information as tables consisting of rows and
columns is much easier to understand.
• The relational database supports both data independence and
structure independence concept which makes the database design,
maintenance, administration and usage much easier than the other
models.
• In this we can write complex query to accesses or modify the data
from database.
• It is easier to maintain security as compare to other models.

Disadvantages

• Mapping of objects in relational database is very difficult.


• Object oriented paradigm is missing in relation model.
• Data Integrity is difficult to ensure with Relational database.
• Relational Model is not suitable for huge database but suitable for
small database.
• Hardware overheads are incurred which make it costly.
• Ease of design can lead to bad design.
• Relational database system hides the implementation complexities
and the physical data storage details from the users.

4. Object-oriented databases

An object database is a system in which information is represented in


the form of objects as used in object-oriented programming. Object
oriented databases are different from relational databases which are
table-oriented. The object-oriented data model is based on the object-
oriented- programming language concept, which is now in wide use.
Inheritance, polymorphism, overloading. object-identity, encapsulation
and information hiding with methods to provide an interface to objects,
are among the key concepts of object-oriented programming that have
found applications in data modelling. The object-oriented data model
also supports a rich type system, including structured and collection
types.

The following figure shows the difference between relation and object-
oriented database model.
The following figure shows an example of object-oriented model.

Advantages

• Object database can handle different types of data while relational


data base handles a single data. Unlike traditional databases like
hierarchical, network or relational, the object-oriented databases can
handle the different types of data, for example, pictures, voice video,
including text, numbers and so on.
• Object-oriented databases provide us code reusability, real world
modelling, and improved reliability and flexibility.
• The object-oriented database is having low maintenance costs as
compared to other model because most of the tasks within the system
are encapsulated, they may be reused and incorporated into new tasks.
Disadvantages

• There isno universally defined data model for an OODBMS, and most
models lack a theoretical foundation.
• In comparison to RDBMSs the use of OODBMS is still relatively
limited.
• There is a Lack of support for security in OODBMSs that do not
provide adequate security mechanisms.
• The system more complex than that of traditional DBMSs.

Difference between File System and Database Management System


There are following differences between file system and DBMS:

File System Database Management System (DBMS)

1. It is a software system used for creating and


1. It is a software system that manages and controls
managing the databases. DBMS provides a systematic
the data files in a computer system.
way to access, update, and delete data.

2. Database Management System supports multi-user


2. File system does not support multi-user access.
access.

3. Data consistency is more due to the use of


3. Data consistency is less in the file system.
normalization.

4. File system is not secured. 4. Database Management System is highly secured.

5. File system is used for storing the unstructured 5. Database management system is used for storing
data. the structured data.

6. In the file system, data redundancy is high. 6. In DBMS, Data redundancy is low.

7. No data backup and recovery process is present in


7. There is a backup recovery for data in DBMS.
a file system.
8. Handling of a file system is easy. 8. Handling a DBMS is complex.

9. Cost of database management system is more than


9. Cost of a file system is less than the DBMS.
the file system.

10. If one application fails, it does not affect other 10. If the database fails, it affects all application which
application in a system. depends on it.

11. In the file system, data cannot be shared 11. In DBMS, data can be shared as it is stored at one
because it is distributed in different files. place in a database.

12. These system does not provide concurrency


12. This system provides concurrency facility.
facility.

13. Example: NTFS (New technology file system), 13. Example: Oracle, MySQL, MS SQL Server, DB2,
EXT (Extended file system), etc. Microsoft Access, etc.
DBMS ARCHITECTURE

This is the first step when creating a database management system.


The architecture of a database management system plays an important
role in determining the actual design and layout of the database. The
architecture of the DBMS determines the functionality of the database.
In contrast, its architecture is dependent on the number of users and
the method of connection of those users and the database.
To understand the DBMS architecture, you must have a clear idea
about the client and server. The client is the user requesting service, in
this case, access to the data stored in the database. The server side is
the side that answers the request of the client.
The architecture can be divided into the following three categories
based on the layers between the server and client layer.

• 1 tier Architecture
• 2 tier Architecture
• 3 tier Architecture

We can determine the DBMS architecture based on the number of


clients and its usage.
Let us discuss all these architectures individually to better understand
the working of each architecture.

1 Tier Architecture
In this architecture, the user directly interacts with the database itself,
which means that it is accessible to the user to create, alter or delete
the data.
The user directly sits on the database, and there is no layer between the
user and the database. Therefore, there is no data abstraction; the
whole data is available.
There is no interactive user interface.
This architecture is only preferred when the user aims to prepare a
local application as there is no real security is in this architecture.
This method provides a quick response to the user.

2 Tier Architecture
This architecture is rather the simplest version of a server-client
application.
It provides an application layer that works as an isolating layer
between the client and the database server. But this application layer is
only available on the user side.
This application layer is only available at the client end. This
application layer provides a certain level of data abstraction.
The user interfaces and application program provide different views to
the different users run on the user-side. In contrast, the server-side
provides various functionalities such as translating and performing
queries.
The user can only communicate using the application layer at the
client-side, and the server end application interacts with the database.
3 Tier Architecture
This architecture is similar to 2 Tier architecture as it also provides an
intermediate layer between the client and the database server. The only
difference is that it adds one more application layer as an intermediate,
and now the application layer is available at both the client and the
server sides.
In this architecture, the user connects through the user application,
and that user application interacts with the application server that
communicates with the databases.
This provides ample layer to abstract the data.
Any modification in the data done by the user does not directly affect
the database itself. The changes are first performed on the application
layer. There is no real connection between the database and the user.
The query processing and other functionalities of the database
management system are performed at the server-side application.
It allows the DBA to manage the access of the various users. It also
enables concurrent transactions on the database.

Three Level ANSI Architecture

Schema Architecture
It describes the architecture of the database at various levels.
There are three layers of schema that separates the physical, logical
and application part of the database. Physical Schema, Logical Schema
and the External Schema are mapped together.
There are two levels of mapping between various levels.
At the external-conceptual level and the conceptual-internal level, the
request and response are transformed to communicate between these
two layers.
They interact with each other to perform the operations on the
database. A schema represents all the database entities and the
relationship between those entities.
The overall description of the database is called the database
Schema.

There are 3 types of schema:


1. External schemas
2. Conceptual schema
3. Internal schema

Internal Level
It defines the physical structure of the database. It defines the format
in which the data will be stored or, say, the structure in which the data
will be stored. It is also known as the Physical Schema. It describes the
complex low-level data structures of the database. At this level of
schema, the database administrator determines the definition of the
database by using data definition language. The user does not have
access to this level, requiring special rights. Initially, only the data
administrator can alter any change into it.
Conceptual Level
At this level, we describe the design of the database. It describes what
data will be stored in the database. This is called the logical schema;
the end-user does not have access to this layer. The programmers and
database administrators are the only ones with access to this level. At
this level, we store the data in the data structures described at the
physical level. However, the details of the implementation of the
physical level are kept hidden at this level.

External Level
It describes the various views of the database. This level is also called
the view level. There can be different views for different users
depending on the needs of the end-users.
At this level, the end-user interacts with the database, and the details
of both physical and logical schemas are kept hidden from the user.
The hiding of details of one level from other and only showing the
necessary information is called abstraction.
DBMS LANGUAGES
User can access, update, delete, and store data or information in the
database using database languages. The following are the databases
languages in the database management system:
• Data Definition Language
• Data Manipulation Language
• Data Control Language
• Transaction Control Language
Data Definition Language (DDL)
Data Definition Language is used for defining the structure or schema
of the database. It is also used for creating tables, indexes, applying
constraints, etc. in the database.
The main purpose of DDL is to store the information of metadata like
the number of schemas and tables, their names, indexes, constraints,
columns in each table, etc. The result of Data Definition Language
statements will be a set of tables which are stored in a special file
called data directory or data dictionary.
This language is used by the conceptual schema to access and retrieve
the records from/to the database respectively, where these records
describe entities, relationship, and attributes.
There are following Data Definition Languages (DDL)
Commands:

• Create: This command is used to create a new table or a new


database.
• Alter: This command is used to alter or change the structure of
the database table.
• Drop: This command is used to delete a table, index, or views
from the database.
• Truncate: This command is used to delete the records or data
from the table, but its structure remains as it is.
• Rename: This command is used to rename an object from the
database.
• Comment: This command is used for adding comments to our
table.

Data Manipulation Language (DML)


Data Manipulation Language is a language used to access or
manipulate the data in the database. In simple words, this language is
used to retrieve the data from the database, insert new data into the
database, and delete the existing data from the database.
Data Manipulation Language is mainlyof two types:

• Procedural DML: This type of DML describes what data is to


be accessed and how to get that data.
• Declarative DML or Non-procedural DML: This type of
DML only describes what data is to be accessed without
specifying how to get it.

There are following Data Manipulation Language (DML)


Commands:

• Select: This command is used to retrieve or access the data from


the database table.
• Insert: This command is used to insert the records into the
table.
• Update: This command is used to change/update the existing
data in a table.
• Delete: This command is used to delete one or all the existing
records from the table.

Data Control Language


DCL is used to access the stored or saved data. It is mainly used for
revoking and granting user access on a database. In
the Oracle database, this language does not have the feature of
rollback. It is a part of SQL.
There are following Data Control Language (DCL)
Commands:

• Grant: This command allows user’s access privileges to the


database.
• Revoke: This command removes the accessibility of users from
the database objects.

Transaction Control Languages


Transaction Control language is a language which manages
transactions within the database. It is used to execute the changes
made by the data manipulation language (DML) statements.
There are following Transaction Control Language (TCL)
Commands:

• Commit: This command is used to save the transactions in the


database.
• Rollback: This command is used to restore the database to that
state which was last committed.

You might also like