
Unit 1


What is Database

A database is a collection of interrelated data organized so that the data can be
retrieved, inserted, and deleted efficiently. The data is typically organized into
tables, schemas, views, reports, and so on.

For example, a college database organizes data about administrators, staff,
students, and faculty.

Using the database, you can easily retrieve, insert, and delete the information.

Database Management System


o A database management system (DBMS) is software used to manage a
database. For example, MySQL and Oracle are popular database management
systems used in many different applications.
o DBMS provides an interface to perform various operations like database
creation, storing data in it, updating data, creating a table in the database and
a lot more.
o It provides protection and security to the database. In the case of multiple
users, it also maintains data consistency.

DBMS allows users the following tasks:

o Data Definition: It is used for the creation, modification, and removal of the
definitions that describe the organization of data in the database.
o Data Update: It is used for the insertion, modification, and deletion of the
actual data in the database.
o Data Retrieval: It is used to retrieve the data from the database which can be
used by applications for various purposes.
o User Administration: It is used for registering and monitoring users, maintaining
data integrity, enforcing data security, handling concurrency control,
monitoring performance, and recovering information corrupted by unexpected
failures.
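
The first three tasks above can be sketched with Python's built-in sqlite3 module. This
is only an illustration under stated assumptions: the in-memory database, the student
table, and the sample names are hypothetical and do not come from these notes.

```python
import sqlite3

# In-memory SQLite database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")

# Data definition: describe how the data is organized.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Data update: insert and then modify the actual data.
conn.executemany(
    "INSERT INTO student (id, name) VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi")],
)
conn.execute("UPDATE student SET name = ? WHERE id = ?", ("Ravi K.", 2))

# Data retrieval: read the data back for use by an application.
students = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()
print(students)
```

User administration is not shown here because SQLite, as an embedded single-file
engine, has no user accounts of its own.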

Characteristics of DBMS
o It uses a digital repository established on a server to store and manage the
information.
o It can provide a clear and logical view of the process that manipulates data.
o DBMS contains automatic backup and recovery procedures.
o It supports the ACID properties (atomicity, consistency, isolation, durability), which
keep data in a consistent state in case of failure.
o It can simplify complex relationships between data items.
o It is used to support manipulation and processing of data.
o It is used to provide security of data.
o It lets users view the database from different viewpoints according to their
requirements.

Advantages of DBMS
o Controls database redundancy: It can control data redundancy because all the data is
stored in one centralized database rather than in separate copies.
o Data sharing: In a DBMS, the authorized users of an organization can share the data
among themselves.
o Easy maintenance: It is easy to maintain because of the centralized nature of
the database system.
o Reduced time: It reduces development time and maintenance effort.
o Backup: It provides backup and recovery subsystems that automatically back up data
to protect against hardware and software failures and restore the data when required.
o Multiple user interfaces: It provides different types of user interfaces, such as
graphical user interfaces and application program interfaces.

Disadvantages of DBMS
o Cost of hardware and software: It requires a high-speed processor and a
large amount of memory to run DBMS software.
o Size: It occupies a large amount of disk space and memory to run efficiently.
o Complexity: A database system creates additional complexity and requirements.
o Higher impact of failure: A failure has a large impact because, in most
organizations, all the data is stored in a single database. If that database is
damaged by a power failure or database corruption, the data may be lost
forever.
Database

What is Data?
Data is a collection of small, distinct units of information. It can take a variety
of forms, such as text, numbers, media, or bytes, and it can be stored on paper, in
electronic memory, and so on.

The word 'data' originates from the word 'datum', which means 'a single piece of
information'. 'Data' is the plural of 'datum'.

In computing, Data is information that can be translated into a form for efficient
movement and processing. Data is interchangeable.

What is Database?
A database is an organized collection of data, so that it can be easily accessed and
managed.

You can organize data into tables, rows, columns, and index it to make it easier to
find relevant information.

Database designers build a database in such a way that a single set of software
programs provides access to the data for all users.

The main purpose of a database is to handle a large amount of information by
storing, retrieving, and managing data.

Many dynamic websites on the World Wide Web today are backed by databases. For
example, a site that checks the availability of rooms in a hotel is a dynamic
website that uses a database.

There are many databases available like MySQL, Sybase, Oracle, MongoDB,
Informix, PostgreSQL, SQL Server, etc.

Modern databases are managed by the database management system (DBMS).

SQL (Structured Query Language) is used to operate on the data stored in a
database. SQL is based on relational algebra and tuple relational calculus.

In diagrams, a database is conventionally drawn as a cylinder.


Evolution of Databases
Databases have evolved for more than 50 years, from flat-file systems to relational
and object-relational systems, passing through several generations along the way.

The Evolution

File-Based

File-based databases were introduced in 1968. In file-based databases, data was
maintained in flat files. Although files have several advantages, they also have
several limitations.

One of the major advantages is that the file system offers various access methods, e.g.,
sequential, indexed, and random.

A major limitation is that it requires extensive programming in a third-generation
language such as COBOL or BASIC.

Hierarchical Data Model

1968-1980 was the era of the hierarchical database. The most prominent hierarchical
database was IMS (Information Management System), IBM's first DBMS.

In this model, files are related in a parent/child manner.

In a diagram of the hierarchical data model, small circles represent objects.
Like the file system, this model also had limitations, such as complex implementation,
lack of structural independence, and difficulty handling many-to-many relationships.
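
The many-to-many limitation can be illustrated with a toy tree in Python. This is a
sketch under stated assumptions, not how IMS stores data: the nested dictionaries, the
course names, and the student name are all hypothetical.

```python
# Toy hierarchy: every node has exactly one parent and a list of children.
# All names are hypothetical and used only for illustration.
tree = {
    "name": "College",
    "children": [
        {"name": "DBMS course", "children": [{"name": "Asha", "children": []}]},
        {"name": "Networks course", "children": [{"name": "Asha", "children": []}]},
    ],
}


def count(node, name):
    """Count how many nodes in the tree carry the given name."""
    return (node["name"] == name) + sum(count(c, name) for c in node["children"])


# A student enrolled in two courses has to appear under both parents.
copies = count(tree, "Asha")
print(copies)
```

Because a child can hang under only one parent, the same student record is duplicated
once per course, which is the redundancy that later models set out to avoid.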

Network data model


Charles Bachman developed the first DBMS, called Integrated Data Store (IDS), at
General Electric. It was developed in the early 1960s and was standardized in 1971 by
the CODASYL group (Conference on Data Systems Languages).

In this model, files are related as owners and members, as in a common network
model.

The network data model identified the following components:

o Network schema (database organization)
o Sub-schema (views of the database per user)
o Data management language (procedural)

This model also had limitations, such as system complexity and being difficult to
design and maintain.

Data Models
A data model is a model of the data description, data semantics, and consistency
constraints of the data. It provides the conceptual tools for describing the design of a
database at each level of data abstraction. The following four data models are used
for understanding the structure of a database:

1) Relational Data Model: This model organizes data in rows and columns within
tables. A relational model thus uses tables to represent both the data and the
relationships among the data. Tables are also called relations. The model was first
described by Edgar F. Codd in 1969. The relational data model is the most widely used
model and is primarily used by commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is a logical representation of
data as objects and the relationships among them. These objects are known as entities,
and a relationship is an association among entities. The model was designed by
Peter Chen and published in a 1976 paper. It is widely used in database design.
A set of attributes describes each entity. For example, student_name and student_id
describe the 'student' entity. A set of entities of the same type is known as an
'entity set', and a set of relationships of the same type is known as a
'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of
functions, encapsulation, and object identity. This model supports a rich type
system that includes structured and collection types. In the 1980s, various database
systems following the object-oriented approach were developed. Here, an object is
the data together with its properties.

4) Semistructured Data Model: This data model differs from the other three described
above. The semistructured data model allows data specifications in which individual data
items of the same type may have different sets of attributes. The Extensible Markup
Language (XML) is widely used for representing semistructured data. Although XML was
initially designed for adding markup information to text documents, it gained importance
because of its use in data exchange.
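
The defining property of the semistructured model, items of the same type carrying
different attribute sets, can be sketched with Python's standard xml.etree module. The
XML document, element names, and values below are hypothetical examples, not data from
these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two <student> items with different sets of fields.
doc = """
<students>
  <student id="1"><name>Asha</name><email>asha@example.org</email></student>
  <student id="2"><name>Ravi</name><phone>555-0100</phone></student>
</students>
"""

root = ET.fromstring(doc)

# For each student, collect the names of the child elements it carries.
fields = {
    s.get("id"): sorted(child.tag for child in s)
    for s in root.findall("student")
}
print(fields)
```

In a relational table both rows would need the same columns, with NULLs for missing
values; here each item simply lists the fields it has.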

Data model Schema and Instance

o The data stored in the database at a particular moment in time is
called an instance of the database.
o The overall design of a database is called schema.
o A database schema is the skeleton structure of the database. It represents the
logical view of the entire database.
o A schema contains schema objects such as tables, foreign keys, primary keys, views,
columns, data types, stored procedures, etc.
o A database schema can be represented by a visual diagram that shows the
database objects and their relationships with each other.
o A database schema is designed by the database designers to help
programmers whose software will interact with the database. The process of
database creation is called data modeling.

A schema diagram can display only some aspects of a schema, such as the names of
record types and data items, and some types of constraints. Other aspects can't be
specified through the schema diagram. For example, the given figure shows neither the
data type of each data item nor the relationships among the various files.

In the database, actual data changes quite frequently. For example, in the given
figure, the database changes whenever we add a new grade or add a student. The
data at a particular moment of time is called the instance of the database.
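
The difference between schema and instance can be sketched with Python's built-in
sqlite3 module. This is an illustration under stated assumptions: the grade table and
its columns are hypothetical, and reading sqlite_master is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE grade (student_id INTEGER, course TEXT, grade TEXT)")


def schema(conn):
    # SQLite keeps the CREATE statements that define the schema in sqlite_master.
    return [row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]


def instance(conn):
    # The rows present right now form the current instance.
    return conn.execute("SELECT * FROM grade").fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO grade VALUES (1, 'DBMS', 'A')")  # add a new grade
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)    # the schema is unchanged
print(instance_before, instance_after)  # the instance has changed
```
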
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the ability to modify the schema at one
level of the database system without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence


o Logical data independence refers to the ability to change the
conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual
view.
o If we make any changes to the conceptual view of the data, the user view of the
data is not affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence
o Physical data independence can be defined as the capacity to change the internal
schema without having to change the conceptual schema.
o If we make any changes to the storage size of the database system server, the
conceptual structure of the database is not affected.
o Physical data independence is used to separate conceptual levels from the internal
levels.
o Physical data independence occurs at the logical interface level.

Fig: Data Independence
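
Both kinds of data independence can be approximated with Python's built-in sqlite3
module, treating a view as the external schema, the base table as the conceptual schema,
and an index as an internal (storage-level) detail. This mapping is an assumption made
for illustration, and the table, view, and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: the base table (hypothetical names).
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

# External schema: the view that a user or application works with.
conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")
before = conn.execute("SELECT * FROM student_names").fetchall()

# Logical data independence: change the conceptual schema by adding a column.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_column = conn.execute("SELECT * FROM student_names").fetchall()

# Physical data independence: change a storage-level detail by adding an index.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_index = conn.execute("SELECT * FROM student_names").fetchall()

print(before, after_column, after_index)
```

The view returns the same rows after each change, so a program written against the view
would not need to be modified.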

Database Languages in DBMS


o A DBMS has appropriate languages and interfaces to express database queries and
updates.
o Database languages can be used to read, store and update the data in the database.
Types of Database Languages

1. Data Definition Language (DDL)


o DDL stands for Data Definition Language. It is used to define database structure or
pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the
number of tables and schemas, their names, indexes, columns in each table,
constraints, etc.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.


o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to add comments to the data dictionary.
These commands are used to update the database schema, which is why they come
under the data definition language.
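
Some of these DDL tasks can be tried out with Python's built-in sqlite3 module. This is
a sketch under stated assumptions: the staff and employee table names are hypothetical,
and SQLite is used only because it ships with Python. SQLite has no TRUNCATE or COMMENT
statement, so those two tasks are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def table_names(conn):
    # List the tables currently defined in the schema (SQLite-specific catalog).
    return [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]


conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT)")  # Create
conn.execute("ALTER TABLE staff ADD COLUMN dept TEXT")                   # Alter
conn.execute("ALTER TABLE staff RENAME TO employee")                     # Rename
after_rename = table_names(conn)

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [r[1] for r in conn.execute("PRAGMA table_info(employee)")]

conn.execute("DROP TABLE employee")                                      # Drop
after_drop = table_names(conn)

print(after_rename, columns, after_drop)
```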

2. Data Manipulation Language (DML)


DML stands for Data Manipulation Language. It is used for accessing and
manipulating data in a database. It handles user requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.


o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete records from a table, either all of them or only those
matching a condition.
o Merge: It performs an UPSERT operation, i.e., an insert or an update.
o Call: It is used to call a PL/SQL or Java subprogram.
o Explain Plan: It describes the access path the DBMS will use to retrieve the data.
o Lock Table: It controls concurrency.
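
The core DML tasks can be sketched with Python's built-in sqlite3 module. The student
table and its rows are hypothetical. SQLite has no MERGE statement, so INSERT OR REPLACE
is used below as a stand-in for an upsert; it replaces the whole row rather than
updating it in place, which is an approximation of MERGE, not an equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")

conn.execute("INSERT INTO student VALUES (1, 'Asha')")            # Insert
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")
conn.execute("UPDATE student SET name = 'Asha M.' WHERE id = 1")  # Update
conn.execute("DELETE FROM student WHERE id = 2")                  # Delete matching rows only

# Stand-in for an upsert: id 1 already exists (replaced), id 3 is new (inserted).
conn.execute("INSERT OR REPLACE INTO student VALUES (1, 'Asha N.')")
conn.execute("INSERT OR REPLACE INTO student VALUES (3, 'Meena')")

rows = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()  # Select
print(rows)
```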

3. Data Control Language (DCL)


o DCL stands for Data Control Language. It is used to control access to the data stored
in the database by granting and revoking privileges.
o The DCL execution is transactional. It also has rollback parameters.

(In Oracle Database, however, the execution of data control language statements
cannot be rolled back.)

Here are some tasks that come under DCL:

o Grant: It is used to give user access privileges to a database.


o Revoke: It is used to take back permissions from the user.

The following privileges can be granted and revoked:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.
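
The idea behind Grant and Revoke can be shown with a toy model in Python. This is not a
DBMS API: the SQLite engine bundled with Python does not implement GRANT or REVOKE, so
the dictionary, the function names, and the user names below are all hypothetical and
only mimic the bookkeeping a real DBMS performs.

```python
# Toy model: maps each user to the set of privileges held on one object.
# Everything here is hypothetical and only illustrates the concept.
privileges = {}


def grant(user, privilege):
    """Give a privilege to a user."""
    privileges.setdefault(user, set()).add(privilege)


def revoke(user, privilege):
    """Take a privilege back; does nothing if the user never had it."""
    privileges.get(user, set()).discard(privilege)


def is_allowed(user, privilege):
    """Check whether the user currently holds the privilege."""
    return privilege in privileges.get(user, set())


grant("alice", "SELECT")
grant("alice", "INSERT")
revoke("alice", "INSERT")

print(is_allowed("alice", "SELECT"), is_allowed("alice", "INSERT"))
```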

4. Transaction Control Language (TCL)


TCL is used to manage the changes made by DML statements. DML statements can be
grouped into logical transactions.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction on the database.


o Rollback: It is used to restore the database to its state as of the last Commit.
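
Commit and Rollback can be sketched with Python's built-in sqlite3 module, whose
connection objects expose commit() and rollback() methods. The account table and the
amounts are hypothetical, and the sketch relies on the module's default behavior of
opening a transaction implicitly before a data-modifying statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()  # Commit: the inserted row is now saved

# This update belongs to a new, not yet committed transaction.
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
in_transaction = conn.execute(
    "SELECT balance FROM account WHERE id = 1").fetchone()[0]

conn.rollback()  # Rollback: undo everything since the last commit
after_rollback = conn.execute(
    "SELECT balance FROM account WHERE id = 1").fetchone()[0]

print(in_transaction, after_rollback)
```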

Centralized and Client Server Architecture for DBMS
Centralized Architecture of DBMS:
Architectures for DBMSs have generally followed trends seen in architectures for
larger computer systems. The primary processing for all system functions, including
user application programs, user interface programs, and all DBMS capabilities, was
handled by mainframe computers in earlier systems. The primary reason was that most
users accessed such systems through computer terminals that had little processing
power and offered only display capabilities. All processing was therefore done
remotely on the central computer system, and only display data and controls were
sent from it to the display terminals, which were connected to the central computer
by a variety of communications networks.

The majority of users switched from terminals to PCs and workstations as hardware
prices decreased. Initially, database systems used these computers in the same way
they had used display terminals. As a result, the DBMS itself continued to operate as
a centralized DBMS, where all DBMS functionality, application program execution, and
UI processing were done on a single computer. Client/server DBMS designs emerged as
DBMS systems gradually began to take advantage of the computing capability on the
user side.
Client-server Architecture of DBMS:
We first talk about client/server architecture in general, and then we look at how
DBMSs use it. In order to handle computing settings with a high number of PCs,
workstations, file servers, printers, database servers, etc., the client/server architecture
was designed.

A network connects various pieces of software and hardware, including email and
web server software. The aim is to define specialized servers, each with a particular
functionality. For instance, it is possible to connect a number of PCs or small
workstations as clients to a file server that maintains the files of the client
machines. Another machine can be designated as a printer server by connecting it to
several printers; all print requests from clients are then directed to this machine.
The category of specialized servers also includes web servers and email servers. Many
client machines can use the resources offered by specialized servers. The client
machines give the user the appropriate interfaces to these servers as well as local
processing power to run local applications. This idea can be applied to various types
of software, where specialist applications, like a CAD (computer-aided design)
package, are kept on particular server computers and made available to a variety of
clients. Some machines (such as workstations or PCs with disks that have only client
software installed) would be client sites only.

The idea of client/server architecture presupposes an underlying framework made
up of many PCs and workstations, as well as a smaller number of mainframe computers,
connected via LANs and other types of computer networks. In this framework, a client
is typically a user machine that offers local processing and user interface
capabilities. When a client needs access to extra features, such as database access,
that are not available on that machine, it connects to a server that offers those
features. A server is a system containing both hardware and software that can provide
services to client machines, such as file access, printing, archiving, or database
access. Generally speaking, some machines install both client and server software,
while others install only client software. More typically, however, client and server
software run on separate machines. On this underlying client/server framework, the
two-tier and three-tier DBMS architectures were developed.

Two-Tier Client Server Architecture:


Here, the term "two-tier" refers to the architecture's two layers: the client layer
and the data layer. The client layer contains a number of client computers that can
contact the database server. An API on the client computer, such as JDBC, connects
the client to the database server. This is needed because clients and database
servers may be in different physical locations.

Three-Tier Client-Server Architecture:


In this architecture, an additional layer, the business logic layer, serves as a link
between the client layer and the data layer. The business logic layer is where the
application programs are processed, unlike in a two-tier architecture, where queries
are performed in the database server. Here, the application programs are processed
in the application server itself.
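
The separation of the three layers can be sketched in a single Python file, with one
function standing in for each tier. In a real deployment each tier would usually run on
a separate machine; here they share one process purely for illustration. The room table,
the function names, and the hotel data are hypothetical.

```python
import sqlite3


# Data layer: owns the database and nothing else (hypothetical schema).
def make_database():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE room (number INTEGER PRIMARY KEY, booked INTEGER)")
    conn.executemany("INSERT INTO room VALUES (?, ?)", [(101, 0), (102, 1)])
    return conn


# Business logic layer: application rules, expressed as queries on the data layer.
def available_rooms(conn):
    rows = conn.execute("SELECT number FROM room WHERE booked = 0 ORDER BY number")
    return [number for (number,) in rows]


# Client layer: presentation only, with no SQL and no business rules.
def render(rooms):
    return "Available rooms: " + ", ".join(str(r) for r in rooms)


conn = make_database()
message = render(available_rooms(conn))
print(message)
```

In a two-tier design the client would issue the SELECT itself; the middle function is
what the third tier adds.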

Classification of Database Management Systems
ADRIENNE WATT
Database management systems can be classified based on several criteria, such as
the data model, user numbers and database distribution, all described below.

Classification Based on Data Model


The most popular data model in use today is the relational data model. Well-known
DBMSs like Oracle, MS SQL Server, DB2 and MySQL support this model. Other
traditional models, such as hierarchical data models and network data models, are
still used in industry mainly on mainframe platforms. However, they are not
commonly used due to their complexity. These are all referred to
as traditional models because they preceded the relational model.

In recent years, the newer object-oriented data models were introduced. In this
model, information is represented in the form of objects, as used in object-oriented
programming. Object-oriented databases are different from relational databases,
which are table-oriented. Object-oriented database management systems (OODBMS)
combine database capabilities with object-oriented programming language
capabilities.

The object-oriented models have not caught on as expected, so they are not in
widespread use. Some examples of object-oriented DBMSs are O2, ObjectStore
and Jasmine.

Classification Based on User Numbers


A DBMS can be classified based on the number of users it supports. It can be
a single-user database system, which supports only one user at a time, or
a multiuser database system, which supports multiple users concurrently.

Classification Based on Database Distribution


There are four main distribution systems for database systems and these, in turn,
can be used to classify the DBMS.

Centralized systems
With a centralized database system, the DBMS and database are stored at a single
site that is used by several other systems too. This is illustrated in Figure 6.1.
In the early 1980s, many Canadian libraries used the GEAC 8000 to convert their
manual card catalogues to machine-readable centralized catalogue systems. Each
book catalogue had a barcode field similar to those on supermarket products.

Distributed database system


In a distributed database system, the actual database and the DBMS software are
distributed across various sites that are connected by a computer network, as shown
in Figure 6.2.

Homogeneous distributed database systems


Homogeneous distributed database systems use the same DBMS software at
multiple sites. Data exchange between these various sites can be handled easily.
For example, library information systems by the same vendor, such as Geac
Computer Corporation, use the same DBMS software which allows easy data
exchange between the various Geac library sites.
Heterogeneous distributed database systems
In a heterogeneous distributed database system, different sites might use different
DBMS software, but there is additional common software to support data exchange
between these sites. For example, the various library database systems use the
same machine-readable cataloguing (MARC) format to support library record data
exchange.
