
Rdbms Unit2


Data models

A data model is a collection of conceptual tools for describing data, data semantics, and
consistency constraints. It provides a way to describe the design of a
database at each level of data abstraction. The following four data models are
used for understanding the structure of a database:

1) Relational Data Model: This model organizes data into rows and columns within
a table. Thus, a relational model uses tables to represent both the data and the relationships among the data. Tables
are also called relations. This model was initially described by Edgar F. Codd in 1969. The relational data
model is the most widely used model and is the basis of most commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is the logical representation of data as objects and
relationships among them. These objects are known as entities, and a relationship is an association
among these entities. This model was designed by Peter Chen and published in a 1976 paper. It is
widely used in database design. A set of attributes describes each entity. For example, student_name
and student_id describe the 'student' entity. A set of entities of the same type is known as an 'entity set',
and a set of relationships of the same type is known as a 'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of functions, encapsulation,
and object identity. This model supports a rich type system that includes structured and
collection types. In the 1980s, various database systems following the object-oriented approach were
developed. Here, an object is data together with its properties.

4) Semistructured Data Model: This type of data model differs from the other three data models
explained above. The semistructured data model permits data specifications in which
individual data items of the same type may have different sets of attributes. The Extensible Markup
Language, also known as XML, is widely used for representing semistructured data. Although XML
was initially designed for adding markup information to text documents, it has gained importance
because of its application in the exchange of data.


Data model Schema and Instance


o The data which is stored in the database at a particular moment of time is called an instance of
the database.
o The overall design of a database is called schema.
o A database schema is the skeleton structure of the database. It represents the logical view of
the entire database.
o A schema contains schema objects like table, foreign key, primary key, views, columns, data
types, stored procedure, etc.
o A database schema can be represented by a visual diagram. That diagram shows the
database objects and their relationships with each other.
o A database schema is designed by the database designers to help programmers whose
software will interact with the database. The process of designing the database is called data
modeling.

A schema diagram can display only some aspects of a schema, such as the names of record types
and data items, and some constraints. Other aspects can't be specified through the schema diagram. For
example, the given figure shows neither the data type of each data item nor the relationships
among the various files.

In the database, actual data changes quite frequently. For example, in the given figure, the
database changes whenever we add a new grade or add a student. The data at a particular
moment of time is called the instance of the database.
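
The difference between schema and instance can be sketched in a few lines. This is a minimal illustration using Python's built-in sqlite3 module; the `student` table and its rows are made-up names for the example, not part of the figure referred to above.

```python
import sqlite3


def schema_and_instances():
    """Show that inserting rows changes the instance but not the schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT NOT NULL)"
    )

    def schema():
        # SQLite keeps the table definition (the schema) in sqlite_master.
        return conn.execute(
            "SELECT sql FROM sqlite_master WHERE name = 'student'"
        ).fetchone()[0]

    def instance():
        # The rows stored right now form the current instance.
        return conn.execute(
            "SELECT student_id, student_name FROM student ORDER BY student_id"
        ).fetchall()

    schema_before, instance_before = schema(), instance()
    conn.execute("INSERT INTO student VALUES (1, 'Asha')")
    conn.execute("INSERT INTO student VALUES (2, 'Ravi')")
    schema_after, instance_after = schema(), instance()
    conn.close()
    return schema_before == schema_after, instance_before, instance_after
```

Calling `schema_and_instances()` returns a flag saying the schema text did not change, followed by the instance before and after the inserts.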
Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one level of the
database system without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence


o Logical data independence refers to the characteristic of being able to change the conceptual schema
without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, then the user view of the data is
not affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence


o Physical data independence can be defined as the capacity to change the internal schema
without having to change the conceptual schema.
o If we change how the data is physically stored, for example the storage size of the database
system server, then the conceptual structure of the database is not affected.
o Physical data independence is used to separate conceptual levels from the internal levels.
o Physical data independence occurs at the logical interface level.

Fig: Data Independence
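
Both kinds of data independence can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module; the table, view, and index names are hypothetical. An index stands in for a change at the internal level, and a view stands in for an external schema.

```python
import sqlite3


def data_independence_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1, "Asha", "Pune"), (2, "Ravi", "Delhi")],
    )
    # External schema: the user only sees this view.
    conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")

    view_query = "SELECT id, name FROM student_names ORDER BY id"
    table_query = "SELECT name FROM student WHERE city = 'Pune'"
    view_before = conn.execute(view_query).fetchall()
    table_before = conn.execute(table_query).fetchall()

    # Physical change: add an index. The query text and its result stay the same.
    conn.execute("CREATE INDEX idx_student_city ON student (city)")
    table_after = conn.execute(table_query).fetchall()

    # Conceptual change: add a column. The view the user relies on is unaffected.
    conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
    view_after = conn.execute(view_query).fetchall()

    conn.close()
    return table_before == table_after, view_before == view_after
```

The first returned flag corresponds to physical data independence and the second to logical data independence.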

Database Languages in DBMS


o A DBMS has appropriate languages and interfaces to express database queries and updates.
o Database languages can be used to read, store and update the data in the database.

Types of Database Languages


1. Data Definition Language (DDL)
o DDL stands for Data Definition Language. It is used to define database structure or
pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number of tables
and schemas, their names, indexes, columns in each table, constraints, etc.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.
o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to comment on the data dictionary.

These commands are used to update the database schema, which is why they come under the Data
Definition Language.
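
As a rough sketch of these DDL tasks, the following uses Python's built-in sqlite3 module with a hypothetical `course` table. Note that SQLite has no TRUNCATE or COMMENT statement, so only Create, Alter, Rename, and Drop are shown.

```python
import sqlite3


def table_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def ddl_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")  # Create
    conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")                    # Alter
    columns = column_names(conn, "course")
    conn.execute("ALTER TABLE course RENAME TO subject")                             # Rename
    after_rename = table_names(conn)
    conn.execute("DROP TABLE subject")                                               # Drop
    after_drop = table_names(conn)
    conn.close()
    return columns, after_rename, after_drop
```

Each statement changes only the structure (the schema); no rows are read or written.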

2. Data Manipulation Language (DML)


DML stands for Data Manipulation Language. It is used for accessing and manipulating
data in a database. It handles user requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.
o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete records from a table, either all of them or only those that match a condition.
o Merge: It performs an UPSERT operation, i.e., an insert or an update.
o Call: It is used to call a PL/SQL or Java subprogram.
o Explain Plan: It describes the access path (execution plan) chosen for a statement.
o Lock Table: It controls concurrency.
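
The four most common DML tasks can be sketched as follows, assuming Python's built-in sqlite3 module and a hypothetical `student` table. Merge, Call, Explain Plan, and Lock Table are vendor-specific and are not shown.

```python
import sqlite3


def dml_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, grade TEXT)")

    # Insert: add new rows.
    conn.executemany(
        "INSERT INTO student (id, name, grade) VALUES (?, ?, ?)",
        [(1, "Asha", "B"), (2, "Ravi", "C"), (3, "Meena", "A")],
    )
    # Update: change existing rows that match the WHERE condition.
    conn.execute("UPDATE student SET grade = 'A' WHERE id = 1")
    # Delete: remove only the rows that match the WHERE condition.
    conn.execute("DELETE FROM student WHERE id = 2")
    # Select: read the data back.
    rows = conn.execute("SELECT id, name, grade FROM student ORDER BY id").fetchall()
    conn.close()
    return rows
```

Without a WHERE clause, the Update and Delete statements would apply to every row in the table.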

3. Data Control Language (DCL)


o DCL stands for Data Control Language. It is used to control access to the data stored in the
database by granting and revoking privileges.
o In some database systems, DCL execution is transactional and can be rolled back.

(But in Oracle database, the execution of data control language does not have the
feature of rolling back.)

Here are some tasks that come under DCL:

o Grant: It is used to give user access privileges to a database.
o Revoke: It is used to take back permissions from the user.

The following privileges can be granted and revoked:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.
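
The effect of Grant and Revoke can be modelled with a toy privilege table. This is only an illustrative sketch in plain Python, not the API of any real DBMS; the class name `PrivilegeTable` and the user and table names are invented for the example.

```python
from collections import defaultdict


class PrivilegeTable:
    """Toy model of who may do what on which object."""

    def __init__(self):
        # Maps (user, object) to the set of privileges currently held.
        self._grants = defaultdict(set)

    def grant(self, user, obj, privilege):
        self._grants[(user, obj)].add(privilege)

    def revoke(self, user, obj, privilege):
        # discard() does nothing if the privilege was never granted.
        self._grants[(user, obj)].discard(privilege)

    def is_allowed(self, user, obj, privilege):
        return privilege in self._grants.get((user, obj), set())
```

A real DBMS stores this information in its catalog and checks it before executing each statement.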

4. Transaction Control Language (TCL)


TCL is used to manage the changes made by DML statements. It allows DML statements to be
grouped into a logical transaction.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction to the database.
o Rollback: It is used to restore the database to its state at the last Commit.
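
Commit and Rollback can be sketched with Python's built-in sqlite3 module, whose connection objects expose `commit()` and `rollback()`. The `account` table and the balances are hypothetical values chosen for the illustration.

```python
import sqlite3


def balance_of_a(conn):
    return conn.execute("SELECT balance FROM account WHERE name = 'A'").fetchall()[0][0]


def tcl_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES ('A', 30)")
    conn.commit()  # Commit: the insert is now saved.

    conn.execute("UPDATE account SET balance = 0 WHERE name = 'A'")
    conn.rollback()  # Rollback: undo everything since the last commit.
    after_rollback = balance_of_a(conn)

    conn.execute("UPDATE account SET balance = 20 WHERE name = 'A'")
    conn.commit()
    conn.rollback()  # Nothing to undo: the update was already committed.
    after_commit = balance_of_a(conn)

    conn.close()
    return after_rollback, after_commit
```

The rolled-back update leaves the balance at its committed value, while the committed update survives a later rollback.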
Interfaces in DBMS
A database management system (DBMS) interface is a user interface that
allows for the ability to input queries to a database without using the query
language itself.
User-friendly interfaces provided by DBMS may include the following:

1. Menu-Based Interfaces for Web Clients or Browsing –


These interfaces present the user with lists of options (called menus) that
lead the user through the formation of a request. The basic advantage of
using menus is that they remove the need to remember the specific
commands and syntax of a query language. The query is
composed step by step by picking options from a menu that
is shown by the system. Pull-down menus are a very popular technique
in Web-based interfaces. They are also often used in browsing
interfaces, which allow a user to look through the contents of a database in
an exploratory and unstructured manner.

2. Forms-Based Interfaces –
A forms-based interface displays a form to each user. Users can fill out all
of the form entries to insert new data, or they can fill out only certain
entries, in which case the DBMS will retrieve matching data for the
remaining entries. These forms are usually designed
and programmed for users who have no expertise in operating the
system. Many DBMSs have forms specification languages, which are
special languages that help specify such forms.
Example: SQL*Forms is a form-based language that specifies queries
using a form designed in conjunction with the relational database schema.

3. Graphical User Interface –


A GUI typically displays a schema to the user in diagrammatic form. The
user can then specify a query by manipulating the diagram. In many
cases, GUIs utilize both menus and forms. Most GUIs use a pointing
device, such as a mouse, to pick a certain part of the displayed schema
diagram.

4. Natural language Interfaces –


These interfaces accept requests written in English or some other
language and attempt to understand them. A natural language interface
has its own schema, which is similar to the database conceptual schema,
as well as a dictionary of important words.
The natural language interface refers to the words in its schema, as well
as to the set of standard words in a dictionary, to interpret the request. If
the interpretation is successful, the interface generates a high-level query
corresponding to the natural language request and submits it to the DBMS for
processing; otherwise, a dialogue is started with the user to clarify the
request. The main disadvantage is that the
capabilities of this type of interface are not very advanced.
5. Speech Input and Output –
Limited use of speech, whether as an input query or as an answer to a question or
the result of a request, is becoming commonplace. Applications with
limited vocabularies, such as inquiries for telephone directory, flight
arrival/departure, and bank account information, allow speech for input
and output to enable ordinary users to access this information.
The speech input is detected using predefined words and used to set up
the parameters that are supplied to the queries. For output, a similar
conversion from text or numbers into speech takes place.

6. Interfaces for DBA –

Most database systems contain privileged commands that can be used
only by the DBA’s staff. These include commands for creating accounts,
setting system parameters, granting account authorization, changing a
schema, and reorganizing the storage structures of a database.

Classification of database management system

There are various types of databases used for storing different varieties of data:
1) Centralized Database
It is the type of database that stores data in one central database system. It allows
users to access the stored data from different locations through several applications. These
applications include an authentication process to let users access data securely. An example
of a centralized database is a central library that maintains a central database for each library
in a college/university.

Advantages of Centralized Database

o It reduces the risk in data management, i.e., manipulation of data will not affect
the core data.
o Data consistency is maintained as it manages data in a central repository.
o It provides better data quality, which enables organizations to establish data standards.
o It is less costly because fewer vendors are required to handle the data sets.

Disadvantages of Centralized Database

o The size of the centralized database is large, which increases the response time for
fetching the data.
o It is not easy to update such an extensive database system.
o If the central server fails, all of the data becomes unavailable and may be lost, which could be a huge loss.

2) Distributed Database
Unlike a centralized database system, in distributed systems, data is distributed among
different database systems of an organization. These database systems are connected via
communication links. Such links help the end-users to access the data easily. Examples of
the Distributed database are Apache Cassandra, HBase, Ignite, etc.

We can further divide a distributed database system into:

o Homogeneous DDB: Database systems that run on the same operating
system, use the same application process, and run on the same hardware devices.
o Heterogeneous DDB: Database systems that run on different operating
systems, under different application procedures, and on different hardware
devices.

Advantages of Distributed Database

o Modular development is possible in a distributed database, i.e., the system can be
expanded by including new computers and connecting them to the distributed system.
o One server failure will not affect the entire data set.

3) Relational Database
This database is based on the relational data model, which stores data in the form of
rows (tuples) and columns (attributes) that together form a table (relation). A relational
database uses SQL for storing, manipulating, and maintaining the data. E. F. Codd
proposed the relational model in 1970. Each table in the database has a key that makes each row
unique from the others. Examples of relational databases are MySQL, Microsoft SQL Server,
Oracle, etc.

Properties of Relational Database


The following four commonly known properties of transactions in a relational database are known as the ACID
properties, where:
A means Atomicity: This ensures that a data operation either completes entirely or has no effect
at all. It follows the 'all or nothing' strategy. For example, a transaction will either be
committed or will abort.

C means Consistency: Any operation performed on the data must take the database from one valid
state to another. For example, the account balances before and after a
transaction should be correct, i.e., the total amount should remain conserved.

I means Isolation: Several users can access data from the database at the same time.
Thus, concurrent operations should remain isolated from one another. For example, when
multiple transactions occur at the same time, the effects of one transaction should not be visible to
the other transactions in the database until it commits.

D means Durability: It ensures that once an operation completes and commits the data,
the data changes remain permanent.
4) NoSQL Database
Non-SQL/Not Only SQL is a type of database that is used for storing a wide range of data
sets. It is not a relational database as it stores data not only in tabular form but in several
different ways. It came into existence when the demand for building modern applications
increased. Thus, NoSQL presented a wide variety of database technologies in response to the
demands. We can further divide a NoSQL database into the following four types:

a. Key-value storage: It is the simplest type of database storage where it stores every
single item as a key (or attribute name) holding its value, together.
b. Document-oriented Database: A type of database used to store data as JSON-like
document. It helps developers in storing data by using the same document-model
format as used in the application code.
c. Graph Databases: It is used for storing vast amounts of data in a graph-like
structure. Most commonly, social networking websites use the graph database.
d. Wide-column stores: They resemble the tables of relational databases, but
data is stored grouped by columns rather than by rows.
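
The four NoSQL shapes listed above can be pictured with plain Python data structures. This is only a sketch of how the data is organized, not of any particular product; all keys, names, and values are invented for the illustration.

```python
import json

# Key-value storage: every item is a key holding its value.
key_value_store = {"session:42": "asha", "session:43": "ravi"}

# Document-oriented: a JSON-like document shaped like the application's own objects.
student_document = {
    "id": 1,
    "name": "Asha",
    "courses": [{"code": "DB101", "grade": "A"}],
}

# Graph: nodes connected by edges (here, "follows" links in a social network).
follows = {"asha": {"ravi", "meena"}, "ravi": {"asha"}, "meena": set()}

# Wide-column: each row key has its own set of columns, so rows may differ.
wide_column_rows = {
    "student:1": {"name": "Asha", "city": "Pune"},
    "student:2": {"name": "Ravi", "email": "ravi@example.com"},
}


def round_trip(document):
    """Serialize a document to JSON text and parse it back."""
    return json.loads(json.dumps(document))


def followers_of(graph, user):
    """Return everyone who has an edge pointing at `user`."""
    return sorted(src for src, targets in graph.items() if user in targets)
```

None of these structures requires every item to share the same fixed set of columns, which is the main contrast with a relational table.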

Advantages of NoSQL Database

o It enables good productivity in the application development as it is not required to
store data in a structured format.
o It is a better option for managing and handling large data sets.
o It provides high scalability.
o Users can quickly access data from the database through key-value.

5) Cloud Database
A type of database where data is stored in a virtual environment and runs on a cloud
computing platform. It provides users with various cloud computing services (SaaS, PaaS,
IaaS, etc.) for accessing the database. There are numerous cloud platforms; some
commonly listed options are:

o Amazon Web Services (AWS)
o Microsoft Azure
o Kamatera
o phoenixNAP
o ScienceSoft
o Google Cloud SQL, etc.

6) Object-oriented Databases
The type of database that uses the object-based data model approach for storing data in the
database system. The data is represented and stored as objects which are similar to the objects
used in the object-oriented programming language.

7) Hierarchical Databases
It is the type of database that stores data in the form of parent-children relationship nodes.
Here, it organizes data in a tree-like structure.

Data get stored in the form of records that are connected via links. Each child record in the
tree will contain only one parent. On the other hand, each parent record can have multiple
child records.

8) Network Databases
It is the database that typically follows the network data model. Here, the representation of
data is in the form of nodes connected via links between them. Unlike the hierarchical
database, it allows each record to have multiple children and parent nodes to form a
generalized graph structure.
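
The difference between the hierarchical and network models can be sketched with two small mappings. This is a plain Python illustration with invented record names; real hierarchical and network DBMSs use their own record and link structures.

```python
# Hierarchical model: every child record has exactly one parent, so a simple
# child -> parent mapping is enough to describe the whole tree.
parent_of = {
    "Computer Science": "University",
    "Mathematics": "University",
    "DB101": "Computer Science",
}


def path_to_root(record):
    """Follow the single parent link upwards until the root is reached."""
    path = [record]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


# Network model: a record may have several parents, so each record maps to a
# set of parents and the structure becomes a general graph.
parents_of = {
    "Computer Science": {"University"},
    "Mathematics": {"University"},
    "DB101": {"Computer Science", "Mathematics"},
}


def children_of(parent):
    """Return every record that lists `parent` among its parents."""
    return sorted(child for child, parents in parents_of.items() if parent in parents)
```

In the network version, the course record `DB101` is shared by two departments, which the tree version cannot express without duplicating the record.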

9) Personal Database
Collecting and storing data on the user's system defines a Personal Database. This database is
basically designed for a single user.

Advantage of Personal Database

o It is simple and easy to handle.
o It occupies less storage space as it is small in size.

10) Operational Database


The type of database in which data is created and updated in real time. It is
designed for executing and handling the daily data operations of a business. For
example, an organization uses operational databases for managing its daily transactions.

11) Enterprise Database


Large organizations or enterprises use this database for managing a massive amount of data.
It helps organizations to increase and improve their efficiency. Such a database allows
simultaneous access to users.

Advantages of Enterprise Database:

o It supports multiple processes.
o It allows executing parallel queries on the system.

ACID Properties in DBMS


A DBMS manages data, and that data must keep its integrity whenever changes are made
to it. If the integrity of the data is affected, the whole data set can become
corrupted. Therefore, to maintain the integrity of the data, database management systems
define four properties, known as the ACID properties. The ACID
properties apply to a transaction, which is a group of tasks performed as a single unit.

In this section, we will learn about the ACID properties: what
these properties stand for and what each property is used for. We will also look at
the ACID properties with the help of some examples.

ACID Properties
The term ACID stands for:
1) Atomicity
The term atomicity means that an operation is treated as an indivisible unit. If any operation is
performed on the data, it should either be executed completely or not be
executed at all. The operation should not break off in between or execute
partially. In the case of a transaction, all of its operations should be
executed completely, or none of them.

Example: Remo has account A with $30, from which he wishes to send
$10 to Sheero's account, B. Account B already holds $100. When
$10 is transferred to account B, its balance should become $110. Two
operations take place: the $10 that Remo wants to transfer is
debited from his account A, and the same amount is credited to account B, i.e., to
Sheero's account. Now suppose the debit operation executes successfully, but
the credit operation fails. Remo's account A then holds $20, while
Sheero's account B still holds $100, as before.
In the above diagram, it can be seen that after the failed credit of $10, the amount is still $100 in
account B. So, it is not an atomic transaction.

The below image shows that both debit and credit operations are done successfully. Thus the
transaction is atomic.

Thus, when a transaction loses atomicity, it becomes a huge issue in banking systems,
and so atomicity is a main focus in such systems.
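
The transfer from account A to account B can be sketched with Python's built-in sqlite3 module. The function names and the `fail_before_credit` flag are hypothetical helpers for the illustration; the flag simulates the credit operation failing after the debit has already run.

```python
import sqlite3


def open_bank():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 30), ("B", 100)])
    conn.commit()
    return conn


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall())


def transfer(conn, src, dst, amount, fail_before_credit=False):
    """Debit src and credit dst as one unit: both happen, or neither does."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        if fail_before_credit:
            raise RuntimeError("simulated failure before the credit")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.commit()
        return True
    except RuntimeError:
        conn.rollback()  # undo the debit so that no money disappears
        return False
```

With the simulated failure, the rollback restores A to $30 and B stays at $100; without it, the balances become $20 and $110.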

2) Consistency
The word consistency means that correctness should always be preserved. In DBMS, the
integrity of the data should be maintained, which means that any change made to the database
must leave it in a valid state. In the case of transactions, the integrity of the data is
essential so that the database remains consistent before and after the transaction. The data
should always be correct.

Example:
In the above figure, there are three accounts, A, B, and C, where A makes transaction T
to B and then to C, one after the other. Each transfer consists of two operations, a debit and a credit.
Account A first transfers $50 to account B; the balance of account A is read as $300
before this transfer. After the transfer completes successfully, the available amount in B becomes
$150. Next, A transfers $20 to account C, and this time the balance of A is read as $250 (which is
correct, as the debit of $50 to B has already been applied). The debit and credit operations
from account A to C also complete successfully. Every transfer succeeds
and every value is read correctly, so the data is consistent. If the
balance of A were read as $300 for both transfers, the data would be inconsistent, because the
second read would ignore the debit that has already been executed.
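
The example can be sketched with Python's built-in sqlite3 module. The starting balances are partly assumptions: A starts at $300 as stated, B is taken to start at $100 (so that it reaches $150), and C is assumed to start at $0 because the text does not give its balance. The CHECK constraint is an added illustration of a consistency rule that the DBMS enforces.

```python
import sqlite3


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall())


def move(conn, src, dst, amount):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))


def consistency_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"  # rule: no negative balance
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 300), ("B", 100), ("C", 0)])
    conn.commit()
    total_before = sum(balances(conn).values())

    move(conn, "A", "B", 50)
    move(conn, "A", "C", 20)

    rejected = False
    try:
        move(conn, "A", "B", 1000)  # would drive A below zero
    except sqlite3.IntegrityError:
        rejected = True

    result = balances(conn)
    conn.close()
    return result, total_before, sum(result.values()), rejected
```

The total across the three accounts is the same before and after the transfers, and the transfer that would break the rule is rejected as a whole.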

3) Isolation
The term 'isolation' means separation. In DBMS, isolation is the property that operations
which run concurrently do not affect one another. In short, the result should be the same as if
one operation began only after the other had completed. It means that if two
operations are being performed at the same time, they must not affect the value of
one another. In the case of transactions, when two or more transactions occur simultaneously,
consistency must be maintained. Any changes made by a particular
transaction will not be seen by other transactions until the change is
committed.

Example: If two operations are running concurrently on two different accounts, then the
values of both accounts should not be affected by each other. Each value should remain correct. As you
can see in the below diagram, account A is making transactions T1 and T2 to accounts B and
C, but both are executing independently without affecting each other. This is known as isolation.
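
The rule that uncommitted changes are invisible to other transactions can be sketched with two connections to the same database file. This assumes Python's built-in sqlite3 module with SQLite's default settings; the file name, table, and amounts are made up for the illustration.

```python
import os
import sqlite3
import tempfile


def isolation_demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bank.db")
        writer = sqlite3.connect(path)  # the transaction making a change
        reader = sqlite3.connect(path)  # a second, concurrent user
        query = "SELECT balance FROM account WHERE name = 'B'"
        try:
            writer.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
            writer.execute("INSERT INTO account VALUES ('B', 100)")
            writer.commit()

            # The writer changes the balance but has not committed yet.
            writer.execute("UPDATE account SET balance = 150 WHERE name = 'B'")
            seen_before_commit = reader.execute(query).fetchall()[0][0]

            writer.commit()
            seen_after_commit = reader.execute(query).fetchall()[0][0]
        finally:
            writer.close()
            reader.close()
    return seen_before_commit, seen_after_commit
```

The reader keeps seeing the old balance until the writer commits, and only then sees the new one.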
4) Durability
Durability ensures the permanency of something. In DBMS, durability ensures that
the data becomes permanent in the database after the successful execution of the operation.
Durability should be strong enough that even if the system fails or crashes,
the committed data still survives. If data is lost in a failure, it is the responsibility of the recovery
manager to restore it and so ensure the durability of the database. To make changes permanent, the COMMIT
command must be used every time we make changes.
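
The role of COMMIT in durability can be sketched as follows, assuming Python's built-in sqlite3 module and a temporary database file. Closing the connection without committing is used here as a loose stand-in for a failure; it is not a real crash test.

```python
import os
import sqlite3
import tempfile


def durability_demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bank.db")

        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
        conn.execute("INSERT INTO account VALUES ('A', 20)")
        conn.commit()  # this change is now permanent in the file

        conn.execute("INSERT INTO account VALUES ('B', 110)")
        conn.close()   # closed without commit: the pending change is discarded

        # Reopen the database, as after a restart, and read what survived.
        conn = sqlite3.connect(path)
        rows = conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall()
        conn.close()
    return rows
```

Only the committed row is still present after the database is reopened.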

Therefore, the ACID properties of DBMS play a vital role in maintaining the consistency and
integrity of data in the database.

Thus, this was a brief introduction to the ACID properties in DBMS. We have discussed these
properties in the transaction section also.

Transaction States in DBMS


These are the states through which a transaction goes during its lifetime. They
describe the current state of the transaction and determine how
it will be processed further. These states govern the
rules which decide the fate of the transaction, whether it will commit or abort.
They also use the transaction log. The transaction log is a file maintained by the
recovery management component to record all the activities of the
transaction. Once a transaction has committed and its changes are safely stored, its log records can be discarded.
These are the different types of transaction states:

1. Active State –
When the instructions of the transaction are running then the transaction
is in active state. If all the ‘read and write’ operations are performed
without any error then it goes to the “partially committed state”; if any
instruction fails, it goes to the “failed state”.

2. Partially Committed –
After completion of all the read and write operations, the changes are made
in main memory or the local buffer. If the changes are made permanent in
the database, then the state changes to “committed state”; in case
of failure, it goes to the “failed state”.

3. Failed State –
A transaction goes to the “failed state” when any of its instructions fails, or
when a failure occurs while making the change permanent in the database.

4. Aborted State –
After any type of failure, the transaction goes from the “failed state” to the
“aborted state”. Since in the previous states the changes were made only
to the local buffer or main memory, these changes are deleted or
rolled back.

5. Committed State –
It is the state in which the changes have been made permanent in the database.
The transaction is complete and therefore moves to the
“terminated state”.

6. Terminated State –
If there isn’t any roll-back or the transaction comes from the “committed
state”, then the system is consistent and ready for a new transaction, and
the old transaction is terminated.
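
The states and the moves between them can be summarized as a small state machine. This is a sketch in plain Python; the names `TxState`, `ALLOWED`, and `is_valid_history` are invented for the illustration, and the move from the aborted state to the terminated state is an assumption, since the text only describes termination after a commit explicitly.

```python
from enum import Enum


class TxState(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"
    TERMINATED = "terminated"


# Which state may follow which, as described in the list above.
ALLOWED = {
    TxState.ACTIVE: {TxState.PARTIALLY_COMMITTED, TxState.FAILED},
    TxState.PARTIALLY_COMMITTED: {TxState.COMMITTED, TxState.FAILED},
    TxState.FAILED: {TxState.ABORTED},
    TxState.ABORTED: {TxState.TERMINATED},  # assumption, see the note above
    TxState.COMMITTED: {TxState.TERMINATED},
    TxState.TERMINATED: set(),
}


def is_valid_history(states):
    """Check that a sequence of states starts at ACTIVE and uses only allowed moves."""
    if not states or states[0] is not TxState.ACTIVE:
        return False
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(states, states[1:]))
```

For example, a history that jumps straight from the active state to the committed state is rejected, because a transaction must pass through the partially committed state first.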
