Distributed DBMS
A distributed database is a collection of multiple interconnected databases that are spread
physically across various locations and communicate via a computer network. A Distributed Database
Management System (DDBMS) manages the distributed database and provides mechanisms that make the
distribution transparent to the users. In these systems, data is intentionally distributed among
multiple nodes so that the computing resources of the organization can be used efficiently.
Features
• Databases in the collection are logically interrelated with each other. Often they represent a single
logical database.
• Data is physically stored across multiple sites. Data in each site can be managed by a DBMS
independent of the other sites.
• The processors in the sites are connected via a network. They do not have any multiprocessor
configuration.
• A distributed database is not a loosely connected file system.
• A distributed database incorporates transaction processing, but it is not synonymous with a
transaction processing system.
Distributed Database Management System
A distributed database management system (DDBMS) is a software system that centrally manages a
distributed database as if it were all stored in a single location.
Features
• It is used to create, retrieve, update and delete distributed databases.
• It synchronizes the database periodically and provides access mechanisms through which the
distribution becomes transparent to the users.
• It ensures that data modified at any site is updated at every other site that holds it.
• It is used in application areas where large volumes of data are processed and accessed by numerous
users simultaneously.
• It is designed for heterogeneous database platforms.
• It maintains confidentiality and data integrity of the databases.
Types of Distributed Databases
Distributed databases can be broadly classified into homogeneous and heterogeneous distributed
database environments.
Homogeneous Distributed Databases
In a homogeneous distributed database, all the sites use identical DBMS and operating systems. Its
properties are −
• The sites use very similar software.
• The sites use identical DBMS or DBMS from the same vendor.
• Each site is aware of all other sites and cooperates with other sites to process user requests.
• The database is accessed through a single interface as if it is a single database.
Data Replication
Data replication is the process of storing separate copies of the database at two or more sites. It is a
popular fault tolerance technique of distributed databases.
Advantages of Data Replication
• Reliability − In case of failure of any site, the database system continues to work since a copy is
available at another site(s).
• Reduction in Network Load − Since local copies of data are available, query processing can be
done with reduced network usage, particularly during peak hours. Data updating can be done during
off-peak hours.
• Quicker Response − Availability of local copies of data ensures quick query processing and
consequently quick response time.
• Simpler Transactions − Transactions require fewer joins of tables located at different sites
and minimal coordination across the network. Thus, they become simpler in nature.
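The reliability and local-read advantages above can be illustrated with a small sketch. This is a toy model only, assuming full replication with writes applied eagerly to every live site; the class name ReplicatedStore, its methods, and the site names are hypothetical and do not correspond to any particular DDBMS.

```python
class ReplicatedStore:
    """Toy fully replicated key-value store: every site holds a full copy."""

    def __init__(self, site_names):
        # One independent copy of the data per site (hypothetical site names).
        self.replicas = {name: {} for name in site_names}
        self.down = set()  # sites currently treated as failed

    def write(self, key, value):
        # Assumption: eager replication, the write goes to every live site.
        # A site that was down misses the write and would need a resync later.
        for name, data in self.replicas.items():
            if name not in self.down:
                data[key] = value

    def read(self, key, preferred_site):
        # Read the local copy first; fall back to another replica on failure.
        order = [preferred_site] + [n for n in self.replicas if n != preferred_site]
        for name in order:
            if name not in self.down:
                return name, self.replicas[name][key]
        raise RuntimeError("no replica available")


store = ReplicatedStore(["A", "B", "C"])
store.write("x", 1)
served_by, value = store.read("x", preferred_site="B")  # served locally by B
store.down.add("B")                                      # simulate failure of site B
fallback_site, fallback_value = store.read("x", preferred_site="B")
```

While site B is up, the read is answered from the local copy without touching the network; after B fails, the same read is answered by another site that holds a copy.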
Fragmentation
Fragmentation is the task of dividing a table into a set of smaller tables. The subsets of the table are
called fragments. Fragmentation can be of three types: horizontal, vertical, and hybrid (a combination
of horizontal and vertical). Horizontal fragmentation can further be classified into two techniques:
primary horizontal fragmentation and derived horizontal fragmentation. Fragmentation should be
done in such a way that the original table can be reconstructed from the fragments whenever required.
This requirement is called “re-constructiveness.”
Advantages
1. Permits a number of transactions to be executed concurrently.
2. Results in parallel execution of a single query.
3. Increases the level of concurrency, also referred to as intra-query concurrency.
4. Increases system throughput.
5. Since data is stored close to the site of usage, the efficiency of the database system is increased.
6. Local query optimization techniques are sufficient for most queries since data is locally available.
7. Since irrelevant data is not available at the sites, the security and privacy of the database system can
be maintained.
Disadvantages
1. Applications whose views are defined on more than one fragment may suffer performance
degradation if applications have conflicting requirements.
2. Simple tasks, like checking for dependencies, would result in chasing after data in a number of sites.
3. When data from different fragments are required, the access times may be very high.
4. In the case of recursive fragmentations, the job of reconstruction will need expensive techniques.
5. Lack of backup copies of data in different sites may render the database ineffective in case of
failure of a site.
Vertical Fragmentation
In vertical fragmentation, the fields or columns of a table are grouped into fragments. In order to
maintain re-constructiveness, each fragment should contain the primary key field(s) of the table.
Vertical fragmentation can be used to enforce the privacy of data.
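A minimal sketch of this idea, using plain Python lists of dictionaries as stand-ins for tables. The table contents, column names, and helper functions (vertical_fragment, join_on_key) are hypothetical and chosen only for illustration. Each fragment keeps the primary key, so the original rows can be rebuilt by joining on it.

```python
def vertical_fragment(rows, key, columns):
    """Project every row onto the primary key plus the given columns."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


def join_on_key(left, right, key):
    """Rebuild full rows by joining two vertical fragments on the primary key."""
    right_by_key = {row[key]: row for row in right}
    return [{**l, **right_by_key[l[key]]} for l in left]


employees = [
    {"emp_id": 1, "name": "Asha", "dept": "HR", "salary": 50000},
    {"emp_id": 2, "name": "Ben", "dept": "IT", "salary": 65000},
]

# The sensitive column is kept in its own fragment, e.g. at a restricted site.
public = vertical_fragment(employees, "emp_id", ["name", "dept"])
private = vertical_fragment(employees, "emp_id", ["salary"])

reconstructed = join_on_key(public, private, "emp_id")
```

A site that stores only the public fragment never sees salary values, which is how vertical fragmentation supports privacy.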
Grouping
• Starts by assigning each attribute to one fragment.
• At each step, joins some of the fragments until some criteria are satisfied.
• Results in overlapping fragments.
Splitting
• Starts with the whole relation and decides on a beneficial partitioning based on the access
behaviour of applications to the attributes.
• Fits more naturally within the top-down design.
• Generates non-overlapping fragments.
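The splitting approach can be sketched with a deliberately simple heuristic: attributes that are accessed by exactly the same set of applications are placed in the same fragment. This is an illustrative assumption, not an established partitioning algorithm, and the function name split_by_access, the attribute names, and the application names are all hypothetical. The primary key is repeated in every fragment; apart from it, the fragments do not overlap.

```python
from collections import defaultdict


def split_by_access(attributes, key, usage):
    """Split attributes into fragments by access pattern.

    usage maps an application name to the set of attributes it accesses.
    Attributes with the same set of accessing applications share a fragment.
    """
    by_signature = defaultdict(list)
    for attr in attributes:
        if attr == key:
            continue  # the key is added to every fragment below
        signature = frozenset(app for app, used in usage.items() if attr in used)
        by_signature[signature].append(attr)
    return [[key] + attrs for attrs in by_signature.values()]


attributes = ["emp_id", "name", "dept", "salary", "bank_account"]
usage = {
    "directory": {"name", "dept"},
    "payroll": {"salary", "bank_account"},
}

fragments = split_by_access(attributes, "emp_id", usage)
```

With this access pattern, the attributes used by the directory application end up in one fragment and the payroll attributes in another.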
Horizontal Fragmentation
Horizontal fragmentation groups the tuples of a table in accordance with the values of one or more
fields. Horizontal fragmentation should also conform to the rule of re-constructiveness. Each
horizontal fragment must have all columns of the original base table.
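As a small sketch, the rows of a table can be split by the value of one field and the original table recovered as the union of the fragments. The table, the field names, and the helper horizontal_fragment are hypothetical; the example assumes that every row satisfies exactly one predicate.

```python
def horizontal_fragment(rows, predicates):
    """Split rows into named fragments; each predicate selects one fragment."""
    return {name: [r for r in rows if pred(r)] for name, pred in predicates.items()}


accounts = [
    {"acct_id": 1, "region": "north", "balance": 120},
    {"acct_id": 2, "region": "south", "balance": 75},
    {"acct_id": 3, "region": "north", "balance": 300},
]

predicates = {
    "north": lambda r: r["region"] == "north",
    "south": lambda r: r["region"] == "south",
}

fragments = horizontal_fragment(accounts, predicates)

# Reconstruction is the union of all fragments (sorted here only to compare).
union = [r for frag in fragments.values() for r in frag]
reconstructed = sorted(union, key=lambda r: r["acct_id"])
```

Every fragment keeps all columns of the base table, so no join is needed to rebuild it.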
Hybrid Fragmentation
In hybrid fragmentation, a combination of horizontal and vertical fragmentation techniques is used.
This is the most flexible fragmentation technique since it generates fragments with minimal
extraneous information. However, reconstruction of the original table is often an expensive task.
Hybrid fragmentation can be done in two alternative ways −
• First generate a set of horizontal fragments; then generate vertical fragments from one or more of
the horizontal fragments.
• First generate a set of vertical fragments; then generate horizontal fragments from one or more of
the vertical fragments.
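The first alternative (horizontal, then vertical) can be sketched as follows. The table, the field names, and the helpers select and project are hypothetical. Reconstruction reverses the steps: join the vertical fragments on the primary key, then take the union with the remaining horizontal fragment, which shows why reconstruction costs more here than with a single fragmentation type.

```python
def select(rows, pred):
    """Horizontal step: keep the rows that satisfy the predicate."""
    return [r for r in rows if pred(r)]


def project(rows, cols):
    """Vertical step: keep only the listed columns (must include the key)."""
    return [{c: r[c] for c in cols} for r in rows]


students = [
    {"id": 1, "name": "Ira", "campus": "east", "fees_due": 0},
    {"id": 2, "name": "Jo", "campus": "west", "fees_due": 120},
    {"id": 3, "name": "Kai", "campus": "east", "fees_due": 40},
]

# Step 1: horizontal fragments by campus.
east = select(students, lambda r: r["campus"] == "east")
west = select(students, lambda r: r["campus"] == "west")

# Step 2: vertical fragments of the "east" fragment only; both keep the key.
east_profile = project(east, ["id", "name", "campus"])
east_fees = project(east, ["id", "fees_due"])

# Reconstruction: join the vertical fragments on id, then union with "west".
fees_by_id = {r["id"]: r for r in east_fees}
east_rebuilt = [{**p, **fees_by_id[p["id"]]} for p in east_profile]
rebuilt = sorted(east_rebuilt + west, key=lambda r: r["id"])
```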