Lecture3-Distributed Introduction

Distributed DB

Principles of Distributed Database Systems
M. Tamer Özsu
Patrick Valduriez

© 2020, M.T. Özsu & P. Valduriez

Outline
◼ Introduction
◼ What is a distributed DBMS
◼ Centralized Database vs Distributed Database
◼ DDB Advantages/Disadvantages
◼ Distributed Computing
◼ Distributed and Parallel Database Design
◼ Parallelism types
◼ DDB types
◼ Classification of DDB
◼ Distributed DBMS promises


Current Distribution – Geographically Distributed Data Centers

What is a Distributed Database System?
◼ A collection of multiple logically related databases distributed over
a computer network. It may be:
- A database whose relations reside on different sites.
- A database some of whose relations are replicated at different sites.
- A database whose relations are split between different sites.

❑ A Distributed Database Management System (DDBMS) is the software
system that manages a distributed database and makes the distribution
transparent to the user.

Distributed database system (DDBS) = DDB + D-DBMS

Distributed DBMS
• It is the software system that permits the management of the distributed DB and
makes the distribution transparent to users.

• A DDBMS manages a single logical DB that is split into a number of fragments.

• Each fragment is stored on one or more computers under the control of a separate
DBMS, with the computers connected by a communication network.

• Each site is capable of independently processing user requests that require access to
local data, and is also capable of processing data stored on other computers in the
network.

• Users access the distributed database via applications. Applications are classified
as those that do not require data from other sites (local applications) and those that do
require data from other sites (global applications).

• A DDBMS must have at least one global application.
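
The local/global distinction above can be sketched in a few lines of Python. The catalog contents, the site and fragment names, and the `classify_application` helper are all hypothetical and serve only to illustrate the definition; a real DDBMS would consult its own distributed catalog.

```python
# Hypothetical catalog: which site stores each fragment (illustrative only).
CATALOG = {
    "accounts_cairo": "site_cairo",
    "accounts_alex": "site_alex",
    "branches": "site_cairo",
}


def classify_application(origin_site, fragments_needed):
    """Return 'local' if every needed fragment is stored at the originating
    site, otherwise 'global' (data from at least one other site is needed)."""
    sites = {CATALOG[name] for name in fragments_needed}
    return "local" if sites <= {origin_site} else "global"


local_case = classify_application("site_cairo", ["accounts_cairo", "branches"])
global_case = classify_application("site_cairo", ["accounts_cairo", "accounts_alex"])
print(local_case, global_case)
```

The second call needs a fragment held at another site, so it is the kind of application that makes the system a DDBMS rather than a set of unrelated local databases.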


DDBMS Characteristics
A DDBMS therefore has the following characteristics:

◼ A collection of logically related shared data.
◼ The data is split into several fragments.
◼ Fragments may be replicated.
◼ The sites are linked by a communications network.
◼ The data at each site is under the control of a DBMS.
◼ The DBMS at each site handles local applications
autonomously.
◼ Each DBMS participates in at least one global application.
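
A minimal sketch of fragmentation and replication, assuming a toy ACCOUNTS relation that is fragmented horizontally by branch. The data, the site names `s1` and `s2`, and the helper functions are made up for illustration and are not taken from any particular DDBMS.

```python
# Toy relation: (account_id, branch, balance). All values are made up.
ACCOUNTS = [
    (1, "cairo", 500),
    (2, "alex", 120),
    (3, "cairo", 75),
    (4, "alex", 900),
]


def fragment_by_branch(rows):
    """Horizontal fragmentation: one fragment per branch value."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[1], []).append(row)
    return fragments


def place(fragments, replicas):
    """Store each fragment at every site listed for it (replication)."""
    sites = {}
    for name, rows in fragments.items():
        for site in replicas[name]:
            sites.setdefault(site, {})[name] = list(rows)
    return sites


def read_fragment(sites, name, down=()):
    """Read a fragment from any site that holds a copy and is not down."""
    for site, stored in sites.items():
        if site not in down and name in stored:
            return stored[name]
    raise RuntimeError(f"no available copy of fragment {name!r}")


fragments = fragment_by_branch(ACCOUNTS)
# Assumed placement: the 'cairo' fragment is replicated at two sites.
sites = place(fragments, {"cairo": ["s1", "s2"], "alex": ["s2"]})

# The global relation is the union of its horizontal fragments; the replica
# at s2 keeps the 'cairo' fragment readable while s1 is down.
rebuilt = sorted(read_fragment(sites, "cairo", down={"s1"})
                 + read_fragment(sites, "alex"))
```

The 'alex' fragment has a single copy, so it becomes unreadable if `s2` fails, which is the availability argument for replicating fragments.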


Centralized Vs. Distributed Databases

In Centralized Databases
• Data is located in one place (one server)
• All DBMS functionalities are done by that server

In Distributed Databases
• Data is stored in multiple places (each is running a DBMS)
• DBMS functionalities are distributed over many machines

Centralized DBMS Environment


Distributed DBMS Environment

Why Might Data be Distributed?

• To minimize communication costs or response time.

• To maintain control and security.

• To increase its availability in the event of failure.

• Because the data is too large for a single site.

DDBMS- Advantages

• Reflects organizational structure

• Improved shareability and local autonomy
• Improved availability
• Improved reliability
• Improved performance
• Economics
• Modular growth


DDBMS- Disadvantages

• Complexity
• Cost
• Security
• Integrity control more difficult
• Lack of standards
• Lack of experience
• Database design more complex


DDBMS- Example (Banking System)


Distributed Computing

◼ A number of autonomous processing elements (not
necessarily homogeneous) that are interconnected by a
computer network and that cooperate in performing their
assigned tasks.
◼ What is being distributed?
❑ Processing logic
❑ Function
❑ Data
❑ Control


Distributed Computing

◼ A distributed computing system is a number of
interconnected autonomous processing elements (PEs).
◼ Their capabilities may differ; they may be heterogeneous.
◼ PEs do not have access to each other’s state, which they
can only learn by exchanging messages that incur a
communication cost.
◼ Therefore, when data is distributed, its management and
access in a logically integrated manner requires special
care from the distributed DBMS software.
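
The point about private state and message cost can be sketched as follows. The `Network` and `ProcessingElement` classes and the `"COUNT?"` message are hypothetical names introduced for this illustration; the message counter is only a stand-in for real communication cost.

```python
class Network:
    """Carries messages between PEs and counts them as a proxy for cost."""

    def __init__(self):
        self.messages = 0

    def request(self, receiver, message):
        self.messages += 2  # one request message plus one reply message
        return receiver.on_message(message)


class ProcessingElement:
    def __init__(self, name, local_rows):
        self.name = name
        self._local_rows = list(local_rows)  # private state of this PE

    def on_message(self, message):
        if message == "COUNT?":
            return len(self._local_rows)
        raise ValueError(f"unknown message: {message!r}")

    def global_count(self, peers, network):
        """Count rows across all PEs; remote counts require messages."""
        total = len(self._local_rows)  # local access, no communication
        for peer in peers:
            total += network.request(peer, "COUNT?")
        return total


net = Network()
pe1 = ProcessingElement("pe1", [1, 2, 3])
pe2 = ProcessingElement("pe2", [4, 5])
pe3 = ProcessingElement("pe3", [6])
total = pe1.global_count([pe2, pe3], net)
print(total, net.messages)
```

Even this trivial global count costs two messages per remote PE, which is why a distributed DBMS tries to keep as much work as possible local to each site.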


Parallel VS. Distributed Databases
In a Parallel Database System (to improve performance through
parallelization)

◼ A DBMS running across multiple processors and disks that is designed to
execute operations in parallel, whenever possible, in order to improve
performance.
◼ Distributed processing usually implies parallel processing (not distribution of
data).
◼ Parallel processing is possible on a single machine.

In a Distributed Database System (to increase availability)

◼ Data is physically stored across several sites, and each site is managed by a
DBMS capable of running independently of the other sites.
◼ In contrast to parallel databases, sharing data is the key feature of a DDB.

Different Architectures

Three possible architectures for passing and processing data:
a) Shared memory -- processors share a common memory
b) Shared disk -- processors share a common disk
c) Shared nothing -- processors share neither a common memory nor a common disk

Parallel VS. Distributed Databases
In Parallel Databases
• Machines are physically close to each other, e.g., in the same server room
• Machines are connected by dedicated high-speed LANs and switches
• Communication cost is assumed to be small
• Can use a shared-memory, shared-disk, or shared-nothing architecture

In Distributed Databases
• Machines can be far from each other, e.g., on different continents
• Machines can be connected over a public network, e.g., the Internet
• Communication cost and problems cannot be ignored
• Usually a shared-nothing architecture

Types of Parallelism
1. Inter-query parallelism
Queries/transactions execute in parallel with one another.
2. Intra-query parallelism
A single query is executed in parallel on multiple processors or disks
(for example, in a shared-nothing architecture) to improve the query’s
response time.
3. Intra-operation parallelism
◼ Execution of a single complex or large operation in parallel on multiple
processors.
◼ Multiple instances of an operator execute concurrently, with each
instance working on a subset of the data.
◼ Intra-operation parallelism is based primarily on partitioning the input
relation into non-overlapping data segments, followed by a final merge of
the results.
For example, the ORDER BY clause of a query over millions of
records can be parallelized across multiple processors.
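
The partition, sort, and merge steps of the ORDER BY example can be sketched as below. The round-robin partitioning and the function names are assumptions made for illustration. Threads are used only to keep the sketch self-contained; in CPython they will not actually speed up a CPU-bound sort, and a parallel DBMS would run each instance on its own processor.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n):
    """Split rows into n non-overlapping segments (round-robin)."""
    return [rows[i::n] for i in range(n)]


def parallel_order_by(rows, key, workers=4):
    """Sort each segment in its own worker, then merge the sorted runs."""
    segments = partition(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        runs = list(pool.map(lambda seg: sorted(seg, key=key), segments))
    # Final merge of the per-segment results into one ordered output.
    return list(heapq.merge(*runs, key=key))


# 7919 is coprime to 1000, so this is a shuffled permutation of 0..999.
rows = [(i * 7919) % 1000 for i in range(1000)]
ordered = parallel_order_by(rows, key=lambda r: r)
```

Each worker sees only its own segment, which is what makes the sort instances independent; the only coordination is the final merge.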
4. Inter-operation parallelism
4.1 Pipelined parallelism
Different operations execute in pipelined fashion. For example, to join
three tables, one processor may join two of them and send the result
records, as they are produced, to another processor. That processor joins
the incoming records with the third table and produces the final result.
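
The data flow of the three-table pipeline can be sketched with Python generators. The tables, column names, and the `hash_join` helper are made up for illustration. Both stages run in one thread here, so this shows only how records stream from the first join into the second; in a parallel DBMS each stage would run on its own processor.

```python
def hash_join(left, right, left_key, right_key):
    """Yield joined rows one at a time; `left` is consumed lazily."""
    index = {}
    for rrow in right:  # build phase: index the right-hand table
        index.setdefault(rrow[right_key], []).append(rrow)
    for lrow in left:  # probe phase: stream the left-hand input
        for rrow in index.get(lrow[left_key], []):
            yield {**lrow, **rrow}


customers = [{"cid": 1, "name": "Ann"}, {"cid": 2, "name": "Bob"}]
orders = [
    {"oid": 10, "cid": 1, "pid": 100},
    {"oid": 11, "cid": 2, "pid": 101},
    {"oid": 12, "cid": 1, "pid": 101},
]
products = [{"pid": 100, "title": "disk"}, {"pid": 101, "title": "cable"}]

# Stage 1 joins orders with customers; stage 2 consumes stage 1's rows
# as they are produced and joins them with the third table.
stage1 = hash_join(orders, customers, "cid", "cid")
stage2 = hash_join(stage1, products, "pid", "pid")
result = list(stage2)
```

No intermediate result is materialized between the two joins, which is the property a pipelined plan exploits.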

4.2 Independent parallelism
Operations execute on different processors at the same time, which is
possible only if they are independent of each other.
For example, to join four tables, two can be joined at one processor and
the other two at another processor. The final join of the two intermediate
results is done afterwards.
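
A minimal sketch of the four-table example, assuming four toy tables that share a key `k`. The tables and the `join` helper are hypothetical, and the two worker threads stand in for two processors.

```python
from concurrent.futures import ThreadPoolExecutor


def join(left, right, key):
    """Simple hash join of two lists of dict rows on a shared key."""
    index = {}
    for rrow in right:
        index.setdefault(rrow[key], []).append(rrow)
    return [{**lrow, **rrow} for lrow in left for rrow in index.get(lrow[key], [])]


# Four toy tables sharing the key "k" (made-up data).
a = [{"k": 1, "a": "a1"}, {"k": 2, "a": "a2"}]
b = [{"k": 1, "b": "b1"}, {"k": 2, "b": "b2"}]
c = [{"k": 1, "c": "c1"}]
d = [{"k": 1, "d": "d1"}, {"k": 3, "d": "d3"}]

# The two joins do not depend on each other, so they can run concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    ab_future = pool.submit(join, a, b, "k")
    cd_future = pool.submit(join, c, d, "k")
    ab, cd = ab_future.result(), cd_future.result()

# The final join has to wait for both intermediate results.
result = join(ab, cd, "k")
```

Unlike the pipelined case, neither of the first two joins consumes the other's output, so they need no coordination until the final step.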

Types of Distributed Database Systems

◼ Three main factors are used to differentiate between
different types of DDBMSs.

Homogeneous Distributed Databases

There are two types of homogeneous distributed database:

• Autonomous − Each database is independent and functions on its
own. The nodes operate independently, are integrated by a controlling
application, and use message passing to share data updates.

• Non-autonomous − Data is distributed across the homogeneous
nodes, and a central or master DBMS coordinates data updates
across the sites.

Homogeneous Distributed Databases: Autonomy
◼ Design autonomy:
❑ Individual DBMSs are free to use the data models and
transaction management techniques that they prefer.
◼ Communication autonomy:
❑ Each individual DBMS is free to make its own decision on
providing other DBMSs with information.
◼ Execution autonomy:
❑ Each DBMS can execute the transactions that are
submitted to it in any way that it wants to.

Heterogeneous Distributed Databases
In a heterogeneous distributed database, different sites have different
operating systems, DBMS products, and data models. Its properties
are:
• Different sites use dissimilar schemas and software.
• The system may be composed of a variety of DBMSs, such as relational,
network, hierarchical, or object-oriented.
• Query processing is complex due to dissimilar schemas.
• Transaction processing is complex due to dissimilar software.
• A site may not be aware of other sites, so there is limited
cooperation in processing user requests.

Heterogeneous Distributed Databases

◼ Many database applications require data from a
variety of preexisting databases located in a
heterogeneous collection of hardware and software
platforms.

[Figure: five sites (Site 1 to Site 5) running relational, network,
hierarchical, and object-oriented DBMSs on Unix, Linux, and Windows
platforms, connected by a communications network]

Heterogeneous Distributed Databases

◼ Federated: Each site may run a different database system, but
data access is managed through a single conceptual schema.
❑ This implies that the degree of local autonomy is minimal. Each site must
adhere to a centralized access policy. There may be a global schema.

◼ Multi-database: There is no single conceptual global schema.
For data access, a schema is constructed dynamically as
needed by the application software.
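
One way to picture the difference is as a question of when the schema mapping is defined. In the sketch below, the federated case uses mappings fixed in advance against one global schema, while the multi-database case lets the application supply a mapping per request. The site data, attribute names, and helper functions are all hypothetical and greatly simplified.

```python
# Two local databases with different local schemas (made-up data).
SITE_A = [{"emp_no": 1, "full_name": "Ann"}]
SITE_B = [{"id": 7, "name": "Bob"}]

# Federated: mappings from one global schema to each local schema,
# fixed in advance and shared by every application.
FEDERATED_MAPPINGS = {
    "A": {"employee_id": "emp_no", "employee_name": "full_name"},
    "B": {"employee_id": "id", "employee_name": "name"},
}


def translate(rows, mapping):
    """Rename local attributes to the attribute names of the target schema."""
    return [{target: row[local] for target, local in mapping.items()}
            for row in rows]


def federated_query():
    """Every application sees the same global schema."""
    return (translate(SITE_A, FEDERATED_MAPPINGS["A"])
            + translate(SITE_B, FEDERATED_MAPPINGS["B"]))


def multidatabase_query(mappings):
    """The application builds the schema it needs for this request."""
    return translate(SITE_A, mappings["A"]) + translate(SITE_B, mappings["B"])


global_view = federated_query()
ad_hoc_view = multidatabase_query({"A": {"who": "full_name"}, "B": {"who": "name"}})
```

In the federated call the application has no say over attribute names or coverage, whereas the multi-database call carries its own view of the sources.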

Distributed DBMS Architectures

DDBMS architectures are generally developed depending on three
parameters:

• Distribution − It states the physical distribution of data across the
different sites.

• Autonomy − It indicates the distribution of control of the database
system and the degree to which each constituent DBMS can
operate independently.

• Heterogeneity − It refers to the uniformity or dissimilarity of the
data models, system components, and databases.

Classification of DDBMS

[Figure: classification of DDBMSs along three axes (distribution,
autonomy, heterogeneity), locating client/server systems, peer-to-peer
distributed DBSs, federated DBSs, multi-DBSs, and distributed multi-DBSs]

History – Early Distribution
Peer-to-Peer (P2P)

History – Client/Server

Distributed DBMS Promises

◼ Transparent management of distributed, fragmented,
and replicated data
◼ Improved reliability/availability through distributed
transactions
◼ Improved performance
◼ Easier and more economical system expansion

Scalability

◼ The issue is database scaling and workload scaling
◼ Adding processing and storage power:
❑ Scale-out: add more servers
❑ Scale-up: increase the capacity of one server → has limits
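
A rough sketch of why scale-out helps, assuming keys are spread over servers by hashing. The server names and the `server_for` and `load` helpers are hypothetical, and the sketch ignores the cost of moving existing data when servers are added.

```python
import hashlib


def server_for(key, servers):
    """Pick a server for a key by hashing it (simple modulo placement)."""
    digest = hashlib.sha256(str(key).encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def load(keys, servers):
    """Count how many keys each server would store."""
    counts = {server: 0 for server in servers}
    for key in keys:
        counts[server_for(key, servers)] += 1
    return counts


keys = range(10_000)
two_servers = load(keys, ["s1", "s2"])
four_servers = load(keys, ["s1", "s2", "s3", "s4"])
print(two_servers)
print(four_servers)
```

Adding servers lowers the share of data each one holds, whereas scale-up leaves all 10,000 keys on one machine whose capacity eventually runs out.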

