Lecture-13-14 Parallel and Distributed Systems Programming Models-Jameel
• Bit-level parallelism
• The number of bits processed per clock cycle (the word size)
• Word sizes have increased from 4 bits to 8, 16, 32, and 64 bits
• Instruction-level parallelism
• Computers now use multi-stage processing pipelines to speed up execution
• Data parallelism or loop parallelism
• The iterations of a program loop can be processed in parallel (see the sketch after this list)
• Task parallelism
• The problem can be decomposed into tasks that can be carried out
concurrently, e.g., SPMD; note that data dependencies cause
different flows of control in individual tasks
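A minimal C sketch of the difference, assuming OpenMP is available (an illustration, not code from the lecture; f and g are hypothetical placeholders for independent tasks):

#include <omp.h>
#define N 8
void f(void) { /* one independent task */ }
void g(void) { /* another independent task */ }

int main(void) {
    int x[N];
    /* Data (loop) parallelism: the same operation applied to different iterations */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        x[i] = i * i;

    /* Task parallelism: different pieces of work carried out concurrently */
    #pragma omp parallel sections
    {
        #pragma omp section
        f();
        #pragma omp section
        g();
    }
    return 0;
}

Compile with OpenMP enabled, e.g. cc -fopenmp.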
Parallel Computer Architecture
• Flynn’s taxonomy of computer architectures
• Based on the number of concurrent instruction (control) streams
and data streams
• SISD (Single Instruction Single Data)
• Scalar architecture with one processor/core
• SIMD (Single Instruction, Multiple Data)
• Supports vector processing
• Operations on individual vector components are
carried out concurrently (see the sketch below)
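As an illustration (a sketch, not from the lecture), a loop that applies the same operation to every element maps naturally onto SIMD hardware; with auto-vectorization enabled (e.g., -O2/-O3 in common C compilers), several elements are processed per instruction:

void vec_add(float *a, const float *b, const float *c, int n) {
    for (int i = 0; i < n; i++)
        a[i] = b[i] + c[i];   /* one instruction stream, many data elements */
}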
Parallel Computer Architecture
• MIMD (Multiple Instructions, Multiple Data)
• Several processors/cores function
asynchronously and independently
• At any time, different processors/cores may be
executing different instructions on different data
• Several types of systems:
• Uniform Memory Access (UMA)
• Cache Only Memory Access (COMA)
• Non-Uniform Memory Access (NUMA)
Distributed Systems
• A distributed system is a collection of:
• Autonomous computers
• Connected through a network
Characteristics of Distributed Systems
• Users perceive the system as a single, integrated computing facility
• Components are autonomous
• Scheduling, resource management and security policies are
implemented by each system
• There are multiple:
• Points of control
• Points of failure
• Resources may not be accessible at all times
• Such distributed systems can be scaled via additional resources
• They can be designed to maintain availability even at low levels of
hardware/software/network reliability
Desirable Properties of a Distributed System
• Access Transparency
• Local and remote resources are accessed using identical operations
• Location Transparency
• Information objects are accessed without knowing their location
• Concurrency Transparency
• Several processes run concurrently using shared information objects without
interference among them
• Replication Transparency
• Multiple instances of information objects increase reliability without the
knowledge of users or applications
Desirable Properties of a Distributed System
• Failure Transparency
• Concealment of failures
• Migration Transparency
• Information objects in the system are moved without affecting the operation
performed on them
• Performance Transparency
• The system can be reconfigured based on the load and quality of service
(QoS) requirements
• Scaling Transparency
• The system and applications can scale without changing the system structure
and without affecting the applications
Processes, Threads and Events
• Dispatchable units of work:
• Process – a program in execution
• Thread – a lightweight process
• State of a process/thread:
• Information required to restart a suspended process/thread, e.g. program
counter and the current values of the registers
• Event
• A change of state of a process, e.g., local or communication events
Amdahl’s Law
We parallelize our programs in order to run them faster. Amdahl’s Law
bounds the speedup that can be obtained when only a fraction of a
program can be parallelized.
Amdahl’s Law: An Example
Suppose that 80% of you program can be parallelized and that you
use 4 processors to run your parallel version of the program
Although you use 4 processors you cannot get a speedup more than
2.5 times (or 40% of the serial running time)
5
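This bound follows from the standard statement of Amdahl’s Law, where p is the fraction of the program that can be parallelized and N is the number of processors:

Speedup S(N) = 1 / ((1 - p) + p / N)

For p = 0.8 and N = 4: S(4) = 1 / (0.2 + 0.8/4) = 1 / 0.4 = 2.5. Even with infinitely many processors the speedup cannot exceed 1 / (1 - p) = 5.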
Real Vs. Actual Cases
Amdahl’s argument is too simplified to be applied to real cases
Figure: a serial run takes 20 + 80 time units, where the 20-unit portion
cannot be parallelized and the 80-unit portion can. In the parallel
version, the 80-unit portion is divided into 20-unit pieces executed
concurrently by separate processes (Process 1–3 shown), while the
20-unit serial portion remains on the critical path.
Guidelines
In order to benefit efficiently from parallelization, we ought to
follow a number of guidelines.
Parallel Computer Architectures
Parallel computer architectures fall into two broad classes: multi-chip
multiprocessors and single-chip multiprocessors.
Multi-Chip Multiprocessors
We can categorize the architecture of multi-chip multiprocessor
computers in terms of two aspects: the address space (shared or
individual per processor) and the organization of the memory.
Symmetric Multiprocessors
A system with a Symmetric Multiprocessor (SMP) architecture uses a
shared memory that can be accessed equally from all processors; the
processors also share the I/O subsystem.
Massively Parallel Processors
A system with a Massively Parallel Processors (MPP) architecture
consists of nodes, each with its own processor, memory and I/O
subsystem; the nodes communicate through an interconnection network.
Distributed Shared Memory
A Distributed Shared Memory (DSM) system is typically built on a
hardware model similar to MPP, but presents the physically distributed
memories to the programmer as a single shared address space.
Parallel Computer Architectures
Having looked at multi-chip multiprocessors, we now turn to single-chip
multiprocessors.
Chip Multiprocessors
Integrating multiple processor cores on one die yields a single-chip
multiprocessor, referred to as a Chip Multiprocessor (CMP).
Models of Parallel Programming
What is a parallel programming model? It is an abstraction of the
machine that defines how parallel tasks are expressed, how they share
data, and how they coordinate, independently of the underlying hardware.
Traditional Parallel Programming Models
Two traditional parallel programming models are covered here: the
shared memory model and the message passing model.
Shared Memory Model
In the shared memory programming model, the abstraction is that
parallel tasks can access any location of the memory
Shared Memory Model
Figure: Si denotes a serial part and Pj a parallel part. In the
single-threaded version, S1, P1–P4 and S2 execute one after another in
a single process. In the multi-threaded version, the process spawns
threads that execute P1–P4 concurrently in a shared address space and
then join before S2.
Shared Memory Example
begin parallel // spawn a child thread
private int start_iter, end_iter, i;
shared int local_iter = 4;
shared double sum = 0.0, a[], b[], c[];
shared lock_type mylock;
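The fragment above declares per-thread (private) loop bounds and shared data protected by a lock; the loop body itself is not reproduced on the slide. For comparison, a self-contained pthreads sketch of the same idea (an assumption, not the lecture's code): each thread adds its slice of b and c into a and accumulates into the shared sum under a mutex playing the role of mylock.

#include <pthread.h>
#define N 8
#define NTHREADS 2

double a[N], b[N], c[N], sum = 0.0;            /* shared data */
pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    long id = (long)arg;                        /* private, per-thread state */
    int local_iter = N / NTHREADS;
    int start_iter = id * local_iter, end_iter = start_iter + local_iter;
    double local_sum = 0.0;
    for (int i = start_iter; i < end_iter; i++) {
        a[i] = b[i] + c[i];
        local_sum += a[i];
    }
    pthread_mutex_lock(&mylock);                /* protect the shared sum */
    sum += local_sum;
    pthread_mutex_unlock(&mylock);
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (int i = 0; i < N; i++) { b[i] = i; c[i] = 2 * i; }
    for (long id = 0; id < NTHREADS; id++)
        pthread_create(&t[id], NULL, worker, (void *)id);
    for (int id = 0; id < NTHREADS; id++)
        pthread_join(t[id], NULL);
    return 0;
}

Compile with cc -pthread.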
Traditional Parallel Programming Models
The second traditional model is the message passing model.
Message Passing Model
In message passing, parallel tasks have their own local memories and
exchange data explicitly by sending and receiving messages
Programs written with the Message Passing Interface (MPI) are a natural
fit for the message passing programming model
Message Passing Model
Figure: S denotes a serial part and P a parallel part. The
single-threaded version runs S1, P1–P4, S2 in one process. In the
message passing version, four processes (0–3), each on its own node
(Node 1–4), run S1, one parallel part Pi, and S2, coordinating by
exchanging messages.
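A minimal MPI sketch in C of this model (an illustration using standard MPI calls, not the lecture's code): every process runs the same serial parts in its own local memory, and rank 0 sends a value that rank 1 receives as an explicit message.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);                 /* each process has its own local memory */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* explicit message */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}

Run with, e.g., mpirun -np 2 ./a.out, so that every process executes the same binary.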
SPMD and MPMD
When we run multiple processes with message passing, the processes can
be further categorized by how many distinct programs cooperate in the
parallel execution.
SPMD
In the SPMD (Single Program, Multiple Data) model there is only one
program: each process runs the same executable (e.g., a.out) but works
on a different set of data.
MPMD
The MPMD (Multiple Program, Multiple Data) model uses different
programs for different processes, but the processes collaborate to
solve the same problem. MPMD has two common styles: master/worker and
coupled analysis.
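As an illustration (the exact syntax varies by MPI implementation), an MPMD job can be launched by pairing different executables with different process counts, e.g. mpiexec -n 1 ./master : -n 4 ./worker, so that one master process and four worker processes cooperate on the same problem.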
8. Distributed systems
◼ Collection of autonomous computers, connected through a network and
operating under the control of a distribution software.
◼ Middleware → software enabling individual systems to coordinate
their activities and to share system resources.
◼ Main characteristics of distributed systems:
The users perceive the system as a single, integrated computing facility.
The components are autonomous.
Scheduling and other resource management and security policies are
implemented by each system.
There are multiple points of control and multiple points of failure.
The resources may not be accessible at all times.
Can be scaled by adding additional resources.
Can be designed to maintain availability even at low levels of
hardware/software/network reliability.
Messages and Communication Channels
• A message is a structured unit of information
• A communication channel provides the means for processes or
threads to:
• Communicate with one another
• Coordinate their actions by exchanging messages
• Communication is done using send(m) and receive(m) system calls, where m is
a message
Messages and Communication Channels
• State of a communication channel
• Given two processes 𝑝𝑖 and 𝑝𝑗 , the state of the channel 𝜉𝑖,𝑗 from 𝑝𝑖 to 𝑝𝑗
consists of messages sent by 𝑝𝑖 but not yet received by 𝑝𝑗
• Protocol
• A finite set of messages exchanged among processes to help them coordinate
their actions
Process Coordination – Communication Protocols
• A major challenge is to guarantee that 2 processes will reach an
agreement in case of channel failures
• Communication protocols ensure process coordination by
implementing:
• Error Control mechanisms
• Using error detection and error correction codes
• Flow Control
• Provides feedback from the receiver and forces the sender to transmit only the amount of
data the receiver can handle
• Congestion Control
• Ensures that the offered load of the network does not exceed the network capacity
Process Coordination – Time and time intervals
• Process Coordination requires:
• A global concept of time shared by cooperating entities
• The measurement of time intervals, the time elapsed between 2 events
• Two events in the global history may be unrelated
• Neither one is the cause of the other
• Such events are said to be concurrent events
• Local timers provide relative time measurements
• An isolated system can be characterized by its history, i.e., a sequence of
events
Process Coordination – Time and time intervals
• Global agreement on time is necessary to trigger actions that should
occur concurrently
• Timestamps are often used for event ordering
• Using a global time base constructed on local virtual clocks
Causality Example: Event Ordering
Logical Clocks
• Logical Clock (LC)
• An abstraction necessary to ensure the clock condition in the absence of a
global clock
• A process maps events to positive integers
• LC(e) is the local variable associated with event e.
• Each process time-stamps the message m it sends with the value of
the logical clock at the time of sending: TS(m) = LC(send(m))
• The clock is updated as follows: for a local event or a send event,
LC is incremented by one; when a message m with timestamp TS(m) is
received, LC is set to max(LC, TS(m)) + 1 (a minimal code sketch of
these rules appears after the figure below)
Figure (Logical Clocks): three processes p1, p2 and p3 exchange
messages m1–m5; the numbers along each process line are the logical
clock values of its events (p1: 1, 2, 3, 4, 5, 12; p2: 1, 2, 6, 7, 8, 9;
p3: 1, 2, 3, 10, 11), showing how receiving a message advances the
local clock past the sender's timestamp.
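A minimal C sketch of the logical clock rules above (an illustration, not the lecture's code); lc is the process-local clock and ts is the timestamp carried by an incoming message m:

typedef struct { int value; } logical_clock;

int lc_local_or_send(logical_clock *lc) {
    return ++lc->value;                   /* LC = LC + 1; the result timestamps a sent message */
}

int lc_receive(logical_clock *lc, int ts) {
    lc->value = (lc->value > ts ? lc->value : ts) + 1;   /* LC = max(LC, TS(m)) + 1 */
    return lc->value;
}

This is consistent with the figure, e.g. p2's clock jumping from 2 to 6 after receiving a message stamped 5.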
Message Delivery Rules; Causal Delivery
• A real-life network might reorder messages.
• First-In-First-Out (FIFO) delivery
• Messages are delivered in the same order they are sent.
• Causal delivery
• An extension of FIFO delivery
• Used when a process receives messages from different sources.
• A communication channel typically does not guarantee FIFO delivery
• However, FIFO delivery can be enforced by attaching a sequence number to each message sent (see the sketch after this list)
• The sequence numbers are also used to reassemble messages out of individual packets.
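A toy C sketch of FIFO delivery enforced with sequence numbers (an assumption for illustration; deliver() is a stub standing in for handing the message to the application): messages that arrive out of order are buffered until the next expected sequence number can be delivered.

#define MAX_PENDING 64

typedef struct { int seq; int payload; int valid; } message;

static message pending[MAX_PENDING];   /* received but not yet delivered */
static int next_expected = 0;          /* next sequence number to deliver */

static void deliver(int payload) { (void)payload; /* hand to application (stub) */ }

void on_receive(message m) {
    pending[m.seq % MAX_PENDING] = m;
    pending[m.seq % MAX_PENDING].valid = 1;
    /* deliver every message that is now in order */
    while (pending[next_expected % MAX_PENDING].valid &&
           pending[next_expected % MAX_PENDING].seq == next_expected) {
        deliver(pending[next_expected % MAX_PENDING].payload);
        pending[next_expected % MAX_PENDING].valid = 0;
        next_expected++;
    }
}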
Concurrency
• Required by system and application software:
• Reactive systems respond to external events
• e.g., operating system kernel, embedded systems.
• Improve performance
• Parallel applications partition workload & distribute it to multiple threads running
concurrently.
• Support variable load & shorten the response time of distributed applications, like
• Transaction management systems
• Client-server applications
Consensus Protocols
• Consensus
• Process of agreeing on one of several alternatives proposed by a number of
agents.
• Consensus Service
• Set of n processes
• Clients send requests, propose a value and wait for a response
• Goal is to get the set of processes to reach consensus on a single proposed
value.
Consensus Protocols
• Consensus protocol assumptions:
• Processes run on processors and communicate through a network
• processors and the network may experience failures (but not Byzantine, i.e., arbitrary or malicious, failures).
• Processors:
• Operate at arbitrary speeds
• Have stable storage and may rejoin the protocol after a failure
• Send messages to one another.
• Network:
• May lose, reorder, or duplicate messages
• Messages are sent asynchronously
• Messages may take an arbitrarily long time to reach the destination.
Client-Server Paradigm
• This paradigm is based on enforced modularity
• Modules are forced to interact only by sending and receiving messages (see the sketch after this list).
• A more robust design
• Clients and servers are independent modules and may fail separately.
• Servers are stateless
• May fail and then come up without the clients being affected or even noticing
the failure of the server.
• An attack is less likely
• Difficult for an intruder to guess the:
• Format of the messages
• Sequence numbers of the segments, when messages are transported by TCP
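A minimal POSIX-sockets sketch of the client side (an assumption for illustration; the port 8080 and the request text are hypothetical): the client and the server are independent modules that interact only through the messages written to and read from the connection.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* client module */
    struct sockaddr_in srv = {0};
    srv.sin_family = AF_INET;
    srv.sin_port = htons(8080);                 /* hypothetical server port */
    inet_pton(AF_INET, "127.0.0.1", &srv.sin_addr);
    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) < 0)
        return 1;                               /* the server may have failed independently */

    const char *request = "GET /status\n";      /* hypothetical message format */
    write(fd, request, strlen(request));        /* send(m) */

    char response[256];
    ssize_t n = read(fd, response, sizeof response - 1);   /* receive(m) */
    if (n > 0) { response[n] = '\0'; printf("%s", response); }
    close(fd);
    return 0;
}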
WWW
• 3-way handshake
• First 3 messages exchanged between the client and the server
• Once a TCP connection is established, the HTTP server takes some time to
construct the page that answers the first request
• To satisfy the second request, the HTTP server must retrieve an image
from the disk
• Response time includes (a rough estimate follows this list):
• Round Trip Time (RTT)
• Server residence time
• Data transmission time
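Under these assumptions, a rough estimate for the first request over a new connection (a simplification, not a precise model) is:

response time ≈ 2 × RTT + server residence time + data transmission time

one RTT for the three-way handshake and one for the request/response exchange, plus the time the server spends building the page and the time to push the bytes onto the network.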
Figure: timeline between the browser and the Web server. The client's
SYN, the server's SYN, and the client's ACK carrying the first HTTP
request make up the three-way handshake (one RTT); the server's
response follows.
HTTP Communication
• A Web client (browser) can:
• Send an HTTP request directly to the Web server, which listens on TCP port 80, and
receive the response from it
• Send the request through a proxy, which forwards it to the Web server on TCP port 80
and relays the response back to the browser