Distributed Systems and Cloud Computing notes
UNIT-I
Introduction to Distributed Systems
Introduction
A distributed system is a collection of independent computers that appear to the users of the system
as a single coherent system. These systems are characterized by the following features:
Concurrency: Multiple processes run simultaneously, potentially interacting with each other.
No global clock: There's no single time source; processes need to coordinate without a shared
clock.
Independent failures: Components may fail independently, and the system should handle such
failures gracefully.
Scalability: Ability to expand and manage increased demand efficiently.
Heterogeneity: Comprising different hardware, operating systems, and network technologies.
Types of Distributed Systems
1. Client-Server Systems
Architecture: Clients request services, and servers provide them.
Examples: Web servers, email servers.
Characteristics:
Centralized control (server).
Clients and servers can be on different networks.
Servers can handle multiple clients simultaneously.
2. Peer-to-Peer (P2P) Systems
Architecture: All nodes are both clients and servers.
Examples: File-sharing networks like BitTorrent.
Characteristics:
Decentralized control.
Scalability by adding more peers.
Resource sharing among equals.
3. Grid Computing
Architecture: Combines resources from multiple domains to reach a common goal.
Examples: SETI@home, Worldwide LHC Computing Grid (WLCG).
Characteristics:
High computational power.
Resource sharing across administrative domains.
Heterogeneous systems collaboration.
4. Cloud Computing
Architecture: Provides on-demand computing resources over the internet.
Examples: Amazon Web Services (AWS), Microsoft Azure.
Printed using Save ChatGPT as PDF, powered by PDFCrowd HTML to PDF API. 1/11
Characteristics:
Elastic scalability.
Pay-as-you-go model.
Services delivered over the internet.
System Models
Introduction
System models provide a conceptual framework to understand the structure and behavior of
distributed systems. They help in analyzing system performance, scalability, and reliability.
Architectural Models
1. Client-Server Model
Clients initiate requests, and servers respond.
Supports multiple clients and servers.
Example: Web architecture.
2. Peer-to-Peer Model
All nodes have equal roles.
Nodes can act as both clients and servers.
Example: P2P file sharing.
3. Layered Model
System is divided into layers, each providing services to the layer above.
Enhances modularity.
Example: OSI model in networking.
4. Object-Based Model
System is modeled as a collection of interacting objects.
Promotes encapsulation and reusability.
Example: Distributed Object Systems like CORBA.
Fundamental Models
1. Interaction Model
Latency: Delay between sending and receiving messages.
Bandwidth: Data transfer rate.
Jitter: Variation in message delay.
2. Failure Model
Crash Failures: Process halts and remains halted.
Omission Failures: A process or channel fails to send or deliver messages, so messages are lost.
Timing Failures: Responses occur outside the specified time interval.
Byzantine Failures: Arbitrary or malicious failures.
3. Security Model
Threats: Unauthorized access, data interception.
Defense Mechanisms: Encryption, authentication protocols.
Policies: Access control, audit trails.
Message Passing
Synchronous Communication
Sender waits until the message is received.
Ensures message delivery before proceeding.
Asynchronous Communication
Sender proceeds without waiting.
Messages may be buffered.
Primitives
Send(message, destination)
Receive(message, source)
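The difference between the two styles can be sketched with standard Java queues. This is an illustration only, assuming an unbounded LinkedBlockingQueue stands in for a buffered asynchronous channel and a SynchronousQueue for a blocking rendezvous; the class and method names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Hypothetical demo class; the names are illustrative, not from the notes.
class MessagePassingDemo {

    // Asynchronous send: each message is buffered and the sender continues at once.
    static int asyncSend(BlockingQueue<String> mailbox, String... messages) {
        for (String m : messages) {
            mailbox.offer(m);        // never waits for a receiver on an unbounded queue
        }
        return mailbox.size();       // messages still sitting in the buffer
    }

    // Synchronous send: put() on a SynchronousQueue blocks until a receiver takes the message.
    static String syncExchange(String message) {
        SynchronousQueue<String> channel = new SynchronousQueue<>();
        String[] received = new String[1];
        Thread receiver = new Thread(() -> {
            try {
                received[0] = channel.take();   // Receive(message, source)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();
        try {
            channel.put(message);    // Send(message, destination): returns only after the hand-off
            receiver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sending", e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
        System.out.println("buffered: " + asyncSend(mailbox, "a", "b", "c"));
        System.out.println("delivered: " + syncExchange("hello"));
    }
}
```

In the asynchronous case nothing has been received when the sender returns; in the synchronous case the sender cannot return until the receiver has the message.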
Shared Memory
Distributed Objects and Remote Method Invocation (RMI)
Distributed Objects: Objects that exist in different address spaces but can communicate.
Remote Method Invocation (RMI): Allows a program to call methods on an object residing in another address space.
Stub: Client-side proxy representing the remote object.
Skeleton: Server-side entity that dispatches calls to the actual object.
Process
1. Client invokes a method on the stub.
2. Stub marshals the parameters and sends a request.
3. Skeleton receives, unmarshals, and invokes the method.
4. Skeleton marshals the result and sends it back; the stub unmarshals it and returns it to the client.
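```java
// MyService.java: remote interface shared by client and server
import java.rmi.Remote;
import java.rmi.RemoteException;

public interface MyService extends Remote {
    // Every remote method must declare RemoteException
    String sayHello(String name) throws RemoteException;
}

// MyServiceImpl.java: server-side implementation
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class MyServiceImpl extends UnicastRemoteObject implements MyService {
    // The superclass constructor exports the object, so it may throw RemoteException
    public MyServiceImpl() throws RemoteException {}

    @Override
    public String sayHello(String name) {
        return "Hello, " + name;
    }
}
```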
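To make the four-step stub/skeleton process concrete, here is a hand-rolled sketch that does the marshalling by hand. It is not how Java RMI is implemented internally: the Greeter interface, the Stub and Skeleton classes, and the byte format are all hypothetical, and a direct method call stands in for the network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hand-rolled stand-ins for the RMI roles; all names here are hypothetical.
class StubSkeletonDemo {

    interface Greeter { String sayHello(String name); }

    // Server side: unmarshals the request, invokes the real object, marshals the reply.
    static class Skeleton {
        private final Greeter target;
        Skeleton(Greeter target) { this.target = target; }

        byte[] dispatch(byte[] request) {
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(request))) {
                String method = in.readUTF();
                String arg = in.readUTF();
                if (!method.equals("sayHello")) {
                    throw new IllegalArgumentException("unknown method: " + method);
                }
                return marshal(target.sayHello(arg));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Client side: offers the same interface as the real object but forwards calls as bytes.
    static class Stub implements Greeter {
        private final Skeleton transport;   // stands in for the network connection
        Stub(Skeleton transport) { this.transport = transport; }

        @Override
        public String sayHello(String name) {
            byte[] reply = transport.dispatch(marshal("sayHello", name));
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(reply))) {
                return in.readUTF();        // unmarshal the result
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Marshalling: flatten the values into a byte sequence.
    static byte[] marshal(String... values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String v : values) out.writeUTF(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Greeter remote = new Stub(new Skeleton(name -> "Hello, " + name));
        System.out.println(remote.sayHello("Ada"));
    }
}
```

The client only ever sees the Greeter interface, which is the point of the stub: the call looks local even though the arguments and result cross the boundary as bytes.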
UNIT-II
Synchronization
Introduction
In distributed systems, maintaining a consistent notion of time and state across multiple nodes is
challenging due to the lack of a global clock and varying message delays.
Logical Clocks
Vector Clocks:
Each process maintains a vector timestamp.
Rules:
1. Increment own counter on local event.
2. On sending a message, increment own counter (sending is itself an event) and attach the vector to the message.
3. On receiving, update vector to element-wise maximum and increment own counter.
Benefits: Captures causality more precisely than Lamport timestamps.
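The three rules can be sketched in a few lines of Java. This is a minimal illustration, assuming a fixed number of processes identified by index and treating a send as a local event that increments the sender's counter; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Minimal vector clock sketch; class and method names are illustrative.
class VectorClock {
    private final int[] v;
    private final int self;   // index of the owning process

    VectorClock(int processes, int self) {
        this.v = new int[processes];
        this.self = self;
    }

    // Rule 1: a local event increments the process's own counter.
    void tick() { v[self]++; }

    // Rule 2: sending counts as an event; the returned copy travels with the message.
    int[] send() {
        tick();
        return v.clone();
    }

    // Rule 3: take the element-wise maximum, then count the receive event.
    void receive(int[] stamp) {
        for (int i = 0; i < v.length; i++) v[i] = Math.max(v[i], stamp[i]);
        tick();
    }

    int[] snapshot() { return v.clone(); }

    // a happened before b iff a <= b in every component and a != b.
    static boolean happenedBefore(int[] a, int[] b) {
        boolean strictlyLess = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) return false;
            if (a[i] < b[i]) strictlyLess = true;
        }
        return strictlyLess;
    }

    // Two distinct timestamps are concurrent when neither happened before the other.
    static boolean concurrent(int[] a, int[] b) {
        return !happenedBefore(a, b) && !happenedBefore(b, a) && !Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        VectorClock p0 = new VectorClock(3, 0), p1 = new VectorClock(3, 1), p2 = new VectorClock(3, 2);
        int[] m = p0.send();          // p0 is now [1, 0, 0]
        p1.receive(m);                // p1 is now [1, 1, 0]
        p2.tick();                    // p2 is now [0, 0, 1]
        System.out.println(Arrays.toString(p1.snapshot()));
        System.out.println(happenedBefore(m, p1.snapshot()));
        System.out.println(concurrent(p1.snapshot(), p2.snapshot()));
    }
}
```

The concurrent check is what Lamport timestamps cannot provide: two scalar timestamps are always ordered, whereas two vectors can be incomparable, which signals that the events are causally unrelated.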
Global States
Definition: Collection of local states of all processes and the state of the communication
channels.
Snapshot Algorithms:
Chandy-Lamport Algorithm:
Used to record a consistent global state.
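The initiating process records its own state, sends a marker on every outgoing channel, and starts recording messages arriving on its incoming channels.
On receiving a marker for the first time, a process records its state, treats the channel the marker arrived on as empty, sends markers on all outgoing channels, and starts recording its other incoming channels.
A later marker on another incoming channel ends recording for that channel; the messages recorded on it in the meantime form that channel's state.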
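The marker rules can be traced with a small single-threaded simulation. This sketch is deliberately reduced to two processes joined by one FIFO channel in each direction, with account balances as local state; the MARKER sentinel, the Proc class, and the scenario are assumptions made for illustration. A consistent snapshot should account for all the money in the system, including any in transit.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Two-process, single-threaded sketch of the marker rules; all names are hypothetical.
class SnapshotDemo {
    static final int MARKER = -1;            // assumed sentinel; real transfers are positive

    static class Proc {
        int balance;
        Integer recordedState = null;        // local state captured by the snapshot
        int recordedInChannel = 0;           // money recorded in transit on the incoming channel
        boolean recordingChannel = false;
        final Deque<Integer> out;            // FIFO channel to the other process

        Proc(int balance, Deque<Integer> out) { this.balance = balance; this.out = out; }

        void transfer(int amount) { balance -= amount; out.addLast(amount); }

        // Initiator: record own state, send a marker, start recording the incoming channel.
        void startSnapshot() {
            recordedState = balance;
            out.addLast(MARKER);
            recordingChannel = true;
        }

        void onReceive(int msg) {
            if (msg == MARKER) {
                if (recordedState == null) { // first marker: record state and relay the marker;
                    recordedState = balance; // the channel it arrived on is recorded as empty
                    out.addLast(MARKER);
                }
                recordingChannel = false;    // a marker ends recording on this channel
            } else {
                balance += msg;
                if (recordingChannel) recordedInChannel += msg;
            }
        }
    }

    // Runs a fixed scenario and returns the total money captured by the snapshot.
    static int run() {
        Deque<Integer> c01 = new ArrayDeque<>(), c10 = new ArrayDeque<>();
        Proc p0 = new Proc(100, c01), p1 = new Proc(50, c10);
        p1.transfer(20);                     // still in flight when the snapshot starts
        p0.startSnapshot();                  // records 100 and sends a marker to p1
        p0.transfer(10);                     // sent after the marker, so outside the snapshot
        while (!c01.isEmpty() || !c10.isEmpty()) {
            if (!c01.isEmpty()) p1.onReceive(c01.pollFirst());
            if (!c10.isEmpty()) p0.onReceive(c10.pollFirst());
        }
        return p0.recordedState + p0.recordedInChannel
             + p1.recordedState + p1.recordedInChannel;
    }

    public static void main(String[] args) {
        System.out.println("snapshot total: " + run());   // initial total was 100 + 50
    }
}
```

In this scenario the snapshot records 100 at the first process, 30 at the second, and the 20 that was in transit, so the recorded total matches the initial 150 even though no single instant had exactly those balances.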
Distributed Debugging
Distributed Mutual Exclusion
Purpose: Ensure that only one process enters a critical section at a time.
Algorithms:
1. Centralized Algorithm:
One coordinator manages access.
Pros: Simple.
Cons: Single point of failure, bottleneck.
2. Token Ring Algorithm:
Processes form a logical ring.
Token circulates; possession grants access.
Pros: Fairness.
Cons: Token loss requires recovery.
3. Ricart-Agrawala Algorithm:
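A process multicasts a timestamped request to all others and enters the critical section once every one of them has replied.
A receiver replies immediately unless it is in the critical section or has its own pending request with an earlier timestamp (ties broken by process ID); otherwise it defers the reply until it exits.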
Pros: No central coordinator.
Cons: High message overhead (2(N-1) messages per critical-section entry for N processes); every process must be reachable.
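The reply-or-defer decision at the heart of Ricart-Agrawala can be sketched as follows. This is a simplified single-node view, assuming requests carry a Lamport timestamp and a process ID and that lower (timestamp, ID) pairs have priority; message transport is left out and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of one process's request handling; names are illustrative.
class RicartAgrawalaNode {
    enum State { RELEASED, WANTED, HELD }

    final int id;
    State state = State.RELEASED;
    long myTimestamp;                                   // timestamp of own pending request
    int repliesPending;
    final List<Integer> deferred = new ArrayList<>();   // requesters to answer on exit

    RicartAgrawalaNode(int id) { this.id = id; }

    // Ask to enter; a real system would multicast (timestamp, id) to all peers here.
    void request(long timestamp, int peers) {
        state = State.WANTED;
        myTimestamp = timestamp;
        repliesPending = peers;
    }

    // Returns true if a reply is sent now, false if the reply is deferred.
    boolean onRequest(long ts, int from) {
        boolean mineFirst = state == State.WANTED
                && (myTimestamp < ts || (myTimestamp == ts && id < from));
        if (state == State.HELD || mineFirst) {
            deferred.add(from);
            return false;
        }
        return true;
    }

    // Enter the critical section once every peer has replied.
    void onReply() {
        if (state == State.WANTED && --repliesPending == 0) state = State.HELD;
    }

    // Leave the critical section and return the peers whose replies were deferred.
    List<Integer> release() {
        state = State.RELEASED;
        List<Integer> toReply = new ArrayList<>(deferred);
        deferred.clear();
        return toReply;
    }

    public static void main(String[] args) {
        RicartAgrawalaNode a = new RicartAgrawalaNode(1), b = new RicartAgrawalaNode(2);
        a.request(5, 1);
        b.request(7, 1);
        if (b.onRequest(5, 1)) a.onReply();      // a's request is older, so b replies
        System.out.println("a: " + a.state);
        System.out.println("b gets reply now? " + a.onRequest(7, 2));
        for (int ignored : a.release()) b.onReply();
        System.out.println("b: " + b.state);
    }
}
```

Each entry costs one request and one reply per peer, which is where the 2(N-1) message count comes from.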
Elections
Consensus and Related Problems
Consensus Problem: Processes must agree on a value, even in the presence of failures.
Impossibility in Asynchronous Systems:
FLP Impossibility: In a fully asynchronous system where even one process may crash, no deterministic
algorithm can guarantee that consensus is always reached in finite time.
Solutions in Synchronous Systems:
Byzantine Generals Problem:
Processes may act maliciously.
Byzantine Fault Tolerance (BFT) algorithms handle arbitrary faults, typically requiring at least 3f + 1 processes to tolerate f faulty ones.
Paxos Algorithm:
Leader-based consensus.
Roles: Proposers, Acceptors, Learners.
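Tolerates crash failures as long as a majority of acceptors remain available; agreement is never violated, but progress depends on periods of timely communication.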
Raft Algorithm:
Similar to Paxos but designed for understandability.
Uses leader election and log replication.
Two-Phase Commit Protocol (2PC):
Used in distributed transactions.
Phases:
1. Voting Phase: Coordinator asks participants to prepare.
2. Commit Phase: If all vote yes, commit; else, abort.
Three-Phase Commit Protocol (3PC):
Adds a pre-commit phase between voting and commit so that participants are less likely to block if the coordinator fails.
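The two phases of 2PC described above can be sketched as a coordinator loop. This is a minimal in-memory illustration: the Participant interface and the Recording class are hypothetical, and the timeouts, logging, and crash recovery that a real protocol needs are left out.

```java
import java.util.Arrays;
import java.util.List;

// In-memory sketch of the 2PC coordinator logic; all names are hypothetical.
class TwoPhaseCommitDemo {

    interface Participant {
        boolean prepare();   // vote: true means "yes, ready to commit"
        void commit();
        void abort();
    }

    // Returns true if the transaction committed, false if it was aborted.
    static boolean run(List<? extends Participant> participants) {
        // Phase 1 (voting): ask every participant to prepare and collect the votes.
        boolean allYes = true;
        for (Participant p : participants) {
            if (!p.prepare()) allYes = false;
        }
        // Phase 2 (commit): commit only on a unanimous yes, otherwise abort everywhere.
        for (Participant p : participants) {
            if (allYes) p.commit(); else p.abort();
        }
        return allYes;
    }

    // Test participant that votes as configured and remembers the final decision.
    static class Recording implements Participant {
        final boolean vote;
        String outcome = "pending";
        Recording(boolean vote) { this.vote = vote; }
        @Override public boolean prepare() { return vote; }
        @Override public void commit() { outcome = "committed"; }
        @Override public void abort() { outcome = "aborted"; }
    }

    public static void main(String[] args) {
        Recording a = new Recording(true), b = new Recording(false);
        System.out.println("committed? " + run(Arrays.asList(a, b)));
        System.out.println(a.outcome + ", " + b.outcome);
    }
}
```

A single "no" vote aborts the transaction at every participant, including those that voted yes. The blocking problem arises between the phases: a participant that has voted yes cannot decide on its own if the coordinator crashes before announcing the outcome.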