
Distributed Systems and Cloud Computing

UNIT-I
Introduction to Distributed Systems

Characteristics of Distributed Systems

Introduction

A distributed system is a collection of independent computers that appear to the users of the system
as a single coherent system. These systems are characterized by the following features:

Concurrency: Multiple processes run simultaneously, potentially interacting with each other.
No global clock: There's no single time source; processes need to coordinate without a shared
clock.
Independent failures: Components may fail independently, and the system should handle such
failures gracefully.
Scalability: Ability to expand and manage increased demand efficiently.
Heterogeneity: Comprising different hardware, operating systems, and network technologies.

Examples of Distributed Systems

1. Client-Server Systems
Architecture: Clients request services, and servers provide them.
Examples: Web servers, email servers.
Characteristics:
Centralized control (server).
Clients and servers can be on different networks.
Servers can handle multiple clients simultaneously.
2. Peer-to-Peer (P2P) Systems
Architecture: All nodes are both clients and servers.
Examples: File-sharing networks like BitTorrent.
Characteristics:
Decentralized control.
Scalability by adding more peers.
Resource sharing among equals.
3. Grid Computing
Architecture: Combines resources from multiple domains to reach a common goal.
Examples: SETI@home, Large Hadron Collider computing grid.
Characteristics:
High computational power.
Resource sharing across administrative domains.
Heterogeneous systems collaboration.
4. Cloud Computing
Architecture: Provides on-demand computing resources over the internet.
Examples: Amazon Web Services (AWS), Microsoft Azure.
Characteristics:
Elastic scalability.
Pay-as-you-go model.
Services delivered over the internet.

Advantages of Distributed Systems

Resource Sharing: Enables access to remote resources and services.


Scalability: Systems can grow incrementally.
Reliability and Availability: Redundancy allows for fault tolerance.
Performance: Parallelism can improve speed.
Flexibility: Modular and can be easily modified.

System Models

Introduction

System models provide a conceptual framework to understand the structure and behavior of
distributed systems. They help in analyzing system performance, scalability, and reliability.

Architectural Models

1. Client-Server Model
Clients initiate requests, and servers respond.
Supports multiple clients and servers.
Example: Web architecture.
2. Peer-to-Peer Model
All nodes have equal roles.
Nodes can act as both clients and servers.
Example: P2P file sharing.
3. Layered Model
System is divided into layers, each providing services to the layer above.
Enhances modularity.
Example: OSI model in networking.
4. Object-Based Model
System is modeled as a collection of interacting objects.
Promotes encapsulation and reusability.
Example: Distributed Object Systems like CORBA.

Fundamental Models

1. Interaction Model
Latency: Delay between sending and receiving messages.
Bandwidth: Data transfer rate.
Jitter: Variation in message delay.
2. Failure Model
Crash Failures: Process halts and remains halted.
Omission Failures: Messages are lost.
Timing Failures: Responses occur outside the specified time interval.
Byzantine Failures: Arbitrary or malicious failures.

3. Security Model
Threats: Unauthorized access, data interception.
Defense Mechanisms: Encryption, authentication protocols.
Policies: Access control, audit trails.

Networking and Internetworking

Local Area Networks (LANs): Connect computers within a small area.


Wide Area Networks (WANs): Connect systems over large distances.
Internet Protocols: TCP/IP stack facilitates communication.
Routing: Determines the path data takes across networks.
Internetworking: Connecting separate networks via routers and gateways.

Inter-Process Communication (IPC)

Message Passing

Synchronous Communication
Sender waits until the message is received.
Ensures message delivery before proceeding.
Asynchronous Communication
Sender proceeds without waiting.
Messages may be buffered.
Primitives
Send(message, destination)
Receive(message, source)
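
As a sketch of these primitives, an asynchronous, buffered channel between two threads might look like the following (class and method names are illustrative, not part of the notes):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Asynchronous message passing: the sender enqueues and proceeds
// immediately; the receiver blocks until a message is available.
public class Channel {
    private final BlockingQueue<String> buffer;

    public Channel(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    // Send(message, destination): asynchronous, buffered
    public boolean send(String message) {
        return buffer.offer(message); // false if the buffer is full
    }

    // Receive(message, source): blocks until a message arrives
    public String receive() {
        try {
            return buffer.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
    }
}
```

Synchronous communication would instead use a zero-capacity rendezvous (e.g. `SynchronousQueue`), where the sender blocks until the receiver takes the message.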

Shared Memory

Processes communicate by reading and writing to shared variables.


Requires synchronization mechanisms like semaphores or mutexes.
Faster communication within the same machine.
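
A minimal sketch of shared-memory communication with synchronization, assuming two threads on the same machine (names are illustrative):

```java
// Two threads communicate by updating a shared variable; the
// synchronized methods act as the mutex guarding the critical section.
public class SharedCounter {
    private int value = 0;

    public synchronized void increment() { value++; }
    public synchronized int get() { return value; }

    public static void main(String[] args) throws InterruptedException {
        SharedCounter c = new SharedCounter();
        Runnable task = () -> { for (int i = 0; i < 1000; i++) c.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.get()); // 2000: the mutex prevents lost updates
    }
}
```

Without the `synchronized` keyword the two increments could interleave and updates would be lost.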

Distributed Objects and Remote Method Invocation (RMI)

Distributed Objects: Objects that exist in different address spaces but can communicate.
RMI: Allows a program to call methods on an object residing in another address space.
Stub: Client-side proxy representing the remote object.
Skeleton: Server-side entity that dispatches calls to the actual object.
Process
1. Client invokes a method on the stub.
2. Stub marshals the parameters and sends a request.
3. Skeleton receives, unmarshals, and invokes the method.
4. Response is sent back through the stub to the client.

Remote Procedure Call (RPC)

Concept: Allows a program to execute a procedure on a remote host as if it were local.


Process:
1. Client Stub: Prepares the request by marshalling parameters.
2. Network: Transmits the request to the server.
3. Server Stub: Unmarshals the request and invokes the service.
4. Response: Result is marshalled and sent back.
Advantages:
Simplifies coding of network communication.
Abstracts the complexity of the network layer.

Events and Notifications

Event-based Systems: Components communicate by emitting and receiving events.


Publish/Subscribe Model:
Publishers: Emit events.
Subscribers: Register interest in certain events.
Event Service: Manages subscriptions and delivery.
Notifications: Messages sent to inform about events.
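
A minimal sketch of the publish/subscribe model, assuming a single in-process event service (all names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Event service: subscribers register interest in a topic; publishers
// emit events; the service delivers a notification to every subscriber
// of the matching topic.
public class EventService {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    public void subscribe(String topic, Consumer<String> handler) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    public void publish(String topic, String event) {
        subscribers.getOrDefault(topic, List.of())
                   .forEach(handler -> handler.accept(event));
    }
}
```

Real systems (e.g. message brokers) add buffering, durable subscriptions, and delivery over the network, but the subscribe/publish/notify roles are the same.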

Case Study: Java RMI

Java RMI: Java's implementation of Remote Method Invocation.


Components:
Remote Interface: Defines methods that can be invoked remotely.
Remote Object: Implements the remote interface.
RMI Registry: Directory for locating remote objects.
Process:
1. Define a remote interface.
2. Implement the remote interface.
3. Create server and client applications.
4. Start the RMI registry.
5. Bind the remote object in the registry.
6. Clients look up the object and invoke methods.
Example:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Remote Interface: defines the methods callable from another JVM
public interface MyService extends Remote {
    String sayHello(String name) throws RemoteException;
}

// Implementation: extending UnicastRemoteObject exports the object
// so remote clients can invoke it
public class MyServiceImpl extends UnicastRemoteObject implements MyService {
    public MyServiceImpl() throws RemoteException {}

    public String sayHello(String name) {
        return "Hello, " + name;
    }
}
```
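
The remaining steps of the process (start the registry, bind the object, look it up, invoke) can be sketched end-to-end in a single JVM; the nested interface and implementation mirror the example above, and all names are illustrative. In practice the server and client are separate programs:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class RmiDemo {
    interface MyService extends Remote {
        String sayHello(String name) throws RemoteException;
    }

    static class MyServiceImpl extends UnicastRemoteObject implements MyService {
        MyServiceImpl() throws RemoteException {}
        public String sayHello(String name) { return "Hello, " + name; }
    }

    public static void main(String[] args) throws Exception {
        MyServiceImpl impl = new MyServiceImpl();
        Registry registry = LocateRegistry.createRegistry(1099); // step 4
        registry.rebind("MyService", impl);                      // step 5
        MyService svc = (MyService) registry.lookup("MyService"); // step 6
        System.out.println(svc.sayHello("World")); // prints "Hello, World"
        // unexport so the JVM can exit cleanly in this one-process demo
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
    }
}
```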

UNIT-II
Synchronization

Time and Global States

Introduction

In distributed systems, maintaining a consistent notion of time and state across multiple nodes is
challenging due to the lack of a global clock and varying message delays.

Logical Clocks

Purpose: Provide a sequence of events in a distributed system.


Lamport Timestamps:
Rules:
1. Each process increments its counter before each event.
2. On sending a message, include the counter.
3. On receiving a message, set the counter to max(local, received) + 1.
Properties: If event A happened before event B, then timestamp(A) < timestamp(B).
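
The three rules above can be sketched directly (names are illustrative):

```java
// A Lamport logical clock: one counter per process, updated by the
// three rules for local events, sends, and receives.
public class LamportClock {
    private long time = 0;

    // Rule 1: increment before each local event
    public long tick() { return ++time; }

    // Rule 2: on send, tick and attach the counter to the message
    public long send() { return tick(); }

    // Rule 3: on receive, set counter to max(local, received) + 1
    public long receive(long received) {
        time = Math.max(time, received) + 1;
        return time;
    }
}
```

Note the property is one-directional: timestamp(A) < timestamp(B) does not imply A happened before B; vector clocks (below in Unit II) close that gap.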

Synchronizing Physical Clocks

Need: Coordinated actions, consistent timestamps.


Clock Drift: Clocks may run at different speeds.
Protocols:
Cristian's Algorithm:
Clients request time from a time server.
Adjusts for network delay.
Berkeley Algorithm:
Master clock polls slaves.
Averages the time differences.
Network Time Protocol (NTP):
Hierarchical system of time servers.
Provides synchronization over the internet.
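
Cristian's adjustment for network delay reduces to one line of arithmetic: the client estimates the server's current time as the reported time plus half the measured round trip (assuming symmetric delay). A sketch, with illustrative names:

```java
// Cristian's algorithm, client side:
//   t0 = client clock when the request was sent
//   t1 = client clock when the reply arrived
//   serverTime = timestamp the server put in the reply
public class Cristian {
    public static long estimate(long t0, long t1, long serverTime) {
        long roundTrip = t1 - t0;
        return serverTime + roundTrip / 2; // assumes symmetric network delay
    }
}
```

For example, with t0 = 100, t1 = 140, and serverTime = 1000, the estimate is 1020: the reply is assumed to have spent 20 of the 40 time units in transit.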

Events and Process States

Event Ordering: Determining the sequence of events.


Happens-Before Relation (→):
If A and B are events in the same process and A occurs before B, then A → B.
If A is a send event and B is the corresponding receive event, then A → B.
Transitivity: If A → B and B → C, then A → C.

Logical Time and Logical Clocks

Vector Clocks:
Each process maintains a vector timestamp.
Rules:
1. Increment own counter on local event.
2. On sending a message, include the vector.
3. On receiving, update vector to element-wise maximum and increment own counter.
Benefits: Captures causality more precisely than Lamport timestamps.
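
A sketch of the rules for a fixed group of processes (names are illustrative):

```java
import java.util.Arrays;

// A vector clock: one entry per process in the group; `self` is the
// index of the process that owns this clock.
public class VectorClock {
    private final long[] v;
    private final int self;

    public VectorClock(int numProcesses, int self) {
        this.v = new long[numProcesses];
        this.self = self;
    }

    // Rule 1 (and on send, Rule 2): increment own entry
    public void tick() { v[self]++; }

    // Rule 3: element-wise maximum with the received vector,
    // then increment own entry
    public void receive(long[] received) {
        for (int i = 0; i < v.length; i++) v[i] = Math.max(v[i], received[i]);
        v[self]++;
    }

    public long[] snapshot() { return Arrays.copyOf(v, v.length); }
}
```

Comparing two snapshots element-wise then recovers causality: A happened before B exactly when A's vector is less than or equal to B's in every entry and strictly less in at least one.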

Global States

Definition: Collection of local states of all processes and the state of the communication
channels.
Snapshot Algorithms:
Chandy-Lamport Algorithm:
Used to record a consistent global state.
Process initiates by recording its state and sending marker messages.
On receiving a marker for the first time, record state and send markers.
Subsequent markers indicate the end of channel state recording.

Distributed Debugging

Challenges: Inconsistent views, non-determinism.


Techniques:
Logging: Record events for post-mortem analysis.
Checkpointing: Save states periodically.
Predicate Detection: Check for conditions over global states.

Coordination and Agreement

Distributed Mutual Exclusion

Purpose: Ensure that only one process enters a critical section at a time.
Algorithms:
1. Centralized Algorithm:
One coordinator manages access.
Pros: Simple.
Cons: Single point of failure, bottleneck.
2. Token Ring Algorithm:
Processes form a logical ring.
Token circulates; possession grants access.
Pros: Fairness.
Cons: Token loss requires recovery.
3. Ricart-Agrawala Algorithm:
Processes multicast timestamped request messages.
A process replies immediately unless it is in the critical section or holds an earlier-timestamped pending request.
Pros: No central coordinator.
Cons: High message overhead.

Elections

Purpose: Select a coordinator or leader among distributed processes.


Algorithms:
1. Bully Algorithm:
Any process can initiate.
Higher ID processes "bully" lower ones.
Upon failure detection, election is called.
2. Ring Algorithm:
Processes are arranged in a logical ring.
Election message circulates with IDs.
Highest ID process is elected.
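
The ring algorithm's core, one full circulation of the election message, can be sketched as a pure function (names are illustrative; a real implementation would pass the message between processes):

```java
// Ring election: the election message carries the largest ID seen so
// far; each process forwards max(own ID, carried ID). After one full
// pass around the ring, the highest ID is the elected coordinator.
public class RingElection {
    // ids[i] is the ID of the process at ring position i;
    // the election is initiated by the process at position `start`.
    public static int elect(int[] ids, int start) {
        int candidate = ids[start];
        for (int step = 1; step < ids.length; step++) {
            int next = ids[(start + step) % ids.length];
            candidate = Math.max(candidate, next);
        }
        return candidate;
    }
}
```

A second pass (omitted here) then circulates the winner so every process learns the new coordinator.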
Multicast Communication

Purpose: Send messages to a group of processes.


Types:
Unreliable Multicast: Best-effort delivery.
Reliable Multicast: Guarantees delivery to all members.
Atomic Multicast: All or none delivery; useful for consistency.
Implementations:
IP Multicast: Network-level support.
Application-Level Multicast: Overlays built at the application layer.

Consensus and Related Problems

Consensus Problem: Processes must agree on a value, even in the presence of failures.
Impossibility in Asynchronous Systems:
FLP Impossibility: In a fully asynchronous system, no deterministic algorithm can
guarantee consensus if even a single process may crash.
Solutions under Stronger Assumptions (synchrony or failure bounds):
Byzantine Generals Problem:
Processes may act maliciously.
Byzantine Fault Tolerance (BFT) algorithms handle arbitrary faults.
Paxos Algorithm:
Leader-based consensus.
Roles: Proposers, Acceptors, Learners.
Handles crash failures.
Raft Algorithm:
Similar to Paxos but designed for understandability.
Uses leader election and log replication.
Two-Phase Commit Protocol (2PC):
Used in distributed transactions.
Phases:
1. Voting Phase: Coordinator asks participants to prepare.
2. Commit Phase: If all vote yes, commit; else, abort.
Three-Phase Commit Protocol (3PC):
Improves upon 2PC by adding an additional phase to reduce blocking.
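
The 2PC decision rule itself is simple; a sketch of the coordinator's logic, with participants modeled as local vote suppliers (names are illustrative; real participants would be remote and would also need the prepare/commit messages and failure handling):

```java
import java.util.List;
import java.util.function.Supplier;

// Two-phase commit, coordinator side:
//   Phase 1 (voting): ask each participant to prepare and vote.
//   Phase 2 (commit): commit only if the vote is unanimous.
public class TwoPhaseCommit {
    public static String decide(List<Supplier<Boolean>> participants) {
        for (Supplier<Boolean> vote : participants) {
            if (!vote.get()) return "ABORT"; // any "no" vote aborts
        }
        return "COMMIT";
    }
}
```

The blocking problem 3PC addresses arises when the coordinator crashes between the phases: participants that voted yes must hold their locks until they learn the outcome.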

Tips for Exam Preparation


Understand Concepts: Focus on grasping the underlying principles rather than memorizing.
Draw Diagrams: Visual representations can aid in understanding architectures and algorithms.
Practice Algorithms: Work through examples of synchronization and consensus algorithms.
Relate to Real Systems: Consider how concepts apply to actual distributed systems like cloud
platforms.
Review Case Studies: Java RMI and other implementations provide practical insights.
Test Yourself: Attempt past exam questions or create your own based on the topics.

Good luck with your exam preparation! Remember to focus on both the theoretical aspects and their
practical applications in real-world systems.
