Distributed System Lecture 1
Pan Hui
pan.hui@helsinki.fi
Huber Flores
huber.flores@helsinki.fi
Agenda
• Concept and definition
– What is a distributed system
• Modelling
• Given the model
– How to reason about correctness
• Goals
– Sharing resources
– Openness
– Scalability
– Availability
Distributed transparency
Transparency   Description
Access         Hide differences in data representation and how an object is accessed
Location       Hide where an object is located
Relocation     Hide that an object may be moved to another location while in use
Migration      Hide that an object may move to another location
Replication    Hide that an object is replicated
Concurrency    Hide that an object may be shared by several independent users
Failure        Hide the failure and recovery of an object
Network layers
• Message from sender goes through all the layers on its way
• Protocols/functionality described on each layer separately
• Abstraction!!!
Architecture layers
• Different layers –
different abstractions
• Assumptions about
the lower layer’s
behaviour
• A process in each
layer
Critical challenges
• Knowledge is local – no single central control
point
• Clocks are not synchronized
• No globally shared address space
• Topology and routing – Everything is dynamic
• Scalability
• Processes and links fail
– Fault tolerance, e.g., CODA File System
Helsinki, Finland, 2018.
Common issues
• Leader election
• Mutual exclusion
• Time synchronization
• Distributed snapshot
• Reliable multicast
• Replica management
• Consensus
Implementation
• Real network
– Cellular network, 3G, LTE, 5G
• Simulation
– Programming languages, e.g., Python, Java, Spark,
Erlang, and so on
– Multi-core, multi-thread, shared-memory
• Cloud
– Virtualization (on the fly and on demand)
Hands-on session
• Run Python routines that let two processes communicate
– https://gist.github.com/huberflores/6a5ecee3ef4920d16b4c0cb1c737bb6f
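A minimal sketch of two communicating processes, for orientation before the session. This is not the code from the gist above; it is an illustrative example that assumes only the Python standard library. The names CHILD_SOURCE and ping are hypothetical. The parent process starts a child process and exchanges one message with it over the child's standard input and output pipes.

```python
import subprocess
import sys

# Source code of the child process (hypothetical example):
# it echoes every line it receives, prefixed with "ack: ".
CHILD_SOURCE = """
import sys
for line in sys.stdin:
    sys.stdout.write("ack: " + line)
    sys.stdout.flush()
"""


def ping(message):
    """Start a child process, send it one message, and return its reply."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # communicate() writes the input, closes the pipe, and waits for the child.
    reply, _ = child.communicate(message + "\n")
    return reply.strip()


if __name__ == "__main__":
    print(ping("hello"))
```

The two processes share no memory: everything the parent learns about the child arrives as a message, which is the situation the rest of the lecture models.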
MODELLING
Models
• We will reason about distributed systems by
relying on models. There are many dimensions of
variability in distributed systems. Examples
– Type of processors
– inter-process communication mechanisms
– Timing assumptions
– Failure classes
– Security features, etc.
Models
• Models are simple abstractions that help to overcome the variability: they preserve the essential features, but hide the implementation details and simplify writing distributed algorithms for problem solving
A classification
Modelling communication
• System topology is a graph G = (V, E), where V = set of nodes (sequential processes) and E = set of edges (links or channels, bidirectional or unidirectional).
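The graph model can be written down directly. The sketch below is one possible encoding, not a prescribed one: the function names make_topology and neighbours and the process names p0, p1, p2 are assumptions for illustration. A bidirectional link is stored as two directed edges.

```python
def make_topology(nodes, links, bidirectional=True):
    """Build G = (V, E) from node names and (sender, receiver) pairs."""
    V = set(nodes)
    E = set()
    for u, v in links:
        if u not in V or v not in V:
            raise ValueError(f"link ({u}, {v}) uses an unknown node")
        E.add((u, v))
        if bidirectional:
            # A bidirectional channel is modelled as two directed edges.
            E.add((v, u))
    return V, E


def neighbours(E, node):
    """Processes that `node` can send to directly, in sorted order."""
    return sorted(v for (u, v) in E if u == node)


if __name__ == "__main__":
    V, E = make_topology(["p0", "p1", "p2"], [("p0", "p1"), ("p1", "p2")])
    print(neighbours(E, "p1"))
```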
Reliable channel
• Axiom 1: message m sent ⇔ message m received
• Axiom 2: propagation delay arbitrary, but finite
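The two axioms can be illustrated with a small simulation. This is a sketch under stated assumptions: the class name ReliableChannel, the logical clock passed in as `now`, and the upper bound max_delay are all hypothetical. A simulation has to pick some bound on the delay, which is stronger than the axiom itself (the model only requires the delay to be finite). Note that nothing here preserves sending order.

```python
import heapq
import random


class ReliableChannel:
    """Simulated channel: no loss, no duplication, finite random delay."""

    def __init__(self, max_delay=10, seed=0):
        self._rng = random.Random(seed)
        self._max_delay = max_delay  # assumption: a bound chosen for the simulation
        self._in_transit = []        # heap of (delivery_time, seq, message)
        self._seq = 0                # tie-breaker so messages are never compared

    def send(self, now, message):
        # Axiom 2: the delay varies per message, but is always finite.
        delay = self._rng.randint(1, self._max_delay)
        heapq.heappush(self._in_transit, (now + delay, self._seq, message))
        self._seq += 1

    def deliver_until(self, now):
        # Axiom 1: each sent message is handed over exactly once.
        delivered = []
        while self._in_transit and self._in_transit[0][0] <= now:
            _, _, message = heapq.heappop(self._in_transit)
            delivered.append(message)
        return delivered


if __name__ == "__main__":
    channel = ReliableChannel()
    for m in ["m1", "m2", "m3"]:
        channel.send(now=0, message=m)
    print(channel.deliver_until(now=10))
```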
Model transformations
“Can model X be implemented using model Y?” is an interesting question in computer science.
• Stronger models
– Simplify reasoning, but
– Need extra work to implement
• Weaker models
– Are easier to implement
– Have closer relationship with the real world, but
– Might be difficult or impossible to prove correct
• Common transformations (weaker to stronger)
– Non-FIFO to FIFO channel
– Message passing to shared memory
– Non-atomic to atomic broadcast
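As an illustration of the first transformation, a FIFO channel can be built on top of a reliable non-FIFO channel by having the sender number its messages and the receiver hold back anything that arrives early. The sketch below shows only the receiver side and is one common way to do it, not necessarily the construction used in the course. The class name FifoReceiver and its method are hypothetical, and the sender is assumed to number messages 0, 1, 2, and so on.

```python
class FifoReceiver:
    """Receiver side of a FIFO channel built over a non-FIFO channel."""

    def __init__(self):
        self._next_seq = 0  # sequence number expected next
        self._buffer = {}   # early arrivals, keyed by sequence number

    def on_receive(self, seq, payload):
        """Accept one message from the underlying channel.

        Returns the payloads that can now be delivered to the
        application, in sending order (possibly an empty list).
        """
        self._buffer[seq] = payload
        ready = []
        while self._next_seq in self._buffer:
            ready.append(self._buffer.pop(self._next_seq))
            self._next_seq += 1
        return ready


if __name__ == "__main__":
    receiver = FifoReceiver()
    # The underlying channel delivers the messages out of order.
    for seq, payload in [(1, "b"), (2, "c"), (0, "a")]:
        print(seq, receiver.on_receive(seq, payload))
```

The extra work mentioned above is visible here: the stronger model costs a sequence number per message and a buffer at the receiver.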
Other classifications
• Reactive vs transformational systems
– Reactive system always ready to react to requests
or changes
– Transformational (nonreactive) system reaches a
fixed point (like termination) after which no
further changes occur
• Named vs anonymous systems
– Anonymous when the algorithms do not consider
names or identifiers of processes.
QUESTIONS?
Your tasks
• For next lecture session: (From Van Steen)
– Read Chapter 2 Architectural Styles
– Start reading Chapter 3 Processes