Tolerating Byzantine Behavior in Distributed Systems
Miguel Correia
University of Lisboa
LASIGE / Navigators group
CyLab / CMU, December 2007
Motivation
• Every year thousands of new vulnerabilities
appear, zillions of attacks and intrusions happen
– Doing the best we know/can, using security best
practices etc. is essential but not enough
• Systems with high societal importance are
becoming “online”
– Critical infrastructures: gas, water, power,…
– Controlled by computers indirectly connected to the
Internet
Intrusion Tolerance
• (also called Byzantine Fault Tolerance)
• To apply the Fault Tolerance paradigm in the
domain of Security
• Do the best we know to protect systems
• …but vulnerabilities still remain…
• Tolerate intrusions that still occur
I-T: an example
[Figure: clients send requests to, and receive replies from, an I-T distributed service (NFS, DNS, on-line CA, Web server, etc.) made of N servers. Redundancy and diversity among the servers let the service withstand a 0-day vulnerability.]
Outline
• Hybrid system models and Wormholes
• I-T State machine replication
• Randomized I-T protocols
• Primary-based vs decentralized protocols
• Conclusions
Hybrid system models and Wormholes
Homogeneous system models
• Most work on I-T assumes a homogeneous
system model; typically:
– Asynchronous (no bounds on delays)
– Byzantine/arbitrary faults, including attacks/intrusions
[Figure: hosts 1 to n, each running processes on top of an OS, connected by the payload network.]
Hybrid system models
• We proposed and are interested in hybrid
system models. For instance:
– Asynchronous/Byzantine as before (red) +
– Wormhole that is secure/tamperproof (green)
[Figure: hosts 1 to n, each running processes on top of an OS plus a local wormhole (Local Wh.). The hosts are connected by the payload network, and the local wormholes by an optional wormhole control channel.]
Question 1: practical?
• Yes, it models several current systems:
• PCs with Trusted Platform Modules (TPM)
– https://www.trustedcomputinggroup.org/
• PCs with SmartCards
• DIY: PCs with virtual machines (Xen, VMware)
• DIY: PCs with hardware appliances
Question 2: why model?
• Why not do research about PCs + SmartCards
or TPMs or…?
• In our research we want:
– Expressive models of real systems
– Sound theoretical basis for proofs of correctness
– Enablers for building new algorithms
• For practical minds:
– We don’t want to be restricted to what can be done
with SmartCards or TPMs…
Question 3: model what?
• In this talk:
– “insecure system + secure subsystem”
• But there are other possibilities, e.g.,
– “untimely system + timely subsystem”
– A. Casimiro, P. Veríssimo, Timely Computing
Base
I-T State machine replication
State machine replication basics
[Figure: clients send requests to, and receive replies from, an I-T distributed service made of N servers.]
SMR is a mechanism to make any deterministic service fault-tolerant
SMR definition
• Servers are state machines:
– state variables, commands
• Basic idea: to make all servers follow the same
sequence of states, i.e., enforce:
– Initial state: all servers start in the same state
– Agreement: all servers execute the same commands
– Total order: all servers execute the commands in the
same order
(Agreement and Total order are provided by an atomic multicast protocol)
– Determinism: the same command executed in the
same initial state generates the same final state
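The four conditions above can be illustrated with a minimal sketch. The `CounterReplica` class and its `add`/`mul` commands are hypothetical, chosen only for illustration, and the ordered command list stands in for what an atomic multicast protocol would deliver.

```python
from dataclasses import dataclass, field


@dataclass
class CounterReplica:
    """Hypothetical deterministic state machine with one state variable."""
    state: int = 0                      # initial state: identical on every replica
    log: list = field(default_factory=list)

    def execute(self, command):
        """Apply one command; the result depends only on state and command."""
        op, arg = command
        if op == "add":
            self.state += arg
        elif op == "mul":
            self.state *= arg
        else:
            raise ValueError(f"unknown command: {op}")
        self.log.append(command)
        return self.state


# Stand-in for the totally ordered sequence delivered by atomic multicast.
ordered_commands = [("add", 2), ("mul", 5), ("add", 1)]

# Agreement + total order: every replica executes the same commands, same order.
replicas = [CounterReplica() for _ in range(3)]
for replica in replicas:
    for command in ordered_commands:
        replica.execute(command)

final_states = [replica.state for replica in replicas]

# Violating total order on one replica makes its state diverge.
out_of_order = CounterReplica()
for command in reversed(ordered_commands):
    out_of_order.execute(command)
```

All three replicas end in the same state, while the replica that executed the same commands in a different order ends in a different one, which is why total order is enforced and not just agreement.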
Main Contribution
• There is a maximum number f of servers that can be
faulty for the system to remain correct
• With a homogeneous system model (asynchronous
Byzantine):
– Minimum: N=3f+1 servers
– 4 servers to tolerate 1 faulty, 7 to tolerate 2 faulty,…
• With a hybrid system model (secure wormhole in
servers; not in clients):
– Minimum: N=2f+1 servers
– 3 to tolerate 1 faulty, 5 to tolerate 2 faulty,…
– This reduction has a huge impact on the system cost: hw, sw,
admin (diversity)
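The arithmetic behind these bounds is small enough to check directly. The helper names `min_replicas` and `max_faults` below are assumptions made for this sketch; they only encode N=3f+1 and N=2f+1 as stated above.

```python
def min_replicas(f, hybrid):
    """Minimum number of servers N needed to tolerate f faulty servers."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1 if hybrid else 3 * f + 1


def max_faults(n, hybrid):
    """Largest f that n servers can tolerate under the same bounds."""
    if n < 1:
        raise ValueError("n must be positive")
    return (n - 1) // 2 if hybrid else (n - 1) // 3


# (f, N in the homogeneous model, N in the hybrid model)
rows = [(f, min_replicas(f, False), min_replicas(f, True)) for f in (1, 2, 3)]
```

Read the other way round, 7 servers tolerate 2 faults in the homogeneous model and 3 faults in the hybrid one.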
Trusted Ordering Wormhole
• The TOW is a wormhole that serves specifically to
implement a 2f+1 I-T atomic multicast
• Provides a single service with two purposes:
– Says when a message can be delivered (which is when f+1
servers have it)
– Says the order in which it must be delivered
• API:
– TOW_sent – “I sent a message”
– TOW_received – “I received a message”
• Output:
– TOW_decide – “You can deliver the message, order is n”
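A toy, single-process sketch of this interface follows. It is not the actual TOW, which is a secure distributed component. The class name `ToyTOW`, the polling form of `tow_decide`, the rule that order numbers are assigned when the f+1 threshold is reached, and the use of SHA-256 as the hash are all assumptions made for illustration.

```python
import hashlib


def H(message: bytes) -> str:
    """Collision-resistant hash; SHA-256 is this sketch's choice."""
    return hashlib.sha256(message).hexdigest()


class ToyTOW:
    """In-process stand-in for the TOW (hypothetical, for illustration only)."""

    def __init__(self, f):
        self.f = f
        self.holders = {}      # digest -> ids of servers known to have the message
        self.order = {}        # digest -> order number assigned at decision time
        self.next_order = 1

    def _note(self, server_id, digest):
        self.holders.setdefault(digest, set()).add(server_id)
        # Deliverable once f+1 servers have the message, so at least one
        # correct server holds it.
        if digest not in self.order and len(self.holders[digest]) >= self.f + 1:
            self.order[digest] = self.next_order
            self.next_order += 1

    def tow_sent(self, server_id, digest):
        """'I sent a message'."""
        self._note(server_id, digest)

    def tow_received(self, server_id, digest):
        """'I received a message'."""
        self._note(server_id, digest)

    def tow_decide(self, digest):
        """Return the delivery order, or None while fewer than f+1 servers have it."""
        return self.order.get(digest)


tow = ToyTOW(f=1)
d1, d2 = H(b"M1"), H(b"M2")
tow.tow_sent("S1", d1)
before = tow.tow_decide(d1)       # only one server has M1 so far
tow.tow_received("S0", d1)
after = tow.tow_decide(d1)        # f+1 = 2 servers have M1
tow.tow_sent("S2", d2)
tow.tow_received("S0", d2)
second = tow.tow_decide(d2)
```

Only hashes are passed to the wormhole in this sketch, so it never has to store message payloads.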
2f+1 Atomic multicast w/ TOW
H(M): a collision-resistant hash function
[Figure: example run with N=3, f=1 (servers S0, S1, S2 and the TOW). Message M1 is multicast; servers report "sent H(M1)" and "received H(M1)" to the TOW. Once f+1 servers have M1, the TOW outputs "decide H(M1),1" (order = 1) and the message is delivered. It works the same way with more messages.]
Performance of I-T SMR
• Nice runs
• Bad runs
I-T SMR Research trends
• BFT – Castro and Liskov (OSDI 99)
– First efficient I-T SMR system
• Increasing speed:
– FaB Paxos (DSN’05), Q/U (SOSP’05), HQ (OSDI’06),
Zyzzyva (SOSP’07)
• Reducing window of vulnerability:
– BFT-PR (TOCS’02), Sousa et al. (SAC’06)
• Reducing number of replicas:
– this work (SRDS’04), BFT2F (NSDI’07), A2M-PBFT-EA (SOSP’07)
Randomized I-T protocols
Motivation
• Randomized Byzantine FT agreement protocols:
– Introduced in 1983: Ben-Or (PODC), Rabin (FOCS)
– Since then many others appeared…
• But from a practical point of view:
– Ben-Or style protocols (“local coins”) → run in an
exponential expected number of communication steps
– Rabin style protocols (“shared coin”) → rely on public-key crypto
• DS folklore: work in the area is theoretical;
protocols too slow for most applications…
• …but are they really slow?
RITAS
• First, we designed an arguably efficient stack of
randomized I-T protocols, RITAS (no wormhole)
– No signatures, asynchronous, decentralized, n=3f+1
• Then implemented and evaluated their performance…
– LANs, PlanetLab, wireless (PCs and PDAs)
Local coins vs Shared coin
• Binary consensus protocols evaluated:
– Bracha’s (84): expected number of rounds O(2^(n-f)), no crypto
– ABBA (01): expected number of rounds constant, public-key
crypto
• Testbed
– 10/100/1000 Mbps local-area network (LAN)
– 11 Dell PowerEdge 850 computers (2.8 GHz, 2 GB
RAM)
– Linux 2.6.11
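To get a feel for the exponential bound on the local-coin protocol, the sketch below evaluates the 2^(n-f) term for the group sizes used in the evaluation. It assumes the maximum f allowed by n=3f+1 and a constant factor of 1 (the O() notation hides the real constant), and the function names are hypothetical.

```python
def max_faults(n):
    """Largest f such that n >= 3f+1."""
    return (n - 1) // 3


def local_coin_round_bound(n):
    """The 2^(n-f) term of the expected-rounds bound, constant factor taken as 1."""
    return 2 ** (n - max_faults(n))


bounds = {n: local_coin_round_bound(n) for n in (4, 7, 10)}
```

For n=10 the term evaluates to 128, the same figure quoted as the theoretical result alongside the measured number of rounds.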
Latency
Shared Coin always has much higher latency
Latency (µs) [1000 Mbps, no faults]
Proposal distribution   Coin     Machines (n): 4       7       10
Uniform                 Local                  824     2187    4132
Uniform                 Shared                 21590   31315   43633
Corrosive               Local                  2453    6172    12075
Corrosive               Shared                 33834   38529   55169
Random                  Local                  2056    5812    11501
Random                  Shared                 24320   36325   49206
Throughput
• Local Coin is always better than Shared Coin
• Local Coin is affected by Byzantine faults
• Shared Coin is not affected by Byzantine faults
Maximum Throughput (decisions/s)
Faultload       Coin     Machines (n): 4     7     10
Failure-free    Local                  450   170   80
Failure-free    Shared                 13    9     8
Crash           Local                  600   225   110
Crash           Shared                 31    25    20
Byzantine       Local                  330   87    30
Byzantine       Shared                 16    9     8
Number of Rounds
• The average number of rounds is very low
• The performance is similar for the failure-free and crash faultloads
• The protocols always terminate in one round in the crash faultload
• Shared Coin is more robust with the Byzantine faultload
Number of Rounds until Decision
Faultload       Coin     Machines (n): 4       7       10
Failure-free    Local                  1.004   1.005   1.009
Failure-free    Shared                 1.013   1.018   1.010
Crash           Local                  1.000   1.000   1.000
Crash           Shared                 1.000   1.000   1.000
Byzantine       Local                  1.462   1.569   2.289
Byzantine       Shared                 1.016   1.017   1.012
Theoretical expected result is 128 rounds
Randomized Atomic Broadcast
Bracha’84
• Is it fast/practical?
• Testbed
– 100 Mbps LAN
– 4 nodes (Pentium III PCs, 500 MHz, 128 MB RAM)
– Linux (kernel version 2.6.15)
Throughput
• No faults, n=4
• Byzantine faults – throughput almost not affected
[Figure: throughput plots; labelled values: ~721 msgs/s, ~711 msgs/s, ~650 msgs/s, ~634 msgs/s, ~460 msgs/s, ~465 msgs/s]
Primary-based vs decentralized protocols
Faster RITAS?
• We wanted RITAS to be faster; best candidate for
improvement: Binary Consensus (bottom of the stack)
– Fastest RITAS’s BC (Bracha 84): decentralized, n=3f+1, O(n^3)
message complexity, no signatures
• Decentralized algorithms that solve asynchronous
Byzantine BC can be built with and only with:
1. More Processes: n = 5f+1, O(n^2) message complexity and no
signatures
2. More Messages: n = 3f+1, O(o) message complexity (n^2 < o =
n^2 f) and no signatures
3. Signatures: n = 3f+1, O(n^2) message complexity and using
signatures
• To improve RITAS: option 2, message complexity O(n^2 f)
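The three options can be compared numerically for a given f. The sketch below only evaluates the growth terms quoted above (n^2 and n^2 f), not actual message counts, which depend on constants the talk does not give; the function name `tradeoff_options` is an assumption.

```python
def tradeoff_options(f):
    """Return (option, n, growth term of message complexity, uses signatures)."""
    n_more_processes = 5 * f + 1
    n_min = 3 * f + 1
    return [
        ("more processes", n_more_processes, n_more_processes ** 2, False),
        ("more messages", n_min, n_min ** 2 * f, False),
        ("signatures", n_min, n_min ** 2, True),
    ]


table = tradeoff_options(2)
for name, n, growth, signed in table:
    print(f"{name:15s} n={n:2d} growth term={growth:4d} signatures={signed}")
```

For f=2, option 2 keeps the group at 7 processes without signatures, at the price of a larger growth term than option 3.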
State machine replication revisited
• For decentralized consensus algorithms, best:
– n = 3f+1, O(o) message complexity (n^2 < o = n^2 f), no
signatures
• But for a primary-based SMR like BFT:
– n = 3f+1, O(n^2) message complexity, no signatures
• SMR with n=2f+1:
– Requires distributed “heavy” wormhole
– Decentralized (but not randomized)
• What about a primary-based SMR?
– n=2f+1 ? “Lighter” wormhole?
Conclusions
Conclusions (1)
• Intrusion tolerance: a new paradigm for more
secure distributed systems
• Hybrid system models and Wormholes
– Model reality as sound basis for proofs of correctness
– Enablers for building new algorithms…
– … without getting tied to current devices
• First solution for I-T state-machine replication
with only 2f+1 replicas
Conclusions (2)
• Randomized I-T protocols
– Experimentation contradicted DS folklore
– Protocols are practical
– Local coin protocols are fast/practical but scale worse
than shared-coin protocols
• Primary-based vs decentralized protocols
– Primary-based have to recover from faulty leader
– But decentralized protocols have constraints that do
not apply to primary-based
Thank you. Questions?
http://www.di.fc.ul.pt/~mpc/
http://www.navigators.di.fc.ul.pt/
• Some related publications:
– M Correia, N F Neves, P Veríssimo. How to Tolerate Half Less One Byzantine Nodes
in Practical Distributed Systems. IEEE SRDS 2004
– N F Neves, M Correia, P Veríssimo. Solving Vector Consensus with a Wormhole.
IEEE TPDS 16-12, Dec. 2005
– M Correia, N F Neves, L C Lung, P Veríssimo. Low Complexity Byzantine-Resilient
Consensus. Distributed Computing 17-3, Mar. 2005
– P Veríssimo. Travelling through Wormholes: a new look at Distributed Systems
Models. SIGACT News 37-1, 2006
– M Correia, N F Neves, P Veríssimo. From Consensus to Atomic Broadcast: Time-Free
Byzantine-Resistant Protocols without Signatures. Computer Journal 41-1, Jan. 2006
– H Moniz, N F Neves, M Correia, P Veríssimo. Randomized Intrusion-Tolerant
Asynchronous Services. DSN 2006
– A Bessani, M Correia, H Moniz, N F Neves, P Veríssimo. When 3f+1 is not Enough:
Tradeoffs for Decentralized Asynchronous Byzantine Consensus. DISC 2007