Construction of a Byzantine Linearizable SWMR Atomic Register from SWSR Atomic Registers
Abstract
The SWMR atomic register is a fundamental building block in shared memory distributed systems and implementing it from SWSR atomic registers is an important problem. While this problem has been solved in crash-prone systems, it has received less attention in Byzantine systems. Recently, Hu and Toueg gave such an implementation of the SWMR register from SWSR registers. While their definition of register linearizability is consistent with the definition of Byzantine linearizability of a concurrent history of Cohen and Keidar, it has these drawbacks. (1) If the writer is Byzantine, the register is linearizable no matter what values the correct readers return. (2) It ignores values written consistently by a Byzantine writer. We need a stronger notion of a correct write operation. (3) It allows a value written to just one or a few readers’ SWSR registers to be returned, thereby not validating the intention of the writer to write that value honestly. (4) Its notion of a “current” value returned by a correct reader is not related to the most recent value written by a correct write operation of a Byzantine writer. We need a more up to date version of the value that can be returned by a correct reader. In this paper, we give a stronger definition of a Byzantine linearizable register that overcomes the above drawbacks. Then we give a construction of a Byzantine linearizable SWMR atomic register from SWSR registers that meets our stronger definition. The construction is correct when , where is the number of readers, is the maximum number of Byzantine readers, and the writer can also be Byzantine. The construction relies on a public-key infrastructure.
Keywords: Byzantine fault tolerance, SWMR atomic register, Linearizability, SWSR register
1 Introduction
Implementing shared registers from weaker types of registers is a fundamental problem in distributed systems and has been extensively studied [2, 3, 5, 11, 6, 8, 12, 13, 18, 19, 21, 17, 22, 23, 24]. We consider the problem of implementing a single-writer multi-reader (SWMR) register from single-writer single-reader (SWSR) registers in a system with Byzantine processes. The SWMR register in a Byzantine setting plays an important role in recent research. For example, Mostefaoui et al. [16] prove that in message-passing systems with Byzantine failures, there is a -resilient implementation of a SWMR register if and only if processes are faulty, where is the number of Byzantine processes and is the total number of processes. That work was the first to define a linearizable SWMR register in the presence of Byzantine processes, and Cohen and Keidar [4] generalized the definition to objects of any type. Aguilera et al. [1] use atomic SWMR registers to solve some agreement problems in hybrid systems subject to Byzantine process failures. Cohen and Keidar [4] give -resilient implementations of three objects (asset transfer, reliable broadcast, and atomic snapshots) using atomic SWMR registers in systems with Byzantine failures where at most processes are faulty. Their implementations were based on their definition of Byzantine linearizability of a concurrent history.
In other related work, a SWMR register was built above a message-passing system where processes communicate using send/receive primitives with the constraint that [10, 16]. These works do not use signatures. Unbounded history registers were required in [10] whereas [16] used messages per write operation. Building SWMR registers over SWSR registers is equivalent to building them over message-passing systems, since SWSR registers can be emulated over send/receive and vice versa, but going through such an emulation is a roundabout and expensive solution. A similar problem for the client-server paradigm in message-passing systems was solved in [14] using cryptography.
1.1 Motivation
The SWMR atomic register is seen to be a basic building block in shared memory distributed systems and implementing it from SWSR atomic registers is an important problem. While this problem has been solved in crash-prone systems, it has received recent attention in Byzantine systems. Recently, Hu and Toueg gave such an implementation of the SWMR register from SWSR registers [9]. While their definition of register linearizability is consistent with the definition of Byzantine linearizability of a concurrent history of Cohen and Keidar [4], both [9, 4] as well as [10, 16] have the following drawbacks.
1. If the writer is Byzantine, the register is vacuously linearizable no matter what values the correct readers return. Reads by correct processes can return any value whatsoever, including the initial value, and the register still meets their definition of linearizability. In particular, there is no view consistency. For example, in the Hu-Toueg algorithm, consider a scenario where a Byzantine writer writes different data values associated with the same counter value to the various readers’ SWSR registers. The correct readers will return different data values associated with the same counter value, thus having inconsistent views. An example application where this is a problem is collaborative editing of a document hosted on a single server. This is also problematic because it violates the agreement clause of the well-known consensus/Byzantine agreement problem, which requires that all non-faulty processes agree on the same value even if the source is Byzantine. We require view consistency.
2. Their definition of register linearizability ignores values that a Byzantine writer writes while honestly following the write protocol for those values. We need a stronger notion of a correct write operation that counts such values as being written correctly. Also, note that the Byzantine writer is in control of the execution both above and below the SWMR register interface, and hence the value that it writes in a correct write operation can be assumed to be the value intended to be written (correctly) and not altered by Byzantine behavior.
3. Their definition of register linearizability allows a value written by a Byzantine writer to just a single reader’s SWSR register to be returned by a correct process. To validate that the writer intended to write that value honestly, we require that the same value be written to at least a threshold number of readers’ SWSR registers before it becomes eligible to be returned to a correct reader.
4. In their definition of register linearizability, the notion of a “current” value returned by a correct reader is not related to the most recent value written by a correct write operation of a Byzantine writer. We need a more up-to-date version of the value that can be returned by a correct reader. This gives a stronger guarantee of progress from the readers’ perspective.
Our definition of a Byzantine linearizable register is stronger than not just that of [4, 9] but also that of [10, 16, 14] and overcomes the above drawbacks. Further, we are interested in implementing the SWMR register over SWSR registers directly in the shared memory model.
1.2 Contributions
1. In this paper, we give a stronger definition of a Byzantine linearizable register that overcomes all the above drawbacks of [4, 9] and [10, 16, 14]. We introduce the concept of a correct write operation by a Byzantine writer as one that conforms to the write protocol. We also introduce the notion of a pseudo-correct write operation by a Byzantine writer, which has the effect of a correct write operation. Only correct and pseudo-correct writes may be returned by correct readers. The correct and pseudo-correct writes are totally ordered and this order is the total order in logical time [15, 20] in which the writes are performed.
2. Then we give a construction of a Byzantine linearizable SWMR atomic register from SWSR atomic registers that meets our stronger definition. The construction is correct when , where is the number of readers, is the maximum number of Byzantine readers, and the writer can also be Byzantine. The construction relies on a public-key infrastructure (PKI).
The construction develops the idea of the readers issuing logical read timestamps to the values set aside for them by the writer. Logical global states on the readers’ SWSR registers, akin to consistent cuts in message-passing systems [15], are then constructed. The algorithm logic ensures that values read at/along such global states form a total order, thereby helping to ensure Byzantine register linearizability.
As compared to the algorithm in [9] which can tolerate any number of Byzantine readers, our algorithm requires . Also, in the algorithm in [9], a reader that stops reading also stops taking implementation steps whereas our algorithm requires a reader helper thread to take infinitely many steps even if it has no read operation to apply. The algorithm in [9] as well as our algorithm use a PKI.
Outline: Section 2 gives the system model and preliminaries. Section 3 gives our characterization of a Byzantine linearizable register based on logical time and culminates in the definition of such a register. Section 4 gives our construction of the SWMR Byzantine linearizable register using SWSR registers. Section 5 gives the correctness proof. Section 6 concludes.
2 Model and Preliminaries
2.1 Model Basics
We consider the shared memory model of a distributed system. The system contains a set of asynchronous processes. These processes access some shared memory objects. All inter-process communication is done through an API exposed by the objects. Processes invoke operations that return some response to the invoking process. We assume reliable shared memory but allow for an adversary to corrupt up to processes in the course of a run. A corrupted process is defined as being Byzantine and such a process may deviate arbitrarily from the protocol. A non-Byzantine process is correct and such a process follows the protocol and takes infinitely many steps.
We also assume a PKI. Using this, each process has a public-private key pair used to sign data and verify signatures of other processes. A value signed by process is denoted .
We give an algorithm that emulates an object , viz., a SWMR register from SWSR registers. We assume that there is adequate access control such that a SWSR register can be accessed only by the single writer and the single reader between whom the register is set up, and that another (Byzantine) process cannot access it. The algorithm is organized as methods of . A method execution is a sequence of steps. It begins with the invoke step, goes through steps that access lower-level objects, viz., SWSR registers, and ends with a return step. The invocation and response delineate the method’s execution interval. In an execution , each correct process invokes methods sequentially, and steps of different processes are interleaved. Byzantine processes take arbitrary steps irrespective of the protocol. The history of an execution is the sequence of high-level invocation and response events of the emulated SWMR register in . A history defines a partial order on operations. if the response event of precedes the invocation event of in . is concurrent with if neither precedes the other.
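The precedence relation on operations described above can be sketched as follows. This is a minimal illustration only: it assumes each operation is recorded with a numeric invocation time and response time, and the names `Op`, `precedes`, and `concurrent` are hypothetical and do not appear in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One high-level operation, delimited by its invocation and response events."""
    name: str
    invoke: float    # assumed: time of the invocation event
    response: float  # assumed: time of the matching response event


def precedes(a: Op, b: Op) -> bool:
    """a precedes b if the response of a occurs before the invocation of b."""
    return a.response < b.invoke


def concurrent(a: Op, b: Op) -> bool:
    """Two operations are concurrent if neither precedes the other."""
    return not precedes(a, b) and not precedes(b, a)


# Illustrative history: one write and two reads.
w1 = Op("write(5)", 0.0, 2.0)
r1 = Op("read by reader 1", 1.0, 3.0)   # overlaps the write
r2 = Op("read by reader 2", 4.0, 5.0)   # starts after the write responds
```

Because the relation is defined only through response-before-invocation, it is a partial order: overlapping operations such as `w1` and `r1` are left unordered.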
In our algorithm, we assume that each reader process has a helper thread that takes infinitely many steps even if the reader stops reading the implemented register. These steps are outside the invocation-response intervals of the readers’ own operations. Also, the linearization point of a pseudo-correct write operation may fall after the invocation-response interval. These are non-standard features of our shared memory model.
2.2 Linearizability of a Concurrent History
Linearizability, a popular correctness condition for concurrent objects, is defined using an object’s sequential specification.
Definition 1.
(Linearization of a concurrent history:) A linearization of a concurrent history of object is a sequential history such that:
1. After removing some pending operations from and completing others by adding matching responses, it contains the same invocations and responses as ,
2. preserves the partial order , and
3. satisfies ’s sequential specification.
A SWMR register as well as a SWSR register expose the read and write operations. The sequential specification of a SWMR and a SWSR register states that a read operation from register returns the value last written to . Following Cohen and Keidar [4], we manage Byzantine behavior in a way that provides consistency to correct processes. This is achieved by linearizing correct processes’ operations and offering a degree of freedom to embed additional operations by Byzantine processes.
Let denote the projection of history to all correct processes. History is Byzantine linearizable if can be augmented by (some) operations of Byzantine processes such that the completed history is linearizable. Thus, there is another history with the same operations by correct processes as in , and additional operations by at most Byzantine processes.
Definition 2.
(Byzantine linearization of a concurrent history [4]:) A history is Byzantine linearizable if there exists a history such that and is linearizable.
An object supports Byzantine linearizable executions if all of its executions are Byzantine linearizable. SWMR registers support Byzantine linearizable executions because before every read from such a register, invoked by a correct process, one can add a corresponding Byzantine write.
2.3 Linearizability of Register Implementations
Hu and Toueg defined register linearizability in a system with Byzantine processes as follows [9]. They let be the initial value of the implemented register and be the value written by the th write operation by the writer of the implemented register.
Definition 3.
(Register Linearizability [9]:) In a system with Byzantine process failures, an implementation of a SWMR register is linearizable if and only if the following holds. If the writer is not malicious, then:
• (Reading a “current” value) If a read operation R by a process that is not malicious returns the value then:
  – there is a write operation that immediately precedes R or is concurrent with R, or
  – (the initial value) and no write operation precedes R.
• (No “new-old” inversion) If two read operations R and R’ by processes that are not malicious return values and , respectively, and R precedes R’, then .
While this definition of register linearizability is consistent with the definition of a Byzantine linearization of a concurrent history (Definition 2), in the sense that both are concerned only with correct processes’ views, it is not ideal for the reasons given in Section 1.1. Therefore the register should meet stronger criteria of a linearizable register, in the face of Byzantine processes, to accommodate the behavior of the Byzantine writer when it is behaving (writing) correctly. We term such a register as a Byzantine linearizable register. In this paper, we first define a Byzantine linearizable register, and then solve the problem of constructing a Byzantine linearizable SWMR register from SWSR registers.
3 Characterization of Byzantine Register Linearizability
The object SWMR register supports Byzantine linearizable executions [4]. However, we need to construct a SWMR register from SWSR registers. We assume wlog that there are SWSR registers writable by the single writer and readable by reader . Here we characterize the requirements for such a construction, culminating in Definition 13 of Byzantine Register Linearizability.
The writer as well as the reader processes can be Byzantine. As a Byzantine reader can return any value whatsoever, the linearizability specification is based on values that correct readers return. A Byzantine writer can behave arbitrarily: as part of the same write operation, it can write different values to the SR registers, write different values to different subsets of SR registers while not writing to some of them at all, or write multiple different values over time to some or all of the SR registers. We assume that of the readers are Byzantine. In our characterization, we resort to logical time but present the final definition of the Byzantine linearizable register using physical time.
We refer to a write operation by a timestamp vector of size , where is the logical timestamp assigned to a value of written into . This vector gives the logical time vector of the write operation . As is a logical timestamp, it can equivalently be assigned by reader . Thus, correct readers assign monotonically increasing timestamp values whereas a timestamp that violates this must have been assigned by a Byzantine reader and can be rejected/ignored by correct readers. We define the relations , , and (concurrent) on the set of timestamp vectors in the standard way as follows.
-
•
-
•
-
•
forms a lattice.
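As a minimal sketch, assuming the standard componentwise comparison used for vector clocks, the relations on timestamp vectors and the lattice operations can be written as below. The function names (`leq`, `lt`, `concurrent`, `join`, `meet`) and the use of Python tuples for timestamp vectors are illustrative choices, not the paper's notation.

```python
from typing import Tuple

Vec = Tuple[int, ...]  # assumed representation of a timestamp vector


def leq(u: Vec, v: Vec) -> bool:
    """u <= v: every component of u is at most the matching component of v."""
    return len(u) == len(v) and all(a <= b for a, b in zip(u, v))


def lt(u: Vec, v: Vec) -> bool:
    """u < v: u <= v and the two vectors differ in at least one component."""
    return leq(u, v) and u != v


def concurrent(u: Vec, v: Vec) -> bool:
    """u and v are concurrent when neither is componentwise <= the other."""
    return not leq(u, v) and not leq(v, u)


def join(u: Vec, v: Vec) -> Vec:
    """Least upper bound: componentwise maximum."""
    return tuple(max(a, b) for a, b in zip(u, v))


def meet(u: Vec, v: Vec) -> Vec:
    """Greatest lower bound: componentwise minimum."""
    return tuple(min(a, b) for a, b in zip(u, v))
```

Under this ordering, any two vectors have a join and a meet, which is what makes the set of timestamp vectors a lattice.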
In general, when an object , denoted a high-level object (HLO) is simulated or constructed using objects of another type , denoted a low-level object (LLO), there are two interfaces. A process interacts with the HLO through a high-level interface (HLI) through alternating invocations and matching responses. Between such a pair of matching invocation and response, the process interacts with the LLO through a low-level interface (LLI) using alternating invocations and responses. Such interactions are in software.
For our problem, the HLO is the Byzantine-tolerant SWMR atomic register and the HLI is the read and write operation. The LLO is the SWSR atomic register and the LLI is also the read and write operation. We term the program code executed below the HLI and above the LLI for a single invocation of a write/read at the HLI as the code or protocol for the (HLI) write operation/read operation, respectively.
In the face of Byzantine readers as well as a Byzantine writer, we need to define a correct write operation. In the sequel, we use or to refer to the actual data value written. A invocation at the HLI may be converted at a Byzantine writer into possibly multiple operation invocations for different at the LLI to all or some subset of the various instances of the LLO. If a invocation at the HLI is converted by a Byzantine writer into an invocation of and it executes the protocol exactly for this value , it is considered as a correct write operation because that can be taken to be the value the writer writes or intended to write. Likewise if the at the HLI is converted into multiple serial invocations of (for different values of ) and the protocol for each of these is correctly followed, these various are considered correct write operations because that sequence of write operations can be taken to be the values the writer writes or intended to write. This is because the invocation/response at the HLI is at a Byzantine process which controls the execution of code above the LLI and above the HLI. In a correct write operation, the code between the HLI and the LLI is followed correctly by the Byzantine process.
Definition 4.
A correct write operation is a write operation that follows the write protocol.
So far in the literature [4, 9], any behavior of a Byzantine writer is allowable in the linearizability definition. We accommodate a Byzantine writer differently and introduce the concept of a pseudo-correct write operation (Definition 7), which is a Byzantine write operation that has the effect of a correct write operation. This is first informally motivated as follows. A Byzantine write operation can, for example,
1. write multiple values to the various (possibly resulting in multiple pseudo-correct write operations) or
2. together with earlier write operations write a single value (possibly resulting in a pseudo-correct write operation), or
3. together with earlier write operations that wrote different values write those values (possibly resulting in multiple pseudo-correct write operations).
Thus, there is no longer a one-one mapping from write operations issued to the HLI object interface to values written to the object; it is a many-many mapping. If a pseudo-correct write operation is a result of values written across multiple HLO write operations, we define that pseudo-correct write operation to occur in the latest of those HLO write operations. Its invocation and response are those of that latest HLO operation. Its linearization point occurs when the write takes effect for correct readers and this can happen even after the HLO write response to the HLO write invocation in which the pseudo-correct write occurred. Due to the many-many mapping, one HLO write invocation-response can result in different values being written by the pseudo-correct operations and read/returned by correct readers. This does not pose any ambiguity because the different values that are returned to the correct readers have different logical timestamp vectors.
Definition 5.
A potential pseudo-correct write operation of value is a write operation, timestamped , that may not follow the write protocol but
1. is such that there does not exist any correct write operation timestamp where , and
2. there is a quorum of size indices such that was written to and logically timestamped (equivalently and in practice by reader ).
Definition 6.
A write operation stabilizes if its value is returnable, meaning eligible for being returned, by a correct reader.
A correct write operation always stabilizes whereas a potential pseudo-correct write may stabilize, depending on run-time dynamic data races, steps of Byzantine readers, and the algorithm. Exactly the write operations that stabilize have a linearization point.
Definition 7.
A pseudo-correct write operation is a potential pseudo-correct write operation that stabilizes.
Definition 8.
(Monotonicity/Total Order of stabilized write operation vector timestamps Property:) The set of write operation timestamps that stabilize is totally ordered.
Footnote 5: This definition is not about the monotonicity of vector timestamp values as a function of the physical time order in which their operations stabilize, but about the set of such vector timestamps, which is totally ordered.
If the Monotonicity/Total Order Property is satisfied, of any two potential pseudo-correct writes whose vector timestamps are concurrent, at most one can stabilize.
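The Monotonicity/Total Order Property can be checked on a finite set of timestamp vectors with a short sketch. It assumes the standard componentwise order on vectors; the helper names `leq` and `is_totally_ordered` are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Tuple

Vec = Tuple[int, ...]  # assumed representation of a timestamp vector


def leq(u: Vec, v: Vec) -> bool:
    """Componentwise comparison: u <= v."""
    return len(u) == len(v) and all(a <= b for a, b in zip(u, v))


def is_totally_ordered(timestamps: Iterable[Vec]) -> bool:
    """True if every pair of timestamps is comparable, i.e. no pair is concurrent."""
    return all(leq(u, v) or leq(v, u) for u, v in combinations(list(timestamps), 2))


# A chain of stabilized timestamps: every pair is comparable.
stabilized = [(1, 1, 0), (1, 2, 1), (2, 2, 1)]

# Adding a timestamp that is concurrent with (1, 1, 0) breaks the total order,
# so under the property at most one of the two could stabilize.
with_concurrent = stabilized + [(0, 3, 0)]
```

A checker of this kind is only an analysis aid over a recorded execution; it is not part of the construction itself.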
Definition 9.
(Genuine Advance of Timestamp Property:) For vector timestamps and of correct and pseudo-correct write operations such that , there is index of a correct reader process such that .
In conjunction with the Monotonicity/Total Order Property, the Genuine Advance of Timestamp Property guarantees that there has been progress from to and this progress includes a new write of to a register for a correct process . Thus the progress is not a fake operation reported by a Byzantine reader. Further, the latest values as per at are also .
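The Genuine Advance of Timestamp Property can be illustrated with the sketch below. It assumes that the two timestamps are already ordered, that readers are indexed from 0, and that the set of correct readers is known to the checker (which holds only in an offline analysis, never inside the algorithm). The name `genuine_advance` is hypothetical.

```python
from typing import Set, Tuple

Vec = Tuple[int, ...]  # assumed representation of a timestamp vector


def genuine_advance(t1: Vec, t2: Vec, correct_readers: Set[int]) -> bool:
    """True if some correct reader's component strictly increases from t1 to t2.

    Assumption: t1 precedes t2 in the total order on stabilized timestamps.
    """
    return any(t1[i] < t2[i] for i in correct_readers)


# Four readers; reader 3 is Byzantine in this illustrative scenario.
correct = {0, 1, 2}
t1 = (3, 5, 2, 7)
fake_progress = (3, 5, 2, 9)   # only the Byzantine reader's component moved
real_progress = (3, 6, 2, 9)   # correct reader 1 also observed a new write
```

In the `fake_progress` case the only change is reported by a Byzantine reader, which is exactly the situation the property rules out.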
Let denote the timestamp of a correct or a pseudo-correct write operation, which is the timestamp of a stabilized write operation.
Definition 10.
A consistent timestamp of a write operation is a vector timestamp such that .
The set of consistent timestamps forms a sublattice of . No consistent write timestamp is concurrent with a correct or pseudo-correct write timestamp. Thus it is determined at run-time that every correct or pseudo-correct write timestamp is an inevitable timestamp in . The execution path traced in the lattice passes through these correct and pseudo-correct write operation timestamp states.
Definition 11.
(View Consistency Property:) If a write operation is returned to a correct reader’s read R, it is returned to all correct readers’ read operations that are preceded by R, assuming there are no further writes to after .
Definition 12.
(Total Ordering Property:) If two write operations and are seen by any two correct readers, they are seen in the same order.
The View Consistency and Total Ordering properties together guarantee that all correct readers can see all correct and pseudo-correct writes in the same order.
Only correct and pseudo-correct writes may be returned by correct readers. A correct reader cannot distinguish between a correct and a pseudo-correct write operation. A pseudo-correct write operation timestamped may lose a race due to asynchrony of process executions to a pseudo-correct or correct write operation timestamped where , in which case is not actually returned to any read operation and is deemed to have an invisible linearization point. The correct and pseudo-correct writes are totally ordered by their linearization points, and (if the Monotonicity/Total Order on Vector Timestamps Property is satisfied) this order is (a) the total order on their timestamps, which is (b) the total order in logical time in which these writes were performed, and also (c) the total order in which these write operation timestamped values are encountered in a run-time traversal of the lattice and potentially returned by HLI read operations.
Footnote 6: Note that this total order on the linearization points is not uniquely defined by the natural total order in which operations were issued to the HLI object interface, because, as observed earlier, there is a many-many mapping from the writes issued to the HLI object interface to values written to the object.
In our characterization, we used logical time and LLOs but now present the final definition of the Byzantine linearizable register using physical time and HLOs. Let be the value written by the th correct or pseudo-correct write , following the notation in [7]. Note that to determine , and requires knowing what happened below the HLI and above the LLI because of the nature of pseudo-correct writes; but there is actually no need to determine , , and .
Definition 13.
(Byzantine Linearizable Register). In a system with Byzantine process failures, an implementation of a SWMR register is linearizable if and only if the following two properties are satisfied.
1. Reading a current value: When a read operation R by a non-Byzantine process returns the value :
   (a) if then no correct or pseudo-correct write operation precedes R
   (b) else if then was written by the most recent correct write operation that precedes R or by a later pseudo-correct or correct write operation (either a pseudo-correct write operation, that precedes or overlaps with R, or a correct write operation that overlaps R).
2. No “new-old” inversions: If read operations R and R’ by non-Byzantine processes return values and , respectively, and R precedes R’, then .
In Definition 13, “precedes”, “overlaps”, and “later” are with respect to physical time of HLI invocations and responses. In Case 1b, note that the most recent correct write operation W that precedes R has its linearization point within the physical time duration of W. Later pseudo-correct write operations that precede R or overlap with R may have their linearization points after their response in physical time. In particular, the most recent pseudo-correct write operation that precedes R in physical time may have its linearization point after its physical time duration completes and hence it may be during R or even after R completes.
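The no “new-old” inversions clause can be checked over a recorded set of reads by correct processes, as in the sketch below. It assumes each read is tagged with physical invocation and response times and with the index of the correct or pseudo-correct write whose value it returned (0 for the initial value), and it takes the clause to mean that a later read never returns an older index. The names `Read` and `new_old_inversions` are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import List, Tuple


@dataclass(frozen=True)
class Read:
    invoke: float    # assumed: physical time of the HLI invocation
    response: float  # assumed: physical time of the HLI response
    k: int           # index of the write whose value was returned (0 = initial value)


def new_old_inversions(reads: List[Read]) -> List[Tuple[Read, Read]]:
    """Return every pair (R, R') where R precedes R' but R' returned an older value."""
    return [
        (r1, r2)
        for r1, r2 in permutations(reads, 2)
        if r1.response < r2.invoke and r1.k > r2.k
    ]


# R returns the value of write 2 and completes before R' starts,
# yet R' returns the value of write 1: a new-old inversion.
bad_history = [Read(0.0, 1.0, 2), Read(2.0, 3.0, 1)]

# Overlapping reads may return values in either order without violating the clause.
ok_history = [Read(0.0, 2.0, 2), Read(1.0, 3.0, 1)]
```

Only reads ordered by physical-time precedence are constrained, which is why the overlapping pair in `ok_history` yields no violation.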
Additionally, Monotonicity/Total Order of Vector Timestamps Property, Genuine Advance Property, View Consistency Property, and Total Ordering Property are useful properties that overcome the drawbacks of the previous definition(s) of Byzantine Register Linearizability discussed in Section 1.1.
4 The Algorithm
4.1 Basic Idea and Operation
Because the writer may be Byzantine, a reader cannot unilaterally use the value in its register but needs to coordinate with other readers. However, a reader may go arbitrarily long without invoking a read operation. So we assume that each reader process has a reader helper thread that is always running and participates in this coordination. A correct write operation needs to wait for acknowledgements from the readers so that the written value is guaranteed to be recorded in the algorithm data structures and to stabilize, and is not overwritten before it stabilizes. This guarantees progress by ensuring that the most recent correct write operation advances with time. The algorithm data structures are described in Algorithm 1. The reader helper thread is given in Algorithm 2. is the writer process and denotes the set of readers.
The writer writes , where is the value to be written and is a sequence number assigned by the writer, to all the reader registers and then waits for of the acknowledgement registers to be written with this value.
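The writer's side of this step can be sketched as a single-threaded simulation. This is an illustrative sketch only, not the paper's algorithm: signatures are omitted, the registers are modelled as plain list slots, the acknowledgement threshold is left as a constructor parameter, and all names (`WriterSketch`, `start_write`, `is_complete`, `reader_regs`, `ack_regs`) are hypothetical.

```python
from typing import Any, List, Optional, Tuple

Entry = Tuple[Any, int]  # (value, sequence number)


class WriterSketch:
    """Simulated writer with one SWSR register per reader in each direction."""

    def __init__(self, n_readers: int, ack_threshold: int) -> None:
        self.seq = 0
        # reader_regs[i]: written by the writer, read by reader i.
        self.reader_regs: List[Optional[Entry]] = [None] * n_readers
        # ack_regs[i]: written by reader i, read by the writer.
        self.ack_regs: List[Optional[Entry]] = [None] * n_readers
        self.ack_threshold = ack_threshold  # assumed parameter

    def start_write(self, value: Any) -> Entry:
        """Assign the next sequence number and write (value, seq) to every reader register."""
        self.seq += 1
        entry = (value, self.seq)
        for i in range(len(self.reader_regs)):
            self.reader_regs[i] = entry
        return entry

    def is_complete(self, entry: Entry) -> bool:
        """The write may return once enough acknowledgement registers hold this entry."""
        acks = sum(1 for a in self.ack_regs if a == entry)
        return acks >= self.ack_threshold
```

A real writer would poll the acknowledgement registers until `is_complete` holds; the non-blocking check is used here so the sketch can be exercised without threads.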
The reader helper thread () loops forever. In each iteration, it reads and if the value overwrites the earlier value, it increments its local logical time (the witness time) and writes to all the registers . It then reads each and if the logical time of an entry is larger than the previous logical time read, it stores the value in a local variable (if less than, or equal but the entry is different, is marked as Byzantine). If there are at least entries with identical values then these entries are placed in the local set . This signed by is written to all . Each is read to local variable . If there are at least having identical entries for at least readers , these are placed in local variable , which is then written in all , and is updated with . All the are read and placed in set . All the elements in , i.e., () are totally ordered by a relation , (to be defined in Definition 18), as will be proved in Theorem 4. The latest element in is identified as via a call to and if is different from the local , (a) it is written to all and (b) is updated with the common occurring in identical entries for values of in all the at least in that is .
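One recurring step in the helper thread is detecting that enough entries collected from other readers are identical. A minimal sketch of that test follows; it ignores signatures and logical times, takes the threshold as a parameter, and uses the hypothetical name `quorum_entry`.

```python
from collections import Counter
from typing import Hashable, List, Optional


def quorum_entry(entries: List[Optional[Hashable]], threshold: int) -> Optional[Hashable]:
    """Return an entry that occurs at least `threshold` times, or None if there is none.

    Slots that have not been written yet are represented by None and are skipped.
    """
    counts = Counter(e for e in entries if e is not None)
    if not counts:
        return None
    entry, count = counts.most_common(1)[0]
    return entry if count >= threshold else None


# Entries read from four readers: three agree on ("x", 1), one differs.
collected = [("x", 1), ("x", 1), ("y", 1), ("x", 1)]
```

If the threshold exceeds half the number of slots, at most one entry can satisfy it, so the result is unambiguous; for smaller thresholds the sketch returns only the most frequent entry.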
Observe that registers are needed to ensure that the entries, i.e., their vector timestamps, form a total order; otherwise if two correct processes were to concurrently write their respective s into their rows, the two entries may be partially ordered. This introduces an additional level of indirection in the algorithm.
4.2 Instantiating Framework Definitions in the Algorithm
A reader sees the value written by the most recent correct write operation that precedes the read operation, or it may return a later written value that is written by a correct or pseudo-correct write operation. In the context of our algorithm, the definitions of a correct write operation, a potential pseudo-correct write operation, and a pseudo-correct write operation apply directly. A write operation stabilizes (Definition 16) if the value is potentially returnable by a correct read operation, i.e., when it is written to for all values of (in the form of an containing the required number of correctly signed s). A correct write operation always stabilizes whereas a potential pseudo-correct write operation may stabilize depending on the outcome of concurrency data races and behavior of the writer and Byzantine readers.
A correct write operation always stabilizes because it waits for acknowledgements in before completion, thereby allowing the value it has written to to be eligible for being returned by a read operation. A correct write operation corresponds to a sufficient condition for stabilization. Writing the same value to readers’ in a potential pseudo-correct operation is a necessary condition for stabilization. For the potential pseudo-correct write operation to become a pseudo-correct write operation, the Byzantine readers need to collaborate to allow the value being written to the other correct readers’ to stabilize. Recall that a pseudo-correct write operation may have the value written to the readers’ registers across multiple prior write operations. The set of correct and pseudo-correct writes is exactly the set of writes whose values stabilize (follows from Theorem 3).
We show (Corollary 2) that the values that stabilize are totally ordered by the logical times of their “reading” from the readers’ registers . Specifically, we use to denote the vector of logical times of reading a value from by the various readers , and we denote this total order relation . The notation as opposed to the timestamp vector we introduced in Section 3 is useful because not all readers may report the logical times of their reading from . Thus may not have all components whereas has all components. Roughly speaking, the relation is equivalent to ; the formal definition of is given later in Definition 18. We will also abbreviate as simply . The total order on timestamp vectors of values that stabilize is the total order on the linearization points of correct and pseudo-correct write operations. (As noted in Section 3, the total order in which the values stabilize may be different from this total order because of concurrency races for stabilization between a pseudo-correct write and a pseudo-correct or correct write , where . If stabilizes before , the write of will have an invisible linearization point as it will not be returned to any correct read operation. This follows from the no “new-old” inversions clause in Theorem 8.)
More than one value can stabilize as part of the same HLO write operation. This can happen when, for example, some of the readers’ registers were written to in earlier write operations, or when an HLO invocation-response of a Byzantine write contains multiple pseudo-correct write operations. For any pair of values that stabilize, this total order between them satisfies the Genuine Advance Property if , as we will prove in Theorem 5.
With the introduction of the relation, the definition of register linearizability (Definition 2) is adapted to the algorithm by rephrasing the No ‘‘new-old’’ inversions property as follows.
Definition 14.
(Byzantine Linearizable Register for the algorithm). In a system with Byzantine process failures, an implementation of a SWMR register is linearizable if and only if the following two properties are satisfied.
• Reading a current value: When a read operation R by a non-Byzantine process returns the value :
  – if then no correct or pseudo-correct write operation precedes R.
  – else if then was written by the most recent correct write operation that precedes R or by a later pseudo-correct or correct write operation (either a pseudo-correct write operation, that precedes or overlaps with R, or a correct write operation that overlaps R).
• No “new-old” inversions: If read operations R and R’ by non-Byzantine processes return values and , respectively, and R precedes R’, then .
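As an illustration only, the No “new-old” inversions property can be checked mechanically on a finite history of completed reads. The sketch below is not part of the algorithm: it assumes that each read is recorded with hypothetical invocation and response instants on a global clock (used only for checking), and that the position of each returned value in the total order is already known and given as an integer `rank`. The names `Read`, `precedes` and `find_inversions` are introduced here for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Read:
    reader: str
    start: float  # invocation instant (hypothetical global clock, checking only)
    end: float    # response instant
    rank: int     # assumed position of the returned value in the total order


def precedes(a: Read, b: Read) -> bool:
    """a precedes b if a's response occurs before b's invocation."""
    return a.end < b.start


def find_inversions(reads: List[Read]) -> List[Tuple[Read, Read]]:
    """Return every pair (R, R') where R precedes R' but R' returned an older value."""
    return [(a, b) for a in reads for b in reads
            if precedes(a, b) and b.rank < a.rank]
```

An empty result means the recorded history contains no “new-old” inversion; overlapping reads are never reported, since the property constrains only reads ordered by precedence.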
5 Correctness Proof
Definition 15.
The witness set of an that is formed, , is the maximal set of at least entries common to the at least elements (s) of the .
Those at least identical entries in the intersection of the at least s in the form .
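The intersection described in Definition 15 can be sketched as follows. This is a minimal illustration under stated assumptions: entries are modelled as hashable tuples such as `(reader_id, value, timestamp)`, signature checking is omitted, and the two thresholds are passed in as the hypothetical parameters `min_sets` and `min_entries` rather than fixed to the bounds used in the paper.

```python
from typing import FrozenSet, Hashable, List, Optional

Entry = Hashable  # e.g. a (reader_id, value, timestamp) tuple; this layout is assumed


def common_entries(collected_sets: List[FrozenSet[Entry]],
                   min_sets: int,
                   min_entries: int) -> Optional[FrozenSet[Entry]]:
    """Intersect the collected sets; return None if either threshold is missed.

    min_sets    -- hypothetical lower bound on the number of collected sets
    min_entries -- hypothetical lower bound on the number of common entries
    """
    if not collected_sets or len(collected_sets) < min_sets:
        return None
    common = frozenset.intersection(*collected_sets)
    return common if len(common) >= min_entries else None
```

Returning `None` rather than a partial intersection mirrors the all-or-nothing nature of the definition: below either threshold, no witness set is formed.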
Definition 16.
The field/value common to all the entries in the witness set of an inform set is defined to stabilize when the containing correctly signed s is written to all the for some process .
Definition 17.
For a value that stabilizes with , is the set of tuples for all the at least reader processes for all the at least entries in .
Only a value that stabilizes may be returned by a reader.
Theorem 1.
A correct write operation is guaranteed to stabilize provided .
Proof.
A correct write operation writes the same to for all correct and will not complete unless it gets acks in . It may get acks from Byzantine processes but as , it will need an ack from at least one correct process. A correct process gives the ack only after it has written the value to , i.e., when the value has stabilized. We now show that at least one correct process will have the value stabilize, before which the value written to the will not be overwritten.
For all correct , the witness timestamps are correctly written to . At least correct reader helper threads of correct processes will eventually read from , form their s, sign those sets and write to . Some first correct reader thread will eventually read from and have at least having identical entries for at least processes . It will thus be able to form its , then write it to all , and will then write to after which the correct write operation will complete. Thus, is guaranteed to have stabilized as the will not be overwritten until then. ∎
Theorem 2.
A potential pseudo-correct write operation may stabilize provided .
Proof.
A potential pseudo-correct write operation writes the same value to readers’ , i.e., to at least correct readers’ , across possibly multiple prior write operations and the current write operation. These reader processes write that value, if not overwritten, to . Let the Byzantine processes read the value from their and thereafter behave correctly, as though the value had been written to their . The value can then stabilize if the Byzantine writer does not overwrite the values in () until at least one process forms its of correctly signed s for that value and writes the to . This condition will be satisfied as per the logic in the proof of Theorem 1 (although the writer need not wait for acks), because the Byzantine reader processes are now collaborating and behaving like correct reader processes after having read the value set aside for them in . ∎
Theorem 3.
If a value stabilizes, it must have been written by a correct write operation or by a potential pseudo-correct write operation.
Proof.
A correct write operation stabilizes (Theorem 1). So we need only prove the contrapositive for the remaining case, namely that if a write is not a potential pseudo-correct write operation, it will not stabilize.
If a write is not a potential pseudo-correct operation, it is not written to at least correct processes’ across possibly multiple write operations. Then there is no way a of at least entries can form at at least processes, and hence an of correctly signed s cannot form at any process, correct or Byzantine, and cannot be written to any . If a Byzantine process attempts to write a fake in , that will be detected by correct processes as that written in will not pass the signature test. Hence that value is deemed to not have stabilized. ∎
Proof.
If one correct process returns a value of write , it must have written the corresponding to this to all . If there are no further writes to beyond by the writer, all correct readers’ read operations issued after is returned will return , based on the pseudo-code. Hence View Consistency is satisfied. ∎
If an is formed at reader , it has read from at least s having at least identical entries across those s. Likewise if an is formed at reader . (Recall that those at least identical entries in the intersection of the at least s in the form .) and write and to and , respectively. When a third reader reads these two values and as part of invocation compares and ( and ),
• there are at least reader witness timestamps common to and . As the corresponding processes provided witnesses to both witness sets, any such process would have done so first for and then for or vice-versa.
• there are at least processes that provided (signed) s written to that formed part of both and . Any such process would have written its that formed part of to before it wrote its that formed part of or vice-versa.
Observation 1.
A correct process forms its successive s only in non-decreasing order of source witness timestamps for the at least common witnesses as it writes these s to .
Let sign that is part of . Then let sign that is part of . ( is one of the at least processes that sign and , assuming . Note that these at least processes may not include any correct process.) For all correct , we have the property that all the at least witnesses common to and have a higher or equal witness timestamp in than in . Up to Byzantine processes can sign that is part of and that is intended to be part of such that some of the witness timestamps common to and will have a smaller witness timestamp in than in (while some will have a greater or equal witness timestamp in than in ). Such and , which require a quorum of at least signed s having identical entries for processes, will form at any correct or Byzantine process only if , i.e., . This is because the Byzantine processes would be using fake (out-of-order) witness timestamp entries for at least one of and . To prevent such a quorum or from forming, we require .
Definition 18.
Given and , iff for the at least readers that witnessed values in both and , ’s timestamp ’s timestamp and there is at least one reader that witnessed values in both and and ’s timestamp ’s timestamp.
If for any that witnessed both and the witness timestamps are equal, then and have the same value.
If , we also interchangeably say that for the corresponding values, and .
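The comparison in Definition 18 can be sketched as a function over partial timestamp vectors. In this minimal illustration, a partial vector is modelled as a dictionary from reader identifier to witness timestamp, with unreported readers simply absent; the requirement on the minimum number of common readers is deliberately left out, and the function name `compare` and its string results are assumptions made for the example.

```python
from typing import Dict, Optional

PartialTs = Dict[str, int]  # reader id -> witness timestamp; unreported readers absent


def compare(t1: PartialTs, t2: PartialTs) -> Optional[str]:
    """Compare two partial vectors on the readers that witnessed both.

    Returns "less" if t1 is ordered before t2, "greater" for the converse,
    "equal" if all common timestamps coincide, "incomparable" if neither
    direction holds, and None if no reader witnessed both.
    """
    common = t1.keys() & t2.keys()
    if not common:
        return None
    le = all(t1[r] <= t2[r] for r in common)
    ge = all(t1[r] >= t2[r] for r in common)
    if le and ge:
        return "equal"
    if le:
        return "less"      # all common entries <=, and at least one is strictly <
    if ge:
        return "greater"
    return "incomparable"
```

For example, `compare({"a": 1, "b": 2, "c": 3}, {"a": 1, "b": 3, "d": 9})` considers only readers `a` and `b` and reports `"less"`, because `b`'s timestamp advances while `a`'s does not regress.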
Definition 19.
Given distinct and , iff .
Observation 2.
If then of the at least processes that provided witness timestamps to both and there is a process whose witness timestamp its witness timestamp and there is a process whose witness timestamp its witness timestamp .
Theorem 4.
For s and at any two (possibly different) reader processes, , i.e., , provided .
Proof.
We prove by contradiction. Assume . Given and , from Observation 2 there is a process whose witness timestamp its witness timestamp and there is a process whose witness timestamp its witness timestamp . As noted earlier, a process that provided witness set inputs to both and by writing to could have provided its input for first and then for or vice-versa.
Without loss of generality assume provides the witness set input for before providing the witness set input for . Then consider the process ’s witness timestamps. So let process provide ’s witness timestamp (along with other witness timestamps) which is written to . If is correct, it will not consider/input ’s witness timestamp as . Only up to Byzantine processes can process/input ’s timestamp after but, as , that falls short of the threshold required to form an (of signed s) at any process. So and will not exist.
Likewise if we assume the process provides its witness set input for before providing its witness set input for , then consider process ’s witness timestamps. Let process provide ’s witness timestamp (along with other witness timestamps) which is written to . If is correct, it will not consider/input ’s witness timestamp as . Only up to Byzantine processes can process/input ’s witness timestamp after . But as , each correct or Byzantine process will fall short of the threshold required to form an (of signed s). So and will not exist.
Thus if and form, all processes that provide input witness timestamps to both and provide first to and then to or all provide first to and then to . Thus . ∎
Definition 20.
(Genuine Advance Property in the algorithm:) When , is a genuine advance over if there is at least one correct reader process such that ’s witness timestamp ’s witness timestamp.
Only a value corresponding to an that is written to could be returned by a correct process, and only if the signed s in that pass the signature test. The Genuine Advance Property is useful because it ensures that each new value returned by a correct reader is a new value genuinely written to at least one correct reader ’s and not falsely reported by a Byzantine reader, in addition to that same value being the most recent value in a total of readers ’s registers as per . Another important reason for requiring this property is explained after Theorem 6. Definition 20 is the counterpart of Definition 9.
Theorem 5.
Algorithm 1 satisfies the Genuine Advance Property if .
Proof.
If , then in order that the Genuine Advance Property holds, there is at least one correct reader process such that ’s witness timestamp ’s witness timestamp. For this to happen, we require that two quora of size , which is the minimum number of correct processes having entries in and , intersect among the set of correct processes (having size ). Thus,
∎
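The proof rests on a standard quorum-intersection (pigeonhole) argument. As a generic illustration, not tied to the specific quorum size or bound of the theorem, the sketch below uses the hypothetical symbols `q` (quorum size) and `c` (size of the set the quorums are drawn from) and confirms by brute force that any two size-`q` subsets of a `c`-element set share an element exactly when `2 * q > c`.

```python
from itertools import combinations


def must_intersect(q: int, c: int) -> bool:
    """Pigeonhole test: any two size-q subsets of a c-element set share an element."""
    return 2 * q > c


def must_intersect_bruteforce(q: int, c: int) -> bool:
    """Exhaustively check every pair of size-q subsets (small c only)."""
    universe = range(c)
    return all(set(a) & set(b)
               for a in combinations(universe, q)
               for b in combinations(universe, q))
```

Substituting the number of correct processes for `c` and the minimum number of correct processes with entries in each witness set for `q` gives the shape of the condition used in the proof.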
Only a value corresponding to an that is written to could be returned by a correct process, and only if the signed s in that pass the signature test. From Theorem 4, all the form a total order based on . This leads to the following corollary.
Corollary 1.
The partial vector timestamps of values that stabilize are totally ordered.
Recall that the are partial vectors having entries because not all readers’ entries may be present in the s of the . Let those entries that are not reported have a timestamp value denoted in .
Theorem 6.
Proof.
For every partial timestamp vector of a value that stabilizes in the algorithm, we construct a full timestamp vector as follows. Let and be the corresponding timestamps of the current value that has stabilized. Let be the smallest timestamp greater than to stabilize and let denote the corresponding full vector timestamp we construct.
Initialize:
loop:
Identify
// Invariant 1:
// Invariant 2:
,
endloop
Let denote the (total order) set of partial vector timestamps for values that have stabilized. Let denote the set of corresponding full vector timestamps.
Theorem 7.
is isomorphic to .
Proof.
With respect to Invariant 1,
• let be any index such that ,
• let be any index such that ,
• let be any index such that ,
• let be any index such that and ,
• let be any index such that and .
Invariant 2 follows because of the following.
• For all , ,
• for all , ,
• for all , ,
• for all , ,
• for all , .
From Invariant 1, there must exist at least one index , or one index such that . Hence and Invariant 2 holds.
In order to satisfy the Genuine Advance of Timestamps Property (Definition 9) for , there must exist a correct process index , or satisfying , in the analysis above. This requires 2 quora of processes to intersect among the set of correct processes, requiring as shown in the proof of Theorem 5.
The theorem follows from Invariants 1 and 2. ∎
As is a total order from the proof of Theorem 4, Theorem 7 implies that is also a total order. The Monotonicity/Total Order of Vector Timestamps of Stabilized Writes Property is thus satisfied, and requires as Theorem 4 also requires it.
From the proof of Theorem 7, the Genuine Advance of Timestamps Property over is satisfied by the algorithm, provided . ∎
Note that when , different values written to different correct readers’ registers can be ordered/stabilized in any permutation by the Byzantine processes, but in any given execution only one permutation can occur. However, as the Genuine Advance Property is not satisfied when , as in this case, the Byzantine readers can cause any arbitrary sequence of unbounded length (with each member of the sequence being distinct from the one before it) of these different values to successively stabilize and be returned by correct readers. This is an unbounded sequence of fake writes that can be returned to reads. This is another reason why the Genuine Advance Property is important.
An that is formed at any process is written to and thus stabilizes, by Definition 16. We now have the following corollary to Theorems 4 and 5.
Corollary 2.
Theorem 8.
Algorithm 1 implements a linearizable SWMR register using SWSR registers, provided .
Proof.
We show that the value returned by a read operation satisfies “reading a current value” and “no new-old inversions”.
• Reading a current value: From the algorithm pseudo-code, a read R returns the value such that is the latest value ordered by that has stabilized, up until some point of time during R. By Theorem 3, such a value must have been written by a correct write operation or by a pseudo-correct write operation that has stabilized. From Theorem 1, a correct write always stabilizes before the write operation returns/completes. Therefore, will be the value written by the most recent correct write operation that precedes R, or by a later write operation. The later write operation may be (i) a correct write operation that overlaps with R, or (ii) a pseudo-correct write operation that has stabilized (and which precedes or overlaps R). Thus a current value is read.
• No “new-old” inversions: Let read R by return and let read R’ by return , where R precedes R’. R will write in before returning. R’ will read from , add these elements to and invoke which will return the most recent value as per . It is guaranteed that if then because all the values in are totally ordered by (Theorem 4, Corollary 2) and from Definition 18, implies that the write of had greater (or equal) witness timestamps than the write of for the at least common processes that witnessed both writes. The value returned by R’ is . Thus there are no inversions.
The Genuine Advance Property implicitly needed requires (Theorem 5). ∎
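The write-back argument used for the no “new-old” inversions case can be sketched as follows. This is a simplified model, not the pseudo-code of Algorithm 1: a plain dictionary `shared` stands in for the readers' registers, an integer rank stands in for the total order on stabilized values, and signature checks and thresholds are omitted. The class `Reader` and its fields are hypothetical names for the example.

```python
from typing import Dict, Optional, Set, Tuple

Stamped = Tuple[int, Optional[str]]  # (assumed rank in the total order, value)


class Reader:
    def __init__(self, name: str, shared: Dict[str, Optional[Stamped]]):
        self.name = name
        self.shared = shared  # stand-in for registers readable by other readers

    def read(self, locally_stabilized: Set[Stamped]) -> Optional[str]:
        # Merge what this reader saw stabilize with what earlier reads wrote back.
        candidates = set(locally_stabilized)
        candidates.update(v for v in self.shared.values() if v is not None)
        # Pick the most recent candidate; rank 0 models the initial value.
        chosen = max(candidates, key=lambda rv: rv[0], default=(0, None))
        # Write back before returning, so any later read sees at least this value.
        self.shared[self.name] = chosen
        return chosen[1]
```

Because a read publishes its choice before it returns, any read that starts afterwards includes that choice among its candidates and cannot return a value ordered earlier.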
Let correct reader return to R(i,1) and return to R(i,2), where R(i,1) precedes R(i,2). Let correct reader return to R(j,1) and let it then issue R(j,2), where R(j,1) precedes R(j,2). As the register is Byzantine linearizable (Theorem 8), from the values returned to , we have . If R(j,2) were to return , this would lead to a contradiction. Hence R(j,2) must return or a value with a higher timestamp vector . Thus, the Total Ordering Property cannot be violated by the algorithm. This logic, along with Theorem 5 about the Genuine Advance Property, implicitly needed for linearization, gives the following corollary.
Corollary 3.
The Total Ordering property (Definition 12) of values returned by correct readers is satisfied, provided .
As stated in Section 2, a SWMR register supports Byzantine linearizable executions because before every read operation of a correct process, one can add a corresponding Byzantine write in the linearization [4]. Next, we elaborate on this to give a linearization of an execution to show that the SWMR Byzantine linearizable register we constructed supports Byzantine linearizability of executions. First, read operations of correct readers are linearized in the order of the return values, as per the relation. Then an update of a value by the Byzantine writer is added just before the first read operation that reads that value. In general, the value returned by a read operation is based on the values written by one or more than one HLI write operation as the writer is Byzantine. Further, one HLI write operation by the Byzantine writer may result in different read values being returned by multiple correct readers. To differentiate among the multiple updates of different values over time to the SWMR register, we treat each update, which has stabilized, as an independent Byzantine (correct or pseudo-correct) write operation, associated with its vector timestamp.
Theorem 9.
The SWMR Byzantine linearizable register implemented by Algorithm 1 satisfies Byzantine linearizability of executions.
Proof.
Let be a linearization of the correct readers’ read operations such that (i) the local order of reads at each such reader is preserved, and (ii) if read R returns , read R’ returns , and then R precedes R’ in .
When is returned by a read, the latest write operation to which any of the witness timestamps in belongs is denoted as . In the linearization , place a (Byzantine) write operation writing a value and assigned timestamp immediately before the first read operation R in that reads – this is one of the possibly multiple Byzantine write operations corresponding to the write operation .
The resulting linearization is seen to be a Byzantine linearization that considers all the read operations of correct readers, and includes some Byzantine write operations. In the extreme case, before every correct read operation in , we add a corresponding Byzantine write operation. ∎
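The construction in the proof can be sketched as a small procedure. This is an illustration under stated assumptions: each correct read is given as a tuple `(reader, local sequence number, rank, value)`, where the integer `rank` stands in for the position of the returned value in the total order, and each reader's ranks are assumed to be non-decreasing in its local order (as guaranteed by the absence of “new-old” inversions). The initial value is not treated specially, and the function name `linearize` is hypothetical.

```python
from typing import List, Tuple

ReadOp = Tuple[str, int, int, str]  # (reader, local sequence number, rank, value)


def linearize(reads: List[ReadOp]) -> List[tuple]:
    """Order reads by the rank of their returned value and place one write
    immediately before the first read that returns each value."""
    # Sorting by (rank, reader, local sequence) keeps each reader's local order,
    # given that a reader's ranks never decrease.
    ordered = sorted(reads, key=lambda r: (r[2], r[0], r[1]))
    history, written = [], set()
    for reader, seq, rank, value in ordered:
        if rank not in written:
            written.add(rank)
            history.append(("write", rank, value))  # added Byzantine write
        history.append(("read", reader, seq, value))
    return history
```

In the extreme case where every read returns a distinct value, the output alternates one added write with one read, matching the remark at the end of the proof.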
Additionally, the algorithm satisfies Monotonicity/Total Order of Stabilized Vector Timestamps (Def. 8 & Theorem 6, ), Genuine Advance of Timestamps (Def. 9 & Theorem 6, ), View Consistency (Def. 11 & Lemma 1, ), Total Ordering (Def. 12 & Corollary 3, ).
Space Complexity
The algorithm uses shared SWSR registers: registers of size , registers of size , registers of size . It also uses shared SWSR registers: registers of size , and registers of size .
The local space at each reader process can be seen to be .
6 Conclusions
This paper studied Byzantine tolerant construction of a SWMR atomic register from SWSR atomic registers. It is the first to propose a definition of Byzantine register linearizability by non-trivially taking into account Byzantine behavior of the writer and readers, and by overcoming the drawbacks of the definition used by previous works. We introduced the concept of a correct write operation by a Byzantine writer. We also introduced the notion of a pseudo-correct write operation by a Byzantine writer, which has the effect of a correct write operation. Only correct and pseudo-correct writes may be returned by correct readers. The correct and pseudo-correct writes are totally ordered by their linearization points and this order is the total order in logical time in which the writes were performed. We then gave an algorithm to construct a Byzantine tolerant SWMR atomic register from SWSR atomic registers that meets our definition of Byzantine register linearizability.
References
- [1] Marcos K. Aguilera, Naama Ben-David, Rachid Guerraoui, Virendra J. Marathe, and Igor Zablotchi. The impact of RDMA on agreement. In Peter Robinson and Faith Ellen, editors, Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC 2019, Toronto, ON, Canada, July 29 - August 2, 2019, pages 409–418. ACM, 2019.
- [2] Hagit Attiya, Amotz Bar-Noy, and Danny Dolev. Sharing memory robustly in message-passing systems. J. ACM, 42(1):124–142, 1995.
- [3] James E. Burns and Gary L. Peterson. Constructing multi-reader atomic values from non-atomic values. In Fred B. Schneider, editor, Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, Vancouver, British Columbia, Canada, August 10-12, 1987, pages 222–231. ACM, 1987.
- [4] Shir Cohen and Idit Keidar. Tame the wild with byzantine linearizability: Reliable broadcast, snapshots, and asset transfer. In Seth Gilbert, editor, 35th International Symposium on Distributed Computing, DISC 2021, October 4-8, 2021, Freiburg, Germany (Virtual Conference), volume 209 of LIPIcs, pages 18:1–18:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
- [5] Sibsankar Haldar and K. Vidyasankar. Constructing 1-writer multireader multivalued atomic variable from regular variables. J. ACM, 42(1):186–203, 1995.
- [6] Maurice Herlihy. Wait-free synchronization. ACM Trans. Program. Lang. Syst., 13(1):124–149, 1991.
- [7] Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming. Morgan-Kaufmann, 2008.
- [8] Maurice Herlihy and Jeannette M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst., 12(3):463–492, 1990.
- [9] Xing Hu and Sam Toueg. On implementing SWMR registers from SWSR registers in systems with byzantine failures. In Christian Scheideler, editor, 36th International Symposium on Distributed Computing, DISC 2022, October 25-27, 2022, Augusta, Georgia, USA, volume 246 of LIPIcs, pages 36:1–36:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022.
- [10] Damien Imbs, Sergio Rajsbaum, Michel Raynal, and Julien Stainer. Read/write shared memory abstraction on top of asynchronous byzantine message-passing systems. J. Parallel Distributed Comput., 93-94:1–9, 2016.
- [11] Amos Israeli and Amnon Shaham. Optimal multi-writer multi-reader atomic register. In Norman C. Hutchinson, editor, Proceedings of the Eleventh Annual ACM Symposium on Principles of Distributed Computing, Vancouver, British Columbia, Canada, August 10-12, 1992, pages 71–82. ACM, 1992.
- [12] Leslie Lamport. On interprocess communication. part I: basic formalism. Distributed Comput., 1(2):77–85, 1986.
- [13] Leslie Lamport. On interprocess communication. part II: algorithms. Distributed Comput., 1(2):86–101, 1986.
- [14] Dahlia Malkhi and Michael K. Reiter. Secure and scalable replication in phalanx. In The Seventeenth Symposium on Reliable Distributed Systems, SRDS 1998, West Lafayette, Indiana, USA, October 20-22, 1998, Proceedings, pages 51–58. IEEE Computer Society, 1998.
- [15] Friedemann Mattern. Virtual time and global states of distributed systems. In Parallel and Distributed Algorithms, pages 215–226. North-Holland, 1988.
- [16] Achour Mostéfaoui, Matoula Petrolia, Michel Raynal, and Claude Jard. Atomic read/write memory in signature-free byzantine asynchronous message-passing systems. Theory Comput. Syst., 60(4):677–694, 2017.
- [17] Richard E. Newman-Wolfe. A protocol for wait-free, atomic, multi-reader shared variables. In Fred B. Schneider, editor, Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, Vancouver, British Columbia, Canada, August 10-12, 1987, pages 232–248. ACM, 1987.
- [18] Gary L. Peterson. Concurrent reading while writing. ACM Trans. Program. Lang. Syst., 5(1):46–55, 1983.
- [19] Gary L. Peterson and James E. Burns. Concurrent reading while writing II: the multi-writer case. In 28th Annual Symposium on Foundations of Computer Science, Los Angeles, California, USA, 27-29 October 1987, pages 383–392. IEEE Computer Society, 1987.
- [20] Michel Raynal and Mukesh Singhal. Logical time: Capturing causality in distributed systems. Computer, 29(2):49–56, 1996.
- [21] Ambuj K. Singh, James H. Anderson, and Mohamed G. Gouda. The elusive atomic register revisited. In Fred B. Schneider, editor, Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, Vancouver, British Columbia, Canada, August 10-12, 1987, pages 206–221. ACM, 1987.
- [22] K. Vidyasankar. Converting lamport’s regular register to atomic register. Inf. Process. Lett., 28(6):287–290, 1988.
- [23] K. Vidyasankar. A very simple construction of 1-writer multireader multivalued atomic variable. Inf. Process. Lett., 37(6):323–326, 1991.
- [24] Paul M. B. Vitányi and Baruch Awerbuch. Atomic shared register access by asynchronous hardware (detailed abstract). In 27th Annual Symposium on Foundations of Computer Science, Toronto, Canada, 27-29 October 1986, pages 233–243. IEEE Computer Society, 1986.