1 Introduction

Multi-party computation (MPC) allows a number of parties to compute a function on their respective sensitive inputs without leaking anything but the computation result. Recently, there has been a lot of interest in concretely efficient actively secure MPC in the honest-majority setting with abort, in which fewer than n/2 out of n parties may be corrupted. In this setting, very efficient solutions are known and it is also possible to achieve fairness, i.e., either all parties learn the result or none do, which is not possible without an honest majority.

A number of recent works have achieved particularly striking performance numbers. Binary circuits can be evaluated at a cost of sending 10 bits per AND gate for three parties due to [11], and arithmetic circuits can be evaluated at a cost of sending 4 (for \(n=3\)), \(5(n-1)\), or 42 field elements per multiplication due to [16]. However, this still leaves at least a factor four communication increase compared to passive security. Moreover, these best known protocols unfortunately do not satisfy fairness (unlike other honest-majority protocols).

In this work, we improve on the state-of-the-art of concretely efficient honest-majority MPC by further decreasing communication complexity, while also supporting fairness. Concerning communication complexity, we decrease communication in the three main variants of the protocol of Lindell and Nof [16] by factors of approximately 2, 5, and 7, respectively. In all cases, the gap between passive and active security becomes only a factor 2. Moreover, in the three-party setting, the best protocol now requires sending just two messages per party per multiplication. Some of this improvement comes from better use of PRNGs; a more significant improvement comes from applying the tool of batchwise multiplication verification [2], a technique that allows checking that many multiplications have been performed correctly by essentially checking a single multiplication.

We additionally provide a novel three-party protocol, based on the SPDZ protocol [8], that reduces online communication from 2 in our protocol described above to \(\frac{4}{3}\) messages per party per multiplication. This comes at the expense of requiring a preprocessing phase with \(\frac{5}{3}\) messages per party per multiplication. Our SPDZ-based protocol also makes heavy use of PRNGs and batchwise multiplication verification, but additionally incorporates the idea of taking a two-party protocol in the preprocessing model, and replacing the distributed preprocessing protocol by in-the-plain preprocessing by a third party. This idea was known before but, as far as we know, has never been applied; we extend this idea by allowing the preprocessing to be spread evenly between the three parties. By way of comparison, in the two-party dishonest-majority setting, a recent SPDZ variant by Keller et al. [14] requires the equivalent of around 130 field elements to be sent per party, highlighting the communication gap between the honest- and dishonest-majority settings.

In both our Lindell-Nof-based and our SPDZ-based protocol, the decrease in communication implies an increase in computation, but we show that in many practical settings, communication is still the bottleneck.

Finally, we show how to add fairness to both of our constructions. We employ general principles to achieve fairness such as using signature-based broadcast for agreement and MACs or signatures to prevent output manipulation. Our solutions are especially crafted to ensure that they add as little practical overhead as possible; in particular, they do not affect the above communication complexity results. This means that communication-efficient, actively secure MPC is possible in practice without having to sacrifice fairness.

1.1 Outline

We discuss preliminaries in Sect. 2, before presenting our Lindell-Nof-based and SPDZ-based constructions in Sects. 3 and 4, respectively. We give a brief performance analysis in Sect. 5.

1.2 Related Work

Several recent works are closely related to this paper. Concerning efficient honest-majority MPC, the most relevant work is the framework for communication-efficient MPC from [16] that forms the basis of our first protocol. It is also the closest competitor in terms of overall communication complexity that we are aware of. Another recent honest-majority MPC framework is due to [7]. Although their construction is quite a bit less communication-efficient than ours, it does work for arbitrary rings as opposed to just fields. They also provide a (less efficient) construction for fairness largely based on the same principles as ours.

Concerning the technique of batchwise multiplication verification, the groundwork was laid out in several earlier works. Ben-Sasson et al. [2] first proposed batchwise multiplication verification. As discussed below, there it was used to get an asymptotic result; we are not aware of works using it to improve practical performance. Works such as the Pinocchio verifiable computation system [19] and the Trinocchio protocol that combines it with MPC [20] were a main inspiration to start seeing batchwise multiplication verification also as a tool that may deliver practical efficiency. Corrigan-Gibbs and Boneh [4] first proposed to use batchwise multiplication verification where one party provides data and a number of other parties verify it, as in our SPDZ-based protocol; but there it is not for performing the MPC but for checking its inputs.

2 Preliminaries

In this section, we present our notation and the security model for honest-majority MPC with abort, and the main technique we will use to minimise its communication: batchwise multiplication verification.

2.1 Notation and Security Model

The protocols in this paper are for n parties \(\mathcal {P}=\{\mathcal {P}_1,\mathcal {P}_2,\ldots ,\mathcal {P}_n\}\), where an adversary may statically corrupt a minority of up to t parties, i.e., \(2t<n\). We generally work in the field \(\mathbb Z_p\) for some prime \(p > 2^{\sigma }\), where \(\sigma \) is a statistical security parameter. We use \([{x}]\) to denote a Shamir secret sharing of x; \(\left[ \![{x}\right] \!]\) to denote an additive sharing; and \(\left\langle {x}\right\rangle =(\left[ \![{x}\right] \!],\left[ \![{\alpha x}\right] \!])\) to denote a SPDZ sharing consisting of an additive sharing of the value and its MAC. \([x]_i\), \(\left[ \![{x}\right] \!]_i\), \(\left\langle {x}\right\rangle _i\) refer to shares held by party \(\mathcal {P}_i\). We heavily use pseudorandom number generators (PRNGs) to sample random data. For a pseudorandom number generator \({\mathsf {prng}}\) we use the notation \(r \leftarrow {\mathsf {prng}}\) to indicate sampling r uniformly at random (from the relevant domain). \([a,b]\) denotes the interval \(\{a,a+1,\ldots ,b\}\).
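
To make the three sharing notations concrete, the following Python sketch builds a Shamir sharing \([x]\), an additive sharing \(\left[ \![{x}\right] \!]\) and a SPDZ sharing \(\left\langle {x}\right\rangle \) over \(\mathbb Z_p\). It is an illustration only: the prime \(p=2^{61}-1\), the evaluation points \(1,\ldots ,n\), all function names, and the use of Python's `random` module in place of a cryptographic PRNG are assumptions made here and are not taken from the protocols in this paper.

```python
import random

P = 2**61 - 1  # illustrative prime; the text only requires p > 2^sigma

def shamir_share(x, n, t, rng):
    """Shamir sharing [x]: party i holds f(i) for a random degree-t f with f(0) = x."""
    coeffs = [x % P] + [rng.randrange(P) for _ in range(t)]
    return [sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def shamir_reconstruct(points):
    """Lagrange interpolation at 0 from a list of (party index, share) pairs."""
    total = 0
    for i, yi in points:
        num, den = 1, 1
        for j, _ in points:
            if j != i:
                num = num * (-j) % P
                den = den * (i - j) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def additive_share(x, n, rng):
    """Additive sharing [[x]]: n values that sum to x modulo p."""
    shares = [rng.randrange(P) for _ in range(n - 1)]
    return shares + [(x - sum(shares)) % P]

def spdz_share(x, alpha, n, rng):
    """SPDZ sharing <x> = ([[x]], [[alpha * x]]) for a MAC key alpha."""
    return additive_share(x, n, rng), additive_share(alpha * x % P, n, rng)
```

In this sketch any \(t+1\) Shamir shares determine x, whereas an additive sharing needs all n shares; the SPDZ sharing simply pairs two additive sharings.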

We define security in the traditional standalone security model from [3] as adapted in [16]. Security in this model is captured by demanding indistinguishability of the real-world protocol execution to an ideal-world execution with a trusted third party. In the real-world model, the protocol is run between honest parties in the presence of a non-uniform probabilistic polynomial time adversary \(\mathcal {A}\) that acts on behalf of the corrupted parties. We assume a synchronous network with pairwise private channels and a rushing adversary (that receives its messages in each round before it sends them). A party may abort, meaning it sends a special abort message to all parties, who abort in the next round. An execution of a protocol \(\pi \) in this model with inputs \(x_1,\ldots ,x_n\), adversarial auxiliary input z and security parameter \(\kappa \) is denoted \(\text {Real}_{\pi ,\mathcal {A}(z),\mathcal {C}}(x_1,\ldots ,x_n,\kappa )\). This is a tuple containing the outputs of the honest parties and an arbitrary output chosen by the adversary.

The ideal-world model defines what an idealised protocol execution looks like, in which the computation is performed by an incorruptible trusted party executing a certain functionality. The functionality defines the exact security guarantees; we will define variants with and without fairness. In the ideal-world model, the trusted party executes the functionality in the presence of the honest parties and a non-uniform probabilistic polynomial time adversary \(\mathcal {S}\).

The functionalities for fair and non-fair MPC both start with each party \(\mathcal {P}_i\) sending its input \(x_i\) to the trusted party. The adversary may choose an arbitrary input for corrupted parties and may also provide \(\bot \) to indicate an abort. The trusted party computes output y as specified by f, or sets \(y=\bot \) if the adversary supplied \(\bot \). In the fair variant, the trusted party sends the outputs to all of the parties. In the non-fair variant, the trusted party sends y to the adversary who returns \(c\in \{\top ,\bot \}^n\). For each party \(\mathcal {P}_i\), if \(c_i=\top \) the trusted party sends y to \(\mathcal {P}_i\), otherwise it sends \(\bot \). Ideal-world executions with these functionalities are denoted \(\text {Ideal}_{f,\mathcal {S}(z),\mathcal {C}}(x_1,\ldots ,x_n,\kappa )\) or \(\text {Ideal}^\text {fair}_{f,\ldots }(\ldots )\): a tuple containing the outputs of the honest parties and an arbitrary output chosen by the adversary.
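
The difference between the two functionalities can be sketched in a few lines of Python. This is a simplified model of the description above, not a formal definition: the function name `ideal_execution`, the boolean flag `adversary_abort`, the callback `deliver` (the adversary's choice of c) and the use of `None` for \(\bot \) are all assumptions made for illustration.

```python
BOT = None  # stands for the abort symbol ⊥ (illustrative choice)

def ideal_execution(f, inputs, fair, adversary_abort=False, deliver=None):
    """Trusted party; returns (outputs of parties 1..n, value shown to the adversary)."""
    n = len(inputs)
    if adversary_abort:
        # The adversary supplied ⊥ as input, so y = ⊥ for everybody.
        return [BOT] * n, BOT
    y = f(inputs)
    if fair:
        # Fair variant: the output goes to all parties unconditionally.
        return [y] * n, y
    # Non-fair variant: the adversary sees y first and then picks c in {⊤,⊥}^n.
    c = deliver(y)
    return [y if c_i else BOT for c_i in c], y
```

The only place where the adversary can withhold an output after having seen it is the last branch, which is exactly what the fair variant removes.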

Security is defined as indistinguishability between real-world and ideal-world executions. Precisely, we say that a protocol \(\pi \) securely computes f with statistical security parameter \(\sigma \) for honest majority if, for every adversary \(\mathcal {A}\), there exists a simulator \(\mathcal {S}\) such that, for all \(x_i,z,\mathcal {C}\) with \(\left| \mathcal {C}\right| \le t\), the distinguishing probability between \(\text {Ideal}_{f,\mathcal {S}(z),\mathcal {C}}(x_1,\ldots ,x_n,\kappa )\) and \(\text {Real}_{\pi ,\mathcal {A}(z),\mathcal {C}}(x_1,\ldots ,x_n,\kappa )\) is at most \(2^{-\sigma }+\mu (\kappa )\) for some \(\mu \) negligible in \(\kappa \). Protocol \(\pi \) fairly computes f with statistical security parameter \(\sigma \) for honest majority if the same holds with respect to \(\text {Ideal}^\text {fair}_{f,\ldots }(\ldots )\). Security can also be defined more generally for any functionality \(\mathcal {F}\). As is well-known, we can design protocols containing calls to an ideal functionality \(\mathcal {F}\) (in the so-called \(\mathcal {F}\)-hybrid model) and then replace the ideal functionality by a secure protocol implementing it [3].

The above model describes standalone executions with synchronous communication, but we believe that neither limitation is inherent to our protocol. In asynchronous models, unlike above, there is no global round clock. In general, synchronous protocols can be made asynchronous by having each party confirm to all other parties that it has received all messages for round t, and only proceeding to send messages for round \(t+1\) after receiving all confirmations [15], but this is of course costly. We expect that such confirmations are only necessary at a few points in our protocol. In composable models, unlike the standalone model above, protocols are proven secure also in the presence of simultaneous other protocols and protocol instances. Here, we note that we use only black-box non-rewinding simulators, so adding “start synchronisation” should be enough to achieve composability [15]. We leave details for future work.

2.2 Batchwise Multiplication Verification

Batchwise multiplication verification was introduced in [2] to improve the asymptotic complexity of verifying preprocessed multiplication triples over small fields. Standard multiplication checks, e.g. based on sacrificing, scale with the security parameter (which is larger than the field size), but using batchwise multiplication verification, these costs can be spread over a batch.

In particular, given secret-shared values \([a_1]\), \(\ldots \), \([a_N]\), \([b_1]\), \(\ldots \), \([b_N]\), \([c_1]\), \(\ldots \), \([c_N]\), the goal is to verify that \(c_i=a_i \cdot b_i\) for all i. This is done by translating these N equalities of field elements into a single equality of polynomials, and verifying this equality based on the Schwartz-Zippel lemma [21, 22]. Fix distinct nonzero \(\omega _1,\ldots ,\omega _{2N-1}\), and let A(x), B(x) be of degree \(\le N-1\) such that for \(i\in [1,N]\), \(A(\omega _i)=a_i\) and \(B(\omega _i)=b_i\). If \(c_i=a_i\cdot b_i\) for all i and we let \(C(x)=A(x)B(x)\), then obviously \(C(\omega _i)=c_i\) for \(i\in [1,N]\), but the converse is also true: if there exists a polynomial C(x) of degree \(\le 2N-2\) such that \(C(x)=A(x)B(x)\) and \(C(\omega _i)=c_i\) for \(i\in [1,N]\), this implies \(c_i=a_i\cdot b_i\).

In batchwise multiplication verification, first, C(x) is constructed by computing \(C(\omega _j)=A(\omega _j)\cdot B(\omega _j)\), \(j\in [N+1,2N-1]\) using passively secure MPC and deriving its coefficients by interpolation. Then, A, B, and C are evaluated at a random point \(s \not \in \{\omega _1, \ldots , \omega _{2N-1}\}\). This can be done with local linear operations given shares of the \(a_i\), \(b_i\), \(c_i\), and \(C(\omega _j)\). Finally, a multiplication check protocol is run to check that \(A(s)\cdot B(s)=C(s)\). The Schwartz-Zippel lemma states that for a non-zero polynomial P of degree d over a field \(\mathcal {F}\) and a random \(r \in S\) for a finite \(S \subseteq \mathcal {F}\), the probability that \(P(r) = 0\) is at most d/|S|. Thus if \(A(s)\cdot B(s) = C(s)\) then with high probability, \(A(x) \cdot B(x)=C(x)\) as polynomials and hence \(a_i \cdot b_i=c_i\). Note that for each triple, an additional passively secure multiplication is needed, but the multiplication check is performed only once per batch, giving the asymptotic advantage.
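
The steps above can be traced in the clear, without any secret sharing, in the following Python sketch. It is meant only to show the polynomial identity being tested. The prime \(p=2^{61}-1\), the choice \(\omega _j=j\), and the names `lagrange_eval` and `batch_check` are assumptions made for this illustration; in the protocol the same linear operations are applied to shares, and the products at \(\omega _{N+1},\ldots ,\omega _{2N-1}\) come from passively secure multiplications.

```python
import random

P = 2**61 - 1  # illustrative prime field (assumption)

def lagrange_eval(xs, ys, s):
    """Evaluate at s the unique polynomial of degree < len(xs) through the points (xs, ys)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (s - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def batch_check(a, b, c, rng):
    """Accept iff A(s) * B(s) == C(s) at a random s outside the interpolation points."""
    N = len(a)
    omega = list(range(1, 2 * N))  # omega_1 .. omega_{2N-1}, here simply 1 .. 2N-1
    # Extra values C(omega_j) = A(omega_j) * B(omega_j) for j = N+1 .. 2N-1.
    extra = [lagrange_eval(omega[:N], a, w) * lagrange_eval(omega[:N], b, w) % P
             for w in omega[N:]]
    s = rng.randrange(2 * N, P)    # random point that is neither 0 nor any omega_j
    A_s = lagrange_eval(omega[:N], a, s)
    B_s = lagrange_eval(omega[:N], b, s)
    C_s = lagrange_eval(omega, list(c) + extra, s)
    return A_s * B_s % P == C_s
```

A batch in which every \(c_i=a_i\cdot b_i\) always passes; a batch with a single wrong \(c_i\) passes only if s happens to be a root of a non-zero polynomial of degree at most \(2N-2\).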

In [4], the above idea is used in a different setting: some party provides inputs to MPC, and we want to verify that inputs satisfy a certain property. This property is phrased in terms of a number of multiplications of linear combinations of inputs, and the multiplications are checked similarly to above. In this case, the inputter determines and provides the “witness” values \(C(\omega _j)\) proving that the multiplications are correct, and the computing parties again use a simple protocol to check that \(A(s)\cdot B(s)=C(s)\). It is also shown there that the various polynomial computations can be performed efficiently using FFTs.

3 Lindell-Nof with Fewer Messages and More Fairness

In this section, we show how to reduce the communication complexity of the Lindell-Nof protocol for honest-majority MPC [16] and how to add fairness. We outline their construction (Sect. 3.1); plug in batchwise multiplication verification (Sect. 3.2); analyse and further reduce communication complexity (Sect. 3.3); finally, we show how to achieve fairness and discuss two other improvements (Sect. 3.4).

3.1 The Lindell-Nof Construction

Lindell and Nof present a framework for efficient actively secure MPC with an honest majority [16]. The basic observation underlying this framework is that many passively secure MPC protocols are “actively secret”, essentially meaning that an active attack can break correctness of the computation, but not privacy. Hence, to perform a computation in an actively secure way, one can simply perform the computation using a passively secure protocol and, prior to opening the result, retrospectively check that all multiplications have been performed correctly (multiplications being the only operations that require interaction).

In slightly more detail, the Lindell-Nof construction uses t-out-of-n secret sharing, such as Shamir secret sharing or replicated secret sharing. The protocol starts with all parties secret-sharing their inputs, and checking whether they are “correct”, in the sense that the shares of all honest parties reconstruct to a unique value. Next, a passively secure MPC is executed, with linear operations performed locally on shares and multiplication using known protocols for Shamir by Gennaro et al. [12], Damgård and Nielsen [6] and for replicated secret sharing by Araki et al. [1]. We will refer to these three multiplication methods as GRR, DN and AFL+ respectively. Finally, the correctness of the performed multiplications is checked using one of two possible methods, and if this check passes, the secret shares of the output are reconstructed to obtain the output. Overall, this gives active security without fairness with relatively little communication.

3.2 Plugging in Batchwise Multiplication Verification

We now show how batchwise multiplication verification can be used to efficiently implement the multiplication check in the Lindell-Nof protocol. As discussed above, the multiplication check is called at the end of the protocol to check correctness of a number of passively secure multiplications performed before.

Our protocol performing this multiplication check is shown in Fig. 1. The protocol uses functionalities \(\mathcal {F}_\textsc {Rand} \) for generating a sharing \([r]\) of a random \(r \in \mathbb Z_p\) and \(\mathcal {F}_\textsc {Coin} \) for generating a public field element \(r \in \mathbb Z_p{\setminus }\{0\}\) known to all parties as described in [16]. Moreover, it uses a passively secure multiplication protocol that, as described by Lindell and Nof [16], needs to be “secure up to additive attacks”, meaning that the adversary can manipulate its result only by adding an offset to it. The GRR, DN and AFL+ protocols mentioned above all meet this requirement.

Our multiplication check protocol follows the basic idea of [2], but avoids its actively secure \(A(s)\cdot B(s)=C(s)\) check. We add a random multiplication triple \((a_N,b_N,c_N)\) to the batch of triples and choose s uniformly at random from \(\mathbb Z_p\). Then, the values of A(s), B(s), C(s) are uniformly random and can be opened so that the check \(A(s)\cdot B(s)=C(s)\) can be performed in the plain. (Note that this option was not available to the authors of [2] since they need s from an extension field so A(s), B(s), C(s) are not uniform.)
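
The reason the opened values reveal nothing about the original triples can be made explicit with a short derivation. This is our own sketch, under the assumption (made in the proof of Proposition 1) that s avoids the interpolation points; the Lagrange coefficients \(\lambda _i\) are notation introduced only for this remark. Writing A(x) in the Lagrange basis over \(\omega _1,\ldots ,\omega _N\),

\[ A(s)=\sum _{i=1}^{N} a_i\,\lambda _i(s), \qquad \lambda _i(s)=\prod _{j\in [1,N],\, j\ne i}\frac{s-\omega _j}{\omega _i-\omega _j}. \]

Since \(s\notin \{\omega _1,\ldots ,\omega _{N-1}\}\), we have \(\lambda _N(s)\ne 0\), so for fixed \(a_1,\ldots ,a_{N-1}\) the map \(a_N\mapsto A(s)\) is a bijection on \(\mathbb Z_p\). A uniformly random \(a_N\) therefore makes A(s) uniformly random, and the same argument applies to B(s) with \(b_N\).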

Fig. 1. Batchwise multiplication check for Lindell-Nof

We now prove correctness of our multiplication check. In Lindell-Nof, correctness of their multiplication check is shown in [16, Lemma 3.9]. We prove that the same result holds for our multiplication check, implying that it can be used as a drop-in replacement in their protocol. Actually, our result is slightly more complete since we do not just prove correctness but also privacy of the multiplication check. In the full version, we use this result for a self-contained proof of an optimised version of the Lindell-Nof protocol.

Proposition 1

Suppose shares \(([a_i],[b_i])_{i=1}^{N-1}\) are correct and \(([c_i])_{i=1}^{N-1}\) are valid, and that \([\cdot ]\leftarrow [\cdot ]\cdot [\cdot ]\) is a multiplication protocol secure up to additive attack. There exists a simulator that, on input \(\varDelta _i:=c_i-(a_i\cdot b_i)\) and the shares held by the corrupted parties, simulates an execution of the protocol from Fig. 1 with respect to an active adversary corrupting a minority of parties with statistical distance at most negligibly greater than \((2N-2)/(|\mathbb Z_p|-2N)\). In particular, if any \(\varDelta _k\ne 0\), then the honest parties output accept with at most this probability; if all \(\varDelta _k=0\) then honest parties fail or succeed at the will of the adversary.

Proof

The simulator proceeds as follows. The simulator first simulates the generation of random \([a_N]\) and \([b_N]\) and the computation of \([c_N]\), \([a_{N+1}],\ldots ,[a_{2N-1}]\), \([b_{N+1}],\ldots ,[b_{2N-1}]\), \([c_{N+1}],\ldots ,[c_{2N-1}]\), learning the errors \(\varDelta _{N},\ldots ,\varDelta _{2N-1}\) to the \(c_i\) introduced by the adversary (which is possible since the protocol is secure up to additive attack). It then simulates the generation of s and the computation of [A(s)], [B(s)], and [C(s)]. Let D(x) be of degree \(\le 2N-2\) such that \(D(\omega _1)=\varDelta _{1},\ldots ,D(\omega _{2N-1})=\varDelta _{2N-1}\). If \((\varDelta _{1},\ldots ,\varDelta _{2N-1})\ne \mathbf 0 \) but \(D(s)=0\), the simulator aborts. Otherwise, it generates random A(s) and B(s), and lets \(C'(s)=A(s)\cdot B(s)\) and \(C(s)=C'(s)+D(s)\). It simulates the opening of [A(s)] to A(s), [B(s)] to B(s), and [C(s)] to C(s). Finally, it lets the honest parties output \(\texttt {success}\) if \(D(s)= 0\) and the adversary provides the correct shares of [A(s)], [B(s)], [C(s)], and fail otherwise.

We argue that this simulation is indeed indistinguishable. For this, we need to check that the view of the adversary and the outputs of the honest parties in the simulation are indistinguishable from a real execution. Concerning the view of the adversary, note that the values A(s) and B(s) that are opened are uniformly random because of the inclusion of the random \([a_{N}]\), \([b_{N}]\). Given these values A(s) and B(s), \(C'(s)=A(s)\cdot B(s)\) is the value that is opened for [C(s)] if all multiplications are correct. By linearity of the computation of C(s), given A(s) and B(s) the value the adversary expects for [C(s)] is \(C'(s)+D(s)\). Hence, the simulation of the multiplication check is indistinguishable to the adversary and its success implies \((\varDelta _{1},\ldots ,\varDelta _{2N-1})=\mathbf 0 \), unless \((\varDelta _{1},\ldots ,\varDelta _{2N-1})\ne \mathbf 0 \) and \(D(s)=0\). But D(s) is the evaluation of a polynomial of degree at most \(2N-2\) at a random point from \(\mathbb Z_p\setminus \{0,\omega _1,\ldots ,\omega _{2N-1}\}\), so by the Schwartz-Zippel lemma, if \((\varDelta _{1},\ldots ,\varDelta _{2N-1})\ne \mathbf 0 \) then \(D(s)=0\) with probability at most \((2N-2)/(|\mathbb Z_p|-2N)\). Hence, except with this probability, the adversary cannot make wrong multiplications pass the check, so also the honest parties’ outputs are indistinguishable.    \(\square \)

Corollary 1 (Informal)

The protocol for computing an arithmetic circuit over a finite field from [16] with the batchwise multiplication check from Fig. 1 computes any n-party functionality f with computational security in the presence of a malicious adversary controlling up to \(t<n/2\) corrupted parties.

In the full version of this paper, we present an optimised and slightly simplified version of the Lindell-Nof protocol and prove its security in detail.

3.3 Performance Analysis and Optimisation with PRNGs

Table 1 shows how the amount of communication in the Lindell-Nof protocol is reduced by batchwise multiplication verification, and how it can be further reduced with PRNGs. As mentioned above, Lindell and Nof give three concrete instantiations of their protocol based on the GRR, DN and AFL+ multiplication protocols respectively [1, 6, 12]. (The exact variants of the protocols used for this comparison are given in the full version of this paper.) They instantiate three core operations, multiplying, opening shared values and generating a random shared value, and use them in two multiplication checks. The first check uses 2 multiplications, 2 random values and 3 openings; the second check uses 6 multiplications and 3 random values. In GRR, the first check is used; in DN, the second check is used; and in AFL+, a slight optimisation of the first check is used, leading to the given performance in Table 1.

As shown, using batchwise multiplication verification, checking a multiplication requires essentially one additional multiplication. As a result, using it instead of either of the Lindell-Nof multiplication checks reduces communication by a factor 2 to 3.5. The constant cost of the check (hidden behind the \(\gtrsim \) symbol in the table) is spread over the triples in a batch and is small: e.g., for \({\le }10\) parties the batch size needed to make the overhead less than one is always less than 50, and to make it less than 0.1 it is less than 500. As shown in Sect. 5, this is possible without affecting computational complexity too much.

Table 1. Field elements sent per party for the Lindell-Nof protocol instantiated with GRR, DN (both with or without PRNG optimizations) and AFL+ (with PRNG optimization). The number of parties and the threshold is denoted by n and t respectively (generally \(n\approx 2t\)). Grey areas are our results

Using PRNGs, we can reduce communication in the GRR and DN constructions even further. For instance, consider the re-sharing of values that takes place in GRR multiplication: instead of sending shares to each party, the dealing party can simply set the shares of t parties by pairwise PRNGs between itself and the recipients so that it only needs to send \(n-t-1\) shares, halving communication if \(n=2t+1\). This idea is of course not new, but it is still important for us since applying it reduces communication in the Shamir constructions by an additional factor of at least two. In particular, using PRNGs, the Shamir-based construction with GRR becomes as communication-efficient as the PRNG-based construction. Details appear in the full version of this paper.
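
The re-sharing trick can be sketched as follows in Python. The sketch is a simplified illustration, not the exact variant used in the full version: the function names, the `seeds` dictionary modelling pairwise PRNG seeds, the choice of which t parties receive PRNG-derived shares, and the use of `random.Random` as a stand-in for a cryptographic PRNG are all assumptions. The dealer fixes t shares from the PRNGs, interpolates the degree-t polynomial through those points and the secret, and only has to send the remaining shares.

```python
import random

P = 2**61 - 1  # illustrative prime field (assumption)

def interpolate(points, x):
    """Evaluate at x the polynomial of degree < len(points) through the given points."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def prng_share(secret, n, t, dealer, seeds):
    """Return (all n shares, the shares the dealer actually has to send)."""
    others = [i for i in range(1, n + 1) if i != dealer]
    prng_parties = others[:t]
    # Dealer and recipient i both derive this share locally from their common seed.
    points = [(0, secret % P)]
    points += [(i, random.Random(seeds[i]).randrange(P)) for i in prng_parties]
    shares = {i: interpolate(points, i) for i in range(1, n + 1)}
    sent = {i: shares[i] for i in others[t:]}  # n - t - 1 messages
    return shares, sent
```

With \(n=5\) and \(t=2\), the dealer sends 2 shares instead of 4, which is the halving mentioned above for \(n=2t+1\).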

3.4 Further Improvements

Adding Fairness. To achieve fairness, we first let the parties reach agreement on whether to produce an output. Once there is agreement, we let the parties derive the output in such a way that the adversary cannot force a failure anymore.

To reach agreement, we use detectable broadcast [10]. Detectable broadcast lets a party send a message to all parties so that either all parties receive the same message, or all parties agree that the broadcast has failed. In our case, the adversary may cause this failure after seeing the value to be broadcast. Unlike full broadcast, it can be achieved over private channels without set-up assumptions. Essentially, [10] achieves detectable broadcast by letting each party once pick and distribute a public key, and performing a pairwise check if all parties consistently sent out their keys. After this setup, broadcasts are performed with the standard Dolev-Strong protocol [9]. In our protocol, parties detectably broadcast their shares of A(s), B(s), and C(s) in the last round of the multiplication check; the parties decide to produce an output only if all parties have successfully broadcast a value; all shares consistently reconstruct to some values A(s), B(s), and C(s); and \(A(s)\cdot B(s)=C(s)\).

To derive the output, we need to ensure that honest parties can detect wrong values sent by corrupted parties. If there are only few parties, each party \(\mathcal {P}_i\) can input a random information-theoretic MAC key \(\alpha _i,\beta _i\) into the MPC (with PRSS, this requires no communication) and the parties compute MAC \(\alpha _i\cdot x+\beta _i\) on output x. After the multiplication check, all parties send their shares of x and \(\alpha _i x+\beta _i\) to \(\mathcal {P}_i\), who selects whichever reconstructed x has a correct MAC. For many parties, this technique is not secure since it costs \(\log ((t+1)\binom{n-1}{t})\approx n\) bits of security; for that case see the full version of this paper.
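
The selection step performed by \(\mathcal {P}_i\) can be illustrated with a minimal Python sketch. It assumes that candidate pairs (x, m) have already been reconstructed from subsets of the received shares; the names `mac` and `select_output` and the prime \(p=2^{61}-1\) are hypothetical choices made for this example.

```python
import random

P = 2**61 - 1  # illustrative prime field (assumption)

def mac(alpha, beta, x):
    """Information-theoretic MAC alpha * x + beta over Z_p."""
    return (alpha * x + beta) % P

def select_output(candidates, alpha, beta):
    """Return the first reconstructed x whose MAC verifies, or None if none does."""
    for x, m in candidates:
        if mac(alpha, beta, x) == m:
            return x
    return None
```

Each wrong candidate passes with probability about 1/p, so the loss in security grows with the number of candidates that \(\mathcal {P}_i\) has to try, which is the effect described above for many parties.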

Efficient Inner Products. One particularly appealing property of MPC based on secret sharing schemes like Shamir and replicated secret sharing is that they allow inner products \([c]=\sum _{i = 1}^{l}[a_i]\cdot [b_i]\) to be computed at the cost of a single multiplication. Such multiplication protocols first locally perform the multiplication (turning t-out-of-n shared inputs into a 2t-out-of-n sharing of the product) and then re-share the result (turning the product from a 2t-out-of-n sharing back into a t-out-of-n sharing). To compute an inner product, several local multiplications are first summed up and then the result is re-shared.
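
The local part of this computation can be sketched for Shamir sharing as follows. The sketch is illustrative only: the prime, the evaluation points \(1,\ldots ,n\) and the function names are assumptions, and the re-sharing step that brings the result back to a t-out-of-n sharing is omitted.

```python
import random

P = 2**61 - 1  # illustrative prime field (assumption)

def share(x, n, t, rng):
    """Degree-t Shamir sharing of x; party i (1-based) holds f(i)."""
    coeffs = [x % P] + [rng.randrange(P) for _ in range(t)]
    return [sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P
            for i in range(1, n + 1)]

def reconstruct(shares):
    """Interpolate at 0 from the shares of parties 1..len(shares)."""
    pts = list(enumerate(shares, start=1))
    total = 0
    for i, yi in pts:
        num, den = 1, 1
        for j, _ in pts:
            if j != i:
                num = num * (-j) % P
                den = den * (i - j) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def local_inner_product(a_sh, b_sh):
    """Each party sums the products of its own shares, with no communication.

    The result is a 2t-out-of-n sharing of the inner product."""
    n = len(a_sh[0])
    return [sum(a[i] * b[i] for a, b in zip(a_sh, b_sh)) % P for i in range(n)]
```

Because the sum is taken before re-sharing, only one value per party has to be re-shared regardless of the length l of the inner product.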

We can make such inner product computations actively secure by generalising batchwise multiplication verification to verify many inner products of the same length. Instead of generating two random values and computing their product, we generate 2l random values and compute their inner product. We then define polynomials \((A_i(x))_{i=1}^{l}\), \((B_i(x))_{i=1}^{l}\), C(x) in the natural way; exchange shares of \((A_i(s))_{i=1}^{l}\), \((B_i(s))_{i=1}^{l}\), C(s); and check whether \(\sum _{i=1}^{l}A_i(s)B_i(s)=C(s)\). This gives the same security guarantees as batchwise multiplication verification.

Smaller Fields. Because of the false positive rate of the Schwartz-Zippel lemma, our construction requires a field of size at least \(2N\cdot 2^\sigma \), where \(\sigma \) is the statistical security parameter. When working over a smaller field, the multiplication check can be performed repeatedly. This way, statistical security can be boosted arbitrarily: repeating the check k times increases statistical security from \(\log ((|\mathbb Z_p|-2N)/(2N-2))\) to \(\log (\binom{|\mathbb Z_p|-2N}{k}/\binom{2N-2}{k})\) bits. Note that repeated checking can be done more efficiently than by just repeating the full check as follows. Instead of adding one random triple to a batch of multiplications, we add k of them; and instead of generating one random challenge s, we generate k challenges \(s_i\) and evaluate \(A(s_i)\), \(B(s_i)\), and \(C(s_i)\) for \(i=1,\ldots ,k\). (These can be opened because of the inclusion of the k random triples.)
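
To get a feeling for these numbers, the bound can be evaluated directly. The following Python helper is a small calculation aid, not part of the protocol; the function name and the example parameters (\(p=2^{31}-1\), \(N=1024\)) are assumptions chosen for illustration.

```python
from math import comb, log2

def security_bits(p, N, k=1):
    """Bits of statistical security log2( C(p - 2N, k) / C(2N - 2, k) ) for k repetitions."""
    return log2(comb(p - 2 * N, k)) - log2(comb(2 * N - 2, k))
```

For the example parameters, `security_bits(2**31 - 1, 1024)` is roughly 20 bits and `security_bits(2**31 - 1, 1024, k=2)` is roughly 40 bits, so each additional challenge adds about another \(\log (|\mathbb Z_p|/2N)\) bits.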

4 SPDZ with an Untrusted Dealer

In this section, we present a protocol for honest-majority 3PC. The main contribution is a communication-efficient protocol implementing the preprocessing phase for the 2PC SPDZ protocol using batchwise multiplication verification to check the correctness of Beaver triples generated locally by a third-party dealer \(\mathcal {P}_3\). In the online phase, two parties \(\mathcal {P}_1, \mathcal {P}_2\) use the preprocessed values in the regular two-party SPDZ to compute the desired function. Using a small addition to the online SPDZ protocol, based on ideas from [13], we can allow the dealer to provide input to and receive output from the 2PC protocol, thus giving an actively secure 3PC protocol in the honest-majority setting. We leave these modifications as an exercise.

We note that the resulting 3PC protocol is highly asymmetric: in the preprocessing phase \(\mathcal {P}_3\) does most of the work, while in the online phase \(\mathcal {P}_1, \mathcal {P}_2\) do all the work. To better utilise resources across all three parties, we also develop a load-balanced version of the protocol. This works by letting each party play the role of the dealer in separate runs of the preprocessing phase. In the online phase, we then partition the multiplications to be performed into three sets to be evaluated by each pair of parties in a 2PC fashion. The overall communication per multiplication required in both versions is 5 field elements for the preprocessing phase and 4 field elements in the online phase (as per the regular 2PC SPDZ protocol). Thus, using the load-balanced version of the protocol, we get on average 5/3 and 4/3 field elements per party in the preprocessing and online phases, respectively. We defer the load-balanced version of the protocol to the full version of the paper. In this section we focus on our protocol for the SPDZ preprocessing phase.

We note that, compared to our Lindell-Nof-based protocol, the protocol presented in this section communicates three additional field elements per multiplication. However, the online phase communicates two field elements fewer than the Lindell-Nof-based protocol. Thus, in the setting where preprocessing is available, our SPDZ-based protocol is preferable.

4.1 Data Needed for the Online Phase

Before we describe our protocol for the preprocessing phase, we first summarise the data that should be generated. We use \(\left\langle {a}\right\rangle = (\left[ \![{a}\right] \!], \left[ \![{\alpha a}\right] \!])\) to denote a SPDZ sharing of \(a \in \mathbb {Z}_p\), where the sharing is between the parties \(\mathcal {P}_1, \mathcal {P}_2\). Here \(\alpha \in \mathbb {Z}_p\) is a random MAC key fixed at initialisation and unknown to both \(\mathcal {P}_1, \mathcal {P}_2\), but which they share additively. The shared value \(\alpha a\) is an information-theoretic MAC on a, which is used in the online phase to ensure active security.

The online phase of SPDZ needs preprocessed multiplication triples and input masks. A multiplication triple consists of SPDZ sharings \((\left\langle {a}\right\rangle , \left\langle {b}\right\rangle , \left\langle {c}\right\rangle )\) where \(a, b \in \mathbb Z_p\) are random values and \(c = ab\). In the online phase each multiplication consumes one triple. An input mask is a pair \((r, \left\langle {r}\right\rangle )\) for a random value \(r \in \mathbb Z_p\) known to, say, \(\mathcal {P}_1\). In the online phase each input provided by \(\mathcal {P}_1\) consumes one such mask. For security in the online phase we require that the preprocessed data be correct in the sense that the shared values and their MACs obey the correlations described above. Furthermore, the shared values should be unknown and random in the view of any corrupt party participating in the online phase (i.e., either \(\mathcal {P}_1\) or \(\mathcal {P}_2\)). We describe the ideal functionality more formally in the full version of the paper.

4.2 Preprocessing Phase

The basic idea of our protocol is to let \(\mathcal {P}_3\) generate all the preprocessed data locally and send the appropriate shares to \(\mathcal {P}_1, \mathcal {P}_2\). Batchwise multiplication verification is then used to check that \(\mathcal {P}_3\) generated the multiplication triples correctly, and a separate check is used to verify that the MACs are correct. To save communication, our protocol relies heavily on joint PRNGs \({\mathsf {prng}}_{i, j}\) between each pair of parties \(\mathcal {P}_i, \mathcal {P}_j\) in order to share values non-interactively.

Our protocol \(\varPi _\textsc {Deal}\) implementing the preprocessing phase is described in detail in Figs. 2 and 3. In Fig. 2 we show how the protocol is initialised by using the joint PRNGs to sample a random MAC key \(\alpha \) in such a way that \(\alpha \) is unknown to all parties but is additively secret shared between each pair of parties \(\mathcal {P}_i,\mathcal {P}_j\). Additionally, \(\mathcal {P}_1\) and \(\mathcal {P}_2\) use \({\mathsf {prng}}_{1,2}\) to sample a challenge \(s_{1,2}\) used for multiplication checks.

In Fig. 2 we also describe two subprotocols which will be used throughout the \(\varPi _\textsc {Deal}\) protocol. These protocols use the PRNGs to non-interactively generate a random additive sharing \(\left[ \![{r}\right] \!]\) between \(\mathcal {P}_1, \mathcal {P}_2\), where r is known to \(\mathcal {P}_3\) (4a of Fig. 2), and, given any such shared r, an additive sharing \(\left[ \![{\alpha r}\right] \!]\) of \(\alpha r\) between all the parties (4b of Fig. 2). Note that this means that by sending \(\mathcal {P}_3\)’s share \(\left[ \![{\alpha r}\right] \!]_3\) of \(\alpha r\) to, say, \(\mathcal {P}_1\), we can trivially compute a SPDZ sharing \(\left\langle {r}\right\rangle \) by adding \(\left[ \![{\alpha r}\right] \!]_3\) to \(\left[ \![{\alpha r}\right] \!]_1\). In the protocol we slightly abuse notation in this case by saying that \(\mathcal {P}_1\) updates her share \(\left[ \![{\alpha r}\right] \!]_1 = \left[ \![{\alpha r}\right] \!]_1 + \left[ \![{\alpha r}\right] \!]_3\). Note that this requires \(\mathcal {P}_3\) to send exactly one field element per SPDZ sharing.
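
The following Python sketch illustrates the two ideas in this paragraph: deriving shares from pairwise PRNGs without communication, and folding the dealer's MAC share into the share of \(\mathcal {P}_1\). It is not a transcription of steps 4a and 4b of Fig. 2. The class `Prng`, the seeds, the modulus and, in particular, the way the three-way sharing of \(\alpha r\) is produced are assumptions made only for illustration.

```python
import hashlib

P = 2**61 - 1  # illustrative prime; not one of the paper's moduli


class Prng:
    """Toy counter-mode generator over SHA-256, standing in for a joint prng_{i,j}."""

    def __init__(self, seed: bytes):
        self.seed = seed
        self.counter = 0

    def next_field(self) -> int:
        block = hashlib.sha256(self.seed + self.counter.to_bytes(8, "big")).digest()
        self.counter += 1
        return int.from_bytes(block, "big") % P


# Hypothetical seeds: P3 knows both, P1 and P2 know one each.
SEED_13, SEED_23 = b"seed of P1 and P3", b"seed of P2 and P3"

# Every party runs its own copy of each generator it holds the seed for.
p1_view, p3_view_13 = Prng(SEED_13), Prng(SEED_13)
p2_view, p3_view_23 = Prng(SEED_23), Prng(SEED_23)

r1 = p1_view.next_field()  # share of P1, no message sent
r2 = p2_view.next_field()  # share of P2, no message sent
r_dealer = (p3_view_13.next_field() + p3_view_23.next_field()) % P  # P3 learns r
assert (r1 + r2) % P == r_dealer

# Stand-in for a three-way additive sharing of alpha * r (NOT the real step 4b):
# alpha is used in the clear here, which never happens in the protocol.
alpha = 123456789
m1, m2 = p1_view.next_field(), p2_view.next_field()
m3 = (alpha * r_dealer - m1 - m2) % P

# P3 sends m3 to P1 (one field element); P1 folds it into its own share.
m1_folded = (m1 + m3) % P
assert (m1_folded + m2) % P == alpha * r_dealer % P
```

The only message in this sketch is the single field element `m3`, matching the cost of one field element per SPDZ sharing stated above.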

In Fig. 3 we describe how to generate and verify the actual preprocessed data to be used in the online phase. Multiplication triples are generated by first using the 4a and 4b subprotocols to generate \(\left\langle {a}\right\rangle \) and \(\left\langle {b}\right\rangle \) as described above. \(\mathcal {P}_3\) then computes \(c = ab\) and additively shares it among the parties; applying 4b to c then yields its MAC. This requires \(\mathcal {P}_3\) to send four field elements.

For a batch of triples \((\left\langle {a_i}\right\rangle , \left\langle {b_i}\right\rangle , \left\langle {c_i}\right\rangle )_{i=1}^{N-1}\) the multiplicative property \(a_ib_i = c_i\) is verified using batch multiplication verification similar to the Lindell-Nof case above. In this case we let the dealer \(\mathcal {P}_3\) compute and additively share (without MACs) the values \(c_{N+1} = C(\omega _{N+1}), \ldots , c_{2N - 1} = C(\omega _{2N - 1})\), as in [4]. \(\mathcal {P}_1, \mathcal {P}_2\) verify the multiplications by checking the polynomials evaluated at the challenge point s generated at initialisation. Again we can open A(s), B(s), C(s) by sacrificing one triple. The check requires a single field element to be sent per triple and an additional element per batch of \(N-1\) triples. Overall, a total of 5 field elements are sent to generate each multiplication triple and verify its multiplicative property, plus one additional field element per batch.
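
To make the polynomial identity behind this check concrete, here is a minimal Python sketch that works on values in the clear. It ignores secret sharing, MACs and the sacrificed triple, and it assumes the evaluation points \(\omega _i = i\) and an illustrative modulus; the function names are hypothetical. A and B are the polynomials of degree at most \(N-1\) through the \(a_i\) and \(b_i\), and C is the polynomial of degree at most \(2N-2\) through all \(2N-1\) values \(c_i\).

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (assumption)


def lagrange_eval(xs, ys, s):
    """Evaluate at s the polynomial of degree < len(xs) through the points (xs[i], ys[i])."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (s - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def honest_extra(a, b):
    """Dealer side: the extra values C(w_{N+1}), ..., C(w_{2N-1}) with C = A * B."""
    n = len(a)
    pts = list(range(1, n + 1))
    return [lagrange_eval(pts, a, w) * lagrange_eval(pts, b, w) % P
            for w in range(n + 1, 2 * n)]


def batch_check(a, b, c_first, c_extra, s):
    """Accept iff A(s) * B(s) == C(s), with assumed evaluation points 1, ..., 2N-1."""
    n = len(a)
    pts = list(range(1, 2 * n))
    a_s = lagrange_eval(pts[:n], a, s)
    b_s = lagrange_eval(pts[:n], b, s)
    c_s = lagrange_eval(pts, c_first + c_extra, s)
    return a_s * b_s % P == c_s


n = 4
a = [secrets.randbelow(P) for _ in range(n)]
b = [secrets.randbelow(P) for _ in range(n)]
c = [x * y % P for x, y in zip(a, b)]
s = secrets.randbelow(P - 2 * n) + 2 * n  # challenge outside the evaluation points

assert batch_check(a, b, c, honest_extra(a, b), s)

bad = c[:]
bad[0] = (bad[0] + 1) % P  # one wrong product
assert not batch_check(a, b, bad, honest_extra(a, b), s)
```

In this toy setting a single wrong product makes \(A \cdot B - C\) a non-zero polynomial of degree at most \(2N-2\), so it vanishes at a random challenge only with probability about \(2N/p\), which is why one evaluation can vouch for the whole batch.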

Input masks are simply generated by first using the 4a and 4b subprotocols to generate \(\left\langle {r}\right\rangle \), and then letting \(\mathcal {P}_3\) send the value r to the party using the input mask. This requires sending two field elements for each input mask.

Finally, \(\mathcal {P}_1, \mathcal {P}_2\) must check that all the MACs resulting from invocations of the 4b subprotocol are correct. We do this using a protocol similar to the MAC check subprotocol of the regular SPDZ protocol. Essentially, the parties take a pseudorandom linear combination of all the shared values generated, and check that the MACs are consistent with the result. This takes constant communication.
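
The idea of checking many MACs through one random linear combination can be sketched as follows. This is a simplified illustration in the spirit of the SPDZ MAC check, not the subprotocol of Fig. 3: commitments, masking of the opened combination and the derivation of the coefficients from a PRNG are all omitted, and the names `share2` and `mac_check` are assumptions.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (assumption)


def share2(v):
    """Additively share v between two parties."""
    s1 = secrets.randbelow(P)
    return [s1, (v - s1) % P]


def mac_check(value_shares, mac_shares, alpha_shares, coeffs):
    """value_shares[k][i] and mac_shares[k][i] are party i's shares of x_k and alpha * x_k."""
    parties = range(len(alpha_shares))
    # Each party combines its value shares locally; the combination y is then opened.
    y = sum(sum(r * xs[i] for r, xs in zip(coeffs, value_shares)) for i in parties) % P
    # sigma_i = (combined MAC share of party i) - alpha_i * y; the sigmas must sum to zero.
    sigmas = [
        (sum(r * ms[i] for r, ms in zip(coeffs, mac_shares)) - alpha_shares[i] * y) % P
        for i in parties
    ]
    return sum(sigmas) % P == 0


alpha = secrets.randbelow(P)
alpha_sh = share2(alpha)
values = [11, 22, 33]
value_sh = [share2(v) for v in values]
mac_sh = [share2(alpha * v % P) for v in values]
coeffs = [secrets.randbelow(P - 1) + 1 for _ in values]  # non-zero coefficients

assert mac_check(value_sh, mac_sh, alpha_sh, coeffs)

mac_sh[1][0] = (mac_sh[1][0] + 1) % P  # tamper with one MAC share
assert not mac_check(value_sh, mac_sh, alpha_sh, coeffs)
```

Only the combination y and the values \(\sigma _i\) are exchanged, independently of how many sharings are checked, which is consistent with the constant communication noted above.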

The intuition for security of the protocol goes as follows. Consider first a corrupt \(\mathcal {P}_i\) for \(i \in \{1,2\}\), i.e., one of the parties that will run the online phase. In this case, the dealer \(\mathcal {P}_3\) is honest and only deals correct random additive shares, which do not reveal information on the shared values. Furthermore, since \(\mathcal {P}_i\) only sends messages in the protocols checking correctness of the dealt shares, \(\mathcal {P}_i\) can only influence the protocol by making it abort (which we allow anyway), but cannot influence any of the shared values. Thus the preprocessed data will be correct and \(\mathcal {P}_i\) will not get information on the shared values. Consider then a corrupt dealer \(\mathcal {P}_3\). By the security of the multiplication verification and MAC check, if the protocol does not abort, then with overwhelming probability the preprocessed data will be correct. \(\mathcal {P}_3\) will learn all values shared in the preprocessing phase, but since these are independent of the parties’ inputs to the online phase and since \(\mathcal {P}_3\) does not directly participate in the online phase of the protocol, this does not leak any private information.

In the full version we prove security more formally, giving this result:

Corollary 2 (Informal)

Combining \(\varPi _\textsc {Deal}\) with the 2PC online phase of SPDZ and the outsourced MPC additions of [13] leads to an overall protocol that computes any 3-party functionality f with computational security in the presence of a malicious adversary controlling at most one corrupted party.

Fig. 2. Protocol \(\varPi _\textsc {Deal}\)

Fig. 3. Protocol \(\varPi _\textsc {Deal}\) (cont’d)

4.3 Variants and Extensions

Fairness. Fairness is easily achieved in the load-balanced variant of the protocol described in the full version, similarly to the Lindell-Nof case. Essentially, each party \(\mathcal {P}_i\) inputs MAC keys \(\alpha _i,\beta _i\) and a mask \(\delta _i\) (for which we can use input masks). Then, \(\alpha _i x+\beta _i\) and \(x+\delta _i\) are opened to the other two parties. These values are checked with the SPDZ MAC check and then provided to \(\mathcal {P}_i\). The SPDZ MAC check needs to be performed such that everybody agrees on its result, which essentially means that we need to compute a sum \(\sum \left[ \![{\sigma }\right] \!]_1+\left[ \![{\sigma }\right] \!]_2+\left[ \![{\sigma }\right] \!]_3\) in a fair way. This can be done by letting each party secret-share its summand in a digitally signed way and letting the other parties forward these secret shares, similarly to Dolev-Strong broadcast. We omit the details for lack of space.

Preprocessing Other Material. Apart from multiplication triples, other random data can be preprocessed in order to speed up specific computations in the SPDZ online phase. For example, Damgård et al. [5] show how to preprocess random square pairs \(\left\langle {a}\right\rangle , \left\langle {a^2}\right\rangle \) for random a. In the online phase \(\left\langle {z}\right\rangle = \left\langle {x^2}\right\rangle \) can be computed from \(\left\langle {x}\right\rangle \) by revealing \(\varepsilon = x - a\) and setting \(\left\langle {z}\right\rangle = 2\varepsilon \left\langle {x}\right\rangle + \left\langle {a^2}\right\rangle - \varepsilon ^2\), which requires only half the communication of regular online multiplications. Our dealer-based protocol allows such material to be generated very efficiently.
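
The squaring formula can be checked with a short Python sketch on additive shares. MACs are left out, and the modulus and the helper names `share2`, `open_shares` and `square_with_pair` are assumptions for illustration only.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (assumption)


def share2(v):
    """Additively share v between two parties."""
    s1 = secrets.randbelow(P)
    return [s1, (v - s1) % P]


def open_shares(shares):
    return sum(shares) % P


def square_with_pair(x_sh, a_sh, a2_sh):
    """Compute shares of x^2 from shares of x and a square pair (a, a^2)."""
    # Each party publishes its share of x - a; this is the only opening needed.
    eps = open_shares([(xs - a_s) % P for xs, a_s in zip(x_sh, a_sh)])
    # The linear part 2 * eps * <x> + <a^2> is applied share by share ...
    z_sh = [(2 * eps * xs + a2s) % P for xs, a2s in zip(x_sh, a2_sh)]
    # ... and the public constant eps^2 is subtracted by one party only.
    z_sh[0] = (z_sh[0] - eps * eps) % P
    return z_sh


a = secrets.randbelow(P)
x = 9
z_sh = square_with_pair(share2(x), share2(a), share2(a * a % P))
assert open_shares(z_sh) == 81
```

The identity holds because \(2\varepsilon x + a^2 - \varepsilon ^2 = (a + \varepsilon )^2 = x^2\). Only one value is opened, compared with two for a multiplication using a triple.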

To preprocess \(N - 1\) pairs of squares \((\left\langle {a_i}\right\rangle ,\left\langle {a^2_i}\right\rangle )_{i=1}^{N - 1}\), we run the protocol for generating multiplication triples as above, except the dealer sets all \(b_i = a_i\) (including \(b_N\) in the triple to be sacrificed). Note that in this case \(B(s)=A(s)\) does not need to be computed or exchanged separately.

Damgård et al. also preprocess random bits, i.e., values \(\left\langle {x}\right\rangle \) so that \(x \in \{0, 1\}\). Such preprocessed values are useful to speed up the online computation of e.g. comparisons [17]. To preprocess random bits \(\left\langle {x_1}\right\rangle ,\ldots ,\left\langle {x_{N-1}}\right\rangle \), we run the protocol for generating multiplication triples as above, except the dealer sets all \(a_i = x_i\) and \(b_i = 1 - x_i\) (implying \(c_i = 0\)). If we use \((\left\langle {a_N}\right\rangle ,(1-\left\langle {a_N}\right\rangle ),\left\langle {a_N}\right\rangle (1-\left\langle {a_N}\right\rangle ))\) for random \(a_N\) as the extra multiplication triple to be sacrificed, we have \(B(x)=1-A(x)\), so B(s) does not need to be computed or exchanged. Thus the preprocessing of both a square pair and a bit requires communicating one field element fewer than a multiplication triple.

Similarly, we can compute other useful preprocessed material by having the dealer prove the appropriate multiplicative relations using the batchwise multiplication check. For example, random values with their negative powers \(\left\langle {r}\right\rangle ,\left\langle {r^{-1}}\right\rangle ,\ldots ,\left\langle {r^{-k}}\right\rangle \) are useful to compute \(\left\langle {x^2}\right\rangle ,\ldots ,\left\langle {x^k}\right\rangle \) from \(\left\langle {x}\right\rangle \) by opening rx and taking \(\left\langle {x^i}\right\rangle =(rx)^{i} \left\langle {r^{-i}}\right\rangle \) (e.g., for secure equality [17]). Correctness is verified from the triples \(\left\langle {a_1}\right\rangle =\left\langle {r}\right\rangle ,\left\langle {b_1}\right\rangle =\left\langle {r^{-1}}\right\rangle ,\left\langle {c_1}\right\rangle =\left\langle {1}\right\rangle \) and \(\left\langle {a_i}\right\rangle =\left\langle {r^{-1}}\right\rangle \), \(\left\langle {b_i}\right\rangle =\left\langle {r^{-i+1}}\right\rangle \), \(\left\langle {c_i}\right\rangle =\left\langle {r^{-i}}\right\rangle \) for \(i=2,\ldots ,k\).
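
A small Python sketch of the negative-powers trick follows, again on additive shares without MACs. The modulus and the helper names `relation_triples` and `powers_from_opening` are hypothetical, and the value rx is simply computed in the clear here, whereas in a protocol it would be obtained by a secure multiplication followed by an opening.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (assumption)


def share2(v):
    """Additively share v between two parties."""
    s1 = secrets.randbelow(P)
    return [s1, (v - s1) % P]


def relation_triples(r, k):
    """The (a_i, b_i, c_i) relations the dealer would prove, listed in the clear."""
    r_inv = pow(r, -1, P)
    triples = [(r, r_inv, 1)]
    for i in range(2, k + 1):
        triples.append((r_inv, pow(r_inv, i - 1, P), pow(r_inv, i, P)))
    return triples


def powers_from_opening(m, inv_power_shares):
    """Given the public m = r * x and shares of r^-1, ..., r^-k, return shares of x, ..., x^k."""
    return [[pow(m, i, P) * s % P for s in shares]
            for i, shares in enumerate(inv_power_shares, start=1)]


r = secrets.randbelow(P - 1) + 1  # random non-zero field element
x, k = 3, 4
assert all(a * b % P == c for a, b, c in relation_triples(r, k))

r_inv = pow(r, -1, P)
inv_shares = [share2(pow(r_inv, i, P)) for i in range(1, k + 1)]
m = r * x % P  # opened in the protocol; r hides x
power_shares = powers_from_opening(m, inv_shares)
assert [sum(sh) % P for sh in power_shares] == [pow(x, i, P) for i in range(1, k + 1)]
```

After the single opening of rx, every power is a public scalar times an existing sharing, so no further interaction is needed.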

Secret-shared random matrix products can be used to efficiently compute matrix products [18]: given random matrices \(\left\langle \mathbf{U }\right\rangle \), \(\left\langle \mathbf{V }\right\rangle \), and \(\left\langle \mathbf{W }\right\rangle = \left\langle \mathbf{U \cdot \mathbf V }\right\rangle \) of the correct size, matrix product \(\left\langle \mathbf{Z }\right\rangle = \left\langle \mathbf{X \cdot \mathbf Y }\right\rangle \) is computed by opening \(\left\langle \mathbf{X - \mathbf U }\right\rangle \) and \(\left\langle \mathbf{Y - \mathbf V }\right\rangle \) and letting

$$ \left\langle \mathbf{Z }\right\rangle =(\mathbf X -\mathbf U )\cdot (\mathbf Y -\mathbf V )+(\mathbf X -\mathbf U )\cdot \left\langle \mathbf{V }\right\rangle +\left\langle \mathbf{U }\right\rangle \cdot (\mathbf Y -\mathbf V )+\left\langle \mathbf{W }\right\rangle . $$

To preprocess a random matrix product, the dealer provides secret shares of all \(U_{i,j}\), \(V_{j,k}\) and products \(U_{i,j} \cdot V_{j,k}\), and proves their correctness. The elements of W are computed as linear combinations of these products. Preprocessing in this case reduces overall communication, e.g., by a factor 1.5 for \(2\,\times \,2\) matrices or a factor 2.5 for \(10\,\times \,10\) matrices. Similarly, in the common case of multiplying a value (i.e., a 1-by-1 matrix) \(\left\langle {x}\right\rangle \) with each element of a vector (i.e., a 1-by-n matrix) \(\left\langle {\mathbf {y}}\right\rangle \), online communication halves and overall communication decreases by 33%.
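
The online use of a preprocessed matrix product can be illustrated with the following Python sketch on additive shares without MACs. It relies on the expansion \(\mathbf X \mathbf Y = (\mathbf X -\mathbf U )(\mathbf Y -\mathbf V ) + (\mathbf X -\mathbf U )\mathbf V + \mathbf U (\mathbf Y -\mathbf V ) + \mathbf U \mathbf V \), keeping the order of the factors since matrix products do not commute. All helper names and the modulus are assumptions made for illustration.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (assumption)


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % P
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(A, B):
    return [[(x + y) % P for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def matsub(A, B):
    return [[(x - y) % P for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def rand_matrix(rows, cols):
    return [[secrets.randbelow(P) for _ in range(cols)] for _ in range(rows)]


def share_matrix(M):
    """Additively share a matrix between two parties."""
    s1 = rand_matrix(len(M), len(M[0]))
    return [s1, matsub(M, s1)]


def open_matrix(shares):
    return matadd(shares[0], shares[1])


def beaver_matmul(X_sh, Y_sh, U_sh, V_sh, W_sh):
    """Shares of X * Y from shares of X, Y and a preprocessed product W = U * V."""
    D = open_matrix([matsub(x, u) for x, u in zip(X_sh, U_sh)])  # X - U, public
    E = open_matrix([matsub(y, v) for y, v in zip(Y_sh, V_sh)])  # Y - V, public
    Z_sh = [matadd(matadd(matmul(D, v), matmul(u, E)), w)
            for u, v, w in zip(U_sh, V_sh, W_sh)]
    Z_sh[0] = matadd(Z_sh[0], matmul(D, E))  # public term added by one party
    return Z_sh


X = [[1, 2, 3], [4, 5, 6]]      # 2-by-3
Y = [[7, 8], [9, 10], [11, 12]]  # 3-by-2
U, V = rand_matrix(2, 3), rand_matrix(3, 2)
Z_sh = beaver_matmul(share_matrix(X), share_matrix(Y),
                     share_matrix(U), share_matrix(V), share_matrix(matmul(U, V)))
assert open_matrix(Z_sh) == [[58, 64], [139, 154]]
```

Only the entries of \(\mathbf X -\mathbf U \) and \(\mathbf Y -\mathbf V \) are opened, whereas multiplying entry by entry with ordinary triples would require two openings per scalar product.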

Smaller Fields. As in the Lindell-Nof case, we need a field of size at least \(2N\cdot 2^\sigma \), but as there, we can enhance the statistical security of \(\varPi _\textsc {Deal} \) by repeating the multiplication check. Of course, for an overall secure protocol for fields smaller than \(2^\sigma \), modifications to the SPDZ online phase are also needed, cf. [8].

Fig. 4. Number of Lindell-Nof multiplications that can be checked for correctness per second based on the given network capacity or computation effort, with batches of size \(2^1,\ldots ,2^9\) for a 64-bit prime (left) or 128-bit prime (right).

5 Performance Evaluation

In this section we present performance estimates suggesting that, despite the computations in our protocols, communication is often still the main bottleneck.

5.1 Implementation Details

To estimate the computation effort of our protocol, we have implemented batchwise multiplication verification both in the Lindell-Nof and the SPDZ setting. In both cases, we implemented only computation (including PRNG evaluation, secret sharing, reconstruction, and the MAC check) and not communication. For the PRNG, we used the SPDZ-2 implementation based on AES-NI (see footnote 2).

We implemented the batch check in batch sizes of \(2^k\) using fields \(\mathbb Z_p\) that allow efficient modular arithmetic and efficient FFTs for those batch sizes (batches do not need to be completely filled up). Batch verification relies heavily on performing FFTs of the size of the batch for performing interpolation; with batch size \(2^k\), we can use the efficient Cooley-Tukey FFT algorithm. This requires a \((2^k)\)th root of unity in \(\mathbb Z_p\), or equivalently, \(2^k|p-1\). To have fast modular arithmetic, we use pseudo-Mersenne primes \(p=2^s-2^l+1\); note that if \(k\le l\) then \(2^k|2^l|p-1\). (We cannot use regular Mersenne primes \(p=2^s-1\): there \(p-1=2^s-2\) is divisible by 2 but not by 4, so \(2^k\not \mid p-1\) for \(k\ge 2\).) In particular, we use our own modular arithmetic/FFT implementation for primes \(2^{64}-2^{10}+1\) and \(2^{128}-2^{54}+1\), allowing batches up to \(2^9\) and \(2^{53}\), respectively.
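
The divisibility condition can be checked directly. The Python sketch below computes the largest power of two dividing \(p-1\) for the two moduli named above and, for contrast, for the Mersenne number \(2^{61}-1\). It also searches for a root of unity of order \(2^k\) in a toy field; the small prime \(241 = 2^8-2^4+1\) and both function names are assumptions chosen only to keep the example fast, and the code does not test the primality of the larger moduli.

```python
def two_adicity(n):
    """Largest k such that 2^k divides n (n > 0)."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


def root_of_unity(p, k):
    """Search for an element of order exactly 2^k in Z_p^*, for prime p with 2^k | p - 1."""
    assert k >= 1 and (p - 1) % (1 << k) == 0
    for g in range(2, p):
        w = pow(g, (p - 1) >> k, p)          # w^(2^k) = g^(p-1) = 1
        if pow(w, 1 << (k - 1), p) != 1:     # so the order of w is exactly 2^k
            return w
    raise ValueError("no element of order 2^k found")


# Moduli of the form 2^s - 2^l + 1: p - 1 = 2^l * (2^(s-l) - 1).
assert two_adicity((2**64 - 2**10 + 1) - 1) == 10
assert two_adicity((2**128 - 2**54 + 1) - 1) == 54
# Mersenne form 2^s - 1: p - 1 = 2 * (2^(s-1) - 1), so only a square root of unity exists.
assert two_adicity((2**61 - 1) - 1) == 1

# Toy field with the same shape as above: 241 = 2^8 - 2^4 + 1.
p_toy, k_toy = 241, 4
w = root_of_unity(p_toy, k_toy)
assert pow(w, 16, p_toy) == 1 and pow(w, 8, p_toy) == p_toy - 1
```

An element found this way could serve as the twiddle factor of a radix-2 FFT of length \(2^k\) over the toy field.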

To estimate communication complexity, we compute the number of bits that each party needs to send to check correctness of one multiplication. For Lindell-Nof, this is the same for each party; for SPDZ, we use load-balancing so that communication is also evenly spread. The number of multiplications per second is computed as the bandwidth divided by this number of bits.
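
The estimate described here is a single division, sketched below in Python. The function name, the split into a per-check cost and a per-batch overhead, and the numbers in the usage lines are illustrative assumptions and are not the measured parameters behind Figs. 4 and 5.

```python
def checks_per_second(bandwidth_bps, field_bits, elements_per_check,
                      elements_per_batch=0.0, batch_size=1):
    """Bandwidth divided by the number of bits one party sends per checked multiplication."""
    elements = elements_per_check + elements_per_batch / batch_size
    return bandwidth_bps / (field_bits * elements)


# Illustrative only: one 64-bit field element per check on a 1 Gbps link.
print(checks_per_second(1e9, 64, 1.0))
# The same with a hypothetical overhead of two elements per batch of 2^9 checks.
print(checks_per_second(1e9, 64, 1.0, elements_per_batch=2.0, batch_size=2**9))
```

The second call shows why larger batches help on the communication side: any fixed per-batch cost is amortised over more checks.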

5.2 Evaluation Results

Figure 4 estimates the number of multiplications that can be checked in the Lindell-Nof protocol using Shamir secret sharing, GRR multiplication, and our batchwise check. (Note that this does not include the multiplication to be checked itself.) We show, for different batch sizes \(2^k\), how many checks are allowed by the bandwidth of a 50 Mbps WAN, a 1 Gbps LAN, and a 2 Gbps LAN. We also show, on a single core of an Amazon M4.large machine (a 2.3/2.4 GHz Intel Xeon E5), how many checks can be handled by the processor. As expected, larger batches are good for communication complexity but bad for computation complexity. With a 1 Gbps LAN and a single core, computation quickly becomes the bottleneck, but it is still possible to check around 5 million multiplications per second for 64-bit primes and 2 million for 128-bit primes. Note that batchwise verification is trivially parallelizable by checking each batch on a different core, so the number of checks per second can easily be increased by increasing the number of cores. With less than 1 Gbps available, communication quickly becomes the bottleneck rather than computation. We did not run experiments for more than three parties, but in general, the amount of computation should stay roughly the same (since it is dominated by the FFTs) whereas the amount of communication increases as shown in Table 1.

Fig. 5. Number of SPDZ multiplication triples that can be preprocessed per second based on the given network capacity or computation effort (excluding online phase), with batches of size \(2^2,\ldots ,2^9\) for a 64-bit prime (left) or 128-bit prime (right).

Figure 5 similarly estimates the number of multiplication triples per second of our SPDZ preprocessing, load-balanced between the three parties. As above, for different batch sizes \(2^k\), we plot the number of triples that can be generated on a 50 Mbps WAN, a 1 Gbps LAN, and a 2 Gbps LAN, and on a single Amazon M4.large core. Note that SPDZ has less communication than Lindell-Nof for small batch sizes; this is because the constant overhead of the SPDZ batch check is very small (just a few field elements). However, for larger batches, Lindell-Nof has less communication (each party sends one field element per check vs. the dealer sends five field elements for one third of the checks). In SPDZ, on a 1 Gbps network with a single core, computation is the bottleneck, and around 5 million triples per second are possible for a 64-bit prime or around 2 million triples for a 128-bit prime; with two to four cores, it is possible to reach around 10 million triples for a 64-bit prime or 5 million triples for a 128-bit prime.