HACCLE: Metaprogramming for Secure Multi-Party Computation


Yuyan Bao∗2 (yuyan.bao@uwaterloo.ca), University of Waterloo, Canada
Kirshanthan Sundararajah∗ (ksundar@purdue.edu), Purdue University, USA
Raghav Malik (malik22@purdue.edu), Purdue University, USA
Qianchuan Ye, Christopher Wagner, Nouraldin Jaber ({ye202,wagne279,njaber}@purdue.edu), Purdue University, USA
Fei Wang, Mohammad Hassan Ameri, Donghang Lu ({wang603,mameriek,lu562}@purdue.edu), Purdue University, USA
Alexander Seto, Benjamin Delaware, Roopsha Samanta ({aseto,bendy,roopsha}@purdue.edu), Purdue University, USA
Aniket Kate, Christina Garman, Jeremiah Blocki ({aniket,clg,jblocki}@purdue.edu), Purdue University, USA
Pierre-David Letourneau, Benoit Meister, Jonathan Springer ({letourneau,meister,springer}@reservoir.com), Reservoir Labs, USA
Tiark Rompf, Milind Kulkarni ({tiark,milind}@purdue.edu), Purdue University, USA

Abstract

Cryptographic techniques have the potential to enable distrusting parties to collaborate in fundamentally new ways, but their practical implementation poses numerous challenges. An important class of such cryptographic techniques is known as Secure Multi-Party Computation (MPC). Developing Secure MPC applications in realistic scenarios requires extensive knowledge spanning multiple areas of cryptography and systems. And while the steps to arrive at a solution for a particular application are often straightforward, it remains difficult to make the implementation efficient, and tedious to apply those same steps to a slightly different application from scratch. Hence, it is an important problem to design platforms for implementing Secure MPC applications with minimum effort and using techniques accessible to non-experts in cryptography.

In this paper, we present the HACCLE (High Assurance Compositional Cryptography: Languages and Environments) toolchain, specifically targeted to MPC applications. HACCLE contains an embedded domain-specific language, Harpoon, for software developers without cryptographic expertise to write MPC-based programs, and uses Lightweight Modular Staging (LMS) for code generation.

Harpoon programs are compiled into acyclic circuits represented in HACCLE's Intermediate Representation (HIR), which serves as an abstraction over different cryptographic protocols such as secret sharing, homomorphic encryption, or garbled circuits. Implementations of different cryptographic protocols serve as different backends of our toolchain. The extensible design of HIR allows cryptographic experts to plug in new primitives and protocols to realize computation. And the use of standard metaprogramming techniques lowers the development effort significantly.

We have implemented Harpoon and HACCLE, and used them to program interesting applications (e.g., secure auction) and key computation components of Secure MPC applications (e.g., matrix-vector multiplication and merge sort). We show that the performance is improved by using our optimization strategies and heuristics.

CCS Concepts: • Software and its engineering → Compilers; Domain specific languages.

Keywords: metaprogramming, domain-specific language, secure multi-party computation

∗ Both authors contributed equally to this work.
2 Work performed while author was at Purdue University.

This work is licensed under a Creative Commons Attribution 4.0 International License.
GPCE ’21, October 17–18, 2021, Chicago, IL, USA
© 2021 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9112-2/21/10.
https://doi.org/10.1145/3486609.3487205

ACM Reference Format:
Yuyan Bao, Kirshanthan Sundararajah, Raghav Malik, Qianchuan Ye, Christopher Wagner, Nouraldin Jaber, Fei Wang, Mohammad Hassan Ameri, Donghang Lu, Alexander Seto, Benjamin Delaware, Roopsha Samanta, Aniket Kate, Christina Garman, Jeremiah Blocki,
Pierre-David Letourneau, Benoit Meister, Jonathan Springer, Tiark Rompf, and Milind Kulkarni. 2021. HACCLE: Metaprogramming for Secure Multi-Party Computation. In Proceedings of the 20th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE ’21), October 17–18, 2021, Chicago, IL, USA. ACM, New York, NY, USA, 14 pages. https://doi.org/10.1145/3486609.3487205

1 Introduction

Secure Multi-Party Computation (MPC) enables a group of distrusting parties to jointly perform computation without revealing any participant's private data that they do not wish to share with others. It has broad practical applications, e.g., Yao's millionaires problem [50], secure auctions [5, 24], voting, privacy-preserving network security monitoring [8], privacy-preserving genomics [26, 49], private stable matching [18], ad conversion [28], spam filtering on encrypted email [22], and privacy-preserving machine learning [16]. Secure MPC applications are generally realized as circuits communicating information, both private and public, among parties.

Although MPC techniques and protocols have seen much success in the cryptography community, it is still challenging to build practical MPC applications. Executing cryptographic protocols is notoriously slow, due to the encryption and communication overhead. The largest benchmark reported in Fairplay [33], a secure two-party computation system, was finding the median of two sorted input arrays containing ten 16-bit numbers from each party. Running the benchmark required execution of 4383 gates, and took over 7 seconds on a local area network. Alongside improvements in computing capabilities and network bandwidth, implementation techniques can contribute improvements of 3-4 orders of magnitude [20]. These techniques include optimizations that reduce the number of gates and the depth of a circuit and reduce the computational costs of executing a cryptographic protocol. However, such optimizations do not exist in general-purpose compiler frameworks.

While several MPC frameworks have been proposed [4, 9, 15, 17, 25, 29, 31, 32, 35, 43, 47, 48, 51, 52], they either provide low-level cryptographic primitives or high-level abstractions like traditional programming languages, but not both. The low-level frameworks provide high degrees of customized protocol execution, but the users are generally expected to be experts in either one or both of cryptography and optimizing circuits. These MPC frameworks provide little or no type safety to prevent semantic errors, and it is difficult to write applications in a way that is portable across different protocols. The high-level frameworks provide traditional programming abstractions that hide the data-oblivious nature of secure computation from programmers. But these frameworks are tied to only one or a few protocols, and their compilation procedures, from high-level abstractions to low-level primitives, are not easy to extend to perform application-specific optimizations [51].

Contributions. The main intellectual contribution of this paper is a toolchain for developing Secure MPC applications called HACCLE (High Assurance Compositional Cryptography: Languages and Environments). Our framework contains an embedded domain-specific language (eDSL), Harpoon, for designing MPC-based applications, and uses standard metaprogramming techniques to lower the development effort. The main purpose of providing such a high-level programming language is to allow software developers without expertise in advanced cryptography to construct MPC-based applications seamlessly. A Harpoon program is compiled to an acyclic combinational circuit, which is described in a HACCLE Intermediate Representation (HIR). HIR exposes the essential data-oblivious nature of MPC, and allows cryptography experts to experiment with new primitives and protocols. Our framework also provides a specialized backend for estimating the resource usage (e.g., compute time and memory space) prior to execution.

This paper makes the following specific contributions:
• HACCLE Toolchain: A compilation framework to build and execute MPC applications written in Harpoon, an embedded domain-specific language (eDSL) in Scala based on the LMS metaprogramming and compiler platform.
• HACCLE Intermediate Representation (HIR): An extensible circuit-like intermediate representation tailored to abstract cryptographic primitives used in MPC.
• Optimization Strategies: Methods for optimizing the MPC application by specialization as it flows through each stage of our HACCLE toolchain.

The rest of the paper is organized as follows. Sec. 2 provides background on cryptographic protocols involved in Secure MPC and motivates the need for developing MPC-based applications. We describe the key impediments for developing practical MPC applications with the example of secure auctions. Sec. 3 illustrates components of our compiler and HIR. Sec. 4 describes the HACCLE toolchain and associated workflow. Sec. 5 describes the optimizations implemented in our compiler toolchain. Sec. 6 discusses three case studies of our toolchain in detail. Sec. 7 summarizes related work and Sec. 8 concludes the paper. The HACCLE implementation is available online at:
https://github.com/YuyanBao/HACCLE.

2 Motivating Example and Background

As an example of Secure MPC, consider online auctions. Online auctions have great practical importance and different models are widely used, e.g., by eBay, Google AdWords, and Facebook. In general, a secure online auction works as follows. Buyers place their sealed bids on items, and for each item, the highest bidder is chosen to buy it. In this setup, parties are not permitted to know others' bids. Hence, conducting successful secret auctions in the absence of a trusted authority requires cryptographic techniques to preserve the
secrecy of bids while performing necessary computation, such as finding the highest bidder, in an assuredly trustworthy way. One of the significant use cases of secure auctions is procurement via a competitive bidding process, where no participants trust each other, including the auctioneer. While a trusted third party handling the auction may be acceptable when the items under auction have low value, this is generally a less desirable option in high-value and corruption-prone environments, such as procurement for public construction contracts.

There are many different types of auction policies studied by economists and game theorists. An auction where the highest bidder is chosen to buy the item by paying the highest bid is known as a first-price auction. A second-price or Vickrey auction [46] is an alternative auction policy where the highest bidder is chosen to buy the item at the second highest price. Second-price auctions provide buyers with the incentive to bid their true valuation and do not allow for price discovery (i.e., no ramping up of prices). Hence, second-price auctions are especially suitable for high-value low-trust environments, such as public procurement. Second-price auctions also apply to settings where multiple items are auctioned and/or bids may have additional structure, such as if/then conditions to evaluate specific contract terms that need to be taken into account for comparison. Such settings are described as generalized second-price auctions. Given that secrecy of the bids is preserved, the computation required when a single item is auctioned is simpler than when multiple items are auctioned. Hence it is desirable, from both programmability and efficiency viewpoints, that the online auction application is written once for the general case and gets automatically and correctly specialized for the desired number of items, number of bidders, comparison logic, etc.

Most implementation techniques for Secure MPC applications (e.g., first- and second-price auctions) are based on circuits. Equivalent functionality can be expressed as a Scala program: e.g., the following expresses an AND gate template, with bit-width determined by the input array:

    val input = Array(0, 1, 1, 0)
    var res = input(0)
    for (i <- (1 until input.length))
      res = res & input(i)
    res

Just like in DSLs for hardware design [2, 27], using metaprogramming techniques to stage bitwise operations rather than execute them directly is the key to our approach. Implementing secure circuits then amounts to specializing the encoding and operators for the respective cryptographic backends.

We use Lightweight Modular Staging (LMS) [38] to turn the encoding and operators into staged expressions, so that programs like the previous AND template become circuit generators. In LMS, the type constructor Rep[T] is used to denote a staged expression, which will cause an expression of type T to become part of the generated program. The following code shows the high-level design of the HACCLE intermediate representation (HIR) using LMS. The case classes Bit and Num define the primitive constructs for encoding boolean circuits and arithmetic circuits respectively, where the types Rep[SBit] and Rep[SNum] denote the staged representations of secure bits and numbers (see Sec. 3.3).

    // Boolean circuit interface
    abstract class SBit
    case class Bit(val value: Rep[SBit], ... ) {
      def &(that: Bit) = { ... }
      def |(that: Bit) = { ... }
    }
    // Arithmetic circuit interface
    abstract class SNum
    case class Num(val value: Rep[SNum], ...) {
      def +(that: Num) = { ... }
      def -(that: Num) = { ... }
      def <(that: Num) = { ... }
    }

It is of course possible to implement Num on top of binary circuits and Bit arrays using standard half adders and full adders (see Sec. 3.3), but some secure cryptographic protocols directly support arithmetic circuits.

Now to express a secure first-price auction, we can use operations on an array of pairs of Nums that denote encrypted bidders' identities and their bids:

    // assume input: Array[(Num, Num)]
    var res = input(0)
    for (i <- (1 until input.length))
      res = if (res._2 < input(i)._2) input(i) else res
    res

Observe that the linear sequence of operations in the above code results in a suboptimal circuit. Rewriting the code in a functional style, as input.reduce(_ max _), allows us to abstract over the reduction pattern and substitute the linear sequence with a tree reduction pattern, which yields a circuit of logarithmic depth, allowing efficient parallel computation. Using known techniques for extracting functional dependencies from imperative loops [19, 37], this transformation is automated and applied to for loops. Now, all we need are generic functions: max, sndmax (shown below) and reduce (Fig. 1). The latter divides the computation into subproblems of size 𝑛/2 and solves the subproblems recursively.

    // compare (bid id, bid value)
    def max(a: (Num, Num), b: (Num, Num)): (Num, Num) =
      if (a._2 < b._2) b else a
    // compare (bid id, bid value, price = 2nd highest bid)
    def sndmax(a: (Num, Num, Num), b: (Num, Num, Num)) = {
      val prz = ... // 2nd highest of a._2,a._3,b._2,b._3
      if (a._2 < b._2) (b._1,b._2,prz) else (a._1,a._2,prz)
    }

Type classes, e.g., Ordering[T] or Encoding[T], can be used to further abstract over comparison or access logic.

With the given comparator functions, we can transform the previous imperative code to a functional style, which generates optimal circuits:
    // compute first-price auction
    val fst_max = reduce(input)(max)
    // compute second-price auction
    def initPrice(x: (Num, Num)) = (x._1, x._2, x._2)
    def dropSecretBidValue(x: (Num, Num, Num)) = (x._1, x._3)
    val r = reduce(input.map(initPrice))(sndmax)
    val snd_max = dropSecretBidValue(r)

For second-price auctions, we transform each element in the array to a 3-tuple of bidder's identity, highest bid, and initial price, and reduce with sndmax. Sec. 6.1 shows a full implementation in our HACCLE toolchain.

    def reduce[T](input: Array[T])(f: (T, T) => T): T = {
      def rec(elems: Array[T]): T =
        if (elems.length == 1) elems(0)
        else {
          val b1 = elems.slice(0, elems.length/2)
          val b2 = elems.slice(elems.length/2, elems.length)
          f(rec(b1), rec(b2))
        }
      rec(input)
    }

Figure 1. Generic function reduce yielding a circuit of logarithmic depth.

We continue our discussion of Secure MPC background by looking at different protocols for secure computation, system models, trust models, and the offline/online paradigm.

Secret sharing. Secret sharing [40] is a cryptographic technique that distributes secret data amongst a group of parties, and allows the secret to be reconstructed only when a sufficient number of shares are combined. A (𝑡, 𝑛)-secret sharing scheme allows the secret 𝑠 to be split into 𝑛 shares. Any 𝑡 − 1 of the shares reveal no information about 𝑠, while any 𝑡 shares suffice to reconstruct the secret 𝑠.

The SPDZ [15] and HoneyBadgerMPC [32] frameworks serve as our secret sharing backends and provide Python-style programming environments for writing custom MPC programs. These frameworks let developers express MPC programs (e.g., second-price auction) as arithmetic expressions. Constructing the most efficient MPC programs is the major challenge for developers. First, developers must know how to build an efficient circuit, e.g., realizing a balanced tree reduction to reduce the depth of a circuit and to parallelize the computation, instead of performing a linear reduction over a list of elements. Second, developers must have a good understanding of the cost of every primitive operation (e.g., usage of logically similar but different comparison operators may yield different costs). These challenges are significantly different from writing an efficient program in the traditional setting and can be successfully overcome by a compiler.

Homomorphic Encryption. Cloud computing may violate privacy. In this scenario, one party wants to perform computation by outsourcing to another (possibly untrusted) party, e.g., training machine learning models of private data on a public cloud server. This can be achieved by homomorphic encryption, another important cryptographic primitive. Homomorphic encryption enables operations on encrypted data. The PALISADE [44], TFHE [13], and HElib [39] libraries serve as our FHE backends. They all implement asymmetric protocols that use a pair of public and private keys for encryption and decryption. The TFHE library implements a very fast gate-by-gate bootstrapping mechanism [11, 12], and allows evaluating a boolean circuit composed of binary gates over encrypted data. The HElib library implements many optimizations to make homomorphic evaluation run faster. The PALISADE library supports the BGV [7], BFV [6, 21], and CKKS [10] schemes. In cryptography, ciphertext and plaintext mean private and public information, respectively. In this paper, we may use these terms interchangeably.

Garbled Circuits. Yao's Garbled Circuits [50] is a two-party secure computation scheme for boolean circuits against semi-honest adversaries. Obliv-C [51] is the library that we use to support Yao's Garbled Circuits protocols.

System and Communication Models. There are two popular system models for multi-party computation. The MPC-as-a-service setting allows some parties to play the role of servers and to provide MPC services to clients with private input. The other setting is where the parties running the MPC protocols are the participants who provide the input. The HACCLE toolchain does not enforce a specific setting; instead, users choose the suitable setting for their applications and keep that setting in mind when developing programs. Similarly, the HACCLE toolchain does not enforce any communication model. The parties/machines could be fully connected, could form a star network structure, or could be any specified structure. As long as the network structure is supported by one of HACCLE's backends, HACCLE is able to compile the program.

Trust/Adversary Models. Developing MPC applications requires understanding the security assumptions of an MPC library, such as the trust/adversary models. There are two major adversary models: semi-honest and malicious. A semi-honest adversary follows the protocol, but tries to learn from received messages. A malicious adversary has the same power as a semi-honest one in analyzing the protocol execution. In addition, it may also control, manipulate, or arbitrarily inject messages to the network. In HACCLE, programmers only need to provide a model of choice and the toolchain will pick proper sub-protocols to build up the MPC programs satisfying the adversary model described.

Offline Phase. The offline/online paradigm is applied by many MPC protocols and frameworks. The online phase uses a buffer of preprocessed input-independent values created during the offline phase. Thus, the MPC framework can run the offline phase to prepare them beforehand. The online phase is where clients/users provide their inputs and get expected output; it can gain a significant speed-up with the help of the offline phase. A number of preprocessed values are
required for multiplications and comparisons. The volume of preprocessing data depends on the online phase, and it is hard for programmers without security expertise to work out those requirements. In HACCLE, programmers need not care about the secret parameters. They describe only the computation and the private information. The HACCLE toolchain can synthesize suitable settings for the offline phase.

3 Compiler

The HACCLE toolchain uses LMS [38] to support our towers of abstractions. Staging is a technique for building extensible, flexible DSLs by providing code generators that successively lower higher-level abstractions to lower-level abstractions, and, ultimately, to executable code. Importantly, staging allows optimization to be performed at every level of the lowering process. Hence, some optimizations can be performed at high levels of abstraction (e.g., optimization on plaintext computation (see Sec. 5)), while other optimizations can be performed at lower levels of abstraction. As a result, abstraction penalties are minimized. Another benefit of staging is that because the translation is written in terms of generators, it is simple to add new abstractions at any given level.

3.1 Staged Compilation

Multi-Stage Programming [42] (or staging) is the programming language technique that executes programs in multiple stages. A staged computation does not immediately compute a result, but returns a program fragment that represents the computation and that can be explicitly executed to form the next computational stage. The key benefit of staging is that the present-stage code can be written in a high-level style, yet generates future-stage code that is very low-level and efficient. Fig. 2 illustrates an end-to-end compilation path in HACCLE. The compiler takes a Scala program with Harpoon annotations (see Sec. 3.2), and constructs a computation graph that expresses an abstract circuit. Given a backend specification, the compiler will generate a target program for it. Currently, our compiler is not able to automatically choose an appropriate backend and initialize all the parameters for it. Thus, a backend specification is needed. It is a file that contains a set of parameters for translating an abstract circuit to a concrete backend program.

Figure 2. Compilation in HACCLE. (Diagram labels: Harpoon + Scala; Transform; Staged Computation Graph, LMS; Transform; Backend Program; Backend Specification.)

Generative Programming and Lightweight Modular Staging (LMS). As mentioned in Sec. 2, the HACCLE compiler uses LMS for code generation due to its metaprogramming capabilities, and the type constructor Rep[T] is used to denote a staged expression. For example, the type Rep[SNum] denotes an encrypted integer. Given two Rep[SNum] values a and b, evaluating the expression 𝑎 + 𝑏 will generate code for a given backend. For the HElib backend, the generated code will be Ctxt r = a; r += b;, where Ctxt is the type of a ciphertext in the HElib library. For the TFHE backend, the generated code will be:

    LweSample* x5 =
      new_gate_bootstrapping_ciphertext_array(64, x2->params);
    fhe_add(x5, a, b, 64, bk);

where LweSample is the type of a ciphertext in the TFHE library. As TFHE does not provide arithmetic expressions and operations, the compiler encodes an integer as a bit-array of size 64. The function fhe_add is part of our HACCLE library for the TFHE library.

3.2 Harpoon

The HAccle Rich Representation for Program OperatiON (Harpoon) language is an expressive subset of Scala for writing MPC programs. It is an imperative and monomorphic language, featuring standard control flow operations: loops, function calls, conditionals, and recursion. The language is designed to be expressive enough that programmers could easily write Harpoon code directly, while being constrained enough to ensure that Harpoon programs can be implemented via translation to secure low-level computation. In practice, Harpoon serves as the top-level IR for the HACCLE pipeline, and is the language for end-user programs.

The Harpoon language is not only able to access Scala libraries, but also provides a set of cryptographic data structures, e.g., HArray[T] is an encrypted array that allows one to index on ciphertexts. It also provides a set of security annotations that are read via reflection and are used to direct code generation. They are agnostic to the target backend, and are used by subsequent stages of the HACCLE pipeline. For example, the annotation sec is used to mark the provider (also the owner) of private data. Recursive functions and loops may be annotated with an upper bound on the number of recursive calls and iterations. The bound expression can reference the parameters of the function, allowing the bound to vary according to the context where a function is called, e.g., consider the signature of the merge function:

    @bound(a.length + b.length)
    def merge(a: HArray[Int], b: HArray[Int]): HArray[Int]

The upper bound on the number of recursive calls is the sum of the lengths of the two input arrays. Note that the semantics of function calls in Harpoon is not impacted by the bound; rather it is used by subsequent stages of the pipeline to bound the invocation of a recursive function call (see Sec. 3.4).

The annotated program is also equipped with a type system, which ensures that information about private data cannot be leaked. This provides the first-layer guarantees that the programs can be successfully compiled by the later stages of the pipeline. Consider the statement println(a), where 𝑎 is annotated as private data. The compiler will report a type error, as encrypted data is not understandable or meaningful
to users. But the assignment @sec(alice) val r = a is permitted, as the annotation expresses that the variable r stores encrypted data. While the type system at this stage does not make use of fine-grained ownership information, this information will be passed down through the pipeline. See [3] for the details of the Harpoon language.

3.3 Intermediate Representation

HACCLE intermediate representation (HIR) serves as an interface between high-level programming languages and cryptographic backends. HIR is a domain-specific intermediate language, and gains benefits from LMS to support towers of abstractions. It encompasses all the primitive operations which we have supported so far, e.g., encryption, decryption, sharing, and combining.

Multi-level IR. Different backends may support different sets of operations in HIR; no backend is "complete" in the sense that it directly implements every HIR operation. For example, the TFHE backend supports logical operations but not arithmetic ones. In contrast, other backends may support arithmetic operations but not boolean ones. The compiler's job is to rewrite HIR circuits to be compatible with backends.

As shown in Fig. 3, HIR is a multi-level IR. The compiler can thus use rewrites to target the subset of operations that a given backend supports. For example, arithmetic operations (adds, multiplies) can be rewritten into bit-level implementations (e.g., ripple-carry adders), or boolean operations can be represented as arithmetic operations that happen to operate over Z_2. We have developed a set of these rewrite rules for various backends (and, indeed, rely on exactly this type of rewrite to support floating point operations).

Figure 3. Example of multi-level HIRs. (Diagram levels, top to bottom: Float, FloatArray; UNum, UNumArray and Num, NumArray; Bit, BitArray.)

A key task for integrating a new backend is identifying what set of HIR operations that module supports, hence directing the compiler to perform appropriate rewrites. Notably, if the compiler cannot rewrite an HIR circuit to target the set of operations a backend supports, it will manifest as a type error, providing feedback to the user.

When an FHE scheme is used, an integer is encoded as the Num data structure shown below, where the fields provider and value are abstractions of the party who provides the value and of the encrypted value, respectively.

    case class Num(
      val provider: Set[Rep[SOwner]], // who provides it
      val value : Rep[SNum] // encrypted value
    )

In this case, a variable declaration statement in Harpoon, i.e., @sec(alice) val x = 5;, is transformed to val o = new Owner(alice); val x = Num(o, 5); in HIR.

When a secret sharing scheme is used, an integer is encoded as the ShareNum data structure in HIR shown below. It expresses a general secret sharing protocol. The provider is the one who contributes the value that is shared among a set of players with a given threshold. The set of observers are those who are allowed to access the value once it gets combined.

    case class ShareNum(
      val provider : Set[Rep[SOwner]], // who provides it
      val players : Set[Rep[SOwner]], // players
      val observers : Set[Rep[SOwner]], // who observes it
      val threshold : Int, // threshold
      val value : Rep[SShareNum] // shares
    )

In addition, HIR provides libraries for implementing secure computation. Those libraries are not supported by general-purpose compilers, but are essential to build interesting multi-party applications with security guarantees. For example, the following shows the operations of an array supporting indexing on a ciphertext, where arr is an HIR array.
• arr(i): array index, where 𝑖 is a plaintext or a ciphertext.
• arr.update(i, v): update the 𝑖th element with the value 𝑣, where 𝑖 is either a plaintext or a ciphertext.
• arr.slice(i, j): array slicing from the 𝑖th element until the 𝑗th element, where 𝑖 and 𝑗 are plaintext.
• arr.length: the length of the array.

These array operations with secure indices are currently implemented through, essentially, a naive Oblivious RAM (ORAM): to index into an array with a ciphertext index, the compiler generates a circuit that wires every array element, and a secure selector (multiplexer) to output the desired array element. This is equivalent to a set of if-then-elses to choose the desired array element, except with a logarithmic depth instead of a linear depth. Writing to an array element with a ciphertext index is the equivalent of an array copy, where each element of the new array performs a check for whether the old element of the array should be copied, or the "update" value should be copied.

As implementation details of cryptographic backends are abstracted away from HIR, our framework can be easily extended to support more advanced cryptographic backends, for example, a backend with ORAM. Here, we would leverage HIR's ability to provide backend-specific rewrite rules, and would directly rewrite array operations to ORAM operations.

Type System. HIR also abstracts away the implementation details of cryptographic primitives and protocols. For example, an addition operation does not specify how a secure addition is achieved, as different protocols perform it in different ways. But the type rules provide an approximation of data access policy that specifies how data is provided,
HACCLE: Metaprogramming for Secure Multi-Party Computation GPCE ’21, October 17–18, 2021, Chicago, IL, USA

accessed, and shared. For example, an addition operation on two shared numbers is only allowed on the same set of players with the same threshold, which are known at compile time. The result is provided by either one of its operands’ providers, with the same set of players and the same threshold, and is allowed to be accessed by either one of the operands’ observers.

    def +(x: ShareNum, y: ShareNum) = {
      assert(x.players.equals(y.players))
      assert(x.threshold == y.threshold)
      ShareNum(x.provider | y.provider, players,
        x.observers | y.observers, threshold, value.+(y.value))
    }

Given a cryptographic backend, HIR code is further transformed to a program with the corresponding cryptographic semantics, and the HIR type system is refined to provide more precise information on data access policy. For example, the type rule of the addition operation is refined to the following when using the additive secret sharing scheme.

    def +(x: ShareNum, y: ShareNum) = {
      assert(x.players.equals(y.players))
      assert(x.players.size == x.threshold)
      assert(x.threshold == y.threshold)
      ShareNum(x.provider | y.provider, players,
        x.observers & y.observers, threshold, value.+(y.value))
    }

The type rule checks that it is an 𝑛-out-of-𝑛 secret sharing scheme, i.e., x.players.size == x.threshold. The refined type rule provides a stronger security guarantee, i.e., the transformed program is compatible with the semantics of the backend. For example, an FHE target program is not transformed to a program that may invoke secret sharing primitives. See [3] for the details of HIR.

3.4 Obliviousness

In addition to bridging the semantic gap between a high-level and a low-level language, our compiler also bridges the semantic gap of obliviousness. A program without privacy concerns diverts its control flow according to the input: statements are executed conditionally, loops run for a variable number of iterations, etc. To protect privacy, boolean and arithmetic circuits have to be oblivious in the sense that they perform the same sequence of operations regardless of the input. The following transformations may seem quite inefficient at first sight, but they are necessary in order to maintain obliviousness.

Encrypted Array Indexing. Indexing an array with a ciphertext is encoded as a multiplexer circuit that takes every element of the array as an input and outputs the element at the indexed position. This multiplexer circuit consists of integer comparators and selectors.

Conditional Execution. After a typed Harpoon program is transformed to HIR code, there are two types of if-constructs allowed. One is the standard if-statement, where its condition depends on plaintext comparisons, and the two branches consist of a sequence of statements that may have side effects. The other has the form z = if (b) x else y, where the value of b is the result of private comparisons. Obliviousness is effectively guaranteed by executing both the consequent and alternative branches. If the backend is a boolean circuit, this if-construct is further transformed to a selector. If the backend is an arithmetic circuit, the program is transformed to z = b * x + (1 - b) * y. In the following Harpoon code snippet, the variable arr stores a sequence of shared numbers, and the comparison result of max < arr(i) is a shared secret value. Thus, the program

    if (max < arr(i)) { max = arr(i) }

is transformed to

    val b = max < arr(i)
    max = b * arr(i) + (1 - b) * max

Note that such a program transformation is non-trivial for a program allowing mutable states. Currently, an if-statement will be transformed if the side effects of its two branches can be syntactically detected.

Loops and Recursion. All function calls are treated as macros and are simply inlined. All loops are unfolded, as the number of iterations is a compile-time constant. Fig. 4 demonstrates our treatment of recursive calls, where the obliviousness is achieved by using the extra plaintext parameter d in the HIR program of the figure. In the transformed program, the value d is initialized by the Harpoon annotation and decreases with each iteration. This makes sure that the recursive call only iterates d times. Note that the function func is a polymorphic overloading function in HIR.

3.5 Code Generation

Cryptographic Backends. In the context of building circuits, LMS is used to specialize a circuit with respect to a target backend. The outcome of such a programmatic specialization is a compiled target of the circuit. The code generator transforms an abstract circuit to a concrete one for a given backend. For example, the following adder expressed in HIR is specialized to a boolean or arithmetic circuit based on the backend.

    val o1 = Owner();
    output((Num(o1, 10).+(Num(o1, 5))).eval(o1))

The essence of multi-stage programming is to generate efficient programs using high-level constructs without run-time penalty [41]. The example in Fig. 5(a) shows a code snippet that generates a loop. Note that the if condition is composed of a plaintext boolean type, so this code is executed at code generation time, as shown in Fig. 5(b).

Resource Estimation. This is one of the special noteworthy backends: instead of performing a computation, it generates a graphical representation of the HIR circuit, which is


Scala Program:

    val a = 5
    val b = 15
    def gcd(x: Int, y: Int)
        : Int = {
      if (x == 0) y
      else gcd(y % x, x)
    }
    println(gcd(a, b))

Harpoon Program:

    @sec(alice) val a = 5
    @sec(alice) val b = 15
    @bound(5)
    def gcd(x: Int @sec, y: Int @sec)
        : Int @sec = {
      if (x == 0) y
      else gcd(y % x, x)
    }
    @reveal(alice) val r = gcd(a, b)
    println(r)

HIR Program:

    val o = Owner(alice)
    val a = Num(o, 5)
    val b = Num(o, 15)
    val gcd = func((d: Rep[Int], x: Rep[SNum],
        y: Rep[SNum]) => {
      if (d == 0) y
      else if (x == 0) gcd(d-1, x, y)
      else gcd(d - 1, y % x, x)
    })
    val r = Num(o, gcd(5, a.value, b.value)).eval(o)
    println(r)

Figure 4. Compute the Greatest Common Divisor (GCD) of two numbers. The Scala listing shows the textbook implementation. The Harpoon listing shows the Harpoon program. The annotations express that a user, alice, computes the GCD of her private data a and b through a different party, which performs computation on the data in an encrypted form, and provides the encrypted results to alice. The HIR listing shows the corresponding HIR program. The translated gcd function has one extra parameter d, which is initialized by the bound Harpoon annotation and decreases with each iteration.
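The bounded-recursion treatment in Fig. 4 can be modelled in plain Scala. The following is a minimal sketch under stated assumptions, not HACCLE's HIR: secret values are stood in for by plain Long values, and the names Sec, isZero, mux and safeMod are hypothetical helpers introduced only for illustration. It combines the plaintext fuel parameter d with the arithmetic selector z = b * x + (1 - b) * y, so that the same operations run whether or not x is zero.

```scala
object BoundedGcd {
  // Plaintext stand-in for a secret number (assumption: a real backend
  // would use ciphertexts or shares here).
  type Sec = Long

  // Hypothetical secret comparison: returns 1 when x is zero, else 0.
  def isZero(x: Sec): Sec = if (x == 0L) 1L else 0L

  // Arithmetic selector: yields x when b is 1 and y when b is 0.
  def mux(b: Sec, x: Sec, y: Sec): Sec = b * x + (1 - b) * y

  // Modulo that never divides by zero: a zero divisor is replaced by 1.
  def safeMod(y: Sec, x: Sec): Sec = y % mux(isZero(x), 1L, x)

  // d is plaintext fuel, so the recursion unrolls exactly d times.
  def gcd(d: Int, x: Sec, y: Sec): Sec =
    if (d == 0) y // plaintext condition, decided while unrolling
    else {
      val stop = isZero(x)
      // Both outcomes are computed; the secret bit only selects one.
      val nextX = mux(stop, x, safeMod(y, x))
      val nextY = mux(stop, y, x)
      gcd(d - 1, nextX, nextY)
    }
}
```

With a bound of 5, BoundedGcd.gcd(5, 5L, 15L) follows the same five steps as any other input pair. If the bound is too small for the inputs, the result is simply the value reached when the fuel runs out, which is why the bound has to be chosen from public knowledge about the input sizes.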
fed to a generic “Evaluator”. This is a resource estimation program that traverses the graph and performs analysis at each node. The estimator is parameterized on a given resource model, which specifies costs of each node, edge type, and the depth of each edge in the graph.

At the most basic level, the resource estimation framework expects an enumeration of the abstract gates for a particular cost model, a description of how each HIR node type affects these gates, and depths. The total cost is tallied in terms of abstract gates. For example, a cost model for a secret sharing backend may have round complexity and communication complexity as its abstract gates, whereas a circuit backend may have AND, OR, and NOT as its abstract gates. The evaluator traverses the HIR graph and accumulates the abstract gate costs produced by each node, and tracks the maximum total depth encountered for critical path estimation. In the case of a secret sharing scheme, traversing the graph will potentially increment round and communication complexity as new computation nodes are encountered, whereas a circuit backend will increment gate costs. These gate costs are then instantiated with specific costs (in terms of lower-level operations) based on the resource estimates determined by cryptographic experts.

This framework can also be easily extended to evaluate costs that do not follow this simple model. A data structure at each HIR node and a function that performs accumulation of cost based on the type of the node are sufficient to estimate the cost. Cost models can also be parameterized on values which are configurable but known at compile time (e.g., integer bitwidth). The prime modulus can be determined by the security specifications, and specific edge costs. The ability to estimate the cost of a program becomes useful when selecting a target from multiple backends. A program may be better suited for execution on a particular backend than another. If the available backends’ cost models are comparable, then we can generate resource estimations to choose the best one for execution.

(a) HIR code example:

    val sum = func((x: Rep[SNumArray], len: Rep[Int]) => {
      var n = 0
      val b = true
      var res = Num(o1, 0).value
      while (n < len) {
        if (b) { res = res + x(n) }
        n += 1
      }
      res
    })

(b) Generated C code of TFHE backend:

    const LweSample* x3(const LweSample* x4, int x5){
      int x6 = 0;
      const LweSample* x7 = num_init(0, 64 ,x2);
      while (x6 < x5) {
        x7 = add(x7, array_index(x4, x6, 64, x0), 64, x0);
        x6 = x6 + 1;
      }
      return x7;
    }

Figure 5. (a) HIR code example (b) Generated C code of TFHE backend.

Figure 6. HACCLE Compilation Framework. The diagram shows four compiler stages: (1) a Scala program with Harpoon embedded is staged to a Harpoon program; (2) the Harpoon program, together with input parameters, is compiled to protocol-independent HIR; (3) the HIR is specialized to backend-specific HIR (garbled circuits, secret sharing, …), with a resource estimation graph producing backend resource estimates; (4) backend code (Obliv-C, SPDZ, …) is generated.

4 HACCLE Workflow

This section describes the compilation flow of our HACCLE framework as shown in Fig. 6. In the very first stage of the flow, an input program is staged to a complete Harpoon program that consists of an entry point for the inputs provided by the parties, computation and necessary revealing


of results. The Harpoon program is compiled to HIR code, which is one big acyclic circuit, and is further lowered to the protocol-specific HIR program. Finally, the code for a specific backend is generated from the low-level HIR program. Resource estimation models the resource usage of the computation in a specific protocol, and guides the compiler to generate optimal code. The following subsections illustrate the stages of our toolchain from writing an MPC application as a program to executing it using different protocols, and how the type system provides various security guarantees at different stages.

4.1 Specifying the Program

A programmer starts by providing a Scala program that embeds a secure computation, which is written in Harpoon (see Sec. 3.2). The Scala program runs at client locations, and is responsible for processing input, setting up communication channels, etc. The Harpoon program actually performs the secure computation that is written parametrically: effectively, a Harpoon program is a function that accepts the number of parties and their inputs as parameters.

4.2 Generating a Circuit

Stage 1 The first stage of compilation transforms a Scala + Harpoon program to a pure Harpoon program, i.e., executing a Scala program stages away the non-Harpoon fragment of the code: local input files are read into memory and connections are set up to the relevant servers.

After the stage 1 compilation, a Harpoon program represents just the secure computation that must be performed. This program will eventually be transformed to a circuit that performs the desired secure processing. However, the secure computation is not ready for execution yet. Any publicly known information about the inputs (e.g., the bitwidths, or the maximum input size) has not yet been incorporated into the circuit, and the input values are not yet known. At this stage, the Harpoon type system provides the key security guarantee that private data will not leak via public channels.

An important note is that each Harpoon program represents a single secure computation that compiles to a single circuit. Hence, the Harpoon program must compile down to a circuit whose size is determined only by the publicly available information about the inputs. In many applications, there are multiple secure computations that must occur (e.g., in database applications, there may be multiple queries; each query represents a different secure computation). Here, we leverage the blurred distinction between compile time and runtime. Generating a Harpoon program happens at what programmers traditionally consider run time: the Scala program is actually running to produce the Harpoon program. Hence, the Scala program can include a loop over the set of queries, and for each query, a new Harpoon program is generated, compiled and executed. The abstraction in Scala has no runtime overhead for the generated code since it is executed at the Scala runtime, offering the so-called “abstraction without regret” (see Sec. 3.1).

Stage 2 The next step is to generate an abstract circuit: a Harpoon program is compiled down to HIR code (see Sec. 3.3), which is, essentially, a bounded-size and single-assignment representation of the program. Here, the bound annotation in the Harpoon program is used to unroll loops and inline recursive functions, leading to a functional and loop-free representation of the program. The HIR program at this stage is still independent of a particular protocol. Hence, it is essentially a direct translation of the Harpoon program into HIR code without considering the abilities of any particular backend. The key typing guarantee that HIR code provides at this level is that the appropriate HIR operation will be used based on whether inputs to an operation are private or public.

Stage 3 The next compilation stage specializes an HIR circuit to a specific protocol. The choice of protocol is determined by the security specification file. Here, we do not change the language representation of the program: the resulting program is still in HIR. Instead, this stage rewrites HIR code to limit the use of HIR operations to those supported by a particular backend. For example, a backend that only supports boolean operations requires translating all operations on integers and floating point to bit-level operations. Similarly, a backend that only supports operations on integers requires translating floating point operations to decomposed operations on the component parts (mantissa and exponent). Here, HIR switches to the use of backend-specific type systems that enforce the following property: a type-checked backend-specific HIR circuit enforces the requirements of that backend for security (e.g., the set of sharers matches up when performing operations in a secret-sharing backend).

Stage 4 The final step of generating a circuit is specific to a backend implementation. Here, an HIR circuit is translated to be compatible with a particular backend. This is the key module interface provided by our system. It may require translating the circuit to a set of API calls (e.g., our TFHE backend), or to a different programming language (e.g., translating to Obliv-C for the garbled-circuit backend, or Scale-Mamba for the secret-sharing backend). The backend is configured based on the information in the security specification file. At this point, the circuit is in an executable form, and can perform the desired secure computation, using the actual inputs from the various parties.

5 Optimization

Our compiler contains a set of optimizing transformations, e.g., peephole optimizations, common subexpression elimination, constant folding, and dead code elimination. In addition


to those optimizations that a general purpose compiler has, we identified several optimizations specific to Secure MPC circuits. Given an in-memory representation of a boolean or an arithmetic circuit, these optimizations reduce the depth of circuits and the number of costly gates.

5.1 Scalar Multiplication

The multiplicative depth of circuits is the main practical limitation in performing computation over encrypted data. We identify that multiplication can be eliminated when one of the operands of a multiplication is 0 or 1 in plaintext. In addition, consider the case of calculating pow(𝑥, 𝑛), where 𝑥 is an encrypted number. The compiler can divide the computation into subproblems of size 𝑛/2 and call the subproblems recursively. Fig. 7 shows the program of computing pow(2, 8), where 2 is private. The Harpoon program is transformed to HIR code, from which the TFHE program is generated, where the function unum_mul multiplies two 64-bit encrypted numbers. The generated program only needs 𝑂(log 𝑛) multiplies. This optimization is simple, but has a dramatic impact on performance.

Harpoon Program:

    @sec(alice) val a = 2
    scala.math.pow(2, 8)

HIR Program:

    val o1 = Owner()
    UNum(o1, 2).pow(8)

Generated TFHE program:

    const LweSample* x3 = unum_init(2, 64, x2);
    LweSample* x4 = unum_mul(x3, x3, 64, x0);
    LweSample* x5 = unum_mul(x4, x4, 64, x0);
    return unum_mul(x5, x5, 64, x0);

Figure 7. Computing pow (2, 8), where 2 is private, and x0 and x2 are the cloud key and private key used for encryption.

The effectiveness of the optimization is clearly demonstrated in Fig. 8, which shows the graphs of the generated circuits. The circuit before optimization has depth 7 with 7 multiply gates. The circuit after optimization has depth 3 with three multiply gates.

Before optimization:

    x0: Number(64, 2) (13)
    x1: Multiply(x0, x0) (16)
    x2: Multiply(x1, x0) (16)
    x3: Multiply(x2, x0) (16)
    x4: Multiply(x3, x0) (16)
    x5: Multiply(x4, x0) (16)
    x6: Multiply(x5, x0) (16)
    x7: Multiply(x6, x0) (16)

After optimization:

    x0: Number(64, 2) (13)
    x1: Multiply(x0, x0) (16)
    x2: Multiply(x1, x1) (16)
    x3: Multiply(x2, x2) (16)

Figure 8. Graphs of computing pow (2, 8): before and after applying optimizations.

The generated graphs show an abstract model of execution cost where each operation is treated as atomic. However, the resource estimation framework can be specialized to particular backends by providing the corresponding models of execution cost (in terms of communication complexity, number of logic gates, etc.). These backend-specific resource estimates can be used to compare different optimization strategies and intelligently select the appropriate one based on the execution semantics of the targeted backend. In addition, as mentioned in Sec. 3.5, these specialized estimates even let us pick the most suitable backend to target.

5.2 Private Comparison

Private comparison is a major bottleneck in MPC protocols due to its inherent non-arithmetic structure [14]. Private comparison operators include <, ≤, >, ≥, == and ≠. One operator may be encoded by two or more other operators. However, the two expressions may have different costs. We identify some implementation heuristics that help us generate efficient programs.

For example, the HoneyBadgerMPC library provides two comparison protocols: LessThan and Equality. They are used to express 𝑎 < 𝑏 and 𝑎 == 𝑏 on shared values, and return a secret shared value. Building an MPC compiler requires us to implement other operators in terms of these two. For example, a naive and intuitive implementation is to encode 𝑎 ≥ 𝑏 as (𝑏 < 𝑎) + (𝑎 == 𝑏). An alternative way is to encode it as 1 − (𝑎 < 𝑏), i.e., the negation of 𝑎 < 𝑏. Our abstract resource estimator generates one LEQ, one ADD and one EQUAL gate for the first encoding, and one SUB and one LEQ gate for the second encoding. In the HoneyBadgerMPC resource model, the costs of addition and subtraction are trivial since they require no communication, and the multiplication takes one round and one multicast to finish. The round complexity of comparison is seven times that of multiplication [36], and the communication cost is even higher. Also, the cost of an equality check is higher than that of the less-than operation. Thus, we believe that the second encoding is better due to the reduced number of comparisons. This demonstrates how we experiment with optimizations guided by our resource estimators.

To verify the above observation, we perform a set of private comparisons in a HoneyBadgerMPC program (on the same machine used in Sec. 6). Our tests execute 100 times

Table 1. Execution time of evaluating 𝑎 ≥ 𝑏 for 100 times, where 𝑎 and 𝑏 are randomly generated numbers ranging from 1 to 100.

    Encoding of 𝑎 ≥ 𝑏      Execution Time
    (𝑏 < 𝑎) + (𝑎 == 𝑏)     0.23s
    1 − (𝑎 < 𝑏)            0.10s
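The trade-off between the two ways of expressing "greater or equal" can be checked on plaintext values. The following is a minimal sketch, assuming plain Int stand-ins for shared values. The functions lessThan and equality are hypothetical models of the two primitive protocols and are not the HoneyBadgerMPC API. The single-comparison form is written here as one minus the less-than result, which is the negation of "a is less than b".

```scala
object GeEncodings {
  // Plaintext models of the two primitive protocols; each returns 1 or 0.
  def lessThan(a: Int, b: Int): Int = if (a < b) 1 else 0
  def equality(a: Int, b: Int): Int = if (a == b) 1 else 0

  // Encoding with two comparison primitives and one addition.
  def geTwoComparisons(a: Int, b: Int): Int =
    lessThan(b, a) + equality(a, b)

  // Encoding with one comparison primitive and one subtraction.
  def geOneComparison(a: Int, b: Int): Int =
    1 - lessThan(a, b)

  // Exhaustively checks that both encodings agree with a >= b on [lo, hi].
  def agreeOnRange(lo: Int, hi: Int): Boolean =
    (lo to hi).forall { a =>
      (lo to hi).forall { b =>
        val expected = if (a >= b) 1 else 0
        geTwoComparisons(a, b) == expected && geOneComparison(a, b) == expected
      }
    }
}
```

Both encodings return the same bit on every input pair, so the choice between them is purely a matter of cost: the second one invokes a single comparison protocol, and the subtraction needs no communication in a secret-sharing setting.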


of the greater or equal comparison on two randomly generated numbers. Table 1 compares the running time of the two encodings.

6 Evaluation

This section presents three case studies to assess our framework focusing on Harpoon and HIR, optimizing scalar multiplication, and support for indexing arrays with secrets, respectively. For simplicity, the test program uses plaintext values instead of obtaining them at runtime. We conducted our experiments on a machine with 8 Intel Core i7 processors and 16 GB RAM that runs Ubuntu 18.04 LTS.

6.1 Case Study 1: Secure Auctions

Recall the discussion of the practical importance of secure auctions in Sec. 2. This experiment implements a second-price auction that is designed to give bidders confidence to bid their best price without overpaying. The bidder who submits the highest bid is awarded the item and pays the amount of the second-highest bid.

Fig. 9 shows the code snippet in Harpoon, where the elements in arrays bidders and bids denote bidders’ identities and their bids. The implementation uses four variables (fst, snd, ifst and isnd) to store the values of the first and second highest bids and the identities of their holders, respectively. As shown in Fig. 9, writing the Harpoon implementation does not require developers to have cryptographic concerns or a circuit-building mindset. They can program functionally or imperatively, thanks to the expressiveness of Scala.

As mentioned in Sec. 2, our compiler could transform the imperative Harpoon program to a functional style one as (bids zip bidders).map(..).reduce(..), which yields a circuit of logarithmic depth that allows efficient parallel computation.

We have generated SPDZ and HoneyBadgerMPC programs to realize secure auctions. For testing and development, the HoneyBadgerMPC program runs in a simulated network, and contains lines of code dealing with network connections and synchronizations. The Harpoon and HIR developers need not have those concerns.

    @sec var bidders = Array(0, 1 ..., n - 1)
    @sec var bids = Array(b1, b2, ..., bn)
    var ifst = bidders(0); var isnd = bidders(1)
    var fst = bids(0); var snd = bids(1)
    if (bids(0) < bids(1)) {
      // the second bid is higher: swap the initial first and second places
      ifst = bidders(1); fst = bids(1)
      isnd = bidders(0); snd = bids(0)
    } else {
      isnd = bidders(1); snd = bids(1)
    }
    for (i <- 2 until bids.length) {
      if (fst < bids(i)) {
        isnd = ifst; snd = fst
        ifst = bidders(i); fst = bids(i)
      } else if (snd < bids(i)) {
        isnd = bidders(i); snd = bids(i)
      }
    }
    (ifst, snd)

Figure 9. Harpoon code snippet performing a second-price auction, where b1, b2, ..., bn are parameters passed to the method.

6.2 Case Study 2: Matrix-Vector Product

Secure matrix-vector multiplication is a core kernel in many real-world applications. For example, in the area of privacy-preserving machine learning, matrix-vector multiplication is one of the common building blocks of neural networks [47]. During the training and inference procedures, it is often the case that multiple parties combine their data where secure matrix-vector multiplication can be used to preserve privacy.

The case study performs a set of secure matrix-vector multiplications, where one party (the client) has an input matrix, and the other party (the server) has a vector. Fig. 10 shows the test program that randomly generates a 10 ∗ 𝑁 matrix (where 100 ≤ 𝑁 ≤ 500), and multiplies with a fixed vector [1, 399, 1, 413, 1, 587, 1, 354, 1, 444]. The test shows the effectiveness of our optimization discussed in Sec. 5.1.

    val rand = new scala.util.Random
    val start = 1000
    @sec(alice) val m = Array.fill(10)(
      start + rand.nextInt(start + 1))
    val v = Array(1, 399, 1, 413, 1, 587, 1, 354, 1, 444)
    m * v

Figure 10. Test program of Matrix-Vector Multiplication.

Table 2 compares the running time of the generated HElib programs with and without optimizations. As 𝑁 increases from 100 to 500, the speedups become more observable. We report the median absolute runtime before and after the optimization, together with the error at 95% confidence.

Table 2. Execution time (in seconds) of the HElib programs that perform multiplication of a matrix of 10 ∗ 𝑁 and a vector [1, 399, 1, 413, 1, 587, 1, 354, 1, 444] before and after the optimization.

    N                        100     200     300     400    500
    Before (median)          7.57    21.31   55.81   135    292
    Error (confidence 95%)   0.51    0.44    0.72    1.45   2.05
    After (median)           6.32    15.18   34.89   80     162
    Error (confidence 95%)   0.05    0.34    0.23    0.34   0.49

6.3 Case Study 3: Merge Sort

MergeSort is a key computation component of various Secure MPC applications. For example, when multiple parties exchange messages anonymously, both the content and the metadata (e.g., the length of the message) need to be protected. Secure sort is one of the core kernels used for such anonymous communications [1].

This case study implements MergeSort in HIR, as it exposes the language features needed in writing secure computation. The implementation involves array indexing and conditional executions. Notably, an array lookup on a private index is not supported by most programming languages [23].
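Array lookup and update on a private index, as needed by the MergeSort case study, can be modelled as a scan over every element combined with comparators and selectors. The following is a minimal sketch under stated assumptions: secret values are plain Long stand-ins, and Sec, isEqual, mux, read and write are hypothetical names chosen for illustration rather than HIR operations.

```scala
object ObliviousArray {
  // Plaintext stand-in for a secret number (assumption for illustration).
  type Sec = Long

  // Hypothetical secret equality test: returns 1 when equal, else 0.
  def isEqual(a: Sec, b: Sec): Sec = if (a == b) 1L else 0L

  // Arithmetic selector: yields x when b is 1 and y when b is 0.
  def mux(b: Sec, x: Sec, y: Sec): Sec = b * x + (1 - b) * y

  // Read: every element is touched. The one-hot products are summed,
  // and such a sum can be evaluated as a balanced tree of additions.
  def read(arr: Vector[Sec], idx: Sec): Sec =
    arr.zipWithIndex.map { case (v, i) => isEqual(idx, i.toLong) * v }.sum

  // Write: an array copy in which each slot selects between the old
  // element and the update value.
  def write(arr: Vector[Sec], idx: Sec, value: Sec): Vector[Sec] =
    arr.zipWithIndex.map { case (v, i) => mux(isEqual(idx, i.toLong), value, v) }
}
```

In this model the access pattern is identical for every index, which is the property the circuit needs. The cost is linear in the array length for each access, so an ORAM-capable backend could be preferable for large arrays.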


     1  val o1 = Owner(); val s = 0
     2  var arr = NumArray(o1, 3, 1, 5, 2) // input
     3  val e = arr.length
     4  def merge(o: Owner, arr1: NumArray, arr2: NumArray) = {
     5    var res = NewNumArray(o, arr1.length + arr2.length)
     6    var i = Num(o, 0); var j = Num(o, 0); var k = 0
     7    while (k < res.length) {
     8      val b1 = i < Num(o, arr1.length)
     9      val b2 = j < Num(o, arr2.length)
    10      val p = if (b1.not) arr2(j) else if (b2.not) arr1(i)
    11        else if (arr1(i) <= arr2(j)) arr1(i) else arr2(j)
    12      res = res.update(k, p)
    13      // updating arr1 index
    14      i = if (b1.not) i else if (b2.not) i + Num(o, 1) else
    15        if (p == arr1(i)) i + Num(o, 1) else i
    16      // updating arr2 index
    17      j = if (b1.not) j + Num(o, 1) else if (b2.not) j
    18        else if (p == arr2(j)) j + Num(o, 1) else j
    19      k = k + 1
    20    }
    21    res
    22  }
    23  val r = recFuel(10);
    24  val mergesort = r.rec[NumArray, Owner, Int, Int] {
    25    f => (a, o, i, j) => {
    26      val mid = (j - i) / 2
    27      if (mid == 0 || i >= j){ a }
    28      else {
    29        val left = a.slice(i, mid); val right = a.slice(mid, j)
    30        merge(o, f(left, o, 0, left.length),
    31          f(right, o, 0, right.length))}
    32    }
    33  }
    34  val res = mergesort(arr, o1, s, e)
    35  output(res.eval(o1))

Figure 11. MergeSort implemented in HIR.

MergeSort recursively divides an input array into two halves and then merges the two sorted halves. Our implementation is shown in Fig. 11. For the function mergesort, the variable r (line 23) stores a recursion object initialized with the bound 10. The expression r.rec (line 24) is the construct for defining a bounded recursive function call. This allows one to explicitly specify the bound of the recursive function being defined. NumArray is the type for arrays that allow private indexing. The two parameters i and j are plaintext, which is important for unrolling the recursive function at compile time. The function slice(i, j) returns a subarray from the ith element until the jth element, where i and j are plaintext integers. The if-statement (lines 27 to 31) is the standard one, as its condition depends on a plaintext value. The function merge is used for merging two halves. All the if-constructs appearing in this function are oblivious, as their conditions depend on ciphertext values. The loop (line 7) is bounded, as the length of an array is known at compile time.

7 Related Work

There have been many MPC frameworks proposed in recent years and several of them are already integrated into HACCLE. We list the prominent MPC frameworks as follows.

SCALE-MAMBA [29] is an existing MPC framework that is closest to HACCLE. We utilize it as one of our cryptographic backends to implement secret sharing and FHE based protocols. It is a combination of a compiler and a run-time environment where optimizations can be performed at a lower level. Compared with SCALE-MAMBA, HACCLE provides staging driven by type systems, estimates resource consumption, and focuses on optimization at a higher level.

HoneybadgerMPC [32] is another backend of HACCLE that supports secret-sharing based protocols. The uniqueness of HoneybadgerMPC is the combination of a robust online phase and an optimal non-robust offline phase. It provides fairness guarantees even in the asynchronous network setting and also preserves efficiency to make MPC programs practical to run.

As privacy preserving machine learning becomes more and more popular, many frameworks have been developed specifically for this use case, such as ABY [17], ABY3 [34], CHET [16], EzPC [9], CrypTFlow [30] and SecureNN [47]. These frameworks are highly optimized for machine learning and are designed for two-party or three-party settings. We choose not to include them due to our desire to support an arbitrary number of parties. There are also many other MPC frameworks such as Viff [45], Jiff [43], MPyC [4] and PICCO [52]. Theoretically, any framework can be embedded as a backend in HACCLE even though not all of them are integrated at the moment.

8 Conclusion

Secure MPC-based applications play a crucial role in solving many important practical problems such as in high-value procurement. But developing performant MPC-based applications from scratch is a notoriously difficult task as it requires expertise ranging from cryptography to circuit optimization. Therefore software developers need a compiler toolchain for developing MPC-based applications. As a solution to this problem, we have introduced the HACCLE toolchain, a multi-stage compiler for optimized circuit generation. We believe that the HACCLE toolchain offers a compelling approach to the design and implementation of Secure MPC applications, using metaprogramming techniques.

Acknowledgments

We thank the anonymous reviewers for their helpful suggestions and comments. This research is based upon work supported by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), contract #2019-19020700004. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.


References
[1] Nikolaos Alexopoulos, Aggelos Kiayias, Riivo Talviste, and Thomas Zacharias. 2017. MCMix: Anonymous Messaging via Secure Multiparty Computation. In USENIX Security Symposium. USENIX Association, 1217–1234. http://eprint.iacr.org/2017/778
[2] Jonathan Bachrach, Huy Vo, Brian Richards, Yunsup Lee, Andrew Waterman, Rimas Avižienis, John Wawrzynek, and Krste Asanović. 2012. Chisel: Constructing Hardware in a Scala Embedded Language. In Proceedings of the 49th Annual Design Automation Conference (San Francisco, California) (DAC ’12). Association for Computing Machinery, New York, NY, USA, 1216–1225. https://doi.org/10.1145/2228360.2228584
[3] Yuyan Bao, Kirshanthan Sundararajah, Raghav Malik, Qianchuan Ye, Christopher Wagner, Nouraldin Jaber, Fei Wang, Mohammad Hassan Ameri, Donghang Lu, Alexander Seto, Benjamin Delaware, Roopsha Samanta, Aniket Kate, Christina Garman, Jeremiah Blocki, Pierre-David Letourneau, Benoît Meister, Jonathan Springer, Tiark Rompf, and Milind Kulkarni. 2020. HACCLE: Metaprogramming for Secure Multi-Party Computation - Extended Version. CoRR abs/2009.01489 (2020). https://arxiv.org/abs/2009.01489
[4] Barry Schoenmakers. 2020. MPyC: Secure multiparty computation in Python. https://github.com/lschoe/mpyc
[5] Peter Bogetoft, Ivan Damgård, Thomas P. Jakobsen, Kurt Nielsen, Jakob Pagter, and Tomas Toft. 2006. A Practical Implementation of Secure Auctions Based on Multiparty Integer Computation. In Financial Cryptography (Lecture Notes in Computer Science, Vol. 4107). Springer, 142–147. https://doi.org/10.1007/11889663_10
[6] Zvika Brakerski. 2012. Fully Homomorphic Encryption without Modulus Switching from Classical GapSVP. In CRYPTO (Lecture Notes in Computer Science, Vol. 7417). Springer, 868–886. https://doi.org/10.1007/978-3-642-32009-5_50
[7] Zvika Brakerski, Craig Gentry, and Vinod Vaikuntanathan. 2011. Fully Homomorphic Encryption without Bootstrapping. Electron. Colloquium Comput. Complex. (2011), 111. https://eccc.weizmann.ac.il/report/2011/111
[8] Martin Burkhart, Mario Strasser, Dilip Many, and Xenofontas A. Dimitropoulos. 2010. SEPIA: Privacy-Preserving Aggregation of Multi-Domain Network Events and Statistics. In USENIX Security Symposium. USENIX Association, 223–240. http://www.usenix.org/events/sec10/tech/full_papers/Burkhart.pdf
[9] Nishanth Chandran, Divya Gupta, Aseem Rastogi, Rahul Sharma, and Shardul Tripathi. 2019. EzPC: Programmable and Efficient Secure Two-Party Computation for Machine Learning. In EuroS&P. IEEE, 496–511. https://doi.org/10.1109/EuroSP.2019.00043
[10] Jung Hee Cheon, Andrey Kim, Miran Kim, and Yong Soo Song. 2017. Homomorphic Encryption for Arithmetic of Approximate Numbers. In ASIACRYPT (1) (Lecture Notes in Computer Science, Vol. 10624). Springer, 409–437. https://doi.org/10.1007/978-3-319-70694-8_15
[11] Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. 2016. Faster Fully Homomorphic Encryption: Bootstrapping in Less Than 0.1 Seconds. In ASIACRYPT (1) (Lecture Notes in Computer Science, Vol. 10031). 3–33. https://doi.org/10.1007/978-3-662-53887-6_1
[12] Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. 2017. Faster Packed Homomorphic Operations and Efficient Circuit Bootstrapping for TFHE. In ASIACRYPT (1) (Lecture Notes in Computer Science, Vol. 10624). Springer, 377–408. https://doi.org/10.1007/978-3-319-70694-8_14
[13] Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. August 2016. TFHE: Fast Fully Homomorphic Encryption Library. https://tfhe.github.io/tfhe/
[14] Geoffroy Couteau. 2016. Efficient Secure Comparison Protocols. IACR Cryptol. ePrint Arch. (2016), 544. http://eprint.iacr.org/2016/544
[15] Ivan Damgård, Marcel Keller, Enrique Larraia, Valerio Pastro, Peter Scholl, and Nigel P. Smart. 2013. Practical Covertly Secure MPC for Dishonest Majority - Or: Breaking the SPDZ Limits. In ESORICS (Lecture Notes in Computer Science, Vol. 8134). Springer, 1–18. https://doi.org/10.1007/978-3-642-40203-6_1
[16] Roshan Dathathri, Olli Saarikivi, Hao Chen, Kim Laine, Kristin E. Lauter, Saeed Maleki, Madanlal Musuvathi, and Todd Mytkowicz. 2019. CHET: an optimizing compiler for fully-homomorphic neural-network inferencing. In PLDI. ACM, 142–156. https://doi.org/10.1145/3314221.3314628
[17] Daniel Demmler, Thomas Schneider, and Michael Zohner. 2015. ABY - A framework for efficient mixed-protocol secure two-party computation. In NDSS. The Internet Society. https://www.ndss-symposium.org/ndss2015/aby---framework-efficient-mixed-protocol-secure-two-party-computation
[18] Jack Doerner, David Evans, and Abhi Shelat. 2016. Secure Stable Matching at Scale. In CCS. ACM, 1602–1613. https://doi.org/10.1145/2976749.2978373
[19] Grégory M. Essertel, Guannan Wei, and Tiark Rompf. 2019. Precise reasoning with structured time, structured heaps, and collective operations. Proc. ACM Program. Lang. 3, OOPSLA (2019), 157:1–157:30. https://doi.org/10.1145/3360583
[20] David Evans, Vladimir Kolesnikov, and Mike Rosulek. 2018. A Pragmatic Introduction to Secure Multi-Party Computation. Found. Trends Priv. Secur. 2, 2-3 (2018), 70–246. https://doi.org/10.1561/3300000019
[21] Nicolas Gama, Malika Izabachène, Phong Q. Nguyen, and Xiang Xie. 2016. Structural Lattice Reduction: Generalized Worst-Case to Average-Case Reductions and Homomorphic Cryptosystems. In EUROCRYPT (2) (Lecture Notes in Computer Science, Vol. 9666). Springer, 528–558. https://doi.org/10.1007/978-3-662-49896-5_19
[22] Trinabh Gupta, Henrique Fingler, Lorenzo Alvisi, and Michael Walfish. 2017. Pretzel: Email encryption and provider-supplied functions are compatible. In SIGCOMM. ACM, 169–182. https://doi.org/10.1145/3098822.3098835
[23] Marcella Hastings, Brett Hemenway, Daniel Noble, and Steve Zdancewic. 2019. Sok: General purpose compilers for secure multi-party computation. In 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 1220–1237. https://doi.org/10.1109/SP.2019.00028
[24] Markus Hinkelmann, Andreas Jakoby, Nina Moebius, Tiark Rompf, and Peer Stechert. 2011. A cryptographically t-private auction system. Concurr. Comput. Pract. Exp. 23, 12 (2011), 1399–1413.
[25] Andreas Holzer, Martin Franz, Stefan Katzenbeisser, and Helmut Veith. 2012. Secure two-party computations in ANSI C. In Proceedings of the 2012 ACM conference on Computer and communications security. 772–783. https://doi.org/10.1145/2382196.2382278
[26] Karthik A Jagadeesh, David J Wu, Johannes A Birgmeier, Dan Boneh, and Gill Bejerano. 2017. Deriving genomic diagnoses without revealing patient genomes. Science (2017).
[27] David Koeplinger, Matthew Feldman, Raghu Prabhakar, Yaqi Zhang, Stefan Hadjis, Ruben Fiszel, Tian Zhao, Luigi Nardi, Ardavan Pedram, Christos Kozyrakis, and Kunle Olukotun. 2018. Spatial: A Language and Compiler for Application Accelerators. SIGPLAN Not. 53, 4 (June 2018), 296–311. https://doi.org/10.1145/3296979.3192379
[28] Benjamin Kreuter. 2017. Secure MPC at Google. Real World Crypto.
[29] KU Leuven. 2019. SCALE-MAMBA Software. https://homes.esat.kuleuven.be/~nsmart/SCALE/.
[30] Nishant Kumar, Mayank Rathee, Nishanth Chandran, Divya Gupta, Aseem Rastogi, and Rahul Sharma. 2020. CrypTFlow: Secure TensorFlow Inference. In IEEE Symposium on Security and Privacy. IEEE, 336–353. https://doi.org/10.1109/SP40000.2020.00092
[31] Chang Liu, Xiao Shaun Wang, Kartik Nayak, Yan Huang, and Elaine Shi. 2015. Oblivm: A programming framework for secure computation. In 2015 IEEE Symposium on Security and Privacy. IEEE, 359–376. https://doi.org/10.1109/SP.2015.29


[32] Donghang Lu, Thomas Yurek, Samarth Kulshreshtha, Rahul Govind, Aniket Kate, and Andrew Miller. 2019. HoneyBadgerMPC and AsynchroMix: Practical Asynchronous MPC and its Application to Anonymous Communication. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 887–903. https://doi.org/10.1145/3319535.3354238
[33] Dahlia Malkhi, Noam Nisan, Benny Pinkas, and Yaron Sella. 2004. Fairplay - Secure Two-Party Computation System. In USENIX Security Symposium. USENIX, 287–302. http://www.usenix.org/publications/library/proceedings/sec04/tech/malkhi.html
[34] Payman Mohassel and Peter Rindal. 2018. ABY3: A Mixed Protocol Framework for Machine Learning. In CCS. ACM, 35–52. https://doi.org/10.1145/3243734.3243760
[35] Aseem Rastogi, Matthew A. Hammer, and Michael Hicks. 2014. Wysteria: A Programming Language for Generic, Mixed-Mode Multiparty Computations. In IEEE Symposium on Security and Privacy. IEEE Computer Society, 655–670. https://doi.org/10.1109/SP.2014.48
[36] Tord Ingolf Reistad and Tomas Toft. 2007. Secret sharing comparison by transformation and rotation. In International Conference on Information Theoretic Security. Springer, 169–180. https://doi.org/10.1007/978-3-642-10230-1_14
[37] Tiark Rompf and Kevin J Brown. 2017. Functional parallels of sequential imperatives (short paper). In Proceedings of the 2017 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation. 83–88. https://doi.org/10.1145/3018882.3018891
[38] Tiark Rompf and Martin Odersky. 2010. Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs. In GPCE. ACM, 127–136. https://doi.org/10.1145/1868294.1868314
[39] Shai Halevi and Victor Shoup. April 2013. HElib: Design and Implementation of a Homomorphic-Encryption Library. https://github.com/shaih/HElib
[40] Adi Shamir. 1979. How to Share a Secret. Commun. ACM 22, 11 (1979), 612–613. https://doi.org/10.1145/359168.359176
[41] Walid Taha. 2003. A Gentle Introduction to Multi-stage Programming. In Domain-Specific Program Generation (Lecture Notes in Computer Science, Vol. 3016). Springer, 30–50. https://doi.org/10.1007/978-3-540-25935-0_3
[42] Walid Taha and Tim Sheard. 2000. MetaML and multi-stage programming with explicit annotations. Theor. Comput. Sci. 248, 1-2 (2000), 211–242. https://doi.org/10.1016/S0304-3975(00)00053-0
[43] Multiparty.org Development Team. 2020. JavaScript implementation of federated functionalities. https://github.com/multiparty/jiff
[44] The PALISADE team. 2021. PALISADE, homomorphic encryption software library. https://palisade-crypto.org/
[45] The VIFF team. 2021. VIFF, the virtual ideal functionality framework. http://viff.dk/
[46] William Vickrey. 1961. Counterspeculation, Auctions, and Competitive Sealed Tenders. The Journal of Finance 16, 1 (1961), 8–37. https://doi.org/10.1111/j.1540-6261.1961.tb02789.x
[47] Sameer Wagh, Divya Gupta, and Nishanth Chandran. 2019. SecureNN: 3-Party Secure Computation for Neural Network Training. Proc. Priv. Enhancing Technol. 2019, 3 (2019), 26–49. https://doi.org/10.2478/popets-2019-0035
[48] Xiao Wang, Alex J. Malozemoff, and Jonathan Katz. 2016. EMP-toolkit: Efficient MultiParty computation toolkit. https://github.com/emp-toolkit.
[49] Xiao Shaun Wang, Yan Huang, Yongan Zhao, Haixu Tang, XiaoFeng Wang, and Diyue Bu. 2015. Efficient Genome-Wide, Privacy-Preserving Similar Patient Query based on Private Edit Distance. In CCS. ACM, 492–503. https://doi.org/10.1145/2810103.2813725
[50] Andrew Chi-Chih Yao. 1982. Protocols for Secure Computations (Extended Abstract). In FOCS. IEEE Computer Society, 160–164. https://doi.org/10.1109/SFCS.1982.38
[51] Samee Zahur and David Evans. 2015. Obliv-C: A Language for Extensible Data-Oblivious Computation. IACR Cryptol. ePrint Arch. (2015), 1153. http://eprint.iacr.org/2015/1153
[52] Yihua Zhang, Aaron Steele, and Marina Blanton. 2013. PICCO: a general-purpose compiler for private distributed computation. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. 813–826. https://doi.org/10.1145/2508859.2516752
