In this section, we present the main techniques underlying our transformation from succinct arguments of knowledge with small multiplicative prover overhead to SPARKs.
2.2 Extending SPARKs to Arbitrary Computations
The focus of this work is extending the above example to handle arbitrary non-deterministic polynomial-time computation (possibly with a long output), which introduces many complications. For now, we focus on the case of RAM computation that uses only a single processor (we later show how to extend this to arbitrary parallel RAM computations). Specifically, suppose we are given a statement \( (M,x,T) \) with witness w, where M is a RAM machine, and we want to prove that \( M(x,w) \) outputs some value y within T steps. We emphasize that our goal is to capture general non-deterministic, polynomial-time computation where the output y is not known in advance, so we would like to simultaneously compute y given \( (M,x,T) \) and w, and prove its correctness. Since M is a RAM machine, it has access to some (potentially large) memory D consisting of n words. We let \( \lambda \) be the security parameter and the size of a word, and T be an arbitrary polynomial in \( \lambda \). Let us try to employ the above strategy in this more general setting.
As M does not necessarily implement an iterated function, the first problem we encounter is that there is no natural way to split the computation into many sub-computations with small input and output. For intermediate statements, the naïve solution would be to prove that running the RAM machine M for k steps starting at some initial memory \( D_{\mathsf {start}} \) results in final memory \( D_{\mathsf {final}} \) . However, this is a problem, because the size of the memory, n, may be large—perhaps even as large as the full running time T—so the intermediate statements we need to prove may be huge!
A natural attempt to mitigate this would be to instead provide a succinct digest of the memory at the beginning and end of each sub-computation and then have the prover additionally prove that it knows a memory string consistent with each digest. Concretely, each sub-computation corresponding to k steps of computation would contain digests \( c_{\mathsf {start}}, c_{\mathsf {final}} \) . The prover would show that there exist strings \( D_\mathsf {start} \) , \( D_\mathsf {final} \) such that (1) \( c_\mathsf {start} \) , \( c_\mathsf {final} \) are digests of \( D_\mathsf {start} \) , \( D_\mathsf {final} \) , respectively, and (2) starting with memory \( D_\mathsf {start} \) and running RAM machine M for k steps results in memory \( D_\mathsf {final} \) . This seems like a step in the right direction, since the statement size for each sub-computation would only depend on the output size of the digest and not the size of the memory. However, the prover’s witness—and hence running time to prove each sub-computation—still scales linearly with the size of the memory in this approach. Therefore, the main challenge we are faced with is removing the dependence on the memory size in the witness of the sub-computations.
Using local updates. To overcome the above issues, we observe that in each sub-computation the prover only needs to prove that the transition from the initial digest \( c_{\mathsf {start}} \) to the final digest \( c_{\mathsf {final}} \) is consistent with k steps of computation done by M. At a high level, we do so by proving that there exists a sequence of k local updates to \( c_{\mathsf {start}} \) that result in \( c_{\mathsf {final}} \) . Then, to verify a sub-computation corresponding to k steps, we can simply check the k local updates to the digest of the memory, rather than checking the memory in its entirety. To formalize this idea, we rely on compressing hash functions that allow for local updates that can be efficiently computed in parallel to the main computation. We call these concurrently updatable hash functions.
Given such hash functions, we will use a succinct argument of knowledge \( (\mathcal {P} _\mathsf {sARK},\mathcal {V} _\mathsf {sARK}) \) for an \( \mathbf {NP} \) language \( \mathcal {L}_{\mathsf {upd}} \) that corresponds to checking that a sequence of local updates is valid. Specifically, a statement \( (M,x,k,c_\mathsf {start}, c_\mathsf {final}) \in \mathcal {L}_{\mathsf {upd}} \) if and only if there exists a sequence of updates \( u_1, \ldots , u_k \) such that, starting with the short digest \( c_\mathsf {start} \), running M on input x for k steps specifies the updates \( u_1,\ldots , u_k \), and applying them results in the digest \( c_\mathsf {final} \). Then, as long as the updates are themselves succinct, the size of the witness scales only with the number of steps of the computation and not with the size of the memory.
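To make the relation concrete, the following sketch spells out the check underlying \( \mathcal {L}_{\mathsf {upd}} \) in Python. It is a minimal illustration under our own naming, not the paper's formal definition: the machine's transition function (`step`, which folds in M and x) and the hash function's opening and update procedures (`verify_open`, `apply_update`) are abstracted as callables, and the `Update` record is a hypothetical encoding of a single local update.

```python
from typing import Callable, NamedTuple

class Update(NamedTuple):      # hypothetical encoding of one local update u_i
    addr: int                  # memory location accessed at this step
    read_val: bytes            # value claimed to be stored at addr
    write_val: bytes           # value written back to addr
    proof: list                # authentication path opening addr under the digest

def in_L_upd(step: Callable, verify_open: Callable, apply_update: Callable,
             state, c_start, c_final, updates) -> bool:
    """Accept iff the updates are exactly those specified by k steps of M on x
    (both folded into `step`/`state`) and they carry c_start to c_final."""
    digest = c_start
    for u in updates:
        # The claimed read value must open correctly under the current digest.
        if not verify_open(digest, u.addr, u.read_val, u.proof):
            return False
        # One step of M fixes which address is touched and what is written there.
        state, addr, write_val = step(state, u.read_val)
        if (addr, write_val) != (u.addr, u.write_val):
            return False
        digest = apply_update(digest, u)     # advance the digest by this update
    return digest == c_final
```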
To make the above approach work, we need updatable hash functions that satisfy the following two properties:
(1) Updates can be computed efficiently in parallel to the main computation.
(2) Updates can be verified as modifying only the specified locations in memory.
We next explain how we obtain the required hash functions satisfying the above properties. We believe that this primitive and the techniques used to obtain it are of independent interest.
Concurrently Updatable Hash Functions. Roughly speaking, concurrently updatable hash functions are computationally binding hash functions that support updating parts of the underlying message without re-hashing the whole message. For efficiency, we additionally require that one can perform several sequential updates concurrently. For soundness, we require that no efficient adversary can find two different openings for the same location, even if it is allowed to perform polynomially many update operations. A formal definition appears in Section 5.
We focus on the case where each update is local (a single word per timestep), but we show how to extend this to updating many words in parallel in Section 5. Our construction relies on Merkle trees [43] and hence can be instantiated with any collision-resistant hash function. Recall that a Merkle tree uses a compressing hash function, which we assume for simplicity is given by \( h:\lbrace 0,1\rbrace ^{2\lambda }\rightarrow \lbrace 0,1\rbrace ^{\lambda } \), and is obtained via a binary tree structure where nodes are associated with values. The leaves are associated with arbitrary values, and each internal node is associated with a value that is the hash of the concatenation of its children’s values.
It is well known that Merkle trees, when instantiated with a collision-resistant hash function h, act as short (binding) commitments with local opening. The latter property enables proving claims about specific blocks in the input without opening the whole input, by revealing the authentication path from some input block to the root (i.e., the hashes corresponding to sibling nodes along the path from the leaf to the root). Not only do Merkle trees have the local opening property, but the same technique allows for local updates. Namely, one can update the value of a specific word in the input and compute the new root value without recomputing the whole tree (by updating the hashes along the path from the updated block to the root). All of these local procedures cost time that is proportional to the depth of the tree, \( \log _2 n \) , as opposed to the full memory n. We denote this update time as \( \beta \) (which may additionally depend polynomially on \( \lambda \) , for example, to compute the hash function at each level in the tree).
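The following sketch illustrates these standard local procedures, instantiating h with SHA-256 and storing the tree as a flat array; it is a minimal illustration of the textbook technique, not our full construction. Both `open` and `update` touch only the \( \log _2 n \) nodes on one root-to-leaf path.

```python
import hashlib

# SHA-256 stands in for the generic collision-resistant h : {0,1}^{2λ} -> {0,1}^λ.
def H(left, right):
    return hashlib.sha256(left + right).digest()

class MerkleTree:
    """Heap layout: node i has children 2i and 2i+1; leaves sit at n..2n-1."""

    def __init__(self, leaves):            # len(leaves) must be a power of two
        self.n = len(leaves)
        self.nodes = [b""] * self.n + list(leaves)
        for i in range(self.n - 1, 0, -1):
            self.nodes[i] = H(self.nodes[2 * i], self.nodes[2 * i + 1])

    def root(self):
        return self.nodes[1]

    def open(self, index):
        """Authentication path: sibling hashes from leaf `index` up to the root."""
        i, path = self.n + index, []
        while i > 1:
            path.append(self.nodes[i ^ 1])  # sibling of node i
            i //= 2
        return path

    def update(self, index, value):
        """Rewrite one leaf and rehash only its root path: O(log n) work."""
        i = self.n + index
        self.nodes[i] = value
        while i > 1:
            i //= 2
            self.nodes[i] = H(self.nodes[2 * i], self.nodes[2 * i + 1])

def verify_open(root, index, value, path):
    """Recompute the root from a claimed leaf value and its authentication path."""
    h = value
    for sibling in path:
        h = H(h, sibling) if index % 2 == 0 else H(sibling, h)
        index //= 2
    return h == root
```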
Let us see what happens when we use Merkle trees as our hash function. Recall that the Merkle tree holds a hash of the memory at every step of the computation, and we update its value after each such step. The latter operation, as mentioned above, takes \( \beta \) time. So even with local updates, using Merkle trees naïvely incurs a \( \beta \) delay for every update operation, which implies a \( \beta \) multiplicative delay for the whole computation (which we want to avoid)! To handle this, we use a pipelining technique to perform the local updates in parallel.
Pipelining updates. Consider two updates \( u_1 \) and \( u_2 \) that we want to apply to the current Merkle tree sequentially. We observe that, since Merkle tree updates work “level by level,” we can first update the first level of the tree (corresponding to the leaves) according to \( u_1 \). Then, we update the second level according to \( u_1 \) and in parallel update the first level using \( u_2 \); next, we update the third level according to \( u_1 \) and in parallel the second level using \( u_2 \), and so on. This idea generalizes to pipeline \( u_1,\ldots ,u_k \), so the final update \( u_k \) completes after \( (k-1)+\beta \) steps, and the memory is consistent with the Merkle tree given by performing the update operations \( u_1,\ldots ,u_k \) sequentially. Implementing this idea requires \( \beta \) additional parallel threads, since the computation of at most \( \beta \) updates overlaps at any given time. A key point that allows us to pipeline these concurrent updates is that, in a standard Merkle tree, the operations at each level are data-independent: each processor can perform all of the reads/writes to a given level at a single timestep, and the next processor can continue at the next timestep without incurring any delay.
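The sketch below simulates this pipelining schedule sequentially (an actual implementation would run the inner iterations on separate processors); it assumes the `MerkleTree` sketch above, with \( \beta \) taken as the number of levels on a leaf-to-root path. Update \( u_j \) performs its level-\( \ell \) work at timestep \( j+\ell \), so at most \( \beta \) updates are in flight at once.

```python
def apply_level(tree, upd, level):
    """The level-`level` portion of one local update (leaf write or rehash)."""
    index, value = upd
    if level == 0:
        tree.nodes[tree.n + index] = value
    else:
        i = (tree.n + index) >> level        # ancestor of the leaf at this level
        tree.nodes[i] = H(tree.nodes[2 * i], tree.nodes[2 * i + 1])

def pipelined_updates(tree, updates):
    """Sequentially simulate the pipeline; finishes in (k - 1) + depth timesteps."""
    depth = tree.n.bit_length()              # log2(n) + 1 levels per update
    k = len(updates)
    for t in range(k + depth - 1):           # one iteration = one parallel timestep
        # In-flight updates at timestep t; iterating j in increasing order models
        # that update u_j is always one level ahead of u_{j+1}.
        for j in range(max(0, t - depth + 1), min(k, t + 1)):
            apply_level(tree, updates[j], level=t - j)
```

For example, `pipelined_updates(tree, [(3, b"x" * 32), (5, b"y" * 32)])` leaves the tree in the same state as calling `tree.update(3, ...)` followed by `tree.update(5, ...)` sequentially.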
Verifying that updates are local. With regards to the soundness of this primitive, a subtle—yet important—point that we need in our application is that it must be possible to prove that a valid update only modifies the locations it specifies. For example, suppose a cheating prover updates the digest with respect to one location in memory while simultaneously rewriting other locations in memory in a way that does not correspond to the memory access done by the machine M. Then, the prover will later be able to open inconsistent values and prove that M computes whatever it wants. Moreover, the prover could gradually make these changes across many different updates. Fortunately, the structure of Merkle trees allows us to prove that a local update only changes a single location. At a high level, this is because the authentication path for a leaf in a Merkle tree effectively binds the root of the tree to the entire memory. Thus, we show that if a Merkle tree is updated at some location, then one can use the authentication path to prove that no other locations were modified. Furthermore, we show how to extend this to the general case of updating many locations in a single update.
Ensuring optimal prover runtime. Using the above ingredients, we discuss how to put everything together to ensure optimal prover runtime. Concretely, suppose we have a concurrently updatable hash function where each update takes time \( \beta \), and a succinct non-interactive argument of knowledge with quasilinear prover overhead for the language \( \mathcal {L}_{\mathsf {upd}} \). Recall that a statement \( (M,x,k,c_\mathsf {start}, c_\mathsf {final}) \in \mathcal {L}_{\mathsf {upd}} \) if there exists a sequence of k hash function updates such that (1) the updates are consistent with the computation of M and (2) applying these updates to \( c_{\mathsf {start}} \) results in \( c_{\mathsf {final}} \). Let \( \alpha ^\star \) be the multiplicative overhead of the succinct argument with respect to the number of updates (so a computation with \( k \le T \) updates takes time \( k \cdot \alpha ^\star \) to prove). Note that \( \alpha ^\star \in \mathrm{poly} (\beta ,\log T) \), as we require that the total time to prove an \( \mathcal {L}_{\mathsf {upd}} \) statement is quasilinear in the work, and a statement for at most T updates requires \( T \cdot \beta \) total work.
As discussed above, to prove that \( M(x,w) \) outputs a value y in T steps, we split the computation into m sub-computations that all complete by time T. The ith sub-computation consists of a “compute” phase, where we perform \( k_i \) of the T total computation steps, and a “proof” phase, where we use the succinct argument to prove correctness of those \( k_i \) steps. For the “compute” phase, recall that performing \( k_i \) steps of computation while also updating the digest takes \( k_i \cdot \beta \) total work. However, as described above, we can pipeline these updates so the parallel time to compute them is only \( (k_i - 1) + \beta \).
For the “proof” phase to complete in the desired amount of time, we need to set the values of \( k_i \) appropriately. Each proof for \( k_{i} \le T \) steps of computation takes at most \( k_i \cdot \alpha ^\star \) time. Therefore, the largest “chunk” of computation we can compute and prove by roughly time T is \( T/(\alpha ^\star + 1) \). For convenience, let \( \gamma \triangleq \alpha ^\star + 1 \). Then, in the first sub-computation, we can compute and prove \( k_{1} = T/\gamma \) steps of computation. In each subsequent sub-computation, we compute and prove a \( 1/\gamma \) fraction of the remaining computation. Putting everything together, we get that \( k_i = (T/\gamma) \cdot (1-1/\gamma)^{i-1} \) for \( i \in [m-1] \), and \( k_m < \gamma \) is the number of remaining steps such that \( \sum _{i=1}^m k_i = T \). This results in roughly \( \gamma \log T \in \mathrm{poly} (\beta , \log T) \) total sub-proofs, meaning that the proof size depends only polylogarithmically on T.
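As a sanity check of this schedule, the short sketch below computes the chunk sizes \( k_i \) for illustrative (hypothetical) values of T and \( \gamma \), confirming that they sum to T, that the final chunk has \( k_m < \gamma \), and that the number of chunks m is on the order of \( \gamma \log T \).

```python
import math

def schedule(T, gamma):
    """Chunk sizes k_1, ..., k_m: each chunk is a 1/gamma fraction of what remains."""
    ks, remaining = [], T
    while remaining >= gamma:
        k = remaining // gamma
        ks.append(k)
        remaining -= k
    if remaining:
        ks.append(remaining)                 # final chunk: k_m < gamma steps
    return ks

T, gamma = 1_000_000, 10                     # illustrative values only
ks = schedule(T, gamma)
assert sum(ks) == T and ks[-1] < gamma
print(f"m = {len(ks)} chunks, gamma * log T ~ {gamma * math.log(T):.0f}")
```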
In Figure 1, we show the structure of the compute and proof phases for all m sub-computations. We emphasize that the entire protocol completes within \( T+\alpha ^{\star } \cdot \gamma + \beta \) parallel time, since the first \( m-1 \) sub-proofs complete by time \( T + \beta \), all m sub-computations complete by time \( T+\beta \), and the proof of the final \( \gamma \) steps takes roughly \( \alpha ^{\star } \cdot \gamma \) time. Since \( \alpha ^{\star } \), \( \gamma \), and \( \beta \) are in \( \mathrm{poly} (\lambda ,\log T) \), this implies that we only have a small additive, rather than multiplicative, overhead.
We note that in the overview above, where we discuss SPARKs for iterated functions, correctness of the final sub-computation is proven by having the prover send the witness in the clear and having the verifier check it directly. In our full construction, we instead have the prover give a succinct proof for the last sub-computation. The main reason for this is that, for the case of general parallel RAM computations, we want the communication complexity and the complexity of the verifier to depend only polylogarithmically on the depth T and the number of processors \( \rho \) used in the original computation. However, the witness for the final sub-computation may have length linear in \( \rho \) (since at each step in the final sub-computation, the witness may specify the actions of each of the \( \rho \) processors). Having the prover instead provide a succinct proof solves this issue.
Next, we note that there is a \( \beta \) gap between the time that the “compute” phase ends and the “proof” phase begins for a particular sub-computation. This is because we have to wait \( \beta \) additional time for the updates to finish before we can start the proof. However, we can immediately start computing the next sub-computation without waiting for these updates to complete. Last, consider the number of processors used in the protocol: the “compute” phase, which runs at all times and additionally computes updates to the digest in parallel, uses \( \beta \) processors, and running each of the m sub-proofs in parallel costs at most a factor of m times the number of processors used by a single sub-proof.
Computing the initial digest. Before giving the full protocol, we address a final issue: the prover must compute the digest of the initial memory string. Specifically, the prover needs to hash a string \( D \in \{0,1 \}^n \), which the RAM machine M assumes contains its input \( (x,w) \). Directly hashing the string \( x || w \) would require roughly \( \left|x \right|+\left|w \right| \) additional time, which could be as large as T. To circumvent the need to compute the initial digest, we simply do not compute a digest of the initial memory! Instead, we start with a digest of an uninitialized memory that can be computed efficiently and allows each position to be initialized exactly once, whenever it is first accessed.
We extend our hash function definition to enable this as follows. We start with a dummy value \( \bot \) for the leaves of the Merkle tree. Because the leaves all have the same value, we can compute the root of the Merkle tree efficiently without touching all of the nodes in the tree. Specifically, if the leaves have the value \( \mathbf {dummy}(0) = \bot \), then we can define the value of the nodes at level j recursively as \( \mathbf {dummy}(j) = h(\mathbf {dummy}(j-1) || \mathbf {dummy}(j-1)) \). The initial digest is then just the root \( \mathbf {dummy}(\log n) \). Note that the prover does not need to initialize the whole tree in memory with dummy values; it simply needs to compute \( \mathbf {dummy}(\log n) \) as the initial digest.
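The following sketch shows this computation, reusing the SHA-256-based `H` from the Merkle tree sketch above and a fixed byte string as a stand-in encoding of \( \bot \): the initial digest takes only \( \log n \) hash evaluations rather than the roughly 2n needed to build the full tree.

```python
DUMMY0 = b"\x00" * 32          # stand-in 32-byte encoding of the dummy value ⊥

def dummy(j):
    """Value shared by every uninitialized node at level j of the tree."""
    v = DUMMY0
    for _ in range(j):
        v = H(v, v)            # dummy(j) = h(dummy(j-1) || dummy(j-1))
    return v

def initial_digest(n):
    """Root of the all-dummy tree over n leaves: log2(n) hash calls in total."""
    return dummy(n.bit_length() - 1)
```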
Whenever the prover accesses a location in D for the first time, it performs the corresponding local update to the Merkle tree. However, performing this update is non-trivial as many of the nodes in the Merkle tree may still be uninitialized. What saves us is that any uninitialized node must correspond to leaves that are also uninitialized, so they still have the value \( \bot \) . As such, we can compute the value of any uninitialized node at level j efficiently as \( \mathbf {dummy}(j) \) . To maintain efficiency, the prover can keep track of a bit for each node to check if it has been initialized or not.
Given a single authentication path for a newly initialized location in memory, the verifier can check that this path is a valid opening for \( \bot \) with the previous digest and for the new value with the updated digest. This guarantees that only the newly initialized value was modified, and the verifier can ensure that each location is initialized at most once by disallowing the prover from updating locations to \( \bot \). Furthermore, the verifier can check that any initialized value not part of the witness (corresponding to the input x) is consistent with what M expects.
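A minimal sketch of this verifier check, reusing `verify_open` and the \( \bot \) encoding from the sketches above: the same authentication path is checked against both digests, which, since the path binds the root to the entire memory, pins down that only the opened location changed.

```python
def check_initialization(old_digest, new_digest, index, new_value, path):
    """Accept iff `path` opens `index` to ⊥ under the old digest and to the new
    (non-⊥) value under the new digest, so no other location changed."""
    if new_value == DUMMY0:                  # locations may never be reset to ⊥
        return False
    return (verify_open(old_digest, index, DUMMY0, path) and
            verify_open(new_digest, index, new_value, path))
```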