1 Introduction

Interactive proofs [62] are central to modern cryptography and complexity theory. One extensively studied aspect of interactive proofs is their expressibility, culminating with the result \({\mathsf {IP}}={\mathsf {PSPACE}}\) [96]. Another aspect, which is the focus of this work, is that proofs for \({\mathsf {NP}}\) statements can potentially be much shorter than an \({\mathsf {NP}}\) witness and be verified much faster than the time required for checking the \({\mathsf {NP}}\) witness.

1.1 Background

Succinct interactive arguments. In interactive proofs for \({\mathsf {NP}}\) with statistical soundness, significant savings in communication (let alone verification time) are unlikely [21, 58, 67, 102]. If we settle for proof systems with computational soundness, known as argument systems [9], then significant savings can be made. Using collision-resistant hashes (\(\text{ CRH } \)s) and probabilistically-checkable proofs (\(\text{ PCP } \)s) [16], Kilian [74] showed a four-message interactive argument for \({\mathsf {NP}}\) where, to prove membership of an instance \(x\) in a given \({\mathsf {NP}}\) language L with \({\mathsf {NP}}\) machine \(M_{L}\), communication and verification time are bounded by \({\mathrm {poly}}(\lambda + |M_{L}| + |x| + \log t)\), and the prover’s running time is \({\mathrm {poly}}(\lambda + |M_{L}| + |x| + t)\). Here, \(t\) is the classical \({\mathsf {NP}}\) verification time of \(M_{L}\) for the instance \(x\), \(\lambda \) is a security parameter, and \({\mathrm {poly}}\) is a universal polynomial (independent of \(\lambda \), \(M_{L}\), \(x\), and \(t\)). We call such argument systems succinct.

Proof of knowledge. A natural strengthening of computational soundness is (computational) proof of knowledge: it requires that, whenever the verifier is convinced by an efficient prover, not only can we conclude that a valid witness for the theorem exists, but also that such a witness can be extracted efficiently from the prover. This property is satisfied by most proof system constructions, including the aforementioned one of Kilian [19], and is useful in many applications of succinct arguments.

Removing interaction. Kilian’s protocol requires four messages. A challenge, which is of both theoretical and practical interest, is the construction of non-interactive succinct arguments. As a first step in this direction, Micali [82] showed how to construct publicly-verifiable one-message succinct non-interactive arguments for \({\mathsf {NP}}\), in the random oracle model, by applying the Fiat–Shamir heuristic [54] to Kilian’s protocol. In the plain model, one-message solutions are impossible for hard-enough languages (against non-uniform provers), so one usually considers the weaker goal of two-message succinct arguments where the verifier message is generated independently of the statement to be proven. Following [68], we call such arguments \(\text{ SNARG } \)s. More precisely, a \(\text{ SNARG } \) for a language L is a triple of algorithms \((G ,P,V)\) where: the generator \(G\), given the security parameter \(\lambda \), samples a reference string \(\sigma \) and a corresponding verification state \(\tau \) (\(G\) can be thought to be run during an offline phase, by the verifier, or by someone the verifier trusts); the (honest) prover \(P(\sigma ,x,w)\) produces a proof \(\pi \) for the statement “\(x\in L\)” given a witness \(w\); then, \(V(\tau ,x,\pi )\) verifies the validity of \(\pi \). Soundness should hold even if \(x\) is chosen depending on \(\sigma \).

Gentry and Wichs [68] showed that no \(\text{ SNARG } \) can be proven secure via a black-box reduction to a falsifiable assumption [85]; this may justify using non-standard assumptions to construct \(\text{ SNARG } \)s. (Note that [68] rule out \(\text{ SNARG } \)s only for (hard-enough) \({\mathsf {NP}}\) languages. For the weaker goal of verifying deterministic polynomial-time computations in various models, there are beautiful constructions relying on standard assumptions, such as [3, 20, 37, 38, 40, 41, 52, 55, 59, 77]. We focus on verifying nondeterministic polynomial-time computations.)

Extending earlier works [1, 44, 50, 83], several works showed how to remove interaction in Kilian’s \(\text{ PCP } \)-based protocol and obtain \(\text{ SNARG } \)s of knowledge (\(\text{ SNARK } \)s) using extractable collision-resistant hashing [10, 11, 46, 60], or construct \(\text{ MIP } \)-based \(\text{ SNARK } \)s using fully-homomorphic encryption with an extractable homomorphism property [8].

The preprocessing model. A notion that is weaker than a \(\text{ SNARK } \) is that of a preprocessing \(\text{ SNARK } \): here, the verifier is allowed to conduct an expensive offline phase. More precisely, the generator \(G\) takes as an additional input a time bound \(T\), may run in time \({\mathrm {poly}}(\lambda +T)\) (rather than \({\mathrm {poly}}(\lambda + \log T)\)), and generates \(\sigma \) and \(\tau \) that can be used, respectively, to prove and verify correctness of computations of length at most \(T\). Bitansky et al. [12] showed that \(\text{ SNARK } \)s can always be “algorithmically improved”; in particular, preprocessing \(\text{ SNARK } \)s imply ones without preprocessing. (The result of [12] crucially relies on the fast verification time and the adaptive proof-of-knowledge property of the \(\text{ SNARK } \).) Thus, “preprocessing can always be removed” at the expense of only a \({\mathrm {poly}}(\lambda )\)-loss in verification efficiency.

Zero knowledge. Another desired feature of \(\text{ SNARK } \)s is zero knowledge, namely hiding from the verifier anything but the truth of the statement being proved. More concretely, we aim at constructions of \(\text{ SNARK } \)s that satisfy the standard notion of non-interactive zero-knowledge [17]. Previous \(\text{ SNARK } \) constructions, starting from [64] and onward, achieve zero knowledge at a very small extra cost. This will also be the case for the main constructions in this work. However, to simplify the exposition, we will start by focusing on the other features, and discuss the extra zero knowledge feature separately.

1.2 Motivation

The typical approach to construct succinct arguments (or, more generally, other forms of proof systems with nontrivial efficiency properties) conforms with the following methodology: first, give an information-theoretic construction, using some form of probabilistic checking to verify computations, in a model that enforces certain restrictions on provers (e.g., the \(\text{ PCP } \) model [10, 11, 19, 44, 46, 60, 74, 82] or other models of probabilistic checking [8, 72, 76, 94, 95, 97, 98]); next, use cryptographic tools to compile the information-theoretic construction into an argument system (where there are no restrictions on the prover other than it being an efficient algorithm). We refer to the former ingredient as a probabilistic proof system and to the latter as a cryptographic compiler.

Existing constructions of preprocessing \(\text{ SNARK } \)s seem to diverge from this methodology, while at the same time offering several attractive features, such as public verification, proofs consisting of only \(O(1)\) encrypted (or encoded) field elements, and verification via arithmetic circuits that are linear in the statement.

Groth [64] and Lipmaa [78] (who builds on Groth’s approach) introduced clever techniques for constructing preprocessing \(\text{ SNARK } \)s by leveraging knowledge-of-exponent assumptions [25, 43, 71] in bilinear groups. At a high level, Groth considered a simple reduction from circuit satisfaction problems to an algebraic satisfaction problem of quadratic equations, and then constructed a set of specific cryptographic tools to succinctly check satisfiability of this problem. Gennaro et al. [56] made a first step to better separate the “information-theoretic ingredient” from the “cryptographic ingredient” in preprocessing \(\text{ SNARK } \)s. They formulated a new type of algebraic satisfaction problem, called Quadratic Span Programs (\(\text{ QSP } \)s), which are expressive enough to allow for much simpler, and more efficient, cryptographic checking, essentially under the same assumptions used by Groth. In particular, they invested significant effort in obtaining an efficient reduction from circuit satisfiability to \(\text{ QSP } \)s. (See Sect. 1.5 for a more detailed overview of the relation between our work and [56].)

Comparing the latter QSP-based approach to the probabilistic-checking-based approach described above, we note that a reduction to an algebraic satisfaction problem is a typical first step, because such satisfaction problems tend to be more amenable to probabilistic checking. As explained above, cryptographic tools are then usually invoked to enforce the relevant probabilistic-checking model (e.g., the \(\text{ PCP } \) one). The aforementioned works [56, 64, 78], on the other hand, seem to somehow skip the probabilistic-checking step, and directly construct specific cryptographic tools for checking satisfiability of the algebraic problem itself. While this discrepancy may not be a problem per se, we believe that understanding it and formulating a clear methodology for the construction of preprocessing \(\text{ SNARK } \)s are problems of great interest. Furthermore, a clear methodology that separates an information-theoretic probabilistic proof system from a cryptographic compiler may lead not only to a deeper conceptual understanding, but also to concrete improvements to different features of \(\text{ SNARK } \)s (e.g., communication complexity, verifier complexity, prover complexity, and so on). Thus, we ask:

Is there a general methodology for constructing preprocessing \(\text{ SNARK } \)s from probabilistic proof systems? Which improvements can it lead to?

1.3 Our Results

We present a general methodology for constructing preprocessing \(\text{ SNARK } \)s from suitable kinds of probabilistic proof systems. Using different instantiations of this methodology, we obtain conceptually simple variants of previous \(\text{ SNARK } \)s, as well as \(\text{ SNARK } \)s with new efficiency features.

Our methodology starts with a linear PCP,\(^{1}\) a more structured variant of a classical PCP in which the verifier can make a small number of inner-product queries to a single proof vector. We transform such a linear PCP into a stronger kind of probabilistic proof system called a linear interactive proof, and then to a \(\text{ SNARK } \) via a cryptographic compiler that respects the efficiency features of the linear PCP.

In more detail, our contributions are as follows:

  • We introduce a new, information-theoretic probabilistic proof system that extends the standard interactive proof model by considering algebraically-bounded provers. Concretely, we focus on linear interactive proofs (\(\text{ LIP } \)s), where both honest and malicious provers are restricted to computing linear (or affine) functions of messages they receive over some finite field or ring. We construct succinct two-message \(\text{ LIP } \)s for \({\mathsf {NP}}\) by applying a simple and general transformation to any linear PCP. We also present an alternative construction of LIPs from classical PCPs.

  • We give cryptographic transformations from (succinct, two-message) \(\text{ LIP } \)s to preprocessing \(\text{ SNARK } \)s, using different forms of linear targeted malleability [34], which can be instantiated based on existing knowledge assumptions. More concretely, we assume a “linear-only” encryption scheme that only supports linear homomorphism. Our transformation is very intuitive: to force a prover to “act linearly” on the verifier’s message, as in the LIP soundness guarantee, the preprocessed common reference string simply includes an encryption of each field or ring element in the verifier’s LIP message with such a linear-only encryption. This enables the honest \(\text{ SNARK } \) prover to faithfully compute an encryption of its correct LIP message, which the verifier can decrypt. For the case of designated-verifier \(\text{ SNARK } \)s, this simple idea suffices. To obtain public verification, we require the LIP verification to be “simple” (say, testing whether a quadratic function of the answers is 0) and replace the linear-only encryption by a linear-only encoding that supports “simple” zero-tests (say, via pairing).

  • Following this methodology, we obtain several constructions that either simplify previous ones or exhibit new asymptotic efficiency features. The latter include “single-ciphertext preprocessing \(\text{ SNARK } \)s” and improved succinctness-soundness tradeoffs in the designated-verifier setting. We also offer a new perspective on existing constructions of preprocessing \(\text{ SNARK } \)s: namely, although existing constructions do not explicitly invoke \(\text{ PCP } \)s, they can be reinterpreted as using linear \(\text{ PCP } \)s.

  • We also extend our methodology to obtain zero-knowledge \(\text{ LIP } \)s and \(\text{ SNARK } \)s.

We now discuss our results further, starting in Sect. 1.3.1 with the information-theoretic constructions of \(\text{ LIP } \)s, followed in Sect. 1.3.2 by the cryptographic transformations to preprocessing \(\text{ SNARK } \)s, and concluding in Sect. 1.3.3 with the new features we are able to obtain.

Fig. 1 High-level summary of our transformations

1.3.1 Linear Interactive Proofs

The \(\text{ LIP } \) model modifies the traditional interactive proofs model in a way analogous to the way the common study of algebraically-bounded “adversaries” modifies other settings, such as pseudorandomness [35, 86] and randomness extraction [48, 63]. In the \(\text{ LIP } \) model, both honest and malicious provers are restricted to apply linear (or affine) functions over a finite field \(\mathbb {F}\) to messages they receive from the verifier. (The notion can be naturally generalized to apply over rings.) The choice of these linear functions can depend on auxiliary input to the prover (e.g., a witness), but not on the verifier’s messages (Fig. 1).

With the goal of non-interactive succinct verification in mind, we restrict our attention to (input-oblivious) two-message \(\text{ LIP } \)s for boolean circuit satisfiability problems with the following template. To verify the relation \(\mathcal {R}_{C} = \left\{ (x,w):C(x,w)=1\right\} \) where \(C\) is a boolean circuit, the \(\text{ LIP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) sends to the \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) a message \({\mathbf {q}}\) that is a vector of field elements, depending on \(C\) but not on \(x\); \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) may also output a verification state \({\mathbf {u}}\). The \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}(x,w)\) applies to \({\mathbf {q}}\) an affine transformation \(\Pi = (\Pi ',\varvec{b})\), resulting in only a constant number of field elements. The prover’s message \({\mathbf {a}} = \Pi ' \cdot {\mathbf {q}}+\varvec{b}\) can then be quickly verified (e.g., with \(O(|x|)\) field operations) by \(V_{\scriptscriptstyle {\mathsf {LIP}}}\), and the soundness error is at most \(O(1/|\mathbb {F}|)\). From here on, we shall use the term \(\text{ LIP } \) to refer to \(\text{ LIP } \)s that adhere to the above template.
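The prover side of this template can be illustrated with a minimal sketch. The field size `P` and the function name `affine_answer` are assumptions made only for this illustration; the point is that the pair \((\Pi ',\varvec{b})\) is fixed from \((x,w)\) before \({\mathbf {q}}\) is seen, and the answer is a short vector \({\mathbf {a}} = \Pi ' \cdot {\mathbf {q}}+\varvec{b}\).

```python
# Toy illustration of the prover's move in the two-message LIP template.
# P is a hypothetical prime field size chosen only for this sketch.
P = 2**61 - 1


def affine_answer(pi_prime, b, q, p=P):
    """Return a = Pi' * q + b over F_p.

    pi_prime is a k x m matrix given as a list of rows, b has k entries,
    and q is the verifier's message with m entries. In the LIP model the
    pair (pi_prime, b) may depend on (x, w) but not on q.
    """
    if len(pi_prime) != len(b):
        raise ValueError("Pi' and b must have the same number of rows")
    return [
        (sum(r * x for r, x in zip(row, q)) + shift) % p
        for row, shift in zip(pi_prime, b)
    ]
```

For example, with \({\mathbf {q}}=(1,2,3)\), rows \((1,0,0)\) and \((0,1,1)\), and shift \((5,0)\), the prover's message has just two field elements regardless of the length of \({\mathbf {q}}\).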

LIP complexity measures. Our constructions provide different tradeoffs among several complexity measures of an \(\text{ LIP } \), which ultimately affect the features of the resulting preprocessing \(\text{ SNARK } \)s. The two most basic complexity measures are the number of field elements sent by the verifier and the number of those sent by the prover. An additional measure that we consider in this work is the algebraic complexity of the verifier (when viewed as an \(\mathbb {F}\)-arithmetic circuit). Specifically, splitting the verifier into a query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) and a decision algorithm \(D_{\scriptscriptstyle {\mathsf {LIP}}}\), we say that it has degree \((d_{Q} ,d_{D})\) if \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) can be computed by a vector of multivariate polynomials of total degree \(d_{Q}\) each in the verifier’s randomness, and \(D_{\scriptscriptstyle {\mathsf {LIP}}}\) by a vector of multivariate polynomials of total degree \(d_{D}\) each in the \(\text{ LIP } \) answers \({\mathbf {a}}\) and the verification state \({\mathbf {u}}\). Finally, of course, the running times of the query algorithm, decision algorithm, and prover algorithm are all complexity measures of interest. See Sect. 2.3 for a definition of \(\text{ LIP } \)s and their complexity measures.

As mentioned above, our \(\text{ LIP } \) constructions are obtained by applying general transformations to two types of \(\text{ PCP } \)s. We now describe each of these transformations and the features they achieve. Some of the parameters of the resulting constructions are summarized in Table 1.

LIPs from linear PCPs. A linear \(\text{ PCP } \) (\(\text{ LPCP } \)) of length \(m\) is an oracle computing a linear function \(\varvec{\pi }:\mathbb {F}^{m} \rightarrow \mathbb {F}\); namely, the answer to each oracle query \({\varvec{q}}_{i} \in \mathbb {F}^{m}\) is \(a_{i}=\left\langle \varvec{\pi } , {\varvec{q}}_{i} \right\rangle \). Note that, unlike in an \(\text{ LIP } \) where different affine functions, given by a matrix \(\Pi \) and shift \(\varvec{b}\), are applied to a message \({\mathbf {q}}\), in an \(\text{ LPCP } \) there is one linear function \(\varvec{\pi }\), which is applied to different queries. (An \(\text{ LPCP } \) with a single query can be viewed as a special case of an \(\text{ LIP } \).) This difference prevents a direct use of an \(\text{ LPCP } \) as an \(\text{ LIP } \).

Our first transformation converts any (multi-query) \(\text{ LPCP } \) into an \(\text{ LIP } \) with closely related parameters. Concretely, we transform any \(k\)-query \(\text{ LPCP } \) of length \(m\) over \(\mathbb {F}\) into an \(\text{ LIP } \) with verifier message in \(\mathbb {F}^{(k+1)m}\), prover message in \(\mathbb {F}^{k+1}\), and the same soundness error up to an additive term of \({1}/{|\mathbb {F}|}\). The transformation preserves the key properties of the \(\text{ LPCP } \), including the algebraic complexity of the verifier. Our transformation is quite natural: the verifier sends \({\mathbf {q}}=(\varvec{q}_{1},\dots ,\varvec{q}_{k+1})\) where \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\) are the \(\text{ LPCP } \) queries and \(\varvec{q}_{k+1}=\alpha _{1}\varvec{q}_{1}+\cdots + \alpha _{k}\varvec{q}_{k}\) is a random linear combination of these. The (honest) prover responds with \(a_{i}=\left\langle \varvec{\pi } , {\mathbf {\varvec{q}}}_{i} \right\rangle \), for \(i=1,\ldots ,k+1\). To prevent a malicious prover from using inconsistent choices for \(\varvec{\pi }\), the verifier checks that \(a_{k+1}=\alpha _{1}a_{1}+\cdots +\alpha _{k}a_{k}\).
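The transformation just described can be sketched in a few lines. This is a toy model under stated assumptions: the field size `P` and all function names are hypothetical, the \(\text{ LPCP } \) queries are supplied by the caller, and only the added consistency check is modeled (the underlying \(\text{ LPCP } \) decision procedure is omitted).

```python
import random

# Hypothetical prime field size, chosen only for this sketch.
P = 2**61 - 1


def inner(u, v, p=P):
    """Inner product <u, v> over F_p."""
    return sum(a * b for a, b in zip(u, v)) % p


def lip_queries(lpcp_queries, rng, p=P):
    """Append q_{k+1} = alpha_1 q_1 + ... + alpha_k q_k to the k LPCP queries.

    Returns the k+1 queries together with the secret coefficients alpha_i,
    which the verifier keeps as part of its verification state.
    """
    alphas = [rng.randrange(p) for _ in lpcp_queries]
    m = len(lpcp_queries[0])
    combo = [
        sum(al * q[j] for al, q in zip(alphas, lpcp_queries)) % p
        for j in range(m)
    ]
    return lpcp_queries + [combo], alphas


def honest_answers(pi, queries, p=P):
    """Honest prover: answer every query with the same linear function pi."""
    return [inner(pi, q, p) for q in queries]


def consistency_check(answers, alphas, p=P):
    """Verifier's extra test: a_{k+1} == alpha_1 a_1 + ... + alpha_k a_k."""
    *first_k, last = answers
    return last == sum(al * a for al, a in zip(alphas, first_k)) % p
```

An honest prover always passes `consistency_check`. A prover that answers the first query with a different vector than the one used for the remaining queries passes only if the corresponding coefficient \(\alpha _{1}\) happens to cancel the discrepancy, which is the source of the additive \({1}/{|\mathbb {F}|}\) term in the soundness error.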

By relying on two different \(\text{ LPCP } \) instantiations, we obtain two corresponding \(\text{ LIP } \) constructions:

  • A variant of the Hadamard-based \(\text{ PCP } \) of Arora et al. [4] (ALMSS), extended to work over an arbitrary finite field \(\mathbb {F}\), yields a very simple \(\text{ LPCP } \) with three queries. After applying our transformation, for a circuit \(C\) of size \(s\) and input length \(n\), the resulting \(\text{ LIP } \) for \(\mathcal {R}_{C}\) has verifier message in \(\mathbb {F}^{O(s^{2})}\), prover message in \(\mathbb {F}^{4}\), and soundness error \(O(1/|\mathbb {F}|)\). When viewed as \(\mathbb {F}\)-arithmetic circuits, the prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) and query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) are both of size \(O(s^{2})\), and the decision algorithm is of size \(O(n)\). Furthermore, the degree of \((Q_{\scriptscriptstyle {\mathsf {LIP}}},D_{\scriptscriptstyle {\mathsf {LIP}}})\) is (2, 2).

  • A (strong) quadratic span program (\(\text{ QSP } \)), as defined by Gennaro et al. [56], directly yields a corresponding \(\text{ LPCP } \) with three queries. For a circuit \(C\) of size \(s\) and input length \(n\), the resulting \(\text{ LIP } \) for \(\mathcal {R}_{C}\) has verifier message in \(\mathbb {F}^{O(s)}\), prover message in \(\mathbb {F}^{4}\), and soundness error \(O(s/|\mathbb {F}|)\). When viewed as \(\mathbb {F}\)-arithmetic circuits, the prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) is of size \(\widetilde{O}(s)\), the query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) is of size \(O(s)\), and the decision algorithm is of size \(O(n)\). The degree of \((Q_{\scriptscriptstyle {\mathsf {LIP}}},D_{\scriptscriptstyle {\mathsf {LIP}}})\) is \((O(s),2)\).

A notable feature of the \(\text{ LIP } \)s obtained above is the very low “online complexity” of verification: in both cases, the decision algorithm is an arithmetic circuit of size \(O(n)\). Moreover, all the efficiency features mentioned above apply not only to satisfiability of boolean circuits \(C\), but also to satisfiability of \(\mathbb {F}\)-arithmetic circuits.

In both the above constructions, the circuit to be verified is first represented as an appropriate algebraic satisfaction problem, and then probabilistic checking machinery is invoked. In the first case, the problem is a system of quadratic equations over \(\mathbb {F}\), and, in the second case, it is a (strong) quadratic span program (\(\text{ QSP } \)) over \(\mathbb {F}\). These algebraic problems are the very same problems underlying [56, 64, 78].

As explained earlier, [56] invested much effort to show an efficient reduction from circuit satisfiability problems to \(\text{ QSP } \)s. Our work neither subsumes nor simplifies the reduction to \(\text{ QSP } \)s of [56], but instead reveals a simple \(\text{ LPCP } \) to check a \(\text{ QSP } \), and this \(\text{ LPCP } \) can be plugged into our general transformations. Reducing circuit satisfiability to a system of quadratic equations over \(\mathbb {F}\) is much simpler, but generating proofs for the resulting problem is quadratically more expensive. (Concretely, both [64, 78] require \(O(s^{2})\) computation already in the preprocessing phase.) See Sect. 3.1 for more details.

LIPs from traditional PCPs. Our second transformation relies on traditional “unstructured” \(\text{ PCP } \)s. These \(\text{ PCP } \)s are typically more difficult to construct than \(\text{ LPCP } \)s; however, our second transformation has the advantage of requiring the prover to send only a single field element. Concretely, our transformation converts a traditional \(k\)-query \(\text{ PCP } \) into a 1-query \(\text{ LPCP } \), over a sufficiently large field. Here the \(\text{ PCP } \) oracle is represented via its truth table, which is assumed to be a binary string of polynomial size (unlike the \(\text{ LPCP } \)s mentioned above, whose truth tables have size that is exponential in the circuit size). The transformation converts any \(k\)-query \(\text{ PCP } \) of proof length \(m\) and soundness error \(\varepsilon \) into an \(\text{ LIP } \), with soundness error \(O(\varepsilon )\) over a field of size \(2^{O(k)} / \varepsilon \), in which the verifier sends \(m\) field elements and receives only a single field element in return. The high-level idea is to use a sparse linear combination of the \(\text{ PCP } \) entries to pack the \(k\) answer bits into a single field element. The choice of this linear combination uses additional random noise to ensure that the prover’s coefficients are restricted to binary values, and uses easy instances of subset-sum to enable an efficient decoding of the \(k\) answer bits.
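The packing idea can be illustrated with a deliberately simplified sketch. It is not the construction of Sect. 3.2: the random noise that forces binary prover coefficients is omitted, the weights are an assumed superincreasing sequence (an easy subset-sum instance), and arithmetic is done over the integers on the assumption that the field is larger than the sum of the weights. All names are hypothetical.

```python
def pack_query(m, positions, weights):
    """Sparse query vector: weights[j] at positions[j], zero elsewhere.

    positions are the k (distinct) PCP locations the verifier wants to read.
    """
    q = [0] * m
    for pos, w in zip(positions, weights):
        q[pos] = w
    return q


def decode_bits(answer, weights):
    """Greedy subset-sum decoding for a superincreasing weight sequence.

    Returns the k answer bits, or None if the answer is not a 0/1
    combination of the weights.
    """
    bits = [0] * len(weights)
    for j in reversed(range(len(weights))):
        if answer >= weights[j]:
            bits[j] = 1
            answer -= weights[j]
    return bits if answer == 0 else None


def single_element_answer(proof_bits, q):
    """Honest prover: one inner product of the binary PCP string with q."""
    return sum(b * x for b, x in zip(proof_bits, q))
```

With a proof string of length 8, positions \((1,3,6)\), and weights \((1,2,4)\), the honest prover returns one number from which the verifier recovers the three queried bits. An answer that is not a binary combination of the weights is rejected by the decoder in this toy version.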

Taking time complexity to an extreme, we can apply this transformation to the \(\text{ PCP } \)s of Ben-Sasson et al. [28] and get \(\text{ LIP } \)s where the prover and verifier complexity are both optimal up to \({\mathrm {polylog}}(s)\) factors, but where the prover sends a single element in a field of size \(|\mathbb {F}|=2^{\lambda \cdot {\mathrm {polylog}}(s)}\). Taking succinctness to an extreme, we can apply our transformation to \(\text{ PCP } \)s with soundness error \(2^{-\lambda }\) and \(O(\lambda )\) queries, obtaining an \(\text{ LIP } \) with similar soundness error in which the prover sends a single element in a field of size \(|\mathbb {F}|=2^{\lambda \cdot O(1)}\). For instance, using the query-efficient \(\text{ PCP } \)s of Håstad and Khot [69], the field size is only \(|\mathbb {F}|=2^{\lambda \cdot (3+o(1))}\).\(^{2}\) (Jumping ahead, this means that a field element can be encrypted using a single, normal-size ciphertext of homomorphic encryption schemes such as Paillier or Elgamal even when \(\lambda =100\).) On the downside, the degrees of the \(\text{ LIP } \) verifiers obtained via this transformation are high; we give evidence that this is inherent when starting from “unstructured” \(\text{ PCP } \)s. See Sect. 3.2 for more details.

Honest-verifier zero-knowledge LIPs. We also show how to make the above \(\text{ LIP } \)s zero-knowledge against honest verifiers (\(\text{ HVZK } \)). Looking ahead, using \(\text{ HVZK } \text{ LIP } \)s in our cryptographic transformations results in preprocessing \(\text{ SNARK } \)s that are zero-knowledge (against malicious verifiers in the CRS model).

For the Hadamard-based \(\text{ LIP } \), an \(\text{ HVZK } \) variant can be obtained directly with essentially no additional cost. More generally, we show how to transform any \(\text{ LPCP } \) where the decision algorithm is of low degree to an \(\text{ HVZK } \text{ LPCP } \) with the same parameters up to constant factors (see Sect. 8); this \(\text{ HVZK } \text{ LPCP } \) can then be plugged into our first transformation to obtain an \(\text{ HVZK } \text{ LIP } \). Both of the \(\text{ LPCP } \) constructions mentioned earlier satisfy the requisite degree constraints.

For the second transformation, which applies to traditional \(\text{ PCP } \)s (whose verifiers, as discussed above, must have high degree and thus cannot benefit from our general \(\text{ HVZK } \) transformation), we show that if the \(\text{ PCP } \) is \(\text{ HVZK } \) (see [47] for efficient constructions), then so is the resulting \(\text{ LIP } \); in particular, the \(\text{ HVZK } \text{ LIP } \) answer still consists of a single field element.

Proof of knowledge. In each of the above transformations, we ensure not only soundness for the \(\text{ LIP } \), but also a proof of knowledge property. Namely, it is possible to efficiently extract from a convincing affine function \(\Pi \) a witness for the underlying statement. The proof of knowledge property is then preserved in the subsequent cryptographic compilations, ultimately allowing us to establish the proof of knowledge property for the preprocessing \(\text{ SNARK } \). As discussed in Sect. 1.1, proof of knowledge is a very desirable property for preprocessing \(\text{ SNARK } \)s; for instance, it makes it possible to remove the preprocessing phase, as well as to improve the complexity of the prover and verifier, via the result of [12].

Table 1 Summary of our \(\text{ LIP } \) constructions

1.3.2 Preprocessing \(\text{ SNARK } \)s from \(\text{ LIP } \)s

We explain how to use cryptographic tools to transform an \(\text{ LIP } \) into a corresponding preprocessing \(\text{ SNARK } \). At a high level, the challenge is to ensure that an arbitrary (yet computationally-bounded) prover behaves as if it were a linear (or affine) function. The idea, which also implicitly appears in previous constructions, is to use an encryption scheme with targeted malleability [34] for the class of affine functions: namely, an encryption scheme that “only allows affine homomorphic operations” on an encrypted plaintext (and these operations are independent of the underlying plaintexts). Intuitively, the verifier would simply encrypt each field element in the \(\text{ LIP } \) message \({\mathbf {q}}\), send the resulting ciphertexts to the prover, and have the prover homomorphically evaluate the \(\text{ LIP } \) affine function on the ciphertexts; targeted malleability ensures that malicious provers can only invoke (malicious) affine strategies.

We concretize the above approach in several ways, depending on the properties of the \(\text{ LIP } \) and the exact flavor of targeted malleability; different choices will induce different properties for the resulting preprocessing \(\text{ SNARK } \). In particular, we identify natural sufficient properties that enable an \(\text{ LIP } \) to be compiled into a publicly-verifiable \(\text{ SNARK } \). We also discuss possible instantiations of the cryptographic tools, based on existing knowledge assumptions. (Recall that, in light of the negative result of [68], the use of nonstandard cryptographic assumptions seems to be justified.)

Designated-verifier preprocessing SNARKs from arbitrary LIPs. First, we show that any \(\text{ LIP } \) can be compiled into a corresponding designated-verifier preprocessing \(\text{ SNARK } \) with similar parameters. (Recall that “designated verifier” means that the verifier needs to maintain a secret verification state.) To do so, we rely on what we call linear-only encryption: an additively homomorphic encryption scheme that is (a) semantically-secure, and (b) linear-only. The linear-only property essentially says that, given a public key \({\mathsf {pk}}\) and ciphertexts \({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}),\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m})\), it is infeasible to compute a new ciphertext \(c'\) in the image of \({\mathsf {Enc}}_{{\mathsf {pk}}}\), except by “knowing” \(\beta ,\alpha _{1},\dots ,\alpha _{m}\) such that \(c' \in {\mathsf {Enc}}_{{\mathsf {pk}}}(\beta +\sum _{i=1}^{m} \alpha _{i} a_{i})\). Formally, the property is captured by guaranteeing that, whenever \(A({\mathsf {pk}},{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}),\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m}))\) produces valid ciphertexts \((c'_{1} ,\dots ,c'_{k})\), an efficient extractor \(E\) (non-uniformly depending on \(A\)) can extract a corresponding affine function \(\Pi \) “explaining” the ciphertexts. As candidates for such an encryption scheme, we propose variants of Paillier encryption [88] (as also considered in [56]) and of Elgamal encryption [51] (in those cases where the plaintext is guaranteed to belong to a polynomial-size set, so that decryption can be done efficiently). These variants are “sparsified” versions of their standard counterparts; concretely, a ciphertext includes not only \({\mathsf {Enc}}_{{\mathsf {pk}}}(a)\), but also \({\mathsf {Enc}}_{{\mathsf {pk}}}(\alpha \cdot a)\), for a secret field element \(\alpha \). (This “sparsification” follows a pattern found in many constructions conjectured to satisfy “knowledge-of-exponent” assumptions.) As for Paillier encryption, we have to consider \(\text{ LIP } \)s over the ring \(\mathbb {Z}_{\mathfrak {p}\mathfrak {q}}\) (instead of a finite field \(\mathbb {F}\)); essentially, the same results also hold in this setting (except that soundness is \(O(1/\min \left\{ \mathfrak {p},\mathfrak {q}\right\} )\) instead of \(O(1/|\mathbb {F}|)\)).
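The mechanics of "encrypt the query, evaluate the affine function homomorphically, decrypt the answer" can be sketched with textbook Paillier over \(\mathbb {Z}_{n}\), \(n=\mathfrak {p}\mathfrak {q}\). This is only a toy under loud assumptions: the primes are tiny and insecure, the function names are hypothetical, the scheme is the standard (non-sparsified) one and so carries no linear-only guarantee, and the prover's output is not re-randomized.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key generation with tiny, insecure primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to 1 + n
    return n, (lam, mu)


def encrypt(n, m, rng):
    """Enc(m; r) = (1 + n)^m * r^n mod n^2 for random r coprime to n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return pow(1 + n, m % n, n2) * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, with L(u) = (u - 1) / n."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def eval_affine(n, cts, coeffs, b):
    """Prover side: from Enc(q_i), compute an encryption of b + sum_i coeffs[i] * q_i.

    Ciphertext multiplication adds plaintexts; exponentiation scales them.
    """
    n2 = n * n
    out = pow(1 + n, b % n, n2)  # trivial encryption of the shift b
    for c, a in zip(cts, coeffs):
        out = out * pow(c, a % n, n2) % n2
    return out
```

In this sketch the verifier would publish the ciphertexts of the entries of \({\mathbf {q}}\) in the reference string, and each entry of the prover's answer \({\mathbf {a}}\) would be one call to `eval_affine`. The sparsified variant described above would additionally carry \({\mathsf {Enc}}_{{\mathsf {pk}}}(\alpha \cdot a)\) alongside each ciphertext.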

We also consider a notion of targeted malleability, weaker than linear-only encryption, that is closer to the definition template of Boneh et al. [34]. In such a notion, the extractor is replaced by a simulator. Relying on this weaker variant, we are only able to prove the security of our preprocessing \(\text{ SNARK } \)s against non-adaptive choices of statements (and still prove soundness, though not proof of knowledge, if the simulator is allowed to be inefficient). Nonetheless, for natural instantiations, even adaptive security seems likely to hold for our construction, but we do not know how to prove it. One advantage of working with this weaker variant is that it seems to allow for more efficient candidate constructions. Concretely, the linear-only property rules out any encryption scheme where ciphertexts can be sampled obliviously; the weaker notion does not, and thus allows for shorter ciphertexts. For example, we can consider a standard (“non-sparsified”) version of Paillier encryption. We will get back to this point in Sect. 1.3.3.

For further details on the above transformations, see Sect. 6.1.

Publicly-verifiable preprocessing SNARKs from LIPs with low-degree verifiers. Next, we identify properties of \(\text{ LIP } \)s that are sufficient for a transformation to publicly-verifiable preprocessing \(\text{ SNARK } \)s. Note that, if we aim for public verifiability, we cannot use semantically-secure encryption to encode the message of the \(\text{ LIP } \) verifier, because we need to “publicly test” (without decryption) certain properties of the plaintext underlying the prover’s response. The idea, implicit in previous publicly-verifiable preprocessing \(\text{ SNARK } \) constructions, is to use linear-only encodings (rather than encryption) that do allow such public tests, while still providing certain one-wayness properties. When using such encodings with an \(\text{ LIP } \), however, it must be the case that the public tests support evaluating the decision algorithm of the \(\text{ LIP } \) and, moreover, the \(\text{ LIP } \) remains secure despite some “leakage” on the queries. We show that \(\text{ LIP } \)s with low-degree verifiers (which we call algebraic \(\text{ LIP } \)s), combined with appropriate one-way encodings, suffice for this purpose.

More concretely, like [56, 64, 78], we consider candidate encodings in bilinear groups under similar knowledge-of-exponent and computational Diffie–Hellman assumptions; for such encoding instantiations, we must start with an \(\text{ LIP } \) where the degree \(d_{D}\) of the decision algorithm \(D_{\scriptscriptstyle {\mathsf {LIP}}}\) is at most quadratic. (If we had multilinear maps supporting higher-degree polynomials, we could support higher values of \(d_{D}\).) In addition to \(d_{D}\le 2\), to ensure security even in the presence of certain one-way leakage, we need the query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) to be of polynomial degree.

Both of the \(\text{ LIP } \) constructions from \(\text{ LPCP } \)s described in Sect. 1.3.1 satisfy these requirements. When combined with the above transformation, these \(\text{ LIP } \) constructions imply new constructions of publicly-verifiable preprocessing \(\text{ SNARK } \)s, one of which can be seen as a simplification of the construction of [64] and the other as a reinterpretation (and slight simplification) of the construction of [56].

For more details, see Sect. 6.2.

Zero knowledge. In all aforementioned transformations to preprocessing \(\text{ SNARK } \)s, if we start with an \(\text{ HVZK } \text{ LIP } \) (such as those mentioned in Sect. 1.3.1) and additionally require a rerandomization property for the linear-only encryption/encoding (which is available in all of the candidate instantiations we consider), we obtain preprocessing \(\text{ SNARK } \)s that are (perfect) zero-knowledge in the CRS model. In addition, for the case of publicly-verifiable (perfect) zero-knowledge preprocessing \(\text{ SNARK } \)s, the CRS can be tested, so that (similarly to previous works [56, 64, 78]) we also obtain succinct ZAPs. See Sect. 6.3.

1.3.3 New Efficiency Features for \(\text{ SNARK } \)s

We obtain the following improvements in communication complexity for preprocessing \(\text{ SNARK } \)s.

“Single-ciphertext preprocessing SNARKs”. If we combine the \(\text{ LIP } \)s that we obtained from traditional \(\text{ PCP } \)s (where the prover returns only a single field element) with “non-sparsified” Paillier encryption, we obtain (non-adaptive) preprocessing \(\text{ SNARK } \)s that consist of a single Paillier ciphertext. Moreover, when using the query-efficient \(\text{ PCP } \) from [69] as the underlying \(\text{ PCP } \), even a standard-size Paillier ciphertext (with plaintext group \(\mathbb {Z}_{\mathfrak {p}\mathfrak {q}}\) where \(\mathfrak {p},\mathfrak {q}\) are 512-bit primes) suffices for achieving soundness error \(2^{-\lambda }\) with \(\lambda =100\). (For the case of [69], due to the queries’ dependence on the input, the reference string of the \(\text{ SNARK } \) also depends on the input.) Alternatively, using the sparsified version of Paillier encryption, we can also get security against adaptively-chosen statements with only two Paillier ciphertexts.

Optimal succinctness. A fundamental question about succinct arguments is how low we can push communication complexity. More accurately: what is the optimal tradeoff between communication complexity and soundness? Ideally, we would want succinct arguments that are optimally succinct: to achieve \(2^{-\Omega (\lambda )}\) soundness against \(2^{O(\lambda )}\)-bounded provers, the proof is \(O(\lambda )\) bits long.

In several existing constructions of succinct arguments, to provide \(2^{-\Omega (\lambda )}\) soundness against \(2^{O(\lambda )}\)-bounded provers, the prover has to communicate \(\omega (\lambda )\) bits to the verifier. Concretely, \(\text{ PCP } \)-based (and \(\text{ MIP } \)-based) solutions require \(\Omega (\lambda ^{3})\) bits of communication. This also holds for preprocessing \(\text{ SNARK } \)s based on Paillier encryption, which suffer from subexponential-time attacks. In the case of pairing-based solutions, subexponential-time attacks are not known to be inherent (this applies to the base groups, relevant to SNARK constructions, rather than the target group; see Footnote 3).

Following our approach, any candidate for linear-only homomorphic encryption that does not suffer from subexponential-time attacks would yield other instantiations of preprocessing \(\text{ SNARK } \)s that are optimally succinct. Currently, the only known such candidate is Elgamal encryption (say, in appropriate elliptic curve groups) [89]. However, the problem with using Elgamal decryption in our approach is that it requires computing discrete logarithms.

One way to overcome this problem is to ensure that honest proofs are always decrypted to a known polynomial-size set. This can be done by taking the \(\text{ LIP } \) to be over a field \(\mathbb {F}_{\mathfrak {p}}\) of only polynomial size, and ensuring that any honest proof \(\varvec{\pi }\) has small \(\ell _{1}\)-norm \(\Vert \varvec{\pi }\Vert _{1}\), so that in particular, the prover’s answer is taken from a set of size at most \(\Vert \varvec{\pi }\Vert _{1} \cdot \mathfrak {p}\). For example, in the two \(\text{ LPCP } \)-based constructions described in Sect. 1.3.1, this norm is \(O(s^{2})\) and \(O(s)\), respectively, for a circuit of size \(s\). This approach, however, has two caveats: the soundness of the underlying \(\text{ LIP } \) is only \(1/{\mathrm {poly}}(\lambda )\) and moreover, the verifier’s running time is proportional to \(s\), and not independent of it, as we usually require. With such an \(\text{ LIP } \), we would be able to directly use Elgamal encryption because linear tests on the plaintexts can be carried out “in the exponent,” without having to take discrete logarithms.
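The decryption issue can be seen in a small sketch of additively homomorphic ("exponent") Elgamal. Everything here is an assumption made for illustration: the group is the multiplicative group modulo the Mersenne prime \(2^{31}-1\) with generator 7 (toy-sized and insecure, not an elliptic curve group), and the function names are hypothetical. Decryption naturally yields only the group element \(g^{m}\); recovering \(m\) needs a search over the known small set, whereas a linear test on plaintexts can be checked by comparing group elements directly.

```python
import random

P = 2**31 - 1   # a Mersenne prime; toy-sized, insecure
G = 7           # a primitive root modulo P

def keygen(rng=random):
    """Return (public key, secret key) with pk = G^sk."""
    sk = rng.randrange(1, P - 1)
    return pow(G, sk, P), sk

def encrypt(pk, m, rng=random):
    """'Exponent' Elgamal: Enc(m) = (G^r, G^m * pk^r)."""
    r = rng.randrange(1, P - 1)
    return pow(G, r, P), pow(G, m, P) * pow(pk, r, P) % P

def add(c1, c2):
    """Componentwise product encrypts the sum of the plaintexts."""
    return c1[0] * c2[0] % P, c1[1] * c2[1] % P

def scale(c, k):
    """Componentwise power encrypts k times the plaintext."""
    return pow(c[0], k, P), pow(c[1], k, P)

def decrypt_to_group(sk, c):
    """Return G^m; recovering m itself is a discrete-log problem."""
    return c[1] * pow(c[0], -sk, P) % P

def decrypt_small(sk, c, bound):
    """Recover m by exhaustive search, assuming 0 <= m < bound."""
    target = decrypt_to_group(sk, c)
    acc = 1
    for m in range(bound):
        if acc == target:
            return m
        acc = acc * G % P
    return None  # plaintext lies outside the assumed set
```

The cost of `decrypt_small` grows linearly with `bound`, which is why the approach above needs honest answers to fall in a polynomial-size set.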

Finally, a rather generic approach for obtaining “almost-optimal succinctness” is to use (linear-only) Elgamal encryption in conjunction with any linear homomorphic encryption scheme (perhaps not having the linear-only property) that is sufficiently secure. Concretely, the verifier sends his \(\text{ LIP } \) message encrypted under both encryption schemes, and then the prover homomorphically evaluates the affine function on both. The additional ciphertext can be efficiently decrypted, and can assist in the decryption of the Elgamal ciphertext. For example, there are encryption schemes based on Ring-LWE [79] that are conjectured to have quasiexponential security; by using these in the approach we just discussed, we can obtain \(2^{-\Omega (\lambda )}\) soundness against \(2^{O(\lambda )}\)-bounded provers with \(\widetilde{O}(\lambda )\) bits of communication.

Strong knowledge and reusability. Designated-verifier \(\text{ SNARK } \)s typically suffer from a problem known as the verifier rejection problem: security is compromised if the prover can learn the verifier’s responses to multiple adaptively-chosen statements and proofs. For example, the \(\text{ PCP } \)-based (or \(\text{ MIP } \)-based) \(\text{ SNARK } \)s of [8, 10, 11, 46, 60] suffer from the verifier rejection problem because a prover can adaptively learn the encrypted \(\text{ PCP } \) (or \(\text{ MIP } \)) queries, by feeding different statements and proofs to the verifier and learning his responses, and since the secrecy of these queries is crucial, security is lost.

Of course, one way to avoid the verifier rejection problem is to generate a new reference string for each statement and proof. Indeed, this is an attractive solution for the aforementioned \(\text{ SNARK } \)s because generating a new reference string is very cheap: it costs \({\mathrm {poly}}(\lambda )\). However, for a designated-verifier preprocessing \(\text{ SNARK } \), generating a new reference string is not cheap at all, and being able to reuse the same reference string across an unbounded number of adaptively-chosen statements and proofs is a very desirable property.

A property that is satisfied by all algebraic \(\text{ LIP } \)s (including the \(\text{ LPCP } \)-based \(\text{ LIP } \)s discussed in Sect. 1.3.1), which we call strong knowledge, is that such attacks are impossible. Specifically, for such \(\text{ LIP } \)s, every prover makes the verifier accept either with probability 1 or with probability less than \(O({\mathrm {poly}}(\lambda )/|\mathbb {F}|)\). (In Sect. 9, we also show that traditional “unstructured” PCPs cannot satisfy this property.) Given LIPs with strong knowledge, it seems that designated-verifier \(\text{ SNARK } \)s that have a reusable reference string can be constructed. Formalizing the connection between strong knowledge and reusable reference strings actually requires notions of linear-only encryption that are somewhat more delicate than those we have considered so far. See Sect. 9 for additional discussion.

1.4 Previous Structured PCPs

Ishai et al. [72] proposed the idea of constructing argument systems with nontrivial efficiency properties by using “structured” \(\text{ PCP } \)s and cryptographic primitives with homomorphic properties, rather than (as in previous approaches) “unstructured” polynomial-size \(\text{ PCP } \)s and collision-resistant hashing. We have shown how to apply this basic approach in order to obtain succinct non-interactive arguments with preprocessing. We now compare our work to other works that have also followed the basic approach of [72].

Strong vs. weak linear PCPs. Both in our work and in [72], the notion of a “structured” \(\text{ PCP } \) is taken to be a linear \(\text{ PCP } \). However, the notion of a linear \(\text{ PCP } \) used in our work does not coincide with the one used in [72]. Indeed, there are two ways in which one can formalize the intuitive notion of a linear \(\text{ PCP } \). Specifically:

  • A strong linear \(\text{ PCP } \) is a \(\text{ PCP } \) in which the honest proof oracle is guaranteed to be a linear function, and soundness is required to hold for all (including nonlinear) proof oracles.

  • A weak linear \(\text{ PCP } \) is a \(\text{ PCP } \) in which the honest proof oracle is guaranteed to be a linear function, and soundness is required to hold only for linear proof oracles.

In particular, a weak linear \(\text{ PCP } \) assumes an algebraically-bounded prover, while a strong linear \(\text{ PCP } \) does not. While Ishai et al. [72] considered strong linear \(\text{ PCP } \)s, in our work we are interested in studying algebraically-bounded provers, and thus consider weak linear \(\text{ PCP } \)s.

Arguments from strong linear PCPs. Ishai et al. [72] constructed a four-message argument system for \({\mathsf {NP}}\) in which the prover-to-verifier communication is short (i.e., an argument with a laconic prover [67]) by combining a strong linear \(\text{ PCP } \) and (standard) linear homomorphic encryption; they also showed how to extend their approach to “balance” the communication between the prover and verifier and obtain a \(O(1/\varepsilon )\)-message argument system for \({\mathsf {NP}}\) with \(O(n^{\varepsilon })\) communication complexity. Let us briefly compare their work with ours.

First, in this paper we focus on the non-interactive setting, while Ishai et al. focused on the interactive setting. In particular, in light of the negative result of Gentry and Wichs [68], this means that the use of non-standard assumptions in our setting (such as linear targeted malleability) may be justified; in contrast, Ishai et al. only relied on the standard semantic security of linear homomorphic encryption (and did not rely on linear targeted malleability properties). Second, we focus on constructing (non-interactive) succinct arguments, while Ishai et al. focus on constructing arguments with a laconic prover. Third, by relying on weak linear \(\text{ PCP } \)s (instead of strong linear \(\text{ PCP } \)s) we do not need to perform (explicitly or implicitly) linearity testing, while Ishai et al. do. Intuitively, this is because we rely on the assumption of linear targeted malleability, which ensures that a prover is algebraically bounded (in fact, in our case, linear); not having to perform proximity testing is crucial for preserving the algebraic properties of a linear \(\text{ PCP } \) (and thus, e.g., obtain public verifiability) and obtaining \(O({\mathrm {poly}}(\lambda )/|\mathbb {F}|)\) soundness with only a constant number of encrypted/encoded group elements. (Recall that linearity testing only guarantees constant soundness with a constant number of queries.)

Turning to computational efficiency, while their basic protocol does not provide the verifier with any saving in computation, Ishai et al. noted that their protocol actually yields a batching argument: namely, an argument in which, in order to simultaneously verify the correct evaluation of \(\ell \) circuits of size S, the verifier may run in time S (i.e., in time \(S / \ell \) per circuit evaluation). In fact, a set of works [94, 95, 97, 98] has improved upon, optimized, and implemented the batching argument of Ishai et al. [72] for the purpose of verifiable delegation of computation.

Finally, [94] have also observed that \(\text{ QSP } \)s can be used to construct weak linear \(\text{ PCP } \)s; while we compile weak linear \(\text{ PCP } \)s into \(\text{ LIP } \)s, [94] (as in previous work) compile weak linear \(\text{ PCP } \)s into strong ones. Indeed, note that a weak linear \(\text{ PCP } \) can always be compiled into a corresponding strong one, by letting the verifier additionally perform linearity testing and self-correction; this compilation does not affect proof length, increases query complexity by only a constant multiplicative factor, and guarantees constant soundness.
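The two ingredients of this weak-to-strong compilation can be sketched as follows. This is a generic illustration of the standard linearity test and of self-correction over a toy prime field, not the specific procedure of [94] or [72]; the field size, function names, and the way the random mask is passed in are all assumptions.

```python
import random

P = 101  # toy prime field F_P

def random_vec(m, rng=random):
    """Sample a uniformly random vector in F_P^m."""
    return tuple(rng.randrange(P) for _ in range(m))

def add_vec(x, y):
    """Coordinate-wise addition in F_P^m."""
    return tuple((a + b) % P for a, b in zip(x, y))

def linearity_test(f, m, rng=random):
    """One round of the test f(x) + f(y) == f(x + y) at random x, y."""
    x, y = random_vec(m, rng), random_vec(m, rng)
    return (f(x) + f(y)) % P == f(add_vec(x, y)) % P

def self_correct(f, q, r):
    """Answer query q as f(q + r) - f(r), where r is a random mask.

    If f agrees with a linear function on most points, then both
    evaluation points are individually uniform, so the result equals
    the linear function's value at q with high probability.
    """
    return (f(add_vec(q, r)) - f(r)) % P
```

A caller would sample the mask with `random_vec(m)`; it is taken as an explicit argument here only so that the behavior is easy to trace by hand.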

Remark 1.1

The notions of (strong or weak) linear \(\text{ PCP } \) discussed above should not be confused with the (unrelated) notion of a linear PCP of Proximity (linear PCPP) [31, 80], which we now recall for the purpose of comparison.

Given a field \(\mathbb {F}\), an \(\mathbb {F}\)-linear circuit [100] is an \(\mathbb {F}\)-arithmetic circuit \(C:\mathbb {F}^{h} \rightarrow \mathbb {F}^{\ell }\) in which every gate computes an \(\mathbb {F}\)-linear combination of its inputs; its kernel, denoted \({\mathrm {ker}}(C)\), is the set of all \(w\in \mathbb {F}^{h}\) for which \(C(w)=0^{\ell }\). A linear PCPP for a field \(\mathbb {F}\) is an oracle machine V with the following properties: (1) V takes as input an \(\mathbb {F}\)-linear circuit \(C\) and has oracle access to a vector \(w\in \mathbb {F}^{h}\) and an auxiliary vector \(\pi \) of elements in \(\mathbb {F}\), (2) if \(w\in {\mathrm {ker}}(C)\) then there exists \(\pi \) so that \(V^{w,\pi }(C)\) accepts with probability 1, and (3) if \(w\) is far from \({\mathrm {ker}}(C)\) then \(V^{w,\pi }(C)\) rejects with high probability for every \(\pi \).

Thus, a linear PCPP is a proximity tester for the kernels of linear circuits (which are not universal), while a (strong or weak) linear \(\text{ PCP } \) is a \(\text{ PCP } \) in which the proof oracle is a linear function.

1.5 Related and Subsequent Work

In this section we include a more detailed comparison with the work of Gennaro et al. [56] (GGPR), which is the most closely related to the current work, as well as some subsequent works in this area.

Comparison with GGPR. Our work can be seen as providing a conceptually simple general methodology that not only captures close variants (see Footnote 4) of the SNARKs from GGPR (as well as earlier SNARKs from [64, 78]), but can also be instantiated in other useful ways. In more detail, GGPR consider the QSP constraint satisfaction problem, and show how to directly compile it into a SNARK. This is similar to the previous works of Groth [64] and Lipmaa [78], except that the QSP representation is quadratically more efficient. In contrast, our starting point is a linear interactive proof (a new kind of probabilistic information-theoretic proof system), which we show how to build from any (classical or linear) PCP. Only then do we compile such LIPs into SNARKs. The LIP abstraction also admits a natural zero-knowledge variant, which in the QSP-based approach is part of the cryptographic compiler. When using a LIP based on the QSP construction of GGPR, we end up with a slightly different SNARK from that of GGPR, which is in fact slightly less succinct (8 vs. 7 bilinear group elements). Indeed, the GGPR construction makes an additional optimization thanks to compiling QSPs directly.

Whereas QSPs (as well as their arithmetic QAP variant) are tied to polynomials and to quadratic verification, the linear PCP and LIP primitives are more general. GGPR-style linear PCPs still give the best efficiency for most applications; however, other linear PCPs have proven useful in this work and in subsequent works [7, 22, 87]. For example, we show that a LIP based on the Hadamard linear PCP, which is not captured by a QSP, yields a very simple SNARK construction with quadratic CRS size. The single-query linear PCP (or LIP) we obtain from a classical PCP, which serves as a basis for “single-ciphertext SNARKs,” is also not captured by a QSP. Applications in subsequent works are discussed below.

Subsequent developments. An influential work of Groth [65], building on a 2-element LIP implicit in [45], obtained a (publicly verifiable) SNARK requiring only 3 bilinear group elements (or roughly 1000 bits), and left open the possibility of a SNARK with 2 group elements. The latter would follow from a LIP with a linear decision procedure. However, the existence of such a LIP was ruled out in [65], settling an open question posed in the conference version of this work.

These barriers from [65] were recently circumvented in [22] by relaxing either the soundness or the completeness requirement. Settling for inverse-polynomial soundness, practical designated-verifier SNARKs for small circuits with only 2 group elements were obtained by applying a variant of the packing transformation from this work to the Hadamard PCP. Moreover, a 1-element LIP with a linear decision procedure, negligible soundness error, and non-negligible (but sub-constant) completeness error follows from the \({\mathsf {NP}}\)-hardness of approximating a problem related to linear codes, implying 2-element laconic arguments for \({\mathsf {NP}}\) with negligible soundness error and sub-constant completeness error. Finally, a plausible (but as yet unproven) hardness of approximation result would imply a 1-element laconic argument with predictable answers, which would in turn imply witness encryption [57].

Several other kinds of “linear” probabilistic proof systems in the spirit of LIP were used in subsequent works. For instance, a variant of LIP was used in [13] to obtain sublinear-communication arguments for arithmetic circuits in which the prover runs in linear time. Fully linear proof systems, where linear queries apply jointly to the input and the proof vector, were used for sublinear zero-knowledge proofs on secret-shared data and information-theoretic secure multiparty computation [7].

We refer the reader to Thaler’s recent survey [99] for an overview of SNARKs based on Linear PCP (Chapter 14) and comparison to other approaches to practical arguments (Chapter 15). Earlier expositions appear in [7, Section 2], [18, Section 5], and [73].

1.6 Organization

In Sect. 2, we introduce the notions of \(\text{ LPCP } \)s and \(\text{ LIP } \)s. In Sect. 3, we present our transformations for constructing \(\text{ LIP } \)s from several notions of \(\text{ PCP } \)s. In Sect. 4, we give the basic definitions for preprocessing \(\text{ SNARK } \)s. In Sect. 5, we define the relevant notions of linear targeted malleability, as well as candidate constructions for these. In Sect. 6, we present our transformations from \(\text{ LIP } \)s to preprocessing \(\text{ SNARK } \)s. In Sect. 7, we discuss two constructions of algebraic \(\text{ LPCP } \)s. In Sect. 8, we present our general transformation to obtain \(\text{ HVZK } \) for \(\text{ LPCP } \)s with low-degree decision algorithms. In Sect. 9, we discuss the notion of strong knowledge and its connection to designated-verifier \(\text{ SNARK } \)s with a reusable reference string.

2 Definitions of \(\text{ LIP } \)s and \(\text{ LPCP } \)s

We begin with the information-theoretic part of the paper, by introducing the basic definitions of \(\text{ LPCP } \)s, \(\text{ LIP } \)s, and relevant conventions.

2.1 Polynomials, Degrees, and Schwartz–Zippel

Vectors are denoted in bold, while their coordinates are not; for example, we may write \(\varvec{a}\) to denote the ordered tuple \((a_{1},\dots ,a_{n})\) for some n. A field is denoted \(\mathbb {F}\); we always work with fields that are finite. We say that a multivariate polynomial \(f :\mathbb {F}^{m} \rightarrow \mathbb {F}\) has degree d if the total degree of f is at most d. A multivalued multivariate polynomial \(\mathbf {f} :\mathbb {F}^{m} \rightarrow \mathbb {F}^{\mu }\) is a vector of polynomials \((f_{1},\dots ,f_{\mu })\) where each \(f_{i} :\mathbb {F}^{m} \rightarrow \mathbb {F}\) is a (single-valued) multivariate polynomial.

A very useful fact about polynomials is the following:

Lemma 2.1

(Schwartz–Zippel) Let \(\mathbb {F}\) be any field. For any nonzero polynomial \(f :\mathbb {F}^{m} \rightarrow \mathbb {F}\) of total degree d and any finite subset S of \(\mathbb {F}\),

$$\begin{aligned} \mathop {\mathrm{Pr}}\limits _{\varvec{s}\leftarrow S^{m}} \Big [f(\varvec{s})=0 \Big ] \le \frac{d}{|S|} . \end{aligned}$$

In particular, any two distinct polynomials \(f,g :\mathbb {F}^{m} \rightarrow \mathbb {F}\) of total degree d can agree on at most a d/ |S| fraction of the points in \(S^{m}\).
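As a sanity check of Lemma 2.1, the following sketch counts the zeros of a small polynomial by brute force. The choice of field \(\mathbb {F}_{11}\), of \(S=\mathbb {F}_{11}\), and of the polynomial \(f(x,y)=xy\) (total degree 2) is ours, made only for illustration. The zero set consists of the two axes, giving 21 of the 121 points, just below the bound \(d/|S| = 2/11 = 22/121\).

```python
from itertools import product

def count_zeros(f, p, m):
    """Count points s in (F_p)^m with f(s) == 0 mod p, by brute force."""
    zeros = sum(1 for s in product(range(p), repeat=m) if f(*s) % p == 0)
    return zeros, p ** m

p, d = 11, 2
zeros, total = count_zeros(lambda x, y: x * y, p, 2)

# Schwartz-Zippel: zeros / total <= d / p, compared without division
bound_holds = zeros * p <= d * total
print(zeros, total, bound_holds)  # 21 121 True
```

The example also shows that the bound is close to tight for products of linear factors.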

2.2 Linear \(\text{ PCP } \)s

A linear probabilistically-checkable proof (\(\text{ LPCP } \)) system for a relation \(\mathcal {R}\) over a field \(\mathbb {F}\) is one where the \(\text{ PCP } \) oracle is restricted to compute a linear function \(\varvec{\pi }:\mathbb {F}^{m} \rightarrow \mathbb {F}\) of the verifier’s queries. Viewed as a traditional \(\text{ PCP } \), \(\varvec{\pi }\) has length \(|\mathbb {F}|^{m}\) (and alphabet \(\mathbb {F}\)). For simplicity, we ignore the computational complexity issues in the following definition, and refer to them later when they are needed.

Definition 2.2

(Linear \(\text{ PCP } \) (\(\text{ LPCP } \))) Let \(\mathcal {R}\) be a binary relation, \(\mathbb {F}\) a finite field, \(P_{\scriptscriptstyle {\mathsf {LPCP}}}\) a deterministic prover algorithm, and \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) a probabilistic oracle verifier algorithm. We say that the pair \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is a (input-oblivious) \(k\)-query linear PCP for \(\mathcal {R}\) over \(\mathbb {F}\) with knowledge error \(\varepsilon \) and query length \(m\) if it satisfies the following requirements:

  • Syntax On any input \(x\) and oracle \(\varvec{\pi }\), the verifier \(V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }}(x)\) makes \(k\) input-oblivious queries to \(\varvec{\pi }\) and then decides whether to accept or reject. More precisely, \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) consists of a probabilistic query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) and a deterministic decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) working as follows. Based on its internal randomness, and independently of \(x\), \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) generates \(k\) queries \(\varvec{q}_{1},\dots ,\varvec{q}_{k} \in \mathbb {F}^{m}\) to \(\varvec{\pi }\) and state information \({\mathbf {u}}\); then, given \(x\), \({\mathbf {u}}\), and the \(k\) oracle answers \(a_{1} = \left\langle \varvec{\pi } , \varvec{q}_{1} \right\rangle ,\dots ,a_{k} = \left\langle \varvec{\pi } , \varvec{q}_{k} \right\rangle \), \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) accepts or rejects.

  • Completeness For every \((x,w) \in \mathcal {R}\), the output of \(P_{\scriptscriptstyle {\mathsf {LPCP}}}(x,w)\) is a description of a linear function \(\varvec{\pi }:\mathbb {F}^{m} \rightarrow \mathbb {F}\) such that \(V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }}(x)\) accepts with probability 1.

  • Knowledge There exists a knowledge extractor \(E_{\scriptscriptstyle {\mathsf {LPCP}}}\) such that for every linear function \(\varvec{\pi }^{*} :\mathbb {F}^{m} \rightarrow \mathbb {F}\) if the probability that \(V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)\) accepts is greater than \(\varepsilon \) then \(E_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)\) outputs \(w\) such that \((x,w) \in \mathcal {R}\) (see Footnote 5).

Furthermore, we say that \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has degree \((d_{Q} ,d_{D})\) if, additionally,

  1. the query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) is computed by a degree \(d_{Q}\) arithmetic circuit (i.e., there are \(k\) polynomials \(\varvec{p}_{1},\dots ,\varvec{p}_{k}:\mathbb {F}^\mu \rightarrow \mathbb {F}^{m}\) and state polynomial \(\varvec{p}:\mathbb {F}^\mu \rightarrow \mathbb {F}^{m'}\), all of degree \(d_{Q}\), such that the \(\text{ LPCP } \) queries are \(\varvec{q}_{1} = \varvec{p}_{1}(\varvec{r}),\dots ,\varvec{q}_{k} = \varvec{p}_{k}(\varvec{r})\) and the state is \({\mathbf {u}}= \varvec{p}(\varvec{r})\) for a random \(\varvec{r}\in \mathbb {F}^{\mu }\)), and

  2. the decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) is computed by a degree \(d_{D}\) arithmetic circuit (i.e., for every input \(x\) there is a test polynomial \(\varvec{t}_{x}:\mathbb {F}^{m'+k}\rightarrow \mathbb {F}^\eta \) of degree \(d_{D}\) such that \(\varvec{t}_{x}({\mathbf {u}},a_{1},\dots ,a_{k}) = 0^{\eta }\) if and only if \(D_{\scriptscriptstyle {\mathsf {LPCP}}}(x,{\mathbf {u}},a_{1},\dots ,a_{k})\) accepts).

Finally, for a security parameter \(\lambda \), we say that \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is an algebraic LPCP (for \(\lambda \)) if it has degree \(({\mathrm {poly}}(\lambda ) ,{\mathrm {poly}}(\lambda ))\).
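To illustrate the syntax of Definition 2.2 (query algorithm, inner-product answers, decision algorithm, and extractor), here is a deliberately trivial one-query LPCP. The relation is a toy of our own choosing, \(\mathcal {R}=\{(x,w) : \langle \varvec{c},w\rangle = x\}\) for a fixed public vector \(\varvec{c}\) over \(\mathbb {F}_{101}\); it is not one of the constructions of this paper, and all names are hypothetical. The verifier is deterministic, the knowledge error is 0, and the test polynomial \(t_{x}(a)=a-x\) has degree 1.

```python
P = 101                 # toy field F_P
C = (2, 3, 5)           # public coefficients defining the toy relation

def inner(u, v):
    """Inner product over F_P."""
    return sum(a * b for a, b in zip(u, v)) % P

def prover(x, w):
    """Honest proof: the linear function pi(q) = <w, q>, described by w."""
    return tuple(w)

def query():
    """Q: one input-oblivious query plus an (empty) state u."""
    return [C], ()

def decide(x, u, answers):
    """D: accept iff the test polynomial t_x(a) = a - x vanishes."""
    return (answers[0] - x) % P == 0

def verify(x, pi):
    """V: run Q, obtain the answers a_i = <pi, q_i>, then run D."""
    queries, u = query()
    answers = [inner(pi, q) for q in queries]
    return decide(x, u, answers)

def extract(x, oracle, m=len(C)):
    """E: read off the coefficients of pi* by querying the unit vectors."""
    return tuple(oracle(tuple(int(i == j) for j in range(m))) for i in range(m))
```

Here any linear \(\varvec{\pi }^{*}\) that is accepted satisfies \(\langle \varvec{c},\varvec{\pi }^{*}\rangle = x\), so the extracted coefficient vector is itself a valid witness.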

Remark 2.3

(Infinite relations \(\mathcal {R}\)) When \(\mathcal {R}\) is an infinite relation \(\cup _{\ell \in \mathbb {N}} \mathcal {R}_{\ell }\), both \(V_{\scriptscriptstyle {\mathsf {LPCP}}}=(Q_{\scriptscriptstyle {\mathsf {LPCP}}} ,D_{\scriptscriptstyle {\mathsf {LPCP}}})\) and \(P_{\scriptscriptstyle {\mathsf {LPCP}}}\) also get as input \(1^{\ell }\). In this case, all parameters \(k,m,\mu ,m',\eta \) may also be a function of \(\ell \).

Some of the aforementioned properties only relate to the \(\text{ LPCP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\), so we will also say things like “\(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) has degree...,” i.e., using the verifier as the subject (rather than the \(\text{ LPCP } \)).

Honest-verifier zero-knowledge LPCPs. We also consider honest-verifier zero-knowledge (\(\text{ HVZK } \)) \(\text{ LPCP } \)s. In an \(\text{ HVZK } \text{ LPCP } \), soundness or knowledge is defined as in a usual \(\text{ LPCP } \), and \(\text{ HVZK } \) is defined as in a usual \(\text{ HVZK } \text{ PCP } \). For convenience, let us recall the definition of an \(\text{ HVZK } \text{ PCP } \):

Definition 2.4

(Honest-verifier zero-knowledge PCP (HVZK PCP)) A \(\text{ PCP } \) system \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) for a relation \(\mathcal {R}\), where \(P_{\scriptscriptstyle {\mathsf {PCP}}}\) is also probabilistic, is \(\delta \)-statistical HVZK if there exists a simulator \(S_{\scriptscriptstyle {\mathsf {PCP}}}\), running in expected polynomial time, for which the following two ensembles are \(\delta \)-close (\(\delta \) can be a function of the field, input length, and so on):

$$\begin{aligned} \big \{S_{\scriptscriptstyle {\mathsf {PCP}}}(x)\big \}_{(x,w) \in \mathcal {R}} \text { and } \left\{ \mathrm {View}\big (V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi _{x,w}}(x)\big ) \;\vert \; \pi _{x,w} \leftarrow P_{\scriptscriptstyle {\mathsf {PCP}}}(x,w) \right\} _{(x,w) \in \mathcal {R}} , \end{aligned}$$

where \(\mathrm {View}\) represents the view of the verifier, including its coins and the induced answers according to \(\pi \).

If the above two ensembles are identically distributed then we say that \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) is perfect \(\text{ HVZK } \) (Fig. 2).

Fig. 2. Diagram of an \(\text{ LPCP } \) and an input-oblivious two-message \(\text{ LIP } \)

2.3 Linear Interactive Proofs

A linear interactive proof (\(\text{ LIP } \)) is defined similarly to a standard interactive proof [62], except that each message sent by a prover (either an honest or a malicious one) must be a linear function of the previous messages sent by the verifier. In fact, it will be convenient for our purposes to consider a slightly weaker notion that allows a malicious prover to compute an affine function of the messages. While we will only make use of two-message \(\text{ LIP } \)s in which the verifier’s message is independent of its input, below we define the more general notion.

Definition 2.5

(Linear Interactive Proof (LIP)) A linear interactive proof over a finite field \(\mathbb {F}\) is defined similarly to a standard interactive proof [62], with the following differences.

  • Each message exchanged between the prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) and the verifier \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) is a vector \({\mathbf {q}}_{i} \in \mathbb {F}^{m}\) over \(\mathbb {F}\).

  • The honest prover’s strategy is linear in the sense that each of the prover’s messages is computed by applying some linear function \(\Pi _{i}:\mathbb {F}^{m} \rightarrow \mathbb {F}^{k}\) to the verifier’s previous messages \(({\mathbf {q}}_{1},\dots ,{\mathbf {q}}_{i})\). This function is determined only by the input \(x\), the witness \(w\), and the round number i.

  • Knowledge should only hold with respect to affine prover strategies \(\Pi ^{*} = (\Pi ,\varvec{b})\), where \(\Pi \) is a linear function, and \(\varvec{b}\) is some affine shift.

Analogously to the case of \(\text{ LPCP } \)s (Definition 2.2), we say that a two-message \(\text{ LIP } \) is input-oblivious if the verifier’s messages do not depend on the input \(x\). In such a case the verifier can be split into a query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) that outputs the query \({\mathbf {q}}\) and possibly a verification state \({\mathbf {u}}\), and a decision algorithm \(D_{\scriptscriptstyle {\mathsf {LIP}}}\) that takes as input \({\mathbf {u}}\), \(x\), and the \(\text{ LIP } \) answer \(\Pi \cdot {\mathbf {q}}\). We also consider notions of degree and algebraic \(\text{ LIP } \)s, also defined analogously to the \(\text{ LPCP } \) case.

Remark 2.6

(\(\text{ LPCP } \)s and \(\text{ LIP } \)s over rings) The notions of an \(\text{ LPCP } \) and an \(\text{ LIP } \) can be easily extended to work over a ring rather than over a field. One case of particular interest is \(\text{ LIP } \)s over \(\mathbb {Z}_{N}\), where \(N\) is the product of two primes \(\mathfrak {p}\) and \(\mathfrak {q}\). (\(\text{ LIP } \)s over \(\mathbb {Z}_{N}\) are needed, e.g., when used in conjunction with Paillier encryption; see Sect. 5.3.) All of our results generalize, rather directly, to the case of \(\mathbb {Z}_{N}\), where instead of achieving soundness error \(O(1/|\mathbb {F}|)\), we achieve soundness error \(O(1/\min \left\{ \mathfrak {p},\mathfrak {q}\right\} )\). For simplicity, when presenting most results, we shall restrict attention to fields.

Remark 2.7

(Honest-verifier zero knowledge) We also consider an honest-verifier zero-knowledge variant of \(\text{ LIP } \)s (\(\text{ HVZK } \text{ LIP } \)s), which is defined analogously to Definition 2.4. In this case, the honest prover is probabilistic.

Remark 2.8

(LIP vs. LPCP) Note that a one-query \(\text{ LPCP } \) is an \(\text{ LIP } \) where the prover returns a single field element; however, when the prover returns more than one field element, an \(\text{ LIP } \) is not a one-query \(\text{ LPCP } \). In this paper we construct both \(\text{ LIP } \)s where the prover returns more than a single field element (see Sect. 3.1) and \(\text{ LIP } \)s where the prover returns a single field element (see Sect. 3.2).

3 Constructions of \(\text{ LIP } \)s

We present two transformations for constructing \(\text{ LIP } \)s, in Sects. 3.1 and 3.2, respectively.

3.1 \(\text{ LIP } \)s From \(\text{ LPCP } \)s

We show how to transform any \(\text{ LPCP } \) into a two-message \(\text{ LIP } \) with similar parameters. Crucially, our transformation does not significantly affect strong knowledge or algebraic properties of the \(\text{ LPCP } \) verifier. Note that a non-trivial transformation is indeed required in general because the \(\text{ LIP } \) verifier cannot simply send to the \(\text{ LIP } \) prover the queries \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\) generated by the \(\text{ LPCP } \) verifier. Unlike in the \(\text{ LPCP } \) model, there is no guarantee that the \(\text{ LIP } \) prover will apply the same linear function to each of these queries; instead, we only know that the \(\text{ LIP } \) prover will apply some affine function \(\Pi \) to the concatenation of \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\). Thus, we show how to transform any \(\text{ LPCP } \) \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) with knowledge error \(\varepsilon \) into a two-message \(\text{ LIP } \) \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) with knowledge error at most \(\varepsilon + \frac{1}{|\mathbb {F}|}\). If the \(\text{ LPCP } \) has \(k\) queries of length \(m\) and is over a field \(\mathbb {F}\), then the \(\text{ LIP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) will send \((k+1)m\) field elements and receive \((k+1)\) field elements from the \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\). The idea of the transformation is for \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) to run \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) and then also perform a consistency test (consisting of also sending to \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) a random linear combination of the \(k\) queries of \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) and then verifying the obvious condition on the received answers).

More precisely, we construct a two-message \(\text{ LIP } \) \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) from an \(\text{ LPCP } \) \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) as follows:

Construction 3.1

Let \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) be a \(k\)-query \(\text{ LPCP } \) over \(\mathbb {F}\) with query length \(m\). Define a two-message \(\text{ LIP } \) \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) as follows.

  • The \(\text{ LIP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) runs the \(\text{ LPCP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) to obtain \(k\) queries \(\varvec{q}_{1},\dots ,\varvec{q}_{k} \in \mathbb {F}^{m}\), draws \(\alpha _{1},\dots ,\alpha _{k}\) in \(\mathbb {F}\) uniformly at random, and sends to the \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) the \((k+1)m\) field elements obtained by concatenating the \(k\) queries \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\) together with the additional query \(\varvec{q}_{k+1} := \sum _{i=1}^{k} \alpha _{i} \varvec{q}_{i}\).

  • The \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) runs the \(\text{ LPCP } \) prover \(P_{\scriptscriptstyle {\mathsf {LPCP}}}\) to obtain a linear function \(\varvec{\pi }:\mathbb {F}^{m} \rightarrow \mathbb {F}\), parses the \((k+1)m\) received field elements as \(k+1\) queries of \(m\) field elements each, applies \(\varvec{\pi }\) to each of these queries to obtain \(k+1\) corresponding field elements \(a_{1},\dots ,a_{k+1}\), and sends these answers to the \(\text{ LIP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LIP}}}\).

  • The \(\text{ LIP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) checks that \(a_{k+1} = \sum _{i=1}^{k} \alpha _{i} a_{i}\) (if this is not the case, it rejects) and decides whether to accept or reject by feeding the \(\text{ LPCP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) with the answers \(a_{1},\dots ,a_{k}\).
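The three steps of Construction 3.1 can be sketched in Python as follows. This is only an illustration over a prime field \(\mathbb {F}_{p}\): the function names `lip_query`, `lip_prove`, and `lip_decide`, the toy queries, and the placeholder decision predicate (which always accepts) are assumptions made for the example, not part of the construction.

```python
import random
from typing import Callable, List, Sequence, Tuple


def lip_query(queries: Sequence[Sequence[int]], p: int,
              rng: random.Random) -> Tuple[List[int], List[int]]:
    """Verifier, step 1: concatenate q_1..q_k with q_{k+1} = sum_i alpha_i * q_i."""
    m = len(queries[0])
    alphas = [rng.randrange(p) for _ in queries]
    extra = [sum(a * q[j] for a, q in zip(alphas, queries)) % p for j in range(m)]
    message = [x % p for q in queries for x in q] + extra
    return message, alphas


def lip_prove(pi: Sequence[int], message: Sequence[int], p: int) -> List[int]:
    """Honest prover: apply the same linear function pi to each length-m block."""
    m = len(pi)
    blocks = [message[i:i + m] for i in range(0, len(message), m)]
    return [sum(c * x for c, x in zip(pi, block)) % p for block in blocks]


def lip_decide(alphas: Sequence[int], answers: Sequence[int], p: int,
               lpcp_decide: Callable[[Sequence[int]], bool]) -> bool:
    """Verifier, step 3: consistency check, then the LPCP decision on a_1..a_k."""
    *first_k, last = answers
    if last != sum(al * a for al, a in zip(alphas, first_k)) % p:
        return False
    return lpcp_decide(first_k)


# Toy run with k = 2 queries of length m = 2; the LPCP decision is a placeholder.
p = 101
queries = [[1, 2], [3, 4]]
message, alphas = lip_query(queries, p, random.Random(0))
answers = lip_prove([5, 6], message, p)
accepted = lip_decide(alphas, answers, p, lambda a: True)
```

In the toy run the verifier message has \((k+1)m = 6\) field elements and the prover message has \(k+1 = 3\), matching the counts stated above.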

Lemma 3.2

(From LPCP to LIP) Suppose \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is a \(k\)-query \(\text{ LPCP } \) for a relation \(\mathcal {R}\) over \(\mathbb {F}\) with query length \(m\) and knowledge error \(\varepsilon \). Then, \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) from Construction 3.1 is a two-message \(\text{ LIP } \) for \(\mathcal {R}\) over \(\mathbb {F}\) with verifier message in \(\mathbb {F}^{(k+1)m}\), prover message in \(\mathbb {F}^{k+1}\), and knowledge error \(\varepsilon + \frac{1}{|\mathbb {F}|}\). Moreover,

  • if \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has strong knowledge, then \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) also does, and

  • if \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has an algebraic verifier of degree \((d_{Q} ,d_{D})\), then \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) has one with degree \((d_{Q} ,\max \{2,d_{D}\})\).

Proof

Syntactic properties and completeness are easy to verify. Furthermore, since the construction of \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) from \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) only involves an additional quadratic test, the degree of \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) is \((d_{Q} ,\max \{2,d_{D}\})\). We are left to argue knowledge (and strong knowledge).

Let \(\Pi :\mathbb {F}^{(k+1)m} \rightarrow \mathbb {F}^{k+1}\) be an affine strategy of a potentially malicious \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}^{*}\). We specify \(\Pi \) by \((k+1)^{2}\) linear functions \(\varvec{\pi }_{i,j} :\mathbb {F}^{m} \rightarrow \mathbb {F}\) for \(i,j \in \{1,\dots ,k+1\}\) and a constant vector \(\gamma = (\gamma _{1},\dots ,\gamma _{k+1}) \in \mathbb {F}^{k+1}\) such that the i-th answer of \(P_{\scriptscriptstyle {\mathsf {LIP}}}^{*}\) is given by \(a_{i} := \sum _{j=1}^{k+1} \left\langle \varvec{\pi }_{i,j} , \varvec{q}_{j} \right\rangle + \gamma _{i}\). It suffices to show that, for any choice of queries \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\), at least one of the following conditions holds:

  • \(a_{i} = \left\langle \varvec{\pi }_{k+1,k+1} , \varvec{q}_{i} \right\rangle \) for all \(i\in [k]\), or

  • with probability greater than \(1-\frac{1}{|\mathbb {F}|}\) over \(\alpha _{1},\dots ,\alpha _{k}\), \(P_{\scriptscriptstyle {\mathsf {LIP}}}^{*}\) does not pass the consistency check.

Indeed, the above tells us that if \(\Pi \) makes \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) accept with probability greater than \(\varepsilon + \frac{1}{|\mathbb {F}|}\), then \(\varvec{\pi }_{k+1,k+1}\) makes \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\) accept with probability greater than \(\varepsilon \). Knowledge (and strong knowledge) thus follow as claimed.

To show the above, fix a tuple of queries, and assume that, for some \(i^{*} \in [k]\), \(a_{i^{*}} \ne \left\langle \varvec{\pi }_{k+1,k+1} , \varvec{q}_{i^{*}} \right\rangle \). For the consistency check to pass, it should hold that:

$$\begin{aligned} \sum _{i=1}^{k} \alpha _{i} \left( \sum _{j=1}^{k+1} \left\langle \varvec{\pi }_{i,j} , \varvec{q}_{j} \right\rangle + \gamma _{i} \right) = \sum _{j=1}^{k+1} \left\langle \varvec{\pi }_{k+1,j} , \varvec{q}_{j} \right\rangle + \gamma _{k+ 1}. \end{aligned}$$

Equivalently,

$$\begin{aligned} \sum _{j=1}^{k+1}\sum _{i=1}^{k} \alpha _{i} \left\langle \varvec{\pi }_{i,j} , \varvec{q}_{j} \right\rangle + \sum _{i=1}^{k} \alpha _{i} \gamma _{i} = \sum _{j=1}^{k+1} \left\langle \varvec{\pi }_{k+1,j} , \varvec{q}_{j} \right\rangle + \gamma _{k+ 1}. \end{aligned}$$

Breaking the first summation using the equality \(\varvec{q}_{k+1} = \sum _{j=1}^{k} \alpha _{j} \varvec{q}_{j}\), we get:

$$\begin{aligned}&\sum _{j=1}^{k} \sum _{i=1}^{k} \alpha _{i} \left\langle \varvec{\pi }_{i,j} , \varvec{q}_{j} \right\rangle + \sum _{i=1}^{k} \alpha _{i} \left\langle \varvec{\pi }_{i,k+1} , \left( \sum _{j=1}^{k} \alpha _{j} \varvec{q}_{j} \right) \right\rangle + \sum _{i=1}^{k} \alpha _{i} \gamma _{i} \\&\quad = \sum _{j=1}^{k} \left\langle \varvec{\pi }_{k+1,j} , \varvec{q}_{j} \right\rangle + \left\langle \varvec{\pi }_{k+1,k+1} , \sum _{j=1}^{k} \alpha _{j}\varvec{q}_{j} \right\rangle + \gamma _{k+ 1}. \end{aligned}$$

Rearranging, we see that the consistency check reduces to verifying the following equation:

$$\begin{aligned}&\sum _{i,j=1}^{k} \alpha _{i} \alpha _{j} \left\langle \varvec{\pi }_{i,k+1} , \varvec{q}_{j} \right\rangle + \sum _{i=1}^{k} \alpha _{i} \left( \sum _{j=1}^{k} \left\langle \varvec{\pi }_{i,j} , \varvec{q}_{j} \right\rangle - \left\langle \varvec{\pi }_{k+1,k+1} , \varvec{q}_{i} \right\rangle + \gamma _{i} \right) \\&\quad - \left( \sum _{i=1}^{k} \left\langle \varvec{\pi }_{k+1,i} , \varvec{q}_{i} \right\rangle + \gamma _{k+1} \right) = 0. \end{aligned}$$

Because \(\sum _{j=1}^{k+1} \left\langle \varvec{\pi }_{i^{*},j} , \varvec{q}_{j} \right\rangle + \gamma _{i^{*}} = a_{i^{*}} \ne \left\langle \varvec{\pi }_{k+1,k+1} , \varvec{q}_{i^{*}} \right\rangle \), the coefficient of \(\alpha _{i^{*}}\) in the above polynomial is non-zero. Hence, by the Schwartz–Zippel Lemma (see Lemma 2.1), the identity holds with probability at most \(\frac{1}{|\mathbb {F}|}\). \(\square \)
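As a sanity check of this argument on one small instance, the following Python sketch enumerates all choices of \((\alpha _{1},\alpha _{2})\) for a cheating prover that applies a different linear function to each of the three query blocks. The field size 101, the queries, and the three functions are arbitrary values chosen for illustration; the count is specific to this instance and is not a general statement.

```python
from itertools import product

p = 101
q = [(1, 2), (3, 4)]              # k = 2 LPCP queries of length m = 2
pis = [(5, 6), (7, 8), (9, 10)]   # cheating prover: one linear function per block


def inner(u, v):
    """Inner product over F_p."""
    return sum(a * b for a, b in zip(u, v)) % p


passing = 0
for alphas in product(range(p), repeat=2):
    # q_3 = alpha_1 * q_1 + alpha_2 * q_2, as the verifier would compute it
    q3 = tuple(sum(al * qi[j] for al, qi in zip(alphas, q)) % p for j in range(2))
    a1, a2, a3 = inner(pis[0], q[0]), inner(pis[1], q[1]), inner(pis[2], q3)
    if a3 == (alphas[0] * a1 + alphas[1] * a2) % p:
        passing += 1

fraction = passing / p ** 2  # share of (alpha_1, alpha_2) that pass the check
```

Here \(a_{1} = 17\) differs from \(\left\langle \varvec{\pi }_{3} , \varvec{q}_{1} \right\rangle = 29\), so the hypothesis of the argument holds with \(i^{*}=1\), and the check reduces to \(12\alpha _{1} + 14\alpha _{2} = 0\) over \(\mathbb {F}_{101}\), which is satisfied by exactly 101 of the \(101^{2}\) pairs.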

In light of the two \(\text{ LPCP } \) constructions described in Sect. 7, we deduce the following two theorems.

Theorem 3.3

Let \(\mathbb {F}\) be a finite field and \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) a boolean circuit of size \(s\). There is an input-oblivious two-message \(\text{ LIP } \) for \(\mathcal {R}_{C}\) with knowledge error \(O(1/|\mathbb {F}|)\), verifier message in \(\mathbb {F}^{O(s^{2})}\), prover message in \(\mathbb {F}^{4}\), and degree \((2 ,2)\). Furthermore:

  • the \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) is an arithmetic circuit of size \(O(s^{2})\);

  • the \(\text{ LIP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) is an arithmetic circuit of size \(O(s^{2})\);

  • the \(\text{ LIP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LIP}}}\) is an arithmetic circuit of size \(O(n)\).

Theorem 3.4

Let \(\mathbb {F}\) be a finite field and \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) a boolean circuit of size \(s\). There is an input-oblivious two-message \(\text{ LIP } \) for \(\mathcal {R}_{C}\) with knowledge error \(O(s/|\mathbb {F}|)\), verifier message in \(\mathbb {F}^{O(s)}\), prover message in \(\mathbb {F}^{4}\), and degree \((O(s) ,2)\). Furthermore:

  • the \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) is an arithmetic circuit of size \(\widetilde{O}(s)\);

  • the \(\text{ LIP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) is an arithmetic circuit of size \(O(s)\);

  • the \(\text{ LIP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LIP}}}\) is an arithmetic circuit of size \(O(n)\).

3.1.1 Zero-Knowledge

The \(\text{ LIP } \)s we obtain via the above transformation can all be made honest-verifier zero-knowledge (\(\text{ HVZK } \)) by starting with an \(\text{ HVZK } \text{ LPCP } \). For this purpose, we show in Sect. 8 a general transformation from any \(\text{ LPCP } \) with \(d_{D}=O(1)\) to a corresponding \(\text{ HVZK } \text{ LPCP } \), with only small overhead in parameters.

3.2 \(\text{ LIP } \)s From (Traditional) \(\text{ PCP } \)s

We present a second general construction of \(\text{ LIP } \)s. Instead of \(\text{ LPCP } \)s, this time we rely on a traditional \(k\)-query \(\text{ PCP } \) in which the proof \(\pi \) is a binary string of length \(m={\mathrm {poly}}(|x|)\). While any \(\text{ PCP } \) can be viewed as an \(\text{ LPCP } \) (by mapping each query location \(q\in [m]\) to the unit vector \(\varvec{e}_{q}\) equal to 1 at the \(q\)-th position and 0 everywhere else), applying the transformation from Sect. 3.1 yields an \(\text{ LIP } \) in which the prover’s message consists of \(k+1\) field elements. Here we rely on the sparseness of the queries of an \(\text{ LPCP } \) that is obtained from a \(\text{ PCP } \) in order to reduce the number of field elements returned by the prover to 1. (In particular, the transformation presented in this section does not apply to either of the \(\text{ LPCP } \)s we construct in Sect. 7, because they do not have sparse queries.) The construction relies on the easiness of solving instances of subset sum in which each integer is bigger than the sum of the previous integers (see [81]).

Fact 3.5

There is a quasilinear-time algorithm for the following problem:

  • input: Non-negative integers \(w_{1},\dots ,w_{k},a\) such that each \(w_{i}\) is bigger than the sum of the previous \(w_{j}\).

  • output: A binary vector \((a_{1} ,\dots ,a_{k}) \in \{0,1\}^{k}\) such that \(a=\sum _{i=1}^{k} a_{i} w_{i}\) (if one exists).

(All integers are given in binary representation.)
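For intuition, the following Python sketch solves such instances with the standard greedy scan from the largest weight to the smallest. It is a minimal illustration that assumes the weights are positive and superincreasing; it is not claimed to be the specific algorithm of [81], and the name `solve_superincreasing` is hypothetical.

```python
from typing import List, Optional, Sequence


def solve_superincreasing(weights: Sequence[int], target: int) -> Optional[List[int]]:
    """Return bits with sum(bits[i] * weights[i]) == target, or None if none exist.

    Assumes each weight is positive and exceeds the sum of all earlier weights.
    """
    bits = [0] * len(weights)
    remaining = target
    for i in range(len(weights) - 1, -1, -1):
        # If the remainder reaches weights[i], the smaller weights alone
        # cannot cover it, so weights[i] must be part of any solution.
        if remaining >= weights[i]:
            bits[i] = 1
            remaining -= weights[i]
    return bits if remaining == 0 else None


solution = solve_superincreasing([2, 3, 7, 15], 17)   # 2 + 15
no_solution = solve_superincreasing([2, 3, 7, 15], 4)
```

The scan performs one comparison and at most one subtraction per weight, so its cost is dominated by arithmetic on the binary representations of the integers.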

The following construction uses a parameter \(\ell \) that will affect the soundness error. We assume that the field \(\mathbb {F}\) is of a prime order \(\mathfrak {p}\) where \(\mathfrak {p}> 2^{k}\ell \) and identify its elements with the integers \(0,\dots ,\mathfrak {p}-1\).

Construction 3.6

Let \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) be a \(k\)-query \(\text{ PCP } \) with proof length \(m\). Define an \(\text{ LIP } \) \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) over \(\mathbb {F}\) as follows.

  • The \(\text{ LIP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) runs the \(\text{ PCP } \) verifier \(V_{\scriptscriptstyle {\mathsf {PCP}}}\) to obtain \(k\) distinct query locations \(q_{1},\dots ,q_{k} \in [m]\), picks a sequence of \(k\) random field elements

    $$\begin{aligned} w_{1}&\leftarrow [0,\ell -1], w_{2} \leftarrow [\ell ,2\ell -1],w_{3} \leftarrow [3\ell ,4\ell -1] ,\ldots ,\\ w_{k}&\leftarrow [(2^{k-1}-1)\ell ,2^{k-1}\ell -1], \end{aligned}$$

    and sends to the \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) the vector \(\varvec{q}=\sum _{i=1}^{k} w_{i} \varvec{e}_{q_{i}}\), where \(\varvec{e}_{j}\) is the j-th unit vector in \(\mathbb {F}^{m}\).

  • The \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) responds by applying to \(\varvec{q}\) the linear function \(\varvec{\pi }:\mathbb {F}^{m} \rightarrow \mathbb {F}\) whose coefficients are specified by the \(m\) bits of the \(\text{ PCP } \) generated by the \(\text{ PCP } \) prover \(P_{\scriptscriptstyle {\mathsf {PCP}}}\). Let \(a\) denote the field element returned by \(P_{\scriptscriptstyle {\mathsf {LIP}}}\).

  • The \(\text{ LIP } \) verifier \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) applies the subset sum algorithm of Fact 3.5 to find \((a_{1} ,\dots ,a_{k}) \in \{0,1\}^{k}\) such that \(a=\sum _{i=1}^{k} a_{i} w_{i}\) (if none exists it rejects) and decides whether to accept by feeding the \(\text{ PCP } \) verifier \(V_{\scriptscriptstyle {\mathsf {PCP}}}\) with \(a_{1},\dots ,a_{k}\).
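One round of Construction 3.6 can be sketched in Python as follows. The function names, the sparse dictionary representation of \(\varvec{q}\), the 0-based query locations, and the concrete numbers (\(\ell = 10\), \(k = 3\), \(\mathfrak {p} = 101\)) are assumptions made for illustration. The decoder is a simple greedy scan, and the sketch does not treat the corner case \(w_{1} = 0\) specially.

```python
import random
from typing import Dict, List, Optional, Sequence


def sample_weights(k: int, ell: int, rng: random.Random) -> List[int]:
    """Draw w_i uniformly from [(2^(i-1) - 1) * ell, 2^(i-1) * ell - 1]."""
    return [rng.randrange((2 ** (i - 1) - 1) * ell, 2 ** (i - 1) * ell)
            for i in range(1, k + 1)]


def build_query(locations: Sequence[int], weights: Sequence[int]) -> Dict[int, int]:
    """Sparse form of q = sum_i w_i * e_{q_i}; locations must be distinct."""
    return dict(zip(locations, weights))


def prover_answer(pcp_bits: Sequence[int], query: Dict[int, int], p: int) -> int:
    """Honest prover: inner product of the PCP string with the query over F_p."""
    return sum(w * pcp_bits[loc] for loc, w in query.items()) % p


def decode(weights: Sequence[int], a: int) -> Optional[List[int]]:
    """Greedy subset-sum decoding of the single field element a."""
    bits = []
    for w in reversed(weights):
        take = 1 if a >= w else 0
        a -= take * w
        bits.append(take)
    return bits[::-1] if a == 0 else None


# Toy run with explicit weights taken from the intervals for ell = 10, k = 3.
p, ell, k = 101, 10, 3               # p > 2^k * ell = 80
pcp_bits = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical PCP string, m = 8
locations = [0, 4, 6]                # hypothetical query locations
weights = [7, 13, 35]
answer = prover_answer(pcp_bits, build_query(locations, weights), p)
decoded = decode(weights, answer)
```

In this run the prover returns the single element 42, from which the verifier recovers the three queried bits and can hand them to the \(\text{ PCP } \) verifier.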

Lemma 3.7

(From PCP to LIP) Suppose \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) is a \(k\)-query \(\text{ PCP } \) for a relation \(\mathcal {R}\) with proof length \(m\) and knowledge error \(\varepsilon \), and \(\mathbb {F}\) is a field of prime order \(\mathfrak {p}\) with \(\mathfrak {p}> 2^{k} \ell \). Then \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) from Construction 3.6 is a two-message \(\text{ LIP } \) for \(\mathcal {R}\) over \(\mathbb {F}\) with verifier message in \(\mathbb {F}^{m}\), prover message in \(\mathbb {F}\), and knowledge error \(\varepsilon + \frac{2^{k}}{\ell }\).

Proof

Because the prover message is in \(\mathbb {F}\) (i.e., the prover returns a single field element) the prover strategy is an affine function \(\Pi ^{*} :\mathbb {F}^{m} \rightarrow \mathbb {F}\) (i.e., as in an \(\text{ LPCP } \), see Remark 2.8). Let \(\varvec{\pi }^{*} :\mathbb {F}^{m} \rightarrow \mathbb {F}\) be a linear function and \(\gamma ^{*} \in \mathbb {F}\) be a constant such that \(\Pi ^{*}(\varvec{q}) = \left\langle \varvec{\pi }^{*} , \varvec{q} \right\rangle + \gamma ^{*}\) for all \(\varvec{q}\in \mathbb {F}^{m}\).

We say that query positions \(q_{1},\dots ,q_{k} \in [m]\) are invalid with respect to \(\Pi ^{*}\) if \(\gamma ^{*} \ne 0\) or there is \(i \in \{1,\dots ,k\}\) such that \(\Pi ^{*} (\varvec{e}_{q_{i}}) \not \in \{0,1\}\). It suffices to show that, for any strategy \(\Pi ^{*}\) as above, conditioned on any choice of invalid query positions \(q_{1},\dots ,q_{k}\) by \(V_{\scriptscriptstyle {\mathsf {LIP}}}\), the probability of \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) accepting is bounded by \(2^{k} / \ell \). Indeed, for queries for which \(\Pi ^{*}\) is valid, it holds that \(\Pi ^{*}(\varvec{e}_{q_{i}}) = \left\langle \varvec{\pi }^{*} , \varvec{e}_{q_{i}} \right\rangle \in \{0,1\}\), corresponding to a traditional \(\text{ PCP } \) oracle \(\varvec{\pi }^{*}\), so that the knowledge guarantees of \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) would kick in.

The above follows from the sparseness of the answers \(a\) that correspond to valid strategies and the high entropy of the answer resulting from any invalid strategy. Concretely, fix any candidate solution \((a_{1} ,\dots ,a_{k}) \in \{0,1\}^{k}\) and pick \(w_{1},\dots ,w_{k}\) as in Construction 3.6. Since each \(w_{i}\) is picked uniformly from an interval of size \(\ell \),

$$\begin{aligned}&\mathrm{Pr}_{w_{1},\dots ,w_{k}} \left[ \; \Pi ^{*} \cdot \left( \sum _{i=1}^{k} w_{i} \varvec{e}_{q_{i}} \right) ^{\top } = \sum _{i=1}^{k} a_{i} w_{i} \;\right] \\&\quad = \mathrm{Pr}_{w_{1},\dots ,w_{k}} \left[ \; \left\langle \varvec{\pi }^{*} , \sum _{i=1}^{k} w_{i} \varvec{e}_{q_{i}} \right\rangle + \gamma ^{*} = \sum _{i=1}^{k} a_{i} w_{i} \;\right] \\&\quad = \mathrm{Pr}_{w_{1},\dots ,w_{k}} \left[ \; \sum _{i=1}^{k} (\pi ^{*}_{q_{i}}-a_{i}) w_{i} + \gamma ^{*} =0 \;\right] \\&\quad \le \frac{1}{\ell }. \end{aligned}$$

Indeed, noting that \(\sum _{i=1}^{k} (\pi ^{*}_{q_{i}}-a_{i}) w_{i} + \gamma ^{*}\) is a degree-1 polynomial in the variables \(w_{1},\dots ,w_{k}\),

  • if there is \(i \in \{1,\dots ,k\}\) such that \(\Pi ^{*}(\varvec{e}_{q_{i}}) \not \in \{0,1\}\) then the coefficient of \(w_{i}\) is non-zero (since \(a_{i} \in \{0,1\}\)) and thus, by the Schwartz–Zippel Lemma (see Lemma 2.1), the probability that the polynomial vanishes is at most \(1 / \ell \); and

  • if instead for all \(i \in \{1,\dots ,k\}\) it holds that \(\Pi ^{*}(\varvec{e}_{q_{i}}) \in \{0,1\}\) then it must be that \(\gamma ^{*} \ne 0\); if there is \(i \in \{1,\dots ,k\}\) such that \(\pi ^{*}_{q_{i}} \ne a_{i}\) then the same argument as in the previous bullet applies; otherwise, the polynomial is the constant \(\gamma ^{*}\), which is non-zero, so it vanishes with probability 0.

By a union bound, the probability that there exists a solution \((a_{1} ,\dots ,a_{k}) \in \{0,1\}^{k}\) such that \(\Pi ^{*}(\sum _{i=1}^{k} w_{i} \varvec{e}_{q_{i}} ) = \sum _{i=1}^{k} a_{i} w_{i}\) is at most \(2^{k} / \ell \). Hence, the subset sum algorithm will fail to find a solution and \(V_{\scriptscriptstyle {\mathsf {LIP}}}\) will reject except with probability at most \(2^{k} / \ell \). \(\square \)

By setting \(\ell := 2^{k} / \varepsilon \), we obtain the following corollary:

Corollary 3.8

Suppose \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) is a \(k\)-query \(\text{ PCP } \) for a relation \(\mathcal {R}\) with proof length \(m\) and knowledge error \(\varepsilon \), and \(\mathbb {F}\) is a field of prime order \(\mathfrak {p}\) with \(\mathfrak {p}> 2^{2k} / \varepsilon \). Then \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) from Construction 3.6 is a two-message \(\text{ LIP } \) for \(\mathcal {R}\) over \(\mathbb {F}\) with verifier message in \(\mathbb {F}^{m}\), prover message in \(\mathbb {F}\), and knowledge error \(2\varepsilon \).
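The parameters of Corollary 3.8 follow from Lemma 3.7 by a short calculation, spelled out here for convenience (ignoring the rounding of \(\ell \) to an integer, which is a simplifying assumption of this sketch):

$$\begin{aligned} \ell = \frac{2^{k}}{\varepsilon } \quad \Longrightarrow \quad \varepsilon + \frac{2^{k}}{\ell } = \varepsilon + \varepsilon = 2\varepsilon \quad \text {and}\quad 2^{k}\ell = \frac{2^{2k}}{\varepsilon }. \end{aligned}$$

Thus the knowledge error \(\varepsilon + 2^{k}/\ell \) of Lemma 3.7 becomes \(2\varepsilon \), and the requirement \(\mathfrak {p}> 2^{k}\ell \) becomes \(\mathfrak {p}> 2^{2k}/\varepsilon \).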

There are many \(\text{ PCP } \)s in the literature (e.g., [4, 5, 16, 29, 30, 32, 33, 49, 53, 66, 70, 80, 84, 90, 93]), optimizing various parameters.

Focusing on asymptotic time complexity, perhaps the most relevant \(\text{ PCP } \)s for our purposes here are those of Ben-Sasson et al. [28]. They constructed \(\text{ PCP } \)s where, to prove and verify that a random-access machine \(M\) accepts \((x,w)\) within \(t\) steps for some \(w\) with \(|w| \le t\), the prover runs in time \((|M| + |x| + t) \cdot {\mathrm {polylog}}(t)\) and the verifier runs in time \((|M|+|x|) \cdot {\mathrm {polylog}}(t)\) (while asking \({\mathrm {polylog}}(t)\) queries, for constant soundness). Invoking Corollary 3.8 with these \(\text{ PCP } \)s, one can deduce the following theorem.

Theorem 3.9

Let \(\mathbb {F}\) be a finite field and \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) a boolean circuit of size \(s\). There is an input-oblivious two-message \(\text{ LIP } \) for \(\mathcal {R}_{C}\) with knowledge error \(2^{-\lambda }\), verifier message in \(\mathbb {F}^{\tilde{O}(s)}\), prover message in \(\mathbb {F}\), and \(|\mathbb {F}| > 2^{\lambda \cdot {\mathrm {polylog}}(s)}\). Furthermore:

  • the \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) runs in time \(\tilde{O}(s)\);

  • the \(\text{ LIP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) runs in time \(\tilde{O}(s) + \lambda \cdot n\cdot {\mathrm {polylog}}(s)\);

  • the \(\text{ LIP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LIP}}}\) runs in time \(\lambda \cdot n\cdot {\mathrm {polylog}}(s)\).

(All the above running times are up to \({\mathrm {polylog}}(|\mathbb {F}|)\) factors.)

Focusing on communication complexity instead, we can invoke Corollary 3.8 with the query-efficient \(\text{ PCP } \)s of Håstad and Khot [69], which have \(\lambda + o(\lambda )\) queries for soundness \(2^{-\lambda }\). (Because their \(\text{ PCP } \)s have a query algorithm that depends on the input, we only obtain an \(\text{ LIP } \) where the verifier’s message depends on the input; it is plausible that [69] can be modified to be input oblivious, but we did not check this.)

Theorem 3.10

Let \(\mathbb {F}\) be a finite field and \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) a boolean circuit of size \(s\). There is a two-message \(\text{ LIP } \) for \(\mathcal {R}_{C}\) with knowledge error \(2^{-\lambda }\), verifier message in \(\mathbb {F}^{{\mathrm {poly}}(s)}\), prover message in \(\mathbb {F}\), and \(|\mathbb {F}| > 2^{\lambda \cdot (3+o(1))}\). Furthermore:

  • the \(\text{ LIP } \) prover \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) runs in time \({\mathrm {poly}}(s)\);

  • the \(\text{ LIP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) runs in time \({\mathrm {poly}}(s) + \lambda \cdot n\cdot {\mathrm {polylog}}(s)\);

  • the \(\text{ LIP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LIP}}}\) runs in time \(\lambda \cdot n\cdot {\mathrm {polylog}}(s)\).

(All the above running times are up to \({\mathrm {polylog}}(|\mathbb {F}|)\) factors.)
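One way to see where the field-size bound of Theorem 3.10 comes from is to instantiate Corollary 3.8 with \(k = \lambda + o(\lambda )\) queries and \(\text{ PCP } \) knowledge error \(\varepsilon = 2^{-(\lambda +1)}\). The choice of \(\varepsilon \) is an assumption of this sketch, made so that the resulting knowledge error \(2\varepsilon \) equals \(2^{-\lambda }\):

$$\begin{aligned} \mathfrak {p}> \frac{2^{2k}}{\varepsilon } = 2^{2(\lambda + o(\lambda )) + \lambda + 1} = 2^{\lambda \cdot (3+o(1))}. \end{aligned}$$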

The verifiers of the \(\text{ PCP } \)s of Ben-Sasson et al. [28] (used to derive Theorem 3.9) and of Håstad and Khot [69] (used to derive Theorem 3.10) do not have low degree, and thus the \(\text{ LIP } \)s they induce via our transformation are not algebraic. In Sect. 9, we give evidence that this is inherent, proving that no \(\text{ PCP } \) (for a hard enough language) can have low-degree verifiers (or even satisfy a weaker, but still useful, property that we call “strong knowledge”).

3.2.1 Zero-Knowledge

In Sect. 3.1.1 we discussed a generic transformation from any \(\text{ LPCP } \) with \(d_{D}=O(1)\) to a corresponding \(\text{ HVZK } \text{ LPCP } \). A (traditional) \(\text{ PCP } \) does not typically induce an \(\text{ LPCP } \) with \(d_{D}=O(1)\). Thus, if we want to obtain an \(\text{ HVZK } \text{ LIP } \) through Construction 3.6, we need a different approach.

We observe that if we plug into Construction 3.6 a \(\text{ PCP } \) that is \(\text{ HVZK } \) (see Definition 2.4), then the corresponding \(\text{ LIP } \) is also \(\text{ HVZK } \).

Lemma 3.11

In Lemma 3.7, if \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) is an \(\text{ HVZK } \text{ PCP } \) then \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) is an \(\text{ HVZK } \text{ LIP } \).

4 Definitions of \(\text{ SNARK } \)s and Preprocessing \(\text{ SNARK } \)s

We now turn to the cryptographic part of this work. We define the notions of a \(\text{ SNARK } \) and a preprocessing \(\text{ SNARK } \). Our definitions follow those in previous works (see for instance, [10, 64]).

We first recall the universal relation [19], which provides us with a canonical form to represent verification-of-computation problems. Because such problems typically arise in the form of algorithms (e.g., “is there \(w\) that makes program \(\varvec{P}\) accept \((x,w)\)?”), we adopt the universal relation relative to random-access machines [6, 39].

Definition 4.1

The universal relation is the set \(\mathcal {R}_{\mathcal {U}}\) of instance-witness pairs \((y,w) = \big ( (M,x,t) ,w\big )\), where \(|y|,|w| \le t\) and \(M\) is a random-access machine, such that \(M\) accepts \((x,w)\) after at most \(t\) steps (see Footnote 6). We denote by \(\mathcal {L}_{\mathcal {U}}\) the universal language corresponding to \(\mathcal {R}_{\mathcal {U}}\).

We now proceed to define \(\text{ SNARG } \)s and preprocessing \(\text{ SNARG } \)s. A succinct non-interactive argument (\(\text{ SNARG } \)) is a triple of algorithms \((G ,P,V)\) that works as follows. The (probabilistic) generator \(G\), on input the security parameter \(\lambda \) (presented in unary) and a time bound \(T\), outputs a reference string \(\sigma \) and a corresponding verification state \(\tau \). The honest prover \(P(\sigma ,y,w)\) produces a proof \(\pi \) for the instance \(y=(M,x,t)\) given a witness \(w\), provided that \(t\le T\); then \(V(\tau ,y,\pi )\) verifies the validity of \(\pi \).

The \(\text{ SNARG } \) is adaptive if the prover may choose the statement after seeing \(\sigma \); otherwise, it is non-adaptive. The \(\text{ SNARG } \) is fully-succinct if \(G\) runs “fast”; otherwise, it is of the preprocessing kind.

Definition 4.2

A triple of algorithms \((G ,P,V)\) is a \(\text{ SNARG } \) for the relation \(\mathcal {R}\subseteq \mathcal {R}_{\mathcal {U}}\) if the following conditions are satisfied:

  1. Completeness

    For every large enough security parameter \(\lambda \in \mathbb {N}\), every time bound \(T\in \mathbb {N}\), and every instance-witness pair \((y,w) = \big ( (M,x,t),w\big ) \in \mathcal {R}\) with \(t\le T\),

    $$\begin{aligned} \mathrm{Pr}\left[ V(\tau ,y,\pi )=1\left| \begin{array}{r} (\sigma ,\tau ) \leftarrow G(1^{\lambda },T) \\ \pi \leftarrow P(\sigma ,y,w) \end{array} \right] = 1\right. . \end{aligned}$$
  2. Soundness (depending on which notion is considered)

    • non-adaptive: For every polynomial-size prover \(P^{*}\), every large enough security parameter \(\lambda \in \mathbb {N}\), every time bound \(T\in \mathbb {N}\), and every instance \(y= (M,x,t)\) for which \(\not \exists \, w\text { s.t. } (y,w) \in \mathcal {R}\),

      $$\begin{aligned} \mathrm{Pr}\left[ V(\tau ,y,\pi )=1 \left| \begin{array}{r} (\sigma ,\tau ) \leftarrow G(1^{\lambda },T) \\ \pi \leftarrow P^{*}(\sigma ,y) \end{array} \right] \le {\mathsf {negl}}(\lambda )\right. . \end{aligned}$$
    • adaptive: For every polynomial-size prover \(P^{*}\), every large enough security parameter \(\lambda \in \mathbb {N}\), and every time bound \(T\in \mathbb {N}\),

      $$\begin{aligned} \mathrm{Pr}\left[ \begin{array}{c} V(\tau ,y,\pi )=1 \\ \not \exists \, w\text { s.t. } (y,w) \in \mathcal {R}\end{array} \left| \begin{array}{r} (\sigma ,\tau ) \leftarrow G(1^{\lambda },T) \\ (y,\pi ) \leftarrow P^{*}(\sigma ) \end{array} \right] \le {\mathsf {negl}}(\lambda )\right. . \end{aligned}$$
  3. Efficiency

    There exists a universal polynomial p (independent of \(\mathcal {R}\)) such that, for every large enough security parameter \(\lambda \in \mathbb {N}\), every time bound \(T\in \mathbb {N}\), and every instance \(y= (M,x,t)\) with \(t\le T\),

    • the generator \(G\) runs in time \( {\left\{ \begin{array}{ll} p(\lambda + \log T) &{} \text {for a fully-succinct} \text{ SNARG } \\ p(\lambda + T) &{} \text {for a preprocessing} \text{ SNARG } \end{array}\right. } \) ;

    • the prover \(P\) runs in time \( {\left\{ \begin{array}{ll} p(\lambda + |M|+ |x| + t+ \log T) &{} \text {for a fully-succinct} \quad \text{ SNARG } \\ p(\lambda + |M|+ |x| + T) &{} \text {for a preprocessing}\quad \text{ SNARG } \end{array}\right. } \) ;

    • the verifier \(V\) runs in time \(p(\lambda + |M| + |x| + \log T)\);

    • an honestly generated proof has size \(p(\lambda + \log T)\).
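To illustrate the gap between the two regimes in the efficiency condition, the following Python sketch evaluates the arguments that are fed to the universal polynomial \(p\) for some sample parameters. The function name `polynomial_arguments` and the sample values are illustrative assumptions, and the sketch deliberately leaves \(p\) itself unspecified.

```python
import math
from typing import Dict


def polynomial_arguments(lam: int, size_M: int, size_x: int, t: int, T: int,
                         fully_succinct: bool) -> Dict[str, int]:
    """Arguments passed to the universal polynomial p in the efficiency condition."""
    log_T = math.ceil(math.log2(T))
    return {
        "generator": lam + (log_T if fully_succinct else T),
        "prover": lam + size_M + size_x + (t + log_T if fully_succinct else T),
        "verifier": lam + size_M + size_x + log_T,
        "proof_size": lam + log_T,
    }


# Sample parameters (hypothetical): lambda = 128, |M| = 1000, |x| = 64,
# t = 2^20 steps, time bound T = 2^30.
full = polynomial_arguments(128, 1000, 64, 2 ** 20, 2 ** 30, fully_succinct=True)
prep = polynomial_arguments(128, 1000, 64, 2 ** 20, 2 ** 30, fully_succinct=False)
```

For these values the generator's argument is 158 in the fully-succinct case and exceeds \(2^{30}\) in the preprocessing case, while the verifier and proof-size arguments are identical in both.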

Proof of knowledge. A \(\text{ SNARG } \) of knowledge (\(\text{ SNARK } \)) is a \(\text{ SNARG } \) where soundness is strengthened as follows:

Definition 4.3

A triple of algorithms \((G ,P,V)\) is a \(\text{ SNARK } \) for the relation \(\mathcal {R}\) if it is a \(\text{ SNARG } \) for \(\mathcal {R}\) where adaptive soundness is replaced by the following stronger requirement:

  • Adaptive proof of knowledge (see Footnote 7). For every polynomial-size prover \(P^{*}\) there exists a polynomial-size extractor \(E\) such that for every large enough security parameter \(\lambda \in \mathbb {N}\), every auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\), and every time bound \(T\in \mathbb {N}\),

    $$\begin{aligned} \mathrm{Pr}\left[ \begin{array}{c} V(\tau ,y,\pi )=1 \\ (y,w) \notin \mathcal {R}\end{array} \left| \begin{array}{r} (\sigma ,\tau ) \leftarrow G(1^{\lambda },T) \\ (y,\pi ) \leftarrow P^{*}(z,\sigma ) \\ w\leftarrow E(z,\sigma ) \end{array} \right] \le {\mathsf {negl}}(\lambda ) \right. . \end{aligned}$$
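The experiment inside this probability can be written as a small test harness. The Python sketch below only mirrors the syntax of the definition: the function `knowledge_experiment` and the toy scheme (whose "proof" is simply the witness, so it is neither succinct nor a real argument system) are hypothetical and serve to exercise the harness.

```python
from typing import Any, Callable, Tuple


def knowledge_experiment(G: Callable[[int, int], Tuple[Any, Any]],
                         P_star: Callable[[Any, Any], Tuple[Any, Any]],
                         E: Callable[[Any, Any], Any],
                         V: Callable[[Any, Any, Any], int],
                         in_relation: Callable[[Any, Any], bool],
                         lam: int, T: int, z: Any) -> bool:
    """One trial; True iff V accepts while E's output is not a valid witness."""
    sigma, tau = G(lam, T)
    y, proof = P_star(z, sigma)
    w = E(z, sigma)
    return V(tau, y, proof) == 1 and not in_relation(y, w)


# Toy instantiation (NOT succinct): y is an integer, a witness is a square root.
def toy_G(lam, T):
    return None, None                      # empty reference string and state

def toy_prover(z, sigma):
    return z * z, z                        # statement y = z^2, proof = witness

def toy_V(tau, y, proof):
    return 1 if proof * proof == y else 0

def toy_relation(y, w):
    return w * w == y

good_extractor_fails = knowledge_experiment(
    toy_G, toy_prover, lambda z, sigma: z, toy_V, toy_relation, 128, 1024, 12)
bad_extractor_fails = knowledge_experiment(
    toy_G, toy_prover, lambda z, sigma: 0, toy_V, toy_relation, 128, 1024, 12)
```

The definition requires that, for the right extractor, the event reported by the harness occurs with at most negligible probability over the randomness of \(G\); the toy run merely shows one extractor that never triggers it and one that always does.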

One may want to distinguish between the case where the verification state \(\tau \) is allowed to be public or needs to remain private: a publicly-verifiable \(\text{ SNARK } \) (\(\text{ pvSNARK } \)) is one where security holds even if \(\tau \) is public; in contrast, a designated-verifier \(\text{ SNARK } \) (\(\text{ dvSNARK } \)) is one where \(\tau \) needs to remain secret.

Zero-knowledge. A zero-knowledge \(\text{ SNARK } \) (or “succinct NIZK of knowledge”) is a \(\text{ SNARK } \) satisfying a zero-knowledge property. Namely, zero knowledge ensures that the honest prover can generate valid proofs for true theorems without leaking any information about the theorem beyond the fact that the theorem is true (in particular, without leaking any information about the witness that he used to generate the proof for the theorem). Of course, when considering zero-knowledge \(\text{ SNARK } \)s, the reference string \(\sigma \) must be a common reference string that is trusted, not only by the verifier, but also by the prover.

Definition 4.4

A triple of algorithms \((G ,P,V)\) is a (perfect) zero-knowledge SNARK for the relation \(\mathcal {R}\) if it is a \(\text{ SNARK } \) for \(\mathcal {R}\) and, moreover, satisfies the following property:

  • Zero Knowledge

    There exists a stateful interactive polynomial-size simulator \(S\) such that for every stateful interactive polynomial-size distinguisher \(\mathcal {D}\), every large enough security parameter \(\lambda \in \mathbb {N}\), every auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\), and every time bound \(T\in \mathbb {N}\),

    $$\begin{aligned}&\mathrm{Pr}\left[ \begin{array}{c} t\le T\\ (y,w) \in \mathcal {R}_{\mathcal {U}} \\ \mathcal {D}(\pi ) = 1 \end{array} \left| \begin{array}{r} (\sigma ,\tau ) \leftarrow G(1^{\lambda },T) \\ (y,w) \leftarrow \mathcal {D}(z,\sigma ) \\ \pi \leftarrow P(\sigma ,y,w) \end{array} \right] \right. \\&\quad = \mathrm{Pr}\left[ \begin{array}{c} t\le T\\ (y,w) \in \mathcal {R}_{\mathcal {U}} \\ \mathcal {D}(\pi ) = 1 \end{array} \left| \begin{array}{r} (\sigma ,\tau ,{\mathsf {trap}}) \leftarrow S(1^{\lambda },T) \\ (y,w) \leftarrow \mathcal {D}(z,\sigma ) \\ \pi \leftarrow S(z,\sigma ,y,{\mathsf {trap}}) \end{array} \right] \right. . \end{aligned}$$

As usual, Definition 4.4 can be relaxed to consider the case in which the distributions are only statistically or computationally close.

As observed in [11], \(\text{ dvSNARK } \)s (resp., \(\text{ pvSNARK } \)s) can be combined with zero-knowledge (not-necessarily-succinct) non-interactive arguments (NIZKs) of knowledge to obtain zero-knowledge \(\text{ dvSNARK } \)s (resp., \(\text{ pvSNARK } \)s). This observation immediately extends to preprocessing \(\text{ SNARK } \)s, thereby providing a generic method to construct zero-knowledge preprocessing \(\text{ SNARK } \)s from preprocessing \(\text{ SNARK } \)s.

In this work, we also consider more “direct,” and potentially more efficient, ways to construct zero-knowledge preprocessing \(\text{ SNARK } \)s by relying on various constructions of \(\text{ HVZK } \text{ LIP } \)s (and without relying on generic NIZKs). See Sect. 6.3.

(We note that when applying the transformations of [12], e.g., to remove preprocessing, zero knowledge is preserved; see Footnote 8.)

Multiple theorems. A desirable property (especially so when preprocessing is expensive) is the ability to generate \(\sigma \) once and for all and then reuse it in polynomially-many proofs (potentially by different provers). Doing so requires security also against provers that have access to a proof-verification oracle. While for \(\text{ pvSNARK } \)s this multi-theorem proof of knowledge property is automatically guaranteed, this is not the case for \(\text{ dvSNARK } \)s. In Sect. 9, we formally define and discuss this stronger notion of security, and show that some of our \(\text{ dvSNARK } \)s achieve this security notion because of the “strong knowledge” of the underlying \(\text{ LIP } \)s. (We note that, like zero-knowledge above, the multi-theorem proof-of-knowledge property is preserved by the transformations of [12].)

Our focus. In this work we study preprocessing \(\text{ SNARK } \)s, where (as stated in Definition 4.2) the generator \(G\) may run in time polynomial in the security parameter \(\lambda \) and time bound \(T\).

4.1 Preprocessing \(\text{ SNARK } \)s for Boolean Circuit Satisfaction Problems

In Sect. 4, we defined \(\text{ SNARK } \)s for the universal relation. In this work, at times it will be more convenient to discuss preprocessing \(\text{ SNARK } \)s for boolean circuit satisfaction problems rather than for the universal relation (see Footnote 9). We thus briefly sketch the relevant definitions, and also explain how preprocessing \(\text{ SNARK } \)s for boolean circuit satisfaction problems suffice for obtaining preprocessing \(\text{ SNARK } \)s, with similar efficiency, for the universal relation. (Indeed, because we are often interested in the correctness of algorithms, and not boolean circuits, it is important that this transformation be efficient!)

We begin by introducing boolean circuit satisfaction problems:

Definition 4.5

The boolean circuit satisfaction problem of a boolean circuit \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) is the relation \(\mathcal {R}_{C} = \{(x,w) \in \{0,1\}^{n} \times \{0,1\}^{h} : C(x,w) = 1 \}\); its language is denoted \(\mathcal {L}_{C}\). For a family of boolean circuits \(\mathcal {C}= \left\{ C_{\ell } :\{0,1\}^{n(\ell )} \times \{0,1\}^{h(\ell )} \rightarrow \{0,1\}\right\} _{\ell \in \mathbb {N}}\), we denote the corresponding infinite relation and language by \(\mathcal {R}_{\mathcal {C}} = \bigcup _{\ell \in \mathbb {N}}\mathcal {R}_{C_{\ell }}\) and \(\mathcal {L}_{\mathcal {C}} = \bigcup _{\ell \in \mathbb {N}}\mathcal {L}_{C_{\ell }}\).
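To make Definition 4.5 concrete, the following minimal sketch enumerates \(\mathcal {R}_{C}\) and \(\mathcal {L}_{C}\) for a toy circuit. The circuit itself (with \(n = 2\) and \(h = 1\)) and the helper names `circuit`, `relation`, and `language` are illustrative assumptions, not objects from this paper; exhaustive enumeration is of course feasible only for tiny \(n\) and \(h\).

```python
from itertools import product

N, H = 2, 1  # input length n and witness length h of the toy circuit


def circuit(x, w):
    """Toy circuit C (hypothetical): outputs 1 iff x[1] = 1 and w[0] != x[0]."""
    return x[1] & (x[0] ^ w[0])


def relation(circ, n, h):
    """R_C: all pairs (x, w) in {0,1}^n x {0,1}^h with C(x, w) = 1."""
    return {(x, w)
            for x in product((0, 1), repeat=n)
            for w in product((0, 1), repeat=h)
            if circ(x, w) == 1}


def language(circ, n, h):
    """L_C: all inputs x for which at least one witness w exists."""
    return {x for (x, _) in relation(circ, n, h)}


R_C = relation(circuit, N, H)
L_C = language(circuit, N, H)
```

Here the language consists of exactly those inputs whose second bit is 1, and each such input has a unique witness.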

A preprocessing \(\text{ SNARK } \) for a uniform family of boolean circuits \(\mathcal {C}\) is defined analogously to a preprocessing \(\text{ SNARK } \) for the universal relation, with only small syntactic modifications. The (probabilistic) generator \(G\), on input the security parameter \(\lambda \) and an index \(\ell \) for the circuit \(C_{\ell } :\{0,1\}^{n(\ell )} \times \{0,1\}^{h(\ell )} \rightarrow \{0,1\}\), outputs a reference string \(\sigma \) and a corresponding verification state \(\tau \). (Both \(\tau \) and \(\sigma \) can be thought to include \(\lambda \) and \(\ell \).) Given \(w\), the honest prover \(P(\sigma ,x,w)\) produces a proof \(\pi \) attesting that \(x\in \mathcal {L}_{C_{\ell }}\); then, \(V(\tau ,x,\pi )\) verifies the validity of \(\pi \). As for efficiency, we require that there exists a universal polynomial p (independent of the family \(\mathcal {C}\)) such that for every large enough security parameter \(\lambda \in \mathbb {N}\), index \(\ell \in \mathbb {N}\), and input \(x\in \{0,1\}^{n(\ell )}\):

  • the generator \(G\) runs in time \(p(\lambda + |C_{\ell }|)\);

  • the prover \(P\) runs in time \(p(\lambda + |C_{\ell }|)\);

  • the verifier \(V\) runs in time \(p(\lambda + |x| + \log |C_{\ell }|)\);

  • an honestly generated proof has size \(p(\lambda + \log |C_{\ell }|)\).

We can also consider the case where \(\mathcal {C}\) is a non-uniform family, in which case \(G\) and \(P\) will get as additional input the circuit \(C_{\ell }\).

We show how to obtain a preprocessing \(\text{ SNARK } \) for \(\mathcal {R}_{\mathcal {U}}\) from preprocessing \(\text{ SNARK } \)s for uniform families of boolean circuits. To do so, we need to introduce the notion of a universal RAM simulator:

Definition 4.6

Let \(n\in \mathbb {N}\). We say that a boolean circuit family \(\mathcal {C}_{n} = \{ C_{T} :\{0,1\}^{n} \times \{0,1\}^{h(T)} \rightarrow \{0,1\}\}_{T}\) is a universal RAM simulator for \(n\)-bit instances if, for every \(y= (M,x,t)\) with \(|y| = n\), \(C_{T}(y,\cdot )\) is satisfiable if and only if \(y\in \mathcal {L}_{\mathcal {U}}\) and \(t\le T\). A witness map of \(\mathcal {C}_{n}\), denoted \(\mathsf {w}\), is a function such that, for every \(y= (M,x,t)\) with \(|y| = n\) and \(t\le T\), if \((y,w) \in \mathcal {R}_{\mathcal {U}}\) then \(C_{T}(y,\mathsf {w}(T,y,w)) = 1\). An inverse witness map of \(\mathcal {C}_{n}\), denoted \(\mathsf {w}^{-1}\), is a function such that, for every \(y= (M,x,t)\) with \(|y| = n\) and \(t\le T\), if \(C_{T}(y,w') = 1\) then \((y,\mathsf {w}^{-1}(T,y,w')) \in \mathcal {R}_{\mathcal {U}}\).

Construction 4.7

For every \(n\in \mathbb {N}\), given a preprocessing \(\text{ SNARK } \) \((G ,P,V)\) for a universal RAM simulator \(\mathcal {C}_{n}\) (for \(n\)-bit instances) with (efficient) witness map \(\mathsf {w}\) and inverse witness map \(\mathsf {w}^{-1}\), we construct a preprocessing \(\text{ SNARK } \) \((G_{n}' ,P_{n}',V_{n}')\) for those pairs \((y,w)\) in the universal relation with \(|y|=n\) as follows:

  • \(G_{n}'(1^{\lambda },T) := G(1^{\lambda },T)\);

  • \(P_{n}'(\sigma ,y,w) := P(\sigma ,y,\mathsf {w}(T,y,w))\);

  • \(V_{n}'(\tau ,y,\pi ) := V(\tau ,y,\pi )\).
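Construction 4.7 is plain function composition, which the following sketch spells out. Everything prefixed `toy_` is a placeholder assumption introduced only so that the wrapper can be executed: the inner system simply sends the circuit witness as the proof, so it is neither succinct nor a real \(\text{ SNARK } \), and the machine \(M\) is modeled as a Python callable. The sketch also assumes that \(\sigma \) records the time bound \(T\).

```python
def lift_to_universal(G, P, V, witness_map):
    """Wrap a circuit-based system (G, P, V) as in Construction 4.7."""
    def G_prime(lam, T):
        return G(lam, T)

    def P_prime(sigma, y, w):
        T = sigma["T"]  # assumption: sigma records the time bound T
        # Translate the witness for y into a witness for C_T(y, .) first.
        return P(sigma, y, witness_map(T, y, w))

    def V_prime(tau, y, pi):
        return V(tau, y, pi)

    return G_prime, P_prime, V_prime


# --- Placeholder inner system: NOT succinct and NOT a real SNARK. ---
def toy_witness_map(T, y, w):
    return ("trace", w)  # stands in for the circuit witness w'


def toy_circuit(T, y, w_prime):
    M, x, t = y
    tag, w = w_prime
    return tag == "trace" and t <= T and bool(M(x, w))


def toy_G(lam, T):
    return {"T": T}, {"T": T}  # (sigma, tau)


def toy_P(sigma, y, w_prime):
    return w_prime  # the "proof" is just the circuit witness


def toy_V(tau, y, pi):
    return toy_circuit(tau["T"], y, pi)
```

The only work that the wrapper adds on the prover side is the call to the witness map, which is why its running time matters for \(P_{n}'\).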

Claim 4.8

\((G_{n}' ,P_{n}',V_{n}')\) is a preprocessing \(\text{ SNARK } \) for pairs \((y,w)\) in the universal relation with \(|y|=n\).

Proof

The existence of \(\mathsf {w}^{-1}\) ensures that proof of knowledge is preserved. Concretely, a knowledge extractor \(E_{n}'\) for a prover convincing \(V_{n}'\) would first run a knowledge extractor for the same prover convincing \(V\) and then run \(\mathsf {w}^{-1}\) to obtain a suitable witness.

The efficiency of \(\mathcal {C}_{n}\), \(\mathsf {w}\), and \(\mathsf {w}^{-1}\) has direct implications for the efficiency of \((G_{n}' ,P_{n}',V_{n}')\). Concretely:

  • Let \(f(T) := |C_{T}|\). The growth rate of \(f(T)\) affects the efficiency of \(G_{n}'\) and \(P_{n}'\), because the efficiency of \(G\) and \(P\) depends on \(|C_{T}|\). So, for instance, if \(G\) and \(P\) run in time \(|C_{T}|^{2} \cdot {\mathrm {poly}}(\lambda )\) and \(f(T) = \Omega (T^{2})\), then \(G_{n}'\) and \(P_{n}'\) run in time \(\Omega (T^{4}) \cdot {\mathrm {poly}}(\lambda )\).

  • The running time of \(\mathsf {w}\) affects the running time of \(P_{n}'\). Indeed, \(P_{n}'\) must first transform the witness \(w\) for \(y\) into a witness \(w'\) for \(C_{T}(y,\cdot )\), and only then can it invoke \(P\). So, for instance, if \(f(T) = \tilde{O}(T)\) but \(\mathsf {w}\) runs in time \(\Omega (T^{3})\), then \(P_{n}'\) will run in time \(\Omega (T^{3})\).

  • The running time of \(\mathsf {w}^{-1}\) sometimes affects the running time of \(G_{n}'\), \(P_{n}'\), and \(V_{n}'\). Indeed, if the proof of knowledge property of \((G_{n}' ,P_{n}',V_{n}')\) is used in a security reduction (e.g., verifying the correctness of cryptographic computations), then the slower \(\mathsf {w}^{-1}\) is, the more expensive the security reduction becomes, and thus the larger the security parameter for \((G ,P,V)\) must be chosen. A larger security parameter affects the efficiency of all three algorithms.

We thus wish the growth rate of \(f(T)\) to be as small as possible, and that \(\mathsf {w}\) and \(\mathsf {w}^{-1}\) be as fast as possible. The reduction from RAM computations to circuits of Ben-Sasson et al. [27] implies that there is a universal RAM simulator where \(f(T) = \tilde{O}(T)\) and both \(\mathsf {w}\) and \(\mathsf {w}^{-1}\) run in sequential time \(\tilde{O}(T)\) (or in parallel time \(O((\log T)^{2})\)). \(\square \)

Next, we show how to remove the restriction on the instance size, by using collision-resistant hashing. (Indeed, \((G_{n}' ,P_{n}',V_{n}')\) only handles instances \(y\) with \(|y|=n\).) Let \(\mathcal {H}= \{\mathcal {H}_{\lambda }\}_{\lambda \in \mathbb {N}}\) be a collision-resistant hash-function family such that functions in \(\mathcal {H}_{\lambda }\) map \(\{0,1\}^{*}\) to \(\{0,1\}^{\lambda }\). For any \(h\in \mathcal {H}_{\lambda }\) and instance \(y = (M,x,t)\), define \(y_{h}\) to be the instance \((U_{h},h(y),{\mathrm {poly}}(\lambda )+O(t))\), where \(U_{h}\) is a universal random-access machine that, on input \((\tilde{x},\tilde{w})\), parses \(\tilde{w}\) as \(((M,x,t),w)\), verifies that \(\tilde{x} = h(M,x,t)\), and then runs \(M(x,w)\) for at most \(t\) steps. Because we can assume a uniform super-polynomial upper bound on \(t\), say \(t\le \lambda ^{\log \lambda }\), there is a constant \(c > 0\) for which we can assume that \(|y_{h}|=\lambda ^{c}\).

Construction 4.9

We construct a preprocessing \(\text{ SNARK } \) \((G'' ,P'',V'')\) for the universal relation as follows:

  • \(G''(1^{\lambda },T)\) outputs \((\tilde{\sigma },\tilde{\tau }) := \big ( (\sigma ,h),(\tau ,h) \big )\) where \((\sigma ,\tau ) \leftarrow G_{\lambda ^{c}}'(1^{\lambda },T)\) and \(h\leftarrow \mathcal {H}_{\lambda }\);

  • \(P''(\tilde{\sigma },y,w) := P_{\lambda ^{c}}'(\sigma ,y_{h},(y,w))\);

  • \(V''(\tilde{\tau },y,\pi ) := V_{\lambda ^{c}}'(\tau ,y_{h},\pi )\).
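The instance-hashing step behind Construction 4.9 can be sketched as follows. This is only an illustration under stated assumptions: salted SHA-256 stands in for a function \(h\) sampled from \(\mathcal {H}_{\lambda }\), the machine \(M\) is represented by a byte-string description, and the names `sample_hash`, `encode_instance`, `hashed_instance`, `u_h_check`, and the constant `overhead` (a placeholder for the \({\mathrm {poly}}(\lambda )+O(t)\) term) are hypothetical.

```python
import hashlib
import os


def sample_hash(key_bytes=32):
    """Sample a hash function h; salted SHA-256 is a stand-in assumption."""
    key = os.urandom(key_bytes)

    def h(data: bytes) -> bytes:
        return hashlib.sha256(key + data).digest()

    return h


def encode_instance(M_desc: bytes, x: bytes, t: int) -> bytes:
    """Length-prefixed encoding of y = (M, x, t), so parsing is unambiguous."""
    parts = [M_desc, x, t.to_bytes(8, "big")]
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def hashed_instance(h, M_desc, x, t, overhead=1000):
    """Build y_h: a fixed-size statement that commits to the whole of y."""
    digest = h(encode_instance(M_desc, x, t))
    return ("U_h", digest, overhead + t)


def u_h_check(h, digest, M_desc, x, t):
    """First step of U_h: check that the claimed (M, x, t) hashes to digest."""
    return digest == h(encode_instance(M_desc, x, t))
```

The digest has the same length no matter how long \(x\) is, which is what lets a system for fixed-size instances handle arbitrary ones; soundness then rests on the collision resistance of \(h\).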

Claim 4.10

\((G'' ,P'',V'')\) is a preprocessing \(\text{ SNARK } \) for the universal relation.

Proof

The required efficiency follows readily by construction. The proof of knowledge also follows directly: applying the extractor of the underlying system, we either obtain a valid witness for the statement \(y\), or a collision. The latter is ruled out by collision resistance. \(\square \)

In sum, asymptotically, we incur essentially no overhead if we focus on constructing preprocessing \(\text{ SNARK } \)s for uniform families of boolean circuits.

5 Linear-Only Encryption and Encodings

We introduce and discuss the two cryptographic tools used in this paper. First, in Sect. 5.1, we present linear-only encryption and then, in Sect. 5.2, linear-only one-way encodings. In Sect. 5.3, we discuss candidate instantiations for both. Later, in Sect. 6, we describe how to use these tools in our transformations from \(\text{ LIP } \)s to \(\text{ SNARK } \)s (or \(\text{ SNARG } \)s).

5.1 Linear-Only Encryption

At a high level, a linear-only encryption scheme is a semantically-secure encryption scheme that supports linear homomorphic operations, but does not allow any other form of homomorphism.

We first introduce the syntax and correctness properties of linear-only encryption; then its (standard) semantic-security property; and finally its linear-only property. In fact, we consider two formalizations of the linear-only property (a stronger one and a weaker one).

Syntax and correctness. A linear-only encryption scheme is a tuple of algorithms \(({\mathsf {Gen}} ,{\mathsf {Enc}},{\mathsf {Dec}},{\mathsf {Add}},{\mathsf {ImVer}})\) with the following syntax and correctness properties:

  • Given a security parameter \(\lambda \) (presented in unary), \({\mathsf {Gen}}\) generates a secret key \({\mathsf {sk}}\) and a public key \({\mathsf {pk}}\). The public key \({\mathsf {pk}}\) also includes a description of a field \(\mathbb {F}\) representing the plaintext space.

  • \({\mathsf {Enc}}\) and \({\mathsf {Dec}}\) are (randomized) encryption and (deterministic) decryption algorithms working in the usual way.

  • \({\mathsf {Add}}({\mathsf {pk}},c_{1},\dots ,c_{m},\alpha _{1},\dots ,\alpha _{m})\) is a homomorphic evaluation algorithm for linear combinations. Namely, given a public key \({\mathsf {pk}}\), ciphertexts \(\left\{ c_{i} \in {\mathsf {Enc}}_{{\mathsf {pk}}}(a_{i})\right\} _{i\in [m]}\), and field elements \(\left\{ \alpha _{i}\right\} _{i\in [m]}\), \({\mathsf {Add}}\) computes an evaluated ciphertext \({\hat{c}}\in {\mathsf {Enc}}_{{\mathsf {pk}}}(\sum _{i\in [m]}\alpha _{i}a_{i})\).

  • \({\mathsf {ImVer}}_{{\mathsf {sk}}}(c')\) tests, using the secret key \({\mathsf {sk}}\), whether a given candidate ciphertext \(c'\) is in the image of \({\mathsf {Enc}}_{{\mathsf {pk}}}\).
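The syntax of \({\mathsf {Gen}}\), \({\mathsf {Enc}}\), \({\mathsf {Dec}}\), and \({\mathsf {Add}}\) can be illustrated with a toy version of Paillier encryption. This is a sketch of the syntax and of the correctness of \({\mathsf {Add}}\) only, under loud assumptions: the primes are tiny and fixed (so the scheme is insecure), the plaintext space is the ring \(\mathbb {Z}_{n}\) rather than a field (see Remark 5.1), and \({\mathsf {ImVer}}\) is omitted because every element of \(\mathbb {Z}_{n^{2}}^{*}\) is a valid Paillier ciphertext. As Remark 5.6 points out, standard Paillier allows oblivious sampling of ciphertexts, so this toy is not a candidate for the linear-only property of Definition 5.4.

```python
import math
import random


def gen(p=1009, q=1013):
    """Toy key generation with tiny fixed primes (insecure, illustration only)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return (lam, mu, n), n  # (sk, pk)


def enc(pk, a, rng=random):
    """Enc_pk(a) = (1 + n)^a * r^n mod n^2 for a random unit r."""
    n = pk
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, a % n, n2) * pow(r, n, n2)) % n2


def dec(sk, c):
    """Dec_sk(c) = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) / n."""
    lam, mu, n = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(pk, cs, alphas):
    """Add: the product of c_i^alpha_i encrypts sum_i alpha_i * a_i mod n."""
    n = pk
    n2 = n * n
    out = 1
    for c, alpha in zip(cs, alphas):
        out = (out * pow(c, alpha % n, n2)) % n2
    return out
```

For example, after `sk, pk = gen()` and `cs = [enc(pk, a) for a in (3, 5, 10)]`, the call `dec(sk, add(pk, cs, [2, 7, 1]))` recovers \(2\cdot 3 + 7\cdot 5 + 10 = 51\).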

Remark 5.1

Because in most of this paper we restrict attention to \(\text{ LPCP } \)s and \(\text{ LIP } \)s over fields, we present linear-only encryption schemes for the case where plaintexts belong to a field. The definition naturally extends to the case where plaintexts belong to a ring. Typically, we are interested in the ring \(\mathbb {Z}_{N}\) for either the case where N is a prime \(\mathfrak {p}\) (in which case the ring \(\mathbb {Z}_{\mathfrak {p}}\) is isomorphic to the field \(\mathbb {F}_{\mathfrak {p}}\)) or where N is the product of two primes. (See corresponding Remark (2.6) in the \(\text{ LIP } \) definition.)

Remark 5.2

A symmetric-key variant of linear-only encryption can be easily defined. While a private-key linearly homomorphic encryption scheme ultimately implies a public-key one [92], using a private-key scheme could, in principle, have efficiency benefits.

Remark 5.3

The linear homomorphism property can be relaxed to allow for cases where the evaluated ciphertext \({\hat{c}}\) is not necessarily in the image of \({\mathsf {Enc}}_{{\mathsf {pk}}}\), but only decrypts to the correct plaintext; in particular, it may not be possible to perform further homomorphic operations on such a ciphertext.

Semantic security. Semantic security of linear-only encryption is defined as usual. Namely, for any polynomial-size adversary \(A\) and large enough security parameter \(\lambda \in \mathbb {N}\):

$$\begin{aligned} \mathrm{Pr}\left[ b' = b \left| \begin{array}{r} ({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ (a_{0},a_{1}) \leftarrow A({\mathsf {pk}}) \\ b \leftarrow \{0,1\}\\ b' \leftarrow A\big ({\mathsf {pk}},{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{b})\big ) \\ \end{array} \right] \le \frac{1}{2} + {\mathsf {negl}}(\lambda ) \right. . \end{aligned}$$

Linear-only homomorphism. The linear-only (homomorphism) property essentially says that, given a public key \({\mathsf {pk}}\) and ciphertexts \((c_{1} ,\dots ,c_{m})\), it is infeasible to compute a new ciphertext \(c'\) in the image of \({\mathsf {Enc}}_{{\mathsf {pk}}}\), except by evaluating an affine combination of the ciphertexts \((c_{1} ,\dots ,c_{m})\). (Affinity accounts for adversaries encrypting plaintexts from scratch and then adding them to linear combinations of the \(c_{i}\).) Formally, the property is captured by guaranteeing that, whenever the adversary produces a valid ciphertext, it is possible to efficiently extract a corresponding affine function “explaining” the ciphertext.

Definition 5.4

An encryption scheme has the linear-only (homomorphism) property if for any polynomial-size adversary \(A\) there is a polynomial-size extractor \(E\) such that, for any sufficiently large \(\lambda \in \mathbb {N}\), any auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\), and any plaintext generator \(\mathcal {M}\),

$$\begin{aligned} {\small \mathrm{Pr}\left[ \begin{array}{c} \exists \, i \in [k] \text { s.t. } \\ {\mathsf {ImVer}}_{{\mathsf {sk}}}(c_{i}')=1 \\ \text {and} \\ {\mathsf {Dec}}_{{\mathsf {sk}}}(c_{i}')\ne a_{i}' \end{array} \left| \begin{array}{r} ({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ (a_{1},\dots ,a_{m}) \leftarrow \mathcal {M}({\mathsf {pk}}) \\ (c_{1} ,\dots ,c_{m}) \leftarrow ({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m})) \\ (c_{1}',\dots ,c_{k}') \leftarrow A({\mathsf {pk}},c_{1},\dots ,c_{m};z) \\ (\Pi ,\varvec{b}) \leftarrow E({\mathsf {pk}},c_{1},\dots ,c_{m};z) \\ (a_{1}',\dots ,a_{k}')^{\top } \leftarrow \Pi \cdot (a_{1},\dots ,a_{m})^{\top } + \varvec{b}\end{array} \right] \le {\mathsf {negl}}(\lambda )\right. } , \end{aligned}$$

where \(\Pi \in \mathbb {F}^{k\times m}\) and \(\varvec{b}\in \mathbb {F}^{k}\).
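The event bounded in Definition 5.4 can be checked mechanically once the plaintexts, the extractor's output \((\Pi ,\varvec{b})\), and the decryptions of the adversary's ciphertexts are known. The sketch below does this over a small prime field; the function names `affine_explanation` and `extractor_fails`, the modulus, and the sample values are illustrative assumptions. No encryption is performed: the inputs `valid` and `decrypted` stand for \({\mathsf {ImVer}}_{{\mathsf {sk}}}(c_{i}')\) and \({\mathsf {Dec}}_{{\mathsf {sk}}}(c_{i}')\).

```python
def affine_explanation(Pi, b, a, p):
    """Compute (a'_1, ..., a'_k)^T = Pi * (a_1, ..., a_m)^T + b over F_p."""
    return [(sum(row[j] * a[j] for j in range(len(a))) + b_i) % p
            for row, b_i in zip(Pi, b)]


def extractor_fails(valid, decrypted, explained):
    """Bad event: some valid c'_i decrypts to a value different from a'_i."""
    return any(v and d != e for v, d, e in zip(valid, decrypted, explained))


p = 101
a = [3, 5]  # plaintexts behind c_1, c_2

# Suppose c'_1 = 2*c_1 + c_2 + Enc(7), and c'_2 is an invalid string.
valid = [True, False]
decrypted = [(2 * 3 + 5 + 7) % p, None]

good = affine_explanation([[2, 1], [0, 0]], [7, 0], a, p)
bad = affine_explanation([[1, 1], [0, 0]], [7, 0], a, p)
```

Invalid ciphertexts place no constraint on the extractor, which is why the second row of \(\Pi \) may be arbitrary in this example.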

Remark 5.5

(On the auxiliary input \(z\)) In Definition 5.4, the polynomial-size extractor is required to succeed for any (adversarial) auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\). This requirement seems rather strong considering the fact that \(z\) could potentially encode arbitrary circuits. For example, \(z\) could encode a circuit that, given as input public key \({\mathsf {pk}}\), outputs \({\mathsf {Enc}}_{{\mathsf {pk}}}(x)\) where \(x=f_{s}({\mathsf {pk}})\) and \(f_{s}\) is some hardwired pseudorandom function. In this case, the extractor would be required to (efficiently) reverse engineer the circuit, which seems to be a rather strong requirement (or even an impossible one, under certain obfuscation assumptions).

While for presentational purposes Definition 5.4 is simple and convenient, it can be relaxed to only consider specific “benign” auxiliary-input distributions. Indeed, in our application, it will be sufficient to only consider a truly-random auxiliary input \(z\). (Requiring less than that seems to be not expressive enough, because we would at least like to allow the adversary to toss random coins.)

An analogous remark holds for both Definitions 5.8 and 5.17.

Remark 5.6

(Oblivious ciphertext sampling) Definition 5.4 has a similar flavor to plaintext awareness. In fact, an encryption scheme cannot satisfy the definition if it allows for “oblivious sampling” of ciphertexts. (For instance, both standard ElGamal and Paillier encryption do.) Thus, the set of strings \(c\) that are valid (i.e., for which \({\mathsf {ImVer}}_{{\mathsf {sk}}}(c)=1\)) must be “sparse.” Later on, we define a weaker notion of linear-only encryption that does not have this restriction.

Remark 5.7

In order for Definition 5.4 to be non-trivial, the extractor \(E\) has to be efficient (for otherwise it could run the adversary \(A\), obtain \(A\)’s outputs, decrypt them, and then output a zero linear function and hard-code the correct values in the constant term). As for the equivalent formulation in Remark (5.11), for similar reasons the simulator \(S\) has to be efficient; additionally, requiring statistical indistinguishability instead of computational indistinguishability does not strengthen the assumption.

Linear targeted malleability. We also consider a weaker variant of the linear-only property, which we call linear targeted malleability. (Indeed, the definition follows the lines of the notion of targeted malleability proposed by Boneh et al. [34], when restricted to the class of linear, or affine, functions.)

Definition 5.8

An encryption scheme has the linear targeted malleability property if for any polynomial-size adversary \(A\) and plaintext generator \(\mathcal {M}\) there is a polynomial-size simulator \(S\) such that, for any sufficiently large \(\lambda \in \mathbb {N}\), and any auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\), the following two distributions are computationally indistinguishable:

$$\begin{aligned} \left\{ \begin{array}{l} {\mathsf {pk}}, \\ a_{1},\dots ,a_{m}, \\ s, \\ {\mathsf {Dec}}_{{\mathsf {sk}}}(c_{1}'),\dots ,{\mathsf {Dec}}_{{\mathsf {sk}}}(c_{k}') \end{array} \left| \begin{array}{r} ({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ (s,a_{1},\dots ,a_{m}) \leftarrow \mathcal {M}({\mathsf {pk}}) \\ (c_{1} ,\dots ,c_{m}) \leftarrow ({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m})) \\ (c_{1}',\dots ,c_{k}') \leftarrow A({\mathsf {pk}},c_{1},\dots ,c_{m};z) \\ \text {where} \\ {\mathsf {ImVer}}_{{\mathsf {sk}}}(c_{1}')=1,\dots ,{\mathsf {ImVer}}_{{\mathsf {sk}}}(c_{k}')=1 \end{array} \right\} \right. , \end{aligned}$$

and

$$\begin{aligned} \left\{ \begin{array}{l} {\mathsf {pk}}, \\ a_{1},\dots ,a_{m}, \\ s, \\ a_{1}',\dots ,a_{k}' \end{array} \left| \begin{array}{r} ({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ (s,a_{1},\dots ,a_{m}) \leftarrow \mathcal {M}({\mathsf {pk}}) \\ (\Pi ,\varvec{b}) \leftarrow S({\mathsf {pk}};z) \\ (a_{1}',\dots ,a_{k}')^{\top } \leftarrow \Pi \cdot (a_{1},\dots ,a_{m})^{\top } + \varvec{b}\end{array} \right\} \right. \end{aligned}$$

where \(\Pi \in \mathbb {F}^{k\times m}\), \(\varvec{b}\in \mathbb {F}^{k}\), and \(s\) is some arbitrary string (possibly correlated with the plaintexts).

Remark 5.9

Definition 5.8 can be further relaxed to allow the simulator to be inefficient. Doing so does not let us prove knowledge but still enables us to prove soundness (i.e., obtain a \(\text{ SNARG } \) instead of a \(\text{ SNARK } \)). See Remark (6.4) in Sect. 6.1.

As mentioned above, Definition 5.8 is weaker than Definition 5.4, as shown by the following lemma.

Lemma 5.10

If a semantically-secure encryption scheme has the linear-only property (Definition 5.4), then it has the linear targeted malleability property (Definition 5.8).

Proof sketch

Let \(E\) be the (polynomial-size) extractor of a given polynomial-size adversary \(A\). We use \(E\) to construct a (polynomial-size) simulator \(S\) for \(A\). The simulator \(S\) simply runs \(E\) on fake ciphertexts:

  • \(S({\mathsf {pk}};z) \equiv \)

    1. \((c_{1} ,\dots ,c_{m}) \leftarrow ({\mathsf {Enc}}_{{\mathsf {pk}}}(0) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(0))\);

    2. \((y,c_{1}',\dots ,c_{k}') \leftarrow A({\mathsf {pk}},c_{1},\dots ,c_{m};z)\);

    3. \((\Pi ,\varvec{b}) \leftarrow E({\mathsf {pk}},c_{1},\dots ,c_{m};z)\);

    4. output \((y,\Pi ,\varvec{b})\).

By invoking semantic security and the extraction guarantee of \(E\), we can show that \(S\) works. The proof follows by a standard hybrid argument. First we consider an experiment where \(S\) gives \(A\) and \(E\) an encryption of \({\mathbf {a}} \leftarrow \mathcal {M}({\mathsf {pk}})\), rather than an encryption of zeros, and argue computational indistinguishability by semantic security. Then we can show that the output in this hybrid experiment is statistically close to that in the real experiment by invoking the extraction guarantee. \(\square \)

A converse to Lemma 5.10 appears unlikely, because Definition 5.8 seems to allow for encryption schemes where ciphertexts can be obliviously sampled while Definition 5.4 does not.

Remark 5.11

(Alternative formulation) To better compare Definition 5.8 with Definition 5.4, we now give an equivalent formulation of Definition 5.4. For any polynomial-size adversary \(A\) there is a polynomial-size simulator \(S\) such that, for any sufficiently large \(\lambda \in \mathbb {N}\), any auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\), and any plaintext generator \(\mathcal {M}\), the following two distributions are computationally indistinguishable:

$$\begin{aligned} \left\{ \begin{array}{l} {\mathsf {pk}}, \\ a_{1},\dots ,a_{m}, \\ c_{1},\dots ,c_{m}, \\ {\mathsf {out}}_{{\mathsf {sk}}}(c_{1}'),\dots ,{\mathsf {out}}_{{\mathsf {sk}}}(c_{k}'), \\ z\end{array} \left| \begin{array}{r} ({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ (a_{1},\dots ,a_{m}) \leftarrow \mathcal {M}({\mathsf {pk}}) \\ (c_{1} ,\dots ,c_{m}) \leftarrow ({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m})) \\ (c_{1}',\dots ,c_{k}') \leftarrow A({\mathsf {pk}},c_{1},\dots ,c_{m};z) \\ \end{array} \right\} \right. \end{aligned}$$

where \({\mathsf {out}}_{{\mathsf {sk}}}(c') := {\left\{ \begin{array}{ll} {\mathsf {Dec}}_{{\mathsf {sk}}}(c') &{} \text {if } {\mathsf {ImVer}}_{{\mathsf {sk}}}(c')=1 \\ \bot &{} \text {if } {\mathsf {ImVer}}_{{\mathsf {sk}}}(c')=0 \end{array}\right. }\;\), and

$$\begin{aligned} \left\{ \begin{array}{l} {\mathsf {pk}}, \\ a_{1},\dots ,a_{m}, \\ c_{1},\dots ,c_{m}, \\ a_{1}',\dots ,a_{k}', \\ z\end{array} \left| \begin{array}{r} ({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ (a_{1},\dots ,a_{m}) \leftarrow \mathcal {M}({\mathsf {pk}}) \\ (c_{1} ,\dots ,c_{m}) \leftarrow ({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m})) \\ (\Pi ,\varvec{b}) \leftarrow S({\mathsf {pk}},c_{1},\dots ,c_{m};z)\\ (a_{1}',\dots ,a_{k}')^{\top } = \Pi \cdot (a_{1},\dots ,a_{m})^{\top } + \varvec{b}\end{array} \right\} \right. \end{aligned}$$

where \(\Pi \in \mathbb {F}^{k\times m}\) and \(\varvec{b}\in \mathbb {F}^{k}\), with the convention that if the i-th row of \(\Pi \) is left empty then \(a_{i}' := \bot \).

5.2 Linear-Only One-Way Encoding

Unlike linear-only encryption schemes, linear-only encoding schemes allow one to publicly test for certain properties of the underlying plaintexts without decryption (which is now allowed to be inefficient). In particular, linear-only encoding schemes cannot satisfy semantic security. Instead, we require that they only satisfy a certain one-wayness property.

We now define the syntax and correctness properties of linear-only encoding schemes, their one-wayness property, and their linear-only property.

Syntax and correctness. A linear-only encoding scheme is a tuple of algorithms \(({\mathsf {Gen}} ,{\mathsf {Enc}},{\mathsf {SEnc}},{\mathsf {Test}},{\mathsf {Add}},{\mathsf {ImVer}})\) with the following syntax and correctness properties:

  • Given a security parameter \(\lambda \) (presented in unary), \({\mathsf {Gen}}\) generates a public key \({\mathsf {pk}}\). The public key \({\mathsf {pk}}\) also includes a description of a field \(\mathbb {F}\) representing the plaintext space.

  • Encoding can be performed in two modes: \({\mathsf {Enc}}_{{\mathsf {pk}}}\) is an encoding algorithm that works in linear-only mode, and \({\mathsf {SEnc}}_{{\mathsf {pk}}}\) is a deterministic encoding algorithm that works in standard mode.

  • As in linear-only encryption, \({\mathsf {Add}}({\mathsf {pk}},c_{1},\dots ,c_{m},\alpha _{1},\dots ,\alpha _{m})\) is a homomorphic evaluation algorithm for linear combinations. Namely, given a public key \({\mathsf {pk}}\), encodings \(\left\{ c_{i} \in {\mathsf {Enc}}_{{\mathsf {pk}}}(a_{i})\right\} _{i\in [m]}\), and field elements \(\left\{ \alpha _{i}\right\} _{i\in [m]}\), \({\mathsf {Add}}\) computes an evaluated encoding \({\hat{c}}\in {\mathsf {Enc}}_{{\mathsf {pk}}}(\sum _{i\in [m]}\alpha _{i}a_{i})\). Also, \({\mathsf {Add}}\) works in the same way for any vector of standard-mode encodings \(\left\{ c_{i} \in {\mathsf {SEnc}}_{{\mathsf {pk}}}(a_{i})\right\} _{i\in [m]}\).

  • \({\mathsf {ImVer}}_{{\mathsf {pk}}}(c')\) tests whether a given candidate encoding \(c'\) is in the image of \({\mathsf {Enc}}_{{\mathsf {pk}}}\) (i.e., in the image of the encoding in linear-only mode).

  • \({\mathsf {Test}}\big ({\mathsf {pk}},\varvec{t},{{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1})},\dots ,{{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m})},{{\mathsf {SEnc}}_{{\mathsf {pk}}}(\tilde{a}_{1})},\dots ,{{\mathsf {SEnc}}_{{\mathsf {pk}}}(\tilde{a}_{\tilde{m}})}\big )\) is a public test for zeros of \(\varvec{t}\). Namely, given a public key \({\mathsf {pk}}\), a test polynomial \(\varvec{t}:\mathbb {F}^{m+\tilde{m}} \rightarrow \mathbb {F}^\eta \), encodings \({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{i})\), and standard-mode encodings \({\mathsf {SEnc}}_{{\mathsf {pk}}}(\tilde{a}_{i})\), \({\mathsf {Test}}\) tests whether \(\varvec{t}(a_{1},\dots ,a_{m},\tilde{a}_{1},\dots ,\tilde{a}_{\tilde{m}})=0^\eta \).

Remark 5.12

(Degrees supported by \({\mathsf {Test}}\)) In this work, we restrict our attention to the case in which \({\mathsf {Test}}\) only takes as input test polynomials \(\varvec{t}\) of at most quadratic degree. This restriction comes from the fact that, at present, the only candidates for linear-only one-way encoding schemes that we know of are based on bilinear maps, which only let us support testing of quadratic degrees. (See Sect. 5.3.) This restriction propagates to the transformation from algebraic \(\text{ LIP } \)s discussed in Sect. 6.2, where we must require that the degree of the \(\text{ LIP } \) query algorithm is at most quadratic. Nonetheless our transformation holds more generally (for query algorithms of \({\mathrm {poly}}(\lambda )\) degree), when given linear-only one-way encoding schemes that support tests of the appropriate degree.

\(\Delta \)-power one-wayness. In our main application of transforming algebraic \(\text{ LIP } \)s into publicly-verifiable preprocessing \(\text{ SNARK } \)s (see Sect. 6.2), linear-only encoding schemes are used to (linearly) manipulate polynomial evaluations over \(\mathbb {F}\). The notion of one-wayness that we require is that, given polynomially-many encodings of low-degree polynomials evaluated at a random point \(s\), it is hard to find \(s\).

Definition 5.13

A linear-only encoding scheme satisfies \(\Delta \)-power one-wayness if for every polynomial-size \(A\) and all large enough security parameters \(\lambda \in \mathbb {N}\),

$$\begin{aligned} \mathrm{Pr}\left[ s^{*} = s\left| \begin{array}{r} {\mathsf {pk}}\leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ s\leftarrow \mathbb {F}\\ (c_{1} ,\dots ,c_{\Delta }) \leftarrow \big ({\mathsf {Enc}}_{{\mathsf {pk}}}(s),\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(s^{\Delta })\big ) \\ (\tilde{c}_{1} ,\dots ,\tilde{c}_{\Delta }) \leftarrow \big ({\mathsf {SEnc}}_{{\mathsf {pk}}}(s),\dots ,{\mathsf {SEnc}}_{{\mathsf {pk}}}(s^{\Delta })\big ) \\ s^{*} \leftarrow A\big ({\mathsf {pk}},c_{1},\dots ,c_{\Delta },\tilde{c}_{1},\ldots ,\tilde{c}_{\Delta }\big ) \end{array} \right] \right. \le {\mathsf {negl}}(\lambda ) . \end{aligned}$$

Our constructions of preprocessing \(\text{ SNARK } \)s from \(\text{ LIP } \)s also involve manipulations of multivariate polynomials. Thus, we are in fact interested in requiring a more general property of multivariate \(\Delta \)-power one-wayness.

Definition 5.14

A linear-only encoding scheme satisfies multivariate \(\Delta \)-power one-wayness if, for every polynomial-size \(A\), large enough security parameter \(\lambda \in \mathbb {N}\), and \(\mu \)-variate polynomials \((p_{1} ,\dots ,p_{\ell })\) of total degree at most \(\Delta \):

$$\begin{aligned} \mathrm{Pr}\left[ \begin{array}{c} p^{*}\not \equiv 0\\ \mathrm {and} \\ p^{*}(\varvec{s})= 0 \end{array} \left| \begin{array}{r} {\mathsf {pk}}\leftarrow {\mathsf {Gen}}(1^{\lambda })\\ \varvec{s}\leftarrow \mathbb {F}^\mu \\ (c_{1} ,\dots ,c_{\ell }) \leftarrow \big ({\mathsf {Enc}}_{{\mathsf {pk}}}(p_{1}(\varvec{s})),\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(p_{\ell }(\varvec{s}))\big ) \\ (\tilde{c}_{1} ,\dots ,\tilde{c}_{\ell }) \leftarrow \big ({\mathsf {SEnc}}_{{\mathsf {pk}}}(p_{1}(\varvec{s})),\dots ,{\mathsf {SEnc}}_{{\mathsf {pk}}}(p_{\ell }(\varvec{s}))\big ) \\ p^{*}\leftarrow A\big ({\mathsf {pk}},c_{1},\dots ,c_{\ell },\tilde{c}_{1},\ldots ,\tilde{c}_{\ell }\big ) \end{array} \right] \le {\mathsf {negl}}(\lambda )\right. , \end{aligned}$$

where \(\Delta ,\ell ,\mu \) are all \({\mathrm {poly}}(\lambda )\), and \(p^{*}\) is a \(\mu \)-variate polynomial.

For the case of univariate polynomials (i.e., \(\mu =1\)), it is immediate that Definition 5.14 is equivalent to Definition 5.13; this follows directly from the fact that univariate polynomials over finite fields can be efficiently factored into their roots [15, 24, 42, 101]. We show that the two definitions are equivalent also for any \(\mu ={\mathrm {poly}}(\lambda )\), provided that the encoding scheme is rerandomizable; indeed, in the instantiation discussed in this paper the encoding is deterministic and, in particular, rerandomizable.

Proposition 5.15

If \({\mathsf {Enc}},{\mathsf {SEnc}}\) are rerandomizable (in particular, if deterministic), then (univariate) \(\Delta \)-power one-wayness implies (multivariate) \(\Delta \)-power one-wayness for any \(\mu ={\mathrm {poly}}(\lambda )\).

Proof

Assume that \(A\) violates the \(\mu \)-variate \(\Delta \)-power one-wayness with probability \(\varepsilon \) for a vector of polynomials \((p_{1} ,\dots ,p_{\ell })\). We use \(A\) to construct a new adversary \(A'\) that breaks (univariate) \(\Delta \)-power one-wayness with probability at least \(\varepsilon /(\mu \Delta )\).

Given input \( \big ({\mathsf {pk}},{\mathsf {Enc}}_{{\mathsf {pk}}}(s),\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(s^{\Delta }),{\mathsf {SEnc}}_{{\mathsf {pk}}}(s),\dots ,{\mathsf {SEnc}}_{{\mathsf {pk}}}(s^{\Delta })\big )\), \(A'\) first samples \(i\in [\mu ]\) and \(s_{1},\dots ,s_{i-1},s_{i+1},\dots ,s_{\mu } \in \mathbb {F}\) at random. Then, thinking of \(s\) as \(s_{i}\) and \(\varvec{s}\) as \((s_{1} ,\dots ,s_{\mu })\), \(A'\) uses the linear homomorphism and rerandomization to sample \(\big ({\mathsf {Enc}}_{{\mathsf {pk}}}(p_{1}(\varvec{s})),\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(p_{\ell }(\varvec{s})),{\mathsf {SEnc}}_{{\mathsf {pk}}}(p_{1}(\varvec{s})),\dots ,{\mathsf {SEnc}}_{{\mathsf {pk}}}(p_{\ell }(\varvec{s}))\big )\) and feeds these to \(A\), who in turn outputs a polynomial \(p^{*}\). Next, \(A'\) does the following:

  1. Let \(p^{*}_{1}=p^{*}\), and set \(j := 1\).

  2. While \(j<i\) and \(p^{*}_{j}(x_{j},x_{j+1},\dots ,x_{\mu })\not \equiv 0\):

    (a) Decompose \(p^{*}_{j}\) according to the \(x_{j}\)-monomials: \(p^{*}_{j}(x_{j},\dots ,x_{\mu }) = \sum _{k=0}^\Delta {x_{j}^kp^{*}_{j+1,k}(x_{j+1},\dots ,x_{\mu })}\).

    (b) Set \(p^{*}_{j+1}\) to be the non-zero polynomial \(p^{*}_{j+1,k}\) with minimal \(k\).

    (c) Set \(j := j+1\).

  3. After computing \(p^{*}_{i}\), restrict the \(\mu -i\) last variables to \(\varvec{s}\), i.e., compute the \(x_{i}\)-univariate polynomial \(p^{*}_{i}(x_{i},s_{i+1},\dots ,s_{\mu })\), and factor it to find at most \(\Delta \) roots; finally, output one of the roots at random as a guess for \(s=s_{i}\).

To analyze the success probability of \(A'\), we rely on the following claim:

Claim 5.16

If \(p^{*}\not \equiv 0\) and \(p^{*}(s_{1},\dots ,s_{\mu })=0\), then there exists \(i\in [\mu ]\) such that:

$$\begin{aligned}&p^{*}_{i}(x,s_{i+1},\dots ,s_{\mu }) \not \equiv 0\\&p^{*}_{i}(s_{i},s_{i+1},\dots ,s_{\mu }) = 0. \end{aligned}$$

Proof of Claim 5.16

The proof is by induction on \(i\). The base case is when \(i=1\), for which it holds that:

$$\begin{aligned}&p^{*}_{1}(s_{1},\dots ,s_{\mu })=0\\&p^{*}_{1}(x_{1},\dots ,x_{\mu })\not \equiv 0 . \end{aligned}$$

For any \(i\) with \(1\le i< \mu \), suppose that:

$$\begin{aligned}&p^{*}_{i}(s_{i},\dots ,s_{\mu })=0\\&p^{*}_{i}(x_{i},\dots ,x_{\mu })\not \equiv 0\\&p^{*}_{i}(x_{i},s_{i+1},\dots ,s_{\mu })\equiv 0; \end{aligned}$$

then, by the construction of \(p^{*}_{i+1}\) from \(p^{*}_{i}\),

$$\begin{aligned}&p^{*}_{i+1}(s_{i+1},\dots ,s_{\mu })=0\\&p^{*}_{i+1}(x_{i+1},\dots ,x_{\mu })\not \equiv 0. \end{aligned}$$

If this inductive process reaches \(p^{*}_{\mu }\), then it holds that:

$$\begin{aligned}&p^{*}_{\mu }(s_{\mu })=0\\&p^{*}_{\mu }(x_{\mu })\not \equiv 0, \end{aligned}$$

which already satisfies the claim. \(\square \)

Note that \(A'\) guesses the \(i\) guaranteed by Claim 5.16 with probability \(1/\mu \), and hence, with the same probability, finds a non-trivial polynomial that vanishes at the challenge point \(s=s_{i}\); in such a case, \(A'\) thus guesses \(s\) correctly, from among at most \(\Delta \) roots, with probability at least \(1/\Delta \). The overall probability of success of \(A'\) is at least \(\varepsilon /(\mu \Delta )\), and this concludes the proof of Proposition 5.15.
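The post-processing that \(A'\) applies to \(p^{*}\) (Steps 1 to 3 above) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the field is a toy prime field \(\mathbb {F}_{101}\), polynomials are stored as dictionaries mapping exponent tuples to coefficients, roots are found by exhaustive search rather than by polynomial factoring, variables are indexed from 0, and the names `peel`, `restrict_to_univariate`, and `candidate_roots` are hypothetical.

```python
P = 101  # toy prime field size (assumption; the paper's field is much larger)

def evaluate(poly, point):
    """Evaluate a polynomial {exponent_tuple: coeff} at a point of F_P^mu."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(point, exps):
            term = term * pow(x, e, P) % P
        total = (total + term) % P
    return total

def peel(poly, j):
    """Step 2: group terms by the power k of variable j and return the
    non-zero coefficient polynomial with minimal k (variable j removed)."""
    by_power = {}
    for exps, coeff in poly.items():
        if coeff % P == 0:
            continue
        rest = exps[:j] + (0,) + exps[j + 1:]
        by_power.setdefault(exps[j], {})[rest] = coeff % P
    return by_power[min(by_power)]

def restrict_to_univariate(poly, i, s):
    """Step 3: substitute s[i+1:], leaving {power of x_i: coefficient}."""
    coeffs = {}
    for exps, coeff in poly.items():
        term = coeff
        for idx in range(i + 1, len(exps)):
            term = term * pow(s[idx], exps[idx], P) % P
        coeffs[exps[i]] = (coeffs.get(exps[i], 0) + term) % P
    return coeffs

def candidate_roots(p_star, i, s):
    """Candidates for s[i]; only the coordinates s[i+1:] are read."""
    poly = p_star
    for j in range(i):  # peel off variables 0, ..., i-1
        poly = peel(poly, j)
    coeffs = restrict_to_univariate(poly, i, s)
    if all(c == 0 for c in coeffs.values()):
        return []  # restriction is identically zero: index i was a bad guess
    # Exhaustive search stands in for efficient univariate factoring.
    return [x for x in range(P)
            if sum(c * pow(x, k, P) for k, c in coeffs.items()) % P == 0]
```

For instance, with \(p^{*}(x_{1},x_{2}) = x_{1}(x_{2}-7)\) over \(\mathbb {F}_{101}\) and \(\varvec{s}=(5,7)\), guessing the first variable yields an identically-zero restriction and hence no candidate, whereas guessing the second variable yields the single candidate 7. This matches Claim 5.16: some index always works, and the reduction loses a factor of \(\mu \) by guessing it.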

Linear-only homomorphism. The linear-only property of linear-only one-way encoding schemes is defined analogously to the case of linear-only encryption. Essentially, it says that, given the public key \({\mathsf {pk}}\), encodings in linear-only mode \(({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m}))\), and possibly additional encodings in standard mode \(({\mathsf {SEnc}}_{{\mathsf {pk}}}(\tilde{a}_{1}) ,\dots ,{\mathsf {SEnc}}_{{\mathsf {pk}}}(\tilde{a}_{\tilde{m}}))\), it is infeasible to compute a new encoding \(c'\) in the image of \({\mathsf {Enc}}_{{\mathsf {pk}}}\), except by evaluating an affine combination of the encodings \(({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m}))\); in particular, “standard mode” encodings in the image of \({\mathsf {SEnc}}_{{\mathsf {pk}}}\) cannot be “moved into” the image of \({\mathsf {Enc}}_{{\mathsf {pk}}}\). Formally, the property is captured by guaranteeing that, whenever the prover produces a valid new encoding, it is possible to efficiently extract the corresponding affine combination.

Definition 5.17

A linear encoding scheme has the linear-only (homomorphism) property if for any polynomial-size adversary \(A\) there is a polynomial-size extractor \(E\) such that for any sufficiently large \(\lambda \in \mathbb {N}\), any auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\), and any plaintext generator \(\mathcal {M}\):

$$\begin{aligned}&\mathrm{Pr}\left[ \begin{array}{c} (a_{1}',\dots ,a_{k}')^{\top } = \Pi \cdot (a_{1},\dots ,a_{m})^{\top } + \varvec{b}\\ \mathrm { and } \\ \exists \, i\in [k]: {\mathsf {ImVer}}_{{\mathsf {sk}}}(c_{i}')=1 \;\mathrm { but }\;c_{i}' \not \in {\mathsf {Enc}}_{{\mathsf {pk}}}(a'_{i}) \end{array} \left| \begin{array}{r} {\mathsf {pk}}\leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ (a_{1},\dots ,a_{m},\tilde{a}_{1},\dots ,\tilde{a}_{\tilde{m}}) \leftarrow \mathcal {M}({\mathsf {pk}}) \\ (c_{1} ,\dots ,c_{m}) \leftarrow ({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m}))\\ (\tilde{c}_{1} ,\dots ,\tilde{c}_{\tilde{m}}) \leftarrow ({\mathsf {SEnc}}_{{\mathsf {pk}}}(\tilde{a}_{1}) ,\dots ,{\mathsf {SEnc}}_{{\mathsf {pk}}}(\tilde{a}_{\tilde{m}})) \\ (c_{1}',\dots ,c_{k}') \leftarrow A({\mathsf {pk}},c_{1},\dots ,c_{m},\tilde{c}_{1},\dots ,\tilde{c}_{\tilde{m}};z) \\ (\Pi ,\varvec{b}) \leftarrow E({\mathsf {pk}},c_{1},\dots ,c_{m},\tilde{c}_{1},\dots ,\tilde{c}_{\tilde{m}};z)\\ \end{array} \right] \right. \\&\quad \le {\mathsf {negl}}(\lambda ), \end{aligned}$$

where \(\Pi \in \mathbb {F}^{k\times m}\) and \(\varvec{b}\in \mathbb {F}^{k}\).

5.3 Instantiations

We discuss candidates for our notions of linear-only encryption and one-way encoding schemes.

Linear-only property (and linear targeted malleability) from Paillier encryption. Paillier encryption [88] has plaintext group \((\mathbb {Z}_{N},+)\), where N is a product of two \(\lambda \)-bit primes \(\mathfrak {p}\) and \(\mathfrak {q}\). (See Remarks (2.6) and (5.1).) We consider two variants of Paillier encryption:

  • A “single-ciphertext” variant with linear targeted malleability. We assume that standard Paillier encryption satisfies Definition 5.8. Note that this variant cannot satisfy Definition 5.4 (which is stronger, as shown in Lemma 5.10), because it is easy to “obliviously sample” valid Paillier ciphertexts without “knowing” the corresponding plaintext. (See Remark (5.6).)

  • A “two-ciphertext” variant with linear-only property. In order to (heuristically) prevent oblivious sampling, we can “sparsify” the ciphertext space of Paillier encryption by following the template of knowledge-of-exponent assumptions. Concretely, an encryption of a plaintext \(a\) consists of \({\mathsf {Enc}}_{{\mathsf {pk}}}(a)\) and \({\mathsf {Enc}}_{{\mathsf {pk}}}(\alpha \cdot a)\) for a secret random \(\alpha \in \mathbb {Z}_{N}\); additionally, an image verification algorithm \({\mathsf {ImVer}}_{{\mathsf {sk}}}\) checks this linear relation. (This candidate is also considered in [56].) We then assume that this variant of Paillier encryption satisfies Definition 5.4.

Because Paillier encryption is based on the decisional composite residuosity assumption, it suffers from factoring attacks, and thus security for succinct arguments based on the above instantiations can only be assumed to hold against subexponential-time provers (specifically, running in time \(2^{o(\lambda ^{1/3})}\)).
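To make the two-ciphertext variant concrete, the following Python sketch pairs every Paillier ciphertext with an encryption of \(\alpha \) times the same plaintext and lets the secret-key holder check the relation. It is a toy illustration only: the primes are tiny and hard-coded, the encoder is assumed to know \(\alpha \) (as the generator does in a preprocessing setting), membership checks on ciphertexts are omitted, and all function names are hypothetical.

```python
import math
import secrets

def keygen(p=1009, q=1013):
    """Toy Paillier key generation with tiny primes (insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)                    # valid for generator g = n + 1
    alpha = secrets.randbelow(n - 2) + 2    # secret scalar in [2, n-1]
    return n, (lam, mu, alpha)

def enc(n, m):
    """Standard Paillier encryption of m with fresh randomness."""
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return pow(1 + n, m % n, n * n) * pow(r, n, n * n) % (n * n)

def dec(n, sk, c):
    lam, mu, _ = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def enc2(n, sk, a):
    """Two-ciphertext encoding (Enc(a), Enc(alpha * a))."""
    return enc(n, a), enc(n, sk[2] * a)

def add(n, x, y):
    """Homomorphic addition, applied to both components."""
    return x[0] * y[0] % (n * n), x[1] * y[1] % (n * n)

def scale(n, x, k):
    """Homomorphic multiplication by the public scalar k."""
    return pow(x[0], k, n * n), pow(x[1], k, n * n)

def im_ver(n, sk, pair):
    """Accept only if the second plaintext is alpha times the first."""
    return dec(n, sk, pair[1]) == sk[2] * dec(n, sk, pair[0]) % n
```

Honest affine combinations of valid pairs pass `im_ver`, because both components are transformed in the same way. A pair assembled without knowledge of \(\alpha \), such as the same ciphertext repeated twice, fails the check in this toy model.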

Linear-only property (and linear targeted malleability) from Elgamal encryption. Elgamal encryption [51] has plaintext group \((\mathbb {Z}_{\mathfrak {p}},\times )\) for a large prime \(\mathfrak {p}\), and is conjectured to resist subexponential-time attacks when implemented over elliptic curves [89].

We are interested in additive, rather than multiplicative, homomorphism for plaintexts that belong to the field \(\mathbb {F}_{\mathfrak {p}}\) (whose elements coincide with those of \(\mathbb {Z}_{\mathfrak {p}}\)). Thus, we would like the plaintext group to be \((\mathbb {Z}_{\mathfrak {p}},+)\) instead. The two groups \((\mathbb {Z}_{\mathfrak {p}},\times )\) and \((\mathbb {Z}_{\mathfrak {p}},+)\) are in fact isomorphic via the function that maps a plaintext \(a\) to a new plaintext \(g^{a} (\bmod \, \mathfrak {p})\), where g is a primitive element of \(\mathbb {F}_{\mathfrak {p}}\). Unfortunately, inverting this mapping is computationally inefficient: in order to recover the plaintext \(a\) from \(g^{a} (\bmod \, \mathfrak {p})\), the decryption algorithm has to compute a discrete logarithm base g; doing so is inefficient when \(a\) can be any value. Thus, a naive use of Elgamal encryption in our context presents a problem.

Nonetheless, as explained in Sect. 1.3.3, we can still use Elgamal encryption in our context by ensuring that the distribution of the honest \(\text{ LIP } \) prover’s answers, conditioned on any choice of verifier randomness, has a polynomial-size support. Doing so comes with two caveats: it results in succinct arguments with only \(1/{\mathrm {poly}}(\lambda )\) security and (possibly) a slow online verification time (but, of course, with proofs that are still succinct).

Here too, to prove security, we can consider single-ciphertext and two-ciphertext variants of Elgamal encryption that we assume satisfy Definitions 5.8 and 5.4, respectively.
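The role of the polynomial-size support can be illustrated with a small Python sketch of additively homomorphic ("exponent") Elgamal, in which decryption searches a caller-supplied candidate set instead of computing a general discrete logarithm. The group parameters (the safe prime 2039 with the order-1019 subgroup generated by 4), the single-ciphertext form, and the function names are assumptions made for this toy example and offer no security.

```python
import secrets

P, Q, G = 2039, 1019, 4  # toy group: G generates the order-Q subgroup of Z_P^*

def keygen():
    x = secrets.randbelow(Q - 1) + 1       # secret key in [1, Q-1]
    return x, pow(G, x, P)                 # (sk, pk = G^x)

def enc(h, a):
    """Encrypt a in the exponent: (G^r, G^a * h^r)."""
    r = secrets.randbelow(Q)
    return pow(G, r, P), pow(G, a % Q, P) * pow(h, r, P) % P

def add(c, d):
    """Multiplying ciphertexts adds the plaintexts."""
    return c[0] * d[0] % P, c[1] * d[1] % P

def scale(c, k):
    """Raising a ciphertext to the power k multiplies the plaintext by k."""
    return pow(c[0], k, P), pow(c[1], k, P)

def dec(x, c, candidates):
    """Recover the plaintext by searching a small candidate set."""
    # c[0]^(Q - x) equals c[0]^(-x) because c[0] has order dividing Q.
    masked = c[1] * pow(c[0], Q - x, P) % P
    for a in candidates:
        if pow(G, a % Q, P) == masked:
            return a
    return None  # plaintext lies outside the assumed support
```

Decryption time is linear in the size of the candidate set, which is why the honest prover's answers must have polynomial-size support and why online verification may be slow.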

Linear-only property (and linear targeted malleability) from Benaloh encryption. Benaloh encryption [14] generalizes the quadratic-residuosity-based encryption scheme of Goldwasser and Micali [61] to higher residue classes; it can support any plaintext group \((\mathbb {Z}_{\mathfrak {p}},+)\) where \(\mathfrak {p}\) is polynomial in the security parameter. Unlike Elgamal encryption (implemented over elliptic curves) and similarly to Paillier encryption, Benaloh encryption is susceptible to subexponential-time attacks.

As before, we can consider single-ciphertext and two-ciphertext variants of Benaloh encryption that we assume satisfy Definitions 5.8 and 5.4, respectively. Because we are restricted to \(\mathfrak {p}= {\mathrm {poly}}(\lambda )\), succinct arguments based on Benaloh encryption can only yield \(1/{\mathrm {poly}}(\lambda )\) security.

Linear-only one-way encodings from KEA in bilinear groups. In order to obtain publicly-verifiable preprocessing \(\text{ SNARK } \)s (see Sect. 6.2), we seek linear-only encodings that have \({\mathrm {poly}}(\lambda )\)-power one-wayness and allow public testing for zeros of \({\mathrm {poly}}(\lambda )\)-degree polynomials. For this, we use the same candidate encoding over bilinear groups, and essentially the same assumptions, as in [56, 64, 78]; because of the use of bilinear maps, we will in fact only be able to publicly test for zeros of quadratic polynomials. The corresponding construction can be instantiated with conjectured exponential security. Indeed, subexponential-time attacks against the base groups (relevant to the security in pairing-based SNARK constructions) are not known to be inherent. (In terms of concrete instantiations, however, existing symmetric-pairing groups are subject to subexponential attacks, whereas there do exist asymmetric-pairing groups conjectured to be resilient to subexponential attacks, for instance [23].)

For the sake of completeness, and since the construction does not correspond directly to a known encryption scheme as in the examples above, we give the basic construction and relevant assumptions. For simplicity, we describe them in the setting of symmetric pairings, but the construction could naturally be adapted to the setting of asymmetric pairings.

The encoding is defined over a bilinear group ensemble \(\{\mathcal {G}_{\lambda }\}_{\lambda \in \mathbb {N}}\) where each \((\mathbb {G},\mathbb {G}_{\mathsf {T}}) \in \mathcal {G}_{\lambda }\) is a pair of groups of prime order \(\mathfrak {p}\in (2^{\lambda -1},2^\lambda )\) with an efficiently-computable pairing \(e :\mathbb {G}\times \mathbb {G}\rightarrow \mathbb {G}_{\mathsf {T}}\). A public key \({\mathsf {pk}}\) includes the description of the groups and \(g,g^{\alpha } \in \mathbb {G}\), where \(g\in \mathbb {G}^{*}\) is a generator and \(\alpha \leftarrow \mathbb {F}_{\mathfrak {p}}\) is random. The encoding is deterministic: the linear-only mode encoding is \({\mathsf {Enc}}_{{\mathsf {pk}}}(a) := (g^{a},g^{\alpha a})\), and the standard-mode encoding is \({\mathsf {SEnc}}_{{\mathsf {pk}}}(a) := g^{a}\). Public image verification is as follows: \({\mathsf {ImVer}}_{{\mathsf {pk}}}(f,f')\) outputs 1 if and only if \(e(f,g^{\alpha })=e(g,f')\). Public testing of quadratic polynomials can also be done using the pairing: for \(\{(g_{i},g_{i}^{\alpha })\}_{i \in [m]} = \{ {\mathsf {Enc}}_{{\mathsf {pk}}}(a_{i}) \}_{i \in [m]}\) and \(\{\tilde{g}_{i}\}_{i \in [\tilde{m}]} = \{{\mathsf {SEnc}}_{{\mathsf {pk}}}(\tilde{a}_{i})\}_{i \in [\tilde{m}]}\), \({\mathsf {Test}}\) uses \(g_{1},\dots ,g_{m}\) and \(\tilde{g}_{1},\dots ,\tilde{g}_{\tilde{m}}\) and the pairing to test zeros for a quadratic polynomial \(\varvec{t}\). The required cryptographic assumptions are:

Assumption 5.18

(KEA and poly-power DL in bilinear groups) There exists an efficiently-samplable group ensemble \(\{\mathcal {G}_{\lambda }\}_{\lambda \in \mathbb {N}}\), where each \((\mathbb {G},\mathbb {G}_{\mathsf {T}}) \in \mathcal {G}_{\lambda }\) is a pair of groups of prime order \(\mathfrak {p}\in (2^{\lambda -1},2^\lambda )\) having a corresponding efficiently-computable pairing \(e :\mathbb {G}\times \mathbb {G}\rightarrow \mathbb {G}_{\mathsf {T}}\), such that the following properties hold.

  1. Knowledge of exponent. For any polynomial-size adversary \(A\) there exists a polynomial-size extractor \(E\) such that for all large enough \(\lambda \in \mathbb {N}\), any auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\) (see Footnote 10), and any group element sampler S,

    $$\begin{aligned}&\mathrm{Pr}\left[ \begin{array}{c} f'=f^{\alpha } \\ \prod _{i\in [t]} g_{i}^{\pi _{i}} \ne f \end{array} \left| \begin{array}{r} (\mathbb {G},\mathbb {G}_{\mathsf {T}}) \leftarrow \mathcal {G}_{\lambda } \\ (g_{1},\dots ,g_{t}) \leftarrow S(\mathbb {G},\mathbb {G}_{\mathsf {T}}) \\ \alpha \leftarrow \mathbb {F}_{\mathfrak {p}} \\ (f,f') \leftarrow A(\mathbb {G},\mathbb {G}_{\mathsf {T}},g_{1},g_{1}^{\alpha },\dots ,g_{t},g_{t}^{\alpha };z) \\ (\pi _{1},\dots ,\pi _{t}) \leftarrow E(\mathbb {G},\mathbb {G}_{\mathsf {T}},g_{1},g_{1}^{\alpha },\dots ,g_{t},g_{t}^{\alpha };z) \end{array} \right] \right. \\&\quad \le {\mathsf {negl}}(\lambda ) . \end{aligned}$$
  2. Hardness of poly-power discrete logarithms. For any polynomial-size adversary \(A\), polynomial \(t={\mathrm {poly}}(\lambda )\), all large enough \(\lambda \in \mathbb {N}\), and generator sampler S:

    $$\begin{aligned} \mathrm{Pr}\left[ s' = s\left| \begin{array}{r} (\mathbb {G},\mathbb {G}_{\mathsf {T}}) \leftarrow \mathcal {G}_{\lambda } \\ s\leftarrow \mathbb {F}_{\mathfrak {p}} \\ g \leftarrow S(\mathbb {G}) \text { where } \langle g \rangle = \mathbb {G}\\ s' \leftarrow A(\mathbb {G},\mathbb {G}_{\mathsf {T}},g,g^s,g^{s^{2}},\dots ,g^{s^{t}}) \\ \end{array} \right] \le {\mathsf {negl}}(\lambda )\right. . \end{aligned}$$

Remark 5.19

(Lattice-based candidates) In principle, we may also consider as candidates lattice-based encryption schemes (e.g., [91]). However, our confidence that these schemes satisfy linear-only properties may be more limited, as they can be tweaked to yield fully-homomorphic encryption schemes [36].

6 Preprocessing \(\text{ SNARK } \)s from \(\text{ LIP } \)s

We describe how to combine \(\text{ LIP } \)s and linear-only encryption and encodings in order to construct preprocessing \(\text{ SNARK } \)s. Before describing our transformations, we make two technical remarks.

SNARKs and LIPs for boolean circuit families. Since the \(\text{ LIP } \)s that we have presented so far are for boolean circuit satisfaction problems, it will be convenient to construct here preprocessing \(\text{ SNARK } \)s for boolean circuit satisfaction problems. As explained in Sect. 4.1, such preprocessing \(\text{ SNARK } \)s imply preprocessing \(\text{ SNARK } \)s for the universal relation \(\mathcal {R}_{\mathcal {U}}\) with similar efficiency.

Also, for the sake of simplicity, the \(\text{ LIP } \) constructions that we have presented so far are for satisfiability of specific boolean circuits. However, all of these constructions directly extend to work for any family of boolean circuits \(\mathcal {C}= \left\{ C_{\ell }\right\} _{\ell \in \mathbb {N}}\), in which case all the \(\text{ LIP } \) algorithms (e.g., \(V_{\scriptscriptstyle {\mathsf {LIP}}}=(Q_{\scriptscriptstyle {\mathsf {LIP}}} ,D_{\scriptscriptstyle {\mathsf {LIP}}})\) and \(P_{\scriptscriptstyle {\mathsf {LIP}}}\)) will also get as input \(1^{\ell }\) (as foreshadowed in Remark (2.3)). If the circuit family \(\mathcal {C}\) is uniform, all the \(\text{ LIP } \) algorithms are uniform as well. If the circuit family \(\mathcal {C}\) is non-uniform, then \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) and \(P_{\scriptscriptstyle {\mathsf {LIP}}}\) will also get a circuit \(C_{\ell }\) as auxiliary input (in addition to \(1^{\ell }\)).

Field size depending on \(\lambda \). Definitions 2.5 and 2.2 are stated with respect to a fixed field \(\mathbb {F}\). However, since the knowledge error of a \(\text{ LIP } \) (or \(\text{ LPCP } \)) typically decreases with the field size, it is often convenient to let the size of \(\mathbb {F}\) scale with a security parameter \(\lambda \). In fact, when combining a \(\text{ LIP } \) with some of our linear-only encryption and encoding candidates, letting \(\mathbb {F}\) scale with \(\lambda \) is essential, because security will only hold for a large enough plaintext space. (For example, this is the case for the Elgamal-like linear-only encoding described in Sect. 5.3.) All of the \(\text{ LIP } \) constructions described in Sects. 3.1 and 3.2 do work for arbitrarily large fields, and we can assume that \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) simply get as additional input the description of the field; abusing notation, we will just denote this description by \(\mathbb {F}_{\lambda }\).

6.1 Designated-Verifier Preprocessing \(\text{ SNARK } \)s from Arbitrary \(\text{ LIP } \)s

We describe how to combine a \(\text{ LIP } \) and linear-only encryption to obtain a designated-verifier preprocessing \(\text{ SNARK } \).

Construction 6.1

Let \(\left\{ \mathbb {F}_{\lambda }\right\} _{\lambda \in \mathbb {N}}\) be a field ensemble (with efficient description and operations). Let \(\mathcal {C}= \left\{ C_{\ell }\right\} _{\ell \in \mathbb {N}}\) be a family of circuits. Let \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) be an input-oblivious two-message \(\text{ LIP } \) for the relation \(\mathcal {R}_{\mathcal {C}}\), where for the field \(\mathbb {F}_{\lambda }\), the verifier message is in \(\mathbb {F}_{\lambda }^{m}\), the prover message is in \(\mathbb {F}_{\lambda }^{k}\), and the knowledge error is \(\varepsilon (\lambda )\). Let \(\mathcal {E}= ({\mathsf {Gen}},{\mathsf {Enc}},{\mathsf {Dec}},{\mathsf {Add}},{\mathsf {ImVer}})\) be a linear-only encryption scheme whose plaintext field, for security parameter \(\lambda \), is \(\mathbb {F}_{\lambda }\). We define a preprocessing \(\text{ SNARK } \) \((G,P,V)\) for \(\mathcal {R}_{\mathcal {C}}\) as follows.

  • \(G(1^{\lambda },1^{\ell })\) invokes the \(\text{ LIP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell })\) to generate a \(\text{ LIP } \) message \({\mathbf {q}} \in \mathbb {F}_{\lambda }^{m}\) along with a secret state \({\mathbf {u}}\in \mathbb {F}_{\lambda }^{m'}\), generates \(({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda })\), computes \(c_{i} \leftarrow {\mathsf {Enc}}_{{\mathsf {pk}}}(q_{i})\) for \(i\in [m]\), defines \(\sigma := ({\mathsf {pk}},c_{1},\dots ,c_{m})\) and \(\tau := ({\mathsf {sk}},{\mathbf {u}})\), and outputs \((\sigma ,\tau )\). (Assume that both \(\sigma \) and \(\tau \) contain \(\ell \) and the description of the field \(\mathbb {F}_{\lambda }\).)

  • \(P(\sigma ,x,w)\) invokes the \(\text{ LIP } \) prover algorithm \(P_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell },x,w)\) to get a matrix \(\Pi \in \mathbb {F}_{\lambda }^{k\times m}\) representing its message function, invokes the homomorphic \({\mathsf {Add}}\) to generate \(k\) ciphertexts \(c_{1}',\dots ,c_{k}'\) encrypting \(\Pi \cdot {\mathbf {q}}\), defines \(\pi := (c_{1}',\dots ,c_{k}')\), and outputs \(\pi \).

  • \(V(\tau ,x,\pi )\) verifies, for \(i\in [k]\), that \({\mathsf {ImVer}}_{{\mathsf {sk}}}(c'_{i})=1\), lets \(a_{i} := {\mathsf {Dec}}_{{\mathsf {sk}}}(c'_{i})\), and outputs the decision of \(D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell },x,{\mathbf {u}},(a_{1},\dots ,a_{k}))\).
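The data flow of Construction 6.1 can be sketched in Python as follows. This is a simplified illustration, not the construction itself: the encryption scheme is single-ciphertext toy Paillier with tiny primes (so the \({\mathsf {ImVer}}\) check is omitted), the \(\text{ LIP } \) query algorithm is replaced by uniformly random sampling with state \({\mathbf {u}}={\mathbf {q}}\), the decision algorithm is passed in as a callback `decide`, and the prover's ciphertexts are not rerandomized. All function names are hypothetical.

```python
import math
import secrets

def paillier_keygen(p=1009, q=1013):
    """Toy Paillier keys (insecure parameters)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))

def paillier_enc(n, m):
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return pow(1 + n, m % n, n * n) * pow(r, n, n * n) % (n * n)

def paillier_dec(n, sk, c):
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def generate(m):
    """G: sample a query q (placeholder for Q_LIP) and encrypt it."""
    n, sk = paillier_keygen()
    q = [secrets.randbelow(n) for _ in range(m)]
    sigma = (n, [paillier_enc(n, qi) for qi in q])   # reference string
    tau = (sk, q)                                    # verification state
    return sigma, tau

def prove(sigma, Pi):
    """P: homomorphically apply each row of Pi to the encrypted query."""
    n, cts = sigma
    proof = []
    for row in Pi:
        acc = 1  # the integer 1 decrypts to 0, the neutral plaintext
        for c, coeff in zip(cts, row):
            acc = acc * pow(c, coeff % n, n * n) % (n * n)
        proof.append(acc)
    return proof

def verify(sigma, tau, proof, decide):
    """V: decrypt the k answers and defer to the LIP decision callback."""
    n, _ = sigma
    sk, u = tau
    answers = [paillier_dec(n, sk, c) for c in proof]
    return decide(u, answers)
```

The prover never sees \({\mathbf {q}}\) in the clear, yet the verifier recovers exactly \(\Pi \cdot {\mathbf {q}}\); the proof consists of \(k\) ciphertexts regardless of \(m\), in line with the size bounds stated in Lemma 6.2.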

Lemma 6.2

Suppose that the \(\text{ LIP } \) \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) has knowledge error \(\varepsilon (\lambda )\) and \(\mathcal {E}\) is a linear-only encryption scheme. Then, \((G,P,V)\) from Construction 6.1 is a designated-verifier preprocessing \(\text{ SNARK } \) with knowledge error \(O(\varepsilon (\lambda )) + {\mathsf {negl}}(\lambda )\). Furthermore:

  • \({\mathsf {time}}(G) = {\mathsf {time}}(Q_{\scriptscriptstyle {\mathsf {LIP}}}) + {\mathrm {poly}}(\lambda ) \cdot m\),

  • \({\mathsf {time}}(P) = {\mathsf {time}}(P_{\scriptscriptstyle {\mathsf {LIP}}}) + {\mathrm {poly}}(\lambda ) \cdot k^{2} \cdot m\),

  • \({\mathsf {time}}(V) = {\mathsf {time}}(D_{\scriptscriptstyle {\mathsf {LIP}}}) + {\mathrm {poly}}(\lambda ) \cdot k\),

  • \(|\sigma | = {\mathrm {poly}}(\lambda ) \cdot m\), \(|\tau | = {\mathrm {poly}}(\lambda ) + m'\), and \(|\pi |= {\mathrm {poly}}(\lambda ) \cdot k\).

Proof

Completeness follows readily from the completeness of \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) and the correctness of \(\mathcal {E}\). Efficiency follows readily from that of \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\). We thus focus on establishing the knowledge property.

Let \(P^{*}\) be a malicious polynomial-size prover. We construct a knowledge extractor \(E\) for \(P^{*}\) in two steps: \(E(\sigma ,z)\), given common reference string \(\sigma \) and auxiliary input \(z\), first invokes the linear-only extractor \(E'(\sigma ,z)\) for \(P^{*}\) (on the same input \((\sigma ,z)\) as \(P^{*}\)) to obtain a \(\text{ LIP } \) affine transformation \(\Pi ^{*}\) “explaining” the encryptions output by \(P^{*}\); in the second step, \(E\) invokes the \(\text{ LIP } \) extractor \(E_{\scriptscriptstyle {\mathsf {LIP}}}\) with oracle access to \(\Pi ^{*}\) and on input the statement chosen by \(P^{*}\) to obtain an assignment for the circuit.

We now argue that \(E\) works as required. First, we claim that, except with negligible probability, whenever \(P^{*}(\sigma ,z)\) produces a statement \(x\) and proof \({\mathbf {c}}'=(c_{1}',\dots ,c_{k}')\) accepted by the verifier, the extracted \(\Pi ^{*}\) is such that \(D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell },x,{\mathbf {u}},{\mathbf {a}}^{*})=1\), where \({\mathbf {u}}\) is the private state of the verifier and \({\mathbf {a}}^{*} = \Pi ^{*}({\mathbf {q}})\). Indeed, by the linear-only property of \(\mathcal {E}\) (see Definition 5.4), except with negligible probability, whenever the verifier is convinced, \({\mathbf {a}}^{*} = \Pi ^{*}({\mathbf {q}})\) is equal to \({\mathbf {a}}= {\mathsf {Dec}}_{{\mathsf {sk}}}({\mathbf {c'}})\), which is accepted by the \(\text{ LIP } \) decision algorithm.

Second, we claim that, due to semantic security of \(\mathcal {E}\), except with probability \(4\varepsilon +{\mathsf {negl}}(\lambda )\), whenever the verifier accepts, the extracted proof \(\Pi ^{*}\) is not only “locally satisfying” but “globally satisfying,” and thus the LIP extractor \(E_{\scriptscriptstyle {\mathsf {LIP}}}\) is guaranteed to extract from \(\Pi ^{*}\) a witness for \(x\).

Assume toward contradiction that for some noticeable \(\delta (\lambda )\), it holds with probability at least \(4\varepsilon + \delta \) that \(D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell },x,{\mathbf {u}},\Pi ^{*}({\mathbf {q}}))=1\), and yet \(E_{\scriptscriptstyle {\mathsf {LIP}}}^{\Pi ^{*}}(x)\) fails to output a valid witness for \(x\). We describe a reduction that breaks the semantic security of \(\mathcal {E}\). The reduction applies \(Q_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_\lambda ,1^{\ell })\) to sample two random queries \({\mathbf {q}}_0,{\mathbf {q}}_1\) along with corresponding states \({\mathbf {u}}_0,{\mathbf {u}}_1\). It then submits \({\mathbf {q}}_0,{\mathbf {q}}_1\) to a challenger and gets back a public key \({\mathsf {pk}}\) and an encryption \({\mathsf {Enc}}_{\mathsf {pk}}({\mathbf {q}}_b)\) of one of the two queries, where the challenge bit \(b\) is unknown to the reduction. Finally, it runs the extractor with \(\sigma = ({\mathsf {pk}},{\mathsf {Enc}}_{\mathsf {pk}}({\mathbf {q}}_b))\) (and auxiliary input \(z\)) and obtains \(\Pi ^{*}\). We consider two events:

  • Event F: \(E_{\scriptscriptstyle {\mathsf {LIP}}}^{\Pi ^{*}}(x)\) fails to output a valid witness for \(x\).

  • Event D: \(D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_0,\Pi ^{*}({\mathbf {q}}_0))\ne D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_1,\Pi ^{*}({\mathbf {q}}_1)) \).

If F and D both occur, the reduction outputs the unique bit \(\beta \) such that \( D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_\beta ,\Pi ^{*}({\mathbf {q}}_\beta )) = 1\); otherwise, the reduction outputs a random guess \(\beta \). We next show that the reduction guesses b with noticeable advantage over 1/2. For this, we shall prove:

  1. Both F and D occur with noticeable probability, namely at least \(\delta \).

  2. If both F and D occur, the reduction guesses \(\beta = b\) with constant advantage; specifically, with probability at least 2/3.

First, by our assumption toward contradiction

$$\begin{aligned} \mathrm{Pr}[D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_b,\Pi ^{*}({\mathbf {q}}_b)) = 1 \wedge F] \ge 4\varepsilon +\delta . \end{aligned}$$

Second, since \(E_{\scriptscriptstyle {\mathsf {LIP}}}\) fails to extract from \(\Pi ^{*}\), and \({\mathbf {q}}_{1-b},{\mathbf {u}}_{1-b}\) are sampled independently of \(\Pi ^{*}\),

$$\begin{aligned} \mathrm{Pr}\left[ D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_{1-b},\Pi ^{*}({\mathbf {q}}_{1-b})) = 1 \vert F\right] \le \varepsilon . \end{aligned}$$

We now have

$$\begin{aligned} \mathrm{Pr}\left[ D\vert F\right]&\ge \mathrm{Pr}\left[ D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_{b},\Pi ^{*}({\mathbf {q}}_{b})) = 1 \vert F\right] \\&\quad - \mathrm{Pr}\left[ D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_{1-b},\Pi ^{*}({\mathbf {q}}_{1-b})) = 1 \vert F\right] \\&\ge \frac{4\varepsilon +\delta }{\mathrm{Pr}[F]} -\varepsilon . \end{aligned}$$

In particular, multiplying by \(\mathrm{Pr}[F]\) and using \(\mathrm{Pr}[F]\le 1\),

$$\begin{aligned} \mathrm{Pr}\left[ D\wedge F\right] \ge 3\varepsilon +\delta \ge \delta . \end{aligned}$$

Furthermore,

$$\begin{aligned} \mathrm{Pr}\left[ \beta = b \vert F\wedge D\right]&\ge 1- \mathrm{Pr}\left[ D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_{1-b},\Pi ^{*}({\mathbf {q}}_{1-b})) = 1 \vert F\wedge D\right] \\&= 1- \frac{\mathrm{Pr}\left[ D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell }, x,{\mathbf {u}}_{1-b},\Pi ^{*}({\mathbf {q}}_{1-b})) = 1 \wedge D \vert F\right] }{\mathrm{Pr}\left[ D\vert F\right] }\\&\ge 1- \frac{\varepsilon }{3\varepsilon +\delta } \ge 2/3. \end{aligned}$$
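For completeness, the two bounds can be combined into an explicit lower bound on the reduction's success probability. The calculation below is a routine step that is not spelled out in the text; it uses only the facts that the reduction outputs a uniformly random guess whenever \(F\wedge D\) does not occur, that \(\mathrm{Pr}[F\wedge D]\ge \delta \), and that \(\mathrm{Pr}[\beta = b \vert F\wedge D]\ge 2/3\).

$$\begin{aligned} \mathrm{Pr}\left[ \beta = b\right]&= \mathrm{Pr}\left[ F\wedge D\right] \cdot \mathrm{Pr}\left[ \beta = b \vert F\wedge D\right] + \left( 1-\mathrm{Pr}\left[ F\wedge D\right] \right) \cdot \frac{1}{2}\\&\ge \frac{1}{2} + \mathrm{Pr}\left[ F\wedge D\right] \cdot \left( \frac{2}{3}-\frac{1}{2}\right) \ge \frac{1}{2} + \frac{\delta }{6}. \end{aligned}$$

Since \(\delta \) is noticeable, so is the advantage \(\delta /6\), contradicting the semantic security of \(\mathcal {E}\).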

This concludes the proof.\(\square \)

Designated-verifier non-adaptive preprocessing SNARKs from linear targeted malleability. We also consider a notion that is weaker than linear-only encryption: encryption with linear targeted malleability (see Definition 5.8). For this notion, we are still able to obtain, via the same Construction 6.1, designated-verifier preprocessing \(\text{ SNARK } \)s, but this time only against statements that are non-adaptively chosen.

Lemma 6.3

Suppose that the \(\text{ LIP } \) \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) has knowledge error \(\varepsilon (\lambda )\) and \(\mathcal {E}\) is an encryption scheme with linear targeted malleability. Then, \((G,P,V)\) from Construction 6.1 is a designated-verifier non-adaptive preprocessing \(\text{ SNARK } \) with knowledge error \(\varepsilon (\lambda ) + {\mathsf {negl}}(\lambda )\).

Proof

Let \(P^{*}\) be a malicious polynomial-size prover, which convinces the verifier for infinitely many false statements \(x\). By the targeted malleability property (see Definition 5.8), there exists a polynomial-size simulator \(S\) (depending on \(P^{*}\)) such that:

$$\begin{aligned} \left\{ \begin{array}{l} {\mathsf {pk}}, \\ {\mathbf {q}}, \\ {\mathbf {u}}, \\ {\mathbf {a}} \end{array} \left| \begin{array}{r} ({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ ({\mathbf {q}},{\mathbf {u}}) \leftarrow Q_{\scriptscriptstyle {\mathsf {LIP}}}\\ {\mathbf {c}} \leftarrow {\mathsf {Enc}}_{{\mathsf {pk}}}({\mathbf {q}})\\ {\mathbf {c}}' \leftarrow P^{*}({\mathsf {pk}},{\mathbf {c}};x) \\ \text { where} \\ {\mathsf {ImVer}}_{{\mathsf {sk}}}({\mathbf {c}}')=1\\ {\mathbf {a}} \leftarrow {\mathsf {Dec}}_{{\mathsf {sk}}}({\mathbf {c}}') \end{array} \right\} \right. \approx _{c} \left\{ \begin{array}{l} {\mathsf {pk}}, \\ {\mathbf {q}}, \\ {\mathbf {u}}, \\ {\mathbf {a}} \\ \end{array} \left| \begin{array}{r} ({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda }) \\ ({\mathbf {q}},{\mathbf {u}}) \leftarrow Q_{\scriptscriptstyle {\mathsf {LIP}}}\\ (\Pi ,\varvec{b}) \leftarrow S({\mathsf {pk}};x) \\ {\mathbf {a}} \leftarrow \Pi \cdot {\mathbf {q}} + \varvec{b}\end{array}\right. \right\} , \end{aligned}$$

where \({\mathbf {q}}\) is the \(\text{ LIP } \) query and \({\mathbf {u}}\) is the \(\text{ LIP } \) verification state.

If \(P^{*}\) convinces the verifier to accept with probability at least \(\varepsilon +\delta \) for some noticeable \(\delta \), then, with at least the same probability, the distribution on the left satisfies that \(D_{\scriptscriptstyle {\mathsf {LIP}}}(x,{\mathbf {u}},{\mathbf {a}})=1\). Because this condition is efficiently testable, the simulated distribution on the right satisfies the same condition with probability at least \(\varepsilon +\delta /2\). However, in this distribution the generation of \({\mathbf {q}}\) and \({\mathbf {u}}\) is independent of the generation of the simulated affine function \(\Pi '=(\Pi ,\varvec{b})\). Therefore, by averaging, there is a \(\delta /2\) fraction of \(\Pi '\) such that, with probability at least \(\varepsilon \) over the choice of \({\mathbf {q}}\) and \({\mathbf {u}}\), it holds that \(D_{\scriptscriptstyle {\mathsf {LIP}}}(x,{\mathbf {u}},\Pi {\mathbf {q}}+\varvec{b})=1\). This yields an extractor that makes at most \(2/\delta = \lambda ^{O(1)}\) attempts in expectation: it repeatedly samples \(({\mathsf {sk}},{\mathsf {pk}})\), applies the simulator \(S({\mathsf {pk}};x)\), and then applies the \(\text{ LIP } \) extractor, attempting to extract from \(\Pi '=(\Pi ,\varvec{b})\). \(\square \)
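The averaging step in the proof above can be made explicit. The notation \(p(\Pi ')\), for the probability over \(({\mathbf {q}},{\mathbf {u}})\) that \(D_{\scriptscriptstyle {\mathsf {LIP}}}(x,{\mathbf {u}},\Pi {\mathbf {q}}+\varvec{b})=1\) for a fixed \(\Pi '=(\Pi ,\varvec{b})\), and \(\mathsf {Good}\), for the set of \(\Pi '\) with \(p(\Pi ')\ge \varepsilon \), is introduced here only for this calculation. Since \(\Pi '\) is independent of \(({\mathbf {q}},{\mathbf {u}})\) in the simulated distribution,

$$\begin{aligned} \varepsilon +\delta /2 \le \mathop {\mathrm{E}}\limits _{\Pi '}\left[ p(\Pi ')\right] \le \mathrm{Pr}\left[ \Pi '\in \mathsf {Good}\right] \cdot 1 + \mathrm{Pr}\left[ \Pi '\notin \mathsf {Good}\right] \cdot \varepsilon \le \mathrm{Pr}\left[ \Pi '\in \mathsf {Good}\right] + \varepsilon , \end{aligned}$$

so that \(\mathrm{Pr}[\Pi '\in \mathsf {Good}]\ge \delta /2\), and the expected number of independent samples of \(\Pi '\) needed to hit \(\mathsf {Good}\) is at most \(2/\delta \).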

Remark 6.4

(Inefficient simulator) As mentioned in Remark 5.9, Definition 5.8 can be weakened by allowing the simulator to be inefficient. In such a case, we are able to obtain designated-verifier non-adaptive preprocessing \(\text{ SNARG } \)s (note the lack of the knowledge property), via essentially the same proof as the one we gave above for Lemma 6.3.

Remark 6.5

(A word on adaptivity) One can strengthen Definition 5.8 by allowing the adversary to output an additional (arbitrary) string \(y\), which the simulator must be able to simulate as well. Interpreting this additional output as the adversary’s choice of statement, a natural question is whether the strengthened definition suffices to prove security against adaptively-chosen statements as well.

Unfortunately, to answer this question in the positive, it seems that a polynomial-size distinguisher should be able to test whether a statement \(y\) output by the adversary is a true or false statement. This may not be possible if \(y\) encodes an arbitrary \({\mathsf {NP}}\) statement. (For the restricted case of deterministic polynomial-time computations, however, the approach we just described does in fact work.)

We stress that, while we do not know how to prove security against adaptively-chosen statements, we also do not know of any attack on the construction in the adaptive case.

6.2 Publicly-Verifiable Preprocessing \(\text{ SNARK } \)s from Algebraic \(\text{ LIP } \)s

We show how to transform any \(\text{ LIP } \) with degree \((d_{Q} ,d_{D})=({\mathrm {poly}}(\lambda ) ,2)\) to a publicly-verifiable preprocessing \(\text{ SNARK } \) using linear-only one-way encodings with quadratic tests. The restriction to quadratic tests (i.e., \(d_{D}\le 2\)) is made for simplicity, because we only have one-way encoding candidates based on bilinear maps. As noted in Remark 5.12, the transformation can in fact support any \(d_{D}= {\mathrm {poly}}(\lambda )\), given one-way encodings with corresponding \(d_{D}\)-degree tests.

Construction 6.6

Let \(\left\{ \mathbb {F}_{\lambda }\right\} _{\lambda \in \mathbb {N}}\) be a field ensemble (with efficient description and operations). Let \(\mathcal {C}= \left\{ C_{\ell }\right\} _{\ell \in \mathbb {N}}\) be a family of circuits. Let \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) be an input-oblivious two-message \(\text{ LIP } \) for the relation \(\mathcal {R}_{\mathcal {C}}\), where for field \(\mathbb {F}_{\lambda }\), the verifier message is in \(\mathbb {F}_{\lambda }^{m}\), the prover message is in \(\mathbb {F}_{\lambda }^{k}\), and the knowledge error is \(\varepsilon (\lambda )\); assume that the verifier degree is \((d_{Q},d_{D})=({\mathrm {poly}}(\lambda ),2)\). Let \(\mathcal {E}= ({\mathsf {Gen}},{\mathsf {Enc}},{\mathsf {SEnc}},{\mathsf {Test}},{\mathsf {Add}},{\mathsf {ImVer}})\) be a linear-only one-way encoding scheme whose plaintext field, for security parameter \(\lambda \), is \(\mathbb {F}_{\lambda }\). We define a preprocessing \(\text{ SNARK } \) \((G,P,V)\) for \(\mathcal {R}_{\mathcal {C}}\) as follows.

  • \(G(1^{\lambda },1^{\ell })\) invokes the \(\text{ LIP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell })\) to generate an \(\text{ LIP } \) message \({\mathbf {q}} \in \mathbb {F}_{\lambda }^{m}\) along with a secret state \({\mathbf {u}}\in \mathbb {F}_{\lambda }^{m'}\), generates \({\mathsf {pk}}\leftarrow {\mathsf {Gen}}(1^{\lambda })\), lets \(c_{i} \leftarrow {\mathsf {Enc}}_{{\mathsf {pk}}}(q_{i})\) for \(i\in [m]\), \(\tilde{c}_{i} \leftarrow {\mathsf {SEnc}}_{{\mathsf {pk}}}(u_{i})\) for \(i\in [m']\), defines \(\sigma := ({\mathsf {pk}},c_{1},\dots ,c_{m})\) and \(\tau := ({\mathsf {pk}},\tilde{c}_{1},\dots ,\tilde{c}_{m'})\), and outputs \((\sigma ,\tau )\). (Assume that both \(\sigma \) and \(\tau \) contain \(\ell \) and the description of the field \(\mathbb {F}_{\lambda }\).)

  • \(P(\sigma ,x,w)\) invokes the \(\text{ LIP } \) prover algorithm \(P_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell },x,w)\) to get a matrix \(\Pi \in \mathbb {F}_{\lambda }^{k\times m}\) representing its message function, invokes the homomorphic \({\mathsf {Add}}\) to generate \(k\) encodings \(c_{1}',\dots ,c_{k}'\) for \(\Pi \cdot {\mathbf {q}}\), defines \(\pi := (c_{1}',\dots ,c_{k}')\), and outputs \(\pi \).

  • \(V(\tau ,x,\pi )\) verifies that \({\mathsf {ImVer}}_{{\mathsf {pk}}}(c'_{i})=1\) for \(i\in [k]\), lets \(\varvec{t}_{x}:\mathbb {F}^{k+m'}\rightarrow \mathbb {F}^\eta \) be the quadratic polynomial given by \(D_{\scriptscriptstyle {\mathsf {LIP}}}(\mathbb {F}_{\lambda },1^{\ell },x,\ldots )\), and accepts if and only if \({\mathsf {Test}}\big ({\mathsf {pk}},\varvec{t}_{x},c_{1}',\dots ,c_{k}',\tilde{c}_{1},\dots ,\tilde{c}_{m'}\big ) = 1\).
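The prover step of Construction 6.6 (evaluating \(\Pi \cdot {\mathbf {q}}\) on encodings via \({\mathsf {Add}}\)) can be illustrated with a minimal Python sketch. It assumes a toy, completely insecure encoding \(a \mapsto g^{a} \bmod P\) in a subgroup of prime order 11; the parameters and the helper names `enc`, `add`, and `prove` are hypothetical and chosen only for illustration, and the sketch implements neither \({\mathsf {ImVer}}\) nor the quadratic \({\mathsf {Test}}\).

```python
# Toy parameters (assumptions for illustration only; far too small to be secure).
P = 23      # modulus of the ambient group Z_P^*
R = 11      # prime order of the subgroup; the plaintext field is F_R
GEN = 2     # generator of the order-R subgroup (2**11 % 23 == 1)

def enc(a):
    """Encode a field element a in F_R as GEN^a mod P."""
    return pow(GEN, a % R, P)

def add(coeffs, encodings):
    """Given encodings of a_1..a_m, return an encoding of sum_i coeffs[i] * a_i."""
    out = 1
    for alpha, c in zip(coeffs, encodings):
        out = (out * pow(c, alpha % R, P)) % P
    return out

def prove(Pi, sigma):
    """Apply each row of the prover's k x m matrix Pi to the encoded query."""
    return [add(row, sigma) for row in Pi]

q = [3, 7, 10, 1]                      # LIP query (m = 4); the prover never sees it in the clear
sigma = [enc(qi) for qi in q]          # encoded query, as published by the generator
Pi = [[1, 2, 0, 5], [4, 0, 9, 1]]      # prover's matrix (k = 2)
pi = prove(Pi, sigma)

# For comparison only: direct encodings of the entries of Pi * q.
expected = [enc(sum(a * b for a, b in zip(row, q))) for row in Pi]
```

In this toy setting `pi` coincides with `expected`, which is the correctness property the verifier's test relies on; the proof length is \(k\) group elements regardless of \(m\).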

Lemma 6.7

Suppose that the \(\text{ LIP } \) \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) has knowledge error \(\varepsilon (\lambda )\) and \(\mathcal {E}\) is a linear-only one-way encoding scheme. Then \((G,P,V)\) from Construction 6.6 is a publicly-verifiable preprocessing \(\text{ SNARK } \) with knowledge error \(\varepsilon (\lambda ) + {\mathsf {negl}}(\lambda )\). Furthermore:

  • \({\mathsf {time}}(G) = {\mathsf {time}}(Q_{\scriptscriptstyle {\mathsf {LIP}}}) + {\mathrm {poly}}(\lambda ) \cdot m\),

  • \({\mathsf {time}}(P) = {\mathsf {time}}(P_{\scriptscriptstyle {\mathsf {LIP}}}) + {\mathrm {poly}}(\lambda ) \cdot k^{2} \cdot m\),

  • \({\mathsf {time}}(V) = {\mathrm {poly}}(\lambda ) \cdot {\mathsf {time}}(D_{\scriptscriptstyle {\mathsf {LIP}}})\),

  • \(|\sigma | = {\mathrm {poly}}(\lambda ) \cdot m\), \(|\tau | = {\mathrm {poly}}(\lambda ) \cdot m'\), and \(|\pi |= {\mathrm {poly}}(\lambda ) \cdot k\).

Proof

Completeness follows readily from the completeness of \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) and the correctness of \(\mathcal {E}\). Efficiency as claimed above follows readily from that of the underlying LIP. We thus focus on establishing the knowledge property.

Let \(P^{*}\) be a malicious polynomial-size prover. As in the designated-verifier case, we construct its knowledge extractor \(E\) in two steps: first invoke the linear-only extractor \(E'\) for \(P^{*}\) (on the same input (\(\sigma ,z\)) as \(P^{*}\)) to obtain an \(\text{ LIP } \) affine transformation \(\Pi ^{*}\) “explaining” the encodings output by \(P^{*}\), and then invoke the \(\text{ LIP } \) extractor \(E_{\scriptscriptstyle {\mathsf {LIP}}}\) (with oracle access to \(\Pi ^{*}\) and on input the statement chosen by \(P^{*}\)) to obtain an assignment for the circuit.

We now argue that \(E\) works as required. (Differently from the designated verifier case (see proof of Lemma 6.2), we now rely on \({\mathrm {poly}}(\lambda )\)-power one-wayness (Definition 5.13) of the linear-only encoding scheme, instead of semantic security.)

First, we claim that, except with negligible probability, whenever \(P^{*}(\sigma ,z)\) produces a statement \(x\) and proof \({\mathbf {c}}'=(c_{1}',\dots ,c_{k}')\) accepted by the verifier, the extracted \(\Pi ^{*}\) is such that \(\varvec{t}_{x}(\Pi ^{*}({\mathbf {q}}),{\mathbf {u}}) = {\mathbf {0}}\). Indeed, by the linear-only property of \(\mathcal {E}\) (see Definition 5.17), except with negligible probability, whenever the verifier is convinced, it holds that \({\mathbf {c}}'\in {\mathsf {Enc}}_{{\mathsf {pk}}}(\Pi ^{*}({\mathbf {q}}))\) (i.e., \({\mathbf {c}}'\) encodes the plaintext \(\Pi ^{*}({\mathbf {q}})\)); moreover, since the verifier only accepts if \({\mathsf {Test}}\big ({\mathsf {pk}},\varvec{t}_{x},{\mathbf {c}}',{\tilde{{\mathbf {c}}}}\big )=1\), where \({\tilde{{\mathbf {c}}}}=({\tilde{c}}_{1},\dots ,\tilde{c}_{m'})\) is the (standard-mode) encoding of \({\mathbf {u}}\), it indeed holds that \(\varvec{t}_{x}(\Pi ^{*}({\mathbf {q}}),{\mathbf {u}}) = {\mathbf {0}}\).

Second, recall that the query \({\mathbf {q}}({\mathbf {r}})\) and state \({\mathbf {u}}({\mathbf {r}})\) are polynomials in the randomness \({\mathbf {r}}\) of \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) of degrees \(d_{Q}\) and \(d_{D}\), respectively. Accordingly, \(\varvec{t}_{x}(\Pi ^{*}({\mathbf {q}}({\mathbf {r}})),{\mathbf {u}}({\mathbf {r}}))\) is a \((d_{Q}d_{D})\)-degree polynomial in the randomness \({\mathbf {r}}\) of the query algorithm \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\). We claim that \(\varvec{t}_{x}(\Pi ^{*}({\mathbf {q}}({\mathbf {r}})),{\mathbf {u}}({\mathbf {r}}))\equiv {\mathbf {0}}\), i.e., it vanishes everywhere; in particular, \(\Pi ^{*}\) is a (perfectly) convincing \(\text{ LIP } \) affine function. Indeed, if that is not the case, then, since \(\varvec{t}_x\) is of degree \(d_{Q}\cdot d_{D}={\mathrm {poly}}(\lambda )\), we can use the extractor \(E\) to break the \({\mathrm {poly}}(\lambda )\)-power one-wayness of the linear-only scheme (see Definitions 5.13 and 5.14). Specifically, given encodings of \({\mathrm {poly}}(\lambda )\) polynomials in \({\mathbf {r}}\), we manage to find a \({\mathrm {poly}}(\lambda )\)-degree polynomial \(\varvec{t}_x(\Pi ^{*}({\mathbf {q}}(\cdot )),{\mathbf {u}}(\cdot ))\) that vanishes on \({\mathbf {r}}\), but not everywhere. \(\square \)

Table 2 Summary of most of our preprocessing \(\text{ SNARK } \) constructions

6.3 Resulting Preprocessing \(\text{ SNARK } \)s

We now state what preprocessing \(\text{ SNARK } \)s we get by applying our different transformations. Let \(\mathcal {C}= \left\{ C_{\ell }\right\} \) be a circuit family where \(C_{\ell }\) is of size \(s=s(\ell )\) and input size \(n=n(\ell )\). Table 2 summarizes (most of) the preprocessing \(\text{ SNARK } \)s obtained from our \(\text{ LIP } \) constructions (from Sects. 3.1 and 3.2) and computational transformations (from Sects. 6.1 and 6.2).

Zero-knowledge and ZAPs. As mentioned before, if the \(\text{ LIP } \) is \(\text{ HVZK } \) then the corresponding preprocessing \(\text{ SNARK } \) is zero-knowledge (against malicious verifiers in the CRS model), provided that the linear-only encryption (or one-way encoding) is rerandomizable; all of our candidate constructions are rerandomizable.

As mentioned in Sect. 3.1.1, both of our \(\text{ LIP } \) constructions based on \(\text{ LPCP } \)s can be made \(\text{ HVZK } \) (either via the general transformation described in Sect. 8 or via construction-specific modifications discussed in Sect. 7). As for the \(\text{ LIP } \) constructions based on traditional \(\text{ PCP } \)s, we need to start with an \(\text{ HVZK } \text{ PCP } \). For efficient constructions of this kind, see [47].

The zero-knowledge preprocessing \(\text{ SNARK } \)s we obtain are arguments of knowledge where the witness can be extracted without a trapdoor on the CRS; this is unlike what happens in typical NIZKs (based on standard assumptions). This property is crucial when recursively composing \(\text{ SNARK } \)s as in [12].

Finally, the zero-knowledge \(\text{ SNARK } \)s we obtain are, in fact, perfect zero-knowledge. Moreover, for the case of publicly-verifiable (perfect) zero-knowledge preprocessing \(\text{ SNARK } \)s, the CRS can be tested, so that (similarly to previous works [56, 64, 78]) we also obtain succinct ZAPs.

7 Two \(\text{ LPCP } \)s for Circuit Satisfaction Problems

We describe two ways of obtaining algebraic \(\text{ LPCP } \)s for boolean circuit satisfaction problems (see Definition 4.5). For simplicity, in this section (as well as in Sects. 3.1 and 3.2), we focus on relations \(\mathcal {R}_{C}\) where \(C\) is a fixed boolean circuit. All the discussions can be naturally extended to relations \(\mathcal {R}_{\mathcal {C}}\) where \(\mathcal {C}\) is a uniform boolean circuit family.

7.1 An \(\text{ LPCP } \) from the Hadamard Code

We begin with a very simple \(\text{ LPCP } \), which is the well-known Hadamard-based \(\text{ PCP } \) of Arora et al. [4] (ALMSS), naturally extended to work over an arbitrary finite field \(\mathbb {F}\). Since we only require soundness against linear proof oracles, there is no need to apply linearity testing and self-correction as in the original \(\text{ PCP } \) of ALMSS. This makes the construction of the \(\text{ LPCP } \) and its analysis even simpler, and has the advantage of yielding knowledge error \(O(1/|\mathbb {F}|)\) with a constant number of queries. The resulting \(\text{ LPCP } \) verifier will have degree \((2 ,2)\), so that the \(\text{ LPCP } \) is algebraic.

We now formulate the properties of the simplified version of the Hadamard-based \(\text{ PCP } \) in the arithmetic setting:

Claim 7.1

Let \(\mathbb {F}\) be a finite field and \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) a boolean circuit of size \(s\). There is a 3-query \(\text{ LPCP } \) for \(\mathcal {R}_{C}\) with knowledge error \(2/|\mathbb {F}|\), query length \(s+s^{2}\), and degree \((2 ,2)\). Furthermore:

  • the \(\text{ LPCP } \) prover \(P_{\scriptscriptstyle {\mathsf {LPCP}}}\) is an arithmetic circuit of size \(O(s^{2})\);

  • the \(\text{ LPCP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) is an arithmetic circuit of size \(O(s^{2})\);

  • the \(\text{ LPCP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) is an arithmetic circuit of size \(O(n)\).

Construction and analysis. It is convenient to formulate our variant of the ALMSS construction for a relation \(\mathcal {R}(x,w)\) defined by an arithmetic (rather than boolean) circuit over \(\mathbb {F}\). In the following, an arithmetic circuit may contain addition gates of unbounded fan-in and multiplication gates of fan-in 2; both types of gates can have unbounded fan-out. A sequence of one or more nodes (gates or inputs) defines the output of the circuit. The size of the circuit is defined as the total number of nodes. We say that an arithmetic circuit \(C:\mathbb {F}^{n} \times \mathbb {F}^{h} \rightarrow \mathbb {F}^{\ell }\) is satisfied on a given input if all of the outputs are 0; that is, the corresponding relation is defined as \(\mathcal {R}_{C} := \{({\mathbf {x}},{\mathbf {w}}) \in \mathbb {F}^{n} \times \mathbb {F}^{h} : C({\mathbf {x}},{\mathbf {w}}) = 0^{\ell }\}\).

The standard problem for boolean circuit satisfaction can be easily reduced to the problem of arithmetic circuit satisfaction with only constant overhead.

Claim 7.2

(From boolean circuit satisfaction to arithmetic circuit satisfaction) Let \(\mathbb {F}\) be a finite field, n an input length parameter, and h a witness length parameter. There exist efficient (in fact, linear-time) transformations \(({\mathsf {arith}} ,{\mathsf {inp}},{\mathsf {wit}},{\mathsf {wit}}^{-1})\) such that, for any boolean circuit \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) (with \({\mathrm {AND}}\), \({\mathrm {OR}}\), and \({\mathrm {NOT}}\) gates) and input \(x\in \{0,1\}^{n}\), the following conditions hold:

  • \({\mathsf {arith}}(C)\) outputs an arithmetic circuit \(C' :\mathbb {F}^{n+1} \times \mathbb {F}^{h} \rightarrow \mathbb {F}^{h+1}\);

  • \({\mathsf {inp}}(x)\) outputs an input \({\mathbf {x}}\in \mathbb {F}^{n+1}\);

  • there is \(w\in \{0,1\}^{h}\) s.t. \(C(x,w)=1\) if and only if there is \({\mathbf {w}}\in \mathbb {F}^{h}\) s.t. \(C'({\mathbf {x}},{\mathbf {w}})=0^{h+1}\);

  • if \((x ,w) \in \mathcal {R}_{C}\) then \({\mathsf {wit}}(C,x,w)\) outputs a witness \({\mathbf {w}}\in \mathbb {F}^{h}\) such that \(({\mathbf {x}} ,{\mathbf {w}}) \in \mathcal {R}_{C'}\);

  • if \(({\mathbf {x}} ,{\mathbf {w}}) \in \mathcal {R}_{C'}\) then \({\mathsf {wit}}^{-1}(C',{\mathbf {x}},{\mathbf {w}})\) outputs a witness \(w\in \{0,1\}^{h}\) such that \((x ,w) \in \mathcal {R}_{C}\).

Proof sketch

Let \({\mathbf {x}}=(-1,x_{1},\dots ,x_{n})\). The circuit \(C'({\mathbf {x}},{\mathbf {w}})\) emulates the computation of \(C\) on \((x,{\mathbf {w}})\), using the constant \(-1\) to emulate \({\mathrm {OR}}\) and \({\mathrm {NOT}}\) gates, and treating the entries of \({\mathbf {w}}\) as bits of \(w\). (For instance, \({\mathrm {NOT}}(z)\) can be computed as \(1+(-1) \cdot z\).) The first output of \(C'\) is equal to \(1-C(x,{\mathbf {w}})\) (for all \({\mathbf {w}}\in \{0,1\}^{h})\). The remaining h outputs of \(C'\) are used to ensure that any satisfying \({\mathbf {w}}\in \mathbb {F}^{h}\) is a bit vector. This is done by outputting \(w_{i} \cdot (1-w_{i})\) for \(i \in \{1,\dots ,h\}\). \(\square \)
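The gate emulation in the proof sketch can be illustrated with a short Python sketch. It assumes a toy prime field \(\mathbb {F}_{101}\) and a hypothetical example circuit \(C(x,w) = (x_{0} \wedge w_{0}) \vee \lnot w_{1}\); the function names are illustrative, and \({\mathrm {OR}}\) is emulated via De Morgan's law, which is one of several possible choices.

```python
P = 101            # toy prime modulus for the field F_P (assumption)
MINUS_ONE = P - 1  # the constant -1 supplied as the extra input coordinate

def NOT(z):
    # 1 + (-1) * z; the constant 1 is itself (-1) * (-1)
    return (MINUS_ONE * MINUS_ONE + MINUS_ONE * z) % P

def AND(a, b):
    return (a * b) % P

def OR(a, b):
    # De Morgan: a OR b = NOT(NOT(a) AND NOT(b))
    return NOT(AND(NOT(a), NOT(b)))

def bool_circuit(x, w):
    """Hypothetical boolean circuit C(x, w) = (x0 AND w0) OR (NOT w1)."""
    return int((x[0] and w[0]) or (not w[1]))

def arith_circuit(x, w):
    """C'(x, w): first output is 1 - C(x, w); the rest enforce that w is a bit vector."""
    first = (1 - OR(AND(x[0], w[0]), NOT(w[1]))) % P
    booleanity = [(wi * (1 - wi)) % P for wi in w]
    return [first] + booleanity

def satisfied(outputs):
    return all(o == 0 for o in outputs)
```

On bit inputs, `satisfied(arith_circuit(x, w))` agrees with `bool_circuit(x, w) == 1`, while any witness entry outside \(\{0,1\}\) makes one of the last \(h\) outputs nonzero.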

We now proceed to describe the construction of the \(\text{ LPCP } \) for an arithmetic circuit \(C:\mathbb {F}^{n} \times \mathbb {F}^{h} \rightarrow \mathbb {F}^{\ell }\). We let \(z_{i}\) denote the value of the i-th wire of \(C\) given input \({\mathbf {x}}\in \mathbb {F}^{n}\) and witness \({\mathbf {w}}\in \mathbb {F}^{h}\), where we assume that \(z_{i}=x_{i}\) for \(i \in \{1,\dots ,n\}\) and \(z_{s-\ell +1},\ldots ,z_{s}\) are the values of the output wires (which are supposedly 0). The honest prover uses a linear oracle \(\varvec{\pi }\) whose coefficient vector includes \(z_{1},\ldots ,z_{s}\), and \(z_{i} \cdot z_{j}\) for all \(i,j \in \{1,\dots ,s\}\). The verifier needs to check that: (1) the coefficient vector is consistent with itself (namely, the entry which supposedly contained \(z_{i}\cdot z_{j}\) is indeed the product of the entries containing \(z_{i}\) and \(z_{j}\)); (2) \(z_{i}=x_{i}\) for \(i \in \{1,\dots ,n\}\); (3) \(z_{s-i}=0\) for \(i \in \{0,\dots ,\ell -1\}\); and (4) for every gate, the given value for its output wire is consistent with the given values for its input wires.

The first condition can be verified with two queries: the first query asks for a random linear combination of the \(z_{i}\) and the second query asks for the linear combination of the \(z_{i} \cdot z_{j}\) that corresponds to the square of the first query. The verifier checks that the second answer is the square of the first answer. It follows from the Schwartz–Zippel Lemma (cf. Lemma 2.1) that if condition (1) is violated then this test will fail except with at most \(2/|\mathbb {F}|\) probability. Conditions (2), (3), and (4) are tested together using a single query by taking a random linear combination with coefficient vector \(\varvec{r}\) of the left-hand sides of the following equations:

  • \(z_{i}=x_{i}\) for \(i=1,\ldots ,n\);

  • \(z_{j}=0\) for \(j=s-(\ell -1),\ldots ,s\);

  • \(\big (\sum _{j=1}^{c} z_{i_{j}} \big )-z_{k}=0\) for each addition gate with input wires \(i_{1},\ldots ,i_{c}\) and output wire k;

  • \(z_{i} \cdot z_{j}-z_{k}=0\) for each multiplication gate with input wires i, j and output wire k.

Note that this random linear combination can be efficiently transformed into a random linear combination of the coefficients \(z_{i}\) and \(z_{i}\cdot z_{j}\). The verifier accepts if the answer is equal to the corresponding linear combination of the right-hand sides, namely to \(\sum _{i=1}^{n} r_{i}x_{i}\). Assuming that condition (1) is satisfied, if any of conditions (2), (3), or (4) is violated then this test passes with probability at most \(1/|\mathbb {F}|\).

Overall, we get a 3-query \(\text{ LPCP } \) with knowledge error \(2 / |\mathbb {F}| = O(1/|\mathbb {F}|)\), query length \(s+ s^{2} = O(s^{2})\) and degree \((2 ,2)\). The construction is summarized as follows:

Construction 7.3

Let \(\mathbb {F}\) be a finite field and \(C:\mathbb {F}^{n} \times \mathbb {F}^{h} \rightarrow \mathbb {F}^{\ell }\) an arithmetic circuit over \(\mathbb {F}\) of size \(s\).

LPCP prover algorithm \(P_{\scriptscriptstyle {\mathsf {LPCP}}}\). Given \(({\mathbf {x}} ,{\mathbf {w}}) \in \mathbb {F}^{n} \times \mathbb {F}^{h}\) such that \(C({\mathbf {x}},{\mathbf {w}}) = 0^{\ell }\), compute the values \(z_{1},\dots ,z_{s}\) of all wires in \(C({\mathbf {x}},{\mathbf {w}})\), and output the linear function \(\varvec{\pi }:\mathbb {F}^{s+ s^{2}} \rightarrow \mathbb {F}\) defined by the coefficients \(z_{i}\) for all \(i \in [s]\) and \(z_{i} \cdot z_{j}\) for all \(i,j \in [s]\).

LPCP query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}.\) The query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) has hardcoded in it:

  • A matrix \(A_{C} \in \mathbb {F}^{s\times (s+ s^{2})}\)

  • A vector \(\varvec{b}_{C} \in \mathbb {F}^{s- n}\)

Both \(A_{C}\) and \(\varvec{b}_{C}\) can be computed efficiently (in fact, in linear time) from \(C\). The query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) outputs queries \(\varvec{q}_{1},\varvec{q}_{2},\varvec{q}_{3} \in \mathbb {F}^{s+ s^{2}}\) that are computed as follows:

  1. sample a random vector \(\varvec{r}= (\varvec{r}_{1}, \varvec{r}_{2}) \in \mathbb {F}^{2s}\); denote by \(\varvec{r}_{{\mathbf {x}}}\) the first n coordinates of \(\varvec{r}_{1}\) and by \(\varvec{r}_{C}\) the last \(s- n\) coordinates of \(\varvec{r}_{1}\);

  2. set \(\varvec{q}_{1} := \varvec{r}_{1} \cdot A_{C}\);

  3. the first \(s\) elements of \(\varvec{q}_{2}\) are the entries of \(\varvec{r}_{2}\) and the last \(s^{2}\) elements are 0;

  4. the first \(s\) elements of \(\varvec{q}_{3}\) are 0 and the last \(s^{2}\) elements are \(\varvec{r}_{2}[i] \cdot \varvec{r}_{2}[j]\) for all \(i,j \in [s]\).

Additionally, the query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) outputs the state information \({\mathbf {u}}= (u_{C},\varvec{r}_{{\mathbf {x}}})\), where \(u_{C} = \left\langle \varvec{r}_{C} , \varvec{b}_{C} \right\rangle \).

LPCP decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}.\) Given input \({\mathbf {x}}\), state \({\mathbf {u}}= (u_{C} ,\varvec{r}_{{\mathbf {x}}})\), and answers \(a_{1},a_{2},a_{3} \in \mathbb {F}\), verify that

$$\begin{aligned} \varvec{t}_{x}({\mathbf {u}}, a_{1},a_{2},a_{3}) := \left( a_{1} - (u_{C} + \left\langle \varvec{r}_{{\mathbf {x}}} , {\mathbf {x}} \right\rangle ) ,\, a_{2}^{2} - a_{3} \right) = (0 ,0) . \end{aligned}$$
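Construction 7.3 can be exercised end to end on a very small instance. The Python sketch below assumes a toy field \(\mathbb {F}_{101}\) and a hypothetical five-wire circuit computing \(x_{0}\cdot w + x_{1}\) (wires are 0-indexed in the code); the constraint matrix `A_C` is written out by hand rather than derived by a general compiler, and all right-hand sides other than the inputs are 0, so \(u_{C}=0\) here.

```python
import random

P = 101  # toy prime field F_P (assumption; real instantiations use a large field)

def inner(u, v):
    return sum(a * b for a, b in zip(u, v)) % P

def prover(z):
    """Coefficient vector of the linear oracle: wire values, then all pairwise products."""
    s = len(z)
    return z + [z[i] * z[j] % P for i in range(s) for j in range(s)]

def unit(s, wire=None, pair=None, coeff=1):
    """Row of length s + s^2 with coeff at wire i, or at product slot (i, j)."""
    row = [0] * (s + s * s)
    idx = wire if pair is None else s + pair[0] * s + pair[1]
    row[idx] = coeff % P
    return row

def add_rows(*rows):
    return [sum(col) % P for col in zip(*rows)]

# Toy circuit: z0 = x0, z1 = x1, z2 = w, z3 = z0 * z2, z4 = z3 + z1 (output, must be 0).
S, N = 5, 2
A_C = [
    unit(S, wire=0),                                                        # z0 = x0
    unit(S, wire=1),                                                        # z1 = x1
    unit(S, wire=4),                                                        # z4 = 0
    add_rows(unit(S, pair=(0, 2)), unit(S, wire=3, coeff=-1)),              # z0*z2 - z3 = 0
    add_rows(unit(S, wire=3), unit(S, wire=1), unit(S, wire=4, coeff=-1)),  # z3 + z1 - z4 = 0
]

def query(rng):
    r1 = [rng.randrange(P) for _ in range(S)]
    r2 = [rng.randrange(P) for _ in range(S)]
    q1 = [inner(r1, col) for col in zip(*A_C)]   # r1 * A_C
    q2 = r2 + [0] * (S * S)
    q3 = [0] * S + [r2[i] * r2[j] % P for i in range(S) for j in range(S)]
    u = (0, r1[:N])                              # (u_C, r_x), with u_C = 0 in this toy circuit
    return (q1, q2, q3), u

def decide(x, u, answers):
    u_C, r_x = u
    a1, a2, a3 = answers
    return (a1 - (u_C + inner(r_x, x))) % P == 0 and (a2 * a2 - a3) % P == 0

def run(x, z, rng):
    queries, u = query(rng)
    pi = prover(z)
    return decide(x, u, [inner(q, pi) for q in queries])
```

With \(x=(3,86)\) and \(w=5\), the wire values are `[3, 86, 5, 15, 0]` and `run` always accepts. With \(w=6\), the wire values `[3, 86, 6, 18, 3]` violate only the output constraint, and `run` accepts exactly when the corresponding coordinate of \(\varvec{r}_{1}\) is 0, i.e., with probability \(1/101\).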

Remark 7.4

(\(\text{ HVZK } \) variant) To make the \(\text{ LPCP } \) of Claim 7.1 an \(\text{ HVZK } \text{ LPCP } \), we could apply the general transformation presented in Sect. 8. However, there is a transformation specific to the \(\text{ LPCP } \) of Claim 7.1 that introduces almost no overhead. Essentially, the prover can simply concatenate a random element to the linear function he generates; this can be interpreted as adding a dummy wire to the circuit and assigning to it a random value; the verifier then reasons in terms of this new circuit. In fact, the idea we just described is a very simple instance of the transformation in Sect. 8, which generalizes it to work for any \(\text{ LPCP } \).

7.2 An \(\text{ LPCP } \) from Quadratic Span Programs

Next, we note that an \(\text{ LPCP } \) with linear query algorithm and quasilinear prover algorithm can be easily obtained by going through the quadratic span programs of Gennaro et al. [56]; this gain is at the expense of the degree of the query algorithm which now becomes \(O(s)\) instead of 2.

Claim 7.5

Let \(\mathbb {F}\) be a finite field and \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) a boolean circuit of size \(s\). There is a 3-query \(\text{ LPCP } \) for \(\mathcal {R}_{C}\) with knowledge error \(O(s/|\mathbb {F}|)\), query length \(O(s)\), and degree \((O(s) ,2)\). Furthermore:

  • the \(\text{ LPCP } \) prover \(P_{\scriptscriptstyle {\mathsf {LPCP}}}\) is an arithmetic circuit of size \(\widetilde{O}(s)\);

  • the \(\text{ LPCP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) is an arithmetic circuit of size \(O(s)\);

  • the \(\text{ LPCP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) is an arithmetic circuit of size \(O(n)\).

Construction and analysis. We present the definition of a quadratic span program satisfiability problem, based on the definition of a strong quadratic-span program (\(\text{ QSP } \)) in [56].

Definition 7.6

[56, Definition 5] Let \(\mathbb {F}\) be a finite field. The \(\text{ QSP } \) satisfiability problem over \(\mathbb {F}\) of a (strong) \(\text{ QSP } \) \(Q= (n,m,d,\mathbb {V},\mathbb {W},t)\), where \(\mathbb {V}= \big (v_{i}(z)\big )_{0\le i \le m}\) and \(\mathbb {W}= \big (w_{i}(z) \big )_{0\le i \le m}\) are two vectors of polynomials of degree \(d\) over \(\mathbb {F}\) and \(t\) is a polynomial over \(\mathbb {F}\), is the relation:

$$\begin{aligned} \mathcal {R}_{Q}:= & {} \left\{ \big ( {\mathbf {x}}, (\varvec{a} ,\varvec{b},\varvec{h}) \big ) : t(z) \cdot \left( \sum _{i=0}^{d-1}{h_{i} \cdot z^{i}} \right) \right. \\= & {} \left. \left( v_{0}(z) + \sum _{i=1}^{n}{x_{i} \cdot v_{i}(z)} + \sum _{i=1}^{m-n}{a_{i} \cdot v_{i+n}(z)}\right) \cdot \left( w_{0}(z) + \sum _{i=1}^m{b_{i} \cdot w_{i}(z)}\right) \right\} , \end{aligned}$$

where \({\mathbf {x}}\in \mathbb {F}^{n}\), \(\varvec{a}\in \mathbb {F}^{m- n}\), \(\varvec{b}\in \mathbb {F}^{m}\), and \(\varvec{h}\in \mathbb {F}^{d}\).

Gennaro et al. [56] have shown that there is a very efficient reduction from boolean circuit satisfiability problems to \(\text{ QSP } \) satisfiability problems:

Claim 7.7

[56] Let \(\mathbb {F}\) be a field. There exist transformations \(({\mathsf {qsp}} ,{\mathsf {inp}},{\mathsf {wit}},{\mathsf {wit}}^{-1})\) such that, for any boolean circuit \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) and input \(x\in \{0,1\}^{n}\), the following conditions hold:

  • \({\mathsf {qsp}}(C)\) outputs a \(\text{ QSP } \) \(Q\) where \(m= O(|C|)\), \(d= O(|C|)\), and \(n' = O(n)\); moreover, \(Q\) is sparsely represented;

  • \({\mathsf {inp}}(x)\) outputs an input \({\mathbf {x}}\in \mathbb {F}^{n'}\);

  • there is \(w\in \{0,1\}^{h}\) s.t. \(C(x,w)=1\) if and only if there is \({\mathbf {w}}\) s.t. \(({\mathbf {x}},{\mathbf {w}}) \in \mathcal {R}_{Q}\);

  • if \((x,w) \in \mathcal {R}_{C}\) then \({\mathsf {wit}}(C,x,w)\) outputs a witness \({\mathbf {w}}\) such that \(({\mathbf {x}},{\mathbf {w}}) \in \mathcal {R}_{Q}\);

  • if \(({\mathbf {x}},{\mathbf {w}}) \in \mathcal {R}_{Q}\) then \({\mathsf {wit}}^{-1}(Q,{\mathbf {x}},{\mathbf {w}})\) outputs a witness \(w\in \{0,1\}^{h}\) such that \((x,w) \in \mathcal {R}_{C}\).

Moreover, \({\mathsf {qsp}},{\mathsf {inp}},{\mathsf {wit}}^{-1}\) run in linear time and \({\mathsf {wit}}\) runs in quasilinear time.

We now give a construction of an \(\text{ LPCP } \) for a \(\text{ QSP } \) satisfiability problem:

Construction 7.8

Let \(\mathbb {F}\) be a field and \(Q= (n,m,d,\mathbb {V},\mathbb {W},t)\) a (strong) QSP.

LPCP prover algorithm \(P_{\scriptscriptstyle {\mathsf {LPCP}}}.\) Given \(({\mathbf {x}} ,(\varvec{a} ,\varvec{b},\varvec{h}))\) such that \(({\mathbf {x}},(\varvec{a} ,\varvec{b},\varvec{h})) \in \mathcal {R}_{Q}\), output the linear function \(\varvec{\pi }:\mathbb {F}^{2m- n+ d} \rightarrow \mathbb {F}\) defined by \(\varvec{\pi }:= (\varvec{a} ,\varvec{b},\varvec{h})\) (i.e., the concatenation of the coefficients vectors).

LPCP query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}.\) Select a random point \(r\in \mathbb {F}\) and output the queries \(\varvec{q}_{1},\varvec{q}_{2},\varvec{q}_{3} \in \mathbb {F}^{2m- n+d}\) defined as follows:

  • the first \(m-n\) elements of \(\varvec{q}_{1}\) are \(v_{i}(r)\) for all \(n< i \le m\) and the last \(m+ d\) elements are 0;

  • the first \(m-n\) and the last \(d\) elements of \(\varvec{q}_{2}\) are 0 and the remaining \(m\) elements are \(w_{i}(r)\) for all \(0 < i \le m\);

  • the first \((m-n) + m\) elements of \(\varvec{q}_{3}\) are 0 and the last \(d\) elements are \(r^{i}\) for all \(0 \le i < d\).

Additionally, the algorithm outputs the state information \({\mathbf {u}}:= (\left\{ v_{i}(r)\right\} _{0\le i \le n} ,w_{0}(r),t(r)) \in \mathbb {F}^{n+ 2}\).

LPCP decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}.\) Given input \(x\), state \({\mathbf {u}}= (\left\{ v_{i}(r)\right\} _{0\le i \le n} ,w_{0}(r),t(r))\), and answers \(a_{1},a_{2},a_{3}\), verify that

$$\begin{aligned} \varvec{t}_{x}({\mathbf {u}}, a_{1},a_{2},a_{3}) := \left( v_{0}(r) + \sum _{i=1}^{n}{x_{i} \cdot v_{i}(r)} + a_{1} \right) \cdot (w_{0}(r) + a_{2}) - a_{3} \cdot t(r) = 0 . \end{aligned}$$
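The three algorithms of Construction 7.8 can be sketched in Python on a hand-made instance. The toy \(\text{ QSP } \) below (with \(n=1\), \(m=2\), \(d=2\), over \(\mathbb {F}_{101}\)) is a hypothetical example chosen so that the relation is easy to check by hand; it is not the output of the reduction of Claim 7.7, and the helper names are illustrative.

```python
import random

P = 101  # toy prime field F_P (assumption)

def ev(poly, r):
    """Evaluate a polynomial (coefficient list, low degree first) at r via Horner's rule."""
    acc = 0
    for c in reversed(poly):
        acc = (acc * r + c) % P
    return acc

def inner(u, v):
    return sum(a * b for a, b in zip(u, v)) % P

# Hypothetical toy QSP with n = 1, m = 2, d = 2.
N, M, D = 1, 2, 2
V = [[0], [0, 1], [1]]   # v_0 = 0, v_1 = z, v_2 = 1
W = [[1], [0], [0, 1]]   # w_0 = 1, w_1 = 0, w_2 = z
T = [0, -1, 1]           # t(z) = z^2 - z

def prover(a, b, h):
    """Linear oracle: concatenation of the coefficient vectors a, b, h."""
    return a + b + h

def query(rng):
    r = rng.randrange(P)
    q1 = [ev(V[i], r) for i in range(N + 1, M + 1)] + [0] * (M + D)
    q2 = [0] * (M - N) + [ev(W[i], r) for i in range(1, M + 1)] + [0] * D
    q3 = [0] * (2 * M - N) + [pow(r, i, P) for i in range(D)]
    u = ([ev(V[i], r) for i in range(N + 1)], ev(W[0], r), ev(T, r))
    return (q1, q2, q3), u

def decide(x, u, answers):
    v_vals, w0, t_r = u
    a1, a2, a3 = answers
    lhs = v_vals[0] + sum(xi * vi for xi, vi in zip(x, v_vals[1:])) + a1
    return (lhs * (w0 + a2) - a3 * t_r) % P == 0

def run(x, witness, rng):
    queries, u = query(rng)
    pi = prover(*witness)
    return decide(x, u, [inner(q, pi) for q in queries])
```

For \(x=(7)\), the witness \(\varvec{a}=(0)\), \(\varvec{b}=(5,-1)\), \(\varvec{h}=(-7,0)\) satisfies \(7z(1-z) = -7\cdot (z^{2}-z)\), so `run` always accepts. Changing \(\varvec{b}\) to \((5,1)\) leaves the residual polynomial \(14z^{2}\), which vanishes only at \(r=0\), so `run` then accepts with probability \(1/101\).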

We now prove that Construction 7.8 has the desired properties. Correctness follows from the construction. Next, fix a linear function \(\varvec{\pi }^{*} :\mathbb {F}^{2m- n+ d} \rightarrow \mathbb {F}\) and view it as \(\varvec{\pi }^{*} = (\varvec{a}^{*} ,\varvec{b}^{*},\varvec{h}^{*})\) for some \(\varvec{a}^{*} \in \mathbb {F}^{m- n}\), \(\varvec{b}^{*} \in \mathbb {F}^{m}\), and \(\varvec{h}^{*} \in \mathbb {F}^{d}\). Suppose that \(V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)\) accepts with more than \(2d/|\mathbb {F}|\) probability. By construction, we have that:

$$\begin{aligned}&\mathop {\mathrm{Pr}}\limits _{r\leftarrow \mathbb {F}}\left[ \left( v_{0}(r) + \sum _{i=1}^{n}{x_{i} \cdot v_{i}(r)} + \sum _{i=1}^{m-n}{a_{i}^{*} \cdot v_{i+n}(r)} \right) \cdot \left( w_{0}(r) + \sum _{i=1}^{m}{b_{i}^{*} \cdot w_{i}(r)} \right) \right. \\&\quad -\left. \left( \sum _{i=0}^{d-1}{h_{i}^{*} \cdot r^{i}}\right) \cdot t(r) = 0\right] >\frac{2d}{|\mathbb {F}|} . \end{aligned}$$

By the Schwartz–Zippel Lemma (cf. Lemma 2.1), we deduce that:

$$\begin{aligned}&\left( v_{0}(z) + \sum _{i=1}^{n}{x_{i} \cdot v_{i}(z)} + \sum _{i=1}^{m-n}{a_{i}^{*} \cdot v_{i+n}(z)} \right) \cdot \left( w_{0}(z) + \sum _{i=1}^{m}{b_{i}^{*} \cdot w_{i}(z)} \right) \\&\quad = \left( \sum _{i=0}^{d-1}{h_{i}^{*} \cdot z^{i}}\right) \cdot t(z) . \end{aligned}$$
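To spell out the counting behind this step (an elaboration of ours, under the assumption that each \(v_{i}\) and \(w_{i}\) has degree at most \(d\) and that \(t\) has degree \(d\), as the stated degree \((d ,2)\) suggests), let \(p(z)\) denote the difference of the two sides:

$$\begin{aligned} p(z) := \left( v_{0}(z) + \sum _{i=1}^{n}{x_{i} \cdot v_{i}(z)} + \sum _{i=1}^{m-n}{a_{i}^{*} \cdot v_{i+n}(z)} \right) \cdot \left( w_{0}(z) + \sum _{i=1}^{m}{b_{i}^{*} \cdot w_{i}(z)} \right) - \left( \sum _{i=0}^{d-1}{h_{i}^{*} \cdot z^{i}}\right) \cdot t(z) . \end{aligned}$$

Under this assumption \(\deg p \le 2d\), so if \(p\) were not identically zero it would vanish on at most \(2d\) points of \(\mathbb {F}\), giving

$$\begin{aligned} \mathop {\mathrm{Pr}}\limits _{r\leftarrow \mathbb {F}}\left[ p(r) = 0\right] \le \frac{2d}{|\mathbb {F}|} , \end{aligned}$$

which contradicts the assumed acceptance probability. Hence \(p\) is the zero polynomial, which is exactly the identity above.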

We conclude that \(({\mathbf {x}} ,(\varvec{a}^{*} ,\varvec{b}^{*},\varvec{h}^{*})) \in \mathcal {R}_{Q}\). We thus have a 3-query \(\text{ LPCP } \) for \(\mathcal {R}_{Q}\) with knowledge error \(2d/|\mathbb {F}|\), query length \(2m- n+ d\), and degree \((d ,2)\). Furthermore, the \(\text{ LPCP } \) prover \(P_{\scriptscriptstyle {\mathsf {LPCP}}}\) runs in time \(O(m+ d)\), the \(\text{ LPCP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) runs in time \(O(m+ d)\) (since it only has to produce random evaluations of polynomials), and the \(\text{ LPCP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) runs in time \(O(n)\).

Invoking Claim 7.7, we obtain a 3-query \(\text{ LPCP } \) for \(\mathcal {R}_{C}\), where \(C:\{0,1\}^{n} \times \{0,1\}^{h} \rightarrow \{0,1\}\) is a boolean circuit of size \(s\), with knowledge error \(O(s/|\mathbb {F}|)\), query length \(O(s)\), and degree \((O(s) ,2)\). Furthermore, the \(\text{ LPCP } \) prover \(P_{\scriptscriptstyle {\mathsf {LPCP}}}\) runs in time \(\widetilde{O}(s)\), the \(\text{ LPCP } \) query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) runs in time \(O(s)\), and the \(\text{ LPCP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) runs in time \(O(n)\).

Remark 7.9

(\(\text{ HVZK } \) variant) To make the \(\text{ LPCP } \) of Claim 7.5 an \(\text{ HVZK } \text{ LPCP } \), we could apply the general transformation presented in Sect. 8. However, Gennaro et al. [56] give a transformation for the specific case of \(\text{ QSP } \)s that introduces almost no overhead. Their transformation is rather different from our general transformation and exploits special features of \(\text{ QSP } \)s. At a very high level, our general transformation first pads each of the \(\text{ LPCP } \) answers with a random field element, and then adds terms to the proof that allow the verifier to cancel the noise without leaking any further information. In particular, in our transformation, the verifier has to make sure that these additional terms are properly structured, and doing so requires additional tests. In contrast, [56] exploit the special \(\text{ QSP } \) structure: rather than padding the \(\text{ LPCP } \) answers with random field elements, they add randomized multiples of the target polynomial \(t\) to the answers; this already suffices to force the prover to stick to the proper structure, and requires no additional structure tests.

8 \(\text{ HVZK } \) for \(\text{ LPCP } \)s with Low-Degree Decision Algorithm

We describe a general transformation that takes any \(\text{ LPCP } \) with \(d_{D}=2\) to a corresponding \(\text{ LPCP } \) that is \(\text{ HVZK } \), with only a small overhead in parameters. (With additional work, the transformation can in fact be extended to work for \(d_{D}= O(1)\).)

Theorem 8.1

There exists an efficient compiler that takes any \(\text{ LPCP } \) \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) with \(d_{D}=2\) into an \(\text{ LPCP } \) \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) with the following properties:

  • if \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has knowledge error \(\varepsilon \) then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) has knowledge error \(\varepsilon +O(1/|\mathbb {F}|)\);

  • \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) is \(O(1/|\mathbb {F}|)\)-statistical \(\text{ HVZK } \) (and can be made perfect \(\text{ HVZK } \) if \(V_{\scriptscriptstyle {\mathsf {LPCP}}}'\) is allowed to sample its randomness from \(\mathbb {F}-\{0\}\) instead of \(\mathbb {F}\));

  • if \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has query length \(m\), \(k\) queries, private state of length \(m'\), \(\eta \) test polynomials, and degree \((d_{Q} ,2)\), then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) has query length \(O(\eta k^{4}(m+m'))\), \(O(k+\eta )\) queries, private state of length \(O(k^{2}m')\), \(O(\eta )\) test polynomials, and degree \((d_{Q} ,2)\).

Remark 8.2

(Parameters) Typically (and this is the case for both of the \(\text{ LPCP } \) constructions of Sect. 7), \(k\) and \(\eta \) are constants and \(m'<m\) (e.g., \(m\) is proportional to the size of the circuit to be verified and \(m'\) is proportional to the input size of the circuit). For such typical choices, \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) has the same parameters as \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\), up to constant factors.

Of course, because the transformation guaranteed by Theorem 8.1 is generic, optimizations are often possible in specific cases to, e.g., reduce the number of queries. For instance, the two \(\text{ LPCP } \) constructions described in Sect. 7 have \(\text{ HVZK } \) variants with almost no overhead (see Remarks 7.4 and 7.9).

The rest of this section contains the proof of Theorem 8.1. Let us begin with the high-level idea.

The high-level idea. The \(\text{ LPCP } \) prover adds to each of the original \(\text{ LPCP } \) answers \(a_{i}\) a random element \(\xi _{i}\); clearly, the distribution of these new answers is independent of the witness used by the prover, and thus the new answers are \(\text{ HVZK } \). Unfortunately, this modification allows the \(\text{ LPCP } \) verifier to compute only a noisy evaluation of the test polynomial, namely \(\varvec{t}_{x}({\mathbf {u}},{\mathbf {a}}+{\mathbf {\xi }})\), whereas the verifier needs a noiseless evaluation. To recover soundness without breaking \(\text{ HVZK } \), the verifier needs to learn the difference \(\varvec{\Delta }=\varvec{t}_{x}({\mathbf {u}},{\mathbf {a}}+{\mathbf {\xi }})-\varvec{t}_{x}({\mathbf {u}},{\mathbf {a}})\), but nothing else. We show how the prover can enable this by adding suitable cross terms to the linear function, and letting the verifier test that the new linear function, which includes these cross terms, has the correct “structure.” We now concretize this high-level idea by proceeding in three steps.

Step 1: soundness against provers respecting quadratic tensor relations. In our transformation, an honest \(\text{ LPCP } \) proof \(\varvec{\pi }\) is mapped to a new proof \(\varvec{\pi }' = (\varvec{\pi }_{1},\dots ,\varvec{\pi }_{c})\) that must satisfy a set \(\mathcal {T}\) of quadratic tensor relations; a quadratic tensor relation is a triple \((i,j,k) \in [c]^{3}\) that corresponds to the requirement \(\varvec{\pi }_{i} = \varvec{\pi }_{j} \otimes \varvec{\pi }_{k}\) (see Footnote 11). Thus, we can write:

$$\begin{aligned} \mathcal {T}= \left\{ \varvec{\pi }_{i_{\ell }} = \varvec{\pi }_{j_{\ell }} \otimes \varvec{\pi }_{k_{\ell }}\right\} _{\ell =1}^{\beta } , \end{aligned}$$

where \( {\mathbf {a}} \otimes {\mathbf {b}} = (a_{i} \cdot b_{j})_{i,j}\) is the tensor product of vectors \({\mathbf {a}}\) and \({\mathbf {b}}\). We say that \(\beta \) is the size of \(\mathcal {T}\). In our construction, \(\mathcal {T}\) is independent of the input \(x\), so we restrict our attention to this case.

We begin by showing that any \(\text{ LPCP } \) that is secure against \(\mathcal {T}\)-respecting provers (for some given \(\mathcal {T}\)) can be transformed into a corresponding \(\text{ LPCP } \) against arbitrary provers, with a small overhead in parameters and while maintaining \(\text{ HVZK } \). For a given set of quadratic tensor relations \(\mathcal {T}\), we say that an \(\text{ LPCP } \) has soundness (or knowledge) error \(\varepsilon \) against \(\mathcal {T}\)-respecting provers if it has soundness (or knowledge) error \(\varepsilon \) against proofs that satisfy all the quadratic tensor relations in \(\mathcal {T}\).

Claim 8.3

There exists an efficient \(\text{ LPCP } \) transformation \(\mathbb {T}\) such that, for any set of quadratic tensor relations \(\mathcal {T}\), the following properties hold:

  • Security: If \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is an \(\text{ LPCP } \) for a relation \(\mathcal {R}\) with knowledge error \(\varepsilon \) against \(\mathcal {T}\)-respecting provers, then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}') := \mathbb {T}(\mathcal {T},P_{\scriptscriptstyle {\mathsf {LPCP}}},V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is an \(\text{ LPCP } \) for \(\mathcal {R}\) with knowledge error \(\varepsilon +O(1/{|\mathbb {F}|})\).

  • Zero-Knowledge Preservation: If \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is \(\delta \)-statistical \(\text{ HVZK } \) then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) is \((\delta + 1/|\mathbb {F}|)\)-statistical \(\text{ HVZK } \). (If \(V_{\scriptscriptstyle {\mathsf {LPCP}}}'\) is allowed to sample randomness from \(\mathbb {F}-\{0\}\) instead of \(\mathbb {F}\) then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) can be made \(\delta \)-statistical \(\text{ HVZK } \).)

  • Efficiency: If \(\mathcal {T}\) has size \(\beta \), \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has query length \(m\), \(k\) queries, \(\eta \) test polynomials, and degree \((d_{Q} ,d_{D})\), then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) has query length \(O(m)\), \(k+O(\beta )\) queries, \(\eta +\beta \) test polynomials, and degree \((d_{Q} ,\max \left\{ d_{D},2\right\} )\).

Proof sketch

We sketch the transformation \(\mathbb {T}\) and an argument for its correctness; a detailed description of the transformation can be found after the proof sketch, in Construction 8.4.

The transformation leverages the fact that any quadratic tensor relation can be checked with only 3 queries and a quadratic decision predicate. Concretely, given two vectors \({\mathbf {a}}\) and \({\mathbf {b}}\) and another vector \({\mathbf {c}}\), to test whether \({\mathbf {c}}={\mathbf {a}}\otimes {\mathbf {b}}\), we can sample two random vectors \({\mathbf {r}}_{{\mathbf {a}}}\) and \({\mathbf {r}}_{{\mathbf {b}}}\), and test whether

$$\begin{aligned} \left\langle {\mathbf {c}} , {\mathbf {r}}_{{\mathbf {a}}}\otimes {\mathbf {r}}_{{\mathbf {b}}} \right\rangle = \left\langle {\mathbf {a}} , {\mathbf {r}}_{{\mathbf {a}}} \right\rangle \cdot \left\langle {\mathbf {b}} , {\mathbf {r}}_{{\mathbf {b}}} \right\rangle . \end{aligned}$$

By the Schwartz–Zippel Lemma (see Lemma 2.1), if the quadratic tensor relation does not hold, then the test passes with probability at most \(2/|\mathbb {F}|\).
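The 3-query test just described can be sketched in a few lines of Python. This is a minimal illustration, assuming a prime field \(\mathbb {F}_{P}\) with a modulus `P` of our choosing; the function names are hypothetical. The test takes the random vectors as explicit arguments so that its behavior is easy to reproduce.

```python
import random

P = 2**61 - 1  # prime modulus standing in for the field F (assumed choice)

def inner(u, v):
    """Inner product over F_P."""
    return sum(x * y for x, y in zip(u, v)) % P

def tensor(a, b):
    """Flattened tensor product (a_i * b_j)_{i,j}, in row-major order."""
    return [x * y % P for x in a for y in b]

def tensor_test(a, b, c, r_a, r_b):
    """Check <c, r_a (x) r_b> = <a, r_a> * <b, r_b> with one query to each of a, b, c."""
    return inner(c, tensor(r_a, r_b)) == inner(a, r_a) * inner(b, r_b) % P

def random_tensor_test(a, b, c, rng):
    """Run the test with freshly sampled random vectors r_a and r_b."""
    r_a = [rng.randrange(P) for _ in a]
    r_b = [rng.randrange(P) for _ in b]
    return tensor_test(a, b, c, r_a, r_b)
```

If \({\mathbf {c}}={\mathbf {a}}\otimes {\mathbf {b}}\) the two sides agree for every choice of \({\mathbf {r}}_{{\mathbf {a}}},{\mathbf {r}}_{{\mathbf {b}}}\); otherwise the difference of the two sides is a nonzero polynomial of total degree 2 in the entries of \({\mathbf {r}}_{{\mathbf {a}}}\) and \({\mathbf {r}}_{{\mathbf {b}}}\), which is where the \(2/|\mathbb {F}|\) bound comes from.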

Thus, letting \(\varvec{\pi }^{*} = (\varvec{\pi }_{1},\dots ,\varvec{\pi }_{c})\) be a potentially malicious proof, if we want to test whether \(\varvec{\pi }^{*}\) satisfies the set of quadratic tensor relations \(\mathcal {T}\), we proceed as follows. For each \((i,j,k) \in \mathcal {T}\), generate queries \(\varvec{q}_{i},\varvec{q}_{j},\varvec{q}_{k}\) to \(\varvec{\pi }^{*}\) so that

$$\begin{aligned} \left\langle \varvec{\pi }^{*} , \varvec{q}_{i} \right\rangle&= \left\langle \varvec{\pi }_{i} , {\mathbf {r}}_{j}\otimes {\mathbf {r}}_{k} \right\rangle \\ \left\langle \varvec{\pi }^{*} , \varvec{q}_{j} \right\rangle&= \left\langle \varvec{\pi }_{j} , {\mathbf {r}}_{j} \right\rangle \\ \left\langle \varvec{\pi }^{*} , \varvec{q}_{k} \right\rangle&= \left\langle \varvec{\pi }_{k} , {\mathbf {r}}_{k} \right\rangle \end{aligned}$$

where \({\mathbf {r}}_{j}\) and \({\mathbf {r}}_{k}\) are random vectors of suitable length (see Footnote 12), and then test whether \(\left\langle \varvec{\pi }^{*} , \varvec{q}_{i} \right\rangle = \left\langle \varvec{\pi }^{*} , \varvec{q}_{j} \right\rangle \cdot \left\langle \varvec{\pi }^{*} , \varvec{q}_{k} \right\rangle \).

While the solution described in the previous paragraph does provide suitable security and efficiency guarantees, it does not preserve \(\text{ HVZK } \), because the additional “structure” queries may reveal information about the witness. To fix this problem, we proceed as follows. For each quadratic tensor relation \({\mathbf {c}}={\mathbf {a}}\otimes {\mathbf {b}}\), we modify the part of the proof corresponding to it, as well as the verifier’s “structure” queries for it. Specifically, we extend the vectors \({\mathbf {a}},{\mathbf {b}},{\mathbf {c}}\) to \({\mathbf {a}}'=({\mathbf {a}}|\xi _{{\mathbf {a}}})\), \({\mathbf {b}}'=({\mathbf {b}}|\xi _{{\mathbf {b}}})\), \({\mathbf {c}}' = {\mathbf {a}}'\otimes {\mathbf {b}}'\), where \(\xi _{{\mathbf {a}}}\) and \(\xi _{{\mathbf {b}}}\) are random field elements; similarly, the verifier’s “structure” queries \({\mathbf {r}}_{{\mathbf {a}}},{\mathbf {r}}_{{\mathbf {b}}},{\mathbf {r}}_{{\mathbf {a}}}\otimes {\mathbf {r}}_{{\mathbf {b}}}\) are extended to \({\mathbf {r}}'_{{\mathbf {a}}}=({\mathbf {r}}_{{\mathbf {a}}}|r_{{\mathbf {a}}})\), \({\mathbf {r}}'_{{\mathbf {b}}}=({\mathbf {r}}_{{\mathbf {b}}}|r_{{\mathbf {b}}})\), \({\mathbf {r}}'_{{\mathbf {a}}}\otimes {\mathbf {r}}'_{{\mathbf {b}}}\), where \(r_{{\mathbf {a}}}\) and \(r_{{\mathbf {b}}}\) are random field elements. The two query answers \(\left\langle {\mathbf {a}}' , {\mathbf {r}}'_{{\mathbf {a}}} \right\rangle \) and \(\left\langle {\mathbf {b}}' , {\mathbf {r}}'_{{\mathbf {b}}} \right\rangle \) are now truly random conditioned on \(r_{{\mathbf {a}}}\) and \(r_{{\mathbf {b}}}\) being non-zero, and hence can be simulated (see Footnote 13). In addition, \(\left\langle {\mathbf {c}}' , {\mathbf {r}}'_{{\mathbf {a}}}\otimes {\mathbf {r}}'_{{\mathbf {b}}} \right\rangle \) can be simulated by taking the product of the simulated answers.
As for the queries of the original \(\text{ LPCP } \), these can be appropriately padded with zeros so that the answers to them are not affected by the changes to the proof; the answers to these queries are simulatable, when assuming that the original \(\text{ LPCP } \) is \(\text{ HVZK } \). Overall, the modification to preserve \(\text{ HVZK } \) does not change the number of queries, and increases their length \(m\) by at most a factor of \(1+\frac{1}{m} < 2\).

This concludes the proof sketch for Claim 8.3. \(\square \)
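The \(\text{ HVZK } \)-preserving padding used in the proof sketch can be illustrated as follows. The Python sketch below is a minimal illustration under our own assumptions: a prime field with a modulus `P` of our choosing, hypothetical function names, and padding randomness \(r_{{\mathbf {a}}},r_{{\mathbf {b}}}\) sampled from \(\mathbb {F}-\{0\}\) (the variant in which the answers are exactly uniform).

```python
import random

P = 2**61 - 1  # prime modulus standing in for the field F (assumed choice)

def inner(u, v):
    """Inner product over F_P."""
    return sum(x * y for x, y in zip(u, v)) % P

def tensor(a, b):
    """Flattened tensor product (a_i * b_j)_{i,j}, in row-major order."""
    return [x * y % P for x in a for y in b]

def padded_structure_answers(a, b, rng):
    """Answers to the three padded "structure" queries for c = a (x) b."""
    a_ext = a + [rng.randrange(P)]        # a' = (a | xi_a)
    b_ext = b + [rng.randrange(P)]        # b' = (b | xi_b)
    c_ext = tensor(a_ext, b_ext)          # c' = a' (x) b'
    # Extended queries; the last entry of each is drawn from F - {0}.
    r_a = [rng.randrange(P) for _ in a] + [rng.randrange(1, P)]
    r_b = [rng.randrange(P) for _ in b] + [rng.randrange(1, P)]
    return inner(a_ext, r_a), inner(b_ext, r_b), inner(c_ext, tensor(r_a, r_b))

def simulate_structure_answers(rng):
    """Simulator: two uniform field elements and their product; no witness needed."""
    s_a, s_b = rng.randrange(P), rng.randrange(P)
    return s_a, s_b, s_a * s_b % P
```

In both the real and the simulated triple, the third value is the product of the first two, so the verifier's quadratic check passes in either case; the first two real answers are masked by \(\xi _{{\mathbf {a}}}r_{{\mathbf {a}}}\) and \(\xi _{{\mathbf {b}}}r_{{\mathbf {b}}}\), respectively.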

Construction 8.4

(Transformation for Claim 8.3) Let \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) be an \(\text{ LPCP } \) for \(\mathcal {R}\) that is secure against \(\mathcal {T}\)-respecting provers, for some set of quadratic tensor relations \(\mathcal {T}\) of size \(\beta \). Parse a proof \(\varvec{\pi }\) as follows:

$$\begin{aligned} \varvec{\pi }= \left( \begin{array}{c} \varvec{\pi }_{1} \\ \vdots \\ \varvec{\pi }_{\alpha }\\ \varvec{\pi }_{\alpha +1} \\ \vdots \\ \varvec{\pi }_{\alpha +\beta } \end{array}\right) = \left( \begin{array}{c} \varvec{\pi }_{1} \\ \vdots \\ \varvec{\pi }_{\alpha }\\ \varvec{\pi }_{j_{1}}\otimes \varvec{\pi }_{k_{1}} \\ \vdots \\ \varvec{\pi }_{j_{\beta }}\otimes \varvec{\pi }_{k_{\beta }} \end{array}\right) , \end{aligned}$$

where \(c = \alpha +\beta \) is the number of components in \(\varvec{\pi }\) and \((\alpha +1,j_{1},k_{1}),\dots ,(\alpha +\beta ,j_{\beta },k_{\beta })\) are the \(\beta \) triples in \(\mathcal {T}\). (We can always relabel triples in \(\mathcal {T}\) so that the \(\beta \) components of \(\varvec{\pi }\) constrained by a tensor relation in \(\mathcal {T}\) appear after any unconstrained component.) Note that each of \(j_{1},\dots ,j_{\beta },k_{1},\dots ,k_{\beta }\) can be any index in \([\alpha +\beta ]\) (always subject to the condition that there are no “cycles” in \(\mathcal {T}\)).

Construct the \(\text{ LPCP } \) \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) as follows.

  • The prover \(P_{\scriptscriptstyle {\mathsf {LPCP}}}'\). Given \((x ,w) \in \mathcal {R}\), \(P_{\scriptscriptstyle {\mathsf {LPCP}}}'\) invokes \(P_{\scriptscriptstyle {\mathsf {LPCP}}}(x,w)\) to obtain a proof \(\varvec{\pi }\) as above, samples random \({\mathbf {\xi }}\in \mathbb {F}^{\alpha +\beta }\), and outputs the proof \(\varvec{\pi }'\) defined as follows:

    $$\begin{aligned} \varvec{\pi }' = \left( \begin{array}{c} \varvec{\pi }_{1}' \\ \vdots \\ \varvec{\pi }_{\alpha }'\\ \varvec{\pi }_{\alpha +1}' \\ \vdots \\ \varvec{\pi }_{\alpha +\beta }' \end{array}\right) = \left( \begin{array}{c} (\varvec{\pi }_{1}|\xi _{1}) \\ \vdots \\ (\varvec{\pi }_{\alpha }|\xi _{\alpha })\\ (\varvec{\pi }_{j_{1}}' \otimes \varvec{\pi }_{k_{1}}' |\xi _{\alpha + 1}) \\ \vdots \\ (\varvec{\pi }_{j_{\beta }}' \otimes \varvec{\pi }_{k_{\beta }}'|\xi _{\alpha + \beta }) \end{array}\right) . \end{aligned}$$

    Let us explain the mapping from \(\varvec{\pi }\) to \(\varvec{\pi }'\) in words. Let us assume without loss of generality that the triples \((\alpha +1,j_{1},k_{1}),\dots ,(\alpha +\beta ,j_{\beta },k_{\beta })\) are labeled so that they are in a “topological order”: namely, if \(\ell < \ell '\) then it cannot be that \(j_{\ell }\) or \(k_{\ell }\) is equal to \(\alpha + \ell '\). (A topological order exists because there are no cycles in \(\mathcal {T}\).) We first pad each of the components \(\varvec{\pi }_{1},\dots ,\varvec{\pi }_{\alpha }\) with a random element to obtain \(\varvec{\pi }_{1}',\dots ,\varvec{\pi }_{\alpha }'\). Then, for \(\ell =1,\dots ,\beta \) (in this order), we pad \(\varvec{\pi }_{j_{\ell }}' \otimes \varvec{\pi }_{k_{\ell }}'\) with a random element to obtain \(\varvec{\pi }_{\alpha +\ell }'\); because of the topological order, when defining \(\varvec{\pi }_{\alpha +\ell }'\), we have already defined both \(\varvec{\pi }_{j_{\ell }}'\) and \(\varvec{\pi }_{k_{\ell }}'\).

  • The query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}'\). The query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}'\) invokes \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) to obtain queries \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\) and a state \({\mathbf {u}}\), and then produces and outputs (along with \({\mathbf {u}}\)) new queries

    $$\begin{aligned} \varvec{q}^{(1)}_{1},\dots ,\varvec{q}^{(1)}_{k}, \varvec{q}^{(2)}_{s_{1}},\dots ,\varvec{q}^{(2)}_{s_{\gamma }}, \varvec{q}^{(3)}_{\alpha +1},\dots ,\varvec{q}^{(3)}_{\alpha +\beta }, \end{aligned}$$

    where \(s_{1},\dots ,s_{\gamma }\) are the (distinct) indices in the set \(\{j\}_{(i,j,k) \in \mathcal {T}} \cup \{k\}_{(i,j,k) \in \mathcal {T}}\); note that \(\gamma \le 2\beta \). The new queries are defined as follows.

    • The queries \(\varvec{q}^{(1)}_{1},\dots ,\varvec{q}^{(1)}_{k}\) correspond to the original queries \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\) to \(\varvec{\pi }\). In other words, we construct each \(\varvec{q}^{(1)}_{j}\) so that

      $$\begin{aligned} \left\langle \varvec{\pi }' , \varvec{q}^{(1)}_{j} \right\rangle = \left\langle \varvec{\pi } , \varvec{q}_{j} \right\rangle . \end{aligned}$$

      Since \(\varvec{\pi }'\) contains \(\varvec{\pi }\) (and some new terms), each \(\varvec{q}^{(1)}_{j}\) can be obtained from \(\varvec{q}_{j}\) via suitable padding with zeros (in the locations corresponding to the new terms).

    • The remaining queries are for testing the tensor structure in a way that preserves \(\text{ HVZK } \):

      • For \(\ell \in [\gamma ]\), \(\varvec{q}^{(2)}_{s_{\ell }}\) is constructed so that

        $$\begin{aligned}&\left\langle \varvec{\pi }' , \varvec{q}^{(2)}_{s_{\ell }} \right\rangle = \left\langle \varvec{\pi }'_{s_{\ell }} , ({\mathbf {r}}^{(2)}_{s_{\ell }}|{r}^{(2)}_{s_{\ell }}) \right\rangle \\&= {\left\{ \begin{array}{ll} \left\langle \varvec{\pi }_{s_{\ell }} , {\mathbf {r}}^{(2)}_{s_{\ell }} \right\rangle + \xi _{s_{\ell }}{r}^{(2)}_{s_{\ell }} &{} \text {if } s_{\ell } \in \{1,\dots ,\alpha \} \\ \left\langle \varvec{\pi }_{j_{s_{\ell }-\alpha }}' \otimes \varvec{\pi }_{k_{s_{\ell }-\alpha }}' , {\mathbf {r}}^{(2)}_{s_{\ell }} \right\rangle + \xi _{s_{\ell }}{r}^{(2)}_{s_{\ell }} &{} \text {if } s_{\ell } \in \{\alpha + 1,\dots ,\alpha +\beta \} \end{array}\right. }, \end{aligned}$$

        where \({\mathbf {r}}^{(2)}_{s_{\ell }}\) is a random vector (of suitable length) and \({r}^{(2)}_{s_{\ell }}\) is a random field element.

      • For \(\ell \in [\beta ]\), \(\varvec{q}^{(3)}_{\alpha +\ell }\) is constructed so that

        $$\begin{aligned} \left\langle \varvec{\pi }' , \varvec{q}^{(3)}_{\alpha +\ell } \right\rangle= & {} \left\langle \varvec{\pi }'_{\alpha +\ell } , (({\mathbf {r}}^{(2)}_{j_{\ell }}|{r}^{(2)}_{j_{\ell }}) \otimes ({\mathbf {r}}^{(2)}_{k_{\ell }}|{r}^{(2)}_{k_{\ell }})|{r}^{(3)}_{\alpha +\ell }) \right\rangle \\= & {} \left\langle \varvec{\pi }_{j_{\ell }}' \otimes \varvec{\pi }_{k_{\ell }}' , ({\mathbf {r}}^{(2)}_{j_{\ell }}|{r}^{(2)}_{j_{\ell }}) \otimes ({\mathbf {r}}^{(2)}_{k_{\ell }}|{r}^{(2)}_{k_{\ell }}) \right\rangle + \xi _{\alpha + \ell }{r}^{(3)}_{\alpha +\ell } , \end{aligned}$$

        where \({r}^{(3)}_{\alpha +\ell }\) is a random field element.

  • The decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}'\). The decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}'\) invokes \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) on the answers \(a^{(1)}_{1},\dots ,a^{(1)}_{k}\) and checks that \(a^{(3)}_{\alpha +1} = a^{(2)}_{j_{1}}a^{(2)}_{k_{1}},\dots ,a^{(3)}_{\alpha +\beta } = a^{(2)}_{j_{\beta }}a^{(2)}_{k_{\beta }}\).
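The prover's mapping from \(\varvec{\pi }\) to \(\varvec{\pi }'\) in Construction 8.4 can be sketched as follows. This Python sketch covers only the prover's step (the query and decision algorithms are omitted), and it assumes a prime field with a modulus `P` of our choosing; the representation of \(\mathcal {T}\) as a list of 1-based index pairs \((j_{\ell },k_{\ell })\) and the function names are hypothetical.

```python
import random

P = 2**61 - 1  # prime modulus standing in for the field F (assumed choice)

def tensor(a, b):
    """Flattened tensor product (a_i * b_j)_{i,j}, in row-major order."""
    return [x * y % P for x in a for y in b]

def pad_proof(components, relations, rng):
    """Map pi = (pi_1, ..., pi_alpha, tensors...) to the padded proof pi'.

    components: the alpha unconstrained vectors pi_1, ..., pi_alpha.
    relations:  pairs (j_l, k_l) of 1-based indices, listed in topological
                order; the l-th pair defines component alpha + l.
    Returns the list of padded components pi'_1, ..., pi'_{alpha+beta}.
    """
    alpha = len(components)
    # Pad each unconstrained component with one random field element.
    padded = [list(vec) + [rng.randrange(P)] for vec in components]
    for ell, (j, k) in enumerate(relations, start=1):
        # Topological order: both factors must already be defined.
        if not (1 <= j < alpha + ell and 1 <= k < alpha + ell):
            raise ValueError("relations are not in topological order")
        # pi'_{alpha+l} = (pi'_j (x) pi'_k | xi_{alpha+l}): tensor the
        # already padded factors, then pad the result.
        padded.append(tensor(padded[j - 1], padded[k - 1]) + [rng.randrange(P)])
    return padded
```

For example, with \(\alpha =2\) components of lengths 2 and 1 and relations \((1,2)\) and \((3,1)\), the padded components have lengths 3, 2, \(3\cdot 2+1=7\), and \(7\cdot 3+1=22\); the tensor components are computed from the padded factors, not from the original ones.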

Step 2: from higher-degree tensor relations to quadratic tensor relations. The notion of a set of tensor relations introduced in the previous step can of course be extended to relations among more than two vectors. In general, a set of tensor relations \(\mathcal {T}\) is a set of tuples of potentially different lengths; we say that a vector \(\varvec{\pi }= (\varvec{\pi }_{1},\dots ,\varvec{\pi }_{c})\) satisfies a set of tensor relations \(\mathcal {T}\) if for every tuple \((i,j,k,\dots ,z) \in \mathcal {T}\) it holds that \(\varvec{\pi }_{i} = \varvec{\pi }_{j} \otimes \varvec{\pi }_{k} \otimes \cdots \otimes \varvec{\pi }_{z}\). (As before, we require tuples in \(\mathcal {T}\) to not form cycles, etc.; see Footnote 11.)

We describe a (zero-knowledge preserving) transformation that maps any \(\text{ LPCP } \) that is secure against \(\mathcal {T}\)-respecting provers, for some \(\mathcal {T}\) containing tensor relations that are at most cubic, into a corresponding \(\text{ LPCP } \) that is secure against \(\mathcal {T}'\)-respecting provers, for a corresponding set of quadratic tensor relations \(\mathcal {T}'\). (While not needed here, the transformation can be extended to deal with arbitrary sets of tensor relations; see Remark 8.6.)

Claim 8.5

There exists an efficient \(\text{ LPCP } \) transformation \(\mathbb {Q}\) such that, for any set of at-most-cubic tensor relations \(\mathcal {T}\), the following properties hold:

  • Security: If \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is an \(\text{ LPCP } \) for a relation \(\mathcal {R}\) with knowledge error \(\varepsilon \) against \(\mathcal {T}\)-respecting provers, then \((\mathcal {T}',P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}') := \mathbb {Q}(\mathcal {T},P_{\scriptscriptstyle {\mathsf {LPCP}}},V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is an \(\text{ LPCP } \) for \(\mathcal {R}\) with knowledge error \(\varepsilon \) against \(\mathcal {T}'\)-respecting provers, where \(\mathcal {T}'\) is a set of quadratic tensor relations.

  • Zero-Knowledge Preservation: If \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is \(\delta \)-statistical \(\text{ HVZK } \) then so is \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\). (In particular, perfect \(\text{ HVZK } \) is preserved.)

  • Efficiency: If \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has query length \(m\), \(k\) queries, \(\eta \) test polynomials, and degree \((d_{Q} ,d_{D})\), then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) has query length \(O(m)\), \(k\) queries, \(\eta \) test polynomials, and degree \((d_{Q} ,d_{D})\). Furthermore, if \(\mathcal {T}\) has size \(\beta \) then \(\mathcal {T}'\) has size \(O(\beta )\).

Proof sketch

The idea is to simply split every cubic tensor into two quadratic tensors. Namely, for every tensor relation \(\varvec{\pi }_{i} = \varvec{\pi }_{j} \otimes \varvec{\pi }_{k} \otimes \varvec{\pi }_{\ell }\) in \(\mathcal {T}\), we let \(\mathcal {T}'\) require the two tensor relations \(\varvec{\pi }_{i'} = \varvec{\pi }_{j} \otimes \varvec{\pi }_{k}\) and \(\varvec{\pi }_{i} = \varvec{\pi }_{i'} \otimes \varvec{\pi }_{\ell }\), for some new index \(i'\). With this modification the length of the proof at most doubles and, moreover, the size of \(\mathcal {T}'\) is at most \(2\beta = O(\beta )\). \(\square \)

While it is possible to test cubic tensor relations “directly,” in a manner that is analogous to the way we test quadratic tensor relations in the proof of Claim 8.3, doing so requires decision predicates of cubic degree, which we wish to avoid. Claim 8.5 thus lets us move from cubic tensor relations to quadratic tensor relations before any tensor relations are tested.

Remark 8.6

Claim 8.5 can be extended to handle any set of tensor relations \(\mathcal {T}\), and not only at-most-cubic ones. Concretely, a tensor relation over \(d\) vectors can be split into \((d-1)\) quadratic tensor relations. The corresponding changes to the proof length and to the size of \(\mathcal {T}'\) can be easily deduced.
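The splitting of higher-degree tensor relations into quadratic ones is mechanical. The following Python sketch illustrates it under our own conventions: a relation is a tuple \((i,j_{1},\dots ,j_{d})\) of 1-based indices meaning \(\varvec{\pi }_{i} = \varvec{\pi }_{j_{1}} \otimes \cdots \otimes \varvec{\pi }_{j_{d}}\), and the function name and the choice to append the new intermediate components at the end of the proof are hypothetical.

```python
def split_relations(relations, num_components):
    """Split tensor relations over d >= 2 vectors into quadratic ones.

    relations:      tuples (i, j_1, ..., j_d), 1-based component indices.
    num_components: number of components c of the original proof.
    Returns (list of triples (i, j, k), new number of components).
    """
    quadratic = []
    next_index = num_components + 1           # index of the next new component
    for i, *factors in relations:
        if len(factors) < 2:
            raise ValueError("a tensor relation needs at least two factors")
        left = factors[0]
        # Fold the factors from the left, one new component per intermediate
        # product: pi_new = pi_left (x) pi_f.
        for f in factors[1:-1]:
            quadratic.append((next_index, left, f))
            left = next_index
            next_index += 1
        # The last product defines the original constrained component pi_i.
        quadratic.append((i, left, factors[-1]))
    return quadratic, next_index - 1
```

A relation over \(d\) vectors contributes \(d-1\) triples and \(d-2\) new components. Since the new components are appended at the end, the triples may need to be relabeled before they are in the topological order assumed by Construction 8.4.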

Step 3: from LPCP to HVZK LPCP against tensor-respecting provers. We describe a transformation that maps any \(\text{ LPCP } \) into a corresponding perfect \(\text{ HVZK } \text{ LPCP } \), with similar parameters, that is secure against \(\mathcal {T}\)-respecting provers for a certain set of at-most-cubic tensor relations \(\mathcal {T}\).

Claim 8.7

There exists an efficient \(\text{ LPCP } \) transformation \(\mathbb {Z}\) with the following properties:

  • Security: If \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is an \(\text{ LPCP } \) for a relation \(\mathcal {R}\) with knowledge error \(\varepsilon \) and \(d_{D}=2\), then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}') := \mathbb {Z}(P_{\scriptscriptstyle {\mathsf {LPCP}}},V_{\scriptscriptstyle {\mathsf {LPCP}}})\) is a perfect \(\text{ HVZK } \text{ LPCP } \) for \(\mathcal {R}\) with knowledge error \(\varepsilon +O(1/{|\mathbb {F}|})\) against \(\mathcal {T}\)-respecting provers for some set of at-most-cubic tensor relations \(\mathcal {T}\).

  • Efficiency: If \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has query length \(m\), \(k\) queries, state length \(m'\), \(\eta \) test polynomials, and degree \((d_{Q} ,2)\), then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) has query length \(O(\eta k^{4}(m+m'))\), \(k+\eta +1\) queries, state length \(O(k^{2}m')\), \(\eta +1\) test polynomials, and degree \((d_{Q} ,2)\). Furthermore, \(\mathcal {T}\) has size \(\beta = 3\eta \).

Proof

Recall that, for an input \(x\), we denote by \(\varvec{t}_{x}(u_{1},\dots ,u_{m'},a_{1},\dots ,a_{k})\) the (quadratic) test polynomial of the decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}(x,\ldots )\). In general, \(\varvec{t}_{x}\) is an \(\eta \)-dimensional multivariate polynomial; however, to simplify notation, we describe our construction for the case \(\eta =1\) and then, in Remark 8.9, explain how the construction can be extended to the case \(\eta > 1\) (and what the effect on the parameters is).

Define the padding polynomial (of \(t_{x}\)) to be:

$$\begin{aligned} \varvec{\Delta }_{t_{x}}(u_{1},\dots ,u_{m'},a_{1},\dots ,a_{k},\xi _{1},\dots ,\xi _{k})&= t_{x}(u_{1},\dots ,u_{m'},a_{1}+\xi _{1},\dots ,a_{k}+\xi _{k})\\&\quad -t_{x}(u_{1},\dots ,u_{m'},a_{1},\dots ,a_{k}). \end{aligned}$$

Note that \(\varvec{\Delta }_{t_{x}}\) can be expanded to:

$$\begin{aligned}&\varvec{\Delta }_{t_{x}}(u_{1},\dots ,u_{m'},a_{1},\dots ,a_{k},\xi _{1},\dots ,\xi _{k})\\&\quad = \sum _{(i ,j)\in [m']\times [k]} c^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x,i,j} u_{i}\xi _{j}+ \sum _{(i ,j)\in [k]\times [k]} c^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x,i,j} a_{i}\xi _{j}\\&\qquad + \sum _{(i ,j)\in [k]\times [k]} c^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x,i,j} \xi _{i}\xi _{j}, \end{aligned}$$

for some coefficients \(c^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x,i,j}\), \(c^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x,i,j}\), and \(c^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x,i,j}\); we denote the corresponding coefficient vectors by

$$\begin{aligned} {\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x} = \left( c^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x,i,j}\right) _{(i ,j)\in [m']\times [k]}, {\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x} = \left( c^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x,i,j}\right) _{(i ,j)\in [k]\times [k]}, {\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x} = \left( c^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x,i,j}\right) _{(i ,j)\in [k]\times [k]}. \end{aligned}$$
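As an illustration (a worked example of ours, not needed for the construction), consider the test polynomial of the \(\text{ LPCP } \) decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}\) described in Sect. 7, for which \(k=3\) and \({\mathbf {u}} = (\left\{ v_{i}(r)\right\} _{0\le i \le n} ,w_{0}(r),t(r))\). Writing \(A := v_{0}(r) + \sum _{i=1}^{n}{x_{i} \cdot v_{i}(r)}\) as shorthand (notation introduced only for this example), the padding polynomial is

$$\begin{aligned} \varvec{\Delta }_{t_{x}}&= (A + a_{1} + \xi _{1}) \cdot (w_{0}(r) + a_{2} + \xi _{2}) - (a_{3}+\xi _{3}) \cdot t(r) - (A + a_{1}) \cdot (w_{0}(r) + a_{2}) + a_{3} \cdot t(r) \\&= \underbrace{w_{0}(r)\, \xi _{1} + A\, \xi _{2} - t(r)\, \xi _{3}}_{{\mathbf {u}}\otimes {\mathbf {\xi }}\text { terms}} + \underbrace{a_{2}\, \xi _{1} + a_{1}\, \xi _{2}}_{{\mathbf {a}}\otimes {\mathbf {\xi }}\text { terms}} + \underbrace{\xi _{1}\, \xi _{2}}_{{\mathbf {\xi }}\otimes {\mathbf {\xi }}\text { term}} . \end{aligned}$$

Every monomial contains at least one \(\xi _{j}\), and the input \(x\) enters only through the coefficients of the \({\mathbf {u}}\otimes {\mathbf {\xi }}\) terms (via \(A\)), which is why the coefficient vectors carry the subscript \(x\).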

Construction 8.8

Construct \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) from \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) as follows:

  • The prover \(P_{\scriptscriptstyle {\mathsf {LPCP}}}'\). Given \((x ,w) \in \mathcal {R}\), \(P_{\scriptscriptstyle {\mathsf {LPCP}}}'\) invokes \(P_{\scriptscriptstyle {\mathsf {LPCP}}}(x,w)\) to obtain a proof \(\varvec{\pi }\in \mathbb {F}^{m}\), samples a random \({\mathbf {\xi }}\in \mathbb {F}^{k}\), and outputs the proof \(\varvec{\pi }'\) defined as follows:

    $$\begin{aligned} \varvec{\pi }' = \left( \begin{array}{c} \varvec{\pi }\\ {\mathbf {\xi }}\\ {\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x} \\ {\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x} \\ {\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x}\\ {\mathbf {\xi }}\otimes {\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x} \\ \varvec{\pi }\otimes {\mathbf {\xi }} \otimes {\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x} \\ {\mathbf {\xi }}\otimes {\mathbf {\xi }} \otimes {\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x} \end{array} \right) . \end{aligned}$$

    In other words, \(\varvec{\pi }'\) has 8 component vectors, and we can define \(\mathcal {T}\) as the set

    $$\begin{aligned} \mathcal {T}:= \{(6,2,3),(7,1,2,4),(8,2,2,5)\} . \end{aligned}$$
  • The query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}'\). The query algorithm \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}'\) invokes \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}\) to obtain queries \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\) and a state \({\mathbf {u}}\), and then produces new queries \(\varvec{q}'_{1},\dots ,\varvec{q}'_{k},\varvec{q}'_{k+1},\varvec{q}'_{k+2}\). Specifically:

    • For each \(j\in [k]\), the prefix of \(\varvec{q}'_{j}\) is \(\varvec{q}_{j}\), and the suffix is a unit vector \({\mathbf {e}}_{j}\), which is 1 in the j-th position and zero otherwise. In other words, we construct \(\varvec{q}'_{j}\) so that:

      $$\begin{aligned} \left\langle \varvec{\pi }' , \varvec{q}'_{j} \right\rangle = \left\langle \varvec{\pi } , \varvec{q}_{j} \right\rangle +\xi _{j}. \end{aligned}$$
    • The vector \(\varvec{q}'_{k+1}\) takes a random combination of the elements \({\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x}, {\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x},{\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x}\). In other words, \(\varvec{q}'_{k+1}\) consists of a random vector \({\mathbf {r}}\) padded with zeros so that

      $$\begin{aligned} \left\langle \varvec{\pi }' , \varvec{q}'_{k+1} \right\rangle = \left\langle \left( {\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x}, {\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x}, {\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x}\right) , {\mathbf {r}} \right\rangle . \end{aligned}$$
    • The vector \(\varvec{q}'_{k+2}\) consists of several copies of each of \(\varvec{q}_{1},\dots ,\varvec{q}_{k}\) and \({\mathbf {u}}\), and of zeros, so that

      $$\begin{aligned} \left\langle \varvec{\pi }' , \varvec{q}'_{k+2} \right\rangle = \varvec{\Delta }_{t_{x}}({\mathbf {u}},\left\langle \varvec{\pi } , \varvec{q}_{1} \right\rangle ,\dots ,\left\langle \varvec{\pi } , \varvec{q}_{k} \right\rangle ,{\mathbf {\xi }})= \varvec{\Delta }_{t_{x}}({\mathbf {u}},{\mathbf {a}},{\mathbf {\xi }}) . \end{aligned}$$

    As for the state, \(Q_{\scriptscriptstyle {\mathsf {LPCP}}}'\) outputs \({\mathbf {u}}' = ({\mathbf {u}} ,{\mathbf {r}})\).

  • The decision algorithm \(D_{\scriptscriptstyle {\mathsf {LPCP}}}'\). Given \({\mathbf {u}}'=({\mathbf {u}},{\mathbf {r}})\) and answers \(\left( a'_{1},\dots ,a'_{k+2}\right) \), \(D_{\scriptscriptstyle {\mathsf {LPCP}}}'\) checks that \(t_{x}({\mathbf {u}},a'_{1},\dots ,a'_{k}) = a'_{k+2}\) and \(a'_{k+1}=\left\langle \left( {\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x},{\mathbf {c}}^{{\mathbf {a'}}\otimes {\mathbf {\xi }}}_{x}, {\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x}\right) , {\mathbf {r}} \right\rangle \), where \(t_{x}\) is the test polynomial of \(D_{\scriptscriptstyle {\mathsf {LPCP}}}(x,\ldots )\).
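The padding of the first \(k\) queries can be checked numerically. The following is a minimal sketch, assuming a toy prime field \(\mathbb {F}_{101}\) and a random stand-in for rows three through eight of \(\varvec{\pi }'\); the names `pad_queries`, `demo`, and `tail_len` are hypothetical and not part of the construction.

```python
import random

P = 101  # toy prime modulus (assumption; the paper works over a generic field F)

def inner(u, v, p=P):
    """Inner product of two equal-length vectors over F_p."""
    return sum(a * b for a, b in zip(u, v)) % p

def pad_queries(queries, tail_len):
    """Build q'_j = (q_j, e_j, 0, ..., 0) for j = 1..k.

    tail_len is the combined length of the remaining components of pi'
    (coefficient vectors and tensor rows), which these queries ignore.
    """
    k = len(queries)
    padded = []
    for j, q in enumerate(queries):
        unit = [1 if i == j else 0 for i in range(k)]  # unit vector e_j
        padded.append(list(q) + unit + [0] * tail_len)
    return padded

def demo(m=4, k=3, tail_len=5, seed=0):
    rng = random.Random(seed)
    pi = [rng.randrange(P) for _ in range(m)]
    xi = [rng.randrange(P) for _ in range(k)]
    tail = [rng.randrange(P) for _ in range(tail_len)]  # stand-in for rows 3..8
    pi_prime = pi + xi + tail
    queries = [[rng.randrange(P) for _ in range(m)] for _ in range(k)]
    padded = pad_queries(queries, tail_len)
    masked = [inner(pi_prime, qp) for qp in padded]
    expected = [(inner(pi, q) + xi[j]) % P for j, q in enumerate(queries)]
    return masked, expected
```

In this sketch each masked answer equals \(\left\langle \varvec{\pi } , \varvec{q}_{j} \right\rangle +\xi _{j}\), so a uniformly random \({\mathbf {\xi }}\) hides the original answers.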

Remark 8.9

(Multi-dimensional test polynomials) As mentioned above, to simplify notation, we have defined the padding polynomial and given Construction 8.8 for the case \(\eta =1\). For the general case of an \(\eta \)-dimensional polynomial \(\varvec{t}_{x}\) with \(\eta >1\), we proceed as follows. Instead of defining a single padding polynomial, we define \(\eta \) padding polynomials \(\varvec{\Delta }_{\varvec{t}_{x},i}\), each with its own coefficients \({\mathbf {c}}_{i}=\left( {\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x,i}, {\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x,i},{\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x,i}\right) \). Accordingly, the third, fourth, and fifth rows of the proof \(\varvec{\pi }'\) are each extended to \(\eta \) corresponding rows; similarly for the sixth, seventh, and eighth rows of \(\varvec{\pi }'\). The query \(\varvec{q}'_{k+1}\) is modified to include \(\eta \) random vectors \({\mathbf {r}}_{1},\dots ,{\mathbf {r}}_{\eta }\); indeed, the linear combination checking the correctness of \({\mathbf {c}}_{1},\dots ,{\mathbf {c}}_{\eta }\) can be taken simultaneously for all of the \({\mathbf {c}}_{i}\) (rather than for each \({\mathbf {c}}_{i}\) separately). The last query \(\varvec{q}'_{k+2}\) is replaced by \(\eta \) queries \(\varvec{q}'_{k+2},\dots ,\varvec{q}'_{k+\eta +1}\), each according to the appropriate \(\varvec{\Delta }_{\varvec{t}_{x},i}\).

First, note that the degree of \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) is the same as that of \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\): we have only introduced linear operations to both query and decision polynomials.

Next, let us argue that \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) has the claimed knowledge error. If the vectors \(({\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x},{\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x},{\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x})\) appear correctly in the proof \(\varvec{\pi }'\) and the prover is \(\mathcal {T}\)-respecting, then \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) has exactly the same knowledge error as \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\), because the verifier then computes exactly \(t_x({\mathbf {u}},{\mathbf {a}})\). If instead the vectors \(({\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x},{\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x}, {\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x})\) are not computed properly, then the random linear combination test fails except with probability \(1/|\mathbb {F}|\). Thus, the knowledge error of \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) is at most \(\varepsilon +1/|\mathbb {F}|\).
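The \(1/|\mathbb {F}|\) bound for the random linear combination test can be confirmed by brute force over a small field. The sketch below is illustrative only: it assumes a toy prime field \(\mathbb {F}_{p}\) and short coefficient vectors, and the function name `collision_fraction` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def collision_fraction(c_claimed, c_true, p):
    """Fraction of r in F_p^n with <c_claimed, r> == <c_true, r> (mod p)."""
    n = len(c_true)
    hits = 0
    for r in product(range(p), repeat=n):
        lhs = sum(a * b for a, b in zip(c_claimed, r)) % p
        rhs = sum(a * b for a, b in zip(c_true, r)) % p
        if lhs == rhs:
            hits += 1
    return Fraction(hits, p ** n)
```

For any two distinct vectors the difference defines a nonzero linear form in \({\mathbf {r}}\), which vanishes on exactly a \(1/p\) fraction of \(\mathbb {F}_{p}^{n}\); identical vectors always pass the test.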

To see that \((P_{\scriptscriptstyle {\mathsf {LPCP}}}' ,V_{\scriptscriptstyle {\mathsf {LPCP}}}')\) is perfect \(\text{ HVZK } \), note that the honest verifier does not learn from an honest proof any information except for the fact that \(\varvec{t}_{{x}}({\mathbf {u}},{\mathbf {a}})=\varvec{t}_{{x}}({\mathbf {u}},\left\langle \varvec{\pi } , \varvec{q}_{1} \right\rangle ,\dots ,\left\langle \varvec{\pi } , \varvec{q}_{k} \right\rangle )=0\). More formally, the answers \(a'_{1},\dots ,a'_{k}\) can all be simulated by random independent elements; the answer \(a'_{k+1}\) can be simulated by just taking a random combination of \(({\mathbf {c}}^{{\mathbf {u}}\otimes {\mathbf {\xi }}}_{x},{\mathbf {c}}^{{\mathbf {a}}\otimes {\mathbf {\xi }}}_{x},{\mathbf {c}}^{{\mathbf {\xi }}\otimes {\mathbf {\xi }}}_{x})\); and the difference \(a'_{k+2}=\varvec{\Delta }_{t_{x}}({\mathbf {u}},{\mathbf {a}},{\mathbf {\xi }})\) can be simulated simply as \(\varvec{t}_{x}({\mathbf {u}},{\mathbf {a}}')\), where \({\mathbf {u}}\) is an honestly generated state for the verifier and \({\mathbf {a}}'\) are the simulated answers \(a'_{1},\dots ,a'_{k}\).

Combining Claims 8.7, 8.5, and 8.3 completes the proof of Theorem 8.1.

9 Multi-Theorem Designated-Verifier \(\text{ SNARK } \)s via Strong Knowledge

A desirable property of \(\text{ SNARK } \)s is the ability to generate the reference string \(\sigma \), once and for all, and then reuse it to produce polynomially-many proofs (potentially by different provers). Doing so is especially desirable for preprocessing \(\text{ SNARK } \)s, where generating \(\sigma \) afresh is expensive.

However, being able to securely reuse \(\sigma \) requires security also against provers that have access to a proof-verification oracle. For publicly-verifiable \(\text{ SNARK } \)s, this multi-theorem proof of knowledge is automatically guaranteed. For designated-verifier \(\text{ SNARK } \)s, however, multi-theorem proof of knowledge needs to be required explicitly as an additional property. Intuitively, this is achieved by ensuring that the verifier’s response “leaks” only a negligible amount of information about the verification state \(\tau \) (for then malicious prover strategies that create a significant correlation between \(\tau \) and the event of the verifier rejecting are ruled out).Footnote 14

Security against such provers can be formulated for (computational) soundness or proof of knowledge, both in the non-adaptive and adaptive settings. Because in this paper we are typically interested in adaptive proof of knowledge, we formulate it in this setting.

Definition 9.1

A triple of algorithms \((G ,P,V)\) is a multi-theorem SNARK for the relation \(\mathcal {R}\subseteq \mathcal {R}_{\mathcal {U}}\) if it is a \(\text{ SNARK } \) for \(\mathcal {R}\) where adaptive proof of knowledge (Definition 4.3) is replaced by the following stronger requirement:

  • Multi-theorem adaptive proof of knowledge

    For every polynomial-size prover \(P^{*}\), there exists a polynomial-size extractor \(E\) such that for every large enough security parameter \(\lambda \in \mathbb {N}\), auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\), and time bound \(T\in \mathbb {N}\),

    $$\begin{aligned} \mathrm{Pr}\left[ \begin{array}{c} V(\tau ,y,\pi )=1 \\ (y,w) \notin \mathcal {R}\end{array} \left| \begin{array}{r} (\sigma ,\tau ) \leftarrow G(1^{\lambda },T) \\ (y,\pi ) \leftarrow P^{* \; V(\tau ,\cdot ,\cdot )}(z,\sigma ) \\ w\leftarrow E(z,\sigma ) \end{array} \right] \right. \le {\mathsf {negl}}(\lambda ) . \end{aligned}$$

As discussed in Sect. 1.3.3 (see Footnote 15), the \(\text{ PCP } \)-based (or \(\text{ MIP } \)-based) \(\text{ SNARK } \)s of [8, 10, 11, 44, 46, 60, 83] are not multi-theorem \(\text{ SNARK } \)s, because a malicious prover can adaptively learn the encrypted \(\text{ PCP } \) (or \(\text{ MIP } \)) queries (whose secrecy is crucial for security), just by feeding different proofs to the verifier and learning his responses. In this paper, some of the designated-verifier preprocessing \(\text{ SNARK } \)s that we construct satisfy multi-theorem proof of knowledge (under suitable assumptions), and some do not.

Concretely, an interesting property that is satisfied by algebraic \(\text{ LIP } \)s, which we call strong knowledge, is that such “correlation attacks” are impossible: roughly, every \(\text{ LIP } \) prover either makes the \(\text{ LIP } \) verifier accept with probability 1 or with probability less than \(O({\mathrm {poly}}(\lambda )/|\mathbb {F}|)\). Designated-verifier preprocessing \(\text{ SNARK } \)s constructed from such \(\text{ LIP } \)s can thus be shown to have the multi-theorem property. Details follow.

Definition 9.2

An \(\text{ LPCP } \) \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) with knowledge error \(\varepsilon \) has strong knowledge error if for every input \(x\) and every linear function \(\varvec{\pi }^{*} :\mathbb {F}^{m} \rightarrow \mathbb {F}\), the probability that \(V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)\) accepts is either 1 or at most \(\varepsilon \). (When we are not paying attention to the knowledge properties of the \(\text{ LPCP } \), we shall call this property strong soundness.) An analogous definition holds for \(\text{ LIP } \)s.

For sufficiently large field \(\mathbb {F}\), algebraic \(\text{ LPCP } \)s and \(\text{ LIP } \)s have the strong knowledge error property, as proved in the following lemma.

Lemma 9.3

Let \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) be an \(\text{ LPCP } \) over \(\mathbb {F}\) with knowledge error \(\varepsilon \); if \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has degree \((d_{Q} ,d_{D})\), then \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has strong knowledge error \(\max \{\varepsilon , \frac{d_{Q}d_{D}}{|\mathbb {F}|}\}\). An analogous statement holds for (input-oblivious two-message) \(\text{ LIP } \)s.

Proof

Since \((P_{\scriptscriptstyle {\mathsf {LPCP}}} ,V_{\scriptscriptstyle {\mathsf {LPCP}}})\) has knowledge error \(\varepsilon \), it also has knowledge error \(\max \{\varepsilon ,\frac{d_{Q}d_{D}}{|\mathbb {F}|}\}\). We are now only left to show that, for every input \(x\) and linear function \(\varvec{\pi }^{*}\), if \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)=1] > \max \{\varepsilon ,\frac{d_{Q}d_{D}}{|\mathbb {F}|}\} \ge \frac{d_{Q}d_{D}}{|\mathbb {F}|}\) then \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)=1] = 1\). Indeed, letting \(\varvec{t}_{x}\) be the test polynomial, \(\varvec{p}\) the state polynomial, \(\varvec{p}_{1},\dots ,\varvec{p}_{k}\) the query polynomials of \(V_{\scriptscriptstyle {\mathsf {LPCP}}}\),

$$\begin{aligned} \mathrm{Pr}\big [V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)=1\big ]= & {} \mathop {\mathrm{Pr}}\limits _{\varvec{r}\leftarrow \mathbb {F}^{\mu }} \Big [ \varvec{t}_{x}\big (\varvec{p}(\varvec{r}),\left\langle \varvec{\pi }^{*} , \varvec{p}_{1}(\varvec{r}) \right\rangle ,\dots ,\left\langle \varvec{\pi }^{*} , \varvec{p}_{k}(\varvec{r}) \right\rangle \big ) = 0^{\eta } \Big ] \\= & {} \mathop {\mathrm{Pr}}\limits _{\varvec{r}\leftarrow \mathbb {F}^{\mu }}\big [\, {\mathbf {a}}(\varvec{r}) = 0^{\eta } \,\big ], \end{aligned}$$

where \({\mathbf {a}}\) is the polynomial of degree at most \(d_{Q}d_{D}\) defined by \({\mathbf {a}}(\varvec{r}) := \varvec{t}_{x}\big (\varvec{p}(\varvec{r}),\left\langle \varvec{\pi }^{*} , \varvec{p}_{1}(\varvec{r}) \right\rangle ,\dots ,\left\langle \varvec{\pi }^{*} , \varvec{p}_{k}(\varvec{r}) \right\rangle \big )\). By the Schwartz–Zippel Lemma (cf. Lemma 2.1), if \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)=1] > \frac{d_{Q}d_{D}}{|\mathbb {F}|}\) then \({\mathbf {a}} \equiv 0^{\eta }\) and thus \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {LPCP}}}^{\varvec{\pi }^{*}}(x)=1]=1\).

A similar argument proves the analogous statement for \(\text{ LIP } \)s. \(\square \)
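The dichotomy used in the proof (a low-degree polynomial either vanishes everywhere or on at most a \(d/|\mathbb {F}|\) fraction of points) can be observed on toy instances. The following sketch assumes the prime field \(\mathbb {F}_{7}\) and two hand-picked bivariate polynomials of degree 2; the names `zero_fraction`, `nonzero`, and `vanishing` are hypothetical and stand in for the composed polynomial \({\mathbf {a}}\).

```python
from fractions import Fraction
from itertools import product

def zero_fraction(poly, num_vars, p):
    """Fraction of points r in F_p^num_vars at which poly(r) == 0 (mod p)."""
    points = product(range(p), repeat=num_vars)
    zeros = sum(1 for r in points if poly(r) % p == 0)
    return Fraction(zeros, p ** num_vars)

P = 7       # toy field size (assumption)
DEGREE = 2  # total degree of both example polynomials

# Not identically zero: r0 * (r1 - 1).
nonzero = lambda r: r[0] * r[1] - r[0]
# Identically zero: (r0 + r1)^2 - r0^2 - 2*r0*r1 - r1^2.
vanishing = lambda r: (r[0] + r[1]) ** 2 - r[0] ** 2 - 2 * r[0] * r[1] - r[1] ** 2

frac_nonzero = zero_fraction(nonzero, 2, P)
frac_vanishing = zero_fraction(vanishing, 2, P)
```

Here the first polynomial vanishes on a fraction of points below \(2/7\), while the second vanishes everywhere, matching the two cases of strong knowledge error.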

Do LIPs with strong knowledge exist? In Sect. 3, we presented two types of \(\text{ LIP } \) constructions.

  • Both \(\text{ LIP } \)s constructed in Sect. 3.1 are algebraic (as they are based on algebraic \(\text{ LPCP } \)s) and hence, because of Lemma 9.3, do enjoy strong knowledge.

  • In contrast, the \(\text{ LIP } \)s constructed in Sect. 3.2 are not algebraic and also do not enjoy strong knowledge (or soundness). The reason is that those \(\text{ LIP } \)s are based on traditional \(\text{ PCP } \)s that do not enjoy strong knowledge (or soundness).

In fact, we now prove that no (traditional) \(\text{ PCP } \) (for a hard-enough language) can enjoy strong soundness, so that the lack of strong knowledge (or soundness) for the \(\text{ LIP } \)s constructed in Sect. 3.2 is inherent. Concretely, we show that if a language L has a (traditional) \(\text{ PCP } \) with strong soundness, then it can be decided quite easily.

Definition 9.4

Let \(\ell :\mathbb {N}\rightarrow \mathbb {N}\) be a function. The complexity class \({\mathsf {MA}}(\ell )\) is the set of languages L for which there exists a probabilistic polynomial-time Turing machine M such that, for every instance x,

  • if \(x \in L\), then there is \(y \in \{0,1\}^{\ell (|x|)}\) such that \(\mathrm{Pr}[M(x,y)=1] > 2/3\);

  • if \(x \not \in L\), then for every \(y \in \{0,1\}^{\ell (|x|)}\) it holds that \(\mathrm{Pr}[M(x,y)=0] > 2/3\).

Theorem 9.5

Let \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) be a \(k\)-query \(\text{ PCP } \) with proof length \(m\) for a language L, where \(k\) and \(m\) are functions of the input size. If \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) has strong soundness error 1/3, then \(L \in {\mathsf {MA}}(2k(\log m+1))\).

Proof

Because \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) has strong soundness error 1/3, for every x and \(\pi \), either \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi }(x)=1]=1\) or \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi }(x)=1]\le 1/3\). (See Definition 2.2.) It suffices to show that, for every \(x\in L\) and \(\pi \) such that \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi }(x)=1]=1\), there is \(S\subseteq [m]\) of size at most \(2k\) for which \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{(\pi |_S ,\mathbf{1}|_{[m]\setminus S})}(x)=1]=1\), where \((\pi |_S ,\mathbf{1}|_{[m]\setminus S})\) is the \(\text{ PCP } \) oracle that is the same as \(\pi \) at the locations in S and is equal to 1 at all other locations. Indeed, to see that the latter is sufficient, consider the \({\mathsf {MA}}\) verifier that, on input x and candidate witness \((S,\pi |_S) \in \{0,1\}^{2k(\log m+1)}\), runs \(V_{\scriptscriptstyle {\mathsf {PCP}}}\) with \((\pi |_S ,\mathbf{1}|_{[m]\setminus S})\) and accepts if and only if \(V_{\scriptscriptstyle {\mathsf {PCP}}}\) does; by the above claim and the completeness of \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\), for any \(x\in L\), there is a witness that makes the \({\mathsf {MA}}\) verifier accept with probability 1; furthermore, for every \(x\notin L\), the (strong) soundness of \((P_{\scriptscriptstyle {\mathsf {PCP}}} ,V_{\scriptscriptstyle {\mathsf {PCP}}})\) implies that the \({\mathsf {MA}}\) verifier accepts with probability at most 1/3 regardless of the witness.

To prove the aforementioned claim, fix any such \(x \in L\) and define \(S\subseteq [m]\) to be the set of positions that are queried by \(V_{\scriptscriptstyle {\mathsf {PCP}}}\) with probability at least 1/2; since \(V_{\scriptscriptstyle {\mathsf {PCP}}}\) makes at most \(k\) queries, the query probabilities of all positions sum to at most \(k\), and so by averaging \(|S| \le 2k\). (While S may depend on x, it may not depend on the \(\text{ PCP } \) proof.) Let \(\pi \in \{0,1\}^{m}\) be a \(\text{ PCP } \) oracle for which \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi }(x)=1]=1\). We show that also \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{(\pi |_S ,\mathbf{1}|_{[m]\setminus S})}(x)=1]=1\), or else strong soundness is violated. Indeed, assume toward contradiction that this is not the case; then, by strong soundness, \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{(\pi |_S ,\mathbf{1}|_{[m]\setminus S})}(x)=1]\le 1/3\). Consider a sequence of hybrid \(\text{ PCP } \) oracles \(\pi _0,\dots ,\pi _{m-|S|}\) defined as follows: starting from \(\pi _0 = \pi \), we set the bits outside of S one by one (in any fixed order) to 1, until we get \(\pi _{m-|S|}=(\pi |_S ,\mathbf{1}|_{[m]\setminus S})\). By strong soundness, there exists \(i\in [m-|S|]\) such that \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi _{i-1}}(x)=1]=1\) and \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi _{i}}(x)=1]\le 1/3\). However, \(\pi _{i-1}\) and \(\pi _{i}\) differ only on a single position \(j\in [m]\setminus S\) that, in particular, is queried with probability less than 1/2, implying that \(\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi _{i-1}}(x)=1]-\mathrm{Pr}[V_{\scriptscriptstyle {\mathsf {PCP}}}^{\pi _{i}}(x)=1]< 1/2\), which contradicts the fact that this difference is at least 2/3.    \(\square \)
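The averaging step \(|S| \le 2k\) can be illustrated on a toy verifier. The sketch below assumes, for simplicity, a non-adaptive verifier whose queried positions depend only on its coins, described by one query set per coin value; the representation and the name `heavy_positions` are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def heavy_positions(query_sets):
    """Positions queried with probability >= 1/2 over a uniform coin.

    query_sets[c] is the set of proof positions read when the coins equal c.
    """
    n = len(query_sets)
    counts = Counter(pos for qs in query_sets for pos in set(qs))
    half = Fraction(1, 2)
    return {pos for pos, cnt in counts.items() if Fraction(cnt, n) >= half}

# Toy verifier with k = 2 queries per coin and 4 equally likely coin values.
K = 2
TOY_QUERY_SETS = [{0, 1}, {0, 2}, {0, 3}, {1, 4}]
S = heavy_positions(TOY_QUERY_SETS)
```

In this example position 0 is read with probability 3/4 and position 1 with probability 1/2, so \(S\) has two elements, within the bound \(2k = 4\).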

From LIPs with strong knowledge to multi-theorem designated-verifier preprocessing SNARKs. At a high level, the fact that \(\text{ LIP } \)s with strong knowledge make “correlation attacks” impossible gives strong intuition for why such \(\text{ LIP } \)s should give rise (through our transformation from Sect. 6.1) to designated-verifier preprocessing \(\text{ SNARK } \)s with the multi-theorem property.

However, formalizing this intuition seems to require a notion of linear-only encryption that is stronger than the one introduced in Sect. 5.1. Specifically, we need to be able to repeatedly extract from a malicious prover in order to simulate its queries to the proof-verification oracle. Doing so raises difficulties similar to the case of plaintext-aware encryption (see [25, 26]), and seems to be solvable via “interactive extractability assumptions” [25, 26, 46].

We now provide a definition of linear-only encryption that, together with strong knowledge, suffices to obtain multi-theorem \(\text{ SNARK } \)s. We thus obtain a confirmation that using strong knowledge is a “conceptually correct” method serving as a good heuristic toward the construction of multi-theorem \(\text{ SNARK } \)s (despite the fact that we need to make fairly strong cryptographic assumptions to formalize this step).

Linear-only homomorphism with interactive extraction. A linear-only encryption scheme with interactive extraction is the same as a (standard) linear-only encryption (Definition 5.4), except that it has a stronger extraction guarantee. Recall that the (standard) linear-only property says that whenever an efficient adversary, given a public key \({\mathsf {pk}}\) and ciphertexts \((c_{1} ,\dots ,c_{m})\), produces a ciphertext \(c'\) in the image of \({\mathsf {Enc}}_{{\mathsf {pk}}}\), there is an efficient extractor that outputs a corresponding affine function “explaining” the ciphertext (or, more accurately, the underlying plaintext) as an affine combination of \((c_{1} ,\dots ,c_{m})\) (or, more accurately, their underlying plaintexts). While the standard definition only guarantees “one-time” extraction, the interactive definition gives the adversary the option to interact with the extractor and try to repeatedly sample additional ciphertexts \(c_{2}',c_{3}',\dots \) given the previously-extracted affine combinations. The definition below is along the same lines as definitions in [25, 26, 46].

Definition 9.6

An encryption scheme has the linear-only property with interactive extraction if, for any polynomial-size (interactive) adversary \(A\), there is a polynomial-size (interactive) extractor \(E\) such that, for any sufficiently large \(\lambda \in \mathbb {N}\), any auxiliary input \(z\in \{0,1\}^{{\mathrm {poly}}(\lambda )}\), and any plaintext generator \(\mathcal {M}\), \(A\) wins the following game with negligible probability:

  1. Generation step:

    • \(({\mathsf {sk}},{\mathsf {pk}}) \leftarrow {\mathsf {Gen}}(1^{\lambda })\);

    • \((a_{1},\dots ,a_{m}) \leftarrow \mathcal {M}({\mathsf {pk}})\);

    • \((c_{1} ,\dots ,c_{m}) \leftarrow ({\mathsf {Enc}}_{{\mathsf {pk}}}(a_{1}) ,\dots ,{\mathsf {Enc}}_{{\mathsf {pk}}}(a_{m}))\).

  2. For \(i\in \left\{ 1,\dots ,|A|\right\} \):

    • \((c_{i,1}',\dots ,c_{i,k}') \leftarrow A({\mathsf {pk}},c_{1},\dots ,c_{m};e_1,\dots ,e_{i-1};z)\);

    • \(e_{i} \leftarrow E({\mathsf {pk}},c_{1},\dots ,c_{m};i;z)\), where \(e_{i}\) is either an affine function \((\Pi _{i} ,\varvec{b}_{i})\) or it is \(\bot \).

  3. \(A\) wins if there exists some \(i\in [|A|]\) such that any one of the following holds:

    • The extractor fails to identify that \(A\) outputs an invalid cipher. Namely, there exists \(j \in [k]\) such that \({\mathsf {ImVer}}_{{\mathsf {sk}}}(c_{i,j}')\ne 1\) and \(e_{i}\ne \bot \).

    • The extractor fails to produce an affine function that explains the ciphertext produced by \(A\). Namely, there exists \(j \in [k]\) such that \({\mathsf {ImVer}}_{{\mathsf {sk}}}(c_{i,j}') = 1\) and one of the following conditions holds: (i) \(e_{i} = \bot \), or (ii) \({\mathsf {Dec}}_{{\mathsf {sk}}}(c_{i,j}')\ne a_{i,j}'\) where \(e_{i}=(\Pi _{i} ,\varvec{b}_{i}) \ne \bot \) and \((a_{i,1}',\dots ,a_{i,k}')^{\top } \leftarrow \Pi _{i} \cdot (a_{1},\dots ,a_{m})^{\top } +\varvec{b}_{i}\).
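The winning condition of a single round can be written as a predicate. The following is a rough sketch that mirrors the two bullets above as literally stated; the callables `im_ver` and `dec` are hypothetical stand-ins for \({\mathsf {ImVer}}_{{\mathsf {sk}}}\) and \({\mathsf {Dec}}_{{\mathsf {sk}}}\), plaintexts are assumed to live in a toy prime field \(\mathbb {F}_{p}\), and `None` plays the role of \(\bot \).

```python
def adversary_wins_round(plaintexts, ciphertexts_out, extracted, im_ver, dec, p):
    """Return True if the adversary wins round i of the extraction game.

    plaintexts      -- (a_1, ..., a_m), elements of F_p
    ciphertexts_out -- (c'_{i,1}, ..., c'_{i,k}) output by the adversary
    extracted       -- None for "bottom", else (Pi, b): k x m matrix, length-k vector
    im_ver, dec     -- stand-ins for ImVer_sk and Dec_sk (assumptions)
    """
    valid = [im_ver(c) for c in ciphertexts_out]
    # First bullet: some ciphertext is invalid, yet the extractor did not output bottom.
    if not all(valid) and extracted is not None:
        return True
    # Second bullet, case (i): some ciphertext is valid, yet the extractor output bottom.
    if any(valid) and extracted is None:
        return True
    if extracted is None:
        return False
    Pi, b = extracted
    # a'_{i,j} = (Pi * a + b)_j over F_p
    explained = [
        (sum(x * a for x, a in zip(row, plaintexts)) + bj) % p
        for row, bj in zip(Pi, b)
    ]
    # Second bullet, case (ii): a valid ciphertext decrypts to something other than a'_{i,j}.
    return any(v and dec(c) != a for v, c, a in zip(valid, ciphertexts_out, explained))
```

For experimentation one may plug in a placeholder "scheme" in which ciphertexts are the plaintexts themselves (insecure, for illustration only), so that `dec` is the identity and `im_ver` checks membership in \(\{0,\dots ,p-1\}\).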

Remark 9.7

(Instantiations) While Definition 9.6 seems stronger than Definition 5.4, all the instantiations described in Sect. 5.3 are plausible candidates for satisfying it. (In particular, [25, 26, 46] considered “knowledge of exponent assumptions” satisfying a similar requirement.)

We now show that if, in Lemma 6.2, the linear-only encryption additionally satisfies Definition 9.6 and the \(\text{ LIP } \) has strong knowledge error, then we obtain a multi-theorem \(\text{ SNARK } \) (again through Construction 6.1).

Lemma 9.8

Suppose that the \(\text{ LIP } \) \((P_{\scriptscriptstyle {\mathsf {LIP}}} ,V_{\scriptscriptstyle {\mathsf {LIP}}})\) has strong knowledge error \({\mathrm {poly}}(\lambda )/ |\mathbb {F}|\) and \(\mathcal {E}\) is a linear-only encryption scheme with interactive extraction (Definition 9.6). Then, \((G,P,V)\) from Construction 6.1 is a multi-theorem designated-verifier preprocessing \(\text{ SNARK } \).

Proof

We show that any polynomial-size adversary \(A\) that can access the proof-verification oracle \(V(\tau ,\cdot ,\cdot )\) can be transformed into a new polynomial-size adversary \(A'\) that cannot access \(V(\tau ,\cdot ,\cdot )\) such that, except with negligible probability, the output of \(A'\) is equal to the (final) output of \(A\). The lemma then follows by applying Lemma 6.2 with adversary \(A'\) where the random coins used by \(A'\) are used as auxiliary input.

First, we use \(A\) to define a new interactive adversary \(I_A\) that (following the template of Definition 9.6) works as follows. Given a public key \({\mathsf {pk}}\) and ciphertexts \((c_{1} ,\dots ,c_{m})\), \(I_A\) runs \(A\) and simulates the proof-verification oracle \(V(\tau ,\cdot ,\cdot )\) for \(A\). Specifically, when \(A\) outputs the first query \((y_{1},\pi _{1})\) to \(V(\tau ,\cdot ,\cdot )\), \(I_A\) proceeds as follows:

  1. \(I_A\) outputs the tuple of ciphers \(\pi _{1}\) and then receives \(e_{1}\) from the extractor;

  2. if \(e_{1}=\bot \), \(I_A\) answers \(A\)’s query \((y_{1},\pi _{1})\) with 0;

  3. otherwise, \(e_{1}\) is an affine function \((\Pi _{1} ,\varvec{b}_{1})\), and \(I_A\) runs the LIP verification procedure using fresh coins; namely, it samples \(({\mathbf {u}},\varvec{q}) \leftarrow Q_{\scriptscriptstyle {\mathsf {LIP}}}\), computes \(d_{1} = D_{\scriptscriptstyle {\mathsf {LIP}}}({\mathbf {u}},\Pi _{1} \cdot \varvec{q}^{\top } + \varvec{b}_{1})\), and then answers \(A\)’s query \((y_{1},\pi _{1})\) with \(d_{1}\). (Note that \({\mathbf {u}}\) is sampled independently of the state contained in the verification state \(\tau \), which is unknown to \(I_A\).)

Then, \(I_A\) continues to run \(A\), each time simulating the answers of \(V(\tau ,\cdot ,\cdot )\) the same way it simulated its first answer. Finally, when \(A\) produces its final output \((y_{f},\pi _{f})\), \(I_A\) also outputs \((y_{f},\pi _{f})\). By the interactive extraction guarantee (see Definition 9.6), \(I_A\) has a corresponding polynomial-size extractor \(E\) such that, when \(I_A\) and \(E\) play the extraction game, \(I_A\) wins with negligible probability.Footnote 16
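The way \(I_A\) answers a single oracle query can be sketched as follows. This is a minimal illustration, not the construction itself: `sample_query` and `decide` are hypothetical callables standing in for \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\) and \(D_{\scriptscriptstyle {\mathsf {LIP}}}\), the query is assumed to be a single vector over a toy prime field \(\mathbb {F}_{p}\), and `None` plays the role of \(\bot \).

```python
def simulate_oracle_answer(extracted, sample_query, decide, p):
    """Answer one proof-verification query the way I_A does.

    extracted    -- None for "bottom", else (Pi, b) returned by the extractor
    sample_query -- returns a fresh (u, q); stand-in for Q_LIP (assumption)
    decide       -- decide(u, answers) -> 0 or 1; stand-in for D_LIP (assumption)
    """
    if extracted is None:
        # Step 2: the extractor signalled an invalid ciphertext, so reject.
        return 0
    Pi, b = extracted
    # Step 3: fresh coins, independent of the real verification state tau.
    u, q = sample_query()
    answers = [
        (sum(x * y for x, y in zip(row, q)) + bj) % p
        for row, bj in zip(Pi, b)
    ]
    return decide(u, answers)
```

Because the queries are resampled on every call, the simulated answer carries no information about \(\tau \); strong knowledge is what makes it agree with the real oracle except with small probability.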

Next, we use \(I_A\) and \(E\) to define a new (non-interactive) adversary \(A'\) that works as follows. Given a public key \({\mathsf {pk}}\) and ciphertexts \((c_{1} ,\dots ,c_{m})\), \(A'\) runs the extraction game between \(I_A\) and \(E\), and then outputs the final output \((y_{f},\pi _{f})\) of \(I_A\). Note that \(A'\) does not require access to \(V(\tau ,\cdot ,\cdot )\).

We claim that, except with negligible probability, the output of \(A'\) is equal to the (final) output of \(A\). To show this, it suffices to argue that, except with negligible probability, the answers provided by \(I_A\) to \(A\) and those provided by \(V(\tau ,\cdot ,\cdot )\) are equal. And indeed:

  • Whenever \(A\) makes a query \((y_{i},\pi _{i})\) where \(\pi _{i}\) contains a ciphertext \(c'\) such that \({\mathsf {ImVer}}_{{\mathsf {sk}}}(c')\ne 1\), \(V(\tau ,\cdot ,\cdot )\) returns 0. In such a case, the extractor \(E\) outputs \(\bot \), and so \(I_A\) answers \(A\)’s query with 0.

  • For any other query \((y_{i},\pi _{i})\) of \(A\), except with negligible probability, \(E\) outputs \((\Pi _{i},\varvec{b}_{i})\) that explains \(A\)’s output; i.e., \({\mathsf {Dec}}_{{\mathsf {sk}}}(c_{i,j}')= a_{i,j}'\) for all \(j\in [k]\), where \((a_{i,1}',\dots ,a_{i,k}')^{\top } \leftarrow \Pi _{i} \cdot (a_{1},\dots ,a_{m})^{\top } +\varvec{b}_{i}\). We now consider two cases.

    The first case is where \((\Pi _{i},\varvec{b}_{i})\) convinces the \(\text{ LIP } \) verifier with probability 1; in this case, both \(V(\tau ,\cdot ,\cdot )\) and \(I_A\) return 1.

    The second case is where \((\Pi _{i},\varvec{b}_{i})\) does not always convince the \(\text{ LIP } \) verifier; in particular, by the strong knowledge guarantee, \((\Pi _{i},\varvec{b}_{i})\) convinces the \(\text{ LIP } \) verifier with only negligible probability. We argue that, in this case, except with negligible probability, both \(V(\tau ,\cdot ,\cdot )\) and \(I_A\) return 0. This clearly holds for \(I_A\), because it samples fresh queries and state from \(Q_{\scriptscriptstyle {\mathsf {LIP}}}\). The fact that this is also the case for \(V(\tau ,\cdot ,\cdot )\) follows from semantic security. For, if this were not the case, \(I_A\) and \(E\) could be used to produce an affine function that causes \(V(\tau ,\cdot ,\cdot )\) to accept and yet does not satisfy all but a negligible fraction of \(({\mathbf {u}},\varvec{q}) \in Q_{\scriptscriptstyle {\mathsf {LIP}}}\); this would allow one to efficiently distinguish this encrypted \(\text{ LIP } \) query from an encrypted random \(\text{ LIP } \) query.

The proof of the lemma is now complete.\(\square \)