
Artemis: Efficient Commit-and-Prove SNARKs for zkML

Hidde Lycklama¹*, Alexander Viand²*, Nikolay Avramov¹, Nicolas Küchler¹, Anwar Hithnawi³

¹ETH Zurich ²Intel Labs ³University of Toronto

Abstract

The widespread adoption of machine learning (ML) in various critical applications, from healthcare to autonomous systems, has raised significant concerns about privacy, accountability, and trustworthiness. To address these concerns, recent research has focused on developing zero-knowledge machine learning (zkML) techniques that enable the verification of various aspects of ML models without revealing sensitive information. Recent advances in zkML have substantially improved efficiency; however, these efforts have primarily optimized the process of proving ML computations correct, often overlooking the substantial overhead associated with verifying the necessary commitments to the model and data. To address this gap, this paper introduces two new Commit-and-Prove SNARK (CP-SNARK) constructions (Apollo and Artemis) that effectively address the emerging challenge of commitment verification in zkML pipelines. Apollo operates on KZG commitments and requires white-box use of the underlying proof system, whereas Artemis is compatible with any homomorphic polynomial commitment and only makes black-box use of the proof system. As a result, Artemis is compatible with state-of-the-art proof systems without trusted setup. We present the first implementation of these CP-SNARKs, evaluate their performance on a diverse set of ML models, and show substantial improvements over existing methods, achieving significant reductions in prover costs and maintaining efficiency even for large-scale models. For example, for the VGG model, we reduce the overhead associated with commitment checks from 11.5x to 1.2x. Our results suggest that these contributions can move zkML towards practical deployment, particularly in scenarios involving large and complex ML models.

* These authors contributed equally to this work.

Introduction

In recent years, the use of machine learning (ML) has become increasingly pervasive, with applications ranging from personalized recommendations and healthcare diagnostics to conversational agents like ChatGPT and autonomous vehicles. As ML transitions from an academic tool to a widely used technology with real-world impacts, concerns about privacy, accountability, and trustworthiness are mounting. In response, there has been a push to regulate AI, including efforts by governments to ensure these technologies are deployed responsibly and ethically [3, 26, 33]. At the same time, the research community has increasingly recognized that ensuring the integrity and correctness of ML models is crucial for maintaining trust in these systems, especially in high-stakes domains. This, in turn, has driven a wide range of research focused on developing transparent, verifiable, and auditable machine learning methods, targeting various stages of model development and deployment [38, 31, 41, 12, 32].

Much of the ML verification and auditing research assumes access to models and their underlying data. However, this assumption is often infeasible, particularly in contexts involving sensitive data or where organizations are unwilling to share models for competitive reasons. To address this, some recent efforts have focused on leveraging cryptographic techniques to verify various properties of ML models without requiring direct access to data or models, thereby preserving the privacy needed in these applications. Specifically, many of these efforts leverage zero-knowledge proofs (ZKPs) to verify various aspects of the data and/or the model, also known as “zkML” [27, 10, 30, 29, 46]. Applying zero-knowledge proofs to ML can present significant challenges due to the scalability issues inherent in ML. However, recent advances in zkML have greatly improved its efficiency and scalability, with the most efficient approaches today leveraging advanced lookup features in modern proof systems to optimize the proving process. In particular, zkML systems based on Halo2-style proof systems have demonstrated significant performance gains, allowing for large-scale, trustless deep learning inference with minimal overhead [27, 10].

Due to their zero-knowledge nature, zkML proofs do not disclose the specifics of the model used in an inference, thereby preserving intellectual property and privacy. However, this also means that such a proof of inference only verifies the correct execution of a machine learning model without providing information about the model’s identity. Such a proof is, by itself, not generally useful in practice, as it does not ensure that the computation was performed using the intended model or that the model itself was not tampered with or replaced. Therefore, linking the proof to a specific model for which certain guarantees have been established – such as being trained under specific conditions or fulfilling particular attributes – is crucial. In practice, this link is established through cryptographic commitments to the model and by demonstrating, as part of the ZKP, that the model in the zkML proof indeed corresponds to the model that was committed to.

To date, the vast majority of research in zkML has focused primarily on enhancing the efficiency of proving ML computations while largely neglecting the overhead associated with ensuring the consistency between the model and the model commitment [27]. However, as zkML methods continue improving, the overhead associated with model commitments is becoming a significant bottleneck. Recent studies have observed that commitment-related operations can account for a substantial portion of the total overhead in inference pipelines [27, 16]. In fact, as we show in this work, for larger models, existing approaches to commitment consistency checks for zkML [4, 27, 16, 45] can dominate the overall verification time with, for some models, more than 90% of the prover’s time spent on these checks rather than on ML computations.

Commit-and-Prove SNARKs. The need to efficiently verify that a part of a witness in a ZKP matches a value that was committed to earlier arises in many contexts beyond zkML. Similar patterns occur in applications such as anonymous credential systems, e-voting schemes, verifiable encryption protocols, and decentralized auditing systems [1]. Campanelli et al. [9] formalize this as Commit-and-Prove SNARKs (CP-SNARKs), i.e., a Succinct Non-Interactive Argument of Knowledge (SNARK) that can also show that (a part of) the witness is consistent with an external commitment. For certain types of ZKPs, such as Sigma Protocols and Bulletproofs [6], expressing statements about values contained in commitments is an inherent part of the protocol. As a result, these SNARKs either directly fulfill the Commit-and-Prove SNARK (CP-SNARK) definition or can be trivially adapted to fulfill it [11, 6]. However, for most generic SNARKs, an explicit construction is required [9, 7, 1]. This can either take the form of re-computing a commitment inside the SNARK (as is done in current zkML works that consider commitments), or it can take the form of an extension to the underlying proof system (as is the case in the LegoSNARK [9] line of work).

Contributions. In this paper, we present a new approach for constructing efficient Commit-and-Prove SNARKs that incorporates minimal computation within the SNARK and extends the underlying proof system in a highly efficient manner. More concretely, this paper presents the following contributions:

1. We introduce Apollo, a new Commit-and-Prove SNARK that extends the LegoSNARK-style techniques initially proposed in Lunar [8, 7]. Apollo simplifies the construction process by minimally adapting the arithmetization of the witness within the SNARK. This optimized approach allows Apollo to achieve 7.3x improvements in prover time over Lunar. However, Apollo inherits the trusted setup requirement from Lunar [7].

2. We also propose Artemis, a new Commit-and-Prove SNARK, which only makes black-box use of the underlying SNARK and supports any homomorphic polynomial commitment. As a result, it supports modern state-of-the-art proof systems without trusted setup, such as Halo2 with IPA-based commitments [17].

3. We provide the first implementation of Lunar’s CP-SNARK, along with our implementations of the Apollo and Artemis constructions, all of which are made publicly available as open-source software (https://github.com/pps-lab/artemis).

4. We evaluate Apollo and Artemis on a diverse set of ML models, including GPT-2 [35], utilizing state-of-the-art zkML techniques for proving the correctness of ML computations [10]. Our evaluation shows that Apollo and Artemis dramatically outperform existing approaches, improving upon the state of the art by an order of magnitude. In addition, we show that Artemis without trusted setup achieves effectively the same performance as Apollo (and Artemis) with trusted setup.

Background

We begin by defining the two core building blocks of our work, namely Polynomial Commitments and (Commit-and-Prove) SNARKs. We then outline the Plonkish Arithmetization framework, which underpins the proof systems used in state-of-the-art zkML. These details will be relevant to understanding how we efficiently instantiate our construction for, e.g., the Halo2 proof system [17]. Due to space constraints, we refer to Appendix A for additional definitions.

Notation. We use the standard notation for bitstrings $\{0,1\}^{*}$ , groups ( $h$ generates $\mathbb{G}$ ) and fields $\mathbb{F}_{p}$ with order $p$ . We use bracket notation to denote ranges, e.g., $[n]=\{1,\ldots,n\}$ , and symbols representing polynomials are displayed in bold. We define a language $\mathcal{L}$ as a subset of $\{0,1\}^{*}$ and a relation $\mathcal{R}$ as a subset of $\{0,1\}^{*}\times\{0,1\}^{*}$ . The asymptotic security notions in this section are all quantified over $\lambda$ -compatible relations $\mathcal{R}_{\lambda}$ and we therefore use the simplified notation $\mathcal{R}$ instead.

Polynomial Commitments

Definition 2.1 (Polynomial Commitments [28]).

Polynomial commitments allow a prover to commit to a polynomial while retaining the ability to later reveal the polynomial’s value at any specific point, along with a proof that the revealed value is indeed correct. These commitments are an important building block for constructing succinct proofs. A polynomial commitment scheme consists of a triple (PC.Setup, PC.Commit, PC.Eval), where:

• $\textsf{{PC}.Setup}(d)\rightarrow\key[ck]$ : prepares the public parameters given the maximum supported polynomial degree $d$ and outputs a commitment key $\key[ck]$ .

• $\textsf{{PC}.Commit}(\key[ck],\mathbf{g},r)\rightarrow c$ : computes a commitment $c$ to a polynomial $\mathbf{g}$ , using randomness $r$ .

• $\textsf{{PC}.Eval}(\key[ck],c,x,y;\mathbf{g},r)\rightarrow\{0,1\}$ : a protocol in which the prover convinces the verifier that $c$ commits to a polynomial $\mathbf{g}$ such that $\mathbf{g}(x)=y$ .

A polynomial commitment scheme is secure if it provides correctness, polynomial binding, evaluation binding, and hiding properties. We refer to [28] for a formal definition of these properties. Additionally, we require that PC.Eval be an interactive argument of knowledge with knowledge soundness, ensuring the existence of an extractor that can recover the committed polynomial from any evaluation, provided it has full access to the adversary’s state.

zk-SNARKs

A proof for a relation $\mathcal{R}$ is a protocol between a prover $P$ and an efficient verifier $V$ by which $P$ convinces $V$ that $\exists w:\mathcal{R}(x,w)=1$ , where $x$ is called an instance, and $w$ a witness for $x$ . If the proof is a single message from $P$ to $V$ , it is non-interactive and consists of three polynomial-time algorithms:

• $\textsf{Setup}(1^{\lambda},\mathcal{R})\rightarrow(\texttt{pp},\texttt{vk})$ : sets up the public parameters pp and a verification key vk for a relation $\mathcal{R}$ and security parameter $\lambda$ .

• $\textsf{Prove}(\texttt{pp},x,w)\rightarrow\pi$ : if $\mathcal{R}(x,w)=1$ , outputs a proof $\pi$ .

• $\textsf{Verify}(\texttt{vk},x,\pi)\rightarrow\{0,1\}$ : verifies a proof $\pi$ for instance $x$ .

Proofs generally support a class of relations, for instance arithmetic circuits of bounded size, where $\abs{\mathcal{R}}$ denotes the size of a relation. A proof that satisfies completeness, knowledge soundness, and succinctness is a Succinct Non-Interactive Argument of Knowledge (SNARK). If the proof also satisfies zero-knowledge, i.e., it does not reveal any information other than that the statement is true, it is a zero-knowledge Succinct Non-Interactive Argument of Knowledge (zk-SNARK). We provide formal definitions of these properties in Definition A.4 in Appendix A.

Commit-and-Prove SNARK (CP-SNARK). CP-SNARKs are SNARKs where the instance contains one or more commitments to parts of the witness [9, 7, 1]. In particular, the instance contains a set of commitments, i.e., $(x,c_{1},\ldots,c_{\ell})$ , to subsets of the witness $(w,r_{1},\ldots,r_{\ell})$ where $c_{i}=\textsf{Com.Commit}(w_{i},r_{i})$ with $w_{i}$ a subset of the witness.

Definition 2.2 (CP-SNARKs [9]).

Let $\mathcal{R}$ be a relation over $\mathcal{D}_{x}\times\mathcal{D}_{w}$ where $\mathcal{D}_{w}$ splits over $\ell+1$ arbitrary domains $\mathcal{D}_{1}\times\ldots\times\mathcal{D}_{\ell}\times\mathcal{D}_{v}$ for some arity parameter $\ell\geq 1$ . We denote the sub-witnesses $w_{1},\ldots,w_{\ell},w_{v}$ following this split. Let $\text{Com}=(\textsf{Com.Setup},\textsf{Com.Commit},\textsf{Com.Verify})$ be a commitment scheme (as per Definition A.1) whose message space $\mathcal{M}$ is such that $\mathcal{D}_{i}\subset\mathcal{M}$ for all $i\in[\ell]$ . A Commit-and-Prove SNARK for a relation $\mathcal{R}$ and a commitment scheme Com is a SNARK for a relation $\mathcal{R}^{\text{Com}}$ such that:

\mathcal{R}^{\textsf{Com}}=\left\{\begin{array}{c}\left((x,c_{1},\ldots,c_{\ell}),(w,r_{1},\ldots,r_{\ell})\right):\\ (x,w)\in\mathcal{R}\\ \bigwedge_{j\in[\ell]}\textsf{Com.Verify}(\key[ck],c_{j},w_{j},r_{j})=1\end{array}\right\}

CP-SNARKs satisfy completeness, knowledge soundness, and succinctness properties similar to SNARKs. Similar to zk-SNARKs, we can also consider a zero-knowledge variant of CP-SNARKs. We refer to Campanelli et al. [9] for a formal definition of CP-SNARK properties.

Arithmetization

In the context of SNARKs that express statements over computations, the computation is generally expressed as bounded-depth arithmetic circuits. As most SNARKs internally rely on representing constraints as polynomials, arithmetization acts as an intermediary between the (circuit) computation and the polynomial representation required by the underlying proof system. Specifically, arithmetization reduces statements about computations to algebraic statements involving polynomials of a bounded degree. Some operations can be easily transformed into arithmetic operations, either because they are algebraic operations over a finite field or because they can be easily adapted to algebraic operations. However, more complex operations (e.g., comparisons or any higher-order function) are not as easily expressed in arithmetic circuits. As a result, modern SNARKs generally support more advanced arithmetization, such as lookups and custom gates that can help address this overhead. This induces a complex design problem, where different approaches to arithmetizing the same computation can give rise to proofs with vastly different efficiency. In the following, we focus on the Plonkish arithmetization that is used by many state-of-the-art proof systems, including Halo2 [17]. Halo2 is a zk-SNARK that builds upon the original Halo protocol [5] but combines it with Plonkish arithmetization to express functions or applications as circuits, as originally introduced by Plonk [18]. Specifically, Halo2 relies on UltraPLONK’s [2] arithmetization, which adds support for custom gates and lookup arguments.

Definition 2.3 (Plonkish Arithmetization [17, 2]).

Consider a grid comprised of $n$ rows (where $n=2^{k}$ for some $k$ ) with $n_{f}$ fixed columns, $n_{a}$ advice columns, and $n_{p}$ instance columns. Let $F_{i,j}\in\mathbb{F}_{p}$ be the value in the $j$ -th row of the $i$ -th fixed column, and let $A_{i,j}$ and $P_{i,j}$ be defined equivalently for advice and instance columns, respectively. Let $\{\mathbf{f}_{i}(X)\}_{i\in[n_{f}]}$ , $\{\mathbf{a}_{i}(X)\}_{i\in[n_{a}]}$ , and $\{\mathbf{p}_{i}(X)\}_{i\in[n_{p}]}$ be the polynomials representing the fixed, advice, and instance columns, respectively, where

• $\mathbf{f}_{i}(X)$ interpolates s.t. $\mathbf{f}_{i}(\omega^{j})=F_{i,j}$ for $i\in[n_{f}],j\in[n]$ ,

• $\mathbf{a}_{i}(X)$ interpolates s.t. $\mathbf{a}_{i}(\omega^{j})=A_{i,j}$ for $i\in[n_{a}],j\in[n]$ ,

• $\mathbf{p}_{i}(X)$ interpolates s.t. $\mathbf{p}_{i}(\omega^{j})=P_{i,j}$ for $i\in[n_{p}],j\in[n]$ ,

with $\omega\in\mathbb{F}_{p}$ a primitive $n$ -th root of unity for $n=2^{k}$ .

Constraints for (custom) gates are expressed as multivariate polynomials $b_{i}$ in $n_{f}+n_{a}+n_{p}+1$ indeterminates of degree at most $n-1$ , for which we only consider their evaluation at points of the form:

$\displaystyle b_{i}(X,\mathbf{f}_{1}(X),\ldots,\mathbf{f}_{n_{f}}(X),\mathbf{a}_{1}(X),\ldots,\mathbf{a}_{n_{a}}(X),\mathbf{p}_{1}(X),\ldots,\mathbf{p}_{n_{p}}(X)).$

We refer to the extensive literature on Plonkish arithmetization for details on copy and permutation constraints [11, 17, 39, 18].
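As an illustration of Definition 2.3, the following sketch interpolates a single advice column over the $n$-th roots of unity and checks that the resulting polynomial reproduces the cell values. The field size (17), the root of unity (4), and the cell values are toy assumptions chosen for readability; real systems use large fields and FFT-based interpolation rather than the quadratic Lagrange routine shown here.

```python
# Toy Plonkish column: interpolate one column over the n-th roots of unity in F_p.
P = 17       # small prime field, chosen only for illustration
N = 4        # number of rows, n = 2^k with k = 2
OMEGA = 4    # primitive 4th root of unity mod 17 (4^2 = 16, 4^4 = 1)

def poly_eval(coeffs, x, p=P):
    """Evaluate a polynomial (low-to-high coefficients) with Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def poly_mul(a, b, p=P):
    """Multiply two coefficient vectors modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def interpolate(xs, ys, p=P):
    """Lagrange interpolation: the unique polynomial of degree < len(xs)."""
    result = [0] * len(xs)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis, denom = [1], 1
        for j, xj in enumerate(xs):
            if j != i:
                basis = poly_mul(basis, [(-xj) % p, 1], p)  # multiply by (X - xj)
                denom = denom * (xi - xj) % p
        scale = yi * pow(denom, -1, p) % p                  # modular inverse (Python 3.8+)
        for k, b in enumerate(basis):
            result[k] = (result[k] + scale * b) % p
    return result

domain = [pow(OMEGA, j, P) for j in range(1, N + 1)]  # omega^1, ..., omega^n
advice_column = [3, 1, 4, 1]                          # hypothetical values A_{i,1..n}
a_i = interpolate(domain, advice_column)              # coefficients of a_i(X)
```

Evaluating `a_i` at each element of `domain` returns the original cell values, which is the property the prover's column commitments rely on.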

Related Work

In this section, we provide a concise overview of recent developments in zkML, focusing on key state-of-the-art results. We then review existing work on Commit-and-Prove SNARKs and discuss their limitations.

zkML. The field of zero-knowledge machine learning has seen rapid development in recent years, driven by the application and optimization of various proof systems for ML inference and training tasks. While there has been some work addressing ML training [44, 20, 40], the majority of research has concentrated on ML inference. Initial efforts in this area were primarily focused on convolutional neural networks (CNNs) and used early proof systems such as Groth16 [23], which are capable of proving statements about relations formulated as Quadratic Arithmetic Programs (QAPs).

For instance, ZEN [16] proposes a stranded encoding method to optimize the multiplication of multiple small fixed-point numbers in one field element. vCNN [29] and pvCNN [46] enhance support for CNN architectures by proposing arithmetizations of convolutions that significantly reduce the number of multiplications required in their QAP representation. zkCNN [30] proposes a novel technique for proving convolutions with linear prover time using a sumcheck-based protocol. However, these works do not consider more recent ML developments and are generally not practical for larger models.

More recent research has favored the Halo2 proof system, which supports Plonkish arithmetization, due to its enhanced expressiveness and the absence of a trusted setup [5]. In particular, the support for custom gates and lookup arguments enables more efficient arithmetization of complex ML layers, which were previously costly to arithmetize. Kang et al. [27] propose a construction based on Halo2 to prove inference for ImageNet-scale models, demonstrating a substantial improvement in prover time compared to earlier methods. EZKL [43, 15, 19] provides an open-source platform that can arithmetize computational graphs to Plonkish, with support for a wide variety of deep learning and data science models. Finally, ZKML [10] introduces an optimizing compiler that translates TensorFlow Lite models into Plonkish arithmetizations for Halo2, supporting a wide range of neural network layers and models for computer vision and language tasks, including language models such as GPT-2.

Commit-and-Prove SNARKs. Most zkML works overlook the issue of commitments entirely. The few that do discuss it generally propose a straightforward approach based on effectively “(re-)computing” the commitment inside the SNARK [16, 27, 45, 14]. However, commitments and SNARKs generally rely on different algebraic structures; therefore, one needs to emulate operations, such as elliptic curve computations, using a large number of arithmetic circuit operations. To address this mismatch, ZK-friendly elliptic curves (e.g., the Jubjub curve from Zcash [17]) have been proposed. These curves reduce the overhead by decreasing the number of constraints needed to verify a commitment, but, despite these improvements, they remain far from efficient. Given these limitations of ZK-friendly elliptic curves, recent research has shifted towards hash-based commitments. While conventional hash functions (e.g., SHA256) introduce more overhead than elliptic curve-based methods, ZK-friendly hash functions such as Poseidon [21] provide a more efficient alternative, outperforming elliptic curve-based commitments, including those using ZK-friendly curves. Nonetheless, our evaluation shows that even with these improvements, the overhead remains too high for zkML, particularly when dealing with large-scale models.

Campanelli et al. formalized the notion of Commit-and-Prove SNARKs (CP-SNARKs) [9] and, in LegoSNARK [9], proposed an alternative approach to constructing them by adapting the Groth16 [23] zk-SNARK into a CP-SNARK. Subsequent works have proposed CP-SNARKs for a variety of proof systems. For example, Chen et al. show how to convert sumcheck-based SNARKs to CP-SNARKs [11], though this applies only to the expensive multilinear commitments required for sumcheck. Eclipse [1] introduces a compiler that transforms Interactive Oracle Proof (IOP)-style SNARKs instantiated with Pedersen-like commitments into CP-SNARKs relying on amortized Sigma protocols. This transformation results in a proof size sublinear in the number of commitments and the size of the committed witness. However, the verifier’s computational overhead is linear in the committed input size, which significantly impacts verifier efficiency when a large portion of the witness is committed, as is the case in zkML.

Most relevant to our work, Lunar [7, 8] presents a compiler for IOP-style SNARKs with polynomial commitments by proving shifts of related polynomials using a pairing-based construction. This method offers a proof size overhead that is independent of the size of the committed witness. However, a limitation of this approach is that it only supports pairing-based polynomial commitments, which, for all currently known pairing-based polynomial commitments, requires a trusted setup [5]. Lunar does not provide an implementation and, as we show in the following sections, makes a series of simplifying assumptions about the layout and cost model of Plonkish arithmetizations. As we discuss in § 4.1, these result in significant overheads when applying Lunar in practice.

$\Pi_{Artemis}=\langle P((x,c_{1},\ldots,c_{\ell}),(w,r_{1},\ldots,r_{\ell})),V((x,c_{1},\ldots,c_{\ell}))\rangle$

$P$ : $\mu\sample\mathbb{F}_{p}$ , $r_{\mu}\sample\mathbb{F}_{p}$ , $c_{\mu}\leftarrow\textsf{{PC}.Commit}(\key[ck],\bm{\mu},r_{\mu})$ , where $\bm{\mu}$ is the degree-0 polynomial defined by $\mu$ , and send $c_{\mu}$ to $V$ .

$V$ : $\alpha\sample\mathbb{F}_{p}$ , $\beta\sample\mathbb{F}_{p}$ and send $\alpha,\beta$ to $P$ .

$P$ : define $r\leftarrow r_{\mu}+\sum_{i=0}^{\ell-1}r_{i+1}\cdot\alpha^{i}$ and $\mathbf{g}:=\bm{\mu}+\sum_{i=0}^{\ell-1}\mathbf{w_{i+1}}\cdot\alpha^{i}$ , where $\mathbf{w_{i}}$ is defined by interpreting $w_{i}$ as a coefficient vector.

$P$ : $\rho\leftarrow\mathbf{g}(\beta)$ and send $\rho$ to $V$ .

$P$ and $V$ execute the zk-SNARK $\Pi$ with $((x,\alpha,\beta,\rho),(w,\mu))$ for the relation $\left\{((x,\alpha,\beta,\rho),(w,\mu)):(x,w)\in\mathcal{R}\land\rho=(\bm{\mu}+\sum_{i=0}^{\ell-1}\mathbf{w_{i+1}}\cdot\alpha^{i})(\beta)\right\}$ and halt if $\Pi$ aborts.

$P$ and $V$ define $c\leftarrow c_{\mu}+\sum_{i=0}^{\ell-1}c_{i+1}\cdot\alpha^{i}$ .

$P$ and $V$ execute $\textsf{{PC}.Eval}(\key[ck],c,\beta,\rho;\mathbf{g},r)$ and output the result.

Figure 1: Artemis CP-SNARK. We denote $w_{1},\ldots,w_{\ell},w_{v}$ as the sub-witnesses of $w\in\mathcal{D}_{w}$ following the split of $\mathcal{D}_{w}$ .
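The aggregation of commitments in Figure 1 can be checked numerically with a toy homomorphic commitment. The sketch below is not the implementation from the paper: it assumes a Pedersen-style vector commitment over a tiny group (order 1019 inside the integers modulo 2039), which is neither binding nor hiding at this size, and it stands in for PC.Eval by simply recomputing the commitment to $\mathbf{g}$. All parameters and names are hypothetical. The group is written multiplicatively, so $c_{\mu}+\sum_{i}c_{i+1}\cdot\alpha^{i}$ becomes a product of powers.

```python
import random

Q = 1019                  # prime group order; also plays the role of F_p here
P_MOD = 2 * Q + 1         # 2039 is prime, so squares mod 2039 have order Q
DEGREE = 3                # each committed polynomial has DEGREE + 1 coefficients
GENS = [pow(k, 2, P_MOD) for k in (2, 3, 5, 7)]  # toy generators, one per coefficient
H = pow(11, 2, P_MOD)                            # toy blinding generator

def commit(coeffs, r):
    """Toy Pedersen-style commitment H^r * prod_k GENS[k]^coeffs[k] (insecure)."""
    c = pow(H, r, P_MOD)
    for g, a in zip(GENS, coeffs):   # zip truncates for short (e.g. constant) polynomials
        c = c * pow(g, a, P_MOD) % P_MOD
    return c

def poly_eval(coeffs, x):
    """Horner evaluation over F_Q, low-to-high coefficients."""
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * x + a) % Q
    return acc

rng = random.Random(0)
ell = 3
witnesses = [[rng.randrange(Q) for _ in range(DEGREE + 1)] for _ in range(ell)]
rands = [rng.randrange(Q) for _ in range(ell)]
commitments = [commit(w, r) for w, r in zip(witnesses, rands)]  # external c_1..c_ell

# Prover: sample the mask mu and commit to the degree-0 polynomial it defines.
mu, r_mu = rng.randrange(Q), rng.randrange(Q)
c_mu = commit([mu], r_mu)
# Verifier: sample the challenges.
alpha, beta = rng.randrange(Q), rng.randrange(Q)

# Prover: aggregate polynomial g and randomness r, then evaluate at beta.
g = [mu] + [0] * DEGREE
r = r_mu
for i, (w, r_i) in enumerate(zip(witnesses, rands)):
    a_i = pow(alpha, i, Q)
    g = [(gk + a_i * wk) % Q for gk, wk in zip(g, w)]
    r = (r + a_i * r_i) % Q
rho = poly_eval(g, beta)

# Both parties: aggregate commitment, computable from public values only.
c = c_mu
for i, c_i in enumerate(commitments):
    c = c * pow(c_i, pow(alpha, i, Q), P_MOD) % P_MOD

aggregation_ok = (c == commit(g, r))   # stand-in for PC.Eval(ck, c, beta, rho; g, r)
```

The final comparison succeeds because the commitment is homomorphic: the publicly computed `c` commits to exactly the polynomial `g` under randomness `r`, so a single opening at `beta` covers all $\ell$ external commitments.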

Design

We begin by presenting Apollo, a CP-SNARK in the LegoSNARK style [9] for Plonk [18, 2] and KZG-style commitments [28]. Next, we introduce Artemis, a CP-SNARK that operates with arbitrary proof systems (i.e., makes only black-box use of the underlying SNARK) that supports any homomorphic polynomial commitment. Most importantly, it supports state-of-the-art proof systems like Halo2 that do not require a trusted setup. We then provide a formal security proof for Artemis. Finally, we discuss the efficient instantiation of Artemis for proof systems like Halo2 that use Plonkish arithmetization. A detailed discussion of concrete performance and a comparison to existing approaches is deferred to the next section (cf. § 5).

Apollo: Improved CP-SNARKs for Plonk and KZG

Recent zkML advancements based on Plonk-style proof systems have significantly reduced the overhead associated with proving ML computations [43, 27, 10]. Nonetheless, these works have either overlooked commitment checks entirely or only considered "recomputation" approaches, such as Poseidon-based commitments [27, 45]. Meanwhile, outside the scope of zkML, a series of works beginning with LegoSNARK [9] have introduced alternative approaches for handling commitment checks. These approaches are based on the insight that SNARKs, in general, inherently require committing to the witness internally. As a result, these works bypass the need to add costly recomputation constraints to the SNARK by constructing specialized proofs that link these internal witness commitments to external commitments. This has the potential to dramatically improve performance. However, as we discuss below, these solutions unfortunately have significant limitations in practice. We address these limitations by introducing Apollo, which significantly optimizes the state-of-the-art approach.

Lunar [7] proposes a LegoSNARK-style construction for Plonk-style proof systems that represents the current state of the art. One of the key challenges in this approach to CP-SNARKs is that the internal witness commitments usually do not directly correspond to the commitments we want to verify. For example, internal commitments generally commit to more than just the part of the witness that we are interested in. As a result, Lunar actually introduces two specialized proofs: a shifting proof ( $\mathsf{CP}_{\text{link}}^{(2)}$ in [7]) that effectively aligns the external and internal commitments, and the core linking proof ( $\mathsf{CP}_{\text{link}}^{(1)}$ in [7]) that establishes that the external commitments indeed commit to the witness values.

For example, in the Plonkish arithmetization (cf. Definition 2.3), the witness values of interest (e.g., the model weights) might appear across a variety of rows and columns in the grid. As part of the proof, the prover commits to a polynomial encoding of each column in the grid. Thus, the witness values of interest will be spread across multiple commitments and also across the entire evaluation domain. The shifting proof in Lunar shows that the original external commitment and a shifted version that aligns the values to the evaluation domain of the witness values are commitments to the same underlying polynomial. Lunar’s construction only operates on a single column, i.e., in the case of witness values being spread across multiple columns and, therefore, commitments, multiple instances of the shifting and linking proofs are required. More importantly, Lunar assumes that the values for each external commitment appear contiguously inside the witness column, which is unlikely to be the case for zkML. Whenever a value appears out-of-order, or after a gap, additional shifting and linking proofs are required. As a result, Lunar incurs significant overheads when applied to real-world settings because of the large number of additional shifting and linking proofs needed to align complicated real-world arithmetizations with the external commitments. We show in our evaluation (cf. § 5) that these overheads are significant in practice.

In Apollo, our key insight is that instead of addressing the complexity of aligning commitments with complex real-world arithmetizations through multiple external proofs, we can exploit the flexibility of Plonkish arithmetizations to perform the alignment once inside the Plonk proof. Specifically, we add an advice column $a_{n_{a}+1}$ that contains the witness values of interest in the same sequence as they appear in the external commitment, and already aligned correctly on the evaluation domain. We then add a copy constraint $b_{i}$ for each witness value to link the new copies to their original cells in the grid. With this, we can directly perform the linking proof between the new advice column $a_{n_{a}+1}$ and the external commitments $c_{1},\ldots,c_{\ell}$ . In case there are more witness values than can fit in a single column, we overflow into additional columns, each requiring one additional linking proof.
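The alignment step can be pictured with a small data-structure sketch. The grid layout, the cell coordinates, and the helper name `add_alignment_column` are hypothetical and only illustrate the bookkeeping; an actual Halo2 circuit would express the copies through the proof system's equality constraints, and the overflow into further columns is omitted here.

```python
# Hypothetical 4-row grid with two advice columns (column index -> cell values).
advice = {
    0: [7, 0, 3, 9],
    1: [0, 5, 0, 2],
}
# (column, row) cells holding committed witness values, listed in the order
# in which the values appear in the external commitment.
weight_cells = [(1, 1), (0, 2), (0, 0), (1, 3)]

def add_alignment_column(advice, cells, n_rows):
    """Build one extra advice column holding the values in commitment order,
    plus one copy constraint per value linking it back to its original cell."""
    if len(cells) > n_rows:
        raise ValueError("overflow into additional columns is not handled in this sketch")
    new_col = max(advice) + 1
    column = [0] * n_rows
    copy_constraints = []
    for row, (src_col, src_row) in enumerate(cells):
        column[row] = advice[src_col][src_row]
        copy_constraints.append(((new_col, row), (src_col, src_row)))
    return new_col, column, copy_constraints

new_col, aligned, constraints = add_alignment_column(advice, weight_cells, n_rows=4)
```

Here `aligned` is the single column that the linking proof would connect to the external commitments, and `constraints` lists the cell pairs that must be equal.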

Our approach entirely removes the need to perform the shifting proofs and dramatically reduces the number of linking proof instances. For example, for an inference proof for MobileNet [37], Lunar requires 20 shifting and 20 linking proofs, while Apollo requires only a single linking proof. We omit formal proofs for Apollo, as we directly use the linking protocol ( $\mathsf{CP}_{\text{link}}^{(1)}$ ) from Lunar [7] and otherwise merely extend the arithmetization of the underlying SNARK in a straightforward manner. Though Apollo represents a significant advance compared to the existing state of the art, it inherits some of the shortcomings of Lunar’s construction. Specifically, both Lunar and Apollo are white-box constructions that depend very explicitly on details of the arithmetization, the commitments, and the proof system. In addition, Lunar’s linking proof ( $\mathsf{CP}_{\text{link}}^{(1)}$ ) requires a pairing-based polynomial commitment, i.e., KZG, which requires a trusted setup. Therefore, we next discuss Artemis, which addresses these drawbacks.

Artemis: Efficient CP-SNARKs w/o Trusted Setup

Current CP-SNARK constructions are closely tied to the specific proof systems and commitments they employ. In contrast, the re-computing approach is more general, as it treats the underlying proof system as a black box, but this flexibility comes with considerable overhead. With Artemis, we propose a new approach for CP-SNARKs that achieves both generality and efficiency. Our approach is compatible with any homomorphic polynomial commitment and any generic proof system (i.e., we only make black-box use of the proof system).

Polynomial Commitments. To verify the consistency of committed witness elements in a homomorphic polynomial commitment, consider the following setup. Let $w_{i}$ for $i\in[\ell]$ be the part of the witness $w$ (cf. Definition 2.2) that is committed to in a homomorphic polynomial commitment $c_{i}$ (cf. Definition 2.1). For our CP-SNARK, we need to check $c_{i}\stackrel{{\scriptstyle?}}{{=}}\textsf{{PC}.Commit}(\key[ck],\mathbf{w_{i}},r_{i})$ for $r_{i}\in\mathbb{F}_{p}$ , where $\mathbf{w_{i}}$ denotes the polynomial encoding of $w_{i}$ . Equivalently, we can express this as checking that $\mathbf{w_{i}}\stackrel{{\scriptstyle?}}{{=}}\mathbf{w_{i}}^{\prime}$ , where $\mathbf{w_{i}}^{\prime}\in\mathbb{F}_{p}\left[X\right]$ s.t. $c_{i}=\textsf{{PC}.Commit}(\key[ck],\mathbf{w_{i}}^{\prime},r_{i}^{\prime})$ for $r_{i}^{\prime}\in\mathbb{F}_{p}$ . At first glance, this might suggest that verification via the SNARK requires re-computing the commitment, as is typically required with Poseidon hash-based commitments. However, as polynomial commitments offer the ability to compute an opening of the polynomial evaluated at a specific point (PC.Eval), we can simplify the process by evaluating $\mathbf{w_{i}}$ on a random point $\beta$ and checking that it matches the opening of the commitment at $\beta$ (i.e., $\textsf{{PC}.Eval}(\key[ck],c_{i},\beta,\mathbf{w_{i}}(\beta);\mathbf{w_{i}}^{\prime},r_{i}^{\prime})=1$ ). This approach relies on the well-known DeMillo–Lipton–Schwartz–Zippel Lemma (cf. Lemma A.5 in Appendix A), which states that a non-zero polynomial of degree $d$ over a field $\mathbb{F}$ evaluates to zero at a uniformly random point with probability at most $d/\abs{\mathbb{F}}$ . Consequently, if $\mathbf{w_{i}}\neq\mathbf{w_{i}}^{\prime}$ , their evaluations at $\beta$ agree only with this small probability.
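The role of the lemma can be seen in a small exhaustive experiment. The sketch below uses a toy field of size 101 and two hypothetical degree-3 polynomials whose difference is $(X-1)(X-2)(X-3)$, i.e., the worst case in which two distinct polynomials agree on as many points as their degree allows.

```python
P = 101  # toy prime field; real instantiations use fields of roughly 2^255 elements

def poly_eval(coeffs, x, p=P):
    """Horner evaluation, low-to-high coefficients."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

w = [5, 17, 42, 8]         # hypothetical witness polynomial w_i
w_prime = [11, 6, 48, 7]   # w_i - (X - 1)(X - 2)(X - 3): a different polynomial

# Every challenge beta for which a cheating prover would pass the evaluation check.
agree = [beta for beta in range(P) if poly_eval(w, beta) == poly_eval(w_prime, beta)]
degree = len(w) - 1
bound = degree / P         # the d / |F| bound from the lemma
```

Only three of the 101 possible challenges make the two evaluations coincide, matching the bound of `degree / P`; over a cryptographically sized field, this probability is negligible.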

Efficient Checking of Polynomial Commitments. Evaluating the polynomial $\mathbf{w_{i}}$ corresponding to the witness at a random point $\beta$ requires only a few arithmetic operations, and can therefore be done very efficiently inside the SNARK. In contrast, opening a polynomial commitment at a specific point inside the SNARK is generally more expensive than recomputing the commitment. For example, for KZG commitments, this opening requires essentially the same computation as the original commitment and also additional pairing operations. With a polynomial commitment, however, we can use PC.Eval outside the SNARK to evaluate the commitment at the same random point without leaking the entire polynomial. Specifically, instead of releasing the evaluation $\mathbf{w_{i}}(\beta)$ itself from the SNARK, we release the masked value $\rho_{i}=\mathbf{w_{i}}(\beta)+\mu_{i}$ for a random masking value $\mu_{i}\sample\mathbb{F}_{p}$. Using a homomorphic polynomial commitment, the prover can easily provide a commitment $c_{\mu_{i}}=\textsf{{PC}.Commit}(\key[ck],\bm{\mu}_{i},r_{\mu_{i}})$ where $\bm{\mu}_{i}$ is the polynomial encoding of $\mu_{i}$ and $r_{\mu_{i}}\sample\mathbb{F}_{p}$. This allows us to run PC.Eval not on $(\key[ck],c_{i},\beta,\rho_{i};\mathbf{w_{i}},r_{i})$ but on $(\key[ck],c_{i}+c_{\mu_{i}},\beta,\rho_{i};\mathbf{w_{i}}+\bm{\mu}_{i},r_{i}+r_{\mu_{i}})$, removing any potential leakage.

Aggregating Multiple Commitments. Up to this point, we have considered each $\mathbf{w_{i}}$ and $c_{i}$ individually. However, a key advantage of our approach, particularly in comparison to Lunar and Apollo, is the ability to aggregate all $\mathbf{w_{i}}$ and $c_{i}$, thereby reducing the number of commitment checks to a single PC.Eval operation. To achieve this, we compute a linear combination with an additional challenge $\alpha$ from the verifier. Specifically, we set:

\rho=(\bm{\mu}+\textstyle\sum_{i}\mathbf{w_{i}}\cdot\alpha^{i})(\beta)

where $\bm{\mu}$ is the polynomial encoding of a single random masking value $\mu\sample\mathbb{F}_{p}$ , and $\alpha\sample\mathbb{F}_{p}$ is the verifier-provided challenge. We then run a single instance of PC.Eval:

\textsf{{PC}.Eval}(\key[ck],c_{\mu}+\textstyle\sum_{i}c_{i}\cdot\alpha^{i},\beta,\rho;\bm{\mu}+\textstyle\sum_{i}\mathbf{w_{i}}\cdot\alpha^{i},r_{\mu}+\textstyle\sum_{i}r_{i}\cdot\alpha^{i}).

We show in our proof that the knowledge soundness error this introduces is negligible. Note that this is distinct from the usual batch opening that some polynomial commitments support (e.g., employed in Plonk with KZG commitments [17]). Since we only consider openings of commitments at the same point, and do not need to verify each result individually but only the aggregated value, our optimization technique applies to any homomorphic polynomial commitment.
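The aggregation can be illustrated with a small Python sketch. It is only a toy under loudly stated assumptions: the Pedersen-style vector commitment `commit` below is an insecure stand-in for a homomorphic PC over a tiny group, the mask $\bm{\mu}$ is assumed to be encoded as a constant polynomial, and the evaluation proof of PC.Eval is omitted entirely. The additive notation $c_{\mu}+\sum_{i}c_{i}\cdot\alpha^{i}$ from the text corresponds to multiplication and exponentiation modulo `Q` here.

```python
import random

# Toy parameters (assumptions, far too small to be secure): Q = 2*P + 1.
Q = 2039            # modulus of the ambient group
P = 1019            # prime order of the subgroup of squares mod Q (plays the role of p)
D = 4               # number of coefficients per witness polynomial
GENS = [pow(j + 2, 2, Q) for j in range(D)]  # toy generators 4, 9, 16, 25
H = pow(7, 2, Q)                             # toy generator for the randomness

def commit(coeffs, r):
    """Toy homomorphic commitment: prod_j GENS[j]^coeffs[j] * H^r mod Q."""
    c = pow(H, r, Q)
    for g, a in zip(GENS, coeffs):
        c = c * pow(g, a, Q) % Q
    return c

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x over F_P."""
    return sum(a * pow(x, j, P) for j, a in enumerate(coeffs)) % P

rng = random.Random(1)
ell = 3
ws = [[rng.randrange(P) for _ in range(D)] for _ in range(ell)]
rs = [rng.randrange(P) for _ in range(ell)]
cs = [commit(w, r) for w, r in zip(ws, rs)]

# Assumption: the mask mu is encoded as a constant polynomial.
mu = [rng.randrange(P)] + [0] * (D - 1)
r_mu = rng.randrange(P)
c_mu = commit(mu, r_mu)

alpha, beta = rng.randrange(1, P), rng.randrange(P)

# Prover side: g = mu + sum_i alpha^i * w_i and r = r_mu + sum_i alpha^i * r_i.
g, r = list(mu), r_mu
for i, (w, r_i) in enumerate(zip(ws, rs), start=1):
    a_i = pow(alpha, i, P)
    g = [(gj + a_i * wj) % P for gj, wj in zip(g, w)]
    r = (r + a_i * r_i) % P
rho = evaluate(g, beta)

# Verifier side: c = c_mu + sum_i alpha^i * c_i, written multiplicatively.
c = c_mu
for i, c_i in enumerate(cs, start=1):
    c = c * pow(c_i, pow(alpha, i, P), Q) % Q
```

The verifier's combined commitment `c` opens to the aggregated polynomial `g` with randomness `r`, which is the single instance on which PC.Eval would be run.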

$\Pi_{Artemis}$ Construction. We construct $\Pi_{Artemis}$, a (zk) CP-SNARK for a relation $\mathcal{R}$ matching Definition 2.2 and a homomorphic polynomial commitment scheme PC, given PC and a SNARK (or zk-SNARK) $\Pi$ for a relation $\mathcal{R}^{\prime}$ that we define below. We provide the complete protocol in Figure 1, and focus on discussing key points below. The prover commits to a random mask $\mu$ (specifically, its polynomial encoding $\bm{\mu}$) and sends the commitment $c_{\mu}$ to the verifier. The verifier replies with two challenge values, $\alpha$ and $\beta$. The prover uses $\alpha$ to compute a linear combination $\mathbf{g}$ of the witnesses $\mathbf{w_{i}}$ masked with $\bm{\mu}$. The prover then evaluates $\mathbf{g}$ at $\beta$ and sends the resulting value $\rho$ to the verifier. This enables the prover and verifier to run a SNARK $\Pi$ for a slightly extended version of the original relation:

\mathcal{R}^{\prime}=\left\{((x,\alpha,\beta,\rho),(w,\mu)):(x,w)\in\mathcal{R}\land\rho=\left(\bm{\mu}+\textstyle\sum_{i=1}^{\ell}\mathbf{w_{i}}\cdot\alpha^{i}\right)(\beta)\right\}

That is, we extend the original relation by a simple masked polynomial evaluation of a linear combination of the witness polynomials at a challenge point. In practice, this does not introduce a significant overhead, and we discuss how to augment arithmetizations of $\mathcal{R}$ to efficient arithmetizations of $\mathcal{R}^{\prime}$ in § 4.4. Assuming $\Pi$ does not abort, the protocol then proceeds to use PC.Eval to show that the commitments evaluate to the same value. To this end, both verifier and prover compute a masked linear combination of the commitments using $\alpha$, which is possible due to their homomorphic nature. This, together with the masked linear combination of the witness polynomials and a masked linear combination of the commitment randomnesses, forms the input to PC.Eval. By (repeated applications of) the DeMillo–Lipton–Schwartz–Zippel Lemma, this check will succeed (with high probability) only if the witnesses in the SNARK indeed agree with the committed values. We provide a full proof of security for Artemis in § 4.3.

Scheme	$\|\pi\|$	Prove (time)	Verify (time)
Eclipse [1]	$O(\log(\ell\cdot d))$	$O(\ell\cdot d)$	$O(\ell\cdot d)$ $\mathbb{G}$
Lunar [7]	$O(\ell)$	$O(\ell\cdot d)$	$O(\ell)$ $\mathsf{P}$
Apollo (§4.1)	$O(\ell)$	$O(\ell\cdot d)$	$O(\ell)$ $\mathsf{P}$
Artemis (§4.2)	$O(\ell)$	$O(\ell\cdot d)$	$O(\ell)$ $\mathbb{G}$ + $O(1)$ $\mathsf{P}$

Table 1: Asymptotic comparison of the overhead introduced by CP-SNARKs for Plonkish relations with KZG-style commitments, with $\ell$ input commitments each of size $d$. Prover time is expressed in group operations, while verifier time is split into group exponentiations ($\mathbb{G}$) and pairings ($\mathsf{P}$).

Cost Analysis. We provide a comparison of the asymptotic complexity of CP-SNARKs that support Plonkish relations in Table 1. We consider only instantiations using KZG-style commitments, as Lunar and Apollo are only compatible with these. We report the commitment checking overhead, i.e., the overhead of the CP-SNARK over an equivalent non-CP SNARK for the same relation. All approaches introduce linear prover overhead, which is likely optimal as simply reading the external commitments already induces such an overhead. Similarly, all approaches add linear overhead to the proof size. While Lunar, Apollo and Artemis all add verifier overhead that is linear in the number of external commitments, we note that for Artemis, we require only a single pairing operation with the linear overhead only consisting of efficient group operations. We note that asymptotics do not provide a full picture of performance. For example, Apollo introduces the same asymptotic overhead as Lunar [7], but is significantly faster in practice. We refer to § 5 for a detailed evaluation of concrete performance.

Security Proof for Artemis

We now show that Artemis is a CP-SNARK. A technicality in the proof is that for knowledge soundness, our extractor must be able to extract the randomness of the individual commitments, even though we only have a single evaluation proof that is masked by a random value. To do so, our extractor internally invokes the extractor of PC several times to reconstruct the randomness from different evaluation proofs.

Theorem 4.1 (Artemis CP-SNARK).

$\Pi_{Artemis}$ in Fig. 1 is a CP-SNARK for the relation $\mathcal{R}$ and commitment scheme PC. If $\Pi$ is zero-knowledge and PC is hiding, then $\Pi_{Artemis}$ is zero-knowledge.

Proof.

$\Pi_{Artemis}$ satisfies the properties of a CP-SNARK:

Completeness. Completeness follows from the completeness of the SNARK and the homomorphic and completeness properties of the polynomial commitment scheme. By the completeness of the SNARK $\Pi$, $P$ convinces $V$ with high probability that $((x,\alpha,\beta,\rho),(w,\mu))\in\mathcal{R}^{\prime}$. Hence, it holds that $\rho=(\bm{\mu}+\sum_{i}\mathbf{w_{i}}\cdot\alpha^{i})(\beta)$. Further, since PC is homomorphic, it holds that

\begin{aligned}
c&=c_{\mu}+\textstyle\sum_{i=1}^{\ell}c_{i}\cdot\alpha^{i}\\
&=\textsf{{PC}.Commit}(\key[ck],\bm{\mu},r_{\mu})+\textstyle\sum_{i=1}^{\ell}\textsf{{PC}.Commit}(\key[ck],\mathbf{w_{i}},r_{i})\cdot\alpha^{i}\\
&=\textsf{{PC}.Commit}(\key[ck],\bm{\mu}+\textstyle\sum_{i=1}^{\ell}\mathbf{w_{i}}\cdot\alpha^{i},r_{\mu}+\textstyle\sum_{i=1}^{\ell}r_{i}\cdot\alpha^{i})
\end{aligned}

Hence, the opening proof of the PC for $c$ evaluates to $\rho$ at $\beta$ due to the homomorphic property of the scheme. $V$ accepts because of the completeness of the polynomial commitment scheme.

Knowledge Soundness. Our goal is to extract a witness $(w,r_{1},\ldots,r_{\ell})$ that satisfies the relation $\mathcal{R}^{\prime}$ given an instance $(x,c_{1},\ldots,c_{\ell})$ . Specifically, $(w,r_{1},\ldots,r_{\ell})$ such that $(x,w)\in\mathcal{R}$ and $c_{i}$ opens to $\mathbf{w_{i}}$ with randomness $r_{i}$ for all $i\in[\ell]$ . At a high level our extractor $\mathcal{E}_{Artemis}$ works as follows:

1. Extract the witness $\tilde{w}$ of the SNARK $\Pi$ using $\mathcal{E}_{\Pi}$.
2. Execute the protocol $\ell+1$ times for distinct challenges by rewinding the prover to the second step of the protocol, in order to reconstruct the masked polynomial defined by the randomness of the commitments $\tilde{r}_{1},\ldots,\tilde{r}_{\ell}$ from the different values $r$ output by $\mathcal{E}_{\textsf{PC}}$ on the evaluation proof for $c$.
3. Return $(\tilde{w},\tilde{r}_{1},\ldots,\tilde{r}_{\ell})$.

We now provide a detailed proof.
Suppose that $\adv$ convinces $V$ that $(x,w)\in\mathcal{R}^{\prime}$ with non-negligible probability. We show that there exists $\mathcal{E}_{Artemis}$ that, assuming the existence of extractors $\mathcal{E}_{\Pi}$ for $\Pi$ and $\mathcal{E}_{\textsf{PC}}$ for PC, outputs a valid witness $(w,r_{1},\ldots,r_{\ell})$ for $\mathcal{R}^{\prime}$ with non-negligible probability given access to $\adv$.

We first invoke the extractor $\mathcal{E}_{\Pi}$, which exists due to the knowledge soundness of $\Pi$. Upon receiving the same input as $\adv_{\Pi}$, $\mathcal{E}_{\Pi}$ outputs a witness $(\hat{w},\hat{\mu})$ after interacting with $\adv$ such that $((x,\alpha,\beta,\rho),(\hat{w},\hat{\mu}))$ satisfies the relation $\mathcal{R}^{\prime}$. If the cheating prover $\adv$ convinces the $\Pi_{Artemis}$ verifier $V$, then the proof for $\Pi$ is valid, so this extraction succeeds except with negligible probability $\epsilon_{\Pi}$. Hence, in the following, we know that $(x,\hat{w})\in\mathcal{R}$ and $\rho=\hat{\mathbf{g}}(\beta)$, where $\hat{\mathbf{g}}$ is defined as in the protocol using $\hat{w}$ and $\hat{\mu}$.

The extractor then samples $\ell+1$ distinct random challenges for $\alpha$ and runs the protocol with $\adv$ $\ell+1$ times, rewinding the prover to the second step of the protocol where it receives $\alpha$ from $V$. On each iteration, if the cheating prover $\adv$ convinces the $\Pi_{Artemis}$ verifier $V$, then the verifier outputs 1 after the evaluation protocol except with negligible probability $\epsilon_{\textsf{PC}}$. The extractor $\mathcal{E}_{\textsf{PC}}$ returns a polynomial $\mathbf{g}^{\prime}$ such that $\rho=\mathbf{g}^{\prime}(\beta)$, as well as a decommitment $r^{\prime}$ for $c^{\prime}$. Suppose that $\mathbf{g}^{\prime}\neq\hat{\mathbf{g}}$. Then, because $\beta$ was sampled uniformly at random, it follows from the DeMillo–Lipton–Schwartz–Zippel Lemma (Lemma A.5 in Appendix A) that:

\Pr\left[\textsf{{PC}.Eval}(\key[ck],c^{\prime},\beta,\rho;\mathbf{g}^{\prime},r^{\prime})=1\right]\leq\frac{d}{p}\enspace,

where $d=\max_{i}\abs{w_{i}}$ . Hence, $\mathbf{g}^{\prime}=\hat{\mathbf{g}}$ with overwhelming probability.

With the $\ell+1$ pairs $(\alpha^{(j)},r^{\prime,(j)})$ of challenges and extracted decommitments, $\mathcal{E}_{Artemis}$ reconstructs the randomness of the individual commitments. Specifically, interpolating these $\ell+1$ points yields the unique polynomial of degree at most $\ell$ whose coefficients $\tilde{r}_{\mu},\tilde{r}_{1},\ldots,\tilde{r}_{\ell}$ satisfy

\tilde{r}_{\mu}+\textstyle\sum_{i=1}^{\ell}\tilde{r}_{i}\cdot(\alpha^{(j)})^{i}=r^{\prime,(j)}

for all $j\in[\ell+1]$ . The probability that $\tilde{r}_{1},\ldots,\tilde{r}_{\ell}\neq r_{1},\ldots,r_{\ell}$ depends on the probability that any of the points $\alpha^{(j)},r^{\prime,(j)}$ is not on $\hat{\mathbf{g}}$ or fails to be extracted by $\mathcal{E}_{\textsf{PC}}$ , and is bounded by

(\ell+1)\cdot\Pr\left[\mathbf{g}^{\prime}\neq\hat{\mathbf{g}}\right]\leq(\ell+1)\cdot\left(\frac{d}{p}+\epsilon_{\textsf{PC}}\right).

As a result, the total soundness error is bounded by

\begin{aligned}
&\epsilon_{\Pi}+\frac{(\ell+1)\cdot d}{p}+(\ell+1)\cdot\epsilon_{\textsf{PC}}\\
\leq{}&\epsilon_{\Pi}+\frac{2\cdot\abs{w}}{p}+(\abs{w}+1)\cdot\epsilon_{\textsf{PC}}
\end{aligned}

because $\ell\cdot d$ is at most the size of the witness $\abs{w}$ , resulting in a soundness error that is negligible in the security parameter $\lambda$ . Finally, the extractor performs the rewinding procedure an expected $O(\ell)$ times, resulting in a running time of $\mathcal{E}_{Artemis}$ linear in $\abs{x}$ and $\abs{w}$ .
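The interpolation step of the extractor can be sketched in Python as follows. This is an illustrative sketch only: `interpolate_coefficients` is a hypothetical helper implementing plain Lagrange interpolation over a prime field, the modulus `P` is a stand-in for $p$, and the randomness values and challenges are made-up toy numbers. It requires Python 3.8 or later for modular inverses via `pow`.

```python
# Assumption: a Mersenne prime stands in for the field modulus p.
P = 2**61 - 1

def interpolate_coefficients(points, p=P):
    """Return the coefficients (ascending) of the unique polynomial of degree
    < len(points) passing through the given (x, y) points over F_p."""
    n = len(points)
    coeffs = [0] * n
    for j, (xj, yj) in enumerate(points):
        basis = [1]   # numerator polynomial prod_{m != j} (X - x_m)
        denom = 1     # denominator prod_{m != j} (x_j - x_m)
        for m, (xm, _) in enumerate(points):
            if m == j:
                continue
            nxt = [0] * (len(basis) + 1)
            for k, b in enumerate(basis):
                nxt[k] = (nxt[k] - xm * b) % p      # contribution of -x_m * b
                nxt[k + 1] = (nxt[k + 1] + b) % p   # contribution of X * b
            basis = nxt
            denom = denom * (xj - xm) % p
        scale = yj * pow(denom, -1, p) % p
        for k, b in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * b) % p
    return coeffs

# Hypothetical randomness (r_mu, r_1, r_2, r_3) for ell = 3 commitments.
r_secret = [11, 22, 33, 44]
# ell + 1 distinct challenges alpha^(j) chosen by the extractor.
alphas = [2, 3, 5, 7]
# Each rewound run yields r'^(j) = r_mu + sum_i r_i * (alpha^(j))^i.
points = [(a, sum(r * pow(a, i, P) for i, r in enumerate(r_secret)) % P)
          for a in alphas]
recovered = interpolate_coefficients(points)
```

Because the combined randomness is a polynomial of degree $\ell$ in $\alpha$, $\ell+1$ distinct challenges determine its coefficients uniquely, which is why the extractor needs exactly that many successful runs.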

Zero-knowledge. $\Pi_{Artemis}$ satisfies zero-knowledge if PC is hiding and $\Pi$ is a zk-SNARK. Concretely, we show that there exists a simulator $\mathsf{Sim}_{Artemis}$ that, assuming the existence of a simulator $\mathsf{Sim}_{\textsf{PC}}$ for PC and a simulator $\mathsf{Sim}_{\Pi}$ for $\Pi$, outputs a valid transcript when given an instance $(x,c_{1},\ldots,c_{\ell})$ as input. We will show that the transcript generated by $\mathsf{Sim}_{Artemis}$ is statistically indistinguishable from the view of an honest verifier $V$ running the interactive protocol $\Pi_{Artemis}$ with the prover $P$ holding a valid instance and witness $((x,c_{1},\ldots,c_{\ell}),(w,r_{1},\ldots,r_{\ell}))\in\mathcal{R}$.

At a high level, the simulator $\mathsf{Sim}_{Artemis}$ must generate a valid transcript consisting of $c_{\mu}$ , $\rho$ and valid transcripts for $\Pi$ and PC for a given instance $(x,c_{1},\ldots,c_{\ell})$ and challenges $\alpha,\beta$ . The primary challenge is in generating the transcript for PC, as $\mathsf{Sim}_{\Pi}$ will create a suitable transcript no matter what value of $\beta$ the simulator passes to it (as long as it is consistent with other uses of $\beta$ ). However, an instance $(c,\beta,\rho)$ for PC is valid only if the polynomial $\mathbf{g}$ inside $c$ evaluated at $\beta$ equals $\rho$ . In addition, we need to ensure that $c=c_{\mu}+\textstyle\sum_{i=1}^{\ell}c_{i}\cdot\alpha^{i}$ , as this is how the verifier computes $c$ in Artemis. This can easily be achieved by setting $c_{\mu}=c_{\rho}-\textstyle\sum_{i=1}^{\ell}c_{i}\cdot\alpha^{i}$ where $c_{\rho}$ is a commitment to a polynomial that evaluates to $\rho$ everywhere. More specifically, the simulator $\mathsf{Sim}_{Artemis}$ proceeds as follows:

1. Sample two random values $\rho^{\prime}\sample\mathbb{F}_{p}$ and $r_{\mu}\sample\mathbb{F}_{p}$.
2. Compute $c_{\rho}^{\prime}=\textsf{{PC}.Commit}(\key[ck],\mathbf{g}_{\rho^{\prime}},r_{\mu})$ where $\mathbf{g}_{\rho^{\prime}}$ is the 0-degree polynomial defined by $\rho^{\prime}$.
3. Compute $c_{\mu}^{\prime}=c_{\rho}^{\prime}-\textstyle\sum_{i=1}^{\ell}c_{i}\cdot\alpha^{i}$.
4. Invoke $\mathsf{Sim}_{\textsf{PC}}$ to generate a transcript $\tau_{\textsf{PC}}$ on instance $(c_{\rho}^{\prime},\beta,\rho^{\prime})$.
5. Invoke $\mathsf{Sim}_{\Pi}$ to generate a transcript $\tau_{\Pi}$ on instance $(x,\alpha,\beta,\rho^{\prime})$.
6. Output the tuple $(c_{\mu}^{\prime},\rho^{\prime},\tau_{\textsf{PC}},\tau_{\Pi})$.

The transcript output by $\mathsf{Sim}_{Artemis}$ is valid for $(x,c_{1},\ldots,c_{\ell})$ and challenges $\alpha,\beta$, because $c=c_{\mu}^{\prime}+\textstyle\sum_{i=1}^{\ell}c_{i}\cdot\alpha^{i}=c_{\rho}^{\prime}-\textstyle\sum_{i=1}^{\ell}c_{i}\cdot\alpha^{i}+\textstyle\sum_{i=1}^{\ell}c_{i}\cdot\alpha^{i}=c_{\rho}^{\prime}$, resulting in a valid instance $(c_{\rho}^{\prime},\beta,\rho^{\prime})$ for PC. The distribution of $c_{\mu}^{\prime}$ is the same as that of $c_{\mu}$ in the real interaction due to the hiding property of PC. The distribution of the evaluation $\rho^{\prime}$ output by $\mathsf{Sim}_{Artemis}$ is the same as that of $\rho$ in the real interaction, as the former is uniformly random over $\mathbb{F}_{p}$, and the latter is masked by a random value uniformly sampled from $\mathbb{F}_{p}$. Therefore, the full transcript is indistinguishable from the transcript of the verifier interacting with the prover in the real world.

∎
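The first three simulator steps in the proof above can be sketched in Python. As before, this is a toy under explicit assumptions: `commit_constant` is an insecure Pedersen-style stand-in for PC restricted to constant polynomials, the group parameters are tiny, the commitments `cs` are arbitrary group elements, and the calls to $\mathsf{Sim}_{\textsf{PC}}$ and $\mathsf{Sim}_{\Pi}$ are omitted. Additive commitment notation is written multiplicatively, so subtraction becomes multiplication by a modular inverse (Python 3.8 or later).

```python
import random

# Toy parameters (assumptions, not secure): Q = 2*P + 1, subgroup of order P.
Q, P = 2039, 1019
G0 = pow(2, 2, Q)  # toy generator for the constant coefficient
H = pow(7, 2, Q)   # toy generator for the randomness

def commit_constant(value, r):
    """Toy commitment to the degree-0 polynomial with the given value."""
    return pow(G0, value, Q) * pow(H, r, Q) % Q

def combine(cs, alpha):
    """Compute sum_i alpha^i * c_i in multiplicative notation."""
    acc = 1
    for i, c_i in enumerate(cs, start=1):
        acc = acc * pow(c_i, pow(alpha, i, P), Q) % Q
    return acc

def simulate_commitments(cs, alpha, rng):
    """Steps 1-3 of the simulator: pick rho', commit to it, and derive c_mu'."""
    rho_sim = rng.randrange(P)
    r_mu = rng.randrange(P)
    c_rho = commit_constant(rho_sim, r_mu)
    c_mu_sim = c_rho * pow(combine(cs, alpha), -1, Q) % Q
    return c_mu_sim, rho_sim, c_rho

rng = random.Random(7)
cs = [pow(x, 2, Q) for x in (3, 5, 11)]  # stand-ins for c_1, c_2, c_3
alpha = 123
c_mu_sim, rho_sim, c_rho = simulate_commitments(cs, alpha, rng)

# The verifier recombines c = c_mu' + sum_i alpha^i * c_i.
c_verifier = c_mu_sim * combine(cs, alpha) % Q
```

The verifier's recombined commitment equals the simulator's commitment to the constant polynomial, so the instance handed to the PC simulator is consistent by construction.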

Figure 2: Simplified Visualization of a Plonkish grid with our extensions for a single commitment.

Figure 3: Visualization of a Plonkish grid with our extensions for $\ell$ commitments of size $d$.

Efficient Arithmetization for Artemis

In Artemis, we need to augment arithmetizations of $\mathcal{R}$ to an efficient arithmetization of $\mathcal{R}^{\prime}$:

\mathcal{R}^{\prime}=\left\{((x,\alpha,\beta,\rho),(w,\mu)):(x,w)\in\mathcal{R}\land\rho=\left(\bm{\mu}+\textstyle\sum_{i=1}^{\ell}\mathbf{w_{i}}\cdot\alpha^{i}\right)(\beta)\right\}

While doing this naively will generally be reasonably efficient, in the following we show an optimized approach, focusing on Plonkish arithmetizations (cf. Definition 2.3) as these are used by the state-of-the-art zkML approaches. In Figures 2 and 3, we visualize the required additions to the Plonkish grid. Note that this is not to scale: in practice, grids will have many more rows, and the vast majority of the grid will be dedicated to the original relation $\mathcal{R}$ rather than our additions.

Strawman Approach. A naive approach to arithmetizing $\mathcal{R}^{\prime}$ would be to express it as the inner product of the coefficients of the witness polynomial and the powers $\beta^{0},\ldots,\beta^{d}$. As $\beta$ is public, the verifier can easily compute these powers, resulting in fewer constraints. Unfortunately, this approach leads to a significant overhead for the verifier, as it must interpolate a polynomial for the powers of $\beta$ over the evaluation domain, resulting in a linear overhead for the verifier.

Horner’s Method. As the additional constraint that we need to add is essentially an evaluation of a polynomial at a specific point, we can utilize an arithmetization based on Horner’s method [25]. In order to illustrate this, we first consider a simplified setting, with a single commitment $c$ to witness polynomial $\mathbf{w}$ with coefficients $w$ (i.e., $\ell=1$). For this simplified setting, which we visualize in Figure 2, we will also assume that the size $d$ of the witness matches the number of rows $n$ of the Plonkish grid. We denote the individual elements of $w$ as $w^{(0)},\ldots,w^{(d-1)}$. Note that we specifically use zero-based indexing here as this is more natural when considering these elements as coefficients of $\mathbf{w}$.

According to Horner’s method, we can then compute

w^{(0)}+w^{(1)}\beta+w^{(2)}\beta^{2}+w^{(3)}\beta^{3}+\cdots

as

w^{(0)}+\beta\bigg{(}w^{(1)}+\beta\Big{(}w^{(2)}+\beta\big{(}w^{(3)}+\cdots\big{)}\Big{)}\bigg{)}.

This latter form enables a convenient recursive computation, that, in order to compute the partial evaluation down to degree $j$ only requires access to the $j$ -th coefficient, $\beta$ , and the $j+1$ -th partial evaluation. We denote the partial evaluation for the $j$ -th degree as $\rho_{j}$ . Then, we have the recurrence relation

\rho_{j}=w^{(j)}+\beta\cdot\rho_{j+1}\text{\ with \ }\rho_{d}=0.

To express this in the Plonkish grid, we extend the grid with a set of additional columns: $\mathbf{a}_{n_{a}+1}$ to store $\rho_{j+1}$, $\mathbf{a}_{n_{a}+2}$ to store $\rho_{j}$, and $\mathbf{a}_{n_{a}+3}$ to store $w^{(j)}$. We also add a selector column $\mathbf{f}_{n_{f}+1}$, and an instance column $\mathbf{p}_{n_{p}+1}$ to store $\beta$. Generally, the verifier needs to interpolate a polynomial for each instance column, which would be expensive for $\mathbf{p}_{n_{p}+1}$, as it contains values across the entire evaluation domain. However, the polynomial simply needs to evaluate to $\beta$ across the entire evaluation domain. Therefore, we can forgo the expensive interpolation and directly generate a constant polynomial $\bm{g}(X)=\beta$. We add copy constraints to ensure that the copies of the witness values correspond to their original occurrences in the arithmetization of $\mathcal{R}$. In addition, we add copy constraints to link the occurrences of each $\rho_{j}$ across both columns, i.e., $\mathbf{a}_{n_{a}+1}$ and $\mathbf{a}_{n_{a}+2}$. Finally, we add a custom gate constraint:

\begin{split}\mathbf{b}_{\textsf{h}}(X,\ldots,\mathbf{a}_{n_{a}+1}(X),\mathbf{a}_{n_{a}+2}(X),\mathbf{a}_{n_{a}+3}(X),\mathbf{p}_{n_{p}+1}(X),\mathbf{f}_{n_{f}+1}(X))\\ =\mathbf{f}_{n_{f}+1}(X)\cdot(\mathbf{a}_{n_{a}+3}(X)+\mathbf{a}_{n_{a}+1}(X)\cdot\mathbf{p}_{n_{p}+1}(X)-\mathbf{a}_{n_{a}+2}(X))\end{split}

Finally, we note that we could forgo the $\rho_{\text{prev}}$ column (i.e., $\mathbf{a}_{n_{a}+1}$, which stores $\rho_{j+1}$) and instead use a custom gate spanning two rows, saving one column. However, as in the state-of-the-art zkML approaches using Plonkish arithmetizations [10], we restrict ourselves to single-row custom gates.
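The recurrence and the per-row gate check described above can be sketched in Python. This is a plain field-arithmetic illustration, not Halo2 code: the list `rho` plays the role of the advice cells holding the partial evaluations, `gate_holds` mirrors the custom gate expression evaluated row by row, and the modulus `P` and function names are assumptions for illustration.

```python
# Assumption: a Mersenne prime stands in for the field modulus p.
P = 2**61 - 1

def horner_trace(w, beta, p=P):
    """Return [rho_0, ..., rho_d] with rho_d = 0 and
    rho_j = w[j] + beta * rho_{j+1}; rho_0 is the full evaluation."""
    d = len(w)
    rho = [0] * (d + 1)
    for j in range(d - 1, -1, -1):
        rho[j] = (w[j] + beta * rho[j + 1]) % p
    return rho

def gate_holds(w, beta, rho, p=P):
    """Check the per-row constraint w[j] + rho[j+1] * beta - rho[j] == 0,
    i.e., the gate expression with the selector switched on in every row."""
    return all((w[j] + rho[j + 1] * beta - rho[j]) % p == 0
               for j in range(len(w)))

w = [3, 1, 4, 1, 5]    # toy coefficients w^(0), ..., w^(4)
beta = 10
rho = horner_trace(w, beta)
```

Each row only touches the current coefficient, $\beta$, and the neighbouring partial evaluation, which is what allows a single-row gate combined with copy constraints between the two $\rho$ columns.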

Supporting Larger Commitments. So far, we have assumed that the size $d$ of the committed witness $w$ matches the number of rows $n$ in the Plonkish grid. Where $d$ is smaller, we can trivially pad $w$ with zeros. However, if $d$ is larger than $n$, we need to split $w$ across multiple advice columns. A straightforward approach might add a separate pair of advice columns for the intermediate values $\rho^{\prime},\rho_{prev}^{\prime}$ for each witness column, as well as multiple custom gates and selector columns. However, we can avoid this overhead by combining Horner’s method with a (generalized) even-odd decomposition approach. Specifically, we use the common observation that

w^{(0)}+w^{(1)}\beta+w^{(2)}\beta^{2}+w^{(3)}\beta^{3}+w^{(4)}\beta^{4}+w^{(5)}\beta^{5}+\cdots

can be rewritten as

\begin{split}&w^{(0)}+w^{(2)}\beta^{2}+w^{(4)}\beta^{2^{2}}+\cdots\\ +\beta&\left(w^{(1)}+w^{(3)}\beta^{2}+w^{(5)}\beta^{2^{2}}+\cdots\right)\end{split}

which can be interpreted as a combination of two polynomials in $X^{2}$. Combining this with the Horner’s method approach, we arrive at

\left(w^{(0)}+\beta w^{(1)}\right)+\beta^{2}\Bigg{(}\left(w^{(2)}+\beta w^{(3)}\right)+\beta^{2}\Big{(}\cdots\Big{)}\Bigg{)},

which gives rise to the recurrence

\rho_{j}=\left(w^{(2j)}+\beta\cdot w^{(2j+1)}\right)+\beta^{2}\cdot\rho_{j+1}

for $j\in\{0,\ldots,n-1\}$, where $n$ is the number of rows in the grid and $\rho_{n}=0$ (padding $w$ with zeros if $d<2n$). This is why we split the witness into the columns not based on sequential chunks, but instead based on even and odd terms (cf. Figure 3). We can easily adapt our custom gate to compute this new formula by introducing a new instance column $\mathbf{p}_{n_{p}+2}$ for $\beta^{2}$. As is the case for the instance column $\mathbf{p}_{n_{p}+1}$ that contains $\beta$, the verifier does not need to interpolate this, as we can directly construct the (constant) polynomial that evaluates to $\beta^{2}$ at all points. This approach generalizes to any number $k=\ceil{d/n}$ of columns: instead of splitting the polynomial into even and odd components, we split it modulo $k$. This requires the addition of an instance column for $\beta^{i}$ for each $i\in[1,k]$, but as these do not need to be interpolated, this does not impact runtime significantly.
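A minimal Python sketch of the generalized split is given below. It assumes one particular layout that is consistent with the even-odd picture, namely that row $j$ holds the $k$ consecutive coefficients $w^{(kj)},\ldots,w^{(kj+k-1)}$, one per column; the function name, the modulus `P`, and the zero padding are illustrative assumptions.

```python
# Assumption: a Mersenne prime stands in for the field modulus p.
P = 2**61 - 1

def horner_mod_k(w, beta, n, p=P):
    """Evaluate w at beta using k = ceil(len(w) / n) columns of n rows each.

    Assumed layout: row j holds w[k*j], ..., w[k*j + k - 1], and
    rho_j = sum_c beta^c * w[k*j + c] + beta^k * rho_{j+1} with rho_n = 0.
    """
    k = -(-len(w) // n)                          # ceil(d / n) columns
    padded = list(w) + [0] * (k * n - len(w))    # pad with zeros to k * n cells
    beta_pows = [pow(beta, c, p) for c in range(k + 1)]
    rho = 0
    for j in range(n - 1, -1, -1):
        row = sum(beta_pows[c] * padded[k * j + c] for c in range(k)) % p
        rho = (row + beta_pows[k] * rho) % p
    return rho

w = list(range(1, 11))    # toy witness with d = 10 coefficients
beta = 3
direct = sum(c * pow(beta, j, P) for j, c in enumerate(w)) % P
three_columns = horner_mod_k(w, beta, n=4)    # k = 3
two_columns = horner_mod_k(w, beta, n=5)      # k = 2, the even-odd case
```

With $k=2$ this reduces to the even-odd recurrence, and with $k=1$ to the single-column Horner recurrence; the number of rows traversed stays at $n$ regardless of $k$.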

Figure 4: Prover Time in minutes for KZG-based (top) and IPA-based (bottom) approaches for various models. As Apollo and Lunar only support KZG-based instantiations, they are omitted in the bottom row. Poseidon fails to scale to Diffusion and GPT-2, while Lunar fails to scale to GPT-2, as described in § 5.3, and are therefore omitted for these models.

Supporting Multiple Commitments. Finally, we consider the case with $\ell$ commitments, beginning with the naive approach, then show how this can be extended to an efficient solution for a large number of small commitments, before introducing our optimization for multiple large commitments. Similar to the naive approach to supporting larger commitments, we can resolve this by adding a pair of advice columns (for $\rho_{i}$ and ${\rho_{prev}}_{i}$) for each witness column. This introduces three advice columns per commitment. However, in cases where all commitments are small, this is highly inefficient, as the vast majority of each column will be unused. Instead, if all commitments are sufficiently small (specifically, smaller than $\frac{n}{3}$), we can more efficiently “stack” multiple commitments into a single column, and make use of the same additional advice columns (and the same custom gate) by simply setting ${\rho_{prev}}_{i}$ to zero whenever a new commitment starts. However, when each commitment might be larger than we can accommodate in a single column (as will generally be the case in zkML), we cannot apply this technique. Instead, our optimization relies on aggregating multiple commitments. The key insight here is that we can use essentially the same optimized technique we used to handle multiple columns per witness to also handle multiple witnesses. For this, we introduce additional instance columns for the powers of $\alpha$, and in our custom gate, replace each occurrence of $w^{(j)}$ with

\textstyle\sum_{i=1}^{\ell}\alpha^{i}w_{i}^{(j)}

We visualize our additions to the grid in Figure 3. For ease of presentation, we assume all commitments require $\lceil\frac{d}{n}\rceil$ columns. In practice, one can trivially adjust the custom gate in order to support different numbers of columns for each witness.
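The effect of replacing each coefficient with the $\alpha$-weighted sum can be checked with a short Python sketch. It uses the single-column Horner recurrence for simplicity; the function names and the modulus `P` are assumptions, and shorter witnesses are treated as zero-padded.

```python
# Assumption: a Mersenne prime stands in for the field modulus p.
P = 2**61 - 1

def evaluate(coeffs, x, p=P):
    """Evaluate a single polynomial (ascending coefficients) at x."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def aggregated_evaluation(ws, alpha, beta, p=P):
    """Run one Horner pass in which row j contributes sum_i alpha^i * w_i^(j)."""
    d = max(len(w) for w in ws)
    rho = 0
    for j in range(d - 1, -1, -1):
        combined = sum(pow(alpha, i, p) * (w[j] if j < len(w) else 0)
                       for i, w in enumerate(ws, start=1)) % p
        rho = (combined + beta * rho) % p
    return rho

ws = [[1, 2, 3], [4, 5], [6]]    # toy witnesses w_1, w_2, w_3
alpha, beta = 2, 3
aggregated = aggregated_evaluation(ws, alpha, beta)
separate = sum(pow(alpha, i, P) * evaluate(w, beta)
               for i, w in enumerate(ws, start=1)) % P
```

A single pass over the rows yields the same value as evaluating every witness separately and combining the results with powers of $\alpha$, so the $\rho$ columns and the gate are shared across all commitments.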

Masking. Finally, we consider $\mu$ , which needs to be added to the result of the polynomial evaluation. For the vast majority of arithmetizations of $\mathcal{R}$ , there will be suitable empty cells and existing custom gates (e.g., addition or inner products) that we can reuse, in which case we only need to add a single copy constraint to link the computed value of $\rho_{1}$ with its copy in the addition. In the rare cases where it is not possible to integrate this addition into the existing grid, we add a new row that contains only $\mu$ and a copy of $\rho_{1}$ and, if necessary, a custom gate for addition and an associated selector column.

Evaluation

In this section, we evaluate the performance of Apollo and Artemis for various computer vision and natural language processing models. We compare against the existing state of the art, namely Lunar [8] and Poseidon [21]. We focus on showing that our constructions make zkML significantly more practical, especially for large models. In addition, we show that Artemis can achieve similarly low overheads even without relying on trusted setup.

Implementation

In addition to implementing our constructions, Apollo and Artemis, we provide the first (to the best of our knowledge) complete implementation of Lunar’s $\mathsf{CP}_{\text{link}}$ construction [7]. We implement all techniques in Rust, as an extension of the Halo2 library [17], which includes implementations for KZG- [28] and IPA-based [5] zero-knowledge proofs. We instantiate the underlying group with the pairing-friendly BN256 curve for KZG-based proofs and the Pallas curve for the IPA-based proofs. This allows us to use our constructions in combination with the models used in the state-of-the-art zkML work [10], which is also based on Halo2. We make all our implementations and benchmarking configurations available as open source (https://github.com/pps-lab/artemis). Below, we discuss the implementation of each of the approaches we evaluate in more detail:

No Commitment: This baseline does not check commitments at all, as in Chen et al. [10].
Poseidon: We used a Poseidon [21] gadget provided by the Halo2 library [17].
Lunar: We implement Lunar’s CP-SNARK construction [7] for Halo2’s Plonkish arithmetization. Specifically, we implement $\mathsf{CP}_{\text{link}}^{(1)}$ and $\mathsf{CP}_{\text{link}}^{(2)}$ from [7]. We use Halo2’s underlying finite field Rust library ff. $\mathsf{CP}_{\text{link}}$ relies heavily on division of vanishing polynomials on a subset of the evaluation domain, which is not directly supported by Halo2’s polynomial implementation. Therefore, we extend this implementation with support for efficient FFT-based polynomial division to ensure competitive performance of $\mathsf{CP}_{\text{link}}$ .
Apollo: We implement Apollo (cf. § 4.1) which performs the alignment of the witness in the arithmetization using a small set of extra columns and copy constraints, resulting in a significantly more efficient $\mathsf{CP}_{\text{link}}$ . The implementation otherwise uses the same approach as Lunar.
Artemis: For Artemis (cf. § 4.2), our construction based on homomorphic polynomial commitments, we use Halo2’s standard implementation of polynomial commitments, and implement the arithmetization of polynomial evaluation using Horner’s method (cf. § 4.4) as a gadget in the Halo2 library.

Experimental Setup

We evaluate the prover time, verifier time and proof size for Halo2-based zkML inference proofs with a commitment to the model for a wide range of different models. We perform the evaluation on AWS EC2 instances running Ubuntu 24.04, with instance types adjusted to meet each model’s resource demands: r6i.8xlarge (32 vCPUs, 256 GB RAM), r6i.16xlarge (64 vCPUs, 512 GB RAM), and r6i.32xlarge (128 vCPUs, 1024 GB RAM). This corresponds to the model-instance mapping used in [10], except for VGG-16, for which Poseidon requires a larger instance. We therefore evaluate all VGG-16 experiments on r6i.32xlarge instances. Below, we briefly describe the models we consider in our evaluation.

MNIST: A minimal CNN [22] with $8.1$ K parameters and $444.9$ K FLOPs, trained on the MNIST image classification task, evaluated on an r6i.8xlarge instance.
ResNet-18: An image classifier [24] trained on CIFAR-10, with $280.9$ K parameters and $81.9$ M FLOPs, evaluated on an r6i.8xlarge instance.
DLRM: A deep learning recommendation model [34], with $764.3$ K parameters and $1.9$ M FLOPs, evaluated on an r6i.8xlarge instance.
MobileNet: A mobile-optimized image classifier [37] trained on ImageNet, with $3.5$ M parameters and $601.8$ M FLOPs, evaluated on an r6i.16xlarge instance.
VGG-16: A CNN with $15.2$ M parameters and $627.9$ M FLOPs, trained on CIFAR-10 [42], evaluated on an r6i.32xlarge instance.
Diffusion: A small text-to-image Stable Diffusion model [36], with $19.5$ M parameters and $22.9$ B FLOPs, evaluated on an r6i.32xlarge instance.
GPT-2: A distilled transformer-based language model optimized for inference [35], with $81.3$ M parameters and $188.9$ M FLOPs, evaluated on an r6i.32xlarge instance.

We perform five measurements for the verifier time and report the mean and the standard deviation (as error bars) in the figures.

Results

We report wall-clock runtimes for the prover in Figure 4, verifier times in Figure 5, and proof sizes in Table 2.

Prover Overhead. We begin by discussing prover overhead (cf. Figure 4), which is by far the most important metric when considering the practicality of zkML. For Poseidon, recomputing the commitment inside the SNARK results in a significant overhead that scales roughly linearly in the model size, ranging from 3.2x-17.3x for KZG, and from 3.2x-17.2x for IPA compared to the baseline (No Commitment). The approach of Lunar using the internal witness commitment of the SNARK reduces the overhead to 3.0x-7.5x in the case of KZG, which is an improvement over Poseidon, but is still significant because the number of $\mathsf{CP}_{\text{link}}$ proofs scales with the number of witness-containing columns. A notable exception where Poseidon outperforms Lunar is MobileNet, whose architecture results in a large number of columns relative to the number of weights. Nevertheless, the overheads of prior approaches are prohibitively expensive, particularly for larger models. Note that, for GPT-2 and Diffusion, Poseidon was unable to complete successfully because of memory requirements beyond 1024GB, which is the maximum available memory for AWS r6i instances. Similarly, Lunar does not run successfully for GPT-2.

In comparison, our CP-SNARK constructions Apollo and Artemis outperform the related approaches across all configurations, introducing an overhead of only 1.01x-1.18x for KZG and 1.03x-1.42x for IPA. These approaches only require adaptations to the arithmetization and the proof system, and these adaptations are concretely very efficient. Apollo is significantly faster than Lunar because the alignment of the witness using copy constraints in the arithmetization obviates the need for shifting proofs. For smaller models, Apollo outperforms Artemis, as the latter needs to extend the arithmetization with a custom gate whose relative impact decreases as the model grows. As a result, we observe Artemis outperforming Apollo for larger models. More importantly, we note that Artemis offers very similar prover times whether using KZG or IPA commitments (without trusted setup), a setting which Lunar and Apollo do not support.

Verifier. We present the verifier times in Figure 5. KZG-based proof systems provide a verification time that is constant in the size of the witness. However, even for the baseline (No Commitment), the verifier times for different models still vary, because the different model output sizes result in different proof instance sizes. Similarly, merely adding the commitments to the instance increases the KZG verifier time. However, the vast majority of the differences in verifier time between the different approaches are due to the additional checks that (some of) the approaches introduce. In contrast, verifier times for IPA scale (logarithmically) with the size of the witness, so we expect slower verification times in general.

Poseidon shows a negligible increase in verification time for KZG, as it only adapts the arithmetization of the relation and not the SNARK, resulting in a tiny increase in verification time due to the addition of the commitment to the public input. In contrast, for IPA-based Poseidon, we observe a considerable increase in verifier time (2x-11x) due to the complex arithmetization of Poseidon. Lunar (which only supports KZG) significantly increases the verification time compared to No Commitment (8.5x-252.9x), as it requires a linear number of additional pairing operations to verify the $\mathsf{CP}_{\text{link}}$ proofs. Although Apollo (which also only supports KZG) reduces the number of required pairing operations compared to Lunar, the verification overhead is, in some configurations, still significant (1x-4x) compared to the baseline (No Commitment). Artemis, on the other hand, requires only one additional pairing verification, resulting in a negligible overhead in verification time (1.0x-1.1x) for KZG. For IPA, the verification overhead is also relatively low (at most 1.2x), which is significantly lower than the prior state of the art in this setting.

		No Com.	Artemis	Apollo	Lunar	Poseidon
			§ 4.2	§ 4.1	[7]	[21]
KZG	MNIST	9	10	10	9	12
	ResNet-18	14	15	15	14	16
	DLRM	5	6	6	4	10
	MobileNet	18	18	18	18	21
	VGG	16	17	16	11	15
	Diffusion	32	33	33	16	-
	GPT-2	15	16	15	-	-
IPA	MNIST	10	12	-	-	14
	ResNet-18	16	18	-	-	18
	DLRM	7	9	-	-	11
	MobileNet	19	21	-	-	23
	VGG	17	20	-	-	17
	Diffusion	34	36	-	-	-
	GPT-2	17	19	-	-	-

Table 2: Proof size in kB for KZG-based (top) and IPA-based (bottom) approaches for various models. As Apollo and Lunar only support KZG-based instantiations, they are omitted in the bottom half. Poseidon fails to scale to Diffusion and GPT-2, while Lunar fails to scale to GPT-2, as described in § 5.3; these entries are therefore omitted.

Proof Size. While not of primary concern for most zkML applications, we report proof sizes in Table 2 for completeness. In general, proof sizes are very small (a few dozen kB at most) for the baseline (No Commitment) across all models. Furthermore, the overhead of adding commitment verification is generally low across all approaches. In fact, in some cases we see a decrease in proof size for Lunar. This is an artifact of the restrictions of Lunar’s construction, which require the evaluation domain of the SNARK to be at least as large as the (committed) witness. As a result, there are instances where we need to increase the number of rows in the Plonkish grid to achieve this. In these cases, we can make use of the additional rows by rearranging the original grid into fewer columns, which reduces the proof size.

Summary. In conclusion, we demonstrate that both Apollo and Artemis significantly advance the state of the art for practical zkML. Commitment verification is essential for real-world usage of zkML, yet existing approaches introduce significant overheads that make zkML impractical for all but the smallest models. We demonstrate that, with Apollo and Artemis, it is possible to apply zkML with commitment verification to large models of real-world interest. Furthermore, we show that, with Artemis, this is possible while using state-of-the-art SNARKs that do not require trusted setup.

Acknowledgements

We would like to thank Christian Knabenhans for his insightful feedback. We would also like to acknowledge our sponsors for their generous support, including Meta, Google, and SNSF through an Ambizione Grant No. PZ00P2_186050.

References

[1] Diego F Aranha, Emil Madsen Bennedsen, Matteo Campanelli, Chaya Ganesh, Claudio Orlandi, and Akira Takahashi. ECLIPSE: Enhanced compiling method for pedersen-committed zkSNARK engines. Cryptology ePrint Archive, 2021.
[2] Aztec Network. Proving system components. https://docs.aztec.network/protocol-specs/cryptography/proving-system/overview, 2021. Accessed: 2024-9-1.
[3] Abeba Birhane, Ryan Steed, Victor Ojewale, Briana Vecchione, and Inioluwa Deborah Raji. AI auditing: The broken bus on the road to AI accountability. arXiv [cs.CY], January 2024.
[4] EZKL Blog. Removing additional commitment cost, 2023. Accessed: 2024-07-22.
[5] Sean Bowe, Jack Grigg, and Daira Hopwood. Halo: Recursive proof composition without a trusted setup. Technical report, Cryptology ePrint Archive, Report 2019/1021, 2019.
[6] Benedikt Bünz, Jonathan Bootle, Dan Boneh, Andrew Poelstra, Pieter Wuille, and Greg Maxwell. Bulletproofs: Short proofs for confidential transactions and more. In 2018 IEEE Symposium on Security and Privacy (SP), May 2018.
[7] Matteo Campanelli, Antonio Faonio, Dario Fiore, Anaïs Querol, and Hadrián Rodríguez. Lunar: a toolbox for more efficient universal and updatable zkSNARKs and commit-and-prove extensions. Cryptology ePrint Archive, Paper 2020/1069, 2020. (Extended Version).
[8] Matteo Campanelli, Antonio Faonio, Dario Fiore, Anaïs Querol, and Hadrián Rodríguez. Lunar: A toolbox for more efficient universal and updatable zkSNARKs and commit-and-prove extensions. In ASIACRYPT 2021, 2021.
[9] Matteo Campanelli, Dario Fiore, and Anaïs Querol. LegoSNARK: Modular design and composition of succinct zero-knowledge proofs. CCS ’19. ACM, November 2019.
[10] Bing-Jyue Chen, Suppakit Waiwitlikhit, Ion Stoica, and Daniel Kang. ZKML: An optimizing system for ML inference in zero-knowledge proofs. 2024.
[11] Binyi Chen, Benedikt Bünz, Dan Boneh, and Zhenfei Zhang. HyperPlonk: Plonk with linear-time prover and high-degree custom gates. In EUROCRYPT 2023. 2023.
[12] Dami Choi, Yonadav Shavit, and David Duvenaud. Tools for verifying neural models’ training data. arXiv [cs.LG], July 2023.
[13] Richard A Demillo and Richard J Lipton. A probabilistic remark on algebraic program testing. Inf. Process. Lett., 7(4):193–195, June 1978.
[14] EZKL Docs. Visibility: What is private?, 2023. https://docs.ezkl.xyz/visibility_what_is_private/. Accessed: 2024-09-03.
[15] EZKL. An engine for doing inference for deep learning models and other computational graphs in a zk-snark (ZKML). Accessed: 02-09-2024.
[16] Boyuan Feng, Lianke Qin, Zhenfei Zhang, Yufei Ding, and Shumo Chu. ZEN: An optimizing compiler for verifiable, zero-knowledge neural network inferences. Cryptology ePrint Archive, 2021.
[17] Zcash Foundation. Halo2 book, 2021. https://zcash.github.io/halo2/. Accessed: 2024-078-02.
[18] Ariel Gabizon, Zachary J Williamson, and Oana Ciobotaru. Plonk: Permutations over lagrange-bases for oecumenical noninteractive arguments of knowledge. Cryptology ePrint Archive, 2019.
[19] Bianca-Mihaela Ganescu and Jonathan Passerat-Palmbach. Trust the process: Zero-knowledge machine learning to enhance trust in generative AI interactions. In The 5th AAAI Workshop on Privacy-Preserving Artificial Intelligence, 2024.
[20] Sanjam Garg, Aarushi Goel, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Guru-Vamsi Policharla, and Mingyuan Wang. Experimenting with zero-knowledge proofs of training. 2023.
[21] Lorenzo Grassi, Dmitry Khovratovich, Christian Rechberger, Arnab Roy, and Markus Schofnegger. Poseidon: A new hash function for zero-knowledge proof systems. USENIX Security, pages 519–535, 2021.
[22] Ruslan Grimov. The minimal neural network that achieves 99 https://github.com/ruslangrimov/mnist-minimal-model, 2018.
[23] Jens Groth. On the size of pairing-based non-interactive arguments. In EUROCRYPT 2016, 2016.
[24] Kaiming He, X. Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks. In European Conference on Computer Vision, 2016.
[25] W. G. Horner. A new method of solving numerical equations of all orders, by continuous approximation. Philosophical Transactions of the Royal Society of London, 109:308–335, 1819.
[26] White House. Executive order on the safe, secure, and trustworthy development and use of artificial intelligence, October 2023. E.O. 14110.
[27] Daniel Kang, Tatsunori Hashimoto, Ion Stoica, and Yi Sun. Scaling up trustless DNN inference with zero-knowledge proofs. arXiv [cs.CR], October 2022.
[28] Aniket Kate, Gregory M Zaverucha, and Ian Goldberg. Constant-size commitments to polynomials and their applications. In ASIACRYPT 2010, 2010.
[29] Seunghwa Lee, Hankyung Ko, Jihye Kim, and Hyunok Oh. vCNN: Verifiable convolutional neural network based on zk-SNARKs. Cryptology ePrint Archive, 2020.
[30] Tianyi Liu, Xiang Xie, and Yupeng Zhang. zkCNN: Zero knowledge proofs for convolutional neural network predictions and accuracy. 2021.
[31] Hidde Lycklama, Nicolas Küchler, Alexander Viand, Emanuel Opel, Lukas Burkhalter, and Anwar Hithnawi. Cryptographic auditing for collaborative learning. In NeurIPS ML Safety Workshop, 2022.
[32] Hidde Lycklama, Alexander Viand, Nicolas Küchler, Christian Knabenhans, and Anwar Hithnawi. Holding Secrets Accountable: Auditing Privacy-Preserving Machine Learning. In USENIX Security, Philadelphia, PA, August 2024.
[33] National Science and Technology Council Committee on Technology. Preparing for the future of artificial intelligence, October 2016.
[34] Maxim Naumov, Dheevatsa Mudigere, Hao-Jun Michael Shi, et al. Deep learning recommendation model for personalization and recommendation systems. CoRR, abs/1906.00091, 2019.
[35] Alec Radford, Jeff Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. Language models are unsupervised multitask learners. 2019.
[36] Robin Rombach, A. Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10674–10685, 2021.
[37] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4510–4520, 2018.
[38] Peter Schulam and Suchi Saria. Can you trust this prediction? auditing pointwise reliability after learning. In AISTATS, volume 89, pages 1022–1031, 2019.
[39] Srinath Setty, Justin Thaler, and Riad Wahby. Customizable constraint systems for succinct arguments. IACR Cryptol eprint Arch, 2023:552, 2023.
[40] Ali Shahin Shamsabadi, Sierra Calanda Wyllie, Nicholas Franzese, Natalie Dullerud, Sébastien Gambs, Nicolas Papernot, Xiao Wang, and Adrian Weller. Confidential-PROFITT: Confidential PROof of FaIr training of trees. In ICLR, 2022.
[41] Reza Shokri. Privacy auditing of machine learning using membership inference attacks. ICLR, 2022.
[42] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
[43] Tobin South, Alexander Camuto, Shrey Jain, Shayla Nguyen, Robert Mahari, Christian Paquin, Jason Morton, and Alex Pentland. Verifiable evaluations of machine learning models using zkSNARKs. arXiv [cs.LG], February 2024.
[44] Haochen Sun and Hongyang Zhang. ZkDL: Efficient zero-knowledge proofs of deep learning training. arXiv [cs.LG], July 2023.
[45] Suppakit Waiwitlikhit, Ion Stoica, Yi Sun, Tatsunori Hashimoto, and Daniel Kang. Trustless audits without revealing data or models. In ICML’24, April 2024.
[46] Jiasi Weng, Jian Weng, Gui Tang, Anjia Yang, Ming Li, and Jia-Nan Liu. pvcnn: Privacy-preserving and verifiable convolutional neural network testing. IEEE Transactions on Information Forensics and Security, 18:2218–2233, 2023.

Appendix A Definitions

Definition A.1 (Commitment Scheme).

A non-interactive commitment scheme consists of a message space $\mathcal{M}$ , randomness space $\mathcal{O}$ , a commitment space $\mathcal{C}$ and a tuple of polynomial-time algorithms $(\textsf{Com.Setup},\textsf{Com.Commit},\textsf{Com.Verify})$ defined as follows:

•

$\textsf{Com.Setup}(1^{\lambda})\rightarrow\textsf{crs}$ : Given a security parameter $\lambda$ , it outputs public parameters crs.
•

$\textsf{Com.Commit}(\textsf{crs},m,r)\rightarrow c$ : Given public parameters crs, a message $m\in\mathcal{M}$ and randomness $r\in\mathcal{O}$ , it outputs a commitment $c$ .
•

$\textsf{Com.Verify}(\textsf{crs},c,m,r)\rightarrow\{0,1\}$ : Given public parameters crs, a commitment $c$ , a message $m$ , and a decommitment $r$ , it outputs $1$ if the commitment is valid, otherwise $0$ .

A non-interactive commitment scheme has the following properties:

•

Correctness. For all security parameters $\lambda$ , for all $m\in\mathcal{M}$ and $r\in\mathcal{O}$ , and for all crs output by $\textsf{Com.Setup}(1^{\lambda})$ , if $c=\textsf{Com.Commit}(\textsf{crs},m,r)$ , then $\textsf{Com.Verify}(\textsf{crs},c,m,r)=1$ .

•

Binding. For all polynomial-time adversaries $\mathcal{A}$ , the probability

\begin{split}\Pr\bigl[\textsf{Com.Verify}(\textsf{crs},c,m_{1},r_{1})&=1\land\\ \textsf{Com.Verify}(\textsf{crs},c,m_{2},r_{2})&=1\land m_{1}\neq m_{2}:\\ \textsf{crs}\leftarrow\textsf{Com.Setup}(1^{\lambda}),(c,r_{1},r_{2},&m_{1},m_{2})\leftarrow\mathcal{A}(\textsf{crs})\bigr]\end{split}

is negligible.

•

Hiding. For all polynomial-time adversaries $\mathcal{A}$ , the advantage

\begin{split}\bigl|\Pr[\mathcal{A}(\textsf{crs},c)=1:c&\leftarrow\textsf{Com.Commit}(\textsf{crs},m_{1},r)]-\\ \Pr[\mathcal{A}(\textsf{crs},c)=1:c&\leftarrow\textsf{Com.Commit}(\textsf{crs},m_{2},r)]\bigr|\end{split}

is negligible, for all messages $m_{1},m_{2}$ .
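To make the interface of Definition A.1 concrete, the following is a minimal sketch of a Pedersen-style commitment. The choice of Pedersen commitments, the toy group (the squares modulo the safe prime 2039, written multiplicatively rather than additively), and the function names `setup`, `commit`, and `verify` are assumptions made for illustration only; the parameters are far too small to offer any security.

```python
import secrets

# Toy parameters (assumption: illustrative only, far too small to be secure).
Q = 2039  # safe prime, Q = 2 * P + 1
P = 1019  # prime order of the subgroup of squares modulo Q


def setup():
    """Com.Setup: return crs = (g, h), two generators of the order-P subgroup."""
    # 4 = 2^2 and 9 = 3^2 are squares modulo Q and differ from 1,
    # so each generates the subgroup of prime order P.
    # In a real setup, log_g(h) must be unknown to the committer.
    return (4, 9)


def commit(crs, m, r):
    """Com.Commit: c = g^m * h^r mod Q for message m and randomness r."""
    g, h = crs
    return (pow(g, m % P, Q) * pow(h, r % P, Q)) % Q


def verify(crs, c, m, r):
    """Com.Verify: output 1 if c opens to (m, r), otherwise 0."""
    return 1 if commit(crs, m, r) == c else 0


crs = setup()
m = 42
r = secrets.randbelow(P)  # fresh randomness from the randomness space
c = commit(crs, m, r)

assert verify(crs, c, m, r) == 1      # correctness
assert verify(crs, c, m + 1, r) == 0  # a different message does not verify
```

The final two assertions exercise correctness and a simple failed opening; they do not demonstrate binding or hiding, which are computational properties that a toy group of this size cannot provide.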

Definition A.2 (Homomorphic Commitment Scheme [6]).

A homomorphic commitment scheme is a non-interactive commitment scheme such that $\mathcal{M}$ , $\mathcal{O}$ and $\mathcal{C}$ are all abelian groups and for all $m_{1},m_{2}\in\mathcal{M}$ and $r_{1},r_{2}\in\mathcal{O}$ , we have

\begin{split}&\textsf{Com.Commit}(\textsf{crs},m_{1}+m_{2},r_{1}+r_{2})=\\ &\textsf{Com.Commit}(\textsf{crs},m_{1},r_{1})+\textsf{Com.Commit}(\textsf{crs},m_{2},r_{2}).\end{split}
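As an illustration of Definition A.2, consider a Pedersen-style commitment in additive notation, $\textsf{Com.Commit}(\textsf{crs},m,r)=m\cdot G+r\cdot H$ with $\textsf{crs}=(G,H)$. The scheme and the generator names $G$ and $H$ are assumptions introduced here for the example. The homomorphic property then follows from linearity:

```latex
\begin{aligned}
\textsf{Com.Commit}(\textsf{crs},m_{1}+m_{2},r_{1}+r_{2})
  &= (m_{1}+m_{2})\cdot G+(r_{1}+r_{2})\cdot H\\
  &= (m_{1}\cdot G+r_{1}\cdot H)+(m_{2}\cdot G+r_{2}\cdot H)\\
  &= \textsf{Com.Commit}(\textsf{crs},m_{1},r_{1})+\textsf{Com.Commit}(\textsf{crs},m_{2},r_{2}).
\end{aligned}
```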

Definition A.3 (KZG Commitments [28]).

KZG commitments leverage bilinear pairings to create a commitment scheme for polynomials where the commitments have constant size. Let $\mathbb{G}_{1}$ , $\mathbb{G}_{2}$ and $\mathbb{G}_{T}$ be cyclic groups of prime order $p$ with generators $h_{1}\in\mathbb{G}_{1}$ and $h_{2}\in\mathbb{G}_{2}$ . Let $e:\mathbb{G}_{1}\times\mathbb{G}_{2}\rightarrow\mathbb{G}_{T}$ be a bilinear pairing, so that $e(\alpha\cdot h_{1},\beta\cdot h_{2})=\alpha\beta\cdot e(h_{1},h_{2})$ . The KZG polynomial commitment scheme for some polynomial $\mathbf{g}$ made up of coefficients $\mathbf{g}_{i}$ is defined by four algorithms:

•

$\textsf{{PC}.Setup}(d)$ : Sample $\alpha\sample\mathbb{F}_{p}$ and output

\texttt{pp}\leftarrow\left(\alpha\cdot h_{1},\ldots,\alpha^{d}\cdot h_{1},\alpha\cdot h_{2}\right)

•

$\textsf{{PC}.Commit}(\texttt{pp},\mathbf{g})$ : Output $\text{com}=\mathbf{g}(\alpha)\cdot h_{1}$ , computed as

\text{com}\leftarrow\sum_{i=0}^{d}\mathbf{g}_{i}\cdot(\alpha^{i}\cdot h_{1})

•

$\textsf{{PC}.Prove}(\texttt{pp},\text{com},\mathbf{g},x)$ : Compute the quotient and remainder

q(X),r(X)\leftarrow\left(\mathbf{g}(X)-\mathbf{g}(x)\right)/\left(X-x\right).

Check that the remainder $r(X)$ is zero and, if so, output $\pi=q(\alpha)\cdot h_{1}$ , computed as $\sum_{i=0}^{d}\left(q_{i}\cdot(\alpha^{i}\cdot h_{1})\right)$ .

•

$\textsf{{PC}.Check}(\texttt{pp},\text{com},x,y,\pi)$ : Accept if the following pairing equation holds:

e(\pi,\alpha\cdot h_{2}-x\cdot h_{2})=e(\text{com}-y\cdot h_{1},h_{2})
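To see why the check accepts an honest proof, assume $y=\mathbf{g}(x)$ and that $\pi$ was computed by PC.Prove, so that $\mathbf{g}(X)-y=q(X)\cdot(X-x)$. A short derivation, using only the bilinearity of $e$ stated above, is:

```latex
\begin{aligned}
e(\pi,\alpha\cdot h_{2}-x\cdot h_{2})
  &= e\bigl(q(\alpha)\cdot h_{1},(\alpha-x)\cdot h_{2}\bigr)\\
  &= q(\alpha)(\alpha-x)\cdot e(h_{1},h_{2})\\
  &= \bigl(\mathbf{g}(\alpha)-y\bigr)\cdot e(h_{1},h_{2})\\
  &= e\bigl(\mathbf{g}(\alpha)\cdot h_{1}-y\cdot h_{1},h_{2}\bigr)\\
  &= e(\text{com}-y\cdot h_{1},h_{2}).
\end{aligned}
```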

The security properties of KZG commitments fundamentally rely on the hardness of the polynomial division problem. The parameter $\alpha$ acts as a trapdoor and must be discarded after PC.Setup to ensure the binding property. Hence, we require a trusted setup to generate the public parameters and securely discard $\alpha$ , which can be performed using MPC or, depending on the deployment, by the auditor acting as a trusted dealer. Together, PC.Prove and PC.Check form the evaluation protocol for the scheme. The hiding property relies on the discrete logarithm assumption, so if $\alpha$ is not discarded this breaks the binding property but not the hiding property. We refer to [28] for a detailed security analysis. Further, KZG commitments are homomorphic, i.e., if $\text{com}_{1}$ and $\text{com}_{2}$ are commitments to polynomials $\mathbf{g}_{1}$ and $\mathbf{g}_{2}$ , then $\text{com}_{1}+\text{com}_{2}$ is a commitment to polynomial $\mathbf{g}_{1}+\mathbf{g}_{2}$ .
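The algebra behind PC.Prove and PC.Check can be sketched without a pairing library by working directly with the scalars that would appear as multiples of $h_{1}$. This is a deliberately insecure toy: it keeps the trapdoor $\alpha$ in the clear and replaces the pairing equation with the field identity $q(\alpha)(\alpha-x)=\mathbf{g}(\alpha)-y$, which is what the pairing check tests. The modulus $2^{61}-1$ and the helper names `evaluate`, `divide_by_linear`, `kzg_toy`, and `check` are assumptions made for illustration.

```python
P = 2**61 - 1  # Mersenne prime used as a toy field modulus (assumption)


def evaluate(coeffs, x):
    """Evaluate sum_i coeffs[i] * x^i modulo P using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def divide_by_linear(coeffs, x):
    """Return (q, rem) such that g(X) - g(x) = q(X) * (X - x) + rem."""
    y = evaluate(coeffs, x)
    shifted = [(coeffs[0] - y) % P] + list(coeffs[1:])
    q = [0] * (len(shifted) - 1)
    carry = 0
    # Synthetic division, processing the highest-degree coefficient first.
    for i in range(len(shifted) - 1, 0, -1):
        carry = (shifted[i] + carry * x) % P
        q[i - 1] = carry
    rem = (shifted[0] + carry * x) % P
    return q, rem


def kzg_toy(coeffs, x, alpha):
    """Return the scalars behind com = g(alpha)*h1, y = g(x), pi = q(alpha)*h1."""
    y = evaluate(coeffs, x)
    q, rem = divide_by_linear(coeffs, x)
    assert rem == 0  # PC.Prove checks that the remainder vanishes
    return evaluate(coeffs, alpha), y, evaluate(q, alpha)


def check(com, x, y, pi, alpha):
    """Scalar form of e(pi, alpha*h2 - x*h2) = e(com - y*h1, h2)."""
    return (pi * (alpha - x)) % P == (com - y) % P


g = [3, 0, 2]  # g(X) = 3 + 2 X^2
com, y, pi = kzg_toy(g, x=5, alpha=123456789)
assert check(com, 5, y, pi, 123456789)          # honest opening is accepted
assert not check(com, 5, y + 1, pi, 123456789)  # wrong evaluation is rejected
```

In an actual KZG instantiation the verifier never learns $\alpha$; it only holds $\alpha\cdot h_{2}$ and relies on the pairing to compare the two sides.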

Definition A.4.

A zk-SNARK is a proof with the following properties:

•

Completeness. For every true statement for the relation $\mathcal{R}$ an honest prover with a valid witness always convinces the verifier, i.e., $\forall(x,w)\in\mathcal{R}{}:$

\condprob{\textsf{Verify}_{\vk}(x,\pi)=1}{\begin{gathered}(\textsf{crs},\vk)\leftarrow\textsf{Setup}(1^{\lambda})\\ \pi\leftarrow\textsf{Prove}_{\textsf{crs}}(x,w)\end{gathered}}=1

•

Knowledge Soundness. For every PPT adversary, there exists a PPT extractor that gets full access to the adversary’s state (including its random coins and inputs). Whenever the adversary produces a valid argument, the extractor can compute a witness with high probability: $\forall\adv{}\exists\mathcal{E}:$

\condprob{\begin{gathered}\textsf{Verify}_{\vk}(\tilde{x},\tilde{\pi})=1\\ \land\mathcal{R}(\tilde{x},w^{\prime})=0\end{gathered}}{\begin{gathered}(\textsf{crs},\vk)\leftarrow\textsf{Setup}(1^{\lambda})\\ ((\tilde{x},\tilde{\pi});w^{\prime})\leftarrow\adv{}|\mathcal{E}(\textsf{crs})\end{gathered}}=\negl

We stress here that this definition requires a non-black-box extractor, i.e., the extractor gets full access to the adversary’s state.

•

Succinctness. For any $x$ and $w$ , the length of the proof is given by $|\pi|=\poly\cdot\pcpolynomialstyle{polylog}(|x|+|w|)$ .

•

Zero-Knowledge. There exists a PPT simulator $\sdv=(\sdv_{1},\sdv_{2})$ such that $\sdv_{1}$ outputs a simulated CRS crs and a trapdoor \key[td]; on input crs, $x$ , and \key[td], $\sdv_{2}$ outputs a simulated proof $\pi$ ; and for all PPT adversaries $\adv=(\adv_{1},\adv_{2})$ , the following holds:

$\displaystyle\left\|\condprob{\begin{gathered}(x,w)\in\mathcal{R}\\ {}\land{}\\ \adv_{2}(\pi)=1\end{gathered}}{\begin{gathered}(\textsf{crs},\vk)\leftarrow\textsf{Setup}(1^{\lambda})\\ (x,w)\leftarrow\adv_{1}(1^{\lambda},\textsf{crs})\\ \pi\leftarrow\textsf{Prove}_{\textsf{crs}}(x,w)\end{gathered}}-\right.$
$\displaystyle\left.\condprob{\begin{gathered}(x,w)\in\mathcal{R}\\ {}\land{}\\ \adv_{2}(\pi)=1\end{gathered}}{\begin{gathered}(\textsf{crs}^{\prime},\key[td])\leftarrow\sdv_{1}(1^{\lambda})\\ (x,w)\leftarrow\adv_{1}(1^{\lambda},\textsf{crs}^{\prime})\\ \pi\leftarrow\sdv_{2}(\textsf{crs}^{\prime},\key[td],x)\end{gathered}}\right\|=\negl$

Lemma A.5 (Demillo-Lipton-Schwartz-Zippel [13]).

Let $f\in\mathbb{F}_{p}[X]$ be a non-zero polynomial of degree $d$ over a prime field $\mathbb{F}_{p}$ . Let $S$ be any finite subset of $\mathbb{F}_{p}$ and let $r$ be a field element selected independently and uniformly from set $S$ . Then

\Pr[f(r)=0]\leq\frac{d}{|S|}.
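The bound of Lemma A.5 can be checked exhaustively on a small instance. The sketch below counts the roots of a non-zero polynomial inside a set $S$ and compares the resulting probability with $d/|S|$. The field $\mathbb{F}_{101}$, the example polynomial $(X-1)(X-2)(X-3)$, and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction


def count_roots(coeffs, p, S):
    """Count s in S with f(s) = 0 mod p, where f(X) = sum_i coeffs[i] * X^i."""
    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % p
        return acc
    return sum(1 for s in S if f(s) == 0)


def bound_holds(coeffs, p, S):
    """Check Pr[f(r) = 0] <= d / |S| for r uniform in S (f assumed non-zero)."""
    S = list(S)
    d = len(coeffs) - 1  # degree, assuming the leading coefficient is non-zero
    return Fraction(count_roots(coeffs, p, S), len(S)) <= Fraction(d, len(S))


p = 101
# (X - 1)(X - 2)(X - 3) = X^3 - 6 X^2 + 11 X - 6, coefficients lowest degree first
coeffs = [-6 % p, 11, -6 % p, 1]

assert count_roots(coeffs, p, range(p)) == 3  # the bound is tight for S = F_p
assert bound_holds(coeffs, p, range(p))
assert bound_holds(coeffs, p, range(10, 20))  # no roots in this subset
```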

Appendix B Ethics and Open Science Statements

Ethics Statement. This work introduces efficient Commit-and-Prove SNARKs for zkML, with the goal of improving privacy and security in machine learning applications. We seek to empower users by providing tools that ensure data privacy, transparency, and integrity in these applications. By enhancing privacy-preserving ML, we contribute to the responsible use of data, protecting individuals’ sensitive data from unauthorized access or misuse. However, we recognize that any cryptographic tool, including SNARKs, can be misused if applied irresponsibly. To mitigate these risks, we encourage the community to adhere to ethical guidelines when deploying zkML solutions in practice.

Open Science Statement. To ensure the reproducibility of our results, we will publish the code for our system, including the implementations of existing approaches that we developed as part of this work. We will also provide detailed documentation of our experimental setup and an artifact evaluation to facilitate the reproduction of our results. All resources will be publicly accessible.