The Murky Proof System Waters: Part II

If you haven't done so yet check out Part I for the first part of this blog post, where we give some background on zkVMs and discuss the capabilities of a zkVM that we need or want a proof system to enable. Here, we continue by looking at what features of proving schemes can allow them to meet those requirements.

Features

Now from the other direction, we consider what features of proving schemes allow them to – or prevent them from – meeting our needs and wants.

Constraint System

The prevailing approach to building zkVMs looks something like this: execute a program in a familiar ISA that compiler tools can easily target (like RISC-V32), then transform the trace of the program execution into the format the proof system requires. However, this approach requires managing a disconnect: the ISAs used broadly in computing are binary-encoded (circuit wires encode 0s and 1s manipulated by Boolean gates), while most proving schemes work over an arithmetic-encoding (circuit wires encode finite-field elements manipulated by addition and multiplication gates).

A constraint system is an arithmetic-encoding of circuit wires and gates amenable to proving, and different proving schemes are based on different constraint systems. Naturally, constraint systems are also proliferating, with the most popular being R1CS, Plonkish, AIR, and their superset CCS. Different constraint systems are more or less complicated to write and more or less efficient at encoding certain gates and more or less efficient for various proof systems to prove, and embed numerous tradeoffs that are far too much in the weeds for us to delve completely into.

Choice of constraint system is, however, just the beginning. The zkVM needs to be encoded in the constraint system, creating an arithmetization. This arithmetization also depends on how the zkVM stores memory and authenticates that read and write operations are correctly executed, a process called memory checking with its own myriad approaches and tradeoffs. Moreover, arithmetizations can in some sense be global, and therefore usable by different proving backends at the same time, while others are tailored to a particular proving scheme and so aren't reusable.

Ultimately, the choice of constraint system, arithmetization, and memory checking embedded within the design of a specific proof system are deeply interwoven choices, with significant impacts on both speed and reusability.

(Polynomial) Commitment Schemes

Proving schemes generally fit into the so-called 'commit-and-prove' paradigm where the program is arithmetized into a circuit, the inputs to the circuit are then committed to, and finally the prover constructs an argument of knowledge (the ARK in SNARK) that they know the assignments that lead to the claimed circuit – and therefore program – evaluation.

In this context 'committed to' refers to the use of a cryptographic primitive called a commitment scheme, which has two key properties: hiding and binding. Hiding means that I can take a value x, commit to it y = commit(x), and give out the commitment y without worrying that anyone can learn anything about x from it. This is not quite the same thing as encrypting x, but thinking of it in that way isn't the worst heuristic. Binding meanwhile means that whenever I then choose I can reveal to a verifier that y is a commitment to x through a process called opening, but am unable to open to any x' != x without being caught. It turns out to be possible to commit to values and then prove statements about them in zero-knowledge without needing to reveal them (hence 'commit-and-prove').

Proof systems most often require one or more commitment schemes, and the choice of scheme has a significant impact on the efficiency – in both space and time – of the proving scheme. Choice of commitment schemes can also impact both the need for and the difficulty of establishing the public parameters shared by the prover and the verifier, as discussed in the next section.

Particularly important to a lot of the most advanced proving schemes are polynomial commitments, which allow committing to an entire polynomial over a field, rather than to just a (vector of) field elements. Notably, some polynomial commitment schemes are succinct, like Zeromorph, compressing down the polynomial representation to a small constant number of field elements, which can be important for viable on-chain storage and verifiability.

Trusted/Universal/Untrusted Setup

Many proving schemes (or, more precisely, the commitment schemes used by the proving scheme) require a trusted setup ceremony, where a group of parties collaborate to generate the public parameters – such as a common/structured reference string (CRS/SRS) – to be used by provers and verifiers. Setup ceremonies can be tricky, but they have been done successfully in the past and are becoming more and more normalized. So as long as a proving scheme requires zero to a few such ceremonies it is viable. If one requires many more that becomes much more difficult and expensive to manage in a secure and trustworthy way, and in particular makes it extremely challenging to use on-chain.

Roughly speaking, there are four possible breakdowns of how a given proof system's setup affects zkVM deployment:

the scheme requires no trusted (aka a transparent) setup;
the scheme requires a single, universal setup for all circuits;
the scheme requires per-circuit setup, but we can guarantee that we will only need to ever do setup for a very small number of circuits;
the scheme requires per-circuit setup, and we might have many circuits we ultimately need to generate parameters for.

Again just speaking roughly here, (1-2) are ideal, (3) is fine (especially off-chain), and (4) can be a no-go for many applications, though with some off-chain use possible in particular circumstances.

Lookup Arguments

The straightforward way to prove a zkVM 'does the right thing' is to directly implement a CPU circuit that executes each instruction, and then prove that. However, the resulting step circuit grows directly with the complexity of the ISA/architecture, with a lot of unneeded 'waste constraints' capturing circuitry that has nothing to do with the operation being implemented, e.g., the 'logical shift' implementation when executing a 'logical and' instruction (<< vs. &). An alternative approach is to use instruction lookups: instead of actually implementing the operation, the proof instead just references a row of a table enumerating every possible input/output pair for every possible instruction, available to both the prover and verifier.

Instruction lookups are an incredibly insightful and powerful idea, but one that is certainly of the 'so crazy it just might work' variety: although just looking up an instruction makes the step circuit significantly smaller, the needed lookup tables would seem to be exponentially-sized and therefore unusably massive for a 32-bit architecture. However, it turns out that through careful design that abuses the redundancy of information across those gigantic tables, it is possible to encode them in a small enough form as to make lookup arguments not just practical, but actually the basis for some of the fastest proving schemes currently available.

However, like most design decisions using lookups does bring tradeoffs. Most especially, extending an ISA with additional instructions requires adding their tables to the argument scheme, complicating the process of adding new functionality, such as precompiles, directly into an architecture.

Recursion: Composition or Folding (for Incremental Verifiable Computation, or IVC)

To prove an arbitrarily long-running computation we do not want an arbitrarily long proof. This is especially important for on-chain verifiability, given the need for concise proofs that are cheap to check and store. A key insight of modern verifiable computation is that you can make it incremental through recursion: to prove the ith step of the computation you include the process of verifying the i-1th step, so that the proof of that ith step also encodes the truth of all previous steps as well.

However, this requires the verifier itself to be implemented in-circuit alongside the step circuit, together making an augmented circuit which is what is ultimately proven at each iteration. This verifier circuit can be just as large, or even much larger, than the step circuit. This additional cost is called the recursion overhead, and the practical efficiency of an IVC implementation depends on having as small of a recursion overhead as possible. Note that folding schemes, like the Nova family, are a promising approach to getting recursion (and therefore IVC) for ideally cheaper than implementing a full verifier in-circuit.

Note as well that recursion can be critical to achieving true zero-knowledge in verifiable computation. For many proving schemes proof size is dependent on the length of the computation, causing the information implicit in that dependency to leak to the verifier. In technical language, the relationship between y and pi is not data-oblivious, so by looking at pi the verifier can eliminate possible values for y, and thereby learn some information about the private input. Without an approach like recursion to break that link, a 'zkVM' is really something like a 'succinctVM' – providing a concise proof, but no zero-knowledge guarantees.

Native Parallelization (for Proof-Carrying Data, or PCD)

Proving in zero-knowledge is inherently parallelizable, since all intermediary values are known at time of proving due to the initial execution. Still, certain schemes may be designed with the specific goal of optimizing fast distributed proving. In particular, certain IVC schemes are amenable to distribution – even with recursion – producing a generalization of IVC known as PCD. This is a valuable capability, given that parallelization can reduce the practical impact of proving overhead – making verifiable computation both more practical and more economically viable.

Fields: Large or Small

Proving schemes (and their underlying commitment schemes) generally work over specific finite fields. Most often, these will either be large fields defined by (cycles of) elliptic curves or small fields. Elliptic curve-based large fields have some benefits: they frequently support pairing-based arguments like Groth16 and are already used by Ethereum and in many other settings. Small fields often have the benefit of cheaper operations and so better performance overall, at the cost of needing to do non-native arithmetic for verification/recursion on-chain.

A recent innovation is proving schemes over large fields that mimic small field behavior by guaranteeing that most of the field operations occur over small elements (like 0s and 1s).

Online Verification

One last minor note is that for some use cases, users may not care all that much about non-interactivity of proofs. In many settings the prover and the verifier may be high-uptime web services, where it is perfectly reasonable for the parties to interact as part of verification. There exist some highly efficient, usually vector oblivious linear evaluation (or VOLE)-based schemes that may provide better efficiency at a cost of constant-round interactivity, see for example a recent preprint where the cost of proving each step depends only on the specific instruction that is executed.

Conclusion

As a parting thought, it is worth observing that all of the tradeoffs we have seen are due to competing approaches to get to the same result – different arithmetizations, different commitment schemes, different fields, and so forth, all for fast and trustworthy proving. There is no barrier that prevents the development of a single, 'optimal' proof system: non-interactive and as efficient as all competitors, with support for recursion, parallelization, precompiles, on-chain verifiability, transparent or universal setup, and all based on well-understood mathematical foundations with an audited and battle-tested implementation. But even just theoretically there is no clear path forward to such an ideal construction.

So where does that leave us? Well, we wouldn't be in the murky waters if there was a straightforward takeaway. But hopefully this post has clarified the many dimensions along which proof systems for zkVMs can differ, and what attributes of each new 'latest and greatest' design to look out for when trying to parse through how it may stack up against last week's novelty for applications of interest. Necessarily the science must stabilize – and be standardized – to enable the sort of sober trusted implementations of trusted constructions that the cryptographic community aims for. But in the meantime, there is plenty of excitement in exploring what there is and what comes next.

Nexus Community News: Local Meetups, and the Point Incentive Era Begins

Partnering with HIVE Intelligence: Unifying Language, Data, and Trust for Verifiable AI

Building on Nexus: A Quickstart Guide to Build Apps

The Murky Proof System Waters: Part II

Features

Conclusion

Samuel Judson