The Murky Proof System Waters: Part I

At the heart of every zkVM is a proving scheme that turns a trace of a computation into an efficiently verifiable proof of its execution. The choice of proof system is one of the defining design decisions for a zkVM – it impacts speed, security, interoperability, and suitability for a variety of applications.

A central challenge for zkVM development is that the present and future of proving schemes are rather murky. Few if any areas of cryptographic research are moving faster than the design of zkVM-focused proof systems, which makes it very difficult for engineers to decide which schemes to prioritize for robust implementation, and for the broader research community to decide which to prioritize for scrutiny and standardization.

So instead of cataloguing the many proving schemes sprouting up in the literature – a catalogue that would quickly go out of date – let's push past the individual trees and try to take in a view of the entire forest. Here in Part I, we will discuss what we need (or ideally want) from a proof system in order to use it to prove zkVM executions. Then, in our forthcoming Part II we will overview what features of a scheme make it more or less able to meet those requirements. Between both posts we hope to chart out what a zkVM needs of its proof system, and what makes choosing the right proof system so challenging – and exciting.

But before we dive into all that, let's start with a quick overview of what a zkVM is.

A Quick Background on zkVMs

If you are not familiar with zkVMs, the basic concept is as simple as it is powerful. A virtual machine is a program that runs programs. Just to be concrete, suppose we have a program f that computes the min and max of a list of bytes:

fn f(x: Input) -> Output {   // let's assume Input = &[u8]
    let mut min: u8 = 0xff;
    let mut max: u8 = 0x00;

    assert!(!x.is_empty());

    x.iter().for_each(|b| {
        min = if *b < min { *b } else { min };
        max = if *b > max { *b } else { max };
    });

    Output::from([min, max])
}

Rust has built-ins for this sort of thing, but for clarity of exposition we spell it out.

Then, at its simplest, a VM is just a program that takes in (f, x) and evaluates the code of f on x in order to return f(x).

fn VM(f: &dyn Fn(Input) -> Output, x: Input) -> Output {
    f(x)
}

Just about the simplest possible model of a VM.

So if we run the VM on f as something like VM(f, [0x01, 0x05, 0x07, 0x00]), we would expect to get [0x00, 0x07] back out. Of course a virtual machine this 'thin' does not really serve much of a purpose, but there are a variety of additional capabilities we can build into VMs that can make it worthwhile to add in the abstraction layer.
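
Concretely, under the same Input = &[u8] assumption as above, that call is a one-liner (this is a minimal sketch; it assumes the placeholder Output type supports equality comparison):

// Run f through the VM on a concrete input and check the result.
let z = VM(&f, &[0x01, 0x05, 0x07, 0x00]);
assert_eq!(z, Output::from([0x00, 0x07]));   // min = 0x00, max = 0x07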

A zkVM is just such a VM, empowered with two additional properties: it is proving and it can be private (or, more specifically, zero-knowledge – hence the 'zk').

Proving means that in addition to returning z = f(x), the zkVM also returns a proof pi. After the prover computes (z, pi) = zkVM(f, x), anybody can use (f, x, z, pi) to quickly verify that f(x) == z without having to rerun all of f. Just to reiterate: no matter how long or complex f is, and no matter how many compute cycles it takes to execute it, anybody can use the proof to check its correct execution quickly and cheaply.

Private means that the program can have a second input f(x, y), and the zkVM can still generate a proof pi such that anybody can take (f, x, z, pi) and quickly verify that the prover knows a y such that f(x, y) == z. The zero-knowledge property guarantees that in the process the verifier learns nothing about the specific y used. Here we call x the public input (known to both the prover and the verifier) and y the private input (known to the prover only).
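
In the same Rust-like pseudocode, the verifier's side of the protocol might look like the following sketch. Here verify is a hypothetical companion routine to prove (not a real API), check is a stand-in for the scheme's actual verification math, and the Proof type is sketched just below:

// Hypothetical verifier sketch. Note that the private input y never
// appears: the verifier checks pi against public data only.
fn verify(
    f: &dyn Fn(Input, Input) -> Output,
    x: Input,      // public input, known to everyone
    z: Output,     // claimed output
    pi: &Proof,
) -> bool {
    // cheap cryptographic checks, far cheaper than re-running f;
    // `check` is stand-in pseudocode for the scheme's verification math
    pi.digest == hash(f) && check(pi, x, z)
}

A hypothetical verifier: cheap to run, and blind to y.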

So altogether, in some Rust-like pseudocode a zkVM looks something like:

struct Proof {
    pub digest: [u8; 32],   // binds the proof to the program being proven
    // ...
}

fn zkVM(
    f: &dyn Fn(Input, Input) -> Output,
    x: Input,
    y: Input,
) -> (Output, Proof) {
    let z: Output = f(x, y);
    let pi: Proof = prove(f, x, y, z);

    // the proof commits to (a hash of) the program itself
    assert_eq!(pi.digest, hash(f));

    (z, pi)
}

A zkVM with support for public and private inputs.
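
Putting the pieces together, a full round trip might look like this sketch, reusing the hypothetical verify routine from above:

// Prover side: x is public, y stays private to the prover.
let (z, pi) = zkVM(&f, x, y);

// Verifier side: only (f, x, z, pi) are needed – y is never revealed,
// and checking pi is far cheaper than re-running f.
assert!(verify(&f, x, z, &pi));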

As a shameless plug, you can try out the Nexus zkVM now.

Although zkVMs are a simple concept, the design of a prove function (and its corresponding verify routine) that is fast enough for a zkVM to be useful in real applications relies on the absolute cutting edge of cryptographic research. This is not hyperbole. Both the theoretical and the practical state of the art in proving scheme design for zkVMs are – at the time of writing – advancing on the order of weeks. Moreover, the field is advancing in terms of both breadth, due to fundamentally new ideas and approaches, and depth, due to new optimizations and enhancements of existing constructions.

This rapid advancement makes it extremely difficult to develop software that is at once state-of-the-art in performance and robust and trustworthy. As implementers we want to be careful and deliberate, since zkVMs are fundamentally about providing trust and privacy. But at the same time, we do not want to be left behind as the field charges ahead in terms of performance. The mixed metaphor of a forest and murky waters evokes escaping a swamp, and that feels like the best analogy: the field is hard to navigate, with a lot of risk in moving too fast, yet there are very good reasons to want to make quick progress.

Here at Nexus we see this challenge as an opportunity. Verifiable computation has many valuable and exciting applications to building provable accountability and transparency into the foundations of computing and the internet. So we're working hard on finding the best path forward for efficient and trustworthy proving with the Nexus zkVM.

An important note to all of this is that a zkVM implementation is not necessarily forced into using a single proof system. Much of what a zkVM does – execute compiled code and generate a trace of its execution – is independent of the choice of proof system. A zkVM can absolutely be designed to support multiple proving backends at the same time, each with different features and tradeoffs for different applications, as sketched below. Some proof systems share even more between them, such as being based on the same constraint systems, arithmetizations, commitment schemes, and memory models (don't worry if these are unfamiliar topics; we will cover each in turn). Still, more supported proof systems means more code, which means greater development, maintenance, and auditing costs – motivating backing a zkVM with as few schemes as possible while still meeting user requirements.
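
As an illustrative sketch of that separation (all trait and type names here are hypothetical, not the Nexus API), the execution front end can be shared while proving hides behind a trait:

// Illustrative sketch: tracing is common to all backends, while
// proving and verifying are abstracted behind a trait.
trait ProvingBackend {
    type Proof;
    fn prove(&self, trace: &ExecutionTrace) -> Self::Proof;
    fn verify(&self, statement: &Statement, proof: &Self::Proof) -> bool;
}

fn run_and_prove<B: ProvingBackend>(backend: &B, f: Program, x: Input) -> B::Proof {
    let trace = execute(f, x);   // identical regardless of backend
    backend.prove(&trace)
}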

Wants and Needs

Let's begin by considering some of the properties that are either required or desired when building and deploying zkVMs.

Note we will (with one exception) not discuss correctness or security: we take as a given that any proving scheme worth considering must be correct and secure.

Performance

The simplest of needs – we want a proving scheme to be fast, in order to shrink the overhead of running a program in a zkVM compared to normal direct execution on hardware. Executing a program through the zkVM is necessarily slower than just executing f alone, as the former invokes the latter, but the smaller the delta – i.e., the quicker prove is – the better. Now, 'fast' here is relative: there are plenty of zkVM applications where taking three days instead of three seconds to get a proof may not ultimately matter all that much, such as in many zk-rollup or lottery applications. However, in general, increased efficiency drives a corresponding increase in the number of applications for which the zkVM is viable.
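
One way to make that delta concrete is the prover overhead factor: proving time divided by native execution time. A rough measurement sketch, reusing the zkVM pseudocode from earlier:

use std::time::Instant;

// Rough sketch of measuring multiplicative prover overhead.
let t0 = Instant::now();
let _z = f(x, y);                   // plain execution
let native = t0.elapsed();

let t1 = Instant::now();
let (_z, _pi) = zkVM(&f, x, y);     // execution plus proving
let proving = t1.elapsed();

println!(
    "prover overhead: {:.0}x",
    proving.as_secs_f64() / native.as_secs_f64()
);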

On-Chain Verifiability

One of the main application domains for zkVMs is moving costly computations off of decentralized runtimes like the Ethereum Virtual Machine (EVM). Instead of doing the expensive computation on-chain, it can be run off-chain by an untrusted party (like a zk-rollup operator) who then generates and submits to the network a proof of correct execution. An on-chain verifier can then check and accept the submitted proof, allowing users to treat the resultant computation as though it had in fact been executed on-chain in its entirety.

In order to do this, however, the proof must (a) be verifiable efficiently and cheaply on-chain, and (b) be small enough to be stored cheaply on-chain. A significant contributor to (a) is whether the mathematical foundation of the proof is compatible with the mathematical foundations of the chain (e.g., the bn254 curve for Ethereum). Proving schemes based on an alternative foundation (e.g., a different curve, lattices, or a small field) may still be usable, but with the tradeoff that the verification circuit must then be implemented using non-native field arithmetic, which will likely increase the size of the verification circuit and the cost of storing and running it on-chain.

Roughly speaking, the 'ideal' proof form for Ethereum verifiability, at least, is a Groth16 proof over bn254 (a scheme invented by our very own Chief Scientist, Jens Groth), which has an extremely small proof and existing software support for verification using native field arithmetic. So one approach is to compress larger proofs down to size recursively: use a proof system with small proofs (like Groth16) to prove the correct verification of proofs from systems with larger proofs.
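
Schematically, in pseudocode (the module and function names here are placeholders, not real libraries):

// Compress by recursion: the outer statement is "big_pi verifies",
// a much smaller circuit than f itself.
let big_pi = fast_but_large::prove(f, x, y, z);
let small_pi = groth16::prove(
    fast_but_large_verifier_circuit,   // circuit encoding big_pi's verifier
    (x, z),                            // public inputs carried through
    big_pi,                            // witness for the outer proof
);
// small_pi is what gets posted and checked on-chain.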

Precompiles

Precompiles are dedicated circuits that avoid the overhead of the CPU abstraction for computations that are important to certain application domains – think a hash function or elliptic curve arithmetic for proving cryptographic operations, or matrix multiplication for proving inference by a machine-learned model. In many schemes precompiles come at a privacy cost – the verifier can see that a precompile has been used – but this may be a valid tradeoff, and it is irrelevant for publicly verifiable computation use cases.

Some proving schemes prioritize (efficiently) proving a specific (set of) circuits – such as some lookup argument-based constructions like Jolt – making extending them with precompiles a challenge. Others can prove any arbitrary circuit, but can only compose that one specific circuit across program execution steps – a property called uniformity – which precludes the main utility of precompiles: avoiding the CPU abstraction. Supporting precompiles is a significant boost for many computationally-intensive application domains, and will likely be necessary to bring some of them, such as ML inference, to practicality. A hypothetical interface sketch follows below.
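
To make the idea concrete, a precompile-friendly zkVM might expose an interface like this hypothetical sketch (all names invented for illustration):

// Hypothetical sketch: a precompile contributes dedicated constraints
// directly, bypassing the CPU's fetch-decode-execute circuit.
trait Precompile {
    fn constrain(&self, cs: &mut ConstraintSystem, input: &[Var], output: &[Var]);
}

struct Sha256;   // hash function for cryptographic workloads
struct MatMul;   // matrix multiplication for ML inference workloads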

Community

The community of proving scheme implementers is small but growing, and it already produces a lot of code. Prioritizing proving schemes with substantial existing high-quality software support may increase the frequency and confidence with which new approaches can be implemented and validated.

Trusted Foundations

This last entry leans a bit more into the theoretical side of zero-knowledge cryptography. There are a variety of mathematical foundations already being used for proving schemes (e.g., cycles of elliptic curves, small fields, lattices, error-correcting codes, hash functions), each leading to different dependencies on various computational hardness assumptions. Compared to, say, the RSA or Diffie-Hellman assumptions, many of these depended-upon assumptions are relatively underexplored from a theoretical standpoint. Moreover, building zero-knowledge arguments on top of these foundations is itself a very new field, and even elite researchers make mistakes.

As such, there is a benefit to working with the best known schemes built on the best known foundations that have received the greatest scrutiny from researchers, and in following the standardization efforts already under way for the domain. Formal verification can also boost confidence in the correctness and security of both the choice of scheme and the quality of an implementation.

This discussion continues in The Murky Proof System Waters: Part II, where we analyze the pros and cons of the different potential proof system features.
