Building a Zero Knowledge Proof from an R1CS

Given an arithmetic circuit encoded as a Rank 1 Constraint System, it is possible to create a ZK-proof of having a witness, albeit not a succinct one. This article describes how to accomplish that.

A zero knowledge proof for an R1CS is accomplished by converting the witness vector into finite field elliptic curve points and replacing the Hadamard product with a bilinear pairing for each row.

Given a Rank 1 Constraint System where each matrix has $n$ rows and $m$ columns, we write it as

$$\mathbf{L}\mathbf{a}\circ\mathbf{R}\mathbf{a}=\mathbf{O}\mathbf{a}$$

Where $\mathbf{L}$, $\mathbf{R}$, $\mathbf{O}$ are matrices with $n$ rows and $m$ columns, and $\mathbf{a}$ is the witness vector (containing a satisfying assignment to each of the signals in the arithmetic circuit). The vector $\mathbf{a}$ has 1 column and $m$ rows and $\circ$ is element-wise multiplication (Hadamard product).

In expanded form, it looks like

$$ \left[ \begin{array}{ccc} l_{1,1} & \cdots & l_{1,m} \\ \vdots & \ddots & \vdots \\ l_{n,1} & \cdots & l_{n,m} \end{array} \right] \left[ \begin{array}{c} a_1 \\ \vdots \\ a_m \end{array} \right] \circ \left[ \begin{array}{ccc} r_{1,1} & \cdots & r_{1,m} \\ \vdots & \ddots & \vdots \\ r_{n,1} & \cdots & r_{n,m} \end{array} \right] \left[ \begin{array}{c} a_1 \\ \vdots \\ a_m \end{array} \right] = \left[ \begin{array}{ccc} o_{1,1} & \cdots & o_{1,m} \\ \vdots & \ddots & \vdots \\ o_{n,1} & \cdots & o_{n,m} \end{array} \right] \left[ \begin{array}{c} a_1 \\ \vdots \\ a_m \end{array} \right] $$

$$ = \left[ \begin{array}{ccc} a_1 l_{1,1} + \cdots + a_m l_{1,m} \\ \vdots \\ a_1 l_{n,1} + \cdots + a_m l_{n,m} \end{array} \right] \circ \left[ \begin{array}{ccc} a_1 r_{1,1} + \cdots + a_m r_{1,m} \\ \vdots \\ a_1 r_{n,1} + \cdots + a_m r_{n,m} \end{array} \right] = \left[ \begin{array}{ccc} a_1 o_{1,1} + \cdots + a_m o_{1,m} \\ \vdots \\ a_1 o_{n,1} + \cdots + a_m o_{n,m} \end{array} \right] $$

$$ = \left[ \begin{array}{ccc} \sum_{i=1}^m a_i l_{1,i} \\ \sum_{i=1}^m a_i l_{2,i} \\ \vdots \\ \sum_{i=1}^m a_i l_{n,i} \end{array} \right] \circ \left[ \begin{array}{ccc} \sum_{i=1}^m a_i r_{1,i} \\ \sum_{i=1}^m a_i r_{2,i} \\ \vdots \\ \sum_{i=1}^m a_i r_{n,i} \end{array} \right] = \left[ \begin{array}{ccc} \sum_{i=1}^m a_i o_{1,i} \\ \sum_{i=1}^m a_i o_{2,i} \\ \vdots \\ \sum_{i=1}^m a_i o_{n,i} \end{array} \right] $$

$$ = \begin{array}{ccc} \sum_{i=1}^m a_i l_{1,i} \sum_{i=1}^m a_i r_{1,i} = \sum_{i=1}^m a_i o_{1,i} \\ \sum_{i=1}^m a_i l_{2,i} \sum_{i=1}^m a_i r_{2,i} = \sum_{i=1}^m a_i o_{2,i} \\ \vdots \\ \sum_{i=1}^m a_i l_{n,i} \sum_{i=1}^m a_i r_{n,i} = \sum_{i=1}^m a_i o_{n,i} \end{array} $$

In this setup, we can prove to a verifier that we have a witness vector $\mathbf{a}$ that satisfies the R1CS simply by giving them the vector $\mathbf{a}$, but with the obvious drawback that this is not a zero knowledge proof!

Zero knowledge proof algorithm for an R1CS.

If we “encrypt” the witness vector by multiplying each entry with $G_1$ or $G_2$, the math will still work properly!

To understand this, consider that if we carry out the matrix multiplication

$$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ \end{bmatrix} \begin{bmatrix} 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 14 \\ 32 \end{bmatrix} $$

and also

$$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ \end{bmatrix} \begin{bmatrix} 4G_1 \\ 5G_1 \end{bmatrix} = \begin{bmatrix} 14G_1 \\ 32G_1 \end{bmatrix} $$

The discrete logs of the two elliptic curve points in the second matrix multiplication are the same value as the elements of the first matrix multiplication.

In other words, each time we multiply the column vector by a row in the square matrix, we carry out two elliptic curve point multiplications and one elliptic curve addition.

Notation for elliptic curves

We say $[aG_1]_1$ is a $\mathbb{G}_1$ elliptic curve point created from multiplying the field element $a$ by $\mathbb{G}_1$. We say $[aG_2]_2$ is a $\mathbb{G}_2$ elliptic curve point generated by multiplying a with the generator $G_2$. Because of the discrete log problem, we cannot extract $a$ given $[aG_1]_1$ or $[aG_2]_2$. Given a $A \in \mathbb{G}_1$ and $B \in \mathbb{G}_2$ point, we say the pairing of the two points is $A\bullet B$.

Prover steps

Let’s encrypt our $\mathbf{a}$ vector by multiplying each entry with the generator point $G_1$ to produce the elliptic curve point $[a_iG_1]_1$.

For the matrix $\mathbf{L}$, we are doing the following:

$$ \left[ \begin{array}{ccc} l_{1,1} & \cdots & l_{1,m} \\ \vdots & \ddots & \vdots \\ l_{n,1} & \cdots & l_{n,m} \end{array} \right] \left[ \begin{array}{c} [a_1 G_1]_1 \\ \vdots \\ [a_m G_1]_1 \end{array} \right] = \left[ \begin{array}{ccc} l_{1,1}[a_1 G_1]_1 & + \cdots + & l_{1,m}[a_m G_1]_1 \\ \vdots & \ddots & \vdots \\ l_{n,1}[a_1 G_1]_1 & + \cdots + & l_{n,m}[a_m G_1]_1 \end{array} \right] = \left[ \begin{array}{c} \sum_{i=1}^m l_{1,i}[a_i G_1]_1 \\ \sum_{i=1}^m l_{2,i}[a_i G_1]_1 \\ \vdots \\ \sum_{i=1}^m l_{n,i}[a_i G_1]_1 \end{array} \right] $$

In anticipation of the Hadamard product becoming a list of elliptic curve pairings, we can use $G_2$ points to also encrypt the $\mathbf{a}$ vector so the verifier can do the pairing.

$$ \left[ \begin{array}{ccc} r_{1,1} & \cdots & r_{1,m} \\ \vdots & \ddots & \vdots \\ r_{n,1} & \cdots & r_{n,m} \end{array} \right] \left[ \begin{array}{c} [a_1 G_2]_2 \\ \vdots \\ [a_m G_2]_2 \end{array} \right] = \left[ \begin{array}{ccc} r_{1,1}[a_1 G_2]_2 & + \cdots + & r_{1,m}[a_m G_2]_2 \\ \vdots & \ddots & \vdots \\ r_{n,1}[a_1 G_2]_2 & + \cdots + & r_{n,m}[a_m G_2]_2 \end{array} \right] = \left[ \begin{array}{c} \sum_{i=1}^m r_{1,i}[a_i G_2]_2 \\ \sum_{i=1}^m r_{2,i}[a_i G_2]_2 \\ \vdots \\ \sum_{i=1}^m r_{n,i}[a_i G_2]_2 \end{array} \right] $$

After this operation, we have a single column of elliptic curve points in $G_2$ originating from the multiplication $\mathbf{L}\mathbf{a}$ and a single column of $G_2$ points from $\mathbf{R}\mathbf{a}$.

The naive next step would be to encrypt the $\mathbf{a}$ vector with $G_{12}$ points so that the verifier can pair the result of $\mathbf{L}\mathbf{a}$ with $\mathbf{R}\mathbf{a}$ to see if it equals $\mathbf{O}\mathbf{a}$, but $G_{12}$ points are massive, so we would rather have the verifier pair the $\mathbf{O}\mathbf{a}$ elliptic curve points in $G_1$ then pair each entry with a $G_2$ point. Pairing with a $G_2$ point is, in a sense, “multiplying by one” but turning the $G_1$ point into a $G_{12}$ point.

The prover then hands the $G_1$ vector and the $G_2$ vector to the verifier.

Verification step

Thus, the verification step becomes

$$\left[ \begin{array}{c} \sum_{i=1}^m l_{i,1}[a_i G_1]_1 \\ \sum_{i=1}^m l_{i,1}[a_i G_1]_1 \\ \vdots \\ \sum_{i=1}^m l_{i,1}[a_i G_1]_1 \end{array} \right] \begin{matrix} \bullet \\ \bullet \\ \vdots \\ \bullet \end{matrix} \left[ \begin{array}{c} \sum_{i=1}^m r_{i,1}[a_i G_2]_2 \\ \sum_{i=1}^m r_{i,1}[a_i G_2]_2 \\ \vdots \\ \sum_{i=1}^m r_{i,1}[a_i G_2]_2 \end{array} \right]\stackrel{?}{=} \left[ \begin{array}{c} \sum_{i=1}^m o_{i,1}[a_i G_1]_1 \\ \sum_{i=1}^m o_{i,1}[a_i G_1]_1 \\ \vdots \\ \sum_{i=1}^m o_{i,1}[a_i G_1]_1 \end{array} \right] \begin{matrix} \bullet \\ \bullet \\ \vdots \\ \bullet \end{matrix} \left[ \begin{array}{c} G_2 \\ G_2 \\ \vdots \\ G_2 \end{array} \right] $$

$$ = \begin{array}{c} \sum_{i=1}^m l_{i,1}[a_i G_1]_1\bullet \sum_{i=1}^m r_{i,1}[a_i G_2]_2 \\ \sum_{i=1}^m l_{i,2}[a_i G_1]_1\bullet \sum_{i=1}^m r_{i,2}[a_i G_2]_2 \\ \vdots \\ \sum_{i=1}^m l_{i,n}[a_i G_1]_1\bullet \sum_{i=1}^m r_{i,n}[a_i G_2]_2 \end{array} \stackrel{?}{=} \begin{array}{c} \sum_{i=1}^m o_{i,1}[a_i G_1]_1\bullet G_2 \\ \sum_{i=1}^m o_{i,2}[a_i G_1]_1\bullet G_2 \\ \vdots \\ \sum_{i=1}^m o_{i,n}[a_i G_1]_1 \bullet G_2 \end{array} $$

The above vectors of $G_{12}$ elements will be element-wise equal if and only if the prover has provided a valid witness.

Well, almost. We’ll get to that in a following section.

First we need to mention an important implementation detail

Public inputs

If our knowledge claim is “I know $x$ such that $x³ + 5x + 5 = y$ where $y = 155$”, then our witness vector will probably look like the following:

$$[1, y, x, v]$$

where $v = x^2$. In this case, we need $[1, y]$ to be public. To accomplish that, we simply don’t encrypt the first two elements of the witness. The verifier will check the public outputs, then encrypt the public inputs by multiplying them with a $G_1$ or $G_2$ point so that the verification formula does not change.

Dealing with a malicious prover.

Because the vectors are encrypted, the verifier cannot immediately know if the vector of $\mathbb{G}₁$ points encrypts the same values as the vector of $\mathbb{G}₂$ points.

That is, the prover is supplying $\mathbf{a}G_1$ and $\mathbf{a}G_2$. Since the verifier doesn’t know the discrete logs of the vector of points, how does the verifier know that the vector of $\mathbb{G}₁$ points has the same discrete logs as the vector of $\mathbb{G}₂$ points?

The verifier can check for the equality of the discrete logs (without knowing them) by pairing both vectors of points with a vector of the opposite generator and seeing that the resulting $\mathbb{G}_{12}$ points are equal. Specifically,

$$ \begin{bmatrix} a_1G_1 \\ a_2G_1 \\ \vdots \\ a_mG_1 \end{bmatrix} \begin{matrix} \bullet \\ \bullet \\ \vdots \\ \bullet \end{matrix} \begin{bmatrix} G_2 \\ G_2 \\ \vdots \\ G_2 \end{bmatrix} \stackrel{?}{=} \begin{bmatrix} a_1G_2 \\ a_2G_2 \\ \vdots \\ a_mG_2 \end{bmatrix} \begin{matrix} \bullet \\ \bullet \\ \vdots \\ \bullet \end{matrix} \begin{bmatrix} G_1 \\ G_1 \\ \vdots \\ G_1 \end{bmatrix} $$

This algorithm is mostly academic

This algorithm is very inefficient for the verifier. If the matrices in the R1CS are large (and for interesting algorithms, they will be), then the verifier has a lot of pairings and elliptic curve additions to do. Elliptic curve addition is rather fast, but elliptic curve pairings are slow (and cost a lot of gas on Ethereum).

However, it is nice to see that at this stage, zero knowledge proofs are possible, and if you have a good grasp of elliptic curve operations (and haven’t forgotten your matrix arithmetic), they aren’t hard to understand.

Making this technique truly zero knowledge

As it is right now, our witness vector cannot be decrypted, however it can be guessed. If an attacker (someone trying to discover the unencrypted witness) uses some auxiliary information to make an educated guess at the witness, they can check their work by multiplying their guessed witness vector by the elliptic curve point generators and seeing if the result is the same as the prover’s witness vectors.

We will learn how to defend against witness guessing in our coverage of Groth16.

Also, keep in mind nobody does the described algorithm in the real world, as it is too inefficient. However, if you implement it, it will help you practice implementing meaningful elliptic curve arithmetic and build a functional end-to-end (almost) zero knowledge proof.

You can see an example implementation of algorithm described here by Obront in this repo.

Learn more with RareSkills

This material is from our Zero Knowledge Course.

Originally Published August 26, 2023

The Intuition Behind ECDSA

The intuition behind elliptic curve digital signatures (ECDSA) This article explains how the ECDSA (Elliptic Curve Digital Signature Algorithm) works as well as why it works. We will incrementally “rediscover” the algorithm from first principles in this tutorial. Prerequisites We assume prior knowledge of Elliptic Curve Arithmetic Elliptic Curve Arithmetic in Finite Fields Digital Signature […]

Trusted Setup

Trusted Setup A trusted setup is a mechanism ZK-SNARKs use to evaluate a polynomial at a secret value. Observe that a polynomial $f(x)$ can be evaluated by computing the inner product of the coefficients with successive powers of $x$: For example, if $f(x)=3x^3+2x^2+5x+10$, then the coefficients are $[3,2,5,10]$ and we can compute the polynomial as […]

The Schwartz-Zippel Lemma and its application to Zero Knowledge Proofs

The Schwartz-Zippel Lemma and its application to Zero Knowledge Proofs Nearly all ZK-Proof algorithms rely on the Schwartz-Zippel Lemma to achieve succintness. The Schwartz-Zippel Lemma states that if we are given two polynomials $p(x)$ and $q(x)$ with degrees $d_p$ and $d_q$ respectively, and if $p(x) \neq q(x)$, then the number of points where $p(x)$ and […]

Lagrange Interpolation with Python

Lagrange Interpolation with Python Lagrange interpolation is a technique for computing a polynomial that passes through a set of $n$ points. Interpolating a vector as a polynomial Examples A straight line through two points Consider that if we have two points, they can be interpolated with a line. For example, given $(1, 1)$ and $(2, […]