Short Discrete Log Proofs for FHE and Ring-LWE Ciphertexts

Abstract. In applications of fully-homomorphic encryption (FHE) that involve computation on encryptions produced by several users, it is important that each user proves that her input is indeed well-formed. This may simply mean that the inputs are valid FHE ciphertexts or, more generally, that the plaintexts $m$ additionally satisfy $f(m) = 1$ for some public function $f$. The most efficient FHE schemes are based on the hardness of the Ring-LWE problem, and so a natural solution would be to use lattice-based zero-knowledge proofs for proving properties about the ciphertext. Such methods, however, require larger-than-necessary parameters and result in rather long proofs, especially when proving general relationships.

In this paper, we show that one can get much shorter proofs (roughly 1.25KB) by first creating a Pedersen commitment from the vector corresponding to the randomness and plaintext of the FHE ciphertext. To prove validity of the ciphertext, one can then prove that this commitment is indeed to the message and randomness and that these values are in the correct range. Our protocol utilizes a connection between polynomial operations in the lattice scheme and inner product proofs for Pedersen commitments of Bünz et al. (S&P 2018). Furthermore, our proof of equality between the ciphertext and the commitment is very amenable to amortization: proving the equivalence of $k$ ciphertext / commitment pairs only requires an additive $O(\log k)$ of extra space compared to one such proof. For proving additional properties of the plaintext(s), one can then directly use the logarithmic-space proofs of Bootle et al. (Eurocrypt 2016) and Bünz et al. (IEEE S&P 2018) for proving arbitrary relations of discrete log commitments. Our technique is not restricted to FHE ciphertexts and can be applied to proving many other relations that arise in lattice-based cryptography. For example, we can create very efficient verifiable encryption / decryption schemes with short proofs in which confidentiality is based on the hardness of Ring-LWE while the soundness is based on the discrete logarithm problem. While such proofs are not fully post-quantum, they are adequate in scenarios where secrecy needs to be future-proofed, but one only needs to be convinced of the validity of the proof in the pre-quantum

Introduction
Fully-homomorphic encryption (FHE) allows for evaluations of arbitrary functions over encrypted data. The traditional application of this primitive is outsourcing: a user encrypts his data and sends it to a server, which performs the (intensive) computation and returns the encrypted result. In this scenario, the user is the only one affected by the outcome of the computation, and so it is not necessary for him to prove that the ciphertexts he submitted to the server are properly formed.
There are other applications of FHE, however, that involve computations on ciphertexts submitted by several users [LTV12,MW16,PS16]. For example, multi-key FHE allows the server to compute over ciphertexts encrypted under different keys and produce a result that can then be jointly decrypted by the participating parties. One can also use FHE in a "distributed ledger" setting (e.g. [ABB+18]) where users can submit ciphertexts encrypted under some particular public key and a computation can be performed by anyone on behalf of the holder of the secret key to produce an encrypted output. This is useful in scenarios where certain entities (the holder of the secret key in our example) wish to perform only a limited amount of computation.
For the above scenarios where more than one user is involved, it is important that each party provides a zero-knowledge proof that his input is a valid FHE ciphertext; otherwise the final output may, unbeknownst to anyone else, be constructed from invalid data. It may furthermore be necessary to prove that the encrypted message satisfies certain additional properties dictated by the protocol. For encryptions based on the discrete logarithm problem, such proofs can be very efficiently constructed for certain relations using techniques in [CS03] and for general circuits using the more recent logarithmic-space proofs for discrete logarithms [BCC+16,BBB+17]. FHE schemes, on the other hand, are constructed from LWE (or LWE-like) encryption schemes (e.g. [BGV12]), which unfortunately do not enjoy such practical proofs. For example, the most efficient verifiable encryption scheme for Ring-LWE ciphertexts [LN17] only handles linear relations $\mathbf{B}\vec{\mathbf{m}} = \vec{\mathbf{t}}$ and gives proofs of knowledge of an $\vec{\mathbf{m}}$ satisfying $\mathbf{B}\vec{\mathbf{m}} = \mathbf{c} \cdot \vec{\mathbf{t}}$, where $\mathbf{c}$ is some polynomial with small coefficients. This is satisfactory in some scenarios (see [LN17] for examples), but is not general enough for many other applications. Obtaining proofs without the polynomial $\mathbf{c}$, even for simple relations, would make the proof sizes on the order of megabytes (cf. [LLNW18]).
In this work, we take a different approach for creating such proofs. An FHE (or more generally, a Ring-LWE) ciphertext can be written as

$$\mathbf{A}\vec{\mathbf{s}} = \vec{\mathbf{t}}, \qquad (1)$$

where $\mathbf{A}$ is the public key, $\vec{\mathbf{t}}$ is the ciphertext, and $\vec{\mathbf{s}}$ consists of the randomness and the message. All operations are performed over some polynomial ring $R_q = \mathbb{Z}_q[X]/(\mathbf{f})$ for some integer $q$ and a monic, irreducible polynomial $\mathbf{f} \in \mathbb{Z}[X]$ of degree $d$.
The main result of the current work is an efficient protocol for proving knowledge of $\vec{\mathbf{s}}$ with small coefficients in the above equation. Our strategy is to first create a joint Pedersen commitment $t = \mathrm{Com}(\vec{\mathbf{s}})$ to all the coefficients in $\vec{\mathbf{s}}$, and prove in zero-knowledge that these coefficients, when interpreted as a polynomial vector $\vec{\mathbf{s}}$, satisfy (1). At the same time, the proof will also show that the coefficients of $\vec{\mathbf{s}}$ are in the required range for valid Ring-LWE ciphertexts. Moreover, if we have many Ring-LWE ciphertexts $\vec{\mathbf{t}}_1, \ldots, \vec{\mathbf{t}}_k$, then the size of our proof is only approximately an additive $O(\log k)$ larger than the proof for one equation in (1).
Once we have a Pedersen commitment of the coefficients of $\vec{\mathbf{s}}$, we can additionally use the aforementioned very efficient zero-knowledge proofs for discrete logarithm commitments [BCC+16,BBB+17] to prove arbitrary properties of the plaintext contained in $\vec{\mathbf{s}}$. This gives us a verifiable encryption scheme (and also a verifiable decryption scheme) for Ring-LWE ciphertexts (see Section 1.5). As an example of the proof size, a proof of ciphertext validity for the Ring-LWE encryption scheme in (9) requires only 1.25KB.

Post-Quantum Security
One of the side advantages of FHE based on Ring-LWE is that the encryption scheme remains secure against quantum attacks (assuming that the Ring-LWE problem is post-quantum secure). Since Pedersen commitments are statistically hiding and all the proofs are statistical zero-knowledge, the secrecy of the ciphertext and the Pedersen commitment is still based on just Ring-LWE. The soundness of the proofs, however, is based on the hardness of the discrete log problem and is therefore not post-quantum.
Having the soundness of the proof not be post-quantum is still, for many scenarios, acceptable even if we do foresee quantum computers appearing in the future. For example, all proofs created until quantum computers capable of breaking discrete log actually appear would still be valid. Furthermore, the protocol can be easily altered to force the prover to create his Pedersen commitment and the zero-knowledge proof with "fresh" randomly-chosen generators and complete his proof in a specified amount of time. Breaking the soundness of this proof system would thus require solving the discrete log problem using a quantum computer within a prescribed time interval (e.g. several seconds).
While building a quantum computer capable of breaking cryptographic problems presents a very substantial scientific and engineering challenge, building one that is capable of solving such problems in seconds is a potentially significantly harder problem. Running Shor's algorithm on a 2048-bit number, under some reasonable assumptions on the error rate and the speed of each gate computation on a superconducting platform, would take around 27 hours and a billion physical qubits [FMMC12]. A trapped-ion based computer with very low error rate would need 110 days to perform the same operation [LWF+17]. One can sometimes decrease the running time by utilizing more qubits, but there are several other roadblocks that would keep the computation time from decreasing beyond certain barriers (cf. [Gid18] for a discussion). While it is too early to guess when (or if) it will be possible to run Shor's algorithm in under a minute, it certainly appears to be a problem that will require overcoming many more fundamental challenges even after a "basic" fault-tolerant universal quantum computer is built.

Other Applications
Our general result gives a way to prove that the secret $\vec{\mathbf{s}}$ in the linear equation (1) is the same as in the commitment $\mathrm{Com}(\vec{\mathbf{s}})$, where $\mathrm{Com}(\cdot)$ is a Pedersen commitment to the individual coefficients of $\vec{\mathbf{s}}$. Because (1) is quite generic, it can be used to represent many relations throughout lattice cryptography. For example, ciphertexts, commitments, and public keys in encryption / signature schemes are all of this form. One can therefore apply our protocol as a first step in a larger protocol that needs to prove something about the secret $\vec{\mathbf{s}}$. For example, verifiable encryption and decryption schemes (where the encryptor or decryptor needs to prove that the plaintext $m$ satisfies $f(m) = 1$ for some public function $f$) have many applications (cf. [CS03]), and such schemes that retain the post-quantum secrecy of the ciphertext can thus be built using our techniques. We sketch the construction in Section 1.5 and note that proving validity of FHE ciphertexts is just a special case of verifiable encryption.

Previous Related Work
A connection between Ring-LWE and discrete log commitments has been previously explored by Benhamouda et al. [BCK+14]. The construction in the current paper is completely different and enjoys significant advantages (both theoretical and practical) over the aforementioned prior work. Firstly, the modulus $q$ in (1) has to be the same as the group size underlying the discrete log commitment for the proof in [BCK+14], and taking $q \approx 2^{256}$ would require making the Ring-LWE / FHE scheme significantly less efficient than it needs to be (typical sizes of $q$ are $\approx 2^{30}$). Secondly, the protocol in [BCK+14] requires a separate Pedersen commitment for every coefficient of $\vec{\mathbf{s}}$ rather than one commitment for all the coefficients of $\vec{\mathbf{s}}$. Thirdly, the proof is a Σ-protocol with soundness error $1/d$ (where $d$ is the degree of $\mathbf{f}$) and so needs to be repeated around a dozen times. While [BCK+14] did not provide concrete parameters, we would estimate that our proofs would be shorter by 2 to 3 orders of magnitude. Additionally, our current proof can be amortized for proving $k$ equations as in (1) while only incurring an $O(\log k)$ additive overhead.
Our work can also be seen as complementary to that of Fiore, Gennaro, and Pastro [FGP14] where they give a succinct proof that the evaluation in the FHE scheme was performed correctly for certain types of functions.

High Level Overview of the Protocol
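Our general proof is for $k$ copies of (1), in other words a proof of knowledge of a matrix $\mathbf{S} \in R_q^{m \times k}$ with bounded coefficients such that

$$\mathbf{A}\mathbf{S} = \mathbf{T} \bmod (q, \mathbf{f}), \qquad (2)$$

where the columns of $\mathbf{T}$ are the $k$ right-hand sides $\vec{\mathbf{t}}_1, \ldots, \vec{\mathbf{t}}_k$. We will explicitly write out which modular reductions occur, as these will change throughout the protocol.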
In this overview, we will sketch the proof of a simpler version of (2), which is just a Ring-LWE / Ring-SIS equation

$$\sum_{i=1}^{m} \mathbf{a}_i \mathbf{s}_i = \mathbf{t} \bmod (q, \mathbf{f}), \qquad (3)$$

where $\mathbf{a}_i, \mathbf{t}, \mathbf{s}_i \in R_q$ and the coefficients of $\mathbf{s}_i$ have absolute value less than $B$. Afterwards, we will explain how this can be extended to the full proof of (2). Let $G$ be a group of size $p \le 2^{256}$ in which the discrete logarithm problem is hard. The prover first rewrites (3) so that it is entirely over the ring $\mathbb{Z}[X]$, i.e. there are no reductions modulo $q$ and $\mathbf{f}$:

$$\sum_{i=1}^{m} \mathbf{a}_i \mathbf{s}_i = \mathbf{t} - \mathbf{r}_1 \cdot q - \mathbf{r}_2 \cdot \mathbf{f}. \qquad (4)$$

The polynomials $\mathbf{r}_1$ and $\mathbf{r}_2$ are not unique, but we would like them to simultaneously have small coefficients and be of small degree. We show that $\mathbf{r}_1$ can be of degree $2(d-1)$ and have coefficients of absolute value at most $\frac{d}{2}(Bm + \|\mathbf{f}\|_\infty)$, while $\mathbf{r}_2$ can have degree $d - 2$ with coefficients having absolute value at most $\frac{1}{2}(q - 1)$.
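The lifting step from (3) to an equation over $\mathbb{Z}[X]$ can be sketched in a few lines of Python. This is only an illustration under assumed toy parameters ($d = 8$, $q = 97$, $\mathbf{f} = X^d + 1$, which are not the paper's parameters), and the helper names are hypothetical. It divides the integer product by the monic $\mathbf{f}$, centers the quotient modulo $q$ to obtain $\mathbf{r}_2$, and recovers $\mathbf{r}_1$ from the remaining multiple of $q$, using the sign convention $\sum_i \mathbf{a}_i\mathbf{s}_i = \mathbf{t} - \mathbf{r}_1 q - \mathbf{r}_2\mathbf{f}$.

```python
import random

def poly_mul(a, b):
    """Product of two integer polynomials (coefficient lists, constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_add(a, b):
    """Sum of two integer polynomials of possibly different lengths."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def centered(x, q):
    """Representative of x mod q in [-(q-1)/2, (q-1)/2] for odd q."""
    r = x % q
    return r - q if r > (q - 1) // 2 else r

def divmod_monic(c, f):
    """Quotient and remainder of c divided by the monic polynomial f over Z."""
    c = list(c)
    d = len(f) - 1
    quo = [0] * (len(c) - d)
    for k in range(len(c) - 1, d - 1, -1):
        coef = c[k]
        quo[k - d] = coef
        for j in range(d + 1):
            c[k - d + j] -= coef * f[j]
    return quo, c[:d]

# Toy parameters (assumptions for illustration only).
random.seed(1)
d, q, m, B = 8, 97, 3, 2
f = [1] + [0] * (d - 1) + [1]                      # X^d + 1
half = (q - 1) // 2
a = [[random.randint(-half, half) for _ in range(d)] for _ in range(m)]
s = [[random.randint(-(B - 1), B - 1) for _ in range(d)] for _ in range(m)]

# c = sum_i a_i * s_i computed over Z[X], degree at most 2(d-1).
c = [0] * (2 * d - 1)
for ai, si in zip(a, s):
    c = poly_add(c, poly_mul(ai, si))

quo, _ = divmod_monic(c, f)
r2 = [-centered(x, q) for x in quo]                # degree d-2, |coeff| <= (q-1)/2
w = poly_add(c, poly_mul(r2, f))                   # w = c + r2*f, congruent to t mod q
t = [centered(x, q) for x in w[:d]]                # t = c mod (q, f), centered
t_padded = t + [0] * (d - 1)
assert all((x - y) % q == 0 for x, y in zip(w, t_padded))
r1 = [(y - x) // q for x, y in zip(w, t_padded)]   # so that w = t - r1*q

# Check the integer relation c = t - r1*q - r2*f.
rhs = poly_add(poly_add(t, [-q * x for x in r1]),
               [-x for x in poly_mul(r2, f)])
assert rhs == c
```

With these toy values the coefficients of `r1` stay below the bound $\frac{d}{2}(Bm + \|\mathbf{f}\|_\infty)$ quoted in the text, which is what keeps the later range proof small.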
The prover creates a Pedersen commitment $t = \mathrm{Com}(\mathbf{s}_1, \ldots, \mathbf{s}_m, \mathbf{r}_1, \mathbf{r}_2) \in G$ where each integer coefficient of the $\mathbf{s}_i$ and $\mathbf{r}_i$ is in the exponent of a different generator $g_j$. The prover sends $t$ to the verifier. (If we would like to achieve post-quantum security based on the assumption that discrete log cannot be solved in a prescribed amount of time, then the $g_j$ should not be known to the prover before the start of the proof. This can be arranged either by having the verifier send them, or more precisely send a short seed that expands into the prescribed number of generators, at the start of the protocol, or by using a randomness beacon in non-interactive proofs.)
The verifier chooses a random challenge element $\alpha \in \mathbb{Z}_p$ and sends it to the prover. The prover now needs to give several proofs. In the real protocol, all these will be combined into one proof, but for ease of exposition, we will explain them separately here. The first proof is a range proof $\pi_{s,r}$ from [BBB+17] showing that all the committed values in $t$ are in the correct ranges. The second proof is a proof that (4) evaluated at $\alpha$ holds true over the field $\mathbb{Z}_p$. By the Schwartz-Zippel lemma, this implies that with probability $> 1 - 2d/|G|$, this equation also holds true over the polynomial ring $\mathbb{Z}_p[X]$. Since we have already proven that the coefficients of the $\mathbf{s}_i$ and $\mathbf{r}_i$ are relatively small and we assumed that $q$ is also small (compared to $p$), we know that if (4) holds true in $\mathbb{Z}_p[X]$, then it also holds over $\mathbb{Z}[X]$ because no reduction modulo $p$ takes place. This will complete the proof. We now just have to prove that (4) evaluated at $\alpha$ holds true mod $p$.
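Define the matrices

$$U = \begin{bmatrix} \mathbf{a}_1(\alpha) & \cdots & \mathbf{a}_m(\alpha) & q & \mathbf{f}(\alpha) \end{bmatrix} \bmod p, \qquad S = \begin{bmatrix} \mathbf{s}_1 \\ \vdots \\ \mathbf{s}_m \\ \mathbf{r}_1 \\ \mathbf{r}_2 \end{bmatrix}, \qquad V = \begin{bmatrix} 1 \\ \alpha \\ \vdots \\ \alpha^{2(d-1)} \end{bmatrix} \bmod p, \qquad (5)$$

where the rows of $S$ consist of the integer coefficients of the $\mathbf{s}_i$ and $\mathbf{r}_i$ with the constant coefficient in the leftmost column, padded with zeros to the common length $2d - 1$; i.e. if $\mathbf{s}_i = \sum_j \sigma_j X^j$, then the $i$-th row of $S$ is $[\sigma_0\ \sigma_1\ \cdots\ \sigma_{d-1}\ 0\ \cdots\ 0]$. With this notation, observe that the matrix product $SV = [\mathbf{s}_1(\alpha)\ \cdots\ \mathbf{s}_m(\alpha)\ \mathbf{r}_1(\alpha)\ \mathbf{r}_2(\alpha)]^T \bmod p$, and so

$$USV = \sum_{i=1}^{m} \mathbf{a}_i(\alpha)\mathbf{s}_i(\alpha) + q\,\mathbf{r}_1(\alpha) + \mathbf{f}(\alpha)\,\mathbf{r}_2(\alpha) \bmod p.$$

Thus if we prove that

$$USV = \mathbf{t}(\alpha) \bmod p, \qquad (6)$$

then we will end up proving that (4) evaluated at $\alpha$ is true modulo $p$. Since $U$, $V$ and $\mathbf{t}(\alpha)$ are public and we have a commitment to the coefficients of $S$, we can apply an extension of the inner-product proofs from [BCC+16,BBB+17] to prove our linear relation. To complete the protocol, the prover simply sends $\pi, \pi_{s,r}$ to the verifier, who accepts if all the proofs are correct.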
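The evaluation check can be illustrated numerically. The sketch below assumes toy values ($d = 4$, $q = 97$, $\mathbf{f} = X^4 + 1$) and uses the Mersenne prime $2^{61} - 1$ as a stand-in for the group order $p$; these are not the paper's parameters. For brevity, $\mathbf{t}$ is simply defined so that the integer relation $\sum_i \mathbf{a}_i\mathbf{s}_i + q\,\mathbf{r}_1 + \mathbf{f}\,\mathbf{r}_2 = \mathbf{t}$ holds by construction. The code builds the row vector $U$, the zero-padded coefficient matrix $S$ and the vector of powers $V$, and compares $USV$ with $\mathbf{t}(\alpha)$ modulo $p$.

```python
import random

def evaluate(poly, x, p):
    """Horner evaluation of an integer polynomial (constant term first) mod p."""
    acc = 0
    for coef in reversed(poly):
        acc = (acc * x + coef) % p
    return acc

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

random.seed(2)
p = 2**61 - 1                       # stand-in for the (prime) group order
d, q, m = 4, 97, 2                  # toy parameters, illustration only
f = [1, 0, 0, 0, 1]                 # X^4 + 1
a = [[random.randint(-48, 48) for _ in range(d)] for _ in range(m)]
s = [[random.randint(-1, 1) for _ in range(d)] for _ in range(m)]
r1 = [random.randint(-5, 5) for _ in range(2 * d - 1)]
r2 = [random.randint(-48, 48) for _ in range(d - 1)]

# Define t so that sum_i a_i*s_i + q*r1 + f*r2 = t holds over Z[X].
t = poly_add([q * x for x in r1], poly_mul(r2, f))
for ai, si in zip(a, s):
    t = poly_add(t, poly_mul(ai, si))

alpha = random.randrange(1, p)      # verifier's challenge
width = 2 * d - 1
S = [row + [0] * (width - len(row)) for row in s + [r1, r2]]
U = [evaluate(ai, alpha, p) for ai in a] + [q % p, evaluate(f, alpha, p)]
V = [pow(alpha, j, p) for j in range(width)]

SV = [sum(x * y for x, y in zip(row, V)) % p for row in S]
USV = sum(x * y for x, y in zip(U, SV)) % p

# A claimed t that differs in one coefficient fails the check.
t_bad = list(t)
t_bad[0] += 1
```

The honest $\mathbf{t}$ satisfies the check, while `t_bad` fails it. In general the Schwartz-Zippel lemma bounds the chance that a wrong polynomial identity survives a random evaluation point.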
Combining the Two Proofs. In the real protocol, which we describe in Section 5, we combine the two proofs $\pi_{s,r}$ and $\pi$ into one. The reason is that the range proof $\pi_{s,r}$ in [BBB+17] works by writing each coefficient in binary, storing a matrix of these coefficients, and then giving a proof that each coefficient of the decomposition is 0 or 1 (the number of these coefficients then implies the range).
Due to the fact that the ranges of the $\mathbf{s}_i$ and $\mathbf{r}_i$ are different, storing these in the same matrix would require us to increase the size of the matrix to accommodate the largest coefficients, which would be wasteful. Thus instead of proving the matrix equation (6), we write it out as a series of appropriate equations (each of varying length) where the coefficients of $S$ are in binary, and prove those instead. This allows us to do a range proof and the proof of (6) in one step.
We provide explicit details of the above algorithm in Section 5. We additionally obtain a tighter security proof of the inner-product proof of [BCC+16,BBB+17] by using a different extraction strategy, described in Section 3. In addition, our zero-knowledge range proof is somewhat simpler than the one in [BBB+17] because our range proof is constructed on top of a zero-knowledge inner product proof instead of the original Bulletproof inner product proof, which is not zero-knowledge. This removes the need to blind the vectors in the range proof, which simplifies extraction and saves two rounds of the protocol. The additional complexity in the inner product proof is basically just a Schnorr proof (see Section 4). These small improvements may be of independent interest.

Some observations about the proof strategy. The reason that we converted (3) into (4) and then used the Schwartz-Zippel lemma for proving (4) is to reduce the time complexity of the proof. An alternative, simpler procedure for proving (3) would have been the following: first write (3) as

$$\sum_{i=1}^{m} \mathbf{a}_i \mathbf{s}_i = \mathbf{t} - \mathbf{r}_1 \cdot q \bmod \mathbf{f} \qquad (7)$$

and create the commitments $t_s$ and $t_{r_1}$ as before. Now, observe that the polynomial multiplication $\mathbf{a}_i\mathbf{s}_i$ can be written as a matrix / vector product $A\vec{s}$, where column $j$ (labeled from $0$ to $d-1$) of $A$ consists of the coefficients of the polynomial $\mathbf{a}_i X^j \bmod \mathbf{f}$, which has degree at most $d-1$, and $\vec{s}$ is the vector of coefficients of $\mathbf{s}_i$. Thus $\sum_{i=1}^{m} \mathbf{a}_i\mathbf{s}_i$ can be written as a matrix / vector product itself. Then one could directly apply the modified inner-product proof to prove (7) modulo $p$, which would again imply that this equation holds true over $\mathbb{Z}$ (since the coefficients are all much smaller than $p$), and so this implies (3). The main problem with the above approach is that the matrices $A$ are $d \times d$ matrices, and so the proof of the matrix / vector product would require $O(d^2)$ exponentiations (or multiplications in elliptic curve groups) in $G$.

For typical values of $d > 1000$, this operation is quite expensive and could take several minutes even on a reasonably powerful machine. Our proof, on the other hand, takes advantage of the fact that the operations can be interpreted over the ring $\mathbb{Z}_p[X]$ for a very large $p$, and one can then prove polynomial equality via the Schwartz-Zippel lemma. Since polynomial evaluation is an inner product of $d$-dimensional vectors, constructing a matrix product proof only requires $O(d)$ exponentiations per evaluation. Note that this is also the reason that our proofs would be much less computationally efficient for proving relations over $\mathbb{Z}$ (i.e. LWE / SIS relations).
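The matrix / vector view of polynomial multiplication mentioned above can be checked on a small example. The sketch below assumes a toy modulus polynomial $\mathbf{f} = X^4 + X + 1$ and arbitrary small coefficients, with hypothetical helper names. It builds the $d \times d$ matrix whose column $j$ holds the coefficients of $\mathbf{a} X^j \bmod \mathbf{f}$ and confirms that multiplying it by the coefficient vector of $\mathbf{s}$ gives $\mathbf{a}\mathbf{s} \bmod \mathbf{f}$.

```python
def poly_mul(a, b):
    """Product of two integer polynomials (constant term first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_mod(c, f):
    """Remainder of c modulo the monic polynomial f, as a list of deg(f) coefficients."""
    c = list(c)
    d = len(f) - 1
    for k in range(len(c) - 1, d - 1, -1):
        coef = c[k]
        for j in range(d + 1):
            c[k - d + j] -= coef * f[j]
    c = c[:d]
    return c + [0] * (d - len(c))

f = [1, 1, 0, 0, 1]            # X^4 + X + 1 (toy choice)
d = len(f) - 1
a = [3, -1, 2, 5]
s = [1, 0, -1, 1]

# Column j of A holds the coefficients of a * X^j mod f.
cols = [poly_mod([0] * j + a, f) for j in range(d)]
A = [[cols[j][i] for j in range(d)] for i in range(d)]

As = [sum(A[i][j] * s[j] for j in range(d)) for i in range(d)]
direct = poly_mod(poly_mul(a, s), f)

matrix_entries = d * d         # size of the matrix relation: quadratic in d
evaluation_terms = d           # size of one polynomial evaluation: linear in d
```

Here `As` and `direct` agree, while the two counters show the quadratic versus linear size that motivates proving an evaluation at a random point instead of the full matrix relation.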
Another issue to draw attention to is that the polynomial equations we want to prove are modulo $q$, whereas the proofs are done modulo a larger $p$. As mentioned before, the reason for this is that in typical cryptographic applications of the Ring-LWE / Ring-SIS problems (such as FHE), the modulus $q$ is not very large (smaller than $2^{40}$). On the other hand, the discrete log commitments must be performed over a much larger group. If, however, an application called for the modulus $q$ to be a large prime, then our proof could use $q = p$, and we would never need to switch to working over $\mathbb{Z}[X]$: we could always work over $\mathbb{Z}_q[X]$ and have no need for the polynomial $\mathbf{r}_1$.
Simultaneously proving $k$ polynomial equations. The proof of knowledge of $\mathbf{S}$ satisfying (2) is a straightforward extension of the above-described algorithm, with the strategy for the proof being the same. First, we will prove that in the analogue of (4),

$$\mathbf{A}\mathbf{S} = \mathbf{T} - q\,\mathbf{R}_1 - \mathbf{f}\,\mathbf{R}_2 \qquad (8)$$

(with $\mathbf{T}$ the matrix whose columns are the $k$ ciphertexts), all the coefficients of $\mathbf{S}, \mathbf{R}_1, \mathbf{R}_2$ are small, and then prove that the above equation holds, with high probability, over the ring $\mathbb{Z}_p[X]$ for a very large $p$. This will imply that (8) also holds over $\mathbb{Z}[X]$, and thus (2) is true.

We now describe the protocol in slightly more detail. The first step of the protocol remains virtually identical, with the prover committing to $\mathbf{S}$ and $\mathbf{R}_1, \mathbf{R}_2$. After receiving the challenge $\alpha$, the prover again wishes to show that the coefficients of $\mathbf{S}, \mathbf{R}_1, \mathbf{R}_2$ are in the appropriate ranges and to prove the equality of (8) where each polynomial is evaluated at $\alpha$.
If we define $I_n \in \mathbb{Z}^{n \times n}$ to be the identity matrix, then one can rewrite what we would like to prove as If, for a polynomial $m \times k$ matrix $\mathbf{S}$, we create the $m \times (kd)$ integer matrix $S$ by writing each polynomial in $\mathbf{S}$ as a row consisting of its $d$ coefficients (the same way that the $\mathbf{s}_i$ were expanded in the matrix $S$ in (5)), then we can rewrite the above equation as Since all the matrices in the above equation except $S$ are public, we can again apply the modified inner-product proof from [BCC+16,BBB+17] to prove the equality modulo $p$. And again, as before, our real protocol would combine the range proof and the modified inner-product proof into one proof.
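One way to read the role of the identity matrix here is through a Kronecker product. If the $m \times (kd)$ integer matrix holds the coefficient rows of an $m \times k$ polynomial matrix, then multiplying it by $I_k \otimes v_\alpha$, with $v_\alpha = (1, \alpha, \ldots, \alpha^{d-1})^T$, evaluates every polynomial entry at $\alpha$ simultaneously. The sketch below checks this identity on assumed toy data; the variable names and the exact shape of the identity are illustrative assumptions, not the paper's notation.

```python
def evaluate(poly, x, p):
    """Horner evaluation of an integer polynomial (constant term first) mod p."""
    acc = 0
    for coef in reversed(poly):
        acc = (acc * x + coef) % p
    return acc

p = 2**61 - 1                      # stand-in prime modulus
d, m, k = 3, 2, 2                  # toy dimensions
alpha = 123456789

# An m x k matrix of polynomials, each given by d coefficients.
S_poly = [[[1, -1, 0], [0, 2, 1]],
          [[-1, 0, 1], [1, 1, -1]]]

# m x (k*d) integer matrix: each row concatenates the coefficient vectors.
S_hat = [[c for poly in row for c in poly] for row in S_poly]

# (k*d) x k matrix I_k (x) v_alpha: block-diagonal copies of the powers of alpha.
v = [pow(alpha, e, p) for e in range(d)]
K = [[v[r % d] if r // d == j else 0 for j in range(k)] for r in range(k * d)]

lhs = [[sum(S_hat[i][r] * K[r][j] for r in range(k * d)) % p
        for j in range(k)] for i in range(m)]
rhs = [[evaluate(S_poly[i][j], alpha, p) for j in range(k)] for i in range(m)]
```

Both `lhs` and `rhs` hold the matrix of evaluations, so a linear relation over the evaluated entries becomes a linear relation over the committed integer coefficients.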

Application to Verifiable Encryption and Decryption for Ring-LWE Ciphertexts
Notice that the first step of our proof involved creating a Pedersen commitment $t$ to the coefficients of $\mathbf{S}$. The rest of the proof then went on to show that the commitment is really to an $\mathbf{S}$ satisfying (2). Since at the end of the protocol we end up with a Pedersen commitment to $\mathbf{S}$, we can use another SNARK (e.g. one from [BBB+17]) that proves arbitrary relations of its committed values. Thus just proving knowledge of $\mathbf{S}$ naturally gives rise to verifiable encryption and decryption schemes for Ring-LWE encryption, as we sketch below.

In a verifiable encryption scheme, the encryptor produces an encryption of a message $\mathbf{m}$ and a ZKPoK that the ciphertext is a valid encryption of $\mathbf{m}$ and that $f(\mathbf{m}) = 1$ for a public function $f$. Consider the following "usual" encryption scheme based on Ring-LWE [LPR13]: the secret key consists of polynomials $\mathbf{s}, \mathbf{e}$ with small, bounded coefficients, and the public key consists of a random polynomial $\mathbf{a} \in R_q$ and $\mathbf{t} = \mathbf{a}\mathbf{s} + \mathbf{e} \in R_q$.
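The encryption of a message $\mathbf{m} \in R_q$, where all coefficients of $\mathbf{m}$ are in the range $[0, p)$, is created as in the equation below, where $\mathbf{r}, \mathbf{e}_1, \mathbf{e}_2$ are polynomials with bounded coefficients:

$$\begin{bmatrix} \mathbf{u} \\ \mathbf{v} \end{bmatrix} = \begin{bmatrix} p\mathbf{a} & p & 0 & 0 \\ p\mathbf{t} & 0 & p & 1 \end{bmatrix} \begin{bmatrix} \mathbf{r} \\ \mathbf{e}_1 \\ \mathbf{e}_2 \\ \mathbf{m} \end{bmatrix}. \qquad (9)$$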
For a verifiable encryption scheme, we can use our proof system with $\mathbf{A} \in R_q^{2 \times 4}$ and $\mathbf{S} \in R_q^{4 \times 1}$ to create Pedersen commitment(s) to $\mathbf{S}$ and prove that all the coefficients of $\mathbf{r}, \mathbf{e}_i, \mathbf{m}$ lie within their prescribed bounds and that (9) is satisfied by the commitment(s) representing $\mathbf{S}$. The preceding proves knowledge of the plaintext $\mathbf{m}$ for the ciphertext $(\mathbf{u}, \mathbf{v})^T$.
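To decrypt a ciphertext $(\mathbf{u}, \mathbf{v})^T$, the decryptor first computes

$$\mathbf{v} - \mathbf{u}\mathbf{s} = p(\mathbf{e}\mathbf{r} + \mathbf{e}_2 - \mathbf{s}\mathbf{e}_1) + \mathbf{m}. \qquad (10)$$

Since all the coefficients of the above equation are small, no reduction modulo $q$ takes place and this equation holds true over $\mathbb{Z}[X]$.
Computing $\mathbf{v} - \mathbf{u}\mathbf{s} \bmod p$ therefore recovers $\mathbf{m}$.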
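The encryption and decryption equations can be exercised on toy parameters. The sketch below is an illustration only: it assumes $\mathbf{f} = X^d + 1$ with $d = 8$, $q = 257$, plaintext modulus $p = 4$, and ternary secrets and errors, chosen so that $p(\mathbf{e}\mathbf{r} + \mathbf{e}_2 - \mathbf{s}\mathbf{e}_1) + \mathbf{m}$ cannot wrap around modulo $q$. The function names are hypothetical and the parameters offer no security.

```python
import random

def ring_mul(a, b, q):
    """Multiply in Z_q[X]/(X^d + 1): X^d wraps around to -1."""
    d = len(a)
    out = [0] * d
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < d:
                out[i + j] += x * y
            else:
                out[i + j - d] -= x * y
    return [c % q for c in out]

def ring_add(a, b, q):
    return [(x + y) % q for x, y in zip(a, b)]

def small(d):
    """Polynomial with coefficients in {-1, 0, 1}."""
    return [random.randint(-1, 1) for _ in range(d)]

random.seed(3)
d, q, p = 8, 257, 4                 # toy parameters (assumptions)

# Key generation: t = a*s + e.
s, e = small(d), small(d)
a = [random.randrange(q) for _ in range(d)]
t = ring_add(ring_mul(a, s, q), e, q)

# Encryption: u = p(a*r + e1), v = p(t*r + e2) + m.
msg = [random.randrange(p) for _ in range(d)]
r, e1, e2 = small(d), small(d), small(d)
u = [p * c % q for c in ring_add(ring_mul(a, r, q), e1, q)]
v = [(p * c + mi) % q
     for c, mi in zip(ring_add(ring_mul(t, r, q), e2, q), msg)]

# Decryption: center v - u*s modulo q, then reduce modulo p.
w = [(vi - c) % q for vi, c in zip(v, ring_mul(u, s, q))]
w_centered = [c - q if c > (q - 1) // 2 else c for c in w]
recovered = [c % p for c in w_centered]
```

With ternary inputs each coefficient of $\mathbf{e}\mathbf{r} + \mathbf{e}_2 - \mathbf{s}\mathbf{e}_1$ has absolute value at most $2d + 1 = 17$, so $p \cdot 17 + (p - 1) = 71$ stays below $(q-1)/2 = 128$ and `recovered` equals `msg`.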
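To construct a verifiable decryption scheme, let $\mathbf{g} = \mathbf{e}\mathbf{r} + \mathbf{e}_2 - \mathbf{s}\mathbf{e}_1$ from the above equation. Let $\beta$ be a bound on $\mathbf{g}$ such that no reduction modulo $q$ takes place in (10) and so decryption still works (i.e. $\beta$ should be less than approximately $q/p$). Then the decryptor should be able to prove knowledge of $\mathbf{s}, \mathbf{e}, \mathbf{g}, \mathbf{m}$ in the following equation, with the coefficients of $\mathbf{s}, \mathbf{e}$ having the appropriate bounds and $\mathbf{m}$ having all coefficients in $[0, p)$:

$$\begin{bmatrix} \mathbf{a} & 1 & 0 & 0 \\ \mathbf{u} & 0 & p & 1 \end{bmatrix} \begin{bmatrix} \mathbf{s} \\ \mathbf{e} \\ \mathbf{g} \\ \mathbf{m} \end{bmatrix} = \begin{bmatrix} \mathbf{t} \\ \mathbf{v} \end{bmatrix}. \qquad (11)$$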
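Proving the above shows that $\mathbf{m}$ is a valid decryption. To show that there is only one possible decryption (i.e. only one possible solution to the above equation), suppose there exist two solutions:

$$\begin{bmatrix} \mathbf{a} & 1 & 0 & 0 \\ \mathbf{u} & 0 & p & 1 \end{bmatrix} \begin{bmatrix} \mathbf{s} \\ \mathbf{e} \\ \mathbf{g} \\ \mathbf{m} \end{bmatrix} = \begin{bmatrix} \mathbf{a} & 1 & 0 & 0 \\ \mathbf{u} & 0 & p & 1 \end{bmatrix} \begin{bmatrix} \mathbf{s}' \\ \mathbf{e}' \\ \mathbf{g}' \\ \mathbf{m}' \end{bmatrix}. \qquad (12)$$

If $\mathbf{s} \neq \mathbf{s}'$, then the first row of (12) implies a non-zero solution to $\mathbf{a}(\mathbf{s} - \mathbf{s}') + (\mathbf{e} - \mathbf{e}') = 0$.
Writing $\mathbf{a}$ as above can be shown to be impossible either via an information-theoretic argument or via the computational assumption that the Ring-SIS problem [PR06,LM06] is hard. If $\mathbf{s} = \mathbf{s}'$, then the second row of (12) implies that $p(\mathbf{g} - \mathbf{g}') + (\mathbf{m} - \mathbf{m}') = 0$. Since the coefficients are small enough that no reduction modulo $q$ takes place, the preceding implies that $\mathbf{m} - \mathbf{m}'$ is a multiple of $p$, which implies that $\mathbf{m} = \mathbf{m}'$ (since the coefficients of $\mathbf{m} - \mathbf{m}'$ are in the range $(-p, p)$).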

Open Problems
We have shown how linear relations over polynomial rings can have very compact proofs by converting the problem into a form that is compatible with the compact SNARKs in [BCC+16,BBB+17]. While the proofs are small, creating such proofs may require on the order of hundreds of thousands of exponentiations. It would therefore be interesting to see whether one can transform the problem into a form compatible with SNARKs that are less compact but may require fewer operations, such as those in [WTS+18]. Since the latter proofs are particularly tailored to parallelizable functions, they may also result in rather efficient proofs for LWE / SIS ciphertexts, and not require one to work over polynomial rings. We leave this direction as an open problem.

Notation
We use bold letters $\mathbf{f}$ for polynomials, arrows for column vectors as in $\vec{v}$, and capital letters $A$ for matrices. Vectors and matrices of polynomials are denoted by bold letters with arrows $\vec{\mathbf{v}}$ and bold capital letters $\mathbf{M}$, respectively. We write $R = \mathbb{Z}[X]/(\mathbf{f})$ for the ring of integer polynomials modulo a monic irreducible polynomial $\mathbf{f} \in \mathbb{Z}[X]$, $R_q$ for the quotient ring $R/qR$ for some prime $q$, and similarly $\mathbb{Z}_p$ for $\mathbb{Z}/p\mathbb{Z}$.
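Let $\vec{v}_1 \in \mathbb{Z}_p^n$ and $\vec{v}_2 \in \mathbb{Z}_p^n$ be two vectors over $\mathbb{Z}_p$. Then we write $\langle \vec{v}_1, \vec{v}_2 \rangle \in \mathbb{Z}_p$, $\vec{v}_1 \circ \vec{v}_2 \in \mathbb{Z}_p^n$ and $\vec{v}_1 \otimes \vec{v}_2 \in \mathbb{Z}_p^{n^2}$ for their inner product, componentwise product and tensor product, respectively.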
Norms. The absolute value $|a|$ of an element $a \in \mathbb{Z}_q$ is defined to be the absolute value of the centralized representative in $\{-(q-1)/2, \ldots, (q-1)/2\}$. The infinity norm $\|\mathbf{s}\|_\infty$ of a polynomial $\mathbf{s} \in R_q$ is the maximum absolute value of all of its coefficients. Likewise, the infinity norm $\|\vec{\mathbf{s}}\|_\infty$ of a vector of polynomials is the maximum over the infinity norms of its coefficient polynomials.
Multi exponentiations. For a group $G$ of order $p$, written multiplicatively, and vectors $\vec{g} = (g_1, \ldots, g_n)^T \in G^n$ and $\vec{a} = (a_1, \ldots, a_n)^T \in \mathbb{Z}_p^n$, we use the notation $\vec{g}^{\vec{a}} = g_1^{a_1} \cdots g_n^{a_n} \in G$. Throughout the paper the group $G$ will be understood to be cyclic of prime order $p$ with a computationally hard discrete-log problem. A Pedersen multi-commitment over generators $\vec{g} \in G^n$, $u \in G$ to a vector $\vec{v} \in \mathbb{Z}_p^n$ with randomness $\rho \xleftarrow{\$} \mathbb{Z}_p$ is given by the multi-exponentiation $t = \vec{g}^{\vec{v}} u^{\rho}$. This is clearly perfectly hiding, and computationally binding under the assumption that it is hard to compute a non-trivial discrete-log relation between the generators $\vec{g}, u$. The latter problem is easily seen to be equivalent to the discrete-log problem.
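A Pedersen multi-commitment is easy to write down in a toy group. The sketch below uses the order-1019 subgroup of quadratic residues modulo the safe prime 2039. This group and the fixed generators are assumptions chosen purely for readability, and are far too small (and too structured) to be binding in any meaningful sense. It also checks the standard additive homomorphism of such commitments, a well-known property that is not needed for the definition itself.

```python
# Toy group: quadratic residues modulo the safe prime P = 2p + 1.
P, p = 2039, 1019

# Squares of small integers lie in the subgroup of prime order p.
g = [pow(x, 2, P) for x in (2, 3, 5, 7)]
u = pow(11, 2, P)

def commit(v, rho):
    """Pedersen multi-commitment t = g_1^{v_1} ... g_n^{v_n} * u^rho in the toy group."""
    t = pow(u, rho % p, P)
    for gi, vi in zip(g, v):
        t = t * pow(gi, vi % p, P) % P   # negative entries are reduced mod p
    return t

v1, rho1 = [1, -1, 0, 2], 17
v2, rho2 = [0, 3, -2, 1], 400

t1 = commit(v1, rho1)
t2 = commit(v2, rho2)
t_sum = commit([x + y for x, y in zip(v1, v2)], rho1 + rho2)
```

Multiplying `t1` and `t2` modulo `P` gives `t_sum`, i.e. the product of two commitments commits to the sum of the vectors under the sum of the randomness.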
Serializing matrices to vectors. We will need to serialize matrices $A \in \mathbb{Z}_p^{n \times m}$ to vectors. For this reason we define the function

$$\mathrm{Serialize}: \mathbb{Z}_p^{n \times m} \to \mathbb{Z}_p^{nm}, \qquad A \mapsto \vec{a},$$

where $\vec{a}$ contains the coefficients of $A$ in row-major order.
In many programming languages, most notably C, this is how matrices are stored in memory, so that Serialize is a no-op in these languages. We extend Serialize to polynomial matrices over $\mathbb{Z}[X]$ by first expanding each polynomial to its row coefficient vector and then proceeding as before.
Expanding integers to their binary representation. We will also need to map integers to their binary representation, including negative integers. For this we define the function

$$\mathrm{Binary}: \{-2^{b-1}, \ldots, 2^{b-1} - 1\} \to \{0, 1\}^b$$

that maps a signed $b$-bit integer to its binary representation using two's complement. More precisely, $\vec{z} = \mathrm{Binary}(z) = (z_0, \ldots, z_{b-1})^T$ is defined by

$$z = -z_{b-1} 2^{b-1} + \sum_{i=0}^{b-2} z_i 2^i.$$

Again, this representation for signed integers is used by all modern CPUs, so Binary is a no-op there. We extend Binary to vectors, where Binary is applied to each coefficient individually.
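Both helper maps are short enough to state as code. The sketch below uses hypothetical function names and plain Python lists. `serialize` flattens a matrix in row-major order, and `binary` returns the $b$ two's-complement bits of a signed integer, least significant bit first. It relies on Python's right shift on negative integers behaving as an arithmetic shift.

```python
def serialize(A):
    """Flatten a matrix (list of rows) into a vector in row-major order."""
    return [x for row in A for x in row]

def binary(z, b):
    """Two's-complement bits (z_0, ..., z_{b-1}) of a signed b-bit integer z."""
    if not -(1 << (b - 1)) <= z < (1 << (b - 1)):
        raise ValueError("z does not fit into b signed bits")
    return [(z >> i) & 1 for i in range(b)]

def from_binary(bits):
    """Inverse map: z = -z_{b-1} * 2^(b-1) + sum_{i < b-1} z_i * 2^i."""
    b = len(bits)
    return sum(bit << i for i, bit in enumerate(bits[:-1])) - (bits[-1] << (b - 1))

flat = serialize([[1, 2, 3], [4, 5, 6]])
bits_neg3 = binary(-3, 4)
bits_5 = binary(5, 4)
```

A range proof over the bits then amounts to showing that every entry of the expanded vector is 0 or 1, with the top bit carrying the negative weight.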

Forking Lemma
For proving the security of proof systems based on the Bulletproof technique from [BBB+17], one needs a special forking lemma which shows that it is possible to obtain many accepting transcripts from a prover for challenges that are organized in a large tree. The forking lemma used in the Bulletproof paper goes back to [BCC+16, Lemma 1]. It is only stated in terms asymptotic in the security parameter. Moreover, the tree-finding algorithm that is given and analyzed in the proof of that forking lemma does not try to avoid collisions between the challenges. But it is necessary that there are no collisions so that the transcripts can be used for extraction. Therefore, in order to compute the success probability of the tree-finding algorithm, the collision probability has to be taken into account in addition to the failure probability of the prover. For a 256-bit curve, the collision probability gets quite large for moderately sized trees, and as a result the reasoning of the forking lemma only applies to provers whose failure probability $1 - \varepsilon$ is small. Concretely, to obtain a tree of accepting transcripts of height $\mu$ where every inner node has $n$ children, one needs $\varepsilon > n^{\mu}/2^{85}$. For example, in the case of the Bulletproof inner product proof, where $n = 4$ and $\mu = \log l$ with $l$ the length of the vectors, one needs $\varepsilon > l^2/2^{85}$, and the forking lemma only proves the inner product proof to be sound with soundness error $2^{-35}$ if $l = 2^{25}$, a length easily reached in our application. One would need to repeat the proof four times in order to get below $2^{-128}$. We give a different forking lemma with a different extraction algorithm together with a concrete analysis in this section. Our forking lemma achieves negligible soundness error. It is still non-tight though, which is unavoidable as one needs to obtain $n^{\mu} = l^{\log n}$ transcripts.
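The numbers in this example can be re-derived with exact integer arithmetic. The short sketch below only restates the bound $\varepsilon > n^{\mu}/2^{85}$ quoted above for $n = 4$ and $l = 2^{25}$; the variable names are illustrative.

```python
n = 4
l = 2**25
mu = l.bit_length() - 1                        # mu = log2(l) = 25

# n^mu = 4^25 = 2^50 = l^2 transcripts are needed for the tree.
transcripts_log2 = (n**mu).bit_length() - 1    # 50
soundness_log2 = transcripts_log2 - 85         # log2 of n^mu / 2^85

# Smallest number of repetitions r with (2^soundness_log2)^r <= 2^-128.
repetitions = -(-128 // -soundness_log2)
```

The values come out as `soundness_log2 == -35` and `repetitions == 4`, matching the soundness error $2^{-35}$ and the four repetitions mentioned in the text (four repetitions give $2^{-140}$).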
We stress that we do not think that this non-tightness in the security proof allows for any actual attacks on 256-bit curves. Let us start by recalling the definition of a tree of accepting transcripts.
Definition 3.1. Let $P^*$ be a deterministic prover for a $(2\mu+1)$-move interactive proof protocol where the honest verifier $V$ sends $\mu$ challenges in steps $2, 4, \ldots, 2\mu$. An $(n_1, \ldots, n_\mu)$-tree of accepting transcripts associated with $P^*$ is a tree of height $\mu$ of the following form. Every node in level $i$, $0 \le i \le \mu - 1$, has precisely $n_{i+1}$ children, all nodes except the root are labeled by a challenge, and each leaf additionally contains the transcript obtained by interacting with $P^*$ and sending the challenges on the path from the root to this leaf. Moreover, the challenges in all nodes with the same parent are distinct and $V$ accepts all transcripts in the leaves.
Lemma 3.2. Let $P^*$ be a deterministic prover for a $(2\mu + 1)$-move interactive proof protocol where the honest verifier $V$ sends $\mu = \log(l)$ uniformly random challenges from a set $C$ of size $p$ in steps $2, 4, \ldots, 2\mu$. Then there exists an algorithm tree-finder that, when given rewindable black-box access to $P^*$, computes an $(n_1, \ldots, n_\mu)$-tree of accepting transcripts with probability at least $1/4$ in expected time at most for every $\alpha > \left(\frac{1}{1 - n/p}\right)^2$ and with $n = \max_{1 \le i \le \mu - 1} n_i$, under the assumption that $P^*$ convinces $V$ with probability $\varepsilon \ge \frac{\alpha^{\mu}}{\alpha - 1} \cdot \frac{n\mu}{p} = \frac{l^{\log \alpha}}{\alpha - 1} \cdot \frac{n\mu}{p}$. Running $P^*$ once is assumed to take unit time.
Proof. We construct tree-finder = tree-finder(1) as a recursive algorithm with tree-finder($i$), $i = 1, \ldots, \mu$, interacting with $P^*$ from the $2i$-th move onward. A naive first approach would be as follows. For $i < \mu$, tree-finder($i$) would run $P^*$ up to and including move $2i + 1$, sending a uniformly random challenge $c_i \in C$ in step $2i$. Then the algorithm would call tree-finder($i + 1$). Afterwards it would rewind $P^*$ back to just after step $2(i - 1) + 1$ and repeat the process for a total of $n_i$ different challenges. So in the second iteration tree-finder($i$) would sample a uniform challenge from $C \setminus \{c_i\}$. The tree-finding algorithm tree-finder($\mu$) in the last level would send a last challenge $c_\mu$ and check whether the interaction with $P^*$ led to a valid proof, i.e. whether $V$ would accept the proof. Then it would repeat for as many last challenges $c_\mu$ as needed to get $n_\mu$ valid proofs for $n_\mu$ different $c_\mu$. The problem with this approach is that in any level, for many challenges $c_i$ there might only be very few continuations $c_{i+1}, \ldots, c_\mu$ that lead to valid proofs (or none at all). Hence the tree-finding algorithm might run into dead ends where tree-finder($\mu$) runs for a very long time or does not terminate at all.
For fixed challenges $c_1, \dots, c_{i-1}$, let $\varepsilon_i$ be the acceptance probability over all uniform continuations $c_i, \dots, c_\mu$. In particular $\varepsilon_1 = \varepsilon$. Then for some $c_i$ let $\varepsilon_{i+1} = \varepsilon_{i+1}(c_i)$ be the acceptance probability under the additional condition that the $i$-th challenge is $c_i$. Now from a standard heavy rows / averaging argument we know $\varepsilon_{i+1} \ge \varepsilon_i/\alpha$, $\alpha > 1$, for at least a fraction of $1 - 1/\alpha$ of the $c_i$. Therefore our solution to the problem is as follows. After choosing $c_i$, tree-finder($i$) estimates $\varepsilon_{i+1}$ by running $P^*$ until the end for many continuations $c_{i+1}, \dots, c_\mu$ and counting the number of valid proofs. Then the tree-finding algorithm only continues with $c_i$ if the acceptance probability does not decrease too much by fixing $c_i$. The complete algorithm is as follows, where $1 < \lambda < \sqrt{\alpha}$ and $T_i$ are specified later.

[Algorithm tree-finder($i$): the listing is not recoverable; its line 2 reads "Initialize tree as a tree containing only an empty root".]

We analyze the algorithm under the assumption $\varepsilon_i \ge \varepsilon/\alpha^{i-1}$. The challenge $c_i$ is chosen and the acceptance probability $\varepsilon_{i+1} = \varepsilon_{i+1}(c_i)$ is estimated during the loop in lines 12-25. We define the following probabilities in one iteration of the loop. […]
So $p_0$, $p_1$ and $p_2$ are the probabilities of continuing the loop, choosing a "good" challenge $c_i$, and choosing a "bad" challenge, respectively. Note that $p_0 + p_1 + p_2 = 1$. By the heavy rows argument, with probability at least $1 - 1/\sqrt{\alpha} - n/p$ we have $\varepsilon_{i+1}(c_i) \ge \varepsilon/(\sqrt{\alpha} \cdot \alpha^{i-1})$. Therefore, and by the Chernoff bound, […]. On the other hand we find for $p_2$, […], where we have set $\delta > 0$ such that $(1+\delta)\varepsilon_{i+1} = \lambda\varepsilon/\alpha^i$, i.e. $\delta = \frac{\lambda\varepsilon}{\alpha^i \varepsilon_{i+1}} - 1$. We want to bound $\min(\delta, \delta^2)\,\varepsilon_{i+1}$ from below. Notice that […] is strictly decreasing on the interval $\varepsilon_{i+1} \in [0, \varepsilon/\alpha^i[$. Hence, […]. Moreover, $\delta\varepsilon_{i+1} > (\lambda - 1)\frac{\varepsilon}{\alpha^i}$ and therefore […]. We set $\lambda$ such that the arguments of the exponential function in $p_1$ and $p_2$ are equal; that is, […]. Then […].

With these probabilities we now calculate the probability that the loop ends with a bad $c_i$. It is given by […]. The probability that the first-level tree-finder(1) chooses $n_1$ good challenges $c_1$ is $(1 - p_{\mathrm{bad}})^{n_1}$. Under this condition our assumption $\varepsilon_2 \ge \varepsilon/\alpha$ is true for the second-level tree finders and they all choose only good challenges with probability $(1 - p_{\mathrm{bad}})^{n_1 n_2}$. Write
$$N = \sum_{i=1}^{\mu-1} (n_1 \cdots n_i) \le \sum_{i=1}^{\mu-1} n^i = \frac{n^\mu - n}{n - 1} < n^\mu = (2^{\log n})^\mu = (2^\mu)^{\log n} = l^{\log n}$$
for $n = \max_{1 \le i \le \mu-1} n_i$. We see that with probability $(1 - p_{\mathrm{bad}})^N$ only good challenges are chosen in the whole execution of the tree-finding algorithm and the assumption is true for all invocations of tree-finder($i$). Now, by the Bernoulli inequality, […], which is bigger than $1/2$ if $p_2 \le (1 - 1/\sqrt{\alpha} - n/p)/(2N)$, which in turn is implied by […].

The expected number of iterations of the loop in lines 12-25 under the condition that a good $c_i$ is chosen is […], and each iteration takes time $T_i + 1$. So with probability at least $1/2$ the conditioned expected runtime of the whole tree-finding algorithm is at most […]. Here we have used $\varepsilon \ge \alpha^\mu n_\mu/((\alpha - 1)p)$, which implies $\varepsilon/\alpha^{\mu-1} - n_\mu/p \ge \varepsilon/\alpha^\mu = \varepsilon/l^{\log \alpha}$. When we are not so lucky and some bad challenges are chosen, the algorithm might run for a long time, but we just limit the runtime to $2t$, where $t$ is the bound on the conditioned expected runtime. Then the probability of obtaining a full $n$-tree of accepting transcripts is at least $\frac{1}{2}\left(1 - \frac{1}{2}\right) = \frac{1}{4}$, since the probability that an algorithm with expected runtime $t$ runs longer than $2t$ is at most $1/2$. Notice that in expected time $8t$ we can obtain an $n$-tree of accepting transcripts.
Example. The implied constant in the big-O statement for the runtime of the extractor is readily computed from the formulas in the proof of Lemma 3.2. For example, in the case where $p \approx 2^{256}$, $n = 4$, $l = 2^{25}$ and $\alpha = 1.3$, one finds that $\lambda \approx 1.075$ and the implied constant is about 1564.

Zero-Knowledge Inner Product Proof
In an inner product proof there is a commitment $t = \vec g^{\vec v_1} \vec h^{\vec v_2} u^\rho$ to two vectors whose inner product $x = \langle \vec v_1, \vec v_2 \rangle$ is publicly known. The goal is to prove knowledge of an opening to $t$ that really fulfills this inner product relation. In this section we give a variant of the Bulletproof inner product proof which differs in that it is zero-knowledge. In the original protocol, after folding the vectors down to 1-dimensional elements, the prover reveals the opening to the commitment. The main difference of the modified protocol from this section is that instead of revealing the opening it uses a Schnorr-type proof to prove knowledge of an opening in zero-knowledge, in a way that also proves the necessary product relation. With a zero-knowledge inner product proof at hand we can significantly simplify our main protocol compared to the similar Bulletproof range proof from [BBB+17]. For example, our proof has only three rounds compared to the five rounds of the range proof. The advantage stems from the fact that the secret vectors do not have to be blinded, which is the reason for much of the complication in the Bulletproof range proof. We write $\Pi_{\langle\cdot,\cdot\rangle}(\cdot\,;\cdot)$ for our inner product proof protocol, which is detailed in Figure 1.
The length $l$ of the secret vectors $\vec v_1, \vec v_2$ is assumed to be a power of two. In the main protocol from Section 5 we need an inner product proof for vectors of arbitrary length, but it is trivial to achieve this by just padding the vectors with zeros. If $t = \vec g^{\vec v_1} \vec h^{\vec v_2} u^\rho$ is a commitment to two vectors of length $l$ which is not a power of two, we can just interpret this as a commitment to vectors of length $2^{\lceil \log l \rceil}$ over more generators $\vec g', \vec h'$. Notice that the inner product of the padded vectors stays the same.
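The padding step can be illustrated with a minimal Python sketch. The helper names `pad_to_power_of_two` and `inner_product` and the toy modulus are assumptions made for illustration only; they are not part of the protocol description.

```python
def pad_to_power_of_two(vec):
    """Append zeros until the length is the next power of two."""
    target = 1
    while target < len(vec):
        target *= 2
    return vec + [0] * (target - len(vec))


def inner_product(a, b, p):
    """Inner product of two equally long vectors over Z_p."""
    return sum(x * y for x, y in zip(a, b)) % p


p = 101  # toy modulus (assumption); the protocol uses a large prime group order
v1 = [3, 1, 4, 1, 5]
v2 = [9, 2, 6, 5, 3]
padded1 = pad_to_power_of_two(v1)
padded2 = pad_to_power_of_two(v2)

# The zero entries contribute nothing, so the inner product is unchanged.
same = inner_product(v1, v2, p) == inner_product(padded1, padded2, p)
```

On the commitment side the padded positions correspond to additional generators raised to the exponent zero, so the commitment value itself does not change either.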
Theorem 4.1. The protocol given in Figures 1 and 2 is complete, perfectly honest verifier zero-knowledge and generalized special sound under the discrete-log assumption. So there is an extractor $E$ that, when given rewindable black-box access to a deterministic prover $P^*$, either outputs an opening $\vec v_1^*, \vec v_2^* \in \mathbb{Z}_p^l$, $\rho^* \in \mathbb{Z}_p$ of $t$, i.e. $t = \vec g^{\vec v_1^*} \vec h^{\vec v_2^*} u^{\rho^*}$, such that $x = \langle \vec v_1^*, \vec v_2^* \rangle$, or a non-trivial discrete-log relation between $\vec g, \vec h, u$ and two auxiliary generators $e, f \in G$. The extractor $E$ runs in expected time at most $O(l^{2+\log \alpha} \log l/\varepsilon)$ for some $\alpha > 1$, for example $\alpha = 1.3$, when $P^*$ has acceptance probability $\varepsilon \ge 10\,\frac{\alpha}{\alpha-1}\, l^{\log \alpha}/p$. Running $P^*$ once is assumed to take unit time.
Proof. The subprotocol without the first move is a $(2\mu + 1)$-move protocol for $\mu = \log(l) + 1$, which fulfills the prerequisites of the forking lemma given in Lemma 3.2. After sending a uniformly random generator $a = e^b$ of the group $G$ for a uniform $b \in \mathbb{Z}_p$, the extractor $E$ can thus use tree-finder to obtain a $(4, \dots, 4, 5)$-tree of accepting transcripts of this subprotocol. More precisely, with probability at least $1/2$ over the choice of $a$, the verifier $V$ will accept with probability at least $\varepsilon/2 \ge \frac{\alpha^\mu}{\alpha-1}\, n_\mu/p$. Therefore tree-finder will be successful with probability at least $1/8$. If it is not successful, $E$ restarts.
[Fig. 1. Zero-knowledge inner product Bulletproof $\Pi_{\langle\cdot,\cdot\rangle}(\cdot\,;\cdot)$ between prover $P$ and verifier $V$. It proves knowledge of an opening to a Pedersen commitment $t = \vec g^{\vec v_1} \vec h^{\vec v_2} u^\rho$ such that the vectors $\vec v_1$ and $\vec v_2$ fulfill an inner product relation $\langle \vec v_1, \vec v_2 \rangle = x$. The figure body is not recoverable beyond the fact that the parties invoke the folding protocol of Fig. 2.]

[Fig. 2. Bulletproof folding protocol folding($\vec g, \vec h, a, u, t; \vec v_1, \vec v_2, \rho$). This reduces a Pedersen multi-commitment of the form $t = \vec g^{\vec v_1} \vec h^{\vec v_2} a^{\langle \vec v_1, \vec v_2 \rangle} u^\rho$ to a new commitment $t' = g^{v_1} h^{v_2} a^{v_1 v_2} u^{\rho'}$ with the same (inner) product structure but in dimension 1. Furthermore, given an opening for $t'$ having the correct inner product structure, one can extract an opening for $t$ that also has the inner product structure by using the extractor from the forking lemma (Lemma 3.2). Recoverable step of the figure body: in each level both parties compute $t' = t_{-1}^{c^{-1}}\, t\, t_1^{c}$ and recurse on the folded generators and vectors.]

Consider the 5 accepting transcripts from neighboring leaves with the same parent node. Only the last challenges differ in the transcripts and we have the 5 verification equations […] (13) for $i = 1, \dots, 5$ with distinct $c_i \in \mathbb{Z}_p$. Let $(\lambda_1, \lambda_2, \lambda_3)^T \in \mathbb{Z}_p^3$ be the solution of the linear system […]. It exists because it is well known that the determinant of this Vandermonde matrix is equal to $-(c_1 c_2 c_3)^{-1}(c_1 - c_2)(c_1 - c_3)(c_2 - c_3) \ne 0$. Now raise the first 3 equations in (13), for $i = 1, 2, 3$, to the powers $\lambda_i$ and multiply them. This gives […]. In the same manner we can extract openings for $w$ and $w'$, […]. With these openings to $t'$, $w$ and $w'$ we can reconstruct the equations in (13) and get […]. By comparing exponents we either find a non-trivial discrete-log relation between $g, h, a, u$, which gives a relation between $\vec g, \vec h, u, e$ since $E$ knows expressions of $g, h, a, u$ as powers of $\vec g, \vec h, u, e$, or we have […].

Multiplying this equation by $c_i^3$ yields a polynomial of degree 4 which has five roots $c_i$. Hence it must be the zero polynomial and from the leading coefficient we get […]. The extractor performs this process for all parents in the second-to-last level $\mu - 1 = \log(l)$ of the tree of accepting transcripts. Then, with the same techniques and as is detailed in [BBB+17], the extractor can invert all the $\log(l)$ folding steps and either compute a non-trivial discrete-log relation or an opening such that $x^* = \langle \vec v_1^*, \vec v_2^* \rangle$. If $x^* = x$ then $E$ has an opening of $t$ as stated in the theorem. If not, $E$ starts over from scratch but samples a challenge generator $a = f^b$ with uniform $b \in \mathbb{Z}_p$ for the first move. By this $E$ obtains an opening and can compute […], which is a non-trivial discrete-log relation. Not taking into account the simple arithmetic over $\mathbb{Z}_p$, the expected running time of $E$ is at most 16 times the expected running time of tree-finder.
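The determinant identity used in the proof can be checked numerically. The following Python sketch assumes the matrix has rows $(c_i^{-1}, 1, c_i)$, which is the shape consistent with the stated determinant; the function names and the toy prime are illustrative assumptions.

```python
import random

P = 2**61 - 1  # a Mersenne prime, used here as a toy stand-in for the group order


def det3(m, p):
    """Determinant of a 3x3 matrix over Z_p by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % p


def vandermonde_like(cs, p):
    """Rows (c^-1, 1, c) for each challenge c (assumed shape of the matrix)."""
    return [[pow(c, -1, p), 1, c % p] for c in cs]


def closed_form(cs, p):
    """-(c1 c2 c3)^-1 (c1 - c2)(c1 - c3)(c2 - c3) mod p, as stated in the proof."""
    c1, c2, c3 = cs
    inv = pow(c1 * c2 * c3, -1, p)
    return (-inv * (c1 - c2) * (c1 - c3) * (c2 - c3)) % p


# Three distinct non-zero challenges.
cs = random.sample(range(1, P), 3)
det = det3(vandermonde_like(cs, P), P)
matches = det == closed_form(cs, P)
invertible = det != 0
```

Since the determinant is non-zero for distinct non-zero challenges, the linear system for $(\lambda_1, \lambda_2, \lambda_3)$ always has a unique solution. Transposing the matrix does not change the determinant, so the check does not depend on whether the challenges index rows or columns.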
We turn to the zero-knowledge property. The first message by the verifier containing the generator $a$ and all the messages in the folding protocol are independently uniformly random. This is because all the cross-terms $t_{-1}, t_1$ are independently blinded with independently random factors $u^{\sigma_{-1}}$ and $u^{\sigma_1}$. So the simulator can just choose $a \xleftarrow{\$} G$ and all messages in the folding protocol uniformly at random. From these messages the honest verifier computes the generators $g, h$ and the commitment $t'$. Now it remains to simulate the Schnorr-type protocol at the end for proving knowledge of an opening of $t'$ that obeys the product relation. This is made possible by how we set up the verification equation. The simulator first samples $c \xleftarrow{\$} \mathbb{Z}_p$, and then $z_1, z_2 \xleftarrow{\$} \mathbb{Z}_p$, which are independent of the previously chosen messages because of $y_1$ and $y_2$, respectively. Then he chooses $w \xleftarrow{\$} G$, which is independent because of the blinding factor $u^\sigma$. Last, the simulator samples $\tau \xleftarrow{\$} \mathbb{Z}_p$, which is still uniformly random because of $\sigma$. Now $w \in G$ is not independent anymore but instead fully determined by the previous choices, and the simulator can compute it correctly as […], which clearly makes the verification equation true.
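The simulation strategy, sampling the challenge and the responses first and then solving the verification equation for the remaining group element, is the same one used for the plain Schnorr protocol. The sketch below shows it for ordinary Schnorr proofs of knowledge of a discrete logarithm, not for the protocol of Figure 1, whose verification equation is more involved. The toy group (squares modulo the safe prime 2039) and all names are assumptions for illustration.

```python
import random

P = 2039      # toy safe prime, P = 2 * Q + 1 (assumption, far too small for real use)
Q = 1019      # prime order of the subgroup of squares modulo P
G = 4         # 4 = 2^2 is a square different from 1, so it generates that subgroup


def verify(y, w, c, z):
    """Schnorr verification equation g^z = w * y^c."""
    return pow(G, z, P) == (w * pow(y, c, P)) % P


def simulate(y):
    """Honest-verifier simulator: pick c and z first, then solve for w."""
    c = random.randrange(Q)
    z = random.randrange(Q)
    # w is fully determined by c and z; a negative exponent with a modulus
    # computes a modular inverse (Python 3.8+).
    w = (pow(G, z, P) * pow(y, -c, P)) % P
    return w, c, z


secret = random.randrange(1, Q)
y = pow(G, secret, P)

# The simulated transcript verifies although the simulator never used `secret`.
w, c, z = simulate(y)
accepted = verify(y, w, c, z)
```

In an honest execution $z$ is uniform because of the prover's random mask, so the simulated triple has the same distribution as a real transcript.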

The Main Protocol
In this section we present in detail our protocol to prove knowledge of a matrix $S \in R_q^{m \times k}$ consisting of short polynomials of infinity norm less than $B$ such that
$$AS = T, \qquad (14)$$
where $A \in R_q^{n \times m}$ and $T \in R_q^{n \times k}$ are public. First, when $A, S, T$ are lifted to matrices over $\mathbb{Z}[X]$, the equation is true modulo $q$ and $f$. So there are matrices $R_1, R_2$ over $\mathbb{Z}[X]$ such that
$$AS + qR_1 + fR_2 = T. \qquad (15)$$
More precisely, notice that $T - AS \in (\mathbb{Z}[X])^{n \times k}$ consists of polynomials of degree at most $2(d - 1)$ and infinity norm less than $mdBq/2$ when we use central representatives for coefficients in $\mathbb{Z}_q$. Moreover, $T - AS$ is a multiple of $f$ modulo $q$. So we can exactly divide $T - AS$ by $f$ over $\mathbb{Z}_q[X]$ to obtain $R_2$ with polynomials of degree at most $d - 2$ and coefficients in $\{-(q - 1)/2, \dots, (q - 1)/2\}$. Then, dividing $T - AS - fR_2$ by $q$ yields $R_1$ with polynomials of degree at most $2(d - 1)$ and infinity norm less than $(mdB + d\|f\|_\infty)/2$. Next, for a prime $p$ we have
$$AS + qR_1 + fR_2 \equiv T \pmod{p}, \qquad (16)$$
and then for an $\alpha \in \mathbb{Z}_p$ the equation
$$A(\alpha)S(\alpha) + qR_1(\alpha) + f(\alpha)R_2(\alpha) \equiv T(\alpha) \pmod{p}. \qquad (17)$$
Conversely, by the Schwartz-Zippel lemma, if Equation (17) is true for a uniformly random $\alpha$, then Equation (16) holds with probability at least $1 - 2(d - 1)/p$. In this case, if $p \ge 2(mdB + d\|f\|_\infty)q$, Equation (15) is true since no reduction modulo $p$ takes place, and Equation (14) follows. So in order to prove knowledge of a matrix $S \in S_B^{m \times k}$ as in Equation (14), it suffices to prove knowledge of matrices $S$, $R_1$ and $R_2$ of integer polynomials whose coefficients have absolute value less than $B$, $B_1 = (mdB + d\|f\|_\infty)/2$ and $B_2 = q/2$, respectively, such that Equation (17) is true for a uniformly random $\alpha$.
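The lifting described above can be traced on a toy instance. The following Python sketch treats the scalar case $n = m = k = 1$ with $f = X^8 + 1$ and $q = 97$; these parameters and all function names are assumptions chosen for illustration. It computes $R_2$ by dividing $T - AS$ by $f$ modulo $q$, then $R_1$ by dividing $T - AS - fR_2$ by $q$, and finally checks the relation at a random evaluation point modulo $p$.

```python
import random


def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_sub(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


def div_monic(num, f):
    """Quotient and remainder of num divided by the monic polynomial f over Z."""
    num, d = list(num), len(f) - 1
    quo = [0] * max(len(num) - d, 1)
    for i in range(len(num) - 1, d - 1, -1):
        c = num[i]
        quo[i - d] = c
        for j, fj in enumerate(f):
            num[i - d + j] -= c * fj
    return quo, num[:d]


def center(x, q):
    """Central representative of x modulo an odd q."""
    r = x % q
    return r - q if r > q // 2 else r


def evaluate(a, x, p):
    acc = 0
    for c in reversed(a):
        acc = (acc * x + c) % p
    return acc


def lift(A, S, f, q):
    """Return T = A*S in R_q and integer polynomials R1, R2 with T - A*S = q*R1 + f*R2."""
    AS = poly_mul(A, S)
    T = [center(c, q) for c in div_monic(AS, f)[1]]
    D = poly_sub(T, AS)
    quo, rem = div_monic(D, f)
    assert all(c % q == 0 for c in rem)      # T - AS is a multiple of f modulo q
    R2 = [center(c, q) for c in quo]
    E = poly_sub(D, poly_mul(f, R2))
    assert all(c % q == 0 for c in E)        # exact division by q
    return T, [c // q for c in E], R2


random.seed(1)
d, q, B, p = 8, 97, 4, 2**61 - 1
f = [1] + [0] * (d - 1) + [1]                # X^d + 1
A = [random.randint(-(q - 1) // 2, (q - 1) // 2) for _ in range(d)]
S = [random.randint(-(B - 1), B - 1) for _ in range(d)]
T, R1, R2 = lift(A, S, f, q)

alpha = random.randrange(p)
lhs = (evaluate(A, alpha, p) * evaluate(S, alpha, p)
       + q * evaluate(R1, alpha, p)
       + evaluate(f, alpha, p) * evaluate(R2, alpha, p)) % p
holds = lhs == evaluate(T, alpha, p)
```

Because the relation holds over $\mathbb{Z}[X]$ by construction, it holds at every evaluation point; the Schwartz-Zippel argument in the text covers the converse direction for a cheating prover.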
We describe our strategy for conducting such a proof. If we expand all polynomials in the secret matrices $S, R_1, R_2$ to their coefficient row vectors of dimensions $d$, $2d - 1$ and $d - 1$, respectively, and hence consider the matrices as integer matrices $S, R_1, R_2$, then, with $\vec\alpha_d = (1, \alpha, \dots, \alpha^{d-1})^T$, we can equivalently write […] (18). Now a natural strategy would be to produce a Pedersen multi-commitment over a group of order $p$ to the secret matrices $S, R_1, R_2$. Then one could prove that the matrices fulfill Equation (18) by reducing them to integers using on the order of $\log(mkd)$ Bulletproof folding steps. In addition one would also need to give a range proof that the coefficients of the matrices are sufficiently small. For increased efficiency we combine these proofs into one single proof.

The usual method for range proofs consists of expressing the coefficients by their binary representations so that the range follows from the number of bits used per coefficient. The proof that this representation really only contains bits in $\{0, 1\}$ is most easily done via an inner product proof as in [BBB+17]. Therefore we want to reduce Equation (18) to an inner product equation which can then be integrated into the range proof. To this end we first multiply from both sides by uniformly random vectors $\vec\beta \in \mathbb{Z}_p^k$ and $\vec\gamma \in \mathbb{Z}_p^n$, so that […]. This equation implies Equation (18) with probability at least $1 - 2/p$. Next we serialize the secret matrices to column vectors $\vec s \in \mathbb{Z}^{mkd}$, $\vec r_1 \in \mathbb{Z}^{nk(2d-1)}$ and $\vec r_2 \in \mathbb{Z}^{nk(d-1)}$ in row-major order. With these the last equation is equivalent to the inner product equation […]. Finally, we expand each secret vector one more time and replace the coefficients by their binary representation, using two's complement for negative numbers. We get an inner product equation of the form $\langle \vec v, \vec s_1 \rangle = \vec\gamma^T T(\alpha) \vec\beta$ for a public vector $\vec v$ […]. It remains to prove that the secret vector $\vec s_1$ only contains coefficients in $\{0, 1\}$.
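As usual this is done by proving that there is a second vector $\vec s_2$, the vector with all bits flipped, such that $\vec s_1 \circ \vec s_2 = \vec 0$ and $\vec s_1 + \vec s_2 = \vec 1$. The first property holds with probability at least $1 - 1/p$ if $\langle \vec\varphi, \vec s_1 \circ \vec s_2 \rangle = \langle \vec\varphi \circ \vec s_2, \vec s_1 \rangle = 0$ for a uniformly random vector $\vec\varphi$. Similarly, the second property follows with overwhelming probability from $\langle \vec\varphi, \vec s_1 + \vec s_2 \rangle = \langle \vec\varphi, \vec s_1 \rangle + \langle \vec\varphi \circ \vec s_2, \vec 1 \rangle = \langle \vec\varphi, \vec 1 \rangle$. We incorporate both inner product equations into Equation (19) and arrive at
$$\langle \vec v + \vec\varphi \circ \vec s_2 + \psi\vec\varphi,\; \vec s_1 + \psi\vec 1 \rangle = \vec\gamma^T T(\alpha) \vec\beta + \psi \langle \vec v, \vec 1 \rangle + (\psi + \psi^2) \langle \vec\varphi, \vec 1 \rangle,$$
where $\psi \in \mathbb{Z}_p$ is another uniformly random field element with the purpose of separating the three inner product equations.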
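The combined equation can be sanity-checked numerically. In the Python sketch below the value $\langle \vec v, \vec s_1 \rangle$ stands in for $\vec\gamma^T T(\alpha) \vec\beta$, since for an honest prover the two agree; the function names, the toy prime and the vector length are illustrative assumptions.

```python
import random

P = 2**61 - 1  # toy prime standing in for the group order p


def inner(a, b):
    return sum(x * y for x, y in zip(a, b)) % P


def combined_equation_holds(v, s1, s2, phi, psi):
    """Check <v + phi∘s2 + psi·phi, s1 + psi·1> = c + psi<v,1> + (psi + psi^2)<phi,1>."""
    c = inner(v, s1)                       # stands in for gamma^T T(alpha) beta
    ones = [1] * len(v)
    v1 = [(vi + ph * s + psi * ph) % P for vi, ph, s in zip(v, phi, s2)]
    v2 = [(s + psi) % P for s in s1]
    x = (c + psi * inner(v, ones) + (psi + psi * psi) * inner(phi, ones)) % P
    return inner(v1, v2) == x


length = 16
v = [random.randrange(P) for _ in range(length)]
phi = [random.randrange(1, P) for _ in range(length)]
psi = random.randrange(1, P)

# Honest case: s1 is a bit vector and s2 is its complement.
s1 = [random.randint(0, 1) for _ in range(length)]
s2 = [1 - b for b in s1]
honest_ok = combined_equation_holds(v, s1, s2, phi, psi)

# Cheating case: first entry is 2 with "complement" -1, so s1 + s2 = 1 but s1∘s2 != 0.
bad_s1 = [2] + s1[1:]
bad_s2 = [P - 1] + s2[1:]
cheat_ok = combined_equation_holds(v, bad_s1, bad_s2, phi, psi)
```

In the cheating case the two sides differ by $-2\varphi_0$, which is non-zero here because the entries of `phi` are sampled from the non-zero elements.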
When given a Pedersen multi-commitment to the vectors $\vec s_2$ and $\vec s_1$ it is easy to compute a commitment to $\vec v_1 = \vec v + \vec\varphi \circ \vec s_2 + \psi\vec\varphi$ and $\vec v_2 = \vec s_1 + \psi\vec 1$. It might be unclear at first how to multiply $\vec s_2$ componentwise with $\vec\varphi$ inside the multi-commitment, which means each coefficient has to be multiplied by a different value. There is a standard trick to do this. Suppose $\vec g \in G^l$ is the vector of generators underlying $\vec s_2$. Then we just reinterpret this part of the commitment as a commitment over generators $\vec g' = \vec g^{\vec\varphi^{-1}}$. Since $\vec g^{\vec s_2} = (\vec g^{\vec\varphi^{-1}})^{\vec\varphi \circ \vec s_2} = (\vec g')^{\vec\varphi \circ \vec s_2}$, our original commitment containing $\vec s_2$ over $\vec g$ thus becomes a commitment containing $\vec\varphi \circ \vec s_2$ over $\vec g'$. Now given the commitment to $\vec v_1$ and $\vec v_2$ we prove that the inner product of these vectors of dimension $l = mkdb + nk(2d - 1)b_1 + nk(d - 1)b_2$ is equal to $x = \vec\gamma^T T(\alpha) \vec\beta + \psi \langle \vec v, \vec 1 \rangle + (\psi + \psi^2) \langle \vec\varphi, \vec 1 \rangle$. It follows with overwhelming probability that $\vec s_1$ gives rise to a matrix $S \in R_q^{m \times k}$ of short polynomials such that $AS = T$ over $R_q$. For the inner product proof we make use of Bulletproofs, which have communication cost logarithmic in $l$. But in contrast to the range proof in [BBB+17], we do not blind the vectors and instead use a variant of the Bulletproof inner product proof that is zero-knowledge. Here one first reduces the vectors to dimension 1 and then uses a zero-knowledge Schnorr-type proof for the one-dimensional base case. See Figure 3 for the complete protocol and Theorem 5.1 for its security. We state the zero-knowledge inner product Bulletproof in Figure 1.
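The generator reinterpretation trick can be checked in a toy group. The sketch below uses the subgroup of squares modulo the safe prime 2039, which has prime order 1019; the group, the generators, the challenge values and the helper name `commit` are assumptions for illustration, and blinding factors are omitted.

```python
P = 2039        # toy safe prime, P = 2 * ORDER + 1 (assumption, not a secure size)
ORDER = 1019    # prime order of the subgroup of squares modulo P


def commit(gens, exps):
    """Multi-exponentiation prod g_i^{e_i} in the toy group (no blinding factor)."""
    acc = 1
    for g, e in zip(gens, exps):
        acc = acc * pow(g, e % ORDER, P) % P
    return acc


# Generators of the order-1019 subgroup: squares different from 1.
gens = [pow(r, 2, P) for r in (2, 3, 5, 7)]
s2 = [1, 0, 1, 1]                      # secret bit vector
phi = [17, 230, 5, 999]                # non-zero challenge entries modulo ORDER

# Reinterpreted generators g'_i = g_i^(phi_i^-1); pow(x, -1, m) needs Python 3.8+.
new_gens = [pow(g, pow(ph, -1, ORDER), P) for g, ph in zip(gens, phi)]
scaled = [ph * s % ORDER for ph, s in zip(phi, s2)]

# The same group element is a commitment to s2 over gens
# and a commitment to phi∘s2 over new_gens.
same = commit(gens, s2) == commit(new_gens, scaled)
```

Both parties can compute the new generators from public data, so the prover never has to send a fresh commitment to $\vec\varphi \circ \vec s_2$.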
Theorem 5.1. If $p \ge 2(mdB + d\|f\|_\infty)q$, then the protocol in Figure 3 is complete, perfectly honest verifier zero-knowledge and generalized special sound under the discrete-log assumption in the sense that there is an extractor $E$ with the following properties. When given rewindable black-box access to a deterministic prover $P^*$ that convinces the honest verifier with probability $\varepsilon \ge 100\,l/p$, $E$ either outputs a solution $S^* \in R_q^{m \times k}$ to $AS^* = T$, which consists of polynomials whose coefficients fit in $b = \log(B) + 1$ bits, or a non-trivial discrete-log relation between generators of the group $G$. The extractor $E$ runs in expected time at most $O(l^{2.4} \log l/\varepsilon)$. Running $P^*$ once is assumed to take unit time.
Proof. Completeness is clear from the discussion at the beginning of Section 5 and the zero-knowledge property follows immediately from the fact that the inner product proof is honest verifier zero-knowledge; see Theorem 4.1. Let us now prove soundness. The extractor $E$ runs $P^*$, sends uniformly random challenges in the second move and then uses the extractor for the inner product proof, assuming acceptance probability $\varepsilon/2$, to get an opening for $t$; cf. Theorem 4.1. From an averaging argument we know that for at least half of the challenges in the second move the inner product proof $\pi$ is valid with probability at least $\varepsilon/2$. Then, since $\varepsilon/2 > 10\alpha\, l^{\log \alpha}/((\alpha - 1)p)$ for $\alpha \ge 1.3$, the conditions of Theorem 4.1 are met. So after an expected number of 2 trials we can assume that $E$ either has a non-trivial discrete-log relation or an opening $\vec v_1^*, \vec v_2^*, \rho^*$ of $t$, i.e. […].

The last equation is equivalent to […], which can be interpreted as a multivariate polynomial $P$ over $\mathbb{Z}_p$ in $n + k + l + 2$ variables that evaluates to zero at $(\alpha, \vec\beta, \vec\gamma, \vec\varphi, \psi)$. If the polynomial is the zero polynomial it follows that $\vec s_1^* \circ \vec s_2^* = \vec 0$ and $\vec s_1^* + \vec s_2^* = \vec 1$ […].

The coefficient of $X^\nu$ of the polynomial in the $(i, j)$-th entry of the matrix in the middle corresponds to the coefficient of $\alpha^\nu \beta_j \gamma_i$ of our multivariate polynomial $P$ that we assume to be zero. So […], but from our assumption on $p$ this equation is even true over $\mathbb{Z}[X]$ and we finally get $AS^* = T$ over $R_q$.

It remains to consider the case where $P \ne 0$. Note that in this case the polynomial is of total degree at most $2d$. Consequently, it can evaluate to zero at no more than $2d\,p^{n+k+l+1}$ points in $\mathbb{Z}_p^{n+k+l+2}$ (this is just a counting version of the Schwartz-Zippel lemma). Now the extractor $E$ reruns $P^*$ but sends a uniform challenge $(\alpha, \vec\beta, \vec\gamma, \vec\varphi, \psi) \in \mathbb{Z}_p^{n+k+l+2}$ from the set of non-roots of $P$. Then $E$ again tries to extract from the inner product proof and continues in this fashion until he is successful for a second time. For at least a fraction of $\frac{1}{2} - \frac{2d}{p}$ of the non-roots, the inner product proof is accepted with probability at least $\varepsilon/2$. So after an expected number of roughly 2 trials $E$ will get a non-trivial discrete-log relation or a new multivariate polynomial $P'$ that vanishes at a point outside the small set of roots of our original polynomial $P$, so that $P'$ must be different from $P$. But then, since $P$ and $P'$ are in one-to-one correspondence with openings of the commitment $t$, we must have two different openings and can compute a non-trivial discrete-log relation. We see that the total expected runtime of $E$ is at most 4 times the expected runtime of the extractor of the inner product proof.

Proof size
The communication size of our protocol from Figure 3 is very small. Instead of all the individual challenges in the second move, the verifier can just send a short seed that is expanded to the challenges with the help of an XOF. Moreover, in the non-interactive version of the protocol via the Fiat-Shamir transform, the challenges are expanded from public information and the first message. So such a non-interactive proof only consists of the first message and the inner product proof of size logarithmic in $l$. Simple counting shows that one full non-interactive proof consists of $2\lceil\log l\rceil + 3$ group elements and 3 elements of $\mathbb{Z}_p$. If a 256-bit elliptic curve is used for $G$, then this results in $64\lceil\log l\rceil + 192$ bytes per proof.

Number of exponentiations
Computing multi-exponentiations over $G$ is by far the most time-consuming operation in our main protocol. We count the number of exponentiations to be performed by the prover and verifier in order to estimate the time needed to execute the protocol. The prover computes $l$ exponentiations for $\vec g'$, $l + 1$ exponentiations for $t$ and only 1 exponentiation for $w$ ($\vec s_1$ and $\vec s_2$ are binary), plus the exponentiations in the inner product proof. The verifier computes $2l + 1$ exponentiations and those from the inner product proof. In the inner product proof the prover has to compute $2 \cdot 2^{\lceil\log l\rceil - i} + 6$ exponentiations in the $i$-th folding level, $i = 0, \dots, \lceil\log l\rceil - 1$. This amounts to $4 \cdot 2^{\lceil\log l\rceil} + 6\lceil\log l\rceil - 4 < 8l + 6\log l + 2$ exponentiations for the full Bulletproof folding. In addition there are 6 exponentiations needed for the Schnorr-type proof. The verifier performs $4\lceil\log l\rceil < 4\log l + 4$ exponentiations for the folding protocol and 6 exponentiations for the verification equation. This can be heavily optimized by delaying exponentiations; see [BBB+17, Section 6.2]. We conclude that the total exponentiation costs for the prover and verifier are less than $10l + 6\log l + 10$ and $2l + 4\log l + 11$ exponentiations, respectively.

Example
We return to the example of a verifiable encryption scheme from Section 1.5. In the case of verifiable encryption, one has to prove a matrix equation $A\vec s = \vec t$ with parameters $n = 2$, $m = 4$, $k = 1$, $B = 4$. For the ring $R_q$, a common example for encrypting messages that are binary polynomials (cf. [ADPS16]) is setting $f = X^{1024} + 1$ and $q$ being a prime of about 13 bits, and $p = 2$. With these parameters we find the length $l$ of the secret vectors $\vec s_1$ and $\vec s_2$ in the inner product proof to be equal to 100296. It then follows from the counts above that the prover and verifier need to compute about 724986 and 200667 exponentiations, respectively, to run our protocol for this application. With current CPUs one exponentiation on a 256-bit elliptic curve can be computed in about 35000 cycles (see https://bench.cr.yp.to/results-dh.html), which amounts to roughly 85000 exponentiations per second. So computing one of our proofs should be possible in less than 10 seconds. This can be improved further by using specialized algorithms for computing multi-exponentiations, in particular Pippenger's algorithm [Pip80]. The size of the proof is 1.25 kilobytes.
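The figures in this example can be reproduced from the formulas for $l$, the exponentiation counts and the proof size. In the Python sketch below the bit lengths $b = 3$, $b_1 = 15$ and $b_2 = 13$ are assumptions: they are the two's complement widths for the bounds $B = 4$, $B_1 = (mdB + d\|f\|_\infty)/2 = 8704$ and $B_2 = q/2$ with a 13-bit $q$, and they reproduce the stated length. The function names are illustrative.

```python
def vector_length(n, m, k, d, b, b1, b2):
    """l = m*k*d*b + n*k*(2d-1)*b1 + n*k*(d-1)*b2."""
    return m * k * d * b + n * k * (2 * d - 1) * b1 + n * k * (d - 1) * b2


def ceil_log2(l):
    """Exact ceil(log2(l)) for integers l >= 1, avoiding floating point."""
    return (l - 1).bit_length()


def prover_exponentiations(l):
    log_l = ceil_log2(l)
    folding = 4 * 2**log_l + 6 * log_l - 4
    # l for g', l + 1 for t, 1 for w, the folding steps, 6 for the Schnorr-type proof
    return l + (l + 1) + 1 + folding + 6


def verifier_exponentiations(l):
    # 2l + 1 in the main protocol, 4*ceil(log l) for folding, 6 for verification
    return (2 * l + 1) + 4 * ceil_log2(l) + 6


def proof_size_bytes(l, element_bytes=32):
    # 2*ceil(log l) + 3 group elements and 3 elements of Z_p, 32 bytes each
    return (2 * ceil_log2(l) + 3) * element_bytes + 3 * element_bytes


# Parameters from the example; b, b1, b2 are assumed bit lengths (see above).
l = vector_length(n=2, m=4, k=1, d=1024, b=3, b1=15, b2=13)
print(l)                              # 100296
print(prover_exponentiations(l))      # 724986
print(verifier_exponentiations(l))    # 200667
print(proof_size_bytes(l))            # 1280 bytes = 1.25 KB
print(prover_exponentiations(l) / 85000)  # about 8.5 seconds at 85000 exp/s
```

Note that the prover's cost is dominated by the folding term $4 \cdot 2^{\lceil\log l\rceil}$, because padding to the next power of two inflates the vector length from 100296 to 131072.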