SOFIA: MQ-based signatures in the QROM

Ming-Shing Chen1 and Andreas Hülsing2 and Joost Rijneveld3 and Simona Samardjiska3,4 and Peter Schwabe3

1 Department of Electrical Engineering, National Taiwan University, and Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan
2 Department of Mathematics and Computer Science, Technische Universiteit Eindhoven, Eindhoven, The Netherlands
3 Digital Security Group, Radboud University, Nijmegen, The Netherlands
4 Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, R. Macedonia

Abstract. We propose SOFIA, the first MQ-based signature scheme provably secure in the quantum-accessible random oracle model (QROM). Our construction relies on an extended version of Unruh's transform for 5-pass identification schemes that we describe and prove secure both in the ROM and QROM. Based on a detailed security analysis, we provide concrete parameters for SOFIA that achieve 128-bit post-quantum security. The result is SOFIA-4-128 with parameters that are carefully optimized to minimize signature size and maximize performance. SOFIA-4-128 comes with an implementation targeting recent Intel processors with the AVX2 vector-instruction set; the implementation is fully protected against timing attacks.

Keywords: Post-quantum cryptography, multivariate cryptography, 5-pass identification schemes, QROM, Unruh's transform, vectorized implementation.

1 Introduction

This work was supported by the Netherlands Organization for Scientific Research (NWO) under Veni 2013 project 13114 and by the European Commission through the ICT Programme under contract ICT-645622 PQCRYPTO. Permanent ID of this document: 355ce8e96788011f95a9ea9db7793836. Date: July 7, 2017.

At Asiacrypt 2016, Chen, Hülsing, Rijneveld, Samardjiska, and Schwabe presented a post-quantum signature scheme called MQDSS [CHR+16], obtained by applying a generalized Fiat-Shamir transform to a 5-pass identification scheme (IDS) with security based on the hardness of solving a system of multivariate quadratic equations (MQ problem). Unlike previous MQ-based signature schemes, MQDSS comes with a reduction from a random instance of MQ; it does not need additional assumptions on the hardness of the Isomorphism-of-Polynomials (IP) [Pat96] or related problems like MinRank [Cou01,FLP08]. Unfortunately, the security reduction of MQDSS is in the random-oracle model and highly non-tight. The authors state that they see the proposal as a major step towards a scheme with a tight reduction to [sic] MQ in the quantum-random-oracle model (QROM) or even better in the standard model.

In this paper, we make another major step towards such a scheme. More specifically, we propose SOFIA, a scheme that is provably EU-CMA secure in the QROM if the MQ problem is hard and allows for a tight reduction in the ROM (albeit not in the QROM). To achieve this, we start from Unruh's transform [Unr15] for transforming sigma protocols to non-interactive zero-knowledge proofs (and signatures) in the QROM. The reason for a different transform comes from the inherent problems of the Fiat-Shamir transform (and also the generalization to 5-pass schemes) in the QROM. Namely, the proof technique introduced by Pointcheval and Stern [PS96] requires rewinding of the adversary and adaptively programming the random oracle. Not only does this cause problems in the QROM, it also produces non-tight proofs.
Unruh's transform avoids these problems by adopting and tweaking an idea from Fischlin's transform [Fis05] that solves the rewinding problem. Instead of simply applying the transform to a 3-pass IDS, we extend it such that it applies to any 5-pass IDS with binary second challenge (named q2-IDS in [CHR+16]) and thus to the MQ-based IDS from [SSH11]. This is in the same spirit as the generalization of the Fiat-Shamir transform in [CHR+16]. We prove that the signature scheme resulting from the application of the transform is post-quantum EU-CMA secure (PQ-EU-CMA) in the QROM. This proof follows a two-step approach: We first give a (tight) proof in the ROM, and then discuss the changes necessary to carry over to the QROM.

We then instantiate the construction with the 5-pass MQ-based IDS introduced by Sakumoto, Shirai, and Hiwatari in [SSH11] and provide various optimizations particularly suited for this specific IDS. These optimizations almost halve the size of the signature compared to the non-optimized generic transform. We instantiate SOFIA with carefully optimized parameters aiming at the 128-bit post-quantum security level; we refer to this instance as SOFIA-4-128. A comparison with MQDSS-31-64 from [CHR+16], which targets the same security level, shows that the improvements in security guarantees come at a cost: with 123 KiB, SOFIA-4-128 signatures are about a factor of 3 larger than MQDSS-31-64 signatures and our optimized SOFIA-4-128 software takes about a factor of 3 longer for both signing and verification than the optimized MQDSS-31-64 software presented in [CHR+16]. However, like MQDSS, SOFIA features extremely short keys; specifically, SOFIA-4-128 public keys have 64 bytes and secret keys have 32 bytes.

SOFIA is not the first concrete signature scheme with a proof in the QROM.
Notably, TESLA-768 [ABBD15] is a lattice-based signature scheme with a reduction in the QROM, while Picnic-10-38 [CDG+17] is the result of constructing a signature scheme from a symmetric primitive using the transform by Unruh [Unr15] that was mentioned above. Relying on even more conservative assumptions, the hash-based signature scheme SPHINCS-256 [BHH+15] has a tight proof in the standard model. Although SOFIA-4-128 remains faster than SPHINCS-256 (which is, because of its standard model assumptions, arguably the 'scheme to beat'), we do significantly exceed its 40 KiB signature size. Conversely, but on a similar note, SOFIA-4-128 outperforms Picnic-10-38 both in terms of signing speed and signature size. TESLA-768 remains the 'odd one out' with its small signatures but much larger keys; it strongly depends on context whether this is an upside or a problem. See Table 3 for a numeric overview of the comparison.

Organization of this paper. Section 2 gives the necessary background on identification schemes and signature schemes. Section 3 presents the modified Unruh transform to support q2 identification schemes. Section 4 revisits the 5-pass identification scheme introduced in [SSH11]. Section 5 introduces the SOFIA signature scheme and finally Section 6 explains our parameter choices for SOFIA-4-128 and gives details of our optimized implementation.

Availability of software. We place all software presented in this paper into the public domain to maximize reusability of our results. It is available for download at https://joostrijneveld.nl/papers/sofia.

2 Preliminaries

In the following we provide basic definitions used throughout this work. We are concerned with post-quantum security, i.e., a setting where honest parties use classical computers but adversaries might have access to a quantum computer. Therefore, we adapt some common security notions accordingly, modeling adversaries as quantum algorithms.

Digital signatures. In this work we are concerned with the construction of digital-signature schemes. These are defined as follows.

Definition 2.1 (Digital signature scheme). A digital-signature scheme with security parameter k, denoted Dss($1^k$), is a triplet of polynomial-time algorithms Dss = (KGen, Sign, Vf) defined as follows:

- The key-generation algorithm KGen is a probabilistic algorithm that outputs a key pair (sk, pk).
- The signing algorithm Sign is a possibly probabilistic algorithm that on input a secret key sk and a message M outputs a signature σ.
- The verification algorithm Vf is a deterministic algorithm that on input a public key pk, a message M and a signature σ outputs a bit b, where b = 1 indicates that the signature is accepted and b = 0 indicates a reject.

We write Dss instead of Dss($1^k$) whenever the security parameter k is clear from context or irrelevant. For correctness of a Dss, we require that for all (sk, pk) ← KGen(), all messages M and all signatures σ ← Sign(sk, M), we get Vf(pk, M, σ) = 1, i.e., that correctly generated signatures are accepted.

Existential Unforgeability under Adaptive Chosen Message Attacks. The standard security notion for digital signature schemes is existential unforgeability under adaptive chosen message attacks (EU-CMA) [GMR88], which is defined using the following experiment.

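Experiment $\mathrm{Exp}^{\text{eu-cma}}_{\mathrm{Dss}(1^k)}(A)$:
  (sk, pk) ← KGen()
  $(M^\star, \sigma^\star) \leftarrow A^{\mathrm{Sign}(sk,\cdot)}(pk)$
  Let $\{M_i\}_1^{Q_s}$ be the queries to Sign(sk, ·).
  Return 1 iff $\mathrm{Vf}(pk, M^\star, \sigma^\star) = 1$ and $M^\star \notin \{M_i\}_1^{Q_s}$.

For the success probability of an adversary A in the above experiment we write

\[ \mathrm{Succ}^{\text{eu-cma}}_{\mathrm{Dss}(1^k)}(A) = \Pr\left[\mathrm{Exp}^{\text{eu-cma}}_{\mathrm{Dss}(1^k)}(A) = 1\right]. \]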
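The experiment above can be sketched as a small test harness. The following Python sketch is illustrative only: the toy scheme (`kgen`, `sign`, `vf`) is a hypothetical stand-in built from HMAC in which pk equals sk, so it is deliberately not a secure signature scheme; it merely lets the two winning conditions (validity and freshness) be exercised.

```python
import hashlib
import hmac
import os


def kgen():
    # Toy scheme (assumption): pk equals sk, so this is NOT a secure signature scheme.
    sk = os.urandom(32)
    return sk, sk


def sign(sk, message):
    return hmac.new(sk, message, hashlib.sha256).digest()


def vf(pk, message, signature):
    expected = hmac.new(pk, message, hashlib.sha256).digest()
    return 1 if hmac.compare_digest(expected, signature) else 0


def exp_eu_cma(adversary):
    """Run the EU-CMA experiment and return 1 iff the adversary wins."""
    sk, pk = kgen()
    queried = []  # the messages M_1, ..., M_{Q_s} sent to the signing oracle

    def sign_oracle(message):
        queried.append(message)
        return sign(sk, message)

    forged_message, forged_signature = adversary(pk, sign_oracle)
    valid = vf(pk, forged_message, forged_signature) == 1
    fresh = forged_message not in queried
    return 1 if (valid and fresh) else 0


def replay_adversary(pk, sign_oracle):
    # Outputs a signature obtained from the oracle: valid but not fresh.
    message = b"hello"
    return message, sign_oracle(message)


def key_abusing_adversary(pk, sign_oracle):
    # Wins only because the toy scheme leaks the signing key as pk.
    message = b"never queried"
    return message, hmac.new(pk, message, hashlib.sha256).digest()
```

Replaying an oracle answer fails the freshness condition, whereas the second adversary wins, which is exactly why the toy scheme is insecure.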

A signature scheme is called EU-CMA-secure if any PPT algorithm A has only negligible success probability. Similarly, a signature scheme is called PQ-EU-CMA-secure if any quantum polynomial time adversary has only negligible success probability. We only give a formal definition for the latter.

Definition 2.2 (PQ-EU-CMA). Let k ∈ N and Dss($1^k$) a digital signature scheme with security parameter k as defined above. We call Dss($1^k$) post-quantum existentially unforgeable under chosen message attacks or PQ-EU-CMA-secure if for all $Q_s$, t = poly(k) the success probability of any quantum algorithm A (the adversary) running in time ≤ t, making at most $Q_s$ queries to Sign in the above experiment, is negligible in k:

\[ \mathrm{Succ}^{\text{eu-cma}}_{\mathrm{Dss}(1^k)}(A) = \mathrm{negl}(k). \]

Post-quantum security against key-only attacks (PQ-KOA) is defined the same as PQ-EU-CMA but with $Q_s = 0$, i.e., the adversary is given no access to the signing oracle.

Identification Schemes. An identification scheme (IDS) is a protocol that allows a prover P to prove its identity to a verifier V. More formally this is covered by the following definition.

Definition 2.3 (Identification scheme). An identification scheme with security parameter k, denoted IDS($1^k$), consists of three PPT algorithms IDS = (KGen, P, V) such that:

- The key generation algorithm KGen is a probabilistic algorithm that outputs a key pair (sk, pk).
- P and V are interactive algorithms, executing a common protocol. The prover P takes as input a secret key sk and the verifier V takes as input a public key pk. At the conclusion of the protocol, V outputs a bit b with b = 1 indicating accept and b = 0 indicating reject.

We write IDS instead of IDS($1^k$) whenever the security parameter k is clear from context or irrelevant. For correctness of an IDS, we require that for all (pk, sk) ← KGen() we have

\[ \Pr\left[\langle P(sk), V(pk) \rangle = 1\right] = 1, \]

where ⟨P(sk), V(pk)⟩ refers to the common execution of the protocol between P with input sk and V on input pk. In this work we are concerned with 5-pass IDS, i.e., IDS where a transcript consists of five messages. A canonical 5-pass IDS is an IDS where the prover and the verifier exchange two challenges and replies. More formally:

Definition 2.4 (Canonical 5-pass identification schemes). Consider IDS = (KGen, P, V), a 5-pass identification scheme with two challenge spaces C1 and C2. We call IDS a canonical 5-pass identification scheme if the prover can be split into three subroutines P = (P0, P1, P2) and the verifier into three subroutines V = (ChS1, ChS2, Vf) such that

- P0(sk) computes the initial commitment com sent as the first message and a state state fed forward to P1.
- ChS1 computes the first challenge message ch1 ←R C1, sampling at random from the challenge space C1.
- P1(state, ch1) computes the first response resp1 of the prover (and updates the state state) given access to the state and the first challenge.
- ChS2 computes the second challenge message ch2 ←R C2.
- P2(state, ch2) computes the second response resp2 of the prover given access to the state and the second challenge.
- Vf(pk, com, ch1, resp1, ch2, resp2), upon access to the public key and the whole transcript, outputs V's final decision.

Note that the state forwarded among the prover algorithms can contain all inputs to previous prover algorithms if they are needed later. We also assume that the verifier keeps all sent and received messages to feed them to Vf. Figure 1 describes a canonical 5-pass IDS.

We will consider a particular type of 5-pass identification protocols where the size of the two challenge spaces is restricted to q and 2.

Definition 2.5 (q2-Identification scheme). A q2-Identification scheme IDS is a canonical 5-pass identification scheme where for the challenge spaces C1 and C2 it holds that |C1| = q and |C2| = 2. Moreover, the probability that the commitment com takes a given value is ≤ $2^{-k}$, where the probability is taken over the random choice of the input and the used randomness.

Our goal is to construct signature schemes from identification schemes. It is well known that for this passively secure identification schemes suffice. In this setting, security is defined in terms of two properties: soundness and honest-verifier zero-knowledge (HVZK). To prove security of our signature scheme, we will not make use of soundness but of a similar property called q2-extractor which is a variant of special soundness. This is combined with a notion of key-one-wayness to later be able to argue about security. The issue with soundness is that the common proof technique makes use of rewinding, which becomes troublesome when dealing with quantum adversaries.

  P                                                    V

  (state, com) ← P0(sk)
                              --- com --->
                                                       ch1 ←R ChS1(1^k)
                              <--- ch1 ---
  (state, resp1) ← P1(state, ch1)
                              --- resp1 --->
                                                       ch2 ←R ChS2(1^k)
                              <--- ch2 ---
  resp2 ← P2(state, ch2)
                              --- resp2 --->
                                                       b ← Vf(pk, com, ch1, resp1, ch2, resp2)

Fig. 1. Canonical 5-pass IDS

Definition 2.6 ((statistical) Honest-verifier zero-knowledge). Let k ∈ N, IDS($1^k$) = (KGen, P, V) an identification scheme with security parameter k. We say that IDS is statistical honest-verifier zero-knowledge if there exists a probabilistic polynomial time algorithm S, called the simulator, such that the statistical distance between the following two distribution ensembles is negligible in k:

\[ \{(pk, sk) \leftarrow \mathrm{KGen}() : (sk, pk, \mathrm{trans}(\langle P(sk), V(pk) \rangle))\} \]
\[ \{(pk, sk) \leftarrow \mathrm{KGen}() : (sk, pk, S(pk))\}. \]

Intuitively it must be hard for any cryptographic scheme to derive a valid secret key given a public key. To formally capture this intuition, we need to define what valid means. For this we define the notion of a key relation.

Definition 2.7 (Key relation). Let IDS be a q2-Identification scheme and R some relation. We say IDS has key relation R if

\[ \forall (pk, sk) \leftarrow \mathrm{KGen}() : (pk, sk) \in R. \]

Now that we have defined what valid means, we can define key-one-wayness.

Definition 2.8 (PQ-KOW). Let k ∈ N be the security parameter, IDS($1^k$) be a q2-Identification scheme with key relation R. We call IDS post-quantum key-one-way (PQ-KOW) (with respect to key relation R) if for all quantum polynomial time algorithms A

\[ \mathrm{Succ}^{\text{pq-kow}}_{\mathrm{IDS}(1^k)}(A) = \Pr\left[(pk, sk) \leftarrow \mathrm{KGen}(),\ sk' \leftarrow A(pk) : (pk, sk') \in R\right] = \mathrm{negl}(k). \]

In [CHR+16] it was shown that in general, for q2-Identification Schemes, it is not possible to efficiently extract a matching secret key from only two related transcripts (as in the case of 3-pass schemes fulfilling special soundness). In order to capture the nature of these schemes and provide sufficient conditions for efficient extraction, the authors proposed the definition of a q2-Extractor. In the following we give a post-quantum version of q2-Extractor that fixes two slight technical shortcomings of the definition in [CHR+16]. On the one hand, we add the algorithm that actually generates the transcripts to the definition; on the other hand, we use the notion of key relation to capture what kind of secret key the extractor returns.

Definition 2.9 (PQ-q2-Extractor). Let IDS($1^k$) be a q2-Identification scheme with key relation R and A a quantum polynomial time algorithm that upon input of security parameter $1^k$ and an IDS($1^k$) public key pk outputs, with non-negligible probability, four valid transcripts with respect to pk:

\[
\begin{aligned}
\mathrm{trans}^{(1)} &= (com, ch_1, resp_1, ch_2, resp_2), & \mathrm{trans}^{(2)} &= (com, ch_1, resp_1, ch_2', resp_2'), \\
\mathrm{trans}^{(3)} &= (com, ch_1', resp_1', ch_2, resp_2), & \mathrm{trans}^{(4)} &= (com, ch_1', resp_1', ch_2', resp_2'),
\end{aligned} \tag{1}
\]

where $ch_1 \neq ch_1'$ and $ch_2 \neq ch_2'$. We say that IDS($1^k$) has a PQ-q2-Extractor if there exists a quantum polynomial time algorithm $K_{IDS}$, the extractor, that, given a public key pk and access to A, outputs a secret key sk such that (pk, sk) ∈ R with non-negligible success probability in k.

A classical version of this definition is obtained by restricting A and $K_{IDS}$ to classical PPT algorithms.
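The four-transcript pattern required by the extractor can be checked mechanically. The Python sketch below is a hypothetical helper, not part of the paper: it assumes that transcripts sharing one commitment are indexed by a pair (i, b), where distinct values of i stand for distinct first challenges and b ∈ {0, 1} is the second challenge, and that a validity flag for each pair is already known.

```python
def find_extractor_pattern(valid):
    """Return (i1, i2) such that all four transcripts are valid, or None.

    `valid` maps (i, b) -> bool for one fixed commitment, where i indexes
    the first challenge and b in {0, 1} is the second challenge.
    """
    indices = sorted({i for (i, _b) in valid})
    # Indices for which both second challenges were answered validly.
    both = [i for i in indices
            if valid.get((i, 0), False) and valid.get((i, 1), False)]
    if len(both) >= 2:
        return both[0], both[1]
    return None
```

A single index with both second challenges answered is not enough; the pattern needs two such indices under the same commitment.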

3 From q2-IDS to signatures in the QROM

In [CHR+16], the authors show that the Fiat-Shamir transformation can be generalized to the case of 5-pass IDS whose ChS2 is bounded to two elements. They show that the Pointcheval-Stern proof [PS96] can be extended to this case, and the obtained signature scheme can be shown EU-CMA secure in the random oracle model. This result is further extended to any 2n + 1 round identification scheme that fulfills a certain kind of special soundness in [DGV+16]. However, similar to the standard Fiat-Shamir transform, their proofs rely on the forking lemma, which introduces two serious problems in the post-quantum setting: rewinding of the adversary, and adaptively programming the random oracle. While it is known how to deal with the latter [Unr15], the former seems to become a real show stopper [ARU14]. The only known way to fix the Fiat-Shamir transform in the QROM setting [DFG13] is using oblivious commitments, which are a certain kind of trapdoor commitments, effectively avoiding rewinding at the cost of introducing the necessity of a trapdoor function. This makes the solution not applicable in our setting as there are no trapdoor functions with a reduction from the MQ-problem.

In [Unr15], Unruh proposes a different transform, based on Fischlin's transform [Fis05], that turns 3-pass zero-knowledge proofs into non-interactive ones in the QROM. In addition, Unruh shows how to use his transform to obtain a signature scheme. The transform essentially works by unrolling Fischlin's transform and then applying a few tweaks. This works, as Fischlin's transform already avoids rewinding. The basic idea is to let the signer generate several transcripts for a commitment. This is iterated for several initial commitments. Next, the signer blinds all responses in the transcripts by applying a length-preserving hash. All the obtained data is hashed together with the public key and the message to obtain a challenge vector. This challenge vector determines one transcript per commitment that has to be unblinded, i.e., for which the response must be included in the signature. The signature consists of all the transcripts with blinded responses and the unblinded responses for the transcripts identified by the challenge vector.

The reasoning behind the transform is that without knowing the secret key, a forger cannot know sufficiently many valid openings to be able to include all the challenged responses. On the other hand, a security reduction can replace the length-preserving hash (modeled as QRO) by an invertible function (e.g. a QPRP). That way, a reduction can unblind the remaining responses in the signature by inverting the function. Now, it can be argued that an adversary with non-negligible success probability must have known several valid transcripts for at least one commitment. The unblinding reveals those transcripts and they can be used to run the extractor.

Here, we show that a similar transform can be applied to 5-pass IDS with a binary second challenge (i.e., q2-IDS).
Basically, we treat the second challenge-response round like the first. However, as we have a binary second challenge, we ask that for each first challenge, a transcript for both values of the second challenge is generated. The main difference between the security reduction of Unruh's transform and our extension to q2-IDS is a more involved argument to show that we get sufficiently many valid transcripts that follow the pattern needed to extract a valid secret key. As this argument is essentially independent of the RO, we first give a proof in the classical ROM. This also allows us to show that the reduction is tight in the ROM. Afterwards we describe how things change in the QROM along the lines of Unruh's QROM proof. This is where the reduction becomes loose. It remains an interesting open question whether this is a fundamental issue with QROM reductions or whether the existing techniques are just not sufficiently evolved yet.

3.1 Extending Unruh's transform to q2-IDS

Let IDS = (KGen, P, V) be a q2-IDS, where we have P = (P0, P1, P2) and V = (ChS1, ChS2, Vf), and let r, t ∈ N be two parameters, where 2 ≤ t ≤ q. Moreover, let $H_1 : \{0,1\}^{|resp_1|} \to \{0,1\}^{|resp_1|}$, $H_2 : \{0,1\}^{|resp_2|} \to \{0,1\}^{|resp_2|}$, and $H : \{0,1\}^* \to \{0,1\}^{\lceil \log 2t \rceil r}$ be hash functions, later modeled as random oracles. We define the following digital signature scheme (KGen, Sign, Vf). The key generation algorithm just runs IDS.KGen(). Signature and verification algorithms are given in Figures 2 and 3.
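To make the length-preserving hashes H1 and H2 concrete, here is a minimal Python sketch. It assumes SHAKE-256 as the underlying extendable-output function and uses the hypothetical labels `b"H1"` and `b"H2"` for domain separation; this section does not fix an instantiation, so both choices are illustrative.

```python
import hashlib


def length_preserving_hash(label, response):
    """Hash `response` to a digest of exactly len(response) bytes."""
    xof = hashlib.shake_256()
    xof.update(label)      # fixed-length label separates H1 from H2
    xof.update(response)
    return xof.digest(len(response))


def h1(resp1):
    # Blinds a first response: |cr1| == |resp1|.
    return length_preserving_hash(b"H1", resp1)


def h2(resp2):
    # Blinds a second response: |cr2| == |resp2|.
    return length_preserving_hash(b"H2", resp2)
```

Keeping the output as long as the input is what later allows a reduction to swap the hash for an invertible function without changing the signature format.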

Sign(sk, M)

For $j \in \{1, \dots, r\}$ do
  $(state^{(j)}, com^{(j)}) \leftarrow P_0(sk)$
  For $i \in \{1, \dots, t\}$ do
    $ch_1^{(i,j)} \leftarrow_R ChS_1 \setminus \{ch_1^{(1,j)}, \dots, ch_1^{(i-1,j)}\}$
    $(state^{(i,j)}, resp_1^{(i,j)}) \leftarrow P_1(state^{(j)}, ch_1^{(i,j)})$
    $cr_1^{(i,j)} \leftarrow H_1(resp_1^{(i,j)})$
    $resp_2^{(i,j,0)} \leftarrow P_2(state^{(i,j)}, ch_2 = 0)$, $resp_2^{(i,j,1)} \leftarrow P_2(state^{(i,j)}, ch_2 = 1)$
    $cr_2^{(i,j,0)} \leftarrow H_2(resp_2^{(i,j,0)})$, $cr_2^{(i,j,1)} \leftarrow H_2(resp_2^{(i,j,1)})$
  $transfull(j) := com^{(j)}, \{ch_1^{(i,j)}, cr_1^{(i,j)}, (cr_2^{(i,j,0)}, cr_2^{(i,j,1)})\}_{i=1}^{t}$
$md \leftarrow H(pk, M, \{transfull(j)\}_{j=1}^{r})$
Read md as vector $((I_1, B_1), \dots, (I_r, B_r))$
$transred(j) := com^{(j)}, \{ch_1^{(i,j)}, cr_1^{(i,j)}, (cr_2^{(i,j,0)}, cr_2^{(i,j,1)})\}_{i \neq I_j,\, i=1}^{t}$
$\sigma := (md, \{transred(j), ch_1^{(I_j,j)}, resp_1^{(I_j,j)}, resp_2^{(I_j,j,B_j)}, cr_2^{(I_j,j,\neg B_j)}\}_{j=1}^{r})$

Fig. 2. Signature generation
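The step "Read md as vector ((I_1, B_1), ..., (I_r, B_r))" in Fig. 2 can be illustrated as follows. This Python sketch makes several assumptions that are not taken from the paper: H is modeled by SHAKE-256, the input is an already serialized byte string, each round consumes ⌈log 2t⌉ bits with the lowest bit read as B_j, and t is a power of two so that every bit pattern maps to a valid pair.

```python
import hashlib


def challenge_vector(data, r, t):
    """Derive md from `data` and read it as ((I_1, B_1), ..., (I_r, B_r))."""
    if t < 2 or t & (t - 1):
        raise ValueError("this sketch assumes t is a power of two")
    bits_per_round = (2 * t - 1).bit_length()   # equals ceil(log2(2t))
    total_bits = bits_per_round * r
    md = hashlib.shake_256(data).digest((total_bits + 7) // 8)
    # Keep exactly the leading total_bits bits of the digest.
    as_int = int.from_bytes(md, "big") >> (8 * len(md) - total_bits)
    pairs = []
    for j in range(r):
        shift = total_bits - (j + 1) * bits_per_round
        chunk = (as_int >> shift) & ((1 << bits_per_round) - 1)
        index, bit = chunk >> 1, chunk & 1
        pairs.append((index + 1, bit))  # I_j in {1, ..., t}, B_j in {0, 1}
    return pairs
```

For a value of t that is not a power of two, some bit patterns would fall outside {1, ..., t} × {0, 1}; how to handle them is left open here.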

For ease of exposition, we will use the notation T(j, i, b) for a string that has the format of a transcript of the IDS (not necessarily a valid transcript), corresponding to the j-th round of the non-interactive protocol, with i and b being the indices of the corresponding challenges ch1 and ch2, i.e.

\[ T(j, i, b) := (com^{(j)}, ch_1^{(i,j)}, resp_1^{(i,j)}, ch_2 = b, resp_2^{(i,j,b)}), \]

where $j \in \{1, \dots, r\}$, $i \in \{1, \dots, t\}$, $b \in \{0, 1\}$.

3.2 PQ-EU-CMA-Security in the ROM

In the following, we first establish post-quantum security under key-only attacks (PQ-KOA). More specifically, we will show that a successful KOA-forger A can be used to extract a valid secret key for the underlying IDS. Afterwards, we will extend the result to existential unforgeability under chosen message attacks.

Vf(pk, σ, M)

Read md as vector $((I_1, B_1), \dots, (I_r, B_r))$
For $j \in \{1, \dots, r\}$ do
  $cr_1^{(I_j,j)} \leftarrow H_1(resp_1^{(I_j,j)})$
  $cr_2^{(I_j,j,B_j)} \leftarrow H_2(resp_2^{(I_j,j,B_j)})$
$md' \leftarrow H(pk, M, \{transfull(j)\}_{j=1}^{r})$
Check that $md' = md$
For $j \in \{1, \dots, r\}$ do
  Check that $ch_1^{(1,j)}, \dots, ch_1^{(t,j)}$ are all distinct
  Check $1 = b \leftarrow \mathrm{Vf}(pk, com^{(j)}, ch_1^{(I_j,j)}, resp_1^{(I_j,j)}, B_j, resp_2^{(I_j,j,B_j)})$
If all checks succeed, output success.

Fig. 3. Verification

PQ-KOW ⇒ PQ-KOA. The following lemma gives an exact relation between the key-one-wayness of the identification scheme and the security of the proposed signature scheme under key-only attacks.

Lemma 3.1. Let k, t, r ∈ N be the parameters of the signature scheme from Figures 2 and 3 above, using a q2-IDS that has a key relation R, a PQ-q2-extractor, and is PQ-KOW secure. Let A be a quantum algorithm that implements a KOA forger which given only the public key pk outputs a valid message-signature pair (M, σ) with probability ε. Then, in the random oracle model there exists an algorithm $M^A$ that given oracle access to any such A breaks the KOW security of IDS in essentially the same running time as the given A and with success probability

\[ \epsilon' \ge \epsilon - (q_H + 1)\, 2^{-r \log \frac{2t}{t+1}}, \tag{2} \]

where $q_H$ is the number of queries A makes to H.

Proof. We now show how to construct such an algorithm $M^A$. On input of an IDS public key pk, $M^A$ first runs A(pk). Let $E_A$ be the event that A outputs a valid message-signature pair (M, σ) with

\[ \sigma = \left(md, \left\{transred(j), ch_1^{(I_j,j)}, resp_1^{(I_j,j)}, resp_2^{(I_j,j,B_j)}, cr_2^{(I_j,j,\neg B_j)}\right\}_{j=1}^{r}\right). \]

Then $E_A$ implies that for every $j \in \{1, \dots, r\}$, $T(j, I_j, B_j)$ is a valid transcript of IDS and the verifier Vf accepts. Now, our goal is to use the q2-extractor to extract a secret key. This means we need to obtain four valid transcripts $T(j, i_1, 0)$, $T(j, i_1, 1)$, $T(j, i_2, 0)$, $T(j, i_2, 1)$ for some $j \in \{1, \dots, r\}$. To this end, $M^A$ simulates the random oracles $H_1$ and $H_2$ for A in the common way. The important point is that this way $M^A$ learns all of A's queries together with the given responses. Hence, when given A's forgery, $M^A$ can open all blinded responses in the signature. Now, $M^A$ will only fail to extract if among all the 2tr opened transcripts of the signature, there are no four valid transcripts with the above pattern.

Consider the event $E_{\neg ext}$ which describes this case.

$E_{\neg ext}$: For every $j \in \{1, \dots, r\}$, and for every $i_1, i_2 \in \{1, \dots, t\}$, $i_1 \neq i_2$, at least one of $T(j, i_1, 0)$, $T(j, i_1, 1)$, $T(j, i_2, 0)$, $T(j, i_2, 1)$ is not a valid transcript of the IDS.

We will now upper bound $\Pr[E_A \cap E_{\neg ext}]$ and thereby lower bound $M^A$'s success probability.

Let (M, σ) be A's output under the event $E_A \cap E_{\neg ext}$. First, (M, σ) must be valid because of $E_A$. Now, consider the set $S_{\neg ext}$ of tuples $(pk, M, \{transfull(j)\}_{j=1}^{r})$ such that for every $j \in \{1, \dots, r\}$ there is at most one $I_j^*$ with $T(j, I_j^*, 0)$ and $T(j, I_j^*, 1)$ being valid transcripts of IDS. It is clear that A's output under the event $E_A \cap E_{\neg ext}$ must come from $S_{\neg ext}$. Indeed, if a tuple does not satisfy the given condition, then there exist at least two indices $I_j^*, I_j^{**}$ such that $T(j, I_j^*, 0)$, $T(j, I_j^*, 1)$, $T(j, I_j^{**}, 0)$, $T(j, I_j^{**}, 1)$ are valid transcripts of IDS, which is in contradiction to the event $E_{\neg ext}$.

Let $(pk, M, \{transfull(j)\}_{j=1}^{r})$ be such a tuple. Then the indexes that define the required openings in σ are obtained as the output of the random oracle H on input of the tuple, i.e.

\[ ((I_1, B_1), \dots, (I_r, B_r)) \leftarrow H(pk, M, \{transfull(j)\}_{j=1}^{r}). \]

In order for the signature to pass verification, for each $j \in \{1, \dots, r\}$, the transcript $T(j, I_j, B_j)$ must be valid. Given the conditions of $E_{\neg ext}$, there are at most t + 1 valid transcripts for each j: at most one index i has both $T(j, i, 0)$ and $T(j, i, 1)$ valid, and each of the remaining t − 1 indices contributes at most one valid transcript. Hence there are at most $(t+1)^r$ possible values for the entire $((I_1, B_1), \dots, (I_r, B_r))$. Thus, the probability for the adversary to produce a valid signature from such a tuple is at most

\[ \frac{(t+1)^r}{(2t)^r} = 2^{-r \log \frac{2t}{t+1}}. \]

Now let $q_H$ be the number of queries of the adversary to the oracle H. We have that

\[ \Pr(E_A \cap E_{\neg ext}) \le (q_H + 1)\, 2^{-r \log \frac{2t}{t+1}}, \]

as A can try at most $q_H$ tuples to obtain a valid signature and output a signature based on a new tuple otherwise. Towards obtaining a bound on $M^A$'s success probability, note that $M^A$ succeeds in the event $E_A \cap \neg E_{\neg ext}$. For this event we get

\[ \Pr(E_A \cap \neg E_{\neg ext}) = \Pr(E_A) - \Pr(E_A \cap E_{\neg ext}) \ge \epsilon - (q_H + 1)\, 2^{-r \log \frac{2t}{t+1}}. \]

This proves the claimed bound. □
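The counting step behind the factor (t+1)^r / (2t)^r can be checked by brute force for tiny parameters. The Python sketch below uses illustrative values t = 3 and r = 2 (chosen only to keep the enumeration small) and a hypothetical helper `forger_best_round` that models the most favourable situation for a forger that is still compatible with the non-extraction event.

```python
from itertools import product


def count_accepting_vectors(valid_rounds, t):
    """Count vectors ((I_1,B_1),...,(I_r,B_r)) that open only valid transcripts.

    valid_rounds[j] is the set of (i, b) pairs whose transcript T(j, i, b)
    is valid in round j.
    """
    per_round = [(i, b) for i in range(1, t + 1) for b in (0, 1)]
    return sum(
        all(choice in valid for choice, valid in zip(vector, valid_rounds))
        for vector in product(per_round, repeat=len(valid_rounds))
    )


def forger_best_round(t):
    # One index has both second challenges valid; every other index has
    # exactly one valid transcript. Two "double" indices would allow extraction.
    return {(1, 0), (1, 1)} | {(i, 0) for i in range(2, t + 1)}


t, r = 3, 2
accepting = count_accepting_vectors([forger_best_round(t)] * r, t)
total = (2 * t) ** r
```

Here `accepting / total` equals ((t+1)/(2t))^r, the per-tuple success probability used in the proof.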

PQ-KOA ⇒ PQ-EU-CMA. Given the above lemma, it suffices to reduce PQ-KOA to PQ-EU-CMA security to eventually prove PQ-EU-CMA security of the proposed scheme, i.e., we have to show that we can answer an adversary's signature queries without knowledge of a secret key. This is done in the following lemma. Afterwards we can derive the main theorem of the section.

Lemma 3.2. Let k, t, r ∈ N be the parameters of the signature scheme from Figures 2 and 3 above, using a q2-IDS that is honest-verifier zero-knowledge. Let A be a quantum algorithm that breaks the PQ-EU-CMA security of the signature scheme with probability ε. Then, in the random oracle model there exists an algorithm $M^A$ that breaks the PQ-KOA security of the signature scheme in essentially the same running time as A and with success probability

\[ \epsilon' \ge \epsilon\,(1 - q_{Sign}\, q_H\, 2^{-rk}). \tag{3} \]

Proof. We show how to construct $M^A$ that on input a public key pk of the signature scheme (which is also a public key for IDS), access to a HVZK-simulator $S_{IDS}$ for IDS and the random oracles $H_1$, $H_2$, H, breaks the KOA security of the signature scheme. The running time and success probability of $M^A$ are essentially the same as that of A up to a negligible difference.

Upon receiving the public key pk, $M^A$ runs A(pk), simulating all signature and random oracle queries for A. Whenever A queries $H_1$ or $H_2$, $M^A$ simply forwards the query to its respective RO. For H, $M^A$ keeps a local list $L_H$. Whenever A queries H, $M^A$ first checks $L_H$ and returns the stored answer if one exists. Otherwise, $M^A$ forwards the query to its oracle H and stores the query together with the result in $L_H$ before returning the response.

Whenever A makes a signature query on a message M, $M^A$ does the following:

1. Samples $\widetilde{md} \leftarrow_R \{0,1\}^{\lceil \log 2t \rceil r}$ and interprets it as challenge string, i.e., $((I_1, B_1), \dots, (I_r, B_r)) := \widetilde{md}$.
2. Runs the HVZK-simulator $S_{IDS}$ r times to obtain r valid transcripts of IDS:
   \[ \left\{(com^{(j)}, ch_1^{(I_j,j)}, resp_1^{(I_j,j)}, ch_2^{(j)}, resp_2^{(I_j,j,B_j)})\right\}_{j=1}^{r}, \]
   and uses them as the challenged transcripts $T(j, I_j, B_j)$ for $j \in \{1, \dots, r\}$.
3. Blinds the responses $resp_1^{(I_j,j)}$ and $resp_2^{(I_j,j,B_j)}$ for every $j \in \{1, \dots, r\}$:
   \[ cr_1^{(I_j,j)} \leftarrow H_1(resp_1^{(I_j,j)}), \quad cr_2^{(I_j,j,B_j)} \leftarrow H_2(resp_2^{(I_j,j,B_j)}). \]
4. For all $j \in \{1, \dots, r\}$, and all $(i, b) \in \{1, \dots, t\} \times \{0, 1\} \setminus \{(I_j, B_j)\}_{j=1}^{r}$, samples a first challenge
   \[ ch_1^{(i,j)} \leftarrow_R ChS_1 \setminus \{ch_1^{(I_j,j)}, ch_1^{(1,j)}, \dots, ch_1^{(i-1,j)}\}, \]
   samples fake responses
   \[ resp_1^{(i,j)} \leftarrow_R RespS_1, \quad resp_2^{(i,j,b)} \leftarrow_R RespS_2, \]
   blinds the fake responses
   \[ cr_1^{(i,j)} \leftarrow H_1(resp_1^{(i,j)}), \quad cr_2^{(i,j,b)} \leftarrow H_2(resp_2^{(i,j,b)}), \]
   and sets
   \[ transfull(j) := com^{(j)}, \left\{ch_1^{(i,j)}, cr_1^{(i,j)}, (cr_2^{(i,j,0)}, cr_2^{(i,j,1)})\right\}_{i=1}^{t}. \]
5. Checks if there is already an entry for $(pk, M, \{transfull(j)\}_{j=1}^{r})$ in $L_H$. If so, $M^A$ aborts. Otherwise, $M^A$ stores $((pk, M, \{transfull(j)\}_{j=1}^{r}), \widetilde{md})$ in $L_H$.
6. Outputs the signature
   \[ \sigma = \left(md, \left\{transred(j), ch_1^{(I_j,j)}, resp_1^{(I_j,j)}, resp_2^{(I_j,j,B_j)}, cr_2^{(I_j,j,\neg B_j)}\right\}_{j=1}^{r}\right), \]
   where $transred(j) := com^{(j)}, \left\{ch_1^{(i,j)}, cr_1^{(i,j)}, (cr_2^{(i,j,0)}, cr_2^{(i,j,1)})\right\}_{i \neq I_j,\, i=1}^{t}$.

Finally, $M^A$ outputs whatever A outputs.

Now, $M^A$ succeeds exactly with A's success probability as long as it does not abort. All RO queries follow the correct distribution and so do the signatures. An abort only occurs if A queried H before on the value for which $M^A$ wants to program. The value has the form $(pk, M, \{transfull(j)\}_{j=1}^{r})$ with $transfull(j) := com^{(j)}, \left\{ch_1^{(i,j)}, cr_1^{(i,j)}, (cr_2^{(i,j,0)}, cr_2^{(i,j,1)})\right\}_{i=1}^{t}$. The $\{transfull(j)\}_{j=1}^{r}$ term has at least rk bits of entropy as the commitments have at least k bits of entropy according to the definition of q2-IDS and there is one commitment for each of the r rounds. This is merely a very loose (but more than sufficient) lower bound on the entropy as the blinded responses also add additional entropy. Hence, if A makes a total of $q_H$ queries for H and $q_{Sign}$ signature queries, an abort occurs with probability

\[ \Pr[\mathrm{abort}] \le q_{Sign}\, q_H\, 2^{-rk}. \]

Hence, $M^A$ succeeds with probability

\[ \epsilon' \ge \epsilon\,(1 - q_{Sign}\, q_H\, 2^{-rk}). \]

□
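The bookkeeping that the reduction performs for H (the list L_H, forwarding of fresh queries, and the abort when a programmed point was already queried) can be sketched as a small class. This is a classical illustration only; the class and exception names are hypothetical, and sampling with `os.urandom` stands in for forwarding a query to the reduction's own oracle H.

```python
import os


class AbortReduction(Exception):
    """Raised when the reduction would have to reprogram a defined point."""


class ProgrammableOracle:
    def __init__(self, output_bytes):
        self.output_bytes = output_bytes
        self.table = {}  # plays the role of the list L_H: query -> answer

    def query(self, x):
        # Stand-in for "forward x to the real oracle H and store the result".
        if x not in self.table:
            self.table[x] = os.urandom(self.output_bytes)
        return self.table[x]

    def program(self, x, value):
        # Step 5 of the simulated signing: abort if x already has an entry.
        if x in self.table:
            raise AbortReduction("input already defined in L_H")
        self.table[x] = value
```

In the simulation, `program` would be called with the tuple (pk, M, {transfull(j)}) and the pre-sampled challenge string, so that the simulated signature verifies.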

PQ-KOW ⇒ PQ-EU-CMA. Combining the two previous lemmas we obtain the following theorem.

Theorem 3.3. Let k, t, r ∈ N be the parameters of the signature scheme from Figures 2 and 3 above using a q2-IDS IDS that is statistical honest-verifier zero-knowledge and has a PQ-q2-extractor. Let A be a PQ-EU-CMA forger that succeeds with probability ε. Then, there exists an algorithm $M^A$ that in the random oracle model breaks the PQ-KOW security of IDS in essentially the same running time as A and with success probability

\[ \epsilon' \ge \epsilon - q_{Sign}\, q_H\, 2^{-rk} - (q_H + 1)\, 2^{-r \log \frac{2t}{t+1}}. \]

Proof. Suppose there exists a PQ-EU-CMA forger A that succeeds with non-negligible probability ε. We construct a PQ-KOW adversary C for the q2-IDS as follows. C runs A(pk) to construct a key-only forger $M^A$ as in Lemma 3.2 that succeeds with probability (3). Now as in Lemma 3.1, C can extract a valid secret key sk in approximately the same time and with only negligibly smaller probability (see (2)). In total the success probability of C is

\[ \epsilon' \ge \epsilon - q_{Sign}\, q_H\, 2^{-rk} - (q_H + 1)\, 2^{-r \log \frac{2t}{t+1}}, \]

and the running time of C is essentially the same as that of A. □

3.3 PQ-EU-CMA security in the QROM

We now show that with only slight changes, the two lemmas above also hold in the QROM. We do this in reverse order, starting with the PQ-KOA to PQ-EU-CMA reduction as it is the easier case. As the QROM proofs in Unruh's work, on which we build, are already non-tight, we only give our arguments in the asymptotic regime.

PQ-KOA ⇒ PQ-EU-CMA. We will first revisit the reduction from PQ-KOA to PQ-EU-CMA. We show the following lemma:

Lemma 3.4. Let $k, t, r \in \mathbb{N}$ be the parameters of the signature scheme from Figures 2 and 3 above, using a q2-IDS that is honest-verifier zero-knowledge. Let A be a quantum algorithm that breaks the PQ-EU-CMA security of the signature scheme with probability $\epsilon$. Then, in the quantum-accessible random oracle model there exists a quantum algorithm $M^A$ that breaks the PQ-KOA security of the signature scheme in essentially the same running time as A and with success probability

$$\epsilon' \ge \epsilon\,(1 - \mathrm{negl}(k)). \qquad (4)$$

Proof (Sketch). The proof in the ROM above also applies in the QROM with essentially a single change. The queries to $H_1$ and $H_2$ are still just forwarded by $M^A$ without interaction. This works without any issues in the QROM given that $M^A$ is now a quantum algorithm (which is unavoidable in the QROM). The only issue is the way $M^A$ handles H. It is no longer possible for $M^A$ to learn A's queries to H, and therefore not possible to abort. However, we only added the abort condition above for clarity: in the classical case $M^A$ could also simply always program H. Then A's success probability might change if $M^A$ programmed on an input previously queried by A. However, we still obtain the same bound on the probability. In the QROM, Unruh showed in [Unr15, Corollary 11] that this adaptive programming only negligibly changes A's success probability (the argument for our specific case is exactly the one made in the first game hop of the proof of Theorem 15 in [Unr15]). From this it follows that $M^A$'s success probability still only negligibly deviates from that of A. □

PQ-KOW ⇒ PQ-KOA. Now we revisit the reduction from PQ-KOW to PQ-KOA in the quantum-accessible ROM. While we still do this in the asymptotic regime, we make explicit the parts of the reduction loss that depend on the parameters r, t of the scheme. This is to support parameter selection in later sections.

Lemma 3.5. Let $k, t, r \in \mathbb{N}$ be the parameters of the signature scheme from Figures 2 and 3 above, using a q2-IDS that has a key relation R, a PQ-q2-extractor, and is PQ-KOW secure. Let A be a quantum algorithm that implements a KOA forger which, given only the public key pk, outputs a valid message-signature pair $(M, \sigma)$ with probability $\epsilon$. Then, in the quantum-accessible random oracle model there exists a quantum algorithm $M^A$ that, given oracle access to any such A, breaks the KOW security of IDS in essentially the same running time as the given A and with success probability

$$\epsilon' \ge \epsilon - 2(q_H + 1)\, 2^{-(r \log \frac{2t}{t+1})/2}. \qquad (5)$$

Proof (Sketch). A QROM version of our proof is obtained by essentially following the proof of Lemma 17 in [Unr15]. The changes in the proof above are as follows. First, $M^A$ can no longer learn A's RO queries to $H_1$ and $H_2$ by simulating these oracles the classical way. Instead, $M^A$ simulates these oracles using one quantum PRP (QPRP) per oracle with a random secret key per QPRP. QPRPs exist as shown in [Zha16] and they are quantum indistinguishable from random functions. Now, $M^A$ can open the blinded responses in the signature by inverting the QPRP using the secret key. Second, the analysis of $\Pr(E_A \cap E_{\neg\mathrm{ext}})$ changes. As we have shown, the probability of a tuple from $E_{\neg\mathrm{ext}}$ to lead to a valid signature is $2^{-r \log \frac{2t}{t+1}}$. We can now follow the analysis in [Unr15] that reduces distinguishing the constant zero function from a Bernoulli distributed boolean function to finding a tuple in $E_{\neg\mathrm{ext}}$ that leads to a valid signature. Thereby we get the claimed bound:

$$\Pr(E_A \cap E_{\neg\mathrm{ext}}) \le 2(q_H + 1)\, 2^{-(r \log \frac{2t}{t+1})/2}.$$

□

PQ-KOW ⇒ PQ-EU-CMA. Putting the above two lemmas together allows us to state the following theorem.

Theorem 3.6. Let $k, t, r \in \mathbb{N}$ be the parameters of the signature scheme above using a q2-IDS IDS that is statistical honest-verifier zero-knowledge and has a PQ-q2-extractor. Let A be a PQ-EU-CMA forger that succeeds with probability $\epsilon$. Then, there exists a quantum algorithm $M^A$ that in the quantum-accessible random oracle model breaks the PQ-KOW security of IDS in essentially the same running time as A and with success probability

$$\epsilon' \ge \left(\epsilon - 2(q_H + 1)\, 2^{-(r \log \frac{2t}{t+1})/2}\right)(1 - \mathrm{negl}(k)).$$

4 The Sakumoto-Shirai-Hiwatari 5-pass IDS scheme

In [SSH11], Sakumoto, Shirai, and Hiwatari proposed two new identification schemes, a 3-pass and a 5-pass IDS, based on the intractability of the MQ problem. Unlike previous public key schemes, their solution provably relies only on the MQ problem (and the security of the commitment scheme), and not on other related problems in multivariate cryptography such as the Isomorphism of Polynomials (IP) problem [Pat96], the related Extended IP [DHYC06] and IP with partial knowledge [Tho13] problems or the MinRank problem [Cou01,FLP08]. Let us quickly recall the MQ problem.

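Definition 4.1 (MQ problem (search version)). Let $m, n, q \in \mathbb{N}$, $\mathbf{x} = (x_1, \dots, x_n)$ and let $\mathcal{MQ}(n, m, \mathbb{F}_q)$ denote the family of vectorial functions $\mathbf{F} : \mathbb{F}_q^n \to \mathbb{F}_q^m$ of degree 2 over $\mathbb{F}_q$:

$$\mathcal{MQ}(n, m, \mathbb{F}_q) = \Big\{ \mathbf{F}(\mathbf{x}) = (f_1(\mathbf{x}), \dots, f_m(\mathbf{x})) \;\Big|\; f_s(\mathbf{x}) = \sum_{i,j} a_{i,j}^{(s)} x_i x_j + \sum_i b_i^{(s)} x_i,\; s \in \{1, \dots, m\} \Big\}.$$

An instance $MQ(\mathbf{F}, \mathbf{v})$ of the MQ (search) problem is defined as: Given $\mathbf{F} \in \mathcal{MQ}(n, m, \mathbb{F}_q)$, $\mathbf{v} \in \mathbb{F}_q^m$, find, if any, $\mathbf{s} \in \mathbb{F}_q^n$ such that $\mathbf{F}(\mathbf{s}) = \mathbf{v}$.

The decisional version of the MQ problem is NP-complete [GJ79]. It is widely believed that the MQ problem is intractable even for quantum computers in the average case, i.e., that there exists no polynomial time quantum algorithm that given $\mathbf{F} \leftarrow_R \mathcal{MQ}(n, m, \mathbb{F}_q)$ and $\mathbf{v} = \mathbf{F}(\mathbf{s})$ (for random $\mathbf{s} \leftarrow_R \mathbb{F}_q^n$) outputs a solution $\mathbf{s}'$ to the $MQ(\mathbf{F}, \mathbf{v})$ problem with non-negligible probability. We will later also need the MQ relation $R_{MQ}$, which is the relation of MQ instances and solutions: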

Definition 4.2 (MQ relation). The MQ relation is a binary relation $R_{MQ(m,n,q)} \subseteq (\mathcal{MQ}(n, m, \mathbb{F}_q) \times \mathbb{F}_q^m) \times \mathbb{F}_q^n$ with

$$((\mathbf{F}, \mathbf{v}), \mathbf{s}) \in R_{MQ(m,n,q)} \text{ iff } \mathbf{F}(\mathbf{s}) = \mathbf{v}.$$

We will omit m, n, q whenever they are clear from the context.

In [SSH11], Sakumoto, Shirai, and Hiwatari propose a clever splitting technique, using the so-called polar form of the function $\mathbf{F}$, which is the function $\mathbf{G}(\mathbf{x}, \mathbf{y}) = \mathbf{F}(\mathbf{x} + \mathbf{y}) - \mathbf{F}(\mathbf{x}) - \mathbf{F}(\mathbf{y})$. Using the polar form and its bilinearity, it becomes possible to split a secret into two shares, such that none of the shares on its own leaks anything about the secret.

Using this result, they showed how to construct zero knowledge arguments of knowledge for the MQ problem, using a statistically hiding and computationally binding commitment scheme. They present a 3- and a 5-pass protocol with differing performance properties. Later, in [CHR+16] the security properties of the 5-pass scheme were reexamined to provide the minimal requirements for Fiat-Shamir type signatures from 5-pass IDS. For completeness and better readability we provide the description of the 5-pass IDS, together with the properties that we will use.
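The bilinearity of the polar form can be checked numerically. The following is a minimal sketch, assuming a toy prime field $\mathbb{F}_7$ and small dimensions chosen only for illustration; the helper names (`random_mq`, `evaluate`, `polar`, `check_bilinear`) are hypothetical and not taken from the SOFIA software.

```python
import random

P = 7  # toy prime field F_7 (an assumption; the paper works over a general F_q)


def random_mq(n, m, rng):
    """Sample coefficients of m functions f_s(x) = sum a_ij x_i x_j + sum b_i x_i."""
    a = [[[rng.randrange(P) for _ in range(n)] for _ in range(n)] for _ in range(m)]
    b = [[rng.randrange(P) for _ in range(n)] for _ in range(m)]
    return a, b


def evaluate(system, x):
    """Evaluate F(x) = (f_1(x), ..., f_m(x)) over F_P."""
    a, b = system
    n = len(x)
    return [
        (sum(a_s[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
         + sum(b_s[i] * x[i] for i in range(n))) % P
        for a_s, b_s in zip(a, b)
    ]


def polar(system, x, y):
    """Polar form G(x, y) = F(x + y) - F(x) - F(y)."""
    xy = [(xi + yi) % P for xi, yi in zip(x, y)]
    fxy, fx, fy = evaluate(system, xy), evaluate(system, x), evaluate(system, y)
    return [(u - v - w) % P for u, v, w in zip(fxy, fx, fy)]


def check_bilinear(n=4, m=3, trials=50, seed=1):
    """Check G(c*x + z, y) == c*G(x, y) + G(z, y) on random inputs."""
    rng = random.Random(seed)
    system = random_mq(n, m, rng)
    for _ in range(trials):
        x, y, z = ([rng.randrange(P) for _ in range(n)] for _ in range(3))
        c = rng.randrange(P)
        cx_plus_z = [(c * xi + zi) % P for xi, zi in zip(x, z)]
        lhs = polar(system, cx_plus_z, y)
        rhs = [(c * g1 + g2) % P
               for g1, g2 in zip(polar(system, x, y), polar(system, z, y))]
        if lhs != rhs:
            return False
    return True
```

Linearity in the second argument follows from the symmetry $\mathbf{G}(\mathbf{x}, \mathbf{y}) = \mathbf{G}(\mathbf{y}, \mathbf{x})$; the linear terms of $\mathbf{F}$ cancel in $\mathbf{G}$, which is why no constant term is allowed in Definition 4.1.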

Let $(pk, sk) = ((\mathbf{F}, \mathbf{v}), \mathbf{s}) \in R_{MQ}$ be the public and private keys of the Prover (i.e., key generation just samples from the MQ relation). Without loss of generality, let the elements from $\mathbb{F}_q$ be $\alpha_1, \dots, \alpha_q$. The 5-pass IDS from [SSH11] is given in Figure 4.

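P(pk, sk)  and  V(pk)

//setup
  P: $\mathbf{r}_0, \mathbf{t}_0 \leftarrow_R \mathbb{F}_q^n$, $\mathbf{e}_0 \leftarrow_R \mathbb{F}_q^m$
  P: $\mathbf{r}_1 \leftarrow \mathbf{s} - \mathbf{r}_0$
//commit
  P: $c_0 \leftarrow Com(\mathbf{r}_0, \mathbf{t}_0, \mathbf{e}_0)$
  P: $c_1 \leftarrow Com(\mathbf{r}_1, \mathbf{G}(\mathbf{t}_0, \mathbf{r}_1) + \mathbf{e}_0)$
  P → V: $com = (c_0, c_1)$
//challenge 1
  V: $I \leftarrow_R \{1, \dots, q\}$
  V → P: $ch_1 = I$
//first response
  P: $\mathbf{t}_1 \leftarrow \alpha_I \mathbf{r}_0 - \mathbf{t}_0$
  P: $\mathbf{e}_1 \leftarrow \alpha_I \mathbf{F}(\mathbf{r}_0) - \mathbf{e}_0$
  P → V: $resp_1 = (\mathbf{t}_1, \mathbf{e}_1)$
//challenge 2
  V: $ch_2 \leftarrow_R \{0, 1\}$
  V → P: $ch_2$
//second response
  P: If $ch_2 = 0$, $resp_2 \leftarrow \mathbf{r}_0$; Else $resp_2 \leftarrow \mathbf{r}_1$
  P → V: $resp_2$
//verify
  V: If $ch_2 = 0$, parse $resp_2 = \mathbf{r}_0$, check $c_0 \stackrel{?}{=} Com(\mathbf{r}_0, \alpha_I \mathbf{r}_0 - \mathbf{t}_1, \alpha_I \mathbf{F}(\mathbf{r}_0) - \mathbf{e}_1)$
  V: Else, parse $resp_2 = \mathbf{r}_1$, check $c_1 \stackrel{?}{=} Com(\mathbf{r}_1, \alpha_I (\mathbf{v} - \mathbf{F}(\mathbf{r}_1)) - \mathbf{G}(\mathbf{t}_1, \mathbf{r}_1) - \mathbf{e}_1)$

Fig. 4. The 5-pass IDS by Sakumoto, Shirai, and Hiwatari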
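To see that an honest prover passes both verification checks of Figure 4, the protocol can be run on a toy instance. The sketch below rests on several assumptions made only for illustration: the field is the prime field $\mathbb{F}_7$, the dimensions are tiny, and `com` is a plain SHA-256 hash standing in for the commitment scheme (it is not statistically hiding and is not the commitment used by SOFIA). All function names are hypothetical.

```python
import hashlib
import random

P = 7  # toy prime field (assumption for illustration only)


def add(x, y):
    return [(xi + yi) % P for xi, yi in zip(x, y)]


def sub(x, y):
    return [(xi - yi) % P for xi, yi in zip(x, y)]


def scale(c, x):
    return [(c * xi) % P for xi in x]


def evaluate(a, b, x):
    """F(x) with quadratic coefficients a and linear coefficients b."""
    n = len(x)
    return [(sum(a_s[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
             + sum(b_s[i] * x[i] for i in range(n))) % P
            for a_s, b_s in zip(a, b)]


def polar(a, b, x, y):
    """G(x, y) = F(x + y) - F(x) - F(y)."""
    return sub(sub(evaluate(a, b, add(x, y)), evaluate(a, b, x)), evaluate(a, b, y))


def com(*vectors):
    """Hash-based stand-in for Com (hypothetical; not a hiding commitment)."""
    return hashlib.sha256(b"|".join(bytes(v) for v in vectors)).hexdigest()


def run_ids(n=4, m=4, seed=0):
    """Run the honest prover against every (alpha, ch2) challenge pair."""
    rng = random.Random(seed)

    def rand_vec(k):
        return [rng.randrange(P) for _ in range(k)]

    a = [[rand_vec(n) for _ in range(n)] for _ in range(m)]
    b = [rand_vec(n) for _ in range(m)]
    s = rand_vec(n)
    v = evaluate(a, b, s)  # pk = (F, v), sk = s

    # setup and commit
    r0, t0, e0 = rand_vec(n), rand_vec(n), rand_vec(m)
    r1 = sub(s, r0)
    c0 = com(r0, t0, e0)
    c1 = com(r1, add(polar(a, b, t0, r1), e0))

    for alpha in range(P):  # every possible first challenge
        t1 = sub(scale(alpha, r0), t0)
        e1 = sub(scale(alpha, evaluate(a, b, r0)), e0)
        # verifier's check for ch2 = 0 (response r0)
        ok0 = c0 == com(r0, sub(scale(alpha, r0), t1),
                        sub(scale(alpha, evaluate(a, b, r0)), e1))
        # verifier's check for ch2 = 1 (response r1)
        inner = sub(scale(alpha, sub(v, evaluate(a, b, r1))), polar(a, b, t1, r1))
        ok1 = c1 == com(r1, sub(inner, e1))
        if not (ok0 and ok1):
            return False
    return True
```

The second check succeeds because $\mathbf{v} - \mathbf{F}(\mathbf{r}_1) = \mathbf{F}(\mathbf{r}_0) + \mathbf{G}(\mathbf{r}_0, \mathbf{r}_1)$ and, by bilinearity, $\mathbf{G}(\mathbf{t}_1, \mathbf{r}_1) = \alpha_I \mathbf{G}(\mathbf{r}_0, \mathbf{r}_1) - \mathbf{G}(\mathbf{t}_0, \mathbf{r}_1)$, so the recomputed value collapses to $\mathbf{G}(\mathbf{t}_0, \mathbf{r}_1) + \mathbf{e}_0$.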

The following theorem summarizes the security properties of the 5-pass identification scheme from [SSH11] that we need.

Theorem 4.3. The 5-pass identification scheme from [SSH11] (see Fig. 4)

1. is statistically honest-verifier zero-knowledge when the commitment scheme Com is statistically hiding,
2. has key relation $R_{MQ(m,n,q)}$,
3. is post-quantum key-one-way if the MQ search problem is hard on average, and
4. has a PQ-q2-Extractor when the commitment scheme Com is computationally binding against quantum polynomial time algorithms.

The first statement was shown in [SSH11]. The second statement holds by construction. The third statement follows from the second. The PQ-q2-Extractor essentially follows from a proof in [CHR+16]. In [CHR+16] the existence of a (non-quantum) q2-Extractor was proven under the condition that the commitment scheme is computationally binding. The proof shows that there exists a PPT algorithm that, given four valid transcripts of the IDS with the correct pattern, always either extracts a secret key or outputs two valid openings for the commitment. Hence, as long as the used commitment scheme achieves the traditional definition of computationally binding also against quantum polynomial time algorithms, the 5-pass identification scheme from [SSH11] has a PQ-q2-Extractor (as the probability to output two valid openings must be negligible).

5 Instantiation from the Sakumoto-Shirai-Hiwatari 5-pass IDS

In the previous sections, we have defined a signature scheme as the result of a transformed q2-IDS scheme. Here, we define it instantiated with the 5-pass identification scheme proposed in [SSH11].

5.1 SOFIA

We define the signature scheme in generic terms by describing the required parameters and the functions KGen, Sign and Vf, and defer giving concrete parameters $m, n, r, t$ and $\mathbb{F}_q$ for a specific security parameter $k$ to the next section. For now, we only need to fix $2 \le t \le q$ elements of the field $\mathbb{F}_q$. Without loss of generality, we denote them by $\alpha_1, \dots, \alpha_t$.

Key generation. The SOFIA key generation algorithm formally just samples an MQ relation. Practically, the algorithm is realized as shown in Figure 5.

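KGen()
  $sk \leftarrow_R \{0, 1\}^k$
  $S_F, \mathbf{s}, S_{rte} \leftarrow PRG_{sk}(sk)$
  $\mathbf{F} \leftarrow XOF_F(S_F)$
  $\mathbf{v} \leftarrow \mathbf{F}(\mathbf{s})$
  $pk := (S_F, \mathbf{v})$
  Return $(pk, sk)$

Fig. 5. SOFIA key generation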

The secret key is used as a seed to derive the following values:

• $S_F$, a seed from which the system parameter $\mathbf{F}$ is expanded;
• $\mathbf{s}$, the secret input to the MQ function;
• $S_{rte}$, a seed that is used to sample all vectors $\mathbf{r}_0^{(i)}$, $\mathbf{t}_0^{(i)}$ and $\mathbf{e}_0^{(i)}$. Note that this seed is not yet needed during key generation, but is required during signing.

Signature generation. For the signing procedure, we assume as input a message $M \in \{0, 1\}^*$ and a secret key sk. The signing procedure is given in Figure 6. Note that the scheme definition includes several optimizations to reduce the signature size. We discuss these later in this section.

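Sign(sk, M)
  $S_F, \mathbf{s}, S_{rte} \leftarrow PRG_{sk}(sk)$
  $\mathbf{F} \leftarrow XOF_F(S_F)$
  $pk := (S_F, \mathbf{F}(\mathbf{s}))$
  $\mathbf{r}_0^{(1)}, \dots, \mathbf{r}_0^{(r)}, \mathbf{t}_0^{(1)}, \dots, \mathbf{t}_0^{(r)}, \mathbf{e}_0^{(1)}, \dots, \mathbf{e}_0^{(r)} \leftarrow PRG_{rte}(S_{rte}, M)$
  For $j \in \{1, \dots, r\}$ do
    $\mathbf{r}_1^{(j)} \leftarrow \mathbf{s} - \mathbf{r}_0^{(j)}$
    $c_0^{(j)} \leftarrow Com(\mathbf{r}_0^{(j)}, \mathbf{t}_0^{(j)}, \mathbf{e}_0^{(j)})$
    $c_1^{(j)} \leftarrow Com(\mathbf{r}_1^{(j)}, \mathbf{G}(\mathbf{t}_0^{(j)}, \mathbf{r}_1^{(j)}) + \mathbf{e}_0^{(j)})$
    $com^{(j)} := (c_0^{(j)}, c_1^{(j)})$
    For $i \in \{1, \dots, t\}$ do
      $\mathbf{t}_1^{(i,j)} \leftarrow \alpha_i \mathbf{r}_0^{(j)} - \mathbf{t}_0^{(j)}$, $\mathbf{e}_1^{(i,j)} \leftarrow \alpha_i \mathbf{F}(\mathbf{r}_0^{(j)}) - \mathbf{e}_0^{(j)}$
      $resp_1^{(i,j)} := (\mathbf{t}_1^{(i,j)}, \mathbf{e}_1^{(i,j)})$
      $cr_1^{(i,j)} \leftarrow H_1(resp_1^{(i,j)})$
    $resp_2^{(j,0)} := \mathbf{r}_0^{(j)}$, $resp_2^{(j,1)} := \mathbf{r}_1^{(j)}$
    $cr_2^{(j,0)} \leftarrow H_2(resp_2^{(j,0)})$, $cr_2^{(j,1)} \leftarrow H_2(resp_2^{(j,1)})$
    $\mathrm{trans}_{\mathrm{full}}(j) := (com^{(j)}, \{cr_1^{(i,j)}\}_{i=1}^{t}, cr_2^{(j,0)}, cr_2^{(j,1)})$
  $md \leftarrow H(pk, M, \{\mathrm{trans}_{\mathrm{full}}(j)\}_{j=1}^{r})$
  $((I_1, B_1), \dots, (I_r, B_r)) \leftarrow XOF_{trans}(md)$
  $\mathrm{trans}_{\mathrm{red}}(j) := (c_{\neg B_j}^{(j)}, \{cr_1^{(i,j)}\}_{i \neq I_j,\, i=1}^{t}, cr_2^{(j,\neg B_j)})$
  Return $(md, \{\mathrm{trans}_{\mathrm{red}}(j), \alpha_{I_j}, resp_1^{(I_j,j)}, resp_2^{(j,B_j)}\}_{j=1}^{r})$

Fig. 6. SOFIA signature generation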

The signer begins by effectively performing KGen() to obtain pk and $\mathbf{F}$, and then iterates through r rounds of the transformed identification scheme to obtain the transcript. He then uses this as input for $XOF_{trans}$ to derive a sequence of indices $((I_1, B_1), \dots, (I_r, B_r))$, which effectively dictate the responses that should be included unblinded in the signature.

Verification. Upon receiving a message M, a signature $\sigma$, and a public key $pk = (S_F, \mathbf{v})$, the verifier begins by obtaining the system parameter $\mathbf{F}$ and parsing the signature $\sigma$ as defined by its construction in Sign(), above. The verification routine that follows is listed in Figure 7.

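Vf(pk, σ, M)
  $\mathbf{F} \leftarrow XOF_F(S_F)$
  $((I_1, B_1), \dots, (I_r, B_r)) \leftarrow XOF_{trans}(md)$
  For $j \in \{1, \dots, r\}$ do
    $cr_1^{(I_j,j)} \leftarrow H_1(resp_1^{(I_j,j)})$
    $cr_2^{(I_j,B_j)} \leftarrow H_2(resp_2^{(I_j,B_j)})$
  For $j \in \{1, \dots, r\}$ do
    If $B_j = 0$ then
      $\mathbf{r}_0^{(j)} := resp_2^{(I_j,B_j)}$
      $c_0^{(j)} \leftarrow Com(\mathbf{r}_0^{(j)}, \alpha_{I_j} \mathbf{r}_0^{(j)} - \mathbf{t}_1^{(I_j,j)}, \alpha_{I_j} \mathbf{F}(\mathbf{r}_0^{(j)}) - \mathbf{e}_1^{(I_j,j)})$
    Else
      $\mathbf{r}_1^{(j)} := resp_2^{(I_j,B_j)}$
      $c_1^{(j)} \leftarrow Com(\mathbf{r}_1^{(j)}, \alpha_{I_j} (\mathbf{v} - \mathbf{F}(\mathbf{r}_1^{(j)})) - \mathbf{G}(\mathbf{t}_1^{(I_j,j)}, \mathbf{r}_1^{(j)}) - \mathbf{e}_1^{(I_j,j)})$
  $md' \leftarrow H(pk, M, \{\mathrm{trans}_{\mathrm{full}}(j)\}_{j=1}^{r})$
  Return $md' = md$

Fig. 7. SOFIA signature verification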

Optimizations. There are several optimizations that can be applied to signatures resulting from a transformed q2-IDS. Some of them are specific to SOFIA and some are more general; similar and related optimizations were suggested in [Unr15], [CHR+16] and [CDG+17].

Excluding unnecessary blindings. The signature contains blindings of all computed responses, as well as a selection of opened responses $resp_1^{(I_j,j)}$ and $resp_2^{(j,B_j)}$. It is redundant to include the values $cr_1^{(I_j,j)}$ and $cr_2^{(j,B_j)}$, as these can be recomputed based on the opened responses. This optimization was actually proposed in the generic Unruh transform [Unr15], and applies to any construction similar to Unruh's and ours. However, for the verifier to know which responses were actually opened, they must be able to reproduce the indices $((I_1, B_1), \dots, (I_r, B_r))$, which are derived from the transcript, and without the blinded responses, this transcript is incomplete. To solve this circular dependency, we could include the selected indices in the signature. However, for typical parameters⁵, we can do this more efficiently by breaking $XOF_{trans}$ into two parts, composing it of a hash function H over the transcript and an extendable output function $XOF_{IB}$ to derive the indices from the hash output. We then include $H(pk, M, \{\mathrm{trans}_{\mathrm{full}}(j)\}_{j=1}^{r})$ as part of the signature, so that the verifier can reconstruct the indices, blind the corresponding responses, construct $\mathrm{trans}_{\mathrm{full}}$, and recompute the same hash for comparison.

5 See Section 6.1

Fixed challenge space definition. Following the generic description of the signature, the selected $\alpha^{(i,j)}$ are included in the signature. Depending on the specific choice of t and q, it may be more efficient to include the challenges $\alpha^{(i,j)}$ that were not selected. However, there is no reason not to take this a step further and simply fix a challenge space $ChS_1$ of t elements. That way, all the $\alpha$'s from $ChS_1$ will be selected and there is no need to include them in the signature. This not only reduces the signature size, but also simplifies the implementation.

Excluding unnecessary second responses. The underlying IDS from [SSH11] has a specific property, namely that the second responses do not depend on the previous state (that is, on the first challenge and response). Therefore, regardless of the value of $\alpha$, the second responses are always the same. For this reason, they need to be included only once per commitment (rather than repeating the same value t times). Combined with the previous optimization, this implies that one of the second responses will be opened, and the other will be included blinded.

Omitting commitments. The check that the verifier performs for each round consists of recomputing $c_{B_j}^{(j)}$, and comparing it to one of the commits supplied by the signer. Similar to the above, and as already suggested in [SSH11], the signer can omit the commits that the verifier will recompute. A hash over all commits could be included instead, which the verifier can reconstruct using the commits $c_{B_j}^{(j)}$ he recomputes and the commits $c_{\neg B_j}^{(j)}$ the signer includes. However, it turns out that this hash is not necessary either: as these commitments are part of the transcript and the verifier is already checking the correctness of the transcript as per the first optimization, the correctness of the recomputed commitments is implicitly checked in the process.
While constructing this scheme, we attempted several other variations. Notably, we explored opening for multiple $\alpha$-challenges, but that led to no improvement in the number of rounds, and, in some cases, to a contradiction of the zero-knowledge property. Variants that employ a form of internal parallelization by committing to multiple values for $\mathbf{t}_0$ do reduce the number of rounds, but increase the size of the transcript disproportionately. Altogether, the above optimizations are crucial: the savings add up to almost 110 KiB, nearly halving the signature size of the scheme that results from the transform.

5.2 Security of SOFIA

In Section 3 we described an extension of Unruh's transform to q2-IDS and have proven that it provides PQ-EU-CMA security in the QROM for any underlying q2-IDS with a PQ-q2-extractor, the HVZK property, and PQ-key-one-wayness. This, of course, implies that this transform can immediately be applied to the 5-pass MQ IDS from [SSH11], to give an MQ signature provably secure in the QROM.

As discussed in the previous subsection, some optimizations can significantly improve the performance of the scheme. They deviate from the generic construction, however, causing a need for some changes in the security proof. Fortunately, only minor changes are required. We specify the following theorem.

Theorem 5.1. Let $k \in \mathbb{N}$ be the security parameter. The signature scheme SOFIA is post-quantum existentially unforgeable under adaptive chosen message attacks in the quantum-accessible random oracle model if the following conditions are satisfied:

• the search version of the MQ problem is intractable in the average case,
• the hash functions $H$, $H_1$, and $H_2$ are modeled as quantum-accessible random oracles,
• the commitment function Com is computationally binding against quantum adversaries, statistically hiding, and has $O(k)$ bits of output entropy,
• the pseudorandom generators $PRG_{rte}$ and $PRG_{sk}$ have outputs computationally indistinguishable from random for any polynomial-time quantum adversary.

Proof. First, consider a signature scheme obtained by applying the optimizations from the previous section to the signature scheme from Figures 2 and 3. We will refer to it as the optimized scheme throughout this proof. We will show that this optimized scheme is PQ-EU-CMA secure if the underlying q2-IDS satisfies the conditions from Theorem 4.3. We will assume some additional properties of the IDS that represent a special case of q2-IDS schemes. The optimized scheme is characterized by the following optimizations.

• We fix the challenge space $ChS_1$ to t elements. Note that this change does not influence the security arguments at all.
• We assume that the underlying IDS of the optimized scheme is such that the second response does not depend on the first challenge and response, but only on the second challenge and the initial output by the prover $P_0$. In this case, in the signature generation, instead of calculating the second response as $resp_2^{(i,j,ch_2)} \leftarrow P_2(state^{(i,j)}, ch_2)$ for every $i \in \{1, \dots, t\}$, we calculate it once per round as $resp_2^{(j,ch_2)} \leftarrow P_2(state^{(j)}, ch_2)$. The full transcript is now $\{\mathrm{trans}_{\mathrm{full}}(j)\}_{j=1}^{r}$, with $\mathrm{trans}_{\mathrm{full}}(j) = \{com^{(j)}, ch_1^{(i,j)}, cr_1^{(i,j)}, (cr_2^{(j,0)}, cr_2^{(j,1)})\}_{i=1}^{t}$. The reduced transcript $\mathrm{trans}_{\mathrm{red}}(j)$ that is included in the signature is influenced similarly.
• Assuming that the underlying IDS is such that $com = (c_0, c_1)$, we omit from the signature the commitment $c_{ch_2}$ that the verifier recomputes, depending on the challenge $ch_2$. This alters the content of $\mathrm{trans}_{\mathrm{red}}(j)$ but not of $\mathrm{trans}_{\mathrm{full}}(j)$.⁶

6 If we had used the optimization from [SSH11] directly, this would have influenced the full transcript as well.

It is straightforward to verify that Lemma 3.1 (and in the QROM, Lemma 3.5) still holds for the optimized scheme. We only removed duplicate information from the signature, which does not affect the reduction's ability to extract. The exact probability of abort in Lemma 3.2 might change as we remove some values from $\{\mathrm{trans}_{\mathrm{full}}(j)\}_{j=1}^{r}$, possibly reducing its entropy. However, the given bound only depends on the amount of entropy coming from the commitments, which remains unchanged for the optimized scheme. Therefore, the claims of Lemma 3.2 remain valid.

Next, recall (cf. Theorem 4.3) that, under the assumption of intractability of the MQ problem on average, and assuming computationally binding and statistically hiding properties of Com, the 5-pass IDS from [SSH11] is PQ-KOW, is HVZK, and has a PQ-q2-Extractor. Furthermore, it satisfies the particular properties that the optimized scheme above requires. Thus, applying the optimized transform to the Sakumoto-Shirai-Hiwatari 5-pass IDS scheme, we obtain a PQ-EU-CMA secure signature (cf. Theorems 3.3 and 3.6).

To complete the proof, we note that using a standard game hopping argument, it is straightforward to show that the success probability of a PQ-EU-CMA adversary against SOFIA is negligibly close to the success probability of a PQ-EU-CMA adversary against the optimized scheme from the Sakumoto-Shirai-Hiwatari 5-pass IDS scheme when the outputs of $PRG_{rte}$ and $PRG_{sk}$ are post-quantum computationally indistinguishable from random. □

6 SOFIA-4-128

Having described the scheme in general terms, we now provide concrete parameters that fix a specific instance, which we will refer to as SOFIA-4-128. We present an optimized software implementation and list the results, in particular in comparison to MQDSS-31-64. All benchmarks mentioned below were obtained on a single core of an Intel Core i7-4770K (Haswell) CPU, following the standard practice of disabling TurboBoost and hyper-threading. We compiled the code using gcc 4.9.2-10, with -O3 and -march=native.

6.1 Parameters

The previous section assumed a number of parameters and functions. Notably, we must define $\mathbb{F}_q$, the field in which we perform the arithmetic, and n and m, the number of variables and equations defining the MQ problem. The number of rounds r is determined by t (i.e. the number of responses $resp_1^{(i,j)}$, bounded by q in SOFIA) and the targeted security level, using Theorem 3.6.

For MQDSS-31-64, the choice of $\mathbb{F}_{31}$ was motivated by the fact that it brings the soundness error close to 1/2 while providing convenient characteristics for fast implementation [CHR+16]. For SOFIA-4-128, our primary focus is on optimizing for signature size while still maintaining efficiency. To do so, we compute signature sizes for a wide range of candidate systems⁷, and investigate several in more detail by implementing and measuring the resulting MQ evaluation functions. Notably, we look at the results of $\mathcal{MQ}(128, 128, \mathbb{F}_4)$, $\mathcal{MQ}(96, 96, \mathbb{F}_7)$ and $\mathcal{MQ}(72, 72, \mathbb{F}_{16})$, and compare to $\mathcal{MQ}(64, 64, \mathbb{F}_{31})$ from [CHR+16]. Of these, $\mathcal{MQ}(128, 128, \mathbb{F}_4)$ is the decisive winner, resulting in the smallest⁸ signatures while still providing decent performance. See Table 2 on page 28 for benchmarks of single evaluation functions and the related signature sizes. Note that, as the number of rounds r does not depend on the choice of $\mathbb{F}_q$ but merely on t, the signing time scales proportionally⁹.

7 A calculation script is included in the software package.

Parameters for $\mathcal{MQ}(m, n, \mathbb{F}_q)$. A straightforward method for solving systems of m quadratic equations in n variables over $\mathbb{F}_q$ is by performing exhaustive search on all possible $q^n$ values for the variables, and testing whether they satisfy the system. Currently, [BCC+10] provide the fastest enumeration algorithm for systems over $\mathbb{F}_2$, needing $4 \log n \cdot 2^n$ operations¹⁰.

In addition, there exist algebraic techniques that analyze the properties of the ideal generated by the given polynomials. The most important are the algorithms from the F4/F5 family [Fau99,Fau02,BFS15,BFP12], and the variants of the XL algorithm [CKPS00,Die04,YC05,YC04]. Although different in description, the two families bear many similarities, which results in similar complexity [YCY13].

In the Boolean case, today's state of the art algorithms from the aforementioned families, BooleanSolve [BFSS13] and FXL [YC04], provide improvement over exhaustive search. In particular, the improvement is visible for polynomials with more than 200 variables (respectively, $\Theta(2^{0.792n})$ and $\Theta(2^{0.875n})$ complexity when m = n). A very recent algorithm, the Crossbred algorithm [JV17] over $\mathbb{F}_2$, is likely to further improve the asymptotic complexity, as the authors report that it passes the exhaustive search barrier already for 37 Boolean variables. Unfortunately, at the time of writing, the preprint does not include a detailed complexity analysis that we can use¹¹.

Interestingly, the current best known algorithms, BooleanSolve [BFSS13], FXL [YC04,YC05], the Crossbred algorithm [JV17] and the Hybrid approach [BFP12], all combine algebraic techniques with exhaustive search. This immediately allows for improvement in their quantum version using Grover's quantum search algorithm [Gro96], provided the cost of implementing them on a quantum computer does not diminish the gain from Grover. Unfortunately, the current literature lacks analysis of the quantum version of these algorithms.

To the best of our knowledge, a detailed analysis has only been done for pure enumeration using Grover's search [WS16], showing that a system of n equations in n variables can be solved using $\Theta(n \cdot 2^{n/2})$ operations.

8 This is also the minimum amongst all candidate systems we looked at; it is not merely beating $\mathbb{F}_7$ and $\mathbb{F}_{16}$, but also less common options such as $\mathbb{F}_5$ and $\mathbb{F}_8$.
9 This disregards the overhead for the various hash computations, which can be considered to scale similarly.
10 The techniques from [BCC+10] can be extended to other fields $\mathbb{F}_q$ with the same expected complexity of $\Theta(\log_q n \cdot q^n)$.
11 The authors of [JV17] confirmed that the complexity analysis is an ongoing work, and will soon be made public.

In what follows we will analyze the complexity of the quantum versions of the Hybrid approach and BooleanSolve, and use the results as a reference point in choosing parameters for $\mathcal{MQ}(m, n, \mathbb{F}_q)$ that provide 128 bit post-quantum security¹².

First of all, we note that m = n is the best choice in terms of hardness of the MQ problem. Indeed, if there are more equations than variables, they provide more information about the solution, so finding one becomes easier. On the other hand, if there are more variables than equations, we can simply fix n − m variables and reduce the problem to a smaller one, with m variables.

Let $\mathbf{F} = (f_1, \dots, f_m)$, $f_i \in \mathbb{F}_q[x_1, \dots, x_n]$. Without loss of generality, the equation system that we want to solve is $\mathbf{F}(\mathbf{x}) = 0$. The main complexity in both the Hybrid approach and BooleanSolve comes from performing linear algebra on a Macaulay matrix $Mac_D(\mathbf{F})$ of degree D (with rows formed by the coefficients of the monomials of $u f_i$ of maximal degree D). The degree D should be big enough so that a Gröbner basis of the ideal generated by the polynomials can be obtained by performing linear algebra on the Macaulay matrix. The smallest such D is called the degree of regularity $D_{reg}$, and for semi-regular systems (which is a very plausible assumption for randomly generated polynomials) it is given by $D_{reg}(n, m) = 1 + \deg(HS_q(t))$, where

$$HS_q(t) = \left[\frac{(1 - t^2)^m}{(1 - t)^n}\right]_+ \text{ for } q > 2, \quad \text{and} \quad HS_2(t) = \left[\frac{(1 + t)^n}{(1 + t^2)^m}\right]_+,$$

and the + subscript denotes that the series has been truncated before the first non-positive coefficient.
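The definition of $D_{reg}$ for $q > 2$ can be evaluated directly by expanding the power series with integer arithmetic. The following is a minimal sketch under that reading of the formula: since the series is truncated before the first non-positive coefficient, $1 + \deg(HS_q(t))$ equals the index of that coefficient. The function names are hypothetical, the case $q = 2$ is not covered, and the sketch only handles $m \ge n$, where the series is a polynomial.

```python
def hilbert_series_coeffs(n, m, max_deg):
    """Coefficients of (1 - t^2)^m / (1 - t)^n up to degree max_deg."""
    coeffs = [0] * (max_deg + 1)
    coeffs[0] = 1
    for _ in range(m):
        # multiply by (1 - t^2); go from high to low degree to update in place
        for d in range(max_deg, 1, -1):
            coeffs[d] -= coeffs[d - 2]
    for _ in range(n):
        # divide by (1 - t), i.e. take prefix sums of the coefficients
        for d in range(1, max_deg + 1):
            coeffs[d] += coeffs[d - 1]
    return coeffs


def degree_of_regularity(n, m):
    """Index of the first non-positive coefficient, for q > 2 and m >= n."""
    if m < n:
        raise ValueError("this sketch only handles m >= n")
    # for m >= n the series is a polynomial of degree 2*m - n
    coeffs = hilbert_series_coeffs(n, m, 2 * m + 2)
    for degree, c in enumerate(coeffs):
        if c <= 0:
            return degree
    return None
```

For $m = n$ the series reduces to $(1 + t)^n$, so the sketch returns $n + 1$; fixing $k$ variables, as described next, lowers this value because the system then has $n$ equations in only $n - k$ variables.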

Since $D_{reg}$ determines the size of the matrix, and thus the complexity of the linear algebra performed on it, both algorithms first fix k among the n variables in order to reduce the complexity of the costliest computational step. Now the linear algebra step is instead performed on $Mac_{D_{reg}}(\tilde{\mathbf{F}})$, where $\tilde{\mathbf{F}} = (\tilde{f}_1, \dots, \tilde{f}_m)$ and $\tilde{f}_i(x_1, \dots, x_{n-k}) = f_i(x_1, \dots, x_{n-k}, a_{n-k+1}, \dots, a_n)$, for some $(a_{n-k+1}, \dots, a_n) \in \mathbb{F}_q^k$.

Given the linear algebra constant $2 \le \omega \le 3$, the complexity of the Hybrid approach for solving systems of n equations in n variables over $\mathbb{F}_q$ is

$$C_{Hyb}(n, k) = Guess(q, k) \cdot C_{F5}(n - k, n), \qquad (6)$$

where

$$C_{F5}(n, m) = \Theta\left(\left(m \binom{n + D_{reg}(n, m) - 1}{D_{reg}(n, m)}\right)^{\omega}\right)$$

is the complexity of computing a Gröbner basis of a system of m equations in n variables, $m \ge n$, using the F5 algorithm [Fau02], $Guess(q, k) = \log_q(k)\, q^k$ in the classical case and $Guess(q, k) = \log_q(k)\, q^{k/2}$ in the quantum case using Grover's algorithm¹³. Also, k is an optimization parameter chosen such that the overall complexity is minimized.

12 A similar analysis can be made using the algorithms from the XL family.
13 We assume that in the quantum case the factor $\log_q(k)$ is the same as in the classical case, as opposed to the factor k from [WS16].

In the case of $\mathbb{F}_2$, the BooleanSolve algorithm performs better than the Hybrid approach. It reduces the problem to testing the consistency of a related linear system

$$\mathbf{u} \cdot Mac_{D_{reg}}(\tilde{\mathbf{F}}) = (0, \dots, 0, 1). \qquad (7)$$

If the system is consistent, then the original system does not have a solution. This allows for pruning of all the inconsistent branches corresponding to some $\mathbf{a} \in \mathbb{F}_2^k$. A simple exhaustive search is then performed on the remaining branches. It can be shown that the running time of the algorithm is dominated by the first part of the algorithm in both the classical and the quantum version, although in the quantum case the difference is not as big as a consequence of the reduced complexity of the first part. Therefore, for simplicity, we omit the exhaustive search on the remaining branches from our analysis. The complexity of the BooleanSolve algorithm is given by

$$C_{Bool}(n, k) = Guess(2, k) \cdot C_{cons}(Mac_{D_{reg}}(\tilde{\mathbf{F}})), \qquad (8)$$

where $Guess(2, k)$ is defined the same as in the Hybrid approach, and

$$C_{cons}(Mac_{D_{reg}}(\tilde{\mathbf{F}})) = \Theta(N^2 \log^2 N \log\log N), \qquad N = \sum_{i=0}^{D_{reg}(n-k,n)} \binom{n}{i}$$

is the complexity of testing consistency of the matrix (7), using the sparse linear algebra algorithm from [GLS98].

Table 1 below provides estimates of the minimum requirements for 128 bit post-quantum security of $\mathcal{MQ}(n, n, \mathbb{F}_q)$ with regard to BooleanSolve and the Hybrid approach using Grover's search, as well as plain use of Grover's search. In the estimates we used $\omega = 2.3$, which is smaller than the best known value $\omega = 2.3728639$ [Gal14]. We provide the optimal number of fixed variables in brackets whenever this number does not equal the number of variables in the initial system. When the two are equal, the optimal strategy is to simply use Grover (fix all variables), which we denote with G. Note that since any system of n variables over $\mathbb{F}_{2^s}$ can be efficiently transformed into a system of $sn$ variables over $\mathbb{F}_2$, we have scaled the results for BooleanSolve for larger $\mathbb{F}_{2^s}$ accordingly.

As mentioned earlier, a new algebraic method for equations over $\mathbb{F}_2$, the Crossbred algorithm, was proposed very recently [JV17]. The main idea of this approach is to first perform some operations on the Macaulay matrix of degree

$D_{reg}(n - k, n)$ of the given system, and only afterwards fix variables. In particular, the algorithm first tries to find enough linearly independent elements in the kernel of a submatrix of $\mathrm{Mac}_{D_{reg}(n-k,n)}$, corresponding to monomials of specialized degree in the variables that will later remain in the system (i.e. will not be fixed). These can then be used to form new polynomials in the $n - k$ remaining variables of small total degree $d$, which, added to $\mathrm{Mac}_d(\tilde{F})$, will result in working with a much smaller Macaulay matrix. The advantage here comes from using sparse linear algebra algorithms on $\mathrm{Mac}_{D_{reg}(n-k,n)}$ for the first part and dense linear algebra only on the smaller Macaulay matrix in the second part (footnote 14). Since [JV17] does not contain a complexity analysis, we refrain from claiming exact security requirements based on the quantum version of the algorithm. Nevertheless, following the description of the algorithm we have estimated that a system of 256 variables over $\mathbb{F}_2$ (which applies to $MQ(128, 128, \mathbb{F}_4)$ or $MQ(64, 64, \mathbb{F}_{16})$) provides 128-bit security against the quantum version of the Crossbred algorithm. We will include a more detailed analysis for the quantum version once a classical complexity analysis is available.

| | $\mathbb{F}_2$ | $\mathbb{F}_3$ | $\mathbb{F}_4$ | $\mathbb{F}_5$ | $\mathbb{F}_7$ | $\mathbb{F}_8$ |
|---|---|---|---|---|---|---|
| BooleanSolve | 221 (200) | / | 111 | / | / | 56 |
| Hybrid | G | G | G | G | G | 84 (57) |
| Grover | 251 | 158 | 126 | 108 | 90 | 84 |

| | $\mathbb{F}_{11}$ | $\mathbb{F}_{13}$ | $\mathbb{F}_{16}$ | $\mathbb{F}_{17}$ | $\mathbb{F}_{31}$ | $\mathbb{F}_{32}$ |
|---|---|---|---|---|---|---|
| BooleanSolve | / | / | 28 | / | / | 14 |
| Hybrid | 77 (51) | 73 (43) | 69 (40) | 69 (40) | 61 (30) | 60 (21) |
| Grover | 73 | 68 | 63 | 62 | 51 | 51 |

Table 1. Lower bound on the number of variables n for 128 bit post-quantum security against the quantum versions of the Hybrid approach [BFP12] and BooleanSolve [BFSS13]. In brackets is the number of fixed variables. G denotes that the best strategy is to fix all variables, i.e. plain Grover search.
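To make the consistency-testing term of equation (8) more tangible, the following Python sketch evaluates $N$ and the base-2 logarithm of $N^2 \log^2 N \log\log N$. It is only an illustration under stated assumptions: the degree $D_{reg}(n-k,n)$ must be supplied by the caller (its computation is not shown here), the constant hidden by $\Theta$ is dropped, and the function names are hypothetical rather than taken from any reference implementation.

```python
from math import comb, log2


def macaulay_monomials(num_vars: int, degree: int) -> int:
    """N = sum_{i=0}^{degree} binom(num_vars, i), as in the formula for C_cons."""
    return sum(comb(num_vars, i) for i in range(degree + 1))


def log2_consistency_cost(num_vars: int, degree: int) -> float:
    """log2 of N^2 * log2(N)^2 * log2(log2(N)), ignoring the Theta constant.

    `degree` is an input: the caller is assumed to know D_reg(n-k, n).
    Requires N > 2 so that every logarithm is defined.
    """
    n_cols = macaulay_monomials(num_vars, degree)
    if n_cols <= 2:
        raise ValueError("N must exceed 2 for the cost expression to be defined")
    log_n = log2(n_cols)
    return 2 * log_n + 2 * log2(log_n) + log2(log2(log_n))
```

Multiplying the result by the guessing cost (adding its logarithm) would give an estimate in the spirit of equation (8); that factor is left out because its definition belongs to the Hybrid approach analysis.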

Number of rounds r and blinded responses per round t. The choice of t provides a trade-off between size and speed; a larger t implies a smaller error, resulting in fewer rounds, but more included blinded responses per round (the additional computational cost of which is insignificant). Interestingly, t = 3 provides the minimal size, followed by t = 4, and, only then, t = 2. The decrease in rounds quickly diminishes, however, making t = 3 and t = 4 the most attractive choices (footnote 15).

Given the above considerations (and with a prospect of some convenience of implementation), we select the parameters n = m = 128, q = 4 and t = 3. For a security level of 128 bits post-quantum security, it follows from Theorem 3.6 that we must select r such that $2^{-(r \log \frac{2t}{t+1})/2} < 2^{-128}$. This implies r = 438.
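The bound on r can be checked numerically. The short Python sketch below solves the inequality $2^{-(r \log \frac{2t}{t+1})/2} < 2^{-128}$ for the smallest integer r, taking the logarithm to be base 2; the function name `min_rounds` is an illustrative choice and not part of the scheme.

```python
from math import floor, log2


def min_rounds(t: int, security_bits: int = 128) -> int:
    """Smallest integer r with 2^(-(r * log2(2t / (t + 1))) / 2) < 2^(-security_bits)."""
    per_round = log2(2 * t / (t + 1)) / 2  # exponent contributed by each round
    # The inequality is strict: r * per_round must exceed security_bits.
    return floor(security_bits / per_round) + 1


for t in (3, 4):
    print(f"t = {t}: r = {min_rounds(t)}")
```

For t = 3 and t = 4 this reproduces the values r = 438 and r = 378 that appear in Table 2.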

14 An external specialization of variables is also possible, but this does not bring any improvement classically, and we have verified for some parameters that this is the case also quantumly.

15 Note that t is naturally bounded by q, making these the only options for $\mathbb{F}_4$.

| Parameter set | cycles^b | size (t = 3, r = 438) | size (t = 4, r = 378) |
|---|---|---|---|
| $MQ(128, 128, \mathbb{F}_4)$ | 21 412 | 123.22 KiB | 129.97 KiB |
| $MQ(96, 96, \mathbb{F}_7)$ | 36 501 | 129.00 KiB^a | 136.20 KiB^a |
| $MQ(72, 72, \mathbb{F}_{16})$ | 25 014 | 136.91 KiB | 144.73 KiB |
| $MQ(64, 64, \mathbb{F}_{31})$ | 6 616^c | 149.34 KiB^a | 158.15 KiB^a |

a This assumes optimally packing the elements of $\mathbb{F}_q$, which may not be practical.
b This is the cost of a single evaluation. In practice, batching provides a speedup. See Section 6.2.
c As reported in [CHR+16].

Table 2. Benchmarks for varying parameter sets

Required functions. Before being able to implement the scheme, we must still define several of the functions we have assumed to exist. In particular, we need: a string commitment function Com; pseudo-random generators PRG_sk and PRG_rte; extendable output functions XOF_F and XOF_IB; permutation functions H_1 and H_2; and a cryptographic hash function H.

We instantiate the extendable output functions, the string commitment functions, the permutations and the hash function with SHAKE-128 [BDPV11]. This applies trivially, except for XOF_IB, of which the output domain is a series of ternary and binary indices (as t = 3). We resolve this by applying rejection sampling to the output of SHAKE-128 to derive the ternary challenges. For XOF_F, we achieve a significant speedup by dividing its output in four separate pieces, generating each of them with a domain-separated call to cSHAKE-128 [BDPV11]. For the application of H to the public key, the message and the transcript, we note that collision resilience is achieved when the message is absorbed into the SHAKE-128 state after the transcript, as absorbing the transcript randomizes the state sufficiently to prevent internal collisions.

For PRG_rte and PRG_sk, we also instantiate with SHAKE-128. We note that different systems can make different choices here without breaking compatibility; this is a local decision for the signer. In fact, for the optimized Haswell implementation discussed in the next section, we instantiate PRG_rte with AES in counter mode, using the AES-NI instruction set.
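The rejection sampling used to obtain ternary challenges from SHAKE-128 output can be sketched as follows. The text does not specify how bits are grouped, so this Python sketch assumes that every output byte is split into four 2-bit candidates and that the candidate value 3 is discarded; the function name and this encoding are assumptions made for illustration only.

```python
import hashlib


def ternary_challenges(seed: bytes, count: int) -> list[int]:
    """Derive `count` values in {0, 1, 2} from SHAKE-128 by rejection sampling.

    Assumed encoding (not fixed by the text): four 2-bit candidates per byte,
    read from the least significant bits upwards; candidate 3 is rejected.
    """
    nbytes = max(1, count)  # initial guess, doubled until enough samples survive
    while True:
        # SHAKE output is prefix-consistent, so a longer request extends the stream.
        stream = hashlib.shake_128(seed).digest(nbytes)
        accepted = []
        for byte in stream:
            for shift in (0, 2, 4, 6):
                candidate = (byte >> shift) & 3
                if candidate < 3:
                    accepted.append(candidate)
        if len(accepted) >= count:
            return accepted[:count]
        nbytes *= 2
```

With r = 438 rounds, `ternary_challenges(seed, 438)` would yield one ternary index per round. On average a quarter of the candidates are rejected, so requesting one byte per challenge is more than sufficient in almost all cases.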

6.2 Implementation

As part of this work, we provide a C reference implementation and an implementation optimized for AVX2. The focus of this section is the evaluation of the MQ function, given the abovementioned parameter set $MQ(128, 128, \mathbb{F}_4)$. The rest of the scheme depends on fairly straightforward operations (such as multiplying vectors of $\mathbb{F}_4$ elements by a constant scalar) and applications of existing public domain implementations of AES-CTR and SHAKE-128 (although we do briefly discuss parallel evaluation of the latter).

Before discussing the computation, we note that the chosen parameters lend themselves to a very natural data representation. Throughout the entire scheme, we interpret 256 bit vectors as vectors of 128 bitsliced $\mathbb{F}_4$ elements: the low 128 bits make up the lower bits of the two-bit elements, and the high 128 bits make up the higher bits of each element. This makes operations such as scalar multiplication in C code very convenient, as this can be easily expressed as logical operations on bit sequences, but provides an even more important benefit for AVX2 assembly code. Notably, one vector of $\mathbb{F}_4$ elements now fits exactly into one 256 bit vector register, with the lower bits fitting into the low lane and the higher bits into the high lane. Whereas other parameter sets could result in having to consider crossing the lanes, in this case the separation is very natural.

When it comes to sampling elements in $\mathbb{F}_4$ from the output of SHAKE-128 or AES-CTR, we can freely interpret the random data to be in bitsliced representation. Similarly, we simply write the elements to the signature in this representation, as signature verification enjoys precisely the same benefits. All in all, there is no point throughout the entire scheme at which we need to actually perform a bitslicing operation.
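The bitsliced layout described above can be modelled with Python integers standing in for 256 bit registers. The sketch assumes $\mathbb{F}_4 = \mathbb{F}_2[x]/(x^2 + x + 1)$ with elements encoded as two bits (high bit for the coefficient of x), which is consistent with the multiplication formulas given later in this section; the helper names `pack`, `unpack` and `scalar_mul` are hypothetical.

```python
N = 128                  # number of F_4 elements per vector
MASK = (1 << N) - 1      # selects one 128-bit lane


def pack(elems):
    """Pack 128 values in {0, 1, 2, 3} into a 256-bit integer [high | low]."""
    low = sum((e & 1) << i for i, e in enumerate(elems))
    high = sum((e >> 1) << i for i, e in enumerate(elems))
    return (high << N) | low


def unpack(vec):
    """Inverse of pack: recover the list of two-bit elements."""
    low, high = vec & MASK, vec >> N
    return [(((high >> i) & 1) << 1) | ((low >> i) & 1) for i in range(N)]


def scalar_mul(vec, c):
    """Multiply every element by the constant c, using only lane-wide logic."""
    low, high = vec & MASK, vec >> N
    if c == 0:
        return 0
    if c == 1:
        return vec
    if c == 2:                             # times x: (h, l) -> (h ^ l, h)
        return ((high ^ low) << N) | high
    return (low << N) | (high ^ low)       # times x + 1: (h, l) -> (l, h ^ l)
```

Note that `pack` and `unpack` exist here only to inspect values; as the text points out, the scheme itself never needs to convert to or from the bitsliced form.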

Evaluating MQ. For a given input x, we split the evaluation into two phases: computing all quadratic monomial terms $x_i x_j$, and composing them together to evaluate the quadratic polynomials.

Computing the quadratic terms. To perform the first step, we use a similar approach as was used in [CHR+16]. It can be viewed as a combination of their approach for $\mathbb{F}_2$ and for $\mathbb{F}_{31}$, as we now operate on a single register that contains all input elements, but effectively view each lane as 16 separate single-byte registers. Using vpshufb instructions, it is very cheap to arrange the elements in such a way that all multiplications can be performed using only a minimal number of rotations. We used the script from [CHR+16] as a starting point to generate the arrangement.

A bitsliced multiplication in $\mathbb{F}_4$ can be efficiently performed using only a few logical operations. The inputs to these multiplications are a register containing x and a register containing some rotated arrangement of x. However, some of these operations require the low and high lanes of the vector registers to interact, which is typically costly. As x is constant, we can speed up these multiplications by rewriting them as shown below, and presetting two registers that contain $[x_{high} | x_{high}]$ and $[x_{high} \oplus x_{low} | x_{low}]$, respectively. Note that none of these operations is performed on single bits, but rather on 128 bit vector lanes. The multiplication of 128 elements then requires only two vpand instructions, one vperm instruction, and a vpxor to combine the results.

$$c_{high} = (a_{high} \wedge (b_{low} \oplus b_{high})) \oplus (a_{low} \wedge b_{high})$$

$$c_{low} = (a_{low} \wedge b_{low}) \oplus (a_{high} \wedge b_{high})$$
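The two formulas and the preset registers can be checked with a small model in which Python integers play the role of 256 bit registers. This is a sketch under that modelling assumption, not the AVX2 code itself: the lane swap below stands in for the permute instruction, and all function names are hypothetical.

```python
N = 128
MASK = (1 << N) - 1


def lanes(vec):
    """Split a 256-bit value into its (high, low) 128-bit lanes."""
    return vec >> N, vec & MASK


def join(high, low):
    """Assemble a 256-bit value [high | low] from two lanes."""
    return (high << N) | low


def preset(x):
    """Precompute [x_high | x_high] and [x_high ^ x_low | x_low] for a fixed x."""
    x_high, x_low = lanes(x)
    return join(x_high, x_high), join(x_high ^ x_low, x_low)


def mul_by_preset(a, presets):
    """Element-wise F_4 product a * x: two ANDs, one lane swap and one XOR."""
    both_high, mixed = presets
    a_high, a_low = lanes(a)
    swapped = join(a_low, a_high)              # models the lane permutation
    # High lane: a_high & (x_high ^ x_low)  ^  a_low & x_high   = c_high
    # Low lane:  a_low & x_low              ^  a_high & x_high  = c_low
    return (a & mixed) ^ (swapped & both_high)
```

Because the presets depend only on x, they are computed once and reused for every rotated arrangement that x is multiplied with.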

Multiplying, and accumulating results. We focus on two approaches to perform the second and most costly part of the evaluation, in which all of the above monomials need to be multiplied with coefficients from F and summed into the output vector. They are best described as iterating either 'horizontally' or 'vertically' through the required multiplications.

For the vertical approach, we iterate over all registers of monomials (footnote 16), broadcasting each of the monomials to each of the 128 possible positions by shuffling bytewise and applying eight bit-rotations before multiplying with a sequence of coefficients from F and adding into an accumulator. Alternatively, for the horizontal approach, we iterate over the output elements in the outermost loop. For each output element, we iterate over all registers of monomials, perform the multiplications and horizontally sum the results by making use of the popcnt instruction. Intuitively, the latter approach may seem like more work (notably because it requires more loads from memory), but in practice it turns out to be faster for our parameters. The main reason for this is that by maintaining multiple separate accumulators, loaded monomials can be re-used while still maintaining chains of logic operations that operate on independent results (as the accumulators are only joined together later), which leads to highly efficient scheduling.

For both cases, a lot of computational effort can be saved by delaying part of the multiplication in $\mathbb{F}_4$. This is done by computing both $[\hat{x}_{high} \wedge f_{high} | \hat{x}_{low} \wedge f_{low}]$ and $[\hat{x}_{low} \wedge f_{high} | \hat{x}_{high} \wedge f_{low}]$, with $f$ from F and $\hat{x}$ a sequence of quadratic monomials, and accumulating these results separately. After accumulating, these values can be used as inputs to finish all multiplications and reductions at once, eliminating the majority of the logic operations that would otherwise be performed for each of the 65 multiplications.
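The delayed multiplication combined with a horizontal sum can be illustrated as follows. This Python sketch computes a single output element from lists of bitsliced registers, accumulating the four AND products separately and reducing only once at the end. It models popcnt by counting set bits, and the function names and register layout (integers as [high | low]) are assumptions for illustration.

```python
N = 128
MASK = (1 << N) - 1


def parity(v: int) -> int:
    """Horizontal sum over F_2 of all bits of v (popcount modulo 2)."""
    return bin(v).count("1") & 1


def output_element(monomials, coeffs):
    """Sum of m * f over all F_4 elements held in the given 256-bit registers.

    The four partial AND products are accumulated separately; the F_4
    reduction is applied only once, after all registers are processed.
    """
    acc_hh = acc_ll = acc_lh = acc_hl = 0
    for m, f in zip(monomials, coeffs):
        m_high, m_low = m >> N, m & MASK
        f_high, f_low = f >> N, f & MASK
        acc_hh ^= m_high & f_high
        acc_ll ^= m_low & f_low
        acc_lh ^= m_low & f_high
        acc_hl ^= m_high & f_low
    # Finish the multiplication: c_high = hh + lh + hl, c_low = ll + hh.
    high = parity(acc_hh ^ acc_lh ^ acc_hl)
    low = parity(acc_ll ^ acc_hh)
    return (high << 1) | low
```

The postponement works because addition in $\mathbb{F}_4$ is a bitwise XOR, so the final combination of the accumulators commutes with the summation over registers.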

Evaluating MQ instances in parallel. While repeatedly iterating over the monomials can be done quickly (especially since they easily fit in L1 data cache), each of the coefficients in F is used only once, making loading these a considerable burden. As F is constant for each evaluation, however, a significant speedup can be achieved by processing multiple instances of the MQ function in parallel. In particular the vertical approach lends itself nicely to this, as its critical section leaves some registers unused. For the horizontal approach, it becomes a trade-off with registers that are used for parallel accumulators, but, as loading of F is an important bottleneck, there is still significant gain from parallelizing multiple instances.

For SOFIA-4-128, the signer evaluates r = 438 instances of F and its polar form G on completely independent inputs, which can be trivially batched. The verifier performs 438 evaluations of F, and on average half as many evaluations of G, which can also be batched together.

16 There are $\frac{n \cdot (n+1)}{2} = 8256$ such monomials, which results in $64\frac{1}{2}$ 256-bit sequences. We round up to 65 by zeroing out half of the high and half of the low lane. To still get results that are compatible with implementations on other platforms, we create similar gaps in the stream of random values used to construct F, ensuring that the same random elements are still used for the same coefficients.

Parallel SHAKE-128 and cSHAKE-128. As will be apparent in the next section, many cycles are spent computing the Keccak permutation (as part of either SHAKE-128 or cSHAKE-128). Some of the main culprits (footnote 17) are the commitments, the blinding of responses and the expansion of F. While the Keccak permutation does not lend itself nicely to internal parallelization, it is trivially possible to compute four instances in parallel in a 256 bit vector register. This allows us to substantially speed up the computations of the many commits and blindings, as these are all fully independent and can be grouped together across rounds. Deriving F can be parallelized by splitting it in four domain-separated cSHAKE-128 calls operating on the same seed, as was alluded to in Section 6.1.
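The idea of splitting the expansion of F into four independent streams can be sketched as below. Python's hashlib does not provide cSHAKE-128, so this sketch emulates domain separation by appending a one-byte index to the seed before calling SHAKE-128. That substitution is an assumption made purely for illustration and does not produce the same output as cSHAKE-128; the function names are likewise hypothetical.

```python
import hashlib


def quarter_stream(seed: bytes, index: int, nbytes: int) -> bytes:
    """One of four independent output streams derived from the same seed.

    Assumption: domain separation is emulated with a one-byte suffix on the
    seed. This is NOT cSHAKE-128, which hashlib does not offer.
    """
    return hashlib.shake_128(seed + bytes([index])).digest(nbytes)


def expand_system(seed: bytes, total_bytes: int) -> bytes:
    """Concatenate four domain-separated streams of equal length."""
    if total_bytes % 4:
        raise ValueError("total_bytes must be divisible by 4")
    quarter = total_bytes // 4
    # The four calls share no state, so they could run side by side,
    # e.g. as four Keccak instances in one vector register.
    return b"".join(quarter_stream(seed, i, quarter) for i in range(4))
```

Since no call depends on the output of another, the four Keccak states can be advanced in lockstep, which is what makes the four-way vectorized permutation applicable.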

Benchmarks. All things considered, evaluating the MQ function horizontally in batches of three turns out to give the fastest results, measuring in at 17 558 cycles per evaluation. Evaluating vertically is slightly more expensive, at 18 598 cycles per evaluation. The cost for evaluating the polar form is not significantly different, differing by approximately a hundred cycles from regular MQ. Generating the monomial terms $x_i y_j + x_j y_i$ is somewhat more costly, but this is countered by the fact that the linear terms cancel out.

To generate a signature, we spend 21 305 472 cycles. Of this, 15 420 520 cycles can be attributed to evaluating MQ, and 43 954 to AES-CTR. The remainder is almost entirely accounted for by the various calls to SHAKE-128 and cSHAKE-128 for the commitments, blindings and randomness expansion. In particular, expanding F costs 1 120 782 cycles. Note, however, that if many signatures are to be generated, this expansion only needs to be done once and F can be kept in memory across subsequent signatures.

Verification costs 15 492 686 cycles, and key generation costs 1 157 112. Note that key generation is dominated by expansion of F. Systems that require frequent re-keying may opt to reconsider the diversification of F, and instead re-use the same random system in a similar way as batch signature generation can. This would asymptotically reduce the average cost of key generation all the way down to essentially a single MQ evaluation, which costs only 21 412 cycles.

The keys of SOFIA-4-128 are very small by nature, with the secret key consisting of only a single 32 byte seed, and the 64 byte public key being made up of a seed and a single MQ output.

The natural candidate for comparison is MQDSS-31-64 [CHR+16], the de facto predecessor of this work. While MQDSS has a proof in the ROM, we focus further comparison on post-quantum schemes that have proofs in the QROM or standard model. See Table 3, below; as mentioned in the introduction, we include SPHINCS-256 [BHH+15] (standard model), Picnic-10-38 [CDG+17] (QROM) and TESLA-768 [ABBD15] (QROM).

17 Hashing the transcript in parallel is a bit more involved. While it could be done using constructions like ParallelHash [BDPV11], we omit this for the sake of simplicity.

| Scheme | \|σ\| (bytes) | \|pk\| (bytes) | \|sk\| (bytes) | keygen (cycles) | signing (cycles) | verification (cycles) |
|---|---|---|---|---|---|---|
| SOFIA-4-128^a | 126 176 | 64 | 32 | 1 157 112 | 21 305 472 | 15 492 686 |
| MQDSS-31-64^a | 40 952 | 72 | 64 | 1 826 612 | 8 510 616 | 5 752 612 |
| SPHINCS-256^b | 41 000 | 1056 | 1088 | 3 237 260 | 51 636 372 | 1 451 004 |
| Picnic-10-38^c,d | 195 458 | 64 | 32 | ≈36 000 | ≈112 716k | ≈58 680 000 |
| TESLA-768^a | 2 336 | 4128k | 3216k | ≈172 800m^e | 2 232 906 | 863 790 |

a Benchmarked on an Intel Core-i7-4770K (Haswell).
b Benchmarked on an Intel Xeon E3-1275 (Haswell).
c Benchmarked on an Intel Core-i7-4790 (Haswell).
d Converted from milliseconds at 3.6GHz.
e Using a measurement from [CDG+17], as [ABBD15] does not report key generation.

Table 3. Benchmark overview

References

ABBD15. Erdem Alkim, Nina Bindel, Johannes Buchmann, and Özgür Dagdelen. TESLA: tightly-secure efficient signatures from standard lattices. Cryptology ePrint Archive, Report 2015/755, 2015. http://eprint.iacr.org/2015/755/, version 20161117:055833.

ARU14. Andris Ambainis, Ansis Rosmanis, and Dominique Unruh. Quantum attacks on classical proof systems: The hardness of quantum rewinding. In FOCS 2014, pages 474-483, 2014. http://eprint.iacr.org/2014/296.

BCC+10. Charles Bouillaguet, Hsieh-Chung Chen, Chen-Mou Cheng, Tung Chou, Ruben Niederhagen, Adi Shamir, and Bo-Yin Yang. Fast exhaustive search for polynomial systems in $\mathbb{F}_2$. In Stefan Mangard and François-Xavier Standaert, editors, Cryptographic Hardware and Embedded Systems - CHES 2010, volume 6225 of LNCS, pages 203-218. Springer, 2010. https://eprint.iacr.org/2010/313.

BDPV11. Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. The Keccak reference, 2011. http://keccak.noekeon.org/.

BFP12. Luk Bettale, Jean-Charles Faugère, and Ludovic Perret. Solving polynomial systems over finite fields: improved analysis of the hybrid approach. In Joris van der Hoeven and Mark van Hoeij, editors, Proceedings of the 37th International Symposium on Symbolic and Algebraic Computation - ISSAC '12, pages 67-74. ACM, 2012. https://hal.inria.fr/hal-00776070/document.

BFS15. Magali Bardet, Jean-Charles Faugère, and Bruno Salvy. On the complexity of the F5 Gröbner basis algorithm. Journal of Symbolic Computation, 70:49-70, 2015. https://hal.inria.fr/hal-01064519/document.

BFSS13. Magali Bardet, Jean-Charles Faugère, Bruno Salvy, and Pierre-Jean Spaenlehauer. On the complexity of solving quadratic boolean systems. Journal of Complexity, 29(1):53-75, 2013. www-polsys.lip6.fr/~jcf/Papers/BFSS12.pdf.

BHH+15. Daniel J. Bernstein, Daira Hopwood, Andreas Hülsing, Tanja Lange, Ruben Niederhagen, Louiza Papachristodoulou, Michael Schneider, Peter Schwabe, and Zooko Wilcox-O'Hearn. SPHINCS: practical stateless hash-based signatures. In Marc Fischlin and Elisabeth Oswald, editors, Advances in Cryptology - EUROCRYPT 2015, volume 9056 of LNCS, pages 368-397. Springer, 2015. http://cryptojedi.org/papers/#sphincs.

CDG+17. Melissa Chase, David Derler, Steven Goldfeder, Claudio Orlandi, Sebastian Ramacher, Christian Rechberger, Daniel Slamanig, and Greg Zaverucha. Post-quantum zero-knowledge and signatures from symmetric-key primitives. Cryptology ePrint Archive, Report 2017/279, 2017. http://eprint.iacr.org/2017/279/.

CHR+16. Ming-Shing Chen, Andreas Hülsing, Joost Rijneveld, Simona Samardjiska, and Peter Schwabe. From 5-pass MQ-based identification to MQ-based signatures. In Jung Hee Cheon and Tsuyoshi Takagi, editors, Advances in Cryptology - ASIACRYPT 2016, volume 10032 of LNCS, pages 135-165. Springer, 2016. http://eprint.iacr.org/2016/708.

CKPS00. Nicolas Courtois, Alexander Klimov, Jacques Patarin, and Adi Shamir. Efficient algorithms for solving overdefined systems of multivariate polynomial equations. In Bart Preneel, editor, Advances in Cryptology - EUROCRYPT 2000, volume 1807 of LNCS, pages 392-407. Springer, 2000. www.iacr.org/archive/eurocrypt2000/1807/18070398-new.pdf.

Cou01. Nicolas T. Courtois. Efficient zero-knowledge authentication based on a linear algebra problem MinRank. In Colin Boyd, editor, Advances in Cryptology - ASIACRYPT 2001, volume 2248 of LNCS, pages 402-421. Springer, 2001. https://eprint.iacr.org/2001/058.

DFG13. Özgür Dagdelen, Marc Fischlin, and Tommaso Gagliardoni. The Fiat-Shamir transformation in a quantum world. In Kazue Sako and Palash Sarkar, editors, Advances in Cryptology - ASIACRYPT 2013, volume 8270 of LNCS, pages 62-81. Springer, 2013. https://eprint.iacr.org/2013/245.

DGV+16. Özgür Dagdelen, David Galindo, Pascal Véron, Sidi Mohamed El Yousfi Alaoui, and Pierre-Louis Cayrel. Extended security arguments for signature schemes. Designs, Codes and Cryptography, 78(2):441-461, 2016.

DHYC06. Jintai Ding, Lei Hu, Bo-Yin Yang, and Jiun-Ming Chen. Note on design criteria for rainbow-type multivariates. Cryptology ePrint Archive, Report 2006/307, 2006. https://eprint.iacr.org/2006/307.

Die04. Claus Diem. The XL-algorithm and a conjecture from commutative algebra. In Pil Joong Lee, editor, Advances in Cryptology - ASIACRYPT 2004, volume 3329 of LNCS, pages 323-337. Springer, 2004. https://www.iacr.org/archive/asiacrypt2004/33290320/33290320.pdf.

Fau99. Jean-Charles Faugère. A new efficient algorithm for computing Gröbner bases (F4). Journal of Pure and Applied Algebra, 139:61-88, 1999. http://www-polsys.lip6.fr/~jcf/Papers/F99a.pdf.

Fau02. Jean-Charles Faugère. A new efficient algorithm for computing Gröbner bases without reduction to zero (F5). In Proceedings of the 2002 International Symposium on Symbolic and Algebraic Computation - ISSAC '02, pages 75-83. ACM, 2002. http://www-polsys.lip6.fr/~jcf/Papers/F02a.pdf.

Fis05. Marc Fischlin. Communication-efficient non-interactive proofs of knowledge with online extractors. In Victor Shoup, editor, Advances in Cryptology - CRYPTO 2005, volume 3621 of LNCS, pages 152-168. Springer, 2005. https://www.iacr.org/archive/crypto2005/36210148/36210148.pdf.

FLP08. Jean-Charles Faugère, Françoise Levy-dit-Vehel, and Ludovic Perret. Cryptanalysis of MinRank. In David Wagner, editor, Advances in Cryptology - CRYPTO 2008, volume 5157 of LNCS, pages 280-296. Springer, 2008. http://www-polsys.lip6.fr/~jcf/Papers/crypto08.pdf.

Gal14. François Le Gall. Powers of tensors and fast matrix multiplication. In Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation - ISSAC '14, pages 296-303. ACM, 2014. https://arxiv.org/pdf/1401.7714.pdf.

GJ79. Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company, 1979.

GLS98. Mark Giesbrecht, A. Lobo, and B. David Saunders. Certifying inconsistency of sparse linear systems. In Proceedings of the 1998 International Symposium on Symbolic and Algebraic Computation - ISSAC '98, pages 113-119, 1998. https://cs.uwaterloo.ca/~mwg/files/incons.pdf.

GMR88. Shafi Goldwasser, Silvio Micali, and Ronald L. Rivest. A digital signature scheme secure against adaptive chosen-message attacks. SIAM Journal on Computing, 17(2):281-308, 1988. https://people.csail.mit.edu/silvio/Selected%20Scientific%20Papers/Digital%20Signatures/A_Digital_Signature_Scheme_Secure_Against_Adaptive_Chosen-Message_Attack.pdf.

Gro96. Lov K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the Twenty-eighth Annual ACM Symposium on Theory of Computing - STOC '96, pages 212-219. ACM, 1996. https://arxiv.org/pdf/quant-ph/9605043v3.pdf.

JV17. Antoine Joux and Vanessa Vitse. A crossbred algorithm for solving boolean polynomial systems. Cryptology ePrint Archive, Report 2017/372, 2017. http://eprint.iacr.org/2017/372.

Pat96. Jacques Patarin. Hidden field equations (HFE) and isomorphisms of polynomials (IP): Two new families of asymmetric algorithms. In Ueli Maurer, editor, Advances in Cryptology - EUROCRYPT '96, volume 1070 of LNCS, pages 33-48. Springer, 1996. http://www.minrank.org/hfe.pdf.

PS96. David Pointcheval and Jacques Stern. Security proofs for signature schemes. In Ueli Maurer, editor, Advances in Cryptology - EUROCRYPT '96, volume 1070 of LNCS, pages 387-398. Springer, 1996. https://www.di.ens.fr/~pointche/Documents/Papers/1996_eurocrypt.pdf.

SSH11. Koichi Sakumoto, Taizo Shirai, and Harunaga Hiwatari. Public-key identification schemes based on multivariate quadratic polynomials. In Phillip Rogaway, editor, Advances in Cryptology - CRYPTO 2011, volume 6841 of LNCS, pages 706-723. Springer, 2011. https://www.iacr.org/archive/crypto2011/68410703/68410703.pdf.

Tho13. Enrico Thomae. About the Security of Multivariate Quadratic Public Key Schemes. PhD thesis, Ruhr-University Bochum, Germany, 2013. https://www.iacr.org/phds/116_EnricoThomae_AboutSecurityMultivariateQuadr.pdf.

Unr15. Dominique Unruh. Non-interactive zero-knowledge proofs in the quantum random oracle model. In Elisabeth Oswald and Marc Fischlin, editors, Advances in Cryptology - EUROCRYPT 2015, volume 9056 of LNCS, pages 755-784. Springer, 2015. http://eprint.iacr.org/2014/587.

WS16. Bas Westerbaan and Peter Schwabe. Solving binary MQ with Grover's algorithm. In Claude Carlet, Anwar Hasan, and Vishal Saraswat, editors, Security, Privacy, and Advanced Cryptography Engineering, volume 10076 of LNCS. Springer, 2016. Document ID: 40eb0e1841618b99ae343a073d6c1e, http://cryptojedi.org/papers/#mqgrover.

YC04. Bo-Yin Yang and Jiun-Ming Chen. Theoretical analysis of XL over small fields. In Huaxiong Wang, Josef Pieprzyk, and Vijay Varadharajan, editors, Information Security and Privacy, volume 3108 of LNCS, pages 277-288. Springer, 2004. http://www.iis.sinica.edu.tw/papers/byyang/2386-F.pdf.

YC05. Bo-Yin Yang and Jiun-Ming Chen. All in the XL family: Theory and practice. In Choon sik Park and Seongtaek Chee, editors, Information Security and Cryptology - ICISC 2004, pages 67-86. Springer, 2005. http://by.iis.sinica.edu.tw/by-publ/recent/xxl.pdf.

YCY13. Jenny Yuan-Chun Yeh, Chen-Mou Cheng, and Bo-Yin Yang. Operating Degrees for XL vs. F4/F5 for Generic MQ with Number of Equations Linear in That of Variables. In Marc Fischlin and Stefan Katzenbeisser, editors, Number Theory and Cryptography: Papers in Honor of Johannes Buchmann on the Occasion of His 60th Birthday, pages 19-33. Springer, 2013. http://www.iis.sinica.edu.tw/papers/byyang/17377-F.pdf.

Zha16. Mark Zhandry. A note on quantum-secure PRPs. Cryptology ePrint Archive, Report 2016/1076, 2016. http://eprint.iacr.org/2016/1076.
