arXiv:2106.11710v1 [cs.CR] 22 Jun 2021 rirr intrs h xedtr ofreasignature a forge forge private to or the classic expenditure messages forge with The encrypted two all signatures. public decipher schemes arbitrary the therefor asymmetric to and access keys these needs of only computer keys respectively quantum RSA Curve The and Elliptic are. (ECDSA) schemes Algorithm common Signature most Digital our or like curves i elliptic , security including forge logarithms, whose can discrete messages on that based encrypted algorithm decrypt Shors and becau on signatures is based That privacy.” [1] algorithms and on are as security aftermath there of Or profound terms in have privacy. authentica data to message and sensory going integrity, security are data nonrepudiation, their of and of “principles it, loss puts a fearing are zto [ ization time, polynomial broken. in schemes these public hand a rendering from other therefor key the private on a clas derive computer by can quantum unbreakable sufficient essentially A being computers. therefor length, key ubrter)lk ocle atc rbesta antb cannot that problems called so like theory) number ers promising for good most very but the signifiant signing one for on purposes. worse in verification focus FALCON differ makes further which that aspect Dilithium, to in and come different FALCON Io schemes in will are curren usage We which for schemes evaluate suited cases. will explainin better some are i After implementations others why available NIST. to on In compared the aspects nodes. basis some by i IoT currently underlying standardization for schemes the of signature as some well process comparing t as the am is devices i This paper interface all use. this much human currently pretty we for break schemes to case able signature be and then will that size sufficient intrs oe osritDevices Constraint Power Signatures, rsuperposition or hti h hr r urnl cee ne standard- under schemes currently are there why is That h unu eouini oig ihqatmcomput- quantum With coming. is revolution quantum The Terms Index Abstract 1 3 2 1 erfrt lsi fsmtigi o ietylvrgn e leveraging directly not is something if classic circumstances to normal refer under we secure considered is that III-A section compare unu-eitn iia intrsshmsfor schemes signatures digital Quantum-resistant ntewyt e oeadmr ucinl people functional, more and more get to way the on ? ,[]ta r ae nohrhr rbes(not problems hard other on based are that [2] ], Qatmcmuesaeo h oio ogtt a to get to horizon the on are computers —Quantum 3 Itre fTig,QatmRssac,Secure Resistance, Quantum Things, of —Internet optr ie xoetal ihincreased with exponentially rises computers .I I. NTRODUCTION o-oe IoT low-power [email protected] 1 opttoa Science Computational ntanglement st ansHattenbach Hannes ri Universit Freie use- T prime eln DE Berlin, tion, the sic tly he se n e g s 2 ntebs intr otnesfrteITs a:FALCON focussing far: be so IoT will the i performance for Dilithium. finally contenders additional and signature And best with IoT. the the starting on in signature IoT, relevant on the metrics focus based then in hash will and of lattice schemes chapter kinds: on, kind following used The what rely most two schemes. on algorithms the overview on QR focus an with problems give next will mathematical for the I candidates underlying performance those. In current giving for things. over and metrics overview of Algorithms (QR) an internet Resistant give the Quantum signatu will on and i as encryption section well current as break to standards ability detail more their little a about being computing, quantum to regarding background IoT the for possibilities metrics. performance usage their their and schemes with devices power. power computing low and resource storage of severe few consist the very IoT is end-user-devices The though to Smartphones constraint-ness. compared signing and Anoth IoT is Laptops (IoT). the less Things like in the of up Internet non coming the important problem in as data being sensor but sensitive user, end the privacy our save to computers security. quantum and by forged easily that n aa ..ol h ih niiscndcytti data. this decrypt can entities right the privacy/confidentialit only s.t. ensures data, which any encryption, is group opofitgiyo uhniiyo n aa signature A data. any parts, of two authenticity of consists or next scheme the integrity for are proof used schemes to Signature be signatures. can : itself of pillar Hashing knowled password. any we that infer to about length, ability the stor variable without like passwords applications matching possibly different of variety of a as for be important set that implies, to const bigger refer name of a commonly value the smaller from as a about create is not, length we groups shall If these reversible. that of efficiently functions, first way The groups. one three into divided Cryptography A. hrfria on ogv ml nrdcinand introduction small a give to going signature am existing i evaluate Therefor to going am i paper this In for mind to coming directly not cases use the of One osl paigtemi oi fcytgah a be can cryptography of topic main the speaking Loosely at ¨ I B II. ACKGROUND hashing signing rporpi ahn is hashing Cryptographic . and verifying h last The . n and ing These used of y ant ed ge re er TABLE I RSA encryption. Similar to that other common schemes like IETF IOT CLASSES ECDSA can also be broken be slightly modified versions of Class RAM Flash Shors Algorithm. C0 << 10 KiB << 100 KiB C1 10 KiB 100 KiB B. QR Algorithms C2 50 KiB 250 KiB The two main algorithms with practical use cases that have a great speed-up compared to classical solutions, are the already schemes consist of the two parts encryption and decryption. introduced algorithm by Shor and an algorithm by Grover Additionally to those parts for signatures as well as encryption that can essentially reverses one-way functions by creating there needs to be process of key-generation. We also differ- a superposition over all possible inputs, flipping all inputs entiate between symmetric and asymmetric schemes. The first with the wanted output (without knowing the inputs) and then one has a different private and public key while the latter uses flipping this state about its mean and repeating this process a the same for de- and encryption. More details about which of lot of times [5]. While Shors algorithm provides exponential those schemes will be more or less endangered by quantum speed-up, Grovers algorithm only provides quadratic speed- computing are in section III-A and III-B. up. It was also shown, that something similar to grovers In general we denote a signature scheme as the group of three algorithm but with exponential speedup is impossible [6]. algorithms {GEN, SIGN, VER} and a encryption scheme as Which implies that hashing as well as symmetric cryptography {GEN, ENC, DEC}. stays relatively secure. The quadratic speedup provided by quantum computers can easily be mitigated by doubling the B. Internet of Things key length. On the other hand though, classical asymmetric The IoT consists of a growing number (currently over cryptography is endangered by shors algorithm and quantum 3 billion [3]) of devices of all sorts, having in common, computers. that they communicate with each other and the environment But not all asymmetric cryptography schemes are equally rather than directly with humans. Those devices range from affected. There are different proposals, both for QR encryption automatic lights and smart home devices to tiny interconnected and for QR signature schemes. They all do have in common sensors in automatic fabrication. A common characteristic though, that their security is not absolutely mathematically though is, that most of these devices have limited processing proven, but based upon assumptions. We therefor need to power, flash storage and random access memory (RAM). A consider a few measures that make schemes more or less popular example for hobbyist IoT devices is the ESP32 from secure. Espressif Microsystems. They offer multiple modules with up to 240Mhz clock on the 32bit IC, up to 16MiB Flash Storage C. Performance Metrics and 320KiB RAM. Which is more than other comparable Some performance metrics exist in QR schemes as well as devices but way less then a lower spec modern smartphone, in classic schemes. with 10 times the frequency, 4GB of RAM and 64GB of Key length and message length [7] are the storage. more obvious ones. The computing time also comes to mind as Since the IoT consists of very different types of constrained a performance metric. Here you need to differentiate between nodes the IETF introduced different classes on which to key generation, which is less important, since it should only classify IoT nodes, those can be seen in table II-B occur rarely, and signing as well as signature verification 4. III. QUANTUM RESISTANT SECURITY Primarily in signatures another metric arises: how often can a private key be used before it needs to be switched out A. for another one, because the signature leaked information of In contrast to classical computers, where information is the key. This is not particularly relevant in most cases, as processed in discrete states, a quantum computer leverages methods can be used to create long term procedures from quantum mechanics to operate on so-called qubits - quan- short term procedures (those where a key can rarely, if ever, be tum objects that can be in superposition or entangled with recycled). But it is relevant in the case of the IoT, since those each other. Opening a new kind of computing. One of the methods require extra memory which is sparse in IoT-devices. implications of that is, that it is now possible to factor large Additionally they tend to make the signatures themselves numbers in polynomial time using an algorithm developed by longer, which also is not preferable in the IoT. [7] Shor [4]. This algorithm uses a so-called Quantum-Fourier- Additionally to more traditional performance metrics we Transform (QFT) to (probabilistically) get the frequencies somehow need to measure the security of given schemes of which a given function output occurs. That can be used against an attack by a quantum computer. together with euclids algorithm of finding the greatest common The main measurement we can take to measure quantum devisor to derive the prime factors. Prior to to quantum resistance is to count how many quantum gates or quantum computers this was considered a hard problem that could only bits are needed. [8] be computed in exponential time and was therefor considered practically impossible and was used as the basis-problem for 4as well as its counterparts de- and encryption TABLE II The main problem is, that the ‘shuffling’ is realized through QR SECURITYCLASSESANDTHEIRTRADITIONALCOUNTERPARTSAS k ∗ n matrices that are generally big (millions of bits) and CLASSIFIED BY THE NIST therefor unfeasible for constrained IoT devices. Class security comparable to The second one will be discussed in greater detail in section 1 AES-128 III-E, since it is also used as one of the main problems for 2 SHA256 3 AES-192 signature schemes. Those schemes are called lattice based 4 SHA384 and also have some implementation with the most famous for 5 AES-256 encryption being NTRUEncrypt5. E. Signatures But unfortunately this is rather hard and there is currently The other pillar of cryptography, signature schemes, is what no standard benchmark to measure quantum resistance [9], we will focus on in greater detail. As well as in encryption nevertheless the National Institute of Science and Technology schemes we can differentiate between different underlying (NIST) created a standard that describes how secure a scheme mathematical problems. Those are pretty much the same as in is against a quantum computer by classifying it within 5 encryption schemes: Hash based, Lattice based, Multivariate classes that can be determined with grovers algorithm [10], polynomial based, Code based, Super-singular isogeny based. [11]. Those classes can be seen in table III-C. [1] Rainbow is the only implementation of a QR signature that D. Encryption is a current contender for standardization that is neither lattice QR encryption schemes can be based upon a multitude of nor hash based. And as already mentioned in the previous different mathematical problems thought to be hard even for section it is multivariate based. quantum computers. Since this sparsity of alternatives we we also focus on hash Sadly, being thought of as secure mostly is not based upon and lattice based signatures in this paper. actual rigorous proof but assumptions. Therefor one problem 1) Hash Based Signatures (HBS): Hash based signatures that was used as a asymmetric encryption basis, the knapsack have their security based upon the hardness of reversing hashes problem, was broken soon after its introduction by so-called or one-way functions. The most easy one is the Lamport one approximate lattice reduction attacks [7]. time signature (OTS). [7] That signature has essentially two Later iterations which include “conjugacy search problem private keys for every bit in the message digest. Let n ∈ N be and related problems in braid groups, and the problem of the bit-length of the digest, then the secret key would be: solving multivariate systems of polynomials in finite fields” [7] kpriv = (S0,0,S0,1)||(S1,0,S1,1)|| ... ||(Sn,0,Sn,1) have been under active research with the latter being broken after standardization and implementation [7]. The advantage of those schemes is, that the private keys do Nevertheless there is an implementation of a multivariate- not have to have any special characteristic that could be taken based scheme, called Rainbow, that is also currently a con- advantage of by a quantum computer to break anything. They tender for standardization. But as an encryption scheme its not do have to be high entropy though, to not be easily forgeable very suitable since the process of decrypting in multivariate with even a classic computer. These secrets are then hashed based schemes requires some guessing work [9] which is not (with a one-way function h) and published as the public key desirable in IoT environments. An additional problem that k = would make rainbow unsuitable for IoT use-cases is its big pub 22kB public key. While private keys can rather easily be (h(S0,0),h(S0,1))||(h(S1,0),h(S1,1))|| ... ||(h(Sn,0),h(Sn,1)) shrunk in key-generation through help of a pseudo-random- When a message is signed the signer just publishes the generator, thats generally not the case for large public keys. secret corresponding to every bit of the digest (Sk,b with On the other hand we have a problem that is not yet b being the bit-value in the k-th position ob the digest) s.t. very well researched and also not much in use, but has one everyone can hash that secret and see that this private keys implementation called SIKE. This problem is based upon are indeed the ones corresponding to the public key and the supersingular elliptic Curves, which are itself a modification of correct bit-value of the digest. Signing as well as verifying elliptic curve problems that should make it quantum resistant. are therefor rather easy operations with one disadvantage: the But since this topic is not well-studied yet we are mostly left keys and the signature are super big. But there are some with schemes based upon the following two thought to be rather easy improvements for this problem e.g. one could quantum-hard problems. only sign the zeros, therefor reducing key sizes by a factor The first one is so called code-based cryptography. Here the of 2 as well as average signature sizes. To mitigate an attack decoder has to correct errors of data that has been seemingly that can flip digest-zeros to ones a checksum is added (that randomly shuffled, but only those with access to the private key can easily ‘unshuffle’ the data to then use special error 5NTRU is short for N-th Degree Truncated Polynomial Ring. Also NTRUEncrypt might pop up the most, but the two main contenders for actual correction codes. The most researched one is called McEliece future use are FALCON (which also uses NTRU) and Dilithium, which uses and even has quite fast (100µs) and secure implementations. other ring lattices can only be decreased by flipping a one to zero, which is Algorithm 2 SIGN impossible if you do not know the pre-image (private key) 1) input message digest Mi of that location). Another improvement often wrongly 6 cited 2) sign as described for Lamport or Winternitz schemes as the successor to the Merkle OTS is the Winternitz scheme (or other OTS schemes that generate the public key by (WOTS), which builds upon the same idea but uses a different hashing the private key): Si = Sign(Mi) (greater) basis b, which inturn makes the signing and verifying 3) publish Si together with all hashes h needed to itera- more computational expensive by needing to apply hashes tively generate the root hash kpub. These are m hashes. b times. The great advantage though is that the keys and b signatures also decrease by a factor of /2. This can be a great Algorithm 3 VER advantage for IoT applications, since time is not as valuable 1) input signature − and digest as storage. Therefor WOTS is actually used in practice, for (Si, [hj ,hi+2,j+2,...,hi k]) M and already known multiple use public key k = example as a signature on the IOTA distributed ledger. [13] i pub h − − A directly visible disadvantage of those schemes (as the 0 (n 1) 2) hash S to generate ki name implies) is that they can trivially only be used one time, i pub 3) hash ki = h together with h to generate h since most of the private key gets public with the signature. pub i j ij 4) hash the value from previous step together with the next A trivial countermeasure would be to append the next public hash given by the signature keys to the message and sign them as well, but thats not a 5) repeat step 4) until the root hash k should be found good idea in most use cases, since you might as well just use pub (thats m steps in total) return True if they are equal and symmetric cryptography which is also considered as quantum False otherwise resistant as hashes. Another idea would be to just publish a whole lot of private keys that can than be used one by one. But thats not a super brilliant idea since signer as well as verifier need to store all these keys which is specially infeasible in IoT On the other side (the signer) we still need to store all n scenarios (that have very constrained storage). Schemes that private keys and calculate m hashes every time we want to can be used multiple times are smartly called multiple time sign anything. The second step can be skipped by also storing signatures (MTS). the hashes instead of calculating them, which additionally increases the storage needed by a factor of . But thats A smarter approach than simply publishing n public keys 2 infeasible for most storage constrained devices. Therefor an and storing n private keys was proposed by Merkle [12]. His approach uses so called Merkle hash trees to make it possible additional tweak was applied to this algorithm: Instead of randomly generating each private key and storing it, we use to have a very small public key that can still verify n signatures on the tradeoff that every signature now increases by a factor a Pseudo-Random-Generator (PRG) together with a seed and a counter to be able to generate every private key on the fly. of log(n). The idea is as shown in algorithms {1, 2, 3}. We can then iteratively generate our Merkle tree and drop every node we already used to calculate the next parent hash Algorithm 1 GEN without exceeding our RAM to generate the root hash to then 1) generate m random values, those are the private n = 2 publish it as the public multi time key. For signing we can then keys. i create our private key again with the help of the PRG and 2) for every private key kpriv generate a one-time public i i again calculate all needed hashes iteratively the same way. key kpub = h(kpriv) (until here it is similar to a trivial But it is rather computationally expensive to recalculate the MTS) whole tree on every signature. That is why we should cache 3) hash every two ‘neighboring’ keys i , j together in kpub kpub as many in-between hashes as possible since every already n i j pairs to generate /2 new hashes hij = h(kpub, kpub) stored hash reduces the computational expenses by a factor of 4) hash those in pairs for the next iteration and repeat until 2. The verification stays the same. the hash-tree is complete and we only have one root This scheme is know as the eXtended Merkle Signature hash denoted as kpub Scheme (XMSS) which also has some further variants and 5) publish kpub that can now be used to verify n signatures developments. [3] Another disadvantage of schemes as described is the so- This is already very useful for IoT actors that only need to called statefulness, which means that the signer cannot just verify, less so for sensors that still need to store all n private sign any message with a key after being reset, since some kind keys. The computational cost is higher, caused by calculating of state is needed that would be lost in a reset. [1] Besides all those hashes but thats commonly worth the tradeoff. that and even more impactful the verifier in an MSS needs to manage which keys/ part of the tree have already bin used, 6the Merkle OTS has two parts, one that is similar to the Lamport scheme since reusing keys is imperative to the schemes security. (which was then improved by Winternitz) and one that uses Merkle Hash 2) Lattice Based Signatures (LBS): In a stateless scheme on Trees, which most of the literature refers to as the Merkle Signature, but is not a predecessor of the winternitz scheme which does not use Merkle Trees the other hand, all you need to sign a message is a static private [12] key. That brings us to the other kind of signature schemes, the ones that are more similar to traditional asymmetric crypto TABLE III in the sense that they rely on a not so trivial mathematical MEASUREMENT RESULTS OF COMPARING BLISS ON M4 WITH SPHINCS ON INTEL XEON ASOF [1] problem that is not easily algorithmically solvable. But instead of prime factorization or elliptic curve calculation, this one Metric SPHINCS results BLISS results seems to be hard to solve, even by a quantum computer. Signature clock-cycles 50 million 5.9 million Verification clock-cycles 1.6 million 1 million The problem used for most of these schemes are based upon Public 1KB 7KB lattices. Private key size 1KB 2KB A lattice in this case is a high-dimensional grid with only Signature size 41KB 960 bytes integer values. Or to be more precise: “An n-dimensional lattice is the set of vectors that can be expressed as the sum of integer multiples of a specific set of n vectors, collectively although it got a major update and has quite a few variants. The called the basis of the lattice—note that there are an infinite most prominent LBS scheme was called BLISS. Suhail et al. number of different bases that will all generate the same then compared these two implementation and measured their lattice” [7] To put it mathematically we can denote a lattice performance. BLISS (the LBS) was evaluated on a common IoT processor with ARMs M4 architecture while SPHINCS L as L = {P ai ∗ bi : ai ∈ Z} with b0,...,bn being arbitrary basis vectors. The mathematical problems that these schemes (the HBS) was evaluated on a intel XEON server processor. are based upon are the shortest vector problem (SVP) where Still BLISS performed considerably better with exception of a very short vector between to points need to be found or the key sizes. Results of their measurements are shown in table the Closest Vector Problem (CVP), where a lattice vector III. needs to be found that is closest to a given arbitrary point. We can therefor compare and somewhat conclude the pros The directly arising problem though is, that to get reasonable and cons of LBS compared to HBS. First of all it is to say, security the basis (which serves as a private key) of the lattice that HBS is already very well studied and bases its security needs to be in the range of megabits, which again is not upon already well established and praxis-tested Problems ideal for our use-cases. That is why researchers developed the (hashing), while lattice based Security is still quite new and in NTRU crypto system, that introduces certain symmetries to the active research. It bases its security upon the CVP (or SVP) lattice structure s.t. the key sizes can be much smaller while which itself is not studied that well, and a few vulnerabilities lowering the security only slightly. [7], [9] These new schemes have already been shown [1]. Another comparison, less about are not only resistant to quantum attacks but also improve the underlying mathematical problems, but more from an efficiency compared to traditional cryptography by having applicational standpoint, is also from interest: statefulness. We speed improvement by a factor of 10-100. [1] Sadly these already know what stateful means, but Suhail et al. summarizes lattice structures where vulnerable through lattice reduction it rather well: “stateful scheme necessitates techniques to Chosen Attacks (CCA) in the case of the maintenance of the updated nonrepeated secret key upon encryption schemes. But that was fixed with the introduction each signature generation process. It is essential to keep track of a special padding scheme that made these attacks impossible of nonrepeated key pairs, failing which will result in the but also increased the key-lengths. [7] In the case of signature degradation of the security of the cryptographic scheme” We schemes (like NTRUSign) the problem is even more severe. can already see that this would be a problem in many use- The signature works by first mapping the message to a vector cases. There are HBS schemes, like the above SPHINCS, that and then signing by solving the CVP for this vector. The found a way to to make themselves stateless, but that comes problem is, that this procedure leaks information about the with its own downsides, like in this case greatly increasing private key s.t. it was shown to be practically broken after only the key-generation expenses (compare table III) , which is 7 around 400 signatures. To mitigate that issue the signer does caused by using keys in a random order therefor making BDS not give the actual closest lattice vector, but a lattice vector optimization no longer applicable. that is close enough by a certain measure, but not necessary the It can therefor be concluded that stateful schemes are great closest. Therefor the leaked information is nearly neglectable in processor power/time constrained use cases while stateless and the signature and private key secure for around a billion schemes trade computing power for storage used and are signatures, although it is still advised to change the private therefor better in memory constrained use-cases. [1] key after around 10 million signatures. That is totally feasible compared to some MTS mentioned before since in many cases IV. QR SIGNATURES IN IOT 10 million signatures is a whole lot. [3] We now have an overview of what kind of QR signature Actual lattice based implementation that were proposed in schemes exist with some focus on the two most promising 2017 are GPV, GLP and BLISS. But now there are newer and underlying mathematical structures, hashes and lattices. In the better implementations like FALCON that will be discussed following sections we will even more focus on what kind of in section V. actual implementation exist and which of them are feasible in 3) Comparison of HBS and LBS and Statefulness versus an constrained environment. statelessness: At the time [1] was written, one the most common HBS was SPHINCS, which has not changed much, 7compare [14] A. Performance Metrics in IoT TABLE IV COMPARISON OF STACK USAGES FOR DIFFERENT SCHEMES AND THEIR First of all we need to establish what metrics we need to OPERATIONS ( - MEANS THAT IT HAS NOT BEAN MEASURED WHILE / consider to evaluate which implementation are better suited MEANS NOT APPLICABLE) and which are ill suited. As mentioned in section we have Implementation name GEN (bytes) SIGN (bytes) VER (bytes) constraints in different fields. Those are primarily available Dilithium-3 [10] 50k 86k 54k energy therefor computing time and power as well as storage. 2021 Dilithium(dyn) [15] - 52k 36k The storage itself can also be split into read only memory 2021 Dilithium(sta) [15] / 8 35k 19k (ROM) and writeable memory (RAM). Most operating sys- qTESLA-1 [10] 22k 29k 23k qTESLA-3 [10] 43k 28k 45k tems (e.g. RIOT) split the available RAM again to expose Falcon-5 [10] 120k 120k 120k two kinds of memory structures: stack and heap. These are 2021 FALCON [15] - 42k 4.7k often relevant since the stack and heap are used for different purposes, but since those are resizable at build time we wont TABLE V focus on the differentiation between stack and heap. Another COMPARISONOFCLOCKCYCLESNEEDEDFORTHEOPERATIONSOF DIFFERENT IMPLEMENTATIONS, PERFORMED ON ARM M4 CHIP WHICH memory differentiation that might be more relevant for fu- WAS CLOCKED AT 168MHZ THEREFOR 10 MILLIONCLOCKCYCLES ture differentiation between stateful and stateless schemes is EQUAL ROUGHLY 60MS.EACHVALUEISAMILLIONCLOCKCYCLES between persistent and volatile memory. For stateful schemes Implementation name GEN SIGN VER we need persistent writeable memory and probably quite a lot Dilithium-3 [10] 2.3 8.3 2.3 of it while stateless schemes typically require more computing Dilithium-3 [11] 2.1 7.2 2.1 time for every signature. Which brings us to the main metrics: 2021 Dilithium(dyn) [15] - 29 3.4 2021 Dilithium(sta) [15] - 8 1.5 For each algorithm (GEN, SIGN,VER) we are interested in qTESLA-3 [10] 30 11 2.2 RAM/cache usage, execution time/energy consumption. The Falcon-5 [10] 365 165 1 significance of SIGN and VER performance is pretty obvious, 2021 Falcon [15] - 75 1 9 but the GEN is also not to be forgotten since we have shown that most schemes either need to switch the keys because they leak over time or are simply just MTS i.e. have to switch Unfortunately i could not find any reliable data about 10 keys after a fixed amount of signatures, this amount is also compiled code size (i.e. ROM usage) except for Dilithium influenced by available storage/time as we have shown in and FALCON which is shown in table VII which is an im- section III-E1 with the Merkle tree based signatures. We are portant parameter, but the official NIST competition reference also interested in different sizes that need to be either stored implementations [2] can give us a rough idea if we have a or transmitted via the network, these are signature as well look how big the source code files are. This is shown in table as private and public key sizes. In many schemes these sizes VI. have been unusually big [9]. The overall needed ROM for any After comparing all these signatures and keeping in mind algorithm is also from great interest since this is needed to that only Rainbow (implemented in python, not quite ready say whether a scheme is applicable for different classes of for IoT), Falcon and Dilithium became successful finalists IoT nodes as mentioned in section IV-A. [2], we can conclude that there are different schemes with Another thing regarding these classes that we might want to their own strengths and weaknesses. But in conclusion we can mention is, that there is no quantum computer with even nearly see that Dilithium shines on the signing side while FALCON sufficiently sized quantum registers yet and probably will not 9after optimizations these could be improved by further 43% [16] be in only a few years. Therefor it is probable that when these 10And i did not have enough resources left to measure it on my own, this quantum computers exist, the IoT and their hardware will also could be part of future comparison research have gotten much better. Still, the right signature schemes are needed, but it might be okay to be a few kilobytes larger than these classes. TABLE VI UNCOMPILEDCODESIZEOFREFERENCEIMPLEMENTATIONS (DIFFERENT B. comparison of different signatures SECURITYLEVELS (LIKE DILITHIUM-3 AND DILITHIUM-5) DO NOT HAVE ANY STATIC CHANGES REFLECTED IN CODE SIZE) Since we now know what to look out for we can now compare results of measurements from different QR signature Scheme Size implementations. In this comparison we already focused on FALCON 372KB Dilithium 270KB schemes that could be relevant for the IoT (i.e. skipped SPHINCS+ 180KB schemes that had way to much memory/cpu usage or are to old, e.g. in [9] the only schemes that had below 4kB keys TABLE VII where SIKE and Round5, both of which did not make it to FLASHSIZES) the NIST finalist). The measurements seen in tables IV and V were performed by [9], [11] and [15] Scheme Size FALCON 57KB 8in the case of static Dilithium the keys where precomputed and directly 2021 Dilithium (Dyn) 11KB stored in flash 2021 Dilithium (Sta) 26KB TABLE VIII We now want to solve the CVP for the vector v = {24.68}. COMPARISON OF KEY AND SIGNATURE SIZES If we now have a short basis (e.g. {2}) this algorithm for Scheme public key signature finding a close vector will represent v as 12.34 ∗{2} and then SPHINCS 1KB 43KB round to 12 which would give us 12 ∗{2} = {24} which is Dilithium-3 1.4KB 2.7KB indeed a close vector, although not the closest, which would FALCON-1 900B 690B FALCON-5 1.7KB 1.3KB have been {25} as we know. On the other hand, if we have ECDSA 64B 64B a long basis (e.g. {14}) this algorithm for finding a close vector will represent v as 1.72 ···∗{7} and then round to 2 which would give us 2 ∗{14} = {28} which is not a close performs by far the best when it comes to verification, which is vector. Of course this example would not make any sense in especially useful on actor nodes or other nodes that primarily a practical environment but it highlights why a short basis need to verify received data e.g. for updates. While Dilithium is good for solving CVP while a long one is not. But both would be better in use-cases where sensitive data has to be bases are equally useful in verifying that a vector is indeed a signed. But, all these schemes are stateless and both Dilithium lattice vector by simply checking if it can be represented by and FALCON (as well as qTesla) are lattice based schemes, an integer multiple of our basis. With these properties we can the last finalist, Rainbow is code based, and as we can also see now create a first simple lattice based signature scheme which from the enormous signature size SPHINCS is the only hash is given by {5, 6, 7}. based contender, but did not make it to the finalists. It was also observed that FALCON and Dilithium have the best security Algorithm 5 Lattice GEN against quantum computers and annealers to computational 1) Choose an arbitrary n-basis consisting of short vectors. expenses ratio [8]. We will therefor focus on these schemes This is the private key. in the following sections. 2) Choose any other basis consisting of long vectors for the same lattice. This is the public key. V. FALCON AND DILITHIUM Since we are now focussing on lattice based algorithms it would make sense to describe these again in further detail. Algorithm 6 Lattice SIGN We already know (compare section III-E2) that a lattice is 1) input message digest Mi and already known private key essentially a high dimensional grid (over a residue field). We being a lattice basis bpriv also know that the underlying problem used for the schemes 2) genrate random seed r security is some form of CVP (or SVP), but we will now 3) compute u = h(Mi, r) show how an actual cryptographic algorithm is based upon 4) map u to a n-dimensional vector vu these problems. 5) solve CVP for vu by taking advantage of our small basis The main idea is that these lattices have a nearly infinite bpriv (similar to 4, but more complicated in practice) and amount of bases, some of which make it easy to solve these assign the found vector to p problems, others make it hard. We can now construct a lattice 6) s = p − u; return signature (r, s) with two equal bases, one with rather short vectors, that make it easy to solve these problems and one with arbitrary long vectors, that make it hard. The reason behind a basis Algorithm 7 Lattice VER being better than another is complicated but in can be greatly illustrated with an algorithm that is used to solve this CVP 1) input signature (r, s) and digest Mi and already known which is called Babai’s rounding technique. [17] public key being another lattice basis bpub 2) abort if s is not small (the vector is not close to the Algorithm 4 Babei’s rounding technique given digest vector) 1) input lattice basis B and vector v for which we search 3) compute u = h(Mi, r) the closest (in practice a close enough) lattice vector 4) map u to a n-dimensional vector vu 2) transform basis of v into B (since v uses a standard 5) check whether vu is an actual lattice vector given by basis this is a trivial step, the problem is, that the your basis bpub and return result probability that our new representation uses integer values is neglectable) 3) round our new representation to the next integer values. A. FALCON Return these. This is however not implemented one-by-one into FAL- CON, because of two main issues. This algorithm can be seen in 4 and we can now see that The first one is that we need a high dimensional lattice (to using short vectors our rounding brings us to an actually close not be prone to LLL Lattice Reduction) and an arbitrary basis vector, while a long basis wont. Just imagine the most trivial lies in O(n2) which would result in unfeasible Megabyte-sized case, a one dimensional lattice that consist of all values in Z99. keys. The solution to this problem is to not use any arbitrary bases but introduce some kind of symmetry that makes the get close to this many signatures since signing with FALCON keys a lot smaller. This symmetry is introduced with the is way more energy intense then its competitor Dilithium. already mentioned NTRU-Lattice, which upscales only a few basis vectors to a high dimensional basis by applying specific B. Dilithium rotations. Dilithium’s main idea is similar to the one used by FAL- The second problem (compare section III-E2) is the con- CON, with the main difference being that it ditches the tinuous leakage of our private key from signature to signature gauss sampler. This sampler is the main reason FALCON that arises through usage of algorithm . That is because in this can achieve such small public keys and signatures, but it also algorithm the shape of the closest vector is directly related to makes the implementation a lot complexer and the signature the shape of the private basis. To mitigate this issue a random step much harder, as we have seen in the previous section. sampling is introduced. It is called the GPV sampler [18] and Since Dilithium ditches this step for another more trivial step, it works by not rounding to the closest integer but randomly it makes signing much faster and easier while also reducing rounds up or down. FALCON also improved this sampling the code sizes needed to store this scheme on any device, process by applying Fast Fourier Sampling which sped up making it possible to be only a small part even on a C0 the process by orders of magnitude but as later figured out device. However neither signing nor verifying would work on introduced a new vulnerability. a C1 device, while all would work on a C2 Device which The new problem lies in the use of the floating point unit is already good. Another advantage of ditching this sampling for the fast fourier calculations. This is not only a problem is that it enabled the Dilithium team to focus on making the since many IoT devices do not have a floating point unit, but whole scheme constant-time. This inturn makes timing attacks it also introduced a side-channel attack vulnerability. Floating as these that FALCON is vulnerable to impossible by design. Point Arithmetics (FPA) are known to be prone to side-channel [22] attacks however in this case it has a somewhat severe impact. VI. CONCLUSION FALCON was side channel attacked by measuring elec- To conclude we can see that there are two NIST finalist tromagnetic radiation created by the FFT inside the sampler with promising performance even for the IoT, some perfor- which uses the floating point unit. Such that the attacker could mance improvement for specific platforms are still possible reconstruct the private key after only 10k measurements [19]. (like Dilithiums avx-2 vector optimizations), but even without Therefor another fork af falcon, zalcon was introduced which those we have performance competitive with more traditional ditches the FPA (also good for other iot devices) and uses NTT schemes like ECDSA. Both of them are lattice based, since the instead of FTT [20]. However this NTT seems to be even more most used alternatives are hash based, which results in very vulnerable to this attack. [19] big signatures by design. A fix for this vulnerability though could be to introduce We had a look at multiple different schemes and their a masking s.t. power consumption etc. is more randomized. underlying mathematics and found that HBS are easier to This would be a rather standard approach to mitigate this FPA understand and proof their security with only assuming hard- Side-Channel Attack. ness of pre image attacks for hashes. But we also found that Besides this timing attack another team [21] created a fault their signatures get unfeasible big for IoT scenarios, therefor attack that could also retrieve the private key, but they propose needing lattice based signatures, that offer better signature a countermeasure for this is detecting the fault attack by length performance on the downside of building their security verifying the own signature, which can take further advantage on a less researched mathematical problem. of FALCONs fast ( 3ms) verification time. Still the two LBS schemes that were further researched, To conclude FALCON would be a good contender for class FALCON and Dilithium would both fit on C2 IoT nodes and C2 Devices and if only the verification part is needed it could perform reasonably well on them. Besides memory constraints also be used in C1 devices, which no other QR signature and FALCONs performance demands during signing due to scheme can do. It is not super feasible in real-time scenarios the hardness of gauss sampling, they both have performance for sensor networks though, since signing data can take about comparable or even better to more traditional non-QR schemes one whole second (compare table V). [23]. If we only want to verify messages FALCON would If we also take energy consumption into consideration (see be an even better performer, since most of its hard part can table IX) we can see that FALCON really shines in verifying be omitted, and signature sizes, public keys and verification messages and is even super efficient on the highest security hardness are very small. level. But the key generation takes a lot of energy which would But FALCON still faces some issues on the signing side, be bad in battery operated environments. But even with a small since it is quite demanding and not constant-time which inturn 600mAH Battery that would still be around 34 thousand key poses vulnerability against side channel attacks. Other attacks generations before the battery is drained and since a single like fault injection etc still have to be evaluated for con- key can safely be reused for 10 million signatures that would crete implementations on concrete devices. Also the quantum- result in 340 billion signatures made possible by pretty much resistance is based on the CVP, and quantum computing in the smallest available LiPO-cell. But a device would not even general is still an active field of research and rapidly evolving. TABLE IX DIRECT COMPARISON OF FALCON AND DILITHIUM ON M4. [11]

Scheme public key size signature size GEN Energy SIGN Energy VER Energy Dilithium-3 (L2) 1472B 2701B 2.3mJ 5mJ 1.7mJ FALCON-512 (L1) 897B 690B 118mJ 23mJ 0.3mJ FALCON-1025 (L5) 1793B 1330B 232mJ 35mJ 0.7mJ ECDSA 64B 64B 1.7mJ 4mJ 4mJ

But with sufficiently large quantum computers on the hori- [16] T. Oder, J. Speith, K. H¨oltgen, and T. G¨uneysu, “Towards practical zon we need to make sure to quantum proof our smart homes, microcontroller implementation of the signature scheme falcon,” in Post- , J. Ding and R. Steinwandt, Eds. Cham: campuses and cities before it is to late. And FALCON as well Springer International Publishing, 2019, pp. 65–80. as Dilithium seem to be reasonable contenders for that. [17] S. Galbraith, “Chapter 18 algorithms for the closest and shortest vector problems,” 2014. [18] P. T. P. Blog, “Falcon – a post-quantum signature scheme,” REFERENCES https://pqshield.com/falcon-a-post-quantum-signature-scheme/, accessed june 2021. [1] S. Suhail, R. Hussain, A. Khan, and C. S. Hong, “On the role of hash- [19] E. Karabulut and A. Aysu, “Falcon down: Breaking falcon post-quantum based signatures in quantum-safe internet of things: Current solutions signature scheme through side-channel attacks.” and future directions,” IEEE Internet of Things Journal, vol. 8, no. 1, [20] P.-A. Fouque, F. G´erard, M. Rossi, and Y. Yu, “Zalcon: An alternative pp. 1–17, 2021. fpa-free sampler for falcon.” [2] NIST, “Post-quantum cryptography - round 3 submissions,” [21] S. McCarthy., J. Howe., N. Smyth., S. Brannigan., and M. O’Neill., https://csrc.nist.gov/Projects/post-quantum-cryptography/round-3-submissions, “Bearz attack falcon: Implementation attacks with countermeasures on 2021. the falcon signature scheme,” in Proceedings of the 16th International [3] C. Cheng, R. Lu, A. Petzoldt, and T. Takagi, “Securing the internet of Joint Conference on e-Business and Telecommunications - SECRYPT,, things in a quantum world,” IEEE Communications Magazine, vol. 55, INSTICC. SciTePress, 2019, pp. 61–71. no. 2, pp. 116–120, 2017. [22] L. Ducas, E. Kiltz, T. Lepoint, V. Lyubashevsky, P. Schwabe, [4] P. W. Shor, “Polynomial-time algorithms for prime factorization G. Seiler, and D. Stehl´e, “Crystals-dilithium: A lattice-based digital and discrete logarithms on a quantum computer,” SIAM Review, signature scheme,” IACR Transactions on Cryptographic Hardware and vol. 41, no. 2, pp. 303–332, 1999. [Online]. Available: Embedded Systems, vol. 2018, no. 1, pp. 238–268, Feb. 2018. [Online]. https://doi.org/10.1137/S0036144598347011 Available: https://tches.iacr.org/index.php/TCHES/article/view/839 [5] L. K. Grover, “Quantum mechanics helps in searching for a needle in [23] P. Kampanakis and D. Sikeridis, “Two post-quantum signature use-cases: a haystack,” Phys. Rev. Lett., vol. 79, pp. 325–328, Jul 1997. [Online]. Non-issues, challenges and potential solutions,” 11 2019. Available: https://link.aps.org/doi/10.1103/PhysRevLett.79.325 [6] C. H. Bennett, E. Bernstein, G. Brassard, and U. Vazirani, “Strengths and weaknesses of quantum computing,” SIAM Journal on Computing, vol. 26, no. 5, pp. 1510–1523, 1997. [Online]. Available: https://doi.org/10.1137/S0097539796300933 [7] R. A. Perlner and D. A. Cooper, “Quantum resistant public key cryptography: A survey,” in Proceedings of the 8th Symposium on Identity and Trust on the Internet, ser. IDtrust ’09. New York, NY, USA: Association for Computing Machinery, 2009, p. 85–93. [Online]. Available: https://doi.org/10.1145/1527017.1527028 [8] M. Raavi, S. Wuthier, P. Chandramouli, Y. Balytskyi, X. Zhou, and S.-Y. Chang, “Security comparisons and performance analyses of post- quantum signature algorithms,” in International Conference on Applied Cryptography and Network Security. Springer, 2021, pp. 424–447. [9] T. M. Fern´andez-Caram´es, “From pre-quantum to post-quantum iot security: A survey on quantum-resistant for the internet of things,” IEEE Internet of Things Journal, vol. 7, no. 7, pp. 6457–6480, 2020. [10] A. Khalid, S. McCarthy, M. O’Neill, and W. Liu, “Lattice-based cryp- tography for iot in a quantum world: Are we ready?” in 2019 IEEE 8th International Workshop on Advances in Sensors and Interfaces (IWASI), 2019, pp. 194–199. [11] M. J. O. Saarinen, “Mobile energy requirements of the upcoming nist post-quantum cryptography standards,” in 2020 8th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud), 2020, pp. 23–30. [12] R. Merkle, “A certified digital signature,” vol. 435, 08 1989, pp. 218– 238. [13] I. Foundation, “Assuring authenticity in the tangle with signatures,” https://blog.iota.org/assuring-authenticity-in-the-tangle-with-signatures-791897d7b998/, 2019. [14] J. Buchmann, E. Dahmen, and M. Schneider, “Merkle tree traversal revisited,” in Post-Quantum Cryptography, J. Buchmann and J. Ding, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008, pp. 63–78. [15] G. Banegas, K. Zandberg, A. Herrmann, E. Baccelli, and B. Smith, “Quantum-resistant security for software updates on low-power networked embedded devices,” Cryptology ePrint Archive, Report 2021/781, 2021, https://eprint.iacr.org/2021/781.