
Analysis of Selected Modes for Authenticated Encryption

by

Hassan Musallam Ahmed Qahur Al Mahri

Bachelor of Engineering (Computer Systems and Networks) (Sultan Qaboos University) – 2007

Thesis submitted in fulfilment of the requirement for the degree of Doctor of Philosophy

School of Electrical Engineering and Computer Science
Science and Engineering Faculty
Queensland University of Technology

2018

Keywords

Authenticated encryption, AE, AEAD, ++AE, AEZ, block cipher, CAESAR, confidentiality, COPA, differential fault analysis, differential , ELmD, fault attack, forgery attack, integrity assurance, leakage resilience, modes of operation, OCB, OTR, SHELL, side channel attack, statistical fault analysis, symmetric encryption, tweakable block cipher, XE, XEX.

Abstract

Cryptography assures information security through different functionalities, especially confidentiality and integrity assurance. According to Menezes et al. [1], confidentiality means the process of assuring that no one except authorised parties can interpret information, while data integrity is an assurance that any unauthorised alteration to the content of a message will be detected. One possible approach to ensure confidentiality and data integrity is to use two different schemes, where one scheme provides confidentiality and the other provides integrity assurance. A more compact approach is to use schemes, called Authenticated Encryption (AE) schemes, that simultaneously provide confidentiality and integrity assurance for a message. AE can be constructed using different mechanisms; the most common construction uses block cipher modes, which is our focus in this thesis.

AE schemes have been used in a wide range of applications, and have been defined by standardisation organizations. The National Institute of Standards and Technology (NIST) recommended two AE block cipher modes, CCM [2] and GCM [3]. However, some potential attacks have been proposed against certain AE schemes [4]; in particular, the interaction between confidentiality and integrity assurance components in an AE scheme may cause problems unless special care is taken [5]. Therefore, at the beginning of 2013, the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR) was launched to invite proposals for AE schemes that are more powerful and practical in terms of security and speed than GCM [6].

The overall aim of this research is to analyse the security of some block cipher modes of operation submitted to CAESAR. The analysis is performed by considering the structure of the block cipher mode used in AE schemes. This may reveal vulnerabilities in the integrity assurance that can be exploited in forgery attacks.
Further, we propose fault attacks against certain CAESAR block cipher modes. Finally, an AE block cipher mode is proposed that assures resilience against side channel leakage, and provides misuse resistance and online computation at both sender and receiver sides.

This thesis analyses four block cipher modes: ++AE, OTR, XEX/XE and AEZ, and has five contributions. The first contribution identifies serious flaws in the integrity assurance mechanism of ++AE [7]. Most significantly, it does not verify the most significant bit of any message block. Other flaws are also identified which allow forgeries (including addition, deletion and swapping of certain blocks) in a chosen plaintext message. This work therefore concludes that ++AE is insecure as an authenticated encryption mode of operation.

The second contribution is the investigation of the generic OTR mode [8,9]. The current masking coefficients are specific to the finite field used to update masks. This work shows that certain choices of primitive polynomials result in mask collisions that can be exploited in forgery attacks. Alternatively, generic masking coefficients are proposed that can be used with any block size and any primitive polynomial, without affecting the security provided by this scheme.

Thirdly, we apply fault attacks to XEX/XE modes [10] to either eliminate the effect of secret masks or retrieve their values. Either of these cases enables existing fault attack techniques to recover the secret key. Different fault attack methods are demonstrated in this work using permanent, transient, and biased fault injections. This work also shows that the AE modes COPA, ELmD, SHELL, OCB2 and OTR are susceptible to our fault attack techniques.

The fourth contribution describes a fault analysis on AEZ [11], focusing mainly on AEZ v4.2 and the most recent version, AEZ v5. This work shows that all three 128-bit keys in AEZ v4.2 can be uniquely retrieved using only three fault injections. In addition, a similar approach using four fault injections can uniquely recover all keys of AEZ v5.
Two approaches are suggested to prevent attackers from exploiting the structure of AEZ in order to minimise the number of faults. The final contribution studies the leakage of existing AE block cipher modes that can be exploited in side channel attacks. Certain AE proposals provide leakage resilience and misuse resistance on the sender side only (and not the receiver side), while others provide leakage resilience and misuse resistance on both sides, but cannot perform online computation. This work proposes an AE block cipher mode that is online, leakage-resilient and misuse resistant at both the sender and receiver ends.

Contents

Keywords
Abstract
Table of Contents
List of Figures
List of Tables
Notation
Declaration
Previously Published Material
Acknowledgements

Chapter 1 Introduction
1.1 Background and motivation
1.2 Research aims and objectives
1.3 Research contributions
1.4 Organisation of thesis

Chapter 2 Background and literature review
2.1 Confidentiality provided by block ciphers
2.1.1 Block ciphers
2.1.2 Modes of operation
2.2 Integrity assurance provided by block ciphers
2.2.1 Message authentication codes
2.2.2 Properties of codes
2.2.3 Construction of message authentication codes
2.3 Authenticated encryption
2.3.1 Classification of AE schemes
2.3.2 Generic composition
2.3.3 Dedicated AE schemes
2.3.4 Comparison between dedicated AE modes
2.3.5 CAESAR competition
2.3.6 Selected block cipher-based AE modes
2.4 Cryptanalytic attacks
2.4.1 Attack goals
2.4.2 Attack models
2.4.3 Confidentiality attacks
2.4.4 Integrity assurance attacks
2.4.5 Implementation attacks
2.5 Summary

Chapter 3 Analysis of ++AE authenticated encryption mode
3.1 Description of ++AE
3.2 Fundamental flaw in ++AE
3.3 Other integrity assurance flaws in ++AE
3.3.1 Repeating internal vectors $I_{i-1}$ and $Q_{i-1}$
3.3.2 Groups of blocks that result in the same MDC value
3.4 Forgery attacks on ++AE
3.4.1 Forgery attack using two groups
3.4.2 Forgery attack using a single group
3.5 Experimental verification
3.5.1 Verification of insertion and deletion attack
3.5.2 Verification of swapping attack using two groups
3.5.3 Verification of swapping attack using a single group
3.6 General remarks about ++AE design
3.7 Conclusion

Chapter 4 Tweaking generic OTR mode to avoid forgery attacks
4.1 Tweakable block ciphers
4.2 OTR description
4.2.1 Generic OTR mode
4.2.2 AES-OTR mode
4.3 Existing analysis of OTR
4.4 New analysis of OTR
4.4.1 Proposed attacks
4.5 Proposed solution
4.5.1 Proposed instantiation of encryption/decryption core
4.5.2 Proposed instantiation of authentication core
4.6 Alternative solution
4.7 Security bounds for OTR using the new masking coefficients
4.8 Conclusion

Chapter 5 Analysis of XEX mode using fault attacks
5.1 Preliminaries
5.1.1 AES description
5.1.2 The design of XEX mode
5.1.3 Fuhr et al.’s fault attack on AES
5.2 Existing fault attacks on XEX
5.3 Eliminating the masks in XEX mode
5.3.1 Stuck-at-zero fault attack
5.3.2 Skipping an instruction fault attack
5.3.3 Security implication for mask elimination
5.4 A only attack to reveal secret mask L
5.4.1 Fault model A at round 9
5.4.2 Fault model A at round 8
5.4.3 Fault model B at round 9
5.5 Experimental results
5.5.1 Fault model B simulation
5.5.2 Retrieving one byte
5.5.3 Retrieving the whole mask
5.6 Application to authenticated encryption modes
5.7 Countermeasures
5.8 Conclusion

Chapter 6 Fault analysis of AEZ
6.1 Description of AEZ
6.1.1 AEZ v4.2
6.1.2 AEZ v5
6.2 Review of existing analysis on AEZ
6.3 Existing fault attacks on AES
6.3.1 Mukhopadhyay’s fault attack
6.3.2 Feasibility of Mukhopadhyay’s attack
6.4 New fault analysis on AEZ v4.2
6.4.1 Direct application
6.4.2 Improved application
6.4.3 Further improvement
6.5 New fault analysis on AEZ v5
6.5.1 Changes from v4.2
6.5.2 Implications of changes
6.5.3 Attack approach using four faults
6.5.4 Attack approach using three faults
6.6 Experimental results and comparison
6.7 Possible solutions
6.8 Conclusion

Chapter 7 An online leakage-resilient authenticated encryption mode
7.1 Leakage
7.1.1 Masking
7.1.2 Hiding
7.1.3 Leakage resilience
7.2 Leakage resilience through fresh re-keying
7.3 Existing leakage-resilient AE schemes
7.4 Security goals
7.4.1 Leakage resilience at both sender and receiver
7.4.2 Misuse predictability resistance
7.4.3 Online computation
7.4.4 Minimum overhead
7.4.5 Properties of existing AE schemes
7.5 Proposed design
7.6 An instantiation of our proposed design
7.7 Security analysis
7.7.1 Security of g against SPA and DPA
7.7.2 Security of $E_K$ against SPA
7.7.3 Security against Fault Attacks
7.7.4 Security of overall scheme
7.8 Performance estimates
7.9 Conclusion

Chapter 8 Conclusions and future research
8.1 Review of contributions
8.1.1 Analysis of ++AE mode
8.1.2 Analysis of OTR mode
8.1.3 Fault analysis of XEX mode
8.1.4 Fault analysis of AEZ mode
8.1.5 An online leakage-resilient AE mode
8.2 Concluding remarks
8.3 Future research

Bibliography

List of Figures

2.1 Block cipher interface.
2.2 [1].
2.3 Two rounds of SPN in PRESENT cipher [12, 13].
2.4 ECB encryption.
2.5 CBC encryption.
2.6 OFB encryption.
2.7 CFB encryption.
2.8 CTR encryption.
2.9 Message authentication code.
2.10 CBC-MAC.
2.11 Authenticated encryption interface.
2.12 Authenticated encryption with associated data.
2.13 Classification of AE schemes.
2.14 Generic composition AE schemes.
2.15 Power consumption during a DES call [14].
2.16 Differential trace for five predictions of a key byte [15].

3.1 ICV integrity mechanism.
3.2 ++AE encryption and decryption operations.
3.3 The chosen message for forgeries against ++AE.
3.4 Forgery attack using insertion and deletion.
3.5 Forgery attack by swapping groups.
3.6 Forgery attack by swapping blocks within a group.

4.1 OTR encryption operation with parallel associated data [8].
4.2 OTR encryption operation with serial associated data [8].
4.3 Proposed OTR encryption diagram with parallel associated data.
4.4 An alternative proposed OTR encryption diagram with parallel associated data.

5.1 XEX mode.
5.2 Doubling masking for XEX mode.
5.3 Masks containing the byte $L_{32}$.
5.4 Graphical representation of round 9 attack to retrieve the value of $(3L)_{00}$.
5.5 Graphical representation of round 8 attack to retrieve four bytes in L.
5.6 Mask bytes targeted according to the position of faulty bytes (colour figure).
5.7 Technique to retrieve all bytes of $2^{128}3L$ (colour figure).
5.8 Success rate to determine one mask byte for different probabilities.

6.1 AEZ scheme.
6.2 AEZ-core scheme.
6.3 Propagation of fault at the input of 8th round [16].
6.4 AES4 function in AEZ v4.2.
6.5 Reduced AES4 function.
6.6 Sketch of internal states in reduced AES4 function.
6.7 AES4 function in AEZ v5.

7.1 Fresh re-keying concept.
7.2 Slight modification to fresh re-keying scheme.
7.3 Authenticated encryption fresh re-keying scheme.
7.4 Our proposal for leakage-resilient AE mode.
7.5 A tree structure for updating the key.
7.6 Leakage-resilient integrity assurance scheme.
7.7 Performance overhead for different instantiations.
7.8 Area overhead for different instantiations.

List of Tables

2.1 Encryption and decryption algorithms in ECB mode.
2.2 Encryption and decryption algorithms in CBC mode.
2.3 Encryption and decryption algorithms in OFB mode.
2.4 Encryption and decryption algorithms in CFB mode.
2.5 Encryption and decryption algorithms in CTR mode.
2.6 Security of generic composition AE schemes [17].
2.7 Comparison between certain dedicated AE schemes.
2.8 AE candidates during CAESAR rounds.
2.9 Overview of certain block cipher-based AE candidates in CAESAR.

3.1 ++AE encryption and decryption algorithms.
3.2 Chosen plaintext in the first experiment.
3.3 Obtained ciphertext in the first experiment.
3.4 Modified ciphertext in the first experiment.
3.5 Decrypted modified ciphertext in the first experiment.
3.6 Chosen plaintext in the second experiment.
3.7 Obtained ciphertext in the second experiment.
3.8 Modified ciphertext in the second experiment.
3.9 Decrypted modified ciphertext in the second experiment.
3.10 Chosen plaintext in the third experiment.
3.11 Obtained ciphertext in the third experiment.
3.12 Modified ciphertext in the third experiment.
3.13 Decrypted modified ciphertext in the third experiment.

4.1 OTR algorithm in parallel mode [8].
4.2 Comparison of OTR with certain AE schemes [8].
4.3 OTR algorithm in serial mode [8].
4.4 Proposed OTR algorithm.

5.1 Summary of Fuhr et al.’s fault attack on AES round 9 [18].
5.2 Summary of fault attacks on XEX-based AE scheme [19].
5.3 Success rate of fault attacks using fault model B at round 9.
5.4 Summary of attacks to retrieve secret masks in certain AE modes.

6.1 AEZ-core algorithm [11].
6.2 Differential fault attacks on AES.
6.3 Comparison of different attacks on AEZ.

7.1 Summary of security features provided by leakage-resilient AE schemes.
7.2 Implementation overhead of different hardware constructions.

Notation

The following notation will be used consistently throughout this thesis:

• $K$ : k-bit secret key used for the block cipher encryption/decryption function and for mask/tweak/vector initialisation.
• $N$ : the nonce, a public value that is changed for each message.
• $A$ : associated data.
• $M_i$ : the ith block in the plaintext message.
• $C_i$ : the ith block in the corresponding ciphertext message.
• $n$ : the block length in bits of the block cipher.
• $m$ : the number of blocks in the plaintext message.
• $d$ : the number of blocks of associated data.
• $TA$ : the authentication tag obtained from associated data.
• $TE$ : the authentication tag obtained from plaintext message.
• $T$ : the $\tau$-bit final authentication tag of authenticated encryption scheme.
• $Q_i, I_i$ : secret n-bit internal chaining vectors obtained in processing the ith block.

• $ICV$ : Integrity Check Vector: a secret n-bit value changed for each message.
• $IV$ : public message number used during the initialising phase.
• $MDC$ : Modification Detection Code: an n-bit authentication tag, dependent on both the message content and the ICV.

• $s^{i,(r),(o)}_{jk}$ : the (j, k) byte in a matrix of j rows and k columns representing encryption state of plaintext $M_i$ after the operation o of round r where $j, k \in \{0, 1, 2, 3\}$, $o \in \{SB, SR, MC, AK\}$ and $r \in \{1, \ldots, 10\}$.
• $K^{(r)}_{(jk)}$ : the (j, k) byte in matrix of the sub-key K of round r where $j, k \in \{0, 1, 2, 3\}$ and $r \in \{1, \ldots, 10\}$.
• $L_{jk}$ : the (j, k) byte in a nonce-based secret value where $j, k \in \{0, 1, 2, 3\}$.
• $E_K(\cdot)$ : the block cipher encryption function under the key K.
• $D_K(\cdot)$ : the block cipher decryption function under the key K.
• $SB(\cdot)$ : the AES SubBytes operation.
• $SB^{-1}(\cdot)$ : the inverse of AES SubBytes operation.
• $SR(\cdot)$ : the AES ShiftRow operation.
• $SR^{-1}(\cdot)$ : the inverse of AES ShiftRow operation.
• $MC(\cdot)$ : the AES MixColumn operation.
• $MC^{-1}(\cdot)$ : the inverse of AES MixColumn operation.

• $\{0, 1\}^*$ : the set of all finite-length binary strings.
• $\{0, 1\}^a$ : the set of binary strings of length a.
• $\varepsilon$ : the empty string.
• $|X|$ : the length of the string X in bits.
• $[X]_i$ : the ith bit of the bit string X.
• $[X]_{(jk)}$ : the (j, k) byte in matrix representing the bit string X where $j, k \in \{0, 1, 2, 3\}$.
• $X \| Y$ : the concatenation of strings X and Y.
• $|X|_a$ : $\max\{\lceil |X|/a \rceil, 1\}$.
• $\xleftarrow{n} X$ : returns $(X_1, X_2, \ldots, X_x)$ where $x = |X|_n$, $|X_i| = n$ for $i < x$ and $|X_x| \le n$.

• $\underline{X}$ : the $10^*$ padding written as $X \| 10^{n-|X|-1}$.
• $\mathrm{msb}_c(X)$ : the most significant c bits of X provided that $|X| \ge c$.
• $\mathrm{lsb}_c(X)$ : the least significant c bits of X provided that $|X| \ge c$.
• $a^*$ : $a \oplus 2^{n-1}$ (the result of complementing the most significant bit of a).
• $+$ : arithmetic addition operation modulo $2^n$.
• $-$ : arithmetic subtraction operation modulo $2^n$.
• $\wedge$ : bitwise-and operation.
• $\oplus$ : bitwise-exclusive-or operation.
• $\ll$ : logical left shift operation.
• $\gg$ : logical right shift operation.
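The string notation above can be made concrete with a few small helpers. The following is a minimal sketch only: it models bit strings as Python strings of '0' and '1' characters for readability, and the function names (block_count, partition, pad10, msb, lsb, star) are hypothetical labels chosen for this illustration, not names used in the thesis.

```python
# Illustrative helpers for the notation above. Bit strings are modelled as
# Python str values made of '0'/'1' characters (an assumption for clarity).

def block_count(x: str, a: int) -> int:
    """|X|_a = max(ceil(|X| / a), 1)."""
    return max(-(-len(x) // a), 1)

def partition(x: str, n: int) -> list:
    """Split X into (X_1, ..., X_x): |X_i| = n for i < x and |X_x| <= n."""
    return [x[i * n:(i + 1) * n] for i in range(block_count(x, n))]

def pad10(x: str, n: int) -> str:
    """10* padding: X || 1 || 0^(n - |X| - 1), defined here for |X| < n."""
    assert len(x) < n
    return x + "1" + "0" * (n - len(x) - 1)

def msb(x: str, c: int) -> str:
    """The most significant c bits of X, provided |X| >= c."""
    assert len(x) >= c
    return x[:c]

def lsb(x: str, c: int) -> str:
    """The least significant c bits of X, provided |X| >= c."""
    assert len(x) >= c
    return x[len(x) - c:]

def star(a: int, n: int) -> int:
    """a* = a XOR 2^(n-1): complement the most significant bit of an n-bit a."""
    return a ^ (1 << (n - 1))

print(partition("110100101", 4))
print(pad10("101", 8))
print(format(star(0b0101, 4), "04b"))
```

For n-bit values, star(a, n) gives the same result as (a + 2**(n - 1)) % 2**n, because adding $2^{n-1}$ modulo $2^n$ changes only the most significant bit.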

Previously Published Material

The following articles have been published, and contain material based on the content of this thesis.

• Journal articles:
1. H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “A fundamental flaw in the ++AE authenticated encryption mode,” Journal of Mathematical Cryptology, vol. 12, no. 1, pp. 37–42, 2018.
2. H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Fault analysis of AEZ,” Concurrency and Computation: Practice and Experience, 2018. https://doi.org/10.1002/cpe.4785.

• Conference papers:
1. H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Forgery attacks on ++AE authenticated encryption mode,” in Proceedings of the Australasian Computer Science Week Multiconference, Canberra, Australia, February 2–5, 2016, pp. 33:1–9, ACM, 2016.
2. H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Tweaking generic OTR to avoid forgery attacks,” in Proceedings of the Applications and Techniques in Information Security - 6th International Conference, ATIS 2016, Cairns, QLD, Australia, October 26–28, 2016 (L. Batten and G. Li, eds.), vol. 651 of Communications in Computer and Information Science, pp. 41–53, 2016.
3. H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Fault attacks on XEX mode with application to certain authenticated encryption modes,” in Proceedings of the Information Security and Privacy - 22nd Australasian Conference, ACISP 2017, Auckland, New Zealand, July 3–5, 2017, Part I (J. Pieprzyk and S. Suriadi, eds.), vol. 10342 of Lecture Notes in Computer Science, pp. 285–305, Springer, 2017.
4. H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “A fault-based attack on AEZ v4.2,” in IEEE Trustcom/BigDataSE/ICESS, Sydney, Australia, August 1–4, 2017, pp. 634–641, IEEE, 2017.
5. I. Salam, H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Fault Attacks on Tiaoxin-346,” in Proceedings of the Australasian Computer Science Week Multiconference, Brisbane, Australia, January 30 – February 2, 2018, ACM, 2018.

Acknowledgements

First of all, all praise and thanks are due to Allah, the most Generous and the most Merciful, for giving me the ability and patience to accomplish this work. It is all His reconciliation and facilitation that laid out the difficulties I faced during the past few years of my study.

My sincerest gratitude and appreciation is to my principal supervisor, Leonie Simpson, for her support, encouragement and guidance. I cannot forget the enthusiasm and commitment of my associate supervisors Harry Bartlett, Ed Dawson and Kenneth Wong. I am really lucky to have such a great team who has spared no effort in order to achieve the goals of this study. Without their guidance, motivation, and insightful comments, this work would not exist. I want to express my sincere appreciation to them for each hour, minute or second they spent with me.

I truly appreciate Ernest Foo, Bouchra Senadji and Matt McKague for reviewing this document and giving valuable comments. Thank you for taking time out of your commitments to enhance the quality of this work.

I would like to express my appreciation for staff and students of the information security discipline at QUT, especially to my fellow PhD student, MD Iftekhar Salam, for his cooperation, useful discussion and brainstorming. His research was also on authenticated encryption, but based. We have published together the conference paper “Fault Attacks on Tiaoxin-346” in the 2018 Australasian Computer Science Week (ACSW) conferences.

I am so thankful to the government of the Sultanate of Oman for giving me the opportunity to continue my studies and providing me with a full PhD scholarship. I am so grateful to Dr. Nasser Al-Ismaily, Dr. Sattam Al-Riyami, Yousef Al-Busaidi and Dr. Sultan Al-Hinai for their assistance and trust in me. My appreciation is also to all members of the Omani Society in Queensland who create an atmosphere of happiness and fun to help us to forget the feelings of being so far away from home.

Words cannot describe how grateful I am to my parents for their support, encouragement, and prayers to complete this journey. They have always motivated me and my siblings, and have offered us everything they had in order to seek knowledge since our childhood. Whatever I write and say cannot measure up to their sacrifice. I also want to thank my brother and sisters for giving assistance whenever needed and enhancing my self confidence.

I cannot forget the sacrifice and struggle of my wife. She has endured the last three years, and spent long hours alone to provide me with the right environment to study. I appreciate her tolerance and standing by me when I most needed support. How would I acknowledge without thanking my lovely children? Thanks a bunch to Bayan, Muna, Meaad and Abdulaziz. I have spent long periods of time away from you, but your sacrifice is greatly appreciated. Lastly, my gratitude to my brother-in-law Salim Jaboob for all he has done and for his caring for my family, while I am away.

Chapter 1

Introduction

1.1 Background and motivation

Modern technological developments have brought about a revolution in the world of communication, facilitating information exchange and improving different walks of life. Modern communication systems are associated with huge data flows processed in digital form and transferred over insecure channels, such as the internet. This makes such data vulnerable to observation and modification.

Cryptography has emerged as a tool to prevent eavesdroppers from violating data without restricting legitimate users’ access to secure information. The value of information varies depending on the application, ranging from personal contact details to national secrets and strategic plans. According to [20], cryptography is fundamentally the science of hiding the meaning of a message. However, cryptography in a more comprehensive viewpoint provides four information security functionalities: message confidentiality, integrity assurance, entity authentication and data origin authentication [1]. Losing one functionality implies a breach of security in the overall system.

Confidentiality means that only authorised parties should understand the message content [1]. Adversaries should not be able to identify any useful information about the transmitted message, except its length. Confidentiality can be provided by enciphering (or encrypting) the plaintext message into a ciphertext message using a cryptographic cipher. Only authorised users having access to the secret key can perform the inverse of the encryption process, in order to decipher (or decrypt) the message to obtain its original form [1].

Integrity assurance is an assurance that any unauthorised alteration to a message will be detected [1]. Ideally, all data manipulation, including insertion, deletion and substitution, should be identified by the integrity assurance mechanism. Under this definition, integrity assurance is not equivalent to authentication, which is more closely related to origin identification. Authentication has two main branches: entity authentication and message authentication (also known as data origin authentication) [1]. Entity authentication identifies the communicating parties associated with the transmitted messages (whether these parties are persons, computers, ATM machines, etc.). On the other hand, message authentication or data origin authentication is used to verify information integrity and identify the original source [1, 21]. Altering the transmitted message implies that the source of information has changed. Hence, providing message authentication implies that integrity assurance of a message is also provided. The terms authenticity and integrity assurance are sometimes used interchangeably in the public literature, where authenticity or authentication means integrity assurance more than entity authentication. The term integrity assurance will be used consistently throughout this thesis.

Confidentiality and integrity assurance can be provided by cryptographic ciphers. There are two branches: asymmetric and symmetric ciphers. Asymmetric ciphers use two keys, public and private, whereas symmetric ciphers share the same key [1].

Symmetric ciphers are further divided into two types: stream and block ciphers. Generally, stream ciphers are faster than block ciphers, and have lower error propagation when transmission errors occur. However, block ciphers are based on simple, easy to analyse functions while the overall scheme is still difficult to cryptanalyse.
They still have the potential to perform fast computation, and have been employed in a wide range of network-based symmetric encryption applications. Different block cipher modes of operation were defined as FIPS standards [22] and recommended by the National Institute of Standards and Technology (NIST) [23]. The research presented in this thesis focuses only on symmetric block ciphers, particularly block cipher modes of operation.

Securing messages in various applications may require both confidentiality and integrity assurance. A new type of symmetric scheme, called authenticated encryption (AE), was designed to simultaneously provide these two functionalities [17]. Conventional AE schemes provide both functionalities by simply combining two different schemes, one for confidentiality and the other for integrity assurance. Bellare and Namprempre [17, 24] first formally analysed the security of these generic composition AE schemes. After that, a variant of AE, called authenticated encryption with associated data (AEAD), was proposed by Rogaway [25] where additional data, such as packet headers, does not require confidentiality, but still requires integrity assurance. Since then, advances in cryptography have led to the development of more efficient block cipher-based AEAD schemes, such as CCM [26,2], GCM [27,3], and OCB [28, 10, 29, 30]. Both CCM and GCM schemes were recommended by NIST. In addition, six AE modes, which are OCB2 [10], Key Wrap [31], CCM [26], EAX [32], Encrypt-then-MAC (EtM) [17, 24], and GCM [27], are standardised by the International Organization for Standardization (ISO) in ISO/IEC 19772. However, some schemes have been shown to be prone to attacks, such as the cycling attacks against GCM [4]. More serious failures in the interaction between encryption and integrity assurance mechanisms are noted by Bernstein [5].
In addition, the security properties in schemes constructed by combining two modes cannot far exceed the properties of each mode individually. Although there are other efficient and dedicated AE schemes, such as IAPM [33] and OCB1 [28], these are covered by patents restricting their usage in industrial products.

Driven by these issues with the existing AE schemes, the necessity for secure, efficient, patent-free AE schemes and a lack of secure AE standards, the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR) was announced in January 2013 [6]. CAESAR was launched at the Early Symmetric Crypto workshop in Mondorf-les-Bains in Luxembourg, with the aim to specify a final portfolio of AE schemes that are more efficient than the AES-GCM mode, and can be used in a wide range of applications. The initiative resonated widely and received 57 submissions in the first round in 2014. The submissions were based on diverse constructions, such as block cipher-based, stream cipher-based, sponge-based, permutations, etc. This research chooses to analyse block cipher-based submissions since they are the biggest group of submissions.

Modes of operation can be analysed with respect to different attacks. This thesis considers forgery attacks where an adversary can intercept and alter messages and their authentication tags before sending them on to the intended recipient. Forgery attacks succeed if the integrity assurance mechanism cannot detect alterations to the received message or the corresponding tag. Additionally, this thesis analyses the security of modes against implementation attacks. Physical implementations of cryptographic algorithms may leak information about their internal operation. This leakage can be exploited by implementation attacks, including side channel analysis (SCA) attacks [34] and fault attacks [35].
SCA attacks exploit side channel leakage, such as power consumption, timing information and electromagnetic emission, while fault attacks deliberately introduce an error into the operation of the algorithm and observe the erroneous outputs to obtain information about the secret key.
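The forgery setting described above can be illustrated with a small generic composition example. This is a minimal sketch under stated assumptions, not any CAESAR candidate: it follows the Encrypt-then-MAC pattern, uses a toy keystream built from SHA-256 in place of a block cipher mode, and uses HMAC-SHA-256 for integrity assurance. The names seal and open_sealed are hypothetical and chosen only for this illustration.

```python
import hashlib
import hmac

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 over key, nonce and a counter). It stands in
    # for a block cipher mode here and is NOT a vetted cipher.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def _tag(mac_key: bytes, nonce: bytes, ad: bytes, ciphertext: bytes) -> bytes:
    # Length-prefix each field so (nonce, ad, ciphertext) encodes unambiguously.
    data = b"".join(len(f).to_bytes(8, "big") + f for f in (nonce, ad, ciphertext))
    return hmac.new(mac_key, data, hashlib.sha256).digest()

def seal(enc_key, mac_key, nonce, ad, plaintext):
    """Encrypt-then-MAC: returns (ciphertext, tag); ad is authenticated only."""
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return ciphertext, _tag(mac_key, nonce, ad, ciphertext)

def open_sealed(enc_key, mac_key, nonce, ad, ciphertext, tag):
    """Returns the plaintext, or None if the integrity check fails."""
    if not hmac.compare_digest(_tag(mac_key, nonce, ad, ciphertext), tag):
        return None
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

enc_key, mac_key, nonce = b"k" * 16, b"m" * 16, b"n" * 12
ct, tag = seal(enc_key, mac_key, nonce, b"header", b"pay 10 dollars")
forged = bytes([ct[0] ^ 0x01]) + ct[1:]  # adversary flips one ciphertext bit
print(open_sealed(enc_key, mac_key, nonce, b"header", ct, tag))
print(open_sealed(enc_key, mac_key, nonce, b"header", forged, tag))
```

In this sketch the unmodified message decrypts normally, while the altered ciphertext is rejected because its tag no longer matches. A forgery attack in the sense used in this thesis succeeds when an altered message passes such a check.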

1.2 Research aims and objectives

The CAESAR competition provides an opportunity for analysing authenticated encryption proposals. There were 30 block cipher-based submissions which require in-depth evaluation, and CAESAR was in its first round when this research started. The main aim of this research is to analyse the security of certain block cipher modes of operation used to construct AE schemes submitted to CAESAR. The research targets the mode of operation used in these designs rather than the underlying block cipher. This thesis analyses the security of block cipher modes considering the following three objectives:

1. Identify design flaws in the integrity assurance mechanism of the block cipher based modes of operation submitted to CAESAR. If some flaws are identified, the research explores how these flaws can be exploited in forgery attacks, to breach integrity assurance of the scheme. Additionally, if feasible we propose changes that can easily remedy such flaws.

2. Investigate the vulnerability of certain CAESAR block cipher modes to implementation attacks, especially fault attacks. While implementation attacks on block ciphers are well studied, assessing the vulnerability of modes of operation to such attacks is not straightforward. Most existing fault attacks against block ciphers are applied with the assumption that the block cipher is in ECB mode. This research investigates the type of faults and the required number of faults to breach the security of certain AE block cipher modes submitted to CAESAR, given the known fault attacks on AES in ECB mode.

3. Use both the public literature and the outcomes of the analysis in this research regarding AE block cipher modes that provide resistance against side channel attacks to propose an AE mode of operation that is resistant to side channel attacks, without compromising potential important features, such as the ability to perform online computation.

The outcome of this research will contribute to the CAESAR evaluation process. In addition, successful analysis may be applicable to other related block cipher modes for authenticated encryption, which would be useful for the cryptographic community.

1.3 Research contributions

This thesis has five major contributions, as follows:

Contribution 1: A block cipher mode of operation for authenticated encryption known as plus-plus-AE (++AE) [7] was submitted in the first round of CAESAR. This mode uses a special integrity assurance mechanism by combining addition modulo $2^n$ and XOR operations to propagate alterations in the ciphertext message to the last block, resulting in an incorrect authentication tag. In this contribution, we extensively analysed ++AE, revealing serious weaknesses in the integrity assurance mechanism. Firstly, this mode is shown to be vulnerable to a fundamental flaw: the scheme does not verify the most significant bit of any block in the plaintext message. Secondly, there are additional flaws that can be exploited by choosing a plaintext message containing four consecutive blocks of particular values to construct multiple forged messages. There are hundreds of block values that can be used in this way. The attacks against ++AE are deterministic, and the forged messages are guaranteed to pass the integrity check. The success of these forgeries is independent of the underlying block cipher, key or additional secret value. We provide mathematical proofs for the flaws in the ++AE algorithm, describe the process to exploit these flaws in forgery attacks and verify these attacks experimentally using 128-bit AES as the underlying block cipher. We conclude that ++AE is insecure as an authenticated encryption mode of operation. Parts of this contribution were published in the Australasian Information Security Conference (AISC 2016) [36]. An article discussing the fundamental flaw in ++AE has been published in the Journal of Mathematical Cryptology [37].

Contribution 2: This contribution considers the security of the Offset Two-Round (OTR) authenticated encryption mode [8] with respect to forgery attacks. The current version of OTR gives a security proof for specific choices of the block size (n) and the primitive polynomial used to construct the finite field $\mathbb{F}_{2^n}$. Although the OTR construction is generic, the security proof is not. For every choice of finite field, the distinctness of masking coefficients must be verified to ensure security. In this work, we show that some primitive polynomials result in collisions among the masking coefficients used in the current instantiation, permitting forgeries to be constructed. We propose a new way to instantiate OTR so that the masking coefficients will be distinct in every finite field $\mathbb{F}_{2^n}$, thus generalising OTR and permitting a generic security proof without reducing the security of OTR. The results of this work were published in the International Conference on Applications and Techniques in Information Security (ATIS 2016) [38].

Contribution 3: The XOR-Encrypt-XOR (XEX) mode [10] uses nonce-based secret masks (L) that are distinct for each message. The existence of secret masks in XEX mode prevents the application of conventional fault attack techniques, such as differential fault analysis. In this work, we investigate fault attacks against XEX mode that either eliminate the effect of the secret masks or retrieve their values. Either of these outcomes enables existing fault attack techniques to subsequently be applied to recover the secret key. To estimate the success rate and feasibility, we ran simulations for ciphertext-only fault attacks against 128-bit AES in XEX mode. This work also discusses the relevance of the proposed fault attacks to certain CAESAR authenticated encryption modes based on XEX, such as OCB2, OTR, COPA, SHELL and ElmD. Finally, we suggest effective countermeasures to provide resistance to these fault attacks. The results of this work were published in the Australasian Conference on Information Security and Privacy (ACISP 2017) [39].

Contribution 4: We investigate differential fault attacks against the AEZ authenticated encryption scheme [11]. AEZ uses three different 128-bit keys (I, J, L) and can potentially work without a nonce or with a repeated nonce. The algorithm has been updated several times during the three rounds of the CAESAR cryptographic competition, but our analysis mainly focuses on AEZ v4.2, and investigates the applicability of these analyses to the most recent version, AEZ v5. This work identifies the best place in the implementation to apply differential fault attacks. We exploit the structure of AEZ to minimise the total number of faults required for key recovery. We propose an approach that can reduce the number of fault injections required to retrieve all three AEZ v4.2 keys, I, J and L, from six to three injections such that these keys are uniquely determined. In addition, a similar approach using four fault injections can uniquely recover all of the keys of AEZ v5. The attacks in this work are verified experimentally using generic implementations of AEZ v4.2 and v5 developed in the programming language C. The results on v4.2 were published in the IEEE International Conference on Trust, Security and Privacy in Computing and Communications (IEEE TrustCom 2017) [40]. A more comprehensive analysis against AEZ v4.2 and v5 was accepted for publication in the TrustCom2017 special issue in Concurrency and Computation: Practice and Experience journal [41].

Contribution 5: Cryptographic algorithms may be vulnerable to side channel attacks that exploit the information leakage of physical implementations. Providing leakage resilience is difficult for block cipher based authenticated encryption (AE) schemes, and becomes more challenging when the scheme is intended to provide leakage resilience at both ends (in sender and receiver devices), and additionally in misuse resistance situations where an adversary has control over the nonce or initialisation vector. Hence, existing leakage-resilient AE schemes tend to collect the whole message either at the sender or receiver before the encryption/decryption process. Such an approach is effective in preventing attacks but limits the online computation of the scheme. In this work, we propose a block cipher based AE mode that provides leakage resilience at both the encryption and decryption ends without losing the potential to perform online computation. The proposed scheme is efficient for applications such as smart cards, pay-TV, RFID tags and ASIC/FPGA chips, as these devices usually have limited resources and operate in hostile environments.

1.4 Organisation of thesis

This thesis is organised as follows:

Chapter 2 provides an overview of the background and work related to this research. It covers the methods used to provide confidentiality and integrity assurance using block cipher modes and approaches to provide authenticated encryption. Additionally, it outlines some cryptanalytic attacks grouped into three branches: confidentiality attacks, integrity assurance attacks and implementation attacks.

Chapter 3 presents a serious flaw in the integrity assurance of the authenticated encryption mode ++AE. We demonstrate that ++AE does not verify the most significant bit in any plaintext block. After that, more weaknesses are identified that can be exploited in chosen plaintext forgery attacks. An attacker can either delete and insert particular ciphertext blocks in the original ciphertext message or reorder them, and the success of these forgeries is guaranteed. Finally, this chapter gives some experimental results to demonstrate the forgery attacks.

Chapter 4 considers the security of OTR, which is a block cipher mode of operation for AEAD, with respect to forgery attacks. This chapter proposes minor modifications to OTR intended to make the generic version of this scheme more robust. We specify a set of masks that are distinct in every finite field $\mathbb{F}_{2^n}$. This enables OTR to work in any finite field without invalidating the security claimed.

Chapter 5 proposes fault attacks on a block cipher mode, XOR-Encrypt-XOR (XEX). This mode is used in a few authenticated encryption schemes. In this mode, the use of a nonce-based secret mask renders the traditional Differential Fault Attack (DFA) useless, as the nonce changes for each encryption. To overcome this problem, the chapter describes two different types of fault attack methodologies.

Chapter 6 presents the use of differential fault attacks against the authenticated encryption scheme AEZ, an AES-based scheme that uses a 384-bit key. This chapter explains the best place to apply differential fault attacks, and exploits the structure of AEZ to minimise the total number of faults required for key recovery. The analysis in this chapter is applied to the two versions of AEZ: v4.2 and v5.

Chapter 7 proposes a block cipher-based AE mode that provides online, leakage-resilient and misuse-resistant authenticated encryption at both the sender and receiver. The scheme provides these features using a minimum number of primitives. The performance of the proposed mode is evaluated and compared to some existing leakage-resilient AE modes.

Chapter 8 concludes this research by outlining the main outcomes and suggests some directions for future work in block cipher modes for authenticated encryption.

Chapter 2

Background and literature review

This chapter discusses two main cryptography goals, confidentiality and integrity assurance, and their combination in authenticated encryption. These two goals can be achieved using various approaches and one common method is using a block cipher. This chapter gives an overview of how confidentiality and integrity assurance can be provided separately using block cipher modes, and how to combine these two services into a single scheme to provide block cipher-based authenticated encryption. Finally, the chapter introduces different attacks to violate confidentiality and integrity assurance.

This chapter is arranged as follows: Section 2.1 gives a review of block ciphers and their various methods of construction. This section also elaborates more on block cipher modes of operation. Block cipher-based integrity assurance, the concept of message authentication codes and their properties and constructions are discussed in more detail in Section 2.2. The concept of authenticated encryption (AE), the classification of AE schemes focusing on certain dedicated AE block cipher modes, and the CAESAR competition are explained in Section 2.3. Section 2.4 outlines a range of cryptanalytic techniques for block ciphers including attacks on confidentiality, integrity assurance and implementation attacks.

2.1 Confidentiality provided by block ciphers

Data confidentiality can be provided by encrypting data into an unreadable format to prevent eavesdropping by third parties. The encryption process can use a symmetric cipher where the plaintext message M is transformed into a ciphertext message using a secret key K shared between communicating parties [42, 21]. Symmetric ciphers are intended to ensure that an adversary cannot decrypt the ciphertext message to the original plaintext message without knowledge of the secret key. That is, the message is protected against unauthorised access. Symmetric ciphers are split into two types: stream and block ciphers. This section focuses only on confidentiality provided by block ciphers.

2.1.1 Block ciphers

A block cipher is a symmetric cipher for which the encryption function transforms an n-bit plaintext string (block) into an n-bit ciphertext block under the control of a key. Block ciphers are used to construct different primitives, such as pseudorandom generators and message authentication codes. There is no ideal block cipher that can serve all applications; different cryptographic applications use different types of block ciphers depending on factors such as encryption speed, required memory and hardware footprint.

A block cipher encryption, as illustrated in Figure 2.1, is a one-to-one function $E_K$ that maps an n-bit plaintext block $M_i$ to an n-bit ciphertext block $C_i$ under a k-bit secret key K. This is usually represented as $C_i = E_K(M_i)$. The strength of the block cipher security basically depends on the key length. Each key represents a different block cipher function (mapping) so that a k-bit key can define up to a total of $2^k$ different mapping functions.

Figure 2.1: Block cipher interface. The block cipher E (also written $E_K$) takes a plaintext block $M_i$ and a key K and outputs a ciphertext block $C_i$.

Most block ciphers, as described in [42], are iterated ciphers. That is, the cipher is composed of a number of similar rounds where each round consists of two main functions: a key schedule function and a round function. The key schedule function is used to produce different keys, called sub-keys, from the original master key. The round function takes two inputs, the current state of the cipher and one sub-key, and combines these two variables using various logic operations. The encryption process is repeated for a number of iterations to produce a block of ciphertext as a final output.

Two commonly used classes of block ciphers are Feistel ciphers and Substitution-Permutation Network (SPN) ciphers [42]. An example of a Feistel cipher is the Data Encryption Standard (DES) [43], and a commonly used SPN cipher is the Advanced Encryption Standard (AES) [44].

2.1.1.1 Feistel ciphers

Feistel ciphers are named after Horst Feistel [45] who, with his colleagues from IBM, developed in 1974 a block cipher based on an earlier internal design called Lucifer [46]. One main advantage of Feistel ciphers is that the encryption process is identical to decryption, except that the sub-keys produced by the key schedule are applied in reverse order. This enables both processes to use the same hardware circuit, and both processes perform similarly in software.

Feistel ciphers divide each 2n-bit plaintext block into two n-bit halves: right half $R_i$ and left half $L_i$. After that, a round function $f$ is applied with a sub-key $K_i$ to the right half before XOR-ing the result with the left half as shown in Figure 2.2. Then, the left and right halves are swapped to form $L_{i+1}$ and $R_{i+1}$ as follows:

$$L_{i+1} = R_i$$
$$R_{i+1} = f(R_i, K_i) \oplus L_i$$

This process is repeated for a number of rounds until the final ciphertext block is generated. As shown in Figure 2.2, Feistel ciphers do not apply the swap operation in the last round. Unlike SPN ciphers, Feistel ciphers do not require the round function to be invertible.
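The round equations can be checked with a small sketch. The code below is a toy Feistel network written only for illustration: the hash-based round function, the key schedule and all names (`round_function`, `key_schedule`, `feistel`) are assumptions of this example and are not taken from DES or from any scheme analysed in this thesis. It shows that the same routine decrypts when the sub-keys are supplied in reverse order, even though the round function itself is not invertible.

```python
import hashlib
from typing import Sequence


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def round_function(half: bytes, subkey: bytes) -> bytes:
    """Toy f(R_i, K_i): a truncated hash, deliberately not invertible."""
    return hashlib.sha256(subkey + half).digest()[: len(half)]


def key_schedule(master_key: bytes, rounds: int) -> list:
    """Toy key schedule (an assumption): one 8-byte sub-key per round."""
    return [hashlib.sha256(master_key + bytes([i])).digest()[:8]
            for i in range(rounds)]


def feistel(block: bytes, subkeys: Sequence[bytes]) -> bytes:
    """Apply the Feistel rounds; the swap of the last round is undone."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for subkey in subkeys:
        # L_{i+1} = R_i and R_{i+1} = f(R_i, K_i) XOR L_i
        left, right = right, xor(round_function(right, subkey), left)
    # No swap in the final round: output the halves in un-swapped order.
    return right + left


subkeys = key_schedule(b"toy master key", rounds=16)
plaintext = b"sixteen byte msg"
ciphertext = feistel(plaintext, subkeys)
# Decryption is the same routine with the sub-keys in reverse order.
recovered = feistel(ciphertext, subkeys[::-1])
```

Because the round function is only ever evaluated in the forward direction, any keyed function can be used for `round_function`, which is the property that distinguishes Feistel ciphers from SPN ciphers.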

2.1.1.2 Substitution-permutation network (SPN) ciphers

Substitution-permutation network ciphers generate a ciphertext block after a series of two core functions is applied consecutively. These two main functions are substitution and permutation. The substitution function replaces s bits of the input message with a different set of s bits. This mapping process is also called an s-box, which should be a one-to-one permutation. For a message block of length $(\ell \times s)$ bits, a total of $\ell$ s-box calls are required. Secure s-boxes have the property that changing one bit at the input causes half of the bits at the output to change on average.

On the other hand, the second function, permutation, changes the order of bits within the input message itself. That is, this function is a linear transformation of the block cipher state. An effective permutation ensures that changing one bit at the input of substitution layer i changes many s-box inputs at substitution layer i + 1. A well-designed substitution-permutation network satisfies Shannon's confusion and diffusion properties [47].

Figure 2.2: Feistel cipher [1], showing the encryption process and the decryption process.

The substitution-permutation process, as shown in Figure 2.3, is repeated for a specified number of rounds, and each round performs the following operations: XOR with a round key, perform substitution and then perform permutation. The permutation of the last round is not applied; however, the first and last rounds are XOR-ed with sub-keys in a process called whitening. Feistel networks require three rounds so that each bit of the input has a potential effect on every bit of the output, whereas the SPN used in AES, for example, requires only two rounds to have this effect [42].

Figure 2.3: Two rounds of SPN in PRESENT cipher [12, 13].
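One SPN round (key XOR, substitution, permutation) can be sketched as follows. The 4-bit s-box and the 64-bit bit permutation are taken to be those published for PRESENT; they are reproduced here as an assumption and should be checked against the specification [12, 13] before being relied upon. The function names are hypothetical, and the sketch omits the key schedule and the final whitening.

```python
# Assumed PRESENT s-box (4-bit input -> 4-bit output).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
INV_SBOX = [SBOX.index(x) for x in range(16)]

# Assumed PRESENT bit permutation: bit i moves to position PERM[i].
PERM = [(16 * i) % 63 if i < 63 else 63 for i in range(64)]


def substitute(state: int, box) -> int:
    """Apply a 4-bit s-box to each of the 16 nibbles of a 64-bit state."""
    out = 0
    for nibble in range(16):
        out |= box[(state >> (4 * nibble)) & 0xF] << (4 * nibble)
    return out


def permute(state: int) -> int:
    """Move bit i of the state to position PERM[i]."""
    out = 0
    for i in range(64):
        out |= ((state >> i) & 1) << PERM[i]
    return out


def inverse_permute(state: int) -> int:
    """Move the bit at position PERM[i] back to position i."""
    out = 0
    for i in range(64):
        out |= ((state >> PERM[i]) & 1) << i
    return out


def spn_round(state: int, round_key: int) -> int:
    """One round: XOR with the round key, substitute, then permute."""
    return permute(substitute(state ^ round_key, SBOX))


def inverse_spn_round(state: int, round_key: int) -> int:
    """Undo one round by inverting each layer in reverse order."""
    return substitute(inverse_permute(state), INV_SBOX) ^ round_key
```

Unlike a Feistel round function, both layers here must be invertible, which is why the sketch needs `INV_SBOX` and `inverse_permute`. With this permutation, the four output bits of the first s-box are sent to four different s-boxes of the next round, which is the diffusion behaviour described above.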

2.1.2 Modes of operation

As discussed in Section 2.1.1, only fixed-length blocks of data are encrypted or decrypted by a block cipher. Hence, a straightforward method to encrypt larger messages is to divide the message into blocks according to the block size of the cipher used and then encrypt each block separately. This method is called Electronic Codebook (ECB) mode. However, this mode has one main disadvantage: repeated plaintext blocks result in the same ciphertext blocks, resulting in information leakage that breaches message confidentiality. Thus, alternative methods to process long messages, called modes of operation, are considered.

There are many different types of block cipher modes. This section reviews five common modes of operation: Electronic Codebook (ECB), Cipher Block Chaining (CBC), Output Feedback (OFB), Cipher Feedback (CFB) and Counter mode (CTR). The first four modes were defined as block cipher modes of DES in FIPS 81 [22]. After that, NIST recommended these four modes in addition to the counter mode for the AES block cipher [23]. The International Organization for Standardization (ISO) also defined in 1991 the first edition of ISO/IEC 10116 for modes of operation. The third edition, published in 2006, has the same five modes of operation defined by NIST [23]. All five modes aim to protect confidentiality, but they differ in performance and the required operational attributes.

The encryption process for each block cipher mode is illustrated in the next five sections, and some properties of each mode are also discussed. To describe these modes, assume $M = (M_1, \cdots, M_m)$ is a plaintext message consisting of m blocks each of length n bits, and suppose $C = (C_1, \cdots, C_m)$ is the corresponding ciphertext message after encrypting the message M.

2.1.2.1 Electronic codebook (ECB) mode

ECB mode [1, 20, 22, 23, 46] first partitions the plaintext message into m blocks of n-bit length. After that, each plaintext block is encrypted individually using the forward function $E_K$, whereas the decryption process applies the backward function $D_K$ to each ciphertext block such that:

$$M_i = D_K(E_K(M_i)), \quad \forall i \in [1, m]$$

The encryption and decryption algorithms of ECB mode are described in Table 2.1 while Figure 2.4 illustrates the encryption process only.

Table 2.1: Encryption and decryption algorithms in ECB mode.

Algorithm 1: ECB encryption
1. $(M_1, \ldots, M_m) \xleftarrow{n} M$
2. if $|M_m| < n$ then
3.     $M_m \leftarrow \mathrm{pad}(M_m)$ fi
4. for $i \leftarrow 1$ to $m$ do
5.     $C_i \leftarrow E_K(M_i)$ od
6. return $C \leftarrow (C_1, \ldots, C_m)$

Algorithm 2: ECB decryption
1. $(C_1, \ldots, C_m) \xleftarrow{n} C$
2. for $i \leftarrow 1$ to $m$ do
3.     $M_i \leftarrow D_K(C_i)$ od
4. return $M \leftarrow (M_1, \ldots, M_m)$

Figure 2.4: ECB encryption. Each plaintext block $M_1, \ldots, M_m$ is encrypted independently with $E_K$ to give $C_1, \ldots, C_m$.

ECB has the following properties:

• Padding: if the message length is not a multiple of the block size n of the underlying cipher, then the message is padded to have this length. One common way of padding is appending a one to the message end, followed by as many zeros as needed to reach a multiple of the block size.

• Identical plaintext blocks: duplicated plaintext blocks are encrypted into identical ciphertext blocks. Compared to other modes, this is a disadvantage of ECB mode.

• Backward function: ECB mode requires both the forward $E_K$ and backward $D_K$ functions of the block cipher. This is less efficient than modes which require only the forward function to perform both encryption and decryption.

• Bit errors: if an error, either accidental or deliberate, occurs in a bit of the transmitted message, only the corresponding block is affected and the remaining blocks are still decrypted correctly.

• Parallelism: since each block can be processed independently and ECB mode can encrypt/decrypt subsequent blocks without the need to have previous ones, parallel processing is possible.

• Self-synchronous: if a bit is inserted or destroyed in the transmitted message, ECB mode on the decryption side parses the ciphertext message into different subsequent blocks and the recovered blocks will be invalid. However, when a complete block is inserted or lost, the synchronisation is preserved as the remaining blocks will be decrypted correctly.
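The ECB properties listed above can be observed with a minimal sketch. The block cipher below is a toy four-round Feistel permutation built from a hash, used purely as a stand-in for a real cipher such as AES; it is an assumption of this example and offers no security. The padding is the one-followed-by-zeros rule at byte level (0x80 then zero bytes), and the sketch always pads so that the padding can be removed unambiguously, which differs slightly from padding only an incomplete last block.

```python
import hashlib

BLOCK = 16  # block size n in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[: len(half)]


def E(key: bytes, block: bytes) -> bytes:
    """Toy forward function E_K (4-round Feistel, illustration only)."""
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def D(key: bytes, block: bytes) -> bytes:
    """Toy backward function D_K, the inverse of E."""
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def pad(msg: bytes) -> bytes:
    """Append a one bit (byte 0x80), then zeros up to a block multiple."""
    padded = msg + b"\x80"
    return padded + b"\x00" * (-len(padded) % BLOCK)


def unpad(padded: bytes) -> bytes:
    """Strip the trailing zeros and the 0x80 marker."""
    return padded.rstrip(b"\x00")[:-1]


def blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, msg: bytes) -> bytes:
    return b"".join(E(key, m) for m in blocks(pad(msg)))


def ecb_decrypt(key: bytes, ct: bytes) -> bytes:
    return unpad(b"".join(D(key, c) for c in blocks(ct)))


key = b"k" * 16
msg = b"A" * 16 + b"B" * 16 + b"A" * 16 + b"tail"
c = blocks(ecb_encrypt(key, msg))
# c[0] == c[2]: the repeated plaintext block is visible in the ciphertext.
```

Dropping a whole ciphertext block (for example `c[1]`) still leaves the remaining blocks decryptable, which illustrates both the block-level synchronisation and the absence of any integrity protection in ECB mode.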

2.1.2.2 Cipher block chaining (CBC) mode

In CBC mode [1, 20, 22, 23, 46], each ciphertext block is based on every previous plaintext block in addition to its corresponding plaintext block. The mode chains the ciphertext generated in the previous block to the current block, and repeats this for subsequent blocks. The first block in this mode requires an initialisation vector IV such that the value of IV is not a secret, but should be unpredictable to make each CBC encryption nondeterministic. One way to obtain an unpredictable value is to choose a predictable nonce, and then encrypt it to generate an IV value under the same key K or a key derived from K. CBC mode also requires the protection of IV against alterations as IV repetition leads to security vulnerabilities [48, 49]. Algorithms of CBC mode are described in Table 2.2 and the encryption process is illustrated in Figure 2.5.

Table 2.2: Encryption and decryption algorithms in CBC mode.

Algorithm 1: CBC encryption
1. $(M_1, \ldots, M_m) \xleftarrow{n} M$
2. if $|M_m| < n$ then
3.     $M_m \leftarrow \mathrm{pad}(M_m)$ fi
4. $IV \leftarrow \mathrm{random}()$, $C_0 \leftarrow IV$
5. for $i \leftarrow 1$ to $m$ do
6.     $C_i \leftarrow E_K(C_{i-1} \oplus M_i)$ od
7. return $C \leftarrow (C_0, C_1, \ldots, C_m)$

Algorithm 2: CBC decryption
1. $(C_0, \ldots, C_m) \xleftarrow{n} C$
2. for $i \leftarrow 1$ to $m$ do
3.     $M_i \leftarrow D_K(C_i) \oplus C_{i-1}$ od
4. return $M \leftarrow (M_1, \ldots, M_m)$

Figure 2.5: CBC encryption. Each plaintext block $M_i$ is XOR-ed with the previous ciphertext block (the IV for the first block) before being encrypted with $E_K$ to give $C_i$.

CBC mode has the following properties:

• Padding: the last block should be padded if it is an incomplete block.

• Identical plaintext blocks: unlike ECB, the CBC mode uses a chaining mechanism so that any change in the key, IV, or previous plaintext blocks results in different ciphertext blocks. Hence, choosing a fresh IV for every new encryption prevents recognising matched plaintext blocks by observing their corresponding ciphertext blocks.

• Backward function: the forward function is used in the CBC encryption algorithm, while the backward function is used in the CBC decryption algorithm.

• Bit errors: when a transmission error occurs in a ciphertext block $C_i$, then only two consecutive plaintext blocks, $M'_i$ and $M'_{i+1}$, are affected since both depend on the block $C_i$: $M'_i \leftarrow D_K(C_i) \oplus C_{i-1}$ and $M'_{i+1} \leftarrow D_K(C_{i+1}) \oplus C_i$. $M'_i$ is totally random compared to the original block $M_i$, whereas $M'_{i+1}$ is altered only in bits where $C_i$ has errors. After that, $M'_{i+2}$ should be recovered correctly when both $C_{i+1}$ and $C_{i+2}$ are error-free.

• Parallelism: since each block encryption should wait for all previous plaintext blocks to be encrypted first, CBC cannot perform parallel computation during encryption. However, the decryption process can be parallelised since the decryption function requires merely the ciphertext blocks (see line 3 of Algorithm 2 in Table 2.2).

• Self-synchronous: the scheme can resynchronise if a number of bits equal to the block size or a multiple of it is inserted into or deleted from the ciphertext message. Only the corresponding blocks and the one following are affected, but subsequent blocks remain valid. However, insertions or deletions of a length different from the block size cause the scheme to lose synchronisation completely.
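The bit-error behaviour of CBC decryption can be reproduced with a short sketch. As before, the block cipher is a toy hash-based Feistel permutation standing in for a real cipher, and the function names are assumptions of this example. Messages are given as lists of full blocks so that padding can be left out.

```python
import hashlib
import os

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[: len(half)]


def E(key: bytes, block: bytes) -> bytes:
    """Toy forward function E_K (illustration only, not secure)."""
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def D(key: bytes, block: bytes) -> bytes:
    """Toy backward function D_K."""
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key: bytes, msg_blocks: list, iv: bytes) -> list:
    """Return [C_0, C_1, ..., C_m] with C_0 = IV."""
    out = [iv]
    for m in msg_blocks:
        out.append(E(key, xor(out[-1], m)))  # C_i = E_K(C_{i-1} XOR M_i)
    return out


def cbc_decrypt(key: bytes, ct_blocks: list) -> list:
    """M_i = D_K(C_i) XOR C_{i-1}; each block needs only two ciphertexts."""
    return [xor(D(key, c), prev) for prev, c in zip(ct_blocks, ct_blocks[1:])]


key = b"k" * 16
msg = [bytes([i]) * BLOCK for i in range(1, 5)]  # M_1 .. M_4
ct = cbc_encrypt(key, msg, os.urandom(BLOCK))

# Flip a single bit in C_2 and decrypt the damaged ciphertext.
error = b"\x01" + bytes(BLOCK - 1)
damaged = list(ct)
damaged[2] = xor(damaged[2], error)
recovered = cbc_decrypt(key, damaged)
```

Here `recovered[1]` (the block $M'_2$) is garbled, `recovered[2]` (the block $M'_3$) differs from the original only in the flipped bit, and the remaining blocks are unaffected. The list comprehension in `cbc_decrypt` also makes the parallel nature of decryption visible: no iteration depends on the result of another.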

Although the CBC mode solves the problem of repeated blocks in ECB, CBC still leaks information about the plaintext blocks when two ciphertext blocks match as follows:

$$C_{i+1} = C_{j+1} \Rightarrow E_K(C_i \oplus M_{i+1}) = E_K(C_j \oplus M_{j+1})$$
$$\Rightarrow C_i \oplus M_{i+1} = C_j \oplus M_{j+1}$$
$$\Rightarrow M_{i+1} \oplus M_{j+1} = C_i \oplus C_j$$

The attacker learns the XOR of the unknown plaintext blocks, $M_{i+1}$ and $M_{j+1}$, by observing the XOR of the known ciphertext blocks, $C_i$ and $C_j$. Such a match is likely to occur after $2^{n/2}$ blocks, which is infeasible for a cipher that has a long block size such as the 128-bit AES, but it is a problem for ciphers, like DES, having a block size of 64 bits.
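The collision leak can be made concrete by shrinking the block size until matches become frequent. The sketch below uses a toy 16-bit block cipher (a hash-based Feistel permutation, an assumption of this example), so that collisions are expected after roughly $2^{8}$ blocks. A fixed IV is used only to keep the demonstration repeatable; the variable and function names are hypothetical.

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def E16(key: bytes, block: bytes) -> bytes:
    """Toy 16-bit block cipher: 4-round Feistel on two 1-byte halves."""
    left, right = block[:1], block[1:]
    for rnd in range(4):
        f_out = hashlib.sha256(key + bytes([rnd]) + right).digest()[:1]
        left, right = right, xor(left, f_out)
    return left + right


def cbc_encrypt(key: bytes, msg_blocks: list, iv: bytes) -> list:
    """Return [C_0, C_1, ..., C_m] with C_0 = IV."""
    out = [iv]
    for m in msg_blocks:
        out.append(E16(key, xor(out[-1], m)))
    return out


def find_collisions(ct_blocks: list) -> list:
    """Return index pairs (a, b), a < b, with C_a == C_b (ignoring C_0)."""
    first_seen, pairs = {}, []
    for idx in range(1, len(ct_blocks)):
        c = ct_blocks[idx]
        if c in first_seen:
            pairs.append((first_seen[c], idx))
        else:
            first_seen[c] = idx
    return pairs


key = b"toy key"
msg = [i.to_bytes(2, "big") for i in range(3000)]  # msg[i-1] is M_i
ct = cbc_encrypt(key, msg, iv=b"\x12\x34")
pairs = find_collisions(ct)

# For every match C_a == C_b the attacker, who sees only ct, learns
# M_a XOR M_b = C_{a-1} XOR C_{b-1}.
leaks = [(a, b, xor(ct[a - 1], ct[b - 1])) for a, b in pairs]
```

Each entry of `leaks` is computed from ciphertext alone and equals the XOR of the two corresponding plaintext blocks. With a 64-bit block the same effect needs about $2^{32}$ blocks, which is why it is a practical concern for DES-sized blocks but not for 128-bit blocks.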

2.1.2.3 Output feedback (OFB) mode

The OFB mode [1, 20, 22, 23, 46] uses an IV that is encrypted in the first block to generate a keystream. The keystream encrypts the first plaintext block through an XOR operation. After that, this keystream is fed as an input to the second block to generate a different keystream. The second keystream is XOR-ed with the second plaintext block to produce the second ciphertext block. This encryption process is repeated as necessary to encrypt all plaintext blocks as shown in Figure 2.6. That is, the OFB mode uses the block cipher to form a stream cipher where the keystream is produced independently of both plaintext and ciphertext blocks.

OFB has two versions depending on whether the feedback is full or partial. With full feedback, the block cipher input is fully updated by the block cipher output of the previous block. On the other hand, with partial feedback, the block cipher input is considered as a shift register that is shifted by only t bits, which are taken from the block cipher output of the previous block. Algorithms of OFB mode with full feedback are described in Table 2.3. The IV in OFB is not secret, but should be used only once under a given key. Thus, a nonce used in OFB can be a simple counter. This requirement for IV is more relaxed than the IV in CBC mode.

Table 2.3: Encryption and decryption algorithms in OFB mode.

Algorithm 1: OFB encryption
1. $(M_1, \ldots, M_m) \xleftarrow{n} M$
2. $IV \leftarrow \mathrm{random}()$, $S_0 \leftarrow IV$
3. for $i \leftarrow 1$ to $m$ do
4.     $S_i \leftarrow E_K(S_{i-1})$
5.     $C_i \leftarrow S_i \oplus M_i$ od
6. return $C \leftarrow (IV, C_1, \ldots, C_m)$

Algorithm 2: OFB decryption
1. $(IV, C_1, \ldots, C_m) \xleftarrow{n} C$
2. $S_0 \leftarrow IV$
3. for $i \leftarrow 1$ to $m$ do
4.     $S_i \leftarrow E_K(S_{i-1})$
5.     $M_i \leftarrow S_i \oplus C_i$ od
6. return $M \leftarrow (M_1, \ldots, M_m)$

Figure 2.6: OFB encryption. The IV is encrypted repeatedly with $E_K$ to form the keystream blocks, each of which is XOR-ed with a plaintext block $M_i$ to give $C_i$.

OFB mode has the following properties:

• Padding: the message is not padded if its length is not a multiple of the block size. When the last block is incomplete, the most significant bits of the keystream are XOR-ed with this block without the need for padding.

• Identical plaintext blocks: as per CBC mode, choosing a fresh IV for every message encryption prevents identical plaintext blocks from encrypting to matched blocks.

• Backward function: unlike ECB and CBC modes, OFB mode does not use the backward function; it uses only $E_K$ for both encryption and decryption algorithms (see line 4 in Algorithms 1 and 2 in Table 2.3).

• Bit errors: errors in certain bits of $C_i$ affect only the same bits in the corresponding plaintext block when recovered. Errors do not propagate to subsequent blocks with either full or partial feedback. Compared to CBC and CFB modes, OFB mode is the best in recovering from bit errors.

• Parallelism: OFB mode cannot perform parallel computation since each block cipher call uses the previous call output as its input. However, all block cipher calls can be preprocessed in advance as they are independent of the plaintext and ciphertext blocks.

• Self-synchronous: OFB cannot self-synchronise if certain ciphertext bits are inserted or deleted, although the keystream remains correct.
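A minimal sketch of full-feedback OFB illustrates several of these properties at once. The forward function is again a toy hash-based Feistel permutation (an assumption of this example, not a real cipher), and `keystream` and `ofb_xor` are hypothetical names. Only the forward function is needed, the keystream can be computed before the message is known, and the same routine serves for encryption and decryption.

```python
import hashlib

BLOCK = 16


def xor(a: bytes, b: bytes) -> bytes:
    # zip() stops at the shorter input, so an incomplete last block is
    # combined with the leading bytes of the keystream and needs no padding.
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[: len(half)]


def E(key: bytes, block: bytes) -> bytes:
    """Toy forward function E_K (illustration only, not secure)."""
    left, right = block[: BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def keystream(key: bytes, iv: bytes, nblocks: int) -> bytes:
    """S_0 = IV, S_i = E_K(S_{i-1}); independent of the message."""
    state, out = iv, []
    for _ in range(nblocks):
        state = E(key, state)
        out.append(state)
    return b"".join(out)


def ofb_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Encrypts plaintext or decrypts ciphertext: both are a keystream XOR."""
    nblocks = -(-len(data) // BLOCK)  # ceiling division
    return xor(data, keystream(key, iv, nblocks))


key, iv = b"k" * 16, bytes(range(16))
msg = bytes(range(37))  # deliberately not a multiple of the block size
ct = ofb_xor(key, iv, msg)

# A single flipped ciphertext bit flips only the same plaintext bit.
damaged = bytearray(ct)
damaged[5] ^= 0x04
recovered = ofb_xor(key, iv, bytes(damaged))
```

The sequential loop in `keystream` shows why OFB cannot be parallelised, while its independence from `data` shows why the block cipher calls can be preprocessed.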

2.1.2.4 Cipher feedback (CFB) mode

CFB mode [1, 20, 22, 23, 46] is similar to OFB mode, except that the feedback is taken from the ciphertext whereas in OFB mode the feedback is taken from the block cipher output. This allows block ciphers to be used as self-synchronous stream ciphers. The basic idea of CFB mode is to encrypt an IV to generate the first keystream. Subsequent keystreams are generated by encrypting the ciphertext from the previous block. Each keystream is XOR-ed with a plaintext block to generate the corresponding ciphertext block.

Similar to OFB, the ciphertext feedback to a consecutive block can be full or partial in CFB mode. For partial feedback, if the block cipher input $S_i$ is encrypted to generate the ciphertext block $C_i$, then $S_{i+1}$ for the next block is updated, for instance, as follows:

$$S_{i+1} = \mathrm{lsb}_{n-t}(S_i) \,\|\, \mathrm{msb}_t(C_i), \quad 1 \leq t \leq n$$

Algorithms of CFB mode with full ciphertext feedback are described in Table 2.4 and the encryption process is illustrated in Figure 2.7.

Table 2.4: Encryption and decryption algorithms in CFB mode.

Algorithm 1: CFB encryption
1. $(M_1, \ldots, M_m) \xleftarrow{n} M$
2. $IV \leftarrow \mathrm{random}()$, $C_0 \leftarrow IV$
3. for $i \leftarrow 1$ to $m$ do
4.     $C_i \leftarrow E_K(C_{i-1}) \oplus M_i$ od
5. return $C \leftarrow (C_0, C_1, \ldots, C_m)$

Algorithm 2: CFB decryption
1. $(C_0, C_1, \ldots, C_m) \xleftarrow{n} C$
2. for $i \leftarrow 1$ to $m$ do
3.     $M_i \leftarrow E_K(C_{i-1}) \oplus C_i$ od
4. return $M \leftarrow (M_1, \ldots, M_m)$

Figure 2.7: CFB encryption. The IV, and then each previous ciphertext block, is encrypted with $E_K$ and XOR-ed with the plaintext block $M_i$ to give $C_i$.

CFB mode has the following properties:

• Padding: as per OFB mode, the message is not padded if its length is not a multiple of the block size; rather, the last incomplete block is XOR-ed with the most significant bits of the keystream.

• Identical plaintext blocks: as per CBC and OFB modes, choosing a fresh IV for every message encryption prevents identical plaintext blocks from producing matched ciphertext blocks.

• Backward function: CFB mode also uses only $E_K$ for both encryption and decryption algorithms (see lines 4 and 3 in Algorithms 1 and 2, respectively, in Table 2.4).

Bit errors: with full feedback, bit errors in C affect certain bits in the • i corresponding plaintext block where Ci had errors, and completely destroy

the consecutive block Ci+1. Errors do not propagate to subsequent blocks as each ciphertext block is used in just two adjacent blocks. However,

if the feedback is partial by t bits, errors in Ci affect certain bits in the

corresponding plaintext block where Ci had errors, and affect also at most ( n/t ) subsequent blocks till the error is shifted out the feedback state. d e Parallelism: CFB mode is similar to CBC mode in that it cannot perform • parallel computation during encryption, whereas the decryption process can be parallelised.

Self-synchronous: CFB is the only mode that can self-synchronise when • certain ciphertext bits are inserted or deleted from the ciphertext message. Depending on the number of feedback bits (t), CFB self-synchronises after only ( n/t + 1) blocks. d e
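The full-feedback algorithms of Table 2.4 and the bit-error property listed above can be sketched in Python. This is a minimal illustration under stated assumptions, not a secure implementation: `E` is a hypothetical stand-in for $E_K$ built from SHA-256 (a keyed pseudorandom function rather than a real block cipher), which suffices here only because CFB never calls the backward function. The helper names are also assumptions made for the example.

```python
import hashlib
import secrets

BLOCK = 16  # block size n in bytes (128 bits)

def E(key, block):
    # Hypothetical stand-in for E_K: a keyed PRF, NOT a real block cipher.
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor(a, b):
    # zip() stops at the shorter input, so a final partial block is
    # XOR-ed with the leading keystream bytes and no padding is needed.
    return bytes(x ^ y for x, y in zip(a, b))

def split_blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def cfb_encrypt(key, message, iv=None):
    prev = iv if iv is not None else secrets.token_bytes(BLOCK)  # C_0 = IV
    blocks = [prev]
    for m in split_blocks(message):
        prev = xor(E(key, prev), m)      # C_i = E_K(C_{i-1}) xor M_i
        blocks.append(prev)
    return blocks                        # (C_0, C_1, ..., C_m)

def cfb_decrypt(key, blocks):
    plain = b""
    for prev, c in zip(blocks, blocks[1:]):
        plain += xor(E(key, prev), c)    # M_i = E_K(C_{i-1}) xor C_i
    return plain

# Flip one bit of C_2 and observe how far the damage spreads.
key = secrets.token_bytes(16)
message = bytes(range(64))               # four full blocks M_1..M_4
ct = cfb_encrypt(key, message)
ct[2] = bytes([ct[2][0] ^ 0x01]) + ct[2][1:]
original = split_blocks(message)
damaged = split_blocks(cfb_decrypt(key, ct))
```

With these assumptions, the first block of `damaged` is unchanged, the second differs from the original in exactly the flipped bit, the third is garbled, and the fourth is correct again, which matches the full-feedback behaviour described in the list above.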

2.1.2.5 Counter (CTR) mode

CTR mode [1, 20, 22, 23, 46] is another mode where a block cipher is used as a keystream generator. CTR mode can be considered an OFB mode where block cipher inputs are updated as a counter instead of using the block cipher feedback. Users should ensure that a counter value is not used twice, since an attacker who knows one of two plaintext blocks encrypted under the same counter value can recover the other. One way to ensure this uniqueness is to define the counter value as $(IV \,\|\, CTR_i)$, where $CTR_i$ is an $s$-bit variable ($s < n$) that starts at zero and is incremented by one for every block cipher call. This enables CTR mode to encrypt at most $2^s$ blocks without changing $K$ or $IV$. CTR mode is fully parallel during encryption and decryption, so this mode is desirable for high speed applications. Algorithms of CTR mode are described in Table 2.5 and the encryption process is illustrated in Figure 2.8.

Table 2.5: Encryption and decryption algorithms in CTR mode.

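Algorithm 1: CTR encryption
1. $(M_1, \dots, M_m) \xleftarrow{n} M$
2. $IV \leftarrow \mathrm{random}()$, $S_0 \leftarrow (IV \,\|\, CTR_0)$
3. for $i \leftarrow 1$ to $m$ do
4.    $C_i \leftarrow E_K(S_{i-1}) \oplus M_i$
5.    $S_i \leftarrow (S_{i-1} + 1)$ od
6. return $C \leftarrow (S_0, C_1, \dots, C_m)$

Algorithm 2: CTR decryption
1. $(S_0, C_1, \dots, C_m) \xleftarrow{n} C$
2. for $i \leftarrow 1$ to $m$ do
3.    $M_i \leftarrow E_K(S_{i-1}) \oplus C_i$
4.    $S_i \leftarrow (S_{i-1} + 1)$ od
5. return $M \leftarrow (M_1, \dots, M_m)$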

Figure 2.8: CTR encryption.

CTR mode has the following properties:

• Padding: as per OFB and CFB modes, CTR mode does not require message padding; rather, the last incomplete block is XOR-ed with the most significant bits of the keystream.

• Identical plaintext blocks: changing the counter value results in different keystreams that generate different ciphertext blocks for matched plaintext blocks.

• Backward function: CTR mode shares with OFB and CFB modes the feature of using only the forward function for both encryption and decryption.

• Bit errors: if errors affect certain bits in $C_i$, only the corresponding plaintext block will have errors, in the same bit positions. These errors do not propagate to subsequent blocks as each block is computed separately.

• Parallelism: CTR mode is fully parallel for both encryption and decryption.

• Self-synchronous: CTR mode cannot self-synchronise if certain ciphertext bits are inserted or deleted from the ciphertext message.
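The CTR algorithms of Table 2.5, and the reason a counter value must never be reused, can be sketched as follows. As before, `E` is a hypothetical SHA-256 based stand-in for $E_K$ and not a real block cipher, and the choice of a 32-bit counter ($s = 32$) is an assumption made only for this example.

```python
import hashlib
import secrets

BLOCK = 16       # block size n in bytes (128 bits)
CTR_BYTES = 4    # s = 32 counter bits, so at most 2**32 blocks per (K, IV)

def E(key, block):
    # Hypothetical stand-in for E_K: a keyed PRF, NOT a real block cipher.
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def ctr_crypt(key, iv, data):
    # Encryption and decryption are the same operation in CTR mode.
    assert len(iv) == BLOCK - CTR_BYTES
    out = b""
    for i in range(0, len(data), BLOCK):
        # Counter block S = IV || CTR, with CTR starting at zero. to_bytes
        # raises OverflowError once the counter no longer fits in s bits.
        s = iv + (i // BLOCK).to_bytes(CTR_BYTES, "big")
        out += xor(E(key, s), data[i:i + BLOCK])
    return out

key = secrets.token_bytes(16)
iv = secrets.token_bytes(BLOCK - CTR_BYTES)
m1 = b"attack at dawn on the east gate"
m2 = b"retreat at dusk via west bridge"
c1 = ctr_crypt(key, iv, m1)
c2 = ctr_crypt(key, iv, m2)        # misuse: same (K, IV), so the counters repeat
recovered = xor(xor(c1, c2), m1)   # an attacker who knows m1 learns m2
```

Because both messages are encrypted under the same keystream, XOR-ing the two ciphertexts cancels the keystream, so `recovered` equals `m2` without any knowledge of the key.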

2.2 Integrity assurance provided by block ciphers

Unlike confidentiality, integrity assurance does not aim to prevent unauthorised parties from reading a message; instead, it detects unauthorised modifications to the message. This is achieved by using various cryptographic algorithms. The most common way in symmetric cryptography to provide an assurance of data integrity is by using message authentication codes (MACs). MACs are a special case of a type of cryptographic primitive called hash functions [50].

Hash functions are compression functions that take an arbitrary length message and generate a short fixed-length value, called a hash-value or tag, considered as the digital fingerprint of the message. This hash-value should uniquely represent a message such that any alteration to the message generates a different hash-value. In addition, hash functions should easily compute the hash-value, but computing the reverse transformation should be infeasible.

Hash functions consist of two classes: un-keyed and keyed hash functions. The main difference between them is that the hash value from an un-keyed function must be stored securely, whereas the message itself can be stored in an insecure place. That is, the integrity of the stored hash value should be protected so that an attacker who alters the message cannot also replace its hash value; the hash value calculated at the receiver side then differs from the stored one and the integrity test fails. On the other hand, in keyed hash functions, both a message and its hash value can be stored in an insecure place and can be transmitted over an insecure channel, but the key should be securely stored. Message authentication codes (MACs) are keyed hash functions.

2.2.1 Message authentication codes

The aim of a MAC is to detect whether the received message has been altered during transmission (i.e. by accidental or intentional modification). A MAC is a symmetric scheme that has three different algorithms: key generation, tagging and verification, as follows:

• Key generation algorithm: accepts no input and generates a $k$-bit string $K$, where $K \in \mathcal{K}$ and $\mathcal{K}$ is the set of possible keys that can be the output of the key generation algorithm.

• Tagging algorithm: takes two inputs, a secret key $K \in \{0,1\}^k$ and an arbitrary length message $M \in \{0,1\}^*$, and produces as an output a tag $T \in \{0,1\}^\tau$ as $T = \mathrm{MAC}(K, M)$. This is a deterministic algorithm that always generates the same $T$ provided that $K$ and $M$ remain the same.

• Verification algorithm: accepts three inputs, a secret key $K \in \{0,1\}^k$, an arbitrary length message $M \in \{0,1\}^*$ and a tag $T \in \{0,1\}^\tau$. Then, the algorithm computes a tag $T' = \mathrm{MAC}(K, M)$ based on the received message, and compares $T'$ to the received tag $T$. If the two tags are identical ($T = T'$), the algorithm returns 1, otherwise it returns 0.

The receiver of $(M, T)$ assumes that the message was not modified during transmission when the two tags match. An attacker who does not have access to the key should be unable to modify the message and produce a valid tag. That is, a MAC provides data integrity assurance. MAC operation is illustrated in Figure 2.9.
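The three algorithms above can be sketched with Python's standard library. HMAC-SHA256 is used here purely as one possible instantiation of $\mathrm{MAC}(K, M)$; the function names `keygen`, `tag` and `verify` are assumptions introduced for illustration.

```python
import hashlib
import hmac
import secrets

def keygen(k_bytes=16):
    # Key generation: no input, returns a random k-bit key (k = 128 here).
    return secrets.token_bytes(k_bytes)

def tag(key, message):
    # Tagging: deterministic, so the same (K, M) always yields the same T.
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, received_tag):
    # Verification: recompute T' from the received message and compare it
    # with the received tag T in constant time. Returns 1 or 0.
    expected = tag(key, message)
    return 1 if hmac.compare_digest(expected, received_tag) else 0

K = keygen()
M = b"transfer 100 to account 42"
T = tag(K, M)
accepted = verify(K, M, T)
rejected = verify(K, b"transfer 900 to account 42", T)
```

The comparison uses `hmac.compare_digest` rather than `==` so that the time taken does not depend on how many leading tag bytes match, a design choice that goes beyond what the text requires.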

2.2.2 Properties of message authentication codes

According to [1], a message authentication mechanism needs to have the following properties:

• Ease of computation: it is easy to compute the tag $T$ for a given message $M$ and key $K$.

• Compression: the MAC scheme digests an arbitrary length message to produce a fixed-length value.

For a MAC scheme to be secure, it should also provide the following properties (assuming an attacker cannot recover the key $K$):

• One-wayness: this feature ensures that if the MAC tag $T$ is known, it is difficult to find a message $M$ such that $\mathrm{MAC}(K, M) = T$ [1].

• Computation resistance: it is practically infeasible to compute a valid pair $(M_j, T_j)$ even if the attacker is given a set of pairs $(M_i, T_i)$ such that $M_j \neq M_i$ in all given pairs [1].

Figure 2.9: Message authentication code.

2.2.3 Construction of message authentication codes

Message authentication codes are constructed based on different cryptographic primitives. Secure MACs can be constructed using hash functions [51, 52]. One simple construction of this type is to concatenate the message with the key and compute the hash-value of the result. That is, for a hash function $h(\cdot)$, the MAC tag is computed as follows:

$T = h(M \,\|\, K)$ or $T = h(K \,\|\, M)$

This makes the MAC scheme as efficient as the underlying hash function. However, the flaw with this construction is that the MAC tag for a new message $M'$ appended to the message $M$ can be computed without the need to know the secret key $K$. Alternatively, the MAC can be constructed by following the Hash-then-Encrypt model, where a message $M$ is hashed first to generate an output-value, and then this output-value is encrypted using a block cipher to generate the MAC tag as follows:

$T = E_K(h(M))$

This construction is used as a standard in CCITT X.509 [53]. Another common hash-based MAC, called HMAC, was developed by Bellare et al. [54, 51] and calculates the MAC tag as follows:

$T = h(K_2 \oplus \mathrm{opad},\; h(K_1 \oplus \mathrm{ipad},\; M))$

where ipad and opad are fixed-length constants. The HMAC construction refined RFC-1828 into RFC-2104 as an internet standard to assure IP packet integrity [51].

MACs can also be constructed using a block cipher operated in a certain mode of operation, such as CMAC [55], IACBC [33], GMAC [3], OMAC [56], PMAC [57] and CBC-MAC [58]. The cipher block chaining-message authentication code (CBC-MAC) mode [58] is a popular mode that is widely used for integrity assurance purposes. CBC-MAC, as depicted in Figure 2.10, can be used with any symmetric block cipher and any arbitrary length message. It requires three keys: one key $K$ is used for the underlying block cipher, while two keys $(K_1, K_2)$ are used for XOR-ing with the final message block. If the last plaintext block has a length equal to the block size of the cipher, then $K_1$ is used. Otherwise, the last block is padded and then XOR-ed with $K_2$. The MAC tag $T$ is the most significant bits of the output from the last encryption call.

Other cryptographic primitives that use a secret key can also be used to construct a MAC scheme. MACs can use stream ciphers [59, 60], sponge constructions [61] or even dedicated constructions that are designed to provide integrity assurance, such as Chaskey [62].

Figure 2.10: CBC-MAC.
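Returning to the hash-based constructions in this section, the nested HMAC computation can be written out explicitly. The sketch below follows the standard RFC 2104 form, in which both the inner and the outer key are derived from a single key $K$ by XOR-ing with ipad and opad; SHA-256 as the hash function $h$ and the function name `hmac_sha256` are assumptions made for this example.

```python
import hashlib
import hmac

def hmac_sha256(key, message):
    block_size = 64                      # SHA-256 input block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(block_size, b"\x00")     # then zero-padded to one block
    inner_key = bytes(b ^ 0x36 for b in key)  # K xor ipad
    outer_key = bytes(b ^ 0x5C for b in key)  # K xor opad
    inner = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner).digest()

short_key = b"secret key"
long_key = b"k" * 100
msg = b"an IP packet payload"
t_short = hmac_sha256(short_key, msg)
t_long = hmac_sha256(long_key, msg)
```

The result can be checked against the library implementation `hmac.new(key, msg, hashlib.sha256).digest()`, which computes the same value. Because the key enters both the inner and the outer hash call, appending data to $M$ no longer lets an attacker extend a known tag, unlike the simple $h(K \,\|\, M)$ construction.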

2.3 Authenticated encryption

Several block cipher modes have been suggested, either to provide confidentiality or to provide integrity assurance. Schemes that provide both confidentiality and integrity assurance in a single design are called Authenticated Encryption (AE). For encryption, AE usually takes three inputs: a key, a nonce and a plaintext message, and generates a ciphertext message and an authentication tag as follows:

$\mathcal{E}(K, N, M) \Rightarrow (C, T)$

For decryption, the AE algorithm accepts four inputs: a key, a nonce, a ciphertext message and a tag, and returns the original plaintext message when the received message successfully passes the integrity assurance test. Otherwise, an invalid symbol ($\perp$) is returned. The AE decryption algorithm is defined as follows:

$\mathcal{D}(K, N, C, T) \Rightarrow M \text{ or } \perp$

Interfaces of both encryption and decryption algorithms are illustrated in Figure 2.11.

Figure 2.11: Authenticated encryption interface.

AE schemes have become an active research topic for several reasons. Firstly, AE schemes provide simultaneous confidentiality and integrity assurance, which gives a higher level of security than encryption-only schemes, and they should be more efficient than a straightforward combination of a confidentiality-only scheme and a MAC scheme. Secondly, AE designs are less prone to be used incorrectly than the combined schemes. Certain schemes that claimed to provide confidentiality and integrity assurance have been shown to have serious flaws. For instance, a version of the IPsec protocol [63] that uses CBC mode to provide both AE goals has been shown to be insecure [64, 65]. This lack of formal security analysis for AE schemes has motivated researchers to study AE security further and to propose more efficient AE designs.

Not all data in applications are required to be encrypted, such as packet headers in internet protocols. These headers should be in plaintext in order to be identified by different communicating parties. However, these data still require integrity assurance to prevent attacks that change such headers. Therefore, Rogaway [25] introduced authenticated encryption with associated data (AEAD), which is a generalisation of AE where the associated data are authenticated together with the plaintext message but not encrypted, as illustrated in Figure 2.12.

Figure 2.12: Authenticated encryption with associated data.

AE can be established using block ciphers, stream ciphers, permutations, sponge constructions, or even customised schemes. Each design approach has its characteristics, but block cipher-based schemes are commonly used to provide authenticated encryption. This thesis concentrates only on block cipher-based AE schemes.

2.3.1 Classification of AE schemes

AE schemes can be classified in different ways. This thesis follows the widely used approach that classifies AE proposals according to three parameters:

• the number of underlying components: confidentiality can be provided by one component, while integrity assurance is provided by another. Alternatively, both these security goals are provided using a single component.

• the number of secret keys: AE schemes can use two keys, one for each component, or they can use a single key in the entire design.

• the number of passes for data processing: if a scheme requires one data run for confidentiality and another separate run for integrity assurance, then the scheme is a two-pass scheme. Unlike two-pass schemes, one-pass schemes use only a single run over the data to provide both confidentiality and integrity assurance.

Based on this classification, AE schemes can be divided into four classes. First, a straightforward approach to construct an AE scheme is to combine an encryption scheme with a MAC scheme such that each scheme uses its own key. Such constructions, called generic composition, are two-pass and use two different keys. Second, certain two-pass schemes follow the generic composition approach, where data are processed twice (once to provide confidentiality and once to provide integrity assurance), except that they use a single key in the two combined schemes. Third and fourth, a more efficient approach is to use a single data run for both encryption and MAC operations. Such schemes are called one-pass or single-pass AE schemes, and they can use either one key or two keys. This classification of AE schemes is shown in Figure 2.13.

Figure 2.13: Classification of AE schemes. (Two-pass, two keys: generic composition. Two-pass, one key: CCM, EAX, CWC, GCM. One-pass, two keys: IAPM. One-pass, one key: OCB, EPBC.)

2.3.2 Generic composition

Generic composition schemes seem less complex to establish. The user has the ability to independently choose encryption and integrity assurance schemes [24]. Bellare and Namprempre [17, 24] classify generic composition AE schemes into three categories (illustrated in Figure 2.14) as follows:

• MAC-then-Encrypt (MtE): In this technique, a sender calculates the MAC tag $T$ of a plaintext message $M$ using a key $K_1$ before encrypting both the plaintext message and the tag under a different key $K_2$. The receiver decrypts the ciphertext message and computes the MAC tag of the recovered plaintext message. Then, the calculated tag is compared with the received one to verify the message integrity.

$T = \mathrm{MAC}(K_1, M)$
$C = E_{K_2}(M \,\|\, T)$

• Encrypt-and-MAC (E&M): This method encrypts the plaintext and computes its MAC tag simultaneously. At the receiving side, the ciphertext message is decrypted first, and then the MAC tag is verified based on the recovered plaintext message.

$C' = E_{K_2}(M)$
$T = \mathrm{MAC}(K_1, M)$
$C = C' \,\|\, T$

• Encrypt-then-MAC (EtM): This technique is the opposite of the MtE approach. Firstly, the sender encrypts the plaintext message, computes the MAC tag of the ciphertext, and finally appends the tag to the ciphertext message for transmission. At the receiver side, the verification of the MAC tag is done first, and the ciphertext message is decrypted only when it passes the integrity assurance test.

$C' = E_{K_2}(M)$
$T = \mathrm{MAC}(K_1, C')$
$C = C' \,\|\, T$

Figure 2.14: Generic composition AE schemes (MAC-then-Encrypt, Encrypt-and-MAC, Encrypt-then-MAC).
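As an illustration of the Encrypt-then-MAC composition, the following sketch combines a toy keystream cipher with HMAC-SHA256 under two independent keys. The SHA-256 based keystream is an assumption standing in for $E_{K_2}$ and is not a vetted cipher, the function names are hypothetical, and `None` plays the role of the invalid symbol $\perp$. The tag in this sketch also covers a fixed-length nonce, a choice made here so that the nonce cannot be altered undetected.

```python
import hashlib
import hmac
import secrets

def keystream(k2, nonce, length):
    # Toy counter-based keystream standing in for E_{K2}; illustrative only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(k2 + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def etm_encrypt(k1, k2, nonce, message):
    stream = keystream(k2, nonce, len(message))
    c = bytes(a ^ b for a, b in zip(message, stream))          # C' = E_{K2}(M)
    t = hmac.new(k1, nonce + c, hashlib.sha256).digest()       # T = MAC(K1, C')
    return c, t

def etm_decrypt(k1, k2, nonce, c, t):
    expected = hmac.new(k1, nonce + c, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, t):
        return None            # integrity test failed: nothing is decrypted
    stream = keystream(k2, nonce, len(c))
    return bytes(a ^ b for a, b in zip(c, stream))

k1, k2 = secrets.token_bytes(16), secrets.token_bytes(16)
nonce = secrets.token_bytes(12)
message = b"header-less payload"
c, t = etm_encrypt(k1, k2, nonce, message)
tampered = bytes([c[0] ^ 0x01]) + c[1:]
```

Decrypting `(c, t)` returns the original message, while decrypting `(tampered, t)` returns `None` before any keystream is generated, which mirrors the verify-first order of the EtM receiver.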

Another aspect to consider when comparing these three constructions is the treatment of the IV. When an IV is used in an AE scheme, both E&M and EtM require the integrity of the IV to be protected in order to prevent attacks that change the IV while the MAC tag remains valid. Hence, the IV should be included in the input to the MAC algorithm when generating the MAC tag. The situation is different with MtE, which does not require the IV to be authenticated: changing the IV in MtE produces an invalid plaintext message and an invalid MAC tag during decryption, and consequently the integrity test fails.

These three constructions are natural choices and easy to analyse. Bellare and Namprempre [17] analyse the security of these three approaches for a black-box choice of an encryption scheme and a MAC. They consider three confidentiality notions as defined in [66, 67, 68]:

• Indistinguishability-chosen plaintext attack (IND-CPA): the inability of an adversary to distinguish which plaintext message was encrypted based on a ciphertext message which results from encrypting one of two chosen plaintext messages.

• Indistinguishability-chosen ciphertext attack (IND-CCA): the inability of an adversary to distinguish which plaintext message was encrypted even if the adversary can encrypt and decrypt arbitrary messages before getting the challenge ciphertext message.

• Non-malleability-chosen plaintext attack (NM-CPA): the inability of an adversary to construct a different ciphertext that decrypts to a plaintext message related to the chosen plaintext message of the given ciphertext challenge.

Bellare and Namprempre [17] introduce two notions for integrity assurance:

• Integrity-plaintext (INT-PTXT): the inability to produce a valid $(C, T)$ pair that decrypts to a plaintext message never encrypted before.

• Integrity-ciphertext (INT-CTXT): the inability to produce a valid $(C, T)$ pair that was not previously produced, irrespective of whether the corresponding plaintext message is new or not.

Two types of MACs, strongly unforgeable (SUF-CMA) and weakly unforgeable (WUF-CMA), are used in the security analysis of generic composition constructions. For a weakly unforgeable MAC (WUF-CMA), a forgery $(M_i, T_i)$ counts only if $M_i$ was never queried to the tagging algorithm, while for a strongly unforgeable MAC (SUF-CMA), a forgery also counts when $M_i$ was queried, provided that $T_i$ was not returned by the tagging algorithm for $M_i$ [69]. The security of the three generic composition constructions is compared in Table 2.6 under both SUF-CMA and WUF-CMA MAC schemes. The (✓) mark indicates that the AE composition is secure, whereas the (×) mark indicates that the composition is insecure. Note that the E&M scheme is only secure for the INT-PTXT notion, whereas MtE satisfies both the IND-CPA and INT-PTXT notions. The generic composition EtM scheme satisfies all security notions under SUF-CMA, which makes it a better choice to construct an AE scheme. The EtM scheme has been adopted into the ISO/IEC 19772 standard.

However, generic composition constructions can lead to serious flaws in implementation (see e.g. [70]). Improper combination of an encryption scheme and a MAC scheme may make the whole design insecure. In addition, this approach requires two different keys and processes the data twice, once for encryption and once for integrity assurance, which increases the computational cost. Therefore, dedicated AE designs that are more efficient than generic composition constructions have been proposed.
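One way to see why E&M fails the IND-CPA notion is that a deterministic MAC computed over the plaintext reveals whether two ciphertexts carry the same message. The sketch below demonstrates this with HMAC-SHA256 and a toy SHA-256 based keystream; both are assumptions chosen for illustration, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def keystream(k2, nonce, length):
    # Toy keystream standing in for the encryption scheme; illustrative only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(k2 + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def e_and_m_encrypt(k1, k2, nonce, message):
    stream = keystream(k2, nonce, len(message))
    c = bytes(a ^ b for a, b in zip(message, stream))   # C' = E_{K2}(M)
    t = hmac.new(k1, message, hashlib.sha256).digest()  # T = MAC(K1, M)
    return c, t

k1, k2 = secrets.token_bytes(16), secrets.token_bytes(16)
c_a, t_a = e_and_m_encrypt(k1, k2, secrets.token_bytes(12), b"yes")
c_b, t_b = e_and_m_encrypt(k1, k2, secrets.token_bytes(12), b"yes")
c_c, t_c = e_and_m_encrypt(k1, k2, secrets.token_bytes(12), b"no!")

# An eavesdropper compares tags only, without any key.
same_message_ab = (t_a == t_b)
same_message_ac = (t_a == t_c)
```

Although fresh nonces are used for each encryption, the tags of the two encryptions of `b"yes"` coincide, so an adversary in the IND-CPA game can tell which of two chosen plaintexts was encrypted. In EtM the tag is computed over the ciphertext, so this leak does not arise.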

2.3.3 Dedicated AE schemes

Table 2.6: Security of generic composition AE schemes [17].

Composition               MtE                 E&M                 EtM
MAC                  WUF-CMA  SUF-CMA    WUF-CMA  SUF-CMA    WUF-CMA  SUF-CMA
Confidentiality
  IND-CPA               ✓        ✓          ×        ×          ✓        ✓
  IND-CCA               ×        ×          ×        ×          ×        ✓
  NM-CPA                ×        ×          ×        ×          ×        ✓
Integrity assurance
  INT-PTXT              ✓        ✓          ✓        ✓          ✓        ✓
  INT-CTXT              ×        ×          ×        ×          ×        ✓

This section presents some existing AE schemes. All the listed schemes use a single key, except IAPM, and a nonce to provide AE, but the first four schemes are two-pass, whereas the last three are one-pass AE schemes. These schemes, especially GCM, are considered as a reference for newly proposed AE designs, which should be more efficient and avoid the cryptographic weaknesses of these dedicated AE schemes.

2.3.3.1 CCM

The counter with CBC-MAC (CCM) mode is an AEAD mode recommended by NIST that can be used with any 128-bit block cipher. CCM provides authenticated encryption by combining the CTR mode for encryption/decryption processes, and the CBC-MAC mode for integrity assurance. CCM mode follows MAC-then-Encrypt composition and uses only the forward function of the underlying block cipher, but CCM cannot perform online or parallel computation. Another drawback of the CCM mode is that when the associated data are fixed, CCM still authenticates the associated data every time. That is, preprocessing associated headers is not supported in CCM mode. CCM is standardised in NIST SP800-38C and in ISO/IEC 19772. Criticism regarding CCM efficiency and complexity is outlined by Rogaway and Wagner [71]. For more details about CCM mode, refer to [26, 2].

2.3.3.2 EAX

The extended message encrypted authentication (EAX) mode is also an AEAD mode that provides authenticated encryption by combining the CTR mode for encryption/decryption processes, and a version of CBC-MAC mode, called OMAC [56], for integrity assurance. Similar to CCM, EAX is not parallelisable, and uses only the forward function of the underlying block cipher. However, EAX mode has other important features compared to CCM. EAX is an online scheme and is not restricted to block ciphers of a particular block length. EAX accepts messages of arbitrary length that are not necessarily multiples of the cipher block length. The nonce length in EAX is not limited and the authentication tag can be of any length as long as it does not exceed the cipher block length. If the associated data $A$ are fixed, then EAX processes $A$ only once for integrity assurance and uses this preprocessing in future runs. Such factors make EAX more effective than CCM mode. EAX is standardised in ISO/IEC 19772. For more details about EAX mode, refer to [32].

2.3.3.3 CWC

The Carter-Wegman counter (CWC) mode is an AEAD mode that follows Encrypt-then-MAC composition. CWC mode provides AE by combining the CTR mode for encryption/decryption processes, and the Carter-Wegman universal hash function for integrity assurance. Unlike EAX, CWC mode cannot accept arbitrary length messages and can be used only with block ciphers of block size 128, 192 or 256 bits. On the other hand, CWC is online and parallelisable. CWC also supports associated data preprocessing and is more efficient than EAX in hardware implementations. For more details about CWC mode, refer to [72].

2.3.3.4 GCM

The Galois/Counter mode (GCM) is an AEAD mode that also follows Encrypt-then-MAC composition. GCM utilises the CTR mode for encryption/decryption processes, and a hash function in the Galois finite field, called GHASH, for integrity assurance. GCM is an online, parallelisable scheme recommended by NIST that can be used with any 128-bit block cipher. GCM mode uses only the forward cipher function, and the integrity assurance phase can be pre-computed when the associated data are fixed. GCM has high performance both in hardware and software implementations, which makes this mode the best option for high speed applications. GCM is standardised in NIST SP800-38D and in ISO/IEC 19772. However, GCM is vulnerable to cycling attacks when the powers of certain authentication keys used in GHASH repeat after short cycles [4]. Such “weak” keys lead to a forgery attack on the MAC of GCM. For more details about GCM mode, refer to [27, 3].

2.3.3.5 IAPM

The integrity aware parallelizable mode (IAPM) was the first scheme that provided AE using one pass. This mode uses two keys, one for the block cipher encryption and decryption functions and the other to generate pairwise keystreams or masks. These masks are XOR-ed both with the block cipher inputs and outputs to prevent any information leakage. Integrity assurance is provided by encrypting the checksum of the plaintext blocks using a different mask. IAPM mode is an online, parallelisable scheme that is very efficient in software and hardware implementations and uses a minimum number of block cipher calls. However, IAPM is not AEAD, is patented, uses both forward and backward functions of the underlying block cipher, and cannot process arbitrary length messages. For more details about IAPM mode, refer to [33].

2.3.3.6 OCB

The offset codebook mode (OCB) is an AE scheme that has three different designs: OCB1, OCB2 and OCB3. The first design, OCB1, is based on IAPM mode, but uses a single key both for mask generation and encryption/decryption functions. OCB1 is a one-pass, online, parallelisable mode that accepts arbitrary message lengths and enables lightweight and efficient implementations. However, OCB1 is not an AEAD scheme. Only OCB2 and OCB3 provide AEAD. OCB2 is standardised in ISO/IEC 19772, while OCB3 is the RFC 7253 standard [73]. OCB1 was a candidate to be standardised by NIST as an AE mode, but GCM and CCM modes were chosen. This may be due to the patent coverage for OCB1. For more details about OCB designs, refer to [28, 10, 29, 30].

2.3.3.7 EPBC

The efficient error-propagating block chaining (EPBC) mode was proposed by Zuquete and Guedes as an AE mode. EPBC is a one-pass scheme, but is based on error propagation and an integrity check vector to provide integrity assurance. It is an online, unparallelisable scheme that requires both forward and backward functions. In 2007, Mitchell [74] noted that the internal function ($g$) of EPBC mode cannot generate a bit pair (0, 1). Therefore, he states that this flaw can be exploited in forging ciphertext that yields a valid integrity check vector. However, the claim is refuted by Di et al. [75], who prove that Mitchell’s analysis has some lapses, and confirm that EPBC is still secure. For more details about EPBC mode, refer to [76].

2.3.4 Comparison between dedicated AE modes

This section compares the characteristics of the AE modes described in Section 2.3.3. A summary of the characteristics is outlined in Table 2.7. This table shows that CWC and GCM are the only two modes that are parallelisable, online and patent free. Compared to CWC and GCM, proposals such as IAPM, OCB1 and EPBC are more efficient as they are one-pass AE modes and require significantly fewer calls of the underlying block cipher. The most efficient of these is OCB1, which only requires $m + 2$ calls of the underlying block cipher. However, OCB1 has patent coverage by Philip Rogaway.
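To make the efficiency comparison concrete, the call counts listed in Table 2.7 can be evaluated for a sample message. In this sketch $m$ is the number of message blocks, and $d$ is assumed to be the number of associated data blocks, an interpretation that is not spelled out in this section. Rounding $\delta = \log_2(m + 1)$ up to an integer is also an assumption, and the function name is hypothetical.

```python
import math

def block_cipher_calls(m, d):
    # m: number of message blocks; d: assumed number of associated data blocks.
    delta = math.ceil(math.log2(m + 1))   # rounding up is an assumption
    return {
        "CCM": 2 * m + d + 2,
        "EAX": 2 * m + d + 1,    # assuming |N| = n
        "CWC": 2 * m + d + 2,    # one of the two m terms counts hash calls
        "GCM": 2 * m + d + 1,    # one of the two m terms counts hash calls
        "IAPM": m + 2 + delta,
        "OCB1": m + 2,
        "EPBC": m + 3,
    }

calls = block_cipher_calls(m=63, d=2)
fewest = min(calls, key=calls.get)
```

For $m = 63$ and $d = 2$ the two-pass modes need about $2m$ calls (130 for CCM), while the one-pass modes stay close to $m$ (65 for OCB1), which is the gap the comparison above refers to.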

2.3.5 CAESAR competition

Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR) is a cryptographic competition announced in January 2013 to identify a portfolio of AE schemes that are more efficient than the AES-GCM mode, and can be used in a vast range of applications [6]. The proposed AE schemes are expected to have sophisticated features both in software and hardware applications. In addition to provable security against known attacks, they are expected to be efficient, robust and practical.

CAESAR is a continuation of a tradition followed by the cryptographic community, which usually launches competitions to enhance research and understanding of certain cryptographic primitives. For instance, NIST organised a competition in 1997 to specify an advanced encryption standard (AES) to protect sensitive government information [77]. Out of fifteen proposals, the Rijndael design won this competition. After that, the eSTREAM competition [78] was announced in 2004 to identify new stream ciphers suitable for software and hardware applications. eSTREAM chose a portfolio consisting of seven proposals, four software and three hardware, as winners of this competition out of the 34 submitted ciphers. Later in 2007, NIST announced the SHA-3 competition to specify a new hash algorithm, and Keccak won this competition out of the total 51 submissions [79].

Table 2.7: Comparison between certain dedicated AE schemes.

Mode                    CCM             EAX             CWC              GCM             IAPM            OCB1            EPBC
Mode type               AEAD            AEAD            AEAD             AEAD            AE              AE              AE
Block size              128-bit         n-bit           128/192/256-bit  128-bit         n-bit           n-bit           2n-bit
Block cipher function   E_K             E_K             E_K              E_K             E_K, D_K        E_K, D_K        E_K, D_K
Keys                    One             One             One              One             Two             One             One
One- or two-pass        Two             Two             Two              Two             One             One             One
Parallelisable          No              No              Yes              Yes             Yes             Yes             No
Online                  No              Yes             Yes              Yes             Yes             Yes             Yes
Patent free             Yes             Yes             Yes              Yes             No              No              Yes
Preprocessing of A      No              Yes             Yes              Yes             -               -               -
Number of calls         2m + d + 2      2m + d + 1 ⋆    2m† + d + 2      2m† + d + 1     m + 2 + δ       m + 2           m + 3

† one m indicates the number of calls of the underlying hash function.
⋆ assuming |N| = n, otherwise the number of calls is 2m + d + ⌈|N|/n⌉.
δ is equal to log2(m + 1).

Then, the Password Hashing Competition (PHC) was launched in 2013 to specify a password hashing standard, and Argon2 was selected from 24 submissions as the winner of this competition [80].

2.3.5.1 CAESAR requirements

CAESAR [6] expects submissions to fulfil the following requirements:

• Accept five inputs: three are mandatory (plaintext M, associated data A and key K) and two are optional (the secret message number and the public message number IV/N).

• Generate one variable length output: ciphertext.

• Provide integrity assurance for associated data, plaintext, public message number and secret message number.

• Provide confidentiality for plaintext and secret message number.

• The designer is allowed to specify a maximum length for M and A, but the maximum length of each variable must be at least 65536 bytes.

• The designer is allowed to recommend certain parameter sets, but must not exceed 10 recommendations.

2.3.5.2 CAESAR submissions

CAESAR has three rounds before the announcement of the finalists. The competition received 57 submissions in the first round on 15 March 2014. Only 30 AE proposals reached the second round on 7 July 2015. The third round was announced on 15 August 2016 with only 16 candidates. From these 16 candidates, only 7 were chosen as finalists on 5 March 2018. A summary of the AE submissions in each round, categorised by the underlying design approach, is given in Table 2.8.

Table 2.8: AE candidates during CAESAR rounds.

Design approach         First round   Second round   Third round   Finalist
Block cipher                 30            13              8            3
Dedicated ⋆                   4             3              3            2
Stream cipher                 9             3              1            1
Compression function          1             1              0            0
Permutation                   3             2              0            0
Sponge                       10             8              4            1
Total                        57            30             16            7

⋆ dedicated ciphers are AEGIS, MORUS, Tiaoxin and POLAWIS.

2.3.6 Selected block cipher-based AE modes

This section describes the features of certain block cipher AE proposals in the CAESAR competition in Table 2.9 that will be considered later in the contribu- tion chapters of this thesis. Each chapter provides detailed description of relevant schemes. The proposals are characterised according to four criteria:

• AES-based: This feature determines whether the underlying block cipher is AES, AES4 or AES10. The functions AES4 and AES10 are AES functions consisting of four and ten rounds, respectively, and are discussed in more detail in Chapter 6.

• Mode: This criterion states the type of mode used by the proposal, which can be:

  – ECB: Electronic codebook [22]
  – OTR: Offset two-round Feistel [8, 9]
  – XEX: XOR-Encrypt-XOR [10]
  – XE: XOR-Encrypt [10]

• Masking: Just as block ciphers use input and output whitening, modes use masking to increase security. The masking techniques can be:

  – doubling: masks are doubled before being applied to both the input and output of the underlying block cipher. The multiplication is usually performed in a finite field F_{2^n}.
  – AX: a combination of addition modulo 2^n and XOR operations.

• Parameters: This criterion specifies the length of the key, nonce and authentication tag for each mode, and the maximum length for M and A in bits.
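The doubling technique listed above can be sketched in a few lines. This is only an illustration under stated assumptions: the field is taken to be F_{2^128} with the reduction polynomial x^128 + x^7 + x^2 + x + 1 (a common choice for 128-bit blocks; individual candidates may specify a different representation), and `toy_block_cipher` and `xex_encrypt` are hypothetical names, with the former standing in for AES and offering no security.

```python
# Sketch of "doubling" masks in GF(2^128) and an XEX-style block call.
# Assumptions (not taken from the thesis): reduction polynomial
# x^128 + x^7 + x^2 + x + 1, and a toy keyed map in place of AES.

BLOCK_BITS = 128
MASK_128 = (1 << BLOCK_BITS) - 1
REDUCTION = 0x87  # low-order terms x^7 + x^2 + x + 1


def double(mask):
    """Multiply a 128-bit field element by x (that is, by 2)."""
    shifted = (mask << 1) & MASK_128
    if mask >> (BLOCK_BITS - 1):      # top bit was set: reduce
        shifted ^= REDUCTION
    return shifted


def toy_block_cipher(key, block):
    """Hypothetical stand-in for E_K; a toy map, NOT secure."""
    odd_constant = 0x9E3779B97F4A7C15F39CC0605CEDC835
    return ((block ^ key) * odd_constant + key) & MASK_128


def xex_encrypt(key, first_mask, blocks):
    """Compute C_i = E_K(M_i xor D_i) xor D_i, doubling the mask D_i per block."""
    out, mask = [], first_mask
    for block in blocks:
        mask = double(mask)
        out.append(toy_block_cipher(key, block ^ mask) ^ mask)
    return out
```

An XE-style call would apply the mask to the block cipher input only, dropping the final `^ mask`.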

Table 2.9 shows that all these modes, except ++AE, are based on the concept of XEX/XE or OTR modes where masks are updated using multiplication in a finite field. ++AE uses a cross-chaining mechanism that combines addition modulo 2^n and XOR operations to provide integrity assurance. This technique is computationally efficient but has serious flaws, as will be shown in Chapter 3. Regarding the candidates in Table 2.9, ++AE was in the first round of CAESAR, SHELL reached the second round, and AES-OTR and AEZ reached the third round. Only OCB was chosen among the finalists. Before the announcement of the third round, AES-COPA and ELmD were merged into one submission, called COLM, and COLM was also chosen among the finalists.

Table 2.9: Overview of certain block cipher-based AE candidates in CAESAR.

Candidate   AES-based     Mode   Masking    |K|               |N|        |T|             |M|      |A|
++AE        Yes           ECB    AX         128               64         128             < 2^67   < 2^67
AES-OTR     Yes           XE     doubling   {128, 256}        96         128             ≤ 2^67   ≤ 2^67
OCB         Yes           XEX    doubling   {128, 192, 256}   128        {64, 96, 128}   V        V
AES-COPA    Yes           XEX    doubling   128               128        128             V        V
ELmD        Yes           XEX    doubling   128               64         128-255         ≤ 2^67   ≤ 2^67
SHELL       Yes           XEX    doubling   128               {64, 80}   128             < 2^66   < 2^66
AEZ         AES4, AES10   OTR    -          128               96         128             V        V

V: variable length

2.4 Cryptanalytic attacks

An attack is a deliberate attempt to breach the security of a cryptosystem [81]. Cryptanalysis is a vast research field that aims to break or analyse the security of cryptographic algorithms. Without research into cryptanalysis, the community would not be able to improve the security of ciphers. Cryptanalytic attacks use different models, have diverse goals and apply various techniques.

2.4.1 Attack goals

Attacks are mounted for different goals. This section lists the main objectives as follows:

• Key recovery: This attack aims to recover parts of or the entire secret key. Retrieving the secret key of a cryptographic cipher compromises the security completely, as the attacker can recover the plaintext messages corresponding to all past and future ciphertext messages, and satisfy the other objectives listed in this section.

• Distinguishing attacks: Indistinguishability means that an attacker given plaintext messages will not be able to distinguish their corresponding ciphertext messages [66, 67, 68]. These attacks do not necessarily retrieve the secret key, but they show certain weaknesses in the cryptographic scheme used.

• Forgery of ciphertext or MAC: These attacks aim to alter the ciphertext message or its MAC tag, or both the message and the MAC, so that the decryption algorithm will not detect the modification and will accept the message as authentic. The attacker does not need to recover the secret key to perform such forgeries.

• State recovery: These attacks are more effective against stream ciphers than block ciphers. Retrieving the internal state permits an attacker to compute the keystream and recover the plaintext message. In addition, some stream ciphers load the secret key into the internal state during the initialisation phase. Hence, recovering the state may enable the attacker to invert the cipher and compute backward to the initial state, which contains the secret key.

2.4.2 Attack models

Cryptanalytic attacks are classified into five attack models based on the access of an attacker to plaintext and ciphertext messages of a cryptographic algorithm. For all these attack models, we assume that the underlying algorithm is known to the attacker, whether this algorithm is an encryption scheme, MAC, etc., whereas the secret key is assumed to be unknown. These five attack models are:

1. Ciphertext-only attack: The attacker knows a set of ciphertext messages without knowing or having control over their corresponding plaintext messages. This attack requires only a passive wiretapper, and is the weakest attack model since it gives the attacker the least information. Schemes vulnerable to this attack are considered completely insecure. This attack is also called a known-ciphertext attack.

2. Known-plaintext attack: A set of plaintext messages and their corresponding ciphertext messages are known by the attacker. This attack exploits the way that these known plaintext messages are encrypted in order to deduce further secret information.

3. Chosen-plaintext attack: The attacker can nominate a set of plaintext messages and request their corresponding ciphertext messages in order to deduce information about ciphertext messages never requested before or about the secret key.

4. Adaptive chosen-plaintext attack: This attack is a chosen-plaintext attack where the attacker can choose plaintext messages based on ciphertext messages from previous requests. The plaintext messages are not chosen randomly; rather, they are correlated with the outputs from previous requests.

5. Chosen-ciphertext attack: This attack is the opposite of the chosen-plaintext attack: the attacker can nominate a set of ciphertext messages and obtain their corresponding plaintext messages.

2.4.3 Confidentiality attacks

Various cryptanalysis methods have been proposed to breach the confidentiality of cryptographic algorithms. The straightforward option is to search exhaustively for the secret key, as explained first in the following list. This option requires a lot of time and huge computational power. Alternatively, other attack techniques exploit the structure of the targeted cryptographic algorithm, and succeed if the required time or data complexity is smaller than the corresponding complexity of the exhaustive key search. Certain well-known cryptanalysis methods are listed as follows:

1. Exhaustive key search: These attacks always exist in theory, due to the fact that the length of keys and blocks in any block cipher is limited. An attacker requires a few known plaintext-ciphertext pairs to search all possible keys until one key is found that encrypts the given plaintext message to its corresponding ciphertext. For a k-bit key, the attacker should perform at most 2^k operations to find the correct key, which is infeasible with current computational power and resources for ciphers having a long key, such as 128 bits. Cryptographic algorithm designers aim to make this attack the best available analysis to an attacker.

2. Differential attack: This attack is a chosen-plaintext attack that can be applied to a wide range of cryptographic primitives. Differential analysis was known several decades ago. However, the first published effort to use this attack was proposed by Murphy against the FEAL cipher in 1990 [82]. Then, Biham and Shamir [83, 84] proposed a differential analysis to break DES requiring 2^47 chosen plaintext messages. The concept of the differential attack is to observe the effect of a particular difference in a plaintext pair on the difference between their corresponding ciphertexts. This observation enables an attacker to assign a probability to each key candidate such that the correct key should have the maximum value.

3. Linear attack: This attack is a known-plaintext attack that exploits a linear relation between a subset of plaintext bits and a subset of bits in the internal state or the ciphertext message. This relation assigns different probabilities to all possible values of a key subset such that the correct subset has the highest probability. Matsui [85] used this attack to retrieve the DES key using 2^43 plaintext/ciphertext pairs.

4. Algebraic attack: This attack expresses a cryptographic cipher as a set of polynomial equations that have a certain number of variables. The attacker can solve the equations if he can obtain enough known plaintext/ciphertext pairs so that the numbers of equations and variables are the same. This attack was first used by Kipnis and Shamir [86] to attack the Hidden Field Equations public key cryptosystem.

5. Interpolation attack: This attack basically represents the ciphertext message as a polynomial function of the plaintext message. To do so, the s-boxes of a block cipher are represented as algebraic functions, especially quadratic functions, using both plaintext and ciphertext messages. If the polynomial degree is d, then the attacker requires d + 1 known plaintext/ciphertext pairs. An n-bit block cipher is vulnerable to this attack if the number of coefficients is less than or equal to 2^n. This attack was first proposed by Jakobsen and Knudsen [87].

6. Related-key attack: This attack is a known-plaintext attack that analyses the cipher security under different mathematically related keys. The keys themselves are still unknown to the attacker, but the attacker knows the mathematical relation between these keys so that he can trace the difference between related keys through the key schedule function of the targeted cipher. This attack was first introduced by Biham [88] attacking the LOKI and Lucifer ciphers to reduce the complexity of exhaustive search attacks by a factor of three.
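To make the exhaustive key search from item 1 concrete, the following sketch brute-forces a deliberately tiny cipher. Everything here is an assumption for illustration: `toy_encrypt` is a hypothetical 16-bit cipher with a 16-bit key (so the search space is only 2^16), and it is not one of the ciphers discussed in this thesis.

```python
# Exhaustive key search against a hypothetical 16-bit toy cipher.
# With a k-bit key the loop below runs at most 2^k times; k = 16 here.

KEY_BITS = 16
WORD = 0xFFFF


def rotl16(value, shift):
    """Rotate a 16-bit word left by `shift` bits."""
    shift %= 16
    return ((value << shift) | (value >> (16 - shift))) & WORD


def toy_encrypt(key, block):
    """Toy add-rotate-xor cipher; illustrative only, NOT secure."""
    x = block & WORD
    for rnd in range(4):
        x = (x + rotl16(key, 3 * rnd) + rnd) & WORD
        x = rotl16(x, 5)
        x ^= x >> 7
    return x


def exhaustive_key_search(known_pairs):
    """Return every key consistent with all known (plaintext, ciphertext) pairs."""
    candidates = []
    for key in range(1 << KEY_BITS):
        if all(toy_encrypt(key, p) == c for p, c in known_pairs):
            candidates.append(key)
    return candidates


if __name__ == "__main__":
    secret_key = 0xBEEF
    pairs = [(p, toy_encrypt(secret_key, p)) for p in (0x0001, 0x1234, 0xFFFF)]
    print([hex(k) for k in exhaustive_key_search(pairs)])
```

The function returns all surviving candidates rather than the first match, since a few known pairs may leave more than one consistent key; each extra bit of key length doubles the work of the loop.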

2.4.4 Integrity assurance attacks

MAC schemes produce a fixed-length tag, which may make some attacks possible. Therefore, the MAC designer's job is to make the following attacks computationally infeasible:

1. MAC key recovery attack: The principle is to have a few message-MAC pairs, and try all possible keys to find the key used for MAC generation. This strategy is similar to exhaustive key search for block cipher encryption. The easiest solution to prevent such an attack is to increase the key length, so this attack becomes impractical [89].

2. Birthday paradox: This is a generic attack that is applicable to any MAC scheme. The basic idea of this attack is to find a collision where two or more different messages generate the same MAC value. Theoretically, such a collision always exists in MAC schemes since a MAC compresses an arbitrary-length message to produce a fixed-length value. The birthday attack reduces the number of message-MAC pairs needed to find a collision in a τ-bit MAC scheme from 2^τ pairs to approximately 2^(τ/2) [20]. This attack follows the paradox of finding two students in a class with the same birthday from the 365 possible days in a year. Surprisingly, only 23 students are needed for there to be approximately a 50% chance that two students have the same birthday.

3. Forgery attack: The attacker observes a set of message-MAC pairs before predicting the MAC tag for a new message without knowing the internal secret key. This forgery succeeds if the new message-MAC pair is accepted as valid by the MAC verification algorithm. Forgery attacks do not aim to recover the secret key; rather, they exploit the MAC structure to find weaknesses. Depending on the attacker's ability to choose a message, forgery attacks are divided into two types [89]:

- Selective forgery: The attacker can choose messages to generate their corresponding MAC tags.
- Existential forgery: The attacker does not have control over the requested messages, but only observes previous message-MAC pairs.

This attack will be used to attack certain AE block cipher modes in Chapters 3 and 4.
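The birthday figures quoted in item 2 can be checked numerically. The sketch below is only an illustration; the function names are hypothetical, and `samples_for_half` uses the standard approximation sqrt(2 ln 2 · N) for the number of uniform samples giving roughly a 50% collision chance in a space of size N.

```python
# Birthday bound: exact collision probability and the ~sqrt(N) rule of thumb.
import math


def collision_probability(samples, space):
    """Probability of at least one repeat among `samples` uniform draws from `space` values."""
    p_no_collision = 1.0
    for i in range(samples):
        p_no_collision *= (space - i) / space
    return 1.0 - p_no_collision


def samples_for_half(space):
    """Approximate number of draws for a 50% collision chance: sqrt(2 ln 2 * space)."""
    return math.ceil(math.sqrt(2.0 * math.log(2.0) * space))


if __name__ == "__main__":
    print(collision_probability(23, 365))   # just above one half
    print(samples_for_half(365))            # the classroom example
    print(samples_for_half(2 ** 32))        # a 32-bit tag: on the order of 2^16 pairs
```

For a τ-bit tag the space is 2^τ, so the approximation grows as 2^(τ/2), which is why doubling the tag length is needed to double the bit security against collisions.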

2.4.5 Implementation attacks

Physical implementations of cryptographic algorithms may leak information, such as the running time of an encryption, the power consumption of a device, electromagnetic emissions or the response of an algorithm to faults, either accidental or intentional, that occur during the encryption process. The attacks that exploit side channel leakage, such as power consumption, timing information and electromagnetic emission, are called side channel analysis (SCA) attacks [34]. On the other hand, attacks that exploit the output of an implementation after an error is induced during the device operation (for example, to retrieve information about the secret key) are called fault attacks [35].

The idea of SCA attacks was used several decades ago during wars and by intelligence agencies. For instance, according to an unclassified document by the NSA, a piece of crypto-equipment called the Bell-telephone 131-B2, used during World War II, was discovered to be leaking the entire plaintext being encrypted as radiation that could be read by an oscilloscope [90]. However, the first SCA work in the research community was done by Kocher [34], attacking the public-key RSA algorithm. He successfully recovered RSA private keys by measuring the time required to decrypt a message. After that, in 1998, Kocher introduced a more effective attack, the power analysis attack [91].

Every cryptographic algorithm is vulnerable to implementation attacks, even if the algorithm provides strong computational security in terms of mathematical complexity and a large key size. This computational security is guaranteed under black-box models where the attacker is assumed to have access only to the plaintext message, the ciphertext message or both, and has no access of any kind to the internal values, either to read them or to interfere with them. In practice, this is not always the case and attackers can sometimes exploit algorithm implementations as an extra source of information.

The provable security of an algorithm alone is insufficient to make its implementation secure in a physical device. This security should be accompanied by physical security to thwart implementation attacks.

Attacks exploiting the physical implementation of cryptographic algorithms are classified in different ways. They can be divided according to the nature of the exploited information: power consumption, timing information, electromagnetic emission, fault injection response, etc. Another distinction is whether the attack is active or passive. Active attacks interfere with the operation of the targeted device in order to leak information about the secret key, whereas passive attacks observe the inherent leakage of the device without noticeable interference. Passive attacks are more threatening since they do not leave physical damage to detect. Side channel attacks, for instance power analysis attacks, timing attacks and electromagnetic attacks, are passive attacks, whereas fault attacks are active attacks [92].

The following sections discuss two effective attacks, power analysis attacks and fault attacks, in more detail, as these attacks will be used later in Chapters 5, 6 and 7.

2.4.5.1 Power analysis attacks

The idea of power analysis [91] is based on the fact that hardware devices running cryptographic algorithms consume power that fluctuates depending on the operation being performed and the data being processed. This fluctuation in power is considered useful information that can be exploited in effective, practical power analysis attacks [91, 15]. Power analysis attacks are considered the most threatening side channel attacks since they are non-invasive, undetectable and require only low-budget equipment. A simple oscilloscope is sufficient to conduct such an attack. These attacks were first used by Kocher [91] to reveal the secret key of DES running on a smart card. Since then, a large amount of research has been published in this area, and key recovery attacks on almost all cryptographic algorithms, both symmetric and asymmetric, are possible using power analysis, including AES [93] and RSA [94]. Power analysis attacks are divided into two types: Simple Power Analysis (SPA) and Differential Power Analysis (DPA). According to [95], SPA attacks exploit device leakage under the same or few inputs using the same key, whereas DPA attacks exploit the device leakage under many different inputs for the same secret key.

2.4.5.1.1 Simple power analysis Simple power analysis uses visible information on a power trace to retrieve the secret key bits [91]. A power trace is the power consumption of a device over a period of time while the device processes certain operations. SPA inspects the power trace to identify the operation being executed, and recovers the values at input or output using a single or few plaintext messages, as Mangard et al. [95] explained. Multiple shots of the trace can be taken to reduce the noise.

SPA can easily identify structural features, such as the number of rounds or timing, using a single power trace. For instance, Figure 2.15 (reproduced from [14]) shows a single call of a DES encryption function where it is easy to recognise that the algorithm has 16 rounds.

Figure 2.15: Power consumption during a DES call [14].

However, SPA requires the attacker to know the algorithm implementation precisely, and using SPA alone to retrieve the key is a challenge. Algorithm implementations usually involve some degree of noise that may hide the useful data-dependent leakage. Furthermore, interpreting the power trace to retrieve key bits sometimes requires more in-depth analysis, so an attacker may instead consider other more efficient attacks, such as DPA attacks.

Preventing SPA attacks is relatively easy in practice [91]. Adding noise to the leakage, adding dummy instructions and balancing conditional branches in implementations are common defences against SPA.

2.4.5.1.2 Differential power analysis Differential power analysis (DPA) is a statistical tool that exploits the correlation between the leakage and the data processed to recover secret keys from a physical algorithm implementation [91, 15, 14]. Unlike SPA, DPA uses a collection of power traces obtained with different inputs. DPA is better than SPA at filtering out noise. DPA is relatively effective, easy to implement and requires few resources.

DPA uses a hypothetical model that simulates the physical device. This model is used to predict partial values of certain intermediate variables. After that, these predicted values are compared to the actual measurements in order to weigh the correctness of each guess. Guesses with the maximum (or minimum, depending on the model) values are usually the actual variable values.

Different statistical methods are used to perform differential power analysis, such as single-bit DPA [91], multi-bit DPA [96] and correlation power analysis (CPA) [97]. The original single-bit DPA [91] is based on a particular bit in an intermediate variable, and uses this bit to sort the collected power traces. For example, this bit can be the last output bit of the last s-box in the first round of AES. To calculate this bit, an attacker has to guess one byte of the secret key K[i], and use this guess with the plaintext byte M[i] to sort the collected traces into two sets: 0-traces when the bit value is 0 and 1-traces when the bit value is 1. Finally, the average trace of each set is computed and one averaged trace is subtracted from the other. If the guess is correct, the final differential trace will show a few visible spikes at certain points; otherwise the trace will be flat. For instance, Figure 2.16 shows the corresponding final differential trace for five guesses K[i] ∈ {101, ..., 105} in an AES implementation. It is clear that K[i] = 103 is the correct value.

Figure 2.16: Five differential traces for five predictions of a key byte [15].

The efficiency of the single-bit DPA attack is improved when multiple bits are considered rather than a single bit. This multi-bit DPA attack was first reported by Messerges [96]. A similar attack based on the Hamming weight of multiple bits is called correlation power analysis [97]. Other attacks, such as template attacks [98] and stochastic attacks [99], use leakage modelling of the targeted device as an initialisation phase to enhance the effectiveness of DPA attacks when a limited number of samples is available.
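The single-bit DPA procedure described in this section can be illustrated with a small simulation. This is a sketch under several assumptions that are not taken from the cited works: each trace is reduced to one sample, the device is assumed to leak the Hamming weight of an s-box output plus Gaussian noise, and `TOY_SBOX` is a seeded random permutation rather than the AES s-box. All function names are hypothetical.

```python
# Simulated single-bit DPA (difference of means) on one key byte.
import random

rng = random.Random(2018)           # fixed seed so the sketch is repeatable
TOY_SBOX = list(range(256))
rng.shuffle(TOY_SBOX)               # toy 8-bit s-box, NOT the AES s-box


def hamming_weight(value):
    return bin(value).count("1")


def simulate_traces(secret_key_byte, count, noise_sigma=1.0):
    """Return (plaintext byte, one noisy power sample) pairs for a leaky device."""
    traces = []
    for _ in range(count):
        m = rng.randrange(256)
        leak = hamming_weight(TOY_SBOX[m ^ secret_key_byte])
        traces.append((m, leak + rng.gauss(0.0, noise_sigma)))
    return traces


def difference_of_means(traces, key_guess):
    """Sort traces by the predicted LSB of the s-box output and subtract the set averages."""
    ones, zeros = [], []
    for m, sample in traces:
        predicted_bit = TOY_SBOX[m ^ key_guess] & 1
        (ones if predicted_bit else zeros).append(sample)
    if not ones or not zeros:
        return 0.0
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)


def recover_key_byte(traces):
    """Pick the guess whose differential value has the largest magnitude."""
    return max(range(256), key=lambda g: abs(difference_of_means(traces, g)))


if __name__ == "__main__":
    collected = simulate_traces(secret_key_byte=103, count=3000)
    print(recover_key_byte(collected))
```

Under this leakage model the correct guess separates the traces by a bit that really contributes to the power sample, so its difference of means stands out, while wrong guesses sort the traces almost at random and give values near zero.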

2.4.5.2 Fault attacks

Fault attacks are active attacks that induce faults into a cryptographic implementation to extract information about the secret key by analysing the erroneous outputs [35]. Boneh et al. [35] were the first to use fault attacks against cryptographic protocols when they attacked RSA. Inspired by this paper, numerous encryption algorithms, including DES [100] and AES [101, 102], have been shown to be susceptible to fault attacks. Fault attacks are powerful techniques that can retrieve the entire key of AES using only a single fault injection [16, 103].

Fault attacks require physical access to the targeted device to inject faults. These attacks are particularly dangerous for secure embedded systems, such as smart cards, RFID tags, pay-TV, automotive applications and similar applications where an attacker can have full control over them. Faults can be easily injected into these systems using inexpensive equipment. However, cryptosystems that are located in secure places, such as bank servers and network encryption boxes, are not vulnerable to fault attacks as only legitimate users have physical access to such systems.

Various physical technologies can be utilised to inject a fault into a cryptosystem, including clock tampering [104], supplying a voltage glitch [105], radiating an electromagnetic field [106] or using a laser beam [107]. Faults have different effects, such as flipping a bit, skipping the execution of an instruction or destroying a memory cell.

Fault attacks analyse a cryptosystem response under the assumption of different fault models that depend on:

• the number of bits affected: one bit, a few bits, one byte or a few bytes;

• the modification type: stuck-at-zero, stuck-at-one, flip a bit or a random fault;

• fault precision: how precisely the fault location and its timing can be controlled;

• fault duration: transient or permanent.

Attacks exploit faults using various analysis methods. Certain fault attack methods require pairs of fault-free and faulty ciphertexts using identical or related plaintexts. Differential Fault Analysis (DFA) [100] collects pairs of correct and faulty ciphertexts such that both ciphertext messages in every pair are generated from the same input to the block cipher. After that, DFA exploits the relation between the two ciphertexts in each pair to recover the secret key. The Safe-Error Attack (SEA) [108] fixes part of the secret key to a known value and investigates a collision between two identical inputs with a fault injected in one computation. That is, the technique exploits faults that result in identical ciphertexts, but change intermediate values. In Collision Fault Analysis (CFA) [109], an attacker invokes a fault in the early rounds of an algorithm and finds collisions between fault-free and faulty ciphertexts using identical input and the same secret key. In contrast, the Statistical Fault Attack (SFA) [18] requires only a collection of faulty ciphertexts to recover the correct key. It does not require identical inputs or correct ciphertext messages. However, two conditions must be fulfilled before SFA can be applied to a cryptographic scheme: different block cipher calls should have different inputs, and the attacker must be able to obtain the direct outputs of the block cipher [19].

The idea of using statistical analysis was previously used to enhance the efficiency of differential fault attacks [110, 111]. However, Fuhr et al. [18] were the first to use this technique with faulty ciphertext messages only, without the need to know or repeat the plaintext message. SFA, a ciphertext-only attack, analyses the bias in the distribution of faulty intermediate states using a statistical distinguisher.
The attacker induces a fault into one byte of the internal state, at an unknown location, for several different plaintext messages. Then, the attacker collects several faulty ciphertext bytes and guesses the relevant key bytes to obtain the distribution of the corresponding faulty intermediate states. After that, a distinguisher is applied for each guess to return a value that should be maximal or minimal when the guess value matches the correct key value. Various distinguishers can be used, such as maximum likelihood, Hamming weight and Squared Euclidean Imbalance (SEI). The choice of the appropriate distinguisher depends on the attacker's knowledge of the fault model used.
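The statistical fault attack outlined above can be sketched on a one-byte toy model. The assumptions here are illustrative and not taken from [18]: the last round is modelled as c = S[x] xor k for a single byte, the fault ANDs the state byte with a random value (biasing it towards zero bits), `TOY_SBOX` is a seeded random permutation rather than the AES s-box, and the Hamming weight distinguisher is used. All names are hypothetical.

```python
# Toy statistical fault attack using faulty ciphertext bytes only.
import random

rng = random.Random(7)                  # fixed seed so the sketch is repeatable
TOY_SBOX = list(range(256))
rng.shuffle(TOY_SBOX)                   # toy s-box, NOT the AES s-box
TOY_INV_SBOX = [0] * 256
for index, value in enumerate(TOY_SBOX):
    TOY_INV_SBOX[value] = index


def hamming_weight(value):
    return bin(value).count("1")


def faulty_ciphertext_byte(key_byte):
    """One faulty output byte: an unknown state byte AND-ed with a random fault value."""
    state = rng.randrange(256)                  # differs per call, unknown to the attacker
    faulty_state = state & rng.randrange(256)   # assumed fault model: biased towards 0 bits
    return TOY_SBOX[faulty_state] ^ key_byte


def mean_hamming_weight(ciphertext_bytes, key_guess):
    """Distinguisher: mean Hamming weight of the state implied by a key guess."""
    total = sum(hamming_weight(TOY_INV_SBOX[c ^ key_guess]) for c in ciphertext_bytes)
    return total / len(ciphertext_bytes)


def recover_key_byte(ciphertext_bytes):
    """The correct guess should expose the bias, i.e. the minimal mean weight."""
    return min(range(256), key=lambda g: mean_hamming_weight(ciphertext_bytes, g))


if __name__ == "__main__":
    observed = [faulty_ciphertext_byte(0xA7) for _ in range(600)]
    print(hex(recover_key_byte(observed)))
```

The Hamming weight distinguisher is chosen because the fault model is assumed to be known. SEI is not useful in this one-byte toy: each key guess maps the faulty states through a bijection, which only relabels the values and leaves the SEI unchanged, so SEI needs a partial decryption that is not a simple bijection of the biased byte.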

2.5 Summary

This chapter covered four main areas related to this research: block-cipher-based methods to provide confidentiality; block-cipher-based methods to provide in- 2.5. Summary 53 tegrity assurance; block-cipher-based authenticated encryption and finally at- tacks on cryptographic schemes. The aim of this chapter is to expand the knowl- edge base in such topics, particularly related to authenticated encryption. First, the chapter gave an overview of block ciphers and some common modes of operation. Then, the purpose and the most common approach to provide integrity assurance were discussed. After that, the chapter outlined the features of secure message authentication codes (MACs) and their different constructions. The chapter explained also authenticated encryption schemes and classified them into four categories. The most desirable AE scheme is one that assures both data confidentiality and integrity using a single key and a single data pass. Seven dedicated modes of AE were listed with their features and limitations. The latest state of authenticated encryption was investigated, especially the ongoing CAESAR competition, proposal requirements and submitted proposals. Finally, different types of attacks on cryptographic schemes were listed. At- tacks can be established for different goals, and assume the attacker has certain capabilities depending on the control over plaintext messages, ciphertext mes- sages or both. Attacks can breach the confidentiality part, integrity assurance mechanism or both. Other attacks require physical access to the algorithm im- plementation. All these issues were generally reviewed in this chapter. All CAESAR AE schemes used in later chapters of this thesis were briefly compared in Table 2.9. Looking forward, Chapters3 and4 generally focus on forgery attacks against AE modes, whereas Chapters5 and6 investigate the effect of fault analysis on certain CAESAR schemes. 
A combination of forgery, power analysis and fault analysis attacks is used to propose a new AE mode of operation in Chapter 7.

Chapter 3

Analysis of ++AE authenticated encryption mode

The ++AE proposal [7] was submitted to the Competition for Authenticated Encryption: Security, Applicability, and Robustness (CAESAR) [6] in 2014. This block cipher mode of operation is intended to provide authenticated encryption with associated data (AEAD). The ++AE cipher design is similar to AE schemes previously proposed by Recacha: Input and Output Block Chaining (IOBC) [112] and Input Output Chaining (IOC) [113]. All of these designs use a block cipher in Electronic CodeBook (ECB) mode with cross block chaining and a detectable redundancy paradigm for the integrity check, rather than computing a message authentication code (MAC). An integrity check vector ICV that is known to both sender and receiver is appended to the plaintext message before encryption and encrypted along with the other plaintext. When the ciphertext is decrypted and the plaintext is recovered, the decryption of the final block is compared to the known ICV to check for possible alterations in the message during transmission, as shown in Figure 3.1. The ICV mechanism has a fundamental disadvantage compared to a MAC. If the ICV matches a part of the plaintext message, a ciphertext message truncated at that point will decrypt to a plaintext which the receiver will presume is unaltered. The method used for determining the value of the integrity check vector (ICV) in ++AE is different to that used for IOBC and IOC, although the verification process follows the same concept. If changes to the ciphertext occur, the block chaining mechanism is intended to propagate these errors during decryption so that the recovered ICV is different to the original ICV value, thus revealing the integrity breach.

Figure 3.1: ICV integrity mechanism. [Diagram: the sender encrypts $(M, ICV)$ under the key $K$ to obtain $(C, MDC)$; the receiver decrypts under $K$ to obtain $(M^*, ICV^*)$ and checks whether $ICV = ICV^*$.]

[Embedded first page of the associated publication: "Forgery Attacks on ++AE Authenticated Encryption Mode", Hassan Qahur Al Mahri, Leonie Simpson, Harry Bartlett, Ed Dawson and Kenneth Koon-Ho Wong, Queensland University of Technology; AISC2016@ACSW, February 2–5, 2016, Canberra, A.C.T., Australia.]

The block chaining mechanisms used in IOBC and IOC have previously been shown to be flawed. Mitchell presented a known plaintext forgery attack on IOBC in [114] with complexity of $2^{n/3}$. Bottinelli et al. [115] presented a chosen plaintext forgery attack on IOC with a success rate of $1 - 3 \times 2^{-n}$. The ++AE proposal uses a different block chaining mechanism, which involves repeated use of two types of addition: bitwise XOR and integer addition modulo $2^n$.

This chapter analyses ++AE and reveals serious weaknesses in the integrity assurance mechanism of ++AE. Firstly and most seriously, the integrity mechanism in ++AE does not verify the most significant bit of any plaintext block. Secondly, certain groups of four consecutive plaintext blocks cause the internal values of the block cross-chaining mechanism to repeat.
For 16 specific four-block plaintext groups, a guaranteed chosen plaintext forgery attack can be performed provided the chosen plaintext message includes at least two groups of four consecutive blocks selected from the 16 specific values. One of these 16 four-block groups was identified in an online comment by Bahack in 2014 [116]. Thirdly, a further flaw is identified that can be exploited in chosen plaintext forgery attacks that require only a single group of four consecutive blocks (of particular values) in the plaintext message. Over 450 groups of four consecutive plaintext block values are identified for which only a single group is required for multiple forgeries to be constructed.

These chosen plaintext forgery attacks are not probabilistic attacks; in every case, the ciphertext for a plaintext message which includes one of these specific four-block plaintext values can be manipulated so that the recovered plaintext differs from the original but the integrity check mechanism does not detect the forgery. The success of these attacks is also independent of the underlying block cipher, key or additional secret value.

This chapter is arranged as follows: Section 3.1 briefly describes the ++AE scheme. Sections 3.2 and 3.3 provide the mathematical proofs relating to the basic operations of ++AE. In Section 3.2, the most fundamental flaw in ++AE is proven: that the most significant bit of each plaintext block does not affect the encryption of the integrity check vector. Section 3.3 describes other weaknesses in the ++AE structure related to the chaining mechanism and the effect of such weaknesses on the integrity assurance mechanism. Section 3.4 presents several chosen plaintext forgery attacks which exploit these weaknesses in the ++AE chaining mechanism. Section 3.5 presents the results of several experimental verifications of these forgery attacks using 128-bit AES as the underlying block cipher. The last section concludes the chapter.

3.1 Description of ++AE

This section describes the ++AE block cipher mode of operation, based on Recacha's two submissions [7] to the CAESAR competition: v1.0 in March 2014 and the revised version v1.1 in April 2014. The encryption and decryption procedures for ++AE are identical in both versions, and our analysis applies to both.

++AE is designed to be a lightweight AE mode with a negligible overhead compared to the well-known ECB mode. ++AE requires four additional $n$-bit sums per plaintext block, and two encryption calls prior to processing each message to generate the initializing vectors, $IV_a$ and $IV_b$. The scheme supports two modes of operation, stateless and stateful. In stateless mode, a distinct value for the $IV$ is provided to each message to generate the two vectors $(IV_a, IV_b)$. In stateful mode, the two vectors $(IV_a, IV_b)$ take the values of the last chaining vectors $(Q_{m+1}, I_{m+1})$, respectively, from the previous message. The stateful mode is claimed to show some resistance against forgery attacks in case the $IV$ repeats.

++AE supports parallelisation of block cipher calls, which enables the scheme to achieve high performance. The mode works with any block cipher regardless of the block and key sizes. A large amount of data can be processed, limited by $2^{n/2}$ messages, with a maximum size of $2^{n/2}$ blocks per message.

Suppose that $M = (M_1, M_2, \ldots, M_m)$ is an $m$-block plaintext message to be ++AE encrypted, using the symmetric key $K$ and the public session value $IV$ previously shared between sender and receiver. Note that $IV$ should be different for each message to be encrypted during a given key lifetime. According to the design specifications, $IV$ can be a sequential counter or a random value provided that it is distinct for each message [7].

For the first message in the stateful case and every message in the stateless case, the encryption/decryption process begins by computing two secret $n$-bit values, denoted $IV_a$ and $IV_b$, as follows:

\begin{align*}
IV_a &= E_K(IV) \\
IV_b &= E_K(IV_a).
\end{align*}

Integrity is assured by appending an integrity check vector $ICV$ to the end of the plaintext message. The value of $ICV$ is computed as follows:

\[ ICV = (IV_a \oplus IV_b) + (IV \oplus (m + d)). \]

The encrypted $ICV$ block is referred to as a Modification Detection Code (MDC).

++AE uses pairs of internal chaining vectors, $Q_i$ and $I_i$. Changes in the ciphertext should propagate through the chaining vectors to the last block during decryption, resulting in an incorrect decryption of MDC to ICV. Only messages with the expected ICV are accepted as authentic. The ++AE encryption and decryption operations are described in Table 3.1 and illustrated in Figure 3.2.

3.2 Fundamental flaw in ++AE

This section explains how the chaining mechanism in ++AE results in a fundamental flaw in the integrity assurance mechanism. The mathematical foundations are provided before explaining the flaw.

Table 3.1: ++AE encryption and decryption algorithms.

Algorithm 1: ++AE Encryption

$\text{Encrypt}_K(IV, M)$
1. $Q_0 \leftarrow IV_a$
2. $I_0 \leftarrow IV_b$
3. for $i \leftarrow 1$ to $m$ do
4. $I_i \leftarrow M_i \oplus Q_{i-1}$
5. $Q_i \leftarrow I_i + I_{i-1} + Q_{i-1}$
6. $X_i \leftarrow Q_i \oplus I_{i-1}$
7. $C_i \leftarrow E_K(X_i)$
8. od
9. $Q_{m+1} \leftarrow (ICV \oplus Q_m) + I_m + Q_m$
10. $MDC \leftarrow E_K(Q_{m+1} \oplus I_m)$
11. return $C \,\|\, MDC$

Algorithm 2: ++AE Decryption

$\text{Decrypt}_K(IV, C \,\|\, MDC)$
1. $Q_0 \leftarrow IV_a$
2. $I_0 \leftarrow IV_b$
3. for $i \leftarrow 1$ to $m$ do
4. $X_i \leftarrow D_K(C_i)$
5. $Q_i \leftarrow X_i \oplus I_{i-1}$
6. $I_i \leftarrow Q_i - (I_{i-1} + Q_{i-1})$
7. $M_i \leftarrow I_i \oplus Q_{i-1}$
8. od
9. $Q_{m+1} \leftarrow D_K(MDC) \oplus I_m$
10. $I_{m+1} \leftarrow Q_{m+1} - I_m - Q_m$
11. $ICV' \leftarrow I_{m+1} \oplus Q_m$
12. if $ICV' = ICV$ then
13. return $M = \{M_1, \ldots, M_m\}$
14. else
15. return INVALID
16. fi

Lemma 3.1. Let $a$ be an $n$-bit binary number. Then,

\[ a \oplus 2^{n-1} \equiv a \pm 2^{n-1} \pmod{2^n}. \]

Proof. Let $\sum_{j=0}^{n-1} [a]_j \cdot 2^j$ be the binary representation of $a$. Then, since $2^{n-1} \equiv -2^{n-1} \pmod{2^n}$:

\begin{align*}
a \oplus 2^{n-1} &= \sum_{j=0}^{n-2} [a]_j \cdot 2^j + ([a]_{n-1} \oplus 1) 2^{n-1} \\
&= \sum_{j=0}^{n-2} [a]_j \cdot 2^j + (1 - [a]_{n-1}) 2^{n-1} \\
&= \sum_{j=0}^{n-2} [a]_j \cdot 2^j + ([a]_{n-1} + 1 - 2[a]_{n-1}) 2^{n-1} \\
&\equiv \sum_{j=0}^{n-1} [a]_j \cdot 2^j + 2^{n-1} - [a]_{n-1} \cdot 2^n \\
&\equiv a \pm 2^{n-1} \pmod{2^n}
\end{align*}
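Lemma 3.1 can also be confirmed by exhaustive search for small block lengths. The following sketch is only a sanity check, not a replacement for the proof; the function name is hypothetical.

```python
def check_lemma_3_1(n):
    """Check a XOR 2^(n-1) == a + 2^(n-1) == a - 2^(n-1) (mod 2^n) for all a."""
    modulus = 1 << n
    msb = 1 << (n - 1)
    for a in range(modulus):
        assert a ^ msb == (a + msb) % modulus == (a - msb) % modulus
    return True

print(all(check_lemma_3_1(n) for n in range(1, 11)))   # prints True
```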

Figure 3.2: ++AE encryption and decryption operations. [Diagram: in encryption, each block $M_i$ (with $M_{m+1} = ICV$) is chained through $I_i$, $Q_i$ and $X_i$ before $E_K$ produces $C_i$ (with $C_{m+1} = MDC$); in decryption, $D_K$ is applied to each $C_i$ and the chaining is reversed to recover $M_i$ and $ICV'$. The chaining is initialised with $IV_a$ and $IV_b$.]

Theorem 3.2. Suppose a plaintext block $M_j$ is submitted for ++AE encryption, resulting in the ciphertext block $C_j$ and the inner vectors $Q_j$ and $I_j$. If $M_j$ is replaced by $M_j^* = M_j \oplus 2^{n-1}$, then

(a) $I_j$ will be replaced by $I_j^* = I_j \oplus 2^{n-1}$;

(b) $Q_j$ will be replaced by $Q_j^* = Q_j \oplus 2^{n-1}$;

(c) $X_j$ will be replaced by $X_j^* = X_j \oplus 2^{n-1}$.

Proof. Let $I'_j$, $Q'_j$ and $X'_j$ denote the new values of $I_j$, $Q_j$ and $X_j$, respectively. Using Lemma 3.1 and Algorithm 1:

\begin{align*}
I'_j &= M_j^* \oplus Q_{j-1} = M_j \oplus Q_{j-1} \oplus 2^{n-1} = I_j \oplus 2^{n-1} = I_j^* \\
Q'_j &= (I_j + 2^{n-1}) + I_{j-1} + Q_{j-1} \\
&= (I_j + I_{j-1} + Q_{j-1}) + 2^{n-1} \\
&= Q_j^* \\
X'_j &= Q'_j \oplus I_{j-1} = Q_j \oplus 2^{n-1} \oplus I_{j-1} = X_j^*
\end{align*}

Theorem 3.3. Suppose a plaintext block $M_j$ is submitted for ++AE encryption, resulting in the ciphertext block $C_j$. If $M_j$ is submitted for encryption a second time, but with either or both of $Q_{j-1}$ and $I_{j-1}$ replaced by $Q^*_{j-1}$ and $I^*_{j-1}$ respectively, then

(a) If $I_{j-1}$ is replaced by $I^*_{j-1}$, then $I'_j = I_j$, $Q'_j = Q^*_j$, $X'_j = X_j$;

(b) If $Q_{j-1}$ is replaced by $Q^*_{j-1}$, then $I'_j = I^*_j$, $Q'_j = Q_j$, $X'_j = X_j$;

(c) If both $I_{j-1}$ and $Q_{j-1}$ are replaced by $I^*_{j-1}$ and $Q^*_{j-1}$, then $I'_j = I^*_j$, $Q'_j = Q^*_j$, $X'_j = X_j$;

(d) for any of cases (a) to (c), $X'_i = X_i$ for all $i \geq j$.

Proof. Lemma 3.1 is again applied to the ++AE algorithm.

(a) If $I_{j-1}$ is replaced by $I^*_{j-1}$, then:

\begin{align*}
I'_j &= M_j \oplus Q_{j-1} = I_j \\
Q'_j &= I'_j + I^*_{j-1} + Q_{j-1} \\
&= I_j + I_{j-1} + 2^{n-1} + Q_{j-1} \\
&= (I_j + I_{j-1} + Q_{j-1}) + 2^{n-1} \\
&= Q^*_j \\
X'_j &= Q'_j \oplus I^*_{j-1} = Q_j \oplus 2^{n-1} \oplus I_{j-1} \oplus 2^{n-1} = X_j
\end{align*}

(b) Similarly, if $Q_{j-1}$ is replaced by $Q^*_{j-1}$, then:

\begin{align*}
I'_j &= M_j \oplus Q^*_{j-1} = M_j \oplus Q_{j-1} \oplus 2^{n-1} = I_j \oplus 2^{n-1} = I^*_j \\
Q'_j &= I'_j + I_{j-1} + Q^*_{j-1} \\
&= I_j + 2^{n-1} + I_{j-1} + Q_{j-1} - 2^{n-1} \\
&= Q_j \\
X'_j &= Q'_j \oplus I_{j-1} = Q_j \oplus I_{j-1} = X_j
\end{align*}

The proof of case (c) follows similarly, while the proof of (d) follows by repeated application of (a), (b) or (c) as relevant.
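The single-step statements of Theorems 3.2 and 3.3(a) to (c) can be checked exhaustively for a small block length. The sketch below assumes a toy length of $n = 5$ bits and a hypothetical helper `step` that implements lines 4 to 6 of Algorithm 1; it is a sanity check only.

```python
from itertools import product

def step(m, i_prev, q_prev, n):
    """One ++AE chaining step: returns (I_j, Q_j, X_j) as in Algorithm 1."""
    i_cur = m ^ q_prev
    q_cur = (i_cur + i_prev + q_prev) % (1 << n)
    return i_cur, q_cur, q_cur ^ i_prev

def check_msb_theorems(n=5):
    msb = 1 << (n - 1)
    for m, i_prev, q_prev in product(range(1 << n), repeat=3):
        i, q, x = step(m, i_prev, q_prev, n)
        # Theorem 3.2: complement the most significant bit of M_j
        assert step(m ^ msb, i_prev, q_prev, n) == (i ^ msb, q ^ msb, x ^ msb)
        # Theorem 3.3(a): complement the most significant bit of I_{j-1}
        assert step(m, i_prev ^ msb, q_prev, n) == (i, q ^ msb, x)
        # Theorem 3.3(b): complement the most significant bit of Q_{j-1}
        assert step(m, i_prev, q_prev ^ msb, n) == (i ^ msb, q, x)
        # Theorem 3.3(c): complement both
        assert step(m, i_prev ^ msb, q_prev ^ msb, n) == (i ^ msb, q ^ msb, x)
    return True

print(check_msb_theorems())   # prints True
```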

From Theorem 3.2, complementing the most significant bit of a plaintext block $M_j$ will complement the most significant bit of the internal values $I_j$, $Q_j$ and $X_j$. From Theorem 3.3, if the most significant bits of $I_j$ and/or $Q_j$ are complemented, the encryption of the subsequent plaintext block $M_{j+1}$ is not affected. Theorems 3.2 and 3.3 show that the MDC is unaffected by any change in the most significant bit of any plaintext block, except for the last block ($M_{m+1} = ICV$). Consequently, the decryption of this MDC will give the correct ICV for the altered message. In other words, the integrity mechanism in ++AE does not verify the most significant bit of any plaintext block.

The purpose of the integrity assurance mechanism is to detect alterations to the received data due to either accidental or intentional modification. However, the integrity assurance mechanism in ++AE does not detect alterations to the most significant bit in any plaintext block. This is a fundamental flaw. This weakness allows a malicious sender or receiver to dispute the content of a validly sent message by claiming that the message is different in the most significant bit of any subset of the message blocks.

Based on this flaw, a straightforward chosen plaintext forgery attack would also be possible using two ciphertexts produced using the same IV (nonce-reuse). However, the ++AE specification requires a different value IV to be used for each message encrypted. If nonce-reuse is possible, then plaintext forgeries are easily constructed.
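The effect on a whole message can be illustrated by comparing the cipher inputs $X_1, \ldots, X_{m+1}$ of a message and of a copy in which the most significant bit of one block is complemented. Since $E_K$ is one-to-one, equal cipher inputs give equal ciphertext blocks, so the block cipher is omitted. In this sketch the values of $IV_a$, $IV_b$ and ICV are random stand-ins, and the helper name `chain_x` is hypothetical.

```python
import random

def chain_x(blocks, q0, i0, n):
    """Return the cipher inputs X_1..X_{m+1} for the given blocks."""
    mask = (1 << n) - 1
    q, i, xs = q0, i0, []
    for m in blocks:
        i_new = m ^ q
        q_new = (i_new + i + q) & mask
        xs.append(q_new ^ i)
        i, q = i_new, q_new
    return xs

n = 128
rng = random.Random(2018)
iv_a, iv_b, icv = (rng.getrandbits(n) for _ in range(3))
message = [rng.getrandbits(n) for _ in range(6)]
original = chain_x(message + [icv], iv_a, iv_b, n)

j = 2                                   # position of the altered block
tampered = list(message)
tampered[j] ^= 1 << (n - 1)             # complement its most significant bit
forged = chain_x(tampered + [icv], iv_a, iv_b, n)

changed = [k for k in range(len(original)) if original[k] != forged[k]]
print(changed)   # prints [2]: only X_j changes, the input to the MDC does not
```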

3.3 Other integrity assurance flaws in ++AE

In this section, further mathematical properties of combinations of bitwise XOR and modular addition are outlined. These results are applied to the cross-chaining processes used in ++AE. The theorems in this section are the foundation for the flaws in the ++AE structure which are exploited in forgery attacks in Section 3.4.

3.3.1 Repeating internal vectors $I_{i-1}$ and $Q_{i-1}$

The first type of forgery involves messages with certain groups of four consecutive plaintext blocks $(M_i, M_{i+1}, M_{i+2}, M_{i+3})$, where each group has the property that $Q_{i+3} = Q_{i-1}$ and $I_{i+3} = I_{i-1}$. That is, the internal values are repeated after these four blocks have been encrypted. The following results enable us to identify these groups.

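Lemma 3.4. Let $a$ be an $n$-bit binary number. Then,

\[ (2^n - 1) \oplus a \equiv 2^n - 1 - a \pmod{2^n}. \]

Proof.

\begin{align*}
(2^n - 1) \oplus a &= \sum_{j=0}^{n-1} 2^j \oplus \sum_{j=0}^{n-1} [a]_j \cdot 2^j \\
&= \sum_{j=0}^{n-1} (1 \oplus [a]_j) 2^j \\
&\equiv \sum_{j=0}^{n-1} (1 - [a]_j) 2^j \\
&\equiv \sum_{j=0}^{n-1} 2^j - \sum_{j=0}^{n-1} [a]_j \cdot 2^j \\
&\equiv 2^n - 1 - a \pmod{2^n}
\end{align*}

Lemma 3.5. Let $a$ be an $n$-bit binary number. Then,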

\[ (2^n - 2) \oplus a \equiv 2^n - 2 - a + 2[a]_0 \pmod{2^n}. \]

Proof.

\begin{align*}
(2^n - 2) \oplus a &= \sum_{j=1}^{n-1} 2^j \oplus \sum_{j=0}^{n-1} [a]_j \cdot 2^j \\
&= \sum_{j=1}^{n-1} (1 \oplus [a]_j) 2^j + [a]_0 \\
&\equiv \sum_{j=1}^{n-1} (1 - [a]_j) 2^j - [a]_0 + 2[a]_0 \\
&\equiv \sum_{j=1}^{n-1} 2^j - \sum_{j=0}^{n-1} [a]_j \cdot 2^j + 2[a]_0 \\
&\equiv 2^n - 2 - a + 2[a]_0 \pmod{2^n}
\end{align*}

Theorem 3.6. Suppose $M = (M_1, \ldots, M_i, M_{i+1}, M_{i+2}, M_{i+3}, \ldots, M_{m+1})$ $(m \geq 4)$ is an $m$-block plaintext message submitted for ++AE encryption where the final block, $M_{m+1}$, is the ICV. If

\[ (M_i, M_{i+1}, M_{i+2}, M_{i+3}) = (2^n - 1, 2^n - 1, 2^n - 1, 2^n - 1) \]

where $1 \leq i \leq m - 3$, then

\begin{align*}
I_i &= 2^n - 1 - Q_{i-1} & Q_i &= I_{i-1} - 1 \\
I_{i+1} &= 2^n - I_{i-1} & Q_{i+1} &= 2^n - 2 - Q_{i-1} \\
I_{i+2} &= Q_{i-1} + 1 & Q_{i+2} &= 2^n - 1 - I_{i-1} \\
I_{i+3} &= I_{i-1} & Q_{i+3} &= Q_{i-1}.
\end{align*}

Proof. From Lemma 3.4:

\begin{align*}
I_j &= M_j \oplus Q_{j-1} \\
&= (2^n - 1) \oplus Q_{j-1} \\
&= 2^n - 1 - Q_{j-1}, \quad \text{for } j \in \{i, i+1, i+2, i+3\} \\
Q_j &= I_j + I_{j-1} + Q_{j-1} \\
&= 2^n - 1 + I_{j-1}, \quad \text{for } j \in \{i, i+1, i+2, i+3\}
\end{align*}

Therefore, with all arithmetic taken modulo $2^n$:

\begin{align*}
I_i &= 2^n - 1 - Q_{i-1} \\
Q_i &= 2^n - 1 + I_{i-1} = -1 + I_{i-1} \\
I_{i+1} &= 2^n - 1 - Q_i = 2^n - I_{i-1} \\
Q_{i+1} &= 2^n - 1 + I_i = 2(2^n - 1) - Q_{i-1} = -2 - Q_{i-1} = 2^n - 2 - Q_{i-1} \\
I_{i+2} &= 2^n - 1 - Q_{i+1} = Q_{i-1} - (2^n - 1) = Q_{i-1} + 1 \\
Q_{i+2} &= 2^n - 1 + I_{i+1} = 2^n - 1 - I_{i-1} \\
I_{i+3} &= 2^n - 1 - Q_{i+2} = I_{i-1} \\
Q_{i+3} &= 2^n - 1 + I_{i+2} = Q_{i-1}
\end{align*}

Note 1: This is the pattern of plaintext blocks identified by Bahack [116].
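The intermediate values stated in Theorem 3.6 can be confirmed exhaustively for a small block length. The sketch below assumes a toy length of $n = 6$ bits and the hypothetical helper `step` for one chaining step; values are reduced modulo $2^n$ before comparison.

```python
def step(m, i_prev, q_prev, n):
    """One ++AE chaining step: returns (I_j, Q_j, X_j) as in Algorithm 1."""
    i_cur = m ^ q_prev
    q_cur = (i_cur + i_prev + q_prev) % (1 << n)
    return i_cur, q_cur, q_cur ^ i_prev

def check_theorem_3_6(n=6):
    size = 1 << n
    ones = size - 1                          # the all-ones block 2^n - 1
    for i0 in range(size):                   # plays the role of I_{i-1}
        for q0 in range(size):               # plays the role of Q_{i-1}
            i, q, states = i0, q0, []
            for _ in range(4):
                i, q, _x = step(ones, i, q, n)
                states.append((i, q))
            assert states == [
                ((ones - q0) % size, (i0 - 1) % size),
                ((-i0) % size, (-2 - q0) % size),
                ((q0 + 1) % size, (ones - i0) % size),
                (i0, q0),                    # the internal vectors repeat
            ]
    return True

print(check_theorem_3_6())   # prints True
```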

Theorem 3.7. Suppose $M = (M_1, \ldots, M_i, M_{i+1}, M_{i+2}, M_{i+3}, \ldots, M_{m+1})$ $(m \geq 4)$ is an $m$-block plaintext message submitted for ++AE encryption where the final block, $M_{m+1}$, is the ICV. If

\[ (M_i, M_{i+1}, M_{i+2}, M_{i+3}) = (2^n - 2, 2^n - 2, 2^n - 2, 2^n - 2) \]

where $1 \leq i \leq m - 3$, then $Q_{i+3} = Q_{i-1}$ and $I_{i+3} = I_{i-1}$.

Proof. Using Algorithm 1 and Lemma 3.5:

\begin{align*}
I_i &= (2^n - 2) \oplus Q_{i-1} = 2^n - 2 - Q_{i-1} + 2[Q_{i-1}]_0 \\
Q_i &= -2 + I_{i-1} + 2[Q_{i-1}]_0 = (2^n - 2) \oplus (-I_{i-1} - 2[Q_{i-1}]_0 + 2[I_{i-1}]_0) \\
I_{i+1} &= -I_{i-1} - 2[Q_{i-1}]_0 + 2[I_{i-1}]_0 \\
Q_{i+1} &= 2^n - 4 - Q_{i-1} + 2[Q_{i-1}]_0 + 2[I_{i-1}]_0 = (2^n - 2) \oplus (2 + Q_{i-1} - 2[I_{i-1}]_0) \\
I_{i+2} &= 2 + Q_{i-1} - 2[I_{i-1}]_0 \\
Q_{i+2} &= 2^n - 2 - I_{i-1} + 2[I_{i-1}]_0 = (2^n - 2) \oplus I_{i-1} \\
I_{i+3} &= I_{i-1} \\
Q_{i+3} &= Q_{i-1}
\end{align*}
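Lemmas 3.4 and 3.5 and the repetition claimed in Theorem 3.7 can be checked in the same exhaustive manner. The sketch below again assumes a toy block length of $n = 6$ bits and hypothetical function names.

```python
def step(m, i_prev, q_prev, n):
    """One ++AE chaining step: returns (I_j, Q_j, X_j) as in Algorithm 1."""
    i_cur = m ^ q_prev
    q_cur = (i_cur + i_prev + q_prev) % (1 << n)
    return i_cur, q_cur, q_cur ^ i_prev

def check_lemmas_and_theorem_3_7(n=6):
    size = 1 << n
    r = size - 2                                   # the block 2^n - 2
    for a in range(size):
        assert (size - 1) ^ a == (size - 1 - a) % size           # Lemma 3.4
        assert r ^ a == (size - 2 - a + 2 * (a & 1)) % size      # Lemma 3.5
    for i0 in range(size):                         # I_{i-1}
        for q0 in range(size):                     # Q_{i-1}
            i, q = i0, q0
            for _ in range(4):                     # four blocks equal to 2^n - 2
                i, q, _x = step(r, i, q, n)
            assert (i, q) == (i0, q0)              # Theorem 3.7
    return True

print(check_lemmas_and_theorem_3_7())   # prints True
```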

Theorem 3.8. Suppose $M = (M_1, \ldots, M_i, M_{i+1}, M_{i+2}, M_{i+3}, \ldots, M_{m+1})$ where $m \geq 4$ and $1 \leq i \leq m - 3$, is an $m$-block plaintext message submitted for ++AE encryption where the final block, $M_{m+1}$, is the ICV. Let $r \in \{2^n - 1, 2^{n-1} - 1, 2^n - 2, 2^{n-1} - 2\}$ and $r^* = r \oplus 2^{n-1}$.

(a) If $(M_i, M_{i+1}, M_{i+2}, M_{i+3}) \in \{(r, r, r, r), (r, r, r^*, r^*), (r, r^*, r, r^*), (r, r^*, r^*, r)\}$, then $I_{i+3} = I_{i-1}$ and $Q_{i+3} = Q_{i-1}$;

(b) If $(M_i, M_{i+1}, M_{i+2}, M_{i+3}) \in \{(r, r, r, r^*), (r, r, r^*, r), (r, r^*, r, r), (r^*, r, r, r)\}$, then $I_{i+3} = I^*_{i-1}$ and $Q_{i+3} = Q^*_{i-1}$.

Proof. Case $(r, r, r, r)$ with $r = 2^n - 1$ is proved in Theorem 3.6 and with $r = 2^n - 2$ is proved in Theorem 3.7. Other cases follow by combining these results with repeated applications of Theorems 3.2 and 3.3(c).

From Theorem 3.8(a), note that there are 16 groups that satisfy the condition mentioned at the beginning of Section 3.3.1. (The result of Theorem 3.8(b) will be used in the next section.)
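Both parts of Theorem 3.8 can be tested on randomly chosen internal vectors at the full block length of 128 bits. The sketch below is a randomised check rather than a proof; the helper names, the number of trials and the seed are arbitrary choices.

```python
import random

def final_state(group, i_prev, q_prev, n):
    """Return (I_{i+3}, Q_{i+3}) after chaining the four blocks in `group`."""
    mask = (1 << n) - 1
    i, q = i_prev, q_prev
    for m in group:
        i_new = m ^ q
        q = (i_new + i + q) & mask
        i = i_new
    return i, q

def check_theorem_3_8(n=128, trials=200, seed=1):
    rng = random.Random(seed)
    size, msb = 1 << n, 1 << (n - 1)
    for r in (size - 1, msb - 1, size - 2, msb - 2):
        r_star = r ^ msb
        part_a = [(r, r, r, r), (r, r, r_star, r_star),
                  (r, r_star, r, r_star), (r, r_star, r_star, r)]
        part_b = [(r, r, r, r_star), (r, r, r_star, r),
                  (r, r_star, r, r), (r_star, r, r, r)]
        for _ in range(trials):
            i0, q0 = rng.getrandbits(n), rng.getrandbits(n)
            for group in part_a:      # internal vectors repeat exactly
                assert final_state(group, i0, q0, n) == (i0, q0)
            for group in part_b:      # both repeat up to the most significant bit
                assert final_state(group, i0, q0, n) == (i0 ^ msb, q0 ^ msb)
    return True

print(check_theorem_3_8())   # prints True
```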

3.3.2 Groups of blocks that result in the same MDC value

Other forgeries to consider again involve groups of four consecutive plaintext blocks, but now require the identification of collections of these groups with the property that (for given values of $I_{i-1}$ and $Q_{i-1}$) the MDC is unaffected by replacing the group $(M_i, M_{i+1}, M_{i+2}, M_{i+3})$ with another group of four blocks from the same collection. Given the results of Section 3.2, this requires that the value of $I_{i+3}$ is the same for all groups in the collection, except possibly for the most significant bit, and likewise for the value of $Q_{i+3}$. (Note that if a message is altered to one which has the same MDC, the changes in the message will not be detected by the integrity assurance mechanism. That is, a guaranteed forgery is obtained.) The following results are used to identify these collections of groups.

3.3.2.1 Basic results

Lemma 3.9. Let $a$ be an $n$-bit binary number. Then,

\[ a \oplus 2^{n-2} \equiv a + 2^{n-2} \pm [a]_{n-2} \cdot 2^{n-1} \pmod{2^n}. \]

Proof. The proof of this lemma is similar to the proof of Lemma 3.1, but with the indices $n$ and $n - 1$ replaced by $n - 1$ and $n - 2$, respectively.

Theorem 3.10. Suppose a plaintext block $M_j$ is submitted for ++AE encryption, resulting in the ciphertext block $C_j$ and the inner vectors $Q_j$ and $I_j$. If $M_j$ is replaced by $M_j \oplus 2^{n-2}$, then

(a) $I_j$ will be replaced by $I_j \oplus 2^{n-2} = I_j + 2^{n-2} \pm [I_j]_{n-2} \cdot 2^{n-1}$;

(b) $Q_j$ will be replaced by $Q_j + 2^{n-2} \pm [I_j]_{n-2} \cdot 2^{n-1} = Q_j \oplus 2^{n-2} \oplus ([I_j]_{n-2} \oplus [Q_j]_{n-2}) \cdot 2^{n-1}$;

(c) $X_j$ will be replaced by $X_j \oplus 2^{n-2} \oplus ([I_j]_{n-2} \oplus [Q_j]_{n-2}) \cdot 2^{n-1}$.

Proof. The proof follows the same argument as the proof of Theorem 3.2, but using Lemma 3.9 instead of Lemma 3.1.
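Lemma 3.9 and Theorem 3.10 can be confirmed exhaustively for a small block length. The sketch below assumes $n = 5$ bits, uses the XOR forms of parts (b) and (c), and takes the minus sign in Lemma 3.9, which is equivalent to the plus sign modulo $2^n$. The helper names are hypothetical.

```python
from itertools import product

def step(m, i_prev, q_prev, n):
    """One ++AE chaining step: returns (I_j, Q_j, X_j) as in Algorithm 1."""
    i_cur = m ^ q_prev
    q_cur = (i_cur + i_prev + q_prev) % (1 << n)
    return i_cur, q_cur, q_cur ^ i_prev

def bit(value, k):
    return (value >> k) & 1

def check_theorem_3_10(n=5):
    size, msb, second = 1 << n, 1 << (n - 1), 1 << (n - 2)
    for a in range(size):                                        # Lemma 3.9
        assert a ^ second == (a + second - bit(a, n - 2) * msb) % size
    for m, i_prev, q_prev in product(range(size), repeat=3):
        i, q, x = step(m, i_prev, q_prev, n)
        # most significant bit correction from parts (b) and (c)
        flip = (bit(i, n - 2) ^ bit(q, n - 2)) * msb
        assert step(m ^ second, i_prev, q_prev, n) == (
            i ^ second, q ^ second ^ flip, x ^ second ^ flip)
    return True

print(check_theorem_3_10())   # prints True
```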

Theorem 3.11. Suppose a plaintext block $M_j$ is submitted for ++AE encryption, resulting in the ciphertext block $C_j$. If $M_j$ is submitted for encryption a second time, but with both of $Q_{j-1}$ and $I_{j-1}$ replaced by $Q_{j-1} \oplus 2^{n-2} \oplus c_1 \cdot 2^{n-1}$ and $I_{j-1} \oplus 2^{n-2} \oplus c_2 \cdot 2^{n-1}$, respectively, where $c_1, c_2 \in \{0, 1\}$, then

(a) $I_j$ will be replaced by $I_j \oplus 2^{n-2} \oplus c_1 \cdot 2^{n-1} = I_j + 2^{n-2} \pm (c_1 \oplus [I_j]_{n-2}) 2^{n-1}$;

(b) $Q_j$ will be replaced by $Q_j + 2^{n-2} \pm (g \oplus c_2) 2^{n-1} = Q_j \oplus 2^{n-2} \oplus (g' \oplus c_2) 2^{n-1}$;

(c) $X_j$ will be replaced by $X_j \oplus g' \cdot 2^{n-1}$,

where $g = [I_j \oplus I_{j-1} \oplus Q_{j-1}]_{n-2} \oplus 1$ and $g' = [I_j \oplus I_{j-1} \oplus Q_{j-1} \oplus Q_j]_{n-2} \oplus 1$.

Proof. The proof follows the same argument as the proof of Theorem 3.3, but using $2^{n-2}$ and Lemma 3.9 instead of $2^{n-1}$ and Lemma 3.1, respectively.
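Theorem 3.11 can be checked in the same way, using the XOR forms of parts (a) to (c). The sketch assumes a toy block length of $n = 5$ bits and hypothetical helper names; `g_prime` corresponds to $g'$ in the theorem.

```python
from itertools import product

def step(m, i_prev, q_prev, n):
    """One ++AE chaining step: returns (I_j, Q_j, X_j) as in Algorithm 1."""
    i_cur = m ^ q_prev
    q_cur = (i_cur + i_prev + q_prev) % (1 << n)
    return i_cur, q_cur, q_cur ^ i_prev

def check_theorem_3_11(n=5):
    size, msb, second = 1 << n, 1 << (n - 1), 1 << (n - 2)
    for m, i_prev, q_prev in product(range(size), repeat=3):
        i, q, x = step(m, i_prev, q_prev, n)
        g_prime = (((i ^ i_prev ^ q_prev ^ q) >> (n - 2)) & 1) ^ 1
        for c1, c2 in product((0, 1), repeat=2):
            q_in = q_prev ^ second ^ (c1 * msb)     # replacement for Q_{j-1}
            i_in = i_prev ^ second ^ (c2 * msb)     # replacement for I_{j-1}
            assert step(m, i_in, q_in, n) == (
                i ^ second ^ (c1 * msb),
                q ^ second ^ ((g_prime ^ c2) * msb),
                x ^ (g_prime * msb))
    return True

print(check_theorem_3_11())   # prints True
```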

Based on these theorems, it is possible that mixtures of plaintext blocks $r, r^*, s, s^*$, where $r \in \{2^n - 1, 2^n - 2\}$, $r^* = r \oplus 2^{n-1}$, $s = r \oplus 2^{n-2}$, and $s^* = s \oplus 2^{n-1}$, can be used in a four-block subset within a plaintext message that permits forgeries similar to those discussed in Section 3.4. The suitability of a four-block collection is determined by the number of $s$ and $s^*$ blocks in the group.

Firstly, if there are no $s$ or $s^*$ blocks, then from Theorem 3.8, $I_{i+3} = I_{i-1}$ or $I^*_{i-1}$ and $Q_{i+3} = Q_{i-1}$ or $Q^*_{i-1}$ for all groups. Likewise, Theorems 3.10 and 3.11 show that any group with an even number of $s$ or $s^*$ terms will also have the property that $I_{i+3}$ and $Q_{i+3}$ can only differ from $I_{i-1}$ and $Q_{i-1}$ respectively in the most significant bit.

The same argument also shows that any group with an odd number of $s$ or $s^*$ terms will have the property that $I_{i+3}$ and $Q_{i+3}$ differ from $I_{i-1}$ and $Q_{i-1}$ respectively in the second-most significant bit and possibly also in the most significant bit, but not in any other bit. Therefore, provided the blocks are chosen within this collection, the values of $I_{i+3}$ for any two messages can only differ in the most significant bit, and likewise the values of $Q_{i+3}$.

3.3.2.2 Other conditions required for forgeries

Three properties are required in order to obtain forgeries from a group $(M_i, M_{i+1}, M_{i+2}, M_{i+3})$ of plaintext blocks such that each block can be $r$, $r^*$, $s$ or $s^*$. The properties relate to the values of the internal variables $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$ that correspond to the chosen group $(M_i, M_{i+1}, M_{i+2}, M_{i+3})$ as follows:

1. Two ciphertext blocks $C_j$ and $C_k$ differ exactly when the corresponding internal variables $X_j$ and $X_k$ differ.

2. It is required that either $X_i \neq X_{i+2}$ or $X_{i+1} \neq X_{i+3}$, or both.

3. The set of internal variables $(X'_i, X'_{i+1}, X'_{i+2}, X'_{i+3})$ obtained from $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$ by swapping either $X_i$ with $X_{i+2}$ or $X_{i+1}$ with $X_{i+3}$, or both, must correspond to another plaintext group within the same collection as the original group $(M_i, M_{i+1}, M_{i+2}, M_{i+3})$.

The first property always holds, since the encryption function $E_K(\cdot)$ and decryption function $D_K(\cdot)$ are both one-to-one. The third property holds as well; in fact, it holds for any permutation of $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$ due to the structure of ++AE. The following results enable us to demonstrate this fact.

Theorem 3.12. Suppose $M = (M_1, \ldots, M_i, M_{i+1}, M_{i+2}, M_{i+3}, \ldots, M_{m+1})$ where $m \geq 4$ and $1 \leq i \leq m - 3$, is an $m$-block plaintext message submitted for ++AE encryption where the final block, $M_{m+1}$, is the ICV.

(a) When $M_i = M_{i+1} = M_{i+2} = M_{i+3} = 2^n - 1$, then

\begin{align*}
X_i = X_{i+2} &= I_{i-1} \oplus (2^n - 1) \oplus (2^n - I_{i-1}) \\
X_{i+1} = X_{i+3} &= (2^n - 1 - Q_{i-1}) \oplus (2^n - 1) \oplus (Q_{i-1} + 1).
\end{align*}

(b) When $M_i = M_{i+1} = M_{i+2} = M_{i+3} = 2^n - 2$, then

\begin{align*}
X_i = X_{i+2} &= I_{i-1} \oplus (2^n - 2) \oplus (-I_{i-1} - 2[Q_{i-1}]_0 + 2[I_{i-1}]_0) \\
X_{i+1} = X_{i+3} &= (-2 - Q_{i-1} + 2[Q_{i-1}]_0) \oplus (2^n - 2) \oplus (2 + Q_{i-1} - 2[I_{i-1}]_0).
\end{align*}

Proof. For (a), these results follow immediately from the results of Theorem 3.6 upon noting (by Lemma 3.4) that:

\begin{align*}
I_{i-1} - 1 &= (2^n - 1) \oplus (2^n - I_{i-1}) \\
2^n - 2 - Q_{i-1} &= (2^n - 1) \oplus (Q_{i-1} + 1) \\
2^n - 1 - I_{i-1} &= (2^n - 1) \oplus I_{i-1} \\
Q_{i-1} &= (2^n - 1) \oplus (2^n - 1 - Q_{i-1})
\end{align*}

For (b), these results are obtained in a similar fashion from the results of Theorem 3.7, using Lemma 3.5 in place of Lemma 3.4.
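The equalities $X_i = X_{i+2}$ and $X_{i+1} = X_{i+3}$ of Theorem 3.12, together with the explicit expressions of part (a), can be confirmed exhaustively for a small block length. The sketch below assumes $n = 6$ bits and hypothetical helper names, with all sums and differences reduced modulo $2^n$.

```python
def cipher_inputs(group, i_prev, q_prev, n):
    """Return the cipher inputs X_i..X_{i+3} for a four-block group."""
    mask = (1 << n) - 1
    i, q, xs = i_prev, q_prev, []
    for m in group:
        i_new = m ^ q
        q_new = (i_new + i + q) & mask
        xs.append(q_new ^ i)
        i, q = i_new, q_new
    return xs

def check_theorem_3_12(n=6):
    size = 1 << n
    ones = size - 1
    for i0 in range(size):                   # I_{i-1}
        for q0 in range(size):               # Q_{i-1}
            for r in (size - 1, size - 2):   # cases (a) and (b)
                x = cipher_inputs((r,) * 4, i0, q0, n)
                assert x[0] == x[2] and x[1] == x[3]
            xa = cipher_inputs((ones,) * 4, i0, q0, n)
            # explicit expressions from part (a)
            assert xa[0] == i0 ^ ones ^ ((size - i0) % size)
            assert xa[1] == ((ones - q0) % size) ^ ones ^ ((q0 + 1) % size)
    return True

print(check_theorem_3_12())   # prints True
```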

Note 2: For future reference, denote the values of $X_i$ and $X_{i+1}$ from this theorem as $A_0$ and $A_1$ respectively (using the values from (a) when $r = 2^n - 1$ and the values from (b) when $r = 2^n - 2$).

Every plaintext group $(M_i, M_{i+1}, M_{i+2}, M_{i+3})$ corresponds to a unique set of internal variables $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$, and vice versa. Furthermore, Theorems 3.2, 3.3, 3.10 and 3.11 show that changing only the most significant bit and/or second-most significant bit of the plaintext blocks $(M_i, M_{i+1}, M_{i+2}, M_{i+3})$ can only affect these same bits of the internal variables $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$. In particular, note from Theorems 3.10(c) and 3.11(c) that the second-most significant bit of any $X_j$ will differ from $A_0$ or $A_1$ (as appropriate) exactly when $M_j = s$ or $s^*$. Thus, the collection of plaintext groups with an even number of $s$ or $s^*$ blocks corresponds one-to-one with a unique collection of $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$ sets, while the collection of groups with an odd number of $s$ or $s^*$ blocks corresponds one-to-one with a different unique collection of $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$ sets. Therefore, any permutation of the $X_j$ values within a set of $X_j$ values will result in another set in the same collection, as required.

Finally, we identify those groups of plaintext blocks for which $X_i \neq X_{i+2}$ or $X_{i+1} \neq X_{i+3}$ as required by property 2 above. This is done according to the number of $s$ or $s^*$ blocks contained in each group. (Recall that $r \in \{2^n - 1, 2^n - 2\}$, $r^* = r \oplus 2^{n-1}$, $s = r \oplus 2^{n-2}$ and $s^* = s \oplus 2^{n-1}$.)

3.3.2.2.1 Plaintext groups containing no $s$ or $s^*$ terms: These groups are represented as follows:

\[ (M_i, M_{i+1}, M_{i+2}, M_{i+3}) = (r/r^*, r/r^*, r/r^*, r/r^*) \]

There are exactly $2 \times 2^4 = 32$ possible groups in this collection. Firstly, consider the case in which there is an odd number of $r^*$ blocks. Theorems 3.2, 3.3 and 3.8 indicate that exactly one of the pairs $(X_i, X_{i+2})$ and $(X_{i+1}, X_{i+3})$ must have terms that differ in the most significant bit. In this case, all 16 groups with this property can be used in our guaranteed forgery.

Now consider the case in which the number of $r^*$ blocks in the group of four consecutive blocks is even. When there are exactly two $r^*$ blocks and these blocks are either consecutive or first and last in the group, then both the pairs $(X_i, X_{i+2})$ and $(X_{i+1}, X_{i+3})$ will always differ in the most significant bit. Thus, guaranteed forgeries can be obtained for each of the 8 groups in this case. In the remaining cases, the internal variables $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$ will be equal to $(A_0, A_1, A_0, A_1)$, $(A_0^*, A_1^*, A_0^*, A_1^*)$, $(A_0^*, A_1, A_0^*, A_1)$ or $(A_0, A_1^*, A_0, A_1^*)$. Swapping $X_i$ with $X_{i+2}$ and $X_{i+1}$ with $X_{i+3}$ in these cases will give the same pattern and will not allow a forgery. In summary, of the 32 groups with this pattern, there are $16 + 8 = 24$ groups which allow forgeries of this type.
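As an illustration of a group with an odd number of $r^*$ blocks, the sketch below takes the group $(r, r, r, r^*)$ with $r = 2^n - 1$ and swaps the cipher inputs $X_{i+1}$ and $X_{i+3}$. Because $E_K$ is one-to-one, this is equivalent to swapping the ciphertext blocks $C_{i+1}$ and $C_{i+3}$, so the block cipher is omitted. The values of $IV_a$, $IV_b$ and ICV are random stand-ins and the helper names are hypothetical.

```python
import random

def forward(blocks, q0, i0, n):
    """Cipher inputs X_j produced by the ++AE chaining for `blocks`."""
    mask = (1 << n) - 1
    q, i, xs = q0, i0, []
    for m in blocks:
        i_new = m ^ q
        q_new = (i_new + i + q) & mask
        xs.append(q_new ^ i)
        i, q = i_new, q_new
    return xs

def backward(xs, q0, i0, n):
    """Blocks recovered from cipher inputs X_j (Algorithm 2 without D_K)."""
    mask = (1 << n) - 1
    q, i, blocks = q0, i0, []
    for x in xs:
        q_new = x ^ i
        i_new = (q_new - i - q) & mask
        blocks.append(i_new ^ q)
        i, q = i_new, q_new
    return blocks

n = 128
msb = 1 << (n - 1)
r = (1 << n) - 1
r_star = r ^ msb
rng = random.Random(7)
iv_a, iv_b, icv = (rng.getrandbits(n) for _ in range(3))
message = [rng.getrandbits(n), r, r, r, r_star, rng.getrandbits(n)]

xs = forward(message + [icv], iv_a, iv_b, n)
xs[2], xs[4] = xs[4], xs[2]              # swap X_{i+1} and X_{i+3}
recovered = backward(xs, iv_a, iv_b, n)

print(recovered[-1] == icv)                  # prints True: the check still passes
print(recovered[1:5] == [r, r_star, r, r])   # prints True: the group has changed
```

The recovered message differs from the original in two blocks of the group, yet the recovered ICV is unchanged, which is the behaviour described for this collection.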

3.3.2.2.2 Plaintext groups containing one $s$ or $s^*$ term: These groups are represented as follows:

\[ (M_i, M_{i+1}, M_{i+2}, M_{i+3}) \in \left\{ \begin{array}{ll} (r/r^*, r/r^*, r/r^*, s/s^*), & (r/r^*, r/r^*, s/s^*, r/r^*), \\ (r/r^*, s/s^*, r/r^*, r/r^*), & (s/s^*, r/r^*, r/r^*, r/r^*) \end{array} \right\} \]

Note that there are exactly $2 \times 2^4 \times 4 = 128$ possible groups of plaintext blocks with this pattern; each of these corresponds to a different pattern of $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$ values. For all groups in this collection, Theorems 3.10 and 3.11 indicate that exactly one of the pairs $(X_i, X_{i+2})$ and $(X_{i+1}, X_{i+3})$ must have terms that differ in the second-most significant bit.

3.3.2.2.3 Plaintext groups containing two $s$ or $s^*$ terms: The groups are represented as follows:

\[ (M_i, M_{i+1}, M_{i+2}, M_{i+3}) \in \left\{ \begin{array}{ll} (r/r^*, r/r^*, s/s^*, s/s^*), & (r/r^*, s/s^*, s/s^*, r/r^*), \\ (s/s^*, s/s^*, r/r^*, r/r^*), & (s/s^*, r/r^*, r/r^*, s/s^*), \\ \multicolumn{2}{c}{\text{- - - - - - - - - - - - - - - - - - - -}} \\ (r/r^*, s/s^*, r/r^*, s/s^*), & (s/s^*, r/r^*, s/s^*, r/r^*) \end{array} \right\} \]

There are $2 \times 2^4 \times 6 = 192$ possible groups of plaintext blocks, of which 128 do not have the $(s/s^*)$ terms in alternate positions (above the dashed line) and the remaining 64 have (below the dashed line). First consider the non-alternating groups. As before by Theorems 3.10 and 3.11, the $X_j$ values corresponding to $M_j = s/s^*$ will differ from $A_0$ or $A_1$ in the second-most significant bit, while $X_j$ values corresponding to $M_j = r/r^*$ will not. This guarantees that $X_i \neq X_{i+2}$ and $X_{i+1} \neq X_{i+3}$.

Now consider the groups having alternate positions of the $s$ or $s^*$ terms. For these groups, $X_i$ and $X_{i+2}$ will always agree in the second-most significant bit, as will $X_{i+1}$ and $X_{i+3}$. However, further consideration of the most significant bit (again using Theorems 3.10 and 3.11) shows that either $X_i \oplus X_{i+2} = X_{i+1} \oplus X_{i+3}$ or $X_i \oplus X_{i+2} = X_{i+1} \oplus X_{i+3} \oplus 2^{n-1}$, and that each of these results holds for half of the groups in this collection. In the latter case, exactly one of the conditions $X_i \neq X_{i+2}$ and $X_{i+1} \neq X_{i+3}$ holds, and so a guaranteed forgery can be obtained for each of the $64/2 = 32$ groups with this property. In summary, $128 + 32 = 160$ groups with this pattern can be used for this type of forgery.

3.3.2.2.4 Plaintext groups containing three s or s∗ terms: These groups are represented as follows:

$$(M_i, M_{i+1}, M_{i+2}, M_{i+3}) \in \left\{ \begin{array}{ll} (r/r^*, s/s^*, s/s^*, s/s^*), & (s/s^*, r/r^*, s/s^*, s/s^*), \\ (s/s^*, s/s^*, r/r^*, s/s^*), & (s/s^*, s/s^*, s/s^*, r/r^*) \end{array} \right\}$$

Again, there are exactly $2 \times 2^4 \times 4 = 128$ possible groups of plaintext blocks of this form. In this case, each pattern of $(X_i, X_{i+1}, X_{i+2}, X_{i+3})$ values must have three terms which differ from $(A_0, A_1, A_0, A_1)$ in the second-most significant bit. Thus, exactly one of the pairs $(X_i, X_{i+2})$ and $(X_{i+1}, X_{i+3})$ must differ in this bit position.

3.3.2.2.5 Plaintext groups containing four s or s∗ terms: These groups are represented as follows:

$$(M_i, M_{i+1}, M_{i+2}, M_{i+3}) = (s/s^*, s/s^*, s/s^*, s/s^*)$$

There are exactly $2 \times 2^4 = 32$ possible groups of plaintext blocks of this form. Of these, the groups with an odd number of s∗ terms can be shown to have either $X_i \neq X_{i+2}$ or $X_{i+1} \neq X_{i+3}$. Thus, the 16 groups with this property will each give a guaranteed forgery.

The total number of choices for groups of four consecutive plaintext blocks that give a guaranteed forgery is 456, as accumulated in the following table:

Number of s/s∗ terms 0 1 2 3 4 Total

Number of choices 24 128 160 128 16 456
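The tallies in this table can be rechecked mechanically. The following Python sketch is a bookkeeping aid only: the numbers of forgeable groups per case are copied from the subsections above, and the group totals reuse the factors quoted there. Reading the factors 4 and 6 as the number of placements of the s/s∗ terms, C(4, k), is an interpretation made for this sketch, and all names are illustrative.

```python
from math import comb

# Forgeable groups for k = 0..4 s/s* terms, copied from the table above.
FORGEABLE = {0: 24, 1: 128, 2: 160, 3: 128, 4: 16}


def groups_with_k_s_terms(k: int) -> int:
    """Total groups with k s/s* terms, using the 2 * 2^4 * C(4, k) counts.

    Assumption: the factors 4 and 6 quoted in the text are C(4, 1) = C(4, 3)
    and C(4, 2), i.e. the placements of the s/s* terms within the group.
    """
    return 2 * 2**4 * comb(4, k)


TOTALS = {k: groups_with_k_s_terms(k) for k in range(5)}
TOTAL_FORGEABLE = sum(FORGEABLE.values())

if __name__ == "__main__":
    for k in range(5):
        print(f"{k} s/s* terms: {FORGEABLE[k]} of {TOTALS[k]} groups")
    print("groups giving a guaranteed forgery:", TOTAL_FORGEABLE)
```

The per-case totals produced this way agree with the 32, 128, 192, 128 and 32 groups stated in the text, and the forgeable counts sum to the 456 reported in the table.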

3.4 Forgery attacks on ++AE

In this section, the properties discussed in Section 3.3 are used to construct various forgeries against ++AE. A man-in-the-middle attack model is considered where the attacker is able to intercept and potentially alter messages before sending them on to the intended recipient. The proposed forgery attacks only require a single chosen plaintext message to be encrypted using ++AE. The attacker can use the corresponding ciphertext and modify it in various ways without invalidating the MDC or its decryption to the original ICV, provided the length of the ciphertext is unchanged. Possible modifications include inserting and deleting certain groups of ciphertext blocks, reordering groups of blocks or swapping blocks within groups. The attacks are deterministic and the modified ciphertext is guaranteed to pass the ++AE integrity test during decryption in every case. These forgeries are applicable to both versions of ++AE. Note that these forgeries do not depend on the underlying block cipher algorithm, but only on the properties of the block chaining mechanism analysed above.

3.4.1 Forgery attack using two groups

3.4.1.1 Forgery attack using insertion and deletion

For this proposed attack, suppose a plaintext message M contains two separate groups of four consecutive blocks whose values are chosen from those listed in Theorem 3.8(a) of Section 3.3.1. The values chosen for the two groups may be the same or different. The message is chosen as follows:

$$M = (M_1, \ldots, M_i, M_{i+1}, M_{i+2}, M_{i+3}, \ldots, M_j, M_{j+1}, M_{j+2}, M_{j+3}, \ldots, M_{m+1})$$

where $m \geq 9$, as shown in Figure 3.3. This message is encrypted using ++AE to produce a ciphertext message $C = (C_1, C_2, \ldots, C_{m+1})$, such that $C_i, C_{i+1}, C_{i+2}, C_{i+3}$ correspond to $M_i, M_{i+1}, M_{i+2}, M_{i+3}$ and $C_j, C_{j+1}, C_{j+2}, C_{j+3}$ correspond to $M_j, M_{j+1}, M_{j+2}, M_{j+3}$.

Figure 3.3: The chosen message for forgeries against ++AE. (The figure shows the plaintext row $M_1, \ldots, M_{m+1}$ with $M_{m+1} = ICV$ above the ciphertext row $C_1, \ldots, C_{m+1}$ with $C_{m+1} = MDC$.)

One way to construct a forged ciphertext message $C^*$ from the obtained ciphertext C is by deleting the four consecutive ciphertext blocks $(C_i, C_{i+1}, C_{i+2}, C_{i+3})$ and repeating the four consecutive ciphertext blocks $(C_j, C_{j+1}, C_{j+2}, C_{j+3})$. The forged ciphertext message is established as follows:

$$\begin{aligned} C^*_k &= C_k && (1 \leq k \leq i-1); \\ C^*_k &= C_{k+4} && (i \leq k \leq j-1); \\ C^*_k &= C_k && (j \leq k \leq m+1). \end{aligned}$$

Such a forged message C∗, as illustrated in Figure 3.4, will not cause any change in the number of blocks m. It will also produce the same MDC and decrypt to give the same ICV, so it will be accepted as genuine by the receiver.
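The index mapping for this first variant can be sketched in a few lines of Python. This is only an illustration of the block rearrangement: it does not implement ++AE itself, the function name is hypothetical, and the ciphertext is assumed to be held as a list of blocks $C_1, \ldots, C_{m+1}$ addressed with the 1-based indices used in the text.

```python
def delete_group_i_repeat_group_j(c, i, j):
    """Build C* by dropping C_i..C_{i+3} and repeating C_j..C_{j+3}.

    c holds the blocks C_1..C_{m+1}; i and j are 1-based as in the text.
    Assumption of this sketch: the groups are disjoint (i + 4 <= j) and
    the second group ends before the final block (j + 3 <= m).
    """
    n_blocks = len(c)  # equals m + 1
    if not (1 <= i and i + 4 <= j and j + 3 <= n_blocks - 1):
        raise ValueError("groups must be disjoint and precede the final block")
    forged = []
    for k in range(1, n_blocks + 1):
        if i <= k <= j - 1:
            src = k + 4  # C*_k = C_{k+4}
        else:
            src = k      # C*_k = C_k
        forged.append(c[src - 1])
    return forged


if __name__ == "__main__":
    labels = [f"C{k}" for k in range(1, 12)]
    print(delete_group_i_repeat_group_j(labels, 2, 7))
```

Because every forged position is filled from exactly one original position, the length of the ciphertext is unchanged, which is the precondition stated in Section 3.4.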

Figure 3.4: Forgery attack using insertion and deletion. (The figure shows M and C followed by the forged ciphertext $C^* = (C_1, \ldots, C_{i-1}, \ldots, C_{j-1}, C_j, C_{j+1}, C_{j+2}, C_{j+3}, C_j, C_{j+1}, C_{j+2}, C_{j+3}, \ldots, C_m, C_{m+1})$ and its decryption $M^*$, whose plaintext blocks are arranged in the same way.)

Another way to construct a forged ciphertext message from C is by inserting the four blocks $(C_i, C_{i+1}, C_{i+2}, C_{i+3})$ after the original ones and deleting the blocks $(C_j, C_{j+1}, C_{j+2}, C_{j+3})$. That is, the forged ciphertext message will be as follows:

$$\begin{aligned} C^*_k &= C_k && (1 \leq k \leq i+3); \\ C^*_k &= C_{k-4} && (i+4 \leq k \leq j+3); \\ C^*_k &= C_k && (j+4 \leq k \leq m+1). \end{aligned}$$

Note that this approach can be used for any of the groups of four plaintext blocks identified in Theorem 3.8(a). Thus, there are multiple chosen plaintext messages that can be used to obtain ciphertext messages, each of which can be manipulated in various ways to obtain a guaranteed forgery.
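The second variant differs from the first only in the direction of the shift. A minimal Python sketch of its index mapping follows; as before, it rearranges labelled blocks rather than running ++AE, and the function name and the list representation of $C_1, \ldots, C_{m+1}$ are assumptions made for illustration.

```python
def repeat_group_i_delete_group_j(c, i, j):
    """Build C* by repeating C_i..C_{i+3} in place and dropping C_j..C_{j+3}.

    c holds the blocks C_1..C_{m+1}; i and j are 1-based as in the text.
    Assumption of this sketch: i + 4 <= j and j + 3 <= m.
    """
    n_blocks = len(c)  # equals m + 1
    if not (1 <= i and i + 4 <= j and j + 3 <= n_blocks - 1):
        raise ValueError("groups must be disjoint and precede the final block")
    forged = []
    for k in range(1, n_blocks + 1):
        if i + 4 <= k <= j + 3:
            src = k - 4  # C*_k = C_{k-4}
        else:
            src = k      # C*_k = C_k
        forged.append(c[src - 1])
    return forged


if __name__ == "__main__":
    labels = [f"C{k}" for k in range(1, 12)]
    print(repeat_group_i_delete_group_j(labels, 2, 7))
```

With eleven labelled blocks, i = 2 and j = 7, the group starting at $C_2$ appears twice in succession and the group starting at $C_7$ no longer appears, while the final block stays in place.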

3.4.1.2 Forgery attack using swapping

Instead of insertion and deletion using two groups of four consecutive plaintext blocks, two consecutive groups from the list of groups in Theorem 3.8(a) can also be used, provided that the groups are not identical to one another. An attacker can then construct a forged ciphertext message by swapping the ciphertext blocks of the two groups. These modifications do not cause any change in the number of blocks m, or in the internal chaining values, Q and I. Hence, the MDC and its decryption to the original ICV will remain unchanged.

To illustrate the proposed attacks, suppose a plaintext message M contains two consecutive groups of plaintext blocks whose values are chosen from the list in Theorem 3.8(a). The two groups can be anywhere in the plaintext before the last block as follows:

$$M = (M_1, \ldots, M_i, M_{i+1}, M_{i+2}, M_{i+3}, M_{i+4}, M_{i+5}, M_{i+6}, M_{i+7}, \ldots, M_{m+1}) \quad (m \geq 9)$$

The message is encrypted using ++AE to produce a ciphertext message $C = (C_1, C_2, \ldots, C_{m+1})$, such that $C_i, C_{i+1}, C_{i+2}, C_{i+3}$ correspond to $M_i, M_{i+1}, M_{i+2}, M_{i+3}$ and $C_{i+4}, C_{i+5}, C_{i+6}, C_{i+7}$ correspond to $M_{i+4}, M_{i+5}, M_{i+6}, M_{i+7}$.

A forged ciphertext message $C^*$ can be constructed from the obtained ciphertext C by swapping the four consecutive ciphertext blocks $(C_i, C_{i+1}, C_{i+2}, C_{i+3})$ with the ciphertext blocks $(C_{i+4}, C_{i+5}, C_{i+6}, C_{i+7})$. The forged ciphertext message is established as follows:

$$\begin{aligned} C^*_k &= C_k && (1 \leq k \leq i-1); \\ C^*_k &= C_{k+4} && (i \leq k \leq i+3); \\ C^*_k &= C_{k-4} && (i+4 \leq k \leq i+7); \\ C^*_k &= C_k && (i+8 \leq k \leq m+1). \end{aligned}$$

Such a forged message C∗, as illustrated in Figure 3.5, will also produce the same MDC and decrypt to give the same ICV, so it will be accepted as genuine by the receiver.
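The swap of two consecutive groups amounts to exchanging two adjacent four-block slices. The Python sketch below shows only this rearrangement, under the assumption that the ciphertext is stored as a list of blocks $C_1, \ldots, C_{m+1}$; the function name is hypothetical and no ++AE computation is performed.

```python
def swap_consecutive_groups(c, i):
    """Build C* by exchanging C_i..C_{i+3} with C_{i+4}..C_{i+7}.

    c holds the blocks C_1..C_{m+1}; i is 1-based as in the text.
    Assumption of this sketch: both groups end before the final block,
    i.e. i + 7 <= m.
    """
    n_blocks = len(c)  # equals m + 1
    if not (1 <= i and i + 7 <= n_blocks - 1):
        raise ValueError("both groups must precede the final block")
    forged = list(c)
    # 1-based positions i..i+3 are the 0-based slice [i-1 : i+3].
    forged[i - 1:i + 3] = c[i + 3:i + 7]   # C*_k = C_{k+4}
    forged[i + 3:i + 7] = c[i - 1:i + 3]   # C*_k = C_{k-4}
    return forged


if __name__ == "__main__":
    labels = [f"C{k}" for k in range(1, 11)]
    print(swap_consecutive_groups(labels, 2))
```

Applying the swap twice restores the original ciphertext, which is a convenient sanity check on the index arithmetic.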

Figure 3.5: Forgery attack by swapping groups. (The figure shows M and C followed by the forged ciphertext $C^* = (C_1, \ldots, C_{i-1}, C_{i+4}, C_{i+5}, C_{i+6}, C_{i+7}, C_i, C_{i+1}, C_{i+2}, C_{i+3}, \ldots, C_m, C_{m+1})$ and its decryption $M^*$, whose plaintext blocks are arranged in the same way.)

Note that the same procedure is valid for any two different groups mentioned in Theorem 3.8(a). Hence, more plaintext messages can be chosen to obtain ciphertext messages, each of which can be manipulated to obtain a guaranteed forgery.

3.4.2 Forgery attack using a single group

Suppose a plaintext message M contains a group of four consecutive blocks $(M_i, M_{i+1}, M_{i+2}, M_{i+3})$ according to one of the patterns discussed in Section 3.3.2, as follows:

$$M = (M_1, \ldots, M_i, M_{i+1}, M_{i+2}, M_{i+3}, \ldots, M_{m+1})$$

where $m \geq 4$, $1 \leq i \leq m-3$ and the final block, $M_{m+1}$, is the ICV. This message is encrypted using ++AE and results in the ciphertext C where the final block, $C_{m+1}$, is the MDC:

$$C = (C_1, \ldots, C_i, C_{i+1}, C_{i+2}, C_{i+3}, \ldots, C_{m+1}).$$

A forged ciphertext message $C^*$ can be constructed from the obtained ciphertext C in up to three ways, as follows:

(1) by swapping the ciphertext block $C_i$ with the ciphertext block $C_{i+2}$;

(2) by swapping the ciphertext block $C_{i+1}$ with the ciphertext block $C_{i+3}$;

(3) by swapping both the ciphertext block $C_i$ with the ciphertext block $C_{i+2}$ and the ciphertext block $C_{i+1}$ with the ciphertext block $C_{i+3}$.

The three different forged ciphertext messages, as depicted in Figure 3.6, are established as follows:

(1) $C^* = (C_1, \ldots, C_{i+2}, C_{i+1}, C_i, C_{i+3}, \ldots, C_{m+1})$;

(2) $C^* = (C_1, \ldots, C_i, C_{i+3}, C_{i+2}, C_{i+1}, \ldots, C_{m+1})$;

(3) $C^* = (C_1, \ldots, C_{i+2}, C_{i+3}, C_i, C_{i+1}, \ldots, C_{m+1})$.

Figure 3.6: Forgery attack by swapping blocks within a group. (The figure shows M and C followed by the forged ciphertext $C^* = (C_1, \ldots, C_{i+2}, C_{i+3}, C_i, C_{i+1}, \ldots, C_{m+1})$ and its decryption $M^* = (M_1, \ldots, M_{i+2}, M_{i+3}, M_i, M_{i+1}, \ldots, M_{m+1})$.)

These changes to the ciphertext will not affect the MDC. After decryption they give the same ICV, so the altered plaintext will be accepted as genuine by the receiver.

Note that this approach can be used for any of the 456 groups of four plaintext blocks identified in Section 3.3.2. Thus, there are many chosen plaintext messages that can be used to obtain ciphertext messages each of which has at least two blocks of different value that can be swapped without affecting the ++AE integrity process.
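The three candidate modifications of a single group can be generated with one small helper. The Python sketch below only constructs the rearranged block lists; which of the three are valid forgeries for a given plaintext group depends on the conditions derived in Section 3.3.2, which the sketch does not check. The function name and the list representation of $C_1, \ldots, C_{m+1}$ are assumptions.

```python
def single_group_candidates(c, i):
    """Return the three candidate forgeries (1), (2) and (3) for group i.

    c holds the blocks C_1..C_{m+1}; i is 1-based with 1 <= i <= m - 3,
    so the group C_i..C_{i+3} ends before the final block.
    """
    n_blocks = len(c)  # equals m + 1
    if not (1 <= i and i + 3 <= n_blocks - 1):
        raise ValueError("the group must precede the final block")

    def with_swaps(pairs):
        out = list(c)
        for a, b in pairs:  # a and b are 1-based block positions
            out[a - 1], out[b - 1] = out[b - 1], out[a - 1]
        return out

    return [
        with_swaps([(i, i + 2)]),                    # (1) C_i <-> C_{i+2}
        with_swaps([(i + 1, i + 3)]),                # (2) C_{i+1} <-> C_{i+3}
        with_swaps([(i, i + 2), (i + 1, i + 3)]),    # (3) both swaps
    ]


if __name__ == "__main__":
    labels = [f"C{k}" for k in range(1, 8)]
    for candidate in single_group_candidates(labels, 2):
        print(candidate)
```

Each candidate has the same length as the original ciphertext and leaves the final block untouched.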

3.5 Experimental verification

In this section, the proposed forgery attacks on ++AE are verified experimentally as follows:

• The first experiment (illustrated in Section 3.5.1) deletes and inserts particular groups of blocks in the original ciphertext message.

• The second experiment (illustrated in Section 3.5.2) changes the order of particular blocks in two consecutive groups such that each group consists of four blocks.

• The third experiment (illustrated in Section 3.5.3) also changes the order of particular blocks, but it requires only a single group instead of two groups.

The underlying block cipher used in these experiments is AES with a 128-bit key. The experiments were implemented in C and compiled with GCC. They were run on a desktop computer with the 64-bit Windows 7 Enterprise operating system and an Intel Core i7-4790 3.6 GHz processor.

3.5.1 Verification of insertion and deletion attack

The experiment is performed for a plaintext message of eleven 128-bit blocks as follows:

(1) An attacker chooses a plaintext message containing two separate groups such that each group contains four consecutive identical blocks.

(2) The eleven-block plaintext is encrypted using ++AE mode with 128-bit AES as the underlying block cipher.

(3) The attacker can construct a forged message by deleting the ciphertext blocks corresponding to one group and repeating the ciphertext blocks corresponding to the second group, as shown in Figure 3.4.

(4) The modified ciphertext is decrypted using ++AE mode with 128-bit AES as the underlying block cipher.

The above forgery attack is demonstrated with a specific example, as follows:

(1) Form a plaintext of eleven 128-bit blocks, taking the last block as the ICV, as shown in Table 3.2. The message has the blocks $(M_2, M_3, M_4, M_5) = (2^{128} - 1, 2^{128} - 1, 2^{127} - 1, 2^{127} - 1)$ as one group and the blocks $(M_7, M_8, M_9, M_{10}) = (2^{127} - 2, 2^{128} - 2, 2^{127} - 2, 2^{128} - 2)$ as the second group. These two groups are shown in bold. The remaining plaintext blocks have been chosen randomly.

(2) Table 3.3 shows the ciphertext obtained from the encryption of the message in Table 3.2. The corresponding ciphertext blocks $(C_2, C_3, C_4, C_5)$ and $(C_7, C_8, C_9, C_{10})$ are shown in bold.

(3) The ciphertext is modified by deleting the ciphertext blocks $(C_2, C_3, C_4, C_5)$ and repeating the ciphertext blocks $(C_7, C_8, C_9, C_{10})$, as shown in Table 3.4.

(4) The modified ciphertext is decrypted to obtain the message shown in Table 3.5. Note that the obtained plaintext message is different to the original chosen plaintext shown in Table 3.2, but both have the same last block.

Table 3.2: Chosen plaintext in the first experiment.
Table 3.3: Obtained ciphertext in the first experiment.

k   Plaintext block (Table 3.2)       Ciphertext block (Table 3.3)
1   6bc1bee22e409f96e93d7e117393172a  b9501472308964e206f3652aa72ccdd6
2   ffffffffffffffffffffffffffffffff  8b6af01acb7464cb68c4a3548aaf95a6
3   ffffffffffffffffffffffffffffffff  469c7fcb75d5d9a1b418cb997b09a185
4   7fffffffffffffffffffffffffffffff  7784d2a5aac5066e710a151d5590c008
5   7fffffffffffffffffffffffffffffff  e2c75fad134f621d9223df0a9c2793d8
6   30c81c46a35ce411e5fbc1191a0a52ea  c213b40c852738f6d446d79eb8f79b03
7   7ffffffffffffffffffffffffffffffe  4106dc419a06db149dd22bfb583ffeee
8   fffffffffffffffffffffffffffffffe  7df76b0c1ab899b33e42f047b91b546f
9   7ffffffffffffffffffffffffffffffe  4106dc419a06db149dd22bfb583ffeee
10  fffffffffffffffffffffffffffffffe  7df76b0c1ab899b33e42f047b91b546f
11  f69f2445df4f9b17ad2b417be66c3710  3758893b632fd6857c89f9b5aeb1427d

Table 3.4: Modified ciphertext in the first experiment.
Table 3.5: Decrypted modified ciphertext in the first experiment.

k   Modified ciphertext (Table 3.4)   Decrypted block (Table 3.5)
1   b9501472308964e206f3652aa72ccdd6  6bc1bee22e409f96e93d7e117393172a
2   c213b40c852738f6d446d79eb8f79b03  30c81c46a35ce411e5fbc1191a0a52ea
3   4106dc419a06db149dd22bfb583ffeee  7ffffffffffffffffffffffffffffffe
4   7df76b0c1ab899b33e42f047b91b546f  fffffffffffffffffffffffffffffffe
5   4106dc419a06db149dd22bfb583ffeee  7ffffffffffffffffffffffffffffffe
6   7df76b0c1ab899b33e42f047b91b546f  fffffffffffffffffffffffffffffffe
7   4106dc419a06db149dd22bfb583ffeee  7ffffffffffffffffffffffffffffffe
8   7df76b0c1ab899b33e42f047b91b546f  fffffffffffffffffffffffffffffffe
9   4106dc419a06db149dd22bfb583ffeee  7ffffffffffffffffffffffffffffffe
10  7df76b0c1ab899b33e42f047b91b546f  fffffffffffffffffffffffffffffffe
11  3758893b632fd6857c89f9b5aeb1427d  f69f2445df4f9b17ad2b417be66c3710
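The chosen group values in step (1) are given as integers, whereas the tables list 32-digit hexadecimal strings. The short Python sketch below converts one to the other so the two can be compared by eye. It assumes a big-endian hexadecimal rendering of each 128-bit block, which is how the tables appear to be printed; the helper name is illustrative.

```python
N_BITS = 128


def block_hex(value: int, n_bits: int = N_BITS) -> str:
    """Render an n-bit block value as a zero-padded hexadecimal string."""
    return format(value, f"0{n_bits // 4}x")


# Group values from step (1) of the first experiment.
GROUP_ONE = [2**128 - 1, 2**128 - 1, 2**127 - 1, 2**127 - 1]   # M2..M5
GROUP_TWO = [2**127 - 2, 2**128 - 2, 2**127 - 2, 2**128 - 2]   # M7..M10

if __name__ == "__main__":
    for name, group in (("M2..M5", GROUP_ONE), ("M7..M10", GROUP_TWO)):
        print(name)
        for value in group:
            print("  ", block_hex(value))
```

The strings printed for the two groups can be checked directly against the corresponding rows of Table 3.2.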

3.5.2 Verification of swapping attack using two groups

The experiment outlined in Section 3.5.1 is repeated, but the first group now is $(M_2, M_3, M_4, M_5) = (2^{127} - 1, 2^{128} - 1, 2^{128} - 1, 2^{127} - 1)$ and the second group is $(M_6, M_7, M_8, M_9) = (2^{128} - 2, 2^{127} - 2, 2^{127} - 2, 2^{128} - 2)$. The ciphertext is modified by swapping the ciphertext blocks $(C_2, C_3, C_4, C_5)$ with the ciphertext blocks $(C_6, C_7, C_8, C_9)$. Tables 3.6, 3.7, 3.8 and 3.9 show the first four steps of the experiment.

3.5.3 Verification of swapping attack using a single group

This experiment is performed for a plaintext message of seven 128-bit blocks as follows:

Table 3.6: Chosen plaintext in the second experiment.
Table 3.7: Obtained ciphertext in the second experiment.

k   Plaintext block (Table 3.6)       Ciphertext block (Table 3.7)
1   6bc1bee22e409f96e93d7e117393172a  b9501472308964e206f3652aa72ccdd6
2   7fffffffffffffffffffffffffffffff  7784d2a5aac5066e710a151d5590c008
3   ffffffffffffffffffffffffffffffff  469c7fcb75d5d9a1b418cb997b09a185
4   ffffffffffffffffffffffffffffffff  8b6af01acb7464cb68c4a3548aaf95a6
5   7fffffffffffffffffffffffffffffff  e2c75fad134f621d9223df0a9c2793d8
6   fffffffffffffffffffffffffffffffe  7df76b0c1ab899b33e42f047b91b546f
7   7ffffffffffffffffffffffffffffffe  4106dc419a06db149dd22bfb583ffeee
8   7ffffffffffffffffffffffffffffffe  f6c71eedc3d99bb183cb5b8d1568e606
9   fffffffffffffffffffffffffffffffe  973f2ef34879e2027f1734303ff21f89
10  f69f2445df4f9b17ad2b417be66c3710  e0cd0b77eed4f7041fde256038dbf51f

Table 3.8: Modified ciphertext in the second experiment.
Table 3.9: Decrypted modified ciphertext in the second experiment.

k   Modified ciphertext (Table 3.8)   Decrypted block (Table 3.9)
1   b9501472308964e206f3652aa72ccdd6  6bc1bee22e409f96e93d7e117393172a
2   7df76b0c1ab899b33e42f047b91b546f  fffffffffffffffffffffffffffffffe
3   4106dc419a06db149dd22bfb583ffeee  7ffffffffffffffffffffffffffffffe
4   f6c71eedc3d99bb183cb5b8d1568e606  7ffffffffffffffffffffffffffffffe
5   973f2ef34879e2027f1734303ff21f89  fffffffffffffffffffffffffffffffe
6   7784d2a5aac5066e710a151d5590c008  7fffffffffffffffffffffffffffffff
7   469c7fcb75d5d9a1b418cb997b09a185  ffffffffffffffffffffffffffffffff
8   8b6af01acb7464cb68c4a3548aaf95a6  ffffffffffffffffffffffffffffffff
9   e2c75fad134f621d9223df0a9c2793d8  7fffffffffffffffffffffffffffffff
10  e0cd0b77eed4f7041fde256038dbf51f  f69f2445df4f9b17ad2b417be66c3710

(1) Construct a plaintext message containing one of the groups of four consecutive blocks listed in Section 3.3.2.

(2) Encrypt the seven-block plaintext using 128-bit AES in ++AE mode.

(3) Swap the corresponding ciphertext blocks, as shown in Figure 3.6.

(4) Decrypt the modified ciphertext using 128-bit AES in ++AE mode.

(5) Compare the ICV at the end of the recovered plaintext with the original ICV.

The above forgery attack is demonstrated with a specific example, as follows:

(1) Construct a plaintext M of seven 128-bit blocks, where the last block, $M_7$, is the ICV. Set $(M_2, M_3, M_4, M_5) = (2^n - 1, 2^n - 1, 2^{n-1} - 1, 2^{n-1} - 1)$. Choose the remaining plaintext blocks randomly. Our example is shown in Table 3.10, with $M_2$, $M_3$, $M_4$ and $M_5$ shown in bold.

(2) Encrypt M to obtain the ciphertext, shown in Table 3.11. The ciphertext blocks $C_2$, $C_3$, $C_4$ and $C_5$ are shown in bold.

Table 3.10: Chosen plaintext in the third experiment.
Table 3.11: Obtained ciphertext in the third experiment.

k   Plaintext block (Table 3.10)      Ciphertext block (Table 3.11)
1   6bc1bee22e409f96e93d7e117393172a  b9501472308964e206f3652aa72ccdd6
2   ffffffffffffffffffffffffffffffff  8b6af01acb7464cb68c4a3548aaf95a6
3   ffffffffffffffffffffffffffffffff  469c7fcb75d5d9a1b418cb997b09a185
4   7fffffffffffffffffffffffffffffff  7784d2a5aac5066e710a151d5590c008
5   7fffffffffffffffffffffffffffffff  e2c75fad134f621d9223df0a9c2793d8
6   30c81c46a35ce411e5fbc1191a0a52ea  c213b40c852738f6d446d79eb8f79b03
7   f69f2445df4f9b17ad2b417be66c3710  3758893b632fd6857c89f9b5aeb1427d

(3) Modify the ciphertext by swapping $C_2$ with $C_4$ and $C_3$ with $C_5$, as shown in Table 3.12.

(4) Decrypt the modified ciphertext to obtain the message shown in Table 3.13.

(5) Note that both messages have the same last block (ICV), but the obtained plaintext message is different to the original chosen plaintext shown in Table 3.10.

Table 3.12: Modified ciphertext in the third experiment.
Table 3.13: Decrypted modified ciphertext in the third experiment.

k   Modified ciphertext (Table 3.12)  Decrypted block (Table 3.13)
1   b9501472308964e206f3652aa72ccdd6  6bc1bee22e409f96e93d7e117393172a
2   7784d2a5aac5066e710a151d5590c008  7fffffffffffffffffffffffffffffff
3   e2c75fad134f621d9223df0a9c2793d8  7fffffffffffffffffffffffffffffff
4   8b6af01acb7464cb68c4a3548aaf95a6  ffffffffffffffffffffffffffffffff
5   469c7fcb75d5d9a1b418cb997b09a185  ffffffffffffffffffffffffffffffff
6   c213b40c852738f6d446d79eb8f79b03  30c81c46a35ce411e5fbc1191a0a52ea
7   3758893b632fd6857c89f9b5aeb1427d  f69f2445df4f9b17ad2b417be66c3710

3.6 General remarks about ++AE design

Our results demonstrated that the main problem in ++AE is the cross-chaining mechanism used to provide integrity assurance, which combines addition modulo $2^n$ and XOR operations. The effect of certain changes in the input can be easily determined and subsequent inputs can be modified to compensate. This occurs because the whole integrity assurance mechanism occurs on one side of the design (either before the block cipher during encryption or after the block cipher during decryption). This allows an attacker to propose certain patterns in the plaintext message such that changes within these patterns will not be detected by the receiver.

We do not recommend the use of the ++AE structure. However, if future designs adopt the principle used in ++AE to provide integrity assurance, we advise that the following techniques be considered:

• Use a highly non-linear function with the secret key to diffuse changes in the inputs throughout the internal variables. In this case, the propagation mechanism can still be either before or after applying the block cipher function. This should prevent attackers from changing certain blocks and compensating for those changes in subsequent blocks.

• Include both input and output of the underlying block cipher in the chaining mechanism. This is a significant design change which would make the cipher similar to existing designs such as EPBC [76], IOBC [112] and IOC [113]. However, attacks exist against these designs such as [74, 114, 115].

3.7 Conclusion

In this chapter, we analysed the security of the ++AE authenticated encryption mode against forgery attacks. Firstly, the most serious flaw found in the integrity mechanism of ++AE is that it does not verify the most significant bit of any plaintext block except the last block ($M_{m+1} = ICV$). This is a fundamental flaw in the authentication mechanism, which allows a malicious sender or receiver to dispute the content of a validly sent message by claiming that the message is different in the most significant bit of any subset of the message blocks.

Secondly, 16 different groups of four consecutive plaintext blocks are determined that can be used to obtain guaranteed forgeries provided that two or more of these groups are used. Thirdly, we showed that an even larger number of groups of four consecutive plaintext blocks can be used to generate a guaranteed forgery using a single group of plaintext blocks, significantly increasing the scope of this attack. In total, 456 groups of four consecutive plaintext blocks are identified for which forgeries can be guaranteed. Furthermore, up to three distinct forgery approaches may be applied to a single plaintext message.

The proposed forgery attacks in this chapter are demonstrated for a selection of chosen plaintext messages using 128-bit AES in ++AE mode. In every case, decrypting the forged ciphertext gives the correct ICV, and so the modification would not be detected by a receiver. Our forgery attacks are chosen plaintext attacks. However, if an attacker knows a number of plaintext messages and their corresponding ciphertexts, then the attacker can perform known plaintext attacks if one of the identified 456 groups of four blocks is recognised.

In summary, serious flaws in the ++AE integrity mechanism are identified that are easily exploited in forgery attacks. These flaws are based on the properties of the cross-chaining structure used in ++AE. The flaws apply regardless of the choice of the underlying block cipher and independently of the chosen key or the message. There is no efficient and straightforward solution that can fix the weaknesses in ++AE. This mode should not be considered suitable for authenticated encryption.

Chapter 4

Tweaking generic OTR mode to avoid forgery attacks

Offset Two-Round (OTR) [8,9] is an AEAD mode proposed by Minematsu which is online, provides both encryption and integrity assurance in one pass and supports parallel operations. This mode is a generic scheme that can take any block cipher of block size n as a cryptographic primitive for encryption/decryption purposes and for message authentication. The structure of the OTR scheme is very similar to OCB2 [10], except that OCB2 uses both the block cipher encryption and decryption functions, whereas OTR uses only the block cipher encryption function. A version of OTR mode that uses AES as the underlying block cipher, called AES-OTR [117], was submitted to the CAESAR competition [6].

OTR applies different masks to the block cipher inputs as in OCB2 [10]. These masks are updated using a finite field multiplication technique called the doubling masking. The security proof of OTR requires that all input masks are distinct; however, the masks used in OTR and AES-OTR have only been proved to be distinct for a specific choice of n and a particular primitive polynomial defining the finite field.

This chapter shows, firstly, that the instantiation used in the revised OTR version [8] uses masking coefficients that are not always distinct in fields based on other primitive polynomials, including when $n \neq 128$. This work demonstrates that using the current instantiation with other primitive polynomials results in non-distinct masking values that could potentially be exploited in forgery attacks against the scheme. This problem exists with most modes that use the doubling masking technique.

Secondly, an alternative set of masking coefficients is proposed so that OTR can use the same set of coefficients for any block size n and any primitive polynomial, without affecting the security provided by this scheme. That is, this work generalises the OTR mode using the technique of doubling masking, and removes the requirement for the user to perform extensive prior calculations in order to ensure that the masks do not overlap.

Note that this work does not imply that OTR mode or AES-OTR are insecure in every case. If the primitive polynomial used to construct the finite field produces distinct masks, such as the commonly used primitive polynomial $f(x) = x^{128} + x^7 + x^2 + x + 1$, then the attack presented here will not be applicable. The solution described here may also apply to similar block cipher modes that use the doubling masking technique, such as OCB2 [10], ELmD [118] and AES-COPA [119].

This chapter is organised as follows: Section 4.1 defines the notion of tweakable block ciphers. Section 4.2 describes the OTR block cipher mode, both the generic and AES-OTR versions. Existing analyses of OTR are reviewed in Section 4.3. Section 4.4 discusses the problem of the current mask instantiation that can be exploited in forgery attacks. Sections 4.5 and 4.6 propose two alternative approaches to instantiate OTR masks to avoid forgery attacks. Section 4.7 discusses the effect of the proposed solutions on the security bounds of OTR. A conclusion of the work in this chapter is given in Section 4.8.

4.1 Tweakable block ciphers

A conventional block cipher encryption algorithm $E_K$ accepts two inputs: a secret k-bit key K and an n-bit input string M, as previously described in Chapter 2. The output is an n-bit string C, so the block cipher is represented as a map $E_K : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ where $\mathcal{K}$ is the key space. However, this kind of block cipher generates the same ciphertext message for a fixed plaintext message. This distinguishes the block cipher from a random permutation. Hence, a block cipher has to be used in an appropriate mode of operation. Various block cipher modes of operation have been suggested, either to provide confidentiality or integrity assurance, or to provide Authenticated Encryption (AE).

A tweakable block cipher (TBC) takes, in addition to K and M, a third variable called a tweak T (or a mask) to differentiate messages. Changing the tweak should be cheaper than changing the key. These ciphers can be represented as $\widehat{E}_K : \mathcal{K} \times \mathcal{T} \times \{0,1\}^n \to \{0,1\}^n$ where $\mathcal{T}$ is the tweak space. A tweakable block cipher can use the standard ECB mode without having the problem of generating identical ciphertext for fixed plaintext messages. This notion was introduced by Liskov, Rivest and Wagner [120].

Liskov, Rivest and Wagner (LRW) suggest two approaches for constructing tweakable block ciphers, and state that these constructions are provably secure, as long as the underlying block cipher is secure. These constructions can be represented as $E_K(\Delta_h \oplus E_K(M_i))$ and $E_K(M_i \oplus \Delta_h) \oplus \Delta_h$, where $\Delta_h$ is a universal hash function operating as the tweak. However, these constructions need two keys that should be independent of each other.

Further, Rogaway [10] describes two new approaches for constructing tweakable block ciphers. These are known as XOR-Encrypt (XE) and XOR-Encrypt-XOR (XEX) schemes, and use a single key for both the block encryption operation and to initialise the sequence of masking values used as the tweaks. These modes use a secret masking value in processing each block. The secret value L is initially obtained from the encryption of the nonce, and then each time a different value is needed the previous value of L is doubled. This results in a series of masking values: $2L, 2^2L, 2^3L, \ldots, 2^mL$. XE and XEX modes are represented as $\widehat{E}_K : E_K(M_i \oplus 2^{i-1}L)$ and $\widehat{E}_K : E_K(M_i \oplus 2^{i-1}L) \oplus 2^{i-1}L$, respectively. Rogaway [10] defines bounds for i such that mask collision is excluded in these designs. More details on these schemes will be discussed later in Chapter 5.

The multiplication is performed in the finite field $\mathbb{F}_{2^n}$ by multiplying two input polynomials and finding the remainder modulo a primitive polynomial. 2L means the result of multiplying the generator of the finite field $\mathbb{F}_{2^n}$ with L with respect to $\mathbb{F}_{2^n}$. Similarly, 4L can be represented as 2(2L) whereas 3L denotes $(2L \oplus L)$.

When a new value is needed which is outside the range of masking values $2L, 2^2L, \ldots, 2^mL$, a value $2^{huge}L$ is used such that huge is much greater than m. For the primitive polynomial $f(x) = x^{128} + x^7 + x^2 + x + 1$, Rogaway chooses $2^{huge}$ as 3 and shows that this is far away from the offsets $2L, 2^2L, \ldots, 2^mL$. In addition, 3L is easy to calculate as $3L = 2L \oplus L$; therefore, 3L can be XOR-ed with the checksum of plaintext blocks to obtain the authentication tag as in OCB2 [10].
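The doubling and tripling operations described above can be sketched in Python for the field defined by $f(x) = x^{128} + x^7 + x^2 + x + 1$. This is a minimal illustration, assuming that a block is held as a Python integer whose bit t is the coefficient of $x^t$; the function names are hypothetical and the byte-level conventions of any particular mode are not modelled.

```python
N = 128
MASK = (1 << N) - 1
# Low-order terms of f(x) = x^128 + x^7 + x^2 + x + 1, i.e. x^7 + x^2 + x + 1.
REDUCTION = 0x87


def double(v: int) -> int:
    """Multiply v by x (written 2v in the text) in GF(2^128) modulo f(x)."""
    v <<= 1
    if v >> N:                       # an x^128 term appeared: reduce it
        v = (v & MASK) ^ REDUCTION
    return v


def triple(v: int) -> int:
    """Compute 3v as 2v XOR v."""
    return double(v) ^ v


def doubling_masks(l_value: int, m: int) -> list:
    """Return the sequence 2L, 2^2 L, ..., 2^m L."""
    masks = []
    current = l_value
    for _ in range(m):
        current = double(current)
        masks.append(current)
    return masks


if __name__ == "__main__":
    example_l = 0x0123456789ABCDEF0123456789ABCDEF  # arbitrary example value
    print([hex(x) for x in doubling_masks(example_l, 3)])
    print(hex(triple(example_l)))
```

Each doubling costs one shift and at most one XOR, which is why a mask outside the doubling sequence that is also cheap to compute, such as 3L, is attractive.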

Note however that the value 3 might not be equivalent to $2^{huge}$ when a different primitive polynomial is used. Collisions between the masks 3L and $2^jL$ can be found in such cases and lead to simple forgery attacks. Because of this, the choice of values for $2^{huge}$ must be investigated for every choice of finite field to verify it is distinct from the series of masking values.
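The kind of collision described here can be observed directly in a toy field. The Python sketch below searches for the smallest j with $2^j = 3$, which implies $2^jL = 3L$ for every L. The 8-bit field defined by $x^8 + x^4 + x^3 + x^2 + 1$ is chosen purely for illustration and does not come from the OTR specification; a brute-force loop of this kind is only feasible for toy sizes.

```python
def double(v: int, n: int, reduction: int) -> int:
    """Multiply v by x in GF(2^n); reduction holds the low-order terms of f."""
    v <<= 1
    if v >> n:
        v = (v & ((1 << n) - 1)) ^ reduction
    return v


def first_exponent_equal_to_three(n: int, reduction: int):
    """Return the smallest j >= 1 with 2^j = 3 in the field, or None."""
    value = 1
    for j in range(1, 1 << n):
        value = double(value, n, reduction)
        if value == 3:
            return j
    return None


def doubled_j_times(l_value: int, j: int, n: int, reduction: int) -> int:
    """Compute 2^j L by applying the doubling step j times."""
    for _ in range(j):
        l_value = double(l_value, n, reduction)
    return l_value


if __name__ == "__main__":
    # Toy field (assumption): x^8 + x^4 + x^3 + x^2 + 1, low terms 0x1D.
    j = first_exponent_equal_to_three(8, 0x1D)
    print("2^j = 3 for j =", j)
```

In this toy field the search returns j = 25, so the mask 3L coincides with $2^{25}L$ for every L. A message long enough to reach that doubling step would therefore reuse a mask that was meant to lie far outside the sequence.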

4.2 OTR description

Offset Two-Round (OTR) is an authenticated encryption block cipher mode that is online, one-pass and each segment of two consecutive blocks can be processed in parallel. OTR mode has a similar structure to the seminal OCB mode [10, 28], but OTR uses only the forward function of the block cipher for both encryption and decryption algorithms. The scheme requires a minimal number of block cipher calls: for an s-bit plaintext message, OTR uses only $(\lceil s/n \rceil + 2)$ block cipher calls. OTR is a generic mode that can accept any block cipher size $n \geq 1$. A submission of OTR to CAESAR, called AES-OTR, uses AES as the underlying block cipher and processes associated data in two different forms. Sections 4.2.1 and 4.2.2 highlight the difference between generic OTR and the special version AES-OTR.

4.2.1 Generic OTR mode

Generic OTR has two versions: the first proposed version that was published in EUROCRYPT in 2014 [9] and the revised version with updated masking coefficients [8]. This chapter analyses the revised version [8], and refers to it as OTR throughout the rest of the chapter. The OTR algorithm accepts the following four inputs:

• Key $K \in \{0,1\}^{|K|}$

• Nonce $N \in \{0,1\}^j$ for $1 \leq j \leq n-1$

• Associated Data $A \in \{0,1\}^*$

• Plaintext $M \in \{0,1\}^*$

and has the following two outputs:

• Ciphertext $C \in \{0,1\}^*$ such that $|C| = |M|$

• Tag $T \in \{0,1\}^\tau$ for $1 \leq \tau \leq n$

Although OTR is a general scheme and can be used with different block sizes, Minematsu [8] assumes that n = 128 and the multiplication of masks is performed in the finite field $\mathbb{F}_{2^{128}}$, defined by the commonly used primitive polynomial $f(x) = x^{128} + x^7 + x^2 + x + 1$. Note that this instantiation is specific to a single block size, which prevents the scheme from being truly generic. The author warns users that “care must be taken since masking coefficients are specific to the choice of n and the polynomial defining the field” [8].

The OTR encryption algorithm consists of two algorithms known as cores: an encryption core $EF_E$ and an authentication core $AF_E$. Similarly, the OTR decryption algorithm uses a decryption core $DF_E$ which performs the same steps as the encryption core $EF_E$, but applies the inverse operations. These three cores, $EF_E$, $DF_E$ and $AF_E$, are described in Table 4.1 and the OTR encryption algorithm is illustrated in Figure 4.1.

The encryption core $\mathrm{EF}_E$ divides a plaintext message $M$ into chunks, each containing two plaintext blocks, $M_{2i-1}$ and $M_{2i}$. The number of chunks is computed as $\ell = \lceil m/2 \rceil$ (line 3 of $\mathrm{EF}_E$ in Table 4.1). Each chunk is $2n$ bits in length and is encrypted using a two-round Feistel permutation, which is why OTR is referred to as an Offset Two-Round scheme. This Feistel structure enables the scheme to use only the forward function of the block cipher. Each two-round Feistel chunk in OTR uses two instantiations of a tweakable block cipher $\widetilde{E}_K$. OTR employs the XE construction, which uses secret masks to build a tweakable block cipher from a block cipher. Each pair of plaintext blocks, $M_{2i-1}$ and $M_{2i}$, is XOR-ed with two different masks and encrypted to obtain the corresponding two ciphertext blocks as follows:

$$
\begin{aligned}
C_{2i-1} &= E_K(2^{i-1}L \oplus M_{2i-1}) \oplus M_{2i} \\
C_{2i} &= E_K(2^{i-1}3L \oplus C_{2i-1}) \oplus M_{2i-1}
\end{aligned}
$$

Note that these two blocks use two different masks: $2^{i-1}L$ and $2^{i-1}3L$. The secret value $L$ is initially obtained as $L = E_K(N)$. These two masks are doubled to obtain the two masks for the next chunk, and so on. Special consideration is needed for the last chunk. If $m$ is even, then a variant of the two-round Feistel permutation (lines 8 to 11 of $\mathrm{EF}_E$ in Table 4.1) is used. Otherwise, a variant of CTR mode is used (lines 13 and 14 of $\mathrm{EF}_E$ in Table 4.1).

Figure 4.1: OTR encryption operation with parallel associated data [8]. (Diagram not reproduced; it shows the cores $\mathrm{EF}_E$ and $\mathrm{AF}_E$ for the cases where $m$ is even and where $m$ is odd.)
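The two-round Feistel chunk described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function `E` is a hypothetical stand-in for $E_K$ built from a truncated keyed hash (it is not AES and is not part of OTR), and the helper names `dbl`, `triple`, `encrypt_chunk` and `decrypt_chunk` are ours. The point it illustrates is that decryption of a chunk calls only the forward function.

```python
import hashlib

# Assumption: n = 128 and f(x) = x^128 + x^7 + x^2 + x + 1, as in [8].
TOP = 1 << 128
POLY_LOW = 0x87  # x^7 + x^2 + x + 1

def dbl(v: int) -> int:
    """Multiply a field element by 2 (i.e. by x) in GF(2^128)."""
    v <<= 1
    return (v ^ TOP ^ POLY_LOW) if v & TOP else v

def triple(v: int) -> int:
    """Multiply by 3 = x + 1."""
    return dbl(v) ^ v

def E(key: bytes, x: int) -> int:
    """Hypothetical stand-in for E_K: a keyed hash truncated to 128 bits.
    Only the forward direction exists, which is all OTR needs."""
    digest = hashlib.sha256(key + x.to_bytes(16, "big")).digest()
    return int.from_bytes(digest[:16], "big")

def encrypt_chunk(key: bytes, mask: int, m_odd: int, m_even: int):
    """mask is 2^(i-1) L; returns (C_{2i-1}, C_{2i})."""
    c_odd = E(key, mask ^ m_odd) ^ m_even
    c_even = E(key, triple(mask) ^ c_odd) ^ m_odd
    return c_odd, c_even

def decrypt_chunk(key: bytes, mask: int, c_odd: int, c_even: int):
    """Inverts encrypt_chunk using only the forward function E."""
    m_odd = E(key, triple(mask) ^ c_odd) ^ c_even
    m_even = E(key, mask ^ m_odd) ^ c_odd
    return m_odd, m_even
```

For example, with `L = E(key, nonce)` the first chunk uses `mask = L`, the second uses `dbl(L)`, and so on; `decrypt_chunk` applied to the output of `encrypt_chunk` with the same mask returns the original pair of blocks.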

The authentication core $\mathrm{AF}_E$ uses a variant of the PMAC1 scheme [10] to authenticate associated data in parallel. First, a constant-based secret value $Q$ is computed. This mask is XOR-ed with the first associated data block. The mask is then multiplied by 2 before being XOR-ed with the next block, and so on. Different masks are used for the last associated data block $A_d$, depending on whether this block is full or partial. The associated data tag $TA$ is obtained by computing the checksum of all block cipher outputs and applying a final block cipher call. If the associated data are empty, then $TA = 0^n$.

The authentication tag $T$ in OTR is generated by computing the checksum of the right-hand blocks $M_{2i}$ in each two-round Feistel segment. That is, the checksum is computed over the even blocks only. Then, a dedicated mask (depending on the form of the last chunk) is XOR-ed with the checksum of plaintext blocks and the result is encrypted to obtain $TE$. The final tag $T$ is obtained by XOR-ing $TE$ with the tag $TA$ for the associated data. That is, $T = TE \oplus TA$.

Table 4.1: OTR algorithm in parallel mode [8].

Algorithm 1: OTR Encryption Core $\mathrm{EF}_E$
 1. $\Sigma \leftarrow 0^n$
 2. $L \leftarrow E_K(N)$
 3. $(M_1, \ldots, M_m) \xleftarrow{n} M$, $\ell = \lceil m/2 \rceil$
 4. for $i = 1$ to $\ell - 1$ do
 5.    $C_{2i-1} \leftarrow E_K(2^{i-1}L \oplus M_{2i-1}) \oplus M_{2i}$
 6.    $C_{2i} \leftarrow E_K(2^{i-1}3L \oplus C_{2i-1}) \oplus M_{2i-1}$
 7.    $\Sigma \leftarrow \Sigma \oplus M_{2i}$ od
 8. if $m$ is even then
 9.    $Z \leftarrow E_K(2^{\ell-1}L \oplus M_{m-1})$
10.    $C_m \leftarrow \mathrm{msb}_{|M_m|}(Z) \oplus M_m$
11.    $C_{m-1} \leftarrow E_K(2^{\ell-1}3L \oplus C_m) \oplus M_{m-1}$
12.    $\Sigma \leftarrow \Sigma \oplus Z \oplus C_m$ fi
13. if $m$ is odd then
14.    $C_m \leftarrow \mathrm{msb}_{|M_m|}(E_K(2^{\ell-1}L)) \oplus M_m$
15.    $\Sigma \leftarrow \Sigma \oplus M_m$ fi
16. if $m$ is even and $|M_m| \neq n$ then
17.    $TE \leftarrow E_K(2^{\ell-1}3^3L \oplus \Sigma)$ fi
18. if $m$ is even and $|M_m| = n$ then
19.    $TE \leftarrow E_K(7 \cdot 3 \cdot 2^{\ell-1}L \oplus \Sigma)$ fi
20. if $m$ is odd and $|M_m| \neq n$ then
21.    $TE \leftarrow E_K(2^{\ell-1}3^2L \oplus \Sigma)$ fi
22. if $m$ is odd and $|M_m| = n$ then
23.    $TE \leftarrow E_K(7 \cdot 2^{\ell-1}L \oplus \Sigma)$ fi
24. $C \leftarrow (C_1, \ldots, C_m)$
25. return $(C, TE)$

Algorithm 2: OTR Decryption Core $\mathrm{DF}_E$
 1. $\Sigma \leftarrow 0^n$
 2. $L \leftarrow E_K(N)$
 3. $(C_1, \ldots, C_m) \xleftarrow{n} C$, $\ell = \lceil m/2 \rceil$
 4. for $i = 1$ to $\ell - 1$ do
 5.    $M_{2i-1} \leftarrow E_K(2^{i-1}3L \oplus C_{2i-1}) \oplus C_{2i}$
 6.    $M_{2i} \leftarrow E_K(2^{i-1}L \oplus M_{2i-1}) \oplus C_{2i-1}$
 7.    $\Sigma \leftarrow \Sigma \oplus M_{2i}$ od
 8. if $m$ is even then
 9.    $M_{m-1} \leftarrow E_K(2^{\ell-1}3L \oplus C_m) \oplus C_{m-1}$
10.    $Z \leftarrow E_K(2^{\ell-1}L \oplus M_{m-1})$
11.    $M_m \leftarrow \mathrm{msb}_{|M_m|}(Z) \oplus C_m$
12.    $\Sigma \leftarrow \Sigma \oplus Z \oplus C_m$ fi
13. if $m$ is odd then
14.    $M_m \leftarrow \mathrm{msb}_{|M_m|}(E_K(2^{\ell-1}L)) \oplus C_m$
15.    $\Sigma \leftarrow \Sigma \oplus M_m$ fi
16. to 23. $TE$ is computed exactly as in lines 16 to 23 of Algorithm 1
24. $M \leftarrow (M_1, \ldots, M_m)$
25. return $(M, TE)$

Algorithm 3: OTR Authentication with Parallel Associated Data $\mathrm{AF}_E$
 1. $\Xi \leftarrow 0^n$
 2. $Q \leftarrow E_K(0^n)$
 3. $(A_1, \ldots, A_d) \xleftarrow{n} A$
 4. for $i = 1$ to $d - 1$ do
 5.    $\Xi \leftarrow \Xi \oplus E_K(Q \oplus A_i)$
 6.    $Q \leftarrow 2Q$ od
 7. $\Xi \leftarrow \Xi \oplus A_d$
 8. if $|A_d| \neq n$ then $TA \leftarrow E_K(3Q \oplus \Xi)$
 9. else $TA \leftarrow E_K(3^2Q \oplus \Xi)$ fi
10. return $TA$

This structure gives OTR certain properties. OTR can process message blocks in parallel in units of two-block chunks, so it is not fully parallel. The online feature is also not fully achieved, as OTR needs to buffer two consecutive blocks. With respect to these two features, OCB is more efficient as it is fully parallel and online. Furthermore, OTR is a nonce-respecting scheme, and the scheme does not claim any security when the nonce is repeated. On the other hand, OTR does not require the inverse function $D_K(\cdot)$ of the block cipher, while OCB does. This improves the scheme in two respects. First, implementing both functions of the block cipher increases the footprint of a hardware design or the memory required for a software implementation [121]. Second, the inverse function of certain block ciphers is sometimes slower than the forward function. For example, the AES implementation [122] has an inverse function that is 45% slower than the forward function. Hence, using only the forward function for both the encryption and decryption algorithms has performance benefits and is desirable in practice.

For a message of $m$ plaintext blocks and $d$ associated data blocks, OTR has the features shown in Table 4.2, compared to some well-known AE schemes. The (†) mark indicates that the scheme uses a multiplication function in the Galois field $GF(2^n)$ that is called $m$ times, while the block cipher forward function $E_K(\cdot)$ is called the other $m$ times. The (✓*) mark indicates that OTR is online and parallel for $(\lceil m/2 \rceil + \lceil d/2 \rceil)$ blocks only.

Table 4.2: Comparison of OTR with certain AE schemes [8].

Mode         Number of calls    Online   Parallelisable   Block cipher functions
OCB2 [10]    m + d + 2          ✓        ✓                E_K, D_K
GCM [3]      2m† + d + 1        ✓        ✓                E_K, Mul†
CCM [2]      2m + d + 2         ×        ×                E_K
EAX [32]     2m + d + 1         ✓        ×                E_K
OTR [8,9]    m + d + 2          ✓*       ✓*               E_K

4.2.2 AES-OTR mode

A version of OTR mode that uses AES (i.e. $n = 128$) as the underlying block cipher was submitted to CAESAR [6]. This dedicated version, called AES-OTR [117], preserves the main features of the generic scheme. AES-OTR has been updated several times during the three rounds of CAESAR. At the time of writing, the current version of AES-OTR is v3.1. AES-OTR has the following specifications:

• Key $K \in \{0,1\}^{|K|}$ where $|K| \in \{128, 192, 256\}$.

• Nonce $N \in \{0,1\}^{j}$ for $8 \leq j \leq 120$.

• Associated Data $A \in \{0,1\}^{*}$ such that $|A|_8 \leq 2^{64}$.

• Plaintext $M \in \{0,1\}^{*}$ such that $|M|_8 \leq 2^{64}$.

• Ciphertext $C \in \{0,1\}^{*}$ such that $|C| = |M|$.

• Tag $T \in \{0,1\}^{\tau}$ for $32 \leq \tau \leq 128$.

In addition to the $\mathrm{AF}_E$ core that processes associated data blocks in parallel, AES-OTR proposes another approach that handles the associated data blocks serially. That is, AES-OTR operates in two forms: parallel and serial. The parallel mode of AES-OTR uses exactly the same cores that are illustrated in Table 4.1 and Figure 4.1. The only exception is that the nonce is not encrypted directly to obtain the secret value $L$ as in the $\mathrm{EF}_E$ and $\mathrm{DF}_E$ cores. Instead, the nonce $N$ and the tag length $\tau$ are first encoded using the format function used in OCB3 [30]. This function $\mathrm{Format}(\tau, N)$ returns an $n$-bit string as follows:

$$
\mathrm{Format}(\tau, N) = \mathrm{n2s}(\tau \bmod n, 7) \,\|\, 0^{\,n-8-|N|} \,\|\, 1 \,\|\, N
$$

where $\mathrm{n2s}(x, k)$ converts a number $x$ into a binary string of $k$ bits. This encoded string is then encrypted under $E_K$ to obtain the secret value $L$.

AES-OTR uses a variant of the OMAC mode [56] to authenticate associated data serially. To differentiate the AES-OTR cores when the scheme operates in serial mode, EF-SE, DF-SE and AF-SE denote the encryption, decryption and associated data cores, respectively. The OTR operation in serial mode is illustrated in Table 4.3 and Figure 4.2. Most of the design components are identical to those of the parallel mode, with two exceptions. First, the tag $TA$ of the associated data is involved at the beginning of EF-SE and DF-SE to generate $L$ (line 2 of EF-SE and DF-SE in Table 4.3). Second, the authentication core AF-SE processes associated data blocks serially (lines 1 to 9 of AF-SE in Table 4.3).
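As a rough illustration of the Format encoding given in this section, the following Python sketch builds the $n$-bit string from a tag length and a nonce. Bit strings are represented as Python strings of '0' and '1' purely for readability; this representation, the function names and the argument checks are our own assumptions and are not taken from the AES-OTR specification.

```python
def n2s(x: int, width: int) -> str:
    """Binary representation of x, zero-padded on the left to `width` bits."""
    return format(x, "0{}b".format(width))

def format_block(tau: int, nonce_bits: str, n: int = 128) -> str:
    """Sketch of Format(tau, N): 7 bits of (tau mod n), zero padding,
    a single 1 bit, then the nonce itself."""
    if not 1 <= len(nonce_bits) <= n - 8:
        raise ValueError("nonce must be between 1 and n - 8 bits long")
    if set(nonce_bits) - {"0", "1"}:
        raise ValueError("nonce must be a string of 0s and 1s")
    zeros = "0" * (n - 8 - len(nonce_bits))
    return n2s(tau % n, 7) + zeros + "1" + nonce_bits
```

The four parts have lengths $7$, $n - 8 - |N|$, $1$ and $|N|$, so the result is always exactly $n$ bits long. Note that a full-length tag ($\tau = n$) is encoded as seven zero bits because of the reduction modulo $n$.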

4.3 Existing analysis of OTR

This section reviews two analyses that highlight certain weaknesses in the OTR structure which can be exploited in forgery attacks. First, Huang and Wu [123] demonstrated a forgery attack against the integrity assurance of OTR with a probability of success of $2^{-11}$. Second, Bost and Sanders [124] presented attacks against OTR that rely on a collision between input masks.

Figure 4.2: OTR encryption operation with serial associated data [8]. (Diagram not reproduced; it shows the cores EF-SE and AF-SE for the cases where $m$ is even and where $m$ is odd.)

OTR can take messages of length up to $2^{n/2}$ bytes. For example, the maximum length of a message in 128-bit AES-OTR is $2^{64}$ bytes (i.e. $2^{60}$ blocks). For a randomly chosen plaintext message of $2^{60}$ blocks, there are $2^{59}$ even blocks. Amongst these even blocks, Huang and Wu [123] showed that there are about $2^{2 \times 59 - 1} = 2^{117}$ possible collisions between two block cipher outputs corresponding to any two of these even blocks. That is, AES-OTR has only $(128 - 117 = 11)$ bits of security rather than 128 bits. Although this attack is correct, it does not invalidate the security proof of OTR: the success probability of the attack is lower than the authenticity bound of AES-OTR, which is $2^{-10}$ for a data complexity of $2^{59}$ blocks [123].

Table 4.3: OTR algorithm in serial mode [8].

Algorithm 4: OTR Encryption Core EF-SE
 1. $\Sigma \leftarrow 0^n$
 2. $L \leftarrow 2(E_K(\mathrm{Format}(\tau, N)) \oplus TA)$
 3. to 23. identical to lines 3 to 23 of Algorithm 1 (Table 4.1)
24. $C \leftarrow (C_1, \ldots, C_m)$
25. $T \leftarrow \mathrm{msb}_{\tau}(TE)$
26. return $(C, T)$

Algorithm 5: OTR Decryption Core DF-SE
 1. $\Sigma \leftarrow 0^n$
 2. $L \leftarrow 2(E_K(\mathrm{Format}(\tau, N)) \oplus TA)$
 3. to 23. identical to lines 3 to 23 of Algorithm 2 (Table 4.1)
24. $M \leftarrow (M_1, \ldots, M_m)$
25. $T \leftarrow \mathrm{msb}_{\tau}(TE)$
26. return $(M, T)$

Algorithm 6: OTR Authentication with Serial Associated Data AF-SE
 1. $\Xi \leftarrow 0^n$
 2. $Q \leftarrow E_K(0^n)$
 3. $(A_1, \ldots, A_d) \xleftarrow{n} A$
 4. for $i = 1$ to $d - 1$ do
 5.    $\Xi \leftarrow E_K(\Xi \oplus A_i)$ od
 6. $\Xi \leftarrow \Xi \oplus A_d$
 7. if $|A_d| \neq n$ then $TA \leftarrow E_K(2Q \oplus \Xi)$
 8. else $TA \leftarrow E_K(4Q \oplus \Xi)$ fi
 9. return $TA$

Prior to the analysis of Bost and Sanders [124], OTR used different instantiation values for the masking coefficients. The two masks used for a chunk $(M_{2i-1}, M_{2i})$ in the two-round Feistel permutation of OTR were as follows:

$$
\begin{aligned}
C_{2i-1} &= E_K(2^{i-1}L \oplus M_{2i-1}) \oplus M_{2i} \\
C_{2i} &= E_K(2^{i-1}L \oplus \delta \oplus C_{2i-1}) \oplus M_{2i-1}
\end{aligned}
$$

where $\delta = E_K(N)$ and $L = 4\delta$. A proof that OTR is secure for the instantiation with $n = 128$ and the primitive polynomial $f(x) = x^{128} + x^7 + x^2 + x + 1$ is given in [9].

Bost and Sanders [124] showed that trivial collisions between the OTR input masks can be found when special forms of primitive polynomial are used. The two masks used can be rewritten as $2^{i+1}\delta$ and $(2^{i+1} + 1)\delta$. For certain choices of primitive polynomial, it is possible that $2^i = 2^j + 1$ for $i, j \leq 2^{n/2}$ in $\mathbb{F}_{2^n}$. That is, collisions are not formally excluded when this definition of mask instantiation is used. These collisions can be exploited in practical forgery attacks. They showed that for a large range of block cipher sizes $n \leq 10000$, certain choices of primitive polynomial lead to mask collisions. They considered the two widely used block sizes, $n = 64$ and $n = 128$. When the primitive polynomial $f(x) = x^{64} + x^4 + x^3 + x + 1$ is used to construct $\mathbb{F}_{2^{64}}$, collisions can easily be found in the scheme. Similarly, some specific primitive pentanomials lead to mask collisions in $\mathbb{F}_{2^{128}}$.

Bost and Sanders [124] suggested the use of different masking coefficients chosen from the set given by Rogaway in [10]. Minematsu followed their suggestion and updated the OTR instantiation coefficients accordingly [8]. The suggested mask coefficients are currently used in both OTR and AES-OTR. Minematsu also noted that care must be taken in specifying the masking coefficients for other choices of $n$ and of the primitive polynomial defining $\mathbb{F}_{2^n}$. However, determining suitability is left to the user.

4.4 New analysis of OTR

Although the new masking technique suggested by Bost and Sanders [124] helps to formally exclude collisions when $n = 128$ and $n = 64$, this solution considers very specific choices of finite field based on certain commonly used primitive polynomials. This solution is not sufficient to make OTR secure as a generic block cipher mode. As a general scheme, OTR is designed to work with any block size $n$ and any finite field $\mathbb{F}_{2^n}$. However, to obtain security assurance for a different finite field, the user has to prove that the chosen masks for that instantiation of OTR are distinct and do not overlap. This requires discrete log computations. Using discrete log computation, Rogaway proves in [10] that certain sets of masks are distinct from each other and provide a unique representation.

For generic instantiations of OTR using block sizes and primitive polynomials other than those already examined, there is a risk that a user may not select suitable masking coefficients for the instantiation. This open problem motivates us to seek a more robust definition of OTR in which the masking coefficients are distinct for any choice of finite field. Note that this is applicable to OTR and also to any design which uses the doubling masking technique.

In this section, a similar approach to Bost and Sanders' work [124] is taken. Two special forms of primitive polynomial, different from those discussed in [124], are considered, each of which leads to a collision for the currently used masking coefficients.

Case 1: Primitive polynomial of the form $f(x) = x^n + x + 1$.
Many primitive polynomials can be found in this trinomial form [125]. In this case, $x^n$ is equal to $x + 1$. That is, $3 = 2^n$ in the field, so the discrete logarithm of 3 to the base 2 is only $n$ and is not huge as the current instantiation assumes. Therefore, a collision can be found between $3L$ and $2^nL$ as long as $\ell > n$.

Case 2: Primitive polynomial of the form $f(x) = x^n + x^2 + 1$.
This is another form of trinomial, and in this case $x^n$ is equal to $x^2 + 1$. That is, 5 (which is equal to $3^2$ in $\mathbb{F}_{2^n}$) is equal to $2^n$. In OTR, $3^2$ is used when the last block $M_m$ is not a full block ($|M_m| \neq n$), as shown in Figures 4.1 and 4.2.
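The two identities behind Case 1 and Case 2 can be checked numerically with a short Python sketch. The helper names (`make_dbl`, `times_pow2`) and the concrete parameters ($n = 127$ for Case 1, $n = 11$ for Case 2, and the sample values of $L$) are assumptions chosen for illustration only; the identities themselves depend only on the form of $f(x)$.

```python
def make_dbl(n: int, low: int):
    """Return the doubling map v -> 2v in GF(2)[x] / (x^n + low), where
    `low` encodes the low-order terms of f(x) as a bit mask."""
    mask = (1 << n) - 1
    def dbl(v: int) -> int:
        v <<= 1
        return (v & mask) ^ low if v >> n else v
    return dbl

def times_pow2(dbl, v: int, k: int) -> int:
    """Compute 2^k * v by k successive doublings."""
    for _ in range(k):
        v = dbl(v)
    return v

# Case 1: f(x) = x^n + x + 1, so 2^n L should equal 3L = 2L xor L.
dbl_case1 = make_dbl(127, 0b11)
L1 = 0x0123456789ABCDEF0123456789ABCDEF        # arbitrary sample value
case1_collides = times_pow2(dbl_case1, L1, 127) == dbl_case1(L1) ^ L1

# Case 2: f(x) = x^n + x^2 + 1, so 2^n L should equal 3^2 L = 4L xor L.
dbl_case2 = make_dbl(11, 0b101)
L2 = 0x5A3                                      # arbitrary sample value
case2_collides = times_pow2(dbl_case2, L2, 11) == dbl_case2(dbl_case2(L2)) ^ L2
```

Both flags evaluate to true for any value of $L$, since $x^n \equiv x + 1$ and $x^n \equiv x^2 + 1 = (x+1)^2$ hold modulo the respective polynomials.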

4.4.1 Proposed attacks

This section shows how collisions between the masks can be exploited to breach the integrity assurance of OTR. The two cases discussed at the beginning of Section 4.4 are considered separately. Our analysis assumes a man-in-the-middle attack model where the attacker is able to intercept and alter messages before sending them on to the intended recipient. It is assumed that the attacker knows both the plaintext message and its corresponding ciphertext, obtained with a single query to the OTR oracle, so this is a known-plaintext scenario. Suppose that a plaintext message $M$ is as follows:

$$M = (M_1, M_2, \ldots, M_m)$$

such that the number of blocks $m = |M|_n$ is odd and the number of two-block chunks satisfies $\ell > n + 1$, where $\ell = \lceil m/2 \rceil$. This message is encrypted using OTR and results in the ciphertext $C$:

$$C = (C_1, C_2, \ldots, C_m).$$

4.4.1.1 Case 1 collisions

In this case, as noted in Section 4.4, $3L = 2^nL$. Suppose that the last block is a full block ($|M_m| = n$). A forged ciphertext message $C^*$ can be constructed as follows:

$$
\begin{aligned}
C^*_1 &= M_{2(n+1)-1} \\
C^*_2 &= M_1 \oplus M_{2(n+1)} \oplus C_{2(n+1)-1} \\
      &= M_1 \oplus E_K(M_{2(n+1)-1} \oplus 2^nL) \\
C^*_i &= C_i, \quad 3 \leq i \leq m - 1 \\
C^*_m &= C_m \oplus C_1 \oplus M_{2(n+1)-1}
\end{aligned}
$$

That is, three blocks, $C^*_1$, $C^*_2$ and $C^*_m$, are altered while the remaining blocks are unchanged. Decrypting $C^*$ gives the same value for all plaintext blocks except for $M^*_2$ and $M^*_m$, as follows:

$$
\begin{aligned}
M^*_1 &= E_K(C^*_1 \oplus 3L) \oplus C^*_2 \\
      &= E_K(M_{2(n+1)-1} \oplus 3L) \oplus M_1 \oplus E_K(M_{2(n+1)-1} \oplus 2^nL) \\
      &= M_1 \\
M^*_2 &= E_K(M^*_1 \oplus L) \oplus C^*_1 = E_K(M_1 \oplus L) \oplus M_{2(n+1)-1} \\
M^*_m &= E_K(2^{\ell-1}L) \oplus C^*_m \\
      &= E_K(2^{\ell-1}L) \oplus C_m \oplus C_1 \oplus M_{2(n+1)-1} \\
      &= M_m \oplus C_1 \oplus M_{2(n+1)-1}
\end{aligned}
$$

Let $\Sigma'$ be the checksum of all even plaintext blocks of message $M$ except $M_2$ and the last block $M_m$. That is,

$$\Sigma' = \Sigma \oplus M_2 \oplus M_m$$

Therefore, the checksum of plaintext blocks for the forged message is:

$$
\begin{aligned}
\Sigma^* &= \Sigma' \oplus M^*_2 \oplus M^*_m \\
         &= \Sigma' \oplus E_K(M_1 \oplus L) \oplus M_{2(n+1)-1} \oplus M_m \oplus C_1 \oplus M_{2(n+1)-1} \\
         &= \Sigma' \oplus E_K(M_1 \oplus L) \oplus C_1 \oplus M_m \\
         &= \Sigma' \oplus M_2 \oplus M_m \\
         &= \Sigma
\end{aligned}
$$

Both $C$ and the forged message $C^*$ produce the same checksum value. Thus, $C^*$ will produce the same tag $T$ as $C$, and will be accepted as genuine.
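The Case 1 forgery can be reproduced end to end on a toy instance. The Python sketch below makes several assumptions that are not part of OTR: a toy block size $n = 15$ with $f(x) = x^{15} + x + 1$, a truncated keyed hash `E` standing in for $E_K$ (only the forward function is needed), empty associated data, and a simplified `process` routine that handles only messages with an odd number of full blocks. All function and variable names are ours.

```python
import hashlib

N_BITS = 15                    # toy block size n, f(x) = x^15 + x + 1
LOW = 0b11                     # low-order terms x + 1
MASK = (1 << N_BITS) - 1

def dbl(v):
    """Multiply by 2 in the toy field."""
    v <<= 1
    return (v & MASK) ^ LOW if v >> N_BITS else v

def mul3(v):
    return dbl(v) ^ v

def mul7(v):
    return dbl(dbl(v)) ^ dbl(v) ^ v

def E(key, x):
    """Hypothetical stand-in for E_K: keyed hash truncated to n bits."""
    d = hashlib.sha256(key + x.to_bytes(4, "big")).digest()
    return int.from_bytes(d, "big") & MASK

def process(key, nonce, blocks, decrypt):
    """Simplified EF_E / DF_E for odd m with a full last block.
    Returns (output blocks, TE)."""
    assert len(blocks) % 2 == 1
    ell = (len(blocks) + 1) // 2
    mask, sigma, out = E(key, nonce), 0, []
    for i in range(ell - 1):
        a, b = blocks[2 * i], blocks[2 * i + 1]
        if decrypt:
            first = E(key, mul3(mask) ^ a) ^ b      # M_{2i-1}
            second = E(key, mask ^ first) ^ a       # M_{2i}
            sigma ^= second
        else:
            first = E(key, mask ^ a) ^ b            # C_{2i-1}
            second = E(key, mul3(mask) ^ first) ^ a # C_{2i}
            sigma ^= b
        out += [first, second]
        mask = dbl(mask)
    last = E(key, mask) ^ blocks[-1]                # CTR-like last block
    out.append(last)
    sigma ^= last if decrypt else blocks[-1]
    return out, E(key, mul7(mask) ^ sigma)          # odd m, full block

def forge(M, C):
    """Build C* from one known (M, C) pair as in Section 4.4.1.1."""
    k = 2 * (N_BITS + 1) - 1                        # 1-based index 2(n+1)-1
    forged = list(C)
    forged[0] = M[k - 1]
    forged[1] = M[0] ^ M[k] ^ C[k - 1]
    forged[-1] = C[-1] ^ C[0] ^ M[k - 1]
    return forged

key, nonce = b"toy key", 1
ell = N_BITS + 2                                    # ell > n + 1
M = [(37 * i + 11) & MASK for i in range(2 * ell - 1)]
C, T = process(key, nonce, M, decrypt=False)
C_forged = forge(M, C)
M_forged, T_forged = process(key, nonce, C_forged, decrypt=True)
```

With these toy parameters, the tag recomputed by the receiver for the forged ciphertext (`T_forged`) matches the original tag `T`, even though the ciphertext and the recovered plaintext differ from the originals in the blocks identified above. Replacing `LOW` with the low-order terms of a polynomial that does not satisfy $2^n = 3$ should make the check fail, since the mask collision no longer occurs.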

4.4.1.2 Case 2 collisions

For this case, suppose that the message $M$ also has the following features: $|M_m| \neq n$, $M_1 = 0^n$ and $M_{2(n+1)-1} = (\{1,0\}^j \,\|\, 10^*)$ for $1 \leq j \leq n-1$. If $M$ is encrypted using OTR, the pair $(C, T)$ is obtained such that $T = TE \oplus TA$. Since $M_1 = 0^n$, an attacker can calculate $E_K(L)$ as follows:

$$
\begin{aligned}
C_1 &\leftarrow E_K(L \oplus 0^n) \oplus M_2 \\
E_K(L) &\leftarrow C_1 \oplus M_2
\end{aligned}
$$

The attacker can construct a forged pair $(C^*, T^*)$ by changing both $C$ and $T$ in the original pair such that $C^*$ has only one ciphertext block, as follows:

$$
\begin{aligned}
C^* &= \mathrm{msb}_j(M_{2(n+1)-1} \oplus E_K(L)) \\
TE^* &= C_{2(n+1)-1} \oplus M_{2(n+1)} \\
T^* &= TE^* \oplus TA
\end{aligned}
$$

Decrypting the forged pair $(C^*, T^*)$ gives:

$$
\begin{aligned}
M^*_1 &= \mathrm{msb}_j(E_K(L)) \oplus C^*_1 \\
      &= \mathrm{msb}_j(E_K(L)) \oplus \mathrm{msb}_j(M_{2(n+1)-1} \oplus E_K(L)) \\
      &= \mathrm{msb}_j(M_{2(n+1)-1})
\end{aligned}
$$

The encryption tag $TE'$ and the final tag $T'$ calculated by the receiver for the received ciphertext are as follows, where the checksum equals $M_{2(n+1)-1}$ because that block was chosen to have the form of a $j$-bit string followed by the $10^*$ padding:

$$
\begin{aligned}
\Sigma^* &= M^*_1 = M_{2(n+1)-1} \\
TE' &= E_K(\Sigma^* \oplus 3^2L) \\
    &= E_K(M_{2(n+1)-1} \oplus 3^2L) \\
    &= E_K(M_{2(n+1)-1} \oplus 2^nL) \\
    &= C_{2(n+1)-1} \oplus M_{2(n+1)} \\
    &= TE^* \\
T' &= TE^* \oplus TA = T^*
\end{aligned}
$$

Thus, the forged pair $(C^*, T^*)$ will be accepted and $M^*$ will be regarded as a valid message. This clearly demonstrates that the integrity assurance mechanism is flawed.
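The Case 2 forgery can be checked in the same spirit. The sketch below computes only the quantities that appear in the derivation above rather than implementing the whole mode. The assumptions are a toy block size $n = 11$ with $f(x) = x^{11} + x^2 + 1$, a truncated keyed hash `E` as a stand-in for $E_K$, empty associated data (so $TA = 0^n$), $10^*$ padding of the partial block before it enters the checksum, and sample block values chosen arbitrarily; the names are ours.

```python
import hashlib

N = 11                         # toy block size n, f(x) = x^11 + x^2 + 1
LOW = 0b101                    # low-order terms x^2 + 1
MASK = (1 << N) - 1

def dbl(v):
    v <<= 1
    return (v & MASK) ^ LOW if v >> N else v

def mul3(v):
    return dbl(v) ^ v

def E(key, x):
    """Hypothetical stand-in for E_K: keyed hash truncated to n bits."""
    d = hashlib.sha256(key + x.to_bytes(2, "big")).digest()
    return int.from_bytes(d, "big") & MASK

def msb(v, j):
    """The j most significant bits of an n-bit block."""
    return v >> (N - j)

def pad(v, j):
    """Pad a j-bit string with 10* to a full n-bit block."""
    return (v << (N - j)) | (1 << (N - j - 1))

# Sender side: only the blocks used by the attack are computed.
key, TA, j = b"toy key", 0, 4
L = E(key, 5)                              # L = E_K(N) for nonce 5
M1, M2 = 0, 0x2A5                          # M_1 = 0^n
Mk, Mk1 = pad(0b1011, j), 0x3C1            # M_{2(n+1)-1}, M_{2(n+1)}
C1 = E(key, L ^ M1) ^ M2
mask_k = L
for _ in range(N):                         # mask of chunk n+1 is 2^n L
    mask_k = dbl(mask_k)
Ck = E(key, mask_k ^ Mk) ^ Mk1             # C_{2(n+1)-1}

# Attacker: uses known plaintext and ciphertext only.
ek_l = C1 ^ M2                             # recovers E_K(L)
c_star = msb(Mk ^ ek_l, j)
t_star = (Ck ^ Mk1) ^ TA

# Receiver: decrypts the one-block forged message (m = 1, partial block).
m_star = msb(E(key, L), j) ^ c_star
sigma = pad(m_star, j)
te_prime = E(key, mul3(mul3(L)) ^ sigma)   # mask 3^2 L for odd m, partial
accepted = (te_prime ^ TA) == t_star
```

Under these assumptions `accepted` is true: the receiver's mask $3^2L$ coincides with the mask $2^nL$ that the sender used for chunk $n+1$, so a tag for a one-block message is obtained without any further oracle query.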

4.5 Proposed solution

Section 4.4.1 demonstrates that, for certain forms of primitive polynomial, collisions occur between masking values, and that this can be exploited in forgery attacks. This implies that the current choice of masking coefficients cannot be used in a generic construction of OTR. For every choice of finite field, the implementer must verify that the masking values will be distinct in order to ensure that the design is secure against forgery attacks.

This section proposes two minor modifications to OTR which guarantee that the masking coefficients are distinct for any choice of finite field. This makes the generic OTR scheme more robust, since it reduces the chance of security compromise as a result of incorrect user choices. Our modifications preserve the main features of OTR mode and still use the powerful doubling masking method.

Note from Table 4.1 that OTR uses four special masks, $(2^{\ell-1}3^2L)$, $(2^{\ell-1}3^3L)$, $(7 \cdot 2^{\ell-1}L)$ and $(7 \cdot 3 \cdot 2^{\ell-1}L)$, during the process of generating the authentication tag $TE$. The choice of mask depends on two message features: whether the number of blocks $m$ is even or odd, and whether the last block is a full block ($|M_m| = n$) or not. Note that the distance between any two masks in this set of four masks is much greater than the maximum possible length (number of blocks) of any plaintext message. This prevents an attacker from forcing collisions between masks by changing the message length (inserting or deleting blocks) or by changing the checksum value $\Sigma$. The following design changes the structure to preserve such features without the need for masks that are so far away from one another.

4.5.1 Proposed instantiation of encryption/decryption core

We propose two minor modifications, as shown in Table 4.4 and Figure 4.3, to provide a generic version of OTR. Firstly, the masking values for odd blocks are set to start from $2^3L$, whereas the masking values for even blocks start from $2^{-3}L$. The four masks to be XOR-ed with the checksum of plaintext blocks are defined as follows:

• $2L$ when $m$ is even and $|M_m| \neq n$

• $2^2L$ when $m$ is even and $|M_m| = n$

• $2^{-2}L$ when $m$ is odd and $|M_m| \neq n$

• $2^{-1}L$ when $m$ is odd and $|M_m| = n$

These choices ensure that the masks will not collide regardless of the primitive polynomial being used. The values $2^{-1}L$ and $2^{-2}L$ can easily be obtained from $L$ with a right shift operation instead of the left shift used in the current scheme.

Secondly, the last step in the process used to compute the tag is slightly redesigned, separating the XOR of the checksum $\Sigma$ from the number of chunks $\ell$ in the plaintext message. This prevents an attacker from exploiting the tag computation by compensating between these two variables, $\Sigma$ and $\ell$. In our proposal, changing either variable $\Sigma$ or $\ell$ does not have a predictable effect on the other. The cost of this change is one extra block cipher call. However, this extra block cipher call makes the generic OTR design more robust.
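The remark that $2^{-1}L$ can be obtained with a right shift can be made concrete with a short Python sketch. It assumes $n = 128$ and the polynomial $f(x) = x^{128} + x^7 + x^2 + x + 1$ used in [8]; the function names `dbl` and `halve` are ours, and the sketch is not taken from any OTR reference implementation.

```python
N = 128
POLY_LOW = 0x87                      # x^7 + x^2 + x + 1
POLY_FULL = (1 << N) | POLY_LOW      # f(x) = x^128 + x^7 + x^2 + x + 1

def dbl(v: int) -> int:
    """Multiply by 2 (left shift, then reduce modulo f if needed)."""
    v <<= 1
    return v ^ POLY_FULL if v >> N else v

def halve(v: int) -> int:
    """Multiply by 2^-1: if the constant term is set, add f(x) first so
    that the value becomes divisible by x, then shift right."""
    if v & 1:
        v ^= POLY_FULL
    return v >> 1

def neg_power_masks(L: int, count: int):
    """Return [2^-1 L, 2^-2 L, ..., 2^-count L] by repeated halving."""
    masks = []
    for _ in range(count):
        L = halve(L)
        masks.append(L)
    return masks
```

Here `halve` is the inverse of `dbl`, so each application costs one shift and at most one XOR, which mirrors the cost of doubling. The trick relies on $f(x)$ having a non-zero constant term, which holds for every irreducible polynomial of degree greater than one.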

4.5.2 Proposed instantiation of authentication core

As noted in Section 4.2, the authentication core can process the associated data in two modes: serial or parallel. For serial associated data, the same design is retained, as it uses only two masks: $2Q$ and $2^2Q$. These masks are distinct regardless of the primitive polynomial used. For parallel associated data, only the masks used with the last block $A_d$ are changed. OTR (Figure 4.1) uses the masks $2^{d-1}3Q$ and $2^{d-1}3^2Q$ when $|A_d| \neq n$ and $|A_d| = n$, respectively. However, there is no need for these two masks to be very far away from each other, as changing the number of blocks $d$ in the associated data directly affects the accumulated tag $TA$. Thus, the two suggested masks for the last block are $2^{-1}Q$ and $2^{-2}Q$ when $|A_d| \neq n$ and $|A_d| = n$, respectively. These two masks are still easy to compute, are unaffected by changing the number of blocks, and avoid the base 3.

Table 4.4: Proposed OTR algorithm.

Algorithm 7: OTR Encryption
 1. $\Sigma \leftarrow 0^n$
 2. $L \leftarrow E_K(N)$
 3. $(M_1, \ldots, M_m) \xleftarrow{n} M$, $\ell = \lceil m/2 \rceil$
 4. for $i = 1$ to $\ell - 1$ do
 5.    $C_{2i-1} \leftarrow E_K(2^{i+2}L \oplus M_{2i-1}) \oplus M_{2i}$
 6.    $C_{2i} \leftarrow E_K(2^{-(i+2)}L \oplus C_{2i-1}) \oplus M_{2i-1}$
 7.    $\Sigma \leftarrow \Sigma \oplus M_{2i}$ od
 8. if $m$ is even then
 9.    $Z \leftarrow E_K(2^{\ell+2}L \oplus M_{m-1})$
10.    $C_m \leftarrow \mathrm{msb}_{|M_m|}(Z) \oplus M_m$
11.    $C_{m-1} \leftarrow E_K(2^{-(\ell+2)}L \oplus C_m) \oplus M_{m-1}$
12.    $\Sigma \leftarrow \Sigma \oplus Z \oplus C_m$ fi
13. if $m$ is odd then
14.    $C_m \leftarrow \mathrm{msb}_{|M_m|}(E_K(2^{\ell+2}L)) \oplus M_m$
15.    $\Sigma \leftarrow \Sigma \oplus M_m$ fi
16. if $m$ is even and $|M_m| \neq n$ then
17.    $W \leftarrow E_K(2L \oplus \Sigma)$ fi
18. if $m$ is even and $|M_m| = n$ then
19.    $W \leftarrow E_K(2^2L \oplus \Sigma)$ fi
20. if $m$ is odd and $|M_m| \neq n$ then
21.    $W \leftarrow E_K(2^{-2}L \oplus \Sigma)$ fi
22. if $m$ is odd and $|M_m| = n$ then
23.    $W \leftarrow E_K(2^{-1}L \oplus \Sigma)$ fi
24. $TE \leftarrow E_K(W \oplus \ell)$
25. $C \leftarrow (C_1, \ldots, C_m)$
26. return $(C, TE)$

Algorithm 8: OTR Decryption
 1. $\Sigma \leftarrow 0^n$
 2. $L \leftarrow E_K(N)$
 3. $(C_1, \ldots, C_m) \xleftarrow{n} C$, $\ell = \lceil m/2 \rceil$
 4. for $i = 1$ to $\ell - 1$ do
 5.    $M_{2i-1} \leftarrow E_K(2^{-(i+2)}L \oplus C_{2i-1}) \oplus C_{2i}$
 6.    $M_{2i} \leftarrow E_K(2^{i+2}L \oplus M_{2i-1}) \oplus C_{2i-1}$
 7.    $\Sigma \leftarrow \Sigma \oplus M_{2i}$ od
 8. if $m$ is even then
 9.    $M_{m-1} \leftarrow E_K(2^{-(\ell+2)}L \oplus C_m) \oplus C_{m-1}$
10.    $Z \leftarrow E_K(2^{\ell+2}L \oplus M_{m-1})$
11.    $M_m \leftarrow \mathrm{msb}_{|M_m|}(Z) \oplus C_m$
12.    $\Sigma \leftarrow \Sigma \oplus Z \oplus C_m$ fi
13. if $m$ is odd then
14.    $M_m \leftarrow \mathrm{msb}_{|M_m|}(E_K(2^{\ell+2}L)) \oplus C_m$
15.    $\Sigma \leftarrow \Sigma \oplus M_m$ fi
16. to 24. $W$ and $TE$ are computed exactly as in lines 16 to 24 of Algorithm 7
25. $M \leftarrow (M_1, \ldots, M_m)$
26. return $(M, TE)$

Algorithm 9: OTR Authentication with Parallel Associated Data
 1. $\Xi \leftarrow 0^n$
 2. $Q \leftarrow E_K(0^n)$
 3. $(A_1, \ldots, A_d) \xleftarrow{n} A$
 4. for $i = 1$ to $d - 1$ do
 5.    $\Xi \leftarrow \Xi \oplus E_K(Q \oplus A_i)$
 6.    $Q \leftarrow 2Q$ od
 7. $\Xi \leftarrow \Xi \oplus A_d$
 8. if $|A_d| \neq n$ then $TA \leftarrow E_K(2^{-1}Q \oplus \Xi)$
 9. else $TA \leftarrow E_K(2^{-2}Q \oplus \Xi)$ fi
10. return $TA$

Algorithm 10: OTR Authentication with Serial Associated Data
 1. $\Xi \leftarrow 0^n$
 2. $Q \leftarrow E_K(0^n)$
 3. $(A_1, \ldots, A_d) \xleftarrow{n} A$
 4. for $i = 1$ to $d - 1$ do
 5.    $\Xi \leftarrow E_K(\Xi \oplus A_i)$
 6. od
 7. $\Xi \leftarrow \Xi \oplus A_d$
 8. if $|A_d| \neq n$ then $TA \leftarrow E_K(2Q \oplus \Xi)$
 9. else $TA \leftarrow E_K(4Q \oplus \Xi)$ fi
10. return $TA$

Figure 4.3: Proposed OTR encryption diagram with parallel associated data. (Diagram not reproduced; it shows the cores $\mathrm{EF}_E$ and $\mathrm{AF}_E$ for the cases where $m$ is even and where $m$ is odd.)

Note that the base of all new masks suggested for OTR is 2. This guarantees that the masks will not overlap, and enables us to use the same masks for all choices of $n$ and all choices of the primitive polynomial used to define the finite field $\mathbb{F}_{2^n}$.

4.6 Alternative solution

All masks in the proposed solution in Section 4.5 are obtained from $L$ by multiplying by a power of 2. This prevents mask collisions regardless of the primitive polynomial used. However, the masks for odd blocks have the form $2^jL$, whereas the masks for even blocks have the form $2^{-j}L$, as illustrated in Figure 4.3. The scheme therefore requires two different functions for the doubling masking technique: one to double the $2^jL$ masks using a left shift operation, and the other to update the $2^{-j}L$ masks in the opposite direction using a right shift operation. This is less efficient and can be avoided.

One alternative approach is to derive the masks for each pair of plaintext blocks $(M_{2i-1}, M_{2i})$ as follows:

• The odd block $M_{2i-1}$ has the mask $2^{2+2i}L$.

• The even block $M_{2i}$ has the mask $2^{3+2i}L$.

That is, the user first obtains the secret value $L$ as described in line 2 of the encryption algorithm in Table 4.4. Then, three more masks $(2L, 2^2L, 2^3L)$ are computed, but reserved for use in the tag generation process. After that, the first odd block $M_1$ uses the mask $2^4L$, and this mask is doubled, $2(2^4L) = 2^5L$, to be used with the even block $M_2$. For the next two plaintext blocks $(M_3, M_4)$, the last mask in the previous chunk is doubled, $2(2^5L) = 2^6L$, to be used with $M_3$, and this new mask is also doubled, $2(2^6L) = 2^7L$, to be used with $M_4$. Hence, odd blocks use masks of the sequence $(2^4L, 2^6L, 2^8L, \cdots)$, whereas even blocks use masks of the sequence $(2^5L, 2^7L, 2^9L, \cdots)$. This approach is illustrated in Figure 4.4. The four reserved masks $(L, 2L, 2^2L, 2^3L)$ are used in tag generation as follows:

• $L$ when $m$ is even and $|M_m| \neq n$

• $2L$ when $m$ is even and $|M_m| = n$

• $2^2L$ when $m$ is odd and $|M_m| \neq n$

• $2^3L$ when $m$ is odd and $|M_m| = n$

Note that all masks in this alternative approach have the form $2^jL$ and are updated using a single doubling function. This solution is therefore preferable to the one discussed in Section 4.5. A similar technique can be used with OTR in serial mode.
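The mask schedule of this alternative approach, and the way it differs from the original coefficients, can be sketched as follows. The sketch assumes a toy field with $n = 7$ and $f(x) = x^7 + x + 1$ (a Case 1 polynomial), an arbitrary non-zero sample value for $L$, and helper names of our own choosing.

```python
N = 7                          # toy block size, f(x) = x^7 + x + 1
LOW = 0b11
MASK = (1 << N) - 1

def dbl(v: int) -> int:
    v <<= 1
    return (v & MASK) ^ LOW if v >> N else v

def alternative_masks(L: int, ell: int):
    """Reserved masks L, 2L, 2^2 L, 2^3 L followed by the block masks
    2^4 L, 2^5 L, ... for ell chunks, all produced by one doubling map."""
    masks = [L]
    for _ in range(3 + 2 * ell):
        masks.append(dbl(masks[-1]))
    return masks

def original_masks(L: int, ell: int):
    """Block masks 2^(i-1) L and 2^(i-1) 3L of the current instantiation."""
    masks, cur = [], L
    for _ in range(ell):
        masks += [cur, dbl(cur) ^ cur]
        cur = dbl(cur)
    return masks

L, ell = 0b1011001, 20
alt = alternative_masks(L, ell)
orig = original_masks(L, ell)
alt_distinct = len(set(alt)) == len(alt)
orig_distinct = len(set(orig)) == len(orig)
```

In this toy field the alternative schedule yields pairwise distinct masks for 20 chunks (`alt_distinct` is true), whereas the original coefficients repeat once $\ell > n$ (`orig_distinct` is false), because $2^7L = 3L$. The distinctness of the alternative masks relies on 2 generating the multiplicative group, so it holds only while the largest exponent stays below $2^n - 1$.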

4.7 Security bounds for OTR using the new masking coefficients

This section discusses the impact of changing the input masks on the security bounds of OTR. Firstly, all of the proposed masks for OTR have the form $2^jL$. Since 2 is a generator of the multiplicative group of the finite field $\mathbb{F}_{2^n}$ and the order of this group is $2^n - 1$, this assures that $2^j = 2^{j'}$ iff $j = j'$ for any $0 \leq j, j' \leq 2^n - 2$. Therefore, these masks will not collide for any collection of blocks with a length less than $(2^{n-1} - 3)$, which is well beyond the message length restriction of $2^{n/2}$ blocks imposed by the designer of OTR. Thus, the collision attacks discussed in Section 4.4.1 are now precluded.

Figure 4.4: An alternative proposed OTR encryption diagram with parallel associated data. (Diagram not reproduced; it shows the cores $\mathrm{EF}_E$ and $\mathrm{AF}_E$ for the cases where $m$ is even and where $m$ is odd.)

Secondly, the security proofs of OTR [8] assume that all tweakable block cipher calls in each of the two rounds have distinct masks. As shown in the above paragraph, the proposed masks are guaranteed to be different from each other; thus, the same security bounds for OTR still hold.

Finally, our proposed modification to OTR adds one extra block cipher call, as shown in Figure 4.3. This step is required in order to avoid using a mask with a base other than 2. The probability that an attacker can guess the tag successfully is still $1/2^{\tau}$, where $\tau$ is the tag length. Therefore, the security of OTR is not degraded by the new instantiation method.

Note that the suggested solutions affect the efficiency of the original OTR in two respects. Originally, OTR uses $(m + d + 2)$ block cipher calls to process an $m$-block plaintext message and $d$-block associated data. The proposed OTR increases this number by one to $(m + d + 3)$. In addition, the proposed solutions require double the number of multiplications compared to the original OTR [8,9]. The original OTR [8] uses two mask forms, $2^jL$ and $2^j3L$, for odd and even blocks, respectively. Each time, $2^j3L$ is doubled to obtain $2^{j+1}3L$, while $2^{j+1}L$ is obtained with a single XOR as $2^j3L \oplus 2^jL$. That is, each pair of plaintext blocks requires only one finite field multiplication. On the other hand, the proposed approaches require two finite field multiplications for each pair of blocks. However, the doubling masking technique is very cheap in hardware implementations [8]. Furthermore, recent studies show that this technique can be fast in software implementations [126]. Above all, the proposed solutions add great flexibility to the design of OTR as a generic block cipher mode, guaranteeing the stated security in return for a relatively low additional cost.

4.8 Conclusion

OTR is a block cipher mode of operation for AEAD that uses a doubling masking technique. OTR is designed to be applicable for any block size n but currently requires a suitable choice to be made for the finite field $\mathbb{F}_{2^n}$ used for doubling the mask values. The security of OTR against forgery attacks depends on the distinctness of the masking values. This distinctness has only been proved for specific choices of primitive polynomial in the particular cases of n = 64 and n = 128. For any other selected polynomial, such a proof is required, and this is not a simple task for users.

In this chapter, we showed that the masks used in the current instantiation of OTR are not distinct for certain choices of finite field. Using these choices, we demonstrated practical forgery attacks against OTR. Thus, the generic form of the OTR design is not secure.

Two minor modifications to OTR were proposed to make the generic version of this scheme more robust. In each case, this involved specifying a set of masks that are distinct in every finite field $\mathbb{F}_{2^n}$. This enables OTR to work with any finite field without invalidating the security claimed, and avoids the need for users to carry out their own checks.

Note that this work does not imply that the versions of OTR described in [8, 117] are insecure, since both recommend using the primitive polynomial $f(x) = x^{128} + x^{7} + x^{2} + x + 1$, which guarantees distinct masks. Also, these suggested solutions may apply to other similar block cipher modes that use the doubling masking technique, such as OCB2 [10], ELmD [118] and AES-COPA [119], since all these modes use the same technique to update masks and adopt the same primitive polynomial to perform finite field multiplication.

Beyond the current CAESAR competition, we recommend that future designs based on the doubling masking technique use masking coefficients which all have the same form, for example, $2^{j}$ or $3^{j}$. This avoids the issue of users having to verify whether a particular primitive polynomial results in mask collisions. Hence, the scheme is more general and less prone to flaws based on poor choices of finite fields.

Chapter 5

Analysis of XEX mode using fault attacks

The XOR-Encrypt-XOR (XEX) block cipher mode was introduced by Rogaway in 2004 [10]. XEX mode can be used with any block cipher and uses nonce-based secret masks that are distinct for each message. A sequence of secret masks $\Delta_i$ (also known as offsets) is obtained from the encryption of the nonce. A different mask from this sequence is XOR-ed with each message block both before and after the underlying block cipher algorithm is applied. If the mode does not apply the last XOR operation with the secret mask, then it is called XE mode.

XEX/XE modes can be used to provide Authenticated Encryption (AE) algorithms. The concept of XEX/XE modes was used to refine an existing scheme, OCB1 [28], into a new mode, OCB2 [10]. Other examples of AE schemes that are based on XEX or XE mode are COPA [119], ELmD [118], SHELL [127] and OTR [117]. Unlike OCB2 and OTR, the first three schemes use a constant, instead of a nonce, to derive the secret masks.

XEX/XE modes are faster and easier to implement, and their security is easier to prove, than other tweakable block ciphers or standard block cipher modes [10]. Furthermore, the plaintext message is processed only once in AE schemes. These are attractive features for block cipher modes. The drawback of such modes is that the security depends on both the key (K) and the mask (L); revealing either of them will breach the security of the AE mode as a whole [10, 128].

In XEX mode, the nonce-based masks act as a barrier to conventional fault attack methods. DFA techniques such as [100, 109, 108] require two identical block cipher inputs to produce a pair of correct/faulty ciphertexts; XOR-ing different nonce-based masks with the plaintext blocks prevents this. In SFA, the attacker needs the block cipher outputs, but each output is XOR-ed with a different secret mask, so there is no direct access to them. Thus, neither DFA nor SFA can be applied directly. This research is motivated by this fact.

In this chapter, fault attacks are proposed against XEX mode by targeting the XEX part that uses nonce-based secret masks. The fault attack techniques are proposed to either skip the masking effect or retrieve the value of the secret mask L. In either case, conventional fault attack techniques [100, 18] can then be used to recover the secret key. In the worst case, the entire key can be retrieved with a single additional fault as described in [129, 16, 130, 103]. In addition, if XEX is used as an AE mode, an attacker can breach the AE security by constructing forged messages [128, 10]. Unlike previous fault attacks on XEX-based modes, this approach targets the part of the mode where a direct application of existing fault attack techniques is not possible.

The chapter also contains several simulations on a PC to demonstrate the effectiveness of the proposed attacks and to calculate their success rates. The simulations use 128-bit AES as the underlying block cipher operating in XEX mode. Note that this work does not include hardware experiments showing how the faults are injected. However, the fault models considered in this chapter are well documented in the literature and have been shown to be practical in certain research papers, such as [131, 105, 107], and so can be applied as outlined in this chapter.

Further, the applicability of the proposed techniques to certain authenticated encryption modes is investigated, including candidates in the ongoing CAESAR competition [6], such as COPA, ELmD, SHELL and OTR. Our attacks show that the masking function is a point of vulnerability. Hence, efficient alternative constructions for the mask updating function are suggested as countermeasures.

This chapter is organised as follows: Section 5.1 briefly describes the AES and XEX schemes. Previous fault analyses of XEX mode are presented in Section 5.2. Section 5.3 describes an approach to eliminate the barrier posed by the nonce-based secret masks in XEX mode. Section 5.4 shows fault attacks on the last rounds of AES to retrieve the secret masks given faulty ciphertext messages only. Section 5.5 presents the results of several experimental verifications of these fault attacks using 128-bit AES as the underlying block cipher. Section 5.6 verifies the relevance of our proposed approaches to certain authenticated encryption modes and Section 5.7 investigates mechanisms to avoid the proposed fault attacks. The last section draws a conclusion.

5.1 Preliminaries

This section elaborates the preliminaries required for conducting a fault analysis on XEX mode.

5.1.1 AES description

This section briefly describes the Advanced Encryption Standard (AES), and refers the reader to [44] for more technical details. AES is assumed to be the underlying block cipher whenever a fault analysis technique is applied to XEX mode in this chapter. AES is a symmetric block cipher with a 128-bit block size that allows three key sizes: 128, 192 and 256 bits. AES is an iterated cipher that consists of a number of similar rounds. The number of rounds is 10, 12 or 14 for key sizes of 128, 192 and 256 bits, respectively. Each round in AES consists of four fundamental operations:

• SubBytes (SB): This operation is a non-linear substitution that replaces each byte in the internal state $s^{(r),(o)}_{ij}$ with another according to a fixed 8 × 8 s-box.

• ShiftRow (SR): This operation changes the order of bytes within the same state, where certain bytes are shifted cyclically left by a certain number of steps.

• MixColumn (MC): This is a linear transformation of the four bytes in each column of the state matrix.

• AddRoundKey (AK): The state matrix is combined with a round key by a bitwise XOR operation.

AES does not apply the MixColumn operation in the last round. The internal state of 128-bit AES is organised as a matrix of 4 × 4 bytes indexed by $0 \le j, k < 4$, where j represents the row number and k represents the column number. Each state byte for a plaintext block $M_i$ is written as $s^{i,(r),(o)}_{jk}$, where r is the round number and o is the operation just applied. For example, $s^{1,(9),(AK)}_{00}$ is the first byte of the encryption state of block $M_1$ after the AddRoundKey operation of round 9.

5.1.2 The design of XEX mode

Rogaway [10] introduced a mode of operation for block ciphers known as XEX. The underlying block cipher can be any symmetric block cipher. The scheme is proposed as an efficient approach to build a tweakable block cipher from an ordinary block cipher. An ordinary block cipher accepts two variables (a key and a message), whereas a tweakable block cipher accepts an additional third variable called the tweak. Tweakable block ciphers (see Chapter 4) were first defined by Liskov, Rivest and Wagner (LRW) [120]. However, LRW constructions use two keys: one used in the block cipher functions and the other used in a non-cryptographic function. Compared to LRW, the XEX scheme improves the construction of tweakable block ciphers by reducing these two keys to one.

XEX mode is a nonce-based scheme where every plaintext message uses a different nonce (N). The nonce is encrypted to obtain a secret value $L = E_K(N)$. A function $f(\cdot)$ is applied to this secret value to derive a sequence of secret masks $\{\Delta_i\}$, so that $\Delta_i$ is used during the processing of the i-th message block $M_i$. The XEX mode (see Figure 5.1) is defined as:

\[ C_i = E_K(P_i \oplus \Delta_i) \oplus \Delta_i \]

where $P_i$ is the i-th plaintext block (shown as $M_i$ in Figure 5.1), $\Delta_i = f(i, L)$ and $L = E_K(N)$.

Figure 5.1: XEX mode. (The nonce N is encrypted under K to give L, from which $\Delta_1 = f(L)$ is derived; each block $M_i$ is XOR-ed with $\Delta_i$ before and after $E_K$ to produce $C_i$.)

Note that XEX mode uses a single key for both the block encryption operation and the initialisation of the sequence of masking values. This scheme uses an approach that is simpler, more efficient and easier to prove correct than LRW constructions [120]. A similar approach can be used to build a scheme called XE, defined as:

\[ C_i = E_K(P_i \oplus \Delta_i) \]

where the second XOR with the mask $\Delta_i$ is omitted. Both XE and XEX are secure up to the birthday bound provided that the masks are distinct. XEX mode provides security against chosen ciphertext attacks (CCA-secure), while XE mode provides security against chosen plaintext attacks (CPA-secure), provided that the underlying block cipher is CCA-secure and CPA-secure, respectively. Detailed security analyses for XEX and XE are presented in [10].
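The two definitions can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration only: `toy_encrypt`/`toy_decrypt` are a hypothetical four-round Feistel stand-in for $E_K$ (not AES and not secure), and the mask function is taken to be $f(i, L) = 2^{i-1}L$ in $\mathbb{F}_{2^{128}}$, i.e. the doubling technique of Section 5.1.2.1.

```python
import hashlib

MASK128 = (1 << 128) - 1
MASK64 = (1 << 64) - 1


def _round(key: bytes, r: int, half: int) -> int:
    """Hypothetical Feistel round function built from SHA-256 (illustration only)."""
    digest = hashlib.sha256(key + bytes([r]) + half.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big")


def toy_encrypt(key: bytes, block: int) -> int:
    """Stand-in for E_K on 128-bit blocks: a 4-round Feistel network."""
    left, right = block >> 64, block & MASK64
    for r in range(4):
        left, right = right, left ^ _round(key, r, right)
    return (left << 64) | right


def toy_decrypt(key: bytes, block: int) -> int:
    """Inverse of toy_encrypt: undo the rounds in reverse order."""
    left, right = block >> 64, block & MASK64
    for r in reversed(range(4)):
        left, right = right ^ _round(key, r, left), left
    return (left << 64) | right


def double(L: int) -> int:
    """2L in GF(2^128) modulo x^128 + x^7 + x^2 + x + 1."""
    dl = (L << 1) & MASK128
    return dl ^ 0x87 if L >> 127 else dl


def xex_encrypt(key: bytes, nonce: int, blocks, xe: bool = False):
    """C_i = E_K(P_i xor D_i) xor D_i; the final XOR is dropped when xe=True."""
    delta = toy_encrypt(key, nonce)          # L = E_K(N), used as the first mask
    out = []
    for p in blocks:
        c = toy_encrypt(key, p ^ delta)
        out.append(c if xe else c ^ delta)
        delta = double(delta)                # next mask (assumed f)
    return out


def xex_decrypt(key: bytes, nonce: int, blocks):
    """P_i = D_K(C_i xor D_i) xor D_i, with the same mask sequence."""
    delta = toy_encrypt(key, nonce)
    out = []
    for c in blocks:
        out.append(toy_decrypt(key, c ^ delta) ^ delta)
        delta = double(delta)
    return out
```

In this sketch an XE ciphertext block differs from the corresponding XEX block only by the mask $\Delta_i$, which is why access to the raw block cipher output distinguishes the two cases in the attacks discussed later.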

5.1.2.1 Mask instantiation

The XEX and XE schemes require certain properties during mask instantiation. First, it is important to ensure that masks are always distinct from each other and do not overlap while processing different blocks of the same message. Mask overlapping enables an attacker to perform a straightforward forgery by swapping blocks with matching masks. The function used for mask instantiation should produce a large range of distinct masks that covers at least $2^{n/2}$ blocks. Second, for efficiency of implementation, every new mask $\Delta_{i+1}$ should be easily calculated from the previous one $\Delta_i$. Masks should arise in some sequence where a subsequent mask is an increment of the prior mask.

One efficient approach for mask instantiation is called the doubling masking technique or powering-up construction. This approach was first used in EME mode [132]. Rogaway [10] suggests using this technique in the XEX and XE modes. In the doubling masking technique, new masks are obtained as $\Delta_{i+1} = 2\Delta_i$, where multiplication is performed in the finite field $\mathbb{F}_{2^n}$. If the first mask is L, the doubling masking technique results in a series of masking values: $L, 2L, 2^{2}L, 2^{3}L, \ldots, 2^{m-1}L$. Each message block uses a different mask that is XOR-ed both before and after the underlying block cipher algorithm is applied, as shown in Figure 5.2(a).

The mask multiplication is performed in the finite field $\mathbb{F}_{2^n}$ by multiplying two input polynomials and finding the remainder modulo a primitive polynomial.

When n = 128 and the finite field $\mathbb{F}_{2^{128}}$ is constructed using the commonly used primitive polynomial $f(x) = x^{128} + x^{7} + x^{2} + x + 1$, the doubling is as follows:

\[
2L =
\begin{cases}
L \ll 1 & \text{if } \mathrm{msb}_1(L) = 0 \\
(L \ll 1) \oplus 0^{120}10000111 & \text{if } \mathrm{msb}_1(L) = 1
\end{cases}
\tag{5.1}
\]

and one way to calculate doubling is shown in Figure 5.2(b). Note that this choice of finite field is used in Rogaway's paper [10] and has also been adopted in other designs such as COPA, ELmD, SHELL and OTR. Rogaway [10] proves that if masks are updated as $2^{i}L$, then this choice of finite field generates unique masks for $-2^{126} \le i \le 2^{126}$, which is far more than $2^{64}$.

Figure 5.2: Doubling masking for XEX mode.

(a) The most common masking of XEX mode: the nonce N is encrypted to give L, and block $M_i$ is XOR-ed with $2^{i-1}L$ before and after $E_K$ to give $C_i$.

(b) Timing-resistant implementation of the doubling masking technique:

    Input: L
    (1) Constant[2] = {0x00, 0x87}
    (2) F = msb_1(L)
    (3) DL = L << 1
    (4) DL = DL ⊕ Constant[F]
    Output: DL
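As a minimal sketch, the four steps of Figure 5.2(b) can be written as follows, assuming the mask is held as a 128-bit Python integer whose most significant bit is $\mathrm{msb}_1(L)$. The function names `double` and `masks` are chosen here for illustration and do not come from any cited implementation.

```python
MASK128 = (1 << 128) - 1
FEEDBACK = 0x87  # low-order terms x^7 + x^2 + x + 1 of the primitive polynomial


def double(L: int) -> int:
    """Return 2L in GF(2^128), following steps (1)-(4) of Figure 5.2(b)."""
    constant = (0x00, FEEDBACK)      # (1) two-entry lookup table
    f = (L >> 127) & 1               # (2) F = msb_1(L)
    dl = (L << 1) & MASK128          # (3) DL = L << 1, truncated to 128 bits
    dl ^= constant[f]                # (4) DL = DL xor Constant[F]
    return dl


def masks(L: int, m: int) -> list:
    """Masks L, 2L, ..., 2^(m-1) L for an m-block message."""
    out = []
    for _ in range(m):
        out.append(L)
        L = double(L)
    return out
```

Selecting the constant through a table lookup rather than a branch is what makes the listing timing-resistant; the Python version only mirrors the data flow and makes no timing claim of its own.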

5.1.3 Fuhr et al.’s fault attack on AES

This section describes Fuhr et al.'s [18] fault attacks on AES using ciphertext only. These attacks are very efficient, as they assume that the attacker has access only to a certain number of faulty ciphertext messages, all encrypted under the same secret key. Later, in Section 5.4, this approach is applied to XEX mode to retrieve the secret value L.

Fuhr et al. [18] introduce several fault attacks on AES-128 assuming an attacker can neither choose nor know the plaintext message. These attacks use only faulty ciphertext messages, without the need to collect faulty/correct ciphertext pairs as in DFA. The attack targets the last four rounds of AES and leads to recovery of the entire key.

This work considers different fault models that all have a non-uniform distribution. That is, it is assumed that the injected fault forces the targeted byte value to a fixed, but still unknown, constant: the distribution of the faulty value is biased. Three fault models have been used in this work:

A. Stuck-at-zero with probability 1:

\[ s^{i,(r),(o)}_{jk} = s^{i,(r),(o)}_{jk} \text{ AND } 0 \quad \text{with probability } 1. \]

B. Stuck-at-zero with probability 1/2 (where e represents an error uniformly distributed in [0, 255]):

\[
s^{i,(r),(o)}_{jk} =
\begin{cases}
s^{i,(r),(o)}_{jk} \text{ AND } 0 & \text{with probability } 1/2 \\
s^{i,(r),(o)}_{jk} \text{ AND } e & \text{with probability } 1/2
\end{cases}
\]

C. Stuck-at-error with probability 1 (where e represents an error uniformly distributed in [0, 255]):

\[ s^{i,(r),(o)}_{jk} = s^{i,(r),(o)}_{jk} \text{ AND } e \quad \text{with probability } 1. \]

Note that these three fault models represent different degrees of control over the faulty byte. The first model (referred to as fault model A) assumes that the attacker has perfect control over the faulty byte: the fault induces a constant value. The second fault model (fault model B) assumes that the injected fault biases the targeted byte toward a fixed value. The last fault model (fault model C) assumes that the attacker only knows that the fault distribution is biased, without any further information.

Since the attacker has only a number of faulty ciphertext messages, a statistical analysis is required to retrieve the correct key value. In this statistical analysis, each byte $s^{i,(r),(o)}_{jk}$ in the internal state is represented in terms of a subset of the corresponding ciphertext $[C_i]_{(jk)}$ and key $K^{(r)}_{(jk)}$. Then, this subset of the key $K^{(r)}_{(jk)}$ is recovered by guessing its value and evaluating the distribution of $s^{i,(r),(o)}_{jk}$ in several faulty ciphertext messages using a distinguisher. A distinguisher takes several faulty ciphertext messages and assigns a score to each candidate for $K^{(r)}_{(jk)}$. The candidate value with the maximum (or minimum, depending on the distinguisher) score is typically the correct value.

Fuhr et al. used different distinguishers to retrieve the correct key. One commonly used distinguisher is the Hamming weight. For example, each bit in the faulty byte $s^{i,(r),(o)}_{jk}$ is biased toward zero with strong probability in fault model A. The Hamming weight distinguisher returns the hypothesis that minimises the average number of 1's (the Hamming weight) in such a faulted byte over a collection of faulty ciphertext messages.

Fuhr et al. applied fault attacks to the 6th, 7th, 8th and 9th rounds of AES using different fault models. The results of the attack on the 9th round were computed over 1000 simulations using the Hamming weight distinguisher and are summarised in Table 5.1.

Table 5.1: Summary of Fuhr et al.'s fault attack on AES round 9 [18].

Fault model   Number of faults   Success rate
A             1                  0.99
B             14                 0.99
C             18                 0.99
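The Hamming weight distinguisher can be sketched for a single key byte as follows. This is an illustrative reconstruction, not Fuhr et al.'s code: it assumes plain (unmasked) AES, a fault on one byte at the input of the final-round SubBytes, and that ShiftRow is ignored because only one byte position is followed. The names `rank_key_byte` and `faulty_cipher_byte_model_a` are hypothetical.

```python
def gf256_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return r


def rotl8(x: int, n: int) -> int:
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def build_sbox() -> list:
    """AES s-box: field inverse (0 maps to 0) followed by the affine map."""
    sbox = []
    for a in range(256):
        inv = next((b for b in range(256) if gf256_mul(a, b) == 1), 0)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x


def hamming_weight(x: int) -> int:
    return bin(x).count("1")


def rank_key_byte(faulty_cipher_bytes) -> int:
    """Return the last-round key byte guess that minimises the total Hamming
    weight of the state byte recovered as InvSB(c xor k)."""
    def score(k: int) -> int:
        return sum(hamming_weight(INV_SBOX[c ^ k]) for c in faulty_cipher_bytes)
    return min(range(256), key=score)


def faulty_cipher_byte_model_a(state_byte: int, key_byte: int) -> int:
    """Fault model A: the state byte is ANDed with 0 before the last round."""
    faulted = state_byte & 0x00
    return SBOX[faulted] ^ key_byte
```

Under fault model A the correct guess yields a recovered byte of Hamming weight zero, and no other guess can, which is consistent with a single fault being sufficient in Table 5.1. Models B and C produce noisier bytes, so the score has to be accumulated over more faulty ciphertexts.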

5.2 Existing fault attacks on XEX

To the best of our knowledge, only two pieces of work investigate fault attacks against the XEX scheme. One study, conducted by Dobraunig et al. [19], targets XEX-based AE schemes. The other study was performed by Unterluggauer and Mangard [133], who propose differential power analysis (DPA) and differential fault analysis (DFA) against memory and disk encryption schemes using XEX as an encryption mode.

Dobraunig et al. [19] investigate fault attacks on XEX-based AE modes. This work is significant as it demonstrated the practical relevance of the statistical fault attacks proposed by Fuhr et al. [18], briefly described in Section 5.1.3, to authenticated encryption modes. The work targeted, amongst others, XEX-based AE modes such as OCB, OTR, COPA, SHELL and ELmD [6]. The fault attacks on XEX-based modes in [19] require access to parts of the mode where the block cipher output is either known or XOR-ed with a constant-based secret mask. These attacks are not applicable when the masks are not constant-based. Unlike the attacks reported by Dobraunig et al. [19], this chapter proposes fault attacks against XEX mode even if the masks are not constant-based.

The authors of [19] performed fault-injection experiments on three real hardware platforms. They demonstrated that the key can be recovered with a couple of faulty ciphertexts. A summary of the XEX-based AE schemes and their targeted part is presented in Table 5.2. Note that all the XEX-based modes listed in Table 5.2 use constant-based masks, with the exception of OCB [28] and OTR. Although OCB uses nonce-based masks, the attack targeted the XE part, where the output of the block cipher is accessible, but not the XEX part. The OTR scheme is XE-based, so the output of the block cipher is known and not XOR-ed with any mask.

Table 5.2: Summary of fault attacks on XEX-based AE schemes [19].

AE mode   Classification   Mask type        Targeted part
COPA      XEX              Constant-based   XEX
ELmD      XEX              Constant-based   XEX
SHELL     XEX              Constant-based   XEX
OCB       XEX              Nonce-based      XE
OTR       XE               Nonce-based      XE

Unterluggauer and Mangard [133] investigate certain memory encryption schemes in terms of DPA and DFA attacks. Memory encryption schemes use several encryption modes, including XEX mode with 128-bit AES as the underlying block cipher. In these schemes, the memory is divided into sectors and each sector is divided into blocks. The mask in an XEX-based memory encryption scheme is mainly derived from the finite-field multiplication of the sector number and the memory block address. It is easy to see that re-encrypting the same memory block results in the same mask being generated. With the mask fixed, encrypting the same plaintext of a memory block twice allows the attacker to easily conduct differential fault analysis. In summary, Unterluggauer and Mangard [133] applied differential fault analysis to an XEX-based encryption scheme where the mask is fixed, although it is still unknown.

5.3 Eliminating the masks in XEX mode

In this section, we present two approaches to eliminate the masks of XEX/XE modes using fault attacks. We exploit the primitive polynomial commonly used in XEX mode to eliminate the effect of the secret masks. This effectively converts the XEX mode to ECB mode, so subsequent faults can be injected to retrieve the secret key.

5.3.1 Stuck-at-zero fault attack

The duration of an injected fault can be permanent or transient. A permanent fault means that certain bits are disturbed permanently for the entire operation of a hardware platform, whereas transient faults change the value of certain bits temporarily. In addition to duration, the location of a fault can be either precise, affecting a certain bit in an internal register, or random.

In this section, our fault model assumes that the fault will occur in a single bit anywhere in the secret mask register L except the last byte. That is, the j-th bit in L, where $0 \le j < 120$, will be stuck at 'zero' permanently. This fault model is considered to be feasible using sophisticated technology, such as a laser fault injection system, and we note that several research papers adopt this fault model (see fault attacks against AES in [101] and ACORN in [134]).

Due to the features of the primitive polynomial used in the doubling masking technique, the entire masking value L will be stuck permanently at zero after only a few faulted plaintext blocks in a multi-block message. If only one bit of L is faulted, L will reach zero after 128 blocks, whereas if one byte is faulted, L will be zero only after 16 blocks. Therefore, the effect of masks in XEX mode is avoided.

The location of the faulted bit cannot be in the range $120 \le j \le 127$, since these bit positions are XOR-ed with the feedback value (0x87) in the case where the most significant bit of L is 1 (see Equation 5.1). Therefore, destroying such bit positions will not zero the mask L, though it will increase the chance of mask collision.

A permanent fault is not necessary: the attacker can instead inject transient stuck-at-zero faults. This fault model has been shown to be feasible using low-budget equipment in [107]. The attacker can force certain bits of L to be 0 for a few consecutive blocks. If a byte is faulted, 16 consecutive set-to-zero faults need to be induced in any of the first 15 bytes of L.
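The single-bit case can be checked with a small simulation. It is a sketch under stated assumptions: the permanent fault is modelled by forcing bit j (counted from the most significant bit, as in the text) to zero in the register before the first block and after every doubling, and `blocks_until_zero` is a name introduced here for illustration.

```python
import random

MASK128 = (1 << 128) - 1


def double(L: int) -> int:
    """2L in GF(2^128) modulo x^128 + x^7 + x^2 + x + 1 (Equation 5.1)."""
    dl = (L << 1) & MASK128
    return dl ^ 0x87 if L >> 127 else dl


def blocks_until_zero(L: int, j: int, limit: int = 256):
    """Number of doublings after which the mask register is zero when bit j
    (0 = most significant bit) is permanently stuck at zero.
    Returns None if the mask is still non-zero after `limit` doublings."""
    stuck = ~(1 << (127 - j)) & MASK128   # AND-mask that clears bit j
    L &= stuck                            # the fault is already present
    for count in range(limit + 1):
        if L == 0:
            return count
        L = double(L) & stuck             # update the mask, then re-apply the fault
    return None


if __name__ == "__main__":
    random.seed(2018)
    for j in (0, 60, 119):
        worst = max(blocks_until_zero(random.getrandbits(128), j) for _ in range(100))
        print(f"bit {j}: mask is zero after at most {worst} doublings")
```

The reasoning behind the result is that any feedback from 0x87 lands in the last byte, and every bit in the register must eventually shift through the stuck position, where it is cleared.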

5.3.2 Skipping an instruction fault attack

If it is assumed that transient/permanent stuck-at-zero faults are not applicable or consume more time/cost, L can also be overcome in software implementations using a fault model that is more efficient and easier to set up: skipping an instruction, i.e. an instruction is not executed. An instruction can be skipped by applying glitch attacks [101]. This fault model was investigated and proved practical in [105].

For example, if the implementation of Figure 5.2(b) has been used, one way to eliminate the effect of masks in XEX mode is to skip the execution of instruction (2) in this implementation for 128 consecutive blocks. This step will cause the doubling mechanism to always choose Constant[0] and not Constant[1], provided that the value of F is zero before the fault injections. Thus, after 128 blocks, the entire 128-bit L will be zero and L will be stuck at zero during the processing of all following blocks. The chance that F is 0 before a fault injection is 50%. If F was 1, the attack can be repeated with another set of 128 consecutive blocks, now with a better chance that the carry flag F is zero in at least one of these instances.

Another, more effective way to overcome the masks is to omit the execution of instruction (4) (see Figure 5.2(b)) for 128 consecutive blocks, as in the previous approach, but regardless of the value of the carry flag F. This approach guarantees that the mask L will be stuck at a value of zero for all following plaintext blocks.

Note that this approach is implementation-dependent. That is, the attacker needs to have knowledge of the implementation to successfully overcome L. However, it is clear that similar approaches can be derived to achieve the same purpose for implementations other than the one discussed here.
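Both instruction-skip variants can be simulated directly on the listing of Figure 5.2(b). The sketch below is illustrative: a skipped instruction is modelled simply as not being executed, so the flag F keeps whatever value it had before, and `double_with_skips` and `run` are names introduced here.

```python
MASK128 = (1 << 128) - 1
CONSTANT = (0x00, 0x87)              # instruction (1): lookup table


def double_with_skips(L: int, F: int, skip=()):
    """One pass through instructions (2)-(4) of Figure 5.2(b).
    `skip` lists the instruction numbers that the fault suppresses;
    F is the flag value left over from the previous pass."""
    if 2 not in skip:
        F = (L >> 127) & 1           # (2) F = msb_1(L)
    DL = (L << 1) & MASK128          # (3) DL = L << 1
    if 4 not in skip:
        DL ^= CONSTANT[F]            # (4) DL = DL xor Constant[F]
    return DL, F


def run(L: int, F: int, skip, blocks: int = 128) -> int:
    """Mask value after `blocks` consecutive faulted mask updates."""
    for _ in range(blocks):
        L, F = double_with_skips(L, F, skip)
    return L


if __name__ == "__main__":
    L = 0x0123456789ABCDEFFEDCBA9876543210
    print(hex(run(L, 0, skip=(2,))))   # stale F = 0: feedback never applied
    print(hex(run(L, 1, skip=(4,))))   # feedback step removed entirely
```

With instruction (2) skipped and a stale F of 1, the feedback constant is applied on every update instead, so the mask does not reach zero; this is the 50% failure case mentioned above.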

5.3.3 Security implication for mask elimination

Forcing the masks in XEX mode to zero reduces XEX mode to ECB mode. As a result, if the mode is used as an AE mode, this will enable attackers to breach the integrity assurance mechanism of the mode. In addition, the secret key can be recovered using additional faults. One extra fault injection can completely recover the key, as described in [129, 16, 130, 103].

These proposed fault attacks are easy and efficient due to the particular form of the primitive polynomial used to define the finite field. Since the polynomial is sparse and the feedback path is from bits all located in the final byte of L, the attacks work effectively. This work demonstrates the weakness of this commonly used polynomial with respect to fault attacks. Section 5.7 suggests different primitive polynomials that avoid these fault attacks.

5.4 A Ciphertext only attack to reveal secret mask L

In this section, we describe an approach to obtain the value of L under the stricter requirement of using ciphertext blocks only. For this section, 128-bit AES is considered to be the underlying block cipher used in XEX mode.

The challenge with attacking AES in XEX mode rather than ECB mode is that the block cipher output is XOR-ed with a mask prior to generating the ciphertext. That is, the attacker does not have direct access to the output of the encryption. In addition, masks are guaranteed to be different from each other. A ciphertext-only statistical fault attack has been used previously to determine the secret key in AES encryption [18], but this attack requires a collection of ciphertext bytes that share the same sub-key byte. Therefore, the statistical fault attack cannot be applied directly to obtain the secret key from AES in XEX mode. This thesis shows, however, that the relationship between the masks used in the doubling masking mechanism enables us to adapt this attack to reveal the initial mask value L. From this, it is then straightforward to find the secret key, completely breaking the security of the cipher. In fact, it is noted that the key bits can be determined using the same ciphertext bytes used to reveal the mask L.

As a first step toward retrieving L, several masks are collected that share certain mask bytes. From Equation 5.1 and Figure 5.2(b), note that the doubling operation used in XEX mode causes the secret mask $L = (L_{00}, L_{01}, \ldots, L_{33})$ to shift by one bit to the left for each block in the message. Moreover, note that the only bits of L that are potentially changed in this process are those in the final byte, $L_{33}$, of L. Thus, after eight shifts, all of the bytes in the original mask L, except for $L_{00}$ and $L_{33}$, will appear again in $2^{8}L$, but shifted a whole byte to the left. Likewise, all but three bytes of L will appear in $2^{16}L$, and the original byte $L_{32}$ appears a total of fifteen times as:

\[
\left\{
\begin{array}{l}
L_{32},\ (2^{8}L)_{31},\ (2^{16}L)_{30},\ (2^{24}L)_{23},\ (2^{32}L)_{22},\ (2^{40}L)_{21},\ (2^{48}L)_{20},\ (2^{56}L)_{13}, \\
(2^{64}L)_{12},\ (2^{72}L)_{11},\ (2^{80}L)_{10},\ (2^{88}L)_{03},\ (2^{96}L)_{02},\ (2^{104}L)_{01},\ (2^{112}L)_{00}
\end{array}
\right\}
\]

before being shifted out of L. In addition, Equation 5.1 can be used to show that:

• there is a one-to-one relation between the values of $(2^{8}L)_{33}$ and $L_{00}$.

• the value of $(2^{8}L)_{32}$ depends only on $L_{00}$ and $L_{33}$, such that $L_{33}$ can be determined uniquely from $L_{00}$ and $(2^{8}L)_{32}$.

The first of these results effectively gives us a sixteenth copy of $L_{32}$ from the mask byte $(2^{120}L)_{33}$. In total (for a sufficiently long message), up to sixteen copies of $L_{32}$ can be collected, as shown in Figure 5.3. ($(2^{120}L)_{33}$ is denoted as $L'_{32}$ in this figure.)

In some situations, 16 repetitions of the same byte are not sufficient to determine the byte's value with high probability. One way to overcome this problem is to increase the number of repetitions by using two bytes rather than one. For example, if the byte $L_{32}$ and the byte $(2^{8}L)_{32}$ are used, then each byte will repeat 16 times, as:

\[
\left\{
\begin{array}{l}
L_{32},\ (2^{8}L)_{31},\ (2^{16}L)_{30},\ (2^{24}L)_{23},\ (2^{32}L)_{22}, \\
(2^{40}L)_{21},\ (2^{48}L)_{20},\ (2^{56}L)_{13},\ (2^{64}L)_{12},\ (2^{72}L)_{11}, \\
(2^{80}L)_{10},\ (2^{88}L)_{03},\ (2^{96}L)_{02},\ (2^{104}L)_{01},\ (2^{112}L)_{00},\ (2^{120}L)'_{33}
\end{array}
\right\}
\]

and

\[
\left\{
\begin{array}{l}
(2^{8}L)_{32},\ (2^{16}L)_{31},\ (2^{24}L)_{30},\ (2^{32}L)_{23},\ (2^{40}L)_{22}, \\
(2^{48}L)_{21},\ (2^{56}L)_{20},\ (2^{64}L)_{13},\ (2^{72}L)_{12},\ (2^{80}L)_{11}, \\
(2^{88}L)_{10},\ (2^{96}L)_{03},\ (2^{104}L)_{02},\ (2^{112}L)_{01},\ (2^{120}L)_{00},\ (2^{128}L)'_{33}
\end{array}
\right\}
\]

respectively. In addition to these 32 repetitions, the value of the byte $(2^{128}L)_{32}$ can be calculated, since both $(2^{120}L)_{33}$ and $(2^{120}L)_{00}$ have already been retrieved. This new byte $(2^{128}L)_{32}$ will also repeat another 16 times, as:

\[
\left\{
\begin{array}{l}
(2^{128}L)_{32},\ (2^{136}L)_{31},\ (2^{144}L)_{30},\ (2^{152}L)_{23},\ (2^{160}L)_{22}, \\
(2^{168}L)_{21},\ (2^{176}L)_{20},\ (2^{184}L)_{13},\ (2^{192}L)_{12},\ (2^{200}L)_{11}, \\
(2^{208}L)_{10},\ (2^{216}L)_{03},\ (2^{224}L)_{02},\ (2^{232}L)_{01},\ (2^{240}L)_{00},\ (2^{248}L)'_{33}
\end{array}
\right\}
\]

Figure 5.3: Masks containing the byte $L_{32}$. (The diagram shows the 4 × 4 byte layouts of the sixteen masks $L, 2^{8}L, 2^{16}L, \ldots, 2^{120}L$, with $L_{32}$ moving one byte position to the left in each successive mask and appearing as $L'_{32}$ in $2^{120}L$.)
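The byte repetitions described above can be checked numerically. The sketch below assumes the mask is stored as a 128-bit integer with $L_{00}$ as the most significant byte and $L_{33}$ as the least significant; the helper names `power`, `byte` and `copies_of_L32` are introduced here for illustration.

```python
import random

MASK128 = (1 << 128) - 1


def double(L: int) -> int:
    """2L in GF(2^128) modulo x^128 + x^7 + x^2 + x + 1 (Equation 5.1)."""
    dl = (L << 1) & MASK128
    return dl ^ 0x87 if L >> 127 else dl


def power(L: int, e: int) -> int:
    """Return 2^e * L by repeated doubling."""
    for _ in range(e):
        L = double(L)
    return L


def byte(L: int, j: int, k: int) -> int:
    """Byte L_jk of the mask (row j, column k), L_00 being the most significant."""
    return (L >> (8 * (15 - (4 * j + k)))) & 0xFF


def copies_of_L32(L: int) -> list:
    """The fifteen bytes L_32, (2^8 L)_31, ..., (2^112 L)_00 listed in the text."""
    out = []
    for t in range(15):
        idx = 14 - t                       # byte index moves one step left per 8 doublings
        out.append(byte(power(L, 8 * t), idx // 4, idx % 4))
    return out


def wrapped_byte(v: int) -> int:
    """(2^8 L)_33 as a function of the top byte L_00 = v alone."""
    return byte(power(v << 120, 8), 3, 3)


if __name__ == "__main__":
    L = random.getrandbits(128)
    print(hex(byte(L, 3, 2)), [hex(b) for b in copies_of_L32(L)])
    print(len({wrapped_byte(v) for v in range(256)}))   # number of distinct images
```

The function `wrapped_byte` makes the first bullet point concrete: the byte that wraps around into position 33 is determined by the byte shifted out of position 00, and the mapping is invertible, which is what supplies the sixteenth copy.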

In summary, each of the two bytes $L_{32}$ and $(2^{8}L)_{32}$ will repeat 16 times and the combination of the two will repeat 16 more times. In the end, 48 occurrences depending only on 16 bits can be collected. The same concept applies if one takes three bytes or more. In the case of three bytes (24 bits), 96 repetitions in total can be collected.

The following experiments consider two fault models, fault model A and fault model B, as described in Section 5.1.3. A fault attack will be applied to the internal state s at rounds 8 and 9 of the AES encryption operation. As discussed in [131], these fault models are possible, but accurate value/location fault injections might need high technical skills and high cost. However, [131] emphasises that the inability to inject only the desired fault does not imply the inability to induce the fault. In either case, this section outlines the vulnerability of the XEX mode if these faults are possible.

5.4.1 Fault model A at round 9

Suppose that a fault is injected on a certain byte at the end of round 9 ($s^{i,(9),(AK)}_{jk}$). For example, suppose a fault is induced on the byte $s^{i,(9),(AK)}_{00}$ for the first and second plaintext blocks ($i \in \{1, 2\}$), as shown in Figure 5.4.

The injected faults cause the two specified bytes to take a constant value. The faulty bytes will be identical during propagation in the SubBytes, ShiftRow and AddRoundKey operations of round 10 as follows:

\[
\begin{aligned}
s^{1,(10),(SR)}_{00} &= s^{1,(10),(SB)}_{00} = SB(s^{1,(9),(AK)}_{00}) \\
s^{2,(10),(SR)}_{00} &= s^{2,(10),(SB)}_{00} = SB(s^{2,(9),(AK)}_{00}) \\
[C_1]_{(00)} &= s^{1,(10),(SR)}_{00} \oplus K^{(10)}_{(00)} \oplus L_{00} \\
[C_2]_{(00)} &= s^{2,(10),(SR)}_{00} \oplus K^{(10)}_{(00)} \oplus (2L)_{00} \\
[C_1]_{(00)} \oplus [C_2]_{(00)} &= L_{00} \oplus (2L)_{00} = (3L)_{00}
\end{aligned}
\]

When $[C_1]_{(00)}$ and $[C_2]_{(00)}$ are XOR-ed, the two bytes $s^{1,(10),(SR)}_{00}$ and $s^{2,(10),(SR)}_{00}$ cancel each other since they are identical, and the value of $(3L)_{00}$ is easily obtained. Note that this attack is based on the XOR of two consecutive blocks to obtain $(3L)_{00}$, and there is no need to find the sub-key byte $K^{(10)}_{(00)}$.

Repeating the above experiment for blocks $i \in \{9, 10\}$ will retrieve the byte $(2^{8}L)_{00} \oplus (2^{9}L)_{00} = (2^{8}3L)_{00}$, which is equivalent to $(3L)_{01}$ (the second byte of 3L). Similarly, blocks $i \in \{17, 18\}$ will enable us to determine the third byte of 3L, and so on. Thus, the whole 3L mask can be retrieved by inducing faults in 32 blocks of a cipher in XEX mode, and consequently, the original mask L can be easily obtained. (To determine the final byte $(3L)_{33}$, it is necessary to adjust for the known value of $(3L)_{00}$.)
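The last step, obtaining L from the recovered 3L, is not spelled out above. One possible way, sketched here as an assumption rather than as the method used in the experiments, is to multiply 3L by the inverse of 3 in $\mathbb{F}_{2^{128}}$. The names `gf_mul`, `gf_inv` and `recover_L` are illustrative.

```python
import random

MASK128 = (1 << 128) - 1


def double(a: int) -> int:
    """2a in GF(2^128) modulo x^128 + x^7 + x^2 + x + 1."""
    a2 = (a << 1) & MASK128
    return a2 ^ 0x87 if a >> 127 else a2


def gf_mul(a: int, b: int) -> int:
    """Multiply a and b in GF(2^128) (shift-and-add over the bits of b)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = double(a)
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Inverse of a non-zero element, computed as a^(2^128 - 2)."""
    result = 1
    exponent = (1 << 128) - 2
    while exponent:
        if exponent & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        exponent >>= 1
    return result


INV3 = gf_inv(3)        # inverse of the field element x + 1


def recover_L(three_L: int) -> int:
    """Recover L from 3L = 2L xor L."""
    return gf_mul(three_L, INV3)


if __name__ == "__main__":
    L = random.getrandbits(128)
    three_L = double(L) ^ L
    print(recover_L(three_L) == L)
```

Since the inverse of 3 is a fixed field constant, it only needs to be computed once; the recovery itself is a single field multiplication.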

5.4.2 Fault model A at round 8

Assume that the fault is injected on a full diagonal at the end of round 8. For example, suppose faults are injected to the state $s^{i,(8),(AK)}_{jk}$ where $i \in \{1, 2\}$ according to Figure 5.5. The diagonal consists of four bytes that can have jk indexes as: {00, 11, 22, 33}, {01, 12, 23, 30}, {02, 13, 20, 31} or {03, 10, 21, 32}. Injecting faults to a full diagonal seems infeasible; however, in software implementations running on 32-bit CPUs, a fault to one instruction can distribute to four bytes (see for example [106]).

Figure 5.4: Graphical representation of round 9 attack to retrieve the value of $(3L)_{00}$. (Blocks 1 and 2 pass through SB and SR of round 10 and are XOR-ed with $K^{(10)}$ and the masks L and 2L to give $C_1$ and $C_2$; the XOR of the two blocks gives 3L. Legend: constant value due to fault injection; known from ciphertext; unknown byte.)

In this case, a MixColumn operation is involved. Hence, it is required to know one full column of the internal state in order to reverse the MixColumn operation. Again the injected faults make the four bytes in the diagonal a constant value and they will remain identical through rounds 9 and 10. XOR-ing the ciphertext blocks will retrieve four bytes of the 3L mask. For instance, if faults are injected into the diagonal {00, 11, 22, 33}, then the bytes $\{(3L)_{00}, (3L)_{13}, (3L)_{22}, (3L)_{31}\}$ will be retrieved.

Repeating the experiment with another faulty diagonal for plaintext blocks $i \in \{9, 10\}$ will retrieve four bytes of the mask $2^8(3L)$. However, note that these retrieved bytes are each shifted one byte to the left of the bytes in the mask 3L. Hence, four diagonal fault injections can retrieve the original mask L completely. That is, in total, eight faulty blocks are needed where each block has a faulty diagonal.
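The mapping from a faulty diagonal to the ciphertext bytes it reveals can be traced with a few lines of code. This is an illustrative Python sketch only; it assumes that the index jk denotes row j and column k of the AES state, so that ShiftRow moves a byte from column k to column (k - j) mod 4. The helper names are hypothetical.

```python
def shift_row_dest(j: int, k: int) -> tuple[int, int]:
    """ShiftRow moves the byte in row j, column k to column (k - j) mod 4."""
    return j, (k - j) % 4

def ciphertext_positions(diagonal):
    """Trace a faulty diagonal at the end of round 8 to ciphertext byte indexes."""
    after_sr9 = [shift_row_dest(j, k) for j, k in diagonal]
    columns = {k for _, k in after_sr9}
    assert len(columns) == 1, "a diagonal must collapse into a single column"
    col = columns.pop()
    # MixColumn of round 9 mixes that column; AddRoundKey and SubBytes keep positions
    after_mc9 = [(j, col) for j in range(4)]
    # ShiftRow of round 10 spreads the column over the ciphertext
    return [shift_row_dest(j, k) for j, k in after_mc9]

# The four diagonals {00,11,22,33}, {01,12,23,30}, {02,13,20,31}, {03,10,21,32}
DIAGONALS = [[(j, (j + d) % 4) for j in range(4)] for d in range(4)]

for diag in DIAGONALS:
    print(diag, "->", ciphertext_positions(diag))
```

Under this indexing assumption the first diagonal maps to the byte indexes 00, 13, 22 and 31, and the four diagonals together cover all 16 ciphertext bytes.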

Figure 5.5: Graphical representation of round 8 attack to retrieve four bytes in L. (Blocks 1 and 2 pass through SB, SR, MC and the XOR with $K^{(9)}$ in round 9, then SB, SR and the XOR with $K^{(10)}$ and the masks L and 2L to give $C_1$ and $C_2$; the XOR of the two blocks gives 3L. Legend: constant value due to fault injection; known from ciphertext; unknown byte.)

5.4.3 Fault model B at round 9

Unlike fault model A, the fault induced by fault model B does not give a fixed output. Thus, it is necessary to collect several faulty bytes that share the same mask byte and apply a statistical fault analysis with a distinguisher in order to predict the correct value for this mask byte from all possible hypothetical values. As discussed in Section 5.4, with reference to Figure 5.3, the position of the target byte will move through the various locations in the mask $2^{i-1}L$ as subsequent blocks of the message are processed, so faults should be injected to different bytes of the AES encryption operation for different message blocks. Our attack uses the Hamming weight distinguisher, which chooses the hypothetical mask value that minimises the average Hamming weight of the faulty stuck bytes, as described in Section 5.1.3.

5.4.3.1 Retrieving one byte

As in Sections 5.4.1 and 5.4.2, this attack aims to obtain information on the content of the mask $2^i 3L$. If $2^i 3L$ is retrieved completely, then it is easy to calculate the original mask L. Suppose that a fault is injected on a certain byte at the end of round 9 ($s^{i,(9),(AK)}_{jk}$). The byte index (jk) will vary depending on the value of the block index (i). The attack is performed in four steps as follows:

1. Collect ciphertext bytes that share two mask bytes $\{(2^i L)_{32}, (2^{i+8} L)_{32}\}$. The beginning of Section 5.4 demonstrates that a maximum of 48 faulty ciphertext bytes can be collected that share two mask bytes.

2. Collect another 48 faulty ciphertext bytes that share two mask bytes $\{(2^{i+1} L)_{32}, (2^{i+9} L)_{32}\}$ from blocks consecutive to the blocks in step 1. The index of each faulty byte in the first set is the same as the index of the corresponding faulty byte in the second set.

3. XOR the two faulty ciphertext bytes in each pair from consecutive blocks to eliminate the shared sub-key byte and obtain only two mask bytes $\{(2^i 3L)_{32}, (2^{i+8} 3L)_{32}\}$, and their continued mask $(2^{i+128} 3L)_{32}$.

4. Use the Hamming weight distinguisher to predict the correct value for $\{(2^i 3L)_{32}, (2^{i+8} 3L)_{32}\}$ and their continued mask $(2^{i+128} 3L)_{32}$ from $2^{16}$ possible candidates.

This fault attack is demonstrated with the following example:

Steps (1-2) First collect (2 × 48 = 96) faulty bytes in which 48 bytes share the two mask bytes $\{L_{32}, (2^8 L)_{32}\}$, and the other 48 bytes share their consecutive mask bytes $\{(2L)_{32}, (2^9 L)_{32}\}$. These 96 bytes can be obtained using three sets: A, B and G, where each set contains 32 blocks arranged as 16 pairs of consecutive blocks (see Figure 5.6). Set A shares the two mask bytes $\{L_{32}, (2L)_{32}\}$, set B shares $\{(2^8 L)_{32}, (2^9 L)_{32}\}$ and set G shares $\{(2^{128} L)_{32}, (2^{129} L)_{32}\}$. Remember that the mask bytes $(2^{128} L)_{32}$ and $(2^{129} L)_{32}$ are continued masks of $\{L_{32}, (2^8 L)_{32}\}$ and $\{(2L)_{32}, (2^9 L)_{32}\}$ respectively. The targeted block indexes in each set are:

Set A:
i ∈ { 1, 9, 17, 25, 33, 41, 49, 57, 65, 73, 81, 89, 97, 105, 113, 121,
      2, 10, 18, 26, 34, 42, 50, 58, 66, 74, 82, 90, 98, 106, 114, 122 }

Set B:
i ∈ { 9, 17, 25, 33, 41, 49, 57, 65, 73, 81, 89, 97, 105, 113, 121, 129,
      10, 18, 26, 34, 42, 50, 58, 66, 74, 82, 90, 98, 106, 114, 122, 130 }

Set G:
i ∈ { 129, 137, 145, 153, 161, 169, 177, 185, 193, 201, 209, 217, 225, 233, 241, 249,
      130, 138, 146, 154, 162, 170, 178, 186, 194, 202, 210, 218, 226, 234, 242, 250 }

The index (jk) of the faulty ciphertext byte that corresponds to each block in every row of all sets A, B and G is:

jk ∈ {32, 31, 30, 23, 22, 21, 20, 13, 12, 11, 10, 03, 02, 01, 00, 33}

However, to target each ciphertext byte, a fault is injected to its corresponding internal state byte $s^{i,(9),(AK)}_{jk'}$ where $k' = (k + j) \bmod 4$. The faulty ciphertext byte indexes are not the same as the targeted internal byte indexes because of the last ShiftRow operation.
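The block indexes of the three sets and the byte index targeted in each block follow a regular pattern, which the short Python sketch below reproduces. It is only an illustration of the bookkeeping; the function names are hypothetical and the linear byte position 0 to 15 is assumed to correspond to the indexes 00 to 33.

```python
def block_pairs(first_block: int, count: int = 16):
    """Two rows of block indexes: every 8th block, and the block right after it."""
    row1 = [first_block + 8 * t for t in range(count)]
    row2 = [i + 1 for i in row1]
    return row1, row2

SET_A = block_pairs(1)
SET_B = block_pairs(9)
SET_G = block_pairs(129)

def jk_sequence(count: int = 16):
    """Ciphertext byte index (jk) targeted in the t-th block of each row."""
    seq = []
    for t in range(count):
        pos = (14 - t) % 16      # linear position: 0 is index 00, 15 is index 33
        seq.append(f"{pos // 4}{pos % 4}")
    return seq

def state_index(jk: str) -> str:
    """Round-9 state byte jk' that lands on ciphertext byte jk: k' = (k + j) mod 4."""
    j, k = int(jk[0]), int(jk[1])
    return f"{j}{(k + j) % 4}"

for block, jk in zip(SET_A[0], jk_sequence()):
    print(f"block {block:3d}: ciphertext byte {jk}, inject at state byte {state_index(jk)}")
```

The target byte starts at index 32 and moves one byte position towards index 00 every eight blocks, which is why consecutive entries of each row are eight blocks apart.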

Figure 5.6: Mask bytes targeted according to the position of faulty bytes (colour figure). (Rows show the byte indexes 00 to 33 of the masks $L$, $2L$, $2^8 L$, $2^9 L$, ..., $2^{248} L$, $2^{249} L$. Legend: Set A bytes; Set B bytes; Set G bytes; unknown byte.)

Step (3) XOR-ing two ciphertext bytes of the same index jk from two consecutive blocks will give the following result:

$$
\begin{aligned}
[C_{i+1}]_{(jk)} &= s^{i+1,(10),(SR)}_{jk} \oplus K^{(10)}_{(jk)} \oplus (2^i L)_{jk} \\
&= (2^i L)_{jk} \oplus K^{(10)}_{(jk)} \oplus s^{i+1,(10),(SB)}_{jk'} \\
&= (2^i L)_{jk} \oplus K^{(10)}_{(jk)} \oplus SB(s^{i+1,(9),(AK)}_{jk'}) \\
[C_{i+2}]_{(jk)} &= (2^{i+1} L)_{jk} \oplus K^{(10)}_{(jk)} \oplus SB(s^{i+2,(9),(AK)}_{jk'}) \\
[C_{i+1}]_{(jk)} \oplus [C_{i+2}]_{(jk)} &= (2^i 3L)_{jk} \oplus SB(s^{i+1,(9),(AK)}_{jk'}) \oplus SB(s^{i+2,(9),(AK)}_{jk'})
\end{aligned}
$$

where $k' = (k + j) \bmod 4$. The value of $([C_{i+1}]_{(jk)} \oplus [C_{i+2}]_{(jk)})$ is calculated from every two consecutive blocks in each set. This yields the following equations:

Set A:

$$[C_{i+1}]_{(jk)} \oplus [C_{i+2}]_{(jk)} = (3L)_{32} \oplus SB(s^{i+1,(9),(AK)}_{jk'}) \oplus SB(s^{i+2,(9),(AK)}_{jk'})$$

Set B:

$$[C_{i+1}]_{(jk)} \oplus [C_{i+2}]_{(jk)} = (2^8 3L)_{32} \oplus SB(s^{i+1,(9),(AK)}_{jk'}) \oplus SB(s^{i+2,(9),(AK)}_{jk'})$$

Set G:

$$[C_{i+1}]_{(jk)} \oplus [C_{i+2}]_{(jk)} = (2^{128} 3L)_{32} \oplus SB(s^{i+1,(9),(AK)}_{jk'}) \oplus SB(s^{i+2,(9),(AK)}_{jk'})$$

where the value of $(2^{128} 3L)_{32}$ is uniquely determined by the values of $(3L)_{32}$ and $(2^8 3L)_{32}$, as discussed previously. Each set gives 16 values for $([C_{i+1}]_{(jk)} \oplus [C_{i+2}]_{(jk)})$, and in total 48 values.

The S-box in AES is resistant to differential analysis. Thus, knowing the XOR of $SB(s^{i+1,(9),(AK)}_{jk'})$ and $SB(s^{i+2,(9),(AK)}_{jk'})$ uniquely determines neither $s^{i+1,(9),(AK)}_{jk'}$ nor $s^{i+2,(9),(AK)}_{jk'}$. However, the injected faults will bias the faulty internal bytes to the all-zero byte. One can therefore proceed by assuming that one of the faulty bytes is zero, namely that $s^{i+2,(9),(AK)}_{jk'} = 0$. This assumption is valid only 50% of the time. Then, a statistical distinguisher is applied to the value of $s^{i+1,(9),(AK)}_{jk'}$ that is determined by this assumption.

Step (4) For each of the $2^{16}$ candidates for $(3L)_{32}$ and $(2^8 3L)_{32}$, compute the value for $s^{i+1,(9),(AK)}_{jk'}$ in sets A, B and G. By completing this step, 48 values for $s^{i+1,(9),(AK)}_{jk'}$ are collected for each of the $2^{16}$ candidates. Use the Hamming weight distinguisher to predict the correct value for $(3L)_{32}$ and $(2^8 3L)_{32}$ and their continued mask $(2^{128} 3L)_{32}$.
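The idea behind steps 3 and 4 can be illustrated with a reduced version of the distinguisher. The Python sketch below is a simplification and not the experiment code of this thesis: it searches over a single mask byte (256 candidates) instead of two mask bytes and their continued mask ($2^{16}$ candidates), and it assumes a fault model in which a faulty byte is stuck at zero with probability p and is uniformly distributed otherwise. The names `faulty_byte`, `collect_xors` and `hw_distinguisher` are hypothetical.

```python
import random

def build_sbox():
    """Generate the AES S-box (inverse in GF(2^8), then the affine map)."""
    def gmul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            a <<= 1
            if a & 0x100:
                a ^= 0x11B
            b >>= 1
        return r
    inv = [0] * 256
    for a in range(1, 256):
        inv[a] = next(b for b in range(1, 256) if gmul(a, b) == 1)
    sbox = []
    for a in range(256):
        x = y = inv[a]
        for r in range(1, 5):
            y ^= ((x << r) | (x >> (8 - r))) & 0xFF   # rotate left by r
        sbox.append(y ^ 0x63)
    return sbox

SB = build_sbox()
SB_INV = [0] * 256
for value, image in enumerate(SB):
    SB_INV[image] = value

def faulty_byte(rng, p):
    """Assumed fault model: stuck at zero with probability p, otherwise uniform."""
    return 0 if rng.random() < p else rng.randrange(256)

def collect_xors(mask_byte, pairs, p, rng):
    """XOR of two faulty ciphertext bytes; the sub-key byte has already cancelled."""
    return [mask_byte ^ SB[faulty_byte(rng, p)] ^ SB[faulty_byte(rng, p)]
            for _ in range(pairs)]

def hw_distinguisher(xors):
    """Return the mask candidate that gives the lowest total Hamming weight."""
    def score(cand):
        # assume the second faulty byte is zero, then solve for the first one
        return sum(bin(SB_INV[d ^ cand ^ SB[0]]).count("1") for d in xors)
    return min(range(256), key=score)

rng = random.Random(2018)
guess = hw_distinguisher(collect_xors(0xA7, 48, 0.5, rng))
print(hex(guess))
```

With p = 1 the correct candidate is the only one that yields a Hamming weight of zero, so the guess is always right; for smaller p the outcome is probabilistic and depends on the number of faulty pairs collected.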

5.4.3.2 Retrieving all bytes

Extending this attack to determine the remaining mask bytes requires careful manipulation of the fault injections. The timing of consecutive fault injections is critical to the success of the attack. If the process of recovering a byte after a fault injection is still in progress, injecting a subsequent fault will cause multiple faults in the internal state. This makes the attack impractical. However, allowing a delay after recovering the first byte before injecting the subsequent fault results in the first retrieved mask byte being shifted out of the internal state. The retrieved byte is then no longer useful.

All of the bytes in a mask can be obtained by performing fifteen consecutive iterations of fault injections with the appropriate timing as discussed above. This means that any internal state contains at most two faulty bytes. Most modern devices come with 16-bit or 32-bit registers, which makes faulting two bytes at a time feasible. This attack requires a message of at least 3722 full blocks encrypted using 128-bit AES in XEX mode. Note that this attack does not require all the 3722 blocks to be faulted. The steps to retrieve the entire mask L (see Figure 5.7) are as follows:

1. For blocks $(1 \le i \le 250)$, perform the fault attack, as in Section 5.4.3.1 and the example at the beginning of this section, on certain blocks (shown as orange bytes in Figure 5.7) to retrieve the two mask bytes $\{(3L)_{32}, (2^8 3L)_{32}\}$ and their continued byte $(2^{128} 3L)_{32}$. Note that the byte $(2^{128} 3L)_{32}$ continues to appear as $(2^{136} 3L)_{31}, (2^{144} 3L)_{30}, \cdots, (2^{240} 3L)_{00}, (2^{248} 3L)_{33}$.

2. For blocks $(1 + 248j) \le i \le (250 + 248j)$ where $j \in \{1, \ldots, 14\}$, perform the same attack as in step 1. Each iteration is shown in Figure 5.7 as successive coloured bytes as yellow, red, ..., blue, grey, pink, and lime. Each iteration retrieves extra bytes and starts just after the previous one.

3. Work backwards to calculate the mask bytes and begin with the last faulty block (shown as lime in Figure 5.7). Use Equation 5.1 during the transition from one iteration of faults to the previous iteration. For example, $(2^{3464} 3L)_{33}$ (grey byte) can be calculated from $(2^{3464} 3L)_{00}$ (pink byte) and $(2^{3472} 3L)_{32}$ (lime byte). That is, the byte retrieved in the last iteration is not lost.

4. Repeat this approach working backward every 250-block iteration until $2^{128} 3L$ is entirely retrieved. Note that any internal state contains at most two faulty bytes.

5. Compute the original mask L from $2^{128} 3L$ using the primitive polynomial.
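Two parts of the procedure above are simple arithmetic and can be sketched in Python: the block ranges of the fifteen iterations, and the final step that maps $2^{128} 3L$ back to L. This is a hedged illustration rather than the code used in this work; it assumes the field $GF(2^{128})$ defined by $x^{128} + x^7 + x^2 + x + 1$ and uses hypothetical helper names. The inverse is computed by exponentiation, which is one of several possible ways to undo the multiplication.

```python
POLY = (1 << 128) | 0x87         # x^128 + x^7 + x^2 + x + 1

def iteration_ranges(iterations: int = 15):
    """Block range (first, last) covered by each iteration of fault injections."""
    return [(1 + 248 * j, 250 + 248 * j) for j in range(iterations)]

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^128) modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a >> 128:             # degree reached 128, so reduce
            a ^= POLY
        b >>= 1
    return r

def gf_pow(a: int, e: int) -> int:
    """Square-and-multiply exponentiation in GF(2^128)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def gf_inv(a: int) -> int:
    """Inverse of a non-zero element, computed as a^(2^128 - 2)."""
    return gf_pow(a, (1 << 128) - 2)

def recover_L(value: int, doublings: int = 128) -> int:
    """Undo the multiplication by 2^doublings * 3 to obtain the original mask L."""
    factor = gf_mul(gf_pow(2, doublings), 3)
    return gf_mul(value, gf_inv(factor))

ranges = iteration_ranges()
print(ranges[0], ranges[-1])     # first and last block range
```

The last range ends at block 250 + 248 × 14 = 3722, which matches the minimum message length stated above; consecutive ranges share two blocks.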

Figure 5.7: Technique to retrieve all bytes of $2^{128} 3L$ (colour figure). (Rows show the byte indexes 00 to 33 of the masks $3L$, $2^8 3L$, $2^{16} 3L$, ..., $2^{3720} 3L$. Legend: retrieve first mask byte; retrieve second mask byte; retrieve third mask byte; retrieve 13th mask byte; retrieve 14th mask byte; retrieve last two bytes; backward calculated bytes; unknown byte.)

5.5 Experimental results

The outcomes of the fault attack approaches discussed in Sections 5.3, 5.4.1 and 5.4.2 of this chapter are predictable. That is, the attacks are guaranteed to be successful as long as the fault requirements are satisfied. On the other hand, the success rates of the approaches described in Sections 5.4.3.1 and 5.4.3.2 are difficult to calculate. Thus, experimental verifications of these approaches were undertaken in order to estimate the actual success rate.

A simulated experiment was performed to retrieve one byte of a secret mask given faulty ciphertexts only and using the attack method in Section 5.4.3.1. After that, the experiment was extended to retrieve the entire L mask. The experiment used 128-bit AES as the underlying block cipher. The implementation ran on a desktop computer with an Intel Core i7-4790 3.6GHz processor using the C language and the GNU GCC compiler.

5.5.1 Fault model B simulation

To simulate fault model B, two implementations were used to generate a fault: the pseudo-random C function rand() and the AES cipher with different plaintext messages. In either case, one bit from the output was used to determine when the stuck-at-zero action occurs. A bit value of zero means no fault should be injected whereas a bit value of one indicates that a random byte from the output should be injected into a certain byte in another implementation of AES operating in XEX mode.

5.5.2 Retrieving one byte

As a preliminary step, several sub-experiments with different numbers of faulty bytes were performed to determine how many faulty bytes are needed to obtain a high success rate. The experiments started with two faulty bytes that share one mask byte. Then, the number of faulty bytes was increased by two for every following iteration until the number of faulty bytes reached 32, such that every targeted block has only one faulty byte. For each iteration, the success rate was computed over 1000 simulations using different plaintext messages and nonces. These experiments were performed twice: one run used faults generated from the rand() function and the second used faults from the output of AES in XEX mode. Note that these two implementations were used only to generate faults. The actual encryption function, where faults are injected, used a second implementation of AES in XEX mode.

Secondly, the success rate was computed for the attack using 96 faulty bytes, as described in Section 5.4.3.1. This approach provides the Hamming weight distinguisher with 48 faulty bytes that share only two mask bytes.

The results of this experiment are presented in Table 5.3 for the number of faulty bytes ranging from 2 to 32 and lastly 96. Note that the success rates in both columns are close to each other. The last row shows that 96 faulty ciphertext bytes are enough to allow one byte in the secret mask to be retrieved with a success rate of 99.9%.

Table 5.3: Success rate of fault attacks using fault model B at round 9.

Number of       Success rate (1000 iterations)   Success rate (1000 iterations)
faulty bytes    rand() as PRF                    AES as PRF
 2              0.280                            0.289
 4              0.233                            0.238
 6              0.405                            0.377
 8              0.467                            0.472
10              0.528                            0.556
12              0.637                            0.631
14              0.673                            0.703
16              0.753                            0.744
18              0.787                            0.790
20              0.821                            0.839
22              0.873                            0.861
24              0.885                            0.888
26              0.909                            0.901
28              0.930                            0.929
30              0.940                            0.936
32              0.957                            0.956
96              0.999                            0.999

Finally, the success rate to retrieve one mask byte was evaluated by considering more relaxed injection probabilities (p) to bias the faulty byte towards zero. Figure 5.8 compares the success rate and data complexity for $p \in \{0.5, 0.375, 0.25\}$. Note that the success rate is about 0.96 when p = 0.5 and about 0.87 when p = 0.375 in the case of one faulty byte per block, and these success rates increase to 0.999 and 0.975 respectively in the case of two faulty bytes per block. That is, for low injection probabilities, if an attacker is able to fault more bytes per block, the success rate will increase.

Figure 5.8: Success rate to determine one mask byte for different probabilities. (Two panels: one faulty byte per block, with 2 to 32 faulty bytes, and two faulty bytes per block, with 4 to 96 faulty bytes. Horizontal axis: number of faulty bytes; vertical axis: success rate; curves for p = 0.500, 0.375 and 0.250.)

5.5.3 Retrieving the whole mask

An experiment was performed to demonstrate the success rate of retrieving the entire secret mask $2^{128} 3L$ as discussed in Section 5.4.3.2. Each byte was retrieved using 96 faulty bytes. The success rate was also computed over 1000 different plaintext messages, each of length 3722 blocks and each with a different nonce. It was found that the success rate to retrieve every bit in the mask $2^{2144} 3L$ is 99.2% when using AES as the pseudo-random function, and 99.3% when using the rand() function.

5.6 Application to authenticated encryption modes

Certain AE schemes that use the doubling masking technique are examined, including the candidates of the ongoing CAESAR competition OTR, COPA, ELmD and SHELL, and other AE modes such as ISO 19772 OCB2 [10]. All of these AE block cipher modes use the masking technique of XEX/XE mode. These modes can be divided into two sets: one set uses nonce-based masks (L = E(N)) while the second set uses masks that are derived from a constant value (L = E(const)). The first set contains only the OTR and OCB2 modes while the remaining modes belong to the second set.

In the case of a ciphertext-only fault attack, masks in the second set are much easier to retrieve using the statistical fault attack. The idea is to apply statistical fault attacks to the first block of different messages in order to obtain ($K_{10} \oplus E(const)$). Then, statistical fault attacks are applied for a second time to the second block in order to obtain ($K_{10} \oplus 2E(const)$). After that, the two values are XOR-ed to obtain 3E(const) and consequently $K_{10}$. On the other hand, statistical fault attacks cannot be applied directly to modes in the first set. One needs to use a multi-block message as described in Section 5.4.

A summary of the relevance of our techniques against the secret masks in these authenticated encryption modes is presented in Table 5.4. The (X) mark indicates that the fault attack technique in the corresponding section of this chapter can be applied to the mode, whereas the (×) mark indicates that our technique cannot be applied. Note that attacks in Section 5.4 cannot be applied to OTR as OTR (see Chapter 4) is XE-based and not XEX-based. Table 5.4 shows that COPA, ELmD and SHELL are XEX-based modes, whereas OTR is an XE-based mode. OCB2 is the only mode that is based on both XEX and XE modes. The (?) symbol in Table 5.4 indicates that the secret mask in these modes can be retrieved more effectively by direct application of SFA [18, 19] than by our technique in Section 5.4, since the masks are constant-based. Note that SFA can be used to retrieve the secret key in COPA, ELmD and SHELL, which are based on constant-based masks. However, when the masks are based on a nonce, SFA can be applied to the XE part only in OCB2 and OTR. In comparison, our retrieval attack is applicable to the XEX part of OCB2, but not applicable to the XE part in OCB2 and OTR.

Table 5.4: Summary of attacks to retrieve secret masks in certain AE modes.

AE mode     Mode type   Mask type        Our fault technique            SFA [19]
                                         Section 5.3    Section 5.4     Retrieve L
COPA (?)    XEX         Constant-based   X              X               X
ELmD (?)    XEX         Constant-based   X              X               X
SHELL (?)   XEX         Constant-based   X              X               X
OCB2        XEX         Nonce-based      X              X               ×
OCB2        XE          Nonce-based      X              ×               X
OTR         XE          Nonce-based      X              ×               X

5.7 Countermeasures

The success of the fault attacks presented in this chapter depends on the properties of the primitive polynomial used to construct the finite field for updating mask values in XEX mode. The polynomial used in Section 5.1.2 (also adopted by OCB2, COPA, ELmD, SHELL and OTR) is sparse and the feedback is obtained only from bits located in the final byte. Changing the mask updating function is one approach to prevent our attacks. This work outlines two alternative techniques for the mask updating function so that the proposed attacks are not applicable.

The technique used in the CAESAR candidate OCB3 [30] is an alternative option for updating masks which makes our attacks irrelevant. Although OCB3 still uses the doubling mechanism, the masks also depend on an index and each mask is XOR-ed with the previous one, which prevents the repetition of mask bytes.

Another approach to preclude our attacks is to use a different function for incrementing masks. Krovetz and Rogaway [29] investigate several maximal 128-bit Linear Feedback Shift Registers (LFSRs); their internal states could be used as secret masks. Examples of efficient maximal LFSRs that have performance comparable to the doubling masking are:

$$
\begin{aligned}
S(X, Y) &= (Y, (X \ll 1) \oplus (X \gg 1) \oplus (Y \wedge 148)) \\
S(A, B, C, D) &= (C, D, B, (A \ll 1) \oplus (A \gg 1) \oplus (D \wedge 107)) \\
S(A, B, C, D) &= (C, D, B, (A \ll 1) \oplus (A \gg 1) \oplus (D \ll 15))
\end{aligned}
$$

where $|X| = |Y| = 64$ and $|A| = |B| = |C| = |D| = 32$. These LFSRs do not use the most significant bit of the previous mask to increment the next one and do not allow repetition of mask bytes. Thus, using one of these LFSRs for incrementing masks will avoid our attacks.

5.8 Conclusion

The masking technique in XEX mode acts as a barrier to the fault attack methods commonly used to recover the secret key of the underlying block cipher. This chapter presented different fault attack techniques against the generic XEX mode for block ciphers by targeting the secret masks used.

Firstly, three fault attack methods were demonstrated in this chapter that convert XEX mode into ECB mode by forcing the secret mask L to zero. Injecting a permanent fault into a bit (or a byte) anywhere in the register containing the secret mask L, except for the final byte, will overcome the masking barrier after only 128 (resp. 16) blocks. This can also be achieved using transient faults on a few consecutive message blocks. For software implementations of XEX mode, it was demonstrated that L can be eliminated through instruction-skipping faults.

Secondly, instead of eliminating L, a detailed ciphertext-only attack can be performed to retrieve L. Because the polynomial used in the doubling masking technique allows repetition of mask bytes, SFA can be used with a collection of faulty ciphertext blocks to retrieve the bytes of L. Finding the secret mask then allows the attacker to retrieve the secret key using the same faulty blocks.

Thirdly, the ciphertext-only attacks to retrieve L were verified through software simulations. In the case of fault model B, it was found that the success rate of retrieving one byte of L is 99.9%, and that of retrieving the entire mask is 99.2%.

In addition, certain authenticated encryption modes were identified that are susceptible to our proposed fault attack techniques. It has been found that the masks used in the AE modes COPA, ELmD, SHELL, OCB2 and OTR can be retrieved or eliminated. The masks in all these modes use the same primitive polynomial, which makes them vulnerable to our attack.
Our work demonstrates that it is the mask updating function $f(x) = x^{128} + x^7 + x^2 + x + 1$ that makes XEX vulnerable to these fault attacks. Hence, an efficient solution to preclude these attacks is to change this primitive polynomial to other functions that are less sparse and more conservative, such as those listed in Section 5.7.

Chapter 6

Fault analysis of AEZ

This chapter considers the application of fault analysis techniques to the CAESAR candidate AEZ [11]. As discussed in Section 2.4.5, fault attacks are a form of implementation attack which may sometimes be easier to apply and more efficient than a brute force attack to recover a secret key of an encryption algorithm.

AEZ [11] is a block cipher mode based on AES which uses three 128-bit keys and provides authenticated encryption with associated data (AEAD). Currently, AEZ is a candidate in the third round of the CAESAR cryptographic competition [6]. The algorithm has been updated several times during these three rounds. The analyses in this chapter focus on AEZ v4.2, but also investigate the applicability of these analyses to the recent version AEZ v5. The scheme claims to provide strong security and usability properties, including flexible ciphertext expansion and nonce-reuse resistance.

Analysis of the security of AEZ against cryptographic attacks, such as [135, 136, 137], has limitations in relation to feasibility: most existing analyses require birthday complexity while the others do not invalidate the security claims of this scheme. In addition, there is no published analysis of AEZ security using fault attacks.

This chapter analyses the security of AEZ against fault attacks, as AEZ makes use of AES and fault attacks on AES are well-known [16, 103]. Firstly, the proper place in the AEZ scheme to directly apply the existing differential fault attacks is identified. Then, we consider the application of known fault attacks on AES to AEZ. This chapter shows that an attacker requires six faults to uniquely determine the three 128-bit keys in either AEZ v4.2 or v5, or five faults to reduce the key search space to $2^{24}$. After that, the application of the attack is further improved by reducing the number of fault injections required. All three 128-bit keys in AEZ v4.2 can be uniquely retrieved using only three random-valued single byte fault injections. A similar approach using four fault injections can uniquely recover all three keys of AEZ v5. Reducing the number of required faults saves time and makes the attack more feasible.

This chapter is organised as follows: Section 6.1 briefly describes the AEZ scheme while Section 6.2 gives a brief review of existing analysis on AEZ. Section 6.3.1 describes existing differential fault attacks on AES, especially Mukhopadhyay's attack [16]. Section 6.3.2 considers the feasibility of performing the hardware fault model as outlined in Mukhopadhyay's attack. Section 6.4 describes an approach to retrieve all three keys of AEZ using minimal fault injections. This section summarises the steps needed to perform an experimental attack and presents a further improvement to the proposed approach. Section 6.5 discusses the feasibility of the presented fault approaches on the most recent version (v5). Section 6.6 shows the experimental results and compares the total number of fault injections in different fault approaches. Section 6.7 describes mechanisms that make fault analysis on AEZ more difficult. Section 6.8 draws a conclusion.

6.1 Description of AEZ

In this section, we briefly describe the AEZ scheme, and refer the reader to [11, 138] for more technical details. AEZ [11] is a block cipher mode that provides authenticated encryption and was submitted to the ongoing cryptographic competition CAESAR. The work in this chapter mainly focuses on AEZ v4.2 and v5. AEZ v4.2 [138] was the latest version when this work was first conducted. At the time of writing this chapter, December 2017, the current version of AEZ is v5 [139]. For simplicity, we omit the suffixes "v4.2" and "v5" when referring to features common to both versions.

AEZ has the following specification regarding certain input/output parameters:

• AEZ supports arbitrary length for plaintext message M and associated data A.

• The default length for the key is $|K| = 384$ bits and it is recommended that $|K| \ge 128$.

• The nonce length is recommended to be $|N| \le 128$. However, AEZ allows $|N| = 0$ and also allows different nonce lengths for different messages under the same key.

• The number of bytes of a fixed authentication block is ABYTES. The default value is ABYTES = 16 and it is recommended that ABYTES ≤ 16.

• AEZ uses an authenticator τ that measures the ciphertext expansion in bits over its plaintext. The authenticator is τ = 8 × ABYTES.

• The length of the ciphertext output is given as $|C| = |M| + \tau$.

• AEZ uses two AES functions, AES4 and AES10, such that AES4 consists of four rounds whereas AES10 consists of ten rounds.

The block cipher encryption function mainly depends on two AES round functions: AES4 and AES10. Let the composition of the three main functions, SubBytes (SB), ShiftRow (SR) and MixColumn (MC), used in the AES round function be defined as follows:

$$\mathrm{aesr}(X) = MC \circ SR \circ SB(X)$$

AES4 is a four-round function of aesr(·) as follows:

$$\mathrm{AES4}_K(X) = \mathrm{aesr}(\mathrm{aesr}(\mathrm{aesr}(\mathrm{aesr}(X \oplus K_0) \oplus K_1) \oplus K_2) \oplus K_3) \oplus K_4$$

where $K = (K_0, K_1, K_2, K_3, K_4)$. AES10 is defined similarly, but with ten rounds of aesr(·) instead of four. Note that this definition makes AES10 equivalent to the standard AES, except that AES10 applies the MixColumn operation in the last round (round 10) and uses a sequence of keys as sub-keys all built from the three 128-bit keys.

The AEZ scheme provides several important features. Firstly, AEZ claims to provide nonce-reuse/misuse-resistant authenticated encryption (MRAE). According to Rogaway and Shrimpton [31], the confidentiality of an MRAE scheme is only compromised in that repeated messages with the same (N, A, M) can be detected when the nonce is repeated, whereas the integrity assurance is not affected at all. Secondly, AEZ can perform parallel operations such that the performance is comparable to AES in CTR mode. On the other hand, AEZ is not online, according to the definition of Bellare et al. [140]. In addition, AEZ is not easy to implement in hardware [138].

AEZ (see Figure 6.1) uses four different modules for processing the plaintext message and associated data as follows:

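Figure 6.1: AEZ scheme. (The tweak T = (N, A, τ) is the input to AEZ-hash, which outputs ∆; the message M is extended with τ zero bits and, depending on the length, processed by AEZ-prf, AEZ-tiny or AEZ-core to produce C.)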

• AEZ-hash: this function uses the tweak T = (N, A, τ), formed from the nonce N, associated data A and authenticator τ, to create a mask ∆ that is used in the other modules (AEZ-prf, AEZ-tiny or AEZ-core).

• AEZ-prf: this module is used when M = ε. It generates an output of length τ that acts as an authentication tag.

• AEZ-tiny: this module is used when 0 < |M| < 256 − τ. AEZ-tiny uses a balanced Feistel network where the number of rounds is between 8 and 24, depending on the length of the plaintext message. The round function is solely AES4.

• AEZ-core: this module is used when |M| ≥ 256 − τ. This module combines two modes, EME [132] and OTR [9], in one design using both functions AES10 and AES4. Each pair of blocks requires 5 AES4 calls, which is equivalent to 10 AES rounds per block. Neither AEZ-core nor AEZ-tiny requires the AES-inverse operation. The work in this chapter considers only the AEZ-core scheme, which is illustrated in Figure 6.2 and Table 6.1.
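The length rules above can be summarised in a few lines of Python. This is only an illustrative sketch of the parameter arithmetic and module selection as described in this section; the function names are hypothetical and do not come from the AEZ specification, and all lengths are assumed to be given in bits.

```python
def authenticator_bits(abytes=16):
    """Return tau = 8 * ABYTES, the ciphertext expansion in bits."""
    return 8 * abytes


def select_module(message_bits, tau):
    """Pick the AEZ module from the message length |M| (in bits)."""
    if message_bits == 0:
        return "AEZ-prf"        # M is the empty string
    if message_bits < 256 - tau:
        return "AEZ-tiny"       # 0 < |M| < 256 - tau
    return "AEZ-core"           # |M| >= 256 - tau


def ciphertext_bits(message_bits, tau):
    """Return |C| = |M| + tau."""
    return message_bits + tau
```

With the default ABYTES = 16, τ is 128 bits, so under these rules any message of at least 128 bits is handled by AEZ-core.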

Figure 6.2: AEZ-core scheme.

AEZ works by appending a fixed authentication block of zero-valued bytes (ABYTES) to the plaintext and encrypting the result using a mask ∆. This mask is generated by the function AEZ-hash, which takes as input a tweak T built from a nonce (N), associated data (A) and the authenticator (τ). Integrity assurance is provided by decrypting the ciphertext and verifying the presence of the all-zero authentication block.

The AEZ secret key (K) is 3 × 128 = 384 bits long and is arranged as $I \,\|\, J \,\|\, L$, where $|I| = |J| = |L| = 128$. However, if $|K| \neq 384$, the key is transformed into a 384-bit string using the hash function BLAKE2b [141] as follows:

$$I \,\|\, J \,\|\, L = \begin{cases} K & \text{if } |K| = 384 \\ \mathrm{BLAKE2b}(K) & \text{if } |K| \neq 384 \end{cases}$$

As noted above, AEZ-core uses two block encryption functions, AES4 and AES10. These are represented in Figure 6.2 as small rectangles (AES4) and larger squares (AES10) and are denoted in the cipher specification as $E_K^{j,i}(X)$, where K is the key, X is the input string, and the integer parameters j and i specify which of the two functions (AES4 or AES10) is used and how the AES round sub-keys are obtained from I, J and L.
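The key-processing rule can be sketched in Python with the standard `hashlib` module. This is a minimal illustration, not the reference implementation: the function name `derive_subkeys` is hypothetical, and the sketch assumes that BLAKE2b is used unkeyed with a 48-byte (384-bit) digest and that I, J and L are the three consecutive 16-byte pieces of the result.

```python
import hashlib


def derive_subkeys(key: bytes):
    """Split K into (I, J, L); hash K to 384 bits first if |K| != 384."""
    if len(key) != 48:  # 48 bytes = 384 bits
        # Assumption: plain (unkeyed) BLAKE2b with a 48-byte output.
        key = hashlib.blake2b(key, digest_size=48).digest()
    return key[0:16], key[16:32], key[32:48]
```

A 48-byte key is therefore used as is, while a key of any other length is first mapped to 48 bytes.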

Table 6.1: AEZ-core algorithm [11].

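Algorithm 1: AEZ-core Enciphering Function
 1. $\Delta \leftarrow \text{AEZ-hash}(K, T)$
 2. $(M_1 M'_1, \ldots, M_m M'_m, M_u M_v, M_x M_y) \leftarrow M$ where $|M_1| = \cdots = |M'_m| = |M_x| = |M_y| = 128$, $d = |M_u| + |M_v| < 256$
 3. for $i \leftarrow 1$ to $m$ do $W_i \leftarrow M_i \oplus E_K^{1,i}(M'_i)$; $X_i \leftarrow M'_i \oplus E_K^{0,0}(W_i)$ od
 4. if $d = 0$ then $X \leftarrow X_1 \oplus \ldots \oplus X_m$
 5. else if $d < 128$ then $X \leftarrow X_1 \oplus \ldots \oplus X_m \oplus E_K^{0,4}(M_u)$
 6. else $X \leftarrow X_1 \oplus \ldots \oplus X_m \oplus E_K^{0,4}(M_u) \oplus E_K^{0,5}(M_v)$
 7. fi
 8. $S_x \leftarrow M_x \oplus \Delta \oplus X \oplus E_K^{0,1}(M_y)$; $S_y \leftarrow M_y \oplus E_K^{-1,1}(S_x)$; $S \leftarrow S_x \oplus S_y$
 9. for $i \leftarrow 1$ to $m$ do
10.    $S'_i \leftarrow E_K^{2,i}(S)$; $Y_i \leftarrow W_i \oplus S'_i$; $Z_i \leftarrow X_i \oplus S'_i$;
11.    $C'_i \leftarrow Y_i \oplus E_K^{0,0}(Z_i)$; $C_i \leftarrow Z_i \oplus E_K^{1,i}(C'_i)$
12. od
13. if $d = 0$ then $C_u \leftarrow C_v \leftarrow \varepsilon$; $Y \leftarrow Y_1 \oplus \ldots \oplus Y_m$
14. else if $d < 128$ then
15.    $C_u \leftarrow M_u \oplus E_K^{-1,4}(S)$; $C_v \leftarrow \varepsilon$; $Y \leftarrow Y_1 \oplus \ldots \oplus Y_m \oplus E_K^{0,4}(C_u)$
16. else $C_u \leftarrow M_u \oplus E_K^{-1,4}(S)$; $C_v \leftarrow M_v \oplus E_K^{-1,5}(S)$; $Y \leftarrow Y_1 \oplus \ldots \oplus Y_m \oplus E_K^{0,4}(C_u) \oplus E_K^{0,5}(C_v)$
17. fi
18. $C_y \leftarrow S_x \oplus E_K^{-1,2}(S_y)$; $C_x \leftarrow S_y \oplus \Delta \oplus Y \oplus E_K^{0,2}(C_y)$;
19. return $C \leftarrow (C_1 C'_1, \ldots, C_m C'_m, C_u C_v, C_x C_y)$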

AEZ has been updated several times since the initial proposal was submitted to CAESAR. At the time of writing, the current version of AEZ is v5, but v4.2 was the current version when this study was first conducted. Sections 6.1.1 and 6.1.2 highlight the changes between v4.2 and v5.

6.1.1 AEZ v4.2

Version 4.2 is a minor revision of v4 and v4.1. For v4.2, the default key length is set to 384 bits and the key processing mechanism is changed to use the hash function BLAKE2b [141]. Also, the encryption function $E_K^{j,i}(X)$ computes the ciphertext as follows:

j       i       $E_K^{j,i}(X)$      K (sequence of sub-keys)
−1      ℕ       $AES10_K(X)$        $(iJ^{\dagger}, I, J, L, I, J, L, I, J, L, I)$
0       ℕ       $AES4_K(X)$         $(iI, J, I, L, 0^{128})$
1       ℕ       $AES4_K(X)$         $(\Delta_i I, J, I, L, 0^{128})$
2       ℕ       $AES4_K(X)$         $(\Delta_i I, L, I, J, L)$
≥ 3     0       $AES4_K(X)$         $(2^{j-3} L, J, I, L, 2^{j-3} L)$
≥ 3     ≥ 1     $AES4_K(X)$         $(2^{j-3} L \oplus \Delta_i J, J, I, L, 2^{j-3} L \oplus \Delta_i J)$

† indicates a multiplication in $GF(2^n)$ using a specific primitive polynomial.

where $\Delta_i = 2^{3 + \lfloor (i-1)/8 \rfloor} \oplus ((i - 1) \bmod 8)$.
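As a worked reading of this table (an illustration only, not an additional definition), consider the two calls with i = 4 that Table 6.1 applies in the $M_u$ part, namely $E_K^{-1,4}$ and $E_K^{0,4}$:

$$E_K^{-1,4}(X) = \mathrm{AES10}_K(X) \quad \text{with sub-keys } (4J, I, J, L, I, J, L, I, J, L, I)$$

$$E_K^{0,4}(X) = \mathrm{AES4}_K(X) \quad \text{with sub-keys } (4I, J, I, L, 0^{128})$$

Reading the AES10 tuple from the right, the last three sub-keys are I (after round 10), L (after round 9) and J (after round 8). In the AES4 tuple the final sub-key is $0^{128}$ and the penultimate sub-key is again L.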

6.1.2 AEZ v5

This version changes the offsets used in $E_K^{j,i}(X)$. This revision avoids a forgery attack [142] that is possible if the same offset is used in two different masks. The new definition of $E_K^{j,i}(X)$ is simplified as follows:

j       i       $E_K^{j,i}(X)$      K (sequence of sub-keys)
−1      ℕ       $AES10_K(X)$        $(iL, I, J, L, I, J, L, I, J, L, I)$
≠ −1    ℕ       $AES4_K(X)$         $(\Delta, J, I, L, 0^{128})$

where $\Delta = j \cdot J \oplus 2^{\lceil i/8 \rceil} \cdot I \oplus (i \bmod 8) \cdot L$.
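For example, evaluating the offset for the call $E_K^{0,4}$, which Table 6.1 applies to $C_u$ (a worked instance of the formula above, using only the definitions in this section):

$$\Delta = 0 \cdot J \oplus 2^{\lceil 4/8 \rceil} \cdot I \oplus (4 \bmod 8) \cdot L = 2I \oplus 4L$$

so that $E_K^{0,4}(X) = \mathrm{AES4}_K(X)$ with sub-keys $(2I \oplus 4L, J, I, L, 0^{128})$. For the AES10 call $E_K^{-1,4}$ the tuple is $(4L, I, J, L, I, J, L, I, J, L, I)$, so the last three sub-keys remain J, L and I.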

6.2 Review of existing analysis on AEZ

The existing literature includes cryptanalysis of AEZ. There is a key recovery attack on v2 and v3 [135], which is still applicable to v4.1 [136]. AEZ v4 is vulnerable to weak keys [137], and has a serious bug that results in a forgery attack [142]. These attacks are outlined below.

Fuhr et al. [135] presented a generic key-recovery attack of birthday complexity against AEZ v2 and v3. The attack is independent of the underlying permutation, and works even if the round function is replaced by the full AES. The attack in [135] applies to AEZ under the condition that either no nonce is used or the nonce is repeated. More importantly, the key derivation used in AEZ v2 and v3 allows the recovery of the master key (K) from the sub-key (J). However, this attack does not violate the security claim since the authors of AEZ do not claim beyond-birthday security. Nonetheless, this attack shows that AEZ is not a robust scheme, since it collapses completely when the birthday bound is exceeded. Schemes usually retain some resilience at the birthday bound and do not allow full key recovery. AEZ was updated to v4 and the minor revision v4.1 to thwart the key recovery attack presented in [135].

However, Chaigneau and Gilbert [136] later showed that v4.1 is still vulnerable to such an attack. They presented a key recovery attack with birthday complexity that exploits the use of the four-round function AES4 in AEZ. Assuming I has been retrieved, a particular structure of plaintext in the last three rounds of AES4 allows an attacker to detect plaintext pairs that follow certain differential behaviour. These pairs are exploited to recover the J and L sub-keys.

Further, Mennink [137] shows that AEZ v4 has weak keys for which a distinguishing attack is possible. This observation on weak keys for AEZ is still valid to date. However, these weak keys represent only a tiny fraction of the entire key space, too small to be considered a practical breach.

AEZ v4 is also vulnerable to a forgery attack as a result of a mistake in mask instantiation. Bonnetain et al. [142] found that the definition of masks in AEZ can lead to an overlap between two masks, which allows a simple forgery attack. AEZ has since been updated to v5, which changes and simplifies the definition of mask instantiation to avoid Bonnetain et al.’s forgery attack. However, the key recovery attack with birthday complexity is still applicable and has not been thwarted.

6.3 Existing fault attacks on AES

To the best of our knowledge, there are no existing fault attacks on AEZ in the public literature. However, there are several well-known differential fault attacks against AES. Differential fault attacks on AES were first introduced by Giraud [143], whose attack requires 250 pairs of correct and faulty ciphertexts where the fault disturbs one byte at the input of the 9th round. Subsequently, Dusart et al. [144] showed that 40 pairs of correct and faulty ciphertexts are sufficient to recover the secret key of AES by inducing a fault in a byte anywhere between the 8th and 9th round. Then, Piret and Quisquater [102] were able to reveal the AES key using only two ciphertext pairs, assuming the fault occurs between the 8th round MixColumn and 9th round MixColumn. Similarly, Mukhopadhyay presented an improved differential fault attack in [16], showing that the AES key can be deduced with two pairs of correct and faulty ciphertexts. Additionally, Mukhopadhyay [16] showed that one pair of correct/faulty ciphertexts can be used to deduce the AES key when coupled with a brute-force search of $2^{32}$. This attack was further improved in [103] to reduce the brute-force search from $2^{32}$ to $2^{8}$, again using one pair of correct and faulty ciphertexts. In this attack, the improvement is obtained by exploiting the linear relationship between AES sub-keys in the key schedule algorithm. These differential fault attacks are summarised in Table 6.2.

Table 6.2: Differential fault attacks on AES.

Ref.    Year    Fault model       Fault location                                                # Faulty ciphertexts
[143]   2002    disturb 1 byte    input of 9th round                                            250
[144]   2003    disturb 1 byte    anywhere between the 8th and 9th round                        40
[102]   2003    disturb 1 byte    between the 8th round MixColumn and 9th round MixColumn       2
[16]    2009    disturb 1 byte    input of 8th round                                            2
[16]    2009    disturb 1 byte    input of 8th round                                            1 and $2^{32}$ brute-force search
[103]   2011    disturb 1 byte    input of 8th round                                            1 and $2^{8}$ brute-force search

6.3.1 Mukhopadhyay’s fault attack

In this section, we review Mukhopadhyay’s differential fault attack against AES, as we apply it repeatedly in our attack on AEZ. We choose this attack specifically because of the minimum number of fault injections it requires to recover the key and the simple analysis it follows using a few algebraic equations. However, we expect similar results could be obtained with other known differential fault attacks on AES, especially Piret and Quisquater’s attack [102].

Mukhopadhyay’s attack [16] is a differential fault attack applied to the AES block cipher. This attack is based on inducing a random-valued single-byte fault at the input of the eighth round. That is, a fault f, where $f \in \{1, \ldots, 255\}$, is induced into a single byte in the internal state of the block cipher during the encryption process. The XOR difference between the correct and faulty state matrices propagates through the last three rounds of AES, as shown in Figure 6.3. This specific location enables the fault to spread to the entire state before the ciphertext message is produced.

Figure 6.3: Propagation of fault at the input of 8th round [16].

As Figure 6.3 shows, the faults in the first column of the state matrix at the input of the tenth round SubBytes are $2F_1$, $F_1$, $F_1$ and $3F_1$. Let $x_1$, $x_8$, $x_{11}$ and $x_{14}$ denote the correct ciphertext bytes, and $x'_1$, $x'_8$, $x'_{11}$ and $x'_{14}$ represent the faulty ciphertext bytes. The corresponding bytes of the last sub-key are $K_1$, $K_8$, $K_{11}$, $K_{14}$. This pattern gives the following set of equations:

$$\begin{aligned}
2F_1 &= SB^{-1}(x_1 \oplus K_1) \oplus SB^{-1}(x'_1 \oplus K_1) \\
F_1 &= SB^{-1}(x_{14} \oplus K_{14}) \oplus SB^{-1}(x'_{14} \oplus K_{14}) \\
F_1 &= SB^{-1}(x_{11} \oplus K_{11}) \oplus SB^{-1}(x'_{11} \oplus K_{11}) \\
3F_1 &= SB^{-1}(x_8 \oplus K_8) \oplus SB^{-1}(x'_8 \oplus K_8)
\end{aligned}$$

where $F_1$, $K_1$, $K_8$, $K_{11}$ and $K_{14}$ are all unknown values in $\{0, \ldots, 255\}$.

Mukhopadhyay shows that this system of equations reduces the possibilities for $K_1$, $K_8$, $K_{11}$ and $K_{14}$ in the last sub-key of AES such that these key bytes can be uniquely determined with a probability of around 99% using only two correct/faulty ciphertext pairs. With only a single correct/faulty pair, the hypotheses for these four bytes in the last sub-key are reduced to $2^{8}$.

The same technique can be applied to the remaining columns to derive three more sets of equations. These sets cover the remaining bytes in the last sub-key. That is, Mukhopadhyay’s attack can uniquely determine the last sub-key in AES using two correct/faulty pairs, or can reduce the key space to $2^{32}$ using only one correct/faulty pair, for which a brute-force search is feasible with current computation power.

Mukhopadhyay’s attack was further improved in [103], reducing the key search space to $2^{8}$ when a single byte fault is used. The improved attack employs the AES key schedule algorithm to reduce key hypotheses from $2^{32}$ to $2^{8}$. However, this improvement cannot be directly applied to AEZ since it uses three different keys (I, J, L) that are not related.
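The filtering step expressed by the four equations above can be illustrated with a short Python sketch. It is not the implementation used in [16]: it handles only one column, assumes the standard AES last round (no MixColumn), builds the S-box from its algebraic definition, and uses hypothetical names (`column_candidates`, `COEFFS`). For every non-zero guess of $F_1$ it keeps the key bytes that satisfy all four equations.

```python
from itertools import product


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """Build the AES S-box from the field inverse and the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for n in range(1, 5):  # XOR of the four left rotations of inv
            s ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for value, image in enumerate(SBOX):
    INV_SBOX[image] = value

# Fault multiples for the key bytes in the order (K1, K14, K11, K8).
COEFFS = (2, 1, 1, 3)


def column_candidates(correct, faulty):
    """Return all (K1, K14, K11, K8) consistent with one correct/faulty pair.

    `correct` and `faulty` hold the four ciphertext bytes in the same order.
    """
    candidates = set()
    for f1 in range(1, 256):
        per_byte = []
        for x, xf, c in zip(correct, faulty, COEFFS):
            target = gf_mul(c, f1)
            per_byte.append([k for k in range(256)
                             if INV_SBOX[x ^ k] ^ INV_SBOX[xf ^ k] == target])
        candidates.update(product(*per_byte))
    return candidates
```

With a second correct/faulty pair, the two candidate sets would simply be intersected; the text above reports that this leaves a unique value with a probability of around 99%.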

6.3.2 Feasibility of Mukhopadhyay’s attack

Mukhopadhyay’s attack is very practical compared to most existing fault attacks. It makes a very weak assumption about the hardware fault required: a random fault is induced in one byte at the input of the eighth round of AES. As a preliminary step, an attacker can predict the right location by analysing other side channel information, such as counting the number of clock edges or conducting a power scan.

This single-byte random fault has been shown to be feasible using low-budget equipment. As an example, such a fault was reported in [145], obtained in experiments on a smart card integrated with an AES co-processor. This experiment successfully retrieved the AES key using Piret and Quisquater’s attack, which uses the same fault model as Mukhopadhyay’s attack. Hence, the approach outlined in this chapter can be easily verified.

6.4 New fault analysis on AEZ v4.2

In this work, we apply Mukhopadhyay’s attack to AEZ. As shown in Section 6.1, AEZ uses three unrelated 128-bit keys (I, J, L). Thus, revealing one key is not sufficient to determine the other two keys. This suggests that the required number of pairs of correct and faulty ciphertexts for AEZ will be triple that required for attacks on AES. This research is motivated by this fact, and investigates the structure of AEZ in order to decrease the required number of fault injections and so increase the practicality of the fault attack. However, most of the attacks presented in this chapter do not apply if the light, efficient function AES4 is replaced by the more secure but heavier AES10 function.

AEZ claims to provide misuse-resistant authenticated encryption (MRAE) in which the scheme preserves optimal security when the nonce is repeated. This chapter applies fault attacks to AEZ in this specific case where the nonce is reused. However, the authors of AEZ warn that omitting or repeating the nonce is not allowed unless all encrypted pairs of associated data and messages under the same key are distinct from each other. Hence, neither these attacks nor the attacks in [135] invalidate the security of the scheme when the recommended specifications are followed.

For the rest of the chapter, assume that the attacker knows ciphertext messages all generated from the same plaintext. Assume also that either the nonce is not used ($|N| = 0$) or the value of N is fixed during the encryption of different messages (nonce repeated), since this is a usual requirement for most differential fault analysis. For simplicity, suppose the associated data and the value of τ are fixed. Suppose also that $(|M| + \tau) \bmod 256 = 128$. This implies that the $M_v$ block is an empty block and that, after padding, $M_v$ has a value of $(1 \,\|\, 0^{127})$.

6.4.1 Direct application

In AEZ, we apply Mukhopadhyay’s fault attack to the AES10 encryption function in the $M_u$ part or $M_v$ part of the AEZ-core function. The $M_u$ and $M_v$ parts are the only two parts in AEZ where a direct output of the block cipher is known to the user. Hence, these two parts are the only suitable places for fault attacks. In this chapter, we target only the $M_u$ part, but the same results can be obtained if the $M_v$ part is targeted instead.

Compared to the standard AES, AES10 applies the MixColumn operation in the last round. However, this does not prevent Mukhopadhyay’s attack since MixColumn is a linear transformation. The approach to uniquely determine the (I, L, J) keys is as follows:

1. Recovery of I: Inject two faults (one at a time) into a single byte at the input of the eighth round in AES10. After that, the correct and two faulty ciphertext messages are used in Mukhopadhyay’s attack to recover the last sub-key (I) in AES10. The entire I key can be uniquely determined with 99% probability of success.

2. Recovery of L: Since I is recovered, we can easily invert the tenth round in AES10. Inject two additional faults into a single byte at the input of the seventh round in AES10 and use Mukhopadhyay’s attack to recover L. Again, these two faults will uniquely recover the entire L.

3. Recovery of J: Now, both I and L are known, so we can invert both the tenth and ninth rounds. Similarly, inject the last two faults at the input of the sixth round and recover the last sub-key J.

This approach requires in total one correct and six faulty ciphertext messages to uniquely reveal all three 128-bit keys, (I, L, J). However, if we inject only one fault instead of two in step 3, the J key space will be reduced to $2^{24}$. According to Mukhopadhyay’s attack, the faults in step 2 will also uniquely recover four bytes in $MC^{-1}(J)$. That is, the key space for J is further reduced to $2^{24}$ instead of $2^{32}$. However, if only one fault is induced in each of steps 2 and 3, I will be uniquely determined and the key space for L will be $2^{24}$, but the key space for J will be $2^{24} \times 2^{32} = 2^{56}$.
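Returning to the remark that the MixColumn operation in the last round of AES10 does not prevent the attack, the linearity argument can be made explicit. The following is a sketch in notation introduced here only for illustration ($s$, $\tilde{C}$ and $\tilde{I}$ do not appear elsewhere in the chapter). Let $s$ be the state after SubBytes and ShiftRow in round 10, so that the AES10 output is

$$C = MC(s) \oplus I.$$

Since MC is linear over GF(2), applying its inverse to both sides gives

$$\tilde{C} = MC^{-1}(C) = s \oplus MC^{-1}(I) = s \oplus \tilde{I}.$$

This has the form of the standard AES last round with ciphertext $\tilde{C}$ and sub-key $\tilde{I}$. The equations of Section 6.3.1 can therefore be applied to $\tilde{C}$ and its faulty counterparts to obtain $\tilde{I}$, after which $I = MC(\tilde{I})$. The same reasoning explains why $MC^{-1}(L)$ and $MC^{-1}(J)$ play the role of the sub-key in the earlier rounds.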

6.4.2 Improved application

In this section, we describe an alternative approach that exploits the knowledge of I and the AEZ structure to reduce the number of faulty ciphertexts from six to four, with all keys (I, J, L) still uniquely determined with the same probability (99%). That is, our proposed approach reduces the number of faults required by the direct application of Mukhopadhyay’s attack on AEZ by at least 33.3%.

6.4.2.1 Recovery of I

We start our attack by assuming that I is determined using two correct/faulty pairs, as in Section 6.4.1. However, we use this attack slightly differently so that we can reuse the injected faults and minimise the overall number of faulty ciphertexts. Differential fault attacks on AES, such as [16, 102, 103], usually use two correct ciphertexts of different plaintexts and their corresponding faulty ciphertexts in order to reveal the last sub-key. In our approach, we use only one correct ciphertext and two faulty ciphertexts, all of the same plaintext. With this choice, the differences caused by the fault injections affect only small parts of AEZ, as represented in red in Figure 6.2. We experimentally verified that this approach still gives the same result (i.e. uniquely determines I).

We now proceed with the rest of the technique given that I has already been revealed using three ciphertexts, one correct ($C_{u1}$) and two faulty ($C'_{u1}$, $C'_{u2}$), all of the same plaintext ($M_{u1}$). Note that all three ciphertexts ($C_{u1}$, $C'_{u1}$, $C'_{u2}$) are also inputs to the second AES4 in the $M_u$ part.

6.4.2.2 Recovery of Yu

In this part, we exploit the use of AES4 in the $M_u$ part to reduce the number of faulty ciphertexts required to determine the remaining two keys, J and L. AES4 is a four-round function that uses all three keys, as depicted in Figure 6.4. Note that $Y_u$ in AES4 is not XOR-ed with any key (i.e. $Y_u$ is XOR-ed with a string of zeros $0^{128}$). As a preliminary step, we treat $Y_{u1}$, corresponding to the correct ciphertext ($C_{u1}$), as a sub-key whose value we need to reveal.

Figure 6.4: AES4 function in AEZ v4.2. (Sub-keys 4I, J, I and L are applied in this order between the input $C_u$ and the output $Y_u$; $Z_u$ and $U_u$ are internal states.)
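To make the structure of AES4 concrete, the following Python sketch implements the round function aesr = MC ∘ SR ∘ SB and the four-round composition defined in Section 6.1. It is an illustration under stated assumptions rather than the AEZ reference code: states are lists of 16 bytes in column-major order (byte index = 4 × column + row), the S-box is rebuilt from its algebraic definition, and the five sub-keys are supplied by the caller. The multiplication needed to form 4I from I is not implemented here.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """Build the AES S-box from the field inverse and the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for n in range(1, 5):
            s ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def sub_bytes(s):
    return [SBOX[b] for b in s]


def shift_rows(s):
    # Row r is rotated left by r positions.
    return [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]


def mix_columns(s):
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gf_mul(2, a[r]) ^ gf_mul(3, a[(r + 1) % 4])
                       ^ a[(r + 2) % 4] ^ a[(r + 3) % 4])
    return out


def aesr(s):
    """One AES round without key addition: MC(SR(SB(s)))."""
    return mix_columns(shift_rows(sub_bytes(s)))


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def aes4(x, keys):
    """AES4 with sub-keys (K0, ..., K4): K0 is added first, K4 last."""
    k0, k1, k2, k3, k4 = keys
    state = xor(x, k0)
    for k in (k1, k2, k3, k4):
        state = xor(aesr(state), k)
    return state
```

For the lower AES4 in the $M_u$ part of v4.2, the caller would pass the sub-keys (4I, J, I, L, $0^{128}$), so the final XOR leaves the output of the last round unchanged.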

Changes in $Y_u$ due to faults injected into AES10 (or the second AES4) can be observed from the change in the $C_x$ value, as shown in red in Figure 6.2. To uniquely reveal $Y_{u1}$, two single-byte faults, one at a time, are needed. Note that these two additional faults are different from the faults used during the recovery of I and should not happen simultaneously. The faults should occur at the input of the second round of the second AES4 in the $M_u$ part. Let the outputs of AES4 corresponding to the correct and two faulty blocks be $Y_{u1}$, $Y'_{u3}$, $Y'_{u4}$ respectively, and let $C_{x1}$, $C'_{x3}$, $C'_{x4}$ be their corresponding outputs at the $C_x$ block.

Considering $Y_{u1}$ as the last sub-key implies that the output is a block of zero bytes $0^{128}$. The outputs corresponding to the fault injections can be represented as $Y'_{u3} = Y_{u1} \oplus \Delta_1$ and $Y'_{u4} = Y_{u1} \oplus \Delta_2$, where $\Delta_1 = C_{x1} \oplus C'_{x3}$ and $\Delta_2 = C_{x1} \oplus C'_{x4}$. That is, three output values can be obtained: $0^{128}$ is the correct ciphertext and $(\Delta_1, \Delta_2)$ are the two faulty ciphertexts, such that all outputs share the same secret value $Y_{u1}$. Now, we can again use Mukhopadhyay’s attack to uniquely determine $Y_{u1}$.

Note that retrieving the value of $Y_{u1}$ enables us to retrieve both $Y'_{u3}$ and $Y'_{u4}$, as well as $Y'_{u1}$ and $Y'_{u2}$, corresponding to the faults during the recovery of I.
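The substitution used above can be spelled out as follows. This is a sketch; the symbol $c$ for a "virtual ciphertext" is introduced here only for illustration. By line 18 of Table 6.1, $C_x$ is the XOR of $Y$ with terms that do not depend on $C_u$, and $Y$ contains the term $E_K^{0,4}(C_u) = Y_u$. A fault confined to the lower AES4 therefore changes $C_x$ by exactly the change in $Y_u$:

$$C_{x1} \oplus C'_{x3} = Y_{u1} \oplus Y'_{u3} = \Delta_1, \qquad C_{x1} \oplus C'_{x4} = Y_{u1} \oplus Y'_{u4} = \Delta_2.$$

Now define $c = Y_u \oplus Y_{u1}$, so that $Y_u = c \oplus Y_{u1}$. The three encryptions give

$$c_1 = 0^{128}, \qquad c'_3 = \Delta_1, \qquad c'_4 = \Delta_2,$$

all of which are known to the attacker, while $Y_{u1}$ occupies the position of the unknown last sub-key. In the equations of Section 6.3.1 this corresponds to taking $x = 0$ and $x'$ equal to the matching byte of $\Delta_1$ or $\Delta_2$, with $Y_{u1}$ in place of $K$ (after inverting the final MixColumn, as for AES10).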

6.4.2.3 Recovery of J and L

At this stage, I and $Y_{u1}$ are determined using four fault injections. That is, the second AES4 in the $M_u$ part shrinks to two rounds instead of four, as shown in Figure 6.5. The attacker now knows three inputs to AES4 ($C_{u1}$, $C'_{u1}$, $C'_{u2}$) and their corresponding outputs ($Y_{u1}$, $Y'_{u1}$, $Y'_{u2}$), and the function is reduced to only two rounds. This information is sufficient to uniquely reveal the values of J and L.

Figure 6.5: Reduced AES4 function. (Two rounds from $Z_u$ to $U_u$, with sub-keys J, I and L.)

Cryptanalysis of two rounds of AES can be performed in different ways. We outline one approach to analyse the reduced form of AES4 to reveal both J and L, as they are unrelated. A sketch of the reduced AES4 is depicted in Figure 6.6. Note that each diagonal of J collides with all columns of L, each in one byte. For example, the diagonal J[0, 5, 10, 15] collides with the first column L[0, 1, 2, 3] in byte s00^{i,(2),(MC)}. Similarly, it collides with the second column L[4, 5, 6, 7] in byte s10^{i,(2),(MC)}, and so on. We proceed to find four bytes J[0, 5, 10, 15] and four bytes L[0, 1, 2, 3] by making hypotheses on their values, which allows the collision points (black bytes in Figure 6.6) to be calculated. The hypotheses that lead to collisions amongst the three message pairs are the most likely ones. However, this approach needs a total of $2^{32} \times 2^{32} = 2^{64}$ hypotheses, and has to be performed four times to retrieve the entire J and L. Hence, the time complexity of this approach is high.

Figure 6.6: Sketch of internal states in reduced AES4 function. (Legend: known value, guess value, unknown value.)

To overcome this high complexity, one can compute the black bytes under only $2^{8}$ hypotheses of one column of L for each of the three pairs and then store these values. This has to be performed three more times, each with a different L column. After that, the attacker searches $2^{32}$ hypotheses of one diagonal of the sub-key J and detects which value leads to collisions amongst the three message pairs. If the number of pairs is l, the complexity of this step is $2^{32} + 4l \cdot 2^{8}$. Simulations verify that if $l \geq 3$, this approach can uniquely determine a diagonal of J. The same analysis applies to the remaining three diagonals of J; however, there is no need to re-compute the collision bytes, as this step has been done during the recovery of the first diagonal. The overall complexity of this approach is then $2^{34} + 2^{10} l$, which is practical on a standard desktop computer.

When the entire J is recovered, L can be easily determined, since $Z_u$, J, I and $U_u$ are all known.

6.4.2.4 Attack algorithm

In summary, the steps of our improved approach, described in Section 6.4.2, to recover the (I, J, L) keys are as follows:

1. Compute the ciphertext for a plaintext of length satisfying $(|M| + \tau) \bmod 256 = 128$. Consider this ciphertext as the fault-free message.

2. Inject two random faults at the input of the eighth round of the AES10 encryption in the $M_u$ part of the AEZ-core function.

3. Use Mukhopadhyay’s attack to uniquely determine the last sub-key I in AES10. Note that $C_u$ is also the input to the second AES4 in the $M_u$ part. Hence, we have three different known inputs to AES4, ($C_{u1}$, $C'_{u1}$, $C'_{u2}$); however, their corresponding outputs are still unknown.

4. Using the same plaintext message, inject two additional random faults at the second round of the lower AES4 in the $M_u$ part of the AEZ-core function.

5. Again use Mukhopadhyay’s attack to determine the value of $Y_{u1}$ corresponding to the fault-free message.

6. Use the retrieved $Y_{u1}$ to determine the AES4 outputs of the remaining two blocks ($C'_{u1}$, $C'_{u2}$) in step 3.

7. Using a brute-force search, find four bytes of J that lead to collisions in all three pairs collected in steps 3, 5 and 6.

8. Repeat step 7 three more times to retrieve the entire J.

9. Use the known values of $Z_u$, J, I and $U_u$ in Figure 6.5 to retrieve the entire L.

6.4.3 Further improvement

In Section 6.4.2, we outlined an approach that uniquely retrieves the three 128-bit keys (I, J, L) in AEZ using only four faults: two to retrieve I and two to retrieve $Y_u$. We now further reduce the number of fault injections to three: two to retrieve I and one to retrieve $Y_u$. That is, the number of value hypotheses for $Y_u$ is $2^{32}$, as Mukhopadhyay’s attack shows. However, we have seen in Section 6.4.2.3 that the total time complexity to retrieve J and L given a unique value for I and $Y_u$ is $(2^{34} + 2^{10} l)$. Hence, if $Y_u$ has $2^{32}$ possible values, the overall complexity rapidly increases to $(2^{66} + 2^{42} l)$, which is impractical on standard computational equipment.
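The complexity figures quoted above follow from simple arithmetic, which the short Python sketch below reproduces. The function name and its parameters are hypothetical; the model simply assumes, as in Section 6.4.2.3, one $2^{32}$ search per diagonal of J, a one-off table cost of $4l \cdot 2^{8}$, and a repetition of the whole procedure for every remaining candidate of $Y_u$.

```python
def recovery_cost(num_pairs, yu_candidates=1):
    """Estimated number of hypotheses needed to recover J and L.

    num_pairs     -- number l of known input/output pairs of the reduced AES4
    yu_candidates -- number of remaining candidate values for Y_u
    """
    diagonal_search = 4 * 2**32          # four diagonals of J: 2^34 in total
    table_cost = 4 * num_pairs * 2**8    # stored collision bytes: 2^10 * l
    return yu_candidates * (diagonal_search + table_cost)
```

With l = 3 and a unique $Y_u$ this gives $2^{34} + 3 \cdot 2^{10}$; with $2^{32}$ candidates for $Y_u$ it gives $2^{66} + 3 \cdot 2^{42}$, matching the expressions in the text.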

We proceed by first reducing the possible values for $Y_u$ before applying the recovery process for (J, L) as outlined in Section 6.4.2.3. Note that L is the penultimate key in both AES10 and AES4 in the $M_u$ part. After retrieving I, one can use its value to inversely compute the state matrix at the input of the 9th round MixColumn in AES10, where $MC^{-1}(L)$ is the current sub-key. Due to the fault propagation, Mukhopadhyay’s attack can be applied to uniquely determine the four bytes $MC^{-1}(L)[0, 7, 10, 13]$ using the fault pattern at the input of the ninth round, $(2f', f', f', 3f')$ (see Figure 6.3). That is, using two fault injections to AES10, we can retrieve four bytes of $MC^{-1}(L)$ in addition to recovering the entire I.

Given the four bytes $MC^{-1}(L)[0, 7, 10, 13]$ and $2^{32}$ possible values for $(Y_{u1}, Y'_{u1})$, we can inversely compute the four bytes $(Z_{u1} \oplus J)[0, 5, 10, 15]$ and $(Z'_{u1} \oplus J)[0, 5, 10, 15]$. Since both $Z_{u1}$ and $Z'_{u1}$ are known, we can XOR these bytes with each other to check which value satisfies the differential $(Z_{u1} \oplus Z'_{u1})[0, 5, 10, 15]$. We discard each $(Y_{u1}, Y'_{u1})$ value that does not satisfy this differential property. Simulations show that only one value for $(Y_{u1}, Y'_{u1})$ is returned as a result. From here, the recovery process for (J, L) is the same as in Section 6.4.2.

6.5 New fault analysis on AEZ v5

This section highlights the changes in the most recent version of AEZ, v5. We consider the feasibility of the attacks described in Section 6.4 on this current AEZ version.

6.5.1 Changes from v4.2

As discussed in Section 6.1.2, AEZ v5 changes the offsets in order to avoid the forgery attack presented in [142]. Specifically, it simplifies the offsets used in the function AES4 as follows:

$$\mathrm{AES4}_K(X) = \mathrm{aesr}(\mathrm{aesr}(\mathrm{aesr}(\mathrm{aesr}(X \oplus \Delta) \oplus J) \oplus I) \oplus L) \oplus 0^{128}$$

where $\Delta = j \cdot J \oplus 2^{\lceil i/8 \rceil} \cdot I \oplus (i \bmod 8) \cdot L$.

This definition implies that the lower AES4 function in the $M_u$ part of AEZ-core is defined as shown in Figure 6.7. Note that the only difference between v4.2 and v5 in this AES4 function is the instantiation of the first sub-key, which had a value of (4I) in v4.2 and has changed to $(2I \oplus 4L)$ in v5. The remaining sub-keys are still identical.

Figure 6.7: AES4 function in AEZ v5. (Sub-keys $2I \oplus 4L$, J, I and L are applied in this order between the input $C_u$ and the output $Y_u$; $Z_u$ and $U_u$ are internal states.)

6.5.2 Implications of changes

The main idea in Section 6.4.2 is to first retrieve the values of I and $Y_u$, which in turn reduces the number of rounds in AES4 from four to two. This approach enables the attacker to determine the remaining sub-key values, J and L, using a guess-and-exclude method as previously described in that section. However, this approach cannot be applied directly to AES4 in AEZ v5. Determining I and $Y_u$ will not shrink the number of rounds, as the first sub-key $(2I \oplus 4L)$ now involves an additional key L. Hence, the fault attacks we applied to v4.2 cannot be used directly against this updated version. Nonetheless, the changes in v5 do not completely protect AEZ against fault analysis.

6.5.3 Attack approach using four faults

We proceed using the fault analysis concept in Section 6.4.2, but using different steps in the recovery process. In this section, the recovery process starts using four fault injections to determine the sub-keys (I, J, L); in Section 6.5.4 we in- vestigate whether the number of faults can be further reduced. One approach for the recovery process on AEZ v5 is performed as follows:

1. Recovery of I: As discussed in Section 6.4, the recovery process starts by inducing two faults at the input of the 8th round of the AES10 function

in the Mu part and then applying Mukhopadhyay’s attack to uniquely determine the entire I key. In addition to recovering I, these faults can 1 be further exploited to uniquely retrieve four bytes MC− (L)[0, 7, 10, 13] of the inverse-MixColumn value of the penultimate key L in AES10. These four bytes will be used in the next steps to reduce the key search for both

L and Yu. Note that this step produces three known inputs (Cu1,Cu0 1,Cu0 2) to AES4 such that their corresponding outputs (currently still unknown)

are (Yu1,Yu01,Yu02) respectively.

2. Recovery of L: Inject an additional fault at the input of the 7th round of AES10. Since I is uniquely determined, the encryption state in AES10 can be inversely computed till the end of the 9th round, and then Mukhopad- hyay’s attack can be applied to reduce the possible values for L to 232 only. For each of the 232 candidate values for L, compute the four bytes 1 MC− (L)[0, 7, 10, 13] and compare the calculated value to the right value retrieved in step 1. This comparison reduces the possible values for L from 32 24 2 to 2 . This step also produces an additional pair (Cu0 3,Yu03) for AES4.

3. Recovery of Yu: Similarly as in Section 6.4.3, inject one extra fault at the nd beginning of the 2 round of the lower AES4 function in the Mu part. This 32 step will also provide the attacker with only 2 possible values for Yu.

4. Reduce search space of Yu: Since L is the penultimate key in both AES10 32 and AES4, the 2 possible values for Yu can be used with the retrieved value 1 for MC− (L)[0, 7, 10, 13] in step 1 to check which candidate value follows the fault pattern at the beginning of the third round of the lower AES4

function. This step will significantly reduce the possible values for Yu to 6.5. New fault analysis on AEZ v5 155

29. That is, till this step four faults have been induced, the entire I has ∼ been retrieved, ( 29) values for Y and (224) values for L. ∼ u

5. Reduce search space of $L$: Using the values for $Y_u$ and the unique value for $MC^{-1}(L)[0, 7, 10, 13]$, the attacker can use AES4 to calculate back four bytes of the internal state prior to the SubBytes operation of the second round, namely $(Z_{u1} \oplus J)[0, 5, 10, 15]$. This generates only approximately $2^9$ possible values for $(Z_{u1} \oplus J)[0, 5, 10, 15]$. After that, one can proceed by computing the $Z_{u1}$ value (see Figure 6.7) for $C_{u1}$. Since there are $2^{24}$ possible values for $L$, $Z_{u1}$ will also have $2^{24}$ possible values. The attacker can repeat this process for the remaining pairs $(C'_{u1}, C'_{u2}, C'_{u3})$ to calculate $(Z'_{u1}, Z'_{u2}, Z'_{u3})$. Now, we can XOR these bytes with each other to check which value satisfies the differentials $(Z_{u1} \oplus Z'_{u1})[0, 5, 10, 15]$, $(Z_{u1} \oplus Z'_{u2})[0, 5, 10, 15]$ and $(Z_{u1} \oplus Z'_{u3})[0, 5, 10, 15]$. We discard each $(L, Y_{u1})$ value that does not satisfy this differential property. Simulations show that only one value for $(L, Y_{u1})$ is returned as a result. The complexity of this step is approximately $2^{34}$, which is still very feasible.

6. Recovery of $J$: Once all of the $C_u$, $Y_u$, $I$ and $L$ values are known, $J$ can be easily determined.

The attack approach in this section is very similar to the previous analysis in Sections 6.4.2 and 6.4.3, but this approach exploits the fact that $L$ is the penultimate key in both AES10 and AES4 to reduce the key search for both $Y_u$ and $L$. This enables the attacker to reduce the effective number of rounds in AES4 and easily determine $J$. This approach is also applicable to AEZ v4.2 and can be considered as an alternative mechanism. In fact, this approach is much easier than the approach in Section 6.4.2 for AEZ v4.2 since $L$ is not involved at the beginning of the first round of AES4.
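The candidate filtering used in step 2 above can be sketched as follows. This is only a minimal illustration, assuming the usual column-major AES state layout for the byte positions [0, 7, 10, 13]; the helper names (`known_bytes`, `filter_candidates`) and the randomly generated candidate list are hypothetical and stand in for the candidates that the fault analysis would actually produce.

```python
import random

def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a = ((a << 1) ^ 0x11B) if a & 0x80 else (a << 1)
        b >>= 1
    return p

def _circulant(block: bytes, first_row) -> bytes:
    """Multiply each 4-byte column by the circulant matrix with this first row."""
    out = bytearray(16)
    for c in range(4):
        col = block[4 * c:4 * c + 4]
        for r in range(4):
            acc = 0
            for k in range(4):
                acc ^= gmul(first_row[(k - r) % 4], col[k])
            out[4 * c + r] = acc
    return bytes(out)

def mix_columns(block: bytes) -> bytes:
    return _circulant(block, (0x02, 0x03, 0x01, 0x01))

def inv_mix_columns(block: bytes) -> bytes:
    return _circulant(block, (0x0E, 0x0B, 0x0D, 0x09))

KNOWN_IDX = (0, 7, 10, 13)  # byte positions named in the text (layout is an assumption)

def known_bytes(key_l: bytes) -> bytes:
    """Return the four bytes MC^{-1}(L)[0, 7, 10, 13] for a candidate L."""
    inv = inv_mix_columns(key_l)
    return bytes(inv[i] for i in KNOWN_IDX)

def filter_candidates(candidates, target: bytes):
    """Keep only the candidates for L that agree with the known four bytes."""
    return [cand for cand in candidates if known_bytes(cand) == target]

# Toy demonstration with random stand-in candidates (not real attack output).
rng = random.Random(2018)
true_l = bytes(rng.randrange(256) for _ in range(16))
target = known_bytes(true_l)
candidates = [bytes(rng.randrange(256) for _ in range(16)) for _ in range(1000)]
candidates.append(true_l)
survivors = filter_candidates(candidates, target)
```

In the attack itself the candidate list is not random but comes from Mukhopadhyay's analysis, so the reduction factor reported in the text need not match what this toy filter achieves on random data.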

6.5.4 Attack approach using three faults

In Section 6.5.3, all three sub-keys (I, J, L) are uniquely determined using only four fault injections. This section investigates whether the number of faults can be further reduced to three. The approach to achieve this is performed as follows:

1. Recovery of $I$: Use one fault injection at the input of the 8th round of AES10 so that the key search for $I$ is reduced to $2^{32}$. This fault also provides the attacker with the corresponding $2^{32}$ possible values for $MC^{-1}(L)[0, 7, 10, 13]$.

2. Recovery of $Y_u$: Inject two additional faults into the second round of the lower AES4 in the $M_u$ part, as this uniquely determines the value of the entire $Y_u$. In addition, these two faults reveal the actual value of the four bytes $MC^{-1}(L)[0, 7, 10, 13]$.

3. Reduce search space of $I$: Compare the actual value of $MC^{-1}(L)[0, 7, 10, 13]$ retrieved in step 2 to the $2^{32}$ possible values computed in step 1. As a result, the key search space for $I$ is reduced to approximately $2^9$.

4. Recovery of $L$: The attacker cannot proceed any further since $L$ still has $2^{96}$ possible values and $L$ is involved at the beginning of AES4.

This approach uses three fault attacks to uniquely retrieve $Y_u$ and the four bytes $MC^{-1}(L)[0, 7, 10, 13]$. However, this approach cannot reduce the number of rounds to two since $L$ is still unknown. Three rounds of AES4 and three collected pairs will not enable the attacker to retrieve the remaining keys with feasible computation using a standard desktop computer.

6.6 Experimental results and comparison

We successfully verified the procedures discussed in Sections 6.4 and 6.5 by simulating random fault injections in a software implementation of AEZ v4.2 in the C language with the GNU GCC compiler on a standard desktop computer. Note that we did not perform hardware fault injections, but these have been demonstrated on AES by the experiments of [145], so we consider this to be a feasible approach. The summary of data, time and memory costs is presented in Table 6.3. Note that the offline time complexity and the required memory are specific to our implementation. Hence, these figures may vary for more optimised implementations. The remaining key exhaustive search is implementation independent and shows whether the key is uniquely retrieved or whether additional search is required by the attacker.

Table 6.3: Comparison of different attacks on AEZ.

Version    Approach                              No. of Faulty  Memory  Offline Time Complexity  Key Exhaustive Search
v4.2 & v5  Direct application (Section 6.4.1)    6              4 KB    $\approx 2^{34}$         0
v4.2 & v5  Direct application (Section 6.4.1)    5              4 KB    $\approx 2^{34}$         $2^{24}$
v4.2 & v5  Direct application (Section 6.4.1)    4              4 KB    $\approx 2^{34}$         $2^{56}$
v4.2       Improved application (Section 6.4.2)  4              4 KB    $\approx 2^{34}$         0
v4.2       Improved application (Section 6.4.3)  3              16 KB   $\approx 2^{35}$         0
v5         Improved application (Section 6.5.3)  4              18 KB   $\approx 2^{34}$         0

Comparing the improved procedures of Sections 6.4.2 to 6.5.3 with direct application of Mukhopadhyay's attack, detailed in Section 6.4.1, shows that the improved procedures require fewer fault injections in order to retrieve the three main AEZ keys ($I$, $J$, $L$). The first three lines in Table 6.3 show the required faulty encryptions for direct application of Mukhopadhyay's attack on AEZ. The middle two lines show the corresponding requirements for the improved attacks outlined in Sections 6.4.2 and 6.4.3. Observe that the number of fault injections needed has been reduced and the key is uniquely determined. The last line shows that four faults are required to uniquely determine all keys in AEZ v5. The number of faults cannot be further reduced since $L$ is used in the first sub-key of AES4.

6.7 Possible solutions

In Sections 6.4.2, 6.4.3 and 6.5.3, faults are injected into both AES10 and AES4 in AEZ, and then we exploit the lightweight function AES4 in order to minimise the number of faults required to recover all three 128-bit keys, $I$, $J$ and $L$. The direct application of fault analysis on AES10, as discussed in Section 6.4.1, cannot be avoided with the current structure of the $M_u$ part in AEZ. To thwart this attack, the AEZ structure would need to be changed so that the direct outputs of AES10 in the $M_u$ and $M_v$ parts are not known to the user.

This section outlines two suggestions to prevent AES4 from being exploited to minimise the number of faults. One intuitive solution is to replace the AES4 function with the more secure function AES10. However, this suggestion burdens the scheme with extra computation that would affect the overall performance. Separate experiments are needed to measure the actual performance degradation that would result in AEZ. An alternative solution is to change the sequence of round keys in AES4. One possible definition for AES4 in the $M_u$ and $M_v$ parts is as follows:

$$\mathrm{AES4}_K(X) = \mathrm{aesr}(\mathrm{aesr}(\mathrm{aesr}(\mathrm{aesr}(X \oplus \Delta_1) \oplus \Delta_2) \oplus I) \oplus J) \oplus 0$$

where $\Delta_1 = j \cdot J \oplus 2^{\lceil i/8 \rceil} \cdot I \oplus (i \bmod 8) \cdot L$ and $\Delta_2 = L \oplus J$. Both the ultimate and penultimate keys in AES10 and AES4 are then different. This modification will not allow the attacker to shrink AES4 from four to two rounds, so the attacks described in Sections 6.4 and 6.5 cannot be applied.
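To make the suggested key ordering concrete, the following is a minimal sketch of the modified AES4, assuming that aesr denotes one keyless AES round (SubBytes, ShiftRows, MixColumns) on a column-major state and that $\Delta_1$ is supplied by the caller, since computing it requires the field arithmetic of the AEZ specification. The function name `aes4_modified` is hypothetical, and the sketch makes no claim about side channel or fault resistance.

```python
def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        a = ((a << 1) ^ 0x11B) if a & 0x80 else (a << 1)
        b >>= 1
    return p

def _build_sbox():
    """Derive the AES S-box from the field inverse and the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = _build_sbox()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def aesr(state: bytes) -> bytes:
    """One keyless AES round: SubBytes, ShiftRows, MixColumns."""
    sub = [SBOX[b] for b in state]
    # ShiftRows on a column-major state: row r is rotated left by r columns.
    shifted = [sub[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]
    out = bytearray(16)
    for c in range(4):
        col = shifted[4 * c:4 * c + 4]
        for r in range(4):
            out[4 * c + r] = (gmul(2, col[r]) ^ gmul(3, col[(r + 1) % 4])
                              ^ col[(r + 2) % 4] ^ col[(r + 3) % 4])
    return bytes(out)

def aes4_modified(x: bytes, delta1: bytes, key_i: bytes,
                  key_j: bytes, key_l: bytes) -> bytes:
    """Modified AES4 with the sub-key order (delta1, L xor J, I, J, 0)."""
    delta2 = xor(key_l, key_j)
    state = aesr(xor(x, delta1))
    state = aesr(xor(state, delta2))
    state = aesr(xor(state, key_i))
    state = aesr(xor(state, key_j))
    return state  # the final XOR with the all-zero key leaves the state unchanged

# Illustrative call with arbitrary (non-secret) example values.
example = aes4_modified(bytes(16), bytes(range(16)), bytes([1] * 16),
                        bytes([2] * 16), bytes([3] * 16))
```

With this ordering the penultimate sub-key of AES4 is $J$ rather than $L$, so knowledge of $L$ from AES10 no longer removes the last rounds of AES4.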

6.8 Conclusion

This chapter presented the results of applying well-known fault attacks on AES to the block cipher mode of operation AEZ v4.2 and v5. AEZ uses a relatively large key of size 384 bits organised as three 128-bit keys, $I$, $J$ and $L$. Unlike the standard AES, the three 128-bit keys in AEZ are unrelated to each other. Hence, the number of fault injections required to retrieve all three keys is larger than for AES.

AEZ claims to be a misuse-resistant scheme where the nonce can be omitted or repeated. We demonstrated that this claim is not correct; namely, Mukhopadhyay's attack on AES can be applied to AEZ if the nonce is either omitted or repeated. Direct application of this attack requires at least six fault injections to uniquely determine the three 128-bit keys. We have shown how the structure of AEZ v4.2 can be exploited to allow an improved attack, which reduces the number of fault injections to four while still uniquely determining all three keys. After that, we further reduced the number of fault injections to three through an additional search for an intermediate value $Y_u$ in a precomputed table. This makes the attack simpler and more practical, as each extra fault injection adds time and cost. This result compares favourably to the existing attack on AES, which requires two faults to retrieve the 128-bit AES key, whereas our improved attack on AEZ requires only three faults to retrieve the entire 384-bit key.

Currently, AEZ has been updated to v5. However, direct application of Mukhopadhyay's attack is still possible. Furthermore, we again proposed an improved attack that reduces the number of faults required for successful key retrieval from six to four.

Direct application of Mukhopadhyay's attack on AEZ cannot be avoided unless the AEZ structure is changed. However, we pointed out that exploiting the lightweight function AES4 to reduce the number of faults can be prevented. One way is to replace the AES4 function with AES10. Another approach is to change the sub-keys in AES4. We stress that these attacks, as with all differential fault attacks, do not apply to AEZ if the nonce is changed for every message.

Chapter 7

An online leakage-resilient authenticated encryption mode

Several countermeasures against SCA have been proposed in the literature. Three well-known countermeasures that can be applied to any cryptographic algorithm are masking, hiding and provision of leakage resilience. Masking [14, 146, 147] and hiding [148] both aim to minimise the amount of useful information leaked, using approaches such as randomising intermediate values or using power-balanced logic. Most of these techniques are algorithm or hardware dependent. The user needs to customise the countermeasure according to the underlying cryptographic modules, and the countermeasure must be revisited when a new attack appears. Furthermore, masking schemes can be defeated using high order attacks [149], and have been shown to be vulnerable to glitches [150].

Alternatively, leakage-resilient algorithms aim to construct primitives that withstand side channel attacks even in the presence of bounded leakage about the secret key. Leakage resilience is a protocol-level protection that can be defined for different leakage models to provide generic security [151, 152, 153, 154, 155, 156, 157].

Several designs [158, 159, 160, 161, 162] propose leakage-resilient AE schemes. However, these leakage-resilient AE schemes lack important features provided in conventional AE schemes, especially parallelism and online computation. That is, the methods used for providing leakage resilience destroy these features. Also, only a few current AE proposals [159, 160, 161] claiming leakage resilience are misuse resistant at both the sender and receiver. No existing AE scheme is leakage-resilient and fully online at both communicating parties. Some designs require the message to be fully collected before the encryption process starts, while others require verification of the integrity of the whole ciphertext message before the decryption process begins.

In this chapter, we propose a scheme providing leakage-resilient authenticated encryption with online capabilities using a minimum number of primitives. The same level of SCA-security is provided on both the sender and receiver sides. The work is motivated by the lack of leakage-resilient AE schemes that are fully online at both sender and receiver among previous proposals. Designing an efficient and secure block cipher mode providing all of these features is a challenging research problem. This work is an attempt to address this.

This chapter is organised as follows: Section 7.1 discusses side channel leakage of cryptographic implementations and presents certain techniques to defend against such leakage. Section 7.2 explains how to achieve leakage resilience through a key updating mechanism. Section 7.3 overviews several existing leakage-resilient AE schemes. The main features that an AE block cipher mode should have, in addition to providing leakage resilience, are discussed in Section 7.4. Section 7.5 provides a high-level representation of our proposal for a leakage-resilient block cipher-based AE mode. Section 7.6 outlines different approaches to instantiate our proposed design. Section 7.7 analyses the security provided by our scheme against SCA assuming the underlying functions have certain features. Section 7.8 evaluates our proposed scheme and compares it to some existing AE schemes. A conclusion of this chapter is given in Section 7.9.

7.1 Leakage

Cryptographic algorithms are designed to resist mathematical attacks such as linear and differential cryptanalysis. These algorithms are proved secure in a black-box model assuming that no leakage occurs and that the attacker has access to certain specific interfaces, such as the plaintext message and its corresponding ciphertext. However, this is not sufficient to make the implementation of these algorithms secure in practice. Physical implementations of cryptographic algorithms may unintentionally leak information about their internal variables that can be exploited in SCA attacks.

A large body of research has been proposed to defend against SCA attacks. Eliminating SCA attacks completely is not possible, although some designs are more vulnerable than others. There is considerable uncertainty in this area of cryptography for several reasons. For instance, to provide concrete physical security, the leakage should first be quantified accurately, which is still a contentious topic in the literature. Furthermore, designers do not know how powerful the attacker is or what type of equipment she has. In addition, some techniques to perform SCA attacks have been reported, yet other powerful attacks may not have been published. All these factors limit the effectiveness of SCA countermeasures. Nonetheless, certain countermeasures such as masking and hiding have been proposed to prevent certain types of attacks.

7.1.1 Masking

Masking [14, 146, 147] is a very effective technique that randomises the intermediate values using on-the-fly random numbers. The principle of masking is to divide the secret variable into $d + 1$ shares, where $d$ is the masking order. After that, each share is processed independently. Before generating the final output, the $d + 1$ shares are combined to construct the correct output. This principle follows the well-known secret sharing approach [163, 164]. Cryptographers started using masking as a countermeasure against side channel attacks in [14, 165]. This method was subsequently used to protect various cryptographic ciphers, such as AES [146, 166].
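The share-splitting principle can be illustrated with a minimal sketch, assuming Boolean (XOR) masking, which is only one of several possible sharing methods. The function names and the example operation (an 8-bit rotation, chosen because it is linear over XOR) are hypothetical and not taken from the cited schemes.

```python
import secrets
from functools import reduce
from operator import xor
from typing import Callable, List

def mask(secret: int, d: int, bits: int = 8) -> List[int]:
    """Split `secret` into d + 1 Boolean shares whose XOR equals the secret."""
    shares = [secrets.randbits(bits) for _ in range(d)]
    shares.append(reduce(xor, shares, secret))
    return shares

def unmask(shares: List[int]) -> int:
    """Recombine the shares into the unmasked value."""
    return reduce(xor, shares, 0)

def rotl8(x: int) -> int:
    """Rotate an 8-bit value left by one position (linear over XOR)."""
    return ((x << 1) | (x >> 7)) & 0xFF

def apply_sharewise(shares: List[int], f: Callable[[int], int]) -> List[int]:
    """Process each share independently; valid only for XOR-linear f."""
    return [f(s) for s in shares]

secret = 0xA7
shares = mask(secret, d=2)           # second-order masking: three shares
result = unmask(apply_sharewise(shares, rotl8))
```

Share-wise processing as shown here is only correct for linear operations; nonlinear components such as an S-box need dedicated masked implementations, which are outside the scope of this sketch.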

7.1.2 Hiding

Hiding [15, 148, 167] intends to isolate the device leakage from the sensitive data, providing a data-independent power consumption profile. That is, hiding aims to minimise the signal-to-noise ratio either by reducing the leakage or by increasing the noise [15]. This increases the number of traces required to perform a successful DPA attack. Hiding can be performed at the implementation or hardware level. For instance, each bit in an implementation can be represented by multiple bits to balance power consumption [168]. At the hardware level, a designer can add balanced-logic gates that consume the same amount of power independent of the data and operation [148]. Evidently, both approaches impose overhead either in area or in time and memory.
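As an illustration of representing each bit by multiple bits, the following minimal sketch assumes a dual-rail style encoding in which every data bit becomes a complementary pair, so that every encoded byte has the same Hamming weight. The encoding and the function names are assumptions for illustration and are not necessarily the construction used in [168].

```python
def dual_rail_encode(byte: int) -> int:
    """Encode each bit b as the pair (b, not b), giving a 16-bit word."""
    encoded = 0
    for i in range(8):
        bit = (byte >> i) & 1
        encoded |= (0b10 if bit else 0b01) << (2 * i)
    return encoded

def dual_rail_decode(encoded: int) -> int:
    """Recover the original byte from the high bit of each pair."""
    byte = 0
    for i in range(8):
        byte |= ((encoded >> (2 * i + 1)) & 1) << i
    return byte

def hamming_weight(x: int) -> int:
    return bin(x).count("1")

# Every encoded value has the same Hamming weight, whatever the data is.
weights = {hamming_weight(dual_rail_encode(b)) for b in range(256)}
```

A constant Hamming weight only balances leakage under a simple Hamming-weight power model; real devices may still leak through other effects, and the encoding doubles the storage required.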

7.1.3 Leakage resilience

Leakage-resilient cryptography [151, 152, 153, 154, 155, 156, 157] has emerged to address the gap between theoretical abstraction and real implementation of cryptographic algorithms. The purpose of leakage resilience is to formally extend the provable security of a cryptographic algorithm to its implementation. Leakage resilience keeps cryptographic schemes secure even if they leak a certain amount of information. Leakage-resilient cryptography is considered to be more generic (protocol-level) and allows the underlying building blocks and hardware to change easily. Several leakage-resilient schemes follow Kocher's key updating [153], as briefly described in Section 7.2. This strategy limits the data complexity for each key to prevent DPA attacks.

Different models have been proposed depending on the attacker's ability and the information leaked about the secret key through side channel attacks. Most models assume that the overall amount of leakage is bounded by some parameter $\lambda$ which is shorter than the secret key length. This chapter assumes that the attacker can repeatedly and adaptively learn a leakage function that returns certain key bits, but the total number of bits leaked is bounded by $\lambda$. The contribution of this chapter is not the leakage model, but the construction of a block cipher-based scheme that is secure under this model.
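The bounded leakage assumption can be pictured as an oracle with a budget. The sketch below is a hypothetical model, not a formal definition: the class name, the bit-indexing convention and the restriction of leakage functions to plain key-bit selection are all simplifying assumptions.

```python
class LeakageBudgetExceeded(Exception):
    """Raised when a query would exceed the total leakage bound lambda."""

class BoundedLeakageOracle:
    """Answers adaptive queries for key bits, up to `lam` bits in total."""

    def __init__(self, key: bytes, lam: int):
        if not 0 <= lam < 8 * len(key):
            raise ValueError("lambda must be smaller than the key length in bits")
        self._key = key
        self.remaining = lam

    def leak(self, positions):
        """Return the key bits at the requested positions (MSB-first indexing)."""
        positions = list(positions)
        if len(positions) > self.remaining:
            raise LeakageBudgetExceeded("leakage bound lambda exhausted")
        self.remaining -= len(positions)
        return [(self._key[p // 8] >> (7 - p % 8)) & 1 for p in positions]

# The attacker may choose later queries based on earlier answers.
oracle = BoundedLeakageOracle(key=bytes([0b10100000]) + bytes(15), lam=4)
first = oracle.leak([0, 1, 2])
```

After the three bits above, only one further bit can be requested before the oracle refuses, which mirrors the requirement that the total leakage stays below the key length.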

7.2 Leakage resilience through fresh re-keying

One approach to achieve leakage resilience is by key updating or refreshing such that every run of the cryptographic primitive uses a different key. This is known as fresh re-keying. The main concept of fresh re-keying is to derive fresh keys (session keys) from the master key. Every plaintext block is encrypted/decrypted using a different session key $K^*$ and any given key is never used twice. This approach makes DPA attacks infeasible. However, fresh re-keying does not protect the underlying encryption/decryption primitives against SPA, although this is less threatening and easier to prevent than DPA.

Using key updating to provide leakage resilience was introduced in [153], which incorporated a heavy initialisation phase through a tree-based structure. A similar but more effective design called fresh re-keying [169, 170, 154] has the advantage of a low-cost initialisation phase. Since then, several schemes to provide encryption using leakage-resilient pseudorandom generators (PRGs) or pseudorandom functions (PRFs) have been proposed [155, 154, 171].

Following the concept in [170], fresh re-keying uses two types of function (denoted $g$ and $E_K$) as depicted in Figure 7.1. The function $g$ is used to protect $E_K$, which performs the actual encryption of message block $M_i$ to produce ciphertext block $C_i$, by providing a new key $K^*$ (referred to as the session key) for every message block. The session key $K^*$ is generated from the master key $K$ and a nonce $N$. The function $g$ must be secure against side channel attacks (both DPA and SPA), but is not required to be cryptographically strong, while the function $E_K$ is required to be cryptographically strong and secure against SPA attacks only. This separation makes this structure generic and enables the user to easily change the two functions ($g$ and $E_K$) in trade-offs between security and performance, as long as the requirements for the two functions are satisfied.

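Figure 7.1: Fresh re-keying concept. (The function $g$ takes the master key $K$ and the nonce $N$ and outputs the session key $K^*$, under which $E_K$ maps $M_i$ to $C_i$.)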
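The data flow of Figure 7.1 can be sketched as follows. This is a minimal illustration under loud assumptions: HMAC-SHA256 is used as a stand-in for $g$ and a small hash-based Feistel network as a stand-in for $E_K$, purely so that the example runs with the standard library. Neither stand-in is taken from [170], and neither has the side channel properties that the text requires of $g$ and $E_K$.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes

def g(master_key: bytes, nonce: bytes) -> bytes:
    """Stand-in re-keying function: derive the session key K* from K and N."""
    return hmac.new(master_key, nonce, hashlib.sha256).digest()[:BLOCK]

def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    """Round function of the toy Feistel cipher."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_e(key: bytes, block: bytes) -> bytes:
    """Stand-in for E_K: a 4-round Feistel network on one 16-byte block."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right

def decrypt_e(key: bytes, block: bytes) -> bytes:
    """Inverse of encrypt_e."""
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right

def rekeyed_encrypt(master_key: bytes, nonce: bytes, block: bytes) -> bytes:
    """Encrypt one block under a session key that is used only once."""
    return encrypt_e(g(master_key, nonce), block)

def rekeyed_decrypt(master_key: bytes, nonce: bytes, block: bytes) -> bytes:
    return decrypt_e(g(master_key, nonce), block)

master = bytes(range(16))
blocks = [b"first block....!", b"second block...!"]
nonces = [i.to_bytes(8, "big") for i in range(len(blocks))]
ciphertexts = [rekeyed_encrypt(master, n, m) for n, m in zip(nonces, blocks)]
```

Because each block is processed under its own nonce, the two calls to `g` return different session keys, which is the property that limits the data available to a DPA attacker per key.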

This structure inherently prevents DPA attacks on $E_K$ since $K^*$ is only used once, and it is generally accepted that the leakage from one execution is not sufficient to recover the entire key through DPA attacks. The function $g$ that derives $K^*$ should have sufficient resistance against both DPA and SPA. According to [170], the minimum requirements for $g$ are:

• High diffusion: flipping one bit in $K$ flips many bits in $K^*$ with a chance of 50%. Taha et al. [167] also require that $g$ be nonlinear, as they show a structure that provides high diffusion but is still vulnerable to DPA attacks. Thus, they demonstrated that high diffusion alone is not sufficient and suggested that $g$ be nonlinear.

• No additional secret material: the communicating parties are not expected to share any secrets except the master key $K$.

• No synchronisation: the communicating parties are not expected to share a variable that needs to be kept synchronised.

• Small overhead: the $g$ function should impose small area and performance overhead. $g$ is not required to provide as high a level of cryptographic security as the $E_K$ function.

• Secure against both SPA and DPA: protecting $g$ against SPA and DPA should be more effective than protecting the main function $E_K$ itself. The resistance of $g$ against SPA and DPA can be either structurally inherited or provided through hardware mechanisms such as hiding and masking.

Fresh re-keying schemes guarantee DPA-security as long as the nonce is distinct for every new message. The sender side (encryption/authentication) has control over the nonce and can ensure this. Changing the nonce means generating different session keys. Thus, leakage-resilient schemes effectively protect the sender against DPA attacks. However, at the receiver side, the situation is different. The receiver accepts the nonce and the ciphertext message for decryption/verification. An attacker can force the nonce to be the same for different ciphertext messages, which leads to the same session keys for different inputs. Hence, the attacker can apply DPA to recover $K^*$ based on the power leakage from the forged messages even if the tag verification eventually fails. Most existing leakage-resilient schemes are insecure at the receiver side for this reason. Several schemes [159, 161, 160] point out this dilemma and suggest solutions to tackle the problem of leakage resilience at the receiver side.

7.3 Existing leakage-resilient AE schemes

To the best of our knowledge, only five proposals in the public literature provide leakage-resilient authenticated encryption schemes. These are RCB [158], Barwell [159], ISAP [161], Kocher [162] and DTE [160].

The first AE mode that provides leakage resilience is by Kocher [162]. This work is effective in protecting against unauthorised access to, or alteration or disclosure of, sensitive data stored in hardware chips such as firmware or FPGA bit-files. However, this work does not claim any leakage-resilience security in the case when the nonce is reused. That is, it is not misuse-resistant.

Agrawal et al. [158] proposed another leakage-resilient AE block cipher-based scheme called Re-keying Code Book (RCB). They combine two primitives, OCB for authenticated encryption and a black box re-keying function without leakage. However, Abed et al. [172] subsequently demonstrated several weaknesses in the design of RCB. Hence, RCB fails to provide secure authenticated encryption, regardless of the protection it provides against SCA.

Berti et al. [160] investigated the provision of two effective features, misuse-resistance and leakage-resilience, in one scheme (called DTE) based on cryptographic symmetric functions. This work can be considered as an AE mode that prevents DPA attacks on both the sender and receiver sides using leak-free (i.e. PRF without leakage) and leaky components (i.e. PRF with leakage). This design is efficient and provides strong SCA-security, but it begins the encryption process by hashing the whole message. Therefore, it cannot be online at the sender side. Importantly, this work shows that obtaining the combination of leakage resilience and full misuse-resistance is not possible using only standard cryptographic primitives such as block ciphers or hash functions.

Two AE proposals do not use block ciphers as the building blocks to provide leakage resilience. Firstly, Barwell et al. [159] propose an AE design based on asymmetric cryptography. Secondly, ISAP [161] uses a sponge-based primitive combined with a fresh re-keying function. The ISAP proposal provides important features that our proposal shares. ISAP is designed to prevent DPA on both the encryption and decryption parties and limits the attack surface at the receiver side to SPA. However, ISAP requires the whole message to be collected at the receiver side and its integrity to be verified before decryption. This mechanism enables ISAP to prevent DPA attacks at the receiver side. Protecting the integrity assurance scheme itself against leakage is achieved by using a sponge-based design, a structure which is assumed to capture certain classes of leakage.

7.4 Security goals

This section outlines the main features that a proposed block cipher mode should have, in addition to providing authenticated encryption:

• Security:
  - Leakage resilience at both sender and receiver;
  - Misuse predictability resistance.

• Performance:
  - Online computation at both sides;
  - Minimum overhead.

We discuss each of these properties to provide insight during the design of our proposed scheme.

7.4.1 Leakage resilience at both sender and receiver

Providing leakage resilience for nonce-based schemes on the sender side is easier than on the receiver side. As we have seen in Section 7.2, the main challenge on the receiver side is that an adversary can control the nonce, which in turn affects the session keys. With regard to this scenario, certain current leakage-resilient proposals, including [162, 167], fail to provide leakage resilience at the receiver side. Other AE schemes [159, 160, 161] avoid this problem by requiring the knowledge of the whole message (i.e. nonce, plaintext message and associated data) before encryption/decryption processes commence. This helps the receiver side to detect any alteration in the message. However, this has a performance cost, as it destroys the online capabilities of the proposed scheme. The proposed solution should provide equivalent leakage resilience on both the sender and receiver. This should be achieved without requiring either the sender or the receiver to collect the entire message before processing.

7.4.2 Misuse predictability resistance

In nonce-based authenticated encryption schemes, Rogaway and Shrimpton [31] define (full) misuse-resistance as the expected security provided by a scheme when the nonce is repeated. With a repeated nonce, they specify that the integrity assurance mechanism of the scheme should not be affected and the confidentiality should be damaged only by detection of repeated messages consisting of the same plaintext, nonce and associated data. This notion is called misuse-resistant authenticated encryption (MRAE).

Providing both leakage-resilience and full misuse resistance is a challenging goal to achieve. The interaction between the two notions was studied in detail by Berti et al. [160], who argued that cumulatively observing the leakage of certain plaintext messages with a repeated nonce through SPA can distinguish it from random leakage. That is, the indistinguishability-based security is broken. In fact, this observation applies to repeated use of any symmetric cipher, whether or not a fresh nonce is used with each repetition. Hence, (full) misuse-resistance of a scheme breaks down in the presence of leakage. Instead, they show that a scheme can maintain ciphertext integrity assurance (but not indistinguishability) in the presence of both misuse and leakage, a notion we denote as misuse predictability resistance (MPR). In order to breach the MPR security of a ciphertext message, an adversary needs to produce the whole plaintext/ciphertext message that produces the correct tag. DTE [160] provides this security notion in the presence of leakage and nonce-misuse. Similarly, our proposed scheme is intended to be MPR.

7.4.3 Online computation

Following the definition of Bellare et al. [140], an online cipher is one for which a plaintext block $M_i$ can be encrypted using only the plaintext blocks up to and including $M_i$. That is, if $M$ consists of $m$ blocks, the $i$-th ciphertext block, $C_i$, does not depend on plaintext blocks $(M_{i+1}, \ldots, M_m)$. Ciphers that are not online are referred to as offline. Online ciphers preserve the length of a data stream without the need for buffering or huge memory. These ciphers are preferable in real-time applications and in devices with limited resources.

Online authenticated encryption has few definitions in the literature, especially when this feature is combined with MRAE security. In this chapter, we aim for MPR rather than MRAE, so we follow Fleischmann et al.'s [173] definition, which means that the ciphertext block $C_i$ can be generated before the plaintext block $M_{i+1}$ has to be read. This definition is easy to understand and basically relies on Bellare et al.'s [140] definition for online ciphers. However, online AE does not mean that the decryption algorithm at the receiver side releases plaintext blocks before tag verification. At the receiver side, blocks are processed one by one as they arrive, but the plaintext message is only returned if the calculated tag matches the received one.

Applying this definition to existing AE modes, CCM mode [2] is not an online mode as the length of the message is required to be known in advance, while both GCM [3] and OCB [28] modes are considered online even though the decrypted plaintext message is only returned after the integrity assurance test. The scheme proposed in this chapter should similarly exhibit online capabilities.
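The online property can be demonstrated with a small sketch. The cipher below is a hypothetical toy built from HMAC-SHA256 purely for illustration; it is not the scheme proposed in this chapter, it produces no tag, and it is not claimed to be secure. It only shows that $C_i$ is emitted before $M_{i+1}$ is read and that $C_i$ depends on no later block.

```python
import hashlib
import hmac
from typing import Iterable, Iterator

BLOCK = 16  # block size in bytes; all input blocks are assumed to be this long

def _prf(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

def online_encrypt(key: bytes, nonce: bytes,
                   blocks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield C_i as soon as M_i is read; C_i depends only on M_1..M_i."""
    state = _prf(key, b"init" + nonce)
    for m in blocks:
        pad = _prf(key, b"pad" + state)[:BLOCK]
        yield bytes(a ^ b for a, b in zip(m, pad))
        state = _prf(key, b"next" + state + m)  # chain in the block just processed

key, nonce = bytes(16), b"nonce-01"
msg_a = [b"A" * BLOCK, b"B" * BLOCK, b"C" * BLOCK]
msg_b = [b"A" * BLOCK, b"B" * BLOCK, b"D" * BLOCK]
ct_a = list(online_encrypt(key, nonce, msg_a))
ct_b = list(online_encrypt(key, nonce, msg_b))
```

Because the two messages share their first two blocks, their first two ciphertext blocks coincide under the same key and nonce, which is exactly the prefix behaviour implied by the definition of an online cipher.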

7.4.4 Minimum overhead

The proposed scheme should provide all the properties discussed previously in Section 7.4 with minimum overhead, compared to the unprotected implementation of a cryptographic scheme. For hardware implementation, it is preferable for the proposed solution to have a small footprint and to take a minimum number of clock cycles for execution, so that the design can protect low-cost and hardware-restricted devices.

7.4.5 Properties of existing AE schemes

In order to thoroughly understand the existing leakage-resilient AE schemes, we compare these schemes with regard to the properties discussed so far in Section 7.4, except the overhead property since different instantiations can be used for different schemes. The summary of functionalities provided by these schemes and our proposal in this chapter is shown in Table 7.1. Note that, with the exception of RCB [158], no existing leakage-resilient AE scheme is fully online on both the sender and receiver sides. RCB is considered an online scheme; however, as mentioned in Section 7.3, RCB has some structural weaknesses [172].

Table 7.1: Summary of security features provided by leakage-resilient AE schemes.

Scheme         Primitive     DPA-secure (sender/receiver)  MPR  Fully Online  AE-Secure
RCB [158]      Block cipher  ✓/✓                           ×    ✓             ×
Barwell [159]  Asymmetric    ✓/✓                           ✓    ×             ✓
ISAP [161]     Sponge        ✓/✓                           ✓    ×             ✓
Kocher [162]   Block cipher  ✓/×                           ×    ×             ✓
DTE [160]      Block cipher  ✓/✓                           ✓    ×             ✓
This proposal  Block cipher  ✓/✓                           ✓    ✓             ✓

7.5 Proposed design

This section provides a high-level representation of our proposal for a leakage-resilient block cipher-based AE mode. In this design, we follow the fresh re-keying scheme that is based on two building primitives, $g$ and $E_K$, as discussed in Section 7.2. The design should use only these primitives to satisfy all the features listed in Section 7.4, especially the online requirement. That is, our proposal does not increase the computational burden by requiring any extra hardware modules. This means that the proposed scheme provides important features while still being suitable for resource-restricted hardware.

Firstly, we modify the basic structure of the fresh re-keying scheme (see Figure 7.1) so that the leakage at both sender and receiver is almost the same. The slight alteration is shown in Figure 7.2. This toy model encrypts a constant Cst0 using a block cipher to generate a key stream block. This key stream is XORed with the plaintext block to produce the ciphertext block. The advantage of this modification is that the leakage of $E_K$ is independent of both $M_i$ and $C_i$. Changing either the plaintext or ciphertext value has no effect on the leakage produced by $g$ or $E_K$. Cst0 is a public constant during encryption and decryption processes, but not a nonce. A user can select a hard-coded value for Cst0, but an attacker has no control over this value since the value of Cst0 is part of the initial loading of an implementation into a device. Encrypting a constant is used in some existing leakage-resilient schemes [160, 167]. Thus on both sides, DPA attacks to retrieve $K^*$ are not feasible even if the nonce is reused. Hence, this scheme is more effective in misuse-resistance settings than the standard scheme of fresh re-keying. Nonetheless, this toy model neither provides authenticated encryption nor encrypts multi-block messages.

Figure 7.2: Slight modification to fresh re-keying scheme. (Diagram: g takes K and N and outputs K∗; EK, keyed with K∗, encrypts Cst0; the output is XOR-ed with M1 to give C1.)
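The toy model of Figure 7.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `E` and `g` are hash-based stand-ins (not AES and not the tree-based g of Section 7.6), the value chosen for `CST0` is hypothetical, and the function names are ours rather than the thesis's.

```python
import hashlib

BLOCK = 16           # bytes in one 128-bit block
CST0 = bytes(BLOCK)  # hypothetical value for the public constant Cst0


def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for the block cipher E_K (NOT AES); only used in the forward direction."""
    return hashlib.sha256(b"E" + key + block).digest()[:BLOCK]


def g(key: bytes, nonce: bytes) -> bytes:
    """Stand-in for the re-keying function g (assumed, not the thesis instantiation)."""
    return hashlib.sha256(b"g" + key + nonce).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_process(master_key: bytes, nonce: bytes, block: bytes) -> bytes:
    """Encrypts or decrypts one block: the cipher input is always Cst0,
    so the data seen by E does not depend on the plaintext or ciphertext."""
    session_key = g(master_key, nonce)   # K* = g(K, N)
    keystream = E(session_key, CST0)     # E_{K*}(Cst0)
    return xor(block, keystream)
```

Because the block cipher only ever processes Cst0, the same routine serves the sender and the receiver, which reflects the symmetry of leakage that the modification aims for.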

Secondly, we add integrity assurance to our scheme from Figure 7.2. Providing integrity assurance using only block ciphers in the presence of leakage is challenging. The output from a block cipher can be used as a key stream to encrypt/decrypt messages, but this approach does not provide integrity assurance. Some proposals use the hash-then-MAC paradigm [160, 162], where the whole message is first passed through a hash function and the output is encrypted using a secret-key cryptographic function to generate the final tag. This approach requires relatively low complexity to construct a leakage-resilient AE scheme that preserves security in misuse-resistance settings and on both sides. However, this paradigm limits the online computation of the scheme.

Currently, there are only two block cipher-based schemes, namely PSV-MAC [156] and Schipper's scheme [174], that provide leakage-resilient integrity assurance. However, when the nonce/IV is repeated in the presence of leakage, PSV-MAC [156] fails to provide DPA security. Schipper's scheme [174] randomly generates a key for every message and then uses a MAC scheme to calculate the tag, so it is not affected by a repeated nonce. The problem with Schipper's scheme is that it specifies a key for each message, and not for each block. Thus, for long multi-block messages, DPA attacks will succeed if, for example, CBC-MAC is used as the MAC scheme. That is, both schemes fail to provide sufficient DPA security.

We provide an alternative integrity assurance mechanism based on repeated application of both the g and EK components, as illustrated in Figure 7.3. This provides leakage-resilient authenticated encryption for the plaintext block M1 such that the output is both C1 and the authentication tag T. Importantly, in misuse-resistance settings where the nonce can be repeated for different plaintext/ciphertext messages, the adversary cannot use DPA attacks on the EK components to recover the session keys K∗ or K∗∗. For the first EK, changing either the plaintext or ciphertext value has no effect on the leakage produced by EK, while for the second EK, changing Ci will consequently change the corresponding K∗∗. Hence, both EK components are protected against DPA. This model provides all the required features discussed in Section 7.4 except the online feature, and maintains the same SCA security at both sender and receiver. The model uses only the already implemented primitives without any extra area cost.

Figure 7.3: Authenticated encryption fresh re-keying scheme. (Diagram: the upper row is the model of Figure 7.2, producing C1 from M1; in the lower row, g derives K∗∗ using C1, and EK, keyed with K∗∗, takes C1 as input and outputs T.)

Finally, we extend the toy model in Figure 7.3 to process multi-block messages in a manner that enables the scheme to be online. The final scheme is depicted in Figure 7.4. Starting from block M1, the output after processing each block is used as a new key for the next block. This process continues until the last block Mm has been encrypted. The scheme is now online since the encryption of each block Mi does not depend on the following blocks (Mi+1, ..., Mm).

Note that the tag T is generated by applying an additional call of the g function using the master key K. Without this additional call, a forgery attack can be performed on the scheme. To illustrate such a forgery, suppose a plaintext message $M = (M_1, \ldots, M_m)$ is encrypted and returns a tag $T$. An attacker can change the message by choosing a new message $M' = (M'_1, \ldots, M'_m)$ and appending it to the end of the original one ($M \| M'$). For this newly constructed message, an attacker can easily calculate the correct authentication tag $T'$ as:

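- $K'_0 \leftarrow T$

- $C'_i \leftarrow M'_i \oplus E_{K'_{i-1}}(Cst_0), \quad \forall i \in [1, m]$

- $K''_i \leftarrow g(K'_{i-1}, C'_i), \quad \forall i \in [1, m]$

- $K'_i \leftarrow E_{K''_i}(C'_i), \quad \forall i \in [1, m]$

- return $T' \leftarrow K'_m$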

Such a forged message $(M \| M', T')$ will be accepted as valid when verified by the receiver. The fixed-length PSV-MAC [156] is vulnerable to this attack if it is used for variable-length messages. To avoid this, a finalising phase using the g function under the master key K is added to our scheme to generate the tag. Note that g in this step does not add any extra security to the tag, except that it differentiates the final block from ordinary blocks in a message.
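The extension forgery described above can be reproduced with a small Python sketch of the variant that omits the finalising call to g, so that the tag is simply the last chaining key. The primitives `E` and `g` are hash-based stand-ins and `CST0` is a hypothetical constant; all names are illustrative assumptions, not the thesis's code.

```python
import hashlib

BLOCK = 16
CST0 = bytes(BLOCK)  # hypothetical public constant Cst0


def E(key, block):
    """Stand-in for E_K (not AES)."""
    return hashlib.sha256(b"E" + key + block).digest()[:BLOCK]


def g(key, data):
    """Stand-in for the re-keying function g."""
    return hashlib.sha256(b"g" + key + data).digest()[:BLOCK]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def chain(start_key, blocks):
    """Process blocks from a given chaining key; return ciphertext and final key."""
    key, out = start_key, []
    for m in blocks:
        c = xor(m, E(key, CST0))   # C_i = M_i xor E_{K_{i-1}}(Cst0)
        k2 = g(key, c)             # K''_i = g(K_{i-1}, C_i)
        key = E(k2, c)             # K_i = E_{K''_i}(C_i)
        out.append(c)
    return out, key


def encrypt_without_final_g(master_key, nonce, blocks):
    """Weakened variant: the tag is the last chaining key, with no finalising g."""
    return chain(g(master_key, nonce), blocks)


def forge(ciphertext, tag, extra_blocks):
    """Attacker's view: continue the chain from the public tag, never touching K."""
    extra_ct, new_tag = chain(tag, extra_blocks)
    return ciphertext + extra_ct, new_tag
```

In this sketch `forge` uses only public values, yet its output matches what the key holder would compute for the extended message; once the tag is derived through an additional call to g under the master key, the attacker can no longer continue the chain from T.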

At the receiver side, the scheme performs the decryption process block by block sequentially as they arrive, and the attacker cannot apply DPA even if the nonce is forced to repeat. The final step is tag verification, and the recovered plaintext message is only returned if it passes this integrity assurance test. That is, the proposed mode does not pose a requirement to verify the message in advance in order to prevent DPA when the nonce is repeated. This proposal allows the sender and receiver to securely process variable-length messages, and produce ciphertext blocks based on the received plaintext blocks without the restriction of previously knowing the message length or requiring the collection of the entire message before operation.

Figure 7.4: Our proposal for leakage-resilient AE mode. (Diagram with two panels, "Encryption Process" and "Decryption Process". In each, g derives K∗ from K and N; every block Mi/Ci is processed by an EK call on Cst0, a g call that derives K∗∗, and a second EK call; a final g call under K outputs the tag T.)

Note that the mode illustrated in Figure 7.4 is general, regardless of the actual instantiation used for g and EK. The user can choose different instantiations to trade off security against performance.
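The block-by-block flow of Figure 7.4 can be summarised by the following Python sketch. It is a minimal illustration under explicit assumptions: `E` and `g` are hash-based stand-ins rather than AES and the tree-based g, `CST0` is a hypothetical constant, and the final tag is assumed to be computed as g(K, Km), which is one reading of "an additional call of the g function using the master key K".

```python
import hashlib
import hmac

BLOCK = 16
CST0 = bytes(BLOCK)  # hypothetical public constant Cst0


def E(key, block):
    """Stand-in for E_K (not AES); the mode only uses the forward direction."""
    return hashlib.sha256(b"E" + key + block).digest()[:BLOCK]


def g(key, data):
    """Stand-in for the re-keying function g."""
    return hashlib.sha256(b"g" + key + data).digest()[:BLOCK]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def next_key(key, c):
    """Integrity part: the ciphertext block selects K** and is fed into E."""
    return E(g(key, c), c)


def encrypt(master_key, nonce, blocks):
    key, out = g(master_key, nonce), []      # K* = g(K, N)
    for m in blocks:                         # online: one block at a time
        c = xor(m, E(key, CST0))
        key = next_key(key, c)
        out.append(c)
    return out, g(master_key, key)           # assumed finalisation T = g(K, K_m)


def decrypt(master_key, nonce, cblocks, tag):
    key, out = g(master_key, nonce), []
    for c in cblocks:
        out.append(xor(c, E(key, CST0)))
        key = next_key(key, c)
    # Plaintext is released only if the tag verifies.
    if not hmac.compare_digest(g(master_key, key), tag):
        return None
    return out
```

Both directions process blocks in arrival order and never need the message length in advance; the only buffering in this sketch is the decision to withhold the plaintext until the tag has been checked.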

7.6 An instantiation of our proposed design

This section describes an instantiation of our proposal with concrete functions selected for the components EK and g. We stress that the proposal itself is a high-level scheme and can be instantiated with different functions or block ciphers depending on the application and the trade-off between efficiency and security.

The natural choice for the encryption/decryption function EK is the standard 128-bit AES cipher, and we recommend its use for EK. On the other hand, different constructions can be used for g. One approach is to protect g against leakage using other physical countermeasures, such as masking and hiding. For instance, Medwed et al. [170] propose g as a modular multiplication between K and N such that the multiplication itself is protected by masking and shuffling. However, this instantiation opens the door to attacks on both g and EK, as discussed in [175]. Another possible instantiation is to use a masked AES implementation such as the recently proposed implementation [150]. However, we would prefer an instantiation with lower computation cost that has been subject to public scrutiny.

To instantiate g, we follow the idea of the GGM construction [176] to update the secret key K to a new key K∗. In this approach, the master key K is updated to another secret variable through a series of randomised steps following a tree-like structure. The tree originates from the master key, and each step uses one bit of the nonce to select one of two child nodes. The output from the last step is used as the session key K∗. Under this structure, any given key is used in at most two executions, which prevents DPA attacks. This approach has been used in several designs such as [154, 153, 162, 155, 171, 167]. Certain schemes [155, 154, 171] use the GGM construction to build a leakage-resilient PRF where the randomisation function used in each step of the tree can be a full block cipher or a full hash function.
In our proposal, we follow the tree structure proposed by Taha et al. [167]. We do not require g to be a strong PRF. That is, the key generated by g does not require extra randomness as long as it is unknown to the attacker. Eliminating this need enables the structure to be more lightweight and efficient. This does not compromise the resultant security of the scheme, as the actual encryption/decryption process is performed by the EK component, not g. The g function is used only to update keys without significant leakage that can be exploited in power analysis attacks.

To achieve the important features for the g function discussed in Section 7.4, Taha et al. [167] used two rounds of AES as the randomisation function $Wt_{[N]_i}(K)$ for each step in the tree, as illustrated in Figure 7.5. The key is used as data, whereas a fixed input is used as a key in the function $Wt_{[N]_i}(K)$. To update a session key $K_i$ to a new key $K_{i+1}$, $Wt_{[N]_i}(K)$ is specified as follows:

$$K_{i+1} = Wt_{[N]_i}(K_i) = \begin{cases} Wt_{1^{128}}(K_i), & \text{if } [N]_i = 1 \\ Wt_{0^{128}}(K_i), & \text{if } [N]_i = 0 \end{cases}$$

such that $[N]_i$ is the $i$-th bit in the nonce $N$. $Wt$ is a two-round function of AES such that the key can be all ones ($1^{128}$) or all zeros ($0^{128}$), depending on the value of $[N]_i$. Note that this structure does not compute all paths in the tree; only the path specified by the nonce value is computed.
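The path walk through the tree can be sketched as follows. This is an illustration only: `wt` is a hash-based stand-in for the two-round AES function (which is not reimplemented here), and reading the nonce bits most-significant-bit first is an assumption of this sketch, not something fixed by the text.

```python
import hashlib

BLOCK = 16
ALL_ONES = b"\xff" * BLOCK    # fixed "key" input 1^128
ALL_ZEROS = b"\x00" * BLOCK   # fixed "key" input 0^128


def wt(fixed_key: bytes, state: bytes) -> bytes:
    """Stand-in for two AES rounds keyed by a fixed constant; the secret is the data."""
    return hashlib.sha256(b"Wt" + fixed_key + state).digest()[:BLOCK]


def nonce_bits(nonce: bytes):
    """Yield the bits of the nonce, most significant bit of each byte first (assumed order)."""
    for byte in nonce:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def g_tree(master_key: bytes, nonce: bytes) -> bytes:
    """Walk a single root-to-leaf path: K_{i+1} = Wt_{[N]_i}(K_i)."""
    key = master_key
    for bit in nonce_bits(nonce):
        key = wt(ALL_ONES if bit else ALL_ZEROS, key)
    return key  # session key K*
```

Each intermediate key is only ever combined with one of two fixed constants, which is the property that bounds the data complexity per key to two. For a 128-bit nonce the walk takes 128 steps, one per nonce bit.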

Figure 7.5: A tree structure for updating the key. (Diagram: a binary tree rooted at K0; each edge applies Wt and is labelled with a nonce bit Ni = 0 or Ni = 1; the nodes at depth i are the candidate keys Ki, indexed by the nonce bits consumed so far.)

To enable a lightweight implementation, the two rounds needed for the function $Wt_{[N]_i}(K)$ are obtained from the AES in component EK. Taha et al. [167] suggest that a triggered circuit is added to the AES implementation in EK to acknowledge when the output is ready (i.e. after two rounds for g or ten rounds for EK). This additional circuit has an implementation cost of only two gates at 3.7 Gate Equivalent (GE), which is a negligible area overhead.

7.7 Security analysis

This section analyses the security of our proposed scheme. Structurally, the design consists mainly of two components EK and g. In our instantiation, AES is used for EK and a tree structure based on two rounds of AES is used for g. AES is known to be resistant to regular mathematical attacks. Assuming each component is secure against mathematical attacks, the composition of the components cannot provide an attacker with any useful information to break the system. In this section, we assume that the algorithm is secure against regular attacks, and it is the implementation that is vulnerable. First, we investigate the resistance of g against side channel attacks. After that, we discuss how EK can be protected against SPA only, and discuss the security of the proposal against fault attacks. Then, assuming g is SCA-secure and EK is SPA-secure, we contend that our proposal provides sufficient authenticated encryption security and MPR security at both sender and receiver sides, and protects against both SPA and DPA attacks as a whole scheme.

7.7.1 Security of g against SPA and DPA

In this proposal, g uses a lightweight GGM construction that absorbs one bit of the nonce in each step. This construction limits the data complexity for each session key to two. To determine one byte of the secret key in AES using DPA attacks, between 10 and 50 different plaintext messages are required [177]. Limiting the data complexity to two renders the leakage insufficient to be exploited in DPA attacks to recover the secret key, as shown in [154, 171, 167].

Medwed et al. [157] highlighted that this limitation of data complexity in the GGM construction may not be enough to completely prevent attacks. They verified experimentally that attacks on certain tree-based implementations on an 8-bit microcontroller can reach a non-negligible success rate. To thwart this attack, they suggested running all 16 s-boxes (SB) in AES in parallel. In this case, the accumulated leakage of the s-boxes is:

$$l_j = \sum_{i=1}^{16} L\big(SB(M_j[i] \oplus k[i])\big) \qquad (7.1)$$

where $L(\cdot)$ is a leakage function and $l_j$ is the total leakage from all s-boxes for the data bytes $M_j[i]$.
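To make Equation 7.1 concrete, the following Python sketch computes the accumulated leakage of the 16 parallel s-boxes. The Hamming weight is used for the leakage function L purely as an assumption of this sketch (the text leaves L unspecified); the AES s-box is generated from its standard definition (inversion in GF(2^8) followed by the affine map) rather than pasted as a table.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254 (0 maps to 0, as in AES)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0


def rotl8(x: int, n: int) -> int:
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_byte(x: int) -> int:
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


SBOX = [sbox_byte(x) for x in range(256)]


def hamming_weight(x: int) -> int:
    return bin(x).count("1")


def total_leakage(message: bytes, key: bytes) -> int:
    """l_j = sum over the 16 byte positions of L(SB(M_j[i] xor k[i])), with L = Hamming weight."""
    return sum(hamming_weight(SBOX[m ^ k]) for m, k in zip(message, key))
```

The attacker observes only the single scalar returned by `total_leakage`, not the 16 per-byte terms. Since the value is a plain sum, it is unchanged when message and key bytes are permuted together, which is one way to see why the byte order is hard to recover from it.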

This parallel s-box processing makes the SCA resistance of g grow super-exponentially with the degree of parallelism of the implementation. For SPA, the attacker is required to find the 16 bytes of the key, where each byte $k[i] \in \{1, \ldots, 255\}$, such that the total leakage across the s-boxes satisfies Equation 7.1. Using only one plaintext value, finding the solution is infeasible. Even if one knows the key bytes individually, determining the right order of these bytes becomes even harder with the accumulated leakage in Equation 7.1.

For DPA, the correct solution for Equation 7.1 is checked using two differential plaintext values. Experiments performed by Taha et al. [167] estimated the success rate over small parts of the key. They showed that there is an extremely high bound on the computational complexity when the full key space of $2^{128}$ is considered. To uniquely determine the correct key, 128 trace equations are needed [178].

In summary, we argue that instantiating g with this GGM construction such that all s-boxes in the AES function run in parallel is sufficient to thwart SPA and DPA attacks.

7.7.2 Security of EK against SPA

Our proposal protects EK against DPA attacks as other fresh re-keying schemes do. However, EK is still required to be secure against SPA. Protecting AES against SPA should be easier than protecting it against DPA [15]. SPA is only useful when a significant amount of data-dependent leakage is visible. In practice, noise is usually large enough to make SPA alone ineffective. Nonetheless, several designs to secure AES against SPA exist in the literature. AES can be protected against SPA by shuffling, as in [170, 179], with little overhead. Recently, Groß et al. [150] introduced an efficient, lightweight masking approach to protect AES. A similar approach can protect EK against SPA attacks.

7.7.3 Security against Fault Attacks

Besides SCA attacks, another class of implementation attacks is fault attacks. This section discusses the feasibility of DFA and SFA attacks against the proposed scheme, and more generally against schemes based on fresh re-keying.

Fresh re-keying updates the key of each encryption call. Consequently, DFA is infeasible at both sender and receiver provided the nonce is not repeated. That is, DFA is prevented by default with the fresh re-keying approach. Similarly, SFA requires different message blocks to be encrypted under the same key. Again, for fresh re-keying based schemes, every message block receives a fresh key, thus SFA is not applicable.

If the nonce N is repeated, then this mode and other fresh re-keying based schemes [162, 170, 161, 160] will be vulnerable to both DFA and SFA. In this case, other countermeasures, such as fault detection circuits, can be used to protect against fault attacks. These detection techniques mainly depend on the duplication of execution to identify faults. For example, Karri et al. [180] proposed an error detection approach that performs the inverse operation before releasing the produced output to ensure that it decrypts to the original input. Fresh re-keying does not provide security against fault attacks when the nonce is omitted or repeated, but the proposed fresh re-keying scheme would still be secure against DPA with a repeated nonce.

7.7.4 Security of overall scheme

The proposed design consists mainly of two parts, an encryption/decryption part and an integrity assurance part, as shown in Figure 7.4.

Encryption/Decryption part: The construction of our proposal follows the idea of the fresh re-keying scheme. This technique has been extensively studied and found effective in preventing DPA attacks [159, 154, 153, 169, 170]. The block cipher is used as a stream generator to encrypt plaintext blocks. This enables the scheme to be more secure in misuse-resistance settings since the leakage of the encryption function is independent of the plaintext/ciphertext message. In addition, each plaintext block encryption is provided with a new key. Hence, if the g function is SCA-secure then the encryption part is a strong pseudorandom permutation, because ciphertext blocks are generated using a strong pseudorandom permutation (AES in the instantiation of Section 7.6) and g only ensures distinct fresh keys.

Integrity assurance part: To provide strong integrity assurance, the message blocks are required to be input to the block cipher (i.e. not XOR-ed with the output stream block as in the encryption part), as this is effective in detecting any alteration to the message. However, feeding the blocks as input to the cipher enables an attacker to perform multiple authentication/verification operations under the same nonce, which in turn renders the scheme vulnerable to

DPA attacks. To overcome this challenge, we use a different approach to process message blocks. We still use the message blocks as input to the block cipher to provide strong integrity assurance. However, the value of the block contributes to the generation of the session key, as illustrated in Figure 7.6. That is, the block value (Mi/Ci) is used in the g function to generate its session key K∗∗. Changing either K or N will change the session key K∗, whereas changing K, N or (Mi/Ci) will change the session key K∗∗. Thus, multiple decryptions under the same nonce to recover session keys cannot be applied in this design, since changing the ciphertext leads to changing the intermediate session keys. Similar to the encryption/decryption part, if g is SCA-secure, then this part is a strong pseudorandom function since blocks are fed into a strong pseudorandom permutation (AES). The purpose of g is to ensure a different key for each message block.

Figure 7.6: Leakage-resilient integrity assurance scheme. (Diagram: g takes K∗ and the block value Mi/Ci and outputs K∗∗; EK, keyed with K∗∗, takes Mi/Ci as input.)

In summary, we conclude that if g is SCA-secure and EK is a strong PRP, the mode illustrated in Figure 7.4 provides both strong data confidentiality and integrity assurance for plaintext messages. That is, the mode of operation provides sufficient authenticated encryption. Furthermore, the scheme provides sufficient misuse predictability resistance since, in the worst-case scenario, the attacker can detect messages that start with identical plaintext blocks when the nonce is repeated. However, the attacker cannot use this knowledge to perform forgery attacks, as any alteration to the message will change the intermediate keys, and these errors propagate to the last block and appear in the final tag.

7.8 Performance estimates

In this section, we provide some estimates for the performance of our proposal in comparison to some existing leakage-resilient AE schemes using different hardware implementations. As listed in Table 7.1, the block cipher-based schemes that could be compared to our scheme are DTE, RCB and Kocher's design. However, RCB is insecure [172] and Kocher's design is vulnerable to DPA attacks at the receiver side [162]. Therefore, we compare the performance of our scheme with DTE in terms of speed and area, for various application scenarios and for various message lengths ranging from one 128-bit block (m = 1) to 7500 blocks (m = 7500). In each scenario, we choose particular implementations for different components in each scheme, and aim for equivalent security for both schemes.

Our proposal uses two components, g and EK, whereas DTE uses these and additionally a hash function, h, as a third component. To compare, we use existing hardware implementations to instantiate the underlying building blocks (g, EK, h) for which performance metrics already exist. We choose the current smallest AES implementation [181] and its first-order version protected against SCA for the component EK. For g, we use the tree-based construction that was discussed in Section 7.6 using two different settings, (s = 1) and (s = 8). As Taha et al. [167] explained, the performance of g can be improved if each step in the tree uses s bits rather than one (s = 1). This trades off the SCA-security bound, but this can be tolerated in certain low-cost applications. The third component, h, is used only in DTE. Hence, for a fair comparison, we choose three different implementations of some well-known hash functions. We use two implementations of SHA-256, one for fast computation and the other for small area. The winner of the SHA-3 competition, Keccak-256, is also used to cover different applications. The real cost of these implementations of g, EK and h, in terms of the number of cycles required to process a 128-bit block and the area (GE) at a certain CMOS technology, is listed in Table 7.2.

Evaluating our scheme and DTE using hardware implementations chosen from Table 7.2 gives an indication of the relative performance of these two schemes in real hardware. If the hardware implementations of g, EK and h are known, the remainder of the scheme is a sequential execution of these hardware cores depending on the number of plaintext blocks m. We assume that the g, EK and h functions are processed in series.

For our first comparison, we consider the case where strong security is required. Therefore, we choose g with s = 1, the protected version of AES and Keccak-256 for h. The results for the two schemes are shown in Figure 7.7(a). The performance of the two schemes is comparable for short messages (m < 1500), but our proposal is slower than DTE by about 50% when m reaches 7500.

Table 7.2: Implementation overhead of different hardware constructions.

Primitive  Construction             Latency (cycles)  Area (GE)  Technology
g          s = 1, tree-based [167]  512               3.7        130nm
g          s = 8, tree-based [167]  64                3.7        130nm
EK         AES [181]                226               2400       180nm
EK         protected AES [181]      266               11031      180nm
h          fast SHA-256 [182]       68                21670      130nm
h          small SHA-256 [182]      292               6125       130nm
h          Keccak-256 [183]         25                56713      180nm

Figure 7.7: Performance overhead for different instantiations. (Four line plots of the number of cycles, on a scale of 10^6, against the number of blocks, from 1 to 7,500, for our proposal and DTE: (a) our proposal with s = 1; (b) our proposal with s = 8; (c) our proposal with s = 8; (d) our proposal with s = 1.)

For the case of fast computation with reduced security, we choose g with s = 8, the unprotected AES for EK and the fast SHA-256 for h. Although Keccak-256 is the fastest hash function, SHA-256 is a better option if one considers the huge area overhead caused by Keccak-256. The results are shown in Figure 7.7(b). The performance of the two schemes is almost identical, although our scheme is slightly better for long messages (m ≥ 6000).

Finally, for small-area applications, we consider both implementations of g, since both occupy the same area, while the unprotected AES is used for EK and the small SHA-256 hash function is used for h. The results when s = 8 are shown in Figure 7.7(c), whereas the results when s = 1 are shown in Figure 7.7(d). Our proposal is more efficient than DTE by 30% when m = 7500 in Figure 7.7(c) because g is more efficient than the hash function. However, the situation is reversed in Figure 7.7(d): DTE requires 23% fewer cycles than our scheme when m = 7500, although DTE in this setting is not as efficient as DTE in Figure 7.7(a).

The performance of the two schemes with respect to the number of cycles is compared in Figure 7.7. Performance is also measured by the required area of the implementation. The overall area of our scheme compared to DTE for all four instantiations is shown in Figure 7.8. The figure shows that our scheme is more lightweight, as it always requires less area than DTE: by 83.7%, 90%, 71.8% and 71.8% for scenarios (a), (b), (c) and (d), respectively. This makes it efficient for hardware-restricted devices. In addition, as noted in Table 7.1, our design is the only proposal that is an online scheme.

Figure 7.8: Area overhead for different instantiations. (Bar chart of area in GE, on a scale of 10^4, for our proposal and DTE in scenarios (a) to (d).)
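The estimates for our proposal can be approximated with a simple cost model over the values in Table 7.2. The sketch below is a rough reconstruction, not the method used to produce the figures. It assumes that each block of our proposal costs one g call and two EK calls, with one further g call at the start and one at the end (as read from Figure 7.4). It also assumes that the area of a scheme is the sum of the areas of its components, with DTE adding h to g and EK. The dictionary keys and function names are hypothetical.

```python
# Latency in cycles and area in GE, copied from Table 7.2.
LATENCY = {"g_s1": 512, "g_s8": 64, "aes": 226, "aes_protected": 266}
AREA = {
    "g": 3.7, "aes": 2400, "aes_protected": 11031,
    "sha256_fast": 21670, "sha256_small": 6125, "keccak256": 56713,
}


def cycles_ours(m: int, g_impl: str, e_impl: str) -> int:
    """Assumed model: per block one g call and two E calls, plus an initial and a final g call."""
    g_c, e_c = LATENCY[g_impl], LATENCY[e_impl]
    return m * (g_c + 2 * e_c) + 2 * g_c


def area_ours(e_impl: str) -> float:
    """Assumed model: area is the sum of the g and E cores."""
    return AREA["g"] + AREA[e_impl]


def area_dte(e_impl: str, h_impl: str) -> float:
    """Assumed model: DTE needs the same cores plus the hash function h."""
    return area_ours(e_impl) + AREA[h_impl]


def area_saving_percent(e_impl: str, h_impl: str) -> float:
    return 100 * (1 - area_ours(e_impl) / area_dte(e_impl, h_impl))


if __name__ == "__main__":
    print(cycles_ours(7500, "g_s1", "aes_protected"))
    print(round(area_saving_percent("aes_protected", "keccak256"), 1))
```

Under these assumptions the model gives about 7.8, 3.9 and 7.2 million cycles for our proposal at m = 7500 in scenarios (a), (b)/(c) and (d). The area savings come out at 83.7%, 90.0% and 71.8%, in line with the percentages quoted above. No cycle model for DTE is attempted here because its per-block call counts are not stated in this section.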

7.9 Conclusion

In this chapter, we have studied leakage-resilient AE schemes based on fresh re-keying. Most block cipher-based leakage-resilient schemes fail to preserve sufficient security in the presence of both leakage and nonce misuse. In addition, only a few proposals secure both communicating ends, sender and receiver, against side channel attacks. However, such proposals require the whole message to be collected before the encryption or decryption process, which makes the scheme unable to perform online computations.

We proposed the first block cipher-based mode that provides online, leakage-resilient and misuse-resistant authenticated encryption on both the sender and receiver. The scheme provides these features using a minimum number of primitives. The leakage resilience is against DPA attacks only, since fresh re-keying fundamentally does not provide security against SPA attacks.

However, combining all these important features in one design has a trade-off in terms of performance. Our proposed scheme uses the SCA-secure g function during the processing of each block. We suggested the solution proposed by Taha et al. [167] to instantiate g and the AES block cipher to instantiate EK. This solution imposes a negligible area overhead and significantly improves the execution time. This suggested instantiation is very practical and suitable for restricted-area hardware. Compared to DTE [160], our design is slower under certain hardware instantiations, while its performance is equal to or better than that of DTE under other instantiations. Importantly, our design always requires less area than DTE and provides online computation.

Chapter 8

Conclusions and future research

Although AE schemes can be constructed using different primitives, such as stream ciphers, sponge ciphers or other dedicated designs, block cipher modes are the most common type of AE scheme. For example, more than 50% of the CAESAR submissions are block cipher modes of operation. This thesis analysed certain block cipher modes of operation designed to provide authenticated encryption.

The overall aim of this thesis was to analyse the security of certain authenticated encryption block cipher modes submitted to CAESAR. Four block cipher modes were investigated in this thesis: ++AE mode, OTR mode, XEX/XE mode and AEZ mode. XEX and XE modes are not AE modes, but six CAESAR AE proposals use the concept of XEX/XE modes (OCB, AES-OTR, AES-COPA, ELmD, SHELL and AEZ). Through the analysis of these modes, the thesis achieved the following objectives:

1. Identify flaws in the modes and propose methods to exploit these flaws either in key recovery attacks or in forgery attacks to breach the integrity mechanism. When flaws are identified, suggest methods where feasible to adjust the design to prevent these flaws being exploited.

2. Exploit weaknesses in an AE mode to apply or improve the application of fault attacks. While all block ciphers are vulnerable to fault attacks, the mode of operation can exacerbate the vulnerability by reducing the number of faults required to achieve the attack goals.


3. Propose a new AE block cipher mode that is resilient against side channel attacks without losing the potential to perform online computation. (Note that existing AE schemes which are resilient to side channel attacks require knowledge of the entire message before encryption/decryption at either the sender or the receiver side.)

This research used mathematical analysis and computer simulations to verify the identified weaknesses and proposed attacks. The software simulations were performed using the C programming language with the GNU GCC compiler on a standard desktop computer. The source code and test vectors for the targeted CAESAR schemes were mainly obtained via SUPERCOP [184].

The main contributions of this thesis are reviewed in Section 8.1. Section 8.2 highlights high-level remarks concluding this research, and Section 8.3 discusses avenues for future research.

8.1 Review of contributions

This section summarises the contributions of this thesis. The first contribution was the analysis of the ++AE authenticated encryption mode, which identified several weaknesses in the internal structure of the mode. Then, the instantiation of masks in the generic OTR mode was revisited, which showed that the current instantiation limits the generality of the mode and, more seriously, enables forgery attacks for certain primitive polynomials used to update masks. Thirdly, fault attacks against XEX mode were investigated, and the relevance of this analysis to certain AE candidates in the CAESAR competition was explored. Fourthly, we applied differential fault attacks against AEZ mode and exploited the structure of AEZ to minimise the total number of faults required for key recovery, resulting in a more practical attack than the direct application of previously published attacks. Finally, a new block cipher mode was proposed to provide authenticated encryption with leakage resilience at both the encryption and decryption ends, without compromising the potential to perform online computation.

8.1.1 Analysis of ++AE mode

This contribution was an extensive analysis of the security of the ++AE authenticated encryption mode. This contribution achieved objective 1.

The most serious flaw found in the integrity mechanism of ++AE is that it does not verify the most significant bit of any plaintext block except the last block. This is a fundamental flaw that allows a malicious sender or receiver to dispute the content of a validly sent message by claiming that the message is different in the most significant bit of any subset of the message blocks, except the last block which is the integrity check vector. Further weaknesses were identified that can be exploited in chosen plaintext forgery attacks. These weaknesses are due to the fact that, for certain input block values, the encryption of four consecutive plaintext blocks will result in the values of the internal chaining values, Qi and Ii being repeated. This flaw enables an attacker to choose a plaintext and use the corresponding ciphertext to construct multiple forged messages that will be accepted as authentic messages during decryption. An attacker can either delete and insert particular ciphertext blocks in the original cipher message or reorder them. The success rate for this forgery attack is 100%. Sixteen specific four-block plaintext groups were identified that guarantee a chosen plaintext forgery attack, provided the chosen plaintext message includes at least two groups of four consecutive blocks selected from the 16 specific values. Additionally, we showed that over 450 groups of four consecutive plaintext blocks can be used to generate a guaranteed forgery using a single group of plaintext blocks, significantly increasing the scope of this attack. Finally, verification results to demonstrate the forgery attacks were given using 128-bit AES in ++AE mode using the C language and the GNU GCC compiler. In every case, decrypting the forged ciphertext gave the correct ICV, and so the modification would not be detected by a receiver. This analysis verified that the mechanism used in ++AE is not suitable to provide integrity assurance. 
The flaws are based on the properties of the cross-chaining structure used in ++AE, and exist regardless of the choice of underlying block cipher and independently of the chosen key.

8.1.2 Analysis of OTR mode

Bost and Sanders [124] showed trivial collisions between the input masks in the previous version of OTR [9] when special forms of primitive polynomial are used. They suggested the use of different masking coefficients chosen from the set given by Rogaway in [10]. Accordingly, Minematsu followed their suggestion and updated the OTR instantiation masks [8].

This work revisited the new OTR instantiation coefficients and investigated the extent to which these new masks can be used with different choices of primitive polynomial in the generic OTR mode. It showed that the new instantiation of OTR uses masking coefficients that are not always distinct when used with certain forms of primitive polynomial. We identified mask collisions that can be exploited to perform forgery attacks against the scheme.

Two alternative approaches for masking coefficients were proposed, so that OTR can use the same set of coefficients for any block size and any primitive polynomial, without affecting the security provided by this scheme. That is, this work generalises the OTR mode using the technique of doubling masking, and removes the requirement for the user to perform extensive calculations in advance to ensure that the masks do not overlap. Note that this work does not invalidate the security of the OTR mode [8] nor of the CAESAR version, AES-OTR [117]. However, it provides users with more flexibility and trust in the generic OTR mode, which can be used in any finite field. This should make OTR more robust and less prone to flaws based on poor component selection. This contribution satisfied objective 1.
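To make the idea of mask distinctness concrete, the sketch below enumerates two families of masks in a toy field and reports the values they share. Everything here is an assumption made for illustration: the field GF(2^8), the polynomial 0x11D, the base mask and the two families (2^i L and 3·2^i L) are not the OTR coefficients, and the helper names are hypothetical.

```python
# Toy illustration in GF(2^8); these are NOT the OTR masking coefficients.
N_BITS = 8
POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, a primitive polynomial (assumed example)


def double(x):
    """Multiply x by the element 2 (the polynomial 'x') in GF(2^N_BITS)."""
    x <<= 1
    if x >> N_BITS:      # degree reached N_BITS, so reduce modulo POLY
        x ^= POLY
    return x


def triple(x):
    """Multiply x by the element 3 = 2 + 1."""
    return double(x) ^ x


def mask_family(start, count):
    """Return the masks start, 2*start, 4*start, ... (count values)."""
    masks = []
    for _ in range(count):
        masks.append(start)
        start = double(start)
    return masks


def collisions(family_a, family_b):
    """Mask values that occur in both families."""
    return set(family_a) & set(family_b)


base = 0x53  # hypothetical secret base mask L (any nonzero value)
for count in (16, 32):
    shared = collisions(mask_family(base, count),
                        mask_family(triple(base), count))
    print(count, "masks per family ->", len(shared), "shared values")
```

In this toy field the element 3 equals 2^25, so the two families stay disjoint only while the block index remains below 25. Whether such an overlap occurs within the permitted message length depends on the polynomial, which is why a fixed set of coefficients has to be checked against each choice of field.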

8.1.3 Fault analysis of XEX mode

This contribution was an investigation of fault attacks on XEX mode, assuming that the mask updating function is a doubling operation on a finite field, which is a standard choice in XEX. The research was motivated by the fact that the nonce cannot be repeated in nonce-based AE; thus conventional fault analyses, such as differential fault analysis and statistical fault analysis, cannot be applied. This contribution achieved objective 2.

Attacks were proposed using different fault models. Firstly, fault attacks targeted the masks in XEX mode to make them zero using either permanent, transient or instruction-skip faults. These attacks exploited a feature of the standard primitive polynomial $f(x) = x^{128} + x^7 + x^2 + x + 1$ used for mask updating, thereby converting XEX to ECB mode. Subsequently, standard differential fault analysis (DFA) can be applied to recover the secret key. The properties of the same primitive polynomial were further exploited in statistical fault analysis using only faulty ciphertext messages. Values of the secret masks were first retrieved before finding the secret key using the same faulty messages. This analysis was verified using both statistical analysis and software simulations in C.

Five authenticated encryption modes (four submitted to CAESAR) are based on XEX/XE modes and use the same primitive polynomial. This makes them vulnerable to our attacks. These modes are: COPA/AES-COPA, ELmD, SHELL, OCB2 and OTR/AES-OTR. Although both OCB3 and AEZ modes are based on XEX/XE modes, they are not vulnerable to our attacks since each uses a different mechanism to update masks.

Most importantly, this work demonstrated that it is the mask updating function $f(x) = x^{128} + x^7 + x^2 + x + 1$ that makes XEX vulnerable to these fault attacks, rather than the XEX mode itself. Hence, an efficient way to preclude these attacks is to replace this primitive polynomial with other functions that are less sparse and more conservative, such as the ones listed in Section 5.7 of this thesis.
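The collapse of XEX to ECB when the masks are zero can be seen in a few lines. The sketch below is a minimal model under stated assumptions: `toy_cipher` is a hypothetical hash-based Feistel network standing in for the block cipher (it is not AES), `first_mask` stands for the initial secret mask whose derivation from the key and nonce is not modelled, and the fault is represented simply by passing a zero mask.

```python
import hashlib

BLOCK_BITS = 128
BLOCK_MASK = (1 << BLOCK_BITS) - 1
REDUCTION = 0x87  # low-order terms x^7 + x^2 + x + 1 of the polynomial


def double(mask):
    """Mask update: multiply by x modulo x^128 + x^7 + x^2 + x + 1."""
    mask <<= 1
    if mask >> BLOCK_BITS:
        mask = (mask & BLOCK_MASK) ^ REDUCTION
    return mask


def toy_cipher(key, block):
    """Hypothetical stand-in for E_K (NOT AES): a 4-round Feistel network."""
    left, right = block >> 64, block & ((1 << 64) - 1)
    for rnd in range(4):
        data = key + bytes([rnd]) + right.to_bytes(8, "big")
        f_out = int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
        left, right = right, left ^ f_out
    return (left << 64) | right


def xex_encrypt(key, first_mask, blocks):
    """C_i = E_K(P_i xor mask_i) xor mask_i, with mask_{i+1} = 2 * mask_i."""
    out, mask = [], first_mask
    for p in blocks:
        out.append(toy_cipher(key, p ^ mask) ^ mask)
        mask = double(mask)
    return out


def ecb_encrypt(key, blocks):
    """Plain ECB: each block is encrypted independently with no mask."""
    return [toy_cipher(key, p) for p in blocks]


key = b"k" * 16
blocks = [5, 5, 7]
# Doubling zero gives zero, so a zeroed first mask removes every later mask.
print(xex_encrypt(key, 0, blocks) == ecb_encrypt(key, blocks))
```

With a zero mask, repeated plaintext blocks produce repeated ciphertext blocks and every block cipher call sees an attacker-known input, which is the setting that key-recovery fault analysis needs. With a nonzero `first_mask`, the two equal plaintext blocks in the example encrypt to different ciphertext blocks.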

8.1.4 Fault analysis of AEZ mode

This research analysed the security of the authenticated encryption scheme AEZ v4.2 and v5 against differential fault attacks. This contribution achieved objective 2. AEZ [11] is a block cipher mode based on AES, which uses three 128-bit keys, I, J and L, and provides authenticated encryption with associated data (AEAD). AEZ is a candidate in the third round of the CAESAR competition [6]. The scheme claims to provide strong security and usability properties; in particular, it can potentially work without a nonce or with a repeated nonce.

Under these conditions, this work identified the best place to apply differential fault attacks. Direct application of fault attacks required at least six fault injections to uniquely determine the three 128-bit keys. However, in this thesis, we exploited the structure of AEZ v4.2 to allow an improved attack, which reduces the number of fault injections to three with all three keys still uniquely determined. We also showed that the same analysis can be applied to the current version, AEZ v5, but the number of faults needed for successful key retrieval is four rather than the three required for v4.2. This is mainly due to changes in the mask updating function.

This work also outlined two suggestions to prevent the exploitation of the structure of AEZ that allowed us to minimise the number of faults. These suggestions will prevent an attacker from correlating the three keys and minimising the key search, but the direct application of fault analysis, as discussed in Section 6.4.1, still applies; it cannot be avoided with the current structure of AEZ.

8.1.5 An online leakage-resilient AE mode

In this contribution, we proposed a block cipher mode of operation that provides authenticated encryption, online operation, leakage resilience and misuse resistance at both the sender and receiver ends. Most prior block cipher-based AE constructions either provide leakage resilience only on the sender side (and not the receiver side), or provide leakage resilience and misuse resistance on both sides but are not suitable for online computation. Some designs require the message to be fully collected before the encryption process starts, while others require verification of the integrity of the whole ciphertext message before the decryption process begins. This motivated the design of this new block cipher mode of operation. This contribution satisfied objectives 1, 2 and 3.

The structure followed the fresh re-keying technique introduced in [153, 169, 170, 154], where each key is used only once to encrypt or decrypt a single block. Internally, the mode involved three components: a block cipher E (e.g., AES); a mixing function g that is leakage-resilient but not cryptographically secure; and the XOR operation, which uses the output of the block cipher as a keystream. When encrypting the final message block, care must be taken to provide a non-malleable MAC tag.

An instantiation for both E and g was suggested that is practical and suitable for restricted-area hardware. Compared to DTE [160], our design was slower under certain hardware instantiations, but performed as fast as or even faster than DTE in others. Importantly, our design always required less hardware area than DTE and provided online computation.

8.2 Concluding remarks

This section outlines some remarks about the security of modes of operation, which may help as guidelines for future designs. Firstly, the security of the mode of operation used with a block cipher is as important as the security of the block cipher itself. Design flaws in the mode of operation can compromise the entire AE scheme. Although ++AE imposes a negligible overhead compared to ECB and has several important features, weaknesses in the chaining mechanism intended to propagate ciphertext errors for integrity assurance completely discredit the ++AE scheme. Demonstrating the flaws in ++AE mode is important, so that it is not used in real-life applications. This cross-chaining mechanism, which internally uses XOR and addition modulo $2^n$ together with an integrity check vector (ICV), is not a suitable method to provide integrity assurance. Alternatively, designers can follow the similar but more secure approach of EPBC [76], which still uses cross-chaining and an ICV to provide integrity assurance.

Another aspect highlighted in this thesis is that XEX and XE modes can be used to provide efficient AE schemes. XEX and XE modes are fully parallel, fully online, use a single key and require only two (one in XE) additional XOR operations with masks compared to ECB. Compared to conventional AE schemes such as GCM and CCM modes, XEX- and XE-based AE modes reduce the number of block cipher calls by almost half, which is a significant improvement. However, these modes require mask uniqueness, and any flaw in the mode during mask updating can be exploited in forgery attacks. This seems to be a simple requirement, but assuring mask distinctness is a difficult and time-consuming task. For example, Minematsu [9] and Hoang et al. [138] chose mask updating functions for OTR and AEZ, respectively, that result in mask collisions, as later shown in [124] and [142], respectively.
Chapter 4 proposes a generic solution for masking that enables the XE-based OTR mode to work in any finite field without invalidating the security claimed. The same technique can be used for OCB2 [10].

In nonce-reuse settings, nonce-based AE schemes are vulnerable to differential fault analysis (DFA). However, such analysis can be avoided if the AE scheme is used only in nonce-respecting scenarios where the nonce or IV never repeats. Both XE and XEX have the potential to avoid DFA in the nonce-respecting setting, as changing the nonce should prevent an attacker from generating two identical inputs to the underlying block cipher, a necessary condition to apply DFA. However, this thesis showed that this is not always the case. The primitive polynomial used for updating masks in XEX/XE modes can be the weak point, and faults can be applied first to recover the masks before applying further differential fault analysis to recover the key. Primitive polynomials for updating masks must therefore be chosen carefully.

Regarding statistical fault analysis (SFA), this is only applicable if two conditions are met: the inputs to the block cipher should be different, and the faulty ciphertexts obtained should be the direct outputs of the block cipher [19]. Under these conditions, XE is vulnerable to SFA, whereas XEX is not. However, this thesis again showed that the primitive polynomial used for updating masks can degrade the mode and enable SFA to apply against XEX. Alternative techniques for the mask updating function are outlined in Section 5.7.

A final remark on secure AE block cipher modes is that increasing the key size does not proportionally increase the complexity of fault attacks against the mode of operation. The 128-bit secret key in AES can be retrieved using only two faults [16, 102]. AEZ has a key three times as long, yet this thesis showed that the entire 384-bit key in AEZ v4.2 can be uniquely retrieved using only three faults. Besides the key size, the structure of the mode is important in increasing the security of an AE scheme against fault attacks. Updating AEZ from v4.2 to v5 increases the number of faults required to determine the secret key from three to four, as shown in Section 6.5, because the mode changes the encryption function masks, particularly the masks for the AES4 function. Section 6.7 proposes other mechanisms that make fault analysis on AEZ more challenging.

8.3 Future research

The areas discussed in this thesis expose certain avenues for future research. Examples of these topics are as follows:

• Assessing the security of block cipher modes of operation is complex, time-consuming and error-prone. Proofs of security for modes of operation are usually difficult, and use many assumptions that may not be realistic. It would be an interesting topic to develop tools that automatically evaluate the security of AE modes of operation. Such tools would provide users with additional trust in the proposed modes and facilitate efforts to analyse schemes in cryptographic competitions, like CAESAR. Automatic analysis of cryptographic schemes started several years ago. For example, Courant et al. [185] presented an automated analysis for generic asymmetric encryption schemes. Similarly, Gagné et al. [186] used such a technique to automate the security proofs for encryption modes of operation. Proposing new tools to automatically analyse authenticated encryption modes is an area for future research.

• Section 5.7 proposes a few countermeasures to avoid the fault attacks proposed in Chapter 5. These countermeasures can be further analysed and compared regarding their efficiency. Additionally, new countermeasures may be suggested that aim to update masks in XEX/XE modes in an efficient but difficult-to-analyse manner.

• Chapters 5 and 6 analyse the security of XEX and AEZ modes, respectively, against fault attacks. However, these analyses are established using mathematical computation and verified using software simulation. Exploring approaches to extend and verify such attacks on real hardware might be a future direction for research. Such verification would provide an accurate estimate of the viability of these attacks with current technology.

• Chapter 7 highlights the need for an efficient design that can be used as a g function. The g function is not required to be a strong PRF, but is required to ensure the distinctness of key updates. Existing proposals for g are limited and expensive, especially in terms of execution time, and tend to build g as a strong pseudorandom function. Hence, designing a more efficient g structure remains a challenge, regardless of the design of the mode of operation. Addressing this challenge would significantly improve our design’s online computation cost and may even allow parallelism. In addition, efficient block cipher-based modes to provide integrity assurance can be constructed. Exploring new ideas in this direction and deeper experimental analysis of our proposal are topics for further research.

Bibliography

[1] A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography. CRC Press, 1996.

[2] M. Dworkin, “Recommendation for block cipher modes of operation: The CCM mode for authentication and confidentiality,” NIST Special Publication 800-38C, 2004.

[3] M. Dworkin, “Recommendation for block cipher modes of operation: Galois/counter mode (GCM) and GMAC,” NIST Special Publication 800-38D, 2007.

[4] M. Saarinen, “Cycling attacks on GCM, GHASH and other polynomial MACs and hashes,” in FSE, vol. 7549 of Lecture Notes in Computer Science, pp. 216–225, Springer, 2012.

[5] D. Bernstein, “Cryptographic competitions: Disasters,” 2014. http://competitions.cr.yp.to/disasters.html. Accessed December 22, 2017.

[6] D. Bernstein, “Cryptographic competitions: CAESAR,” 2014. http://competitions.cr.yp.to/caesar-submissions.html. Accessed October 11, 2017.

[7] F. Recacha, “++AE v1.1,” 2014. http://competitions.cr.yp.to/caesar-call.html. Accessed June 28, 2017.

[8] K. Minematsu, “Parallelizable rate-1 authenticated encryption from pseudorandom functions.” Cryptology ePrint Archive, Report 2013/628, 2013. http://eprint.iacr.org/.

[9] K. Minematsu, “Parallelizable rate-1 authenticated encryption from pseudorandom functions,” in Advances in Cryptology–EUROCRYPT, vol. 8441 of Lecture Notes in Computer Science, pp. 275–292, Springer, 2014.

[10] P. Rogaway, “Efficient instantiations of tweakable blockciphers and refinements to modes OCB and PMAC,” in ASIACRYPT, vol. 3329 of Lecture Notes in Computer Science, pp. 16–31, Springer, 2004.

[11] V. Hoang, T. Krovetz, and P. Rogaway, “Robust authenticated-encryption AEZ and the problem that it solves,” in Advances in Cryptology - EUROCRYPT 2015, Part I, vol. 9056 of Lecture Notes in Computer Science, pp. 15–44, Springer, 2015.

[12] J. Jean, “TikZ for cryptographers,” 2016. https://www.iacr.org/authors/tikz/. Accessed December 21, 2017.

[13] A. Bogdanov, L. R. Knudsen, G. Leander, C. Paar, A. Poschmann, M. J. B. Robshaw, Y. Seurin, and C. Vikkelsoe, “PRESENT: an ultra-lightweight block cipher,” in CHES, vol. 4727 of Lecture Notes in Computer Science, pp. 450–466, Springer, 2007.

[14] L. Goubin and J. Patarin, “DES and differential power analysis (the "duplication" method),” in CHES, vol. 1717 of Lecture Notes in Computer Science, pp. 158–172, Springer, 1999.

[15] P. Kocher, J. Jaffe, B. Jun, and P. Rohatgi, “Introduction to differential power analysis,” J. Cryptographic Engineering, vol. 1, no. 1, pp. 5–27, 2011.

[16] D. Mukhopadhyay, “An improved fault based attack of the Advanced Encryption Standard,” in AFRICACRYPT, vol. 5580 of Lecture Notes in Computer Science, pp. 421–434, Springer, 2009.

[17] M. Bellare and C. Namprempre, “Authenticated encryption: Relations among notions and analysis of the generic composition paradigm,” in ASIACRYPT, vol. 1976 of Lecture Notes in Computer Science, pp. 531–545, Springer, 2000.

[18] T. Fuhr, É. Jaulmes, V. Lomné, and A. Thillard, “Fault attacks on AES with faulty ciphertexts only,” in FDTC, pp. 108–118, IEEE Computer Society, 2013.

[19] C. Dobraunig, M. Eichlseder, T. Korak, V. Lomné, and F. Mendel, “Statistical fault attacks on nonce-based authenticated encryption schemes,” in ASIACRYPT, vol. 10031 of Lecture Notes in Computer Science, pp. 369–395, 2016.

[20] C. Paar and J. Pelzl, Understanding Cryptography - A Textbook for Students and Practitioners. Springer, 2010.

[21] W. Stallings, Cryptography and network security - principles and practice (3. ed.). Prentice Hall, 2003.

[22] FIPS PUB, “DES modes of operation,” National Bureau of Standards, National Technical Information Service, Springfield, VA, 1980.

[23] M. Dworkin, “Recommendation for block cipher modes of operation. methods and techniques,” tech. rep., National Inst of Standards and Technology Gaithersburg Md Computer Security Div, 2001.

[24] M. Bellare and C. Namprempre, “Authenticated encryption: Relations among notions and analysis of the generic composition paradigm,” J. Cryptology, vol. 21, no. 4, pp. 469–491, 2008.

[25] P. Rogaway, “Authenticated-encryption with associated-data,” in ACM Conference on Computer and Communications Security, pp. 98–107, ACM, 2002.

[26] D. Whiting, R. Housley, and N. Ferguson, “AES encryption & authentication using CTR mode & CBC-MAC,” IEEE P802, vol. 11, 2002.

[27] D. McGrew and J. Viega, “The security and performance of the Galois/counter mode (GCM) of operation,” in INDOCRYPT, vol. 3348 of Lecture Notes in Computer Science, pp. 343–355, Springer, 2004.

[28] P. Rogaway, M. Bellare, and J. Black, “OCB: A block-cipher mode of operation for efficient authenticated encryption,” ACM Trans. Inf. Syst. Secur., vol. 6, no. 3, pp. 365–403, 2003.

[29] T. Krovetz and P. Rogaway, “The software performance of authenticated-encryption modes,” in FSE, vol. 6733 of Lecture Notes in Computer Science, pp. 306–327, Springer, 2011.

[30] T. Krovetz and P. Rogaway, “OCB,” 2014. http://competitions.cr.yp.to/caesar-submissions.html. Accessed November 1, 2017.

[31] P. Rogaway and T. Shrimpton, “A provable-security treatment of the key-wrap problem,” in EUROCRYPT, vol. 4004 of Lecture Notes in Computer Science, pp. 373–390, Springer, 2006.

[32] M. Bellare, P. Rogaway, and D. Wagner, “The EAX mode of operation,” in FSE, vol. 3017 of Lecture Notes in Computer Science, pp. 389–407, Springer, 2004.

[33] C. Jutla, “Encryption modes with almost free message integrity,” in EUROCRYPT, vol. 2045 of Lecture Notes in Computer Science, pp. 529–544, Springer, 2001.

[34] P. Kocher, “Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems,” in CRYPTO, vol. 1109 of Lecture Notes in Computer Science, pp. 104–113, Springer, 1996.

[35] D. Boneh, R. DeMillo, and R. Lipton, “On the importance of checking cryptographic protocols for faults (extended abstract),” in EUROCRYPT, vol. 1233 of Lecture Notes in Computer Science, pp. 37–51, Springer, 1997.

[36] H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Forgery attacks on ++AE authenticated encryption mode,” in Proceedings of the Australasian Computer Science Week Multiconference, Canberra, Australia, February 2-5, 2016, pp. 33:1–9, ACM, 2016.

[37] H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “A fundamental flaw in the ++AE authenticated encryption mode,” Journal of Mathematical Cryptology, vol. 12, no. 1, pp. 37–42, 2018.

[38] H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Tweaking generic OTR to avoid forgery attacks,” in Proceedings of the Applications and Techniques in Information Security - 6th International Conference, ATIS 2016, Cairns, QLD, Australia, October 26-28, 2016 (L. Batten and G. Li, eds.), vol. 651 of Communications in Computer and Information Science, pp. 41–53, 2016.

[39] H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Fault attacks on XEX mode with application to certain authenticated encryption modes,” in Proceedings of the Information Security and Privacy - 22nd Australasian Conference, ACISP 2017, Auckland, New Zealand, July 3-5, 2017, Part I (J. Pieprzyk and S. Suriadi, eds.), vol. 10342 of Lecture Notes in Computer Science, pp. 285–305, Springer, 2017.

[40] H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “A fault-based attack on AEZ v4.2,” in IEEE Trustcom/BigDataSE/ICESS, Sydney, Australia, August 1-4, 2017, pp. 634–641, IEEE, 2017.

[41] H. Qahur Al Mahri, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Fault analysis of AEZ,” Concurrency and Computation: Practice and Experience, 2018. https://doi.org/10.1002/cpe.4785.

[42] D. Stinson, Cryptography: Theory and practice. CRC press, 2005.

[43] Data Encryption Standard, “Federal information processing standards publication 46,” National Bureau of Standards, US Department of Commerce, 1977.

[44] J. Daemen and V. Rijmen, The Design of Rijndael: AES - The Advanced Encryption Standard. Information Security and Cryptography, Springer, 2002.

[45] H. Feistel, “Cryptography and computer privacy,” Scientific American, vol. 228, no. 5, pp. 15–23, 1973.

[46] L. Knudsen and M. Robshaw, The Block Cipher Companion. Information Security and Cryptography, Springer, 2011.

[47] C. E. Shannon, “Communication theory of secrecy systems,” Bell Labs Technical Journal, vol. 28, no. 4, pp. 656–715, 1949.

[48] G. V. Bard, “The vulnerability of SSL to chosen plaintext attack,” IACR Cryptology ePrint Archive, vol. 2004, p. 111, 2004.

[49] P. Rogaway, “Evaluation of some blockcipher modes of operation,” Cryptography Research and Evaluation Committees (CRYPTREC) for the Government of Japan, 2011.

[50] D. Knuth, The Art of Computer Programming: Sorting and Searching, vol. 3. Addison-Wesley, 1973.

[51] H. Krawczyk, M. Bellare, and R. Canetti, “HMAC: Keyed-hashing for message authentication,” RFC, vol. 2104, pp. 1–11, 1997.

[52] J. Black, S. Halevi, H. Krawczyk, T. Krovetz, and P. Rogaway, “UMAC: Fast and secure message authentication,” in CRYPTO, vol. 1666 of Lecture Notes in Computer Science, pp. 216–233, Springer, 1999.

[53] M. Burrows, M. Abadi, and R. Needham, “A logic of authentication,” in SOSP, pp. 1–13, ACM, 1989.

[54] M. Bellare, R. Canetti, and H. Krawczyk, “Keying hash functions for message authentication,” in CRYPTO, vol. 1109 of Lecture Notes in Computer Science, pp. 1–15, Springer, 1996.

[55] M. Dworkin, “Recommendation for block cipher modes of operation: The CMAC mode for authentication,” NIST Special Publication 800-38B, 2016.

[56] T. Iwata and K. Kurosawa, “OMAC: One-key CBC MAC,” in FSE, vol. 2887 of Lecture Notes in Computer Science, pp. 129–153, Springer, 2003.

[57] J. Black and P. Rogaway, “A block-cipher mode of operation for parallelizable message authentication,” in EUROCRYPT, vol. 2332 of Lecture Notes in Computer Science, pp. 384–397, Springer, 2002.

[58] J. Black and P. Rogaway, “CBC MACs for arbitrary-length messages: The three-key constructions,” in CRYPTO, vol. 1880 of Lecture Notes in Computer Science, pp. 197–215, Springer, 2000.

[59] T. Moreau, “The Frogbit cipher, a data integrity algorithm. eSTREAM, ECRYPT stream cipher project, report 2005/001, 2005.” http://www.ecrypt.eu.org/stream. Accessed November 1, 2017.

[60] A. Braeken, J. Lano, N. Mentens, B. Preneel, and I. Verbauwhede, “SFINKS: A synchronous stream cipher for restricted hardware environments,” in SKEW-Symmetric Key Encryption Workshop, vol. 55, p. 72, 2005.

[61] G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche, “Keccak family main document,” Submission to NIST (Round 2), vol. 3, p. 30, 2009.

[62] N. Mouha, B. Mennink, A. V. Herrewege, D. Watanabe, B. Preneel, and I. Verbauwhede, “Chaskey: An efficient MAC algorithm for 32-bit microcontrollers,” in Selected Areas in Cryptography, vol. 8781 of Lecture Notes in Computer Science, pp. 306–323, Springer, 2014.

[63] R. Atkinson, “IP encapsulating security payload (ESP),” RFC, vol. 1827, pp. 1–12, 1995.

[64] S. Bellovin, “Problem areas for the IP security protocols,” in USENIX Security Symposium, USENIX Association, 1996.

[65] J. Degabriele and K. Paterson, “Attacking the IPsec standards in encryption-only configurations,” in IEEE Symposium on Security and Privacy, pp. 335–349, IEEE Computer Society, 2007.

[66] S. Goldwasser and S. Micali, “Probabilistic encryption,” J. Comput. Syst. Sci., vol. 28, no. 2, pp. 270–299, 1984.

[67] O. Goldreich, “A uniform-complexity treatment of encryption and zero-knowledge,” J. Cryptology, vol. 6, no. 1, pp. 21–53, 1993.

[68] D. Dolev, C. Dwork, and M. Naor, “Nonmalleable cryptography,” SIAM J. Comput., vol. 30, no. 2, pp. 391–437, 2000.

[69] M. Bellare, J. Kilian, and P. Rogaway, “The security of the cipher block chaining message authentication code,” J. Comput. Syst. Sci., vol. 61, no. 3, pp. 362–399, 2000.

[70] N. Borisov, I. Goldberg, and D. Wagner, “Intercepting mobile communications: The insecurity of 802.11,” in MobiCom, pp. 180–189, ACM, 2001.

[71] P. Rogaway and D. Wagner, “A critique of CCM,” IACR Cryptology ePrint Archive, vol. 2003, p. 70, 2003.

[72] T. Kohno, J. Viega, and D. Whiting, “CWC: A high-performance conventional authenticated encryption mode,” in FSE, vol. 3017 of Lecture Notes in Computer Science, pp. 408–426, Springer, 2004.

[73] T. Krovetz and P. Rogaway, “The OCB authenticated-encryption algorithm,” RFC, vol. 7253, pp. 1–19, 2014.

[74] C. Mitchell, “Cryptanalysis of the EPBC authenticated encryption mode,” in IMA Int. Conf., vol. 4887 of Lecture Notes in Computer Science, pp. 118–128, Springer, 2007.

[75] B. Di, L. Simpson, H. Bartlett, E. Dawson, and K. Wong, “Correcting flaws in Mitchell’s analysis of EPBC,” in AISC, vol. 161 of CRPIT, pp. 57–60, Australian Computer Society, 2015.

[76] A. Zuquete and P. Guedes, “Efficient error-propagating block chaining,” in IMA Int. Conf., vol. 1355 of Lecture Notes in Computer Science, pp. 323–334, Springer, 1997.

[77] NIST, “Cryptographic competitions: AES,” 1997. http://competitions.cr.yp.to/aes.html. Accessed October 11, 2017.

[78] European Network of Excellence for Cryptology, “Cryptographic competitions: eSTREAM,” 2004. http://competitions.cr.yp.to/estream.html. Accessed October 11, 2017.

[79] NIST, “Cryptographic competitions: SHA-3,” 2007. http://competitions.cr.yp.to/sha3.html. Accessed October 11, 2017.

[80] “Cryptographic competitions: PHC,” 2013. https://password-hashing.net. Accessed October 11, 2017.

[81] R. Shirey, “Internet security glossary,” RFC, vol. 2828, pp. 1–212, 2000.

[82] S. Murphy, “The cryptanalysis of FEAL-4 with 20 chosen plaintexts,” J. Cryptology, vol. 2, no. 3, pp. 145–154, 1990.

[83] E. Biham and A. Shamir, “Differential cryptanalysis of DES-like cryptosystems,” in CRYPTO, vol. 537 of Lecture Notes in Computer Science, pp. 2–21, Springer, 1990.

[84] E. Biham and A. Shamir, Differential Cryptanalysis of the Data Encryption Standard. Springer, 1993.

[85] M. Matsui, “Linear cryptanalysis method for DES cipher,” in EUROCRYPT, vol. 765 of Lecture Notes in Computer Science, pp. 386–397, Springer, 1993.

[86] A. Kipnis and A. Shamir, “Cryptanalysis of the HFE public key cryptosystem by relinearization,” in CRYPTO, vol. 1666 of Lecture Notes in Computer Science, pp. 19–30, Springer, 1999.

[87] T. Jakobsen and L. Knudsen, “The interpolation attack on block ciphers,” in FSE, vol. 1267 of Lecture Notes in Computer Science, pp. 28–40, Springer, 1997.

[88] E. Biham, “New types of cryptanalytic attacks using related keys,” J. Cryptology, vol. 7, no. 4, pp. 229–246, 1994.

[89] B. Preneel, “Cryptanalysis of message authentication codes,” in ISW, vol. 1396 of Lecture Notes in Computer Science, pp. 55–65, Springer, 1997.

[90] NSA, “TEMPEST: A signal problem,” Cryptologic Spectrum, vol. 2, no. 3, 1972. https://www.nsa.gov/news-features/declassified-documents/cryptologic-spectrum/assets/files/tempest.pdf. Accessed December 16, 2017.

[91] P. Kocher, J. Jaffe, and B. Jun, “Differential power analysis,” in CRYPTO, vol. 1666 of Lecture Notes in Computer Science, pp. 388–397, Springer, 1999.

[92] L. Batina, P. Buysschaert, E. De Mulder, N. Mentens, S. Ors, B. Preneel, G. Vandenbosch, and I. Verbauwhede, “Side channel attacks and fault attacks on cryptographic algorithms,” REVUE HF, pp. 36–45, 2004.

[93] E. Biham and A. Shamir, “Power analysis of the key scheduling of the AES candidates,” in Proceedings of the second AES Candidate Conference, pp. 115–121, 1999.

[94] R. Novak, “SPA-based adaptive chosen-ciphertext attack on RSA implementation,” in Public Key Cryptography, vol. 2274 of Lecture Notes in Computer Science, pp. 252–262, Springer, 2002.

[95] S. Mangard, E. Oswald, and T. Popp, Power analysis attacks - revealing the secrets of smart cards. Springer, 2007.

[96] T. Messerges, Power analysis attacks and countermeasures for cryptographic algorithms. University of Illinois at Chicago, 2000.

[97] E. Brier, C. Clavier, and F. Olivier, “Correlation power analysis with a leakage model,” in CHES, vol. 3156 of Lecture Notes in Computer Science, pp. 16–29, Springer, 2004.

[98] S. Chari, J. Rao, and P. Rohatgi, “Template attacks,” in CHES, vol. 2523 of Lecture Notes in Computer Science, pp. 13–28, Springer, 2002.

[99] W. Schindler, K. Lemke, and C. Paar, “A stochastic model for differential side channel cryptanalysis,” in CHES, vol. 3659 of Lecture Notes in Computer Science, pp. 30–46, Springer, 2005.

[100] E. Biham and A. Shamir, “Differential fault analysis of secret key cryptosystems,” in CRYPTO, vol. 1294 of Lecture Notes in Computer Science, pp. 513–525, Springer, 1997.

[101] J. Blömer and J. Seifert, “Fault based cryptanalysis of the Advanced Encryption Standard (AES),” in Financial Cryptography, vol. 2742 of Lecture Notes in Computer Science, pp. 162–181, Springer, 2003.

[102] G. Piret and J. Quisquater, “A differential fault attack technique against SPN structures, with application to the AES and KHAZAD,” in CHES, vol. 2779 of Lecture Notes in Computer Science, pp. 77–88, Springer, 2003.

[103] M. Tunstall, D. Mukhopadhyay, and S. Ali, “Differential fault analysis of the Advanced Encryption Standard using a single fault,” in WISTP, vol. 6633 of Lecture Notes in Computer Science, pp. 224–233, Springer, 2011.

[104] F. Amiel, C. Clavier, and M. Tunstall, “Fault analysis of dpa-resistant algorithms,” in FDTC, vol. 4236 of Lecture Notes in Computer Science, pp. 223–236, Springer, 2006.

[105] J. Schmidt and C. Herbst, “A practical fault attack on square and multiply,” in FDTC, pp. 53–58, IEEE Computer Society, 2008.

[106] A. Dehbaoui, A. Mirbaha, N. Moro, J. Dutertre, and A. Tria, “Electromagnetic glitch on the AES round counter,” in COSADE, vol. 7864 of Lecture Notes in Computer Science, pp. 17–31, Springer, 2013.

[107] S. Skorobogatov and R. Anderson, “Optical fault induction attacks,” in CHES, vol. 2523 of Lecture Notes in Computer Science, pp. 2–12, Springer, 2002.

[108] S. Yen and M. Joye, “Checking before output may not be enough against fault-based cryptanalysis,” IEEE Trans. Computers, vol. 49, no. 9, pp. 967–970, 2000.

[109] J. Blömer and V. Krummel, “Fault based collision attacks on AES,” in FDTC, vol. 4236 of Lecture Notes in Computer Science, pp. 106–120, Springer, 2006.

[110] M. Rivain, “Differential fault analysis on DES middle rounds,” in CHES, vol. 5747 of Lecture Notes in Computer Science, pp. 457–469, Springer, 2009.

[111] D. Gu, J. Li, S. Li, Z. Ma, Z. Guo, and J. Liu, “Differential fault analysis on lightweight blockciphers with statistical cryptanalysis techniques,” in FDTC, pp. 27–33, IEEE Computer Society, 2012.

[112] F. Recacha, “IOBC: A new chaining method for block ciphering (in Spanish),” Proceedings: IV Reunion Espanola de Criptologia, Valladolid, pp. 85–92, 1996.

[113] F. Recacha, “IOC: The most lightweight authenticated encryption mode?,” 2013. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.298.1691&rep=rep1&type=pdf. Accessed May 1, 2016.

[114] C. Mitchell, “Analysing the IOBC authenticated encryption mode,” in Information Security and Privacy, pp. 1–12, Springer, 2013.

[115] P. Bottinelli, R. Reyhanitabar, and S. Vaudenay, “Breaking the IOC authenticated encryption mode,” in Progress in Cryptology–AFRICACRYPT, pp. 126–135, Springer, 2014.

[116] L. Bahack, “Online comment,” 2014. https://groups.google.com/forum/#!forum/crypto-competitions. Accessed May 1, 2016.

[117] K. Minematsu, “AES-OTR,” 2014. http://competitions.cr.yp.to/caesar-submissions.html. Accessed February 24, 2017.

[118] N. Datta and M. Nandi, “ElmD,” 2014. http://competitions.cr.yp.to/caesar-submissions.html. Accessed February 24, 2017.

[119] E. Andreeva, A. Bogdanov, A. Luykx, B. Mennink, E. Tischhauser, and K. Yasuda, “AES-COPA,” 2014. http://competitions.cr.yp.to/caesar-submissions.html. Accessed February 24, 2017.

[120] M. Liskov, R. Rivest, and D. Wagner, “Tweakable block ciphers,” in CRYPTO, vol. 2442 of Lecture Notes in Computer Science, pp. 31–46, Springer, 2002.

[121] T. Iwata and K. Yasuda, “BTM: A single-key, inverse-cipher-free mode for deterministic authenticated encryption,” in Selected Areas in Cryptography, vol. 5867 of Lecture Notes in Computer Science, pp. 313–330, Springer, 2009.

[122] D. Osvik, J. Bos, D. Stefan, and D. Canright, “Fast software AES encryption,” in FSE, vol. 6147 of Lecture Notes in Computer Science, pp. 75–93, Springer, 2010.

[123] T. Huang and H. Wu, “Attack on AES-OTR,” 2014. https://groups.google.com/forum/#!topic/crypto-competitions/. Accessed November 2, 2017.

[124] R. Bost and O. Sanders, “Trick or tweak: On the (in)security of OTR’s tweaks.” Cryptology ePrint Archive, Report 2016/234, 2016. http://eprint.iacr.org/.

[125] G. Seroussi, “Table of low-weight binary irreducible polynomials.” Computer Systems Laboratory, Report HPL-98-135, 1998. http://www.hpl.hp.com/techreports/98/HPL-98-135.pdf?jumpid=reg_R1002_USEN.

[126] K. Aoki, T. Iwata, and K. Yasuda, “How fast can a two-pass mode go? A parallel deterministic authenticated encryption mode for AES-NI,” 2012. http://hyperelliptic.org/DIAC/. Accessed November 8, 2017.

[127] L. Wang, “SHELL,” 2014. http://competitions.cr.yp.to/caesar-submissions.html. Accessed February 24, 2017.

[128] K. Minematsu, “Improved security analysis of XEX and LRW modes,” in Selected Areas in Cryptography, vol. 4356 of Lecture Notes in Computer Science, pp. 96–113, Springer, 2006.

[129] H. Bar-El, H. Choukri, D. Naccache, M. Tunstall, and C. Whelan, “The sorcerer’s apprentice guide to fault attacks,” IACR Cryptology ePrint Archive, vol. 2004, p. 100, 2004.

[130] D. Saha, D. Mukhopadhyay, and D. Chowdhury, “A diagonal fault attack on the Advanced Encryption Standard,” IACR Cryptology ePrint Archive, vol. 2009, p. 581, 2009.

[131] A. Barenghi, L. Breveglieri, I. Koren, and D. Naccache, “Fault injection attacks on cryptographic devices: Theory, practice, and countermeasures,” Proceedings of the IEEE, vol. 100, no. 11, pp. 3056–3076, 2012.

[132] S. Halevi and P. Rogaway, “A parallelizable enciphering mode,” in CT-RSA, vol. 2964 of Lecture Notes in Computer Science, pp. 292–304, Springer, 2004.

[133] T. Unterluggauer and S. Mangard, “Exploiting the physical disparity: Side-channel attacks on memory encryption,” in COSADE, vol. 9689 of Lecture Notes in Computer Science, pp. 3–18, Springer, 2016.

[134] P. Dey, R. Rohit, and A. Adhikari, “Full key recovery of ACORN with a single fault,” J. Inf. Sec. Appl., vol. 29, pp. 57–64, 2016.

[135] T. Fuhr, G. Leurent, and V. Suder, “Collision attacks against CAESAR candidates - forgery and key-recovery against AEZ and Marble,” in ASIACRYPT, vol. 9453 of Lecture Notes in Computer Science, pp. 510–532, Springer, 2015.

[136] C. Chaigneau and H. Gilbert, “Is AEZ v4.1 sufficiently resilient against key-recovery attacks?,” IACR Trans. Symmetric Cryptol., vol. 2016, no. 1, pp. 114–133, 2016.

[137] B. Mennink, “Weak keys for AEZ, and the external key padding attack,” in CT-RSA, vol. 10159 of Lecture Notes in Computer Science, pp. 223–237, Springer, 2017.

[138] V. Hoang, T. Krovetz, and P. Rogaway, “AEZ v4.2: Authenticated encryption by enciphering. CAESAR submission,” 2016. http://competitions.cr.yp.to/caesar-submissions.html. Accessed October 11, 2017.

[139] V. Hoang, T. Krovetz, and P. Rogaway, “AEZ v5: Authenticated encryption by enciphering. CAESAR submission,” 2017. http://competitions.cr.yp.to/caesar-submissions.html. Accessed October 11, 2017.

[140] M. Bellare, A. Boldyreva, L. Knudsen, and C. Namprempre, “Online ciphers and the hash-CBC construction,” in CRYPTO, vol. 2139 of Lecture Notes in Computer Science, pp. 292–309, Springer, 2001.

[141] J. Aumasson, S. Neves, Z. Wilcox-O’Hearn, and C. Winnerlein, “BLAKE2: simpler, smaller, fast as MD5,” in ACNS, vol. 7954 of Lecture Notes in Computer Science, pp. 119–135, Springer, 2013.

[142] X. Bonnetain, P. Derbez, S. Duval, J. Jean, G. Leurent, B. Minaud, and V. Suder, “An easy attack on AEZ,” 2017. https://www.nuee.nagoya-u.ac.jp/labs/tiwata/fse2017/slides/Rump-02.pdf. Accessed October 11, 2017.

[143] C. Giraud, “DFA on AES,” in AES Conference, vol. 3373 of Lecture Notes in Computer Science, pp. 27–41, Springer, 2004.

[144] P. Dusart, G. Letourneux, and O. Vivolo, “Differential fault analysis on A.E.S,” in ACNS, vol. 2846 of Lecture Notes in Computer Science, pp. 293–306, Springer, 2003.

[145] N. Selmane, S. Guilley, and J. Danger, “Practical setup time violation attacks on AES,” in EDCC, pp. 91–96, IEEE Computer Society, 2008.

[146] K. Schramm and C. Paar, “Higher order masking of the AES,” in CT-RSA, vol. 3860 of Lecture Notes in Computer Science, pp. 208–225, Springer, 2006.

[147] E. Prouff and M. Rivain, “Masking against side-channel attacks: A formal security proof,” in EUROCRYPT, vol. 7881 of Lecture Notes in Computer Science, pp. 142–159, Springer, 2013.

[148] K. Tiri, M. Akmal, and I. Verbauwhede, “A dynamic and differential CMOS logic with signal independent power consumption to withstand differential power analysis on smart cards,” in Solid-State Circuits Conference, 2002. ESSCIRC 2002. Proceedings of the 28th European, pp. 403–406, IEEE, 2002.

[149] T. Messerges, “Using second-order power analysis to attack DPA resistant software,” in CHES, vol. 1965 of Lecture Notes in Computer Science, pp. 238–251, Springer, 2000.

[150] H. Groß, S. Mangard, and T. Korak, “An efficient side-channel protected AES implementation with arbitrary protection order,” in CT-RSA, vol. 10159 of Lecture Notes in Computer Science, pp. 95–112, Springer, 2017.

[151] M. Abdalla, S. Belaïd, and P. Fouque, “Leakage-resilient symmetric encryption via re-keying,” in CHES, vol. 8086 of Lecture Notes in Computer Science, pp. 471–488, Springer, 2013.

[152] S. Dziembowski and K. Pietrzak, “Leakage-resilient cryptography,” in FOCS, pp. 293–302, IEEE Computer Society, 2008.

[153] P. Kocher, “Leak-resistant cryptographic indexed key update,” Mar. 25 2003. US Patent 6,539,092.

[154] S. Belaïd, F. Santis, J. Heyszl, S. Mangard, M. Medwed, J. Schmidt, F. Standaert, and S. Tillich, “Towards fresh re-keying with leakage-resilient PRFs: Cipher design principles and analysis,” J. Cryptographic Engineering, vol. 4, no. 3, pp. 157–171, 2014.

[155] K. Pietrzak, “A leakage-resilient mode of operation,” in EUROCRYPT, vol. 5479 of Lecture Notes in Computer Science, pp. 462–482, Springer, 2009.

[156] O. Pereira, F. Standaert, and S. Vivek, “Leakage-resilient authentication and encryption from symmetric cryptographic primitives,” in ACM Conference on Computer and Communications Security, pp. 96–108, ACM, 2015.

[157] M. Medwed, F. Standaert, and A. Joux, “Towards super-exponential side-channel security with efficient leakage-resilient PRFs,” in CHES, vol. 7428 of Lecture Notes in Computer Science, pp. 193–212, Springer, 2012.

[158] M. Agrawal, T. Bansal, D. Chang, A. Chauhan, S. Hong, J. Kang, and S. Sanadhya, “RCB: Leakage-resilient authenticated encryption via re-keying,” The Journal of Supercomputing, pp. 1–26, 2016.

[159] G. Barwell, D. Martin, E. Oswald, and M. Stam, “Authenticated encryption in the face of protocol and side channel leakage,” IACR Cryptology ePrint Archive, vol. 2017, p. 68, 2017.

[160] F. Berti, F. Koeune, O. Pereira, T. Peters, and F. Standaert, “Leakage-resilient and misuse-resistant authenticated encryption,” IACR Cryptology ePrint Archive, vol. 2016, p. 996, 2016.

[161] C. Dobraunig, M. Eichlseder, S. Mangard, F. Mendel, and T. Unterluggauer, “ISAP - towards side-channel secure authenticated encryption,” IACR Transactions on Symmetric Cryptology, vol. 2017, no. 1, pp. 80–105, 2017.

[162] P. Kocher, P. Rohatgi, and J. Jaffe, “Verifiable, leak-resistant encryption and decryption,” Feb. 26 2013. US Patent 8,386,800.

[163] G. R. Blakley et al., “Safeguarding cryptographic keys,” in Proceedings of the national computer conference, vol. 48, pp. 313–317, 1979.

[164] A. Shamir, “How to share a secret,” Commun. ACM, vol. 22, no. 11, pp. 612–613, 1979.

[165] S. Chari, C. Jutla, J. Rao, and P. Rohatgi, “Towards sound approaches to counteract power-analysis attacks,” in CRYPTO, vol. 1666 of Lecture Notes in Computer Science, pp. 398–412, Springer, 1999.

[166] T. Messerges, “Securing the AES finalists against power analysis attacks,” in FSE, vol. 1978 of Lecture Notes in Computer Science, pp. 150–164, Springer, 2000.

[167] M. Taha and P. Schaumont, “Key updating for leakage resiliency with application to AES modes of operation,” IEEE Trans. Information Forensics and Security, vol. 10, no. 3, pp. 519–528, 2015.

[168] J. Jaffe, P. Kocher, and B. Jun, “Balanced cryptographic computational method and apparatus for leak minimizational in smartcards and other cryptosystems,” Jan. 21 2003. US Patent 6,510,518.

[169] M. Medwed, C. Petit, F. Regazzoni, M. Renauld, and F. Standaert, “Fresh re-keying II: Securing multiple parties against side-channel and fault attacks,” in CARDIS, vol. 7079 of Lecture Notes in Computer Science, pp. 115–132, Springer, 2011.

[170] M. Medwed, F. Standaert, J. Großschädl, and F. Regazzoni, “Fresh re-keying: Security against side-channel and fault attacks for low-cost devices,” in AFRICACRYPT, vol. 6055 of Lecture Notes in Computer Science, pp. 279–296, Springer, 2010.

[171] F. Standaert, O. Pereira, Y. Yu, J. Quisquater, M. Yung, and E. Oswald, “Leakage resilient cryptography in practice,” in Towards Hardware-Intrinsic Security, Information Security and Cryptography, pp. 99–134, Springer, 2010.

[172] F. Abed, F. Berti, and S. Lucks, “Insecurity of RCB: leakage-resilient authenticated encryption,” IACR Cryptology ePrint Archive, vol. 2016, pp. 11–21, 2016.

[173] E. Fleischmann, C. Forler, and S. Lucks, “McOE: A family of almost foolproof on-line authenticated encryption schemes,” in FSE, vol. 7549 of Lecture Notes in Computer Science, pp. 196–215, Springer, 2012.

[174] J. Schipper, “Leakage-resilient authentication,” Master’s thesis, Utrecht University, 2011.

[175] S. Belaïd, P. Fouque, and B. Gérard, “Side-channel analysis of multiplications in GF(2^128) - application to AES-GCM,” in ASIACRYPT, vol. 8874 of Lecture Notes in Computer Science, pp. 306–325, Springer, 2014.

[176] O. Goldreich, S. Goldwasser, and S. Micali, “How to construct random functions,” J. ACM, vol. 33, no. 4, pp. 792–807, 1986.

[177] F. Standaert, B. Gierlichs, and I. Verbauwhede, “Partition vs. comparison side-channel distinguishers: An empirical evaluation of statistical tests for univariate side-channel attacks against two unprotected CMOS devices,” in ICISC, vol. 5461 of Lecture Notes in Computer Science, pp. 253–267, Springer, 2008.

[178] O. Mangasarian and B. Recht, “Probability of unique integer solution to a system of linear equations,” European Journal of Operational Research, vol. 214, no. 1, pp. 27–30, 2011.

[179] N. Veyrat-Charvillon, M. Medwed, S. Kerckhof, and F. Standaert, “Shuffling against side-channel attacks: A comprehensive study with cautionary note,” in ASIACRYPT, vol. 7658 of Lecture Notes in Computer Science, pp. 740–757, Springer, 2012.

[180] R. Karri, K. Wu, P. Mishra, and Y. Kim, “Fault-based side-channel cryptanalysis tolerant Rijndael symmetric block cipher architecture,” in DFT, pp. 427–435, IEEE Computer Society, 2001.

[181] A. Moradi, A. Poschmann, S. Ling, C. Paar, and H. Wang, “Pushing the limits: A very compact and a threshold implementation of AES,” in EUROCRYPT, vol. 6632 of Lecture Notes in Computer Science, pp. 69–88, Springer, 2011.

[182] P. Kocher, “Complexity and the challenges of securing SoCs,” in DAC, pp. 328–331, ACM, 2011.

[183] Z. Shi, C. Ma, J. Cote, and B. Wang, “Hardware implementation of hash functions,” in Introduction to Hardware Security and Trust, pp. 27–50, Springer, 2012.

[184] “eBACS: ECRYPT benchmarking of cryptographic systems.” http://bench.cr.yp.to. Accessed December 28, 2017.

[185] J. Courant, M. Daubignard, C. Ene, P. Lafourcade, and Y. Lakhnech, “Towards automated proofs for asymmetric encryption schemes in the random oracle model,” in Proceedings of the 2008 ACM Conference on Computer and Communications Security, CCS 2008, Alexandria, Virginia, USA, October 27-31, 2008 (P. Ning, P. Syverson, and S. Jha, eds.), pp. 371–380, ACM, 2008.

[186] M. Gagné, P. Lafourcade, Y. Lakhnech, and R. Safavi-Naini, “Automated security proof for symmetric encryption modes,” in Advances in Computer Science - ASIAN 2009. Information Security and Privacy, 13th Asian Computing Science Conference, Seoul, Korea, December 14-16, 2009. Proceedings (A. Datta, ed.), vol. 5913 of Lecture Notes in Computer Science, pp. 39–53, Springer, 2009.