DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2020
On Asynchronous Group Key Agreements: Tripartite Asynchronous Ratchet Trees
PHILLIP GAJLAND
KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
On Asynchronous Group Key Agreements
Phillip Gajland
A thesis submitted for the degree of
Master of Science
Theoretical Computer Science: Algorithms, Complexity and Cryptography
Supervisor (CH): Prof. Dr. Serge Vaudenay Security and Cryptography Laboratory (LASEC) EPFL – Swiss Federal Institute of Technology Lausanne
Supervisor (SE): Prof. Dr. Mats Näslund Division of Theoretical Computer Science (TCS) KTH - Royal Institute of Technology
Examiner: Prof. Dr. Johan Håstad Department of Mathematics KTH - Royal Institute of Technology
School of Electrical Engineering and Computer Science KTH Royal Institute of Technology
Stockholm / Lausanne - Spring 2020

Abstract
The subject of secure messaging has gained notable attention lately in the cryptographic community. For communications between two parties, paradigms such as the double ratchet, used in the Signal protocol, provide provably strong security guarantees such as forward secrecy and post-compromise security. Variations of the Signal protocol have enjoyed widespread adoption and are embedded in several well-known messaging services, including Signal, WhatsApp and Facebook Secret Conversations. However, providing equally strong guarantees that scale well in group settings remains somewhat less well studied and is often neglected in practice. This motivated the need for the IETF Messaging Layer Security (MLS) working group. The first continuous group key agreement (CGKA) protocol to be proposed was Asynchronous Ratcheting Trees (ART) [Cohn-Gordon et al., 2018], which formed the basis of TreeKEM [Barnes et al., 2019], the CGKA protocol currently suggested for MLS. In this thesis we propose a new asynchronous group key agreement protocol based on a one-round Tripartite Diffie-Hellman [Joux, 2000]. Furthermore, we show that our protocol can be generalised to an n-ary asynchronous ratchet tree, assuming the existence of a one-round (n + 1)-way Diffie-Hellman key exchange based on an n-multilinear map [Boneh and Silverberg, 2003]. We analyse ART, TreeKEM, and our proposals from a complexity-theoretic perspective and show that our proposals improve the cost of update operations. Finally, we present some discussion and improvements to the IETF MLS standard.
Keywords: MLS, secure messaging, cryptography.
Phillip Gajland (Stockholm / Lausanne - Spring 2020) On Asynchronous Group Key Agreements
Sammanfattning (Summary)
The subject of secure messaging has recently attracted attention in the cryptographic community. For communications between two parties, paradigms such as the Double Ratchet, used in the Signal protocol, provide strong, provable security guarantees such as forward secrecy and post-compromise security. Variations of the Signal protocol are widely used in practice and are embedded in several well-known messaging services such as Signal, WhatsApp and Facebook Secret Conversations. In contrast, protocols that offer equally strong guarantees and scale well in group settings are somewhat less studied and often neglected in practice. This motivated the need for the IETF Messaging Layer Security (MLS) working group. The first continuous group key agreement (CGKA) protocol to be proposed was Asynchronous Ratcheting Trees (ART) [Cohn-Gordon et al., 2018], which laid the foundation for TreeKEM [Barnes et al., 2019], the CGKA protocol currently proposed for MLS. In this thesis we propose a new asynchronous group key agreement protocol based on a one-round Tripartite Diffie-Hellman [Joux, 2000]. Furthermore, we show that our protocol can be generalised to n-ary trees with the help of a one-round (n + 1)-way Diffie-Hellman key exchange, based on a multilinear map [Boneh and Silverberg, 2003]. We analyse ART, TreeKEM and our proposals from a theoretical perspective and show that our proposals improve the cost of update operations. Finally, we present some discussion and improvements to the IETF MLS standard.
Keywords: MLS, secure messaging, cryptography.
“I will! The word is mighty when spoken earnestly and quietly; it tears the stars from the sky, that one word: I will!” - Große
Acknowledgements
Foremost I would like to express my deepest appreciation to my supervisor, Serge Vaudenay, for hosting me at LASEC and making this thesis possible. His remarkable attention to detail and emphasis on quality research leaves me in awe. I am extremely grateful for the continued guidance that I have received from my supervisor in Sweden, Mats Näslund. His deep insights into such a wide range of topics in cryptography have been most valuable. I would also like to thank Johan Håstad for agreeing to be the examiner of this thesis and for putting me in touch with Mats. My time at the lab has been most memorable and I would like to thank all my colleagues there for the fruitful conversations. A particular mention goes to Fatih Balli, whom I had the privilege of sharing an office with, as well as Martine Corval for helping me with the various administrative hurdles. Finally, I thank my parents and family for their continued support.
Contents

1 Background
    1.1 Motivation
    1.2 Outline
    1.3 Objectives of Secure Messaging
    1.4 Security Notions
    1.5 Relevant Previous Work
    1.6 Introductory Cryptography
    1.7 KEM

2 Secure Messaging
    2.1 The Signal Protocol
    2.2 X3DH
    2.3 Hash Ratchets & the Double Ratchet Algorithm

3 Secure Group Messaging
    3.1 Messaging Layer Security (MLS)
    3.2 Asynchronous Ratcheting Trees
    3.3 TreeKEM
    3.4 Tripartite Asynchronous Ratcheting Trees

4 Discussion
    4.1 Results and Conclusion
    4.2 Open Questions for the IETF MLS Standard
List of Figures

1.1 PGP hasn’t been adopted by the general public as non-technical users might find it hard to use. Inspecting public keys with gpg, the OpenPGP part of the GNU Privacy Guard (GnuPG).

1.2 An X.509 self-signed root certificate representing a certificate authority.

2.1 The three Diffie-Hellman operations making up X3DH are $DH_1 = DH(IK_A, SPK_B)$, $DH_2 = DH(EK_A, IK_B)$ and $DH_3 = DH(EK_A, SPK_B)$. The shared key $k$ is computed as $k = KDF(DH_1 \| DH_2 \| DH_3)$. If the prekey bundle contains a one-time prekey $OPK_B^i$, then $k$ is computed as $k = KDF(DH_1 \| DH_2 \| DH_3 \| DH_4)$, where $DH_4 = DH(EK_A, OPK_B^i)$.

2.2 In a KDF chain, part of the output from a KDF is used as the input to a following KDF. The other part is used as an output key. Since each output appears random, a KDF chain provides forward secrecy.

2.3 Diffie-Hellman ratchet. Alice is initialised with Bob’s ratchet public key $pk_{B_0}$. Alice performs a DH computation with $pk_{B_0}$ and $sk_{A_1}$; the output is used to derive a new sending chain key. The next message sent by Alice includes her $pk_{A_1}$. Bob then performs a DH computation using $pk_{A_1}$ and $sk_{B_0}$. The output is used to derive his new receiving chain key, which is the same as Alice’s sending chain key. Bob then computes a new pair $(pk_{B_1}; sk_{B_1})$ and derives a new key for his sending chain using $pk_{A_1}$ and $sk_{B_1}$.

3.1 Using the Signal protocol for a group of 8 users would require 28 channels.

3.2 Users A, B and C publish their KeyPackages to a directory.

3.3 User A creates a group with users B and C.

3.4 User B sends an update.

3.5 ART - At each node a public key pk and a secret key sk are stored (separated by a semicolon). For intermediate nodes, the children’s secret keys are used as the exponents to compute sk. $f(\cdot)$ maps a group element to an integer, $f(\cdot): G \to \mathbb{Z}/|G|\mathbb{Z}$, where $G$ is an arbitrary Diffie-Hellman group. The path from C to the root is marked in red and the nodes lying on the copath of C are marked in bold.

3.6 TreeKEM - At each node a public key pk, a secret key sk and a symmetric secret are stored (separated by a semicolon).

3.7 Tripartite ART - At each node a public key pk and a secret key sk are stored (separated by a semicolon). For intermediate nodes, one secret key along with two public keys belonging to the children are used to compute sk. The path from C to the root is marked in red and the nodes lying on the copath of C are marked in bold.
List of Tables
3.1 Comparison of ART, Tripartite ART and TreeKEM, listing the number of elementary cryptographic operations. Send update denotes the number of operations done by a group member initiating an update. Similarly, process update denotes the number of operations done by each group member to process an update. Transmission indicates the amount of key material that needs to be broadcast to make an update. Storage indicates the number of public keys that need to be stored along the path in order to compute the values stored at intermediate nodes. When updates are processed, on average only a constant number of keys need to be computed; the remaining keys along the path need to be cached, though. As usual, (pk; sk) denotes a key pair of an arbitrary public-key cryptosystem and s denotes a symmetric secret.
Chapter 1
Background
1.1 Motivation
Interest in secure messaging has grown in recent years, in part due to the Snowden revelations [Macaskill and Dance, 2013]. Messaging applications have progressively adopted end-to-end security mechanisms to ensure that messages are accessible only to the communicating end parties and not to the servers involved in the communication. Signal, WhatsApp, Telegram and Facebook Secret Conversations are just some of the messaging applications that have implemented such mechanisms, and together they are used regularly by over a billion people worldwide. For communications between two parties, paradigms such as the double ratchet, used in the Signal protocol, provide provably strong security. However, achieving equally strong security that scales well in group settings remains somewhat less well studied and is often neglected in practice. Establishing keys to obtain such protections is challenging in group chat settings, in which more than two clients need to agree on a key but may not be online at the same time. This motivated the need for the Internet Engineering Task Force (IETF) Messaging Layer Security (MLS) working group. The first continuous group key agreement (CGKA) protocol to be proposed was Asynchronous Ratcheting Trees (ART) [Cohn-Gordon et al., 2018], which formed the basis of TreeKEM [Barnes et al., 2019], the CGKA protocol currently suggested for MLS.
1.1.1 Communication
In this section we give a brief overview of some core protocols and standards used in online communications.
Email. One of the oldest standards for asynchronous messaging is the Simple Mail Transfer Protocol (SMTP) [Postel, 1982]. In essence, it allows email servers to interoperate, enabling Alice to send a mail from her Gmail account to Bob’s Yahoo account. SMTP was not designed to provide any form of confidentiality and mails are sent in plaintext. As a result, email servers have complete access to the content of mails sent via SMTP. In 2013, the Snowden revelations showed that the NSA targeted email servers as part of the PRISM surveillance program [Greenwald, 2013].

In 1991 Phil Zimmermann developed Pretty Good Privacy (PGP), which was later turned into an open standard in RFC 4880 [Callas et al., 2007]. PGP significantly improved the security of email communication, as it could be used for signing, encrypting and decrypting emails and files. In 1993 Zimmermann became the subject of a controversial criminal investigation for exporting munitions without a license: PGP never used keys shorter than 128 bits, and ciphers with keys longer than 40 bits were considered munitions under US export regulations at the time. However, Zimmermann was able to circumvent the regulations by printing the source code of PGP in a book, which was then legally exported globally.¹ Users could then scan and compile the source code to use PGP. The charges against Zimmermann were later dropped.

Whilst PGP relies on a so-called web of trust, in which users sign each other’s keys, another standard, S/MIME (Secure/Multipurpose Internet Mail Extensions), depends on a centralised public key infrastructure to manage keys [Ramsdell, 2004]. As a result, S/MIME is often used in large organisations but has not enjoyed widespread adoption amongst the general public. PGP also has some drawbacks. If Alice’s private key is compromised, all previously sent messages can be decrypted by an adversary. Furthermore, PGP fails to provide a user-friendly solution for secure communications to non-technical users, as users are expected to understand concepts such as public-key cryptography and key management [See Figure 1.1].
user@host / $ gpg --list-public-keys [email protected] [email protected]
pub   rsa2048 2018-11-03 [SCEA] [expires: 2021-01-10]
      429D86EBCD838B9F45B5AAEA1A233544E7530609
uid           [ultimate] Peter Smith (work)

pub   rsa4096 2019-11-12 [SC] [expires: 2020-10-12]
      A75704A2CB6932375B5F30A4B138C0DAB8E41E6B
uid           [ultimate] Peter Smith
Figure 1.1: PGP hasn’t been adopted by the general public as non-technical users might find it hard to use. Inspecting public keys with gpg, the OpenPGP part of the GNU Privacy Guard (GnuPG).
Chat Protocols. Chat protocols were originally envisioned to serve low-latency synchronous communications, contrary to mail protocols, which were aimed at high-latency asynchronous communications. However, this distinction has recently become less clear, as chat protocols have been designed to operate in asynchronous settings as well. One of the first standardised chat protocols was XMPP (Extensible Messaging and Presence Protocol) [Saint-Andre, 2004]. XMPP was designed for “streaming XML elements in order to exchange messages and presence information in close to real time”, and was originally based on the messaging service jabber.org. Whilst XMPP does not provide any form of confidentiality [more on this in Section 1.3], it is frequently used over an encrypted channel such as TLS [Saint-Andre, 2011]. Over time, end-to-end security mechanisms have been proposed for XMPP, including Jabber OpenPGP (XEP-0027) and more recently Off-the-Record (OTR) Messaging (XEP-0364) [Muldowney, 2014, Whited, 2019]. OTR is a security enhancement compared to PGP, in that the former provides deniable authentication [more on that in Section 1.4]. The major drawback of OTR is that it is designed for synchronous messaging between two parties and, thus, cannot be used for group messaging or asynchronous messaging. On the other hand, text messages sent with SMS in GSM are only encrypted to the cell phone tower. After being received and decrypted at the tower, messages are sent to mobile carriers and stored in plaintext.

¹ See https://philzimmermann.com/EN/essays/index.html
Algorithms. The main public-key algorithms used in the OpenPGP standard are RSA for encrypting or signing, ElGamal for encrypting and DSA for signing, with most implementations having an upper bound of 4096 bits for public keys. OpenPGP also supports a variety of symmetric-key algorithms, notably Triple DES, Blowfish and AES with 128, 192 and 256 bit keys [Callas et al., 2007]. As mentioned above, S/MIME relies on public key certificates, as does TLS/SSL. The X.509 standard defines certificates that bind a public key to an identity. The certificates are either signed by a certificate authority or self-signed [See Figure 1.2].
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            04:00:00:00:00:01:15:4b:5a:c3:94
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: C=BE, O=GlobalSign nv-sa, OU=Root CA, CN=GlobalSign Root CA
        Validity
            Not Before: Sep 1 12:00:00 1998 GMT
            Not After : Jan 28 12:00:00 2028 GMT
        Subject: C=BE, O=GlobalSign nv-sa, OU=Root CA, CN=GlobalSign Root CA
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:da:0e:e6:99:8d:ce:a3:e3:4f:8a:7e:fb:f1:8b:
                    ...
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Certificate Sign, CRL Sign
            X509v3 Basic Constraints: critical
                CA:TRUE
            X509v3 Subject Key Identifier:
                60:7B:66:1A:45:0D:97:CA:89:50:2F:7D:04:CD:34:A8:FF:FC:FD:4B
    Signature Algorithm: sha1WithRSAEncryption
        d6:73:e7:7c:4f:76:d0:8d:bf:ec:ba:a2:be:34:c5:28:32:b5:
        ...
Figure 1.2: An X.509 self-signed root certificate representing a certificate authority.
1.2 Outline
The remainder of this thesis is organised as follows. We start by introducing some security notions as well as objectives for designing secure messaging protocols. The rest of Chapter 1 lays the cryptographic foundations needed to understand the main results. Readers already familiar with this material may wish to skip these sections. Chapter 2 covers the most essential paradigms needed for secure messaging. The main contributions of the thesis can be found in Chapter 3, which analyses two different asynchronous group key agreement protocols, namely Asynchronous Ratcheting Trees (ART) and TreeKEM. Furthermore, we present our own asynchronous group key agreement protocol, Tripartite ART. The rest of the chapter compares a specific part of the protocols, namely the update operation. Chapter 4 concludes the thesis and offers a summary of our results as well as some discussion.
1.3 Objectives of Secure Messaging
When designing a messaging protocol we wish to have the following desirable properties.
• Alice and Bob should not have to be online at the same time, that is, the protocol is asynchronous.
• Authenticity (of keys): Alice is convinced that she is in fact communicating with Bob and not a malicious Mallory posing as Bob.
• Confidentiality: Alice can communicate with Bob without Eve being able to eavesdrop on their conversation.
• Integrity: Alice can detect whether a message received from Bob has been modified by malicious Mallory.
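As a rough illustration of the integrity property listed above, the following minimal sketch uses a message authentication code (HMAC-SHA256 from the Python standard library). It assumes that Alice and Bob already share a secret key; the helper names `tag_message` and `verify_message` are hypothetical and chosen for this example only, and the sketch is not taken from any of the protocols discussed in this thesis.

```python
import hashlib
import hmac
import secrets


def tag_message(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message (hypothetical helper)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_message(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(tag_message(key, message), tag)


# Assumption: Alice and Bob agreed on this key beforehand.
shared_key = secrets.token_bytes(32)

message = b"Meet at noon"
tag = tag_message(shared_key, message)  # Bob sends (message, tag)

# Alice accepts the untouched message ...
accepted = verify_message(shared_key, message, tag)
# ... and rejects a message that Mallory modified in transit.
tampered_accepted = verify_message(shared_key, b"Meet at nine", tag)
```

Since only holders of the shared key can compute a valid tag, a successful verification also gives Alice some assurance about who sent the message, provided the key itself was established authentically.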
1.4 Security Notions
First we introduce a few security notions with the help of some informal definitions. These will be useful throughout the remainder of this thesis. The term post-compromise security (PCS), sometimes referred to as self-healing or future secrecy, was first introduced by Cohn-Gordon et al. in 2016. Intuitively, post-compromise security protects sessions from past compromises, whilst forward secrecy protects sessions from future compromises. A protocol between Alice and Bob provides PCS if, even after a party’s secrets have been compromised, communications eventually return to being authenticated and private [Cohn-Gordon et al., 2016]. On the other hand, a protocol between Alice and Bob provides forward secrecy (FS) if previous communications remain private, even if long-term secret keys are compromised in the future.

In some situations the mere fact that a conversation between Alice and Bob took place can be undesirable. Therefore, we may want Alice and Bob to be able to plausibly deny the existence of their communication. We say that a protocol between Alice and Bob provides deniability if an attacker cannot obtain a cryptographic proof of communications between the two parties.

At the time of its initial release, more than 20 years ago, Transport Layer Security (TLS) version 1.0 offered a standardised means of confidential communication. Whilst the most recent version of TLS enforces forward secrecy through the use of an ephemeral Diffie-Hellman key exchange [Rescorla, 2018], the protocol still fails to provide other desirable security guarantees. Should a server’s long-term private key be exposed, future communications can become exposed. Work by Cohn-Gordon et al. shows how to formalise post-compromise security and proves that TLS version 1.3 lacks PCS [Cohn-Gordon et al., 2016].

The Off-the-Record Messaging protocol provides deniable authentication. That is, messages are not digitally signed in a way that would allow Carol, a generic third party, to verify them [Borisov et al., 2004]. In fact, anyone could forge messages to make them look as if they came from Alice or Bob. However, whilst communicating with one another, Alice and Bob are provided authenticity and integrity. This is achieved by appending all information necessary to forge messages to encrypted messages. Hence, anyone who is able to verify the authenticity of messages in a conversation is also able to forge messages in that conversation. Whilst OTR itself has not enjoyed widespread adoption, some of its techniques can be found in multiple cryptographic constructions, not least in the Signal protocol and its double-ratcheting algorithm. The Signal protocol, which was developed outside of the academic community, provides both forward secrecy and post-compromise security.
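To make the notion of forward secrecy defined in this section more concrete, the sketch below implements a simple symmetric hash ratchet. It is an illustration under stated assumptions, not the construction of any particular protocol: the function names and the labels `b"chain"` and `b"message"` are hypothetical, and HMAC-SHA256 stands in for a generic one-way key derivation step.

```python
import hashlib
import hmac
from typing import List, Tuple


def ratchet_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Derive (next_chain_key, message_key) from the current chain key.

    The labels are illustrative assumptions; they only serve to separate
    the two outputs derived from the same chain key.
    """
    next_chain_key = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    return next_chain_key, message_key


def run_chain(initial_key: bytes, steps: int) -> Tuple[bytes, List[bytes]]:
    """Advance the ratchet `steps` times, discarding old chain keys."""
    chain_key = initial_key
    message_keys = []
    for _ in range(steps):
        chain_key, message_key = ratchet_step(chain_key)
        message_keys.append(message_key)
    return chain_key, message_keys


initial = hashlib.sha256(b"illustrative initial secret").digest()
state_after_3, keys = run_chain(initial, 3)
```

An adversary who compromises `state_after_3` can derive all later message keys, but recovering the three earlier keys would require inverting HMAC-SHA256, which is why old keys stay protected once the old chain keys are erased. The same observation shows that such a chain alone does not give post-compromise security: nothing in it ever introduces fresh randomness that the adversary does not know.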
1.5 Relevant Previous Work
Due to its reduced cost, Asynchronous Ratcheting Trees (ART) [Cohn-Gordon et al., 2018] was the first practically viable proposal to achieve post-compromise security asynchronously in a group setting. The asynchronous key-encapsulation mechanism for tree structures used in MLS finds its roots in ART. At EUROCRYPT 2019 Alwen et al. gave a formal security analysis of the Signal protocol and showed that Signal provides both post-compromise security and forward secrecy [Alwen et al., 2019a]. Recent work by Alwen et al. showed that TreeKEM provides only weak forward secrecy. However, we note that the analysis was done on version 7 of MLS, which has evolved considerably since then [Alwen et al., 2019b]. Cremers et al. study the disadvantages for post-compromise security that arise with multiple groups [Cremers et al., 2019].
1.6 Introductory Cryptography
We now introduce some fundamental cryptographic primitives that are essential for the remainder of this thesis.
Definition 1 (Symmetric Cryptosystem). We define a symmetric cryptosystem to be the tuple of two algorithms (Enc, Dec):
• Enc(k, x) encrypts the plaintext x using the key k.
• Dec(k, c) decrypts the ciphertext c using the key k.

Definition 2 (Correctness of a Symmetric Cryptosystem). A symmetric cryptosystem (Enc, Dec) is correct if

$$\forall x \in P,\ \forall k \in K: \quad \mathrm{Dec}(k, \mathrm{Enc}(k, x)) = x,$$

where P and K are the set of plaintexts and the set of keys respectively.
Definition 3 (Public-Key Cryptosystem). We define a public-key cryptosystem to be the tuple of three algorithms (Gen, Enc, Dec):
• Gen($1^k$) generates a key pair (pk, sk), where pk and sk denote the public key and secret key respectively and k is the security parameter.
• Enc(pk, x) encrypts the plaintext x using the public key pk.
• Dec(sk, c) decrypts the ciphertext c using the secret key sk.

Definition 4 (Correctness of a Public-Key Cryptosystem). A public-key cryptosystem (Gen, Enc, Dec) is correct if

$$\forall x \in P: \quad \mathrm{Dec}(sk, \mathrm{Enc}(pk, x)) = x,$$

where $(pk, sk) \leftarrow \mathrm{Gen}(1^k)$ and P is the set of plaintexts.
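As one concrete instance of Definitions 3 and 4, the following sketch implements textbook ElGamal encryption. The choice of ElGamal and the tiny parameters (the safe prime 2039 with subgroup order 1019 and generator 4) are assumptions made for readability; they offer no security, and the security parameter of Gen is fixed implicitly by these constants.

```python
import secrets

# Toy parameters (assumption for illustration only): P = 2*Q + 1 is a safe
# prime and G = 4 generates the subgroup of prime order Q in Z_P^*.
P = 2039
Q = 1019
G = 4


def gen():
    """Gen: sample sk in [1, Q-1] and return the key pair (pk, sk)."""
    sk = secrets.randbelow(Q - 1) + 1
    pk = pow(G, sk, P)
    return pk, sk


def enc(pk, x):
    """Enc(pk, x): encrypt a group element x with fresh randomness r."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (x * pow(pk, r, P)) % P


def dec(sk, c):
    """Dec(sk, c): divide out the shared value c1^sk."""
    c1, c2 = c
    shared = pow(c1, sk, P)
    # Modular inverse via Fermat's little theorem (P is prime).
    return (c2 * pow(shared, P - 2, P)) % P


pk, sk = gen()
plaintexts = [pow(G, m, P) for m in range(1, 20)]  # elements of the subgroup

# Correctness in the sense of Definition 4, checked on sample plaintexts.
all_correct = all(dec(sk, enc(pk, x)) == x for x in plaintexts)
```

Correctness holds because $c_2 / c_1^{sk} = x \cdot g^{sk \cdot r} / g^{r \cdot sk} = x$. Unlike the symmetric case, encryption is randomised, so two encryptions of the same plaintext will usually differ.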
In Section 3.3, we will need an updatable public-key cryptosystem in order to achieve forward secrecy [Alwen et al., 2019b, Jost et al., 2019].
Definition 5 (Updatable Public-Key Cryptosystem). We define an updatable public-key cryptosystem to be the tuple of three algorithms (Gen, Enc, Dec):
• Gen($sk_0$) outputs an initial public key $pk_0$, given a uniformly random key $sk_0$.
• Enc(pk, x) encrypts the plaintext x using the public key pk and outputs a new public key $pk'$.
• Dec(sk, c) decrypts the ciphertext c using the secret key sk and outputs a new secret key $sk'$.

Definition 6 (Correctness of an Updatable Public-Key Cryptosystem). An updatable public-key cryptosystem (Gen, Enc, Dec) is correct if $\forall x_i \in P$ and $\forall sk_0 \in K$

$$\Pr\left[\, pk_0 \leftarrow \mathrm{Gen}(sk_0);\ (c_i, pk_i) \leftarrow \mathrm{Enc}(pk_{i-1}, x_i);\ (x_i', sk_i) \leftarrow \mathrm{Dec}(sk_{i-1}, c_i) \;:\; x_i = x_i' \,\right] = 1,$$

where P and K are the set of plaintexts and the set of keys respectively.
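The interface of Definitions 5 and 6 may be easier to follow with a toy instantiation. The sketch below is an assumption of this example and not necessarily the construction used in the cited works: it is an ElGamal-style scheme in which the sender picks a random offset delta, shifts the public key by $g^{\delta}$ and transmits delta inside the ciphertext, so that the receiver can shift the secret key accordingly. The group parameters are toy values without any security.

```python
import hashlib
import secrets

# Toy group (assumption): order-Q subgroup of Z_P^* with P = 2*Q + 1.
P, Q, G = 2039, 1019, 4
DELTA_LEN = 2  # bytes needed to encode an exponent below Q


def _pad(shared: int, length: int) -> bytes:
    """Hash the shared group element into a one-time pad of given length."""
    return hashlib.shake_256(shared.to_bytes(2, "big")).digest(length)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))


def gen(sk0: int) -> int:
    """Gen(sk0): output the initial public key pk0 = g^sk0."""
    return pow(G, sk0, P)


def enc(pk: int, x: bytes):
    """Enc(pk, x): output the ciphertext and the updated public key."""
    r = secrets.randbelow(Q - 1) + 1
    delta = secrets.randbelow(Q)  # key offset, hidden inside the ciphertext
    body = x + delta.to_bytes(DELTA_LEN, "big")
    c = (pow(G, r, P), _xor(body, _pad(pow(pk, r, P), len(body))))
    new_pk = (pk * pow(G, delta, P)) % P
    return c, new_pk


def dec(sk: int, c):
    """Dec(sk, c): output the plaintext and the updated secret key."""
    c1, c2 = c
    body = _xor(c2, _pad(pow(c1, sk, P), len(c2)))
    x, delta = body[:-DELTA_LEN], int.from_bytes(body[-DELTA_LEN:], "big")
    return x, (sk + delta) % Q


sk = secrets.randbelow(Q - 1) + 1
pk = gen(sk)
messages = [b"first", b"second", b"third"]
received = []
for m in messages:
    c, pk = enc(pk, m)  # sender side: the public key moves forward
    x, sk = dec(sk, c)  # receiver side: the secret key follows
    received.append(x)

keys_in_sync = pk == gen(sk)
```

After every message the receiver overwrites the old secret key, so a later compromise of the current key does not directly reveal the keys under which earlier ciphertexts were encrypted. This is the intuition behind using such schemes for forward secrecy.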
Informally, a key derivation function (KDF) derives a pseudorandom key from a secret value and a public value such as a salt, with the help of a standard hash function. Let KDF denote an HMAC-based Extract-and-Expand Key Derivation Function (HKDF) as defined in RFC 5869 [Krawczyk and Eronen, 2010].

Establishing a shared secret over an insecure channel is central to a vast number of cryptographic protocols. The Diffie-Hellman Key Exchange [Diffie and Hellman, 1976] provides an elegant solution to this problem and is widely used in practice.
Definition 7 (Diffie-Hellman Key Exchange). Let G be a cyclic group of prime order p with a generator g. Alice chooses her secret key $sk_A \in \mathbb{Z}_p^*$ and computes her public key $pk_A = g^{sk_A}$. Then Bob chooses his secret key $sk_B \in \mathbb{Z}_p^*$ and computes his public key $pk_B = g^{sk_B}$. After exchanging their public keys $pk_A$ and $pk_B$, Alice and Bob check that $pk_B \in \langle g \rangle \setminus \{1\}$ and $pk_A \in \langle g \rangle \setminus \{1\}$. Finally, they can compute a shared secret key k

$$k = \mathrm{KDF}(g^{sk_A sk_B})$$
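The exchange in Definition 7, including the public-key checks and the final key derivation, can be sketched as follows. The group is a toy one chosen for this example (modulus 2039, subgroup order 1019, generator 4), so the prime order called p in the definition corresponds to `Q` in the code. The `hkdf` function follows the extract-then-expand structure of RFC 5869 with SHA-256; the `info` label and the encoding of the group element as two bytes are assumptions of this sketch.

```python
import hashlib
import hmac
import secrets

# Toy parameters (assumption): G generates the subgroup of prime order Q
# in Z_P^*, where P = 2*Q + 1. Far too small for real use.
P, Q, G = 2039, 1019, 4


def hkdf(ikm: bytes, salt: bytes = b"", info: bytes = b"", length: int = 32) -> bytes:
    """HKDF-SHA256 following the extract-then-expand structure of RFC 5869."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    if not salt:
        salt = bytes(hash_len)  # default salt: a string of zero bytes
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def keygen():
    """Sample sk in [1, Q-1] and return (sk, pk) with pk = g^sk."""
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P)


def is_valid_public_key(pk: int) -> bool:
    """Check that pk lies in the subgroup generated by g and is not 1."""
    return 1 < pk < P and pow(pk, Q, P) == 1


def shared_key(own_sk: int, peer_pk: int) -> bytes:
    """Validate the peer's public key and derive k = KDF(g^(sk_A * sk_B))."""
    if not is_valid_public_key(peer_pk):
        raise ValueError("peer public key is not a valid subgroup element")
    secret = pow(peer_pk, own_sk, P)
    return hkdf(secret.to_bytes(2, "big"), info=b"toy-dh")


sk_a, pk_a = keygen()  # Alice
sk_b, pk_b = keygen()  # Bob
k_alice = shared_key(sk_a, pk_b)
k_bob = shared_key(sk_b, pk_a)
```

Both parties arrive at the same key because $pk_B^{sk_A} = g^{sk_B sk_A} = pk_A^{sk_B}$. The subgroup check rejects values such as 1 or $P - 1$, which would otherwise force the shared secret into a tiny set of values.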