QUANTITATIVE VERIFICATION OF GOSSIP PROTOCOLS FOR CERTIFICATE TRANSPARENCY

by

MICHAEL COLIN OXFORD

A thesis submitted to the University of Birmingham for the degree of DOCTOR OF PHILOSOPHY

School of Computer Science
College of Engineering and Physical Sciences
University of Birmingham
December 2020

Abstract

Certificate transparency is a promising solution for publicly auditing Internet certificates. However, there is the potential for split-world attacks, in which users are directed to fake versions of the log and may accept fraudulent certificates. To ensure users are seeing the same version of a log, gossip protocols have been designed where users share and verify log-generated data. This thesis proposes a methodology for evaluating such protocols using probabilistic model checking, a collection of techniques for formally verifying properties of stochastic systems. It describes how the protocols are modelled and verified, and analyses several aspects of them, including the success rate of detecting inconsistencies in gossip messages and their efficiency in terms of bandwidth. This thesis also compares different protocol variants and suggests ways to augment the protocol to improve performance, using model checking to verify the claims. To address uncertainty and unscalability issues within the models, this thesis shows how to transform models by allowing the probability of certain events to lie within a range of values, and how to abstract them to make the verification process more efficient. Lastly, by parameterising the models, this thesis shows how to search possible model configurations to find the worst-case behaviour for certain formal properties.

Acknowledgements

To Auntie Mary and Nanny Lee.

After four tumultuous years, I could not have written this thesis alone.
Firstly, I want to thank my co-supervisors, Dave Parker and Mark Ryan, for their unconditional support and for always being willing to make time for me despite having hectic work schedules. I could not have asked for more patient or intelligent tutors.

I also want to thank: the Security Group for the interesting seminars and conversations over lunch; Graham Shaw and Adam Williams for overseeing my placement at Nettitude; Eike Ritter, David Galindo and Alice Miller for their useful comments on my work; Birmingham's BlueBEAR service for helping me complete some experiments; and the reviewers of the CNS'20 conference for their valuable feedback on the paper.

A number of close friends have helped me to overcome moments when I thought I would never finish this thesis. These people are: Paul Goldring; Kelvin Cheung; Pablo Vinuesa; Barri Matharu; Johnny Chan; and Susan Geng. Thank you all for some great memories and life advice.

Lastly, I want to thank my sisters, Rebecca and Sarah, and my wonderful parents, Agnes and Steven. I really cannot thank you all enough for what you have done for me to get here, so here is another - thank you.

Any mistakes found in this thesis are of course my own.

Contents

List of Figures
List of Tables
Glossary
1 Introduction
2 Related Work
    2.1 Transparency
    2.2 Gossip and Auditing for CT
    2.3 Probabilistic Model Checking
3 Background
    3.1 Certificate Transparency
    3.2 CT Gossiping
    3.3 Probabilistic Model Checking
    3.4 Derivative-free Optimisation
4 Modelling and Verification of Gossip Protocols
    4.1 Network Topology
    4.2 Modelling the Protocol
    4.3 Specification of Protocol Properties
    4.4 Server-to-server Gossip
    4.5 Experimental Results
    4.6 Summary
5 Tackling Uncertainty and Unscalability using IDTMCs
    5.1 Using IDTMCs When Client Probabilities are Unknown
    5.2 IDTMC Abstraction
    5.3 Experimental Results
    5.4 Summary
6 Model Parameter Optimisation
    6.1 Deriving Network Model Parameters
    6.2 Adapting the Black-box Optimisation Problem
    6.3 Python Application
    6.4 Experimental Results
    6.5 Combining IDTMCs With SMBO
    6.6 Summary
7 Discussion and Conclusion
Bibliography
Appendix A Constructing Components from IDTMCs and Updating ADTMCs
    A.1 ConstructComponent
    A.2 UpdateADTMC
Appendix B Deriving Distributions Using Surrogate Parameters
Appendix C Snapshots of PRISM Code
    C.1 Normal Scenario Model (Without Server Gossip)
    C.2 Split-world Scenario Model With Intervals (Without Server Gossip)

List of Figures

3.1 A pair of Merkle hash trees
3.2 Communication flow of CT
3.3 Querying a certificate database
3.4 CT information for a certificate
3.5 An illustration of a split-world attack
3.6 Illustration of the Chuat et al. CT gossip protocols
3.7 DTMC model example
3.8 Example abstraction of a DTMC
3.9 Demonstration of the Hyperopt library
3.10 Demonstration of the Benderopt library
4.1 Example of a network topology
4.2 Abstract representation of log growth
4.3 Model checking results for the normal scenario
4.4 Model checking results for the split-world scenario (1)
4.5 Model checking results for the split-world scenario (2)
4.6 Statistical results for the normal scenario
4.7 Statistical results for the split-world scenario
4.8 Box-and-whisker plots for randomly sampled data (Chapter 4)
4.9 Comparing statistical and verification results (normal)
4.10 Comparing statistical and verification results (split-world, init. design)
4.11 Comparing statistical and verification results (split-world, ext. design)
5.1 Box-and-whisker plots for randomly sampled data (Chapter 5)
5.2 IDTMC model checking results (normal)
5.3 IDTMC model checking results (split-world)
5.4 Abstraction process of an IDTMC
5.5 Comparing IDTMC and ADTMC verification results
6.1 Workflow of the optimiser code
6.2 Best result found for a fixed number of trials
6.3 Results for normal models using the suggested parameters
6.4 Results for split-world models using the suggested parameters
6.5 Statistical model checking results for larger models
6.6 Box-and-whisker plots for randomly sampled data (Chapter 6)
6.7 Comparing verification results with simulation data
6.8 Investigation into the local behaviour of the objective function for normal scenario models
6.9 Investigation into the local behaviour of the objective function for split-world scenario models

List of Tables

4.1 Description of csth/ssth variables
4.2 Initial modelling setup for both model types. Each client type connects with server types S1 and S2 with probabilities 0.02 and 0.28, respectively, and with one other unique server with probability 0.7; e.g. client type C1 connects with server type S3 with probability 0.7.
4.3 Model statistics (Chapter 4)
4.4 Client type frequency (Chapter 4)
4.5 Mean values for each proportion
5.1 Probability intervals (Chapter 5)
5.2 Maximal/minimal values for each proportion (Chapter 5)
5.3 Model statistics (Chapter 5)
6.1 List of options for the Python application
6.2 Probability intervals (Chapter 6)
6.3 Suggested modelling parameters
6.4 Client type frequency (Chapter 6)
6.5 Maximal/minimal values for each proportion (Chapter 6)

List of Algorithms

1 Generic sequential model-based optimisation (SMBO)
2 Constructing the abstract component
3 Building an ADTMC
4 Deriving a probability distribution from fixed intervals
5 Objective function to optimise

Glossary

N              Set of natural numbers
Z              Set of integers
R              Set of real numbers
PKI            Public Key Infrastructure
CA             Certificate authority
DNS            Domain Name System
CT             Certificate Transparency
RFC            Request for Comments
TLS            Transport Layer Security
h              Cryptographic hash function
URL            Uniform resource locator
SCT            Signed certificate timestamp
MMD            Maximum merge delay
STH            Signed tree head
OCSP           Online Certificate Status Protocol
HTTPS          Hypertext Transfer Protocol Secure
m              Gossip message data
sth            STH data
M              Markov model
DTMC           Discrete-time Markov chain
MDP            Markov decision process
Act            Set of actions
S              State space
E              Transition relation function
S_i            Set of initial states
P              Probability transition function
ι_init         Initial distribution function
AP             Set of atomic propositions
L              State labelling function
π              State-sequenced path
FPath_{M,s}    Set of all finite paths starting from state s
FPath_M        Set of all finite paths
IPath_{M,s}    Set of all infinite paths starting from state s
IPath_M        Set of all infinite paths
σ              Adversary function
Adv            Set of all adversaries
PCTL           Probabilistic Computation Tree Logic
φ              State-based PCTL formula or a general PCTL formula depending on the context
Φ              Path-based PCTL formula
P              Probabilistic path operator
X              Next operator
U              Until operator
⊨_Adv          Satisfied under Adv
F              Future operator
♦              Future operator (alternative)
r_state        State reward function
r_action       Transition reward function
T              Target set of states
Prob           Probability of an event
I^{=k}         Instantaneous reward after exactly k ∈ N steps
C^{≤k}         Cumulative reward after exactly k ∈ N steps
R_T            Reachability reward before reaching target set T
R              Reward operator
IDTMC          Interval DTMC
ADTMC          Abstract DTMC
F              Objective function
SMBO           Sequential model-based optimisation
N              Surrogate model
A              Acquisition function
H              Observation history set
TPE            Tree-structured Parzen estimator
NT             Network topology
G              Gossip rate function
P              Client type profile function