
University of Calgary PRISM: University of Calgary's Digital Repository

Graduate Studies The Vault: Electronic Theses and Dissertations

2014-09-30 New Notions of Secrecy and User Generated Randomness in Cryptography

Alimomeni, Mohsen

Alimomeni, M. (2014). New Notions of Secrecy and User Generated Randomness in Cryptography (Unpublished doctoral thesis). University of Calgary, Calgary, AB. doi:10.11575/PRISM/27097 http://hdl.handle.net/11023/1874 doctoral thesis

University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. Downloaded from PRISM: https://prism.ucalgary.ca

UNIVERSITY OF CALGARY

New Notions of Secrecy and

User Generated Randomness in Cryptography

by

Mohsen Alimomeni

A THESIS

SUBMITTED TO THE FACULTY OF GRADUATE STUDIES

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE

DEGREE OF DOCTOR OF PHILOSOPHY

DEPARTMENT OF COMPUTER SCIENCE

CALGARY, ALBERTA

SEPTEMBER, 2014

© Mohsen Alimomeni 2014

Abstract

Randomness plays a central role in computer science, and in particular in cryptography. Almost all cryptographic primitives depend crucially on randomness, because the randomness and unpredictability of secret keys is what provides security. One usually assumes that perfect randomness, a sequence of independently and uniformly distributed bits, is accessible to algorithms. This is a strong assumption: physical sources of randomness are neither uniformly random nor guaranteed to produce independent bits. The aim of this thesis is therefore to start from a realistic model of randomness, investigate notions of secrecy and their randomness requirements, and finally find practical methods for generating randomness that matches the requirements of cryptographic primitives. We consider a model of random source in which the source output follows one distribution from a set of possible distributions, each with the property that the maximum probability of any symbol is bounded and cannot be arbitrarily close to 1. This model does not assume independence or uniformity of the output symbols and is considered a realistic model of randomness. From this point, the thesis can be divided into two main parts:

In the first part, considering various notions of information theoretic secrecy, a fundamental problem is to determine the properties of randomness needed to achieve security under these notions. Traditional cryptographic protocols simply assume perfect randomness and build on this assumption. We survey the results showing that secrecy cannot be based on imperfect randomness, that is, randomness that is neither uniform nor independent. A line of work therefore attempts to relax notions of secrecy so that they can be achieved with non-perfect sources, and possibly with smaller key sizes, while still matching real-life applications. An important work in this context is entropic security, where the key can be shorter than the message depending on the message distribution. Inspired by this, we propose two relaxed notions of secrecy that are motivated by practical applications. In the first notion, motivated by an application in biometric authentication, we propose guessing secrecy, where the probability that a computationally unbounded adversary with the best strategy guesses the message remains the same whether or not a ciphertext is given. We compare the randomness requirements of guessing secrecy with stronger notions and show that in some respects, such as key length, the requirements are the same. For the key distribution, however, we found a family of distributions that provide guessing secrecy but not perfect secrecy. In the second notion, we investigate the randomness requirements of multiple-message encryption. Considering a natural extension of the secrecy definition to multiple messages, we show that independent keys are needed to encrypt each message. We then propose a relaxed notion in which the security of the last message is more important than that of past messages, although the leakage of past messages is bounded using entropic security. Under this assumption, we achieve a key length smaller than that required for ε-indistinguishability, and comparable to the key length for entropic security. This notion has applications such as location privacy.

In the second part of the thesis, since secrecy crucially depends on perfect randomness, we investigate how perfect randomness can be generated in practice, specifically from human game-play. Unlike many random number generators that assume independent or uniform random bits from the random source, we base our work on the realistic model of randomness. Our main observation is that human game-play has an element of randomness coming from the errors players make, which is also the main entertaining factor of the game. We also observed that game-play can distinguish among a group of people if the right features are collected from it. We incrementally changed our game design until the distinguishability among a small population was maximized, and then ran the experiments required to show the viability of this approach over a larger population. This approach can also provide a hard-to-delegate authentication property, where a human cannot emulate the behavior of another human even given statistical information about their game-play.

Acknowledgments

My research and this thesis would not have been possible without the help and support of

kind people around me; my supervisor, committee members, friends, family and my wife.

First and foremost I wish to express my deepest appreciation to my supervisor, Reihaneh

Safavi-Naini, for her continuous support during my PhD study and research. Your patience

in reading the drafts of my work over and over helped me learn a lot from your comments

and views on topics discussed in this thesis. The joy and enthusiasm you have for research was contagious and motivational for me.

My sincere thanks go to my supervisory committee, Philipp Woelfel and Payman Mohassel, who helped me during different stages of my research at the U of C. I would like to thank my

examiners Keith Martin and Christoph Simon for the comments and suggestions that improved

this work. I gratefully acknowledge the funding sources that made my Ph.D. work possible.

The funding was provided as scholarships and teaching and research assistantships I received from the Department of Computer Science at the U of C and from Alberta Innovates Technology Futures (AITF).

I would like to thank many friends that made our life happier during our stay in Calgary.

Specifically, I appreciate the Johnson family for their kind hospitality. I enjoyed my time in Calgary with my close friends and their families, and the gatherings and trips we had together.

Thank you all my nice friends. I would also like to thank my parents, brothers and sister.

They were always supporting me and encouraging me with their best wishes.

Nobody could have stood by me through the good and bad times of more than 4 years of

my study better than my wife. On occasions when I was very busy with paper deadlines, she tolerated a monotonous life, yet cheered me up and motivated me to work better. I am indebted to you, Narges Mashayekhi.

Table of Contents

Abstract ...... i
Acknowledgments ...... iii
Table of Contents ...... iv
List of Tables ...... viii
List of Figures ...... ix
List of Symbols ...... x
1 Introduction ...... 1
1.1 Formal model of randomness ...... 2
1.2 Randomness for secrecy ...... 3
1.2.1 Guessing secrecy ...... 6
1.2.2 Correlated keys for multiple messages ...... 7
1.3 Generating random numbers, and applications ...... 8
1.3.1 TRG from human game-play: Using video games ...... 9
1.3.2 TRG from human game-play: Game theoretic approach ...... 9
1.3.3 User generated randomness for authentication ...... 9
1.4 Other contributions ...... 10
1.4.1 Review, partial results, and comparison of secrecy primitives ...... 10
1.4.2 Location based storage ...... 10
1.5 Subsequent works ...... 11
1.6 Thesis structure ...... 11
1.6.1 Theorems and proofs ...... 12
2 Preliminaries and Basics ...... 13
2.1 Probability theory ...... 13
2.2 Information theoretic measures ...... 14
2.3 Information theoretic security ...... 20
2.3.1 Computational versus information theoretic model ...... 21
2.4 Secrecy ...... 22
2.4.1 Secret sharing ...... 26
2.5 Concluding remarks ...... 28
3 Randomness requirement of secrecy ...... 29
3.1 Modeling random sources ...... 29
3.2 Dealing with weak random sources in cryptography ...... 32
3.2.1 Local versus public versus shared randomness ...... 34
3.3 Paradigm 1: Randomness extraction ...... 34
3.3.1 Deterministic extractors ...... 35
3.3.2 Seeded Extractors ...... 36
3.4 Paradigm 2: Constructions using imperfect randomness ...... 38
3.4.1 Randomness requirements for perfect secrecy ...... 39
3.4.2 Relaxation of perfect secrecy ...... 41
3.4.3 Comparison of perfect secrecy relaxations ...... 42
3.4.4 Randomness requirements of indistinguishability ...... 43
3.4.5 Secrecy with weak random sources ...... 45
3.4.6 t-source admit randomized encryption ...... 48
3.4.7 One-time pad is universal for deterministic encryption ...... 49
3.5 Randomness requirement for secret sharing ...... 52
3.6 Authentication sources ...... 58
3.6.1 Authentication with t-sources ...... 60
3.7 Comparison of random sources ...... 61
3.8 Concluding remarks ...... 62
4 Guessing secrecy ...... 64
4.1 Introduction ...... 64
4.1.1 Motivation ...... 65
4.1.2 Related work ...... 66
4.1.3 Our contribution ...... 67
4.2 Secrecy based on guessing probability ...... 68
4.3 Requirements on the key size ...... 70
4.4 Requirements on the key distribution ...... 75
4.4.1 Guessing secrecy with imperfect randomness ...... 78
4.4.2 Relation with perfect secrecy ...... 80
4.5 Applications ...... 81
4.6 Bounds on conditional min-entropy ...... 82
4.7 Concluding remarks ...... 85
5 Information Theoretic Security of Sequential High Entropy Messages ...... 86
5.1 Introduction ...... 86
5.1.1 Our contribution ...... 87
5.1.2 Applications ...... 88
5.1.3 Entropic security ...... 89
5.2 Encryption of multiple sequential messages ...... 91
5.2.1 Relaxing λ-message security ...... 94
5.2.2 Min-entropy v.s. ε-indistinguishability ...... 96
5.2.3 Encryption of uniformly random messages ...... 96
5.2.4 Using correlated keys for min-entropy messages ...... 97
5.2.5 Entropic security of the past messages ...... 106
5.3 Concluding remarks ...... 108
5.4 Proofs of lemmas and theorems ...... 108
5.4.1 Proof of Lemma 5.2.1 ...... 108
5.4.2 Proof of Lemma 5.2.2 ...... 109
5.4.3 Proof of Lemma 5.2.3 ...... 109
5.4.4 Proof of security for uniform messages ...... 109
5.4.5 Proof of Theorem 5.2.2 ...... 111
6 Human-Assisted Generation of Random Numbers ...... 113
6.1 Introduction ...... 113
6.1.1 Related work ...... 114
6.2 The structure of a TRG ...... 116
6.2.1 Entropy estimation of the source ...... 117
6.2.2 Extraction module ...... 118
6.2.3 Measuring statistical property of the output ...... 120
6.3 Human game-play in video games for randomness generation ...... 121
6.3.1 Our contribution ...... 122
6.3.2 Applications ...... 123
6.3.3 The TRG design ...... 124
6.3.4 Experimental setup and results ...... 130
6.4 Human game-play in zero-sum games for randomness generation ...... 135
6.4.1 This work ...... 137
6.4.2 Background on expander graphs ...... 138
6.4.3 The TRG design ...... 141
6.4.4 Experiments ...... 146
6.4.5 Measures of randomness for our game ...... 146
6.5 Comparison to Halprin et al. approach ...... 149
6.6 Concluding remarks ...... 150
7 Human game-play for user authentication ...... 152
7.1 Introduction ...... 152
7.1.1 Our contribution ...... 154
7.1.2 Related work ...... 159
7.2 Delegation in authentication ...... 161
7.2.1 HtD Authentication ...... 161
7.3 HtE authentication games ...... 164
7.3.1 The model ...... 165
7.3.2 Criteria to select human features ...... 167
7.4 From HtE games to HtD authentication ...... 169
7.4.1 Adversarial benefits from delegation ...... 170
7.4.2 Game client bypassing attacks ...... 171
7.4.3 MiM attacks ...... 174
7.4.4 Security of the protocol ...... 177
7.5 The proof of concept: HtE game ...... 179
7.5.1 The game design ...... 179
7.5.2 The verification function ...... 181
7.5.3 Experimental setup ...... 184
7.6 Concluding remarks ...... 187
7.6.1 Extensions and future work ...... 187
8 Concluding remarks ...... 189
8.1 Randomness requirements of secrecy ...... 190
8.1.1 Sources for secret sharing ...... 190
8.1.2 Guessing secrecy ...... 192
8.2 Generation of random numbers ...... 194
8.2.1 Randomness generators using human game-play ...... 194
8.2.2 Randomness for authentication ...... 195
Bibliography ...... 197
A LoSt: Location Based Storage ...... 216
A.1 Introduction ...... 216
A.1.1 Setting Considered ...... 217
A.1.2 Our Contribution ...... 219
A.2 Related Work ...... 221
A.3 Storage Model ...... 222
A.4 Trust Assumptions and Impossibility Results ...... 225
A.4.1 Assumptions on Adversarial Behavior ...... 227
A.5 Proofs of Location ...... 227
A.5.1 PoL Scheme ...... 229
A.5.2 Security Model ...... 231
A.5.3 Constructing a PoL ...... 233
A.5.4 PoR with Recoding ...... 234
A.5.5 A Secure PoL Using PoR with Recoding ...... 238
A.6 Experiments ...... 240
A.6.1 Geolocation Method ...... 241
A.6.2 Error Analysis ...... 243
A.7 Conclusion ...... 244
A.8 Acknowledgments ...... 245

List of Tables

3.1 Encryption function with uniform keys ...... 40
3.2 Encryption table with non-uniform keys ...... 40
4.1 Table for 1/32-guessing secrecy ...... 73
4.2 Partitioned one-time pad for 3/32-guessing secrecy ...... 75
6.1 Min-entropy of users in first sub-game ...... 147
6.2 Result of statistical test ...... 148
6.3 Min-entropy of users in second sub-game ...... 148
6.4 Result of statistical test ...... 148

List of Figures and Illustrations

2.1 Venn diagram of entropy measures ...... 16
2.2 Secure Communication in Cryptography ...... 20
3.1 Impossibility of randomness extraction from a (n, n−1)-source ...... 36
3.2 Comparison of random sources ...... 62
6.1 Screen-shot of the game ...... 125
6.2 The measurement of output ...... 125
6.3 Min-entropy for players ...... 131
6.4 Min-entropy in blocks of bits (One user) ...... 131
6.5 Min-entropy during level A, B, C for 3 users ...... 132
6.6 Average min-entropy change during levels over all users ...... 132
6.7 Min-entropy of bits ...... 133
6.8 The game ...... 144
7.1 Screen-shot of the game ...... 179
7.2 Hit precision ...... 179
7.3 User verification accuracy; measurements matching the profile ...... 181
7.4 The smooth edf of the features for 7 random users, and the distance between them, illustrating how the features can distinguish among a group of people ...... 183
7.5 The smooth histogram of the feature values for 7 users, illustrating how the features can distinguish among a group of people ...... 185
7.6 A user trying to emulate the behavior of another user; increase in hit accuracy results in increase in targeting time ...... 186
A.1 Example of File Location Regions ...... 223
A.2 Bad Verification Example ...... 224
A.3 Location Faking ...... 226
A.4 Security Experiment for PoL ...... 233
A.5 Boxplots of Geolocation Error (km) for File Retrieval and Challenge Sizes 5, 35, 65 ...... 243

List of Symbols and Abbreviations

Symbol    Definition
∆(· ; ·)    Statistical Distance
G(·)    Guessing Probability
H(·)    Shannon Entropy
H∞(·)    Min-Entropy
E    Expected Value
I(· ; ·)    Mutual Information
P(·)    Probability Distribution
BPP    Bounded Probabilistic Polynomial
CDF    Cumulative Distribution Function
DB    Distance Bounding
FAR    False Acceptance Rate
FRR    False Rejection Rate
HtD    Hard to Delegate
HtE    Hard to Emulate
ITS    Information Theoretic Security
KS    Kolmogorov-Smirnov
MAC    Message Authentication Code
MiM    Man in the Middle
OS    Operating System
OTP    One-Time Pad
PDF    Probability Distribution Function
PPT    Probabilistic Polynomial Time
PS    Perfect Secrecy
RNG    Random Number Generator
SSL    Secure Socket Layer
TRG    True Randomness Generator
U of C    University of Calgary

Chapter 1

Introduction

This thesis is about the theory, applications and generation of randomness with main emphasis

on information theoretic security. Randomness is vital in many areas of computer science,

specifically in cryptography. Almost all cryptographic primitives use randomness somewhere

in their design and they all assume perfect randomness is available, so the proofs are mostly

based on the assumption that randomness is perfect. By perfect, we mean a sequence of

independently generated and uniformly distributed bits, that is completely unpredictable.

In other words, attackers must not be able to guess even one bit of the random secret with probability greater than 1/2.

Poor choices of randomness in the past have resulted in complete breakdown of security and

expected functionality of the system. Early reported examples of bad choices of randomness

resulting in security failure include the attack on the Netscape implementation of the SSL protocol [GW96] and the weakness of entropy collection in the Linux Pseudo-Random Generator [GPR06].

A more recent high profile reported case was the discovery of collisions among secret (and

public) keys generated by individuals around the world [LHA+12, HDWH12]. Further

studies attributed the phenomenon partly to a flaw in the Linux kernel randomness generation subsystem. Therefore, generating good randomness is an important task in real world applications, and new sources of randomness would add to the security of systems.

Many attacks on secure systems are due to the bad randomness used, not to the underlying

security algorithms and protocols.

In practice, computer systems generate random bits heuristically from the accessible

physical sources of randomness that are not guaranteed to generate random bits with required

properties, or at least there is no proof that they do so. Therefore, on one hand, good randomness is very important and is inherently needed in many applications of cryptography, and on the other hand, sources of randomness that can be proved to have the required security properties for cryptography cannot easily be found. So there is a gap between what

is needed in cryptography and what can be achieved in practice. One approach to fill this

gap is to find constructions for cryptographic primitives that can be proved secure when

perfect randomness is substituted by the output of true random number generators (TRG).

Another approach is to modify constructions such that they demand less of this expensive

computational resource, namely “randomness”.

To prove that a construction is secure when perfect randomness is substituted by the

output of a true randomness generator, a precise modeling of this output is required. In this

thesis, we start with a formal model of randomness that can represent the output of physical

random sources for practical applications. Then the thesis is divided into two main parts, in which we base our arguments on the formalism of random sources. Part 1 consists of three

chapters investigating randomness requirements of different secrecy notions. Part 2 consists

of two chapters on practical generation of random numbers given the formal model and their

applications in cryptography.

1.1 Formal model of randomness

A TRG is a process that reads the raw output from a source of randomness (e.g. physical

sources such as thermal noise) and then transforms this output to perfect randomness by

applying a post-processing function. Real world constructions of TRGs often assume that

the raw output of the physical random source satisfies certain properties (e.g. the symbols

are normally distributed) and they design a post-processing function that transforms the raw

output to perfect randomness. However, the properties are heuristically tested and there is no

proof that the raw output satisfies the properties. Note that “mathematically proving” that

the output has any specific properties seems impossible for known sources of randomness due to their non-deterministic behavior over time, or at least we do not yet know how to prove it.

Thus a line of work sought to relax the assumptions on the properties of the raw output. The

most accepted model of randomness (many works are based on this model) was proposed by

Chor and Goldreich [CG88] where the only assumption on the output distribution is a bound

on the probability that each symbol occurs, and they call it weak random sources. In this

source, the symbols are neither considered independent nor uniformly distributed. Moreover,

other models of randomness proposed in the literature can be represented as special cases of this model. We consider weak random sources in this thesis.
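To make the weak-source model concrete, here is a minimal illustrative sketch in Python (not part of the thesis); the toy distribution, its bound b, and the helper names are hypothetical and chosen only to show what "a bound on the probability of each symbol" means.

```python
import math
import random

def min_entropy(dist):
    """H_inf(X) = -log2(max_x P(x)) for a distribution given as {symbol: probability}."""
    return -math.log2(max(dist.values()))

# A hypothetical weak source over four symbols: the draws are neither uniform nor
# assumed independent, but no symbol probability exceeds the bound b = 0.6.
b = 0.6
weak_dist = {"a": 0.6, "b": 0.2, "c": 0.15, "d": 0.05}

assert max(weak_dist.values()) <= b   # the only assumption the weak-source model makes
print(min_entropy(weak_dist))         # -log2(0.6) ~ 0.74 bits; the bound guarantees at least this

# Sampling from the source (independent draws here are only for the sake of the demo).
symbols, weights = zip(*weak_dist.items())
print("".join(random.choices(symbols, weights=weights, k=20)))
```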

1.2 Randomness for secrecy

In this part of the thesis, based on the formalism of the randomness model, we study the randomness requirements of cryptographic primitives related to secrecy, in terms of the number and distribution of bits needed for generating the secret key or randomness. In particular, secrecy

primitives such as encryption and secret sharing are investigated to determine if realistic

models of random sources can be used effectively to provide security for these primitives. The

task is to seek a minimal condition on the randomness that can be used in a particular cryptographic primitive. Here, a minimal condition refers to how practical the assumption on the random source is for a given primitive. For example, assuming access to independent and

uniformly distributed random bits is considered costly, while assuming that a source generates

output bits from a distribution with a certain bound on the probability of each element,

is considered more realistic. So a natural question is: to what extent does a cryptographic primitive depend on randomness? And what are the properties (e.g. distribution) of the

randomness that are needed for a particular primitive? In other words, what is the “random-

ness complexity” of a cryptographic primitive needed to provide reasonable security guarantees? By

randomness complexity, we mean both the quantity, i.e. the number of random bits, and the

quality, i.e. the distribution of the randomness that is needed to accomplish a cryptographic

task. To formally define the randomness complexity of a primitive, a measure of randomness

is needed. One estimate for the quality of the randomness is min-entropy which measures

the unpredictability of a random variable by the largest probability that its outputs can

take. This measure enables us to classify different cryptographic primitives based on their

randomness demand. In analogy with complexity theory, where complexity classes are defined

to classify algorithms based on their need for time or space, in this context “random sources”

are defined to classify algorithms based on their need for randomness. A random source is

defined as a class of probability distributions that have some property, e.g. distributions that

are sufficient to perform a cryptographic task such as encryption.

Modern cryptography was initiated by the seminal work of Shannon [Sha49], where he

formally defined a notion of secrecy, called perfect secrecy, and a by-product of his work was to bound the amount of randomness required to achieve perfect secrecy. He proved

results on the size and distribution of the secret key needed for this definition. Perfect

secrecy requires that a ciphertext does not give any more information about the plaintext

than the adversary already knows prior to viewing the ciphertext. Shannon showed that an

encryption scheme that achieves perfect secrecy is the one-time pad, which requires that the key be selected uniformly at random with a size equal to that of the plaintext. He also proved that for any encryption scheme with perfect secrecy, the Shannon entropy of the key must be no less than that of the plaintext. This is a very expensive demand because even if one could afford

the size of the key, there is no guarantee about uniformity of the key, as discussed before. So

a number of works tried to relax these requirements. In the first natural relaxation, some leakage of the message was allowed. This was formalized as ε-indistinguishability (ε-secrecy).

This relaxation, however, could not reduce the requirements on the secret key significantly: as will be shown in Theorem 3.4.2, even for a key of length one bit less than the message, ε-secrecy is impossible for ε ≤ 1/2.

Since the above limitations are hard to satisfy in practical scenarios, modern cryptography

turned toward computational security [GM82]. In computational security the adversary

is assumed to have limited computational power, and breaking security is related to a

computationally “hard” problem where hardness of a problem is formalized using complexity

theory. Using this approach, a scheme is secure if any probabilistic polynomial time (PPT) adversary (algorithm) has only a negligible chance of breaking the scheme. Note that

requiring “negligible” success chance instead of 0 success chance is needed to reduce the key

length: it was recently proved by Dodis [Dod12] that for an adversary running in time v

equal to the maximum bit length of the message and the ciphertext, success probability of

0 in breaking the encryption implies that the min-entropy of the key must be greater than

the min-entropy of the message. Compared to information theoretic security, it is important

to note that computational security is based on unproven assumptions such as the hardness of integer factoring or of the discrete logarithm problem.

In the information theoretic setting, Russell and Wang [RW02] relaxed perfect secrecy by

assuming a bound on the prior knowledge of the adversary about the plaintext. With this

relaxed notion, they could achieve a smaller key length depending on the amount of the adversary's knowledge. But their scheme only works when keys are sampled uniformly at random. Dodis and Smith [DS05] built upon their work by showing that the relaxed notion of Russell and

Wang is equivalent to the definition of randomness extractors and by this, they could achieve

simpler constructions.

For non-uniform keys, which we refer to as imperfect keys, a number of papers [MP91, DS02, DOPS04] tried to find out whether ε-secrecy is possible if the key is not chosen uniformly at random. ε-secrecy allows a small amount of information to be leaked to the

adversary compared to perfect secrecy which tolerates no information leakage. Bosley and

Dodis [BD07] settled the problem by proving that even for ε-secrecy, imperfect keys are not

sufficient for encryption. They proved that for practical key lengths that are not exponential

in the size of the plaintext, either the key is transformable to a uniform key with a size equal to the size of the plaintext using a deterministic function, or encryption is impossible. This

deterministic function is called an extractor which extracts almost uniform randomness from

non-uniform randomness. If such a function exists, then one can transform the key to an

almost uniform key and then use one-time pad to encrypt a plaintext of the same size as the

output key.

In this thesis, we propose two new notions of security that are motivated by practical

applications.

1.2.1 Guessing secrecy

In the first relaxation, a definition of secrecy named “Guessing secrecy” is proposed and

its properties and randomness requirements are investigated. This definition is based on the min-entropy of the message, requiring that the predictability of the message does not change when an adversary views a ciphertext. Predictability refers to the best advantage of the adversary in guessing a message sampled from a distribution. The best an adversary can do is to guess the most probable value from the distribution; measuring this in terms of bits gives us “Min-Entropy”: for a random variable $X$ over the set of messages $\mathcal{X}$, the best guess of an adversary is the $x^*$ with $\mathcal{P}_X(x^*) = \max_{x\in\mathcal{X}} \mathcal{P}_X(x)$, and measuring this unpredictability in terms of bits results in $-\log \max_x \mathcal{P}_X(x)$. We show that guessing secrecy requires some of the same conditions on the secret key as Shannon's perfect secrecy. For example, it

requires the same length of the secret key as Shannon’s perfect secrecy. However, there exists

a family of distributions on messages and keys such that guessing secrecy is satisfied, but

perfect secrecy is not for that family. The significance of this result is that depending on the

security guarantee that is required in the secrecy system, there exists a reasonable model of secrecy for which weaker random sources (compared to uniformly distributed ones) suffice to provide security. Comparing this to entropic security [DS05], where a smaller but uniformly distributed key sufficed to provide security, in guessing secrecy the distribution of the secret

key can be relaxed, but not its length. This work was published in ICITS’12 [ASN12].

1.2.2 Correlated keys for multiple messages

Most relaxations of secrecy in information theoretic security are based on one message security,

and a fresh sampling of the secret key, completely independent of the first key, is required for

the encryption of a second message. In this work, we first extend the definition of secrecy to

multiple messages where an obvious encryption scheme would be to use independent keys to

provide the security for messages. We prove bounds on the length of the uniformly distributed

and independent keys in multiple message security. Then we show how correlated keys can be

used to encrypt a sequence of messages, by relaxing the definition of multiple message security.

We assume that the adversary’s advantage in predicting messages (before encryption) is

sufficiently small and thus the adversary cannot guess the message before encryption occurs.

In our construction, the secret keys would still be close to the uniform distribution, but they are not independent of the previous keys. In this relaxation, the last message in the sequence is proved to be secure at the price of some leakage in older messages. We show, however,

that this leakage is not arbitrary and can be bounded in terms of entropic security [DS05].

This is useful in scenarios in which the importance of encrypted messages expires after some time and the security of the last message is the most important. This happens in applications such as

keeping the privacy of location or health records, where the last record (e.g. location) is the

most important and the past records are not. Compared to entropic security, if the number

of messages λ to be encrypted was known prior to the beginning of communication, one could

use a construction of [DS05] to encrypt the first λ − 1 messages and use the one-time pad to encrypt the last one. However, if the number of messages is not known, or the messages

are streamed (with no possible ending), then our construction has the advantage of using

correlated keys and achieves a smaller key size.

1.3 Generating random numbers, and applications

In this part of the thesis, we discuss two methods of generating random numbers using human interaction with computers. Generating random numbers has been investigated in many works due to its importance in security applications. The deterministic nature of computers makes it

impossible to generate random numbers unless an external source of randomness is utilized.

Prior works investigated random number generation from physical sources with inherent

randomness, such as thermal noise. In many such sources the underlying laws that govern the behavior of the system are unknown, so the phenomenon is perceived as random. One can argue that once these laws are discovered, the source cannot be considered random anymore. The quantum source HotBit [Wal01], however, can be constructed to rely on proven uncertainty laws of quantum mechanics, and so has provably uncertain properties. Realizations of these sources, however, introduce inaccuracies that affect this property.

There is a debate that many physical sources of randomness are not inherently random, in contrast to quantum events such as radioactive decay, which are inherently random. Therefore quantum phenomena are used as a more reliable source of randomness, and many TRGs, such as HotBit, are built based on them. Other works, such as [HN09, ZLwW+09], considered humans as a source of randomness. This is because human choices and actions, although biased, contain inherent randomness. In our two proposals, we used humans as an external source of randomness, where the unpredictable behavior or choices of a human can be used to generate random numbers. Even though human choices and behaviors are biased, because of the complexity of modeling human cognitive and perceptual systems it is hard to predict a human's choices while they are engaged in a game and their goal is to win the game.

1.3.1 TRG from human game-play: Using video games

In the first method, we propose a random number generator from human game-play in video

games. Our observation is that for the game to be entertaining, there is always an inevitable

element of error in human game-play that can be used to generate random numbers. Our

experiments showed that even for the most experienced players, these errors are sufficient to

produce enough unpredictability to generate random numbers. In particular, this approach is

quite effective on gaming consoles and smart phones, where hardware sources of randomness are not sufficient to provide enough entropy for server communication. We discuss our experiments and other real world applications of this approach. This work is to be published in the post-proceedings of ISC'13 [ASN14].

1.3.2 TRG from human game-play: Game theoretic approach

Halprin et al. [HN09] proposed to generate randomness from human game-play in zero-sum

games based on psychological experiments that showed human behavior is close to random when involved in competitive games that leave a few choices for the human (2 to 3 choices).

Their work, however, used a zero-sum game with many choices for the human. The many choices for the human were meant to achieve a higher rate of randomness generation, but this contrasts with the psychological results, which call for few choices for the human. We extended their work by keeping

the rate of randomness generation high, and at the same time giving only a few choices

(3 choices) to the human. We also did not use any extra randomness to achieve the security guarantees required for a TRG. This work is published in GameSec'13 [ASS13].

1.3.3 User generated randomness for authentication

Finally, we discuss how the randomness in human game-play can be used for authentication purposes. Although the errors in human game-play in video games are sufficiently random to generate randomness, we found that these errors (and in general human behavior in game-play) also represent a fingerprint of a human that can be used as an additional factor in authenticating humans, with interesting properties. Our empirical studies showed that the behavior of a human entity in games is very hard for others to mimic, even given information

about their game-play behavior and statistical information of how the entity plays the game.

This property has some interesting applications such as hard-to-delegate authentication: It

should be hard for an authorized entity to delegate its authentication information to others

so that they get authenticated too.

1.4 Other contributions

Comparing random sources for various cryptographic primitives is an interesting question; the goal is to find which primitives are more demanding in terms of randomness. We present a literature review along with our partial results and discuss this line of work.

1.4.1 Review, partial results, and comparison of secrecy primitives

We review the results on randomness requirements of information theoretic encryption and

secret sharing, starting from the seminal work of Shannon by highlighting the main results.

We discuss and generalize the proofs on the size of the key needed for encryption. We then

discuss randomness requirements for encryption in terms of the distribution of the key and the

minimum requirement for secret keys. Comparing random sources in different applications is

an important subject to understand how demanding an application is in terms of randomness

requirements. In this regard, a number of results on the relation between the randomness

requirements of secret sharing and encryption are proved.

1.4.2 Location based storage

Distributed storage servers are ubiquitous nowadays to store files and data of many organi-

zations. In this service, data may be stored on any server around the globe. However, it is

important to know the country in which certain sensitive data (such as health records) are stored, in order to determine the laws applicable to the data. In this work, using a number of servers around the world and proof of retrievability protocols, an estimate of the location of files is calculated, under the reasonable assumption that, for economic reasons, servers in distributed storage systems do not keep copies of files. Our main contribution was in the implementation

and running of the experiment to locate the files, together with a theoretical analysis of how an estimate of the location can be derived using multiple servers. To do this, a server-client application was developed over the PlanetLab network of computers. This work is published in CCSW'12

[WSNA+12].

1.5 Subsequent works

Jiang [Jia13] compared our definition of guessing secrecy with various secrecy definitions and showed that guessing secrecy is a weaker notion than ε-secrecy. In other words, any encryption scheme that provides ε-secrecy also provides guessing secrecy. However, a scheme providing guessing secrecy does not necessarily provide ε-secrecy.

Iwamoto et al. [IS14] generalized guessing secrecy to another type of entropy called

“Rényi entropy”, which is based on the collision probability of a random variable, i.e. the probability that two independent copies of the variable output the same realization. The authors extended the results to a general model of entropy of which min-entropy and Shannon entropy are special cases. They derived bounds on the size of the secret key needed to achieve secrecy in this model, which conformed to the bounds for guessing secrecy and ε-secrecy.

1.6 Thesis structure

The thesis is divided into eight chapters and an appendix. The first chapter is an introduction to the problem, prior work and our contribution. In the second chapter, we recall some of the essential background needed for the thesis, including concepts from probability theory, information measures and information theoretic security. In the third chapter, we review secrecy in information theoretic security in terms of randomness requirements. We recall some of the celebrated results and also prove basic results needed in the following two chapters. In the fourth chapter, we give a definition of secrecy, called guessing secrecy. We prove that although it needs the same number of random bits to encrypt messages under this definition, in certain cases the randomness can be biased. In the fifth chapter, we discuss our proposal for encryption of multiple messages. In the sixth chapter, we discuss two approaches to practically generating random numbers using human interaction with computers. In the seventh chapter, we show how human game-play can be used to provide authentication with a new property, namely that the authentication cannot be delegated to a third party. In the last chapter, we point out concluding remarks of the thesis and discuss some of the open problems and interesting directions from this work. In the appendix, a joint work on location based storage is presented.

1.6.1 Theorems and proofs

In this thesis, whenever a theorem or lemma is taken from other works, it is cited with a reference to the original paper proving it. If a theorem or lemma is not cited, the proof is our own.

Chapter 2

Preliminaries and Basics

In this chapter, we recall some of the background on probability theory and information theoretic security that is used extensively in this work, following [CT91].

2.1 Probability theory

Information theoretic security (ITS) relies on discrete probability theory for many of its

definitions and concepts. In this section, we remind the reader of the parts of probability theory that are used in this thesis. We follow the notation of [CM97a]:

A discrete probability space $(\Omega, P)$ is defined over a finite or countably infinite set $\Omega$, called the sample space, and a probability function $P$ that assigns a number in the interval $[0,1]$ to each element of the sample space, with the sum over the whole sample space equal to 1. $P[\omega]$ is the value assigned to $\omega \in \Omega$ by the probability function $P$. An event $A$ is a subset of the sample space, and the probability associated with an event $A$, denoted by $P[A]$, is the sum of the probabilities of the elements of $A$, i.e. $P[A] = \sum_{\omega \in A} P[\omega]$. A discrete random variable $X$ is defined over a probability space $(\Omega, P)$ and is a mapping from the sample space to an alphabet $\mathcal{X}$, with a distribution function $\mathcal{P}_X$ that assigns a probability $\mathcal{P}_X(x)$ to the event $x \in \mathcal{X}$, defined as follows:

$$\mathcal{P}_X(x) = P[X = x] = \sum_{\omega \in \Omega : X(\omega) = x} P[\omega].$$

In this thesis, random variables are always denoted by capital letters (e.g. $X$), their alphabet is denoted by the corresponding calligraphic letter (e.g. $\mathcal{X}$), and their realization, i.e. the observed value of a random variable, is denoted by the corresponding lower case letter (e.g. $x$). By the size of $\mathcal{X}$ we mean the number of elements in the set $\mathcal{X}$, and by the length of $\mathcal{X}$ we mean the number of bits needed to represent elements of $\mathcal{X}$ (the length of $x$ means its bit length).

Having a pair of random variables $X$ and $Y$ over the same sample space with respective alphabets $\mathcal{X}$ and $\mathcal{Y}$, the random variable $XY$ is defined over the alphabet $\mathcal{X} \times \mathcal{Y}$ with joint probability distribution $\mathcal{P}_{XY} : \mathcal{X} \times \mathcal{Y} \to [0,1]$ given by $\mathcal{P}_{XY}(x, y) = P[X = x, Y = y]$. The conditional probability distribution of the random variable $X$ given that $Y$ takes on the value $y \in \mathcal{Y}$ with $\mathcal{P}_Y(y) > 0$ is denoted by $\mathcal{P}_{X|Y}(x\,|\,y)$ and defined by

$$\mathcal{P}_{X|Y}(x\,|\,y) = \mathcal{P}_{X|Y=y}(x) = \frac{\mathcal{P}_{XY}(x, y)}{\mathcal{P}_Y(y)}, \quad \text{for } x \in \mathcal{X}. \qquad (2.1)$$

Note that for a value $y \in \mathcal{Y}$, $X\,|\,Y = y$ is a random variable over $\mathcal{X}$. We say that two random variables $X$ and $Y$ are independent if for all $x \in \mathcal{X}$, $y \in \mathcal{Y}$

$$\mathcal{P}_{XY}(x, y) = \mathcal{P}_X(x)\,\mathcal{P}_Y(y) \quad \text{or} \quad \mathcal{P}_{X|Y}(x\,|\,y) = \mathcal{P}_X(x). \qquad (2.2)$$

From equation (2.1), it is easy to see that if $\mathcal{P}_Y(y) > 0$, then

$$\mathcal{P}_{X|Y}(x\,|\,y) = \frac{\mathcal{P}_{Y|X}(y\,|\,x)\,\mathcal{P}_X(x)}{\mathcal{P}_Y(y)}. \qquad (2.3)$$

This equality is called Bayes' theorem.

The expected value of a random variable $X$ defined over the real numbers is denoted by $E[X]$ and is given by

$$E[X] = \sum_{x \in \mathcal{X}} x\,\mathcal{P}_X(x).$$

For a random variable $X$, the support of $X$, $\mathrm{supp}(X)$, is the set of all elements $x \in \mathcal{X}$ with non-zero probability, i.e. $\mathrm{supp}(X) = \{x \in \mathcal{X} \mid \mathcal{P}_X(x) > 0\}$.
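As a small numerical illustration of equations (2.1)–(2.3) (a sketch added for this text, not from the thesis; the toy joint distribution is made up), the following Python code derives a conditional distribution from a joint one and checks Bayes' theorem on it:

```python
# Joint distribution P_XY over a toy alphabet; the probabilities sum to 1.
joint = {("rain", "wet"): 0.3, ("rain", "dry"): 0.1,
         ("sun", "wet"): 0.1, ("sun", "dry"): 0.5}

# Marginals P_X and P_Y obtained by summing the joint distribution.
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in ("rain", "sun")}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in ("wet", "dry")}

def cond_x_given_y(x, y):
    """Equation (2.1): P_{X|Y}(x|y) = P_XY(x, y) / P_Y(y)."""
    return joint[(x, y)] / p_y[y]

def cond_y_given_x(y, x):
    return joint[(x, y)] / p_x[x]

# Bayes' theorem, equation (2.3): P_{X|Y}(x|y) = P_{Y|X}(y|x) P_X(x) / P_Y(y)
x, y = "rain", "wet"
lhs = cond_x_given_y(x, y)
rhs = cond_y_given_x(y, x) * p_x[x] / p_y[y]
print(lhs, rhs)  # both 0.75
```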

2.2 Information theoretic measures

In many scenarios in ITS one needs a measure for the amount of information contained in a

random variable, or the information that is leaked from a message or secret after revealing

14 a function of it. Information in a random variable is associated with unpredictability or

uncertainty of that random variable. If the output of a random variable is completely

predictable, then the random variable's output adds no extra information to one's knowledge, and the more unpredictable the random variable is, the more information is gained after learning

its output. The terms information gain, uncertainty and unpredictability of a random variable

refer to different aspects of the same concept. A formalization of this concept was introduced

in the seminal work of Shannon [Sha48]. Shannon defined the entropy or uncertainty of a

random variable as a measure of information that is gained after observing a realization of

the random variable.

Definition 2.2.1 (Shannon entropy) The Shannon entropy $H(X)$ of a random variable $X$ with probability distribution $\mathcal{P}_X$ is given by:

$$H(X) = -\sum_{x \in \mathcal{X}} \mathcal{P}_X(x) \log(\mathcal{P}_X(x)) = E[-\log \mathcal{P}_X],$$

where $\log$ is in base 2. Note that we will use logarithms in base 2 throughout the entire thesis.

For example, the information contained in a random variable over $n$-bit strings with uniform distribution is $n$ bits, since we cannot predict any particular realization of the random variable; the output is completely unpredictable, and thus by observing a realization of the random variable we learn $n$ bits of information. However, if the distribution were not uniform, then we could guess the most probable element of the space as a candidate, and thus the realization of the random variable would be predictable with a better chance. Therefore we expect the information contained in such a random variable to be less than $n$ bits.
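A minimal Python illustration of Definition 2.2.1 (not part of the thesis; the toy distributions are made up): the uniform distribution over 3-bit strings has entropy exactly 3 bits, while a biased distribution over five symbols has less than $\log_2 5$ bits.

```python
import math

def shannon_entropy(dist):
    """H(X) = -sum_x P(x) log2 P(x), skipping zero-probability symbols."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

uniform3 = {x: 1 / 8 for x in range(8)}              # uniform over 3-bit strings
biased   = {0: 0.5, 1: 0.2, 2: 0.1, 3: 0.1, 4: 0.1}  # non-uniform over 5 symbols

print(shannon_entropy(uniform3))  # 3.0 bits
print(shannon_entropy(biased))    # ~1.96 bits < log2(5) ~ 2.32 bits
```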

The joint Shannon entropy of jointly distributed random variables $X$ and $Y$ is the entropy of the joint probability distribution $\mathcal{P}_{XY}$. The conditional Shannon entropy $H(X\,|\,Y)$ of the random variable $X$ given $Y$ is the average value of $H(X\,|\,Y=y)$ over all possible values of $y$, i.e.

$$H(X\,|\,Y) = \sum_{y \in \mathcal{Y}} \mathcal{P}_Y(y)\, H(X\,|\,Y=y).$$

Definition 2.2.2 (Mutual information) The mutual information between two random variables measures their mutual dependence and is given by

$$I(X; Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \mathcal{P}_{XY}(x, y) \log\left(\frac{\mathcal{P}_{XY}(x, y)}{\mathcal{P}_X(x)\,\mathcal{P}_Y(y)}\right).$$

Note that the above definition is symmetric in $X$ and $Y$. Mutual information can also be defined as the amount of information reduction in $X$ after knowing $Y$, i.e.

$$I(X; Y) = H(X) - H(X\,|\,Y),$$

as can be seen from the Venn diagram in Figure 2.1. See [CT91] for a proof of this.

Figure 2.1: Venn diagram of entropy measures, showing H(X), H(Y), H(X|Y), H(Y|X), I(X;Y) and H(X,Y).
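As a small illustrative check (not from the thesis; the joint distribution below is made up), the following Python sketch computes $I(X;Y)$ both directly from Definition 2.2.2 and as $H(X) - H(X\,|\,Y)$; the two values agree.

```python
import math

def H(dist):
    """Shannon entropy of a distribution given as {symbol: prob}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# A hypothetical joint distribution P_XY over {0,1} x {0,1}.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

# Definition 2.2.2: I(X;Y) = sum_{x,y} P_XY(x,y) log( P_XY(x,y) / (P_X(x) P_Y(y)) )
mi_direct = sum(p * math.log2(p / (px[x] * py[y]))
                for (x, y), p in joint.items() if p > 0)

# Equivalent form: I(X;Y) = H(X) - H(X|Y), with H(X|Y) = sum_y P_Y(y) H(X|Y=y)
h_x_given_y = sum(py[y] * H({x: joint[(x, y)] / py[y] for x in (0, 1)}) for y in (0, 1))
mi_chain = H(px) - h_x_given_y

print(round(mi_direct, 6), round(mi_chain, 6))  # both ~0.278 bits
```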

We also need a measure of the “closeness” of two random variables over the same alphabet.

This measure is called statistical distance and is defined as follows:

Definition 2.2.3 (Statistical distance) The statistical distance between two random variables $X, Y$ over $\mathcal{X}$ is defined by

$$\Delta(X; Y) = \max_{T \subseteq \mathcal{X}} \big| \Pr[X \in T] - \Pr[Y \in T] \big| = \frac{1}{2} \sum_{x} \big| \mathcal{P}_X(x) - \mathcal{P}_Y(x) \big|.$$

We use $X \approx_\epsilon Y$ to denote $\Delta(X; Y) \le \epsilon$.
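The following small Python sketch (an illustration, not from the thesis; the two toy distributions are made up) computes the statistical distance via the half-L1 form of Definition 2.2.3:

```python
def statistical_distance(p, q):
    """Delta(X;Y) = 1/2 * sum_x |P_X(x) - P_Y(x)| for distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

uniform = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
skewed  = {0: 0.40, 1: 0.30, 2: 0.20, 3: 0.10}

print(statistical_distance(uniform, skewed))   # 0.2, i.e. skewed is 0.2-close to uniform
print(statistical_distance(uniform, uniform))  # 0.0
```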

We denote the expected statistical distance of two random variables $X, Y$ conditioned on $Z$ by

$$\Delta\big((X,Z); (Y,Z)\big) = \Delta(X; Y \,|\, Z) = \frac{1}{2} \sum_{x,z} \mathcal{P}_Z(z) \big| \mathcal{P}_{X|Z}(x\,|\,z) - \mathcal{P}_{Y|Z}(x\,|\,z) \big|.$$

We say $X$ is $\epsilon$-close to $Y$ if $\Delta(X; Y) \le \epsilon$. Specifically, if $Y$ is the uniform distribution, then we say $X$ is $\epsilon$-random. We will use the following lemmas on statistical distance throughout this thesis.

Lemma 2.2.1 (Triangle inequality) For any random variables $X, Y, Z$ we have

$$\Delta(X; Y) \le \Delta(X; Z) + \Delta(Z; Y).$$

The above lemma is a well-known fact about statistical distance which is used in many papers.

Lemma 2.2.2 For any two random variables $X, Y$ over $\mathcal{M}$ that are jointly distributed with a random variable $A$, and any function $f$ on $\mathcal{M}$,

$$\Delta\big(f(X); f(Y) \,|\, A\big) \le \Delta\big(X; Y \,|\, A\big),$$

with equality if $f$ is a one-to-one function.

Proof. Suppose $f(\mathcal{M})$ is the image of $f$ on $\mathcal{M}$; then by the definition of statistical distance the following holds:

$$\begin{aligned}
\Delta\big((f(X),A); (f(Y),A)\big) &= \frac{1}{2} \sum_{b \in f(\mathcal{M}),\, a \in \mathcal{A}} \mathcal{P}_A(a) \big| \mathcal{P}_{f(X)|A}(b\,|\,a) - \mathcal{P}_{f(Y)|A}(b\,|\,a) \big| \\
&= \frac{1}{2} \sum_{b \in f(\mathcal{M}),\, a \in \mathcal{A}} \mathcal{P}_A(a) \Big| \sum_{m \in \mathcal{M}: f(m)=b} \big\{ \mathcal{P}_{X|A}(m\,|\,a) - \mathcal{P}_{Y|A}(m\,|\,a) \big\} \Big| \\
&\le \frac{1}{2} \sum_{b \in f(\mathcal{M}),\, a \in \mathcal{A}} \; \sum_{m \in \mathcal{M}: f(m)=b} \mathcal{P}_A(a) \big| \mathcal{P}_{X|A}(m\,|\,a) - \mathcal{P}_{Y|A}(m\,|\,a) \big| \\
&= \frac{1}{2} \sum_{m \in \mathcal{M},\, a \in \mathcal{A}} \mathcal{P}_A(a) \big| \mathcal{P}_{X|A}(m\,|\,a) - \mathcal{P}_{Y|A}(m\,|\,a) \big| \\
&= \Delta\big((X,A); (Y,A)\big).
\end{aligned}$$

To have equality, it is sufficient for $f$ to be one-to-one. $\square$

Lemma 2.2.3 For any two random variables $X \approx_\epsilon Y$ over $\mathcal{M}$, and any function $f$ on $\mathcal{M}$, we have $(X, f(X)) \approx_\epsilon (Y, f(Y))$.

Proof. Suppose $f(\mathcal{M})$ is the image of $f$ on $\mathcal{M}$; then by the definition of statistical distance

$$\Delta\big((X, f(X)); (Y, f(Y))\big) = \frac{1}{2} \sum_{a \in \mathcal{M},\, b \in f(\mathcal{M})} \big| \Pr[X = a, f(X) = b] - \Pr[Y = a, f(Y) = b] \big|$$
$$= \frac{1}{2} \sum_{a \in \mathcal{M},\, b \in f(\mathcal{M})} \Big| \sum_{c \in f^{-1}(b)} \Pr[X = a, X = c] - \Pr[Y = a, Y = c] \Big|.$$

But $\Pr[A = a, A = c] \le \Pr[A = a]$, since it is either zero (when $a \ne c$) or $\Pr[A = a]$ (when $a = c$, i.e. $a \in f^{-1}(b)$). Since $f$ is a function, the latter case occurs for all $b$ if and only if $f$ is one-to-one. Also note that for each $b$, there is at most one possibility that $a = c$, and therefore, continuing from the last equation, the following holds:

$$\Delta\big((X, f(X)); (Y, f(Y))\big) \le \Delta(X; Y). \qquad \square$$

Definition 2.2.4 (Perfect versus imperfect random bits) We will refer to a sequence of bits as perfect random bits if it was generated from the uniform distribution, i.e. a 0-random distribution. The term perfect randomness is generally used when referring to symbols generated independently from the uniform distribution.

A uniform distribution over an $n$-bit space or over a set $\mathcal{R}$ is denoted by $U_n$ or $U_\mathcal{R}$, respectively. A sequence of bits generated from a distribution that is $\epsilon$-random with relatively small $\epsilon$ is called almost random bits, and such a distribution is called a nearly perfect random distribution.

By imperfect randomness we refer to randomness that is not uniformly distributed.

As seen in Definition 2.2.1, Shannon entropy is an average measure of the unpredictability of a random variable. The greater the Shannon entropy of a random variable, the greater its uncertainty, and thus the more perfect its randomness is on average. However, in cryptographic

applications one usually needs a worst case measure of randomness. To have an insight,

consider the following random variable:

$$X = \begin{cases} 0^n & \text{with probability } 0.99, \\ U_n & \text{with probability } 0.01. \end{cases} \qquad (2.4)$$

It is easy to verify that $H(X) \ge 0.01n$. Although we do not expect to obtain useful randomness from one sample of the random variable $X$, the Shannon entropy of $X$ is relatively high. For cryptographic applications, the randomness associated with this random variable is not useful because the adversary would guess the value $0^n$ for the output of the random variable and a

success rate of 99% is achieved. In order to prevent this, another measure of information (or

randomness) was proposed which measures randomness in the worst case.

Definition 2.2.5 (Min-Entropy) Let X be a random variable with probability distribution

$\mathcal{P}_X$. Then the min-entropy of $X$ is:

$$H_\infty(X) = \min_x \log \frac{1}{\Pr[X = x]} = -\log \max_x \mathcal{P}_X(x).$$

For example, for the random variable $X$ in (2.4), $H_\infty(X) < -\log 0.99 \approx 0.015$ holds, which is a small constant

number compared to the increasing behavior of the Shannon entropy on the same random variable. Therefore, min-entropy is a more secure measure of randomness in a random variable

for cryptographic applications.

From the definition of Shannon entropy and Min-entropy we have:

$$H_\infty(X) \le H(X) \le \log |\mathrm{supp}(X)|.$$
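To illustrate the gap numerically (an illustrative Python sketch, not part of the thesis; function names are made up), the following compares $H(X)$ and $H_\infty(X)$ for the source in (2.4) as $n$ grows: the Shannon entropy grows with $n$, while the min-entropy stays below 0.015 bits.

```python
import math

def shannon_entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def min_entropy(dist):
    return -math.log2(max(dist.values()))

def source_2_4(n):
    """Distribution of X from (2.4): the all-zero n-bit string w.p. 0.99, uniform w.p. 0.01."""
    dist = {x: 0.01 / 2**n for x in range(2**n)}
    dist[0] += 0.99  # the all-zero string absorbs the 0.99 mass
    return dist

for n in (4, 8, 12):
    d = source_2_4(n)
    print(n, round(shannon_entropy(d), 3), round(min_entropy(d), 4))
# H(X) grows roughly like 0.01*n plus a constant, while H_inf(X) stays just under 0.0145 bits.
```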

A more comprehensive study of entropy with the proofs of theorems can be found in

[CT91, CM97a].

19 2.3 Information theoretic security

We start with the classical scenario of symmetric cryptography: A party, named Alice, wants

to send a message X to another party, Bob, over a public channel controlled by an adversary,

Eve, as shown in Figure 2.2. Based on their communication security goal, Alice encodes

the message to Y using an Encoder and sends it over the channel. Bob and Eve receive a

copy of Y and then Bob decodes Y to Z using a Decoder. If Bob’s observations from the

communication are the same as Eve's observations, i.e. Z, and there is no prior information

shared between Alice and Bob, then there is no advantage for Bob over Eve and thus there

is no security. One assumption to circumvent this is that Alice and Bob share some secret

key K prior to the communication which is not shared with Eve. There are other models in

ITS to give parties an advantage over the adversary such as Bounded Storage (or retrieval)

model, noisy channels, etc. Here we just consider the case of a shared randomness between

the parties. Based on the communication goal we may have two main relevant scenarios:

Figure 2.2: Secure Communication in Cryptography. (The message X is encoded to Y using the shared key K, sent over the channel, and decoded back to Z using K.)

Secrecy. In this scenario, Alice wants to send a message to Bob such that Eve cannot learn the message by observing all the communication between them passively, i.e. with no modification of the message. However, Bob must be able

to correctly recover the message with overwhelming probability using their

shared secret key. To accomplish this task, they use an encryption system.

Authentication. In this scenario, Alice wants to send a message to Bob but now Eve can

tamper with the messages sent over the channel. The goal is to protect the

20 communication from Eve’s modification or insertion of messages such that

Bob can detect a fraudulent message. A message authentication code is the

mechanism to achieve this goal.

In this thesis we mainly focus on the first goal, namely secrecy.

2.3.1 Computational versus information theoretic model

The security of the above scenarios can be proved either in computational or information

theoretic models. In computational security, one assumes that, 1) the adversary who wants to

break the security of the system is computationally bounded, and 2) there are hard problems

for which no known probabilistic polynomial time (PPT) algorithm exists. For example, the

problem of finding the factors of a composite number with two prime factors of length k is

considered hard because there is no known PPT algorithm to solve this problem. However

there is not even a single problem in NP that is proved to be computationally infeasible in a

reasonable model of computation. In practice, a hard problem is considered to be a problem that the best known algorithm cannot solve in reasonable time. But there is as yet no proof that the best known algorithm for any such problem is the best possible. Therefore, computational security is based on the unproven assumption that a particular

problem is a hard one.

On the other hand, protocols and constructions that do not rely on computational assumptions

are called “information theoretically” or “unconditionally” secure. Information theoretic

security (ITS) relies on no unproven assumption and does not restrict the computational

power of the adversary. Instead, it uses probability theory to provide mathematical proof

that a scheme is secure.

While ITS provides stronger security, it is not always practical due to the results of

Shannon that we will discuss in this chapter and the next. However the study of ITS is very

important for the following main reasons:

1. The security in ITS is unconditional, i.e. it does not depend on computational complexity problems such as the factoring problem. If it were proved that no hard problem exists in NP, the security of schemes in ITS would still hold.

2. There are settings that are practical in ITS, e.g. secret sharing and multi-party

computation. Studying ITS will help us better formalize the concepts in these

settings, and extend them to computational security.

3. ITS and computational security contribute to each other. Some of the defini-

tions and proofs in ITS are naturally extended to the computational setting.

Therefore, the study of ITS will shed more light on our understanding of

computational security. Also there are information theoretic tools that are

used in computational security, such as randomness extractors.

In this thesis, we only consider information theoretic security where we assume no bound

on the computational power of the adversary. In the following sections, we review the ITS results for secrecy, focusing on the most significant results that are of importance in this thesis.

2.4 Secrecy

In this section, we recall some of the classical results in secrecy. The results come mostly

from the seminal work of Shannon [Sha49], “Communication Theory of Secrecy Systems”. In

secrecy, Alice encrypts a message, called plaintext, with a key from the key space. The result

of encryption is called ciphertext. The ciphertext is sent to Bob via a public channel, and

Eve can observe the ciphertext. Alice’s security goal is to keep the message confidential when

Eve has observed the ciphertext.

Definition 2.4.1 (Encryption system) Let $\mathcal{X}$ be the set of plaintexts that must be encrypted along with a key coming from the set $\mathcal{K}$. Let $\mathcal{Y}$ be the set of all ciphertexts, i.e. the range of the encoder function, called encryption in this context. An (enc, dec)-encryption system is a pair of functions $\mathrm{enc} : \mathcal{X} \times \mathcal{K} \to \mathcal{Y}$ and $\mathrm{dec} : \mathcal{Y} \times \mathcal{K} \to \mathcal{X}$ such that for every $x \in \mathcal{X}$ and $k \in \mathcal{K}$ it holds that $\mathrm{dec}(\mathrm{enc}(x, k), k) = x$, i.e. the decryption has no errors. In this thesis, an encryption system is always deterministic unless otherwise stated, and is defined over message, ciphertext and key spaces $\mathcal{X}, \mathcal{Y}, \mathcal{K}$ respectively, with the corresponding random variables $X, Y, K$ assumed.

An encryption function can also be randomized, where we assume that some extra random

bits are accessible, either locally or publicly.

A realistic assumption is that the adversary may have some prior information about the

plaintext even before observing the ciphertext. For example if the adversary knows that the

plaintext is English text, then this information may help her to better guess the plaintext after

observing the ciphertext. The adversary's prior knowledge about the plaintext can be modeled

as a random variable X over the set of plaintexts X with probability distribution P_X. We also assume that there is a random variable K over the set of keys K with probability distribution P_K. In this model, the random variables X and K are assumed to be independent. We also consider the random variable Y over the ciphertexts Y with probability distribution P_Y that is determined by P_X, P_K and the encryption function enc. The strongest definition of security against an eavesdropping adversary for encryption systems was given by Shannon in [Sha49]. Intuitively speaking, it requires that the ciphertext gives no information whatsoever about the plaintexts. This means that the best an adversary can do to gain information about the plaintexts is to use only her prior information and discard the ciphertext, since by the definition a ciphertext does not give any information about the plaintexts. In the language of probability theory this means that the probability distribution over the plaintexts P_X does not change even given a ciphertext y, i.e. P_X = P_{X|Y=y} for all y ∈ Y. In terms of entropy, this is equivalent to saying that H(X|Y) = H(X). If an encryption system satisfies this property, then we say it is perfectly secure, or it satisfies perfect secrecy:

Definition 2.4.2 (Perfect secrecy) An encryption system provides perfect secrecy if the ciphertext reveals no information about the plaintext, i.e., I(X; Y) = 0.

There are some equivalent definitions of perfect secrecy which may be easier to work with. The

following definition of perfect secrecy says that the probability distribution of the encryption

of any two plaintexts under all keys must be identical, i.e.

∀ x0, x1 ∈ X, ∀ y ∈ Y: P_{Y|X}(y|x0) = P_{Y|X}(y|x1),

which is proved to be equivalent to

∆(enc(x0, K); enc(x1, K)) = 0, ∀ x0, x1 ∈ X.

This formulation of perfect secrecy is called perfect indistinguishability. A proof of equivalence

of perfect secrecy and perfect indistinguishability can be found in [KL07].

Even though perfect secrecy is the strongest notion of security, it can be achieved by a simple

construction, called one-time pad, which is defined as follows. Assume X, K and Y are a finite group G with group operation “+”. Define the encryption function enc to be

enc(x, k) = x + k, ∀ x ∈ X, k ∈ K,

with the keys sampled from the uniform distribution over K = G. The above scheme is called one-time pad.

Theorem 2.4.1 [Sha49] The one-time pad encryption system satisfies perfect secrecy.
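For concreteness, the following is a minimal sketch (in Python, which is not part of the thesis) of the one-time pad instantiated with the group of byte strings under bitwise XOR; the function and variable names are illustrative only.

    import secrets

    def otp_encrypt(message: bytes, key: bytes) -> bytes:
        # Encrypt by XOR-ing the message with a key of the same length.
        assert len(key) == len(message)
        return bytes(m ^ k for m, k in zip(message, key))

    def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
        # XOR is its own inverse, so decryption is the same operation.
        return otp_encrypt(ciphertext, key)

    message = b"attack at dawn"
    key = secrets.token_bytes(len(message))   # uniformly random key, used once
    ciphertext = otp_encrypt(message, key)
    assert otp_decrypt(ciphertext, key) == message

Sampling a fresh uniform key for every message is exactly the requirement that gives the scheme its perfect secrecy.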

For one-time pad to work, we require two conditions which are not practical in real world

security applications:

1. The number of the keys must be equal to the number of the plaintexts.

2. The keys must remain uniformly distributed in the view of the adversary:

(a) the key must be sampled from uniform distribution, and

(b) the key must remain uniform when the ciphertext is disclosed, i.e. the adversary must not gain any information about it.

One might think that there exist encryption systems with perfect secrecy that do not have the above two limitations. However, Shannon proved results which rule out this hope:

Theorem 2.4.2 [Sha49] If an encryption system satisfies perfect secrecy, then we have H(K) ≥ H(X). Consequently, if X is uniformly distributed, then the number of keys must be at least as large as the number of plaintexts, i.e. |K| ≥ |X|.

The above theorem says that we can not have perfect secrecy with key entropy (or key space) smaller than the plaintext entropy (space). The following theorem says that if one uses as many keys as plaintexts, then the key must be sampled from the uniform distribution.

Theorem 2.4.3 [Sha49] (Shannon's theorem) An encryption system with |X| = |Y| = |K| satisfies perfect secrecy if and only if

1. The distribution over keys, P_K, is uniform.

2. For every x ∈ X and every y ∈ Y, there exists a unique key k ∈ K such that enc(x, k) = y.

The results of Shannon motivated a line of research to relax the notion of perfect secrecy

in order to achieve better bounds on the number of keys or amount of randomness needed to

have a secure encryption in ITS. One relaxation was introduced by allowing some negligible

bias in the distribution of ciphertexts.

In the indistinguishability formulation of secrecy, perfect secrecy was equivalent to having

identical distributions over ciphertexts computed on plaintexts with all keys, i.e. enc(x, K) for x ∈ X and random variable K. One relaxation, called ε-indistinguishability, allows a small bias in the distribution of the random variables enc(x, K).

Definition 2.4.3 (ε-indistinguishability) An encryption system satisfies ε-indistinguishability if ∆(enc(x1, K); enc(x2, K)) ≤ ε. Unfortunately this relaxation can not achieve better bounds on the number of keys, as we will discuss in the next chapter.

A more comprehensive introduction to the notion of perfect secrecy and the proofs of the

above theorems can be found in [KL07, Sti06].

2.4.1 Secret sharing

Consider a scenario where a secret is distributed among a group of n parties, each of them receiving a share, such that the secret can be reconstructed if at least t of the parties come together. This method of sharing a secret and reconstructing it under the mentioned condition is called an (n, t)-threshold secret sharing, read as t out of n threshold secret sharing.

From a security point of view, we require that if the number of shares known to the adversary

is less than the threshold t, no information is leaked about the secret.

We need the following definition to formally define secret sharing:

Definition 2.4.4 For an n-tuple N = (a1, a2, . . . , an) and a set Tt = {i1, i2, . . . , it | 1 ≤ i1 < i2 < · · · < it ≤ n}, N[Tt] is defined as the t-tuple from N whose elements are chosen by the indexes from Tt, i.e.

N[Tt] = (a_{i1}, a_{i2}, . . . , a_{it}).

We call Tt a t-tuple set. We denote all possible t-tuple combinations of N by N[t], i.e.

N[t] = {N[Tt] | Tt is a t-tuple set}.

Definition 2.4.5 (Threshold secret sharing) Suppose the secret is an element of the set X and we generate the shares, which are elements of the set Y, using randomness in the set R. An (n, t)-threshold secret sharing scheme is a pair of functions share : X × R → Y^n and rec : Y^t → X such that

∀ r ∈ R, ∀ x ∈ X: rec(s) = x, ∀ s ∈ share(x, r)[t].

Note that the notation share(x, r)[t] comes from Definition 2.4.4 and denotes the set of all

possible t-tuples from share(x, r).

We define perfect or µ-secrecy of secret sharing schemes in the same way as for encryption.

Definition 2.4.6 (Perfect secrecy of secret sharing) Suppose that X and R are random variables over X and R with probability distributions P_X and P_R respectively. Then share(X, R) is a random variable with a probability distribution induced by X and R. A secret sharing scheme provides perfect secrecy if for every (t−1)-tuple set T_{t−1}, I(X; S) = H(X) − H(X|S) = 0 holds for the random variable S = share(X, R)[T_{t−1}], which is equivalent to

∀ x0, x1: ∆(share(x0, R)[T_{t−1}]; share(x1, R)[T_{t−1}]) = 0.

Note that the distribution of S is determined by X,R, the set Tt−1 and the sharing function.

This definition of secrecy states that t−1 of the shares, if known by the adversary, do not leak any information about the secret.

We can define µ-secrecy for secret sharing, the same way as ε-secrecy was defined for encryption, as follows:

Definition 2.4.7 (µ-secrecy of secret sharing) A secret sharing scheme provides µ-secrecy if for every (t−1)-tuple set T_{t−1} the following holds:

∀ x0, x1: ∆(share(x0, R)[T_{t−1}]; share(x1, R)[T_{t−1}]) ≤ µ.

As an example, a (2, 2)-secret sharing scheme is a pair of functions share : X × R → Y^2 and rec : Y^2 → X such that

∀ r ∈ R, ∀ x ∈ X: rec(share(x, r)) = x.

Suppose that (S1, S2) ← share(X, R) where S1, S2 are random variables over Y. Then a (2, 2)-secret sharing provides µ-secrecy if

∀ x0, x1: ∆(share(x0, R)[1]; share(x1, R)[1]) ≤ µ,

where share(x, R)[1] is either of the shares, based on Definition 2.4.4.

A secret sharing scheme uses randomness to distribute the shares from the secret. The

following lower bound was proved by Blundo et al.

Theorem 2.4.4 [BDSV96] For an (n, t)-secret sharing scheme that provides perfect secrecy with the secret from the set X and the randomness required for distributing the shares from R, we have log|R| ≥ (t − 1) log|X|.

2.5 Concluding remarks

This chapter presented background concepts and definitions used throughout the thesis. We

reviewed the basic definitions of secrecy primitives such as encryption and secret sharing, and

the classical results regarding the key size and distribution needed for the primitives were

discussed.

Chapter 3

Randomness requirement of secrecy

This chapter is devoted to a literature review of results in random-

ness requirements for secrecy applications, specifically encryption

and secret sharing. We also present results on secrecy and se-

cret sharing sources and a number of open questions that are

discussed throughout the chapter. For completeness, a short

summary of results in authentication and randomized algorithms

is also included.

In the previous chapter, we discussed Shannon's impossibility results on the size and distribution of randomness needed for perfect secrecy. Shannon's result showed that one needs a “perfectly random” key of “size at least the length” of the message to be able to encrypt it with perfect secrecy. This result motivated many researchers to relax the definition of perfect secrecy to achieve better bounds on the size of the key, or to relax the requirements on the distribution of the secret key needed. In the following section, we first discuss the modeling of random sources, then discuss various relaxations of Shannon's definition of secrecy and investigate their randomness requirements.

3.1 Modeling random sources

In many different applications in computer science, we need a sequence of perfect random bits to run an algorithm, or to securely run a cryptographic protocol. However there may not be a source of perfect randomness and even if it exists, it may not be easily accessible.

Even if we could find a source of perfect randomness in the future, it may leak some information to the adversary during the time it is being used. However, there exist sources that can generate imperfect randomness, such as radioactive decay, thermal noise, the photoelectric

effect or other quantum phenomena. In general, randomness is considered to be an expensive

resource and the assumption that perfectly random bits are easily collectible is not practical.

So the aim of this line of research is to minimize the assumptions on randomness for various

cryptographic primitives or protocols. This is usually achieved by following two goals: one is to minimize the number of random bits needed to achieve the security guarantee of the cryptographic application compared to the input message, and the second is to minimize the assumption on the distribution of the randomness used. The first goal is clear, for example

minimizing the number of random bits needed to achieve secure encryption of n bit messages.

But for the second goal, it is not clear what assumption on the distribution of randomness is

practical.

A line of work, started by Von Neumann [vN51a], aims to find practical assumptions on the randomness that can be generated by real-world RNGs, i.e. a model of physical random sources.

An acceptable model should consider all of the following limitations of the physical random

sources:

1. The output distribution of the physical source can not be proved to be uniform.

2. The output distribution of the physical source is not only non-uniform but also

unknown.

3. The randomness property of the physical source changes over time because of

the physical changes that occur in the device [TC11].

4. The randomness generated by the source may leak during time of use, for

example through side channel attacks.

In a nutshell, the generated randomness might change its unknown behavior over time

and may be partially leaked. Therefore, not only is proving security of a scheme based on the uniform distribution unrealistic, but relying on a single imperfect random distribution is not realistic either. So one must consider a family of distributions

to base the security proofs of schemes on. Thus a reasonable approach is to minimize our

assumption about the properties of the randomness generated by these devices.

Von Neumann modeled physical random sources by considering a bit sequence, each bit

independently sampled from a fixed Bernoulli distribution, i.e. Pr[1] = p, Pr[0] = 1 − p. He

random bit sequence. This works by mapping each two consecutive bits to the first bit if

the bits were different, and rejecting them (output null) if the bits were the same. Since for a Bernoulli sequence Pr[10] = Pr[01] = p(1 − p) holds, he argued that the output would be uniformly distributed. Note that this process always produces uniformly random bits

regardless of the value p.
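A minimal Python sketch of Von Neumann's procedure (ours, for illustration) is the following; the bias 0.8 is an arbitrary example value.

    import random

    def von_neumann_extract(bits):
        # Map each disjoint pair of bits to its first bit if the two differ; drop equal pairs.
        return [b0 for b0, b1 in zip(bits[::2], bits[1::2]) if b0 != b1]

    # Independent but biased source with Pr[1] = 0.8
    biased = [1 if random.random() < 0.8 else 0 for _ in range(100000)]
    unbiased = von_neumann_extract(biased)
    print(sum(unbiased) / len(unbiased))   # close to 0.5, regardless of the bias p

The price is throughput: on average only a fraction 2p(1 − p) of the input pairs produce an output bit.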

The Von Neumann assumption was relaxed by Blum [Blu86] where the random source is

modeled as a finite state Markov chain, i.e. the bits are not assumed to be independently

generated, and may depend on a finite number of previous bits. Blum also proposed an algorithm that transforms such sequences into a uniformly random sequence. Santha and

Vazirani [SV86] further relaxed the above requirement by removing the independence between

bits. They assumed that the source outputs bits whose bias lies within a fixed range [δ, 1 − δ] and each bit may depend on all previous bits from the source. They showed that no process can turn a sequence of bits generated by their model into uniformly random bits (or even ε-close to uniform).

In adversarial scenarios where the adversary can gain information from the output of

random source via side channel attacks, we need to relax the above models even more to

consider adversarial knowledge.

One proposed method for modeling a source of randomness was given by a series of

works [NZ96a, Sip88, NZ96b] where it was suggested that the output of a random source be

represented by a family of distributions rather than just one, such that all distributions in

the family have some specific property.

Definition 3.1.1 (Random source) A random source φ on domain X is a family of distributions, usually with a certain property. We write X ∈ φ when X is a distribution in φ.

Another model is the bit-fixing source, where the output is an n-bit sequence in which t of the bits are perfectly random and independent, and n − t of them are controlled by an adversary. These n − t bits can be fixed by the adversary or can arbitrarily depend on the other bits.

The most general model of all the above is the source with bounded symbol probability, due to Chor and Goldreich [CG88], where we only assume that the probability of every symbol generated by the source is bounded by a threshold.

Definition 3.1.2 ((n, t)-sources, weak random sources) A random source φ on {0, 1}^n is called an (n, t)-source, or simply a t-source, for t ≤ n, if for X ∈ φ, H∞(X) ≥ t, i.e., if ∀x ∈ {0, 1}^n, Pr[X = x] ≤ 2^{−t}. By weak random sources we refer to all t-sources with t < n.

An example of an (n, t)-source is a perfectly random source for which a function of it of length n − t bits is controlled by the adversary. We may use the term t-source if the length of the source is clear from the context.

3.2 Dealing with weak random sources in cryptography

Many of the existing protocols or algorithms are only proved to work with perfect randomness,

that is a sequence of mutually independent and uniformly distributed bits. But practically,

one needs to assume a model of weak random sources and build primitives based on that, or

find the random sources that a primitive can provide security guarantees for. This problem

has been investigated in two paradigms:

Paradigm 1. Provide a “randomness extraction” process that can transform weak random

sources to a perfect one, and then use the output of the process in the existing constructions

that provide security guarantees for a primitive only for perfect random sources. In this

paradigm, the random extraction process can be seen as a black box that transforms a weak

source into a perfect one. Then the perfect source is given as input to the constructions for which the existing proofs of security hold.

Paradigm 2. Construct new algorithms or modify the existing ones in such a way that they

provide the security guarantees of the primitive for weak random sources. In this paradigm,

the existing or new algorithms must be proved to provide the security guarantees of a primitive when a weak random source is the only available source of randomness. In this paradigm, the randomness requirements of a primitive are investigated.

Note that paradigm 1 is a special case of paradigm 2 since any construction for a

randomness extractor can be used to construct secure primitives for weak random sources.

We already know that there are random sources (such as Von Neumann source) that can be

transformed to perfect randomness to be used in the first paradigm. One might ask whether

all weak random sources are applicable in the first paradigm. On the other hand, if no new

algorithm or proof is found that provides the security guarantees of a primitive with no use

of randomness extractors, then we are interested to prove that the only way to construct

a primitive with weak random sources is to first extract from the source and then use a

construction that works for perfect randomness. In other words, paradigm 2 can only be

achieved through paradigm 1 in this case.

3.2.1 Local versus public versus shared randomness

We differentiate among three types of randomness as resources available for cryptography.

Local randomness is a resource generated and used locally by one party, and the other involved

parties can not access this resource. Public randomness is a resource that should be publicly

accessible by all parties involved in communication, i.e. the honest parties and the adversary.

A local randomness can be made public by sending it over the communication channel. Shared randomness is a resource only accessible by parties that intend to communicate securely,

and not by adversary. For example, a secret key is a shared randomness only accessible by

communicating parties, usually assumed to be shared prior to communication.

Shared randomness is considered to be the most expensive randomness resource in cryp-

tography, since it must be generated and shared secretly such that no other entities can

access it. The sharing process happens over a medium that must be secure. For local or

public randomness, there is no secure sharing process that can cause side channel leakage

of the randomness. Although we assume the shared randomness is not accessible by any

parties other than the communicating ones, we might assume that it is leaked partially to an

adversary who is interested in intercepting or modifying the communication.

In this thesis, we are mostly concerned with shared randomness, even though decreasing

the requirements on public or local randomness is also desirable.

3.3 Paradigm 1: Randomness extraction

In this section, we briefly review randomness extraction theory where random sources that

can be transformed into perfect randomness are classified. We discuss why the extraction paradigm may not always be applicable, since not all random sources are deterministically extractable. Let us

start with the formal definition of randomness extractors.

3.3.1 Deterministic extractors

Definition 3.3.1 (Deterministic extractor) For a random source φ over K, an ε-deterministic extractor ext : K → {0, 1}^b is a function such that for all K ∈ φ, ext(K) ≈_ε U_b holds. φ is ε-extractable to b bits if such a function exists.

The Von Neumann source is an example of a 0-extractable source. A simple example of ex-

tractable sources is the following:

Example 3.3.1 Let φ1 be a (2, 1)-source over {0, 1, 2, 3} with 4 distributions uniformly distributed over the subsets {0, 1}, {1, 2}, {2, 3}, {3, 0} respectively, i.e.

φ1 = {U_{0,1}, U_{1,2}, U_{2,3}, U_{3,0}}.

The following function is a 0-extractor ext : {0, 1, 2, 3} → {0, 1} that extracts 1 uniform bit from φ1:

ext(i) = i mod 2.

It can be easily verified that for every K ∈ φ1, ext(K) ≈_0 U_1.
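This claim can be checked mechanically; the short Python sketch below is ours and purely illustrative.

    phi1 = [{0, 1}, {1, 2}, {2, 3}, {3, 0}]    # supports of the four uniform distributions

    def ext(i):
        return i % 2

    for support in phi1:
        # Probability of outputting 1 when the key is uniform over this support.
        p_one = sum(1 for k in support if ext(k) == 1) / len(support)
        assert p_one == 0.5    # the output is exactly uniform, so the bias is 0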

The above source is called a (2, 1)-flat source.

Definition 3.3.2 (Flat source) A flat source over X consists of uniform distributions over subsets of X. If the subsets have size at most 2^t (i.e. at most t bits of randomness), then it is called a flat (n, t)-source.

One might hope that a large number of random sources are ε-extractable, but it turns out that extractable sources are a very small subset of random sources [Sha02]. For example, it can be easily seen that (n, t)-sources are not extractable for any t < n.

Proposition 3.3.1 There exists no deterministic function ext that can extract even a single random bit from (n, t)-sources (even from high min-entropy ones, e.g. t = n − 1).

More formally, consider the goal of designing an extractor for all random variables Y over {0, 1}^n with H∞(Y) ≥ n − 1, that is, a function ext : {0, 1}^n → {0, 1} such that for every such random variable Y, the distribution ext(Y) is uniform over {0, 1}, i.e. ext(Y) outputs 0 or 1 with almost equal probability, as shown in Figure 3.1-(a).

Figure 3.1: Impossibility of randomness extraction from an (n, n − 1)-source: (a) the function ext divides the space {0, 1}^n into two parts; (b) the random variable X is mapped to the constant 1 by ext.

It is easy to see that no such function ext exists. This is because for every such function ext, there exists a bit b such that the set S = {x | ext(x) = b} is of size at least 2^{n−1}. It follows that the random variable X (Figure 3.1-(b)), which is uniformly distributed on S, has H∞(X) ≥ n − 1 and yet ext(X) is fixed to the value b.

3.3.2 Seeded Extractors

Despite the above result, a probabilistic proof [Sip88] shows that if one randomizes the

randomness extractor using a random seed, then it is possible to extract from t-sources. The

use of a seed may seem to be in conflict with our goal of minimizing the need for perfect

randomness, but it turns out that by investing a small random seed of logarithmic length

(compared to the source length), we can extract almost all the randomness in weak random

sources. Such extractors are called seeded extractors, or simply extractors.

However, one might ask how to obtain this small random seed. One answer is that the

logarithmic length of the seed, compared to the randomness it can extract is negligible

and this could help in some applications. For example, Zuckerman [Zuc91] showed that

it is possible to simulate randomized algorithms with weak random sources (rather than

perfect random bits) using seeded extractors. We will discuss their result after reviewing the

definition of extractors.

Definition 3.3.3 [NZ96a] (Extractor) For a random source φ over K of length n, an extractor is a function ext : K × R → {0, 1}^l such that for every random variable K ∈ φ we have ext(K, R) ≈_ε U_l for some l < n, where R is uniform over R. An extractor is called a strong extractor if

(ext(K, R), R) ≈_ε (U_l, R).

The probabilistic method shows that for every (n, t)-source and ε > 0, there exists an extractor with seed length d = log|R| = log(n − t) + 2 log(1/ε) + O(1) and output length l = t + d − 2 log(1/ε) − O(1). This argument was first used by Sipser [Sip88] (for a related function called dispersers), and also appears in [RTS00]. However, in most applications of

extractors it is not sufficient to prove the existence of an extractor and an explicit construction

is required.

Definition 3.3.4 (Extractor for t-sources) An (n, t, l, ε)-extractor is a function that can extract l bits from an (n, t)-source, ε-close to uniform.

Naturally, we want the seed length to be as small as possible and the output length to be as large as possible, by minimizing the entropy loss, that is t + d − l, where d = log|R|. Extractors use a random seed as a catalyst to extract the randomness in a weak random

source. But this may seem to be a circular problem: to extract nearly perfect bits, you need a small amount of random bits. However, these extractors are proved to work for the simulation of BPP algorithms. Such sources are called simulatable sources, i.e. random sources that are sufficient for running BPP algorithms.

Theorem 3.3.1 (Simulating BPP algorithms) [Zuc91, NZ96a] An (n, t)-source can simulate BPP algorithms if t = 2λn for λ > 0.

Zuckerman showed that instead of running a randomized algorithm Alg that uses a random element r of the set {0, 1}^l to produce the output Alg(r), one can simulate the algorithm by using a weak random source K and a strong seeded extractor ext. They constructed a new algorithm Alg′(k) = maj{Alg(ext(k, d)) | for all random seeds d}, k ∈ K, which enumerates over all random seeds, computes the output of the extractor on k for each seed, runs the main algorithm Alg on each resulting string, and takes a majority vote. They proved that the output distribution Alg′(K) is close to Alg(U_l). The important fact in the above is that the seed length is logarithmic, so the number of seeds is polynomial and the simulation does not change the complexity class of the BPP algorithm.
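The structure of this simulation can be sketched in a few lines of Python; the extractor and the decision procedure below are toy placeholders (not real constructions), and only the enumerate-extract-vote pattern is the point.

    from collections import Counter

    def simulate_with_weak_source(alg, ext, k, seed_bits):
        # Run alg on ext(k, d) for every possible seed d and return the majority answer.
        votes = Counter(alg(ext(k, d)) for d in range(2 ** seed_bits))
        return votes.most_common(1)[0][0]

    def toy_ext(k, d):
        # Placeholder mixing of the weak sample k with the seed d (not a real extractor).
        return (k * 2654435761 + d * 40503) & 0xFFFF

    def toy_alg(r):
        # Placeholder randomized decision procedure using r as its coin tosses.
        return (r % 3) != 0

    print(simulate_with_weak_source(toy_alg, toy_ext, k=123456789, seed_bits=8))

Because the seed length is logarithmic in n, the loop over all 2^d seeds is only polynomially long.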

A very good survey on construction of extractors can be found in [Sha02, NTS99].

3.4 Paradigm 2: Constructions using imperfect randomness

In this section, we investigate the paradigm that proves bounds on the key size or the possible key distributions for which secrecy applications are satisfied. We prove results that show the best that can be achieved in terms of key size and distribution for secrecy, and then discuss constructions that can achieve the bounds.

Shannon's result (Theorem 2.4.1) shows that the one-time pad achieves security in the sense of perfect secrecy. However, the one-time pad requires a uniformly distributed key with length

at least the length of the message (Theorem 2.4.3). The mentioned limitations raise two

fundamental questions: First, is there any encryption system with a key length less than that

of the message, and second, is it possible to have a key distribution other than uniform? A

negative answer to these questions implies that one-time pad is always required for a perfectly

secure and correct encryption, or in other words, one-time pad is universal.

3.4.1 Randomness requirements for perfect secrecy

We start by finding the requirements for perfect secrecy in this section. The following theorem was proved as Theorem 2.7 of [KL07] for the case where the message has a uniform distribution. In the following,

the theorem is proved for the general case where messages can be from any distribution (but

assuming deterministic encryption).

Theorem 3.4.1 There is no deterministic encryption scheme (enc, dec) that provides perfect secrecy and correctness with |supp(K)| < |supp(X)|.

Proof. Since the decryption function must be deterministic and the correctness property holds, each ciphertext must be decrypted to only one message under a fixed key, and thus |supp(Y)| ≥ |supp(X)|. For a message x, let Y(x) = {y | ∃k ∈ supp(K), enc_k(x) = y}. Assuming |supp(K)| < |supp(X)|, then for every message x ∈ supp(X), |Y(x)| < |supp(Y)| holds, and so there exists y0 ∈ supp(Y) such that y0 ∉ Y(x). Such a y0 contradicts perfect secrecy, since P_{X|Y}(x|y0) = 0 ≠ P_X(x). □

Shannon's result also showed that for any perfectly correct and secure encryption system, H(K) ≥ H(X), which means that the key must be at least as long as the message when the distribution over the messages is uniform. Also, if |X| = |Y| = |K|, then the only secure encryption system is the one-time pad [KL07].

Therefore, there exists no encryption system that provides perfect secrecy and achieves

a better key length than the one-time pad. However, the question of whether there exists

any encryption system that works with non-uniform distributions remains open, if we allow

keys longer than the message. Simple examples show that not only is this

possible, but also any encryption system that works with uniformly distributed keys can

be turned into an encryption system that also works with a non-uniform distribution, if we

are allowed to have more keys than messages. For example, consider one-time pad over a

one-bit space, where y = x ⊕ k and the 2 keys in the system are selected with probability 1/2 (Table 3.1). Select one of the keys, say 1, and “replace” it with two keys 10 and 01, each with probability 1/4, such that encryption of the message under these keys equals the encryption under 1, i.e. x ⊕ 01 = x ⊕ 10 = x ⊕ 1 (Table 3.2).

Table 3.1: Encryption function with uniform keys

    P_K(k)   k \ m   0   1
    0.5      0       0   1
    0.5      1       1   0

Table 3.2: Encryption table with non-uniform keys

    P_K(k)   k \ m   0   1
    0.5      0       0   1
    0.25     10      1   0
    0.25     01      1   0

It can be easily seen that this new encryption system provides perfect secrecy, and the key distribution is no longer uniform. However,

for all such examples as above, we may argue that they are 0-extractable. For the above

example, the deterministic extractor is ext(10) = ext(01) = 1 and ext(0) = 0. On the

other hand, for any such non-uniform distribution that is deterministically extractable, there

exists an encryption system that provides perfect secrecy. Define the encryption as follows:

first apply the extractor to the key (which will eventually make it uniform), and then apply

one-time pad.

Now we can make the following conjecture: for any encryption system (enc, dec) that provides

perfect secrecy using a random variable K on key space, K is 0-extractable to b uniform

bits and b is at least equal to the length of the message space. To formally state the above

conjecture, let us first define random sources for encryption.

Definition 3.4.1 A source φ is said to “admit” perfect encryption of n bits if there exists an encryption scheme that provides perfect secrecy for n-bit messages for each K ∈ φ.

Any random source φ that is 0-extractable to b bits obviously admits perfect encryption of

b bits. Now let the source φ be one single random variable K, and we make the following

conjecture.

Conjecture 3.4.1 Any random variable K that admits perfect encryption of b bits, is 0- extractable to at least b bits.

The above conjecture can be partially proved if ∩_{k∈K} Y_k ≠ ∅, where Y_k = {y | ∃x ∈ X, y = enc_k(x)} for every k ∈ K.

Proof. For a fixed y0 ∈ ∩_{k∈K} Y_k, let ext(k) = dec_k(y0). This function is well defined since for every k, y0 ∈ Y_k. Since the decryption is deterministic, for each key a single message is the output of the extractor. Moreover, for each message x there exists a key k with x = dec_k(y0), otherwise P_{X|Y}(x|y0) = 0; therefore the image of the decryption of y0 under all keys is the whole message space {0, 1}^n. Now we prove that for every K ∈ φ and every x0, x1 ∈ {0, 1}^n, Pr[ext(K) = x0] = Pr[ext(K) = x1] holds, and this proves the result since the probabilities of all output values are equal.

Pr[ext(K) = x0] = Pr[dec_K(y0) = x0] = Σ_k P_K(k) I_{K_{x0}}(k)
               = P_{Y|X}(y0|x0) = P_{Y|X}(y0|x1)
               = Pr[dec_K(y0) = x1] = Pr[ext(K) = x1],

where K_x = {k | dec_k(y0) = x}, and I_{K_x}(k) = 1 if k ∈ K_x and 0 otherwise. Also note that for a perfectly secure encryption the following holds: for all y, x0, x1, P_{Y|X}(y|x0) = P_{Y|X}(y|x1). □

The above proof is only for a single random variable and the general case does not hold.

Dodis and Bosley [BD07] constructed a random source φ that admits perfect encryption of b bits, but is not even ε-extractable to 1 bit. This source, however, is of exponential size compared to the message length. We will discuss this result later in Section 3.4.7.

3.4.2 Relaxation of perfect secrecy

Many works considered various relaxations of Shannon's definition of perfect secrecy due to its limitations. The

following definitions have been considered in the literature as relaxed versions of perfect secrecy:

SEC_X(ε): (X | Y = y) ≈_ε X, ∀ y ∈ Y

SEC_Y(ε): enc_K(x) ≈_ε Y, ∀ x ∈ X

SEC_XY(ε): (X, Y) ≈_ε X · Y

SEC_ind(ε): enc_K(x0) ≈_ε enc_K(x1), ∀ x0, x1 ∈ X

SEC_I(ε): I(X; Y) ≤ ε

SEC_ent(ε): enc_K(X0) ≈_ε enc_K(X1), ∀ X0, X1

SEC_G(ε): G(X | enc_K(X)) − G(X) ≤ ε

where Y = enc_K(X) is the random variable over ciphertexts, X0, X1 are high min-entropy sources, i.e. m-sources, G(X) = max_x P_X(x) and G(X | Y) = Σ_y P_Y(y) max_x P_{X|Y}(x|y).

Recall that SEC_I(0) is the definition of perfect secrecy. It is known that the definitions SEC_X(0), SEC_Y(0), SEC_XY(0), SEC_ind(0) and SEC_I(0) are all equivalent to perfect secrecy when ε = 0 [KL07]. But this is not true in general for ε > 0. Definition SEC_ent was proposed by

Dodis and Smith [DS05] and will be reviewed later in this chapter. Definition SECG was

proposed in our work and we will investigate its properties in Chapter 4. In the following

section, we give a survey of the relevant results.

3.4.3 Comparison of perfect secrecy relaxations

A summary of relations between secrecy definitions is given in this section. The notation SEC_A(ε) → SEC_B(δ) means that an encryption system that provides secrecy definition A with bias ε also provides secrecy under definition B with bias δ.

[IO11] SEC_X(ε) → SEC_∗(ε), where ∗ means all definitions.

[IO11] SEC_ind(ε) → SEC_Y(ε)

[IO11] SEC_ind(ε) → SEC_XY(ε)

[IO11] SEC_XY(ε) → SEC_ind(2ε)

[Csi96] SEC_I(ε) → SEC_XY(√(2ε ln 2))

[Csi96] SEC_XY(ε) → SEC_I(ε log(|X|/ε))

[Jia13] SEC_I(ε) → SEC_G(√(ε ln 2 / 2))

[Jia13] SEC_G ↛ SEC_I ↛ SEC_X

SEC_ind(ε) → SEC_ent(ε)

SEC_ent ↛ SEC_ind

The definition SEC_X is the strongest definition of all. The adversary is given a ciphertext, and this should not change the prior distribution over the messages by more than ε in statistical distance. The definitions SEC_ind, SEC_XY and SEC_I are almost equivalent, and only differ subtly in their security parameter ε. Therefore we will mostly work with the definition SEC_ind. Note that any impossibility result for SEC_ind is an impossibility result for SEC_X, SEC_XY and SEC_I.

3.4.4 Randomness requirements of indistinguishability

First, we prove that even for SEC_ind(ε) and uniformly distributed keys, we require a key of length at least equal to the length of the message. Moreover, if the key is one bit shorter than the message, then for every message x1 there exist at least 2^{n−2} messages x2 that are at least 1/3-distinguishable from it.

Theorem 3.4.2 For any encryption scheme on n-bit messages using uniformly distributed keys of length r < n and for any message x1, there exist at least 2^n − 3·2^{r−1} messages x2 such that

∆(enc_K(x1); enc_K(x2)) > 1/3.

Proof. Consider the SEC_ind definition where the adversary receives the distribution over ciphertexts for a message pair x1 and x2 with x1 ≠ x2. The advantage of the adversary in distinguishing the two distributions should be small. For i = 1, 2, let Yi = enc_K(xi), and let T = {y | ∃k, y = enc_k(x1)}. Then the following holds:

∆(Y1; Y2) = max_{S⊆Y} |Pr[Y1 ∈ S] − Pr[Y2 ∈ S]| ≥ |Pr[Y1 ∈ T] − Pr[Y2 ∈ T]|.

Note that Pr[Y1 ∈ T] = 1 by the definition of T. Since the key is assumed to be uniformly distributed, we have

Pr[enc_K(x) ∈ T] = Σ_{k∈K} P_K(k) I_T(enc_k(x)) = (Σ_k I_T(enc_k(x))) / |K|,

where I_T is the indicator function, i.e. I_T(y) = 1 if y ∈ T and 0 otherwise.

For a fixed key k, the encryptions of at most 2^r messages lie in T, because |T| ≤ 2^r and, by the correctness property of encryption, two messages must not be encrypted to the same ciphertext under one key. In other words, for every k: Σ_x I_T(enc_k(x)) ≤ 2^r, and therefore Σ_k Σ_x I_T(enc_k(x)) ≤ 2^{2r}. For at most λ messages x it holds that Σ_k I_T(enc_k(x)) ≥ 2^{2r}/λ, and thus for at least 2^n − λ messages x2, Σ_k I_T(enc_k(x2)) < 2^{2r}/λ holds. Therefore for all such messages Pr[enc_K(x2) ∈ T] < 2^r/λ and we have

∆(enc_K(x1); enc_K(x2)) > 1 − 2^r/λ.

Now let λ = 3·2^{r−1} and the result is proved: for every message x1, at least 2^n − 3·2^{r−1} messages are distinguishable from x1 with bias more than 1/3. In other words, for r = n − 1 and any message x1, there exist 2^{n−2} messages x2 such that

∆(enc_K(x1); enc_K(x2)) > 1/3.

With the same argument, let λ = 2^{r+1}; then for at least 2^n − 2^{r+1} messages the adversary can distinguish the messages with at least 1/2 advantage. □

From now on, we will use the terms ε-indistinguishability and ε-secrecy interchangeably.

Dodis [Dod12] proved a related result for the lower bound on the size of the key as follows:

Theorem 3.4.3 [Dod12] For any message distribution X, if an encryption system provides ε-secrecy, then the key space satisfies |K| ≥ 2^{H∞(X)}(1 − ε).

In particular, if the message is uniformly distributed then |K| ≥ |X|(1 − ε). Note that the result of Theorem 3.4.2 was given in terms of the bit lengths of message and key, but the above

theorem is stated in terms of the number of messages and keys.

In brief, the bounds discussed in this section can be summarized in three conditions:

- For uniformly distributed messages: |K| ≥ |X|(1 − ε),

- For uniformly distributed keys: |K| ≥ |X| if ε < 1/2, for ε-secrecy,

- For arbitrary key and message distributions: |K| ≥ 2^{H∞(X)}(1 − ε).

The conclusion of the above results leaves the following question unanswered:

Question 3.4.1 For an arbitrary message distribution, is there any better lower bound on the number of keys needed to achieve ε-secrecy, or is the bound in Theorem 3.4.3 tight? In other words, is there any encryption system that can achieve the lower bound?

3.4.5 Secrecy with weak random sources

As shown in Section 2.4, secrecy depends crucially on randomness to provide good security guarantees. In some cases, as in Theorem 2.4.3, perfect randomness is inherent for secrecy, and even if we relax it to ε-secrecy (or ε-indistinguishability), it still crucially depends on nearly perfect randomness, i.e. sources δ-close to uniform for sufficiently small δ.

On the other hand, all encryption schemes in the literature depend on perfect randomness

somewhere in their constructions. The question is if we can construct an encryption system

that works with imperfect randomness and still it provides security guarantees. In other words, is perfect randomness inherent for -secrecy or the only way to gain -secrecy from

imperfect randomness is to transform it to almost perfect randomness (e.g. using extractors)

and then use one-time pad to achieve -secrecy? If the latter is the only way, then this means

that one-time pad is a universal method of encryption in ITS. In this section, we will review

the results that show one-time pad is essentially universal except if the number of bits to

encrypt is very small relative to the key length. Universality of one-time pad means that

any encryption that provides ε-secrecy is equivalent to the one-time pad. We begin with the

definition of encryption sources, sources that admit encryption:

Definition 3.4.2 (ε-encryption source) Consider an encryption system with messages X, keys K and ciphertexts Y. A random source φ over K “admits” ε-secrecy of n bits if there exists an encryption scheme (enc, dec) such that for all messages x1, x2 ∈ X of length n and all distributions K ∈ φ we have

∆(enc(x1, K); enc(x2, K)) ≤ ε.

The source φ is called an ε-encryption or ε-secrecy source and is said to “admit” ε-encryption of the message. To emphasize the message size n, we may refer to φ as an (n, ε)-encryption source. The term encryption source is used to refer to ε-encryption sources for sufficiently small ε.

For example, a uniformly random source on an n-bit key space is a 0-encryption source for n-bit messages.

46 Example 3.4.1 The source φ1 in Example 3.3.1 admits perfect encryption of 1 bit using the following table:

m/k 0 1 2 3

0 1 0 1 0

1 0 1 0 1
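The claim can be checked mechanically; the short Python sketch below (illustrative, not part of the thesis) verifies that for every distribution in φ1 the ciphertext distribution is the same for both messages.

    from fractions import Fraction

    # enc_table[m][k] is the ciphertext for message m under key k (table above).
    enc_table = {0: {0: 1, 1: 0, 2: 1, 3: 0},
                 1: {0: 0, 1: 1, 2: 0, 3: 1}}

    phi1 = [{0, 1}, {1, 2}, {2, 3}, {3, 0}]   # supports of the uniform distributions

    def cipher_dist(m, support):
        # Distribution of enc(m, K) when K is uniform over the given support.
        dist = {}
        for k in support:
            c = enc_table[m][k]
            dist[c] = dist.get(c, Fraction(0)) + Fraction(1, len(support))
        return dist

    for support in phi1:
        assert cipher_dist(0, support) == cipher_dist(1, support)   # perfect secrecy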

Although in the above examples we only use a flat source, it can be easily proved that if a source admits ε-secrecy, then a convex combination of distributions in that source also admits ε-secrecy.

Definition 3.4.3 Let {Xi}_{i=1...n} be n distributions. X is said to be a convex combination of these distributions if there exist α1, . . . , αn such that P_X(x) = Σ_i αi P_{Xi}(x) and Σ_i αi = 1. Let φ be a source. We define conv(φ) to be the source consisting of all convex combinations of any number of distributions in φ.

Lemma 3.4.1 If a source φ admits ε-secrecy, then conv(φ) admits ε-secrecy.

Proof. It is sufficient to prove that if two distributions A and B allow for ε-encryption of X, then C allows for ε-encryption of X, where C is a random variable with P_C = αP_A + βP_B and α + β = 1.

Let f_i(C) = enc(x_i, C) for i ∈ {0, 1}. Then we have:

∆(enc(x0, C); enc(x1, C)) = Σ_c |Pr[f0(C) = c] − Pr[f1(C) = c]|
    = Σ_c |Pr[C ∈ f0^{−1}(c)] − Pr[C ∈ f1^{−1}(c)]|
    ≤ Σ_c α|Pr[A ∈ f0^{−1}(c)] − Pr[A ∈ f1^{−1}(c)]| + β|Pr[B ∈ f0^{−1}(c)] − Pr[B ∈ f1^{−1}(c)]|
    = α∆(enc(x0, A); enc(x1, A)) + β∆(enc(x0, B); enc(x1, B))
    ≤ (α + β)ε = ε. □

For example, any distribution in conv(φ1) in Example 3.4.1 admits perfect encryption of 1 bit. Set (α0, . . . , α3) = (1/3, 1/3, 1/6, 1/6) or (1/3, 1/3, 1/3, 0) and we get the distributions (1/4, 1/3, 1/4, 1/6) and (1/6, 1/3, 1/3, 1/6) over the key space respectively, both admitting perfect encryption of 1 bit for the table in Example 3.4.1.

For a weak random source (t-source) φ, we know that there exists a set of flat distributions such that their convex combination equals φ, and this fact helps simplify many proofs. It is interesting to show that the same holds for ε-encryption sources.

Conjecture 3.4.2 If φ admits ε-encryption of n bits, then there exists a source φ′ consisting only of flat distributions such that conv(φ) = conv(φ′). This means that any encryption source can be expressed as a convex combination of a set of flat distributions.

In the following sections, the problem of universality of one-time pad is investigated. If

one-time pad is universal, then any source that admits ε-encryption of n bits is extractable

to n bits and thus one-time pad can be applied for the construction of the encryption. But if

not, then one must find weaker sources of randomness that admit encryption. In a nutshell,

the answer depends on the encryption function. If we allow randomized encryption, then in

the following section we show that t-sources admit encryption. The randomization uses local randomness that is transmitted, or public randomness. However, if one requires

deterministic encryption, then one-time pad is universal.

3.4.6 t-sources admit randomized encryption

Encryption sources are defined for deterministic encryption where one random source src

is the only available source. However if we allow randomization, we can simply show that

if the encryption is allowed to access a short public random seed, then any t-source would admit encryption of roughly t-bit messages. Assume that the encryption function uses an (n, t)-random source φ and a uniformly random seed from the set R to encrypt an l-bit message from X. Define the encryption function enc_k(x; r) = (r, ext(k, r) ⊕ x), where r is a random seed and ext is an (n, t, l, ε)-extractor. Then we have

∆((R, ext(K, R) ⊕ x0); (R, ext(K, R) ⊕ x1))
    ≤ ∆((R, ext(K, R) ⊕ x0); (R, U_l ⊕ x0))        (3.1)
      + ∆((R, U_l ⊕ x0); (R, U_l ⊕ x1))             (3.2)
      + ∆((R, U_l ⊕ x1); (R, ext(K, R) ⊕ x1))        (3.3)
    ≤ 2∆((R, U_l); (R, ext(K, R)))                   (3.4)
    ≤ 2ε,                                            (3.5)

where the first inequality is from the triangle inequality, the term (3.2) is 0 by the security

property of one-time pad as an encryption function using uniform keys, inequality (3.4) is a

property of statistical distance and inequality (3.5) is the property of randomness extractors.

Note that the random seed is considered to be public and can be revealed as a part of the

cipher text. The above construction is discussed and generalized in Privacy Amplification

[BBR88]. Nevertheless, we are interested to know if deterministic encryption is possible with weak random sources. The next section discusses this problem.
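Before moving on, here is a minimal Python sketch of this randomized construction (ours, purely illustrative). The extractor is instantiated with a Toeplitz-style universal hash, which is a strong extractor by the leftover hash lemma, assuming the output length l is sufficiently small compared to the min-entropy t of the key source.

    import secrets

    def hash_extract(key_bits, seed_bits, out_len):
        # Universal hashing: output bit i is the inner product (mod 2) of the key
        # with a window of the seed; a strong extractor by the leftover hash lemma.
        n = len(key_bits)
        assert len(seed_bits) == n + out_len - 1
        return [sum(s & k for s, k in zip(seed_bits[i:i + n], key_bits)) % 2
                for i in range(out_len)]

    def enc(x_bits, k_bits):
        # enc_k(x; r) = (r, ext(k, r) XOR x) with a public, uniform seed r.
        l = len(x_bits)
        seed = [secrets.randbits(1) for _ in range(len(k_bits) + l - 1)]
        pad = hash_extract(k_bits, seed, l)
        return seed, [p ^ b for p, b in zip(pad, x_bits)]

    def dec(ciphertext, k_bits):
        seed, masked = ciphertext
        pad = hash_extract(k_bits, seed, len(masked))
        return [p ^ b for p, b in zip(pad, masked)]

    k = [secrets.randbits(1) for _ in range(128)]   # stands in for a weak (n, t)-source sample
    x = [1, 0, 1, 1, 0, 0, 1, 0]
    assert dec(enc(x, k), k) == x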

3.4.7 One-time pad is universal for deterministic encryption

Investigation of sources that admit ε-encryption was started by McInnes and Pinkas [MP91], who showed that secrecy can not be based on the specific random sources of [CG88, SV86], even

if it is restricted to encryption of a one-bit message. But afterward, Dodis and Spencer

[DS02] could construct a particular imperfect source that was not extractable, and they could

securely encrypt a one-bit message with it. Later on, the result of [MP91] was improved by

Dodis et al. [DOPS04] to a larger class of sources. Based on the above results, the main

question to be answered is:

Assume that we can encrypt an n-bit message with a random source. Does this imply that we can extract at least one almost perfect bit deterministically from the source?

Bosley and Dodis [BD07] settled this question positively. They show that if n < log(κ), one can extract n almost random bits from a random source over a κ-bit key space that admits secure encryption of an n-bit plaintext. If n > log(κ), then there exist ε-encryption sources that are not extractable. It is clear that in the case n > log(κ) the key can not be sampled efficiently, since its length is exponential in the size of the plaintext, and thus we will not have

efficient encryption.

Theorem 3.4.4 [BD07] The theorem has two parts:

• ∀ε > 0, if φ is a δ-encryption source over an n-bit space and n > log(κ) + 2 log(1/ε), then φ is (ε + δ)-extractable to an (n − 2 log(1/ε))-bit space.

• For any n ≤ log κ − log log κ − 2, there exists a source φ that is a 0-encryption source over n-bit messages, but is not ε-extractable to even a 1-bit space.

We provide an outline of the proof of the first part from [BD07].

Proof. Let the encryption function be enc : X × K → Y such that ∀K ∈ φ and ∀x1, x2 ∈ X,

∆(enc(x1, K); enc(x2, K)) ≤ ε.    (3.6)

To complete the proof, it is sufficient to show that φ is extractable. So an extractor ext : K → R is needed such that ∀K ∈ φ,

∆(ext(K); U_R) ≤ δ + ε,    (3.7)

holds for δ > 0. The main idea of the proof is to reduce the construction of this extractor to

another extractor ext′ : Y → R for another source φ′ defined by:

φ′ = {enc(U_X, k) | k ∈ K}.    (3.8)

So it is sufficient to prove that if φ′ is δ-extractable to l bits and φ allows for ε-encryption, then φ is (δ + ε)-extractable to l bits. If ext′ is the extractor for φ′, then the following ext would be the extractor for φ:

ext(k) = ext′(enc(1, k)).

To prove this, the following inequalities hold:

∆(ext(K); U_R) = ∆(ext′(enc(1, K)); U_R)                                   (3.9)
    ≤ ∆(ext′(enc(1, K)); ext′(enc(U_X, K))) + ∆(ext′(enc(U_X, K)); U_R)    (3.10)
    ≤ ε + Σ_k p_k · ∆(ext′(enc(U_X, k)); U_R)                               (3.11)
    ≤ ε + Σ_k p_k · δ = ε + δ.                                              (3.12)

The main point of this reduction is that the new source φ′ contains only 2^κ distributions

(D_k = enc(U_X, k)), and each of them is a flat n-source, since for any k ∈ K and x1 ≠ x2, enc(x1, k) ≠ enc(x2, k) holds, and thus the 2^n different values in the encryption output make the source a flat n-source.

The last step of the proof is the following independent lemma:

Lemma 3.4.2 Any family of 2^κ flat n-sources over some set Y is δ-extractable to (n − 2 log(1/δ)) bits, where n > log κ + 2 log(1/δ).



Interpretation: Let δ = 2^{−r}. This lemma says that if the number of such flat distributions, 2^κ, is bounded so that κ < 2^{n−2r}, then the probabilistic method shows there exists a deterministic extractor for this random source that can extract n − 2r bits. Note that the number of all flat n-sources over Y is exactly (|Y| choose 2^n), which is a huge number compared to 2^κ. We know that for such a huge number of flat n-sources, no extractor exists that can extract even 1 bit, but this lemma says that if we bound this number, a random function is a good extractor with overwhelming probability.

and thus one-time pad is essentially universal. In other words, if a random source admits

51 encryption efficiently, then the encryption function can be built from two functions: A

deterministic extractor function that extracts random bits from the key, and the one-time

pad that uses the output of the extractor as the key. The authors also show the following

source

φ = All random variables K encK (x0) 0 encK (x1) for every x0, x1 , { | ≈ ∈ X } which obviously admits 0-encryption, is not even -extractable to 1 bit, when n log κ ≤ − log log κ 2. Note that φ is the largest possible source that admits perfect encryption of n bits. −

In this section, we summarized the main results on whether ε-secrecy is possible with weak random sources. We can conclude that sources that admit ε-encryption are extractable to uniformly random bits if we require deterministic encryption. For randomized encryption, however, a t-source admits secure encryption: a t-source can be extracted to n random bits

encryption, as discussed in Section 3.4.6.

3.5 Randomness requirement for secret sharing

In the previous section, we discussed the limitations of information theoretic encryption and

how impractical it seems for real world applications. Nonetheless, there are certain scenarios in which we can utilize information theoretic encryption in the real world. Suppose that Alice and

Bob want to communicate secretly and they have access to two communication channels.

Eve, willing to learn their communication, can eavesdrop on one of these channels, and can

choose which one to eavesdrop. This is quite practical. For example, Alice and Bob can

communicate over Internet and phone at the same time and Eve has only the capability

to eavesdrop one of them. In this scenario, Alice can sample a key k using a local random

number generator and encrypt the message using the key. Then she can send k in one of the

channels and the encrypted message enck(x) in the other channel. The security properties of

encryption guarantee that if Eve has access to only one of the channels, then the message is as secure as if it were encrypted using a shared key. This is a very special case of the secret sharing defined in Chapter 2, Section 2.4.1.

The main results on the randomness requirements of secret sharing are due to Dodis et al. [DPP06], who compared secret sharing sources and encryption sources. We review and then

extend some of their results. To start, we define random sources for secret sharing.

Definition 3.5.1 (Secret sharing sources) A source φ over R is said to “admit” (m, t)-secret sharing of n bits with µ-secrecy if for any random variable X over the secret space X of length n, and for x0, x1 ← X, there exists a secret sharing scheme (share, rec) such that

∀R ∈ φ, ∀ tuple set T_{t−1}: ∆(share(x0, R)[T_{t−1}]; share(x1, R)[T_{t−1}]) ≤ µ.

The notation follows Definition 2.4.5.

In the first theorem, Dodis et al. showed that a source admitting perfect encryption of n bits, admits (2, 2)-secret sharing of n bits.

Theorem 3.5.1 [DPP06] Any random source φ that admits perfect encryption of n bits, admits perfect (2, 2)-secret sharing of n bits.

The proof of the above theorem is very easy: having an encryption system (enc, dec), let

share2(x, r) ← (s1 = r, s2 = enc(x, r))  and  rec(s1, s2) ← dec(s2, s1).

Theorem 3.5.2 [DPP06] There is a source which allows for perfect (2, 2)-secret sharing of a bit, but does not allow for ε-encryption of a bit for any ε < 1/3, i.e.

∃φ: φ → share2(1)  but  φ ↛ enc_ε(1)  for ε < 1/3.

The following table is an example of a perfectly secure secret sharing table for a source φ that is not encryptable and not extractable (perfectly).

    m \ k   0       1       2       3       4       5
    0       (3,2)   (1,4)   (2,1)   (4,3)   (1,3)   (1,1)
    1       (1,2)   (3,4)   (2,3)   (4,1)   (3,1)   (3,3)

where φ = {U_{0,1}, U_{2,3}, U_{0,2,4}, U_{0,3,5}}.

Theorem 3.5.3 [DPP06] If a random source φ admits (2, 2)-secret sharing of 1 bit, then it admits 1/2-encryption of 1 bit.

To extend the results to more general secret sharing cases, we start with (3, 3) secret

sharing.

Theorem 3.5.4 If a source φ admits ε-encryption of n bits, and φ′ admits δ-encryption of m bits, where m is the length of the image of the encryption function for φ, then (φ, φ′) admits (3, 3)-secret sharing with max(ε, δ)-secrecy.

Proof. Let φ admit ε-encryption using function enc, and φ′ using function enc′. Choose K1 ∈ φ and K2 ∈ φ′, and let enc*(m, k1k2) = enc′(enc(m, k1), k2) for k1 ← K1 and k2 ← K2. We need to define the (3, 3)-secret sharing function share3 using enc and enc′ in a black box way. Define the sharing function to be the following:

share3(m, k1k2) = (k1, k2, enc*(m, k1k2)).

We must prove security for any 2 of the shares, but it is clear that the first two shares k1, k2

do not reveal any information about the message. So it remains to prove the security for the

encryption given one of the keys.

∆(enc*(m0, K1K2); enc*(m1, K1K2) | K2) ≤ Σ_{k2} P_{K2}(k2) ∆(enc*(m0, K1k2); enc*(m1, K1k2))
    ≤ Σ_{k2} P_{K2}(k2) ∆(enc(m0, K1); enc(m1, K1))    (3.13)
    ≤ Σ_{k2} P_{K2}(k2) ε = ε,

54 where the above inequality (3.13) is implied by Lemma 2.2.2.

For the other possible pair of shares the following holds.

∆(enc*(m0, K1K2); enc*(m1, K1K2) | K1) ≤ Σ_{k1} P_{K1}(k1) ∆(enc*(m0, k1K2); enc*(m1, k1K2))
    ≤ Σ_{k1} P_{K1}(k1) ∆(enc′(c0, K2); enc′(c1, K2))
    ≤ Σ_{k1} P_{K1}(k1) δ = δ,

where ci = enc(mi, k1) for i ∈ {0, 1}. □

The above result can be generalized to prove that j independent encryption sources admit

(j, j)-secret sharing, i.e. share_j(n). For the above construction to work, one source generates enough random bits to encrypt an n-bit message, while the other source generates enough random bits to encrypt a t-bit message, where t is the output length of the first encryption and is at least n. However, one might expect that two independent sources φ and φ′ that provide ε-secrecy of n-bit messages provide (3, 3)-secret sharing with ε-secrecy for n-bit messages. Therefore we conjecture the following:

Conjecture 3.5.1 If two independent sources φ and φ′ admit ε-encryption of n bits, then (φ, φ′) admits (3, 3)-secret sharing with ε-secrecy.

Theorem 3.5.5 If a random source φ over an m-bit space admits ε-encryption of n bits, and φ′ admits (2, 2)-secret sharing with δ-secrecy, then (φ, φ′) admits (3, 3)-secret sharing of n bits with (ε + 2δ)-secrecy.

Proof. Let φ admit ε-encryption using function enc, and φ′ admit secret sharing with δ-secrecy using function share2. Choose K1 ∈ φ and K2 ∈ φ′, and define share3(m, k1k2) = (share2(k1, k2), enc(m, k1)) for k1 ← K1 and k2 ← K2. It is enough to show that for K1 ∈ φ and K2 ∈ φ′,

∆(share3(m0, K1K2)[i, 3]; share3(m1, K1K2)[i, 3]) < ε + 2δ,    (3.14)

where i ∈ {1, 2}, since knowing shares 1 and 2 will only reveal the key and nothing about the message. Inequality (3.14) is equivalent to

∆((enc(m0, K1), Si); (enc(m1, K1), Si)) < ε + 2δ,

where Si = share2(K1, K2)[i]. Let Ci = enc(mi, K1). Then:

∆(enc(m0, K1); enc(m1, K1) | Si) = ∆((C0, S1); (C1, S1))    (3.15)
    ≤ Σ_s P_{S1}(s) Σ_c |P_{C0|S1}(c|s) − P_{C1|S1}(c|s)|.

Now the security of secret sharing states that

∀ i ∈ {1, 2} and k10, k11 ∈ K: ∆(share2(k10, K2)[i]; share2(k11, K2)[i]) ≤ δ.

Here we use a stronger definition of security [IO11] (Definition 2 and Section 5) for our proof to work:

∀ s: ∆(K1 | S1 = s; K1) ≤ δ.    (3.16)

For a fixed s, let K1′ = (K1 | S1 = s), i.e. P_{K1′}(k) = P_{K1|S1}(k|s). Using the triangle inequality (twice) for encryption, the following holds:

∆(enc(m0, K1); enc(m1, K1) | S1 = s) = ∆(enc(m0, K1′); enc(m1, K1′))
    ≤ ∆(enc(m0, K1′); enc(m0, K1)) + ∆(enc(m0, K1); enc(m1, K1)) + ∆(enc(m1, K1); enc(m1, K1′))
    ≤ ∆(enc(m0, K1); enc(m1, K1)) + 2∆(K1′; K1)    (3.17)
    ≤ ε + 2δ,

where inequality (3.17) is derived from Lemma 2.2.2. Now continuing from inequality (3.15),

∆((enc(m0, K1), Si); (enc(m1, K1), Si)) = Σ_s Pr[S1 = s] ∆(C0; C1 | S1 = s) ≤ ε + 2δ. □

The definition of security for secret sharing in (3.16) is one of the strongest security definitions, and it is derived from the definition of perfect secrecy. In perfect secrecy, we require that ∀c, Pr[M | C = c] = Pr[M], where C = enc(M, K). This is equivalent to ∀m0, m1, c: Pr[C = c | M = m0] = Pr[C = c | M = m1]. We can relax the above definitions using statistical distance, i.e. ∆(M | C = c; M) ≤ ε for the first definition, and ∆(enc(m0, K); enc(m1, K)) ≤ ε for the second definition. However, these relaxations are not equivalent according to [IO11]. There are examples of encryption systems that are secure for the latter but not the former.

Note that the above theorem is the more general version of Theorem 3.5.4 in the sense

that Theorem 3.5.5 implies 3.5.4. This is because encryption implies 2 out of 2 secret sharing

from Theorem 3.5.1. Also note that in Theorem 3.5.5, we use a stronger definition for

security of secret sharing, and also the indistinguishability parameter is bigger compared to

Theorem 3.5.4. The most general case is when 3 out of 3 secret sharing can be constructed

from only 2 out of 2 secret sharing. So we make the following conjecture:

Conjecture 3.5.2 If φ admits (2, 2)-secret sharing of n bits with -secrecy, and φ0 admits

(2, 2)-secret sharing of t bits with δ-secrecy, then (φ, φ0) admits (3, 3)-secret sharing of n bits with max(, δ)-secrecy, where t is length of the first share of secret sharing function for φ in bits.

A possible construction for the above conjecture is

0 share3(m, k1k2) = share2(m, k1)[1], share2(share2(m, k1)[2]),

but the proof of Theorems 3.5.4 and 3.5.1 does not easily extend to prove the above conjecture.

57 3.6 Authentication sources

In the previous sections, we considered the case that adversary is eavesdropping on the channel,

but does not change any of the messages. However, if the adversary can modify the messages

on the channel, it might be able to fool the other party to accept a fraudulent message without detecting any tampering with the message. This problem was first investigated in

[GMS74].

One might think that one-time pad would work for this purpose but an adversary receiving

the cipher y = x+k can add a value t to the cipher and then the party decrypting the message would recover x+t without detecting any tampering. Based on this argument, secrecy can not

help to protect against tampering by itself and thus, we need another primitive to authenticate

the messages over the channel. To achieve this goal in ITS, a message authentication code

is used. The sender, Alice, computes a tag for the message with a shared key and sends it

along with the message to the receiver, Bob. At Bob’s side, the tag is checked to see if it

corresponds to the message with the shared key. An adversary can attempt to insert a new

message and tag, or replace a correctly authenticated message with a different one and hope

Bob can not detect this malicious behavior. The success probability of the adversary in this

attack is denoted by psucc.

Definition 3.6.1 (MAC) A Message Authentication Code (MAC) is a pair of functions

mac : Σ and ver :( Σ) 0, 1 such that for every key k and every message M×K → M× ×K → { } m , the verification always output 1 for the right message, i.e. ver((m, mac(m, k)), k) = 1, ∈ M and outputs 0 otherwise.

The security of a MAC is measured by the success chance of an adversary in impersonation

and substitution attacks defined as follows:

Definition 3.6.2 (δ-MAC) Suppose , and Σ are set of messages, keys and tags, with M K

probability distributions M , K and Σ respectively. A function mac : Σ is a P P P M × K → 58 δ-MAC for a distribution K on keys if P K

0 0 psucc = P [ver(m , σ , k) = 1 σ = mac(m, k)] δ, | ≤ where m = m0 M and σ0, σ Σ. Note that the probability is calculated over all keys k . 6 ∈ ∈ ∈ K

Note that in the definition we do not have any assumption about the message distribution,

as was the case in perfect secrecy.

1 Let δ = |Σ| . It is easy to verify that psucc can never be smaller than δ. This is because the adversary can always choose a random tag and send it along with a forged message. Then

she hope it will not be detected and her success chance would never be smaller than δ. Thus

δ is the minimum success chance of the adversary. However Stinson [Sti92] proved that a

MAC that achieves this minimum success rate would require a large key space.

Theorem 3.6.1 [Sti92] A 1 -MAC would require ( Σ 1) + 1. |Σ| |K| ≥ |M| | | −

This result shows that perfect authentication even needs more keys compared to perfect

secrecy (Theorem 2.4.2). However we saw that in secrecy, reasonable relaxations could not

help to achieve better key rate. But in authentication, Wegman and Carter [WC81] proved

that we could have authentication with key size logarithmic in size of the message, if we allow

a success chance of 2δ for the adversary.

Theorem 3.6.2 [WC81] There exists a 2 -MAC with key length = O(log ). |Σ| |K| |M|

This results follows our intuition that message authentication should need less randomness

compared to secrecy, since in secrecy we want to hide all functions of the message and only a

negligible information is allowed to be leaked about the message, whereas in authentication,

some information is added to the message to check it has not been altered in the communication.

A comprehensive introduction and proof of theorems can be found in [Sti92, Sti91, Sti86,

WC81].

59 3.6.1 Authentication with t-sources

For authentication, Maurer and Wolf [MW97] proved that universal hash functions are

message authentication codes that are secure with even high min-entropy weak random

n sources, i.e. (n, t)-sources with t > 2 . On the other hand, Dodis and Spencer [DS02] showed n that weak sources with lower min-entropy, i.e. t < 2 , does not suffice for authentication of even a single bit. This result was strengthened by Dodis and Wichs [DW09] that showed

low entropy sources does not suffice for even randomized messages authentication codes. We will summarize these results in this section. We start with the definition of authentication

sources:

Definition 3.6.3 (δ-authentication sources) Consider a MAC with messages from , keys M from and tags from Σ. Let φ be a random source on . Then we say the MAC is a K K

(φ, δ)-authentication code if for all distributions K φ, psucc δ holds. If for a source φ, ∈ ≤ there exists a δ-MAC, then we say φ is an (n, δ)-authentication source, where n = log( ). |X |

Maurer and Wolf proved that authentication is possible with random sources with high

enough min-entropy in the following theorem:

Theorem 3.6.3 [MW97] For every µ > 0 and (n, t)-random source with t ( 1 + µ)n, there ≥ 2 exists a δ-MAC, with δ 2−(µn/2−1) to authenticate an n bit message. ≤

For example, if n = 100 and µ = 0.1, then for a random source with at least 0.6n min-entropy,

there exists a MAC with adversary’s success probability of less than 2−4. The weakest

source that we can use in this example must have at least 0.54n min-entropy such that the

adversary’s success probability becomes less than 1.

However, Theorem 3.6.3 does not rule out the possibility of authentication with lower

min-entropy sources. But Dodis and Spencer [DS02] proved an impossibility result showing

that lower min-entropy sources are not sufficient for authentication. However they did not

60 consider the randomized authentication codes. This result was later improved in [DW09] for

randomized messages authentication sources. Here we bring the latter result:

Theorem 3.6.4 [DW09] If a random source φ on is an (n, δ)-authentication source with K n δ < 1/4, then for all K φ, we must have H∞(K) > . ∈ 2

Note that the above theorem considers the general case where the authentication function

can be randomized, i.e. the mac function uses a public randomness to compute the tag.

Theorems 3.6.4 and 3.6.3, completely characterize the authentication sources: A source φ

n over admits (n, δ)-authentication source if and only if K φ, H∞(K) > . K ∀ ∈ 2 Therefore, authentication source lies strictly between simulatable and extractable sources.

3.7 Comparison of random sources

We are interested to compare random sources for various cryptographic applications to

know which are more demanding for randomness and are easier to construct when perfect

randomness is not available. One can compare extractable, secrecy and authentications

sources to find the relation between the source. Obviously, extractable sources are subset

of secrecy and authentications sources, since extractable sources are the ones that uniform

random bits can be extracted from them and we can apply the output to every cryptographic

primitive that works with almost random bits.

On the other hand, based on Theorem 3.4.4, encryption sources are the same as ex-

tractable sources if one is restricted to efficient encryption. This indicates that almost perfect

randomness is inherent for encryption with -secrecy.

About authentication sources, Theorem 3.6.3 says that we can use any weak random source

for authentication as long as it has high min-entropy of half the length of the message.

61 All sources

Simulatable

Authentication

et Sha r ri ec pta n y b g S r l c e n E Extractable

Perfect random source

Perfect ⊂ Extractable ⊆ Encryption ⊂ Secret Shar- ing ⊂ Authentication ⊂ Simulatable ⊂ All sources

Figure 3.2: Comparison of random sources

3.8 Concluding remarks

The investigation of random sources that admit various primitives in computer science started

back in 1951 by Von Neumann [vN51a] where various modeling of random sources were

proposed, and the most general weak sources were proposed by Goldreich et al. [CG88]. The

first major breakthrough happened in 1991 when Zuckerman [Zuc91] proved BPP algorithms

are simulatable by weak random sources (t-sources).

Classifying random sources for encryption specifically started in 1990 by the work of McInnes

and Pinkas [MP91] where they showed encryption of even 1 bit is impossible with certain

random sources. Finally this case was settled after 14 years (2004) by Bosley and Dodis

[BD07] where they showed general weak sources do not admit efficient encryption. They also

extended their results to computational security. The question of classifying random sources

for secret sharing started with the work of Dodis et al. [DPP06] in 2006 and is yet open for

the general case. We explored possible directions to classify secret sharing sources but the

problem of black-box construction of a primitive based on others is usually a hard one. We

proved some partial results in this regard and the open questions were discussed.

Classification of random sources has also been investigated for other primitives such as

differential privacy, distributed computing, interactive symmetric key protocols under both

62 passive and active adversary settings. Dodis et al. [DLAMV12] investigated this problem for differential privacy, where they showed certain random sources admit construction of differential privacy mechanisms. In distributed computing Goldwasser et al. [GSV05] showed possibility of certain distributed computations, namely Byzantine protocols, using imperfect randomness. The randomness requirements of symmetric key cryptography in interactive setting was initiated by Bennet et al. [BBR88] under the term “Privacy amplification”. In general, privacy amplification is possible with imperfect randomness, but many aspects of it such as minimizing the number of rounds of interaction, adversarial assumption (active, passive), fuzzy randomness, etc. is still investigated.

63 Chapter 4

Guessing secrecy

In this chapter, we propose a new relaxation of secrecy that we

call guessing secrecy. This is a natural definition that requires

that the adversary’s success chance of guessing the plaintext

using his best guessing strategy does not change after seeing

the ciphertext. We investigate its randomness requirements and

show that the key length requirement is the same as perfect

secrecy, but there exists a family of non-uniform distributions

that admits encryption of messages under guessing secrecy.

4.1 Introduction

Consider the classical scenario of symmetric cryptography: Alice wants to send a message X

securely to Bob, over a reliable communication channel that is eavesdropped by Eve who has

unlimited computational power. The goal of secrecy systems is to prevent Eve from learning

the message, given her view of the communication channel. Without giving any advantage to

Alice and Bob, providing secrecy against Eve in the above setting is impossible. In Shannon’s

model of secrecy system [Sha49], Alice and Bob share a secret key K that is unknown to Eve.

Other models assume other types of advantage, such as noisy view of communication channel

[Wyn75], or bounded storage for Eve [Mau92].

In this chapter we follow Shannon’s model and consider the case that Alice and Bob share

a realization of a random variable (key). Shannon’s definition of perfect secrecy requires

that no information about the plaintext be leaked from the ciphertext. As discussed in

Chapters 2 and 3, -secrecy requires a fresh key to be selected for encryption of each message,

64 and the entropy of the key distribution must be at least equal to the entropy of the plaintext.

However, the size of the key space must be at least as large as the size of plaintext space.

These requirements are hard, if not impossible, to satisfy since even if one could tolerate

using a new key for every message, guaranteeing uniformity of the key is a real challenge in

practice. This is because there is no known source of randomness with guaranteed uniform

output, and in almost all cases the output of a randomness source is likely to have biases.

In this chapter, we propose a new relaxation of secrecy that we call perfect guessing

secrecy, or guessing secrecy for short. This is a natural definition that requires that the

adversary’s success chance guessing of the plaintext using his best guessing strategy does not

change after seeing the ciphertext. Unlike perfect secrecy, guessing secrecy does allow some

leakage of information but requires that the best guess of the plaintext remains the same

after seeing the ciphertext. We define guessing secrecy and prove a number of results. We

show that similar to perfect secrecy, in guessing secrecy the size of the key space can not

be less than the size of plaintext space. Moreover, when the two sets are of equal size, one

can find two families of distributions on the plaintext space and key space, such that perfect

guessing secrecy is guaranteed for any pair of distributions, one from each family. In other words, perfect guessing secrecy can be guaranteed with non-uniform keys also. We also show

the relation between perfect secrecy and perfect guessing secrecy. We discuss our results and

propose direction of future research.

4.1.1 Motivation

Guessing secrecy preserves the min-entropy of a random variable by using the randomness

from a shared key. This provides sufficient level of security in some scenarios for example when the communicated random variable is used as the input to an extractor to generate a

uniformly distributed key. Reducing randomness requirement of the shared key (not requiring

the key to be uniformly distributed and/or the key length be the same as the plaintext length)

for guessing secrecy might help us to improve such scenarios in practice. This can be used

65 when biometrics is used as a secret key for example, as discussed in Section 4.5. Biometric

readers can not reproduce the precise copy of the secret every time it measures the biometric.

Therefore, an error correcting piece of information is used to retrieve the exact secret back,

and then an extractor is used to transform the secret into uniform distribution. In Section 4.5, we discuss how guessing secrecy can be applied in such a scenario.

Another motivation for guessing secrecy is to find the minimum requirements for stronger

notions of secrecy such as indistinguishability. This is because any encryption function

satisfying guessing secrecy, also satisfies stronger notions of secrecy such as indistinguishability.

Any requirement for the secret key in guessing secrecy, will directly translate into a requirement

for stronger notions.

4.1.2 Related work

A number of authors considered alternative models to achieve more practical cryptosystems:

-secrecy. One of the first attempts to relax the requirement on the independence of plaintext

and ciphertext distribution, led to the definition of -secrecy that allows small amount of

information leakage about the plaintext after viewing the ciphertext.

A number of papers [MP91, DS02, DOPS04] considered whether -secrecy is possible

if the key is not chosen uniformly at random. Bosley and Dodis [BD07] considered this

problem and proved that for practical key lengths (not exponential in plaintext length) either

encryption is impossible, or the key is deterministically extractable to a uniform key with the

same length as the plaintext. That is there is a deterministic function that takes the key as

input and generates a random string of the size at least equal to the message. The generated

random string can then be used in one-time pad to provide secrecy. In other words, they

proved that any encryption function that provides -secrecy is essentially one-time pad.

Entropic Security. Russel and Wang [RW02] proposed a notion of secrecy, called semantic

66 security, based on semantic security and assumed a bound on the prior knowledge of the

adversary about the plaintexts. With this restriction, they could reduce the length of the

key, depending on the amount of adversary’s prior knowledge about plaintexts. Dodis and

Smith [DS05] extended entropic security and provided simpler constructions that achieved

entropic security. Although compared to perfect secrecy, their notion of secrecy needs smaller

key size, but their scheme still requires the key to be uniformly distributed.

Bounded Storage Model. Another direction proposed was to limit the memory of the compu-

tationally unbounded adversary. Maurer [Mau92] introduced Bounded Storage Model, and

proved that a constant size key can be used to provide unconditional security in this model

[CM97b]. Aumann and Rabin [AR99] defined the notion of everlasting security using the

Bounded Storage Model, and showed that a key can be reused to send an exponential number

of plaintexts [DR02].

In these models the key is either deterministically extractable, or is almost uniformly dis-

tributed. Moreover the assumption of bounded storage is challenged in many real life

scenarios.

4.1.3 Our contribution

We propose a new notion of secrecy that we call guessing secrecy that is similar to Shannon’s

formulation of perfect secrecy, but uses min-entropy and conditional min-entropy instead of

corresponding Shannon entropies. Perfect guessing secrecy, referred to as guessing secrecy

for simplicity, requires that the best chance of the adversary in guessing the plaintext does

not change after viewing the ciphertext. In other words, it requires that the conditional

min-entropy of the plaintext distribution given ciphertext, be equal to the min-entropy of the

plaintext distribution.

We show that similar to perfect secrecy, perfect guessing secrecy requires the size of

the key space to be at least equal to the size of plaintext space. If the sizes are the same

67 however, unlike Shannon secrecy, it may be possible to obtain perfect guessing secrecy using weaker random sources. We show two concrete families of distributions, on the message space

and key space respectively, with the property that perfect guessing secrecy is guaranteed

for any distribution on messages from the former family, together with any distribution on

the keys, from the latter family. Ideally, one would like to have both families to be large:

that is perfect guessing secrecy be obtained for many plaintext distributions, using a large

family of distributions on the keys. We, however, show that for any family of distributions

on the message space that contains the uniform distribution, one must choose the key to

be uniformly random, and the only encryption system that provides guessing secrecy is the

one-time pad. That is the family of distributions on the key space reduces to a single element.

We leave the problem of finding larger families of distributions for message and key space that

satisfy perfect guessing secrecy as an open problem. We also show the relationship between

perfect secrecy and perfect guessing secrecy.

We also investigate the natural relaxation of perfect guessing secrecy, which allows an 

advantage for the adversary for guessing the message, and we call it -guessing secrecy. We

show that this  advantage can help us achieve smaller keys.

4.2 Secrecy based on guessing probability

In this section, we discuss the formal definition of guessing secrecy. We prove that an

encryption scheme with guessing secrecy still needs a key of length at least as large as the

plaintext, but we show that depending on the distribution of the plaintexts, it is possible to

provide guessing secrecy with non-uniform distributions on the key space.

68 Definition 4.2.1 (Guessing probability) The guessing probability of a random variable X with probability distribution X , denoted by G(X), and is given by: P

G(X) = max X (x). x P This is, the success probability of correctly guessing the value of a realization of variable when using the best guessing strategy (guessing the most probable value of the range as the guess).

Guessing probability is related to min-entropy as H∞(X) = log G(X). Min-entropy is − a measure of success chance of guessing X, or in other words, predictability of a random variable by an adversary. It can also be viewed as the worst case entropy compared to

Shannon entropy which is an average entropy.

Definition 4.2.2 (Conditional guessing probability) The conditional guessing probability of X given a random variable Y with a joint probability distribution XY is given by: P X G(X Y ) = Y (y)G(X Y = y), | y P | and measures the average unpredictability of X, averaged over all realization of the random variable Y . Note that H∞(X Y ) = log G(X Y ). | − | The concept of guessing probability is related but not equivalent to guessing entropy

introduced in [Mas94]. Guessing entropy measures the expected number of guesses required

to determine a realization of a random variable, assuming the guessing strategy is by asking

the elements of the set in decreasing order of probabilities, starting from the element with P the highest probability. Guessing entropy of random variable X is defined by ipi where

pi values are probability values in X sorted in decreasing order (i = 1 to X ). Guessing | |

probability however is the probability of a single best guess at X which is equal to p1 and so

guessing probability is always less than the guessing entropy.

Finally we can define guessing secrecy as follows

69 Definition 4.2.3 (Guessing secrecy) Let X be a random variable over plaintexts with X

probability distribution X , and K a random variable over keys with probability distribution P K

K where X and K are independent. An encryption scheme enc : satisfies P X × K → Y weak perfect guessing secrecy for distributions X and K if G(X Y ) = G(X). The scheme P P | satisfies strong perfect guessing secrecy for distributions X and K if for any y , P P ∈ Y G(X Y = y) = G(X) holds. |

Clearly a scheme with strong guessing security satisfies the weak guessing secrecy require-

ment. However the converse is not true in general. From now on, we will use guessing secrecy

to refer to weak perfect guessing secrecy, unless otherwise mentioned (We previously used

guessing secrecy in lieu of perfect guessing secrecy).

We can relax the definition of guessing secrecy as follows:

Definition 4.2.4 An encryption scheme provides -guessing secrecy if G(X Y ) G(X) . | − ≤

Remark. Min-entropy has been commonly used to measure the randomness in a random variable. In this chapter we use min-entropy to measure secrecy. Using guessing probability

instead of min-entropy allows us to remove log and provides a natural way of capturing − security.

4.3 Requirements on the key size

Since guessing secrecy is a weaker notion than perfect secrecy, we hope that a smaller key can

be used to encrypt a message compared to perfect secrecy. However, the next theorem states

that the size of key space must be at least the size of plaintext space, and this is regardless

of plaintext distribution.

Theorem 4.3.1 If an encryption function enc : satisfies guessing secrecy for X × K → Y distributions X and K over plaintexts and keys respectively, then holds. P P |K| ≥ |X | 70 Proof. Assume there is an encryption function that provides guessing secrecy and < . |K| |X |

Let Zx = y Y |X (y x) = 0 . For each x , Zx is non-empty, i.e. Zx > 0. This is { ∈ Y|P | } ∈ X | | because the size of the key space is less than the size of plaintext space, and so, the image of

x under encryption function using all keys will be not be equal to . Let x∗ be an element of Y with the highest probability. Now from the definition of conditional guessing secrecy we X have:

X G(X Y ) = max X (x) Y |X (y x) (4.1) x | y P P | X X = max X (x) Y |X (y x) + max X (x) Y |X (y x), (4.2) x P P | x P P | y∈ /Zx∗ y∈Zx∗ | {z } | {z } S1 S2 ∗ where x = arg maxx X (x). And the first summand satisfies: P

X ∗ ∗ S1 X (x ) Y |X (y x ) (4.3) ≥ P P | y∈ /Zx∗

∗ X ∗ = X (x ) Y |X (y x ) = G(X). (4.4) P y P |

It is easy to verify that S2 > 0 since . Therefore G(X Y ) = S1 + S2 > G(X) which |Y| ≥ |X | | contradicts guessing secrecy. 

In -guessing secrecy where an  advantage in guessing the message is given to the

adversary, we prove the key size can be smaller.

Theorem 4.3.2 Assume an encryption function (enc, dec) on random variables X,Y,K over messages, ciphertexts and keys respectively, that provides -guessing secrecy with a uniformly distributed key such that for every x and y, there is only one key k such that enck(x) = y. Any such encryption function satisfies the following

pmin(X) 1) where pmin(X) = minx X (x). |K| ≥ |X | +pmin(X) P 2)  ( ) 1 G(X), if = . ≤ |X | − |K| |K| |X | |Y| 71 Proof. From the proof of Theorem 4.3.1 if the key is uniformly distributed, and for every

∗ ∗ x and y only one key encrypts x to y, then maxx X (x) Y |X (y x) = X (x ) Y |X (y x ) for P P | P P | ∗ x = arg maxx X (x), and thus S1 = G(X). Hence G(X Y ) G(X) = S2  holds. Because P | − ≤ of the same property on keys, and the fact that (from the correctness property |Y| ≥ |X |

of encryption), we also have Zx∗ X K . This is because at most ciphertexts can | | ≥ | | − | | |K| ∗ ∗ be decrypted to x , and for at least ciphertexts y, it holds that Y |X (y x ) = 0. |X | − |K| P | Hence we have

X pmin(X)  S2 = max X (x) Y |X (y x) ( ) . (4.5) ≥ x P P | ≥ |X | − |K| y∈Zx∗ |K| The inequality (4.5) implies that pmin(X) which proves the first condition. |K| ≥ |X | +pmin(X)

For the second condition, since = we have Zx∗ = X K , and hence |X | |Y| | | | | − | | X G(X)  = S2 max X (x) Y |X (y x) ( ) . (4.6) ≤ x P P | ≤ |X | − |K| y∈Zx∗ |K| which proves the result. 

A direct corollary of the above theorem is that the length of the key can only be one bit

less than the length of the message, if  = pmin.

Corollary 4.3.1 If an encryption function satisfies -guessing secrecy for random variables

|X | X,K on messages and keys respectively, and  = pmin(X), then . |K| ≥ 2

For example, suppose the message space is 0,..., 7 with distribution  1 , 5 , 1 ,..., 1 . { } 2 16 32 32 1 Now let  = 32 and thus we must have an encryption system with 4 keys. The Table 4.1 is 1 an example encryption table achieving 32 -guessing secrecy with 4 keys. From the inequality (4.6) the following corollary holds.

Corollary 4.3.2 If an encryption function satisfies -guessing secrecy for messages of length

n with H∞(X) t, and the key is uniformly distributed over r bit strings, then  ≥ ≤ 2−t(2n−r 1). − 72 1 5 1 1 1 1 1 1 X 2 16 32 32 32 32 32 32 P → K k↓m 0 1 2 3 4 5 6 7 P1 4 0 0 1 2 3 4 5 6 7 1 4 1 1 2 3 0 5 6 7 4 1 4 1 2 3 0 1 6 7 4 5 1 4 1 3 0 1 2 7 4 5 6

1 Table 4.1: Table for 32 -guessing secrecy

For example, for a message with min-entropy n/2 and a key of length 3n/4,  2−n/4 holds ≤ which is considered a desirable level of security. This can be achieved using the encryption

function enck(x) = k + x where x and k are selected from a group G and a subgroup H respectively with + as their addition operation.

If one is given more information about the message distribution, smaller key sizes or

smaller  may be achievable. In this case, the set of messages with smallest probability should

be known. In other words, for  > 0, we need to find the set  = x X (x)  . X { ∈ X | P ≤ }

Corollary 4.3.3 For a given  > 0, if there exists a set  such that for all x  X ⊂ X ∈ X it holds that X (x) , then there exists an encryption system that achieves (b)-guessing P ≤ l |X| m secrecy with a uniformly distributed key from a set of size r =  , where b = . |X \X | r Proof. To prove the result, we construct an encryption system that satisfies (b)-guessing

secrecy given the conditions in the corollary. The construction is as follows. First partition the

message space into subsets 1,..., b+1, each of size r (except maybe the last set which can be X X

smaller), with 1 containing the values with highest probabilities ( 1 = x X (x) > ), X X ∈ X | P

to b+1 containing the lowest (for j > 1, j = x X (x) ), Now for each subset Xi, X X ∈ X | P ≤ define the encryption function to be

−1 enck(x) = g (g(k) + g(x)), x i, k , ∈ X ∈ K

r where g : 0, 1 G is a one-to-one function that corresponds to each element of i or , { } → X K an element in a group G of the same size 2r, and + is the group operation. Note that the

73 encryption function is independently defined over each subset Xi using the same key space K of size r.

Now since we assume keys are uniformly distributed, using the proof of Theorem 4.3.1, the

following holds

X G(X Y ) = max X (x) Y |X (y x) x | y P P | X X = max X (x) Y |X (y x) + max X (x) Y |X (y x) (4.7) x P P | x P P | y∈encK(X1) y∈Y\ encK(X1)

= S1 + S2 G(X) + b, (4.8) ≤ where equation (4.7) is from the definition of conditional guessing probability. Equation

(4.8) follows from the proof of Theorem 4.3.1 and the fact that 1 contains the highest X

probable elements of and thus Zx∗ = encK( 1) = y = enck(x) k , x X1 . Therefore X X { | ∈ K ∈ }

S1 = G(X) holds. The condition on  says that for all elements x i for i = 1, we have X ∈ X 6

X (x) . Therefore P ≤ X S2 max X (x) Y |X (y x) ≤ x∈X \X1 P P | y∈Zx∗ X X = max X (x) Y |X (y x) + + max X (x) Y |X (y x) x∈X2 P P | ··· x∈Xb+1 P P | y∈encK(X2) y∈encK(Xb+1)

 + +  = b. ≤ ···



We call the above construction “partitioned one-time pad”. For an example construction,

consider the message space 0,..., 7 with distribution  1 , 5 , 1 ,..., 1 . Now let  = 1 { } 2 16 32 32 32

and we will have  = 6 and therefore based on the above corollary, we need a key space of |X | 3 size 2 for 32 -guessing secrecy. Table 4.2 illustrates the partitioned one-time pad construction for the above message distribution, which gives encryption of 3 bits by only a 1 bit key.

The above corollary is particularly useful when message distribution have many elements with very small probability, for which the partitioned one-time pad construction would yield

74 1 5 1 1 1 1 1 1 X 2 16 32 32 32 32 32 32 P → K k↓m 0 1 2 3 4 5 6 7 P1 2 0 0 1 2 3 4 5 6 7 1 2 1 1 0 3 2 5 4 7 6

3 Table 4.2: Partitioned one-time pad for 32 -guessing secrecy

an encryption function with the smallest number of keys. For example if one wants to encrypt

an English word from all possible words with 3 letters, and the message is chosen based on its

frequency in English texts, then it is very likely that the message is “the” or “you” compared

to very unlikely words such as “qis” or “yip”. For such a message distribution, the above

construction can achieve a smaller key comparable in size to the size of the set with only

high probable elements.

4.4 Requirements on the key distribution

The following theorem states that for any encryption system that provides guessing secrecy,

the min-entropy of the key distribution must be at least the min-entropy of the plaintext

distribution. This is very similar to the result of Shannon for perfect secrecy where the

Shannon entropy of the key distribution must be at least the Shannon entropy of the plaintext

distribution.

Theorem 4.4.1 If an encryption function enc : satisfies guessing secrecy, then X × K → Y

G(K) G(X) or H∞(K) H∞(X). For -guessing secrecy the guessing probability of the ≤ ≥ key satisfies G(K) G(X) + . ≤

75 Proof. First see that from the definition of conditional guessing probability it holds that

X G(X Y ) = Y (y) max X|Y (x y) x | y P P | X = max X (x) Y |X (y x) (4.9) x y P P | X X = max X (x) K (k) Pr(enc(x, k) = y) (4.10) x {P P } y k X X = max X (x) K (k) , (4.11) x {P P } y k∈Kx,y where Kx,y = k enc(x, k) = y . Note that Pr(enc(x, k) = y) is an indicator function, i.e. { | } its value is zero or 1, so the last equality holds. Considering the encryption function as a

table with columns representing plaintexts and rows representing keys, then we take the last

summation (4.11) only in one row of the table, namely the row corresponding to the key with

highest probability, i.e. k∗. Then continuing from the last equality we have:

X ∗ G(X Y ) X (x) K (k ) (4.12) | ≥ x P P

= max K (k). (4.13) k P

Finally from the definition of guessing secrecy G(X) = G(X Y ) G(K) holds. | ≥

The above proof can simply be extended for -guessing secrecy. From the fact that

G(X Y ) G(K) and G(X Y ) G(X) , we have G(K) G(X) + . | ≥ | − ≤ ≤ 

We can define guessing secrecy for a family of distributions over keys as defined for perfect

secrecy in the following way:

Definition 4.4.1 (Guessing secrecy source) An encryption system provides guessing secrecy

for a random source φ over the key space if for all distributions in the source, it satisfies guessing secrecy, i.e.

K φ, G(X encK (X)) = G(X). ∀ ∈ | 76 A source that satisfies the above is called a guessing secrecy source.

For example, let φt be a t-source, i.e. a family of distributions that have min-entropy at

least t. Then we say an encryption scheme satisfies guessing secrecy for the random source

φt, if it satisfies guessing secrecy for all distributions in the family, i.e. all distributions that

have min-entropy at least t.

We can also require secrecy for a family of distributions over plaintext space. For example,

the definition of secrecy in [RW02], requires security only if the plaintext is sampled from

a family of distributions that have sufficiently high min entropy. Here to compare Perfect

secrecy and guessing secrecy, we define the notion of secrecy for a family of distributions over

the plaintext space. We use the following notations and abbreviations.

Definition 4.4.2 (Random sources on messages) An encryption function enc satisfies φM-

GS (or PS) if enc satisfies the definition of guessing (or perfect) secrecy for a family of distributions φM over the plaintext space.

We are interested in finding families of distributions over the key and the plaintext spaces

such that for any pair of distributions, one from each families, guessing secrecy is guaranteed.

Restricting the message distribution to any distribution from a t-source φt, we can prove a

theorem similar to Shannon’s theorem for perfect secrecy as follows:

Corollary 4.4.1 If = = , an encryption function enc satisfies φt-GS for a t- |X | |Y| |K| source φt, if and only if,

(1) x, y, kx,y = 1; ∀ 1 (2) the distribution over K is uniform, that is K (k) = . P |K| P P Proof. enc has guessing secrecy if and only if maxx X (x) K (k) = maxx X (x). y {P k∈Kx,y P } P P P Now consider the uniform distribution over and maxx K (k) = 1 holds. On X y { k∈Kx,y P } 77 the other hand, from Theorem 4.4.1, when is uniformly distributed, the key must be X uniformly distributed also.

Now for uniform distribution over keys, if there exists x0, y0 such that kx0,y0 > 1, then,

X X max K (k) > max K (k) = 1, x { P } |Y| k P y k∈Kx,y which is a contradiction. 

The above corollary implies that any encryption system that provides guessing secrecy for

a family of distributions over plaintexts that contains the uniform distribution, is essentially

the one-time pad and this implies guessing secrecy is equivalent to perfect secrecy in these

cases. However, the question of whether there exists an encryption function providing guessing

secrecy for a family of distribution over keys remains open when a family of distributions is

considered over plaintexts that do no contain uniform distribution.

4.4.1 Guessing secrecy with imperfect randomness

In this section, we investigate whether guessing secrecy is possible when keys that are not

uniformly distributed. This question is particularly interesting when the key distribution is

from a weak random source. Although we cannot give a direct answer to this question, we

can show that guessing secrecy is possible with non-uniform keys if plaintexts are coming

from certain distributions.

We need the following definition to state the main theorem of this section:

Definition 4.4.3 For a random variable X with probability distribution X , let 2 be the P P

probability of the second highest probable value of X. Note that 2(X) may be equal to G(X). P Let G(X) S(X) = , 2(X) P

78 and

X (x0) U(X) = max P . x0,x1 X (x1) P where x0, x1 and x0 can be equal to x1. U(X) is actually the highest probability of X ∈ X P divided by the minimum probability of X . This can be used to measure the uniformity of a P distribution and was used in other works in different context (as an example see [CK78]).

For a set of distributions over plaintexts such that the first and second highest probabilities

are “far” from each other, we show that there exists encryption schemes with guessing secrecy

if the distribution over keys are such that the maximum and minimum probabilities are

“close”.

n Theorem 4.4.2 For m > 1, let φX be a random source over = 0, 1 such that X X { } ∈

φX ,S(X) m. For a family of distributions φ over keys defined as φ = K U(K) m , ≥ { | ≤ } there exists an encryption function enc : that provides φM-GS such that X × K → Y = = . |X | |Y| |K|

∗ ∗ Proof. Let x be a value in with the highest probability, i.e. X (x ) = G(X). Then X P U(K) m S(X) implies that for all y ≤ ≤ ∈ Y

∗ 0 0 X (x ) K (k) X (x ) K (k ), P P ≥ P P

for all x0 and k0 , where k is the key that encrypts x∗ to y. Thus it holds that ∈ X ∈ K

max X (x) K (k) = G(X) K (k). x {P P } P

79 Now for an encryption function f such that x , y : kx,y = 1, we have: ∀ ∈ X ∈ Y X X G(X Y ) = max X (x) K (k) (4.14) | x {P P } y k∈Kx,y X = max X (x) K (k) f(x, k) = y (4.15) x y {P P | }

X ∗ = G(X) K (k) f(x , k) = y (4.16) y {P | }

= G(X). (4.17)

The last equality is because by summing over values of y, K (k) will take all values of K P P which sum to one. 

In the above theorem, U(K) is a measure of closeness of K to uniform distribution. For

example if m = 1, then the key must be chosen uniformly at random. As m grows larger, the

key distribution further deviates from uniform distribution in terms of statistical distance.

This implies that the farther the second highest probability in X is from G(X), the farther P

G(K) can get from the lowest probability of K and the more non-uniform distributions can P be used for guessing secrecy, which is desirable for our purpose.

4.4.2 Relation with perfect secrecy

We first consider the following notions of secrecy: Guessing Secrecy (GS), φt-GS, Perfect-

Secrecy (PS) and all-Perfect-Secrecy (all-PS). Definitions of these notions follow from previous

definitions and the notations. With all-PS we mean perfect secrecy for family of all distribu-

tions over plaintexts.

The following relations follow from the definitions of these notions:

all-PS = φt-GS ⇒

⇓ ⇓ PS = GS, ⇒

80 where a b means if a function has property a, then it will also have property b. Based ⇒ on theorem 2.7 in [KL07] and Theorem 4.3.1, for all these notions of security, we must have

. |K| ≥ |X | (2) all-PS φt-GS ←

(1) 9 ↑ (3) (4) PS 8 GS, where a b means a implies b under certain conditions. → (1) If = = , then PS all-PS ([KL07] Theorem 2.8). |X | |Y| |K| →

(2) If = = , then φt-GS all-PS (Theorem 4.4.1). |X | |Y| |K| → (3, 4) There exists an encryption scheme providing guessing secrecy for a family of distributions

over plaintexts and keys but it does not provide φt-GS (Theorem 4.4.2).

Points 3 and 4, are the main advantages of guessing secrecy over perfect secrecy (and

-secrecy) in terms of randomness requirements.

4.5 Applications

Cryptography is often based on uniformly distributed secret keys such that an exact copy

of is available to the involved parties. But in reality based on our discussions in previous

chapters, neither uniform distribution assumption for secret keys is practical, nor that an

exact copy is always available. In this section, we consider secret keys that are not precisely

reproducible for the parties. In other words, the parties have access to a secret key that are

related but not exactly reproducible. An example of a secret key that is neither uniform

nor reproducible is biometrics such as fingerprint or iris scan. A reader device is usually

used to measure these biometrics and rarely produces the same results in two readings W

and W 0, even though the readings are likely to be close with regard to a distance function D.

In practice, a party has W stored on a server for example, and the other party has a fresh

reading of the secret W 0 such that D(W, W 0) d, and they want to agree on a secret key that ≤

81 is the precisely identical and uniformly distributed. Sending W 0 over the channel is neither

efficient nor secure since the channel is assumed to be eavesdropped by the adversary, and viewing W 0 will leak much information about W . A more efficient solution [DORS08] is to

send a function of W 0, denoted by SS(W 0), such that W 0 can be reconstructed from W and

SS(W 0). This function is basically part of an error correcting code that with the help from

W can retrieve W 0 back. But since the shared secret must be uniformly random (even for

computational security), and W 0 is not uniformly distributed and even if uniform, a function

SS(W 0) of it is revealed, then one needs to use W 0 to agree on a uniformly random key. An

extractor can be used to do this so that the parties agree on an identical secret key that is

uniformly distributed. To use the extractor, we need a guarantee on the min-entropy of W 0

0 0  0 0 which is measured by H∞ W SS(W ) = H∞(W ) log SS(W ) . One problem with the | − | | above is that reusing the biometric for several time will leak more and more information since

SS(W 0) leaks information about both W 0 and the original secret W . The naive solution is to

encrypt the SS(W 0) such that it reveals no more information about W, W 0. But since in this

scenario, we only need to retain the min-entropy of W 0 for the extractor to work, stronger

notions of secrecy is not required. Guessing secrecy provides the sufficient security to keep the

min-entropy of W 0 high. Therefore using an encryption function enc that provides guessing

0 0 0  0 secrecy, one can encrypt SS(W ) such that H∞ W encK (SS(W ) = H∞(W ). The main | advantage in this scenario is that the key variable K must satisfy the properties for guessing

secrecy and not stronger notions of secrecy.

4.6 Bounds on conditional min-entropy

In this section, we improve a bound on conditional min-entropy (or equivalently conditional

guessing probability) and prove conditions for equality in the bounds.

A bound on conditional min-entropy was proved in [DORS08] which was widely used afterward

in different applications of min-entropy:

82 Theorem 4.6.1 [DORS08] For any two random variables X and Y with joint probability distribution XY , we have: P

H∞(X Y ) H∞(X) log Y . (4.18) | ≥ − | |

Note that the above theorem can be written in terms of guessing probability.

G(X Y ) G(X). (4.19) | ≤ |Y|

The above result is a special case of Theorem 4.6.3 which we prove later in this chapter.

In the rest of this chapter, we will use inequalities on guessing probability for our proofs to

make them easier to follow. Apparently any upper bound on the guessing probability can be

translated to a lower bound on min-entropy.

The following theorem states that to have equality in 4.19 and 4.6.1, the size of must |Y| be at least the same as the number of x X that takes the maximum probability: ∈

Theorem 4.6.2 For random variables X and Y with a joint probability distribution XY , P we have G(X Y ) = G(X) if and only if y , x max such that Y |X (y, x) = 1, | |Y| ∀ ∈ Y ∃ ∈ X P where max = x X (x) = G(X) . X { ∈ X |P }

Proof. Suppose that G(X Y ) = G(X). Then from the equation (4.9) and (4.19), it holds | |Y| that X X max X (x) Y |X (y, x) = max X (x), x x y {P P } y P which is equivalent to having

X   max X (x) Y |X (y, x) max X (x) = 0. (4.20) x x y {P P } − P

But since x, y : Y |X (y, x) 1, each term in the summation is positive and hence the sum ∀ P ≤ is equal to zero if and only if each term is equal to zero. So we must have:

y , max X (x) Y |X (y, x) = max X (x). (4.21) ∀ ∈ Y x {P P } x P

83 Now if y , there exists x max such that Y |X (y, x) = 1, then it is obvious that the ∀ ∈ Y ∈ X P equality (4.21) holds, and thus the equality (4.20) holds.

For the only if part, for X (x) Y |X (y, x) to be maximized, x must be in max since if not, {P P } X

then maxx X (x) Y |X (y, x) < maxx X (x). Now if there exists y such that for every {P P } P ∈ Y

x max, we have Y |X (y, x) < 1, then maxx X (x) Y |X (y, x) < maxx X (x) which is a ∈ X P {P P } P contradiction and this completes the proof. 

Corollary 4.6.1 For two random variables X and Y with joint probability distribution XY , P if G(X Y ) = G(X) then max . | |Y| |X | ≥ |Y|

Proof. From Theorem 4.6.2, y , there exists x max such that Y |X (y, x) = 1. Hence ∀ ∈ Y ∈ X P

max holds. |X | ≥ |Y| 

In the next theorem we prove an improved tight upper bound on conditional guessing

probability. This bound is equal to the bound on equation (4.19) if Y is a one-to-one function

on a subset of max. But if max < , then the new bound would be a better estimation X |X | |Y| for conditional guessing probability.

The theorem states that G(X Y ) is bounded by the sum of largest values of X (x) for | |Y| P x . ∈ X

Theorem 4.6.3 For random variables X and Y with joint probability distribution XY , P

G(X Y ) is at most the sum of the largest values of X (x), i.e. | |Y| P X G(X Y ) max X (x), | ≤ S⊆X P x∈S where S = . | | |Y|

Proof. Fix the distribution of X in the rest of the proof. Then we must prove the inequality

for every correlated random variable Y . Let Y be the random variable that maximizes

84 G(X Y ). If for all y Y, X (x) Y |X (y, x) is maximized only when Y |X (y, x) = 1, then | ∈ P P P from Theorem 4.6.2, Y is a one-to-one function of a subset of X of size m with maximum

probabilities and this completes the proof.

Now assume (by contradiction) that there exists a y0 such that Y |X (y0, x0) = 1, for a x0 that P 6 P maximizes X (x0) Y |X (y0, x0). Since Y |X (y, x0) = 1 then there must be other values P P y P 0 of y such that Y |X (y, x0) = 1. Now consider another random variable Y equal to Y with P 6

the only difference that all values of y with Y |X (y, x0) = 1 except y0, are assigned to ∈ Y P 6 0 another value of X not in the first m ones. Then for Y , Y |X (y0, x0) = 1 holds. Now for the P new random variable Y 0 it holds that X X max X (x) Y |X (y, x) = max X (x) Y |X (y, x) + X (x0) Y |X (y0, x0), x {P P } x {P P } P P y y6=y0 that is strictly greater than the value of sum for Y which is a contradiction. 

4.7 Concluding remarks

We proposed a new definition of secrecy that provides sufficient security guarantee in some

realistic application scenario and matches the intuition that for a good secrecy system it

should be hard to “guess” the plaintext. Although our current results might not provide a

direct practical application, but our initial results highlights an important aspect of guessing

secrecy that is not present in other known relaxations of secrecy: it is possible to use non-

uniform keys to provide perfect guessing secrecy. In all other definitions of secrecy, keys must

be uniformly selected. We showed families of distributions on messages and keys that provide

perfect guessing secrecy. Finding encryption systems that provide perfect guessing secrecy

for larger families of distributions, is an interesting open question. We also investigated

the randomness requirements of -guessing secrecy defined as G(X Y ) G(X) , and | − ≤ showed an efficient construction that can achieve this notion with keys of size smaller than

the message.

85 Chapter 5

Information Theoretic Security of Sequential High

Entropy Messages

In this chapter we discuss multiple message security in infor-

mation theoretic security. The natural extension of security from

one message to multiple messages (λ messages), requires λ times

the amount of key required for one-message secure encryption.

We present a relaxation of the notion for a sequence of messages,

where the security of the last one is more important than pre-

vious messages. This has applications in location privacy and

encryption of health records.

5.1 Introduction

In earlier chapters, we considered secrecy of one message in information theoretic setting.

An encryption system that has provable properties in this setting, when used for encryption

of multiple messages, needs a fresh key for each message. In this chapter we consider the

scenario that a sequence of messages with unknown length, must be secured. Obviously one

can use a one-message secure encryption system, and generate independent keys for each

message. This will guarantee that all messages stay with high security. The requirement for

independent keys is from the fact that encryption of past messages leak information about

the future messages if keys are correlated. Our goal is to relax the security requirement of

message sequences to reduce key size of the system. We introduce a two-level security system

in which the last message is perfectly secure, while all the past messages are “entropically”

86 secure

Security for multiple messages is investigated in computational security, for example in

section 3.4.3 of [KL07] where it is shown that a computationally secure encryption function

using a single key can be securely employed for encryption of multiple messages. For encryption

in information theoretic security however, the only work to the best of our knowledge that

discusses more than one message security is by Kawachi et al. [KPT11], where they relate

2-message security with non-malleability property of encryption.

Russel and Wang [RW02] introduced the notion of information theoretic entropic security where the ciphertext does not leak any predicate of the plaintext, if the plaintext is guaranteed

to have min-entropy t. Dodis and Smith [DS05] improved entropic security by requiring

that no function of the plaintext is leaked. They showed that this definition is equivalent to

indistinguishability of two random variables W1,W2 with min-entropy t. The main advantage

of this relaxation of Shannon’s security is that the key length required to provide entropic

security can be less than the length of the plaintext. Dodis and Smith showed that for

plaintext of length n with min-entropy t, at least a key of length n t is needed to provide − entropic security.

5.1.1 Our contribution

In this chapter, we consider a scenario where the value of a message to the communicant

depends on its position in a message sequence. In particular, the last message needs to be

perfectly secure but the message before the last, has a lower security requirement and their

corresponding ciphertexts leak some information. A scenario where such a requirement makes

sense is outlined in Section 5.1.2.

We require all past messages remain secure using the notion of entropic security, while the

last message be protected using indistinguishability notion, the strongest notion of security.

This mixed notions allows us to reduce the required key for ` messages, each of length n

87 and min entropy t, from `n to roughly n + (` 1)t. Providing security in this model is − straightforward, if ` is known beforehand: the first `-1 messages are encrypted using an

encryption system with entropic security, and the `-th message with one-time pad. The

challenge is to provide this property for a stream of messages, always ensuring perfect security

for the very last message. We formalize security for this scenario and propose a construction with provable security. In our construction, the first message is encrypted using one-time

pad. For all the following messages once the ciphertext is published, we will show that

the remaining min-entropy in the key is t, and so extracting this randomness will allow

indistinguishability security of the next message to be achieved with roughly n t new random − bits. In other words, by reusing the randomness in the key one can reduce the number of new

random bits. The main security Theorems are 5.2.3 and 5.2.4 showing that the min-entropy

of the key given previous ciphertexts is at least t, and the extracted randomness from the

key along with n t fresh random bits provides indistinguishability of the last message while − guaranteeing the entropic security for past messages encrypted in the system.

5.1.2 Applications

Our paradigm has a number of applications, for example in location privacy where one needs

to hide its location. The user wants to protect its privacy from an adversary who observes

the locations or collect the locations when using a location based service. One method is to

obfuscate the location, for example by providing a fake location to all requests for locations which may contradict the reason behind location based services. In location specifically,

although secrecy of past locations is important, it is particularly important to have the

current location hidden- this is also because of the safety of the individual. Therefore in our

setting, the user shares secret keys with the trusted location based services and only those will have access to the location. Other observers will not be able to derive user location,

especially the last location.

Another scenario is encryption of health records. Keeping the health records of a patient is

88 important in general, but the latest records are usually the most important ones that are

needed to be kept secure. In last message security, the security of the last message is always

guaranteed, but the system may leak about the previous messages, a leakage that is bounded

though.

5.1.3 Entropic security

Russel and Wang [RW02] considered the case that messages have high entropy. This assump-

tion rules out the possibility that the messages are highly predictable by the adversary. This

is quite practical in many scenarios, where the adversary has side information about the

message, but high entropy is left in the message, e.g. adversary knows that the message is

English text. The authors considered a security definition similar to semantic security [GM82],

that hides all predicates of the message, if they have high entropy and in that case, they

showed that the key length can be less than the message. Dodis and Smith [DS05] extended

their result by improving the definition of security to all functions (compared to predicates)

of messages and named it entropic security. They also provided simpler constructions and

concrete bounds on the size of the key.

Dodis and Smith defined entropic security for a probabilistic map Y () that hides all

functions of X with leakage .

Definition 5.1.1 A probabilistic map Y () hides all functions of X with leakage , if for every adversary A, there exists an adversary A∗ such that for all functions f : 0, 1 ∗, X → { }

Pr[A(Y (X)) = f(X)] Pr[A∗() = f(X)] . | − | ≤

This means that the probability of the adversary predicting f(X) is roughly the same with

Y (X) as it is without Y (X). If Y () hides all functions of X with leakage  for all t-sources

X, Y () is said to be (t, )-entropically secure.

The authors also showed the equivalence of this definition to an indistinguishability definition

as follows:

89 Definition 5.1.2 The probabilistic map Y () is (t, )-indistinguishable if for all pairs of

t-sources X1 and X2, Y (X1)  Y (X2) holds. ≈

Note the difference between the definition of indistinguishability and entropic indistinguisha-

bility: in the latter instead of two realizations x1 and x2 of the random variable X, two

random variables X1 and X2 that are t-sources are used.

Theorem 5.1.1 [DS05] Let Y be a randomized map with inputs of length n. Then

1. (t, ε)-entropic security for predicates implies (t − 1, 4ε)-indistinguishability.

2. (t − 2, ε)-indistinguishability implies (t, ε/8)-entropic security for all functions when t ≥ 2 log(1/ε) + 1.

The above theorem shows the equivalence of entropic security and entropic indistinguishability. This equivalence implies that all output distributions of Y(X) are ε-close to U_n for a t-source X, so an (n, t, n, ε)-extractor is (t, 2ε)-indistinguishable. For the encryption to provide the perfect correctness property, the extractor needs to be invertible. The equivalence of entropic security and extractors immediately gives a bound on the key size.

Proposition 5.1.1 [DS05] Any encryption scheme (Y(X) = enc_K(X)) that provides (t, ε)-entropic security requires a key of length at least n − t for messages of length n.

Using expander graphs and random hashing constructions, the authors could construct an efficient (t, ε)-entropically secure encryption with key length equal to n − t + 2 log(1/ε) + 2.

Definition 5.1.3 A function ext : {0,1}^n × {0,1}^d → {0,1}^l is an (n, t, l, ε)-extractor if for all (n, t)-sources (W | Y), we have (ext(W, S), S, Y) ≈_ε (U_l, S, Y), where S is uniform on {0,1}^d.

Note that for concrete constructions, we need efficient randomness extractors.

5.2 Encryption of multiple sequential messages

In information theoretic security, notions of secrecy for encryption are usually defined only

for one message. It is often assumed that encryption of more than one message requires a

fresh key that is chosen independently of the previously used key. Certain attacks such as

frequency analysis can be used to derive information about the encrypted message, if two

correlated keys are used to encrypt two messages. In this section, we first define multiple

message security based on the indistinguishability of messages and show how the size of the

key affects the security of messages, when the key is uniformly distributed. We then propose

a relaxation of this definition under which a smaller number of random bits is required for

multiple message security. We show that assuming that high min-entropy messages are

encrypted, one can choose the keys for future messages dependent on the previously used keys

and still achieve ε-secrecy of the last message, as well as entropic security of past messages.

We are interested in extending the security of an encryption function that works for one message to multiple message security, and in showing how correlated keys can be used to provide security.

In the first definition, we consider the security of all messages when the encryption of them

is revealed. In the following definition, we use a natural extension of -indistinguishability for

multiple messages, where the security of all messages is required when their encryption is

revealed.

Definition 5.2.1 A λ-message encryption scheme (kg^λ, enc^λ, dec^λ), constructed from an encryption system (kg, enc, dec) with message, ciphertext, and key spaces denoted by X, Y and K respectively, is defined as follows:

• The message, ciphertext, and key spaces of the λ-message system are X'^λ, Y'^λ and K'^λ, where X' = X ∪ {⊥}, Y' = Y ∪ {⊥} and K' = K ∪ {⊥}. Here ⊥ is a special symbol denoting an empty message, empty ciphertext or empty key, and is used to obtain (kg, enc, dec) from (kg^λ, enc^λ, dec^λ).

• enc^λ_{k_1,...,k_λ}(x_1, ..., x_λ) := (enc_{k_1}(x_1), ..., enc_{k_λ}(x_λ)), for (x_1, ..., x_λ) ∈ X'^λ.

• kg^λ, the key generation algorithm, is a probabilistic algorithm, possibly based on the description of kg, that takes as input a security parameter 1^b and an auxiliary input s, and outputs λ keys of length b, (k_1, ..., k_λ) = kg^λ(1^b, s).

Note that the special symbol ⊥ is not considered when counting the number of messages or keys. It is a notation used to get enc from enc^λ.

For x^0 = (x^0_1, ..., x^0_λ), x^1 = (x^1_1, ..., x^1_λ) ∈ X'^λ, the notation x^0 ≢ x^1 means that for all x^0_i ≠ ⊥ and x^1_i ≠ ⊥, we have x^0_i ≠ x^1_i. In other words, two such tuples are not considered equal if they differ even in one element, i.e. the ith element for i ∈ [λ].

Definition 5.2.2 A λ-message encryption scheme (kg^λ, enc^λ, dec^λ) provides λ-message ε-indistinguishable security (SEC^λ_ind(ε)) if there exist jointly distributed random variables K = (K_1, ..., K_λ), generated by kg^λ, such that all x^0 ≢ x^1 satisfy

enc^λ_K(x^0_1, ..., x^0_λ) ≈_ε enc^λ_K(x^1_1, ..., x^1_λ).

Similar to one-message security, the correctness property should also hold, i.e.

dec^λ_K(enc^λ_K(x_1, ..., x_λ)) = (x_1, ..., x_λ).

The above generalization of SEC_ind is a natural requirement for multiple message security, where we require that the joint distributions of the encryptions of two message tuples are no more than ε-distinguishable. Equivalently, in the game view of the definition between a challenger and an adversary, the challenger samples k = (k_1, ..., k_λ) ← K and then the adversary chooses two message tuples x^0 and x^1. The challenger then selects b ← U_1 and sends back enc_k(x^b) to the adversary. The adversary guesses the bit b and sends a number c ∈ {0, 1} back to the challenger. The adversary is successful if b = c. In the ideal case, we require that the success probability of the adversary in this game be at most 1/2. However, it can easily be seen that a success probability of 1/2 + ε for the adversary is equivalent to 2ε-indistinguishability of messages.

Let us now investigate how one-message and λ-message security are related, as λ-message indistinguishability is a natural extension of one-message security.

Theorem 5.2.1 For an encryption scheme (kg, enc, dec) that provides SEC_ind(ε), the corresponding (kg^λ, enc^λ, dec^λ) provides SEC^λ_ind(λε) if kg^λ invokes the kg function λ times independently to generate λ independent keys (with the same distribution as in SEC_ind).

Proof. We prove the theorem for λ = 2 and the general case follows. Using the triangle inequality, for every pair of tuples (x_1, w_1) and (x_2, w_2) the following holds:

∆(enc^2_K(x_1, w_1); enc^2_K(x_2, w_2)) ≤ ∆(enc^2_K(x_1, w_1); enc^2_K(x_2, w_1)) + ∆(enc^2_K(x_2, w_1); enc^2_K(x_2, w_2))
  ≤ ∆(enc_{K_1}(x_1); enc_{K_1}(x_2) | enc_{K_2}(w_1)) + ∆(enc_{K_2}(w_1); enc_{K_2}(w_2) | enc_{K_1}(x_2))
  = 2ε.

The last equality holds only when K_1 and K_2 are independent random variables such that for all x_1, x_2 ← X, enc_{K_i}(x_1) ≈_ε enc_{K_i}(x_2) for i = 1, 2. □

tuples is the probability that it can distinguish between any two message pairs, i.e. x1 and

x2. Thus the advantage would be the sum of all advantages for individual messages.

For example, one-time pad provides two-message security, if two keys are sampled inde-

pendently from a uniform distribution. Assuming that the keys are uniformly distributed,

it is easy to show that the length of each key must be at least equal to the length of the

message.

Theorem 5.2.2 For any SEC^2_ind encryption scheme on n-bit messages using two uniformly distributed and independent keys of length r, and for any message pair (x_1, w_1) with cipher distribution (Y_1, Z_1), there exist at least 2^{2n} − 3·2^{2r−1} pairs (x_2, w_2) with cipher distribution (Y_2, Z_2) such that

∆((Y_1, Z_1); (Y_2, Z_2)) > 1/3,

and at least 2^{2n} − 2^{2r+1} message pairs with at least 1/2 distinguishability.

Proof. See Section 5.4.5. □

For example, for any encryption system that provides SEC^2_ind with independent and uniformly distributed keys of length r = n − 1, for each message tuple x^0 there exist at least 2^{2n−1} message tuples x^1 that are distinguishable with more than 1/3 advantage. Therefore ε-indistinguishability requires the total length of the keys to be at least 2 times the length of the message.

5.2.1 Relaxing λ-message security

We consider a new model where the messages are encrypted one by one in a sequence, and

the security of the last message is of utmost importance. In other words, revealing the

encryption of new messages may reveal information about past messages. In this model, we

index messages based on their order of encryption, i.e. x1, x2, . . . , xλ where xi is encrypted

before x_j if i < j. This model is particularly useful when the security of a message has an expiration time, and after that time is due, the security of the message is no longer as important.

For example, in location privacy models, the last location of the user is the most important and previous locations are not as important. Medical records of patients may also match this model if the last record is the most important one to keep secure. Note that in this model, past messages will not be revealed to the adversary, but future messages may leak

information about them. This leakage is not arbitrary and it can be bounded in terms of

entropic security. Moreover, we assume a certain min-entropy for the messages in our model.

In the following definition, we relax λ-message security based on the above intuition.

Definition 5.2.3 An encryption scheme (kg^λ, enc^λ, dec^λ) provides SEC^{λ*}_ind(ε_1, ε_2) if for independent t-sources X_1, ..., X_λ, for all i ∈ [λ], and for all x^i_1 ≠ x^i_2, there exist jointly distributed random variables K = (K_1, ..., K_λ) such that the following conditions are independently satisfied:

1) Last message ε_1-secrecy: for i ∈ [λ],

enc^λ_K(X_1, X_2, ..., x^i_1) ≈_{ε_1} enc^λ_K(X_1, X_2, ..., x^i_2).

2) Past message ε_2-entropic security: for any t-source X'_1,

enc^λ_K(X_1, X_2, ..., X_λ) ≈_{ε_2} enc^λ_K(X'_1, X_2, ..., X_λ).

Past message security effectively requires entropic security for the first message when en-

cryption of all future messages is given. This is because entropic security of the ith message

given the future ones is implied by the entropic security of the first message given future

messages.

In Definition 5.2.1, we assumed that adversary receives encryption of a tuple of messages

(regardless of message distribution), and should distinguish between them. In Definition 5.2.3, we provide the information about the encryption of past (or future) messages to the adversary,

but we assume that the adversary has uncertainty about those messages, measured in terms

of min-entropy. In the game view of the definition, the challenger chooses k ← K and the adversary chooses an index i ∈ [λ] and two messages x^i_1, x^i_2, and sends them to the challenger. The challenger then chooses b ← U_{1,2} and sends back enc^λ_K(X_1, X_2, ..., X_{i−1}, x^i_b), where X_1, X_2, ..., X_{i−1} are random variables with min-entropy t in the view of the adversary. Here the adversary receives enc_{K_i}(x^i_b) along with the extra information about the encryption of the past i − 1 messages enc_{K_1,K_2,...,K_{i−1}}(X_1, ..., X_{i−1}). The assumption that X_j, j ∈ [λ], is of min-entropy t is equivalent to the adversary knowing n − t bits of the message being encrypted, or equivalently, to the message having been chosen from a space of 2^t messages out of all 2^n messages.

5.2.2 Min-entropy vs. ε-indistinguishability

One might ask whether ε- or perfect indistinguishability of messages can guarantee a certain

min-entropy in the message, but this does not hold (the reader may skip this section if

the distinction between the two concepts is clear). There is no guarantee that a certain

min-entropy is left in the message even if perfect indistinguishability holds for messages given

the encryption from the adversary’s point of view. An -indistinguishable encryption system

provides security regardless of the message distribution X, and consequently regardless

of min-entropy of the messages. In other words, an encryption scheme that provides ε-indistinguishability implies that

∀ x_0, x_1 ← X: ∆(enc_K(x_0); enc_K(x_1)) ≤ ε,

and yet H∞(X) can be an arbitrary value. This means that even if the adversary has side

information that there are only two possibilities for the message being encrypted, namely

H∞(X) = 1, it should not be possible to distinguish between the distribution of their

encryption. When we assume that the message has a certain min-entropy t, it means that the adversary's side information is limited to a function of length n − t of the message being encrypted. This is quite a practical assumption in many scenarios where the

messages being encrypted are neither random, nor completely predictable. For example, the

adversary only knows that the message is English text, and this gives the adversary some

information, not enough to predict the message from only a small set of messages.

5.2.3 Encryption of uniformly random messages

We start our construction for multiple message security by considering uniformly and independently distributed messages, i.e. X_1, ..., X_λ are uniformly distributed and mutually independent. The construction and proofs for this case provide the intuition for the case that messages have a certain min-entropy and thus are not uniformly random. In this section we argue that a single key can be used to encrypt all messages when the messages in the sequence satisfy the mentioned property. To keep the argument simple, we explain it for the encryption of 3 messages X_1, X_2, X_3, where the messages are uniformly distributed and independent over an n-bit space X, and the proof for the general case is given in Section 5.4.4. For i ∈ [3], define

Y_i = enc_K(X_i) = K ⊕ X_i, for a uniformly distributed K over the key space K.

In the above, we use the same key to encrypt all three messages. It is simple to prove λ-message security in terms of SEC^{3*}_ind for this encryption system, even though we reuse the key for all three encryptions. To prove last message security, we show that for all x_3 ≠ x'_3,

∆(x_3 ⊕ K; x'_3 ⊕ K | Y_1, Y_2) = 0,

which is implied if for all x_3, y_1, y_2, y_3, A = Pr[x_3 ⊕ K = y_3 | Y_1 = y_1, Y_2 = y_2] = 2^{−n} holds. The implication is from the definition of statistical distance. But since A = Pr[K = x_3 ⊕ y_3 | X_1 ⊕ K = y_1, X_2 ⊕ K = y_2] and K is independent of the uniformly distributed X_1, X_2, the result follows. Intuitively speaking, each uniformly distributed message X_i encrypts the key K using a one-time pad, and thus K will remain uniformly distributed in the view of an adversary even given prior encrypted messages. Past message security is also implied from the proof of last message security using the same techniques. The full proof is in Section 5.4.4.
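A minimal sketch of this idea in Python, assuming the messages really are uniform and independent: a single uniform key is XORed with every message, and each ciphertext still decrypts correctly. Sizes and names are illustrative only.

```python
import secrets

N = 128                      # message/key length in bits (illustrative)
key = secrets.randbits(N)    # single uniform key, reused for all messages

def enc(x):                  # Y_i = K xor X_i
    return key ^ x

def dec(y):
    return key ^ y

# three uniform, independent messages encrypted under the same key
msgs = [secrets.randbits(N) for _ in range(3)]
cts = [enc(x) for x in msgs]
assert [dec(y) for y in cts] == msgs
```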

One can use entropic security to encrypt each message. To do so, a key of length at least logarithmic in the message length is required to encrypt truly random messages. So for the encryption of λ messages of length n, at least λ log n bits of key are required. Here we encrypt λ random messages using a key of length n, independent of λ. Thus for any λ where λ log n > n, the above scheme has an advantage over entropic security in terms of key length.
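As a rough illustration of this comparison (assuming the logarithmic key length is about log_2 n bits per message): for messages of length n = 1024 bits, the entropic-security approach needs about λ log n = 10λ key bits, so for any λ ≥ 103 the single reused 1024-bit key is already shorter.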

5.2.4 Using correlated keys for min-entropy messages

In this section, we state our main result on how to use correlated keys to achieve SEC^{λ*}_ind as in Definition 5.2.3. The relaxation of ε-indistinguishability of λ messages in Definition 5.2.3

encryption system that provides perfect secrecy for one message and then discuss how the

key generation algorithm can be modified to reuse keys for the future encryption of messages

in a sequence.

The construction and proofs are an extension of the case for uniformly random messages

discussed in Section 5.2.3, when considering messages with min-entropy t. In this case, we

prove that there will be left-over randomness in the key K given the encryption of all previously encrypted messages. In other words, the adversary has uncertainty about the key

even given the past ciphertexts.

Our main observation is that this left-over randomness can be used to encrypt future messages

securely. However, the drawback is that security of the past messages is reduced, when

adversary sees the future ciphertexts. To use this leftover randomness, we use a randomness extractor to recycle it.

Definition 5.2.4 Let ext : {0,1}^n → {0,1}^ℓ be a polynomial-time probabilistic function that uses r bits of randomness. We say that ext is an (n, t, ℓ, ε)-strong average-case extractor if for all random variables K over {0,1}^n with a joint distribution (K, Y) such that H_∞(K | Y) ≥ t, the following holds:

∆(ext(K, S); U_ℓ | Y, S) ≤ ε,

where S is the uniform distribution over {0,1}^r.

We use the extractor constructed based on the leftover hash lemma [ILL89] to achieve the

n all random variables K over 0, 1 with a joint distribution (K,Y ) such that H∞(K Y ) t, { } | ≥ the following holds.  ∆ ext(K,S); U` Y,S , | ≤ where S is the uniform distribution over 0, 1 r. { } We use the extractor constructed based on the leftover hash lemma [ILL89] to achieve the

constructive parameters of an extractor. The choice of this extractor is only for simplicity

and not efficiency in terms of the length of the random seed. For efficiency, it can be replaced with an extractor that needs a logarithmic seed length.

Proposition 5.2.1 [ILL89, DORS08] There exists an (n, t, ℓ, ε)-strong average-case extractor with ℓ = t − 2 log(1/ε) + 2. In particular, for universal hash functions, ε satisfies ε = (1/2)·√(2^{−H_∞(K)} · 2^ℓ).
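As a worked instance of these parameters (the numbers are chosen only for illustration): with n = 256-bit keys, a conditional min-entropy guarantee of t = 200 and ε = 2^{−32}, the proposition gives

ℓ = t − 2 log(1/ε) + 2 = 200 − 64 + 2 = 138

almost-uniform bits that can be recycled from the key.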

As for the construction, we also need a modified version of the one-time pad, which we call the randomized one-time pad.

Definition 5.2.5 For a message x and a key k in a finite field F, the randomized one-time pad is defined as enc_k(x, r) = h_r(k) ⊕ x, where r ∈ F is fresh (public) randomness used to randomize the one-time pad, and {h_r} is a family of XOR hash functions [DS05] indexed by r. One example of such a family is h_r(k) = rk with multiplication in the field F. We will simply use enc_k(x, r) = rk ⊕ x for the rest of this chapter.

Note that the randomized one-time pad provides the same security as the one-time pad but uses extra randomness. This extra randomness is essential for past message security, as outlined in Theorem 5.2.4.
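A toy sketch of the randomized one-time pad over the small field GF(2^8), i.e. for a one-byte message (purely illustrative; an actual instantiation would work over GF(2^n) for the full message length):

```python
import secrets

def gf256_mul(a, b):
    """Multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return p

def enc(k, x, r):
    """Randomized one-time pad: enc_k(x, r) = h_r(k) xor x with h_r(k) = r*k."""
    return gf256_mul(r, k) ^ x

def dec(k, y, r):
    return gf256_mul(r, k) ^ y

k = secrets.randbits(8)   # secret key
r = secrets.randbits(8)   # fresh public randomness
x = 0xA5
y = enc(k, x, r)
assert dec(k, y, r) == x
```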

A summary of the construction and the assumptions is given in the following diagram (reusing the key when messages have high min-entropy):

Alice:
  k_1 ← U_n
  y_1 = enc_{k_1}(x_1); send y_1 to Bob
  assume H_∞(X_1) ≥ t_1, and let l_1 = t_1 − 2 log(1/ε) + 2
  k'_2 ← U_{n−l_1}, s_1 ← U_{O(log n)}
  k_2 = ext(k_1, s_1) || k'_2
  y_2 = enc_{k_2}(x_2); send y_2, s_1 to Bob
  assume H_∞(X_2) ≥ t_2, ...

Bob:
  x_1 = dec_{k_1}(y_1)
  k_2 = ext(k_1, s_1) || k'_2
  x_2 = dec_{k_2}(y_2)

The construction: We assume that Alice and Bob have access to a source of shared randomness, and want to encrypt a sequence of messages coming from a source with min-entropy t. Alice encrypts an n-bit message x_1 by sampling a uniform key k_1 ← K_1 = U_n and using an encryption scheme (enc, dec) that provides perfect security, such as the one-time pad. We use a randomized version of the one-time pad to achieve past message security, i.e. y_1 = enc_{k_1}(x_1, r_1) = x_1 ⊕ r_1 k_1. Then y_1, r_1 is sent over the insecure channel to Bob. Bob can decrypt the ciphertext using the decryption function dec_{k_1}(y_1) = r_1 k_1 ⊕ y_1. So far, we do not assume a certain min-entropy in the message, and the security of the encrypted x_1 is in terms of ε-indistinguishability. However, to encrypt a second message, we assume that the min-entropy of the first message is estimated given the adversary's view. At this point, the adversary has the ciphertext and possibly some side information about x_1. Assuming that the security of the first message is no longer important, Alice extracts the leftover randomness in K_1 using a strong average-case extractor that outputs ℓ bits, assuming that the message x_1 is sampled from a t-source. Then Alice samples another key k'_2 from U_{n−ℓ} and appends this key to the extracted key, i.e. k_2 = ext(k_1, s_1) || k'_2. The new key is then used to encrypt a second message x_2. This approach is followed up to the ith message x_i. Here the leftover randomness in K_{i−1} is extracted using an extractor that outputs ℓ bits. Alice then samples a fresh key of length n − ℓ and appends it to the output of the extractor, i.e. k_i = ext(k_{i−1}, s_{i−1}) || U_{n−ℓ}.
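The sketch below puts these steps together for two messages, under illustrative choices: n = 128, the extractor is the truncated universal hash ext(k, s) = low ℓ bits of s·k with multiplication in GF(2^128) (the GCM modulus x^128 + x^7 + x^2 + x + 1 is used only as a convenient irreducible polynomial), and encryption is the randomized one-time pad rk ⊕ x. This is not the thesis code, only a plausible reading of the construction.

```python
import secrets

N = 128
MOD = (1 << 128) | 0x87          # x^128 + x^7 + x^2 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Carry-less multiplication followed by reduction modulo MOD (GF(2^128))."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    for i in range(r.bit_length() - 1, N - 1, -1):   # reduce degrees >= 128
        if (r >> i) & 1:
            r ^= MOD << (i - N)
    return r

def enc(k, x, r):
    """Randomized one-time pad: r*k xor x."""
    return gf_mul(r, k) ^ x

def ext(k, s, ell):
    """Strong-extractor sketch: keep the low ell bits of the universal hash s*k."""
    return gf_mul(s, k) & ((1 << ell) - 1)

def next_key(k_prev, s, ell):
    """k_i = ext(k_{i-1}, s_{i-1}) || (n - ell) fresh uniform bits."""
    fresh = secrets.randbits(N - ell)
    return (ext(k_prev, s, ell) << (N - ell)) | fresh

# Encrypt x1 with a uniform key, then recycle leftover randomness for x2.
t, eps_log = 100, 32                     # assumed min-entropy of x1 and -log2(epsilon)
ell = t - 2 * eps_log + 2                # l = t - 2 log(1/eps) + 2 (Proposition 5.2.1)
k1 = secrets.randbits(N)
r1, x1 = secrets.randbits(N), secrets.randbits(N)   # x1 stands in for a t-source message
y1 = enc(k1, x1, r1)

s1 = secrets.randbits(N)                 # public extractor seed
k2 = next_key(k1, s1, ell)               # fresh part would come from shared randomness
r2, x2 = secrets.randbits(N), secrets.randbits(N)
y2 = enc(k2, x2, r2)

# decryption checks
assert gf_mul(r1, k1) ^ y1 == x1
assert gf_mul(r2, k2) ^ y2 == x2
```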

To prove that our construction achieves Definition 5.2.3, we need to prove the following

steps:

1. Min-entropy of the last key Ki given all past ciphertexts is greater than t.

2. The new key K_{i+1} = ext(K_i, S_i) || U_{n−ℓ} provides ε-secrecy of the last message according to Definition 5.2.3.

3. The past messages are still entropically secure given all subsequent ciphertexts,

as stated in Definition 5.2.3.

100 Remark 5.2.1 We assume all the messages in the sequence have min-entropy t for simplicity in proofs. But the proofs can be modified to work for different values of t for each message.

To state the main theorems of this section, we need a number of definitions and lemmas

that can help us in proofs.

Definition 5.2.6 A random variable K has δ-smooth min-entropy t if H^δ_∞(K) = max_{L ≈_δ K} H_∞(L) ≥ t, where the maximum is taken over all variables that are δ-close to K. Conditional smooth min-entropy is also defined in the same manner:

H^δ_∞(K | Y) = max_{L ≈_δ K} H_∞(L | Y).

Obviously H^δ_∞(K) ≥ H_∞(K).

It is easy to argue that an extractor that can extract min-entropy of a random variable with

min-entropy t, can also extract the smooth min-entropy of the random variable.

Lemma 5.2.1 An (n, t, ℓ, ε)-strong average-case extractor satisfies

∆(ext(K, S); U_ℓ | Y, S) ≤ ε + δ

for smooth min-entropy, namely when H^δ_∞(K | Y) ≥ t.

Lemma 5.2.2 For random variables X, Y and a family of functions f_r : Y → Z,

H_∞(X | f_R(Y)) ≥ H_∞(X) − log max_r |supp(f_r(Y))|

holds.

Lemma 5.2.3 For jointly distributed random variables X, W, Z over {0,1}^n, if H_∞(X | Z) = n, then H_∞(X | W, Z) = H_∞(X | W). Consequently, if for δ > 0, H^δ_∞(X | Z) = n, then H^δ_∞(X | W, Z) = H^δ_∞(X | W).

Proofs of the above lemmas are given in Section 5.4.

101 In the following theorem, we prove that the min-entropy of the key Ki given all previous

encrypted messages Yi,...,Y1 is at least equal to the min-entropy of the ith message Xi. The

proof of the theorem follows the same intuition as the proof in Section 5.4.4, where we proved that reusing the key in the encryption of multiple messages provides SEC^*_ind. In contrast, here we need to deal with non-uniform messages and we assume that the messages are t-sources. In the following theorem, without loss of generality, we hide the randomness used in the randomized one-time pad (rk ⊕ x) for a simpler expression of the proof. So we consider k ⊕ x for the encryption of x using the key k.

Theorem 5.2.3 Let (kg, enc, dec) be the randomized one-time pad for an n-bit message, where kg takes as input a security parameter 1^n and generates n independently and uniformly distributed bits. Each time U_m (for m > 0) is used in the definition, it means an independent (fresh) and uniformly distributed random variable. Define (kg^λ, enc^λ, dec^λ) following Definition 5.2.1, with the key generation function (k_1, ..., k_λ) = kg^λ(1^n, (s_1, ..., s_{λ−1})) such that

k_1 = U_n,
k_2 = ext(k_1, s_1) || U_{n−ℓ},
...
k_λ = ext(k_{λ−1}, s_{λ−1}) || U_{n−ℓ},

where ext is an (n, t, ℓ, ε)-strong average-case extractor and s_i is fresh randomness used as a seed in the extractor (i ∈ [λ − 1]). For X_i, K_i (i ∈ [λ]), let Y_i = K_i ⊕ X_i. Then H^ε_∞(K_i | Y^i) ≥ H_∞(X_i) holds, where Y^i = (Y_i, ..., Y_1).

Proof. Considering only the last ciphertext Y_i for the key K_i (for all i ∈ [λ]), for every y ∈ Y we have

H_∞(K_i | Y_i = y) = −log max_k P_{K_i|Y_i}(k | y)                               (5.1)
                  = −log max_k Pr[K_i = k | X_i ⊕ K_i = y]
                  = −log max_k Σ_x P_{X_i}(x) Pr[K_i = k | K_i = x ⊕ y]          (5.2)
                  = −log max_x P_{X_i}(x) = H_∞(X_i).                            (5.3)

In the above, equation (5.1) is from the definition of min-entropy, and (5.2) is obtained by substituting Y_i by X_i ⊕ K_i and taking a sum over possible values of X_i. Finally, equation (5.3) holds because the probability Pr[K_i = k | K_i = x ⊕ y] is 1 for x = k ⊕ y and 0 otherwise, and for every k there exists an x for which the probability becomes 1. Thus the maximum of the expression over k is the maximum over values of x, and we have the result.

Consequently, H_∞(K_i | Y_i) = H_∞(X_i) holds by summing over values of Y_i = y.

To prove that H^ε_∞(K_i | Y^i) ≥ H_∞(X_i), it is sufficient to prove that for all y,

H^ε_∞(K_i | Y^{i−1} = y) = H^ε_∞(K_i | Y^{i−1}) = n,                             (5.4)

since from Lemma 5.2.3 (let Z = Y^{i−1} and Y = Y_i) the following holds:

H^ε_∞(K_i | Y^i) = H^ε_∞(K_i | Y_i) ≥ H_∞(X_i).                                  (5.5)

So it is left to prove equation (5.4) for all i ∈ [λ], which we prove by induction. The fact we proved in equation (5.3), that

H_∞(K_i | Y_i = y) = H_∞(K_i | Y_i) = H_∞(X_i),                                  (5.6)

is used in all induction steps.

Basis: For i = 1, because K_1 is uniformly distributed, H_∞(K_1) = n. For i = 2, from equation (5.6), for every y it holds that

∆(K_2; U_n | Y_1 = y) = ∆(ext(K_1, S_1); U_ℓ | Y_1 = y) ≤ ε,

which from the definition of smooth min-entropy implies that

H^ε_∞(K_2 | Y_1 = y) = H^ε_∞(ext(K_1, S_1) || U_{n−ℓ} | Y_1 = y) = n.

By taking a sum over values of y, H^ε_∞(K_2 | Y_1) = n holds.

Inductive step: We assume that equation (5.4) holds for i = q, and we prove it for i = q + 1. The key generation process implies that

H^ε_∞(K_{q+1} | Y^q) = H^ε_∞((ext(K_q, S_q) || U_{n−ℓ}) | Y^q),

since K_{q+1} = ext(K_q, S_q) || U_{n−ℓ} from the construction of the kg^λ function. From the assumption in the inductive step, equation (5.4) holds for i = q. Therefore, Lemma 5.2.3 and equation (5.6) imply that for every y

H^ε_∞(K_q | Y^q) = H^ε_∞(K_q | Y_q = y) = H_∞(X_q).

Thus from the definition of randomness extractors, for y^q ∈ Y^q the following holds:

∆(ext(K_q, S_q) || U_{n−ℓ}; U_n | Y^q = y^q, S^{q−1}) = ∆(ext(K_q, S_q); U_ℓ | Y^q = y^q, S^{q−1}) ≤ ε.

Finally, the definition of smooth min-entropy implies that

H^ε_∞(K_{q+1} | Y^q) = H^ε_∞(K_{q+1} | Y^q = y^q) ≥ H_∞(U_n) = n,

which proves the inductive step for i = q + 1. □

In the above theorem, we showed that there is (smooth) min-entropy left in the key given all the previous ciphertexts. Now we can prove that a new message can be encrypted using the leftover randomness in the key, and consequently our construction achieves Definition 5.2.3.

Theorem 5.2.4 The encryption scheme (kg^λ, enc^λ, dec^λ) as defined in Theorem 5.2.3 provides SEC^{λ*}_ind(4ε, 2√2 ε) if the messages are sampled from t-sources X_j for j ∈ [λ].

The proof of Theorem 5.2.4 is divided into two parts: we prove 4ε-secrecy of the last message in the following, and 2√2 ε-entropic security of past messages is proved in Section 5.2.5.

Proof. [of last message security] Let Y_i = enc_{K_i}(X_i) = K_i ⊕ X_i, i ∈ [λ], and let ℓ be the length of the new extracted key, i.e. ℓ = H_∞(X_i) − 2 log(1/ε) + 2. For every ε > 0, we need to prove that

∆(enc_{K_i}(x^i_0); enc_{K_i}(x^i_1) | Y^{i−1}, S^{i−1}) ≤ 4ε,

where Y^i = (Y_1, ..., Y_i) and S^i = (S_1, ..., S_i). From Proposition 5.2.1, Lemma 5.2.1 and the fact that H^ε_∞(K_{i−1} | Y^{i−1}) ≥ H_∞(X_{i−1}) based on Theorem 5.2.3, there exists an (n, t, ℓ, ε)-average-case strong extractor ext such that

∆(ext(K_{i−1}, S_{i−1}); U_ℓ | S^{i−1}, Y^{i−1}) ≤ 2ε,

where t = H_∞(X_{i−1}). Now let K_i = ext(K_{i−1}, S_{i−1}) || U_{n−ℓ}, and the triangle inequality implies that:

A = ∆(enc_{K_i}(x^i_0); enc_{K_i}(x^i_1) | Y^{i−1}, S^{i−1})
  ≤ ∆(enc_{K_i}(x^i_0); enc_{U_n}(x^i_0) | Y^{i−1}, S^{i−1})
  + ∆(enc_{U_n}(x^i_0); enc_{U_n}(x^i_1) | Y^{i−1}, S^{i−1})
  + ∆(enc_{K_i}(x^i_1); enc_{U_n}(x^i_1) | Y^{i−1}, S^{i−1}).

enc is a perfectly secure encryption function, and consequently ∆(enc_{U_n}(x^i_0); enc_{U_n}(x^i_1)) = 0 holds. From Lemma 2.2.2, let f(K) = enc_K(x^i_j) (for j = 0 or j = 1), and hence it is implied that

∆(enc_{K_i}(x^i_j); enc_{U_n}(x^i_j) | Y^{i−1}, S^{i−1}) ≤ ∆(f(K_i); f(U_n) | Y^{i−1}, S^{i−1}) ≤ ∆(K_i; U_n | Y^{i−1}, S^{i−1}).

This inequality implies the following:

A ≤ 2∆(K_i; U_n | Y^{i−1}, S^{i−1})
  = 2∆((ext(K_{i−1}, S_{i−1}) || U_{n−ℓ}); U_n | Y^{i−1}, S^{i−1})
  = 2∆(ext(K_{i−1}, S_{i−1}); U_ℓ | Y^{i−1}, S^{i−1})
  ≤ 4ε,

where the last inequality holds from Proposition 5.2.1 and the fact that we use the extractor for smooth min-entropy. □

The above theorem implies that the encryption system (kg^λ, enc^λ, dec^λ) provides SEC^{λ*}_ind(4ε).

5.2.5 Entropic security of the past messages

Although we proved the security of the last message in the system, the leakage about the first message is not arbitrary, and in the following theorem we bound this leakage using the

definition of entropic security. First we need the following lemma known as XOR lemma

proved in Lemma 3.6 of [DS05].

Lemma 5.2.4 [DS05] If A, B are independent random variables such that H_∞(A) + H_∞(B) ≥ n + 2 log(1/ε) + 1, and {h_r} is an XOR-universal family, then ∆((r, h_r(A) ⊕ B); (r, U_n)) ≤ ε, where r is uniform over R, the set of all hash functions in the family.

One example of an XOR-universal hash function family is h_r(k) = rk, where r and k are both considered as bit strings in GF(2^n) and the multiplication is the field multiplication. Therefore the randomized one-time pad defined in Definition 5.2.5, enc_k(x, r) = h_r(k) ⊕ x, satisfies the requirements of Lemma 5.2.4.

Lemma 5.2.5 Given the key generation algorithm in Theorem 5.2.3, the key distributions satisfy

H_∞(K_i | K_{i+1}, ..., K_λ) ≥ n − ℓ.

Proof. Consider K' = K_{i+1}, ..., K_λ as a randomized function of K_i, where only the function ext(K_i, S_i), with only ℓ bits of output from K_i, is used in K'. From Lemma 5.2.2, only ℓ bits from K_i leak given K', which proves the result. □

Finally we can prove the past message entropic security in the following theorem.

Theorem 5.2.5 For the encryption scheme (kg^λ, enc^λ, dec^λ) defined in Theorem 5.2.3, let W = (enc_{K_2}(X_2), ..., enc_{K_λ}(X_λ), S_1, ..., S_{λ−1}) be all the extra information the adversary has to break the security of the first message. Then for every pair of t-sources X_1, X'_1, it holds that

∆(enc_{K_1}(X_1); enc_{K_1}(X'_1) | W) ≤ 2√2 ε.                                  (5.7)

Proof. The encryption function enc_k(x, r) = rk ⊕ x = h_r(k) ⊕ x is from an XOR-universal family. Now the triangle inequality implies the following:

∆(enc_{K_1}(X_1); enc_{K_1}(X'_1) | W) ≤ ∆(enc_{K_1}(X_1); U_n | W) + ∆(enc_{K_1}(X'_1); U_n | W)
  ≤ 2∆(h_r(K_1) ⊕ X_1; U_n | W)
  ≤ 2∆(h_r(K_1) ⊕ X_1; U_n | ext(K_1, s_1)).                                      (5.8)

It is obvious that X and W are independent, i.e. P_{XW} = P_X P_W. This is because all messages in the sequence are mutually independent, and the message distribution is independent of the key distribution. But K_1 is not independent of W, as K_1 is correlated with K_2 = ext(K_1, S_1) || U_{n−ℓ}. To use Lemma 5.2.4, we need to compute H_∞(K_1 | W). Lemmas 5.2.2 and 5.2.5 imply that

H_∞(K_1 | W) = H_∞(K_1 | ext(K_1, s_1)) ≥ H_∞(K_1) − log |supp(ext(K_1, s_1))| = n − ℓ,

where ℓ is the length of the output of the extractor, based on H_∞(X) = t. This implies that H_∞(X_1) + H_∞(K_1 | W) ≥ t + n − ℓ = n + 2 log(1/ε) − 2. Therefore Lemma 5.2.4 implies that ∆(h_r(K_1) ⊕ X_1; U_n | W) ≤ √2 ε, and thus

∆(enc_{K_1}(X_1); enc_{K_1}(X'_1) | W) ≤ 2√2 ε. □



5.3 Concluding remarks

In this chapter, we proposed to use correlated keys for encryption of multiple streaming

messages, where we showed ε-secrecy for the last message in the stream, and entropic security

of past messages.

5.4 Proofs of lemmas and theorems

5.4.1 Proof of Lemma 5.2.1

Suppose the extractor ext satisfies Definition 5.2.4 for (K | Y) with min-entropy t. Let L ≈_δ K; then the triangle inequality implies that:

∆(ext(L, S); U_ℓ | Y, S) ≤ ∆(ext(L, S); ext(K, S) | Y, S)                         (5.9)
                        + ∆(ext(K, S); U_ℓ | Y, S)                                (5.10)
                        ≤ δ + ε.

In the above, we first used the triangle inequality; then the term (5.9) is at most δ from Lemma 2.2.2, and the term (5.10) is at most ε because of the ext function property.

5.4.2 Proof of Lemma 5.2.2

The definition of conditional min-entropy implies that

H_∞(X | f_R(Y)) = −log Σ_z Σ_r P_R(r) max_x P_X(x) Pr[f_r(Y) = z | X = x].

Now for a fixed r, partition Z into two parts: z ∈ f_r(supp(Y)) and z ∉ f_r(supp(Y)). For the first partition it holds that

Σ_{z ∈ f_r(supp(Y))} max_x P_X(x) Pr[f_r(Y) = z | X = x] ≤ |f_r(supp(Y))| · G(X),

where G(X) = max_x P_X(x), but for the second partition we have

Σ_{z ∉ f_r(supp(Y))} max_x P_X(x) Pr[f_r(Y) = z | X = x] = 0,

since Pr[f_r(Y) = z] = 0 for z in the second partition.

5.4.3 Proof of Lemma 5.2.3

H_∞(X | Z) = n implies n = H_∞(X | Z) ≤ H(X | Z) (since X is of length n) and thus H(X | Z) = n, which means X and Z are independent variables and thus H_∞(X | W, Z) = H_∞(X | W). Consequently, if H^δ_∞(X | Z) = n, there exists a random variable L ≈_δ X such that H^δ_∞(X | Z) = H_∞(L | Z) = n, and for the same random variable L, H_∞(L | W, Z) = H_∞(L | W) holds, which implies that H^δ_∞(X | W, Z) = H^δ_∞(X | W).

5.4.4 Proof of security for uniform messages

To prove the claim in Section 5.2.3, it is sufficient to prove that the key distribution is independent of a sequence of ciphertexts y_1, y_2, ..., i.e.

P_{K|Y_1,Y_2,...}(k | y_1, y_2, y_3, ...) = P_K(k).                               (5.11)

Then both last and past message security are implied by this. To prove equation (5.11), the definition of conditional probability and the chain rule imply that

P_{K|Y_1,Y_2,...}(k | y_1, y_2, ...) = P_{K,Y_1,Y_2,...}(k, y_1, y_2, ...) / P_{Y_1,Y_2,...}(y_1, y_2, ...)
  = [P_K(k) · P_{Y_1|K}(y_1 | k) · P_{Y_2|Y_1,K}(y_2 | y_1, k) · ...] / [P_{Y_1}(y_1) · P_{Y_2|Y_1}(y_2 | y_1) · ...].   (5.12)

Therefore it is sufficient to prove that the term (5.12) is equal to P_K(k). To prove this, we show that for all i > 0,

P_{Y_i|Y_{i−1},...,Y_1}(y_i | y_{i−1}, ..., y_1) = P_{Y_i}(y_i).                   (5.13)

This, combined with the fact that P_{Y_i|K}(y_i | k) = P_{Y_i}(y_i), implies that the term (5.12) is equal to P_K(k). Note that P_{Y_i|K}(y_i | k) = P_{Y_i|X_i}(y_i | x_i) = P_{Y_i}(y_i) holds from the perfect secrecy of the one-time pad and the fact that both X_i and K are uniformly distributed.

To complete the proof, (5.13) is shown as follows:

P_{Y_i|Y_{i−1},...,Y_1}(y_i | y_{i−1}, ..., y_1) = Σ_k P_K(k) P_{X_i|X_{i−1},...,X_1}(k ⊕ y_i | k ⊕ y_{i−1}, ..., k ⊕ y_1).

We prove that the term P_{X_i|X_{i−1},...,X_1}(k ⊕ y_i | k ⊕ y_{i−1}, ..., k ⊕ y_1) = 2^{−n} = P_{X_i}(k ⊕ y_i) holds by induction (where n is the length of the message). For i = 2 it is implied that

P_{X_2|X_1}(k ⊕ y_2 | k ⊕ y_1) = Σ_{x_1} P_{X_1}(x_1) P_{X_2}(x_1 ⊕ y_1 ⊕ y_2) = 2^{−n}.

Now assuming

P_{X_q|X_{q−1},...,X_1}(k ⊕ y_q | k ⊕ y_{q−1}, ..., k ⊕ y_1) = 2^{−n} = P_{X_q}(k ⊕ y_q)

for induction step i = q, where in the last term k = x_1 ⊕ y_1, we prove it for induction step i = q + 1:

P_{X_{q+1}|X_q,...,X_1}(k ⊕ y_{q+1} | k ⊕ y_q, ..., k ⊕ y_1) = Σ_{x_1} P_{X_1}(x_1) P_{X_{q+1}|X_q,...,X_2}(k ⊕ y_{q+1} | k ⊕ y_q, ..., k ⊕ y_2)   (5.14)
  = Σ_{x_1} P_{X_1}(x_1) P_{X_{q+1}}(k ⊕ y_{q+1}) = 2^{−n},                        (5.15)

where in the terms (5.14) and (5.15), k = x_1 ⊕ y_1, and (5.15) is implied by the induction step i = q.

5.4.5 Proof of Theorem 5.2.2

Consider the SEC^2_ind definition for two message pairs, where the adversary receives the distribution over ciphertexts for two message pairs (x_1, w_1) and (x_2, w_2), where x_1 ≠ x_2 and w_1 ≠ w_2. Here the advantage of the adversary in distinguishing the two joint distributions should be sufficiently small (i.e. less than 1/3). For i = 1, 2, let Y_i = enc_K(x_i), Z_i = enc_{K'}(w_i), and T = {(y, z) | ∃ k, k' : y = enc_k(x_1), z = enc_{k'}(w_1)}. This implies that

∆((Y_1, Z_1); (Y_2, Z_2)) = max_{S ⊆ Y^2} |Pr[(Y_1, Z_1) ∈ S] − Pr[(Y_2, Z_2) ∈ S]|
  ≥ |Pr[(Y_1, Z_1) ∈ T] − Pr[(Y_2, Z_2) ∈ T]|.

Note that since the keys are assumed independent and uniformly distributed,

Pr[(enc_K(x), enc_{K'}(w)) ∈ T] = Σ_{k,k' ∈ K} P_K(k) P_{K'}(k') I_T(enc_k(x), enc_{k'}(w))
  = Σ_{k,k'} I_T(enc_k(x), enc_{k'}(w)) / |K|^2,

where I_T is the indicator function, i.e. I_T(y, z) = 1 if (y, z) ∈ T and 0 otherwise.

For fixed keys k, k', the encryption of at most 2^{2r} message pairs is in T, because of the correctness property of encryption (two messages must not be encrypted to the same cipher under one key; if not, then we may have more than 2^{2r} pairs). In other words, for k, k': Σ_{x,w} I_T(enc_k(x), enc_{k'}(w)) ≤ 2^{2r}, and therefore over all keys Σ_{k,k'} Σ_{x,w} I_T(enc_k(x), enc_{k'}(w)) ≤ 2^{4r} holds. For at most λ message pairs (x, w) we can have Σ_{k,k'} I_T(enc_k(x), enc_{k'}(w)) ≥ 2^{4r}/λ, and so for at least 2^{2n} − λ message pairs (x_2, w_2), Σ_{k,k'} I_T(enc_k(x_2), enc_{k'}(w_2)) < 2^{4r}/λ holds. Therefore for all such pairs Pr[(enc_K(x_2), enc_{K'}(w_2)) ∈ T] < 2^{2r}/λ, and this implies that

∆((enc_K(x_1), enc_{K'}(w_1)); (enc_K(x_2), enc_{K'}(w_2))) > 1 − 2^{2r}/λ.

Now let λ = 3·2^{2r−1} and the result holds: for every message pair (x_1, w_1), at least 2^{2n} − 3·2^{2r−1} message pairs are distinguishable from (x_1, w_1) with more than 1/3 bias. Similarly, taking λ = 2^{2r+1} gives at least 1/2 distinguishability for at least 2^{2n} − 2^{2r+1} message pairs. For r = n − 1, there are 2^{2n−2} pairs that are highly distinguishable.

Remark. The above proof holds regardless of the distribution over the messages, unlike Shannon's result, which depends on the message distribution. Here we assume that the key is uniformly distributed. In general, we can conclude that if either the message or the key is uniformly distributed, then we need a key length of at least the length of the message. However, we cannot conclude anything for the case that both the messages and the keys are not uniformly distributed.

Chapter 6

Human-Assisted Generation of Random Numbers

In this chapter, we discuss our proposal for random number generation in two approaches. The first approach uses human game-play in video games, and the second approach extends a work by Halprin et al. using human game-play in specially designed games. The second work is joint work and I contributed 40%, mainly toward the proof of theorems about expander graphs, the implementation and the experiments.

6.1 Introduction

Randomness has a central role in computer science and in particular information security.

Security of cryptographic algorithms and protocols relies on keys that must be random.

Random coins used in randomized encryption and authentication algorithms and values

such as nonces in protocols, must be unpredictable. In all these cases, unpredictability of

random values is crucial for security proofs. There are also applications such as online games,

gambling applications and lotteries in which unpredictability is a critical requirement.

Poor choices of randomness in the past have resulted in complete breakdown of security and

expected functionality of the system. Early reported examples of bad choices of randomness

resulting in security failure include attack on Netscape implementation of the SSL protocol

[GW96] and weakness of entropy collection in Linux Pseudo-Random Generator [GPR06].

A more recent high profile reported case was the discovery of collisions among secret (and

public) keys generated by individuals around the world [LHA+12, HDWH12]. Further studies

attributed the phenomenon partly to a flaw in the Linux kernel randomness generation subsystem.

In computer systems, perfect randomness is commonly generated by sampling a complex

external source such as disk accesses or time intervals between system interrupts, or is

from users’ inputs. Importance of perfect randomness in computer systems has been well

recognized and operating systems such as Linux and Windows have dedicated subsystems

for entropy collection and randomness generation. These subsystems use internal system

interrupts as well as user generated events as entropy source. High demand on entropy pools,

for example when a computer runs multiple processes and algorithms that require randomness

at the same time, can result in either pseudorandom values instead of truly random values,

or stopping the execution until sufficient randomness becomes available.

The rate of randomness generation can be increased by including new sources of randomness, which in many cases requires new hardware. An attractive alternative that does not require additional hardware is to use human assistance in randomness generation. However, experiments in psychology have shown that asking humans to produce randomness results in biased output [Wag72] under normal conditions.

6.1.1 Related work

Users' inputs through devices such as mouse and keyboard [ZLwW+09] have been widely used

for background entropy collection in computer systems. An example is Linux based systems

[GPR06] in which the operating system continuously runs a background process to collect

entropy from users' inputs. Compared to our approach, these entropy sources are in general expected to have a lower entropy rate when used for on-demand collection of entropy. This is

because of the repetitive patterns of mouse movements or key presses.

In [HN09], Halprin et al. proposed an approach to construct an entropy source using

human game play. Their work built on the results in experimental psychology. It is known

that humans, if asked to choose numbers randomly, will do a poor job and their choices will

be biased. Wagenaar [Wag72] used experiments in which participants were asked to produce

random sequences and noted that in all experiments human choices deviated from uniform

distribution. In [RB92], Rapoport et al., through a series of experiments, showed that if a human plays a competitive zero-sum game against another human with uniform choices as the best

strategy, their choices will be close to uniform. In their experiment they used matching

pennies game in which each player makes a choice between head or tail using an unbiased coin,

and the first player wins if both players choose the same side and the second, if they choose

different side. In this game the optimal strategy of users is random selection between head

and tail. Their result showed that users almost followed uniform random strategy confirming

that human can be a good source of entropy if they are engaged in a strategic game and

entropy generation is an indirect result of their actions. Halprin et al. used these studies to

propose an entropy source using human game play against a computer. In their work human

plays a zero-sum game with uniform optimal strategy against the computer. The game is an

extended matching pennies game (user has more than two choices) and is played many times

to increase the amount of randomness generated.

They used an extension of this game that gives n choices to the player: the user is presented with an n × n matrix displayed on the computer screen and is asked to choose a matrix location. The user wins if their choice is the same as the square chosen by the computer. They noted

that the visual representation of the game resulted in the user input being biased, as users avoided corner points and limiting squares. The sequence generated by the human was used as the input to a seeded extractor (Definition 3.3.3) to generate a sequence that is ε-close to uniform.

In addition to the human input sequence, the TRG uses a second source of perfect randomness

to provide a seed for the randomness extractor. They provided visual representations of human choices that indicate a good spread of points. However, statistical and min-entropy evaluation of the system is restricted to using statistical tests on the output of the seeded extractor.

In this chapter, we first discuss the general design structure of a true randomness

generator (TRG) and then present two approaches to generate randomness using human. In

one approach, we propose an integrated approach where the game play between a human and

the computer is used to implement the two phases of a TRG including randomness source

and randomness extraction phase. That is the user’s input provides the required randomness

for the entropy source and the extractor both. The other approach is a novel idea in using

human game-play in video games to generate randomness.

6.2 The structure of a TRG

A sequence of random variables {X_i}_{i=1}^n is called an almost perfectly random sequence if {X_i | X_{i−1} = x_{i−1}, ..., X_1 = x_1}_{i=1}^n is ε-biased. An entropy source is a generator of sequences of symbols {x_i}_{i=1}^n, each sampled from a random variable X_i, where all X_i are defined over a finite set X. It is important to note that the output symbols of an entropy source may be correlated and need not have the same distribution.

An entropy source uses physical processes such as noise in electronic circuits, or software

processes that are “unpredictable”, to output a sequence over an alphabet that is highly

“unpredictable”, where unpredictability is measured by min-entropy. Although the output of an entropy source can have a high level of randomness, the underlying distribution may be far from uniform. To make the output of an entropy source follow a uniform

distribution, a post processing step is usually used.

A True Randomness Generator (TRG) thus consists of two modules: (i) an entropy source

that generates a sequence of symbols with a lower bound on its min-entropy, followed by, (ii)

a randomness extractor module that transforms the entropy source into an almost perfectly

random sequence.

In practice one needs to estimate the min-entropy of the entropy source to be able to

116 choose appropriate parameters for the extractor module. The distribution of the entropy

source and its min-entropy may fluctuate over time and so a TRG needs to use an extractor

that provides sufficient tolerance for these fluctuations. The final step in ensuring that the

output of the TRG is almost perfectly random is to statistically test the final output. We

note that both min-entropy estimation step and the final statistical testing are required to

ensure the correctness of the whole process. In certain works such as [HN09], the first step is

missed and this may cause biases in the output that remain undetected by the statistical tests.

In this section, we explain the two modules of a TRG and how the final sanity check is

done.

6.2.1 Entropy estimation of the source

To measure the min-entropy (or Shannon entropy) of a source one needs to assume certain

structure in the source distribution. For a list of n samples {s_i}_{i=1}^n from a source S over the finite set S, if we assume that the source S is i.i.d., that is, samples are independently and identically distributed, then having enough samples from the source allows us to estimate the

(section 9.2).

NIST draft [BK12] gives the requirements of entropy sources as well as proposing a number

of tests to estimate the min-entropy of the source in i.i.d. and non-i.i.d. case, both. The

testing method first checks whether the source can be considered i.i.d. NIST suggests the

following set of tests for i.i.d. sources (section 9.1 of [BK12]): 6 shuffling tests, Chi-square

test, test for independence and goodness of fit. If all of the tests are passed, the source is

assumed to be i.i.d. and then a conservative method is used to estimate the entropy of i.i.d.

source. If any of the tests are not passed however, the source is considered to be non-i.i.d.,

and a number of tests are used to estimate the min-entropy. This second group of tests comprises the

collision test, partial collection test, Markov test, compression test and the frequency test,

117 each outputting a value as the estimation of the min-entropy. The final min-entropy will be

the minimum over all these estimated values. While i.i.d. and non-i.i.d. tests provide an

estimation of the entropy, they may fail to detect anomalies in the source. Therefore, NIST

defines a set of sanity checks that will make sure this does not happen. The sanity checks

contain two tests: Compression test and collision test. If the sanity checks fail, no estimation will be given.

For our experiments, we obtained an unofficial version of the code (the code is not released yet) and used it to estimate the min-entropy of our source. We ran tests to verify whether our

estimations are meaningful (our sanity checks), and also check consistency in the min-entropy

estimation for a data set from /dev/random in Linux. Our analysis showed that the estimation

from NIST set of tests are sound, but are very conservative (admitted in section 9.3.1 of

[BK12]. For example, we expect to have roughly the same approximation of min-entropy for

the data in /dev/random. But the approximation from the NIST tests depended very much

on the number of samples given to the tests (which is quite intuitive and acceptable). This

caused very low estimates for a subset of our users with smaller sample sizes and, in general, min-entropy estimation in our experiments was conservative.

6.2.2 Extraction module

Randomness extractors are deterministic or probabilistic functions that transform the output

of an entropy source to a uniform distribution using a mapping from n to m bits (usually m ≤ n), extracting the entropy of the source.

To guarantee randomness of their output, randomness extractors need a guarantee on the randomness properties (e.g. the min-entropy) of their input entropy source. Extractors that

can extract randomness from sources that satisfy a lower bound on their min-entropy, are

probabilistic [Sha11]. A probabilistic extractor has two inputs: an entropy source together with a second input that is called seed. Good probabilistic extractors use a short seed

(logarithmic in the input size) to extract all the randomness (close to the min-entropy) of the

118 input entropy source.

The Extraction module can use a general seeded extractor that will guarantee randomness

of the output for any distribution with min-entropy (Definition 2.2.5) k, or an approach

proposed in [BST03] in which the set of possible input sources is limited to a set of 2^t possible

ones all with min-entropy k. The former approach requires a fresh random seed for each

extraction but has the advantage that the input source can have any distribution with the

required min-entropy. This latter approach however requires the input sequence to be one of

the set of 2^t possible sources, but has the advantage that one can choose a function from

a class of available extractors and hard-code it in the system. This means that in practice

no randomness is required. However no guarantee can be given about the output if the

input sequence is not one of the 2^t that have been used for the design of the system, and this

property cannot be tested for an input sequence. Halprin et al. used the latter approach,

using a t-universal hash function as the extractor. The randomness guarantee of the final

result requires the assumption that the human input is one of the 2^t sources. In practice,

t can not be arbitrarily large and must be small to guarantee a minimum output rate for

randomness. This can pose a security risk that the actual distribution is not one of the 2^t

distributions. Halprin et al. did not perform a quantitative analysis of user sequences and used a visual representation of the human choices to conclude that the choices were random.

We note that simpler extractors such as the Von Neumann extractor [vN51b] put strong

requirements on their input sequence. For example, the Von Neumann extractor requires the input string to be a Bernoulli sequence with parameter p, which is not satisfied by the sequence

of human choices where successive choices may be dependent with unknown and changing

distribution. The expander graph extractor works for all distributions whose min-entropy is

lower bounded by a given value, and does not put extra requirements on the input sequence.
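For concreteness, a sketch of the von Neumann extractor mentioned above: it pairs up input bits and outputs one bit per discordant pair, which is only guaranteed to produce unbiased output when the input is an independent Bernoulli sequence.

```python
def von_neumann(bits):
    """Map pairs 01 -> 0 and 10 -> 1; discard 00 and 11."""
    out = []
    for a, b in zip(bits[::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out

# e.g. a biased but independent bit stream
print(von_neumann([1, 1, 0, 1, 1, 0, 0, 0, 1, 0]))
```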

In [BST03], Barak et al. proposed a framework for randomness extraction with guaranteed

statistical property for the output, that can be seen as a compromise between seeded and

119 deterministic extractors.

Barak et al. framework. [BST03] The motivation of this work is to extract randomness for

cryptographic applications where the adversary may influence the output of the entropy

source. The adversary's influence is modeled by a class of 2^t possible distributions generated

by the source. They proposed a randomness extractor with provable output properties (in

terms of statistical distance) for sources that have sufficient min-entropy while the output

source symbols may be correlated. The extraction uses a t-resilient extractor, which can extract from 2^t distributions (selected adversarially), all having min-entropy k. Certain hash functions

are good t-resilient extractors.

Theorem 6.2.1 [BST03] For every n, k, m and ε, and ℓ ≥ 2, an ℓ-wise independent hash function with a seed of length ℓ·n is a t-resilient extractor, where t = (ℓ/2)(k − m − 2 log_2(1/ε) − log_2(ℓ) + 2) − m − 2 log_2(1/ε) − 2.

An ℓ-wise independent family of hash functions can be constructed using a polynomial h_s(x) = Σ_{1≤i≤ℓ} a_i x^{i−1} of degree ℓ − 1 over the finite field GF(2^n), where s = (a_1, ..., a_ℓ) is the seed of the extractor and x ∈ GF(2^n) is the n-bit input.

The t-resilient extractors in Barak et al.'s approach reuse a truly random seed that is

hardwired into the system (e.g. at manufacturing time) and does not need new randomness

for every extraction. Although extractors enjoy sound mathematical foundations, in practice

the output of entropy sources is mostly processed using hash functions with computational

assumptions and so extractors have not been widely implemented in hardware or software.

In our second approach, we follow the framework of Barak et al.
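A toy sketch of this polynomial hash, evaluated by Horner's rule over the small field GF(2^8) for readability (an actual instantiation would work over GF(2^n) for the real input length and truncate to m output bits; the field size, ℓ and m below are illustrative only):

```python
import secrets

def gf256_mul(a, b):
    """Multiplication in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return p

def poly_hash(seed, x, m_bits=4):
    """h_s(x) = sum_i a_i x^(i-1) over GF(2^8), truncated to m_bits output bits."""
    acc = 0
    for a_i in reversed(seed):          # Horner evaluation of the polynomial
        acc = gf256_mul(acc, x) ^ a_i
    return acc & ((1 << m_bits) - 1)

ell = 4                                              # l-wise independence
seed = [secrets.randbits(8) for _ in range(ell)]     # chosen once and then reused
sample = 0x5C                                        # one symbol from the entropy source
print(poly_hash(seed, sample))
```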

6.2.3 Measuring statistical property of the output

To examine statistical properties of the output sequence, a set of statistical tests are developed

that compares the expected statistics in perfectly random sequences with the output of the

process. For example, the expected number of 0s and 1s are roughly equal for a perfectly

120 random sequence of bits, Or in such a sequence one does not expect a long sequence of

consecutive run of 1s or 0s. Statistical tests will collect these statistics from the sequence

and compare it with the expected behavior of a perfectly random sequence, and at the

end it outputs a p-value, a value showing how close the output is to a perfectly random

sequence. However, these tests can only detect certain statistical biases and can not prove

that a sequence is truly random. Nevertheless, this is the best one can do to ensure random

properties of the output.

In this chapter, we use the statistical tests in a battery of tests called Rabbit [LS07b].

Rabbit set of tests includes tests from NIST [ea10] and DIEHARD [Mar98] with modified

parameters tuned for smaller sequences and hardware generators. We used an implementation

of these tests in [LS07a].

We now describe the two approaches in generating random numbers in the following sections.

6.3 Human game-play in video games for randomness generation

In this method, a novel indirect approach to entropy collection from human input in game play is proposed, which uses games as a targeted activity that the human engages in and, as a by-product, generates random values. Video games are one of the most widely used computer

applications and embedding entropy collection in a large class of such games provides a rich

source of entropy for computer systems. Our main observation is that human, even if highly

skilled, would not be able to have perfect game play in video games because of a large set of

factors related to human cognitive processes and motor skill and coordination, limitations of

computer interface including display, keyboard and mouse, and unpredictability elements

in the game. The combination of these factors in well designed games results in the player

“missing” the target in the game where although the goal may appear simple, achieving it is

not always possible. Games usually come with a scoring system that rewards smaller “misses”

of the target and provides incentive for further play.

121 6.3.1 Our contribution

We propose to use the error resulting from the confluence of the complex factors outlined

above, as an entropy source. The unpredictability in the output of this source is inherent

in the game design: that is, a game that always results in a win or a loss is not “interesting” and will not be designed. In contrast, games in which the user can “lose” a good portion of rounds are considered interesting. In a human game play, randomness can be collected from

different variables in the game, including the timing of different actions, the size of the “miss”

as well as variables recording aspects of the human input such as angle of a shot, and so in

each round, even when the user wins, a good amount of entropy can be generated.

In this section, we describe our indirect approach to entropy collection from human input

in game play that uses games as a targeted activity that the human engages in and, as a by-product, generates random values.

Implementation

As a proof of concept we designed and implemented a multilevel archery game, collected user

inputs in the game and extracted randomness from the collected inputs. For randomness

extraction we used the approach in [BST03] that uses universal hash functions. This allowed

us to have provable randomness in the output of the extractor, as long as a good estimate

of the input entropy is given. To estimate the min-entropy of the input to the extractor

(min-entropy of the user input), we employed a set of min-entropy estimation tests proposed

by NIST and used a beta implementation from NIST^1 [BK12].

Our results clearly show that error variables, for example the distance between the target and the trajectory of the arrow, provide a very good source of entropy. The experiments show that the game can generate 15 to 21.5 bits of min-entropy per user shot using only the error variable. The variation in the amount of min-entropy is due to variations in the game level and also to the varying levels of skill and learnability of users. Our experiments demonstrate that although entropy decreases as players become more experienced, the entropy of the error variable stays non-zero; even for the most experienced player at the lowest level of the game, 15 bits of entropy per shot can be expected. The details of the game, the experiments, the analysis of the users’ input sequences, and the extraction algorithm are given in Section 6.3.4.

¹The software was provided by Tim Hall and John Kelsey from NIST.

6.3.2 Applications

Game consoles and smart phones. Game consoles require perfect randomness for secure communication with the game servers, for checking digitally signed games and firmware updates from the server, and for providing randomness to the games being played. Lack of good random generation subsystems in these consoles may result in attacks such as the one reported in [Hot10]. Incorporating our approach into the video games played on such consoles would provide a method of generating randomness with a high rate and verifiable properties. Our approach also provides an ideal entropy source for small devices such as smart phones, which are used for gaming and have limited sources of entropy.

On-demand RNG in OS. An immediate application of our proposal is to provide an on-demand entropy source for the OS random number generation module. In software such as PGP, OpenSSL, and GnuPG that needs to generate cryptographic keys, using perfect randomness is critical. Such applications rely on the random number generation of the OS, which may not have perfect randomness available at the time of the request. Our proposed entropy source can be used for entropy collection from users by asking them to play a simple game. Our experiments showed that producing 100 bits of entropy required 6 runs of the game, making the approach an effective solution in these situations.

Contributory random number generation. In virtualized environments, multiple users share the same hardware (including CPU and RAM), and so the entropy pools of different users share a substantial amount of entropy produced by the system’s shared hardware, resulting in the risk of dependence between the entropy pools of different users. This is an important concern if the resulting randomness is used for key generation, potentially leading to attacks such as those reported in [HDWH12]. Using users’ game-play provides a source of randomness that is independent of the underlying hardware and of other users’ randomness.

6.3.3 The TRG design

Consider a computer game in which the player aims to hit a target, and wins if the projectile “hits” the target. Many factors contribute to the user missing the target even if they play their “best”, making the game result uncertain. We propose to use the uncertainty in the game’s result as the entropy source. Assuming a human is engaged in the game and plays their best, the uncertainty in the result is due to a large set of factors, including: 1) limitations of human cognitive and motor skills in correctly estimating values, calculating the best response values (e.g., under the time limitations imposed by the game, the best angle and speed of a throw) and performing the required action at the exact time; 2) limitations of input devices for inputting the best values when they are known, for example the limitation of a mouse in pointing an arrow in a particular direction; and 3) unknown parameters of the game (e.g., the game’s hidden constants) and variabilities that can be introduced in different rounds. Other human-related factors that contribute to the unpredictability of the results are limited attention span, cognitive biases, communication errors, limits of memory, and the like. These uncertainties can be amplified by introducing extra uncertainty (pseudo-randomness) into the game: for example, allowing the target to move slowly. As a proof of concept for this proposal, we designed and implemented an archery game, and studied user-generated sequences and the randomness extracted from them. Below are more details of this work.

6.3.3.1 The game design

Our basic game is a single-shooter archery game in which the player aims an arrow at a target and wins if the arrow hits the target: the closer to the center of the target, the higher the score. A screen-shot of the game is shown in Figure 6.1. The arrow path follows the laws of physics and is determined by the direction of the shot, the initial velocity of the arrow, and the Earth’s gravitational pull. This results in a parabolic path that the arrow traverses to hit the target. The player chooses an initial speed and angle of throw to hit the target. The game is available to play at [Ali13a].

Figure 6.1: Screen-shot of the game
Figure 6.2: The measurement of output

The target is shown as a circular disk on the screen. The game records the distance between the center of the target and the trajectory (Figure 6.2). To display the trajectory on the screen, the graphics engine translates the location of the arrow into pixel values and shows them on the display. We, however, use the actual value of the distance between the center of the target and the trajectory, calculated using the laws of physics (kinematic equations) and then rounded off to a 32-bit floating point number (the effective bits). The advantage of this approach is not only avoiding entropy loss, but also the independence of the implementation and measurements from the screen size and resolution of the end device. For the error variable we use the range I = [−120, 120], with each sample read as a 32-bit floating point number represented as [Sign (1 bit), Exponent (8 bits), Fraction (23 bits)].
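For illustration only, the short Python sketch below shows one way to view such an error sample in its sign/exponent/fraction form; the helper name and the sample value are ours and not part of the thesis implementation.

```python
import struct

def float32_bits(x: float) -> str:
    """Return the IEEE 754 single-precision bit pattern of x as a 32-character string."""
    (as_int,) = struct.unpack(">I", struct.pack(">f", x))
    return format(as_int, "032b")

# Hypothetical error sample (distance from the target center, in [-120, 120]).
bits = float32_bits(-37.25)
sign, exponent, fraction = bits[0], bits[1:9], bits[9:]
print(sign, exponent, fraction)   # 1 sign bit, 8 exponent bits, 23 fraction bits
```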

We will refer to each shot as a round of the game. After playing the game for a number of rounds, the server will have a sequence of floating point numbers in the interval I = [−120, 120]. The range [−120, 120] can be adjusted depending on factors such as screen size and target shape. One can use multiple seemingly unrelated variables in the game as the source of entropy. Examples are the angle and initial velocity of the shot, the time it takes for a user to make a shot, and the time between two consecutive shots. We only analyze the entropy source corresponding to the variable that represents the human error in hitting the target.

The game was implemented using HTML5 for ease of portability.

6.3.3.2 Game parameters and levels

Our initial experiments showed that the game parameters affect the final entropy. We designed

game levels with varying sets of parameters. The parameters that we alter in the game

are: 1) Location of the target, 2) Speed of the target movement, 3) Gravity force with two possible directions, and 4) Wind speed and direction. These parameters can change for every player shot, or stay constant during the game. There were other possibilities, such as adding an obstacle (e.g., a blocker wall) to force the player to choose angles from a wider spectrum, putting a time limit on each shot to force the player to release the arrow faster, or using a smaller or farther target on the screen; these could be considered in future work. The final game has 8 levels, 3 of which were used for the experiments, labeled A, B and C respectively. In level A, all parameters were “fixed” with no change between rounds, so no input from the computer is used. In level B, a subset of the game parameters changes in each round, and the values of the parameters are shown in the interface so the player can decide based on that information. In level C, the values of the changing parameters of level B are not shown to the user (except the direction of gravity and wind). The high uncertainty in this level makes the players rely on their intuition to play the game.

We did not perform a user study to show the attractiveness of these levels, but comments from users indicated that level B was the most appealing.

6.3.3.3 Entropy source

The distance between O, the target center, and the trajectory at O′ is a 32-bit floating point number. One can use quantization to map these numbers into ℓ bins. A simple quantization is to consider circular areas, centered at O, with linearly increasing radii: the first circle has radius r, the second 2r, etc. Now if O′ for a miss trajectory is in the first circle, it is considered 0, the next one 1, and so on. A good quantization and extraction will ensure that every element of the alphabet is generated “roughly” with the same relative frequency.
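As a minimal sketch of the simple quantization just described (the radius r and the number of bins are illustrative values, not parameters from the thesis):

```python
def quantize(distance: float, r: float, num_bins: int) -> int:
    """Map the miss distance to a bin index: distances inside the first circle of
    radius r map to 0, the ring up to 2r maps to 1, and so on (last bin catches the rest)."""
    return min(int(abs(distance) // r), num_bins - 1)

# Illustrative values only.
print([quantize(d, r=10.0, num_bins=8) for d in (2.5, -14.0, 37.0, 95.0, 119.9)])
```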

To have this property, we followed the randomness extraction framework of [BST03], with an extra statistical evaluation step at the end. Our randomness extraction and evaluation consists of three steps: (i) estimate the min-entropy of the sequence; (ii) given the estimate, apply an appropriate extractor (in our case, a pairwise independent hash function) to the sequence; and (iii) use statistical tests to evaluate the quality of the final output sequence.

We used the NIST tests [BK12] outlined in Section 6.2.1 to estimate the min-entropy of our sequences. Our experimental results showed that our entropy sources were not independent and identically distributed (i.i.d.): for each data set, either the Directional Run or the Covariance test (part of the shuffling tests) failed. We therefore estimated the min-entropy of our sequences assuming non-i.i.d. sources. In the post-processing step, the sequences were converted to truly random sequences using extractors. We used a t-resilient extractor defined over a finite field, and so the floating point numbers needed to be translated into numbers in that field. One naive approach was to cast each floating point number into the integer value with the same bit representation. This method, however, would affect the structure of the sequence. For example, the sequence of differences between two floating point numbers (which represent distances of the arrow from the target center) would have a different structure from the sequence of differences between their corresponding integer values if simply cast. In order to maintain the structure of the entropy sequences in our experiments, we added a processing step that converts the output sequence into a sequence of integers while keeping the structure of the source, as explained in the next section.

The final output string (after application of the extractor) was evaluated using statistical tests. We used the TestU01 framework [LS07b], with an implementation available at [LS07a]. This framework implements an extensive set of statistical tests, including those of [BRS+10, Mar98].

6.3.3.4 From entropy source to the output

We read 32-bit floating point numbers as the output of the entropy source and interpreted each sample as a 32-bit integer as follows. In this step, we apply a function f : I → Z to the sequence of floating point numbers generated by the entropy source:

1. Divide I into 2^32 partitions.
2. Index each partition, from −2^16 to 2^16.
3. For each number generated, return the index of the partition containing the number.
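A minimal sketch of such a conversion is given below; the function name is ours, and the centred index range here is only illustrative (the thesis indexes its partitions from −2^16 to 2^16), so this sketch only conveys the idea of an order-preserving map from the interval to integers.

```python
def float_to_index(x: float, lo: float = -120.0, hi: float = 120.0, bits: int = 32) -> int:
    """Map x in [lo, hi] to the (signed) index of one of 2**bits equal-width partitions."""
    partitions = 1 << bits
    width = (hi - lo) / partitions
    idx = min(int((x - lo) // width), partitions - 1)   # partition containing x
    return idx - (partitions >> 1)                      # centre the index range around zero

# Illustrative samples from the error variable's range.
print([float_to_index(v) for v in (-120.0, -0.001, 0.0, 37.5, 119.99)])
```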

This additional step, applied to the source, does not decrease its entropy.

Proposition 6.3.1 The conversion function f does not decrease the entropy, in terms of either Shannon entropy or min-entropy.

Proof. Consider X as the distribution of the entropy source when generating one symbol. The distribution of the source after applying the conversion function becomes f(X). It is simple to show that H(f(X)) ≤ H(X) [CT91], with equality if and only if f is an injective function. Since the function f is injective from the 32-bit floating point numbers to the 32-bit integers, for Shannon entropy H(f(X)) = H(X) holds. For min-entropy the same result holds:

$$H_\infty(f(X)) = -\log \max_y \Pr[f(X) = y] = -\log \max_y \Pr[X \in f^{-1}(y)] = -\log \max_x \Pr[X = x] = H_\infty(X),$$

where y ∈ {−2^16, …, 2^16} and x = f^{-1}(y) for all y. □

In general, applying a function to an entropy source can decrease the Shannon entropy and the min-entropy unless the function has certain properties. For Shannon entropy, being injective is a necessary and sufficient condition for preserving the entropy. For min-entropy, however, being injective is sufficient but not necessary to preserve the entropy.

For the next step, we needed to estimate the min-entropy of this sequence. To use a min-entropy test, we needed a sufficiently long sequence over an alphabet. We interpreted each 32-bit block as a collection of sub-blocks of different lengths. We were limited by the available user-generated data, and so the size of the sub-blocks depended on the experiment, to ensure that a sufficiently long sequence was available. We used the min-entropy test outlined in Section 6.2.1, considered each sample as 32 one-bit sub-blocks, and obtained an estimate of the min-entropy per bit. We considered all bits of the input in the estimation, even those with low min-entropy. Given an estimate of k bits of min-entropy for a single bit, we obtained an approximate value for the min-entropy of each sample as 32k. Here we effectively assumed the bits have similar min-entropy, which is reasonable since our per-bit min-entropy estimation considered all bits. We performed the above calculations for the data from each player, including all levels, resulting in a minimum estimated min-entropy of 0.626 per bit. For 32 bits, we estimated 32 × 0.626 ≈ 20 as the minimum min-entropy of the source per 32 bits. Note that this minimum is over the data from all levels for each user, while the minimum we reported earlier (15 bits) is the measured min-entropy for the most skilled user in the simplest level.

We closely followed the framework of Barak et al. by using a 32-wise independent hash function, with ε = 2^{-2} and m = 11. Using Theorem 6.2.1, the extractor was chosen to be t-resilient with t = 16. Here 2^t is the number of possible distributions chosen by the adversary; variations of the distribution due to the players’ experience could be modeled similarly. The random seed for the extractor was generated from /dev/random in Linux. To examine the properties of the final output sequence, we used the Rabbit battery of statistical tests [LS07b], which includes tests from NIST [ea10] and DIEHARD [Mar98] with parameters tuned for smaller sequences and hardware generators, using the implementation in [LS07a]. All tests were passed.
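To make the hash-based extraction step concrete, here is a toy Python sketch of extraction with a randomly seeded t-wise independent hash over a prime field. The prime, the function names and the sample values are ours, and this is not the exact extractor construction of [BST03] used in the thesis; it only illustrates the general idea of hashing each high-entropy sample down to m output bits under a random seed.

```python
import secrets

P = (1 << 61) - 1          # a Mersenne prime; illustrative choice of field for the hash

def make_twise_hash(t: int, out_bits: int):
    """Sample a random degree-(t-1) polynomial over GF(P); evaluating it at distinct
    points gives a t-wise independent hash family. Keep the low out_bits of the result."""
    coeffs = [secrets.randbelow(P) for _ in range(t)]   # plays the role of the extractor seed
    def h(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):                      # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc & ((1 << out_bits) - 1)
    return h

# Toy usage: t = 32 and m = 11 mirror the parameters quoted above.
extract = make_twise_hash(t=32, out_bits=11)
samples = [0x1234ABCD, 0x0F0F0F0F, 0xDEADBEEF]          # hypothetical 32-bit source samples
print([extract(s) for s in samples])
```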

6.3.4 Experimental setup and results

In this section, we present our experimental results. We asked a set of 9 players to play each of the three levels for at least 400 rounds. The rest of the levels were played for learning. Our objective was to answer the following questions:

1- The minimum entropy that can be expected in a single round: As noted earlier, factors such as the user’s skill level before the game starts, learning through playing the game, and the match between the skill level and the difficulty of the game affect the final entropy of each round.

2- The change in min-entropy of a player over time: We examine how more experience and

familiarity with the game would affect the amount of entropy generated in a round.

3- The effect of the game level on min-entropy: In this experiment, we determine the best

game parameters that maximize the rate of the output entropy.

We performed two sets of experiments to estimate the minimum entropy per round that

can be expected from the game.

A) Entropy of generated sequences for one player.

In this experiment we measured the min-entropy of the sequences generated by each player. We partitioned a player’s sequence of shots into 20 parts and measured the min-entropy of each part per bit, i.e., considering each bit of a floating point number as one sample, which gives 32 samples per round. The graph in Figure 6.3 shows the maximum, minimum and average min-entropy for each player, here a set of 9 players. We also repeated this in experiment E for all players; Figure 6.4 illustrates the resulting min-entropy of each bit for one sample user, which is consistent with the experiment on the data from all users.

Figure 6.3: Min-entropy for players
Figure 6.4: Min-entropy in blocks of bits (one user)

B) Sequence generated by the population.

In this experiment, data from all users were considered as one. We then measured the

min-entropy (per bit) for this data set. Our estimation of min-entropy for the population

shows that the average min-entropy in the output is 0.643 per bit, so on average, with 5 shots (5 × 32 bits) one can generate 103 bits of min-entropy. The average time for each shot (over all players) was approximately 2 seconds. Note that this estimate is higher than the average of the min-entropies of all users measured separately, which is 0.61, because the NIST tests give higher min-entropy estimates for larger datasets, as noted at the end of Section 6.2.1.

C) Effect of players’ experience on min-entropy.

An important factor in game design is the effect of players’ experience on the generated

entropy. Intuitively, we expect the entropy to decrease as players play more. In our game,

one expects more experienced players to hit closer to the target center more often, and so produce less observable error, while an inexperienced player’s shots appear more random. We estimated the min-entropy of each of the game levels for 3 different players. Our results confirm this expectation; however, they show that even the most experienced user at the lowest game level can generate a good level of entropy.

Figure 6.5 illustrates the change in min-entropy in each of the three game levels A, B and C as players gain more experience. Figure 6.5 also shows how the design of the game neutralizes the effect of the player’s experience to keep the average min-entropy high enough for randomness extraction.

Graphs 6.5 and 6.6 are divided into three parts, each consisting of 3 curves corresponding to the 3 players. The three parts, left (from 0 to 20), middle (21 to 40) and right (41 to 60), correspond to levels A, B and C, respectively. We used the 3 players with the highest (the blue curves marked with the letter H), average (the red curves with the letter A) and lowest (the yellow curves with the letter L) scores for this experiment. An interesting observation about level C is that the min-entropy does not necessarily decrease for a user, which is expected given that the game parameters are randomly changed and not known to the players.

D) Min-entropy and game levels.

We considered the change in min-entropy over time for a level, that is, the reduction in entropy as users become more skilled. We used the min-entropy estimation for all players’ data, partitioned into 20 sections as in the previous experiment. The data correspond to the sequence of shots over time, so the first section of the data comes first (users starting the game), followed by the second and third sections as the players become more experienced. We performed this experiment on the data of all users to find the average trend of the min-entropy.

Figure 6.5: Min-entropy during levels A, B and C for 3 users
Figure 6.6: Average min-entropy change during levels, over all users
Figure 6.7: Min-entropy of bits: (a) blocks of bits, all players; (b) blocks of bits, level A; (c) blocks of bits, level B; (d) blocks of bits, level C

The graph is divided into three parts corresponding to the three levels, as in the previous section. Figure 6.6 shows the results of all measurements in the left (level A), middle (level B), and right (level C) parts. Level A shows a reduction in min-entropy as the players become more experienced, and it has the largest min-entropy decrease among the three levels. In level B, the min-entropy fluctuates around the value 0.625 and is relatively stable. For level C, however, there is no clear trend in the data (this is true in general for all players), but the average min-entropy is higher than in levels A and B. One reason for the increase of min-entropy in level C is probably the reluctance of players to play well over time, due to the many unknown variables of the game that make it hard to win. This confirms the effect of non-deterministic and unknown parameter values, which make the skill level somewhat irrelevant.

E) Most important bits in the output.

Different bits of the 32-bit representation of the error variable may carry different amounts of entropy. In this experiment, we ran the min-entropy estimation test individually on each bit of the output (one bit per 32-bit sample). We also ran the same min-entropy estimation test on t consecutive bits of the 32-bit sample, using a sliding window of t consecutive bits, shifting one bit at a time, for t = 1, 2, 3, 4, 5. Figure 6.7 shows the results of this experiment.
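A rough sketch of this sliding-window analysis is shown below. It uses a simple most-common-value estimate rather than the full NIST estimation suite used in the thesis, and the 32-bit samples are hypothetical.

```python
import math
from collections import Counter

def window_min_entropy(samples, start, t):
    """Estimate min-entropy of the t-bit window starting at bit `start` (0 = MSB)
    across 32-bit samples, using a most-common-value estimate."""
    blocks = [(s >> (32 - start - t)) & ((1 << t) - 1) for s in samples]
    p_max = max(Counter(blocks).values()) / len(blocks)
    return -math.log2(p_max)

# Hypothetical 32-bit samples of the error variable.
data = [0xBD14F3A2, 0x3C8841D7, 0xBC02A9E0, 0x3D77C1B5, 0xBB9E6F04]
for t in (1, 2, 3):
    print(t, [round(window_min_entropy(data, s, t), 2) for s in range(0, 32 - t + 1)])
```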

Each point on the X-axis of each curve corresponds to the starting position of a t-bit block. For t = 2 (second curve from the bottom), for example, the first point corresponds to the block consisting of the first and second bits, the second point to the block consisting of the second and third bits, and so on. The Y-axis shows the min-entropy of the block. The graphs in Figure 6.7a are for all the data from all users in all levels. Figures 6.7b, 6.7c and 6.7d show the results for level A, level B and level C, respectively.

The graph shows that the min-entropy of the most significant bit (MSB) is high, while the following 5 bits have min-entropy close to zero. The MSB corresponds to the sign bit of the floating point number, as described in IEEE 754 for the single-precision floating point format. This sign bit shows whether the arrow hits the target below or above the center. The next 5 bits are the first bits of the 8-bit exponent in this representation. Since the output of the game is in the interval [−120, 120], the exponent part of the output numbers uses fewer than 8 bits, and so these values have almost zero entropy. The rest of the bits in the output have high min-entropy; this is especially true for bits in positions 20 to 32. The graphs show that the higher entropies are contributed by the less significant bits of the output, which correspond to the small errors of the players. These small errors are independent of the user and the level of the game. This suggests that the min-entropy contributed by these bits is present for all users and levels of the game. Thus, even level A of the game can be expected to generate a good amount of min-entropy from these bits. This conclusion is also confirmed by the other experiments in Section 6.3.4.

F) Randomness required by the computer.

As noted earlier, the least significant bits of the output correspond to the accumulation of small errors in different parts of the system and contribute most to the min-entropy. Thus even the sequence collected from level A, without any random input from the computer, could be used for entropy generation. To confirm this observation we asked the most experienced player (with the highest score) to play level A again, after they had played levels A, B and C for more than 1200 rounds, and measured the min-entropy of this data. The player had 20 arrows to hit the target, and with each shot to the center a bonus arrow was given. The user played 3 games, totaling 331 shots at the target. With 83% of the shots hitting the center, the estimated min-entropy of the player in these 331 shots was roughly 0.521 per bit.

This suggests that the sequence generated by the game has a minimum min-entropy independent of the randomness used by the game (computer). For higher levels of the game that require randomness, one can use pseudorandom generators in real time, or generate the sequences beforehand and use them as needed.

6.4 Human game-play in zero-sum games for randomness generation

In this section we describe another approach to generating randomness using human game-

play in zero-sum games. In this proposal, human game play against a computer is the only

source of randomness. The game consists of a sequence of sub-games. Each sub-game is a

simple two-player zero-sum game between the user and the computer, which can be seen as an extended matching pennies game. In each sub-game the human makes a choice among a number of alternatives, each with the same probability of winning, so that the user’s best strategy is a random selection among their possible choices. The first sub-game corresponds to the entropy generation step of the TRG, and subsequent sub-games correspond to the steps of an extractor algorithm.

The TRG algorithm is based on a seeded extractor that is constructed using an expander graph. Expander graphs are highly connected d-regular graphs where each vertex is connected to d neighbors. This notion is captured by a measure called spectral expansion. It has been proved that random walks on these graphs can be used to extract randomness [AB09]. Assuming an initial probability distribution P on the set of vertexes of the graph, it is proved [AB09] that by taking a random walk of ℓ steps from a vertex chosen according to P, one ends up at a vertex that represents a distribution over the vertexes that is ε-close to the uniform distribution. In other words, starting from any distribution, each step of the random walk results in a new distribution over the vertexes that is closer to uniform, and so by taking a sufficiently long walk one can obtain a distribution that is ε-close to the uniform distribution.

We use human input to provide the required randomness in the framework above: that is, for the initial distribution P as well as the randomness for each step of the walk. To obtain randomness from the human, a sequence of games is presented to the user and the human input in the game is used in the TRG algorithm. In the first sub-game, the graph is presented to the user, who is asked to randomly choose a vertex. This choice represents a source symbol generated according to some unknown distribution P; that is, the human choice is effectively a symbol of an entropy source. Human choices, however, although they have some entropy, cannot be assumed to be uniformly distributed. A subsequent random walk of length ℓ over the graph is used to obtain an output symbol of the TRG with a close-to-uniform randomness guarantee.

To use human input for the generation of the random walk, on each vertex of the graph

the user is presented with a simple game which effectively requires them to choose among

the set of the neighboring vertexes. The game is zero-sum with a uniform optimal strategy, and so the human input would correspond to a uniform selection, and consequently to one random step on the graph. For a given ε and an estimate of the min-entropy of the initial vertex selection, one can determine the number of random steps required so that the output of the TRG has the required randomness.

In the above we effectively assume that human input in a zero-sum game with uniform optimal strategy is close to uniform. This assumption is reasonable when the human is presented with a few choices (based on the experiments in [RB92]). In practice, however, the human input will only be close to uniform, and so the proposed extraction process can be seen as approximating the random walk by a high min-entropy walk. Obtaining theoretical results on the quality of the output of an expander graph extractor when the random walk is replaced with a walk of high min-entropy is an interesting theoretical question. We, however, demonstrate the feasibility of this approach experimentally.

6.4.1 This work

We designed and implemented a TRG that is based on a game on a 3-regular expander graph with 10 vertexes. The game consists of a sequence of sub-games. A number of screen shots

of the game are shown in Figure 6.8. In each sub-game the human and the computer make

a choice from a set of vertexes. If the choices coincide, the computer wins; if they do not coincide, the human wins. In our implementation the human first makes a choice and then the computer’s choice is shown. In the first sub-game the user makes a choice among the 10 vertexes of the graph, and in all subsequent sub-games, among the 3 neighbors of the current vertex. We performed a number of experiments to validate our assumptions.

6.4.1.1 Implementation and experiments

We implemented the above game and experimented with nine human users playing the games. We measured the min-entropy of the human input in the first sub-game, that is, when the human is an entropy source, and also in subsequent sub-games, when the human input is used to emulate the random walk. For the former, that is, to estimate the initial distribution P, we designed a one-round game which requires the user to choose a vertex of the graph; the user wins if their choice is not predicted by the computer. We used the NIST tests [BK12] for estimating the min-entropy of both distributions. The details of the experiments are given in Section 6.4.4. Our results once again show that humans, once engaged in a two-party zero-sum game with uniform optimal strategy, are good sources of randomness. The min-entropy of the human choices in the first sub-game is 2.1 bits per symbol (10 symbols) on average, and in the subsequent sub-games it is 1.38 bits per symbol (3 symbols) on average. Compared to the maximum available entropy of the source on the corresponding number of symbols, i.e., log2 10 ≈ 3.32 and log2 3 ≈ 1.58, this indicates that the human choices are indeed close to uniformly random and the final output of the TRG is expected to be close to random.

6.4.1.2 Applications

TRGs are an essential component of computing systems. User-based TRGs add an extra level of assurance about the randomness source: users know that their own input has been used to generate the randomness. An example application of this approach is generating good random keys using the user’s input. Asking a user to directly generate a random 64-bit key will certainly result in a biased string. Using the approach presented in this chapter, the user can select an element of the space (say a password with 13 characters) randomly. The user’s choice is used as the initial entropy source, and the subsequent games ensure that the final 64 bits are close to uniform. Assuming a 3-regular expander graph with 10 vertexes, one needs 3 steps in the expander graph to reach a distribution that is 1/4-close to uniform. Section 6.4.3.4 further discusses how longer sequences can be generated.

6.4.2 Background on expander graphs

Expander graphs are well-connected graphs, in the sense that to make the graph disconnected one needs to remove a relatively large number of edges. The connectivity of a graph can be quantified using measures such as the minimum number of neighboring vertexes over all sub-graphs, or the minimum number of edges that leave a sub-graph (minimums are taken over all subgraphs of a certain size) [HLW06]. For a d-regular graph, the second eigenvalue of the adjacency matrix captures the connectivity of the graph. This measure is referred to as the spectral expansion.

The normalized adjacency matrix of a d-regular graph with n vertexes is an n × n matrix with A_{i,j} = 1/d if vertexes i and j are connected by an edge, and zero otherwise.

6.4.2.1 Expander graphs as extractors

Given a graph and a starting vertex X, one can make a random walk of length ℓ by randomly choosing one of the neighbors of X, say X1, moving to X1, then randomly choosing one of the neighbors of X1, say X2, and repeating this ℓ times.

Let G denote an expander graph with normalized adjacency matrix A, and let P denote an initial distribution on the vertexes of G. After one random step, the distribution on the vertexes is given by AP and becomes closer to uniform. That is, the statistical distance between the distribution on the graph vertexes and the uniform distribution decreases. Continuing the random walk on the graph for ℓ steps, the distribution on the vertexes becomes A^ℓ P and gets closer to the uniform distribution. The rate of convergence to the uniform distribution for d-regular expander graphs is determined by the second eigenvalue of the normalized adjacency matrix of the graph, which is denoted by λ from now on.
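The convergence behaviour can be checked numerically. The sketch below (ours, illustrative only) builds the normalized adjacency matrix of a small 3-regular graph (the Petersen graph, which also appears later in this chapter) and shows the ℓ2 distance to uniform shrinking with each step of the walk.

```python
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),          # outer 5-cycle
         (0, 5), (1, 6), (2, 7), (3, 8), (4, 9),          # spokes
         (5, 7), (7, 9), (9, 6), (6, 8), (8, 5)]          # inner 5-cycle (pentagram)
n, d = 10, 3
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0 / d                           # normalized adjacency matrix

P = np.zeros(n); P[0] = 1.0                               # a very non-uniform initial distribution
unif = np.full(n, 1.0 / n)
for step in range(1, 7):
    P = A @ P                                             # one step of the random walk
    print(step, round(float(np.linalg.norm(P - unif)), 6))  # l2 distance to uniform shrinks
```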

Lemma 6.4.1 [AB09, Lemma 21.3] Let G be a regular graph over n vertexes and P be an initial distribution over G’s vertexes. Then we have

$$\bigl\| A^\ell P - \mathbf{1}_n \bigr\| \;\le\; \lambda^\ell,$$

where ‖·‖ is the ℓ2 norm, defined as $\|X\| = \sqrt{\sum_{i=1}^{n} x_i^2}$ for the vector $X = (x_1, x_2, \ldots, x_n)$, and $\mathbf{1}_n$ is the probability vector corresponding to the uniform distribution over the n vertexes.

The random walk on an expander graph explained above gives the following extractor

construction.

Lemma 6.4.2 [AB09, Lemma 21.27] Let ε > 0. For every n and k ≤ n, there exists an explicit (k, ε)-extractor Ext : {0,1}^n × {0,1}^t → {0,1}^n, where t = O(n − k + log 1/ε).

The above lemma assumes an expander graph with λ = 1/2, but in general for an arbitrary

λ and min-entropy k, we can derive the following theorem from the above lemmas:

Proposition 6.4.1 Let 1_n be the vector corresponding to the uniform distribution and let X be a k-source with probability distribution P over {0,1}^n. Let G be a d-regular expander graph over 2^n vertexes with normalized adjacency matrix A. For a random walk of length ℓ over the graph, starting from a vertex selected according to the distribution P, we have

$$\Delta\bigl(A^\ell P;\ \mathbf{1}_n\bigr) \;\le\; \frac{1}{2}\,\lambda^\ell \sqrt{n}\,\bigl(2^{-k/2} + 2^{-n/2}\bigr).$$

Proof. The proof follows from the proofs of Lemmas 6.4.1 and 6.4.2:

$$\begin{aligned}
\Delta\bigl(A^\ell P;\ \mathbf{1}_n\bigr) &= \frac{1}{2}\sum_{a\in\{0,1\}^n}\bigl|\Pr[A^\ell P = a] - \Pr[\mathbf{1}_n = a]\bigr| &(6.1)\\
&\le \frac{1}{2}\sqrt{n}\,\bigl\|A^\ell P - \mathbf{1}_n\bigr\| &(6.2)\\
&\le \frac{1}{2}\sqrt{n}\,\lambda^\ell\,\bigl\|P - \mathbf{1}_n\bigr\| &(6.3)\\
&\le \frac{1}{2}\sqrt{n}\,\lambda^\ell\,\bigl(2^{-k/2} + 2^{-n/2}\bigr), &(6.4)
\end{aligned}$$

where equation (6.1) follows from the definition of statistical distance, equation (6.2) follows from the relation between the ℓ2 and ℓ1 norms, i.e., $|V|_1 \le \sqrt{n}\,\|V\|$, equation (6.3) comes from the proofs of Lemmas 6.4.1 and 6.4.2, and equation (6.4) follows from the linear algebra fact $\|P - \mathbf{1}_n\| \le \sqrt{\|P\|^2 + \|\mathbf{1}_n\|^2}$ together with the fact that the min-entropy of P is k, which gives $\|P\|^2 \le 2^{-k}$. □

For an expander graph G, given ε and k, the min-entropy of the initial distribution on the vertexes, we can compute the number of steps of the random walk on the expander graph required so that the distribution on the graph vertexes becomes ε-close to the uniform distribution. Note that the min-entropy of the initial vertex distribution already results in some closeness to the uniform distribution, but the random walk amplifies this closeness.

Let λ = 2^{-α} and ε = 2^{-β}. To be ε-close to uniform, we must have ½√n λ^ℓ (2^{-k/2} + 2^{-n/2}) ≤ ε. This gives us the following lower bound on ℓ:

$$\ell \;\ge\; \frac{1}{\alpha}\Bigl[\beta - 1 + \log(\sqrt{n}) + \log\bigl(2^{-k/2} + 2^{-n/2}\bigr)\Bigr]. \qquad (6.5)$$

The above bound requires the value of λ for the graph. Equation (6.5) shows that, for a given ε and min-entropy, a smaller λ corresponds to a shorter random walk. So one needs to find graphs with small λ.

The following theorem gives a lower bound on how small λ can be for regular graphs.

Theorem 6.4.1 [AB09, Section 21.2.1] For a constant d ∈ ℕ, any d-regular, N-vertex graph G satisfies λ ≥ 2√(d−1)/d (1 − o(1)), where the o(1) term vanishes as N → ∞.

Ramanujan graphs are d-regular graphs that achieve λ ≤ 2√(d−1)/d and so are excellent spectral expanders. For a fixed d and large N, the d-regular N-vertex Ramanujan graph minimizes λ. There are several explicit constructions of Ramanujan graphs. In the following we explain one of the simpler constructions.

6.4.2.2 A simple explicit construction for expander graphs

There are explicit constructions of expander graphs that can be efficiently generated; that is, the vertexes are indexed by i ∈ I and there is an algorithm that, for any i, generates the indexes of its neighbors. For example, the p-cycle with inverse chords construction gives a 3-regular expander graph with p vertexes, p prime, in which a vertex X is labeled by x ∈ {0, …, p−1} and the neighboring vertexes have indexes x−1, x+1 and x^{-1}. Here all arithmetic is mod p and 0^{-1} is defined to be 0. The spectral expansion of this graph is very close to the above-mentioned bound. The construction is due to Lubotzky, Phillips and Sarnak [LPS86], and the proof that the construction is a Ramanujan graph uses deep mathematical results. The λ for this graph is upper bounded by 0.94. Other explicit constructions of expander graphs use graph product techniques such as the zig-zag product and the replacement product [RVW00].
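The neighbor rule of the p-cycle with inverse chords is easy to implement; a minimal Python sketch is below (the prime used is only illustrative, whereas a real design would pick a prime near 2^64 as discussed later).

```python
def neighbors(x: int, p: int) -> tuple:
    """Neighbors of vertex x in the p-cycle with inverse chords (p prime):
    x-1, x+1 and x^{-1}, all mod p, with 0^{-1} defined to be 0."""
    inv = 0 if x == 0 else pow(x, p - 2, p)   # Fermat's little theorem: x^(p-2) = x^-1 mod p
    return ((x - 1) % p, (x + 1) % p, inv)

# Example with a small prime (illustrative only).
print(neighbors(0, 13), neighbors(5, 13))     # -> (12, 1, 0) and (4, 6, 8)
```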

6.4.3 The TRG design

We propose an integrated approach to constructing TRGs using human input in games. Important properties of this approach are: (i) the only source of randomness for the TRG is the human input, and (ii) the final result has a guaranteed and adjustable level of randomness. The latter property means that the user can improve the quality of the output randomness at any time, and to any level of closeness to uniform randomness, simply by adjusting the length of the random walk. In comparison, in the construction of Halprin et al., (i) the entropy source is based on human input but a second, external source of perfect randomness is required to provide a seed for the extractor, and (ii) the size and quality of the final output depend on the extractor that is applied to the output of the entropy source; changing the quality of the final output requires replacing the extractor and performing the calculations from scratch.

The outline of the approach is given in Section 6.4 and includes: (i) choosing an expander graph with appropriate parameters, and (ii) designing an appropriate game that is “attractive” to play and has the required property (a uniform optimal strategy for each sub-game).

6.4.3.1 Choosing the expander graph

The starting point is choosing a d-regular expander graph with an appropriate number of vertexes 2^n, each vertex labeled with a binary string of length n. The two parameters n and d are directly related to the computational efficiency of the system in generating random bits: a larger n means a longer output string and more random bits, and a larger d means faster convergence to the uniform distribution, and so a shorter walk to reach the same level of closeness to the uniform distribution (see Theorem 6.4.1). In practice, however, because the graph is the basis of the game’s visual presentation to the user, one needs to consider the usability of the system, and so the choice of n and d must take this factor into account. Another important requirement is that the steps of the random walk must correspond to an independent and uniformly distributed random variable. Experiments in human psychology [RB92] show that bias increases in human choices for larger sets of possible choices, and thus the random walk generated by human input would be farther from uniformly random. In Section 6.4.3.4 we discuss issues that arise in choosing the graph and in extending the approach when longer sequences of random bits are required.

6.4.3.2 The game structure

The TRG algorithm uses a game between human and computer. The game consists of a

sequence of sub-games. In all sub-games the human makes a choice and wins if their choice

is not correctly guessed by the computer. Each sub-game is an extended matching pennies

game which is known to have uniform optimal strategy for both players. The main difference

between the initial and subsequent sub-games is the number of choices available to the user,

and the way these choices are used. The computer choices are made by a pseudorandom

number generator.

• Initial sub-game: In the initial sub-game the user chooses a vertex from the set of 2^n possible vertexes in the graph, and wins if the computer cannot “guess” their choice.

• Walk sub-games: Each subsequent sub-game corresponds to a single step of the random walk. At vertex V, the player can choose any of the vertexes that are adjacent to V. Using a d-regular graph ensures that the number of choices at every vertex is the same.

The game proceeds in ℓ steps, where ℓ is determined by the required quality of the final output randomness and the min-entropy of the input. In practice, one needs an estimate of the user’s min-entropy in the initial game, when choosing among 2^n vertexes, to be used together with d, the degree of the graph, in equation (6.5) to obtain an estimate of the number of steps on the graph for a chosen value of ε.
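The overall control flow can be sketched as follows. This is our own illustrative outline, not the thesis implementation: the function names are ours, the prime p is arbitrary, and the “human” choices are simulated with a pseudo-random generator purely so that the sketch runs; in the real system both callbacks would be driven by the player’s input in the sub-games.

```python
import random

def trg_output(get_initial_choice, get_step_choice, neighbors, num_vertexes, steps):
    """Outline of the TRG: an initial vertex choice followed by `steps` walk steps,
    each choosing one of the current vertex's neighbors. Human input is abstracted
    behind the two callback functions."""
    v = get_initial_choice(num_vertexes)          # first sub-game: the entropy source
    for _ in range(steps):
        nbrs = neighbors(v)
        v = nbrs[get_step_choice(len(nbrs))]      # walk sub-game: one step of the walk
    return v                                      # the label of the final vertex is the output

# Toy run on the p-cycle-with-inverse-chords graph with simulated "human" choices.
p = 101
nbrs = lambda x: ((x - 1) % p, (x + 1) % p, 0 if x == 0 else pow(x, p - 2, p))
print(trg_output(random.randrange, lambda k: random.randrange(k), nbrs, p, steps=12))
```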

To estimate the min-entropy of the user in the initial game, we developed a second game that simply asks the user to choose a vertex; the user wins if their choice is not correctly guessed by the computer. In our experiments the estimated min-entropy for a graph of 10 vertexes is 2.32 bits, which, although lower than the min-entropy of the uniform distribution (log2 10 ≈ 3.32), still shows a high level of min-entropy.

Figure 6.8: The game: (a) game start, (b) initial vertex selection, (c) first step of the random walk, (d) computer wins

Although humans play differently, assuming they behave “similarly” one can obtain an average value for the min-entropy of the initial choice for a graph of size n over a large population of users, and use that to estimate the number of steps.

6.4.3.3 The game design

Our basic game is a combination of hide-and-seek and Roshambo (rock-paper-scissors). The hide-and-seek game corresponds to the initial sub-game, where the human chooses from a number of vertexes to start the game. The Roshambo game corresponds to the second type of sub-game (i.e., the random walk). For the expander graph, we used the Petersen graph, which is a 3-regular graph with 10 vertexes.

In the first sub-game, the human player starts the game by selecting a vertex from the

graph G. This will place a sheep on the vertex (Figure 6.8-b).

The computer responds by placing a wolf on a random vertex. This is similar to the hide

and seek game. The player loses if the wolf and sheep are on the same vertex. If the player wins, then the game will highlight the vertexes that are adjacent to the selected vertex by the

player (marked by a star in figures). The same sequence of steps (user’s choice (Figure 6.8-c),

followed by the computer choice (Figure 6.8-d) is now played using the highlighted vertexes

instead of all vertexes in the graph, and the human will win (Figure 6.8-c) if the choices are

different and lose otherwise. The winner is the player with higher final score.

6.4.3.4 The game analysis

The graph has only 10 vertexes, and each step of the walk can generate at most log2 3 bits of entropy. In a real-world design, a large 3-regular expander graph (e.g., with roughly 2^64 vertexes) can easily be constructed using the construction in Section 6.4.2.2. To construct such a graph, the largest prime number smaller than 2^64 is chosen as the number of vertexes. For such a graph, λ is estimated to be 2√(3−1)/3 ≈ 0.94. The choice of 3-regular graphs is based on the psychological experiments [RB92] showing that too many choices increase the bias of human selection. To prevent this, we recommend using only 3-regular graphs for the construction of the expander graph.

Now let ε = 2^{-15}, and suppose the min-entropy of the initial selection is 32 (i.e., 0.5 per bit). Then we need 12 steps of the random walk on the graph to get 2^{-15}-close to the uniform distribution:

$$\ell \;\ge\; \frac{1}{\alpha}\Bigl[\beta - 1 + \log(\sqrt{n}) + \log\bigl(2^{-k/2} + 2^{-n/2}\bigr)\Bigr] \approx 11.23,$$

where α ≈ 0.089, β = 15, k = 32, n = 64.

Note that although we cannot find the exact value of the min-entropy of the initial selection, it is possible to find a good estimate experimentally, by collecting many samples from different human players and using statistical models to estimate the average min-entropy. Equation (6.5) can then be used to find the minimum number of steps required for the final distribution to be ε-close to the uniform distribution.
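As a quick sanity check of the numbers above, the short sketch below solves the closeness condition of Proposition 6.4.1 directly for ℓ; λ is rounded to 0.94 as in the text, so the small difference from 11.23 comes only from rounding of λ and α.

```python
import math

lam, eps, n, k = 0.94, 2.0 ** -15, 64, 32   # parameters quoted in the text

# Solve (1/2) * sqrt(n) * lam**l * (2^{-k/2} + 2^{-n/2}) <= eps for l.
rhs = 2 * eps / (math.sqrt(n) * (2.0 ** (-k / 2) + 2.0 ** (-n / 2)))
l = math.log(rhs) / math.log(lam)
print(round(l, 2), math.ceil(l))            # ~11.2, so 12 walk steps suffice
```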

The graph will appear as a continuous circle. For the first sub-game, the human chooses a random vertex from the graph, or equivalently a random point on the circle. Using a circular representation of the graph has the advantage that all vertexes appear with the same value (importance) and no end points are left out because of their location; this avoids the human tendency to avoid corners, as observed in [Wag72, HN09].

After the initial vertex selection, the graph is displayed locally (only the vertexes adjacent to the user’s currently selected vertex), and the user is asked to play the second sub-game.

We implemented the game using HTML5 technology so that it can be run and played on any system with an Internet browser, including touch-screen devices such as tablets and smart phones. The game is simple to learn and intuitive, and can be accessed online [Ali13b].

6.4.4 Experiments

We designed a game between human and computer and asked each player to play the game

for at least 1000 rounds. This is sufficient to run the required min-entropy and statistical

tests.

Our objectives in the experiments are the following:

1. Estimate the min-entropy of human choices in sub-game 1, which is required for

the extraction component of the TRG. As noted earlier, expander graphs can

be used for extracting randomness if the min-entropy of the input is bounded

below.

2. Examine the statistical properties (e.g. min-entropy) of the walk to verify if

the walk made by the human is a good approximation of a random walk.

3. Examine the statistical properties of the final output.

We ran the above three tests for all participants; the results are presented below.

6.4.5 Measures of randomness for our game

In our first experiment, we measured the min-entropy of the players’ initial selection of a vertex on the graph. To measure min-entropy, we used the method in Section 6.2.1; the results are as follows. Table 6.1 summarizes the data we collected from 9 users. For each user, the first row is the total number of plays, the second row is the number of times the human choice collided with the computer’s choice (the wolf was placed on the sheep), the third row is the probability that a collision occurred (the number of collisions divided by the total number of plays), and the last row is the min-entropy of the player’s choices per bit. We then counted the number of times each vertex was selected by the human; the expected behavior is that each vertex is selected roughly the same number of times.

User          1     2     3     4     5     6     7     8     9
Total Shots   1770  2141  3980  3439  2021  652   1685  905   983
Collisions    150   322   365   348   149   70    167   45    112
Probability   0.08  0.14  0.09  0.1   0.07  0.1   0.1   0.05  0.11
Min-entropy   0.45  0.49  0.61  0.65  0.52  0.49  0.47  0.55  0.51

Table 6.1: Min-entropy of users in the first sub-game

In conclusion, for the first sub-game the lowest min-entropy is 0.45 per bit, and on average we expect a min-entropy of 0.52 per bit.
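A per-symbol min-entropy of the kind quoted in Section 6.4.1.1 (about 2.1 bits per choice over 10 vertexes) can be illustrated with the simple most-common-value estimate below; this is only a rough stand-in for the NIST estimation used in the thesis, and the sequence of choices is hypothetical.

```python
import math
from collections import Counter

def symbol_min_entropy(choices):
    """Per-symbol min-entropy of the observed vertex choices: -log2 of the most
    frequent choice's empirical probability."""
    p_max = max(Counter(choices).values()) / len(choices)
    return -math.log2(p_max)

# Hypothetical record of initial vertex choices (vertexes 0..9) for one player.
choices = [3, 7, 1, 9, 3, 0, 5, 3, 8, 2, 6, 3, 4, 7, 1, 9, 0, 5, 8, 2]
h = symbol_min_entropy(choices)
print(round(h, 2), "bits per choice; a uniform choice would give", round(math.log2(10), 2))
```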

6.4.5.1 Random walk game

In the second experiment, we collected data from the participants’ game-play over a long sequence of rounds. We expect the choice of neighbors to be a uniformly random selection over the set of adjacent vertexes. We map the set of adjacent vertexes to the set {1, 2, 3} based on an ordering of the vertexes (e.g., an ordering of their labels, which are numbers). To examine this in practice, we applied the statistical tests in Section 6.2.3 to measure the statistical properties of this sequence. The results are summarized in Table 6.2, which gives a summary of a subset of the tests and their corresponding p-values for the output. The data used to generate this table are from one sample player. We also examined the data from the other players; all statistical tests were passed with p-values far from 0 or 1.

Statistical Test        Result
LinearComplexity        0.87
LempelZiv               0.59
SpectralFourier         0.63
Kolmogorov-Smirnov      0.47
PeriodsInString         0.51
HammingWeight           0.85
HammingCorrelation      0.23
HammingIndependence     0.78
RunTest                 0.13
RandomWalk              0.29

Table 6.2: Results of the statistical tests

We also calculated the min-entropy of the random walk (the human choices at each step) to further confirm the random properties of this sequence. Table 6.3 summarizes the results:

User          1     2     3     4     5     6     7     8     9
Total Shots   370   369   1009  1786  560   1071  4821  1190  1065
Collisions    118   118   237   595   140   373   1624  379   334
Probability   0.31  0.29  0.23  0.33  0.25  0.34  0.33  0.31  0.31
Min-entropy   0.49  0.51  0.64  0.75  0.78  0.72  0.83  0.61  0.79

Table 6.3: Min-entropy of users in the second sub-game

The results show that the min-entropy of the players is higher when there are fewer choices (here 3). The min-entropy of the initial selection of vertexes can be measured; assume it is k.

Using the results in Lemma 6.4.2, the final output of the game would be ε-close to uniform if the walk were uniformly random. Using the above results, we note that the walk is close to a uniformly distributed random walk. To examine the effect of this discrepancy, we ran the set of statistical tests on the final output; the results are summarized in the following table:

Statistical Test        Result
LinearComplexity        0.51
LempelZiv               0.73
SpectralFourier         0.34
Kolmogorov-Smirnov      0.23
PeriodsInString         0.13
HammingWeight           0.55
HammingCorrelation      0.69
HammingIndependence     0.29
RunTest                 0.60
RandomWalk              0.67

Table 6.4: Results of the statistical tests

The numbers in Table 6.4 are the p-values of each test. A test is passed if these values are far from 0 or 1; a margin of 0.001 is usually accepted for a test to be considered passed.

Overall, the experiments show the viability of the approach in practice.

6.5 Comparison to Halprin et al. approach

In the psychological experiments that showed human choices to be close to uniform in competitive games, the choices were limited to a small number (2 or 3) [RB92], and more choices add bias to the selection. Nevertheless, the game designed by Halprin et al. [HN09] offered many choices to the human, and an extra source of randomness was used to eliminate the bias. This was done to keep the entropy rate high, since having only a few choices generates only a few bits of randomness per round. In the approach discussed in Section 6.4, we proposed a game design along the lines of the Halprin et al. [HN09] game-theoretic approach in which the human has only 3 choices and the randomness extraction is done as part of the game design, with no need for a seed. We showed that this design still keeps the rate of min-entropy high because of the extractor built into the game, which needs no extra randomness. These approaches are fundamentally different from the approach discussed in Section 6.3, which uses the complexity of the process that generates the human input in the game as the entropy source. That approach is more in the spirit of sampling a complex process such as disk access, now using the human and computer interaction as the complex process. To use Halprin et al.’s approach in practice, one needs to design a two-party game with a supporting game-theoretic analysis showing that the optimal strategy is uniform. The next step is to convert the game into an interesting game between the human and the computer and to validate that the human plays as expected (i.e., is able to simulate the optimal strategy). In contrast, the approach of Section 6.3 can easily be incorporated into existing video games such as target-shooting games and does not need new game designs.

6.6 Concluding remarks

TRGs are an essential component of security systems. Using a human as an entropy source is particularly attractive in shared environments such as cloud services, where traditional sources of entropy (computing hardware and software) are shared among users and extra caution is needed to ensure that the randomness extracted from the entropy sources is not correlated across users sharing the services. There are a number of extensions and directions for future work. On the theoretical side, for the approach discussed in Section 6.4, the analysis of randomness extractors based on expander graphs when the random walk is replaced by a walk with a guaranteed min-entropy is an interesting open question. On the implementation and experimental side, we noted that for generating larger strings, for example 64- or 80-bit strings, the full graph cannot be presented to the user.

Here one needs to find ways of enabling the user to make the initial selection of the string with “good” initial min-entropy, and then use the portions of the graph corresponding to the neighbors of the current vertex to receive the user’s input for the random step at that vertex. The choice of the graph, and creating an “interesting” game and interface that encourage random selection, will improve the effectiveness of the approach and the usability of the system. We analyzed strings generated by nine users. Wider user studies are required to measure the min-entropy and statistical properties of strings at different stages (entropy source, random walk and final output), as well as the usability of the system. Using human input also improves the trustworthiness of the generated randomness. Hardware faults or malicious tampering with entropy sources may result in biases in the randomness that are not detectable. Our approach is protected against such faults and malicious tampering.

Our work is the first construction of a full TRG that uses only human game-play. Using human users to construct TRGs with higher rates is an interesting direction for future work.

In Section 6.3 we proposed and analyzed a new approach to entropy collection using human errors in video games. We verified the approach by developing a basic, intuitive game and studying the sequences generated by users. The approach can be generalized to various types of popular online games with simple modifications. We ran a set of experiments to find the best choice of game parameters for random number generation. Our experiments showed that with this simple design, and considering the “worst” case where the user was experienced and made the least error, the rate of entropy generation is at least 15 bits per shot. This rate can be increased by adding variability to the game and also by using multiple measurable variables instead of only one. Adding variability to the game increased the min-entropy by 7 bits per round. In choosing parameters one needs to consider the attractiveness of the game: an immediate increase in entropy can be obtained if game constants, such as the gravitational force in our case, are changed without the user’s knowledge; however, this would decrease the entertainment factor of the game. Studying these factors, and in general the randomness generated by users, needs a larger user study, which is part of our future work.

For randomness extraction we implemented and used t-resilient extractors. The output from the extractors passed all statistical tests.

Our work opens a new direction for randomness generation in environments without computational capability or randomness generation subsystems, and provides an attractive solution in a number of applications, such as providing an independent entropy source for servers with limited or untrusted randomness generation devices.

Chapter 7

Human game-play for user authentication

In this chapter, a user authentication mechanism using human game-play is proposed. Although human game-play has random properties, our experiments showed that it leaves a footprint of the user when a sufficient number of features is collected. Our empirical study showed that the game-play is also not easily emulatable by another human, even given statistical information about the collected features, and thus provides a mechanism to detect sharing of authentication information.

7.1 Introduction

In user authentication systems, a prover and a verifier engage in a protocol at the end of which the verifier will be convinced about the identity of the prover. Authentication systems

must provide security and usability, that is, ensure that authentication succeeds for the correct user with valid credentials, and at the same time that the system is easy and flexible to use in various applications and settings. These two properties are generally in conflict: more secure systems rely on special hardware, either in the form of a hardware token or a special reader for collecting user biometrics, or on complex protocols. Password systems are highly usable but well known not to provide sufficient security [Che13].

A particularly important insecurity of password systems is that passwords can be shared [Kay11, SCD+07]. Sharing passwords in work environments is a well-known problem [Cor11]: executives share their passwords with their administrative assistants, and assistants share passwords among each other. This sharing is done for convenience and, although desirable for the workers, it exposes a security risk even though it is primarily among office workers and within the perimeter of an office. A different type of sharing happens in the work-from-home scenario: an employee shares their password with a third party as a way of delegating their work. In this case the third party has the privileges of an office worker in accessing the corporate system in their role. The risk of such access is enormous: in the case of a software developer who delegates their work to an outside worker, not only is the quality of work at stake, but the intellectual property of the company is also at risk. Even worse, the identity of the outside worker is completely hidden. Although hardware tokens make this harder, they cannot prevent delegation, as reported in [Ent13] where a developer in the US outsourced their work to a Chinese firm by sending the RSA token used for secure access, using FedEx. Another well-known example

of undesirable sharing of credentials is in subscription-based services such as HBO GO and

Netflix [Wor13, But12].

In this chapter we consider the problem of delegation in authentication and aim at

developing a Hard to Delegate (HtD) authentication system; that is, a system that makes it

hard for users to delegate their authentication information.

An immediate solution to HtD authentication is to use private information of users as

authentication credentials, or to link the credentials to such information. This solution, however, is not only unacceptable because of the privacy concerns of workers and the danger of possible compromises of the company's security systems, but also may not provide sufficient entropy for secure authentication. Other possible solutions include preventing multiple simultaneous connections with the same credentials and restricting the source IP of the connection, which would be effective in certain scenarios such as subscription-based video streaming services, but cannot prevent workers from delegating their work, for example by using proxies.

The approach proposed in this chapter is to use behavioral biometrics. Behavioral

biometrics are user-specific information related to aspects of their behavior, including typing rhythm, voice, gait, game-playing skills and timing of actions [GF04]. Not all behavioral biometrics are suitable for HtD authentication. In particular, features related to physical properties such as one's voice or facial features may be replicable by recording these features and replaying them as needed.

Behavioral biometrics provide a number of advantages over traditional biometrics. They

can be collected non-obtrusively, and even without the knowledge of the user (although

this will be a privacy breach). Collection of behavioral data often does not require any

special hardware and so can be very cost effective. While behavioral biometrics may not be

sufficiently unique to provide reliable identification, they have been shown to provide accurate

identity verification [Rev08]. In [MMC+11], behavioral biometrics are considered for continuous authentication of users to the system. This is a very useful property that ensures that a user who has logged on to the system has persisted in their session, and that the session or the account has not been compromised. The goal of non-delegation is, however, different: here security concerns the initial login and preventing a valid user from sharing their credentials. The two types of

systems can be used in a complementary way: a secure session authenticated using a HtD

game can stay continuously authenticated using an appropriate continuous authentication

system.

7.1.1 Our contribution

We propose a model for the HtD authentication that captures traditional attacks in authenti-

cation as well as the new HtD property. In our model the authentication protocol is between

an untrusted prover and a trusted verifier. The verifier sends challenges to a prover with a

claimed identity that is registered in the system, and receives the responses. At the end of

the protocol, possibly after multiple rounds of challenge and response, the verifier outputs a

single bit, Accept or Reject. We require the protocol to be correct, i.e. it outputs Accept if the claimed prover is actually responding to the challenges, and secure, i.e. it outputs Reject for an entity who tries to impersonate the prover without having the correct credentials (including valid tokens). These two are the known properties of authentication systems. We also require the Hard to Delegate property, where a user who has received the credentials of a valid user and is attempting to impersonate them will pass the protocol only with small probability. In other words, the protocol outputs Reject for an entity who has

all the transferable credentials of a different user and attempts to impersonate them. By

transferable credentials, we mean all credentials including passwords, keys, PIN numbers as well as hardware tokens.

The verifier uses a software agent that is programmed to execute the protocol. The prover

also uses a client agent that interacts through two interfaces: on one interface it interacts with the software agent of the verifier, and on the second interface it interacts with the human

user, presenting the challenge to the user and collecting the response.

We achieve HtD authentication using Hard to Emulate authentication games. The

main observation is that human game-play in computer games is influenced by intrinsic

personality/cognitive/human aspects, and skills of the user. In games with a rich set of

interactions, one may collect sufficiently many data points in game-plays to construct a user

profile, that can be used for user authentication with high accuracy. The data points capture

skill and behavior of the user, which with sufficiently many, could provide the HtD property.

In other words, different users will be represented by distinct profiles that can be used to

distinguish them, and generating a feature vector consistent with the profile of a different

user will be “hard” and tantamount to emulating the behavior and skill of the user in the

game. Here “hard” means long hours of dedicated training, which, as will be described later (Section 7.5.3), becomes infeasible if multiple features are collected from the human at once.

Collecting data points from human game-play has been considered in [YG09], where the user plays a game of poker and the feature points are game-specific and related to the user's strategy in poker game-play. Our approach is different and applicable to many non-trivial games that rely on human cognitive abilities, behavior and skill, producing a rich set of data points from human interaction with the game to capture human idiosyncrasies such as concentration, reaction time, strategies in simple interactions, speed of movement of the

mouse and the like. Each data point can be seen as a combination of multiple human intrinsic

features that is hard to emulate.

We call these games Hard to Emulate (HtE) games - intuitively meaning that the game-play

of a user cannot be emulated by another user.

We define α-FRR (False Rejection Rate) and β-FAR (False Acceptance Rate) for these

games in terms of the probability that a user is rejected incorrectly, and the probability that

a user is accepted incorrectly. In practice human features may change due to factors such as

practice and learning, age, incapacity and the like. However, profiles can be updated after

each successful authentication and so the current profile remains representative of the user.

To support this hypothesis we designed and implemented a target shooting computer

game where in each round of the game, the human is presented with a (moving) target, and

the expected human response is to aim an arrow at the target and release it such that it hits the target. This basic game allows us to collect data points including the time the human waits before grabbing the arrow (delay time), the time it takes to aim at the target and the precision of the hit (skill), the location of the first mouse click when first grabbing the arrow (personal preference), initial speed (understanding the physics of the game), and the frequency of misses of the target (skill). All the above features are collected in each game-play of the user.

The game was designed with multiple levels to examine the relevance of different features

and data points. The variations in levels were based on target movements, putting an obstacle

in front of the target, and limiting the visibility time of the target to restrict the time to

shoot at the target.

We showed the effectiveness of our approach empirically. We collected data from game-play

of 100 users. We used the data to construct user profiles and later measured the accuracy of the authentication system. We showed that the authentication game can correctly verify 91% of the users (9% FRR) with less than 6% FAR for all users. We also evaluated the non-delegability of the game-play. For this we ran a controlled experiment where user 1 was tasked with emulating user 2. User 1 could observe the game-play of user 2 and was also given the feature measurements as well as statistical information. The experiment clearly showed that features have different values in terms of non-delegability: some are emulatable (and so delegable) and some are harder. For example, emulating behavior related to the timing of actions was harder than emulating the number of misses. The results are detailed in Section 7.5.3.

The game is implemented and can be accessed on-line to test the HtD property.

7.1.1.1 Practical challenges

To have a practical implementation of HtD authentication using hard to emulate games, a

number of technical challenges must be addressed. As noted earlier the prover consists of the

human player and the client agent that interacts with the verifier agent. The human will play

the game on the screen displayed by the client, and the collected data is sent to the server. One must ensure that the only way of generating data to be sent to the verifier is through the client. A cheating player may try to bypass the client software agent and, instead of generating the feature vector through game-play, replay a feature vector built from information previously passed on by the legitimate user (in the case of delegation, to impersonate them) and send it to the verifier. Or the player may tamper with the client to pass on this information. These are the problems faced by the on-line game industry, and work on tamper-proofing games against these attacks (discussed in Section 7.4.2) can be applied directly here.

A second challenge is to prevent man-in-the-middle types of attack: that is, the delegatee passing on the received challenge to the legitimate user, who will solve the challenge and send the results back, to be relayed back to the verifier. The end result is that the delegatee will pass the authentication. First note that the attack is hard to orchestrate, as the legitimate user must be on-line right at the time of the challenge response. But in general, one can use assumptions on the location of the prover to bound the time it takes for the challenge and response, using distance bounding protocols. These protocols can be employed over the Internet with reasonable accuracy, once appropriate profiling of the user location is used. Such techniques however become less reliable for shorter distances, but are applicable in scenarios where the work is outsourced to workers in a different region or continent. The man-in-the-middle attack will remain a challenge when the delegatee is in close geographic

proximity of the user, if we assume both entities are on-line at the same time. By introducing

continuous authentication techniques, for example sending challenges at random times, one

can substantially reduce the success chance of the adversary in this attack (we use the term adversary for the collusion of the legitimate user and the delegatee).

In Section 7.4 we will consider implementing such systems in real world. We will first

discuss possible insecurities and attacks and then, propose a protocol that can protect against

the attacks.

7.1.1.2 Applications

Remote working. In remote working, an employee has to work remotely and we assume

this involves using a VPN service to connect to the infrastructure in the company to access

resources. For an employee who wants to delegate the work to a third party, the VPN authentication credentials must be shared so that the third party can log into the internal network of the company. In this case, an HtD authentication system can be incorporated into the VPN client to prevent the delegation. Any attempt to log into the internal network would need to solve the challenges of HtD authentication.

Subscription-based services. Users who subscribe to on-line services may share their subscription with a third party by passing on their credentials (possibly to share the subscription cost). Assuming that users need a client to access the service, HtD authentication can be used to prevent sharing of credentials. Here an HtD authentication system with a small false negative rate is required to avoid dissatisfaction of customers.

Cheat-proofing games. HtD games can be used to protect against players sharing their

account information with a third party. The incentive here could be similar to subscription sharing, as well as allowing a more experienced player to play on their behalf. In another scenario, an attacker may hijack an account [CH07] to take advantage of the user's progress in the game. The attacker may gain access to the game-play history of the user. This could also be prevented by incorporating HtD authentication in the game, collecting game-play points related to the behavior of the user and using them for continuous authentication. We did not examine

the overhead of the HtD authentication in a real-world game but for our prototype, HtD

authentication added about 2KB of data to every response sent to the game server, which would be negligible compared to the network communication in on-line games.

7.1.2 Related work

Secure delegation as a desirable property has been studied in a number of works [MW06, AJ09].

A legitimate user of a resource shares part of their privileges with another user in the system.

In cryptography one can delegate their signing right to another party [BPW12, HSMW06] without revealing their secret keys. If private keys are shared, there will be no mechanism

to distinguish between the original signer and the person who has received the shared key. In our scenario, we assume all transferable information, such as public/private keys, is shared, and we

still would like to be able to distinguish users.

Behavioral biometrics [Rev08] is a relatively new research area. Human-computer interaction based biometrics such as keystroke dynamics [MR00, ASL+05] and mouse movements [PB04, JY11] have been shown to be effective for this purpose. In [YG09], the authors showed

that measuring the player’s strategy in a poker game is also effective for user verification.

Alayed et al. [AFN13] used a first person shooter game to distinguish between normal

behavior of the players, and cheating behavior. The output of their classifier is a binary value,

indicating presence of cheating or no cheating.

The idea of using implicit memory for authentication was proposed by Denning et al.

[DBvDJ11] (US Patent [BDJ14]) to make password recovery easier compared to explicit

memory methods such as passwords. They showed that implicit learning of sets of images makes it very easy to recover the password and yet provides a strong password for authentication. Implicit learning is a concept in cognitive psychology where a pattern is learned without any conscious knowledge of it. Bojinov et al. [BSR+12] also used the concept of implicit learning to defend against “rubber hose attacks” in authentication. They designed a computer game in which the players implicitly learn a password for authentication such that they cannot be coerced into revealing it, because the players do not have any conscious knowledge of it. Our approach is different from the two since we require that even a collusion of a user with attackers should not result in their authentication to the system. HtD authentication

tries to prevent sharing of authentication information by using behavioral biometrics that

can not be imitated, such as skill and timing of action. Our method achieves the goals of

[DBvDJ11] and [BSR+12] but requires more, and thus we can consider it as a generalization

of the two works. The users in our system do not learn an implicit pattern, but we use a game

to measure their various abilities so that it becomes hard for someone without those abilities

to mimic their behavior. Our method also satisfies the goal of [DBvDJ11] since the users do

not need to explicitly memorize a secret, but they just use their abilities to accomplish a task.

It also satisfies the goal of [BSR+12] since not only can the user not be coerced into revealing their secret, but they cannot even willingly collude with the attackers to authenticate

them.

An extensive survey of behavioral biometrics can be found in [YG08]. The authors summarize various works on behavioral biometrics such as keyboard, mouse, gait, voice and game-play.

In Section 7.2, we describe the delegation problem in authentication by providing a model

of interactive authentication that is hard to delegate. In Section 7.3, we propose our approach

to HtD authentication, which is based on behavioral biometrics. In Section 7.4, we discuss practical

attacks on HtE games (Hard to Emulate) that can still allow delegation of authentication,

160 and propose that using cheat-proofing for the game and distance bounding can prevent these

attacks. In Section 7.5, we discuss our prototype design, the features we collect from users

and describe our experiment settings and the results.

7.2 Delegation in authentication

Secure authentication relies on what users know, what they have, what users are, as well as

attributes such as location and distance. Examples of what users know and have are passwords and tokens, respectively, and both can be passed on to others. What people are is often measured using biometrics and is usually considered nontransferable, although in many cases this assumption may not be valid for a particular system where users' biometric templates can be recorded or replicated. Studying biometric systems suitable for HtD authentication is

an interesting extension of this work. In this section, we propose a formalization of HtD

authentication.

7.2.1 HtD Authentication

We consider a verifier V that interacts with a prover P from the set of provers 𝒫. A prover P is modeled as an entity with a unique set of properties (e.g. location, IP, behavior, etc.) that can respond to the challenges from V. A challenge-response between V and P is formalized by an experiment denoted
$$\mathrm{exp} = V(x; r_V) \leftrightarrow P(y; r_P),$$
where x and y are the input values of V and P, respectively, and r_V and r_P are the randomness used during the challenge-response.

The verifier may interact with an adversary A who is impersonating the prover, and in that case A substitutes P in the above notation. The adversary A may also interact with the prover P while responding to the verifier, and we denote this interaction by
$$V(x; r_V) \leftrightarrow A(y; r_A) \leftrightarrow P(z; r_P),$$
where x, y and z are the inputs of V, A and P, respectively. The input value of the verifier is used to produce the challenges or verify the prover's responses, and the input value of the prover is used to produce the responses. The inputs may be equal or similar (similarity based on a distance function), which means the two entities are sharing the information between them. At the end of the interaction, the verifier V outputs out ∈ {0, 1}, which is 1 if the interacting entity is accepted and 0 otherwise.

Definition 7.2.1 A Hard to Delegate (HtD) Authentication protocol is a tuple (reg, P, V), where reg is a randomized registration algorithm between V and P that takes randomness r and outputs s_P (denoted by s_P ← reg(r, V, P)), which is the shared information between verifier and prover. V(·;·) is a probabilistic polynomial time (ppt) algorithm of the verifier, and P(·;·) is a ppt algorithm of the prover or a human (or a combination in our case) that responds to the verifier. We first require that the verifier outputs "out" in polynomial time. Moreover, we require that the HtD authentication protocol satisfies the ε-correctness, δ-security and ζ-HtD properties:

• ε-correctness: The probability that the verifier rejects an honest prover P is low: for all P ∈ 𝒫,
$$\Pr_{r, r_V, r_P}\left[ out = 0 : V(s_P; r_V) \leftrightarrow P(s_P; r_P) \right] \le \epsilon,$$
where s_P ← reg(r, V, P), and Pr[out = 1 : E] is the probability that the verifier outputs 1 after the experiment E is over. Note that this is not a conditional probability; it only indicates the sequence of actions: the probability of out = 1 is measured after the experiment is performed.

• δ-security: The probability that the verifier accepts a dishonest prover P' that claims to be P is low: for all P, P' ∈ 𝒫 with P' ≠ P,
$$\Pr_{r, r_V, r_{P'}}\left[ out = 1 : V(s_P; r_V) \leftrightarrow P'(s_{P'}; r_{P'}) \right] \le \delta,$$
where s_P ← reg(r, V, P) and s_{P'} ← reg(r', V, P'). Note that if P' is not registered in the system, s_{P'} may be null.

• ζ-HtD: The probability that an adversary A can emulate a prover P, even given the shared information between P and V, is low:
$$\Pr_{r, r_V, r_P}\left[ out = 1 : V(s_P; r_V) \leftrightarrow A(s_P; r_P, r_A) \right] \le \zeta,$$
where s_P ← reg(r, V, P); the inputs may have been computed by an interaction between the legitimate prover and the verifier, and the shared information is provided to the adversary by the prover intentionally. Here A has access to the same information as P but cannot emulate P to produce responses that result in its acceptance by the verifier, because of an internal property of P (such as behavior).

Remark: We note that the probabilities calculated above may also be over an unknown

behavior or an internal randomness of the prover, which is implicit in our notation.

The above definition only considers the case when the prover helps the adversary off-line by providing the shared information. Although we argue later that in many scenarios of HtD authentication the prover is practically only able to help the adversary off-line, we still consider the possibility of on-line help from the prover. In this scenario, the attacker can

relay the interaction between the verifier and the prover to be able to fool the verifier. We

consider this attack in the general model of man-in-the-middle (MiM) attack.

Definition 7.2.2 An HtD authentication (reg, P, V) provides σ-resistance to MiM if an adversary A who can interact with the prover P, and is given the prover's shared information with the verifier, still cannot be accepted with high probability:
$$\Pr_{r, r_V, r_P}\left[ out = 1 : V(s_P; r_V) \leftrightarrow A(s_P; r_A) \leftrightarrow P(s_P; r_P) \right] \le \sigma,$$
where s_P ← reg(r, V, P).

For example, in a classical password-based authentication, reg(r, V, P) outputs k_P, a password shared between verifier and prover. The verifier then generates a challenge c, i.e. c ← V(; r_V), and sends it to P. The prover responds by applying a cryptographic hash function h to the challenge and k_P, i.e. s ← P(k_P; c) = h(c, k_P). The verifier accepts, 1 ← V(k_P; c), if s = h(c, k_P) and rejects otherwise. This simple authentication satisfies the correctness and security definitions, but does not satisfy the HtD property or resistance to MiM, because if the shared information k_P is transferred to another entity P', then P' will be able to generate s ← P'(k_P; c) = h(c, k_P) and can simply impersonate P.
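For concreteness, the following is a minimal sketch of this hash-based challenge-response in Python (the function names and the use of SHA-256 are illustrative assumptions, not part of the formal definition). The point to note is that respond() depends only on the transferable value k_P, so any delegatee who is handed k_P passes the protocol exactly as the registered prover would, which is why the scheme fails the HtD property.

```python
import hashlib
import secrets

def challenge() -> bytes:
    # Verifier: pick a fresh random challenge c.
    return secrets.token_bytes(16)

def respond(k_P: bytes, c: bytes) -> bytes:
    # Prover: compute s = h(c, k_P); any holder of k_P can do the same.
    return hashlib.sha256(c + k_P).digest()

def verify(k_P: bytes, c: bytes, s: bytes) -> bool:
    # Verifier: recompute h(c, k_P) and compare in constant time.
    return secrets.compare_digest(s, hashlib.sha256(c + k_P).digest())

k_P = secrets.token_bytes(32)            # shared at registration
c = challenge()
assert verify(k_P, c, respond(k_P, c))   # a delegatee given k_P would pass identically
```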

7.3 HtE authentication games

In this section, we describe our approach for HtD authentication based on human behavioral

features. We use features in human behavior that can be used to distinguish humans and are also hard to emulate by another human. Human features are the distinguishing characteristics of a human that determine how a human behaves, looks, or what their habits are when performing a task. According to trait theory [Kas03], there exist human features that are relatively stable over time for a human and differ across individuals, and thus these features (aka traits) can be used to distinguish them. Our work builds on this theory from psychology. Many

human features can be measured through various psychological tests in psychometrics [Kas03].

We are particularly interested in features that can be measured using human computer

interaction in a computer game. Informally, a game that measures a set of human features

is Hard to Emulate if it is hard for a person to emulate the behavior of another person and

generate similar game-play features such that no efficient classifier can distinguish between

the two. Note that emulating another person is not in terms of generating the same final results; rather, it is about their game-play throughout the game and the set of features captured from that. We will empirically show that HtE games can be constructed and, as a proof of

concept, we propose a HtE game demonstrator that can be used to capture a number of

features, for which we argue the HtE property.

7.3.1 The model

We consider a verifier V that interacts with a prover P from the set of provers 𝒫. The prover P has two components: a human and a game client. The challenge-response happens between the verifier and the game client over an authenticated channel. The client interacts with the human to generate responses by collecting human behavior in game-play. The game client displays a game, possibly configured by the parameters received from the verifier, and then the human is asked to accomplish a goal by game-play. The human features are collected during game-play, and the response of the game client to the verifier is a function of the collected feature values.

The game measures b features F = (f_1, f_2, ..., f_b), called a feature vector, simultaneously during a time interval when the human is playing the game. Subsequent measurement of the features over time for n time slots will produce a set of feature vectors, F_1, F_2, ..., F_n, which we denote by R_P(n), i.e. the response of P to n consecutive measurements of the feature vector in the game. The user's secret profile is generated during a registration phase, when users sign up in the system by playing the game to produce R_P(n), which we call the user's profile from now on. The measurements from all users create a corpus of user profiles which will be used to verify users. Later on, the users are asked to play the game again, and a set of n' subsequent measurements of feature vectors R_P(n') are collected from the users.

Finally, a classifier decides whether R_P(n) and R_P(n') are generated from the same human entity. A classifier is a function cls that maps a sequence of feature vectors x to a class cls(x), in our case the human entities registered in the system. We consider a classifier that uses a distance function (an example is given in Section 7.5.2) to compute the distance between two sets of feature vectors R_P(n) and R_P(n'), and we require a number of properties. Firstly, the feature measurements for a human entity must be stable for two subsequent measurements. In other words, the distance between R_P(n) and R_P(n') should be small. Secondly, the feature measurements of two human entities must be distinguishable. Third, a human entity P' should not be able to generate feature measurements in the game such that the classifier cannot distinguish it from another human entity P, even if P' is given whatever information about the measured values of features when P plays the game. In other words, P' should not be able to emulate the behavior of P in the game to fool the classifier, even if P' knows how P plays the game in terms of measured feature values.

We formalize these requirements in the following:

Definition 7.3.1 An (m, α, β)-Authentication Game is a game G that measures b features when performed by a human P from a population 𝒫 of m human users in the system. The human will play the game G and each measurement will generate b feature values f_1, ..., f_b, where f_i (i ∈ {1, ..., b}) are real numbers in an interval I. The profile of human entity P in the system is n measurements of the feature vectors and is denoted by R_P(n), for a value of n > 0.

We require that there exists an appropriate algorithm cls that matches a set of n' feature vectors to a human entity, such that the following conditions are satisfied:

• α-FRR: The classifier should map a measurement of feature vectors R_P(n') to P with high probability:
$$\Pr_{P}\left[ cls(R_P(n')) \ne P \right] \le \alpha.$$

• β-FAR: For an entity P, the probability that feature measurements of another entity P' are classified as P is low:
$$\forall P, \quad \Pr_{P' \ne P}\left[ cls(R_{P'}(n')) = P \right] \le \beta.$$
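To make the two error rates concrete, the following sketch estimates α-FRR and β-FAR empirically from labelled game-play data; the classifier interface cls and the data layout are assumptions standing in for the matching algorithm of Section 7.5.2, not the exact evaluation code used in our experiments.

```python
def empirical_frr_far(cls, labelled_measurements, users):
    """labelled_measurements: list of (true_user, feature_vectors) pairs.

    FRR: fraction of measurement sets not mapped back to their own user.
    FAR: for each user P, fraction of other users' measurement sets mapped
         to P; the worst case over all users is reported.
    """
    rejects = sum(1 for user, R in labelled_measurements if cls(R) != user)
    frr = rejects / len(labelled_measurements)

    far_per_user = {}
    for P in users:
        others = [(u, R) for (u, R) in labelled_measurements if u != P]
        accepted = sum(1 for _, R in others if cls(R) == P)
        far_per_user[P] = accepted / len(others) if others else 0.0
    return frr, max(far_per_user.values())
```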

An authentication game provides the correctness and security properties of an authentication system. Moreover, we require the authentication game to provide the hard to emulate

property that is formalized in the following definition:

Definition 7.3.2 An authentication game G is γ-Hard to Emulate if it is hard (empirically) for another entity P' to emulate P and generate the same measurements as P does, even given any transferable information, i.e. the responses R_P(n) of P for n measurements. Let R*_{P'}(n') be the measurement of features for P' when P' is emulating P and is given the measurements of feature vectors R_P(n) from P, and any other transferable information; then we require
$$\forall P, \quad \Pr_{P' \ne P}\left[ cls(R^{*}_{P'}(n')) = P \;\middle|\; I = R_P(n) \right] \le \gamma,$$
where I is a random variable representing the information P' can access.

We denote an (m, α, β)-Authentication Game that provides the γ-Hard to Emulate property by (m, α, β, γ)-HtE Game. We note that an authentication game does not necessarily provide the HtE property, as will be discussed in Section 7.5.3; a few features can be collected from a human that fairly distinguish among a group, yet are easy to emulate.

7.3.2 Criteria to select human features

Psychological studies, as well as behavioral biometric works, suggest that various human

features can be used to distinguish people. For our application, however, we require features that are not only human-distinguishing but also hard to emulate by another human. Therefore it should not be easy for a human to learn to emulate another human and produce similar feature measurements that can fool any classifier. For example, human unconscious choices of objects can be considered as a feature. Although these features can distinguish individuals to a certain extent, the behavior of selecting choices can be emulated very easily by other individuals to fake their identity. One example of a single feature that is quantitative and hard to emulate is intelligence. There are intelligence tests (IQ) that can quantify an aspect of intelligence and are hard for an entity with a lower score to emulate so as to perform as one with a higher score. In this research, we are interested in finding a set of features, measurable through game-play, that are not easy to emulate.

We divide the human features we use in this project into two groups: features that

measure human skills and those that measure behavior.

A human skill is the ability of the human to accomplish a task with a certain perfectness or performance. For example, certain human features in keyboard typing constitute a human skill, which is accomplished with different levels of performance (e.g. speed, correctness, etc.) by distinct humans.

A human behavior is a (usually unconscious) physical or mental footprint of a human when doing a task. For example, the pattern of typing (e.g. timing) or mouse movement dynamics [JY11, MR00] is a human behavior which is unconscious.

Our claim is that combining a number of features from both categories above can be a good candidate for a set of hard to emulate features. Our intuition is that a task that measures a number of human skills along with human behaviors at the same time is hard to emulate, and we back this up with our experiments.

To make it harder to emulate features, we need tasks that can measure multiple features

at the same time so that it becomes hard for entities to emulate each other. Another reason we need multiple features is that it makes it more feasible to distinguish people in a larger

population.

In brief, our proposal considers the following rationale to select the appropriate features: 1) Feature distinguishability: a feature should distinguish among a population of human entities, the larger the population the better. 2) Feature stability for consecutive measurements: we require that the values of the measured feature be stable over time and do not change drastically. In our authentication system, we consider features for which subsequent measurements of the feature values result in high classification rates. 3) Multiple orthogonal features: our experiments show that measuring multiple features makes it easier to distinguish entities in larger populations and increases the success rate of the system. Moreover, if the features are orthogonal (uncorrelated), then there is a higher chance of distinguishability with a smaller number of features. 4) Rival skill features: features that have dependent measurements such that an attempt to change one will affect the other. For example, precision and speed of doing a task are two rival features, because increasing the precision needs more concentration and time, and this will decrease the speed of doing the task. We believe that rival features provide the means for hard emulatability.

Note that rival features usually produce correlated measurements of the features and are thus

not orthogonal. Therefore we need to ensure that we collect enough feature values such that

the hard to emulate property is satisfied with low FAR.
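As a small aid for checking criterion 3 above, the pairwise correlation of the collected feature columns can be inspected directly; the sketch below (using numpy, as an illustrative assumption rather than our actual analysis code) reports the correlation matrix, where entries near 0 indicate approximately orthogonal features and entries near ±1 indicate rival or otherwise dependent features.

```python
import numpy as np

def feature_correlations(R):
    """R: n x b array of n feature-vector measurements of b features.

    Returns the b x b matrix of pairwise Pearson correlations between
    feature columns.
    """
    R = np.asarray(R, dtype=float)
    return np.corrcoef(R, rowvar=False)
```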

7.4 From HtE games to HtD authentication

In this section, we assume an (m, α, β, γ)-HtE game and discuss how (ε, δ, ζ)-HtD authentication with σ-resistance to MiM attack is achievable in a real-world application using HtE games. Correctness and security of HtD authentication are satisfied by the FRR and FAR properties of the game, i.e. ε = α, δ = β. But for the HtD property and resistance to MiM attack we need to consider possible attack scenarios.

As discussed earlier, the prover is considered to be a game client with two interfaces: one interface with the human player to collect responses, and one with the verifier to send the responses back. Considering the two settings of host-based local authentication and remote (on-line) authentication of users, there are a number of possible attacks on our model of the HtE game that need to be addressed in order to prove that it provides HtD authentication. In both local and remote settings, if the delegatee can bypass the game client to directly communicate with the verifier, then the responses can be generated using the game-play information of a legitimate prover and sent to the verifier. The verifier's view of the communication would be indistinguishable for the prover and the delegatee. The delegatee may

also use a game bot trained to replicate the behavior of the prover to the verifier. Therefore,

a secure game design ensures that the game client is the only channel for the delegatee to

interact with the verifier.

In the remote setting specifically, an attacker can relay the communication to the prover so

that the prover's game-play is returned to the verifier. In either case, the attacker would be successful in bypassing the game-play and simply providing acceptable information to the verifier.

In this section, we discuss the attacks and possible solutions and we argue that a HtE

authentication game can provide HtD authentication.

7.4.1 Adversarial benefits from delegation

When considering attack models for a security problem, the power of the adversary is an important factor in designing the appropriate system to prevent the attacks. Note that here the adversary is the collusion of the prover and the delegatee. We assume that the adversary in our case seeks mainly financial benefit from being authorized to access the service. This is because the adversary gets help from an authorized entity in the system and will have access to all the same information and services that the trusted entity has access to. Thus the

adversary does not want to steal information or harm the system in any way. We assume

that the main reason in sharing information between the authorized entity and adversary is

their benefit in being able to share the access to service. Therefore, the attack on the system

should be economically viable for both the adversary and authorized entity, depending on

the type of service that is obtained by accessing the system. For example, in a remote working scenario, the benefit of the legitimate entity in delegating the authentication is that a third party would do the job (who should also be paid), and subtracting the payment to the third party from the actual salary would be the benefit of the entity. In a subscription-based service, the benefit would be at most the fee for subscription to the service.

7.4.2 Game client bypassing attacks

For both local and remote settings, a basic requirement for our proposed authentication

mechanism to be secure is that the users’ only interface to communicate with the verifier

is through playing the game presented to them by a trusted game client (trusted by the verifier). This assumption is crucial to prevent a delegatee from bypassing the game client and passing the verifier the information received from a legitimate prover.

Assuming that the delegatee cannot bypass the client, the only way to get authenticated in the system is by emulating the behavior of the prover in the game. In this section, we argue that this assumption is feasible by considering possible attacks on the game client and investigating protection mechanisms against these attacks. There are three possible attacks on the client. 1- Tampering with the network: the delegatee interacts with the verifier by responding over the network without playing through the game client. The delegatee can then generate responses based on the prover's information. 2- Game client modification: the attacker modifies the client to retrieve/modify embedded keys or the state of the client. 3- Automated game-play (bot): a software/hardware emulator is used to generate the behavior of the legitimate prover in game-play.

In the following, we discuss the above attacks and possible prevention mechanisms.

7.4.2.1 Anti-cheating mechanisms

There is ongoing research on the topic of cheat prevention in on-line games, where cheating enables hackers to modify the client or change the network communication so that they win without playing. A survey and classification of these attacks can be found in [WK12, WS07]. The

first formalization of cheating in on-line games was given by Baughman et al. [BL01] where

they provide a number of anti-cheating mechanisms to protect against attackers.

Preventing these attacks is important because games are not just played for fun, but also to make money in a multi-billion-dollar industry. The success of an on-line multi-player game is very much dependent on its fairness among players, and thus the gaming industry invests in developing anti-cheating mechanisms due to their financial significance.

There are many anti-cheating mechanisms proposed in the literature, including stealth measurements [FKS08], software mechanisms to prevent modification of the game client [MGM06, Val],

hardware prevention mechanisms to detect input data attacks (e.g. simulating mouse clicks

and key strokes) based on Intel hardware features [SGJ07], trusted computing assumptions

[BM07] and Accountable virtual machines [HARD10]. The protection mechanisms prevent

modification of the game client, automated playing of the game and modification of the network

communication between server and client.

In the following sections, we will summarize the methods in this line of work that can be

used to protect a HtE game against the three mentioned attacks.

7.4.2.2 Tampering with network communication

In this attack the delegatee uses a trained software agent that can emulate the behavior of a legitimate prover to bypass the game client and send the information to the verifier over the network. To prevent this attack we assume a secure communication channel between the verifier and the game client. This can be achieved by obfuscating a shared key K inside the game client. We assume this key is not retrievable/modifiable by the users of the system, neither the prover nor the delegatee. Note that we do not restrict access to the same game client software by any party, so the delegatee may acquire a copy of the game client with the same shared key.

Assuming the shared key, a secure authentication mechanism can be implemented in the

game client to prevent any tampering with the network, including replay attacks where the delegatee only replays the responses from the prover. In Section 7.4.3.1, we argue how the shared key can be used to prevent network attacks.

We note that this is not a foolproof solution, but it is assumed in many cheat-proofing mechanisms for games [HARD10] as it effectively prevents cheating.

7.4.2.3 Game client modifications

The delegatee might modify the client to bypass the authentication system in two ways: first, by installing a cheat along with the game client as a patch or loadable module to help in emulating the behavior of the prover, and second, by retrieving/modifying the shared key with the verifier to be able to tamper with the network communication. To mitigate these attacks, the authors in [BCR10, TBB12] propose to symbolically execute the client to find the constraints on the state of the client implied by the responses received from it, and then use constraint solvers to determine whether such constraints could be generated by user input.

An extension of this approach was proposed in [HARD10] which uses Accountable virtual

machines (AVM). In this approach, the game is run in a virtual machine that monitors the

state of the game during user game-play and outputs a log of the game events (e.g. mouse

click, key stroke, etc) which will be sent to the verifier. Having all the logs, the verifier can

simulate running the game with the events in the log to find inconsistencies. The solution

prevents client modification as well as tampering with the network. More details can be

found in the paper. Another solution, called GameGuard [TK10], is a Windows-based module that can be added to the game to protect against modifications by hiding game modules and providing a hack detection module. There are also solutions based on tamper-resistant hardware [BM07] that use dedicated hardware to check the state of the client.

7.4.2.4 Automated game-play (bot)

A game bot is a software/hardware agent that can emulate game-play. In this attack, a

game bot can be trained to be able to emulate the behavior of the prover, without client

modification or tampering with the network. For example, one type of game bot can generate the sequence of mouse clicks and key strokes to play the game by image-processing the game environment. Depending on the graphics of the game, such tools can get very complex and hard to implement. There are general protection mechanisms to mitigate these attacks, such as the Intel hardware protection mechanism [SGJ07], and software techniques such as human interactive proofs (e.g. Captcha) [MY12]. There are a number of works that were conducted for specific games such as [GWXW09, CPC08]. In [GWXW09], human observational proofs (HOPs) are used to distinguish between humans and bots. HOPs differentiate bots from

human players by monitoring actions taken by the player that are difficult for a bot to

perform. [CPC08] tries to distinguish human behavior from bots by arguing that certain

human behaviors are difficult to perform by a bot because they are AI-hard.

Note that the methods in [GWXW09, CPC08] collect features from game-play and can be simply incorporated into our proposal by unifying the collected features, doing further analysis on the feature vector to detect bots, and then verifying the identity of the prover.

7.4.3 MiM attacks

A possible attack on HtD authentication is that the delegatee may relay the communication from the verifier to a legitimate prover, get the responses back, and send them back to the verifier. This is very similar to a man-in-the-middle attack where the delegatee sits in the middle and relays the communication. This attack is only possible if the prover helping the delegatee is on-line to respond to the challenges from the verifier. In real-world scenarios where HtD authentication is applicable, such as subscription-based services or remote working, the prover may not be able to respond to the challenges in real time, as it is the delegatee who is communicating with the verifier, and not the prover. The prover may even be off-line at the time of the challenge response. But in general, to prevent this type of MiM attack in other applicable scenarios, we use a distance bounding protocol to measure the timing of the responses from the provers. Here we assume that the prover will always connect to the verifier from within a certain distance, so that we can measure a maximum communication time at the registration phase of the protocol. This time will later be compared with the time measured during the challenge-response phase of the protocol, where we use a distance bounding protocol to measure the round trip time.

7.4.3.1 Distance bounding

Distance bounding was first introduced by Brands and Chaum [BC94] to prevent MiM attacks.

We will use a DB protocol similar to [HK05] with modifications to prevent MiM attack in

HtE games. Distance bounding protocols are usually constructed for wireless environments where the round trip time of the signals is measured to bound the distance to legitimate

provers. One difference in our case is that we need to measure this time over a wired network.

Distance bounding over wired networks has been considered in a number of works. Drimer et al. [DM07] proposed to use DB over wired networks to prevent relay attacks between bank terminals and smart cards. Watson et al. [WSNA+12] also proposed DB to estimate the location of a server over a wired network, based on the work in [KBJK+06a] which describes a method achieving a distance estimation error of 67 km. In summary, using DB protocols

over wired networks provides a reliable outcome if computation time at the prover’s side is

not dominant over the network communication time. We already have features that measure the computation time (game-play time here) in the client, which can later be subtracted from the round-trip time; this solves the problem. We incorporate the DB protocol into our final protocol

and argue its effectiveness against the HtE game. The final protocol will use a number of DB

+ Game challenge responses illustrated in Fig. 1.

Fig. 1: One interaction of V(; GP, N) ↔ P(K; GP, N)

Verifier:
1. Choose GP and a random number N ∈ {0, 1}*.
2. t1 = time; send GP, N to the game client.

Game client:
3. Compute R from human game-play.
4. Compute τ = mac_K(R, N), and the time T needed to compute R, τ.
5. Send R, τ to the verifier.

Verifier:
6. t2 = time.

In each round of the game, the verifier generates the game parameters GP (environment parameters) along with a fresh random nonce N, and sends them to the prover. On the prover's side, the game client interacts with the human to obtain the human features R, and then the client calculates τ = mac_K(R, N) using the shared key K and a message authentication code mac. Upon receiving the response, the verifier calculates the round trip time ∆(t). Note that the delegatee can also access the game client with the same K on the client, and so the delegatee is able to respond to these challenges. However, the game-play information and the distance bounding challenge are mixed in the response, and the delegatee needs to either play the game or relay it to a legitimate prover. Assuming the delegatee is not in the proximity of the prover and is not closer than the prover to the verifier, any attempt to relay the communication would be a mafia fraud attack, as explained in Section 7.4.3.2.
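A minimal sketch of one such DB + game challenge-response round is given below, written from both sides in Python. HMAC-SHA256 stands in for the unspecified mac, the game-play collection call collect_features_from_game_play and the send/receive callables are hypothetical placeholders, and a monotonic clock approximates t1, t2 and T from Fig. 1.

```python
import hashlib
import hmac
import json
import os
import time

def client_round(K: bytes, GP: dict, N: bytes):
    # Game client side: obtain the feature vector R from human game-play
    # (hypothetical call), then bind it to the nonce with a MAC under K.
    start = time.monotonic()
    R = collect_features_from_game_play(GP)        # hypothetical
    T = time.monotonic() - start                   # game-play time, reported to the verifier
    tau = hmac.new(K, json.dumps(R).encode() + N, hashlib.sha256).digest()
    return R, tau, T

def verifier_round(K: bytes, GP: dict, send, receive):
    # Verifier side: send fresh (GP, N), time the round trip, check the tag.
    N = os.urandom(16)
    t1 = time.monotonic()
    send(GP, N)
    R, tau, T = receive()
    t2 = time.monotonic()
    expected = hmac.new(K, json.dumps(R).encode() + N, hashlib.sha256).digest()
    if not hmac.compare_digest(tau, expected):
        return None                                # tampered or replayed response
    delta = (t2 - t1) - T                          # round-trip time net of game-play time
    return R, delta
```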

7.4.3.2 Distance bounding and message authentication codes

Distance bounding (DB) [BC94] is a protocol between a prover and verifier in which at the

end of the protocol, the verifier outputs a bit indicating whether or not the prover is within a distance bound from the verifier. There are three main types of attack against DB protocols: 1- Mafia fraud, where an adversary, possibly within the distance bound, tries to impersonate a legitimate prover that is out of bound by relaying communication; 2- Terrorist fraud, where the prover helps the adversary off-line by providing information so that the adversary can be authenticated; 3- Distance fraud, where the prover claims to be closer than it actually is. Distance bounding is generally performed by measuring the round trip time of the communication. In this chapter, the adversary can only attack DB in the mafia fraud scenario, and we denote the advantage of adversary A in succeeding in such an attack by Adv_mafia(A).

Message authentication codes (MAC) are used to detect message tampering and forgery.

A function mac is provided that accepts a secret key K and message M and outputs a tag

τ = mac_K(M). The verifier can apply the function to the same message and key to detect any tampering with the message. An adversary may tamper with the communication to change a message. We denote the success of an adversary A in forging a message in a MAC by Adv_mac(A).

7.4.4 Security of the protocol

Fig. 2 (next page) describes the five phases of the protocol. Our assumption is that the game client cannot be modified (e.g. to retrieve or change the key K) and that no game bot is used to play the game on behalf of users, based on the discussion in Section 7.4.2. Given that, the mac function prevents any network modification of responses to the server, unless the attacker can forge the mac. The distance bounding protocol also prevents relay attacks over the network.

Therefore, the delegatee can fool the verifier only if one of the following holds: 1) the delegatee can emulate the behavior of the legitimate prover in the game, or 2) the delegatee can relay the communication to the legitimate prover and succeed in a mafia fraud attack, or 3) the delegatee can succeed in generating the right message authentication tag for a response and nonce.

We note that here we only protect the security of the initiation of the authentication protocol, and continuous authentication is required to ensure the same entity will access the system

during the connection session.

Given the above arguments, using an (m, α, β, γ)-HtE authentication game, a secure message authentication code mac and a distance bounding protocol, we can achieve an (α, β, max{γ, Adv_mac(A)})-HtD authentication, where Adv_mac(A) is the advantage of adversary A in forging the mac. The protocol also provides (max{γ, Adv_mac(A), Adv_mafia(A)})-resistance to MiM attack, where Adv_mafia(A) is the advantage of A in mounting the mafia fraud attack against

distance bounding. A short description of distance bounding and message authentication

codes is given in Section 7.4.3.2.

Fig. 2: Authentication Protocol that is Hard to Delegate

Setup phase()
1. A cheat-proof game client with an embedded key K is securely released to the prover. The prover may share it with a delegatee.

Registration phase(n) For a prover P, reg(r, V, P) is defined as:
1. The game challenge-response (Fig. 1) is run for n instances: n interactions of V(; GP, N) ↔ P(K; GP, N).
2. For each instance i, the tag τ_i is verified and, if successful, R_i is stored in the prover's profile.
3. ∆_i = t2 − t1 is stored in the profile (game-play time subtracted).
4. Profile(P) = {(R_i, ∆_i), i ∈ [n]}

Challenge response(m)
1. The challenge-response in Fig. 1 is run for m instances to get new measurements.
2. Resp(P) = {(R_i, τ_i, ∆_i), i ∈ [m]}

Verification(t)
1. Check that τ_i = mac_K(R_i, N_i) holds, else reject.
2. Compare the time delays ∆_i in Resp(P) with Profile(P). If the delay is more than a maximum threshold, output reject.
3. Run the verification function (Section 7.5.2) on the R_i's from Resp(P) and Profile(P) and output the result of verification.

Adaptive stage()
1. If verification passes, add the collected R_i's from Resp(P) to Profile(P).
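A compact sketch of the verifier-side Verification phase of Fig. 2 is shown below (Python); verify_profile stands in for the verification function of Section 7.5.2, delta_max for the registration-time delay threshold, and the nonces N_i are assumed to be the ones the verifier itself issued during the challenge-response phase. These names are assumptions for illustration, not the exact implementation.

```python
import hashlib
import hmac
import json

def verification_phase(K, profile, responses, delta_max, verify_profile):
    """profile: registration data for the claimed prover.
    responses: list of (R_i, tau_i, N_i, delta_i) from the challenge-response phase."""
    for R, tau, N, delta in responses:
        expected = hmac.new(K, json.dumps(R).encode() + N, hashlib.sha256).digest()
        if not hmac.compare_digest(tau, expected):
            return "reject"                 # step 1: invalid tag
        if delta > delta_max:
            return "reject"                 # step 2: round-trip delay above threshold
    measurements = [R for R, _, _, _ in responses]
    return "accept" if verify_profile(profile, measurements) else "reject"  # step 3
```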

Figure 7.1: Screen-shot of the game

7.5 The proof of concept: HtE game

Our proposal for a HtE authentication game is a game that measures a number of human

features when played. The game measures three quantities of human skill and five quantities of human behavior while playing. The game is available on-line to play (the link is removed because of the anonymity requirements of the conference).

7.5.1 The game design

We designed an archery-type game where the human player drags an arrow to set the initial speed and angle and releases it toward a target. The game is very intuitive to learn and

easy to play.

7.5.1.1 Feature selection and the rationale

The game measures the following human features each time an arrow is released toward the target:

t1 - Hit precision. The distance between the arrow and the center of the target after hitting. This is a floating point number in the interval [-120, 120] as shown in Fig. 7.2.

Figure 7.2: Hit precision

t2 - Targeting time. The time in milliseconds that it takes for the player to aim and shoot at the target. This is the time difference between the start of dragging and when the arrow is released. This is a positive floating point number in (0, 10].

t3 - Latency time. The time in milliseconds that it takes for the player to begin dragging a new arrow after the game is reset (a previous arrow hit the target). This is a positive floating point number in (0, 4].

t4, t5 - Relative initial mouse click coordinates. The relative x, y coordinates of the initial mouse click on the screen to drag the arrow, with the coordinate of the arrow nock as the center. These are two floating point numbers greater than 0 and independent of the screen resolution.

t6, t7 - Initial velocity and angle. The initial velocity and relative angle of the arrow at the time it is released toward the target.

t8 - Miss count. The number of misses between each two successful shots.

We note that the only changing parameter in our design is the target location (in level 3), and it affects the measurements for t1 and t7. Both features are measured relative to the location of the target center. This makes the human data points independent of the game parameters.
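As an illustration of this normalization (the coordinate and variable names below are hypothetical; only the idea of measuring relative to the target center and the bow-to-target direction is taken from the text):

    import math

    def relative_features(arrow_x, target_center_x, shot_angle,
                          bow_x, bow_y, target_x, target_y):
        # t1: hit precision measured relative to the target center,
        # so the stored value does not depend on where the target is placed.
        hit_precision = arrow_x - target_center_x
        # t7: shot angle measured relative to the bow-to-target direction.
        target_angle = math.atan2(target_y - bow_y, target_x - bow_x)
        relative_angle = shot_angle - target_angle
        return hit_precision, relative_angle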

There are other measurable human features, such as the mouse location during game-play, that we did not consider in our experiment, for simplicity.

The features t1, t2 and t8 measure three human skills, that is, they measure how precise and fast the players are in targeting an arrow. The features t3, t4, t5, t6 and t7 measure five human unconscious behaviors.

Note that t1 and t2, t3 are rival features, i.e. a player trying to increase the precision of a hit needs to increase the time spent aiming at the target. In other words, an attempt to change the value of the hit precision or miss count features will change the targeting time feature. In our experiments, we argue empirically that increasing accuracy and decreasing shooting time are conflicting goals. After measuring sufficiently many feature vectors of the users, we use a classifier, described in Section 7.5.2, which is based on the statistical distance between cumulative distribution functions (cdfs) of feature measurements to distinguish among users. Our empirical studies also suggest that the above features are stable across consecutive measurements and are suitable to distinguish among a group of people.

7.5.1.2 Game Levels

The game is designed in 4 levels and the measurements are done in all the levels. We will make observations about the effect of the design on the security of our approach in Section 7.5.3.

Figure 7.3: User verification accuracy; measurements matching the profile (cdf of Targeting time and Hit precision, profile vs. measurement).

In the first three levels, the location of the target is fixed. The first level is the easiest possible: the target is fixed in the center of the screen with no movement, and the player has to set the angle and speed to hit the center of the target. In the second level, there is a blocking wall that prevents the player from shooting at the target straight (Fig. 7.1). The player should adjust the angle and speed to avoid hitting the wall. The third level is the same as the first, but the target has a vertical periodic (sinusoidal) movement, so the player has to predict the location of the target when the arrow is going to hit it. The fourth level is different from the previous three levels, as the target changes its location. For every shot, the target fades in and out at a fixed random location, forcing the player to release the arrow while the target is visible. So the player has to time the shot so that the arrow hits the target before it fades out. In this level, the game chooses a random location for the target and shows it to the player.

7.5.2 The verification function

An authentication system needs a verification function that outputs accept or reject after the challenge-response stage. The verification function in an HtE Authentication Game is based on a distance function.

The verification function here is a matching algorithm that can match a set of data points to a profile. We compared a number of such algorithms, and our results for human features showed that the following method gives the highest accuracy and efficiency. We do not claim, though, that this is the best verification function.

The verification function accepts as input two sets of feature vectors R_P(n) and R_P(n′) for a user P and outputs a bit, accept or reject. The first set is measured during the registration phase of the protocol, where the user plays the authentication game to provide a sample of size n to the verifier; the second sample is measured during a verification phase in which the user claims their identity and then plays the authentication game to provide a sample of size n′. The sample collected during the registration phase forms the profile of the user and is stored in a database. The verification function then estimates the distributions of both samples and compares them using a distance function.

7.5.2.1 Converting samples to distribution

There are different methods to estimate a probability distribution from samples. One method is to construct a histogram (by defining bins and counting the number of samples in each bin), find a parametrized density function with a similar shape, and finally fit the parameters using a goodness-of-fit algorithm. However, our empirical results show that constructing the cumulative distribution function (cdf) distinguishes between the entities more effectively.

Having data points of the user for b features, our aim is to construct the cdf of a given vector of responses R_P(n) = F_1, F_2, ..., F_n, where F_i = {f_{1i}, f_{2i}, ..., f_{bi}}, i ∈ [n]. Here f_{1i} is the i-th measurement of the first feature f_1.

Constructing the cdf of a multi-dimensional variable is not robust because the order of variables will change the final distribution. We work around this problem by constructing the cdf of each variable and then combining them with a fusion function. We estimate the cdf of each individual human feature f_j, j ∈ [b], and then use a fusion function to combine the cdfs.

To estimate the cdf of one feature f_j, we extract the values of f_j from the vector set and denote this vector by C_j. Therefore we have C_j = {f_{j1}, f_{j2}, ..., f_{jn}}.

Figure 7.4: The smooth edf of the features (Hit precision, Delay time, Mouse X, Mouse Y) for 7 random users, and the distance between them, illustrating how the features can distinguish among a group of people.

Assuming that the elements of C_j are samples of a source with distribution X, we want to estimate cdf(X), which is defined as cdf(x) = Pr[X ≤ x] for a probability distribution Pr(X). But since we do not have the probability distribution of X, we estimate the cdf by what we call the empirical distribution function (edf), defined as

    edf_{C_j}(x) = Pr_n[X ≤ x] = (1/n) Σ_{i=1}^{n} I(C_{ji} ≤ x),

where C_{ji} is the i-th value in C_j and the function I returns 1 if the input condition is true and 0 otherwise. In brief, edf_{C_j}(x) outputs the proportion of the sample points below value x.
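A minimal Python sketch of this estimator (variable names are illustrative, not taken from the thesis's implementation):

    def edf(samples, x):
        # Empirical distribution function of a single feature:
        # the proportion of sample points that are at most x.
        n = len(samples)
        return sum(1 for s in samples if s <= x) / n

For example, edf([0.4, 0.7, 0.9], 0.8) evaluates to 2/3.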

7.5.2.2 The Distance function

Given two sets of samples C_j, C_j′ of size n and m respectively, we calculate the score as follows:

    score_j = (mn / (m + n))^{1/2} · max_x | edf_{C_j}(x) − edf_{C_j′}(x) |.

score_j measures the distance between the two empirical distributions of the two sets of sample data.

The score is illustrated in Fig. 7.4 for 4 features measured in the game.

The above measure is also used in the Kolmogorov–Smirnov (KS) test as a measure of similarity between two sets of data. The KS test uses this statistic to test whether two sets of samples are generated by the same distribution.

Finally, for two sets of data points R_P(n) and R_P(n′), we define the score as a weighted sum of score_j for j ∈ [b], i.e. score = Σ_{j=1}^{b} w_j · score_j, where w_j corresponds to a measure of importance of the feature j. The score can be considered as a measure of the likelihood that the two sets of data samples are drawn from the same multivariate distribution. For a given profile R_P(n) of an entity P and a response set R_P(n′), the verification function outputs accept if the score is less than a threshold τ.
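The following sketch puts the verification function together, reusing the edf helper sketched above. The weights, threshold and evaluation grid are illustrative assumptions, not values from the experiments:

    import math

    def ks_score(c1, c2):
        # Scaled two-sample Kolmogorov-Smirnov statistic for one feature.
        n, m = len(c1), len(c2)
        points = sorted(set(c1) | set(c2))  # the sup is attained at observed values
        d = max(abs(edf(c1, x) - edf(c2, x)) for x in points)
        return math.sqrt(m * n / (m + n)) * d

    def verify_features(profile, response, weights, tau):
        # profile, response: one sample vector C_j per feature.
        # Accept (True) when the weighted sum of per-feature scores is below tau.
        score = sum(w * ks_score(p, r)
                    for w, p, r in zip(weights, profile, response))
        return score < tau

With the weights and threshold fixed (e.g. via functools.partial) and the collected R_i vectors transposed into per-feature sample vectors C_j, this plays the role of the matching algorithm invoked in step 3 of Fig. 2.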

7.5.3 Experimental setup

We recruited 100 users to run our experiments, 96 from Amazon Mechanical Turk, and 4

from our research group. All users were instructed to play the game to achieve a minimum

score on the game in each level to ensure that their game-play is not careless random playing.

In the registration phase we collected roughly 120 data points (120 shots to the target) from

all users, and for the verification phase we collected only 30 data points, which takes roughly

a minute to complete on average.

7.5.3.1 Correctness of HtD authentication

Our experiments on the data points show that the measured features are almost stable with respect to a user's current characteristics (profile), i.e. the change in their values is so small that it does not affect the classification algorithm. We examined the data from users in two consecutive time slots and then constructed a histogram of the measurements. Fig. 7.3 shows the histogram (cdf) of the two consecutive measurements for two features, hit accuracy and targeting time, for one user of the system. To account for changing behavior of the users, we update the profile of a user upon a successful login, as mentioned earlier.

7.5.3.2 Security of HtD authentication

In our second experiment, we used the classifier explained in Section 7.5.2 to measure how well the feature measurements in the game can verify the identity of users. Figs. 7.5 and 7.4 illustrate the histogram (pdf) and cdf, respectively, of feature measurements for 7 users of the system. Compared to the closeness in Fig. 7.3, here the users' histograms are distinguishable, and our classifier could verify 91 of 100 users correctly; the remaining 9 users were very close to the threshold used. The threshold was set to have a low FAR (this is for level 3 of the game).

7.5.3.3 Rival features and HtE property

Figure 7.5: The smooth histogram of the feature values (Hit precision, Targeting time, Mouse X, Mouse Y) for 7 users, illustrating how the features can distinguish among a group of people.

To verify the HtE property of our proposed game, we provided a player A with the information from another player B in the system. The information was provided so that the player A

can train to emulate the behavior of B. The information includes the record of feature

measurements, feature statistics (average, min, max, stdev), graphs of feature values, and

the information from visually observing the game-play of B. Given all the information, we

asked A to play to emulate the behavior of B in several rounds, each with 30 game-plays on which the classifier cls was run. After each round, we provided feedback to A on the change in game-play to help A adjust the behavior in game-play. Player A was also told how the classifier rated the feature values compared to B's profile. Since this experiment needed direct supervision and was lengthy, we only employed two of the players for this experiment, and each player tried to emulate the behavior of two users in the system. For A's role, we asked the highest-scoring player and an average user to try to emulate two provers with higher and lower than average scores. Each of our players failed to emulate some part of the other user's behavior, and the classifier output reject in all rounds.

Fig. 7.6 illustrates the attempt made by a user to emulate a second user. The histogram of feature measurements of user A is shown as a dashed line, before B's information was passed on. The histogram of feature measurements of user B (from B's profile) is shown as a thick blue line. The rest of the curves correspond to the attempts made by user A to emulate the

behavior of user B for two sample features, hit accuracy and targeting time. As shown in the

figure, user A has lower hit accuracy initially, while targeting time is roughly similar to B.

But an attempt to increase the hit accuracy results in longer targeting time, even when A is trained to emulate the behavior of user B. Therefore user A could not emulate both features

at the same time.

Figure 7.6: A user trying to emulate the behavior of another user (Hit precision and Targeting time); an increase in hit accuracy results in an increase in targeting time.

In general, time and coordinate related features were harder to emulate. For example, the

difference in latency time between two users, although it could distinguish the users, was not large enough that a user could deliberately emulate the exact delay of the second user.¹

7.5.3.4 Game design criteria

We examined different game designs to identify important factors that should be considered when designing the game. We measured the number of users correctly verified in the 4 levels of the game, and our observation shows that parameter variability in the game is an important factor. One problem here is that variability (randomness) in the game may make the classification harder. In our game, one variable is the target location, which affects our measurement of hit accuracy and angle of shot. We consider relative measurements (angle relative to the vertical distance of target to bow, accuracy relative to the center of the target) to remove the impact of this variability. Limiting the time of an action (the user has to act within a given time), measuring multiple features, and ensuring that the users play to get a high score in the registration phase are other important factors. The value of FRR for the 4 levels of the game is 28%, 18%, 24%, 9% and the value of FAR is 12%, 6%, 13%, 6% respectively. Therefore level 3 is the best design (among 4 levels) for authentication, and this is possibly because of the added variability in the game.

¹ The user trying to emulate a second user had this comment: “How can I delay for 100 milliseconds more in each game-play?!”

7.6 Concluding remarks

We proposed to collect the behavior of human players in computer games to prevent delegation

in authentication systems. Delegation of authentication occurs when a legitimate user of the

system passes on whatever information possible to another user such that that user gets authenticated. We consider scenarios in which delegation of authentication is not favorable, such as remote working

and subscription-based services. Our experiments showed that collecting multiple features

can make it hard for another user to emulate the behavior of a legitimate user of the system,

and thus makes it hard to delegate authentication.

We showed empirically that our prototype game provides the hard-to-emulate property using multiple features, specifically rival features. However, to implement a secure protocol we need to ensure that the game is the only channel for the users to communicate with the verifier. We argued that there are three ways for a delegatee to bypass the client: 1- network modification, 2- tampering with the client and 3- automated game-play. We then argued that we can use techniques employed in cheat-proofing on-line games to make bypassing the client harder.

We also considered relay attacks where the delegatee relays the communication between

a legitimate user and verifier to get authenticated. To prevent this, a distance bounding

protocol was bound to the game-play information when interacting with the verifier.

Our work suggests that behavioral biometrics can be used not only for authentication, but

also to prevent delegation of authentication. We think games are particularly suitable applications for collecting multiple features, and cheat-proofing games is easier using the mechanisms developed for this purpose in the gaming industry.

7.6.1 Extensions and future work

To our knowledge this is the first work that formalizes the concept of HtD authentication

and proposes a protocol that can empirically achieve the security requirements.

The model of hard to delegate authentication might be satisfied by other soft or hard biometrics, or by assumptions on the location or source IP of the users, and this is part

of our future work. Using computer games has a number of benefits over other biometric

systems. Games with rich interaction allow many features that represent behavioral, cognitive

and/or skill aspects of a human to be collected and this makes the game hard to emulate,

in particular compared to capturing behavior through keystroke or mouse dynamics. The

method can be implemented in on-line games and authentication can be provided as a service

(e.g. the OpenID system), with possibly fewer privacy issues because features individually cannot

be linked to users.

Our prototype game provides a proof of concept for the feasibility of the approach, and

suggests a number of considerations in the game design (Section 7.5.3). However, collecting

features in games needs further research to minimize the false positive and negative rates, as well as the time required for authentication in the system.

Chapter 8

Concluding remarks

In this chapter, we discuss some of the open problems and future directions that can extend the results of this work.

In a nutshell, we started this thesis by characterizing the properties of physical random sources, which led us to formalize a mathematical model of practical random sources. In this model, the random source generates outputs from a single distribution at a time, but the distribution may vary over time. However, the probability of each outcome is bounded and cannot be arbitrarily close to 1, which means that a certain min-entropy is guaranteed in all the distributions. From here, the thesis can be divided into two main parts.
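For concreteness, this guarantee is the standard min-entropy bound: if every distribution X that the source may output satisfies max_x Pr[X = x] ≤ 2^(−b), then

    H∞(X) = −log₂ max_x Pr[X = x] ≥ b,

so at least b bits of min-entropy are available regardless of which distribution the source follows.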

In the first part, we studied the randomness requirements (in terms of size and distribution) of cryptographic primitives such as encryption, secret sharing and authentication. This problem is resolved for authentication and for many cases of encryption that we discussed in Chapter 3. But only the trivial cases for secret sharing were resolved in [DPP06], and we studied this problem for (3, 3)-secret sharing. Yet many questions remain unsolved, which we briefly review in this chapter, in Section 8.1.1. Then, motivated by practical scenarios, we studied two new notions of secrecy: guessing secrecy, and multiple-message security of high-entropy messages. The guessing secrecy notion was also studied by other works [Jia13, IS14], which extended the definition or proved equivalence to other notions. Yet, interesting questions remain that we summarize in Section 8.1.2.

In the second part of the thesis, we articulated two methods to generate random numbers from human game-play, and an application of this randomness in authentication. In one method, we used a game-theoretic approach that could be applied to specially designed games. We designed a game over an expander graph, and the game consisted of a random walk on the graph by a human. In the second method, we used the errors humans make in playing video games.

These errors are the main entertaining factor of video games, without which the players could win the game with a perfect score all the time and the game would soon become boring. The interesting directions in which these methods can be extended are discussed in Section 8.2.1. In the authentication mechanism built from the randomness in human game-play, our experiments showed that there is a distinguishing factor in the features collected from humans. We changed the game incrementally to achieve a higher distinguishing factor. The game design for randomness generation was not generally hard if one could measure the error in game-play, but the game design for authentication was complex, since the number of collected features should be larger so that users in a larger population could be distinguished. We discuss the future directions for authentication games in Section 8.2.2.

8.1 Randomness requirements of secrecy

In Chapter 3, we investigated properties of random sources for secrecy primitives, where we

reviewed the results mainly for encryption, authentication and secret sharing. The main open

problem in this section is classifying the random sources for secret sharing.

8.1.1 Sources for secret sharing

Dodis et al. [DPP06] could classify the sources for a very special case of secret sharing, showing that any source that admits secure encryption of n bits also admits secure (2, 2)-secret sharing of n bits (Theorem 3.5.1). However, for the converse, a counter-example was given (Theorem 3.5.2): a source that provides perfect (2, 2)-secret sharing but does not provide ε-secure encryption for ε < 1/3. The secret sharing table constructed in this example uses a key space of size exponential in the size of the message. One can ask if any secret sharing table exists with a smaller key space compared to the message. Our exhaustive search of the space of random sources that admit secure (2, 2)-secret sharing but do not provide secure encryption resulted in no random sources over {0, 1}^m with m < 2n for one-bit messages. Thus we conjecture the following.

Conjecture 8.1.1 Any source over {0, 1}^m that admits secure (2, 2)-secret sharing of n bits also provides secure encryption of n bits if m ≤ 2n.

Alternatively, we can try to directly prove that sources admitting secure secret sharing of n bits are extractable to almost n bits.

Question 8.1.1 Is any source over {0, 1}^m that admits secure (2, 2)-secret sharing of n bits extractable to n bits if m ≤ 2n?

The above two questions are equivalent, since a proof of one yields a proof of the other. This is because random sources over an m-bit space that admit secure encryption of n bits are extractable (Theorem 3.4.4).

Investigating the same questions for general access structures is another interesting direction. So far we have only considered (2, 2)-secret sharing, and the question for more general access structures (e.g. (n, t)-threshold secret sharing) remains unsolved. We investigated the randomness requirements of (3, 3)-secret sharing in Theorems 3.5.4 and 3.5.5, where we showed that two random sources that admit secure encryption or secure (2, 2)-secret sharing also admit secure (3, 3)-secret sharing. For (3, 3)-secret sharing of n bits, we require the first source to admit secure encryption of n bits, and the second one must provide security for t bits, where t is the output length of the first encryption. So we can ask the following questions.

Question 8.1.2 Is it possible to build secure (3, 3)-secret sharing from any two random sources that admit secure encryption of n bits?

A seemingly harder question is whether it is possible to build secure encryption of 2n bits

from secure (3, 3)-secret sharing of n bits.

Question 8.1.3 Does any source that admits secure (3, 3)-secret sharing of n bits also admit secure encryption of t ≤ 2n bits?

Note that we cannot encrypt more than 2n bits securely. This is because we know that

secure (3, 3)-secret sharing of n bits requires at least 2n bits of randomness (Theorem 2.4.4).

Thus a uniformly distributed random source that admits secure (3, 3)-secret sharing admits secure encryption of 2n bits. With the same argument, one can ask whether sources that admit secure (n, t)-secret sharing of n bits admit secure encryption of m bits where 1 ≤ m ≤ (t − 1)n.

8.1.2 Guessing secrecy

Guessing secrecy is a weaker notion of secrecy compared to indistinguishability, and we showed that some of the randomness properties required to provide guessing secrecy are weaker than what is required for indistinguishability. The guessing secrecy definition can be extended to achieve a higher level of security as follows. Suppose that an encryption system provides ε-guessing secrecy; then one can show that this is equivalent to the following definition:

    max_A Pr[A(enc_K(X)) = X] − max_{A′} Pr[A′() = X] ≤ ε,

where A is any adversarial function that, given an input (which may be null), outputs a guess for the message. Basically, A is the adversary that tries to guess the value of X. The above equivalence holds because the best strategy of the adversary in guessing the message X is to guess the most

probable value of the message:

    max_A Pr[A() = X] = G(X),

since the best strategy of an adversary in guessing the message, when given no information

about the message (except what is already known), is to output the value with the maximum

probability. Using the same argument, we can also prove that

    max_A Pr[A(enc_K(X)) = X] = G(X | enc_K(X)).
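As a small illustration of these quantities (assuming the standard definitions G(X) = max_x Pr[X = x] and G(X | Y) = Σ_y max_x Pr[X = x, Y = y]; the thesis's formal definitions are given in Chapter 4), the following sketch computes them for finite distributions and checks the one-bit one-time pad:

    from collections import defaultdict

    def G(px):
        # Guessing probability G(X) = max_x Pr[X = x]: the best blind guess.
        return max(px.values())

    def G_cond(pxy):
        # Conditional guessing probability G(X | Y) = sum_y max_x Pr[X = x, Y = y]:
        # the adversary sees y and guesses the most likely x for that y.
        best = defaultdict(float)
        for (x, y), p in pxy.items():
            best[y] = max(best[y], p)
        return sum(best.values())

    # One-bit one-time pad with uniform key K and uniform message X.
    # The ciphertext C = X xor K reveals nothing, so G(X | C) = G(X) = 1/2.
    px = {0: 0.5, 1: 0.5}
    pxc = {(x, x ^ k): 0.25 for x in (0, 1) for k in (0, 1)}
    assert G(px) == 0.5 and abs(G_cond(pxc) - 0.5) < 1e-12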

But the above equivalence can be extended to one of the strongest notions of secrecy called

“Semantic security” defined as follows:

Definition 8.1.1 An encryption system (enc, dec) provides ε-semantic security if for all message distributions X, all functions f, and all adversarial functions A, there exists an adversarial function A′ such that

    Pr[A(enc_K(X)) = f(X)] − Pr[A′() = f(X)] ≤ ε.

It can be shown that the above definition implies the following

    max_A Pr[A(enc_K(X)) = f(X)] − max_{A′} Pr[A′() = f(X)] ≤ ε,

where A and A′ are adversarial functions and the above holds for all functions f over the message space X. But this definition is equivalent to ε-guessing secrecy of f(X):

    G(f(X) | enc_K(X)) − G(f(X)) ≤ ε.

The above argument shows that semantic security is equivalent to ε-guessing secrecy over all functions f of the message. The notion of ε-guessing secrecy we investigated in Chapter 4 is thus equivalent to semantic security restricted to the identity function only, i.e. f(X) = X.

An interesting research question is to investigate the secret key requirements of ε-guessing secrecy for a family of functions, e.g. functions with output length n − 1 bits of an n-bit message.

Definition 8.1.2 An encryption system (enc, dec) provides ε-FGS(F) if for every function f ∈ F, the following holds:

    G(f(X) | enc_K(X)) − G(f(X)) ≤ ε.

One may hope that for a restricted family of functions F, a smaller key length can be achieved

that provides ε-FGS(F). Hence we ask the following.

Question 8.1.4 What interesting families of functions F can be used to build encryption systems providing ε-FGS(F) such that smaller key lengths (compared to the length of the message) are achievable?

Another interesting direction is to compare guessing secrecy over functions with other notions

of secrecy such as indistinguishability and entropic security.

8.2 Generation of random numbers

8.2.1 Randomness generators using human game-play

In Chapter 6, we discussed two approaches to generate randomness from human game-play. The approach discussed in Section 6.4 employs a game over expander graphs to generate randomness from human game-play over the graph. In this project, as a proof of concept, we developed a game over a small expander graph which showed the viability of the approach, but the rate of randomness generation was low due to the small graph used. To achieve a higher rate of randomness generation, one needs to extend this game to much larger expander graphs, such that the human has more choices to make in the initial sub-game. However, the graph must remain 3-regular, since the human choices in the second sub-game must be small according to the discussion in Section 6.4.3. There are two challenges in designing games with larger expander graphs. One is using the right method to display a large graph (e.g. of 2^32 nodes) on the screen properly such that the user can make the initial choice. The second challenge is that the graph representation should not bias users in picking certain nodes of the graph. For example, humans may avoid selecting end nodes if the nodes are arranged in a line. So it seems that a good design for the graph is to use a completely symmetric layout for all nodes, for example a circular layout.

For the approach discussed in Section 6.3 where we proposed to use human game-play in video game, we only considered one source of randomness from the game-play. That is the

error in shooting the target. We explained some of the other interesting sources of randomness

in the game, such as timing of actions and the location of mouse clicks (or the movement of the mouse). Including these sources of randomness in the final output is an interesting approach that will possibly increase the rate of randomness generation. One needs to be careful when mixing randomness from these sources, since they may have dependent values. For example, spending more time aiming at the target will decrease the errors in shooting, so there is a correlation between the source that measures the timing of actions and the source that measures the errors in shooting. This should be considered when designing the mixing function, to avoid poor randomness in the output. Another interesting direction is to consider other types of games, such as strategy, arcade, racing and platform games, for randomness generation, and how the choice of game type can affect the rate of randomness generation.

Another interesting direction is to analyze the security of TRG designs, specifically the widely used TRGs such as the Linux and Windows TRGs. The randomness from these TRGs is generated and used under various conditions: desktop, laptop, tablet, special devices (routers), and cloud environments. Investigating the quality of the randomness generated under different conditions is interesting and challenging. For example, in [GPR06], certain weaknesses of the Linux random number generator were discussed in general. In [LHA+12, HDWH12], the authors found flaws in the Linux TRG on special devices (such as routers), due to the fewer sources of randomness available in such devices. An interesting open question is to study the Linux random number generator in a cloud environment, where the virtual machines all served by one host may generate correlated randomness. This correlation may exist among the virtual machines or between one virtual machine and the host.

8.2.2 Randomness for authentication

In Chapter 7, we discussed an approach to use the randomness in human game-play for authentication purposes. Moreover, our experiments provided evidence that authentication based on game-play provides a property we called Hard to Delegate, which prevents the users of the system from delegating their authentication information even to their friends. The choice of video games for this purpose was mainly for three reasons: 1- video games are among the more complex tasks and provide a platform to collect many features from a human in order to distinguish among them, 2- even though they are complex, they are relatively simple to learn, and 3- they are entertaining enough to attract many users.

Our experiments showed that the more features are collected from a human, the more distinguishability is achieved among the users. Therefore the main interesting extension of this work is to consider more complex game designs that require complex interaction with the human. In Section 7.3.2, we provided a number of human traits that can be collected in games, such as timing of actions, mouse movement behavior, and skill traits such as the error in shooting the target. A game that can collect more of these traits at the same time is more likely to distinguish better among a group of people, since it measures more of the human's abilities. It may be challenging to design games that remain simple to learn even when a large number of features is collected from the human.

Bibliography

[AB09] Sanjeev Arora and Boaz Barak. Computational complexity: a modern approach,

volume 1. Cambridge University Press Cambridge, UK, 2009.

[AFN13] Hashem Alayed, Fotos Frangoudes, and Clifford Neuman. Behavioral-based

cheating detection in online first person shooters using machine learning tech-

niques. In Computational Intelligence in Games (CIG), 2013 IEEE Conference

on, pages 1–8, 2013.

[AJ09] Naveed Ahmed and Christian D Jensen. A mechanism for identity delegation

at authentication level. In Identity and Privacy in the Internet Age, pages

148–162. Springer, 2009.

[AKK09] Giuseppe Ateniese, Seny Kamara, and Jonathan Katz. Proofs of storage from

homomorphic identification protocols. In Proceedings of the 15th International Conference on the Theory and Application of Cryptology and Information

Security: Advances in Cryptology, ASIACRYPT ’09, pages 319–333, Berlin,

Heidelberg, 2009. Springer.

[Ali13a] Mohsen Alimomeni. Archery game, 2013. http://pages.cpsc.ucalgary.ca/

~malimome/game/.

[Ali13b] Mohsen Alimomeni. Sheep-wolf game to generate true random numbers by

human., 2013. http://pages.cpsc.ucalgary.ca/~malimome/expander/.

[AR99] Yonatan Aumann and Michael O. Rabin. Information theoretically secure

communication in the limited storage space model. In Proceedings of the

19th Annual International Cryptology Conference on Advances in Cryptology,

CRYPTO ’99, pages 65–79, London, UK, 1999. Springer.

[ASL+05] L.C.F. Araujo, L.H.R. Sucupira Jr., M.G. Lizarraga, L.L. Ling, and J.B.T. Yabu-Uti. User authentication through typing biometrics features. IEEE Transactions on Signal Processing, 53(2):851–855, 2005.

[ASN12] Mohsen Alimomeni and Reihaneh Safavi-Naini. Guessing secrecy. In Proceedings

of the 6th International Conference on Information Theoretic Security, ICITS’12,

pages 1–13. Springer, Berlin, Heidelberg, 2012.

[ASN14] Mohsen Alimomeni and Reihaneh Safavi-Naini. Human assisted randomness

generation using video games. IACR Cryptology ePrint Archive, 2014:45, 2014.

[ASS13] Mohsen Alimomeni, Reihaneh Safavi-Naini, and Setareh Sharifian. A true

random generator using human gameplay. In Decision and Game Theory for Security - 4th International Conference, GameSec 2013, Fort Worth, TX, USA,

November 11-12, 2013. Proceedings, pages 10–28. 2013.

[aws] Amazon web services (AWS). http://aws.amazon.com.

[BBR88] Charles H. Bennett, Gilles Brassard, and Jean-Marc Robert. Privacy amplifica-

tion by public discussion. SIAM J. Comput., 17(2):210–229, 1988.

[BC94] Stefan Brands and David Chaum. Distance-bounding protocols. In Work- shop on the Theory and Application of Cryptographic Techniques on Advances

in Cryptology, EUROCRYPT ’93, pages 344–359, Secaucus, NJ, USA, 1994.

Springer New York, Inc.

[BCR10] Darrell Bethea, Robert A. Cochran, and Michael K. Reiter. Server-side verifi-

cation of client behavior in online games. In In Proceedings of the 17th ISOC

Network and Distributed System Security Symposium, pages 21–36, 2010.

[BD07] Carl Bosley and Yevgeniy Dodis. Does privacy require true randomness? In

Proceedings of the 4th conference on Theory of cryptography, TCC’07, pages

1–20, Berlin, Heidelberg, 2007. Springer.

[BDJ14] Kevin Bowers, Tamara S. Denning, and Ari Juels. Methods and apparatus

for authenticating a user based on implicit user memory, 2014. US Patent

8,627,421.

[BDS11] Karyn Benson, Rafael Dowsley, and Hovav Shacham. Do you know where your

cloud files are? In Proceedings of the 3rd ACM Workshop on Cloud Computing

Security Workshop, CCSW ’11, pages 73–82, New York, NY, USA, 2011. ACM.

[BDSV96] Carlo Blundo, Alfredo De Santis, and Ugo Vaccaro. Randomness in distribution

protocols. Information and Computation, 131(2):111–139, 1996.

[BH10] Ayad Barsoum and Anwar M. Hasan. Provable possession and replica-

tion of data over cloud servers. Technical Report CACR 2010-32, Univer-

sity of Waterloo, http://www.cacr.math.uwaterloo.ca/techreports/2010/

cacr2010-32.pdf, 2010.

[BJO09] Kevin D. Bowers, Ari Juels, and Alina Oprea. Proofs of retrievability: Theory

and implementation. In Proceedings of the 2009 ACM Workshop on Cloud

Computing Security, CCSW ’09, pages 43–54, New York, NY, USA, 2009. ACM.

[BK12] Elaine Barker and John Kelsey. Recommendation for the entropy sources

used for random bit generation, 2012. http://csrc.nist.gov/publications/

drafts/800-90/draft-sp800-90b.pdf.

[BL01] Nathaniel E. Baughman and Brian N. Levine. Cheat-proof playout for central-

ized and distributed online games. In INFOCOM 2001. Twentieth Annual Joint

Conference of the IEEE Computer and Communications Societies. Proceedings.,

volume 1, pages 104–113 vol.1, 2001.

199 [BLS01] Dan Boneh, Ben Lynn, and Hovav Shacham. Short signatures from the weil

pairing. In Colin Boyd, editor, ASIACRYPT, volume 2248 of Lecture Notes in

Computer Science, pages 514–532. Springer, 2001.

[Blu86] Manuel Blum. Independent unbiased coin flips from a correlated biased

source: a finite state Markov chain. Combinatorica, 6(2):97–108, 1986.

[BM07] Shane Balfe and Anish Mohammed. Final fantasy: Securing on-line gaming

with trusted computing. In Proceedings of the 4th International Conference on

Autonomic and Trusted Computing, ATC’07, pages 123–134, Berlin, Heidelberg,

2007. Springer.

[BPW12] Alexandra Boldyreva, Adriana Palacio, and Bogdan Warinschi. Secure proxy

signature schemes for delegation of signing rights. Journal of Cryptology,

25(1):57–115, 2012.

[BRS+10] Lawrence E. Bassham, III, Andrew L. Rukhin, Juan Soto, James R. Nechvatal,

Miles E. Smid, Elaine B. Barker, Stefan D. Leigh, Mark Levenson, Mark Vangel,

David L. Banks, Nathanael Alan Heckert, James F. Dray, and San Vo. Sp

800-22 rev. 1a. a statistical test suite for random and pseudorandom number

generators for cryptographic applications. Technical report, Gaithersburg, MD,

United States, 2010.

[BSR+12] Hristo Bojinov, Daniel Sanchez, Paul Reber, Dan Boneh, and Patrick Lincoln.

Neuroscience meets cryptography: Designing crypto primitives secure against

rubber hose attacks. In Proceedings of the 21st USENIX Conference on Security

Symposium, Security’12, pages 33–33, Berkeley, CA, USA, 2012. USENIX

Association.

[BST03] Boaz Barak, Ronen Shaltiel, and Eran Tromer. True random number generators

secure in a changing environment. In Colin D. Walter, Çetin K. Koç, and Christof Paar, editors, Cryptographic Hardware and Embedded Systems - CHES 2003,

volume 2779 of Lecture Notes in Computer Science, pages 166–180. Springer

Berlin Heidelberg, 2003.

[But12] J. Michael Butler. Privileged password sharing:“root” of all evil,

2012. https://www.sans.org/reading-room/analysts-program/

quest-password-sharing.

[BvDJ+11] Kevin D. Bowers, Marten van Dijk, Ari Juels, Alina Oprea, and Ronald L.

Rivest. How to tell if your cloud files are vulnerable to drive crashes. In Proceedings of the 18th ACM Conference on Computer and Communications

Security, CCS ’11, pages 501–514, New York, NY, USA, 2011. ACM.

[CG88] Benny Chor and Oded Goldreich. Unbiased bits from sources of weak ran-

domness and probabilistic communication complexity. SIAM J. Comput.,

17:230–261, 1988.

[CH07] Kuan-Ta Chen and Li-Wen Hong. User identification based on game-play

activity patterns. In Proceedings of the 6th ACM SIGCOMM Workshop on

Network and System Support for Games, NetGames ’07, pages 7–12, New York,

NY, USA, 2007. ACM.

[Che13] William Cheswick. Rethinking passwords. Commun. ACM, 56(2):40–44, 2013.

[CK78] Imre Csiszar and Janos Korner. Broadcast channels with confidential messages.

Information Theory, IEEE Transactions on, 24(3):339 – 348, 1978.

[CKBA08] Reza Curtmola, Osama Khan, Randal C. Burns, and Giuseppe Ateniese. Mr-

pdp: Multiple-replica provable data possession. In ICDCS, pages 411–420.

IEEE Computer Society, 2008.

201 [CM97a] Christian Cachin and Ueli Maurer. Entropy measures and unconditional security

in cryptography, 1997.

[CM97b] Christian Cachin and Ueli Maurer. Unconditional security against memory-

bounded adversaries. In Proceedings of the 17th Annual International Cryptology

Conference on Advances in Cryptology, CRYPTO ’97, pages 292–306, London,

UK, UK, 1997. Springer.

[Cor11] Lieberman Software Corporation. Password practices and outcomes, 2011.

http://www.liebsoft.com/uploadedFiles/wwwliebsoftcom/MARCOM/

Press/Content/2011-Password-Survey.pdf.

[CPC08] Kuan-Ta Chen, Hsing-Kuo Kenneth Pao, and Hong-Chung Chang. Game

bot identification based on manifold learning. In Proceedings of the 7th ACM

SIGCOMM Workshop on Network and System Support for Games, NetGames

’08, pages 21–26, New York, NY, USA, 2008. ACM.

[Csi96] Imre Csiszár. Almost independence and secrecy capacity. Problemy Peredachi

Informatsii, 32(1):48–57, 1996.

[CT91] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley

Series in Telecommunications. John Wiley & Sons, Inc., 2nd edition, 1991.

[DBvDJ11] Tamara Denning, Kevin Bowers, Marten van Dijk, and Ari Juels. Exploring

implicit memory for painless password recovery. In Proceedings of the SIGCHI

Conference on Human Factors in Computing Systems, CHI ’11, pages 2615–2618,

New York, NY, USA, 2011. ACM.

[DLAMV12] Yevgeniy Dodis, Adriana López-Alt, Ilya Mironov, and Salil Vadhan. Differen-

tial privacy with imperfect randomness. In Advances in Cryptology–CRYPTO

2012, pages 497–516. Springer, 2012.

202 [DM07] Saar Drimer and Steven J. Murdoch. Keep your enemies close: Distance

bounding against smartcard relay attacks. In Proceedings of 16th USENIX

Security Symposium on USENIX Security Symposium, SS’07, pages 7:1–7:16,

Berkeley, CA, USA, 2007. USENIX Association.

[Dod12] Yevgeniy Dodis. Shannon impossibility, revisited. In Proceedings of the 6th

International Conference on Information Theoretic Security, ICITS’12, pages

100–110, Berlin, Heidelberg, 2012. Springer.

[DOPS04] Yevgeniy Dodis, Shien Jin Ong, Manoj Prabhakaran, and Amit Sahai. On the

(im)possibility of cryptography with imperfect randomness. In Proceedings of

the 45th Annual IEEE Symposium on Foundations of Computer Science, pages

196–205, Washington, DC, USA, 2004. IEEE Computer Society.

[DORS08] Yevgeniy Dodis, Rafail Ostrovsky, Leonid Reyzin, and Adam Smith. Fuzzy

extractors: How to generate strong keys from biometrics and other noisy data.

SIAM J. Comput., 38(1):97–139, 2008.

[DPP06] Yevgeniy Dodis, Krzysztof Pietrzak, and Bartosz Przydatek. Separating sources

for encryption and secret sharing. In Shai Halevi and Tal Rabin, editors, Theory

of Cryptography, volume 3876 of Lecture Notes in Computer Science, pages

601–616. Springer Berlin Heidelberg, 2006.

[DR02] Yan Zong Ding and Michael O. Rabin. Hyper-encryption and everlasting

security. In Proceedings of the 19th Annual Symposium on Theoretical Aspects

of Computer Science, STACS ’02, pages 1–26, London, UK, UK, 2002. Springer.

[DS02] Yevgeniy Dodis and Joel Spencer. On the (non)universality of the one-time pad.

In Proceedings of the 43rd Symposium on Foundations of Computer Science,

FOCS ’02, pages 376–, Washington, DC, USA, 2002. IEEE Computer Society.

203 [DS05] Yevgeniy Dodis and Adam Smith. Entropic security and the encryption of

high entropy messages. In Joe Kilian, editor, Theory of Cryptography, volume

3378 of Lecture Notes in Computer Science, pages 556–577. Springer Berlin /

Heidelberg, 2005.

[DW09] Yevgeniy Dodis and Daniel Wichs. Non-malleable extractors and symmetric

key cryptography from weak secrets. In STOC ’09: Proceedings of the 41st

annual ACM symposium on Theory of computing, pages 601–610, New York,

NY, USA, 2009. ACM.

[ea10] Rukhin et al. A statistical test suite for the validation of random number

generators and pseudo random number generators for cryptographic appli-

cations, 2010. http://csrc.nist.gov/groups/ST/toolkit/rng/documents/

SP800-22rev1a.pdf.

[Ent13] Verison Enterprise. Us developer outsourced his job to china, 2013. http:

//tnw.co/1kC7tCP.

[FKS08] Wu-chang Feng, Ed Kaiser, and Travis Schluessler. Stealth measurements for

cheat detection in on-line games. In Proceedings of the 7th ACM SIGCOMM

Workshop on Network and System Support for Games, NetGames ’08, pages

15–20, New York, NY, USA, 2008. ACM.

[Gei] Michael Geist. Location matters up in the cloud. http://www.thestar.com/

business/article/901068--geist-location-matters-up-in-the-cloud.

[GF04] Hugo Gamboa and Ana Fred. A behavioral biometric system based on human-

computer interaction. In Defense and Security, pages 381–392. International

Society for Optics and Photonics, 2004.

[GGWL10] Phillipa Gill, Yashar Ganjali, Bernard Wong, and David Lie. Dude, where’s

that ip?: Circumventing measurement-based ip geolocation. In Proceedings of

the 19th USENIX Conference on Security, USENIX Security’10, pages 16–16,

Berkeley, CA, USA, 2010. USENIX Association.

[GM82] Shafi Goldwasser and Silvio Micali. Probabilistic encryption & how to

play mental poker keeping secret all partial information. In Proceedings of

the Fourteenth Annual ACM Symposium on Theory of Computing, STOC ’82,

pages 365–377, New York, NY, USA, 1982. ACM.

[GMS74] Edward N. Gilbert, Florence J. MacWilliams, and Neil J. A. Sloane. Codes

which detect deception. Bell System Technical Journal, 53(3):405–424, 1974.

[Gov] Canadian Government. Personal information protection and electronic docu-

ments act. http://laws-lois.justice.gc.ca/PDF/P-8.6.pdf.

[GPR06] Zvi Gutterman, Benny Pinkas, and Tzachy Reinman. Analysis of the linux

random number generator. In Proceedings of the 2006 IEEE Symposium on

Security and Privacy, SP ’06, pages 371–385, Washington, DC, USA, 2006.

IEEE Computer Society.

[GSV05] Shafi Goldwasser, Madhu Sudan, and Vinod Vaikuntanathan. Distributed

computing with imperfect randomness. In Proceedings of the 19th Interna-

tional Conference on Distributed Computing, DISC’05, pages 288–302, Berlin,

Heidelberg, 2005. Springer.

[GW96] Ian Goldberg and David Wagner. Randomness and the netscape browser. Dr

Dobb’s Journal-Software Tools for the Professional Programmer, 21(1):66–71,

1996.

[GWXW09] Steven Gianvecchio, Zhenyu Wu, Mengjun Xie, and Haining Wang. Battle of

botcraft: Fighting bots in online games with human observational proofs. In

205 Proceedings of the 16th ACM Conference on Computer and Communications

Security, CCS ’09, pages 256–268, New York, NY, USA, 2009. ACM.

[HARD10] Andreas Haeberlen, Paarijaat Aditya, Rodrigo Rodrigues, and Peter Druschel.

Accountable virtual machines. In Proceedings of the 9th USENIX Conference on

Operating Systems Design and Implementation, OSDI’10, pages 1–16, Berkeley,

CA, USA, 2010. USENIX Association.

[HDWH12] Nadia Heninger, Zakir Durumeric, Eric Wustrow, and J. Alex Halderman.

Mining your ps and qs: detection of widespread weak keys in network devices. In

Proceedings of the 21st USENIX conference on Security symposium, Security’12,

pages 35–35, Berkeley, CA, USA, 2012. USENIX Association.

[HK05] G.P. Hancke and M.G. Kuhn. An rfid distance bounding protocol. In Se- curity and Privacy for Emerging Areas in Communications Networks, 2005.

SecureComm 2005. First International Conference on, pages 67–73, 2005.

[HLW06] Shlomo Hoory, Nathan Linial, and Avi Wigderson. Expander graphs and their

applications. Bulletin of the AMS, 43(4):439–561, 2006.

[HN09] Ran Halprin and Moni Naor. Games for extracting randomness. In Proceedings

of the 5th Symposium on Usable Privacy and Security, SOUPS ’09, pages

12:1–12:12, New York, NY, USA, 2009. ACM.

[Hot10] George Hotz. Console hacking 2010-ps3 epic fail. In 27th Chaos Communica-

tions Congress, 2010.

[HSMW06] Xinyi Huang, Willy Susilo, Yi Mu, and Wei Wu. Universal designated verifier

signature without delegatability. In Peng Ning, Sihan Qing, and Ninghui Li,

editors, Information and Communications Security, volume 4307 of Lecture

Notes in Computer Science, pages 479–498. Springer Berlin Heidelberg, 2006.

206 [ILL89] Russel Impagliazzo, Leonid A. Levin, and Michael Luby. Pseudo-random

generation from one-way functions. In Proceedings of the Twenty-first Annual

ACM Symposium on Theory of Computing, STOC ’89, pages 12–24, New York,

NY, USA, 1989. ACM.

[IO11] Mitsugu Iwamoto and Kazuo Ohta. Security notions for information theoreti-

cally secure encryptions. In Information Theory Proceedings (ISIT), 2011 IEEE

International Symposium on, pages 1777 –1781, 2011.

[IS14] Mitsugu Iwamoto and Junji Shikata. Information theoretic security for encryp-

tion based on conditional Rényi entropies. In Carles Padró, editor, Information

Theoretic Security, Lecture Notes in Computer Science, pages 103–121. Springer

International Publishing, 2014.

[Jia13] Shaoquan Jiang. On unconditional ε-security of private key encryption. The

Computer Journal, 2013.

[JKJ07] Ari Juels and Burton S. Kaliski Jr. Pors: Proofs of retrievability for large files.

In Proceedings of the 14th ACM Conference on Computer and Communications

Security, CCS ’07, pages 584–597, New York, NY, USA, 2007. ACM.

[JY11] Zach Jorgensen and Ting Yu. On mouse dynamics as a behavioral biometric

for authentication. In Proceedings of the 6th ACM Symposium on Information,

Computer and Communications Security, ASIACCS 2011, Hong Kong, China,

pages 476–482, 2011.

[Kas03] Saul Kassin. Psychology, Fourth Edition. Prentice Hall, 2003.

[Kay11] Joseph ’Jofish’ Kaye. Self-reported password sharing strategies. In Proceedings

of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’11,

pages 2619–2622, New York, NY, USA, 2011. ACM.

207 [KBJK+06a] Ethan Katz-Bassett, John P. John, Arvind Krishnamurthy, David Wetherall,

Thomas Anderson, and Yatin Chawathe. Towards ip geolocation using delay and

topology measurements. In Proceedings of the 6th ACM SIGCOMM Conference

on Internet Measurement, IMC ’06, pages 71–84, New York, NY, USA, 2006.

ACM.

[KBJK+06b] Ethan Katz-Bassett, John P. John, Arvind Krishnamurthy, David Wetherall,

Thomas Anderson, and Yatin Chawathe. Towards ip geolocation using delay and

topology measurements. In Proceedings of the 6th ACM SIGCOMM Conference

on Internet Measurement, IMC ’06, pages 71–84, New York, NY, USA, 2006.

ACM.

[KL07] Jonathan Katz and Yehuda Lindell. Introduction to Modern Cryptography

(Chapman & Hall/Crc Cryptography and Network Security Series). Chapman

& Hall/CRC, 2007.

[KPT11] Akinori Kawachi, Christopher Portmann, and Keisuke Tanaka. Characterization

of the relations between information-theoretic non-malleability, secrecy, and

authenticity. In Proceedings of the 5th International Conference on Information

Theoretic Security, ICITS’11, pages 6–24, Berlin, Heidelberg, 2011. Springer.

[Lav] Terry Lavender. Getting the surveymonkey off our backs.

http://www.vancouverobserver.com/blogs/megabytes/2011/05/05/

getting-surveymonkey-our-backs.

[LHA+12] Arjen K. Lenstra, James P. Hughes, Maxime Augier, Joppe W. Bos, Thorsten

Kleinjung, and Christophe Wachter. Public keys. In Advances in Cryptology

CRYPTO 2012, volume 7417 of Lecture Notes in Computer Science, pages

626–642. Springer Berlin Heidelberg, 2012.

208 [LPS86] Alexander Lubotzky, Ralph Phillips, and Peter Sarnak. Explicit expanders

and the ramanujan conjectures. In Proceedings of the eighteenth annual ACM

symposium on Theory of computing, pages 240–246. ACM, 1986.

[LS07a] Pierre L’Ecuyer and Richard Simard. Testu01, 2007. http://www.iro.

umontreal.ca/~simardr/testu01/tu01.html.

[LS07b] Pierre L’Ecuyer and Richard Simard. Testu01: A c library for empirical testing

of random number generators. ACM Trans. Math. Softw., 33(4), 2007.

[Mar98] George Marsaglia. Diehard, 1998. http://www.stat.fsu.edu/pub/diehard/.

[Mas94] James L. Massey. Guessing and entropy. In In Proceedings of the 1994 IEEE

International Symposium on Information Theory, page 204, 1994.

[Mau92] Ueli Maurer. Conditionally-perfect secrecy and a provably-secure randomized

cipher. J. Cryptol., 5(1):53–66, 1992.

[MGM06] Christian Mönch, Gisle Grimen, and Roger Midtstraum. Protecting online

games against cheating. In Proceedings of 5th ACM SIGCOMM Workshop on

Network and System Support for Games, NetGames ’06, New York, NY, USA,

2006. ACM.

[MMC+11] Tarik Mustafić, Arik Messerman, Seyit Ahmet Camtepe, Aubrey-Derrick

Schmidt, and Sahin Albayrak. Behavioral biometrics for persistent single

sign-on. In Proceedings of the 7th ACM Workshop on Digital Identity Manage-

ment, DIM ’11, pages 73–82, New York, NY, USA, 2011. ACM.

[MP91] James L. McInnes and Benny Pinkas. On the impossibility of private key

cryptography with weakly random keys. In Proceedings of the 10th Annual

International Cryptology Conference on Advances in Cryptology, CRYPTO ’90,

pages 421–435, London, UK, 1991. Springer.

209 [MR00] Fabian Monrose and Aviel D Rubin. Keystroke dynamics as a biometric for

authentication. Future Generation Computer Systems, 16(4):351–359, 2000.

[MSVV07] Aranyak Mehta, Amin Saberi, Umesh Vazirani, and Vijay Vazirani. Adwords

and generalized online matching. volume 54, New York, NY, USA, 2007. ACM.

[MW97] Ueli M. Maurer and Stefan Wolf. Privacy amplification secure against active

adversaries. In Proceedings of the 17th Annual International Cryptology Con-

ference on Advances in Cryptology, CRYPTO ’97, pages 307–321, London, UK,

UK, 1997. Springer.

[MW06] Mingchao Ma and Steve Woodhead. Authentication delegation for subscription-

based remote network services. Computers & Security, 25(5):371 – 378, 2006.

[MY12] Ryan McDaniel and Roman V. Yampolskiy. Development of embedded captcha

elements for bot prevention in fischer random chess. Int. J. Comput. Games

Technol., 2012:2:2–2:2, 2012.

[NTS99] Noam Nisan and Amnon Ta-Shma. Extracting randomness: A survey and new

constructions. Journal of Computer and System Sciences, 58(1):148–173, 1999.

[NZ96a] Noam Nisan and David Zuckerman. Randomness is linear in space. J. Comput.

Syst. Sci., 52(1):43–52, 1996.

[NZ96b] Noam Nisan and David Zuckerman. Randomness is linear in space. J. Comput.

Syst. Sci., 52(1):43–52, 1996.

[PB04] Maja Pusara and Carla E. Brodley. User re-authentication via mouse move-

ments. In Proceedings of the 2004 ACM Workshop on Visualization and Data

Mining for Computer Security, VizSEC/DMSEC ’04, pages 1–8, New York,

NY, USA, 2004. ACM.

210 [PGB11] Zachary N. J. Peterson, Mark Gondree, and Robert Beverly. A position paper

on data sovereignty: The importance of geolocating data in the cloud. In

Proceedings of the 3rd USENIX Conference on Hot Topics in Cloud Computing,

HotCloud’11, pages 9–9, Berkeley, CA, USA, 2011. USENIX Association.

[Pla] Planet-lab. http://www.planet-lab.org/.

[RB92] Anatol Rapoport and David V. Budescu. Generation of random series in

two-person strictly competitive games. Journal of Experimental Psychology:

General, 121(3):352, 1992.

[Rev08] Kenneth Revett. Behavioral Biometrics: A Remote Access Approach. John

Wiley & Sons, Ltd, 2008.

[RTS00] Jaikumar Radhakrishnan and Amnon Ta-Shma. Bounds for dispersers, extrac-

tors, and depth-two superconcentrators. SIAM J. Discret. Math., 13(1):2–24,

2000.

[RVW00] Omer Reingold, Salil Vadhan, and Avi Wigderson. Entropy waves, the zig-

zag graph product, and new constant-degree expanders and extractors. In

Proceedings of 41st Annual Symposium on Foundations of Computer Science,

pages 3–13, 2000.

[RW02] Alexander Russell and Hong Wang. How to fool an unbounded adversary with a

short key. In Proceedings of the International Conference on the Theory and Ap-

plications of Cryptographic Techniques: Advances in Cryptology, EUROCRYPT

’02, pages 133–148, London, UK, UK, 2002. Springer.

[SCD+07] Supriya Singh, Anuja Cabraal, Catherine Demosthenous, Gunela Astbrink, and

Michele Furlong. Password sharing: Implications for security design based on

social practice. In Proceedings of the SIGCHI Conference on Human Factors in

Computing Systems, CHI ’07, pages 895–904, New York, NY, USA, 2007. ACM.

[SGJ07] Travis Schluessler, Stephen Goglin, and Erik Johnson. Is a bot at the controls?:

Detecting input data attacks. In Proceedings of the 6th ACM SIGCOMM

Workshop on Network and System Support for Games, NetGames ’07, pages

1–6, New York, NY, USA, 2007. ACM.

[Sha48] Claude E. Shannon. A mathematical theory of communication. Bell System

Technical Journal, 27:379–423, 1948.

[Sha49] Claude E. Shannon. Communication theory of secrecy systems. Bell System

Technical Journal, 28:656–715, 1949.

[Sha02] Ronen Shaltiel. Recent developments in explicit constructions of extractors.

Bulletin of the EATCS, 77:67–95, 2002.

[Sha11] Ronen Shaltiel. An introduction to randomness extractors. In Automata, Languages and Programming (ICALP 2011), volume 6756 of Lecture Notes in Computer Science, pages 21–41, 2011.

[Sip88] Michael Sipser. Expanders, randomness, or time versus space. J. Comput. Syst.

Sci., 36(3):379–383, 1988.

[SRB] Storage resource broker (SRB). http://www.sdsc.edu/srb/.

[Sti86] Douglas R. Stinson. Some constructions and bounds for authentication codes.

In CRYPTO, pages 418–425, 1986.

[Sti91] Douglas R. Stinson. Universal hashing and authentication codes. In CRYPTO,

pages 74–85, 1991.

[Sti92] Douglas R. Stinson. Combinatorial characterizations of authentication codes.

Des. Codes Cryptography, 2(2):175–187, 1992.

[Sti06] Douglas R. Stinson. Cryptography: Theory and Practice. The CRC Press series on discrete mathematics and its applications. Chapman & Hall/CRC, 2006.

[SV86] Miklos Santha and Umesh V. Vazirani. Generating quasi-random sequences from semi-random sources. Journal of Computer and System Sciences, 33(1):75–87, 1986.

[SW08] Hovav Shacham and Brent Waters. Compact proofs of retrievability. In Proceedings of the 14th International Conference on the Theory and Application of Cryptology and Information Security: Advances in Cryptology, ASIACRYPT '08, pages 90–107, Berlin, Heidelberg, 2008. Springer.

[TBB12] HaiYun Tian, Phillip J. Brooke, and Anne-Gwenn Bosser. Behaviour-based cheat detection in multiplayer games with Event-B. In Proceedings of the 9th International Conference on Integrated Formal Methods, IFM'12, pages 206–220, Berlin, Heidelberg, 2012. Springer.

[TC11] Greg Taylor and George Cox. Behind Intel's new random-number generator, 2011. http://spectrum.ieee.org/computing/hardware/behind-intels-new-randomnumber-generator/.

[TK10] Luan Bui The and Van Nguyen Khanh. GameGuard: A Windows-based software architecture for protecting online games against hackers. In Proceedings of the 2010 Symposium on Information and Communication Technology, SoICT '10, pages 171–178, New York, NY, USA, 2010. ACM.

[Val] Valve anti-cheat system (VAC). https://support.steampowered.com/kb_article.php?ref=7849-RADZ-6869.

[vN51a] John von Neumann. Various techniques used in connection with random digits. Applied Math Series, 12:36–38, 1951.

[vN51b] John von Neumann. Various techniques used in connection with random digits. J. Research Nat. Bur. Stand., Appl. Math. Series, 12:36–38, 1951.

[Wag72] Willem A. Wagenaar. Generation of random sequences by human subjects: A critical survey of literature. Psychological Bulletin, 77(1):65, 1972.

[Wal01] John Walker. HotBits: Genuine random numbers, generated by radioactive decay. Online: http://www.fourmilab.ch/hotbits, 2001.

[WC81] Mark N. Wegman and J. Lawrence Carter. New hash functions and their use in authentication and set equality. Journal of Computer and System Sciences, 22(3):265–279, 1981.

[WK12] Jiyoung Woo and Huy Kang Kim. Survey and research direction on online game security. In Proceedings of the Workshop at SIGGRAPH Asia, WASA '12, pages 19–25, New York, NY, USA, 2012. ACM.

[Wor13] Jenna Wortham. No TV? No subscription? No problem, 2013. nyti.ms/1rso2iX.

[WS07] Steven Daniel Webb and Sieteng Soh. Cheating in networked computer games: A review. In Proceedings of the 2nd International Conference on Digital Interactive Media in Entertainment and Arts, DIMEA '07, pages 105–112, New York, NY, USA, 2007. ACM.

[WSNA+12] Gaven J. Watson, Reihaneh Safavi-Naini, Mohsen Alimomeni, Michael E. Locasto, and Shivaramakrishnan Narayan. LoSt: Location based storage. In Proceedings of the 2012 ACM Workshop on Cloud Computing Security Workshop, CCSW '12, pages 59–70, New York, NY, USA, 2012. ACM.

[WSS07] Bernard Wong, Ivan Stoyanov, and Emin Gün Sirer. Octant: A comprehensive framework for the geolocalization of internet hosts. In Proceedings of the 4th USENIX Conference on Networked Systems Design & Implementation, NSDI'07, pages 23–23, Berkeley, CA, USA, 2007. USENIX Association.

[Wyn75] Aaron D. Wyner. The wire-tap channel. Bell Syst. Tech. J., 54(8):1355–1367, 1975.

[YG08] Roman V. Yampolskiy and Venu Govindaraju. Behavioural biometrics: a survey and classification. International Journal of Biometrics, 1(1):81–113, 2008.

[YG09] Roman V. Yampolskiy and Venu Govindaraju. Strategy based behavioural biometrics: a novel approach to automated identification. Int. J. Comput. Appl. Technol., 35(1):29–41, 2009.

[ZLwW+09] Qing Zhou, Xiaofeng Liao, Kwok-wo Wong, Yue Hu, and Di Xiao. True random number generator based on mouse movement and chaotic hash function. Information Sciences, 179(19):3442–3450, 2009.

[Zuc91] David Zuckerman. Simulating BPP using a general weak random source. In SFCS '91: Proceedings of the 32nd Annual Symposium on Foundations of Computer Science, pages 79–89, Washington, DC, USA, 1991. IEEE Computer Society.

Appendix A

LoSt: Location Based Storage

This chapter is devoted to the problem of verifying the location of files within distributed file storage systems such as the cloud. It is important to know the geographical location of sensitive data (such as health records) to determine applicable laws and regulations.

I have contributed about 10% to this work, mainly toward implementing the experiments over PlanetLab to estimate the geographical location of files on a real-world network of computers.

A.1 Introduction

With the rise in data outsourcing to large-scale distributed storage systems comes the obvious

risk to data owners of a loss in control over where this data is held (geographically). This

is an important issue since the laws which apply to data generally depend upon the legal

jurisdiction under which the data is held. Consider the example of a country’s health records.

Health services must store vast quantities of patients’ data and ensure these are adequately

accessible to health professionals across the country. Therefore, outsourcing management of

this data to storage service providers is an attractive solution. Such services integrate the

local storage nodes together with new storage nodes provided by the service provider, and

create a unified view of data for users of the system. Health and personal privacy laws of a

country, however, may place restrictions on the location of data. For example, the Canadian Personal Information Protection and Electronic Documents Act [Gov] requires data to be stored within Canada, and so any outsourcing of data storage must come with a guarantee about the location of the stored data. There are other cases where knowing the storage location of data is important: for example, whistle-blowers and human rights activists who wish to post data without the fear of the data being subject to subpoenas, or participants in online surveys who need to know where the survey results will be kept [Lav].

Distributed storage services use systems such as the Storage Resource Broker (SRB) [SRB] to give users a single global file system (including name space and file hierarchy) for storage of data files. One may view traditional cloud storage services such as Amazon [aws] as a special case of a distributed storage system where all the storage and data centers are owned and managed by a single storage provider.

The service provider in such distributed file storage systems can provide users with the

ability to specify the location of their data. The goal of this paper is to examine the challenges

of guaranteeing such a service, in particular providing an independent verification mechanism for users to verify the claim of the storage service provider. If a user can formally prove that a service provider is not storing files in the requested locations, then the user can hold the service

provider accountable for negligence. This could then result in legal proceedings and/or fines

being imposed on the service provider. Full details of the obligations of the provider would

be set out in their Service Level Agreement (SLA). The results of this paper highlight the

difficulty of providing a reliable mechanism to verify storage location within the current cloud

storage setting.

A.1.1 Setting Considered

In this paper we consider the problem of verifying a file’s location within a distributed storage

system where data centers are spread over a wide geographic area. The system is managed

by a service provider that allows users to choose one of a number of pre-specified regions for

file storage. The service provider may own a small number of data centers but also buys

extra storage space from other data centers as it receives requests from users. We assume

the service provider is not malicious but is willing to “cut corners” to increase its profit. As we will see in Section A.4, if the service provider is malicious, no location guarantee can be made.

The system works as follows: the user encodes a file and sends it for storage with the

request to be stored in a specific region (some geographic area). The storage service provider will then select storage servers whose claimed locations are within this region through

mechanisms such as auctions, which are outside the scope of this paper. A similar mechanism

is currently used by Google in advertising [MSVV07]. The provider may also perform some

extra processing including server specific coding, and finally sends the file to the chosen

servers. The user will be charged for storage reliability and timely access which requires

redundancy at the coding level. The user can request the list of servers that are storing file

copies, in the form of an Internet address for each server. The goal is to enable the user to verify that the file copies are actually stored on the servers in the list and that these servers are in the requested region.

Two mechanisms that appear relevant are:

• Internet Geolocation systems [WSS07, KBJK+06b] are aimed at determining the geographic location of a server (IP address). In such systems a trusted set of landmarks is used to query the server and measure the round trip time of the query, upper-bounding the distance of the server from the landmark (a simple illustration of this bound is sketched after this list). Using queries from multiple landmarks and trilateration, one can locate the server with reasonable accuracy. To verify the location of a file, however, one needs to determine the location of the claiming server and, in addition, show that the server actually holds the file.

• Proofs of Retrievability (PoR) were introduced [JKJ07] to allow servers to “prove” to the user that they are actually storing a file that has been submitted to the cloud for storage at a prior time, using a challenge-response protocol. There have been a number of papers studying PoRs (and the related notion of Provable Data Possession, PDP) further [SW08, BJO09, AKK09].
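As a rough illustration of the distance bound that such timing measurements give (a minimal sketch, not part of the original work), a signal cannot propagate faster than roughly two-thirds of the speed of light in fibre, so half of the round trip time upper-bounds the landmark-to-server distance; the propagation speed used below is an assumption.

    # Minimal sketch: upper-bound the landmark-to-server distance from a round trip time.
    # Assumes signals propagate at about 2/3 of the speed of light (typical for fibre).
    SPEED_KM_PER_MS = 200.0  # ~2e5 km/s expressed per millisecond

    def max_distance_km(rtt_ms: float) -> float:
        # One-way travel time is at most rtt/2, so distance <= speed * rtt / 2.
        return SPEED_KM_PER_MS * rtt_ms / 2.0

    print(max_distance_km(40.0))  # a 40 ms RTT bounds the server within ~4000 km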

A direct composition of these two mechanisms, however, is completely insecure

against a collusion attack where a number of storage servers collude to enable one of the

“colluders” to successfully make a false location claim. Note that there is a good incentive for

colluders to do this because this allows them to “sell” storage even if the server is not in the

region specified by the user. This will be discussed in detail in Section A.4.

A.1.2 Our Contribution

Proofs of Location. We introduce Proof of Location (PoL) schemes in the above setting,

that can be used by a client to obtain assurance about the location of a stored file. We

assume that, to provide reliability, the file is replicated by the storage service before being stored. Our definitions can be adapted to use other forms of redundancy for file storage.

The attack model is the following: A colluding group of servers, some holding an encoded

copy of the file and located within the region requested by the user, share their privileged

information (for example secret keys) with the aim of enabling another storage server that is

either, (i) not within a defined distance bound from all the landmarks or, (ii) does not hold

the file, to claim that it holds the file and is in the user's requested region R. The attack allows the colluders to use storage servers that are not in the requested region (and so possibly less

in demand), or claim to have stored the file without actually storing it, and in both cases

obtain some financial gain. We assume the service provider, although not malicious, wants

to minimize its costs and so when dealing with the data centers, does not take the required

precautions or use verification mechanisms that would be needed to ensure the quality of

storage and location of servers.

PoL schemes are used by the clients to verify the claim of the service providers. The scheme uses a challenge-response protocol between a storage server, with a claimed location, and a set of landmarks to verify that (i) the server is within a bounded distance from the landmark and (ii) holds the file. Using the result and execution time of the challenge-response

protocol from a sufficient number of landmarks the system creates a bounded region to

estimate the location of the storage server.

Colluders succeed in their goal if they can distort this region to include servers that are not in the requested region. We show that, without loss of generality, the only two possible adversarial strategies, shortening, where the server “pretends” to be closer to a landmark than it actually is, and lengthening, where the storage server “pretends” to be farther away,

can be defeated by the client generating one distinct coding of the file for each server that will hold the file. Using our model, we prove that security of this basic PoL reduces to the

security of the challenge-response protocol (a PoR in our case).

“Recoding” for Efficiency. This basic construction however substantially increases computa-

tion and communication costs of the user and makes the system impractical. To reduce these

costs we introduce a new property for PoR called re-coding, that splits the encoding key into

a client key used for initial encoding, and an intermediate key that is used for recoding. The

intermediate key that is held by the storage service provider, allows new codings (recoding)

of the file to be generated by the service provider from a given encoded file. The service

provider however cannot generate an encoded file from scratch, or change an encoded file to

the encoding of a new file. Verification requires both parts of the key. Recoding is a property

of independent interest and as discussed in Section A.5.4 can be used for increased efficiency

in distributed PoR. We give two PoR constructions, one in the symmetric and one in the public-key setting, with secure recoding algorithms. These schemes are based on a PoR scheme of Shacham and Waters [SW08] and, without affecting the PoR security, add the new

recoding functionality to the system.

Recoding schemes are somewhat related to the multi-copy and multi-replica PDP con-

structions of Curtmola et al. [CKBA08] and Barsoum and Hasan [BH10]. Here the aim is to ensure that multiple copies of the file are being stored, giving assurance about the redundancy of the data if one of the file copies becomes corrupted. Their constructions can be viewed in some ways as the opposite of our recoding schemes, since multi-copy PDP encoding generates many encrypted files and one set of authenticators, whereas our recoding schemes have a

single (unencrypted) file with many different sets of authenticators. A detailed study of the

relationship between these two primitives would be of great interest.

Our final construction is a combination of a geolocation system with these PoR schemes.

Experiments. The underlying assumptions in PoL are (i) geographic distances can be

bounded by round trip time of a packet over the Internet, and (ii) computation time of a

response in PoR adds tolerable error to these distances. To verify these, we simulated a

set-up where a storage server holds an encoded file and a user wants to verify its location with the help of a set of landmarks. In our experiments PlanetLab [Pla] nodes were used

for both landmarks and storage servers. Our experiments confirm the plausibility of the

underlying assumption and the soundness of the proposed system in estimating the location

of the storage server that holds the file, albeit with some error. We discuss the sources of

these errors and argue that some would be reduced in practice by using dedicated landmarks

instead of PlanetLab nodes, and some can be alleviated by better processing of data and

using better localization algorithms.

A.2 Related Work

The importance of file location in a cloud environment with respect to privacy and compliance

has been widely recognized [Gei]. The position paper by Peterson et al. [PGB11] on “data

sovereignty” in cloud considers the location of data in cloud and notes that Internet geolocation

and Proof of Data Possession (PDP) systems provide applicable technologies. The authors

however recognise that “Combining the concepts of PDP with Internet geolocation to establish

a novel data sovereignty protocol is non-trivial”.

Benson et al. [BDS11] examine the question of how to verify that a cloud provider

221 replicates data across multiple data centers in diverse geographic locations. They propose a

system that tests whether a cloud provider keeps a file copy in each of the data centers specified

in the service level agreement (SLA). In this setting a user is not necessarily interested in the

actual location of their data, only that data is stored in multiple locations. This differs from

our work where the goal is to verify that file copies are only stored within the specified region.

Benson et al. discuss the problems associated with combining a PoR with a geolocation

scheme. In their experiments however, they do not use PoR: files are stored in their original

form (no encoding) and the round trip time of a challenge that requires one or more lines of

the file to be retrieved, is used for estimating the location of the file.

Neither of the papers above considers the adversarial case, where servers attempt to fool

the verifiers into believing data is at a different location. Gill et al. [GGWL10] study the

ability of malicious nodes actively trying to manipulate latency times to fake the location of

a network node. Gill et al. assume measurements always reach the target. This assumption

cannot be directly used in our setting because of the passive nature of data, and the need

to verify its location through an active node, i.e. by the ‘custodian’ of the data which may

be at a different location. Since we are instead verifying the location of files rather than the node (or custodian) which holds the file, we use extra techniques which ensure that the measurements reach the target. Our work is the first attempt to develop a model of security for file location in a distributed storage system considering realistic constraints and trust assumptions in these environments.

A.3 Storage Model

We consider a distributed storage service provider that uses a set of geographically dispersed storage servers S = {S1, S2, ..., Sn}, where Si denotes the label (IP address) of server i. The service provider receives a user's request to store an encoded file in a particular region R, which is a contiguous geographic area that houses one or more servers in S. To store a file, a user encodes the file F obtaining F*, and sends it together with a region R to the service provider. The service provider will select a set of servers T ⊆ S which it believes to be contained in the region R. The actual process of selecting T is outside the scope of this paper, but reasonable options are auction systems such as those used by Google to select ads [MSVV07]. The servers in T could be malicious and so claim a location within R without being there. To cut cost, the service provider will not use any independent verification mechanism for checking the location of the servers. The service provider will replicate F* and store one copy on each server in T (the provider may perform an additional encoding step here). The set of labels T will be provided to the verifier on request.

Figure A.1: Example of File Location Regions

Example (non-adversarial): In Figure A.1 the provider has access to six servers S = {S1, S2, S3, S4, S5, S6}. Region A contains {S1, S2, S3} and B contains {S5, S6}. The user specifies that file x should be stored in Region A and file y in Region B. As shown in the figure, x is stored on T^x = {S1, S2}, which are contained within A, and so x's storage conforms with the user's request. Similarly y is stored at S5 and S6 and so is stored correctly.

Landmark Infrastructure. We assume there is a trusted landmark infrastructure L = {L1, L2, ..., Ln} that is independent from the storage system. A user may interact with servers through this infrastructure. That is, a user's request for location verification will result in landmarks sending challenges to the storage servers claiming to hold the file, for their location. The round trip times of these challenges will be used to determine a candidate location for a file (e.g. using trilateration). To verify that servers in a set T are all within the region R, first a contiguous geographic measured region M will be estimated for T, and then M and R will be compared. In Figure A.2 we see that a user has asked for a file z to be stored in the Region C containing {S1, S2, S3}. Performing verification of the location of z should fail since its actual location is T^z = {S1, S3, S6} ⊄ {S1, S2, S3}. Thus the measured region M^z ⊄ C.

Figure A.2: Bad Verification Example (S = {S1, S2, S3, S4, S5, S6}; servers within C are {S1, S2, S3}; T^z = {S1, S3, S6} ⊄ {S1, S2, S3})

We shall assume that storage servers can be freely accessed, i.e. a landmark node L ∈ L can directly access a server S as if it were any other node on the Internet. Moreover, each storage server has a unique IP address that can be used by a landmark node L for direct communication.

A.4 Trust Assumptions and Impossibility Results

A “fraudulent claim” in the distributed storage service above involves either a fake location

or lying about the possession of a file. We assumed the service provider uses the storage

provided by independent data centers without careful verification of their location and so the

task of verifying location of the file is left for the users. We note that if the service provider

is malicious and colludes with malicious data centers, the user cannot have any guarantee

for the file location: the service provider simply will not provide a complete list of storage

servers that hold the file. This means that although the service provider may not be paid for

the hidden servers, he may have other incentives for their use. For example, file copies may

be stored in locations that are not subject to the legal restrictions of the requested region.

As a result there is no way for a client to be sure about the location of data.

Theorem A.4.1 In the above model, it is impossible to give a non-negligible guarantee for a

file’s location if the service provider colludes with the servers.

Now assume the service provider is not malicious but “negligent” in checking the server

locations. Each server is paid to store files and so servers want to bid to store a file F despite

not being located in the user's specified region R. Such servers will rely on the help of colluding servers in responding to challenges. The following theorem shows that a collusion of only two well-positioned servers who are able to make file copies can always succeed in

breaking the verification system.

Theorem A.4.2 In the above model, file location verification is impossible when copying is permitted and a malicious server trying to fake its location has an appropriately located colluder with access to all privileged information.

Consider Figure A.3. The server Treal holding a file F needs to fool the verifier into

believing its location is Tfake. For simplicity we will first assume that there are only two landmarks, L1 and L2.

Figure A.3: Location Faking

Consider the measurement taken from L1 to the server. Since Treal is

closer to L1 than Tfake the adversary must introduce a delay so that the location appears to

be Tfake (this is denoted t1 + delay in the picture). Next consider the measurement taken

from L2 to the server. Since Treal is farther from L2 than Tfake the adversary needs to shorten

the measurement. The adversary does this with the help of a colluding server Tcolluder which

is closer to L2 than Treal and at least as far as the fake distance of Tfake from L2. Treal can

then send a copy of F to Tcolluder who will then respond to any of Treal’s challenges. As a

result the measured region estimating the location of the server shall be the shaded area

shown in Figure A.3. The adversary has therefore been successful in faking its location. Note

that by adding additional landmarks we will not prevent this attack as further measurements

(with appropriate delays) can only reduce the size of this constrained area.
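To make the lengthening arithmetic above concrete, the following is a minimal sketch with made-up distances (not taken from the thesis) of the delay Treal must add per challenge from L1 so that the measured RTT corresponds to the farther claimed position Tfake; the 2/3-of-c propagation speed is again an assumption.

    # Minimal sketch: extra delay T_real must add so that the RTT seen by a landmark
    # corresponds to the farther claimed position T_fake (toy numbers).
    SPEED_KM_PER_MS = 200.0  # assumed propagation speed, ~2/3 of the speed of light

    def added_delay_ms(dist_real_km: float, dist_fake_km: float) -> float:
        # RTT scales with twice the one-way distance; the adversary pads the difference.
        return 2.0 * (dist_fake_km - dist_real_km) / SPEED_KM_PER_MS

    print(added_delay_ms(500.0, 2000.0))  # 15 ms of delay to appear 1500 km farther away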

A.4.1 Assumptions on Adversarial Behavior

Above we argued that (i) location assurance with a malicious service provider is impossible,

and (ii) even if the service provider behaves correctly, no assurance can be provided if colluding

servers can afford to make extra (non-remunerated) copies of the files. We thus make the

following assumption which directly relates to the theorems stated in the previous section.

Assumption 1 File replication (copy) is only performed by the service provider and for the

purpose of providing guaranteed reliability for which the user is charged.

We assume storage servers will not make extra copies of a file.

To justify this assumption first note that without this assumption, no guarantee about

the location can be provided. One remaining concern is if the assumption above makes sense

in real systems. We argue that the assumption is reasonable in the case of non-targeted attacks, where the colluding servers would have to use a file replication strategy systematically in an attempt to claim a different location for a specific server. With such a strategy the cost

of storage servers will be multiplied by the average number of copies that are needed to cover

the whole area necessary to succeed in the shortening attack strategy described later (See

Section A.5.2).

A.5 Proofs of Location

A PoL consists of three phases: Setup, Store and Locate.

In the Store phase the user encodes a file F and sends the encoded file F* to the storage provider together with a file tag τ_R. The tag names the file and includes metadata detailing the file's encoding and the region R where the file should be stored. The provider also initiates an auction protocol to determine which file servers T will be sent copies of the file. An encoding of the file will be sent to each of these servers. Note that at this stage the provider may re-encode the file, ensuring a different encoding for each server.

In the Locate phase the user will employ a set of trusted landmarks to verify that all copies of the encoded file F* are stored within its specified region R. The user initiates this phase by querying for the set T of labels of servers which store the file F. Next the user forwards this list to the landmarks, who challenge the servers in T with respect to the file F. There will be an individual challenge-response protocol run between each server T ∈ T and landmark L ∈ L, where the servers and landmarks are able to communicate directly. The landmarks will verify the responses and for any positive verifications determine a file copy's location by collating the round trip times of challenge-response between the server holding that file copy and the challenging landmarks. At the end of this stage the landmarks will output an area M which combines all of the determined file copy locations. The final step is for the user to verify that M ⊆ R.

Adversarial strategies:

We assume that a colluding subset of servers have influenced the auction protocol, so that the server list T is corrupted and includes servers that are not in the user-specified region. The aim of the colluders is to influence the challenge-response protocol so that M, the measured region, is consistent with the region R specified by the user, that is, M ⊆ R. Without loss of generality, we consider two types of adversarial strategy for each server-to-landmark communication: shortening, where the server “pretends” to be closer to a landmark than it actually is, and lengthening, where the server “pretends” to be farther away.¹ We consider an independent instance of the challenge-response protocol to be run between each pair of server T ∈ T and landmark L ∈ L, and allow the adversary to use a different and appropriate strategy for each of these protocol instances. This effectively reduces the malicious behavior of the servers to their malicious behavior between individual pairs of T and L. A shortening strategy is possible when a server that holds the file (and has lied about its location) has a colluding server (or proxy) which will respond to challenges on its behalf. The colluding server may be a server within T holding the file or a server in S \ T not holding the file. Note that file copies are encoded for a particular server, and so even if the colluding server has a file copy, it will not have the encoding that is required for the response and so will need to forge the necessary response. A lengthening strategy is always possible by delaying the time to respond to a challenge, but if this is the only available strategy then the adversary cannot successfully forge his location.

¹ Since locations of landmarks are public, the adversary knows which strategy to take.

A.5.1 PoL Scheme

In the following we formalize the notion of PoL, with the assumptions and landmark infrastructure outlined in Sections A.3 and A.4.1. We use S and T to denote the set of available servers and those selected for the file storage, respectively, and L to denote the set of landmark servers. By R and M, we are referring to the bounded geographic areas that were specified by the user and estimated by the PoL protocol, respectively. In the definition we will refer to file tags τ_R and τ_T. The tag τ_R refers to the main file and τ_T refers to the encoding on a specific server T.

Definition A.5.1 A Proof of Location (PoL) scheme consists of three stages:

Stage 1: Setup

keygen(π) → (pk, sk_U, sk_S, sk_L): is run by a trusted authority. It takes as input π and outputs a public key pk and secret keys for the user sk_U, provider sk_S and landmarks sk_L.

Stage 2: Store a file: This stage consists of three algorithms. A request is made by the user to store a file F in region R. This results in copies of the file being stored on a set of servers. The user performs one file encoding and then the storage provider translates this into multiple server-dependent encodings.

auction(F, R, S) → T: Determines the list T of servers that will store the file through an auction where servers in S will bid to store based on a claimed location and cost of storage.

store_user(sk_U, F, R) → (F*, τ_R): Takes as input a file F together with a region R and sk_U, and outputs the encoded file F* and a file handle τ_R.

store_provider(sk_S, F*, τ_R, T) → {F*_T, τ_T : T ∈ T}: Takes as input the encoded file F* together with the file handle and T, and outputs an encoded file F*_T and file handle τ_T for each server T ∈ T.

Stage 3: Location verification: This stage consists of four algorithms. It starts with a request from the user, which will initiate a challenge-response phase between the landmarks and the servers that are expected to hold the file copies. Finally the landmarks will verify and combine the received responses to create a measured region, and the user verifies that this is within their chosen region.

query(τ_R) → T: Queries the storage provider for the set T of servers where the file associated to τ_R is stored.

challenge(sk_L, pk, τ_T) → (c, t_c): Generates a challenge c for a server T ∈ T. It also outputs the time t_c the challenge is issued.

response(c, F*_T) → (r, t_r): Generates a response r to challenge c (this is the only function run by the provers/storage servers). We also include as an output here the time t_r that the challenger will receive the response.²

locate(sk_L, pk, ∆, τ_R) → M or ⊥: Takes as input the set ∆ of all challenges and responses (c, r, t_c, t_r, L, T) made between L ∈ L and T ∈ T. The algorithm then performs the verification³ for each (c, r, t_c, t_r, L, T). If verification fails then the error symbol ⊥ is output, otherwise the times (t_c, t_r) for each L and T communication are used to determine the measured region M.

verify(M, τ_R) → b ∈ {0, 1}: Verifies whether M ⊆ R. If M = ⊥ then verification fails, i.e. b = 0.

Correctness: A PoL is correct if for all T whose true location is within the region R, the corresponding M is always contained in R, i.e. verify(M, τ_R) = 1.

² Note that including t_r may seem unusual, but this abstraction is necessary to facilitate our analysis. Such a response time is denoted in a similar way in [BvDJ+11].
³ In our schemes we require that for each pair L, T, an ε-fraction of responses verify correctly for the pair's overall verification to pass. This is due to the use of PoRs.
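As a toy illustration of the final verify step (a minimal sketch under the simplifying assumption that regions are axis-aligned latitude/longitude boxes rather than arbitrary contiguous areas; the names and sample coordinates are illustrative only):

    # Minimal sketch: verify(M, tau_R) reduced to containment of bounding boxes.
    # A region is (lat_min, lat_max, lon_min, lon_max); real regions are more complex.
    def contains(outer, inner) -> bool:
        return (outer[0] <= inner[0] and inner[1] <= outer[1] and
                outer[2] <= inner[2] and inner[3] <= outer[3])

    def verify(measured, requested) -> int:
        # b = 1 iff the measured region M is a sub-region of the requested region R.
        if measured is None:          # locate returned the error symbol
            return 0
        return 1 if contains(requested, measured) else 0

    canada = (41.0, 84.0, -141.0, -52.0)      # requested region R (rough bounding box)
    measured = (43.0, 55.0, -120.0, -70.0)    # measured region M from the landmarks
    print(verify(measured, canada))           # 1: M is inside R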

A.5.2 Security Model

We formalize the security using a game between the environment and the adversary (colluding servers). We assume that at least one server lies about its location during the auction (otherwise the list T includes only correctly located servers and the adversary always wins). We denote the subset of servers which lied about their locations as T_fake and the truthful servers as T_real.

Let dist(L, T) be the function used to calculate the distance between a landmark L and a server T. Let T_claim denote the claimed location of T. In the adversarial setting, for any one particular challenge-response protocol instance a server T may have an associated colluding server which acts as a proxy. We denote this server T_proxy and the file encoding that it stores F*_{T_proxy} (note that this may be the null string if no file is stored). For the two adversarial strategies, lengthening and shortening, we have the following.

An adversary uses the dist function to choose its strategy for each challenge issued to a server T ∈ T_fake. If dist(L, T) < dist(L, T_claim) then the adversary must choose a lengthening strategy. If dist(L, T) > dist(L, T_claim) then the adversary must choose a shortening strategy. In the lengthening case the server T will be answering the challenges and so the adversary will be given the file F*_T. In the shortening case the proxy server T_proxy will be answering the challenges and so the adversary will be given the file F*_{T_proxy}. This proxy server will be chosen based on the adversary's strategy.

Let us consider the security experiment for a PoL w.r.t. an adversary A (a group of servers in S) (see also Figure A.4):

Setup: The environment generates (pk, sk_U, sk_L, sk_S) by running keygen.

Store: First A chooses a file F and a region R. A will lie about the location of at least one server which will be chosen for the final output T during the auction (auction(F, R, S)). This creates two sets, one of correctly located servers T_real and one with fake locations T_fake. The file is encoded (using functions store_user and store_provider) and stored on servers in the set T = T_real ∪ T_fake.

Locate: Next the environment initiates location verification by calling query to receive the set T of storing servers. For each landmark-server pair L, T, where L ∈ L and T ∈ T, the challenge-response protocol will be run q times. The environment generates a challenge c and challenge time t_c by calling challenge. A response is created in one of three ways:

• T ∈ T_fake and dist(L, T) > dist(L, T_claim) (shortening): A is given c and F*_{T_proxy}. A must then output a response r and time t_r.

• T ∈ T_fake and dist(L, T) < dist(L, T_claim) (lengthening): A is given c and F*_T. A must then output a response r and time t_r.

• T ∈ T_real: response is called, which outputs (r, t_r).

For a single challenge-response instance, (c, r, t_c, t_r, L, T) is added to a set ∆. The environment then calls the function locate on the final set ∆ to obtain the measured region M.

Verify: Finally the environment verifies M ⊆ R by calling verify.

Definition A.5.2 We say a PoL system is secure if for all polynomial time adversaries A that corrupt at least one server location in the set T, the probability that M ⊆ R is negligible (where |T_fake| ≠ 0).

A PoL built from only a geolocation scheme, in which the challenge-response protocol uses simple pinging, does not satisfy the above definition of security because the adversary is able to create valid responses r with any time t_r of their choice using appropriate proxies.

As mentioned earlier (cf. Section A.2) geolocation schemes based on timing information

rely on measurements reaching their target. These schemes will therefore fail in the presence

of a colluding proxy that allows servers to hide their true location. To ensure that latency is measured to the actual location we need to retrieve some part of the file being stored; pinging is simply too basic.

PoL Experiment Exp_A(π):

  setup:   (pk, sk_U, sk_L, sk_S) ← keygen(π)
           (R, F) ← A(pk, π)                                % see 1
  store:   T = T_real ∪ T_fake ← auction(F, R, S_A)         % see 2
           (F*, τ_R) ← store_user(sk_U, F, R)
           {F*_T, τ_T : T ∈ T} ← store_provider(sk_S, F*, τ_R, T)
  locate:  T ← query(τ_R); ∆ ← ∅
           for m = 1 to q do                                % see 3
             for all L ∈ L and T ∈ T do
               (c, t_c) ← challenge(sk_L, pk, τ_T)
               if T ∈ T_fake then
                 if dist(L, T) < dist(L, T_claim) then
                   (r, t_r) ← A(c, F*_T)                    % see 4
                 else
                   (r, t_r) ← A(c, F*_{T_proxy})
               else
                 (r, t_r) ← response(c, F*_T)
               ∆ ← ∆ ∪ {(c, r, t_c, t_r, L, T)}
  verify:  M ← locate(sk_L, pk, ∆, τ_R)
           return verify(M, τ_R)

Notes:
1. Such that |F| > 0.
2. Such that T_fake ≠ ∅. S_A denotes the set of all storage servers where some subset are malicious (controlled by A).
3. q is the maximum number of challenges made between a single landmark and server.
4. Where t_r ≥ t*, the minimum time of L → T.

Figure A.4: Security Experiment for PoL

A.5.3 Constructing a PoL

We construct a PoL using a PoR and a geolocation scheme. The basic idea is to store the

PoR encoded file and replace the challenges (pinging) used in the geolocation scheme with

the challenges of the PoR scheme.

PoR. A PoR consists of the algorithms keygen, encode, challenge, response, verify and extract. The extract function is only necessary for proofs and is not used in the construction. Following the definitions from [SW08, JKJ07], a PoR is ε-sound if for any adversary responding correctly to an ε-fraction of challenges, the file can be recovered with all but negligible

probability. The intuition here is that if the adversary can answer a sufficient number of

challenges correctly then it must be storing enough of the file for it to be retrievable.

Secure PoL Scheme. We note that when combining a PoR and a geolocation scheme as outlined below, different file encodings must be used for each server. If not, then the shortening strategy (cf. Section A.5) will always succeed if there is a colluding node in T acting as a proxy. Such a proxy would hold the file encoding required for constructing the correct PoR response. To protect against this attack one can use a different encoding for each server. The user will encode the file once for each server in T, using a key specific to that server, and submit all encoded files to the storage provider, who distributes them to the respective storage servers. The landmarks use the key of each server for verification. The

downfall of this construction, however, is that it makes the system extremely costly for users

from a computational and communication viewpoint.

A.5.4 PoR with Recoding

We introduce a new operation in PoR that we call recoding, which allows a third party with a secret key to construct a new encoding of an encoded file. In a PoR with recoding the user has a secret key sk_e for the first encoding and the recoder has an additional secret key sk_r. In a privately verifiable scheme both keys are needed to verify a response; thus the verifier must know both secret keys. For each recoding of a file there will be an associated recoding value ρ we call a recoder. The recoder will be included as part of a new file tag for the recoded file. A PoR scheme with recoding includes an additional recoding function recode which takes as input the encoded file F*, tag τ, recoder ρ and recoding key sk_r, and outputs a new encoding of the file. The PoR schemes of Shacham-Waters (SW) [SW08] (both private and public)

can be securely augmented with a recoding function such that the new encoded files will

satisfy the security requirement of the original PoR scheme. The first step of encoding in

SW schemes uses an erasure correcting code to encode the file. Authentication checks are

constructed for blocks of this encoded file. For the specifics on the choice of erasure code we

refer the reader to [SW08].

S-W Private PoR with Recoded Authenticators. Let f : {0,1}* × K_prf → Z_p and h : {0,1}* × K_prf → Z_p be pseudorandom functions (PRFs). Let (K_enc, Enc, Dec) be a symmetric encryption scheme and (K_mac, Mac, Ver) be a message authentication code (MAC).

keygen(π): Generate keys k_enc ← K_enc, k_mac ← K_mac and k_r ← K_prf. The encoder's secret key is sk_e = (k_enc, k_mac) and the recoder's is sk_r = k_r.

store(sk_e, F): Apply the erasure code to F to obtain F'; split F' into n blocks (for some n), each s sectors long, {m_ij}_{1≤i≤n, 1≤j≤s}. Parse sk_e as (k_enc, k_mac). Choose a PRF key k_prf ← K_prf and α_1, α_2, ..., α_s ← Z_p. Compute the file tag τ = t_0 || Mac_{k_mac}(t_0), where t_0 = n || Enc_{k_enc}(k_prf || α_1 || ... || α_s). For all i ∈ [1, n]:

    σ_i ← f_{k_prf}(i) + Σ_{j=1}^{s} α_j m_ij.

The output is the file tag τ and F*: the encoded file F' together with the authenticators σ_i.

recode(sk_r, F*, τ, ρ): Takes as input the encoded file F* and recodes the file w.r.t. ρ. Calculate σ_{ρ,i} ← σ_i + h_{sk_r}(ρ) for each i, 1 ≤ i ≤ n. The recoded file F*_ρ is then the file blocks {m_ij}_{1≤i≤n, 1≤j≤s} together with the authenticators σ_{ρ,i} and the file tag τ_ρ = n || Enc_{k_enc}(k_prf || α_1 || ... || α_s) || ρ.

challenge(pk, τ_ρ): Choose an l-element subset I of the set [1, n], and for each i ∈ I, a random element ν_i ← B ⊆ Z_p. The challenge is the set Q = {(i, ν_i)}.

response(Q, F*_ρ): For a challenge Q = {(i, ν_i)} compute and send: μ_j ← Σ_{(i,ν_i)∈Q} ν_i m_ij for 1 ≤ j ≤ s, and σ_ρ ← Σ_{(i,ν_i)∈Q} ν_i σ_{ρ,i}.

verify(sk_e, sk_r, pk, Q, {μ_j}_{1≤j≤s}, σ_ρ, τ_ρ): Parse the secret key sk_e as (k_enc, k_mac). Use k_mac to verify the MAC on τ_ρ; if verification fails then abort. Otherwise parse τ_ρ and use k_enc to decrypt and recover n, k_prf, α_1, ..., α_s. Verify whether:

    σ_ρ =? Σ_{(i,ν_i)∈Q} ν_i (f_{k_prf}(i) + h_{sk_r}(ρ)) + Σ_{j=1}^{s} α_j μ_j.
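The arithmetic of the privately verifiable scheme can be illustrated with the following minimal sketch. It is not the thesis implementation: HMAC-SHA256 stands in for the PRFs f and h, a toy prime field replaces the parameters actually used, and the erasure coding as well as the encryption/MAC of the file tag are omitted; all names are illustrative.

    # Minimal sketch of SW-style private PoR authenticators with recoding, over a toy field Z_p.
    import hmac, hashlib, secrets

    p = 2**127 - 1  # toy prime modulus for Z_p

    def prf(key: bytes, data: bytes) -> int:
        return int.from_bytes(hmac.new(key, data, hashlib.sha256).digest(), 'big') % p

    def store(k_prf, alphas, blocks):
        # blocks[i][j] = m_ij; authenticator sigma_i = f(i) + sum_j alpha_j * m_ij (mod p)
        return [(prf(k_prf, str(i).encode()) + sum(a * m for a, m in zip(alphas, row))) % p
                for i, row in enumerate(blocks)]

    def recode(k_r, sigmas, rho: bytes):
        # sigma_{rho,i} = sigma_i + h_{k_r}(rho) (mod p)
        off = prf(k_r, rho)
        return [(s + off) % p for s in sigmas]

    def respond(Q, blocks, sigmas_rho, s):
        mus = [sum(nu * blocks[i][j] for i, nu in Q) % p for j in range(s)]
        sigma = sum(nu * sigmas_rho[i] for i, nu in Q) % p
        return mus, sigma

    def verify(k_prf, k_r, alphas, rho, Q, mus, sigma):
        expected = (sum(nu * (prf(k_prf, str(i).encode()) + prf(k_r, rho)) for i, nu in Q)
                    + sum(a * mu for a, mu in zip(alphas, mus))) % p
        return expected == sigma

    # toy run: 4 blocks of 3 sectors each, recoded for one server, one challenge
    blocks = [[secrets.randbelow(p) for _ in range(3)] for _ in range(4)]
    k_prf, k_r = secrets.token_bytes(16), secrets.token_bytes(16)
    alphas = [secrets.randbelow(p) for _ in range(3)]
    sigmas_rho = recode(k_r, store(k_prf, alphas, blocks), b"server-1")
    Q = [(0, secrets.randbelow(p)), (2, secrets.randbelow(p))]
    mus, sigma = respond(Q, blocks, sigmas_rho, 3)
    assert verify(k_prf, k_r, alphas, b"server-1", Q, mus, sigma)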

S-W Public PoR with Recoded Authenticators. Let e : G × G → G_T be a bilinear map, let g be a generator of the group G, and let H : {0,1}* → G be the BLS hash [BLS01], treated as a random oracle. Let (SKg, SSig, SVer) be a signature scheme.

keygen(π): Choose (ssk, spk) ← SKg and α ← Z_p, and compute v ← g^α. The public key is pk = (spk, v). The secret key of the encoder is sk_e = (α, ssk). Secret and public keys (β_ρ, v_ρ) for recoding are generated by the recode function.

store(sk_e, F): Given a file F, apply the erasure code to obtain F'; split F' into n blocks (for some n), each s sectors long, {m_ij}_{1≤i≤n, 1≤j≤s}. Parse sk_e as (α, ssk). Choose a random file name from Z_p. Choose u_1, u_2, ..., u_s ← G. Let t_0 = name || n || u_1 || ... || u_s; the file tag is then τ = t_0 || SSig_ssk(t_0). For all i ∈ [1, n]:

    σ_i ← ( H(name || i) · Π_{j=1}^{s} u_j^{m_ij} )^α.

The encoded file F* is {m_ij}_{1≤i≤n, 1≤j≤s} together with the authenticators σ_i.

recode(F*, τ, ρ): Takes as input the encoded file F* and recodes the file w.r.t. ρ. Generate a secret key for recoder ρ: β_ρ ← Z_p. Compute the associated public key v_ρ ← v^{β_ρ} for all ρ ∈ P.⁴ This public key is issued to the verifier, included in pk. Calculate σ_{ρ,i} ← σ_i^{β_ρ} for all i, 1 ≤ i ≤ n. The recoded file F*_ρ consists of the file blocks {m_ij}_{1≤i≤n, 1≤j≤s} together with the authenticators σ_{ρ,i} and the file tag τ_ρ = t_0 || SSig_ssk(t_0) || ρ.

challenge(pk, τ_ρ): Choose an l-element subset I of the set [1, n], and for each i ∈ I, a random element ν_i ← B ⊆ Z_p. The challenge is the set Q = {(i, ν_i) : i ∈ I}.

response(Q, F*_ρ): For a challenge Q = {(i, ν_i)} compute and send: μ_j ← Σ_{(i,ν_i)∈Q} ν_i m_ij for 1 ≤ j ≤ s, and σ_ρ ← Π_{(i,ν_i)∈Q} σ_{ρ,i}^{ν_i}.

verify(pk, Q, {μ_j}_{1≤j≤s}, σ_ρ, τ_ρ): Use spk to verify the signature on τ_ρ; if verification fails then abort. Otherwise parse τ_ρ and recover name, n, u_1, ..., u_s. Verify:

    e(σ_ρ, g) =? e( Π_{(i,ν_i)∈Q} H(name || i)^{ν_i} · Π_{j=1}^{s} u_j^{μ_j}, v_ρ ).

⁴ Despite the secret key not actually being dependent on the recoder ρ, this recoder is still used as an identifier to help select the correct public key v_ρ in the verify step.

Theorem A.5.1 SW's privately and publicly verifiable PoRs extended to use recoded authenticators (as above) are ε-sound PoRs.

Theorem A.5.1 can be proved by adapting Shacham-Waters's (SW) original proofs [SW08, Theorems 4.1 and 4.2], respectively. Parts 2 and 3 of their proofs are concerned with constructing the file extractor and remain the same. Here we sketch the proofs for Part 1

(unforgeability of responses). To make our scheme fit with SW’s model we assume that during

the store phase both store and recode can be called. The rest of the experiment will proceed

in the same way. Our proof will follow a series of game hops and we let Gi denote Game i.

The game numberings match with those of the original proofs.

Proof. First we will prove the security of the private scheme.

G0, G1, G2, G3: These games are the same as those in the original proof, barring a few changes to include h_K(·). The first three hops relate to the security of the MAC, the encryption scheme and the PRF f, respectively.

G3': Let G3' be the same as G3 except that the challenger evaluates h_{sk_r}(ρ) by generating a random value r_h ← Z_p. The difference between Games 3 and 3' is bounded by the adversary's ability to break the PRF security of h. Here we will suffer a 1/(N q_r) security loss due to a hybrid argument, where N is the number of file blocks and q_r is the number of recode queries.

G4: This game proves that any verifying response must be correctly computed. The analysis is identical to that in the original proof of Shacham and Waters, where the random value r is now the sum of the two random values replacing f_{k_prf}(i) and h_{sk_r}(ρ).

Now let us prove the security of the public scheme.

G0, G1: These games are the same as SW’s proof. The hop corresponds to an adversary’s

ability to forge a signature.

G2: In G2, if any verification instance succeeds but the authenticator σ_ρ was not computed correctly (i.e. σ_ρ ≠ Π_{(i,ν_i)∈Q} σ_{i,ρ}^{ν_i}), then the challenger declares a failure and aborts. Let σ_ρ denote a correctly computed authenticator and σ'_ρ denote an incorrectly computed authenticator which verifies correctly. SW's proof shows that an adversary that can distinguish G1 and G2 can solve the Computational Diffie-Hellman problem. We instead solve the related problem: given g, g^α, g^{αβ_1}, g^{αβ_2}, ..., g^{αβ_n}, h ∈ G, find h^{αβ_i} for some i ∈ {1, ..., n}. It is easy to verify that given an adversary which solves this problem we can solve the CDH problem and vice-versa. The construction to solve this new problem proceeds in a similar fashion to that in the original proof.

For all j, 1 ≤ j ≤ s, the simulator chooses ζ_j, γ_j ← Z_p and sets u_j ← g^{ζ_j} h^{γ_j}.

For all i, 1 ≤ i ≤ n, the simulator chooses r_i ← Z_p and programs the random oracle:

    H(name || i) = g^{r_i} / ( g^{Σ_{j=1}^{s} ζ_j m_ij} · h^{Σ_{j=1}^{s} γ_j m_ij} ).

Next the simulator will compute the σ_{i,ρ} values for one particular ρ ∈ P, and then, following a similar argument to the original proof, we obtain:

    h^{αβ_ρ} = ( σ'_ρ · σ_ρ^{-1} · v^{-Σ_{j=1}^{s} ζ_j Δμ_j} )^{1 / (Σ_{j=1}^{s} γ_j Δμ_j)}.

0 Qs µj is identical to SW’s proof. Observe that e(σρ, g) = e(σρ, g) again implies that j=1 uj = µ0 Qs j j=1 uj and so we can solve discrete logs. 

A.5.5 A Secure PoL Using PoR with Recoding

We now describe how to turn a PoR with recoding into a PoL. Consider a PoR with recoding:

(keygenPoR, storePoR, recodePoR, challengePoR, responsePoR, verifyPoR). The possible recoders ρ

used here will be the labels of servers contained within the user specified region , namely R . The PoL is defined as follows: T PoR keygen(π): Runs keygen . The user is given the encoder’s key skU = ske. The storage

provider is given the recoder key skS = skr. The landmarks are given the secret/public keys necessary for verification.

auction(F, , ): Runs an auction to determine the list of servers that will store the R S T file. Servers in bid to store based on a location and cost of storage. S

PoR storeuser(skU ,F, ): Runs store . Sets file tag as τR = τ . R kR

238 ∗ PoR ∗ storeprovider(skS ,F , τR, ): Parse τR as τ . For each node T run recode (skS ,F , τ, T ). T kR ∈ T

(Note that the public keys vρ generated by our publicly verifiable scheme would be given to the landmarks for verification.)

query(τR) : Returns the set of servers storing the file. → T PoR challenge(pk, τT ): Runs challenge and outputs the current time tc.

∗ PoR response(c, FT ): Runs response and outputs the time of response tr.

locate(skL, pk, ∆, τR):Parse skL as (kenc, kmac, kh) and for all (c, r, tc, tr, L, T ) ∆ call ∈ verifyPoR. If some T convincingly answers an -fraction of challenges made by a landmark ∈ T

L then the RTTs (tc, tr) between L and T are used to find the distance from L to T . Otherwise the function aborts and outputs . The landmarks will combine all distances to create an ⊥ area for file’s location. M

verify( , τR) verifies whether the constrained area is a subregion of the previously M M chosen region . R

Theorem A.5.2 The above construction is a secure PoL if the PoR is ε-sound.

Proof. We prove the theorem in a sequence of games.

G0: The first game is simply the Experiment shown in Figure A.4.

G1: G1 is the same as G0, with one difference. If the shortening strategy is necessary for some T ∈ T_fake, i.e. dist(L, T) > dist(L, T_claim), then G1 aborts and outputs 0. If the underlying PoR is ε-sound then G1 and G0 are identical. To see this, consider two cases, where T_proxy ∈ S is colluding with T ∈ T_fake in G0:

• If T_proxy succeeds in responding to an ε (or greater) fraction of challenges from some L ∈ L, this implies that T_proxy must be holding the encoded file, which contradicts our assumptions.

• If T_proxy responds to less than an ε-fraction of challenges from a particular L ∈ L, then verification fails.

Therefore, if the underlying PoR is ε-sound the adversary will be unsuccessful with a shortening strategy.

Let us now study the probability of success in G1 where the adversary is limited to the

lengthening strategy. When moving a file to an arbitrary location it is necessary to use a

combination of lengthening and shortening strategies. Thus when the adversary is limited to

only lengthening strategies for a particular L to T when a shortening is necessary, T must

instead give the correct time measurement. Thus the measured region will contain both the

actual location of T (performing the lengthening) and the attempted fake location Tclaim.

Therefore the measured region M will not be contained within the requested region R, since T lies outside R. Further discussion on these types of lengthening attacks can be found in [GGWL10, Section 4.2.1]. As a result, since M defines the area covering T = T_real ∪ T_fake and T ∈ T_fake, this means that the adversary will always fail.

To reiterate, if an adversary is limited to only lengthening strategies then we will only ever see an increase in the size of the measured region M. As a result, servers located outside R will be detected since M ⊄ R. □

A.6 Experiments

We now study our proposed PoL in practice. For simplicity we use only one file in a single

location (no redundancy). We simulated landmarks and servers using Planet-lab nodes [Pla].

A total of 28 nodes spread across North America (18) and Europe (10) were used, each able

to act as a landmark and a server. Our implementation combines a geolocation scheme and

SW’s private scheme.

We note that our PoL scheme’s time-distance functions must take into account the time

taken for a server to calculate the response to a landmark’s challenge. This issue has been

raised in [PGB11] and [BDS11] but no measurement was reported.

In [BDS11, Section 3.1] the authors describe one issue related to combining a PoR protocol

and a geolocation scheme. Consider a challenge corresponding to two parts of a file but one

of these parts is stored on a different server. Whilst computing the portion of the response

based at the challenged server, the second part could be retrieved from the other server.

Thus the computation time is masking the time to obtain the rest of the file. Based on our

assumptions this should not be a problem in our setting. We already assume that copying is

not possible and that the storage provider gives the verifier a list of all servers holding the

file. Thus the measured region we determine would cover both servers in the example.

A.6.1 Geolocation Method

Geolocation consists of three phases:

Learning phase: To construct a time-distance function we begin with a learning phase.

In traditional geolocation systems the function translating time into distance is derived by

looking at the time taken for a packet to travel between pairs of landmarks with known

locations. There are several ways this function can be derived, and these have been studied in [BDS11], which aims to locate files stored on data centers within Amazon's storage cloud.

During our learning phase each node in turn acted as a server with all other nodes acting

as landmarks. The landmarks each issued a total of 50 challenges to the server and the

minimum time to receive a response was used. We ran this process four times over all nodes:

one case measuring the time for partial file retrieval, and three cases using different PoR

challenge sizes l = 5, 35, 65. For each set of data and node we determined a best-fit line

translating time into distance using the known Planet-lab locations.

Measure time: The provider claims to store a file on a server in a certain region. The

landmarks of that region are then selected to initiate a challenge-response to that server.

Again each landmark sent 50 challenges to the server and the median time to receive a

response was taken. We then translated these median times into distances using the best-fit

lines (as opposed to the minimum as previously used); this helps ensure that the calculated

241 distance is not an underestimate.
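To illustrate the learning and measurement phases above, the following is a minimal sketch (not the code used in the thesis experiments) of deriving a best-fit time-to-distance line from landmark-to-landmark measurements and then converting a median challenge RTT into an estimated distance; the calibration numbers and the use of a simple linear fit are illustrative assumptions.

    # Minimal sketch: learn a linear time->distance mapping from landmark pairs with
    # known locations, then turn a measured median RTT into a distance estimate.
    import numpy as np

    # (min RTT in ms, known distance in km) for landmark pairs -- toy calibration data
    calibration = [(12.0, 800.0), (35.0, 2600.0), (55.0, 4200.0), (80.0, 6100.0)]
    rtts = np.array([c[0] for c in calibration])
    dists = np.array([c[1] for c in calibration])

    # Best-fit line distance = a * rtt + b (one such line per landmark in the real set-up)
    a, b = np.polyfit(rtts, dists, 1)

    def rtt_to_distance(challenge_rtts_ms):
        # The median of repeated challenge-response times is used rather than the mean,
        # since occasional slow computations would otherwise bias the estimate.
        median_rtt = float(np.median(challenge_rtts_ms))
        return a * median_rtt + b

    print(rtt_to_distance([41.2, 40.8, 39.9, 120.3, 40.5]))  # the outlier barely matters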

Region estimation: From the previous phase we have a set of nodes with known location

and their distance to the target server. We now abstract the problem to a geometric problem,

called trilateration. Trilateration is the process of finding the location of an unknown point when its distances to some known points are given. The result can either be a region or

the center of that region. This technique is used in GPS systems where satellites want to

estimate the location of an object by measuring the RTT of signals sent to the GPS device.

In this phase, we have a circle centered around each landmark and we expect the region containing the file to be the overlap of all these circles. We therefore need to estimate the region with highest opacity (density of circles). More formally, we obtain a system of non-linear circle inequalities to solve:

    (x - a_i)^2 + (y - b_i)^2 ≤ r_i^2.

Note that the system of inequalities may not have a solution, and thus the intersection area for all the circles may be null because of error in time (or line) computation. Therefore we look for the highest-density region, the region that is contained in most circles.

To simplify this problem we implemented an approximation of this algorithm that finds

the center points of the intersections for each group of three circles. We expect that the

center point found for each set of circles is within the region of highest opacity, thus giving a

good approximation of the server location.
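As a rough illustration of this region-estimation step, the sketch below (not the thesis implementation, which intersects groups of three circles) grid-searches for the point covered by the most landmark circles, approximating the highest-density region described above; the grid resolution and the planar distance model are simplifying assumptions.

    # Minimal sketch: find a point contained in the most circles
    # (x - a_i)^2 + (y - b_i)^2 <= r_i^2, treating coordinates as planar.
    def densest_point(circles, xmin, xmax, ymin, ymax, steps=200):
        best, best_count = None, -1
        for ix in range(steps + 1):
            x = xmin + (xmax - xmin) * ix / steps
            for iy in range(steps + 1):
                y = ymin + (ymax - ymin) * iy / steps
                count = sum(1 for (a, b, r) in circles
                            if (x - a) ** 2 + (y - b) ** 2 <= r ** 2)
                if count > best_count:
                    best, best_count = (x, y), count
        return best, best_count

    circles = [(0.0, 0.0, 5.0), (6.0, 0.0, 4.0), (3.0, 5.0, 4.5)]  # toy landmarks
    print(densest_point(circles, -5, 10, -5, 10))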

Implementation Considerations. We now discuss where errors may arise in each stage of

the protocol:

Learning phase error: The distance between nodes should not change for different challenge

sizes. This implies that the best-fit lines for different challenge sizes should be parallel but

due to random behavior in server computation time and network delay (routing, packet loss,

etc.) this was not perfectly achieved.

Figure A.5: Boxplots of Geolocation Error (km) for File Retrieval and Challenge Sizes 5, 35, 65 (European and North American nodes)

Measure Time error: Similar network and computation fluctuations again have an effect

here. Based on our experience we made the following conclusions: Our measurements were

taken 50 times for each challenge-response and then the median time was used to obtain more

consistent time measurements. The average was always biased by abnormal computation

behaviour. Based on our direct analysis of the computation times at the server we saw

that on some occasions the computation could take a million times longer than usual. This
inaccuracy was a property of inconsistent PlanetLab node computation times rather than

network delays, so we expect real landmarks to be more stable.

Region estimation error: Any error here could be reduced by performing multilateration

instead of combining the results of several trilaterations.

A.6.2 Error Analysis

The ultimate aim of our scheme is to verify the location of a file within a region, e.g. USA or

Europe. We consider verification successful if the measured area is within the region requested

for storing the file, i.e. within US borders. In practice we would expect the system to claim a

file is within a particular region, e.g. the USA; the verifier would then use the landmarks of that

region to verify the file’s location. We therefore considered the nodes of Europe and North

America separately.

We used the trilateration method described above to determine the center of the measured

regions for challenge sizes 5, 35 and 65, and for the file retrieval case. From these center points we can then determine the error distance to the actual file location. From Figure A.5 it can
be easily seen that increasing the challenge size does not have an adverse effect on our ability to

locate the file. In fact the error remains relatively stable as we increase the challenge size; for

the European data (Figure A.5, European nodes) it actually decreases.
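For completeness, a minimal sketch of how such an error distance might be computed, assuming the estimated center and the actual file location are given as latitude/longitude pairs; the haversine formula used below is a standard choice for great-circle distance, not necessarily the exact computation in our experiments.

import math

def error_distance_km(lat1, lon1, lat2, lon2):
    # Great-circle (haversine) distance in km between the estimated center of
    # the measured region and the actual file location.
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))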

We can therefore say that although the PoR adds extra time delays to our measurements,
this has relatively little effect on the PoL scheme’s ability to verify the location
of files. Furthermore, based on the error sizes we currently obtained (around 800 km for the
EU and 1000 km for the US), the PoL scheme could safely locate files within areas such as a
continent with relatively good confidence. To achieve more fine-grained results, i.e. to the

level of countries, we would need to both improve the accuracy of the geolocation algorithm

and increase the number of carefully placed landmarks used.

A.7 Conclusion

In this paper we consider the problem of file location in a distributed storage system and

show that even under the assumption that the service provider is not malicious, and is only

negligent, providing any guarantee is a challenging task. In particular we show that in this

restricted case, a set of colluding storage servers who can freely copy the files can always

succeed in breaking the security guarantee of the system (assurance about file location). We

then argue that modulo targeted attacks on a small number of files, copying files with the

aim of breaking the security is costly and unlikely to be pursued by the colluders. As a result we limit ourselves to colluders who will not copy files. Although this may seem a restricted
model, we emphasize that in the stronger cases shown above it is impossible to provide any

location guarantees.

We have formalised the notion of Proofs of Location for distributed storage systems,

assuming the storage service provider operates correctly but wishes to minimise costs. We

show how to construct a secure PoL from a geolocation scheme and a PoR, and, to improve
the efficiency of the construction, introduce a new property for PoR called recoding. We
give constructions of two PoR systems with recoding that are extensions to the privately and

publicly verifiable schemes of Shacham and Waters [SW08].

We conclude by briefly discussing the possibility of providing PoLs within the setting of

cloud storage. It is obvious that our current construction does not perfectly fit with current

cloud infrastructure. Providers such as Amazon own all their data centers, so our model, in
which the provider buys storage from external third parties, does not fit. There

are further issues related to how individual data centers may be contacted directly within the

cloud. Providers may operate in such a way that all traffic is centrally managed, but in similar

experiments to our own, Benson et al. [BDS11] were able to locate Amazon’s data centers.

This tells us that such measurements can allow the locations of servers to be determined within large-scale cloud infrastructure. Despite this, further knowledge of how particular
clouds are structured would be useful in examining how measurements may be routed, thus
helping to create a more reliable and accurate system.

Another scenario that would match our model is the cloud-of-clouds setting. Here we

consider some clouds to operate in a trusted fashion (i.e. maintain correct locations) and the

rest to act maliciously (i.e. claim their storage servers are in fake locations). If a user stores

his files across all of these clouds then a mechanism similar to what is outlined in this paper

may be a realistic prospect. Despite this possibility there is clearly further work to be done

in this area to find reliable methods to formally verify the location of files within a cloud

environment.

A.8 Acknowledgments

The authors would like to thank Nashad Safa and Sanchit Agarwal for initial work on the

implementations. The majority of this work was performed whilst the first author was

employed at the University of Calgary. The first author has been supported in part by ERC

Advanced Grant ERC-2010-AdG-267188-CRIPTO.
