
The Pythia PRF Service

Adam Everspaugh*, Rahul Chatterjee*, Samuel Scott**, Ari Juels†, and Thomas Ristenpart‡
*University of Wisconsin–Madison, {ace,rchat}@cs.wisc.edu
**Royal Holloway, University of London, [email protected]
†Jacobs Institute, Cornell Tech, [email protected]
‡Cornell Tech, [email protected]

Abstract

Conventional cryptographic services such as hardware-security modules and software-based key-management systems offer the ability to apply a pseudorandom function (PRF) such as HMAC to inputs of a client's choosing. These services are used, for example, to harden stored password hashes against offline brute-force attacks.

We propose a modern PRF service called PYTHIA designed to offer a level of flexibility, security, and ease-of-deployability lacking in prior approaches. The keystone of PYTHIA is a new cryptographic primitive called a verifiable partially-oblivious PRF that reveals a portion of an input message to the service but hides the rest. We give a construction that additionally supports efficient bulk rotation of previously obtained PRF values to new keys. Performance measurements show that our construction, which relies on bilinear pairings and zero-knowledge proofs, is highly practical. We also give accompanying formal definitions and proofs of security.

We implement PYTHIA as a multi-tenant, scalable PRF service that can scale up to hundreds of millions of distinct client applications on commodity systems. In our prototype implementation, query latencies are 15 ms in local-area settings and throughput is within a factor of two of a standard HTTPS server. We further report on implementations of two applications using PYTHIA, showing how to bring its security benefits to a new enterprise password storage system and a new brainwallet system for Bitcoin.

1 Introduction

Security improves in a number of settings when applications can make use of a cryptographic key stored on a remote system. As an important example, consider the compromise of enterprise password databases. Best practice dictates that passwords be hashed and salted before storage, but attackers can still mount highly effective brute-force cracking attacks against stolen databases.

Well-resourced enterprises such as Facebook [41] have therefore incorporated remote cryptographic operations to harden password databases. Before a password is stored or verified, it is sent to a PRF service external to the database. The PRF service applies a cryptographic function such as HMAC to client-selected inputs under a service-held secret key. Barring compromise of the PRF service, its use ensures that stolen password hashes (due to web server compromise) cannot be cracked using an offline brute-force attack: an attacker must query the PRF service from a compromised server for each password guess. Such online cracking attempts can be monitored for anomalous volumes or patterns of access and throttled as needed.

While PRF services offer compelling security improvements, they are not without problems. Even large organizations can implement them incorrectly. For example, Adobe hardened passwords using 3DES but in ECB mode instead of CBC-MAC (or another secure PRF construction) [26], a poor choice that resulted in disclosure of many of its customers' passwords after a breach. Perhaps more fundamental is that existing PRF services do not offer graceful remediation if a compromise is detected by a client. Ideally it should be possible to cryptographically erase (i.e., render useless via key deletion) any PRF values previously used by the client, without requiring action by end users and without affecting other clients. In general, PRF services are so inaccessible and cumbersome today that their use is unfortunately rare.

In this paper, we present a next-generation PRF service called PYTHIA to democratize cryptographic hardening. PYTHIA can be deployed within an enterprise to solve the issues mentioned above, but also as a public, multi-tenant web service suitable for use by any type of organization or even individuals. PYTHIA offers several security features absent in today's conventional PRF services that are critical to achieving the scaling and flexibility required to simultaneously support a variety of clients and applications. As we now explain, achieving these features necessitated innovations in both cryptographic primitive design and system architecture.

Key features and challenges. We refer to an entity using PYTHIA as a client. For example, a client might be a web server that performs password-based authentication for all of its end users. Intuitively, PYTHIA allows such a client to query the service and obtain the PRF output $Y = F_k(t, m)$ for a message m and a tweak t of the client's choosing under a client-specific secret key k held by the service. Here, the tweak t is typically a unique identifier for an end user (e.g., a random salt). In our running password storage example, the web server stores Y in a database to authenticate subsequent logins.

PYTHIA offers security features that, at first glance, sound mutually exclusive. First, PYTHIA achieves message privacy for m while requiring clients to reveal t to the server. Message privacy ensures that the PRF service obtains no information about the message m; in our password-storage example, m is a user's password. At the same time, though, by revealing t to the PRF service, the service can perform fine-grained monitoring of related requests: a high volume or otherwise anomalous pattern of queries on the same t would in our running example be indicative of an ongoing brute-force attack and might trigger throttling by the PRF service.

By using a unique secret key k for each client, PYTHIA supports individual key rotation should the value Y be stolen (or feared to be stolen). With traditional PRF services and password storage, such key rotation is a headache, and in many settings impractical, because it requires transitioning stored values $Y_1, \ldots, Y_n$ (one for each user account) to a new PRF key. The only way to do so previously was to have all n users re-enter or reset their passwords. In contrast, the new primitive employed for $F_k$ in PYTHIA supports fast key rotation: the server can erase k, replace it with a new key $k'$, and issue a compact (constant-sized) token with which the client can quickly update all of its PRF outputs. This feature also enables forward-security in the sense that the client can proactively rotate k without disrupting its operation.

PYTHIA provides other features as well, but we defer their discussion to Section 2. Already, those listed above surface some of the challenging cryptographic tensions that PYTHIA resolves. For example, the most obvious primitive on which to base PYTHIA is an oblivious PRF (OPRF) [29], which provides message privacy. But for rate-limiting, PYTHIA requires clients to reveal t, and existing OPRFs cannot hide only a portion of a PRF input. Additionally, the most efficient OPRFs (cf. [31]) are not amenable to key rotation. We discuss at length other related concepts (of which there are many) in Section 7.

Partially-oblivious PRFs. We introduce partially oblivious PRFs (PO-PRFs) to rectify the above tension between fine-grained key management and bulk key management and achieve a primitive that supports batch key rotation. We give a PO-PRF protocol in the random oracle model (ROM) similar to the core of the identity-based non-interactive key exchange protocol of Sakai, Ohgishi, and Kasahara [47]. This same construction was also considered as a left-or-right constrained PRF by Boneh and Waters [14]. That said, the functionality achieved by our PO-PRF is distinct from these prior works and new security analyses are required. Despite relying on pairings, we show that the full primitive is fast even in our prototype implementation.

In addition to a lack of well-matched cryptographic primitives, we find no supporting formal definitions that can be adapted for verifiable PO-PRFs. (Briefly, previous definitions and proofs for fast OPRFs rely on hashing in the ROM before outputting a value [19, 31]; in our setting, hashing breaks key rotation.) We propose a new assumption (a one-more bilinear decisional Diffie-Hellman assumption), give suitable security definitions, and prove the security of the core primitive in PYTHIA under these definitions (in the appendix). Our new definitions and technical approaches may be of independent interest.

Using PYTHIA in applications. We implement PYTHIA and show that it offers highly practical performance on Amazon EC2 instances. Our experiments demonstrate that PYTHIA is practical to deploy using off-the-shelf components, with combined computation cost of client and server under 12 milliseconds. A single 8-core virtualized server can comfortably support over 1,000 requests per second, which is already within a factor of two of a standard HTTPS server in the same configuration. (Our PYTHIA implementation performs all communication over TLS.) We discuss scaling to handle more traffic volume in the body; it is straightforward given current techniques.

We demonstrate the benefits and practicality of PYTHIA for use in a diverse set of applications. First is our running example above: we build a new password-database system using a password "onion" that combines parallelized calls to PYTHIA and a conventional key hashing mechanism. Our onion supports PYTHIA key rotation, hides back-end latency to PYTHIA during logins (which is particularly important when accessing PYTHIA as a remote third-party service), and achieves high security in a number of compromise scenarios.

Finally, we show that PYTHIA provides valuable features for different settings apart from enterprise password storage. We implement a client that hardens a type of password-protected virtual-currency account called a
"brainwallet" [15]; use of PYTHIA here prevents offline brute-force attacks of the type that have been common in Bitcoin.

Our prototype implementation of PYTHIA is built with open-source components and is itself open-source. We have also released Amazon EC2 images to allow companies, individuals, and researchers to spin up PYTHIA instances for experimentation.

2 Overview and Challenges

We now give a high-level overview of PYTHIA, the motivations for its features, what prior approaches we investigated, and the threat models we assume. First we fix some terminology and a high-level conceptual view of what a PRF service would ideally provide. The service is provisioned with a master secret key msk. This will be used to build a tree that represents derived sub-keys and, finally, output values. See Figure 1, which depicts an example derivation tree associated with PYTHIA as well as which portions of the tree are held by the server (within the large box) and which are held by the client (the leaves). Keys of various kinds are denoted by circles and inputs by squares.

Figure 1: Diagram of PRF derivations enabled by PYTHIA. Everything inside the large box is operated by the server, which only learns tweaks and not the shaded messages.

From the msk we derive a number of ensemble keys. Each ensemble key is used by a client for a set of related PRF invocations; the ensemble keys give rise to isolated PRF instances. We label each ensemble key in the diagram by K[w]. Here w indicates a client-chosen ensemble selector. An ensemble pre-key K[w] is a large random value chosen and held by the server. Together, msk and K[w] are used to derive the ensemble key $k_w = \mathrm{HMAC}(msk, K[w])$. A table is necessary to support cryptographic erasure of (or updates to) individual ensemble keys, which amounts to deleting (or updating) a table entry.

Each ensemble key can be used to obtain PRF outputs of the form $F_{k_w}(t, m)$ where F is a (to-be-defined) PRF keyed by $k_w$, and the input is split into two parts. We call t a tweak following [33] and m the message. Looking ahead, t will be made public to PYTHIA while m will be private. This is indicated by the shading of the PRF output boxes in the figure.

Deployment scenarios. To motivate our design choices and security goals, we relate several envisioned deployment scenarios for PYTHIA.

Enterprise deployment: A single enterprise can deploy PYTHIA internally, giving query access only to other systems they control. A typical setup is that PYTHIA fields queries from web servers and other public-facing systems that are, unfortunately, at high risk of compromise. PRF queries to PYTHIA harden values stored on these vulnerable servers. This is particularly suited to storing check-values for passwords or other low-entropy authentication tokens, where one can store $F_{k_w}(t, m)$ where t is a randomly chosen, per-user identifier (a salt) and m is the low-entropy password or authentication token. Here w can be distinct for each server using PYTHIA.

Public cloud service: A public cloud such as Amazon EC2, Google Compute Engine, or Microsoft Azure can deploy PYTHIA as an internal, multi-tenant service for their customers. Multi-tenant here means that different customers query the same PYTHIA service, and the cloud provider manages the service, ensemble pre-key table, etc. This enables smaller organizations to obtain the benefits of using PYTHIA for other cloud properties (e.g., web servers running on virtual machine instances) while leaving management of PYTHIA itself to experts.

Public Internet service: One can take the public cloud service deployment to the extreme and run PYTHIA instances that can be used from anywhere on the Internet. This raises additional performance concerns, as one cannot rely on fast intra-datacenter network latencies (sub-millisecond) but rather on wide-area latencies (tens of milliseconds). The benefit is that PYTHIA could then be used by arbitrary web clients; for example, we will explore this scenario in the context of hardening brainwallets via PYTHIA.

One could tailor a PRF service to each of these settings; however, it is better to design a single, application-agnostic service that supports all of these settings simultaneously. A single design permits reuse of open-source implementations, enables standardized, secure-by-default configurations, and simplifies the landscape of PRF services.

Security and functionality goals. Providing a single suitable design requires balancing a number of security and functionality goals. The most obvious requirements are for a service that: provides low-latency protocols (i.e., single round-trip and amenable to implementation
as simple web interfaces); scales to hundreds of millions of ensembles; and produces outputs indistinguishable from random values even when adversaries can query the service. To this list of basic requirements we add:

• Message privacy: The PRF service must learn nothing about m. Message privacy supports clients that require sensitive values such as passwords to remain private even if the service is compromised, or to promote psychological acceptability in the case that a separate organization (e.g., a cloud provider) manages the service.
• Tweak visibility: The server must learn tweak t to permit fine-grained rate-limiting of requests.¹ In the password storage example, a distinct tweak is assigned to each user account, allowing the service to detect and limit guessing attempts against individual user accounts.
• Verifiability: A client must be able to verify that a PRF service has correctly computed $F_{k_w}$ for an ensemble selector w and tweak/message pair t, m. This ensures, after first use of an ensemble by a client, that a subsequently compromised server cannot surreptitiously reply to PRF queries with incorrect values.²
• Client-requested ensemble key rotations: A client must be permitted to request a rotation of its ensemble pre-key K[w] to a new one $\hat{K}[w]$. The server must be able to provide an update token $\Delta_w$ to roll forward PRF outputs under K[w] to become PRF outputs under $\hat{K}[w]$, meaning that the PRF is key-updatable with respect to ensemble keys. Additionally, $\Delta_w$ must be compact, i.e., constant in the number of PRF invocations already performed under w. Clients can mandate that rotation requests be authenticated (to prevent malicious key deletion). A client must additionally be able to transfer an ensemble from one selector w to another selector $w'$.
• Master secret rotations: The server must be able to rotate the master secret key msk with minimal impact on clients. Specifically, the PRF must be key-updatable with respect to the master secret key msk so that PRF outputs under msk can be rolled forward to a new master secret $\widehat{msk}$. When such a rotation occurs, the server must provide a compact update token $\delta_w$ for each ensemble w.
• Forward security: Rotation of an ensemble key or master secret key results in complete erasure of the old key and the update token.

¹ In principle, the server need only be able to link requests involving the same t, not learn t. Explicit presentation of t is the simplest mechanism that satisfies this requirement.
² This matters, for example, if an attacker compromises the communication channel but not the server's secrets (msk and K[w]). Such an attacker must not be able to convince the client that arbitrary or incorrect values are correct.

Two sets of challenges arise in designing PYTHIA. The first is cryptographic. It turns out that the combination of requirements above is not satisfied by any existing protocol we could find. Ultimately we realized a new type of cryptographic primitive was needed that proves to be a slight variant of oblivious PRFs and blind signatures. We discuss the new primitive, and our efficient protocol realizing it, in the next section. The second set of challenges surrounds building a full-featured service that provides the core cryptographic protocol, which we treat in Section 4.

3 Partially-oblivious PRFs

We introduce the notion of a (verifiable) partially-oblivious PRF. This is a two-party protocol that allows the secure computation of $F_{k_w}(t, m)$, where F is a PRF with server-held key $k_w$ and t, m are the input values. The client can verify the correctness of $F_{k_w}(t, m)$ relative to a public key associated to $k_w$. Following our terminology, t is a tweak and m is a message. We say the PRF is partially oblivious because t is revealed to the server, but m is hidden from the server.

Partially oblivious PRFs are closely related to, but distinct from, a number of existing primitives. A standard oblivious PRF [29], or its verifiable version [31], would hide both t and m, but masking both prevents granular rate limiting by the server. Partially blind signatures [1] allow a client to obtain a signature on a similarly partially blinded input, but these signatures are randomized and the analysis is only for unforgeability, which is insufficient for security in all of our applications.

We provide more comparisons with related work in Section 7 and a formal definition of the new primitive in Appendix B. Here we will present the protocol that suffices for PYTHIA. It uses an admissible bilinear pairing $e : G_1 \times G_2 \to G_T$ over groups $G_1, G_2, G_T$ of prime order q, and a pair of hash functions $H_1 : \{0,1\}^* \to G_1$ and $H_2 : \{0,1\}^* \to G_2$ (that we will model as random oracles). More details on pairings are provided in Appendix B. A secret key $k_w$ is an element of $\mathbb{Z}_q$. The PRF F that the protocol computes is:

    $F_{k_w}(t, m) = e(H_1(t), H_2(m))^{k_w}$.

This construction coincides with the Sakai, Ohgishi, and Kasahara [47] construction for non-interactive identity-based key exchange, where t and m would be different identities and $k_w$ a secret held by a trusted key authority. Likewise, this construction is equivalent to the left-or-right constrained PRF of Boneh and Waters [14]. The contexts of these prior works are distinct from ours and
our analyses will necessarily be different, but we note that all three settings similarly exploit the algebraic structure of the bilinear pairing. See Section 7 for further discussion of related work.

The client-server protocol that computes $F_{k_w}(t, m)$ in a partially-oblivious manner is given in Figure 2. There we let g be a generator of $G_1$. We now explain how the protocol achieves our requirements described in the last section.

    PRF-Cl(w, t, m)                                  PRF-Srv(msk)
    r ←$ Z_q
    x ← H2(m)^r
                          ---- w, t, x ---->
                                                     x̃ ← e(H1(t), x)
                                                     k_w ← HMAC(msk, K[w])
                                                     p_w ← g^(k_w)
                                                     y ← x̃^(k_w)
                                                     π ←$ ZKP(DL_g(p_w) = DL_x̃(y))
                          <--- p_w, y, π ---
    If p_w matches and π verifies
      then Ret y^(1/r)
      else Ret ⊥

Figure 2: The partially-oblivious PRF protocol used in PYTHIA. The value π is a non-interactive zero-knowledge proof that the indicated discrete logs match. The client also checks that p_w matches ones seen previously when using selector w.

Blinding the message: In our protocol, the client blinds the message m, hiding it from the server, by raising $H_2(m)$ to a randomly selected exponent $r \xleftarrow{\$} \mathbb{Z}_q$. As $e(H_1(t), H_2(m)^r) = e(H_1(t), H_2(m))^r$, the client can unblind the output y of PRF-Srv by raising it to 1/r. This protocol hides m unconditionally, as $H_2(m)^r$ is a uniformly random element of $G_2$.

Verifiability: The protocol enables a client to verify that the output of PRF-Srv is correct, assuming the client has previously stored $p_w$. The server accompanies the output y of the PRF with a zero-knowledge proof π of correctness.

Specifically, for a public key $p_w = g^{k_w}$, where g is a generator of $G_1$, the server proves $DL_g(p_w) = DL_{\tilde{x}}(y)$. Standard techniques (see, e.g., Camenisch and Stadler [20]) permit efficient ZK proofs of this kind in the random oracle model.³ The notable computational costs for the server are one pairing and one exponentiation in $G_T$; for the client, one pairing and two exponentiations in $G_T$.⁴

Efficient key updates: The server can quickly and easily update the key $k_w$ for a given ensemble selector w by replacing the table entry s = K[w] with a new, randomly selected value $s'$, thereby changing $k_w = \mathrm{HMAC}(msk, s)$ to $k'_w = \mathrm{HMAC}(msk, s')$. It can then transmit to the client an update token of the form $\Delta_w = k'_w / k_w \in \mathbb{Z}_q$.

The client can update any stored PRF value $F_{k_w}(t, m) = e(H_1(t), H_2(m))^{k_w}$ by raising it to $\Delta_w$; it is easy to see that $F_{k_w}(t, m)^{\Delta_w} = F_{k'_w}(t, m)$.

The server can use the same mechanism to update msk, which requires generating a new update token for each w and pushing these tokens to clients as needed.

Unblinded variants. For deployments where obliviousness of messages is unnecessary, we can use a faster, unblinded variant of the PYTHIA protocol that dispenses with pairings, shown in Figure 3. The only changes are that the client sends m to the server, there is no unblinding of the server's response, and, instead of computing $\tilde{x} \leftarrow e(H_1(t), x)$, the server computes $\tilde{x} \leftarrow H_3(t \,\|\, m)$. All group operations in this unblinded variant are over a standard elliptic curve group $G = \langle g \rangle$ of order q and we use a hash function $H_3 : \{0,1\}^* \to G$.

    PRF-Cl(w, t, m)                                  PRF-Srv(msk)
                          ---- w, t, m ---->      *
                                                     x̃ ← H3(t ‖ m)      *
                                                     k_w ← HMAC(msk, K[w])
                                                     p_w ← g^(k_w)
                                                     y ← x̃^(k_w)
                                                     π ←$ ZKP(DL_g(p_w) = DL_x̃(y))
                          <--- p_w, y, π ---
    If p_w matches and π verifies
      then Ret y      *
      else Ret ⊥

Figure 3: The unblinded PRF protocol supported by PYTHIA. Differences from the partially-oblivious protocol in Figure 2 are marked with *.

An alternative unblinded construction would be to have the server apply the Boneh-Lynn-Shacham short signatures [13] to the client-submitted $t \,\|\, m$; verification of correctness can be done using the signature verification routine, and we can thereby avoid ZKPs. This BLS

³ Some details: The prover picks $v \xleftarrow{\$} \mathbb{Z}_q$ and then computes $t_1 = g^v$ and $t_2 = \tilde{x}^v$ and $c \leftarrow H_3(g, p_w, \tilde{x}, y, t_1, t_2)$. Let $u = v - c \cdot k_w$. The proof is $\pi = (c, u)$. The verifier computes $t_1' = g^u \cdot p_w^c$ and $t_2' = \tilde{x}^u y^c$. It outputs true if $c = H_3(g, p_w, \tilde{x}, y, t_1', t_2')$.
⁴ The client's pairing can be pre-computed while waiting for the server's reply.
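As a rough illustration of the discrete-log equality proof outlined in footnote 3, the sketch below runs the same prover and verifier steps in Python. It is not Pythia's implementation: the toy group (integers modulo the small safe prime `P = 2039`, with `G = 4` generating the subgroup of prime order `Q = 1019`), the helper names `challenge`, `prove`, and `verify`, and the use of SHA-256 in place of the hash $H_3$ are all assumptions made for this example. Both bases live in one group here, whereas in the partially-oblivious protocol g is in $G_1$ and $\tilde{x}$ is in $G_T$.

```python
import hashlib
import secrets

# Toy group, assumed purely for illustration: P = 2*Q + 1 with P and Q prime,
# and G = 4 generates the order-Q subgroup of squares modulo P.
# These sizes are far too small for real use.
P, Q, G = 2039, 1019, 4


def challenge(*elems):
    """Hash a transcript to a challenge in Z_Q (SHA-256 stands in for H3)."""
    data = "|".join(str(e) for e in elems).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(k, x_tilde):
    """Server side: return p_w = G^k, y = x_tilde^k, and the proof (c, u)."""
    p_w = pow(G, k, P)
    y = pow(x_tilde, k, P)
    v = secrets.randbelow(Q)              # fresh random nonce v in Z_Q
    t1 = pow(G, v, P)
    t2 = pow(x_tilde, v, P)
    c = challenge(G, p_w, x_tilde, y, t1, t2)
    u = (v - c * k) % Q                   # response u = v - c*k (mod Q)
    return p_w, y, (c, u)


def verify(p_w, x_tilde, y, proof):
    """Client side: recompute t1', t2' and compare against the challenge."""
    c, u = proof
    t1 = (pow(G, u, P) * pow(p_w, c, P)) % P       # equals G^v if honest
    t2 = (pow(x_tilde, u, P) * pow(y, c, P)) % P   # equals x_tilde^v if honest
    return c == challenge(G, p_w, x_tilde, y, t1, t2)


if __name__ == "__main__":
    k_w = secrets.randbelow(Q - 1) + 1    # hypothetical ensemble key
    x_tilde = pow(G, 77, P)               # hypothetical subgroup element
    p_w, y, proof = prove(k_w, x_tilde)
    print("proof verifies:", verify(p_w, x_tilde, y, proof))
```

In the protocol of Figure 2 the client would also compare the returned `p_w` against the value it stored when it first used selector w before accepting `y`; the sketch leaves that bookkeeping out.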

variant may save a small amount of bandwidth.

These unblinded variants provide the same services (verifiability and efficient key updates) and security with the obvious exception of the secrecy of the message m. In some deployment contexts an unblinded protocol may be sufficient, for example when the client can maintain state and submit a salted hash of m instead of m directly. In this context, the salt should be held as a secret on the client and never sent to the server.

4 The PYTHIA Service Design

Figure 4 gives the high-level API exposed by PYTHIA to a client. We now describe its functions in terms of the lifecycle of an ensemble key. We assume a security parameter n specifying symmetric key lengths; a typical choice would be n = 128.

    Command                 Description
    Init(w [, options])     Create table entry K[w] (for ensemble key k_w)
    Eval(w, t, m)           Return PRF output F_{k_w}(t, m)
    Reset(w, authtoken)     Update K[w] (and thus k_w); return update token ∆_w
    GetAuth(w)              Send one-time authentication token authtoken to client

Figure 4: The basic PYTHIA API.

We defer to later sections the underlying client-server protocols and to Appendix A details on key lifecycle management options, additional API calls for token management and ensemble transfer, and a discussion of master secret key rotation.

Ensemble initialization. To begin using the PYTHIA service, a client creates an ensemble key for selector w by invoking Init(w [, options]). PYTHIA generates a fresh, random table entry K[w]. Recall that ensemble key $k_w = \mathrm{HMAC}(msk, K[w])$. So Init creates $k_w$ as a byproduct.

Ideally, w should be an unguessable byte string. (An easily guessed one may allow attackers to squat on a key selector, thereby mounting a denial-of-service (DoS) attack.) For some applications, as we explain below, this isn't always possible. If an ensemble key for w already exists, then the PYTHIA service returns an error to the client. Otherwise, the client receives a message signifying that initialization is successful.

Init includes a number of options we detail in Appendix A.

PRF evaluation. To obtain a PRF value, a client can perform an evaluation query Eval(w, t, m), which returns $F_{k_w}(t, m)$. Here t is a tweak and m is a message. To compute the PRF output, the client and server perform a one-round cryptographic protocol (meaning a single message from client to server, and one message back). We present details in Section 3, but remind the reader that t is visible to the server in the client-server protocol invoked by Eval, while m is blinded.

The server rate-limits requests based on the tweak t, and can also raise an alert if the rate limit is exceeded. We give example rate limiting policies in Section 5.

Ensemble-key reset. A client can request that an ensemble key $k_w$ be reset by invoking Reset(w). This reset is accomplished by overwriting K[w] with a fresh, random value. The service returns a compact (e.g., 256-bit) update token $\Delta_w$ that the client may use to update all PRF outputs for the ensemble. It stores this token locally, encrypted under a public key specified by the client, as explained below.

Note that reset results in erasure of the old value of $k_w$. Thus a client that wishes to delete an ensemble key $k_w$ permanently at the end of its lifecycle can do so with a Reset call.

Reset is an authenticated call, and thus requires the following capability.

Authentication. To authenticate itself for API calls, the client must first invoke GetAuth, which has the server transmit an (encrypted) authentication token authtoken to the client out-of-band. The token expires after a period of time determined by a configuration parameter in PYTHIA. Our current implementation uses e-mail for this; see Appendix A for more details. Of course, in some deployments one may want authentication to be performed in other ways, such as tokens dispensed by administrators (for enterprise settings) or simply given out on a first-come, first-served basis for each ensemble identifier (for public Internet services).

4.1 Implementation

We implemented a prototype PYTHIA PRF service as a web application accessed over HTTPS. All requests are first handled by an nginx web server with uWsgi as the application server gateway that relays requests to a Django back-end. The PRF-Srv functionality is implemented as a Django module written in Python. Storage for the server's key table and rate-limiting information is done in MongoDB.

We use the cryptographic library [2] (written in C) with our own Python wrapper. We use Barreto-Naehrig 254-bit prime order curves (BN-254) [4]. These curves provide approximately 128 bits of security.

In our experiments the service is run on a single (virtual) machine, but our software stack permits components (web server, application server, database) to be distributed among multiple machines with updates to configuration files.

For the purpose of comparison, we implemented three variants of the PYTHIA service. The first two are the unblinded protocols described in Section 3. In these two schemes, the client sends m in the clear (possibly hashed with a secret salt value first) and the server replies with $y = H_1(t \,\|\, m)^k$. In the first scheme, denoted UNB, the server provides $p = g_1^k$ and a zero-knowledge proof, where $g_1$ is a generator of $G_1$. The second scheme, denoted BLS, uses a BLS signature for verification. The server provides $p = g_2^k$ where $g_2$ is a generator of $G_2$ and the client verifies the response by computing and comparing the values: $e(y, g_2) = e(H_1(t \,\|\, m), p)$.

Our partially-oblivious scheme is denoted PO.

For the evaluation below we use a Python client implementing PRF-Cl for all three schemes using the same libraries indicated above for the server and httplib2 to perform HTTPS requests.

4.2 Performance

For performance and scalability evaluation we hosted our PYTHIA server implementation on Amazon's Elastic Compute Cloud (EC2) using a c4.xlarge instance which provides 8 virtual CPUs (Intel Xeon third generation, 2.9 GHz), 15 GB of main memory, and solid state storage. The web server, nginx, was configured with basic settings recommended for production deployment including one worker process per CPU.

Latency. We measured client query latency for each protocol using two clients: one within the same Amazon Web Service (AWS) availability zone (also c4.xlarge) and one hosted at the University of Wisconsin–Madison with an Intel Core i7 CPU (3.4 GHz). We refer to the first as the LAN (local-area network) setting and the second as the WAN (wide-area network) setting. In the LAN setting we used the AWS internal IP address. All queries were made over TLS and measurements include the time required for clients to blind messages and unblind results (PO), as well as verify proofs provided by the server (unless indicated otherwise). All machines used for evaluation were running Ubuntu 14.04.

Microbenchmarks for group operations appear in Figure 5, and Figure 6 shows the timing of individual operations that comprise a single PRF evaluation. All results are mean values computed over 10,000 operations. These values were captured on an EC2 c4.xlarge instance using the Python profiling library line_profiler. The most expensive operations, by a large margin, are exponentiation in $G_T$ and the pairing operation. By extension, PO sign, prove, and verify operations become expensive.

                    Time (µs)
    Group    Group Op    Exp     Hashing
    G1       5.7         175     77
    G2       6.7         572     210
    GT       9.8         1145    –
    The pairing operation (e) takes 1005 µs.

Figure 5: Time taken by each operation in BN-254 groups. Hashing times are for 64-byte inputs.

    Server Op     Time (ms)
    Table         1.2
    Rate-limit    0.9

    Server Op     UNB     BLS     PO
    Sign          0.3     0.3     1.5
    Prove         0.5     0.3     2.5

    Client Op     UNB     BLS     PO
    Blind         -       -       0.3
    Unblind       -       -       1.2
    Verify        0.9     2.0     4.0

Figure 6: Computation time (ms) for major operations to perform a PRF evaluation. Table retrieves K[w] from database; Rate-limit updates rate-limiting record in database; and Sign generates the PRF output.

We measured latencies averaged over 1,000 PRF requests (with 100 warmup requests) for each scheme and the results appear in Figure 7. Computation time dominates in the LAN setting due to the almost negligible network latency. The WAN case with cold connections (no HTTP KeepAlive) pays a performance penalty due to the four round-trips required to set up a new TCP and TLS connection. While even 400 ms latencies are not prohibitive in our applications, straightforward engineering improvements would vastly improve WAN timing: using TLS session resumption, using a lower-latency secure protocol such as QUIC [46], or even switching to a custom UDP protocol (for an example of one for oblivious PRFs, see [5]).

Throughput. We used the distributed load testing tool autobench to measure maximum throughput for each scheme. We compare to a static page containing a typical PRF response served over HTTPS as a baseline. We used two clients in the same AWS region as the server. All connections were cold: no TLS session resumption or HTTP KeepAlive. Results appear in Figure 8. The maximum throughput for a static page is 2,200 connections per second (cps); for UNB and BLS, 1,400 cps; and for PO, 1,350 cps. Thus our PYTHIA implementation can handle a large number of clients on a single EC2 instance. If needed, the implementation can be scaled with standard techniques (e.g., a larger number of web servers and application servers on the front-end with a distributed key-value store on the back-end).

Storage. Our implementation stores all ensemble pre-key table (K) entries and rate-limiting information in
MongoDB. A table entry consists of two 32-byte values: a SHA-256 hash of the ensemble selector w and its associated value K[w]. In MongoDB the average storage size is 195 bytes per entry (measured as the average of 100K entries), including database overheads and indexes. This implementation scales easily to 100 M clients with under 20 GB of storage.

To rate-limit queries, our implementation stores tweak values along with a counter and a timestamp (to expire old entries) in MongoDB. Tweak values are also hashed using SHA-256, which ensures entries are of constant length. In our implementation each distinct tweak requires an average of 144 bytes per entry (including overheads and indexes). Note, however, that rate-limiting entries are purged periodically, as counts are only required for one rate-limiting period. Our implementation imposes rate-limits at hour granularity. Assuming a maximum throughput of 2,000 requests per second, rate-limiting storage never exceeds 1 GB.

All told, with only 20 GB of stored data, PYTHIA can serve over 100 M clients and perform rate-limiting at hour granularity. Thus fielding a database for PYTHIA can be accomplished on commodity hardware.

                   LAN latency (ms)          WAN latency (ms)
    Scheme      Cold    Hot     No π      Cold    Hot     No π
    UNB         7.0     3.8     2.4       389     82      80
    BLS         7.9     4.9     2.4       392     85      80
    PO          14.9    11.8    5.2       403     96      84
    RTT ping            0.1                       82

Figure 7: Average latency to complete a PRF-Cl with client-server communication over HTTPS. LAN: client and server in the same EC2 availability zone. WAN: server in EC2 US-West (California) and client in Madison, WI. Hot connections made with HTTP KeepAlive enabled; cold connections with KeepAlive disabled. No π: KeepAlive enabled; prove and verify computations are skipped.

[Figure 8 plot: offered versus established connections per second (conns/sec) for the Static, UNB, BLS, and PO configurations.]

Figure 8: Throughput of PRF-Srv requests and a static page request over HTTPS measured using two clients and a server hosted in the same EC2 availability zone.

5 Password Onions

Web servers and other systems frequently store passwords in hashed form. A password onion is the result of additionally invoking a PRF service to harden the hash. In currently suggested onions, one sequentially combines local hashing and application of the PRF service.

We now present a service that we have implemented on top of PYTHIA for managing password onions. First, we describe the limitations of contemporary systems as exemplified by a recently disclosed architecture employed by Facebook [42]. Then we show how our password-onion system, which was easily engineered on top of PYTHIA, can address these limitations.

In what follows, we use the term "client" or "web server" to denote the server performing authentication and storing derived values from passwords, and "PRF server" to denote the PYTHIA service.

5.1 Facebook password onion

An example of a contemporary system, used by Facebook, is given in Figure 9.⁵ Their PRF service applies HMAC using a service-held secret and returns the result. In this architecture, an adversary that compromises the web server and the password hashes it stores must still mount an online attack against the PRF service to compromise accounts. This is a big advance on the hashing-only practices that are commonly used.

    PW-Onion(pw)
      h1 ← MD5(pw)
      sa ←$ {0, 1}^160
      h2 ← HMAC[SHA-1](h1, sa)
      h3 ← PRF-Cl(h2) = HMAC[SHA-256](h2, msk)
      h4 ← scrypt(h3, sa)
      h5 ← HMAC[SHA-256](h4)
      Ret (sa, h5)

Figure 9: The Facebook password onion. PRF-Cl(h2) invokes the Facebook PRF service HMAC[SHA-256](h2, Ks) with PRF-service secret key Ks.

⁵ This figure is of "archaeological" interest. It appears that vulnerabilities in MD5 led to the addition of a layer of processing under SHA-1; when vulnerabilities were found in SHA-1, Facebook then added layers of SHA-256. As we explain later, full-blown replacement of MD5 and SHA-1 with SHA-256 was not easily accomplished.

The Facebook architecture nevertheless has some shortcomings. It is easy to see from Figure 9 that Facebook's system, like most contemporary PRF services, lacks several important features present in PYTHIA. One
is message privacy: the Facebook PRF service applies HMAC to h2. This is the salted hash of the password, and so learning the salt as well as compromising the PRF service suffices to re-enable offline brute-force attacks. This threat is avoided by PYTHIA due to blinding.

Another feature is batch key updates. In fact, the Facebook PRF service doesn't permit autonomous key updates at all, in the sense of an update to msk that can be propagated into PRF output updates. Should the client (password database) be compromised, the only way to reconstitute a hash in an existing password onion is to wait until a user logs in and furnishes pw. It is not clear whether the Facebook PRF service performs granular rate-limiting, although no such capability is indicated in [41]. PYTHIA, as we shall see, addresses all of these issues by design in our password onion system.

The Facebook onion also presents a subtle performance issue. By applying cryptographic primitives serially, the time to hash a password equals the time for local computations, call it $t_{local}$, plus the time for the round-trip PRF service call, call it $t_{prf}$. An attacker that compromises the web service and PRF service incurs no network latency, and thus may gain a considerable advantage in guessing time over an honest web server. In our PYTHIA-based password onion service, we address this issue by observing that it is possible to avoid serialization of key derivation functions on the web server and the PRF service call. That is, we introduce in our PYTHIA-based service the idea of parallelizable password onions.

5.2 PYTHIA password onion

The onion algorithm we construct for PYTHIA is shown in Figure 10. For PYTHIA, the output of PRF-Cl is an element of a group $G_T$. To use this service, a web server stores (h, sa) upon password registration; it verifies a proffered password pw' by checking that UpParOnion(w, sa, pw') = h. Written out we have that:

    $h = u^z = e(H_1(sa), H_2(pw))^{k_w z}$.

This design ensures that the key update functions in the PYTHIA API may be used to update onions as well. For example, to update an ensemble key $k_w$ to $k_w'$, the service computes and furnishes to the web server an update token $\Delta_w = k_w' / k_w$. The web server may compute $h^{\Delta_w}$ for each stored value h.

    UpParOnion(w, sa, pw)
      z ← PBKDF(pw, sa)
      u ← PRF-Cl(w, sa, pw)
      h ← u^z
      Ret (h, sa)

Figure 10: An updatable, parallelizable password onion. PRF-Cl returns elements of a group G. The value w is a unique PRF-service identifier for the web server (e.g., a random 256-bit string) and sa is a random per-user salt value.

Parallelization. Password verification here is parallelizable in the sense that z and u may be computed independently and then combined. Such parallel implementation of the onion achieves a password verification latency of $\max\{t_{local}, t_{prf}\}$ (plus a single exponentiation), as opposed to $t_{local} + t_{prf}$ in a serialized implementation.

A web server generally aims to achieve a verification latency equal to some latency target T that is high enough to slow offline brute-force attacks, but low enough not to burden users. For a parallelized onion a web server can meet its latency target by setting $t_{local}, t_{prf} \approx T$. At the same time an offline attacker that has compromised the web server and PYTHIA must perform about $t_{local} + t_F > T$ work to check a single password guess, where $t_F$ is the computation time of $F_{k_w}$ (i.e., $t_{prf}$ minus network latency). An attacker can parallelize, but her total work still goes up relative to the serial onion approach for the same latency target T.

We estimate the security improvement of parallel onions over serial onions using our benchmarks from Section 4.2. We fix a login latency budget of T = 300 ms.⁶ The latency costs for a PYTHIA query with hot connections are 12 ms (LAN) and 96 ms (WAN). If one performs computations serially with a fixed T, then PBKDF computations need to be reduced by 4% (LAN) and 32% (WAN) compared to the parallel approach, since 12/300 = 4% and 96/300 = 32% of the budget is spent waiting on the PRF call. In the event that the PYTHIA server and password database are compromised, the serial onion enables speedup of offline dictionary attacks by the same percentages.

⁶ This is the default setting for Python's bcrypt and scrypt modules, though all PBKDFs are tunable so one can choose T to be any value desired.

Rate limiting and logging. The transparency of tweaks enables the PYTHIA PRF service in this setting to execute any of a wide range of rate-limiting policies with per-account visibility (in contrast to what may be, in Facebook's case, an account-blind PRF service). As an example demonstrating the flexibility of our architecture, in our implementation PYTHIA performs tiered rate-limiting: for a given account (t), it limits queries to at most 10 per hour and at most 300 per month. (In expectation, guessing a random 4-digit PIN would require 1.4 years under this policy.) It logs violations of these thresholds. In a production environment, it could also send alerts to security administrators.

We emphasize that a wide range of other rate-limiting policies is possible. We also point out that PYTHIA's rate
limiting supplements that normally implemented at the web server for remote login requests. PYTHIA performs rate limiting and may issue alerts even if the web server is compromised.

Key update. The key update calls in the PYTHIA API, and the ability to rotate either $k_w$ or msk efficiently, propagate up to the password onion service. Key updates instantly invalidate the web server's existing password database, a useful capability in case of compromise. A compromised database becomes useless to an attacker attempting to recover passwords, even with the ability to query PYTHIA. Using a key update token, the web server can then recover from compromise by refreshing its database.

We created a client simulator with MongoDB and the mongoengine Python module. With this we benchmarked key updates with 100,000 database entries. The client requested a key update from PYTHIA, received the update token $\Delta_w$, and updated each database entry. The complete update required less than 1 ms per entry, and terminated in less than 97 seconds for all 100,000 entries. For a larger database we assume updates scale linearly, and so an update for 1 million users completes in under 17 minutes.

The web server need not lock the database to perform updates; it can execute them in parallel with normal login operations. Doing so does require additional versioning information for each entry to indicate the version of $k_w$ (in the simplest form, whether or not it has received the latest update).

Database replication. Password databases can be replicated with a key transfer using the API call Transfer (see Appendix A). In this replication each new copy uses a unique ensemble key selector and thus a cryptographically independent PRF service key. Given a database $(w, \{(sa_1, h_1), \ldots, (sa_d, h_d)\})$ with d users, the administrator invokes Transfer(w, w') to obtain a token $\Delta_{w \to w'}$. The client computes $h_i' = h_i^{\Delta_{w \to w'}}$ for $i \in [1..d]$ and sends the new database $(w', \{(sa_1, h_1'), \ldots, (sa_d, h_d')\})$ to the new server. The client does not modify salt values, which allows PYTHIA to link online guessing attacks carried out from multiple compromised web servers. Replication in this way costs database copy time plus 1 ms per entry to apply the update token, thus making it on the order of minutes for hundreds of thousands of users.

6 Hardened Brainwallets

Brainwallets are a common but dangerous way to secure accounts in the popular cryptocurrency Bitcoin, as well as in less popular cryptocurrencies such as Litecoin. Here we describe how the PYTHIA service can be used directly as a means to harden brainwallets. This application showcases the ease with which a wide variety of applications can be engineered around PYTHIA.

How brainwallets work. Every Bitcoin account has an associated private / public key pair (sk, pk). The private key sk is used to produce digital (ECDSA) signatures that authorize payments from the account. The public key pk permits verification of these signatures. It also acts as an account identifier; a Bitcoin address is derived by hashing pk (under SHA-256 and RIPEMD-160) and encoding it (in base 58, with a check value).

Knowledge of the private key sk equates with control of the account. If a user loses a private key, she therefore loses control over her account. For example, if a high-entropy key sk is stored exclusively on a device such as a mobile phone or laptop, and the device is seized or physically destroyed, the account assets become irrecoverable.

Brainwallets offer an attractive remedy for such physical risks of key loss. A brainwallet is simply a password or passphrase P memorized by a Bitcoin account holder. The private key sk is generated directly from P. Thus the user's memory serves as the only instrument needed to authorize access to the account.

In more detail, the passphrase is typically hashed using SHA-256 to obtain a 256-bit string sk = SHA-256(P). Bitcoin employs ECDSA signatures on the secp256k1 elliptic curve; with high probability ($\approx 1 - 2^{-126}$), sk is less than the group order, and a valid ECDSA private key. (Some websites employ stronger key derivation functions. For example, WarpWallet by keybase.io [32] derives sk from an XOR of each of PBKDF2 and scrypt applied to P and permits use of a user-supplied salt.)

Since a brainwallet employs only P as a secret, and does not necessarily use any additional security measures, an attacker that guesses P can seize control of a user's account. As account addresses are posted publicly in the Bitcoin system (in the "blockchain"), an attacker can easily confirm a correct guess. Brainwallets are thus vulnerable to brute-force, offline guessing attacks. Numerous incidents have come to light showing that brainwallet cracking is pandemic [15].⁷

⁷ At one point, rumor had it that cracking brainwallets was more profitable than "mining," the basic process of generating fresh Bitcoins.

6.1 A PYTHIA-hardened brainwallet

PYTHIA offers a simple, powerful means of protecting brainwallets against offline attack. Hardening P in the same manner as an ordinary password yields a strong key $\tilde{P}$ that can serve in lieu of P to derive sk.

To use PYTHIA, a user chooses a unique identifier id, e.g., her e-mail address, an account identifier acct, and a
passphrase P. The identifier acct might be used to distinguish among Bitcoin accounts for users who wish to use the same password for multiple wallets. The client then sends $(w = id,\ t = id \| acct,\ m = P)$ to the PYTHIA service to obtain the hardened value $F_{k_w}(t, m) = \tilde{P}$. Here, id is used both as an account identifier and as part of the salt. Message privacy in PYTHIA ensures that the service learns nothing about P. Then $\tilde{P}$ is hashed with SHA-256 to yield sk. The corresponding public key pk and address are generated in the standard way from sk [8].

PYTHIA forces a would-be brainwallet attacker to mount an online attack to compromise an account. Not only is an online attack much slower, but it may be rate-limited by PYTHIA and detected and flagged. As the PYTHIA service derives $\tilde{P}$ using a user-specific key, it additionally prevents an attacker from mounting a dictionary attack against multiple accounts. While in the conventional brainwallet setting two users who make use of the same secret P will end up controlling the same account, PYTHIA ensures that the same password P produces distinct per-user key pairs.

Should an attacker compromise the PYTHIA service and steal msk and K, the attacker must still perform an offline brute-force attack against the user's brainwallet. So in the worst case, a user obtains security with PYTHIA at least as good as without it.

Additional security issues. A few subtle security issues deserve brief discussion:

• Stronger KDFs: To protect against brute-force attack in the event of PYTHIA compromise, a resource-intensive key-derivation function may be desirable, as is normally used in password databases. This can be achieved by replacing the SHA-256 hash of $\tilde{P}$ above with an appropriate KDF computation, or alternatively using an onion approach described in Section 5.

• Denial-of-service: By performing rate-limiting, PYTHIA creates the risk of targeted denial-of-service attacks against Bitcoin users. As Bitcoin is pseudonymous, use of an e-mail address as a PYTHIA key-selector suffices to prevent such attacks against users based on their Bitcoin addresses alone. Users also have the option, of course, of using a semi-secret id. A general DoS attack against the PYTHIA service is also possible, but of similar concern for Bitcoin itself [9].

• Key rotation: Rotation of an ensemble key $k_w$ (or the master key msk) induces a new value of $\tilde{P}$ and thus a new (sk, pk) pair and account. A client can handle such rotations in the naïve way: transfer funds from the old address to the new one.

• Catastrophic failure of PYTHIA: If a PYTHIA service fails catastrophically, e.g., msk or K is lost, then in a typical setting, it is possible simply to reset users' passwords. In the brainwallet case, the result would be loss of virtual-currency assets protected by the server, a familiar event for Bitcoin users [38]. This problem can be avoided, for instance, by using a threshold implementation of PYTHIA, as mentioned in Section 6.2, or by storing sk in a secure, offline manner (such as a safe-deposit box) for disaster recovery.

6.2 Threshold Security

In order to gain both redundancy and security, we give a threshold scheme that can be used with a number of Pythia servers to protect a secret under a single password. This scheme uses Shamir's secret sharing threshold scheme [48] and gives (k, n) threshold security. That is, initially, n Pythia servers are contacted and used to protect a secret s; then any k servers can be used to recover s, and any adversary that has compromised fewer than k Pythia servers learns no information about s.

Preparation. The client chooses an ensemble key selector w, tweak t, and password P, and contacts n Pythia servers to compute $q_i = \text{PRF-Cl}_i(w, t, P) \bmod p$ for $0 < i \le n$. The client selects a random polynomial of degree k − 1 with coefficients from $\mathbb{Z}_p^*$, where p is a suitably large prime: $f(x) = \sum_{j=0}^{k-1} a_j x^j$. Let the secret $s = a_0$. Next the client computes the vector $\Phi = (\phi_1, \ldots, \phi_n)$ where $\phi_i = f(i) - q_i$. The client durably stores the value Φ, but does not need to protect it (it's not secret). The client also stores public keys $p_i$ from each Pythia server to validate proofs when issuing future queries.

Recovery. The client can reconstruct s if she has Φ by querying any k Pythia servers, giving k values $q_i$. Adding each $q_i$ to the corresponding $\phi_i$ yields $f(i) = \phi_i + q_i$, and hence k distinct points that lie on the curve f(x). With k points on a degree k − 1 curve, the client can use interpolation to recover the unique polynomial f(x), which includes the curve's intercept $a_0 = s$.

Security. If an adversary is given Φ, w, t, the public keys $p_i$, a ciphertext based on s, and the secrets from m < k Pythia servers, the adversary has no information that will permit her to verify password guesses offline. Compared to [48], this scheme reduces the problem of storing n secrets to having access to n secure OPRFs and durable (but non-secret) storage of the values Φ and public keys $p_i$.

Verification. Verification of server responses occurs within the Pythia protocol. If a server is detected to be
dishonest (or goes out of service), it can be easily replaced by the client without changing the secret s. To replace a Pythia server that is suspected to be compromised or detected as dishonest, the client reconstructs the secret s using any k servers and executes Reset operations on all remaining servers; this effects a cryptographic erasure on the values Φ and f(x). The client then selects a new, random polynomial, keeping $a_0$ fixed, and generates and stores an updated Φ' that maps to the new polynomial.

7 Related Work

We investigated a number of designs based on existing cryptographic primitives in the course of our work, though as mentioned none satisfied all of our design goals. Conventional PRFs built from block ciphers or hash functions fail to offer message privacy or key rotation. Consider instead the construction $H(t \| m)^{k_w}$ for $H : \{0, 1\}^* \to G$ a cryptographic hash function mapping onto a group G. This was shown secure as a conventional PRF by Naor, Pinkas, and Reingold assuming decisional Diffie-Hellman (DDH) is hard in G and when modeling H as a random oracle [43]. It supports key rotations (in fact it is key-homomorphic [12]) and verifiability can be handled using non-interactive zero-knowledge proofs (ZKP) as in PYTHIA. But this approach fails to provide message privacy if we submit both t and m to the server and have it compute the full hash.

One can achieve message-hiding by using blinding: have the client submit $X = H(t \| m)^r$ for random $r \in \mathbb{Z}_{|G|}$ and the server reply with $X^{k_w}$ as well as a ZKP proving this was done correctly. The resulting scheme is originally due to Chaum and Pedersen [22], and was suggested for use by Ford and Kaliski [28] in the context of threshold password-authenticated secret sharing (see also [3, 18, 23, 37]). There an end user interacts with one or more blind signature servers to derive a secret authentication token. If G comes equipped with a bilinear pairing, one can dispense with ZKPs. The resulting scheme is Boldyreva's blinded version [11] of BLS signatures [13]. However, neither approach provides granular rate limiting when blinding is used: the tweak t is hidden from the server. Even if the client sends t as well, the server cannot verify that it matches the one used to compute X, and attackers can thereby bypass rate limits.

To fix this, one might use Ford-Kaliski with a separate secret key for each tweak. This would result in having a different key for each unique w, t pair. Message privacy is maintained by the blinding, and querying $w, t, H(t' \| m)^r$ for $t \ne t'$ does not help an attacker circumvent per-tweak rate limiting. But now the server-side storage grows in the number of unique w, t pairs, a client using a single ensemble w must now track N public keys when they use the service for N different tweaks, and key rotation requires N interactions with the PRF server to get N separate update tokens (one per unique tweak for which a PRF output is stored). When N is large and the number of ensembles w is small, as in our password storage application, these inefficiencies add significant overheads.

Another issue with the above suggestions is that their security was only previously analyzed in the context of one-more unforgeability [45] as targeted by blind signatures [21] and partially blind signatures [1]. (Some were analyzed as conventional PRFs, but that is in a model where adversaries do not get access to a blinded server oracle.) The password onion application requires more than unforgeability because message privacy is needed. (A signature could be unforgeable but include the entire message in its signature, and this would obviate the benefits of a PRF service for most applications.) These schemes, however, can be proven to be one-more PRFs, the notion we introduce, under suitable one-more DDH style assumptions using the same proof techniques found in Appendix B.

Fully oblivious PRFs [29] and their verifiable versions [31] also do not allow granular rate limiting. We note that the Jarecki, Kiayias, and Krawczyk constructions of verifiable OPRFs [31] in the RO model are essentially the Ford-Kaliski protocol above, but with an extra hash computation, making the PRF output $H'(t \| m \| H(t \| m)^{k_w})$. Our notion of one-more unpredictability in the appendix captures the necessary requirements on the inner cryptographic component, and might modularize and simplify their proofs. Their transform is similar to the unique blind signature to OPRF transformation of Camenisch, Neven, and shelat [19]. None of these efficient oblivious PRF protocols support key rotations (with compact tokens or otherwise), as the final hashing step destroys updatability.

The setting of capture-resilient devices shares with ours the use of an off-system key-holding server and the desire to perform cryptographic erasure [35, 36]. They only perform protocols for encryption and signing functionalities, however, and not (more broadly useful) PRFs. They also do not support granular rate limiting and master secret key rotation.

Our main construction coincides with prior ones for other contexts. The Sakai, Ohgishi, and Kasahara [47] identity-based non-interactive key exchange protocol computes a symmetric encryption key as $e(H_1(ID_1), H_2(ID_2))^k$ for k a master secret held by a trusted party and $ID_1$ and $ID_2$ being the identities of the parties. See [44] for a formal analysis of their scheme. Boneh and Waters suggest the same construction as a left-or-right constrained PRF [14]. The settings and their goals are different from ours, and in particular one cannot use either as-is for our applications. Naïvely
one might hope that returning the constrained PRF key $H_1(t)^{k_w}$ to the client suffices for our applications, but in fact this totally breaks rate-limiting. Security analysis of our protocol requires new techniques, and in particular security must be shown to hold when the adversary has access to a half-blinded oracle; this rules out the techniques used in [14, 44].

Key-updatable encryption [12] and proxy re-encryption [10] both support key rotation, and could be used to encrypt password hashes in a way that supports compact update tokens and prevents offline brute-force attacks. But this would require encryption and decryption to be handled by the hardening service, preventing message privacy.

Verifiable PRFs as defined by [24, 25, 34, 39] allow one to verify that a known PRF output is correct relative to a public key. Previous verifiable PRF constructions are not oblivious, let alone partially oblivious.

Threshold and distributed PRFs [24, 40, 43] as well as distributed key distribution centers [43] enable a sufficiently large subset of servers to compute a PRF output, but previous constructions do not provide the granular rate limiting and key rotation we desire. However, it is clear that there are situations where applications would benefit from a threshold implementation of PYTHIA, for both redundancy and distribution of trust, as discussed in Section 6.2 for the case of brainwallets.

8 Conclusion

We presented the design and implementation of PYTHIA, a modern PRF service. Prior works have explored the use of remote cryptographic services to harden keys derived from passwords or otherwise improve resilience to compromise. PYTHIA, however, transcends existing designs to simultaneously support granular rate limiting, efficient key rotation, and cryptographic erasure. This set of features, which stems from practical requirements in applications such as enterprise password storage, proves to require a new cryptographic primitive that we refer to as a partially oblivious PRF.

Unlike a (fully) oblivious PRF, a partially oblivious PRF causes one portion of an input to be revealed to the server to enable rate limiting and detection of online brute-force attacks. We provided a bilinear-pairing based construction for partially oblivious PRFs that is highly efficient and simple to implement (given a pairings library), and also supports efficient key rotations. A formal proof of security is unobtainable using existing techniques (such as those developed for fully oblivious PRFs). We thus gave new definitions and proof techniques that may be of independent interest.

We implemented PYTHIA and showed how it may be easily integrated into a range of applications. We designed a new enterprise "password onion" system that improves upon the one recently reported in use at Facebook. Our system permits fast key rotations, enabling practical reactive and proactive key management, and uses a parallelizable onion design which, for a given authentication latency, imposes more computational effort on attackers after a compromise. We also explored the use of PYTHIA to harden brainwallets for cryptocurrencies.

Acknowledgements

The authors thank Kenny Paterson for feedback on an early draft of this paper. This work was supported in part by NSF grants CNS-1330308, CNS-1065134, and CNS-1330599, as well as a gift from Microsoft.

References

[1] Masayuki Abe and Tatsuaki Okamoto. Provably secure partially blind signatures. In Advances in Cryptology–CRYPTO. Springer, 2000.
[2] D. F. Aranha and C. P. L. Gouvêa. RELIC is an Efficient LIbrary for Cryptography. https://github.com/relic-toolkit/relic.
[3] Ali Bagherzandi, Stanislaw Jarecki, Nitesh Saxena, and Yanbin Lu. Password-protected secret sharing. In Computer and Communications Security. ACM, 2011.
[4] Paulo SLM Barreto and Michael Naehrig. Pairing-friendly elliptic curves of prime order. In Selected Areas in Cryptography. Springer, 2006.
[5] Mihir Bellare, Sriram Keelveedhi, and Thomas Ristenpart. Dupless: server-aided encryption for deduplicated storage. In USENIX Security. USENIX, 2013.
[6] Mihir Bellare, Chanathip Namprempre, David Pointcheval, Michael Semanko, and Matthew Franklin. The one-more-RSA-inversion problems and the security of Chaum's blind signature scheme. Journal of Cryptology, 16(3), 2003.
[7] Mihir Bellare, Thomas Ristenpart, and Stefano Tessaro. Multi-instance security and its application to password-based cryptography. In Advances in Cryptology–CRYPTO. Springer, 2012.
[8] Technical background of version 1 Bitcoin addresses. https://en.bitcoin.it/wiki/Technical_background_of_version_1_Bitcoin_addresses.
[9] Bitcoin wiki, "weaknesses". https://en.bitcoin.it/wiki/Weaknesses.
[10] Matt Blaze, Gerrit Bleumer, and Martin Strauss. Divertible protocols and atomic proxy cryptography. In Advances in Cryptology–EUROCRYPT. Springer, 1998.
[11] Alexandra Boldyreva. Threshold signatures, multisignatures and blind signatures based on the gap-Diffie-Hellman-group signature scheme. In Public Key Cryptography. Springer, 2002.
[12] Dan Boneh, Kevin Lewi, Hart Montgomery, and Ananth Raghunathan. Key homomorphic PRFs and their applications. In Advances in Cryptology–CRYPTO. Springer, 2013.
[13] Dan Boneh, Ben Lynn, and Hovav Shacham. Short signatures from the Weil pairing. In Advances in Cryptology–ASIACRYPT. Springer Berlin Heidelberg, 2001.
[14] Dan Boneh and Brent Waters. Constrained pseudorandom functions and their applications. In Advances in Cryptology–ASIACRYPT. Springer, 2013.
[15] Brainwallet. https://en.bitcoin.it/wiki/Brainwallet.
[16] Emmanuel Bresson, Jean Monnerat, and Damien Vergnaud. Separation results on the "one-more computational" problems. In Topics in Cryptology–CT-RSA. Springer, 2008.
[17] Daniel R. L. Brown. Irreducibility to the one-more evaluation problems: More may be less. Cryptology ePrint Archive, Report 2007/435.
[18] Jan Camenisch, Anna Lysyanskaya, and Gregory Neven. Practical yet universally composable two-server password-authenticated secret sharing. In Computer and Communications Security. ACM, 2012.
[19] Jan Camenisch, Gregory Neven, and abhi shelat. Simulatable adaptive oblivious transfer. In Advances in Cryptology–EUROCRYPT. Springer Berlin Heidelberg, 2007.
[20] Jan Camenisch and Markus Stadler. Proof systems for general statements about discrete logarithms. Technical Report No. 260, Dept. of Computer Science, ETH Zurich, 1997.
[21] David Chaum. Blind signatures for untraceable payments. In Advances in Cryptology. Springer, 1983.
[22] David Chaum and Torben Pryds Pedersen. Wallet databases with observers. In Advances in Cryptology–CRYPTO. Springer, 1993.
[23] Mario Di Raimondo and Rosario Gennaro. Provably secure threshold password-authenticated key exchange. In Advances in Cryptology–EUROCRYPT. Springer, 2003.
[24] Yevgeniy Dodis. Efficient construction of (distributed) verifiable random functions. In Public Key Cryptography. Springer, 2002.
[25] Yevgeniy Dodis and Aleksandr Yampolskiy. A verifiable random function with short proofs and keys. In Public Key Cryptography. Springer, 2005.
[26] Paul Ducklin. Anatomy of a password disaster – Adobe's giant-sized cryptographic blunder, 2013. https://nakedsecurity.sophos.com/2013/11/04/anatomy-of-a-password-disaster-adobes-giant-sized-cryptographic-blunder/.
[27] Taher ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Advances in Cryptology–CRYPTO. Springer, 1985.
[28] Warwick Ford and Burton S. Kaliski, Jr. Server-assisted generation of a strong secret from a password. In International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises. IEEE, 2000.
[29] Michael J Freedman, Yuval Ishai, Benny Pinkas, and Omer Reingold. Keyword search and oblivious pseudorandom functions. In Theory of Cryptography. Springer, 2005.
[30] Oded Goldreich, Shafi Goldwasser, and Silvio Micali. How to construct random functions. Journal of the ACM, 33(4), 1986.
[31] Stanislaw Jarecki, Aggelos Kiayias, and Hugo Krawczyk. Round-optimal password-protected secret sharing and t-PAKE in the password-only model. In Advances in Cryptology–ASIACRYPT. Springer, 2014.
[32] Max Krohn and Chris Coyne. WarpWallet. https://keybase.io/warp.
[33] Moses Liskov, Ronald L Rivest, and David Wagner. Tweakable block ciphers. In Advances in Cryptology–CRYPTO. Springer, 2002.
[34] Anna Lysyanskaya. Unique signatures and verifiable random functions from the DH-DDH separation. In Advances in Cryptology–CRYPTO. Springer, 2002.
[35] Philip MacKenzie and Michael K Reiter. Delegation of cryptographic servers for capture-resilient devices. Distributed Computing, 16(4), 2003.

[36] Philip MacKenzie and Michael K Reiter. Networked cryptographic devices resilient to capture. International Journal of Information Security, 2(1), 2003.

[37] Philip MacKenzie, Thomas Shrimpton, and Markus Jakobsson. Threshold password-authenticated key exchange. In Advances in Cryptology–CRYPTO. Springer, 2002.

[38] R. McMillan. The inside story of Mt. Gox, bitcoin’s $460 million disaster. Wired, 2014.

[39] Silvio Micali, Michael Rabin, and Salil Vadhan. Verifiable random functions. In Foundations of Computer Science. IEEE, 1999.

[40] Silvio Micali and Ray Sidney. A simple method for generating and sharing pseudo-random functions, with applications to clipper-like key escrow systems. In Advances in Cryptology–CRYPTO. Springer, 1995.

[41] Alec Muffet. Facebook: Password hashing & authentication. Presentation at Real World Crypto, 2015.

[42] Alec Muffet. Facebook: Password hashing and authentication. https://video.adm.ntnu.no/pres/54b660049af94.

[43] Moni Naor, Benny Pinkas, and Omer Reingold. Distributed pseudo-random functions and KDCs. In Advances in Cryptology–EUROCRYPT. Springer, 1999.

[44] Kenneth G Paterson and Sriramkrishnan Srinivasan. On the relations between non-interactive key distribution, identity-based encryption and trapdoor discrete log groups. Designs, Codes and Cryptography, 52(2), 2009.

[45] David Pointcheval and Jacques Stern. Provably secure blind signature schemes. In Advances in Cryptology–ASIACRYPT. Springer, 1996.

[46] Jim Roskind. QUIC: Multiplexed stream transport over UDP. Google working design document, 2013.

[47] R. Sakai, K. Ohgishi, and M. Kasahara. Cryptosystems based on pairing. In Cryptography and Information Security, 2000.

[48] Adi Shamir. How to share a secret. Communications of the ACM, 22(11):612–613, 1979.

Selector option    Description
Email              Contact email for selector
Resettable         Whether client-requested rotations allowed
Limit              Establish rate-limit per $t$
Time-out           Date/time to delete $k_w$
Public-key         Key under which to encrypt and store update and authentication tokens
Alerts             Whether to email contact upon rate limit violation

Figure 11: Optional settings for establishing key selectors in PYTHIA.

A Additional PYTHIA API details

Many PYTHIA-dependent services can benefit from additional API features and calls beyond the primary ones discussed in the body of the paper. (For example, the PYTHIA password onion system in Section 5 uses the Transfer API call.) We detail these other API features in this appendix.

Key-management options. The client can specify a number of options in the call Init regarding management of the ensemble key $k_w$. The client can provide a contact email address to which alerts and authentication tokens may be sent. (If no e-mail is given, no API calls requiring authentication are permitted at present and no alerts are provided. Later versions of PYTHIA will support other authentication and alerting methods.)

The client can specify whether $k_w$ should be resettable (default is “yes”). The client can specify a limit on the total number of $F_{k_w}$ queries that should be allowed before resetting $K[w]$ (default is unlimited) and/or an absolute expiration date and time in UTC at which point $K[w]$ is deleted (default is no time-out). Either of these options overrides the resettable flag. The client can specify a public key $pk_{cl}$ for a public-key encryption scheme under which to encrypt authentication tokens and update tokens (for Reset and Transfer, as described below, and for master secret key rotations). Finally, the client can request that alerts be sent to the contact email address in the case of rate limit violations. This option is ignored if no contact email is provided. The options are summarized in Figure 11.

PYTHIA also offers some additional API calls, given in Figure 12, which we now describe.

Ensemble transfer. A client can create a new ensemble $w'$ (with the same options as in Init) while receiving an update token that allows PRF outputs under ensemble $w$ to be rolled forward to $w'$. This is useful for importing a password database to a new server. The PYTHIA service returns an update token $\Delta_{w \to w'}$ for this purpose and stores it encrypted under $pk_{cl}$. For the case $w' = w$, this call also allows option updates on an existing ensemble

$w$.

Command                        Description
Transfer(w, w'[, options])     Creates new ensemble $w'$; outputs update token $\Delta_{w \to w'}$; resets $k_w$
SendTokens(w, authtoken)       Sends stored update tokens to client
PurgeTokens(w, authtoken)      Purges all stored update tokens for ensemble $w$

Figure 12: The PYTHIA API. The individual calls are explained in detail in the text.

Update-token handling. The PYTHIA service stores update tokens encrypted under $pk_{cl}$, with accompanying timestamps for versioning. The API call SendTokens causes these to be e-mailed to the client, while PurgeTokens causes update-token ciphertexts to be deleted from PYTHIA.

Note that once an update token is deleted, old PRF values to which the token was not applied become cryptographically erased: they become random values unrelated to any messages. A client can therefore delete the key associated with an ensemble by calling Reset and PurgeTokens.

Master secret rotations. PYTHIA can also rotate its master secret key $msk$ to a new key $msk'$. Recall that ensemble keys are computed as $k_w = \mathrm{HMAC}(msk, K[w])$, so rotation of $msk$ results in rotation of all ensemble keys. To rotate to a new $msk'$, the server computes $k_w$ for all ensembles $w$ with entries in $K$, and stores $\delta_w$ encrypted under $pk_{cl}$. If no encryption key is set, then the token is stored in the clear. This is a forward-security issue while it remains, but only for that particular key ensemble. At this point $msk$ is safe to delete. Clients can be informed of the key rotation via e-mail.

Subsequent SendTokens requests will return the resulting update token, along with any other stored update tokens for the ensemble. If multiple rotations occur between client requests, then these can be aggregated in the stored update token for each ensemble. This is trivial if they are stored in the clear (just multiply the new token against the old) and also works if they are encrypted with an appropriately homomorphic encryption scheme such as ElGamal [27].

B Formal Security Analyses

We provide formal security notions for partially oblivious PRFs, and proofs of security relative to them for our scheme from Section 3.

Partially-oblivious PRFs. A partially oblivious PRF protocol $\Pi = (\mathcal{K}, \text{PRF-Cl}, \text{PRF-Srv}, F)$ consists of the following. The key generation algorithm $\mathcal{K}$ outputs a public key and private key pair $(pk, sk)$. We assume that from $sk$ one can compute $pk$ easily. The PRF-Srv algorithm takes as input the secret key $sk$ and a client request message (a bit string) and returns a server response message (another bit string). The client algorithm PRF-Cl takes as inputs a tweak $t$ and message $m$, can make a single call to PRF-Srv, and outputs a value. Finally we associate to the protocol a keyed function $F_{sk} : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$. A scheme is correct if executing $\text{PRF-Cl}^{\text{PRF-Srv}_{sk}(\cdot)}(t, m)$ with fresh coins matches $F_{sk}(t, m)$ with probability one. In words, the protocol computes the appropriate function of $t, m$.

Bilinear pairing setups. Let $\mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T$ be groups all of order $p$ that have associated to them an admissible bilinear pairing $e : \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T$. Recall that for generators $g_1 \in \mathbb{G}_1$, $g_2 \in \mathbb{G}_2$, there exists a generator $g_T \in \mathbb{G}_T$ such that $e(g_1^{\alpha}, g_2^{\beta}) = g_T^{\alpha\beta}$ for all $\alpha, \beta \in \mathbb{Z}_p$. As shorthand below we refer to a pairing setup $G = (g_1, g_2, g_T, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, e)$ and assume some compact description of $G$ as a bit-string where appropriate.

The scheme. The partially-oblivious PRF at the core^8 of our bilinear pairing scheme from Section 3 is as follows for some fixed pairing setup $G$. Let $H_1 : \{0,1\}^* \to \mathbb{G}_1$ and $H_2 : \{0,1\}^* \to \mathbb{G}_2$ be hash functions that we will later model as random oracles.

Key generation $\mathcal{K}$ picks a random exponent $sk$ and computes a public key $pk = g_1^{sk}$. The PRF-Cl$(t, m)$ algorithm computes a mask $r \xleftarrow{\$} \mathbb{Z}_p$ and sends $t$ and $x = H_2(m)^r$ to the server. The PRF-Srv$(sk, t, x)$ algorithm computes $y = e(H_1(t), x)^{sk}$ and a ZKP $\pi$ that $DL_{g_1}(pk) = DL_{\tilde{x}}(y)$ where $\tilde{x} = e(H_1(t), x)$. It sends $pk, y, \pi$ to the client, who verifies the ZKP, deletes it, and then outputs $y^{1/r}$. The correctness of the scheme follows from the correctness of the ZKP and the properties of the pairing.

The ZKP is used to ensure that a malicious server responds as per the protocol. In the following security analyses we focus primarily on malicious clients, and for simplicity analyze a simpler version of the protocol that omits the ZKP. The proofs found below can be extended to the full protocol by applying the zero-knowledge security of the proof systems that we use (i.e., use the zero-knowledge simulator to produce fake proofs that look realistic to the client).

8 For brevity we omit key selectors here, and instead focus on analyzing security for a single key instance. Assuming properly generated keys for each selector, one can show that security for a single key instance implies security for many.
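The blinded evaluation described under "The scheme" above, together with the clear-text token aggregation mentioned for master secret rotations, can be sketched in a few lines. This is only a toy model under stated assumptions, not the PYTHIA implementation: elements of the two source groups are represented by their discrete logarithms, which makes the "pairing" trivially bilinear and completely insecure, and the parameters `P`, `Q`, all function names, and the update-token form (new key divided by old key, in the spirit of the key rotation from Section 3) are illustrative choices. A real deployment would use a pairing library and include the ZKP.

```python
import hashlib
import secrets

# Toy parameters (insecure, illustration only): P = 2Q + 1 with P and Q prime.
P, Q = 2039, 1019
G_T = 4  # generates the order-Q subgroup of Z_P^*, standing in for g_T


def hash_to_exponent(label: bytes, data: bytes) -> int:
    """Stand-in for H1/H2: hash to a nonzero exponent in Z_Q."""
    digest = hashlib.sha256(label + b"|" + data).digest()
    return int.from_bytes(digest, "big") % (Q - 1) + 1


def pairing(a: int, b: int) -> int:
    """Toy pairing: inputs are the discrete logs a, b of g1^a and g2^b,
    and the result is g_T^(a*b)."""
    return pow(G_T, (a * b) % Q, P)


def client_blind(m: bytes):
    """PRF-Cl, first half: pick a mask r and compute x = H2(m)^r."""
    r = secrets.randbelow(Q - 1) + 1            # r in [1, Q-1], so invertible mod Q
    x = (hash_to_exponent(b"H2", m) * r) % Q    # exponent form of H2(m)^r
    return r, x


def server_eval(sk: int, t: bytes, x: int) -> int:
    """PRF-Srv without the ZKP: y = e(H1(t), x)^sk."""
    return pow(pairing(hash_to_exponent(b"H1", t), x), sk, P)


def client_unblind(y: int, r: int) -> int:
    """PRF-Cl, second half: output y^(1/r)."""
    return pow(y, pow(r, -1, Q), P)


def prf(sk: int, t: bytes, m: bytes) -> int:
    """Direct evaluation F_sk(t, m) = e(H1(t), H2(m))^sk."""
    return pow(pairing(hash_to_exponent(b"H1", t), hash_to_exponent(b"H2", m)), sk, P)


def update_token(k_old: int, k_new: int) -> int:
    """Assumed token form: delta = k_new / k_old in Z_Q."""
    return (k_new * pow(k_old, -1, Q)) % Q


def aggregate_tokens(older: int, newer: int) -> int:
    """Clear-text aggregation: multiply the new token against the old."""
    return (older * newer) % Q


# Usage: the unblinded result matches the direct evaluation.
sk = secrets.randbelow(Q - 1) + 1
r, x = client_blind(b"password")
assert client_unblind(server_eval(sk, b"alice", x), r) == prf(sk, b"alice", b"password")
```

Correctness follows from bilinearity: $e(H_1(t), H_2(m)^r)^{sk \cdot (1/r)} = e(H_1(t), H_2(m))^{sk}$. In the same toy model, raising a stored output to an update token moves it to the new key, which is why two successive tokens can be combined by a single multiplication.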

Game $\text{om-UNP}^{A}_{\Pi}$
  $(pk, sk) \xleftarrow{\$} \mathcal{K}$
  $c \leftarrow 0$
  $(t_1, m_1, \sigma_1), \ldots, (t_\ell, m_\ell, \sigma_\ell) \xleftarrow{\$} A^{\text{PRF-Srv}, H_1, H_2}$
  If $\exists\, i \neq j \,.\, (t_i, m_i) = (t_j, m_j)$ then Ret false
  Ret $\big(\bigwedge_i (\sigma_i = F^{H_1,H_2}_{sk}(t_i, m_i)) \wedge c < \ell\big)$

PRF-Srv$(t, Y)$
  $c \leftarrow c + 1$
  Ret $\text{PRF-Srv}^{H_1,H_2}_{sk}(t, Y)$

Figure 13: Security game for one-more unpredictability.

Game $\text{om-BCDH}^{B}_{G}$
  $sk \xleftarrow{\$} \mathbb{Z}_p$
  $q_h, q_{1,t}, q_{2,t} \leftarrow 0$
  $(i_1, j_1, \sigma_1), \ldots, (i_\ell, j_\ell, \sigma_\ell) \xleftarrow{\$} B^{\text{Targ}_1, \text{Targ}_2, \text{Help}}(G, g_1^{sk})$
  If $q_h \geq \ell$ then Ret false
  If $\exists\, \alpha \,.\, (i_\alpha > q_{1,t}) \vee (j_\alpha > q_{2,t})$ then Ret false
  If $\exists\, \alpha \neq \beta \,.\, (i_\alpha, j_\alpha) = (i_\beta, j_\beta)$ then Ret false
  Ret $\forall \alpha \,.\, e(X_{i_\alpha}, Y_{j_\alpha})^{sk} = \sigma_\alpha$

Targ$_1$
  $q_{1,t} \leftarrow q_{1,t} + 1$ ; $X_{q_{1,t}} \xleftarrow{\$} \mathbb{G}_1$ ; Ret $X_{q_{1,t}}$

Targ$_2$
  $q_{2,t} \leftarrow q_{2,t} + 1$ ; $Y_{q_{2,t}} \xleftarrow{\$} \mathbb{G}_2$ ; Ret $Y_{q_{2,t}}$

Help$(Z)$
  $q_h \leftarrow q_h + 1$ ; Ret $Z^{sk}$

Figure 14: Security game for a one-more BCDH assumption for bilinear pairing setting $G = (g_1, g_2, g_T, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, e)$.

B.1 Unpredictability Security

We define a one-more unpredictability security notion. It modifies one-more unforgeability [45] to be suitable for the setting of unpredictable functions (as opposed to publicly verifiable signatures). The game is shown in Figure 13. We associate to any protocol $\Pi$, adversary $A$, and query number $q$ the one-more-unpredictability advantage defined as

\[ \mathsf{Adv}^{\text{om-unp}}_{\Pi,q}(A) = \Pr\big[\, \text{om-UNP}^{A}_{\Pi,q} \Rightarrow \text{true} \,\big] . \]

The probability here (and for games defined later below) is over all random coins used by the procedures and the adversary. The event is that the value returned by the main procedure is true. In words, the definition requires that an adversary cannot produce $\ell$ outputs of the PRF using fewer than $\ell$ queries on partially-blinded inputs to the server. One can easily extend this notion to deal with fully blinded inputs as well, but we will not need this.

This notion of security is sufficient for PYTHIA in applications where the output of the protocol is not stored, but rather used as an unforgeable credential such as with our hardened Brainwallet application (Section 6).

The security of our scheme is based on the following one-more bilinear computational Diffie-Hellman (BCDH) problem, an extension of the one-more CDH assumption given by Boldyreva [11]. To the best of our knowledge this assumption is new, but it is a straightforward adaptation of previous one-more assumptions [6, 11] to our setting. For a pairing setup $G$, game $\text{om-BCDH}_G$ is defined in Figure 14. In words, the adversary gets a group element $g_1^{sk} \in \mathbb{G}_1$ as well as target oracles Targ$_1$, Targ$_2$ that return random group elements in $\mathbb{G}_1$, $\mathbb{G}_2$ respectively. Finally the adversary can query a helper oracle Help that raises $\mathbb{G}_T$ elements to the power $sk$. To win, it must compute $\ell$ values $e(X_i, Y_j)^{sk}$ for $\ell$ larger than the number of helper queries and each $X_i, Y_j$ a unique pair of (distinct) values returned by the target oracles. Let

\[ \mathsf{Adv}^{\text{om-cdh}}_{G}(B) = \Pr\big[\, \text{om-BCDH}^{B}_{G} \Rightarrow \text{true} \,\big] . \]

We have the following theorem establishing the one-more unpredictability of our scheme. The proof is essentially identical to the proof of Boldyreva’s blind signatures [11].

Theorem 1 Let $\Pi$ be the simplified partially oblivious PRF protocol for a pairing setup $G$ and $H_1, H_2$ modeled as random oracles. Then for any one-more unpredictability adversary $A$ making at most $q$ PRF-Srv queries, we give in the proof below a one-more BCDH adversary $B$ such that

\[ \mathsf{Adv}^{\text{om-unp}}_{\Pi,q}(A) \leq \mathsf{Adv}^{\text{om-cdh}}_{G}(B) \]

where $B$ runs in time that of $A$ plus $O(q)$ group operations.

Proof: We assume without loss of generality that $A$ never repeats a query to either random oracle and makes random oracle queries $H_1(t_i)$ and $H_2(m_i)$ for each $(t_i, m_i, \sigma_i)$ triple it outputs. The adversary $B$ will work as follows when given inputs $G, X$ and access to oracles Targ$_1$, Targ$_2$, Help. First, it runs $A$. Whenever $A$ makes an $H_1(t)$ query, $B$ queries Targ$_1$ to obtain a $\mathbb{G}_1$-element that we will denote $X[t]$, sets $c_t$ to be the number of $H_1$ queries so far (including the current), and returns $X[t]$ to $A$. Whenever $A$ makes an $H_2(m)$ query, $B$ queries Targ$_2$, obtains a $\mathbb{G}_2$-element that we will denote $Y[m]$, sets $d_m$ to be the number of $H_2$ queries so far (including the current), and returns $Y[m]$ to $A$. Whenever $A$ makes a PRF-Srv$(t, Y)$ query, the adversary $B$ computes $Z \leftarrow e(H_1(t), Y)$, and then queries $Z$ to its helper oracle Help to obtain a value $\sigma \in \mathbb{G}_T$. It returns $\sigma$ to $A$.

Eventually $A$ outputs a series of triples $(t_1, m_1, \sigma_1), \ldots, (t_q, m_q, \sigma_q)$. At this point

adversary $B$ outputs the sequence of triples

\[ (c_{t_1}, d_{m_1}, \sigma_1), \ldots, (c_{t_q}, d_{m_q}, \sigma_q) . \]

Suppose $A$ wins its game. Then it made at most $q - 1$ queries to PRF-Srv and so $B$ makes at most $q - 1$ queries to Help. It is also the case that all predictions by $A$ are for unique tag, message pairs, meaning that $B$’s output will also be for unique pairs of targets. Finally, it is clear that correct predictions $\sigma_i$ are also BCDH solutions.

B.2 Pseudorandomness Security

Unpredictability security, like unforgeability, is not sufficient in all applications. In particular, it could be that a protocol produces unforgeable outputs but each output leaks everything about the message. So in our use of PYTHIA for storage of hardened password hashes we need something more. Ideally we would prove a version of oblivious PRF security suitably adapted for the partially oblivious setting. However, we do not believe our schemes can be proven secure relative to such notions since they require “programming” the PRF outputs. One might adapt our schemes to meet them, but the most efficient adaptation, which instead considers $\tilde{F}_k(t, m) = H(t \,\|\, m \,\|\, e(H_1(t), H_2(m))^k)$ as in [31], prevents one from performing key updates. At the same time, we could find no attacks that exploit the algebraic structure revealed by storing the unhashed output. This leaves the question of what level of security can be proven about our scheme.

We introduce a new notion called one-more PRF security. Intuitively, for a scheme that meets it, an attacker that interacts $q - 1$ times with PRF-Srv still cannot distinguish one more evaluation of $F_k$ from a random point. This appears to capture the security properties we require in the web server compromise case: even if the adversary breaks in, it can only distinguish from random points as many of the stored hardened hashes as it makes queries to PRF-Srv.

To build up some intuition towards a formal notion, consider giving an adversary two oracles. The first is a real-or-random function oracle RoR and the second is an oracle for the server’s implementation of the protocol PRF-Srv. In a normal PRF game one would simply have RoR reply either always with $F_k(t, m)$ upon query $t, m$ or always with a fresh random point. (Assume the adversary never repeats a query to RoR.) But this game is trivial to win because the adversary can simply use the PRF-Srv oracle to compute $F_k(t, m)$ and check if it matches the value returned by RoR$(t, m)$. We might want to somehow restrict queries to PRF-Srv but there seems no way to do this given the ability of clients to blind their messages.

Instead we go a different route, adapting the concept of one-more unpredictability to a pseudorandomness setting. Let $\Pi$ be a partially-oblivious PRF protocol. The game $\text{om-PRF}_\Pi$ is defined in Figure 15. It gives the adversary a challenge oracle RoR to which it can query $(t, m)$ pairs. The oracle flips a fresh challenge bit and responds accordingly. We restrict attention to adversaries that never repeat a query to RoR. Finally the adversary gets access to a PRF-Srv oracle. The adversary’s task is to determine all of the challenge bits, and we measure this by asking that it guess the XOR of them. This XOR measure is borrowed from the multi-instance security notions^9 of Bellare, Ristenpart, and Tessaro [7]. We associate to an adversary $A$ and $\Pi$ the advantage measure defined by

\[ \mathsf{Adv}^{\text{om-prf}}_{\Pi}(A) = 2 \cdot \Pr\big[\, \text{om-PRF}^{A}_{\Pi} \Rightarrow \text{true} \,\big] - 1 . \]

Game $\text{om-PRF}^{A}_{\Pi,\nu}$
  $(pk, k) \xleftarrow{\$} \mathcal{K}$ ; $q, c \leftarrow 0$
  $(i_1, \ldots, i_\ell, b') \xleftarrow{\$} A^{\text{RoR}, \text{PRF-Srv}, H_1, H_2}$
  If $\ell > q$ or $c \geq \ell$ then Ret false
  If $\exists\, \alpha \neq \beta \,.\, i_\alpha = i_\beta$ then Ret false
  Ret $b' = \bigoplus_{\alpha=1}^{\ell} \vec{b}[i_\alpha]$

RoR$(t, m)$
  $q \leftarrow q + 1$ ; $\vec{b}[q] \xleftarrow{\$} \{0, 1\}$
  $Z_1 \leftarrow F^{H_1,H_2}_{k}(t, m)$
  $Z_0 \xleftarrow{\$} \text{Rng}$
  Ret $Z_{\vec{b}[q]}$

PRF-Srv$(t, Y)$
  $c \leftarrow c + 1$
  Ret $\text{PRF-Srv}^{H_1,H_2}_{k}(t, Y)$

Figure 15: Security game for one-more pseudorandomness.

Note that for any correct protocol $\Pi$ there exists an efficient adversary that can win the game with probability $1/2$ by querying RoR once on $(1, t, m)$ for arbitrary $t, m$ and outputting $(1, b')$ for randomly chosen $b'$. Hence the advantage measure is scaled as shown. The use of XOR ensures that even if one can solve $q - 1$ challenges (e.g., using the PRF-Srv oracle), determining $b'$ for $q$ challenges requires gaining some advantage over the remaining challenge bit.

9 Our one-more notions are not measuring multi-instance security in the sense of [7] as the same key underlies all challenges.

Relationship with PRF security. This one-more PRF notion is a strict strengthening of conventional PRF security. Consider the following formulation of PRF security due originally to Goldreich, Goldwasser and Micali [30]. An adversary $A$ is given access to two oracles, one that returns $F_k(t, m)$ on query $t, m$ of their choosing, and one that upon query $t^*, m^*$ of the adversary’s

choosing flips a bit $b$ and returns either $F_k(t^*, m^*)$ or a random point according to the bit. We allow $A$ to make multiple queries to the first oracle but only a single query to the second.

We now sketch a proof showing that PRF security as defined above is implied by one-more PRF security. Consider any such PRF adversary $A$ against $F_k(t, m)$. We can build a one-more PRF adversary $B$ whose advantage upper bounds $A$’s PRF advantage, as follows. To any $F_k(t, m)$ query by $A$, the adversary $B$ first queries $t, m$ to its own RoR oracle and also runs PRF-Cl$(t, m)$ using its PRF-Srv oracle in order to compute $F_k(t, m)$. It determines the challenge bit for this RoR query and returns $F_k(t, m)$. When $A$ makes a query to its second oracle, the challenge oracle, adversary $B$ queries its RoR oracle and returns the result. When $A$ outputs a bit $b'$, the adversary $B$ outputs $b'$ XOR’d with each of the already-solved challenge bits. It is easy to analyze this formally and show that $B$ wins whenever $A$ would in the PRF game.

Relationship with unpredictability. Showing that one-more PRF security is strictly stronger than one-more unpredictability is straightforward. Consider an adversary $A$ who computes $\ell$ tuples $(t_i, m_i, \sigma_i)$, where $\sigma_i = F(t_i, m_i)$, using $c < \ell$ queries to PRF-Srv. An adversary $B$ can use $A$ to win the om-PRF game with the same advantage by forwarding queries to the corresponding oracle, and wins the game by setting $b_i$ based on whether RoR$(t_i, m_i) = \sigma_i$.

However, given a one-more unpredictable function $F$, the function $F'(t, m) := F(t, m) \,\|\, m$ is still unpredictable, but can be trivially distinguished.

Analyzing our scheme. We now analyze the security of our partially-oblivious PRF scheme from Section 3. To do so we introduce a new hardness assumption that is a generalization of the bilinear Decisional Diffie-Hellman (BDDH) assumption underlying the conventional PRF security of $F_k(t, m) = e(H_1(t), H_2(m))^k$. (The latter is a corollary of a result due to Boneh and Waters [14].) The new assumption is analogous to the one-more BCDH assumption given above.

Fix some pairing setting $G = (g_1, g_2, g_T, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, e)$ and refer to the game shown in Figure 16. The game tasks the adversary $B$ with distinguishing one more BDDH instance than the number of helper queries it makes. That is, the adversary is given a value $g_1^k$ and target oracles Targ$_1$, Targ$_2$ which give back random points in $\mathbb{G}_1$ and $\mathbb{G}_2$. The adversary can query a challenge oracle Chal$(i, j)$, which gives back either $e(X, Y)^k$ for $X, Y$ each being previously returned by the respective target oracles (if the challenge bit is one) or gives back a random point $Z \in \mathbb{G}_T$ (if the challenge bit is zero). Later we will often allow an adversary to query Chal$(X, Y)$, with the implied meaning hopefully obvious. The helper oracle returns $Z^k$ for any queried $Z$, effectively allowing the attacker to trivially solve a single BDDH instance. We define the advantage of an adversary $B$ against a setup $G$ by

\[ \mathsf{Adv}^{\text{om-bddh}}_{G}(B) = 2 \cdot \Pr\big[\, \text{om-BDDH}^{B}_{G} \Rightarrow \text{true} \,\big] - 1 . \]

Game $\text{om-BDDH}^{B}_{G}$
  $k \xleftarrow{\$} \mathbb{Z}_p$ ; $q_{ch}, q_{1,t}, q_{2,t}, c \leftarrow 0$
  $(i_1, \ldots, i_\ell, b') \xleftarrow{\$} B^{\text{Chal}, \text{Targ}_1, \text{Targ}_2, \text{Help}}(G, g_1^{k})$
  If $\ell > q_{ch}$ or $c \geq \ell$ then Ret false
  If $\exists\, \alpha \neq \beta \,.\, i_\alpha = i_\beta$ then Ret false
  Ret $b' = \bigoplus_{\alpha=1}^{\ell} \vec{b}[i_\alpha]$

Chal$(i, j)$
  If $q_{1,t} < i$ or $q_{2,t} < j$ then Ret $\bot$
  $q_{ch} \leftarrow q_{ch} + 1$ ; $\vec{b}[q_{ch}] \xleftarrow{\$} \{0, 1\}$
  $Z_1 \leftarrow e(g_1, g_2)^{\gamma_1[i] \cdot \gamma_2[j] \cdot k}$ ; $Z_0 \xleftarrow{\$} \mathbb{G}_T$
  Ret $Z_{\vec{b}[q_{ch}]}$

Targ$_1$
  $q_{1,t} \leftarrow q_{1,t} + 1$ ; $\gamma_1[q_{1,t}] \xleftarrow{\$} \mathbb{Z}_p$ ; Ret $g_1^{\gamma_1[q_{1,t}]}$

Targ$_2$
  $q_{2,t} \leftarrow q_{2,t} + 1$ ; $\gamma_2[q_{2,t}] \xleftarrow{\$} \mathbb{Z}_p$ ; Ret $g_2^{\gamma_2[q_{2,t}]}$

Help$(Z)$
  $c \leftarrow c + 1$ ; Ret $Z^{k}$

Figure 16: Game for the one-more BDDH assumption for bilinear pairing setting $G = (g_1, g_2, g_T, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T, e)$.

Note that if $B$ makes two Targ queries, makes a single Chal$(1, 1)$ query, and never uses Help, then this is exactly the classic BDDH assumption. Thus this assumption is stronger than BDDH. Results from [16] and [17] suggest that one-more problems of this type are possibly easier, but the verdict is still out.

We have the following theorem.

Theorem 2 Let $\Pi$ be the partially-oblivious PRF scheme for pairing setup $G$ and with $H_1, H_2$ modeled as random oracles. Let adversary $A$ be a one-more PRF adversary making $q$ queries to RoR$(\cdot, \cdot)$, $h_1$ queries to $H_1$, $h_2$ queries to $H_2$, and $c < q$ queries to PRF-Srv. Then we give in the proof below a one-more BDDH adversary $B$ such that

\[ \mathsf{Adv}^{\text{om-prf}}_{\Pi}(A) \leq \mathsf{Adv}^{\text{om-bddh}}_{G}(B) . \]

Adversary $B$ runs in time that of $A$ plus $O(h_1 + h_2 + c)$ group or pairing operations, makes at most $q$ challenge queries, $c$ helper queries, $h_1$ queries to Targ$_1$ and $h_2$ queries to Targ$_2$.

Proof: This proof follows much the same steps as the

one-more unpredictability proof given previously. As before, we assume without loss of generality that $A$ makes all queries to $H_1$ and $H_2$ at the start, and never makes redundant queries. We construct an adversary $B$ for the om-BDDH game as shown in Figure 17.

adversary $B^{\text{Chal}, \text{Targ}, \text{Help}}(G, K)$
  $pk \leftarrow e(g_1, K)$
  $(i_1, \ldots, i_\ell, b') \xleftarrow{\$} A^{\text{RoRSim}, \text{SrvSim}, \text{HSim}_1, \text{HSim}_2}(pk)$
  Ret $(i_1, \ldots, i_\ell, b')$

RoRSim$(t, m)$
  Ret Chal$(X[t], Y[m])$

HSim$_1(t)$
  $X[t] \leftarrow \text{Targ}_1()$ ; Ret $X[t]$

HSim$_2(m)$
  $Y[m] \leftarrow \text{Targ}_2()$ ; Ret $Y[m]$

SrvSim$(t, Y)$
  $c \leftarrow c + 1$
  Ret Help$(e(X[t], Y))$

Figure 17: Adversary $B$ used in proof of Theorem 2.

$B$ responds to any query to HSim$_1(t)$ with a point $X[t] \in \mathbb{G}_1$, where $X[t]$ is sampled from Targ$_1$. Similarly, for queries to HSim$_2(m)$, $B$ returns a point $Y[m]$ sampled from Targ$_2$.

Whenever $A$ queries RoRSim$(t, m)$, $B$ calls the challenge oracle Chal$(X[t], Y[m])$, and returns the response to $A$. Notice that this challenge precisely matches the expected real or random challenge.

Furthermore, on queries $(t, Y)$ to SrvSim, $B$ computes $e(H_1(t), Y)$, and submits this value to the Help oracle. In this way, $B$ accurately simulates all queries in the om-PRF game.

Finally, suppose $A$ outputs a tuple $(i_1, i_2, \ldots, i_\ell)$ and a bit $b'$. Then $B$ outputs the same tuple and bit. If $A$ wins its game, then it has distinguished $\ell$ real or random instances of the PRF, with $c < \ell$ queries to the PRF-Srv oracle. It is easy to see that $B$ has also distinguished $\ell$ challenges with $c < \ell$ queries to the Help oracle.

B.3 Relationship with Fully Oblivious PRFs

Jarecki et al. [31] propose a universally composable model of verifiable, oblivious pseudorandom functions. This model is close to the functionality required for our PRF service. However, these constructions do not support key updates because hash functions are applied to several values in the course of the protocol. Hashing destroys any algebraic structure that could be exploited to enable key rotation.

We informally summarize the ideal functionality $F_{\text{v-OPRF}}$ of the Jarecki et al. model as follows:

(1) First, $F_{\text{v-OPRF}}$ performs the key-generation and parameter registering functions (for example, distributing public keys).
(2) A user $U$ submits a message $x$ and a server $S$ to evaluate the message. Communication is initiated with the server $S$.
(3) The oblivious PRF protocol takes place between $U$ and $S$. This is equivalent to the PRF-Srv protocol.
(4) When both $U$ and $S$ have finished communicating, a flag is generated which denotes whether the verifiable property is satisfied.
(5) $U$ gets the value of the PRF on $x$.

Moreover, the following must hold:

• The output on $x$ should be indistinguishable from a randomly chosen value.
• The server $S$ evaluates the function on at most as many points as are received by $U$.

This is a simplified version of the functionality from [31], where it is described in the universal composability framework. Under the UC framework, adversarial capabilities include statically corrupting both users and servers, and complete control over the communication channel.

The security is measured by the probability that an environment can distinguish whether it is interacting with the real world (as defined above), or a simulated version of the ideal world.

Using this simplified version, we prove the following.

Proposition 1 Let $F_{\text{v-OPRF}}$ be a verifiable, oblivious PRF as described above, and let $\Pi$ be the partially oblivious PRF derived from $F_{\text{v-OPRF}}$. Then for any adversary $A$ against the one-more pseudorandomness of $\Pi$ we construct an adversary $B$ such that

\[ \mathsf{Adv}^{\text{om-prf}}_{\Pi,q}(A) - \frac{1}{2} \leq \mathsf{Adv}^{\text{voprf}}_{F_{\text{v-OPRF}}}(B) . \]

Proof: (Sketch) For every query to the PRF-Srv oracle from $A$, $B$ simulates the same functionality in the v-OPRF game. For the RoR queries, the v-OPRF adversary chooses to return either the actual evaluation of the query, or a random value, each with probability $1/2$.

If $B$ is interacting with the ideal world, then all RoR queries will be random, even though half of the queries are ‘real’. In this case, $A$ cannot win with probability better than $1/2$.

On the other hand, if $B$ is interacting with the simulation, then half of the RoR queries are true evaluations of the function $F_{\text{v-OPRF}}$. Therefore, this is an accurate

simulation of the om-PRF game, and $A$ wins with probability $\mathsf{Adv}^{\text{om-prf}}_{\Pi,q}(A)$. Therefore, if $A$ wins, then $B$ assumes it must be the real world, and distinguishes the two with the desired probability.

This result is unsurprising and not coincidental. Both definitions target the same security notion. However, when attempting to extend our constructions into the UC model described by Jarecki et al., we encounter an issue: without the final random oracle, we are unable to force the PRF-Srv queries to match up with the correct outputs. On the other hand, we claim that if the PRF-Srv response is wrapped in a hash function, as in the Jarecki et al. constructions, then the following is true:

Claim 1 Let $\Pi$ be a partially oblivious PRF, $H$ be a hash function modeled as a random oracle, and $\Pi'$ be the functionality derived by matching the functionality of $\Pi$ to a vOPRF. The final output is $H(t, m, F_k(t, m))$. Then for all adversaries $A$, we can construct an adversary $B$ such that

\[ \mathsf{Adv}^{\text{voprf}}_{\Pi'}(A) \leq \mathsf{Adv}^{\text{om-unp}}_{\Pi}(B) - \frac{1}{2} . \]

The intuition is that due to the unpredictability of $\Pi$, the random oracle outputs of $H$ can be programmed to correspond to PRF outputs in the correct way.
