Security Analysis of Cryptographically Controlled Access to XML Documents

Martín Abadi
Computer Science Department
University of California at Santa Cruz
[email protected]

Bogdan Warinschi*
Computer Science Department
Stanford University
[email protected]

* This work was partly carried out while this author was affiliated with the University of California at Santa Cruz.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
PODS 2005 June 13-15, 2005, Baltimore, Maryland.
Copyright 2005 ACM 1-59593-062-0/05/06 ... $5.00.

ABSTRACT

Some promising recent schemes for XML access control employ encryption for implementing security policies on published data, avoiding data duplication. In this paper we study one such scheme, due to Miklau and Suciu. That scheme was introduced with some intuitive explanations and goals, but without precise definitions and guarantees for the use of cryptography (specifically, symmetric encryption and secret sharing). We bridge this gap in the present work. We analyze the scheme in the context of the rigorous models of modern cryptography. We obtain formal results in simple, symbolic terms close to the vocabulary of Miklau and Suciu. We also obtain more detailed computational results that establish security against probabilistic polynomial-time adversaries. Our approach, which relates these two layers of the analysis, continues a recent thrust in security research and may be applicable to a broad class of systems that rely on cryptographic data protection.

1. INTRODUCTION

A classic method for enforcing policies on access to data is to keep all data in trusted servers and to rely on these servers for mediating all requests by clients, authenticating the clients and performing any necessary checks. An alternative method, which is sometimes more attractive, consists in publishing the data in such a way that each client can see only the appropriate parts. In a naive scheme, many sanitized versions of the data would be produced, each corresponding to a partial view suitable for distribution to a subset of the clients. This naive scheme is impractical in general. Accordingly, there has been much interest in more elaborate and useful schemes for fine-grained control on access to published documents, particularly for XML documents [4, 5, 7, 8, 14, 19, 23]. This line of research has led to efficient and elegant publication techniques that avoid data duplication by relying on cryptography. For instance, using those techniques, medical records may be published as XML documents, with parts encrypted in such a way that only the appropriate users (physicians, nurses, researchers, administrators, and patients) can see their contents.

The work of Miklau and Suciu [19] is a crisp, compelling example of this line of research. They develop a policy query language for specifying fine-grained access policies on XML documents and a logical model based on the concept of "protection". They also show how to translate consistent policies into protections, and how to implement protections by XML encryption [10]. Roughly, a protection is an XML tree in which nodes are guarded by positive boolean formulas over a set of symbols $\{K_1, K_2, \ldots\}$ that stand for cryptographic keys. Protections have a simple and clear intended semantics: access to the information contained in a node is conditioned on possession of a combination of keys that satisfies the formula that guards the node. For example, access to a node guarded by $(K_1 \wedge K_2) \vee K_3$ requires possessing either keys $K_1$ and $K_2$ or key $K_3$. (See Gifford's work for some of the roots of this approach [11].) Formally, a protection describes a function that maps each possible set of keys to the set of nodes that can be accessed using those keys, treating the keys as symbols. On the other hand, the use of keys for deriving a partially encrypted document is not symbolic: this process includes replacing the symbols $K_1, K_2, \ldots$ with actual keys, and applying a symmetric encryption algorithm repeatedly, bottom-up, to the XML document in question.

While Miklau and Suciu provide a thorough analysis of the translation of policies into protections, they leave a large gap between the abstract semantics of protections and the use of actual keys and encryption. The existence of this gap should not surprise us: an analogous gap existed in protocol analysis for 20 years, until recent efforts to bridge it [1, 2, 13, 15, 18, 20]. Concretely, the gap means that the protection semantics leaves many problematic issues unresolved. We describe two such issues, as examples:

• Partial information: It is conceivable that even when a node should be hidden according to a protection, the partially encrypted document may in fact leak some information about the data in that node.

• Encryption cycles: From the point of view of the abstract semantics, encryption cycles (such as encrypting a key with itself) are legitimate and do not contradict security. On the other hand, there are encryption algorithms that satisfy standard cryptographic definitions of security but that leak keys when encryption cycles are created.

More generally, there are many encryption methods and many notions of security for them (e.g., [3, 9, 12]), and it is not clear which one, if any, provides adequate guarantees for this application, nor is it exactly clear what those guarantees might be.

The immediate goal of this work is to bridge this gap by reconciling the abstract semantics of protections with a more concrete, computational treatment of security, and to define and establish precise security guarantees. We do not wish to replace the abstract semantics, which certainly has its place, but rather to complement it.

From a broader perspective, our goal is to develop, apply, and promote useful concepts and tools for security analysis in the field of database theory. These concepts and tools do not pertain to statistical techniques, which have long been known in database research (e.g., [6, 22]), but rather to cryptography. While sophisticated uses of cryptology in database research may have been of modest scope, there is an obvious need for database security, and we believe that cryptology has much to offer. In research on cryptographic protocols, formal and complexity-theoretic methods have been successful in providing detailed models and in enabling security proofs (sometimes automated ones). The same methods are beneficial for a broad class of systems that require security. Each application, however, can necessitate non-trivial, specific insights and results. In the techniques that we study, partial and multiple encryptions occur in (large, XML) data instances; we therefore depart from the situations most typically considered in the cryptography literature, towards data management. It is this specificity that motivates the present paper.

Overview of results

Our analysis is directed at the core of the framework of Miklau and Suciu, which aims to ensure data protection by an interesting combination of encryption schemes and secret sharing schemes [21]. As a formal counterpart to their loose, informal concept of data secrecy, we introduce a strong, precise cryptographic definition. The definition goes roughly as follows.

Technically, we adapt and extend the approach of Abadi and Rogaway [1]. The novelties of this paper include the application to document access control, significant differences in basic definitions motivated by this application, and the treatment of secret sharing. First we provide an intermediate symbolic language for cryptographic expressions. We then define patterns of expressions; intuitively, a pattern represents the information that an expression reveals to an adversary. We show how to transform protections into cryptographic expressions, and use patterns to provide an equivalent semantics for protections. This equivalence is captured in Theorem 1. Going further, we relate expressions to concrete computations on bit-strings. The most difficult result of this paper is Theorem 2. Informally, it states that patterns faithfully represent the information that expressions reveal, even when expressions and patterns are implemented with actual encryption schemes (not symbolically). More precisely, we associate probability distributions with an expression and its pattern by mapping symbols to bit-strings and implementing encryption with a semantically secure encryption scheme [12], and prove that these distributions cannot be distinguished by any probabilistic polynomial-time algorithm. Our main theorem, Theorem 3, reconciles the abstract semantics of protections with the actual use of encryption. We establish that if data is hidden according to a protection, then it is secret according to our definition of secrecy.

Contents

The next section, Section 2, is mostly a review. In Section 3 we introduce our formal language for representing cryptographic expressions and give an alternative semantics to XML protections. Our main results are in Section 4: we give concrete interpretations to expressions and relate the formal semantics of protections to a strong definition of secrecy. We conclude in Section 5.

2. CONTROLLING ACCESS TO XML DOCUMENTS WITH PROTECTIONS

In this section we briefly recall the key aspects of the work of Miklau and Suciu. We focus on protections. We describe the derivation of partially encrypted documents from protections in the next section. We omit the policy query language
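The intended symbolic semantics of protections recalled in the introduction (nodes guarded by positive boolean formulas over key symbols, and a function from sets of keys to sets of accessible nodes) can be illustrated with a small sketch. Everything named here is hypothetical and not taken from Miklau and Suciu: the tuple encoding of formulas, the `Node` class, and the functions `satisfies` and `accessible`. The sketch also assumes that a node is reachable only if the guards of all its ancestors are satisfied, and it ignores keys that are themselves stored inside the document.

```python
from dataclasses import dataclass, field

# Hypothetical encoding of positive boolean formulas as nested tuples:
#   ("true",), ("key", name), ("and", f, g, ...), ("or", f, g, ...)

def satisfies(formula, keys):
    """Return True if the set of key symbols `keys` satisfies `formula`."""
    tag = formula[0]
    if tag == "true":
        return True
    if tag == "key":
        return formula[1] in keys
    if tag == "and":
        return all(satisfies(f, keys) for f in formula[1:])
    if tag == "or":
        return any(satisfies(f, keys) for f in formula[1:])
    raise ValueError(f"unknown formula tag: {tag!r}")

@dataclass
class Node:
    label: str
    guard: tuple = ("true",)          # unguarded nodes are always accessible
    children: list = field(default_factory=list)

def accessible(node, keys):
    """Labels of nodes reachable with `keys`, treating keys purely as symbols.

    Assumption of this sketch: a node is accessible only if its own guard
    and the guards of all its ancestors are satisfied.
    """
    if not satisfies(node.guard, keys):
        return set()
    result = {node.label}
    for child in node.children:
        result |= accessible(child, keys)
    return result

# The guard (K1 AND K2) OR K3 from the introduction's example.
guard = ("or", ("and", ("key", "K1"), ("key", "K2")), ("key", "K3"))
doc = Node("record", children=[Node("diagnosis", guard)])
```

With this encoding, `accessible(doc, {"K1"})` yields only the unguarded root, while either `{"K1", "K2"}` or `{"K3"}` also reaches the guarded node. The sketch covers only the abstract semantics; it says nothing about what an actual partially encrypted document reveals, which is the gap the paper addresses.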
