Extending Query Rewriting Techniques for Fine-Grained Access Control

Extending Query Rewriting Techniques for Fine-Grained Access Control Shariq Rizvi ∗ Alberto Mendelzon ∗ S. Sudarshan Prasan Roy ∗ University of California, University of Toronto Indian Institute of IBM India Research Berkeley Technology, Bombay Laboratory [email protected] [email protected] [email protected] [email protected] ABSTRACT In all the above cases, authorization is required at a very fine- Current day database applications, with large numbers of users, re- grained level, such as at the level of individual tuples. Also, as quire fine-grained access control mechanisms, at the level of indi- in the last example, there can be a policy that defines how an access vidual tuples, not just entire relations/views, to control which parts should be made apart from what data can be accessed. of the data can be accessed by each user. Fine-grained access con- Currently, authorization mechanisms in SQL permit access control is often enforced in the application code, which has numerous trol at the level of complete tables or columns, or on views. There is drawbacks; these can be avoided by specifying/enforcing access no direct way to specify fine-grained authorization to control which control at the database level. We present a novel fine-grained access tuples can be accessed by which users. In theory, fine-grained ac- control model based on authorization views that allows “authorization- cess control at the level of individual tuples can be achieved by transparent” querying; that is, user queries can be phrased in terms creating an access control list for each tuple. However this ap- of the database relations, and are valid if they can be answered us- proach is not scalable, and would be totally impractical in systems ing only the information contained in these authorization views. with millions of tuples, and thousands or millions of users, since We extend earlier work on authorization-transparent querying by it would require millions of access control specifications to be pro- introducing a new notion of validity, conditional validity. We give vided (manually) by the administrator. It is also possible to create a powerful set of inference rules to check for query validity. We views for specific users, which allow those users access to only se- demonstrate the practicality of our techniques by describing how lected tuples of a table, but again this approach is not scalable with an existing query optimizer can be extended to perform access con- large numbers of users. trol checks by incorporating these inference rules. Current generation information systems therefore typically by- pass database access control facilities, and embed access control in the application program used to access the database. Although 1. INTRODUCTION widely used, this approach has several disadvantages: Access control is an integral part of databases and information systems. • Access control has to be checked at each user-interface. This Granularity of access control refers to the size of individual data increases the overall code size. Any change in the access items which can be authorized to users. There are many scenarios control policy requires changing a large amount of code. that demand fine-grained access control: • All security policies have to be implemented into each of the applications built on top of this data (e.g. OLTP and decision • For an academic institution’s database that stores information support applications using the same underlying data). about student grades, it may be desired to allow students to • Given the large size of application code, it is possible to over- see only their own grades. On the other hand, a professor look loopholes that can be exploited to break through the se- should get access to all grades for a course she has taught. curity policies, e.g. improperly designed servlets. Also, it is • For a bank, a customer should be able to query her account easy for application programmers to create trap-doors with balance, and no one else’s balance. At the same time, a teller malicious intent, since it is impossible to check every line of should have read access to balances of all accounts but not code in a very large application. the addresses of customers corresponding to these balances. For the above reasons, fine-grained access control should ideally • A teller should be allowed to see the balance of any account by providing the account-id but not the balances of all ac- be specified and enforced at the database level. counts together. In this paper, we present a security model in which fine-grained authorization policies are defined and enforced in the database. ∗ Work done while at IIT Bombay This makes sure that the same policies hold, irrespective of how the data is accessed - through a report-writing tool, a query, or an application program. The key features of our model are as follows: Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are 1. Access control is specified using authorization views (Sec- not made or distributed for profit or commercial advantage, and that copies tion 2). An authorization view can be a traditional relational bear this notice and the full citation on the first page. To copy otherwise, to view or a parameterized view. A parameterized authoriza- republish, to post on servers or to redistribute to lists, requires prior specific tion view is an SQL view definition which makes use of pa- permission and/or a fee. SIGMOD 2004 June 13-18, 2004, Paris, France. rameters like user-id, time, user-location etc. The following Copyright 2004 ACM 1-58113-859-8/04/06 . $5.00. parameterized authorization view create authorization view MyGrades as available to the user if there is a query q0 using only the au- select * from Grades where student-id = $user-id thorization views that is equivalent to q, i.e., the two queries lets the user see all tuples in the Grades relation where the give the same result on all database states. We categorize student-id matches her user-id (parameters, such as user-id, such queries q as unconditionally valid. are denoted by a $ prefix). Parameterized views provide an The problem of rewriting a query using a set of available re- efficient and powerful way of expressing fine-grained autho- lational views [15] has received tremendous attention. It has rization policies. As views can project out specific columns been studied in the context of finding an efficient query ex- in addition to selecting rows, this framework allows fine- ecution plan by rephrasing a query using the set of available grained authorization at the cell-level. materialized views, in data integration systems, and for sup- We also provide a special form of parameterized views, which porting the separation of logical and physical views of data. we call access pattern views, which allows specification of Our access control model leverages off these techniques. authorizations such as “a teller can see the account informa- The idea that a query is valid (authorized) if it can be rewrit- tion of any one customer at a time, by providing her customer- ten in terms of authorized views was proposed earlier by id”. Motro [20] and by Rosenthal et al. [24, 22]. The model works within the basic SQL framework and does not require the DBA to encode policies using a separate rule 4. We show that certain queries can be answered using the in- language. formation contained in a set of authorization views, even if they cannot be rewritten using the views. Unconditional va- 2. We allow queries to be written in an authorization-transparent 0 manner, that is, queries can be written against the database lidity of q requires that q and q give the same results on all relations without having to refer to the authorization views. database states. The key idea in going beyond unconditional validity is that information in the authorization views avail- Given a user query (phrased in terms of database relations able to a user rules out many database states, and we need or views), our system checks if the query is valid, that is, it not require q and q0 to give the same result on such states. can be answered using the information available in the autho- On the other hand, as we show later (Example 4.3), requiring rization views that are accessible to the user. If found to be q and q0 to give the same answer only in the current database valid, the query is allowed to execute as originally specified, state is too weak a requirement, and can leak unauthorized without any modification, otherwise it is rejected. information. An obvious way to enforce access control using authorization views is to allow queries to be written only against these 5. Our next contribution is therefore to exactly characterize the views, not against the original database relations. However, class of queries, which we call conditionally valid queries, since different users (or classes of users) may have different that can be answered using the information contained in a set authorization views, this would require application program- of views in a given database state (Section 4.3). The idea of mers to code interfaces differently for each user (or class of conditional validity is novel to this paper. users), increasing the cost and complexity of application de- velopment. 6. We give a set of powerful inference rules which can be used Another alternative approach is to allow queries to be writ- to infer the unconditional and conditional validity of queries ten against database relations, but to modify the query by (Section 5).

Extending Query Rewriting Techniques for Fine-Grained Access Control

SAQE: Practical Privacy-Preserving Approximate Query Processing for Data Federations

WIN: an Efficient Data Placement Strategy for Parallel XML Databases

Tepzz¥ 67¥¥Za T

Conjunctive Query Answering in the Description Logic EL Using a Relational Database System

Query Rewriting Using Monolingual Statistical Machine Translation

On Using an Automatic, Autonomous and Non-Intrusive Approach for Rewriting SQL Queries

Towards Practical Privacy-Preserving Data Analytics by Noah Michael

Data Management 09 Transaction Processing

Distributed Complex Event Processing with Query Rewriting

Probabilistic Query Rewriting for Efficient and Effective Keyword

First-Order Query Rewriting for Inconsistent Databases

A Comparison of Query Rewriting Techniques for DL-Lite