
Subset Queries in Relational Databases

Satyanarayana R Valluri, Kamalakar Karlapalem
[email protected], [email protected]
Centre for Data Engineering
International Institute of Information Technology
Gachibowli, Hyderabad, INDIA

Abstract

In this paper, we motivate the need for relational database systems to support subset query processing. We define new operators in relational algebra and new constructs in SQL for expressing subset queries. We also illustrate the applicability of subset queries through examples expressed using extended SQL statements and relational algebra expressions. Our aim is to show the utility of subset queries for next generation applications.

1 Introduction

Relational database systems currently support processing of tuples of relations to generate a single result as a set of tuples. Relational algebra, calculus and SQL are used to specify queries on relational databases. There have been extensions to relational algebra: the `group by' clause groups the tuples so that each group is represented by a single aggregated result tuple, and the `having' clause further restricts the groups that should be part of the result. A cube operator over a set of dimensions generates result tuples, one for each group of tuples from a subset of dimension domain values. In all of these operators (group by, having and cube), each group of tuples is represented by a single aggregated tuple. In this paper, we relax this notion by proposing subset queries and show the utility of subset queries in concisely specifying a new class of user queries.

A subset is a set of tuples. A relation of subsets is a set of subsets. The operations on a relation of subsets can be at two levels:

1. at the level of subsets, treating them as atomic entities, and
2. at the level of tuples of subsets.

The objective of the paper is to present a class of queries that can be specified in a concise manner using relations of subsets. There are many applications that require such queries to be specified. It should be noted that these queries can be translated to standard SQL queries using complicated programming constructs, which is non-trivial for application programmers. Therefore, there is a motivation to facilitate direct execution of subset queries.

Our aim in this paper is to reveal the kind of applications that can be supported by subset queries, not to dwell on the efficiency issues of processing such queries. Once the notion of subset queries and their utility is accepted, additional work on executing subset queries efficiently can follow; it is a challenging open research problem.

In this paper, we

• motivate the need for relational database systems to support subset query processing,
• define new operators in relational algebra, and new constructs in SQL, for expressing subset queries, and
• illustrate the applicability of subset queries through different examples expressed using extended SQL statements and relational algebra expressions.

1.1 Motivation

We motivate the utility of subset queries through examples. Consider an Item relation in a grocery shop: (ItemId, Name, Weight, Price, Type), where ItemId is the unique id of the item, and Name, Weight and Price denote the name, the weight of the item in grams and the price of the item in Dollars respectively. The domain of Type is {“Eatable”, “Non-Eatable”}. Table 1 shows an extension of the Item relation.

Tuple   ItemId  Name         Weight  Price  Type
t_1     1       Soap         40      20     Non-Eatable
t_2     2       Face Powder  250     70     Non-Eatable
t_3     3       Bread        60      15     Eatable
t_4     4       Tooth Paste  150     50     Non-Eatable
t_5     5       Jam          35      65     Eatable
t_6     6       Chips        25      18     Eatable
t_7     7       Hair Oil     100     35     Non-Eatable
t_8     8       Sauce        75      40     Eatable
t_9     9       Perfume      60      100    Non-Eatable
t_10    10      Candy        20      50     Eatable

Table 1: Extension of the Item relation

Query 1 “What are the items whose weight is more than 50 and whose price is less than 50?”

The answer to query 1 is the set of items $\{t_3, t_7, t_8\}$.

Query 2 “What are the sets of “Non-Eatable” items which can be bought such that the total price of these items is more than 150 and the total weight of these items is between 200 and 400?”

Unlike query 1, there are multiple results for query 2. The result of the query is the set of subsets: $\{\{t_2, t_9\}, \{t_1, t_2, t_9\}, \{t_1, t_4, t_9\}, \{t_4, t_7, t_9\}, \{t_1, t_4, t_7, t_9\}\}$.

Unfortunately, SQL does not have provision for specifying subset queries. This paper studies the problem of expressing subset queries and processing them. In the rest of the sections, we further develop the notion of subset queries and expound their utility.

The organization of the paper is as follows. Section 2 introduces relation of subsets and its properties. Section 3 develops subset relational algebra and extensions to SQL to specify subset queries. Section 4 presents recent literature on novel SQL extensions and contrasts it with subset queries. Finally, Section 5 presents conclusions.

2 Relation of Subsets

Let $R$ be the intension of a relational schema. Let $\{A_1, A_2, \ldots, A_n\}$ be the set of attributes of $R$. Let $r = \{t_1, t_2, \ldots, t_k\}$ be an extension of $R$. Each $t_i$, $1 \le i \le k$, is a tuple defined over $R$. $t_i.A_j$ refers to the value of the attribute $A_j$ ($1 \le j \le n$) of the tuple $t_i$ ($t_i \in r$).

2.1 Terminology

In this section we formally define subsets and the set of subsets called a relation of subsets. Further, we develop the subset relational algebra.

Definition 1 Subset of tuples: A subset of tuples taken from extension r of relation R is denoted as $S$. $|S|$ is the cardinality of the subset, the number of tuples in the subset. Each tuple in $S$ is defined over the same intension R.

The subset corresponding to query 1 is $S = \{t_3, t_7, t_8\}$.

Definition 2 Relation of subsets: A relation of subsets $RS$ is a set of subsets over the extension r of the relation R. $RS = \{S_1, S_2, \ldots, S_m\}$, where each $S_i$, $1 \le i \le m$, is a subset of tuples defined over the extension r, and m is the number of subsets.

For query 2, $RS = \{\{t_2, t_9\}, \{t_1, t_2, t_9\}, \{t_1, t_4, t_9\}, \{t_4, t_7, t_9\}, \{t_1, t_4, t_7, t_9\}\}$.

2.2 Operations on subsets

In this section, we define the operations that can be applied to subsets. These operations can be discussed under two categories: set operations and relational operations.

2.2.1 Set Operations

The set operations on subsets are shown in Table 2.

Operation       Notation          Definition
Union           $S_i \cup S_j$    $\{t \mid t \in S_i$ or $t \in S_j$ or both$\}$
Intersection    $S_i \cap S_j$    $\{t \mid t \in S_i$ and $t \in S_j\}$
Set Difference  $S_i - S_j$       $\{t \mid t \in S_i$ but $t \notin S_j\}$
Complement      $\overline{S}$    $\{t \mid t \in r$ and $t \notin S\}$

Table 2: Set Operations on a subset of tuples

2.2.2 Relational Operations

The relational operations on subsets are similar to the standard relational operations on relations. Table 3 shows the various relational operations on subsets.

2.3 Properties of subsets

Equality of subsets: Two subsets $S_i$ and $S_j$ are said to be equal if they are defined on the same extension and contain the same set of tuples: $S_i = S_j \iff \forall t\,(t \in S_i \iff t \in S_j)$.

Complement of a subset: Given $S$, the complement $\overline{S}$ is defined as the set of tuples in extension r which do not belong to $S$: $\overline{S} = \{t_i \mid t_i \in r$ and $t_i \notin S\}$.

Table 4 shows the properties of subsets based on set theory.

2.4 Relation of subsets operations

The operations on the relation of subsets can be discussed under two categories: set operations and relational operations.

2.4.1 Set Operations

Table 5 displays the various set operations that can be applied on a relation of subsets. The output of the unary union and unary intersection is a single subset of tuples, whereas the output of the rest of the operations is a relation of subsets. The name “cross” is appropriate for the cross union and cross intersection operations since every pair of subsets is considered, similar to the cross product operation. While computing the cross union (cross intersection), if the result of the union (intersection) of two subsets is empty, then we do not include it in the output. But some applications may require even the empty results in the output, in which case a notion of outer cross union (cross intersection) can be used.

2.4.2 Relational Operations

Table 6 shows the various relational operators that can be applied on a relation of subsets. In [7], the notion of multi-relational algebra (MRA) is introduced for checking the correctness of query execution strategies in distributed databases. The cross cartesian product and the cross join operations are similar to the MJN operation defined on multi-relations.

As already mentioned in the previous sub-section, the result of applying a select condition on a subset might be an empty set. We assume that such empty results are not included in the output. But we can define a notion of outer subset select which includes such empty results in the output. Similar extensions can be made to the cross join and the group by-having operations.

2.5 Relation of subsets properties

Equality of relation of subsets: Two relations of subsets $RS_i$ and $RS_j$ are said to be equal if they are defined on the same extension r and both contain the same subsets: $RS_i = RS_j \iff \forall S\,(S \in RS_i \iff S \in RS_j)$.

Complement of relation of subsets: Given a relation of subsets $RS$, its complement $\overline{RS}$ contains the complements of the subsets of $RS$: $\overline{RS} = \{\overline{S} \mid S \in RS\}$.

3 Supporting subset queries

3.1 Subset representation

A major problem for supporting subsets in relational databases is the representation of the subsets. Since the first normal form states that the domain of every attribute should take only atomic values [11], a subset cannot be represented as a single tuple.
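To make the representation problem concrete, the following Python sketch enumerates the answer to Query 2 over the Table 1 data and then flattens the resulting relation of subsets into rows of atomic values. The (SubsetId, ItemId) membership layout, the names `ITEMS`, `query2_subsets` and `flatten`, and the strict reading of "between 200 and 400" are assumptions made for illustration only; they are not taken from the paper, whose own representation is not shown in this excerpt.

```python
from itertools import combinations

# Extension of the Item relation from Table 1:
# (ItemId, Name, Weight in grams, Price in Dollars, Type)
ITEMS = [
    (1, "Soap", 40, 20, "Non-Eatable"),
    (2, "Face Powder", 250, 70, "Non-Eatable"),
    (3, "Bread", 60, 15, "Eatable"),
    (4, "Tooth Paste", 150, 50, "Non-Eatable"),
    (5, "Jam", 35, 65, "Eatable"),
    (6, "Chips", 25, 18, "Eatable"),
    (7, "Hair Oil", 100, 35, "Non-Eatable"),
    (8, "Sauce", 75, 40, "Eatable"),
    (9, "Perfume", 60, 100, "Non-Eatable"),
    (10, "Candy", 20, 50, "Eatable"),
]


def query2_subsets(items):
    """Brute-force Query 2: every set of Non-Eatable items whose total
    price exceeds 150 and whose total weight lies strictly between
    200 and 400 (strict bounds are an assumption of this sketch)."""
    candidates = [t for t in items if t[4] == "Non-Eatable"]
    result = []
    # Enumerate all non-empty subsets of the candidate tuples by size.
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            total_weight = sum(t[2] for t in combo)
            total_price = sum(t[3] for t in combo)
            if total_price > 150 and 200 < total_weight < 400:
                result.append(frozenset(t[0] for t in combo))
    return result


def flatten(subsets):
    """Hypothetical first-normal-form layout: one (SubsetId, ItemId)
    row per membership, so every attribute value stays atomic."""
    return [
        (subset_id, item_id)
        for subset_id, subset in enumerate(subsets, start=1)
        for item_id in sorted(subset)
    ]


if __name__ == "__main__":
    subsets = query2_subsets(ITEMS)
    for row in flatten(subsets):
        print(row)
```

With the Table 1 data this yields five subsets, stored as fifteen membership rows. The enumeration is exponential in the number of candidate tuples, which is one way to see why efficient processing of subset queries is left as an open problem.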