Extended Multiset-Table-Algebra Operations

International Journal of Intelligent Computing Research (IJICR), Volume 7, Issue 2, June 2016
Copyright © 2016, Infonomics Society

Iryna Glushko
Nizhyn Gogol State University, Ukraine

Abstract

This paper addresses some theoretical questions of database theory. It considers the multiset table algebra, in which the notion of a table is specified using the notion of a multiset (or bag). The signature of the multiset table algebra is extended with new operations: inner and outer joins, semijoin, and aggregate operations. A formal mathematical semantics of these operations is defined. The special element NULL is added to the universal domain in order to define the outer join.

1. Introduction

The relational data model is now widely used both in database research and in practice. In its formal definition, originally proposed by E. Codd, the relational model is based on sets of tuples, i.e. it does not allow duplicate tuples in a relation [1]. Many applications, however, are characterized by multiplicity and repetition of data, for example sociological polls of different population groups, calculations on DNA, and others. Commercial relational database systems are almost invariably based on multisets instead of sets. In other words, tables are in general allowed to include duplicate tuples.

For example, the data model of SQL is relational in nature, as are the relevant operations. However, unlike relational algebra, the tables manipulated by SQL are not relations but multisets. The reason for this peculiarity is twofold [2]. First, there is a practical reason: since SQL tables may be very large, duplicate elimination might become a bottleneck for the computation of the query result. Second, SQL extends the set of query operators by means of aggregate functions, whose operands are in general required to be multisets of values.

So there is a natural need to extend the capabilities of relational databases through the use of multisets (or bags). This problem was also considered in [2-8]. However, the question requires refinement and extension, because those works do not pay due attention to the inner and outer joins, semijoin, and aggregate operations of the multiset table algebra.

2. Multiset: Basic Definitions

We introduce the formal definition of multisets following the monograph [5].

Definition 1. A multiset with basis $U$ is a function $\alpha : U \to \{1, 2, \ldots\}$, where $U$ is an arbitrary set (in the classical Cantor sense).

We agree to write a multiset $\alpha$ with basis $U = \{d_1, \ldots, d_k\}$ as $\{d_1^{n_1}, \ldots, d_k^{n_k}\}$, where $n_i$ is the number of duplicates of the element $d_i$ in the multiset $\alpha$, so $n_i = \alpha(d_i)$, $i = 1, \ldots, k$.

Let $\alpha$ be a multiset with basis $U = \mathrm{dom}\,\alpha$. Here $\mathrm{dom}\,\alpha$ is the domain of definition of the multiset $\alpha$ regarded as a function.

Definition 2. A characteristic function of a multiset $\alpha$ is a function $\chi_\alpha : D \to \{0, 1, 2, \ldots\}$ whose values are specified by the following piecewise schema:

$$\chi_\alpha(d) = \begin{cases} \alpha(d), & \text{if } d \in \mathrm{dom}\,\alpha, \\ 0, & \text{otherwise;} \end{cases}$$

for all $d \in D$, where $D$ is the universe of elements of multiset bases.

Definition 3. An empty multiset $\emptyset_m$ is a multiset whose characteristic function is a constant function equal to zero everywhere.

Definition 4. The rank of a finite multiset $\alpha$ is the sum of the numbers of duplicates of the elements of its basis:

$$|\alpha| = \sum_{d \in \mathrm{dom}\,\alpha} \alpha(d),$$

wherein $|\emptyset_m| = 0$.

We now introduce a binary inclusion relation over multisets.

Definition 5. A multiset $\alpha$ with basis $U_\alpha$ is included in a multiset $\beta$ with basis $U_\beta$ ($\alpha \subseteq \beta$) if

$$U_\alpha \subseteq U_\beta \;\&\; \forall d\,(d \in U_\alpha \Rightarrow \alpha(d) \le \beta(d)).$$

It follows directly from the definition that this relation is a partial order.

The 1-multisets are the multisets whose range of values is the empty set or the single-element set $\{1\}$. These multisets are the analogues of ordinary sets.

Analogues of the standard set-theoretic operations, as well as operations that exploit the peculiarities of multisets and therefore are not useful for (abstract) sets, are defined in terms of characteristic functions in the monograph [5]. These are the operations of multiset union $\cup_{All}$, intersection $\cap_{All}$, and difference $\setminus_{All}$, which build multisets of general form. The Cartesian product of multisets $\times$, the operation $Dist(\alpha)$, which builds a 1-multiset, and an analogue of the full image for multisets are defined there too.

3. Multiset Table Algebra: Basic Definitions

Two sets are considered: $A$ is the set of attributes and $D$ is the universal domain.

Definition 6. An arbitrary (finite) set of attributes $R \subseteq A$ is called a scheme.

Definition 7. A tuple of the scheme $R$ is a nominal set on the pair $R$, $D$ whose projection onto the first component is equal to $R$. In other words, a tuple of the scheme $R$ is a function $s : R \to D$.

The set of all tuples on the scheme $R$ is denoted $S(R)$, and the set of all tuples is denoted $S = \bigcup_{R \subseteq A} S(R)$.

Definition 8. A table is a pair $\langle \psi, R \rangle$, where the first component $\psi$ is an arbitrary multiset whose basis $\theta(\psi)$ is an arbitrary (possibly infinite) set of tuples of the scheme $R$, and the second component $R$ is the scheme of the table.

The first component of the pair $\langle \psi, R \rangle$ is the multiset $\psi$. Thus a certain scheme is ascribed to every table. The set of all tables on the scheme $R$ is denoted $\Psi(R)$, and the set of all tables is denoted $\Psi = \bigcup_{R \subseteq A} \Psi(R)$.

The notation $Occ(s, \psi)$ denotes the number of duplicates of the tuple $s$ in the multiset $\psi$. We agree to write a multiset $\psi$ as $\{s_1^{n_1}, \ldots, s_k^{n_k}\}$, where $n_i = Occ(s_i, \psi)$, $i = 1, \ldots, k$, and $\theta(\psi) = \{s_1, \ldots, s_k\}$ is the basis of the multiset $\psi$.

Definition 9. The multiset table algebra is the algebra $\langle \Psi, \Omega_{P,\Xi} \rangle$, where $\Psi$ is the set of all tables,

$$\Omega_{P,\Xi} = \{\cup_{All}^{R},\ \cap_{All}^{R},\ \setminus_{All}^{R},\ \sigma_{p,R},\ \pi_{X,R},\ \otimes_{R_1,R_2},\ Rt_{\xi,R},\ \sim_R\}$$

is the signature, $p \in P$, $\xi \in \Xi$, $X, R, R_1, R_2 \subseteq A$, and $P$, $\Xi$ are the sets of parameters.

The operations of the signature $\Omega_{P,\Xi}$ are defined in [6].

Theorem 1. Any expression over the multiset table algebra can be replaced by an equivalent expression that uses only the operations of selection, join, projection, union, difference, and renaming.

Proof. To prove this statement we show that the operations of intersection and active complement can be expressed through the operations named in the formulation of the theorem. Indeed, the following equalities hold:

1) $\langle \psi_1, R \rangle \cap_{All}^{R} \langle \psi_2, R \rangle = \langle \psi_1, R \rangle \setminus_{All}^{R} (\langle \psi_1, R \rangle \setminus_{All}^{R} \langle \psi_2, R \rangle)$;

2) $\sim_R \langle \psi, R \rangle = C(\langle \psi, R \rangle) \setminus_{All}^{R} \langle \psi, R \rangle$, where

$$C(\langle \psi, R \rangle) = \pi_{\{A_1\},R}(\langle \psi, R \rangle) \otimes_{\{A_1\},\{A_2\}} \ldots \otimes_{\{A_1, \ldots, A_{n-1}\},\{A_n\}} \pi_{\{A_n\},R}(\langle \psi, R \rangle)$$

and $R = \{A_1, \ldots, A_n\}$ is the scheme of the table $\langle \psi, R \rangle$ (see, for example, [5, 10]).

4. Extended Operations

The extended operations include inner and outer joins, semijoin, and aggregate operations. We consider these operations one at a time.

4.1. Inner Joins

There are four kinds of inner join operations: Cartesian join, natural join, join using attributes $A_1, \ldots, A_n$, and join on predicate $p$. We define them below.

Definition 10. The Cartesian join of a table on the scheme $R_1$ and a table on the scheme $R_2$, where $R_1 \cap R_2 = \emptyset$, is a binary parametric operation $Cj_{R_1,R_2}$ of the form

$$Cj_{R_1,R_2} : \Psi(R_1) \times \Psi(R_2) \to \Psi(R_1 \cup R_2),$$

$$\langle \psi_1, R_1 \rangle \; Cj_{R_1,R_2} \; \langle \psi_2, R_2 \rangle = \langle \psi', R_1 \cup R_2 \rangle,$$

where $\langle \psi_1, R_1 \rangle \in \Psi(R_1)$ and $\langle \psi_2, R_2 \rangle \in \Psi(R_2)$.

The basis of the multiset $\psi'$ is defined as follows:

$$\theta(\psi') = \{ s \mid \exists s_1 \exists s_2\, (s_1 \in \theta(\psi_1) \wedge s_2 \in \theta(\psi_2) \wedge s = s_1 \cup s_2) \}.$$

The number of duplicates is given by the following formula:

$$Occ(s, \psi') = Occ(s_1, \psi_1) \cdot Occ(s_2, \psi_2),$$

where $s \in \theta(\psi')$ and $s = s_1 \cup s_2$.

Example 1. Consider the two tables $\langle \psi_1, R_1 \rangle$ and $\langle \psi_2, R_2 \rangle$ shown in Table 1 and Table 2, respectively. Let $R_1 = \{A, B, C\}$ and $R_2 = \{D, E, F\}$.

Table 1. $\langle \psi_1, R_1 \rangle$

A  B  C
a  a  c
a  a  c
b  b  a

Table 2. $\langle \psi_2, R_2 \rangle$

D  E  F
a  a  a
a  a  c
a  a  c
a  a  c

The Cartesian join of these tables is shown in Table 3.

Table 3. $\langle \psi_1, R_1 \rangle \; Cj_{R_1,R_2} \; \langle \psi_2, R_2 \rangle$

A  B  C  D  E  F
a  a  c  a  a  a
a  a  c  a  a  c
a  a  c  a  a  c
a  a  c  a  a  c
a  a  c  a  a  a
a  a  c  a  a  c
a  a  c  a  a  c
a  a  c  a  a  c
b  b  a  a  a  a
b  b  a  a  a  c
b  b  a  a  a  c
b  b  a  a  a  c

This result can also be written as

$$\langle \psi_1, R_1 \rangle \; Cj_{R_1,R_2} \; \langle \psi_2, R_2 \rangle = \langle \{
\{\langle A,a \rangle, \langle B,a \rangle, \langle C,c \rangle, \langle D,a \rangle, \langle E,a \rangle, \langle F,a \rangle\}^{2},
\{\langle A,a \rangle, \langle B,a \rangle, \langle C,c \rangle, \langle D,a \rangle, \langle E,a \rangle, \langle F,c \rangle\}^{6},
\{\langle A,b \rangle, \langle B,b \rangle, \langle C,a \rangle, \langle D,a \rangle, \langle E,a \rangle, \langle F,a \rangle\}^{1},
\{\langle A,b \rangle, \langle B,b \rangle, \langle C,a \rangle, \langle D,a \rangle, \langle E,a \rangle, \langle F,c \rangle\}^{3}
\}, R_1 \cup R_2 \rangle,$$

where $R_1 \cup R_2 = \{A, B, C, D, E, F\}$.

[...] where $s' \in \theta(\psi')$ and $s' = s_1 \cup s_2$.

Example 2. Consider the two tables $\langle \psi_1', R_1' \rangle$ and $\langle \psi_2', R_2' \rangle$ shown in Table 4 and Table 5, respectively. Let $R_1' = \{A, B, C\}$ and $R_2' = \{A, B, D\}$.

Table 4. $\langle \psi_1', R_1' \rangle$

A  B  C
a  b  c
a  c  a
a  c  a
b  b  b

Table 5. $\langle \psi_2', R_2' \rangle$

A  B  D
a  b  a
a  c  a
c  b  a
c  b  a

The inner natural join of these tables can be written as

$$\langle \psi_1', R_1' \rangle \otimes_{R_1',R_2'} \langle \psi_2', R_2' \rangle = \langle \{
\{\langle A,a \rangle, \langle B,b \rangle, \langle C,c \rangle, \langle D,a \rangle\}^{1},
\{\langle A,a \rangle, \langle B,c \rangle, \langle C,a \rangle, \langle D,a \rangle\}^{2}
\}, R_1' \cup R_2' \rangle,$$

where $R_1' \cup R_2' = \{A, B, C, D\}$.
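The multiplicity rule of Definition 10 and the results of Examples 1 and 2 can be checked with a small sketch. It is only an illustration under stated assumptions, not the paper's formalism: a multiset of tuples is modelled as a Python `Counter`, a tuple of a scheme as a `frozenset` of (attribute, value) pairs, and the helper names `row`, `cartesian_join`, and `natural_join` are hypothetical. The natural join rule used here (multiply the multiplicities of tuples that agree on the common attributes) is an assumption chosen because it reproduces the result of Example 2.

```python
from collections import Counter


def row(**attrs):
    """A tuple of a scheme, modelled as a frozenset of (attribute, value) pairs."""
    return frozenset(attrs.items())


def cartesian_join(psi1, r1, psi2, r2):
    """Cartesian join in the sense of Definition 10; the schemes must be disjoint."""
    if r1 & r2:
        raise ValueError("Cartesian join requires disjoint schemes")
    result = Counter()
    for s1, n1 in psi1.items():
        for s2, n2 in psi2.items():
            # Occ(s, psi') = Occ(s1, psi1) * Occ(s2, psi2), with s = s1 ∪ s2
            result[s1 | s2] += n1 * n2
    return result, r1 | r2


def natural_join(psi1, r1, psi2, r2):
    """Assumed natural join: combine tuples that agree on the common attributes."""
    common = r1 & r2
    result = Counter()
    for s1, n1 in psi1.items():
        d1 = dict(s1)
        for s2, n2 in psi2.items():
            d2 = dict(s2)
            if all(d1[a] == d2[a] for a in common):
                result[s1 | s2] += n1 * n2
    return result, r1 | r2


# Example 1: Tables 1 and 2 written as multisets with multiplicities.
psi1 = Counter({row(A="a", B="a", C="c"): 2, row(A="b", B="b", C="a"): 1})
psi2 = Counter({row(D="a", E="a", F="a"): 1, row(D="a", E="a", F="c"): 3})
cj, cj_scheme = cartesian_join(psi1, frozenset("ABC"), psi2, frozenset("DEF"))

# Example 2: Tables 4 and 5.
psi1p = Counter({
    row(A="a", B="b", C="c"): 1,
    row(A="a", B="c", C="a"): 2,
    row(A="b", B="b", C="b"): 1,
})
psi2p = Counter({
    row(A="a", B="b", D="a"): 1,
    row(A="a", B="c", D="a"): 1,
    row(A="c", B="b", D="a"): 2,
})
nj, nj_scheme = natural_join(psi1p, frozenset("ABC"), psi2p, frozenset("ABD"))

for s, n in cj.items():
    print(sorted(s), n)
for s, n in nj.items():
    print(sorted(s), n)
```

Under these assumptions the Cartesian join yields four distinct tuples with multiplicities 2, 6, 1, and 3 (twelve rows in total, as in Table 3), and the natural join yields two distinct tuples with multiplicities 1 and 2, as in Example 2.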
