Database Systems 04 Relational Algebra

Total Page:16

File Type:pdf, Size:1020Kb

Database Systems 04 Relational Algebra 1 SCIENCE PASSION TECHNOLOGY Database Systems 04 Relational Algebra Matthias Boehm Graz University of Technology, Austria Computer Science and Biomedical Engineering Institute of Interactive Systems and Data Science BMVIT endowed chair for Data Management Last update: Mar 25, 2019 2 Announcements/Org . #1 Video Recording . Since last week, video/audio recording . Link in TeachCenter & TUbe (but not public yet) . #2 Exercise Submission . Submission through TeachCenter (max 5MB, draft possible) . Starting today (deadline Apr 02, 12.59pm ) Nobody left behind (if problems, contact us via newsgroup or email) INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Data Model 3 Recap: Relations and Terminology . Domain D (value domain): e.g., Set S, INT, Char[20] Attribute . Relation R A1 A2 A3 . Relation schema RS: INT INT BOOL Set of k attributes {A 1,…,Ak} 3 7 T . Attribute Aj: value domain Dj = dom(Aj) . Relation: subset of the Cartesian product 1 2 T over all value domains Dj 3 4 F ⊆⊆⊆ ××× ××× ××× ≥≥≥ R D1 D2 ... Dk, k 1 Tuple 1 7 T . Additional Terminology cardinality: 4 . Tuple : row of k elements of a relation rank: 3 . Cardinality of a relation: number of tuples in the relation . Rank of a relation: number of attributes . Semantics: Set := no duplicate tuples (in practice: Bag := duplicates allowed) . Order of tuples and attributes is irrelevant INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Data Model 4 Relational Algebra vs Tuple Calculus . Comparison Calculus-oriented Languages Scheme for Data Sub Languages QUEL ALPHA Relational Calculus (Tuple Calculus ) bidirectional mapping possible (relational completeness ) Relational Algebra [E. F. Codd: Relational Completeness of Data (SQL) Base Sublanguages. IBM Research Report RJ987, 1972 ] Algebra-oriented Languages INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 5 Agenda . Relational Algebra . Tuple Calculus . Physical Operators (Preview Lecture 08) INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 6 Database Research Self-Assessment 2018 PID Firstname Lastname Affiliation LID 102 Anastasia Ailamaki EPFL 1 104 Peter Bailis Stanford 105 Magdalena Balazinska U Washington 3 107 Peter Boncz CWI 2 108 Surajit Chaudhuri MS Research 3 111 Luna Dong Amazon 3 113 Juliana Freire NYU 5 115 Joe Hellerstein UC Berkley 6 116 Stratos Idreos Harvard 7 117 Donald Kossman MS Research 118 Tim Kraska MIT 7 120 Volker Markl TU Berlin 8 122 Tova Milo Tel Aviv University 9 123 C. Mohan IBM Research 10 124 Thomas Neumann TU Munich 11 126 Fatma Ozcan IBM Research 10 130 ChristopherINF.01014UF Databases Re / 706.004 Databases 1 Stanford– 04 Relational Algebra and 4 Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 7 Database Research Self-Assessment 2018, cont. PID … Affiliation LID LID Location 102 EPFL 1 1 Lausanne,SUI 104 Stanford 2 Amsterdam, NLD 105 U Washington 3 3 Seattle, USA 107 CWI 2 4 Stanford, USA 108 MS Research 3 5 New York, USA 111 Amazon 3 6 Berkley, USA 113 NYU 5 7 Cambridge, USA 115 UC Berkley 6 8 Berlin, GER 116 Harvard 7 9 Tel Aviv, ISR 117 MS Research 10 San Jose, USA 118 MIT 7 11 Munich, GER 120 TU Berlin 8 122 Tel Aviv University 9 123 IBM Research 10 124 TU Munich 11 126 IBM Research 10 130INF.01014UF Databases Stanford / 706.004 Databases 1 4 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 8 Relational Algebra INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 9 Core Relational Algebra . Relational Algebra . Operands: relations (normalized, variables for computing new values) . Operators : traditional set operations and specific relational operations . Basic Operations (minimal) . Cartesian product R×××S Traditional Specific . Union R∪∪∪S Set Relational . Difference R–S Operations Operations . Projection πi1, ..., im(R) R×××S . Selection σF(R) Basic πi1, ..., im(R) R∪∪∪S Operations . Derived Operations R–S σF(R) . Intersection R∩∩∩S ⨝ . Join R S Derived R∩∩∩S R⨝S . Division R÷÷÷S Operations R÷÷÷S . Rename ρS(R) INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 10 Extended Relational Algebra . Extended Relational Algebra . Relational algebra introduced with set semantics (no duplicate tuples) . SQL with bag semantics (more flexibility and performance) . Codd’72: In a practical environment it would need to be augmented by a counting and summing capability, together with […] library functions […]. Additional Operations (Ext) . Duplicate elimination δ(R) Basic Ext . Grouping γA,f(B) R Derived . Sorting τA(R) . Bag (aka Multiset) Terminology AB a b . Multiplicity: # occurrences of an instance b c . Cardinality: # tuples (i.e., # instances weighted by multiplicity) a b INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 11 Basic Cartesian Product Ext Derived . Definition: R×××S := {(r,s) | r ∈∈∈ R, s ∈∈∈ S} . Set of all pairs of inputs (equivalent in set /bag ) SF Bay Area . Example LID Location PID Firstname Lastname Affiliation LID 4 Stanford, USA 104 Peter Bailis Stanford 6 Berkley, USA 130 Christopher Re Stanford 4 ××× 10 San Jose, USA PID Firstname Lastname Affiliation LID LID Location 104 Peter Bailis Stanford 4 Stanford, USA 130 Christopher Re Stanford 4 4 Stanford, USA 104 Peter Bailis Stanford 6 Berkley, USA 130 Christopher Re Stanford 4 6 Berkley, USA 104 Peter Bailis Stanford 10 San Jose, USA 130 Christopher Re Stanford 4 10 San Jose, USA INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 12 Basic Union Ext Derived . Definition: R∪∪∪S := {x | x ∈∈∈ R ∨∨∨ x ∈∈∈ S} . Set: set union with duplicate elimination (idempotent: S∪∪∪S = S ) . Bag: bag union (commutative but not idempotent) . Example w/ set semantics Firstname Lastname Affiliation Firstname Lastname Affiliation Anastasia Ailamaki EPFL Anastasia Ailamaki EPFL Peter Boncz CWI Peter Boncz CWI Volker Markl TU Berlin Volker Markl TU Berlin Thomas Neumann TU Munich Thomas Neumann TU Munich Magdalena Balazinska U Washington Firstname Lastname Affiliation Juliana Freire NYU Anastasia Ailamaki EPFL Tova Milo Tel Aviv University Magdalena Balazinska U Washington Juliana Freire NYU Union requires Tova Milo Tel Aviv University compatible schema INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 13 Basic Difference Ext Derived . Definition: R–S := {x | x ∈∈∈ R ∧∧∧ x ∉∉∉ S} (sometimes \) . Set: set difference . Bag: element multiplicity of R minus multiplicity min(R,S) . Example w/ bag semantics Affiliation Affiliation Stanford Stanford U Washington U Washington Affiliation MS Research NYU Amazon MS Research – UC Berkley NYU Amazon Harvard IBM Research UC Berkley MIT Harvard TU Berlin MIT Stanford Difference requires IBM Research compatible schema INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 14 Basic Projection and Selection Ext Derived . Projection πi1, ..., im(R) Example: Affiliation . Set: selection of attributes with πAffiliation (R) w/ EPFL duplicate elimination set semantics Stanford . Bag: selection of attributes U Washington CWI . Extended Projection MS Research Arithmetic expressions: π (R) . A,A*B Amazon . Duplicate occurrences: πA,A,B (R) NYU . Selection (restriction) σ (R) UC Berkley F Harvard . Selection of tuples satisfying the predicate F MIT (equivalent in set /bag ) TU Berlin . Example: σ (R) Affiliation=‘IBM Research‘ Tel Aviv University PID Firstname Lastname Affiliation LID IBM Research 123 C. Mohan IBM Research 10 TU Munich 126 Fatma Ozcan IBM Research 10 INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 15 Query Trees . Composition of Complex Queries . Relational algebra expressions data flow graph (tree) . Leaf nodes represent base relations; root node represent result . Example πFirstname,Affiliation,Location ( σP.LID=L.LID (Participant × Location)) Firstname Affiliation Location πFirstname,Affiliation,Location Anastasia EPFL Lausanne, SUI Magdalena U Seattle, Operations σP.LID=L.LID Washington USA Peter CWI Amsterdam, Intermediates NLD ××× Surajit MS Seattle, Research USA ……… Participant Location INF.01014UF Databases / 706.004 Databases 1 – 04 Relational Algebra and Tuple Calculus Matthias Boehm, Graz University of Technology, SS 2019 Relational Algebra 16 Basic Intersection Ext Derived . Definition R∩∩∩S := R – (R – S) (derived from basic operations) . Set: set intersection derived from difference . Bag: bag intersection, with element multiplicity min(R,S) . Example Firstname Lastname Affiliation Anastasia Ailamaki EPFL Peter Boncz CWI Volker Markl
Recommended publications
  • Is Uncountable the Open Interval (0, 1)
    Section 2.4:R is uncountable (0, 1) is uncountable This assertion and its proof date back to the 1890’s and to Georg Cantor. The proof is often referred to as “Cantor’s diagonal argument” and applies in more general contexts than we will see in these notes. Our goal in this section is to show that the setR of real numbers is uncountable or non-denumerable; this means that its elements cannot be listed, or cannot be put in bijective correspondence with the natural numbers. We saw at the end of Section 2.3 that R has the same cardinality as the interval ( π , π ), or the interval ( 1, 1), or the interval (0, 1). We will − 2 2 − show that the open interval (0, 1) is uncountable. Georg Cantor : born in St Petersburg (1845), died in Halle (1918) Theorem 42 The open interval (0, 1) is not a countable set. Dr Rachel Quinlan MA180/MA186/MA190 Calculus R is uncountable 143 / 222 Dr Rachel Quinlan MA180/MA186/MA190 Calculus R is uncountable 144 / 222 The open interval (0, 1) is not a countable set A hypothetical bijective correspondence Our goal is to show that the interval (0, 1) cannot be put in bijective correspondence with the setN of natural numbers. Our strategy is to We recall precisely what this set is. show that no attempt at constructing a bijective correspondence between It consists of all real numbers that are greater than zero and less these two sets can ever be complete; it can never involve all the real than 1, or equivalently of all the points on the number line that are numbers in the interval (0, 1) no matter how it is devised.
    [Show full text]
  • Self-Organizing Tuple Reconstruction in Column-Stores
    Self-organizing Tuple Reconstruction in Column-stores Stratos Idreos Martin L. Kersten Stefan Manegold CWI Amsterdam CWI Amsterdam CWI Amsterdam The Netherlands The Netherlands The Netherlands [email protected] [email protected] [email protected] ABSTRACT 1. INTRODUCTION Column-stores gained popularity as a promising physical de- A prime feature of column-stores is to provide improved sign alternative. Each attribute of a relation is physically performance over row-stores in the case that workloads re- stored as a separate column allowing queries to load only quire only a few attributes of wide tables at a time. Each the required attributes. The overhead incurred is on-the-fly relation R is physically stored as a set of columns; one col- tuple reconstruction for multi-attribute queries. Each tu- umn for each attribute of R. This way, a query needs to load ple reconstruction is a join of two columns based on tuple only the required attributes from each relevant relation. IDs, making it a significant cost component. The ultimate This happens at the expense of requiring explicit (partial) physical design is to have multiple presorted copies of each tuple reconstruction in case multiple attributes are required. base table such that tuples are already appropriately orga- Each tuple reconstruction is a join between two columns nized in multiple different orders across the various columns. based on tuple IDs/positions and becomes a significant cost This requires the ability to predict the workload, idle time component in column-stores especially for multi-attribute to prepare, and infrequent updates. queries [2, 6, 10].
    [Show full text]
  • On Free Products of N-Tuple Semigroups
    n-tuple semigroups Anatolii Zhuchok Luhansk Taras Shevchenko National University Starobilsk, Ukraine E-mail: [email protected] Anatolii Zhuchok Plan 1. Introduction 2. Examples of n-tuple semigroups and the independence of axioms 3. Free n-tuple semigroups 4. Free products of n-tuple semigroups 5. References Anatolii Zhuchok 1. Introduction The notion of an n-tuple algebra of associative type was introduced in [1] in connection with an attempt to obtain an analogue of the Chevalley construction for modular Lie algebras of Cartan type. This notion is based on the notion of an n-tuple semigroup. Recall that a nonempty set G is called an n-tuple semigroup [1], if it is endowed with n binary operations, denoted by 1 ; 2 ; :::; n , which satisfy the following axioms: (x r y) s z = x r (y s z) for any x; y; z 2 G and r; s 2 f1; 2; :::; ng. The class of all n-tuple semigroups is rather wide and contains, in particular, the class of all semigroups, the class of all commutative trioids (see, for example, [2, 3]) and the class of all commutative dimonoids (see, for example, [4, 5]). Anatolii Zhuchok 2-tuple semigroups, causing the greatest interest from the point of view of applications, occupy a special place among n-tuple semigroups. So, 2-tuple semigroups are closely connected with the notion of an interassociative semigroup (see, for example, [6, 7]). Moreover, 2-tuple semigroups, satisfying some additional identities, form so-called restrictive bisemigroups, considered earlier in the works of B. M. Schein (see, for example, [8, 9]).
    [Show full text]
  • Cardinality of Sets
    Cardinality of Sets MAT231 Transition to Higher Mathematics Fall 2014 MAT231 (Transition to Higher Math) Cardinality of Sets Fall 2014 1 / 15 Outline 1 Sets with Equal Cardinality 2 Countable and Uncountable Sets MAT231 (Transition to Higher Math) Cardinality of Sets Fall 2014 2 / 15 Sets with Equal Cardinality Definition Two sets A and B have the same cardinality, written jAj = jBj, if there exists a bijective function f : A ! B. If no such bijective function exists, then the sets have unequal cardinalities, that is, jAj 6= jBj. Another way to say this is that jAj = jBj if there is a one-to-one correspondence between the elements of A and the elements of B. For example, to show that the set A = f1; 2; 3; 4g and the set B = {♠; ~; }; |g have the same cardinality it is sufficient to construct a bijective function between them. 1 2 3 4 ♠ ~ } | MAT231 (Transition to Higher Math) Cardinality of Sets Fall 2014 3 / 15 Sets with Equal Cardinality Consider the following: This definition does not involve the number of elements in the sets. It works equally well for finite and infinite sets. Any bijection between the sets is sufficient. MAT231 (Transition to Higher Math) Cardinality of Sets Fall 2014 4 / 15 The set Z contains all the numbers in N as well as numbers not in N. So maybe Z is larger than N... On the other hand, both sets are infinite, so maybe Z is the same size as N... This is just the sort of ambiguity we want to avoid, so we appeal to the definition of \same cardinality." The answer to our question boils down to \Can we find a bijection between N and Z?" Does jNj = jZj? True or false: Z is larger than N.
    [Show full text]
  • Relational Algebra
    Relational Algebra Instructor: Shel Finkelstein Reference: A First Course in Database Systems, 3rd edition, Chapter 2.4 – 2.6, plus Query Execution Plans Important Notices • Midterm with Answers has been posted on Piazza. – Midterm will be/was reviewed briefly in class on Wednesday, Nov 8. – Grades were posted on Canvas on Monday, Nov 13. • Median was 83; no curve. – Exam will be returned in class on Nov 13 and Nov 15. • Please send email if you want “cheat sheet” back. • Lab3 assignment was posted on Sunday, Nov 5, and is due by Sunday, Nov 19, 11:59pm. – Lab3 has lots of parts (some hard), and is worth 13 points. – Please attend Labs to get help with Lab3. What is a Data Model? • A data model is a mathematical formalism that consists of three parts: 1. A notation for describing and representing data (structure of the data) 2. A set of operations for manipulating data. 3. A set of constraints on the data. • What is the associated query language for the relational data model? Two Query Languages • Codd proposed two different query languages for the relational data model. – Relational Algebra • Queries are expressed as a sequence of operations on relations. • Procedural language. – Relational Calculus • Queries are expressed as formulas of first-order logic. • Declarative language. • Codd’s Theorem: The Relational Algebra query language has the same expressive power as the Relational Calculus query language. Procedural vs. Declarative Languages • Procedural program – The program is specified as a sequence of operations to obtain the desired the outcome. I.e., how the outcome is to be obtained.
    [Show full text]
  • Georg Cantor English Version
    GEORG CANTOR (March 3, 1845 – January 6, 1918) by HEINZ KLAUS STRICK, Germany There is hardly another mathematician whose reputation among his contemporary colleagues reflected such a wide disparity of opinion: for some, GEORG FERDINAND LUDWIG PHILIPP CANTOR was a corruptor of youth (KRONECKER), while for others, he was an exceptionally gifted mathematical researcher (DAVID HILBERT 1925: Let no one be allowed to drive us from the paradise that CANTOR created for us.) GEORG CANTOR’s father was a successful merchant and stockbroker in St. Petersburg, where he lived with his family, which included six children, in the large German colony until he was forced by ill health to move to the milder climate of Germany. In Russia, GEORG was instructed by private tutors. He then attended secondary schools in Wiesbaden and Darmstadt. After he had completed his schooling with excellent grades, particularly in mathematics, his father acceded to his son’s request to pursue mathematical studies in Zurich. GEORG CANTOR could equally well have chosen a career as a violinist, in which case he would have continued the tradition of his two grandmothers, both of whom were active as respected professional musicians in St. Petersburg. When in 1863 his father died, CANTOR transferred to Berlin, where he attended lectures by KARL WEIERSTRASS, ERNST EDUARD KUMMER, and LEOPOLD KRONECKER. On completing his doctorate in 1867 with a dissertation on a topic in number theory, CANTOR did not obtain a permanent academic position. He taught for a while at a girls’ school and at an institution for training teachers, all the while working on his habilitation thesis, which led to a teaching position at the university in Halle.
    [Show full text]
  • Relational Algebra and SQL Relational Query Languages
    Relational Algebra and SQL Chapter 5 1 Relational Query Languages • Languages for describing queries on a relational database • Structured Query Language (SQL) – Predominant application-level query language – Declarative • Relational Algebra – Intermediate language used within DBMS – Procedural 2 1 What is an Algebra? · A language based on operators and a domain of values · Operators map values taken from the domain into other domain values · Hence, an expression involving operators and arguments produces a value in the domain · When the domain is a set of all relations (and the operators are as described later), we get the relational algebra · We refer to the expression as a query and the value produced as the query result 3 Relational Algebra · Domain: set of relations · Basic operators: select, project, union, set difference, Cartesian product · Derived operators: set intersection, division, join · Procedural: Relational expression specifies query by describing an algorithm (the sequence in which operators are applied) for determining the result of an expression 4 2 The Role of Relational Algebra in a DBMS 5 Select Operator • Produce table containing subset of rows of argument table satisfying condition σ condition (relation) • Example: σ Person Hobby=‘stamps’(Person) Id Name Address Hobby Id Name Address Hobby 1123 John 123 Main stamps 1123 John 123 Main stamps 1123 John 123 Main coins 9876 Bart 5 Pine St stamps 5556 Mary 7 Lake Dr hiking 9876 Bart 5 Pine St stamps 6 3 Selection Condition • Operators: <, ≤, ≥, >, =, ≠ • Simple selection
    [Show full text]
  • Equivalents to the Axiom of Choice and Their Uses A
    EQUIVALENTS TO THE AXIOM OF CHOICE AND THEIR USES A Thesis Presented to The Faculty of the Department of Mathematics California State University, Los Angeles In Partial Fulfillment of the Requirements for the Degree Master of Science in Mathematics By James Szufu Yang c 2015 James Szufu Yang ALL RIGHTS RESERVED ii The thesis of James Szufu Yang is approved. Mike Krebs, Ph.D. Kristin Webster, Ph.D. Michael Hoffman, Ph.D., Committee Chair Grant Fraser, Ph.D., Department Chair California State University, Los Angeles June 2015 iii ABSTRACT Equivalents to the Axiom of Choice and Their Uses By James Szufu Yang In set theory, the Axiom of Choice (AC) was formulated in 1904 by Ernst Zermelo. It is an addition to the older Zermelo-Fraenkel (ZF) set theory. We call it Zermelo-Fraenkel set theory with the Axiom of Choice and abbreviate it as ZFC. This paper starts with an introduction to the foundations of ZFC set the- ory, which includes the Zermelo-Fraenkel axioms, partially ordered sets (posets), the Cartesian product, the Axiom of Choice, and their related proofs. It then intro- duces several equivalent forms of the Axiom of Choice and proves that they are all equivalent. In the end, equivalents to the Axiom of Choice are used to prove a few fundamental theorems in set theory, linear analysis, and abstract algebra. This paper is concluded by a brief review of the work in it, followed by a few points of interest for further study in mathematics and/or set theory. iv ACKNOWLEDGMENTS Between the two department requirements to complete a master's degree in mathematics − the comprehensive exams and a thesis, I really wanted to experience doing a research and writing a serious academic paper.
    [Show full text]
  • Python Mock Test
    PPYYTTHHOONN MMOOCCKK TTEESSTT http://www.tutorialspoint.com Copyright © tutorialspoint.com This section presents you various set of Mock Tests related to Python. You can download these sample mock tests at your local machine and solve offline at your convenience. Every mock test is supplied with a mock test key to let you verify the final score and grade yourself. PPYYTTHHOONN MMOOCCKK TTEESSTT IIII Q 1 - What is the output of print tuple[2:] if tuple = ′abcd′, 786, 2.23, ′john′, 70.2? A - ′abcd′, 786, 2.23, ′john′, 70.2 B - abcd C - 786, 2.23 D - 2.23, ′john′, 70.2 Q 2 - What is the output of print tinytuple * 2 if tinytuple = 123, ′john′? A - 123, ′john′, 123, ′john′ B - 123, ′john′ * 2 C - Error D - None of the above. Q 3 - What is the output of print tinytuple * 2 if tinytuple = 123, ′john′? A - 123, ′john′, 123, ′john′ B - 123, ′john′ * 2 C - Error D - None of the above. Q 4 - Which of the following is correct about dictionaries in python? A - Python's dictionaries are kind of hash table type. B - They work like associative arrays or hashes found in Perl and consist of key-value pairs. C - A dictionary key can be almost any Python type, but are usually numbers or strings. Values, on the other hand, can be any arbitrary Python object. D - All of the above. Q 5 - Which of the following function of dictionary gets all the keys from the dictionary? A - getkeys B - key C - keys D - None of the above.
    [Show full text]
  • Efficient Skyline Computation Over Low-Cardinality Domains
    Efficient Skyline Computation over Low-Cardinality Domains MichaelMorse JigneshM.Patel H.V.Jagadish University of Michigan 2260 Hayward Street Ann Arbor, Michigan, USA {mmorse, jignesh, jag}@eecs.umich.edu ABSTRACT Hotel Parking Swim. Workout Star Name Available Pool Center Rating Price Current skyline evaluation techniques follow a common paradigm Slumber Well F F F ⋆ 80 that eliminates data elements from skyline consideration by find- Soporific Inn F T F ⋆⋆ 65 ing other elements in the dataset that dominate them. The perfor- Drowsy Hotel F F T ⋆⋆ 110 mance of such techniques is heavily influenced by the underlying Celestial Sleep T T F ⋆ ⋆ ⋆ 101 Nap Motel F T F ⋆⋆ 101 data distribution (i.e. whether the dataset attributes are correlated, independent, or anti-correlated). Table 1: A sample hotels dataset. In this paper, we propose the Lattice Skyline Algorithm (LS) that is built around a new paradigm for skyline evaluation on datasets the Soporific Inn. The Nap Motel is not in the skyline because the with attributes that are drawn from low-cardinality domains. LS Soporific Inn also contains a swimming pool, has the same number continues to apply even if one attribute has high cardinality. Many of stars as the Nap Motel, and costs less. skyline applications naturally have such data characteristics, and In this example, the skyline is being computed over a number of previous skyline methods have not exploited this property. We domains that have low cardinalities, and only one domain that is un- show that for typical dimensionalities, the complexity of LS is lin- constrained (the Price attribute in Table 1).
    [Show full text]
  • CSC 443 – Database Management Systems Data and Its Structure
    CSC 443 – Database Management Systems Lecture 3 –The Relational Data Model Data and Its Structure • Data is actually stored as bits, but it is difficult to work with data at this level. • It is convenient to view data at different levels of abstraction . • Schema : Description of data at some abstraction level. Each level has its own schema. • We will be concerned with three schemas: physical , conceptual , and external . 1 Physical Data Level • Physical schema describes details of how data is stored: tracks, cylinders, indices etc. • Early applications worked at this level – explicitly dealt with details. • Problem: Routines were hard-coded to deal with physical representation. – Changes to data structure difficult to make. – Application code becomes complex since it must deal with details. – Rapid implementation of new features impossible. Conceptual Data Level • Hides details. – In the relational model, the conceptual schema presents data as a set of tables. • DBMS maps from conceptual to physical schema automatically. • Physical schema can be changed without changing application: – DBMS would change mapping from conceptual to physical transparently – This property is referred to as physical data independence 2 Conceptual Data Level (con’t) External Data Level • In the relational model, the external schema also presents data as a set of relations. • An external schema specifies a view of the data in terms of the conceptual level. It is tailored to the needs of a particular category of users. – Portions of stored data should not be seen by some users. • Students should not see their files in full. • Faculty should not see billing data. – Information that can be derived from stored data might be viewed as if it were stored.
    [Show full text]
  • Some Set Theory We Should Know Cardinality and Cardinal Numbers
    SOME SET THEORY WE SHOULD KNOW CARDINALITY AND CARDINAL NUMBERS De¯nition. Two sets A and B are said to have the same cardinality, and we write jAj = jBj, if there exists a one-to-one onto function f : A ! B. We also say jAj · jBj if there exists a one-to-one (but not necessarily onto) function f : A ! B. Then the SchrÄoder-BernsteinTheorem says: jAj · jBj and jBj · jAj implies jAj = jBj: SchrÄoder-BernsteinTheorem. If there are one-to-one maps f : A ! B and g : B ! A, then jAj = jBj. A set is called countable if it is either ¯nite or has the same cardinality as the set N of positive integers. Theorem ST1. (a) A countable union of countable sets is countable; (b) If A1;A2; :::; An are countable, so is ¦i·nAi; (c) If A is countable, so is the set of all ¯nite subsets of A, as well as the set of all ¯nite sequences of elements of A; (d) The set Q of all rational numbers is countable. Theorem ST2. The following sets have the same cardinality as the set R of real numbers: (a) The set P(N) of all subsets of the natural numbers N; (b) The set of all functions f : N ! f0; 1g; (c) The set of all in¯nite sequences of 0's and 1's; (d) The set of all in¯nite sequences of real numbers. The cardinality of N (and any countable in¯nite set) is denoted by @0. @1 denotes the next in¯nite cardinal, @2 the next, etc.
    [Show full text]