Algebraic Operations on Tabular Data

6 The Relational Data Model: Algebraic operations on tabular data

HS / DBS05-08-RDML1

6.1 Basic idea of relational languages
6.2 Relational Algebra operations
6.3 Relational Algebra: Syntax and Semantics
6.4 More Operators
6.5 Special Topics of RA
    6.5.1 Relational algebra operators in SQL
    6.5.2 Relational completeness
    6.5.3 What is missing in RA?
    6.5.4 RA operator trees

Kemper / Eickler: 3.4, Elmasri / Navathe: chap. 7.4-7.6, Garcia-Molina, Ullman, Widom: chap. 5

Context

Part 1: Designing and using databases
– Database design: developing a relational database schema
– Design: formal theory
– Data handling in a relational database: algebra, calculus, SQL
– Extensions
– Using databases from application programs
– Special topics, physical schema
Part 2: Implementation of DBS
– Transactions, concurrency control, recovery

Design phases: requirements analysis; conceptual design; schema design, logical ("create tables"); schema design, physical ("create access path"); loading, administration, tuning, maintenance, reorganization

6.1 Basic idea of relational languages

Relational Languages (important concepts)

• Data model: language for definition and handling (manipulation) of data
– Languages for handling data:
  • Relational Algebra (RA) as a semantically well-defined applicative language
  • Relational tuple calculus (domain calculus): predicate logic interpretation of data and queries
  • SQL / DML (or simply SQL)
– Kernel of SQL built upon RA as well as calculus, extended by operations like arithmetic expressions not available in RA or calculus

Goal of language design

Given a relational database like the Video shop DB, design a language which allows expressing queries like:
- Customers who rented videos for more than 100 $ last month
- List of all movies no copy of which has been on loan for 2 months
- List the total sales volume of each movie within the last year
- Is there anybody whose rented movies all have category "horror"? (Privacy?)
- ...

The language should be declarative ("descriptive").
Historically: "Make query formulation 'as easy as in natural language'"

Relational Algebra

• Idea of Relational Algebra:
– Given relations R(a1,...,an), S(b1,...,bm)
– Define operators which transform one or more tables into a result table.

Example

Tape ( id  movieId )      Movie ( m_Id  title    )
       1   'B'                    'A'   "Frenzy"
       5   'A'                    'B'   "Matrix"
       6   'B'

Join tables! How could we find the number of tapes of "Matrix"?

Basic Operations informally (repeated from chapter 4)

Projection
Selection
Union ∪
Difference -
Cartesian product, e.g. {A, B, C} × {I, II} = {(A, I), (A, II), (B, I), (B, II), (C, I), (C, II)}
(Renaming)

Relational Algebra Basics

• Why "algebra"?
– Mathematically, algebraic structures are basically defined by a base set S of values and operations which map one or more elements of S to S and obey certain laws (e.g. groups, lattices, …)
– The base set of Relational Algebra is the set of all relations (tables) with attributes from a given set A of attributes.
– Operations on tables: projection, cartesian product, join, ... as introduced intuitively above
– Note: The result of an operation is time dependent

6.2 Relational Algebra operations

Terminological update

Let A be a universal set of attributes.
– A Relation Schema is a named n-tuple of attributes: RS = R(a1,...,an), {a1,…,an} ⊆ A
– R_A is the set {a1,...,an} of attributes (columns) of RS, called the type signature of R
– The operation Σ applied to a relation R results in the type signature of R: Σ(R) = R_A
– A Relational Database Schema is a set of relation schemas
– A Database Relation R (conforming to Relation Schema RS) is a subset of D(a1) × ... × D(an), the cross product of the domains of the attributes of R

Set operations

Tape ( id  movieId )      Movie ( movieId  title    )
       1   'B'        ∪           'A'      "Frenzy"    ?
       5   'A'                    'B'      "Matrix"
       6   'B'

Let R, S be relations with type signatures Σ(R) ⊆ A and Σ(S) ⊆ A.

R and S are called union-compatible if the domains of Σ(R) and Σ(S) are pairwise the same,
or: two tables are union-compatible if they have the same number of columns and the same domains in corresponding columns.

If R and S are union-compatible, then set union R ∪ S and set difference R \ S are defined as usual on mathematical sets.

Other set operations may be easily defined using ∪ and \, e.g. R ∩ S = R \ (R \ S).

Relational Algebra Basic Operations

• Cartesian (Cross) Product
– Cross product of two sets R and S: a set of pairs
– The result relation T should be a relation over Σ(R) ∪ Σ(S) = A' ⊆ A (assumed Σ(R) ∩ Σ(S) = ∅)

R ( a1  a2  )      S ( b1  a2  )
    1   'A'    ×       3   'A'
    5   'Z'            1   'B'

T ( (a1, a2)  (b1, a2) ) =
    (1 'A')   (3 'A')
    (5 'Z')   (3 'A')
    (1 'A')   (1 'B')
    (5 'Z')   (1 'B')

Not a relation schema in 1NF. Therefore the plain set cross product is NOT an operation of Relational Algebra.

Extended cross product ×

Let R and S be relations, Σ(R) = {a1,...,an} ⊆ A, Σ(S) = {b1,...,bm} ⊆ A, Σ(R) ∩ Σ(S) = ∅. Then:
– Schema Σ(R × S): {R.a1,...,R.an, S.b1,…,S.bm} = {R.a | a ∈ Σ(R)} ∪ {S.b | b ∈ Σ(S)}
  The relation qualifiers "R." and "S." may be omitted if there is no naming conflict.
– Extended cross product R × S:
  R × S = {(a1,...,an,b1,...,bm) | (a1,...,an) ∈ R, (b1,...,bm) ∈ S}

Renaming, if Σ(R) ∩ Σ(S) ≠ ∅:
  ρ <attrname> ← <newAttrname> (<relname>)

Projection π

Let Σ(R) = B', B ⊆ B'.
Projection π_B(R) of R on B: the set of rows from R with the columns not in B eliminated.
– No duplicates in π_B(R) (in theory!)

π_B(R) = {r restricted to B | r ∈ R}
       = {r' | there is a tuple r ∈ R such that r' is the restriction of r to the attributes in B}

Example

Parent ( id  mother  father )
         25  Mary    Paul
         47  Mary    John
         55  Mary    Paul

π_mother(Parent) = {Mary}
A relation with only one column and one row (no duplicates).

• Property of projection: B contains a key of R ⇒ π_B(R) contains as many tuples as R: |π_B(R)| = |R|
• Useful for estimating the size of query results
• Important for optimization

Selection σ

"Find movies directed by Billy Wilder made 1960 or later"

Selection of tuples from a table R according to a predicate defined on R.

Predicate P: R → {TRUE, FALSE}, defined on tuples of R.
For each r ∈ R: P(r) = TRUE | FALSE

Primitive predicates

Primitive (simple) predicates:
Let a, b be attributes, w a value from dom(a).
a θ b and a θ w are primitive predicates, where θ ∈ {=, !=, <, <=, >, >=}.

Primitive predicates compare either an attribute and a value or two attributes,
e.g. Movie.director = 'Wilder, Billy'
or   Rental.until > Rental.from + 1 month

Row predicates

Boolean row predicates: row predicates combine primitive (simple) predicates by and, or, not and parentheses '(', ')'.

Inductive definition of syntax for (row) predicates, as usual:
- Primitive predicates are predicates
- If Q, Q' are predicates, then Q ∧ Q', Q ∨ Q' and ¬Q are predicates
- Operator precedence and brackets as usual
- There are no other predicates

Movies directed by Spielberg before 1999 or an entertainment movie:
movie.director = 'Spielberg' ∧ (year <= TO_DATE('1999', 'yyyy') ∨ cat = 'entertainment')

Propositional semantics

Semantics of predicates:
Let a, b be attributes of table R, r ∈ R, P the predicate a θ v, Q the predicate a θ b.
r(a): value for attribute a of tuple r.
Then
  P(r) := r(a) θ v
  Q(r) := r(a) θ r(b)
  S ≡ P ∧ Q: S(r) := P(r) ∧ Q(r) according to ∧ semantics
¬, ∨ and precedence as usual in propositional logic.

Selection of rows

Selection σ:
  σ_P(R) = {r | r ∈ R and P(r) = TRUE}
where P is a row predicate.

Note:
- The selection operator selects rows with all their attributes: Σ(R) = Σ(σ_P(R))
- The size of the result depends on the selectivity of P:
  selectivity := |σ_P(R)| / |R|
  (important for optimization)

Frequently, θ is the equality predicate (=).

Relational Algebra Basic Operations

Example: "Movies directed by Spielberg produced 1997 or later!"

Movie(mId, title, ..., director, year)
σ_P(Movie) where P = "director = 'Spielberg' and year >= 1997"

M_ID TITLE           CAT   YEAR     DIRECTOR  PRICE_DAY LENGTH
---- --------------- ----- -------- --------- --------- ------
   2 Amistat         Drama 01.04.97 Spielberg         1    120
   4 Minority Report Drama 01.04.02 Spielberg         2    145

2 rows selected.

"Truth" is defined relative to the contents of the DB.
Closed world assumption: for all predicates P: r ∉ R ⇒ P(r) = False

Relational Algebra: combining operators

Selection with P, then projection:
π_title(σ_P(Movie))
where P = "director = 'Spielberg' and year >= 1997"

Find the actors performing in movies directed by Spielberg (selection with Q, then selection with P and projection):
π_stage_name(σ_P(σ_Q(Movie) × Starring))
where P = "Movie.id = Starring.movieId"
      Q = "director = 'Spielberg'"

Wrong result…?

Renaming

Renaming of relations: ρ <newname> (<relname>)
Relation <relname> is renamed to <newname> in the context of the expression.

Renaming of attributes: ρ <attrname> ← <newAttrname> (<relname>)
Attribute <attrname> of relation <relname> is renamed to <newAttrname> in the context of the expression.

Example: Employee(id, name, boss, …)
Find subordinates of 'Miller'

Without renaming, both operands of the cross product are called Employee, so the qualified attribute names are ambiguous:
π_name(σ_P(σ_Q(Employee) × Employee))
where P = "Employee.name = 'Miller'"
      Q = "Employee.boss = Employee.id"

With renaming:
π_Sub.name(σ_Q(σ_P(Employee × ρ_Sub(Employee))))
where P = "Employee.name = 'Miller'"
      Q = "Sub.boss = Employee.id"

Evaluation example: one table – two roles

Employee                    Sub = ρ_Sub(Employee)
id   name    boss           id   name    boss
001  Abel    NULL           001  Abel    NULL
002  Bebel   005            002  Bebel   005
004  Cebel   005            004  Cebel   005
005  Miller  001            005  Miller  001
006  Debel   001            006  Debel   001

6.3 Relational Algebra: Syntax and Semantics

Syntax of (simple) Relational Algebra defined inductively:

(1) Each table identifier is a RA expression ...
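
The operators of section 6.2 and the "subordinates of Miller" example can be tried out with a small Python sketch. This is only an illustration under stated assumptions, not part of the slides: a relation is modelled as a list of dictionaries, the function names (select, project, rename, cross) are made up for this sketch, and a relation name is carried as a prefix of each attribute name so that ρ_Sub(Employee) yields attributes such as Sub.boss. The data are the Employee and Parent tables shown above.

```python
from itertools import product

# Assumption of this sketch: a relation is a list of dicts (attribute -> value).

def dedup(rows):
    """Remove duplicate tuples so that results behave like sets."""
    seen, out = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

def select(pred, rel):
    """sigma_P(R): keep the tuples r with P(r) = TRUE, all attributes retained."""
    return [r for r in rel if pred(r)]

def project(attrs, rel):
    """pi_B(R): restrict every tuple to the attributes in B, without duplicates."""
    return dedup([{a: r[a] for a in attrs} for r in rel])

def rename(name, rel):
    """rho_name(R), modelled here by qualifying each attribute as 'name.attr'."""
    return [{f"{name}.{a}": v for a, v in r.items()} for r in rel]

def cross(r, s):
    """Extended cross product R x S; attribute sets are assumed disjoint."""
    return [{**x, **y} for x, y in product(r, s)]

# Employee table from the evaluation example (NULL modelled as None).
rows = [
    {"id": "001", "name": "Abel", "boss": None},
    {"id": "002", "name": "Bebel", "boss": "005"},
    {"id": "004", "name": "Cebel", "boss": "005"},
    {"id": "005", "name": "Miller", "boss": "001"},
    {"id": "006", "name": "Debel", "boss": "001"},
]
employee = rename("Employee", rows)
sub = rename("Sub", rows)

# pi_Sub.name( sigma_Q( sigma_P( Employee x rho_Sub(Employee) ) ) )
P = lambda r: r["Employee.name"] == "Miller"
Q = lambda r: r["Sub.boss"] == r["Employee.id"]
result = project(["Sub.name"], select(Q, select(P, cross(employee, sub))))
names = sorted(r["Sub.name"] for r in result)
print(names)  # ['Bebel', 'Cebel']

# Projection removes duplicates: pi_mother(Parent) has a single row.
parent = [
    {"id": 25, "mother": "Mary", "father": "Paul"},
    {"id": 47, "mother": "Mary", "father": "John"},
    {"id": 55, "mother": "Mary", "father": "Paul"},
]
mothers = project(["mother"], parent)
print(mothers)  # [{'mother': 'Mary'}]

# Selectivity of P on Employee: |sigma_P(R)| / |R|
selectivity = len(select(P, employee)) / len(employee)
print(selectivity)  # 0.2
```

Qualifying attribute names up front keeps the two roles of the Employee table apart in the cross product, which is the same purpose renaming serves in the algebra; a real system would resolve names differently.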