Query Optimization

CMU SCS
Carnegie Mellon Univ.
Dept. of Computer Science
15-415/615 - DB Applications
C. Faloutsos – A. Pavlo
Lecture #15: Query Optimization
Faloutsos/Pavlo CMU SCS 15-415/615

Today's Class
• History & Background
• Relational Algebra Equivalences
• Plan Cost Estimation
• Plan Enumeration

Query Optimization
• Remember that SQL is declarative.
  – User tells the DBMS what answer they want, not how to get the answer.
• There can be a big difference in performance based on which plan is used:
  – See last week: 5.7 days vs. 45 seconds

1970s – Relational Model
• Ted Codd saw the maintenance overhead for IMS/Codasyl.
• Proposed database abstraction based on relations:
  – Store database in simple data structures.
  – Access it through high-level language.
  – Physical storage left up to implementation.

IBM System R
• Skunkworks project at IBM Research in San Jose to implement Codd's ideas.
• Had to figure out all of the things that we are discussing in this course themselves.
• IBM never commercialized System R.
• First implementation of a query optimizer.
• People argued that the DBMS could never choose a query plan better than what a human could write.
• A lot of the concepts from System R's optimizer are still used today.

Today's Class
• History & Background
• Relational Algebra Equivalences
• Plan Cost Estimation
• Plan Enumeration
• Nested Sub-queries

Relational Algebra Equivalences
• A query can be expressed in different ways.
• The optimizer considers variations and chooses the one with the lowest cost.
• How do we know whether two queries are equivalent?

Relational Algebra Equivalences
• Two relational algebra expressions are equivalent if they generate the same set of tuples.

Predicate Pushdown
    SELECT name, cid
    FROM student, enrolled
    WHERE student.sid = enrolled.sid
    AND enrolled.grade = 'A'
[Figure: two query plan trees for this query. Left: π name,cid over σ grade='A' over (student ⨝ sid=sid enrolled). Right: π name,cid over (student ⨝ sid=sid σ grade='A'(enrolled)).]

Relational Algebra Equivalences
• For the query above:
    π name,cid(σ grade='A'(student ⋈ enrolled))
  = π name,cid(student ⋈ σ grade='A'(enrolled))

Relational Algebra Equivalences
• Selections:
  – Perform them early
  – Break a complex predicate, and push down
      σ p1∧p2∧…∧pn(R) = σ p1(σ p2(…σ pn(R)…))
• Simplify a complex predicate
  – (X=Y AND Y=3) → X=3 AND Y=3

Relational Algebra Equivalences
• Projections:
  – Perform them early
    • Smaller tuples
    • Fewer tuples (if duplicates are eliminated)
  – Project out all attributes except the ones requested or required (e.g., joining attr.)
• This is not important for a column store…

Projection Pushdown
[Figure: query plan trees for the same query before and after projection pushdown. After pushdown, π sid,name is applied to student and π sid,cid to σ grade='A'(enrolled) below the join on sid=sid, with π name,cid at the root.]

Relational Algebra Equivalences
• Joins:
  – Commutative, associative
      R ⋈ S = S ⋈ R
      (R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
• Q: How many different orderings are there for an n-way join?
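
The predicate pushdown equivalence for the student/enrolled query can be checked on toy data. The following is a minimal sketch, not how a DBMS implements these operators: the rows, the helper names (join_on_sid, select, project) and the pair counts are all assumptions made up for illustration.

```python
# Toy relations (hypothetical rows, for illustration only).
student = [
    {"sid": 1, "name": "Ann"},
    {"sid": 2, "name": "Bob"},
    {"sid": 3, "name": "Cy"},
]
enrolled = [
    {"sid": 1, "cid": "15-415", "grade": "A"},
    {"sid": 2, "cid": "15-415", "grade": "B"},
    {"sid": 3, "cid": "15-615", "grade": "A"},
    {"sid": 3, "cid": "15-415", "grade": "C"},
]


def join_on_sid(left, right):
    """Nested-loop equi-join on sid; each output row merges both inputs."""
    return [{**s, **e} for s in left for e in right if s["sid"] == e["sid"]]


def select(rows, pred):
    """Relational selection: keep the rows that satisfy pred."""
    return [row for row in rows if pred(row)]


def project(rows, attrs):
    """Relational projection with set semantics (duplicates eliminated)."""
    return {tuple(row[a] for a in attrs) for row in rows}


def grade_is_a(row):
    return row["grade"] == "A"


# Plan 1: join first, then apply the selection.
plan_late = project(
    select(join_on_sid(student, enrolled), grade_is_a), ["name", "cid"]
)

# Plan 2: push the selection below the join, onto enrolled.
enrolled_a = select(enrolled, grade_is_a)
plan_pushed = project(join_on_sid(student, enrolled_a), ["name", "cid"])

# Number of (student, enrolled) pairs the nested-loop join has to compare.
pairs_late = len(student) * len(enrolled)
pairs_pushed = len(student) * len(enrolled_a)

print(plan_late == plan_pushed)   # both plans produce the same set of tuples
print(pairs_late, pairs_pushed)   # the pushed-down plan compares fewer pairs
```

Both plans return the same set of tuples, which is the definition of equivalence used above, while the pushed-down plan feeds fewer rows into the join.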

Relational Algebra Equivalences
• Joins: How many different orderings are there for an n-way join?
• A: Catalan number ~ 4^n
  – Exhaustive enumeration: too slow.
• We'll see in a second how an optimizer limits the search space...

Cost Estimation
• How long will a query take?
  – CPU: Small cost; tough to estimate
  – Disk: # of block transfers
  – Memory: Amount of DRAM used
  – Network: # of messages
• How many tuples will be read/written?
• What statistics do we need to keep?

Cost Estimation – Statistics
• For each relation R we keep:
  – N_R → # tuples
  – S_R → size of tuple in bytes
  – V(A,R) → # of distinct values of attribute 'A'
[Figure: relation R drawn as tuples #1 … #N_R, each S_R bytes wide]

Derivable Statistics
• F_R → max # records/block
• B_R → # blocks
• SC(A,R) → selection cardinality: avg # of records with A=given → N_R / V(A,R)
• Note that this assumes data uniformity
  – 10,000 students, 10 colleges – how many students in SCS?
[Figure: relation R drawn as blocks #1 … #B_R, F_R records per block]

Additional Statistics
• For index i:
  – F_i → average fanout (~50-100)
  – HT_i → # levels of index i (~2-3) ≈ log(#entries)/log(F_i)
  – LB_i → # blocks at leaf level

Statistics
• Where do we store them?
• How often do we update them?
• Manual invocations:
  – Postgres/SQLite: ANALYZE
  – MySQL: ANALYZE TABLE

Selection Statistics
• We saw simple predicates (name="Kayne")
• How about more complex predicates, like
  – salary > 10000
  – age=30 AND jobTitle="Costermonger"
• What is their selectivity?

Selections – Complex Predicates
• Selectivity sel(P) of predicate P == fraction of tuples that qualify
• Formula depends on type of predicate.
  – Equality
  – Range
  – Negation
  – Conjunction
  – Disjunction

Selections – Complex Predicates
• Assume that V(rating, sailors) has 5 distinct values (0–4) and N_R = 5
• Equality Predicate: A=constant
  – sel(A=constant) = SC(P) / N_R (= 1 / V(A,R) under the uniformity assumption)
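
The selection cardinality and the equality selectivity can be worked through numerically. This is a minimal sketch under the uniformity assumption stated in the slides; it takes selectivity to be the fraction of tuples that qualify, i.e. the selection cardinality divided by the number of tuples. The function names are assumptions made up for illustration.

```python
def selection_cardinality(n_r: int, v_a_r: int) -> float:
    """SC(A,R) = N_R / V(A,R): average # of records per distinct value of A.

    Only meaningful if values of A are spread uniformly over the tuples.
    """
    if n_r <= 0 or v_a_r <= 0:
        raise ValueError("n_r and v_a_r must be positive")
    return n_r / v_a_r


def equality_selectivity(n_r: int, v_a_r: int) -> float:
    """sel(A=constant) as a fraction of tuples: SC(A,R) / N_R = 1 / V(A,R)."""
    return selection_cardinality(n_r, v_a_r) / n_r


# 10,000 students spread uniformly over 10 colleges.
students_in_scs = selection_cardinality(10_000, 10)

# rating has 5 distinct values and the relation has 5 tuples.
sel_rating_eq_2 = equality_selectivity(5, 5)

print(students_in_scs)   # 1000.0
print(sel_rating_eq_2)   # 0.2
```

If the data is skewed (for example, one college with most of the students), the uniform estimate can be far off, which is why the slides flag the assumption.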

Selections – Complex Predicates
• Equality example: sel(rating='2') = 1/5
  – SC(rating='2') = 1, N_R = V(rating,R) = 5
[Figure: histogram of count vs. rating, values 0–4]

Selections – Complex Predicates
• Range Query:
  – sel(A>a) = (A_max – a) / (A_max – A_min)
  – Example: sel(rating >= '2') = (4 – 2) / (4 – 0) = 1/2 (rating_min = 0, rating_max = 4)

Selections – Complex Predicates
• Negation Query:
  – sel(not P) = 1 – sel(P)
  – Example: sel(rating != '2') = 1 – (1/5) = 4/5
• Observation: selectivity ≈ probability

Selections – Complex Predicates
• Conjunction:
  – sel(rating = '2' AND name LIKE 'C%')
  – sel(P1 ⋀ P2) = sel(P1) · sel(P2)
  – INDEPENDENCE ASSUMPTION
[Figure: Venn diagram of P1 and P2]

Selections – Complex Predicates
• Disjunction:
  – sel(rating = '2' OR name LIKE 'C%')
  – sel(P1 ⋁ P2) = sel(P1) + sel(P2) – sel(P1 ⋀ P2)
                 = sel(P1) + sel(P2) – sel(P1) · sel(P2)
  – INDEPENDENCE ASSUMPTION, again
[Figure: Venn diagram of P1 and P2]

Selections – Complex Predicates
• Disjunction, in general:
  – sel(P1 OR P2 OR … Pn) = 1 – (1 – sel(P1)) · (1 – sel(P2)) · … · (1 – sel(Pn))

Joins
• Q: Given a join of R and S, what is the range of possible result sizes in # of tuples?

Result Size Estimation for Joins
• General case: Rcols ∩ Scols = {A} where A is not a key for either table.
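
The selectivity formulas for range, negation, conjunction and disjunction predicates given earlier can be sketched as small functions. This is an illustrative sketch only: the function names are made up, exact fractions are used to avoid floating-point noise, and the selectivity of name LIKE 'C%' (1/10) is a hypothetical value that does not come from the slides.

```python
import math
from fractions import Fraction


def sel_range_gt(a, a_min, a_max):
    """sel(A > a) = (A_max - a) / (A_max - A_min)."""
    return Fraction(a_max - a, a_max - a_min)


def sel_not(sel_p):
    """sel(not P) = 1 - sel(P)."""
    return 1 - sel_p


def sel_and(sel_p1, sel_p2):
    """sel(P1 AND P2) = sel(P1) * sel(P2), assuming independent predicates."""
    return sel_p1 * sel_p2


def sel_or(*sels):
    """sel(P1 OR ... OR Pn) = 1 - (1 - sel(P1)) * ... * (1 - sel(Pn)),
    again assuming independent predicates."""
    return 1 - math.prod(1 - s for s in sels)


sel_rating_eq_2 = Fraction(1, 5)    # from the equality example
sel_name_like_c = Fraction(1, 10)   # hypothetical value, for illustration

range_sel = sel_range_gt(2, 0, 4)                  # rating >= '2'
not_sel = sel_not(sel_rating_eq_2)                 # rating != '2'
and_sel = sel_and(sel_rating_eq_2, sel_name_like_c)
or_sel = sel_or(sel_rating_eq_2, sel_name_like_c)

print(range_sel, not_sel, and_sel, or_sel)   # 1/2 4/5 1/50 7/25
```

For two predicates, sel_or reduces to sel(P1) + sel(P2) – sel(P1) · sel(P2), matching the two-predicate disjunction formula.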