Index Concurrency Control

Total Page:16

File Type:pdf, Size:1020Kb

Index Concurrency Control Index Concurrency Control Index Concurrency Control 1 / 129 Index Concurrency Control Administrivia • Assignment 3 is due on Oct 19th @ 11:59pm • Exercise Sheeet 3 is due on Oct 19th @ 11:59pm (no late days allowed) 2 / 129 Recap Recap 3 / 129 Recap Index Data Structures • List of Data Structures: Hash Tables, B+Trees, Radix Trees • Most DBMSs automatically create an index to enforce integrity constraints. • B+Trees are the way to go for indexing data. 4 / 129 Recap Observation • We assumed that all the data structures that we have discussed so far are single-threaded. • But we need to allow multiple threads to safely access our data structures to take advantage of additional CPU cores and hide disk I/O stalls. 5 / 129 Recap Concurrency Control • A concurrency control protocol is the method that the DBMS uses to ensure "correct" results for concurrent operations on a shared object. • A protocol’s correctness criteria can vary: I Logical Correctness: Am I reading the data that I am supposed to read? I Physical Correctness: Is the internal representation of the object sound? 6 / 129 Recap Today’s Agenda • Latches Overview • Hash Table Latching • B+Tree Latching • Leaf Node Scans • Delayed Parent Updates (Blink-Tree) 7 / 129 Latches Overview Latches Overview 8 / 129 Latches Overview Locks vs. Latches • Locks I Protects the database’s logical contents from other txns. I Held for the duration of the transaction. I Need to be able to rollback changes. • Latches I Protects the critical sections of the DBMS’s internal physical data structures from other threads. I Held for the duration of the operation. I Do not need to be able to rollback changes. 9 / 129 Latches Overview Locks vs. Latches Locks Latches Separate. User transactions Threads Protect. Database Contents In-Memory Data Structures During. Entire Transactions Critical Sections Modes. Shared, Exclusive, Update, Intention Read, Write (a.k.a., Shared, Exclusive) Deadlock Detection & Resolution Avoidance . by. Waits-for, Timeout, Aborts Coding Discipline Kept in. Lock Manager Protected Data Structure Reference 10 / 129 Latches Overview Latch Modes • Read Mode I Multiple threads can read the same object at the same time. I A thread can acquire the read latch if another thread has it in read mode. • Write Mode I Only one thread can access the object. I A thread cannot acquire a write latch if another thread holds the latch in any mode. Read Write Read X X Write X X 11 / 129 Latches Overview Latch Implementations • Blocking OS Mutex • Test-and-Set Spin Latch • Reader-Writer Latch 12 / 129 Latches Overview Latch Implementations • Approach 1: Blocking OS Mutex I Simple to use I Non-scalable (about 25 ns per lock/unlock invocation) I Example: std::mutex std::mutexm; m.lock(); // Do something special... m.unlock(); 13 / 129 Latches Overview Latch Implementations • Approach 2: Test-and-Set Spin Latch (TAS) I Very efficient (single instruction to latch/unlatch) I Non-scalable, not cache friendly I Example: std::atomic<T> I Unlike OS mutex, spin latches do not suspend thread execution I Atomic operations are faster if contention between threads is sufficiently low std::atomic_flag latch; // atomic of boolean type (lock-free) while(latch.test_and_set(...)) { // Retry? Yield? Abort? } 14 / 129 Latches Overview Latch Implementations • Approach 3: Reader-Writer Latch I Allows for concurrent readers I Must manage read/write queues to avoid starvation I Can be implemented on top of spinlocks 15 / 129 Latches Overview Latch Implementations • Approach 3: Reader-Writer Latch I Allows for concurrent readers I Must manage read/write queues to avoid starvation I Can be implemented on top of spinlocks 16 / 129 Hash Table Latching Hash Table Latching 17 / 129 Hash Table Latching Hash Table Latching • Easy to support concurrent access due to the limited ways in which threads access the data structure. I All threads move in the same direction and only access a single page/slot at a time. I Deadlocks are not possible. • To resize the table, take a global latch on the entire table (i.e., in the header page). 18 / 129 Hash Table Latching Hash Table Latching • Approach 1: Page Latches I Each page has its own reader-write latch that protects its entire contents. I Threads acquire either a read or write latch before they access a page. • Approach 2: Slot Latches I Each slot has its own latch. I Can use a single mode latch to reduce meta-data and computational overhead. 19 / 129 Hash Table Latching Hash Table - Page Laches 20 / 129 Hash Table Latching Hash Table - Page Laches 21 / 129 Hash Table Latching Hash Table - Page Laches 22 / 129 Hash Table Latching Hash Table - Page Laches 23 / 129 Hash Table Latching Hash Table - Page Laches 24 / 129 Hash Table Latching Hash Table - Page Laches 25 / 129 Hash Table Latching Hash Table - Page Laches 26 / 129 Hash Table Latching Hash Table - Page Laches 27 / 129 Hash Table Latching Hash Table - Page Laches 28 / 129 Hash Table Latching Hash Table - Slot Laches 29 / 129 Hash Table Latching Hash Table - Slot Laches 30 / 129 Hash Table Latching Hash Table - Slot Laches 31 / 129 Hash Table Latching Hash Table - Slot Laches 32 / 129 Hash Table Latching Hash Table - Slot Laches 33 / 129 Hash Table Latching Hash Table - Slot Laches 34 / 129 Hash Table Latching Hash Table - Slot Laches 35 / 129 Hash Table Latching Hash Table - Slot Laches 36 / 129 B+Tree Concurrency Control B+Tree Concurrency Control 37 / 129 B+Tree Concurrency Control B+Tree Concurrency Control • We want to allow multiple threads to read and update a B+Tree at the same time. • We need to handle two types of problems: I Threads trying to modify the contents of a node at the same time. I One thread traversing the tree while another thread splits/merges nodes. 38 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 39 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 40 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 41 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 42 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 43 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 44 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 45 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 46 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 47 / 129 B+Tree Concurrency Control B+Tree Concurrency Control: Example 48 / 129 B+Tree Concurrency Control Latch Crabbing/Coupling • Protocol to allow multiple threads to access/modify B+Tree at the same time. • Basic Idea: I Get latch for parent. I Get latch for child I Release latch for parent if “safe”. • A safe node is one that will not split or merge when updated. I Not full (on insertion) I More than half-full (on deletion) 49 / 129 B+Tree Concurrency Control Latch Crabbing/Coupling • Find: Start at root and go down; repeatedly, I Acquire R latch on child I Then unlatch parent • Insert/Delete: Start at root and go down, obtaining W latches as needed. Once child is latched, check if it is safe: I If child is safe, release all latches on ancestors. 50 / 129 B+Tree Concurrency Control Example 1 - Find 38 51 / 129 B+Tree Concurrency Control Example 1 - Find 38 52 / 129 B+Tree Concurrency Control Example 1 - Find 38 53 / 129 B+Tree Concurrency Control Example 1 - Find 38 54 / 129 B+Tree Concurrency Control Example 1 - Find 38 55 / 129 B+Tree Concurrency Control Example 1 - Find 38 56 / 129 B+Tree Concurrency Control Example 1 - Find 38 57 / 129 B+Tree Concurrency Control Example 1 - Find 38 58 / 129 B+Tree Concurrency Control Example 2 - Delete 38 59 / 129 B+Tree Concurrency Control Example 2 - Delete 38 60 / 129 B+Tree Concurrency Control Example 2 - Delete 38 61 / 129 B+Tree Concurrency Control Example 2 - Delete 38 62 / 129 B+Tree Concurrency Control Example 2 - Delete 38 63 / 129 B+Tree Concurrency Control Example 2 - Delete 38 64 / 129 B+Tree Concurrency Control Example 2 - Delete 38 65 / 129 B+Tree Concurrency Control Example 2 - Delete 38 66 / 129 B+Tree Concurrency Control Example 2 - Delete 38 67 / 129 B+Tree Concurrency Control Example 3 - Insert 45 68 / 129 B+Tree Concurrency Control Example 3 - Insert 45 69 / 129 B+Tree Concurrency Control Example 3 - Insert 45 70 / 129 B+Tree Concurrency Control Example 3 - Insert 45 71 / 129 B+Tree Concurrency Control Example 3 - Insert 45 72 / 129 B+Tree Concurrency Control Example 3 - Insert 45 73 / 129 B+Tree Concurrency Control Example 4 - Insert 25 74 / 129 B+Tree Concurrency Control Example 4 - Insert 25 75 / 129 B+Tree Concurrency Control Example 4 - Insert 25 76 / 129 B+Tree Concurrency Control Example 4 - Insert 25 77 / 129 B+Tree Concurrency Control Example 4 - Insert 25 78 / 129 B+Tree Concurrency Control Example 4 - Insert 25 79 / 129 B+Tree Concurrency Control Observation • What was the first step that all the update examples did on+ theB Tree? • Taking a write latch on the root every time becomes a bottleneck with higher concurrency. • Can we do better? 80 / 129 B+Tree Concurrency Control Better Latching Algorithm • Assume that the leaf node is safe. • Use read latches and crabbing to reach it, and then verify that it is safe. • If leaf is not safe, then do previous algorithm using write latches. • Reference 81 / 129 B+Tree Concurrency Control Example 2 - Delete 38 82 / 129 B+Tree Concurrency Control Example 2 - Delete 38 83 / 129 B+Tree Concurrency Control Example 2 - Delete 38 84 / 129 B+Tree Concurrency Control Example 2 - Delete 38 85 / 129 B+Tree Concurrency Control Example 2 - Delete 38 86 / 129 B+Tree Concurrency Control Example 4 - Insert 25 87 / 129 B+Tree Concurrency Control Example 4 - Insert 25 88 / 129 B+Tree Concurrency Control Example 4 - Insert 25 89 / 129 B+Tree Concurrency Control Example 4 - Insert 25 90 / 129 B+Tree Concurrency Control Better Latching Algorithm • Find: Same as before.
Recommended publications
  • C Programming: Data Structures and Algorithms
    C Programming: Data Structures and Algorithms An introduction to elementary programming concepts in C Jack Straub, Instructor Version 2.07 DRAFT C Programming: Data Structures and Algorithms, Version 2.07 DRAFT C Programming: Data Structures and Algorithms Version 2.07 DRAFT Copyright © 1996 through 2006 by Jack Straub ii 08/12/08 C Programming: Data Structures and Algorithms, Version 2.07 DRAFT Table of Contents COURSE OVERVIEW ........................................................................................ IX 1. BASICS.................................................................................................... 13 1.1 Objectives ...................................................................................................................................... 13 1.2 Typedef .......................................................................................................................................... 13 1.2.1 Typedef and Portability ............................................................................................................. 13 1.2.2 Typedef and Structures .............................................................................................................. 14 1.2.3 Typedef and Functions .............................................................................................................. 14 1.3 Pointers and Arrays ..................................................................................................................... 16 1.4 Dynamic Memory Allocation .....................................................................................................
    [Show full text]
  • Open Data Structures (In Java)
    Open Data Structures (in Java) Edition 0.1G Pat Morin Contents Acknowledgments ix Why This Book? xi 1 Introduction 1 1.1 The Need for Efficiency ..................... 2 1.2 Interfaces ............................. 4 1.2.1 The Queue, Stack, and Deque Interfaces . 5 1.2.2 The List Interface: Linear Sequences . 6 1.2.3 The USet Interface: Unordered Sets .......... 8 1.2.4 The SSet Interface: Sorted Sets ............ 9 1.3 Mathematical Background ................... 9 1.3.1 Exponentials and Logarithms . 10 1.3.2 Factorials ......................... 11 1.3.3 Asymptotic Notation . 12 1.3.4 Randomization and Probability . 15 1.4 The Model of Computation ................... 18 1.5 Correctness, Time Complexity, and Space Complexity . 19 1.6 Code Samples .......................... 22 1.7 List of Data Structures ..................... 22 1.8 Discussion and Exercises .................... 26 2 Array-Based Lists 29 2.1 ArrayStack: Fast Stack Operations Using an Array . 30 2.1.1 The Basics ........................ 30 2.1.2 Growing and Shrinking . 33 2.1.3 Summary ......................... 35 Contents 2.2 FastArrayStack: An Optimized ArrayStack . 35 2.3 ArrayQueue: An Array-Based Queue . 36 2.3.1 Summary ......................... 40 2.4 ArrayDeque: Fast Deque Operations Using an Array . 40 2.4.1 Summary ......................... 43 2.5 DualArrayDeque: Building a Deque from Two Stacks . 43 2.5.1 Balancing ......................... 47 2.5.2 Summary ......................... 49 2.6 RootishArrayStack: A Space-Efficient Array Stack . 49 2.6.1 Analysis of Growing and Shrinking . 54 2.6.2 Space Usage ....................... 54 2.6.3 Summary ......................... 55 2.6.4 Computing Square Roots . 56 2.7 Discussion and Exercises ...................
    [Show full text]
  • Kernel Extensions and Device Support Programming Concepts
    Bull Kernel Extensions and Device Support Programming Concepts AIX ORDER REFERENCE 86 A2 36JX 02 Bull Kernel Extensions and Device Support Programming Concepts AIX Software November 1999 BULL ELECTRONICS ANGERS CEDOC 34 Rue du Nid de Pie – BP 428 49004 ANGERS CEDEX 01 FRANCE ORDER REFERENCE 86 A2 36JX 02 The following copyright notice protects this book under the Copyright laws of the United States of America and other countries which prohibit such actions as, but not limited to, copying, distributing, modifying, and making derivative works. Copyright Bull S.A. 1992, 1999 Printed in France Suggestions and criticisms concerning the form, content, and presentation of this book are invited. A form is provided at the end of this book for this purpose. To order additional copies of this book or other Bull Technical Publications, you are invited to use the Ordering Form also provided at the end of this book. Trademarks and Acknowledgements We acknowledge the right of proprietors of trademarks mentioned in this book. AIXR is a registered trademark of International Business Machines Corporation, and is being used under licence. UNIX is a registered trademark in the United States of America and other countries licensed exclusively through the Open Group. Year 2000 The product documented in this manual is Year 2000 Ready. The information in this document is subject to change without notice. Groupe Bull will not be liable for errors contained herein, or for incidental or consequential damages in connection with the use of this material. Contents Trademarks and Acknowledgements . iii 64-bit Kernel Extension Development. 23 About This Book .
    [Show full text]
  • Fundamental Data Structures Contents
    Fundamental Data Structures Contents 1 Introduction 1 1.1 Abstract data type ........................................... 1 1.1.1 Examples ........................................... 1 1.1.2 Introduction .......................................... 2 1.1.3 Defining an abstract data type ................................. 2 1.1.4 Advantages of abstract data typing .............................. 4 1.1.5 Typical operations ...................................... 4 1.1.6 Examples ........................................... 5 1.1.7 Implementation ........................................ 5 1.1.8 See also ............................................ 6 1.1.9 Notes ............................................. 6 1.1.10 References .......................................... 6 1.1.11 Further ............................................ 7 1.1.12 External links ......................................... 7 1.2 Data structure ............................................. 7 1.2.1 Overview ........................................... 7 1.2.2 Examples ........................................... 7 1.2.3 Language support ....................................... 8 1.2.4 See also ............................................ 8 1.2.5 References .......................................... 8 1.2.6 Further reading ........................................ 8 1.2.7 External links ......................................... 9 1.3 Analysis of algorithms ......................................... 9 1.3.1 Cost models ......................................... 9 1.3.2 Run-time analysis
    [Show full text]
  • Open Data Structures (In Pseudocode)
    Open Data Structures (in pseudocode) Edition 0.1Gβ Pat Morin Contents Acknowledgments ix Why This Book? xi 1 Introduction 1 1.1 The Need for Efficiency ..................... 2 1.2 Interfaces ............................. 4 1.2.1 The Queue, Stack, and Deque Interfaces . 5 1.2.2 The List Interface: Linear Sequences . 6 1.2.3 The USet Interface: Unordered Sets .......... 8 1.2.4 The SSet Interface: Sorted Sets ............. 8 1.3 Mathematical Background ................... 9 1.3.1 Exponentials and Logarithms . 10 1.3.2 Factorials ......................... 11 1.3.3 Asymptotic Notation . 12 1.3.4 Randomization and Probability . 15 1.4 The Model of Computation ................... 18 1.5 Correctness, Time Complexity, and Space Complexity . 19 1.6 Code Samples .......................... 21 1.7 List of Data Structures ..................... 23 1.8 Discussion and Exercises .................... 23 2 Array-Based Lists 31 2.1 ArrayStack: Fast Stack Operations Using an Array . 32 2.1.1 The Basics ........................ 32 2.1.2 Growing and Shrinking . 35 2.1.3 Summary ......................... 37 Contents 2.2 FastArrayStack: An Optimized ArrayStack . 37 2.3 ArrayQueue: An Array-Based Queue . 38 2.3.1 Summary ......................... 41 2.4 ArrayDeque: Fast Deque Operations Using an Array . 42 2.4.1 Summary ......................... 44 2.5 DualArrayDeque: Building a Deque from Two Stacks . 44 2.5.1 Balancing ......................... 48 2.5.2 Summary ......................... 50 2.6 RootishArrayStack: A Space-Efficient Array Stack . 50 2.6.1 Analysis of Growing and Shrinking . 55 2.6.2 Space Usage ....................... 55 2.6.3 Summary ......................... 56 2.7 Discussion and Exercises .................... 57 3 Linked Lists 61 3.1 SLList: A Singly-Linked List .
    [Show full text]
  • Handout 09: Suggested Project Topics
    CS166 Handout 09 Spring 2021 April 13, 2021 Suggested Project Topics Here is a list of data structures and families of data structures we think you might find interesting topics for your research project. You're by no means limited to what's contained here; if you have another data structure you'd like to explore, feel free to do so! My Wish List Below is a list of topics where, each quarter, I secretly think “I hope someone wants to pick this topic this quarter!” These are data structures I’ve always wanted to learn a bit more about or that I think would be particularly fun to do a deep dive into. You are not in any way, shape, or form required to pick something from this list, and we aren’t offer- ing extra credit or anything like that if you do choose to select one of these topics. However, if any of them seem interesting to you, we’d be excited to see what you come up with over the quarter. • Bentley-Saxe dynamization (turning static data structures into dynamic data structures) • Bε-trees (a B-tree variant designed to minimize writes) • Chazelle and Guibas’s O(log n + k) 3D range search (fast range searches in 3D) • Crazy good chocolate pop tarts (deamortizing binary search trees) • Durocher’s RMQ structure (fast RMQ without the Method of Four Russians) • Dynamic prefix sum lower bounds (proving lower bounds on dynamic prefix parity) • Farach’s suffix tree algorithm (a brilliant, beautiful divide-and-conquer algorithm) • Geometric greedy trees (lower bounds on BSTs giving rise to a specific BST) • Ham sandwich trees (fast searches
    [Show full text]
  • Geometric-Based Optimization Algorithms for Cable Routing and Branching in Cluttered Environments
    Clemson University TigerPrints All Dissertations Dissertations August 2020 Geometric-based Optimization Algorithms for Cable Routing and Branching in Cluttered Environments Nafiseh Masoudi Clemson University, [email protected] Follow this and additional works at: https://tigerprints.clemson.edu/all_dissertations Recommended Citation Masoudi, Nafiseh, "Geometric-based Optimization Algorithms for Cable Routing and Branching in Cluttered Environments" (2020). All Dissertations. 2702. https://tigerprints.clemson.edu/all_dissertations/2702 This Dissertation is brought to you for free and open access by the Dissertations at TigerPrints. It has been accepted for inclusion in All Dissertations by an authorized administrator of TigerPrints. For more information, please contact [email protected]. GEOMETRIC BASED OPTIMIZATION ALGORITHMS FOR CABLE ROUTING AND BRANCHING IN CLUTTERED ENVIRONMENTS A Dissertation Presented to the Graduate School of Clemson University In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy Mechanical Engineering by Nafiseh Masoudi August 2020 Accepted July 21, 2020 by: Dr. Georges M. Fadel, Committee Chair Dr. Margaret M. Wiecek Dr. Joshua D. Summers Dr. Cameron J. Turner Distinguished External Reviewer: Dr. Jonathan Cagan, Carnegie Mellon University ABSTRACT The need for designing lighter and more compact systems often leaves limited space for planning routes for the connectors that enable interactions among the system’s components. Finding optimal routes for these connectors in a densely populated environment left behind at the detail design stage has been a challenging problem for decades. A variety of deterministic as well as heuristic methods has been developed to address different instances of this problem. While the focus of the deterministic methods is primarily on the optimality of the final solution, the heuristics offer acceptable solutions, especially for such problems, in a reasonable amount of time without guaranteeing to find optimal solutions.
    [Show full text]
  • 1. Arrays an Array Is a Structure of Fixed-Size, Which Can Hold Items of the Same Data Type
    06/03/2020 8 Common Data Structures every Programmer must know 8 Common Data Structures every Programmer must know A quick introduction to 8 commonly used data structures Vijini Mallawaarachchi Feb 28 · 10 min read Data Structures are a specialized means of organizing and storing data in computers in such a way that we can perform operations on the stored data more efficiently. Data structures have a wide and diverse scope of usage across the fields of Computer Science and Software Engineering. Data structures are being used in almost every program or software system that has been developed. Moreover, data structures come under the fundamentals of Computer Science and Software Engineering. It is a key topic when it comes to Software Engineering interview questions. Hence as developers, we must have good knowledge about data structures. In this article, I will be briefly explaining 8 commonly used data structures every programmer must know. 1. Arrays An array is a structure of fixed-size, which can hold items of the same data type. It can be an array of integers, an array of floating-point numbers, an array of strings https://towardsdatascience.com/8-common-data-structures-every-programmer-must-know-171acf6a1a42 1/13 06/03/2020 8 Common Data Structures every Programmer must know or even an array of arrays (such as 2-dimensional arrays). Arrays are indexed, meaning that random access is possible. Fig 1. Visualization of basic Terminology of Arrays Array operations Traverse: Go through the elements and print them. Search: Search for an element in the array. You can search the element by its value or its index Update: Update the value of an existing element at a given index Inserting elements to an array and deleting elements from an array cannot be done straight away as arrays are fixed in size.
    [Show full text]
  • Open Data Structures (In C++)
    Open Data Structures (in C++) Edition 0.1Gβ Pat Morin Contents Acknowledgments ix Why This Book? xi Preface to the C++ Edition xiii 1 Introduction1 1.1 The Need for Efficiency.....................2 1.2 Interfaces.............................4 1.2.1 The Queue, Stack, and Deque Interfaces.......5 1.2.2 The List Interface: Linear Sequences.........6 1.2.3 The USet Interface: Unordered Sets..........8 1.2.4 The SSet Interface: Sorted Sets............8 1.3 Mathematical Background...................9 1.3.1 Exponentials and Logarithms............. 10 1.3.2 Factorials......................... 11 1.3.3 Asymptotic Notation.................. 12 1.3.4 Randomization and Probability............ 15 1.4 The Model of Computation................... 18 1.5 Correctness, Time Complexity, and Space Complexity... 19 1.6 Code Samples.......................... 22 1.7 List of Data Structures..................... 22 1.8 Discussion and Exercises.................... 25 2 Array-Based Lists 29 2.1 ArrayStack: Fast Stack Operations Using an Array..... 31 2.1.1 The Basics........................ 31 Contents 2.1.2 Growing and Shrinking................. 34 2.1.3 Summary......................... 36 2.2 FastArrayStack: An Optimized ArrayStack......... 36 2.3 ArrayQueue: An Array-Based Queue............. 37 2.3.1 Summary......................... 41 2.4 ArrayDeque: Fast Deque Operations Using an Array.... 41 2.4.1 Summary......................... 43 2.5 DualArrayDeque: Building a Deque from Two Stacks.... 44 2.5.1 Balancing......................... 47 2.5.2 Summary......................... 49 2.6 RootishArrayStack: A Space-Efficient Array Stack..... 50 2.6.1 Analysis of Growing and Shrinking.......... 54 2.6.2 Space Usage....................... 55 2.6.3 Summary......................... 56 2.6.4 Computing Square Roots...............
    [Show full text]
  • Midterm 2 2/27/06
    CSE 373 Midterm 2 2/27/06 Name ________________________________ Do not write your id number or any other confidential information on this page. There are 8 questions worth a total of 60 points. Please budget your time so you get to all of the questions, particularly some of the later ones that are worth more points than some of the earlier ones. Keep your answers brief and to the point. The exam is closed book. No calculators, laptops, cell phones, paging devices, PDAs, iPods, BlackBerrrys, time-travel machines, or other devices are allowed (or needed). Please wait to turn the page until everyone is told to begin. Page 1 of 9 CSE 373 Midterm 2 2/27/06 Score _________________ / 60 1. ______ / 6 2. ______ / 3 3. ______ / 6 4. ______ / 6 5. ______ / 6 6. ______ / 12 7. ______ / 12 8. ______ / 9 Page 2 of 9 CSE 373 Midterm 2 2/27/06 Question 1. (6 points) (a) What is the load factor of a hash table? (Give a definition.) (b) What is a reasonable value for the load factor of a hash table if the operations on the table are to be efficient (i.e., O(1) instead of something significantly slower)? (c) What needs to be true about a hash function if operations using that function to locate items in a hash table are to be efficient? Question 2. (3 points) You have been asked to design a hash function for a set data structure (an unordered collection with no duplicates). There are two possibilities.
    [Show full text]
  • Suggested Final Project Topics
    CS166 Handout 10 Spring 2019 April 25, 2019 Suggested Final Project Topics Here is a list of data structures and families of data structures we think you might find interesting topics for a final project. You're by no means limited to what's contained here; if you have another data structure you'd like to explore, feel free to do so! If You Liked Range Minimum Queries, Check Out… Range Semigroup Queries In the range minimum query problem, we wanted to preprocess an array so that we could quickly find the minimum element in that range. Imagine that instead of computing the minimum of the value in the range, we instead want to compute A[i] ★ A[i+1] ★ … ★ A[j] for some associative op- eration ★. If we know nothing about ★ other than the fact that it's associative, how would we go about solving this problem efficiently? Turns out there are some very clever solutions whose run- times involve the magical Ackermann inverse function. Why they're worth studying: If you really enjoyed the RMQ coverage from earlier in the quarter, this might be a great way to look back at those topics from a different perspective. You'll get a much more nuanced understanding of why our solutions work so quickly and how to adapt those tech- niques into novel settings. Lowest Common Ancestor and Level Ancestor Queries Range minimum queries can be used to solve the lowest common ancestors problem: given a tree, preprocess the tree so that queries of the form “what node in the tree is as deep as possible and has nodes u and v as descendants?” LCA queries have a ton of applications in suffix trees and other al- gorithmic domains, and there’s a beautiful connection between LCA and RMQ that gave rise to the first ⟨O(n), O(1)⟩ solution to both LCA and RMQ.
    [Show full text]
  • Computing Distance Histograms Efficiently in Scientific Databases
    IEEE International Conference on Data Engineering Computing Distance Histograms Efficiently in Scientific Databases Yi-Cheng Tu 1, Shaoping Chen 2, and Sagar Pandit 3 1Department of Computer Science and Engineering, University of South Florida 4202 E. Fowler Ave., ENB118, Tampa, FL 33620, U.S.A. [email protected] 2Department of Mathematics, Wuhan University of Technology 122 Luosi Road, Wuhan, Hubei, 430070, P. R. China [email protected] 3Department of Physics, University of South Florida 4202 E. Fowler Ave., PHY114, Tampa, FL 33620, U.S.A. [email protected] Abstract— Particle simulation has become an important re- However, many of such basic analytical queries require super- search tool in many scientific and engineering fields. Data gen- linear processing time if handled in a straightforward way, erated by such simulations impose great challenges to database as they are in current scientific databases. In this paper, we storage and query processing. One of the queries against particle simulation data, the spatial distance histogram (SDH) query, is report our efforts to design efficient algorithms for a type of the building block of many high-level analytics, and requires query that is extremely important in the analysis of particle quadratic time to compute using a straightforward algorithm. simulation data. In this paper, we propose a novel algorithm to compute SDH Particle simulations are computer simulations in which the based on a data structure called density map, which can be easily basic components of large systems (e.g., atoms, molecules, implemented by augmenting a Quad-tree index. We also show the results of rigorous mathematical analysis of the time complexity stars, galaxies ...) are treated as classical entities (i.e., particles) 3 of the proposed algorithm: our algorithm runs on Θ(N 2 ) for that interact for certain duration under postulated empirical 5 two-dimensional data and Θ(N 3 ) for three-dimensional data, forces.
    [Show full text]