Cache-Friendly Search Trees; Or, in Which Everything Beats Std::Set
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Binary Search Tree
ADT Binary Search Tree! Ellen Walker! CPSC 201 Data Structures! Hiram College! Binary Search Tree! •" Value-based storage of information! –" Data is stored in order! –" Data can be retrieved by value efficiently! •" Is a binary tree! –" Everything in left subtree is < root! –" Everything in right subtree is >root! –" Both left and right subtrees are also BST#s! Operations on BST! •" Some can be inherited from binary tree! –" Constructor (for empty tree)! –" Inorder, Preorder, and Postorder traversal! •" Some must be defined ! –" Insert item! –" Delete item! –" Retrieve item! The Node<E> Class! •" Just as for a linked list, a node consists of a data part and links to successor nodes! •" The data part is a reference to type E! •" A binary tree node must have links to both its left and right subtrees! The BinaryTree<E> Class! The BinaryTree<E> Class (continued)! Overview of a Binary Search Tree! •" Binary search tree definition! –" A set of nodes T is a binary search tree if either of the following is true! •" T is empty! •" Its root has two subtrees such that each is a binary search tree and the value in the root is greater than all values of the left subtree but less than all values in the right subtree! Overview of a Binary Search Tree (continued)! Searching a Binary Tree! Class TreeSet and Interface Search Tree! BinarySearchTree Class! BST Algorithms! •" Search! •" Insert! •" Delete! •" Print values in order! –" We already know this, it#s inorder traversal! –" That#s why it#s called “in order”! Searching the Binary Tree! •" If the tree is -
Persistent Predecessor Search and Orthogonal Point Location on the Word
Persistent Predecessor Search and Orthogonal Point Location on the Word RAM∗ Timothy M. Chan† August 17, 2012 Abstract We answer a basic data structuring question (for example, raised by Dietz and Raman [1991]): can van Emde Boas trees be made persistent, without changing their asymptotic query/update time? We present a (partially) persistent data structure that supports predecessor search in a set of integers in 1,...,U under an arbitrary sequence of n insertions and deletions, with O(log log U) { } expected query time and expected amortized update time, and O(n) space. The query bound is optimal in U for linear-space structures and improves previous near-O((log log U)2) methods. The same method solves a fundamental problem from computational geometry: point location in orthogonal planar subdivisions (where edges are vertical or horizontal). We obtain the first static data structure achieving O(log log U) worst-case query time and linear space. This result is again optimal in U for linear-space structures and improves the previous O((log log U)2) method by de Berg, Snoeyink, and van Kreveld [1995]. The same result also holds for higher-dimensional subdivisions that are orthogonal binary space partitions, and for certain nonorthogonal planar subdivisions such as triangulations without small angles. Many geometric applications follow, including improved query times for orthogonal range reporting for dimensions 3 on the RAM. ≥ Our key technique is an interesting new van-Emde-Boas–style recursion that alternates be- tween two strategies, both quite simple. 1 Introduction Van Emde Boas trees [60, 61, 62] are fundamental data structures that support predecessor searches in O(log log U) time on the word RAM with O(n) space, when the n elements of the given set S come from a bounded integer universe 1,...,U (U n). -
Binary Search Trees (BST)
CSCI-UA 102 Joanna Klukowska Lecture 7: Binary Search Trees [email protected] Lecture 7: Binary Search Trees (BST) Reading materials Goodrich, Tamassia, Goldwasser: Chapter 8 OpenDSA: Chapter 18 and 12 Binary search trees visualizations: http://visualgo.net/bst.html http://www.cs.usfca.edu/~galles/visualization/BST.html Topics Covered 1 Binary Search Trees (BST) 2 2 Binary Search Tree Node 3 3 Searching for an Item in a Binary Search Tree3 3.1 Binary Search Algorithm..............................................4 3.1.1 Recursive Approach............................................4 3.1.2 Iterative Approach.............................................4 3.2 contains() Method for a Binary Search Tree..................................5 4 Inserting a Node into a Binary Search Tree5 4.1 Order of Insertions.................................................6 5 Removing a Node from a Binary Search Tree6 5.1 Removing a Leaf Node...............................................7 5.2 Removing a Node with One Child.........................................7 5.3 Removing a Node with Two Children.......................................8 5.4 Recursive Implementation.............................................9 6 Balanced Binary Search Trees 10 6.1 Adding a balance() Method........................................... 10 7 Self-Balancing Binary Search Trees 11 7.1 AVL Tree...................................................... 11 7.1.1 When balancing is necessary........................................ 12 7.2 Red-Black Trees.................................................. 15 1 CSCI-UA 102 Joanna Klukowska Lecture 7: Binary Search Trees [email protected] 1 Binary Search Trees (BST) A binary search tree is a binary tree with additional properties: • the value stored in a node is greater than or equal to the value stored in its left child and all its descendants (or left subtree), and • the value stored in a node is smaller than the value stored in its right child and all its descendants (or its right subtree). -
A Set Intersection Algorithm Via X-Fast Trie
Journal of Computers A Set Intersection Algorithm Via x-Fast Trie Bangyu Ye* National Engineering Laboratory for Information Security Technologies, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100091, China. * Corresponding author. Tel.:+86-10-82546714; email: [email protected] Manuscript submitted February 9, 2015; accepted May 10, 2015. doi: 10.17706/jcp.11.2.91-98 Abstract: This paper proposes a simple intersection algorithm for two sorted integer sequences . Our algorithm is designed based on x-fast trie since it provides efficient find and successor operators. We present that our algorithm outperforms skip list based algorithm when one of the sets to be intersected is relatively ‘dense’ while the other one is (relatively) ‘sparse’. Finally, we propose some possible approaches which may optimize our algorithm further. Key words: Set intersection, algorithm, x-fast trie. 1. Introduction Fast set intersection is a key operation in the context of several fields when it comes to big data era [1], [2]. For example, modern search engines use the set intersection for inverted posting list which is a standard data structure in information retrieval to return relevant documents. So it has been studied in many domains and fields [3]-[8]. One of the most typical situations is Boolean query which is required to retrieval the documents that contains all the terms in the query. Besides, set intersection is naturally a key component of calculating the Jaccard coefficient of two sets, which is defined by the fraction of size of intersection and union. In this paper, we propose a new algorithm via x-fast trie which is a kind of advanced data structure. -
X-Fast and Y-Fast Tries
x-Fast and y-Fast Tries Outline for Today ● Bitwise Tries ● A simple ordered dictionary for integers. ● x-Fast Tries ● Tries + Hashing ● y-Fast Tries ● Tries + Hashing + Subdivision + Balanced Trees + Amortization Recap from Last Time Ordered Dictionaries ● An ordered dictionary is a data structure that maintains a set S of elements drawn from an ordered universe � and supports these operations: ● insert(x), which adds x to S. ● is-empty(), which returns whether S = Ø. ● lookup(x), which returns whether x ∈ S. ● delete(x), which removes x from S. ● max() / min(), which returns the maximum or minimum element of S. ● successor(x), which returns the smallest element of S greater than x, and ● predecessor(x), which returns the largest element of S smaller than x. Integer Ordered Dictionaries ● Suppose that � = [U] = {0, 1, …, U – 1}. ● A van Emde Boas tree is an ordered dictionary for [U] where ● min, max, and is-empty run in time O(1). ● All other operations run in time O(log log U). ● Space usage is Θ(U) if implemented deterministically, and O(n) if implemented using hash tables. ● Question: Is there a simpler data structure meeting these bounds? The Machine Model ● We assume a transdichotomous machine model: ● Memory is composed of words of w bits each. ● Basic arithmetic and bitwise operations on words take time O(1) each. ● w = Ω(log n). A Start: Bitwise Tries Tries Revisited ● Recall: A trie is a simple data 0 1 structure for storing strings. 0 1 0 1 ● Integers can be thought of as strings of bits. 1 0 1 1 0 ● Idea: Store integers in a bitwise trie. -
Objectives Objectives Searching a Simple Searching Problem A
Python Programming: Objectives An Introduction to To understand the basic techniques for Computer Science analyzing the efficiency of algorithms. To know what searching is and understand the algorithms for linear and binary Chapter 13 search. Algorithm Design and Recursion To understand the basic principles of recursive definitions and functions and be able to write simple recursive functions. Python Programming, 1/e 1 Python Programming, 1/e 2 Objectives Searching To understand sorting in depth and Searching is the process of looking for a know the algorithms for selection sort particular value in a collection. and merge sort. For example, a program that maintains To appreciate how the analysis of a membership list for a club might need algorithms can demonstrate that some to look up information for a particular problems are intractable and others are member œ this involves some sort of unsolvable. search process. Python Programming, 1/e 3 Python Programming, 1/e 4 A simple Searching Problem A Simple Searching Problem Here is the specification of a simple In the first example, the function searching function: returns the index where 4 appears in def search(x, nums): the list. # nums is a list of numbers and x is a number # Returns the position in the list where x occurs # or -1 if x is not in the list. In the second example, the return value -1 indicates that 7 is not in the list. Here are some sample interactions: >>> search(4, [3, 1, 4, 2, 5]) 2 Python includes a number of built-in >>> search(7, [3, 1, 4, 2, 5]) -1 search-related methods! Python Programming, 1/e 5 Python Programming, 1/e 6 Python Programming, 1/e 1 A Simple Searching Problem A Simple Searching Problem We can test to see if a value appears in The only difference between our a sequence using in. -
Prefix Hash Tree an Indexing Data Structure Over Distributed Hash
Prefix Hash Tree An Indexing Data Structure over Distributed Hash Tables Sriram Ramabhadran ∗ Sylvia Ratnasamy University of California, San Diego Intel Research, Berkeley Joseph M. Hellerstein Scott Shenker University of California, Berkeley International Comp. Science Institute, Berkeley and and Intel Research, Berkeley University of California, Berkeley ABSTRACT this lookup interface has allowed a wide variety of Distributed Hash Tables are scalable, robust, and system to be built on top DHTs, including file sys- self-organizing peer-to-peer systems that support tems [9, 27], indirection services [30], event notifi- exact match lookups. This paper describes the de- cation [6], content distribution networks [10] and sign and implementation of a Prefix Hash Tree - many others. a distributed data structure that enables more so- phisticated queries over a DHT. The Prefix Hash DHTs were designed in the Internet style: scala- Tree uses the lookup interface of a DHT to con- bility and ease of deployment triumph over strict struct a trie-based structure that is both efficient semantics. In particular, DHTs are self-organizing, (updates are doubly logarithmic in the size of the requiring no centralized authority or manual con- domain being indexed), and resilient (the failure figuration. They are robust against node failures of any given node in the Prefix Hash Tree does and easily accommodate new nodes. Most impor- not affect the availability of data stored at other tantly, they are scalable in the sense that both la- nodes). tency (in terms of the number of hops per lookup) and the local state required typically grow loga- Categories and Subject Descriptors rithmically in the number of nodes; this is crucial since many of the envisioned scenarios for DHTs C.2.4 [Comp. -
Chapter 13 Sorting & Searching
13-1 Java Au Naturel by William C. Jones 13-1 13 Sorting and Searching Overview This chapter discusses several standard algorithms for sorting, i.e., putting a number of values in order. It also discusses the binary search algorithm for finding a particular value quickly in an array of sorted values. The algorithms described here can be useful in various situations. They should also help you become more comfortable with logic involving arrays. These methods would go in a utilities class of methods for Comparable objects, such as the CompOp class of Listing 7.2. For this chapter you need a solid understanding of arrays (Chapter Seven). · Sections 13.1-13.2 discuss two basic elementary algorithms for sorting, the SelectionSort and the InsertionSort. · Section 13.3 presents the binary search algorithm and big-oh analysis, which provides a way of comparing the speed of two algorithms. · Sections 13.4-13.5 introduce two recursive algorithms for sorting, the QuickSort and the MergeSort, which execute much faster than the elementary algorithms when you have more than a few hundred values to sort. · Sections 13.6 goes further with big-oh analysis. · Section 13.7 presents several additional sorting algorithms -- the bucket sort, the radix sort, and the shell sort. 13.1 The SelectionSort Algorithm For Comparable Objects When you have hundreds of Comparable values stored in an array, you will often find it useful to keep them in sorted order from lowest to highest, which is ascending order. To be precise, ascending order means that there is no case in which one element is larger than the one after it -- if y is listed after x, then x.CompareTo(y) <= 0. -
P4 Sec 4.1.1, 4.1.2, 4.1.3) Abstraction, Algorithms and ADT's
Computer Science 9608 P4 Sec 4.1.1, 4.1.2, 4.1.3) Abstraction, Algorithms and ADT’s with Majid Tahir Syllabus Content: 4.1.1 Abstraction show understanding of how to model a complex system by only including essential details, using: functions and procedures with suitable parameters (as in procedural programming, see Section 2.3) ADTs (see Section 4.1.3) classes (as used in object-oriented programming, see Section 4.3.1) facts, rules (as in declarative programming, see Section 4.3.1) 4.1.2 Algorithms write a binary search algorithm to solve a particular problem show understanding of the conditions necessary for the use of a binary search show understanding of how the performance of a binary search varies according to the number of data items write an algorithm to implement an insertion sort write an algorithm to implement a bubble sort show understanding that performance of a sort routine may depend on the initial order of the data and the number of data items write algorithms to find an item in each of the following: linked list, binary tree, hash table write algorithms to insert an item into each of the following: stack, queue, linked list, binary tree, hash table write algorithms to delete an item from each of the following: stack, queue, linked list show understanding that different algorithms which perform the same task can be compared by using criteria such as time taken to complete the task and memory used 4.1.3 Abstract Data Types (ADT) show understanding that an ADT is a collection of data and a set of operations on those data -
Binary Search Algorithm This Article Is About Searching a Finite Sorted Array
Binary search algorithm This article is about searching a finite sorted array. For searching continuous function values, see bisection method. This article may require cleanup to meet Wikipedia's quality standards. No cleanup reason has been specified. Please help improve this article if you can. (April 2011) Binary search algorithm Class Search algorithm Data structure Array Worst case performance O (log n ) Best case performance O (1) Average case performance O (log n ) Worst case space complexity O (1) In computer science, a binary search or half-interval search algorithm finds the position of a specified input value (the search "key") within an array sorted by key value.[1][2] For binary search, the array should be arranged in ascending or descending order. In each step, the algorithm compares the search key value with the key value of the middle element of the array. If the keys match, then a matching element has been found and its index, or position, is returned. Otherwise, if the search key is less than the middle element's key, then the algorithm repeats its action on the sub-array to the left of the middle element or, if the search key is greater, on the sub-array to the right. If the remaining array to be searched is empty, then the key cannot be found in the array and a special "not found" indication is returned. A binary search halves the number of items to check with each iteration, so locating an item (or determining its absence) takes logarithmic time. A binary search is a dichotomic divide and conquer search algorithm. -
Lock-Free Concurrent Van Emde Boas Array
Lock-free concurrent van Emde Boas Array Ziyuan Guo Reiji Suda (Adviser) [email protected] [email protected] The University of Tokyo Graduate School of Information Science and Technology Tokyo, Japan ABSTRACT To unleash the power of modern multi-core processors, concur- Lock-based data structures have some potential issues such as dead- rent data structures are needed in many cases. Comparing to lock- lock, livelock, and priority inversion, and the progress can be de- based data structures, lock-free data structures can avoid some po- layed indefinitely if the thread that is holding locks cannot acquire tential issues such as deadlock, livelock, and priority inversion, and a timeslice from the scheduler. Lock-free data structures, which guarantees the progress [5]. guarantees the progress of some method call, can be used to avoid these problems. This poster introduces the first lock-free concur- 2 OBJECTIVES rent van Emde Boas Array which is a variant of van Emde Boas • Create a lock-free search tree based on van Emde Boas Tree. Tree. It’s linearizable and the benchmark shows significant per- • Get better performance and scalability. formance improvement comparing to other lock-free search trees • Linearizability, i.e., each method call should appear to take when the date set is large and dense enough. effect at some instant between its invocation and response [5]. CCS CONCEPTS • Computing methodologies → Concurrent algorithms. 3 RELATED WORK The base of our data structure is van Emde Boas Tree, whichis KEYWORDS named after P. van Emde Boas, the inventor of this data structure [7]. -
Binary Search Algorithm
Binary search algorithm Definition Search a sorted array by repeatedly dividing the search interval in half. Begin with an interval covering the whole array. If the value of the search key is less than the item in the middle of the interval, narrow the interval to the lower half. Otherwise narrow it to the upper half. Repeatedly check until the value is found or the interval is empty. Generally, to find a value in unsorted array, we should look through elements of an array one by one, until searched value is found. In case of searched value is absent from array, we go through all elements. In average, complexity of such an algorithm is proportional to the length of the array. Divide in half A fast way to search a sorted array is to use a binary search. The idea is to look at the element in the middle. If the key is equal to that, the search is finished. If the key is less than the middle element, do a binary search on the first half. If it's greater, do a binary search of the second half. Performance The advantage of a binary search over a linear search is astounding for large numbers. For an array of a million elements, binary search, O(log N), will find the target element with a worst case of only 20 comparisons. Linear search, O(N), on average will take 500,000 comparisons to find the element. Algorithm Algorithm is quite simple. It can be done either recursively or iteratively: 1. get the middle element; 2.