On the Implementation of Purely Functional Data Structures for the Linearisation Case of Dynamic Trees
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Cache Oblivious Search Trees Via Binary Trees of Small Height
Cache Oblivious Search Trees via Binary Trees of Small Height Gerth Stølting Brodal∗ Rolf Fagerberg∗ Riko Jacob∗ Abstract likely to increase in the future [13, pp. 7 and 429]. We propose a version of cache oblivious search trees As a consequence, the memory access pattern of an which is simpler than the previous proposal of Bender, algorithm has become a key component in determining Demaine and Farach-Colton and has the same complex- its running time in practice. Since classic asymptotical ity bounds. In particular, our data structure avoids the analysis of algorithms in the RAM model is unable to use of weight balanced B-trees, and can be implemented capture this, a number of more elaborate models for as just a single array of data elements, without the use of analysis have been proposed. The most widely used of pointers. The structure also improves space utilization. these is the I/O model of Aggarwal and Vitter [1], which For storing n elements, our proposal uses (1 + ε)n assumes a memory hierarchy containing two levels, the times the element size of memory, and performs searches lower level having size M and the transfer between the in worst case O(logB n) memory transfers, updates two levels taking place in blocks of B elements. The in amortized O((log2 n)/(εB)) memory transfers, and cost of the computation in the I/O model is the number range queries in worst case O(logB n + k/B) memory of blocks transferred. This model is adequate when transfers, where k is the size of the output. -
Lecture 04 Linear Structures Sort
Algorithmics (6EAP) MTAT.03.238 Linear structures, sorting, searching, etc Jaak Vilo 2018 Fall Jaak Vilo 1 Big-Oh notation classes Class Informal Intuition Analogy f(n) ∈ ο ( g(n) ) f is dominated by g Strictly below < f(n) ∈ O( g(n) ) Bounded from above Upper bound ≤ f(n) ∈ Θ( g(n) ) Bounded from “equal to” = above and below f(n) ∈ Ω( g(n) ) Bounded from below Lower bound ≥ f(n) ∈ ω( g(n) ) f dominates g Strictly above > Conclusions • Algorithm complexity deals with the behavior in the long-term – worst case -- typical – average case -- quite hard – best case -- bogus, cheating • In practice, long-term sometimes not necessary – E.g. for sorting 20 elements, you dont need fancy algorithms… Linear, sequential, ordered, list … Memory, disk, tape etc – is an ordered sequentially addressed media. Physical ordered list ~ array • Memory /address/ – Garbage collection • Files (character/byte list/lines in text file,…) • Disk – Disk fragmentation Linear data structures: Arrays • Array • Hashed array tree • Bidirectional map • Heightmap • Bit array • Lookup table • Bit field • Matrix • Bitboard • Parallel array • Bitmap • Sorted array • Circular buffer • Sparse array • Control table • Sparse matrix • Image • Iliffe vector • Dynamic array • Variable-length array • Gap buffer Linear data structures: Lists • Doubly linked list • Array list • Xor linked list • Linked list • Zipper • Self-organizing list • Doubly connected edge • Skip list list • Unrolled linked list • Difference list • VList Lists: Array 0 1 size MAX_SIZE-1 3 6 7 5 2 L = int[MAX_SIZE] -
Advanced Data Structures
Advanced Data Structures PETER BRASS City College of New York CAMBRIDGE UNIVERSITY PRESS Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo Cambridge University Press The Edinburgh Building, Cambridge CB2 8RU, UK Published in the United States of America by Cambridge University Press, New York www.cambridge.org Information on this title: www.cambridge.org/9780521880374 © Peter Brass 2008 This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published in print format 2008 ISBN-13 978-0-511-43685-7 eBook (EBL) ISBN-13 978-0-521-88037-4 hardback Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate. Contents Preface page xi 1 Elementary Structures 1 1.1 Stack 1 1.2 Queue 8 1.3 Double-Ended Queue 16 1.4 Dynamical Allocation of Nodes 16 1.5 Shadow Copies of Array-Based Structures 18 2 Search Trees 23 2.1 Two Models of Search Trees 23 2.2 General Properties and Transformations 26 2.3 Height of a Search Tree 29 2.4 Basic Find, Insert, and Delete 31 2.5ReturningfromLeaftoRoot35 2.6 Dealing with Nonunique Keys 37 2.7 Queries for the Keys in an Interval 38 2.8 Building Optimal Search Trees 40 2.9 Converting Trees into Lists 47 2.10 -
[Type the Document Title]
International Journal of Engineering Science Invention (IJESI) ISSN (Online): 2319 – 6734, ISSN (Print): 2319 – 6726 www.ijesi.org || PP. 31-35 Parallel and Nearest Neighbor Search for High-Dimensional Index Structure of Content-Based Data Using Dva-Tree A.RosiyaSusaiMary1, Dr.R.Suguna 2 1(Department of Computer Science, Theivanai Ammal College for Women, India) 2(Department of Computer Science, Theivanai Ammal College for Women, India) Abstract: We propose a parallel high-dimensional index structure for content-based information retrieval so as to cope with the linear decrease in retrieval performance. In addition, we devise data insertion, range query and k-NN query processing algorithms which are suitable for a cluster-based parallel architecture. Finally, we show that our parallel index structure achieves good retrieval performance in proportion to the number of servers in the cluster-based architecture and it outperforms a parallel version of the VA-File. To address the demanding search needs caused by large-scale image collections, speeding up the search by using distributed index structures, and a Vector Approximation-file (DVA-file) in parallel. Keywords: Distributed Indexing Structure, High Dimensionality, KNN- Search I. Introduction The need to manage various types of large scale data stored in web environments has drastically increased and resulted in the development of index mechanism for high dimensional feature vector data about such a kinds of multimedia data. Recent search engine for the multimedia data in web location may collect billions of images, text and video data, which makes the performance bottleneck to get a suitable web documents and contents. Given large image and video data collections, a basic problem is to find objects that cover given information need. -
Search Trees
Lecture III Page 1 “Trees are the earth’s endless effort to speak to the listening heaven.” – Rabindranath Tagore, Fireflies, 1928 Alice was walking beside the White Knight in Looking Glass Land. ”You are sad.” the Knight said in an anxious tone: ”let me sing you a song to comfort you.” ”Is it very long?” Alice asked, for she had heard a good deal of poetry that day. ”It’s long.” said the Knight, ”but it’s very, very beautiful. Everybody that hears me sing it - either it brings tears to their eyes, or else -” ”Or else what?” said Alice, for the Knight had made a sudden pause. ”Or else it doesn’t, you know. The name of the song is called ’Haddocks’ Eyes.’” ”Oh, that’s the name of the song, is it?” Alice said, trying to feel interested. ”No, you don’t understand,” the Knight said, looking a little vexed. ”That’s what the name is called. The name really is ’The Aged, Aged Man.’” ”Then I ought to have said ’That’s what the song is called’?” Alice corrected herself. ”No you oughtn’t: that’s another thing. The song is called ’Ways and Means’ but that’s only what it’s called, you know!” ”Well, what is the song then?” said Alice, who was by this time completely bewildered. ”I was coming to that,” the Knight said. ”The song really is ’A-sitting On a Gate’: and the tune’s my own invention.” So saying, he stopped his horse and let the reins fall on its neck: then slowly beating time with one hand, and with a faint smile lighting up his gentle, foolish face, he began.. -
The Random Access Zipper: Simple, Purely-Functional Sequences
The Random Access Zipper: Simple, Purely-Functional Sequences Kyle Headley and Matthew A. Hammer University of Colorado Boulder [email protected], [email protected] Abstract. We introduce the Random Access Zipper (RAZ), a simple, purely-functional data structure for editable sequences. The RAZ com- bines the structure of a zipper with that of a tree: like a zipper, edits at the cursor require constant time; by leveraging tree structure, relocating the edit cursor in the sequence requires log time. While existing data structures provide these time bounds, none do so with the same simplic- ity and brevity of code as the RAZ. The simplicity of the RAZ provides the opportunity for more programmers to extend the structure to their own needs, and we provide some suggestions for how to do so. 1 Introduction The singly-linked list is the most common representation of sequences for func- tional programmers. This structure is considered a core primitive in every func- tional language, and morever, the principles of its simple design recur througout user-defined structures that are \sequence-like". Though simple and ubiquitous, the functional list has a serious shortcoming: users may only efficiently access and edit the head of the list. In particular, random accesses (or edits) generally require linear time. To overcome this problem, researchers have developed other data structures representing (functional) sequences, most notably, finger trees [8]. These struc- tures perform well, allowing edits in (amortized) constant time and moving the edit location in logarithmic time. More recently, researchers have proposed the RRB-Vector [14], offering a balanced tree representation for immutable vec- tors. -
Design of Data Structures for Mergeable Trees∗
Design of Data Structures for Mergeable Trees∗ Loukas Georgiadis1; 2 Robert E. Tarjan1; 3 Renato F. Werneck1 Abstract delete(v): Delete leaf v from the forest. • Motivated by an application in computational topology, merge(v; w): Given nodes v and w, let P be the path we consider a novel variant of the problem of efficiently • from v to the root of its tree, and let Q be the path maintaining dynamic rooted trees. This variant allows an from w to the root of its tree. Restructure the tree operation that merges two tree paths. In contrast to the or trees containing v and w by merging the paths P standard problem, in which only one tree arc at a time and Q in a way that preserves the heap order. The changes, a single merge operation can change many arcs. merge order is unique if all node labels are distinct, In spite of this, we develop a data structure that supports which we can assume without loss of generality: if 2 merges and all other standard tree operations in O(log n) necessary, we can break ties by node identifier. See amortized time on an n-node forest. For the special case Figure 1. that occurs in the motivating application, in which arbitrary arc deletions are not allowed, we give a data structure with 1 an O(log n) amortized time bound per operation, which is merge(6,11) asymptotically optimal. The analysis of both algorithms is 3 2 not straightforward and requires ideas not previously used in 6 7 4 5 the study of dynamic trees. -
Parallel and Nearest Neighbor Search for High-Dimensional Index Structure of Content- Based Data Using DVA-Tree R
International Journal of Scientific Research and Review ISSN NO: 2279-543X Parallel and Nearest Neighbor search for high-dimensional index structure of content- based data using DVA-tree R. ROSELINE1, Z.JOHN BERNARD2, V. SUGANTHI3 1’2Assistant Professor PG & Research Department of Computer Applications, St Joseph’s College Of Arts and Science, Cuddalore. 2 Research Scholar, PG & Research Department of Computer Science, St Joseph’s College Of Arts and Science, Cuddalore. Abstract We propose a parallel high-dimensional index structure for content-based information retrieval so as to cope with the linear decrease in retrieval performance. In addition, we devise data insertion, range query and k-NN query processing algorithms which are suitable for a cluster-based parallel architecture. Finally, we show that our parallel index structure achieves good retrieval performance in proportion to the number of servers in the cluster-based architecture and it outperforms a parallel version of the VA-File. To address the demanding search needs caused by large-scale image collections, speeding up the search by using distributed index structures, and a Vector Approximation-file (DVA-file) in parallel Keywords: KNN- search, distributed indexing structure, high Dimensionality 1. Introduction The need to manage various types of large scale data stored in web environments has drastically increased and resulted in the development of index mechanism for high dimensional feature vector data about such a kinds of multimedia data. Recent search engine for the multimedia data in web location may collect billions of images, text and video data, which makes the performance bottleneck to get a suitable web documents and contents. -
Top Tree Compression of Tries∗
Top Tree Compression of Tries∗ Philip Bille Pawe lGawrychowski Inge Li Gørtz [email protected] [email protected] [email protected] Gad M. Landau Oren Weimann [email protected] [email protected] Abstract We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to preprocess a set of strings of total length n over an alphabet of size σ into a compressed data structure of worst-case optimal size O(n= logσ n) that given a pattern string P of length m determines if P is a prefix of one of the strings in time O(min(m log σ; m + log n)). We show that this query time is in fact optimal regardless of the size of the data structure. Existing solutions either use Ω(n) space or rely on word RAM techniques, such as tabulation, hashing, address arithmetic, or word-level parallelism, and hence do not work on a pointer machine. Our result is the first solution on a pointer machine that achieves worst-case o(n) space. Along the way, we develop several interesting data structures that work on a pointer machine and are of independent interest. These include an optimal data structures for random access to a grammar- compressed string and an optimal data structure for a variant of the level ancestor problem. 1 Introduction A string dictionary compactly represents a set of strings S = S1;:::;Sk to support efficient prefix queries, that is, given a pattern string P determine if P is a prefix of some string in S. -
Bulk Updates and Cache Sensitivity in Search Trees
BULK UPDATES AND CACHE SENSITIVITY INSEARCHTREES TKK Research Reports in Computer Science and Engineering A Espoo 2009 TKK-CSE-A2/09 BULK UPDATES AND CACHE SENSITIVITY INSEARCHTREES Doctoral Dissertation Riku Saikkonen Dissertation for the degree of Doctor of Science in Technology to be presented with due permission of the Faculty of Information and Natural Sciences, Helsinki University of Technology for public examination and debate in Auditorium T2 at Helsinki University of Technology (Espoo, Finland) on the 4th of September, 2009, at 12 noon. Helsinki University of Technology Faculty of Information and Natural Sciences Department of Computer Science and Engineering Teknillinen korkeakoulu Informaatio- ja luonnontieteiden tiedekunta Tietotekniikan laitos Distribution: Helsinki University of Technology Faculty of Information and Natural Sciences Department of Computer Science and Engineering P.O. Box 5400 FI-02015 TKK FINLAND URL: http://www.cse.tkk.fi/ Tel. +358 9 451 3228 Fax. +358 9 451 3293 E-mail: [email protected].fi °c 2009 Riku Saikkonen Layout: Riku Saikkonen (except cover) Cover image: Riku Saikkonen Set in the Computer Modern typeface family designed by Donald E. Knuth. ISBN 978-952-248-037-8 ISBN 978-952-248-038-5 (PDF) ISSN 1797-6928 ISSN 1797-6936 (PDF) URL: http://lib.tkk.fi/Diss/2009/isbn9789522480385/ Multiprint Oy Espoo 2009 AB ABSTRACT OF DOCTORAL DISSERTATION HELSINKI UNIVERSITY OF TECHNOLOGY P.O. Box 1000, FI-02015 TKK http://www.tkk.fi/ Author Riku Saikkonen Name of the dissertation Bulk Updates and Cache Sensitivity in Search Trees Manuscript submitted 09.04.2009 Manuscript revised 15.08.2009 Date of the defence 04.09.2009 £ Monograph ¤ Article dissertation (summary + original articles) Faculty Faculty of Information and Natural Sciences Department Department of Computer Science and Engineering Field of research Software Systems Opponent(s) Prof. -
List of Transparencies
List of Transparencies Chapter 1 Primitive Java 1 A simple first program 2 The eight primitve types in Java 3 Program that illustrates operators 4 Result of logical operators 5 Examples of conditional and looping constructs 6 Layout of a switch statement 7 Illustration of method declaration and calls 8 Chapter 2 References 9 An illustration of a reference: The Point object stored at memory location 1000 is refer- enced by both point1 and point3. The Point object stored at memory location 1024 is referenced by point2. The memory locations where the variables are stored are arbitrary 10 The result of point3=point2: point3 now references the same object as point2 11 Simple demonstration of arrays 12 Array expansion: (a) starting point: a references 10 integers; (b) after step 1: original references the 10 integers; (c) after steps 2 and 3: a references 12 integers, the first 10 of which are copied from original; (d) after original exits scope, the orig- inal array is unreferenced and can be reclaimed 13 Common standard run-time exceptions 14 Common standard checked exceptions 15 Simple program to illustrate exceptions 16 Illustration of the throws clause 17 Program that demonstrates the string tokenizer 18 Program to list contents of a file 19 Chapter 3 Objects and Classes 20 Copyright 1998 by Addison-Wesley Publishing Company ii A complete declaration of an IntCell class 21 IntCell members: read and write are accessible, but storedValue is hidden 22 A simple test routine to show how IntCell objects are accessed 23 IntCell declaration with -
Funnel Heap - a Cache Oblivious Priority Queue
Alcom-FT Technical Report Series ALCOMFT-TR-02-136 Funnel Heap - A Cache Oblivious Priority Queue Gerth Stølting Brodal∗,† Rolf Fagerberg∗ Abstract The cache oblivious model of computation is a two-level memory model with the assumption that the parameters of the model are un- known to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache oblivious model is automatically effi- cient in a multi-level memory model. Arge et al. recently presented the first optimal cache oblivious priority queue, and demonstrated the im- portance of this result by providing the first cache oblivious algorithms for graph problems. Their structure uses cache oblivious sorting and selection as subroutines. In this paper, we devise an alternative opti- mal cache oblivious priority queue based only on binary merging. We also show that our structure can be made adaptive to different usage profiles. Keywords: Cache oblivious algorithms, priority queues, memory hierar- chies, merging ∗BRICS (Basic Research in Computer Science, www.brics.dk, funded by the Danish National Research Foundation), Department of Computer Science, University of Aarhus, Ny Munkegade, DK-8000 Arhus˚ C, Denmark. E-mail: {gerth,rolf}@brics.dk. Partially supported by the Future and Emerging Technologies programme of the EU under contract number IST-1999-14186 (ALCOM-FT). †Supported by the Carlsberg Foundation (contract number ANS-0257/20). 1 1 Introduction External memory models are formal models for analyzing the impact of the memory access patterns of algorithms in the presence of several levels of memory and caches on modern computer architectures. The cache oblivious model, recently introduced by Frigo et al.