Week 7: Arrays, Lists, Pointers and Rooted Trees (CS 270 Algorithms, Oliver Kullmann)

General remarks

We conclude elementary data structures by discussing and implementing arrays, lists, pointers and trees. We also consider binary search.

Overview of the week:
1. Binary search
2. Lists
3. Pointers
4. Trees
5. Implementing rooted trees
6. Tutorial

Reading from CLRS for week 7: Chapter 10, Sections 10.2, 10.3, 10.4.

Arrays

Arrays are the most fundamental data structure:

- An array A is a static data structure with a fixed length n ∈ N0, holding n objects of the same type.
- Access to elements happens via A[i] for indices i, typically 0-based (C-based languages), that is, i ∈ {0, ..., n-1}, or 1-based, that is, i ∈ {1, ..., n}.
- This access, called random access, happens in constant time, and can be used for reading and writing.
- Due to the fixed length of arrays, one cannot really speak of "insertion" and "deletion" for arrays.
- Search in general is slow (one has to run through all elements in the worst case), but fast in sorted arrays, via "binary search".

Vectors

The dynamic form of an array (i.e., it can grow) is called a vector (as in C++), or "dynamic array":

- The vector grows by internally holding an array and, when the need arises, allocating a new, bigger array, copying the old content, and deleting the old array (see the sketch below).
- When done "infrequently", insertions (and deletions) at the end of the vector require only amortised constant time; see the tutorial.
- However, insertions and deletions at the beginning of the vector (or anywhere else in it) need time linear in the current size of the vector, since the elements need to be shifted.
- A vector with additional structure, where insertions and deletions at the beginning also happen in amortised constant time, is typically called a deque (a "double-ended queue").
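To make the growth scheme concrete, here is a minimal sketch of such a vector of ints in Java. It is not part of the lecture code; the class name GrowingIntArray and the doubling factor of 2 are illustrative assumptions.

    // Minimal sketch (not lecture code): a growable array of ints with
    // amortised-constant insertion at the end, assuming a doubling strategy.
    class GrowingIntArray {
      private int[] data = new int[1]; // the internal fixed-size array
      private int size = 0;            // number of elements actually stored

      // Usually writes a single cell; occasionally allocates a twice-as-big
      // array and copies the old content over (the old array is then
      // reclaimed by the garbage collector).
      void pushBack(final int x) {
        if (size == data.length) {
          final int[] bigger = new int[2 * data.length];
          System.arraycopy(data, 0, bigger, 0, size);
          data = bigger;
        }
        data[size++] = x;
      }

      int get(final int i) { return data[i]; } // random access in constant time
      int length() { return size; }
    }

Inserting at the front, by contrast, would require shifting all stored elements one position to the right, which is exactly the linear-time behaviour described above.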
Searching in sorted vectors

Searching in general vectors takes linear time (running through all elements):

1. However, if the vector is sorted (we assume, as is the default, ascending order), then searching can be done in logarithmic time (in the length n of the vector).
2. We present the Java function binary_search, which searches for an element x in an array A.
3. Instead of just returning true or false (for found or not found), it is more informative to return an index i with A[i] = x if found, and to return -1 otherwise.
4. Since it might not be so easy to (efficiently) form sub-arrays, our version of binary search allows specifying a sub-array by its indices begin and end.
5. As is usually best, this so-called "range" is right-open, i.e., the beginning is included but the ending is excluded.
6. The role model for that is begin = 0 and end = n.

Binary search

    class BinarySearch {
      public static int binary_search(final int[] A, int begin, int end, final int x) {
        if (A == null) return -1;
        if (begin == end) return -1;
        while (true) {
          final int mid = (begin + end) / 2;
          if (A[mid] == x) return mid;
          if (begin + 1 == end) return -1;
          if (A[mid] < x) {
            begin = mid + 1;
            if (begin == end) return -1;
          }
          else end = mid;
        }
      }
    }

Binary search (cont.)

A second static method of the class searches the whole array by delegating to the version above:

    public static int binary_search(final int[] A, final int x) {
      if (A == null) return -1;
      return binary_search(A, 0, A.length, x);
    }

Binary search with assertions

The same function again, with the invariants made explicit as assertions (written out here as valid Java conjunctions):

    public static int binary_search(final int[] A, int begin, int end, final int x) {
      if (A == null) return -1;
      assert 0 <= begin && begin <= end && end <= A.length;
      if (begin == end) return -1;
      while (true) {
        assert 0 <= begin && begin < end && end <= A.length;
        final int mid = (begin + end) / 2;
        assert begin <= mid && mid < end;
        if (A[mid] == x) return mid;
        if (begin + 1 == end) return -1;
        assert begin < mid;
        if (A[mid] < x) {
          begin = mid + 1;
          if (begin == end) return -1;
        }
        else end = mid;
      }
    }
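As a quick illustration of the interface, a small hypothetical driver (not part of the lecture material, and assuming both overloads live in the class BinarySearch) could look as follows:

    // Hypothetical usage of BinarySearch (not from the slides).
    class BinarySearchDemo {
      public static void main(final String[] args) {
        final int[] A = {2, 3, 5, 7, 11, 13};                        // sorted ascending
        System.out.println(BinarySearch.binary_search(A, 7));        // 3
        System.out.println(BinarySearch.binary_search(A, 1, 4, 7));  // 3 (search in [1,4))
        System.out.println(BinarySearch.binary_search(A, 4));        // -1 (not present)
      }
    }

Note how the right-open convention makes A.length itself a legal value for end, as used by the whole-array overload.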
Analysing binary search

We have a divide-and-conquer algorithm, with the characteristic recurrence

    T(n) = T(n/2) + 1.

- We divide the array into two (nearly) equal parts, i.e., b = 2 in the standard form of the recurrence for the Master Theorem.
- We only need to investigate one of the two parts (due to the sorting!), i.e., a = 1 for the Master Theorem.
- Finally, the work done for splitting happens in constant time, and thus c = 0 for the Master Theorem.

We obtain the second case of the Master Theorem (log_2(1) = 0), whence

    T(n) = Θ(lg n).

Recall that this actually only implies an upper bound for the run-time of binary search: the lower bound implied by the implicit Ω holds only for the recurrence, but not necessarily for the run-time. However, it is not too hard to see that for the algorithm, and in fact for every possible search algorithm, at least lg(n) comparisons are needed.

Removing random access from vectors, gaining fast general insertion and deletion: linked lists

With vectors we obtain random access (via indices, which are just natural numbers, so arbitrary arithmetic can be performed with them) thanks to the contiguous and uniform storage scheme: underlying is an array, which is stored as one contiguous block of memory cells, all of the same size.

But to maintain contiguity, only deletions and insertions at the end of the vector are efficient (amortised constant time). If we give up contiguity, then we lose random access, but we gain efficient arbitrary deletions and insertions: (linked) lists.

Lists formally implement a dictionary (search, insertion, deletion), but, different from "real" dictionaries, search is slow, while insertion and deletion are very fast, i.e., constant-time.

Pointers to next and previous elements

Like a vector, the elements of a list are arranged in a linear order. The basic idea is that each element contains a pointer to the next and the previous element of the list. So a list-object x is a triple:

- x.prev is a pointer to the previous element in the list;
- x.next is a pointer to the next element in the list;
- x.key contains the key (or the data, if there is no "key").

For the first element of the list, x.prev is NIL, and for the last element, x.next is NIL. The whole list is represented by a pointer L to the first element (as usual, NIL if the list is empty).

Searching

The SEARCH function in Java-like code, using List as the pointer-type (recall that nearly everything in Java is a pointer):

    static List search(List L, final Key k) {
      while (L != null && L.key != k)
        L = L.next;
      return L;
    }

Note that if x is not found, then L automatically ends up as NIL (that is, null in Java).

Excursion: Searching, in C++

For comparison, the same code in C++:

    const List* search(const List* L, const Key k) {
      while (L != nullptr and (*L).key != k)
        L = (*L).next;
      return L;
    }

We see that in C/C++ we not only have pointers, but also values (as the ints!), and thus one can distinguish between pointers and values: the *-operator makes pointer-types from value-types, and dereferences pointers (to values). Further remarks:

- "const List*" means that we do not change the values.
- More idiomatic would be the use of the -> operator, which gives for example L->key instead of (*L).key.

Insertion

Inserting a list-object x into list L, at the beginning, again as Java code:

    static List insert(List L, final List x) {
      assert x != null;
      x.next = L;
      x.prev = null;
      if (L != null) L.prev = x;
      L = x;
      return L;
    }

Note that the return value is the new list.

Deletion

Deleting the list-element x from list L:

    static List delete(List L, final List x) {
      assert x != null;
      assert L != null;
      if (x.prev != null) x.prev.next = x.next;
      else L = x.next;
      if (x.next != null) x.next.prev = x.prev;
      return L;
    }

Again the return value is the new list.

Other forms of linked lists

Our form of a linked list (the standard form) is more precisely called a doubly linked list, since we have back- and forth-pointers for every node.

Final remarks on lists

What we have outlined as the class List (see the lab session for the full implementation) would typically be considered as a type ListNode, while the List-class itself would be kind of a wrapper around these nodes; a minimal sketch of such a node type follows below.
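For reference, a minimal node type could be declared as follows. This is only a sketch, assuming a doubly linked list whose key is simply an int; it is not the lab-session implementation, and the name ListNode mirrors the remark above.

    // Minimal sketch of a doubly-linked-list node (not the lab implementation).
    class ListNode {
      ListNode prev; // previous element, or null (NIL) for the first element
      ListNode next; // next element, or null (NIL) for the last element
      int key;       // the key (or the data, if there is no "key")

      ListNode(final int key) { this.key = key; }
    }

A surrounding List class would then typically just hold a pointer to the first node (null for the empty list) and offer search, insert and delete as methods.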