Fast, Dynamically-Sized Concurrent Hash Table⋆
Total Page:16
File Type:pdf, Size:1020Kb

Load more
Recommended publications
-
Yikes! Why Is My Systemverilog Still So Slooooow?
DVCon-2019 San Jose, CA Voted Best Paper 1st Place World Class SystemVerilog & UVM Training Yikes! Why is My SystemVerilog Still So Slooooow? Cliff Cummings John Rose Adam Sherer Sunburst Design, Inc. Cadence Design Systems, Inc. Cadence Design System, Inc. [email protected] [email protected] [email protected] www.sunburst-design.com www.cadence.com www.cadence.com ABSTRACT This paper describes a few notable SystemVerilog coding styles and their impact on simulation performance. Benchmarks were run using the three major SystemVerilog simulation tools and those benchmarks are reported in the paper. Some of the most important coding styles discussed in this paper include UVM string processing and SystemVerilog randomization constraints. Some coding styles showed little or no impact on performance for some tools while the same coding styles showed large simulation performance impact. This paper is an update to a paper originally presented by Adam Sherer and his co-authors at DVCon in 2012. The benchmarking described in this paper is only for coding styles and not for performance differences between vendor tools. DVCon 2019 Table of Contents I. Introduction 4 Benchmarking Different Coding Styles 4 II. UVM is Software 5 III. SystemVerilog Semantics Support Syntax Skills 10 IV. Memory and Garbage Collection – Neither are Free 12 V. It is Best to Leave Sleeping Processes to Lie 14 VI. UVM Best Practices 17 VII. Verification Best Practices 21 VIII. Acknowledgment 25 References 25 Author & Contact Information 25 Page 2 Yikes! Why is -
4 Hash Tables and Associative Arrays
4 FREE Hash Tables and Associative Arrays If you want to get a book from the central library of the University of Karlsruhe, you have to order the book in advance. The library personnel fetch the book from the stacks and deliver it to a room with 100 shelves. You find your book on a shelf numbered with the last two digits of your library card. Why the last digits and not the leading digits? Probably because this distributes the books more evenly among the shelves. The library cards are numbered consecutively as students sign up, and the University of Karlsruhe was founded in 1825. Therefore, the students enrolled at the same time are likely to have the same leading digits in their card number, and only a few shelves would be in use if the leadingCOPY digits were used. The subject of this chapter is the robust and efficient implementation of the above “delivery shelf data structure”. In computer science, this data structure is known as a hash1 table. Hash tables are one implementation of associative arrays, or dictio- naries. The other implementation is the tree data structures which we shall study in Chap. 7. An associative array is an array with a potentially infinite or at least very large index set, out of which only a small number of indices are actually in use. For example, the potential indices may be all strings, and the indices in use may be all identifiers used in a particular C++ program.Or the potential indices may be all ways of placing chess pieces on a chess board, and the indices in use may be the place- ments required in the analysis of a particular game. -
Lecture 2: Data Structures: a Short Background
Lecture 2: Data structures: A short background Storing data It turns out that depending on what we wish to do with data, the way we store it can have a signifcant impact on the running time. So in particular, storing data in certain "structured" way can allow for fast implementation. This is the motivation behind the feld of data structures, which is an extensive area of study. In this lecture, we will give a very short introduction, by illustrating a few common data structures, with some motivating problems. In the last lecture, we mentioned that there are two ways of representing graphs (adjacency list and adjacency matrix), each of which has its own advantages and disadvantages. The former is compact (size is proportional to the number of edges + number of vertices), and is faster for enumerating the neighbors of a vertex (which is what one needs for procedures like Breadth-First-Search). The adjacency matrix is great if one wishes to quickly tell if two vertices and have an edge between them. Let us now consider a diferent problem. Example 1: Scrabble problem Suppose we have a "dictionary" of strings whose average length is , and suppose we have a query string . How can we quickly tell if ? Brute force. Note that the naive solution is to iterate over the strings and check if for some . This takes time . If we only care about answering the query for one , this is not bad, as is the amount of time needed to "read the input". But of course, if we have a fxed dictionary and if we wish to check if for multiple , then this is extremely inefcient. -
A Fast Concurrent and Resizable Robin Hood Hash Table Endrias
A Fast Concurrent and Resizable Robin Hood Hash Table by Endrias Kahssay Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY February 2021 © Massachusetts Institute of Technology 2021. All rights reserved. Author................................................................ Department of Electrical Engineering and Computer Science Jan 29, 2021 Certified by. Julian Shun Assistant Professor Thesis Supervisor Accepted by . Katrina LaCurts Chair, Master of Engineering Thesis Committee 2 A Fast Concurrent and Resizable Robin Hood Hash Table by Endrias Kahssay Submitted to the Department of Electrical Engineering and Computer Science on Jan 29, 2021, in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science Abstract Concurrent hash tables are among the most important primitives in concurrent pro- gramming and have been extensively studied in the literature. Robin Hood hashing is a variant of linear probing that moves around keys to reduce probe distances. It has been used to develop state of the art serial hash tables. However, there is only one existing previous work on a concurrent Robin Hood table. The difficulty in making Robin Hood concurrent lies in the potential for large memory reorganization by the different table operations. This thesis presents Bolt, a concurrent resizable Robin Hood hash table engineered for high performance. Bolt treads an intricate balance between an atomic fast path and a locking slow path to facilitate concurrency. It uses a novel scheme to interleave the two without compromising correctness in the concur- rent setting. -
Prefix Hash Tree an Indexing Data Structure Over Distributed Hash
Prefix Hash Tree An Indexing Data Structure over Distributed Hash Tables Sriram Ramabhadran ∗ Sylvia Ratnasamy University of California, San Diego Intel Research, Berkeley Joseph M. Hellerstein Scott Shenker University of California, Berkeley International Comp. Science Institute, Berkeley and and Intel Research, Berkeley University of California, Berkeley ABSTRACT this lookup interface has allowed a wide variety of Distributed Hash Tables are scalable, robust, and system to be built on top DHTs, including file sys- self-organizing peer-to-peer systems that support tems [9, 27], indirection services [30], event notifi- exact match lookups. This paper describes the de- cation [6], content distribution networks [10] and sign and implementation of a Prefix Hash Tree - many others. a distributed data structure that enables more so- phisticated queries over a DHT. The Prefix Hash DHTs were designed in the Internet style: scala- Tree uses the lookup interface of a DHT to con- bility and ease of deployment triumph over strict struct a trie-based structure that is both efficient semantics. In particular, DHTs are self-organizing, (updates are doubly logarithmic in the size of the requiring no centralized authority or manual con- domain being indexed), and resilient (the failure figuration. They are robust against node failures of any given node in the Prefix Hash Tree does and easily accommodate new nodes. Most impor- not affect the availability of data stored at other tantly, they are scalable in the sense that both la- nodes). tency (in terms of the number of hops per lookup) and the local state required typically grow loga- Categories and Subject Descriptors rithmically in the number of nodes; this is crucial since many of the envisioned scenarios for DHTs C.2.4 [Comp. -
Hash Tables & Searching Algorithms
Search Algorithms and Tables Chapter 11 Tables • A table, or dictionary, is an abstract data type whose data items are stored and retrieved according to a key value. • The items are called records. • Each record can have a number of data fields. • The data is ordered based on one of the fields, named the key field. • The record we are searching for has a key value that is called the target. • The table may be implemented using a variety of data structures: array, tree, heap, etc. Sequential Search public static int search(int[] a, int target) { int i = 0; boolean found = false; while ((i < a.length) && ! found) { if (a[i] == target) found = true; else i++; } if (found) return i; else return –1; } Sequential Search on Tables public static int search(someClass[] a, int target) { int i = 0; boolean found = false; while ((i < a.length) && !found){ if (a[i].getKey() == target) found = true; else i++; } if (found) return i; else return –1; } Sequential Search on N elements • Best Case Number of comparisons: 1 = O(1) • Average Case Number of comparisons: (1 + 2 + ... + N)/N = (N+1)/2 = O(N) • Worst Case Number of comparisons: N = O(N) Binary Search • Can be applied to any random-access data structure where the data elements are sorted. • Additional parameters: first – index of the first element to examine size – number of elements to search starting from the first element above Binary Search • Precondition: If size > 0, then the data structure must have size elements starting with the element denoted as the first element. In addition, these elements are sorted. -
FORSCHUNGSZENTRUM JÜLICH Gmbh Programming in C++ Part II
FORSCHUNGSZENTRUM JÜLICH GmbH Jülich Supercomputing Centre D-52425 Jülich, Tel. (02461) 61-6402 Ausbildung von Mathematisch-Technischen Software-Entwicklern Programming in C++ Part II Bernd Mohr FZJ-JSC-BHB-0155 1. Auflage (letzte Änderung: 19.09.2003) Copyright-Notiz °c Copyright 2008 by Forschungszentrum Jülich GmbH, Jülich Supercomputing Centre (JSC). Alle Rechte vorbehalten. Kein Teil dieses Werkes darf in irgendeiner Form ohne schriftliche Genehmigung des JSC reproduziert oder unter Verwendung elektronischer Systeme verarbeitet, vervielfältigt oder verbreitet werden. Publikationen des JSC stehen in druckbaren Formaten (PDF auf dem WWW-Server des Forschungszentrums unter der URL: <http://www.fz-juelich.de/jsc/files/docs/> zur Ver- fügung. Eine Übersicht über alle Publikationen des JSC erhalten Sie unter der URL: <http://www.fz-juelich.de/jsc/docs> . Beratung Tel: +49 2461 61 -nnnn Auskunft, Nutzer-Management (Dispatch) Das Dispatch befindet sich am Haupteingang des JSC, Gebäude 16.4, und ist telefonisch erreich- bar von Montag bis Donnerstag 8.00 - 17.00 Uhr Freitag 8.00 - 16.00 Uhr Tel.5642oder6400, Fax2810, E-Mail: [email protected] Supercomputer-Beratung Tel. 2828, E-Mail: [email protected] Netzwerk-Beratung, IT-Sicherheit Tel. 6440, E-Mail: [email protected] Rufbereitschaft Außerhalb der Arbeitszeiten (montags bis donnerstags: 17.00 - 24.00 Uhr, freitags: 16.00 - 24.00 Uhr, samstags: 8.00 - 17.00 Uhr) können Sie dringende Probleme der Rufbereitschaft melden: Rufbereitschaft Rechnerbetrieb: Tel. 6400 Rufbereitschaft Netzwerke: Tel. 6440 An Sonn- und Feiertagen gibt es keine Rufbereitschaft. Fachberater Tel. +49 2461 61 -nnnn Fachgebiet Berater Telefon E-Mail Auskunft, Nutzer-Management, E. -
Chapter 10: Efficient Collections (Skip Lists, Trees)
Chapter 10: Efficient Collections (skip lists, trees) If you performed the analysis exercises in Chapter 9, you discovered that selecting a bag- like container required a detailed understanding of the tasks the container will be expected to perform. Consider the following chart: Dynamic array Linked list Ordered array add O(1)+ O(1) O(n) contains O(n) O(n) O(log n) remove O(n) O(n) O(n) If we are simply considering the cost to insert a new value into the collection, then nothing can beat the constant time performance of a simple dynamic array or linked list. But if searching or removals are common, then the O(log n) cost of searching an ordered list may more than make up for the slower cost to perform an insertion. Imagine, for example, an on-line telephone directory. There might be several million search requests before it becomes necessary to add or remove an entry. The benefit of being able to perform a binary search more than makes up for the cost of a slow insertion or removal. What if all three bag operations are more-or-less equal? Are there techniques that can be used to speed up all three operations? Are arrays and linked lists the only ways of organizing a data for a bag? Indeed, they are not. In this chapter we will examine two very different implementation techniques for the Bag data structure. In the end they both have the same effect, which is providing O(log n) execution time for all three bag operations. -
Dynamic Allocation
Eric Roberts Handout #22 CS 106B February 2, 2015 Dynamic Allocation The Allocation of Memory to Variables • When you declare a variable in a program, C++ allocates space for that variable from one of several memory regions. Dynamic Allocation 0000 • One region of memory is reserved for variables that static persist throughout the lifetime of the program, such data as constants. This information is called static data. • Each time you call a method, C++ allocates a new block of memory called a stack frame to hold its heap local variables. These stack frames come from a Eric Roberts region of memory called the stack. CS 106B • It is also possible to allocate memory dynamically, as described in Chapter 12. This space comes from February 2, 2015 a pool of memory called the heap. stack • In classical architectures, the stack and heap grow toward each other to maximize the available space. FFFF Dynamic Allocation Exercise: Dynamic Arrays • C++ uses the new operator to allocate memory on the heap. • Write a method createIndexArray(n) that returns an integer array of size n in which each element is initialized to its index. • You can allocate a single value (as opposed to an array) by As an example, calling writing new followed by the type name. Thus, to allocate space for a int on the heap, you would write int *digits = createIndexArray(10); Point *ip = new int; should result in the following configuration: • You can allocate an array of values using the following form: digits new type[size] Thus, to allocate an array of 10000 integers, you would write: 0 1 2 3 4 5 6 7 8 9 int *array = new int[10000]; 0 1 2 3 4 5 6 7 8 9 • The delete operator frees memory previously allocated. -
Introduction to Hash Table and Hash Function
Introduction to Hash Table and Hash Function This is a short introduction to Hashing mechanism Introduction Is it possible to design a search of O(1)– that is, one that has a constant search time, no matter where the element is located in the list? Let’s look at an example. We have a list of employees of a fairly small company. Each of 100 employees has an ID number in the range 0 – 99. If we store the elements (employee records) in the array, then each employee’s ID number will be an index to the array element where this employee’s record will be stored. In this case once we know the ID number of the employee, we can directly access his record through the array index. There is a one-to-one correspondence between the element’s key and the array index. However, in practice, this perfect relationship is not easy to establish or maintain. For example: the same company might use employee’s five-digit ID number as the primary key. In this case, key values run from 00000 to 99999. If we want to use the same technique as above, we need to set up an array of size 100,000, of which only 100 elements will be used: Obviously it is very impractical to waste that much storage. But what if we keep the array size down to the size that we will actually be using (100 elements) and use just the last two digits of key to identify each employee? For example, the employee with the key number 54876 will be stored in the element of the array with index 76. -
A Fully-Functional Static and Dynamic Succinct Trees
A Fully-Functional Static and Dynamic Succinct Trees GONZALO NAVARRO, University of Chile, Chile KUNIHIKO SADAKANE, National Institute of Informatics, Japan We propose new succinct representations of ordinal trees, and match various space/time lower bounds. It is known that any n-node static tree can be represented in 2n + o(n) bits so that a number of operations on the tree can be supported in constant time under the word-RAM model. However, the data structures are complicated and difficult to dynamize. We propose a simple and flexible data structure, called the range min-max tree, that reduces the large number of relevant tree operations considered in the literature to a few primitives that are carried out in constant time on polylog-sized trees. The result is extended to trees of arbitrary size, retaining constant time and reaching 2n + O(n=polylog(n)) bits of space. This space is optimal for a core subset of the operations supported, and significantly lower than in any previous proposal. For the dynamic case, where insertion/deletion (indels) of nodes is allowed, the existing data structures support a very limited set of operations. Our data structure builds on the range min-max tree to achieve 2n + O(n= log n) bits of space and O(log n) time for all the operations supported in the static scenario, plus indels. We also propose an improved data structure using 2n + O(n log log n= log n) bits and improving the time to the optimal O(log n= log log n) for most operations. We extend our support to forests, where whole subtrees can be attached to or detached from others, in time O(log1+ n) for any > 0. -
Intel Threading Building Blocks
Praise for Intel Threading Building Blocks “The Age of Serial Computing is over. With the advent of multi-core processors, parallel- computing technology that was once relegated to universities and research labs is now emerging as mainstream. Intel Threading Building Blocks updates and greatly expands the ‘work-stealing’ technology pioneered by the MIT Cilk system of 15 years ago, providing a modern industrial-strength C++ library for concurrent programming. “Not only does this book offer an excellent introduction to the library, it furnishes novices and experts alike with a clear and accessible discussion of the complexities of concurrency.” — Charles E. Leiserson, MIT Computer Science and Artificial Intelligence Laboratory “We used to say make it right, then make it fast. We can’t do that anymore. TBB lets us design for correctness and speed up front for Maya. This book shows you how to extract the most benefit from using TBB in your code.” — Martin Watt, Senior Software Engineer, Autodesk “TBB promises to change how parallel programming is done in C++. This book will be extremely useful to any C++ programmer. With this book, James achieves two important goals: • Presents an excellent introduction to parallel programming, illustrating the most com- mon parallel programming patterns and the forces governing their use. • Documents the Threading Building Blocks C++ library—a library that provides generic algorithms for these patterns. “TBB incorporates many of the best ideas that researchers in object-oriented parallel computing developed in the last two decades.” — Marc Snir, Head of the Computer Science Department, University of Illinois at Urbana-Champaign “This book was my first introduction to Intel Threading Building Blocks.