4 Hash Tables and Associative Arrays
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Armadillo C++ Library
Armadillo: a template-based C++ library for linear algebra Conrad Sanderson and Ryan Curtin Abstract The C++ language is often used for implementing functionality that is performance and/or resource sensitive. While the standard C++ library provides many useful algorithms (such as sorting), in its current form it does not provide direct handling of linear algebra (matrix maths). Armadillo is an open source linear algebra library for the C++ language, aiming towards a good balance between speed and ease of use. Its high-level application programming interface (function syntax) is deliberately similar to the widely used Matlab and Octave languages [4], so that mathematical operations can be expressed in a familiar and natural manner. The library is useful for algorithm development directly in C++, or relatively quick conversion of research code into production environments. Armadillo provides efficient objects for vectors, matrices and cubes (third order tensors), as well as over 200 associated functions for manipulating data stored in the objects. Integer, floating point and complex numbers are supported, as well as dense and sparse storage formats. Various matrix factorisations are provided through integration with LAPACK [3], or one of its high performance drop-in replacements such as Intel MKL [6] or OpenBLAS [9]. It is also possible to use Armadillo in conjunction with NVBLAS to obtain GPU-accelerated matrix multiplication [7]. Armadillo is used as a base for other open source projects, such as MLPACK, a C++ library for machine learning and pattern recognition [2], and RcppArmadillo, a bridge between the R language and C++ in order to speed up computations [5]. -
Using Tabulation to Implement the Power of Hashing Using Tabulation to Implement the Power of Hashing Talk Surveys Results from I M
Talk surveys results from I M. Patras¸cuˇ and M. Thorup: The power of simple tabulation hashing. STOC’11 and J. ACM’12. I M. Patras¸cuˇ and M. Thorup: Twisted Tabulation Hashing. SODA’13 I M. Thorup: Simple Tabulation, Fast Expanders, Double Tabulation, and High Independence. FOCS’13. I S. Dahlgaard and M. Thorup: Approximately Minwise Independence with Twisted Tabulation. SWAT’14. I T. Christiani, R. Pagh, and M. Thorup: From Independence to Expansion and Back Again. STOC’15. I S. Dahlgaard, M.B.T. Knudsen and E. Rotenberg, and M. Thorup: Hashing for Statistics over K-Partitions. FOCS’15. Using Tabulation to Implement The Power of Hashing Using Tabulation to Implement The Power of Hashing Talk surveys results from I M. Patras¸cuˇ and M. Thorup: The power of simple tabulation hashing. STOC’11 and J. ACM’12. I M. Patras¸cuˇ and M. Thorup: Twisted Tabulation Hashing. SODA’13 I M. Thorup: Simple Tabulation, Fast Expanders, Double Tabulation, and High Independence. FOCS’13. I S. Dahlgaard and M. Thorup: Approximately Minwise Independence with Twisted Tabulation. SWAT’14. I T. Christiani, R. Pagh, and M. Thorup: From Independence to Expansion and Back Again. STOC’15. I S. Dahlgaard, M.B.T. Knudsen and E. Rotenberg, and M. Thorup: Hashing for Statistics over K-Partitions. FOCS’15. I Providing algorithmically important probabilisitic guarantees akin to those of truly random hashing, yet easy to implement. I Bridging theory (assuming truly random hashing) with practice (needing something implementable). I Many randomized algorithms are very simple and popular in practice, but often implemented with too simple hash functions, so guarantees only for sufficiently random input. -
Linear Probing with Constant Independence
Linear Probing with Constant Independence Anna Pagh∗ Rasmus Pagh∗ Milan Ruˇzic´ ∗ ABSTRACT 1. INTRODUCTION Hashing with linear probing dates back to the 1950s, and Hashing with linear probing is perhaps the simplest algo- is among the most studied algorithms. In recent years it rithm for storing and accessing a set of keys that obtains has become one of the most important hash table organiza- nontrivial performance. Given a hash function h, a key x is tions since it uses the cache of modern computers very well. inserted in an array by searching for the first vacant array Unfortunately, previous analyses rely either on complicated position in the sequence h(x), h(x) + 1, h(x) + 2,... (Here, and space consuming hash functions, or on the unrealistic addition is modulo r, the size of the array.) Retrieval of a assumption of free access to a truly random hash function. key proceeds similarly, until either the key is found, or a Already Carter and Wegman, in their seminal paper on uni- vacant position is encountered, in which case the key is not versal hashing, raised the question of extending their anal- present in the data structure. Deletions can be performed ysis to linear probing. However, we show in this paper that by moving elements back in the probe sequence in a greedy linear probing using a pairwise independent family may have fashion (ensuring that no key x is moved beyond h(x)), until expected logarithmic cost per operation. On the positive a vacant array position is encountered. side, we show that 5-wise independence is enough to ensure Linear probing dates back to 1954, but was first analyzed constant expected time per operation. -
Scala Tutorial
Scala Tutorial SCALA TUTORIAL Simply Easy Learning by tutorialspoint.com tutorialspoint.com i ABOUT THE TUTORIAL Scala Tutorial Scala is a modern multi-paradigm programming language designed to express common programming patterns in a concise, elegant, and type-safe way. Scala has been created by Martin Odersky and he released the first version in 2003. Scala smoothly integrates features of object-oriented and functional languages. This tutorial gives a great understanding on Scala. Audience This tutorial has been prepared for the beginners to help them understand programming Language Scala in simple and easy steps. After completing this tutorial, you will find yourself at a moderate level of expertise in using Scala from where you can take yourself to next levels. Prerequisites Scala Programming is based on Java, so if you are aware of Java syntax, then it's pretty easy to learn Scala. Further if you do not have expertise in Java but you know any other programming language like C, C++ or Python, then it will also help in grasping Scala concepts very quickly. Copyright & Disclaimer Notice All the content and graphics on this tutorial are the property of tutorialspoint.com. Any content from tutorialspoint.com or this tutorial may not be redistributed or reproduced in any way, shape, or form without the written permission of tutorialspoint.com. Failure to do so is a violation of copyright laws. This tutorial may contain inaccuracies or errors and tutorialspoint provides no guarantee regarding the accuracy of the site or its contents including this tutorial. If you discover that the tutorialspoint.com site or this tutorial content contains some errors, please contact us at [email protected] TUTORIALS POINT Simply Easy Learning Table of Content Scala Tutorial .......................................................................... -
The Power of Tabulation Hashing
Thank you for inviting me to China Theory Week. Joint work with Mihai Patras¸cuˇ . Some of it found in Proc. STOC’11. The Power of Tabulation Hashing Mikkel Thorup University of Copenhagen AT&T Joint work with Mihai Patras¸cuˇ . Some of it found in Proc. STOC’11. The Power of Tabulation Hashing Mikkel Thorup University of Copenhagen AT&T Thank you for inviting me to China Theory Week. The Power of Tabulation Hashing Mikkel Thorup University of Copenhagen AT&T Thank you for inviting me to China Theory Week. Joint work with Mihai Patras¸cuˇ . Some of it found in Proc. STOC’11. I Providing algorithmically important probabilisitic guarantees akin to those of truly random hashing, yet easy to implement. I Bridging theory (assuming truly random hashing) with practice (needing something implementable). Target I Simple and reliable pseudo-random hashing. I Bridging theory (assuming truly random hashing) with practice (needing something implementable). Target I Simple and reliable pseudo-random hashing. I Providing algorithmically important probabilisitic guarantees akin to those of truly random hashing, yet easy to implement. Target I Simple and reliable pseudo-random hashing. I Providing algorithmically important probabilisitic guarantees akin to those of truly random hashing, yet easy to implement. I Bridging theory (assuming truly random hashing) with practice (needing something implementable). Applications of Hashing Hash tables (n keys and 2n hashes: expect 1/2 keys per hash) I chaining: follow pointers • x • ! a ! t • • ! v • • ! f ! s -
Cmsc 132: Object-Oriented Programming Ii
© Department of Computer Science UMD CMSC 132: OBJECT-ORIENTED PROGRAMMING II Iterator, Marker, Observer Design Patterns Department of Computer Science University of Maryland, College Park © Department of Computer Science UMD Design Patterns • Descriptions of reusable solutions to common software design problems (e.g, Iterator pattern) • Captures the experience of experts • Goals • Solve common programming challenges • Improve reliability of solution • Aid rapid software development • Useful for real-world applications • Design patterns are like recipes – generic solutions to expected situations • Design patterns are language independent • Recognizing when and where to use design patterns requires familiarity & experience • Design pattern libraries serve as a glossary of idioms for understanding common, but complex solutions • Design patterns are used throughout the Java Class Libraries © Department of Computer Science UMD Iterator Pattern • Definition • Move through collection of objects without knowing its internal representation • Where to use & benefits • Use a standard interface to represent data objects • Uses standard iterator built in each standard collection, like List • Need to distinguish variations in the traversal of an aggregate • Example • Iterator for collection • Original • Examine elements of collection directly • Using pattern • Collection provides Iterator class for examining elements in collection © Department of Computer Science UMD Iterator Example public interface Iterator<V> { bool hasNext(); V next(); void remove(); -
Iterators in C++ C 2016 All Rights Reserved
Software Design Lecture Notes Prof. Stewart Weiss Iterators in C++ c 2016 All rights reserved. Iterators in C++ 1 Introduction When a program needs to visit all of the elements of a vector named myvec starting with the rst and ending with the last, you could use an iterative loop such as the following: for ( int i = 0; i < myvec.size(); i++ ) // the visit, i.e., do something with myvec[i] This code works and is a perfectly ne solution. This same type of iterative loop can visit each character in a C++ string. But what if you need code that visits all elements of a C++ list object? Is there a way to use an iterative loop to do this? To be clear, an iterative loop is one that has the form for var = start value; var compared to end value; update var; {do something with node at specic index} In other words, an iterative loop is a counting loop, as opposed to one that tests an arbitrary condition, like a while loop. The short answer to the question is that, in C++, without iterators, you cannot do this. Iterators make it possible to iterate through arbitrary containers. This set of notes answers the question, What is an iterator, and how do you use it? Iterators are a generalization of pointers in C++ and have similar semantics. They allow a program to navigate through dierent types of containers in a uniform manner. Just as pointers can be used to traverse a linked list or a binary tree, and subscripts can be used to traverse the elements of a vector, iterators can be used to sequence through the elements of any standard C++ container class. -
CSC 344 – Algorithms and Complexity Why Search?
CSC 344 – Algorithms and Complexity Lecture #5 – Searching Why Search? • Everyday life -We are always looking for something – in the yellow pages, universities, hairdressers • Computers can search for us • World wide web provides different searching mechanisms such as yahoo.com, bing.com, google.com • Spreadsheet – list of names – searching mechanism to find a name • Databases – use to search for a record • Searching thousands of records takes time the large number of comparisons slows the system Sequential Search • Best case? • Worst case? • Average case? Sequential Search int linearsearch(int x[], int n, int key) { int i; for (i = 0; i < n; i++) if (x[i] == key) return(i); return(-1); } Improved Sequential Search int linearsearch(int x[], int n, int key) { int i; //This assumes an ordered array for (i = 0; i < n && x[i] <= key; i++) if (x[i] == key) return(i); return(-1); } Binary Search (A Decrease and Conquer Algorithm) • Very efficient algorithm for searching in sorted array: – K vs A[0] . A[m] . A[n-1] • If K = A[m], stop (successful search); otherwise, continue searching by the same method in: – A[0..m-1] if K < A[m] – A[m+1..n-1] if K > A[m] Binary Search (A Decrease and Conquer Algorithm) l ← 0; r ← n-1 while l ≤ r do m ← (l+r)/2 if K = A[m] return m else if K < A[m] r ← m-1 else l ← m+1 return -1 Analysis of Binary Search • Time efficiency • Worst-case recurrence: – Cw (n) = 1 + Cw( n/2 ), Cw (1) = 1 solution: Cw(n) = log 2(n+1) 6 – This is VERY fast: e.g., Cw(10 ) = 20 • Optimal for searching a sorted array • Limitations: must be a sorted array (not linked list) binarySearch int binarySearch(int x[], int n, int key) { int low, high, mid; low = 0; high = n -1; while (low <= high) { mid = (low + high) / 2; if (x[mid] == key) return(mid); if (x[mid] > key) high = mid - 1; else low = mid + 1; } return(-1); } Searching Problem Problem: Given a (multi)set S of keys and a search key K, find an occurrence of K in S, if any. -
4 Hash Tables and Associative Arrays
4 FREE Hash Tables and Associative Arrays If you want to get a book from the central library of the University of Karlsruhe, you have to order the book in advance. The library personnel fetch the book from the stacks and deliver it to a room with 100 shelves. You find your book on a shelf numbered with the last two digits of your library card. Why the last digits and not the leading digits? Probably because this distributes the books more evenly among the shelves. The library cards are numbered consecutively as students sign up, and the University of Karlsruhe was founded in 1825. Therefore, the students enrolled at the same time are likely to have the same leading digits in their card number, and only a few shelves would be in use if the leadingCOPY digits were used. The subject of this chapter is the robust and efficient implementation of the above “delivery shelf data structure”. In computer science, this data structure is known as a hash1 table. Hash tables are one implementation of associative arrays, or dictio- naries. The other implementation is the tree data structures which we shall study in Chap. 7. An associative array is an array with a potentially infinite or at least very large index set, out of which only a small number of indices are actually in use. For example, the potential indices may be all strings, and the indices in use may be all identifiers used in a particular C++ program.Or the potential indices may be all ways of placing chess pieces on a chess board, and the indices in use may be the place- ments required in the analysis of a particular game. -
Implementing the Map ADT Outline
Implementing the Map ADT Outline ´ The Map ADT ´ Implementation with Java Generics ´ A Hash Function ´ translation of a string key into an integer ´ Consider a few strategies for implementing a hash table ´ linear probing ´ quadratic probing ´ separate chaining hashing ´ OrderedMap using a binary search tree The Map ADT ´A Map models a searchable collection of key-value mappings ´A key is said to be “mapped” to a value ´Also known as: dictionary, associative array ´Main operations: insert, find, and delete Applications ´ Store large collections with fast operations ´ For a long time, Java only had Vector (think ArrayList), Stack, and Hashmap (now there are about 67) ´ Support certain algorithms ´ for example, probabilistic text generation in 127B ´ Store certain associations in meaningful ways ´ For example, to store connected rooms in Hunt the Wumpus in 335 The Map ADT ´A value is "mapped" to a unique key ´Need a key and a value to insert new mappings ´Only need the key to find mappings ´Only need the key to remove mappings 5 Key and Value ´With Java generics, you need to specify ´ the type of key ´ the type of value ´Here the key type is String and the value type is BankAccount Map<String, BankAccount> accounts = new HashMap<String, BankAccount>(); 6 put(key, value) get(key) ´Add new mappings (a key mapped to a value): Map<String, BankAccount> accounts = new TreeMap<String, BankAccount>(); accounts.put("M",); accounts.put("G", new BankAcnew BankAccount("Michel", 111.11)count("Georgie", 222.22)); accounts.put("R", new BankAccount("Daniel", -
A Fast Concurrent and Resizable Robin Hood Hash Table Endrias
A Fast Concurrent and Resizable Robin Hood Hash Table by Endrias Kahssay Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY February 2021 © Massachusetts Institute of Technology 2021. All rights reserved. Author................................................................ Department of Electrical Engineering and Computer Science Jan 29, 2021 Certified by. Julian Shun Assistant Professor Thesis Supervisor Accepted by . Katrina LaCurts Chair, Master of Engineering Thesis Committee 2 A Fast Concurrent and Resizable Robin Hood Hash Table by Endrias Kahssay Submitted to the Department of Electrical Engineering and Computer Science on Jan 29, 2021, in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science Abstract Concurrent hash tables are among the most important primitives in concurrent pro- gramming and have been extensively studied in the literature. Robin Hood hashing is a variant of linear probing that moves around keys to reduce probe distances. It has been used to develop state of the art serial hash tables. However, there is only one existing previous work on a concurrent Robin Hood table. The difficulty in making Robin Hood concurrent lies in the potential for large memory reorganization by the different table operations. This thesis presents Bolt, a concurrent resizable Robin Hood hash table engineered for high performance. Bolt treads an intricate balance between an atomic fast path and a locking slow path to facilitate concurrency. It uses a novel scheme to interleave the two without compromising correctness in the concur- rent setting. -
Hash Tables & Searching Algorithms
Search Algorithms and Tables Chapter 11 Tables • A table, or dictionary, is an abstract data type whose data items are stored and retrieved according to a key value. • The items are called records. • Each record can have a number of data fields. • The data is ordered based on one of the fields, named the key field. • The record we are searching for has a key value that is called the target. • The table may be implemented using a variety of data structures: array, tree, heap, etc. Sequential Search public static int search(int[] a, int target) { int i = 0; boolean found = false; while ((i < a.length) && ! found) { if (a[i] == target) found = true; else i++; } if (found) return i; else return –1; } Sequential Search on Tables public static int search(someClass[] a, int target) { int i = 0; boolean found = false; while ((i < a.length) && !found){ if (a[i].getKey() == target) found = true; else i++; } if (found) return i; else return –1; } Sequential Search on N elements • Best Case Number of comparisons: 1 = O(1) • Average Case Number of comparisons: (1 + 2 + ... + N)/N = (N+1)/2 = O(N) • Worst Case Number of comparisons: N = O(N) Binary Search • Can be applied to any random-access data structure where the data elements are sorted. • Additional parameters: first – index of the first element to examine size – number of elements to search starting from the first element above Binary Search • Precondition: If size > 0, then the data structure must have size elements starting with the element denoted as the first element. In addition, these elements are sorted.