Data Structures Libraries

Total Page:16

File Type:pdf, Size:1020Kb

Data Structures Libraries Data Structures Libraries By: Leonor Frias Moya Advisors: Jordi Petit Silvestre and Salvador Roura Ferret PhD Thesis Departament de Llenguatges i Sistemes Inform`atics Universitat Polit`ecnica de Catalunya June 2010 This PhD Dissertation was publically defended on June 8th, 2010, in Barcelona. According to the judgement of the Jury, this PhD Dissertation obtained the qualification excellent cum laude and with the European mention. In the public defense, it was my honor to have the following PhD Jury: Prof. Dr. Conrado Mart´ınez, Universitat Polit`ecnica de Catalunya, Spain • Dr. Joaquim Gabarr´o, Universitat Polit`ecnica de Catalunya, Spain • Prof. Dr. Ulrich Meyer, Goethe Universit¨at Frankfurt am Main, Germany • Prof. Dr. Giuseppe Italiano, Universit`adi Roma ’Tor Vergata’, Italy • Dr. Ricardo Baeza-Yates, Universitat Pompeu Fabra, Spain • To whom I am (and I will always be) very grateful. Leonor Frias Moya I dedicate this work to my parents, Eusebio and Leonor Dedico este trabajo a mis padres, Eusebio y Leonor Resum Aquesta tesi doctoral estudia diversos problemes que sorgeixen quan es combina teoria i pr`actica en algor´ısmia, en particular, quan es consideren els requirements de les biblioteques de programari. La Standard Template Library (STL) del llenguatge de programaci´oC++ actua com a fil conductor en l’estudi de problemes fonamentals en algor´ısmia, com ara les seq¨u`encies, els diccionaris, l’ordenaci´o i la selecci´o. Per fer-ho, seguim aproximacions diverses per`ointerrelacionades, i que han estat parcialment influenciades pels canvis recents i radicals en l’arquitectura dels computadors. Per una banda, presentem diversos algorismes i estructures de dades que es fonamenten en les capacitats dels computadors moderns. Concretament, de les lists de la STL presentem imple- mentacions conscients de la mem`oria cau que, tot aprofitant les propietats d’aquestes jerarquies de mem`oria, aconsegueixen recorreguts molt eficients. Alhora, com tamb´e´es el cas de les implementa- cions de la STL amb llistes doblement enllac¸ades, compleixen amb els requeriments de cost de l’est`andard per a les operacions, aix´ıcom amb la validesa en tot moment dels iteradors. Tamb´epresentem diversos algorismes pr`actics per a processadors multi-core. Concretament, descrivim algorismes paral lels per a la construcci´oi la inserci´om´ultiple en arbres vermell-negre. Aquests algorismes es basen· en (a) l’acc´es en temps constant als elements usant el seu ´ındex, com ´es el cas tamb´edels algorismes en el model PRAM, (b) operacions d’uni´oi divisi´od’arbres, aix´ıcom (c) t`ecniques d’equilibri din`amic de la c`arrega paral lela. La nostra implementaci´oest´en la del com- pilador GCC per als diccionaris de la STL. Els resultats· experimentals en mostren la conveni`encia pr`actica. A m´es a m´es, per preprocessar l’entrada de les anteriors operacions eficientment, hem definit un algorisme online general per particionar seq¨u`encies de mida desconeguda. Aquest m`etode t´eun inter`es tant te`oric com pr`actic. Per altra banda, descrivim millores en algorismes, ja existents, basats en reusar part dels c`alculs previs. En primer lloc, presentem implementacions en paral lel de la partition de la STL. Aquests s´on els primers algorismes pr`actics que fan un nombre `opt· im de comparacions, ´es a dir, una per element. M´es endavant, mostrem la utilitat d’aquesta propietat quan els elements s´on cadenes de car`acters (strings) i es fan particions repetitivament i jer`arquica, com ara en el quicksort i el quick- select. Certament, mostrem grans millores de rendiment en combinar-los amb t`ecniques ja existents per evitar comparacions de car`acters redundants. Els nostres algorismes poden usar-se per especial- itzar el sort i el nth element de la STL, respectivament. Tamb´e, fem notar que algunes de les comparacions en aquests tipus d’algorismes i estructures de dades no accedeixen a les cadenes de car`acters pr`opiament dites. Si, com ´es habitual, els strings s’implementen mitjanc¸ant punters, aquest fet ´es rellevant si es considera l’´us de la mem`oria cau. En aquest sentit, quantifiquem anal´ıticament el nombre d’accessos a strings en arbres binaris de cerca, i tamb´een quicksort i quickselect modi- ficats d’aquesta manera. Tamb´eanalitzem els beneficis d’afegir redund`ancia a aquests algorismes i estructures de dades. Finalment, presentem i analitzem detalladament l’algorisme multikey quickselect, que ´es l’an`aleg de l’algorisme d’ordenaci´o multikey quicksort per al problema de la selecci´oper a strings. Aquest al- gorisme ´es eficient respecte el nombre de comparacions de car`acters, no necessita mem`oria auxiliar, i ´es f`acil d’implementar. A m´es a m´es, es pot usar per a especialitzar el nth element de la STL. La nostra an`alisi considera tant el nombre de comparacions com el d’intercanvis sota la hip`otesi d’una distribuci´ode claus aleat`oria uniforme. Addicionalment, proposem diverses variants per millorar l’efici`encia, que tamb´ees poden aplicar al multikey quicksort. Abstract This thesis studies several problems that arise from combining theory and practice in algorithmics, in particular, when considering requirements in software libraries. The well-known Standard Template Library (STL) of the C++ programming language is the common thread of this research. We have tackled fundamental problems in algorithmics, such as sorting and selection, sequences and dictio- naries. This aim has been achieved using varied, intertwined and evolving approaches, in particular, partially influenced by fast and radical changes in technology. Most of the contributions concern generic algorithms and data structures that take advantage of the processing capabilities of modern computers. First of all, we present cache-conscious imple- mentations of STL lists. Taking advantage of the properties of memory hierarchies, very good performance is obtained for traversal-based operations. But like typical STL doubly linked list im- plementations, they keep up with the required cost for operations as well as with the validity of iterators at all times. Besides, we present several practical algorithms for multi-core computers. Specifically, we describe parallel algorithms for the construction and insertion of many elements at a time into red- black trees. These algorithms use (a) constant-cost index computations to access the elements, as it is the case for some existing algorithms in the PRAM model, (b) split and union operations on trees, and (c) parallel load-balancing techniques. The implementation extends a red-black tree codification of STL dictionaries in the GCC compiler. The experiments show the practicability of this approach. In addition, motivated by efficiently preprocessing the input of the aforementioned bulk operations in dictionaries, we describe a general online algorithm for partitioning sequences whose size is unknown. Our algorithm is interesting from both a theoretical and practical perspective. Furthermore, we have devised new variants of algorithms by exploiting the reusing of com- putations. In particular, we consider the parallel partitioning of an array with respect to a pivot element. We present practical algorithms that can be used to implement STL partition. These are the first such algorithms that perform an optimal number of comparisons, i.e., one per element. Later, we show that this property is particularly useful when the elements are strings and parti- tioning is repetitively used and in a hierarchical way, as in quicksort and quickselect. Specifically, we show big performance improvements combining parallel quicksort and quickselect with existing techniques to avoid redundant character comparisons. Furthermore, the resulting algorithms can be used to specialize STL sort and nth element, respectively. Moreover, we highlight that some of the comparisons made in this kind of algorithms and data structures do not need to access the actual strings. Under the common assumption of a pointer string implementation, this is very relevant from a cache-conscious perspective. In this sense, we analytically quantify the number of string lookups in so modified binary search trees, quicksort and quickselect. Also, we analyze the benefits of adding some redundancy on the top of these algorithms and data structures. Finally, we present and analyze multikey quickselect, the analog of multikey quicksort for the selection problem for strings. This algorithm is efficient with respect to the number of character comparisons, it is in-place and it is easy to implement. In particular, it can be used to specialize nth element for strings. We present an analysis of its cost, measuring both the number of compar- isons and the number of swaps, under a random uniform distribution. Last but not least, we propose several enhanced variants, that can be also applied to multikey quicksort. Acknowledgements In the development of my PhD thesis, I have accumulated knowledge about several topics, fluency in giving talks, but also, and above all, an international vision and experience about (research) work and life. But in order to make this moment a reality, I have had to overcome several challenges of various kinds. And I could not have made it without the support of the people around me. These acknowledgements are dedicated to them. Firstly, I thank my advisors, Jordi Petit and Salvador Roura, for willing to advise my thesis. In particular, I am specially grateful for their useful advice and patience. Besides, I am also very grateful to Conrado Mart´ınez for counting with me in his research projects and for being very acces- sible. In addition, I thank the rest of ALBCOM members, especially Mar´ıa J. Serna, for including me into account into their research activities. In particular, I enjoyed being part of several organizing committees, as well as attending several courses and conferences abroad. I had the chance to share some of them with Amalia Duch and Mar´ıa J.
Recommended publications
  • An Efficient Evolutionary Algorithm for Solving Incrementally Structured
    An Efficient Evolutionary Algorithm for Solving Incrementally Structured Problems Jason Ansel Maciej Pacula Saman Amarasinghe Una-May O'Reilly MIT - CSAIL July 14, 2011 Jason Ansel (MIT) PetaBricks July 14, 2011 1 / 30 Our goal is to make programs run faster We use evolutionary algorithms to search for faster programs The PetaBricks language defines search spaces of algorithmic choices Who are we? I do research in programming languages (PL) and compilers The PetaBricks language is a collaboration between: A PL / compiler research group A evolutionary algorithms research group A applied mathematics research group Jason Ansel (MIT) PetaBricks July 14, 2011 2 / 30 The PetaBricks language defines search spaces of algorithmic choices Who are we? I do research in programming languages (PL) and compilers The PetaBricks language is a collaboration between: A PL / compiler research group A evolutionary algorithms research group A applied mathematics research group Our goal is to make programs run faster We use evolutionary algorithms to search for faster programs Jason Ansel (MIT) PetaBricks July 14, 2011 2 / 30 Who are we? I do research in programming languages (PL) and compilers The PetaBricks language is a collaboration between: A PL / compiler research group A evolutionary algorithms research group A applied mathematics research group Our goal is to make programs run faster We use evolutionary algorithms to search for faster programs The PetaBricks language defines search spaces of algorithmic choices Jason Ansel (MIT) PetaBricks July 14, 2011
    [Show full text]
  • A Trusted Mechanised Javascript Specification
    A Trusted Mechanised JavaScript Specification Martin Bodin Arthur Charguéraud Daniele Filaretti Inria & ENS Lyon Inria & LRI, Université Paris Sud, CNRS Imperial College London [email protected] [email protected] d.fi[email protected] Philippa Gardner Sergio Maffeis Daiva Naudžiunien¯ e˙ Imperial College London Imperial College London Imperial College London [email protected] sergio.maff[email protected] [email protected] Alan Schmitt Gareth Smith Inria Imperial College London [email protected] [email protected] Abstract sation was crucial. Client code that works on some of the main JavaScript is the most widely used web language for client-side ap- browsers, and not others, is not useful. The first official standard plications. Whilst the development of JavaScript was initially just appeared in 1997. Now we have ECMAScript 3 (ES3, 1999) and led by implementation, there is now increasing momentum behind ECMAScript 5 (ES5, 2009), supported by all browsers. There is the ECMA standardisation process. The time is ripe for a formal, increasing momentum behind the ECMA standardisation process, mechanised specification of JavaScript, to clarify ambiguities in the with plans for ES6 and 7 well under way. ECMA standards, to serve as a trusted reference for high-level lan- JavaScript is the only language supported natively by all major guage compilation and JavaScript implementations, and to provide web browsers. Programs written for the browser are either writ- a platform for high-assurance proofs of language properties. ten directly in JavaScript, or in other languages which compile to We present JSCert, a formalisation of the current ECMA stan- JavaScript.
    [Show full text]
  • BCL: a Cross-Platform Distributed Data Structures Library
    BCL: A Cross-Platform Distributed Data Structures Library Benjamin Brock, Aydın Buluç, Katherine Yelick University of California, Berkeley Lawrence Berkeley National Laboratory {brock,abuluc,yelick}@cs.berkeley.edu ABSTRACT high-performance computing, including several using the Parti- One-sided communication is a useful paradigm for irregular paral- tioned Global Address Space (PGAS) model: Titanium, UPC, Coarray lel applications, but most one-sided programming environments, Fortran, X10, and Chapel [9, 11, 12, 25, 29, 30]. These languages are including MPI’s one-sided interface and PGAS programming lan- especially well-suited to problems that require asynchronous one- guages, lack application-level libraries to support these applica- sided communication, or communication that takes place without tions. We present the Berkeley Container Library, a set of generic, a matching receive operation or outside of a global collective. How- cross-platform, high-performance data structures for irregular ap- ever, PGAS languages lack the kind of high level libraries that exist plications, including queues, hash tables, Bloom filters and more. in other popular programming environments. For example, high- BCL is written in C++ using an internal DSL called the BCL Core performance scientific simulations written in MPI can leverage a that provides one-sided communication primitives such as remote broad set of numerical libraries for dense or sparse matrices, or get and remote put operations. The BCL Core has backends for for structured, unstructured, or adaptive meshes. PGAS languages MPI, OpenSHMEM, GASNet-EX, and UPC++, allowing BCL data can sometimes use those numerical libraries, but are missing the structures to be used natively in programs written using any of data structures that are important in some of the most irregular these programming environments.
    [Show full text]
  • Sorting Algorithm 1 Sorting Algorithm
    Sorting algorithm 1 Sorting algorithm In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions: 1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order); 2. The output is a permutation, or reordering, of the input. Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956.[1] Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2004). Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and lower bounds. Classification Sorting algorithms used in computer science are often classified by: • Computational complexity (worst, average and best behaviour) of element comparisons in terms of the size of the list . For typical sorting algorithms good behavior is and bad behavior is .
    [Show full text]
  • Automated Fortran–C++ Bindings for Large-Scale Scientific Applications
    Automated Fortran–C++ Bindings for Large-Scale Scientific Applications Seth R Johnson HPC Methods for Nuclear Applications Group Nuclear Energy and Fuel Cycle Division Oak Ridge National Laboratory ORNL is managed by UT–Battelle, LLC for the US Department of Energy github.com/swig-fortran Overview • Introduction • Tools • SWIG+Fortran • Strategies • Example libraries 2 Introduction 3 How did I get involved? • SCALE (1969–present): Fortran/C++ • VERA: multiphysics, C++/Fortran • MPACT: hand-wrapped calls to C++ Trilinos 4 Project background • Exascale Computing Project: at inception, many scientific app codes were primarily Fortran • Numerical/scientific libraries are primarily C/C++ • Expose Trilinos solver library to Fortran app developers: ForTrilinos product 5 ECP: more exascale, less Fortran Higher-level { }Fortran ECP application codes over time (credit: Tom Evans) 6 Motivation • C++ library developers: expand user base, more F opportunities for development and follow-on funding • Fortran scientific app developers: use newly exposed algorithms and tools for your code C • Multiphysics project integration: in-memory coupling of C++ physics code to Fortran physics code • Transitioning application teams: bite-size migration from Fortran to C++ C++ 7 Tools 8 Wrapper “report card” • Portability: Does it use standardized interoperability? • Reusability: How much manual duplication needed for new interfaces? • Capability: Does the Fortran interface have parity with the C++? • Maintainability: Do changes to the C++ code automatically update
    [Show full text]
  • Metal C Programming Guide and Reference
    z/OS Version 2 Release 3 Metal C Programming Guide and Reference IBM SC14-7313-30 Note Before using this information and the product it supports, read the information in “Notices” on page 159. This edition applies to Version 2 Release 3 of z/OS (5650-ZOS) and to all subsequent releases and modifications until otherwise indicated in new editions. Last updated: 2019-02-15 © Copyright International Business Machines Corporation 1998, 2017. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents List of Figures...................................................................................................... vii List of Tables........................................................................................................ ix About this document.............................................................................................xi Who should read this document................................................................................................................. xi Where to find more information..................................................................................................................xi z/OS Basic Skills in IBM Knowledge Center.......................................................................................... xi How to read syntax diagrams......................................................................................................................xi How to send your comments to IBM......................................................................xv
    [Show full text]
  • Programming in Java
    Introduction to Programming in Java An Interdisciplinary Approach Robert Sedgewick and Kevin Wayne Princeton University ONLINE PREVIEW !"#$%&'(')!"*+,,,- ./01/23,,,0425,67 Publisher Greg Tobin Executive Editor Michael Hirsch Associate Editor Lindsey Triebel Associate Managing Editor Jeffrey Holcomb Senior Designer Joyce Cosentino Wells Digital Assets Manager Marianne Groth Senior Media Producer Bethany Tidd Senior Marketing Manager Michelle Brown Marketing Assistant Sarah Milmore Senior Author Support/ Technology Specialist Joe Vetere Senior Manufacturing Buyer Carol Melville Copyeditor Genevieve d’Entremont Composition and Illustrations Robert Sedgewick and Kevin Wayne Cover Image: © Robert Sedgewick and Kevin Wayne Page 353 © 2006 C. Herscovici, Brussels / Artists Rights Society (ARS), New York Banque d’ Images, ADAGP / Art Resource, NY Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trade- marks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps. The interior of this book was composed in Adobe InDesign. Library of Congress Cataloging-in-Publication Data Sedgewick, Robert, 1946- Introduction to programming in Java : an interdisciplinary approach / by Robert Sedgewick and Kevin Wayne. p. cm. Includes index. ISBN 978-0-321-49805-2 (alk. paper) 1. Java (Computer program language) 2. Computer programming. I. Wayne, Kevin Daniel, 1971- II. Title. QA76.73.J38S413 2007 005.13’3--dc22 2007020235 Copyright © 2008 Pearson Education, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.
    [Show full text]
  • Draft ANSI Smalltalk Standard
    NCITS J20 DRAFT December, 1997 of ANSI Smalltalk Standard revision 1.9 Draft American National Standard for Information Systems - Programming Languages - Smalltalk Notice This is a draft proposed American National Standard. As such, this is not a completed standard. The Technical Committee may modify this document as a result of comments received during public review and its approval as a standard. Permission is granted to members of NCITS, its technical committees, and their associated task groups to reproduce this document for the purposes of NCITS standardization activities without further permission, provided this notice is included. All other rights are reserved. Any commercial or for-profit reproduction is strictly prohibited. NCITS J20 DRAFT December, 1997 ii of ANSI Smalltalk Standard revision 1.9 Copyright Copyright 1997 National Committee for Information Technology Standards Permission is granted to duplicate this document for the purpose of reviewing the draft standard. NCITS J20 DRAFT December, 1997 iii of ANSI Smalltalk Standard revision 1.9 Table of Contents Page FORWARD vi 1. GOALS AND SCOPE .................................................................................................................... 5 2. CONFORMING IMPLEMENTATIONS AND PROGRAMS .................................................. 7 3. THE SMALLTALK LANGUAGE ............................................................................................... 8 3.1 COMPUTATIONAL MODEL OF SMALLTALK EXECUTION................................................................
    [Show full text]
  • How to Sort out Your Life in O(N) Time
    How to sort out your life in O(n) time arel Číže @kaja47K funkcionaklne.cz I said, "Kiss me, you're beautiful - These are truly the last days" Godspeed You! Black Emperor, The Dead Flag Blues Everyone, deep in their hearts, is waiting for the end of the world to come. Haruki Murakami, 1Q84 ... Free lunch 1965 – 2022 Cramming More Components onto Integrated Circuits http://www.cs.utexas.edu/~fussell/courses/cs352h/papers/moore.pdf He pays his staff in junk. William S. Burroughs, Naked Lunch Sorting? quicksort and chill HS 1964 QS 1959 MS 1945 RS 1887 quicksort, mergesort, heapsort, radix sort, multi- way merge sort, samplesort, insertion sort, selection sort, library sort, counting sort, bucketsort, bitonic merge sort, Batcher odd-even sort, odd–even transposition sort, radix quick sort, radix merge sort*, burst sort binary search tree, B-tree, R-tree, VP tree, trie, log-structured merge tree, skip list, YOLO tree* vs. hashing Robin Hood hashing https://cs.uwaterloo.ca/research/tr/1986/CS-86-14.pdf xs.sorted.take(k) (take (sort xs) k) qsort(lotOfIntegers) It may be the wrong decision, but fuck it, it's mine. (Mark Z. Danielewski, House of Leaves) I tell you, my man, this is the American Dream in action! We’d be fools not to ride this strange torpedo all the way out to the end. (HST, FALILV) Linear time sorting? I owe the discovery of Uqbar to the conjunction of a mirror and an Encyclopedia. (Jorge Luis Borges, Tlön, Uqbar, Orbis Tertius) Sorting out graph processing https://github.com/frankmcsherry/blog/blob/master/posts/2015-08-15.md Radix Sort Revisited http://www.codercorner.com/RadixSortRevisited.htm Sketchy radix sort https://github.com/kaja47/sketches (thinking|drinking|WTF)* I know they accuse me of arrogance, and perhaps misanthropy, and perhaps of madness.
    [Show full text]
  • Sorting Algorithm 1 Sorting Algorithm
    Sorting algorithm 1 Sorting algorithm A sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) which require input data to be in sorted lists; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions: 1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order); 2. The output is a permutation (reordering) of the input. Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956.[1] Although many consider it a solved problem, useful new sorting algorithms are still being invented (for example, library sort was first published in 2006). Sorting algorithms are prevalent in introductory computer science classes, where the abundance of algorithms for the problem provides a gentle introduction to a variety of core algorithm concepts, such as big O notation, divide and conquer algorithms, data structures, randomized algorithms, best, worst and average case analysis, time-space tradeoffs, and upper and lower bounds. Classification Sorting algorithms are often classified by: • Computational complexity (worst, average and best behavior) of element comparisons in terms of the size of the list (n). For typical serial sorting algorithms good behavior is O(n log n), with parallel sort in O(log2 n), and bad behavior is O(n2).
    [Show full text]
  • Sorting Using Bitonic Network with CUDA
    Sorting using BItonic netwoRk wIth CUDA Ranieri Baraglia Gabriele Capannini ISTI - CNR ISTI - CNR Via G. Moruzzi, 1 Via G. Moruzzi, 1 56124 Pisa, Italy 56124 Pisa, Italy [email protected] [email protected] Franco Maria Nardini Fabrizio Silvestri ISTI - CNR ISTI - CNR Via G. Moruzzi, 1 Via G. Moruzzi, 1 56124 Pisa, Italy 56124 Pisa, Italy [email protected] [email protected] ABSTRACT keep them up to date. Storage space must be used efficiently Novel \manycore" architectures, such as graphics processors, to store indexes, and documents. The indexing system must are high-parallel and high-performance shared-memory ar- process hundreds of gigabytes of data efficiently. Queries chitectures [7] born to solve specific problems such as the must be handled quickly, at a rate of hundreds to thousands graphical ones. Those architectures can be exploited to per second. All these services run on clusters of homoge- solve a wider range of problems by designing the related neous PCs. PCs in these clusters depends upon price, CPU algorithm for such architectures. We present a fast sorting speed, memory and disk size, heat output, and physical size algorithm implementing an efficient bitonic sorting network. [3]. Nowadays these characteristics can be find also in other This algorithm is highly suitable for information retrieval commodity hardware originally designed for specific graph- applications. Sorting is a fundamental and universal prob- ics computations. Many special processor architectures have lem in computer science. Even if sort has been extensively been proposed to exploit data parallelism for data intensive addressed by many research works, it still remains an inter- computations and graphics processors (GPUs) are one of esting challenge to make it faster by exploiting novel tech- those.
    [Show full text]
  • Insights Into the C++ Standard Library Pavel Novikov @Cpp Ape R&D Align Technology What’S Ahead
    Под капотом стандартной библиотеки C++ Insights into the C++ standard library Pavel Novikov @cpp_ape R&D Align Technology What’s ahead • use std::vector by default instead of std::list • emplace_back and push_back • prefer to use std::make_shared • avoid static objects with non-trivial constructors/destructors • small string optimization • priority queue and heap algorithms • sorting and selection • guarantee of strengthened worst case complexity O(n log(n)) for std::sort • when to use std::sort, std::stable_sort, std::partial_sort, std::nth_element • use unordered associative containers by default • beware of missing the variable name 2 using GadgetList = std::list<std::shared_ptr<Gadget>>; GadgetList findAllGadgets(const Widget &w) { GadgetList gadgets; for (auto &item : w.getAllItems()) if (isGadget(item)) gadgets.push_back(getGadget(item)); return gadgets; } void processGadgets(const GadgetList &gadgets) { for (auto &gadget : gadgets) processGadget(*gadget); } 3 using GadgetList = std::list<std::shared_ptr<Gadget>>; GadgetList findAllGadgets(const Widget &w) { GadgetList gadgets; for (auto &item : w.getAllItems()) if (isGadget(item)) gadgets.push_back(getGadget(item)); return gadgets; } void processGadgets(const GadgetList &gadgets) { for (auto &gadget : gadgets) processGadget(*gadget); } is this reasonable use of std::list? 3 std::list No copy/move while adding elements. Memory allocation for each element. • expensive • may fragment memory • in general worse element locality compared to std::vector 4 std::vector std::vector<T,A>::push_back() has amortized O 1 complexity std::vector<int> v; // capacity=0 v.push_back(1); // allocate, capacity=3 v.push_back(2); v.push_back(3); v.push_back(4); //alloc, move 3 elements, capacity=6 v.push_back(5); 5 6 std::vector 푐 — initial capacity 푚 — reallocation factor To push back 푁 elements we need log푚 푁/푐 reallocations.
    [Show full text]