Data Structures Libraries
Total Page:16
File Type:pdf, Size:1020Kb
Data Structures Libraries By: Leonor Frias Moya Advisors: Jordi Petit Silvestre and Salvador Roura Ferret PhD Thesis Departament de Llenguatges i Sistemes Inform`atics Universitat Polit`ecnica de Catalunya June 2010 This PhD Dissertation was publically defended on June 8th, 2010, in Barcelona. According to the judgement of the Jury, this PhD Dissertation obtained the qualification excellent cum laude and with the European mention. In the public defense, it was my honor to have the following PhD Jury: Prof. Dr. Conrado Mart´ınez, Universitat Polit`ecnica de Catalunya, Spain • Dr. Joaquim Gabarr´o, Universitat Polit`ecnica de Catalunya, Spain • Prof. Dr. Ulrich Meyer, Goethe Universit¨at Frankfurt am Main, Germany • Prof. Dr. Giuseppe Italiano, Universit`adi Roma ’Tor Vergata’, Italy • Dr. Ricardo Baeza-Yates, Universitat Pompeu Fabra, Spain • To whom I am (and I will always be) very grateful. Leonor Frias Moya I dedicate this work to my parents, Eusebio and Leonor Dedico este trabajo a mis padres, Eusebio y Leonor Resum Aquesta tesi doctoral estudia diversos problemes que sorgeixen quan es combina teoria i pr`actica en algor´ısmia, en particular, quan es consideren els requirements de les biblioteques de programari. La Standard Template Library (STL) del llenguatge de programaci´oC++ actua com a fil conductor en l’estudi de problemes fonamentals en algor´ısmia, com ara les seq¨u`encies, els diccionaris, l’ordenaci´o i la selecci´o. Per fer-ho, seguim aproximacions diverses per`ointerrelacionades, i que han estat parcialment influenciades pels canvis recents i radicals en l’arquitectura dels computadors. Per una banda, presentem diversos algorismes i estructures de dades que es fonamenten en les capacitats dels computadors moderns. Concretament, de les lists de la STL presentem imple- mentacions conscients de la mem`oria cau que, tot aprofitant les propietats d’aquestes jerarquies de mem`oria, aconsegueixen recorreguts molt eficients. Alhora, com tamb´e´es el cas de les implementa- cions de la STL amb llistes doblement enllac¸ades, compleixen amb els requeriments de cost de l’est`andard per a les operacions, aix´ıcom amb la validesa en tot moment dels iteradors. Tamb´epresentem diversos algorismes pr`actics per a processadors multi-core. Concretament, descrivim algorismes paral lels per a la construcci´oi la inserci´om´ultiple en arbres vermell-negre. Aquests algorismes es basen· en (a) l’acc´es en temps constant als elements usant el seu ´ındex, com ´es el cas tamb´edels algorismes en el model PRAM, (b) operacions d’uni´oi divisi´od’arbres, aix´ıcom (c) t`ecniques d’equilibri din`amic de la c`arrega paral lela. La nostra implementaci´oest´en la del com- pilador GCC per als diccionaris de la STL. Els resultats· experimentals en mostren la conveni`encia pr`actica. A m´es a m´es, per preprocessar l’entrada de les anteriors operacions eficientment, hem definit un algorisme online general per particionar seq¨u`encies de mida desconeguda. Aquest m`etode t´eun inter`es tant te`oric com pr`actic. Per altra banda, descrivim millores en algorismes, ja existents, basats en reusar part dels c`alculs previs. En primer lloc, presentem implementacions en paral lel de la partition de la STL. Aquests s´on els primers algorismes pr`actics que fan un nombre `opt· im de comparacions, ´es a dir, una per element. M´es endavant, mostrem la utilitat d’aquesta propietat quan els elements s´on cadenes de car`acters (strings) i es fan particions repetitivament i jer`arquica, com ara en el quicksort i el quick- select. Certament, mostrem grans millores de rendiment en combinar-los amb t`ecniques ja existents per evitar comparacions de car`acters redundants. Els nostres algorismes poden usar-se per especial- itzar el sort i el nth element de la STL, respectivament. Tamb´e, fem notar que algunes de les comparacions en aquests tipus d’algorismes i estructures de dades no accedeixen a les cadenes de car`acters pr`opiament dites. Si, com ´es habitual, els strings s’implementen mitjanc¸ant punters, aquest fet ´es rellevant si es considera l’´us de la mem`oria cau. En aquest sentit, quantifiquem anal´ıticament el nombre d’accessos a strings en arbres binaris de cerca, i tamb´een quicksort i quickselect modi- ficats d’aquesta manera. Tamb´eanalitzem els beneficis d’afegir redund`ancia a aquests algorismes i estructures de dades. Finalment, presentem i analitzem detalladament l’algorisme multikey quickselect, que ´es l’an`aleg de l’algorisme d’ordenaci´o multikey quicksort per al problema de la selecci´oper a strings. Aquest al- gorisme ´es eficient respecte el nombre de comparacions de car`acters, no necessita mem`oria auxiliar, i ´es f`acil d’implementar. A m´es a m´es, es pot usar per a especialitzar el nth element de la STL. La nostra an`alisi considera tant el nombre de comparacions com el d’intercanvis sota la hip`otesi d’una distribuci´ode claus aleat`oria uniforme. Addicionalment, proposem diverses variants per millorar l’efici`encia, que tamb´ees poden aplicar al multikey quicksort. Abstract This thesis studies several problems that arise from combining theory and practice in algorithmics, in particular, when considering requirements in software libraries. The well-known Standard Template Library (STL) of the C++ programming language is the common thread of this research. We have tackled fundamental problems in algorithmics, such as sorting and selection, sequences and dictio- naries. This aim has been achieved using varied, intertwined and evolving approaches, in particular, partially influenced by fast and radical changes in technology. Most of the contributions concern generic algorithms and data structures that take advantage of the processing capabilities of modern computers. First of all, we present cache-conscious imple- mentations of STL lists. Taking advantage of the properties of memory hierarchies, very good performance is obtained for traversal-based operations. But like typical STL doubly linked list im- plementations, they keep up with the required cost for operations as well as with the validity of iterators at all times. Besides, we present several practical algorithms for multi-core computers. Specifically, we describe parallel algorithms for the construction and insertion of many elements at a time into red- black trees. These algorithms use (a) constant-cost index computations to access the elements, as it is the case for some existing algorithms in the PRAM model, (b) split and union operations on trees, and (c) parallel load-balancing techniques. The implementation extends a red-black tree codification of STL dictionaries in the GCC compiler. The experiments show the practicability of this approach. In addition, motivated by efficiently preprocessing the input of the aforementioned bulk operations in dictionaries, we describe a general online algorithm for partitioning sequences whose size is unknown. Our algorithm is interesting from both a theoretical and practical perspective. Furthermore, we have devised new variants of algorithms by exploiting the reusing of com- putations. In particular, we consider the parallel partitioning of an array with respect to a pivot element. We present practical algorithms that can be used to implement STL partition. These are the first such algorithms that perform an optimal number of comparisons, i.e., one per element. Later, we show that this property is particularly useful when the elements are strings and parti- tioning is repetitively used and in a hierarchical way, as in quicksort and quickselect. Specifically, we show big performance improvements combining parallel quicksort and quickselect with existing techniques to avoid redundant character comparisons. Furthermore, the resulting algorithms can be used to specialize STL sort and nth element, respectively. Moreover, we highlight that some of the comparisons made in this kind of algorithms and data structures do not need to access the actual strings. Under the common assumption of a pointer string implementation, this is very relevant from a cache-conscious perspective. In this sense, we analytically quantify the number of string lookups in so modified binary search trees, quicksort and quickselect. Also, we analyze the benefits of adding some redundancy on the top of these algorithms and data structures. Finally, we present and analyze multikey quickselect, the analog of multikey quicksort for the selection problem for strings. This algorithm is efficient with respect to the number of character comparisons, it is in-place and it is easy to implement. In particular, it can be used to specialize nth element for strings. We present an analysis of its cost, measuring both the number of compar- isons and the number of swaps, under a random uniform distribution. Last but not least, we propose several enhanced variants, that can be also applied to multikey quicksort. Acknowledgements In the development of my PhD thesis, I have accumulated knowledge about several topics, fluency in giving talks, but also, and above all, an international vision and experience about (research) work and life. But in order to make this moment a reality, I have had to overcome several challenges of various kinds. And I could not have made it without the support of the people around me. These acknowledgements are dedicated to them. Firstly, I thank my advisors, Jordi Petit and Salvador Roura, for willing to advise my thesis. In particular, I am specially grateful for their useful advice and patience. Besides, I am also very grateful to Conrado Mart´ınez for counting with me in his research projects and for being very acces- sible. In addition, I thank the rest of ALBCOM members, especially Mar´ıa J. Serna, for including me into account into their research activities. In particular, I enjoyed being part of several organizing committees, as well as attending several courses and conferences abroad. I had the chance to share some of them with Amalia Duch and Mar´ıa J.