Slides for the Lecture "Sortierverfahren" (Sorting Methods)

Sorting Methods: Introduction
Lutz Wegner, 17 April 2014

Contents

1  Introduction: Why sorting methods? Sorting and searching as an algorithmic challenge
2  Model assumptions: A realistic model; open questions
3  Internal vs. external sorting: internal sorting (main-memory sort); external sorting (disk sort)
4  Quality criteria: asymptotic and measured running times; extra space; stability and smoothness; practical considerations; taxonomy
5  Course credit: your assignment

Why sorting methods?

According to a study from the 1980s, mainframes spend 25 % of their CPU time on sorting. As a motivating example, consider join and projection in databases:

  select STUDENTS.NAME, COURSE, DEPT
  from   SIGN_UP, STUDENTS
  where  SIGN_UP.NAME = STUDENTS.NAME

Table 1: SIGN_UP             Table 2: STUDENTS

  NAME    COURSE               NAME    SEM  DEPT
  Jones   ALG+DS               Smith   1    CSC
  Smith   DB1                  Miller  5    MATH
  Miller  ALG+DS               Jones   1    MATH
  Smith   PROG1                Clark   3    CSC
  Clark   DB1
  Jones   PROG1

Table 3: SIGN_UP ⋈ STUDENTS

  NAME    COURSE   DEPT
  Jones   ALG+DS   MATH
  Smith   DB1      CSC
  Miller  ALG+DS   MATH
  Smith   PROG1    CSC
  Clark   DB1      CSC
  Jones   PROG1    MATH

The database example also hints at how important the detection and elimination of duplicates is, e.g. after a projection onto NAME in the result above when the SQL statement contains a DISTINCT clause.

  "Indeed, I believe that virtually every important aspect of programming arises somewhere in the context of sorting and searching!" [Knu98, p. v]

Sorting is an ideal field of application for
- algorithm design and software engineering,
- studies of efficiency and runtime behaviour (performance evaluation),
- concrete complexity theory.

The topic had its first heyday in the fifties and sixties, as these references show:
- ACM Symp. on Sorting, Nov. 1962, and CACM, May 1963
- Knuth's bibliography in Computing Reviews 13 (1972)

but it has been revived again and again, e.g. in the survey issue
- IEEE TC, special issue on sorting, April 1985.

Realistic model assumptions

Knuth [Knu98, p. 4] proposes the following model. Given are N records (objects, items) R_1, R_2, ..., R_N which together form a file. Each record R_j has a key K_j that determines the sort order. In addition, a record may carry further data (so-called "satellite information").

One then defines an ordering relation "<" over the key space with the property that for any three key values a, b, c:
- exactly one of a < b, a = b, b < a holds (law of trichotomy);
- if a < b and b < c, then also a < c (law of transitivity).

Thus a linear (total) order is assumed.

Open questions

The goal of sorting is to find a permutation p(1) p(2) ... p(N) of the records under which the keys are in non-decreasing order:

  K_p(1) ≤ K_p(2) ≤ ... ≤ K_p(N).

The following points are left open:
- How are the records delivered (tape, disk, on-line)?
- How many records are there, and is their number known in advance?
- Is the range of key values known (bounded)?
- Which operations are allowed (only key comparisons, or also arithmetic operations)?
- How are the records rearranged?
- Is there a known initial order?
- How much extra space is available?

Internal versus external sorting

Traditionally, two sub-models are distinguished:
- sorting an array (a field addressable by index) in main memory;
- sorting a disk file.

Internal sorting (main-memory sorting)

This model assumes a data structure similar to the one given below in PASCAL notation.

  TYPE item     = RECORD key : integer; info : infotype END;
       sequence = ARRAY[1..N] OF item;
  VAR  a : sequence;

We now want to write a procedure XYZ-Sort(VAR s : sequence); which, when called as XYZ-Sort(a), rearranges the array a so that afterwards a[i].key ≤ a[i+1].key for all i (1 ≤ i < N).
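For concreteness, here is a minimal sketch of one possible such procedure, using straight selection as a stand-in for XYZ-Sort. It assumes the declarations above (with the constant N and the type infotype supplied elsewhere) and is only an illustration, not the method the lecture commits to.

  PROCEDURE SelectionSort(VAR s : sequence);
  (* Sketch: in pass i the smallest remaining key is brought to position i, *)
  (* so s[1..i] is sorted and final after that pass.                        *)
  VAR i, j, min : integer;
      t         : item;
  BEGIN
    FOR i := 1 TO N - 1 DO
    BEGIN
      min := i;                                 (* position of the smallest key in s[i..N] *)
      FOR j := i + 1 TO N DO
        IF s[j].key < s[min].key THEN
          min := j;
      t := s[i];  s[i] := s[min];  s[min] := t  (* exchange the two records *)
    END
  END;

After a call SelectionSort(a) the array satisfies a[i].key ≤ a[i+1].key for 1 ≤ i < N; straight selection always performs N(N-1)/2 key comparisons, independent of the initial order.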
Model assumptions for main-memory sorting

The assumptions in this model are:
- Records can be exchanged, which may require equal, fixed record lengths.
- Every record is accessed at the same cost (uniform cost model).
- The order is determined by key comparisons only.
- All records fit into main memory.
- In the simple case, all aspects of parallelism are also left aside.

External sorting (disk sorting)

The term "disk file" refers to all files on external media with random access to individual records, in contrast to tapes or streams. The corresponding data structure now looks as follows (again in PASCAL notation).

  TYPE page   = ARRAY[1..s] OF item;
       file   = ARRAY[1..M] OF page;
       buffer = ARRAY[1..k] OF page;
  VAR  f : file;

Model assumptions for disk sorting

The assumptions for a meaningful discussion reflect the classical memory hierarchy:
- k ≪ M, and k is fixed during execution;
- the page size is fixed, e.g. 4 KB;
- the file does not fit into main memory;
- access is not equally expensive for all pages on the disk;
- input/output times dominate the total execution time (with exceptions);
- within a page the records are linearly ordered;
- the pages occupied by the file are not necessarily consecutive and linearly ordered within the cylinders;
- the cylinders occupied by the file are not necessarily consecutive and linearly ordered;
- page replacement can be controlled by the application.

Other models

Other models worth studying are:
- sorting linked lists (chained nodes, internal sorting);
- sorting with parallel processing (VLSI, SIMD);
- sorting on networks (LAN, WAN);
- sorting on tapes (also bubble memory);
- main-memory sorting in a virtual address space (paged memory sorting).

Asymptotic and measured running times

Besides measuring the actual running times of a program for various inputs, which depend on the programming language and the machine, it is common to measure (analyse) performance in a more abstract way. One can count
- key comparisons,
- disk accesses,
- MIX instructions,
- elementary PASCAL, C, Java, ... instructions,
- loop iterations,
- record exchanges (moves)

in the best, average and worst case, for particular key ranges, different file sizes, degrees of presortedness, and multisets. The result is always written down as a function of the length n of the input, usually in "Big-Oh notation".

Big-Oh and Big-Omega notation

Let f(n) = O(g(n)) iff there are constants k and n_0 such that |f(n)| ≤ k · |g(n)| for all n > n_0. This gives an upper bound on the growth rate of the function which ignores multiplicative and additive constants, and it justifies the notion of an asymptotic running time.

A lower bound is stated in "Big-Omega notation":

  f(n) = Ω(g(n)) iff g(n) = O(f(n)).

More precisely, there is a constant k such that f(n) ≥ k · g(n) infinitely often.

Little-oh and Big-Theta notation

Less common is the "little-oh notation"

  f(n) = o(g(n)) iff lim_{n→∞} f(n)/g(n) = 0,

which states that f(n) grows asymptotically more slowly than g(n). In particular, we require that, when a bound f(n) = O(g(n)) is stated, no function h(n) = o(g(n)) is known with f(n) = O(h(n)); in other words, when we name an upper bound it should be the tightest one available.

Upper and lower bound together form the "Big-Theta notation":

  f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n)).
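A small worked example (an addition, not taken from the slides): the key-comparison count of the straight-selection sketch above, f(N) = N(N-1)/2, illustrates all of these notations.

  % requires amsmath; f(N) = N(N-1)/2 is the comparison count of straight selection
  \begin{align*}
    f(N) &= \tfrac{1}{2}N(N-1) \le \tfrac{1}{2}N^{2} \quad\text{for all } N > 0,
            \text{ hence } f(N) = O(N^{2}) \text{ with } k = \tfrac{1}{2},\ n_0 = 0,\\
    f(N) &= \tfrac{1}{2}N(N-1) \ge \tfrac{1}{4}N^{2} \quad\text{for all } N \ge 2,
            \text{ hence } N^{2} = O(f(N)), \text{ i.e. } f(N) = \Omega(N^{2}),\\
         &\text{so } f(N) = \Theta(N^{2}); \text{ moreover }
            \lim_{N\to\infty} f(N)/N^{3} = 0, \text{ i.e. } f(N) = o(N^{3}).
  \end{align*}

By the tightness convention just stated, one would therefore report f(N) = O(N²) rather than the weaker f(N) = O(N³).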
Measured running times I

Even if one succeeds in analysing the asymptotic running time ...
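As a minimal illustration of the measured side, the following sketch times one sorting run on pseudo-random keys. It assumes Free Pascal (SysUtils with GetTickCount64, the built-in Random), fixes an illustrative N, and reuses the straight-selection sketch from above; none of these choices is prescribed by the slides.

  PROGRAM MeasureSort;
  (* Sketch: wall-clock measurement of one sorting run on random keys. *)
  (* Free Pascal assumed; GetTickCount64 comes from SysUtils.          *)
  USES SysUtils;

  CONST N = 20000;                                 (* illustrative input size *)
  TYPE  infotype = integer;                        (* placeholder satellite data *)
        item     = RECORD key : integer; info : infotype END;
        sequence = ARRAY[1..N] OF item;
  VAR   a      : sequence;
        i      : integer;
        t0, t1 : QWord;

  PROCEDURE SelectionSort(VAR s : sequence);       (* same sketch as above *)
  VAR i, j, min : integer;
      t         : item;
  BEGIN
    FOR i := 1 TO N - 1 DO
    BEGIN
      min := i;
      FOR j := i + 1 TO N DO
        IF s[j].key < s[min].key THEN min := j;
      t := s[i];  s[i] := s[min];  s[min] := t
    END
  END;

  BEGIN
    Randomize;
    FOR i := 1 TO N DO a[i].key := Random(10000);  (* pseudo-random key values *)
    t0 := GetTickCount64;
    SelectionSort(a);
    t1 := GetTickCount64;
    writeln('N = ', N, ', elapsed ', t1 - t0, ' ms')
  END.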