Efficient Algorithms for a Mesh-Connected Computer With

Total pages: 16 · File type: PDF, size: 1020 KB

Efficient Algorithms for a Mesh-Connected Computer with Additional Global Bandwidth

by Yujie An

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science and Engineering) in the University of Michigan, 2019.

Doctoral Committee: Professor Quentin F. Stout, Chair; Associate Professor Kevin J. Compton; Professor Seth Pettie; Professor Martin J. Strauss

Yujie An ([email protected]), ORCID iD: 0000-0002-2038-8992. © Yujie An 2019

Acknowledgments

My research was done with the help of many people. First I would like to thank my advisor, Quentin F. Stout, who introduced me to most of the topics discussed here and helped me throughout my Ph.D. career at the University of Michigan. I would also like to thank my thesis committee members, Kevin J. Compton, Seth Pettie and Martin J. Strauss, for their invaluable advice during the dissertation process. To my parents, thank you very much for your long-term support; none of my achievements could have been accomplished without your love. To all my best friends, thank you very much for keeping me happy throughout my life.

TABLE OF CONTENTS

Acknowledgments
List of Figures
List of Abbreviations
Abstract

1 Introduction
2 Preliminaries
  2.1 Our Model
  2.2 Related Work
3 Basic Operations
  3.1 Broadcast, Reduction and Scan
  3.2 Rotation
  3.3 Sparse Random Access Read/Write
  3.4 Optimality
4 Sorting and Selecting
  4.1 Sort
    4.1.1 Sort in the Worst Case
    4.1.2 Sort when Only k Values are Out of Order
    4.1.3 Splitter Sort
    4.1.4 Chain Rank
  4.2 Selection and Finding the Median
    4.2.1 Worst Case
    4.2.2 Expected Case
  4.3 Label Counting
5 Fast Fourier Transform
  5.1 Fast Fourier Transform
  5.2 Pattern Matching
6 Graph Algorithms
  6.1 Graph Algorithms with Matrix Input
  6.2 Graph Algorithms with Edge Input
  6.3 Graph Algorithms on a CR MCC
7 Computational Geometry
  7.1 Convexity
  7.2 Nearest Neighbor Problems
  7.3 Line Segment Intersection
8 Coarse-Grained Algorithms
  8.1 Broadcast, Reduction and Scan
  8.2 Sort
  8.3 Minimum Spanning Tree
  8.4 Convex Hull
9 Conclusion
  9.1 Future Work
Bibliography

LIST OF FIGURES

2.1 Space-filling curve ordering
2.2 Mesh with additional global bandwidth
2.3 Mesh augmented by global communication power
3.1 Segmented broadcast
3.2 Data requesting operation with n = 256 and = 1/2; after step 3, channel n/2 saves the range of channels that saves pi in the range [n/2, 3n/4)
4.1 Sort using multiple splitters
4.2 Sort when inputs are 1 quasi-sorted
4.3 Determine 2k values to take out
4.4 Splitter sort: put points into non-overlapping regions
4.5 Chain rank
5.1 The values of ω₈⁰, ω₈¹, …, ω₈⁷ in the complex plane
5.2 Directions of convolution and string matching
5.3 Image pattern matching: set blank areas in the pattern to be wildcards to count the red regions, and white color otherwise; the green region is an example of an invalid matching
6.1 Creating a linear chain from a tree
6.2 Game tree
6.3 A game tree that is hard to evaluate
6.4 Bridges in G; the number on each node represents its pre-order numbering
6.5 Articulation points in G; the number on each node represents its pre-order numbering
6.6 Biconnected component: graph transformation
7.1 A convex hull example
7.2 Line L separates S1 and S2
7.3 Nearest neighbor in a corner: p and q may have closer points outside the current vertical and horizontal slabs because currently d(p) > pc and d(q) > qc; this condition will be true for at most two points for each corner in each block
7.4 Voronoi partition
7.5 All squares that possibly contain a global closest dissimilar point from α
7.6 Modified version of Voronoi partition: intervals below with the same label must be ignored
7.7 New intervals generated: a points and b intervals can generate at most a + b intervals; intervals without labels above can have any label below
7.8 Updates in the Voronoi partitioning
7.9 More than two points are closest to a point on L: C must be in between A and B, and CD must go through O; but in this case CD cannot be the optimal answer, because both AD and BD are shorter than it
7.10 Sort line segments using partial-order sort; bold lines are in S1 and the others are in S2; the only intersection detected is I, and intersections between two line segments in S2 cannot be detected at this stage
7.11 Two visibility problems
7.12 Blocking visibility
8.1 Destination ordering
8.2 Shuffle the array

LIST OF ABBREVIATIONS

VLSI Very Large Scale Integration
PRAM Parallel Random Access Machine
ER Exclusive Read
EW Exclusive Write
CR Concurrent Read
CW Concurrent Write
ADT Abstract Data Type
SIMD Single Instruction Multiple Data
RAR Random Access Read
RAW Random Access Write
SRAR Sparse Random Access Read
SRAW Sparse Random Access Write
MST Minimum Spanning Tree
MSF Minimum Spanning Forest
DFT Discrete Fourier Transform
FFT Fast Fourier Transform
KMP Knuth-Morris-Pratt
HPC High Performance Computing

ABSTRACT

This thesis shows that adding additional global bandwidth to a mesh-connected computer can greatly improve its performance. The goal of this project is to design algorithms for mesh-connected computers augmented with limited global bandwidth, so that we can further enhance our understanding of the parallel/serial nature of problems on evolving parallel architectures. We do this by first solving several problems associated with fundamental data movement, then summarizing ways to resolve different situations one may observe in data movement in parallel computing. This can help us understand whether a problem is easily parallelizable on different parallel models.
We give efficient algorithms for several fundamental problems, including sorting, counting, the fast Fourier transform, finding a minimum spanning tree, finding a convex hull, etc. We show that adding a small amount of global bandwidth makes a practical design that combines aspects of the mesh and fully connected models to achieve the benefits of each. Most of the algorithms are optimal. For future work, we believe that algorithms with peak-power constraints can make our model well adapted to recent architectures in high-performance computing.

CHAPTER 1

Introduction

In recent years, advances in Very Large Scale Integration (VLSI) technology have provided increasingly powerful computers consisting of many thousands and potentially millions of processors. The layouts of cores and processors differ greatly from machine to machine, and they are changing very rapidly. For example, the PSC Bridges supercomputer has 28 cores on each node [1], and communication between nodes is expensive. This was not true for the same supercomputer system before 2016, let alone many other supercomputers. From a programmer's perspective, it would be ideal to write programs for a Parallel Random Access Machine (PRAM). In a PRAM, all processors have a small local memory but can communicate with a large shared memory in constant time. The power of such global communication can significantly reduce a program's complexity, though algorithms can still be extremely hard to implement. It is a quite powerful theoretical model that is capable of solving many problems in poly-logarithmic time (i.e., O(log^k n) time for some constant k) [2, 3, 4, 5]. However, the PRAM is not practical and cannot be scaled to large numbers of processors. Further, many of the optimal (in O-notation) algorithms
Recommended publications
  • Instructor's Manual
    Instructor’s Manual Vol. 2: Presentation Material CCC Mesh/Torus Butterfly !*#? Sea Sick Hypercube Pyramid Behrooz Parhami This instructor’s manual is for Introduction to Parallel Processing: Algorithms and Architectures, by Behrooz Parhami, Plenum Series in Computer Science (ISBN 0-306-45970-1, QA76.58.P3798) 1999 Plenum Press, New York (http://www.plenum.com) All rights reserved for the author. No part of this instructor’s manual may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording, or otherwise, without written permission. Contact the author at: ECE Dept., Univ. of California, Santa Barbara, CA 93106-9560, USA ([email protected]) Introduction to Parallel Processing: Algorithms and Architectures Instructor’s Manual, Vol. 2 (4/00), Page iv Preface to the Instructor’s Manual This instructor’s manual consists of two volumes. Volume 1 presents solutions to selected problems and includes additional problems (many with solutions) that did not make the cut for inclusion in the text Introduction to Parallel Processing: Algorithms and Architectures (Plenum Press, 1999) or that were designed after the book went to print. It also contains corrections and additions to the text, as well as other teaching aids. The spring 2000 edition of Volume 1 consists of the following parts (the next edition is planned for spring 2001): Vol. 1: Problem Solutions Part I Selected Solutions and Additional Problems Part II Question Bank, Assignments, and Projects Part III Additions, Corrections, and Other Updates Part IV Sample Course Outline, Calendar, and Forms Volume 2 contains enlarged versions of the figures and tables in the text, in a format suitable for use as transparency masters.
  • Quadratic Time Algorithms Appear to Be Optimal for Sorting Evolving Data
    Quadratic Time Algorithms Appear to be Optimal for Sorting Evolving Data

    Juan José Besa, William E. Devanny, David Eppstein, Michael T. Goodrich, Timothy Johnson
    Dept. of Computer Science, Univ. of California, Irvine, Irvine, CA 92697 USA

    Abstract: We empirically study sorting in the evolving data model. In this model, a sorting algorithm maintains an approximation to the sorted order of a list of data items while simultaneously, with each comparison made by the algorithm, an adversary randomly swaps the order of adjacent items in the true sorted order. Previous work studies only two versions of quicksort, and has a gap between the lower bound of Ω(n) and the best upper bound of O(n log log n).

    1.1 The Evolving Data Model: One scenario where the Knuthian model doesn't apply is for applications where the input data is changing while an algorithm is processing it, which has given rise to the evolving data model [1]. In this model, input changes are coming so fast that it is hard for an algorithm to keep up, much like Lucy in the classic conveyor-belt scene in the TV show I Love Lucy. Thus, rather than produce a single output, as in the Knuthian model,
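The evolving data model above can be illustrated with a toy simulation (this is not the paper's algorithm; all function names here are illustrative): repeated bubble passes maintain an approximation of the order while, after every comparison, an adversary swaps one random adjacent pair in the true order.

```python
import random

def inversions(seq, rank):
    # Kendall-tau distance between the algorithm's order and the true order.
    return sum(1 for i in range(len(seq)) for j in range(i + 1, len(seq))
               if rank[seq[i]] > rank[seq[j]])

def evolving_bubble(seq, rank, rng, passes):
    # Repeated bubble passes; after *every* comparison the adversary swaps
    # one random adjacent pair in the true order (mutating `rank`).
    by_rank = sorted(rank, key=rank.get)          # items in true sorted order
    for _ in range(passes):
        for i in range(len(seq) - 1):
            if rank[seq[i]] > rank[seq[i + 1]]:   # one comparison by the sorter...
                seq[i], seq[i + 1] = seq[i + 1], seq[i]
            j = rng.randrange(len(by_rank) - 1)   # ...one adversarial swap
            a, b = by_rank[j], by_rank[j + 1]
            rank[a], rank[b] = rank[b], rank[a]
            by_rank[j], by_rank[j + 1] = b, a
    return seq
```

Running many passes from a random permutation, the number of inversions drops from Θ(n²) toward the Θ(n) steady state the evolving-data literature describes: the sorter can track, but never finish.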
  • Fast Sorting for Exact OIT of Complex Scenes
    CGI2014 manuscript No. 86
    Fast Sorting for Exact OIT of Complex Scenes
    Pyarelal Knowles · Geoff Leach · Fabio Zambetta, June 2014

    Abstract: Exact order independent transparency (OIT) techniques capture all fragments during rasterization. The fragments are then sorted per-pixel by depth and composited in order using alpha transparency. The sorting stage is a bottleneck for high depth complexity scenes, taking 70–95% of the total time for those investigated. In this paper we show that typical shader-based sorting speed is impacted by local memory latency and occupancy. We present and discuss the use of both registers and an external merge sort in register-based block sort to better use the memory hierarchy of the GPU for improved OIT rendering performance. This approach builds upon backwards memory allocation, achieving an OIT rendering speed up to 1.7× that of the best previous method and 6.3× that of the common straightforward OIT implementation. In some cases the sorting stage is reduced to no longer be the dominant OIT component.

    Keywords: sorting · OIT · transparency · shaders · performance · registers · register-based block sort

    1 Introduction: …for example sorting per-object. This is because completely sorting per-triangle is expensive and difficult, requiring splitting triangles in cases of intersection or cyclical overlapping. Order-independent transparency (OIT) allows geometry to be rasterized in arbitrary order, sorting surfaces as fragments in image space. Our focus is exact OIT, for which sorting is central, although we note there are approximate OIT techniques which use data reduction and trade accuracy for speed [17, 13].

    Figure 1: Power plant model, with false colouring to show backwards memory allocation intervals.
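The exact-OIT pipeline the abstract describes — gather a pixel's fragments, sort by depth, blend in order — can be sketched in a few lines (a minimal single-channel sketch, not the paper's shader code; the function name is mine):

```python
def composite_pixel(fragments):
    # fragments: list of (depth, color, alpha); larger depth = farther away.
    # Sort back-to-front by depth, then apply the "over" blend at each step.
    out = 0.0  # background intensity (one channel for simplicity)
    for depth, color, alpha in sorted(fragments, key=lambda f: f[0], reverse=True):
        out = alpha * color + (1.0 - alpha) * out
    return out
```

Because the sort happens per-pixel after rasterization, the geometry may arrive in any order and the result is unchanged, which is exactly what makes the per-pixel sort the bottleneck at high depth complexity.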
  • Comparison Sorts Name Best Average Worst Memory Stable Method Other Notes Quicksort Is Usually Done in Place with O(Log N) Stack Space
    Comparison sorts (Name — best / average / worst comparisons — memory — stable — method — other notes):

    - Quicksort — n log n / n log n / n² — O(log n) — typically not stable — partitioning. Usually done in place with O(log n) stack space. Most in-place implementations are unstable, as stable in-place partitioning is more complex; stable versions exist. Naïve variants use an O(n) space array to store the partition; Sedgewick's variation has an n log n worst case. The variant using three-way (fat) partitioning takes O(n) comparisons when sorting an array of equal keys.
    - Merge sort — n log n in all cases — O(n) — stable — merging. Highly parallelizable (up to O(log n) using the Three Hungarians' Algorithm or, more practically, Cole's parallel merge sort) for processing large amounts of data.
    - In-place merge sort — worst case n log² n — O(1) — stable — merging. Can be implemented as a stable sort based on stable in-place merging.
    - Heapsort — n log n in all cases — O(1) — not stable — selection.
    - Insertion sort — n / n² / n² — O(1) — stable — insertion. O(n + d), where d is the number of inversions.
    - Introsort — n log n in all cases — O(log n) — not stable — partitioning & selection. Used in several STL implementations.
    - Selection sort — n² in all cases — O(1) — not stable — selection. Stable with O(n) extra space, for example using lists.
    - Timsort — n / n log n / n log n — O(n) — stable — insertion & merging. Makes n comparisons when the data is already sorted or reverse sorted.
    - Cubesort — n / n log n / n log n — O(n) — stable — insertion. Makes n comparisons when the data is already sorted or reverse sorted.
    - Shell sort — depends on the gap sequence; best known worst case is O(n log² n) — O(1) — not stable — insertion. Small code size, no use of call stack, reasonably fast; useful where memory is at a premium, such as embedded and older mainframe applications.
    - Bubble sort — n / n² / n² — O(1) — stable — exchanging. Tiny code size.
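The three-way (fat) partitioning note above is easy to see in code: keys equal to the pivot are grouped once and never recursed into, so an all-equal array finishes after a single O(n) partition (a simple out-of-place sketch, not an in-place implementation):

```python
def quicksort3(a):
    # Quicksort with three-way ("fat") partitioning: elements equal to
    # the pivot form the middle group and are excluded from recursion.
    if len(a) <= 1:
        return list(a)
    pivot = a[len(a) // 2]
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort3(less) + equal + quicksort3(greater)
```

On an array of equal keys, `less` and `greater` are empty, so the recursion terminates immediately — the O(n) comparisons claimed in the table.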
  • Analysis of Fast Insertion Sort
    Analysis of Fast Insertion Sort — A Technical Report presented to the faculty of the School of Engineering and Applied Science, University of Virginia, by Eric Thomas, April 24, 2021. On my honor as a University student, I have neither given nor received unauthorized aid on this assignment as defined by the Honor Guidelines for Thesis-Related Assignments. Eric Thomas. Technical advisor: Lu Feng, Department of Computer Science.

    Analysis of Fast Insertion Sort: Efficient Variants of Insertion Sort. Eric Thomas, University of Virginia.

    ABSTRACT: As a widespread problem in programming language and algorithm design, many sorting algorithms, such as Fast Insertion Sort, have been created to excel at particular use cases. While Fast Insertion Sort has been demonstrated to have a lower running time than Hoare's Quicksort in many practical cases, re-implementation is carried out to verify those results again in C++ as well as Python, which involves comparing the running times of Fast Insertion Sort, Merge Sort, Heapsort, and Quicksort on random and partially sorted arrays of different lengths. Furthermore, the

    …complexity of O(n) and the latter suffers from complicated implementation. According to Faro et al. [1], there does not seem to exist an iterative, in-place, and online variant of Insertion Sort that manages to have a worst-case time complexity of o(n²). Originating from Faro et al. [1], a family of Insertion Sort variants, referred to as Fast Insertion Sort, is the result of an endeavor to solve these problems.

    2 Related Work: After reviewing research by Mishra et al. [2], Yang et al.
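For reference, the classic insertion sort that Fast Insertion Sort sets out to improve is the following (a textbook baseline, not the report's variant): it is iterative, in-place, online, and stable, but Θ(n²) in the worst case.

```python
def insertion_sort(a):
    # Classic insertion sort: grow a sorted prefix one element at a time,
    # shifting larger keys right to open a slot. O(n^2) worst case,
    # O(n + d) where d is the number of inversions.
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```

Its O(n + d) behavior is why it excels on partially sorted arrays, the very inputs the report benchmarks.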
  • Designing Efficient Sorting Algorithms for Manycore Gpus
    Designing Efficient Sorting Algorithms for Manycore GPUs
    Nadathur Satish (University of California, Berkeley); Mark Harris and Michael Garland (NVIDIA Corporation)

    Abstract: We describe the design of high-performance parallel radix sort and merge sort routines for manycore GPUs, taking advantage of the full programmability offered by CUDA. Our radix sort is the fastest GPU sort reported in the literature, and is up to 4 times faster than the graphics-based GPUSort. It is also highly competitive with CPU implementations, being up to 3.5 times faster than comparable routines on an 8-core 2.33 GHz Intel Core2 Xeon system. Our merge sort is the fastest published comparison-based GPU sort and is also competitive with multi-core routines. To achieve this performance, we carefully design our algorithms to expose substantial fine-grained parallelism and decompose the computation into independent tasks that perform minimal global communication.

    …cores, typically augmented with vector units, are now commonplace, and there is every indication that the trend towards increasing parallelism will continue on towards "manycore" chips that provide far higher degrees of parallelism. GPUs have been at the leading edge of this drive towards increased chip-level parallelism for some time and are already fundamentally manycore processors. Current NVIDIA GPUs, for example, contain up to 128 scalar processing elements per chip [19], and in contrast to earlier generations of GPUs, they can be programmed directly in C using CUDA [20, 21]. In this paper, we describe the design of efficient sorting algorithms for such manycore GPUs using CUDA. The programming flexibility provided by CUDA and the current generation of GPUs allows us to consider a much broader
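The radix sort the paper parallelizes is, at its core, a sequence of stable counting-sort passes over the key digits. A sequential sketch of that LSD structure (the GPU version distributes each pass's histogram and scatter across thousands of threads; this sketch is illustrative, not the paper's implementation):

```python
def lsd_radix_sort(keys, bits=32, radix_bits=8):
    # LSD radix sort for non-negative integers: one stable bucket pass
    # per radix_bits-wide digit, from least to most significant.
    mask = (1 << radix_bits) - 1
    for shift in range(0, bits, radix_bits):
        buckets = [[] for _ in range(1 << radix_bits)]
        for k in keys:
            buckets[(k >> shift) & mask].append(k)   # stable scatter by digit
        keys = [k for b in buckets for k in b]       # gather in digit order
    return keys
```

Stability of each pass is what makes the digit-by-digit approach correct: ties on the current digit preserve the order established by the previous, less significant, digits.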
  • Thesis Submitted for the Degree of Doctor of Philosophy at the Australian National University
    Efficient Algorithms for Sorting and Synchronization. Andrew Tridgell. A thesis submitted for the degree of Doctor of Philosophy at The Australian National University, February 1999. Except where otherwise indicated, this thesis is my own original work. Andrew Tridgell, February 1999. Till min kära fru, Susan. Acknowledgments: The research that has gone into this thesis has been thoroughly enjoyable. That enjoyment is largely a result of the interaction that I have had with my supervisors, colleagues and, in the case of rsync, the people who have tested and used the resulting software. I feel very privileged to have worked with my supervisors, Richard Brent, Paul Mackerras and Brendan McKay. To each of them I owe a great debt of gratitude for their patience, inspiration and friendship. Richard, Paul and Brendan have taught me a great deal about the field of Computer Science by sharing with me the joy of discovery and investigation that is the heart of research. I would also like to thank Bruce Millar and Iain Macleod, who supervised my initial research in automatic speech recognition. Through Bruce, Iain and the Computer Sciences Laboratory I made the transition from physicist to computer scientist, turning my hobby into a career. The Department of Computer Science and Computer Sciences Laboratory have provided an excellent environment for my research. I spent many enjoyable hours with department members and fellow students chatting about my latest crazy ideas over a cup of coffee. Without this rich environment I doubt that many of my ideas would have come to fruition. The Australian Telecommunications and Electronics Research Board, the Commonwealth and the Australian National University were very generous in providing me with scholarship assistance.
  • In-Place Merging Algorithms
    In-Place Merging Algorithms. Denham Coates-Evelyn, Department of Computer Science, King's College, The Strand, London WC2R 2LS, U.K. ([email protected]), January 2004. Abstract: In this report we consider the problem of merging two sorted lists of m and n keys each in-place. We survey known techniques for this problem, focusing on correctness and the attributes of stability and practicality. We demonstrate a class of unstable in-place merge algorithms that uses block rearrangement and internal buffering that actually does not merge in the presence of sufficient duplicate keys of a given value. We show four relatively simple block sorting techniques that can be used to correct these algorithms. In addition, we show relatively simple and robust techniques that do a stable local block merge followed by a stable block sort to create a merge. Our internal merge is based on Kronrod's method of internal buffering and block partitioning. Using a block size of O(√(m + n)) we achieve complexity of no more than 1.5(m + n) + O(√(m + n) lg(m + n)) comparisons and 4(m + n) + O(√(m + n) lg(m + n)) data moves. Using a block size of O((m + n)/lg(m + n)) gives complexity of no more than m + n + o(m + n) comparisons and 5(m + n) + o(m + n) moves. Actual experimental results indicate 4.5(m + n) + o(m + n) moves. Our algorithm is stable except for the case of buffer permutation during the merge; its implementation is much less complex than the unstable algorithm given in [26], which does optimum comparisons and 3(m + n) + o(m + n) moves.
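As a baseline for the report's problem, the textbook stable merge uses m + n - 1 comparisons but O(m + n) extra space — the whole difficulty the report addresses is achieving this *in place* via Kronrod-style internal buffering and block rearrangement, which this sketch does not attempt:

```python
def stable_merge(xs, ys):
    # Stable two-way merge with O(m+n) extra space. On equal keys the
    # element from xs is taken first (strict '<' on ys), which is what
    # makes the merge stable when xs precedes ys in the original list.
    out, i, j = [], 0, 0
    while i < len(xs) and j < len(ys):
        if ys[j] < xs[i]:
            out.append(ys[j]); j += 1
        else:
            out.append(xs[i]); i += 1
    return out + xs[i:] + ys[j:]
```

The report's point is precisely that duplicate keys, handled trivially here by the strict comparison, are what break naive internal-buffering schemes when the buffer's elements collide with equal keys in the data.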
  • Sorting by Generating the Sorting Permutation, and the Effect of Caching on Sorting
    Sorting by generating the sorting permutation, and the effect of caching on sorting. Arne Maus ([email protected]), Department of Informatics, University of Oslo. Abstract: This paper presents a fast way to generate the permutation p that defines the sorting order of a set of integer keys in an integer array a; that is, a[p[i]] is the i-th sorted element in ascending order. The motivation for using Psort is given along with its implementation in Java. This distribution-based sorting algorithm, Psort, is compared to two comparison-based algorithms, Heapsort and Quicksort, and two other distribution-based algorithms, Bucket sort and Radix sort. The relative performance of the distribution sorting algorithms for arrays larger than a few thousand elements is assumed to be caused by how well they handle caching (both level 1 and level 2). The effect of caching is investigated, and based on these results, more cache-friendly versions of Psort and Radix sort are presented and compared. Introduction: Sorting is maybe the single most important algorithm performed by computers, and certainly one of the most investigated topics in algorithmic design. Numerous sorting algorithms have been devised, and the more commonly used are described and analyzed in any standard textbook in algorithms and data structures [Dahl and Belsnes, Weiss, Goodrich and Tamassia] or in standard reference works [Knuth, van Leeuwen]. New sorting algorithms are still being developed, like Flashsort [Neubert] and "The fastest sorting algorithm" [Nilsson]. The most influential sorting algorithm introduced since the 1960s is no doubt the distribution-based 'Bucket' sort, which can be traced back to [Dobosiewicz].
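The defining property of the sorting permutation — a[p[i]] is the i-th element in ascending order — can be checked with a one-line comparison-based stand-in (Psort itself is distribution-based; this sketch only pins down the specification):

```python
def sorting_permutation(a):
    # Return p such that a[p[i]] is the i-th smallest element of a.
    # sorted() is stable, so equal keys keep their original order in p.
    return sorted(range(len(a)), key=lambda i: a[i])
```

Generating p instead of moving the records is useful when the records are large: the keys are permuted once, and the payload is gathered through p only at the end (or never).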
  • Chapter 7 Sorting
    CHAPTER 7: SORTING. All the programs in this file are selected from Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed, "Fundamentals of Data Structures in C", Computer Science Press, 1992.

    Sequential Search. Example: 44, 55, 12, 42, 94, 18, 06, 67. An unsuccessful search makes n+1 comparisons; a successful search makes (1/n) * sum_{i=1..n} i = (n+1)/2 comparisons on average.

    Program: Sequential search

        int seqsearch(int list[], int searchnum, int n)
        { /* search an array, list, that has n numbers.
             Return -1 if searchnum is not in the list */
          int i;
          for (i = 0; i < n && list[i] != searchnum; i++)
            ;
          return (i < n) ? i : -1;
        }

    Program: Binary search, O(log2 n)

        int binsearch(element list[], int searchnum, int n)
        { /* search list[0], ..., list[n-1] */
          int left = 0, right = n - 1, middle;
          while (left <= right) {
            middle = (left + right) / 2;
            switch (COMPARE(list[middle].key, searchnum)) {
              case -1: left = middle + 1; break;
              case  0: return middle;
              case  1: right = middle - 1;
            }
          }
          return -1;
        }

    Sorting Problem Definition: given (R0, R1, ..., R(n-1)), where Ri = key + data, find a permutation σ such that Kσ(i-1) <= Kσ(i) for 0 < i <= n-1. Sorted: K(i-1) <= K(i), 0 < i <= n-1. Stable: if i < j and Ki = Kj, then Ri precedes Rj in the sorted list. Internal sort vs. external sort. Criteria: number of key comparisons, number of data movements.

    Insertion Sort: Insertion sort is a simple sorting algorithm that is appropriate for small inputs. It is the most common sorting technique used by card players. The list is divided into two parts: sorted and unsorted. In each pass, the first element of the unsorted part is picked up, transferred to the sorted sublist, and inserted at the appropriate place.
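The slides' binary search translates directly into Python (a faithful rendering of the three-way COMPARE logic above, using the slides' example data):

```python
def binsearch(lst, searchnum):
    # Binary search over a sorted list: O(log2 n) comparisons.
    left, right = 0, len(lst) - 1
    while left <= right:
        middle = (left + right) // 2
        if lst[middle] < searchnum:      # COMPARE == -1
            left = middle + 1
        elif lst[middle] == searchnum:   # COMPARE == 0
            return middle
        else:                            # COMPARE == 1
            right = middle - 1
    return -1
```

Note the precondition that sequential search does not need: the input must already be sorted, which is what motivates the sorting problem defined in the same chapter.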
  • Sorting Algorithm
    From Wikipedia, the free encyclopedia. A sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) which require input data to be in sorted lists; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions: 1. The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order); 2. The output is a permutation (reordering) of the input. Further, the data is often taken to be in an array, which allows random access, rather than a list, which only allows sequential access, though often algorithms can be applied with suitable modification to either type of data. Since the dawn of computing, the sorting problem has attracted a great deal of research, perhaps due to the complexity of solving it efficiently despite its simple, familiar statement. For example, bubble sort was analyzed as early as 1956 [1]. Comparison sorting algorithms have a fundamental requirement of Ω(n log n) comparisons (some input sequences will require a multiple of n log n comparisons); algorithms not based on comparisons, such as counting sort, can have better performance. Although many consider sorting a solved problem — asymptotically optimal algorithms have been known since the mid-20th century — useful new algorithms are still being invented, with the now widely used Timsort dating to 2002, and the library sort being first published in 2006.
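Counting sort, mentioned above as an algorithm that sidesteps the Ω(n log n) comparison lower bound, runs in O(n + k) time for integer keys in range(k) and makes no comparisons at all:

```python
def counting_sort(keys, k):
    # Non-comparison sort for integer keys in [0, k): count occurrences,
    # then emit each key count times. O(n + k) time, O(k) extra space.
    counts = [0] * k
    for key in keys:
        counts[key] += 1
    out = []
    for key, c in enumerate(counts):
        out.extend([key] * c)
    return out
```

The trade-off is that k must be known and modest: for keys spanning a huge range, the O(k) counting array dominates and a comparison sort wins.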
  • Sorting Algorithms
    Lecture 15: Sorting. CSE 373: Data Structures and Algorithms, WI 19, Kasey Champion.

    Administrivia: Piazza! Homework: HW 5 Part 1 due Friday 2/22; HW 5 Part 2 out Friday, due 3/1; HW 3 regrade option due 3/1. Grades: HW 1, 2 & 3 grades into Canvas soon; Socrative EC status into Canvas soon; HW 4 grades published by 3/1; midterm grades published. Midterm stats.

    Types of Sorts. Comparison sorts: compare two elements at a time; a general sort that works for most types of elements. Elements must form a "consistent, total ordering": for every a, b and c in the list, (1) if a <= b and b <= a then a = b; (2) if a <= b and b <= c then a <= c; (3) either a <= b or b <= a is true. What does this mean? compareTo() works for your elements. Comparison sorts run at fastest O(n log(n)) time. Niche sorts (aka "linear sorts"): leverage specific properties about the items in the list to achieve faster runtimes; niche sorts typically run in O(n) time. In this class we'll focus on comparison sorts.

    Sort Approaches. In-place sort: a sorting algorithm is in-place if it requires only O(1) extra space to sort the array; it typically modifies the input collection; useful to minimize memory usage. Stable sort: a sorting algorithm is stable if any equal items remain in the same relative order before and after the sort. Why do we care? Sometimes we want to sort based on some, but not all, attributes of an item -
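The stability property defined in the slides is easy to see concretely: when sorting records on only one attribute, a stable sort preserves the original relative order of records with equal keys (illustrative data; Python's built-in sort is guaranteed stable):

```python
records = [("alice", 2), ("bob", 1), ("carol", 2), ("dave", 1)]

# Sort only on the numeric key. Stability means bob stays before dave
# and alice stays before carol, since that was their original order.
by_key = sorted(records, key=lambda r: r[1])
```

This is exactly the "sort on some, but not all, attributes" case: an unstable sort could legally return the equal-key records in either order.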