Merge Sort

Roberto Hibbler
Dept. of Computer Science
Florida Institute of Technology
Melbourne, FL 32901
[email protected]

ABSTRACT

Given an array of elements, we want to arrange those elements into a sorted order. To sort those elements, we need to make comparisons between the individual elements efficiently. Merge Sort uses a divide-and-conquer strategy to sort an array efficiently while keeping the number of comparisons between array elements low. Our results show that for arrays with large numbers of elements, Merge Sort is more efficient than three other comparison sort algorithms: Bubble Sort[1], Insertion Sort[3], and Selection Sort[2]. Our theoretical evaluation shows that Merge Sort beats a quadratic time complexity, while our empirical evaluation shows that on average Merge Sort is 32 times faster than Insertion Sort[3], the most efficient of the three algorithms we compare against, over ten different data sets.

Keywords

Merge Sort, sorting, comparisons, Selection Sort[2], arrange

1. INTRODUCTION

The ability to arrange an array of elements into a defined order is very important in Computer Science. Sorting is heavily used by online stores, where the order in which services or items were purchased determines which orders can be filled and who receives their order first. Sorting is also essential for the database management systems used by banks and financial systems, such as the New York Stock Exchange, to track and rank the billions of transactions that go on in one day. Many algorithms provide a solution to sorting arrays, including Bubble Sort[1], Insertion Sort[3], and Selection Sort[2]. While these algorithms are programmatically correct, they are not efficient for arrays with a large number of elements and exhibit quadratic time complexity.

We are given an array of comparable values. We need to arrange these values into either ascending or descending order.

We introduce the Merge Sort algorithm. Merge Sort is a divide-and-conquer algorithm. It takes an array as input and divides that array into sub-arrays of single elements. A single element is already sorted, and so the elements are merged back into sorted arrays two sub-arrays at a time, until we are left with a final sorted array. We contribute the following:

1. We introduce the Merge Sort algorithm.
2. We show that theoretically Merge Sort has a worst-case time complexity better than $O(n^2)$.
3. We show that empirically Merge Sort is faster than Selection Sort[2] over ten data sets.

This paper discusses comparison sort algorithms related to the problem in Section 2, followed by the detailed approach of our solution in Section 3, the evaluation of our results in Section 4, and our final conclusion in Section 5.

2. RELATED WORK

The three algorithms that we will discuss are Bubble Sort[1], Selection Sort[2], and Insertion Sort[3]. All three are comparison sort algorithms, just like Merge Sort.

The Bubble Sort[1] algorithm works by repeatedly swapping adjacent array elements that are out of order until the array is in sorted order. Every iteration through the array places at least one element at its correct position. Although algorithmically correct, Bubble Sort[1] is inefficient for arrays with a large number of elements and has an $O(n^2)$ time complexity. Knuth also observed that while Bubble Sort[1] shares its worst-case time complexity with other prevalent sorting algorithms, it makes far more element swaps than they do, resulting in poor interaction with modern CPU hardware. We intend to show that Merge Sort needs to make on average fewer element swaps than Bubble Sort[1].

The Selection Sort[2] algorithm arranges array elements in order by first finding the minimum value in the array and swapping it with the element that currently occupies the minimum's correct position, which depends on how the array is being arranged. The process is then repeated with the second smallest value, and so on, until the array is sorted. This creates two distinct regions within the array: the part that is sorted and the part that has not been sorted. Selection Sort[2] improves on Bubble Sort[1] by not comparing all the elements in its unsorted part until it is time for that element to be placed into its sorted position. This makes Selection Sort[2] less affected by the input's order. However, it is no more efficient for arrays with a large number of elements. Even with the improvements, Selection Sort[2] still shares the same worst-case time complexity of $O(n^2)$. We intend to show that Merge Sort operates at a worst-case time complexity faster than $O(n^2)$.

The Insertion Sort[3] algorithm takes elements from the input array and places them in their correct position in a new array, shifting existing array elements as needed. Insertion Sort[3] improves over Selection Sort[2] by making only as many comparisons as it needs to determine the correct position of the current element, while Selection Sort[2] makes comparisons against each element in the unsorted part of the array. In the average case, Insertion Sort[3]'s time complexity is $O(n^2/4)$, but its worst case is $O(n^2)$, the same as Bubble Sort[1] and Selection Sort[2]. The tradeoff of Insertion Sort[3] is that on average more elements are moved, as array elements are shifted within the array with the addition of each new element. We intend to show that Merge Sort operates at an average-case time complexity faster than $O(n^2)$.

3. APPROACH

A large array with an arbitrary order needs to be arranged in ascending or descending order, either lexicographically or numerically. Merge Sort can solve this problem by using two key ideas.

The first key idea of Merge Sort is that a problem can be divided and conquered. The problem can be broken into smaller arrays, and those arrays can be solved. Second, once the array has been divided into halves, and those halves recursively halved into arrays of single elements, two sorted arrays can be merged into one sorted array, since a single-element array is already sorted. Refer to the following pseudocode:

Input – A: array of n elements
Output – array A sorted in ascending order

 1. proc mergesort(A: array)
 2.     var array left, right, result
 3.     if length(A)<=1
 4.         return(A)
 5.     var middle=length(A)/2
 6.     for each x in A up to middle
 7.         add x to left
 8.     for each x in A after middle
 9.         add x to right
10.     left=mergesort(left)
11.     right=mergesort(right)
12.     result=merge(left,right)
13.     return result

Input – left: array of m elements, right: array of k elements
Output – array result sorted in ascending order

14. proc merge(left: array, right: array)
15.     var array result
16.     while length(left) > 0 and length(right) > 0
17.         if first(left) <= first(right)
18.             append first(left) to result
19.             left=rest(left)
20.         else
21.             append first(right) to result
22.             right=rest(right)
23.     end while
24.     if length(left) > 0
25.         append left to result
26.     if length(right) > 0
27.         append right to result
28.     return result

As the pseudocode shows, after the array is broken up into a left half and a right half (lines 5-9), the two halves are divided recursively (lines 10-11) until each sub-array holds a single element. Then, the elements of the two halves are compared to determine how the two arrays should be arranged (lines 16-22). Should either half contain elements not yet added to the sorted array after the comparisons are made, the remainder is appended so no elements are lost (lines 24-27). In the following example, using the given input, the division of the array (Figure 1) and how the array is merged back into a sorted array (Figure 2) are illustrated.

Input – A: array of n elements (38 27 43 3 9 82 10 1)
Output – array A in ascending order

Figure 1: Shows the splitting of the input array into single element arrays.

Figure 2: Shows the merging of the single element arrays during the Merge Step.

As the example shows, array A is halved repeatedly until each sub-array holds only a single element; those single elements are then merged together until they form a single sorted array in ascending order.

4. EVALUATION

4.1 Theoretical Analysis

4.1.1 Evaluation Criteria

All comparison-based sorting algorithms count the comparisons of array elements as one of their key operations. The Merge Sort algorithm can therefore be evaluated by measuring the number of comparisons between array elements. Because comparison is the key operation, the number of comparisons made determines the overall efficiency of the algorithm. We intend to show that because Merge Sort makes fewer comparisons than Insertion Sort[3], the most efficient of the three algorithms discussed in Section 2, Merge Sort is the most efficient of the comparison sort algorithms considered here.

4.1.1.1 Merge Sort Case Scenarios

4.1.1.1.1 Worst Case

Merge Sort makes the element comparisons we want to measure during the merge step, where pairs of arrays are recursively merged into a single array. Merge Sort's worst case, depicted in Figure 3, is the scenario where, during each recursive call of the merge step, the two largest elements are located in different arrays. This forces the maximum number of comparisons to occur. In this case, the Merge Sort algorithm's efficiency can be represented by the number of comparisons made during each recursive call of the merge step, which is described by the following recurrence, where n denotes the array size and T(n) refers to the total number of comparisons in the merge step:

$$T(n) = 2T\left(\frac{n}{2}\right) + n - 1 \tag{1}$$

$$T(1) = 0 \tag{2}$$

[...] total number of comparisons for the worst case, we get the following equations:

$$T(n) = 2T\left(\frac{n}{2}\right) + \frac{n}{2} \tag{11}$$

$$T(n) = 2^{k}\,T\left(\frac{n}{2^{k}}\right) + \frac{kn}{2} \tag{12}$$

$$T(n) = n \cdot 0 + \frac{n}{2}\log_2 n \tag{13}$$

As before, equation (11) can be expanded to find a pattern; equation (12) then follows by substituting k for the number of expansions, and solving for k (with $n/2^{k} = 1$, so $k = \log_2 n$ and $T(1) = 0$) gives equation (13), which is the total number of comparisons for the best case of Merge Sort. This also results in a Big O time complexity of $O(n \log n)$, just like the worst case.
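The recurrences in equations (1), (2), and (11) can be checked numerically on small inputs. The following is a minimal Python sketch and not part of the evaluation itself: the function names `merge`, `merge_sort_count`, `worst_case_T`, and `best_case_T` are hypothetical, the merge follows the pseudocode of Section 3 but uses index cursors instead of `first`/`rest`, and the recurrence helpers assume that n is a power of two.

```python
def merge(left, right):
    """Merge two sorted lists; return (merged list, comparisons made)."""
    result = []
    comparisons = 0
    i = j = 0
    # Mirrors pseudocode lines 16-23: one element comparison per iteration.
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Mirrors pseudocode lines 24-27: append whatever remains, no comparisons.
    result.extend(left[i:])
    result.extend(right[j:])
    return result, comparisons


def merge_sort_count(a):
    """Return (sorted copy of a, total element comparisons in all merges)."""
    if len(a) <= 1:
        return list(a), 0
    middle = len(a) // 2
    left, c_left = merge_sort_count(a[:middle])
    right, c_right = merge_sort_count(a[middle:])
    merged, c_merge = merge(left, right)
    return merged, c_left + c_right + c_merge


def worst_case_T(n):
    """Equations (1)-(2): T(n) = 2T(n/2) + n - 1, T(1) = 0 (n a power of two)."""
    return 0 if n == 1 else 2 * worst_case_T(n // 2) + n - 1


def best_case_T(n):
    """Equation (11) with T(1) = 0: T(n) = 2T(n/2) + n/2 (n a power of two)."""
    return 0 if n == 1 else 2 * best_case_T(n // 2) + n // 2


if __name__ == "__main__":
    example = [38, 27, 43, 3, 9, 82, 10, 1]   # input array from Section 3
    print(merge_sort_count(example))           # sorted array and its comparison count
    print(merge_sort_count(list(range(8))))    # already-sorted input
    print(best_case_T(8), worst_case_T(8))     # bounds from the recurrences
```

For n = 8 the recurrences give 12 comparisons in the best case and 17 in the worst case. An already-sorted input of eight elements attains the lower bound, and the example array from Section 3 happens to attain the upper bound, since every merge in it runs until one element is left over.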