<<

DD2325 Applied Programming and / DA701 Programming and Computer Science for Physicists Lecture 2 Complexity and sorting

Hedvig Kjellström, [email protected]

Based on the slides by H. Azizpour, I. Leite, A. Maki and C. Edlund Last Week

• Introduction to Matlab Syntax – definitions • Iteration and Recursion – Recursion: function calling itself • Stacks – Last-In-First-Out structure • Structures and Structure Arrays – Container for many variables • Search: Sequential and Binary – Continue today...

2 Today

• Complexity • Sorting • Good Programming Practice

3 Complexity

4 Complexity

• Complexity of an : – How the elapsed time (number of operations) varies with the number of elements n – How an algorithm “behaves” as the input gets larger

• Talk to your neighbor (2 min): – Why would it be interesting to know this? – Find examples of situations where it is relevant

5 Complexity

• Complexity analysis: – Finding a mathematical expression of the time/operations T needed for an algorithm to finish – Ignoring low-level details such as programming language, hardware, low-level instructions to CPU

1. Large n ☞ only most • Mathematical expression: n-dependent factor important n = number of datapoints 2. Ignoring low-level T = A + B*log(n) + C*n + D*n2 + ... detail ☞ constants A, B, C, D not important

ÞAsymptotic complexity (n → ∞) in Big-O notation: T = O(1), T = O(log(n)), T = O(n), etc

6 Complexity of Sequential Search (unordered) How many operations in the worst case? Algorithm (0 if not in list) Everyone discuss with their neighbor for 5 min function location = seqsearch(element, list) location = 0; 1 operation i = 0; 1 operation while location == 0 1 operation i = i + 1; Constant no. of operations if list(i) == element Constant no. of operations location = i; 1 operation Worst case: n times end % if end % while end % seqsearch What is the asymptotic complexity in Complexity O(n) Big-O notation? Everyone thinks silently for 1 min

7 Complexity of Binary Search (ordered) Algorithm (0 if not in list) How many operations in the worst case? Everyone discuss with their neighbor for 5 min function location = binsearch(element, list) middle = 1 + floor(length(list)/2); Constant no. of operations if list(middle) == element 1 operation location = middle; 1 operation elseif (middle > 1) & (list(middle) > element) Constant no. of operations location = binsearch(element, list(1:(middle-1)); Constant no. of operations elseif (middle < length(list)) & (list(middle) < element)Constant no. of operations location = middle + binsearch(element, list((middle+1):end);Constant no. of op if location == middle % element not in sublist 1 operation location = 0; 1 operation end % if else % element not in list location = 0; 1 operation What is the asymptotic complexity in end % if Big-O notation? end % binsearch Everyone thinks silently for 1 min

Complexity O(log2(n)) 8 Sorting

9 Sorting

• Put elements of a list in a certain order • Output: – a permutation (reordering) of the input list • Properties: – Complexity, stability, memory usage, ... – Today focus on complexity

• No single best ! – The choice of a sorting algorithm depends on n and application.

10 View list as two parts sorted, unsorted. 1. Take elements from unsorted one by one. Insert them in their correct order in sorted. Algorithm function sorted = inssort(list) sorted = list(1); unsorted = list(2:end); while length(unsorted) > 0 i = length(sorted); while sorted(i) > unsorted(1) i = i – 1; end % while if i == 0 What is the asymptotic complexity in sorted = [unsorted(1) sorted]; elseif i == length(sorted) Big-O notation? Worst, best case? sorted = [sorted unsorted(1)]; Everyone discuss with their neighbor for 5 min else sorted = [sorted(1:i) unsorted(1) sorted((i+1):end)]; end % if unsorted = unsorted(2:end); end % while end % inssort • Inefficient on large lists, efficient on almost sorted lists

2 Complexity O(n ) in worst case – but O(n) for almost sorted lists! 11 View list as two parts sorted, unsorted. 2. Select smallest from unsorted and append last to sorted in each step.

Algorithm Sequential search O(n) function sorted = selsort(list) since unsorted list sorted = []; unsorted = list; while length(unsorted) > 0 [min_val, min_i] = min(unsorted); unsorted(min_i) = unsorted(1); unsorted(1) = min_val; sorted = [sorted unsorted(1)]; unsorted = unsorted(2:end); end % while end % selsort • Inefficient on large lists What is the asymptotic complexity in Big-O notation? Complexity O(n2) Everyone discuss with their neighbor for 5 min

12 View list as two parts sorted, unsorted. 3. Start from the end of unsorted and let smaller elements bubble down to the beginning of Algorithm unsorted by pairwise swapping. Append first function sorted = bubsort(list) element from unsorted last to sorted in sorted = []; unsorted = list; each step. while length(unsorted) > 0 swaps = 0; for i = length(unsorted):(-1):2 if unsorted(i) < unsorted(i-1) swaps = 1; val = unsorted(i); unsorted(i) = unsorted(i-1); unsorted(i-1) = val; end % if end % for if swaps sorted = [sorted unsorted(1)]; unsorted = unsorted(2:end); else % both lists are sorted! sorted = [sorted unsorted]; unsorted = []; What is the asymptotic complexity in end % if end % while Big-O notation? Worst, best case? end % bubsort Everyone discuss with their neighbor for 5 min • Inefficient on large lists, efficient on almost sorted lists Complexity O(n2) in worst case – but O(n) for almost sorted lists! (Demuth, 1956)

13 Divide and Conquer

Algorithm Sort (list) • if length of list is 1 – the list is sorted • if length of list is greater than 1 – partition the list into a lower list and a higher list – Sort (lower list) – Sort (higher list) – combine the sorted lower and sorted higher lists

• Recursion!

14 Choose a pivot element. 4. Go through the rest of the list and put elements lower than pivot into lower, elements higher than pivot into higher. • Interactive demo: Apply Quicksort to lower and higher. http://me.dt.in.th/page/Quicksort/

What is the asymptotic complexity in Big-O notation? Everyone discuss with their neighbor for 5 min

Complexity O(n*log2(n))

(Hoare, 1960)

15 Other Sorting Algorithms

• Shell Sort – A development of Insertion Sort and Bubble Sort

– A development of Insertion Sort

• Mergesort – Divide-and-conquer-based just like Quicksort

– Sort in a fixed number of buckets (see Hashing, Lecture 6)

16 No Single Best Sorting Algorithm!

• Efficient implementations generally use a hybrid algorithm, combining e.g. Quicksort for the overall sort with Insertion Sort for small lists at the bottom of the recursion. • Data size is important: – Small datasets (n < 50): Insertion Sort is fast enough, but O(n2) – Large datsets (n > 1000): Quicksort 1.5 to 2 times faster than the others (in most cases) • ... but also how sorted the array is before the algorithm begins!

• Comparison: http://www.sorting-algorithms.com/

17 Good Programming Practice

18 Guidelines for Good Programming

• If code is repeated – write a function (subprogram) – modularization • Comment preconditions and postconditions (see Lecture 1 for a few examples). • Use comments! The code should be written so the reader understands how it works. Comments should explain why it works and what it does. – My tip is to write a comment directly after the funcion header

function output = name(input) % Preconditions and postconditions % Describe the output and input % Name and date can be nice!

19 Guidelines for Good Programming

• Use descriptive names on variables, functions/subprograms

• Write the code in a simple way, avoid too smart solutions that are hard to follow – Becomes important only when optimizing the code for speed – In the beginning of development, ease of understanding will help more (avoiding lengthy debugging etc.)

• Use global variables only when necessary – Makes debugging difficult

• Use constants or variables instead of using the value repeatedly – Makes it easier to change a certain value (only change in one place)

20 Guidelines for User Friendly Programming

The program should • be easy to use – Give instructions if necessary press h for help – Give example or default values Birthday (yyyymmdd)? – Relevant, correct and polite communication • have error messages that are – easy to understand Birthday format is not correct – instructive Birthday should be on the format: yyyymmdd

21 User Friendly Program?

Welcome to your horoscope teller! Fill in your personal data and the horoscope teller will cast your personal horoscope. Do you want to continue? Yes What’s your name? Alice Do you want to continue? Yes What is wrong? Make a list... Everyone discuss with their neighbor for 5 min Birthday? 20000403 Fill in correctly! Do you want to continue? Yes Birthplace? Stockholm Do you want to continue? Yes When were your parents born? 19720404 Do you want to continue? Yes

*************** YOUR HOROSCOPE ******************** Your horoscope couldn’t be cast due to lack of information.

22 Easy to Read? disp(’Exchange sek to euro and vice versa’) kr = input(’The amount is: ’); sek = input(’The currency is: ’,’s’); if strcmp(sek, ’sek’) What is wrong? Make a list... disp(kr*0.1067) Everyone discuss with their neighbor for 5 min else disp(kr*(1/0.1067)) end TF = strcmp(s1,s2) compares two strings for equality. The strings s1,s2 are considered to be equal if the size and content of each are the same (the comparison is case sensitive). The function returns a scalar logical 1/0 for equality/inequality.

23 Easy to Read? function [x, y] = pop(z) if length(z)==0 % x = []; elseif length(z)==1 % check if length equals one x = z(1); y = []; What is wrong? Make a list... else % fix the rest Everyone discuss with their neighbor for 5 min x = z(1); y = z(2:end); end Checks if there is no element, then if it’s one element, and at last if it’s more than one element Compare with the pop function and its description in lecture 1

24 Conclusion

• Today: – Complexity – Sorting – Good Programming Practice

• Next week: – C Intro – Scientific Computing Examples

25