An Intuitive Introduction to Data Structures, 2nd Edition
Brian Heinold
Department of Mathematics and Computer Science
Mount St. Mary’s University

©2019 Brian Heinold
Licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 Unported License

Preface

This book covers standard topics in data structures including running time analysis, dynamic arrays, linked lists, stacks, queues, recursion, binary trees, binary search trees, heaps, hashing, sets, maps, graphs, and sorting. It is based on Data Structures and Algorithms classes I’ve taught over the years. The first time I taught the course, I used a bunch of data structures and introductory programming books as reference, and while I liked parts of all the books, none of them approached things quite in the way I wanted, so I decided to write my own book.

I originally wrote this book in 2012. This version is substantially revised and reorganized from that earlier version.

My approach to topics is a lot more intuitive than it is formal. If you are looking for a formal approach, there are many books out there that take that approach. I’ve tried to keep explanations short and to the point.

When I was learning this material, I found that just reading about a data structure wasn’t helping the material to stick, so I would instead try to implement the data structure myself. By doing that, I was really able to get a sense for how things work. This book contains many implementations of data structures, but it is recommended that you either try to implement the data structures yourself or work along with the examples in the book.

There are a few hundred exercises. They are all grouped together in Chapter 12. It is highly recommended that you do some of the exercises in order to become comfortable with the data structures and to get better at programming.

If you spot any errors, please send me a note at [email protected].

Last updated: October 11, 2019.
Contents

1 Running times of algorithms
    1.1 Introduction
    1.2 Estimating the running times of algorithms
    1.3 Common running times
    1.4 Some notes about big O notation
    1.5 Logarithms and Binary Search

2 Lists
    2.1 Dynamic arrays
    2.2 Linked lists
    2.3 Working with linked lists
    2.4 More about linked lists
    2.5 Making the linked list class generic
    2.6 Lists in the Java Collections Framework

3 Stacks and Queues
    3.1 Introduction
    3.2 Implementing a stack
    3.3 Implementing a queue
    3.4 Stacks, queues, and deques in the Collections Framework
    3.5 Applications

4 Recursion
    4.1 Introduction
    4.2 Basic recursion examples
    4.3 More sophisticated uses of recursion
    4.4 The Sierpinski triangle
    4.5 Towers of Hanoi
    4.6 Working with recursion

5 Binary Trees
    5.1 Introduction
    5.2 Implementing a binary tree
    5.3 Binary trees and recursion
    5.4 More about binary trees

6 Binary Search Trees
    6.1 Introduction
    6.2 Adding and deleting things
    6.3 Implementing a BST
    6.4 A generic BST

7 Heaps
    7.1 Introduction
    7.2 Adding and removing things
    7.3 Running time and applications of heaps
    7.4 Implementing a heap

8 Sets and Hashing
    8.1 Sets
    8.2 Implementing a set with hashing
    8.3 More about hashing
    8.4 Sets in the Collections Framework
    8.5 Applications
    8.6 Set operations

9 Maps
    9.1 Introduction
    9.2 Maps in the Collections Framework
    9.3 Applications of maps

10 Graphs
    10.1 Introduction
    10.2 Graph data structures
    10.3 Implementing a graph class
    10.4 Searching
    10.5 Applications of searching
    10.6 More applications of graphs

11 Sorting
    11.1 Introduction
    11.2 Insertion sort
    11.3 Mergesort
    11.4 Quicksort
    11.5 Other sorts
    11.6 Comparison of sorting algorithms
    11.7 Sorting in Java

12 Exercises
    12.1 Chapter 1 Exercises
    12.2 Chapter 2 Exercises
    12.3 Chapter 3 Exercises
    12.4 Chapter 4 Exercises
    12.5 Chapter 5 Exercises
    12.6 Chapter 6 Exercises
    12.7 Chapter 7 Exercises
    12.8 Chapter 8 Exercises
    12.9 Chapter 9 Exercises
    12.10 Chapter 10 exercises
    12.11 Chapter 11 exercises

Index

Chapter 1
Running times of algorithms

1.1 Introduction

In computer science, a useful skill is to be able to look at an algorithm and predict roughly how fast it will run. Choosing the right algorithm for the job can make the difference between having something that takes a few seconds to run versus something that takes days or weeks. As an example, here are three different ways to sum up the integers from 1 to n.

1. Probably the most obvious way would be to loop from 1 to n and use a variable to keep a running total.
    long total = 0;
    for (long i=1; i<=n; i++)
        total += i;

2. If you know enough math, there is a formula 1 + 2 + ··· + n = n(n + 1)/2. So the sum could be done in one line, like below:

    long total = n*(n+1)/2;

3. Here is a bit of a contrived approach using nested loops.

    long total = 0;
    for (long i=1; i<=n; i++)
        for (long j=0; j<i; j++)
            total++;

Any of these algorithms would work fine if we are just adding up numbers from 1 to 100. But if we are adding up numbers from 1 to a billion, then the choice of algorithm would really matter. On my system, I tested out adding integers from 1 to a billion. The first algorithm took about 1 second to add up everything. The next algorithm returned the answer almost instantly. I estimate the last algorithm would take around 12 years to finish, based on its progress over the three minutes I ran it.

In the terminology of computer science, the first algorithm is an O(n) algorithm, the second is an O(1) algorithm, and the third is an O(n²) algorithm. This notation, called big O notation, is used to measure the running time of algorithms. Big O notation doesn’t give the exact running time, but rather it’s an order of magnitude estimation. Measuring the exact running time of an algorithm isn’t practical as so many different things like processor speed, amount of memory, what else is running on the machine, the version of the programming language, etc. can affect the running time. So instead, we use big O notation, which measures how an algorithm’s running time grows as some parameter n grows. In the example above, n is the integer we are summing up to, but in other cases it might be the size of an array or list, the number of items in a matrix, etc.

With big O, we usually only care about the dominant or most important