LAST CHANCE STATS Contents 1. Warm up 6 1.1. Proof of Pythagorean Theorem 6 1.2. Irrationality of √ 2 6 1.3. Euclid's Proof

LAST CHANCE STATS JOSEPH BOYD Contents 1. Warm Up6 1.1. Proof of Pythagoreanp Theorem6 1.2. Irrationality of 26 1.3. Euclid's Proof of the Infinitude of Primes7 1.4. The Potato Paradox7 1.5. The Monty Hall Problem7 1.6. Finding Hamlet in the Digits of π 8 2. Differential Calculus9 2.1. Binomial Theorem9 2.2. (-δ) Definition of a Limit 10 2.3. Differentiating From First Principles 11 2.4. Chain Rule 12 2.5. Product Rule 12 2.6. Rolle's Theorem 13 2.7. Mean Value Theorem 13 3. Linear Algebra 14 3.1. Analytic Form of Inverse 14 3.2. Positive-definite Matrices 15 3.3. Gaussian Elimination 16 3.4. Matrix Decompositions 16 4. Vector Calculus 18 4.1. Preliminary Results 18 4.2. Vector Calculus 20 5. Series 22 5.1. Geometric Series 22 5.2. Taylor Series 22 5.3. π as an Infinite Series 25 6. Infinite Products 27 6.1. Viète'sFormula 27 6.2. The Basel Problem 28 6.3. Wallis' Product 29 6.4. The Method of Eratosthenes 30 1 2 JOSEPH BOYD 6.5. The Euler Product Formula 30 7. Deriving the Golden Ratio 31 7.1. Continued Fractions 31 7.2. Fibonacci Sequences 32 7.3. The Golden Ratio 32 8. The Exponential Function 32 8.1. Deriving the Exponential Function 33 8.2. Other Solutions 34 8.3. Euler's Formula 35 9. Integral Transforms 36 9.1. The Laplace Transform 36 9.2. Fourier Series 37 9.3. The Fourier Transform 38 10. The Factorial Function 38 10.1. Stirling's Approximation 39 10.2. The Gamma Function 40 11. Fundamentals of Statistics 41 11.1. Philosophical Notions of Probability 41 11.2. Fundamental Rules 42 11.3. Bayes' Theorem 44 11.4. Quantiles 45 11.5. Moments 46 11.6. Sample Statistics 47 11.7. Covariance 48 11.8. Correlation 49 11.9. Transformation of Random Variables 49 11.10. Change of Variables 50 11.11. Monte Carlo Statistics 50 12. Common Discrete Probability Distributions 51 12.1. Uniform Distribution 51 12.2. Bernoulli Distribution 52 12.3. Binomial Distribution 52 13. Common Continuous Probability Distributions 52 13.1. Laplace Distribution 52 13.2. Gaussian Distribution 53 13.3. Chi-squared Distribution 54 13.4. Student's T-Distribution 55 14. Hypothesis Testing 55 14.1. Standard Error 55 14.2. Confidence Intervals 56 14.3. Degrees of Freedom 56 14.4. Hypothesis Testing 56 15. The Central Limit Theorem 58 LAST CHANCE STATS 3 15.1. Moment-Generating Functions 58 15.2. Proof of the Central Limit Theorem 58 16. The Law of Large Numbers 59 16.1. Markov Inequality 59 16.2. Chebyshev Inequality 60 16.3. Law of Large Numbers 60 17. Information Theory 60 17.1. Entropy 61 17.2. Kullback-Liebler Divergence 61 17.3. Mutual Information 62 18. Convex Optimisation 62 18.1. Convexity 62 18.2. Necessary and Sufficient Conditions for Optimality 64 18.3. Unconstrained Problems 64 18.4. Equality Constraints 65 18.5. The Method of Lagrange Multipliers 65 18.6. Inequality Constraints 66 18.7. KKT Conditions 66 19. Computability and Complexity 67 19.1. Decision Problems 67 19.2. The Halting Problem 67 19.3. Turing Machines 68 19.4. P versus NP 70 19.5. Complexity Classes 70 20. Discrete Optimisation 71 20.1. Integer Programming Problems 71 20.2. Solution Methods 72 21. Graph Theory 74 21.1. The Seven Bridges of Königsberg 74 21.2. The Three Cottages Problem 75 21.3. NP-complete Graph Problems 75 21.4. Graph Traversal 76 21.5. Shortest Path Problem 77 22. Introduction to Machine Learning 77 22.1. Learning 78 22.2. Training and Prediction 79 23. Bayes and the Beta-Binomial Model 82 24. Generative Classifiers 83 25. Multivariate Normal Distribution 86 25.1. Gaussian Discriminant Analysis 87 26. Linear Regression 88 26.1. Constructing a Loss Function 88 26.2. Method of Least Squares 89 4 JOSEPH BOYD 26.3. The Gram Matrix 90 26.4. Gradient Descent 90 26.5. Model Prediction 93 26.6. Maximum Likelihood Estimation 93 26.7. Non-linear Fits 94 26.8. Bayesian Linear Regression 94 26.9. Robust Linear Regression 95 27. The Simplex Algorithm 97 28. Logistic Regression 99 28.1. Binary Logistic Regression 100 28.2. Multinomial Logistic Regression 102 29. Online Learning and Perceptrons 103 29.1. Online learning 103 30. Neural Networks 104 30.1. Convolutional Neural Networks 107 30.2. Recurrent Neural Networks 111 31. Support Vector Machines 115 31.1. Kernels 115 31.2. Kernelised kNN 115 31.3. Matrix Inversion Lemma 116 31.4. Reformulating Ridge Regression 116 31.5. Decision Boundaries 116 31.6. Sequential Minimal Optimisation 119 31.7. Support Vector Machine Regression 120 32. Decision Trees 120 32.1. Learning Decision Trees 120 32.2. Random Forest 121 32.3. Boosting methods 122 33. Dimensionality Reduction and PCA 122 33.1. Derivation of PCA 122 33.2. Singular Value Decomposition 123 33.3. Multi-dimensional Scaling 124 33.4. t-distributed Stochastic Neighbour Embedding 124 34. Structured Sequence Learning 124 34.1. Hidden Markov Models 124 34.2. Conditional Random Fields 124 34.3. Kalman Filters 124 35. Unsupervised Machine Learning 125 35.1. K-means 125 35.2. Gaussian Mixture Models 126 36. Performance Metrics 127 36.1. ROC Curves 128 37. Computer Architecture 129 LAST CHANCE STATS 5 37.1. Hardware 129 37.2. Software 130 37.3. Digital Circuitry 131 37.4. Operating Systems 132 38. Image Analysis and Processing 134 38.1. Connectivity 135 38.2. Image Scaling 135 38.3. Image Compression 135 38.4. Labeling 136 38.5. Thresholding 136 38.6. Linear Filters 137 38.7. Non-linear Filters 138 38.8. Mathematical Morphology 138 38.9. Segmentation 141 38.10. Feature Extraction 141 38.11. Transformations 142 39. Useless Maths Facts 143 References 145 6 JOSEPH BOYD 1. Warm Up This section presents an assortment of mathematical problems, each embodying an idea that is useful or instructive to the material that follows. 1.1. Proof of Pythagorean Theorem. Consider a square with sides of length c rotated inside a larger square, such that the four corners of the smaller square meet a distinct edge of the larger at a point a units along that edge. The sides of the larger square are therefore of length a + b for some a; b > 0. The area of the four resulting right-angled triangles that 1 fill the empty space are 2 ab apiece. Thus, we have it that, 1 (a + b)2 = c2 + 4 · ab; 2 expanding to, a2 + 2ab + b2 = c2 + 2ab; and finally, a2 + b2 = c2: p p 1.2. Irrationality of 2. Suppose 2 is rational. Then there exist a; b 2 Z, with a > b and gcd(a; b) = 11 such that, a p = 2: b Consequently, a2 = 2: b2 This implies that a2 is even, and therefore that a is even. That is, a = 2k; for some k 2 Z. So, by substitution, 4k2 = 2: b2 Rearranging, b2 = 2k2: However, this now implies that b2 is even, hence b also, implying a common factor between a and b, though by assumption they have no commonp factors, and clearly we can repeat the argument ad infinitum. This contradiction proves 2 is irrational2. 1 a That is, b is an irreducible fraction. This can be obtained by cancelling out the common factors. Every integer has a unique prime factorisation due to the fundamental theorem of arithmetic. 2This is an example of a proof by contradiction. LAST CHANCE STATS 7 1.3. Euclid's Proof of the Infinitude of Primes. Take any finite set of primes, p1; p2; : : : ; pn. Then, the number, q = p1 × p2 × · · · × pn + 1, is either prime or not prime. If it is prime, we have generated a new prime not in our set. If it is not, it has a prime factor not in our set, as q ≡ 1 (mod pi) for i = 1; 2; : : : ; n. Using this method we can always generate a new prime3. It follows that there are infinitely many primes. 1.4. The Potato Paradox. A potato that is 99% water weighs 100g. It is dried until it is only 98% water. It now weighs only 50g. How can this be? The mistake people typically make is to assume there has been a 1% reduction in the volume of water, rather than a change to the water/non-water ratio. In fact, the ratio of water to non-water content has changed from 1 : 99 to 2 : 98 or 1 : 49. Since the non-water content of the potato has not changed, the water must have lost 50% of its volume, and so the potato now weighs half as much! 1.5. The Monty Hall Problem. This famous paradox, named after an American game show host, has the following problem statement: A game show presents you with the choice of selecting one of three doors. Two of the doors contain a goat (undesirable), whilst the third contains a car. Upon your choice, the game show host, who knows the contents of each of the doors, opens one of the two remaining doors to reveal a goat. He then gives you the option of either sticking with your initial choice, or switching to the one remaining door. Is it advantageous to switch doors? The answer is that it is advantageous, increasing your winning chances from 1=3 to 2=3. The essential realisation is that the host (Monty Hall) has knowledge of the door contents and will always reveal a goat, whether you have chosen the car or not.

Load more