TOPIC 0 Introduction

TOPIC 0 Introduction

TOPIC 0 Introduction 1 Review Course: Math 535 (http://www.math.wisc.edu/~roch/mmids/) - Mathematical Methods in Data Science (MMiDS) Author: Sebastien Roch (http://www.math.wisc.edu/~roch/), Department of Mathematics, University of Wisconsin-Madison Updated: Sep 21, 2020 Copyright: © 2020 Sebastien Roch We first review a few basic concepts. 1.1 Vectors and norms For a vector ⎡ 푥 ⎤ ⎢ 1 ⎥ 푥 ⎢ 2 ⎥ 푑 퐱 = ⎢ ⎥ ∈ ℝ ⎢ ⋮ ⎥ ⎣ 푥푑 ⎦ the Euclidean norm of 퐱 is defined as ‾‾푑‾‾‾‾ ‖퐱‖ = 푥2 = √퐱‾‾푇‾퐱‾ = ‾⟨퐱‾‾, ‾퐱‾⟩ 2 ∑ 푖 √ ⎷푖=1 푇 where 퐱 denotes the transpose of 퐱 (seen as a single-column matrix) and 푑 ⟨퐮, 퐯⟩ = 푢 푣 ∑ 푖 푖 푖=1 is the inner product (https://en.wikipedia.org/wiki/Inner_product_space) of 퐮 and 퐯. This is also known as the 2- norm. More generally, for 푝 ≥ 1, the 푝-norm (https://en.wikipedia.org/wiki/Lp_space#The_p- norm_in_countably_infinite_dimensions_and_ℓ_p_spaces) of 퐱 is given by 푑 1/푝 ‖퐱‖ = |푥 |푝 . 푝 (∑ 푖 ) 푖=1 Here (https://commons.wikimedia.org/wiki/File:Lp_space_animation.gif#/media/File:Lp_space_animation.gif) is a nice visualization of the unit ball, that is, the set {퐱 : ‖푥‖푝 ≤ 1}, under varying 푝. There exist many more norms. Formally: 푑 푑 Definition (Norm): A norm is a function ℓ from ℝ to ℝ+ that satisfies for all 푎 ∈ ℝ, 퐮, 퐯 ∈ ℝ (Homogeneity): ℓ(푎퐮) = |푎|ℓ(퐮) (Triangle inequality): ℓ(퐮 + 퐯) ≤ ℓ(퐮) + ℓ(퐯) (Point-separating): ℓ(푢) = 0 implies 퐮 = 0. ⊲ The triangle inequality for the 2-norm follows (https://en.wikipedia.org/wiki/Cauchy– Schwarz_inequality#Analysis) from the Cauchy–Schwarz inequality (https://en.wikipedia.org/wiki/Cauchy– Schwarz_inequality). 푑 Theorem (Cauchy–Schwarz): For all 퐮, 퐯 ∈ ℝ |⟨퐮, 퐯⟩| ≤ ‖퐮‖2‖퐯‖2. 푑 The Euclidean distance (https://en.wikipedia.org/wiki/Euclidean_distance) between two vectors 퐮 and 퐯 in ℝ is the 2-norm of their difference 푑(퐮, 퐯) = ‖퐮 − 퐯‖2. Throughout we use the notation ‖퐱‖ = ‖퐱‖2 to indicate the 2-norm of 퐱 unless specified otherwise. 푑 We will often work with collections of 푛 vectors 퐱1 , … , 퐱푛 in ℝ and it will be convenient to stack them up into a matrix ⎡ 퐱푇 ⎤ ⎡ 푥 푥 ⋯ 푥 ⎤ ⎢ 1 ⎥ 11 12 1푑 푇 ⎢ ⎥ ⎢ 퐱2 ⎥ ⎢ 푥21 푥22 ⋯ 푥2푑 ⎥ 푋 = ⎢ ⎥ = ⎢ ⎥ , ⎢ ⋮ ⎥ ⎢ ⋮ ⋮ ⋱ ⋮ ⎥ 푇 ⎣ ⎦ ⎣ 퐱푛 ⎦ 푥푛1 푥푛2 ⋯ 푥푛푑 where 푇 indicates the transpose (https://en.wikipedia.org/wiki/Transpose): (Source (https://commons.wikimedia.org/wiki/File:Matrix_transpose.gif)) NUMERICAL CORNER In Julia, a vector can be obtained in different ways. The following method gives a row vector as a two-dimensional array. In [1]: # Julia version: 1.5.1 using LinearAlgebra, Statistics, Plots In [2]: t = [1. 3. 5.] Out[2]: 1×3 Array{Float64,2}: 1.0 3.0 5.0 To turn it into a one-dimensional array, use vec (https://docs.julialang.org/en/v1/base/arrays/#Base.vec). In [3]: vec(t) Out[3]: 3-element Array{Float64,1}: 1.0 3.0 5.0 To construct a one-dimensional array directly, use commas to separate the entries. In [4]: u = [1., 3., 5., 7.] Out[4]: 4-element Array{Float64,1}: 1.0 3.0 5.0 7.0 To obtain the norm of a vector, we can use the function norm (https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.norm) (which requires the LinearAlgebra package): In [5]: norm(u) Out[5]: 9.16515138991168 which we can check "by hand" In [6]: sqrt(sum(u.^2)) Out[6]: 9.16515138991168 The . above is called broadcasting (https://docs.julialang.org/en/v1.2/manual/mathematical-operations/#man- dot-operators-1). It applies the operator following it (in this case taking a square) element-wise. Exercise: Compute the inner product of 푢 = (1, 2, 3, 4) and 푣 = (5, 4, 3, 2) without using the function dot (https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.dot). Hint: The product of two real numbers 푎 and 푏 is a * b . In [7]: # Try it! u = [1., 2., 3., 4.]; # EDIT THIS LINE: define v # EDIT THIS LINE: compute the inner product between u and v ⊲ To create a matrix out of two vectors, we use the function hcat (https://docs.julialang.org/en/v1.2/base/arrays/#Base.hcat) and transpose. In [8]: u = [1., 3., 5., 7.]; v = [2., 4., 6., 8.]; X = hcat(u,v)' Out[8]: 2×4 Adjoint{Float64,Array{Float64,2}}: 1.0 3.0 5.0 7.0 2.0 4.0 6.0 8.0 With more than two vectors, we can use the reduce (https://docs.julialang.org/en/v1/base/collections/#Base.reduce-Tuple{Any,Any}) function. In [9]: u = [1., 3., 5., 7.]; v = [2., 4., 6., 8.]; w = [9., 8., 7., 6.]; X = reduce(hcat, [u, v, w])' Out[9]: 3×4 Adjoint{Float64,Array{Float64,2}}: 1.0 3.0 5.0 7.0 2.0 4.0 6.0 8.0 9.0 8.0 7.0 6.0 1.2 Multivariable calculus 1.2.1 Limits and continuity ‾‾‾푑‾‾‾‾‾2 푇 푑 Throughout this section, we use the Euclidean norm ‖퐱‖ = √∑푖=1 푥푖 for 퐱 = (푥1, … , 푥푑 ) ∈ ℝ . 푑 The open 푟-ball around 퐱 ∈ ℝ is the set of points within Euclidean distance 푟 of 퐱, that is, 푑 퐵푟(퐱) = {퐲 ∈ ℝ : ‖퐲 − 퐱‖ < 푟}. 푑 푑 A point 퐱 ∈ ℝ is a limit point (or accumulation point) of a set 퐴 ⊆ ℝ if every open ball around 퐱 contains an element 퐚 of 퐴 such that 퐚 ≠ 퐱. A set 퐴 is closed if every limit point of 퐴 belongs to 퐴. (Source (https://www.math212.com/2017/12/limit-point-closure.html)) 푑 푑 A point 퐱 ∈ ℝ is an interior point of a set 퐴 ⊆ ℝ if there exists an 푟 > 0 such that 퐵푟(퐱) ⊆ 퐴. A set 퐴 is open if it consists entirely of interior points. (Source (https://commons.wikimedia.org/wiki/File:Open_set_-_example.png)) 푑 푇 A set 퐴 ⊆ ℝ is bounded if there exists an 푟 > 0 such that 퐴 ⊆ 퐵푟(0), where 0 = (0, … , 0) . 푑 Definition (Limits of a Function): Let 푓 : 퐷 → ℝ be a real-valued function on 퐷 ⊆ ℝ . Then 푓 is said to have a limit 퐿 ∈ ℝ as 퐱 approaches 퐚 if: for any 휖 > 0, there exists a 훿 > 0 such that |푓(퐱) − 퐿| < 휖 for all 퐱 ∈ 퐷 ∩ 퐵훿(퐚) ∖ {퐚}. This is written as lim 푓(퐱) = 퐿. 퐱→퐚 ⊲ Note that we explicitly exclude 퐚 itself from having to satisfy the condition |푓(퐱) − 퐿| < 휖. In particular, we may have 푓(퐚) ≠ 퐿. We also do not restrict 퐚 to be in 퐷. 푑 Definition (Continuous Function): Let 푓 : 퐷 → ℝ be a real-valued function on 퐷 ⊆ ℝ . Then 푓 is said to be continuous at 퐚 ∈ 퐷 if lim 푓(퐱) = 푓(퐚). 퐱→퐚 ⊲ (Source (https://commons.wikimedia.org/wiki/File:Example_of_continuous_function.svg)) We will not prove the following fundamental analysis result, which will be useful below. See e.g. Wikipedia 푑 (https://en.wikipedia.org/wiki/Extreme_value_theorem). Suppose 푓 : 퐷 → ℝ is defined on a set 퐷 ⊆ ℝ . We ∗ ∗ say that 푓 attains a maximum value 푀 at 퐳 if 푓(퐳 ) = 푀 and 푀 ≥ 푓(퐱) for all 퐱 ∈ 퐷. Similarly, we say 푓 attains a minimum value 푚 at 퐳∗ if 푓(퐳∗ ) = 푚 and 푚 ≥ 푓(퐱) for all 퐱 ∈ 퐷. Theorem (Extreme Value): Let 푓 : 퐷 → ℝ be a real-valued, continuous function on a nonempty, closed, 푑 bounded set 퐷 ⊆ ℝ . Then 푓 attains a maximum and a minimum on 퐷. 1.2.2 Derivatives We begin by reviewing the single-variable case. Recall that the derivative of a function of a real variable is the rate of change of the function with respect to the change in the variable. Formally: Definition (Derivative): Let 푓 : 퐷 → ℝ where 퐷 ⊆ ℝ and let 푥0 ∈ 퐷 be an interior point of 퐷. The derivative of 푓 at 푥0 is ′ d푓(푥0) 푓(푥0 + ℎ) − 푓(푥0) 푓 (푥0) = = lim d푥 ℎ→0 ℎ provided the limit exists. ⊲ (Source (https://commons.wikimedia.org/wiki/File:Tangent_to_a_curve.svg)) Exercise: Let 푓 and 푔 have derivatives at 푥 and let 훼 and 훽 be constants. Show that [훼푓(푥) + 훽푔(푥)]′ = 훼푓 ′ (푥) + 훽푔′ (푥). ⊲ The following lemma encapsulates a key insight about the derivative of 푓 at 푥0 : it tells us where to find smaller values. Lemma (Descent Direction): Let 푓 : 퐷 → ℝ with 퐷 ⊆ ℝ and let 푥0 ∈ 퐷 be an interior point of 퐷 where ′ ′ 푓 (푥0) exists. If 푓 (푥0) > 0, then there is an open ball 퐵훿(푥0) ⊆ 퐷 around 푥0 such that for each 푥 in 퐵훿(푥0): (a) 푓(푥) > 푓(푥0) if 푥 > 푥0 , (b) 푓(푥) < 푓(푥0) if 푥 < 푥0 . ′ If instead 푓 (푥0) < 0, the opposite holds. ′ Proof idea: Follows from the definition of the derivative by taking 휖 small enough that 푓 (푥0) − 휖 > 0. ′ Proof: Take 휖 = 푓 (푥0)/2. By definition of the derivative, there is 훿 > 0 such that 푓(푥 + ℎ) − 푓(푥 ) 푓 ′ (푥 ) − 0 0 < 휖 0 ℎ for all 0 < ℎ < 훿. Rearranging gives ′ 푓(푥0 + ℎ) > 푓(푥0) + [푓 (푥0) − 휖]ℎ > 푓(푥0) by our choice of 휖. The other direction is similar. ◻ 푑 For functions of several variables, we have the following generalization. As before, we let 퐞푖 ∈ ℝ be the 푖-th standard basis vector. 푑 Definition (Partial Derivative): Let 푓 : 퐷 → ℝ where 퐷 ⊆ ℝ and let 퐱0 ∈ 퐷 be an interior point of 퐷.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    21 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us