Analytic Geometry, Linear Algebra, Kernels, RKHS, and Native Spaces

John Erickson, IIT

“There are two things you don’t want to see get made: Sausage and Mathematics” -John Erickson (I might have heard this somewhere however.)

Note to the Reader

◼ This is the latest in a series of Mathematica notebooks that some people have taken to calling “The Erickson Kernel Notebooks”. By some people, I am referring to myself of course. I have used them in various forms in the Meshfree/Kernel research group here at IIT for the past two summers. As a result, I occasionally say things like “as we discussed...” or “recall that...” and end up referencing a concept that appears nowhere else in the current document. I am trying to remove all such references or replace them with something useful.

◼ If you find the material too elementary or the tone too pedantic at times, try not to judge too harshly that the author is an idiot. I would suggest waiting until the end to revisit this question and then form an opinion. Keep in mind that this was written partly to answer the question “Can I teach this stuff to a very talented high school student?” and was also inspired by the spirit of Arnold Ross, who taught us to “think deeply about simple things.”

◼ We urge the reader to play with the Manipulates in order to gain a feel for the subject. We understand that not everyone is in love with Mathematica, but the pdf version is but a mere shadow of the cdf version. While poetic and apparently hyperbolic (no pun intended), we mean this quite literally: the pdf is essentially the cdf stuck at the particular moment in time I happened to save it.

◼ Some severe limitations should be confessed to immediately: we will usually confine the discussion to positive definite kernels on a subset of the real line. This is mainly to keep the ideas, programs, and graphics as simple as possible.

◼ I have always enjoyed reading articles where the author freely gave his opinions about things, so I have done so. Take them with a grain of salt however and think for yourself. Finally, at times you might find yourself chuckling a bit. That’s because I’m a pretty funny guy, I think.

Introduction

An alternative title for this talk could have been “What are native spaces and what the heck is the native space norm measuring?” I used the above title because I think the best way to answer this question is to take a tour through a number of topics in mathematics ranging from the elementary to the more advanced.

We will endeavor to provide some insight into what exactly the native space norm of a positive definite kernel is measuring while staying as elementary as possible. The native space norm is clearly designed to be a means of measuring the size of functions which are linear combinations of our given kernel evaluated at points in our space. The norm has the benefit that it comes from an inner product, but it isn’t exactly clear what is being measured. In order to make things easier we will confine ourselves to dimension one and primarily to the native spaces determined by positive definite kernels, though we will mention a few conditionally positive definite kernels since they provide nice examples. Before proceeding, however, let’s ask a question.

A Question From Analytic Geometry

We all know from high school that if you take the unit circle centered at the origin and apply a nonsingular matrix A to it, you get an ellipse.
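As a quick illustration, here is a minimal sketch in this spirit (the matrix a is just an arbitrary nonsingular example and a stand-in for the Manipulates used elsewhere in these notebooks): it draws the unit circle together with its image under a.

a = {{2, 1}, {0, 1}};  (* any nonsingular 2 x 2 matrix will do *)
ParametricPlot[
 {{Cos[t], Sin[t]}, a . {Cos[t], Sin[t]}}, {t, 0, 2 Pi},
 PlotLegends -> {"unit circle", "image under a"}]

Wrapping the plot in Manipulate, with a supplied by a control, gives an interactive version: as a varies over nonsingular matrices, the image is always an ellipse (a circle counts).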
In high school you learn this in a different form as the parametric equations of an ellipse. We also know that every ellipse centered at the origin is essentially the solution set of a quadratic equation in two variables, which can also be defined by a matrix B. In high school you learn this when you learn about conic sections, and it involved a fair amount of trigonometry since you may not have known enough about matrices at the time.

So the question is the following: precisely formulate the above statement and find the relationship between the two matrices A and B. If you don’t know the answer and find it interesting, then think about it for a moment.

Remark: I actually did learn analytic geometry, though not this question, in high school, but my school had a strong math program. Unfortunately, since geometry appears to have been deprecated in the modern American high school curriculum (presumably so that students feel good about themselves), I find that many students have never had a full course in analytic geometry, which is a real pity since it makes teaching them multivariable calculus and linear algebra that much harder. Of course, this is my fault for “making it so confusing.”

Some Linear Algebra

We don’t intend to review all of linear algebra but simply state some results that should be familiar. You might even know these facts without realizing you know them. Incidentally, we will adopt the convention that vectors can be identified with column matrices. You should have learned that in finite dimensions, after choosing a basis...

◼ Every linear transformation L: ℝⁿ → ℝᵐ can be realized as a matrix: L(u) = Au.

◼ In particular, with m = 1, every linear functional L: ℝⁿ → ℝ can be realized as a row matrix, which can in turn be identified with a vector. In infinite dimensions this becomes one of the Riesz representation theorems for bounded linear functionals. It doesn’t seem reasonable that you can understand the latter result if you never realized it in finite dimensions. By the way, the set of all linear functionals on a finite dimensional vector space is also called the dual space. It is also a vector space, and in fact it is essentially the same as (isomorphic to) the original space due to the aforementioned correspondence.

◼ Every bilinear functional b: ℝⁿ × ℝᵐ → ℝ can be realized as the matrix product b(u, v) = uᵀ B v.

◼ In particular, every (real) inner product b: ℝⁿ × ℝⁿ → ℝ can be realized as b(u, v) = uᵀ B v where the matrix B is both symmetric and positive definite (in addition to b being bilinear already). Inner products are a very special type of kernel; in fact, in some sense they are the canonical type of kernel. We will denote the inner product using the “bra-ket” notation 〈u, v〉 = uᵀ B v.

◼ In a real inner product space, or pre-Hilbert space, we can define a norm on the space by taking the inner product of a vector with itself, i.e., ‖v‖² = 〈v, v〉 = vᵀ B v, where B is again a symmetric positive definite matrix. It is clear (and you should have seen) that an inner product induces a norm in this fashion. Conversely, given the norm derived from the inner product of a real inner product space, you can always recover the inner product using the following so-called “polarization identity”:

〈u, v〉 = (1/2) (‖u‖² + ‖v‖² − ‖u − v‖²)

This is, of course, nothing more than the law of cosines from high school trigonometry, since 〈u, v〉 = ‖u‖ ‖v‖ cos(θ). There is a similar result for complex inner product spaces. (A small sketch checking this identity appears at the end of this section.)

◼ You should know about eigenvalues and eigenvectors.
◼ You should know the spectral theorem: every symmetric operator on a space of dimension n has n real eigenvalues (when counted with multiplicity) and an orthogonal basis of eigenvectors. In terms of matrices, you can diagonalize a symmetric matrix and you get real numbers on the diagonal. Moreover, the matrix effecting the change of basis is just the transpose of the eigenbasis matrix. This is an example of a matrix factorization: A = P Λ P⁻¹, where P is the change of basis matrix. (A small sketch verifying this for a particular symmetric matrix appears at the end of this section.)

◼ It would be nice if you were acquainted with some additional matrix factorizations: the Jordan canonical form, LU, QR, Cholesky, and the SVD. We probably won’t use the Jordan form, but you really should know it.

◼ Finally, you should realize that the set of all bilinear functionals on a pair of vector spaces is isomorphic to the dual space of the tensor product of the original two spaces. More precisely,

Bilinear(ℝⁿ × ℝᵐ, ℝ) ≃ Linear(ℝⁿ ⊗ ℝᵐ, ℝ)

This follows “trivially” from the universal mapping property of tensor products. I’m kidding of course, while also adopting the standard pompous tone in mathematics that we all learn so as to sound smarter. You don’t really need to know this, and I just threw it in partly to be erudite, but it is actually relevant. This result reduces the above claim about the representation of bilinear functionals (which you do need to know) to the previous claim about the representation of linear functionals, which you do know. If I have piqued your interest, I’ll say a bit more. The left and right sides are both vector spaces of dimension mn (that’s one of the properties of tensor products), so they must be isomorphic, since dimension is an isomorphism invariant in the category of vector spaces, if you want to make a mountain out of a molehill. The result is actually interesting because there is a natural isomorphism between these two vector spaces. If you want to know more about this, google “Multilinear Algebra” or, god forbid, go to the library and read a book about it.
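To close the section, here are two minimal sketches of the bullets above. First, the inner product and the polarization identity (the matrix b and the helper names inner and nrm are just illustrative choices, not notation used elsewhere): we define 〈u, v〉 = uᵀ b v for a symmetric positive definite b and check the identity at a pair of random vectors.

b = {{2, 1}, {1, 3}};                         (* symmetric and positive definite *)
PositiveDefiniteMatrixQ[b]                    (* True *)
inner[u_, v_] := u . b . v;                   (* 〈u, v〉 = uᵀ B v with B = b *)
nrm[u_] := Sqrt[inner[u, u]];                 (* the induced norm *)
x = RandomReal[{-1, 1}, 2]; y = RandomReal[{-1, 1}, 2];
inner[x, y] - (1/2) (nrm[x]^2 + nrm[y]^2 - nrm[x - y]^2)   (* ≈ 0, up to rounding *)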
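Second, the spectral theorem and the factorizations (the symmetric matrix s is an arbitrary example): Eigensystem produces real eigenvalues, the orthonormalized eigenvectors give an orthogonal P, and s factors as P Λ Pᵀ. The last line is just a reminder that Cholesky, like LU, QR, and the SVD, is built into Mathematica.

s = N[{{2, 1, 0}, {1, 3, 1}, {0, 1, 2}}];             (* an arbitrary symmetric matrix *)
{vals, vecs} = Eigensystem[s];                        (* real eigenvalues; eigenvectors as rows *)
p = Transpose[Orthogonalize[vecs]];                   (* columns = orthonormal eigenvectors *)
Chop[p . DiagonalMatrix[vals] . Transpose[p] - s]     (* zero matrix: s = P Λ Pᵀ *)
Chop[Transpose[p] . p - IdentityMatrix[3]]            (* zero matrix: P⁻¹ = Pᵀ *)
CholeskyDecomposition[{{4, 2}, {2, 3}}]               (* upper triangular r with rᵀ.r equal to the matrix *)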