Function Approximations As Vector Spaces

Foundations of Reinforcement Learning with Applications in Finance
Ashwin Rao, Tikhon Jelvis

1 Function Approximations as Affine Spaces

Vector Space

A Vector Space is defined as a commutative group V under an addition operation (written as +), together with multiplication of elements of V with elements of a field K (known as scalars), expressed as a binary in-fix operation ∗ : K × V → V, with the following properties:

• a ∗ (b ∗ v) = (a ∗ b) ∗ v, for all a, b ∈ K, for all v ∈ V.
• 1 ∗ v = v for all v ∈ V, where 1 denotes the multiplicative identity of K.
• a ∗ (v1 + v2) = a ∗ v1 + a ∗ v2 for all a ∈ K, for all v1, v2 ∈ V.
• (a + b) ∗ v = a ∗ v + b ∗ v for all a, b ∈ K, for all v ∈ V.

Function Space

The set F of all functions from an arbitrary generic domain X to a vector space co-domain V (over scalars field K) constitutes a vector space (known as a function space) over the scalars field K, with the addition operation (+) defined as:

(f + g)(x) = f(x) + g(x) for all f, g ∈ F, for all x ∈ X

and the scalar multiplication operation (∗) defined as:

(a ∗ f)(x) = a ∗ f(x) for all f ∈ F, for all a ∈ K, for all x ∈ X

Hence, addition and scalar multiplication for a function space are defined point-wise.
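These point-wise operations are straightforward to express in code. The following is a minimal illustrative sketch in Python; the helper names add_funcs and scale_func are ours, purely for illustration, and do not come from the book's codebase:

```python
from typing import Callable, TypeVar

X = TypeVar('X')
RealFunc = Callable[[X], float]

def add_funcs(f: RealFunc, g: RealFunc) -> RealFunc:
    # Point-wise addition: (f + g)(x) = f(x) + g(x)
    return lambda x: f(x) + g(x)

def scale_func(a: float, f: RealFunc) -> RealFunc:
    # Point-wise scalar multiplication: (a * f)(x) = a * f(x)
    return lambda x: a * f(x)

f = lambda x: x * x
g = lambda x: 3.0 * x
h = add_funcs(f, scale_func(2.0, g))  # h(x) = x^2 + 6x
assert h(1.0) == 7.0
```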
Linear Map of Vector Spaces

A linear map of vector spaces is a function h : V → W, where V is a vector space over a scalars field K and W is a vector space over the same scalars field K, having the following two properties:

• h(v1 + v2) = h(v1) + h(v2) for all v1, v2 ∈ V (i.e., application of h commutes with the addition operation).
• h(a ∗ v) = a ∗ h(v) for all v ∈ V, for all a ∈ K (i.e., application of h commutes with the scalar multiplication operation).

The set of all linear maps with domain V and co-domain W constitutes a function space (restricted to just this subspace of all linear maps, rather than the space of all V → W functions). This function space (restricted to the subspace of all V → W linear maps) is denoted as the vector space L(V, W). The specialization of the function space of linear maps to the space L(V, K) (i.e., specializing the vector space W to the scalars field K) is known as the dual vector space and is denoted V∗.

Affine Space

An Affine Space is defined as a set A associated with a vector space V and a binary in-fix operation ⊕ : A × V → A, with the following properties:

• For all a ∈ A, a ⊕ 0 = a, where 0 is the zero vector in V (this is known as the right identity property).
• For all v1, v2 ∈ V and all a ∈ A, (a ⊕ v1) ⊕ v2 = a ⊕ (v1 + v2) (this is known as the associativity property).
• For each a ∈ A, the mapping f_a : V → A defined as f_a(v) = a ⊕ v for all v ∈ V is a bijection (i.e., a one-to-one and onto mapping).

The elements of an affine space are called points, and the elements of the vector space associated with an affine space are called translations. The idea behind affine spaces is that, unlike a vector space, an affine space has no notion of a zero element, and one cannot add two points of the affine space. Instead, one adds a translation (from the associated vector space) to a point (from the affine space) to yield another point (in the affine space). The term translation signifies that we "translate" (i.e., shift) a point to another point in the affine space, with the shift being effected by a translation in the associated vector space.

This means there is a notion of "subtracting" one point of the affine space from another (denoted with the operation ⊖), yielding a translation in the associated vector space.

A simple way to visualize an affine space is to consider the affine space of all 3-D points on the plane defined by the equation z = 1, i.e., the set of all points (x, y, 1) for all x ∈ ℝ, y ∈ ℝ. The associated vector space is the set of all 3-D points on the plane defined by the equation z = 0, i.e., the set of all points (x, y, 0) for all x ∈ ℝ, y ∈ ℝ (with the usual addition and scalar multiplication operations). We see that any point (x, y, 1) in the affine space is translated to the point (x + x′, y + y′, 1) by the translation (x′, y′, 0) in the vector space. Note that the translation (0, 0, 0) (the zero vector) leaves the point (x, y, 1) unchanged, and that the translations (x′, y′, 0) and (x″, y″, 0) applied one after the other are the same as the single translation (x′ + x″, y′ + y″, 0). Finally, note that for any fixed point (x, y, 1), we have a bijective mapping from the vector space z = 0 to the affine space z = 1 that maps any translation (x′, y′, 0) to the point (x + x′, y + y′, 1).

Linear Map of Affine Spaces

A linear map of affine spaces is a function h : A → B, where A is an affine space associated with a vector space V and B is an affine space associated with the same vector space V, having the following property:

h(a ⊕ v) = h(a) ⊕ v for all a ∈ A, for all v ∈ V

Function Approximations

We represent function approximations by parameterized functions f : X × D[ℝ] → ℝ, where X is the input domain and D[ℝ] is the parameters domain. The notation D[Y] refers to a generic container data type D over a component generic data type Y. The data type D is specified as a generic container type because we consider generic function approximations here. A specific family of function approximations will customize D to a specific container data type (e.g., linear function approximations customize D to a Sequence data type; a feed-forward deep neural network customizes D to a Sequence of 2-dimensional arrays). We are interested in viewing function approximations as points in an appropriate affine space. To explain this, we start by viewing parameters as points in an affine space.

D[ℝ] as an Affine Space P

When performing Stochastic Gradient Descent or Batch Gradient Descent, the parameters p ∈ D[ℝ] of a function approximation f : X × D[ℝ] → ℝ are updated using an appropriate linear combination of gradients of f with respect to p (at specific values of x ∈ X). Hence, the parameters domain D[ℝ] can be treated as an affine space (call it P) whose associated vector space (over the scalars field ℝ) is the set of gradients of f with respect to the parameters p ∈ D[ℝ] (denoted ∇_p f(x, p)), evaluated at specific values of x ∈ X, with the addition operation defined as element-wise real-numbered addition and the scalar multiplication operation defined as element-wise multiplication with real-numbered scalars. We refer to this affine space P as the Parameters Space, and we refer to its associated vector space (of gradients) as the Gradient Space G. Since each point in P and each translation in G is an element of D[ℝ], the ⊕ operation is element-wise real-numbered addition.

We define the gradient function G : X → (P → G) as:

G(x)(p) = ∇_p f(x, p) for all x ∈ X, for all p ∈ P
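To make these abstractions concrete, consider a linear function approximation f(x, p) = p · φ(x) with a hypothetical feature map φ: here D[ℝ] specializes to a sequence of three real numbers, and the gradient ∇_p f(x, p) is simply φ(x). The sketch below is our own illustration under these assumptions, not the book's implementation:

```python
import numpy as np

def phi(x: float) -> np.ndarray:
    # Hypothetical feature map: D[R] specializes to a sequence of 3 reals
    return np.array([1.0, x, x * x])

def f(x: float, p: np.ndarray) -> float:
    # Linear function approximation f(x, p) = p . phi(x)
    return float(np.dot(p, phi(x)))

def G(x: float, p: np.ndarray) -> np.ndarray:
    # Gradient function G(x)(p) = grad_p f(x, p); for a linear
    # approximation this is phi(x), independent of p
    return phi(x)

p = np.array([0.5, -1.0, 2.0])  # a point in the Parameters Space P
v = 0.1 * G(1.5, p)             # a translation in the Gradient Space G
p_new = p + v                   # element-wise addition realizes the affine operation on P
```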
Representational Space R

We consider a function I : P → (X → ℝ), defined as I(p) = g : X → ℝ for all p ∈ P, such that g(x) = f(x, p) for all x ∈ X. The range of this function I forms an affine space R whose associated vector space is the Gradient Space G, with the ⊕ operation defined as:

I(p) ⊕ v = I(p ⊕ v) for all p ∈ P, for all v ∈ G

We refer to this affine space R as the Representational Space (to signify the fact that the ⊕ operation for R simply "delegates" to the ⊕ operation for P, so the parameters p ∈ P essentially serve as the internal representation of the function approximation I(p) : X → ℝ). This "delegation" from R to P implies that I is a linear map from the Parameters Space P to the Representational Space R.

Notice that the __add__ method of the Gradient class in rl/function_approx.py is overloaded. One of the __add__ methods corresponds to vector addition of two gradients in the Gradient Space G. The other __add__ method corresponds to the ⊕ operation: it adds a gradient (treated as a translation in the vector space of gradients) to a function approximation (treated as a point in the affine space of function approximations).

Stochastic Gradient Descent

Stochastic Gradient Descent is a function SGD : X × ℝ → (P → P), representing a mapping from (predictor, response) data to a "parameters-update" function (in order to improve the function approximation), defined as:

SGD(x, y)(p) = p ⊕ (α ∗ ((y − f(x, p)) ∗ G(x)(p))) for all x ∈ X, y ∈ ℝ, p ∈ P

where α ∈ ℝ⁺ represents the learning rate (the step size of SGD). For a fixed data pair (x, y) ∈ X × ℝ, with the prediction-error function e : P → ℝ defined as e(p) = y − f(x, p), the (SGD-based) parameters-change function U : P → G is defined as:

U(p) = SGD(x, y)(p) ⊖ p = α ∗ (e(p) ∗ G(x)(p)) for all p ∈ P
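As a closing illustration, the sketch below condenses these pieces into a drastically simplified stand-in for (not a copy of) the far more general Gradient and FunctionApprox classes in rl/function_approx.py, reusing the linear-approximation example from above:

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Union
import numpy as np

def phi(x: float) -> np.ndarray:
    # Hypothetical feature map for the running linear-approximation example
    return np.array([1.0, x, x * x])

@dataclass
class LinearApprox:
    # A point in the Representational Space R, represented internally by
    # its parameters p (a point in the Parameters Space P)
    p: np.ndarray

    def evaluate(self, x: float) -> float:
        return float(np.dot(self.p, phi(x)))

    def gradient(self, x: float) -> Gradient:
        # G(x)(p) = grad_p f(x, p) = phi(x) for a linear approximation
        return Gradient(phi(x))

@dataclass
class Gradient:
    # A translation in the Gradient Space G
    v: np.ndarray

    def scale(self, a: float) -> Gradient:
        return Gradient(a * self.v)

    def __add__(self, other: Union[Gradient, LinearApprox]) -> Union[Gradient, LinearApprox]:
        if isinstance(other, Gradient):
            # Vector addition of two gradients within G
            return Gradient(self.v + other.v)
        # The affine operation: translating a function approximation (a
        # point in R) by a gradient, delegating to addition on parameters
        return LinearApprox(other.p + self.v)

def sgd_update(fa: LinearApprox, x: float, y: float, alpha: float) -> LinearApprox:
    # SGD(x, y)(p) = p "plus" (alpha * (y - f(x, p)) * G(x)(p))
    error = y - fa.evaluate(x)  # e(p) = y - f(x, p)
    return fa.gradient(x).scale(alpha * error) + fa
```

For instance, sgd_update(LinearApprox(np.zeros(3)), x=1.0, y=2.0, alpha=0.1) translates the zero parameters by α ∗ e(p) ∗ φ(1.0) = 0.1 ∗ 2.0 ∗ (1, 1, 1) = (0.2, 0.2, 0.2), which is exactly the translation U(p) prescribed above.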