Investigating an Overdetermined System of Linear Equations by Using Convex Functions

Total Page:16

File Type:pdf, Size:1020Kb

Investigating an Overdetermined System of Linear Equations by Using Convex Functions Hacettepe Journal of Mathematics and Statistics Volume 46 (5) (2017), 865 874 Investigating an overdetermined system of linear equations by using convex functions Zlatko Pavi¢ ∗ y and Vedran Novoselac z Abstract The paper studies the application of convex functions in order to prove the existence of optimal solutions of an overdetermined system of lin- ear equations. The study approaches the problem by using even convex functions instead of projections. The research also relies on some spe- cial properties of unbounded convex sets, and the lower level sets of continuous functions. Keywords: overdetermined system, convex function, global minimum. 2000 AMS Classication: 15A06, 26B25. Received : 30.08.2016 Accepted : 19.12.2016 Doi : 10.15672/ HJMS.2017.423 1. Introduction We consider a system of m linear equations with n unknowns over the eld of real numbers given by a11x1 + ::: + a1nxn = b1 . (1.1) . .. : am1x1+ ::: + amnxn = bm Including the matrices 2 a11 : : : a1n 3 2 x1 3 2 b1 3 . (1.2) 6 . .. 7 6 . 7 6 . 7 A = 4 . 5 ; x = 4 . 5 ; b = 4 . 5 ; am1 : : : amn xn bm the given system gets the matrix form (1.3) Ax = b: ∗Department of Mathematics, Mechanical Engineering Faculty in Slavonski Brod, University of Osijek, Croatia, Email: [email protected] yCorresponding Author. zDepartment of Mathematics, Mechanical Engineering Faculty in Slavonski Brod, University of Osijek, Croatia, Email: [email protected] 866 n m Identifying the matrix A with a linear operator from R to R , the column matrix x n m with a vector of R , and the column matrices Ax and b with vectors of R , the given system takes the operator form. If the vector b is not contained in the image R(A) of the operator A, then the equation in formula (1.3) has no solution. Respectively, the system in formula (1.1) is said to be inconsistent. A system of linear equations is said to be overdetermined if there are more equations than unknowns. The overdetermined system with linearly independent equations is evidently inconsistent. n m To summarize, employing a linear operator A : R ! R where n < m, and a vector m n b 2 R , we receive the equation Ax = b with unknown vectors x 2 R . If b 2 R(A), the set of solutions is the preimage of b as (−1) n (1.4) A (b) = fx 2 R : Ax = bg: If b2 = R(A), the equation has no solution. To include both cases, we use a norm on the m n space R , and the function f : R ! R dened by (1.5) f(x) = kAx − bk: Then we are looking for the global minimum points x of the function f. Expressed as an equation, it is n or f(x) = miny2R f(y) (1.6) kAx − bk = min kAy − bk; n y2R and its solutions are called optimal solutions of the equation Ax = b respecting the given norm. The above equation is reduced to Ax = b if b 2 R(A). The existence of the solution of the minimization problem in formula (1.6) is provided in the framework of the theory of projections. In order to remake it without using projections, we will utilize even convex functions. Methodical introduction to overdetermined systems of linear equations can be seen in [12]. The general insight into the problem of least absolute deviations can be found in [1]. Linear optimization problems were presented in [2]. 2. Anity and convexity, rays, lower level sets Let X be a real vector space. Let x1; x2 2 X be vectors, and let t1; t2 2 R be coecients. Then the linear combination t1x1 + t2x2 is said to be ane (convex) if t2 = 1 − t1 (0 ≤ t2 = 1 − t1 ≤ 1). A set S ⊆ X is said to be ane (convex) if it contains all binomial ane (convex) combinations of its vectors. (−1) n The set A (b) ⊆ R in formula (1.4) is ane. In general, if a set B ⊆ R(A) is ane (−1) n (convex), then its preimage A (B) ⊆ R is ane (convex) too. Let S be an ane (convex) set. A function f : S ! R is said to be ane (convex) if the equality (inequality) (2.1) f t1x1 + t2x2 = (≤) t1f(x1) + t2f(x2) holds for all ane (convex) combinations t1x1 + t2x2 of points x1; x2 2 S. If x1; x2 2 X are vectors, then the ray or half-line R from x1 passing through x2 is the subset of X dened as R = (1 − t)x1 + tx2 : t ≥ 0 : If f : S ! R is a function, then the lower level set L of the function f with a height l 2 R is the subset of S dened as L = fx 2 S : f(x) ≤ lg: 867 The lower level sets of a convex function are convex. The following lemma can be found in [3, Theorem 1, page 23]. We oer the proof by using convex combinations. n 2.1. Lemma. Let S ⊆ R be an unbounded closed convex set, and let s 2 S be a point. Then there is a ray from s belonging to S. n Proof. Without loss of generality, using the translation in the space R by the vector −s, we can assume that the set S contains the origin o. So, we are looking for a ray from o contained in S. Since S is unbounded, we can pick out a sequence (yk)k of points yk 2 S such that for every . kykk = k k 2 N Since S is convex, each line segment between o and yk is contained in S. Then it follows that the convex combination k − 1 1 1 x = o + y = y k k k k k k belongs to for every . Each satises . We have the sequence of S k 2 N xk kxkk = 1 (xk)k points belonging to the intersection of the set and the unit sphere in n. The xk S0 S R set S0 is compact because it is closed and bounded, and therefore the sequence (xk)k has a subsequence converging to some point . (xrk )k x0 2 S0 Let R be the ray from o passing through x0. We will verify that R is contained in S. Since R = ftx0 : t ≥ 0g, we have to show that each ray point tx0 is in S. Let t ≥ 0 be a nonnegative coecient. Since S is convex, the points t rk − t t txrk = yrk = o + yrk rk rk rk are in S for rk ≥ t. The above points approach tx0 if k approaches innity. Since S is closed, the limit is in . tx0 S The proof of Lemma 2.1 is illustrated in Figure 1, and the same is still applicable to n an unbounded convex set of R and points of its interior. 2 Figure 1. A ray in the unbounded convex set of R Statements similar to the next lemma and corollary are exposed in [11, Theorem 8.4, Theorem 8.5 and Theorem 8.6, pages 30-31]. Bounded sets play a decisive role for the global minimum existence. Considering lower level sets and the global minimum of a continuous function, one can nd the following. 868 n 2.2. Lemma. Let f : R ! R be a continuous function whose lower level sets are bounded. Then f has the global minimum. n Proof. First we take a point x0 2 R , and a number l ≥ f(x0). Then we specify the lower level set L of the function f with the height l, which is nonempty. The set L is bounded by the assumption, and closed by the continuity of f. Thus L is compact, and ∗ ∗ ∗ there is a point x 2 L such that f(x ) = minx2L f(x). The value f(x ) is the global minimum, which is easy to demonstrate as follows. n ∗ Let x 2 R be a point. If x 2 L, then applies f(x ) ≤ f(x) ≤ l. If x2 = L, then we ∗ have f(x ) ≤ l < f(x). n A convex function on the space R approaching innity on all rays from the origin being examined. n 2.3. Corollary. Let f : R ! R be a convex function such that (2.2) lim f(tx) = 1 t!1 n for every x 2 R n fog. Then f has the global minimum. n Proof. Since f is convex on R , it is also continuous (a convex function on an open domain is continuous). Respecting the continuity of f in the context of Lemma 2.2, it is sucient to prove that the lower level sets of f are bounded. Let L be the lower level set of f with a height l ≥ f(o). The set L contains the origin o, and it is closed and convex. If L is not bounded, then it contains some ray R from the origin o by Lemma 2.1. Thus, a point x0 2 L n fog exists such that the ray R = ftx0 : t ≥ 0g. Since R ⊆ L, for each t ≥ 0 we have that f(tx0) ≤ l: n It follows that the condition in formula (2.2) does not apply to x0 2 R n fog. Hence the lower level set L must be bounded. Consequently, all lower level sets of the function f are bounded. Very well written and motivated book in [9] can be recommended as an introduc- tory course to the analysis of convex functions. One chapter of this book refers to the optimization. The usage of ane and convex combinations is important in the eld of mathematical inequalities.
Recommended publications
  • 8.5 Least Squares Solutions to Inconsistent Systems
    8.5 Least Squares Solutions to Inconsistent Systems Performance Criterion: 8. (f) Find the least-squares approximation to the solution of an inconsistent system of equations. Solve a problem using least-squares approximation. (g) Give the least squares error and least squares error vector for a least squares approximation to a solution to a system of equations. Recall that an inconsistent system is one for which there is no solution. Often we wish to solve inconsistent systems and it is just not acceptable to have no solution. In those cases we can find some vector (whose components are the values we are trying to find when attempting to solve the system) that is “closer to being a solution” than all other vectors. The theory behind this process is part of the second term of this course, but we now have enough knowledge to find such a vector in a “cookbook” manner. Suppose that we have a system of equations Ax = b. Pause for a moment to reflect on what we know and what we are trying to find when solving such a system: We have a system of linear equations, and the entries of A are the coefficients of all the equations. The vector b is the vector whose components are the right sides of all the equations, and the vector x is the vector whose components are the unknown values of the variables we are trying to find. So we know A and b and we are trying to find x. If A is invertible, the solution vector x is given by x = A−1 b.
    [Show full text]
  • Linear Algebra I
    Linear Algebra I Gregg Waterman Oregon Institute of Technology c 2016 Gregg Waterman This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. The essence of the license is that You are free: to Share to copy, distribute and transmit the work • to Remix to adapt the work • Under the following conditions: Attribution You must attribute the work in the manner specified by the author (but not in • any way that suggests that they endorse you or your use of the work). Please contact the author at [email protected] to determine how best to make any attribution. Noncommercial You may not use this work for commercial purposes. • Share Alike If you alter, transform, or build upon this work, you may distribute the resulting • work only under the same or similar license to this one. With the understanding that: Waiver Any of the above conditions can be waived if you get permission from the copyright • holder. Public Domain Where the work or any of its elements is in the public domain under applicable • law, that status is in no way affected by the license. Other Rights In no way are any of the following rights affected by the license: • Your fair dealing or fair use rights, or other applicable copyright exceptions and limitations; ⋄ The author’s moral rights; ⋄ Rights other persons may have either in the work itself or in how the work is used, such as ⋄ publicity or privacy rights. Notice For any reuse or distribution, you must make clear to others the license terms of this • work.
    [Show full text]
  • Underdetermined and Overdetermined Linear Algebraic Systems
    Underdetermined and Overdetermined Linear Algebraic Systems ES100 March 1, 1999 T.S. Whitten Objectives ● Define underdetermined systems ● Define overdetermined systems ● Least Squares Examples Review B 500 N = − ° = ∑ Fx 0; 500 FBC sin 45 0 = ° − = ∑ Fy 0; FBC cos 45 FBA 0 FBC FBA − sin 45° 0 F − 500 ⋅ BC = ° 1cos42445 4341 1F2BA3 0 coefficients variables Review cont. − sin 45° 0 F − 500 ⋅ BC = ° 1cos42445 4341 1F2BA3 0 coefficients variables The system of matrices above is of the form: Ax = b and can be solved using MATLAB left division thus, x = A\b × results in a 1 2 matrix of values for FBC and FBA Review Summary ● A system of two Equations and two unknowns may yield a unique solution. ● The exception is when the determinant of A is equal to zero. Then the system is said to be singular. ● The left division operator will solve the linear system in one step by combining two matrix operations ●A\B is equivalent to A-1*B Graphical Representation of Unique vs. Singular Systems unique solution y singular system x Underdetermined Systems ● A system of linear equations is may be undetermined if; 1 The determinant of A is equal to zero A = 0 2 The matrix A is not square, i.e. the are more unknowns than there are equations + + = x x 3y 2z 2 1 3 2 2 ⋅ y = x + y + z = 4 1 1 1 4 z Overdetermined Systems ● The converse of an underdetermined system is an overdetermined system where there are more equations than there are variables ● This situation arises frequently in engineering.
    [Show full text]
  • Solutions of Overdetermined Linear Problems (Least Square Problems)
    Chapter 3 Solutions of overdetermined linear problems (Least Square problems) 1. Over-constrained problems See Chapter 11 from the textbook 1.1. Definition In the previous chapter, we focused on solving well-defined linear problems de- fined by m linear equations for m unknowns, put into a compact matrix-vector form Ax = b with A an m m square matrix, and b and x m long column vectors. We focussed on using⇥ direct methods to seek exact solutions− to such well-defined linear systems, which exist whenever A is nonsingular. We will re- visit these problems later this quarter when we learn about iterative methods. In this chapter, we look at a more general class of problems defined by so- called overdetermined systems – systems with a larger numbers of equations (m) than unknowns (n): this time, we have Ax = b with A an m n matrix with m>n, x an n-long column vector and b an m long column vector.⇥ This time there generally are no exact solutions to the problem.− Rather we now want to find approximate solutions to overdetermined linear systems which minimize the residual error E = r = b Ax (3.1) || || || − || using some norm. The vector r = b Ax is called the residual vector. − Any choice of norm would generally work, although, in practice, we prefer to use the Euclidean norm (i.e., the 2-norm) which is more convenient for nu- merical purposes, as they provide well-established relationships with the inner product and orthogonality, as well as its smoothness and convexity (we shall see later).
    [Show full text]
  • Math 253 - Homework 2 Due in Class on Wednesday, February 19
    Math 253 - Homework 2 Due in class on Wednesday, February 19 Write your answers clearly and carefully, being sure to emphasize your answer and the key steps of your work. You may work with others in this class, but the solutions handed in must be your own. If you work with someone or get help from another source, give a brief citation on each problem for which that is the case. Part I While you are expected to complete all of these problems, do not hand in the problems in Part I. You are encouraged to write complete solutions and to discuss them with me or your peers. As extra motivation, some of these problems will appear on the weekly quizzes. 1. Practice Problems: (a) Section 1.3: 1, 2 2. Exercises: (a) Section 1.3: 1-13 odd, 19, 21, 25, 27, 29, 33 Part II Hand in each problem separately, individually stapled if necessary. Please keep all problems together with a paper clip. n 1. The center of mass of v1;:::; vk in R , with a mass of mi at vi, for i = 1; : : : ; k and total mass m = m1 + ··· + mk, is given by m v + ··· + m v m m v¯ = 1 1 k k = 1 v + ··· + k v : m m 1 m k n The centroid, which can be thought of as the geometric center, of v1;:::; vk in R is the center of mass with mi = 1 for each i = 1; : : : ; k, i.e. v + ··· + v 1 1 c¯ = 1 k = v + ··· + v : k k 1 k k 243 2−13 2 1 3 2−53 2 1 3 Pk (a) Let v1 = 415 ; v2 = 4 0 5 ; v3 = 4 3 5 ; v4 = 4 0 5 ; v5 = 4−45 : Let uk = i=1 vi.
    [Show full text]
  • System of Linear Equations - Wikipedia, the Free Encyclopedia
    System of linear equations - Wikipedia, the free encyclopedia http://en.wikipedia.org/wiki/System_of_linear_equations System of linear equations From Wikipedia, the free encyclopedia In mathematics, a system of linear equations (or linear system) is a collection of linear equations involving the same set of variables. For example, is a system of three equations in the three variables x, y, z. A solution to a linear system is an assignment of numbers to the variables such that all the equations are simultaneously satisfied. A solution to the system above is given by A linear system in three variables determines a collection of planes. The intersection point is the solution. since it makes all three equations valid.[1] In mathematics, the theory of linear systems is the basis and a fundamental part of linear algebra, a subject which is used in most parts of modern mathematics. Computational algorithms for finding the solutions are an important part of numerical linear algebra, and play a prominent role in engineering, physics, chemistry, computer science, and economics. A system of non-linear equations can often be approximated by a linear system (see linearization), a helpful technique when making a mathematical model or computer simulation of a relatively complex system. Very often, the coefficients of the equations are real or complex numbers and the solutions are searched in the same set of numbers, but the theory and the algorithms apply for coefficients and solutions in any field. For solutions in an integral domain like the ring of the integers, or in other algebraic structures, other theories have been developed.
    [Show full text]
  • Low-Rank Approximation
    Ivan Markovsky Low-Rank Approximation Algorithms, Implementation, Applications January 19, 2018 Springer vi Preface Preface Low-rank approximation is a core problem in applications. Generic examples in systems and control are model reduction and system identification. Low-rank approximation is equivalent to the principal component analysis method in machine learning. Indeed, dimensionality reduction, classification, and information retrieval problems can be posed and solved as particular low-rank approximation problems. Sylvester structured low-rank approximation has applications in computer algebra for the decoupling, factorization, and common divisor computation of polynomials. The book covers two complementary aspects of data modeling: stochastic esti- mation and deterministic approximation. The former aims to find from noisy data that is generated by a low-complexity system an estimate of that data generating system. The latter aims to find from exact data that is generated by a high com- plexity system a low-complexity approximation of the data generating system. In applications, both the stochastic estimation and deterministic approximation aspects are present: the data is imprecise due to measurement errors and is possibly gener- Simple linear models are commonly used in engineering despite of the fact that ated by a complicated phenomenon that is not exactly representable by a model in the real world is often nonlinear. At the same time as being simple, however, the the considered model class. The development of data modeling methods in system models have to be accurate. Mathematical models are obtained from first princi- identification and signal processing, however, has been dominated by the stochastic ples (natural laws, interconnection, etc.) and experimental data.
    [Show full text]
  • Computing the Strict Chebyshev Solution of Overdetermined Linear Equations
    MATHEMATICS OF COMPUTATION, VOLUME 31, NUMBER 140 OCTOBER 1977, PAGES 974-983 Computing the Strict Chebyshev Solution of Overdetermined Linear Equations By Nabih N. Abdelmalek Abstract. A method for calculating the strict Chebyshev solution of overdetermined systems of linear equations using linear programming techniques is described. This method provides: (1) a way to determine, for the majority of cases, all the equations belonging to the characteristic set, (2) an efficient method to obtain the inverse of the matrix needed to calculate the strict Chebyshev solution, and (3) a way of recog- nizing when an element of the Chebyshev solution equals a corresponding element of the strict Chebyshev solution. As a result, in general, the computational effort is con- siderably reduced. Also the present method deals with full rank as well as rank defi- cient cases. Numerical results are given. 1. Introduction. Consider the overdetermined system of linear equations (1) Ca = f, where C is a given real n x m matrix of rank k < m < n and / is a given real «-vector. Let E denote the set of n equations (1). The Chebyshev solution (C.S.) of system (1) is the OT-vectora* = (a*) which minimizes the Chebyshev norm z, (2) z = max\r¡(a)\, i E E, where r, is the rth residual in (1) and is given by (3) a-,.= cixax+ ••• + cimam - f¡, i E E. Let us denote the C.S. a* by (a*)c s . It is known that if C satisfies the Haar condition, the C.S. (a*)c s is unique. Otherwise it may not be unique.
    [Show full text]
  • ECEU692 Intro to SSI Course Notes Some Selected Topics in Linear
    EECE5698 SP 2011 DHB Set 2 D.H. Brooks ECEU692 Intro to SSI Course Notes Some selected topics in linear algebra for SSI (2) Prof. Brooks Northeastern University Spring 2011 This talk will relate to parts of • Chapter 5 • Appendix B EECE5698 SP 2011 DHB Set 2 D.H. Brooks Outline of those Selected Topics • Some subsurface problems that lead to linear systems of equations • Some different ways of looking at n equations in n unknowns: in particular, some geometric interpretations • The geometry of when square linear systems have: – exact solutions, – no solutions, and/or – non-unique solutions • Problems where you have more equations than unknowns • Problems where you have more unknowns than equations • A quick discussion of solving linear systems in Matlab • The Singular Value Decomposition to analyze linear systems 2 EECE5698 SP 2011 DHB Set 2 D.H. Brooks Outline of these Notes: Linear Systems that are not Square • What about errors in the model? • Overdetermined systems: motivation and geometry • “Solving” overdetermined systems via least-squares • What about when you can’t make enough measurements ? • Underdetermined systems: motivation and geometry • A few words on underdetermined systems • A few words on solving linear systems in Matlab 3 EECE5698 SP 2011 DHB Set 2 D.H. Brooks Reflectance Spectroscopy Again: Errors in our Assumptions In the reflectance spectroscopy problem, we have several sources of error: • We may not know, or be able to account for, all the chromophores present, e.g. – some chromophore is in the medium we did not expect, or – some chromophore is in the medium that we thought would be negligible but in fact is not • We may not have exact values for the spectra, e.g.
    [Show full text]
  • Arxiv:Math/0206212V3
    QUANTUM COHOMOLOGY VIA D-MODULES Martin A. Guest Dedicated to Graeme Segal on the occasion of his 60th birthday Abstract. We propose a new point of view on quantum cohomology, motivated by the work of Givental and Dubrovin, but closer to differential geometry than the existing approaches. The central object is a D-module which “quantizes” a commutative algebra associated to the (uncompactified) space of rational curves. Under appropriate conditions, we show that the associated flat connection may be gauged to the flat connection underlying quantum cohomology. This method clarifies the role of the Birkhoff factorization in the “mirror trans- formation”, and it gives a new algorithm (requiring construction of a Groebner basis and solution of a system of o.d.e.) for computation of the quantum product. Quantum cohomology first arose in physics, and its (mathematically conjectural) prop- erties were supported by physical intuition. A rigorous mathematical definition came later, based on deep properties of certain moduli spaces. We shall propose another point of view on quantum cohomology, closer in spirit to differential geometry. The main ingredient in our approach is a flat connection, considered as a holonomic D-module (or maximally overdetermined system of p.d.e.). This object itself is not new: Givental’s “quantum cohomology D-module” is already well known ([Gi1]), and the asso- ciated flat connection appears in Dubrovin’s theory of Frobenius manifolds ([Du]). But, in the existing literature, the D-module plays a subservient role, being a consequence of arXiv:math/0206212v3 [math.DG] 6 Oct 2004 the construction of the Gromov-Witten invariants and the quantum cohomology algebra.
    [Show full text]
  • Overdetermined Elliptic Systems
    FoCM manuscript No. (will be inserted by the editor) Overdetermined Elliptic Systems Katsiaryna Krupchyk1, Werner M. Seiler2, Jukka Tuomela1 1 Dept. of Mathematics, University of Joensuu, Finland, e-mail: [email protected], [email protected] 2 Interdisciplinary Centre for Scientific Computing, Universitat¨ Heidelberg, 69120 Heidel- berg, Germany, e-mail: [email protected] Abstract We consider linear overdetermined systems of partial differential equa- tions. We show that the introduction of weights classically used for the definition of ellipticity is not necessary, as any system that is elliptic with respect to some weights becomes elliptic without weights during its completion to involution. Fur- thermore, it turns out that there are systems which are not elliptic for any choice of weights but whose involutive form is nevertheless elliptic. We also show that reducing the given system to lower order or to an equivalent one with only one unknown function preserves ellipticity. Key words Overdetermined system, partial differential equation, symbol, ellip- tic system, completion, involution 1 Introduction The definition of ellipticity for general overdetermined systems is quite rarely found in the literature, one accessible exception being the encyclopaedia article [15, Def. 2.1]. Without the general definition one may encounter conceptual prob- lems already in very simple situations. For instance, consider the transformation of the two-dimensional Laplace equation uxx + uyy = 0 to the first order system (this is discussed in the recent textbook [42, Example 2.10]): ux = v , uy = w , vx + wy = 0 . The transformed system is not elliptic, although it is obviously equivalent to La- place’s equation.
    [Show full text]
  • Section 2.1 Systems of Linear Equations: an Introduction
    Chapter 2 Section 2.1 Systems of Linear Equations: An Introduction Definition A system of linear equations is a collection of multiple linear equations which are meant to be solved at the same time or simultaneously. Question What does it mean to solve a system of linear equations? Answer: To solve a system of equations means to find all for the unknowns that satisfy EVERY equation. To solve a system of linear equations, where all the equations are lines, means to find every that the share. We actually saw in the last section that solving a system of only two linear equations and two unknowns (or variables) is referred to as the intersection of two lines. Before continueing with solving systems of equations, we will first discuss how to setup a system of equations from a word problem. For the following three examples, we will setup but not solve the resulting system of equations. Example 1: An insurance company has three types of documents to process: contracts, leases, and policies. Each contract needs to be examined for 2 hours by the accountant and for 3 hours by the attorney, each lease needs to be examined for 4 hours by the accountant and 1 hour by the attorney, and each policy needs to be examined for 2 hours by the accountant and 2 hour by the attorney. The company processes twice as many policies as contracts and leases combined. If the accountant has 40 hours and the attorney has 30 hours each week to spend working on these documents, how many documents of each type can they process each week? Note: ALWAYS define your variables when setting up a problem Example 2: The Johnson Farm has 500 acres of land allotted for cultivating corn and wheat.
    [Show full text]