Projection and Its Importance in Scientific Computing
Projection and its Importance in Scientific Computing
Stan Tomov
EECS Department
The University of Tennessee, Knoxville
March 22, 2017
COSC 594 Lecture Notes
Slide 1 / 41

Contact information
office : Claxton 317
phone : (865) 974-6317
email : [email protected]
Additional reference materials:
[1] R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. Van der Vorst, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods (2nd edition), http://netlib2.cs.utk.edu/linalg/html_templates/Templates.html
[2] Yousef Saad, Iterative Methods for Sparse Linear Systems (1st edition), http://www-users.cs.umn.edu/~saad/books.html
Slide 2 / 41

Topics as related to high-performance scientific computing
[Diagram: Projection in Scientific Computing, connected to sparse matrices and parallel implementations; PDEs, numerical solution, tools, etc.; and iterative methods]
Slide 3 / 41

Topics on new architectures – multicore, GPUs (CUDA & OpenCL), MIC
[Same diagram: Projection in Scientific Computing, connected to sparse matrices and parallel implementations; PDEs, numerical solution, tools, etc.; and iterative methods]
Slide 4 / 41

Outline
Part I – Fundamentals
Part II – Projection in Linear Algebra
Part III – Projection in Functional Analysis (e.g. PDEs)
HPC with Multicore and GPUs
Slide 5 / 41

Part I
Fundamentals
Slide 6 / 41

Projection in Scientific Computing
[an example – in solvers for PDE discretizations]
A model leading to self-consistent iteration, with a need for high-performance diagonalization and orthogonalization routines
Slide 7 / 41

What is Projection?
Here are two examples:
(from linear algebra) P : orthogonal projection of vector u on e; the error (u – Pu) is orthogonal to the vector e.
(from functional analysis) u = f(x) on [0, 1]; P : best approximation (projection) of f(x) in span{ e ≡ 1 } ⊂ C[0,1]; the error (u – Pu) is orthogonal to e.
Slide 8 / 41

Definition
Projection is a linear transformation P from a linear space V to itself such that P² = P.
Equivalently: let V be the direct sum of subspaces V1 and V2,
  V = V1 ⊕ V2,
i.e. for u ∈ V there are unique u1 ∈ V1 and u2 ∈ V2 s.t. u = u1 + u2.
Then P : V → V1 is defined for u ∈ V as Pu ≡ u1.
Slide 9 / 41

Importance in Scientific Computing
To compute approximations Pu ≈ u where dim V1 << dim V, V = V1 ⊕ V2, when computation directly in V is not feasible or even possible. A few examples:
– Interpolation (via polynomial interpolation/projection)
– Image compression
– Sparse iterative linear solvers and eigensolvers
– Finite Element/Volume Method approximations
– Least-squares approximations, etc.
Slide 10 / 41

Projection in R²
In R² with the Euclidean inner product, i.e. for x, y ∈ R²
  (x, y) = x1 y1 + x2 y2  ( = yᵀx = x⋅y )  and  || x || = (x, x)^(1/2),
the orthogonal projection of vector u on e is
  Pu = ( (u, e) / || e ||² ) e   (Exercise)
i.e. for || e || = 1,  Pu = (u, e) e.
Slide 11 / 41

Projection in Rⁿ / Cⁿ
Similarly to R², let P be the orthogonal projection of u into span{e1, ..., em}, m ≤ n, where ei, i = 1 ... m, is an orthonormal basis, i.e. (ei, ej) = 0 for i ≠ j and (ei, ej) = 1 for i = j. Then
  Pu = (u, e1) e1 + ... + (u, em) em   (Exercise)
where (u, e1) e1 is the orthogonal projection of u on e1.
Slide 12 / 41

How to get an orthonormal basis?
Can get one from every subspace by Gram-Schmidt orthogonalization:
Input : m linearly independent vectors x1, ..., xm
Output : m orthonormal vectors x1, ..., xm
1. x1 = x1 / || x1 ||
2. do i = 2, m
3.   xi = xi - (xi, x1) x1 - ... - (xi, xi-1) xi-1   (Exercise: xi ⊥ x1, ..., xi-1; each term (xi, xj) xj is the orthogonal projection of xi on xj)
4.   xi = xi / || xi ||
5. enddo
Known as Classical Gram-Schmidt (CGS) orthogonalization
Slide 13 / 41
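As a concrete companion to the CGS loop above, here is a minimal NumPy sketch; the function name cgs, the matrix-of-columns interface, and the small random test at the end are illustrative choices, not part of the slides.

```python
import numpy as np

def cgs(X):
    """Classical Gram-Schmidt: orthonormalize the columns of X.

    X is n-by-m with linearly independent columns; returns Q whose columns
    are orthonormal and span the same subspace.
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    Q = np.zeros((n, m))
    for i in range(m):
        v = X[:, i].copy()
        # Step 3 of the slide: subtract from xi its orthogonal projections
        # onto the already computed q1, ..., q_{i-1}.
        for j in range(i):
            v -= (X[:, i] @ Q[:, j]) * Q[:, j]
        # Step 4: normalize.
        Q[:, i] = v / np.linalg.norm(v)
    return Q

# Example: orthonormalize 4 random vectors in R^10 and check Q^T Q ≈ I.
Q = cgs(np.random.default_rng(0).standard_normal((10, 4)))
print(np.linalg.norm(Q.T @ Q - np.eye(4)))
```

Every inner product in the inner loop uses the original column X[:, i]; replacing it with the partially updated v is exactly the change that gives the modified variant on the next slide.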
How to get an orthonormal basis?
What if we replace line 3 with the following (3')?
3.  xi = xi - (xi, x1) x1 - ... - (xi, xi-1) xi-1
3'. do j = 1, i-1
      xi = xi – (xi, xj) xj
    enddo
Equivalent in exact arithmetic (Exercise), but not with round-off errors (next)!
Known as Modified Gram-Schmidt (MGS) orthogonalization
Slide 14 / 41

CGS vs MGS
[Results from Julien Langou]
[Figure: scalability of MGS and CGS on two different clusters; time (sec) vs number of processors (1 to 64) for matrices of size m = 500, 1000, 2000, 4000 per processor, n = 100]
Slide 15 / 41

[Results from Julien Langou]
[Figure: accuracy of MGS vs CGS on matrices of increasing condition number K(A); || I – QᵀQ || and || A – QR || / || A || plotted against K(A)]
Slide 16 / 41

QR factorization
Let A = [x1, ..., xm] be the input for CGS/MGS and Q = [q1, ..., qm] the output; R is the upper m × m triangular matrix defined from the CGS/MGS. Then
  A = Q R
Slide 17 / 41

Other QR factorizations
What about the following? [known as Cholesky QR]
1. G = AᵀA
2. G = L Lᵀ (Cholesky factorization)
3. Q = A (Lᵀ)⁻¹
Does Q have orthonormal columns (i.e. QᵀQ = I), i.e. is A = Q Lᵀ a QR factorization? (Exercise)
When is this feasible, and how does it compare to CGS and MGS?
Slide 18 / 41

Other QR factorizations
Feasible when n >> m.
Allows efficient parallel implementation: blocking both computation and communication.
[Figure: G = AᵀA accumulated block by block across processors P1, P2, ...]
Investigate numerical accuracy and scalability (compare to CGS and MGS) (Exercise).
Slide 19 / 41
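A minimal NumPy sketch of the Cholesky QR construction above, under the assumption that A is tall and skinny with full column rank; the function name cholesky_qr, the test matrix, and the two printed error measures (chosen to mirror || I – QᵀQ || and || A – QR || / || A || from slide 16) are illustrative.

```python
import numpy as np

def cholesky_qr(A):
    """Cholesky QR sketch for a tall, skinny A (n >> m, full column rank).

    Returns Q with orthonormal columns and R = L^T so that A = Q R.
    """
    G = A.T @ A                      # 1. m-by-m Gram matrix (a single reduction)
    L = np.linalg.cholesky(G)        # 2. G = L L^T (Cholesky factorization)
    Q = np.linalg.solve(L, A.T).T    # 3. Q = A (L^T)^{-1}, a triangular solve in practice
    return Q, L.T

# Quick numerical check, mirroring the error measures on slide 16.
rng = np.random.default_rng(0)
A = rng.standard_normal((10000, 100))
Q, R = cholesky_qr(A)
print(np.linalg.norm(Q.T @ Q - np.eye(100)))           # || I - Q^T Q ||
print(np.linalg.norm(A - Q @ R) / np.linalg.norm(A))   # || A - Q R || / || A ||
```

Forming G = AᵀA is a single reduction, which is why this variant blocks and parallelizes well; the price is that the conditioning of G is the square of that of A, which is worth keeping in mind for the accuracy comparison in the exercise.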
How is it done in LAPACK?
Using Householder reflectors
  H = I – 2 w wᵀ
w = ? so that H x1 = e1
w = ... (compute or look at the reference books)
[Figure: reflection of x about the hyperplane with normal w, giving Hx]
Allows us to construct the reduction Xk = Hk-1 ... H1 X, with Q the product of the reflectors (Exercise).
LAPACK implementation : "delayed update" of the trailing matrix + "accumulate transformation" to apply it as BLAS 3
Slide 20 / 41

Part II
Projection in Linear Algebra
Slide 21 / 41

Projection into a general basis
How to define projection without orthogonalization of a basis?
– Sometimes it is not feasible to orthogonalize
– Often the case in functional analysis (e.g. Finite Element Method, Finite Volume Method, etc.) where the basis consists of "linearly independent" functions (more later, and in Lecture 2)
We saw that if X = [x1, ..., xm] is an orthonormal basis,
(*)  Pu = (u, x1) x1 + ... + (u, xm) xm
How does (*) change if the columns of X are just linearly independent?
  Pu = ?
Slide 22 / 41

Projection into a general basis
The problem: find the coefficients C = (c1 ... cm)ᵀ in
  Pu = c1 x1 + c2 x2 + ... + cm xm = X C
so that u – Pu ⊥ span{x1, ..., xm}, or equivalently so that the error e in
(1)  u = P u + e
is ⊥ span{x1, ..., xm}.
Slide 23 / 41

Projection into a general basis
(1)  u = P u + e = c1 x1 + c2 x2 + ... + cm xm + e
Multiply (1) on both sides by a "test" vector/function xj (terminology from functional analysis) for j = 1, ..., m:
  (u, xj) = c1 (x1, xj) + c2 (x2, xj) + ... + cm (xm, xj) + (e, xj),  where (e, xj) = 0,
i.e., m equations for m unknowns. In matrix notation
  (XᵀX) C = Xᵀu   (Exercise), i.e. Xᵀ(Pu) = Xᵀu
XᵀX is the so-called Gram matrix (nonsingular; why?) => there exists a unique solution C.
Slide 24 / 41

Normal equations
The system (XᵀX) C = Xᵀu is also known as the Normal Equations.
The Method of Normal Equations: finding the projection (approximation) Pu ≡ XC of u in X by solving the Normal Equations system.
Slide 25 / 41

Least Squares (LS)
Equivalently, the system
  (XᵀX) C = Xᵀu
also gives the solution of the LS problem
  min_{C ∈ Rᵐ} || X C – u ||
since || v1 – u ||² = || (v1 – Pu) – e ||² = || v1 – Pu ||² + || e ||² ≥ || e ||² = || Pu – u ||²  for ∀ v1 ∈ V1.
Slide 26 / 41

LS
Note that the usual notation for LS is: for A ∈ Rⁿˣᵐ, b ∈ Rⁿ, find min_{x ∈ Rᵐ} || A x – b ||.
Solving LS with QR factorization:
Let A = Q R, so QᵀA = R = [ R1 ; 0 ] with R1 of size m × m and the zero block of size (n – m) × m, and Qᵀb = [ c ; d ] with c ∈ Rᵐ, d ∈ Rⁿ⁻ᵐ. Then
  || Ax – b ||² = || QᵀAx – Qᵀb ||² = || R1 x – c ||² + || d ||²,
i.e. we get the minimum if x is such that R1 x = c.
Slide 27 / 41

Projection and iterative solvers
The problem : solve Ax = b in Rⁿ.
Iterative solution: at iteration i, extract an approximate xi from just a subspace V = span[v1, ..., vm] of Rⁿ.
How? As on slide 22, impose constraints:
  b – Ax ⊥ subspace W = span[w1, ..., wm] of Rⁿ, i.e.
(*)  (Ax, wi) = (b, wi) for ∀ wi ∈ W = span[w1, ..., wm]
Conditions (*) are also known as Petrov-Galerkin conditions.
The projection is orthogonal when V and W are the same (Galerkin conditions), or oblique when V and W are different.
Slide 28 / 41

Matrix representation
Let V = [v1, ..., vm], W = [w1, ..., wm] ...
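To tie the normal-equations and least-squares slides together, here is a small NumPy sketch; the synthetic basis X, the vector u, and all variable names are illustrative. The QR route uses numpy.linalg.qr, which relies on the Householder-based LAPACK factorization mentioned on slide 20.

```python
import numpy as np

# Projection of u onto span{x1, ..., xm} for a general, non-orthogonal basis.
# The basis X and the vector u are synthetic data for illustration only.
rng = np.random.default_rng(1)
n, m = 50, 5
X = rng.standard_normal((n, m))      # columns x1, ..., xm (linearly independent)
u = rng.standard_normal(n)

# Method of normal equations: (X^T X) C = X^T u, then Pu = X C.
G = X.T @ X                          # Gram matrix (nonsingular for independent columns)
C = np.linalg.solve(G, X.T @ u)
Pu = X @ C

# Equivalent least-squares view: C minimizes || X C - u ||.
# Here via QR, solving R1 x = c as on the LS slide.
Q, R = np.linalg.qr(X)               # reduced QR (Householder-based in LAPACK)
C_qr = np.linalg.solve(R, Q.T @ u)

print(np.linalg.norm(C - C_qr))      # the two coefficient vectors agree
print(X.T @ (u - Pu))                # e = u - Pu is orthogonal to every xj (entries ~ 0)
```

In exact arithmetic the two coefficient vectors coincide; the last print checks the defining property of the projection, namely that the residual e = u – Pu is orthogonal to every basis vector xj.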