LA15 Abstracts

IP1
Fast Approximation of the Stability Radius and the H∞ Norm for Large-Scale Linear Dynamical Systems

The stability radius and the H∞ norm are well-known quantities in the robust analysis of linear dynamical systems with output feedback. These two quantities, which are reciprocals of each other in the simplest interesting case, respectively measure how much system uncertainty can be tolerated without losing stability, and how much an input disturbance may be magnified in the output. The standard method for computing them, the Boyd-Balakrishnan-Bruinsma-Steinbuch algorithm from 1990, is globally and quadratically convergent, but its cubic cost per iteration makes it inapplicable to large-scale dynamical systems. We present a new class of efficient methods for approximating the stability radius and the H∞ norm, based on iterative methods to find rightmost points of spectral value sets, which are generalizations of pseudospectra for modeling the linear fractional matrix transformations that arise naturally in analyzing output feedback. We also discuss a method for approximating the real structured stability radius, which offers additional challenges. Finally, we describe our new public-domain MATLAB toolbox for low-order controller synthesis, HIFOOS (H-infinity fixed-order optimization — sparse). This offers a possible alternative to popular model order reduction techniques by applying fixed-order controller design directly to large-scale dynamical systems. This is joint work with Nicola Guglielmi, Mert Gurbuzbalaban and Tim Mitchell. This speaker is supported in cooperation with the International Linear Algebra Society.

Michael L. Overton
New York University
Courant Instit. of Math. Sci.
[email protected]

IP2
Tuned Preconditioners for Inexact Two-Sided Inverse and Rayleigh Quotient Iteration

Computing both right and left eigenvectors of a generalised eigenvalue problem simultaneously is of interest in several important applications. We provide convergence results for inexact two-sided inverse and Rayleigh quotient iteration, which extend the previously established theory to the generalized non-Hermitian eigenproblem and inexact solves with a decreasing solve tolerance. Moreover, we consider the simultaneous solution of the forward and adjoint problem arising in two-sided methods and extend the successful tuning strategy for preconditioners to two-sided methods, creating a novel way of preconditioning two-sided algorithms. This is joint work with Patrick Kuerschner (MPI Magdeburg, Germany).

Melina Freitag
Dept. of Mathematical Sciences
University of Bath
[email protected]

IP3
Sketching-Based Matrix Computations for Large-Scale Data Analysis

Matrix computations lie at the core of a broad range of methods in data analysis and machine learning, and certain primitives (e.g. linear least-squares regression and principal component analysis) are widely and routinely used. Devising scalable algorithms that enable large-scale computation of the aforementioned primitives is crucial for meeting the challenges of Big Data applications. Sketching, which reduces dimensionality through randomization, has recently emerged as a powerful technique for scaling up these primitives in the presence of massive data, for an extensive class of applications. In this talk, we outline how sketching can be used to accelerate these core computations, and elaborate on the tradeoffs involved in their use. We will also demonstrate the utility of the presented algorithms for data analysis and machine learning applications.

Haim Avron
Business Analytics & Mathematical Sciences
IBM T.J. Watson Research Center
[email protected]
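As a rough illustration of the sketch-and-solve idea in the abstract above (a generic textbook sketch, not code from the talk; the dense Gaussian sketch and the problem sizes are arbitrary choices made here for clarity), a sketched least-squares solve in NumPy looks as follows.

```python
import numpy as np

# Overdetermined least-squares problem: min_x ||A x - b||_2
rng = np.random.default_rng(0)
n, d = 10_000, 50
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Gaussian sketch: compress the n rows down to s << n rows.
# (In practice fast transforms or sparse sketches replace the dense S.)
s = 400
S = rng.standard_normal((s, n)) / np.sqrt(s)

# Solve the much smaller sketched problem min_x ||S A x - S b||_2.
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
print("relative difference:", np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```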
IP4
Point-Spread Function Reconstruction in Ground-Based Astronomy

Because of atmospheric turbulence, images of objects in outer space acquired via ground-based telescopes are usually blurry. One way to estimate the blurring kernel or point spread function (PSF) is to make use of the aberration of wavefronts received at the telescope, i.e., the phase. However, only the low-resolution wavefront gradients can be collected by wavefront sensors. In this talk, I will discuss how to use regularization methods to reconstruct high-resolution phase gradients and then use them to recover the phase and the PSF in high accuracy. I will also address related numerical issues such as the estimation of the regularization parameter and the solution of the linear systems arising from the model.

Raymond H. Chan
The Chinese Univ of Hong Kong
Department of Mathematics
[email protected]

IP5
Accelerating Direct Linear Solvers with Hardware and Algorithmic Advances

Factorization-based algorithms often play a significant role in developing scalable solvers. The higher fidelity simulations and extreme-scale parallel machines present unique challenges for designing new parallel algorithms and software. In this talk, we first present some techniques to enhance sparse factorization algorithms to exploit the newer heterogeneous node architectures, such as nodes with GPU accelerators or Intel Xeon Phi. Secondly, we present a new class of scalable factorization algorithms that have asymptotically lower complexity in both flops and storage, for which the acceleration power comes from exploiting low-rank submatrices and randomization.

Xiaoye Sherry Li
Computational Research Division
Lawrence Berkeley National Laboratory
[email protected]

IP6
Variational Gram Functions: Convex Analysis and Optimization

We propose a class of convex penalty functions, called "Variational Gram Functions", that can promote pairwise relations, such as orthogonality, among a set of vectors in a vector space. When used as regularizers in convex optimization problems, these functions of a Gram matrix find application in hierarchical classification, multitask learning, and estimation of vectors with disjoint supports, among other applications. We describe a general condition for convexity, which is then used to prove the convexity of a few known functions as well as new ones. We give a characterization of the associated subdifferential and the proximal operator, and discuss efficient optimization algorithms for loss-minimization problems regularized with these penalty functions. Numerical experiments on a hierarchical classification problem are presented, demonstrating the effectiveness of these penalties and the associated optimization algorithms in practice.

Maryam Fazel
University of Washington
Electrical Engineering
[email protected]

IP7
Combinatorial Matrix Theory and Majorization

Majorization theory provides concepts for comparing mathematical objects according to "how spread out" their elements are. In particular, two n-vectors may be ordered by comparing the partial sums of the k largest components for k ≤ n. This is a fruitful concept in many areas of mathematics and its applications, e.g., in matrix theory, combinatorics, probability theory, mathematical finance and physics. In this talk we discuss some combinatorial problems for classes of matrices where majorization plays a role. This includes (0, 1)-matrices with line sum and pattern constraints, doubly stochastic matrices and Laplacian matrices of graphs. An extension of majorization to partially ordered sets is presented. We also discuss a problem motivated by mathematical finance which leads to interesting questions in qualitative matrix theory. This speaker is supported in cooperation with the International Linear Algebra Society.

Geir Dahl
University of Oslo
Dept. of Mathematics and Dept. of Informatics
geird@ifi.uio.no
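For readers unfamiliar with the ordering used above, the partial-sum test for majorization can be stated in a few lines of NumPy; the function name and tolerance below are choices made here, not part of the talk.

```python
import numpy as np

def majorizes(y, x, tol=1e-12):
    """True if y majorizes x: for every k, the sum of the k largest entries
    of y is at least that of x, and the total sums agree."""
    py = np.cumsum(np.sort(y)[::-1])   # partial sums, descending order
    px = np.cumsum(np.sort(x)[::-1])
    return bool(np.all(py + tol >= px) and abs(py[-1] - px[-1]) < tol)

# (3, 0, 0) is "more spread out" than (1, 1, 1), so it majorizes it.
print(majorizes([3.0, 0.0, 0.0], [1.0, 1.0, 1.0]))  # True
print(majorizes([1.0, 1.0, 1.0], [3.0, 0.0, 0.0]))  # False
```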
IP8
Numerical Solution of Eigenvalue Problems Arising in the Analysis of Disc Brake Squeal

We present adaptive numerical methods for the solution of parametric eigenvalue problems arising from the discretization of partial differential equations modeling disc brake squeal. The eigenvectors are used for model reduction to achieve a low order model that can be used for optimization and control. The model reduction method is a variation of the proper orthogonal decomposition method. Several important challenges arise, some of which can be traced back to the finite element modeling stage. Compared to the current industrial standard, our new approach is more accurate in vibration prediction and achieves a better reduction in model size. This comes at the price of an increased computational cost, but it still gives useful results when the traditional method fails to do so. We illustrate the results with several numerical experiments, some from real industrial models, and indicate where improvements of the current black box industrial codes are advisable. We then also discuss the use of adaptive methods such as the adaptive finite element model and the algebraic multilevel substructuring method for the discussed problem, and we point out the challenges and deficiencies in these approaches.

Volker Mehrmann
Technische Universität Berlin
[email protected]

IP9
Linear Algebra Computations for Parameterized Partial Differential Equations

The numerical solution of partial differential equations (PDEs) often entails the solution of large linear systems of equations. This compute-intensive task becomes more challenging when components of the problem such as coefficients of the PDE depend on parameters that are uncertain or variable. In this scenario, there is a need to compute many solutions for a single simulation, and for accurate discretizations, costs may be prohibitive. We discuss new computational algorithms designed to improve efficiency in this setting, with emphasis on new algorithms to handle stochastic problems and new approaches for reduced-order models.

Howard C. Elman
University of Maryland, College Park
[email protected]

IP10
Accurate Linear Algebra in Computational Methods for System and Control Theory

We discuss the importance of robust and accurate implementation of core numerical linear algebra procedures in computational methods for system and control theory. In particular, we stress the importance of error and perturbation analysis that identifies relevant condition numbers and guides computation with noisy data, and careful software implementation. The themes used as case studies include rational matrix valued least squares fitting (e.g. least squares fit to frequency response measurements of an LTI system), model order reduction issues (e.g. the Discrete Empirical Interpolation Method (DEIM)), and accurate computation with structured matrices such as scaled Cauchy, Vandermonde and Hankel matrices.

Zlatko Drmac
University of Zagreb
Department of Mathematics
[email protected]

IP11
Low Rank Decompositions of Tensors and Matrices: Theory, Applications, Perspectives

Numerical data are frequently organized as d-dimensional matrices, also called tensors. However, only small values of d are allowed since the computer memory is limited. In the case of many dimensions, special representation formats are crucial, e.g. so-called tensor decompositions. Recently, the known tensor decompositions have been considerably revisited, and two of them, previously used only in theoretical physics, are now recognized as the most adequate and useful tools for numerical analysis. These two are the Tensor-Train and Hierarchical-Tucker decompositions. Both are intrinsically related with low-rank matrices associated with a given tensor. We present these decompositions and the role of low-rank matrices for the construction of efficient numerical algorithms.

Eugene Tyrtyshnikov
Institute of Numerical Mathematics, Russian Academy of Sci.
[email protected]
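Both formats mentioned above are built from low-rank factorizations of matrix unfoldings of the tensor. As a bare-bones illustration (assuming a small dense tensor; this is the standard TT-SVD construction, not the speaker's code), the Tensor-Train cores can be obtained by successive reshapes and truncated SVDs.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Bare-bones TT-SVD: successive reshapes and truncated SVDs yield the
    Tensor-Train cores of a dense d-dimensional array."""
    dims = tensor.shape
    cores, r_prev = [], 1
    unfolding = tensor.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(unfolding, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        unfolding = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(unfolding.reshape(r_prev, dims[-1], 1))
    return cores

# Contract the cores back together and check the reconstruction error.
T = np.random.default_rng(0).standard_normal((4, 5, 6, 3))
cores = tt_svd(T, max_rank=30)
full = cores[0]
for core in cores[1:]:
    full = np.tensordot(full, core, axes=([-1], [0]))
print(np.linalg.norm(full.reshape(T.shape) - T) / np.linalg.norm(T))  # ~1e-15
```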

IP12
Constrained Low Rank Approximations for Scalable Data Analytics

Constrained low rank approximations have been widely utilized in large-scale data analytics where the applications reach far beyond the classical areas of scientific computing. We discuss some fundamental properties of nonnegative matrix factorization (NMF) and introduce some of its variants for clustering, topic discovery in text analysis, and community detection in social network analysis. In particular, we show how a simple rank 2 NMF combined with a divide-and-conquer framework results in a simple yet significantly more effective and scalable method for topic discovery. This simple approach can be further generalized for graph clustering and community detection. Substantial experimental results illustrate significant improvements both in computational time as well as quality of solutions obtained.

Haesun Park
Georgia Institute of Technology
[email protected]
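The talk goes well beyond plain NMF, but for orientation, the classical multiplicative-update iteration for a rank-k nonnegative factorization A ≈ WH (a generic textbook scheme, not the divide-and-conquer method of the abstract) takes only a few lines.

```python
import numpy as np

def nmf_multiplicative(A, k, iters=200, eps=1e-9):
    """Plain multiplicative-update NMF: A ≈ W @ H with W, H >= 0."""
    rng = np.random.default_rng(0)
    m, n = A.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

A = np.abs(np.random.default_rng(1).standard_normal((50, 40)))
W, H = nmf_multiplicative(A, k=2)
print("relative residual:", np.linalg.norm(A - W @ H) / np.linalg.norm(A))
```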

SP1
SIAG/Linear Algebra Prize Lecture - Localizing Nonlinear Eigenvalues: Theory and Applications

Vibrations are everywhere, and so are the eigenvalues that describe them. Physical models that involve damping, delay, or radiation often lead to nonlinear eigenvalue problems, in which we seek complex values for which an (analytic) matrix-valued function is singular. In this talk, we show how to generalize eigenvalue localization results, such as Gershgorin's theorem, Bauer-Fike, and pseudospectral theorems, to the nonlinear case. We demonstrate the usefulness of our results on examples from delay differential equations and quantum resonances.

David Bindel
Cornell University
[email protected]

CP1
An Augmented Hybrid Method for Large Scale Inverse Problems

In this work we use the weighted-GCV hybrid method of Chung/Nagy/O'Leary for solving ill-posed inverse problems. This method restricts the solution to a Krylov subspace and then uses regularization techniques on the projected problem in order to stop the typical semiconvergence behavior of iterative methods. Difficulties arise for large problems where we are required to store the full Krylov space. We have developed a technique to compress the space using harmonic Ritz vectors. Computational examples are provided.

Geoffrey Dillon
Texas Tech University
[email protected]

Julianne Chung, Eric De Sturler
Virginia Tech
[email protected], [email protected]
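The "project, then regularize" structure underlying hybrid methods of this kind can be sketched as follows: run Golub-Kahan (Lanczos) bidiagonalization and apply Tikhonov regularization to the small projected problem. This is a generic sketch with a fixed regularization parameter and no reorthogonalization; in the abstract above the parameter is instead chosen adaptively by weighted GCV and the Krylov space is compressed with harmonic Ritz vectors.

```python
import numpy as np

def hybrid_gk_tikhonov(A, b, k=20, lam=1e-2):
    """Golub-Kahan bidiagonalization of (A, b), then Tikhonov regularization
    of the small (k+1)-by-k projected least-squares problem."""
    m, n = A.shape
    U = np.zeros((m, k + 1))
    V = np.zeros((n, k))
    B = np.zeros((k + 1, k))                 # lower bidiagonal projection
    beta = np.linalg.norm(b)
    U[:, 0] = b / beta
    for j in range(k):
        v = A.T @ U[:, j]
        if j > 0:
            v -= B[j, j - 1] * V[:, j - 1]
        alpha = np.linalg.norm(v)
        V[:, j] = v / alpha
        B[j, j] = alpha
        u = A @ V[:, j] - alpha * U[:, j]
        B[j + 1, j] = np.linalg.norm(u)
        U[:, j + 1] = u / B[j + 1, j]
    # Tikhonov on the projected problem: min ||B y - beta*e1||^2 + lam^2*||y||^2
    rhs = np.zeros(k + 1)
    rhs[0] = beta
    y = np.linalg.solve(B.T @ B + lam ** 2 * np.eye(k), B.T @ rhs)
    return V @ y
```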

CP1
On a Nonlinear Inverse Problem in Electromagnetic Sounding

Electromagnetic induction measurements are often used for non-destructive investigation of certain soil properties, which are affected by the electromagnetic features of the subsurface layers. Starting from electromagnetic data collected by a ground conductivity meter, we propose a regularized inversion method based on a low-rank approximation of the Jacobian of the nonlinear model. The method depends upon a relaxation parameter and a regularization parameter, both chosen by automatic procedures. The performance of the method is investigated by numerical experiments both on synthetic and experimental data sets.

Caterina Fenu
University of Cagliari, Italy
[email protected]

Gian Piero Deidda
University of Cagliari
[email protected]

Giuseppe Rodriguez
University of Cagliari, Italy
[email protected]

CP1
Localized-DEIM: An Overlapping Cluster Framework with Application in Model Reduction for Nonlinear Inversion

A numerically efficient application of parametric model reduction critically depends on affine parametrization of the full-order state-space quantities so that the online projection step does not depend on the dimension of the full model. The Discrete Empirical Interpolation Method (DEIM) is commonly used to construct such affine approximations. In this talk, we propose overlapping clustering methods to improve the approximation power of local DEIM and investigate efficient implementations of the underlying greedy selection procedure. A nonlinear parametric inversion example arising in diffuse optical tomography is used to illustrate the approach.

Alexander R. Grimm
Department of Mathematics
Virginia Tech
[email protected]

Serkan Gugercin
Virginia Tech
Department of Mathematics
[email protected]
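For context, the standard (global) DEIM index selection that the localized variant above builds on is a short greedy loop over the columns of a POD basis U; this is the textbook algorithm, not the overlapping-cluster method proposed in the talk.

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM interpolation-index selection for a basis U (n-by-m)."""
    n, m = U.shape
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for l in range(1, m):
        # Interpolate the next basis vector at the rows selected so far ...
        c = np.linalg.solve(U[idx, :l], U[idx, l])
        # ... and add the row where the interpolation error is largest.
        r = U[:, l] - U[:, :l] @ c
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

# Example: select interpolation rows from an orthonormal basis.
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((200, 8)))
print(deim_indices(Q))
```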

CP1
Matrix Affine Transformation Algorithm for Rank-Reducing Image Data Informatics Process

The matrix affine transformation T = QR + C is used to maximize informatics content in the orthogonal QR process. Sample matrices were taken from images of quadrant pixel-shift patches across evenly spaced 128x128 grid points. Our proposed algorithm cures the ill-posedness in the matrix inversion. By retrieving a specific rank-one pattern from a correction matrix C, we can enrich the principal QR factorization. We tested the optimal matrix-rank-reducing process particularly with the SVD analysis.

Jay Min Lee
Pohang Accelerator Lab, POSTECH
[email protected]

Youngjoo Chung
School of Info. and Comm.
GIST
[email protected]

Yonghoon Kwon
Department of Mathematics, POSTECH
[email protected]

CP1
Efficiencies in Global Basis Approximation for Model Order Reduction in Diffuse Optical Tomography

We consider the nonlinear inverse problem of reconstructing parametric images of optical properties from diffuse optical tomographic data. Recent work shows MOR techniques have promise in mitigating the computational bottleneck associated with solving for the parameters. In this talk, we give an algorithm for efficiently computing the approximate global basis needed in MOR by utilizing a new interpretation of the transfer function and by capitalizing on Krylov recycling in a novel way.

Meghan O'Connell
Department of Mathematics
Tufts University
[email protected]

Misha E. Kilmer
Tufts University
[email protected]

Eric De Sturler
Virginia Tech
[email protected]

Serkan Gugercin
Virginia Tech
Department of Mathematics
[email protected]

Christopher A. Beattie
Virginia Polytechnic Institute and State University
[email protected]

CP1
Robust Multi-Instance Regression

Multi-instance regression consists in building a regression model that maps sets of instances (bags) to real-valued outputs. The Primary-Instance Regression (PIR) method assumes that there is some primary instance (unknown during training) which is responsible for the real-valued label and that the rest of the items in the bag are noisy observations of the primary instance. To immunize the primary instance selection to noise, we propose a hyperplane fitting procedure which exploits the algebraic equivalence between regularization (a technique to prevent overfitting) and robustness (a technique to immunize against set-induced uncertainty) and finds regularizers without requiring cross validation.

Dimitri Papadimitriou
Bell Labs
[email protected]

CP2
Iterative Refinement for Symmetric Eigenvalue Decomposition and Singular Value Decomposition

An efficient iterative refinement algorithm is proposed for symmetric eigenvalue problems. The algorithm is simple, and it mainly consists of matrix multiplications. It constructs an arbitrarily accurate eigenvalue decomposition, up to the limit of computational precision. Using similar techniques, an iterative refinement algorithm for the singular value decomposition is also derived. Since the proposed algorithms are based on Newton's method, they converge quadratically. Numerical results demonstrate the excellent performance of the proposed algorithm.

Kensuke Aishima
Graduate School of Information Science and Technology, University of Tokyo
Kensuke [email protected]

Takeshi Ogita
Tokyo Woman's Christian University
[email protected]
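The abstract above refines the whole decomposition at once using matrix-matrix products; as a much smaller-scale illustration of why Newton-type refinement converges quadratically, here is the textbook Newton iteration for sharpening a single approximate eigenpair of a symmetric matrix (an analogy only, not the authors' block algorithm).

```python
import numpy as np

def refine_eigenpair(A, v, lam, steps=3):
    """Newton's method on F(v, lam) = [A v - lam v; (v'v - 1)/2] = 0."""
    n = A.shape[0]
    for _ in range(steps):
        F = np.concatenate([A @ v - lam * v, [(v @ v - 1.0) / 2.0]])
        J = np.zeros((n + 1, n + 1))
        J[:n, :n] = A - lam * np.eye(n)
        J[:n, n] = -v
        J[n, :n] = v
        delta = np.linalg.solve(J, -F)
        v, lam = v + delta[:n], lam + delta[n]
        print("eigenresidual:", np.linalg.norm(A @ v - lam * v))
    return v, lam

# Perturb an exact eigenpair and watch the residual drop quadratically.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = (B + B.T) / 2
w, V = np.linalg.eigh(A)
v0 = V[:, 0] + 1e-2 * rng.standard_normal(6)
refine_eigenpair(A, v0 / np.linalg.norm(v0), w[0] + 1e-2)
```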

CP2
Some Inverse Numerical Range Problems

We generalize the inverse field of values problem to the q-numerical range and the rank-k numerical range of a matrix. We propose an algorithm for solving the inverse q-numerical range problem. Our algorithm exploits the convexity of the q-numerical range. Approximating the boundary of the q-numerical range requires constructing an approximation of the Davis-Wielandt shell of a matrix. We note some connections to computing pseudospectra. For the rank-k numerical range, we have found a particular generalized eigenvalue problem whose solution facilitates constructing subspaces for generating points in the rank-k numerical range of a matrix, and also addresses the question of covering numbers for points in the rank-k numerical range. The results in this work could have applications in computing eigenvalues as well as quantum computing.

Russell Carden
Department of Mathematics
University of Kentucky
[email protected]

M. Jahromi
Shahid Bahonar University, Iran
[email protected]

Katsouleas, Maroulas
National Technical University of Athens, Greece
[email protected], [email protected]

CP2
An Algorithm for Finding a 2-Similarity Transformation from a Numerical Contraction to a Con-

traction [email protected]

Any matrix A with numerical radius at most 1 can be writ- ten as A = STS−1,whereT ≤1andS·S−1≤2. CP2 However, no explicit algorithm was given for producing Two-Level Orthogonal Arnoldi Method for Large such a similarity transformation. In this paper, we give Rational Eigenvalue Problems a method for constructing such similarity transformations. As a side benefit, the algorithm indicates if the numerical We propose a two-level orthogonal Arnoldi method for solv- radius of A is greater than than some given number and so ing large rational eigenvalue problems (REP), which ex- can be used to determine if the numerical radius is greater ploits the structure of the Krylov subspace of a Frobenius- than a given value. like linearization of the REP. We develop such method by using a different representation of the Krylov vec- Daeshik Choi tors, which is much more memory efficient than stan- Southern Illinois University Edwardsville dard Arnoldi applied on the linearization. In addition, we [email protected] present numerical examples that show that the accuracy of the new method is comparable to standard Arnoldi. Anne Greenbaum University of Washington Javier A. Gonz´alez Pizarro [email protected] Universidad Carlos III de Madrid [email protected]

CP2 Froilan Dopico Department of Mathematics The Markovian Joint Spectral Radius: What It Is Universidad Carlos III de Madrid, Spain and How to Compute It Efficiently [email protected] N Given a finite set of matrices F = {Ai}i=1,withAi ∈ Cd × d,theJointSpectralRadius(JSR)ofF is given by CP2 the generalization of the Gelfand’s formula for the spectral A Communication-Avoiding Arnoldi-Type of the radius of a matrix. In recent works it has been proved that Complex Moment-Based Eigensolver the JSR can be computed exactly, under suitable and gen- eral conditions, using polytope norms. In some cases, how- For solving interior eigenvalue problems, complex moment- ever, not all the products are allowed, because the matrices based eigensolvers have been actively studied because of in F are multiplied each other following some Markovian their high parallel efficiency. Recently, we proposed the law. Recently Kozyakin showed in [1] that it is still pos- Arnoldi-type complex moment-based eigensolver named sibile to compute Joint Spectral Radius in the Markovian the block SS–Arnoldi method. In this talk, we propose case as the classical JSR of a significantly higher dimen- an improvement of the block SS–Arnoldi method using  N    Nd × Nd a communication-avoiding Arnoldi process (s-step Arnoldi sional set of matrices F = Ai ,withAi ∈ C . i=1 process). We evaluate the performance of the proposed This implies that the exact evaluation of the Markovian method and compare with other complex moment-based JSR can be achieved in general using a polytope norm in eigensolvers. CNd, which is a challenge task if N is large. In this talk we address the question whether it is possible to reduce the Akira Imakura,TetsuyaSakurai computational complexity for the calculation of the Marko- Department of Computer Science vian JSR showing that it is possible to transform the prob- University of Tsukuba d lem into the evaluation of N polytope norms in C .This [email protected], [email protected] approach is strictly related with the idea of multinorms introduced by Jungers and Philippe for discrete–time lin- ear constrained switching systems [2]. As an illustrative CP3 application we shall consider the zero–stability of variable Block-Smoothers in Multigrid Methods for Struc- stepsize 3–step BDF formulas. tured Matrices

Usually in multigrid methods for structured matrices, like [1] V. Kozyakin. The Berger–Wang formula for the Toeplitz matrices or circulant matrices, as well as in geo- Markovian joint spectral radius. Linear Algebra and its metric multigrid methods point-smoothers are used. Anal- Applications. 448 (2014), 315–328. [2] M. Philippe and ysis for structured matrices mostly focusses on simple R. Jungers. Converse Lyapunov theorems for discrete-time methods like Richardson, smoothers like multicolor-SOR linear switching systems with regular switching sequences. are not considered. In this talk we assess general block- (2014) arXiv preprint arXiv:1410.7197. smoothers, where small blocks are inverted instead of single unknowns. The presented analysis fits in the established Antonio Cicone analysis framework and results in better converging multi- Universit`a degli studi dell’Aquila grid methods. [email protected] Matthias Bolten Nicola Guglielmi University of Wuppertal Universita degli Studi dell’Aquila Department of Mathematics [email protected] [email protected]

Vladimir Y. Protasov Moscow State University CP3 Department of Mechanics and Mathematics Multigrid Preconditioners for Boundary Control of LA15 Abstracts 55

Elliptic-Constrained Optimal Control Problems [email protected]

Our goal is to design and analyze efficient multigrid pre- Karsten Kahl conditioners for solving boundary control problems for Bergische Universit¨at Wuppertal linear-quadratic elliptic-constrained optimal control prob- Department of Mathematics lems. We have considered Dirichlet and Neumann bound- [email protected] ary conditions. Thus far, we have numerically obtained an optimal order preconditioner for the Hessian of the re- duced Neumann boundary control problem and suboptimal Bjoern Leder order preconditioner for the Hessian of the reduced Dirich- Bergische Universitaet Wuppertal let boundary control problem. Currently we are analyzing Department of Mathematics the behavior of the preconditioner in theory. [email protected]

Mona Hajghassem, Andrei Draganescu Stefan Krieg Department of Mathematics and Statistics Forschungszentrum Juelich University of Maryland, Baltimore County Juelich Supercomputing Centre [email protected], [email protected] [email protected]

Harbir Antil CP3 George Mason University Fairfax, VA Multigrid for Tensor-Structured Problems [email protected] We consider linear systems with tensor-structured matrices  t CP3 A = Ej . A Multigrid Solver for the Tight-Binding Hamilto- nian of Graphene These problems are found in Markov chains or high- dimensional PDEs. Due to this format, the dimension of A Since the Nobel prize has been awarded in 2010 for grows rapidly. The structure has to be exploited for solving the isolation of graphene, research on this miraculous 2- these systems efficiently. To use multigrid for these mod- dimensional material has flourished. In order to calculate els we build a method which keeps the structure intact to the electronic properties of graphene structures a tight- guarantee computational savings on all grids. We present binding approach can be used. The resulting discrete eigen- how to adapt the AMG framework to this setting using value problem leads to linear systems of equations that are tensor truncation. maximally indefinite and possess a Dirac pseudo-spin struc- ture. In this talk we present a spin-preserving geometric Sonja Sokolovic multigrid approach for these linear systems and show its Bergische Universitaet Wuppertal scalability with respect to several geometric parameters. [email protected]

Nils Kintscher Matthias Bolten Fachbereich C University of Wuppertal Applied Computer Science Group Department of Mathematics [email protected] [email protected]

Karsten Kahl Karsten Kahl Bergische Universit¨at Wuppertal Bergische Universit¨at Wuppertal Department of Mathematics Department of Mathematics [email protected] [email protected]

CP3 CP3 Adaptive Algebraic Multigrid for Lattice QCD Multigrid Preconditioning for the Overlap Opera- tor in Lattice QCD In this talk, we present a multigrid approach for systems of linear equations involving the Wilson Dirac operator aris- One of the most important operators in lattice QCD is ing from lattice QCD. It combines components that have given by the Overlap Dirac Operator. Due to bad condi- already been used separately in lattice QCD, namely the tioning solving linear systems with this operator can be- domain decomposition method ”Schwarz Alternating Pro- come rather challenging when approaching physically rel- cedure” as a smoother and the aggregation based inter- evant parameters and lattice spacings. In this talk we polation. We give results from a series of numerical tests present and analyze a novel preconditioning technique that from our parallel MPI-C Code with system sizes of up to yields significant speedups. Furthermore we take a closer 200,000,000 unknowns. look at the matrix sign function and its evaluation as part of the Overlap Dirac Operator. Matthias Rottmann Bergische Universitaet Wuppertal Artur Strebel Department of Mathematics Bergische Universitaet Wuppertal [email protected] Department of Mathematics [email protected] Andreas J. Frommer Bergische Universitaet Wuppertal James Brannick Fachbereich Mathematik und Naturwissenschaften Department of Mathematics 56 LA15 Abstracts

Pennsylvania State University of a band matrix to tridiagonal form is considered. We [email protected] present some details of implemented optimizations: dy- namic parallelization of eigenvectors computations, specu- Andreas J. Frommer lative computations in QR, dynamic parallelization of the Bergische Universitaet Wuppertal reduction of banded matrix to tridiagonal form and how Fachbereich Mathematik und Naturwissenschaften these techniques are combined for achieving high perfor- [email protected] mance. Nadezhda Mozartova, Sergey V Kuznetsov, Aleksandr Karsten Kahl Zotkevich Bergische Universit¨at Wuppertal Intel Corporation Department of Mathematics [email protected], [email protected] [email protected], alek- [email protected] Bj¨orn Leder, Matthias Rottmann Bergische Universitaet Wuppertal Department of Mathematics CP4 [email protected], [email protected] Performance of the Block Jacobi-Davidson Method wuppertal.de for the Solution of Large Eigenvalue Problems on Modern Clusters CP4 We investigate a block Jacobi-Davidson method for com- Devide-and-conquer Method for Symmetric- puting a few exterior eigenpairs of a large . definite Generalized Eigenvalue Problems of The block method typically requires more matrix-vector BandedMatricesonManycoreSystems and vector-vector operations than the standard algorithm. We have recently proposed a divide-and-conquer method However, this is more than compensated by the perfor- for banded symmetric-definite generalized eigenvalue prob- mance gains through better data reusage on modern CPUs, lems based on the method for tridiagonal ones (Elsner, which we demonstrate by detailed performance engineering 1997). The method requires less FLOPs than the con- and numerical experiments. The key ingredients to achiev- ventional methods (e.g. DSBGVD in LAPACK) when the ing high performance consist in both kernel optimizations band is narrow and has high parallelism. In this presen- and a careful design of the algorithm that allows using tation, we describe the implementation of the proposed blocked operations in most parts of the computation. We method for manycore systems and demonstrate the effi- show the performance gains of the block algorithm with our ciency of our solver. hybrid parallel implementation for a variety of matrices on up to 5 120 CPU cores. A new development we discuss in Yusuke Hirota this context is a highly accurate and efficient block orthog- RIKEN AICS onalization scheme that exploits modern hardware features [email protected] and mixed precision arithmetic. Melven Roehrig-Zoellner Toshiyuki Imamura RIKEN Advance Institute for Computational Science German Aerospace Center (DLR) [email protected] Simulation and Software Technology [email protected]

CP4 Jonas Thies Performance Analysis of the Householder Back- German Aerospace Center (DLR) transformation with Asynchronous Collective [email protected] Communication Achim Basermann Recently, communication avoiding and communication hid- German Aerospace Center (DLR) ing technologies are focused on to accelerate the per- Simulation and Software Technology formance on parallel supercomputer systems. For dense [email protected] eigenvalue computation, especially Householder back- transformation of eigenvectors, we observed asynchronous Florian Fritzen, Patrick Aulbach collective communication is applicable and some special German Aerospace Center (DLR) performance characteristics and deviations via actual nu- fl[email protected], [email protected] merical experiences, specifically, the peak performance is unevenly distributed. We are going to build a performance model and analyze its response when we change some per- CP4 formance parameters defined in the performance model. Performance Comparison of Feast and Primme in Toshiyuki Imamura Computing Many Eigenvalues in Hermitian Prob- RIKEN Advance Institute for Computational Science lems [email protected] Contour integration methods like FEAST are successfully used to partially solve large eigenproblems arising in Den- CP4 sity Functional Theory. These methods are scalable and Dynamic Parallelization for the Reduction of a can show better performance than restarted Krylov solvers Banded Matrix to Tridiagonal Form in computing many eigenpairs in the interior of the spec- trum. Recently we have used polynomial filters with Gen- In the talk a new dynamic parallelization for the reduction eralized Davidson (GD) with similar success under limited LA15 Abstracts 57

memory. In this talk, we compare FEAST with the GD Michele Benzi available in PRIMME, solving Hermitian problems from Department of Mathematics and Computer Science different applications. Emory University [email protected] Eloy Romero Alcalde Computer Science Department College of Williams & Mary CP5 [email protected] Generalizing Spectral Graph Partitioning to Sparse Andreas Stathopoulos Tensors College of William & Mary Department of Computer Science Spectral graph partitioning is a method for clustering data [email protected] organized as undirected graphs. The method is based on the computation of eigenvectors of the graph Laplacian. In many areas one wants to cluster data from a sequence of CP4 graphs. Such data can be organized as large sparse tensors. The Implicit Hari-Zimmermann Algorithm for the We present a spectral method for tensor partitioning based Generalized Svd on the computation of a best multilinear rank aproximation of the tensor. A few applications are briefly discussed. We developed the implicit Hari–Zimmermann method for computation of the generalized singular values of matrix Lars Eld´en pairs, where one of the matrices is of full column rank. Link¨oping University The method is backward stable, and, if the matrices per- Department of Mathematics mit, computes the generalized singular values with small [email protected] relative errors. Moreover, it is easy to parallelize. Un- like the triangular routine DTGSJA from Lapack, the Hari– Zimmermann method needs no preprocessing to make both CP5 matrices triangular. Even when the matrices are prepro- cessed to a triangular form, the sequential pointwise Hari– Graph Partitioning with Spectral Blends Zimmermann method is, for matrices of a moderate size, significantly faster than DTGSJA. A significant speedup is Spectral partitioning uses eigenvectors and eigenvalues to obtained by blocking of the algorithm, to exploit the effi- partition graphs. We show that blends of eigenvectors have ciency of BLAS-3 operations. better pointwise error bounds than the eigenvectors for graphs with small spectral gaps. We use a model problem, Sanja Singer the Ring of Cliques, to demonstrate the utility of spectral Faculty of Mechanical Engineering and Naval blends for graph partitioning. This analysis provides a con- Architecture, University of Zagreb vergence criterion in terms of conductance and the Cheeger [email protected] inequality.

Vedran Novakovic James P. Fairbanks STFC, Daresbury Laboratory Georgia Institute of Technology Warrington, United Kingdom [email protected] [email protected] Geoffrey D. Sanders Sasa Singer Center for Applied Scientific Computing Department of Mathematics, University of Zagreb Lawrence Livermore National Lab [email protected] [email protected]

CP5 Heuristics for Optimizing the Communicability of CP5 Digraphs On the Relation Between Modularity and Adja- cency Graph Partitioning Methods In this talk we investigate how to modify the edges of a directed network in order to tune its overall broadcast- Modularity clustering is a popular method in partitioning ing/receiving capacity. We introduce broadcasting and re- graphs. In this talk we will present the relation between ceiving measures (related to the network’s “hubs” and “au- the leading eigenvector b of a modularity matrix and the thorities”, respectively) in terms of the entries of appropri- T T eigenvectors ui of its corresponding adjacency matrix. This ate functions of the matrices AA and A A, respectively, relation allows us to approximate the vector b with some where A is the adjacency matrix of the network. The larger of the ui. The equivalence between normalized versions of these measures, the better the network is at propagating in- modularity clustering and adjacency clustering will also be formation along its edges. We also investigate the effect of demonstrated. edge addition, deletion, and rewiring on the overall broad- casting/receiving capacity of a network. In particular, we show how to add edges so as to increase these quantities as Hansi Jiang much as possible, and how to delete edges so as to decrease North Carolina State University them as little as possible. [email protected]

Francesca Arrigo Carl Meyer Universit`a degli Studi dell’Insubria Mathematics Department [email protected] N. C. State University 58 LA15 Abstracts

[email protected] for Hessenberg TN matrices whose minors are all nonneg- ative. In this talk, we consider inverse eigenvalue prob- lems for TN matrices including symmetric tridiagonal ma- CP5 trices from the viewpoint of discrete integrable systems as- Detecting Highly Cyclic Structure with Complex sociated with matrix eigenvalue algorithms. Moreover, we Eigenpairs propose a finite-step construction of TN matrices with pre- scribed eigenvalues. Highly 3- and 4-cyclic subgraph topologies are detectable via calculation of eigenvectors associated with certain com- Kanae Akaiwa,YoshimasaNakamura plex eigenvalues of Markov propagators. We characterize Graduate School of Informatics, Kyoto University this phenomenon theoretically to understand the capabil- [email protected], [email protected] ities and limitations for utilizing eigenvectors in this ven- ture. We provide algorithms for approximating these eigen- Masashi Iwasaki vectors and give numerical results, both for software that Department of Informatics and Environmental Science, utilizes complex arithmetic and software that is limited to Kyoto Prefectural University real arithmetic. Additionally, we discuss the application of [email protected] these techniques to motif detection. Christine Klymko Hisayoshi Tsutsumi, Akira Yoshida Lawrence Livermore National Laboratory Doshisha University [email protected] [email protected], [email protected] Geoffrey D. Sanders Center for Applied Scientific Computing Koichi Kondo Lawrence Livermore National Lab Doshisha University [email protected] Dept. of Engineering [email protected]

CP5 Orthogonal Representations, Projective Rank, and CP6 Fractional Minimum Positive Semidefinite Rank: A Kinetic Ising Model and the Spectra of Some Connections and New Directions Jacobi Matrices

This paper introduces r-fold orthogonal representations of One-dimensional statistical physics models play a funda- graphs and formalizes the understanding of projective rank mental role in understanding the dynamics of complex sys- as fractional orthogonal rank. Fractional minimum positive tems in a variety of fields, from chemistry and physics, semidefinite rank is defined and it is shown that the pro- to social sciences, biology, and nanoscience. Due to their jective rank of any graph equals the fractional minimum simplicity, they are amenable to exact solutions that can positive semidefinite rank of its complement. An r-fold lead to generalizations in higher dimensions. In 1963, version of the traditional definition of minimum positive R. Glauber solved exactly a one-dimensional spin model, semidefinite rank of a graph using Hermitian matrices that known in literature as the kinetic Ising chain, KISC, that fit the graph is also presented. led to many applications and two-temperature generaliza- tions. In this talk we consider a case of temperature dis- Leslie Hogben tributions, extracting information regarding the physical Iowa State University, properties of the system from the spectrum analysis of ma- American Institute of Mathematics trix certain Jacobi matrix. We also analyze the eigenvalues [email protected] of some perturbed Jacobi matrices. The results contain as particular cases the known spectra of several classes of Kevin Palmowski tridiagonal matrices studied recently. This is a join work Iowa State University with S. Kouachi, D.A. Mazilu, and I. Mazilu. [email protected] Carlos Fonseca David Roberson Department of Mathematics Division of Mathematical Sciences University of Coimbra Nanyang Technological University [email protected] [email protected] Said Kouachi Simone Severini Qassim University, Al-Gassim, Buraydah Department of Computer Science Saudi Arabia University College London [email protected] [email protected] Dan Mazilu, Irina Mazilu Washington and Lee University, VA CP6 USA Inverse Eigenvalue Problems for Totally Nonnega- [email protected], [email protected] tive Matrices in Terms of Discrete Integrable Sys- tems CP6 Several discrete integrable systems play key roles in ma- The Molecular Eigen-Problem and Applications trix eigenvalue algorithms such as the qd algorithm for symmetric tridiagonal matrices and the dhToda algorithm Referring to the classical eigenproblem of a matrix as the LA15 Abstracts 59

atomic eigenproblem, we extend this notion to a molecu- we present numerical algorithms for their solution. Finally, lar eigenproblem: Consider the module of matrices M = present computations some medium size and large sparse ICn×m with n ≥ m, over the noncommutative matrix ring matrices arising in different scientific and industrial appli- R = ICm×m.Them-molecular eigen-problem is the prob- cations. lem of determining Λ ∈ ICm×m and X ∈ ICn×m such that AX = XΛ. Existence conditions and characterizations of Vladimir Kostic its solution are given. If (X, Λ) is an m-molecular eigen- University of Novi Sad, Faculty of Science pair of A,thensois(XR,Λ), for all R ∈ C(Λ), the cen- Department of Mathematics and Informatics tralizer of the set of molecular eigenvalues. This freedom [email protected] may be exploited to define a canonical molecular eigenvec- tor. We apply this generalized formalism to solve matrix- Agnieszka Miedlar ODE’s, and give a lattice theoretic representation of the Technische Universit¨at Berlin structure of the solution sets. [email protected]

Erik Verriest School of Electrical and Computer Engineering CP6 Georgia Institute of Technology A Multigrid Krylov Method for Eigenvalue Prob- [email protected] lems

Nak-Seung Hyun We propose a new multigrid Krylov method for eigenvalue School of Electrical and Computer Engineering problems of differential operators. Arnoldi methods are Georgia Institute of Technology used on multiple grids. Approximate eigenvectors from a [email protected] coarse grid can be improved on a fine grid. We compare the new method with other approaches, and we also give anal- ysis of the convergence. This multigrid Arnoldi method is CP6 more robust than standard multigrid and has potential for Spectral Properties of the Boundary Value Prob- dramatic improvement compared to regular Arnoldi. lems for the Discrete Beam Equation Zhao Yang In this talk, we will consider the boundary-value problem Department of Mathematics for the fourth-order discrete beam equation Oklahoma State University A [email protected] 4 Δ yi + bi+2yi+2 = λai+2yi+2, −1 ≤ i ≤ n − 2, Ronald Morgan 2 3 2 3 Δ y−1 =Δ y−1 =Δ yn−1 =Δ yn−1 =0, Department of Mathematics which is a discrete analogy to the following boundary-value Baylor University problem for the fourth-order linear beam equation: ronald [email protected]

(4)     y (t)+b(t)y(t)=λa(t)y(t),y(0) = y (0) = y (1) = y (1) =CP7 0. For the ordinary differential equation boundary-value prob- Block-Asynchronous Jacobi Iterations with Over- lem, the monotonicity of the smallest positive eigenvalue lapping Domains was studied in the literature. The special structure of the Block-asynchronous Jacobi is an iteration method where matrices associated with the discrete problem allows us to a locally synchronous iteration is embedded in an asyn- analyze the spectral properties of the problem and to es- chronous global iteration. The unknowns are partitioned tablish the monotonicity of all eigenvalues of the discrete n n into small subsets, and while the components within the problem as the sequences {ai}i=1 and {bi}i=1 change. same subset are iterated in Jacobi fashion, no update order Jun Ji,BoYang in-between the subsets is enforced. The values of the non- Kennesaw State University local entries remain constant during the local iterations, [email protected], [email protected] which can result in slow inter-subset information propa- gation and slow convergence. Interpreting of the subsets as subdomains allows to transfer the concept of domain CP6 overlap typically enhancing the information propagation Matrix Nearness Problems for General Lyapunov- to block-asynchronous solvers. In this talk we explore the Type Stability Domains impact of overlapping domains to convergence and perfor- mance of block-asynchronous Jacobi iterations, and present In many applications asymptotic instability of the mathe- results obtained by running this solver class on state-of-the- matical (dynamical) system leads to the loss of the struc- art HPC systems. tural integrity of the real physical system due to the am- plification of the perturbations in the initial conditions. Hartwig Anzt However, in some cases, although dynamical system is Innovate Computing Lab asymptotically stable, the corresponding physical system University of Tennessee can loose its structural integrity due to transitional insta- [email protected] bility (due to too large amplitude or too high frequency) typical for dynamics governed by nonnormal matrices. To Edmond Chow consider such applications, we review general Lyapunov- School of Computational Science and Engineering type domains and formulate two matrix nearness problems Georgia Institute of Technology - the distance to delocalization and distance to localization [email protected] to generalize the distance to instability and the distance to stability, in both, discrete and continuous sense. Then, Daniel B. Szyld 60 LA15 Abstracts

Temple University In this talk, we present the new interfaces and implemen- Department of Mathematics tation details of the batched matrix- [email protected] routines in IntelR Math Kernel Library 11.3. Compared to the optimized non-batched counterparts, batched routines provide 8× and 15× speedups on average for IntelR XeonR Jack J. Dongarra TM Department of Computer Science processor and IntelR Xeon Phi coprocessor respectively. The University of Tennessee [email protected] Murat E. Guney, Sarah Knepper, Kazushige Goto, Vamsi Sripathi, Greg Henry, Shane Story CP7 Intel Corporation [email protected], [email protected], Performance Evaluation of the Choleskyqr2 Algo- [email protected], [email protected], rithm [email protected], [email protected] Cholesky QR computes the QR factorization through the Cholesky factorization. It has excellent suitability for HPC but is rarely practical due to its numerical instability. Re- CP7 cently, we have pointed out that an algorithm that repeats Cholesky QR twice, which we call CholeskyQR2, has much Data Sparse Technique improved stability. In this talk, we present the performance results of CholeskyQR2 and show its practicality. We also In this talk we will describe how H-matrix data sparse tech- discuss its application to the block Gram-Schmidt orthog- niques can be implemented in a parallel hybrid sparse linear onalization and the block Householder QR algorithm. solver based on algebraic non overlapping domain decom- position approach. Strong-hierarchical matrix arithmetic Takeshi Fukaya and various clustering techniques to approximate the local Hokkaido University Schur complements will be investigated, aiming at reduc- [email protected] ing workload and memory consumption while complying with structures of the local interfaces of the sub-domains. Yuji Nakatsukasa Utilization of these techniques to form effective global pre- Department of Mathematics conditioner will be presented. University of Manchester [email protected] Yuval Harness Inria Yuka Yanagisawa [email protected] Waseda University [email protected] Luc Giraud Inria Yusaku Yamamoto Joint Inria-CERFACS lab on HPC The University of Electro-Communications, Japan [email protected] [email protected] Emmanuel Agullo CP7 INRIA [email protected] High Performance Resolution of Dense Linear Sys- tems Using Compression Techniques and Applica- tion to 3D Electromagnetics Problems Eric Drave Stanford University Solving large 3-D electromagnetic problems is challenging. [email protected] Currently, accurate numerical methods are used to solve Maxwells equations in the frequency domain, which leads to solve dense linear systems. Thanks to recent advances on CP7 fast direct solvers by the means of compression techniques, we have developed a solver capable of handling systems A Newly Proposed BLAS Extension (xGEMMT) with millions of complex unknowns and thousands of right to Update a Symmetric Matrix Efficiently hand sides suited for our Petascale machine. We propose a complement to the Level 3 BLAS xGEMM David Goudin, Cedric Augonnet, Agnes Pujols, Muriel routine that computes C := α × A × B + β × C,whereC Sesques remains symmetric for general A and B matrices. For in- CEA/CESTA T stance, A may be the product of B and a symmetric or di- [email protected], [email protected], agonal matrix. This new xGEMMT routine provides func- [email protected], [email protected] tionality used in numerous algorithms and within Hessian- based optimization methods. 
In this talk, we discuss the subtleties of implementing and optimizing xGEMMT in the Intel Math Kernel Library.
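A reference (deliberately unoptimized) emulation of the proposed functionality may help fix ideas: only one triangle of C is updated, under the assumption that the product A*B is symmetric. The routine name and loop structure below are illustrative only and are not Intel's implementation.

```python
import numpy as np

def gemmt_lower(alpha, A, B, beta, C):
    """Reference emulation of the proposed xGEMMT: update only the lower
    triangle of C with alpha*A@B + beta*C, assuming A@B is symmetric."""
    n = C.shape[0]
    for i in range(n):
        # Compute just the entries C[i, 0..i] needed for the lower triangle.
        C[i, :i + 1] = alpha * (A[i, :] @ B[:, :i + 1]) + beta * C[i, :i + 1]
    return C
```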

[email protected] matrix is divided by the upper bound for spectral norm, then unvarying and tight ranges for the variables in the algorithm are obtained. Thus overflow is avoided for all CP8 range of input matrices with reduced hardware cost. Eigenvalue Condition Numbers of Polynomial Eigenvalue Problems under M¨obius Transforma- Bibek Kabi tions , West Bengal, India Pin-721302 We study the effect of M¨obius transformations on the sen- [email protected] sitivity of polynomial eigenvalues problems (PEP). More precisely, we compare eigenvalue condition numbers for a Aurobinda Routray, Ramanarayan Mohanty PEP and the correspondent eigenvalue condition numbers Indian Institute of Technology Kharagpur for the same PEP modified with a M¨obius transforma- West Bengal, India tion. We bound this relationship with factors that depends [email protected], rama- on the condition number of the matrix that induces the [email protected] M¨obius transformation and on the eigenvalue whose norm- wise condition number we consider, and establish sufficient conditions where M¨obius transformations do not alter sig- CP8 nificantly the condition numbers. A Novel Numerical Algorithm for Triangularizing Quadratic Matrix Polynomials Luis Miguel Anguas Marquez Universidad Carlos III de Madrid For any monic linearization λI + A of a quadratic matrix [email protected] polynomial, Tisseur and Zaballa [SIAM J. Matrix Anal. Appl., 34-2 (2013), pp. 312-337] show that there exists a Froilan Dopico nonsingular matrix [UAU] that transforms A to a compan- Department of Mathematics ion linearization of a (quasi)-triangular quadratic matrix Universidad Carlos III de Madrid, Spain polynomial. We observe that the matrix [UAU]maybe [email protected] ill-conditioned while U is perfectly-conditioned. To con- quer the ill conditioning challenge, we design a numerical algorithm without the orthonormal characteristic of U. CP8 Computation of All the Eigenpairs for a Particular Yung-Ta Li Kind of Banded Matrix Fu Jen Catholic University, Taiwan [email protected] For a particular kind of banded matrix which is character- ized by the band width, we propose a new algorithm for computing all the eigenpairs effectively. Though the in- CP8 tended matrix has complex eigenvalues, our algorithm can A Fiedler-Like Approach to Spectral Equivalence compute all the complex eigenpairs only by the arithmetic of Matrix Polynomials of real numbers. We also present an error analysis and numerical examples for the proposed algorithm. In this talk we extend the notion of Fiedler pencils to square matrix polynomials of the form P (λ)= Hiroshi Takeuchi k k i=0 Aiφi(λ)where{φi(λ)}i=0 is either a Bernstein, New- Tokyo University of Science ton, or Lagrange basis. We use this new notion to pro- Japan vide a systematic way to easily generate large new families [email protected] of matrix pencils that are spectrally equivalent to P (λ), and consequently, for solving the polynomial eigenproblem Kensuke Aihara P (λ)x =0,x = 0. Time permitting, we will discuss some Department of Mathematical Information Science, numerical properties of these matrix pencils. Tokyo University of Science [email protected] Vasilije Perovic Western Michigan University Akiko Fukuda Department of Mathematics Shibaura Institute of Technology [email protected] Japan [email protected] D. 
Steven Mackey Department of Mathematics Emiko Ishiwata Western Michigan University Department of Mathematical Information Science, [email protected] Tokyo University of Science, Japan [email protected] CP8 On First Order Expansions for Multiplicative Per- CP8 turbation of Eigenvalues Fixed-Point Singular Value Decomposition Algo- rithms with Tight Analytical Bounds Let A be a matrix with any Jordan structure, and λ an eigenvalue of A whose largest Jordan block has size n. This work presents an analytical approach for finding the We present first order eigenvalue expansions under mul- ranges of the variables in fixed-point singular value decom- tiplicative perturbations A =(I + εC) A (I + εB)using position algorithm based on upper bound for the spectral Newton Polygon. Explicit formulas for the leading coef- norm of input matrix. We show that if each element of a ficients are obtained, involving the perturbation matrices 62 LA15 Abstracts

CP9
The Waveguide Eigenvalue Problem and the Tensor Infinite Arnoldi Method

We present a new iterative algorithm for nonlinear eigenvalue problems (NEPs), the tensor infinite Arnoldi method, which is applicable to a general class of NEPs. Moreover, we show how to specialize the algorithm to a specific NEP: the waveguide eigenvalue problem, which arises from a finite-element discretization of a partial differential equation used in the study of waves propagating in a periodic medium. The algorithm is successfully applied to solve benchmark problems as well as complicated waveguides.

Giampaolo Mele
KTH Royal Institute of Technology
Department of numerical analysis
[email protected]

Elias Jarlebring
KTH Stockholm
[email protected]

Olof Runborg
KTH, Stockholm, Sweden
[email protected]

CP9
Approximating the Leading Singular Triplets of a Large Matrix Function

Given a large square matrix A and a sufficiently regular function f, we are interested in the approximation of the leading singular values and corresponding vectors of f(A), and in particular of ‖f(A)‖, where ‖·‖ is the induced matrix 2-norm. Since neither f(A) nor f(A)v can be computed exactly, we introduce and analyze an inexact Golub-Kahan-Lanczos bidiagonalization procedure. Particular outer and inner stopping criteria are devised to cope with the lack of a true residual.

Sarah W. Gaaf
Eindhoven University of Technology
[email protected]

Valeria Simoncini
Università di Bologna
[email protected]

CP9
Inverse Probing for Estimating diag(f(A))

Computing diag(f(A)), where f(A) is a function of a large sparse matrix, can be computationally difficult. Probing attempts to solve this by using matrix polynomials to determine the structure of f(A). However, these matrix polynomials can converge slowly, causing probing to fail. To avoid this, we propose a new method, which we term Inverse Probing, that directly approximates the structure of f(A) based on a small sample of columns of f(A). We show that this is more effective than probing in most situations.

Jesse Laeuchli
College of William and Mary
[email protected]

Andreas Stathopoulos
College of William & Mary
Department of Computer Science
[email protected]

CP9
Taylor's Theorem for Matrix Functions and Pseudospectral Bounds on the Condition Number

We generalize Taylor's theorem from scalar functions to matrix functions, obtaining an explicit expression for the remainder term. Consequently we derive pseudospectral bounds on the remainder which can be used to obtain an upper bound on the condition number of the matrix function. Numerical experiments show that this upper bound can be calculated very quickly for f(A) = A^t, almost three orders of magnitude faster than the current state-of-the-art method.

Samuel Relton
University of Manchester, UK
[email protected]

Edvin Deadman
University of Manchester
[email protected]

CP9
Verified Solutions of Delay Eigenvalue Problems with Multiple Eigenvalues

Consider computing error bounds for numerical solutions of nonlinear eigenvalue problems arising from delay-differential equations: given A, B ∈ C^{n×n} and τ ≥ 0, find λ ∈ C and x ∈ C^n \ {0} such that

(λI − A − B e^{−τλ}) x = 0,

where I is the identity matrix, for giving reliability of the solutions. The author previously proposed an algorithm for computing the error bounds. However, this is not applicable when λ is multiple. We hence propose an algorithm which is applicable even in this case.

Shinya Miyajima
Gifu University
[email protected]
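The delay eigenvalue problem above is easy to experiment with numerically. The sketch below is only a residual check for a candidate eigenpair of (λI − A − B e^{−τλ})x = 0 with made-up data, not Miyajima's verification algorithm; it shows the quantity any such method must drive to zero, seeded here with an eigenpair of the delay-free problem.

# Residual of a candidate eigenpair (lam, x) for the delay eigenvalue problem;
# A, B, tau and the starting pair are illustrative, not from the talk.
import numpy as np

def residual(A, B, tau, lam, x):
    """Norm of (lam*I - A - B*exp(-tau*lam)) x."""
    n = A.shape[0]
    return np.linalg.norm((lam * np.eye(n) - A - B * np.exp(-tau * lam)) @ x)

rng = np.random.default_rng(1)
n, tau = 5, 0.5
A = rng.standard_normal((n, n))
B = 0.1 * rng.standard_normal((n, n))

w, V = np.linalg.eig(A)          # crude starting guess: ignore the delay term
lam, x = w[0], V[:, 0]
print("residual of the delay-free eigenpair:", residual(A, B, tau, lam, x))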

CP9
Contour Integration Via Rational Krylov for Solving Nonlinear Eigenvalue Problems

The Cauchy integral reformulation of the nonlinear eigenvalue problem A(λ)x = 0 has led to subspace methods for nonlinear eigenvalue problems, where approximations of contour integration by numerical quadrature play the role of rational filters of the subspace. We show that in some cases this filtering of the subspace by rational functions can be efficiently performed by applying a restarted rational Krylov method. We illustrate that this approach increases computational efficiency. Furthermore, locking of converged eigenvalues can be performed in a robust way in the rational Krylov method.

Roel Van Beeumen
KU Leuven
[email protected]

Karl Meerbergen
K. U. Leuven
[email protected]

Wim Michiels
KU Leuven
[email protected]

CP10
On the Factorization of Symmetric Indefinite Matrices into Anti-Triangular Ones

A factorization of a symmetric indefinite matrix, A = QMQ^T, with Q orthogonal and M symmetric block antitriangular (BAT), is described in [1], relying only on orthogonal transformations and revealing the inertia of the matrix. In [2] a block algorithm performing all operations almost entirely in level 3 BLAS is developed, featuring a more favorable memory access pattern. In the same paper a lack of reliability is noticed in computing the inertia. In this talk we describe a new implementation of the BAT factorization and compare it to the other implementations in terms of stability and reliability.

[1] N. Mastronardi, P. Van Dooren, The antitriangular factorization of symmetric matrices, SIMAX 34 (2013), 173-196.
[2] Z. Bujanovic, D. Kressner, A block algorithm for computing antitriangular factorizations of symmetric matrices, Numer. Algor., to appear.

Nicola Mastronardi
Istituto per le Applicazioni del Calcolo
National Research Council of Italy, Bari
[email protected]

Paul Van Dooren
Université Catholique de Louvain
[email protected]

CP10
A Block Gram-Schmidt Algorithm with Reorthogonalization and Conditions on the Tall, Skinny QR

The talk considers the block Gram-Schmidt with reorthogonalization (BCGS2) algorithm discussed in [J. Barlow and A. Smoktunowicz, Reorthogonalized Block Classical Gram-Schmidt, Num. Math., 123:398-423, 2013] for producing the QR factorization of a matrix X that is partitioned into blocks. A building block operation for BCGS2 is the "tall-skinny QR" factorization (TSQR), which is assumed to be a backward stable factorization. However, that assumption excludes some possible TSQR algorithms, in particular, a recent TSQR algorithm in [I. Yamazaki, S. Tomov, and J. Dongarra, Mixed-Precision Cholesky QR Factorization and Its Case Studies on Multicore CPU with Multiple GPUs, to appear, SIAM J. Sci. Computing, 2015]. It is shown that the weaker stability conditions satisfied by the Yamazaki et al. algorithm are sufficient for BCGS2 to produce a conditionally backward stable factorization.

Jesse L. Barlow
Penn State University
Dept of Computer Science & Eng
[email protected]

CP10
Gram-Schmidt Process with Respect to Bilinear Forms

The Gram-Schmidt orthogonalization process is probably the most popular and frequently used scheme to obtain mutually orthogonal vectors. In this contribution we consider orthogonalization schemes with respect to symmetric bilinear forms and skew-symmetric bilinear forms. We analyze their behavior in finite precision arithmetic and give bounds for the loss of orthogonality between the computed vectors.

Miro Rozloznik
Czech Academy of Sciences
Prague, Czech Republic
[email protected]

CP10
Block Methods for Solving Banded Symmetric Linear Systems

Several years ago we discovered an algorithm for factoring banded symmetric systems of equations which required half the number of multiplications and 2/3 the space of the existing algorithms that ignored symmetry. The algorithm reduced the matrix to a sequence of 1 x 1 and 2 x 2 pivots. To prevent fill-in and to promote stability, a sequence of 2 x 2 planar transformations is necessary when using 2 x 2 pivots, which greatly complicates a block approach.

Linda Kaufman
William Paterson University
[email protected]

CP10
Roundoff Error Analysis of the CholeskyQR2 and Related Algorithms

Cholesky QR is an ideal QR factorization algorithm from the viewpoint of high performance computing, but it has rarely been used in practice due to numerical instability. Recently, we showed that by repeating Cholesky QR twice, we can greatly improve the stability. In this talk, we present a detailed error analysis of the algorithm, which we call CholeskyQR2. The analysis of related algorithms, such as the block Gram-Schmidt method using CholeskyQR2, is also discussed.

Yusaku Yamamoto
The University of Electro-Communications, Japan
[email protected]

Yuji Nakatsukasa
Department of Mathematics
University of Manchester
[email protected]

Yuuka Yanagisawa
Waseda University
[email protected]

Takeshi Fukaya
Hokkaido University
[email protected]
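For readers who want to try the CholeskyQR2 idea from the abstract above, here is a minimal NumPy/SciPy sketch: two passes of Cholesky-based QR, with the second pass restoring orthogonality. It illustrates the basic scheme only and is not the authors' implementation or their error analysis.

# CholeskyQR2 sketch: Q, R such that X = Q R, with Q reorthogonalized by a second pass.
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def cholesky_qr(X):
    """One Cholesky QR pass."""
    R = cholesky(X.T @ X, lower=False)                         # Gram matrix Cholesky factor
    Q = solve_triangular(R, X.T, lower=False, trans='T').T     # Q = X R^{-1}
    return Q, R

def cholesky_qr2(X):
    """CholeskyQR2: repeat Cholesky QR to improve orthogonality."""
    Q1, R1 = cholesky_qr(X)
    Q2, R2 = cholesky_qr(Q1)
    return Q2, R2 @ R1

X = np.random.default_rng(0).standard_normal((1000, 20))
Q, R = cholesky_qr2(X)
print("orthogonality error:", np.linalg.norm(Q.T @ Q - np.eye(20)))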

CP10
Mixed-Precision Orthogonalization Processes

Orthogonalizing a set of dense vectors is an important computational kernel in subspace projection methods for solving large-scale problems. In this talk, we discuss our efforts to improve the performance of the kernel, while maintaining its numerical accuracy. Our experimental results demonstrate the effectiveness of our approaches.

Ichitaro Yamazaki
University of Tennessee, Knoxville
[email protected]

Jesse L. Barlow
Penn State University
Dept of Computer Science & Eng
[email protected]

Stanimire Tomov
Innovative Computing Laboratory, Computer Science Dept
University of Tennessee, Knoxville
[email protected]

Jakub Kurzak
University of Tennessee, Knoxville
[email protected]

Jack J. Dongarra
Department of Computer Science
The University of Tennessee
[email protected]

CP11
A New Asynchronous Solver for Banded Linear Systems

Banded linear systems occur frequently in mathematics and physics. A solution approach is discussed that decomposes the original linear system into q subsystems, where q is the number of superdiagonals. Each system is solved asynchronously, followed by a q×q constraint matrix problem and a final superposition. Reduction to a lower triangular system is never required, so the method can be fast when q processors are available. Numerical experiments using Matlab and Fortran are also discussed.

Michael A. Jandron
Naval Undersea Warfare Center
[email protected]

Anthony Ruffa
Naval Undersea Warfare Center, Newport, RI 02841
[email protected]

James Baglama
Department of Mathematics
University of Rhode Island, Kingston, RI 02881
[email protected]

CP11
Recursive, Orthogonal, Radix-2, Stable DCT-DST Algorithms and Applications

The Discrete Fourier Transformation is engaged in image processing, signal processing, speech processing, feature extraction, convolution, etc. In this talk, we address completely recursive, radix-2, stable, and solely based discrete cosine transformation (DCT) and discrete sine transformation (DST) algorithms having sparse, orthogonal, rotation, rotation-reflection, and butterfly matrices. We also present image compression results and signal transform designs based on the completely recursive DCT and DST algorithms having sparse and orthogonal factors.

Sirani M. Perera
Embry Riddle Aeronautical University
[email protected]

CP11
Structured Condition Numbers for the Solution of Parameterized Quasiseparable Linear Systems

We derive relative condition numbers for the solution of quasiseparable linear systems with respect to relative perturbations of the parameters in the quasiseparable and in the Givens-vector representations of the coefficient matrix of the system. We compare these condition numbers with the unstructured one and provide numerical experiments showing that the structured condition numbers we present can be small in situations where the unstructured one is huge.

Kenet J. Pomés Portal
Universidad Carlos III de Madrid
Department of Mathematics
[email protected]

Froilán M. Dopico
Department of Mathematics
Universidad Carlos III de Madrid
[email protected]

CP11
The Inverse of Two-Level Toeplitz Operator Matrices

The celebrated Gohberg-Semencul formula provides a formula for the inverse of a Toeplitz matrix based on the entries in the first and last columns of the inverse, under certain nonsingularity conditions. In this talk we will present similar formulas for two-level Toeplitz matrices and will provide a two variable generalization of the Gohberg-Semencul formula in the case of a positive definite two-level Toeplitz matrix with a symbol of the form f(z1, z2) = P(z1, z2)^{∗−1} P(z1, z2)^{−1} = R(z1, z2)^{−1} R(z1, z2)^{∗−1} for z1, z2 ∈ T, where P(z1, z2) and R(z1, z2) are stable operator valued polynomials of two variables. In addition, we propose an approximation of the inverse of a multilevel Toeplitz matrix with a positive symbol, and use it as the initial value for a Hotelling iteration to compute the inverse. Numerical results are included.

Selcuk Koyuncu
University of North Georgia
[email protected]

Hugo Woerdeman
Drexel University
[email protected]
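The Hotelling iteration mentioned in the two-level Toeplitz abstract above is the classical Newton-Schulz iteration for the matrix inverse, X_{k+1} = X_k (2I − A X_k). The sketch below shows the bare iteration on a small illustrative matrix with the standard safe starting guess; the talk's contribution is the structured initial value, which is not reproduced here.

# Hotelling (Newton-Schulz) iteration for A^{-1}; converges when ||I - A X0|| < 1.
import numpy as np

def hotelling_inverse(A, X0, iters=20):
    I = np.eye(A.shape[0])
    X = X0
    for _ in range(iters):
        X = X @ (2 * I - A @ X)
    return X

A = np.array([[4.0, 1.0], [1.0, 3.0]])
X0 = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))   # classical safe start
X = hotelling_inverse(A, X0)
print("inverse error:", np.linalg.norm(X - np.linalg.inv(A)))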

CP11
Local Fourier Analysis of Pattern Structured Operators

Local Fourier Analysis (LFA) is a well known tool for analysing and predicting the convergence behavior of multigrid methods. When first introduced, LFA required operators with constant coefficients. This requirement has been relaxed in different ways, e.g., to analyze the complete two-grid method. We continue in this manner by extending the applicability of LFA by introducing a framework for analyzing pattern structured operators.

Hannah Rittich
University of Wuppertal
[email protected]

Matthias Bolten
University of Wuppertal
Department of Mathematics
[email protected]

Karsten Kahl
Bergische Universität Wuppertal
Department of Mathematics
[email protected]

CP11
A Fast Structured Eigensolver for Toeplitz Matrices

Abstract not available at time of publication.

Xin Ye
Purdue University
[email protected]

Jianlin Xia
Department of Mathematics
Purdue University
[email protected]

Raymond H. Chan
The Chinese Univ of Hong Kong
Department of Mathematics
[email protected]

CP12
A Parallel Divide-and-Conquer Algorithm for Computing the Moore-Penrose Inverses on GPU

The massive parallelism of the platforms known as Compute Unified Device Architecture (CUDA) makes it possible to achieve dramatic runtime reductions over a standard CPU in many applications at a very affordable cost. However, traditional algorithms are serial in nature and cannot take advantage of the multiprocessors that CUDA platforms provide. In this paper, we present a divide-and-conquer algorithm for computing the Moore-Penrose inverse A† of a general matrix A. Our algorithm consists of two phases. The first phase uses a parallel algorithm to orthogonally reduce a general matrix A to a bi-diagonal matrix B, while the second one computes the Moore-Penrose inverse of B using a divide-and-conquer approach. Then the Moore-Penrose inverse of A is finally constructed from the results of these two phases. Numerical tests show the dramatic speedup of our new approach over the traditional algorithms.

Xuzhou Chen
Fitchburg State University
xchen@fitchburgstate.edu

Jun Ji
Kennesaw State University
[email protected]

CP12
Solving Dense Symmetric Indefinite Linear Systems on GPU Accelerated Architectures

We present new implementations for dense symmetric indefinite factorizations on multicore processors accelerated by a GPU. Though such algorithms are needed in many scientific applications, obtaining high performance of the factorization on the GPU is difficult because of the cost of symmetric pivoting. To improve performance, we explore different techniques that reduce communication and synchronization between the CPU and GPU. We present performance results using Bunch-Kaufman, Aasen and RBT methods.

Marc Baboulin
University of Paris-Sud/INRIA
[email protected]

Jack J. Dongarra
Department of Computer Science
The University of Tennessee
[email protected]

Adrien Remy
University of Paris-Sud
[email protected]

Stanimire Tomov
Innovative Computing Laboratory, Computer Science Dept
University of Tennessee, Knoxville
[email protected]

Ichitaro Yamazaki
University of Tennessee, Knoxville
[email protected]

CP12
Exploring QR Factorization on GPU for Quantum Monte Carlo Simulation

Evaluation of probability computed as the determinant of a dense matrix of wave functions is a computational kernel in QMCPack. This matrix undergoes a rank-one update if the event is accepted. The Sherman-Morrison formula is used to update the inverse of this matrix. The explicit inverse is recomputed occasionally to maintain numerical stability. QR factorization maintains stability without refactorization. This effort explores low-rank updates to the QR factorization on GPU to replace this computational kernel.

Eduardo F. D'Azevedo
Oak Ridge National Laboratory

Computer Science and Mathematics Division The University of Tennessee [email protected] [email protected]

Paul Kent, Ying Wai Li Oak Ridge National Laboratory CP12 [email protected], [email protected] Batched Matrix Computations on Hardware Accel- erators Based on GPUs.

CP12 We will present techniques for small matrix computa- Comparing Hybrid and Native GPU Acceleration tions on GPUs and their use for energy efficient, high- for Linear Algebra performance solvers. Work on small problems delivers high performance through improved data reuse. Many numeri- Accelerating dense linear algebra using GPUs admits two cal libraries and applications need this functionality further models: hybrid CPU-GPU and GPU-only. The hybrid developed. We describe the main factorizations LU,QR, model factors the panel on the CPU while updating the and Cholesky for a set of small dense matrices in paral- trailing matrix on the GPU, concentrating the GPU on lel. We achieve significant acceleration and reduced energy high-performance matrix multiplies. The GPU-only model consumption against other solutions. Our techniques are performs the entire computation on the GPU, avoiding of interest to GPU application developers in general. costly data transfers to the CPU. We compare these two approaches for three QR-based algorithms: QR factoriza- Azzam Haidar tion, rank revealing QR, and reduction to Hessenberg. Department of Electrical Engineering and Computer Science Mark Gates University of Tennessee, Knoxville Computer Science Department [email protected] University of Tennessee [email protected] Ahmad Ahmad University of Tennessee Azzam Haidar [email protected] Department of Electrical Engineering and Computer Science Stanimire Tomov University of Tennessee, Knoxville University of Tennessee, Knoxville [email protected] [email protected]

Stanimire Tomov Jack J. Dongarra Innovative Computing Laboratory, Computer Science Department of Computer Science Dept The University of Tennessee University of Tennessee, Knoxville [email protected] [email protected]

CP13 CP12 Recycling Krylov Subspace Methods for Sequences Efficient Eigensolver Algorithm on Accelerator of Linear Systems with An Application to Lattice Based Architecture Qcd

The enormous gap between the high-performance capabil- Many problems in engineering, numerical simulations in ities of GPUs and the slow interconnect between them physics etc. require the solution of long sequences of slowly has made the development of numerical software that is changing linear systems. As an alternative to GCRO-DR, scalable across multiple GPUs extremely challenging. We we propose an algorithm that uses an oblique projection, describe a successful methodology on how to address the in the spirit of truly deflated GMRES, for deflating the challenges -starting from our algorithm design, kernel opti- eigenvalues of smallest magnitude. This oblique projection mization and tuning, to our programming model- in the de- requires approximations to right and left eigenvectors, and velopment of a scalable high-performance symmetric eigen- we propose a way of getting the approximations to the left value and singular value solver. eigenvectors, without having to build a Krylov subspace with respect to AH , which saves a lot of work. We will show Azzam Haidar that the new algorithm is numerically comparable to the Department of Electrical Engineering and Computer GCRO-DR algorithm and in addition, we further improve Science our results for the lattice QCD application by exploiting University of Tennessee, Knoxville non-trivial symmetries. [email protected] Nemanja Bozovic Piotr Luszczek University of Wuppertal, Germany University of Tennessee [email protected] [email protected] Andreas J. Frommer Stanimire Tomov Bergische Universitaet Wuppertal University of Tennessee, Knoxville Fachbereich Mathematik und Naturwissenschaften [email protected] [email protected]

Jack J. Dongarra Matthias Bolten Department of Computer Science University of Wuppertal LA15 Abstracts 67

Department of Mathematics Brazil [email protected] [email protected]

CP13 CP13 Generalized Jacobi Matrices and Band Algorithms

Jacobi matrices, i.e. symmetric tridiagonal matrices with On Rotations and Approximate Rational Krylov positive subdiagonal entries, represent thoroughly stud- Subspaces ied objects connected to the Lanczos tridiagonalization, the Golub-Kahan bidiagonalization, the Gauss quadrature, Rational Krylov subspaces are useful in many applications. etc. In this presentation, we study recently introduced ρ- These subspaces are built by not only matrix vector prod- wedge-shaped matrices that can be viewed as a generaliza- ucts but also by inverses of a matrix times a vector. This tion of Jacobi matrices. The definition is motivated by the results in significant faster convergence, but on the other structure of output matrices in the band generalization of hand often creates numerical issues. It is shown that ratio- the Golub-Kahan bidiagonalization and the band Lanczos nal Krylov subspaces under some assumptions are retrieved algorithm. without any explicit inversion or system solves involved. Instead the necessary computations are done implicitly us- Iveta Hnetynkova ing information from an enlarged Krylov subspace. In this Charles University, Prague talk the generic building blocks underlying rational Krylov [email protected] subspaces: rotations, twisted QR-factorizations, turnovers, fusions, are introduced. We will illustrate how to shrink Martin Plesinger a large Krylov subspace by unitary similarity transforma- Technical University of Liberec tions to a smaller subspace without -if all goes well- essen- [email protected] tial data loss. Numerical experiments support our claims that this approximation can be very good and as such can lead to time savings when solving particular ODE’s. CP13 Symmetric Inner-Iteration Preconditioning for Rank-Deficient Least Squares Problems Raf Vandebril Katholieke Universiteit Leuven Several steps of stationary iterative methods with a sym- Department of Computer Science metric splitting matrix serve as inner-iteration precondi- [email protected] tioning. We give a necessary and sufficient condition such that the inner-iteration preconditioning matrix is definite, Thomas Mach show that short recurrence type Krylov subspace methods TU Chemnitz preconditioned by the inner iterations determine a least Germany squares solution and the minimum-norm solution of linear [email protected] systems of equations, whose coefficient matrices may be rank-deficient, and give bounds for these methods. Miroslav Pranic Faculty of Science Keiichi Morikuni University of Banja Luka Institute of Computer Science [email protected] Academy of Sciences of the Czech Republic [email protected]

CP13 CP13 On Solving Linear Systems Using Adaptive Strate- Improving Thick-Restarting Lanczos Method by gies for Block Lanczos Method Subspace Optimization For Large Sparse Eigen- value Problems Block Methods based on conjugate directions improve arithmetic intensity and rate of convergence. When these The Thick-Restart Lanczos method is an effective method methods are used for the resolution of algebraic linear sys- for large sparse eigenvalue problems. In this work, we pro- tems Ax = b (A symmetric positive definite) a drawback pose and study a subspace optimization technique to ac- is the need to find an adequate block size. In this work we celerate this method. The proposed method augments the discuss and present results of several strategies for adap- Krylov subspace with a certain number of vectors from the tively updating the size of the block based on Ritz values previous restart cycle and then apply the Rayleigh-Ritz and a threshold for determining when the rule applies. procedure in the new subspace. Numerical experiments Christian E. Schaerer show that our method can converges two or three times Institute for Pure and Applied Mathematics - IMPA faster than the Lanczos method. [email protected] Lingfei Wu Pedro Torres Department of Computer Science Polytechnic School College of William & Mary National University of Asuncion [email protected] [email protected] Andreas Stathopoulos Amit Bhaya College of William & Mary Federal University of Rio de Janeiro Department of Computer Science 68 LA15 Abstracts

[email protected] present recent work in this direction.

Tom´a Gergelits CP14 Faculty of Mathematics and Physics, Ca-Ilu Preconditioner Charles University in Prague [email protected]ff.cuni.cz In this talk, we present a new parallel preconditioner called CA-ILU. It performs ILU(k) factorization in parallel and is Jan Papez applied as preconditioner without communication. Block Charles University in Prague Jacobi and Restrictive Additive Schwartz preconditioners Dept Numerical Mathematics are compared to this new one. Numerical results show [email protected] that CA-ILU(0) outperforms BJacobi and is close to RAS in term of iterations whereas CA-ILU, with complete LU Zdenek Strakos on each block, does for both of them but uses a lot of mem- Faculty of Mathematics and Physics, ory. Thus CA-ILU(k) takes advantages of both in term of Charles University in Prague memory consumption and iterations. [email protected]ff.cuni.cz Sebastien Cayrols INRIA CP14 [email protected] Reusing and Recycling Preconditioners for Se- quences of Matrices Laura Grigori INRIA For sequences of related linear systems, computing a new France preconditioner for each system can be expensive. We can [email protected] reuse a fixed preconditioner, but as the matrix changes this approach can become ineffective. We can alterna- Sophie M. Moufawad tively apply cheap updates to the preconditioner. This LRI - INRIA Saclay, France presentation discusses the benefits of reusing and recy- [email protected] cling preconditioners and proposes an update scheme we refer to as a Sparse Approximate Map. Applications in- clude the QMC method, model reduction, tomography, and CP14 Helmholtz-type problems. The Truncation of Low-Rank Preconditioner Up- Arielle K. Grim Mcnally dates with Applications Virginia Tech Eric de Sturler Rather than recomputing preconditioners for a sequence of [email protected] linear systems, it is often cheaper to update the precondi- tioner. Here we consider low-rank preconditioner updates and truncating those updates to keep them cheap. We Eric De Sturler consider applications from nonlinear PDEs and Electronic Virginia Tech Structure Calculations. [email protected]

Eric De Sturler Virginia Tech CP14 [email protected] Lu Preconditioners for Non-Symmetric Saddle Point Matrices with Application to the Incompress- Ming Li ible Navier-Stokes Equations Dept. Mathematics The talk focuses on threshold incomplete LU factorizations Virginia Polytechnic Institute and State University for non-symmetric saddle point matrices. The research is [email protected] motivated by the numerical solution of the linearized in- compressible Navier–Stokes equations. The resulting pre- Vishwas Rao conditioners are used to accelerate the convergence of a Department of Computer Science Krylov subspace method applied to finite element dis- Virginia Tech cretizations of fluid dynamics problems in three space di- [email protected] mensions. We shall discuss the stability of the factoriza- tion for generalized saddle point matrices and consider an extension for non-symmetric matrices of the Tismenetsky– CP14 Kaporin incomplete factorization. We demonstrate that in Interpretation of Algebraic Preconditioning As numerically challenging cases of higher Reynolds number Transformation of the Discretization Basis flows one benefits from using this two-parameter modifica- tion of a standard threshold ILU preconditioner. The per- In numerical solution of PDEs it is convenient to con- formance of the ILU preconditioners is studied numerically sider discretization and preconditioning closely linked to- for a wide range of flow and discretization parameters, and gether. As suggested by several techniques used through- the efficiency of the approach is shown if threshold parame- out decades, preconditioning can always be linked with ters are chosen suitably. The practical utility of the method transformation of the discretization basis. It has been is further demonstrated for the haemodynamic problem of shown recently that effective algebraic preconditioners can simulating a blood flow in a right coronary artery of a real be interpreted as the transformation of the discretization patient. basis and, simultaneously, as the change of inner prod- uct on corresponding Hilbert space. This contribution will Maxium Olshanskii LA15 Abstracts 69

University of Houston
[email protected]

CP14
Right Preconditioned MINRES using Eisenstat-SSOR for Positive Semidefinite Systems

Consider solving symmetric positive semidefinite systems. We use MINRES because CG does not necessarily converge for inconsistent systems. We use right preconditioning because it is easier to preserve the equivalence of the problem for inconsistent systems. We apply Eisenstat's trick to economize the right SSOR preconditioned MINRES. Numerical experiments show that the method is more efficient and robust compared to non-preconditioned and right scaling preconditioned MINRES.

Kota Sugihara
Department of Informatics
SOKENDAI (The Graduate University for Advanced Studies)
[email protected]

Ken Hayami
National Institute of Informatics
[email protected]

CP15
Coupled Sylvester-Type Matrix Equations and Block Diagonalization

We prove Roth's type theorems for systems of matrix equations including an arbitrary mix of Sylvester and ⋆-Sylvester equations, in which also the transpose or conjugate transpose of the unknown matrices appear. In full generality, we derive consistency conditions by proving that such a system has a solution if and only if the associated set of 2 × 2 block matrix representations of the equations are block diagonalizable by (linked) equivalence transformations. Various applications leading to several particular cases have already been investigated in the literature, some recently and some long ago. Solvability of these cases follows immediately from our general consistency theory. We also show how to apply our main result to systems of Stein-type matrix equations.

Andrii Dmytryshyn
Umeå University
Department of Computing Science
[email protected]

Bo T. Kågström
Umeå University
Computing Science and HPC2N
[email protected]

CP15
Ky Fan Theorem Applied to Randić Energy

Let G be an undirected simple graph of order n with vertex set V = {v1, ..., vn}. Let di be the degree of the vertex vi. The Randić matrix R = (ri,j) of G is the square matrix of order n whose (i, j)-entry is equal to 1/√(di dj) if the vertices vi and vj are adjacent, and zero otherwise. The Randić energy is the sum of the absolute values of the eigenvalues of R. Let X, Y, and Z be matrices such that X + Y = Z. Ky Fan established an inequality between the sums of singular values of X, Y, and Z. We apply this inequality to obtain bounds on Randić energy. Some results are presented considering the energy of a symmetric partitioned matrix, as well as an application to the coalescence of graphs.

Enide C. Andrade
Center for Research & Development in Mathematics and Applications
University of Aveiro
[email protected]

Ivan Gutman
Faculty of Science, University of Kragujevac, Serbia
and State University of Novi Pazar, Serbia
[email protected]

María Robbiano, Bernardo Martín
Departamento de Matemáticas, Universidad Católica del Norte, Antofagasta, Chile
[email protected], [email protected]

CP15
The Linear Transformation that Relates the Canonical and Coefficient Embeddings of Ideals in Cyclotomic Integer Rings

The geometric embedding of an ideal in the algebraic integer ring of some number field is called an ideal lattice. Ideal lattices and the shortest vector problem (SVP) are at the core of many recent developments in lattice-based cryptography. We utilize the matrix of the linear transformation that relates two commonly used geometric embeddings to provide novel results concerning the equivalence of the SVP in these ideal lattices arising from rings of cyclotomic integers.

Scott C. Batson
North Carolina State University
[email protected]

CP15
Generalized Inverses of Copositive Matrices, Self-Conditional Positive Semidefinite Matrices and Inheritance Properties

A symmetric matrix A ∈ R^{n×n} is called copositive if x^T A x ≥ 0 for all x ≥ 0, where the ordering of vectors is the standard componentwise ordering. A ∈ R^{n×n} is said to be copositive of order n−1 if all the (n−1) × (n−1) principal minors of A are copositive. Let A ∈ R^{n×n} be a copositive matrix of order n−1. A rather well known result states that A is not copositive if and only if A is invertible and all the entries of the inverse of A are nonpositive. Recently, we have proved a version of this result for singular matrices, using the group generalized inverse of A. Among other things, this talk will showcase this result.

Sivakumar K.C.
Indian Institute of Technology Madras
[email protected]

Kavita Bisht
Indian Institute of Technology Madras, Chennai
[email protected]

Ravindran G.
Indian Statistical Institute, Chennai

[email protected] [email protected]

Martin Vohral´ık CP15 INRIA Paris-Rocquencourt The Principal Rank Characteristic Sequence and [email protected] the Enhanced Principal Rank Characteristic Se- quence CP16 The enhanced principal rank characteristic sequence of an Iterative Method for Mildly Overdetermined n × n symmetric matrix B is 1 2 ··· n,where k is A (re- Least-Squares Problems with a Helmholtz Block spectively, N) if all (respectively, none) the principal minors of order k are nonzero; if some but not all are nonzero, We present a matrix-free iterative method to solve least- S then k = . Results regarding the attainability of cer- squares problems, which contain a Helmholtz discretiza- tain classes of sequences are introduced, and restrictions tion. This type of problems arises in quadratic-penalty for some subsequences to appear in an attainable sequence methods for PDE-constrained optimization. The problems are discussed. of interest have multiple-right hand sides. The proposed strategy includes preconditioning to induce and exploit an Xavier Martinez-Rivera identity + low-rank system matrix structure. Random- Iowa State University ization and subsampling of one of the blocks in the least- [email protected] squares problem is used to reduce the computational cost.

CP15 Bas Peters A Characterization of Minimal Surfaces in the UBC Lorentz Group L3 [email protected] In this talk we establish the equation for the Gaussian Cur- vature of a minimal surface in the Lorentz Group L3.Us- Chen Greif ing the Gauss equation we prove that minimal surfaces in Department of Computer Science L3 with constant contact angle have non-positive Gaus- The University of British Columbia sian curvature. Also, we provide a congruence theorem for [email protected] minimal surfaces immersed in the Lorentz space L3. Felix J. Herrmann Rodrigo R. Montes Seismic Laboratory for Imaging and Modeling Federal University of Parana UFPR The University of British Columbia [email protected] [email protected]

CP16 CP16 Linear Algebra Provides a Basis for Elasticity On the Backward Stability of Chebyshev Rootfind- Without Stress Or Strain ing Via Colleague Matrix Eigenvalues

Linear algebra can describe the energy of deformation of In this work, we analyze the backward stability of the poly- hyper-elastic materials in terms of a single deformation gra- nomial rootfinding problem solved with colleague matrices. dient tensor without the need to define stress or strain. In other words, given a scalar polynomial p(x)orama- Positions of points replace strain and force replaces stress. trix polynomial P (x)expressedintheChebyshevbasis, The model is appropriate for both infinitesimal and finite the question is to determine whether the whole set of com- deformations, isotropic and anisotropic materials, quasi- puted eigenvalues of the colleague matrix, obtained with a static and dynamic elastic responses. backward stable algorithm are the set of roots of a nearby polynomial or not. We derive a first order backward error Humphrey H. Hardy analysis of the polynomial rootfinding algorithm using col- Piedmont College league matrices adapting the geometric arguments in [A. [email protected] Edelman and H. Murakami, Polynomial roots for compan- ion matrix eigenvalues, Math. Comp., 1995] to the Cheby- CP16 shev basis. We show that, if the norm of the coefficients of the polynomial are bounded by a moderate number, com- Bounds on Algebraic, Discretization, and Total Nu- puting its roots via the eigenvalues of its colleague matrix merical Approximation Errors for Linear Diffusion using a backward stable eigenvalue algorithm is backward PDEs stable. We present a posteriori error estimates that allow to give Javier P´erez lvaro a guaranteed bound on the algebraic and discretization School of Mathematics, The University of Manchester errors for conforming finite element approximations of [email protected] the Laplace equation. This extends the results of [Ern, Vohral´ık 2013] which allow to estimate the different compo- nents of the error and to guarantee an upper bound on the Vanni Noferini total approximation error. The efficiency of the bounds is University of Manchester discussed and extensive numerical results for higher-order [email protected] finite element approximations are presented.
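The colleague-matrix approach analyzed in the backward-stability abstract above can be tried directly with NumPy's Chebyshev module. The example below (arbitrary coefficients, not from the talk) builds the colleague matrix of a polynomial expressed in the Chebyshev basis and compares its eigenvalues with NumPy's own colleague-based rootfinder.

# Chebyshev rootfinding via colleague-matrix eigenvalues; illustrative coefficients only.
import numpy as np
from numpy.polynomial import chebyshev as C

c = np.array([0.2, -1.0, 0.5, 0.3, 1.0])     # p(x) = 0.2*T0 - 1.0*T1 + 0.5*T2 + 0.3*T3 + 1.0*T4
colleague = C.chebcompanion(c)               # colleague matrix of p in the Chebyshev basis
roots_eig = np.sort_complex(np.linalg.eigvals(colleague))
roots_ref = np.sort_complex(C.chebroots(c))  # NumPy's own colleague-based rootfinder
print("max root difference:", np.max(np.abs(roots_eig - roots_ref)))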

Jan Pape CP16 Charles University in Prague Companion Matrices of Hermite-Birkhoff Inter- LA15 Abstracts 71

polants [email protected]

In computational mathematics, we sometimes have to deal with polynomials that are not represented in the monomial CP17 basis. They are given by values of a function and/or its Approximating the Cardinality of a Vector to Han- derivatives of various orders at some points called nodes or dle Cardinality Constraints in Optimization sample points. This case often leads to solving a Hermite- Birkhoff interpolation problem. The present work unifies An obvious, often loose, lower bound of cardinality of a various aspects of the Newton, Hermite and Birkhoff bases. vector, i.e. the number of nonzero elements, is the squared We study the companion pencils as an efficient tool for ratio of norm one over Euclidian norm. From this bound, computing roots of such polynomials. we derive an estimate of the cardinality by maximizing a Rayleigh ratio involving a positive definite matrix with a simple structure, leading to an analytical formula for esti- Nikta Shayanfar mating the cardinality. Numerical experiments show esti- Technische Universit¨at Braunschweig mation errors systematically smaller than those obtained [email protected] with the initial bound.

Amir Amiraslani Agnes Bialecki STEM Department, University of Hawaii-Maui College EDF R&D [email protected] [email protected]

Laurent El Ghaoui CP16 UC Berkeley [email protected] Fixing Gauge and Rank Deficiency Riadh Zorgati Gauge fixing is a powerful concept for handling redundant EDF R&D degrees of freedom in Lagrangian mechanics. The challenge Department OSIRIS is finding logically consistent and mathematically tractable [email protected] procedures for fixing the gauge. Working in the context of gradients of scalar potentials demonstrated how fixing the gauge also fixes rank deficient linear systems by restoring CP17 them to full rank. Also presented is an analytic derivation Primal and Dual Algorithms for the Ball Covering of gauge conditions and simplistic geometric methods for Problem identifying gauge conditions beyond one dimension. Consider the convex optimization problem of finding a Eu- Daniel Topa clidean ball of minimum radius that covers m Euclidean Los Alamos National Laboratory balls with fixed centers and radii in n.Weproposetwo U.S. Army Engineer Research and Development Center directional search methods in which search paths (rays or [email protected] two-dimensional hyperbolas in n) are constructed by in- tersecting bisectors (hyperboloids in n) and the step size Pedro Embid is determined explicitly. We prove interesting properties University of New Mexico about pairwise intersection of n-dimensional hyperboloids Department of Mathematics and Statistics to achieve finite convergence. [email protected] Akshay Gupte, Lin Dearing Department of Mathematical Sciences Clemson University CP17 [email protected], [email protected] Matrix-Free Krylov Subspace Methods for Solving a Riemannian Newton Equation CP17 We consider an optimization problem on the Stiefel man- Finding the Nearest Valid Covariance Matrix: a Fx ifold which is equivalent to the singular value decomposi- Market Case tion. The Riemannian Newton method for the problem has been proposed, but the Newton equation is expressed by We consider the problem of finding a valid covariance ma- a system of matrix equations which is difficult to solve di- trix in the FX market given an initial non-PSD estimate of rectly. In this talk, we apply matrix-free Krylov subspace such a matrix. The standard no-arbitrage assumption im- methods to a sparse linear system into which the Newton plies additional linear constraints on such matrices, which equation is rewritten. Numerical experiments show the ef- automatically makes them singular. As a result, such a fectiveness of our proposed method. problem is not well-posed while the PSD-solution is not strictly feasible. In order to deal with this issue, we de- scribed a low-dimensional face of the PSD cone that con- Kensuke Aihara tains the feasible set. After projecting the initial problem Department of Mathematical Information Science, onto this face, we come out with a reduced problem, which Tokyo University of Science turns out to be well posed and of a smaller scale. We show [email protected] that after solving the reduced problem the solution to the initial problem can be uniquely recovered in one step. Hiroyuki Sato Department of Management Science, Aleksei Minabutdinov Tokyo University of Science National Research University Higher School of Economics 72 LA15 Abstracts

[email protected]

CP17
Convergence of Newton Iterations for Order-Convex Matrix Functions

In stochastic problems and some physical problems, it is important to find the elementwise minimal nonnegative solvent of a nonlinear matrix equation. A lot of such equations have the properties of differentiable order-convex functions. Using the properties of differentiable order-convex functions, we will show that the Newton iterations for the equations are well-defined and converge to the elementwise minimal nonnegative solvent. Finally, numerical experiments with the iterations for the equations are given.

Sang-Hyup Seo
Department of Mathematics
Pusan National University
[email protected]

Jong-Hyeon Seo
Basic Science
DIGIST
[email protected]

Hyun-Min Kim
Department of Mathematics
Pusan National University
[email protected]

CP17
Modulus-Type Inner Outer Iterative Methods for Nonnegative Constrained Least Squares Problems

For the solution of large sparse nonnegative constrained least squares (NNLS) problems, a new iterative method is proposed, using the conjugate gradient least squares method for the inner iterations and the modulus iterative method in the outer iterations for solving the linear complementarity problem resulting from the Karush-Kuhn-Tucker conditions of NNLS. Convergence analysis is presented and numerical experiments show the proposed methods outperform the projected gradient methods.

Ning Zheng
SOKENDAI (The Graduate University for Advanced Studies)
[email protected]

Ken Hayami
National Institute of Informatics
[email protected]

Junfeng Yin
Department of Mathematics, Tongji University, Shanghai
[email protected]

CP18
Tropical Bounds for the Eigenvalues of Block Structured Matrices

We establish a log-majorization inequality, which relates the moduli of the eigenvalues of a block structured matrix with the tropical eigenvalues of the matrix obtained by replacing every block entry of the original matrix by its norm. This inequality involves combinatorial constants depending on the size and pattern of the matrix. Its proof relies on diagonal scalings, constructed from the optimal dual variables of a parametric optimal assignment problem.

Marianne Akian, Stephane Gaubert
INRIA and CMAP, Ecole Polytechnique
[email protected], [email protected]

Andrea Marchesini
INRIA and CMAP, Ecole Polytechnique
[email protected]

CP18
Coherency Preservers

We will present a description of continuous coherency preservers on hermitian matrices. Applications in mathematical physics will be discussed.

Peter Semrl
University of Ljubljana
[email protected]

CP18
A Connection Between Comrade Matrices and Reachability Matrices

In this paper a connection between comrade matrices and reachability matrices is explained. This work is done via orthogonal polynomials, companion matrices and their connections with reachability matrices. First an algebraic structure is introduced by these matrices. Then some properties of these matrices are described, such as determinants, eigenvalues and eigenvectors. The fields are restricted to F = R or C, namely the real and complex numbers respectively. The characteristic polynomial of the companion matrix C is

p(x) = x^n − c_{n−1} x^{n−1} − ... − c_1 x − c_0 ∈ F[x].   (1)

Let A ∈ F^{n×n} and b ∈ F^n; the matrix

R(A, b) = [b, Ab, ..., A^{n−1} b] ∈ F^{n×n}

is the reachability matrix of the pair (A, b). Also, A is the comrade matrix, similar to C by an invertible matrix Q_n, i.e. A = Q_n C Q_n^{−1}. Easily, it is shown that R(C, b) = Q_n^{−1} R(A, Q_n b). The generating polynomial of b = [b_0, b_1, ..., b_{n−1}]^T ∈ F^{n×1} is defined by b(x) = Σ_{k=0}^{n−1} b_k x^k; then

R(C, b) = Σ_{k=0}^{n−1} b_k R(C, e_k) = Σ_{k=0}^{n−1} b_k C^k = b(C) = Q_n^{−1} R(A, Q_n b),   (2)

where e_k stands for the (k+1)th unit vector in the standard basis of F^{n×1}. Now, an algebraic structure is introduced on S(A) = {Q_n^{−1} R(A, Q_n b) | b ∈ F^{n×1}}, and it is proven that S(A) ≅ F[x]/⟨d(x)⟩. Determinants, eigenvalues and eigenvectors are described for this connection. Also the colleague matrices and Chebyshev polynomials are investigated in this connection.

Maryam Shams Solary

Payame Noor university ternal boundaries of subdomains, augmented iterative [email protected] additive Schwarz algorithm in Krylov subspaces, multi- preconditioning techniques and aggregation. The work fo- cuses on experimental analysis of performance of the afore- CP18 mentioned methods. The implementation of the proposed Maximal Lower Bounds in Loewner Order approaches was done on the basis of hybrid parallel pro- gramming and taking into account sparse matrix structure We show that the set of maximal lower bounds of two sym- and with using the compressed sparse row format of the metric matrices with respect to Loewner order can be iden- matrix representation. The experiments have been per- tified to the quotient set O(p, q)/(O(p)×O(q)). Here, (p, q) formed for a representative set of test problems. denotes the inertia of the difference of the two matrices, O(p)isthep-th orthogonal group, and O(p, q) is the indef- Valery P. Il’in inite orthogonal group arising from a quadratic form with Institute of Computational Mathematics inertia (p, q). We discuss the application of this result to and Mathematical Geophysics SB RAS the synthesis of ellipsoidal invariants of hybrid dynamical [email protected] systems.

Nikolas Stott CP19 INRIA - CMAP, Ecole Polytechnique Robust Incomplete Factorization Preconditioner [email protected] with Mixed-Precision for Parallel Finite Element Analysis Xavier Allamigeon INRIA and CMAP, Ecole Polytechnique, CNRS In our previous study, localized IRIF(0) and IRIF(1) pre- [email protected] conditioners were proposed to solve the equation system in the frame work of the parallel FEM. This research aims to apply the mixed precision algorithm to this precondi- St´ephane Gaubert tioning method. As the numerical results, matrices given INRIA-Saclay & CMAP from shell structures were solved. It has been shown that Ecole Polytechnique the mixed-precision IRIF(0) and IRIF(1) are effective in [email protected] reducing the processing time by 40% while preserving the number of iterations. CP19 Naoki Morita, Gaku Hashimoto, Hiroshi Okuda Coupled Preconditioners for the Incompressible Graduate School of Frontier Sciences Navier-Stokes Equations The University of Tokyo [email protected], [email protected] After linearization and discretization of the incompressible tokyo.ac.jp, [email protected] Navier Stokes equations one has to solve block-structured indefinite linear systems. In this talk we consider SIMPLE- type preconditioners as proposed in [1]. It appears that CP19 this type of preconditioners can lead to a considerable ac- Efficient Simulation of Fluid-Structure Interactions celeration of the solver and are weakly depending on the Using Fast Multipole Method number of grid points and Reynolds number. Recently, the augmented Lagrangian method becomes popular [2]. The Regularized Stokes formulation has been shown to be very method is more expensive per iteration but independent of effective at modeling fluid-structure interactions when the the grid size and Reynolds number for academic problems. fluid is highly viscous. However, its computational cost The third preconditioner is the so-called ’grad-div’ precon- grows quadratically with the number of particles immersed ditioner. We also define modified versions of the SIMPLE in the fluid. We demonstrate how kernel-independent fast and ’grad-div’ preconditioners. In this talk we investigate multipole method can be applied to significantly improve their performance for academic problems [3]. the efficiency of this method, and present numerical results for simulating the dynamics of a large number of elastic Xin He rods immersed in 3D stokes flows. Delft University of Technology, Netherland [email protected] Minghao W. Rostami, Sarah D. Olson Worcester Polytechnic Institute Kees Vuik [email protected], [email protected] Delft University of Technology [email protected] CP19 Block Preconditioners for An Incompressible Mag- CP19 netohydrodynamics Problem On Some Algebraic Issues of Domain Decomposi- tion Approaches We focus on preconditioning techniques for a mixed fi- nite element discretization of an incompressible magneto- Various domain decomposition approaches are considered hydrodynamics (MHD) problem. Upon discretization and with regard to parallel solution of very large sparse sys- linearization, a 4x4 non-symmetric block-structured linear tems of linear algebraic equations arising from some fi- system needs to be (repetitively) solved. One of the prin- nite element or finite volume approximations of compli- cipal challenges is the presence of a skew-symmetric term cated boundary value problems on non-structured grids. that couples the fluid velocity with the electric field. 
Our The proposed methods include automatic construction of proposed technique exploits the block structure of the un- the balanced subdomains with or without a parametrized derlying linear system, utilizing and combining effective overlapping, different type of interface conditions on in- preconditioners for the mixed Maxwell and Navier-Stokes 74 LA15 Abstracts

subproblems. The preconditioner is based on dual and primal Schur complement approximations to yield a scalable solution method. Large scale numerical results demonstrate the effectiveness of our approach.

Michael P. Wathen
The University of British Columbia
[email protected]

Chen Greif
Department of Computer Science
The University of British Columbia
[email protected]

Dominik Schotzau
The University of British Columbia
[email protected]

MS1
First-Order Riemannian Optimization Techniques for the Karcher Mean

The Karcher mean is a matrix function of several positive definite matrices which represents their geometric mean. It is the unique minimum over the set of positive definite matrices of a real function related to a Riemannian structure of this set. We explore first-order optimization algorithms, showing that exploiting the geometric structure leads to a considerable acceleration. Moreover, we present a new strategy which appears to be faster than the existing algorithms.

Bruno Iannazzo
Università di Perugia, Italy
[email protected]

Margherita Porcelli
University of Bologna
[email protected]

MS1
Error Estimation in Krylov Subspace Methods for Matrix Functions

Using the Lanczos method to approximate f(A)b, there is no straightforward way to measure the error of the current iterate. Therefore, different error estimators have been suggested, all of them specific to certain classes of functions. We add a technique to compute error bounds for Stieltjes functions, using an integral representation of the error. These bounds can be computed essentially for free, with cost independent of the iteration number and the dimension of A.

Marcel Schweitzer
Bergische Universität Wuppertal
[email protected]

Andreas J. Frommer
Bergische Universitaet Wuppertal
Fachbereich Mathematik und Naturwissenschaften
[email protected]

CP19
The WR-HSS Method and Its Subspace Acceleration for the Unsteady Elliptic Problem

We consider the numerical methods for the unsteady elliptic problem with Dirichlet boundary condition. Taking into account the idea of Hermitian/skew-Hermitian (HS) splitting, a class of waveform relaxation methods is established based on the HS splitting of the linear differential operator in the corresponding differential equations, i.e., the WR-HSS method. Furthermore, the subspace acceleration of the WR-HSS method is also considered. Finally, the theoretical and numerical behaviors of the above methods are carefully analyzed.

XI Yang
Department of Mathematics
Nanjing University of Aeronautics and Astronautics
[email protected]

MS1
Functions of Matrices with Kronecker Sum Structure

Functions of large and sparse matrices with tensor structure arise in a number of scientific and engineering applications. In this talk we focus on functions of matrices in Kronecker sum form. After discussing some recent results on the decay behavior of such matrix functions, we will describe efficient Krylov subspace techniques for evaluating a function of a matrix times a vector. The results of numerical experiments will be discussed.

Michele Benzi
Department of Mathematics and Computer Science
Emory University
[email protected]

Valeria Simoncini
Università di Bologna
[email protected]

MS2
MPGMRES-Sh: Multipreconditioned GMRES for Shifted Systems

We propose using the Multipreconditioned Generalized Minimal Residual (MPGMRES) method to solve shifted linear systems (A + σ_j I) x_j = b for j = 1, ..., N_σ with multiple shift-and-invert preconditioners. The multipreconditioned space is obtained by applying the preconditioners to all search directions and searching for a minimum norm solution over a larger subspace. We show that for this particular problem this space grows linearly and discuss our implementation on systems arising from hydraulic tomography and matrix function computations.

Tania Bakhos
Institute for Computational and Mathematical Engineering
Stanford University
[email protected]

Peter K. Kitanidis
Dept. of Civil and Environmental Engineering
Stanford University
[email protected]

Scott Ladenheim
Temple University
[email protected]

Arvind Saibaba
Department of Electrical and Computer Engineering
Tufts University
[email protected]

Daniel B. Szyld
Temple University
Department of Mathematics
[email protected]

MS1
A High Performance Algorithm for the Matrix Sign Function

The talk will describe a blocked (partitioned) algorithm for the matrix sign function, along with performance results that compare the existing algorithm to the new one. The algorithm is a blocked version of Higham's stabilized version of the Parlett-Schur matrix-sign algorithm for triangular matrices.

Sivan A. Toledo
Tel Aviv University
[email protected]
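For context on the matrix sign abstract above: the classical (unscaled) Newton iteration X_{k+1} = (X_k + X_k^{-1})/2 is the textbook dense method that blocked Schur-based algorithms aim to outperform. The sketch below is a generic illustration on a small made-up matrix, not the speaker's blocked algorithm.

# Unscaled Newton iteration for sign(A); requires A to have no purely imaginary eigenvalues.
import numpy as np

def sign_newton(A, iters=50, tol=1e-12):
    X = A.copy()
    for _ in range(iters):
        Xnew = 0.5 * (X + np.linalg.inv(X))
        if np.linalg.norm(Xnew - X, 1) <= tol * np.linalg.norm(Xnew, 1):
            return Xnew
        X = Xnew
    return X

A = np.diag([3.0, 1.0, -2.0]) + 0.1 * np.random.default_rng(0).standard_normal((3, 3))
S = sign_newton(A)
print("||S^2 - I|| =", np.linalg.norm(S @ S - np.eye(3)))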

Institute for Computational and Mathematical s rashedi [email protected] Engineering Stanford University [email protected] MS2 Iterative Methods for Solving Shifted Linear Sys- Peter K. Kitanidis tems Built Upon a Block Matrix-vector Product Dept. of Civil and Environmental Engineering Stanford University We discuss methods for simultaneous solving a family of [email protected] shifted linear systems of the form ∈ Cn×n Scott Ladenheim (A + σj I)xσj = bσj j =1, 2,...,L with A . Temple University [email protected] We explore methods built upon Krylov subspace iterations which employ a block matrix-vector product as their core Arvind Saibaba operation. We demonstrate that by carefully designing Department of Electrical and Computer Engineering these block-iterative methods, we can overcome some re- Tufts University strictions present in many existing techniques, i.e., the re- [email protected] quirement that right-hand sides be collinear, incompatibil- ity with many preconditioners, and difficulty of integration with subspace recycling techniques. Daniel B. Szyld Temple University Kirk M. Soodhalter Department of Mathematics Temple University [email protected] Department of Mathematics [email protected] MS2 Nested Krylov Methods for Shifted Linear Systems MS3 Linearizations of Hermitian Matrix Polynomials Most algorithms for the simultaneous solution of shifted Preserving the Sign Characteristic linear systems make use of the shift-invariance property of the underlying Krylov spaces. This particular comes into The sign characteristic of a Hermitian matrix polynomial play when preconditioning is taken into account. We pro- P (λ) is a set of signs attached to the real eigenvalues pose a new iterative framework for the solution of shifted of P (λ), which is crucial for determining the behavior of systems that uses an inner multi-shift Krylov method as systems described by Hermitian matrix polynomials. In a preconditioner within a flexible outer Krylov method. this talk we present a characterization of the Hermitian Shift-invariance is preserved if the inner method yields strong linearizations that preserve the sign characteristic collinear residuals. of a given Hermitian matrix polynomial and identify sev- eral families of such linearizations that can be easily con- Manuel Baumann structed from the coefficients of the matrix polynomial. Delft University of Technology Faculty of Electrical Engineering, Mathematics and Maria Bueno Computer Department of Mathematics [email protected] University of California, Santa Barbara [email protected] Martin B. van Gijzen Delft University of Technology Froilan Dopico [email protected] Department of Mathematics Universidad Carlos III de Madrid, Spain MS2 [email protected] One Matrix, Several Right Hand Sides: Efficient Susana Furtado Alternatives to Block Krylov Subspace Methods Faculdade de Economia do Porto Matrix-block vector products are more efficient than a se- [email protected] quence of matrix-vector products on current architectures. This is why methods for linear systems with several, s MS3 say, right hand sides gained renewed interest recently. We investigate short recurrence alternatives to known block Low Rank Perturbations of Canonical Forms Krylov subspace methods and, in particular, variants of BiCG and QMR which need just one (not s) multiplica- In this talk, we will review some know results that describe tions with the transposed matrix. the change of the following canonical forms under low rank perturbations: Andreas J. 
• The Jordan canonical form of a square matrix.
• The Weierstrass canonical form of a regular matrix pencil.
• The Kronecker canonical form of a singular matrix pencil without full rank.
• The Smith form of a regular matrix polynomial.

Andreas J. Frommer
Bergische Universitaet Wuppertal
Fachbereich Mathematik und Naturwissenschaften
[email protected]

Somaiyeeh Rashedi
Department of Mathematical Sciences, University of Tabriz

We will also present some recent results on the change of the Weierstrass canonical form of regular matrix pencils under low rank perturbations.

William & Mary
[email protected]

Fernando De Teran, Froilán M. Dopico
Department of Mathematics
Universidad Carlos III de Madrid
[email protected], [email protected]

Julio Moro
Universidad Carlos III de Madrid
Departamento de Matemáticas
[email protected]

MS3
Matrix Polynomials in Non-Standard Form

Matrix polynomials P(λ) and their associated eigenproblems are fundamental for a variety of applications. Certainly the standard (and apparently most natural) way to express such a polynomial has been

P(λ) = λ^k Ak + λ^(k−1) Ak−1 + · · · + λA1 + A0,

where Ai ∈ F^(m×n). However, it is becoming increasingly important to be able to work directly and effectively with polynomials in the non-standard form

Q(λ) = φk(λ)Ak + φk−1(λ)Ak−1 + · · · + φ1(λ)A1 + φ0(λ)A0,

where {φi(λ)}, i = 0, . . . , k, is some other basis for the space of all scalar polynomials of degree at most k. This talk will describe some new approaches to the systematic construction of families of linearizations for matrix polynomials like Q(λ), with emphasis on the classical bases associated with the names Newton, Hermite, Bernstein, and Lagrange.

D. Steven Mackey
Department of Mathematics
Western Michigan University
[email protected]

MS3
A Diagonal Plus Low Rank Family of Linearizations for Matrix Polynomials

We present a family of linearizations for an m × m matrix polynomial P(x) = Σ_{i=0}^{d} Pi x^i that have the form A(x) = D(x) + UV^t with D(x) diagonal. The diagonal term D(x) is a dm × dm matrix and U and V are dm × m. Moreover, D(x) can be chosen almost arbitrarily and U and V can be computed from P(x) and D(x). We show that appropriate choices of D(x) lead to linearizations with very good numerical properties. We provide a cheap strategy to determine good choices for D(x).

MS4
Student Research on Infinite Toeplitz Matrices

A Toeplitz matrix A has constant entries down each diagonal. So the whole matrix is determined by those numbers ak. Then those numbers are the coefficients in the function a(θ) = Σ ak e^(ikθ). This symbol function is the super-convenient way to study the infinite matrix. The student project started by factoring a(θ) into lower times upper (negative and positive powers), leading to A = LU. (Hugo Woerdeman has studied the much harder 2-level problem when the symbol is a(θ, φ).) Then the student Liang Wang moved to the Radon Transform Matrix – with neat results that are still to be published.

Gil Strang
MIT
[email protected]

MS4
Research in Linear Algebra with Undergraduates

I will discuss the benefits and difficulties and general issues in working with undergraduate students on research projects in Linear Algebra. I will share results from projects with students from the past few years, including a Sinkhorn-Knopp fixed point problem

x^(−1) = A(A^T x)^(−1)

(where for x = (x1, . . . , xn), x^(−1) = (1/x1, . . . , 1/xn)), an approximation of derivatives using the LU factorization of Vandermonde Matrices, and mosaicking with digital images.

David M. Strong
Department of Mathematics
Pepperdine University
[email protected]

MS4
Undergraduate Research Projects in Linear Algebra

Linear Algebra lends itself well to undergraduate research projects, as with one or two terms of Linear Algebra classes a variety of research projects are accessible. In addition, problems can often be explored by numerical and symbolic computations. In this talk we discuss some successful projects from the past, some ongoing ones, as well as some future possibilities. The focus will be on matrix completion problems and questions regarding determinantal representations of multivariable polynomials.
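A minimal sketch of the Sinkhorn-Knopp fixed point quoted above: alternating the row and column scalings drives diag(r) A diag(c) to a doubly stochastic matrix, and at the fixed point 1/r = A(A^T r)^(−1) entrywise. The matrix and iteration count below are invented for illustration and are not code from the student projects described.

```python
# Sinkhorn-Knopp balancing of a positive matrix (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((6, 6)) + 0.1          # entrywise positive matrix

r = np.ones(6)
for _ in range(200):
    c = 1.0 / (A.T @ r)               # column scaling
    r = 1.0 / (A @ c)                 # row scaling; fixed point: 1/r = A (1/(A^T r))

S = np.diag(r) @ A @ np.diag(c)
print(np.allclose(S.sum(axis=0), 1.0), np.allclose(S.sum(axis=1), 1.0))  # True True
```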
Hugo J. Woerdeman
Drexel University
hugo @ math.drexel.edu

Vanni Noferini
University of Manchester
[email protected]

Leonardo Robol
Scuola Normale Superiore
[email protected]

MS4
Undergraduate Research in Linear Algebra

Abstract not available at time of publication.

Charles Johnson

MS5
Exploiting Sparsity in Parallel Sparse Matrix-Matrix Multiplication

In this talk, we discuss graph and hypergraph partitioning based models and methods proposed to exploit the sparsity of SpGEMM computations for achieving efficiency on

distributed memory architectures for different parallelization schemes. All models rely on one-dimensional partitioning of both input matrices. In these models, the partitioning constraint encodes balancing the computational loads of processors, whereas the partitioning objective of minimizing cutsize corresponds to increasing locality within each processor and reducing the volume of interprocessor communication. We further address latency issues in these models. The performance of these models will be discussed through an MPI-based parallel SpGEMM library developed for this purpose.

Kadir Akbudak, Oguz Selvitopi, Cevdet Aykanat
Bilkent University
[email protected], [email protected], [email protected]

Nicholas Knight
New York University
[email protected]

Oded Schwartz
Hebrew University of Jerusalem
[email protected]

MS5
The Input/Output Complexity of Sparse Matrix Multiplication

We consider the problem of multiplying sparse matrices (over a semiring) where the number of non-zero entries is larger than main memory. In this paper we generalize the upper and lower bounds of Hong and Kung (STOC'81) to the sparse case. Our bounds depend on the number N of nonzero entries in A and C, as well as the number Z of nonzero entries in AC. We show that AC can be computed using Õ((N/B) · min(√(Z/M), N/M)) I/Os, with high probability. This is tight (up to polylogarithmic factors) when only semiring operations are allowed, even for dense rectangular matrices: we show a lower bound of Ω((N/B) · min(√(Z/M), N/M)) I/Os.

Morten Stöckel, Rasmus Pagh
IT University of Copenhagen
[email protected], [email protected]

MS5
Generalized Sparse Matrix-Matrix Multiplication and Its Use in Parallel Graph Algorithms

We present experimental results for optimized implementations of various (1D, 2D, 3D) parallel SpGEMM algorithms and their applications to graph problems such as triangle counting, betweenness centrality, graph contracting, and Markov clustering. We quantify the effects of in-node multithreading and the implementation trade-offs involved. We will also present a new primitive, masked SpGEMM, that avoids communication when the output structure (or an upper bound on it) is known beforehand. Masked SpGEMM has applications to matrix-based triangle counting and enumeration.
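As a small illustration of why a masked sparse product is useful, the classic matrix-based triangle count only needs the entries of A·A where A itself is nonzero. The sketch below forms the full product and masks it afterwards, since SciPy exposes no masked SpGEMM primitive; the graph size and density are arbitrary, and the point is the algebraic identity, not performance.

```python
# Triangle counting via a (post-hoc) masked sparse matrix product.
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(2)
n = 300
dense = (rng.random((n, n)) < 0.02).astype(np.int64)
A = sp.csr_matrix(np.triu(dense, 1))
A = A + A.T                                  # symmetric 0/1 adjacency, no self-loops

masked = A.multiply(A @ A)                   # keep (A@A)[i,j] only where A[i,j] != 0
triangles = masked.sum() // 6                # each triangle is counted 6 times
print("triangles:", triangles)
```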

Ariful Azad MS6 Purdue University Eigenvector Norms Matter in Spectral Graph The- [email protected] ory

Aydin Buluc We investigate the role of eigenvector norms in spectral Lawrence Berkeley National Laboratory graph theory to various combinatorial problems includ- [email protected] ing the densest subgraph problem, the Cheeger constant, among others. We introduce randomized spectral algo- John R. Gilbert rithms that produce guarantees which, in some cases, are Dept of Computer Science better than the classical spectral techniques. In particular, University of California, Santa Barbara we will give an alternative Cheeger sweep (graph partition- [email protected] ing) algorithm which provides a linear spectral bound for the Cheeger constant at the expense of an additional factor determined by eigenvector norms. Finally, we apply these MS5 ideas and techniques to problems and concepts unique to Hypergraph Partitioning for Sparse Matrix-Matrix directed graphs. Multiplication Franklin Kenter We model sparse matrix-matrix multiplication using an Rice University input-specific hypergraph, where nets of the hypergraph Dept. Computational and Applied Mathematics correspond to nonzero entries and vertices correspond to [email protected] nonzero scalar multiplications. The communication cost of a parallel algorithm corresponds to the size of the cut of a multi-way partition of the hypergraph. We will dis- MS6 cuss the efficiency of various algorithms for input matrices Local Clustering with Graph Diffusions and Spec- from several applications in the context of this and related tral Solution Paths hypergraph models. Eigenvectors of graph-related matrices are intimately con- Grey Ballard nected to diffusion processes that model the spread of Sandia National Laboratories information across a graph. Local Cheeger inequalities [email protected] guarantee that localized diffusions like the personalized PageRank eigenvector and heat kernel identify small, good- Alex Druinsky conductance clusters. We provide constant-time algo- Computational Research Division rithms for computing a generalized class of such diffusions, Lawrence Berkeley National Laboratory apply their solution paths to produce refined clusters, and [email protected] prove that these diffusions always identify clusters of con- 78 LA15 Abstracts

stant size. [email protected]
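For background on the Cheeger sweep mentioned in the spectral clustering abstracts above, here is a bare-bones sweep cut over the second eigenvector of the normalized Laplacian. The toy graph, the dense linear algebra, and the function name are illustrative only and do not correspond to the algorithms presented in the talks.

```python
# Sweep cut: sort vertices by the second eigenvector of the normalized Laplacian
# and return the prefix set with the smallest conductance.
import numpy as np

def sweep_cut(A):
    d = A.sum(axis=1)
    Dinv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(d)) - Dinv_sqrt @ A @ Dinv_sqrt        # normalized Laplacian
    _, vecs = np.linalg.eigh(L)
    order = np.argsort(Dinv_sqrt @ vecs[:, 1])            # second eigenvector, degree-scaled
    best, best_set = np.inf, None
    vol_total = d.sum()
    for k in range(1, len(d)):
        S = order[:k]
        vol = d[S].sum()
        cut = A[np.ix_(S, np.setdiff1d(order, S))].sum()
        phi = cut / min(vol, vol_total - vol)             # conductance of the prefix set
        if phi < best:
            best, best_set = phi, S
    return best, best_set

# Two dense blocks joined by a single edge: the sweep recovers the blocks.
A = np.zeros((10, 10))
A[:5, :5] = 1; A[5:, 5:] = 1
np.fill_diagonal(A, 0)
A[4, 5] = A[5, 4] = 1
phi, S = sweep_cut(A)
print(phi, sorted(S))
```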

Kyle Kloster Purdue University, Department of Mathematics MS7 150 N. University Street West Lafayette, IN 47906 Recovering Planted Subgraphs via Convex Graph [email protected] Invariants Extracting planted subgraphs from large graphs is a fun- David F. Gleich damental question that arises in a range of application do- Purdue University mains. In this talk, we outline a computationally tractable [email protected] approach based on convex optimization to recovery struc- tured planted graphs that are embedded in larger graphs containing spurious edges. (Joint work with Utkan Cando- MS6 gan.) Signed Laplacians and Spectral Clustering Via Quotient Space Metrics Venkat Chandrasekaran California Institute of Technology [email protected] Signed Laplacians represent the structure of graphs whose edges possess signatures, i.e., certain group actions. They are quite useful for many purposes, e.g., modelling so- MS7 cial networks with friend/enemy relationships between its Computationally-Efficient Approximations to Ar- members. We investigate the spectral clustering method bitrary Linear Dimensionality Reduction Opera- for signed Laplacians. Like in the traditional spectral al- tors gorithms, the eigenvectors are used to map the graph ver- tices into unit spheres. We show that, by using the metrics We examine a fundamental matrix approximation problem, of certain quotient spaces of the spheres, e.g., projective motivated by computational considerations in large-scale spaces, one can cluster the graph vertices to find inter- data processing: how well can an arbitrary linear dimen- esting signature-related substructures. In particular, this sionality reduction (LDR) operator (a matrix A ∈ Rm×n provides a spectral approach to Harary’s structural balance with m

Fatihcan M. Atay MS7 Max Planck Institute for Mathematics in the Sciences, Leipzi More Data, Less Work: Sharp DataComputation [email protected] Tradeoffs for Linear Inverse Problems Often one wishes to estimate an unknown but structured signal from linear measurements where the number of mea- MS6 surements is far less that the dimension of the signal. Eigenvectors of the Nonlinear Graph p-Laplacian We present a unified theoretical framework for conver- and Application in Graph Clustering gence rates of various optimization schemes for such prob- lems. Our framework applies to both convex and non- We investigate the discrete p-Laplacian operator Lp on con- convex objectives, providing precise tradeoffs between run- nected graphs. We show that many relevant properties of ning time, data, and structure complexity for many opti- the spectrum of the linear Laplacian can be extended to the mization problems. Joint with B. Recht and S. Oymak. nonlinear Lp. In particular we provide a variational charac- Mahdi Soltanolkotabi terization of a set of n points of the spectrum of Lp,hence University of California, Berkeley we prove a p-Laplacian nodal domain theorem and some [email protected] higher-order Cheeger-type inequalities. We interpret the obtained results in the context of graph clustering, propos- ing few examples and discussing some computational is- MS7 sues. A Data-dependent Weighted LASSO under Poisson Noise Francesco Tudisco Dept. of Mathematics and Computer Science Sparse linear inverse problems appear in a variety of set- Saarland University tings, but often the noise contaminating observations can- [email protected] not accurately be described as bounded or arising from a Gaussian distribution. Poisson observations in particu- Matthias Hein lar are a characteristic feature of several real-world appli- Saarland University cations. Previous work on sparse Poisson inverse prob- Dept. of Mathematics and Computer Science lems encountered several limiting technical hurdles. In LA15 Abstracts 79

this talk I will describe an alternative, streamlined analy- algorithm that works well in practice. sis approach for sparse Poisson inverse problems which (a) sidesteps the technical challenges present in previous work, Massimiliano Fasi (b) admits estimators that can readily be computed using Universita di Bologna off-the-shelf LASSO algorithms, and (c) hints at a general [email protected] weighted LASSO framework for broader classes of prob- lems. At the heart of this new approach lies a weighted Nicholas Higham LASSO estimator for which data-dependent weights are School of Mathematics based on Poisson concentration inequalities. Unlike previ- The University of Manchester ous analyses of the weighted LASSO, the proposed analysis [email protected] depends on conditions which can be checked or shown to hold in general settings with high probability. Bruno Iannazzo Universita‘ di Perugia, Italy Xin Jiang [email protected] Duke University [email protected] MS8 Patricia Reynaud-Bouret University of Nice The Leja Method: Backward Error Analysis and [email protected] Implementation The Leja method is a well established scheme for comput- Vincent Rivoirard ing the action of the matrix exponential. We present a new University of Paris - Dauphine backward error analysis allowing a more efficient method. [email protected] From a scalar computation in high precision we predict the necessary number of scaling steps based only on a rough Laure Sansonnet estimate of the field of values or norm of the matrix and Argo ParisTech the desired backward error. The efficiency of the approach [email protected] is shown in numerical experiments.
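The Leja method targets the action of the matrix exponential on a vector without ever forming exp(tA). SciPy's expm_multiply implements the Al-Mohy/Higham Taylor-based scaling approach, a different scheme from the Leja interpolation discussed above, but it solves the same task and gives a convenient reference point; the matrix below is a toy 1D Laplacian chosen only for illustration.

```python
# Computing exp(tA) b without forming exp(tA).
import numpy as np
import scipy.sparse as sp
from scipy.linalg import expm
from scipy.sparse.linalg import expm_multiply

n = 400
A = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="csr")  # 1D Laplacian
b = np.ones(n)
t = 0.5

y_action = expm_multiply(t * A, b)          # exp(tA) b, A kept sparse throughout
y_dense = expm(t * A.toarray()) @ b         # dense reference (O(n^3), small n only)
print(np.linalg.norm(y_action - y_dense) / np.linalg.norm(y_dense))
```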

Rebecca Willett
University of Wisconsin-Madison
[email protected]

Marco Caliari
University of Verona
[email protected]

Peter Kandolf
Universität Innsbruck
[email protected]

MS8
Estimating the Condition Number of f(A)b

I will present some new algorithms for estimating the con- Alexander Ostermann dition number of f(A)b. Standard condition number es- University of Innsbruck timation algorithms for f(A) require explicit computation [email protected] of matrix functions and their Fr´echet derivatives and are therefore unsuitable for the large, sparse A typically en- Stefan Rainer countered in f(A)b problems. The algorithms proposed University of Innsbruck here use only matrix-vector multiplications. The number Department of Mathematics of matrix-vector multiplications required to estimate the [email protected],at condition number is proportional to the square of the num- ber required by the underlying f(A)b algorithm. Numerical experiments demonstrate that the condition estimates are MS8 reliable and of reasonable cost. An Exponential Integrator for Polynomially Per- Edvin Deadman turbed Linear ODEs University of Manchester  [email protected] Consider the initial value problem u (t)= N  n×n ( =0 ε A)u(t), u(0) = u0,whereA0,A1,...,AN ∈ C n and u0 ∈ C . We present results that allow us to evaluate MS8 u(t) efficiently for several values of ε and t. An Algorithm for the Lambert W Function on Ma- We propose a new Krylov subspace method which uses the trices approximation of the product of the matrix exponential and a vector. The approach is based on the Taylor Despite being implicitly defined by the simple equation expansion of the exact solution u(t)withrespecttoε. z = W (z)exp(W (z)), the so-called “Lambert W’ function We show that the coefficient vectors of this expansion is not straightforward to compute, mainly because of the are given by the matrix exponential of a block Toeplitz lack of regularity and logarithm-like properties. The New- matrix. This result can be seen as a generalization of a ton method proves itself worthy for the scalar case, but a result given by Najfeld and Havel (1995). trivial extension to matrices suffers from numerical insta- A priori error bounds are derived for the approximation bility and difficulty in choosing a suitable starting matrix. which show a superlinear convergence for any N. These issues are fixed by deriving a numerically stable, coupled form of the Newton iteration and by employing a Antti Koskela blocked Schur decomposition, with suitable starting ma- University of Innsbruck trices chosen for each diagonal block. We obtain a robust [email protected] 80 LA15 Abstracts

Elias Jarlebring [email protected] KTH Stockholm [email protected] MS9 M.E. Hochstenbach Solving Linear Systems with Nonlinear Parameter Eindhoven University of Technology Dependency [email protected] We present the CORK method for solving the paramet- ric linear system A(σ)x = b. The idea is to approxi- MS9 mate A(σ) by a (rational) matrix polynomial and then Interpolatory Techniques for Model Reduction of extract x from the solution of a large scale problem with Multivariate Linear Systems using Error Estima- affine parameter. Previous work was reported using mono- tion mial (GAWE-method) and Chebyshev polynomials (infi- nite Arnoldi). The CORK framework is a general one that We consider interpolatory techniques for model reduction allows for much more freedom in the selection of polyno- of multivariate linear systems as they appear in the input mial or rational polynomial basis. output representation of quadratic bilinear systems. Exist- ing interpolatory techniques [C.Gu, QLMOR: a projection- Karl Meerbergen based nonlinear model order reduction approach using K. U. Leuven quadratic-linear representation of nonlinear systems, 2011], [email protected] [P.Benner and T.Breiten, Two-sided projection methods for nonlinear model order reduction 2015] for model reduc- Roel Van Beeumen, Wim Michiels tion of quadratic bilinear systems interpolate these gener- KU Leuven alized transfer functions at some random set of interpola- [email protected], tion points or by using the corresponding linear iterative [email protected] rational Krylov algorithm. The goal here is to propose an approach that identifies a good choice of interpolation points based on the error bound expressions derived re- MS9 cently in [L. Feng, A. C. Antoulas and P. Benner, Some Perfectly Matched Layers and Rational Krylov a posteriori error bounds for reduced order modelling of Subspaces with Adaptive Shifts for Maxwell Sys- (non-)parameterized linear systems, 2015]. The approach tems iteratively updates the interpolation points in a predefined sample space by computing the maximum error bound cor- The efficient and accurate modeling of waves in inhomoge- responding to the previous set of interpolation points. This neous media on unbounded domains is of paramount im- results in a greedy type algorithm for model reduction of portance in many different areas in science and engineer- the generalized transfer functions. Numerical results show ing. We show that optimal complex scaling and stability- the importance of choosing the interpolation points for corrected wave functions allow for the efficient computation some benchmark examples. of wave fields via Krylov subspace techniques. Polynomial and rational Krylov methods are discussed and numerical Mian Ilyas Ahmad experiments will illustrate the performance of the proposed Max Planck Institute - Magdeburg Krylov methods. Computational Methods in Systems and Control Theory [email protected] Vladimir L. Druskin Schlumberger-Doll Research [email protected] Lihong Feng, Peter Benner Max Planck Institute, Magdeburg, Germany [email protected], benner@mpi- Rob Remis magdeburg.mpg.de Circuits and Systems Group Delft University of Technology [email protected] MS9 Parallelization of the Rational Arnoldi Method Mikhail Zaslavsky Schlumberger-Doll Research The rational Arnoldi algorithm is a popular method in sci- [email protected] entific computing used to construct an orthonormal basis of a rational Krylov space. 
Each basis vector is a ratio- Joern Zimmerling nal matrix function times the starting vector. Rational Circuits and Systems Group functions possess a partial fraction expansion which of- Delft University of Technology ten allows to compute several basis vectors simultaneously. [email protected] However, this parallelism may cause instability due to the orthogonalization of ill-conditioned bases. We present and compare continuation strategies to minimize these effects. MS10 Recent Advances on Inverse Problems for Matrix Polynomials Mario Berljafa School of Mathematics We summarize several results on inverse eigenstructure The University of Manchester problems for matrix polynomials that have been obtained [email protected] recently and discuss how they complete other results previ- ously known in the literature. These new results are closely Stefan Guettel connected to the Index Sum Theorem and to the new class The University of Manchester of Polynomial Zigzag matrices. In particular, we solve the LA15 Abstracts 81

most general possible version of complete inverse eigen- nomial matrix of degree d with prescribed sets of left and structure problem for matrix polynomials. right minimal indices and with prescribed sets of finite ele- mentary divisors, satisfying the degree sum theorem. The Froil´an M. Dopico, Fernando De Teran use of a factorized form has the advantage that the differ- Department of Mathematics ent structural elements are easily recovered from each of Universidad Carlos III de Madrid the three factors. The left factor gives the left null space [email protected], [email protected] structure, the right factor gives the right null space struc- ture and the middle factor reveals the finite elementary D. Steven Mackey divisors. Department of Mathematics Western Michigan University Paul M. Van Dooren [email protected] Catholic University of Louvain Belgium Paul Van Dooren [email protected] Universit´e Catholique de Louvain [email protected] MS11 When Life is Linear Worldwide MS10 Matrix Functions: The Contributions of Leiba In this spring of 2015, Tim Chartier taught a MOOC Rodman through Davidson College and edX. The course came in two parts with both teaching applications of linear algebra In this presentation I will try to review the voluminous in computer graphics and data mining. Part 1 taught ba- contributions to linear algebra made by Leiba Rodman. I sics of linear algebra and was targeted to a broad audience. will combine this with some new insights into canonical Part 2 taught more advanced concepts such as eigenvectors, structures for selfadjoint quadratic matrix polynomials. Markov Chains, and the SVD. Both parts had activities in- tended to encourage exploration, discovery, and creativity. Peter Lancaster University of Calgary Tim Chartier Dept of Math and Statistics Davidson College [email protected] Dept. of Mathematics [email protected] MS10 Generic Low Rank Perturbations of Structured MS11 Matrices Coding the Matrix: Linear Algebra through Com- Recently, the theory of generic structure-preserving rank- puter Science Applications one perturbations of matrices that have symmetry struc- tures with respect to some indefinite inner product has Starting in 2008, I have been developing and teaching a lin- been developed. Concerning the case of arbitrary rank k, ear algebra course, Coding the Matrix, aimed at computer it was conjectured, but is not immediate, that a generic science students. Concepts are motivated and illustrated rank k perturbations can be interpreted as sequence of k using applications from computer science. I view program- generic rank-one perturbations. In this talk, we close this ming as a learning modality; students write programs to gap by giving an affirmative answer. develop and reinforce their understanding of concepts and proofs. In the MOOC version, taught twice through Cours- Leonhard Batzke era, student programs are submitted to a grader and au- TU Berlin tomatically checked using example data. I’ll discuss my [email protected] experiences.

Christian Mehl Philip Klein TU Berlin Brown University Institut fuer Mathematik [email protected] [email protected]

Andre C. Ran MS11 Vrije Universiteit LAFF Long and Prosper? Amsterdam [email protected] Linear Algebra: Foundations to Frontiers (LAFF) is a course developed specifically for the edX platform. It en- Leiba Rodman compasses more than 850 pages of notes, 250+ videos, on- College of William and Mary line activities, homeworks with answers, and programming Department of Mathematics exercises. We took on this activity to a large degree be- [email protected] cause of the MOOC movement’s promise to educate the masses, including those who don’t necessarily have access to high quality classes. What is the intended audience? MS10 Who showed up to take the course? What is the short A Factorized Form of the Inverse Polynomial Ma- term and long term prognosis? Will MOOCs change edu- trix Problem cation as we know it? The fact is that it is too early to tell. We discuss design decisions that underlie our course and In this talk we consider the problem of constructing a poly- early experiences from offering this course in Spring 2014 82 LA15 Abstracts

and Spring 2015. Theoretical Division [email protected] Maggie E. Myers The University of Texas at Austin [email protected] MS12 Analyzing Spgemm on Gpu Architectures Robert A. van de Geijn The University of Texas at Austin Mapping sparse matrix-sparse matrix multiplication Department of Computer Science (SpGEMM) to modern throughput-oriented architectures, [email protected] such as GPUs, presents a unique set challenges in com- parison with traditional CPUs implementations. In this presentation we will present an overview of the architec- MS11 tural issues that must be addressed by SpGEMM oper- Experience with OpenCourseWare Online Video ations on GPUs and discuss recent work to address these Lectures areas of concern. Specifically we will compare and contrast segmented and flat processing schemes and the impact of I will share my experience in preparing video lectures each scheme on the expected SpGEMM performance. for basic mathematics courses. Linear Algebra and also Computational Science came directly from the MIT class- Steven Dalton room. Calculus and the new Differential Equations videos University of Illinois at Urbana-Champaign were filmed without an audience—I bought a camera and [email protected] students helped. All are uploaded to OpenCourseWare ocw.mit.edu and to YouTube (where an individual can cre- Luke Olson ate a Channel, but most viewers will find videos from es- Department of Computer Science tablished websites). Using a blackboard can still be more University of Illinois at Urbana-Champaign effective than slides (but there is a place for slides). Your [email protected] voice and movements make a human connection that can change lives. MS12 Gilbert Strang The Distributed Block-Compressed Sparse Row Li- Massachusetts Institute of Technology brary: Large Scale and GPU Accelerated Sparse [email protected] Matrix Multiplication

The Distributed Block-Compressed Sparse Row (DBCSR) MS12 library is designed to efficiently perform block-sparse Strong Scaling and Stability: SpAMM Accelera- matrix-matrix multiplication, among other operations. It tion for the Matrix Square Root Inverse and the is MPI and OpenMP parallel, and can exploit accelerators. Heavyside Function It is developed as part of CP2K, where it provides core functionality for linear scaling electronic structure theory. For matrices with decay, incomplete/inexact approxima- The deployed algorithm takes in account the sparsity of the tions based on the dropping of small elements leads to a problem in order to reduce the MPI communication. We sparse approximation and fast solutions for preconditioners will discuss performance results and development insights. (and even close to full solutions in well-conditioned circum- stances) via the SpMM kernel. In this talk, I will develop an alternative strategy for the N-Body multiplication of Alfio Lazzaro,OleSchuett data local decay matrices, the Sparse Approximate Matrix ETH Zurich Multiply (SpAMM), based on recursive Cauchy-Swartz oc- alfi[email protected], [email protected] clusion of sub-multiplicative norms (metric query). The SpAMM kernel is sparse in the multiplication (task) space, Joost VandeVondele via occlusion and culling of small norms (rather than in Nanoscale Simulations the vector (data) space), yielding bounds for the product ETH Zurich that are multiplicative rather than additive. I will develop [email protected] a stability analysis for SpAMM accelerated Newton Schulz iterations towards the inverse square root, and show how the convergence of Frechet channels towards idempotence MS12 and nilpotence determine stability. I will demonstrate ex- A Framework for SpGEMM on GPUs and Hetero- treme stability under SpAMM approximation for NS itera- geneous Processors tion with congruential transformation, and show dramatic error cancellation between the nilpotent Frechet derivative This talk will focus on our framework and correspond- and the corresponding skew displacement, which grows as ing algorithms for solving three main challenges (un- −1/2 −1/2 cond(Zk ),withZk− >A . Finally, I will describe known number of nonzeros of the resulting matrix, expen- how N-Body structure of the SpAMM algorithm leads to sive insertion and load imbalance) of executing SpGEMM encouraging results for distributed memory implementa- on GPUs and emerging heterogeneous processors. See tions, with recent results for a hybrid Charm++/OpenMP http://arxiv.org/abs/1504.05022 for a preprint paper of approach to task parallelism, yielding strong scaling up this talk. to 500 cores per heavy atom for calculation of the matrix Heavyside function in the context of electronic structure Weifeng Liu theory. Copenhagen University [email protected] Matt Challacombe Los Alamos National Laboratory Brian Vinter LA15 Abstracts 83

Niels Bohr Institute, University of Copenhagen Ralph C. Smith [email protected] North Carolina State Univ Dept of Mathematics, CRSC [email protected] MS13 Active Subspaces in Theory and Practice Brian Williams Los Alamos National Laboratory Active subspaces are an emerging set of tools for working [email protected] with functions of several variables. I will motivate dimen- sion reduction in scalar-valued functions of several vari- ables, define the active subspace, and analyze methods for MS13 estimating it based on the function’s gradient. Recovery of Structured Multivariate Functions
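The gradient-based construction behind active subspaces can be stated in a few lines: estimate C = E[∇f ∇f^T] by Monte Carlo over input samples and take its leading eigenvectors. The test function, dimensions, and sample count below are invented for illustration and are not taken from the talks.

```python
# Active subspace estimation from sampled gradients (illustrative sketch).
import numpy as np

rng = np.random.default_rng(3)
m = 10                                   # ambient dimension
w = np.array([1.0, 2.0] + [0.0] * (m - 2))

def grad_f(x):                           # f(x) = sin(w.x)  ->  grad f(x) = cos(w.x) * w
    return np.cos(w @ x) * w

X = rng.uniform(-1, 1, size=(500, m))    # input samples
G = np.array([grad_f(x) for x in X])
C = G.T @ G / len(X)                     # sampled C = (1/M) sum grad grad^T

eigvals, eigvecs = np.linalg.eigh(C)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
print(eigvals[:3])                       # one dominant eigenvalue (rank-one C)
print(eigvecs[:, 0])                     # approximately +/- w / ||w||
```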

Paul Constantine Approximation of multivariate functions typically suffers Colorado School of Mines from curse of dimension, i.e. the number of sampling points Applied Mathematics and Statistics grows exponentially with the dimension. We pose a struc- [email protected] tural assumption on the functions to be approximated - namely that they take a form of a ridge, f(x)=g(Ax). We present recovery algorithms for this kind of a problem, MS13 including tractability results. Sketching Active Subspaces Jan Vybiral When gradients are not available, estimating the eigenvec- Technical University Berlin tors that define a multivariate function’s active subspace [email protected] is especially challenging. We present a matrix sketching approach that exploits the connection between directional Massimo Fornasier derivatives and linear projections to estimate the eigenvec- Technical University Munich tors with fewer function calls than standard finite differ- [email protected] ences. Karin Schnass Armin Eftekhari,FirstNameEftehari Uni Innsbruck Colorado School of Mines [email protected] [email protected], [email protected]

Paul Constantine MS14 Colorado School of Mines Applied Mathematics and Statistics Fractional Tikhonov Regularization and the Dis- [email protected] crepancy Principle for Deterministic and Stochas- tic Noise Models

MichaelB.Wakin Recently, two different regularization methods called frac- Dept. of Electrical Engineering and Computer Science tional Tikhonov regularization have been introduced, Colorado School of Mines in particular to reduce the over-smoothing of classical [email protected] Tikhonov regularization. In this talk, we discuss conver- gence properties of these methods in the classical deter- ministic setting and show how one can lift these results to MS13 the case of stochastic noise models. We compare the re- Adaptive Morris Techniques for Active Subspace construction quality of fractional Tikhonov methods with Construction the classical approach for the discrepancy principle.
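For reference, classical Tikhonov regularization combined with the discrepancy principle can be written compactly through the SVD filter factors σi/(σi² + λ²); the fractional Tikhonov methods discussed above replace these filter factors by fractional-power variants, which are not reproduced here. The test problem below is synthetic and purely illustrative.

```python
# Classical Tikhonov regularization with the discrepancy principle, via the SVD.
import numpy as np

rng = np.random.default_rng(4)
n = 80
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 10.0 ** np.linspace(0, -8, n)               # rapidly decaying singular values
A = U @ np.diag(s) @ V.T
x_true = V @ np.ones(n)
noise = 1e-3 * rng.standard_normal(n)
b = A @ x_true + noise

def tikhonov(lam):
    f = s / (s**2 + lam**2)                     # filter factors applied to U^T b
    return V @ (f * (U.T @ b))

# Discrepancy principle: pick lam so that ||A x_lam - b|| is about 1.05 * ||noise||.
target = 1.05 * np.linalg.norm(noise)
lo, hi = 1e-12, 1e2
for _ in range(100):                            # bisection on log(lam)
    lam = np.sqrt(lo * hi)
    r = np.linalg.norm(A @ tikhonov(lam) - b)
    lo, hi = (lo, lam) if r > target else (lam, hi)

rel_err = np.linalg.norm(tikhonov(lam) - x_true) / np.linalg.norm(x_true)
print(f"lambda = {lam:.2e}, relative error = {rel_err:.2e}")
```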

In this presentation, we discuss the use of adaptive Morris Daniel Gerth techniques for adaptive subspace construction. In appli- Faculty for Mathematics cations ranging from neutronics models for nuclear power Technical University Chemnitz plant design to partial differential equations with dis- [email protected] cretized random fields, the number of inputs can range from the thousands to millions. In many cases, however, Esther Klann the dimension of the active subspace of influential param- University of Linz, Austria eters is moderate in the sense that it is less than one hun- [email protected] dred. For codes in which adjoints are available, gradient approximations can be used to construct these active sub- Lothar Reichel spaces. In this presentation, we will discuss techniques, Kent State University which are applicable when adjoints are not available or [email protected] are prohibitively expensive. Specifically, we will employ Morris indices with adaptive stepsizes and step-directions to approximate the active subspace. We illustrate these Ronny Ramlau techniques using examples from nuclear and aerospace en- Johannes Kepler Universit¨at Linz, Austria gineering. [email protected]

Allison Lewis North Carolina State University MS14 [email protected] A Singular Value Decomposition for Atmospheric 84 LA15 Abstracts

Tomography Gabriele Stocchino, Massimo Vanzi University of Cagliari The image quality of ground based astronomical telescopes Italy suffers from turbulences in the atmosphere. Adaptive Op- [email protected], [email protected] tics (AO) systems use wavefront sensor measurements of incoming light from guide stars to determine an optimal shape of deformable mirrors (DM) such that the image of MS15 the scientific object is corrected after reflection on the DM. Sparse Approximate Inverse Preconditioners, Re- An important step in the computation of the mirror shapes visited is Atmospheric Tomography, where the turbulence profile of the atmosphere is reconstructed from the incoming wave- Preconditioners that are sparse approximations to the in- fronts. We present several reconstruction approaches and verse of a matrix can be applied very efficiently in paral- will in particular focus on reconstructions based on a sin- lel. A disadvantage, however, is that they are local–the gular value decomposition of the underlying operator. preconditioning only propagates information from one grid point to nearby grid points, resulting in slow convergence. Ronny Ramlau We investigate applying sparse approximate inverses in an Johannes Kepler Universit¨at Linz, Austria hierarchical basis. The hierarchical basis transformations [email protected] are also sparse matrix vector products. The method is related to using sparse approximate inverses as multigrid smoothers and to using wavelet bases, which have been in- MS14 vestigated in the past. Our goal, however, is to develop a Unbiased Predictive Risk Estimator for Regular- simple method that is easily parallelizable. ization Parameter Estimation in the Context of It- eratively Reweighted Lsqr Algorithms for Ill-Posed Edmond Chow Problems School of Computational Science and Engineering Georgia Institute of Technology Determination of the regularization parameter using the [email protected] method of unbiased predictive risk estimation (UPRE) for the LSQR Tikhonov regularization of under determined ill-posed problems is considered and contrasted with the MS15 generalized cross validation and discrepancy principle tech- Iterative Solver for Linear Systems Arising in Inte- niques. Examining the UPRE for the projected problem it rior Point Methods for Semidefinite Programming is shown that the obtained regularized parameter provides a good estimate for that to be used for the full problem Interior point methods for Semidefinite Programming with the solution found on the projected space. The results (SDP) face a difficult linear algebra subproblem. We pro- are independent of whether systems are over or underde- pose an iterative scheme to solve the Newton equation sys- termined. Numerical simulations support the analysis and tem arising in SDP. It relies on a new preconditioner which a large scale problem of image restoration validates the use exploits well the sparsity of matrices. Theoretical insights of these techniques. into the method will be provided. The computational re- sults for the method applied to large MaxCut and matrix- Rosemary A. Renaut norm minimization problems will be reported. Arizona State University School of Mathematical and Statistical Sciences Jacek Gondzio [email protected] The University of Edinburgh, United Kingdom [email protected] Saeed Vatankhah University of Tehran, Iran Stefania Bellavia [email protected] Universita di Firenze stefania.bellavia@unifi.it

MS14 Margherita Porcelli Shape Reconstruction by Photometric Stereo with University of Bologna Unknown Lighting [email protected] Photometric stereo is a typical technique in computer vi- sion to extract shape and color information from an object MS15 which is observed under different lighting conditions, from On Nonsingular Saddle-Point Systems with a Max- a fixed point of view. We will describe a fast and accurate imally Rank Deficient Leading Block algorithm to approximate the framed object, treating, in particular, the case when the position of the light sources We consider nonsingular saddle-point matrices whose lead- is not known. Numerical experiments concerning an appli- ing block is maximally rank deficient, and show that the cation in archaeology will be illustrated. inverse in this case has unique mathematical properties. We then develop a class of indefinite block precondition- Giuseppe Rodriguez ers that rely on approximating the null space of the leading University of Cagliari, Italy block. The preconditioned matrix is a product of two indef- [email protected] inite matrices but under certain conditions the conjugate gradient method can be applied and is rapidly convergent. Riccardo Dess`ı Spectral properties of the preconditioners are observed and Politecnico di Torino validated by numerical experiments. Italy [email protected] Chen Greif LA15 Abstracts 85

Department of Computer Science palfi[email protected] The University of British Columbia [email protected] MS16 Using Inverse-free Arithmetic in Large-scale Ma- MS15 trix Equations
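A minimal example in the spirit of the saddle-point preconditioning abstracts above: MINRES applied to K = [[A, B^T], [B, 0]] with the ideal block-diagonal preconditioner diag(A, B A^{-1} B^T) converges in a handful of iterations (three in exact arithmetic, by the Murphy-Golub-Wathen result). Sizes and data are arbitrary, and the preconditioner is applied exactly only because the example is tiny; this is not the indefinite preconditioner of the talk.

```python
# Block-diagonally preconditioned MINRES on a small saddle-point system.
import numpy as np
import scipy.sparse.linalg as spla

rng = np.random.default_rng(5)
n, m = 60, 20
A = rng.standard_normal((n, n)); A = A @ A.T + n * np.eye(n)   # SPD leading block
B = rng.standard_normal((m, n))
K = np.block([[A, B.T], [B, np.zeros((m, m))]])
rhs = rng.standard_normal(n + m)

S = B @ np.linalg.solve(A, B.T)                                # Schur complement B A^{-1} B^T
P = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), S]])
M = spla.LinearOperator(K.shape, matvec=lambda v: np.linalg.solve(P, v),
                        dtype=np.float64)                      # applies P^{-1}

it = []
x, info = spla.minres(K, rhs, M=M, callback=lambda xk: it.append(1))
print(info, len(it), np.linalg.norm(K @ x - rhs))              # few iterations, small residual
```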

A Tridiagonalization Method for Saddle-Point and Inverse-free arithmetic and permuted Riccati bases have Quasi-Definite Systems been used recently in the solution of small-scale algebraic Riccati equations, with excellent stability properties. We The tridiagonalization process of Simon, Saunders and Yip explore their use to solve large-scale matrix equations: our (1988) applied to a rectangular operator A gives rise to target is being able to deal with problems where the ma- USYMQR, which solves a least-squares problem with A, trix X has large norm, and hence the columns of [I; X]are and USYMLQ, which solves a least-norm problem with T an ill-conditioned basis for its column space. For simplic- A . Symmetric saddle-point systems may be viewed as a ity, we focus on solving Lyapunov equations with the ADI pair of least-squares/least-norm problems. This allows us algorithm. to merge USYMQR and USYMLQ into a single method that solves both problems in one pass. We present precon- Volker Mehrmann ditioned and regularized variants that apply to symmetric Technische Universit¨at Berlin and quasi-definite linear systems and illustrate the perfor- [email protected] mance of our implementation on systems coming from op- timization and fluid flow. Federico Poloni University of Pisa Dominique Orban [email protected] GERAD and Dept. Mathematics and Industrial Engineering Ecole Polytechnique de Montreal MS16 [email protected] Preconditioned Riemannian Optimization for Low- Rank Tensor Equations

MS16 Solving partial differential equations on high-dimensional Averaging Block-Toeplitz Matrices with Preserva- domains leads to very large linear systems. In these cases, tion of Toeplitz Block Structure the degrees of freedom in the linear system grow expo- nentially with the number of dimensions, making classic In many applications, data measurements are transformed approaches unfeasible. Approximation of the solution by into matrices. To find a trend in repeated measurements, low-rank tensor formats often allows us to avoid this curse an average of the corresponding matrices is required. We of dimensionality by exploiting the underlying structure use the barycenter, or minimizer of the sum of squared in- of the linear operator. We propose a truncated Newton trinsic distances, as our averaging operation. This intrin- method on the manifold of tensors of fixed rank, in partic- sic distance depends on the chosen geometry for the set ular, Tensor Train (TT) / Matrix Product States (MPS) of interest. We present an application-inspired geometry tensors. We demonstrate the flexibility of our algorithm for positive definite Toeplitz-Block Block-Toeplitz matri- by comparing different approximations of the Riemannian ces. Both the corresponding barycenter and an efficient Hessian as preconditioners for the gradient directions. Fi- approximation are discussed. nally, we compare the efficiency of our algorithm with other tensor-based approaches such as the Alternating Linear Ben Jeuris Scheme (ALS). KU Leuven Bart Vandereycken [email protected] ETH Zurich [email protected] Raf Vandebril Katholieke Universiteit Leuven Daniel Kressner Department of Computer Science EPFL Lausanne [email protected] Mathicse daniel.kressner@epfl.ch MS16 Michael Steinlechner Theory and Algorithms of Operator Means EPF Lausanne michael.steinlechner@epfl.ch We will consider the theory of matrix and operator means based on the Kubo-Ando and Loenwer theories. We will consider theoretical methods for extending, defining and MS17 approximating means of more than two positive operators, Tensors and Structured Matrices of Low Rank in fact means of probability measures supported on the cone of positive operators. The techniques will be moti- We present an interesting connection between higher-order vated by the geometric mean of positive definite matrices tensors and structured matrices, in the context of low-rank and discrete-time gradient flows for geodescally convex po- approximation. In particular, we show that the tensor low tential functions. multilinear rank approximation problem can be reformu- lated as a structured matrix low-rank approximation, the Miklos Palfia latter being an extensively studied and well understood Kyoto University problem. For simplicity, we consider symmetric tensors. 86 LA15 Abstracts

By imposing simple constraints in the optimization prob- has been popularly used to find a best low multilinear- lem, the proposed approach is applicable to general tensors, rank approximation of a tensor. It often performs well in as well as to affinely structured tensors to find (locally) best practice to learn a multilinear subspace. However, little of low multilinear rank approximation with the same struc- its convergence behavior has been known. In this talk, I ture. will present a greedy way to implement the HOOI method and show its subsequence convergence to a stationary point Mariya Ishteva without any assumption. Assuming a nondegeneracy con- Vrije Universiteit Brussel dition, I will further show its global sequence convergence [email protected] result, which guarantees global optimality if the starting point is sufficiently close to a globally optimal solution. Ivan Markovsky The results are then extended to the multilinear SVD with Vrije Universiteit Brussel (VUB) missing values. [email protected] Yangyang Xu Rice University MS17 University of Waterloo Alternating Least-Squares Variants for Tensor Ap- [email protected] proximation

The Alternating Least-Squares (ALS) algorithm for CP ap- MS18 proximation of tensors has spawned several variants, the Communication-optimal Loop Nests best-known of these being ALS with (Tikhonov) regular- ization. We present a unified description of these vari- Scheduling an algorithm’s operations to minimize commu- ants of ALS, and further generalize it to produce novel nication can reduce its runtime and energy costs. We variants of the algorithm. One variant, tentatively titled study communication-efficient schedules for a class of al- ”greased ALS”, converges significantly more rapidly than gorithms including many-body and matrix/tensor compu- ALS for several test cases. We seek to understand the cir- tations and, more generally, loop nests operating on array cumstances under which these variants provide more rapid variables subscripted by linear functions of the loop iter- convergence, and fully understand the reasons for the im- ation vector. We derive lower bounds on communication provement. between levels in a memory hierarchy and between parallel processors, and present communication-optimal schedules Nathaniel McClatchey that attain these bounds in many cases. Ohio University Department of Mathematics Nicholas Knight [email protected] New York University [email protected] Martin J. Mohlenkamp Ohio University [email protected] MS18 A Computation and Communication-Optimal Par- MS17 allel Direct 3-Body Algorithm Generating Polynomials and Symmetric Tensor We present a new 3-way interactions N-body (3-body) al- Decomposition gorithm that is both computation and communication op- timal. Its optional replication factor, c,savesc3 in commu- For symmetric tensors of a given rank, there exist linear re- 2 nication latency (number of messages) and c in bandwidth lations of recursive patterns among their entries. Such re- (volume). We also extend the algorithm to support k-way lations can be represented by polynomials, which are called interactions with cutoff distance. The 3-body algorithm generating polynomials. The homogenization of a generat- demonstrates 99% efficiency on tens of thousands of cores, ing polynomial belongs to the apolar ideal of the tensor. showing strong scaling properties with order of magnitude A set of generating polynomials can be represented by a speedups over the na¨ıve algorithm. matrix, which is called a generating matrix. Generally, a symmetric tensor decomposition can be uniquely deter- Penporn Koanantakool mined by a generating matrix satisfying certain conditions. UC Berkeley We characterize the sets of such generating matrices and [email protected] investigate their properties (e.g., the existence, dimensions, nondefectiveness). Using these properties, we propose com- putational methods for symmetric tensor decompositions. MS18 Extensive examples are shown to demonstrate their effi- ciency. Communication Lower Bounds for Distributed- Memory Computations Jiawang Nie University of California, San Diego We present a new approach to the study of the commu- [email protected] nication requirements of distributed computations. This approach advocates for the removal of the restrictive as- sumptions under which earlier lower bounds on communi- MS17 cation complexity were derived, and prescribes that lower On the Convergence of Higher-order Orthogonality bounds rely only on a mild assumption on work distribution Iteration and Its Extension in order to be significant and applicable. 
This approach is illustrated by providing tight communication lower bounds The higher-order orthogonality iteration (HOOI) method for several fundamental problems, such as matrix multipli- LA15 Abstracts 87

cation. Tensor

Michele Scquizzato The electron repulsion integral tensor has ubiquitous appli- University of Houston cation in quantum chemistry calculation. In this talk, we [email protected] discuss some recent progress on compressing the electron repulsion tensor into the tensor hypercontraction format Francesco Silvestri with cubic scaling computational cost. (joint work with University of Padova Lexing Ying) [email protected] Jianfeng Lu Mathematics Department MS18 Duke University Minimizing Communication in Tensor Contraction [email protected] Algorithms

Accurate numerical models of electronic correlation in MS19 molecules are derived in terms of tensors that represent Reduced Density Matrix Methods in Theoretical multi-orbital functions and consequently contain permu- Chemistry and Physics tational symmetries. Traditionally, contractions of such symmetric tensors exploit only the symmetries in the set The energy of an N-electron quantum system can be ex- of tensor element products. I will present an algebraically- pressed as a functional of the two-electron reduced density restructured method that requires fewer operations for matrix (2-RDM). In 2-RDM methods the 2-RDM of an many such contractions of symmetric tensors. Addition- N-electron system is directly computed without the wave ally, I will present communication lower bounds that con- function. In this lecture recent theoretical and computa- tain surprising new insights for both approaches. tional advances as well as examples from chemistry and condensed-matter physics will be discussed. Edgar Solomonik ETH Zurich David A. Mazziotti [email protected] University of Chicago [email protected]

MS19 Numerical Tensor Algebra in Modern Quantum MS20 Chemistry and Physics On Powers of Certain Positive Matrices I will review the role of tensor operations in quantum simu- A matrix is called totally positive (resp. nonnegative) if lations, and in particular, the challenges of numerical com- all of its minors are positive (resp. nonnegative). It is putation in the matrix product state and tensor network known that such matrices are closed under conventional representations. Issues of parallelization and numerical sta- multiplication, but not necessarily closed under entry-wise bility will be discussed, especially for tensor networks with or Hadamard multiplication. On the other hand, a real cycles. matrix is called positive definite (resp. semidefinite) if it is symmetric and has positive (resp. nonnegative) prin- Garnet Chan cipal minors. In this case, such matrices need not be Princeton University closed under matrix multiplication, but are closed under [email protected] Hadamard multiplication. In this talk, we will survey ex- isting work and discuss some current advances on contin- uous Hadamard and conventional powers of both totally MS19 positive and positive definite matrices (along with their Efficient Algorithms for Transition State Calcula- closures), in the spirit of identifying a critical exponent in tions either situation, should one exist.

Transition states are fundamental to understanding the re- Shaun M. Fallat action dynamics qualitatively. To date various methods of University of Regina first principle location of transition states have been devel- Canada oped. In the absence of the knowledge of the final struc- [email protected] ture, the minimal-mode following method climbs up to a transition state without calculating the Hessian matrix. In this talk, we introduce a locally optimal search direction MS20 finding algorithm and an iterative minimizing method for Perron-Frobenius Theory of Generalizations of the translation which improve the rotational step by a fac- Nonnegative Matrices tor and the translational step a quantitative scale. Numeri- cal experiments demonstrate the efficiency of our proposed An eventual property is one that is attained by all powers algorithms. This is joint work with Jing Leng, Zhi-Pan of a matrix beyond some power, so a matrix is eventually k Liu, Cheng Shang and Xiang Zhou. positive if A is entrywise positive for all k ≥ k0,This talk will survey Perron-Frobenius theory of positive and Weiguo Gao nonnegative matrices and and how it extends to general- Fudan University, China izations of nonnegative matrices, especially those defined [email protected] by eventual properties.

Leslie Hogben MS19 Iowa State University, Compression of the Electron Repulsion Integral American Institute of Mathematics 88 LA15 Abstracts

[email protected] between the real object and its observed image can be de- scribed as a convolution with the so called point spread function (PSF). Even though the to-be-built generation of MS20 extremely large telescopes (ELTs) use adaptive optics (AO) Inverses of Acyclic Matrices and Parter Sets systems to reduce the effects of the atmosphere, still some residual blurring remains. The PSF serves as quality mea- An important result of Gantmacher and Krein describes sure for the AO system and thus needs to be very accurate. the inverses of symmetric, irreducible and invertible tridi- As the PSF cannot be measured directly, a reconstruction agonal matrices. In the early 2000s, Nabben extended this form the data of the AO system is necessary. In this talk, by describing the inverses of invertible matrices, M,whose an efficient algorithm for PSF reconstruction for the Euro- graph is a given tree T under the assumption that each pean ELT (E-ELT) is presented. The reconstruction pro- diagonal entry of M −1 is positive. These are closely re- cess is based on splitting the PSF in one part calculated on lated to the important class of ultrametric matrices. We the fly from observation data and a second part that can establish a similar description for arbitrary trees T without only be obtained from simulation. Especially for the latter the condition on the diagonal entries, and relate this to the one, a good model of the atmosphere is needed and due to Parter-sets of M, i.e. subsets α of vertices such that the the size of the E-ELT all calculations have to be made as nullity of the matrix obtained from M by deleting the rows efficient as possible even though simulations can be done and columns indexed by α is the cardinality of α. beforehand. On the fly computations need to be fast and using little memory as the measurements are usually taken Bryan L. Shader at a frequency of 500 Hz, leading to a huge amount of data Department of Mathematics just for one single night. In addition, astronomers want University of Wyoming to evaluate the current observations conditions using the [email protected] PSF and thus need it almost on time. We show how to use an efficient representation, leading to a sparse matrix. Curtis Nelson The reconstructed PSF can be used for improvement of Department of Mathematics the observed images. We opt for a blind deconvolution Brigham Young University scheme to also improve the quality of the reconstructed [email protected] PSF. Again the choice of an efficient representation, now for the observed image, is crucial to get a sparse linear sys- tem. Finally, a blind deconvolution algorithm taking care MS20 of the features of ground based astronomy is presented. Geometric Mapping Properties of Semipositive Matrices Roland Wagner Radon Institute for Computational and Applied Semipositive matrices map a positive vector to a positive Mathematics vector. 
The geometric mapping properties of semipositive Austrian Academy of Sciences matrices exhibit parallels to the theory of cone preserving [email protected] and cone mapping matrices: For a semipositive matrix A, there exist a positive proper polyhedral cone K1 and a pos- MS21 itive polyhedral cone K2 such that AK1 = K2.Thesetof all nonnegative vectors mapped by A to the nonnegative Randomized Tensor Singular Value Decomposition orthant is a proper polyhedral cone; as a consequence, A The tensor Singular Value Decomposition (t-SVD) pro- belongs to a proper polyhedral cone comprising semiposi- k posed by Kilmer and Martin [2011] has been applied suc- tive matrices. When the powers A have a common semi- cessfully in many fields, such as computed tomography, fa- positivity vector, then A has a positive eigenvalue. When cial recognition, and video completion. In this talk, I will the spectral radius of A is the sole peripheral eigenvalue present a probabilistic method that can produce a factor- and the powers of A are semipositive, A leaves a proper ization with similar properties to the t-SVD, but is stable cone invariant. and more computationally efficient on very large datasets. Michael Tsatsomeros This method is an extension of a well-known randomized Washington State University matrix method. I will present the details of the algorithm, [email protected] theoretical results, and provide experimental results for two specific applications.
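The "well-known randomized matrix method" referred to above is, presumably, the Halko-Martinsson-Tropp randomized SVD; the sketch below is a bare-bones matrix version of it (the tensor t-SVD extension from the talk is not reproduced), with invented sizes, ranks, and parameter names.

```python
# Randomized SVD of a matrix: range sketch, QR, then SVD of the small projection.
import numpy as np

def randomized_svd(A, k, oversample=10, power_iters=1, seed=7):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + oversample))   # random test matrix
    Y = A @ Omega
    for _ in range(power_iters):                       # optional power iterations
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                             # orthonormal range approximation
    B = Q.T @ A                                        # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]

rng = np.random.default_rng(8)
A = rng.standard_normal((500, 40)) @ rng.standard_normal((40, 300))   # rank-40 matrix
U, s, Vt = randomized_svd(A, k=40)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))           # essentially exact here
```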

Jiani Zhang, Shuchin Aeron
Tufts University
[email protected], [email protected]

Arvind Saibaba
Department of Electrical and Computer Engineering
Tufts University
[email protected]

Misha E. Kilmer
Tufts University
[email protected]

MS21
Vector Extrapolation for Image Restoration

Abstract not available at time of publication.

Hassane Sadok
Université du Littoral
Calais Cedex, France
[email protected]

MS21
PSF Reconstruction for Extremely Large Telescopes

Objects on the sky appear blurred when observed by a ground-based telescope due to diffraction of light and turbulence in the atmosphere. Mathematically the relation

MS21
Multidirectional Subspace Expansion for Single-Parameter and Multi-Parameter Tikhonov Regularization

Tikhonov regularization is often used to approximate solutions of linear discrete ill-posed problems when the observed or measured data is contaminated by noise. Multi-parameter Tikhonov regularization may improve the quality of the computed approximate solutions. We propose a new iterative method for large-scale multi-parameter Tikhonov regularization with general regularization operators based on a multidirectional subspace expansion.

that best fits the available observations to within the observational error over a period of time. This results in a large-scale nonlinear weighted least-squares problem, typically solved by a Gauss-Newton method. We will discuss using a multilevel preconditioner within 4D-Var to reduce memory requirements and increase computational efficiency.

Kirsty Brown
University of Strathclyde
[email protected]

Ian Zwaan
Department of Mathematics and Computer Science
Technical University Eindhoven
[email protected]

M.E. Hochstenbach
Eindhoven University of Technology
[email protected]

Igor Gejadze
IRSTEA-Montpellier
[email protected]

Alison Ramage
University of Strathclyde
[email protected]
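For reference, the multi-parameter Tikhonov problem mentioned in the abstract above is usually written as the penalized least-squares functional below; the multidirectional subspace expansion itself is not reproduced here, and the operators L_i stand for whatever general regularization operators the application supplies.

\min_{x}\; \|Ax - b\|_2^2 \;+\; \sum_{i=1}^{p} \lambda_i\, \|L_i x\|_2^2 ,

with p = 1 recovering standard single-parameter Tikhonov regularization.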

MS22 MS22 Asynchronous Optimized Schwarz Methods Fast Iterative Solvers for a Coupled Cahn- Hilliard/Navier-Stokes System Optimized Schwarz Methods are domain decomposition methods, where one imposes Robin conditions on the arti- We consider a diffuse interface model that governs the hy- ficial interfaces. The Robin parameter can be optimized to drodynamics of two-phase flows. Our research builds on obtain very fast convergence. We present an asynchronous the work of Garcke, Hinze and Kahle. Our contribution version of this method, where the problem in each sub- concerns the fully iterative solution of the arising large domain is solved using whatever boundary data is locally and sparse linear systems. This is based on precondition- available and with no synchronizations with other pro- ing techniques we have developed for Cahn–Hilliard varia- cesses. We prove convergence of the method, and illustrate tional inequalities. Block preconditioners using an effective its efficiency on large three-dimensional problems. (Joint Schur complement approximation are presented. Numeri- ´ cal results illustrate the efficiency of the approach. work with Fr´ed´eric Magoul`es and C´edric Venet, Ecole Cen- trale Paris). Jessica Bosch Max Planck Institute Magdeburg Daniel B. Szyld [email protected] Temple University Department of Mathematics Martin Stoll [email protected] Max Planck Institute, Magdeburg [email protected] MS23 Title Not Available at Time of Publication MS22 Preconditioning a Mass-Conserving DG Discretiza- Abstract not available at time of publication. tion of the Stokes Equations Xin He While there is a long history of work on conforming finite- Delft University of Technology, Netherland element discretizations of the Stokes and Navier-Stokes [email protected] equations, using H(div)-conforming elements for the ve- locity has recently emerged as a way to ensure strong en- MS23 forcement of the incompressibility condition at the element scale. In this talk, we explore the applicability of clas- Preconditioners for Linear Systems Arising From sic block preconditioning and monolithic multigrid meth- The Eddy Current Problem ods for Stokes-like saddle-point problems to one such dis- cretization, using BDM elements for the velocities. We consider the preconditioning techniques for the structued systems of linear equations arising from the fi- Scott Maclachlan nite element discretization of the eddy current problem. A Department of Mathematics class of positive definite and positive semidefinite splittings Tufts University is proposed to solve such linear systems. This kind of ma- [email protected] trix splitting, as well as its relaxed variants, can be used as the preconditoners for Krylov subspace methods. The per- formance of the preconditioners is illustrated by numerical MS22 examples. A Multilevel Preconditioner for Data Assimilation with 4D-Var Jianyu Pan East China Normal University, China Large-scale variational data assimilation problems are com- [email protected] mon in numerical weather prediction. The 4D-Var method is frequently used to calculate a forecast model trajectory Xin Guan 90 LA15 Abstracts

Department of mathematics, East China Normal Universit´edeRennes1 University [email protected] [email protected] Ahmad Karfoul Al-Baath University MS23 [email protected] An Alternating Positive Semidefinite Splitting Pre- conditioner for Saddle Point Problems from Time- Hanna Becker harmonic Eddy Current Models GIPSA-Lab CNRS [email protected] For the saddle point problem arising from the finite ele- ment discretization of the hybrid formulation of the time- harmonic eddy current problem, we propose an alternat- Remi Gribonval ing positive semidefinite splitting preconditioner which is Projet METISS, IRISA-INRIA based on two positive semidefinite splittings of the saddle 35042 Rennes cedex, France point matrix. It is proved that the corresponding alter- [email protected] nating positive semidefinite splitting iteration method is unconditionally convergent. We show that the new precon- Amar Kachenoura, Lofti Senhadji ditioner is much easier to implement than the block alter- Universite de Rennes 1 nating splitting implicit preconditioner proposed in [Z.-Z. [email protected], Bai, Numer. Linear Algebra Appl., 19 (2012), 914–936] [email protected] when they are used to accelerate the convergence rate of Krylov subspace methods such as GMRES. Numerical ex- Guillotel Philippe amples are given to show the effectiveness of our proposed Technicolor R&D France, France preconditioner. [email protected]

Zhiru Ren Isabelle Merlet Chinese Academy of Sciences Inserm & Universit´e de Rennes 1, France [email protected] [email protected]

Yang Cao School of Transportation MS24 Nantong University On the Uniqueness of Coupled Matrix Block Di- [email protected] agonalization in the Joint Analysis of Multiple Datasets

MS23 Uniqueness of matrix (or tensor) factorizations is neces- Title Not Available at Time of Publication sary in order to achieve interpretability and attribute phys- ical meaning to factors. In this work, we discuss new In this talk, we focus on solving a class of complex linear uniqueness results on a coupled matrix block diagonaliza- systems arising from distributed optimal control with time- tion model. This model is associated with the joint blind periodic parabolic equations. A block alternating splitting source separation of multiple datasets, where each datasets (BAS) iteration method and preconditioner are presented. is a linear mixture of statistically independent multidimen- The convergence theory of the BAS iteration and the spec- sional components. tral properties of the BAS preconditioned matrix are dis- cussed. Numerical experiments are presented to illustrate Dana Lahat, Christian Jutten the efficiency of the BAS iteration as a solver as well as a GIPSA-Lab preconditioner for Krylov subspace methods. [email protected], [email protected] Guo-Feng Zhang, Zhong Zheng Lanzhou University, China gf [email protected], [email protected] MS24 The Decomposition of Matrices

There are well-known matrix decompositions: LU-decomposition, SVD-decomposition and QR-decomposition. In this talk we will discuss new types of matrix decompositions. We will use algebraic geometry to show that every matrix can be decomposed into the product of finitely many Toeplitz or Hankel matrices. Our method can also recover the LU-decomposition and QR-decomposition for generic matrices. This is joint work with Lim Lek-Heng.

Ke Ye, Lek-Heng Lim
University of Chicago
[email protected], [email protected]

MS24
Localization and Reconstruction of Brain Sources Using a Constrained Tensor-based Approach

This talk addresses the localization and reconstruction of spatially distributed sources from ElectroEncephaloGraphic (EEG) signals. The occurrence of several amplitude modulated spikes originating from the same epileptic region is used to build a space-time-spike tensor from the EEG data. A Canonical Polyadic (CP) decomposition of this tensor is computed such that the column vectors of one loading matrix are linear combinations of a physics-driven dictionary. A performance study on realistic simulated EEG data of the proposed tensor-based approach is provided.

Laurent Albera

MS24
”Sequentially Drilled” Joint Congruence Decom-

positions (SeDJoCo): The Simple Case and the Francesco Silvestri Extended Case IT University of Copenhagen [email protected] We present the formulation and several associated prop- erties of a particular congruence transformations, which we term ”Sequentially Drilled” Joint Congruence (SeD- MS25 JoCo) decompositions and Extended SeDJoCo decompo- Characterizing the Communication Costs of Sparse sitions. In its basic form, SeDJoCo is defined as follows: Matrix Multiplication Given an ordered set of K (conjugate-)symmetric target- matrices of dimensions KxK, find a KxK transformation Parallel sparse matrix-matrix multiplication is a key com- matrix, such that when the k-th target-matrix is multi- putational kernel in scientific computing. Although its run- plied by the transformation matrix on the left, and by ning time is typically dominated by the cost of interpro- its (conjugate) transpose on the right, the resulting ma- cessor communication, little is known about these commu- trix is drilled on its k-th column (and k-th row), namely, nication costs. We propose a hypergraph model for this all elements in that column are zeros, except for the di- problem, in which vertices represent nonzero scalar mul- agonal ((k,k)-th) element. We show that this problem is tiplications, hyper-edges represent data, and the optimal closely related to Maximum Likelihood (ML) Independent vertex partition represents a communication-minimizing, Component Analysis (ICA) in some contexts, as well as to load-balanced distribution of work. We demonstrate the the problem of Coordinated Beamforming (CBF) in multi- potential of this model and similar coarser ones for deriv- user Multiple-Inputs Multiple-Outputs (MIMO) communi- ing communication lower bounds. cations. We show that a sufficient condition for existence of a solution to this problem is that all the target matri- Grey Ballard ces be positive definite. We also present some iterative Sandia National Laboratories solutions. We then proceed to describe the Extended SeD- [email protected] JoCo, in which, for some M, an ordered set of KM2 target matrices (each of dimensions KxK) is considered, and M Alex Druinsky transformation matrices are sought, satisfying a more com- Computational Research Division plicated set of congruence transformations. This extended Lawrence Berkeley National Laboratory version is closely related to ML Independent Vector Anal- [email protected] ysis (IVA), an emerging extension of ICA which considers M mixtures, with correlations between sources participat- Nicholas Knight ing in different mixtures. It is also related to CBF in the New York University context of multicast Multi-Users MIMO. [email protected]
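Written out as a formula, the drilling condition described in the SeDJoCo abstract above asks for a single K-by-K transformation B such that, for each target matrix C_k,

\bigl(B\, C_k\, B^{\mathsf H}\bigr) e_k \;=\; d_k\, e_k , \qquad k = 1,\dots,K ,

i.e. the k-th column (and, by symmetry, the k-th row) of B C_k B^H vanishes except for its diagonal entry d_k. This is only a restatement of the definition given in the abstract; as stated there, positive definiteness of all targets is sufficient for a solution to exist.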

Yao Cheng Oded Schwartz Ilmenau University of Technology Hebrew University of Jerusalem [email protected] [email protected]

Arie yeredor Tel-Aviv University MS25 [email protected] Data Movement Lower Bounds for Computational Directed Acyclic Graphs Martin Haardt Abstract not available at time of publication. Ilmenau University of Technology [email protected] Saday Sadayappan Ohio State University [email protected] MS25 Network Oblivious Algorithms MS25 Network obliviousness is a framework for design of algo- Using Symmetry to Schedule Algorithms rithms that can run efficiently on machines with different Presented with a machine with a new interconnect topol- parallelism and communication capabilities. Algorithms ogy, designers use intuition about the geometry of the algo- are specified only in terms of the input size and evaluated rithm and the machine to design time and communication- on a model with two parameters, capturing parallelism and efficient schedules that map the algorithm onto the ma- communication latency. Under broad conditions, optimal- chine. Is there a systematic procedure for designing such ity in the latter model implies optimality in Decomposable schedules? We explore a new algebraic approach with a BSP. Optimal network-oblivious algorithms are given for a view towards automating schedule design. Specifically, we few key problems. Limitations of non-oblivious broadcast describe how the ”symmetry” of matrix multiplication can are established. be used to derive schedules for different network topologies. Gianfranco Bilardi, Andrea Pietracaprina, Geppino Pucci University of Padova Harsha Vardhan Simhadri [email protected], [email protected], Lawrence Berkeley National Laboratory [email protected] [email protected]

Michele Scquizzato University of Houston MS26 [email protected] O(N) Density Functional Theory Calculations: The 92 LA15 Abstracts

Challenge of Going Sparse

To reduce computational complexity in Density Functional Theory calculations with wave functions, a special representation of the solution is required. Sparse solutions, which retain the accuracy of traditional approaches, are possible but present many challenges. Nonetheless, they also offer many opportunities to solve much larger and more realistic problems in materials science, and to scale on much larger high performance computing platforms. In this talk, we will discuss our recent developments using real-space discretizations. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

Jean-Luc Fattebert
Lawrence Livermore National Lab.
[email protected]

Daniel Osei-Kuffuor
Lawrence Livermore National Laboratory
oseikuff[email protected]

MS26
Randomized Estimation of Spectral Densities of Large Matrices Made Accurate

In physics, it is sometimes desirable to compute the spectral density of a Hermitian matrix A. In many cases, the only affordable operation is matrix-vector multiplication, and stochastic methods can be used to estimate the DOS. The accuracy of stochastic algorithms converges as O(1/√Nv), where Nv is the number of stochastic vectors. I will introduce a new method called spectrum sweeping, which is a stochastic method whose convergence rate can be much faster than O(1/√Nv) for a modest choice of Nv.

large computational cost associated with these simulations has severely restricted the size of systems that can be routinely studied. In this talk, previous and current efforts of the speaker to develop efficient real-space formulations and parallel implementations for DFT will be discussed. These include linear scaling methods applicable to both insulating and metallic systems.

Phanish Suryanarayana
Georgia Institute of Technology
[email protected]

MS27
Arrangements of Equal Minors in the Positive Grassmannian

We investigate the structure of equalities and inequalities between the minors of totally positive matrices. We show that collections of equal minors of largest value are in bijection with sorted sets, which earlier appeared in the context of alcoved polytopes. We also prove in many cases that collections of equal minors of smallest value are exactly weakly separated sets. Such sets are closely related to the positive Grassmannian and the associated cluster algebra.

Miriam Farber
MIT
[email protected]

Alexander Postnikov
Massachusetts Institute of Technology
[email protected]

MS27
Preserving Positivity for Matrices with Sparsity Constraints

Lin Lin Functions preserving positivity have been investigated by University of California, Berkeley analysts and linear algebraists for much of the last few Lawrence Berkeley National Laboratory decades. It is well known from the work of Rudin and [email protected] Schoenberg that functions (applied entrywise) which are analytic and absolutely monontonic preserve positivity for all positive definite matrices of any size. The emergence of MS26 data science has raised new questions in the field of posi- Challenges and Opportunities for Solving Large- tivity. In this talk we investigate maps which preserve pos- scale Eigenvalue Problems in Electronic Structure itivity when various sparsity, rank and other constraints Calculations are imposed. As a consequence, we extend the results of Rudin and Schoenberg with modern motivations in mind. Realistic first-principle quantum simulations applied to (Joint with B. Rajaratnam and A. Khare) large-scale atomistic systems, pose new challenges in the design of numerical algorithms that are both capable of Bala Rajaratnam processing a considerable amount of generated data, and Stanford University achieving significant parallel scalability on modern high [email protected] end computing architectures. In this presentation, we ex- plore variations of the FEAST eigensolver that can consid- Dominique Guillot erably broaden the perspectives for addressing such diffi- University of Delaware culties. [email protected] Eric Polizzi University of Massachusetts, Amherst, USA Apoorva Khare [email protected] Stanford University [email protected]
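For context on the "Randomized Estimation of Spectral Densities" abstract above: the O(1/√Nv) baseline it refers to is the plain Monte Carlo rate of stochastic trace estimation, sketched below for the trace itself (spectral density estimators apply the same idea to functions of A). The function name and problem sizes are illustrative.

import numpy as np

def hutchinson_trace(A, num_vectors=100, rng=None):
    """Plain Hutchinson estimator for trace(A) with Rademacher probes.

    The sample average converges at the Monte Carlo rate O(1/sqrt(num_vectors)),
    which is the baseline that spectrum sweeping aims to beat.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    estimates = []
    for _ in range(num_vectors):
        v = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        estimates.append(v @ (A @ v))          # only needs matrix-vector products
    return np.mean(estimates)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B = rng.standard_normal((500, 500))
    A = (B + B.T) / 2
    print(hutchinson_trace(A, num_vectors=200), np.trace(A))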

MS26 Towards Large and Fast Density Functional Theory MS27 Calculations Critical Exponents and Positivity Electronic structure calculations based on Density Func- We classify the powers that preserve positivity, when ap- tional Theory (DFT) have been remarkably successful in plied entrywise to positive semidefinite (psd) matrices with describing material properties and behavior. However, the rank and sparsity constraints. This is part of a broad pro- LA15 Abstracts 93

gram to study positive entrywise functions on distinguished [email protected] submanifolds of the cone. In our first main result, we com- pletely classify the powers preserving Loewner properties on psd matrices with fixed dimension and rank. This in- MS28 cludes the case where the matrices have negative entries. Solving Graph Laplacians for Complex Networks Our second main result characterizes powers preserving positivity on matrices with zeros according to a chordal The graph Laplacian is often used in network analysis. graph. We show how preserving positivity relates to the Therefore, solving linear systems and eigensystems for geometry of the graph, thus providing interesting connec- graph Laplacians is of great practical interest. For large tions between combinatorics and analysis. The work has problems, iterative solvers such as preconditioned conju- applications in regularizing covariance/correlation matri- gate gradients are needed. We show that Jacobi and Gauss- ces, where entrywise powers are used to separate signal Seidel preconditioning are remarkably effective for power- from noise, while minimally modifying the entries of the law graphs. Support-graph (combinatorial) precondition- original matrix. (Based on joint work with D. Guillot and ers can reduce the number of iterations further but may B. Rajaratnam.) not be faster in practice. Finally, we discuss and show pre- liminary results for a recent theoretically almost-optimal algorithm by Kelner et al. Apoorva Khare Stanford University Erik G. Boman [email protected] Center for Computing Research Sandia National Labs Dominique Guillot [email protected] University of Delaware [email protected] Kevin Deweese UC Santa Barbara Bala Rajaratnam [email protected] Stanford University [email protected] MS28 On Growth and Form of Networks

A new spectral scaling method for detecting topological irregularities in networks is introduced. It is based on the communicability function and the principal eigenvector of the adjacency matrix of a network. The relation between the spatial efficiency of a network, measured by means of the communicability angles, and the topological heterogeneities is found. We then analyse the growth of Erdős–Rényi random graphs, networks growing with modularity, fractal networks, and networks growing under spatial constraints.

Ernesto Estrada
University of Strathclyde
[email protected]

MS27
The Jordan Form of An Irreducible Eventually Nonnegative Matrix

A square complex matrix A is eventually nonnegative if there exists a positive integer k_0 such that for all k ≥ k_0, A^k ≥ 0. We say a matrix is strongly eventually nonnegative if it is eventually nonnegative and has an irreducible nonnegative power. We present some results about the Jordan forms of strongly eventually nonnegative matrices and the Jordan forms of irreducible, eventually nonnegative matrices.

Ulrica Wilson Morehouse College MS28 ICERM Using the Heat Kernel of a Graph for Local Algo- [email protected] rithms

The heat kernel is a fundamental tool in extracting local in- formation about large graphs. In fact, the heat kernel can MS28 be restricted to a few entries which are enough to summa- rize local structure. In this talk, I will present an efficient Dynamic Network Centrality with Edge Multi- algorithm for approximating the heat kernel of a graph. I damping will also present two local algorithms using the approxima- tion procedure; a local cluster detection algorithm, and a Dynamic centrality measures provide a numerical tool for local Laplacian linear solver. the identification of key players in a network whose edges are evolving over time. In this talk we consider a Katz-type Olivia Simpson centrality measure modified to accommodate the dynami- University of California, San Diego cal aspect of the network. We study the significance of an [email protected] edge attenuation parameter to the ordinal rankings at a given time. We introduce a concept of edge multidamping MS29 allowing for a time-dependent variant of the standard Katz parameter. Constraint Preconditioning for the Coupled Stokes-Darcy System Mary Aprahamian We propose the use of constraint preconditioned GMRES School of Mathematics for the solution of the linear system arising from the finite The University of Manchester element discretization of coupled Stokes-Darcy flow. We 94 LA15 Abstracts

provide spectral and field-of-values bounds for the preconditioned system which are independent of the underlying mesh size. We present several numerical experiments in 3D that illustrate the effectiveness of our approach.

Scott Ladenheim, Daniel B. Szyld
Temple University
[email protected], [email protected]

Prince Chidyagwai
Loyola University Maryland
Department of Mathematics and Statistics
[email protected]

MS29
Krylov Methods in Adaptive Finite Element Computations

For decades, modern technological applications have led to challenging PDE eigenvalue problems, e.g., vibrations of structures, modeling of photonic gap materials, analysis of hydrodynamic stability, or calculations of energy levels in quantum mechanics. Recently, a lot of research has been devoted to the so-called Adaptive Finite Element Methods (AFEM). In most AFEM approaches the underlying algebraic problems, i.e., linear systems or eigenvalue problems, are of large dimension, which makes Krylov subspace methods particularly important. The goal of this work is to analyze the influence of the accuracy of the algebraic approximation, e.g., of an inexact Krylov method, on the adaptivity process. Moreover, we discuss how to reduce the computational effort of the iterative solver by adapting the size of the Krylov subspace. Based on perturbation results in the H^1(Ω)- and H^{-1}(Ω)-norms derived in [1] and a standard residual a posteriori error estimator, a balancing AFEM algorithm is proposed. [1] U. L. Hetmaniuk and R. B. Lehoucq. Uniform accuracy of eigenpairs from a shift-invert Lanczos method. SIAM J. Matrix Anal. Appl., 28:927–948, 2006.

Agnieszka Miedlar
Technische Universität Berlin
[email protected]

MS29
Near-Krylov for Eigenvalues and Linear Equations Including Multigrid and Rank-1

iterative eigenvalue solvers are not reliable for computing this eigenvalue. A recently developed and more robust method, Lyapunov inverse iteration, involves solving large-scale Lyapunov equations, which in turn requires the solution of large, sparse linear systems analogous to those arising from solving the underlying partial differential equations. This study explores the efficient implementation of Lyapunov inverse iteration when it is used for linear stability analysis of incompressible flows. Efficiencies are obtained from effective solution strategies for the Lyapunov equations and for the underlying partial differential equations. Solution strategies based on effective preconditioning methods and on recycling Krylov subspace methods are tested and compared, and a modified version of a Lyapunov solver is proposed that achieves significant savings in computational cost.

Minghao W. Rostami
Worcester Polytechnic Institute
[email protected]

Howard C. Elman
University of Maryland, College Park
[email protected]

MS30
Squish Scalings

Given non-negative matrices A, L, U ∈ R^{n×n} we show how to compute a scalar α ∈ R and a diagonal matrix D ∈ R^{n×n} such that

L/β ≤ α D^{-1} A D ≤ U β,

for the least possible β ∈ R. The entries in A are 'pushed down' by the upper bound U and 'pushed up' by the lower bound L. Hence the name squish scaling. Squish scalings are computed using max-plus algebra and have applications in graph theory and eigenvalue computation.

James Hook
School of Mathematics
University of Manchester
[email protected]

MS30
Using Matrix Scaling to Identify Block Structure

We can apply a two-sided diagonal scaling to a nonnegative matrix to render it into doubly stochastic form if and only if the matrix is fully indecomposable.
The scaling often We look at several eigenvalue and linear equations com- reveals key structural properties of the matrix as the effects putations that involve near-Krylov subspaces. Included of element size and connectivity are balanced. Exploiting are two-grid methods that are more robust than standard key spectral properties of doubly stochastic matrices, we multigrid and methods for Rank-1 updates of large matri- will show how to use the scaling to reveal hidden block ces. Recycling methods for changing matrices are also con- structure in matrices without any prior knowledge. sidered. We analyze some of the properties of near-Krylov subspaces that are used by these methods. Philip Knight University of Strathclyde, Glasgow, Scotland Ron Morgan [email protected] Baylor University Ronald [email protected] MS30 Hungarian Scaling of Polynomial Eigenproblems MS29 Efficient Iterative Algorithms for Linear Stability We study the behaviour of the eigenvalues of a paramet- Analysis of Incompressible Flows ric matrix polynomial P in a neighbourhood of zero. If we suppose that the entries of P have Puiseux series ex- Linear stability analysis of a dynamical system entails find- pansions we can build an auxiliary matrix polynomial Q ing the rightmost eigenvalue for a series of eigenvalue prob- whose entries are the leading exponents of those of P. We lems. For large-scale systems, it is known that conventional show that preconditioning P via a diagonal scaling based LA15 Abstracts 95

on the tropical eigenvalues of Q can improve conditioning pact representation of the quasi-Newton matrices. In this and backward error of the eigenvalues. talk, we present a compact formulation for the entire Broy- den convex class of updates for limited-memory quasi- Marianne Akian, Stephane Gaubert Newton methods and how they can be used to solve large- INRIA and CMAP, Ecole Polytechnique scale trust-region subproblems with quasi-Newton Hessian [email protected], [email protected] approximations.

Andrea Marchesini Roummel F. Marcia INRIA University of California, Merced [email protected] [email protected]
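As background for the compact-representation abstract above, one commonly cited instance is the SR1 member of the Broyden class. Assuming the usual notation S_k = [s_0, ..., s_{k-1}], Y_k = [y_0, ..., y_{k-1}] and the splitting S_k^T Y_k = L_k + D_k + U_k into strictly lower, diagonal and strictly upper parts, one standard form from the compact-representation literature is

B_k \;=\; B_0 + (Y_k - B_0 S_k)\,\bigl(D_k + L_k + L_k^{\mathsf T} - S_k^{\mathsf T} B_0 S_k\bigr)^{-1}(Y_k - B_0 S_k)^{\mathsf T} .

The talk's formulation for the entire convex class is more general than this single update.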

Francoise Tisseur MS31 The University of Manchester [email protected] Obtaining Singular Value Decompositions in a Dis- tributed Environment

MS30 Matrix decompositions like the singular value decomposi- tions are important analysis tools for processing and an- Max-Balancing Hungarian Scalings alyzing large amounts data. It is thus important to de- A Hungarian scaling is a two-sided diagonal scaling of a velop stable robust approaches to efficiently processing matrix, which is typically applied along with a permuta- large datasets in a distributed environment. We present re- tion to sparse indefinite nonsymmetric linear system before sults for processing both sparse and dense data for big data calling a direct or iterative solver. A Hungarian scaled and problems. We further explore the efficiency of some promis- reordered matrix has all its entries of modulus less than ing approximate optimization-based approaches that pro- or equal to one and entries of modulus one on the diago- vide more flexibility in customizing the canonical problem nal. We use max-plus algebra to characterize the set of all formulation. Hungarian scalings for a given matrix and show that max- Ben-Hao Wang,JoshuaGriffin balancing a Hungarian scaled matrix yields the most diago- SAS Institute, Inc. nally dominant Hungarian scaled matrix possible. We also [email protected], joshua.griffi[email protected] propose an approximate max-balancing Hungarian scaling whose computation is embarrassingly parallel. MS31 Francoise Tisseur The University of Manchester Linear Algebra Software in Nonlinear Optimization School of Mathematics We discuss some practical issues associated with sequen- [email protected] tial quadratic programming (SQP) methods for large-scale nonlinear optimization. In particular, we focus on SQP James Hook methods that can exploit advances in linear algebra soft- School of Mathematics ware. Numerical results are presented for the latest version University of Manchester of the software package SNOPT when utilizing third-party [email protected] linear algebra software packages.
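A compact way to state the Hungarian scaling used in the Hungarian-scaling abstracts above: take an optimal assignment σ for max_σ Σ_i log|a_{i,σ(i)}| together with dual variables u, v satisfying u_i + v_j ≥ log|a_{ij}| (with equality on the assignment), and set D_1 = diag(e^{-u_i}), D_2 = diag(e^{-v_j}). Then

\bigl| (D_1 A D_2)_{ij} \bigr| \;=\; e^{-u_i}\, |a_{ij}|\, e^{-v_j} \;\le\; 1 ,

with equality for j = σ(i); permuting σ to the diagonal gives a matrix with all entries of modulus at most one and unit-modulus diagonal, as described above. This is the standard assignment-problem formulation; the max-balancing refinement of the talk singles out one particular scaling from this set.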

Jennifer Pestana Elizabeth Wong School of Mathematics University of California, San Diego The University of Manchester Department of Mathematics [email protected] [email protected]

MS31 MS32 A High-Accuracy SR1 Trust-Region Subproblem Low-Rank Cross Approximation Methods for Re- Solver for Large-Scale Optimization duction of Parametric Equations

In this talk we present an SR1 trust-region subproblem The low-rank decomposition of matrices and tensors is a solver for large-scale optimization. This work makes use known data compression technique. One of the problems of the compact representation of an SR1 matrix and a QR is to build a low-rank representation, given a procedure to factorization to obtain the eigenvalues of the SR1 matrix. evaluate a particular element of a tensor. The cross ap- Optimality conditions are used to find a high accuracy solu- proximation algorithms select a few entries to construct tion to each subproblem even in the so-called ”hard case”. the low-rank factorization. However, these methods as- Numerical results will be presented. sume that each fixed element of a tensor is a single number, and that different elements can be evaluated independently Jennifer Erway on each other. For stochastic PDEs this is not the case: Wake Forest University each parameter point produces a vector of the discrete so- [email protected] lution of the PDE. This would hinder the performance of the cross algorithms, since a large portion of data is likely to be wasted. We extend the TT cross algorithm to ap- MS31 proximate many tensors simultaneously in the same low- Compact Representation of Quasi-Newton Matri- rank format. Besides, for a linear SPDE we can use the ces computed low-rank factors as model reduction bases, and increase the efficiency further. Very large systems of linear equations arising from quasi- Newton methods can be solved efficiently using the com- Sergey Dolgov 96 LA15 Abstracts

MPI - Magdeburg
Germany
[email protected]

MS32
Hierarchical Low Rank Approximation for Extreme Scale Uncertainty Quantification

We consider the problem of uncertainty quantification for extreme scale parameter dependent problems where an underlying low rank property of the parameter dependency is assumed. For this type of dependency the hierarchical Tucker format offers a suitable framework to approximate a given output function of the solutions of the parameter dependent problem from a number of samples that is linear in the number of parameters. In particular we can a posteriori compute the mean, variance or other interesting statistical quantities of interest. In the extreme scale setting it is already assumed that the underlying fixed-parameter problem is distributed and solved for in parallel. We provide in addition a parallel evaluation scheme for the sampling phase that allows us on the one hand to combine several solves and on the other hand parallelise the sampling.

tic (Galerkin) nonlinear iteration and we demonstrate its effectiveness in a series of numerical experiments.

Bedrich Sousedik
University of Maryland, Baltimore County
[email protected]

Howard C. Elman
University of Maryland, College Park
[email protected]

MS33
Global and Local Methods for Solving the Gravitational Lens Equation

The effect of gravitational microlensing can be modeled by the lens equation R(z) − z̄, where R(z) is a complex rational function. The zeros of this function correspond to images of a distant light source, created by the action of gravity on light. We discuss the numerical solution of the lens equation, including a reformulation as a rootfinding problem of a polyanalytic polynomial, and discuss local optimization methods in the complex variables z and z̄.

Lars Grasedyck Robert Luce RWTH Aachen TU Berlin [email protected] [email protected]

Jonas Ballani MS33 EPF Lausanne jonas.ballani@epfl.ch Fast and Stable Computation of the Roots of Poly- nomials

Christian Loebbert A stable algorithm to compute the roots of polynomials in RWTH Aachen the monomial basis is presented. Implicit QR steps are ex- [email protected] ecuted on a suitable representation of the iterates, which remain unitary plus low rank. The computational com- plexity is reduced to a minimum by exploiting the redun- MS32 dancy present in the representation allowing to ignore the Interpolation of Inverse Operators for Precon- low rank part. By ignoring the low rank part the remain- ditioning Parameter-Dependent Equations and ing computations are solely based on rotations and thus Projection-Based Model Reduction Methods unitary.
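For orientation, the unstructured baseline that the structured QR algorithm above accelerates is eigenvalue computation on the dense companion matrix, as in the sketch below (essentially what numpy.roots does); the function name and the example polynomial are illustrative.

import numpy as np

def roots_via_companion(coeffs):
    """Baseline companion-matrix rootfinder for a monic polynomial.

    coeffs = [c_0, c_1, ..., c_{n-1}] are the coefficients of
    p(z) = z^n + c_{n-1} z^{n-1} + ... + c_0.  The structured QR method in the
    abstract exploits the unitary-plus-low-rank structure of this matrix; here
    a dense eigenvalue solver is called for illustration only.
    """
    c = np.asarray(coeffs, dtype=complex)
    n = len(c)
    C = np.zeros((n, n), dtype=complex)
    C[1:, :-1] = np.eye(n - 1)   # subdiagonal of ones
    C[:, -1] = -c                # last column holds the negated coefficients
    return np.linalg.eigvals(C)

# p(z) = z^3 - 6z^2 + 11z - 6 has roots 1, 2, 3.
print(np.sort(roots_via_companion([-6.0, 11.0, -6.0])))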

We present a method for the construction of precondition- Thomas Mach ers for large systems of parameter-dependent equations, TU Chemnitz where an interpolation of the matrix inverse is computed Germany through a projection of the identity with respect to random [email protected] approximations of the Frobenius norm. Adaptive interpo- lation strategies are then proposed for different objectives Jared Aurentz in the context of projection-based model order reduction University of Oxford methods: error estimation, projection on a given approxi- Mathematical Institute mation space, or recycling of computations. [email protected] Anthony Nouy Ecole Centrale Nantes, GeM Raf Vandebril [email protected] Katholieke Universiteit Leuven Department of Computer Science [email protected] MS32 Stochastic Galerkin Method for the Steady State David S. Watkins Navier-Stokes Equations Washington State University Department of Mathematics We study steady state Navier-Stokes equations in the con- [email protected] text of stochastic finite element discretizations. We assume that the viscosity is given in terms of a generalized polyno- mial chaos (gPC) expansion. We formulate the model and MS33 linearization schemes using Picard and Newton iterations Stable Polefinding-Rootfinding and Rational Least- in the framework of the stochastic Galerkin method, and Squares Fitting Via Eigenvalues compare the results with that of stochastic collocation and Monte Carlo methods. We also propose a preconditioner One way of finding the poles/roots of a meromorphic func- for systems of equations solved in each step of the stochas- tion is rational interpolatation, followed by rootfinging for LA15 Abstracts 97

the denominator. This is a two-step process and the type of Yale University the rational interpolant is required as input. Moreover, the [email protected] numerical stability has remained largely uninvestigated. This work introduces an algorithm with the features: (i) Ke Wang automatic detection of the numerical type (ii) pole/root- Institute for Mathematics and its Applications finding via a generalized eigenvalue problem in a one-step University of Minnesota fashion, and (iii) backward stability, defined appropriately. [email protected]

Shinji Ito MS34 University of Tokyo The Stochastic Block Model and Communities in shinji [email protected] Sparse Random Graphs: Detection at Optimal Rate Yuji Nakatsukasa Department of Mathematics We consider the problem of communities detection in University of Manchester sparse random graphs. Our model is the so-called Stochas- [email protected] tic Block Model, which has been immensely popular in the recent statistics literature. Let X1, ...Xk be vertex sets of sizeneach(wherek is fixed and n is large). One draws MS33 random edges inside each Xi with probability p/n,andbe- Resultant Methods for Multidimensional Rootfind- tween Xi and Xj with probability q/n,forsomeconstants ing Are Exponentially Unstable p and q. Given one instance of this sparse random graph, our goal is to recover the sets Xi as correctly as possible. Hidden-variable resultant methods are a class of algorithms We are going to present a fast spectral algorithm which for the solution of systems of polynomial equations. When does this job at the optimal rate (namely the relation be- the number of variable is 2, when care is taken, they can be tween p, q and the number of mistakes in the recovery is competitive. Yet, in higher dimensions they are known to optimal). Our algorithm is based on spectral properties of miss solutions, calculate roots to low precision, and intro- random sparse matrices and is easy to implement. We will duce spurious ghost solutions. We show that the popular also discuss several related works and algorithms and an hidden-variable resultant method based on the Cayley re- open question concerning perturbation of random matri- sultant is inherently and spectacularly numerically unsta- ces. ble, by a factor that grows exponentially with the dimen- sion. We also show the ill-conditioning of the Sylvester Anup Rao resultant for solving systems of two bivariate polynomial Yale equations. This provides a rigorous explanation of the nu- [email protected] merical difficulties that practitioners are frequently report- ing. MS34 Vanni Noferini Robust Tensor Decomposition and Independent University of Manchester Component Analysis [email protected] ICA is a classical model of unsupervised learning originat- Alex Townsend ing and widely used in signal processing. In this model, Department of Mathematics observations in n-space are obtained by an unknown linear MIT transformation of a signal with independent components. [email protected] The goal is to learn the the transformation and thereby the hidden basis of the independent components. In this talk, we’ll begin with a polynomial-time algorithm called MS34 Fourier PCA, for underdetermined ICA, the setting where Singular Values and Vectors under Random Per- the observed signal is in a lower dimensional space than the turbation original signal, assuming a mild, verifiable condition on the transformation. The algorithm is based on a robust tensor Computing the singular values and singular vectors of a decomposition method applied to a higher-order derivative large matrix is a basic task in high dimensional data anal- tensor of the Fourier transform and its analysis uses the ysis with many applications in computer science and statis- anticoncentration of Gaussian polynomials. The running tics. In practice, however, data is often perturbed by noise. time and sample complexity are polynomial, but of high A natural question is the following. 
How much does a small constant degree, for any constant error; a MATLAB im- perturbation to the matrix change the singular values and plementation of the algorithm appears to need 104 samples vectors? Classical (deterministic) theorems, such as those for 10-dimensional observations. We then present a more by Davis-Kahan, Wedin, and Weyl, give tight estimates efficient recursive variant for the fully determined case of for the worst-case scenario. In this talk, I will consider standard ICA, which needs only a nearly linear number of the case when the perturbation is random. In this setting, samples and has polynomial time complexity. Its analysis better estimates can be achieved when our matrix has low is based on the gaps between the roots of polynomials of rank. This talk is based on joint work with Van Vu and Gaussian random variables. Ke Wang. Santosh Vempala Sean O’Rourke Georgia Institute of Technology University of Colorado Boulder [email protected] [email protected] Navin Goyal Van Vu Microsoft Research 98 LA15 Abstracts

[email protected] [email protected]
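A toy illustration of the spectral mechanism behind the stochastic block model abstract above, assuming a simple planted two-block instance and using the sign pattern of the eigenvector for the second largest adjacency eigenvalue; the optimal-rate algorithm of the talk needs additional steps in the very sparse p/n, q/n regime, so this is only meant to convey the basic idea.

import numpy as np

def sbm_two_blocks(n, p, q, rng):
    """Sample a symmetric adjacency matrix from a planted two-block model."""
    labels = np.array([0] * n + [1] * n)
    probs = np.where(labels[:, None] == labels[None, :], p, q)
    A = (rng.random((2 * n, 2 * n)) < probs).astype(float)
    A = np.triu(A, 1)
    return A + A.T, labels

rng = np.random.default_rng(1)
A, truth = sbm_two_blocks(500, 0.05, 0.01, rng)

vals, vecs = np.linalg.eigh(A)          # ascending eigenvalues
guess = (vecs[:, -2] > 0).astype(int)   # sign split on the second eigenvector
err = min(np.mean(guess != truth), np.mean((1 - guess) != truth))
print("fraction misclassified:", err)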

Ying Xiao Palantir MS35 [email protected] Lanczos Bidiagonalization with Subspace Auge- mentation for Discrete Inverse Problems

MS34 The regularizing properties of Lanczos bidiagonalization were studied and advocated by many researchers, includ- Applications of Matrix Perturbation Bounds with ing Dianne O’Leary. This approach is powerful when the Random Noise in Matrix Recovery Problems underlying Krylov subspace captures the dominating com- ponents of the solution. In some applications the regular- The general matrix recovery problem is the following: A is ized solution can be further improved by augmenting the a large unknown matrix. We can only observe its noisy ver- Krylov subspace with a low-dimensional subspace that rep- sion A+E, or in some cases just a small part of it. We would resents specific prior information. Inspired by earlier work like to reconstruct A or estimate an important parameter on GMRES we demonstrate how to carry these ideas over as accurately as possible from the observation. In a recent to the Lanczos bidiagonalization algorithm. work, we study how much the singular value or singular vector of data matrix will change under a small perturba- Per Christian Hansen tion. We show that better estimates can be achieved if the Technical University of Denmark data matrix is low rank and the perturbation is random, Informatics and Mathematical Modelling improving the classical perturbation results. In this talk, [email protected] I would like to discuss some applications of our results in the matrix recovery problems. This is joint work with Sean O’Rourke and Van Vu. MS35 A Flexible Regression Model for Count Data Sean O’Rourke University of Colorado Boulder Poisson regression is a popular tool for modeling count [email protected] data and is applied in a vast array of applications. Real data, however, are often over- or under-dispersed and, thus, Van Vu not conducive to Poisson regression. We instead propose Yale University a regression model based on the ConwayMaxwell-Poisson [email protected] (COM-Poisson) distribution to address this problem. The COM-Poisson regression generalizes the well-known Pois- son and logistic regressions, and is suitable for fitting count Ke Wang data with a wide range of dispersion levels. Institute for Mathematics and its Applications University of Minnesota Kimberly F. Sellers [email protected] Mathematics and Statistics, Georgetown University [email protected]

MS35 A Hybrid LSMR Algorithm for Large-Scale MS35 Tikhonov Regularization Solving Eigenvalue and Linear System Problems on GPU for Three-Dimentional Photonic Device Sim- We develop a hybrid iterative approach for computing solu- ulations tions to large-scale ill-posed inverse problems via Tikhonov regularization. We consider a hybrid LSMR algorithm, Solutions of discretized Maxwell’s equations are essential where Tikhonov regularization is applied to the LSMR sub- to three-dimensional photonic device simulations. For full problem rather than the original problem. We show that, bandgap diagrams of photonic crystals, we demonstrate contrary to standard hybrid methods, hybrid LSMR iter- how the wanted interior eigenvalues can be computed in ates are not equivalent to LSMR iterates on the directly a ultrafast manner by the nullspace free algorithm on a regularized Tikhonov problem. Instead, hybrid LSMR multiple-GPU cluster. For linear systems in frequency- leads to a different Krylov subspace problem. We show domain simulation, we explore the homogeneous and pe- that hybrid LSMR shares many of the benefits of standard riodic structures to develop a domain decomposition type hybrid methods such as avoiding semiconvergence behav- algorithm. The algorithm significantly reduces the memory ior. In addition, since the regularization parameter can be usage with satisfactory scalability and acceleration perfor- estimated during the iterative process, it does not need to mance. be estimated a priori, making this approach attractive for Weichung Wang large-scale problems. We consider various methods for se- Institute of Applied Mathematical Sciences lecting regularization parameters and discuss stopping cri- National Taiwan University teria for hybrid LSMR, and we present results from image [email protected] processing.

Julianne Chung MS36 Virginia Tech Rotated Block Two-by-Two Preconditioners Based [email protected] on Pmhss

Katrina Palmer Motivated by the HSS iteration method, to solve the Department of Mathematical Sciences distributed control problem we have proposed modified Appalachian State University HSS (MHSS) as well as preconditioned and modified HSS LA15 Abstracts 99

(PMHSS) iteration methods. Theoretical analyses and nu- Optimization merical experiments have shown the effectiveness of these iteration methods when used either as linear solvers or We discuss a hierarchy of inexact full-space methods for as matrix preconditioners for Krylov subspace methods. large-scale optimization known as matrix-free sequential Moreover, we have developed the PMHSS iteration method quadratic programming (SQP) methods. The hierarchy is to block two-by-two matrices of non-Hermitian sub-blocks, based on the increasing complexity of linear systems and, resulting in the additive-type and the rotated block tri- consequently, the increasing flexibility in the design of ef- angular preconditioners. Spectral analyses and numerical ficient linear solvers and preconditioners. We examine the computations have shown that these preconditioners can solver performance on large-scale optimization problems significantly accelerate the convergence rates of the Krylov with linear and nonlinear PDE constraints, related to the subspace iteration methods when they are used to solve optimal design of acoustic fields and optimal control of ra- the block two-by-two linear systems, and the inexact vari- diative heat transfer. ants these preconditioners are as effective and robust as the corresponding exact ones. Denis Ridzal Sandia National Laboratories Zhong-Zhi Bai [email protected] Institute of Computational Mathematics Chinese Academy of Sciences, Beijing, China [email protected] MS37 Eigenvalues and Beyond

MS36 Spectral graph partitioning and graph-based signal pro- Optimal Order Multigrid Preconditioners for Lin- cessing are mathematically based on the spectral decom- ear Systems Arising in the Semismooth Newton position of the graph Laplacian. Computing the full eigen- Method Solution Process of a Class of Control- value decomposition is impractical, since the graph Lapla- Constrained Problems cian can be of an enormous size. Thus, iterative approaches are of great importance such as, e.g., approximate low- We introduce multigrid preconditioners for the semismooth pass filters resulted from Chebyshev and Krylov based it- Newton solution process of certain optimal control prob- erations. We present an overview of some of our recently lems with control-constraints. Each semismooth Newton published results in this area. iteration essentially requires inverting a principal subma- trix of the matrix entering the normal equations of the Andrew Knyazev associated unconstrained optimal control problem. Using Mitsubishi Electric Research Laboratories a piecewise constant discretization of the control space and CU-Denver non-conforming coarse spaces, we show that, under reason- [email protected] able geometric assumptions, the preconditioner approxi- mates the inverse of the desired matrix to optimal order. MS37 Implementation of LOBPCG for Symmetric Eigen- Andrei Draganescu value Problems in SLEPc Department of Mathematics and Statistics University of Maryland, Baltimore County SLEPc is a parallel library that provides a collection [email protected] of eigenvalue solvers such as Krylov-Schur and Jacobi- Davidson. We have recently added an implementation of LOBPCG, the Locally Optimal Block Preconditioned Jyoti Saraswat Conjugate Gradient method [Knyazev, SIAM J. Sci. Com- Thomas More College put. 23 (2001)]. In this talk we discuss several imple- [email protected] mentation details, such as locking, block-orthogonalization, and multi-vector operations, and compare the performance with the solver provided in BLOPEX (which is also inter- MS36 facedinSLEPc). The Iterative Solution of Systems from Pde Con- strained Optimization: An Overview Jose E. Roman Universidad Politecnica de Valencia In this, the introductory talk of the minisymposium ”Lin- [email protected] ear Algebra for PDE-constrained optimization”, we set the scene by introducing the PDE-constrained optimization problem and describe a few of the ways such problems can MS37 be discretized, give the resulting linear systems. Solving Spectrum Slicing by Polynomial and Rational this linear system is typically the bottleneck in codes that Function Filtering solve such problems, and fast and efficient iterative meth- ods to do this are the topic of this session. We describe a Filtering techniques are presented for solving eigenvalue number of the common methods to solve such systems, and large Hermitian problems by spectrum slicing. Polynomial set the scene for the rest of the talks in the minisymposium. filtering techniques can be quite efficient in the situation where the matrix-vector product operation is inexpensive Tyrone Rees and when a large number of eigenvalues is sought, as is Rutherford Appleton Lab, UK the case in electronic structure calculations for example. [email protected] Rational filtering techniques will also be described with a focus on showing the advantages of such filters obtained by a least-squares approach. MS36 Inexact Full-space Methods for PDE-constrained Yousef Saad 100 LA15 Abstracts

Department of Computer Science
University of Minnesota
[email protected]

MS37
Techniques for Computing a Large Number of Eigenpairs of Sparse, Hermitian Matrices

Computing a large number, m, of extreme or interior eigenvalues of sparse Hermitian matrices remains an outstanding problem in many applications, including materials science and lattice QCD. Current state-of-the-art methods are limited by the cost of the orthogonalization or the Rayleigh–Ritz step, which grows cubically with m. We present a window approach that uses high degree polynomials to avoid both steps, at the expense of more matrix-vector products.

Andreas Stathopoulos
College of William & Mary
Department of Computer Science
[email protected]

Eloy Romero Alcalde
Computer Science Department
College of William & Mary
[email protected]

MS38
Fast Solvers for H^2 Matrices Using Sparse Algebra

We will present a new class of preconditioners for sparse matrices, based on hierarchical matrices. Similar to the incomplete LU algorithm, this preconditioner controls the sparsity of the matrix as the LU factorization is progressing. However, low-rank approximations instead of thresholding are used to remove the fill-in. This solver also shares similarities with algebraic multigrid in its use of multiple scales to represent the problem. Many variants of these new preconditioners exist. We will focus on variants that use either a domain decomposition approach (simpler to implement), or a multifrontal approach (optimal but more difficult to program). These solvers were shown to guarantee a bounded number of iterations independent of the matrix size N. The cost of the preconditioner is O(r^2 N) where r is the rank used in the approximation. r has a slow dependence on N (logarithmic or slower).

Mohammad Hadi Pour Ansari
Stanford
[email protected]

Eric F. Darve
Stanford University
Mechanical Engineering Department
[email protected]

MS38
Data Sparse Techniques for Parallel Hybrid Solvers

In this talk we will describe how H-matrix data sparse tech- Zhengyu Huang, Pieter Coulier niques can be implemented in a parallel hybrid sparse linear Stanford solver based on algebraic non overlapping domain decom- [email protected], [email protected] position approach. Strong-hierarchical matrix arithmetic and various clustering techniques to approximate the local Schur complements will be investigated, aiming at reduc- MS38 ing workload and memory consumption while complying A Tensor Train Decomposition Solver for Integral with structures of the local interfaces of the sub-domains. Equations on Two and Three Dimensions Utilization of these techniques to form effective global pre- conditioner will be presented. We present an efficient method based on TT (tensor train) decomposition for solving volume and boundary integral Yuval Harness equations in 2D and 3D. This method proceeds by inter- Inria preting the corresponding matrix as a high-dimensional [email protected] tensor, exploiting a matrix-block low-rank structure to pro- duce a highly compressed approximation of the inverse. Eric F. Darve For a variety of problems, computational cost and storage Stanford University needed to construct a preconditioner are very low, com- Mechanical Engineering Department pared to existing methods, with experimentally observed [email protected] complexity O(log N) or lower. Once computed, it allows for a fast O(N log N) solve. For volume and boundary inte- Luc Giraud grals in simple geometries, our method yield a practical fast Inria O(N) direct solver. For boundary integrals in complex ge- Joint Inria-CERFACS lab on HPC ometries, we show that a low accuracy approximation can [email protected] be reliably used as a preconditioner.

Emmanuel Agullo Denis Zorin INRIA Computer Science Department [email protected] Courant Institute, New York University [email protected]

MS38 Eduardo Corona Title Not Available at Time of Publication The Courant Institute of Mathematical Sciences New York University Abstract not available at time of publication. [email protected]

Gunnar Martinsson Abtin Rahimian Univ. of Colorado at Boulder Georgia Institute of Technology LA15 Abstracts 101

[email protected]

and Block Low-Rank (BLR) matrices have been used to design both dense and direct solvers. In this presentation we compare these three techniques from a theoretical and practical standpoint, using large dense and sparse problems from real applications.

Francois-Henry Rouet
Lawrence Berkeley National Laboratory
[email protected]

Patrick Amestoy
ENINPT-IRIT, Université of Toulouse, France
[email protected] or [email protected]

Cleve Ashcraft
Livermore Software Technology Corporation
[email protected]

MS39
Approximating Kernel Matrix with Recursive Low-Rank Structure

Given a positive-definite function φ and a set of n points {x_i}, a kernel matrix Φ has the (i, j) element being φ(x_i, x_j). Kernel matrices play an important role in Gaussian process modeling and are widely used in statistics and machine learning. A challenge, however, is that many calculations with these matrices are O(n^3). We present an approximation with a recursive low-rank structure that preserves the positive definiteness of the matrix. A benefit of the structure is that it allows an O(n) cost for many matrix computations of interest. We also present bounds to characterize the asymptotic convergence of the approximation.
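A much simpler low-rank kernel surrogate than the recursive structure of the abstract above, but one that illustrates the same goal (avoiding O(n^3) dense costs while keeping symmetry and positive semidefiniteness), is the Nyström approximation sketched below; the kernel, the landmark count, and the function names are illustrative.

import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between row-wise point sets X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom(X, m, gamma=1.0, rng=None):
    """Rank-m Nystrom approximation K ~ C W^+ C^T of the full kernel matrix."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(X), size=m, replace=False)   # landmark points
    C = rbf_kernel(X, X[idx], gamma)                  # n x m
    W = C[idx, :]                                     # m x m
    return C, np.linalg.pinv(W)                       # K_approx = C @ Winv @ C.T

rng = np.random.default_rng(0)
X = rng.standard_normal((400, 3))
C, Winv = nystrom(X, m=50, gamma=0.5, rng=0)
K = rbf_kernel(X, X, gamma=0.5)
print(np.linalg.norm(K - C @ Winv @ C.T) / np.linalg.norm(K))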

Jie Chen Alfredo Buttari Argonne National Laboratory CNRS-IRIT, France [email protected] [email protected]

Pieter Ghysels MS39 Lawrence Berkeley National Laboratory A Robust Algebraic Schur Complement Precondi- Computational Research Division tioner Based on Low Rank Corrections [email protected] In this talk we discuss LORASC, a robust algebraic pre- Jean-Yves L’Excellent conditioner for solving sparse linear systems of equations INRIA-LIP-ENS Lyon involving symmetric and positive definite matrices. The [email protected] graph of the matrix is partitioned by using k-way par- titioning with vertex separators into N disjoint domains and a separator formed by the vertices connecting the N Xiaoye Sherry Li domains. The obtained permuted matrix has a block ar- Computational Research Division row structure. The preconditioner relies on the Cholesky Lawrence Berkeley National Laboratory factorization of the first N diagonal blocks and on approx- [email protected] imating the Schur complement corresponding to the sepa- rator block. The approximation of the Schur complement Theo Mary involves the factorization of the last diagonal block and Universite de Toulouse, INPT ENSEEIHT a low rank correction obtained by solving a generalized [email protected] eigenvalue problem or a randomized algorithm. The pre- conditioner can be build and applied in parallel. Numerical Cl´ement Weisbecker results on a set of matrices arising from the discretization INP(ENSEEIHT)-IRIT by the finite element method of linear elasticity models [email protected] illustrate the robusteness and the efficiency of our precon- ditioner. MS39 Laura Grigori INRIA An Algebraic Multilevel Preconditioner with Low- France Rank Corrections for General Sparse Symmetric [email protected] Matrices

Frederic Nataf In this talk we present a multilevel Schur low rank (MSLR) Laboratoire J.L. Lions preconditioner for general sparse symmetric matrices. This [email protected] preconditioner exploits a low-rank property that is satisfied by the difference between the inverses of the local Schur Soleiman Yousef complements and specific blocks of the original matrix and IFPEN computes an approximate inverse of the original matrix soleiman [email protected] following a tree structure. We demonstrate its efficiency and robustness through various numerical examples.

MS39 Yuanzhe Xi A Comparison of Different Low-Rank Approxima- Purdue University tion Techniques [email protected]

Linear solvers and preconditioners based on structured ma- Yousef Saad trices have gained popularity in the past few years. In Department of Computer Science particular, Hierarchically Semi-Separable (HSS) matrices, University of Minnesota Hierarchically Off-Diagonal Low-Rank (HODLR) matrices [email protected] 102 LA15 Abstracts

Ruipeng Li The Hong Kong Polytechnic University Department of Computer Science & Engineering [email protected] University of Minnesota [email protected] MS40 The Anatomy of the Second Dominant Eigenvalue MS40 in a Transition Probability Tensor — The Power Spectral Clustering with Tensors Iteration for Markov Chains with Memory

Two of the fundamental analyses of networks are clustering Though ineffective for general eigenvalue computation, the (partitioning, community detection) and the frequency of power iteration has been of practical usage for comput- network motifs, or patterns of links between nodes. How- ing the stationary distribution of a stochastic matrix. For ever, these analyses are disjoint as community structure is a Markov chain with memory m, the transition ”matrix” typically defined by link relationships and ignore motifs. becomes an order-m tensor. Under suitable tensor prod- Here, we unify these two ideas through motif-based spec- ucts, the evolution of the chain with memory toward the tral clustering. Our framework represents motifs through limiting probability distribution can be interpreted as two adjacency tensors and uses Markov chains derived from types of power-like iterations — one traditional and the these tensors to find motif-based clusters. other shorter alternative. What is not clear is the makeup of the “second dominant eigenvalue” that affects the con- Austin Benson vergence rate of the iteration, if the method converges at Stanford University all. Casting the power method as a fixed-point iteration, [email protected] this paper examines the local behavior of the nonlinear map and identifies the cause of convergence or divergence. David F. Gleich Equipped with this understanding, it is found that there Purdue University exists an open set of irreducible and aperiodic transition [email protected] probability tensors where the short-cut type of power iter- ates fail to converge. Jure Leskovec Stanford University Moody T. Chu [email protected] North Carolina State University [email protected]

MS40 Sheng-Jhih Wu Spacey Random Walks, Tensor Eigenvalues, and Soochow University Multilinear PageRank [email protected]

This presentation will present an overview of the mini- symposia. Then we’ll introduce the spacey-random walk, MS41 a new type of stationary distribution that gives rise to a Solving the Bethe-Salpeter Eigenvalue Problem us- tensor eigenvalue problem. We’ll conclude by discussing ing Low-rank Tensor Factorization and a Reduced the multilinear PageRank vector and conditions when the Basis Approach PageRank eigenvector is unique and easy to compute. The Bethe-Salpeter equation (BSE) is a reliable model for David F. Gleich estimating the absorption spectra in molecules on the ba- Purdue University sis of accurate calculation of the excited states from first [email protected] principles. Direct diagonalization of the BSE matrix is in- tractable due to the O(N 6) complexity w.r.t. the atomic orbitals basis set. We present a new approach, yielding a Lek-Heng Lim 3 University of Chicago complexity of O(N ), based on a low-rank tensor approx- [email protected] imation of the dense matrix blocks and a reduced basis approach.

Austin Benson Peter Benner Stanford University Max Planck Institute, Magdeburg, Germany [email protected] [email protected]

MS40 Venera Khoromskaia Eigenvalues of Tensors and a Second Order Markov Max Planck Institute for Mathematics in the Sciences Chain [email protected]

It is possible to generalize the notion of eigenvalues of ma- Boris Khoromskij trices to eigenvalues of tensors, from both algebraic and ge- Max-Planck Institute, Leipzig ometric perspectives. Many interesting properties of eigen- [email protected] values of matrices have analogues for eigenvalues of ten- sors. Especially, there are Perron-Frobenius type theorems for eigenvalues of nonnegative tensors. The theory can be MS41 applied to study a particular second order Markov chain Fast Direct Solvers as Preconditioners for Time model. This talk will present some attempts along this Evolution Problems way. For problems involving time evolution of geometries, the Shenglong Hu cost of a time step is often dominated by the cost of solv- LA15 Abstracts 103

ing an integral equation. Typically, iterative solution tech- Equation niques requiring many iterations are used as the computa- tion of a direct solver is too expensive to be done at each Thepaper[H.C.Elman,O.G.ErnstandD.P.O’Leary: time step. This talk will illustrate the potential of cre- A multigrid method enhanced by Krylov subspace itera- ating a high accuracy approximate inverse for a reference tion for discrete Helmholtz equations] introduced a novel geometry as preconditioner for the evolved geometries. way of combining Krylov subspace iterations with a stan- dard multigrid V-cycle to obtain a robust solver for the Adrianna Gillman Helmholtz equation which scales favorably with the wave Rice University number parameter. It also gave a simple model prob- Department of Computational and applied mathematics lem eigenvalue analysis which explained why the standard [email protected] multigrid V-cycle will, in general, fail for the large wave- number case. In this talk we present some advances that have been made since then, and show how a similar anal- MS41 ysiscanbeusedtoexaminethebehavioroftheWave-Ray Hierarchical Matrix Factorization of the Inverse. Method of Brand and Livshits.

Given a sparse matrix, its LU-factors, inverse and inverse Oliver G. Ernst factors typically suffer from substantial fill-in, leading to TU Chemnitz non-optimal complexities in their computation as well as Department of Mathematics their storage. We will focus on the construction and rep- [email protected] resentation of the (LU-) factors of the inverse of a matrix. We introduce a blockwise approach that permits to replace (exact) matrix arithmetic through (approximate but effi- MS42 cient) H-arithmetic. We conclude with numerical results Symmetric Tensor Analysis to illustrate the performance of different approaches. A symmetric tensor is a multiway array that is invariant Sabine Le Borne under permutation of its indices. We consider the problem Tennessee Technological University of decomposing a real-valued symmetric tensor as the sum Department of Mathematics of symmetric outer products of real-valued vectors, includ- [email protected] ing the consideration of constraints such as nonnegativity and orthogonality. Application of standard numerical op- timization methods can be used to solve these problems. MS41 We also consider the data structures for efficiently storing H-Matrices for Elliptic Problems: Approximation symmetric tensors and the related decompositions. of the Inverses of FEM and BEM Matrices. Tamara G. Kolda Typically, the inversion or factorization of a rank- Sandia National Laboratories structured matrix cannot be performed exactly. A ba- [email protected] sic question, therefore, is whether the inverse or the fac- tors can at least be approximated well in the chosen tar- get format. For the format of H-matrices (introduced by MS42 W. Hackbusch), which are blockwise low-rank matrices, we Blocks, Curves, and Splits: A Glimpse into the show that indeed the inverses of Galerkin matrices arising O’Leary Toolbox in FEM and in BEM can be approximated at an exponen- tial rate in the block rank. Along with mathematical results, Dianne O’Leary’s body of work includes a wide variety of numerical algorithms, Markus Melenk, Markus Faustmann, Dirk Praetorius techniques, and strategies that can, in her words, be viewed Vienna University of Technology as “tools in our virtual toolbox”. Among these are blocks [email protected], [email protected], that capture relationships in form, structure, and sparsity; [email protected] curves that adaptively trace an enlightening, improving path; and splits that partition unknowns and equations to improve speed, accuracy, and flexibility. This talk will MS42 present an overview of her work from this perspective, be- Matrix Computations and Text Summarization ginning with her earliest days as a PhD student.

Automatic text summarization is an approach to address- Margaret H. Wright ing information overload. Machine-generated summaries Courant Institute are applied to a range of genres from newswire to scien- New York University tific reports. A user is presented with the key elements, [email protected] extracted from one or more documents and arranged to be both coherent and nonredundant. This talk will focus on matrix computations in text summarization, including MS43 Markov models, the QR decomposition, nonnegative ma- Solving Large Scale NNLS Problems Arising In trix factorization and robust regression. Computer Vision

John M. Conroy Given a set of images of a scene, the problem of recon- IDA Center for Computing Sciences structing the 3D structure of the scene and estimating the [email protected] pose of the camera(s) when acquiring these images can be formulated as non-linear least squares problem. Relatively small instances (tens of images) of this problem lead to Ja- MS42 cobians with thousands of columns and tens of thousands Towards Textbook Multigrid for the Helmholtz of rows. I will describe some new techniques for efficiently 104 LA15 Abstracts

solving problems with hundreds of thousands of images. Ding Ma These techniques combine ideas from Domain Decomposi- Stanford University tion, data clustering and truncated Newton methods. [email protected]

Sameer Agarwal Google Inc MS43 [email protected] The State-of-the-Art of Preconditioners for Sparse Linear Least Squares

MS43 In recent years, a number of different preconditioning Modulus Iterative Methods for Least Squares strategies have been proposed for use in solving m × n Problems with Box Constraints overdetermined large sparse linear least squares problems. However, little has been done in the way of comparing their For the solution of large sparse box constrained least performances in terms of efficiency, robustness and appli- squares problems (BLS), a new iterative method is pro- cability. In this talk, we briefly review the different precon- posed using generalized SOR method for the inner itera- ditioners and present a numerical evaluation of them using tions, and the accelerated modulus iterative method for the a large set of problems arising from practical applications. outer iterations for the solution of the linear complementar- ity problem resulting from the Karush-Kuhn-Tucker condi- tion of the BLS problem. Theoretical convergence analysis Jennifer Scott is presented. Numerical experiments show the efficiency Rutherford Appleton Laboratory of the proposed method with less iteration steps and CPU Computational Science and Engineering Department time compared to projection methods. [email protected]

Ning Zheng Nick Gould SOKENDAI (The Graduate University for Advanced Numerical Analysis Group Studies) Rutherford Appleton Laboratory [email protected] [email protected]

Ken Hayami National Institute of Informatics MS44 [email protected] A Structure-preserving Lanczos Algorithm for the Complex J-symmetric Eigenproblem Jun-Feng Yin Department of Mathematics A structure-preserving Lanczos algorithm for the eigen- Tongji University, Shanghai, P.R. China problem HC x = λx with [email protected]   AC HC = T D −A MS43 × T T LU Preconditioning for Full-Rank and Singular and n n complex matrices A, C = C ,D = D will be Sparse Least Squares considered. Matrices HC are called complex-J-symmetric 0 In where J = and In is the n × n identity. Sparse QR factorization is often the most efficient approach −In 0 to solving sparse least-squares problems min Ax−b,espe- The eigenvalues of HC display a symmetry: they appear in cially since the advent of Davis’s SuiteSparseQR software. pairs (λ, −λ). The structure-preserving algorithm is based For cases where QR factors are unacceptably dense, we con- on similarity transformations with 2n × 2n complex sym- T sider sparse LU factors from LUSOL for preconditioning an plectic matrices SC ,SC JSC = J. These matrices form iterative solver such as LSMR. LUSOL computes factors the automorphism group associated with the Lie algebra of the form P1AP2 = LU, with permutations chosen to of complex-J-symmetric matrices. The similarity transfor- −1 preserve sparsity while ensuring L is well-conditioned. For mation SC HC SC yields a complex-J-symmetric matrix. full-rank problems, one can right-precondition with either U or B,whereB is a basis from the rows of A defined by P1 [Saunders, 1979]. More recently, Duff and Arioli have Heike Fassbender recommended that B be chosen to have maximum volume. TU Braunschweig We experiment with LUSOL and LSMR on many realistic [email protected] examples. For singular problems we make use of LUSOL’s threshold rook pivoting option, and investigate whether Peter Benner threshold complete pivoting increases the volume of B in Max Planck Institute, Magdeburg, Germany ausefulway. [email protected]

Michael A. Saunders Chao Yang Systems Optimization Laboratory (SOL) Lawrence Berkeley National Lab Dept of Management Sci and Eng, Stanford [email protected] [email protected]

Nick W. Henderson MS44 Stanford University Structure Preserving Algorithms for Solving the Institute for Computational and Mathematical Bethe–Salpeter Eigenvalue Problem Engineering [email protected] In the numerical computation of electron-hole excitations, LA15 Abstracts 105

we need to solve a complex J-symmetric eigenvalue prob- ciency and competitiveness of our algorithms. lem arising from the discretized Bethe–Salpeter equation. In general all eigenpairs are of interest in this dense eigen- Fei Xue value problem. We develop new algorithms by investigat- University of Louisiana at Lafayette ing the connection between this problem and the Hamilto- [email protected] nian eigenvalue problem. Compared to several existing ap- proaches, our algorithm is fully structure preserving, and is Daniel B. Szyld easily parallelizable. Computational experiments demon- Temple University strate the efficiency and accuracy of our algorithm. Department of Mathematics [email protected] Meiyue Shao Lawrence Berkeley National Laboratory [email protected] Eugene Vecharynski Lawrence Berkeley National Lab [email protected] Felipe da Jornada University of California, Berkeley [email protected] MS45 Chao Yang Fast Hierarchical Randomized Methods for the Ap- Lawrence Berkeley National Lab proximation of Large Covariance Matrices [email protected] We propose a new efficient algorithm for performing hierar- Jack Deslippe chical kernel MVPs in O(N) operations called the Uniform National Energy Research Scientific Computing Center FMM (UFMM), an FFT accelerated variant of the black [email protected] box FMM by Fong and Darve. The UFMM is used to speed up randomized low-rank methods thus reducing their com- Steven Louie putational cost to O(N) in time and memory. Numerical University of California, Berkeley benchmarks include low-rank approximations of covariance [email protected] matrices for the simulation of stationary random fields on very large distributions of points. MS44 Pierre Blanchard Preconditioned Locally Harmonic Residual Meth- Inria ods for Interior Eigenvalue Computations [email protected]

This talk describes a Preconditioned Locally Harmonic Olivier Coulaud Residual (PLHR) method for computing several interior INRIA eigenpairs of a generalized Hermitian eigenvalue problem, [email protected] without traditional spectral transformations, matrix fac- torizations, or inversions. PLHR is based on a short-term recurrence, easily extended to a block form, computing Eric F. Darve eigenpairs simultaneously. The method takes advantage of Stanford University Hermitian positive definite preconditioning. We also dis- Mechanical Engineering Department cuss generalizations of PLHR on non-Hermitian eigenprob- [email protected] lems and cases where only an indefinite preconditioner is at hand. We demonstrate that PLHR is an efficient and robust option for interior eigenvalue calculation, especially MS45 if memory requirements are tight. Algebraic Operations with H2-Matrices Eugene Vecharynski Lawrence Berkeley National Lab We present a family of algorithms that perform approx- 2 [email protected] imate algebraic operations on H -matrices. These algo- rithms allow us to compute approximations of factoriza- tions, inverses, and products of H2-matrices in almost lin- MS44 ear complexity with respect to the matrix dimension. Com- pared to previous methods, our approach is based on local Preconditioned Solvers for Nonlinear Hermitian low-rank updates that can be carried out in linear complex- Eigenproblems with Variational Characterization ity. All other matrix operations can be expressed by these updates and simple matrix-vector products, and we obtain Large-scale nonlinear Hermitian eigenproblems of the form a fast and flexible technique for, e.g., constructing robust T (λ)v = 0 satisfying the standard variational characteri- preconditioners for integral and elliptic partial differential zation arises in a variety of applications. We review some operators. recent development of numerical methods for solving eigen- problems of this type, including variants of preconditioned conjugate gradient (PCG) methods for extreme eigenval- Steffen B¨orm ues, and preconditioned locally minimal residual (PLMR) University of Kiel methods for interior eigenvalues. We discuss the develop- [email protected] ment of search subspaces, projection and extractions, pre- conditioning, deflation, and convergence analysis of these Knut Reimer methods. Numerical experiments demonstrate the effi- Universitaet Kiel 106 LA15 Abstracts

[email protected] Jianlin Xia Department of Mathematics Purdue University MS45 [email protected] Fast Numerical Linear Algebra for Kohn-Sham Density Functional Theory via Compression MS46 In this talk we discuss algorithms for accelerating the solu- Some Recent Progress on Algorithms for FMM tion of the non-linear eigenvalue and eigenvector problem Matrices in Kohn-Sham density functional theory. Solving this prob- 2 lem self consistently requires the repeated computation and There are simple time efficient O(N ) algorithms for con- application of various linear algebraic operators. As part of verting a general dense matrix to FMM form using a prior partition tree. We show that there is a surprise as far as this talk we demonstrate how to construct a localized, and 1.5 hence sparse, representation of basis functions that allows memory efficiency goes, and present a O(N )working for reductions in the overall computational cost. memory algorithm and a corresponding worst case family ofpartitiontrees.Itisanopenquestionwhetherthere Anil Damle is a construction algorithm that only requires O(N)work Stanford University space. On another front, there is a simple O(N)flopsal- [email protected] gorithm to multiply two HSS forms exactly. However the corresponding question for FMM forms has remained open. Lexing Ying We show how to close it. Stanford University Shivkumar Chandrasekaran Department of Mathematics University of California, Santa Barbara [email protected] Department of Electrical and Computer Engieering [email protected] MS45 Kristen Lessel Comparison of FMM and HSS at Large Scale University of California, Santa Barbara [email protected] We perform a direct comparison between the fast multipole method (FMM) and hierarchically semi separable (HSS) matrix-vector multiplication for Laplace, Helmholtz, and MS46 Stokes kernels. Both methods are implemented with dis- tributed memory, thread, and SIMD parallelism, and the Fast Linear Solvers for Weakly Hierarchical Matri- runs are performed on large Cray systems such as Edison ces (XC30, 133,824 cores) at NERSC and Shaheen2 (XC40, In recent years, we have seen the development of a new 197,568 cores) at KAUST. class of linear solvers and preconditioners based on hierar- chical low-rank approximations of matrices. Such solvers Rio Yokota have been very successful and can solve problems that are King Abdullah University of Science and Technology currently intractable or very expensive using conventional [email protected] preconditioners such as block diagonal, incomplete LU, al- gebraic multigrid, etc. However, the computational cost of Francois-Henry Rouet these solvers currently fails to scale linearly with the size of Lawrence Berkeley National Laboratory the matrix. This is true in particular for 3D elliptic partial [email protected] differential equations. In this talk, we will present a new version of these fast solvers that scale like O(N) for general Xiaoye Sherry Li situations.Wewillseehowthesesolversareabletorepre- Computational Research Division sent dense matrices using (slightly larger) sparse matrices, Lawrence Berkeley National Laboratory and how the fill-in that appears during the LU factoriza- [email protected] tion of these sparse matrices can be removed, so that the algebra remains sparse throughout the process. As a result of this new approach, all steps are O(N), even in 3D. This MS46 results in very fast linear solvers that scale favorably with A Stable and Efficient Matrix Version of the Fast the problem size. 
Multipole Method Eric F. Darve It is well known that Fast Multipole Method (FMM) is Stanford University very efficient for evaluating matrix-vector products involv- Mechanical Engineering Department ing discretized kernel functions. In contrast, hierarchically [email protected] semiseparable (HSS) representations provide a fast and sta- ble way for many matrix operations, especially direct fac- Pieter Coulier, Mohammad Hadi Pour Ansari torizations. However, HSS forms have always been con- Stanford structed based on algebraic compression(e.g., SVD) of off- [email protected], [email protected] diagonal blocks. In this work, we show how to directly obtain an HSS representation from the matrix derived in FMM. MS46 MHS Structures for the Direct Solution of Multi- Difeng Cai Dimensional Problems Purdue University [email protected] We propose multi-layer hierarchically semiseparable LA15 Abstracts 107

(MHS) structures for the efficient factorizations of dense to any d-mode nonnegative tensors. References: S. Fried- matrices arising from multi-dimensional discretized prob- land, Positive diagonal scaling of a nonnegative tensor to lems. The problems include discretized integral equations one with prescribed slice sums, Linear Algebra Appl.,vol. and dense Schur complements in the factorizations of dis- 434 (2011), 1615-1619. cretized PDEs. Unlike existing work on hierarchically semiseparable (HSS) structures which are essentially 1D Shmuel Friedland structures, the MHS framework integrates multiple layers Department of Mathematics ,University of Illinois, of rank and tree structures. We lay theoretical founda- Chicago tions for MHS structures and justify the feasibility of MHS [email protected] approximations for these dense matrices. Rigorous rank bounds for the low-rank structures are given. Represen- tative subsets of mesh points are used to illustrate the MS47 multi-layer structures as well as the structured factoriza- Open Problems and Concluding Discussions tion. Systematic fast and stable MHS algorithms are pro- posed, particularly convenient direct factorizations. The In this final session of our mini-symposia, we will present new structures and algorithms can yield direct solvers with a summary of the results from all of the other presenters nearly linear complexity and linear storage for solving some as well as invite them to submit slides for an open problem practical 2D and 3D problems. discussion. The point is to identify a few key problems that could be solved to move the field forward. Jianlin Xia Purdue David F. Gleich [email protected] Purdue University [email protected]

MS47 Lek-Heng Lim A Semi-Tensor Product Approach for Probabilistic University of Chicago Boolean Networks [email protected] Modeling genetic regulatory networks is an important is- sue in systems biology. Various models and mathematical MS47 formalisms have been proposed in the literature to solve The Laplacian Tensor of a Multi-Hypergraph the capture problem. The main purpose in this paper is to show that the transition matrix generated under semi- Given a uniform multi-hypergraph H, we define a new tensor product approach (Here we call it the probability hyper-adjacency tensor A(H) and use it to define the Lapla- structure matrix for simplicity) and the traditional ap- cian L(H) and the signless Laplacian Q(H). We prove proach (Transition probability matrix) are similar to each L(H)andQ(H) are positive semi-definite for connected other. And we shall discuss three important problems in even uniform multigraphs. Some analysis on the largest Probabilistic Boolean Networks (PBNs): the dynamic of a and smallest H and Z-eigenvalues of L(H)aregivenas PBN, the steady-state probability distribution and the in- well as a comparison of other adjacency tensors. verse problem. Numerical examples are given to show the validity of our theory. We shall give a brief introduction to Kelly Pearson semi-tensor and its application. After that we shall focus Murray State University on the main results: to show the similarity of these two [email protected] matrices. Since the semi-tensor approach gives a new way for interpreting a BN and therefore a PBN, we expect that advanced algorithms can be developed if one can describe MS48 the PBN through semi-tensor product approach. Basker: A Scalable Sparse Direct Linear Solver for Many-Core Architectures Xiaoqing Cheng, Yushan Qiu, Wenpin Hou, Wai-Ki Ching The University of Hong Kong In this talk, we describe challenges related to the imple- [email protected], [email protected], mentation of a scalable direct linear solver on many-core [email protected], [email protected] architectures. Basker focuses on solving irregular systems such as those from circuit simulation. These systems may require full partial pivoting, and cannot be reorder for ef- MS47 ficient use of BLAS3 operations. We attempt to mitigate Positive Diagonal Scaling of a Nonnegative Tensor these issues by using a parallel implementation of Gilbert- to One with Prescribed Slice Sums Peierls method, while developing with Kokkos, which is a framework that allows for portability on emerging many- Given a nonnegative 3-tensor T =[ti,j,k]ofdimensions core architectures. m, n, p, when one can rescale it by three positive vectors x =(x1,...,xm),y =(y1,...,yn),z =(z1,...,zp), such Joshua D. Booth that the tensor [xiyj zkti,j,k] has given row, column and Center for Computing Research depth slice sums ri,cj ,dk.Here Sandia National Labs [email protected] n,p m,p m.n ri = xiyj zkti,j,k,cj = xiyj zkti,j,k,dk = xiyj zkti,j,k j,k=1 i,k=1 i,j=1 Siva Rajamanickam Sandia National Laboratories for i =1,...,m,j =1,...,m,k =1,...,l. Wegivea [email protected] neccesary and sufficient condition and state a correspond- ing minimizing problem of a convex function that gives a Erik G. Boman solution to the diagonal scaling. These results generalize Center for Computing Research 108 LA15 Abstracts

Sandia National Labs School of Computational Science and Engineering [email protected] Georgia Institute of Technology [email protected]

MS48 Richard Vuduc Parallel Iterative Incomplete Factorizations and Georgia Institute of Technology Triangular Solves [email protected]. edu

We present asynchronous fine-grained parallel algorithms Xiaoye Sherry Li for computing incomplete LU (ILU) factorizations and per- Computational Research Division forming the corresponding sparse triangular solves. The Lawrence Berkeley National Laboratory ILU algorithm computes all the non-zeros of the factors in [email protected] parallel (asynchronously) using an iterative method. The triangular solves can be performed approximately using another iterative method. We present several modifica- MS48 tions from previous work that improve the robustness of the method. We show results from Intel Xeon Phi and Parallel Multilevel Incomplete LU Factorization GPU using the Kokkos library. Preconditioner with Variable-Block Structure

Aftab Patel In this talk, we describe a distributed-memory implementa- School of Computational Science and Engineering tion of a variable-block multilevel algebraic recursive iter- Georgia Institute of Technology ative preconditioner for general non-symmetric linear sys- [email protected] tems. The preconditioner construction detects automati- cally exact or approximate dense blocks in the coefficient Siva Rajamanickam matrix and exploits them to achieve better reliability and Sandia National Laboratories computation density. To find such a dense-block struc- [email protected] ture, a graph-compression algorithm has been developed, which requires only one easy-to-use parameter. We present Erik G. Boman a study of the numerical performance and parallel scalabil- Center for Computing Research ity using additive Schwarz and Schur complement type pre- Sandia National Labs conditioners on a variety of general linear systems as well [email protected] as in the analysis of turbulent Navier-Stokes equations.

Edmond Chow Masha Sosonkina School of Computational Science and Engineering Old Dominion University Georgia Institute of Technology Ames Laboratory/DOE [email protected] [email protected]

Bruno Carpentieri MS48 Insitute for Mathematics and Computation A Sparse Direct Solver for Distributed Memory University of Gronengin GPU and Xeon Phi Accelerated Systems [email protected]

In this talk, we presents the first sparse direct solver for distributed memory systems comprising hybrid multicore MS49 CPU and Intel Xeon Phi co-processors or GPUs. It builds Solving Sequential Strongly Underdetermined Sys- on the algorithmic approach of SuperLUDist,whichis tems Using Sherman-Morrison Iterations right-looking and statically pivoted. Our contribution is a novel algorithm, called the HALO. The name is shorthand for highly asynchronous lazy offload; it refers to the way We are interested in repeatedly solving a regularized lin- the algorithm combines highly aggressive use of asynchrony ear least squares problem of the form where data become with accelerated offload, lazy updates, and data shadowing sequentially available. We assume we are given a strongly (alahalo or ghost zones), all of which serve to hide and underdetermined matrix and a Tikhonov type regulariza- reduce communication, whether to local memory, across tion term. Using the Sherman-Morrison-Woodbury for- thenetwork,oroverPCIe.WefurtheraugmentHALO mula we were able to develop a fast solver which can com- with a model-driven autotuning heuristic that chooses the pete with methods base on Cholesky factorization and it- intra-node division of labor among CPU and accelerator erative methods such as LSQR, CGLS, and LSMR. components. When integrated into HALO and evaluated on a variety of realistic test problems in both single-node Matthias Chung and multi-node configurations, the resulting implementa- Department of Mathematics tion achieves speedups of up to 2.5× over an already effi- Virginia Tech cient multicore CPU implementation, and achieves up to [email protected] 83% of a machine-specific upper-bound that we have esti- mated. Julianne Chung Virginia Tech Piyush Sao [email protected] Georgia Institute of Technology [email protected] Joseph Slagel Department of Mathematics Xing Liu Virginia Tech LA15 Abstracts 109

[email protected] [email protected]

MS49 MS50 A Tensor-Based Dictionary Approach to Tomo- Domain Decomposition Algorithms for Large Her- graphic Image Reconstruction mitian Eigenvalue Problems

The problem of low-dose tomographic image reconstruc- In this talk we will discuss Domain-Decomposition (DD) tion is investigated using dictionary priors learned from type methods for large Hermitian eigenvalue problems. training images. The reconstruction problem is considered This class of thechniques rely on spectral Schur comple- as a two phase process: in phase one, we construct a ten- ments combined with Newton’s iteration. The eigenvalues sor dictionary prior from our training data (a process for- of the spectral Schur complement appear in the form of mulated as a non-negative tensor factorization problem), branches of some functions, the roots of which are eigen- and in phase two, we pose the reconstruction problem in values of the original matrix. It is possible to extract these terms of recovering expansion coefficients in that dictio- roots by a number of methods which range from a form nary. Experiments demonstrate the potential utility of our or approximate Rayleigh iteration to an approximate in- approach. verse iteration, in which a Domain Decomposition frame- work is used. Numerical experiments in parallel environ- Sara Soltani ments will illustrate the numerical properties and efficiency Technical University of Denmark of the method. Department of Applied Mathematics and Computer Science Yousef Saad [email protected] Department of Computer Science University of Minnesota Misha E. Kilmer [email protected] Tufts University [email protected] Vasilis Kalantzis CEID,SchoolofEngineering Per Christian Hansen University of Patras, Greece Technical University of Denmark [email protected] Informatics and Mathematical Modelling [email protected] MS50 A Contour Integral-based Parallel Eigensolver with MS49 Higher Complex Moments Some Inverse Problems in Optical Imaging for Re- mote Sensing Large-scale eigenvalue problems arise in wide variety of sci- entific and engineering applications such as nano-scale ma- Hyperspectral, Light Detection and Ranging (LiDAR), and terials simulation, vibration analysis of automobiles, data Polarimetric imaging are the pervasive optical imaging analysis, etc. In such situation, high performance paral- modalities in remote sensing. Here, we describe methods lel eigensolvers are required to exploit distributed paral- for deblurring and analyzing 3D images of these types, as lel computational environments. In this talk, we present well as 4D compressive spectro-polarimetric snapshot im- a parallel eigensolver (SS method) for large-scale interior ages. eigenvalue problems. This method has a good parallel scal- ability according to a hierarchical structure of the method. Robert Plemmons We also present parallel software for computing eigenvalues Wake Forest University and corresponding eigenvectors in a given region or inter- [email protected] val. We illustrate the performance of the software with some numerical examples.

MS49 Tetsuya Sakurai, Yasunori Futamura, Akira Imakura Efficient Iterative Methods for Quantitative Sus- Department of Computer Science ceptibility Mapping University of Tsukuba [email protected], futa- Quantitative Susceptibility Mapping (QSM) is a MRI tech- [email protected], [email protected] nique that allows to image magnetic properties of tissue. QSM is applied more and more widely, for example, in neu- roscience, where iron concentration serves as a biomarker of MS50 certain neurological diseases. While the tissues susceptibil- On the Orthogonality of Eigenvectors Obtained ity cannot be measured directly it can be reconstructed by from Parallel Spectral Projection Methods solving a highly ill-posed inverse problem. In this talk, we will present iterative methods for QSM reconstruction and A number of recent eigensolvers have been devised with the their efficient implementation using Julia. We formulate aid of some approximate projectors to invariant subspaces. QSM as a PDE constrained optimization problem with a One of the major advantages on the use of these approx- non-smooth regularizer based on total variation (TV). Spe- imate spectral projectors is the natural parallelism they cial emphasis will be put on iterative all-at-once methods present. Different independent processing tasks can work for handling the PDE-constraints. on different parts of the spectrum in parallel. It is well known that exact eigenvectors from Hermitian problems Lars Ruthotto are mutually orthogonal. On the other hand, some initial Department of Mathematics and Computer Science studies cast doubts on the numerical orthogonalities of the Emory University vectors computed independently when spectral projection 110 LA15 Abstracts

methods are applied in a parallel setting. In this talk we presented. will present several practical approaches to maintain or- thogonalities without explicit reorthogonalization between Pieter Ghysels independent tasks. Lawrence Berkeley National Laboratory Computational Research Division Peter Tang [email protected] Intel Corporation 2200 Mission College Blvd, Santa Clara, CA Xiaoye Sherry Li [email protected] Computational Research Division Lawrence Berkeley National Laboratory [email protected] MS50 A Contour-Integral Based Algorithm for Counting Francois-Henry Rouet the Eigenvalues Inside a Region in the Complex Lawrence Berkeley National Laboratory Plane [email protected] In many applications, the information about the number of eigenvalues inside a given region is required. In this pa- MS51 per, we propose a contour-integral based method for this purpose. The new method is motivated by two findings. Iterative Construction of Non-Symmetric Factored Recently, some contour-integral based methods were devel- Sparse Approximate Preconditioners oped for estimating the number of eigenvalues inside a re- gion in the complex plane. But our method is able to count Factored sparse approximate inverses (FSAI) play a key- the eigenvalues inside the given region exactly. An appeal- role in the efficient algebraic preconditioning of sparse lin- ing feature of our method is that it can integrate with re- ear systems of equations. For SPD problems remarkable re- cently developed contour-integral based eigensolvers to de- sults are obtained by building the FSAI non-zero pattern it- termine whether all desired eigenvalues are found by these eratively during its computation. Unfortunately, an equiv- methods. Numerical experiments are reported to show the alent algorithm is still missing in the non-symmetric case. viability of our new method. In the present contribution we explore the possibility of it- eratively computing FSAI for non-symmetric matrices by Guojian Yin,GuojianYin using a incomplete Krylov subspace bi-orthogonalization Shenzhen Institutes of Advanced Technology, procedure. Chinese Academy of Sciences [email protected], [email protected] Carlo Janna, Massimiliano Ferronato Dept. ICEA - University of Padova [email protected], [email protected] MS51 Parallel Algorithms for Computing Functions of Andrea Franceschini Matrix Inverses University of Padova [email protected] Functions of entries of inverses of matrices like all diagonal entries of a sparse symmetric indefinite matrix inverse or its trace arise in several important computational applications MS51 such as density functional theory, covariance matrix analy- Monte Carlo Synthetic Acceleration Methods for sis or when evaluating Green’s functions in computational Sparse Linear Systems nanolelectronics. We will utilize an algorithm for selective inversion by Lin et al. (TOMS 2011) for approximately Stochastic linear solvers based on Monte Carlo Synthetic computing selective parts of a sparse symmetric matrix in- Acceleration are being studied as a potentially resilient al- verse. Its overall performance will be demonstrated for ternative to standard methods. Our work has shown that selected numerical examples and we will sketch paralleliza- for certain classes of problems, these methods can also tion aspects for raising the computational efficiency. demonstrate performance competitive with modern tech- Matthias Bollhoefer niques. In this talk we will review recent developments TU Braunschweig in Monte Carlo solvers for linear systems. 
Our work tar- [email protected] gets large, distributed memory systems by adapting Monte Carlo techniques developed by the transport community to solve sparse linear systems arising from the discretization MS51 of partial differential equations. A Parallel Multifrontal Solver and Precondi- Michele Benzi tioner Using Hierarchically Semiseparable Struc- Department of Mathematics and Computer Science tured Matrices Emory University We present a sparse solver or preconditioner using hier- [email protected] archically semi-separable (HSS) matrices to approximate dense matrices that arise during sparse matrix factoriza- Tom Evans tion. Using HSS, rank structured matrices with low-rank Oak Ridge National Laboratory off-diagonal blocks, reduces fill-in and for many PDE prob- [email protected] lems, it leads to a solver with lower computational com- plexity than the pure multifrontal method. For HSS con- Steven Hamilton struction, we use a novel randomized sampling technique. ORNL Hybrid shared and distributed memory parallel results are [email protected] LA15 Abstracts 111

Massimiliano Lupo Pasini Southern Illinois University at Carbondale Department of Mathematics and Computer Science [email protected] Emory University [email protected] MS53 Stuart Slattery Kolmogorov Widths and Low-Rank Approximation Computer Science and Mathematics Division of Parametric Problems Oak Ridge National Laboratory [email protected] This talk is concerned with error estimates for low-rank approximation of diffusion equations with several parame- ters. In particular, we focus on estimates of Kolmogorov n- MS52 widths of such solution manifolds, which also allow conclu- A Matrix Equation Approach for Incremental Lin- sions to be drawn concerning the performance achievable ear Discriminant Analysis by model reduction based on the reduced basis method. Moreover, we discuss numerical schemes for the computa- Abstract not available at time of publication. tion of low-rank approximations of such problems. Our theoretical bounds are illustrated by numerical experi- Delin Chu ments, and the results indicate in particular a surprisingly Department of Mathematics, National University of strong dependence on particular structural features of the Singapore problems. Block S17, 10 Lower Kent Ridge Road, Singapore 119076 [email protected] Markus Bachmayr Universit´e Pierre et Marie Curie Paris MS52 [email protected] Riccati Equations Arising in Stochastic Control Albert Cohen The numerical treatment of the infinite dimensional Universit´e Pierre et Marie Curie, Paris, France stochastic linear quadratic control problem requires solving [email protected] large-scale Riccati equations. We investigate the conver- gence of these Riccati operators to the finite dimensional ones. Moreover, the coefficient matrices of the resulting MS53 Riccati equation have a given structure (e.g. sparse, sym- A Greedy Method with Subspace Point of View for metric or low rank). We proposed an efficient method Low-Rank Tensor Approximation in Hierarchical based on a low-rank approach which exploit the given Tucker Format. structure. The performance of our method is illustrated by numerical results. The goal is to approximate high-order tensors in subspace- Hermann Mena based low-rank formats. A strategy is proposed for com- Department of Mathematics puting adapted nested tensor subspaces in a greedy fashion. University of Innsbruck, Austria The best approximation in these subspaces is then com- [email protected] puted with a DMRG-like algorithm, again usign a greedy strategy, allowing an automatic rank adaptation with a re- duced complexity compared to standard DMRG. A heuris- MS52 tic error indicator based on singular values enables to cap- Splitting Schemes for Differential Riccati Equa- ture a possible anisotropy of the tensor of interest. tions Lo¨ıc Giraldi I will consider the application of exponential splitting GeM, Ecole Centrale de Nantes schemes to large-scale differential Riccati equations of the [email protected] form T P˙ = A P + PA+ Q − PSP. (3) Anthony Nouy These numerical methods consider the affine and nonlin- Ecole Centrale Nantes, GeM ear parts of the equation separately, thereby decreasing [email protected] the necessary computational effort. I will focus on imple- mentation issues and how to utilize low-rank structure. I MS53 will also consider several natural generalizations such as time-dependent or implicit problems. Tensor Krylov Methods Tony Stillfjord High dimensional models with parametric dependencies Department of Mathematics can be challenging to simulate. 
We use tensor Krylov tech- University of Lund, Sweden niques to reduce the parametric model, combining both [email protected] moment matching and interpolatory model reduction in the parameter space. If a low rank tensor formulation is possible, this approach is competitive with classic para- MS52 metric model reduction. Furthermore, we look at models Conditioning Number of Solutions of Matrix Equa- containing stochastic parameters and construct a reduced tions in Stability Analysis model that outputs the mean over these parameters within the domain of interest. Abstract not available at time of publication. Pieter Lietaert Mingqing Xiao KU Leuven 112 LA15 Abstracts

[email protected] dures. Accuracies are compared using some test problems in the literature.

MS53 Jeffrey J. Hunter An Adaptive Tensor Method for High-Dimensional Auckland University of Technology Parametric PDEs jeff[email protected]

We apply the recently developed Tensor Train (TT) format for tensor decomposition to high-dimensional paramet- MS54 ric PDEs [Oseledets, Tensor-Train Decomposition, 2011, General Solution of the Poisson Equation for QBDs SIAM J. Sci. Comput. 33, pp. 2295-2317]. Adaptive solvers have been introduced independently [Eigel et al., If T is the transition matrix of a finite irreducible Markov − Adaptive Stochastic Galerkin FEM, Comput. Method. chain and b is a given vector, the Poisson equation (I T )x = b has a unique solution, up to an additive constant, Appl. M. 270, pp. 247-269] but computational cost grows # exponentially with the number of dimensions. In par- given by x =(I − T ) b + α1, for any α scalar, where 1 is # ticular, the proposed error estimators become unfeasable the vector of all ones and H denotes the group inverse of very quickly for larger problems. The TT format breaks H. This does not hold when the state space is infinite. We this curse of dimensionality and allows for the treat- consider the Poisson equation (I − P )u = g,whereP is ment of these problems. We solve the discretized prob- the transition matrix of a Quasi-Birth-and-Death (QBD) lem iteratively using the well-known Alternating Least process with infinitely many levels, g is a given infinite– Squares (ALS) algorithm [Holtz et al., The alternating lin- dimensional vector and u is the unknown vector. By using ear scheme for tensor optimisation in the TT format, SIAM the block tridiagonal and block Toeplitz structure of P , J. Sci. Comput. 34, pp. A683-A713] . Furthermore, we we recast the problem in terms of a set of matrix difference exploit the component structure of the tensor format. The equations. We provide an explicit expression of the general error estimation can be computed far more efficiently this solution, relying on the properties of Jordan triples of ma- way. Numerical results support this approach by showing trix polynomials and on the solutions of suitable quadratic higher accuracy even for problems of far greater size than matrix equations. The uniqueness and boundedness of the previously discussed. solutions are discussed, according to the properties of the right–hand side g. Max Pfeffer TU Berlin Dario Bini pfeff[email protected] Dept. of Mathematics University of Pisa, Italy [email protected] MS54 Steady-state Analysis of a Multi-class MAP/PH/c Sarah Dendievel, Guy Latouche Queue with Acyclic PH Retrials Universit´e libre de Bruxelles [email protected], [email protected] Amulti-classc-server retrial queueing system in which cus- tomers arrive according to a class-dependent Markovian ar- Beatrice Meini rival process (MAP) is considered. Service and retrial times Dept. of Mathematics follow class-dependent phase-type (PH) distributions. A University of Pisa, Italy necessary and sufficient condition for ergodicity is given. [email protected] A Lyapunov function is used to obtain a finite state space with a given steady-state probability mass. The truncated system is described as a multi-dimensional Markov chain MS54 and a Kronecker representation of its generator matrix is Complex Nonsymmetric Algebraic Riccati Equa- analyzed numerically. tions Arising in Markov Modulated Fluid Flows

Tugrul Dayar Motivated by the transient analysis of stochastic fluid flow Bilkent University models, we introduce a class of complex nonsymmetric al- Dept of Computer Engineering gebraic Riccati equations. The existence and uniqueness of [email protected] the extremal solutions to these equations are proved. Nu- merical methods for the extremal solutions are discussed. M. Can Orhan Bilkent University [email protected] Changli Liu School of Mathematical Science, Sichuan University [email protected] MS54 The Computation of the Key Properties of Markov Jungong Xue Chains using a Variety of Techniques Fundan University [email protected] Techniques for finding the stationary distribution, the mean first passage times and the group inverse of finite irreducible Markov chains is presented. We extend the MS55 technique of [Kohlas J., Numerical computation of mean Recovering Sparsity in a Krylov Subspace Frame- first passage times and absorption probabilities in Markov work and semi-Markov models, Zeit fur Oper Res, 30, (1986), 197-207], include some new results involving perturbation For many ill-posed inverse problems, such as image restora- techniques as well as some generalised matrix inverse proce- tion, the regularized solution may benefit from the inclu- LA15 Abstracts 113

sion of sparsity constraints, which involve terms evaluated [email protected] in the 1-norm. Our goal is to efficiently approximate the 1-norm terms by iteratively reweighed 2-norm terms. After presenting some new results on the regularizing properties MS55 of Krylov subspace methods, we describe a couple of new Arnoldi Methods for Image Deblurring with Anti- approaches to enforce sparsity within a Krylov subspace Reflective Boundary Conditions framework. We consider image deblurring with anti-reflective boundary Silvia Gazzola conditions and a nonsymmetric point spread function, and University of Padova compare several iterative Krylov subspace methods used [email protected] with with reblurring right or left preconditioners. The pur- pose of the preconditioner is not to accelerate the rate of James G. Nagy convergence, but to improve the quality of the computed Emory University solution and to increase the robustness of the regulariza- Department of Math and Computer Science tion method. Right preconditioned methods based on the [email protected] Arnoldi process are found to perform well. Marco Donatelli Paolo Novati University of Insubria University of Trieste [email protected] [email protected] David Martin, Lothar Reichel MS55 Kent State University [email protected], [email protected] Structure Preserving Reblurring Preconditioners for Image Deblurring MS56 In the context of image deblurring, we extend a precon- The Use of Stochastic Collocation Methods to Un- ditioning technique proposed in [Dell’Acqua et al., JCAM derstand Pseudo-Spectra in Linear Stability Anal- 2013] to a more general method whose preconditioners in- ysis herit the boundary conditions of the problem without los- ing in computational cost. Furthermore, following the idea Eigenvalue analysis is a well-established tool for stability developed in [Donatelli, Hanke, IP 2013], we introduce a analysis of dynamical systems. However, it is also known nonstationary version of the proposed method providing that there are situations where eigenvalues miss some im- further improvements in terms of iterations and quality of portant features of physical models. For example, in mod- reconstruction. Numerical comparisons with the aforemen- els of incompressible fluid dynamics, there are examples tioned techniques are given. where eigenvalue analysis predicts stability but transient simulations exhibit significant growth of infinitesmal per- Pietro Dell’Acqua turbations. This behavior can be predicted by pseudo- University of Genova spectral analysis. In this work, we show that an approach [email protected] similar to pseudo-spectral analysis can be performed in- expensively using stochastic collocation methods and the Marco Donatelli results can be used to provide quantitive information about University of Insubria the nature and probability of instability. [email protected] Howard C. Elman Claudio Estatico University of Maryland, College Park Department of Mathematics, University of Genova, Italy [email protected] [email protected] David Silvester Mariarosa Mazza School of Mathematics University of Insubria University of Manchester [email protected] [email protected]

MS56 MS55 Large-Scale Eigenvalue Calculations in Scientific SVD Approximations for Large Scale Imaging Problems Problems We will consider the problem of computing a relatively A fundamental tool for analyzing and solving ill-posed in- large number of eigenvalues of a large sparse and sym- verse problems is the singular value decomposition (SVD). metric matrix on distributed-memory parallel computers. However, in imaging applications the matrices are often The problem arises in several scientific applications, such too large to be able to efficiently compute the SVD. In as electronic structure calculations. We will discuss the this talk we present an efficient approach to compute an use of spectrum slicing and multiple shift-invert Lanczos approximate SVD, which exploits Kronecker product and in computing the eigenvalues. This is joint work with tensor structures that arise in imaging applications. Hasan Metin Aktulga, Lin Lin, Christopher Haine, and Chao Yang. James G. Nagy Emory University Esmond G. Ng Department of Math and Computer Science Lawrence Berkeley National Laboratory 114 LA15 Abstracts

[email protected] [email protected]

Yousef Saad MS56 Department of Computer Science Accurate Computations of Eigenvalues of Diago- University of Minnesota nally Dominant Matrices with Applications to Dif- [email protected] ferential Operators

In this talk, we present an algorithm that computes all MS57 eigenvalues of a symmetric diagonally dominant matrix to high relative accuracy. We further consider using the al- An Algebraic Multigrid Method for Nonsymmetric gorithm in an iterative method for a large scale eigenvalue Linear Systems problem and we show how smaller eigenvalues of finite dif- Linear systems with multiple physical unknowns pose a ference discretizations of differential operators can be com- challenge for standard multigrid techniques, particularly puted accurately. Numerical examples are presented to when the coupling between the unknowns is strong. We demonstrate the high accuracy achieved by the new algo- present our efforts to develop an AMG-preconditioned rithm. Krylov solver for nonsymmetric systems of linear equa- Qiang Ye tions. The preconditioner is designed to represent the cou- Dept of Math pling between the physical variables and account for the Univ of Kentucky underlying physics of the system. We present performance [email protected] results for the solver on challenging flow and transport ap- plications.

MS56 Daniel Osei-Kuffuor,LuWang Lawrence Livermore National Laboratory Convergence of the Truncated Lanczos Approach oseikuff[email protected], [email protected] for the Trust-Region Subproblem

The Trust-Region Subproblem plays a vital role in vari- Robert Falgout ous other applications. The truncated Lanczos approach Center for Applied Scientific Computing proposed in [Gould, Lucidi, Roma and Toint, SIAM J. Op- Lawrence Livermore National Laboratory tim., 9:504–525 (1999)] is a natural extension of the Lanc- [email protected] zos method and mimics the Rayleigh-Ritz procedure. In this paper, we analyze the convergence of the truncated Lanczos approach and reveals its convergence behavior in MS57 theory. Numerical examples are reported to support our Aggressive Accelerator-Enabled Local Smoothing analysis. Via Incomplete Factorization, with Applications to Preconditioning of Stokes Problems with Hetero- Leihong Zhang geneous Viscosity Structure Shanghai University of Finance and Economics [email protected] Hybrid supercomputers offer attractive performance per Watt, yet present communication-related bottlenecks. We Ren-Cang Li investigate “heavy smoothing’ with multigrid precondi- University of Texas - Arlington tioners; aggressive coarsening with accelerator-enabled [email protected] smoothing can reduce communication, maintaining scal- ability and admitting a tradeoff between local work and non-local communication. We examine local polynomial MS57 smoothing as well as the Chow-Patel fine-grained ILU de- Low-Rank Approximation Preconditioners for composition. Our work aims to provide portable software Symmetric and Nonsymmetric Systems contributions within ViennaCL, PETSc, and pTatin3D. We apply the smoothers to ill-conditioned systems from This talk will present a few preconditioning techniques lithospheric dynamics. based on low-rank approximations, designed for the gen- eral framework of the domain decomposition (DD) ap- Patrick Sanan proach and distributed sparse matrices. The DD decou- Universit`a della Svizzera Italiana ples the matrix and yields two variants of preconditioners. [email protected] The first is an approximate inverse of the original matrix based on a low-rank correction that exploits the Sherman- Karl Rupp Morrison-Woodbury formula. The second method is based Institute for Analysis and Scientific Computing, TU Wien on the Schur complement technique and it is based on ap- Institute for Microelectronics, TU Wien proximating the inverse of the Schur complement. Low- [email protected] rank expansions are computed by the Lanczos/Arnoldi pro- cedure with reorthogonalization. Numerical experiments Olaf Schenk with Krylov subspace accelerators will be presented for USI - Universit`a della Svizzera italiana symmetric indefinite systems and nonsymmetric systems. Institute of Computational Science These methods were found to be more efficient and robust [email protected] than ILU and multilevel ILU-type preconditioners.

Ruipeng Li Department of Computer Science & Engineering MS57 University of Minnesota Preconditioned Iterative Methods for Solving In- LA15 Abstracts 115

definite Linear Systems Nicholas Knight New York University Sparse symmetric indefinite linear systems of equations [email protected] arise in many practical applications. A preconditioned iterative method is frequently the method of choice. In James W. Demmel the talk we consider both general indefinite systems and University of California saddle-point problems and we summarize our work along Division of Computer Science these lines based on the recently adopted limited memory [email protected] approach. A number of new ideas proposed with the goal of improving the stability, robustness and efficiency of the David Fernandez resulting preconditioner will be mentioned. McGill University [email protected] Miroslav Tuma Academy of Sciences of the Czech Republic Institute of Computer Science MS58 [email protected] Enlarged Krylov Subspace Methods for Reducing Communication Jennifer Scott Rutherford Appleton Laboratory We introduce a new approach for reducing communication Computational Science and Engineering Department in Krylov subspace methods that consists of enlarging the [email protected] Krylov subspace by a maximum of t vectors per iteration, based on the domain decomposition of the graph of A. The enlarged Krylov projection subspace methods lead to faster MS58 convergence in terms of iterations and parallelizable algo- The s-Step Lanczos Method and its Behavior in rithms with less communication, with respect to Krylov Finite Precision methods. We present three enlarged CG versions, MSDO- CG, LRE-CG and SRE-CG. The s-step Lanczos method can reduce communication by afactorofO(s) versus the classical approach. By translat- Sophie M. Moufawad ing results of the finite-precision block process back onto LRI - INRIA Saclay, France the individual elements of the tridiagonal matrix, we show [email protected] that the bounds given by Paige [Lin. Alg. Appl., 34:235– 258, 1980] apply to the s-step case under certain assump- Laura Grigori tions. Our results confirm the importance of the s-step INRIA basis conditioning and motivate approaches for improving France finite precision behavior while still avoiding communica- [email protected] tion. Frederic Nataf Erin C. Carson Laboratoire J.L. Lions UC Berkeley [email protected] [email protected]

James W. Demmel MS58 University of California Preconditioning Communication-Avoiding Krylov Division of Computer Science Methods [email protected] Krylov subspace projection methods are the most widely- used iterative methods for solving large-scale linear systems MS58 of equations. Communication-avoiding (CA) techniques Sparse Approximate Inverse Preconditioners for can improve Krylov methods’ performance on modern com- Communication-Avoiding Bicgstab Solvers puters. However, in practice CA-Krylov methods suffer from lack of good preconditioners that can work seamlessly Communication-avoiding formulations of Krylov solvers in a communication-avoiding fashion. I will outline a do- can reduce communication costs by taking s steps of the main decomposition like framework for a family of precon- solver at the same time. In this work, we address the ditioners that are suitable for CA Krylov methods. Within lack of preconditioners for communication-avoiding Krylov this framework, we can envision multiple preconditioners solvers. We derive a preconditioned variant of the s-step that can work seamlessly with a CA-Krylov method. I will BiCGStab iterative solver and demonstrate the effective- present the first few preconditioners we have developed so ness of Sparse Approximate Inverse (SAI) preconditioners far and present results comparing these preconditioned CA in improving the convergence rate of the preconditioned s- techniques to traditional methods. step BiCGStab solver while maintaining the reduction in communication costs. Siva Rajamanickam Sandia National Laboratories Maryam Mehri Dehnavi [email protected] MIT [email protected] Ichitaro Yamazaki University of Tennessee, Knoxville Erin Carson [email protected] U.C. Berkeley [email protected] Andrey Prokopenko 116 LA15 Abstracts

Sandia National Laboratories [email protected] [email protected]

Erik G. Boman MS59 Center for Computing Research Low Rank Approximation of High Dimensional Sandia National Labs Functions Using Sparsity Inducing Regularization [email protected] Approximation of high dimensional stochastic functions us- ing functional approaches is often limited by the so called Michael Heroux curse of dimensionality. In literature, approximation meth- Sandia National Laboratories ods often rely on exploiting particular structures of high di- [email protected] mensional functions. One such structure which is increas- ingly found to be applicable in these functions is sparsity Jack J. Dongarra on suitable basis. Also, these functions exhibit optimal low Department of Computer Science rank representation and can be approximated in suitable The University of Tennessee low rank tensor subsets. In this work, we exploit sparsity [email protected] within low rank representation to approximate high dimen- sional stochastic functions in a non intrusive setting using few sample evaluations. We will illustrate this approach MS59 on suitable examples arising in computational sciences and Low-rank Tensor Approximation of Singular Func- engineering. tions Prashant Rai We study the quantized tensor train approximation of sin- Sandia gular functions, which are given, in low-order discretiza- Livermore tions, by coefficient tensors. The functions of interest are [email protected] analytic but with singularities on the boundary of the do- main. Such functions arise, for example, as solutions of lin- Mathilde Chevreuil ear second-order elliptic problems, where the singularities LUNAM Universite, Universite de Nantes, CNRS, GeM are inherited from the data or occur due to the nonsmooth- [email protected] ness of the domain. We show that the QTT approxima- tion achieves, theoretically and algorithmically, the opti- Loic Giraldi mal approximation accuracy ε with polylogarithmic ranks Ecole Centrale de Nantes κ −1 O(log ε ). That renders the low-rank QTT represen- [email protected] tation, which is amply adaptive and non-specific to the particular class of functions, comparable or even superior Anthony Nouy to special techniques such as hp approximations. CS was Ecole Centrale Nantes, GeM supported by the European Research Council (ERC) under [email protected] AdG 247277.

Vladimir Kazeev MS59 ETH Riemannian Optimization for High-Dimensional Zuerich Tensor Completion [email protected] Tensor completion aims to reconstruct a high-dimensional Christoph Schwab data set with a large fraction of missing entries. Assuming ETH Zuerich low-rank structure in the underlying data allows us to cast SAM the problem into an optimization problem restricted to the [email protected] manifold of fixed-rank TT tensors. We present a Rieman- nian nonlinear conjugate gradient scheme scaling linearly with the number of dimensions. In numerical experiments, MS59 we show the effectiveness of our approach and compare to adaptive sampling techniques such as cross-approximation. A Variant of Alternating Least Squares Tensor Completion in TT-Format Michael Steinlechner ∈ RI I { }d We fit a low rank tensor A , = 1,...,n ,to EPF Lausanne { ∈ R | ∈ } ⊂I agivenset Mi i P , P being unchange- michael.steinlechner@epfl.ch able. Considered is the TT-format with target rank r (rank adaption is realizable), while M ∈ RI (e.g. certain multi- variate functions) is assumed rank structured. A SOR-type MS60 2 2 solver ADF with O(r d#P ) computational complexity (r Rank-Structured PDE Solvers lower than alternating least squares (ALS)) is presented, working with #P = cdnr2 ∼ log(#I) samples. As good We describe a “superfast” Cholesky code based on results as for ALS, yet a significantly lower runtime are CHOLMOD, organized around level 3 BLAS for speed. observed rank r =4,...,20, given d ≤ 50, n ≤ 100 and We combine sparsity and low-rank structure and directly c ≥ 2. factor low-rank blocks using randomized algorithms. For a nearly-incompressible elasticity problem, CG with our Sebastian Kraemer,LarsGrasedyck,MelanieKluge rank-structured Cholesky converges faster than with in- RWTH Aachen complete Cholesky and the ML multigrid preconditioner, [email protected], [email protected], both in iteration counts and in run time. At 106 degrees of LA15 Abstracts 117

freedom, we use 3 GB of memory; exact factorization takes Eileen R. Martin 30 GB. Stanford University [email protected] David Bindel Cornell University Kenneth L. Ho [email protected] Department of Mathematics Stanford University [email protected] MS60 Approximation of Real Eigenvalues of Some Struc- Lexing Ying tured Nonsymmetric Matrices Stanford University Department of Mathematics Matrix Sign iteration is an efficient algorithm for general [email protected] eigenvalue problem, but we explore its application to ap- proximation of real eigenvalues of structured matrices with further application to real polynomial root-finding. Our PP1 formal analysis and extensive tests show that the approach On the Global Convergence of the Cyclic Jacobi is quite promising. Methods for the Symmetric Matrix of Order 4

Victor Pan We prove the global convergence of the general cyclic Ja- CUNY cobi method for the symmetric matrix of order four. In Lehman particular, we study the method under the strategies that [email protected] enable full parallelization of the method. These strategies, unlike the serial ones, can force the method to be very slow/fast within one cycle, depending on the underlying MS60 matrix. This implies that for the global convergence proof Linear Time Eigendecomposition and SVD Algo- one has to consider several adjacent cycles. rithms Erna Begovic,VjeranHari We present a set of superfast (nearly O(n) complexity in University of Zagreb time and storage) algorithms for finding the full singu- [email protected], [email protected] lar value decomposition or selected eigenpairs for a (non- symmetric) rank-structured matrix. These algorithms rely on tools from Hierarchically Semiseperable (HSS) matrix PP1 theory, as well as the Fast Multipole Method (FMM), ran- BFGS-like Updates of Constraint Preconditioners domization, and additional tools. We further show how for Saddle Point Linear Systems Arising in Convex these algorithms can be used in applications, such as PDE Quadratic Programming solvers. This is joint work with Jianlin Xia. Constraint preconditioners (CP) have revealed very effec- James Vogel tive in accelerating iterative methods to solve saddle-point Purdue University type linear systems arising in unconstrained optimization [email protected] problems. To avoid a large part of the costly application of CP we propose to selectively compute and update it via Jianlin Xia suitable low-rank BFGS-like modifications. We show under Department of Mathematics which conditions the new updated preconditioner guaran- Purdue University tees convergence of the PCG method and test it onto a class [email protected] of large-size problems taken from the CUTE repository. Luca Bergamaschi University of Padua MS60 [email protected] Butterfly Factorizations Angeles Martinez The butterfly factorization is a data-sparse representation Universita di Padova of the butterfly algorithm. It can be constructed efficiently [email protected] if either the butterfly algorithm is available or the entries of the matrix can be sampled individually. For an N × N matrix, the resulting factorization is a product of O(log N) PP1 sparse matrices, each with O(N) non-zero entries. The Pivotality of Nodes in Reachability Problems Using application complexity of this factorization is O(N log N) Avoidance and Transit Hitting Time Metrics with a prefactor significantly smaller than the original but- terfly algorithm. We propose a ”pivotality” measure of how easy it is to reach anodet from a node s. This measure attempts to capture Yingzhou Li how pivotal a role that some intermediate node k plays in Stanford the reachability from node s to node t in a given network. [email protected] We propose two metrics, the avoidance and transit hitting times, and show how these can be computed from the fun- Haizhao Yang damental matrices associated with the appropriate random Stanford University walk matrices. Math Dept, Stanford University [email protected] Daniel L. Boley 118 LA15 Abstracts

University of Minnesota ber of small matrices, especially for GPUs, which are cur- Department of Computer Science rently known to be about four to five times more energy [email protected] efficient than multicore CPUs. We describe the develop- ment of the main one-sided factorizations that work for Golshan Golnari a set of small dense matrices in parallel. Our approach University of Minnesota is based on representing the algorithms as a sequence of Department of Electrical and Computer Eng. batched BLAS routines for GPU-only execution. [email protected] Tingxing Dong Yanhua Li University of Tennessee University of Minnesota [email protected] Department of Computer Science [email protected] Azzam Haidar Department of Electrical Engineering and Computer Zhi-Li Zhang Science University of Minnesota University of Tennessee, Knoxville Computer Science [email protected] [email protected] Stanimire Tomov University of Tennessee, Knoxville PP1 [email protected] A Constructive Proof of Upper Hessenberg Form for Matrix Polynomials Piotr Luszczek University of Tennessee × We present a constructive proof that every n n matrix [email protected] polynomial can be reduced to an upper Hessenberg matrix rational. The proof relies on a pseudo inner product to be Jack J. Dongarra defined over the vector space of n tuples whose elements Department of Computer Science are rational functions. Armed with this pseudo inner prod- The University of Tennessee uct we proceed to apply Arnoldi and Krylov type subspace [email protected] methods to obtain an upper Hessenberg matrix rational whose eigenvalues are the same as the original matrix poly- nomial. We present a numerical technique for computing the eigenvalues of an upper Hessenberg matrix rational and PP1 one numerical example. Improved Incremental 2-Norm Condition Estima- tion of Triangular Matrices Thomas R. Cameron Washington State University Basically two incremental techniques to estimate the 2- [email protected] norm condition number of triangular matrices were intro- duced, one using left and one using right approximate sin- PP1 gular vectors. With incremental we mean that they com- pute estimates of a leading size k submatrix from a cheap A New Parallel Sparse Linear Solver update of the estimates for the leading size k − 1 subma- We will present a new parallel sparse linear solver based on trix. We show that a clever combination of both techniques hierarchical matrices. The algorithm shares some elements and information on the inverse triangular factors lead to a with the incomplete LU factorization, but uses low-rank significantly more accurate estimator. approximations to remove fill-ins. In addition, it uses mul- tiple levels, in a way similar to multigrid, to achieve fast Jurjen Duintjer Tebbens global convergence. Empirically, the current implementa- Institute of Computer Sciences tion has nearly linear complexity on equations arising from Academy of Sciences of the Czech Republic, Prague the discretization of elliptic PDE, making it very efficient [email protected] to solve large problems. Moreover, the task dependency graph of this algorithm has a tree structure that we exploit Miroslav Tuma in our MPI parallelization of the algorithm. Lastly, because Academy of Sciences of the Czech Republic fill-ins are strictly controlled by the algorithm, communi- Institute of Computer Science cation is needed only when boundary nodes are eliminated [email protected] on every process. 
Preliminary performance and scalability results of the solver will be demonstrated by our numerical experiments. PP1 Chao Chen A New Regularization Method for Computational Stanford University Color Constancy [email protected] Computational color constancy is an image processing problem that consists of correcting for colors in digital im- PP1 ages distorted by noise and illuminants. The color con- Towards Batched Linear Solvers on Accelerated stancy is an ill-posed problem, therefore regularization Hardware Platforms methods need to be applied. In this work, we propose a new regularization method for computational color con- Many applications need computations on very large num- stancy and show its performance in several natural images. LA15 Abstracts 119

tant applications require time-dependent calculations. We examine block preconditioners for time-dependent incom- Malena I. Espanol,MichaelWransky pressible Navier-Stokes problems and some related coupled Department of Mathematics problems. In some time-dependent problems, explicit time University of Akron stepping methods can require much smaller time steps for [email protected], [email protected] stability than are needed for reasonable accuracy. This leads to taking many more time steps than would other- wise be needed. With implicit time stepping methods, we PP1 can take larger steps, but at the price of needing to solve Design and Analysis of a Low-Memory Multigrid large linear systems at each time step. We consider im- Algorithm plicit Runge-Kutta (IRK) methods. Suppose our PDE has been linearized and discretized with N degrees of freedom. We develop an efficient multigrid solver with low memory Using an s-stage IRK method leads to an sN × sN linear complexity, in the spirit of A. Brandt, M. Adams, and oth- system that must be solved at each time step. These linear ers. The algorithm avoids storing the entire finest grid. systems are block s×s systems, where each block is N ×N. We introduce new ideas on how to apply τ-corrections in We investigate preconditioners for such systems, where we a full-approximation scheme to preserve the accuracy of a take advantage of the fact that each subblock is related to standard full multigrid solver while keeping the computa- a linear system from the (coupled) fluid flow equations. tion costs low. We use Fourier analysis to study the error components at different stages of the algorithm. Victoria Howle, Ashley Meek Texas Tech Stefan Henneking [email protected], [email protected] Georgia Institute of Technology [email protected] PP1 PP1 Structured Computations of Block Matrices with Anderson Acceleration of the Alternating Projec- Application in Quantum Monte Carlo Simulation tions Method for Computing the Nearest Correla- tion Matrix A block matrix can be regarded as a reshaped fourth-order tensor. A properly structured computation of the block In a wide range of applications an approximate correla- matrix provides insight into the tensor and vice versa. We tion matrix arises that is symmetric but indefinite and it will present a fast sketching algorithm for computing the is required to compute the nearest correlation matrix in inversion of a block p-cyclic matrix, and our synergistic ef- the Frobenius norm. The alternating projections method fort in developing stable and high-performance implemen- is widely used for this purpose, but has (at best) a linear tation of the algorithm for the applications of the fourth- rate of convergence and can require many iterations. We order tensor computations arising from quantum Monte- show that Anderson acceleration, a technique for acceler- Carlo simulation. ating the convergence of fixed-point iterations, can suc- cessfully be applied to the alternating projections method, bringing a significant reduction in both the number of it- Chengming Jiang erations and the computation time. We also show that University of California, Davis, USA Anderson acceleration remains effective when certain ele- [email protected] ments of the approximate correlation matrix are required to remain fixed. 
Zhaojun Bai Departments of Computer Science and Mathematics Nicholas Higham University of California, Davis School of Mathematics [email protected] The University of Manchester [email protected] Richard Scalettar Department of Physics, Nataˇsa Strabi´c University of California, Davis, USA School of Mathematics [email protected] School of Mathematics The University of Manchester [email protected] PP1

PP1 MHD Stagnation Point over a Stretching Cylinder with Variable Thermal Conductivity Block Preconditioning for Time-Dependent Cou- pled Fluid Flow Problems This article addresses the behavior of viscous fluid near Many important engineering and scientific systems require stagnation point over a stretching cylinder with vari- the solution of incompressible flow models or extensions to able thermal conductivity. The effects of heat genera- these models such as coupling to other processes or incor- tion/absorption are also encountered. Thermal conductiv- porating additional nonlinear effects. Finite element meth- ity is considered as a linear function of temperature. Com- ods and other numerical techniques provide effective dis- parison between solutions through HAM and Numerical is cretizations of these systems, and the generation of the presented. The effect of different parameters on velocity resulting algebraic systems may be automated by high- and temperature fields are shown graphically.. level software tools such those in the Sundance project, but the efficient solution of these algebraic equations re- Farzana Khan mains an important challenge. In addition, many impor- student 120 LA15 Abstracts

[email protected] matrix of the data requires outer products. Therefore, their spectral properties are tightly connected. This allows us to examine the kernel matrix through the sample covari- PP1 ance matrix in the feature space and vice versa. The use On Simple Algorithm Approximating Arbitrary of kernels often involve a large number of features, com- Real Powers Aα of a Matrix from Number Rep- pared to the number of observations. In this scenario, resentation System the sample covariance matrix is not well-conditioned nor is it necessarily invertible, mandating a solution to the In this paper we present new algorithm for approximat- α n×n problem of estimating high-dimensional covariance matri- ing arbitrary real powers A of a matrix A ∈ C only ces under small sample size conditions. We tackle this using matrix standard operation and matrix square root problem through the use of a shrinkage estimator that of- algorithms. Furthermore, although a real matrix with non- fers a compromise between the sample covariance matrix positive eigenvalues is given, we get real solution without and a well-conditioned matrix (also known as the ”tar- complex arithmetic. Its accuracy is only depend on a con- get”) with the aim of minimizing the mean-squared error dition number of A and that of matrix roots. Since the new (MSE). We propose a distribution-free kernel matrix regu- algorithm works for arbitrary real α and is independent on larization approach that is tuned directly from the kernel singularity and defectiveness against Pad´e-approximation matrix, avoiding the need to explicitly address the feature and Schur-Parllet algorithm respectively, it seems the only space. Numerical simulations demonstrate that the pro- way to get a good approximation of arbitrary real power posed regularization is effective in classification tasks. without any handling when a given matrix has such con- ditions. To addition this algorithm is also well suited in Tomer Lancewicki parallel computation because it can be implemented only University of Tennessee by standard matrix operations. [email protected]

Yeonji Kim Itamar Arel Department of mathematics University of Tennessee PusanNationUniversity Department of Electrical Engineering and Computer kyj14jeff@naver.com Science [email protected] Jong-Hyeon Seo Basic Science DIGIST PP1 [email protected] Computationally Enhanced Projection Methods for Symmetric Lyapunov Matrix Equations Hyun-Min Kim Department of mathematics In the numerical treatment of large-scale Lyapunov equa- Pusan National University tions, projection methods require solving a reduced Lya- [email protected] punov problem to check convergence. As the approxima- tion space expands, this solution takes an increasing por- tion of the overall computational efforts. When data are PP1 symmetric, we show that convergence can be monitored at Cycles of Linear and Semilinear Mappings significantly lower cost, with no reduced problem solution. Numerical experiments illustrate the effectiveness of this A mapping A from a complex vector space U to a complex strategy for standard and extended Krylov methods. vector space V is semilinear if Davide Palitta, Valeria Simoncini A  A A  A A (u + u )= u + u , (αu)=¯α u Universita’ di Bologna for all u, u ∈ U and α ∈ C. We give a canonical form of [email protected], [email protected] matrices of a cycle of linear or semilinear mapping PP1 V1 `` V2 ··· Vt−1 Fast Interior Point Solvers for PDE-Constrained ``` in which all Vi are complexVt vector spaces, each line is Optimization an arrow −→ or ←− , and each arrow denotes a linear or semilinear mapping. The talk is based on the article [Linear We present a fast interior point method for solving Algebra Appl. 438 (2013) 3442–3453]. quadratic programming problems arising from a number of PDE-constrained optimization models with bound con- Tetiana Klymchuk straints on the state and control variables. Particular em- Taras Shevchenko National University of Kyiv, Ukraine phasis is given to the development of preconditioned iter- Mechanics and Mathematics Department ative methods for the large and sparse saddle point sys- [email protected] tems arising at each Newton step. Having motivated and derived our recommended preconditioners, we present nu- merical results demonstrating the potency of our solvers. PP1 Regularization of the Kernel Matrix via Covariance John Pearson Matrix Shrinkage Estimation University of Kent [email protected] The kernel trick concept, formulated as an inner product in a feature space, facilitates powerful extensions to many well-known algorithms. While the kernel matrix involves PP1 inner products in the feature space, the sample covariance An Optimal Solver for Linear Systems Arising from LA15 Abstracts 121

Stochastic FEM Approximation of Diffusion Equa- study, and thus provide data consisting of at most a single tions with Random Coefficients trajectory in state space. Before proceeding with parame- ter estimation for models of such systems, it is important to Stochastic Galerkin approximation of elliptic PDE prob- know if the model parameters can be uniquely determined, lems with correlated random data often results in huge or identified, from idealized (error free) single trajectory symmetric positive definite linear systems. The novel fea- data. For linear models and a class of nonlinear systems ture of our preconditioned MINRES solver involves incor- that are linear in parameters, I will present several charac- poration of error control in the natural “energy’ norm to- terizations of identifiability. I will establish necessary and gether with an effective a posteriori estimator for the PDE sufficient conditions, solely based on the geometric struc- approximation error leading to a robust and optimally ef- ture of an observed trajectory, for these forms of identi- ficient stopping criterion: the iteration terminates when fiability to arise. I will extend the analysis to consider the algebraic error becomes insignificant compared to the collections of discrete data points, where, due to proper- approximation error. ties of the fundamental matrix solution, we can recover the parameters explicitly in certain cases. I will then examine Pranjal Prasad, David Silvester the role of measurement error in the data and determine University of Manchester regions in the data space which correspond to parameter [email protected], sets with desired properties, such as identifiability, model [email protected] equilibrium stability, and sign structure.

Shelby Stanhope PP1 University of Pittsburgh Approximation of the Scattering Amplitude Using [email protected] Nonsymmetric Saddle Point Matrices

This poster presents iterative methods for solving the for- PP1 ward and adjoint systems where the matrix is large, sparse, and nonysmmetric, such as those used to approximate the Is Numerical Stability an Important Issue in Iter- scattering amplitude. We use a conjugate gradient-like it- ative Computations? eration for a nonsymmetric saddle point matrix that has Although the answer seems obvious, the practice might be a real positive spectrum and a full set of eigenvectors that not. Full numerical stability analysis might be intriguing or are, in some sense, orthogonal. Numerical experiments our of reach. This contribution examines several situations demonstrate the effectiveness of this approach compared where the presentation and analysis of iterative numerical to other iterative methods for nonsymmetric systems. algorithms assumes exact arithmetic. Such approach can Amber S. Robertson be well justified in some cases and it can severely limit The University of Southern Mississippi application of the obtained results in others. [email protected] Zdenek Strakos Faculty of Mathematics and Physics, PP1 Charles University in Prague Generalizing Block Lu Factorization:A Lower- [email protected]ff.cuni.cz Upper-Lower Block Triangular Decomposition with Minimal Off-Diagonal Ranks PP1 We propose a factorization that decomposes a 2×2-blocked Computing Geodesic Rotations non-singular matrix into a product of three matrices that are respectively lower block-unitriangular, upper block- An element of a flag manifold is a frame—a decomposi- triangular, and lower block-unitriangular. In addition, we tion of Euclidean space into orthogonal components—and make this factorization “as block-diagonal as possible’ by a geodesic path on a flag manifold provides a “direct” ro- minimizing the ranks of the off-diagonal blocks. We present tation from one frame to another. Furthermore, a geodesic sharp lower bounds for these ranks and an algorithm to path corresponds to a decomposition of unitary matrices compute an optimal solution. One application of this fac- that generalizes the CS decomposition. We develop effi- torization is the design of optimal logic circuits for stream- cient methods for numerical computation on a flag mani- ing permutations. fold, including the location of geodesic paths, and consider a question of uniqueness. Fran¸cois Serre ETH Z¨urich Brian D. Sutton [email protected] Randolph-Macon College [email protected] Markus P¨uschel Department of Computer Science ETH Z¨urich PP1 [email protected] On the Numerical Behavior of Quadrature-based Bounds for the A-Norm of the Error in CG

PP1 The A-norm of the error in the conjugate gradient (CG) Existence and Uniqueness for the Inverse Prob- method can be estimated using quadrature-based bounds. lem for Linear and Linear-in-parameters Dynam- Our numerical experiments predict that some numerical ical Systems difficulties may arise when computing upper bounds based on modified quadrature rules like Gauss-Radau, regardless Certain experiments are non-repeatable, because they re- whether we use the CGQL algorithm or our new CGQ sult in the destruction or alteration of the system under algorithm for their computation. In this presentation we 122 LA15 Abstracts

analyze this phenomenon and try to explain when possible alency between this definition and classical ones. And, numerical difficulties can appear. we propose some generalizations of eigenvalue theorems to pseudospectra theorems based on the equivalent defi- Petr Tichy nition. Also, an algorithm of pesudospectra computation Czech Academy of Sciences via RRQR is given. Numerical experiments and compar- [email protected] isons are given to illustrate the efficiency of the results in this paper. Gerard Meurant Retired from Commissariat a l’Energie Atomique (CEA) Zhengsheng Wang [email protected] Nanjing University of Aeronautics and Astronautics [email protected]

PP1 Lothar Reichel Tensor Notation: What Would Hamlet Say? Kent State University [email protected] It is hard to spread the word about tensor computations. Summations, transpositions, and symmetries are described Min Zhang, Xudong Liu through vectors of subscripts and vectors of superscripts. Is Department of Mathematics, Nanjing University of there a magic notation for handling that stuff? Einstein no- Aeronautics tation and Dirac notation are noble efforts. Other worthy [email protected], [email protected] notations flow from traditional ”SIAM style” matrix com- putations. I will suggest with examples how the research community might navigate among these schemes. Shake- PP1 speare points the way: ”There is nothing either good or Linear Response and the Estimation of Absorption bad [about any given tensor notation] but thinking makes Spectrum in Time-Dependent Density Functional it so”. Hamlet II,II,250. Theory

Charles Van Loan The absorption spectrum of a molecular system can be Cornell University estimated from the dynamic dipole polarizability associ- Department of Computer Science ated with the linear response of a molecular system (at its [email protected] ground state) to an external perturbation. Although an ac- curate description of the absorption spectrum requires the diagonalization of the so-called Casida Hamiltonian, there PP1 are more efficient ways to obtain a good approximation to Best Rank-1 Approximations Without Orthogonal the general profile of the absorption spectrum without com- Invariance for the 1-norm and Frobenius 1-norm puting eigenvalues and eigenvectors. We will describe these methods that only require multipling the Casida Hamil- We consider finding the best rank-1 approximations to a tonian with a number of vectors. When highly accurate matrix A where the error is measured as the operator in- oscillator strength is required for a few selected excitation duced 1-norm (maximum column 1-norm) or Frobenius 1- energies, we can use a special iterative method to obtain norm (sum of absolute values over the matrix). Neither of the eigenvalues and eigenvectors associated with these en- these norms are unitarily invariant and the typical SVD ergies efficiently. procedure is insufficient. For the case of a 2 × 2matrix A, there is a simple solution procedure and the result- Chao Yang ing approximation bound for the Frobenius 1-norm is the Lawrence Berkeley National Lab | det(A)| . We present some thoughts on how to extend the [email protected] max |Aij | ij arguments to larger matrices. PP1 Varun Vasudevan A Set of Fortran 90 and Python Routines for Solv- Purdue University ing Linear Equations with IDR(s) [email protected] We present Fortran 90 and Python implementations of Tim Poston [Van Gijzen and Sonneveld, Algorithm 913: An Elegant Forus Health IDR(s) Variant that Efficiently Exploits Bi-orthogonality [email protected] Properties. ACM TOMS, Vol. 38, No. 1, pp. 5:1-5:19, 2011]. Features of the new software include: (1) the capa- David F. Gleich bility of solving general linear matrix equations, including Purdue University systems with multiple right-hand-sides, (2) the possibility [email protected] to extract spectral information, and (3) subspace recycling. Martin B. van Gijzen PP1 Delft Inst. of Appl. Mathematics Delft University of Technology Pseudospectra Computation Via Rank-Revealing [email protected] QR Factorization

The pseudospectrum of a matrix is a way to visualize the Reinaldo Astudillo non-normality of a matrix and the sensitivity of its eigen- Delft University of Technology values. In this paper, we propose an equivalent definition [email protected] of pseudospectra of matrices using Rank-Revealing QR (RRQR) factorization. We give the proof of the equiv-