
Eindhoven University of Technology

MASTER

Approximation of inverses of BTTB matrices for preconditioning applications

Schneider, F.S.

Award date: 2017



APPROXIMATION OF INVERSES OF BTTB MATRICES

for Preconditioning Applications

MASTER THESIS

by

Frank Schneider

December 2016

Dr. Maxim Pisarenco Department of Research ASML Netherlands B.V., Veldhoven

Dr. Michiel Hochstenbach Department of Mathematics and Computer Science Technische Universiteit Eindhoven (TU/e)

Prof. Dr. Bernard Haasdonk Institute of Applied Analysis and Numerical Simulation Universität Stuttgart

APPROXIMATION OF INVERSES OF

BTTB MATRICES


for Preconditioning Applications

Frank Schneider

December 2016

Submitted in partial fulfillment of the requirements for the degree of Master of Science (M.Sc.) in Industrial and Applied Mathematics (IAM) at the Department of Mathematics and Computer Science, Technische Universiteit Eindhoven (TU/e), as well as for the degree of Master of Science (M.Sc.) in Simulation Technology at the Institute of Applied Analysis and Numerical Simulation, Universität Stuttgart

The work described in this thesis has been carried out under the auspices of ASML Netherlands B.V., Veldhoven, The Netherlands.

ABSTRACT

The metrology of integrated circuits (ICs) requires multiple solutions of a large-scale linear system. The time needed for solving this system greatly determines the number of chips that can be processed per unit of time.

Since the coefficient matrix is partly composed of block-Toeplitz-Toeplitz-block (BTTB) matrices, approximations of its inverse are interesting candidates for a preconditioner.

In this work, different approximation techniques, such as an approximation by sums of Kronecker products or an approximation by inverting the corresponding generating function, are examined and, where necessary, generalized for BTTB and BTTB-block matrices. The computational complexity of each approach is assessed and its utilization as a preconditioner is evaluated.

The performance of the discussed preconditioners is investigated for a number of test cases stemming from real life applications.


ACKNOWLEDGEMENT

First and foremost, I wish to thank my supervisor from ASML, Maxim Pisarenco. Maxim has supported me not only by providing valuable feedback over the course of the thesis, but also by always being there to answer all my questions. He guided the thesis while allowing me the freedom to explore the areas that tempted me the most.

I also want to thank my supervisor from the TU/e, Michiel Hochstenbach, who was an excellent resource of knowledge, academically and emotionally. Thank you for all the helpful feedback, not only regarding the work and the thesis, but also regarding future plans.

I owe thanks to the members of my thesis committee, Professor Barry Koren and Martijn van Beurden from the TU/e and Professor Bernard Haasdonk from the University of Stuttgart. Thank you for your valuable guidance and insightful comments.

Thank you very much, everyone!

Frank Schneider

Eindhoven, December 28, 2016.


CONTENTS

i introduction
1 motivation
  1.1 Photolithography
    1.1.1 Metrology
  1.2 Other Applications
    1.2.1 Deblurring Images
    1.2.2 Further Applications
2 linear systems
  2.1 Iterative Solvers
    2.1.1 CG Method
    2.1.2 Other Methods
  2.2 Preconditioning
3 toeplitz systems
  3.1 Multi-level Toeplitz Matrices
  3.2 Circulant Matrices
  3.3 Hankel
4 problem description
  4.1 Full Problem
  4.2 BTTB-Block System
  4.3 BTTB System
5 thesis overview
ii preconditioners
6 overview over the preconditioning techniques
7 full c preconditioner
  7.1 Application to Full Problem
    7.1.1 Inversion
    7.1.2 MVP
8 circulant approximation
  8.1 Circulant Approximation for Toeplitz Matrices
    8.1.1 Circulant Preconditioners
  8.2 Circulant Approximation for BTTB Matrices
    8.2.1 Toeplitz-block Matrices
    8.2.2 Block-Toeplitz Matrices
  8.3 Application to BTTB-block Matrices
    8.3.1 Inversion
    8.3.2 MVP
9 inverse generating function approach
  9.1 Inverse Generating Function for Toeplitz and BTTB Matrices
    9.1.1 Unknown Generating Function
    9.1.2 Numerical Integration for Computing the Fourier Coefficients
    9.1.3 Numerical Inversion of the Generating Function
    9.1.4 Example
    9.1.5 Efficient Inversion and MVP
  9.2 Inverse Generating Function for BTTB-block Matrices
    9.2.1 General Approach
    9.2.2 Preliminaries
    9.2.3 Proof of Clustering of the Eigenvalues
    9.2.4 Example
  9.3 Regularizing Functions
  9.4 Numerical Experiments
    9.4.1 Convergence of the IGF
    9.4.2 IGF for a BTTB-…
10 kronecker product approximation
  10.1 Optimal Approximation for BTTB Matrices
    10.1.1 Algorithm
    10.1.2 Inverse and MVP
  10.2 BTTB-block Matrices
    10.2.1 One Term Approximation
    10.2.2 Multiple Terms Approximation
  10.3 Numerical Experiments
    10.3.1 Convergence of the Kronecker Product Approximation
    10.3.2 Decay of Singular Values
    10.3.3 Relation to the Generating Function
11 more ideas
  11.1 Transformation Based Preconditioners
    11.1.1 Discrete Sine and Cosine Transform
    11.1.2 Hartley Transform
  11.2 Banded Approximations
  11.3 Koyuncu Factorization
  11.4 Low-Rank Update
iii benchmarks
12 benchmarks
  12.1 Transformation-based Preconditioner
  12.2 Kronecker Product Approximation
  12.3 Inverse Generating Function
  12.4 Banded Approximation
iv conclusion
13 future work
  13.1 Inverse Generating Function
    13.1.1 Regularization
    13.1.2 Other Kernels
  13.2 Kronecker Product Approximation
    13.2.1 Using a Common Basis
  13.3 Preconditioner Selection
14 conclusion
v appendix
a inversion formulas for kronecker product approximation
  a.1 One Term Approximation
    a.1.1 Sum Approximation
  a.2 Multiple Terms Approximation
    a.2.1 Sum Approximation
bibliography

LIST OF FIGURES

Figure 1.1 Moore's law
Figure 1.2 Photolithographic process
Figure 1.3 Close-up of a wafer
Figure 1.4 Effect of focus on the gratings
Figure 1.5 Indirect grating measurement
Figure 1.6 Shape parameters for a trapezoidal grating
Figure 1.7 Example for a PSF
Figure 1.8 Blurring problem
Figure 2.1 Minimization function
Figure 2.2 Convergence of gradient descent and conjugate gradient (CG) method for different functions φ
Figure 2.3 Preconditioner trade off
Figure 4.1 Sparsity patterns of the matrices C, G and M as well as the resulting matrix A
Figure 4.2 Sparsity pattern of C
Figure 4.3 Color plots of all levels of C
Figure 8.1 Color plots for a Toeplitz-block matrix and its circulant-block approximation
Figure 8.2 Color plots for a block-Toeplitz matrix and its block-circulant approximation
Figure 9.1 Illustration of the inverse generating function approach (marked in red)
Figure 9.2 Illustration of the inverse generating function approach for unknown generating functions, with the changes marked in red
Figure 9.3 Illustration of the inverse generating function approach with numerical integration (highlighted in red)
Figure 9.4 Illustration of the inverse generating function approach using a sampled generating function
Figure 9.5 Color plots for the inverse of the original BTTB matrix, T[f]^{-1}, the result of the inverse generating function method T[1/f], and the difference between those two
Figure 9.6 Illustration of the inverse generating function for Toeplitz-block matrices
Figure 9.7 Color plots for the inverse of the original 2 × 2 BTTB-block matrix, T[F(x, y)]^{-1}, the result of the inverse generating function method T[1/F(x, y)], and the difference of those two
Figure 9.8 Degrees of regularization
Figure 9.9 Convergence of the inverse generating function (IGF) method towards the exact inverse
Figure 9.10 Distribution of eigenvalues for the IGF
Figure 10.1 Relative difference of the Kronecker product approximation (using all terms) and the original BTTB matrix, for 500 randomly created test cases
Figure 10.2 Decay of the singular values of a sample test case
Figure 10.3 Relation of the Kronecker product approximation and the generating functions (taken from test case 1b)
Figure 10.4 Convergence of the generating function
Figure 11.1 Color plots for a BTTB matrix and the approximation resulting from discrete sine transform (DST) II
Figure 11.2 Color plots for a BTTB matrix and the approximation resulting from discrete cosine transform (DCT) II
Figure 11.3 Color plots for a BTTB matrix and the approximation resulting from a Hartley transformation
Figure 11.4 Color plots for a BTTB matrix and a tridiagonal approximation on both levels
Figure 11.5 Relative difference of GM and U_k S_k V_k^H for different values of k and four different test cases
Figure 12.1 Box plots for the relative speed up of each preconditioner compared to the circulant preconditioner

LIST OF TABLES

Table 4.1 Structure of C on each level, from highest (level Z) to lowest (level X)
Table 4.2 Convergence rates (number of iterations) for (4.1) using induced dimension reduction (IDR)(6)
Table 6.1 Applicability of different preconditioning methods
Table 9.1 The average number of iterations for a 3 × 3 BTTB matrix
Table 12.1 Color-code for the tables in the benchmark chapter
Table 12.2 Number of iterations if a selected preconditioner is used on a certain test case
Table 12.3 Number of iterations for transformation based preconditioners
Table 12.4 Number of iterations for preconditioners based on the Kronecker product approximation
Table 12.5 Number of iterations for preconditioners based on the Kronecker product approximation with approximate singular value decomposition (SVD)
Table 12.6 Number of iterations for preconditioners based on the IGF
Table 12.7 Number of iterations for preconditioners based on banded approximations

NOMENCLATURE

vectors

x: the vector $x = (x_1, x_2, \dots)^T$ with the elements $x_1, x_2, \dots \in \mathbb{C}$.

Vector Norms

$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}$ is the p-norm of x for $1 \leq p \leq \infty$.
$\|x\|_\infty = \max_{1 \leq i \leq n} |x_i|$ is the $\infty$-norm.

matrices

$A^{(n)}$: a matrix of size n × n, with the entries $(A)_{i,j} = A_{i,j} = a_{i,j}$ for i, j = 1, ..., n.

$A^{(n_1;n_2)}_{i,j;k,l}$: refers to the (k, l)th entry of the (i, j)th block of the matrix A, which has size $(n_1 \cdot n_2) \times (n_1 \cdot n_2)$. $A_{i,j;:,:}$ or $A_{i,j;}$ consequently refers to the (i, j)th block matrix of A.

$A^{(\mathrm{name})}$: endows the matrix with a certain name that helps understanding its purpose.

$T_i$: a matrix with just one index refers to an entry of a Toeplitz, circulant or similar matrix that can be described with just a few entries. This is also true, for example, for two-level Toeplitz matrices, where one entry can be referred to as $T_{i;j}$. Note that i and j can be negative, following the specific nomenclature of the class of matrix.

Furthermore:

$I$ is the identity matrix, where $I_{i,j} = \delta_{ij}$.
$A^T$ is the transposed matrix of A, where $(A^T)_{i,j} = A_{j,i}$.
$A^{-1}$ is the inverse matrix of A, where $AA^{-1} = A^{-1}A = I$.
$A^H$ is the conjugated transposed matrix of A, where $(A^H)_{i,j} = \overline{A_{j,i}}$. The overbar denotes the complex conjugate ($\overline{a + bi} = a - bi$).
A square matrix is called symmetric if $A = A^T$.
A matrix A is called positive definite if $x^H A x > 0$ and real, for all non-zero vectors x.
A matrix A is called Hermitian or self-adjoint if $A = A^H$.
$\lambda_k$ denotes the eigenvalues of A, where $Av = \lambda_k v$.
$\kappa(A)$ is the condition number of A, which is defined as $\kappa(A) = \|A^{-1}\| \cdot \|A\|$ (usually using the 2-norm).

Matrix Norms

$\|A\|_p = \sup_{x \neq 0} \frac{\|Ax\|_p}{\|x\|_p}$, the matrix norm induced by the vector norm $\|x\|_p$. In particular:
$\|A\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^{m} |a_{i,j}|$, which is simply the maximum absolute column sum of the matrix.
$\|A\|_\infty = \max_{1 \leq i \leq m} \sum_{j=1}^{n} |a_{i,j}|$, which is simply the maximum absolute row sum of the matrix.
$\|A\|_2 = \sqrt{\lambda_{\max}(A^H A)}$.
$\|A\|_F = \left( \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{i,j}|^2 \right)^{1/2}$, called the Frobenius norm.

miscellaneous

$\delta_{ij}$: the Kronecker delta, with $\delta_{ij} = 0$ if $i \neq j$ and $\delta_{ij} = 1$ if $i = j$.
$f \in \mathcal{O}(g)$ is equivalent to: for $x \to a < \infty$, $\exists\, C > 0\ \exists\, \varepsilon > 0\ \forall x \in \{x : d(x,a) < \varepsilon\} : |f(x)| \leq C \cdot |g(x)|$, known as big O notation.
$i$: the imaginary number, where $i^2 = -1$.

acronyms

BCCB block-circulant-circulant-block

BiCG biconjugate gradient

BiCGSTAB biconjugate gradient stabilized

BTTB block-Toeplitz-Toeplitz-block

CCD charge-coupled device

CG conjugate gradient

DCT discrete cosine transform

DFT discrete Fourier transform

DST discrete sine transform

DTT discrete trigonometric transform

FFT Fast Fourier transform

GMRES generalized minimal residual

HPD Hermitian positive definite

IC integrated circuit

IDR induced dimension reduction

IGF inverse generating function

MVP matrix-vector product

ODE ordinary differential equation

PDE partial differential equation

PSF point spread function

SVD singular value decomposition


Part I

INTRODUCTION

This part introduces the main application that motivated this master thesis. It explains the process of fabrication of ICs via photolithography and also includes a description of a method of inspecting and monitoring the production quality of the fabricated ICs. This metrology process requires the solution of large linear systems of equations. Before the structure of this particular linear system is further described along with two related (reduced) linear systems, the basic terms concerning linear systems and their iterative solvers are introduced. Furthermore, the idea of preconditioning is described along with the definition of Toeplitz systems. In the last chapter of this part, the main objectives of this master thesis are discussed along with the main results. This chapter concludes with an outline of the following parts and chapters.

MOTIVATION 1

Following Moore's Law (see Figure 1.1), the performance of integrated circuits (ICs) has steadily increased and fueled what is known as the "Digital Revolution". Due to the ever increasing complexity and availability of digital electronics, influential developments such as the personal computer, the internet or the cellular phone have been made possible and affected almost every area of our lives.

In 1965, Gordon E. Moore proposed in an article that the number of transistors that can be packed into a given unit of space will double roughly every two years [40].


Figure 1.1: Moore's law. This figure shows the number of transistors of landmark microprocessors against their year of introduction. The line shows the proposed doubling in transistor count every two years.

An integrated circuit (IC) can be thought of as a very advanced, miniaturized electric circuit. Using transistors, resistors and capacitors as building blocks, one can implement the basic logical operations: not, and, or, etc. On a higher level, this allows the construction of complex circuits such as microprocessors or flash memories [42, 46].

Today, the world around us is full of integrated circuits and microprocessors. One can find them in computers, smartphones, televisions, cars and almost every modern electrical device [42]. But the need for more complex and powerful electronic devices is everlasting, with new computationally expensive areas such as computer simulations rising in importance. This motivates advances in photolithography, the main process of fabricating these integrated circuits (ICs).


1.1 photolithography

The fabrication of ICs is a multi-billion dollar industry that requires a tightly controlled production environment. Hundreds of these ICs are produced at the same time on a thin slice of silicon, called a (silicon) wafer, and they are later cut apart into single IC chips. The often complex and interconnected designs of the ICs are copied on a silicon wafer in a process known as photolithography.

The steps of printing one layer of an IC onto the wafer are visualized in Figure 1.2 (compare [27, 42, 46]):

(1) Prepare wafer: Prior to the use, the silicon wafer has to be cleaned chemically.

(2) Deposit barrier layer: In the next step, the wafer is covered with a thin barrier layer, which is usually silicon dioxide (SiO2).

(3) Application of photoresist: After this, the wafer is coated with a light-sensitive material called photoresist.

(4) Mask alignment and exposure to UV light: The mask carrying the complex pattern of the IC is carefully aligned and the whole wafer is exposed to high-intensity ultraviolet light. The photoresist is only exposed to the UV light in areas where the mask is transparent, and the pattern of the mask gets "copied" onto the photoresist.

(5) Development: After developing the photoresist (similar to the development of photographic films) it washes away in areas where it has been exposed to the UV light (or vice versa for negative photoresists), making the desired pattern visible on the wafer.

(6) Etching: Chemical etching is used to remove any barrier material (SiO2) not protected by the coating photoresist.

(7) Photoresist removal: In the last step, the photoresist is removed from the wafer, leaving just the barrier layer with the desired pattern.

Figure 1.2: Photolithographic process.

This process is repeated for each layer of the IC. The number of layers varies greatly, but usually lies between 20 and 40 [33]. Each layer is processed one after the other.

In order to produce a working IC, the mask as well as each layer needs to be aligned with high precision in comparison to the wafer and the underlying layers. Since the sizes of the structures on the wafer are in the magnitude of nanometers, this requires a complex process called metrology.

Metrology can also be used to extract information on the quality of the photolithographic process, by measuring metrology targets or gratings that were printed between the actual chips.

1.1.1 Metrology

Step (4) in the photolithographic process requires not only a careful and precise alignment of the mask, but also the correct focus for the exposure, both of which are highly non-trivial tasks performed by high-tech lithography systems.

ASML is the largest supplier in the world of photolithographic systems for the semiconductor industry.

Because of that, small gratings between the chips on the wafer are included as test structures for quality control and high-precision alignment, as seen in Figure 1.3.

Figure 1.3: Close-up of a wafer. The wafer (large) contains several chips, each with a complicated structure (top right). Between the chips gratings have been printed (bottom right) for the purpose of quality control and alignment (source: [46]).

Since these gratings pass through the exact same production cycle as the actual chips, they show the same production biases or shortcomings. However, it is easier to use the gratings as metrology targets because of their simple periodic structure compared to the chip's complex architecture. The exact shape of the gratings contains information such as an incorrect focus (see Figure 1.4), over- or underexposure, etc.

Because of the small size of the gratings (on the order of 100 nm), classical optical microscopy is not usable. In 1873 Abbe [1] found

Figure 1.4: Effect of focus on the gratings. While the central grating is a product of a wafer in focus, the other two gratings were the result of a lithographic process with an incorrect focus. They would both be considered as not reaching the quality standard and therefore be sorted out (source: [46]).

that for light with wavelength λ, the resolution of the resulting picture is at best
\[
d = \frac{\lambda}{2 n \sin\Theta},
\]
where n is the refractive index of the medium being imaged in and Θ is the half-angle subtended by the optical objective lens. This means that the maximal resolution is bounded by c · λ with a c close to 1. To increase the resolution, shorter wavelengths such as UV-light and X-rays can be used.


Figure 1.5: Indirect grating measurement. Subfigure (a) depicts the process of indirect grating measurement using the scattering of light (source: [46]). A (simulated) output of the CCD is shown in subfigure (b) (source: [33]).

On the other hand, electron microscopy has its own drawbacks, such as being slow and potentially destructive [46]. Therefore, indirect measurements are preferred (see Figure 1.5a).

For the indirect grating measurement, light is directed (through filters and an optical system) at the gratings. Depending on the grating's geometry, the light is scattered in a certain way. Part of the scattered light is captured by a CCD. Figure 1.5 illustrates the method of indirect grating measurement and the light intensity measured by the CCD.

This, however, does not give direct access to the geometrical shape of the gratings. The actual interest of the metrology step is to find out the geometrical parameters p of the gratings.

For a trapezoidal grating, for example, three shape parameters p_k can be used to describe the grating's geometry: the height p_1 of the gratings, the average width p_2 of a grating and the angle of the side wall p_3 (see Figure 1.6).

Figure 1.6: Shape parameters (height, width, angle).

To extract the geometrical parameters p of the gratings, an inverse model of the scattering process is required:

• Forward problem - Scattering simulation: Given a certain shape p of the gratings, simulate the light intensities measured by the sensor I(p).

• Inverse problem - Profile reconstruction: Given a measured light intensity at the sensor I_CCD (see Figure 1.5b), reconstruct the geometrical parameters p. This is done by computing
\[
\min_{p} \, \| I_{CCD} - I(p) \| ,
\]
where I(p) is the result of the forward problem given the parameter p.

This minimization is realized using the Gauss-Newton algorithm that requires the computation of the first order derivatives. They are approximated using finite differences, requiring O(n) computations of I(p).

Since visible light is an electromagnetic wave with a wavelength between 400 and 700 nm, its diffraction is described by Maxwell's equations. Using the time-harmonic assumption $\tilde{E}(x, y, z, t) = E(x, y, z)\, e^{-i\omega t}$, the integral form of the equations is as follows (see [6, 18, 31]):

\[
\oiint_{\partial\Omega} E \cdot ds = \frac{1}{\epsilon_0} \iiint_{\Omega} \rho \, dV \qquad \text{Gauss's law}
\]
\[
\oiint_{\partial\Omega} B \cdot ds = 0 \qquad \text{Gauss's law for magnetism}
\]
\[
\oint_{\partial\Sigma} E \cdot dl = -i\omega \iint_{\Sigma} B \cdot ds \qquad \text{Faraday's law}
\]
\[
\oint_{\partial\Sigma} B \cdot dl = \mu_0 \iint_{\Sigma} J \cdot ds - \mu_0 \epsilon_0\, i\omega \iint_{\Sigma} E \cdot ds \qquad \text{Ampère-Maxwell law}
\]

where:

E: the electric field,
B: the magnetic field,
ρ: the electric charge density,
ε₀: the vacuum permittivity or electric constant,
ω: the frequency,
J: the electric current density,
Ω: a fixed volume with boundary surface ∂Ω,
Σ: a fixed open surface with boundary curve ∂Σ,
∮: denotes a closed line integral,
∯: denotes a closed surface integral.

Solving the discretized Maxwell's equations in the case of light scattering at gratings requires the solution of a linear system Ax = b, which is the most expensive step of the forward problem. The exact structure and characteristics of this linear system are further described in Chapter 4.

To solve the inverse problem, multiple instances of the forward problem have to be solved in the optimization process. This further motivates the search for an efficient solution of the forward problem in general and the resulting linear system in particular.
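To make the role of repeated forward solves concrete, the following is a minimal Gauss-Newton sketch with a finite-difference Jacobian, as noted above. The forward model I_model and the starting parameters p0 are hypothetical placeholders for the expensive scattering simulation I(p); this is not the implementation used in the thesis.

```python
import numpy as np

def gauss_newton(I_meas, I_model, p0, n_iter=20, eps=1e-6):
    """Minimal Gauss-Newton sketch for min_p ||I_meas - I_model(p)||_2.

    The Jacobian of the forward model is approximated by finite differences,
    i.e. one extra forward simulation per parameter.  `I_model` is a
    hypothetical stand-in for the scattering simulation I(p).
    """
    p = np.asarray(p0, dtype=float)
    for _ in range(n_iter):
        r = I_meas - I_model(p)                      # current residual
        J = np.empty((len(r), len(p)))
        for j in range(len(p)):                      # finite-difference Jacobian of I_model
            dp = np.zeros_like(p)
            dp[j] = eps
            J[:, j] = (I_model(p + dp) - I_model(p)) / eps
        # Gauss-Newton step: solve the linearized least-squares problem
        delta, *_ = np.linalg.lstsq(J, r, rcond=None)
        p = p + delta
    return p
```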

1.2 other applications

Besides the mentioned application, Toeplitz, Toeplitz-like and multilevel Toeplitz systems arise in a variety of mathematics, scientific computing and engineering applications (see [10, 41]).

Some of these applications arise from the fact that a discrete convolution can be written as a matrix-vector product (MVP) between a Toeplitz matrix and a vector:

Lemma 1.2.1: Discrete Convolution

Let h and x be two vectors of size m and n, respectively. The convolution h ∗ x can be computed by the MVP
\[
h * x =
\begin{pmatrix}
h_1 & 0 & \cdots & 0 \\
h_2 & h_1 & \ddots & \vdots \\
\vdots & h_2 & \ddots & 0 \\
h_m & \vdots & \ddots & h_1 \\
0 & h_m & \ddots & h_2 \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & h_m
\end{pmatrix}
\cdot
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n
\end{pmatrix},
\]
where the matrix has size (m + n − 1) × n and (h_1, …, h_m, 0, …, 0)^T as its first column.

The two-dimensional case will result in a two-level Toeplitz matrix, also called a BTTB matrix.
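As a quick numerical check of Lemma 1.2.1 (a sketch using NumPy/SciPy; the vectors h and x are arbitrary test data), the Toeplitz matrix-vector product reproduces a direct convolution:

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(1)
h, x = rng.standard_normal(5), rng.standard_normal(3)    # m = 5, n = 3
m, n = len(h), len(x)

# (m + n - 1) x n Toeplitz matrix with (h_1, ..., h_m, 0, ..., 0)^T as first column
col = np.concatenate([h, np.zeros(n - 1)])
row = np.concatenate([h[:1], np.zeros(n - 1)])
T = toeplitz(col, row)

assert np.allclose(T @ x, np.convolve(h, x))              # T x equals h * x
```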

The problem of deblurring images is an example of such an application that stems from a discrete convolution.

1.2.1 Deblurring Images

A model for a blurred image b is usually written:

Ax = b ,   (1.1)

where x is the original image, A is the blurring matrix and b is the blurred image (see [23]).

The blurring can be described by a point spread function (PSF), such as the one in Figure 1.7. A PSF describes the response of the optical system, e.g. the camera or lens, to a point source. Due to imperfections in the camera or lens system, the intensity of the point source will be spread over multiple pixels, and the image gets blurred.

Figure 1.7: Example for a PSF. A typical (Gaussian) point spread function (PSF) for a blurring problem.

The full blurring of an image with a given PSF is consequently described by a convolution of the image with the PSF. In the discrete case, this leads to the fact that the blurring matrix A has a two-level Toeplitz structure (in the two-dimensional case). Depending on the boundary conditions, Toeplitz-like matrices are also possible (such as block-circulant-circulant-block (BCCB) or matrices with a Hankel structure [23, VIP 9]). The result of such a blurring model can be seen in Figure 1.8.

(a) Original Image (b) Blurred Image

Figure 1.8: Blurring problem. Subfigure (b) is the result of a blurring with the PSF of Figure 1.7 (source of original image: http://sipi.usc.edu/database/database.php?volume=misc&image=12#top).

To extract the original (sharp) image, the BTTB system (1.1) has to be solved. For applications in the field of image processing and restoration and their relation to preconditioning, see also Kamm and Nagy [29], Koyuncu [32], Lin and Zhang [35].

1.2.2 Further Applications

Other applications that include a Toeplitz, Toeplitz-like or multi-level Toeplitz system occur in areas such as (see for example [10, 22, 23, 41, 45]):

• Numerical ordinary differential equations (ODEs) and partial differential equations (PDEs)
• Signal processing and filtering
• Control theory
• Stochastic automata and neural networks

LINEAR SYSTEMS 2

This chapter aims at introducing the basic definitions and methods regarding systems of linear equations. These concepts will be used in subsequent chapters.

A system of linear equations (or linear system) is given by

Ax = b ,   (2.1)

with the coefficient matrix A ∈ C^{n×n}, the right-hand side vector b ∈ C^n and the unknown solution x ∈ C^n. This system has a unique solution if the coefficient matrix A is invertible (also called nonsingular). Consequently, the solution x_* is then

\[
x_* = A^{-1} b .
\]

The solution can be calculated using direct methods such as Gaussian elimination. However, for larger systems this is computationally expensive and iterative solvers are preferred [2].

An implementation of the Gaussian elimination has a complexity of O(n³).

2.1 iterative solvers

Given an initial guess x_0, an iterative solver computes a sequence of approximations {x_k} of the true solution x_* until the residual r_k = A x_k − b satisfies ||r_k|| / ||b|| < tol.

2.1.1 CG Method

The CG method was proposed in the 1950s for symmetric and positive definite coefficient matrices A [25]. It is the most prominent iterative solver for large sparse systems [52] and the basis for many more advanced and specialized algorithms.

Solving (2.1) is equivalent to the following minimization problem (a proof for the equality can be found in [52]):

\[
\min_x \; \phi(x) \overset{\text{def}}{=} \frac{1}{2} x^T A x - b^T x \qquad (2.2)
\]

which means that if φ(x) becomes smaller with each iteration, we also get closer to the solution x_*.


φ(x) in (2.2) describes a quadratic function, where the exact form is described by the matrix A and the vector b (see Figure 2.1).


Figure 2.1: Minimization function. Example for a function φ(x) with the corresponding contour lines.

The gradient of the minimization problem is the residual of the linear system:
\[
\nabla \phi(x) = Ax - b = r(x) .
\]

The solution x_* of the minimization problem fulfills the necessary condition ∇φ(x_*) = 0 ⟺ A x_* = b.

The basic idea of the CG method is that the minimum of φ(x) can be found by taking steps in the direction of the negative gradient at the current point, i.e.
\[
x_{k+1} = x_k - \gamma \cdot \nabla \phi(x_k) = x_k - \gamma \cdot A x_k + \gamma \cdot b .
\]
The stepsize γ can be chosen to minimize φ(x_{k+1}) along the direction of the gradient.

In contrast to the gradient descent method (also known as the steepest descent method), the CG method does not directly use the gradient as the descent direction, but insists that each new descent direction is conjugate to all directions used before.

Two vectors p_i and p_k are conjugate (with respect to A) if p_i^T A p_k = 0.

Figure 2.2 compares the convergence of the gradient descent method with the CG method. It is important to note that the form of the function φ (and therefore the matrix A) plays a vital role in the convergence speed of the methods (as illustrated with Figure 2.2b).

(a) Contour plot for the same function φ as in Figure 2.1 with the convergence of the gradient descent (blue) and the CG method (green).
(b) Contour plot for a function φ resulting from a scaled identity matrix. Both methods would converge within a single step, regardless of the starting point.

Figure 2.2: Convergence of gradient descent and CG method for different functions φ. The closer the contours resemble circles, the faster the methods converge.
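For illustration, the following is a minimal, unpreconditioned CG sketch following the update rules described above; it is not the solver implementation used later in the benchmarks, and A is assumed Hermitian positive definite.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-8, max_iter=1000):
    """Minimal CG iteration for Hermitian positive definite A,
    stopping once ||r_k|| / ||b|| < tol."""
    x = np.zeros_like(b, dtype=np.result_type(b, 1.0)) if x0 is None else x0.copy()
    r = b - A @ x                       # initial residual (negative gradient of phi)
    p = r.copy()                        # first search direction
    rs = np.vdot(r, r)
    for _ in range(max_iter):
        if np.sqrt(abs(rs)) < tol * np.linalg.norm(b):
            break
        Ap = A @ p
        alpha = rs / np.vdot(p, Ap)     # step size minimizing phi along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r)
        p = r + (rs_new / rs) * p       # new direction, A-conjugate to the previous ones
        rs = rs_new
    return x
```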

2.1.1.1 Convergence Rate

Definition 2.1.1: Order of Convergence

Let a sequence {x_k} converge to x_*, and e_k = x_k − x_* be the error at step k. If there exists a constant C > 0 such that for a p > 0:

\[
\lim_{k \to \infty} \frac{\|e_{k+1}\|}{\|e_k\|^p} = C
\]

is satisfied, then we say the order of convergence of {x_k} is (at least) p. If p = 1 and C < 1 then the convergence rate is said to be linear.

The order of convergence allows a prediction of the number of iterations a solver needs until it reaches a satisfactory solution. A higher order of convergence is preferred, since it means less computation time.

A sequence {x_k} converges to x_* if $\lim_{k\to\infty} |x_k - x_*| = 0$.

The performance of the CG method for solving a linear system depends on the condition number κ(A) of its coefficient matrix A [41, 43]. For example:

Theorem 2.1.2: Convergence Rate of CG Method

If the coefficient matrix A of the linear system (2.1) is Hermitian positive definite with the condition number κ(A), then the CG method converges in the following way:

\[
\frac{\|e_k\|}{\|e_0\|} \leq 2 \left( \frac{\sqrt{\kappa(A)} - 1}{\sqrt{\kappa(A)} + 1} \right)^k
\]

Here e_k denotes the error vector e_k = x_k − x_*, where x_k is the k-th iterate of the CG method and x_* is the exact solution of the linear system.

This theorem implies linear convergence for the CG method. At the same time, the CG method converges faster for linear systems with a smaller condition number. More precise convergence rates can be stated if the distribution of the eigenvalues of A is known (see among others [41]). In the special case of clustered eigenvalues, we get:

Lemma 2.1.3: Convergence Rate for Clustered Spectrum

If the eigenvalues λk of A are such that

\[
0 < \delta \leq \lambda_1 \leq \dots \leq \lambda_i \leq 1 - \varepsilon \leq \lambda_{i+1} \leq \dots \leq \lambda_{n-j} \leq 1 + \varepsilon \leq \lambda_{n-j+1} \leq \dots \leq \lambda_n
\]

for a δ > 0 and 1 > ε > 0, then the CG method converges in the following way:

\[
\frac{\|e_k\|}{\|e_0\|} \leq 2 \left( \frac{1+\varepsilon}{\delta} \right)^i \varepsilon^{\,k-i-j}, \qquad k > i + j
\]

This implies that a matrix A with eigenvalues that are tightly clustered around 1 and away from 0, with only as few exceptions as possible, is desirable. This relates to a function φ with almost circular contours (as shown in Figure 2.2b).

2.1.2 Other Methods

Besides the basic CG method, three other iterative solvers are important to this work. All three do not impose any restriction on the coefficient matrix A, such as symmetry, and are therefore viable methods for solving the linear system described in Chapter 4.

In general there are no convergence rates known for these advanced methods. In Chapter 12 we will compare different preconditioners by the number of MVPs they need until convergence.

• The biconjugate gradient stabilized (BiCGSTAB) method is an improved version of the biconjugate gradient (BiCG) method, a generalization of the CG method for nonsymmetric coefficient matrices [53]. The method was described by van der Vorst [58].

• The induced dimension reduction (IDR) was invented by Wesseling and Sonneveld [59].

• The generalized minimal residual (GMRES) was proposed by Saad and Schultz [49]. The main drawback of this method is its relatively high storage requirement [3].

2.2 preconditioning

The goal of preconditioning is to transform the original linear system (i.e. (2.1)) into a different linear system that has the same solution x_*, but a better convergence rate.

This can be done by multiplying the whole linear system (2.1) with the inverse of a preconditioner P, thus getting a left-preconditioned system:

\[
P^{-1} A x = P^{-1} b \qquad (2.3)
\]

Since the linear system was multiplied by P^{-1} on both sides, the solution x_* is still the same.

The right-preconditioned system:

\[
A P^{-1} y = b \quad \text{with} \quad P^{-1} y = x \qquad (2.4)
\]

has the same solution as well, which can be seen easily.

In order to obtain a better convergence rate, the preconditioner P should be chosen so that the new linear system can be solved easily. This in general means that the preconditioned coefficient matrices (P^{-1}A for (2.3) and AP^{-1} for (2.4)) need to have a smaller condition number κ than the original linear system [24].

For CG-like methods the distribution of the eigenvalues of P^{-1}A or AP^{-1} is also highly important, because it influences the convergence rate (see Section 2.1.1.1).

Note that in practice it is not necessary to compute the matrix-matrix product P^{-1}A or AP^{-1}; instead, the iterative solvers use the preconditioner in each iteration in an MVP. This results in an extra MVP of a vector with P^{-1} compared to the original system. The additional cost of the extra MVP in each iteration needs to be compensated by a faster convergence, i.e. fewer iterations.
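As a small sketch of this, a preconditioner can be handed to a SciPy Krylov solver as a LinearOperator that only applies P^{-1} to a vector, so that P^{-1}A is never formed explicitly. The Jacobi (diagonal) preconditioner and the random test matrix below are purely illustrative assumptions, not the preconditioners studied in this thesis.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

rng = np.random.default_rng(0)
n = 200
A = np.diag(np.linspace(1.0, 100.0, n)) + 0.01 * rng.standard_normal((n, n))
b = np.ones(n)

d = np.diag(A)                                        # P = diag(A), a Jacobi preconditioner
M = LinearOperator((n, n), matvec=lambda v: v / d)    # action of P^{-1} on a vector

x_plain, info_plain = gmres(A, b)                     # without preconditioning
x_prec, info_prec = gmres(A, b, M=M)                  # preconditioned: one extra MVP per iteration
```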

Loosely speaking, if P is a good approximation of A the solvers will converge fast. If P = A, then the linear system can be solved in one step. However, the inversion of A in order to construct the preconditioner is equal to solving the problem and computationally expensive. Therefore, easily invertible approximations of A are interesting choices for P.

In conclusion, a perfect preconditioner should satisfy the following conditions:

C1 P should be a good approximation of A

C2 P^{-1} should be easy to compute

C3 P^{-1} should be easily applied to a vector, i.e. the MVP P^{-1}x should be cheap to compute

While C1 influences the total number of iterations needed, C2 determines the initial computation cost of the solver (the inverse is calculated exactly one time at the beginning). Additionally, C3 decides the computational cost of each iteration.

In general, the closer P approximates A, the more complex it is to compute its inverse and anMVP with it. Figure 2.3 illustrates this trade off, between a heavily preconditioned system (close to P = A) and a system without preconditioning (P = I). The blue line P = I P = A symbolizes the number of iterations needed to solve the system, the green line the time needed in each iteration and the black line is the Figure 2.3: product of both and represents the total time needed. The optimal Preconditioner preconditioner (marked with the dashed red line) that minimizes the trade off. total complexity of solving the system, depends on the complexity of the problem. TOEPLITZSYSTEMS 3 Definition 3.0.1: Toeplitz Matrix

A matrix $T^{(n)} \in \mathbb{C}^{n \times n}$ with constant entries along each diagonal, i.e. of the form

\[
T^{(n)} =
\begin{pmatrix}
t_0 & t_{-1} & \cdots & t_{-n+2} & t_{-n+1} \\
t_1 & t_0 & t_{-1} & \ddots & t_{-n+2} \\
\vdots & t_1 & t_0 & \ddots & \vdots \\
t_{n-2} & \ddots & \ddots & \ddots & t_{-1} \\
t_{n-1} & t_{n-2} & \cdots & t_1 & t_0
\end{pmatrix},
\]

is called a Toeplitz matrix.

Each entry t_{i,j} of a Toeplitz matrix depends only on the difference of its indices i and j:

\[
t_{i,j} = t_{i+1,j+1} =: t_{i-j} = t_k , \qquad k = -n+1, \dots, 0, \dots, n-1 .
\]

In contrast to general n × n matrices, a Toeplitz matrix is well defined by only 2n − 1 (rather than n²) entries t_k.

Example 3.0.2: Toeplitz matrix

\[
T^{(4)} =
\begin{pmatrix}
9 & 1 & 3 & 6 \\
2 & 9 & 1 & 3 \\
4 & 2 & 9 & 1 \\
1 & 4 & 2 & 9
\end{pmatrix}
\]

is a Toeplitz matrix.

Definition 3.0.3: Toeplitz System

A Toeplitz system is a linear system

T x = b ,

where T is a Toeplitz matrix as defined in definition 3.0.1.

We can interpret $T^{(n)}$ as a principal submatrix of an ∞ × ∞ matrix $T^{(\infty)}$. A principal submatrix is obtained by removing rows and columns with the same indices from a larger matrix.

\[
T^{(\infty)} =
\begin{pmatrix}
\ddots & \vdots & \vdots & \vdots & \vdots & \vdots & \\
\cdots & a_0 & a_{-1} & a_{-2} & a_{-3} & a_{-4} & \cdots \\
\cdots & a_1 & a_0 & a_{-1} & a_{-2} & a_{-3} & \cdots \\
\cdots & a_2 & a_1 & a_0 & a_{-1} & a_{-2} & \cdots \\
\cdots & a_3 & a_2 & a_1 & a_0 & a_{-1} & \cdots \\
\cdots & a_4 & a_3 & a_2 & a_1 & a_0 & \cdots \\
 & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\]

We can further assume that the diagonal coefficients $\{t_k\}_{k=-\infty}^{\infty}$ of this $T^{(\infty)}$ matrix are the Fourier coefficients of a function f(x):

\[
t_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-ikx} \, dx \qquad (3.1)
\]

hence:

\[
f(x) = \sum_{k=-\infty}^{\infty} t_k\, e^{ikx} \qquad (3.2)
\]

We call f(x) the generating function of $T^{(\infty)}$, as well as of any principal submatrix $T^{(n)}$ (of size n × n). In the same way, $T^{(n)}[f]$ is the Toeplitz matrix $T^{(n)} \in \mathbb{C}^{n \times n}$ induced by the generating function f(x).

In many practical problems, the generating function is usually given first, not the corresponding Toeplitz matrices [10, 41]. This is for example true for:

• Numerical differential equations, where the equation gives f
• Filter design, where the transfer function gives f
• Image restoration, where the blurring function gives f

In the main application of this thesis (see Section 1.1) this is also true. The coefficient matrix is a result of the grating's geometry and the refractive index of the grating's materials (see also Chapter 4).
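For illustration, the sketch below builds T^{(n)}[f] from a given 2π-periodic generating function by approximating the Fourier coefficients (3.1) with the trapezoidal rule on a uniform grid (i.e. an FFT of samples); the sample count and the test function f(x) = 2 − 2 cos(x) are arbitrary choices.

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_from_generating_function(f, n, n_samples=4096):
    """Build T[f] of size n x n, approximating t_k = (1/2pi) int f(x) e^{-ikx} dx
    by an FFT of equispaced samples of f on [0, 2pi)."""
    x = 2.0 * np.pi * np.arange(n_samples) / n_samples
    t = np.fft.fft(f(x)) / n_samples             # t[k] ~ t_k and t[-k] ~ t_{-k}
    col = t[:n]                                   # t_0, t_1, ..., t_{n-1}
    row = np.concatenate([t[:1], t[-1:-n:-1]])    # t_0, t_{-1}, ..., t_{-(n-1)}
    return toeplitz(col, row)

# f(x) = 2 - 2 cos(x) generates the tridiagonal Toeplitz matrix with stencil (-1, 2, -1)
T = toeplitz_from_generating_function(lambda x: 2.0 - 2.0 * np.cos(x), 5).real
```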

One of the special properties of Toeplitz matrices is that we can compute an MVP with T^{(n)} in only O(2n log 2n) operations, using the Fast Fourier transform (FFT). This works by embedding the Toeplitz matrix in a circulant matrix of twice the size, and then computing the MVP (see Section 8.1 for how this is done).
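A minimal sketch of this embedding (assuming T is given by its first column and first row; the check at the end compares against a dense product):

```python
import numpy as np
from scipy.linalg import toeplitz

def toeplitz_matvec(col, row, x):
    """Compute T x in O(n log n) for T = toeplitz(col, row) by embedding T into
    a circulant matrix of size 2n - 1 and multiplying via the FFT."""
    n = len(x)
    c = np.concatenate([col, row[:0:-1]])         # first column of the circulant embedding
    v = np.concatenate([x, np.zeros(n - 1)])      # zero-padded input vector
    y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(v))
    return y[:n]                                   # take .real for real-valued data

rng = np.random.default_rng(0)
n = 64
col, row, x = rng.standard_normal(n), rng.standard_normal(n), rng.standard_normal(n)
row[0] = col[0]
assert np.allclose(toeplitz_matvec(col, row, x).real, toeplitz(col, row) @ x)
```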

Another important aspect of a Toeplitz matrix is that while its inverse is not Toeplitz, it can be factorized by Toeplitz matrices, according to the Gohberg–Semencul formula [20].

Theorem 3.0.4: Gohberg–Semencul Formula

If the Toeplitz matrix T ∈ Rn×n is such that each of the systems of equations

T x = e1

T y = en

is solvable and the condition x_1 ≠ 0 is fulfilled, then the matrix T is invertible, and its inverse is formed according to the formula

\[
T^{-1} = x_1^{-1} \left( \mathrm{Lower}(x)\,\mathrm{Lower}(Jy)^T - \mathrm{Lower}(Z_0 y)\,\mathrm{Lower}(Z_0 J x)^T \right)
\]

where Lower(x) denotes a lower triangular Toeplitz matrix with x as the first column, J is the anti-identity matrix with ones on the anti-diagonal and zeros everywhere else, and Z_0 = Lower(e_2).
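The formula can be checked numerically with a short sketch (using dense solves for T x = e_1 and T y = e_n; the diagonally dominant random Toeplitz test matrix is an arbitrary choice):

```python
import numpy as np
from scipy.linalg import toeplitz, solve

def lower_toeplitz(v):
    """Lower triangular Toeplitz matrix with v as its first column."""
    return toeplitz(v, np.zeros_like(v))

def gohberg_semencul_inverse(T):
    """Inverse of a Toeplitz matrix T via the Gohberg-Semencul formula,
    assuming T x = e_1 and T y = e_n are solvable and x_1 != 0."""
    n = T.shape[0]
    x = solve(T, np.eye(n)[:, 0])
    y = solve(T, np.eye(n)[:, -1])
    J = np.fliplr(np.eye(n))                     # anti-identity matrix
    Z0 = lower_toeplitz(np.eye(n)[:, 1])         # down-shift matrix Lower(e_2)
    term1 = lower_toeplitz(x) @ lower_toeplitz(J @ y).T
    term2 = lower_toeplitz(Z0 @ y) @ lower_toeplitz(Z0 @ J @ x).T
    return (term1 - term2) / x[0]

rng = np.random.default_rng(0)
n = 6
col, row = rng.standard_normal(n), rng.standard_normal(n)
col[0] = row[0] = n                              # make T diagonally dominant
T = toeplitz(col, row)
assert np.allclose(gohberg_semencul_inverse(T), np.linalg.inv(T))
```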

3.1 multi-level toeplitz matrices

We can also define matrices that possess a Toeplitz structure on one or more levels of a matrix, therefore defining multi-level Toeplitz matrices. We first define:

Definition 3.1.1: Toeplitz-Block Matrix

A block matrix $T^{(m;n)}_{(TB)} \in \mathbb{C}^{m\cdot n \times m\cdot n}$, where each of the m × m blocks is an n × n Toeplitz matrix, is called a Toeplitz-block matrix. It has the form:

\[
T^{(m;n)}_{(TB)} =
\begin{pmatrix}
T_{1,1;} & T_{1,2;} & \cdots & T_{1,m;} \\
T_{2,1;} & T_{2,2;} & \cdots & T_{2,m;} \\
\vdots & \vdots & \ddots & \vdots \\
T_{m,1;} & T_{m,2;} & \cdots & T_{m,m;}
\end{pmatrix},
\qquad \text{where } T_{i,j;} \text{ is Toeplitz,}
\]

not to be confused with:

Definition 3.1.2: Block-Toeplitz Matrix

A block matrix $T^{(m;n)}_{(BT)} \in \mathbb{C}^{m\cdot n \times m\cdot n}$ of the form

\[
T^{(m;n)}_{(BT)} =
\begin{pmatrix}
A_{0;} & A_{-1;} & \cdots & A_{1-m;} \\
A_{1;} & A_{0;} & \cdots & A_{2-m;} \\
\vdots & \vdots & \ddots & \vdots \\
A_{m-1;} & A_{m-2;} & \cdots & A_{0;}
\end{pmatrix},
\]

where $A_{k;}$ is arbitrary.

A combination of the definitions 3.1.1 and 3.1.2 is:

Definition 3.1.3: BTTB Matrix

A block-Toeplitz matrix, where the blocks $T_{k;}$ are themselves Toeplitz matrices, is called a block-Toeplitz-Toeplitz-block (BTTB) matrix:

\[
T^{(m;n)}_{(BTTB)} =
\begin{pmatrix}
T_{0;} & T_{-1;} & \cdots & T_{1-m;} \\
T_{1;} & T_{0;} & \cdots & T_{2-m;} \\
\vdots & \vdots & \ddots & \vdots \\
T_{m-1;} & T_{m-2;} & \cdots & T_{0;}
\end{pmatrix},
\]

where $T_{k;}$ is Toeplitz. A BTTB matrix is sometimes also called a two-level Toeplitz matrix.

Example 3.1.4: BTTB matrix

\[
T^{(4;3)}_{(BTTB)} =
\begin{pmatrix}
4 & 2 & 7 & 8 & 6 & 5 & 1 & 0 & 6 & 9 & 2 & 1 \\
6 & 4 & 2 & 3 & 8 & 6 & 3 & 1 & 0 & 3 & 9 & 2 \\
8 & 6 & 4 & 8 & 3 & 8 & 6 & 3 & 1 & 4 & 3 & 9 \\
7 & 3 & 1 & 4 & 2 & 7 & 8 & 6 & 5 & 1 & 0 & 6 \\
1 & 7 & 3 & 6 & 4 & 2 & 3 & 8 & 6 & 3 & 1 & 0 \\
9 & 1 & 7 & 8 & 6 & 4 & 8 & 3 & 8 & 6 & 3 & 1 \\
6 & 0 & 9 & 7 & 3 & 1 & 4 & 2 & 7 & 8 & 6 & 5 \\
3 & 6 & 0 & 1 & 7 & 3 & 6 & 4 & 2 & 3 & 8 & 6 \\
0 & 3 & 6 & 9 & 1 & 7 & 8 & 6 & 4 & 8 & 3 & 8 \\
4 & 3 & 2 & 6 & 0 & 9 & 7 & 3 & 1 & 4 & 2 & 7 \\
2 & 4 & 3 & 3 & 6 & 0 & 1 & 7 & 3 & 6 & 4 & 2 \\
3 & 2 & 4 & 0 & 3 & 6 & 9 & 1 & 7 & 8 & 6 & 4
\end{pmatrix}
\]

A BTTB matrix corresponds to a generating function with two variables, i.e. the elements of a BTTB matrix are the Fourier coefficients of a bivariate function:

\[
t_{k;l} = \left(\frac{1}{2\pi}\right)^2 \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} f(x, y)\, e^{-i(kx + ly)} \, dx \, dy
\]

hence:

\[
f(x, y) = \sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} t_{k;l}\, e^{i(kx + ly)}
\]

instead of (3.1) and (3.2), respectively.
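As a small sketch of this two-level structure (the array layout and sizes are illustrative assumptions), a dense BTTB matrix can be assembled directly from a (2m−1) × (2n−1) array of coefficients t_{k;l}:

```python
import numpy as np

def bttb(t):
    """Assemble the dense BTTB matrix with entries A[i,j; p,q] = t_{(i-j);(p-q)}
    from a (2m-1) x (2n-1) coefficient array with t[m-1+k, n-1+l] = t_{k;l}."""
    m = (t.shape[0] + 1) // 2
    n = (t.shape[1] + 1) // 2
    A = np.empty((m * n, m * n), dtype=t.dtype)
    for i in range(m):
        for j in range(m):
            # block (i, j) is the Toeplitz matrix generated by t_{(i-j); .}
            block = np.empty((n, n), dtype=t.dtype)
            for p in range(n):
                for q in range(n):
                    block[p, q] = t[m - 1 + i - j, n - 1 + p - q]
            A[i * n:(i + 1) * n, j * n:(j + 1) * n] = block
    return A

# example: 4 x 4 blocks of size 3 x 3, built from random integer coefficients
rng = np.random.default_rng(0)
A = bttb(rng.integers(0, 10, size=(7, 5)))
```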

3.2 circulant matrices

A special kind of Toeplitz matrix is the circulant matrix, where each column is a circular shift of its preceding column:

\[
C =
\begin{pmatrix}
c_0 & c_1 & \cdots & c_{n-2} & c_{n-1} \\
c_{n-1} & c_0 & c_1 & \ddots & c_{n-2} \\
\vdots & c_{n-1} & c_0 & \ddots & \vdots \\
c_2 & \ddots & \ddots & \ddots & c_1 \\
c_1 & c_2 & \cdots & c_{n-1} & c_0
\end{pmatrix}
\]

Block-circulant, circulant-block and BCCB matrices can be defined similarly to Definitions 3.1.1, 3.1.2 and 3.1.3.

A circulant matrix is fully defined by just n coefficients. In Chapter 8 further useful properties are described and used.

3.3 hankel

A related matrix type is the Hankel matrix. It is basically an upside-down Toeplitz matrix, which means that it has constant values along its anti-diagonals:

\[
H =
\begin{pmatrix}
h_{-n+1} & h_{-n+2} & \cdots & h_{-1} & h_0 \\
h_{-n+2} & \cdots & h_{-1} & h_0 & h_1 \\
\vdots & \cdots & h_0 & h_1 & \vdots \\
h_{-1} & h_0 & h_1 & \cdots & h_{n-2} \\
h_0 & h_1 & \cdots & h_{n-2} & h_{n-1}
\end{pmatrix}
\]

We can again define block-Hankel, Hankel-block and block-Hankel-Hankel-block (BHHB) matrices and even combinations from all the matrix types above, such as a block-Hankel-Toeplitz-block (BHTB) matrix.

Analogously to Toeplitz matrices, a Hankel matrix is described by 2n − 1 coefficients.

PROBLEM DESCRIPTION 4

This chapter describes the full problem arising from the motivation illustrated in Chapter 1. Then two closely related and reduced problems are derived that will be the main topic of this work.

4.1 full problem

As described in Section 1.1.1, a numerical solution of Maxwell’s equations describing light scattering on a 2D-periodic structure (the gratings) requires the solution of the linear system

Ax = b ,   (4.1)

with A ∈ Cn×n the coefficient matrix, b ∈ Cn the right-hand side and x ∈ Cn the unknown solution.

It is known that A can be decomposed via:

A = C − GM ,

where C, G, M ∈ C^{n×n} are sparse matrices. Since the sparsity patterns of G and M are complementary, A is a dense matrix (see Figure 4.1). Note that the matrix M has an identical structure as C.

In a sparse matrix most elements are zero. If most elements are non-zero, the matrix is called dense.


Figure 4.1: Sparsity patterns of the matrices C, G and M as well as the resulting matrix A.


The sparsity pattern of C is shown in more detail in Figure 4.2.


Figure 4.2: Sparsity pattern of C.

Each block $C_{l;}$ of C is of the form

\[
C_{l;} =
\begin{pmatrix}
C_{l;1,1;} & C_{l;1,2;} & C_{l;1,3;} \\
C_{l;2,1;} & C_{l;2,2;} & C_{l;2,3;} \\
C_{l;3,1;} & C_{l;3,2;} & C_{l;3,3;}
\end{pmatrix}, \qquad (4.2)
\]

where each block $C_{l;i,j;}$ is a BTTB matrix with the following generating function $F_{C_{l;i,j}}$:

\[
F_{C_{l;i,j}}(x, y) = n_i(x, y)\, n_j(x, y) \left( \frac{b}{\epsilon(x, y)} - 1 \right) + \delta_{ij} \qquad \text{for } i, j \in \{x, y, z\},
\]

where ε(x, y) are the piecewise constant material properties and n_i(x, y), n_j(x, y) are the normal-vector fields, which are piecewise constant for polygonal shapes. A matrix of the form of $C_{l;}$ can be called a BTTB-block matrix.

Additionally, each $C_{l;}$ is symmetric on the block level, i.e. $C_{l;i,j;} = C_{l;j,i;}$.

The structure of matrix C on each level is shown in color plots in Figure 4.3 and is summarized in Table 4.1.

Figure 4.3: Color plots of all levels of C. The image shows color plots of each level of C. On the top level, the block-diagonal structure can be seen easily. The next image shows the 3 × 3 symmetry that the matrix possesses on the next level. The bottom two levels each possess a Toeplitz structure, which is clearly visible in the color plots.

Table 4.1: Structure of C on each level, from highest (level Z) to lowest (level X).

Level Structure

Z   Diagonal
A   3 × 3 Symmetric
Y   Toeplitz
X   Toeplitz

Because C is sparse and an approximation of A, C^{-1} is a good candidate for a preconditioner. In fact, preliminary investigations have shown that by choosing C^{-1} as a preconditioner, the number of iterations of the solver can be drastically reduced in comparison to choosing no preconditioner at all (which is equivalent to choosing the identity I as a preconditioner), see Table 4.2.

However, computing the inverse C^{-1} as well as an MVP with it is quite expensive. Since the inverse of a BTTB matrix is not BTTB, the MVP with C^{-1} has a cost of O(n²). This is computationally expensive in comparison to the cost of an MVP of the unpreconditioned system (here it is an MVP with C and therefore a cost of O(n log n)).

Therefore, approximations of C with a cheaper inverse and MVP are wanted. Nevertheless, using C without any changes is still a possible preconditioner and an option for harder problems, see Section 2.2.

Table 4.2: Convergence rates (number of iterations) for (4.1) using IDR(6).

Case      P = I      P = C
Case 1a   732        225
Case 1b   297        102
Case 2a   > 99998    381
Case 2b   > 99998    3288
Case 3a   275        130
Case 3b   1817       212
Case 4    85135      310


So the full problem P1 can be defined as follows:

P1 Find a good preconditioner (fulfilling C1 to C3) for a system Cx = b, with C as defined in (4.2).

Note that our full problem only uses C as the coefficient matrix. In Section 11.4, we will discuss possible ways to include G and M into the preconditioner, for a better adaptation to the complete matrix A.

4.2 bttb-block system

Using C as a preconditioner requires computing the inverse C^{-1}. Since C is a block diagonal matrix, the inverse is given by

\[
C^{-1} =
\begin{pmatrix}
C_{1;}^{-1} & 0 & \cdots & \cdots & 0 \\
0 & C_{2;}^{-1} & \ddots & & \vdots \\
\vdots & \ddots & C_{3;}^{-1} & \ddots & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
0 & \cdots & \cdots & 0 & C_{N_z;}^{-1}
\end{pmatrix}.
\]

This means that if we solve the following reduced problem P2, we will also solve the full problem.

P2 Find a good preconditioner (fulfilling C1 to C3) for a system

\[
Dx = b ,
\]

where

\[
D =
\begin{pmatrix}
D_{1,1;} & D_{1,2;} & D_{1,3;} \\
D_{2,1;} & D_{2,2;} & D_{2,3;} \\
D_{3,1;} & D_{3,2;} & D_{3,3;}
\end{pmatrix}
\]

and each $D_{i,j;} \in \mathbb{C}^{n \times n}$ is a BTTB matrix.

Furthermore, the matrix D is symmetric on the block level, i. e.

Di,j; = Dj,i; .

A solution to this reduced problem P2 can be used for other applications, such as the one described in Chapter 1.

4.3 bttb system

A simplification of the previous problem P2 is the following:

P3 Find a good preconditioner (fulfilling C1 to C3) for a system

\[
Ex = b ,
\]

where $E \in \mathbb{C}^{n \times n}$ is a BTTB matrix with a piecewise constant generating function.

Solving this problem can be an important step on the path to solving the full problem P1 and can be used for different applications, some of which are mentioned in Section 1.2.

THESIS OVERVIEW 5

Chapter 1 introduces the main application which motivates the research subject of this thesis. The metrology of integrated circuits (ICs) requires the solution of a high-dimensional dense linear system. The time needed to solve such a system with an iterative solver can be reduced by applying a preconditioner to the linear system. Besides this main application, further applications of this thesis are described, such as the deblurring of images.

Chapter 2 gives an introduction into the mathematics of linear systems, iterative solvers and the field of preconditioning. Additionally, three conditions for a good preconditioner are derived.

In Chapter 3, the special structure of Toeplitz systems is described. The concept of an associated generating function is explained and special properties of Toeplitz systems are mentioned.

Chapter 4 focuses on the mathematical description of the main application mentioned in the first chapter. The special structure of the linear system is illustrated. Two linear systems with a simpler structure are introduced that will be considered as intermediate problems.

After this short outline of the thesis and its content, Chapter 6 will start off the part about different preconditioning techniques. In this chapter, a quick overview of the preconditioners described in the following is given.

The first preconditioner is described in Chapter 7. The complete matrix C is considered as a preconditioner and evaluated.

In Chapter 8, different circulant approximations of the Toeplitz structures are considered. This approximation is described in the context of Toeplitz, multi-level Toeplitz and Toeplitz-block matrices.

Chapter 9 considers T[1/f] as a preconditioner for T[f]. This preconditioner is then generalized to the Toeplitz-block and the BTTB-block case. A proof is provided that if F is HPD, the spectrum of T[F^{-1}]T[F] is clustered. The chapter closes by proposing a regularization for cases where F is not HPD.


Chapter 10 describes the Kronecker product approximation that approximates a matrix by a sum of Kronecker products. Several options to adapt this method to the BTTB-block case are suggested. Additionally, the relationship between this method and the generating function is illustrated.

Chapter 11 collects several additional preconditioning ideas and describes them briefly.

Chapter 12 compares the performance of the suggested preconditioners for several test cases by comparing the required number of iterations until convergence.

Suggestions for future investigations are presented in Chapter 13. This includes aspects of different topics that could not be analyzed in the limited time available for this thesis.

Finally, Chapter 14 summarizes the main results of this work and presents conclusions.

Part II

PRECONDITIONERS

The following part describes various techniques for approximating (multi-level) Toeplitz matrices. Each approximation method will be explained and discussed in terms of their applicability for the presented case. If necessary, changes and generalizations are made to adapt it to the main application of this work. They are explained along with an examination of the complexities of each preconditioner, in terms of inversion and MVP with its inverse.

OVERVIEW OVER THE PRECONDITIONING TECHNIQUES 6

Table 6.1 illustrates the different preconditioning techniques and if they can be applied to a simple Toeplitz matrix, aBTTB matrix and a BTTB-block matrix. The table shows the applicability of different pre- conditioning methods for Toeplitz,BTTB andBTTB-block systems. A black check mark denotes this method has been used in literature before, while the symbol of a light bulb denotes the application of this method was done in this work. A cross marks means that dur- ing this work no way of applying this method for this case could be found or was not considered.

Table 6.1: Applicability of different preconditioning methods. Preconditioning Toeplitz matrixBTTB matrixBTTB-block matrix Technique Circulant    DST I - IV    DCT I - IV    Hartley    Diagonal    Banded with bandwidth > 2    Inverse Generating Function   Kronecker s = 1 -  s > 2 -   with approximate SVD   Koyuncu -  

35 36 overview over the preconditioning techniques

To simplify the notation of the subsequent chapters, T(Toep) will de- note a simple Toeplitz matrix of size Nx × Nx. T(BTTB) will denote aBTTB matrix, with Ny × Ny blocks, each of size Nx × Nx. Addi- tionally, T(Block) denotes a 3 × 3 block matrix, whose blocks areBTTB matrices, with Ny × Ny blocks, each of size Nx × Nx. FULLCPRECONDITIONER 7 The first and most obvious choice is to use the complete matrix C as a preconditioner (P = C). Compared to the subsequent precondition- ers, that are based on C but are approximations of it, this choice is the best in terms of reducing iterations.

However, each iteration is computationally expensive, as is the in- version of C. Nevertheless, using the complete matrix C can be a good choice for hard problems and is additionally an interesting ref- erence point for the subsequent preconditioners.

7.1 application to full problem

Since C is used as a full matrix, without any changes, this approach is directly usable to the full problem.

7.1.1 Inversion

An exact inversion of aBTTB-block matrix can done using the G aus- 3  sian elimination algorithm. However, the complexity is O (NxNy) Nz . So far, no exact inversion formulas with smaller complexity are known forBTTB-block matrices.

7.1.2 MVP

Since the inverse of aBTTB matrix is notBTTB, C−1 does not pos- sess any structure that can be used for an optimizedMVP. This −1 2  means that the complexity of theMVP with C is O (NxNy) Nz

37

CIRCULANT APPROXIMATION 8

8.1 circulant approximation for toeplitz matrices

It is known, that a Toeplitz matrix T can be approximated well by a This relates to circulant matrix C [10, 12, 14, 54, 57]. conditionC 1 of a good preconditioner. Additionally, it is well known [10, 17] that any circulant matrix C ∈ Cn×n can be diagonalized, such that

 H C = F (n) Λ(n)F (n) ,(8.1) where F (n) is the Fourier matrix of order n, i. e.

 (n) 1 2πijk F = √ e n j,k n

(n) (n) and Λ = diag(F c) , where c is the first column of C, is a diag- diag(A) denotes a onal matrix holding the eigenvalues of C. diagonal matrix, were the diagonal is equal to the diagonal Via this decomposition it is easy to compute the inverse of a circu- of A. lant matrix as follows:

−1 −1 C−1 = F HΛF  = F −1Λ−1 F H = F HΛ−1F .

Therefore, the inversion of an n × n circulant matrix can be done This relates to efficiently in O (n log n). conditionC 2 of the In the same way, theMVP can be computed in O (n log n): preconditioner. This satisfies C−1x = F HΛ−1F x = F Hdiag(F c)−1F x . conditionC 3. Since most FFT The mentioned properties of circulant matrices make them a suit- algorithms rely upon able choice as preconditioners. There are different methods of approx- the factorization of imating a Toeplitz matrix with a circulant one, that will be described n, the complexity increases, if n is in the following section. prime. However, specialized algorithms have been developed that guarantee a complexity of O (n log n) even when n is prime, e.g., [47]. 39 40 circulant approximation

8.1.1 Circulant Preconditioners

In this work, we will look at three different circulant preconditioners, each minimizing a certain norm:

•S trang’s preconditioner CS(T ) [54]: Minimizing kC − T k1 for C Hermitian and circulant, if T is Hermitian.

• T. Chan’s (optimal) preconditioner CC(T ) [14]: Minimizing ||C − T ||F.

•T yrtyshnikov’s (superoptimal) preconditioner CT (T )[57]: Min- −1 imizing ||I − C T ||F.

Each preconditioner will be discussed in the following sections.

8.1.1.1 Strang’s Preconditioner

Strang’s preconditioner [54] uses basically the first half of the first row of the original Hermitian Toeplitz matrix T and completes the rest of the first row with a flipped version of it, to make it circular. In other words:

h bn/2c+1 n−k+1 i CS(T ) = circ (T1j)j=1 transpose(flip((Ti1)i=2 )) ,

where CS(T ) denotes Strang’s preconditioner for the Hermitian Toeplitz matrix T , and circ(x) the circulant matrix, defined by x as the first row.

Example 8.1.1: Strang’s preconditioner

  9 1 3 6   2 9 1 3 Take T from Example 3.0.2, T =  , then   4 2 9 1 1 4 2 9

  9 1 3 2     2 9 1 3 CS(T ) = circ( 9 1 3 2 ) =   .   3 2 9 1 1 3 2 9

Incidentally, C_S(T) minimizes ||C − T||_1 and ||C − T||_∞ over all Hermitian circulant matrices C, for Hermitian matrices T [12]. It can be shown that C_S(T)^{-1} T has a clustered spectrum (see for example Chan and Jin [10]), thus resulting in fast convergence of the preconditioned CG method.

8.1.1.2 T. Chan’s Optimal Preconditioner

Another very popular choice is Chan's optimal preconditioner C_C [14], which minimizes ||A − C||_F over all circulant matrices C for an arbitrary matrix A. Chan's preconditioner is sometimes denoted c_F(T), highlighting its relation to the discrete Fourier transform.

Using the decomposition from (8.1), we get

||A − C||_F = ||A − F^H Λ F||_F = ||F A F^H − Λ||_F ,

which is minimized for Λ = diag(F A F^H) and therefore

C_C(A) = F^H diag(F A F^H) F .

If A is a Toeplitz matrix, the entries of the first row of C_C(A) can be given explicitly by

c_i = \frac{i\, a_{i-n} + (n − i)\, a_i}{n} ,   i = 0, ..., n − 1 ,

which is equal to averaging the corresponding diagonals of A [10, Eq. (2.6)].

Important properties of C_C(A) are that it inherits the positive definiteness of A and that again the spectrum of C_C(T)^{-1} T is clustered [10, 57].

Example 8.1.2: Chan’s (optimal) preconditioner

  9 1 3 6   2 9 1 3 Take again T from Example 3.0.2, T =  , then   4 2 9 1 1 4 2 9

  9 1 3.5 3    3 9 1 3.5 CC(T ) =   .   3.5 3 9 1  1 3.5 3 9

8.1.1.3 Tyrtyshnikov’s Superoptimal Preconditioner

−1 Similar to Chan’s preconditioner, we can try to minimize ||I − C A||F over all circulant matrices C for an arbitrary matrix A. 42 circulant approximation

It can be shown [10], that such a preconditioner is related to Chan’s preconditioner by

H H −1 CT (A) = CC(AA )CC(A ) .

Example 8.1.3: Tyrtyshnikov’s (superoptimal) preconditioner

  9 1 3 6   2 9 1 3 Take again T from Example 3.0.2, T =  , then   4 2 9 1 1 4 2 9

  9.42 0.8142 3.3981 3.0040   3.0040 9.42 0.8142 3.3981 CT (T ) =   .   3.3981 3.0040 9.42 0.8142 0.8142 3.3981 3.0040 9.42

−1 Similar to the previous preconditioners, CT (T ) T proves to have a clustered spectrum [10].
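For completeness, here is a dense sketch that builds Chan's approximation from its definition and combines two of them into Tyrtyshnikov's preconditioner. It is purely illustrative (not the thesis implementation) and only sensible for small n.

import numpy as np

def chan_optimal(A):
    """Dense T. Chan approximation C_C(A) = F^H diag(F A F^H) F,
    with F the unitary DFT matrix."""
    n = A.shape[0]
    F = np.fft.fft(np.eye(n)) / np.sqrt(n)
    lam = np.diag(F @ A @ F.conj().T)
    return F.conj().T @ np.diag(lam) @ F

def tyrtyshnikov(A):
    """Superoptimal preconditioner C_T(A) = C_C(A A^H) C_C(A^H)^{-1}."""
    return chan_optimal(A @ A.conj().T) @ np.linalg.inv(chan_optimal(A.conj().T))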

8.2 circulant approximation for bttb matrices

The goal of this section is to generalize the method of circulant approximations to BTTB matrices. Each level will be handled separately, starting with the upper level, which is equivalent to a block-Toeplitz matrix.

A BTTB matrix can then be approximated with a BCCB matrix by applying the Toeplitz-block and the block-Toeplitz approximations consecutively. Chan and Jin [9] showed that approximating both levels separately with Chan's optimal preconditioner is equivalent to solving

\min_{C_{(BCCB)}} \| T_{(BTTB)} − C_{(BCCB)} \|_F ,

if C_{(BCCB)} is BCCB.

8.2.1 Toeplitz-block Matrices

A Toeplitz-block matrix is of the form

T_{(TB)}^{(m;n)} = \begin{pmatrix} T_{1,1} & T_{1,2} & \dots & T_{1,m} \\ T_{2,1} & T_{2,2} & \dots & T_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ T_{m,1} & T_{m,2} & \dots & T_{m,m} \end{pmatrix} ,  where T_{i,j} is Toeplitz, i, j = 1, 2, ..., m.

Therefore, a natural choice is to approximate each Toeplitz matrix T_{i,j} with its circulant approximation C(T_{i,j}) [15].

The resulting matrix is of the form

C(T_{(TB)}^{(m;n)}) = \begin{pmatrix} C(T_{1,1}) & C(T_{1,2}) & \dots & C(T_{1,m}) \\ C(T_{2,1}) & C(T_{2,2}) & \dots & C(T_{2,m}) \\ \vdots & \vdots & \ddots & \vdots \\ C(T_{m,1}) & C(T_{m,2}) & \dots & C(T_{m,m}) \end{pmatrix} .

Figure 8.1 shows an example of a Toeplitz-block matrix and its approximation with a circulant-block matrix.

(a) Color plot for a sample Toeplitz-block matrix, with 6 × 6 blocks each of size 4 × 4. The present Toeplitz structure is clearly visible. (b) Color plot for the circulant approximation of the matrix in (a). The Toeplitz structure of each block has been approximated by a circulant one.

Figure 8.1: Color plots for a Toeplitz-block matrix and its circulant-block approximation.
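Such a block-wise approximation is straightforward to implement. The sketch below (illustrative only, not from the thesis) loops over the blocks of a dense Toeplitz-block matrix and replaces each one with its circulant approximation, e.g. chan_optimal from the earlier sketch.

import numpy as np

def circulant_block_approx(T, m, n, circ_approx):
    """Replace every n x n block of an (m*n) x (m*n) Toeplitz-block matrix T
    by its circulant approximation circ_approx(block)."""
    C = np.zeros(T.shape, dtype=complex)
    for i in range(m):
        for j in range(m):
            block = T[i*n:(i+1)*n, j*n:(j+1)*n]
            C[i*n:(i+1)*n, j*n:(j+1)*n] = circ_approx(block)
    return C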

8.2.2 Block-Toeplitz Matrices

The case of a block-Toeplitz matrix can be handled identically, after transforming it into a Toeplitz-block matrix first.

A block-Toeplitz matrix has the form

T_{(BT)}^{(m;n)} = \begin{pmatrix} A_0 & A_{-1} & \dots & A_{1-m} \\ A_1 & A_0 & \dots & A_{2-m} \\ \vdots & \vdots & \ddots & \vdots \\ A_{m-1} & A_{m-2} & \dots & A_0 \end{pmatrix} ,  where each A_k is arbitrary.

We can now define a permutation matrix P such that

\left(T_{(TB)}^{(n;m)}\right)_{k,l;i,j} := \left(P\, T_{(BT)}^{(m;n)}\, P^H\right)_{k,l;i,j} = \left(T_{(BT)}^{(m;n)}\right)_{i,j;k,l} .

That means that after applying the permutation

T_{(TB)} = P T_{(BT)} P^H ,    (8.2)

a block-Toeplitz matrix T_{(BT)} will become a Toeplitz-block matrix T_{(TB)}. Note that the number of blocks and the size of the blocks will be swapped between those two matrices.

This means that a suitable approximation for block-Toeplitz matrices T_{(BT)} can be obtained via the following steps:

1. Transformation: Transform the block-Toeplitz matrix into a Toeplitz-block matrix by applying the permutation defined in (8.2): T_{(TB)} = P T_{(BT)} P^H.

2. Toeplitz-block approximation: Apply the approximation C(T(TB)) as defined in Section 8.2.1.

3. Back transformation: Transform the solution from the previous step back, by applying C(T_{(BT)}) = P^H C(T_{(TB)}) P.

The result of such an approximation is illustrated in Figure 8.2.
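The permutation in (8.2) is a perfect-shuffle reordering that swaps the block index and the within-block index. A small sketch (one possible ordering convention, not from the thesis):

import numpy as np

def block_swap_permutation(m, n):
    """Permutation matrix that maps the block-Toeplitz ordering (block k,
    entry i) to the Toeplitz-block ordering (entry i, block k); applying
    P T P^H to a block-Toeplitz matrix with m blocks of size n yields the
    corresponding Toeplitz-block matrix."""
    P = np.zeros((m * n, m * n))
    for k in range(m):
        for i in range(n):
            P[i * m + k, k * n + i] = 1.0
    return P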

(a) Color plot for a sample block-Toeplitz matrix, with 4 × 4 blocks each of size 6 × 6. (b) Color plot for the circulant approximation of Figure 8.2a.

Figure 8.2: Color plots for a block-Toeplitz matrix and its block-circulant approximation.

8.3 application to bttb-block matrices

As stated in the previous sections, we can approximate each BTTB matrix by a BCCB one. The resulting matrix is therefore BCCB-block. As the following two sections will describe, no further approximation on the 3 × 3 block level is required to achieve an efficient inversion and MVP in that case.

8.3.1 Inversion

To compute the inverse, we can use the block inversion formulas. Let A be a 3 × 3 block matrix, where each of the blocks is a BCCB matrix of size N_x N_y × N_x N_y, i.e.,

A = \begin{pmatrix} A_{1,1} & A_{1,2} & A_{1,3} \\ A_{2,1} & A_{2,2} & A_{2,3} \\ A_{3,1} & A_{3,2} & A_{3,3} \end{pmatrix} ,  where each A_{i,j} is a BCCB matrix.

This 3 × 3 block matrix can be condensed into a 2 × 2 block matrix, on which the known block inversion formula (see for example (2.8.25) in [5]) can be used. (For the full problem discussed in Section 4.1, the matrix A described here is equal to one of the blocks of C.)

A = \begin{pmatrix} W & Q \\ R & S \end{pmatrix} ,

where

W = \begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix} ,  Q = \begin{pmatrix} A_{1,3} \\ A_{2,3} \end{pmatrix} ,  R = \begin{pmatrix} A_{3,1} & A_{3,2} \end{pmatrix} ,  S = A_{3,3} .

Then by applying the block inversion formula, one gets

A^{-1} = \begin{pmatrix} W & Q \\ R & S \end{pmatrix}^{-1} = \begin{pmatrix} W^{-1} + W^{-1} Q Z R W^{-1} & −W^{-1} Q Z \\ −Z R W^{-1} & Z \end{pmatrix} ,

where

Z = (S − R W^{-1} Q)^{-1}

is the Schur complement.

The computation requires W^{-1}, which can be computed by applying the block inversion formula again, this time on W, thus

W^{-1} = \begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix}^{-1} = \begin{pmatrix} A_{1,1}^{-1} + A_{1,1}^{-1} A_{1,2} Z' A_{2,1} A_{1,1}^{-1} & −A_{1,1}^{-1} A_{1,2} Z' \\ −Z' A_{2,1} A_{1,1}^{-1} & Z' \end{pmatrix} ,

where

Z' = (A_{2,2} − A_{2,1} A_{1,1}^{-1} A_{1,2})^{-1}

is the Schur complement.

Thus computing A^{-1} requires the computation of Z and Z'. Since BCCB matrices form an algebra, Z and Z' are BCCB matrices as well. (That BCCB matrices form an algebra can be checked easily using the decomposition via the Fourier transformation, see (8.1).) Thus their computation can be done efficiently in the preprocessing phase.

The total computation of A^{-1} requires multiple multiplications, inversions and additions of BCCB matrices, so the total costs are O((N_x N_y) log(N_x N_y)).
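The 2 × 2 block inversion step can be written down generically; the sketch below is a dense illustration (not the thesis code), where inv stands for whatever inversion routine suits the block type (for BCCB blocks an FFT-based inverse would be used).

import numpy as np

def block_2x2_inverse(W, Q, R, S, inv=np.linalg.inv):
    """Inverse of [[W, Q], [R, S]] via the Schur complement
    Z = (S - R W^{-1} Q)^{-1}, following the block inversion formula above."""
    Wi = inv(W)
    Z = inv(S - R @ Wi @ Q)
    return np.block([[Wi + Wi @ Q @ Z @ R @ Wi, -Wi @ Q @ Z],
                     [-Z @ R @ Wi, Z]])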

8.3.2 MVP

As seen in the previous section, if A is a BCCB-block matrix, so is A^{-1}. The MVP of a BCCB matrix with a vector can be computed in O((N_x N_y) log(N_x N_y)).

INVERSE GENERATING FUNCTION APPROACH 9

As described in Chapter 3, a Toeplitz matrix can be associated with a generating function (see (3.2)). It has been shown ([11], [34]) that under certain assumptions on f, T[1/f] is a good preconditioner for T[f].

The first section will describe the approach using the inverse generating function (IGF) for Toeplitz matrices. However, the method works almost identically for multi-level Toeplitz matrices, such as BTTB matrices. For the BTTB case, the main difference is that the generating function is bivariate.

The subsequent section will focus on the generalization to the Toeplitz-block and BTTB-block case. It will also include a proof that the eigenvalues of the preconditioned system in that case are clustered as well.

9.1 inverse generating function for toeplitz and bttb matrices

In many applications, a generating function is given and from this the corresponding Toeplitz or BTTB matrix is computed (see the upper arrow A) in Figure 9.1). In the case of a BTTB matrix the generating function is bivariate, as shown in Figure 9.1; however, the procedure is almost identical whether a Toeplitz or a BTTB matrix is used. Computing the inverse of the two-level Toeplitz matrix generated by f(x, y) is expensive and therefore a step that should be avoided (see the downwards arrow on the right side in Figure 9.1).

Instead, the inverse of the generating function can be computed (step B)) and the Toeplitz matrix generated by 1/f can be formed. As shown by Chan and Ng [11] and Lin and Wang [34], this matrix is a good approximation of T[f]^{-1} if certain conditions are met. Under the assumption that f ∈ C_{2π} is positive, Chan and Ng [11] have shown that the eigenvalues of T[1/f] T[f] are clustered around one, which (indirectly) satisfies condition C1 for a good preconditioner; Lin and Wang [34] showed the same for BTTB matrices.

The IGF approach works in three steps, if the computation of T[f] from f is included. The steps are illustrated in Figure 9.1 and are

A Building T[f] by computing the Fourier coefficients of f (see (3.1) or Section 3.1).

B Computing the inverse of f.

C Computing the matrix generated by 1/f, i.e., T[1/f].

[Diagram: f(x, y) → T[f(x, y)] via Fourier transformation (step A); f(x, y) → 1/f(x, y) via inversion (step B); 1/f(x, y) → T[1/f(x, y)] via Fourier transformation (step C), approximating T[f(x, y)]^{-1}.]

Figure 9.1: Illustration of the inverse generating function approach (marked in red).

In the next sections, the required steps A to C of the IGF approach will be replaced with numerical alternatives (Ã to C̃). This is necessary in cases where the analytical way cannot be used or is too computationally expensive. In the cases relevant to this work, all three alternatives have to be applied in order to implement this approach (see Figure 9.4). Note that all the suggested alternatives can be computed efficiently with the use of the FFT.

9.1.1 Unknown Generating Function

In some cases, the generating function is not (explicitly) known, but only the (multilevel) Toeplitz matrix. In these cases, the starting point of the IGF method is the matrix itself. Therefore, the direction of step A has to be reversed (see Figure 9.2) and consequently the first step has to be

Ã Approximating the (actual) generating function with an f̃, using the matrix elements.

This can be done in a variety of ways, the most straightforward being

f̃(x) = \sum_{k=-(N_x-1)}^{N_x-1} t_k e^{ikx} ,    (9.1)

for an N_x × N_x (one-level) Toeplitz matrix. This approach can easily be generalized to BTTB and other multilevel Toeplitz matrices. Computing the approximation f̃ as in (9.1) is equivalent to

f̃ = D_{N_x−1} ∗ f ,

where D_{N_x−1} denotes the Dirichlet kernel of order N_x − 1 and f the exact generating function [7, pp. 1011–1016].

[Diagram: T[f(x, y)] → f̃(x, y) via approximation (step Ã); f̃(x, y) → 1/f̃(x, y) via inversion (step B); 1/f̃(x, y) → T[1/f̃(x, y)] via Fourier transformation (step C).]

Figure 9.2: Illustration of the inverse generating function approach for un- known generating functions, with the changes marked in red.

Kernels of the specific form f̃ = K ∗ f = \sum_{k=-(N_x-1)}^{N_x-1} b_k e^{ikx}, for some coefficients b_k, such as the Fejér kernel [7, pp. 1016–1020], are alternative choices and worth investigating in the future.

If the used kernel is of the form above, then f̃ can be computed using an FFT in O(N_x log N_x) operations.

9.1.2 Numerical Integration for Computing the Fourier Coefficients

In general, the Fourier coefficients of 1/f cannot be computed analytically, which is why a numerical alternative is described in this section (see Figure 9.3).

[Diagram: f(x, y) → T[f(x, y)] via Fourier transformation (step A); f(x, y) → 1/f(x, y) via inversion (step B); 1/f(x, y) → T̃[1/f(x, y)] via numerical integration (step C̃).]

Figure 9.3: Illustration of the inverse generating function approach with numerical integration (highlighted in red).

In order to get the elements of T[1/f], the following integral (in the simple one-level Toeplitz case) has to be computed:

\frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{f(x)} e^{-ikx} \, dx .

Instead of an analytical integration, this can be transformed, using the rectangular (also called midpoint) rule, into a numerical integration that requires only point evaluations of f(x), i.e.

\frac{1}{s N_x} \sum_{j=0}^{s N_x − 1} \frac{1}{f(2\pi j/(s N_x) − \pi)}\, e^{-ik(2\pi j/(s N_x) − \pi)} ,    (9.2)

where s is the sampling rate of the rectangular rule. In general, more accurate (and more complicated) methods of numerical integration could be applied, such as higher-order Newton–Cotes formulae [8, Sec. 4.3].

For the rectangular rule, the numerical integration step can again be computed using the FFT of order sN_x.

9.1.3 Numerical Inversion of the Generating Function

Sometimes the generating function f is not given explicitly, but only allows for computationally efficient function evaluations. However, if we use both numerical alternatives from the previous sections, function evaluations are sufficient to compute T[1/f] (see Figure 9.4). Therefore, we can change (9.1) to

f̃_grid(x̄_j) = \sum_{k=-(n-1)}^{n-1} t_k e^{i k x̄_j} ,

where x̄_j = 2πj/(sn) − π is one of the sampling points. Thus, 1/f̃_grid can be computed by inverting pointwise and then used in (9.2).

[Diagram: T[f̃(x, y)] → f̃(x, y) via approximation (step Ã); f̃(x, y) → 1/f̃(x, y) via pointwise inversion (step B̃); 1/f̃(x, y) → T̃[1/f̃(x, y)] via numerical integration (step C̃).]

Figure 9.4: Illustration of the inverse generating function approach using a sampled generating function.
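Putting the three numerical steps Ã, B̃ and C̃ together for a one-level Toeplitz matrix gives a very short procedure. The sketch below is illustrative only (it uses plain sums where FFTs would be used in practice); it takes the coefficients t_k of T[f] and returns approximate coefficients of T[1/f].

import numpy as np

def igf_coefficients(t, Nx, s=2):
    """Approximate Fourier coefficients of 1/f for a Toeplitz matrix T[f].
    `t` is a 1-D array holding t_k for k = -(Nx-1), ..., Nx-1.
    Step A~: evaluate f~ on an s*Nx grid; step B~: invert pointwise;
    step C~: midpoint rule (9.2)."""
    t = np.asarray(t)
    n = s * Nx
    x = 2 * np.pi * np.arange(n) / n - np.pi                            # sampling points
    k = np.arange(-(Nx - 1), Nx)
    f_grid = (t[None, :] * np.exp(1j * np.outer(x, k))).sum(axis=1)     # (9.1) on the grid
    g_grid = 1.0 / f_grid                                               # pointwise inversion
    coeffs = (g_grid[None, :] * np.exp(-1j * np.outer(k, x))).sum(axis=1) / n   # (9.2)
    return coeffs   # coefficients of T[1/f], same ordering as t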

9.1.4 Example

Figure 9.5 shows an example of the result of the IGF method. The top image shows the inverse of the original BTTB matrix, while the bottom figure shows the BTTB matrix generated by the inverse generating function. It should be easily visible that in this case both matrices are quite similar.

9.1.5 Efficient Inversion and MVP

Besides being a good approximation, which was shown in [11] and [34], the preconditioner also needs to fulfill the conditions regarding the inversion and its MVP. Condition C2 is satisfied, since no inverse needs to be computed. As shown in the previous sections, all the steps of the inverse generating function method can be computed efficiently using FFTs.

The last condition C3 requires an efficient MVP with T[1/f], which is satisfied trivially since it has the same structure as T[f] and therefore the same cost for an MVP. For a BTTB matrix with N_y × N_y blocks, each of size N_x × N_x, the costs for an MVP are O(N_x N_y log(N_x N_y)).

Figure 9.5: Color plots for the inverse of the original BTTB matrix, T[f]^{-1}, the result of the inverse generating function method T[1/f], and the difference between those two.

9.2 inverse generating function for bttb-block matrices

In this section, the IGF approach will be generalized to (multi-level) Toeplitz-block matrices. First, the general approach of the IGF method for block matrices will be described. Subsequently, a proof is provided, showing that the eigenvalues of the preconditioned system are clustered around one. This proof is currently in preparation for publication [50]; it is similar to the proofs provided by Chan and Ng [11] and Lin and Wang [34] and generalizes them.

9.2.1 General Approach

Let us introduce a matrix-valued generating function

F(s) = \begin{pmatrix} f_{11}(s) & f_{12}(s) & \dots & f_{1M}(s) \\ f_{21}(s) & f_{22}(s) & \dots & f_{2M}(s) \\ \vdots & \vdots & \ddots & \vdots \\ f_{M1}(s) & f_{M2}(s) & \dots & f_{MM}(s) \end{pmatrix} ,    (9.3)

and associate it with the corresponding Toeplitz-block matrix generated by F(s),

T[F(s)] = \begin{pmatrix} T[f_{11}(s)] & T[f_{12}(s)] & \dots & T[f_{1M}(s)] \\ T[f_{21}(s)] & T[f_{22}(s)] & \dots & T[f_{2M}(s)] \\ \vdots & \vdots & \ddots & \vdots \\ T[f_{M1}(s)] & T[f_{M2}(s)] & \dots & T[f_{MM}(s)] \end{pmatrix} .    (9.4)

Note that in definition (9.4), T[F] denotes a Toeplitz-block matrix, while T[F] could also refer to a block-Toeplitz matrix, see for example [51, Eq. (1) and (2)]. However, both definitions are similar, since they are related by a permutation (see (8.2)). Therefore, the results described here also apply to standard block-Toeplitz matrices, and vice versa, the analysis of block-Toeplitz matrices can be used here.

We can define the matrix-valued inverse generating function F^{-1}(s) (if the inverse exists) and use T[F^{-1}] as a preconditioner, analogously to before. This method is illustrated in Figure 9.6.

[Diagram: T[F̃(x, y)] → F̃(x, y) via approximation (step Ã); F̃(x, y) → F̃(x, y)^{-1} via matrix inversion (step B̂); F̃(x, y)^{-1} → T̃[F̃(x, y)^{-1}] via numerical integration (step C̃), approximating T[F(x, y)]^{-1}.]

Figure 9.6: Illustration of the inverse generating function for Toeplitz-block matrices.

9.2.2 Preliminaries

In this section, smaller lemmas and their proofs are reproduced that are needed in Section 9.2.3, starting with some definitions that will simplify the nomenclature later on. If not stated otherwise, the 2-norm is meant whenever || · || is written.

Definition 9.2.1: min_s λ_min(F(s))

For any matrix-valued function F ∈ L^1, where F(s) = F(s)^H, we define

\min_s λ_min(F(s)) ≡ \sup_y \{ y ∈ R : λ_1(F(s)) > y, for a.e. s ∈ [−π, π] \} ,

where λ_j(F(s)), j = 1, ..., n, are the eigenvalues of F(s) sorted in non-decreasing order.

Roughly speaking, this denotes the smallest eigenvalue of F over all s ∈ [−π, π].

Definition 9.2.2: max_s λ_max(F(s))

For any matrix-valued function F ∈ L^1, where F(s) = F(s)^H, we define

\max_s λ_max(F(s)) ≡ \inf_y \{ y ∈ R : λ_n(F(s)) ≤ y, for a.e. s ∈ [−π, π] \} ,

where λ_j(F(s)), j = 1, ..., n, are the eigenvalues of F(s) sorted in non-decreasing order.

Roughly speaking, this denotes the largest eigenvalue of F over all s ∈ [−π, π].

The next two lemmas refer to the linearity of T[F ] and the fact that the Hermitian structure is preserved under T[F ]. Both lemmas can be checked easily, by utilizing( 3.1).

Lemma 9.2.3: Linearity of T[F ]

T[F ] is a linear mapping, such that T[a · A + b · B] = a · T[A] + b · T[B].

Lemma 9.2.4: Hermitian structure of T[F ]

Let F be a matrix-valued Hermitian function, i.e., fuv = (fvu)∗. Then T[F ] will be Hermitian, i.e.,

(T[f^{uv}])_w = ((T[f^{vu}])_{-w})^* .

In the case of a scalar-valued function f, the well-known Grenander and Szegő Theorem (see, for example, [10, p. 13]) provides much information on the distribution of the eigenvalues of T[f]. An extension for block-Toeplitz matrices (see [55]) provides a similar result for our case.

Lemma 9.2.5: Distribution of Eigenvalues

Let F be Hermitian and λ be an eigenvalue of T[F ]. Then it holds that

\min_s λ_min(F(s)) ≤ λ ≤ \max_s λ_max(F(s)) .

Proof. We can directly use [39, Thm. 3.1] or [51, Sec. 2] in combina- tion with the fact that the block-Toeplitz matrix in those papers can be transformed by a similarity transformation into a Toeplitz-block matrix (following definition( 9.4)).

We will use the following result in Section 9.2.3 to show the cluster- ing of the eigenvalues of the preconditioned system, in particular, to quantify the impact of the small norm perturbation on the spectrum.

Lemma 9.2.6: Bauer–Fike Theorem forHPD matrices

Let A be Hermitian positive definite (HPD). Let µ be an eigen- value of A + E, then there exists a λ, which is an eigenvalue of A, such that:

|λ − µ| ≤ ||E|| .

Proof. For a proof see [48, pp. 59–60] in combination with [48, Thm. 1.8]

This section closes with some lemmas that will be used numerous times during the proof of the clustering of the eigenvalues.

Lemma 9.2.7: Sum of Hermitian matrices

If A and B are Hermitian, so is A + B.

Proof. (A + B)^H = A^H + B^H = A + B

The next lemma is a general inequality between the spectral norm and the Frobenius norm.

Lemma 9.2.8: 2-Norm and Frobenius norm

||A||_2 ≤ ||A||_F

Proof. We can write the Frobenius norm as ||A||_F^2 = \sum_{j=1}^{n} ||A e_j||_2^2. At the same time, for an arbitrary x with ||x||_2 = 1, i.e. \sum_{j=1}^{n} |x_j|^2 = 1, we have

||Ax||_2^2 = \Big\| \sum_{j=1}^{n} x_j A e_j \Big\|_2^2 \le \Big( \sum_{j=1}^{n} |x_j|\, ||A e_j||_2 \Big)^2 \le \Big( \sum_{j=1}^{n} |x_j|^2 \Big) \Big( \sum_{j=1}^{n} ||A e_j||_2^2 \Big) = \sum_{j=1}^{n} ||A e_j||_2^2 = ||A||_F^2 ,

where the first inequality is the triangle inequality and for the second one we used the Cauchy–Schwarz inequality. Since this is true for arbitrary x, it is also true for \max_{||x||_2 = 1} ||Ax||_2^2 = ||A||_2^2.

Lemma 9.2.9: Sub-multiplicativity

All induced matrix norms are sub-multiplicative, i.e. ||AB|| ≤ ||A|| ||B||.

Proof.

||AB|| = \max_{||x||=1} ||(AB)x|| = \max_{||x||=1} ||A(Bx)|| \le \max_{||x||=1} ||A||\,||Bx|| = ||A|| \max_{||x||=1} ||Bx|| = ||A||\,||B||

Lemma 9.2.10: Eigenvalues of the inverse

If the matrix A has the eigenvalue λ, then A^{-1} has the eigenvalue λ^{-1}.

Proof. If λ is an eigenvalue of A, then Av = λv. Multiplying both sides with A^{-1}, we get A^{-1}Av = λA^{-1}v, from which A^{-1}v = (1/λ)v directly follows.

Lemma 9.2.11: Rank

rank(AB) ≤ min(rank(A), rank(B))

Proof. We can identify the matrix-vector product Ax with the linear transform A(x). Then rank(AB) = rank(A(B(x))) ≤ rank(A). We can do the same to get rank(DC) ≤ rank(D). If we set C = A^T and D = B^T, we get rank(DC) = rank((AB)^T) ≤ rank(B^T) = rank(B).

We later need the fact that a similarity transformation A ↦ P^{-1}AP does not change the eigenvalues.

Lemma 9.2.12: Similarity transform

If λ is an eigenvalue of A, then it is also an eigenvalue of Ã = P^{-1}AP for any invertible P.

Proof. Let Av = λv; then for ṽ = P^{-1}v, Ãṽ = (P^{-1}AP)P^{-1}v = P^{-1}Av = λP^{-1}v = λṽ.

9.2.3 Proof of Clustering of the Eigenvalues

We first show that the rank of T[P^{-1}] T[P] − I is bounded by 2KM, if all entries of P(s) are trigonometric polynomials of degree K or smaller. We follow the proof of Lemma 2 from [11] and generalize it to the Toeplitz-block case.

Lemma 9.2.1

Let p^{uv}, 1 ≤ u, v ≤ M, be trigonometric polynomials of degree K in C_{2π}, i.e.,

p^{uv}(s) = \sum_{k=-K}^{K} \hat p_k^{uv} e^{iks} .

Define

P = \begin{pmatrix} p^{11} & p^{12} & \dots & p^{1M} \\ p^{21} & p^{22} & \dots & p^{2M} \\ \vdots & \vdots & \ddots & \vdots \\ p^{M1} & p^{M2} & \dots & p^{MM} \end{pmatrix} ,

and assume its invertibility. Then for n > 2K, rank(T[P^{-1}] T[P] − I) ≤ 2KM, where T[P^{-1}], T[P] ∈ C^{nM×nM} and I denotes the identity matrix of appropriate size.

Proof. Let

R(s) = P(s)^{-1} ,    (9.5)

with its entries

r^{uv}(s) = \sum_{k=-\infty}^{\infty} \hat r_k^{uv} e^{iks} .

Equation (9.5) implies

R(s)\, P(s) = I .

Therefore

\sum_{m=1}^{M} r^{um}(s)\, p^{mv}(s) = \delta_{u-v} = \delta_{u-v} \sum_{l=-\infty}^{\infty} \delta_l e^{ils} ,    (9.6)

where δ_i is the Kronecker delta, which is 1 if i = 0 and 0 otherwise. On the other hand,

\sum_{m=1}^{M} r^{um}(s)\, p^{mv}(s) = \sum_{m=1}^{M} \Big( \sum_{k'=-\infty}^{\infty} \hat r_{k'}^{um} e^{ik's} \Big) \Big( \sum_{k=-K}^{K} \hat p_k^{mv} e^{iks} \Big) = \sum_{m=1}^{M} \sum_{k'=-\infty}^{\infty} \sum_{k=-K}^{K} \hat r_{k'}^{um} \hat p_k^{mv} e^{i(k'+k)s} = \sum_{m=1}^{M} \sum_{l=-\infty}^{\infty} \sum_{k=-K}^{K} \hat r_{l-k}^{um} \hat p_k^{mv} e^{ils} \quad [k' = l − k] = \sum_{l=-\infty}^{\infty} \Big( \sum_{m=1}^{M} \sum_{k=-K}^{K} \hat r_{l-k}^{um} \hat p_k^{mv} \Big) e^{ils} .    (9.7)

Comparing the coefficients of e^{ils} on the right-hand sides of (9.6) and (9.7), we see that

\sum_{m=1}^{M} \sum_{k=-K}^{K} \hat r_{l-k}^{um} \hat p_k^{mv} = \delta_{u-v} \delta_l = \begin{cases} 1 & \text{if } u = v \text{ and } l = 0, \\ 0 & \text{otherwise.} \end{cases}

Hence for n > 2K, the entries of T[P^{-1}] T[P] − I are all zero except for entries in the first and last K columns of each Toeplitz block. Thus rank(T[P^{-1}] T[P] − I) ≤ 2KM.

We now slightly deviate from [11, Lem. 3] and instead follow [34, Lem. 2] to show that T[F^{-1}] − T^{-1}[F] can be written as the sum of a low-rank matrix G_n and a matrix H_n of small norm.

Lemma 9.2.2

Define F(s) as in (9.3), where each f^{uv} ∈ C_{2π}, 1 ≤ u, v ≤ M, and F_min > 0. Then for all ε > 0, there exist positive integers N and K such that for all n > N,

T[F^{-1}] − T^{-1}[F] = G + H ,

where rank(G) ≤ 2KM and ||H|| < ε.

Proof. 1. Since f^{uv} ∈ C_{2π}, 1 ≤ u, v ≤ M, following the Weierstrass approximation theorem (see [16, pp. 4–6]), given ε > 0 there exists a trigonometric polynomial

p^{uv}(s) = \sum_{k=-K}^{K} \hat p_k^{uv} e^{iks} ,

such that

||f^{uv}(s) − p^{uv}(s)||_\infty ≡ \max_s |f^{uv}(s) − p^{uv}(s)| ≤ ε ,   1 ≤ u, v ≤ M.

2. Since F and P are Hermitian, F − P is also Hermitian. We can use the linearity, the Hermitian structure and the distribution of the eigenvalues (see Lemma 9.2.5) to show that the matrices T[F] and T[P] can be made arbitrarily close:

||T[F] − T[P]|| = ||T[F − P]|| = \max_k |λ_k(T[F − P])| \le \max\big( |\min_s λ_{\min}(F(s) − P(s))| ,\; |\max_s λ_{\max}(F(s) − P(s))| \big) .

We first derive an upper bound for the second element of the maximum above:

|\max_s λ_{\max}(F(s) − P(s))|^2 \le \max_{s,k} |λ_k(F(s) − P(s))|^2 \le \max_s ||F(s) − P(s)||_2^2 \le \max_s ||F(s) − P(s)||_F^2 = \max_s \sum_{u,v} |f^{uv}(s) − p^{uv}(s)|^2 \le \sum_{u,v} \max_s |f^{uv}(s) − p^{uv}(s)|^2 \le M^2 ε^2 .

For the first element we get

|\min_s λ_{\min}(F(s) − P(s))|^2 \le \max_{s,k} |λ_k(F(s) − P(s))|^2 \le M^2 ε^2 ,

by using the same steps as for |\max_s λ_{\max}(F(s) − P(s))|^2. Thus ||T[F] − T[P]||^2 \le M^2 ε^2.

3. Since T[F] is invertible, T[P] is also invertible for a sufficiently small ε. We can also derive that P is HPD, if F is HPD and ||F − P|| ≤ cε for some c. We can now write

T[F^{-1}] − T^{-1}[F] = (T[F^{-1}] − T[P^{-1}]) + [(T[P^{-1}] − T^{-1}[P]) + (T^{-1}[P] − T^{-1}[F])] := G + H ,

where

G = T[P^{-1}] − T^{-1}[P] = (T[P^{-1}] T[P] − I)\, T^{-1}[P] ,
H = (T[F^{-1}] − T[P^{-1}]) + (T^{-1}[P] − T^{-1}[F]) .

4. Since P consists of trigonometric polynomials of degree K, we can use Lemma 9.2.1 to show that G is of low rank:

rank(G) ≤ rank(T[P^{-1}] T[P] − I) ≤ 2KM .

5. Now we address the small error term H

||H|| \le ||T[F^{-1}] − T[P^{-1}]|| + ||T^{-1}[P] − T^{-1}[F]|| = ||T[P^{-1}(F − P)F^{-1}]|| + ||T^{-1}[P]\,(T[F] − T[P])\, T^{-1}[F]|| .

Using the Hermitian structure, we now show bounds for each of those two terms separately, starting with the first term.

||T[P^{-1}(F − P)F^{-1}]|| \le \max_k |λ_k(T[P^{-1}(F − P)F^{-1}])| \le \max_{s,k} |λ_k(P(s)^{-1}(F(s) − P(s))F(s)^{-1})| \le \max_s ||P(s)^{-1}(F(s) − P(s))F(s)^{-1}|| \le \max_s ||P(s)^{-1}||\, ||F(s) − P(s)||\, ||F(s)^{-1}|| \le \frac{1}{\min_s λ_{\min}(P(s))}\; \max_{s,k} |λ_k(F(s) − P(s))|\; \frac{1}{\min_s λ_{\min}(F(s))} .

For the first inequality, we use the fact that T[P −1(F − P )F −1] = T[F −1] − T[P −1] and therefore Hermitian, as the sum of two 9.2 inverse generating function for bttb-block matrices 61

Hermitian matrices. In the last step, we use the fact that the following equality holds true

\max_s λ_{\max}(F(s)^{-1}) = \max_s \frac{1}{λ_{\min}(F(s))} = \frac{1}{\min_s λ_{\min}(F(s))} ,

if F(s) is HPD for all s ∈ [−π, π]. Now for the second term, we can bound it in the following way:

||T^{-1}[P]\,(T[F] − T[P])\, T^{-1}[F]|| \le ||T^{-1}[P]||\, ||T[F] − T[P]||\, ||T^{-1}[F]|| = λ_{\max}(T^{-1}[P])\; \max_k |λ_k(T[F] − T[P])|\; λ_{\max}(T^{-1}[F]) \le \max_s \frac{1}{λ_{\min}(P(s))}\; \max_{s,k} |λ_k(F(s) − P(s))|\; \max_s \frac{1}{λ_{\min}(F(s))} = \frac{1}{\min_s λ_{\min}(P(s))}\; \max_{s,k} |λ_k(F(s) − P(s))|\; \frac{1}{\min_s λ_{\min}(F(s))} .

Putting it all together (with Step 2 of this proof), we get

||H|| \le 2 \cdot \frac{1}{\min_s λ_{\min}(P(s))}\; \max_{s,k} |λ_k(F(s) − P(s))|\; \frac{1}{\min_s λ_{\min}(F(s))} \le 2 \cdot \frac{1}{\min_s λ_{\min}(P(s))}\; \frac{1}{\min_s λ_{\min}(F(s))}\; M ε =: C̃ ε .

We now use these lemmas to show that all except at most 2KM eigenvalues of the preconditioned matrix T[F^{-1}] T[F] are clustered around one, following the proof and generalizing the result of [34, Thm. 3].

Theorem 9.2.3

Let f^{uv} ∈ C_{2π}, 1 ≤ u, v ≤ M, and let F(s) be HPD for all s ∈ [−π, π].

1. All eigenvalues of T[F^{-1}] T[F] lie in the interval

\left[ \frac{\min_s λ_{\min}(F(s))}{\max_s λ_{\max}(F(s))} ,\; \frac{\max_s λ_{\max}(F(s))}{\min_s λ_{\min}(F(s))} \right] .

2. For all ε > 0, there exist positive integers K and N such that for all n > N, at most 2KM eigenvalues of T[F^{-1}] T[F] − I have absolute values greater than ε̃.

Proof. 1. Since F(s) is HPD for all s ∈ [−π, π], both matrices T[F] and T[F^{-1}] are HPD (using Lemmas 9.2.4 and 9.2.5). The eigenvalues of T[F^{-1}] T[F] and T^{1/2}[F] T[F^{-1}] T^{1/2}[F] coincide (since the latter is the result of a similarity transform of the former). Let λ be an eigenvalue of T^{1/2}[F] T[F^{-1}] T^{1/2}[F]. We have

λ \le \max_{x \ne 0} \frac{x^* T^{1/2}[F]\, T[F^{-1}]\, T^{1/2}[F]\, x}{x^* x} = \max_{y \ne 0} \frac{y^* T[F^{-1}] y}{y^* T^{-1}[F] y} \le \max_{y \ne 0} \frac{y^* T[F^{-1}] y}{y^* y}\; \max_{y \ne 0} \frac{y^* y}{y^* T^{-1}[F] y} \le \max_{s,k} λ_k(F^{-1}(s))\; \max_{z \ne 0} \frac{z^* T[F] z}{z^* z} \le \frac{1}{\min_s λ_{\min}(F(s))}\; \max_s λ_{\max}(F(s)) .

Similarly we obtain λ ≥ \min_s λ_{\min}(F(s)) / \max_s λ_{\max}(F(s)). Therefore,

\frac{\min_s λ_{\min}(F(s))}{\max_s λ_{\max}(F(s))} \le λ \le \frac{\max_s λ_{\max}(F(s))}{\min_s λ_{\min}(F(s))} .

2. From Lemma 9.2.2 it follows that for a given ε > 0, there exist positive integers K and N such that for all n > N,

T[F^{-1}] − T^{-1}[F] = G + H ,

where rank(G) ≤ 2KM and ||H|| < C̃ε. Since F(s) is HPD, the matrices T[F], T[F^{-1}], T[P], T[P^{-1}], T^{-1}[F] and T^{-1}[P] are Hermitian. Therefore, G and H are Hermitian. From the last equation we get

T^{1/2}[F]\, T[F^{-1}]\, T^{1/2}[F] = I + G̃ + H̃ ,

where

G̃ = T^{1/2}[F]\, G\, T^{1/2}[F] ,
H̃ = T^{1/2}[F]\, H\, T^{1/2}[F] .

It follows that

rank(G̃) ≤ rank(G) ≤ 2KM ,

and

||H̃|| ≤ ||T^{1/2}[F]||^2\, ||H|| .

Therefore, at most 2KM eigenvalues of

T^{1/2}[F]\, T[F^{-1}]\, T^{1/2}[F] − H̃ = I + G̃

are different from one.

We also know that

T[F^{-1}]\, T[F] = T^{-1/2}[F] \left( T^{1/2}[F]\, T[F^{-1}]\, T^{1/2}[F] \right) T^{1/2}[F] = T^{-1/2}[F] \left( (I + G̃) + H̃ \right) T^{1/2}[F] ,

which means that the eigenvalues of T[F^{-1}] T[F] are the same as the eigenvalues of (I + G̃) + H̃ (since we applied a similarity transform). To get the eigenvalues of (I + G̃) + H̃, we use the Bauer–Fike Theorem (see Lemma 9.2.6). We know that at most 2KM eigenvalues of I + G̃ are not equal to one. If we add H̃, we know that the difference in the eigenvalues (compared to I + G̃) is bounded by ||H̃||.

In conclusion, the differences between the eigenvalues of (I + G̃) and (I + G̃) + H̃ will be smaller than ||H̃|| ≤ ||T^{1/2}[F]||^2 ||H|| ≤ ||T^{1/2}[F]||^2 C̃ε =: ε̃, and therefore at most 2KM eigenvalues of T[F^{-1}] T[F] lie outside the interval [1 − ε̃, 1 + ε̃].

9.2.4 Example

Figure 9.7 shows an example of the result of the IGF method in the case of a 2 × 2 BTTB-block matrix. The top image shows the inverse of the original matrix, while the bottom image is the result of the inverse generating function method described in this section. The plots illustrate that under certain circumstances both matrices are very close to each other.

9.3 regularizing functions

Applying the inversion formulas to the matrix-valued generating function F requires the inversion of some of its elements. As discussed in the last section, the inversion of the elements f^{uv} can happen pointwise after approximating them with f̃^{uv}, using the Fourier coefficients extracted from the original matrix.

However, if f^{uv} is close to zero at some point x, the inverse (f^{uv})^{-1} will be very large at this point. The fact that the generating function can only be approximated can reinforce this effect, producing unrealistically high values for the inverse generating function.

[Figure panels: T[F]^{-1}, T[F^{-1}], and T[F]^{-1} − T[F^{-1}].]

Figure 9.7: Color plots for the inverse of the original 2 × 2 BTTB-block matrix, T[F(x, y)]^{-1}, the result of the inverse generating function method T[F(x, y)^{-1}], and the difference of those two.

To counteract this effect, we can apply the inverse generating function method to a regularized function, i.e., apply it to T[F + αI_M]. This is equivalent to applying it to T[F] + αI.

The optimal value for α corresponds to a trade-off that can be different for each linear system. Figure 9.8 shows the number of iterations until convergence (relative to the number of iterations for α = 0) for several test cases, using the inverse generating function method with different strengths of regularization, i.e., different values for α. In other words, the y-axis describes the ratio of the number of iterations compared to the unregularized system, i.e.

y = \frac{\text{number of iterations of the IGF with a regularization of strength } α}{\text{number of iterations of the IGF without regularization}} .

Figure 9.8 indicates that choosing the right value for α can reduce the number of iterations. It also indicates that the optimal value varies between the several test cases. Finding the optimal value, or at least a value that decreases the number of iterations, appears to be a non-trivial problem requiring more analysis.

Figure 9.8: Degrees of regularization. Relative convergence rates for several test cases without and with several degrees of regularization for the inverse generating function method.

9.4 numerical experiments

9.4.1 Convergence of the IGF

Figure 9.9 shows, for a sample test case, the convergence of T̃[f̃] (the result of the IGF with approximations) towards T[f^{-1}], as the sampling rate of the numerical integration (step C̃) and the number of Fourier coefficients in the approximation of the generating function (step Ã) are increased.

This sample test case is the result of a piecewise constant generating function, for which the Fourier coefficients can be computed analytically. The inverse generating function is again a piecewise constant function. Therefore, T[f] and T[f^{-1}] can be computed exactly. Figure 9.9 plots the relative difference of the result of the IGF method and T[f^{-1}], i.e. ||T[f^{-1}] − T̃[f̃]||_F / ||T[f^{-1}]||_F. It can be seen that increasing the number of Fourier terms does not necessarily decrease the relative difference if the sampling rate is held constant.

Figure 9.9: Convergence of theIGF method towards the exact inverse.

9.4.2 IGF for a BTTB-block Matrix

This section will analyze the IGF method for a matrix of the form

\begin{pmatrix} T_1 & 0.15 · T_2 & 0.01 · T_3 \\ 0.15 · T_2 & T_1 & 0.05 · 0.15 · T_2 \\ 0.01 · T_3 & 0.05 · 0.15 · T_2 & T_1 \end{pmatrix} ,

where T_l, with l ∈ {1, 2, 3}, is a BTTB matrix with n_1 × n_1 blocks, each of size n_2 × n_2, generated by the corresponding sequence of Fourier coefficients.

T1 is generated by

t_{j,k} = \big( (|j| + 1)\,(|k| + 1)^{\,1+0.1(|j|+1)} \big)^{-1} .

T2 is generated by

t_{j,k} = \big( (|j| + 1)^{3.1} + (|k| + 1)^{3.1} \big)^{-1} .

T3 is generated by

t_{j,k} = \begin{cases} 2.5 & j = 0,\ k = 0 , \\ \big( 1 + (|k| + 1)^{1.1} \big)^{-1} & j = 0,\ k = \pm 1, \pm 2, \ldots , \\ \big( (|j| + 1)^{2.5} + |jk + 1|^{2.5} \big)^{-1} & j = \pm 1, \pm 2, \ldots,\ k = 0, \pm 1, \pm 2, \ldots \end{cases}

It can be verified that this generates an HPD matrix, thus fulfilling the assumptions of Theorem 9.2.3.

Table 9.1 shows the average number of required iterations of the preconditioned conjugate gradient method until the tolerance (10^{-7}) is reached, for ten randomly created right-hand-side vectors and different sizes of n_1 and n_2. Here, the midpoint rule is used to compute the Fourier integrals numerically, using n_1 and n_2 intervals in the corresponding directions.

Besides comparing the results of the IGF preconditioner to the original system (P = I), the results of a 3 × 3 BCCB-block preconditioner (P = C(T[F])) are also provided as a reference (see Chapter 8).

Compared to the original system, the number of iterations is reduced by a factor of up to ten when using T[F^{-1}] as a preconditioner. Compared to the BCCB-block preconditioner, T[F^{-1}] reduces the number of iterations as well, by up to a factor of two.

Figure 9.10 illustrates the distribution of the eigenvalues in the case of n_1 = n_2 = 30, first for the original system (Figure 9.10a), then when using the BCCB-block preconditioner (Figure 9.10b) and lastly when using the IGF preconditioner (Figure 9.10c). Both preconditioners cluster the eigenvalues around one. It can be seen that the clustering is tighter for our preconditioner compared to the BCCB-block preconditioner. All eigenvalues of T[F^{-1}] T[F] are real, since T[F^{-1}] T[F] is similar to T^{1/2}[F] T[F^{-1}] T^{1/2}[F] (see the proof of Theorem 9.2.3), which is Hermitian and therefore has only real eigenvalues. Consequently, the eigenvalues of T[F^{-1}] T[F] are also real.

Table 9.1: The average number of iterations for the 3 × 3 BTTB-block matrix.

n_1   n_2   Matrix size   P = I   P = C(T[F])   P = T[F^{-1}]
10    10     300           56.7    16.6          16.0
20    20    1200           79.3    17.9          11.0
30    30    2700           92.2    19.0          11.0
40    40    4800          100.5    18.9          10.8

(a) Original System

(b) Circulant Preconditioned System

(c) IGF Preconditioned System

Figure 9.10: Distribution of eigenvalues for the IGF. Distribution of the eigenvalues λ_j of the original system (a), using the 3 × 3 BCCB-block preconditioner (b) and using T[F^{-1}] as a preconditioner (c).

KRONECKER PRODUCT APPROXIMATION 10

The Kronecker product approximation refers to finding the optimal matrices Ak and Bk such that the sum of their Kronecker products is as close as possible to the desired matrix. This is equivalent to solving

\min_{A_k \in C^{N_x \times N_x},\, B_k \in C^{N_y \times N_y}} \Big\| M − \sum_{k=1}^{s} A_k ⊗ B_k \Big\|_F ,    (10.1)

where ⊗ denotes the Kronecker product of two matrices and M is any matrix of size N_x N_y × N_x N_y.

Using this (approximate) decomposition into Kronecker products, the MVP of its inverse with a vector can be computed more efficiently if additional approximations are applied. The preconditioner obtained by computing the Kronecker product approximation will be denoted by Kron_s(T), where s indicates the number of Kronecker product terms used in the approximation and T refers to the original matrix.

The next section describes an efficient way to obtain the Kronecker product approximation for BTTB matrices and how one can use this decomposition to compute the inverse efficiently from a computational point of view.

Section 10.2 will discuss possible ways to use this approach in the case of a 3 × 3 BTTB-block matrix, thus generalizing the approach to make it applicable to the full problem.

10.1 optimal approximation for bttb matri- ces

Olshevsky et al. [44] proved that if the approximated matrix is BTTB (M = T_(BTTB)), then the problem in (10.1) is equivalent to

\min_{A_k \in \mathcal{T}(N_x),\, B_k \in \mathcal{T}(N_y)} \Big\| T_{(BTTB)} − \sum_{k=1}^{s} A_k ⊗ B_k \Big\|_F ,

where \mathcal{T}(n) denotes the class of Toeplitz matrices of size n × n. In other words, the optimal A_k and B_k have the same structure as T, just reduced to one level. (Theorem 3.2 in [44] proves this equality in the general case; for example, for a block-Toeplitz-Hankel-block matrix, the optimal A_k and B_k would have a Toeplitz and a Hankel structure.)

From [36] we know that we can define a tilde-transformation that rearranges block matrices in the following way:

T̃ = tilde(T) = tilde\begin{pmatrix} T_{1,1} & T_{1,2} & \dots & T_{1,n} \\ T_{2,1} & T_{2,2} & \dots & T_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ T_{n,1} & T_{n,2} & \dots & T_{n,n} \end{pmatrix} = \begin{pmatrix} vec(T_{1,1})^T \\ vec(T_{2,1})^T \\ \vdots \\ vec(T_{n,n})^T \end{pmatrix} ,

where the operator vec(A) unrolls a matrix into a vector by "stacking" its columns from left to right. We can then reformulate (10.1) into

\min_{\tilde a_k, \tilde b_k} \Big\| \tilde T_{(BTTB)} − \sum_{k=1}^{s} \tilde a_k \tilde b_k^T \Big\|_F .    (10.2)

This problem can be solved by computing the SVD of T̃ = \sum_{k=1}^{r} \tilde σ_k \tilde u_k \tilde v_k^H and taking the first s terms [21]. However, the cost of an SVD of a matrix of size N_x^2 × N_y^2 is, with O(N_x^2 (N_y^2)^2 + (N_y^2)^3), too expensive.

In [29] and [30], algorithms were proposed to solve (10.2) with an SVD of a much smaller size. While Kamm and Nagy [29] focused on banded BTTB matrices, Kilmer and Nagy [30] expanded the idea to dense block-Toeplitz-plus-Hankel matrices, with adjustments for similar cases such as BTTB matrices. (In [29], a banded BTTB matrix is defined as a matrix that can be fully defined by a single column. Moreover, [29] and [30] consider real-valued matrices, while in this work we consider complex-valued matrices.)

Following [29] and [30], the problem in (10.2) can be condensed into a smaller problem

\min_{a_k, b_k} \Big\| \tilde R_{L1} P \tilde R_{R1}^T − \sum_{k=1}^{s} (\tilde R_{L1} a_k)(b_k^T \tilde R_{R1}^T) \Big\|_F = \min_{\hat a_k, \hat b_k} \Big\| \hat P − \sum_{k=1}^{s} \hat a_k \hat b_k^T \Big\|_F ,

where

P is a (2N_x − 1) × (2N_y − 1) matrix containing all the Fourier coefficients present in the BTTB matrix T_(BTTB),

\tilde R_{L1} = \frac{1}{\sqrt{N_x}} diag\left(\sqrt{1}, \sqrt{2}, \ldots, \sqrt{N_x − 1}, \sqrt{N_x}, \sqrt{N_x − 1}, \ldots, \sqrt{2}, \sqrt{1}\right) ,

\tilde R_{R1} = \frac{1}{\sqrt{N_y}} diag(1, 2, \ldots, N_y − 1, N_y, N_y − 1, \ldots, 2, 1)^{1/2} .

10.1.1 Algorithm

To summarize, solving (10.1) can be done efficiently with the following steps:

1. Compute R̃_{L1} and R̃_{R1}.

2. Compute P̂ = R̃_{L1} P R̃_{R1}^T.

3. Calculate the SVD of P̂ ≈ \sum_{k=1}^{s} σ_k u_k v_k^H.

4. Set â_k = \sqrt{σ_k}\, u_k and b̂_k = \sqrt{σ_k}\, v_k.

5. Solve R̃_{L1} a_k = â_k and R̃_{R1} b_k = b̂_k.

6. Build the matrices A_k and B_k from a_k and b_k.

Step 6 is done by creating a Toeplitz matrix A_k whose first row consists of the first N_x elements of a_k and whose first column consists of the last N_x elements of a_k; analogously for B_k.

The total costs of this algorithm are dominated by step 3, computing the SVD of a (2N_x − 1) × (2N_y − 1) matrix, whose computational cost is O(N_x N_y^2) if N_x ≥ N_y [56, Lecture 31].
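The steps above translate into a short procedure. The sketch below is illustrative only (not the thesis code); it assumes the coefficient matrix Pmat is ordered with indices running from −(N_x−1) to N_x−1 and from −(N_y−1) to N_y−1, and it returns the generating vectors a_k, b_k of steps 4–5 (step 6, the Toeplitz assembly, is left as a comment).

import numpy as np

def kron_approx_bttb(Pmat, Nx, Ny, s):
    """Kronecker product approximation of a BTTB matrix: weight the
    (2Nx-1) x (2Ny-1) coefficient matrix, take a small SVD and unweight."""
    wx = np.sqrt(np.concatenate([np.arange(1, Nx + 1), np.arange(Nx - 1, 0, -1)]) / Nx)
    wy = np.sqrt(np.concatenate([np.arange(1, Ny + 1), np.arange(Ny - 1, 0, -1)]) / Ny)
    Phat = np.diag(wx) @ Pmat @ np.diag(wy).T        # step 2
    U, S, Vh = np.linalg.svd(Phat)                   # step 3 (s must not exceed len(S))
    a = [np.sqrt(S[k]) * U[:, k] / wx for k in range(s)]    # steps 4-5 for a_k
    b = [np.sqrt(S[k]) * Vh[k, :] / wy for k in range(s)]   # steps 4-5 for b_k
    # step 6: assemble the Toeplitz factors A_k, B_k from a_k, b_k
    # following the row/column convention described in the text.
    return a, b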

10.1.2 Inverse and MVP

To use the Kronecker product approximation as a preconditioner, the MVP with its inverse needs to be computable efficiently. In this section we start by first discussing the case of s = 1, i.e., an approximation with just one Kronecker product. Next, the case of s > 2 terms is discussed, which requires further approximations for an efficient computation.

10.1.2.1 One Term Approximation

In this case, theBTTB matrix T(BTTB) is approximated with just a single term of the Kronecker product approximation, thus

T(BTTB) ≈ A ⊗ B .

To compute

(T_{(BTTB)})^{-1} x ≈ (A ⊗ B)^{-1} x ,

we use the fact that (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1} [21, Sec. 12.3.1]. Additionally, it is known that (D^T ⊗ E) vec(S) = vec(F) ⟺ E S D = F for matrices D, E, F and S [26, Lem. 4.3.1]. Using this, the MVP with the inverse can be expressed as

(A ⊗ B)^{-1} x = (A^{-1} ⊗ B^{-1}) x = vec\left(B^{-1}\, vec^{-1}(x)\, A^{-T}\right) .    (10.3)

Consequently, the computation of an MVP with (A ⊗ B)^{-1} requires only the inverses of A and B separately. They can be computed utilizing the Gohberg–Semencul formula (see Theorem 3.0.4). This way, the inverses are given as four Toeplitz matrices, i.e., T^{-1} = AB + CD, where A, B, C and D are Toeplitz.

The two matrix-matrix products in (10.3) can then be computed by computing multiple MVPs with Toeplitz matrices. Thus, the total cost is O(N_x N_y log N_y + N_y N_x log N_x) = O(N_x N_y log(N_x N_y)).
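A minimal sketch of (10.3), not from the thesis: dense inverses are used here for clarity, whereas in the thesis setting A^{-1} and B^{-1} would be applied via the Gohberg–Semencul formula. Column-major vec is assumed.

import numpy as np

def kron_inv_mvp(A, B, x):
    """Compute (A kron B)^{-1} x = vec(B^{-1} X A^{-T}) with X = vec^{-1}(x)."""
    Ny, Nx = B.shape[0], A.shape[0]
    X = x.reshape(Ny, Nx, order='F')           # vec^{-1}(x), column-major
    Y = np.linalg.solve(B, X) @ np.linalg.inv(A).T
    return Y.reshape(-1, order='F')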

10.1.2.2 Two-Terms Approximation

In the case of s = 2, we can rewrite

(A_1 ⊗ B_1 + A_2 ⊗ B_2)^{-1} x = c

⟺ solve for c: (A_1 ⊗ B_1 + A_2 ⊗ B_2) c = x

⟺ solve for C: (A_1 ⊗ B_1 + A_2 ⊗ B_2) vec(C) = vec(X)

⟺ solve for C: B_1 C A_1^T + B_2 C A_2^T = X

⟺ solve for C: B_1 C + B_2 C A_2^T A_1^{-T} = X A_1^{-T}    (∗)

⟺ solve for C: B_2^{-1} B_1 C + C A_2^T A_1^{-T} = B_2^{-1} X A_1^{-T}    (∗∗)

where at the equality marked with (∗) the previous equation was multiplied by A_1^{-T} from the right, and analogously at step (∗∗) with B_2^{-1} from the left (see also [38, p. 1135]). This results in an equation of the form ÃC + CB̃ = C̃, which is called a Sylvester equation and can be numerically solved, for example with the Bartels–Stewart algorithm, in O(N_x^3 + N_y^3) [4]. However, as this has to be computed in each iteration, it is considered too computationally expensive, and other methods are preferred.
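The reduction to a Sylvester equation can be sketched as follows (illustrative only, not the thesis code); it assumes A_1 and B_2 are invertible, uses column-major vec, and calls SciPy's Bartels–Stewart solver.

import numpy as np
from scipy.linalg import solve_sylvester

def two_term_kron_solve(A1, B1, A2, B2, x):
    """Solve (A1 kron B1 + A2 kron B2) c = x via the Sylvester equation above."""
    Ny, Nx = B1.shape[0], A1.shape[0]
    X = x.reshape(Ny, Nx, order='F')
    Atil = np.linalg.solve(B2, B1)                 # B2^{-1} B1
    Btil = A2.T @ np.linalg.inv(A1.T)              # A2^T A1^{-T}
    Ctil = np.linalg.solve(B2, X) @ np.linalg.inv(A1.T)
    C = solve_sylvester(Atil, Btil, Ctil)          # Atil C + C Btil = Ctil
    return C.reshape(-1, order='F')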

10.1.2.3 Multiple Terms Approximation

In general, no exact inversion formulas exist for \sum_{k=1}^{s} A_k ⊗ B_k for s > 2.

Kamm and Nagy [28] suggested using an approximate SVD if s > 2. Given the SVDs A_1 = U_A S_A V_A^H and B_1 = U_B S_B V_B^H, construct

U = UA ⊗ UB ,

V = V_A ⊗ V_B ,

S = diag\Big( U^H \Big( \sum_{k=1}^{s} A_k ⊗ B_k \Big) V \Big) ,

then \sum_{k=1}^{s} A_k ⊗ B_k ≈ U S V^H. It can be seen that S satisfies

\min_{Σ \in D} \Big\| Σ − U^H \Big( \sum_{k=1}^{s} A_k ⊗ B_k \Big) V \Big\| = \min_{Σ \in D} \Big\| U Σ V^H − \sum_{k=1}^{s} A_k ⊗ B_k \Big\| ,

where D denotes the class of all diagonal matrices. This shows that the described method produces an optimal SVD approximation given the fixed bases U and V. Additionally, for s = 1, it returns the regular SVD, without any further approximation.

The MVP with its inverse can then be computed by

T_{(BTTB)}^{-1} x ≈ \Big( \sum_{k=1}^{s} A_k ⊗ B_k \Big)^{-1} x ≈ (U S V^H)^{-1} x = (V^H)^{-1} S^{-1} U^{-1} x = (V S^{-1} U^H) x = (V_A ⊗ V_B)\, S^{-1}\, (U_A^H ⊗ U_B^H) x ,

which can be applied using the same strategy and computational costs as described in the previous section.
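The construction of this approximate SVD can be sketched densely as below (illustrative only, not the thesis implementation, and only sensible for small sizes): the basis is taken from the SVDs of A_1 and B_1, while the diagonal is fitted to the full sum.

import numpy as np

def approx_svd_kron(A_list, B_list):
    """Approximate SVD of sum_k A_k kron B_k following the construction above.
    Returns (UA, UB, S, VA, VB) such that the sum is approximated by
    (UA kron UB) S (VA kron VB)^H."""
    UA, _, VAh = np.linalg.svd(A_list[0])
    UB, _, VBh = np.linalg.svd(B_list[0])
    U = np.kron(UA, UB)
    V = np.kron(VAh.conj().T, VBh.conj().T)
    M = sum(np.kron(A, B) for A, B in zip(A_list, B_list))
    S = np.diag(np.diag(U.conj().T @ M @ V))       # best diagonal for the fixed bases
    return UA, UB, S, VAh.conj().T, VBh.conj().T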

10.2 bttb-block matrices

If T_(block) consists of 3 × 3 BTTB blocks that have each been approximated by a sum of Kronecker products,

T_{(block)} = \begin{pmatrix} T_{1,1} & T_{1,2} & T_{1,3} \\ T_{2,1} & T_{2,2} & T_{2,3} \\ T_{3,1} & T_{3,2} & T_{3,3} \end{pmatrix} = \begin{pmatrix} \sum_{k=1}^{s} A_{1,1;k} ⊗ B_{1,1;k} & \sum_{k=1}^{s} A_{1,2;k} ⊗ B_{1,2;k} & \sum_{k=1}^{s} A_{1,3;k} ⊗ B_{1,3;k} \\ \sum_{k=1}^{s} A_{2,1;k} ⊗ B_{2,1;k} & \sum_{k=1}^{s} A_{2,2;k} ⊗ B_{2,2;k} & \sum_{k=1}^{s} A_{2,3;k} ⊗ B_{2,3;k} \\ \sum_{k=1}^{s} A_{3,1;k} ⊗ B_{3,1;k} & \sum_{k=1}^{s} A_{3,2;k} ⊗ B_{3,2;k} & \sum_{k=1}^{s} A_{3,3;k} ⊗ B_{3,3;k} \end{pmatrix} ,

then the block inversion formula can be used to compute the inverse. Let

T_{(block)} = \begin{pmatrix} W & Q \\ R & S \end{pmatrix} ,

where

W = \begin{pmatrix} T_{1,1} & T_{1,2} \\ T_{2,1} & T_{2,2} \end{pmatrix} ,  Q = \begin{pmatrix} T_{1,3} \\ T_{2,3} \end{pmatrix} ,  R = \begin{pmatrix} T_{3,1} & T_{3,2} \end{pmatrix} ,  S = T_{3,3} .

Then by applying the block inversion formula, one gets

T_{(block)}^{-1} = \begin{pmatrix} W & Q \\ R & S \end{pmatrix}^{-1} = \begin{pmatrix} W^{-1} + W^{-1} Q Z R W^{-1} & −W^{-1} Q Z \\ −Z R W^{-1} & Z \end{pmatrix} ,

where

Z = (S − R W^{-1} Q)^{-1} .

The computation requires W^{-1}, which can be computed by applying the block inversion formula again, this time on W, thus

W^{-1} = \begin{pmatrix} T_{1,1} & T_{1,2} \\ T_{2,1} & T_{2,2} \end{pmatrix}^{-1} = \begin{pmatrix} T_{1,1}^{-1} + T_{1,1}^{-1} T_{1,2} Z' T_{2,1} T_{1,1}^{-1} & −T_{1,1}^{-1} T_{1,2} Z' \\ −Z' T_{2,1} T_{1,1}^{-1} & Z' \end{pmatrix} ,

where

Z' = (T_{2,2} − T_{2,1} T_{1,1}^{-1} T_{1,2})^{-1} .

This means that Z and Z' have to be computed in order to compute the inverse of a BTTB-block matrix. (Besides Z and Z', the inverse of T_{1,1} has to be computed; how this can be done efficiently is discussed in the previous section.) However, this cannot be done efficiently without further approximations.

The problem of computing Z and Z' is separated into two different cases. In the first case, s = 1, the approximation is just a single Kronecker product, while in the second case, s > 2, the approximation consists of a sum of Kronecker products.

10.2.1 One Term Approximation

10.2.1.1 Sum Approximation

If s = 1, then

Z' = \big( A_{2,2} ⊗ B_{2,2} − (A_{2,1} ⊗ B_{2,1})(A_{1,1} ⊗ B_{1,1})^{-1}(A_{1,2} ⊗ B_{1,2}) \big)^{-1} = \big( A_{2,2} ⊗ B_{2,2} − (A_{2,1} A_{1,1}^{-1} A_{1,2}) ⊗ (B_{2,1} B_{1,1}^{-1} B_{1,2}) \big)^{-1} .

However, this is the inverse of a sum of Kronecker products which, as described in Section 10.1.2.3, cannot be computed efficiently. Therefore, a possible approach is to approximate the sum occurring in Z' by just its first term, thus

Z' ≈ A_{2,2}^{-1} ⊗ B_{2,2}^{-1} .    (10.4)

Analogously, for Z the same approximation is needed and therefore

Z ≈ S^{-1} = A_{3,3}^{-1} ⊗ B_{3,3}^{-1} .    (10.5)

Using this approximation, the elements of W^{-1} are

(W^{-1})_{1,1} = A_{1,1}^{-1} ⊗ B_{1,1}^{-1} + (A_{1,1}^{-1} A_{1,2} A_{2,2}^{-1} A_{2,1} A_{1,1}^{-1}) ⊗ (B_{1,1}^{-1} B_{1,2} B_{2,2}^{-1} B_{2,1} B_{1,1}^{-1}) ,
(W^{-1})_{1,2} = −(A_{1,1}^{-1} A_{1,2} A_{2,2}^{-1}) ⊗ (B_{1,1}^{-1} B_{1,2} B_{2,2}^{-1}) ,
(W^{-1})_{2,1} = −(A_{2,2}^{-1} A_{2,1} A_{1,1}^{-1}) ⊗ (B_{2,2}^{-1} B_{2,1} B_{1,1}^{-1}) ,
(W^{-1})_{2,2} = A_{2,2}^{-1} ⊗ B_{2,2}^{-1} .

The same approximation and inversion formula can be applied again to get the inverse of T_(block). The elements of T_{(block)}^{-1} can be found in the appendix, see Section A.1.1.

Using the approximations (10.4) and (10.5) for Z' and Z, the inverse of T_(block) can be computed using only inverses of BTTB matrices.

The MVP can be computed using one of two strategies. The first option is to precompute the matrix-matrix products in the equations above. For the second-to-last equation of Section A.1.1 we get, for example:

(T_{(block)}^{-1})_{3,2} = \underbrace{(−A_{3,3}^{-1} A_{3,1} A_{1,1}^{-1} A_{1,2} A_{2,2}^{-1})}_{\tilde A_1} ⊗ \underbrace{(B_{3,3}^{-1} B_{3,1} B_{1,1}^{-1} B_{1,2} B_{2,2}^{-1})}_{\tilde B_1} + \underbrace{(−A_{3,3}^{-1} A_{3,2} A_{2,2}^{-1})}_{\tilde A_2} ⊗ \underbrace{(B_{3,3}^{-1} B_{3,2} B_{2,2}^{-1})}_{\tilde B_2} = \tilde A_1 ⊗ \tilde B_1 + \tilde A_2 ⊗ \tilde B_2 .    (10.6)

If this strategy is used, it is necessary to use the detailed formulas described in the appendix. This option reduces the number of matrix-matrix products that have to be computed. However, the Toeplitz structure of the matrices in the original equation will be lost. Therefore the complexity will be O(N_x N_y^2 + N_x^2 N_y).

For the second option, the equations are left untouched, preserving the Toeplitz structure of the matrices involved. This requires a fairly large number of matrix-matrix products, but each can be sped up using the FFT in combination with the Gohberg–Semencul formula (see Theorem 3.0.4).

The complexity in this case is O(N_x N_y log N_y + N_y N_x log N_x), but with a fairly large constant, depending on the number of matrices in the expression.

10.2.1.2 Diagonal Approximation

Another approach is to approximate the highest level (the 3 × 3 block level) with a block diagonal matrix, i.e.

T_{(block)} = \begin{pmatrix} T_{1,1} & T_{1,2} & T_{1,3} \\ T_{2,1} & T_{2,2} & T_{2,3} \\ T_{3,1} & T_{3,2} & T_{3,3} \end{pmatrix} ≈ \begin{pmatrix} T_{1,1} & & \\ & T_{2,2} & \\ & & T_{3,3} \end{pmatrix} .

The inverse is then simply

T_{(block)}^{-1} ≈ \begin{pmatrix} T_{1,1}^{-1} & & \\ & T_{2,2}^{-1} & \\ & & T_{3,3}^{-1} \end{pmatrix} .

The MVP with T_{(block)}^{-1} can be computed efficiently if the Gohberg–Semencul formula is utilized (see Theorem 3.0.4). The cost of an MVP is then O(N_x N_y log N_y + N_y N_x log N_x), with a smaller constant compared to the previous method.

10.2.2 Multiple Terms Approximation

In the case of s > 2, each BTTB matrix will be approximated via an approximate SVD, so T_{i,j} ≈ U_{i,j} S_{i,j} V_{i,j}^H.

10.2.2.1 Sum Approximation

Using the same approximations for Z 0 and Z, as the ones described in Section 10.2.1.1, we get

(W^{-1})_{1,1} = V_{11} S_{11}^{-1} U_{11}^H + V_{11} S_{11}^{-1} U_{11}^H U_{12} S_{12} V_{12}^H V_{22} S_{22}^{-1} U_{22}^H U_{21} S_{21} V_{21}^H V_{11} S_{11}^{-1} U_{11}^H ,
(W^{-1})_{1,2} = −V_{11} S_{11}^{-1} U_{11}^H U_{12} S_{12} V_{12}^H V_{22} S_{22}^{-1} U_{22}^H ,
(W^{-1})_{2,1} = −V_{22} S_{22}^{-1} U_{22}^H U_{21} S_{21} V_{21}^H V_{11} S_{11}^{-1} U_{11}^H ,
(W^{-1})_{2,2} = V_{22} S_{22}^{-1} U_{22}^H ,

for the elements of W^{-1}. The elements of T_{(block)}^{-1} are given in the appendix.

The costs of an MVP with T_(block) are in this case O(N_x^2 N_y + N_y^2 N_x + N_x N_y) = O(N_x N_y (N_x + N_y + 1)). This can be achieved since all the matrices U_{i,j} are the result of a Kronecker product, so U = U_A ⊗ U_B; the same is true for the matrices V_{i,j}. A matrix-matrix product with U_{i,j} and V_{i,j} can be computed in O(N_x^2 N_y + N_y^2 N_x) = O(N_x N_y (N_x + N_y)). A matrix-matrix product with a matrix S_{i,j} can be computed in O(N_x N_y), since S_{i,j} is diagonal.

10.2.2.2 Common Basis

As another method, all BTTB matrices T_{i,j} could be diagonalized by common basis matrices U and V. Then T_{i,j} = U S_{i,j} V^H and no approximations for Z or Z' are needed, since

Z' = \big( U (S_{22} − S_{21} S_{11}^{-1} S_{12}) V^H \big)^{-1} := V S_{Z'}^{-1} U^H

can be computed easily. Similarly, for Z,

Z = \big( U (S_{22} − S_{21} S_{11}^{-1} S_{12}) V^H \big)^{-1} := V S_Z^{-1} U^H .

Using a common basis, the elements of W^{-1} are

(W^{-1})_{1,1} = V (S_{11}^{-1} + S_{11}^{-1} S_{12} S_{Z'}^{-1} S_{21} S_{11}^{-1}) U^H ,
(W^{-1})_{1,2} = −V (S_{11}^{-1} S_{12} S_{Z'}^{-1}) U^H ,
(W^{-1})_{2,1} = −V (S_{Z'}^{-1} S_{21} S_{11}^{-1}) U^H ,
(W^{-1})_{2,2} = V S_{Z'}^{-1} U^H ,

and ultimately, the elements of T_{(block)}^{-1} are

 −1  −1 −1 −1 −1 T = V S11 + S11 S12S 0 S21S11 (block) 11 Z −1 −1 −1 −1 −1 −1 −1 −1 + S11 S13SZ S32S11 + S11 S13SZ S32S11 S12SZ 0 S21S11 −1 −1 −1 −1 −1 + S11 S12SZ 0 S21S11 S13SZ S32S11 −1 −1 −1 −1 −1 −1 −1 + S11 S12SZ 0 S21S11 S13SZ S32S11 S12SZ 0 S21S11 −1 −1 −1 −1 − S11 S13SZ S32SZ 0 S21S11 −1 −1 −1 −1 −1 −1 − S11 S12SZ 0 S21S11 S13SZ S32SZ 0 S21S11 −1 −1 −1 −1 + S11 S12SZ 0 S23SZ S31S11 −1 −1 −1 −1 −1 −1 + S11 S12SZ 0 S23SZ S31S11 S12SZ 0 S21S11 −1 −1 −1 −1 −1 H − S11 S12SZ 0 S23SZ S31SZ 0 S21S11 U .

 −1  −1 −1 T = V S11 S12S 0 (block) 12 Z −1 −1 −1 −1 + S11 S13SZ S31S11 S12SZ 0 −1 −1 −1 −1 −1 −1 + S11 S12SZ 0 S21S11 S13SZ S31S11 S12SZ 0 −1 −1 −1 + S11 S13SZ S32SZ 0 −1 −1 −1 −1 −1 + S11 S12SZ 0 S21S11 S13SZ S32SZ 0 −1 −1 −1 −1 −1 + S11 S12SZ 0 S23SZ S31S11 S12SZ 0 −1 −1 −1 −1 H + S11 S12SZ 0 S23SZ S32SZ 0 U .

 −1  −1 −1 −1 −1 −1 −1 −1 −1 −1 H T = V S11 S13SZ + S11 S12S 0 S21S11 S13SZ + S11 S12S 0 S23SZ U . (block) 13 Z Z

 −1  −1 −1 −1 −1 −1 −1 T = V −S 0 S21S11 − S 0 S21S11 S13SZ S31S11 (block) 21 Z Z −1 −1 −1 −1 −1 −1 − SZ 0 S21S11 S13SZ S31S11 S12SZ 0 S21S11 −1 −1 −1 −1 −1 + SZ 0 S21S11 S13SZ S32SZ 0 S21S11 −1 −1 −1 −1 −1 −1 −1 −1 + SZ 0 S23SZ S31S11 + SZ 0 S23SZ S31S11 S12SZ 0 S21S11 −1 −1 −1 −1 H −SZ 0 S23SZ S32SZ 0 S21S11 U . 10.3 numerical experiments 79

 −1  −1 −1 −1 −1 −1 −1 T = V S 0 − S 0 S21S11 S13SZ S31S11 S12S 0 (block) 22 Z Z Z −1 −1 −1 −1 − SZ 0 S21S11 S13SZ S32SZ 0 −1 −1 −1 −1 −1 −1 −1 H +SZ 0 S23SZ S31S11 S12SZ 0 + SZ 0 S23SZ S32SZ 0 U .

 −1  −1 −1 −1 −1 −1 H T = V −S 0 S21S11 S13SZ + S 0 S23SZ U . (block) 23 Z Z

 −1  −1 −1 −1 −1 −1 −1 T = − V SZ S31S11 + SZ S31S11 S12S 0 S21S11 (block) 31 Z −1 −1 −1 H −SZ S32SZ 0 S21S11 U .

 −1  −1 −1 −1 −1 −1 H T = − V SZ S31S11 S12SZ + S 0 S32S 0 U . (block) 32 Z Z

 −1  −1 H T( ) = V SZ U . block 3,3

The complexity of this approach depends heavily on the basis ma- trices U and V .

10.2.2.3 Diagonal Approximation

Analogously to the diagonal approximation in the one-term case, T_(block) can be approximated with a block diagonal matrix on the highest level. The cost of the MVP with T_{(block)}^{-1} is in this case O(N_x N_y (N_x + N_y + 1)).

10.3 numerical experiments

10.3.1 Convergence of the Kronecker Product Approximation

The Kronecker product approximation is only an approximation if terms are omitted. If the complete number of terms is used (s = rank(Pˆ )), then it recreates the original matrix T(BTTB).

Figure 10.1 illustrates this fact by plotting the relative differences between 500 randomly created BTTB matrices T_(BTTB) and the result of the Kronecker product approximation when all terms are used. The result of the Kronecker product approximation is computed using the algorithm described in Section 10.1.1 and is a decomposition of the form \sum_k A_k ⊗ B_k.

Figure 10.1: Relative difference of the Kronecker product approximation (using all terms) and the originalBTTB matrix, for 500 ran- domly created test cases.

10.3.2 Decay of Singular Values

The core part of the Kronecker product approximation is the SVD that is calculated in step 3 of the algorithm described in Section 10.1.1. The decay of the singular values of this SVD can give us an idea of the quality of the approximation and also of how it relates to the number of terms used.

Figure 10.2 illustrates the decay of the singular values of this SVD for a sample case. Each line of the plot corresponds to one of the BTTB matrices of C. If a line stops, it means that the next singular value is exactly zero.

The illustration shows a fast decay of the singular values, indicating that each additional term of the Kronecker product approximation will significantly improve the approximation. While Figure 10.2 corresponds to only a single test case, a similar behavior has been observed for other test cases as well.

10.3.3 Relation to the Generating Function

As described in Chapter 3, a matrix with a Toeplitz structure can be associated with a generating function. The Kronecker product ap- proximation can also be interpreted in terms of the generating func- tion (see Figure 10.3).

Figure 10.2: Decay of the singular values of a sample test case.

The generating function of T_(BTTB) is a bivariate function f(x, y). The first term of the Kronecker product approximation consists of two Toeplitz matrices A_1 and B_1. Their generating functions are univariate and we can denote them as g(x) and h(y). The Kronecker product A_1 ⊗ B_1 will result in a BTTB matrix. Its generating function f_1(x, y) can be written as

f1(x, y) = g(x) · h(y) .

In other words, the first term of the Kronecker product approx- imation A1 ⊗ B1 corresponds to an approximation of the original generating function by a separable function f1(x, y). 82 kronecker product approximation

[Figure: f(x, y) ≈ g(x) · h(y) = f_1(x, y).]

Figure 10.3: Relation of the Kronecker product approximation and the generating functions (taken from the test case 1b).

This also means that if the original generating function f(x, y) of T(BTTB) is a separable function, a single term of the Kronecker product approximation is a perfect decomposition. Therefore, it can be expected that the Kronecker product approximation works very well for separable or almost separable generating functions.

Figure 10.4 illustrates the convergence of the Kronecker product approximation in terms of the generating function. Figure 10.4a shows the top view of the generating function corresponding to the original BTTB matrix (taken from test case 2b). Figure 10.4b is the generating function corresponding to A1 ⊗ B1, the first term approximation of the Kronecker product approximation. This function is clearly separable in x and y direction.

Figure 10.4c and Figure 10.4d add one and two more terms to the approximation, respectively. It is visible that in this example, the three term approximation gives a very good approximation of the original generating function.

Figure 10.4: Convergence of the Kronecker product approximation in terms of the generating function. (a) Original generating function. (b) One term approximation. (c) Two term approximation. (d) Three term approximation.

MORE IDEAS 11

11.1 transformation based preconditioners

In Chapter 8, circulant matrices have been discussed as possible preconditioners, because they are diagonalizable by the Fourier transformation. This fact allows for an efficient inversion and an efficient computation of its MVP.

Similar to Chapter 8, other transformations can be used to derive a different set of preconditioners P where

P = T_(Transf.)^H Λ T_(Transf.) ,

for Λ a diagonal matrix and T_(Transf.) the associated transformation.

If the transformation T_(Transf.) can be computed efficiently, for example by being a Fourier-related transform, the efficient algorithms for the inversion and the MVP still hold true.
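As a sketch of how such a preconditioner can be applied, the snippet below uses the orthonormal DCT-II from SciPy as the fast transform; choosing Λ as the diagonal of T C T^H is one natural (Frobenius-optimal within the resulting matrix algebra) option and is an assumption here, not something prescribed by the text.

import numpy as np
from scipy.fft import dct, idct

def dct2_diagonal(C):
    """Diagonal of T C T^H for the orthonormal DCT-II matrix T (small, dense C);
    one possible choice of Lambda for the preconditioner below."""
    M = dct(dct(C, type=2, norm="ortho", axis=0), type=2, norm="ortho", axis=1)
    return np.diag(M).copy()

def apply_preconditioner_inverse(lam, v):
    """Apply P^{-1} v for P = T^H diag(lam) T; any fast unitary transform works."""
    w = dct(v, type=2, norm="ortho") / lam   # diag(lam)^{-1} (T v)
    return idct(w, type=2, norm="ortho")     # T^H (...)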

In the following sections a series of Fourier-related transforms are described briefly.

11.1.1 Discrete Sine and Cosine Transform

The discrete sine transforms (DSTs) and discrete cosine transforms (DCTs) are a series of Fourier-related transforms and are used in applications such as signal processing or statistics. Martucci [37] described in total 16 transformations (four even and four odd versions of each) and generalized them as discrete trigonometric transform (DTT).

Figure 11.1 and Figure 11.2 show the result of the DST II and DCT II transformation on a BTTB matrix. The transformation matrices of DST II and DCT II of size n are defined as

T(DST II)_{i,j} = c_{i,j} sin( π (i + 1)(2j + 1) / (2n) ) ,   i, j = 0, 1, ... , n − 1 ,

T(DCT II)_{i,j} = c_{i,j} cos( π i (2j + 1) / (2n) ) ,   i, j = 0, 1, ... , n − 1 ,

where c_{i,j} is a scaling factor.
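The following sketch builds these transformation matrices explicitly, using the common orthonormal scaling for c_{i,j} (an assumption, since the scaling factor is left unspecified here); with this scaling both matrices are orthogonal, so their inverse is simply the transpose.

import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-II matrix: rows c_i * cos(pi * i * (2j+1) / (2n))."""
    i, j = np.mgrid[0:n, 0:n]
    M = np.cos(np.pi * i * (2 * j + 1) / (2 * n)) * np.sqrt(2.0 / n)
    M[0, :] /= np.sqrt(2.0)          # scaling of the first row
    return M

def dst2_matrix(n):
    """Orthonormal DST-II matrix: rows c_i * sin(pi * (i+1) * (2j+1) / (2n))."""
    i, j = np.mgrid[0:n, 0:n]
    M = np.sin(np.pi * (i + 1) * (2 * j + 1) / (2 * n)) * np.sqrt(2.0 / n)
    M[-1, :] /= np.sqrt(2.0)         # scaling of the last row
    return M

# With the orthonormal scaling, M @ M.T is the identity up to rounding.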


(a) Color plot of the original BTTB matrix. (b) Color plot of the approximation via the DST II.

Figure 11.1: Color plots for a BTTB matrix and the approximation resulting from DST II.

(a) Color plot of the original BTTB matrix. (b) Color plot of the approximation via the DCT II.

Figure 11.2: Color plots for a BTTB matrix and the approximation resulting from DCT II.

11.1.2 Hartley Transform

Another Fourier-related transform is the Hartley transform. The discrete Hartley transform matrix is defined as

T(Hartley)_{i,j} = (1/√n) ( cos( 2π i j / n ) + sin( 2π i j / n ) ) ,   i, j = 0, 1, ... , n − 1 .
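A corresponding sketch for the Hartley matrix, assuming the usual cas-form cas(x) = cos(x) + sin(x) with 1/√n scaling, under which the matrix is symmetric and orthogonal:

import numpy as np

def hartley_matrix(n):
    """Discrete Hartley transform matrix, entries cas(2*pi*i*j/n)/sqrt(n),
    with cas(x) = cos(x) + sin(x); symmetric and orthogonal."""
    i, j = np.mgrid[0:n, 0:n]
    x = 2.0 * np.pi * i * j / n
    return (np.cos(x) + np.sin(x)) / np.sqrt(n)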

Figure 11.3 illustrates the result of an approximation with the Hartley transformation.

(a) Color plot of the original BTTB matrix. (b) Color plot of the approximation via the Hartley transformation.

Figure 11.3: Color plots for a BTTB matrix and the approximation resulting from a Hartley transformation.

11.2 banded approximations

A banded matrix is a matrix whose non-zero elements are limited to the main diagonal plus some diagonals on either side. For example, a tridiagonal matrix has only non-zero elements on the main diagonal as well as on the diagonals directly above and below it. Figure 11.4 illustrates the result of a tridiagonal approximation on both levels.

(a) Color plot of the original BTTB matrix. (b) Color plot of a two-level tridiagonal approximation.

Figure 11.4: Color plots for a BTTB matrix and a tridiagonal approximation on both levels.

Banded approximations can be used on both levels of a BTTB matrix, as described in Section 8.2. However, an efficient inversion and MVP are only known if a diagonal approximation is applied.

It is possible that efficient formulas for the inversion and the MVP exist as well, for example with the help of specialized banded solvers. However, due to the restricted time of this thesis, they have not been considered in this work. Nevertheless, the performances of banded approximations with different bandwidths are shown in Chapter 12. These results can help in evaluating whether it is worth investing more time in banded approximations.
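As an illustration of what such a specialized banded solver could look like, the sketch below extracts a banded approximation of a matrix and solves with it via SciPy's solve_banded; the function name and the bandwidth handling are illustrative and not part of this thesis.

import numpy as np
from scipy.linalg import solve_banded

def banded_solve(A, b, bw):
    """Solve with a banded approximation of A (bandwidth bw on each side),
    using a banded LU factorization of cost O(n * bw^2)."""
    n = A.shape[0]
    ab = np.zeros((2 * bw + 1, n), dtype=A.dtype)
    for d in range(-bw, bw + 1):                 # copy diagonal d of A into
        ab[bw - d, max(d, 0):n + min(d, 0)] = np.diagonal(A, d)   # banded storage
    return solve_banded((bw, bw), ab, b)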

11.3 koyuncu factorization

In his PhD thesis, Koyuncu [32] provides analytical inversion formulas for BTTB matrices. They can be seen as generalizations of the Gohberg–Semencul formula for one-level Toeplitz matrices. The main result [32, Thm. 3.0.1] of the thesis is the following:

Theorem 11.3.1: Inversion Formula for BTTB Matrices

Let

P(z1, z2) = Σ_{k=0}^{n1} Σ_{l=0}^{n2} P_{kl} z1^k z2^l ,   Q(z1, z2) = Σ_{k=0}^{n1} Σ_{l=0}^{n2} Q_{kl} z1^k z2^l ,

R(z1, z2) = Σ_{k=0}^{n1} Σ_{l=0}^{n2} R_{kl} z1^k z2^l ,   S(z1, z2) = Σ_{k=0}^{n1} Σ_{l=0}^{n2} S_{kl} z1^k z2^l ,

be stable operator valued polynomials, and suppose that

Q(z1, z2) P(z1, z2)^H = S(z1, z2)^H R(z1, z2)

on T^2, where T = {z ∈ C : |z| = 1}. Put

f(z1, z2) = ( P(z1, z2)^H )^{-1} Q(z1, z2)^{-1} = R(z1, z2)^{-1} ( S(z1, z2)^H )^{-1} ,

for z1, z2 ∈ T. Put Λ = n \ {n}, where n = (n1, n2), and write the Fourier coefficients of f(z1, z2) as f̂(k, l), (k, l) ∈ Z^2. Consider

T = ( f̂(k1 − k2, l1 − l2) )_{(k1,l1),(k2,l2) ∈ Λ}

Define A, A1, B, B1, C1, C̃1, C2, C̃2, D1, D2 as given below. If Range(C_i^H) ⊂ Range(D_i) or Range(C̃_i^H) ⊂ Range(D̃_i) for i = 1, 2, then

T^{-1} = A1 A^H − B^H B1 − C̃1^H D1^{(-1)} C1 − C̃2^H D2^{(-1)} C2 ,

where D1^{(-1)} and D2^{(-1)} denote generalized inverses of D1 and D2.

The matrices A, A1, B, B1, C1, C̃1, C2, C̃2, D1, D2 are defined as follows:

A = ( P_{k−l} )_{k,l ∈ Λ} ,   A1 = ( Q_{k−l} )_{k,l ∈ Λ} ,

B = ( S_{k−l} )_{k ∈ n+Λ, l ∈ Λ} ,   B1 = ( R_{k−l} )_{k ∈ n+Λ, l ∈ Λ} ,

(C1)_{ij} = Σ_{k1=i1−n1}^{j1} Σ_{k2=0}^{min{i2,j2}} Q_{k−i} P^H_{j−k} − Σ_{l1=i1}^{j1+n1} Σ_{l2=n2}^{min{i2+n2,j2+n2}} S^H_{l−i} R_{l−j} ,

(C̃1)_{ij} = Σ_{k1=i1−n1}^{j1} Σ_{k2=0}^{min{i2,j2}} P_{k−i} Q^H_{j−k} − Σ_{l1=i1}^{j1+n1} Σ_{l2=n2}^{min{i2+n2,j2+n2}} R^H_{l−i} S_{l−j} ,

where i ∈ Θ1 = {n1 + 1, n1 + 2, ... } × {0, 1, ... , n2 − 1}, j ∈ n1 × n2\{n1, n2},

(C2)_{ij} = Σ_{k1=0}^{min{i1,j1}} Σ_{k2=i2−n2}^{j2} Q_{k−i} P^H_{j−k} − Σ_{l1=n1}^{min{i1+n1,j1+n1}} Σ_{l2=i2}^{j2+n2} S^H_{l−i} R_{l−j} ,

(C̃2)_{ij} = Σ_{k1=0}^{min{i1,j1}} Σ_{k2=i2−n2}^{j2} P_{k−i} Q^H_{j−k} − Σ_{l1=n1}^{min{i1+n1,j1+n1}} Σ_{l2=i2}^{j2+n2} R^H_{l−i} S_{l−j} ,

where i ∈ Θ2 = {0, 1, ... , n1 − 1} × {n2 + 1, n2 + 2, ... }, j ∈ n1 × n2\{n1, n2},

(D1)_{k,k̃} = Σ_{l1=max{k1,k̃1}−n1}^{min{k1,k̃1}} Σ_{l2=0}^{min{k2,k̃2}} Q_{k−l} P^H_{k̃−l} − Σ_{s1=max{k1,k̃1}}^{min{k1,k̃1+n1}} Σ_{s2=n2}^{min{k2,k̃2+n2}} S^H_{s−k} R_{s−k̃} ,

where k, k̃ ∈ Θ1, and

(D2)_{k,k̃} = Σ_{l1=0}^{min{k1,k̃1}} Σ_{l2=max{k2,k̃2}−n2}^{min{k2,k̃2}} Q_{k−l} P^H_{k̃−l} − Σ_{s1=n1}^{min{k1,k̃1+n1}} Σ_{s2=max{k2,k̃2}}^{min{k2,k̃2+n2}} S^H_{s−k} R_{s−k̃} ,

where k, k̃ ∈ Θ2, and P_k = R_k = Q_k = S_k = 0 whenever k ∉ n.

Although the inversion formula is something that would potentially benefit the construction of a preconditioner heavily, this method is not pursued further in this work. This is mainly due to the fact that

the prerequisites of the presented Theorem 11.3.1 are very strict. It is considered highly unlikely that such a decomposition of the generating function is possible, or that it can be closely approximated by one of this form.

11.4 low-rank update

In addition to preconditioning methods based on approximating the inverse of C, the option of involving the other two matrices G and M that make up A has been analyzed.

The matrix-matrix product GM can be approximated with an SVD of rank k, i.e.

GM ≈ U_k S_k V_k^H ,   (11.1)

H where GM = USV , the result of anSVD on GM and Uk is the matrix equal to the first k columns of U, Vk is the matrix equal to the first k rows of V and Sk is the matrix that consists of the first k rows and k columns of S.

Instead of using only the preconditioner based on C, we can then use P(C) + U_k S_k V_k^H as a new preconditioner. Here P(C) denotes the preconditioner based on C (it can be any of the preconditioners suggested in Chapters 7 to 11) and U_k S_k V_k^H is the low-rank update.

The inversion of this new preconditioner can be computed using the Woodbury matrix identity [5, Cor. 2.8.8],

( P(C) + U_k S_k V_k^H )^{-1} = P(C)^{-1} − P(C)^{-1} U_k ( S_k^{-1} + V_k^H P(C)^{-1} U_k )^{-1} V_k^H P(C)^{-1} ,

which only requires the inversion of P(C) (this can be done efficiently if P(C) is one of the suggested preconditioners) and of a matrix of size k × k. Thus, the low-rank update is only efficient if k ≪ size(C).

Figure 11.5 illustrates the relative difference of the left and the right side of (11.1) for different values of k and several test cases. It is visible that for k ≪ size(C), the quality of the approximation is rather low. Some sample performance benchmarks also confirm that the improvement of the preconditioner with a low-rank update is not relevant for small values of k.
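A sketch of how the updated preconditioner could be applied with this identity is given below; solve_P (the action of P(C)^{-1}, assumed to accept a matrix of right-hand sides column-wise), Uk, sk and Vk are assumed inputs and the names are illustrative.

import numpy as np

def woodbury_apply(solve_P, Uk, sk, Vk, x):
    """Apply (P + Uk diag(sk) Vk^H)^{-1} to x via the Woodbury identity,
    given solve_P(y) = P^{-1} y and a rank-k truncated SVD of G M.
    Only one k x k system has to be solved in addition to P."""
    Px = solve_P(x)                                   # P^{-1} x
    PU = solve_P(Uk)                                  # P^{-1} Uk  (n x k)
    small = np.diag(1.0 / sk) + Vk.conj().T @ PU      # S_k^{-1} + Vk^H P^{-1} Uk
    return Px - PU @ np.linalg.solve(small, Vk.conj().T @ Px)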

Figure 11.5: Relative difference of GM and U_k S_k V_k^H for different values of k and four different test cases.

Part III

BENCHMARKS

A variety of real-life test scenarios is used to test the performance of the preconditioners proposed in the last part. They will be compared in terms of the number of iterations needed until convergence is reached.

BENCHMARKS 12

In this chapter, several test cases are used to analyze the performance of the proposed preconditioners. The test cases stem from the application described in Chapter 1.

In the benchmarks, IDR(s) is chosen as the iterative solver, using the MATLAB implementation described by van Gijzen and Sonneveld [19], which can be found online (http://ta.twi.tudelft.nl/nw/users/gijzen/IDR.html). The solver is used on a left preconditioned system, as described in (2.3), with the following parameters (a generic sketch of such a preconditioned Krylov solve is given after the parameter list):

• s = 6: Dimension of the shadow space.

• tol = 10−7: Tolerance of the method.

• maxit = 5000: Maximum number of iterations.
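Since the IDR(s) MATLAB code itself is not reproduced here, the following sketch only illustrates the generic pattern of solving a preconditioned system, using SciPy's GMRES purely as a stand-in Krylov solver; apply_P_inv stands for the action of whichever preconditioner inverse is used, and all names are illustrative.

import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def preconditioned_solve(A, b, apply_P_inv):
    """Krylov solve of A x = b with a preconditioner supplied as the action
    of its inverse; SciPy decides how the preconditioner is incorporated."""
    n = A.shape[0]
    M = LinearOperator((n, n), matvec=apply_P_inv)   # acts like P^{-1}
    x, info = gmres(A, b, M=M)                       # info == 0 on convergence
    return x, info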

The following preconditioners have been tested:

• No Preconditioner: This is the unpreconditioned system Ax = b and is useful as a reference.

• Exact C: This preconditioner uses the complete matrix C, as de- scribed in Chapter 7. This will (usually) describe the upper limit of the performance of any preconditioner that will be based on C. While it reduces the number of iterations drastically, it is also important to note that each iteration will take relatively long, and therefore will not be optimal in terms of time.

• Circulant: This preconditioner replaces both Toeplitz levels with Chan's optimal preconditioner, described in Section 8.1.1.2. It is the current default choice and will therefore be used as the main reference in this benchmark.

• DCT II: Similar to the circulant preconditioner, but instead of the Fourier transformation, DCT II is used, as described in Section 11.1.1.

• DST II: Similar to the circulant preconditioner, but instead of the Fourier transformation, DST II is used, as described in Section 11.1.1.

• Hartley: Similar to the circulant preconditioner, but instead of the Fourier transformation, the Hartley transformation is used, as described in Section 11.1.2.


• Kron 1: Each BTTB block of C is approximated with a one term Kronecker product approximation. No further approximations are done on the 3 × 3 symmetric level.

• Kron 2: Each BTTB block of C is approximated with a two term Kronecker product approximation. Although no efficient inversion and MVP could be found, the results of this benchmark might still provide further insight.

• Kron 3: Each BTTB block of C is approximated with a three term Kronecker product approximation. Although no efficient inversion and MVP could be found, the results of this benchmark might still provide further insight.

• Kron Full: Each BTTB block of C is described using all the terms resulting from the Kronecker product approximation. Since it replicates the matrix C exactly, the results should be very close to the one from Exact C. Although no efficient inversion and MVP could be found, the results of this benchmark might still provide further insight.

• Kron SVD 1: Each BTTB block of C is approximated with the approximate SVD described in Section 10.1.2.3, using one term of the Kronecker product approximation. No further approximations are done on the 3 × 3 symmetric level.

• Kron SVD 2: Each BTTB block of C is approximated with the approximate SVD described in Section 10.1.2.3, using two terms of the Kronecker product approximation. No further approximations are done on the 3 × 3 symmetric level.

• Kron SVD 3: Each BTTB block of C is approximated with the approximate SVD described in Section 10.1.2.3, using three terms of the Kronecker product approximation. No further approximations are done on the 3 × 3 symmetric level.

• Kron SVD Full: Each BTTB block of C is approximated with the approximate SVD described in Section 10.1.2.3, using all terms of the Kronecker product approximation. No further approximations are done on the 3 × 3 symmetric level.

• Kron 1 Approx: Same preconditioner as Kron 1, but with the sum approximation on the 3 × 3 symmetric level, mentioned in Section 10.2.1.

• Kron 1 Approx Matrix: Same preconditioner as Kron 1 Approx, but implemented without the use of the Kronecker decomposition. The results should be close to Kron 1 Approx and can be used as a validation.

• Kron 1 Diagonal: Same preconditioner as Kron 1, but with the diagonal approximation on the 3 × 3 symmetric level, mentioned in Section 10.2.1.

• Kron SVD Diagonal: Same preconditioner as Kron SVD Full, but with the diagonal approximation on the 3 × 3 symmetric level, mentioned in Section 10.2.2.

• IGF 1: Preconditioner described in Section 9.2, with a sampling rate of 1, as defined in( 9.2).

• IGF 3: Preconditioner described in Section 9.2, with a sampling rate of 3, as defined in( 9.2).

• IGF 5: Preconditioner described in Section 9.2, with a sampling rate of 5, as defined in( 9.2).

• IGF 7: Preconditioner described in Section 9.2, with a sampling rate of 7, as defined in( 9.2).

• IGF 5 (α = 0.3): Same preconditioner as IGF 5, but with an additional regularization of α = 0.3.

• Diagonal: Each Toeplitz level of C has been approximated by a diagonal matrix, as described in Section 11.2. No further approximations are done on the 3 × 3 symmetric level.

• Tridiagonal: Each Toeplitz level of C has been approximated by a tridiagonal matrix, as described in Section 11.2. No further approximations are done on the 3 × 3 symmetric level.

• Pentadiagonal: Each Toeplitz level of C has been approximated by a pentadiagonal matrix, as described in Section 11.2. No further approximations are done on the 3 × 3 symmetric level.

• Heptadiagonal: Each Toeplitz level of C has been approximated by a heptadiagonal matrix, as described in Section 11.2. No further approximations are done on the 3 × 3 symmetric level.

The computation was done in MATLAB R2016b on a system with an Intel Core i5-6200U.

Table 12.2 contains all the results from the benchmark. It shows the number of iterations until convergence is reached, if one of the above mentioned preconditioners is used. The results are color-coded, in a way that is shown in Table 12.1.

In addition to the results per test case, the table also provides the sum, the average and the median of all the test cases, per preconditioner, rounded to the nearest integer. While the median value tends

Table 12.1: Color-code for the tables in the benchmark chapter.
    far better  (> 10% fewer iterations vs. circulant)
    better      (< 10% fewer iterations vs. circulant)
    worse       (< 10% more iterations vs. circulant)
    far worse   (> 10% more iterations vs. circulant)

to weight all test cases equally, the average weights the test cases depending on the number of iterations needed. Figure 12.1 illustrates the results in terms of the speed up each preconditioner produces relative to the circulant preconditioner.

The test cases are categorized in three groups, Group 1, Group 2 and Group 3. It can be expected that the test cases in each group share common properties.

The following sections will briefly describe the results per preconditioning method.

Table 12.2: Number of iterations if a selected preconditioner is used on a certain test case.

Figure 12.1: Box plots for the relative speed up of each preconditioner compared to the circulant preconditioner.

12.1 transformation-based preconditioner

Table 12.3 shows the results for transformation-based preconditioners, in comparison to no preconditioning and the exact C preconditioner. The exact C preconditioner reduces the iterations drastically, but it is important to note that the time per iteration for this preconditioner is relatively large.

Table 12.3: Number of iterations for transformation based preconditioners.

Group   Test Case   No Prec.   Exact C   Circulant   DCT II   DST II   Hartley

Group 1 Test Case 1a 1049 210 682 842 755 878 Group 1 Test Case 1b >5000 254 3905 4370 3629 >5000 Group 1 Test Case 1c 2488 1178 2211 2399 2332 2858 Group 1 Test Case 2a 383 220 380 409 396 398 Group 1 Test Case 2b 71 54 62 62 65 66 Group 1 Test Case 3a 373 176 338 350 346 359 Group 1 Test Case 3b 69 47 60 61 63 65 Group 1 Test Case 4a 322 184 290 297 289 290 Group 1 Test Case 4b 74 39 46 43 43 44 Group 1 Test Case 5a 380 224 327 328 352 331 Group 1 Test Case 5b 77 53 58 63 60 59 Group 1 Test Case 6a 584 267 450 466 457 460 Group 1 Test Case 6b 113 51 60 61 59 62 Group 1 Test Case 7a 402 195 231 275 247 267 Group 1 Test Case 7b 172 74 93 125 112 125 Group 2 Test Case 8a 429 122 197 211 146 191 Group 2 Test Case 8b 166 56 100 100 79 100 Group 2 Test Case 9a 384 119 147 161 125 146 Group 2 Test Case 9b 147 48 68 71 65 69 Group 2 Test Case 10a 92 41 63 59 70 65 Group 2 Test Case 10b 50 24 33 30 33 35 Group 3 Test Case 11a 504 76 96 95 114 98 Group 3 Test Case 11b 2991 100 107 119 132 111 Group 3 Test Case 11c >5000 803 856 1481 >5000 1756 Group 3 Test Case 12a 549 79 227 128 168 191 Group 3 Test Case 12b >5000 770 >5000 >5000 >5000 >5000 Group 3 Test Case 13a >5000 191 1518 287 526 1415 Group 3 Test Case 13b >5000 1700 >5000 >5000 >5000 >5000 Group 3 Test Case 14a 129 63 70 86 86 68 Group 3 Test Case 14b 790 108 418 727 523 449

Sum 37793 7526 23095 23708 26275 25959 Average 1260 251 770 790 876 865

Median 384 114 212 186 157 191

In comparison to that, the circulant preconditioner reduces the iterations compared to no preconditioning as well, but with less additional time per iteration. On average, the circulant preconditioner requires only 61% of the iterations the original system requires.

The other three transformations, DCT II, DST II and Hartley, do not show a clear pattern. While the median of all three is smaller than the median of the circulant preconditioner, their average is not. While in many test cases, the results are around the same as for the circulant one, there are some cases where the number of iterations is a lot higher, see for example test case 11c for all three transformations.

12.2 kronecker product approximation

Table 12.4 and Table 12.5 show the results for preconditioners based on the Kronecker product approximation.

Table 12.4: Number of iterations for preconditioners based on the Kronecker product approximation.

Group   Test Case   No Prec.   Exact C   Circulant   Kron 1   Kron 2   Kron 3   Kron Full   Kron 1 Approx   Kron 1 Approx Matrix   Kron 1 Diagonal

Group 1 Test Case 1a 1049 210 682 623 537 339 211 632 652 878 Group 1 Test Case 1b >5000 254 3905 3263 2821 966 261 2933 3194 2881 Group 1 Test Case 1c 2488 1178 2211 1387 1454 1371 1218 1615 1585 1515 Group 1 Test Case 2a 383 220 380 262 250 223 214 263 262 263 Group 1 Test Case 2b 71 54 62 58 55 54 54 57 59 58 Group 1 Test Case 3a 373 176 338 228 224 192 177 230 229 224 Group 1 Test Case 3b 69 47 60 52 51 47 47 54 54 56 Group 1 Test Case 4a 322 184 290 199 229 189 185 198 206 212 Group 1 Test Case 4b 74 39 46 38 40 40 39 39 39 41 Group 1 Test Case 5a 380 224 327 235 233 226 229 241 240 235 Group 1 Test Case 5b 77 53 58 52 55 53 53 54 54 55 Group 1 Test Case 6a 584 267 450 336 341 528 279 357 340 350 Group 1 Test Case 6b 113 51 60 64 57 94 51 60 60 62 Group 1 Test Case 7a 402 195 231 266 762 2372 198 259 259 267 Group 1 Test Case 7b 172 74 93 105 191 311 74 109 107 108 Group 2 Test Case 8a 429 122 197 242 481 161 121 237 238 242 Group 2 Test Case 8b 166 56 100 121 331 69 56 121 118 121 Group 2 Test Case 9a 384 119 147 116 119 119 118 116 116 116 Group 2 Test Case 9b 147 48 68 55 50 48 48 55 55 55 Group 2 Test Case 10a 92 41 63 51 42 42 41 55 55 50 Group 2 Test Case 10b 50 24 33 26 24 24 24 28 28 28 Group 3 Test Case 11a 504 76 96 81 78 77 76 176 180 81 Group 3 Test Case 11b 2991 100 107 100 101 105 104 351 326 100 Group 3 Test Case 11c >5000 803 856 798 809 806 788 >5000 >5000 798 Group 3 Test Case 12a 549 79 227 262 134 90 79 482 472 262 Group 3 Test Case 12b >5000 770 >5000 >5000 >5000 >5000 789 >5000 >5000 >5000 Group 3 Test Case 13a >5000 191 1518 192 192 192 192 188 188 192 Group 3 Test Case 13b >5000 1700 >5000 1670 1670 1670 1670 1588 1701 1670 Group 3 Test Case 14a 129 63 70 62 62 62 62 62 62 62 Group 3 Test Case 14b 790 108 418 109 109 109 109 110 108 109

Sum 37793 7526 23095 16054 16503 15580 7567 20672 20989 16092 Average 1260 251 770 535 550 519 252 689 700 536

Median 384 114 212 157 192 140 114 193 197 157

While Kron 1 is a valid preconditioner on the BTTB level, the results for Kron 2, Kron 3 and Kron Full are only there for the purpose of providing further insights, since for these preconditioners no efficient inversion and MVP is known. Kron Full, for example, should reproduce the results of the exact C preconditioner closely, which can be verified with the results from the table.

Although Kron 2 and Kron 3 should both provide a better approximation of C than Kron 1, only Kron 3 shows a better performance in both the average and the median. This suggests that it is not worthwhile to further look into possibilities to use a two-term Kronecker approximation as a preconditioner.

There exist some test cases where the number of iterations is (almost) equivalent for Kron 1, Kron 2, Kron 3 and Kron Full. These test cases correspond to a separable generating function.

The last three columns correspond to additional approximations on the 3 × 3 symmetric level of C, compared to Kron 1. While all three show an increase in iterations (on average and in the median) compared to Kron 1, it is very small for Kron 1 Diagonal. It is quite surprising that Kron 1 Approx and Kron 1 Approx Matrix produce worse results than Kron 1 Diagonal, which could be evidence for an incorrect implementation.

It is very important to note that all of the preconditioners suggested in this section perform better or far better than the circulant preconditioner. This is true for the preconditioners in Table 12.5 as well. These preconditioners result from an approximate SVD. Again, there is no strictly monotonic trend visible if more terms of the Kronecker product approximation are used.

12.3 inverse generating function

Table 12.6 shows the result of the benchmark for preconditioners based on the IGF method. It is easily visible that the method is far worse in almost all test cases. There are a few exceptions where it is better than the circulant approximation.

It is worth noting that in none of the test cases the matrix C is expected to fulfill the assumptions mentioned in Theorem 9.2.3. To be more specific, if the matrix C is complex valued, it is never Hermitian. Additionally, the matrix is typically not positive definite, but can be in special cases.

The last column shows the result with an additional regularization as described in Section 9.3. While the result can be better than without regularization, it still was not enough to provide a better result than

Table 12.5: Number of iterations for preconditioners based on the Kronecker product approximation with approximate SVD.

Group   Test Case   No Prec.   Exact C   Circulant   Kron SVD 1   Kron SVD 2   Kron SVD 3   Kron SVD Full   Kron SVD Diagonal

Group 1 Test Case 1a 1049 210 682 767 801 759 833 804 Group 1 Test Case 1b >5000 254 3905 2783 3442 3114 3401 3154 Group 1 Test Case 1c 2488 1178 2211 1439 1497 1450 1484 1583 Group 1 Test Case 2a 383 220 380 254 329 327 333 331 Group 1 Test Case 2b 71 54 62 55 59 59 59 57 Group 1 Test Case 3a 373 176 338 223 303 299 300 296 Group 1 Test Case 3b 69 47 60 52 52 53 53 58 Group 1 Test Case 4a 322 184 290 203 205 201 207 217 Group 1 Test Case 4b 74 39 46 38 40 41 42 43 Group 1 Test Case 5a 380 224 327 239 248 255 256 244 Group 1 Test Case 5b 77 53 58 52 54 54 54 56 Group 1 Test Case 6a 584 267 450 342 438 428 420 406 Group 1 Test Case 6b 113 51 60 66 58 55 55 60 Group 1 Test Case 7a 402 195 231 269 326 391 254 249 Group 1 Test Case 7b 172 74 93 105 118 128 115 117 Group 2 Test Case 8a 429 122 197 243 356 167 156 156 Group 2 Test Case 8b 166 56 100 121 229 80 77 77 Group 2 Test Case 9a 384 119 147 116 118 117 120 120 Group 2 Test Case 9b 147 48 68 55 51 52 51 51 Group 2 Test Case 10a 92 41 63 51 43 42 42 47 Group 2 Test Case 10b 50 24 33 26 24 24 24 26 Group 3 Test Case 11a 504 76 96 81 79 79 77 77 Group 3 Test Case 11b 2991 100 107 111 105 111 107 107 Group 3 Test Case 11c >5000 803 856 789 813 851 790 790 Group 3 Test Case 12a 549 79 227 262 153 137 131 131 Group 3 Test Case 12b >5000 770 >5000 >5000 >5000 >5000 >5000 >5000 Group 3 Test Case 13a >5000 191 1518 188 188 188 188 188 Group 3 Test Case 13b >5000 1700 >5000 1793 1793 1793 1793 1793 Group 3 Test Case 14a 129 63 70 62 62 62 62 62 Group 3 Test Case 14b 790 108 418 109 109 109 109 109

Sum 37793 7526 23095 15895 17094 16427 16594 16410 Average 1260 251 770 530 570 548 553 547

Median 384 114 212 155 171 133 126 126

the circulant preconditioner. However, the value of α in this case was just a guess, based on very few previous tests.

Table 12.6: Number of iterations for preconditioners based on the IGF.

Group   Test Case   No Prec.   Exact C   Circulant   IGF 1   IGF 3   IGF 5   IGF 7   IGF 5 (α = 0.3)

Group 1 Test Case 1a 1049 210 682 >5000 970 600 >5000 >5000 Group 1 Test Case 1b >5000 254 3905 >5000 2044 902 >5000 >5000 Group 1 Test Case 1c 2488 1178 2211 >5000 >5000 >5000 >5000 >5000 Group 1 Test Case 2a 383 220 380 4030 1065 692 >5000 >5000 Group 1 Test Case 2b 71 54 62 1876 2887 1112 >5000 >5000 Group 1 Test Case 3a 373 176 338 3872 859 579 4784 4730 Group 1 Test Case 3b 69 47 60 1521 2971 1060 >5000 4492 Group 1 Test Case 4a 322 184 290 3854 1470 775 >5000 >5000 Group 1 Test Case 4b 74 39 46 590 2708 1224 >5000 2852 Group 1 Test Case 5a 380 224 327 >5000 1335 920 >5000 >5000 Group 1 Test Case 5b 77 53 58 1112 3953 1363 >5000 >5000 Group 1 Test Case 6a 584 267 450 >5000 1948 1028 >5000 >5000 Group 1 Test Case 6b 113 51 60 1986 >5000 2151 4809 >5000 Group 1 Test Case 7a 402 195 231 >5000 >5000 >5000 >5000 767 Group 1 Test Case 7b 172 74 93 779 >5000 >5000 >5000 289 Group 2 Test Case 8a 429 122 197 1437 1035 529 >5000 944 Group 2 Test Case 8b 166 56 100 1021 3909 1707 >5000 296 Group 2 Test Case 9a 384 119 147 1442 1247 1516 >5000 754 Group 2 Test Case 9b 147 48 68 571 2947 930 >5000 263 Group 2 Test Case 10a 92 41 63 754 163 115 233 176 Group 2 Test Case 10b 50 24 33 185 599 352 476 67 Group 3 Test Case 11a 504 76 96 1148 >5000 1806 >5000 >5000 Group 3 Test Case 11b 2991 100 107 1321 3631 1710 >5000 >5000 Group 3 Test Case 11c >5000 803 856 >5000 >5000 >5000 >5000 >5000 Group 3 Test Case 12a 549 79 227 >5000 2526 >5000 >5000 >5000 Group 3 Test Case 12b >5000 770 >5000 >5000 >5000 >5000 >5000 >5000 Group 3 Test Case 13a >5000 191 1518 687 1712 719 >5000 >5000 Group 3 Test Case 13b >5000 1700 >5000 >5000 >5000 >5000 >5000 >5000 Group 3 Test Case 14a 129 63 70 87 65 63 282 280 Group 3 Test Case 14b 790 108 418 2089 553 200 3178 1960

Sum 37793 7526 23095 80372 80605 57060 133786 100647 Average 1260 251 770 2679 2687 1902 4460 3355

Median 384 114 212 1931 2617 1086 >5000 >5000

12.4 banded approximation

Table 12.7 shows the number of iterations for preconditioners where the Toeplitz structure has been replaced with a banded approximation. None of these preconditioners has any additional approximation on the 3 × 3 symmetric level.

While all banded approximations are fairly good for a majority of the Group 1 test cases, they are mostly far worse in the other cases. Overall they are worse or far worse than the circulant one.

A trend is visible that a larger bandwidth results in fewer iterations. However, even a heptadiagonal preconditioner is still worse than the circulant preconditioner. Therefore, it seems that banded preconditioners are only an interesting choice if a much larger bandwidth can be used.

Table 12.7: Number of iterations for preconditioners based on banded approximations.

Group   Test Case   No Prec.   Exact C   Circulant   Diagonal   Tridiagonal   Pentadiagonal   Heptadiagonal

Group 1 Test Case 1a 1049 210 682 759 663 423 430 Group 1 Test Case 1b >5000 254 3905 3450 3206 1816 1889 Group 1 Test Case 1c 2488 1178 2211 1565 1560 1320 1407 Group 1 Test Case 2a 383 220 380 276 271 226 224 Group 1 Test Case 2b 71 54 62 57 56 53 54 Group 1 Test Case 3a 373 176 338 238 240 210 211 Group 1 Test Case 3b 69 47 60 56 56 50 49 Group 1 Test Case 4a 322 184 290 221 220 195 199 Group 1 Test Case 4b 74 39 46 45 42 38 39 Group 1 Test Case 5a 380 224 327 252 248 237 230 Group 1 Test Case 5b 77 53 58 55 55 52 52 Group 1 Test Case 6a 584 267 450 361 361 516 507 Group 1 Test Case 6b 113 51 60 63 62 93 91 Group 1 Test Case 7a 402 195 231 331 3125 4927 2881 Group 1 Test Case 7b 172 74 93 133 1444 1252 205 Group 2 Test Case 8a 429 122 197 361 2235 3547 494 Group 2 Test Case 8b 166 56 100 158 1134 629 331 Group 2 Test Case 9a 384 119 147 297 651 122 219 Group 2 Test Case 9b 147 48 68 121 251 68 188 Group 2 Test Case 10a 92 41 63 83 63 83 124 Group 2 Test Case 10b 50 24 33 49 37 33 37 Group 3 Test Case 11a 504 76 96 179 103 90 86 Group 3 Test Case 11b 2991 100 107 274 348 166 123 Group 3 Test Case 11c >5000 803 856 >5000 >5000 >5000 2780 Group 3 Test Case 12a 549 79 227 434 598 160 142 Group 3 Test Case 12b >5000 770 >5000 >5000 >5000 >5000 >5000 Group 3 Test Case 13a >5000 191 1518 >5000 1250 1223 1406 Group 3 Test Case 13b >5000 1700 >5000 >5000 >5000 >5000 >5000 Group 3 Test Case 14a 129 63 70 121 111 112 105 Group 3 Test Case 14b 790 108 418 762 637 743 652

Sum 37793 7526 23095 30705 34030 33387 25157 Average 1260 251 770 1024 1134 1113 839

Median 384 114 212 263 355 218 215

Part IV

CONCLUSION

In this last part, several suggestions are made for future investigations of the work discussed in this thesis. It concludes by giving a summary of the main results and some conclusions.

FUTURE WORK 13

In this chapter, several topics are listed that are suggested starting points for future work continuing the research.

A main aspect is the implementation of optimized versions of the preconditioners described in this work. Realizing this will make it possible to replace the benchmarks in Chapter 12 with time measurements instead of only counting iterations.

Besides this general suggestion, there are a few preconditioner-specific suggestions for future work.

13.1 inverse generating function

13.1.1 Regularization

In Section 9.3 a regularization has been suggested for cases where the regular IGF method will not return a successful preconditioner. In those cases, we propose using the IGF not on the original generating function f, but on an elevated function f + α.

In Section 9.3, it was also pointed out that the optimal value of α varies between different test cases. So far, no method could be found to estimate this optimal α besides trial-and-error. It could be an interesting topic of future work to find such methods.

Besides this regularization, where the whole function is shifted, a different regularization can be tested that only changes the function in the problematic region. For example, instead of using f or f + α, the function fβ = max(f, β) could be used. In this case, the function will only be changed in regions where it is close to zero. Analogously to a regularization of the whole function, the optimal value of β is the result of a trade-off.
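A small sketch contrasting the two regularizations on sampled values of the generating function is given below; f_vals, alpha and beta are placeholders for the sampled symbol and the trade-off parameters.

import numpy as np

def regularize_shift(f_vals, alpha):
    """Global shift f + alpha, as suggested in Section 9.3."""
    return f_vals + alpha

def regularize_floor(f_vals, beta):
    """Local floor max(f, beta): only values close to zero are modified."""
    return np.maximum(f_vals, beta)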

13.1.2 Other Kernels

As mentioned in Section 9.1.1, different kernels could be used to approximate the original generating function, if it is not available.


In future work, different kernels besides the Dirichlet kernel that was used in this work could be tested. Chan and Yeung [13] offer some kernels satisfying the requirements of the IGF method that could be explored.

13.2 kronecker product approximation

13.2.1 Using a Common Basis

In this work, several options for applying the Kronecker product approximation to BTTB-block matrices have been suggested in Section 10.2.2. One of them is to use a common basis for all BTTB matrices. Theoretically, the basis of any of the BTTB matrices could be used, but it is expected that this will produce a suboptimal result. Different choices for a common basis could be analyzed in future research.

13.3 preconditioner selection

The last part compared the performance of the preconditioners in different test cases and showed that the optimal preconditioner depends on the test scenario. So naturally the question arises if and how we can find the optimal preconditioner for a given test case.

For example, an automatic process could be tested that automatically chooses a preconditioner for a given test system. Ideally, this process should pick the optimal preconditioner for this system, but a more realistic goal would be to make sure the process chooses a preconditioner that performs only slightly worse than the optimum.

In order to choose a (quasi) optimal preconditioner for a given system, the time needed for solving the system with each preconditioner needs to be estimated. This could be done using a regression model for each preconditioner that was trained with test data in a precomputation phase and could potentially be updated throughout the use of the selection algorithm.

To make sure the regression produces realistic time predictions, we need good input values x for the regression model that can adequately predict the performance of a preconditioner in a given scenario. This is a very important step, which is crucial to the overall performance of the selection algorithm. These predictors could be, for example:

• Diagonal dominance: The ratio of the diagonal elements and the sum of the off-diagonal elements. This could, for example, predict if a banded approximation is a good choice. The

diagonal dominance could be approximated on just a few rows and columns.

• Norms: The norms of I − P^{-1}C or I − P^{-1}A could be computed or approximated. However, this requires the actual computation of the preconditioner and its inverse. If the 2-norm is used here, it is necessary to also compute the 2-norm of the inverse, to get a good prediction.

• Separability: As mentioned before, the Kronecker product approximation works really well for separable geometries.

The general idea of the automatic selection process is that the selection algorithm chooses the preconditioner based on their expected time. However, the regression model is just an approximation of the actual time needed and it is highly unlikely that the preconditioner with the smallest expected time is always the optimal one.

Instead a smaller assumption is made, namely that the probability that a preconditioner k is the optimal one is proportional to the estimated time, computed by the regression model:

P(P = k | x) ∝ T^k_expected(x) ,   (13.1)

where x is the input vector of the regression, i.e. the predictors,

T^k_expected(x) is the expected time of preconditioner k given the predictors, estimated by the regression model,

P(P = k | x) is the probability that the selection algorithm will choose k, given the predictors x. It approximates the probability that k is the optimal preconditioner, given the predictors x.

Furthermore, it can be a good choice to try out a preconditioner where we don't have a lot of test data yet that is similar to the data we are currently trying to solve. In other words, the probability that we choose preconditioner k could also depend on the 'uncertainty' we have concerning test cases around x:

P(P = k | x) ∝ U^k_uncertainty(x) ,   (13.2)

where P(P = k | x) is the probability that the selection algorithm will choose k, given the predictors x,

U^k_uncertainty(x) is a measure for the uncertainty of the expected time of preconditioner k around x. This could be done, for example, by measuring the distance to the nearest set of predictors for which the computational time is known.

(13.2) describes a scaling of the probabilities depending on the amount of knowledge around x. This means that preconditioners with less test data around a given x get chosen more often relative to preconditioners with more test data around the same x. In other words, (13.2) describes the fact that the selection algorithm tends to try out things.

While (13.1) describes a preference of the algorithm for faster preconditioners, (13.2) at the same time describes a preference for 'trying out' uncertain preconditioners. In the end, however, we want the balance between those two aspects to change over time. The more knowledge we have already gathered (in total), the more we want the algorithm to choose the one with the smallest expected time. At the same time, it makes more sense to try out preconditioners at the beginning, when the gathering of more training data could improve the performance in the long run.

If we combine this fact with (13.1) and (13.2), a viable choice for the probability to choose a certain preconditioner is

P(P = k | x) = N ( T^k_expected(x) )^t · ( U^k_uncertainty(x) )^{1/t} ,

where N is a suitable normalization (so that Σ_k P(P = k | x) = 1) and t is a measure of the already accumulated training set, or the 'time' already invested. This way, we make sure that as we process more data, we value expected time more.
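A sketch of how this selection rule could be evaluated is given below; t_expected, u and t are placeholders for the regression output, the uncertainty measure and the accumulated training 'time', and the code simply normalizes the combined weights of the formula above.

import numpy as np

def selection_probabilities(t_expected, u, t):
    """Normalized selection probabilities combining (13.1) and (13.2):
    one weight per candidate preconditioner k."""
    w = (np.asarray(t_expected, dtype=float) ** t) * (np.asarray(u, dtype=float) ** (1.0 / t))
    return w / w.sum()        # the normalization N makes the probabilities sum to 1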

However, this is just one of many possible algorithms for choosing a preconditioner and a lot more literature research is needed. The method described here is just a possible suggestion to spark future work.

CONCLUSION 14

In this work, several preconditioning methods for BTTB and BTTB-block systems were presented with the goal of reducing the time needed for solving a linear system using iterative solvers.

The performance of many of the suggested preconditioners has been analyzed in various real-world test cases. From the obtained results of the transformation-based methods it can be concluded that using different transformations besides the discrete Fourier transform (DFT) does not seem promising. The performance of other transformations was on average slightly worse than the performance of the circulant preconditioner, which is a result of the DFT.

In contrast to that, the Kronecker product approximation seems to be very promising. The performance of this preconditioner family is significantly better than the performance of the circulant preconditioner and it should be considered the new default preconditioner in those test cases.

While the inverse generating function (IGF) has been shown to perform far worse in the benchmarks, we were still able, in the context of this work, to extend the method to Toeplitz-block and BTTB-block matrices. Additionally, theoretical results could be obtained that prove that the IGF works in these cases if certain assumptions are met. However, these assumptions are typically not met in the test cases, which could explain its bad performance. Regularization could potentially help in those cases, but has to be further studied.

Banded approximations, especially the diagonal approximation, have been shown to work well in about half of the test cases. However, the diagonal approximation, as well as the other banded preconditioners, performs worse in the other half of the test set.

Finally, we want to mention that while the focus was on the application of metrology for integrated circuits (ICs), many of the proposed methods and generalizations are also viable choices in other applications.


Part V

APPENDIX

INVERSION FORMULAS FOR KRONECKER PRODUCT APPROXIMATION A

a.1 one term approximation

a.1.1 Sum Approximation

This section describes the elements of T_(block)^{-1}, if for a one term approximation the sum approximation described in Section 10.2.1.1 is used.

The last step for each element of T_(block)^{-1} is necessary if the MVP will be computed following the strategy of (10.6).


 −1  −1  −1 −1 −1 −1 T( ) = W 1 1 + W 1 1 T1,3 + W 1 2 T2,3) T3,3 (T3,1 W 1 1 block 1,1 , , , ,  + T W −1 3,2 2,1 = W −1 + W −1 T T −1 T W −1 1,1 1,1 1,3 3,3 3,1 1,1 + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 1,1 1,3 3,3 3,2 2,1 1,2 2,3 3,3 3,1 1,1 + W −1 T T −1 T W −1 1,2 2,3 3,3 3,2 2,1 −1 −1 −1 −1 −1 −1 −1 −1 = (A1,1 ) ⊗ (B1,1 ) + (A1,1 A1,2 A2,2 A2,1 A1,1 ) ⊗ (B1,1 B1,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 −1 −1 + (A1,1 A1,3 A3,3 A3,1 A1,1 ) ⊗ (B1,1 B1,3 B3,3 B3,1 B1,1 ) −1 −1 −1 −1 −1 + (A1,1 A1,3 A3,3 A3,1 A1,1 A1,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 −1 ⊗ (B1,1 B1,3 B3,3 B3,1 B1,1 B1,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 −1 + (A1,1 A1,2 A2,2 A2,1 A1,1 A1,3 A3,3 A3,1 A1,1 ) −1 −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,1 B1,1 B1,3 B3,3 B3,1 B1,1 ) −1 −1 −1 −1 −1 −1 −1 + (A1,1 A1,2 A2,2 A2,1 A1,1 A1,3 A3,3 A3,1 A1,1 A1,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,1 B1,1 B1,3 B3,3 B3,1 B1,1 B1,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 + (−A1,1 A1,3 A3,3 A3,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 ⊗ (B1,1 B1,3 B3,3 B3,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 −1 −1 + (−A1,1 A1,2 A2,2 A2,1 A1,1 A1,3 A3,3 A3,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,1 B1,1 B1,3 B3,3 B3,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 + (A1,1 A1,2 A2,2 A2,3 A3,3 A3,1 A1,1 ) −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,3 B3,3 B3,1 B1,1 ) −1 −1 −1 −1 −1 −1 + (A1,1 A1,2 A2,2 A2,3 A3,3 A3,1 A1,1 A1,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,3 B3,3 B3,1 B1,1 B1,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 −1 + (−A1,1 A1,2 A2,2 A2,3 A3,3 A3,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,3 B3,3 B3,2 B2,2 B2,1 B1,1 ) , A.1 one term approximation 117

 −1  −1  −1 −1 −1 −1 −1  T( ) = W 1 2 + W 1 1 T1,3 + W 1 2 T2,3) T3,3 (T3,1 W 1 2 + T3,2 W 2 2 block 1,2 , , , , , = W −1 + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 1,2 1,1 1,3 3,3 3,1 1,2 1,1 1,3 3,3 3,2 2,2 + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 1,2 2,3 3,3 3,1 1,2 1,2 2,3 3,3 3,2 2,2 −1 −1 −1 −1 = (A1,1 A1,2 A2,2 ) ⊗ (B1,1 B1,2 B2,2 ) −1 −1 −1 −1 −1 −1 −1 −1 + (A1,1 A1,3 A3,3 A3,1 A1,1 A1,2 A2,2 ) ⊗ (B1,1 B1,3 B3,3 B3,1 B1,1 B1,2 B2,2 ) −1 −1 −1 −1 −1 −1 + (A1,1 A1,2 A2,2 A2,1 A1,1 A1,3 A3,3 A3,1 A1,1 A1,2 A2,2 ) −1 −1 −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,1 B1,1 B1,3 B3,3 B3,1 B1,1 B1,2 B2,2 ) −1 −1 −1 −1 −1 −1 + (A1,1 A1,3 A3,3 A3,2 A2,2 ) ⊗ (B1,1 B1,3 B3,3 B3,2 B2,2 ) −1 −1 −1 −1 −1 + (A1,1 A1,2 A2,2 A2,1 A1,1 A1,3 A3,3 A3,2 A2,2 ) −1 −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,1 B1,1 B1,3 B3,3 B3,2 B2,2 ) −1 −1 −1 −1 −1 + (A1,1 A1,2 A2,2 A2,3 A3,3 A3,1 A1,1 A1,2 A2,2 ) −1 −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,3 B3,3 B3,1 B1,1 B1,2 B2,2 ) −1 −1 −1 −1 + (A1,1 A1,2 A2,2 A2,3 A3,3 A3,2 A2,2 ) −1 −1 −1 −1 ⊗ (B1,1 B1,2 B2,2 B2,3 B3,3 B3,2 B2,2 ) ,

(T_(block)^{-1})_{1,3} = ( W_{1,1}^{-1} T_{1,3} + W_{1,2}^{-1} T_{2,3} ) T_{3,3}^{-1}
= (A_{1,1}^{-1} A_{1,3} A_{3,3}^{-1}) ⊗ (B_{1,1}^{-1} B_{1,3} B_{3,3}^{-1})
+ (A_{1,1}^{-1} A_{1,2} A_{2,2}^{-1} A_{2,1} A_{1,1}^{-1} A_{1,3} A_{3,3}^{-1}) ⊗ (B_{1,1}^{-1} B_{1,2} B_{2,2}^{-1} B_{2,1} B_{1,1}^{-1} B_{1,3} B_{3,3}^{-1})
+ (A_{1,1}^{-1} A_{1,2} A_{2,2}^{-1} A_{2,3} A_{3,3}^{-1}) ⊗ (B_{1,1}^{-1} B_{1,2} B_{2,2}^{-1} B_{2,3} B_{3,3}^{-1}) ,

 −1  −1  −1 −1 −1 −1 −1  T( ) = W 2 1 + W 2 1 T1,3 + W 2 2 T2,3) T3,3 (T3,1 W 1 1 + T3,2 W 2 1 block 2,1 , , , , , = W −1 + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 2,1 2,1 1,3 3,3 3,1 1,1 2,1 1,3 3,3 3,2 2,1 + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 2,2 2,3 3,3 3,1 1,1 2,2 2,3 3,3 3,2 2,1 −1 −1 −1 −1 = (−A2,2 A2,1 A1,1 ) ⊗ (B2,2 B2,1 B1,1 ) −1 −1 −1 −1 −1 −1 −1 −1 + (A2,2 A2,1 A1,1 A1,3 A3,3 A3,1 A1,1 ) ⊗ (B2,2 B2,1 B1,1 B1,3 B3,3 B3,1 B1,1 ) −1 −1 −1 −1 −1 −1 + (−A2,2 A2,1 A1,1 A1,3 A3,3 A3,1 A1,1 A1,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 −1 −1 ⊗ (B2,2 B2,1 B1,1 B1,3 B3,3 B3,1 B1,1 B1,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 −1 + (A2,2 A2,1 A1,1 A1,3 A3,3 A3,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 −1 ⊗ (B2,2 B2,1 B1,1 B1,3 B3,3 B3,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 −1 −1 + (A2,2 A2,3 A3,3 A3,1 A1,1 ) ⊗ (B2,2 B2,3 B3,3 B3,1 B1,1 ) −1 −1 −1 −1 −1 + (A2,2 A2,3 A3,3 A3,1 A1,1 A1,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 −1 ⊗ (B2,2 B2,3 B3,3 B3,1 B1,1 B1,2 B2,2 B2,1 B1,1 ) −1 −1 −1 −1 + (−A2,2 A2,3 A3,3 A3,2 A2,2 A2,1 A1,1 ) −1 −1 −1 −1 ⊗ (B2,2 B2,3 B3,3 B3,2 B2,2 B2,1 B1,1 ) , 118 inversion formulas for kronecker product approximation

 −1  −1  −1 −1 −1 −1 T( ) = W 2 2 + W 2 1 T1,3 + W 2 2 T2,3) T3,3 (T3,1 W 1 2 block 2,2 , , , ,  +T W −1 3,2 2,2 = W −1 + W −1 T T −1 T W −1 2,2 2,1 1,3 3,3 3,1 1,2 + W −1 T T −1 T W −1 2,1 1,3 3,3 3,2 2,2 + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 2,2 2,3 3,3 3,1 1,2 2,2 2,3 3,3 3,2 2,2 −1 −1 −1 −1 −1 −1 −1 = A2,2 ⊗ B2,2 + (−A2,2 A2,1 A1,1 A1,3 A3,3 A3,1 A1,1 A1,2 A2,2 ) −1 −1 −1 −1 −1 ⊗ (B2,2 B2,1 B1,1 B1,3 B3,3 B3,1 B1,1 B1,2 B2,2 ) −1 −1 −1 −1 + (−A2,2 A2,1 A1,1 A1,3 A3,3 A3,2 A2,2 ) −1 −1 −1 −1 ⊗ (B2,2 B2,1 B1,1 B1,3 B3,3 B3,2 B2,2 ) −1 −1 −1 −1 + (A2,2 A2,3 A3,3 A3,1 A1,1 A1,2 A2,2 ) −1 −1 −1 −1 ⊗ (B2,2 B2,3 B3,3 B3,1 B1,1 B1,2 B2,2 ) −1 −1 −1 −1 −1 −1 + (A2,2 A2,3 A3,3 A3,2 A2,2 ) ⊗ (B2,2 B2,3 B3,3 B3,2 B2,2 ) ,

(T_(block)^{-1})_{2,3} = ( W_{2,1}^{-1} T_{1,3} + W_{1,2}^{-1} T_{2,3} ) T_{3,3}^{-1}
= (−A_{2,2}^{-1} A_{2,1} A_{1,1}^{-1} A_{1,3} A_{3,3}^{-1}) ⊗ (B_{2,2}^{-1} B_{2,1} B_{1,1}^{-1} B_{1,3} B_{3,3}^{-1})
+ (A_{2,2}^{-1} A_{2,3} A_{3,3}^{-1}) ⊗ (B_{2,2}^{-1} B_{2,3} B_{3,3}^{-1}) ,

(T_(block)^{-1})_{3,1} = − T_{3,3}^{-1} ( T_{3,1} W_{1,1}^{-1} + T_{3,2} W_{2,1}^{-1} )
= (−A_{3,3}^{-1} A_{3,1} A_{1,1}^{-1}) ⊗ (B_{3,3}^{-1} B_{3,1} B_{1,1}^{-1})
+ (−A_{3,3}^{-1} A_{3,1} A_{1,1}^{-1} A_{1,2} A_{2,2}^{-1} A_{2,1} A_{1,1}^{-1}) ⊗ (B_{3,3}^{-1} B_{3,1} B_{1,1}^{-1} B_{1,2} B_{2,2}^{-1} B_{2,1} B_{1,1}^{-1})
+ (A_{3,3}^{-1} A_{3,2} A_{2,2}^{-1} A_{2,1} A_{1,1}^{-1}) ⊗ (B_{3,3}^{-1} B_{3,2} B_{2,2}^{-1} B_{2,1} B_{1,1}^{-1}) ,

(T_(block)^{-1})_{3,2} = − T_{3,3}^{-1} ( T_{3,1} W_{1,2}^{-1} + T_{3,2} W_{2,2}^{-1} )
= (−A_{3,3}^{-1} A_{3,1} A_{1,1}^{-1} A_{1,2} A_{2,2}^{-1}) ⊗ (B_{3,3}^{-1} B_{3,1} B_{1,1}^{-1} B_{1,2} B_{2,2}^{-1})
+ (−A_{3,3}^{-1} A_{3,2} A_{2,2}^{-1}) ⊗ (B_{3,3}^{-1} B_{3,2} B_{2,2}^{-1}) ,

(T_(block)^{-1})_{3,3} = A_{3,3}^{-1} ⊗ B_{3,3}^{-1} .

a.2 multiple terms approximation

a.2.1 Sum Approximation

This section describes the elements of T_(block)^{-1}, if the sum approximation described in Section 10.2.2.1 is used, where each BTTB matrix has been approximated via an approximate SVD.

 −1  −1 −1 −1 −1 T( ) = W 1 2 + W 1 1 T1,3 T3,3 T3,1 W 1 2 block 1,1 , , , + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 1,1 1,3 3,3 3,2 2,2 1,2 2,3 3,3 3,1 1,2 + W −1 T T −1 T W −1 1,2 2,3 3,3 3,2 2,2 −1 H −1 H H −1 H H −1 H = V11S11 U11 + V11S11 U11U12S12V12V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H −1 H + V11S11 U11U13S13V13V33S33 U33U31S31V31V11S11 U11 −1 H H −1 H H −1 H + V11S11 U11U13S13V13V33S33 U33U31S31V31V11S11 U11 H −1 H H −1 H U12S12V12V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H −1 H + V11S11 U11U12S12V12V22S22 U22U21S21V21V11S11 U11 H −1 H H −1 H U13S13V13V33S33 U33U31S31V31V11S11 U11 −1 H H −1 H H −1 H H + V11S11 U11U12S12V12V22S22 U22U21S21V21V11S11 U11U13S13V13 −1 H H −1 H V33S33 U33U31S31V31V11S11 U11 H −1 H H −1 H U12S12V12V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H − V11S11 U11U13S13V13V33S33 U33U32S32V32 −1 H H −1 H V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H −1 H − V11S11 U11U12S12V12V22S22 U22U21S21V21V11S11 U11 H −1 H H −1 H H −1 H U13S13V13V33S33 U33U32S32V32V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H + V11S11 U11U12S12V12V22S22 U22U23S23V23 −1 H H −1 H V33S33 U33U31S31V31V11S11 U11 −1 H H −1 H H −1 H H + V11S11 U11U12S12V12V22S22 U22U23S23V23V33S33 U33U31S31V31 −1 H H −1 H H −1 H V11S11 U11U12S12V12V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H − V11S11 U11U12S12V12V22S22 U22U23S23V23 −1 H H −1 H H −1 H V33S33 U33U32S32V32V22S22 U22U21S21V21V11S11 U11 , 120 inversion formulas for kronecker product approximation

 −1  −1 −1 −1 −1 T( ) = W 1 2 + W 1 1 T1,3 T3,3 T3,1 W 1 2 block 1,2 , , , + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 1,1 1,3 3,3 3,2 2,2 1,2 2,3 3,3 3,1 1,2 + W −1 T T −1 T W −1 1,2 2,3 3,3 3,2 2,2 −1 H H −1 H = V11S11 U11U12S12V12V22S22 U22 −1 H H −1 H H + V11S11 U11U13S13V13V33S33 U33U31S31V31 −1 H H −1 H H −1 H H + V11S11 U11U12S12V12V22S22 U22U21S21V21V11S11 U11U13S13V13 −1 H H −1 H H −1 H V33S33 U33U31S31V31V11S11 U11U12S12V12V22S22 U22 −1 H H −1 H H −1 H + V11S11 U11U13S13V13V33S33 U33U32S32V32V22S22 U22 −1 H H −1 H H −1 H H + V11S11 U11U12S12V12V22S22 U22U21S21V21V11S11 U11U13S13V13 −1 H H −1 H V33S33 U33U32S32V32V22S22 U22 −1 H H −1 H H −1 H + V11S11 U11U12S12V12V22S22 U22U23S23V23V33S33 U33 H −1 H H −1 H U31S31V31V11S11 U11U12S12V12V22S22 U22 −1 H H −1 H H + V11S11 U11U12S12V12V22S22 U22U23S23V23 −1 H H −1 H V33S33 U33U32S32V32V22S22 U22 ,

(T_(block)^{-1})_{1,3} = ( W_{1,1}^{-1} T_{1,3} + W_{1,2}^{-1} T_{2,3} ) T_{3,3}^{-1}
= V_{11} S_{11}^{-1} U_{11}^H U_{13} S_{13} V_{13}^H V_{33} S_{33}^{-1} U_{33}^H
+ V_{11} S_{11}^{-1} U_{11}^H U_{12} S_{12} V_{12}^H V_{22} S_{22}^{-1} U_{22}^H U_{21} S_{21} V_{21}^H V_{11} S_{11}^{-1} U_{11}^H U_{13} S_{13} V_{13}^H V_{33} S_{33}^{-1} U_{33}^H
+ V_{22} S_{22}^{-1} U_{22}^H U_{23} S_{23} V_{23}^H V_{33} S_{33}^{-1} U_{33}^H ,

 −1  −1 −1 −1 −1 T( ) = W 2 1 + W 2 1 T1,3 T3,3 T3,1 W 1 1 block 2,1 , , , + W −1 T T −1 T W −1 2,1 1,3 3,3 3,2 2,1 + W −1 T T −1 T W −1 2,2 2,3 3,3 3,1 1,1 + W −1 T T −1 T W −1 2,2 2,3 3,3 3,2 2,1 −1 H H −1 H = − V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H − V22S22 U22U21S21V21V11S11 U11U13S13V13 −1 H H −1 H V33S33 U33U31S31V31V11S11 U11 −1 H H −1 H H −1 H H − V22S22 U22U21S21V21V11S11 U11U13S13V13V33S33 U33U31S31V31 −1 H H −1 H H −1 H V11S11 U11U12S12V12V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H −1 H + V22S22 U22U21S21V21V11S11 U11U13S13V13V33S33 U33 H −1 H H −1 H U31S31V31V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H −1 H + V22S22 U22U23S23V23V33S33 U33U31S31V31V11S11 U11 −1 H H −1 H H −1 H + V22S22 U22U23S23V23V33S33 U33U31S31V31V11S11 U11 H −1 H H −1 H U12S12V12V22S22 U22U21S21V21V11S11 U11 −1 H H −1 H H − V22S22 U22U23S23V23V33S33 U33U32S32V32 −1 H H −1 H V22S22 U22U21S21V21V11S11 U11 ,

 −1  −1 −1 −1 −1 T( ) = W 2 2 + W 2 1 T1,3 T3,3 T3,1 W 1 2 block 2,2 , , , + W −1 T T −1 T W −1 2,1 1,3 3,3 3,2 2,2 + W −1 T T −1 T W −1 + W −1 T T −1 T W −1 2,2 2,3 3,3 3,1 1,2 2,2 2,3 3,3 3,2 2,2 −1 H −1 H H −1 H H −1 H = V22S22 U22 − V22S22 U22U21S21V21V11S11 U11U13S13V13V33S33 U33 H −1 H H −1 H U31S31V31V11S11 U11U12S12V12V22S22 U22 −1 H H −1 H H − V22S22 U22U21S21V21V11S11 U11U13S13V13 −1 H H −1 H V33S33 U33U32S32V32V22S22 U22 −1 H H −1 H H + V22S22 U22U23S23V23V33S33 U33U31S31V31 −1 H H −1 H V11S11 U11U12S12V12V22S22 U22 −1 H H −1 H H −1 H + V22S22 U22U23S23V23V33S33 U33U32S32V32V22S22 U22 ,

(T_(block)^{-1})_{2,3} = ( W_{2,1}^{-1} T_{1,3} + W_{1,2}^{-1} T_{2,3} ) T_{3,3}^{-1}
= − V_{22} S_{22}^{-1} U_{22}^H U_{21} S_{21} V_{21}^H V_{11} S_{11}^{-1} U_{11}^H U_{13} S_{13} V_{13}^H V_{33} S_{33}^{-1} U_{33}^H
+ V_{22} S_{22}^{-1} U_{22}^H U_{23} S_{23} V_{23}^H V_{33} S_{33}^{-1} U_{33}^H ,

(T_(block)^{-1})_{3,1} = − T_{3,3}^{-1} ( T_{3,1} W_{1,1}^{-1} + T_{3,2} W_{2,1}^{-1} )
= − V_{33} S_{33}^{-1} U_{33}^H U_{31} S_{31} V_{31}^H V_{11} S_{11}^{-1} U_{11}^H
− V_{33} S_{33}^{-1} U_{33}^H U_{31} S_{31} V_{31}^H V_{11} S_{11}^{-1} U_{11}^H U_{12} S_{12} V_{12}^H V_{22} S_{22}^{-1} U_{22}^H U_{21} S_{21} V_{21}^H V_{11} S_{11}^{-1} U_{11}^H
+ V_{33} S_{33}^{-1} U_{33}^H U_{32} S_{32} V_{32}^H V_{22} S_{22}^{-1} U_{22}^H U_{21} S_{21} V_{21}^H V_{11} S_{11}^{-1} U_{11}^H ,

(T_(block)^{-1})_{3,2} = − T_{3,3}^{-1} ( T_{3,1} W_{1,2}^{-1} + T_{3,2} W_{2,2}^{-1} )
= − V_{33} S_{33}^{-1} U_{33}^H U_{31} S_{31} V_{31}^H V_{11} S_{11}^{-1} U_{11}^H U_{12} S_{12} V_{12}^H V_{22} S_{22}^{-1} U_{22}^H
− V_{33} S_{33}^{-1} U_{33}^H U_{32} S_{32} V_{32}^H V_{22} S_{22}^{-1} U_{22}^H ,

\[
\bigl(T^{-1}\bigr)_{\mathrm{block}\,(3,3)} = \bigl( U_{3,3} S_{3,3} V_{3,3}^{H} \bigr)^{-1}.
\]
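Each term in the expanded expressions is applied using only the factors $U_{i,j}$, $S_{i,j}$ and $V_{i,j}$. As a minimal numerical sketch (not taken from the thesis implementation; it uses small random stand-in matrices instead of the BTTB blocks of the application), the following Python snippet assembles one term of the block-$(3,1)$ expansion, $T_{3,3}^{-1} T_{3,1} T_{1,1}^{-1}$, directly from the factors $V_{3,3} S_{3,3}^{-1} U_{3,3}^{H}\, U_{3,1} S_{3,1} V_{3,1}^{H}\, V_{1,1} S_{1,1}^{-1} U_{1,1}^{H}$ and checks it against a direct computation with the blocks themselves.

```python
# Minimal sketch: small random stand-in blocks, not the BTTB blocks of the
# application.  It only illustrates how a term of the inverse formulas is
# assembled from the factors T_{i,j} = U_{i,j} S_{i,j} V_{i,j}^H, with every
# inverse factor applied as V_{i,j} S_{i,j}^{-1} U_{i,j}^H.
import numpy as np

rng = np.random.default_rng(0)
n = 8

def svd_factors(A):
    """Return U, S (diagonal matrix), V with A = U @ S @ V.conj().T."""
    U, s, Vh = np.linalg.svd(A)
    return U, np.diag(s), Vh.conj().T

# Stand-in blocks; T11 and T33 are shifted to be safely invertible.
T11 = rng.standard_normal((n, n)) + n * np.eye(n)
T31 = rng.standard_normal((n, n))
T33 = rng.standard_normal((n, n)) + n * np.eye(n)

U11, S11, V11 = svd_factors(T11)
U31, S31, V31 = svd_factors(T31)
U33, S33, V33 = svd_factors(T33)

# One term of the block-(3,1) expansion, T_{3,3}^{-1} T_{3,1} T_{1,1}^{-1},
# built purely from the factors.
term_from_factors = (V33 @ np.linalg.inv(S33) @ U33.conj().T
                     @ U31 @ S31 @ V31.conj().T
                     @ V11 @ np.linalg.inv(S11) @ U11.conj().T)

# Reference computed directly from the blocks.
term_direct = np.linalg.solve(T33, T31) @ np.linalg.inv(T11)

print(np.allclose(term_from_factors, term_direct))  # True
```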
BIBLIOGRAPHY
[1] E. Abbe, Beiträge zur Theorie des Mikroskops und der mikroskopischen Wahrnehmung, Arch. Mikrosk. Anat., 9, (1873), pp. 413–468, German, Universitätsbibliothek Johann Christian Senckenberg.

[2] A. Amritkar, E. de Sturler, K. Świrydowicz, D. Tafti, and K. Ahuja, Recycling Krylov subspaces for CFD applications and a new hybrid recycling solver, J. Comput. Phys., 303, (2015), pp. 222–237.

[3] R. Barrett, M. W. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, and H. Van der Vorst, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, SIAM, 1994.

[4] R. H. Bartels and G. Stewart, Solution of the matrix equation AX + XB = C [F4], Comm. ACM, 15(9), (1972), pp. 820–826.

[5] D. S. Bernstein, Matrix Mathematics: Theory, Facts, and Formulas, Princeton University Press, 2009.

[6] M. van Beurden, Fast convergence with spectral volume integral equation for crossed block-shaped gratings with improved material interface conditions, J. Opt. Soc. Am. A, Opt. Image Sci. Vis., 28(11), (2011), pp. 2269–2278.

[7] A. M. Bruckner, J. B. Bruckner, and B. S. Thomson, Real Analysis, ClassicalRealAnalysis.com, 2nd ed., 2008.

[8] R. L. Burden and J. D. Faires, Numerical Analysis, Brooks/Cole Cengage Learning, 9th ed., 2011.

[9] R. H. Chan and X.-Q. Jin, A family of block preconditioners for block systems, SIAM J. Sci. Stat. Comput., 13(5), (1992), pp. 1218–1235.

[10] R. H. Chan and X.-Q. Jin, An Introduction to Iterative Toeplitz Solvers, Fundamentals of Algorithms, SIAM, 2007.

[11] R. H. Chan and K.-P. Ng, Toeplitz preconditioners for Hermitian Toeplitz systems, Linear Algebra Appl., 190, (1993), pp. 181–208.

[12] R. H. Chan and M. K. Ng, Conjugate gradient methods for Toeplitz systems, SIAM Rev., 38(3), (1996), pp. 427–482.

[13] R. H. Chan and M.-C. Yeung, Circulant preconditioners constructed from kernels, SIAM J. Numer. Anal., 29(4), (1992), pp. 1093–1103.


[14] T. F. Chan, An optimal circulant preconditioner for Toeplitz systems, SIAM J. Sci. Stat. Comput., 9(4), (1988), pp. 766–771.

[15] T. F. Chan and J. A. Olkin, Circulant preconditioners for Toeplitz-block matrices, Numer. Algorithms, 6(1), (1994), pp. 89–101.

[16] O. Christensen and K. L. Christensen, Approximation Theory: From Taylor Polynomials to Wavelets, Springer Science & Business Media, 2004.

[17] P. J. Davis, Circulant Matrices, American Mathematical Soc., 2012.

[18] R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on Physics, Vol. II: The New Millennium Edition: Mainly Electromagnetism and Matter, vol. II of The Feynman Lectures on Physics, Basic Books, the new millennium ed., 2011, online edition last accessed 23-March-2016. URL feynmanlectures.caltech.edu/

[19] M. B. van Gijzen and P. Sonneveld, Algorithm 913: An elegant IDR(s) variant that efficiently exploits biorthogonality properties, ACM Trans. Math. Software, 38(1), (2011), pp. 5:1–5:19.

[20] I. C. Gohberg and A. A. Semencul, The inversion of finite Toeplitz matrices and their continual analogues, Mat. Issled., 7(2), (1972), pp. 201–223.

[21] G. H. Golub and C. F. van Loan, Matrix Computations, vol. 3, JHU Press, 2012.

[22] R. M. Gray, Toeplitz and Circulant Matrices: A Review, Now Publishers Inc, 2006.

[23] P. C. Hansen, J. G. Nagy, and D. P. O’Leary, Deblurring Images: Matrices, Spectra, and Filtering, vol. 3 of Fundamentals of Algorithms, SIAM, 2006.

[24] L. Hemmingsson, Toeplitz preconditioners with block structure for first-order PDEs, Numer. Linear Algebr., 3(1), (1996), pp. 21–44.

[25] M. R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear systems, J. Res. Nat. Bur. Stand., 49(6), (1952), pp. 409–436.

[26] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, 1991.

[27] R. C. Jaeger, Introduction to Microelectronic Fabrication, vol. V of Modular Series on Solid State Devices, Pearson, 2002.

[28] J. Kamm and J. G. Nagy, Kronecker product and SVD approximations in image restoration, Linear Algebra Appl., 284(1), (1998), pp. 177–192.

[29] J. Kamm and J. G. Nagy, Optimal Kronecker product approximation of block Toeplitz matrices, SIAM J. Matrix Anal. Appl., 22(1), (2000), pp. 155–172.

[30] M. E. Kilmer and J. G. Nagy, Kronecker product approximations for dense block Toeplitz-plus-Hankel matrices, Numer. Linear Algebr., 14(8), (2007), pp. 581–602.

[31] A. Kirsch and F. Hettlich, The Mathematical Theory of Time-Harmonic Maxwell’s Equations, vol. 190 of Applied Mathematical Sciences, Springer International Publishing, 2015.

[32] S. Koyuncu, The Inverse of Two-level Toeplitz Operator Matrices, Ph.D. thesis, Drexel University, 2012.

[33] M. van Kraaij, Forward Diffraction Modelling: Analysis and Application to Grating Reconstruction, Ph.D. thesis, Technische Universiteit Eindhoven, 2011.

[34] F.-R. Lin and C.-X. Wang, BTTB preconditioners for BTTB systems, Numer. Algorithms, 60(1), (2012), pp. 153–167.

[35] F.-R. Lin and D.-C. Zhang, BTTB preconditioners for BTTB least squares problems, Linear Algebra Appl., 434(11), (2011), pp. 2285–2295.

[36] C. F. van Loan and N. Pitsianis, Approximation with Kronecker products, in Linear algebra for large scale and real-time applications, pp. 293–314, Springer, 1993.

[37] S. A. Martucci, Symmetric convolution and the discrete sine and cosine transforms, IEEE T. Signal Proces., 42(5), (1994), pp. 1038–1051.

[38] K. Meerbergen and B. Plestenjak, A Sylvester–Arnoldi type method for the generalized eigenvalue problem with two-by-two operator determinants, Numer. Linear Algebr., 22(6), (2015), pp. 1131–1146.

[39] M. Miranda and P. Tilli, Asymptotic spectra of Hermitian block Toeplitz matrices and preconditioning results, SIAM J. Matrix Anal. Appl., 21(3), (2000), pp. 867–881.

[40] G. E. Moore, Cramming more components onto integrated circuits, Electronics, 38(8), (1965), pp. 114–117. URL monolithic3d.com/uploads/6/0/5/5/6055488/gordon_moore_1965_article.pdf

[41] M. K. Ng, Iterative Methods for Toeplitz Systems, Numerical Mathematics and Scientific Computation, Oxford University Press, 2004.

[42] Nobelprize.org, The History of the Integrated Circuit, Website, 2003, last accessed 7-March-2016. URL http://www.nobelprize.org/educational/physics/integrated_circuit/history/

[43] J. Nocedal and S. J. Wright, Numerical Optimization, chap. Conjugate Gradient Methods, pp. 101–134, Springer Series in Operations Research and Financial Engineering, Springer New York, New York, NY, 2nd ed., 2006.

[44] V. Olshevsky, I. Oseledets, and E. Tyrtyshnikov, Tensor proper- ties of multilevel Toeplitz and related matrices, Linear Algebra Appl., 412(1), (2006), pp. 1–21.

[45] H.-K. Pang, Y.-Y. Zhang, S.-W. Vong, and X.-Q. Jin, Circulant preconditioners for pricing options, Linear Algebra Appl., 434(11), (2011), pp. 2325–2342.

[46] M. Pisarenco, Scattering from Finite Structures: An Extended Fourier Modal Method, Ph.D. thesis, Eindhoven University of Technology, 2011.

[47] C. M. Rader, Discrete Fourier transforms when the number of data samples is prime, Proc. IEEE, 56, (1968), pp. 1107–1108.

[48] Y. Saad, Numerical Methods for Large Eigenvalue Problems, SIAM, revised ed., 2011.

[49] Y. Saad and M. H. Schultz, GMRES: A generalized minimal resid- ual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7(3), (1986), pp. 856–869.

[50] F. Schneider and M. Pisarenco, Inverse generating function approach for Toeplitz-block matrices, (2017), in preparation.

[51] S. Serra, Asymptotic results on the spectra of block Toeplitz preconditioned matrices, SIAM J. Matrix Anal. Appl., 20(1), (1998), pp. 31–44.

[52] J. R. Shewchuk, An introduction to the conjugate gradient method without the agonizing pain, digital, 1994. URL cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf

[53] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, vol. 12 of Texts in Applied Mathematics, Springer-Verlag New York, 3rd ed., 2002.

[54] G. Strang, A proposal for Toeplitz matrix calculations, Stud. Appl. Math., 74(2), (1986), pp. 171–176.

[55] P. Tilli, A note on the spectral distribution of Toeplitz matrices, Linear and Multilinear Algebra, 45, (1998), pp. 147–159.

[56] L. N. Trefethen and David Bau, III, Numerical Linear Algebra, SIAM, Philadelphia, 1997.

[57] E. E. Tyrtyshnikov, Optimal and superoptimal circulant precondi- tioners, SIAM J. Matrix Anal. Appl., 13(2), (1992), pp. 459–473.

[58] H. A. van der Vorst, Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13(2), (1992), pp. 631–644.

[59] P. Wesseling and P. Sonneveld, Numerical experiments with a multiple grid and a preconditioned Lanczos type method, in Approximation methods for Navier-Stokes problems, pp. 543–562, Springer, 1980.

DECLARATION

I, Frank Schneider, declare that this thesis, titled “Approximation of Inverses of BTTB Matrices”, and the work presented in it are my own. I confirm that:

• This work was done wholly or mainly while in candidature for a research degree at the named Universities.

• Where any part of this thesis has previously been submitted for a degree or any other qualification at these Universities or any other institution, this has been clearly stated.

• Where I have consulted the published work of others, this is always clearly attributed.

• Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.

Eindhoven, December 2016

Frank Schneider