
Numerical Range of Square Matrices A Study in

Department of Mathematics, Linköping University

Erik Jonsson

LiTH-MAT-EX–2019/03–SE

Credits: 16 hp
Level: G2
Supervisor: Göran Bergqvist, Department of Mathematics, Linköping University
Examiner: Milagros Izquierdo, Department of Mathematics, Linköping University
Linköping: June 2019

Abstract

In this thesis, we discuss important results for the numerical range of general square matrices. In particular, we examine analytically the numerical range of complex-valued 2 × 2 matrices. We also investigate and discuss the Gershgorin region of general square matrices. Lastly, we examine numerically the numerical range and the Gershgorin region for different types of square matrices, both of which contain the spectrum of the matrix, and compare these regions using the calculation software Maple.

Keywords: numerical range, square matrix, spectrum, Gershgorin regions

URL for electronic version: urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157661

Erik Jonsson, 2019. iii

Sammanfattning

In this thesis, we discuss important results for the numerical range of general square matrices. In particular, we examine analytically the numerical range of complex-valued 2 × 2 matrices. We also investigate and discuss Gershgorin regions of general square matrices. Finally, we examine numerically the numerical range and Gershgorin regions for different types of matrices, both of which contain the spectrum of the matrix, and compare these regions using the calculation software Maple.

Keywords: numerical range, square matrix, spectrum, Gershgorin regions

URL for electronic version: urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157661


Acknowledgements

I’d like to thank my supervisor Göran Bergqvist, who has shown great interest in this thesis and inspired my writing of it. Göran also offered me this very exciting subject, which can be researched at different levels, and made it possible for my bachelor thesis. I’d also like to thank my examiner Milagros Izquierdo, who organized the oral presentation where I presented my work to my peer reviewers.

I’d also like to thank my childhood friend Love Mattsson, who has asked interesting questions about the subject and given my work a thumbs up.

Lastly, I’d like to thank my family, who have supported me, listened to my ideas while I explained the subject at a level they’d understand, and given me feedback on earlier versions of the thesis.


Nomenclature

A - Square matrix
Mn - Set of all square matrices of dimension n
A−1 - Inverse of A
AT - Transpose of A
A∗ - Hermitian transpose of A
Γ - Schur form of A
⊕ Ai - Direct sum of square matrices in Mn
λ - Eigenvalue of A
σ(A) - Spectrum of A
F(A) - Numerical range of A
‖A‖F - Frobenius norm of A
G(i)(A) - Gershgorin disc of A with index i
G(A) - Gershgorin region of A


Contents

List of Figures

1 Introduction

2 Preliminaries
2.1 Spectral theory
2.2 Fundamental concepts from Optimization and Multi-variable Calculus

3 Numerical Range
3.1 Numerical range of general square matrices
3.2 Numerical range of a 2 × 2 matrix
3.3 Topology behind the numerical range of a general matrix
3.4 General results of normal matrices

4 Gershgorin Discs

5 Results
5.1 Theory behind the illustrations
5.1.1 Drawing the illustrations
5.2 General matrices in M2
5.3 Normal matrices
5.4 Almost normal matrices
5.4.1 Interference of interior points
5.4.2 Interference of boundary points
5.5 Other matrices that are not normal
5.6 Gershgorin region versus Numerical range
5.6.1 Matrices in M2
5.6.2 Matrices in M3 and higher dimensions


6 Discussion
6.1 Numerical range
6.2 Gershgorin regions
6.3 Numerical range versus Gershgorin regions
6.4 Further topics to investigate

Bibliography

A Maple Commands (Code)
A.1 Numerical range
A.2 Gershgorin region

List of Figures

3.1 Numerical range of A in Example 3.5
3.2 Numerical range of A in Example 3.7

4.1 Gershgorin region of A in Example 4.2
4.2 The intersection of G(A) and G(AT) in Example 4.5

5.1 Numerical range of A1
5.2 Envelope for F(A1)
5.3 Numerical range of A2
5.4 Envelope for F(A2)
5.5 Numerical range of A3
5.6 Envelope for F(A3)
5.7 Numerical range of A4
5.8 Envelope for F(A4)
5.9 Numerical range of A5
5.10 Envelope for F(A5)
5.11 Numerical range of A6
5.12 Envelope for F(A6)
5.13 Numerical range of A7
5.14 Envelope for F(A7)
5.15 Numerical range of A8
5.16 Envelope for F(A8)
5.17 Envelope for F(Ã8) when r = 6
5.18 Envelope for F(Â8) when r = 3/8
5.19 Envelope for F(Â8), the point λ2 = 2 + 3i is zoomed
5.20 Envelope for F(Â8) when r = 5
5.21 Numerical range of A9
5.22 Envelope for F(A9)
5.23 Numerical range of Ã9


5.24 Envelope for F(Ã9)
5.25 Envelope for F(A10)
5.26 Gershgorin region for A10
5.27 Intersection of F(A10) and G(A10)
5.28 Envelope for F(A11)
5.29 Gershgorin region for A11
5.30 Intersection of F(A11) and G(A11)
5.31 Envelope for F(A12)
5.32 Gershgorin region for A12
5.33 Intersection of F(A12) and G(A12)
5.34 Envelope for F(A13)
5.35 Gershgorin region for A13
5.36 Intersection of F(A13) and G(A13)
5.37 Envelope for F(A14)
5.38 Gershgorin region for A14
5.39 Intersection of F(A14) and G(A14)
5.40 Envelope for F(A15)
5.41 Gershgorin region for A15
5.42 Intersection of F(A15) and G(A15)
5.43 Envelope for F(A16)
5.44 Gershgorin region for A16
5.45 Intersection of F(A16) and G(A16)
5.46 Envelope for F(A17)
5.47 Gershgorin region for A17
5.48 Intersection of F(A17) and G(A17)
5.49 Envelope for F(A18)
5.50 Gershgorin region for A18
5.51 Intersection of F(A18) and G(A18)

Chapter 1

Introduction

In technical applications such as dynamical systems, spectral theory is of great interest. Systemic properties, such as the stability of a mechanical system, are often analyzed by studying the eigenvalues of a linear system. Spectral theory is also useful in mathematical areas such as unconstrained optimization, where a matrix A ∈ Mn is analyzed to determine whether or not it is definite, depending on the form of the function that the matrix describes. Studying the eigenvalues of probabilistic matrices (called Markov matrices), which are positive, is also of great interest in network problems (where a network is a graph). In this thesis, we are interested in investigating how the set of complex numbers that forms the spectrum of A, σ(A), relates geometrically to its matrix, in a number of cases. As we present the results, comparing the numerical range and the Gershgorin region of a square matrix, both of which contain σ(A), is essential in order to conclude which of them to prefer.


Chapter 2

Preliminaries

Before we discuss the properties of the numerical range of arbitrary square matrices, let's refresh our memories regarding fundamental definitions in spectral theory for general square matrices, and regarding compact and convex sets. We also remind ourselves of the definitions of an envelope and of the Frobenius norm.

2.1 Spectral theory

Recall the following definition:

Definition 2.1 (Definition 8.1.1, Janfalk [4]) Let V be a vector space and consider the linear map T : V → V. A number λ ∈ C is called an eigenvalue of T if there exists a non-zero vector ū ∈ V s.t.

T(ū) = λū.

The vector ū above is called an eigenvector corresponding to the eigenvalue λ of T.

In this thesis, the interest lies in studying matrices A ∈ Mn, which map Cⁿ → Cⁿ. A vector ū ∈ Cⁿ with ū ≠ 0̄ is then an eigenvector of A, with eigenvalue λ ∈ C, if Aū = λū.

It's easy to determine the eigenvectors that correspond to the eigenvalues of A. To refresh our memories, let's do an example.


Example 2.2 Consider the following matrix:

    [ 4  9 ]
A = [ 9  4 ].

Calculations using the characteristic polynomial of A give us

λ1 = −5, λ2 = 13,

which correspond to the eigenvectors of A

ū1 = (−1, 1), ū2 = (1, 1),

respectively. It's easy to check that

Aū1 = λ1ū1, Aū2 = λ2ū2.

If there exists a basis of eigenvectors [ū1, ū2, ..., ūn] corresponding to the eigenvalues λ1, λ2, ..., λn of A, then A = TDT−1, where T has ū1, ū2, ..., ūn as columns. We say that A can be diagonalized with the diagonal matrix

D = diag(λ1, λ2, ..., λn).
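The diagonalization above is easy to check numerically. The thesis uses Maple for its computations, but the same check can be sketched in Python with NumPy (our own sanity check, not part of the thesis code), using the matrix from Example 2.2:

```python
import numpy as np

# Matrix from Example 2.2; its eigenvalues are -5 and 13.
A = np.array([[4.0, 9.0],
              [9.0, 4.0]])

# Columns of T are (normalized) eigenvectors; D holds the eigenvalues.
eigvals, T = np.linalg.eig(A)
D = np.diag(eigvals)

# A = T D T^{-1}: reconstruct A from its eigendecomposition.
A_reconstructed = T @ D @ np.linalg.inv(T)
```

The reconstruction recovers A up to floating-point rounding, confirming that A = TDT−1.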

However, there are many cases when A cannot be diagonalized. In such situations we can instead use the Jordan canonical form, whose Jordan blocks make A almost diagonal.

Recall the following definitions of Hermitian, normal, unitary and orthogonal matrices.

Definition 2.3 (Treil [9]) A∗ is defined as the transpose and complex conjugate of A.

Definition 2.4 (Treil [9]) A is Hermitian if A = A∗.

Definition 2.5 (Treil [9]) A is normal if AA∗ = A∗A.

Definition 2.6 (Treil [9]) A is unitary if A∗ = A−1.

Definition 2.7 (Treil [9]) A is orthogonal if A is real and AT = A−1.

These definitions are important for diagonalizing square matrices, depending on whether the matrices are normal or not.

Recall the following theorems on the Schur factorization of a matrix A and the diagonalization of normal matrices.

Theorem 2.8 Let A be a matrix (or a map A : X → X acting on a complex vector space). Then there exists an ON-basis {ūi} s.t. the matrix Γ of A becomes upper triangular. In other words, A can be represented as

A = UΓU∗,

where U is unitary.

Proof: See proof in [9, Chapter 6].

Theorem 2.9 (Spectral theorem for normal matrices) Let A be a normal square matrix, AA∗ = A∗A. Then A can be represented as

A = UDU∗,

where U is a unitary matrix and D a diagonal matrix. Also, if A = A∗ is self-adjoint, then D is real-valued. Furthermore, if A is real symmetric, then U is an orthogonal matrix.

Proof: See proof in [9, Chapter 6].

It's also important to mention the definition of the Frobenius norm of a matrix.

Definition 2.10 Let A be an m × n matrix. Then the Frobenius norm is the matrix norm defined as [12]

‖A‖F = √( Σ_{i=1}^m Σ_{j=1}^n |aij|² ).

The norm can also be written as follows:

‖A‖F = √(tr(A∗A)).
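The two formulas are easy to compare numerically. A small Python/NumPy sketch of our own (not thesis code), using the matrix that later appears in Example 3.7:

```python
import numpy as np

A = np.array([[3.0, 4.0],
              [1.0, 6.0]], dtype=complex)

# Entrywise formula: square root of the sum of squared moduli.
entrywise = np.sqrt(np.sum(np.abs(A) ** 2))

# Trace formula: sqrt(tr(A* A)); the trace of A* A is real and non-negative.
via_trace = np.sqrt(np.trace(A.conj().T @ A).real)
```

Both expressions give √62 for this matrix, agreeing up to rounding.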

We also formulate the definition of a principal submatrix of a square matrix below.

Definition 2.11 Let A ∈ Mn be a square matrix. A k × k principal submatrix of A has elements aij, where i, j ∈ I (the same I for rows and columns), for some index set I = {i1, ..., ik}, 1 ≤ i1 < i2 < ... < ik ≤ n.

To understand how to obtain the principal submatrices of A, let’s do an example.

Example 2.12 Consider the matrix

    [ 2  5  0 ]
A = [ 1  3  3 ]
    [ 0  1  4 ].

If I1 = {1} and I2 = {2, 3}, then the principal submatrices of A corresponding to I1 and I2 are

[ 2 ]   and   [ 3  3 ]
              [ 1  4 ],

respectively. Note that if I3 = {1, 2, 3}, then the principal submatrix of A is the matrix itself.
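Selecting the same index set for rows and columns is exactly what `np.ix_` does in NumPy, so the example can be verified in a few lines (`principal_submatrix` is a helper name of our own, not from the thesis; note that Python indices are 0-based):

```python
import numpy as np

A = np.array([[2, 5, 0],
              [1, 3, 3],
              [0, 1, 4]])

def principal_submatrix(A, index_set):
    """Restrict A to the same index set for rows and columns (0-based)."""
    idx = sorted(index_set)
    return A[np.ix_(idx, idx)]

P1 = principal_submatrix(A, {0})     # I1 = {1} in the 1-based notation above
P2 = principal_submatrix(A, {1, 2})  # I2 = {2, 3}
```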

Now we define the direct sum ⊕ of matrices.

Definition 2.13 Let Ai ∈ Mn, 1 ≤ i ≤ n. Then the direct sum of matrices constructs the block diagonal matrix from a set of square matrices [13]:

⊕_{i=1}^n Ai = diag(A1, A2, ..., An),

with the blocks A1, ..., An along the diagonal and zeros elsewhere.

We make use of this definition in the example below.

Example 2.14 Let

     [ 1  3 ]        [ 2  4 ]
A1 = [ 5  8 ],  A2 = [ 7  9 ].

Then the direct sum of A1 and A2 is the following block diagonal matrix:

                         [ 1  3  0  0 ]
A1 ⊕ A2 = diag(A1, A2) = [ 5  8  0  0 ]
                         [ 0  0  2  4 ]
                         [ 0  0  7  9 ].

2.2 Fundamental concepts from Optimization and Multi-variable Calculus

Before we define concepts like convexity and compactness for the numerical range of arbitrary n × n matrices, let's discuss what it actually means for a set in Rⁿ to be convex and compact.

Let’s begin with compactness.

Definition 2.15 A set in Rn is compact if it’s both closed and bounded. [1]

In Chapters 3 and 4, we'll notice that the numerical range and the Gershgorin region of a square matrix are compact subsets of C. As an example, observe that the set {(x, y) ∈ R² : 3x² + 4y² ≤ 19, 0 ≤ y ≤ x} is closed and bounded, hence compact.

Now, recall the definition of convexity for a set in Rn.

Definition 2.16 A set X ⊆ Rⁿ is a convex set if for any pair of points x̄1, x̄2 ∈ X and for 0 ≤ t ≤ 1, we have [5]

x̄ = tx̄1 + (1 − t)x̄2 ∈ X.

The definition says that for a set X to be convex, the line segment joining any two points of X must lie within the set. There are plenty of examples of sets that aren't convex.

Finally, recall the definition of an envelope.

Definition 2.17 The envelope of a one-parameter family of curves, defined by [11]

F(x, y, c) = 0,

where c is a parameter, is a curve that is tangent to every member of the family.

By the definition above, to find the envelope of such a one-parameter family of curves, it suffices to solve the system of equations

∂F/∂c = 0,
F(x, y, c) = 0.
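As a concrete illustration (our own example, not one from the thesis), take the family of lines F(x, y, c) = y − 2cx + c² = 0. Here ∂F/∂c = −2x + 2c = 0 gives c = x, and substituting back yields the envelope y = x². Numerically, the same envelope appears as the pointwise maximum over the family:

```python
import numpy as np

# Family of lines y = 2*c*x - c**2: the tangent lines of the parabola y = x**2.
xs = np.linspace(-2.0, 2.0, 81)
cs = np.linspace(-3.0, 3.0, 2001)

# y-value of every line at every x: rows index x, columns index c.
family = 2.0 * cs[None, :] * xs[:, None] - cs[None, :] ** 2

# The upper envelope at each x is the maximum over the family parameter c,
# attained at c = x, which matches the analytic envelope y = x**2.
upper_envelope = family.max(axis=1)
```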

Later in this thesis, we'll see that the boundary of the numerical range of a complex-valued n × n matrix is the envelope of a family of lines tangent to an oval region in the xy-plane (the region bounded by an ellipse if the matrix is in M2).

Chapter 3

Numerical Range

In this chapter, we describe and formulate the concept of numerical range for general matrices in Mn. We also define and prove convexity, compactness and some further properties of the numerical range of general square matrices. Having done this, we'll be ready to examine Gershgorin regions of general square matrices (see Chapter 4) and, later on, to perform experiments on several matrices (see Chapter 5). A general reference for this chapter is Shapiro [8].

3.1 Numerical range of general square matrices

Definition 3.1 The set of all complex numbers x̄∗Ax̄, where the norm ‖x̄‖ = 1, is called the numerical range of A, and is denoted F(A). In other words, [8]

F(A) = { x̄∗Ax̄ | x̄ ∈ Cⁿ, ‖x̄‖ = 1 }.
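The definition can be explored by brute force: draw random unit vectors x̄ and collect the values x̄∗Ax̄. The Python/NumPy sketch below (our own experiment, independent of the Maple code in the appendix) does this for a Hermitian matrix, for which every sampled value must be real and lie between the smallest and largest eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[5.0, 6.0],
              [6.0, 10.0]])   # Hermitian; its eigenvalues are 1 and 14

samples = []
for _ in range(2000):
    # Random complex vector, normalized onto the unit sphere in C^2.
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    x /= np.linalg.norm(x)
    samples.append(x.conj() @ A @ x)   # one point of F(A)
samples = np.array(samples)
```

The samples fill out (part of) F(A); for this Hermitian matrix they all land on the real segment [1, 14], anticipating a result proved later in the chapter.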

First, a theorem that states useful properties of the numerical range of general square matrices.

Theorem 3.2 Let A be an n × n complex-valued matrix. Then the following holds. [8]

(a) For any unitary matrix U, F(A) = F(U∗AU).
(b) F(A) contains all of the eigenvalues of A.
(c) F(A) contains all of the diagonal entries of A.
(d) If Ak is any principal submatrix of A, then F(Ak) ⊆ F(A).
(e) If P is an n × k matrix with orthonormal columns, then F(P∗AP) ⊆ F(A).


Some of these properties are perhaps easy to see if we compare with earlier results from fundamental spectral theory, but let's prove the statements.

Proof: We prove these properties in order. To prove (a), consider a unitary matrix U and the compact set B, the unit sphere in Cⁿ, and put ȳ = Ux̄. Since U is unitary, the map x̄ → Ux̄ is a bijection of B onto itself, and we have

F(A) = { ȳ∗Aȳ | ȳ ∈ B } = { ȳ∗Aȳ | ȳ = Ux̄, x̄ ∈ B } = { x̄∗U∗AUx̄ | x̄ ∈ B } = F(U∗AU).

To prove (b), let λ be an eigenvalue of A, and let x̄ be a corresponding normalized eigenvector. Then it follows that x̄∗Ax̄ = λx̄∗x̄ = λ. This means λ ∈ F(A) and σ(A) ⊂ F(A).

Property (c) follows from ēi∗Aēi = aii, where ēi is a vector in the standard basis of Cⁿ, 1 ≤ i ≤ n.

For (d), choose a permutation matrix Q s.t. Ak is formed from the first k rows and columns of Q∗AQ (Q is unitary). From (a), we have F(Q∗AQ) = F(A). Consider now z̄ ∈ Cᵏ with norm ‖z̄‖ = 1, and put ẑ = (z1, ..., zk, 0, ..., 0) ∈ Cⁿ. We then have ‖ẑ‖ = 1 and z̄∗Ak z̄ = ẑ∗Q∗AQẑ, so F(Ak) ⊆ F(Q∗AQ) = F(A).

For the last part, (e), we argue as in the previous part. Let z̄ ∈ Cᵏ with norm ‖z̄‖ = 1. Since P has orthonormal columns, Pz̄ ∈ Cⁿ has norm ‖Pz̄‖ = 1. Now, z̄∗P∗APz̄ = (Pz̄)∗A(Pz̄) ∈ F(A), and hence F(P∗AP) is a subset of F(A). □

To make sense of this theorem, let’s do an example.

Example 3.3 Consider the real symmetric matrix

    [ 5   6 ]
A = [ 6  10 ].

This matrix has the eigenvalues

λ1 = 1, λ2 = 14.

This means that the spectrum of A is σ(A) = {1, 14} (ordered from smallest to largest eigenvalue). Checking property (b) of the theorem just stated, we have σ(A) ⊂ F(A), where F(A) = [1, 14]. Checking property (c), we have a11 = 5 ∈ F(A) and a22 = 10 ∈ F(A). Also, since A is a real symmetric (hence normal) matrix, we can represent it as A = UDU∗, where D is a diagonal matrix and U is unitary. Calculating U and D gives us

            [ −3  2 ]       [ 1   0 ]
U = (1/√13) [  2  3 ],  D = [ 0  14 ].

This means, by (a),

F(A) = F(U∗AU) = F(D).

There is interesting behavior in special types of normal matrices. For instance, Hermitian matrices have as numerical range the closed and bounded interval of real numbers with the smallest eigenvalue λ1 and the largest eigenvalue λn as endpoints. Also, we saw in the proof of the theorem above that the numerical range of a matrix A equals the numerical range of its upper triangular Schur form Γ. In the normal case, Γ is the diagonal matrix D. We'll use this fact to prove an important theorem below.

Theorem 3.4 Let A ∈ Mn be normal, with eigenvalues λ1, ..., λn. Then F(A) is the convex hull of the points λ1, ..., λn. [8]

Before we prove this theorem, recall the following definition of the convex hull of a set of points X:

If X = {λ1, λ2, ..., λn}, then the convex hull of X is

Conv(X) = { Σ_{i=1}^n αiλi : Σ_{i=1}^n αi = 1, αi ≥ 0 }.

This is the smallest convex set that contains the set of points X.

Proof: First, consider the unitary matrix U s.t. U∗AU = D, where D is the diagonal matrix. By Theorem 3.2, F(A) = F(D). Now, we have

F(D) = { Σ_{i=1}^n |xi|² λi | Σ_{i=1}^n |xi|² = 1 },

which is the set of all convex combinations of λ1, ..., λn. □

To make use of the theorem just stated, we do an example.

Example 3.5 Consider the normal matrix

    [ 0  1  0  0 ]
A = [ 0  0  0  1 ]
    [ 0  0  1  0 ]
    [ 1  0  0  0 ].

Calculating the eigenvalues gives us the following diagonal matrix:

D = U∗AU = diag(1, 1, −1/2 + i√3/2, −1/2 − i√3/2).

Now, using the fact from the proof of the previous theorem that F(A) = F(D), illustrating this in Maple gives us

Figure 3.1: Numerical range of A in Example 3.5
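The figure can be cross-checked numerically: the matrix is normal, and its eigenvalues are 1 (twice) and the pair −1/2 ± i√3/2, so by Theorem 3.4, F(A) is the triangle with these points as vertices. A Python/NumPy sketch of the check (ours, not thesis code):

```python
import numpy as np

A = np.array([[0, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0],
              [1, 0, 0, 0]], dtype=complex)

# Normality: A A* = A* A (here A is a permutation matrix, hence unitary).
is_normal = np.allclose(A @ A.conj().T, A.conj().T @ A)

# Eigenvalues: 1 for the fixed index, plus the three cube roots of unity
# coming from the 3-cycle of the permutation.
eigvals = np.linalg.eigvals(A)
```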

3.2 Numerical range of a 2 × 2 matrix

The theorem below is perhaps one of the most important ones in this thesis, because its proof describes well how any 2 × 2 matrix has as numerical range the set of all complex numbers in the region bounded by an ellipse. With an understanding of this theorem, we can also interpret the numerical range of square matrices in higher dimensions, since those regions are likewise oval in shape.

Theorem 3.6 Let A be a 2 × 2 complex-valued matrix with eigenvalues λ1 and λ2. Also, let r = √(‖A‖²F − |λ1|² − |λ2|²), where ‖A‖F is the Frobenius norm of A. Then F(A) consists of all points in the region bounded by the ellipse with foci λ1 and λ2 and minor axis of length r. [8]

Proof: We show first that we only need to study matrices of the form

    [ t   r ]
C = [ 0  −t ],

where r and t are non-negative real numbers.

If A has eigenvalues λ1 and λ2, let A = UΓU∗ be the Schur factorization of A, so that

    [ λ1  b ]
Γ = [ 0  λ2 ]

and U is unitary. Then F(A) = F(Γ) and ‖A‖²F = ‖Γ‖²F = |λ1|² + |λ2|² + |b|².

Now, let λ = (λ1 − λ2)/2. Then we have

    [ λ   b ]   λ1 + λ2
Γ = [ 0  −λ ] +    2     · I,

where I is the identity matrix. With λ = te^{iθ} and b = re^{iφ} = re^{iθ}e^{i(φ−θ)} = re^{iθ}e^{iα}, we have (writing 2 × 2 matrices row by row, separated by semicolons)

Γ = e^{iθ} [ t  re^{iα} ; 0  −t ] + ((λ1 + λ2)/2) I = e^{iθ} V [ t  r ; 0  −t ] V∗ + ((λ1 + λ2)/2) I,

where V = [ 1  0 ; 0  e^{−iα} ] is unitary.

So F(A) = F(Γ) = F(B), since Γ = VBV∗, where B = e^{iθ}C + ((λ1 + λ2)/2) I with C as above. Hence

F(A) = e^{iθ}F(C) + (λ1 + λ2)/2,

so F(A) is obtained from F(C) by two operations of Euclidean geometry: a rotation and a translation. We can understand this further by looking at the set notation F(A) = { e^{iθ}z + (λ1 + λ2)/2 ∈ C | z ∈ F(C) }.

We now study the matrix C. For any normalized vector x̄, we have (e^{iθ}x̄)∗C(e^{iθ}x̄) = x̄∗Cx̄, which means that we only need to consider vectors whose first component x1 is real. Therefore, for the norm ‖x̄‖ = 1, we may assume x̄ = (cos(φ/2), e^{iθ} sin(φ/2)). Calculating the product x̄∗Cx̄ gives us

x̄∗Cx̄ = t cos²(φ/2) + re^{iθ} sin(φ/2) cos(φ/2) − t sin²(φ/2) = t cos φ + (r/2)e^{iθ} sin φ,

where the trigonometric identities cos²(φ/2) − sin²(φ/2) = cos φ and 2 sin(φ/2) cos(φ/2) = sin φ have been used. So, we see that F(C) is the set of all numbers of the form

t cos φ + (r/2)e^{iθ} sin φ,    (3.1)

with real parameters φ, θ ∈ [0, 2π]. For t = 0, we get the set of all numbers (r/2)e^{iθ} sin φ, which is the compact disk of radius r/2 centered at the origin: we get the boundary circle when sin φ = 1 and interior points when |sin φ| < 1. Therefore, consider the case t ≠ 0. Identifying the real and imaginary parts of (3.1), we have

X = t cos φ + (r/2) cos θ sin φ,
Y = (r/2) sin θ sin φ.

From this, we use spherical coordinates and a shear mapping of R³ to show that the set of all such points is the region bounded by an ellipse. The spherical coordinates are

x = ρ cos θ sin φ,
y = ρ sin θ sin φ,
z = ρ cos φ.

Now set ρ = r/2 and let θ ∈ [0, 2π], φ ∈ [0, π]. What we get is the surface of the sphere of radius ρ centered at the origin; denote this surface by B. From this, we have

cos φ = z/ρ = 2z/r.

Comparing with the equations above, we get

X = (2t/r) z + x,
Y = y.

Now we apply the shear map mentioned earlier:

T(x, y, z) = (x + (2t/r) z, y, z).

This linear transformation maps B onto the surface of an ellipsoid, denoted E. Geometrically, the vertical projection of E onto the xy-plane is the filled ellipse. The map T only shears in the x-direction, and the shear depends only on the z-coordinate. Because of this, the ellipsoid obtained from the sphere by a shear in the xz-plane is symmetric, with its axis being the z-axis. The projected ellipse has its axes along the x- and y-axes. Moreover, the vertices on the y-axis are the points (0, ±r/2). To find the vertices on the x-axis, let's find the extreme values of X. We have

t cos φ + (r/2) cos θ sin φ = (cos φ, sin φ) · (t, (r/2) cos θ).

Since the norm ‖(cos φ, sin φ)‖ = 1, the Cauchy–Schwarz inequality gives

t cos φ + (r/2) cos θ sin φ ≤ √(t² + (r²/4) cos² θ).

Therefore, we choose θ = 0 and φ such that tan φ = r/(2t). The vectors in the dot product above then become parallel, and equality holds:

t cos φ + (r/2) sin φ = √(t² + r²/4).

Finally, the vertices on the x-axis are

(±√(t² + r²/4), 0).

We conclude that the major axis of the ellipse lies on the x-axis, the foci are at (±t, 0), and the minor axis has length r.

This ellipse bounds F(C), and since F(A) = e^{iθ}F(C) + (λ1 + λ2)/2, F(A) is the elliptical region obtained from F(C) by the rotation e^{iθ} and the translation (λ1 + λ2)/2. □

We’ll do an example of how the theorem works.

Example 3.7 Consider the real-valued matrix

    [ 3  4 ]
A = [ 1  6 ].

This matrix has the eigenvalues λ1 = 2, λ2 = 7. Now, there exists an upper triangular matrix Γ such that A = UΓU∗, where U is unitary. Calculations using the Gram–Schmidt process for U give us

    [ 7  3 ]
Γ = [ 0  2 ].

From the proof of the previous theorem, we calculate the parameters r and t to get a sense of how the numerical range of A looks. Using the definition of the Frobenius norm of A in the formula for r gives r = 3. Also, from Γ, we see that θ = 0 and t = 5/2, and the centre of the ellipse is (λ1 + λ2)/2 = 9/2. Therefore, the centre is (9/2, 0), the semi-minor axis is 3/2, the semi-major axis is √((3/2)² + (5/2)²) = √34/2, and the vertices are at 9/2 ± √34/2. This means that the ellipse has λ1 and λ2 as foci (which are interior points of the ellipse).

Illustrating this in Maple, we get

Figure 3.2: Numerical range of A in Example 3.7
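The ellipse data of Example 3.7 follows directly from Theorem 3.6 and can be recomputed in Python/NumPy (a check of ours, independent of the Maple plot):

```python
import numpy as np

A = np.array([[3.0, 4.0],
              [1.0, 6.0]])

l1, l2 = np.linalg.eigvals(A)            # eigenvalues 2 and 7 (in some order)

# r = sqrt(||A||_F^2 - |l1|^2 - |l2|^2), as in Theorem 3.6.
r = np.sqrt(np.sum(np.abs(A) ** 2) - abs(l1) ** 2 - abs(l2) ** 2)

t = abs(l1 - l2) / 2                     # focal half-distance
centre = (l1 + l2) / 2                   # centre of the ellipse
semi_minor = r / 2
semi_major = np.sqrt(t ** 2 + (r / 2) ** 2)
```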

Now that we've stated some important theorems, let's establish the topological properties convexity and compactness for the numerical range of a square matrix of dimension n.

3.3 Topology behind the numerical range of a general matrix

We now describe topological properties of the numerical range as a subset of the complex plane C.

Theorem 3.8 Let F(A) be the numerical range of any matrix A ∈ Mn. Then the following properties hold for F(A): [3]

(a) F(A) is a compact subset of C.
(b) F(A) is a convex subset of C.

Proof: Let's prove (a) first. F(A) is the range of the continuous map x̄ → x̄∗Ax̄ from Cⁿ to C over the domain {x̄ : x̄ ∈ Cⁿ, x̄∗x̄ = 1}, which is the unit sphere in Cⁿ. Since the continuous image of a compact set is compact, it follows that F(A) is a compact set (we identify Cⁿ with R²ⁿ when defining compact sets).

To prove (b), we use the fact that the numerical range is convex in the 2 × 2 case, and prove the general n × n case. Let a, b ∈ F(A). We need to show that any point on the line segment joining a and b is in F(A). This is trivial if a = b, so consider a ≠ b. Now let x̄ and ȳ be two normalized vectors s.t. a = x̄∗Ax̄ and b = ȳ∗Aȳ. Since a ≠ b, the vectors x̄ and ȳ must be linearly independent. Let V be the subspace spanned by x̄ and ȳ, and let v̄1 and v̄2 be an ON-basis for V. Then there are scalars ci, di, i = 1, 2, s.t.

x̄ = c1v̄1 + c2v̄2,
ȳ = d1v̄1 + d2v̄2.

Write c̄ = (c1, c2) and d̄ = (d1, d2). Since v̄1 and v̄2 are orthonormal and the norms ‖x̄‖ = 1 and ‖ȳ‖ = 1, we get |c1|² + |c2|² = |d1|² + |d2|² = 1, so

‖c̄‖ = ‖d̄‖ = 1.

Finally, let P be the n × 2 matrix with columns v̄1 and v̄2, so that x̄ = Pc̄ and ȳ = Pd̄. From this, we have a = x̄∗Ax̄ = c̄∗P∗APc̄ and b = ȳ∗Aȳ = d̄∗P∗APd̄, which shows that a and b are points in F(P∗AP). However, P∗AP is a 2 × 2 matrix, so F(P∗AP) contains the line segment with endpoints a and b. We also know that F(P∗AP) ⊆ F(A), which means F(A) contains that very same line segment. □

We'll now discuss general results for normal square matrices.

3.4 General results of normal matrices

Before we head on to the next chapter about Gershgorin regions, we'll discuss the general results that hold for a typical normal square matrix.

First, let's prove the following theorem.

Theorem 3.9 Let A ∈ Mn. If an eigenvalue λ of A is on the boundary of F(A), there exists a unitary matrix U s.t. U∗AU = λ ⊕ A2, where A2 is a matrix of order n − 1. [8]

Proof: Suppose that λ is on the boundary of F(A). Choose a corresponding normalized eigenvector ū1, together with vectors ū2, ..., ūn s.t. ū1, ū2, ..., ūn is an ON-basis for Cⁿ. Now, let U be the unitary matrix with columns ū1, ..., ūn. Then we have

        [ λ  a12 ... a1n ]
U∗AU =  [ 0̄       A2     ],

where 0̄ is a column of zeros and A2 is an (n − 1) × (n − 1) block. Since F(A) = F(U∗AU), λ lies on the boundary of the numerical range of the matrix above. We now show that a1j = 0 for all j = 2, ..., n. Build the matrix

     [ λ  a1j ]
Cj = [ 0  ajj ],

a 2 × 2 principal submatrix of U∗AU formed from rows and columns 1 and j. Obviously, we have F(Cj) ⊆ F(A). If a1j ≠ 0, then F(Cj) is bounded by a non-degenerate ellipse with λ as one of its foci, so λ is an interior point of F(Cj), and therefore an interior point of F(A). This contradicts the assumption that λ lies on the boundary of F(A). Hence, a1j = 0 for all j = 2, ..., n. □

With this result, we can prove the following interesting theorem.

Theorem 3.10 If A ∈ Mn with n ≤ 4, and F(A) is the convex hull of the eigenvalues of A, then A is a normal matrix. [8]

In the upcoming proof, we use knowledge about the different possible shapes of the numerical range to conclude whether A is normal or not.

Proof: Assume that F(A) is the convex hull of the eigenvalues of A. If A has dimension n ≤ 4, there can be at most four distinct eigenvalues, so F(A) is either a point, a line segment, a triangle or a quadrilateral. In the case that F(A) is a triangle or a quadrilateral, the eigenvalues are its vertices. If we consider a line segment as a subset of the plane, then every point of the segment is a boundary point. Therefore, in all cases at least n − 1 of the n eigenvalues are on the boundary of F(A). There is one exception: when A is a 4 × 4 matrix and its numerical range is a triangle, the fourth eigenvalue is an interior point of the triangle.

In each of these cases, applying Theorem 3.9 n − 1 times shows that U∗AU is diagonal, and hence that A is normal. □

For square matrices of dimension n ≥ 5, n − 2 or fewer of the eigenvalues may lie on the boundary, and such matrices don't have to be normal.

Later in this thesis, we'll do experiments with a 5 × 5 matrix which is almost normal: for a small interference number r > 0, its numerical range is still a convex set with some of the eigenvalues as vertices and the rest of them as interior points.

Chapter 4

Gershgorin Discs

This chapter contains useful theorems about Gershgorin discs. With an understanding of the fundamental theory behind the numerical range, one can interpret these regions as well. A general reference is Gustafson [2].

We start here by stating the definition of Gershgorin discs.

Definition 4.1 Let A ∈ Mn. Then the (closed) discs

G(i)(A) = { z ∈ C : |z − aii| ≤ R′i(A) },  i = 1, ..., n,

where R′i(A) = Σ_{j≠i} |aij| is interpreted as the (deleted) row sum, are called the Gershgorin discs, and G(A) = ∪_{i=1}^n G(i)(A) the Gershgorin region.

An example is in order, to help understand the definition.

Example 4.2 Consider the matrix

    [ 5  0  4 ]
A = [ 1  3  3 ]
    [ 0  1  5 ].

This matrix has the eigenvalues λ1 = 4, λ2 = (9 + √17)/2, λ3 = (9 − √17)/2. The Gershgorin discs are

G(1)(A) = { |z − 5| ≤ 4 },
G(2)(A) = { |z − 3| ≤ 4 },
G(3)(A) = { |z − 5| ≤ 1 }.

Also, the Gershgorin region is the union of discs


Figure 4.1: Gershgorin region of A in Example 4.2

We see clearly that G(3)(A) lies inside the disc G(1)(A); they have the same centre but different radii. Note that in the figure above, blue diamonds represent the centres of the respective discs and red boxes the eigenvalues of A.

Theorem 4.3 Let A ∈ Mn. Then the relation σ(A) ⊂ G(A) holds. [2]

Proof: Let λ be an eigenvalue of A, so λ ∈ σ(A) and Ax̄ = λx̄ for some x̄ ∈ Cⁿ, x̄ ≠ 0̄. Choose k s.t. |xk| = max_{1≤j≤n} |xj|. Now we have λxk = Σ_{i=1}^n aki xi, which we can write as

λxk − akk xk = Σ_{i≠k} aki xi.

Therefore, |λ − akk||xk| ≤ Σ_{i≠k} |aki||xi| ≤ Σ_{i≠k} |aki||xk|. Hence, |λ − akk| ≤ Σ_{i≠k} |aki| and λ ∈ G(k)(A) ⊂ G(A). □

The proof can also be done using the same argument with the column sums of A; we then obtain σ(A) ⊂ ∪_{i=1}^n G′(i)(A) = G′(A), where G′(i)(A) = { z ∈ C : |z − aii| ≤ Σ_{j≠i} |aji| }.

The theorem above tells us that σ(A), the spectrum of A, lies somewhere in the Gershgorin region G(A), the union of all Gershgorin discs. If the Gershgorin region is a disjoint union of sets, then the number of eigenvalues in each set equals the number of discs that form it.
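Theorem 4.3 is cheap to stress-test numerically. The sketch below (our own check in Python/NumPy, not thesis code; `in_gershgorin_region` is a name we made up) draws random complex matrices and verifies that every eigenvalue falls in at least one Gershgorin disc:

```python
import numpy as np

def in_gershgorin_region(A, z, tol=1e-9):
    """True if z lies in some disc |z - a_ii| <= sum_{j != i} |a_ij|."""
    radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return bool(np.any(np.abs(z - np.diag(A)) <= radii + tol))

rng = np.random.default_rng(1)
all_contained = True
for _ in range(50):
    A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
    for lam in np.linalg.eigvals(A):
        all_contained = all_contained and in_gershgorin_region(A, lam)
```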

Corollary 4.4 Let A ∈ Mn. Then the relation σ(A) ⊂ G(A) ∩ G(A^T) holds.

Proof: Since σ(A) = σ(A^T) ⊂ G(A^T) = G'(A), it's clear from Theorem 4.3 that σ(A) ⊂ G(A) ∩ G(A^T). ∎
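A quick numerical check of the corollary can be sketched in Python (illustrative only, on a hypothetical matrix): every eigenvalue must lie both in the row-disc union G(A) and in the column-disc union G(A^T), hence in their intersection.

```python
import numpy as np

def gershgorin_contains(A, z):
    """True if z lies in the union of the (row) Gershgorin discs of A."""
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)
    return any(abs(z - c) <= r + 1e-9 for c, r in zip(centers, radii))

# hypothetical example matrix: row discs are wide, column discs narrow
A = np.array([[1.0, 10.0],
              [0.1, 2.0]])
for lam in np.linalg.eigvals(A):
    # lies in G(A) and in G(A^T), so also in their intersection
    print(gershgorin_contains(A, lam) and gershgorin_contains(A.T, lam))
```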

We'll do an example illustrating G(A) ∩ G(A^T).

Example 4.5 Consider the real-valued matrix

A = [ 4 7
      2 6 ].

This matrix has the eigenvalues λ1 = 5 + √15, λ2 = 5 − √15. The Gershgorin regions G(A) and G(A^T) are illustrated below.

Figure 4.2: The intersection of G(A) and G(A^T) in Example 4.5, where the region in blue is G(A) and the one in red is G(A^T). It's clear that σ(A) ⊂ G(A) ∩ G(A^T) (observe that the black diamonds are the centres of the discs and the blue boxes the eigenvalues of A and A^T).

Chapter 5: Results

In this chapter, we investigate different types of complex-valued matrices by making illustrations in Maple; see the appendix for the main code.

An observation before going through the theory below: the Maple command W works well, but by default it doesn't draw enough points for the numerical range of a square matrix, which makes the boundary of the region non-smooth. To make sure that the boundary of the numerical range becomes smooth, we supply a second argument, W(A, p), where A is a square matrix and p is the number of points used to draw the boundary of the numerical range. By setting p to a large number (the default is 65, which is not enough for a smooth boundary), the numerical range becomes smoother. For drawing the numerical range of a square matrix in the experiments, p is set to 800.

Another remark is that the number of lines used for drawing the envelope of the numerical range is 120, and the number of points used for the envelope is 1000.

5.1 Theory behind the illustrations

In this section, we motivate the illustrations of the numerical range of the matrices by discussing the theory behind the code, which is based on results from Horn and Johnson [3].

Firstly, provided that we understand what an envelope is (see the Preliminaries chapter to be sure), let

A = (A + A*)/2 + (A − A*)/2,

where H(A) = (A + A*)/2 is the Hermitian part of A. We realize that H(A) is self-adjoint, so from before, we know that H(A) has real eigenvalues.

Now consider µ1 as the largest eigenvalue of H(A) and λ an eigenvalue of A, with normalized eigenvector x¯. Then

Re(λ) = (λ + λ̄)/2 = (x̄*Ax̄ + x̄*A*x̄)/2 = x̄*H(A)x̄ ≤ µ1.

This means the spectrum of A lies to the left of the vertical line through µ1 (analogously, if we denote the smallest eigenvalue of H(A) by µ2, then the spectrum of A lies to the right of the vertical line through µ2).

Now, by denoting

A_θ = e^{iθ}A,

we understand that A_θ must have eigenvalues λ_j e^{iθ} if A has eigenvalues λ_j. Also, denote H(A_θ) = (A_θ + A_θ*)/2, with largest eigenvalue λ_θ = µ1(A_θ). So, referring to a result in [3], denote the half-planes H_θ ≡ e^{−iθ}{z : Re z ≤ λ_θ}, ∀θ ∈ [0, 2π), generated by the straight lines L_θ ≡ { e^{−iθ}(λ_θ + ti), t ∈ R }, where λ_θ is the largest eigenvalue of H(A_θ).

By convexity of F(A), this leads us to a result in [3, Section 1.5]:¹

F(A) = ∩_{θ∈[0,2π)} H_θ.

The numerical range is the intersection of all H_θ, which means the boundary of F(A) is the envelope of the family of lines L_θ, ∀θ ∈ [0, 2π). Also, the complex number a_θ ≡ x̄_θ* A x̄_θ is a boundary point of F(A), given the prerequisites above, where x̄_θ is a normalized eigenvector of H(A_θ) belonging to λ_θ.

¹ A result which we motivate in the upcoming subsection.

5.1.1 Drawing the illustrations

The theory above gives the reader an idea of how to illustrate the numerical range of general square matrices in any calculation program, e.g. Maple, Matlab, et cetera. The approach is to draw the lines L_θ, ∀θ ∈ [0, 2π); the numerical range of any square matrix is then obtained as the intersection of all H_θ (the boundary of the numerical range being the envelope of the family of lines L_θ).

Note that in the upcoming experiments, the axes are denoted s and t in the figures.
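The supporting-line construction above can be sketched in a few lines. This is an illustrative Python analogue of the Maple command W (not its actual source): for each angle θ, the top eigenvector x_θ of the Hermitian part of e^{iθ}A yields the boundary point a_θ = x_θ* A x_θ.

```python
import numpy as np

def numerical_range_boundary(A, n_points=360):
    """Sample boundary points a_theta of F(A) via the supporting-line method."""
    pts = []
    for theta in np.linspace(0, 2 * np.pi, n_points, endpoint=False):
        A_th = np.exp(1j * theta) * A
        H = (A_th + A_th.conj().T) / 2     # Hermitian part of the rotated matrix
        w, V = np.linalg.eigh(H)           # eigenvalues in ascending order
        x = V[:, -1]                       # eigenvector of the largest eigenvalue
        pts.append(x.conj() @ A @ x)       # boundary point a_theta of F(A)
    return np.array(pts)

# the first 2x2 example below: F(A1) is an ellipse with foci 2 and 1
A1 = np.array([[2, 3], [0, 1]], dtype=complex)
boundary = numerical_range_boundary(A1)
```

For A1 the sampled boundary reaches real part (3 + √10)/2 and imaginary part 3/2, matching the semi-axes discussed in Section 5.2.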

5.2 General matrices in M2

Consider the matrix A1 = [ 2 3; 0 1 ]. It has the eigenvalues λ1 = 2, λ2 = 1. The Maple code gives us the following numerical range

Figure 5.1: Numerical range of A1

Figure 5.2: Envelope for F(A1)

which is definitely the region in the complex plane bounded by an ellipse. The ellipse has the centre (3/2, 0), minor semi-axis ±3/2, major semi-axis ±√(3²/4 + (1/2)²) = ±√10/2 and vertices at ((3 ± √10)/2, 0) on the s-axis. Also, λ1 and λ2 are interior points of the ellipse. Note that A1 already is on Schur form.

We look at another matrix, A2 = [ 1 1; 2 3 ], which has the eigenvalues λ1 = 2 + √3, λ2 = 2 − √3. The Schur form of A2 is

Γ2 = [ 2 + √3  −1
       0       2 − √3 ],

so we get instead the numerical range

Figure 5.3: Numerical range of A2

Figure 5.4: Envelope for F(A2)

We see that the numerical range is similar to that of A1: the region in the complex plane bounded by an ellipse. This ellipse has the centre (2, 0), minor semi-axis ±1/2, major semi-axis ±√(1/4 + (√3)²) ≈ ±1.8 and vertices (2 ± 1.8, 0). Here, the eigenvalues of A2 are interior points, but quite close to the vertices.

Let's look at a third example, the matrix A3 = [ 5 1; 0 −5 ]. The matrix has the eigenvalues λ1 = 5, λ2 = −5.

The numerical range is

Figure 5.5: Numerical range of A3

Figure 5.6: Envelope for F(A3)

A3 has, like A1 and A2, a numerical range which is the region in the complex plane bounded by an ellipse. Note that this ellipse has the origin as centre, minor semi-axis ±1/2, major semi-axis ±√(1/4 + 5²) = ±√101/2 and vertices (±√101/2, 0) (the eigenvalues are again close to the vertices, inside the ellipse).
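The semi-axis computations above can be cross-checked numerically. For a 2 × 2 upper-triangular matrix [ λ1 b; 0 λ2 ] the numerical range is an ellipse with foci λ1, λ2 and minor semi-axis |b|/2 (the elliptical range theorem); the sketch below (illustrative Python, not the thesis's Maple code) verifies that the rightmost point of F(A3), i.e. the largest eigenvalue of the Hermitian part H(A3), equals the major semi-axis, since the centre here is the origin.

```python
import numpy as np

A3 = np.array([[5, 1], [0, -5]], dtype=float)
H = (A3 + A3.T) / 2                          # Hermitian part [[5, 1/2], [1/2, -5]]

# major semi-axis from the elliptical range theorem: sqrt(b^2/4 + ((l1-l2)/2)^2)
major_from_formula = np.sqrt(1 / 4 + 5 ** 2)  # = sqrt(101)/2

print(np.isclose(np.linalg.eigvalsh(H).max(), major_from_formula))  # True
```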

5.3 Normal matrices

In this section, we do experiments on normal matrices. We'll examine the convex hulls and see how exactly they differ as regions in the complex plane. Also, illustrations of their envelopes are important in this matter.

Let's start with a normal (Hermitian) 2 × 2 matrix A4 = [ 2 5; 5 1 ]. The matrix has the eigenvalues λ1 = 3/2 + √101/2, λ2 = 3/2 − √101/2.

The numerical range is the convex hull, which is the line segment

Figure 5.7: Numerical range of A4

Figure 5.8: Envelope for F(A4)

We see here that the numerical range is the line segment joining the smallest and largest eigenvalue (according to Theorem 3.6, r = 0 gives a flat ellipse).

Next, we'll look at a normal (again Hermitian) complex-valued matrix of dimension 3. Consider the matrix

A5 = [ 1  i 0
      −i  1 0
       0  0 1 ].

The matrix has the eigenvalues λ1 = 0, λ2 = 1, λ3 = 2.

The numerical range is the convex hull, which is the line segment

Figure 5.9: Numerical range of A5

Figure 5.10: Envelope for F(A5)

As we can see, Hermitian matrices have a distinctive behavior when it comes to their numerical range: it is the line segment in the complex plane joining the smallest and largest eigenvalue. We can also say that the convex hulls either are line segments containing all of the real eigenvalues forming the spectrum, for Hermitian matrices, or, for more general normal matrices, the region in the complex plane bounded by a polygon (triangle, quadrilateral, et cetera).

Consider the normal matrix

A6 = [ 1 0 0 0
       0 0 1 0
       0 0 0 1
       0 1 0 0 ].

The matrix has the eigenvalues λ1 = −1/2 + i√3/2, λ2 = −1/2 − i√3/2, λ3 = 1, λ4 = 1.

The numerical range is the convex hull, which is the triangle

Figure 5.11: Numerical range of A6

Figure 5.12: Envelope for F(A6)

The numerical range is, indeed, the convex hull: the region in the complex plane bounded by a triangle. According to Theorem 3.10, since the dimension is 4 and the numerical range is the convex hull of the eigenvalues, the matrix must be normal (which, of course, we already knew).
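The convex-hull property can also be probed numerically. The following illustrative Python sketch (not from the thesis's Maple code) samples values x*A6x for random unit vectors x and checks that every one of them lies in the triangle with vertices 1 and (−1 ± i√3)/2, i.e. in the convex hull of the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
A6 = np.array([[1, 0, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1],
               [0, 1, 0, 0]], dtype=complex)

# eigenvalue triangle: 1 and the two non-real cube roots of unity
verts = [1 + 0j, (-1 + 1j * np.sqrt(3)) / 2, (-1 - 1j * np.sqrt(3)) / 2]

def in_triangle(z, v, tol=1e-9):
    """z inside the counterclockwise triangle v[0]v[1]v[2], via cross products."""
    for a, b in zip(v, v[1:] + v[:1]):
        edge, rel = b - a, z - a
        if edge.real * rel.imag - edge.imag * rel.real < -tol:
            return False
    return True

ok = True
for _ in range(2000):
    x = rng.normal(size=4) + 1j * rng.normal(size=4)
    x /= np.linalg.norm(x)
    ok &= in_triangle(x.conj() @ A6 @ x, verts)
print(ok)  # True
```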

5.4 Almost normal matrices

In this section, we study matrices that are complex-valued and almost normal, and also other types of matrices.

Consider the matrix

A7 = [ 4 0 0
       0 2 3
       0 0 1 ].

The matrix has the eigenvalues λ1 = 1, λ2 = 2, λ3 = 4.

The numerical range is

Figure 5.13: Numerical range of A7

Figure 5.14: Envelope for F(A7)

Seeing the form of A7, we realize that λ3 is on the boundary of the numerical range, while the rest of the eigenvalues are interior points. Then, by Theorem 3.9, we have U*A7U = A7 = [4] ⊕ L, where L = [ 2 3; 0 1 ] and U = I is unitary. Therefore, the numerical range of A7 is the convex hull of a normal part, the point 4, and an oval part, F(L).

5.4.1 Interference of interior points

Now, consider the next matrix

A8 = [ 0 0      0      0     0
       0 2 + 3i 0      0     0
       0 0      6 − 5i 0     0
       0 0      0      2 + i 0
       0 0      0      0     5/2 − i ].

This matrix is obviously normal and has the eigenvalues λ1 = 0, λ2 = 2 + 3i, λ3 = 6 − 5i, λ4 = 2 + i, λ5 = 5/2 − i.

The numerical range is

Figure 5.15: Numerical range of A8

Figure 5.16: Envelope for F(A8)

According to Theorem 3.10, the numerical range of this matrix is the convex hull of the eigenvalues λ1, λ2 and λ3: the region in the complex plane bounded by a triangle, with interior points λ4 and λ5. But the convex hull is also obtained for a small number r > 0 if the matrix looks like

Ã8(r) = [ 0 0      0      0     0
          0 2 + 3i 0      0     0
          0 0      6 − 5i 0     0
          0 0      0      2 + i r
          0 0      0      0     5/2 − i ].

However, for a large enough number, the numerical range looks different. Take r = 6; then the envelope is

Figure 5.17: Envelope for F(Ã8) when r = 6

The larger r is, the more oval the boundary of the numerical range becomes. This matrix is no longer normal, though.
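That closing remark can be verified directly. The illustrative Python sketch below (not the thesis's Maple code) checks that Ã8(r) commutes with its adjoint for r = 0 but not for r = 6, so the perturbed matrix is indeed no longer normal.

```python
import numpy as np

def is_normal(A, tol=1e-12):
    """A matrix is normal iff it commutes with its conjugate transpose."""
    return np.allclose(A @ A.conj().T, A.conj().T @ A, atol=tol)

def A8_tilde(r):
    A = np.diag([0, 2 + 3j, 6 - 5j, 2 + 1j, 2.5 - 1j])
    A[3, 4] = r                     # interference number in position (4, 5)
    return A

print(is_normal(A8_tilde(0)), is_normal(A8_tilde(6)))  # True False
```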

It's also interesting to see that the numerical ranges of both matrices are convex sets (the oval part of the numerical range of Ã8 is flat for a small number r > 0).

5.4.2 Interference of boundary points

It's interesting to see how sensitive A8 is if we instead investigate the interference around a boundary point. Firstly, we define the matrix

Â8(r) = [ 0 r      0      0     0
          0 2 + 3i 0      0     0
          0 0      6 − 5i 0     0
          0 0      0      2 + i 0
          0 0      0      0     5/2 − i ],

which is A8 but with an interference number r applied around a boundary point this time.

For a small r, say r = 3/8, the envelope is

Figure 5.18: Envelope for F(Â8) when r = 3/8

One of the edges of the boundary of the numerical range is a bit oval, so it appears that for a small number, the interference isn't large enough. However, looking at the point λ2 = 2 + 3i, it's not clear whether it's still a vertex of the region in the complex plane bounded by the triangle. To investigate this further, see the envelope of the same triangle below, with the coordinate axes centred around λ2.

Figure 5.19: Envelope for F(Â8), zoomed in at the point λ2 = 2 + 3i

It’s now easier to see that λ2 is still a vertex of the triangle.

Let's see what happens if we raise the number further, to a value greater than 1. For

Â8(5) = [ 0 5      0      0     0
          0 2 + 3i 0      0     0
          0 0      6 − 5i 0     0
          0 0      0      2 + i 0
          0 0      0      0     5/2 − i ],

the envelope is

Figure 5.20: Envelope for F(Â8) when r = 5

The region is, at this point, very oval, like the numerical range of A7. It's clear that the numerical range becomes even more oval if we raise the interference number r further.

5.5 Other matrices that are not normal

Next, we look at the matrix

A9 = [ 0 1 0
       0 0 1
       0 0 0 ],

which is a nilpotent matrix (a matrix A is nilpotent if A^N = 0 for some N ≥ 1). Then the eigenvalues are λ1 = λ2 = λ3 = 0.

The numerical range is

Figure 5.21: Numerical range of A9

Figure 5.22: Envelope for F(A9)

It appears that the nilpotent matrix has as numerical range the region in the complex plane bounded by a circle, with the eigenvalues as interior points. Now look at the matrix again, but in the form

Â9 = [ 0 a 0
       0 0 b
       0 0 0 ],  a, b ∈ C.

Here a or b can be zero, but not both (otherwise we have a normal matrix, and the numerical range reduces to the point where the eigenvalues join). Raising either a or b would, by the earlier experiments on other matrices, result in the region bounded by a circle of greater radius around the origin. To be convinced, let's raise a. Now the matrix is

Ã9 = [ 0 8 0
       0 0 1
       0 0 0 ].

The numerical range is now
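The circular shape for A9 can be quantified. A known result (not proved in the thesis) says the numerical range of the n × n Jordan block at 0 is the closed disc of radius cos(π/(n+1)) about the origin; for A9 (n = 3) the radius should be cos(π/4) = √2/2. The illustrative Python sketch below estimates the numerical radius by sweeping the direction θ, as in Section 5.1.

```python
import numpy as np

A9 = np.array([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]], dtype=complex)

# numerical radius = max over theta of the largest eigenvalue of the
# Hermitian part of e^{i theta} A9 (constant in theta for this matrix)
radius = max(
    np.linalg.eigvalsh((np.exp(1j * t) * A9 + np.exp(-1j * t) * A9.conj().T) / 2).max()
    for t in np.linspace(0, 2 * np.pi, 90)
)
print(np.isclose(radius, np.cos(np.pi / 4)))  # True
```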

Figure 5.23: Numerical range of Ã9

Figure 5.24: Envelope for F(Ã9)

5.6 Gershgorin region versus numerical range

In this section, we study the Gershgorin regions and numerical ranges of several square matrices, and compare which of them bounds the spectrum better (or whether both do well).

5.6.1 Matrices in M2

First, consider the matrix A10 = [ 0 1; 4 0 ]. This matrix has the eigenvalues λ1 = −2, λ2 = 2, and the matrix Γ = [ −2 3; 0 2 ] is the Schur form of A10. The Gershgorin region and the envelope for the numerical range are

Figure 5.25: Envelope for F(A10)

Figure 5.26: Gershgorin region for A10

A comment on the numerical range: it's the region in the complex plane bounded by an ellipse with the origin as centre, minor semi-axis ±3/2 and major semi-axis ±√(3²/4 + (−2)²) = ±5/2.

We see here that the Gershgorin region is the region in the plane bounded by two circles, where the smaller circle lies inside the bigger one. If we look at the intersection of these regions below,

Figure 5.27: Intersection of F(A10) and G(A10)

we can see that the numerical range lies completely inside the Gershgorin region (the small circle lies inside the ellipse, though). We see that σ(A10) lies inside both regions in the plane. Therefore, we can conclude that the numerical range is the best set bounding the spectrum here.

" 1 # −1 2 Let’s look at next matrix. Consider A11 = 1 . This matrix has the 4 1 √ √ √ " 3 2 1 # 3 2 3 2 4 4√ following eigenvalues λ1 = − 4 , λ2 = 4 and the matrix Γ2 = −3 2 0 4 is the Schur form of A11.

The Gershgorin region and the envelope for the numerical range are

Figure 5.28: Envelope for F(A11)

Figure 5.29: Gershgorin region for A11

A comment on the numerical range: it's the region in the complex plane bounded by an ellipse with the origin as centre, minor semi-axis ±1/8 and major semi-axis ±√(1/64 + (3√2/4)²) ≈ ±1.068. We see here that the Gershgorin region is the region in the plane bounded by two circles that neither intersect nor contain one another. If we look at the intersection of these regions below,

Figure 5.30: Intersection of F(A11) and G(A11)

we can see that parts of the numerical range are inside the Gershgorin region; the rest of the numerical range is cut off by the Gershgorin region. We see that σ(A11) lies inside both regions in the plane. Therefore, we can conclude that the intersection of these regions is the best set for bounding the spectrum.

The last experiment regarding the dimension 2 case is the following normal matrix A12 = [ −1 1; 1 1 ]. This matrix has the eigenvalues λ1 = −√2, λ2 = √2, and the matrix Γ3 = [ √2 0; 0 −√2 ] is the Schur form of A12. The Gershgorin region and the envelope for the numerical range are

Figure 5.31: Envelope for F(A12)

Figure 5.32: Gershgorin region for A12

A comment on the numerical range: it's a line segment joining the smallest and largest eigenvalue (a flat ellipse, r = 0). We see here that the Gershgorin region is the region in the complex plane bounded by two circles that intersect at the origin. If we look at the intersection of these regions below,

Figure 5.33: Intersection of F(A12) and G(A12)

we see that the whole line segment lies in the union of the circles, and σ(A12) lies within it. Therefore, we can conclude that the numerical range is better here.

5.6.2 Matrices in M3 and higher dimensions

In this subsection, we study matrices of higher dimensions, comparing numerical ranges to Gershgorin regions.

Consider the matrix

A13 = [ −4 0 1
         0 1 4
        −1 0 6 ].

This matrix has the eigenvalues λ1 = 1, λ2 = 1 − 2√6, λ3 = 1 + 2√6. The Gershgorin region and the envelope for the numerical range are

Figure 5.34: Envelope for F(A13)

Figure 5.35: Gershgorin region for A13

The numerical range is an oval region in the complex plane. The Gershgorin region is similar to the situation for matrix A12, in that the bigger circle intersects the smaller circles. If we look at the intersection of these regions below,

Figure 5.36: Intersection of F(A13) and G(A13)

we see that parts of the oval region and the Gershgorin region intersect, and the Gershgorin region cuts off the rest of the oval region. σ(A13) lies in the intersection of the regions in the plane. Therefore, we can conclude that the intersection of these regions is the best set for bounding the spectrum.

6 + i 0 1  Let’s look at the next matrix A14 =  0 3 2 . This matrix has the 3 0 −2 + i √ √ eigenvalues λ1 = 3, λ2 = 2 + 19 + i, λ3 = 2 − 19 + i. The Gershgorin region and the envelope for the numerical range are

Figure 5.37: Envelope for F(A14)

Figure 5.38: Gershgorin region for A14

Like for the previous matrix, the numerical range looks like an oval region in the complex plane. However, the Gershgorin region resembles the situation for matrix A11: circles not intersecting anywhere in the plane. If we look at the intersection of these regions below,

Figure 5.39: Intersection of F(A14) and G(A14)

we see that parts of the oval region and the Gershgorin region intersect, and the Gershgorin region cuts off the rest of the oval region. σ(A14) lies in the intersection of the regions in the plane. Therefore, we can conclude that the intersection of the regions is the best set for bounding the spectrum.

As one last experiment for the 3 × 3 case, consider the matrix

A15 = [ 1 0         1
        0 (1 + i)/4 1
        1 0         −1/2 ].

It has the eigenvalues λ1 = 3/2, λ2 = (1 + i)/4, λ3 = −1. The Gershgorin region and the envelope for the numerical range are

Figure 5.40: Envelope for F(A15)

Figure 5.41: Gershgorin region for A15

Like for the previous matrix, the numerical range looks like an oval region in the complex plane. However, the Gershgorin region is the union of three circles with radius 1 but different centres. If we look at the intersection of these regions below,

Figure 5.42: Intersection of F(A15) and G(A15) 5.6. Gershgorin region versus Numerical range 53

we see that σ(A15) lies in both regions in the plane. However, the oval region lies inside the union of the circles, so we can conclude that the numerical range is the best set here.

We'll now look at matrices of dimension n > 3. Consider the matrix

A16 = [ −7/4 0        0   1        0
        1    −1/4 + i 0   0        0
        0    0        5/4 1        0
        1    0        0   11/4 + i 0
        0    0        1   0        17/4 ].

This matrix has the eigenvalues

λ1 = −1/4 + i, λ2 = (1 + i)/2 − √(93 + 36i)/4, λ3 = (1 + i)/2 + √(93 + 36i)/4, λ4 = 5/4, λ5 = 17/4.

The Gershgorin region and the envelope for the numerical range are

Figure 5.43: Envelope for F(A16)

Figure 5.44: Gershgorin region for A16

Like for the previous matrix, the numerical range looks like an oval region in the complex plane. However, the Gershgorin region is the union of five circles with the same radius 1 but different centres (it looks somewhat like the Olympic rings). If we look at the intersection of these regions below,

Figure 5.45: Intersection of F(A16) and G(A16) 5.6. Gershgorin region versus Numerical range 55

we see that parts of the oval region and the Gershgorin region intersect. σ(A16) lies in the intersection of the regions in the plane. Therefore, we can conclude that the intersection of the numerical range and the Gershgorin region is the best suited set for bounding the spectrum.

−2 1 0 0 0 0 0   1 0 0 0 0 0 0     0 1 2 0 0 0 0    Consider the matrix A17 =  0 0 1 −2 + 2i 0 0  .    0 0 0 1 −2 − 2i 0 0     0 0 0 0 1 2 + 2i 0  0 0 0 0 0 1 2 − 2i This matrix has the eigenvalues

λ1 = −2 + 2i, λ2 = −2 − 2i, λ3 = −1 − √2, λ4 = −1 + √2, λ5 = 2, λ6 = 2 + 2i, λ7 = 2 − 2i.

The Gershgorin region and the envelope for the numerical range are

Figure 5.46: Envelope for F(A17)

Figure 5.47: Gershgorin region for A17

The numerical range is an oval region in the complex plane (it looks somewhat like a pillow). Also, the Gershgorin region is the union of seven circles with the same radius 1 but different centres (the union forms an H).

If we look at the following intersection of these regions below,

Figure 5.48: Intersection of F(A17) and G(A17)

we see that parts of the oval region and the Gershgorin region intersect. It's easy to see that the Gershgorin region cuts off the rest of the oval region. σ(A17) lies in the intersection of the regions in the plane. Therefore, we can conclude that the intersection of the numerical range and the Gershgorin region is the best suited set for bounding the spectrum.

0 1 0 0 0 0 0 0 0  0 −2 1 0 0 0 0 0 0    0 0 −4 1 0 0 0 0 0    0 0 0 2 1 0 0 0 0    Consider the matrix A18 = 0 0 0 0 4 1 0 0 0  . This ma-   0 0 0 0 0 2i 1 0 0    0 0 0 0 0 0 4i 1 0    0 0 0 0 0 0 0 −2i 1  0 0 0 0 0 1 0 0 −4i trix has the eigenvalues

λ1 = −4, λ2 = −2, λ3 = −i√(10 + √37), λ4 = −i√(10 − √37), λ5 = 0, λ6 = 2, λ7 = 4, λ8 = i√(10 − √37), λ9 = i√(10 + √37).

The Gershgorin region and the envelope for the numerical range are

Figure 5.49: Envelope for F(A18)

Figure 5.50: Gershgorin region for A18

The numerical range is an oval region in the complex plane (it looks like a non-smooth rhombus). Also, the Gershgorin region is the union of nine circles with the same radius 1 but different centres (the union forms a plus sign).

If we look at the following intersection of these regions below,

Figure 5.51: Intersection of F(A18) and G(A18)

we see that parts of the oval region and the Gershgorin region intersect. It's also easy to see that the Gershgorin region cuts off the rest of the numerical range. σ(A18) lies in the intersection of the regions in the plane. Therefore, we can conclude that the intersection of the numerical range and the Gershgorin region is the best suited set for bounding the spectrum.

Chapter 6: Discussion

In this chapter, we discuss the results and give a conclusion about the numerical range and Gershgorin region of a square matrix, and also mention some topics that remain to be investigated.

6.1 Numerical range

In the results chapter, we have illustrated different cases of numerical ranges of square matrices, whether normal or not. Three results are the most important.

Firstly, the numerical range of a normal square matrix is, for Hermitian matrices, the line segment joining the smallest and largest eigenvalue with the rest of the eigenvalues as interior points, or, for more general normal matrices, the region in the complex plane bounded by a polygon whose vertices are eigenvalues.

Secondly, for the non-normal matrices, we have seen that their numerical range is an oval region in the plane (or, in the 2 × 2 case, the region bounded by an ellipse).

Finally, in the special case when the numerical range of a square matrix is a combination of a normal and an oval part, with 1 or 2 eigenvalues as interior points and the rest of the eigenvalues as vertices, we have seen that the region in the complex plane becomes more oval as the interference number r is raised. For small numbers, however, the region stays close to the convex hull of the eigenvalues (if the dimension is greater than 4, then 2 or more eigenvalues might be interior points; if the dimension n ≤ 4, the numerical range stays close to the convex hull).

6.2 Gershgorin regions

In Chapter 4, we examined Gershgorin regions of general square matrices analytically. In the results chapter, we have seen different types of Gershgorin regions containing the spectrum of a square matrix, where the Gershgorin region is obtained as the union of the Gershgorin discs (every Gershgorin disc being acquired from a row of the matrix). However, a Gershgorin region can also be obtained as the union of discs determined from the columns of the matrix, which means the spectrum lies in the intersection of the two Gershgorin regions. Finally, if the Gershgorin region is a disjoint union of components, the number of eigenvalues in each component equals the number of Gershgorin discs that form it.

6.3 Numerical range versus Gershgorin regions

As seen in some of the later experiments, the numerical range has shown to be the best set for bounding the spectrum. In other cases, however, the Gershgorin region has been useful in the sense of cutting off the parts of the numerical range which don't overlap with the Gershgorin region. The spectrum then lies in the intersection of the numerical range and the Gershgorin region, so in a sense both sets are great for bounding the spectrum of the matrix.

6.4 Further topics to investigate

We've covered the most fundamental theorems in the theory behind the numerical range and the Gershgorin region. However, there is much more to discover, e.g. theorems that describe other types of regions containing the spectrum of any square matrix, like the Ostrowski and Brauer regions (see [14], [10] for a brief summary of their definitions), which are based on the theory behind the Gershgorin disc theorem.

In future research, the interest might lie in how to interpret the Ostrowski and Brauer regions with the help of an understanding of Gershgorin discs. This can be examined by defining the fundamental theory behind the numerical range and the Gershgorin region more generally, using different types of operators, but the subject is also approachable by examining matrices that map Cn → Cn. However, there is a whole new world to explore in applications of spectral theory, especially in the analysis of spectra of general matrices (or operators defined in other ways).

Bibliography

[1] Böiers, Lars-Christer and Persson, Arne. Analys i flera variabler. 3:10 Edition. Studentlitteratur AB, Lund, 2005.

[2] Gustafson, Karl E. and Rao, Duggirala K.M. Numerical Range. Springer-Verlag New York, Inc., 1997.

[3] Horn, Roger A. and Johnson, Charles R. Topics in Matrix Analysis. Cambridge University Press, USA, 1991.

[4] Janfalk, Ulf. Linjär algebra. Matematiska institutionen, Linköpings universitet, Linköping, 2012.

[5] Lundgren, Jan; Rönnqvist, Mikael and Värbrand, Peter. Optimization. 1:2 Edition. Studentlitteratur AB, Lund, 2012.

[6] Maplesoft. Gershgorin Circle Theorem. 2016. https://www.maplesoft. com/applications/view.aspx?sid=4814&view=html (Retrieved 2019-04- 01)

[7] Maplesoft. Numerical range of a square matrix. 2019. https: //www.maplesoft.com/applications/view.aspx?sid=4128&view=html (Retrieved 2019-03-25)

[8] Shapiro, Helene. Linear Algebra and Matrices. American Mathematical Society, Providence, USA, 2015.

[9] Treil, Sergei. Linear algebra done wrong. Department of Mathematics, Brown University, 2017. http://www.math.brown.edu/~treil/papers/ LADW/LADW_2017-09-04.pdf

[10] Wolfram MathWorld. Brauer’s Theorem. 2019. http://mathworld. wolfram.com/BrauersTheorem.html (Retrieved 2019-04-10)


[11] Wolfram MathWorld. Envelope. 2019. http://mathworld.wolfram.com/Envelope.html (Retrieved 2019-03-21)

[12] Wolfram MathWorld. Frobenius Norm. 2019. http://mathworld.wolfram.com/FrobeniusNorm.html (Retrieved 2019-03-25)

[13] Wolfram MathWorld. Matrix Direct Sum. 2019. http://mathworld.wolfram.com/MatrixDirectSum.html (Retrieved 2019-05-28)

[14] Wolfram MathWorld. Ostrowski's Theorem. 2019. http://mathworld.wolfram.com/OstrowskisTheorem.html (Retrieved 2019-04-10)

Appendix A

Maple Commands (Code)

You'll find valuable Maple commands here in case something is unclear with the simulations. The code for the command W used in section A.1, however, isn't attached; I recommend checking the hyperlink to the source code in the bibliography [7]. One observation is that W obtains the envelope of the numerical range of a general square matrix, but leaves the lines out of the picture.

In section A.2, the complete Maple code used for generating the Gershgorin regions is given. However, the code that generates the intersection of G(A) and G(A*) is modified from the complete Maple code. Commands and the modified code are attached in section A.2 (see [6] for the source code for Gershgorin regions).

A.1 Numerical range

with(plots): with(LinearAlgebra): restart:

A := <<2,0>|<3,1>>:
Eigenvalues(A):
F := Array(1..121):
for j from 1 by 1 to 120 do
  A := evalf(exp((2.0*Pi*I)/120)*A);
  H := (A + HermitianTranspose(A))/2:
  eig := Eigenvectors(H, output = list):
  eig := sort(eig, (a,b) -> is(Re(a[1]) > Re(b[1]))):
  d1 := eig[1,1]:
  F[j] := implicitplot(d1 - cos((j*2*Pi)/120)*s + sin((j*2*Pi)/120)*t = 0,
    s = -4..4, t = -3..3, numpoints = 1000):
end do:

ev := Eigenvalues(A):

F[121] := pointplot([[Re(ev[1]), Im(ev[1])], [Re(ev[2]), Im(ev[2])]],
  symbol = box, color = "Blue"):

display(seq(F[j], j = 1..121)):
W(A,800):

A.2 Gershgorin region

Gershgorin(A):
Gershgorin(HermitianTranspose(A)):

GershgorinIntersect := proc (A::Matrix, B::Matrix)
  local Delta, m, n, AA, BB, R, Rho, C, C_star, c, d, eig1, eig2, P, Q, Plt1, Plt2;
  Delta := proc (i, j) if i = j then 0 else 1 end if end proc;
  m, n := LinearAlgebra[Dimension](A);
  AA := Matrix(m, n, (i,j) -> Delta(i,j)*abs(A[i,j]));
  BB := Matrix(m, n, (i,j) -> Delta(i,j)*abs(B[i,j]));
  R := AA . Vector(m, 1);
  Rho := BB . Vector(m, 1);
  C := seq('plottools[circle]'([Re(A[i,i]), Im(A[i,i])], R[i], color=blue), i = 1..m);
  C_star := seq('plottools[circle]'([Re(B[i,i]), Im(B[i,i])], Rho[i], color=blue), i = 1..m);
  c := seq('plottools[point]'([Re(A[i,i]), Im(A[i,i])], color=black, symbol=diamond), i = 1..m);
  d := seq('plottools[point]'([Re(B[i,i]), Im(B[i,i])], color=green, symbol=diamond), i = 1..m);
  eig1 := evalf(LinearAlgebra[Eigenvalues](A));
  eig2 := evalf(LinearAlgebra[Eigenvalues](B));
  P := seq('plottools[point]'([Re(eig1[i]), Im(eig1[i])], color=red, symbol=box), i = 1..m);
  Q := seq('plottools[point]'([Re(eig2[i]), Im(eig2[i])], color=blue, symbol=box), i = 1..m);
  Plt1 := {C} union {c} union {P};
  Plt2 := {C_star} union {d} union {Q};
  plots[display](eval(Plt1), eval(Plt2), scaling=constrained);
end proc:

GershgorinIntersect(A, HermitianTranspose(A)):
display(W(A,800), Gershgorin(A)):

Linköping University Electronic Press

Copyright The publishers will keep this document online on the Internet – or its possible replacement – from the date of publication barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

Copyright (translated from the Swedish "Upphovsrätt") This document is held available on the Internet – or its possible future replacement – from the date of publication, provided that no exceptional circumstances arise. Access to the document implies permission for anyone to read, download and print single copies for personal use, and to use it unchanged for non-commercial research and for teaching. Transfer of the copyright at a later date cannot revoke this permission. All other use of the document requires the permission of the copyright owner. To guarantee authenticity, security and accessibility, there are solutions of a technical and administrative nature. The author's moral rights include the right to be named as the author, to the extent required by good practice when the document is used as described above, as well as protection against the document being altered or presented in such a form or context as to be offensive to the author's literary or artistic reputation or distinctiveness. For additional information about Linköping University Electronic Press, see the publisher's home page http://www.ep.liu.se/.

© 2019, Erik Jonsson