
Lattice Studies of the $\eta'$ Meson

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the

Graduate School of The Ohio State University

By

Lakshmi Venkataraman, M.S.

* * * * *

The Ohio State University

1997

Dissertation Committee:

Dr. Gregory Kilcup, Adviser

Dr. Junko Shigemitsu

Dr. Richard Kass

Dr. Richard Furnstahl

Approved by

Co-Adviser

Co-Adviser

Department of Physics

UMI Number: 9813367


© Copyright by

Lakshmi Venkataraman

1997

ABSTRACT

This thesis presents some studies of the $\eta'$ meson on the lattice. The eta-prime meson is the flavor singlet cousin of the pseudoscalar octet, with a much heavier mass. In the continuum, a qualitative explanation exists to account for its heavy mass within the framework of quantum chromodynamics (QCD) in the limit of a large number of colors. However, a quantitative, model independent calculation is possible using lattice regularization of QCD, which demands a vast amount of supercomputer time. In this thesis, we present such a numerical calculation to obtain the mass of the $\eta'$ using the staggered formulation, both in quenched and in full QCD. The quenched calculation gives a value of 891(101) MeV while the dynamical calculation yields 780(187) MeV. Both numbers are consistent with the experimental value within the shown statistical errors. In a closely related study, we compute the lowest 32 eigenmodes of the staggered Dirac operator. The modes are then used to study the validity of the Atiyah-Singer index theorem on the lattice and to construct hadronic correlators. We find that the shifted zero modes of the lattice Dirac operator do allow some of the features of the index theorem to be preserved on the lattice. As a consequence, we show that the disconnected $\eta'$ correlator is almost completely determined by these topological zero modes, verifying the conventional notion that the $U(1)$ anomaly gives mass to the $\eta'$.

To my parents.

ACKNOWLEDGMENTS

First of all, I would like to thank Prof. Kilcup for his patient guidance of my work over the last few years. His constructive criticism helped an eager but inexperienced graduate student to learn about the methodology of research.

I would also like to thank my dissertation committee members Prof. Shigemitsu, Prof. Kass and Prof. Furnstahl for their helpful suggestions, which improved the presentation of this dissertation.

Finally, none of this would have been possible without the love, encouragement and moral support of my parents and my life partner.

VITA

April 19, 1967 ...... Born - Madras, India

1987 ...... B.Sc. Physics

1989 ...... M.Sc. Physics

1991 ...... M.Tech. Computer Science and Data Processing

1991-present ...... Graduate Teaching and Research Associate, The Ohio State University.

PUBLICATIONS

Research Publications

L. Venkataraman and G. Kilcup, “Applications of the Eigenmodes of the Dirac Operator,” Proceedings of Lattice '97, July 1997.

L. Venkataraman, G. Kilcup and J. Grandy, “The Staggered $\eta'$ with Smeared Operators,” Nucl. Phys. B Proc. Suppl. 53, 1997, pp. 259-261.

G. Kilcup, D. Pekurovsky and L. Venkataraman, “On the $N_f$ and $a$ dependence of $B_K$,” Nucl. Phys. B Proc. Suppl. 53, 1997, pp. 345-348.

G. Kilcup, J. Grandy and L. Venkataraman, “$\eta'$ and Cooling with Staggered Fermions,” Nucl. Phys. B Proc. Suppl. 53, 1996, pp. 358-361.

Instructional Publications

L. Venkataraman and W. D. Ploughe, “Magnetic Field of a Long Straight Wire,” AAPT Summer Meeting, University of Notre Dame, Notre Dame, IN, Aug. 1994.

L. Venkataraman, R. Padma and P. C. Deshmukh, “Classroom Demonstration of an Atomic Hartree-Fock Program,” IV Annual Convention of Indian Association of Physics Teachers, Raipur, India, Oct. 1989.

FIELDS OF STUDY

Major Field: Physics

Studies in Lattice QCD: Prof. G. Kilcup

TABLE OF CONTENTS

Abstract

Dedication

Acknowledgments

Vita

List of Tables

List of Figures

Chapters:

1. Introduction

1.1 Overview
1.2 Relation of Our Work to Previous Work
1.3 Contributions of This Thesis
1.4 Organization of This Thesis

2. Lattice Background

2.1 Overview
2.2 Staggered Fermions
2.3 Hadron Correlators
2.4 Sources of Errors

3. Numerical Techniques

3.1 Computational Requirements for Lattice QCD
3.2 Fermion Matrix Inversion
3.2.1 Description of the CG Algorithm
3.2.2 Implementation Issues
3.3 Eigenvalues and Eigenvectors of the Staggered $\slashed{D}$
3.3.1 Description of the Subspace Iteration Algorithm
3.3.2 Implementation of the Algorithm
3.4 The Lanczos Algorithm
3.4.1 Cullum and Willoughby's Lanczos Method
3.4.2 Numerical Implementation and Tests
3.5 Summary and Conclusions

4. $\eta'$ Meson with Staggered Fermions

4.1 Continuum $\eta'$ Lore
4.2 Lattice Calculation of $m_0$
4.3 Ratio from PQ$\chi$PT
4.4 Simulation Details
4.4.1 Ensemble
4.4.2 Propagators
4.4.3 Wuppertal Smearing
4.5 Results
4.6 Conclusions

5. Applications of the Eigenmodes of the Dirac Operator

5.1 Continuum Facts
5.2 Topological Charge On and Off the Lattice
5.2.1 Lattice Results
5.3 Hadron Correlators Revisited
5.3.1 Correlator and the Lowest Eigenmodes
5.3.2 The $\eta'$ Correlator and the Lowest Modes
5.4 Conclusions

Bibliography

LIST OF TABLES

Table

3.1 Number of Flops in the inversion of staggered $\slashed{D}$

3.2 Number of Flops in Subspace Iterations

3.3 Number of iterations required to converge for different masses and m

3.4 The ensemble used for subspace iterations

3.5 Number of good eigenvalues of $\slashed{D}^2$ found after m iterations using the Lanczos procedure

3.6 Maximum number of good eigenvalues obtained from CG

4.1 The Statistical Ensemble

4.2 Smeared valence operators

4.3 a. from the quenched ratio data

4.4 =3) in the chiral limit

LIST OF FIGURES

Figure

2.1 Lattice fermion propagator: either a gap (full curve) or a zero crossing (dotted curve)

3.1 Change in the residual with CG iterations

3.2 Relative error in the indicated eigenvalues and the Ritz function

4.1 Virtual quark loops in the $\eta'$ propagator

4.2 Contributions to $\eta'$ correlator

4.3 Ratio using local operators

4.4 Wavefunction from Wuppertal smearing

4.5 Correlators from smeared operators

4.6 Effective mass plot for mg with LL correlator

4.7 Effective mass plot with C5 and LLC 5 correlators

4.8 Dynamical ratio from local and smeared operators

4.9 Flavor dependence of $R(t)$

4.10 mg vs $m_{val}$ on $N_f = 0$ and $N_f = 2$ configurations

4.11 Chiral extrapolation of quenched m l

4.12 Chiral extrapolation of m^<^ from $N_f = 2$ configurations

4.13 One parameter fit to $R(t)$ with $m_{val} = 0.02$ and = 0.01

4.14 $Z'/Z$ vs. $m_{val}$

5.1 Above: $\langle\Gamma_5\rangle$ on the dynamical ensemble. Below: Fluctuations in $\langle\Gamma_5\rangle$

5.2 Above: $\langle\Gamma_5\rangle$ on the quenched ensemble. Below: Fluctuations in $\langle\Gamma_5\rangle$

5.3 Topological charge on a typical dynamical configuration

5.4 Quark propagators from 16 mode and zero seed on a typical dynamical config with $ma = 0.01$ from a $\delta$ function source

5.5 Pion propagator on dynamical ensemble at $ma = 0.01$

5.6 Comparison of CG convergence with 16 mode and zero mode seed at $ma = 0.01$. Above: number of modes vs number of CG iterations for various quark masses

5.7 Same as in Figure 5.7, but for a $U(1)$ noise source

5.8 Below: Comparison of the two loop amplitudes. Above: Ratio of the full two-loop amplitude to that calculated with modes

5.9 Pion-$\eta'$ splitting

5.10 Pion-$\eta'$ splitting on the quenched configs at $ma = 0.01$

CHAPTER 1

INTRODUCTION

1.1 Overview

The eta-prime meson is one of the more intriguing strongly interacting particles. Although it is closely related to the other pseudoscalar mesons, its mass turns out to be significantly larger than those of the pions and kaons. This discrepancy in mass is so well known that it even has a name: the “U(1) problem”. The modern theory of the strong interactions, quantum chromodynamics (QCD), is believed to account for this “problem”, and after pioneering work by 't Hooft, Witten, and Veneziano, there is even a qualitative understanding of the mechanism involved. Briefly, the story is as follows. At the classical level, the flavor non-singlet axial symmetries of the QCD Lagrangian are spontaneously broken by the vacuum, and the pions and the kaons have the right quantum numbers to be thought of as the corresponding pseudo-Goldstone bosons. The $\eta'$ meson appeared to be a natural candidate for the pseudo-Goldstone boson corresponding to the broken $U(1)$ axial symmetry. However, its mass is too heavy, and this became the crux of the $U(1)$ problem. With the advent of the quantum theory, it became clear that the divergence of the $U(1)$ axial current is non-zero and is associated with the topological charge density of the gauge field, or the anomaly as it came to be known. This, in turn, implied that $U(1)_A$ is not a true symmetry of QCD and therefore there is no $U(1)$ problem. The continuum models that explain the large $\eta'$ mass essentially blame the anomaly as the cause. In particular, Witten and Veneziano have derived an expression connecting the $\eta'$ mass with fluctuations in the topological charge of the gauge configurations. What has been lacking to this point is a quantitative calculation which can take into account the complexities of non-perturbative QCD.

This thesis fills this gap, bringing to bear the technology of lattice QCD on this problem. Invented by Ken Wilson in 1974, lattice QCD has evolved into a major subdiscipline of elementary particle physics. It provides a first-principles framework for attacking hard field theory problems, often at the price of a large amount of supercomputer time. The calculation of the eta-prime mass stretches the present technology to near its limit, but even so we have been able to say something useful. The major result is that the lattice QCD calculation of the eta-prime mass gives a value in agreement with the experimental number. Such an agreement is significant since it serves as a check of non-perturbative QCD. It also helps in establishing the lattice as a primary, model independent tool for tackling other non-perturbative problems for which experimental results are not available. In a closely related study, we are also able to verify that the conventional qualitative understanding of the eta-prime mass in QCD is correct, and that “zero-modes” of the Dirac operator, which are related to the topology of the gauge configurations, do play a large role.

1.2 Relation of Our Work to Previous Work

All previous calculations of the $\eta'$ mass on the lattice have used Wilson fermions. Kuramashi et al. [58] obtain a value of $m_{\eta'} = 751(39)$ MeV at an inverse lattice spacing of $a^{-1} = 1.45$ GeV on a $12^3 \times 20$ lattice in the quenched approximation. Physically, the quenched approximation means the exclusion of quark loops from all physical processes (the necessity for introducing this approximation is discussed in section 2.4). Masetti et al. [32] have attempted to study the dynamical flavor dependence of $m_{\eta'}$ by simulating with an unphysical negative number of flavors in the Wilson discretization scheme. The calculations so far have been inadequate in that they have been done in the quenched approximation or at best with unphysical bosons. Ours is the first simulation carried out in full QCD. In addition, we also use quenched configurations, which allows us to compare the simulation results from both. We also use staggered fermions, which have better chiral properties than Wilson fermions. The lattice size that we use is $16^3 \times 32$ at $a^{-1} = 2$ GeV. We also use better operators which suppress excited state contributions, so the extraction of $m_{\eta'}$ is more reliable. This aspect of our study of the $\eta'$ meson is detailed in sections 4.2, 4.3, 4.4 and 4.5.

The Atiyah-Singer index theorem, which connects the topological charge of gauge configurations with the zero modes of the Dirac operator, is a well known result in the continuum. Because lattice gauge fields cannot be divided into well defined topological sectors, it is not obvious that the index theorem should hold on the lattice. Smit and Vink [48] have shown that remnants of the index theorem do exist on the lattice, but they show this in $1 + 1$ dimensions with $U(1)$ gauge fields. There have been very few studies in which the lowest eigenvalues and eigenvectors have been calculated explicitly, because it is computationally expensive. As such, several previous studies that have made use of a few of the smallest eigenvalues of the Dirac operator have been in the finite temperature regime, where smaller lattice sizes are required. A number of groups have attempted to calculate the topological charge and the topological susceptibility by either cooling or using fermionic, geometric or field theoretic definitions of topological charge. However, each of these methods is plagued with different types of difficulties. A review of these methods and the results obtained so far can be found in [1]. Explicit calculation of the lowest eigenvalues and eigenvectors, although computationally very demanding, allows straightforward calculation of the topological charge without any of the problems suffered by the other methods. Once the low modes are available, they can be used to investigate a number of problems. This has been done for this thesis. Using the available low modes, we show that the remnants of the index theorem exist on the lattice and that the topological modes of the Dirac operator play a substantial role in determining the $\eta'$ mass. This is the first such study on $SU(3)$ gauge configurations (both quenched and dynamical) in 4 dimensions at zero temperature. The only previous work that has calculated the smallest eigenvalues and eigenvectors using the subspace method is that of Bunk [7]. This was an exploratory study on $SU(2)$ random gauge configurations of size $8^4$ for Wilson fermions. We use staggered fermions, large lattices (up to $16^3 \times 32$), two different types of error estimates, and explicit reorthogonalization, which have enabled a detailed study of the algorithm for its computational feasibility (detailed in section 3.3). A review of other available methods for obtaining eigenvalues and eigenvectors, and a discussion of their relevance to lattice QCD, are in section 3.3.

1.3 Contributions of This Thesis

In summary, the specific problems that were the focus of this thesis are:

• Extraction of the $\eta'$ mass on $SU(3)$ gauge configurations with (dynamical) and without (quenched) the presence of quark loops, using both local and smeared operators.

• Calculation of a few of the lowest eigenvalues and the corresponding eigenvectors of the Dirac operator.

• Study of the contribution to the topological charge from the lowest modes, so as to understand the extent of validity of the index theorem on the lattice.

• Explicit construction, using the lowest modes, of the $\eta'$ and pion correlators. This is to help understand the role of these modes in determining the low energy sector of the QCD spectrum.

Some of the specific contributions made by this research are:

• Derivation of theoretical expressions for the $\eta'$-octet splitting based on partially quenched chiral perturbation theory, which were then used to compare with the simulation results [54] (section 4.3).

• Demonstration of the use of smeared operators for reliable extraction of $m_{\eta'}$ [54] (sections 4.4 and 4.5).

• Study of the suitability of a particular variant of the subspace iterations for calculating a few of the lowest eigenvalues and eigenvectors of the staggered $\slashed{D}$ on $SU(3)$ gauge configurations [53] (section 3.3).

• Demonstration that some of the features of the Atiyah-Singer index theorem are preserved on the lattice [53] (section 5.2).

• Demonstration of the topological origin of the flavor singlet interaction of the $\eta'$ [53] (section 5.3).

1.4 Organization of This Thesis

The plan of this thesis is as follows. Chapter 2 provides a gentle introduction to lattice QCD without going into technical details. However, some details of the staggered formulation of lattice fermions are included, as it was used throughout this work. The chapter culminates with a discussion of the sources of errors of the lattice formulation.

Although chapters 4 and 5 form the core of the thesis in terms of the physics results, they would not have been possible without dealing with the computational issues involved in such large scale calculations. As such, chapter 3 begins with a discussion of the computational resources required for lattice QCD simulations. This is followed by a description of each of the algorithms implemented as part of this work: (i) the conjugate gradient method for inversion of a large sparse matrix, (ii) subspace iteration for determining the few lowest eigenvalues and eigenvectors of a large sparse matrix, and (iii) Cullum and Willoughby's variant of the Lanczos algorithm for determining the eigenvalue spectrum of a large sparse symmetric matrix. Issues pertaining to the implementation of these algorithms on the Cray T3D MPP, analysis of the operations count, and their performance are also discussed. All the programs were written in Fortran and, where required, the Cray T3D assembly language was used to enhance performance.

In chapter 4, we present our calculation of the eta-prime mass. The $\eta'$ meson and the associated lore constitute the introductory portions of this chapter. Witten and Veneziano's arguments that establish the connection between the instanton configurations and the $\eta'$ mass are presented. This is followed by a description of all the technical details involved in the extraction of the $\eta'$ mass. The chapter concludes with a discussion of the $\eta'$ mass results.

We conclude in chapter 5 with a study aimed at understanding the role of fermion zero modes for the $\eta'$ and other light hadrons. The low eigenmodes of the Dirac operator and their relation to the topology of the gauge configurations are well known in the continuum. To test the validity of these notions on the lattice, we compute a few of the lowest eigenvalues of $\slashed{D}$. Our test procedures and the results that we obtain form the core of the discussion in this chapter.

CHAPTER 2

LATTICE BACKGROUND

2.1 Overview

Lattice gauge theory offers a systematic nonperturbative method for studying the theory of strong interactions, viz. QCD. Apart from QCD, the topics of electroweak symmetry breaking, quantum gravity and finite temperature field theories are also being pursued using lattice regularization. In this scheme, space-time is discretized on a four dimensional hypercubic lattice. The quark field variables reside on the sites of the lattice while the gauge field variables live on the oriented links of the lattice. The gauge degrees of freedom are path ordered exponentials of the gauge field and are given by

$$U_\mu(x) = \mathcal{P}\exp\left(i g \int_x^{x+a\hat{\mu}} dx'_\mu\, A_\mu(x')\right) \tag{2.1}$$

This scheme preserves local gauge invariance and internal global symmetries. In the Euclidean path integral

$$Z = \int \mathcal{D}U\, \mathcal{D}\bar{\psi}\, \mathcal{D}\psi\; e^{-S_g - S_q} \tag{2.2}$$

the discretization of space-time results in the loss of continuous rotational invariance and imposes an ultraviolet cutoff, inversely proportional to the lattice spacing $a$. Here $S_g$ and $S_q$ represent the gauge and the quark action respectively, and the integration is defined over the gluon fields $U$ and the quark and antiquark fields $\psi$ and $\bar{\psi}$. The latter integral can be done explicitly, giving rise to the determinant term $\det(M[U]) = \det(\slashed{D} + m_q)$, where $\slashed{D}$ is the covariant Dirac operator discretized suitably (discussed in the next section) and $m_q$ is the quark mass. A finite box of size $L$ is necessary for numerical simulations, which also introduces an infrared cut-off. Physical quantities of interest, like hadronic masses and the QCD coupling constant $\alpha_s$, can be extracted from appropriate gauge invariant correlation functions. The problem is therefore to evaluate expressions like

$$\langle G \rangle = \frac{1}{Z} \int \mathcal{D}U\, \det(M[U])\, \exp(-S_g)\, G[U] \tag{2.3}$$

involving on the order of $L^4$ integrals over the link variables, for which Monte Carlo with importance sampling is used. Approximating the integral in eqn 2.3 by a sum over configurations allows the calculation of correlation functions as

$$\langle G \rangle \approx \frac{1}{N} \sum_{i=1}^{N} G[U_i] \tag{2.4}$$

where $N$ is the number of configurations in the ensemble. The form of the gauge and the quark action are discussed below.
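The ensemble average of eqn 2.4 can be illustrated with a minimal Python sketch. The function name `ensemble_average` and the naive error estimate are assumptions made for illustration only; in particular, the error formula assumes statistically independent configurations, which a real Monte Carlo ensemble with autocorrelations would not satisfy without binning or a jackknife.

```python
import math

def ensemble_average(measurements):
    """Return (mean, naive standard error) of G[U_i] over N configurations.

    Implements <G> ~ (1/N) sum_i G[U_i] (eqn 2.4). The error estimate
    assumes uncorrelated measurements (an illustrative simplification).
    """
    n = len(measurements)
    mean = sum(measurements) / n
    # Unbiased sample variance of the individual measurements.
    variance = sum((g - mean) ** 2 for g in measurements) / (n - 1)
    return mean, math.sqrt(variance / n)

# Hypothetical measurements of some observable on three configurations.
mean, err = ensemble_average([1.0, 2.0, 3.0])
```

The statistical error falls off as $1/\sqrt{N}$, which is why the size of the ensemble largely controls the precision of quantities extracted from correlation functions.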

In numerical simulations of lattice QCD, the link fields become $SU(3)$ matrices, while the simplest choice for the gauge action is the Wilson action [56, 36] $S_g$,

$$S_g[U] = \frac{\beta}{N_c} \sum_x \mathrm{Re\,Tr}\,[1 - P(x)] \tag{2.5}$$

where $\beta = 2N_c/g^2$, $g$ is the QCD coupling constant and $P(x)$ is the product of links around the smallest closed loop on a lattice, known as a plaquette. The continuum pure gauge term of the QCD action is recovered in the $a \to 0$ limit,

$$S_g = \frac{1}{4} \int d^4x\, F^a_{\mu\nu} F^{a\,\mu\nu} + O(a^2) \tag{2.6}$$

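The continuum free fermion action,

$$S_f = \int d^4x\, \bar{\psi}(x)\,(\slashed{\partial} + m)\,\psi(x) \tag{2.7}$$

when naively discretized takes the form

$$S_f = \sum_x a^4 \left\{ m\,\bar{\psi}_x \psi_x + \frac{1}{2a} \sum_\mu \left[ \bar{\psi}_x \gamma_\mu \psi_{x+\mu} - \bar{\psi}_{x+\mu} \gamma_\mu \psi_x \right] \right\} \tag{2.8}$$

Here $\psi_x$ denotes the naive lattice fermion field at the lattice site $x$, $m$ is the fermion mass and $\gamma_\mu$ is the Euclidean Dirac matrix, which is unitary, Hermitian and satisfies the following anticommutation rules:

$$\{\gamma_\mu, \gamma_\nu\} = 2\delta_{\mu\nu}$$

Eqn (2.8) gives rise to the following momentum space propagator

$$G(k) = \frac{1}{\frac{i}{a}\sum_\mu \gamma_\mu \sin(k_\mu a) + m} \tag{2.9}$$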

Since $\sin(k_\mu a)$ vanishes both at $k_\mu = 0$ and at $k_\mu = \pi/a$, there are $2^4$ poles in 4 dimensions, or in other words, there are sixteen degenerate modes which transform into one another under the transformation $k_\mu \to k_\mu + \pi/a$. This is known in the literature as the fermion doubling problem. There is no escape from this problem without sacrificing locality or chiral invariance (in the $m \to 0$ limit) of the fermion action. It is a consequence of an anti-hermitian discretization under periodic boundary conditions which maintains translational invariance [38]. In simple words, if there is a crossing at $k_\mu = 0$ (see fig 2.1), periodic boundary conditions dictate that there be one more zero at $k_\mu = 2\pi/a$. Between $0$ and $2\pi/a$, there has to be one more crossing or an infinite jump in the propagator. The latter possibility does not have a well defined continuum limit [26] and the former results in the naive propagator.

Figure 2.1: Lattice fermion propagator: either a gap (full curve) or a zero crossing (dotted curve).
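The counting of doublers can be checked with a short Python sketch. It relies on the fact that, for $m = 0$, the denominator of eqn 2.9 vanishes exactly where $\sum_\mu \sin^2(k_\mu a) = 0$. The function names and the numerical tolerance are illustrative assumptions, not part of the thesis code.

```python
import math
from itertools import product

def sin2_sum(k, a=1.0):
    """Sum over mu of sin^2(k_mu a); zero exactly at the massless poles."""
    return sum(math.sin(k_mu * a) ** 2 for k_mu in k)

def count_massless_poles(dim=4, a=1.0, tol=1e-12):
    """Count Brillouin-zone corners, k_mu in {0, pi/a}, where the
    massless naive propagator has a pole."""
    corners = product((0.0, math.pi / a), repeat=dim)
    return sum(1 for k in corners if sin2_sum(k, a) < tol)

n_poles = count_massless_poles()  # 16 in four dimensions
```

Every corner of the Brillouin zone contributes a pole, giving $2^d$ species in $d$ dimensions; a generic momentum away from the corners gives a non-zero value of `sin2_sum`.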

Another, deeper reason for the existence of 16 degenerate modes was unveiled by Karsten and Smit [26], who showed that the naive formulation of lattice fermions has exact chiral symmetry and is therefore anomaly free. By explicitly calculating the contribution of the 16 fermions to the anomaly graphs, they show that these extra modes come in pairs of opposite chiral charges and cancel the anomaly. The consequence of the doubling problem is that chiral gauge theories like the electroweak theory cannot be studied on the lattice. Several solutions have been suggested, the most popular of which go under the names of Wilson fermions and staggered fermions.

Wilson fermions remove all doublers but one by explicitly breaking chiral symmetry on the lattice, which can be regained only in the continuum limit. The reader can consult [36] for a more detailed discussion of Wilson fermions. This thesis deals entirely with staggered fermions, which are therefore described in more detail below.

2.2 Staggered Fermions

The staggered formulation mitigates the doubling problem by having a single component fermion field on each lattice site, thus reducing the number of fermion species from 16 to 4. The remaining 4 fermions are interpreted as new physical flavors in the continuum limit. The idea, first introduced by Susskind [49] in the Hamiltonian formulation of lattice gauge theory, was later extended to the Euclidean lattice by Sharatchandra et al. [45] and Kawamoto [27].

There are at least a couple of different approaches for obtaining the four staggered flavors and the single component fermion field at each lattice site. In the method of spin diagonalization [27, 29], a local change of fermionic variables is achieved by $\psi_x = A_x \chi_x$, $\bar{\psi}_x = \bar{\chi}_x A_x^\dagger$, such that $A_x^\dagger \gamma_\mu A_{x+\mu} = \eta_{x\mu}$ is diagonal in spin space. Here $A_x$ is a unitary matrix. Although many choices exist as solutions for $A_x$, the one mentioned below is popular for historical reasons.

$$A_x = \gamma_1^{x_1} \gamma_2^{x_2} \gamma_3^{x_3} \gamma_4^{x_4} \tag{2.10}$$

where $x_1, x_2, x_3, x_4$ are integer coordinates on the lattice in terms of the lattice spacing $a$. For the above choice of $A_x$,

$$\eta_{x\mu} = (-1)^{x_1 + x_2 + \cdots + x_{\mu-1}} \tag{2.11}$$

The different choices that exist for $A_x$ must satisfy the condition that the product of $\eta_{x\mu}$ around a plaquette must equal $-1$. Thus, in terms of the single component fields $\chi_x$, $\bar{\chi}_x$, the free staggered action is

$$S_0 = \sum_x a^4 \left\{ m\,(\bar{\chi}_x \chi_x) + \frac{1}{2a} \sum_\mu \eta_{x\mu} \left[ (\bar{\chi}_x \chi_{x+\mu}) - (\bar{\chi}_{x+\mu} \chi_x) \right] \right\}. \tag{2.12}$$

When the $\chi$ fields are coupled to a gauge field, the corresponding action is

$$S_q = \sum_x a^4 \left\{ m\,\bar{\chi}^i_x \chi^i_x + \frac{1}{2a} \sum_\mu \eta_{x\mu} \left[ \bar{\chi}^i_x\, U^{ij}_\mu(x)\, \chi^j_{x+\mu} - \bar{\chi}^i_{x+\mu}\, U^{\dagger\,ij}_\mu(x)\, \chi^j_x \right] \right\} \tag{2.13}$$

Here $i$ and $j$ are color indices.
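The plaquette condition on the staggered phases of eqn 2.11 can be verified directly. The following Python sketch is illustrative only: the helper names are assumptions, directions are indexed from 0 rather than 1, and no boundary conditions are applied since only the local sign structure is being checked.

```python
from itertools import product

def eta(x, mu):
    """Staggered phase eta_{x,mu} = (-1)^(x_1 + ... + x_{mu-1}).

    Directions are 0-based here, so the exponent sums x[0..mu-1].
    """
    return -1 if sum(x[:mu]) % 2 else 1

def shift(x, mu):
    """Return the site x + mu-hat (no periodic wrapping)."""
    y = list(x)
    y[mu] += 1
    return tuple(y)

def plaquette_phase(x, mu, nu):
    """Product of eta phases around the mu-nu plaquette based at x."""
    return (eta(x, mu) * eta(shift(x, mu), nu)
            * eta(shift(x, nu), mu) * eta(x, nu))

# Collect the phase product for every plaquette in one 2^4 hypercube.
phases = {plaquette_phase(x, mu, nu)
          for x in product((0, 1), repeat=4)
          for mu in range(4) for nu in range(mu + 1, 4)}
```

The set `phases` contains the single value $-1$: the phase in the higher direction flips sign when the site is shifted along the lower direction, while the phase in the lower direction is unaffected by the other shift.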

An alternate approach for the reduction of fermionic degrees of freedom was developed by Sharatchandra et al. [45] for the Euclidean lattice and by Chodos and Healy [9] for the Hamiltonian lattice. The basic idea of this method consists of determining the maximal subgroup that diagonalizes the set of $2^d$ discrete transformations¹ which are invariances of the naive action (eqn 2.8) in position space. More explicitly, the naive action $S$ is invariant under

$$\psi(x) \to f(x,T)\, T\, \psi(x)$$

$$\bar{\psi}(x) \to f(x,T)\, \bar{\psi}(x)\, T.$$

Here $f(x,T)$ is a phase factor that depends on the position space coordinate $x$ and the particular transformation matrix $T$, which is one of the $2^d = 16$, $d = 4$, $\gamma$ matrices. It follows from the properties of the Clifford algebra, which the $\gamma$ matrices satisfy, that the maximal subgroup that can be diagonalized simultaneously consists of $2^{d/2}$ elements. Then restricting the number of allowed solutions to the Dirac equation by the constraint

$$P\,\psi(x) = \psi(x) \tag{2.14}$$

amounts to having one complex degree of freedom $\chi(x)$ per site [9, 45]. Here $P$ is a projector that consists of a linear combination of elements in the maximal subgroup.

The discrete symmetries of this action (eqn 2.12) consist of charge conjugation, shifts, inversion and discrete rotations [28, 19]. The fact that in the kinetic energy term of the action the $\chi$'s at odd sites are connected only to the $\chi$'s sitting at even sites gives rise to the global symmetry group $U(1)_o \otimes U(1)_e$. This is broken by the mass term to the diagonal subgroup defined by $U_o = U_e$, which gives rise to fermion number conservation.

¹ These discrete transformations in position space correspond to the symmetry $k_\mu \to k_\mu + \pi$ of the naive propagator in momentum space.

Spinor Flavor Interpretation

The process of spin diagonalization, loosely speaking, scatters the fermion spinor and flavor components to the 16 corners of the $2^4$ hypercubes of the lattice. They can be gathered together and used to define a quark field at the center of each hypercube. Doing so enlarges the internal space of the Dirac spinor by endowing it with an additional flavor index, with the consequence that spin and flavor symmetries are entangled with space-time transformations on the lattice. The Dirac field so associated with each hypercube of the lattice has the property that the propagator has only one pole in momentum space, and the action (eqn 2.12), rewritten in terms of these quark fields, goes over in the $a \to 0$ limit to the continuum action for free Dirac fermions with $SU(4) \otimes SU(4)$ symmetry.

The entire lattice is divided into blocks of hypercubes. If $\eta$ is the hypercube corner in the hypercube $y$, then

$$x_\mu = 2y_\mu + \eta_\mu, \qquad \eta_\mu = 0, 1$$

Defining the quark fields $q_y$ and $\bar{q}_y$ by the transformations

$$q_y^{\alpha a} = \frac{1}{8} \sum_\eta \Gamma_\eta^{\alpha a}\, \chi_{2y+\eta}, \qquad \bar{q}_y^{a\alpha} = \frac{1}{8} \sum_\eta \bar{\chi}_{2y+\eta}\, \Gamma_\eta^{*\,\alpha a} \tag{2.15}$$

the lattice action in terms of the new fields is [36, 29]

$$S_q = 16 a^4 \sum_y \bar{q}_y \left\{ m\,(1 \otimes 1) + \sum_\mu \left[ (\gamma_\mu \otimes 1)\, \Delta_\mu + a\,(\gamma_5 \otimes \xi_\mu \xi_5)\, \delta_\mu \right] \right\} q_y \tag{2.16}$$

Here $\alpha$ and $a$ are the Dirac spin and flavor indices respectively, and the matrices $\Gamma_\eta$ are defined by

$$\Gamma_\eta = \gamma_1^{\eta_1} \gamma_2^{\eta_2} \gamma_3^{\eta_3} \gamma_4^{\eta_4} \tag{2.17}$$

The matrices satisfy

$$\frac{1}{4} \mathrm{Tr}\,(\Gamma_\eta^\dagger \Gamma_{\eta'}) = \delta_{\eta\eta'} \tag{2.18}$$

$$\frac{1}{4} \sum_\eta \Gamma_\eta^{*\,\alpha a}\, \Gamma_\eta^{\beta b} = \delta_{\alpha\beta}\, \delta_{ab} \tag{2.19}$$

The symbols $\Delta_\mu$ and $\delta_\mu$ stand for the first and second lattice derivatives respectively on the blocked lattice with spacing $2a$:

$$\Delta_\mu q_y = \frac{1}{4a} \left[ q_{y+\mu} - q_{y-\mu} \right] \tag{2.20}$$

$$\delta_\mu q_y = \frac{1}{4a^2} \left[ q_{y+\mu} - 2 q_y + q_{y-\mu} \right] \tag{2.21}$$

The matrices $\xi_\mu$ are the usual Dirac $\gamma$ matrices but act in the flavor space. The key observation to be gleaned from eqn 2.16 is that the second part of the kinetic energy term is a lattice artifact which is $O(a)$ compared to the first term. At finite lattice spacing, the $U(4) \otimes U(4)$ chiral symmetry of the first term is broken down to $U(1)_o \otimes U(1)_e$. The axial generator $A = \gamma_5 \otimes \xi_5$ of the $U(4) \otimes U(4)$ symmetry is traceless in flavor space, and hence the remnant $U(1) \otimes U(1)$ symmetry belongs to the flavor nonsinglet sector. If it is spontaneously broken, the associated Goldstone boson is like the pion.

Since the Dirac spin and flavor components reside at the corners of the hypercubic blocks of the lattice, the physical momentum components lie in the closed interval $[-\frac{\pi}{2a}, \frac{\pi}{2a}]$, corresponding to a lattice spacing of $2a$, with the consequence that the propagator has a pole only near $k_\mu = 0$. The interested reader can consult [36] for an explicit derivation of the momentum space propagator.
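The trace orthogonality of the sixteen matrices $\Gamma_\eta$ (eqn 2.18) can be checked numerically. The Python sketch below is only an illustration under stated assumptions: it uses one particular Hermitian (chiral) representation of the Euclidean $\gamma$ matrices, which is a choice made here and not necessarily the one used in the thesis, and all helper names are hypothetical.

```python
from itertools import product

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def dagger(A):
    """Hermitian conjugate of a 4x4 matrix."""
    return [[A[j][i].conjugate() for j in range(4)] for i in range(4)]

def trace(A):
    return sum(A[i][i] for i in range(4))

IDENTITY = [[1 + 0j if i == j else 0j for j in range(4)] for i in range(4)]

# Pauli matrices, used to build an assumed chiral representation.
SIGMA = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],
    [[0j, -1j], [1j, 0j]],
    [[1 + 0j, 0j], [0j, -1 + 0j]],
]

def spatial_gamma(sigma):
    """gamma_k = [[0, -i sigma_k], [i sigma_k, 0]] (Hermitian)."""
    g = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = -1j * sigma[i][j]
            g[i + 2][j] = 1j * sigma[i][j]
    return g

GAMMA = [spatial_gamma(s) for s in SIGMA]
# gamma_4 = [[0, 1], [1, 0]] in 2x2 block form.
GAMMA.append([[0j, 0j, 1 + 0j, 0j], [0j, 0j, 0j, 1 + 0j],
              [1 + 0j, 0j, 0j, 0j], [0j, 1 + 0j, 0j, 0j]])

def big_gamma(eta):
    """Gamma_eta = gamma_1^eta_1 gamma_2^eta_2 gamma_3^eta_3 gamma_4^eta_4."""
    out = IDENTITY
    for mu in range(4):
        if eta[mu]:
            out = matmul(out, GAMMA[mu])
    return out

def normalized_trace(eta1, eta2):
    """(1/4) Tr(Gamma_eta1^dagger Gamma_eta2), cf. eqn 2.18."""
    return trace(matmul(dagger(big_gamma(eta1)), big_gamma(eta2))) / 4

CORNERS = list(product((0, 1), repeat=4))
```

Evaluating `normalized_trace` over all pairs of hypercube corners gives 1 when the two corners coincide and 0 otherwise, which is the statement that the sixteen $\Gamma_\eta$ form an orthogonal basis for $4 \times 4$ matrices. The result does not depend on the representation chosen, since it follows from the anticommutation rules alone.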

In numerical simulations there are both advantages and disadvantages associated with using staggered fermions. The chief advantage of staggered fermions over Wilson fermions stems from the presence of the remnant continuous chiral symmetry, which has rendered possible the derivation of certain Ward identities [28]. These are useful for obtaining the continuum chiral behaviour of weak matrix elements calculated with the lattice operators in pion (corresponding to the Goldstone boson) states. The spin flavor entanglement makes weak coupling perturbation theory calculations with staggered fermions complicated, although they have been done for certain lattice operators used in weak matrix element computations [40]. Although one continuous symmetry is retained, the explicit breaking of the continuum $SU(4)$ flavor symmetry is unpleasant. It is not very well understood whether this flavor symmetry is restored in the continuum limit. Sharatchandra [45] has shown that the correct $U(1)$ anomaly is recovered in the continuum limit. Another advantage is that the staggered action is already $O(a^2)$ improved, so only staggered operators need to be improved up to $O(a^2)$; such calculations have been done in [30, 40].

2.3 Hadron Correlators

The very nature of its formulation suggests that it is possible to determine the lowest hadron masses from the numerical simulations of lattice QCD. Most of the hadron masses are very well determined by experiments and hence comparison to experiment helps in a number of ways. First, it is an important check of non-perturbative QCD. Secondly, it helps develop and improve numerical techniques; extracting the lightest hadron masses from the lattice can help in pinning down errors due to finite lattice spacing, finite volume and any other approximation or technique that we might use to make the numerical simulations tractable. The hope is then that lattice calculations for determining weak matrix elements or thermodynamical properties of QCD which are not known from experiment will be trustworthy. A recent activity in the field is the attempt to extract the light quark masses from the masses of the low lying hadrons [33].

Hadronic masses are extracted from two point correlation functions of the form

$$C(x) = \langle \Omega | O(x)\, O^\dagger(0) | \Omega \rangle \tag{2.22}$$

where $O(x)$ and $O(y)$ are operators that have the quantum numbers of the desired hadron. Inserting a complete set of intermediate states and making use of translational invariance,

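$$C(\vec{p}, t) = \sum_{\vec{x}} e^{-i\vec{p}\cdot\vec{x}} \sum_n \langle \Omega | e^{Ht - i\hat{P}\cdot\vec{x}}\, O(0)\, e^{-Ht + i\hat{P}\cdot\vec{x}} | n \rangle \langle n | O^\dagger(0) | \Omega \rangle. \tag{2.23}$$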

Summing over all the spatial sites x projects out the zero momentum state and one obtains

$$C(t) = \sum_n C_n e^{-m_n t}, \qquad C_n = |\langle \Omega | O | n \rangle|^2. \tag{2.24}$$

The exponential fall-off in the above sum ensures that only the ground state dominates at large Euclidean times ($t \to \infty$), thus allowing the mass extraction of the desired hadron. A typical interpolating field that can create a meson is of the form $\bar\psi(x)\Gamma\psi(x)$, which when used in the correlator eqn 2.22 gives rise to

$$C(t) = \sum_{\vec{x}} \langle \Omega | \bar\psi^a_\alpha(\vec{x},t)\, \Gamma_{\alpha\beta}\, \psi^a_\beta(\vec{x},t)\; \bar\psi^b_\gamma(0,0)\, \Gamma_{\gamma\delta}\, \psi^b_\delta(0,0) | \Omega \rangle. \tag{2.25}$$

Here the Roman indices stand for color while the Greek indices stand for spin, and $\Gamma$ is a generic gamma matrix; for example, it is $\gamma_5$ if a pseudoscalar meson is desired.

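Wick contraction of creation and annihilation operators leads to quark propagators

$$\langle \Omega | \psi^a_\alpha(\vec{x},t)\, \bar\psi^b_\beta(0,0) | \Omega \rangle = G^{ab}_{\alpha\beta}(\vec{x},t;0,0). \tag{2.26}$$

Therefore, in terms of the propagators, $C(t)$ takes the form

$$C(t) = \sum_{\vec{x}} G^{ab}_{\beta\gamma}(\vec{x},t;0,0)\, \Gamma_{\gamma\delta}\, G^{ba}_{\delta\alpha}(0,0;\vec{x},t)\, \Gamma_{\alpha\beta}. \tag{2.27}$$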

It might seem that two quark propagators need to be evaluated to calculate a meson propagator, but there is a trick that avoids this extra cost by enabling the antiquark propagator to be calculated from the quark propagator:

$$G(x, y) = \gamma_5\, G^\dagger(y, x)\, \gamma_5. \tag{2.28}$$

While the above formalism stands as it is for Wilson fermions, for the staggered fermions the construction of the appropriate operators is a little more involved. In the continuum limit, the $SU(4)$ flavor symmetry of the staggered fermion action is recovered, which will give rise to a 15-plet of pions and a singlet. This continuum symmetry is broken by the lattice into a number of smaller representations which have been worked out by the authors of references [28, 19, 18]. The lattice states are labeled by the quantum numbers $\Gamma_S \otimes \Gamma_F$, where $\Gamma_S$ denotes the spin of the quark bilinear that creates the state while $\Gamma_F$ selects the appropriate flavor. The matrices $\Gamma_S$ and $\Gamma_F$ are defined in the same manner as the matrix $\Gamma_\eta$ in section 2.2. The pseudo Goldstone boson (PGB) pion, for example, is created by the spin flavor combination $\Gamma_5 \otimes \Gamma_5$.

The 15-plet of pseudoscalar mesons breaks into seven lattice representations in which the flavor matrix $\Gamma_F$ can be any of $\gamma_i$, $\gamma_4$, $\gamma_i\gamma_4$, $\gamma_5$, $\gamma_i\gamma_5$, $\gamma_4\gamma_5$ and $\gamma_i\gamma_4\gamma_5$.

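The operator that creates the meson state is constructed out of the quark fields that live at the centers of hypercubes in the blocked lattice:

$$O_{SF} = \bar{q}_y\, (\Gamma_S \otimes \Gamma_F)\, q_y. \tag{2.29}$$

In terms of the single component quark fields, $O_{SF}$ takes the form below:

$$O_{SF} = \sum_{\eta,\eta'} \bar\chi_y(\eta)\, \mathrm{tr}\!\left(\Gamma_\eta^\dagger\, \Gamma_S\, \Gamma_{\eta'}\, \Gamma_F^\dagger\right) \chi_y(\eta'). \tag{2.30}$$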

The fact that a nonzero value of the trace occurs only if the condition $\eta + S + F + \eta' = 0 \bmod 2$ is satisfied implies that $S = F + \Delta$, where $\Delta$ is the hypercubic distance between the quark and the antiquark fields. Thus the staggered meson operators can be both local and non local. A nonzero value for the trace can be either $+1$ or $-1$, and in general the operator $O_{SF}$ can couple to states with both parities.

Hence, for staggered fermions C{t) takes the form

C{t) = (2.31)

In a $2^4$ hypercube, non local operators can also be classified according to the number of links in the path connecting the quark and the antiquark, which is 1, 2, 3 or 4. In numerical simulations, to maintain the gauge invariance of the meson correlation functions, either gauge fixed configurations must be used or links connecting the quark and the antiquark fields within the hypercube must be put in explicitly. The work reported in this thesis used the latter method.

In summary, the following general ingredients go into the lattice calculation of the light hadron spectrum:

• Several $SU(3)$ gauge configurations distributed with probability density $\exp(-S_g)$ are generated using Monte Carlo methods, either in the quenched approximation (about which more will be said in the next section) or in the partially quenched approximation.

• Quark propagators are calculated against a fixed background gauge field using a matrix inversion technique like the conjugate gradient. The staggered analog of eqn 2.28 is the following:

(2.32)

• Quark propagators are then combined in a gauge invariant way for the desired hadron channel to obtain the correlation function $C(t)$.

• The data so obtained for $C(t)$ is fit to the form* of eqn 2.31 (for staggered fermions), from which the mass parameter is extracted in lattice units.

• The next task is the extraction of the lattice spacing $a$ and the quark mass $m_q$. A method that accomplishes this consists of calculating hadron mass ratios for different values of $m_q$ until the ratio agrees with experiment. Keeping one of the masses (say the $\rho$ mass) fixed to the experimental number fixes the overall scale from which all the other physically interesting quantities are obtained. Thus only mass ratios are predicted on the lattice.
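The mass extraction step in the list above can be illustrated with a small sketch. It assumes the plain multi-exponential form of eqn 2.24 (not the staggered form of eqn 2.31) and uses invented masses and amplitudes purely as synthetic data; the names `masses`, `amps` and `m_eff` are hypothetical and not taken from the thesis code. The "effective mass" $\ln[C(t)/C(t+1)]$ is one common way to see the ground state take over at large $t$.

```python
import numpy as np

# Synthetic two-state correlator C(t) = sum_n C_n exp(-m_n t), as in eqn 2.24.
# The masses and amplitudes are invented for illustration (lattice units).
t = np.arange(0, 32)
masses = np.array([0.5, 1.2])   # hypothetical ground and excited state masses
amps = np.array([1.0, 0.8])     # hypothetical amplitudes C_n
C = (amps[:, None] * np.exp(-masses[:, None] * t)).sum(axis=0)

# Effective mass: approaches the ground state mass once excited states die out.
m_eff = np.log(C[:-1] / C[1:])

print("m_eff at small t:", m_eff[0])
print("m_eff at large t:", m_eff[-1])
```

At small $t$ the excited state contaminates the estimate, while at large $t$ the effective mass settles on the lowest mass; in practice a fit over a window of $t$ values would be used instead of a single point.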

2.4 Sources of Errors

Lattice QCD is not a model but a first principles approach to understanding non-perturbative aspects of QCD. Hence if the sources of systematic errors and statistical errors are under control in actual numerical simulations, a lattice QCD result is a prediction for QCD. Some of the sources of error that are encountered in actual practice are listed below.

Quenched Approximation: Integrating over the fermionic variables in the quark action (eqn 2.7) gives rise to the fermionic determinant (section 2.1), which makes the effective action non-local. At present, simulations are done including the effects of sea quarks, but such algorithms are complicated and computationally very demanding. Generating gauge configurations with the determinant set equal to one is called the quenched approximation or the “valence” approximation [55], which amounts to neglecting the effects of sea quarks or, more precisely, assuming that sea quark effects can be absorbed in a renormalized charge. Computationally, for the current values of $a$ and the quark mass, the time required to generate a new configuration decreases by a factor of 100-1000, and hence many lattice QCD results have been obtained in this approximation. Although techniques have been developed to understand the effects of the quenched approximation [4, 44], it is not very well known how much quenched QCD and QCD differ. According to the analysis of the hadron spectrum using Wilson fermions, a recent paper [6] quotes that this difference could be as much as 10-15%.

*Depending on the boundary conditions, boundary terms get tagged onto eqn 2.31 and eqn 2.24.

Statistical Errors: Physically interesting quantities on the lattice are calculated from an ensemble average as defined in eqn 2.4. If the number of configurations $N$ is large, then the statistical errors decrease as $1/\sqrt{N}$. With the advent of increasing computing power, these errors are currently at the 5% level and the prospects of reducing them to the 1-2% level in the future are quite bright. Correlation among different configurations in an ensemble and correlations among different quantities calculated within the ensemble are sources of error for which many error analysis techniques are available.
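As a hedged illustration of the $1/\sqrt{N}$ behaviour, the sketch below compares the naive standard error of a mean with a single-elimination jackknife estimate. The jackknife is only one example of the error analysis techniques alluded to above; the text does not say which one was used. The data are synthetic and assumed uncorrelated, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
# Synthetic stand-in for one observable measured on N uncorrelated configurations.
samples = rng.normal(loc=1.0, scale=0.3, size=400)
N = samples.size

# Naive standard error of the mean: sigma / sqrt(N).
naive_error = samples.std(ddof=1) / np.sqrt(N)

# Single-elimination jackknife: recompute the mean with each sample left out.
loo_means = (samples.sum() - samples) / (N - 1)
jackknife_error = np.sqrt((N - 1) / N * np.sum((loo_means - loo_means.mean()) ** 2))

print(naive_error, jackknife_error)
```

For a simple mean the two estimates coincide; the jackknife becomes useful for nonlinear quantities such as fitted masses, and correlated configurations would require binning first.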

Finite Lattice Spacing: Results of numerical simulations performed at finite volume and at finite lattice spacing must be extrapolated to the zero lattice spacing and infinite volume limit. Although leading finite $a$ errors can be extrapolated away with some confidence in the mass spectrum calculations, the $a$ dependence of other quantities like coupling constants and decay amplitudes is more complicated. Recently there has been much activity related to improving lattice actions, which refers to adding higher order operators to reduce finite lattice spacing errors and also helps in understanding cut-off effects.

Finite Volume: Numerical calculations have to be done keeping in mind the finite memory available in a computer, thus introducing finite volume errors. Generally periodic boundary conditions are used for simulating hadrons, which gives rise to infinitely many images of the hadron at intervals of $L$. The interaction with the images shifts the mass of the hadron; however, this interaction falls off exponentially [31] and hence if $L$ is large enough, finite volume errors are expected to become small. The explicit calculation by Lüscher [31] indicates that for the case of the pion, the error is supposed to vanish as $\exp(-m_\pi L)$. Currently lattices as large as $32^3 \times 64$ at the lattice spacing of 0.1 fm are possible, for which finite volume errors can be safely neglected.

Extrapolation to the Chiral Limit: Typical simulations are performed with quark masses of 50 MeV or higher, as the current algorithms for sparse matrix inversion become computationally more demanding as the quark mass is lowered towards its physical value. Using chiral perturbation theory as a guide, quantities calculated on the lattice are extrapolated to the massless quark limit.

This chapter has attempted to explain some key physical ideas from lattice gauge theory so as to fill in the background necessary for understanding the material covered in the subsequent chapters. For excellent pedagogical articles on the formulation and foundation of lattice gauge theory, the reader is referred to [56, 11, 36]. For a review of the current status of lattice QCD and the results of hadron spectroscopy, the reader can consult [21].

CHAPTER 3

NUMERICAL TECHNIQUES

The previous chapter laid out the physical ideas underlying lattice QCD, which need to be verified by numerical simulations. This chapter begins with a description of the computer resources needed for carrying out such simulations, followed by a description of the algorithms that were implemented for the projects carried out in this thesis. The implementation of these algorithms amounts to several thousand lines of Fortran code running on the Cray T3D and serves as the “experimental apparatus” from which interesting physics, such as that described in the next two chapters, is extracted. The conjugate gradient algorithm for inverting the Dirac operator is described in section 3.2. The method of subspace iterations for obtaining the lowest eigenvalues and eigenvectors of a large sparse matrix is described in section 3.3, and Cullum and Willoughby’s variant of the Lanczos algorithm for obtaining only the eigenvalues of a large sparse matrix is discussed in section 3.4.

3.1 Computational Requirements for Lattice QCD

An obvious factor determining the computational requirements would be the size of the lattice, which in turn is determined by the physics problem that we wish to solve on the lattice. Typically, a hadron like the proton has a Compton wavelength of 0.2 fm and a charge radius of nearly a fermi. Clearly, the proton must fit inside the box, but to have the finite volume errors under control, the lattice must be somewhat larger than 1 fm. For the pion, whose charge radius is slightly smaller than that of the proton, Lüscher's calculation [31] of the pion scattering length indicates that the errors due to reflection at the walls under periodic boundary conditions become small if the spatial volume of the lattice is 3 fm$^3$ or greater. Calculations of the hadron spectrum [21] by various groups using either Wilson or staggered fermions in the quenched approximation indicate that a box size of 2 fm or greater is needed for the pion, while for the proton the lattice length $L$ must be greater than 2.5 fm.

The desired lattice spacing for the desired volume mentioned above determines the number of grid points; together both decide the magnitude of the computational task in terms of the memory and the CPU requirement. Experience with the standard action has indicated that a lattice spacing of 0.05-0.1 fm or lower is necessary in order to have errors due to finite lattice spacing under control. Thus, a box of dimensions $32^4$ or greater for the lattice spacing 0.1 fm and $64^4$ or greater for the lattice spacing 0.05 fm is required. Currently, the few groups involved in studying the effects of systematic errors on the QCD spectrum have at their disposal computer resources for using lattices as large as $32^3 \times 64$, but lattices as large as $64^4$ and $128^4$ belong to the size of things to come in the future. In summary, lattices of sizes ranging from $20^4$ to $128^4$ would be desirable for QCD calculations.

Memory Requirements: In a typical lattice QCD calculation, storage of the gauge configuration and the quark fields at each site would be required. Gauge configurations are $3 \times 3$ complex $SU(3)$ matrices, which implies 18 floating point numbers per site per link, yielding a total memory requirement of $72V \times 8$ bytes, where $V$ is the space-time volume of the lattice (each floating point number occupying 8 bytes). For Wilson fermions, the storage requirement per site would be four spins times three colors of complex numbers, while staggered fermions take up a quarter of the space required for the Wilson fermion. Typically, an algorithm like the conjugate gradient method used to obtain the staggered quark propagator requires 5 copies of quark fields to be stored per site. Thus, a total of 0.9 gigabytes of storage for the $32^4$ lattice to 230 gigabytes for the $128^4$ lattice might be required.
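The memory estimate can be checked with a few lines of arithmetic. This is a rough sketch under the stated assumptions only (4 links per site, 18 reals per link, 5 staggered quark vectors of 3 complex colors, 8 bytes per real); the function name `lattice_memory_bytes` is hypothetical.

```python
def lattice_memory_bytes(L, bytes_per_float=8, quark_copies=5):
    """Rough storage estimate for an L^4 lattice with staggered fermions."""
    volume = L ** 4
    gauge_floats = 4 * 18                 # 4 links per site, 18 reals per SU(3) matrix
    quark_floats = quark_copies * 3 * 2   # 3 colors, each a complex number
    return volume * (gauge_floats + quark_floats) * bytes_per_float

mem_32 = lattice_memory_bytes(32)
mem_128 = lattice_memory_bytes(128)
print(f"32^4 : {mem_32 / 1e9:.2f} GB")
print(f"128^4: {mem_128 / 1e9:.0f} GB")
```

With these assumptions the $32^4$ lattice needs a little under 0.9 GB, and the $128^4$ lattice needs 256 times as much, the same order as the figures quoted above.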

All the algorithms described in this chapter and the subsequent applications described in the following chapters were implemented on the Cray T3D, which has a theoretical peak speed of 150 Mflops and 8 MW (64 megabytes) of memory per processor.

Processor Requirements: An estimate of the number of floating point operations (flops) present in a typical calculation would help determine the CPU speed requirement. Taking the example of calculating the quark propagator, which is an important step in the determination of the properties of hadrons in lattice QCD, one can get an idea of the number of flops. An iterative method like the conjugate gradient method is used to invert the staggered $\slashed{D}$. An essential part of the procedure is the matrix vector product $\slashed{D}\chi$; although it occurs only twice in the iterative procedure, it dominates the number of flops per iteration. It amounts to gathering at a given site the product of the three vectors at each of the eight nearest neighbor sites with the $SU(3)$ link matrices connecting them to the central site. This requires 1752 operations* per site per iteration. Typically, at a value of $m_q a = 0.01$ on a $16^3 \times 32$ lattice at $\beta = 5.7$, the number of conjugate gradient iterations required to obtain the inverse is about 900, yielding an estimate of $10^{11}$ operations for a single propagator calculation on a single gauge configuration.

*The breakup will be explained in a later section.
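The order of magnitude quoted above follows directly from the three numbers in the paragraph; a short check, with hypothetical variable names:

```python
# Inputs taken from the text: 16^3 x 32 lattice, 1752 flops per site per
# iteration, and roughly 900 conjugate gradient iterations.
volume = 16**3 * 32
flops_per_site_per_iteration = 1752
cg_iterations = 900

total_flops = volume * flops_per_site_per_iteration * cg_iterations
print(f"{total_flops:.2e} floating point operations")
```

The product comes out at about $2 \times 10^{11}$ operations for one propagator on one configuration.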

The other important task in a lattice QCD calculation is the generation of gauge

configurations. As detailed in ref. [34], the number of flops required per site for pro­

ducing 100 quenched configurations is O(IO^) while that for dynamical, it is 0 (10^^)

The numbers given above for the size of the lattice, memory and the number of flops might appear daunting, but the good news is that lattice QCD problems are intrinsically parallel. The entire lattice can be divided into several contiguous chunks of hypercubic blocks, distributed onto different processors, each with its own memory. Each processor, or node as it is popularly referred to in the parallel computing jargon, will then be responsible for doing the floating point operations for the sites in its possession. In the currently available commercial MPP (massively parallel processing) machines like the Cray T3D, each processor is capable of attaining a peak speed of 150 Mflops with 64 megabytes of memory per node. Other dedicated machines have been built exclusively for doing QCD calculations; for example, the ACPMAPS at Fermilab, GF-11 at IBM, APE in Italy, and QCDPAX at the University of Tsukuba in Japan.

3.2 Fermion Matrix Inversion

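An efficient matrix inversion algorithm is necessary as a dominant fraction of the computing time is spent in obtaining the solution of the discretized Dirac equation

$$(\slashed{D} + m_q)\, G = \omega, \tag{3.1}$$

where, for staggered fermions, $\slashed{D}$ takes the form

$$\slashed{D}\chi(x) = \frac{1}{2} \sum_\mu \eta_\mu(x) \left\{ U_\mu(x)\, \chi(x + \mu) - U_\mu^\dagger(x - \mu)\, \chi(x - \mu) \right\}. \tag{3.2}$$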

Here, $\omega$ is an arbitrary source vector and $G$ is the quark Green’s function. The staggered operator is anti-Hermitian, so its eigenvalues are pure imaginary, come in complex conjugate pairs and lie on a vertical line segment in the complex plane.

$\slashed{D}$ is a sparse matrix and is generally very large for the typical lattice size used in QCD calculations. For example, on a $16^3 \times 32$ lattice, its order is 393216. In general, this matrix is of size $NV \times NV$, where $V$ is the space-time volume of the lattice and $N$ is the number of colors. Direct methods like Gaussian elimination or Cholesky decomposition are impractical as they involve factorization of the matrix and the requirement that the full factorization matrix be stored.

Since factorization can increase the number of nonzero elements in the original sparse matrix, the amount of storage will exceed the minimum needed. Among the iterative methods, Jacobi and Gauss-Seidel are easy to use and parallelize, but they have a slow convergence rate which depends on the spectral radius of the matrix. Thus one is led to the conjugate gradient method, which is very effective and efficient when the matrix is symmetric positive definite. Variations of the method like the generalized minimal residual (GMRES), biconjugate gradient (BiCG), quasi minimal residual (QMR) etc. exist for non Hermitian systems and have been tested for obtaining the Wilson quark propagator. However, conjugate gradient is the preferred method for obtaining the inverse of the staggered fermion matrix since it can be made positive definite and Hermitian by considering the corresponding equivalent form

$$(-\slashed{D}^2 + m_q^2)\, G = (-\slashed{D} + m_q)\, \omega. \tag{3.3}$$

This allows conjugate gradient to be used in its direct form for inverting, and in addition, the even odd symmetry of the squared Dirac operator can be exploited to solve for $G$ at only half the number of sites:

$$(-\slashed{D}^2 + m_q^2)\, G_e = (-\slashed{D}\omega_o) + m_q \omega_e, \tag{3.4}$$

$$(-\slashed{D}^2 + m_q^2)\, G_o = (-\slashed{D}\omega_e) + m_q \omega_o. \tag{3.5}$$

$G_e$ and $G_o$ are the Green’s functions on the even and odd sites respectively, while $\omega_e$ and $\omega_o$ represent the source vector on the even and odd sites. A lattice site is even or odd depending on whether its coordinate sum $x_1 + x_2 + x_3 + x_4$ is even or odd. The other advantage of this basis is clearly seen: the fermion vectors need to be stored at only half the number of sites, while the knowledge of the propagator on (say) the even sites only can be used to reconstruct the propagator at the odd sites from

$$\slashed{D} G_o + m_q G_e = \omega_e, \tag{3.6}$$

$$\slashed{D} G_e + m_q G_o = \omega_o, \tag{3.7}$$

which barely increases the computational overhead over solving the full equation (3.1).
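The even odd trick can be checked numerically on a toy problem. The sketch below is not a lattice Dirac operator: it assumes only that the matrix is anti-Hermitian and connects even sites to odd sites, modelled by a random block `B`. All names are hypothetical. It solves the squared system on the even sites, reconstructs the odd sites from the second of the two first-order equations, and compares with a direct solve of $(\slashed{D} + m_q) G = \omega$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_half = 4      # number of even (= odd) sites in the toy problem
m_q = 0.1       # toy quark mass

# Toy anti-Hermitian matrix coupling even sites only to odd sites.
B = rng.normal(size=(n_half, n_half)) + 1j * rng.normal(size=(n_half, n_half))
Z = np.zeros((n_half, n_half))
D = np.block([[Z, B], [-B.conj().T, Z]])

omega = rng.normal(size=2 * n_half) + 1j * rng.normal(size=2 * n_half)
omega_e, omega_o = omega[:n_half], omega[n_half:]

# Reference: direct solve of the full system (D + m) G = omega.
G_full = np.linalg.solve(D + m_q * np.eye(2 * n_half), omega)

# Even sites: (-D^2 + m^2) G_e = -(D omega)_e + m omega_e.
A_ee = (-D @ D)[:n_half, :n_half] + m_q**2 * np.eye(n_half)
G_e = np.linalg.solve(A_ee, -(B @ omega_o) + m_q * omega_e)

# Odd sites from (D G)_o + m G_o = omega_o, where (D G)_o = -B^dagger G_e.
G_o = (omega_o + B.conj().T @ G_e) / m_q

print(np.allclose(np.concatenate([G_e, G_o]), G_full))
```

The even block of $-\slashed{D}^2 + m_q^2$ is Hermitian with eigenvalues bounded below by $m_q^2$, which is what makes conjugate gradient applicable.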

3.2.1 Description of the CG Algorithm

Conjugate gradient derives its name from the fact that it generates a sequence of conjugate vectors. It proceeds by producing a sequence of three sets of vectors: (i) successive approximations to the solution, (ii) residuals of the iterates and (iii) search directions which are used for updating the residuals and the iterates. The residuals are also the gradients of a quadratic functional, $f(x) = \frac{1}{2} x^T A x - b^T x + c$, the minimization of which is equivalent to solving the linear system $Ax = b$, where $A$ is a positive definite symmetric matrix. Let $x_0$ be an initial guess for the solution and $r_0 = -f'(x_0) = b - A x_0$ be the corresponding residual. In the method of steepest descent, the next iterate $x_1$ is given by

$$x_1 = x_0 + \alpha_0 r_0, \tag{3.8}$$

with $\alpha_0$ chosen to minimize $f(x_1)$, i.e.,

$$\frac{d}{d\alpha_0} f(x_1) = 0 \;\Rightarrow\; f'(x_1)^T r_0 = 0 \tag{3.9}$$

$$\Rightarrow\; \alpha_0 = \frac{r_0^T r_0}{r_0^T A r_0}. \tag{3.10}$$

Thus, the new and the old gradient are orthogonal to one another, but each new residual is not perpendicular to all the previously constructed residuals. As the steepest descent wanders its way to the minimum along these residuals, it can find itself going in the same directions and might even end up in a local minimum.

The conjugate gradient (CG) method overcomes this defect of steepest descent by constructing $n$ search directions $p_0, p_1, p_2, \ldots, p_{n-1}$ satisfying

$$p_i^T A\, p_j = 0, \quad i \neq j, \tag{3.11}$$

where $n$ is the order of the matrix $A$. The search directions are constructed from the Gram-Schmidt conjugation, which requires $n$ linearly independent vectors $u_0, u_1, \ldots, u_{n-1}$:

$$p_i = u_i + \sum_{k=0}^{i-1} \beta_{ik}\, p_k. \tag{3.12}$$

Choosing the $u_i$’s to be the residuals themselves gives conjugate gradient (CG) its name, simplifies the determination of the coefficients $\beta_{ik}$, and avoids the necessity of storing the old search vectors. The condition that $f$ be minimized along the current search direction determines $\alpha_i$ as $\alpha_i = r_i^T r_i / p_i^T A p_i$. The choice $\beta_i = r_i^T r_i / r_{i-1}^T r_{i-1}$ satisfies the requirement of equation (3.11) or, equivalently, that $r_i$ and $r_{i-1}$ be orthogonal. In fact, this choice of $\beta$ ensures that the new search direction $p_i$ and the residual $r_i$ are $A$-orthogonal and orthogonal to all the previous search directions and the previous residuals respectively. An excellent reference for an intuitive understanding of the method of CG is [46]. The pseudocode used in our implementation of CG is given below.

Inputs: A, b, initial guess x, maximum number of iterations imax, error tolerance ε.

Solve
    t ← A x
    r ← b − t
    δ_new ← r^T r
    for i = 0, 1, 2, ..., imax
        if i = 0
            p ← r
        else
            β ← δ_new / δ_old
            p ← r + β p
        endif
        t ← A p
        α ← δ_new / (p^T t)
        x ← x + α p
        r ← r − α t
        δ_old ← δ_new
        δ_new ← r^T r
        w ← x^T x
        tol ← δ_new / w
        if tol < ε
            exit for
        endif
    end
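A runnable rendering of the pseudocode may help in following the individual steps. The sketch below is a NumPy transcription written for this text, not the thesis's Fortran implementation; the function name `conjugate_gradient` and the random test matrix are assumptions. The stopping test is written as $\delta_{new} < \epsilon\, w$, which is the same condition as tol < ε but avoids a division when $x$ is zero.

```python
import numpy as np

def conjugate_gradient(A, b, x0, imax, eps):
    """Solve A x = b for a Hermitian positive definite A, following the pseudocode."""
    x = np.array(x0, dtype=complex)
    r = b - A @ x
    delta_new = np.vdot(r, r).real
    delta_old = delta_new
    p = r.copy()
    for i in range(imax):
        if i > 0:
            beta = delta_new / delta_old
            p = r + beta * p                 # new search direction
        t = A @ p                            # the only matrix vector product
        alpha = delta_new / np.vdot(p, t).real
        x = x + alpha * p                    # update the iterate
        r = r - alpha * t                    # update the residual
        delta_old = delta_new
        delta_new = np.vdot(r, r).real
        w = np.vdot(x, x).real
        if delta_new < eps * w:              # same test as tol = delta_new / w < eps
            return x, i + 1
    return x, imax

# Toy Hermitian positive definite system (an assumption for illustration).
rng = np.random.default_rng(0)
M = rng.normal(size=(20, 20)) + 1j * rng.normal(size=(20, 20))
A = M.conj().T @ M + np.eye(20)
b = rng.normal(size=20) + 1j * rng.normal(size=20)

x, iterations = conjugate_gradient(A, b, np.zeros(20), imax=500, eps=1e-20)
print(iterations, np.linalg.norm(A @ x - b))
```

Each pass through the loop uses one matrix vector product and three inner products, matching the operation count discussed in section 3.2.2.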

The recurrence relations satisfied by both the residual and the search directions indicate that the new residual (search direction) can be expressed in terms of the old one. For example, it can be verified that

$$r_1 = r_0 - \alpha_1 A r_0,$$

$$r_2 = r_0 - (\alpha_1 + \alpha_2(1 + \beta_1))\, A r_0 + \alpha_1 \alpha_2\, A^2 r_0.$$

Since $r_i^T r_j = 0$ for $i \neq j$, the residuals form an orthogonal basis for a Krylov subspace, $K(A, r_0, i) = \mathrm{span}\{r_0, A r_0, A^2 r_0, \ldots, A^{i-1} r_0\}$, a subspace obtained by repeatedly applying a matrix to a vector. Likewise, the recurrence relation for the iterate implies that $x_i$ is built component by component such that $x_i \in x_0 + \mathrm{span}\{r_0, A r_0, \ldots, A^{i-1} r_0\}$, with the property that the $A$-norm of the error ($e_i = x_i - x_{\mathrm{soln}}$) is minimized. Another important property of CG is that, if there are $n$ distinct eigenvalues of $A$, then it converges in at most $n$ iterations. In practice, it takes fewer than $n$ iterations to converge as it also depends on the eigenvalue spectrum of $A$. Convergence analysis of CG has been worked out by a number of authors including [46]. CG’s asymptotic convergence depends on the condition number of the matrix. As a result, an important problem that is encountered during fermion matrix inversion is the critical slowing down (CSD) problem. In the absence of a quark mass term, the lattice Dirac operator has a finite density of small eigenvalues, which makes the condition number very large and hence leads to the problem of CSD. Therefore, lattice simulations of the light quark propagator are not carried out at realistic quark mass values but rather at a heavier mass. Physical quantities constructed out of these propagators are then extrapolated to the chiral limit.
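The statement that CG converges in at most as many iterations as there are distinct eigenvalues can be seen on a small example. The sketch below builds a real symmetric matrix with exactly three distinct eigenvalues (an invented test case, not a Dirac operator) and runs three CG iterations, recording the residual norm after each.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12
# Symmetric positive definite matrix with only three distinct eigenvalues.
eigenvalues = np.repeat([1.0, 4.0, 9.0], n // 3)
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
A = Q @ np.diag(eigenvalues) @ Q.T
b = rng.normal(size=n)

x = np.zeros(n)
r = b - A @ x
p = r.copy()
history = [np.linalg.norm(r)]
for _ in range(3):                  # one iteration per distinct eigenvalue
    t = A @ p
    alpha = (r @ r) / (p @ t)
    x = x + alpha * p
    r_new = r - alpha * t
    beta = (r_new @ r_new) / (r @ r)
    p = r_new + beta * p
    r = r_new
    history.append(np.linalg.norm(r))

print(history)
```

The residual is still sizeable after two iterations and drops to rounding level after the third, even though the matrix is $12 \times 12$.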

[Plot: residual on a logarithmic scale versus CG iterations (0 to 800), with curves for m=0.01, m=0.02 and m=0.03.]

Figure 3.1: Change in the residual with CG iterations.

Our stopping criterion involves checking the condition $|r|^2/|x|^2 < \epsilon$ in each iteration.

We used e = 10“^* in all our production runs. Figure 3.1 traces the monotonie decrease of |r|^ as the CG iterations progress for different values of quark mass.

3.2.2 Implementation issues

Storage of at least four vectors is necessary while only the non-zero elements of the matrix A need be stored. Matrix A is not transformed during the course of the iteration; the maximum storage it requires would be as large as the number of nonzero elements it has, which in lattice QCD is four times the space-time volume of the lattice.

We implemented all the algorithms described in this section on the Cray T3D. It uses distributed but globally addressable memory, with 8 MW of local memory on each processor. Each node consists of 2 PEs, which are DECchip 21064 RISC microprocessors capable of 150 Mflops peak performance. The nodes are connected using a 3D torus interconnection network topology. It supports both SIMD and MIMD programming methods. Our implementation uses the latter approach. Memory access is cache based, with the size of the data cache being 1024 words.

The conjugate gradient method as outlined in the pseudocode involves one matrix vector product, four vector updates and three inner products per iteration**. The overhead incurred in inverting the squared Dirac operator (equation (3.3)) is almost zero as it amounts to an extra full matrix vector product occurring outside the iteration loop. Inside the iteration loop, although an extra matrix vector multiply is involved, only half the number of sites are involved each time. Equation (3.2) implies that the results of eight multiplications of a link field with a quark field are to be accumulated at a site. Going through CG iterations for all three colors at once gives a better flop rate than one after the other. Thus, multiplication of a link field with a quark field becomes a $(3 \times 3)$ complex matrix multiplication, an operation that has 54 floating point multiplications and 45 floating point additions. Carrying this out in 8 directions and adding all the results together amounts to 855 flops per site. In terms of the volume $V$ and the number of colors $N$, the number of flops present in one invocation of $\slashed{D}\chi$ on an even odd partitioned lattice is $570NV/2$. Table 3.1 shows the floating point operations involved in the various steps of our implementation of CG. It can be immediately seen that the matrix vector product constitutes the dominant fraction of the total number of flops. As mentioned before, this mostly involves $(3 \times 3) \times (3 \times N)$ matrix multiplication of the type

$$B = B + U * A, \tag{3.13}$$

where $B$ and $A$ are $3 \times N$ complex matrices, and $U$, being a link matrix, is a $3 \times 3$ complex matrix. For 3 colors, $N$ is 3, but it can take the values 6, 9, or 12 as well, depending on whether the lattice is doubled, tripled, or quadrupled in the time direction.

**One extra vector update comes from having to add the $m_q$ term to the result of $\slashed{D}\chi$.

This operation can be done independently at each site, or at least at every alternate site, as it involves obtaining information from nearest neighbor sites. The physical lattice is divided into nonoverlapping lattices of equal size and distributed onto different processors of the system. There are separate loops over the interior and the border sites. The border sites have a communication to computation ratio of 6/22. We implement the message passing paradigm with Cray’s shared memory routines rather than using PVM. The core operation constituting the matrix multiplication is implemented in the Cray T3D assembly language (CAM). This allows us to optimize the loads and stores. The $U$’s are loaded as and when they are needed, while a column of $A$ remains in the registers until it is used up. As an example of the speedups obtained using these optimizations, for $N = 6$ and a vector length of 1000, the assembly code gives 63 Mflops compared to 25 Mflops from a direct implementation in Fortran.

Vector updates and inner products do not need nearest neighbor communication. For the inner products, however, communication is required after each processor has calculated the local sum. In our implementation, each processor obtains the local sum from all the other processors and then calculates the global sum locally. With Cray’s shared memory paradigm, this does not add significantly to the communication cost.

Product type                     flops
Matrix vector product            570NV
3 inner products                 3 × (4 × NV/2)
4 vector updates                 4 × (4 × NV/2)
Total flops per iteration        584NV

Flops outside iteration loop:
2 matrix vector products         2 × 285NV
2 inner products                 2 × (4 × NV/2)
2 vector updates                 2 × (4 × NV/2)

Total flops                      (584NV × iter + 578NV)

Table 3.1: Number of flops in the inversion of the staggered $\slashed{D}$
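The entries of Table 3.1 can be cross-checked with a short script. The per-vector breakdown of the 570 figure used here (36 multiplications and 30 additions for one $3 \times 3$ complex matrix times a 3-component complex vector, 8 directions, and 7 accumulations of 6 real numbers) is bookkeeping reconstructed for this sketch and is an assumption, not a count taken from the thesis; the function names are hypothetical.

```python
def matvec_flops_per_site_per_vector():
    # One 3x3 complex matrix times a complex 3-vector: 9 complex multiplies
    # (4 mul + 2 add each) plus 6 complex additions (2 add each).
    link_times_vector = 9 * (4 + 2) + 6 * 2      # 36 mul + 30 add = 66
    accumulate = 7 * 6                           # summing 8 results of 6 reals
    return 8 * link_times_vector + accumulate

def cg_flops_per_iteration(N, V):
    matvec = 570 * N * V
    inner_products = 3 * (4 * N * V // 2)
    vector_updates = 4 * (4 * N * V // 2)
    return matvec + inner_products + vector_updates

def cg_flops_outside_loop(N, V):
    return 2 * 285 * N * V + 2 * (4 * N * V // 2) + 2 * (4 * N * V // 2)

V = 16**3 * 32
print(matvec_flops_per_site_per_vector())        # per site, per color vector
print(cg_flops_per_iteration(3, V) // V)         # per site for N = 3
print(cg_flops_outside_loop(3, V) // V)
```

For $N = 3$ the per-iteration total is 1752 flops per site, which agrees with the figure quoted in section 3.1.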

3.3 Eigenvalues and Eigenvectors of the Staggered $\slashed{D}$

It is of interest to obtain the low-lying eigenvalues and the corresponding eigenvectors of the Dirac operator to study questions related to chiral symmetry breaking and topology on the lattice. Several methods are available for the determination of eigenvalues of a large sparse Hermitian positive definite matrix. The one that we have implemented is based on the idea of extremization of the Ritz functional

$$\mu(x) = \frac{(x, Ax)}{(x, x)} \tag{3.14}$$

using nonlinear CG. After obtaining the lowest (highest) eigenmode, subsequent eigenmodes can be obtained by keeping them orthogonal to the previously determined modes. Kalkreuter [25] used this method to obtain a few of the smallest eigenmodes of the Wilson Dirac operator in $SU(2)$ gauge fields. As more and more eigenvalues are sought, the problem of projecting against the previously determined eigenvectors can become computationally demanding. A variant of this idea yields the method of subspace iterations, in which the $m$ lowest eigenvalues and eigenvectors are computed simultaneously [7]. Historically, the nomenclature subspace or simultaneous iterations has referred to performing inverse iterations simultaneously for all the required eigenvectors. In either case, the iteration is done on an $m$-dimensional subspace rather than on $m$ individual iteration vectors simultaneously. For lattice QCD, a method based on inverse iterations is not very attractive as one of the steps in the algorithm involves determining the solution of the system $Ax = b$. Since the Dirac operator has near zero eigenvalues, as explained in section 3.2, the usual iterative methods used to solve this system are associated with CSD.

Once again, the squared staggered Dirac operator, being Hermitian and positive definite, is a more suitable operator for use in the minimization of the Ritz functional than $\slashed{D}$ itself. Since $\slashed{D}^2$ decouples the odd sites from the even sites, a saving on space can be achieved by solving only for the even (odd) sites. This even odd symmetry gives rise to a twofold degeneracy in the spectrum for $SU(3)$ gauge fields. The eigenvalue equation we wish to solve is

$$-\slashed{D}^2 X_e = X_e \Lambda. \tag{3.15}$$

Here $\slashed{D}^2$ is assumed to be an $n \times n$ operator, $X_e$ is an $n \times m$ rectangular matrix whose columns are the $m$ eigenvectors, and $\Lambda$ is an $m \times m$ diagonal matrix whose entries are the $m$ eigenvalues. For the reconstruction on the odd sites, we make use of the checkerboard basis, in which $\slashed{D}$ connects only even sites to odd sites, which allows us to obtain $X_o$ as

$$X_o = (-\slashed{D} X_e)(\Lambda')^{-1}. \tag{3.16}$$

3.3.1 Description of the subspace iteration algorithm

Let $\mathcal{C}^m$ be the m-dimensional subspace spanned by the m eigenvectors $x_1, \ldots, x_m$ of A, which form the columns of X. The problem of determining these m eigenvectors becomes the problem of determining the subspace which they span. The m columns of the starting iterate $X_0$ span $\mathcal{C}^m_0$; iteration proceeds until the m vectors span $\mathcal{C}^m$ to sufficient accuracy. Iterations are performed with a subspace rather than with m individual vectors. This implies that convergence of the individual eigenvectors to the true eigenvectors is not so much a requirement as convergence to the appropriate subspace. The generalized definition

$$\mu(X) = \mathrm{tr}\, F(X) \qquad (3.17)$$

where

$$F(X) = (X^\dagger X)^{-1} X^\dagger A X \qquad (3.18)$$

of the Ritz functional [35] is used to find the invariant subspace belonging to the m lowest eigenvalues. The invariance of the trace under cyclic permutation leads one to observe that

$$\mu(X) = \mathrm{tr}\{ (X (X^\dagger X)^{-1} X^\dagger) A \}. \qquad (3.19)$$

Then $P(X) = X (X^\dagger X)^{-1} X^\dagger$ serves as a projector onto the subspace spanned by the columns of X.
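As an illustration of equations (3.17) to (3.19), the following minimal sketch evaluates the generalized Ritz functional and the projector for a small dense test matrix. It assumes NumPy and a randomly generated symmetric matrix in place of the lattice operator; the function names `ritz_functional` and `projector` are hypothetical and are not taken from the implementation described in this chapter.

```python
import numpy as np

def ritz_functional(A, X):
    """mu(X) = tr[(X^+ X)^-1 X^+ A X], cf. eqs. (3.17)-(3.18)."""
    F = np.linalg.solve(X.conj().T @ X, X.conj().T @ (A @ X))
    return np.trace(F).real

def projector(X):
    """P(X) = X (X^+ X)^-1 X^+, the projector onto the column space of X."""
    return X @ np.linalg.solve(X.conj().T @ X, X.conj().T)

# Small stand-in problem (an assumption for illustration, not lattice data).
rng = np.random.default_rng(0)
n, m = 12, 3
M = rng.standard_normal((n, n))
A = M @ M.T                      # real symmetric, positive semi-definite
X = rng.standard_normal((n, m))  # generic full-rank n x m block
C = rng.standard_normal((m, m))  # generic change of basis within the subspace
```

Because $\mu$ depends only on the subspace, `ritz_functional(A, X)` and `ritz_functional(A, X @ C)` should agree to rounding error, and $\mu(X)$ is bounded from below by the sum of the m lowest eigenvalues of A.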

The specific algorithm used here for determining the appropriate subspace is nonlinear CG. Minimization of a nonlinear function using CG proceeds in a slightly different manner from that used for solving linear systems. The recursive formula for the residual cannot be used; instead, the gradient of the function to be minimized is determined. Computation of the step size $\alpha$ becomes a little complicated, and there are several approaches for determining $\beta$. The method described here is, in a sense, a block version of the usual nonlinear CG. The gradient of $\mu(X)$ is

$$G = (AX - X F(X))(X^\dagger X)^{-1}, \qquad (3.20)$$

which satisfies $X^\dagger G = 0$. The other requirement of CG is a set of search directions H, which are chosen perpendicular to the columns of X for convenience.

The basic algorithm consists of projecting the eigenvalue problem $AX = X\Lambda$ into the space spanned by the columns of X and H, solving the 2m-dimensional eigenvalue problem, and obtaining the 2m eigenvalues and eigenvectors, which are then used for constructing the CG step size matrix $\alpha$ (m x m). The columns of X are updated, and the new gradient is determined, which is then used for updating the columns of H.

In matrix notation, the smaller eigenvalue problem that needs to be solved in each iteration of CG is

$$\begin{pmatrix} X^\dagger A X & X^\dagger A H \\ H^\dagger A X & H^\dagger A H \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} X^\dagger X & 0 \\ 0 & H^\dagger H \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} \Lambda.$$

The elements $X^\dagger A X$ etc. are m x m matrices, and $\beta_1$ and $\beta_2$ are the blocks of eigenvector coefficients belonging to the subspaces spanned by X and H respectively.

Beginning with an arbitrarily chosen X, the next iterate X' is obtained from the recurrence

$$X' = X + H\alpha, \qquad (3.21)$$

where $\alpha$ is now an m x m matrix. Since the minimizing subspace is spanned by $X\beta_1 + H\beta_2$, $\alpha = \beta_2 \beta_1^{-1}$. Determination of the new gradient G' is then necessary for updating the search directions H. At the start of the iteration, H is set equal to G.

In the general CG scheme, the recurrence used for obtaining the new search direction H' is

$$H' = G' + H\beta. \qquad (3.22)$$

An additional modification that is required here is the satisfaction of the condition $X^\dagger H = 0$, which is ensured by having the projector $(1 - P(X))$ in front of H. Thus

$$H' = G' + (1 - P(X)) H \beta. \qquad (3.23)$$

Although there are several methods for the determination of $\beta$ (the one used in the CG described in section 3.2 is called the Fletcher-Reeves form), numerical experiments suggest that the Polak-Ribiere prescription is more robust for non-quadratic functions. Therefore,

$$\beta = (G^\dagger G)^{-1} \{ G'^\dagger G' - G^\dagger G' \}, \qquad (3.24)$$

which is also an m x m matrix like $\alpha$.

For the theoretical foundations and convergence analysis for the general class of subspace iteration methods, the reader can refer to [35].

Stopping Criterion

The stopping criterion we have used is essentially based on estimating the relative error in the Ritz function between successive iterations, using the last three approximations to $\mu$. In the ith iteration the following ratio is computed first:

$$\eta_i = \frac{\mu_i - \mu_{i-1}}{\mu_{i-1} - \mu_{i-2}}. \qquad (3.25)$$

If $\eta_i < 1$, then the Ritz function is considered to have converged if

$$\delta\mu_{rel} < \epsilon, \qquad (3.26)$$

where $\epsilon$ is some given tolerance and $\delta\mu_{rel}$ denotes the relative error in the Ritz function. The rationale behind evaluating $\eta_i$ first is that it is the coefficient in the following relation (assuming linear convergence) between the true value $\mu_t$ and the computed approximation $\mu_i$ in a given iteration:

$$\mu_t - \mu_i = \eta_i (\mu_t - \mu_{i-1}). \qquad (3.27)$$
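Solving the linear-convergence relation (3.27) for the unknown true value gives $\mu_t - \mu_i = \eta_i(\mu_i - \mu_{i-1})/(1 - \eta_i)$, which suggests one way of estimating the relative error from three successive Ritz values. The sketch below is written under that assumption; the function name `ritz_error_estimate`, the use of an absolute value, and the handling of the degenerate cases are illustrative choices rather than details taken from the original implementation.

```python
def ritz_error_estimate(mu_prev2, mu_prev, mu_cur):
    """Estimate (eta, relative error) from the last three Ritz values.

    Assumes linear convergence, mu_t - mu_i = eta * (mu_t - mu_{i-1}).
    Returns None when the estimate is unusable (no progress, or eta >= 1).
    """
    denom = mu_prev - mu_prev2
    if denom == 0.0:
        return None
    eta = (mu_cur - mu_prev) / denom
    if eta >= 1.0:
        return None
    # mu_t - mu_i = eta * (mu_i - mu_{i-1}) / (1 - eta); divide by mu_i
    rel_err = abs(eta * (mu_cur - mu_prev) / ((1.0 - eta) * mu_cur))
    return eta, rel_err


def has_converged(mu_prev2, mu_prev, mu_cur, eps):
    """True when the estimated relative error falls below the tolerance eps."""
    estimate = ritz_error_estimate(mu_prev2, mu_prev, mu_cur)
    return estimate is not None and estimate[1] < eps
```

For a sequence that converges exactly geometrically, such as $\mu_i = 2 + 2^{-i}$, the estimate reproduces the true relative error $|\mu_t - \mu_i|/\mu_i$.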

In the actual implementation, the value of $\delta\mu_{rel}$ in each iteration is monitored, and if in a given iteration its value becomes less than some tolerance $\epsilon$, the algorithm is halted. The relative error in the individual eigenvalues is also estimated, although it does not play a direct role in the stopping criterion. Denoting the relative error in the eigenvalues by $\delta\lambda_{rel}$, it is defined as

$$\delta\lambda_{rel} = \frac{\eta_i (\lambda_{i,j} - \lambda_{i-1,j})}{(1 - \eta_i)\lambda_{i,j}}, \qquad j = 1, 2, \ldots, m. \qquad (3.28)$$

Another type of error estimate that helped cross-check the eigenvalue approximations obtained from the above convergence criterion is given below [51]:

$$\nu = \| (A - \lambda_k) x_k \| \le \Bigl( \sum_k \| g_k \|^2 \Bigr)^{1/2}, \qquad k = 1, 2, \ldots, m. \qquad (3.29)$$

Here $\lambda_k$ are the eigenvalues of A and $x_k$ the corresponding eigenvectors; $g_k$ is the gradient of the Ritz function calculated at $x_k$. For formal proofs the reader can consult [51].

3.3.2 Implementation of the Algorithm

The pseudocode used for our implementation of the subspace iteration method is shown below. The storage requirement includes 5 n x m complex rectangular matrices, where n is 3V/2 when solving for the eigenvalues of $-\not{D}^2$ on the even sites. In addition, storage must be allocated for at least 12 m x m complex matrices and, as in CG, for the link matrices, which require a total of 72V memory locations.

Input: Columns of X, an arbitrary set of m orthonormal vectors.
Solve Y = AX
xx ← X†X
x_inv ← invert(xx)
xy ← X†Y
z ← x_inv × xy
G ← (Y − Xz) x_inv
H ← G
for i = 0, 1, 2, ..., i_max
    Solve Z = AH
    xz ← X†Z
    Form the remaining m x m blocks and solve the 2m-dimensional eigenvalue problem for β_1, β_2
    α ← β_2 β_1^{-1}
    X ← X + Hα;  Y ← Y + Zα
    xx ← X†X
    x_inv ← invert(xx)
    xy ← X†Y
    z ← x_inv × xy
    μ ← Tr(z)
    Find the new gradient.
    Z ← (Y − Xz) x_inv
    Polak-Ribiere form of β
    gg ← G†G
    gg_inv ← invert(gg)
    β ← gg_inv (Z − G)†Z
    G ← Z
    Update H.
    Z ← Hβ
    H ← G + (Z − X x_inv X†Z)
    Check if stopping criterion is satisfied. Proceed with further iterations if needed.
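The steps above can be sketched in a few lines of NumPy for a small dense Hermitian matrix. This is a simplified illustration under stated assumptions, not the Fortran implementation used in this work: the function name `lowest_modes`, the gradient-norm stopping rule, the QR factorization used to obtain an orthonormal basis for the projected problem, and the use of a pseudo-inverse in the Polak-Ribiere step are all choices made here for compactness and robustness.

```python
import numpy as np

def ritz_gradient(X, Y):
    """Return F = (X^+X)^-1 X^+ Y and G = (Y - X F)(X^+X)^-1 for Y = A X."""
    xinv = np.linalg.inv(X.conj().T @ X)
    F = xinv @ (X.conj().T @ Y)
    return F, (Y - X @ F) @ xinv

def lowest_modes(A, m, tol=1e-8, max_iter=2000, reorth=50, seed=0):
    """Block nonlinear CG sketch for the m lowest eigenpairs of Hermitian A."""
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((A.shape[0], m)))
    Y = A @ X
    _, G = ritz_gradient(X, Y)
    H = G.copy()                       # first search direction is the gradient
    for it in range(1, max_iter + 1):
        if np.linalg.norm(G) < tol:    # assumed stopping rule: small gradient
            break
        # Orthonormal basis S of span[X, H]; X = S[:, :m] @ R[:m, :m].
        S, R = np.linalg.qr(np.hstack([X, H]))
        AS = A @ S
        T = S.conj().T @ AS            # projected 2m x 2m eigenvalue problem
        _, V = np.linalg.eigh((T + T.conj().T) / 2)
        y1, y2 = V[:m, :m], V[m:, :m]
        # Step matrix chosen so that the update has the form X' = X + (...)
        alpha = y2 @ np.linalg.solve(y1, R[:m, :m])
        X = X + S[:, m:] @ alpha
        Y = Y + AS[:, m:] @ alpha      # keeps Y = A X without a new product
        if it % reorth == 0:           # reorthogonalize and restart CG
            X, _ = np.linalg.qr(X)
            Y = A @ X
            _, G = ritz_gradient(X, Y)
            H = G.copy()
            continue
        _, G_new = ritz_gradient(X, Y)
        # Polak-Ribiere coefficient matrix, cf. eq. (3.24)
        beta = np.linalg.pinv(G.conj().T @ G) @ ((G_new - G).conj().T @ G_new)
        Hb = H @ beta
        xinv = np.linalg.inv(X.conj().T @ X)
        H = G_new + Hb - X @ (xinv @ (X.conj().T @ Hb))   # cf. eq. (3.23)
        G = G_new
    Xo, _ = np.linalg.qr(X)            # final Rayleigh-Ritz step
    vals, vecs = np.linalg.eigh(Xo.conj().T @ A @ Xo)
    return vals, Xo @ vecs
```

On a small test matrix the returned values can be compared directly with a dense diagonalization, for example `np.linalg.eigvalsh(A)[:m]`.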

In terms of operations, the basic steps in the algorithm can be classified into the following categories: two block matrix vector products (here vector actually refers to the n x m blocks) but only at half the number of sites each time, seven block inner products, seven block vector accumulations, three m x m complex matrix inversions, three m x m complex matrix multiplications, and six multiplications between n x m and m x m complex matrices. There is also the diagonalization of a 2m x 2m Hermitian matrix, for which we use the LAPACK routine CHEGV, available as a library routine on the Cray T3D. An appropriate LAPACK routine was also used for the m x m complex matrix inversion. The total number of floating point operations is tabulated in Table 3.2. Apart from the observation that the number of flops is approximately m/3 times that obtained in linear CG, another point that strikes the eye is the $m^2$ dependence, which can dominate over the matrix vector product operations for m > 10 inside the iteration loop. Here also, we use CAM routines for doing the (3 x 3) x (3 x m) matrix multiplication, which as before forms the core operation.

A problem that one is faced with is the danger that the columns of X can become more and more linearly dependent as the iteration proceeds. This was certainly present in our implementation, preventing convergence from taking place. The m x m matrix $X^\dagger X$ begins to gain off-diagonal elements, and the requirements of maintaining the orthogonality between the H vectors and X and between the gradient and X are not satisfied. A remedy in such a situation is the explicit reorthogonalization of the vectors constituting the columns of X by a standard method such as the Gram-Schmidt process. We used the variant called the modified Gram-Schmidt method, which is numerically more stable than the classical version of the algorithm. After the reorthogonalization, the CG process is restarted by computing the new gradient; some of the loss encountered in restarting can be offset by preserving the last set of search directions. After doing a number of numerical experiments, we have settled on reorthogonalization and restarting after every 50 iterations. A criterion for doing this dynamically can help save time in the future. Currently, the reorthogonalization adds further flops per iteration to the totals shown in Table 3.2.

Operation type                                   Number of flops
Inside the iteration loop:
  Matrix vector product                          570 m V
  Seven inner products                           84 m V
  Vector accumulation                            24 m V
  (n x m) x (m x m) matrix multiplication        72 m^2 V
  Total number of flops                          678 m V + 72 m^2 V
Outside the iteration loop:
  Matrix vector product                          570 m V
  Seven inner products                           18 m V
  Vector accumulation                            9 m V
  (n x m) x (m x m) matrix multiplication        24 m^2 V
  Total number of flops                          621 m V + 24 m^2 V

Table 3.2: Number of Flops in Subspace Iterations
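The reorthogonalization step described above can be illustrated with a short modified Gram-Schmidt routine. This is a generic textbook sketch in NumPy, with the hypothetical function name `modified_gram_schmidt`; it is not the routine used on the Cray T3D, and it assumes the input columns are linearly independent.

```python
import numpy as np

def modified_gram_schmidt(X):
    """Orthonormalize the columns of X by the modified Gram-Schmidt process.

    Each column is normalized in turn and its component is immediately
    removed from all later columns, which is what distinguishes the
    modified variant from the classical one.
    """
    Q = np.array(X, dtype=complex)          # work on a copy
    m = Q.shape[1]
    for j in range(m):
        Q[:, j] /= np.linalg.norm(Q[:, j])
        for k in range(j + 1, m):
            Q[:, k] -= np.vdot(Q[:, j], Q[:, k]) * Q[:, j]
    return Q
```

After the call, `Q.conj().T @ Q` should equal the m x m identity to rounding error, and the columns of Q span the same subspace as the columns of X.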

Our implementation of the subspace method differs significantly from that of Bunk [7] in the following ways. We found it necessary to explicitly reorthogonalize the columns of X every 50 iterations to achieve convergence. We also used two different types of error estimates for the calculated eigenvalues, as discussed in the previous section. We offer a detailed breakdown of the operation count of the algorithm in Table 3.2.

[Figure 3.2 plot: relative error (logarithmic axis) versus iteration number (iter, 0 to 1500) for the indicated eigenvalues and the Ritz function.]

Figure 3.2: Relative error in the indicated eigenvalues and the Ritz function.

Numerical Tests and Results

Numerical experiments indicated that $\epsilon$ = I0“® works best on all configurations. Although the Ritz function itself is a monotonically decreasing function, the quantities $\delta\mu_{rel}$ and $\delta\lambda_{rel}$ are not, which is obvious from Figure 3.2. We find that the smallest eigenvalue converges the fastest, with a relative error less than 10"^®. The relative errors of the other eigenvalues hover between this and the given tolerance $\epsilon$. The 16th eigenvalue is the slowest to converge and has a relative error of the same magnitude as the tolerance $\epsilon$.

The error $\nu$ is not determined in every iteration but rather at the end, after the stopping criterion is satisfied, since this involves the multiplication of an n x m matrix with an m x m matrix, which takes $12 m^2 V$ operations. While this is tolerable for small m, for m > 10 it can cause additional slowing down. We find that individually, for every eigenvalue, the quantity $\|(A - \lambda_k) x_k\|^2$ is less than or equal to the norm of the corresponding gradient vector on almost all the configurations; this difference ranges from 0(10"^^) for the lowest few modes to 0(10“*^) for the 16th mode. The criterion defined by equation (3.29) is always satisfied. Hence, choosing $\epsilon$ = 10“* ensures sufficient accuracy in the calculated eigenmodes, and this is reached on most configurations within 1600 iterations for the 16^3 x 32 lattices, at ma = 0.01 and m = 16.

Column headings: m = 1, m = 2, m = 4, m = 8, m = 16, m = 32

Lattice 4^3 x 8, random:
  ma = 0.0:    9999  1549  852  837  265  170
  ma = 0.001:  9552  1431  852  440  286  164
  ma = 0.01:   7705  1255  848  439  277  152
  ma = 0.1:    5095  1102  775  243  147
  ma = 1.0:    3253  852  340  184  102

Lattice 16^3 x 32, dynamical, beta = 5.7:
  ma = 0.0:    1652  1401
  ma = 0.001:  7502  3852  1652  1552
  ma = 0.01:   5099  2802  1847  1501  1644
  ma = 0.1:    2852  2152  1351  1144  (m = 20)
  ma = 1.0:    1252  1152  902  758

Table 3.3: Number of iterations required to converge for different quark masses and m.

The dependence of the number of iterations required to converge on the quark mass and on m is shown in Table 3.3 for a random 4^3 x 8 SU(3) gauge configuration and also for a typical 16^3 x 32 dynamical configuration at $\beta$ = 5.7.

A glance at Table 3.3 may suggest that ma = 1.0 is a good choice, but the eigenvalues are then not of the same order of accuracy as mentioned before. Also, it might seem that on the 16^3 x 32 lattice and for m = 16, ma = 0.0 has performed better than ma = 0.01. It must be remembered that the results shown in the table are for a single configuration. It turned out that ma = 0.01 was the best choice. In any case, the choice of ma is an issue only for gaining a few hundred iterations without losing accuracy.

We used this algorithm to determine the 16 lowest eigenvalues of $-\not{D}^2$ on both dynamical and quenched SU(3) gauge configurations (Table 3.4). The eigenvalues and eigenvectors so obtained were then used to investigate questions related to the topology of the gauge configurations; this is discussed in chapter 5. Although this method yields reasonably accurate eigenvalues and eigenvectors and has a well defined stopping criterion, the chief disadvantage stems from the fact that the number of flops grows as $m^2$. Inside the iteration loop, the (n x m) x (m x m) matrix multiplication begins to dominate the matrix vector product for m > 10.

m_dyn              beta    size         no. of configs
0.01, N_f = 2      5.7     16^3 x 32    83
infinity           6.9     16^3 x 32    84

Table 3.4: The ensemble used for subspace iterations.

3.4 The Lanczos Algorithm

The algorithm described in the previous section is useful for extracting a few of the smallest eigenvalues and eigenvectors of a large sparse matrix. Sometimes it is of interest to obtain the entire eigenvalue spectrum, for example, when studying chiral phase transitions on the lattice, or universal fluctuations in the spectrum.

A variant of the Lanczos algorithm is useful for extracting the full spectrum for moderately sized lattices. For large lattices, only the extremal eigenvalues of $\not{D}$ can be estimated in a reasonable iteration time. For a large sparse Hermitian matrix, the Lanczos procedure generates a sequence of symmetric tridiagonal matrices whose eigenvalues become better and better approximations of the extremal eigenvalues of the original matrix. There are other applications of the Lanczos procedure, such as the solution of linear systems and least squares problems. It is very well documented and has been studied extensively. The reader can consult [20] and references therein for the theoretical foundations, convergence analysis, and descriptions of the different types of Lanczos procedures and their applications. The method gained popularity when it was realized that it can be used for large sparse matrices, for which the well known and usually reliable methods for small dense matrices, like Householder transformations, fail since they can destroy the sparseness of the original matrix.

3.4.1 Cullum and Willoughby’s Lanczos Method

As before, it is convenient to deal with the Hermitian operator $-\not{D}^2$ rather than $\not{D}$ itself, since the Lanczos procedure works well for such systems. The basic Lanczos recursion is obtained from the relation $Q_m^\dagger A Q_m = T_m$, where $Q_m$ is a matrix whose columns consist of the orthonormal vectors $q_1, q_2, \ldots, q_m$, and $T_m$ is an m x m symmetric tridiagonal matrix with diagonal elements $\alpha_i$ and off-diagonal elements $\beta_i$:

$$T_m = \begin{pmatrix} \alpha_1 & \beta_1 & & & 0 \\ \beta_1 & \alpha_2 & \beta_2 & & \\ & \ddots & \ddots & \ddots & \\ & & & & \beta_{m-1} \\ 0 & & \cdots & \beta_{m-1} & \alpha_m \end{pmatrix}$$

Equating columns in AQ = QT and making use of the orthonormality among the q's, the following basic Lanczos iteration scheme, with a slight modification introduced by Paige in the definition of $\alpha_i$, is obtained:

Input: v_1 = random vector, ||v_1|| = 1, v_0 = 0, β_1 = 0

For i = 1, 2, ..., m
    β_{i+1} v_{i+1} ← A v_i − β_i v_{i−1} − α_i v_i
    α_i ← v_i† (A v_i − β_i v_{i−1})
    |β_{i+1}| ← ||A v_i − β_i v_{i−1} − α_i v_i||

The iteration procedure outlined above leads one to observe that:

1. Only two complex vectors of size 3V need to be stored during the iteration process. Two linear arrays, of size depending upon the maximum number of iterations, are required to store the elements $\alpha_i$ and $\beta_i$. No transformation takes place on the elements of A, making it an attractive technique for sparse matrices.

2. There is one matrix vector product, two vector accumulations (three if there is a quark mass term to be added to $\not{D}$), two inner products, and the division of a vector by its norm. Therefore it is easy to parallelize in a manner similar to the CG and the subspace procedures. Computing $\beta$ amounts to calculating the norm of the vector obtained from the first step of the iteration.

3. As in CG, each $v_i$ can be written as a linear combination of $v_1, A v_1, \ldots, A^{i-1} v_1$; therefore the Lanczos procedure also determines an orthonormal basis for a Krylov subspace $K(A, v_1, i)$.
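The recursion and the storage pattern noted in these observations can be sketched as follows. This is a minimal NumPy illustration in which only two vectors are kept and $\alpha_i$ is computed in the Paige form; the function name `lanczos_tridiagonal` and the 0-based bookkeeping of the coefficients are assumptions made for the sketch, and no reorthogonalization or spurious-eigenvalue test is included.

```python
import numpy as np

def lanczos_tridiagonal(matvec, v1, m):
    """Run up to m Lanczos steps and return the tridiagonal matrix T.

    matvec : callable returning A @ v for a Hermitian operator A
    v1     : starting vector (normalized internally)
    Only the current and previous Lanczos vectors are stored.
    """
    v_prev = np.zeros_like(v1)
    v = v1 / np.linalg.norm(v1)
    alphas, betas = [], []
    beta = 0.0
    for i in range(m):
        w = matvec(v) - beta * v_prev        # A v_i - beta_i v_{i-1}
        alpha = np.vdot(v, w).real           # Paige form of alpha_i
        w = w - alpha * v
        alphas.append(alpha)
        beta = np.linalg.norm(w)             # |beta_{i+1}|
        if i == m - 1 or beta == 0.0:
            break
        betas.append(beta)
        v_prev, v = v, w / beta
    return np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
```

The extremal eigenvalues of the returned matrix, obtained for instance with `np.linalg.eigvalsh`, approximate the extremal eigenvalues of A well before m reaches the dimension of A.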

Denoting the Lanczos coefficients as $\alpha_i^L$ and $\beta_i^L$, and the CG coefficients as $\alpha_i^c$ and $\beta_i^c$, they satisfy the following recurrence relations [20]:

$$\alpha_i^L = \frac{\beta_{i-1}^c}{\alpha_{i-1}^c} + \frac{1}{\alpha_i^c}, \qquad i = 1, \ldots, m-1 \qquad (3.30)$$

$$\beta_i^L = \frac{\sqrt{\beta_i^c}}{\alpha_i^c}, \qquad i = 1, \ldots, m-1 \qquad (3.31)$$

$$\beta_0^c = 0.0 \qquad (3.32)$$

Lanczos and the CG method for obtaining the solution of linear systems are equivalent in exact arithmetic; that their equivalence is retained in finite precision arithmetic was shown by Cullum and Willoughby [12].
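The correspondence between the two sets of coefficients can be checked numerically on a small example. The sketch below runs plain CG on a dense positive definite matrix, records the step sizes and the ratios of successive squared residual norms, and assembles the tridiagonal matrix from them using the standard textbook form of the relations. The function names and the 0-based indexing are assumptions of this sketch and may differ from the conventions of the original code.

```python
import numpy as np

def cg_with_coefficients(A, b, steps):
    """Plain CG for A x = b (A Hermitian positive definite).

    Returns the solution estimate together with the CG coefficients
    alpha_k (step sizes) and beta_k (ratios of squared residual norms).
    """
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = np.vdot(r, r).real
    alphas, betas = [], []
    for _ in range(steps):
        Ap = A @ p
        alpha = rr / np.vdot(p, Ap).real
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = np.vdot(r, r).real
        beta = rr_new / rr
        alphas.append(alpha)
        betas.append(beta)
        p = r + beta * p
        rr = rr_new
    return x, np.array(alphas), np.array(betas)

def lanczos_matrix_from_cg(alphas, betas):
    """Tridiagonal Lanczos matrix built from CG coefficients (0-based)."""
    diag = 1.0 / alphas
    diag[1:] += betas[:-1] / alphas[:-1]      # 1/alpha_k + beta_{k-1}/alpha_{k-1}
    off = np.sqrt(betas[:-1]) / alphas[:-1]   # sqrt(beta_k)/alpha_k
    return np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
```

For a matrix of dimension n with distinct eigenvalues and a generic right-hand side, running n CG steps and diagonalizing the resulting n x n tridiagonal matrix should reproduce the spectrum of A to rounding accuracy.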

The above procedure is very elegant if one can work in exact arithmetic. In practice, loss of orthogonality among the Lanczos vectors due to roundoff errors (see the discussion in [20]) causes spurious eigenvalues to appear among the eigenvalues of T. Explicit reorthogonalization schemes incur a lot of storage overhead and limit the Lanczos procedure to the extraction of a few of the smallest or the largest eigenvalues of a sparse matrix. The other troublesome feature of the Lanczos method is that there is no proper terminating criterion. It is still an attractive method, as the extremal eigenvalues of A can be obtained in fewer than n iterations.

Cullum and Willoughby’s Lanczos method follows the iteration procedure shown above and also incorporates a test for the identification of spurious eigenvalues. The test consists of creating a tridiagonal matrix $T_2$ from $T_m$ by deleting its first row and column and then comparing the eigenvalues of both. If a simple eigenvalue of $T_m$ is also an eigenvalue of $T_2$, then it is a spurious eigenvalue. All other eigenvalues of $T_m$ are considered good, and they can begin appearing in multiple copies, although their multiplicity does not reflect the true multiplicity of the eigenvalues of A. As m increases, the “good” eigenvalues of $T_m$ become better and better estimates of the eigenvalues of A. This test is based upon the relationship between the conjugate gradient and the Lanczos methods in finite precision arithmetic. For further details, the appropriate reference is [12].
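The identification test can be sketched as a short routine acting on the tridiagonal matrix. This is an illustrative reading of the test as described above, not the original code: the function name `good_eigenvalues` and both default tolerances are assumptions, and dense `numpy.linalg.eigvalsh` is used in place of a dedicated tridiagonal QR routine.

```python
import numpy as np

def good_eigenvalues(T, tol_mult=1e-10, tol_spur=1e-10):
    """Return one representative of each 'good' eigenvalue of T.

    An eigenvalue of T is labelled spurious when it is simple (no other
    eigenvalue of T within tol_mult) and also matches an eigenvalue of
    T2, the matrix obtained by deleting the first row and column of T.
    Both tolerances are relative and are illustrative defaults.
    """
    ev = np.linalg.eigvalsh(T)
    ev2 = np.linalg.eigvalsh(T[1:, 1:]) if T.shape[0] > 1 else np.array([])
    good = []
    i = 0
    while i < len(ev):
        scale = max(1.0, abs(ev[i]))
        j = i + 1
        while j < len(ev) and abs(ev[j] - ev[i]) <= tol_mult * scale:
            j += 1                      # collect numerically equal copies
        multiple = (j - i) > 1
        matches_t2 = ev2.size > 0 and np.min(np.abs(ev2 - ev[i])) <= tol_spur * scale
        if multiple or not matches_t2:
            good.append(ev[i])
        i = j
    return np.array(good)
```

In a decoupled toy case such as T = diag(1, 2, 3), the eigenvalues 2 and 3 are simple and also belong to $T_2$, so only the eigenvalue 1 is kept; an eigenvalue that appears twice in T is always kept.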

3.4.2 Numerical Implementation and Tests

Computationally, the Lanczos method has almost the same number of operations as the linear CG. In all such algorithms, the dominant contribution to the number of flops comes from the matrix vector product, so that the presence or absence of an extra inner product or a vector update does not increase or reduce the cost very much. The total number of flops excluding the diagonalization of the tridiagonal matrix is ôSOvVy. In our implementation, N was 1, as it is enough to work with one color. We obtained the eigenvalues of $T_m$ and $T_2$ by using the LAPACK routine SSTEQR, which implements the implicitly shifted QR algorithm. We found that a tolerance of I0~" is good for determining the multiplicity. For the comparison of the simple eigenvalues of $T_2$ and $T_m$, we found that it is best if the tolerance is chosen equal to or greater than the tolerance for the multiplicity test.

Our numerical experiments on a 4^3 x 8 lattice indicate that this identification test is very reliable. We were able to get the complete spectrum in less than 2n iterations, where n is 768. In general, n = 3V/2. The eigenvalues so obtained should satisfy the following sum rule:

$$\mathrm{Tr}(-\not{D}^2) = \sum_n \lambda_n. \qquad (3.33)$$

In 4 dimensions, the trace is simply NV, and for the 4^3 x 8 lattice this trace equals 1536. The eigenvalues that we obtained from our implementation satisfied this sum rule up to 12 decimal places, suggesting that the test for identifying the spurious eigenvalues works very well.

lattice                           order of T_m    good ev    Tr(-D^2)
4^3 x 8, random                   1056            738        1498
                                  1120            757        1523
                                  1165            766        1533
                                  1175            768        1536.070
                                  1200            768        1536
                                  1998            768        1536
16^3 x 32, beta = 5.7,            5000            4821
unquenched                        6000            5762
                                  7000            6697
                                  8000            7627
                                  9000            8552
                                  10000           9474

Table 3.5: Number of good eigenvalues of $\not{D}^2$ found after m iterations using the Lanczos procedure.

On a dynamical 16^3 x 32 lattice, n is 196608, and hence it would be impractical to determine all the eigenvalues. We ran up to 10000 iterations and, based on the criterion for identifying spurious eigenvalues, we found 9474 good eigenvalues, of which the first 19 and the last 89 were present in multiple copies. Since the eigenvalues which begin to replicate can be considered to have truly converged, out of the 9474 good eigenvalues, 108 have converged. These converged eigenvalues will continue to remain converged as the value of m increases and hence are numerically stable. The lowest 16 eigenvalues agreed very well with those determined from the subspace method. Table 3.5 shows the number of good eigenvalues found after m iterations.

The equivalence between CG and Lanczos was numerically tested by running the CG algorithm until the magnitude of the residual became close to machine precision. This is of O(10“‘‘°) on the Cray T3D and took about $m_{max}$ iterations. The values of $\alpha_i^c$ and $\beta_i^c$ from each iteration were used to construct the Lanczos tridiagonal matrix of order $m_{max}$ using the recurrence relations given by equations 3.30 to 3.32, which was subsequently diagonalized by the LAPACK routine SSTEQR. Spurious eigenvalues were identified and the good eigenvalues compared with those obtained from the Lanczos procedure. Our results on both the random 4^3 x 8 lattice and on a dynamical 16^3 x 32 lattice are summarized in Table 3.6.

lattice        m_max    good ev    good ev with rel error < 10 '
4^3 x 8        1065     741        741
16^3 x 32      3516     3413       S3

Table 3.6: Maximum number of good eigenvalues obtained from CG.

On the smaller lattice, $m_{max}$ is large enough that almost all the eigenvalues have emerged with a relative error below the threshold quoted in Table 3.6. However, on the 16^3 x 32 lattice, $m_{max}$ is a fraction of n, and hence only the first 18 and the last 67 have converged with a relative error of 0(10“') or less. The inability to continue the CG procedure after the residuals become very small prevents the use of CG to compute the eigenvalues of large matrices. On the other hand, the Lanczos vectors $v_i$ are scaled at each step, so the Lanczos iterations can be continued indefinitely as long as $\beta_i$ does not become very small.

In summary, tests on the 4^3 x 8 lattice indicate that Cullum and Willoughby’s variant of the Lanczos algorithm is very reliable for extracting the entire eigenvalue spectrum on moderately sized lattices. The chief disadvantage of the Lanczos algorithm is that there is neither a well defined stopping criterion nor a well defined error bound for the calculated eigenvalues. True multiplicities of the eigenvalues cannot be found either.

3.5 Summary and Conclusions

This chapter began with a discussion of some issues pertaining to numerical simulations of QCD. Lattice sizes of 32^4 or greater with lattice spacing 0.1 fm or lower, together with an MPP environment capable of providing peak performance in the teraflop range and a global memory address space in the gigabyte range, are desired. Correspondingly, algorithmic improvements for generating quark propagators, to overcome the problem of CSD, and for generating dynamical configurations are also desired.

To date, conjugate gradient is the algorithm of choice for computing staggered quark propagators. It is the most efficient among the iterative methods, but it also suffers from CSD. For the determination of a few of the smallest eigenvalues and the corresponding eigenvectors of $\not{D}$, we took up the method of subspace iterations and used it to obtain the 32 lowest modes of the staggered Dirac operator. We also studied the Lanczos algorithm à la Cullum and Willoughby for obtaining the entire spectrum. On moderately sized lattices this can be achieved in reasonable CPU time.

All of the algorithms described in this chapter were implemented in Fortran and ran on the Cray T3D MPP at the Ohio Supercomputer Center.

CHAPTER 4

$\eta'$ MESON WITH STAGGERED FERMIONS

4.1 Continuum $\eta'$ Lore

The pseudoscalar octet mesons comprising the pions and the kaons are light particles and are understood as approximate Goldstone bosons of the spontaneously broken SU(3) axial flavor symmetry of the QCD Lagrangian. The corresponding singlet meson, the $\eta'$, with an experimentally measured mass of 958 MeV, has long been the center of attention of the U(1) problem. The $\eta'$ meson, the lightest particle with the quantum numbers that could correspond to the broken U(1) axial symmetry, is far too massive to be an approximate Goldstone boson. This is known as the U(1) problem. The U(1) axial current is conserved naively at the tree level for zero bare quark masses. However, in the quantum theory, triangle diagrams [8] induce the anomaly term, making the divergence of the axial current non-zero. The presence of the anomaly can be understood in yet another way [17]: the path integral measure is not invariant under the U(1) axial transformation, giving rise to an extra Jacobian factor which is the anomaly. Thus one is forced to conclude that in nature the U(1) axial current is not conserved, and there is no missing Goldstone boson. This is the qualitative understanding of how the U(1) problem is resolved.

The large $N_c$ expansion provides a convenient framework for a quantitative discussion of the U(1) problem. When $N_c \to \infty$ [57, 52], U(1) axial symmetry is restored and gives rise to $\eta'$-$\pi$ degeneracy. This degeneracy is lifted by the presence of virtual quark loops (Figure 4.1) in the $\eta'$ propagator, which are suppressed by one power of $1/N_c$. Infinite iteration of these quark antiquark annihilation diagrams gives rise to a geometric series for the Euclidean $\eta'$ propagator.

Figure 4.1: Virtual quark loops in the $\eta'$ propagator

Defining the correlator $\langle \eta'(t)\eta'(0) \rangle$, one can write

$$\frac{1}{p^2 + m_8^2} - \frac{1}{p^2 + m_8^2}\, m_0^2\, \frac{1}{p^2 + m_8^2} + \frac{1}{p^2 + m_8^2}\, m_0^2\, \frac{1}{p^2 + m_8^2}\, m_0^2\, \frac{1}{p^2 + m_8^2} - \cdots \qquad (4.1)$$

where $m_8^2$ is the average of the squares of the octet masses and $m_0^2$ is the strength of the flavor singlet interaction. Summing the geometric series shifts the pole from $m_8^2$ to $m_0^2 + m_8^2$. Thus the propagator becomes

$$\frac{1}{p^2 + m_8^2 + m_0^2}.$$

In an SU(3) symmetric world, one would write

$$m_{\eta'}^2 = m_0^2 + m_8^2. \qquad (4.2)$$

$m_8^2$ vanishes in the chiral limit while $m_0^2$ does not. Taking $m_8^2 = (4m_K^2 + 3m_\pi^2 + m_\eta^2)/8$ and substituting the experimentally measured values for $m_{\eta'}$, $m_K$, $m_\pi$ and $m_\eta$, one obtains $m_0$ = 860 MeV. Thus the $\eta'$ gains its mass not only due to finite quark mass but also due to an extra flavor singlet interaction.
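The arithmetic behind this estimate can be reproduced in a few lines. The meson masses used below are approximate experimental values inserted here as assumptions (in MeV, with an isospin-averaged kaon mass); they are not quoted from this chapter, so the result should only be expected to land near the value given in the text.

```python
import math

# Approximate experimental masses in MeV (assumed inputs for illustration).
m_pi = 138.0       # isospin-averaged pion
m_K = 495.6        # isospin-averaged kaon
m_eta = 547.9
m_eta_prime = 957.8

def singlet_mass(m_eta_prime, m_K, m_pi, m_eta):
    """m_0 from m_eta'^2 = m_0^2 + m_8^2 with m_8^2 = (4 m_K^2 + 3 m_pi^2 + m_eta^2) / 8."""
    m8_sq = (4.0 * m_K**2 + 3.0 * m_pi**2 + m_eta**2) / 8.0
    return math.sqrt(m_eta_prime**2 - m8_sq)

m0 = singlet_mass(m_eta_prime, m_K, m_pi, m_eta)
print(f"m_0 = {m0:.0f} MeV")
```

With these inputs the result lies within about one percent of the 860 MeV quoted above; the small difference reflects the choice of input masses.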

A quantitative estimate of this contribution has been obtained by Witten and Veneziano [57, 52], who have derived a relationship linking $m_0$ to the fluctuations in the topological charge density of the pure gauge field. Witten’s approach consists of two basic ideas. The first comes from the observation that the $\theta$ (vacuum angle parameter) dependence of pure SU(N) gauge theory to leading order in 1/N vanishes in the presence of massless quark loops. This leads to an apparent paradox, since the effect of such fermion loops on physical processes is of next to leading order. The second idea consists in resolving this paradox by determining the matrix element $\langle 0 | F\tilde{F} | \eta' \rangle$ in two different ways. Here F is the field strength tensor and $\tilde{F}$ its dual. In one of the methods, one inserts a complete set of one-hadron-pole intermediate states (consisting of glueballs and mesons) in the two point function

$$U(k) = \int d^4x\, e^{ikx}\, \langle T\, F\tilde{F}(x)\, F\tilde{F}(0) \rangle. \qquad (4.3)$$

To the lowest order in $1/N_c$, the sum over intermediate states gives

$$U(k) = \sum_{glueballs} \frac{N_c^2 a_n^2}{k^2 - M_n^2} + \sum_{mesons} \frac{N_c c_n^2}{k^2 - m_n^2}, \qquad (4.4)$$

where $M_n$ is the mass of the nth glueball state, $m_n$ is the mass of the nth meson state, and

$$N_c a_n = \langle 0 | F\tilde{F} | n\text{th glueball} \rangle,$$

$$\sqrt{N_c}\, c_n = \langle 0 | F\tilde{F} | n\text{th meson} \rangle.$$

In perturbation theory, U(k) vanishes at zero momentum, as $F\tilde{F}$ is a total divergence. However, in the $1/N_c$ expansion, the above sum does not vanish at general values of k, but it must vanish at k = 0 in the presence of massless quarks, since the $\theta$ dependence of the pure gauge theory can be rotated away by chiral transformations of the quark fields. To the lowest order in $1/N_c$, the sum over the glueball states receives contributions from all diagrams without quark loops. Let this term be denoted by $U_0(k)$; at k = 0, U(k) vanishes if there exists at least one meson state whose mass squared is of order $1/N_c$, i.e.

$$U_0(0) = N_c\, c_{\eta'}^2 / m_{\eta'}^2. \qquad (4.5)$$

To the lowest order in $1/N_c$, the lightest pseudoscalar flavor singlet that satisfies this requirement is the $\eta'$ meson.

On the other hand, from the anomaly equation

one obtains,

since $\langle 0 | \partial_\mu J_\mu^5 | \eta' \rangle = \sqrt{N_f}\, m_{\eta'}^2 f_{\eta'}$. The approximation $f_{\eta'} = f_\pi$ is valid in the limit of large $N_c$, when the anomaly vanishes and the U(3) x U(3) symmetry is restored.

Comparing eqn 4.5 and eqn 4.7, one can obtain a formula for $m_{\eta'}$ in the world of massless quarks,

$$m_{\eta'}^2 = \frac{4 N_f}{f_\pi^2}\, \chi_t, \qquad (4.8)$$

where $\chi_t$ is the topological susceptibility.

The mass formula above, obtained by Witten [57], was modified by Veneziano for the real world case of light quarks with non-zero mass. In the real world, taking the number of light flavors to be three and $f_\pi$ = 9SMeV, the topological susceptibility in QCD turns out to be a number of the order of (185 MeV)^4 [1].

Note: In the world of massless quarks $m_{\eta'} = m_0$.

4.2 Lattice calculation of mo

Lattice QCD offers the challenge of calculating $m_0$ from first principles without resort to any argument based on large $N_c$. An independent measurement of the topological susceptibility on the lattice can then be used to verify equation 4.8. Unlike the continuum case, the topological charge of a pure gauge field on the lattice is not a well defined quantity. Owing to the non-existence of smooth gauge fields on the lattice, it is possible to continuously deform a lattice gauge field from one topological class to another. Nevertheless, there exist a few reasonable definitions, which are reviewed in reference [1].

As is typical of spectrum calculations on the lattice, the asymptotic behaviour of the two point function $\langle \eta'(x)\eta'(y) \rangle$ can be used to extract $m_{\eta'}$. A simple choice for the interpolating field $\eta'(x)$ is $\bar{Q}(x)\gamma_5 Q(x)$. Two types of contractions are possible: (i) the single quark loop or connected amplitude, which is the only contribution for the flavor non-singlet mesons, and (ii) in flavor singlet mesons, quark antiquark annihilation leads to the two loop or disconnected amplitude (Figure 4.2). In all the lattice studies so far [58, 32, 16], the quantity of interest has been the ratio R(t) of the disconnected two loop amplitude to the connected one loop amplitude, for which it is possible to obtain good signals and which can be related to the $\eta'$-octet mass splitting.

The observation that the ratio of the $\eta'$ correlator to the octet correlator decays exponentially as the mass difference between the two gives $R(t) = 1 - D \exp(-\Delta m\, t)$ when there are equal numbers of dynamical and valence flavors and $m_{dyn} = m_{val}$. Here D is a constant and $\Delta m$ is the difference $m_{\eta'} - m_8$. When there are $N_v$ valence and $N_f$ dynamical fermions, the normalization factor for the connected diagram of Figure 4.2 is $N_f/N_v$, whereas it is $(N_f/N_v)^2$ for the disconnected diagram. Thus R(t) on dynamical configurations with equal valence and dynamical fermion mass takes the form

$$R(t) = \frac{N_v}{N_f} \left[ 1 - D \exp(-\Delta m\, t) \right]. \qquad (4.11)$$

Figure 4.2: Contributions to the $\eta'$ correlator (connected and disconnected diagrams)

On quenched configurations, infinite iteration of the basic double pole vertex is not allowed, which means that equation 4.1 is truncated after the first two terms. Thus in the quenched approximation there is a double pole in the $\eta'$ propagator. Projecting the first two terms on the r.h.s. of eqn 4.1 onto the zero momentum state yields

$$\langle \eta'(t)\eta'(0) \rangle = \frac{e^{-m_8 t}}{2 m_8} - m_0^2\, \frac{(1 + m_8 t)\, e^{-m_8 t}}{4 m_8^3}. \qquad (4.12)$$

Taking the ratio of the second term to the connected correlator makes the quenched R(t) linear with slope $m_0^2 / 2 m_8$:

$$R(t) = \mathrm{const.} + \frac{m_0^2}{2 m_8}\, t. \qquad (4.13)$$

Here the constant term equals $m_0^2 / 2 m_8^2$. The linear behaviour of R(t) for quenched simulations arises from the double pole term in the $\eta'$ propagator.

This work is exclusively devoted to the extraction of the right hand side of equation 4.8 from the lattice. Previous lattice calculations of $m_0$ include that of Kuramashi et al. [58], who obtain a result of $m_0(N_f = 3) = 751(39)$ MeV using quenched configurations of size $12^3 \times 20$ at $\beta = 5.7$ with Wilson fermions. This result for $m_0$ is 10% lower than the experimental value of 860 MeV in the chiral limit. It would be interesting to simulate with dynamical fermions and study the effects of quenching. Masetti et al. [32] have used the bermion method (bosons with a fermionic action, on gauge configurations with an unphysical negative number of flavors) with Wilson discretization to study the flavor dependence of $m_{\eta'}$. They obtain a value for the $\eta' - \pi$ splitting consistent with that of Kuramashi et al.

Our work is different from that of the previously mentioned groups in several respects. The chief difference is of course that we use staggered fermions, which have better chiral properties. We also use smeared operators to suppress the contribution of excited states from the initial few time slices. Since our simulations are carried out on both dynamical and quenched gauge configurations, we are able to discern the different functional behaviour of $R(t)$ and hence study the flavor dependence of $m_{\eta'}$. We also derive the theoretical expressions for $R(t)$ for the quenched and the dynamical cases using partially quenched chiral perturbation theory (PQ$\chi$PT). This allows us to compare the simulation data for $R(t)$ with that predicted by theory.

This chapter is organized as follows. In section 4.3, we review the basic concepts of PQ$\chi$PT and then show the key steps in the derivation of expressions for $R(t)$ for various cases. This section is followed by a description of the details pertaining to the simulation, including the Wuppertal smearing procedure. In section 4.5, we discuss the results obtained from our simulation and compare them with the theoretical predictions of section 4.3. Almost all of the analysis programs required for carrying out this study were written by the author, although they are not included as part of the thesis. The analysis was carried out on Sun workstations and the programs were written in C.

4.3 Ratio from PQ$\chi$PT

The quenched approximation, which amounts to ignoring the effects of quark loops in the QCD correlation functions, introduces an uncontrollable systematic error.

Bernard and Golterman [4, 5] have developed a systematic technique based on chiral perturbation theory to study the $m_q$ dependence of physical quantities in the quenched and the partially quenched approximation. Partial quenching arises in numerical simulations in two common situations: (i) when simulating with two light flavors of dynamical fermions, the loops become quenched, and (ii) when the valence quark masses are chosen different from the dynamical quark masses. Chiral perturbation theory has proved to be a powerful tool for understanding the low energy sector of QCD, which consists of pions and their interactions. Numerical results, which are often obtained at values of quark mass much above that of the real world, have to be extrapolated to the chiral limit, for which chiral perturbation theory has proved to be a valuable guide. Also, since a fermion propagator is inversely proportional to its mass, the effect of suppressing light quark loops can be large. Thus a technique based on chiral perturbation theory can be expected to give valuable insight into the difference between quenched and full QCD.

The basic idea in the approach of Bernard and Golterman is based on the observation that if a scalar quark ($\tilde{q}$) is added to the QCD Lagrangian, then the scalar determinant can cancel the quark determinant [37]. For concreteness, if we assume there are $N_v$ quarks of masses $m_i$ ($i = 1, 2, \ldots, N_v$), then adding $k$ pseudoquarks of masses $\tilde{m}_j$ ($j = 1, 2, \ldots, k$) such that $\tilde{m}_j = m_j$ would mean that the first $k$ quarks are quenched and the remaining $N_v - k$ unquenched. The low energy effective chiral theory would then describe interactions among all kinds of pseudoscalar mesons: bound states of quarks and antiquarks, quarks and scalar antiquarks, scalar quarks and their antiquarks, and scalar quarks and fermion antiquarks. The usual chiral Lagrangian is extended to incorporate the $\eta'$ in such a way that it has $U(3) \otimes U(3)$ symmetry. Addition of scalar quarks promotes this symmetry to the graded group $U(3|3) \otimes U(3|3)$ to account for the symmetry under the interchange of bosons and fermions. The reader is referred to [4, 5] for discussion regarding the detailed form of the Lagrangian and its symmetry properties. For the purposes of this chapter, it is sufficient to write down the Green's function in momentum space in the basis of the states corresponding to $\bar{u}u$, $\bar{d}d$, $\ldots$ and their pseudoquark counterparts.

$$G_{ij}(p) = \frac{\delta_{ij}\,\epsilon_i}{p^2 + M_i^2} - \frac{m_0^2}{(p^2 + M_i^2)(p^2 + M_j^2)\,F(p^2)} \qquad (4.14)$$

where

$$F(p^2) = 1 + m_0^2 \sum_{d=k+1}^{N_v} \frac{1}{p^2 + M_d^2} \qquad (4.15)$$

$M_i^2$ is the square of the mass of the meson composed of quark flavor $i$ and

$$\epsilon_i = \begin{cases} +1 & 1 \le i \le N_v \\ -1 & N_v + 1 \le i \le N_v + k \end{cases} \qquad (4.16)$$

One of the terms in the Lagrangian is a momentum dependent self-interaction term for the $\eta'$ with the coefficient $\alpha$, which has been neglected in writing down the momentum space propagator above. It can be reinstated at any time by the substitution $m_0^2 \to m_0^2 + \alpha p^2$. When $M_i = M_j$ the first term is simply the neutral meson propagator in the absence of flavor singlet interactions. The second term is obtained from summing the infinite geometric series obtained by iterating the flavor singlet interaction. Since $M_d$ is the mass of the mesons composed of unquenched quarks, when $M_i \ne M_d$ and $M_i = M_j$ this term acquires the unphysical double pole structure associated with quenching.

When simulating with staggered fermions on $N_f = 2$ dynamical configurations, the variables $N_v$ and $k$ take the values 4 and 2 respectively. Specifically, when all the valence flavors have the same mass $m_{val}$ and all the dynamical flavors have mass $m_{dyn}$ but $m_{val} \ne m_{dyn}$ (in a typical simulation this situation would arise if one were simulating with staggered fermions on dynamical configurations), the Green's function above takes the form

$$G(p) = \frac{1}{p^2 + M_i^2} - \frac{m_0^2\,(p^2 + M_d^2)}{(p^2 + M_i^2)^2\,(p^2 + M_d^2 + N_f m_0^2)} \qquad (4.17)$$

With all this machinery in place, it is straightforward to obtain an expression for $R(t)$ for the most general case (as far as actual simulations are concerned) of $N_v$ valence fermions of equal mass and $N_f$ dynamical fermions with their masses not equal to the former. Projecting the first term onto the zero momentum state gives the familiar exponentially decaying residue $e^{-M_i t}$. The second term has a double pole at $\pm i M_i$ and a single pole at $\pm i m_{\eta'}$ with $m_{\eta'} = \sqrt{M_d^2 + N_f m_0^2}$. Projection onto the zero momentum state reveals that the single pole has the same exponentially decaying residue, proportional to $e^{-m_{\eta'} t}$, but the residue of the double pole term is a little ugly; it can be simply classified into terms that are constant in time and a term that increases linearly with time. Collecting them all together, the expression for $R(t)$ turns out to be†

$$R(t) = \frac{N_v}{N_f}\left[A\,t - B \exp(-\Delta m\, t) + C\right] \qquad (4.18)$$

where

$$A = \frac{(m_{\eta'}^2 - m_d^2)(m_d^2 - m_8^2)}{(m_{\eta'}^2 - m_8^2)\,2 m_8} \qquad (4.19)$$

$$B = \frac{m_8\,(m_d^2 - m_{\eta'}^2)^2}{m_{\eta'}\,(m_{\eta'}^2 - m_8^2)^2} \qquad (4.20)$$

$$C = \frac{(m_{\eta'}^2 - m_d^2)^2}{(m_{\eta'}^2 - m_8^2)^2} + \frac{(m_{\eta'}^2 - m_d^2)(m_d^2 - m_8^2)}{2 m_8^2\,(m_{\eta'}^2 - m_8^2)} \qquad (4.21)$$

$$m_8^2 = M_i^2 = M_j^2, \quad m_d^2 = M_d^2 \quad \text{and} \qquad (4.22)$$

$$m_{\eta'}^2 = m_d^2 + N_f m_0^2. \qquad (4.23)$$

† All the formulae have been derived in the infinite volume limit.

As a check, we note that the complicated expressions above reduce to the simple form of eqn (4.11) when all the quark and pseudoquark masses are degenerate, with the coefficient $D$ given by $m_8/m_{\eta'}$. As another check, it is also possible to obtain eqn (4.11) independently by considering the form of the propagator in momentum space in the degenerate limit ($M_i = m_8$ for all $i$):

$$G(p) = \frac{1}{p^2 + m_8^2} - \frac{1}{N_f}\,\frac{1}{p^2 + m_8^2} + \frac{1}{N_f}\,\frac{1}{p^2 + m_8^2 + N_f m_0^2} \qquad (4.24)$$

It is straightforward to verify that projecting each term above onto the zero momentum state and taking the ratio with the connected correlator yields equation 4.11. Likewise, in the $m_d \to \infty$, $N_f \to 0$ limit, equation 4.18 gives the quenched form shown in equation 4.13.
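The degenerate-mass check described above can also be carried out numerically. The snippet below is a minimal sketch, not the analysis code used for this study: the function names are hypothetical, and the coefficients are written in the form obtained by a partial-fraction decomposition of the second term of eqn (4.17) followed by projection onto zero momentum, which is an assumption about how the coefficients of eqn (4.18) are arranged.

```python
import math

def ratio_coefficients(m8, md, m0, nf):
    """Return (A, B, C, delta_m) for R(t) = (Nv/Nf) [A t - B exp(-delta_m t) + C].

    m8: valence (octet) meson mass, md: meson mass built from dynamical quarks,
    m0: flavor singlet vertex strength, nf: number of dynamical flavors.
    Assumed form: partial fractions of eqn (4.17), projected onto zero momentum.
    """
    m_eta_sq = md**2 + nf * m0**2            # m_eta'^2 = m_d^2 + N_f m_0^2
    m_eta = math.sqrt(m_eta_sq)
    gap = m_eta_sq - m8**2                   # m_eta'^2 - m_8^2
    num = m_eta_sq - md**2                   # equals N_f m_0^2
    a = num * (md**2 - m8**2) / (2.0 * m8 * gap)
    b = m8 * num**2 / (m_eta * gap**2)
    c = num**2 / gap**2 + num * (md**2 - m8**2) / (2.0 * m8**2 * gap)
    return a, b, c, m_eta - m8

def ratio(t, m8, md, m0, nf, nv):
    """Evaluate R(t) for nv valence and nf dynamical flavors."""
    a, b, c, delta_m = ratio_coefficients(m8, md, m0, nf)
    return (nv / nf) * (a * t - b * math.exp(-delta_m * t) + c)

# Degenerate limit (m_d = m_8): expect A = 0, C = 1 and B = m_8 / m_eta'.
print(ratio_coefficients(m8=0.3, md=0.3, m0=0.4, nf=2))
```

With `md == m8` the linear term drops out and the bracket collapses to $1 - (m_8/m_{\eta'}) \exp(-\Delta m\, t)$, which is the check quoted in the text.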

The various formulas above for $R(t)$ essentially have one unknown parameter, either $m_0^2$ or $m_{\eta'}$. We are interested in verifying one of two possibilities: (i) whether the data can be fit to determine the one parameter, or (ii) if the data were fit freely, whether the sizes of the coefficients $A$, $B$, $C$, and $D$ remain the same as predicted by theory.

We have ratio data on quenched, $N_f = 2$ and $N_f = 4$ configurations. Our ratio data on $N_f = 2$ configurations can be further divided into $m_{val} = m_{dyn}$ and $m_{val} \ne m_{dyn}$. Quenched and dynamical $m_{val} = m_{dyn}$ data can be subjected to both one and two parameter fits, while the $m_{val} \ne m_{dyn}$ data can be subjected to one and four parameter fits, thus enabling comparison between theory and "experiment".

4.4 Simulation Details

4.4.1 Ensemble

The propagators required for computing the disconnected and the connected amplitudes were computed on configurations of size $16^3 \times 32$. The statistical ensemble used and the valence quark masses at which the simulation was performed are shown in Table 4.1. The dynamical configurations were borrowed from Columbia, while the quenched configurations were generated on the Cray T3D at the Ohio Supercomputer Center (OSC) by Jeff Grandy. All the configurations listed in Table 4.1 have the same lattice spacing of about 0.1 fm obtained from $m_\rho$. The values of $m_{val}$ on the quenched configurations were chosen 10% higher than those on the dynamical configurations, in order to bring the corresponding values of $m_8$ into more precise agreement. The staggered propagators were computed using the conjugate gradient method on sets of processors ranging from 16 to 64 nodes of the Cray T3D at OSC and at Los Alamos's Advanced Computer Laboratory, with a sustained performance of 45 Mflops per node.

N_f    m_dyn     beta    samples    m_val
0      infinity  6.0     83         0.011, 0.022, 0.033
2      0.01      5.7     79         0.01, 0.02, 0.03
2      0.015     5.7     50         0.01, 0.015
2      0.025     5.7     34         0.01, 0.025
4      0.01      5.4     70         0.01

Table 4.1: The Statistical Ensemble

4.4.2 Propagators

The pseudoscalar operator that creates or destroys an $\eta'$ in the staggered formalism is $\bar{Q}\,(\gamma_5 \otimes I)\,Q$. It is a distance 4 operator in which the quark and the antiquark sit at opposite corners of the hypercube. We put in explicit links connecting the quark and the antiquark across the edges of the hypercube and averaged over all the 24 possible paths to make the operator gauge invariant. As was mentioned earlier, the two point correlation function of this operator yields both the connected one loop and the disconnected two loop amplitudes of Figure 4.2. Two propagators need to be computed for calculating the one loop amplitude. One propagator was calculated using a delta function source with a local random phase $\eta_x = \exp(i\theta(x))$ at each spatial site and for each color on time slices $t = 1$ and $t = 2$. When averaged over all noise samples, the noise $\eta_x$ satisfies $\langle \eta_x \eta_y^\dagger \rangle = \delta_{xy}$. The second propagator was calculated by translating the noisy source at each site (made gauge invariant by putting in links) a hypercubic distance 4 and putting in the phase appropriate for the $\eta'$ operator.

The appropriate phase for the $\eta'$ is obtained from evaluating the trace defined in chapter 2. Transporting the source and calculating the antipropagator separately is the price paid for the non-locality of the staggered flavor singlet operator. For each configuration we took two noise samples with the lattice doubled in the time direction.

Very few lattice calculations of $m_{\eta'}$ exist because of the difficulty involved in getting good signals for the disconnected amplitude using reasonable computer time [39].

Ideally one would like to have a disconnected loop at each spatial site across the entire lattice, which means that the total number of propagator inversions required equals the volume of the lattice! Typically quark propagators for spectrum determination are calculated at $m_q a < 0.03$. For the size of lattice that we have used in our simulation, it takes about 200 seconds to calculate a quark propagator at $m_q a = 0.03$ on 64 processors of the Cray T3D. When repeated $16^3 \times 32$ times, the total elapsed time is about 300 days!
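The cost estimate above is simple arithmetic. As a sanity check, here is a short sketch that uses only the numbers quoted in the text (200 seconds per inversion and one inversion per site of a $16^3 \times 32$ lattice); the variable names are illustrative.

```python
# Cost of one propagator inversion per lattice site, using the figures quoted in the text.
sites = 16**3 * 32                  # number of sites on a 16^3 x 32 lattice
seconds_per_inversion = 200.0       # quoted time for one inversion on 64 T3D processors
total_seconds = sites * seconds_per_inversion
total_days = total_seconds / 86400.0

print(f"{sites} inversions, about {total_days:.0f} days")
```

This gives 131072 inversions and roughly 303 days, in line with the "about 300 days" quoted above.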

We have addressed this problem by using a noisy source like the one used for the connected amplitude but placed on all sites of the lattice, which we call the U(1) noise source. Inverting on this noisy source solves for the pseudofermion field $\phi$, so called because it is a scalar whose equation of motion is

$$(\slashed{D} + m)\,\phi = \eta \qquad (4.25)$$

obtained from the action

Here $\eta$ is the U(1) noise source. Now since

xz

It is easy to verify

By definition, the quark propagator $G_{xy}$ can be written as

= E(P + «,2 m2)

For $|x - y|$ even, comparison of eqns 4.28 and 4.29 gives the estimator

$$G_{xy} = \langle m\,\phi_x \phi_y^\dagger \rangle \qquad (4.29)$$

Likewise, another useful estimator that is independent of whether $|x - y|$ is even or odd is $G_{xy} = \langle \phi_x \eta_y^\dagger \rangle$. This estimator could be useful, for example, in analyzing signals in the other flavor singlet channel which is local in time: $\gamma_4 \otimes I$. However, the data in this channel are too noisy, and we restricted ourselves to analyzing the signals in the distance 4 flavor singlet channel. Although either of the estimators can be used for this channel, eqn 4.29 is the better one since its fluctuations are of order $\langle \bar{\chi}\chi \rangle^2$, while for the other they are of order $\langle \bar{\chi}\chi \rangle / m_{val}$. Here $\langle \bar{\chi}\chi \rangle$ is the condensate; it lies between 0 and 1, making $\langle \bar{\chi}\chi \rangle^2$ significantly less than $\langle \bar{\chi}\chi \rangle / m_{val}$ for the range of light quark masses such as those we use here†. We used 16 noise samples per configuration, again on a doubled lattice.

† This relationship can be shown to be valid for every mass value.
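The two noise estimators discussed above can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the production code: a small random antihermitian matrix that only connects even and odd sites stands in for the staggered Dirac operator, a dense linear solve stands in for the conjugate gradient inversion, and all function names are hypothetical.

```python
import numpy as np

def staggered_like_matrix(n, mass, seed=1):
    """Toy stand-in for (D + m): D is antihermitian and only connects even and odd sites."""
    rng = np.random.default_rng(seed)
    hop = 0.3 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
    parity = np.arange(n) % 2
    hop = np.where(parity[:, None] != parity[None, :], hop, 0.0)
    dirac = hop - hop.conj().T                    # antihermitian, even <-> odd hopping only
    return dirac + mass * np.eye(n)

def noisy_estimators(matrix, mass, n_samples, seed=0):
    """Return the noise averages <phi eta^dagger> and m <phi phi^dagger>, with matrix @ phi = eta."""
    rng = np.random.default_rng(seed)
    n = matrix.shape[0]
    est_eta = np.zeros((n, n), dtype=complex)
    est_phi = np.zeros((n, n), dtype=complex)
    for _ in range(n_samples):
        eta = np.exp(2j * np.pi * rng.random(n))  # U(1) noise: one random phase per site
        phi = np.linalg.solve(matrix, eta)        # inversion on the noisy source
        est_eta += np.outer(phi, eta.conj())
        est_phi += mass * np.outer(phi, phi.conj())
    return est_eta / n_samples, est_phi / n_samples

matrix = staggered_like_matrix(8, 1.0)
exact = np.linalg.inv(matrix)
est_eta, est_phi = noisy_estimators(matrix, 1.0, 4000)
parity = np.arange(8) % 2
same = parity[:, None] == parity[None, :]         # site pairs with even separation
print(np.max(np.abs(est_eta - exact)), np.max(np.abs(est_phi[same] - exact[same])))
```

In this toy setting the first estimator reproduces every element of the inverse, while the second agrees with it only between sites of equal parity, mirroring the "$|x - y|$ even" condition in the text. The sketch says nothing about the relative size of the fluctuations on real gauge configurations.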

4.4.3 Wuppertal Smearing

Figure 4.3 shows a typical ratio plot obtained on quenched and dynamical configurations ($N_f = 2$, $m_{dyn} = 0.01$). Since the disconnected data is noisy, the ratio data exists only for the first few time slices, where the contribution due to excited states could be non-negligible. A typical interpolating field used to create (say) a pion is of the form $\bar{q}(x)\gamma_5 q(x)$, where $\bar{q}(x)$ and $q(x)$ are creation operators for a quark and an antiquark respectively and are point-like. A typical hadron like the pion is an extended object with a charge radius of nearly a fermi, and on the typical lattices ($a = 0.1$ fm) used in simulations it is spread across several lattice spacings. Therefore, point-like operators can be expected to perform poorly for projecting out the ground states at small $t$. One possibility that will overcome this problem is to construct operators that will have a large overlap with the ground state,

$$O(t) = \sum_{x,y} \bar{q}(x,t)\,\phi(x,y)\,q(y,t)$$

Here $\phi(x,y)$ is a gauge-covariant c-number function chosen to minimize the overlap with the excited states. Reducing the excited state contributions from the first few time slices makes the data more reliable for mass extraction.

Several methods have been experimented with for choosing the appropriate trial wave function. We have used the Wuppertal smearing technique [22], in which one obtains an exponentially decaying wavefunction from solving

$$(-\nabla^2 + \kappa^2)\,\phi(x) = \delta_{x,0} \qquad (4.31)$$

where $\kappa^2$ is a parameter that can be tuned to control the spread of the wavefunction.

Figure 4.3: Ratio using local operators. (Plot label: LL, $m_q = 0.01$.)

On the lattice, for staggered fermions and to make the smearing procedure gauge invariant, the operator takes the form

1 -+ - ^ Uf,{x + /z) 6x',x+2ft M=1

We used CG for inverting eqn 4.31 for different values of $\kappa^2$ and smeared at both the source and the sink end of the propagator. The computational overhead for smearing is negligibly small for values of $\kappa^2$ far away from its critical value. Near the critical value, the computational overhead can go up to as much as 10% of the cost of calculating the staggered quark propagator at $m_q = 0.01$. The wavefunction $\phi(x)$ thus obtained for a given value of $\kappa^2$ can be characterized by an rms radius defined by

$$r^2 = \frac{\sum_x |x|^2\,|\phi(x)|^2}{\sum_x |\phi(x)|^2}$$

We experimented with different values of $\kappa^2$ and determined that the critical value occurred near $\kappa^2 = -0.64$. The values of $\kappa^2$ that are relevant to this study are those near the critical value, since this corresponds to a maximum spread in the wavefunction without losing the exponential behaviour. The form of the wavefunction obtained on one of the $N_f = 2$ ($m_{dyn} = 0.01$) configurations is shown in Figure 4.4.

Figure 4.4: Wavefunction from Wuppertal smearing. (Plot label: $\kappa^2 = -0.58$.)
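To make the role of $\kappa^2$ and the rms radius concrete, the sketch below solves the free-field analogue of eqn 4.31 on a small periodic lattice and measures the spread of the result. It is only an illustration under stated assumptions: the gauge links are set to one, the ordinary nearest-neighbour Laplacian is used instead of the staggered operator of this study, $\kappa^2$ is therefore positive and not comparable to the values quoted in the text, and the function names are hypothetical.

```python
import numpy as np

def smearing_wavefunction(size, kappa2):
    """Solve (-laplacian + kappa2) phi = delta_{x,0} on a periodic size^3 lattice with unit links."""
    p = 2.0 * np.pi * np.fft.fftfreq(size)
    lap = 4.0 * np.sin(p / 2.0) ** 2              # eigenvalues of -d^2/dx^2 in one direction
    denom = kappa2 + lap[:, None, None] + lap[None, :, None] + lap[None, None, :]
    return np.fft.ifftn(1.0 / denom).real         # Fourier transform of the delta source is 1

def rms_radius(phi):
    """r = sqrt( sum_x |x|^2 |phi(x)|^2 / sum_x |phi(x)|^2 ), with periodic distances from the origin."""
    size = phi.shape[0]
    d = np.minimum(np.arange(size), size - np.arange(size))
    r2 = d[:, None, None] ** 2 + d[None, :, None] ** 2 + d[None, None, :] ** 2
    weight = np.abs(phi) ** 2
    return np.sqrt((r2 * weight).sum() / weight.sum())

for kappa2 in (1.0, 0.3, 0.1):
    print(kappa2, rms_radius(smearing_wavefunction(16, kappa2)))
```

In this free-field setting, lowering $\kappa^2$ toward the point where the operator becomes singular increases the rms radius, which is the qualitative behaviour exploited in the text.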

For this study, one propagator of the connected correlator $W_c$ is calculated with the source smeared to a fixed radius corresponding to $\kappa^2 = -0.6$ and a point-like sink. This was tied to five different anti-propagators with a point-like source but smeared at the sink end corresponding to different values of $\kappa^2$ (see Figure 4.5). Thus the smeared valence propagators that we have calculated correspond to the set of correlators shown in Table 4.2†.

A typical effective mass plot obtained on the $N_f = 2$, $m_{dyn} = 0.01$ configurations at $m_{val} = 0.01$ is shown in Figure 4.6 for the correlator LL. Since a decrease in $\kappa^2$ increases the spread in the wavefunction, among the 5 correlators shown in Table 4.2, C5 is expected to perform the best, which is evident from comparing Figure 4.6 and Figure 4.7.

† In Table 4.2, $\gamma_5 \otimes I$ has been omitted to avoid crowding.

Figure 4.5: Correlators from smeared operators. (Panels: Smeared Connected Correlator, Smeared Disconnected Correlator.)

Correlator   Contraction                                                kappa^2 at source   kappa^2 at sink
LL           $\bar{Q}(x)Q(x)\;\bar{Q}(0)Q(0)$                           no smearing         no smearing
C1           $\bar{Q}(x)Q(x)\;\bar{Q}_{w_4}(0)Q(0)$                     -0.6                no smearing
C2           $\bar{Q}_{w_1}(x)Q(x)\;\bar{Q}_{w_4}(0)Q(0)$               -0.6                -0.5
C3           $\bar{Q}_{w_2}(x)Q(x)\;\bar{Q}_{w_4}(0)Q(0)$               -0.6                -0.53
C4           $\bar{Q}_{w_3}(x)Q(x)\;\bar{Q}_{w_4}(0)Q(0)$               -0.6                -0.56
C5           $\bar{Q}_{w_4}(x)Q(x)\;\bar{Q}_{w_4}(0)Q(0)$               -0.6                -0.6

Table 4.2: Smeared valence operators

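We also defined the correlator

$$\left\langle \left(\cos\theta\,\bar{Q}(x)Q(x) + \sin\theta\,\bar{Q}_{w_4}(x)Q(x)\right)\left(\cos\theta\,\bar{Q}(0)Q(0) + \sin\theta\,\bar{Q}_{w_4}(0)Q(0)\right) \right\rangle \qquad (4.33)$$

which amounts to taking the linear combination

$$\cos^2\theta\;\mathrm{LL} + \sin^2\theta\;\mathrm{C5} + 2\cos\theta\,\sin\theta\;\mathrm{C1} \qquad (4.34)$$

We determined the angle $\theta$ for which the value of the effective mass at $t = 1$ was the lowest, and this corresponded to $\theta = 2.5$ rad. We denote this linear combination of correlators as LLC5; Figure 4.7 shows that it is slightly better than C5.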
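The angle scan described above can be sketched on synthetic data. The example below is illustrative only: the correlators are two-state toy models with made-up masses and overlaps (not data from this study), the effective mass is taken as $\ln[C(t)/C(t+1)]$, which is an assumed definition, and the function names are hypothetical.

```python
import numpy as np

def effective_mass(corr):
    """m_eff(t) = ln[ C(t) / C(t+1) ] for a correlator sampled at t = 0, 1, 2, ..."""
    return np.log(corr[:-1] / corr[1:])

def combine(theta, ll, c5, c1):
    """cos^2(theta) LL + sin^2(theta) C5 + 2 cos(theta) sin(theta) C1."""
    c, s = np.cos(theta), np.sin(theta)
    return c * c * ll + s * s * c5 + 2.0 * c * s * c1

def best_angle(ll, c5, c1, t=1, n_grid=2001):
    """Scan theta in [0, pi] and return (theta, m_eff) minimizing the effective mass at time t."""
    thetas = np.linspace(0.0, np.pi, n_grid)
    masses = np.array([effective_mass(combine(th, ll, c5, c1))[t] for th in thetas])
    best = int(np.argmin(masses))
    return thetas[best], masses[best]

# Toy two-state correlators: ground state mass 0.3, excited state mass 0.9 (made-up numbers).
t = np.arange(10)
ground, excited = np.exp(-0.3 * t), np.exp(-0.9 * t)
ll = 1.0 * ground + 0.64 * excited    # local-local: overlaps (1.0, 0.8) squared
c5 = 1.0 * ground + 0.04 * excited    # smeared-smeared: overlaps (1.0, 0.2) squared
c1 = 1.0 * ground + 0.16 * excited    # smeared-local cross term: products of the overlaps
print(best_angle(ll, c5, c1))
```

In this toy model the minimizing angle is the one at which the excited-state overlap of the combined operator vanishes, so the effective mass at $t = 1$ drops to the ground state mass.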

Figure 4.6: Effective mass plot for $m_8$ with LL correlator. (Plot label: LL, $m = 0.01$.)

To calculate the corresponding disconnected contributions of the correlators listed in Table 4.2, the sink end of the pseudofermion propagator ($\phi'$) was smeared to 4 different radii corresponding to $\kappa^2 = -0.5$, $-0.53$, $-0.56$ and $-0.6$ (see Figure 4.5). Accordingly, the quark propagator was obtained from the estimator $G'_{xy} = \langle m\,\phi'_x \phi_y^\dagger \rangle$.

Figure 4.7: Effective mass plot with C5 and LLC5 correlators. (Plot label: C5, $m = 0.01$.)

Figure 4.8 compares the ratio plot with and without smearing on dynamical configurations. In the initial time slices the two data sets differ, but they begin to coincide after a few time slices, as they should.

4.5 Results

In the plot of Figure 4.9, the different curves represent the fits to the ratio data obtained from the various configurations listed in Table 4.1. For $N_f = 2$ and $N_f = 4$, the points shown correspond to $m_{val} = m_{dyn}$, while for the quenched data the points correspond to $m_{val} = 0.011$. The quenched data is fit linearly according to equation 4.13 with both the intercept and the slope as free parameters. The dynamical data is fit according to equation (4.11) with both $m_{\eta'}$ and $D$ as free parameters. A remarkable observation to be gleaned from this graph is that the ratio data is distinctly different for different numbers of dynamical flavors and in accordance with the predicted theoretical form. The more interesting question of the deviation from the predicted theoretical values for the parameters is dealt with further below.

Figure 4.8: Dynamical ratio from local and smeared operators. (Plot labels: $m_{val} = 0.01$, $m_{dyn} = 0.01$; LL, LLC5.)

Our values of $m_8$ calculated on quenched and $N_f = 2$ dynamical configurations are plotted as a function of the valence quark mass in Figure 4.10. In the chiral limit $m_8$ does not quite vanish as it is supposed to in the continuum. This is because in the staggered fermion formulation, the lattice regulator breaks axial $SU(4)$ down to a $U(1)$ associated with one particular flavor non-singlet generator. Only the meson associated with that generator is massless at finite lattice spacing. For our purpose, it is most appropriate to use the singlet generator.

From the linear fit to the quenched data for $m_{val} = 0.011$, 0.022 and 0.033, we extract $m_0$ from the slope. A plot of $m_0^2$ against $m_{val}$ and the extrapolation to zero quark mass is shown in Figure 4.11. $m_0^2$ increases with decreasing quark mass and does not vanish in the chiral limit. Smearing has lowered the value of the intercept obtained in the zero quark mass limit, bringing it closer to the real world value.

Figure 4.9: Flavor dependence of $R(t)$. (Plot labels: $N_f = 0$; $N_f = 2$ with $m_{dyn} = 0.01$, 0.015, 0.025; $N_f = 4$ with $m_{dyn} = 0.01$.)

Figure 4.10: $m_8^2$ vs $m_{val}$ on $N_f = 0$ and $N_f = 2$ configurations. (Plot labels: LL and LLC5; $N_f = 0$ panel and $N_f = 2$, $m_{dyn} = m_{val}$ panel.)

Two parameter fits, according to eqn 4.11, of the dynamical $m_{val} = m_{dyn}$ data yield $\Delta m$ and hence $m_{\eta'}$. Extrapolating the values of $m_{\eta'}$ so extracted, we find that it does not vanish in the chiral limit (Figure 4.12). As seen in the quenched case, the values of $m_{\eta'}$ both at finite and zero quark mass from the correlator LLC5 are lower than the corresponding points from the local correlator LL. Errors quoted on the ratio data and all the mass values are obtained from a single-elimination jackknife.
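For readers unfamiliar with the error analysis mentioned above, a single-elimination jackknife can be sketched as follows. This is a generic illustration rather than the analysis program used for this study (which was written in C); the function name and the toy numbers are hypothetical.

```python
import numpy as np

def jackknife(samples, estimator):
    """Single-elimination jackknife: return (estimate on all samples, jackknife error).

    samples: array whose first axis labels configurations.
    estimator: function mapping a set of samples to a number (or array).
    """
    samples = np.asarray(samples)
    n = len(samples)
    full = estimator(samples)
    # Recompute the estimator n times, each time leaving out one configuration.
    reduced = np.array([estimator(np.delete(samples, i, axis=0)) for i in range(n)])
    spread = np.sum((reduced - reduced.mean(axis=0)) ** 2, axis=0)
    return full, np.sqrt((n - 1) / n * spread)

# Toy use: a ratio of averages, as for disconnected/connected data on one time slice.
toy = np.array([[0.9, 2.1], [1.1, 1.9], [1.0, 2.0], [1.2, 2.2], [0.8, 1.8]])
print(jackknife(toy, lambda s: s[:, 0].mean() / s[:, 1].mean()))
```

The point of the procedure is that the ratio is recomputed from averaged data each time, so the error on a nonlinear quantity such as $R(t)$ or a fitted mass follows from the spread of the reduced-sample estimates.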

We could not obtain stable 4 parameter fits to our $m_{val} \ne m_{dyn}$ data. The known theoretical forms for the ratio obtained from lowest order PQ$\chi$PT were used for one parameter fits of the $m_{val} \ne m_{dyn}$ and $m_{val} = m_{dyn}$ data. Figure 4.13 shows the theoretical fit obtained for $m_{val} = 0.02$ on the $N_f = 2$, $m_{dyn} = 0.01$ configurations using smeared operators. A one parameter fit to the data obtained with local operators gave reasonable $\chi^2$ only if the fit range began from $t > 3$. This suggests that the local data is heavily contaminated with excited states, since the formulae derived in section 4.3 hold for asymptotic times. Doing a similar job for the $m_{val} = 0.03$ data and extrapolating the values of $m_{\eta'}$ obtained from both to the chiral limit, we obtain $m_{\eta'}(N_f = 3) = 876 \pm 16$ MeV, which is remarkably consistent with the value obtained from fully quenched and dynamical data (Table 4.4).

Figure 4.11: Chiral extrapolation of quenched $m_0$. (Plot labels: LL, LLC5, Expt; horizontal axis $m_{val}$.)

Figure 4.12: Chiral extrapolation of $m_{\eta'}$ from $N_f = 2$ configurations. (Plot labels: LLC5, Expt; horizontal axis $m_{val}$.)

Figure 4.13: One parameter fit to $R(t)$ with $m_{val} = 0.02$ and $m_{dyn} = 0.01$.

The one parameter fit to the $m_{val} < m_{dyn}$ data did not yield reasonable $\chi^2$. However, for the $m_{val} = m_{dyn}$ data, it was possible to extract the deviation from the lowest order PQ$\chi$PT as determined by the observable $Z'/Z$, the ratio of the residues for creating $\eta'$ and $\eta$:

$$\frac{Z'}{Z} = \frac{|\langle 0|\,\eta'(0)\,|\eta'\rangle|^2}{|\langle 0|\,\eta(0)\,|\eta\rangle|^2} \qquad (4.35)$$

In PQ$\chi$PT, this ratio is 1 based on the assumption $f_{\eta'} = f_\eta$. Therefore, in PQ$\chi$PT as shown in section 4.3, the parameter $D$ in eqn 4.11 equals $m_8/m_{\eta'}$ instead of $(Z'/Z)\,m_8/m_{\eta'}$. From the 2 parameter fit to the $m_{val} = m_{dyn}$ data one can then determine the ratio $Z'/Z$.

The values we extract for $Z'/Z$ are plotted as a function of $m_{val}$ in Figure 4.14, both for smeared and local data. It can be seen that the data prefer $Z'/Z$ to be 20-30% above unity, which is typically the size of higher order chiral and $O(1/N_c)$ corrections.

Figure 4.14: $Z'/Z$ vs. $m_{val}$.

In the analysis so far, we had neglected the effect of the momentum dependence of $m_0$, i.e. we had been working in a theory with $\alpha$, the coefficient of the kinetic energy term of the $\eta'$ in the chiral Lagrangian [5], set to zero. $\alpha$ is a small quantity which contributes at next-to-leading order in a combined expansion in $1/N_c$ and quark mass [4].

Including $\alpha$ simply shifts the strength of the flavor singlet interaction vertex from $m_0^2$ to $m_0^2 + \alpha p^2$. While the form of $R(t)$ remains the same as before for all three cases, the coefficients of the time dependent terms and the constants become functions of two unknown parameters, $m_0^2$ and $\alpha$. For the quenched case, we obtain

$$R(t) = \frac{m_0^2 - \alpha m_8^2}{2 m_8}\,t + \frac{m_0^2 + \alpha m_8^2}{2 m_8^2} \qquad (4.36)$$

It is clear from the above formula that the value of $m_0^2$ is shifted by a small amount at finite quark mass, and indeed this is borne out by the quenched ratio data when fit to the above form. We see a 10%-15% downward shift in the values of $m_0^2$ obtained from the slope and the intercept of the fit at nonzero quark mass. When extrapolated to the chiral limit, the value of $m_0^2$ is left unaffected (within the errors quoted in Table 4.4) by the presence of the momentum dependent interaction term.
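The way a linear fit separates $m_0^2$ from $\alpha$ can be sketched as follows. This is a minimal illustration under stated assumptions: it takes the quenched ratio to be linear with slope $(m_0^2 - \alpha m_8^2)/2m_8$ and intercept $(m_0^2 + \alpha m_8^2)/2m_8^2$ (the form that follows from substituting $m_0^2 \to m_0^2 + \alpha p^2$ in the double pole term), uses made-up input values, and the function names are hypothetical.

```python
import numpy as np

def quenched_ratio(t, m8, m0sq, alpha):
    """Assumed quenched form: R(t) = slope * t + intercept with a momentum dependent vertex."""
    slope = (m0sq - alpha * m8**2) / (2.0 * m8)
    intercept = (m0sq + alpha * m8**2) / (2.0 * m8**2)
    return intercept + slope * t

def m0sq_and_alpha(slope, intercept, m8):
    """Invert the slope and intercept of the linear fit for m_0^2 and alpha."""
    m0sq = m8 * slope + m8**2 * intercept
    alpha = intercept - slope / m8
    return m0sq, alpha

# Round trip on noiseless toy data (input values are made up for illustration).
t = np.arange(1, 9)
r = quenched_ratio(t, m8=0.3, m0sq=0.16, alpha=-0.3)
slope, intercept = np.polyfit(t, r, 1)      # degree-1 fit returns [slope, intercept]
print(m0sq_and_alpha(slope, intercept, m8=0.3))
```

With $\alpha = 0$ the two relations reduce to the slope $m_0^2/2m_8$ and constant $m_0^2/2m_8^2$ of the simple quenched form, so a nonzero $\alpha$ shows up as a mismatch between the $m_0^2$ values inferred separately from the slope and from the intercept.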

The statistical errors associated with the extracted values of $\alpha$ are rather large. Our results for $\alpha$ for each quark mass are shown in Table 4.3.

m_val    LL               LLC5
0.011    -1.7496(2321)    -0.2988(2091)
0.022    -0.8475(1301)    -0.1431(1195)
0.033    -0.4645(910)     -0.0399(702)

Table 4.3: $\alpha$ from the quenched ratio data.

On dynamical configurations, the presence of a nonzero $\alpha$ is reflected in the expression for $m_{\eta'}$. It now takes the form

$$m_{\eta'}^2 = \frac{m_8^2 + N_f m_0^2}{1 + N_f\,\alpha} \qquad (4.37)$$

The natural quantity that is extracted from fitting the dynamical $m_{val} = m_{dyn}$ data with equation 4.11 is $m_{\eta'}$, which cannot be used to determine both $m_0$ and $\alpha$ independently.

4.6 Conclusions

The observable $R(t)$, being sensitive to the number of dynamical flavors, has proved to be remarkably useful for extracting $m_{\eta'}$ in the chiral limit on both quenched and dynamical configurations. Our values for the $\eta'$-pseudoscalar octet mass splitting, summarized in Table 4.4, are consistent with the experimental number. Despite the large statistical errors, the results give credence to the argument that the quark-antiquark annihilation diagrams separate the $\eta'$ from the $\pi$. Our method of calculating the disconnected propagator has proved quite effective, enabling future lattice studies of quantities that have disconnected contributions. Since the statistical errors are large, we have not attempted to quantify the systematic errors. The obvious sources of systematic errors are the finite lattice spacing (our calculation was done at a lattice spacing of 0.1 fm) and the finite lattice volume. The systematic error introduced by the contribution of excited states has been addressed by the use of smeared operators. The effectiveness of using Wuppertal smeared operators can be seen by comparing the ratio data such as in Figure 4.8 and the numbers shown for the value of $m_{\eta'}$ in the chiral limit in Table 4.4. Other systematic errors include quenching and flavor symmetry breaking of staggered fermions. We have not taken into account the effect of $\eta - \eta'$ mixing. It would be interesting to simulate at lower quark masses (the lowest value we employed was $m_q a = 0.01$), which will shed light on the approach to the chiral limit by $m_0$.

         Quenched (MeV)    Dynamical, m_dyn = m_val (MeV)
LL       1156(95)          974(133)
LLC5     891(101)          780(187)

Table 4.4: $m_{\eta'}(N_f = 3)$ in the chiral limit.

CHAPTER 5

APPLICATIONS OF THE EIGENMODES OF THE DIRAC OPERATOR

5.1 Continuum Facts

The vacuum structure of QCD is quite complicated, exhibiting among other things quark confinement and spontaneous breaking of chiral symmetry. A qualitative picture of some of the non-perturbative aspects of the QCD vacuum emerged after the discovery of instantons [3], which are finite action solutions of the classical equations of motion in Euclidean space. For the classical Yang-Mills theory, such solutions are the self-dual and the anti-self-dual fields† with the action satisfying the condition

$$S = \frac{8\pi^2}{g^2}\,|Q| \qquad (5.1)$$

Here $Q$ is an integer and refers to the winding number of the mapping between the gauge field at asymptotic distance and the gauge group. A finite action configuration (whether or not it is a solution to the equations of motion) with one value of $Q$ cannot be continuously deformed into another by a nonsingular gauge transformation. In terms of the fields, $Q$ takes the form

$$Q = \int d^4x\, F_{\mu\nu}(x)\,\tilde{F}_{\mu\nu}(x) \qquad (5.2)$$

† There is no proof yet that these are the only finite action solutions of the Euclidean Yang-Mills equations.

The finite action solutions of the classical theory become topologically distinct vacua in the quantum theory. However, the true quantum vacua are obtained by taking a linear superposition, since the finiteness of the action makes tunneling highly probable. The fact that instantons can also be interpreted as topologically nontrivial tunneling paths was first discussed in references [24, 13]. The vacua in the quantum theory form a band of states described by a parameter $\theta$; the functional integral for the pure gauge theory now acquires an additional $\theta$-dependent term:

Z = / IdAI exp i( J Tr [ - + OQ ) (5.3)

In pure gauge theory and in the real world of nonzero quark mass, the $\theta$-dependent term violates CP, which goes under the name of the strong CP problem. However, in the presence of massless quarks, the tunneling amplitude is suppressed due to the presence of zero modes of the covariant Dirac operator. This results in different $\theta$ worlds becoming equivalent, since $\theta$ can be rotated away by a chiral transformation of the fermion fields. The index theorem (which we discuss in the next section) connects the topological charge of the gauge configuration to the change in the number of chiral fermions. It implies that going from one topologically distinct but degenerate vacuum to another in the presence of massless fermions is always accompanied by violation of the axial charge number. This provides a natural explanation for the non-existence of the U(1) axial symmetry and eliminates the need for the ninth Goldstone boson and hence the U(1) problem. Pedagogical discussions on instanton physics with references to the original literature are present in [10, 43].

When instantons "came into being", there was great hope that the problem of quark confinement could be explained. It was shown that confinement due to instantons (in the dilute gas approximation) does occur in the two dimensional Higgs model [14] and the 2+1 dimensional non-Abelian Higgs model [42]. Unfortunately, this is not the case in QCD because the Yang-Mills instantons come in all sizes; the large instantons in particular are associated with infrared divergences.

Instantons have been used in several phenomenological models to explain confinement and other hadronic properties. One of the successful models is the random instanton liquid model (RILM [47]), whose predictions agree with experiment in the pseudoscalar and vector channels but not in the flavor singlet channel. None of the models to date has been successful in explaining all of the observed phenomenology associated with the hadron spectrum. As for confinement, the current favourites are models based on abelian projection methods [41, 15].

As has been emphasized before, the lattice serves as a model independent tool. Lattice gauge fields are not smooth like their continuum counterparts, and they can be deformed continuously from one topological sector to another. However, determination of the topological charge and the topological susceptibility (eqn (4.9)) on the lattice may bring to light some of the features of the continuum theory that might be preserved on the lattice. In this spirit, we have calculated the lowest 32 eigenvalues and eigenvectors of the staggered Dirac operator and have used them to obtain the topological charge of $SU(3)$ gauge configurations both in quenched and full QCD. We find that the lowest few modes can be identified with the topological zero modes. We are among the first to demonstrate this in 4 dimensions using $SU(3)$ gauge fields. We find that these lowest modes determine the disconnected part of the $\eta'$ correlator completely, which is a remarkable result in itself since it implies that the various continuum conjectures regarding large $m_{\eta'}$ have been verified in a model independent manner on the lattice. Once again, ours is the first study to obtain such a result. We also find that for the determination of the pion correlator, considerably more modes are required. The procedure for obtaining the topological charge and the corresponding results are discussed in section 5.2. In section 5.3, the method for constructing the hadron correlators with the modes is described, followed by a discussion of the results.

5.2 Topological Charge On and Off the Lattice

The asymptotic behaviour of gauge fields determines their topology. In the presence of fermions, the topological index of the gauge field satisfies the index theorem

$$Q = n_+ - n_- \qquad (5.4)$$

where $n_+$ and $n_-$ are the number of zero modes of positive and negative chirality. It can be derived from the anomaly equation for the $U(1)$ axial current

$$\partial_\mu j_\mu^5(x) = 2m\,\bar\psi(x)\gamma_5\psi(x) + 2i\,F(x)\tilde F(x). \qquad (5.5)$$

Integrating both sides over all space results in the vanishing of the total divergence term on the left hand side and gives rise to

$$Q = m\,\mathrm{Tr}\,(\gamma_5 S_F). \qquad (5.6)$$

Here $S_F$ is the fermion propagator. Resolving the propagator as a sum over the eigenmodes of $\not{D}$ yields the expression

$$Q = m \sum_r \frac{u_r^\dagger \gamma_5 u_r}{i\lambda_r + m}. \qquad (5.7)$$

The eigenmodes $u_r$ satisfy

$$\not{D}\,u_r = i\lambda_r u_r, \qquad u_r^\dagger u_s = \delta_{rs}. \qquad (5.8)$$

The anticommutation relation $\{\not{D}, \gamma_5\} = 0$ implies that $u_r$ and $\gamma_5 u_r$ are eigenfunctions of the Dirac operator corresponding to the eigenvalue $i\lambda_r$ and its complex conjugate. For $\lambda_r \neq 0$ these two eigenfunctions belong to different eigenvalues and are therefore orthogonal, so that $u_r^\dagger \gamma_5 u_r = 0$. Therefore, the sum in eqn (5.7) vanishes for all nonzero values of $\lambda_r$, while for the zero modes (in the presence of massless quarks), the corresponding $u_r$'s can be considered as eigenfunctions of $\gamma_5$ with definite chirality, i.e.

$$\gamma_5 u_r = \pm u_r, \qquad (5.9)$$

from which follows the index theorem stated in eqn (5.4). It is independent of $m$ and it implies that for configurations of nonzero $Q$, there must be at least one zero mode.
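The chain of reasoning from eqn (5.6) to the index theorem can be checked numerically on a toy operator. The sketch below is only an illustration under stated assumptions: the block matrix built by the hypothetical helper `toy_dirac` is not the staggered operator used in this work, it is simply a random anti-Hermitian matrix that anticommutes with a diagonal $\gamma_5$ and has `p - q` zero modes of positive chirality by construction. The function names are likewise hypothetical.

```python
import numpy as np

def toy_dirac(p, q, seed=0):
    """Toy anti-Hermitian D with {D, gamma5} = 0 (an assumption for illustration).

    p and q are the dimensions of the positive and negative chirality sectors.
    For p > q and a generic block W, D has p - q zero modes, all of positive chirality.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(p, q)) + 1j * rng.normal(size=(p, q))
    D = np.block([
        [np.zeros((p, p)), W],
        [-W.conj().T, np.zeros((q, q))],
    ])
    gamma5 = np.diag(np.concatenate([np.ones(p), -np.ones(q)]))
    return D, gamma5

def charge_from_trace(D, gamma5, m):
    """Q = m Tr(gamma5 S_F) with S_F = (D + m)^(-1), as in eqn (5.6)."""
    S = np.linalg.inv(D + m * np.eye(D.shape[0]))
    return (m * np.trace(gamma5 @ S)).real

def charge_from_modes(D, gamma5, m):
    """Q as the mode sum of eqn (5.7): m * sum_r u_r^+ gamma5 u_r / (i lam_r + m)."""
    lam, U = np.linalg.eigh(-1j * D)   # -iD is Hermitian, so D u_r = i lam_r u_r
    chirality = np.einsum("ir,ij,jr->r", U.conj(), gamma5, U).real
    return np.sum(m * chirality / (1j * lam + m)).real

D, gamma5 = toy_dirac(7, 4)
for m in (0.01, 0.1):
    print(m, charge_from_trace(D, gamma5, m), charge_from_modes(D, gamma5, m))
```

In this toy setting both routes give the same value for any mass, because only the zero modes contribute to the sum; this exact mass independence is the continuum-like behaviour that the lattice operator is not guaranteed to reproduce.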

The anomaly equation remains intact on the lattice for Wilson and staggered fermions and therefore eqn (5.7) holds¹⁰. While for Wilson fermions $\Gamma_5$ is simply the Dirac $\gamma_5$, for staggered fermions it is a four-link operator, $\Gamma_5 = \gamma_5 \otimes I$. However, for staggered fermions $\not{D}$ and $\Gamma_5$ do not anticommute. On the lattice, fluctuations in the gauge field shift the zero modes: it is these fluctuations inside a finite volume that prevent lattice gauge fields from being divided into definite topological sectors. This in turn implies that $\langle\Gamma_5\rangle$ is not expected to vanish exactly for any mode, with the consequence that the index theorem may not hold on the lattice. All of the facts mentioned above paint a pessimistic picture for the study of topology on the lattice.

¹⁰ One should be careful in interpreting the quark mass dependence of $Q$ on the lattice. Without taking the thermodynamic limit, the $m_q \to 0$ limit cannot be taken.

Nevertheless, by asking the following questions, one can proceed:

1. Does $\langle\Gamma_5\rangle$ decrease as $\lambda$ increases?
2. Is it possible to identify some of the lowest eigenmodes as the shifted zero modes?
3. If so, how well do the shifted zero modes determine $Q$?
4. Is there any difference between the quenched and dynamical behaviour?

Answers to these questions would then reveal the extent of validity of the index

theorem on the lattice.

Earlier, Smit and Vink [48] showed that some of the features predicted by the index theorem do hold on the lattice for both Wilson and staggered fermions in 1+1 dimensions on $U(1)$ gauge fields. From their study of quenched $SU(3)$ gauge configurations at $\beta = 5.7$ with staggered fermions, they conclude that the fluctuations in the gauge fields do not allow the continuum behaviour to be replicated. They evaluate the topological charge using pseudofermions and find it to be strongly dependent on the quark mass, unlike the continuum case where $Q$ is independent of $m$. However, they predict that higher $\beta$ values might yield better results. Teper and Hands [50] obtained the lowest twenty modes of the staggered $\not{D}$ on quenched $SU(2)$ gauge fields. They use the Lanczos algorithm for obtaining the eigenvalues, followed by inverse iterations to obtain the eigenvectors. From the lowest eigenvalues, they calculate the density $\rho(\lambda)$ for small $\lambda$ and extract the chiral condensate for different lattice sizes and at different $\beta$ values, including blocked configurations, and find enough evidence for scaling and universality. They compare $\langle\Gamma_5\rangle$ from cooled and uncooled configurations and find significant evidence for correlating the lowest modes with topological activity. Since the lowest modes are responsible for chiral symmetry breaking, they attribute a topological origin to the spontaneous breaking of chiral symmetry.

In this work, we have calculated explicitly the 16 lowest eigenvalues ($\lambda^2$) and eigenmodes of the squared staggered Dirac operator on both quenched and dynamical configurations of size $16^3 \times 32$ at $\beta = 6.0$ and $\beta = 5.7$ respectively (these correspond to the same scale, set by $m_\rho$, as explained in the previous chapter). The eigenmodes so obtained using the subspace iterations method (the algorithm and its performance are described in detail in Chapter 3) are available only on the even sites. Equation (3.16) is then used for the reconstruction of the full eigenvector corresponding to the eigenvalues $\pm i\sqrt{\lambda^2}$ of $\not{D}$. Using the available eigenmodes, $\langle\Gamma_5\rangle$ and the sum in eqn (5.7) are evaluated for different values of the quark mass parameter. This is among the first such studies on $SU(3)$ dynamical gauge configurations. We discuss the results of our study below.
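The reconstruction of full eigenvectors from even-site modes can be sketched for a generic operator with even-odd block structure. This is a minimal sketch under assumptions: eqn (3.16) is not reproduced in this chapter, so the formula used below is the standard construction for an anti-Hermitian operator that only connects even and odd sites, and may differ in normalization or phase from the one actually used. The helper names `toy_even_odd` and `full_modes_from_even` are hypothetical, and the random blocks stand in for the staggered hopping terms.

```python
import numpy as np

def toy_even_odd(n, seed=1):
    """Random anti-Hermitian D = [[0, D_eo], [D_oe, 0]] with D_oe = -D_eo^dagger."""
    rng = np.random.default_rng(seed)
    D_eo = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    D_oe = -D_eo.conj().T
    zeros = np.zeros((n, n))
    D = np.block([[zeros, D_eo], [D_oe, zeros]])
    return D, D_eo, D_oe

def full_modes_from_even(D_eo, D_oe, k):
    """Return 2k pairs (mu, u) with D u = i mu u, built from the k lowest even-site modes."""
    A_even = -D_eo @ D_oe                      # Hermitian and non-negative on even sites
    lam2, u_even = np.linalg.eigh(A_even)      # eigenvalues lambda^2 in ascending order
    modes = []
    for j in range(k):
        mu = np.sqrt(lam2[j])
        for sign in (+1, -1):
            # odd-site part follows from D_oe u_e = i (sign * mu) u_o
            u_odd = D_oe @ u_even[:, j] / (1j * sign * mu)
            u = np.concatenate([u_even[:, j], u_odd]) / np.sqrt(2.0)
            modes.append((sign * mu, u))
    return modes

D, D_eo, D_oe = toy_even_odd(6)
for mu, u in full_modes_from_even(D_eo, D_oe, 3):
    print(mu, np.linalg.norm(D @ u - 1j * mu * u))
```

Each even-site mode therefore yields a pair of full modes with eigenvalues $\pm i\sqrt{\lambda^2}$, which is why 16 modes on the even sites correspond to 32 modes of the full operator.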

5.2.1 Lattice Results

Figure 5.1 shows the mode by mode contribution to $u_\lambda^\dagger \Gamma_5 u_\lambda$ on all the dynamical configurations. Qualitatively it appears that the contribution to $\langle\Gamma_5\rangle$ decreases as $\lambda$ increases. This can be verified quantitatively by calculating the fluctuations in $\langle\Gamma_5\rangle$ in each bin (the eigenvalues were divided into 20 bins), which is shown in the bottom graph of Figure 5.1. It is evident that the fluctuations diminish as $\lambda$ increases, consistent with the continuum behaviour where $u_\lambda^\dagger \Gamma_5 u_\lambda = 0$ for all the nonzero eigenvalues.

Figure 5.2 shows a similar set of plots for the quenched ensemble. Although both $\langle\Gamma_5\rangle$ and its width show qualitatively similar behaviour as before, they seem to drop more slowly with increasing $\lambda$. While the plots shown so far suggest a positive response to the first question posed in section 5.2, they cannot be used to identify the number of shifted zero modes. If more of the lowest eigenvalues had been available, then combined with the information from Figure 5.3, perhaps a tentative upper limit $\lambda_{max}$ could have been identified below which all the modes are the shifted zero modes. The solid line present in all the figures shown so far represents the minimum value of the 16th eigenvalue in the ensemble.

Figure 5.1: Above: $\langle\Gamma_5\rangle$ on the dynamical ensemble. Below: Fluctuations in $\langle\Gamma_5\rangle$.

Figure 5.2: Above: $\langle\Gamma_5\rangle$ on the quenched ensemble. Below: Fluctuations in $\langle\Gamma_5\rangle$.

Figure 5.3: Topological charge on a typical dynamical configuration. (Plotted: $Q$ from modes against the logarithm of the number of modes, together with the pseudofermion values from 16 and 82 copies of noise.)

The fact that there is topological activity on the lattice and that it is predominantly due to the lowest few modes is further reinforced by the plot of Figure 5.3. Here the sum in eqn (5.7) for $r = 2, 4, 8, 24$ and $32$ modes at $m_q a = 0.01$ is plotted against the corresponding number of modes on a typical dynamical configuration. The plateau in $Q$ as the number of modes increases implies that the contribution of the higher modes to the sum in eqn (5.7) is becoming negligibly small. Although the plot shown here is for one configuration, we find that the saturation of $Q$ with increasing number of modes is present on all the configurations, both quenched and dynamical.

We also compared the value of $Q$ obtained from the eigenmodes to that obtained from pseudofermions. The fermion propagator using pseudofermions is given by¹¹

$$G(x,y) = \sum_z m\,\phi(x,z)\,\phi^\dagger(z,y), \qquad |x-y|\ \mathrm{even}. \qquad (5.10)$$

Now

$$\mathrm{Tr}\,(\Gamma_5\, G(x,y)) = m\,\mathrm{Tr}\,(\Gamma_5\, \phi_x \phi_y^\dagger), \qquad (5.11)$$

where it is possible to replace $\phi_x$ and $\phi_y$ in the above expression by

(5.12)

Resolving $\frac{1}{\not{D}+m}$ as a sum over eigenmodes yields

(5.13)

Making use of the orthonormality relations

(5.14)

where $N$ is the normalization factor, results in

(5.15)

Thus the estimate of $Q$ obtained from the pseudofermions is equivalent to the value obtained from the sum in eqn (5.7) when all the eigenmodes are present. Our procedure for obtaining the pseudofermion propagator is explained in Chapter 4. In Figure 5.3, the isolated point represented by a square is the value of $Q$ obtained from pseudofermions. The error bar is due to averaging over 16 copies of noise. The "true" value obtained by averaging over 82 copies of noise lies within the band marked by dashed lines. The fact that the plateau in $Q$ obtained with the eigenmodes approaches the full value obtained using pseudofermions is a very good indication that these lowest modes are the topological zero modes and that the effect of higher modes on the value of $Q$ is negligible.

¹¹ For a more detailed discussion regarding pseudofermions, the reader is referred to the previous chapter.
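The role played by the number of noise copies can be illustrated with a generic stochastic trace estimator. The sketch below is an assumption-laden illustration and not the pseudofermion procedure of Chapter 4: it uses $U(1)$ noise vectors $\eta$ and the estimate $\mathrm{Tr}\,O \approx \frac{1}{N}\sum \eta^\dagger O\,\eta$, with the spread over copies giving the error bar. The function name `noisy_trace` and the small test operator are hypothetical.

```python
import numpy as np

def noisy_trace(apply_op, n, copies, seed=0):
    """Estimate Tr(O) from `copies` U(1) noise vectors; returns (mean, standard error).

    apply_op(v) must return O @ v for a length-n vector v. Requires copies >= 2.
    """
    rng = np.random.default_rng(seed)
    samples = np.empty(copies, dtype=complex)
    for c in range(copies):
        eta = np.exp(2j * np.pi * rng.random(n))    # U(1) noise, |eta_i| = 1
        samples[c] = np.vdot(eta, apply_op(eta))    # eta^dagger (O eta)
    mean = samples.mean()
    err = samples.std(ddof=1) / np.sqrt(copies)
    return mean, err

# Example: m * Tr(gamma5 (D + m)^(-1)) for a small random anti-Hermitian D.
rng = np.random.default_rng(3)
n, m = 12, 0.1
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
D = B - B.conj().T
gamma5 = np.diag(np.concatenate([np.ones(n // 2), -np.ones(n // 2)]))
M = D + m * np.eye(n)
op = lambda v: m * (gamma5 @ np.linalg.solve(M, v))
for copies in (16, 82):
    print(copies, noisy_trace(op, n, copies))
print("exact", m * np.trace(gamma5 @ np.linalg.inv(M)))
```

Only the off-diagonal elements of the operator contribute to the noise, so the error bar shrinks like the inverse square root of the number of copies, whereas the mode sum has no such statistical error once the modes are known.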

In the continuum, only the zero modes are associated with the value of $Q$, while the sum $\sum_{\lambda > 0} u_\lambda^\dagger \gamma_5 u_\lambda$ vanishes. On the lattice, for the dynamical configurations, we find that no more than the 32 lowest modes are required to determine $Q$, which is a very good indication that these are the shifted zero modes.

We have compared $Q$ obtained from pseudofermions and from modes on all the dynamical configurations and find that both agree within one standard deviation for 75% of the configurations, while the remaining 25% of the configurations agree within 2 standard deviations. In this respect the quenched configurations differ from the dynamical ones in that the agreement on most configurations is within 2 standard deviations. Similarly, a qualitative comparison of Figures 5.1 and 5.2 indicates that more than the 32 lowest eigenvalues are required to identify an upper limit for the number of shifted zero modes on the quenched configurations.

Overall, the analysis so far suggests that while it is not possible to recover the index theorem exactly at finite lattice spacing, several of the features predicted by the index theorem are preserved on the lattice.

5.3 Hadron Correlators Revisited

Since the small eigenvalues of the Dirac operator are related to chiral symmetry breaking, one might ask how well the hadron correlators are determined by the lowest few modes. Before discussing the results that will reveal an answer to this question, the connection between the small eigenvalues of $\not{D}$ and chiral symmetry breaking is reviewed.

The evidence for the spontaneous breaking of chiral symmetry comes directly from nature through the absence of opposite parity partners for the pseudoscalar octet mesons. The corresponding order parameter, the chiral condensate $\langle\bar q q\rangle$, is related to the pion correlator by the well known Ward identity

$$\langle\bar\chi\chi\rangle = m \sum_x \langle \bar\chi\gamma_5\chi(0)\,\bar\chi\gamma_5\chi(x)\rangle. \qquad (5.16)$$

This relation holds on the lattice as well. The nonvanishing of the order parameter $\langle\bar q q\rangle$, as verified from lattice simulations and phenomenological models, further supports the idea that the axial flavor symmetry of the vacuum is not preserved. A nonzero value for the chiral condensate is also significant in that it is inherently a nonperturbative phenomenon, as $\langle\bar q q\rangle = 0$ to all orders in perturbation theory. The quark condensate is also related to the small fermion modes by the Banks-Casher formula [2],

$$\rho(0) = -\frac{1}{\pi}\langle\bar q q\rangle_0, \qquad (5.17)$$

where $\rho(0)$ is the density of small eigenvalues of $i\not{D}$ evaluated at $\lambda = 0$. Its derivation is shown below. The quark propagator, when resolved in terms of the eigenvalues and eigenvectors of $\not{D}$ on a fixed background gauge field, takes the form

= (5.18)

Integrating over all space and taking the trace,

$$\langle\bar\chi\chi\rangle = \int \mathrm{Tr}\,(G(x,x))\,d^4x = -2m \sum_{\lambda>0} \frac{1}{\lambda^2+m^2}. \qquad (5.19)$$

Use has been made of the fact that the eigenvalues of $\not{D}$ come in complex conjugate pairs with eigenvectors $u_\lambda$ and $\gamma_5 u_\lambda$ respectively. It might appear that the condensate vanishes in the chiral limit, but this is not so, as one must take the thermodynamic limit first before taking the chiral limit. In the continuum, the spectrum becomes dense; therefore the sum above can be replaced by an integral with the measure $d\lambda\,\rho(\lambda)$, where $\rho(\lambda)$ is the density of eigenvalues in the range $(\lambda, \lambda + d\lambda)$:

$$\langle\bar\chi\chi\rangle = \int_0^\infty d\lambda\,\rho(\lambda)\,\frac{-2m}{\lambda^2+m^2}. \qquad (5.20)$$

The integral diverges in the ultraviolet limit and also in the infrared when $m \to 0$.

Averaging the spectral density over all gauge configurations and taking the chiral limit followed by the A integration yields the Banks-Casher relation. The derivation above suggests that the spectrum of the Dirac operator close to zero eigenvalues is important for studying chiral symmetry breaking.
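The way in which the chiral limit of the spectral integral picks out the density at $\lambda = 0$ can be seen in a short numerical sketch. Several things here are assumptions made for illustration: only the magnitude $\int_0^\Lambda d\lambda\,\rho(\lambda)\,2m/(\lambda^2+m^2)$ is evaluated (the overall sign and volume normalization of eqns (5.17)-(5.20) are left aside), the ultraviolet divergence is regulated by a hypothetical cutoff $\Lambda$, and the model density `rho` is invented. The substitution $\lambda = m\tan\theta$ turns the kernel into the constant $2\,d\theta$, which keeps the quadrature stable as $m \to 0$.

```python
import numpy as np

def spectral_condensate(rho, m, cutoff, n=20001):
    """Trapezoid estimate of int_0^cutoff dlam rho(lam) * 2m / (lam^2 + m^2).

    Uses lam = m * tan(theta), for which dlam * 2m / (lam^2 + m^2) = 2 dtheta.
    """
    theta = np.linspace(0.0, np.arctan(cutoff / m), n)
    f = 2.0 * rho(m * np.tan(theta))
    h = theta[1] - theta[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1]))

rho = lambda lam: 0.3 + 0.1 * lam     # hypothetical density with rho(0) = 0.3
for m in (1e-1, 1e-2, 1e-3, 1e-4):
    print(m, spectral_condensate(rho, m, cutoff=1.0), np.pi * rho(0.0))
```

As the mass decreases at fixed cutoff, the printed integral approaches $\pi\rho(0)$, which is the content of the Banks-Casher relation; for a spectrum with no eigenvalues near zero the same integral would instead vanish linearly in $m$.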

5.3.1 Pion Correlator and the Lowest Eigenmodes

Use of CG for the calculation of the fermion propagator requires a starting iterate, and this can be constructed out of the available lowest modes. So as a first step, let us look at the propagator constructed using the first 32 modes according to the sum in eqn (5.18) and compare it with the full propagator (Figure 5.4). It can be seen that at the beginning of the iterations, a substantial part of the full propagator is determined by the 32 mode seed when the quark mass parameter is 0.01. The fact that it takes 400 more iterations to converge to the full value is an indication that more than the 32 lowest modes would be needed to saturate the pion correlator. This is indeed the case, as illustrated by Figure 5.5. This is not surprising in the light of the discussion following the derivation of eqn (5.17). Apart from the topological zero modes, additional eigenvalues near zero are also required. Although the pion correlator shown in Figure 5.5 is obtained from the dynamical ensemble, a similar behaviour for the quark propagator is expected from the quenched ensemble. Negele et al. [23] have shown that the pion and the $\rho$ correlators can be completely determined by the lowest 128 eigenmodes using Wilson fermions on both quenched and full QCD configurations on $16^4$ lattices. They used the k-step Arnoldi method to obtain the eigenvectors and the eigenvalues.

Figure 5.4: Quark propagators from 16 mode and zero seed on a typical dynamical configuration with $m_q a = 0.01$ from a $\delta$ function source. (Curves: zero seed and 1, 4, 8 and 16 mode seeds, plotted against the number of iterations.)

An interesting side benefit of this exercise was a set of observations regarding the convergence behaviour of CG. If the starting iterate $z_0$ is constructed out of the available lowest modes, the number of CG iterations required for the residual to converge to a given tolerance drops by more than 50% at $m_q a = 0.01$ (Figure 5.6).

Figure 5.5: Pion propagator on dynamical ensemble at $m_q a = 0.01$. (Curves: 16 modes and full, plotted against time.)

In the light of the discussion on CG in Chapter 3, this is not a surprising result, but it is heartening to have a numerical verification. At higher quark masses, this effect is not so pronounced owing to the fact that the lowest 32 eigenvalues are smaller than the quark mass itself. All of the above conclusions hold only for delta function sources or wall sources built out of delta functions. When we repeated the same exercise with a $U(1)$ noise source at all sites, or for that matter any noisy source, at one time slice or at all sites, there was virtually no improvement in the convergence rate (see Figure 5.7). We learn that quark propagators from noisy sources are dominated by the short distance modes, while those from $\delta$ function type sources are predominantly made up of long wavelength modes.
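The effect of seeding CG with the lowest modes can be reproduced on a small model problem. The sketch below is illustrative only: the matrix is a random real symmetric positive definite matrix with eight artificially small eigenvalues (an assumption standing in for the low modes of the squared Dirac operator), the source is a unit vector playing the role of a delta function source, and the names `conjugate_gradient` and `low_mode_seed` are hypothetical. The seed is the exact solution restricted to the low-mode subspace, $x_0 = \sum_r u_r (u_r^T b)/\lambda_r$.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-8, max_iter=5000):
    """Plain CG for symmetric positive definite A; returns (solution, iterations)."""
    x = x0.copy()
    r = b - A @ x
    p = r.copy()
    rr = r @ r
    target = tol * np.linalg.norm(b)
    for it in range(max_iter):
        if np.sqrt(rr) <= target:
            return x, it
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x, max_iter

def low_mode_seed(modes, eigenvalues, b):
    """Starting iterate built from the given eigenmodes (columns of `modes`)."""
    return modes @ ((modes.T @ b) / eigenvalues)

rng = np.random.default_rng(0)
n, k = 200, 8
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))             # orthonormal eigenvectors
eigs = np.concatenate([np.logspace(-4, -2, k),           # k artificially low modes
                       np.linspace(0.5, 2.0, n - k)])    # bulk of the spectrum
A = (Q * eigs) @ Q.T
b = np.zeros(n)
b[0] = 1.0                                               # delta-function-like source

x_zero, it_zero = conjugate_gradient(A, b, np.zeros(n))
x_seed, it_seed = conjugate_gradient(A, b, low_mode_seed(Q[:, :k], eigs[:k], b))
print("zero seed:", it_zero, "iterations; low-mode seed:", it_seed, "iterations")
```

With the seed in place the initial residual has no component along the low modes, so CG effectively sees only the well conditioned bulk of the spectrum. How large the saving is depends on how much of the source overlaps with the low modes, which is the distinction drawn above between delta function and noisy sources.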

Figure 5.6: Comparison of CG convergence with 16 mode and zero mode seed at $m_q a = 0.01$. Above: number of modes vs number of CG iterations for various quark masses. (Curves in the lower panel: zero seed and 1, 4, 8 and 16 mode seeds, plotted against the number of iterations.)

Figure 5.7: Same as in Figure 5.6, but for a $U(1)$ noise source. (Curves: zero seed and 16 mode seed, plotted against the number of iterations.)

5.3.2 The $\eta'$ Correlator and the Lowest Modes

Much was said about the $\eta'$ meson and its mass determination in the preceding chapter. Here we determine the disconnected $\eta'$ correlator $\langle \mathrm{Tr}(\Gamma_5 G(x,x))\,\mathrm{Tr}(\Gamma_5 G(y,y))\rangle$ using the 32 lowest modes and compare it (Figure 5.8) with the full result available from our previous study. The data points calculated from the full propagator using pseudofermions and those using the modes coincide very well within statistical errors at all the quark masses, $m_q a = 0.01$, $0.02$ and $0.03$. A quantitative estimate of the similarity is shown in the bottom part of Figure 5.8, which is obtained by taking the ratio of the disconnected loop from the 32 modes to the full result from pseudofermions. It can be seen that this ratio approaches one after a few time slices for the 32 mode case, indicating that the disconnected data from 32 modes approaches the full value as determined by the pseudofermion method. The error bars are obtained using single elimination jackknife.
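For reference, a single elimination jackknife error for a ratio of ensemble averages can be sketched as follows. This is a generic textbook version written under assumptions: the inputs are taken to be one measurement per configuration at a fixed time slice, and the function name `jackknife_ratio` is hypothetical rather than taken from the analysis code used in this work.

```python
import numpy as np

def jackknife_ratio(num, den):
    """Single elimination jackknife for mean(num) / mean(den).

    num, den: per-configuration measurements of equal length n >= 2.
    Returns (central value, jackknife error).
    """
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    n = len(num)
    # averages with configuration i left out, for every i at once
    num_loo = (num.sum() - num) / (n - 1)
    den_loo = (den.sum() - den) / (n - 1)
    theta = num_loo / den_loo
    center = num.mean() / den.mean()
    err = np.sqrt((n - 1) / n * np.sum((theta - theta.mean()) ** 2))
    return center, err

# Example with made-up numbers for one time slice
modes_loop = [1.1, 0.9, 1.3, 0.8, 1.0, 1.2]
full_loop = [1.0, 1.0, 1.2, 0.9, 1.0, 1.1]
print(jackknife_ratio(modes_loop, full_loop))
```

Because the ratio is recomputed with each configuration removed in turn, correlations between numerator and denominator measured on the same configurations are taken into account automatically.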

In Figure 5.9, we present the data in a slightly different fashion: dividing the disconnected correlator by the connected one gives rise to the ratio

$$R(t) = \frac{\langle\eta'(t)\,\eta'(0)\rangle_{disc}}{\langle\eta'(t)\,\eta'(0)\rangle_{conn}}$$

which figured prominently in Chapter 4. For the connected correlator, we use the full result available from our previous study (as was demonstrated for the case of the pion, it is not possible to compute the connected correlator using only 32 modes). In Figure 5.9, $R(t)$ is plotted with both the dynamical and valence masses set to $m_q a = 0.01$. The signal obtained with 16 modes agrees so well with that obtained from pseudofermions that it is possible to fit the data to the form of eqn (4.11) and extract $m_{\eta'}$ from it.

On the quenched configurations, although qualitatively similar behaviour is seen, more modes would be required to obtain the same level of saturation seen with the dynamical configurations. This is evident from Figure 5.10. Thus this result only reaffirms the earlier conclusion reached from the study of $\langle\Gamma_5\rangle$ and $Q$ on the quenched ensemble.

Since it takes the same amount of time to calculate the disconnected loop with 16 copies of noise as to compute the lowest 16 modes, we advocate calculating disconnected loops with the modes rather than with noisy sources. Once the modes are available, they can be used any number of times to construct quark propagators, $Q$, etc. for various quark mass values. Of course, a better method might be required, as the flops in subspace iterations grow as the square of the number of modes.

Figure 5.8: Below: Comparison of the two loop amplitudes. Above: Ratio of the full two-loop amplitude to that calculated with modes. (Dynamical ensemble, $m_q a = 0.01$; curves: 1 mode, 16 modes and full, plotted against time.)

Figure 5.9: Pion-$\eta'$ splitting. (Dynamical ensemble, $m_q a = 0.01$; curves: 1 mode, 16 modes and full.)

Figure 5.10: Pion-$\eta'$ splitting on the quenched configurations at $m_q a = 0.01$. (Curves: 1 mode, 16 modes and full, plotted against $t$.)

5.4 Conclusions

Using the lowest 32 eigenvalues and eigenvectors of $\not{D}$, we computed the topological charge, studied the contribution to $\langle\Gamma_5\rangle$ and constructed the disconnected $\eta'$ and the pion correlators in both quenched and full QCD at $\beta = 5.7$. We find that it is possible to identify the shifted zero modes on the lattice. These shifted modes make the dominant contribution to $\langle\Gamma_5\rangle$ and $\mathrm{tr}(\Gamma_5 S_F)$. The diminishing value of $\langle\Gamma_5\rangle$ with increasing $\lambda$ and the steep rise in topological charge followed by the plateau (Figure 5.3) as the number of modes is increased suggest that the higher modes make negligible contributions. As a cross-check, we compared the value of $Q$ from pseudofermions to that calculated using the available eigenmodes and find that they coincide within statistical errors. This suggests that although there are no exact zero modes on the lattice, it is possible to clearly distinguish between the generic low energy modes and the topological modes. The existence of the latter makes possible the preservation of some of the essential features of the index theorem on the lattice. On the dynamical configurations, it seems that the number of shifted zero modes must be around 32, while on the quenched configurations the data indicate that there are more than this number.

On the one hand, from the preceding analysis, it is not surprising that the $\eta'$ disconnected correlator, being the two point function of $\mathrm{tr}(\gamma_5 S_F)$, is very well determined by the lowest 32 modes on the dynamical configurations. On the other hand, viewed from a more global perspective, it is remarkable that the $\eta'$ disconnected correlator can be obtained with just 32 modes which have been identified as the topological zero modes. Thus, in a model independent way, Witten and Veneziano's conjecture that the fluctuations in the topological charge density give mass to the $\eta'$ has been confirmed in two different ways: (i) the non-vanishing of $m_0$ in the chiral limit, shown in the previous chapter, and (ii) the demonstration of the fact that the flavor singlet interaction is dominated entirely by the "zero modes" of $\not{D}$. The thirty-two lowest modes, which are topological in origin, are not enough for the determination of the pion correlator or the chiral condensate. Negele et al. have demonstrated, however, that on lattices of this size the lowest 128 modes are enough. Although their work was done with Wilson fermions, a similar figure should hold for staggered fermions as well; this is further supported by our study of the spectrum using the Lanczos algorithm. Thus the condensate receives contributions from both the topological and the non-topological modes, as suggested by the Banks-Casher relation, which requires a finite density of eigenvalues close to zero.

While the results of this study have been interesting so far, they have also raised many questions which should form the subject of a future study. The most obvious extension would be to obtain more modes, say, as many as the lowest 100, and use them to study the same quantities as before. The results should tell us whether it is indeed possible to separate the topological modes from the others in a quantitative manner. It will also help determine the number of shifted zero modes on the quenched configurations. Determination of the connected part of a hadron correlator will enable hadron masses to be extracted at lower quark masses (as low as finite volume effects would allow) and hence the study of the approach to the chiral limit. There are other lattice simulations, like the study of the spin content of the proton, which require disconnected correlators and which can be computed using the available lowest modes. Repeating all of the above for different lattice sizes and for several $\beta$ values would enable the study of scaling, finite $a$ and finite volume effects.

BIBLIOGRAPHY

[1] Allés, B., D'Elia, M., and DiGiacomo, A. Topological Susceptibility in Zero and Finite T in SU(3) Yang-Mills Theory. Nucl. Phys. B494 (1996), 281-292.

[2] Banks, T., and Casher, A. Chiral Symmetry Breaking in Confining Theories. Nucl. Phys. B169 (1980), 103-146.

[3] Belavin, A. A., Polyakov, A. M., Schwartz, A. S., and Tyupkin, Y. S. Pseudoparticle Solutions of the Yang-Mills Equations. Phys. Lett. 59B (1975), 85-87.

[4] Bernard, C., and Golterman, M. F. Chiral Perturbation Theory for the Quenched Approximation of QCD. Phys. Rev. D46 (1992), 853-857.

[5] Bernard, C., and Golterman, M. F. Partially Quenched Gauge Theories and an Application to Staggered Fermions. Phys. Rev. D49 (1994), 486-494.

[6] Bhattacharya, T., Gupta, R., Kilcup, G., and Sharpe, S. Hadron Spectrum for Wilson Fermions. Phys. Rev. D53 (1996), 6486-6508.

[7] Bunk, B. Computing the Lowest Eigenvalues of the Fermion Matrix by Subspace Iterations. In Lattice 96 (1997), vol. 53 of Nucl. Phys. B (Proc. Suppl.), pp. 987-989.

[8] Cheng, T. P., and Li, L. Gauge Theory of Elementary Particle Physics. Oxford, 1984.

[9] Chodos, A., and Healy, J. B. Spectral Degeneracy of the Lattice Dirac Equation as a Function of Lattice Shape. Nucl. Phys. B127 (1977), 426-446.

[10] Coleman, S. Aspects of Symmetry. Cambridge, 1985, ch. The Uses of Instantons.

[11] Creutz, M. Quarks, Gluons and Lattices. Cambridge, 1983.

[12] Cullum, J., and Willoughby, R. Lanczos Algorithms for Large Symmetric Eigenvalue Computations, Vol. I Theory. Birkhauser, 1985.

[13] Callan, C. G., Dashen, R., and Gross, D. The Structure of the Gauge Theory Vacuum. Phys. Lett. 63B (1976), 334-340.

[14] Callan, C. G., Dashen, R., and Gross, D. Pseudoparticles and Massless Fermions in Two Dimensions. Phys. Rev. D16 (1977), 2526-2534.

[15] Del Debbio, L., Faber, M., Greensite, J., and Olejnik, S. Some Cautionary Remarks on Abelian Projection and Abelian Dominance. In Lattice 96 (1997), vol. 53 of Nucl. Phys. B (Proc. Suppl.), pp. 141-147.

[16] Duncan, A., Eichten, E., Perrucci, S., and Thacker, H. Quenched Chiral Logs, the eta' mass and the Hairpin Diagram. In Lattice 96 (1997), vol. 53 of Nucl. Phys. B (Proc. Suppl.), pp. 256-258.

[17] Fujikawa, K. Path Integral for Gauge Theory with Fermions. Phys. Rev. D21 (1980), 2848-2858.

[18] Golterman, M. F. Staggered Mesons. Nucl. Phys. B273 (1986), 663-676.

[19] Golterman, M. F., and Smit, J. Self Energy and Flavor Interpretation of Staggered Fermions. Nucl. Phys. B245 (1984), 61-88.

[20] Golub, G. H., and Van Loan, C. F. Matrix Computations, 2 ed. Johns Hopkins University Press, 1989.

[21] Gottlieb, S. QCD Spectrum - 1996. In Lattice 96 (1997), vol. 53 of Nucl. Phys. B (Proc. Suppl.), pp. 155-167.

[22] Gusken, S. A Study of Smearing Techniques for Hadron Correlation Functions. In Lattice 89 (1990), vol. 17 of Nucl. Phys. B (Proc. Suppl.), pp. 361-364.

[23] Ivanenko, T. L., and Negele, J. W. Evidence of Instanton Effects in Hadrons from the Study of Low Eigenfunctions of the Dirac Operator. In Proceedings of Lattice 97 (1997). To be published.

[24] Jackiw, R., and Rebbi, C. Vacuum Periodicity in Yang-Mills Quantum Theory. Phys. Rev. Lett. 37 (1976), 172-175.

[25] Kalkreuter, T., and Simma, H. An Accelerated Conjugate Gradient Algorithm to Compute the Low-lying Eigenvalues - a study for the Dirac Operator in SU(2) Lattice QCD. hep-lat/9507023.

[26] Karsten, L. H., and Smit, J. Lattice Fermions: Species Doubling, Chiral Invariance and the Triangle Anomaly. Nucl. Phys. B183 (1981), 103-140.

[27] Kawamoto, N., and Smit, J. Effective Lagrangian and Dynamical Symmetry Breaking in Strongly Coupled Lattice QCD. Nucl. Phys. B192 (1981), 100-124.

[28] Kilcup, G. W., and Sharpe, S. R. A Tool Kit for Staggered Fermions. Nucl. Phys. B283 (1987), 493-550.

[29] Kluberg-Stern, H., Morel, A., Napoly, O., and Petersson, B. Flavours of Lagrangian Susskind Fermions. Nucl. Phys. B220 (1983), 447-470.

[30] Luo, R. Y. On Shell Improved Lattice QCD with Staggered Fermions. hep-lat/9702013.

[31] Lüscher, M. Volume Dependence of the Energy Spectrum in Massive Quantum Field Theories I: Stable Particle States. Commun. Math. Phys. 104 (1986), 177-206.

[32] M. Masseti, et al. Evidence for Eta'-Pi Splitting in Unquenched Lattice QCD. hep-lat/9605044.

[33] Mackenzie, P. Recent Lattice Results on the Light Quark Masses. In Lattice 96 (1997), vol. 53 of Nucl. Phys. B (Proc. Suppl.), pp. 23-29.

[34] Mackenzie, P. B. Lattice gauge machines. In From Actions to Answers (1989), Proceedings of TASI 89, World Scientific, pp. 205-236.

[35] Meyer, A. Modern Algorithms for Large Sparse Eigenvalue Problems. Akademie-Verlag, 1987.

[36] Montvay, I., and Munster, G. Quantum Fields on a Lattice. Cambridge, 1994.

[37] Morel, A. Chiral Logarithms in Quenched QCD. J. Physique 48 (1987), 1111-1120.

[38] Nielsen, H. B., and Ninomiya, M. Absence of Neutrinos on a Lattice. 1. Proof By Homotopy Theory. Nucl. Phys. B185 (1981), 20-40.

[39] Okawa, M. Flavor Singlet Physics in Lattice QCD. KEK preprint 94-178.

[40] Patel, A., and Sharpe, S. Perturbative Corrections For Staggered Fermion Bilinears. Nucl. Phys. B395 (1993), 701-732.

[41] Polikarpov, M. I. Recent Results on the Abelian Projection of Lattice Gluondynamics. In Lattice 96 (1997), vol. 53 of Nucl. Phys. B (Proc. Suppl.), pp. 134-140.

[42] Polyakov, A. M. Quark Confinement and Topology of Gauge Theories. Nucl. Phys. B120 (1977), 429-458.

[43] Rajaraman, R. Solitons and Instantons. North-Holland, 1977.

[44] Sexton, J., and Weingarten, D. Error Estimate for the Valence Approximation and for a Systematic Expansion of Full QCD. Phys. Rev. D55 (1997), 4025-4035.

[45] Sharatchandra, H. S., Thun, H. J., and Weisz, P. Susskind Fermions on a Euclidean Lattice. Nucl. Phys. B192 (1981), 205-236.

[46] Shewchuk, J. R. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, 1994. Available by ftp from warp.cs.cmu.edu:quake-papers/painless-conjugate-gradient.ps.

[47] Shuryak, E. V., and Verbaarschot, J. J. M. Mesonic Correlators in the Random Instanton Vacuum. Nucl. Phys. B410 (1993), 55-89.

[48] Smit, J., and Vink, J. C. Remnants of the Index Theorem on the Lattice. Nucl. Phys. B286 (1987), 485-508.

[49] Susskind, L. Lattice Fermions. Phys. Rev. D16 (1977), 3031-3039.

[50] Teper, M., and Hands, S. J. On the Value and Origin of the Chiral Condensate in Quenched SU(2) Lattice Gauge Theory. Nucl. Phys. B347 (1990), 819-853.

[51] Vandergraft, J. S. Generalized Rayleigh Methods with Applications to Finding Eigenvalues of Large Matrices. Lin. Alg. Appl. 4 (1971), 353-368.

[52] Veneziano, G. U(1) Without Instantons. Nucl. Phys. B159 (1979), 213-224.

[53] Venkataraman, L., and Kilcup, G. Applications of the Eigenmodes of the Dirac Operator. In Proceedings of Lattice 97 (1997). To be published.

[54] Venkataraman, L., Kilcup, G., and Grandy, J. The Staggered $\eta'$ with Smeared Operators. In Lattice 96 (1997), vol. 53 of Nucl. Phys. B (Proc. Suppl.), pp. 259-261.

[55] Weingarten, D. Monte Carlo Evaluation of Hadron Masses in Lattice Gauge Theories. Phys. Lett. 109B (1982), 57-62.

[56] Wilson, K. New Phenomena in Subnuclear Physics. Plenum, 1977, ch. Quarks and Strings on a Lattice, pp. 69-125.

[57] Witten, E. Current Algebra Theorems for the U(1) Goldstone Boson. Nucl. Phys. B156 (1979), 269-283.

[58] Y. Kuramashi, et al. $\eta'$ Meson Mass in Lattice QCD. Phys. Rev. Lett. 72 (1994), 3448-3450.