AN ANALYSIS OF THE LANCZOS GAMMA APPROXIMATION
by
GLENDON RALPH PUGH
B.Sc. (Mathematics) University of New Brunswick, 1992 M.Sc. (Mathematics) University of British Columbia, 1999
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF
THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
Department of Mathematics
We accept this thesis as conforming to the required standard
THE UNIVERSITY OF BRITISH COLUMBIA
November 2004
© Glendon Ralph Pugh, 2004
In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.
(Signature)
Department of Mathematics
The University of British Columbia
Vancouver, Canada
Date

Abstract
This thesis is an analysis of C. Lanczos’ approximation of the classical gamma function Γ(z+1) as given in his 1964 paper A Precision Approximation of the Gamma Function [14]. The purposes of this study are:
(i) to explain the details of Lanczos' paper, including proofs of all claims made by the author;

(ii) to address the question of how best to implement the approximation method in practice; and

(iii) to generalize the methods used in the derivation of the approximation.
At present there are a number of algorithms for approximating the gamma function. The oldest and most well-known is Stirling's asymptotic series, which is still widely used today. Another more recent method is that of Spouge [27], which is similar in form to, though different in origin from, Lanczos' formula. All three of these approximation methods take the form
    Γ(z+1) = √(2π) (z+w)^{z+1/2} e^{−z−w} [ s_{w,n}(z) + ε_{w,n}(z) ]        (1)
where s_{w,n}(z) denotes a series of n+1 terms and ε_{w,n}(z) a relative error to be estimated. The real variable w is a free parameter which can be adjusted to control the accuracy of the approximation. Lanczos' method stands apart from the other two in that, with w ≥ 0 fixed, as n → ∞ the series s_{w,n}(z) converges while ε_{w,n}(z) → 0 uniformly on Re(z) > −w. Stirling's and Spouge's methods do not share this property.

What is new here is a simple empirical method for bounding the relative error |ε_{w,n}(z)| in the right half plane based on the behaviour of this function as |z| → ∞. This method is used to produce pairs (n, w) which give formulas (1) which, in the case of a uniformly bounded error, are more efficient than Stirling's and Spouge's methods at comparable accuracy. In the n = 0 case, a variation of Stirling's formula is given which has an empirically determined uniform error bound of 0.006 on Re(z) ≥ 0. Another result is a proof of the limit formula
    Γ(z+1) = 2 lim_{r→∞} r^z [ 1/2 − e^{−1²/r} z/(z+1) + e^{−2²/r} z(z−1)/((z+1)(z+2)) − ⋯ ]

as stated without proof by Lanczos at the end of his paper.
Table of Contents
Abstract ii
Table of Contents iii
List of Tables vi
List of Figures vii
Acknowledgments viii
Chapter 1. Introduction 1
1.1 Lanczos and His Formula ...... 2
1.2 Thesis Overview ...... 8

Chapter 2. A Primer on the Gamma Function 20
2.1 First Encounters with Gamma ...... 20
2.2 A Brief History of Gamma ...... 22
2.3 Definition of Γ ...... 25
2.4 Standard Results and Identities ...... 27
2.5 Stirling's Series and Formula ...... 29
2.5.1 The Formula ...... 29
2.5.2 Some Remarks about Error Bounds ...... 34
2.6 Spouge's Method ...... 36
2.7 Additional Remarks ...... 39

Chapter 3. The Lanczos Formula 40
3.1 The Lanczos Derivation ...... 41
3.1.1 Preliminary Transformations ...... 42
3.1.2 The Implicit Function v(θ) ...... 44
3.1.3 Series Expansion of f_{E,r} ...... 46
3.2 Derivation in the Chebyshev Setting ...... 47
3.3 Derivation in the Hilbert Space Setting ...... 49
3.4 Additional Remarks ...... 50

Chapter 4. The Functions v and f_{E,r} 52
4.1 Closed Form Formulas ...... 52
4.1.1 Closed Form for v(x) ...... 53
4.1.2 Closed Form for f_r(x) and f_{E,r}(x) ...... 54
4.2 Smoothness of v^r(x), f_r(x) and f_{E,r}(x) ...... 56
4.3 Smoothness of v^r(θ), f_r(θ) and f_{E,r}(θ) ...... 59
4.3.1 Generalized f_r(θ) ...... 59
4.3.2 Derivatives of G(β,j,k,l;θ) ...... 61
4.3.3 Application to f_r ...... 61
4.4 Fourier Series and Growth Order of Coefficients ...... 64
Chapter 5. Analytic Continuation of Main Formula 66
5.1 Domain of Convergence of S_r(z) ...... 66
5.2 Analytic Continuation ...... 69
5.3 Additional Remarks ...... 69

Chapter 6. Computing the Coefficients 72
6.1 The Lanczos Method ...... 72
6.2 A Horner Type Method ...... 74
6.3 Another Recursion Formula ...... 76
6.4 Discrete Fourier Transform ...... 77
6.4.1 Finite Fourier Series ...... 77
6.4.2 Finite Chebyshev Series ...... 78
6.5 Some Sample Coefficients ...... 80
6.6 Partial Fraction Decomposition ...... 80
6.7 Matrix Representation ...... 82

Chapter 7. Lanczos' Limit Formula 84
7.1 Some Motivation ...... 85
7.2 Proof of the Limit Formula ...... 87
7.3 Additional Remarks ...... 95

Chapter 8. Implementation & Error Discussion 98
8.1 A Word or Two about Errors ...... 99
8.1.1 Absolute and Relative Error ...... 99
8.1.2 Lanczos' Relative Error ...... 100
8.1.3 Restriction to Re(z) ≥ 0 ...... 101
8.2 Some Notation and a Theorem ...... 102
8.3 Lanczos' Error Estimates ...... 105
8.4 Luke's Error Estimates ...... 108
8.5 Numerical Investigations & the Zeros of ε^∞_{r,n} ...... 109
8.5.1 The Motivating Idea ...... 109
8.5.2 Zeros of ε^∞_{r,n} ...... 111
8.5.3 Improved Lanczos Formula ...... 111
8.5.4 Optimal Formulas ...... 114
8.6 Additional Remarks ...... 116

Chapter 9. Comparison of the Methods 120
9.1 Stirling's Series ...... 120
9.1.1 Γ(20+17i) with |ε_ρ| < 10^{−32} ...... 120
9.1.2 Γ(1+z) with |ε_ρ| < 10^{−32} Uniformly ...... 122
9.2 Spouge's Method ...... 123
9.2.1 Γ(20+17i) with |ε_ρ| < 10^{−32} ...... 123
9.2.2 Γ(1+z) with |ε_ρ| < 10^{−32} Uniformly ...... 124
9.3 Lanczos' Method ...... 125
9.4 Discussion ...... 125

Chapter 10. Consequences of Lanczos' Paper 128
10.1 The Lanczos Transform ...... 128
10.2 A Combinatorial Identity ...... 133
10.3 Variations on Stirling's Formula ...... 134

Chapter 11. Conclusion and Future Work 139
11.1 Outstanding Questions Related to Lanczos' Paper ...... 140
11.2 Outstanding Questions Related to Error Bounds ...... 141
11.3 Conjectures on a Deterministic Algorithm ...... 143
Bibliography 145
Appendix A. Preliminary Steps of Lanczos Derivation 148
Appendix B. A Primer on Chebyshev Polynomials 149
Appendix C. Empirical Error Data 152
List of Tables
1.1 Relative decrease of a_k(r), r = 1, 4, 7 ...... 4

2.1 Upper bound on |E_{N,n}(6+13i)| in Stirling Series ...... 33
2.2 Coefficients of Spouge's Series, a = 12.5 ...... 38

6.1 Scaled summands of c_6(6) ...... 75
6.2 Coefficients as functions of r ...... 80

8.1 Lanczos' uniform error bounds ...... 106
8.2 Location of maximum of η_{r,n}(θ) ...... 107
8.3 Errors at infinity ...... 107
8.4 Approximate Zeros of ε^∞_{r,6} ...... 113
8.5 Sample coefficients, n = 10, r = 10.900511 ...... 116
8.6 Comparison of M_{r(n),n} and a_{n+1}(r(n)) ...... 119

9.1 Upper bound on |E_{N,n}(19+17i)| in Stirling Series ...... 121
9.2 Upper bound on |E_{N,n}(0)| in Stirling Series ...... 122
9.3 Coefficients of Spouge's series, a = 39 ...... 124
9.4 Coefficients of Lanczos' series, r = 22.618910 ...... 126

C.1 Error Data, n = 0,...,30 ...... 153
C.2 Error Data, n = 31,...,60 ...... 154
List of Figures
1.1 Cornelius Lanczos ...... 3
1.2 Relative decrease of a_k(1) ...... 5
1.3 Relative decrease of a_k(4) ...... 5
1.4 Relative decrease of a_k(7) ...... 6
1.5 Lanczos shelf, r = 20 ...... 7
1.6 Lanczos shelf, r = 40 ...... 7
1.7 Lanczos shelf, r = 60 ...... 8
1.8 f_r(θ), −π/2 ≤ θ ≤ π/2 ...... 13
2.1 Γ(z+1), ...
7.1 exp[−r(θ−π/2)²] − g_r(θ) for r = 1, 6, 10 ...... 86
7.2 log Q(k,10), k = 0,...,24 ...... 96
7.3 log Q(k,20), k = 0,...,24 ...... 97
7.4 log Q(k,50), k = 0,...,24 ...... 97
8.1 ε^∞_{r,n}, n = 0, 1, 2, 3, ...
11.1 ε_{r,n}(z), ...

Acknowledgments

I would like to thank all those who have assisted, guided and supported me in my studies leading to this thesis. In particular, I wish to thank my supervisor Dr. Bill Casselman for his patience, guidance and encouragement, my committee members Dr. David Boyd and Dr. Richard Froese, and Ms. Lee Yupitun for her assistance as I found my way through the administrative maze of graduate school. I would also like to thank Professor Wesley Doggett of North Carolina State University for his kind assistance in obtaining permission to reproduce the photo of C. Lanczos in Chapter 1.

The online computing community has been a great help to me also in making available on the web many useful tools which helped in my research. In particular, MPJAVA, a multiple precision floating point computation package in Java produced by The HARPOON Project at the University of North Carolina, was used for many of the high precision numerical investigations. In addition, the online spreadsheet MathSheet developed by Dr. Bill Casselman and Dr. David Austin was very helpful in producing many of the PostScript plots in the thesis.

Finally, I would like to thank my family for their support and encouragement: my parents Anne and Charles, and especially my wife Jacqueline for her encouragement and many years of understanding during the course of my studies.
Chapter 1

Introduction

The focus of this study is the little-known Lanczos approximation [14] for computing the gamma function Γ(z+1) with complex argument z. The aim is to address the following questions:

(i) What are the properties of the approximation? This involves a detailed examination of Lanczos' derivation of the method.

(ii) How are the parameters of the method best chosen to approximate Γ(z+1)?

(iii) How does Lanczos' method compare against other methods?

(iv) Does Lanczos' method generalize to other functions?

The subject of approximating the factorial function and its generalization, the gamma function, is a very old one. Within a year of Euler's first consideration of the gamma function in 1729, Stirling published the asymptotic formula for the factorial with integer argument which bears his name. The generalization of this idea, Stirling's series, has received much attention over the years, especially since the advent of computers in the twentieth century. Stirling's series remains today the state of the art and forms the basis of many, if not most, algorithms for computing the gamma function, and is the subject of many papers dealing with optimal computational strategies. There are, however, other methods for computing the gamma function with complex argument [7][20][26][14][27]. One such method is that of Lanczos, and another, similar in form though different in origin, is that of Spouge.

The methods of Stirling and Lanczos share the common strategy of terminating an infinite series and estimating the error which results. Also common to these two methods, as well as that of Spouge, is a free parameter which controls the accuracy of the approximation. Lanczos' method, though, is unique in that the infinite series term of the formula converges, and the resulting formula defines the gamma function in right-half planes Re(z) > −r, where r ≥ 0 is the aforementioned free parameter.
This is in contrast to the divergent nature of the series in Stirling's method.

The main results of this work are:

(i) Improved versions of Lanczos' formulas for computing Γ(z+1) on Re(z) ≥ 0. In the specific case of a uniformly bounded relative error of 10^{−32} in the right-half plane, a formula which is both simpler and more efficient than Stirling's series is found. A one term approximation similar to Stirling's formula is also given, but with a uniform error bound of 0.006 in the right-half plane. These results stem from the examination of the relative error as an analytic function of z, and in particular, the behaviour of this error as |z| → ∞.

(ii) A careful examination of Lanczos' paper [14], including a proof of the limit formula stated without proof at the end of the paper.

1.1 Lanczos and His Formula

The object of interest in this work is the formula

    Γ(z+1) = √(2π) (z+r+1/2)^{z+1/2} e^{−(z+r+1/2)} S_r(z)                        (1.1)

where

    S_r(z) = (1/2) a_0(r) + a_1(r) z/(z+1) + a_2(r) z(z−1)/((z+1)(z+2)) + ⋯ .     (1.2)

This unusual formula is due to Cornelius Lanczos, who published it in his 11-page 1964 paper A Precision Approximation of the Gamma Function [14]. The method has been popularized somewhat by its mention in Numerical Recipes in C [21], though this reference gives no indication as to why it should be true, nor how one should go about selecting truncation orders of the series S_r(z) or values of the parameter r.¹ Despite this mention, few of the interesting properties of the method have been explored. For example, unlike divergent asymptotic formulas such as Stirling's series, the series S_r(z) converges. Yet, in a manner similar to a divergent series, the coefficients a_k(r) initially decrease quite rapidly, followed by slower decay as k increases. Furthermore, with increasing r, the region of convergence of the series extends further and further to the left in the complex plane, to include z on the negative real axis other than the negative integers.
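It helps at this point to see formula (1.1) in action. The sketch below evaluates the truncated formula with r = 5 and a 7-term series written in partial-fraction form (the rearrangement treated in Section 6.6); the particular coefficient values are the widely circulated ones quoted in Numerical Recipes [21], not derived here.

```python
import math

# Coefficients for r = 5, series truncated after 7 terms, in the
# partial-fraction form c0 + sum_k c_k/(z+k).  These values are quoted
# from the Numerical Recipes implementation [21], not computed here.
NR_COEF = [1.000000000190015,
           76.18009172947146, -86.50532032941677, 24.01409824083091,
           -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5]

def gamma(x):
    """Gamma(x) for real x > 0: evaluate (1.1) for Gamma(x+1), divide by x."""
    base = x + 5.5                        # z + r + 1/2 with z = x, r = 5
    series = NR_COEF[0]
    for k in range(1, 7):
        series += NR_COEF[k] / (x + k)
    gamma_x_plus_1 = (math.sqrt(2.0 * math.pi) * base ** (x + 0.5)
                      * math.exp(-base) * series)
    return gamma_x_plus_1 / x             # Gamma(x) = Gamma(x+1)/x
```

With this coefficient set the relative error reported in [21] is below about 2 × 10^{−10}, so for example gamma(5.0) agrees with 4! = 24 to roughly ten significant digits.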
Figure 1.1: Cornelius Lanczos

The coefficients a_k(r) in (1.2) are expressible in closed form as functions of the free parameter r ≥ 0, and equation (1.1) is valid for Re(z+r) > 0, z not a negative integer. Equation (1.1) actually represents an infinite family of formulas, one for each value of r, and the choice of this parameter is quite crucial in determining the accuracy of the formula when the series S_r(z) is truncated at a finite number of terms. It is precisely the peculiar r dependency of the coefficients a_k(r), and hence of S_r(z) itself, which makes Lanczos' method interesting.

¹ In fact, according to [21], r should be chosen to be an integer, which need not be the case.

To get a sense of the r parameter's influence on S_r(z) at this early stage, consider the ratios |a_{k+1}(r)/a_k(r)| for the sample values r = 1, 4, 7 in Table 1.1.

     k    |a_{k+1}(1)/a_k(1)|   |a_{k+1}(4)/a_k(4)|   |a_{k+1}(7)/a_k(7)|
     0         .31554               1.10590               1.39920
     1         .00229                .15353                .33574
     2         .32102                .02842                .15091
     3         .34725                .00085                .05917
     4         .43102                .00074                .01752
     5         .50107                .05011                .00258
     6         .55757                .36295                .00002
     7         .60343                .24357                .00335
     8         .64113                .25464                .06382
     9         .67258                .27771                .03425
    10         .69914                .30151                .58252

    Table 1.1: Relative decrease of a_k(r), r = 1, 4, 7

Observe how, for r = 1, the relative size of successive coefficients drops very quickly at k = 1, but then flattens out into a more regular pattern. Compare this with the r = 4 and r = 7 columns, where the drop occurs later, at k = 4 and k = 6 respectively, but is increasingly precipitous.

A graphical illustration makes this behaviour more apparent; refer to Figures 1.2, 1.3 and 1.4 for bar graphs of −log|a_{k+1}(r)/a_k(r)|, r = 1, 4, 7, respectively. Observe the large peak to the left of each, corresponding to the steep drop-off in the coefficients. Considering that the scale in these graphs is logarithmic, the behaviour is dramatic.
Notice also that after the initial large decrease, the ratios do not follow a smooth pattern as in the r = 1 case, but rather jump around irregularly.

The a_k(r) are actually Fourier coefficients of a certain function, and are bounded asymptotically by a constant times k^{−2r}. However, the initial decrease of the coefficients appears more exponential in nature, and then switches abruptly to behave according to the asymptotic bound.

Figure 1.2: Relative decrease of a_k(1)

Figure 1.3: Relative decrease of a_k(4)

Figure 1.4: Relative decrease of a_k(7)

This peculiar phenomenon is nicely illustrated in Figures 1.5, 1.6 and 1.7, where −log|a_k(r)/e^r| is plotted for k = 0,...,50 using r = 20, 40 and 60, respectively. The abrupt transition in decay (around k ≈ r in each case) appears unique to Lanczos' formula, a phenomenon termed the "Lanczos shelf" in this work. When using the Lanczos formula in practice, the shelf behaviour suggests that the series (1.2) should be terminated at about the k ≈ r term, or equivalently, that a k-term approximation should use r ≈ k, since the decay rate of the coefficients after the cutoff point slows considerably.

Apart from the observations noted so far, the methods used both in the derivation of the main formula and for the calculation of the coefficients make Lanczos' work worthy of further consideration. The techniques used extend to other functions defined in terms of a particular integral transform.

Lanczos was primarily a physicist, although his contributions to mathematics and in particular numerical analysis were vast [15]. There is no question that he had an appreciation for rigour and pure analysis, and was no doubt adept and skilled technically.
At the same time, however, he seemed to have a certain philosophical view that the worth of a theoretical result should be measured in part by its practicality.

Figure 1.5: Lanczos shelf, r = 20

Figure 1.6: Lanczos shelf, r = 40

Figure 1.7: Lanczos shelf, r = 60

It is this practicality which is in evidence, more so than rigour, in his 1938 paper [11] and his book [12], in which much emphasis is placed on the benefits of interpolation over extrapolation of analytical functions (that is, Fourier series versus Taylor series). In these works one does not find the standard format of theorems followed by proofs common today. Rather, Lanczos explains his methods in very readable (and mostly convincing) prose followed by examples, and leaves many technical details to the reader. This same style pervades [14], the focus of this study, and is one of the motivations for the more thorough analysis undertaken here. A number of statements are given without proof, or even a hint of why they should in fact be true, and a number of authors have noted this scarcity of detail. Aspects of his work are variously described as "none is quite as neat . . . seemingly plucked from thin air" [21], "exceedingly curious" [29], and "complicated" [27], so a more detailed examination seems warranted.

1.2 Thesis Overview

The goals of this work are threefold:

(i) To explain in detail Lanczos' paper, including proofs of all statements.

(ii) From a practical point of view, to determine how (1.1) is best used to compute Γ(z+1), which reduces to an examination of the error resulting from truncation of the series (1.2), and to compare Lanczos' method with other known methods.

(iii) To note consequences and extensions of Lanczos' method.

The thesis is organized into eleven chapters, which fall into six more or less distinct categories. These are

I. History and Background of Γ(z+1): Chapter 2
II. Lanczos' Paper: Details and Proofs; Lanczos Limit Formula: Chapters 3–7
III. Error Discussion: Chapter 8
IV. Comparison of Calculation Methods: Chapter 9
V. Consequences and Extensions of the Theory: Chapter 10
VI. Conclusion and Future Work: Chapter 11

I have attempted to summarize the content and state main results at the beginning of each chapter, followed by details and proofs in the body. The remainder of this introductory chapter is dedicated to an overview of the six categories noted, with the aim of providing the reader with a survey of the main ideas, thus serving as a guide for navigating the rest of the thesis.

History and Background of Γ(z+1)

The gamma function is a familiar mathematical object for many, being the standard generalization of the factorial n! for non-negative integer n. The history of the function is not so well known, however, and so a brief one is given here, culminating with Euler's representation

    Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt

which is the form with which Lanczos begins his study. Following the formal definition of Γ(z+1), a number of standard identities are given with some explanation but without formal proof. The chapter concludes with a discussion of computational methods, beginning with Stirling's series, which is a generalization of the familiar Stirling's formula

    n! ∼ √(2πn) n^n e^{−n}   as n → ∞ .

Stirling's asymptotic series forms the basis for most computational algorithms for the gamma function and much has been written on its implementation. An example of the use of Stirling's series is given with a short discussion of error bounds. The second computational method noted is the relatively recent formula of Spouge [27]. This method is similar in form to Lanczos' formula but differs greatly in its origin.
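The one-term Stirling formula just quoted is easy to check numerically; the first correction term of Stirling's series predicts a relative error near 1/(12n). A minimal sketch:

```python
import math

def stirling(n):
    """One-term Stirling approximation: n! ~ sqrt(2*pi*n) * n^n * e^(-n)."""
    return math.sqrt(2.0 * math.pi * n) * n ** n * math.exp(-n)

def stirling_rel_error(n):
    # Relative error of the one-term formula; the first correction term
    # of Stirling's series predicts roughly 1/(12n).
    return math.factorial(n) / stirling(n) - 1.0
```

For n = 10 the relative error is about 0.0084 ≈ 1/120, and it shrinks roughly in proportion to 1/n as n grows.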
Spouge's formula takes the form

    Γ(z+1) = (z+a)^{z+1/2} e^{−(z+a)} (2π)^{1/2} [ c_0 + Σ_{k=1}^{N} c_k/(z+k) + ε(z) ]

where N = ⌈a⌉ − 1, c_0 = 1, and c_k is the residue of Γ(z+1) (z+a)^{−(z+1/2)} e^{z+a} (2π)^{−1/2} at z = −k. Although not quite as accurate as Lanczos' method using the same number of terms of the series, Spouge's method has simpler error estimates, and the coefficients c_k are much easier to compute. This method also extends to the digamma function Ψ(z+1) = d/dz [log Γ(z+1)] and trigamma function Ψ′(z).

Lanczos' Paper: Details and Proofs

Chapter 3 examines in detail Lanczos' paper, beginning with the derivation of the main formula (1.1). The derivation makes use of Fourier series in a novel way, by first defining an implicit function whose smoothness can be controlled with a free parameter r ≥ 0, and then replacing that function with its Fourier series. Specifically, beginning with

    Γ(z+1) = ∫_0^∞ t^z e^{−t} dt ,

these steps yield

    Γ(z+1/2) = (z+r+1/2)^{z+1/2} e^{−(z+r+1/2)} √2 ∫_{−π/2}^{π/2} cos^{2z}θ [ √2 v(θ)^r sinθ / log v(θ) ] dθ .   (1.3)

The function v(θ) in the integrand is defined implicitly by

    v (1 − log v) = cos²θ

where v is increasing, and v = 0, 1, e corresponds to θ = −π/2, 0, π/2, respectively. Denoting by f_r(θ) the term in square brackets in the integrand of (1.3), and by f_{E,r}(θ) its even part, the properties of this last function are such that it can be replaced by a uniformly convergent Fourier series

    f_{E,r}(θ) = a_0(r)/2 + Σ_{k=1}^∞ a_k(r) cos(2kθ) .

Integrating this series against cos^{2z}θ term by term then gives rise to the series S_r(z) of (1.2), with the help of a handy trigonometric integral.

The derivation of (1.1) can also be carried out using Chebyshev series instead of Fourier series (really one and the same), which seems closer in spirit to methods seen later for computing the coefficients a_k(r).
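Spouge's closed-form coefficients make his method especially short to implement. A sketch, with a = 12.5 (the value used for Table 2.2) and the residue of Γ(z+1) at z = −k taken as (−1)^{k−1}/(k−1)!:

```python
import math

def spouge_gamma(z, a=12.5):
    """Gamma(z+1) by Spouge's formula with N = ceil(a) - 1 terms."""
    N = math.ceil(a) - 1
    s = 1.0                                   # c_0 = 1
    for k in range(1, N + 1):
        # c_k = residue of Gamma(z+1)(z+a)^{-(z+1/2)} e^{z+a} (2pi)^{-1/2}
        # at z = -k, in closed form:
        c_k = ((-1) ** (k - 1) / math.factorial(k - 1)) \
              * (a - k) ** (k - 0.5) * math.exp(a - k) / math.sqrt(2.0 * math.pi)
        s += c_k / (z + k)
    return (z + a) ** (z + 0.5) * math.exp(-(z + a)) * math.sqrt(2.0 * math.pi) * s
```

Increasing a buys more accuracy at the cost of more terms; the simplicity of both the coefficients and the error bound is what makes the method attractive in practice.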
In fact, both the Fourier and Chebyshev methods can be generalized to a simple inner product using Hilbert space techniques, and this is noted as well.

From the theory of Fourier series, the smoothness of f_{E,r}(θ) governs the rate of asymptotic decrease of the coefficients a_k(r), which in turn determines the region of convergence of the expansion. This fundamental relationship motivates a detailed examination of the properties of v(θ), f_r(θ) and f_{E,r}(θ) in Chapter 4. Practical formulas for these functions are found in terms of Lambert W functions, and a few representative graphs are plotted for several values of r.

Chapter 5 addresses the convergence of the series S_r(z). Once the growth order of the a_k(r) is established, a bound on the functions

    H_k(z) = 1                                          if k = 0,
             z(z−1)⋯(z−k+1) / ((z+1)(z+2)⋯(z+k))        if k ≥ 1,        (1.4)

appearing in S_r(z) is found which leads to the conclusion that S_r(z) converges absolutely and uniformly on compact subsets of Re(z) > −r away from the negative integers, and thus equation (1.1) defines the gamma function in this region.

Finally, Chapter 6 addresses the practical problem of computing the a_k(r). The standard method provided by Fourier theory, namely the direct integration

    a_k(r) = (2/π) ∫_{−π/2}^{π/2} f_{E,r}(θ) cos(2kθ) dθ ,      (1.5)

proves impractical due to the complicated nature of f_{E,r}(θ). Lanczos overcomes this difficulty with the clever use of Chebyshev polynomials as follows: define

    F_r(z) = 2^{−1/2} Γ(z+1/2) (z+r+1/2)^{−z−1/2} exp(z+r+1/2) ,

which is just the integral in (1.3), and recall the defining property of the 2k-th Chebyshev polynomial:

    T_{2k}(cosθ) = Σ_{j=0}^{k} C_{2j,2k} cos^{2j}θ = cos(2kθ) .

Then (1.5) becomes

    a_k(r) = (2/π) ∫_{−π/2}^{π/2} f_{E,r}(θ) Σ_{j=0}^{k} C_{2j,2k} cos^{2j}θ dθ
           = (2/π) Σ_{j=0}^{k} C_{2j,2k} F_r(j) ,

a simple linear combination of F_r evaluated at integer arguments. Aside from Lanczos' method for computing the coefficients, several others are noted.
In particular, a recursive method similar to Horner's rule is described, as well as a matrix method which reduces floating point operations, thus avoiding some round off error. In addition, these matrix methods provide formulas for the partial fraction decomposition of the series S_r(z) once terminated at a finite number of terms.

The Lanczos Limit Formula

Lanczos concludes his paper with the following formula, which he claims is the limiting behaviour of (1.1), and which defines the gamma function in the entire complex plane away from the negative integers:

    Γ(z+1) = 2 lim_{r→∞} r^z [ 1/2 + Σ_{k=1}^∞ (−1)^k e^{−k²/r} H_k(z) ] .     (1.6)

The H_k(z) functions here are as defined by (1.4). He gives no proof, nor any explanation of why this should be true. In Chapter 7 the limit (1.6) is proved in detail, a result termed the "Lanczos Limit Formula" in this work. The proof is motivated by the plots in Figure 1.8 of the function f_r(θ) from equation (1.3).

Figure 1.8: f_r(θ), −π/2 ≤ θ ≤ π/2, for r = 0, 1, 2

Notice that as r increases, the area under f_r(θ) appears to be increasingly concentrated near θ = π/2, where f_r(π/2) = √2 e^r. The limit formula (1.6) suggests a rescaling of f_r(θ) to √(r/(2π)) f_r(θ) e^{−r}, which decays rapidly to zero on [−π/2, π/2] except at θ = π/2, where it has the value √(r/π). This behaviour is reminiscent of that of the Gaussian distribution √(r/π) exp[−r(θ−π/2)²] on (−∞, π/2], and indeed, plots of these two functions in the vicinity of θ = π/2 show a remarkable fit, even for small values of r. With some work, the limiting value of (1.3) is shown to be expressible in the form

    Γ(z+1/2) = lim_{r→∞} r^z ∫_{−∞}^{∞} cos^{2z}θ √r e^{−r(θ−π/2)²} dθ

which, upon replacement of cos^{2z}θ by its Fourier series, gives rise to the coefficients via

    (1/√π) ∫_{−∞}^{∞} cos(2kθ) r^{1/2} e^{−r(θ−π/2)²} dθ = (−1)^k e^{−k²/r} .
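The Lanczos Limit Formula (1.6) is easy to probe numerically at finite r; the sketch below truncates the k-sum once e^{−k²/r} is negligible. For non-negative integer z the sum terminates (H_k(z) = 0 for k > z); for instance, at z = 1 the right side reduces to r(1 − e^{−1/r}) = 1 − 1/(2r) + O(r^{−2}), so the error decays like 1/r.

```python
import math

def H(z, k):
    """H_k(z) = z(z-1)...(z-k+1) / ((z+1)...(z+k)), with H_0 = 1, as in (1.4)."""
    v = 1.0
    for j in range(k):
        v *= (z - j) / (z + j + 1)
    return v

def lanczos_limit_rhs(z, r):
    """Right-hand side of (1.6) at finite r; approaches Gamma(z+1) as r grows."""
    K = int(6.0 * math.sqrt(r)) + 2   # beyond k ~ sqrt(r), e^{-k^2/r} is negligible
    s = 0.5
    for k in range(1, K):
        s += (-1) ** k * math.exp(-k * k / r) * H(z, k)
    return 2.0 * r ** z * s
```

For example, lanczos_limit_rhs(1.0, 1000.0) is within about 5 × 10^{−4} of Γ(2) = 1.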
A comparison of (1.6) with the main formula (1.1) suggests that, for large r,

    √(πr/2) a_k(r) / (2 e^r) ∼ (−1)^k e^{−k²/r} ,       (1.7)

and plots of these coefficients for various r values support this relationship. From the smoothness analysis of f_r(θ) it follows that a_k(r) = O(k^{−⌈2r⌉}). Yet (1.7) suggests a much more rapid decrease of the coefficients for large fixed r and increasing k. Indeed, in his paper, Lanczos states:

    Generally, the higher r becomes, the smaller will be the value of the coefficients at which the convergence begins to slow down. At the same time, however, we have to wait longer, before the asymptotic stage is reached.

Although he does not make precise quantitatively how this conclusion is reached, it seems likely based on the comparison (1.7). Note that this description is precisely the behaviour of the coefficients observed at the beginning of this chapter. Unfortunately, the proof of the Lanczos Limit Formula does not clarify quantitatively Lanczos' observation, but it does shed some light on what perhaps motivated him to make it in the first place.

Error Discussion

To use Lanczos' formula in practice, the series S_r(z) is terminated at a finite number of terms, and an estimate of the resulting (relative) error is therefore required. Chapter 8 begins with a survey of existing error considerations found in the literature, followed by the two principal results of the chapter. The first is a simple empirical method for bounding the relative error. The second is an observation about the behaviour of the relative error as |z| → ∞. This observation leads to improved error bounds for certain choices of r as a function of the series truncation order n.

For the purposes of error estimation, one need only be concerned with Re(z) ≥ 0, since the reflection formula

    Γ(1+z) Γ(1−z) = πz / sin(πz)

provides estimates for gamma in the left half plane if the function in the right half plane is known.
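As a sketch of the reflection step on the real axis, with Python's real-argument math.gamma standing in for a right-half-plane evaluator:

```python
import math

def gamma_left(z):
    """Gamma(1+z) for negative non-integer z, obtained from right-half-plane
    values via the reflection formula Gamma(1+z)Gamma(1-z) = pi*z/sin(pi*z)."""
    return math.pi * z / (math.sin(math.pi * z) * math.gamma(1.0 - z))
```

For instance, gamma_left(-0.5) reproduces Γ(1/2) = √π using only an evaluation of Γ(3/2) in the right half plane.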
For notation, write

    S_r(z) = S_{r,n}(z) + ε_{r,n}(z) ,

where S_{r,n}(z) is the sum of the terms through k = n and

    ε_{r,n}(z) = Σ_{k=n+1}^∞ a_k(r) H_k(z) .

Following Lanczos, the function ε_{r,n}(z) is considered the relative error, which differs slightly from the standard definition, but which matters little when it comes to bounding the error. Lanczos gives, for Re(z) ≥ 0, uniform bounds on |ε_{r,n}(z)| for seven combinations of n and r values, but does not provide precise details as to how he obtains these figures. Attempts to reproduce his estimates based on his description were not successful, and no examination of his estimates appears elsewhere in the literature. Lanczos' figures do appear correct, however, based on the new estimation method which follows.

In order to bound |ε_{r,n}(z)|, observe that ε_{r,n}(z) is an analytic function on Re(z) ≥ 0. It turns out that

    lim_{|z|→∞} ε_{r,n}(z) = 1 − a_0(r)/2 − Σ_{k=1}^{n} a_k(r) ,

a constant denoted ε^∞_{r,n}, the error at infinity. Under these conditions, as a consequence of the maximum modulus principle, the maximum of |ε_{r,n}(z)| on Re(z) ≥ 0 must occur on the line Re(z) = 0, possibly at infinity. Furthermore, thanks to the Schwarz reflection principle, this maximum will occur on the positive imaginary axis. Thus to bound |ε_{r,n}(z)| on Re(z) ≥ 0, it is sufficient to bound |ε_{r,n}(it)|, 0 ≤ t < ∞: still a daunting task. The problem is overcome by first transforming the domain to a finite interval via the mapping z(t) = it/(1−t), 0 ≤ t < 1, and then observing that under this mapping the functions H_k(z(t)) have the simple, well behaved form

    H_k(z(t)) = ∏_{j=0}^{k−1} [ t(i+j) − j ] / [ t(i−j−1) + j + 1 ] .

Thus the maximum of |ε_{r,n}(z)| on Re(z) ≥ 0 can be easily estimated empirically by examining the first few terms of |ε_{r,n}(z(t))| on 0 ≤ t < 1. This estimation method is used to examine the dependence of ε_{r,n}(z)
The prevailing thought in the literature is that n should be chosen as a function of r; the approach to the problem here is just the opposite: r is selected as a function of n, with surprising results. Using the relative error in Stirling’s formula as a guide, the question is: for a given value of n, can r be chosen so that the relative error at infinity is zero? That is, does ǫr,n∞ = 0 have solutions? This question motivated an extensive numerical examination of the relative error functions ǫr,n(z) andǫ r,n∞ for n =0,...,60. The results of this investigation are tabulated in Appendix C. Experimentally, ǫr,n∞ = 0 was found to have a finite number of real solutions. For the values 0 n 60 examined, ǫr,n∞ had at most 2n +2 real zeros located between≤r =≤ 1/2 and r = n + 4. Furthermore, − for each n, the largest zero of ǫr,n∞ was found to give the least uniform bound on ǫ (z) . For example, Lanczos gives the following uniform | r,n | error bounds for various values of n and r (for Re(z) 0): ≥ 16 Chapter 1. Introduction n r ǫ (z) < | r,n | 1 1 0.001 1 1.5 0.00024 5 2 2 5.1 10− × 6 3 2 1.5 10− × 6 3 3 1.4 10− × 8 4 4 5 10− × 10 6 5 2 10− × By contrast, selecting r to be the largest zero of ǫr,n∞ yields the following dramatically improved uniform error bounds (again, for Re(z) 0): ≥ n r ǫr,n(z) < | | 4 1 1.489194 1.0 10− × 7 2 2.603209 6.3 10− × 8 3 3.655180 8.5 10− × 9 4 4.340882 4.3 10− × 12 6 6.779506 2.7 10− × Comparison of Calculation Methods The purpose of this chapter is to compare the methods of Lanczos, Spouge and Stirling in the context of an extended precision compu- tation. For each method, a detailed calculation of Γ(20 + 17i) with 32 relative error ǫρ < 10− is carried out. Additionally, formulas for | | 32 Γ(1 + z) with uniform error bound of 10− are given for each. The conclusion is that each method has its merits and shortcomings, and the question of which is best has no clear answer. 
For a uniformly bounded relative error, Lanczos' method seems most efficient, while Stirling's series yields very accurate results for $z$ of large modulus due to its error term, which decreases rapidly with increasing $|z|$. Spouge's method, on the other hand, is the easiest to implement thanks to its simple formulas for both the series coefficients and the error bound.

Consequences and Extensions of the Theory

Chapter 10 discusses a variety of results which follow from various aspects of Lanczos' paper and the theory of earlier chapters. The main result is a generalization of the integral transform in (1.3) which is termed a "Lanczos transform" in this work. Briefly, if $F(z)$ is defined for $\mathrm{Re}(z) \geq 0$ as
\[
F(z) = \int_{-\pi/2}^{\pi/2} \cos^{2z}\theta \, g(\theta) \, d\theta ,
\]
where $g(\theta) \in L^2[-\pi/2, \pi/2]$ is even, then
\[
F(z) = \sqrt{\pi}\, \frac{\Gamma(z+1/2)}{\Gamma(z+1)} \left[ \frac{a_0}{2} + a_1 \frac{z}{z+1} + a_2 \frac{z(z-1)}{(z+1)(z+2)} + \cdots \right] .
\]
The $a_k$ in the series are the Fourier coefficients of $g(\theta)$, and these are given by
\[
a_k = \frac{2}{\pi} \sum_{j=0}^{k} C_{2j,2k}\, F(j) ,
\]
exactly as in the gamma function case. Again, just as in the gamma function case, the smoothness of $g$ has a direct influence on the domain and speed of convergence of the series. The relative error resulting from truncating the series at a finite number of terms is constant at infinity, and so the empirical error bounding methods used in the gamma function case also apply to the more general Lanczos transform.

Aside from the Lanczos transform, two non-trivial combinatorial identities which follow directly from the Lanczos Limit Formula are noted. The third result is a variation on Stirling's formula. A result of the work in previous chapters was the one term approximation
\[
\Gamma(z+1) = (z+r+1/2)^{z+1/2}\, e^{-(z+r+1/2)}\, \sqrt{2\pi}\, (1 + \epsilon_{r,0}(z)) ,
\]
where $r \doteq 0.319264$, the largest zero of $\epsilon_{r,0}^{\infty}$, and $|\epsilon_{r,0}(z)| < 0.006$ everywhere in the right-half plane $\mathrm{Re}(z) \geq 0$.
The natural question is: can $r$ be chosen as a function of $z$ so that $\epsilon_{r(z),0}(z) = 0$, thus yielding
\[
\Gamma(z+1) = (z+r(z)+1/2)^{z+1/2}\, e^{-(z+r(z)+1/2)}\, \sqrt{2\pi} \; ?
\]
The answer is shown to be yes, with the help of Lambert $W$ functions, and numerical checks indicate that the functions $r(z)$ vary little over the entire real line.

Future Work and Outstanding Questions

The concluding chapter notes a number of outstanding questions and areas requiring further investigation. This topic is broken down into three categories: unresolved problems from Lanczos' paper itself, questions arising out of the theory developed in this study, particularly with respect to error bounds, and finally, a conjectured deterministic algorithm for calculating the gamma function based on the numerical evidence of Appendix C.

The last of these is of greatest practical interest. Letting $r(n)$ denote the largest zero of $\epsilon_{r,n}^{\infty}$ and $M_{r(n),n}$ the maximum of $|\epsilon_{r(n),n}(it)|$ for $0 \leq t < \infty$, a plot of $(n, -\log M_{r(n),n})$ reveals a near perfect linear trend
\[
n = -a \log M_{r(n),n} + b .
\]
The algorithm is then clear: given $\epsilon > 0$, select $n = \lceil -a \log \epsilon + b \rceil$, and set $r = r(n)$. Then (1.1) truncated after the $n$th term computes $\Gamma(1+z)$ with a uniformly bounded relative error of at most $\epsilon$. Although a proof is not available at present, the data is compelling and further investigation is warranted.

Chapter 2
A Primer on the Gamma Function

This chapter gives a brief overview of the history of the gamma function, followed by the definition which serves as the starting point of Lanczos' work. A summary of well known results and identities involving the gamma function is then given, concluding with a survey of two computational methods with examples: Stirling's series and Spouge's method. The introductory material dealing with the definition and resulting identities is standard and may be safely skipped by readers already familiar with the gamma function.
The computational aspects of the function, on the other hand, may be less familiar, and readers may therefore wish to include the material starting with Section 2.5.

2.1 First Encounters with Gamma

Most any student who has taken a mathematics course at the senior secondary level has encountered the gamma function in one form or another. For some, the initial exposure is accidental: $n!$ is that button on their scientific calculator which causes overflow errors for all but the first few dozen integers. The first formal treatment, however, is typically in the study of permutations, where for integer $n \geq 0$ the factorial function first makes an appearance in the form
\[
n! =
\begin{cases}
1 & \text{if } n = 0 , \\
n\,(n-1)! & \text{else.}
\end{cases}
\]
Once some familiarity with this new object is acquired (and students are convinced that $0!$ should indeed equal 1), it is generally applied to counting arguments in combinatorics and probability. This is the end of the line for some students. Others continue on to study calculus, where $n!$ is again encountered in the freshman year, typically in the result
\[
\frac{d^n}{dx^n} x^n = n!
\]
for $n$ a non-negative integer. Closely related to this result is the appearance of $n!$ in Taylor's formula. Some are asked to prove
\[
n! = \int_0^{\infty} t^n e^{-t} \, dt \tag{2.1}
\]
as homework, while a fortunate few learn how (2.1) "makes sense" for non-integer $n$. Some even go so far as to prove
\[
n! \sim \sqrt{2\pi n}\, n^n e^{-n} \quad \text{as } n \to \infty .
\]
At this point the audience begins to fracture: some leave mathematics and the gamma function forever, having satisfied their credit requirement. Others push on into engineering and the physical sciences, and a few venture into the more abstract territory of pure mathematics. The last two of these groups have not seen the end of the gamma function.
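Both the recursive definition and the integral representation (2.1) lend themselves to a quick numerical sanity check; the sketch below (an illustration for this exposition, not part of the thesis) truncates the improper integral at $t = 60$, which is more than adequate for small $n$:

```python
import math

def fact(n):
    # the recursive definition: 0! = 1, n! = n * (n-1)!
    return 1 if n == 0 else n * fact(n - 1)

def gamma_integral(n, upper=60.0, steps=200_000):
    # crude trapezoid-rule estimate of the integral in (2.1);
    # the t = 0 endpoint contributes 0 for n >= 1
    h = upper / steps
    total = 0.5 * (upper ** n) * math.exp(-upper)
    for i in range(1, steps):
        t = i * h
        total += t ** n * math.exp(-t)
    return total * h

assert fact(5) == 120
assert abs(gamma_integral(5) - 120) < 1e-4
```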
For the physical scientists, the next encounter is in the study of differential equations and Laplace transform theory, and for those who delve into asymptotics, computation of the gamma function is a classic application of the saddle point method. In statistics the gamma function forms the basis of a family of density functions which includes the exponential and chi-square distributions.

For the mathematicians the show is just beginning. In complex analysis they learn that not only can the gamma function be extended to non-integer arguments, but even to an analytic function away from the negative integers. Computationally, they learn about Stirling's formula and the associated Stirling series. In number theory, the gamma function is the springboard for developing the analytic continuation of Riemann's zeta function $\zeta(s)$, which leads into the prime number theorem and a tantalizing glimpse of the Riemann Hypothesis.

The gamma function has many guises, but exactly how is it defined, and more importantly for our purposes, how does one compute it? So far, a precise definition of the function, and notation for it (save the $n!$ used), has been avoided. The gamma function is a generalization of $n!$, but the question remains: what definition generalizes the factorial function in the most natural way?

2.2 A Brief History of Gamma

Most works dealing with the gamma function begin with a statement of the definition, usually in terms of an integral or a limit. These definitions generalize the factorial $n!$, but there are many other functions which interpolate $n!$ between the non-negative integers; why are the standard definitions the "right" ones?

The life of gamma is retraced in the very entertaining paper of Davis [6], according to which the year 1729 saw "the birth" of the gamma function as Euler studied the pattern $1,\; 1 \cdot 2,\; 1 \cdot 2 \cdot 3,\; \ldots$.
The problem was simple enough: it was well known that interpolating formulas of the form
\[
1 + 2 + \cdots + n = \frac{n(n+1)}{2}
\]
existed for sums; was there a similar formula $f(n) = 1 \cdot 2 \cdots n$ for products? The answer, Euler showed, was no; the products $1 \cdot 2 \cdots n$ would require their own symbol. The notation $n!$ was eventually adopted (though not universally) to denote the product, however this notation did not arrive on the scene until 1808 when C. Kramp used it in a paper. Today the symbol $n!$ is normally read "$n$-factorial", although some early English texts suggested the somewhat amusing reading of "$n$-admiration". Refer to [4] for a detailed history of the notation.

Euler went on further and found representations of the factorial which extended its domain to non-integer arguments. One was a product
\[
x! = \lim_{m \to \infty} \frac{m!\,(m+1)^x}{(x+1)\cdots(x+m)} , \tag{2.2}
\]
and the second an integral representation:
\[
x! = \int_0^1 (-\log t)^x \, dt . \tag{2.3}
\]
The product (2.2) converges for any $x$ not a negative integer. The integral (2.3) converges for real $x > -1$. The substitution $u = -\log t$ puts (2.3) into the more familiar form
\[
x! = \int_0^{\infty} u^x e^{-u} \, du .
\]
In 1808 Legendre introduced the notation which would give the gamma function its name.¹ He defined
\[
\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t} \, dt , \tag{2.4}
\]
so that $\Gamma(n+1) = n!$. It is unclear why Legendre chose to shift the argument in his notation. Lanczos remarks in the opening paragraph of his paper [14]:

    The normalization of the gamma function Γ(n+1) instead of Γ(n) is due to Legendre and void of any rationality. This unfortunate circumstance compels us to utilize the notation z! instead of Γ(z+1), although this notation is obviously highly unsatisfactory from the operational point of view.

Lanczos' was not the only voice of objection.

¹ Whittaker and Watson [30] give 1814 as the date, however.
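Euler's product (2.2) converges, albeit slowly; a small numerical sketch (an illustration, not from the thesis) works in logarithms to avoid overflow and compares against $(1/2)! = \Gamma(3/2) = \sqrt{\pi}/2$:

```python
import math

def euler_product(x, m):
    # x! ~ m! (m+1)^x / ((x+1)(x+2)...(x+m)), Euler's product (2.2)
    log_val = math.lgamma(m + 1) + x * math.log(m + 1)
    for k in range(1, m + 1):
        log_val -= math.log(x + k)
    return math.exp(log_val)

# convergence is only O(1/m), so a large m is needed for modest accuracy
assert abs(euler_product(0.5, 200_000) - math.sqrt(math.pi) / 2) < 1e-4
```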
Edwards, in his well-known treatise on the Riemann zeta function [8], reverts to Gauss' notation $\Pi(s) = s!$, stating:

    Unfortunately, Legendre subsequently introduced the notation Γ(s) for Π(s−1). Legendre's reasons for considering (n−1)! instead of n! are obscure (perhaps he felt it was more natural to have the first pole occur at s = 0 rather than at s = −1) but, whatever the reason, this notation prevailed in France and, by the end of the nineteenth century, in the rest of the world as well. Gauss's original notation appears to me to be much more natural and Riemann's use of it gives me a welcome opportunity to reintroduce it.

It is true, however, that (2.4) represents $\Gamma(x)$ as the Mellin transform of $\exp(-t)$, so that Legendre's normalization is the right one in some sense, though Mellin was not born until some twenty one years after Legendre's death. An interesting recent investigation into the history of the $\Gamma$ notation appears in the paper by Gronau [10].

The question remains, are Euler's representations (2.2) and (2.3) the "natural" generalizations of the factorial? These expressions are equivalent for $x > -1$ (see [16]), so consider the integral form (2.3), or equivalently, (2.4). Some compelling evidence in support of (2.4) comes in the form of the complete characterization of $\Gamma(x)$ by the Bohr-Mollerup theorem (see [23]):

Theorem 2.1. Suppose $f : (0,\infty) \to (0,\infty)$ is such that $f(1) = 1$, $f(x+1) = x f(x)$, and $f$ is log-convex. Then $f = \Gamma$.

So as a function of a real variable, $\Gamma(x)$ has nice behaviour. But if gamma is extended beyond the real line, its properties are even more satisfying: as a function of a complex variable $z$, $\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt$ defines an analytic function on $\mathrm{Re}(z) > 0$. Along with the fundamental recursion $\Gamma(z+1) = z\Gamma(z)$, $\Gamma$ can then be extended to an analytic function on $z \in \mathbb{C} \setminus \{0, -1, -2, -3, \ldots\}$.
Indeed, Euler's formula (2.2) converges and defines $\Gamma$ on this same domain.

The definition created to describe a mathematical object is an artificial absolute which, it is hoped, precisely captures and describes every aspect which characterizes the object. However, it is impossible to state with certainty that a particular definition is in some way the correct one. In the present case, a definition of the correct generalization of $n!$ is sought. The Bohr-Mollerup theorem and the analytic properties of Euler's $\Gamma$ strongly suggest that (2.4) is indeed the right definition. The appearance of Euler's $\Gamma$ in so many areas of mathematics, and its natural relationship to so many other functions, further supports this claim. This viewpoint is perhaps best summarized by Temme in [28]:

    This does not make clear why Euler's choice of generalization is the best one. But, afterwards, this became abundantly clear. Time and again, the Euler gamma function shows up in a very natural way in all kinds of problems. Moreover, the function has a number of interesting properties.

So is $\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt$ the natural extension of $n!$? It seems so, and we will take it as such and study its properties, and in particular, how to compute it.

2.3 Definition of Γ

As noted, there are several equivalent ways to define the gamma function. Among these, the one which served as the starting point for Lanczos' paper [14] will be used:

Definition 2.1. For $z \in \mathbb{C}$ with $\mathrm{Re}(z) > -1$, the gamma function is defined by the integral
\[
\Gamma(z+1) = \int_0^{\infty} t^z e^{-t} \, dt . \tag{2.5}
\]

From this definition, we immediately state

Theorem 2.2. For $\Gamma(z+1)$ defined by (2.5):

(i) $\Gamma(z+1)$ is analytic on $\mathrm{Re}(z) > -1$;

(ii) $\Gamma(z+1) = z\Gamma(z)$ on $\mathrm{Re}(z) > 0$;

(iii) $\Gamma(z+1)$ extends to an analytic function on $\mathbb{C}$ with simple poles at the negative integers.

Proof of Theorem 2.2:

(i) Let $\Omega$ denote the domain $\mathrm{Re}(z) > -1$.
The integrand of $\int_0^{\infty} t^z e^{-t}\,dt$ is continuous as a function of $t$ and $z$, and for fixed $t > 0$ is analytic on $\Omega$. Further, the integral converges uniformly on compact subsets of $\Omega$. Hence $\Gamma(z+1)$ is analytic on $\Omega$.

(ii) For $\mathrm{Re}(z) > 0$, integrating by parts gives
\[
\Gamma(z+1) = \int_0^{\infty} t^z \, d(-e^{-t})
= \left[ -t^z e^{-t} \right]_0^{\infty} + z \int_0^{\infty} t^{z-1} e^{-t} \, dt
= z\Gamma(z) .
\]

(iii) For $\mathrm{Re}(z) > 0$ and any integer $n \geq 0$, the fundamental recursion $\Gamma(z+1) = z\Gamma(z)$ implies
\[
\Gamma(z) = \frac{\Gamma(z+n+1)}{\prod_{k=0}^{n} (z+k)} . \tag{2.6}
\]
The right hand side of (2.6) is analytic on $\mathrm{Re}(z) > -n-1$, $z$ not a negative integer, and is equal to $\Gamma(z)$ on $\mathrm{Re}(z) > 0$. Hence (2.6) is the unique analytic continuation of $\Gamma(z)$ to the right half plane $\mathrm{Re}(z) > -n-1$, $z$ not a negative integer. Since $n$ was an arbitrary non-negative integer, $\Gamma(z)$ can be extended to an analytic function on $\mathbb{C} \setminus \{0, -1, -2, -3, \ldots\}$.

For $z = -n$ a negative integer, (2.6) shows that $\Gamma$ has a simple pole with residue
\[
\lim_{z \to -n} (z+n)\Gamma(z)
= \lim_{z \to -n} (z+n)\, \frac{\Gamma(z+n+1)}{\prod_{k=0}^{n} (z+k)}
= \frac{\Gamma(1)}{\prod_{k=0}^{n-1} (-n+k)}
= \frac{(-1)^n}{n!} .
\]

An interesting generalization of (2.5) due to Cauchy leads to another analytic continuation of the gamma function. Let $\sigma = \mathrm{Re}(-z-1)$. Then
\[
\Gamma(z+1) = \int_0^{\infty} t^z \left( e^{-t} - \sum_{0 \leq k < \sigma} \frac{(-t)^k}{k!} \right) dt . \tag{2.7}
\]
An empty sum in the integrand of (2.7) is to be considered zero. This integral converges and is analytic for $z \in \mathbb{C}$ such that $\mathrm{Re}(z) \neq -1, -2, -3, \ldots$, and reduces to (2.5) if $\mathrm{Re}(z) > -1$. See [30, pp. 243–244].

Figure 2.1 shows a graph of $\Gamma(z+1)$ for $z$ real. Note the poles at $z = -1, -2, \ldots$ and the contrast between the distinctly different behaviours of the function on either side of the origin. Figure 2.2 shows a surface plot of $|\Gamma(x+iy)|$ which illustrates the behaviour of the function away from the real axis.
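Formula (2.6) is also a practical recipe for the continuation: shift the argument to the right until the standard definition applies, then divide out the accumulated factors. A minimal sketch (using Python's built-in gamma for the shifted value; not code from the thesis):

```python
import math

def gamma_continued(x, n=10):
    # (2.6): Gamma(x) = Gamma(x + n + 1) / (x (x+1) ... (x+n)),
    # valid for real x not a non-positive integer, with x + n + 1 > 0
    denom = 1.0
    for k in range(n + 1):
        denom *= x + k
    return math.gamma(x + n + 1) / denom

# Gamma(-5/2) = -8 sqrt(pi) / 15, in agreement with the continuation
assert abs(gamma_continued(-2.5) + 8 * math.sqrt(math.pi) / 15) < 1e-10
```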
[Figure 2.1: graph of Γ(z+1) for real z.]

2.4 Standard Results and Identities

From the proof of Theorem 2.2 it is clear that the fundamental recursion $\Gamma(z+1) = z\Gamma(z)$ holds for all $z \in \mathbb{C} \setminus \{0, -1, -2, -3, \ldots\}$. There are many more properties of the gamma function which follow from the definition and Theorem 2.2. The main ones are stated here without proof; the interested reader is referred to [16, pp. 391–412] for details.

The first result is Euler's formula
\[
\Gamma(z) = \lim_{n \to \infty} \frac{n!\, n^z}{z(z+1)\cdots(z+n)} , \tag{2.8}
\]
which converges for all $z \in \mathbb{C} \setminus \{0, -1, -2, -3, \ldots\}$. This identity follows from integrating by parts in $\int_0^n t^{z-1}(1-t/n)^n\,dt$ to get
\[
\int_0^n t^{z-1} \left( 1 - \frac{t}{n} \right)^{\!n} dt = \frac{n!\, n^z}{z(z+1)\cdots(z+n)}
\]
and then letting $n \to \infty$.

[Figure 2.2: surface plot of |Γ(x+iy)|.]

Rewriting (2.8) in reciprocal form as
\[
\lim_{n \to \infty} \frac{z(z+1)\cdots(z+n)}{n!\, n^z}
= \lim_{n \to \infty} z\, e^{z(-\log n + \sum_{k=1}^{n} 1/k)} \prod_{k=1}^{n} \left( 1 + \frac{z}{k} \right) e^{-z/k}
\]
yields the Weierstrass product
\[
\frac{1}{\Gamma(z)} = z\, e^{\gamma z} \prod_{k=1}^{\infty} \left( 1 + \frac{z}{k} \right) e^{-z/k} . \tag{2.9}
\]
The $\gamma$ appearing in (2.9) is the Euler (or sometimes Euler-Mascheroni) constant $\gamma = \lim_{n \to \infty} \left( \sum_{k=1}^{n} 1/k - \log n \right) \approx 0.5772\ldots$.

Writing both $\Gamma(z)$ and $\Gamma(-z)$ using (2.9) and recalling the infinite product representation of the sine function
\[
\sin z = z \prod_{k=1}^{\infty} \left( 1 - \frac{z^2}{k^2 \pi^2} \right) ,
\]
we arrive at the very useful reflection formula
\[
\Gamma(1+z)\,\Gamma(1-z) = \frac{\pi z}{\sin \pi z} . \tag{2.10}
\]
Equation (2.10) is of great practical importance since it reduces the problem of computing the gamma function at arbitrary $z$ to that of $z$ with positive real part.

2.5 Stirling's Series and Formula

Among the well known preliminary results on the gamma function, one is of particular importance computationally and requires special attention: Stirling's series.
This is an asymptotic expansion for $\Gamma(z+1)$ which permits evaluation of the function to any prescribed accuracy, and variations of this method form the basis of many calculation routines.

2.5.1 The Formula

Stirling's series is named after James Stirling (1692–1770), who in 1730 published the simpler version for integer arguments commonly known as Stirling's Formula. The more general Stirling's series can be stated thus:
\[
\log \left[ \Gamma(z+1) \prod_{k=1}^{N} (z+k) \right]
= (z+N+1/2)\log(z+N) - (z+N) + \frac{1}{2}\log 2\pi
+ \sum_{j=1}^{n} \frac{B_{2j}}{2j(2j-1)(z+N)^{2j-1}}
- \int_0^{\infty} \frac{B_{2n}(x)}{2n(z+N+x)^{2n}} \, dx . \tag{2.11}
\]
The product in the left hand side of (2.11) is understood to be one if $N = 0$. The $B_{2j}$ are the Bernoulli numbers, which are defined by writing $t/(e^t - 1)$ as a Maclaurin series:
\[
\frac{t}{e^t - 1} = \sum_{j=0}^{\infty} \frac{B_j}{j!}\, t^j .
\]
The function $B_{2n}(x)$ is the $2n$th Bernoulli polynomial, defined over $\mathbb{R}$ by periodic extension of its values on $[0,1]$. By judiciously selecting $n$ and $N$, the absolute error in $\log \Gamma(z+1)$, that is, the absolute value of the integral term in (2.11), can be made as small as desired, and hence the same can be achieved with the relative error of $\Gamma(z+1)$ itself. Equation (2.11) is valid in the so-called slit plane $\mathbb{C} \setminus \{\, t \in \mathbb{R} \mid t \leq -N \,\}$.

Equation (2.11) with $N = 0$ is derived using Euler-Maclaurin summation. Alternatively, again with $N = 0$, begin with the formula of Binet obtained from an application of Plana's theorem,
\[
\log \Gamma(z+1) = (z+1/2)\log z - z + \frac{1}{2}\log 2\pi + 2\int_0^{\infty} \frac{\arctan(t/z)}{e^{2\pi t} - 1} \, dt ,
\]
and expand $\arctan(t/z)$ in the last term into a Taylor polynomial with remainder. The resulting series of integrals produces the terms of (2.11) containing the Bernoulli numbers [30, pp. 251–253]. The general form (2.11) with $N \geq 1$ is obtained by replacing $z$ with $z+N$ and applying the fundamental recursion.

Exponentiation of Stirling's series (with $N = 0$) yields Stirling's asymptotic formula for $\Gamma(z+1)$ itself:
\[
\Gamma(z+1) = e^{-z} z^{z+1/2} (2\pi)^{1/2} \left( 1 + \frac{1}{12z} + \frac{1}{288z^2} + \cdots \right) . \tag{2.12}
\]
Equation (2.12) also follows from applying Laplace's method (or equivalently, the saddle point method) to (2.5). Truncating the series in (2.12) after the constant term yields the familiar Stirling's formula noted in the introduction:
\[
\Gamma(z+1) \sim \sqrt{2\pi z}\, z^z e^{-z} \quad \text{as } |z| \to \infty . \tag{2.13}
\]
It may be tempting to let $n \to \infty$ in (2.11) with the hope that the integral error term will go to zero, but unfortunately the series containing the $B_{2j}$ becomes divergent. The terms of this sum initially decrease in modulus, but then begin to grow without bound as a result of the rapid growth of the Bernoulli numbers. The odd Bernoulli numbers are all zero except for $B_1 = -1/2$, while for the even ones, $B_0 = 1$, and for $j \geq 1$,
\[
B_{2j} = \frac{(-1)^{j+1}\, 2\,(2j)!\, \zeta(2j)}{(2\pi)^{2j}} . \tag{2.14}
\]
Since $\zeta(2j) = \sum_{k=1}^{\infty} k^{-2j} \to 1$ rapidly with increasing $j$, $B_{2j} = O\!\left((2j)!\,(2\pi)^{-2j}\right)$ (see [8, p. 105]).

What is true, however, is that the integral error term in (2.11) tends to zero uniformly as $|z| \to \infty$ in any sector $|\arg(z+N)| \leq \delta < \pi$. In terms of $\Gamma(z+1)$ itself, this means the relative error tends to zero uniformly as $|z| \to \infty$ in any sector $|\arg(z+N)| \leq \delta < \pi$.

In order to use (2.11) to evaluate the gamma function, an estimate of the error (as a function of $n$, $N$, and $z$) is required when the integral term of (2.11) is omitted. The following result is due to Stieltjes [8, p. 112]:

Theorem 2.3. Let $\theta = \arg(z+N)$, where $-\pi < \theta < \pi$, and denote the absolute error by
\[
E_{N,n}(z) = \int_0^{\infty} \frac{B_{2n}(x)}{2n(z+N+x)^{2n}} \, dx .
\]
Then
\[
|E_{N,n}(z)| \leq \left( \frac{1}{\cos(\theta/2)} \right)^{2n+2} \left| \frac{B_{2n+2}}{(2n+2)(2n+1)(z+N)^{2n+1}} \right| . \tag{2.15}
\]
In other words, if the series in (2.11) is terminated after the $B_{2n}$ term and the integral term is omitted, the resulting error is at most $(\cos(\theta/2))^{-2n-2}$ times the $B_{2n+2}$ term.

To get an idea of how to use Stirling's series and the error estimate (2.15), consider the following

Example 2.1. Compute $\Gamma(7+13i)$ accurate to within an absolute error of $\epsilon = 10^{-12}$.
Solution: For convenience, let $E_{N,n}$ denote the integral error term $E_{N,n}(6+13i)$ in this example. The first task is to translate the absolute bound $\epsilon = 10^{-12}$ on the error in computing $\Gamma(7+13i)$ into a bound on $E_{N,n}$. Let $\log(G P_N)$ denote the left hand side of (2.11), where $G$ represents the true value of $\Gamma(7+13i)$ and $P_N$ the product term. Let $\log G_{N,n}$ denote the right hand side of (2.11) without the integral error term. Then (2.11) becomes
\[
\log(G P_N) = \log G_{N,n} + E_{N,n} ,
\]
so that
\[
G = \frac{G_{N,n}}{P_N}\, e^{E_{N,n}} ,
\]
and the requirement is
\[
\left| \frac{G_{N,n}}{P_N}\, e^{E_{N,n}} - \frac{G_{N,n}}{P_N} \right| < \epsilon .
\]
That is, it is desired that
\[
\left| e^{E_{N,n}} - 1 \right| < \frac{\epsilon}{|G_{N,n}/P_N|} ,
\]
from which
\[
|E_{N,n}| \left( 1 + \frac{|E_{N,n}|}{2!} + \frac{|E_{N,n}|^2}{3!} + \cdots \right) < \frac{\epsilon}{|G_{N,n}/P_N|} .
\]
The infinite series is at most $e$ in modulus provided $|E_{N,n}| \leq 1$, so that if $|E_{N,n}| < \epsilon/(e\,|G_{N,n}/P_N|)$, the prescribed accuracy is guaranteed. This last estimate is self-referencing, but $G_{N,n}/P_N$ can be estimated using Stirling's formula (2.13). That is, we require
\[
|E_{N,n}| < \frac{\epsilon\, |e^{z-1}|}{\sqrt{2\pi}\, |z^{z+1/2}|} \approx 5 \times 10^{-12}
\]
upon setting $z = 6+13i$.

Now that a target bound for $|E_{N,n}|$ has been determined, $n$ and $N$ must now be selected to meet the target. In Table 2.1 are listed bounds on $|E_{N,n}|$ as given by (2.15) for $z = 6+13i$ and various combinations of $(N,n)$:

    n    N = 0          N = 1          N = 2          N = 3          N = 4
    1    1.9 x 10^{-6}   1.6 x 10^{-6}   1.3 x 10^{-6}   1.1 x 10^{-6}   9.7 x 10^{-7}
    2    3.7 x 10^{-9}   2.8 x 10^{-9}   2.2 x 10^{-9}   1.7 x 10^{-9}   1.3 x 10^{-9}
    3    1.9 x 10^{-11}  1.3 x 10^{-11}  9.1 x 10^{-12}  6.4 x 10^{-12}  4.4 x 10^{-12}
    4    1.9 x 10^{-13}  1.2 x 10^{-13}  7.3 x 10^{-14}  4.6 x 10^{-14}  2.9 x 10^{-14}
    5    2.9 x 10^{-15}  1.6 x 10^{-15}  9.3 x 10^{-16}  5.3 x 10^{-16}  3.1 x 10^{-16}

    Table 2.1: Upper bound on |E_{N,n}(6+13i)| in Stirling Series

We can see that $N = 0$ with $n = 4$ is sufficient to produce the desired accuracy.
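In code, the resulting truncated series (2.11) with $N = 0$, $n = 4$ can be evaluated directly in double precision; the following sketch (an illustration, not the thesis's forty-digit Maple computation) hard-codes $B_2, B_4, B_6, B_8 = 1/6, -1/30, 1/42, -1/30$:

```python
import cmath
import math

# Stirling's series (2.11) with N = 0, truncated at n = 4, for z = 6 + 13i
B = {2: 1/6, 4: -1/30, 6: 1/42, 8: -1/30}
z = 6 + 13j
log_g = ((z + 0.5) * cmath.log(z) - z + 0.5 * math.log(2 * math.pi)
         + sum(B[2 * j] / (2 * j * (2 * j - 1) * z ** (2 * j - 1))
               for j in (1, 2, 3, 4)))
approx = cmath.exp(log_g)   # double-precision estimate of Gamma(7 + 13i)
```

Within double-precision rounding, `log_g` and `approx` reproduce the values computed in the example below.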
Using these values with $z = 6+13i$, and leaving off the error term in (2.11), then gives
\[
\log \Gamma(7+13i) \approx \log G_{0,4}
= (6+13i+1/2)\log(6+13i) - (6+13i) + \frac{1}{2}\log 2\pi
+ \sum_{j=1}^{4} \frac{B_{2j}}{2j(2j-1)(6+13i)^{2j-1}}
\doteq -2.5778902638380984 + 28.9938056395651838\, i .
\]
Taking exponentials yields the final approximation
\[
\Gamma(7+13i) \approx G_{0,4} \doteq -0.0571140842611710 - 0.0500395762571980\, i .
\]
Using the symbolic computation package Maple with forty digit precision, the absolute error in the estimate is found to be $|\Gamma(7+13i) - G_{0,4}| \doteq 2.5 \times 10^{-15}$, which is well within the prescribed bound.

Observe from Table 2.1 that $N = 4$ with $n = 3$ would also have given the desired accuracy, in which case only $n = 3$ terms of the series would be needed.

The reader has no doubt noticed the ad-hoc manner in which $n$ and $N$ were selected in this example. The question of how best to choose $n$ and $N$ given $z$ and absolute error $\epsilon$ is not an easy one, and will not be addressed here. It is a simpler matter, however, to determine the maximum accuracy that can be expected for a given choice of $N$, both for a fixed $z$ and uniformly, and this is done in the following sections.

2.5.2 Some Remarks about Error Bounds

From Table 2.1 it may seem difficult to believe that as $n$ continues to increase, the error will reach a minimum and then eventually grow without bound. However, for a fixed $z$ and $N$ it is possible to determine the value of $n$ which gives the least possible error bound that can be achieved with (2.15). More generally, for each $N \geq 1$, it is possible to establish the best possible uniform error estimate reported by (2.15) for $\mathrm{Re}(z) \geq 0$.

In the following discussion, let
\[
U_{N,n}(z) = \left( \frac{1}{\cos(\theta/2)} \right)^{2n+2} \frac{B_{2n+2}}{(2n+2)(2n+1)(z+N)^{2n+1}} ,
\]
which is the right hand side of (2.15). Recall that $\theta = \arg(z+N)$.

Minimum of |U_{N,n}(6 + 13i)|

For each $N$, what is the best accuracy that can be achieved in Example 2.1 using the error estimate in Stirling's series (2.15)?
Throughout this section let $|U_{N,n}| = |U_{N,n}(6+13i)|$. To find the minimum value of $|U_{N,n}|$, with $N$ fixed, it is sufficient to determine the $n$ value at which the sequence of $|U_{N,n}|$ changes from decreasing to increasing, which corresponds to the first $n$ such that $|U_{N,n}/U_{N,n+1}| \leq 1$. Using the explicit form of the Bernoulli numbers (2.14), the task then is to find the least $n$ such that
\[
\frac{(2\pi)^2 |z+N|^2 \cos^2(\theta/2)}{(2n+2)(2n+1)} \, \frac{\zeta(2n+2)}{\zeta(2n+4)} \leq 1 .
\]
That is, we want the least $n$ satisfying
\[
\pi |z+N| \cos(\theta/2) \leq n \sqrt{\left( 1 + \frac{3}{2n} + \frac{1}{2n^2} \right) \frac{\zeta(2n+4)}{\zeta(2n+2)}} .
\]
The square root term is at most $\sqrt{14}\,\pi/7$ when $n = 1$ and decreases to 1 rapidly, so that if $n$ is chosen
\[
n \approx \lceil \pi\, |z+N| \cos(\theta/2) \rceil , \tag{2.16}
\]
a minimal error should result.

In the case of $z = 6+13i$, $\cos(\theta/2) = \sqrt{1/2 + 3/\sqrt{205}}$ and $N = 0$, (2.16) predicts a minimum error for $n = 38$. Using Maple the corresponding error is found to be $|E_{0,38}| < 1.3 \times 10^{-34}$. This bound is impressive, but bear in mind that the corresponding Bernoulli number required to achieve this is $|B_{76}| \approx 8 \times 10^{50}$. This is the best possible accuracy for $z = 6+13i$ with $N = 0$; greater accuracy is possible as $N$ is increased from zero. For example, with $N = 4$, (2.16) predicts a minimum error for $n = 47$, at which $|E_{4,47}| < 6.7 \times 10^{-42}$.

Uniform Bound of |U_{N,n}(z)|

As demonstrated, for fixed $z$ and $N$ there is a limit to the accuracy which can be achieved by taking more and more terms of the Stirling series. It is worthwhile to determine this limiting accuracy in the form of the best possible uniform error bound as a function of $N \geq 0$. By the reflection formula (2.10), it is enough to consider only the right half plane $\mathrm{Re}(z) \geq 0$. Now for $N \geq 0$ fixed and $z+N = u+iv$, $|U_{N,n}(z)|$ will be worst possible where
\[
\cos(\theta/2)^{2n+2}\, |u+iv|^{2n+1}
\]
is a minimum in $u \geq N$. If $N = 0$, letting $u+iv \to 0$ along the positive real axis shows that $|U_{0,n}|$ grows without bound. Assume then that $N \geq 1$.
Writing $\cos(\theta/2) = \sqrt{(1+\cos\theta)/2}$ and $\cos\theta = u/\sqrt{u^2+v^2}$ puts the expression above in the form
\[
\cos(\theta/2)^{2n+2}\, |u+iv|^{2n+1}
= \frac{1}{2^{n+1}} \left( u^2+v^2 \right)^{n/2} \left( \sqrt{u^2+v^2} + u \right)^{n+1} ,
\]
which is clearly minimized when $u$ and $v$ are minimized, that is, at $v = 0$ and $u = N$. In terms of $z$ and $\theta$, this says that $|U_{N,n}(z)|$ is worst possible at $z = \theta = 0$. By (2.16), $n$ should then be chosen $n \approx \lceil \pi N \rceil$, and we have
\[
|U_{N,n}| \leq \left| \frac{B_{2n+2}}{(2n+2)(2n+1)N^{2n+1}} \right|
= \frac{2\,(2n+2)!\,\zeta(2n+2)}{(2\pi)^{2n+2}(2n+2)(2n+1)N^{2n+1}} \quad \text{from (2.14)}
\]
\[
\approx \frac{2\,(2n)!}{(2\pi)^{2n+2} N^{2n+1}} \quad \text{since } \zeta(2n+2) \approx 1
\]
\[
\approx \frac{2\sqrt{4\pi n}\,(2n)^{2n} e^{-2n}}{(2\pi)^{2n+2} N^{2n+1}} \quad \text{from (2.13)}
\]
\[
= \frac{e^{-2\pi N}}{\pi\sqrt{N}} \quad \text{upon setting } n \approx \pi N .
\]
So selecting $n \approx \lceil \pi N \rceil$ results in a uniform bound which decreases rapidly with $N$. The problem remains, however, that as $N$, and hence $n$, becomes larger, so too do the Bernoulli numbers required in the Stirling series. In practice, unless one is dealing with $z$ values near the origin, one can use much smaller values of $N$ which result in acceptable error bounds.

2.6 Spouge's Method

To conclude this survey of standard results, special mention is made of the 1994 work of Spouge [27]. In that work, the author develops the formula
\[
\Gamma(z+1) = (z+a)^{z+1/2}\, e^{-(z+a)}\, (2\pi)^{1/2} \left[ c_0 + \sum_{k=1}^{N} \frac{c_k(a)}{z+k} + \epsilon(z) \right] , \tag{2.17}
\]
which is valid for $\mathrm{Re}(z+a) > 0$. The parameter $a$ is real, $N = \lceil a \rceil - 1$, $c_0(a) = 1$, and $c_k(a)$ is the residue of $\Gamma(z+1)(z+a)^{-(z+1/2)} e^{z+a} (2\pi)^{-1/2}$ at $z = -k$. Explicitly, for $1 \leq k \leq N$, this is
\[
c_k(a) = \frac{1}{\sqrt{2\pi}}\, \frac{(-1)^{k-1}}{(k-1)!}\, (-k+a)^{k-1/2}\, e^{-k+a} . \tag{2.18}
\]
Spouge's formula has the very simple relative error bound
\[
\epsilon_S(a,z) = \left| \frac{\epsilon(z)}{\Gamma(z+1)(z+a)^{-(z+1/2)} e^{z+a} (2\pi)^{-1/2}} \right|
< \frac{\sqrt{a}}{(2\pi)^{a+1/2}}\, \frac{1}{\mathrm{Re}(z+a)} ,
\]
provided $a \geq 3$. Thus for $z$ in the right half plane $\mathrm{Re}(z) \geq 0$, $\epsilon_S(a,z)$ has the uniform bound
\[
|\epsilon_S(a,z)| < \frac{1}{\sqrt{a}\,(2\pi)^{a+1/2}} . \tag{2.19}
\]
Though similar in form to Lanczos' formula (note, for example, the free parameter $a$), Spouge's work differs greatly in the derivation, making extensive use of complex analysis and residue calculus.

To see how Spouge's method works in practice, we revisit the computation of $\Gamma(7+13i)$ from Example 2.1:

Example 2.2. Estimate $\Gamma(7+13i)$ accurate to within an absolute error of $\epsilon = 10^{-12}$ using Spouge's method.

Solution: An absolute error bound of $\epsilon < 10^{-12}$ means the relative error must be bounded by
\[
|\epsilon_S(a,z)| < \frac{\epsilon}{|\Gamma(7+13i)|}
\approx \frac{10^{-12}\, |e^{6+13i}|}{\sqrt{2\pi}\, |(6+13i)^{6+13i+1/2}|} \quad \text{(by Stirling's formula)}
\doteq 1.3 \times 10^{-11} .
\]
By plotting $|\epsilon_S(a, 6+13i)|$, $a = 12.5$ and hence $N = 12$ terms of the series (2.17) are sufficient to achieve this bound. The calculation of the coefficients (2.18) for these values yields Table 2.2:

    k     c_k(a)
    0     +1.0000000000000 x 10^0
    1     +1.3355050294248 x 10^5
    2     -4.9293093529936 x 10^5
    3     +7.4128747369761 x 10^5
    4     -5.8509737760400 x 10^5
    5     +2.6042527033039 x 10^5
    6     -6.5413353396114 x 10^4
    7     +8.8014596350842 x 10^3
    8     -5.6480502412898 x 10^2
    9     +1.3803798339181 x 10^1
    10    -8.0781761698951 x 10^{-2}
    11    +3.4797414457425 x 10^{-5}
    12    -5.6892712275042 x 10^{-12}

    Table 2.2: Coefficients of Spouge's Series, a = 12.5

Equation (2.17) then gives
\[
\Gamma(7+13i) \approx -0.0571140842611682 - 0.0500395762571984\, i ,
\]
which differs in absolute value from $\Gamma(7+13i)$ by less than $2.9 \times 10^{-16}$.

Compared to Stirling's series, many more terms of Spouge's series are required to achieve the same accuracy, but the individual terms are much easier to compute, and the selection criterion for $a$, and hence $N$, is straightforward.

Spouge's work is important for several reasons. The first is that the coefficients $c_k(a)$ are simpler to compute than those of the Lanczos series. The second is that Spouge gives a simpler yet more accurate version of Stirling's formula.
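Example 2.2 can be reproduced in double precision (which caps the achievable accuracy near $10^{-16}$); the following is a sketch of (2.17)–(2.18) for this exposition, not the thesis's extended-precision computation:

```python
import cmath
import math

def spouge_gamma(z, a=12.5):
    # Spouge's formula (2.17) with coefficients (2.18); N = ceil(a) - 1
    N = math.ceil(a) - 1
    s = 1.0 + 0j                       # c_0(a) = 1
    for k in range(1, N + 1):
        c_k = ((-1) ** (k - 1) / math.factorial(k - 1)
               * (a - k) ** (k - 0.5) * math.exp(a - k)) / math.sqrt(2 * math.pi)
        s += c_k / (z + k)
    # principal-branch complex power, as in the conventions of Chapter 3
    return (z + a) ** (z + 0.5) * cmath.exp(-(z + a)) * math.sqrt(2 * math.pi) * s

approx = spouge_gamma(6 + 13j)         # estimate of Gamma(7 + 13i)
```

Note the large, alternating coefficients of Table 2.2: the cancellation among them is what ultimately limits the accuracy of a fixed-precision evaluation.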
And finally, Spouge's approximation and error estimates apply not only to $\Gamma(z+1)$, but also to the digamma function $\Psi(z+1) = \frac{d}{dz}\left[\log \Gamma(z+1)\right]$ and trigamma function $\Psi'(z)$.

To see the link between (2.17) and Lanczos' formula (1.1), write $a = r + 1/2$ and resolve the first $N$ terms of the series (1.2) into partial fractions,
\[
\frac{1}{2} a_0(r) + a_1(r)\, \frac{z}{z+1} + \cdots + a_N(r)\, \frac{z \cdots (z-N+1)}{(z+1)\cdots(z+N)} + \epsilon(z)
= b_0(r) + \sum_{k=1}^{N} \frac{b_k(r)}{z+k} + \epsilon(z) .
\]
Comparing this with (2.17), the $b_k(r)$ obtained from truncating the Lanczos series are the approximate residues of $\Gamma(z+1)(z+a)^{-(z+1/2)} e^{z+a} (2\pi)^{-1/2}$ at $z = -k$, and the larger $N$ becomes, the better the approximation.

2.7 Additional Remarks

The standard definition and properties of the gamma function can be found in many works dealing with special functions and analysis. For this study the especially clear treatments given in [28, pp. 41–77] and [23, pp. 192–195] are of note. The references [16, pp. 391–412] and [30, pp. 235–264] are also worthy of mention. A thorough treatment of Stirling's series can be found in [8, p. 105]. Finally, the work of Artin [2] is of interest for the treatment of $\Gamma(x)$ for $x$ real from the point of view of convexity.

For a survey and comparison of various computational methods for computing the gamma function, including Stirling's series, see the paper by Ng [20]. Among the methods covered there is that of Spira [26], in which a simpler error bound on Stirling's series than (2.15) is given.

Chapter 3
The Lanczos Formula

We arrive at last at the examination of Lanczos' paper itself, beginning in this chapter with the derivation of the main formula (1.1).
The derivation consists of three steps, which in a nutshell can be broken down as

$$ \Gamma(z+1/2) = \int_0^\infty t^{z-1/2} e^{-t}\,dt $$
$$ = (z+r+1/2)^{z+1/2} e^{-(z+r+1/2)} \int_0^e [v(1-\log v)]^{z-1/2} v^r\,dv \qquad \text{(step I)} $$
$$ = (z+r+1/2)^{z+1/2} e^{-(z+r+1/2)} \sqrt{2} \int_{-\pi/2}^{\pi/2} \left[\frac{\sqrt{2}\,v(\theta)^r \sin\theta}{\log v(\theta)}\right] \cos^{2z}\theta\,d\theta \qquad \text{(step II)} $$
$$ = (z+r+1/2)^{z+1/2} e^{-(z+r+1/2)} \sqrt{2} \int_{-\pi/2}^{\pi/2} \cos^{2z}\theta \left[\frac{a_0(r)}{2} + \sum_{k=1}^{\infty} a_k(r)\cos(2k\theta)\right] d\theta\,. \qquad \text{(step III)} $$

The series in Equation (1.1) then follows from the last integral upon integrating term by term.

The derivation is first retraced more or less along the same lines as Lanczos uses in his paper [14]. This approach uses Fourier series techniques in a novel albeit complicated way to obtain explicit formulas for the coefficients $a_n(r)$ as linear combinations of Chebyshev polynomial coefficients. The appearance of Chebyshev coefficients suggests a connection with the orthogonal system of Chebyshev polynomials, and indeed the derivation in the setting of these polynomials provides a slightly cleaner argument. Finally, both of these derivations are seen to be cases of a much more general idea: that of inner products in Hilbert space.

The following notation will be standard throughout the remainder of this work:

Definition 3.1. Define $H_0(z) = 1$, and for $k \ge 1$,

$$ H_k(z) = \frac{\Gamma(z+1)\,\Gamma(z+1)}{\Gamma(z-k+1)\,\Gamma(z+k+1)} = \frac{1}{(z+1)_k\,(z+1)_{-k}} = \frac{z(z-1)\cdots(z-k+1)}{(z+1)\cdots(z+k)}\,. $$

Using this new notation, (1.1) may be restated¹:

$$ \Gamma(z+1) = \sqrt{2\pi}\,(z+r+1/2)^{z+1/2} e^{-(z+r+1/2)} {\sum_{k=0}^{\infty}}' a_k(r) H_k(z)\,. \qquad (3.1) $$

¹The prime notation in the sum indicates that the $k = 0$ term is given weight $1/2$.

3.1 The Lanczos Derivation

Lanczos uses as his starting point Euler's formula (2.5) and transforms the integral in a series of steps. The motivation for these transformations is not immediately obvious, though the effect is to extract from the integral a term similar in form to the first factor of Stirling's series (2.12).
The integral which remains is then precisely of the form required to perform a clever series expansion.

In the exposition the following conventions have been adopted:

1. The complex variable is $z = \sigma + it$, where $\sigma = \mathrm{Re}(z)$ and $t = \mathrm{Im}(z)$;
2. For $a, b \in \mathbb{C}$ with $a$ not a negative real number or zero, define $a^b = e^{b \log a}$, where $\mathrm{Im}(\log a)$ is between $-\pi$ and $\pi$.

3.1.1 Preliminary Transformations

Beginning with

$$ \Gamma(z+1) = \int_0^\infty t^z e^{-t}\,dt, \qquad \mathrm{Re}(z) > -1\,, \qquad (3.2) $$

make the substitution $t \to \alpha t$, where $\mathrm{Re}(\alpha) > 0$, to obtain

$$ \Gamma(z+1) = \alpha^{z+1} \int_0^\infty t^z e^{-\alpha t}\,dt\,. $$

The validity of this substitution for complex $\alpha$ can be shown using contour arguments or via the following lemma²:

Lemma 3.1. For $\mathrm{Re}(\alpha) > 0$ and $\mathrm{Re}(z) > -1$,

$$ \Gamma(z+1) = \alpha^{z+1} \int_0^\infty t^z e^{-\alpha t}\,dt\,. $$

Proof of Lemma 3.1: For fixed $z$ with $\mathrm{Re}(z) > -1$ and $\alpha > 0$ real, replacing $t$ with $\alpha t$ in (3.2) gives

$$ \Gamma(z+1) = \alpha^{z+1} \int_0^\infty t^z e^{-\alpha t}\,dt\,. $$

Viewed as a function of $\alpha$, the right hand side is analytic in the right half $\alpha$-plane, and equals the constant $\Gamma(z+1)$ along the positive real axis. Hence the integral must equal $\Gamma(z+1)$ throughout the right half $\alpha$-plane.

²This result can be found as an exercise in [30, p. 243]. For real $\alpha > 0$ the result is elementary; however, Lanczos makes no comment as to its validity for complex $\alpha$.

Next set $\alpha = z + r + 1$, where $r \ge 0$, so that

$$ \Gamma(z+1) = (z+r+1)^{z+1} \int_0^\infty t^z e^{-(z+r+1)t}\,dt, \qquad \mathrm{Re}(z) > -1\,. \qquad (3.3) $$

Here $r \ge 0$ is a free parameter which plays a key role later. Now reduce the bounds of integration to a finite interval via the change of variable $t = 1 - \log v$, $dt = (-1/v)\,dv$:

$$ \Gamma(z+1) = (z+r+1)^{z+1} \int_0^e (1-\log v)^z e^{-(z+r+1)(1-\log v)}\,\frac{dv}{v} $$
$$ = (z+r+1)^{z+1} e^{-(z+r+1)} \int_0^e (1-\log v)^z v^{z+r}\,dv $$
$$ = (z+r+1)^{z+1} e^{-(z+r+1)} \int_0^e [v(1-\log v)]^z v^r\,dv\,. \qquad (3.4) $$

The integrand in this last expression is $O(v^{\sigma+r-\epsilon})$ for every $\epsilon > 0$ as $v \to 0$, while it is $O(|e-v|^{\sigma})$ as $v \to e$.
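Equation (3.4) reduces $\Gamma(z+1)$ to an integral over the finite interval $(0, e)$, which for real $z$ is easy to confirm numerically. A sketch using composite Simpson's rule (the node count is an arbitrary choice, not from the thesis; the integrand vanishes at both endpoints when $z, r > 0$):

```python
import math

def gamma_via_eq_3_4(z, r, n=20000):
    """Right-hand side of (3.4) for real z, r > 0, by composite Simpson's rule."""
    h = math.e / n  # n must be even
    def f(v):
        if v <= 0.0 or v >= math.e:
            return 0.0  # integrand vanishes at both endpoints
        return (v * (1 - math.log(v))) ** z * v ** r
    s = f(0.0) + f(math.e)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i * h)
    integral = s * h / 3
    return (z + r + 1) ** (z + 1) * math.exp(-(z + r + 1)) * integral

print(gamma_via_eq_3_4(3.0, 1.0))  # should be close to Gamma(4) = 6
```

The quadrature error here is dominated by the mild singularity of the integrand's higher derivatives at $v = 0$; for moderate $z$ and $r$ the agreement with $\Gamma(z+1)$ is well beyond plotting accuracy.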
Equation (3.4) is therefore valid for $\mathrm{Re}(z) > -1$.³

The substitution $t = 1 - \log v$ appears to be the key to the Lanczos method, in the sense that it puts all the right pieces in all the right places for the series expansion to come. It has the effect of peeling off leading terms similar in form to Stirling's formula, while factoring the integrand into separate powers of $z$ and $r$. The sequence of substitutions used here, however, namely $\alpha \to (z+r+1)$ and $t \to 1 - \log v$, is quite different from the seemingly disconnected chain of substitutions Lanczos uses in [14] to arrive at equation (3.4). His steps are perhaps clues to the motivation behind his method, and so they are reproduced in Appendix A.

³There is an error in [17, p. 30, Eq. (1)] at this point concerning the domain of convergence of the formula. There the author makes the replacement $z \to z - 1/2$ and states (using $\sigma$ to signify what is here denoted $r$):

$$ \Gamma(z+1/2) = (z+\sigma+1/2)^{z+1/2} \exp[-(z+\sigma+1/2)]\,F(z)\,, \qquad F(z) = \int_0^e [v(1-\log v)]^{z-1/2} v^{\sigma}\,dv\,, \qquad \mathrm{Re}(z+\sigma+1/2) > 0\,. $$

However, the integral $F(z)$ diverges with, for example, $z = -3/4$ and a choice of $\sigma = 5/4$. To see this, note that $v(1-\log v) \le (e-v)$ on $[0, e]$, so that $[v/(e-v)]^{5/4} \le [v(1-\log v)]^{-5/4} v^{5/4}$, whence

$$ \int_0^e [v/(e-v)]^{5/4}\,dv \le \int_0^e [v(1-\log v)]^{-5/4} v^{5/4}\,dv\,. $$

But the left hand side $\int_0^e [v/(e-v)]^{5/4}\,dv = \infty$.

3.1.2 The Implicit Function $v(\theta)$

The next step in the derivation is the transformation of the integral in (3.4) into a form more amenable to Fourier methods. The graph of $v(1-\log v)$ is shown in Figure 3.1, so that $v$ may be implicitly defined as a function of $\theta$ via

$$ \cos^2\theta = v(1-\log v)\,, $$

where $\theta = -\pi/2$ corresponds to $v = 0$, $\theta = 0$ to $v = 1$, and $\theta = \pi/2$ to $v = e$.

[Figure 3.1: Path of $v(1-\log v)$, rising from $0$ at $v = 0$ to $1$ at $v = 1$ and falling back to $0$ at $v = e$.]

Making the substitution $v(1-\log v) = \cos^2\theta$, $dv = 2\sin\theta\cos\theta/\log v\,d\theta$,
in (3.4) then gives

$$ \Gamma(z+1) = (z+r+1)^{z+1} e^{-(z+r+1)} \int_{-\pi/2}^{\pi/2} (\cos^2\theta)^z v^r\,\frac{2\sin\theta\cos\theta}{\log v}\,d\theta = (z+r+1)^{z+1} e^{-(z+r+1)} \int_{-\pi/2}^{\pi/2} \cos^{2z+1}\theta\,\frac{2 v^r \sin\theta}{\log v}\,d\theta\,. \qquad (3.5) $$

The integrand in (3.5) is $O(|\theta+\pi/2|^{2\sigma+2r+1})$ near $\theta = -\pi/2$ and $O(1)$ near $\theta = 0$, but is $O(|\theta-\pi/2|^{2\sigma+1})$ as $\theta \to \pi/2$, so this expression is once again valid for $\mathrm{Re}(z) > -1$.

Moving a $\sqrt{2}$ outside the integral and replacing $z$ with $z - 1/2$ (the reason for which will become apparent later⁴) yields

$$ \Gamma(z+1/2) = P_r(z) \int_{-\pi/2}^{\pi/2} \cos^{2z}\theta \left[\frac{\sqrt{2}\,v^r \sin\theta}{\log v}\right] d\theta\,, \qquad (3.6) $$

where $P_r(z)$ is defined

$$ P_r(z) = \sqrt{2}\,(z+r+1/2)^{z+1/2} e^{-(z+r+1/2)}\,. $$

Now denote by $f_r(\theta)$ the term in square brackets in (3.6), and by $f_{E,r}(\theta)$ its even part $[f_r(\theta) + f_r(-\theta)]/2$. Noting that the odd part of the integrand integrates out to zero finally puts (3.6) in the form

$$ \Gamma(z+1/2) = P_r(z) \int_{-\pi/2}^{\pi/2} \cos^{2z}\theta\, f_{E,r}(\theta)\,d\theta \qquad (3.7) $$

for $\mathrm{Re}(z) > -1/2$ and $r \ge 0$. Equation (3.7) is the starting point for the series expansion in (3.1).

⁴Shifting the argument to $z - 1/2$ at the outset, that is, starting the derivation with

$$ \Gamma(z+1/2) = \int_0^\infty t^{z-1/2} e^{-t}\,dt, \qquad \mathrm{Re}(z) > -1/2 $$

in place of equation (3.2), would seem more logical, considering the reason for the shift is yet to come anyway. For some reason Lanczos does not do this.

3.1.3 Series Expansion of $f_{E,r}$

The next step in the development is the replacement of $f_{E,r}(\theta)$ by a series which can be integrated term by term. Lanczos first constructs a Taylor series in $\sin\theta$ for $f_r(\theta)$, but notes that the resulting series after integrating is of slow convergence. Instead, he turns from the 'extrapolating' Taylor series to an 'interpolating' series of orthogonal functions: a Fourier series. The properties of $f_{E,r}$ will be examined later, but for the time being assume that this function may be represented by a uniformly convergent Fourier series on $[-\pi/2, \pi/2]$. Further, this series will contain only cosine terms since $f_{E,r}$ is even.
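Equation (3.7) can likewise be verified numerically for real $z$ by solving $\cos^2\theta = v(1-\log v)$ for $v(\theta)$ with bisection (using the monotone branches $[0,1]$ for $\theta \le 0$ and $[1,e]$ for $\theta > 0$) and applying Simpson's rule; all node counts and tolerances below are ad hoc choices, not values from the thesis:

```python
import math

def v_of_theta(theta):
    # solve cos^2(theta) = v(1 - log v); v in [0,1] for theta <= 0, [1,e] for theta > 0
    c = math.cos(theta) ** 2
    lo, hi = (0.0, 1.0) if theta <= 0 else (1.0, math.e)
    for _ in range(80):
        mid = (lo + hi) / 2
        g = mid * (1 - math.log(mid))
        if (g < c) == (theta <= 0):  # g increases on [0,1], decreases on [1,e]
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def f_r(theta, r):
    if theta == 0.0:
        return 1.0  # limiting value: sqrt(2) sin(theta)/log v -> 1 as theta -> 0
    v = v_of_theta(theta)
    return math.sqrt(2) * v ** r * math.sin(theta) / math.log(v)

def gamma_via_3_7(z, r, n=4000):
    # composite Simpson for (3.7); the symmetric grid cancels the odd part of f_r
    P = math.sqrt(2) * (z + r + 0.5) ** (z + 0.5) * math.exp(-(z + r + 0.5))
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / n
    def g(t):
        return math.cos(t) ** (2 * z) * f_r(t, r)
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(a + i * h)
    return P * s * h / 3

print(gamma_via_3_7(2.0, 1.5))  # should be close to Gamma(2.5)
```

Note that integrating $f_r$ rather than $f_{E,r}$ is harmless here: the Simpson nodes are symmetric about $\theta = 0$, so the odd part cancels exactly, mirroring the argument in the text.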
Thus

$$ f_{E,r}(\theta) = \frac{1}{2} a_0(r) + \sum_{k=1}^{\infty} a_k(r)\cos\!\left(\frac{k\pi\theta}{\pi/2}\right) = {\sum_{k=0}^{\infty}}' a_k(r)\cos(2k\theta)\,, \qquad (3.8) $$

where the coefficients $a_k(r)$ are given by

$$ a_k(r) = \frac{2}{\pi} \int_{-\pi/2}^{\pi/2} f_{E,r}(\theta)\cos(2k\theta)\,d\theta\,. $$

On the surface this substitution would appear to complicate matters, or at least offer little improvement. It turns out, however, that thanks to a handy identity, the resulting integrals evaluate explicitly to the rational functions $H_k(z)$, which exhibit the poles of $\Gamma(z+1)$. This identity is stated as a lemma, the proof of which can be found in [17, p. 16]:

Lemma 3.2. For $\mathrm{Re}(z) > -1/2$,

$$ \int_{-\pi/2}^{\pi/2} \cos^{2z}\theta\,\cos(2k\theta)\,d\theta = \sqrt{\pi}\,\frac{\Gamma(z+1/2)}{\Gamma(z+1)}\,H_k(z)\,. \qquad (3.9) $$

Picking up where we left off with (3.7), and using (3.8) to replace $f_{E,r}(\theta)$, we obtain

$$ \Gamma(z+1/2) = P_r(z) \int_{-\pi/2}^{\pi/2} \cos^{2z}\theta \,{\sum_{k=0}^{\infty}}' a_k(r)\cos(2k\theta)\,d\theta = P_r(z) {\sum_{k=0}^{\infty}}' a_k(r) \int_{-\pi/2}^{\pi/2} \cos^{2z}\theta\,\cos(2k\theta)\,d\theta = P_r(z)\sqrt{\pi}\,\frac{\Gamma(z+1/2)}{\Gamma(z+1)} {\sum_{k=0}^{\infty}}' a_k(r) H_k(z)\,. \qquad (3.10) $$

Now it is just a matter of eliminating $\Gamma(z+1/2)$ on both sides of (3.10) — the reason for replacing $z$ with $z - 1/2$ in (3.5) — and moving the $\Gamma(z+1)$ to the left hand side to get the final form:

$$ \Gamma(z+1) = (z+r+1/2)^{z+1/2} e^{-(z+r+1/2)} \sqrt{2\pi} {\sum_{k=0}^{\infty}}' a_k(r) H_k(z)\,, \qquad (3.11) $$

where again $\mathrm{Re}(z) > -1/2$ and $r \ge 0$. We will see later that convergence of this series can be extended to the left of $\mathrm{Re}(z) = -1/2$.

3.2 Derivation in the Chebyshev Setting

The argument used to obtain (3.11) is now repeated, but this time using the set of Chebyshev polynomials $\{T_k(x)\}$ orthogonal on the interval $[-1, 1]$. For a brief overview of Chebyshev polynomials and their connection to Fourier expansions the reader is referred to Appendix B. Beginning with equation (3.4),

$$ \Gamma(z+1) = (z+r+1)^{z+1} e^{-(z+r+1)} \int_0^e [v(1-\log v)]^z v^r\,dv\,, $$

this time define $v(x)$ implicitly with

$$ 1 - x^2 = v(1-\log v)\,, \qquad (3.12) $$

where $x = -1$ corresponds to $v = 0$, $x = 0$ to $v = 1$, and $x = 1$ to $v = e$. This yields

$$ \Gamma(z+1) = (z+r+1)^{z+1} e^{-(z+r+1)} \int_{-1}^{1} (1-x^2)^z \frac{2x v^r}{\log v}\,dx\,. \qquad (3.13) $$
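Lemma 3.2 is easy to sanity-check numerically for real $z > -1/2$. The sketch below compares a Simpson approximation of the left side of (3.9) with the closed form on the right; the quadrature parameters are arbitrary choices:

```python
import math

def H(k, z):
    # H_k(z) = z(z-1)...(z-k+1) / ((z+1)...(z+k)), with H_0 = 1
    val = 1.0
    for i in range(k):
        val *= (z - i) / (z + i + 1)
    return val

def lhs(k, z, n=20000):
    # composite Simpson for the integral in (3.9)
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / n
    def f(t):
        return math.cos(t) ** (2 * z) * math.cos(2 * k * t)
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def rhs(k, z):
    return math.sqrt(math.pi) * math.gamma(z + 0.5) / math.gamma(z + 1) * H(k, z)

print(lhs(2, 2.3), rhs(2, 2.3))
```

For $k = 0$ the identity reduces to the classical Wallis-type evaluation of $\int_{-\pi/2}^{\pi/2}\cos^{2z}\theta\,d\theta$, and for $z = 1$, $k = 1$ both sides equal $\pi/4$.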
Once again let $P_r(z) = \sqrt{2}\,(z+r+1/2)^{z+1/2} \exp[-(z+r+1/2)]$, replace $z$ with $z - 1/2$, and denote by $f_{E,r}(x)$ the even part of $\sqrt{2}\,x v^r/\log v$, giving

$$ \Gamma(z+1/2) = P_r(z) \int_{-1}^{1} (1-x^2)^z f_{E,r}(x)\,\frac{dx}{\sqrt{1-x^2}}\,. \qquad (3.14) $$

The motivation this time for the replacement $z \to z - 1/2$ is a bit clearer: with the introduction of the weight function $1/\sqrt{1-x^2}$, the integral in (3.14) reveals itself as an inner product on $L^2[-1,1](dx/\sqrt{1-x^2})$, which has the Chebyshev polynomials as orthogonal basis.

Expanding $f_{E,r}(x)$ in a series of even Chebyshev polynomials (the justification for which will follow) produces

$$ f_{E,r}(x) = {\sum_{k=0}^{\infty}}' c_k(r) T_{2k}(x) \qquad (3.15) $$

so that

$$ \Gamma(z+1/2) = P_r(z) \int_{-1}^{1} (1-x^2)^z {\sum_{k=0}^{\infty}}' c_k(r) T_{2k}(x)\,\frac{dx}{\sqrt{1-x^2}}\,, $$

which upon integrating term by term gives

$$ \Gamma(z+1/2) = P_r(z)\sqrt{\pi}\,\frac{\Gamma(z+1/2)}{\Gamma(z+1)} {\sum_{k=0}^{\infty}}' c_k(r)(-1)^k H_k(z)\,. $$

The term by term integration uses an equivalent form of the identity (3.9):

$$ \int_{-1}^{1} (1-x^2)^z T_{2k}(x)\,\frac{dx}{\sqrt{1-x^2}} = (-1)^k \sqrt{\pi}\,\frac{\Gamma(z+1/2)}{\Gamma(z+1)}\,H_k(z)\,. $$

Finally, multiplying through by $\Gamma(z+1)/\Gamma(z+1/2)$ and writing $a_k(r) = (-1)^k c_k(r)$ once again yields

$$ \Gamma(z+1) = (z+r+1/2)^{z+1/2} e^{-(z+r+1/2)} \sqrt{2\pi} {\sum_{k=0}^{\infty}}' a_k(r) H_k(z)\,. \qquad (3.16) $$

3.3 Derivation in the Hilbert Space Setting

It is worth noting that both (3.7) and (3.14) show that $\Gamma(z+1/2)/P_r(z)$ manifests itself as an inner product in Hilbert space, so that (3.16) can be deduced directly using Hilbert space theory. In fact, the requirement that $f_{E,r}$ be equal to a uniformly convergent Fourier (resp. Chebyshev) series is unnecessary, if we ask only that $f_{E,r}$ be square summable with respect to Lebesgue measure.

Take for example the Fourier setting. Parseval's theorem [24, p. 91] says that for even $f, g \in L^2[-\pi/2, \pi/2]$ with $g$ real,

$$ \frac{2}{\pi} \int_{-\pi/2}^{\pi/2} f(\theta) g(\theta)\,d\theta = {\sum_{k=0}^{\infty}}' \hat{f}_k \hat{g}_k $$

where $\hat{f}_k$ denotes the Fourier coefficient

$$ \hat{f}_k = \frac{2}{\pi} \int_{-\pi/2}^{\pi/2} f(\theta)\cos(2k\theta)\,d\theta\,. $$
In the present case, the Fourier coefficients of $p(\theta) = \cos^{2z}(\theta)$ are

$$ \hat{p}_k = \frac{2}{\pi} \int_{-\pi/2}^{\pi/2} \cos^{2z}(\theta)\cos(2k\theta)\,d\theta = \frac{2}{\sqrt{\pi}}\,\frac{\Gamma(z+1/2)}{\Gamma(z+1)}\,H_k(z)\,, $$

from Lemma 3.2, while those of $f_{E,r}$ are $a_k(r)$, assuming $f_{E,r} \in L^2[-\pi/2, \pi/2]$. Thus by Parseval's theorem, (3.7) becomes simply

$$ \Gamma(z+1/2) = P_r(z) \int_{-\pi/2}^{\pi/2} \cos^{2z}\theta\, f_{E,r}(\theta)\,d\theta = P_r(z)\,\frac{\pi}{2} {\sum_{k=0}^{\infty}}' \hat{p}_k\, a_k(r) = P_r(z)\sqrt{\pi}\,\frac{\Gamma(z+1/2)}{\Gamma(z+1)} {\sum_{k=0}^{\infty}}' a_k(r) H_k(z)\,, $$

which again yields equation (3.1).

3.4 Additional Remarks

To conclude this chapter, a number of remarks on various aspects of the derivation of (3.1) are worthy of mention.

The first remark concerns the use of both Fourier and Chebyshev series to achieve the same result. These are clearly equivalent since, as Boyd notes in [3], a Chebyshev polynomial expansion is merely a Fourier cosine series in disguise. In this case, the transformation $x \to \sin\theta$ links one to the other. Based on the extensive use of Chebyshev polynomials in his work [11], Lanczos was likely aware of the possibility of using either expansion in his derivation. Indeed, he does make the substitution (3.12) in his paper, not as a springboard for a Chebyshev expansion, but rather to set the stage for expressing $f_0(\theta)$ as a Taylor polynomial in $\sin\theta$. From the periodicity of this last function he then argues that the interpolating Fourier cosine series is a better choice in terms of convergence than the extrapolating Taylor series.

Secondly, the Fourier methods used here, in particular the use of the identity (3.9), bear a striking resemblance to a technique used in Lanczos' 1961 work [13, p. 45]. There he derives an expansion for Bessel functions of arbitrary order in terms of Bessel functions of even integer order. Part of that derivation involves a clever method for computing the coefficients of the expansion which will be described in the sequel.
This common thread is not so surprising since, as Lanczos notes in the acknowledgments of each of [14] and [13], the winter of 1959–60 saw much of both works completed at the Mathematics Research Center of the U.S. Army at the University of Wisconsin in Madison.

Thirdly, the derivation of (3.1) can be simplified if one considers first only real $z > -1/2$, followed by an application of the principle of analytic continuation at the end. This approach obviates Lemma 3.1, but requires consideration of the analytic properties of the infinite series factor of the formula. Such matters will be examined later when (3.1) is extended to the left of $\mathrm{Re}(z) = -1/2$.

The final remark involves the free parameter $r$ as it relates to the Fourier coefficients $a_k(r)$ (or equivalently $c_k(r)$ in the Chebyshev setting). There are several methods for computing the coefficients, and these will be examined in due course. For the moment, note Lanczos' own formulation: the constant term of the series is

$$ a_0(r) = \left[\frac{2e}{\pi(r+1/2)}\right]^{1/2} e^{r}\,, \qquad (3.17) $$

while for $k \ge 1$,

$$ a_k(r) = \frac{2}{\pi} \sum_{j=0}^{k} C_{2j,2k}\,F_r(j)\,. \qquad (3.18) $$

Here $F_r(j) = 2^{-1/2}\,\Gamma(j+1/2)\,(j+r+1/2)^{-j-1/2}\exp(j+r+1/2)$, and $C_{2j,2k}$ is the coefficient of $x^{2j}$ in the $2k$-th Chebyshev polynomial. These coefficients are functions of $r$, and this parameter plays a fundamental role in the rate of convergence of the series. This role will be examined in detail, but for now observe that the constraint $r \ge 0$ imposed by Lanczos is unduly restrictive, in the sense that the substitution in equation (3.3) requires only that $\mathrm{Re}(r) \ge 0$. Under this relaxed constraint, the Fourier coefficients given by (3.17) and (3.18) are single valued analytic functions of $r$ in this region. The properties of the $a_k(r)$ as functions of a complex variable were not considered here, but it would be interesting to investigate further the effect of this generalization on (3.1).
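Formulas (3.17) and (3.18) are enough to implement the approximation (3.11) directly. The sketch below builds the Chebyshev coefficients $C_{2j,2k}$ from the recurrence $T_m = 2xT_{m-1} - T_{m-2}$, evaluates $a_k(r)$ in floating point (note that the alternating sum in (3.18) suffers increasing cancellation as $k$ grows), and sums the truncated series; the choice $r = 5$ with $N = 6$ terms is Lanczos' well-known $\gamma = 5$ row, good to roughly ten significant digits for $\mathrm{Re}(z) \ge 0$:

```python
import math

def chebyshev_coeffs(n):
    # integer coefficients of T_0 .. T_n; T[m][i] is the coefficient of x^i in T_m
    T = [[1], [0, 1]]
    for m in range(2, n + 1):
        c = [0] * (m + 1)
        for i, u in enumerate(T[m - 1]):
            c[i + 1] += 2 * u          # 2x T_{m-1}
        for i, u in enumerate(T[m - 2]):
            c[i] -= u                  # - T_{m-2}
        T.append(c)
    return T

def lanczos_coeffs(r, N):
    # a_0 from (3.17), a_k from (3.18)
    F = lambda j: (2 ** -0.5 * math.gamma(j + 0.5)
                   * (j + r + 0.5) ** (-(j + 0.5)) * math.exp(j + r + 0.5))
    T = chebyshev_coeffs(2 * N)
    a = [math.sqrt(2 * math.e / (math.pi * (r + 0.5))) * math.exp(r)]
    for k in range(1, N + 1):
        a.append((2 / math.pi) * sum(T[2 * k][2 * j] * F(j) for j in range(k + 1)))
    return a

def gamma_lanczos(z, r=5.0, N=6):
    # truncated series (3.11); the k = 0 term carries weight 1/2
    a = lanczos_coeffs(r, N)
    s, Hk = a[0] / 2, 1.0
    for k in range(1, N + 1):
        Hk *= (z - k + 1) / (z + k)    # H_k = H_{k-1} * (z-k+1)/(z+k)
        s += a[k] * Hk
    return (math.sqrt(2 * math.pi) * (z + r + 0.5) ** (z + 0.5)
            * math.exp(-(z + r + 0.5)) * s)

print(gamma_lanczos(4.5))  # should be close to Gamma(5.5)
```

A pleasant consistency check: at $z = 0$ every $H_k(0)$ with $k \ge 1$ vanishes, and (3.17) makes the formula return $\Gamma(1) = 1$ exactly, independent of $r$.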
Chapter 4
The Functions $v$ and $f_{E,r}$

The derivation of Lanczos' main formula (1.1) in Chapter 3 relies on the validity of expanding the implicitly defined function $f_{E,r}(\theta)$ of equation (3.7) (resp. $f_{E,r}(x)$ of (3.14)) into an infinite interpolating Fourier (resp. Chebyshev) series. Furthermore, the smoothness and summability properties of $f_{E,r}$ determine the asymptotic growth rate of the coefficients $a_k(r)$ (resp. $c_k(r)$), and consequently the convergence of the series (1.2). In this chapter the properties of the implicit functions $v$, $f_r$ and $f_{E,r}$ are examined in detail, and bounds are established on the coefficients appearing in the series (3.8) (resp. (3.15)). The principal results are:

(i) In the Chebyshev setting, $f_{E,r}(x) \in C^{\lfloor r\rfloor}[-1, 1]$ and is of bounded variation, so that the expansion (3.15) is justified.

(ii) In the Fourier setting, $f_{E,r}(\theta) \in C^{\lfloor 2r\rfloor}[-\pi/2, \pi/2]$ and is of bounded variation, thus justifying the expansion (3.8).

(iii) $f_{E,r}^{(n)} \in L^1[-\pi/2, \pi/2]$, where $n = \lceil 2r\rceil$.

(iv) $a_k(r) = O(k^{-\lceil 2r\rceil})$ as $k \to \infty$.

4.1 Closed Form Formulas

We begin by finding explicit expressions for $v$ and $f_{E,r}$ in terms of the Lambert $W$ functions $W_0$ and $W_{-1}$. This is useful for several reasons: the first is that graphs of $v$ and $f_{E,r}$ can be easily plotted using the built-in Lambert $W$ evaluation routines of Maple 8. The second is that the smoothness properties of $v$ and $f_{E,r}$ can be deduced directly from those of $W_0$ and $W_{-1}$. Third, the expression for $f_{E,r}$ can be used to compute approximate values of the coefficients $a_k(r)$ using finite Fourier series.

In this section we work first in the Chebyshev setting and establish explicit formulas for $v(x)$ and $f_{E,r}(x)$ on $[-1, 1]$. These formulas are in terms of the two real branches $W_{-1}$ and $W_0$ of the Lambert $W$ function, defined implicitly by $W(t)e^{W(t)} = t$. See [5] for a thorough treatment of Lambert $W$ functions.
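The two routes to the coefficients can be cross-checked numerically: $a_k(r)$ computed from the closed form (3.18) should agree with the defining Fourier integral $(2/\pi)\int_{-\pi/2}^{\pi/2} f_{E,r}(\theta)\cos(2k\theta)\,d\theta$, with $v(\theta)$ obtained by bisection on $\cos^2\theta = v(1-\log v)$. A sketch, with all node counts and tolerances ad hoc (the symmetric Simpson grid cancels the odd part of $f_r$ automatically):

```python
import math

def v_of_theta(theta):
    # solve cos^2(theta) = v(1 - log v); v in [0,1] for theta <= 0, [1,e] for theta > 0
    c = math.cos(theta) ** 2
    lo, hi = (0.0, 1.0) if theta <= 0 else (1.0, math.e)
    for _ in range(80):
        mid = (lo + hi) / 2
        g = mid * (1 - math.log(mid))
        if (g < c) == (theta <= 0):  # g increases on [0,1], decreases on [1,e]
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def f_r(theta, r):
    if theta == 0.0:
        return 1.0  # limiting value of sqrt(2) v^r sin(theta)/log v
    v = v_of_theta(theta)
    return math.sqrt(2) * v ** r * math.sin(theta) / math.log(v)

def a_fourier(k, r, n=10000):
    # (2/pi) * Simpson integral of f_r(theta) cos(2k theta); odd part cancels
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / n
    s = f_r(a, r) * math.cos(2 * k * a) + f_r(b, r) * math.cos(2 * k * b)
    for i in range(1, n):
        t = a + i * h
        s += (4 if i % 2 else 2) * f_r(t, r) * math.cos(2 * k * t)
    return (2 / math.pi) * s * h / 3

def a_closed(k, r):
    # formula (3.18), with Chebyshev coefficients from T_m = 2x T_{m-1} - T_{m-2}
    T = [[1], [0, 1]]
    for m in range(2, 2 * k + 1):
        c = [0] * (m + 1)
        for i, u in enumerate(T[m - 1]):
            c[i + 1] += 2 * u
        for i, u in enumerate(T[m - 2]):
            c[i] -= u
        T.append(c)
    F = lambda j: (2 ** -0.5 * math.gamma(j + 0.5)
                   * (j + r + 0.5) ** (-(j + 0.5)) * math.exp(j + r + 0.5))
    return (2 / math.pi) * sum(T[2 * k][2 * j] * F(j) for j in range(k + 1))

print(a_fourier(1, 1.0), a_closed(1, 1.0))
```

The same Fourier integrals, computed with $k$ pushed larger, are one way to observe the $O(k^{-\lceil 2r\rceil})$ decay claimed in (iv), although the closed form (3.18) becomes numerically delicate there.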
4.1.1 Closed Form for $v(x)$

Recall that $v(x)$ was defined implicitly via

$$ 1 - x^2 = v(1-\log v)\,, \qquad (4.1) $$

where $x = -1$ corresponds to $v = 0$, $x = 0$ to $v = 1$, and $x = 1$ to $v = e$. Letting $w = \log(v/e)$, this is equivalent to

$$ \frac{x^2-1}{e} = w e^{w}\,, $$

whence

$$ w = W\!\left(\frac{x^2-1}{e}\right)\,, $$

so that

$$ \frac{v}{e} = \exp W\!\left(\frac{x^2-1}{e}\right) = \frac{e^{-1}(x^2-1)}{W\!\left(e^{-1}(x^2-1)\right)}\,. \qquad (4.2) $$

For $-1$
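In the absence of Maple's built-in routines, the two real branches $W_0$ and $W_{-1}$ can be evaluated by bisection on the defining relation $we^w = t$, after which (4.2) gives $v(x)$ in closed form: the branch $W_0$ produces $v \in [1, e]$ (i.e. $x \ge 0$) and $W_{-1}$ produces $v \in [0, 1]$ (i.e. $x \le 0$). A dependency-free sketch — bracketing choices are ad hoc, and `scipy.special.lambertw` would serve equally well:

```python
import math

def lambert_w(t, branch=0):
    """Real branches of w e^w = t by bisection (a sketch, not production code).
    branch 0: W_0, increasing on [-1, inf); branch -1: W_{-1}, on (-inf, -1]."""
    if branch == 0:
        lo, hi = -1.0, 1.0
        while hi * math.exp(hi) < t:   # expand bracket for t > e
            hi *= 2
        for _ in range(100):
            mid = (lo + hi) / 2
            if mid * math.exp(mid) < t:
                lo = mid
            else:
                hi = mid
    else:
        lo, hi = -700.0, -1.0          # w e^w is decreasing toward hi here
        for _ in range(100):
            mid = (lo + hi) / 2
            if mid * math.exp(mid) > t:
                lo = mid
            else:
                hi = mid
    return (lo + hi) / 2

def v_closed(x):
    # equation (4.2): v = e * t / W(t) with t = (x^2 - 1)/e
    if x == 1.0:
        return math.e
    if x == -1.0:
        return 0.0
    t = (x * x - 1) / math.e
    return math.e * t / lambert_w(t, 0 if x >= 0 else -1)

for x in (-0.9, -0.5, 0.0, 0.5, 0.9):
    v = v_closed(x)
    print(x, v, v * (1 - math.log(v)))  # last column should equal 1 - x^2
```

Substituting each computed $v$ back into (4.1) confirms that the $W_{-1}$ branch carries the left half $x \in [-1, 0]$ and the $W_0$ branch the right half, with the branches meeting at $W(-1/e) = -1$, i.e. at $v = 1$.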