A Revisit of the Gram-Charlier and Edgeworth Series Expansions
Total Page:16
File Type:pdf, Size:1020Kb
ORIG I NAL AR TI CLE DRAFT VERSION 0.1 A REVISIT OF THE Gram-Charlier AND Edgeworth SERIES EXPANSIONS TORGEIR Brenn1 | Stian Normann AnfiNSEN1 1UnivERSITY OF Tromsø - The Arctic UnivERSITY OF NorwaY, Department OF In THIS PAPER WE MAKE SEVERAL OBSERVATIONS ON THE Gram- PhYSICS AND TECHNOLOGY, Machine LEARNING Charlier AND Edgeworth series, WHICH ARE METHODS FOR mod- Group, 9037 Tromsø, NorwaY ELING AND APPROXIMATING PROBABILITY DENSITY functions. WE Correspondence PRESENT A SIMPLIfiED DERIVATION WHICH HIGHLIGHTS BOTH THE Stian Normann Anfinsen, UiT The Arctic UnivERSITY OF NorwaY, Department OF SIMILARITY AND THE DIFFERENCES OF THE SERIES Expansions, THAT PhYSICS AND TECHNOLOGY, 9037 Tromsøø, ARE OFTEN OBSCURED BY ALTERNATIVE derivations. WE ALSO in- NorwaY Email: [email protected] TRODUCE A REFORMULATION OF THE Edgeworth SERIES IN TERMS OF THE COMPLETE EXPONENTIAL Bell polynomials, WHICH MAKE FUNDING INFORMATION BOTH SERIES EASY TO IMPLEMENT AND Evaluate. The RESULT IS A SIGNIfiCANTLY MORE ACCESSIBLE METHODOLOGY, IN THE SENSE THAT IT IS EASIER TO UNDERSTAND AND TO implement. Finally, WE ALSO MAKE A REMARK ON THE Gram-Charlier SERIES WITH A GAMMA Kernel, PROVIDING A NOVEL AND SIMPLE EXPRESSION FOR ITS COEFficients. KEYWORDS PROBABILITY DENSITY functions, SERIES Expansions, Edgeworth, Gram-Charlier, Bell polynomials, Normal Kernel, GAMMA KERNEL 1 | INTRODUCTION The Gram-Charlier AND Edgeworth SERIES EXPANSIONS PROVIDE ATTRACTIVE ALTERNATIVES WHEN IT COMES TO PROBABILITY DENSITY FUNCTION (PDF) estimation. TheY COMBINE THE SIMPLICITY OF fiTTING A two-parAMETER PDF WITH THE flEXIBILITY OF CORRECTING FOR HIGHER ORDER moments, OFTEN RESULTING IN FAST AND ACCURATE APPROximations. Abbreviations: CF, CHARACTERISTIC function; FT, FOURIER TRansform; IID, INDEPENDENT AND IDENTICALLY distributed; PDF, PROBABILITY den- SITY function; RV, RANDOM variable. 1 2 BRENN AND ANFINSEN The SERIES EXPANSION METHODS WERE INTRODUCED IN THE LATE 19th AND EARLY 20th CENTURY (ChebYSHEv, 1860; Gram, 1883; Thiele, 1889; ChebYSHEv, 1890; Thiele, 1903; Edgeworth, 1905; Charlier, 1905, 1906), AND THEIR HISTORY IS SUMMARIZED IN (Wallace, 1958) AND (Hald, 2000). The METHODS CAN BE DERIVED IN TERMS OF ORTHOGONAL POLYNOMIALS (KENDALL ET al., 1994), BUT IT IS ALSO POSSIBLE TO RETRIEVE THE PDF FROM THE CHARACTERISTIC FUNCTION (CF) (Lévy, 1925; Lukacs, 1970), AS IN (Wallace, 1958; BlinnikOV AND Moessner, 1998). TRADITIONALLY, THE SERIES EXPANSIONS USED THE NORMALIAN PDF AS THE KERNEL ALMOST EXCLUSIVELY, BUT (KENDALL ET al., 1994) MENTIONED OTHER POSSIBILITIES AND (Gaztanaga ET al., 2000) PRESENTED AN EXPLICIT EXPRESSION FOR THE Gram-Charlier SERIES WITH A GAMMA PDF Kernel. A TOOL WHICH WAS NOT AVAILABLE TO Gram, Charlier, Edgeworth etc. ARE THE Bell polynomials, NAMED AFTER Eric TEMPLE Bell, WHO INTRODUCED THEM UNDER THE NAME PARTITION POLYNOMIALS IN (Bell, 1927). Among OTHER things, THEY CAN BE USED TO RETRIEVE MOMENTS FROM CUMULANTS (Pitman, 2002; Rota AND Shen, 2000). Thiele, WHO INTRODUCED THE CUMULANTS IN (Thiele, 1889), CERTAINLY MASTERED THEIR RELATIONSHIP WITH THE moments, BUT NATURALLY DID NOT HAVE ACCESS TO THE Bell polynomials. Instead, HE GAVE A RECURSION FORMULA TO COMPUTE CUMULANTS FROM MOMENTS (Hald, 2000). TODAY, POLYNOMIALS LIKE THOSE NAMED AFTER Bell OR Kummer (Kummer, 1837; Daalhuis, 2010) ARE READILY AVAILABLE ONLINE AND IN MATHEMATICS software. The IMPLICATION THAT USING THESE POLYNOMIALS TO EXPRESS THE Gram-Charlier AND Edgeworth SERIES ALLOWS FOR EASIER AND FASTER implementation, AS DEMONSTRATED IN (Withers AND Nadarajah, 2009, 2015). This PAPER IS ORGANIZED AS follows. In Section 2 WE BRIEflY ACCOUNT FOR THE NECESSARY THEORETICAL background, INCLUDING moments, cumulants, CF, Bell POLYNOMIALS AND THE TRADITIONAL WAY OF DERIVING THE Gram-Charlier AND Edge- WORTH series. In Section 3 WE PRESENT OUR DERIVATION OF THE SAME Gram-Charlier AND Edgeworth SERIES WITH THE NOVEL APPLICATION OF THE Bell POLYNOMIALS IN THIS CONTExt. WE MAKE AN OBSERVATION ON THE Gram-Charlier SERIES AROUND THE GAMMA KERNEL IN Section 4 AND PRESENT OUR CONCLUSIONS IN Section 5. 2 | THEORETICAL BACKGROUND 2.1 | The Hermite POLYNOMIALS AND THE Normal Distribution The nTH probabilists’ Hermite POLYNOMIAL Hn ¹xº IS DEfiNED IN TERMS OF THE DERIVATIVES OF THE STANDARDIZED (zero mean, 2 UNIT VARIANCE) NORMAL PDF φ¹xº = ¹2πº−1/2e−x /2, NAMELY n (−Dx º φ¹xº = Hn ¹xºφ¹xº; (1) n WHERE Dx = D/Dx IS THE DIFFERENTIAL OPERATOR AND THE FACTOR (−1º ENSURES THAT THE LEADING COEFfiCIENT OF Hn ¹xº IS 1 (KENDALL ET al., 1994). FOR ARBITRARY MEAN µ AND VARIANCE σ2, WE DEfiNE 1 ¹x − µº2 1 x − µ φ¹x; µ; σº = p EXP − = φ ; (2) 2πσ 2σ2 σ σ −1 AND LETTING y = ¹x − µ)/σ WE SEE THAT Dy = σ Dx , i.e. n 1 x − µ (−Dx º φ¹x; µ; σº = Hn φ¹x; µ; σº: (3) σn σ WE CAN NOW WORK WITH THE STANDARDIZED φ¹xº DURING THE DERIVATIONS FOR BREVITY AND USE (3) TO GENERALIZE TO ARBITRARY KERNEL MEAN AND variance. The Hermite POLYNOMIALS ARE ORTHOGONAL WITH RESPECT THE NORMAL PDF IN THE SENSE THAT BRENN AND ANFINSEN 3 (KENDALL ET al., 1994) 1 ( ¹ n!; k = n ; Hk ¹xºHn ¹xºφ¹xºDx = (4) 0 k n : −∞ ; , 2.2 | The Gram-Charlier Series WITH THE Normal KERNEL Suppose NOW THAT THE RANDOM VARIABLE (RV) X HAS UNKNOWN PDF fX ¹xº, WHICH CAN BE WRITTEN IN TERMS OF THE DERIVATIVES OF THE NORMAL PDF, i.e. 1 Õ fX ¹xº = an Hn ¹xºφ¹xº; (5) n=0 WHERE WE USED (1). TO fiND THE COEFfiCIENTS an , WE MULTIPLY BOTH SIDES WITH Hk ¹xº, INTEGRATE FROM −∞ TO +1, SWAP THE ORDER OF INTEGRATION AND summation, AND USE THE ORTHOGONAL PROPERTY FROM (4) TO GET 1 1 1 ¹ ¹ Õ fX ¹xºHk ¹xºDx = an Hk ¹xºHn ¹xºφ¹xº = an n! ; (6) −∞ −∞ n=0 1 1 ¹ an = fX ¹xºHn ¹xºDx : (7) n! −∞ ν Since Hn ¹xº IS A POLYNOMIAL IN x, THE COEFfiCIENTS an MUST BE A LINEAR COMBINATION OF THE MOMENTS µν = EfX g OF X . (KENDALL ET al., 1994) LISTS THE fiRST FEW COEFfiCIENTS BOTH IN TERMS OF THE MOMENTS AND IN TERMS OF THE cumulants, WITH THE LATTER REPRESENTATION PRESENTED IN (13). 2.3 | The Gram-Charlier Series WITH THE Gamma KERNEL LET THE GAMMA DISTRIBUTION PDF WITH SHAPE φ AND SCALE β BE DENOTED − γ¹x; φ; βº = β φ+1xφ e βx /Γ¹φ + 1º (8) FOR x ≥ 0. The GENERALIZED Laguerre POLYNOMIAL OF DEGREE n AND ORDER φ IS IMPLICITLY DEfiNED AS (Szeg, 1939) ¹φº 1 n n Ln ¹xºγ¹x; φº = D »x γ¹x; φº¼ ; (9) n! WHERE WE TAKE γ¹x; φº TO MEAN THAT β = 1 FOR BREVITY, WHICH IS EASILY GENERALIZED BY REPLACING x WITH βx AS THE ARGUMENT ¹φº 1 OF BOTH Ln ¹xº AND γ¹x; φº. The Laguerre POLYNOMIALS HAVE AN ORTHOGONALITY PROPERTY ANALOGOUS TO (4), NAMELY 1 Γ¹n + φ + 1º ¹ >8 ; k = n ; ¹φº¹ º ¹φº¹ º ¹ º >< Γ¹n + 1ºΓ¹φ + 1º Ln x Lk x γ x; φ Dx = (10) > 0 ; k , n : 0 : 1This IS FOUND IN TERMS OF BINOMIAL COEFfiCIENTS IN (Szeg, 1939), WITH (Fowler, 1996) PROVIDING A GENERALIZATION TO non-integer arguments. This CAN ALSO BE GENERALIZED BY REPLACING x WITH βx ON THE LEFT HAND SIDE IN (10), LEAVING THE RIGHT HAND SIDE unchanged. 4 BRENN AND ANFINSEN FROM (KENDALL ET al., 1994) AND (Gaztanaga ET al., 2000) WE KNOW THAT A PDF fX ¹xº WHICH IS ZERO FOR NEGATIVE x, CAN BE WRITTEN AS 1 Õ ¹φº fX ¹xº = an Ln ¹βxºγ¹x; φ; βº ; (11) n=0 WHERE THE COEFfiCIENTS an ARE FOUND THE SAME WAY AS IN Section 2.2, GIVING 1 n ¹ ! ¹ º ¹φº¹ º an = n fX x Ln βx Dx; (12) Î ¹φ + i º 0 i =1 WHICH CLEARLY IS A LINEAR COMBINATION OF THE MOMENTS OF X BY THE SAME REASONING AS IN Section 2.2. The fiRST FEW TERMS OF THE SERIES IS GIVEN IN (14). 2.4 | The Edgeworth Series FOR THE RV X WITH UNKNOWN PDF fX ¹xº, ITS CF ΨX ¹t º IS THE FT OF fX ¹xº (KENDALL ET al., 1994), THAT IS 1 ¹ i t x i t X ΨX ¹t º ≡ F»fX ¹xº¼¹t º = e fX ¹xº Dx = Efe g ; (16) −∞ WHERE j IS THE IMAGINARY unit, THE EXPECTATION Ef·g IS WITH RESPECT TO x AND t IS A TRANSFORM variable. The CUMULANT GENERATING FUNCTION IS DEfiNED AS LOG ΨX ¹t º ; (17) " 2 # κ3 x − µ κ4 x − µ κ5 x − µ κ6+10κ3 x − µ κ7+35κ3κ4 x − µ fX ¹xº= 1+ H3 + H4 + H5 + H6 + H7 + ··· φ¹x; µ; σº 6σ3 σ 24σ4 σ 120σ5 σ 720σ6 σ 5040σ7 σ (13) 3 4 3 β µ3 ¹φº β µ4 4β µ3 ¹φº fX ¹xº= 1 + − + 1 L ¹βxº + − + 3 L ¹βxº + ··· γ¹x; φ; βº ¹φ + 1º¹φ + 2º¹φ + 3º 3 ¹φ + 1º · · · ¹φ + 4º ¹φ + 1º¹φ + 2º¹φ + 3º 4 (14) " 2 # 1 λ3 x − µ 1 λ4 x − µ λ3 x − µ fX ¹xº = φ¹x; µ; σº + H3 φ¹x; µ; σº + H4 + H6 φ¹x; µ; σº (15) r 1/2 6σ3 σ r 24σ4 σ 72σ6 σ " 3 # 1 λ5 x − µ λ3λ4 x − µ λ3 x − µ 1 + H5 + H7 + H9 φ¹x; µ; σº + O r 3/2 120σ5 σ 144σ7 σ 1296σ9 σ r 2 BRENN AND ANFINSEN 5 i.e. THE NATURAL LOGARITHM OF THE CF. The CUMULANTS κX ;ν; ν 2 Ú>0 can, IF THEY ALL Exist, BE RETRIEVED FROM 1 Õ ¹i t ºν LOG ΨX ¹t º = κX ;ν : (18) ν ν=1 ! WE LET Ψφ ¹t º DENOTE THE CF OF φ¹xº (Bryc, 2012). Using (17) AND (18), WE CAN WRITE THE CFs OF X AND THE NORMAL KERNEL AS ( 1 ) Õ ¹i t ºν ΨX ¹t º = EXP κX ;ν ; (19) ν ν=1 ! ( 1 ) Õ ¹i t ºν Ψφ ¹t º = EXP κφ;ν : (20) ν ν=1 ! These CAN BE COMBINED INTO ( 1 ) Õ ¹i t ºν ΨX ¹t º = EXP »κX ;ν − κφ;ν ¼ Ψφ ¹t º: (21) ν ν=1 ! As (Wallace, 1958) notes, IT IS POSSIBLE TO RETRIEVE THE Gram-Charlier SERIES AT THIS POINT BY USING THE POWER SERIES EXPANSION OF EXPf·g, SORTING THE TERMS BY THEIR POWER OF (−Dx º, AND APPLYING (1).