Complex Random Variables

Leigh J. Halliwell, FCAS, MAAA

Abstract: Rarely have casualty actuaries needed, much less wanted, to work with complex numbers. One readily could wisecrack about imaginary dollars and creative accounting. However, complex numbers are well established in mathematics; they even provided the impetus for abstract algebra. Moreover, they are essential in several scientific fields, most notably in electromagnetism and quantum mechanics, the two fields to which most of the sparse material about complex random variables is tied. This paper will introduce complex random variables to an actuarial audience, arguing that complex random variables will eventually prove useful in the field of actuarial science. First, it will describe the two ways in which statistical work with complex numbers differs from that with real numbers, viz., in transjugation versus transposition and in rank versus dimension. Next, it will introduce the mean and the variance of the complex random vector, and derive the distribution function of the standard complex normal random vector. Then it will derive the general distribution of the complex normal multivariate and discuss the behavior and moments of complex lognormal variables, a limiting case of which is the unit-circle $W = e^{i\Theta}$ for real $\Theta$ uniformly distributed. Finally, it will suggest several foreseeable actuarial applications of the preceding theory, especially its application to linear statistical modeling. Though the paper will be algebraically intense, it will require little knowledge of complex-function theory. But some of that theory, viz., Cauchy's theorem and analytic continuation, will arise in an appendix on the complex moment generating function of a normal random multivariate.

Keywords: Complex numbers, matrices, and random vectors; augmented variance; lognormal and unit-circle distributions; determinism; Cauchy-Riemann; analytic continuation

1. INTRODUCTION

Even though their education has touched on algebra and calculus with complex numbers, most casualty actuaries would be hard-pressed to cite an actuarial use for numbers of the form $x + iy$.

Their use in the discrete Fourier transformation (Klugman [1998], §4.7.1) is notable; however, many would view this as a trick or convenience, rather than as indicating any further usefulness. In this paper we will develop a theory for complex random variables and vectors, arguing that such a theory will eventually find actuarial uses. The development, lengthy and sometimes arduous, will take the following steps. Sections 2-4 will ground complex matrices in certain real-valued matrices called "double-real." This serves the aim of our presentation, namely, to analogize from real-valued random variables and vectors to complex ones. Transposition and dimension in the real-valued realm become transjugation and rank in the complex. These differences figure into the standard quadratic form of Section 5, where also the distribution of the standard complex normal random vector is derived. Section 6 will elaborate on the variance of a complex random vector, as well as introduce "augmented variance," i.e., the variance of a dyad whose second part is the complex conjugate of the first. Section 7 derives the formula for the distribution of the general complex normal multivariate. Of special interest to many casualty actuaries should be the treatment of the complex lognormal random vector in Section 8, an intuition into whose behavior Section 9 provides on a univariate or scalar level. Even further simplification in the next two sections leads to the unit-circle random variable, which is the only random variable with widespread deterministic effects. In Section 12 we adapt the linear statistical model to complex multivariates. Finally, Section 13 lists foreseeable applications of complex random variables. However, we believe their greatest benefit resides not in their concrete applications, but rather in their fostering abstractions of thought and imagination. Three appendices delve into mathematical issues too complicated for the body of the paper. Those who work on an advanced level with lognormal random variables should read Appendix A ("Real-Valued Lognormal Random Vectors"), regardless of their interest in complex random variables.

2. INVERTING COMPLEX MATRICES

Let the $m \times n$ complex matrix $Z$ be composed of real and imaginary parts $X$ and $Y$, i.e., $Z = X + iY$. Of course, $X$ and $Y$ also must be $m \times n$. Since only square matrices have inverses, our purpose here requires that $m = n$. Complex matrix $W = A + iB$ is an inverse of $Z$ if and only if $ZW = WZ = I_n$, where $I_n$ is the $n \times n$ identity matrix. Because such an inverse must be unique, we may say that $Z^{-1} = W$. Under what conditions does $W$ exist?


First, define the conjugate of $Z$ as $\bar{Z} = X - iY$. Since the conjugate of a product equals the product of the conjugates,¹ if $Z$ is non-singular, then $\bar{Z}\,\overline{Z^{-1}} = \overline{ZZ^{-1}} = \bar{I}_n = I_n$. Similarly, $\overline{Z^{-1}}\,\bar{Z} = I_n$. Therefore, $\bar{Z}$ too is non-singular, and $\bar{Z}^{-1} = \overline{Z^{-1}}$. Moreover, if $Z$ is non-singular, so too are $i^n Z$ and $i^n \bar{Z}$. Therefore, the invertibility of $X + iY$, $-Y + iX$, $-X - iY$, $Y - iX$, $X - iY$, $Y + iX$, $-X + iY$, and $-Y - iX$ is true for all eight or true for none. Invertibility is no respecter of the real and imaginary parts.

Now if the inverse of $Z$ is $W = A + iB$, then $I_n = (X + iY)(A + iB) = (A + iB)(X + iY)$.

Expanding the first equality, we have:

$$I_n = (X + iY)(A + iB) = XA + iYA + iXB + i^2 YB = XA + iYA + iXB - YB = (XA - YB) + i(YA + XB)$$

Therefore, $ZW = I_n$ if and only if $XA - YB = I_n$ and $YA + XB = 0_{n\times n}$. We may combine the last two equations into the partitioned-matrix form:

X − YA I n     =   Y XB 0 

Since $YA + XB = 0$ if and only if $-YA - XB = 0$, another form just as valid is:

X − Y− B 0     =   Y X A I n 

We may combine these two forms into the balanced form:

X − YA − B I n 0     =   = I 2n Y XB A 0 I n 

¹ If $Z$ and $W$ are conformable to multiplication: $\overline{ZW} = \overline{(X + iY)(A + iB)} = \overline{XA - YB + i(XB + YA)} = XA - YB - i(XB + YA) = (X - iY)(A - iB) = \bar{Z}\,\bar{W}$.


X − YA − B I n 0  Therefore, ZW = I n if and only if    =   . By a similar expansion of the Y XB A 0 I n 

A − BX − Y I n 0  last equality above, WZ = I n if and only if    =   . Hence, we conclude B AY X 0 I n 

that the n×n complex matrix Z = X + iY is non-singular, or has an inverse, if and only if the 2n×2n

−1 X − Y X − Y real-valued matrix   is non-singular. Moreover, if   exists, it will have the form Y X Y X

A − B −1   and Z will equal A + iB. B A

3. COMPLEX MATRICES AS DOUBLE-REAL MATRICES

That the problem of inverting an $n \times n$ complex matrix resolves into the problem of inverting a $2n \times 2n$ real-valued matrix suggests that with complex numbers one somehow gets "two for the price of one." It even hints of a relation between the general $m \times n$ complex matrix $Z = X + iY$ and the $2m \times 2n$ real-valued matrix $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$. If $X$ and $Y$ are $m \times n$ real matrices, we will call $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix}$ a double-real matrix. The matrix is double in two senses; first, in that it involves two (same-sized and real-valued) matrices $X$ and $Y$, and second, in that its right half is redundant, or reproducible from its left.

Returning to the hint above, we easily see an addition analogy:

X1 − Y1  X 2 − Y2  X1 + iY1 + X 2 + iY2 ⇔   +    Y1 X1   Y2 X 2 


And if $Z_1$ is $m \times n$ and $Z_2$ is $n \times p$, so that the matrices are conformable to multiplication, then $Z_1 Z_2 = (X_1 + iY_1)(X_2 + iY_2) = (X_1 X_2 - Y_1 Y_2) + i(X_1 Y_2 + Y_1 X_2)$. This is analogous with the double-real multiplication:

$$\begin{bmatrix} X_1 & -Y_1 \\ Y_1 & X_1 \end{bmatrix}\begin{bmatrix} X_2 & -Y_2 \\ Y_2 & X_2 \end{bmatrix} = \begin{bmatrix} X_1 X_2 - Y_1 Y_2 & -(X_1 Y_2 + Y_1 X_2) \\ X_1 Y_2 + Y_1 X_2 & X_1 X_2 - Y_1 Y_2 \end{bmatrix}$$

Rather trivial is the analogy between the $m \times n$ complex zero matrix and the $2m \times 2n$ double-real zero matrix $\begin{bmatrix} 0_{m\times n} & -0 \\ 0 & 0 \end{bmatrix}$, as well as that between the $n \times n$ complex identity matrix and the $2n \times 2n$ double-real identity matrix $\begin{bmatrix} I_n & -0 \\ 0 & I_n \end{bmatrix}$.

The general $2m \times 2n$ double-real matrix may itself be decomposed into quasi-real and quasi-imaginary parts: $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -Y \\ Y & 0 \end{bmatrix}$. And in the case of square matrices ($m = n$) this extends to the form $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}\begin{bmatrix} Y & 0 \\ 0 & Y \end{bmatrix}$, wherein the double-real matrix $\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}$ is analogous with the imaginary unit, inasmuch as:

$$\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}^2 = \begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}\begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix} = \begin{bmatrix} -I_n & 0 \\ 0 & -I_n \end{bmatrix} = (-1)\begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}$$

Finally, one of the most important theorems of linear algebra is that every $m \times n$ complex matrix $Z = X + iY$ may be reduced by invertible transformations to "canonical form" (Healy [1986], 32-34). In symbols, for every $Z$ there exist non-singular matrices $U$ and $V$ such that:


I r 0 U m×m Zm×n Vn×n =   0 0 m×n

The $m \times n$ real matrix on the right side of the equation consists entirely of zeroes except for $r$ instances of one along its main diagonal. Since invertible matrix operations can reposition the ones, it is further stipulated that the ones appear as a block in the upper-left corner. Although many reductions of $Z$ to canonical form exist, the canonical forms themselves must all contain the same number of ones, $r$, which is defined as the rank of $Z$. Providing the matrices with real and imaginary parts, we have:

$$U_{m\times m} Z_{m\times n} V_{n\times n} = (P + iQ)(X + iY)(R + iS) = (PXR - QYR - PYS - QXS) + i(PXS - QYS + PYR + QXR) = A + iB = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} + i\,0_{m\times n}$$

The double-real analogue to this is:

I r 0    0m×n   P − QX − YR − S A − B 0 0      =   = Q P Y XS R B A  I r 0  0m×n    0 0

 P − Q As shown in the previous section,   is non-singular, or invertible, if and only if Q P 

R −S U = P + iQ is non-singular; the same is true for   . Therefore, the rank of the double-real S R analogue of a complex matrix is twice the rank of the complex matrix. Moreover, the 2r instances of one correspond to r quasi-real and r quasi-imaginary instances. It is not possible for the contribution to the rank of a matrix to be real without its being imaginary, and vice versa.


To conclude this section, there are extensive analogies between complex and double-real matrices, analogies so extensive that one who lacked either the confidence or the software to work with complex numbers could probably do a work-around with double-real matrices.²
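Such a work-around is easy to demonstrate. The following sketch (ours, with NumPy) also verifies numerically the rank-doubling result of this section:

```python
import numpy as np

rng = np.random.default_rng(1)
# A rank-2 complex 4x3 matrix, built as a product of two rank-2 factors
Z = (rng.normal(size=(4, 2)) + 1j * rng.normal(size=(4, 2))) @ \
    (rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3)))
X, Y = Z.real, Z.imag
D = np.block([[X, -Y], [Y, X]])    # the 8x6 double-real analogue

print(np.linalg.matrix_rank(Z))    # 2
print(np.linalg.matrix_rank(D))    # 4: twice the complex rank
```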

4. COMPLEX MATRICES AND VARIANCE

′ Σ = Var[x] = E(x − µ)(x − µ)  is a real-valued n×n matrix, whose jkth element is the covariance of  

the jth element of x with the kth element. Since the covariance of two real-valued random variables is

symmetric, Σ must be a symmetric matrix. But a realistic Σ must have one other property, viz., non-

negative definiteness (NND). This means that for every real-valued n×1 vector ξ, ξ′Σξ ≥ 0 .3 This must be true, because ξ′Σξ is the variance of the real-valued random variable ξ′x :

′ ′ ′ Var[ξ′x] = E(ξ′x − ξ′µ)(ξ′x − ξ′µ)  = Eξ′(x − µ)(x − µ) ξ = ξ′E(x − µ)(x − µ) ξ = ξ′Σξ      

Although variances of real-valued random variables may be zero, they must not be negative. Now if $\xi'\Sigma\xi > 0$ for all $\xi \neq 0_{n\times 1}$, the variance $\Sigma$ is said to be positive-definite (PD). Every invertible NND matrix must be PD. Moreover, every NND matrix may be expressed as the product of some real matrix and its transpose, the most common method for doing this being the Cholesky decomposition (Healy [1986], §7.2). Accordingly, if $\Sigma$ is NND, then $A'\Sigma A \geq 0$ for any conformable real-valued matrix $A$. Finally, if $\Sigma$ is PD and the real-valued $n \times r$ matrix $A$ is of full column rank, i.e., $\mathrm{rank}(A_{n\times r}) = r$, then the $r \times r$ matrix $A'\Sigma A$ is PD.

² The representation of the complex scalar $z = x + iy$ as the real $2 \times 2$ matrix $\begin{bmatrix} x & -y \\ y & x \end{bmatrix}$ is a common theme in modern algebra (e.g., section 7.2 of the Wikipedia article "Complex number"). We have merely extended the representation to complex matrices. Our representation is even more meaningful when expressed in the Kronecker-product form $\begin{bmatrix} X & -Y \\ Y & X \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & X \end{bmatrix} + \begin{bmatrix} 0 & -Y \\ Y & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \otimes X + \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \otimes Y$. Due to certain properties of the Kronecker product (cf. Judge [1988], Appendix A.15), all the analogies of this section would hold even in the commuted form $X \otimes \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + Y \otimes \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. In practical terms this means that it matters not whether the form is 2×2 of m×n or m×n of 2×2.

³ More accurately, $\xi'\Sigma\xi \geq [0]$, since the quadratic form $\xi'\Sigma\xi$ is a $1 \times 1$ matrix. The relevant point is that $1 \times 1$ real-valued matrices are as orderable as their real-valued elements are. Appendices A.13 and A.14 of Judge [1988] provide introductions to quadratic forms and definiteness that are sufficient to prove the theorems used herein.

X − Y In the remainder of this section we will show how the analogy between X + iY and   Y X leads to a proper definition of the variance of a complex random vector. We start by considering

X − Y   as a real-valued variance matrix. In order to be so, first it must be symmetric: Y X

′ X − Y  X′ Y′ X − Y   =   =   Y X − Y′ X′ Y X

X − Y Hence,   is symmetric if and only if X′ = X and Y′ = −Y . In words, X is symmetric and Y X

Y is skew-symmetric. Clearly, the main diagonal of a skew-symmetric matrix must be zero. But of greater significance, if a and b are real-valued n×1 vectors:

$$a'Yb = (a'Yb)_{1\times 1} = (a'Yb)' = b'Y'a = b'(-Y)a = -b'Ya$$

Consequently, if $b = a$:

$$a'Ya = \frac{a'Ya + a'Ya}{2} = \frac{a'Ya + (-a'Ya)}{2} = 0_{1\times 1}$$

Next, considering the specifications on $X$, $Y$, $a$, and $b$, we evaluate the $2 \times 2$ quadratic form:


′ a − b X − Ya − b  a′ b′X − Ya − b      =     b a Y Xb a − b′ a′Y Xb a  a′X + b′Y b′X − a′Ya − b =    − b′X + a′Y a′X + b′Yb a  a′Xa + b′Ya − a′Yb + b′Xb − a′Xb − b′Yb + b′Xa − a′Ya =   − b′Xa + a′Ya + a′Xb + b′Yb a′Xa + b′Ya − a′Yb + b′Xb a′Xa + b′Ya − a′Yb + b′Xb − a′Xb − 0 + b′Xa − 0 =    − b′Xa + 0 + a′Xb + 0 a′Xa + b′Ya − a′Yb + b′Xb a′Xa + b′Ya − a′Yb + b′Xb − a′Xb + b′Xa =    − b′Xa + a′Xb a′Xa + b′Ya − a′Yb + b′Xb a′Xa + b′Ya − a′Yb + b′Xb 0 =    0 a′Xa + b′Ya − a′Yb + b′Xb ′  −  a X Ya       0 b Y Xb  =  ′   a X − Ya  0       b Y Xb

′ ′ a − b X − Ya − b a X − Ya Therefore,      is PD [or NND] if and only if      is PD b a Y Xb a b Y Xb

[or NND].

a − b Now the double-real 2n×2 matrix   is analogous with the n×1 complex vector a + ib . Its b a

′ a − b  a′ b′ transpose   =   is analogous with the 1×n complex vector a′ − ib′ . Moreover, b a − b′ a′

′ ′ ′ * a′ − ib′ = (a − ib) = a + ib = (a + ib) = (a + ib) , where ‘*’ is the combined operation of

Casualty Actuarial Society E-Forum, Fall 2015 9 Complex Random Variables

4 X − Y transposition and conjugation (order irrelevant). And   is analogous with the n×n Y X complex matrix X + iY . Accordingly, the complex analogue of the double-real quadratic form

′ a − b X − Ya − b * X − Y      is (a + ib) (X + iY)(a + ib). Moreover, since   is b a Y Xb a Y X

symmetric, (X + iY)* = X′ − iY′ = X − i(− Y) = X + iY . A matrix equal to its transposed conjugate

* is said to be Hermetian: matrix Γ is Hermetian if and only if Γ = Γ . Therefore, Γn×n = X + iY is

the variance matrix of some complex random variable z n×1 = x + iy if and only if Γ is Hermetian

X − Y 5 and   is non-negative-definite. Y X

′ ′ * a X − Ya a X − Ya Because (a + ib) (X + iY)(a + ib) =      + i ⋅ 0 =      , the definiteness b Y Xb b Y Xb

X − Y of Γ = X + iY is the same as the definiteness of   . Therefore, a matrix qualifies as the Y X

variance matrix of some complex random vector if and only if it is Hermetian and NND. Just as the

variance matrix of a real-valued random vector factors as Σ = A′A n×n for some real-valued A, so too

* the variance matrix of a complex random vector factors as Γ = A A n×n for some complex A.

Likewise, every invertible Hermetian NND matrix must be PD. Due to the skew symmetry of their

4 The transposed conjugate is sometimes called the “transjugate,” which in linear algebra is commonly symbolized with the asterisk. Physicists prefer the “dagger” notation A†, though physicist Hermann Weyl [1950, p. 17] called it the ~ “Hermetian conjugate” and symbolized it as A . X − Y 5 It is superfluous to add ‘symmetric’ here. For Γ = X + iY is Hermetian if and only if   is symmetric. Y X

Casualty Actuarial Society E-Forum, Fall 2015 10 Complex Random Variables

complex parts, the main diagonals of Hermetian matrices must be real-valued. If the matrices are

NND [or PD], all the elements of their main diagonals must be non-negative [or positive].

Let $\Gamma$ represent the variance of the complex random vector $z$. Its $jk$th element represents the covariance of the $j$th element of $z$ with the $k$th element. Since $\Gamma$ is Hermitian, $\gamma_{kj} = [\Gamma]_{kj} = [\Gamma^*]_{kj} = [\bar{\Gamma}']_{kj} = [\bar{\Gamma}]_{jk} = \overline{[\Gamma]_{jk}} = \bar{\gamma}_{jk}$. Because of this, it is fitting and natural to define the variance of a complex random vector as:

$$\Gamma = \mathrm{Var}[z] = E\left[(z - \mu)\overline{(z - \mu)}'\right] = E\left[(z - \mu)(z - \mu)^*\right]$$

The complex formula is like the real formula except that the second factor in the expectation is transjugated, not simply transposed. This renders $\Gamma$ Hermitian, since:

$$\Gamma^* = \left(E\left[(z - \mu)(z - \mu)^*\right]\right)^* = E\left[\left\{(z - \mu)(z - \mu)^*\right\}^*\right] = E\left[(z - \mu)(z - \mu)^*\right] = \Gamma$$

It also renders $\Gamma$ NND. For since $(z - \mu)(z - \mu)^*$ is NND, its expectation over the probability distribution of $z$ must also be so. Usually $\Gamma$ is PD, in which case $\Gamma^{-1}$ exists.
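As a numerical illustration (ours, with NumPy), a sample estimate of $\Gamma = E[(z - \mu)(z - \mu)^*]$ is indeed Hermitian and NND:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 100_000, 2
# Correlated real and imaginary parts: (x, y) jointly normal
xy = rng.multivariate_normal(np.zeros(2 * n), np.eye(2 * n) + 0.3, size=m)
z = xy[:, :n] + 1j * xy[:, n:]                # m draws of the complex vector z

zc = z - z.mean(axis=0)
Gamma = zc.T @ zc.conj() / m                  # sample Var[z] = E[(z - mu)(z - mu)*]

print(np.allclose(Gamma, Gamma.conj().T))     # True: Hermitian
print(np.all(np.linalg.eigvalsh(Gamma) >= 0)) # True: non-negative definite
```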

5. THE EXPECTATION OF THE STANDARD QUADRATIC FORM

The most common quadratic form in $z_{n\times 1}$ involves the variance of the complex random vector, viz., $(z - \mu)^*\Gamma^{-1}(z - \mu)$, where $\Gamma = \mathrm{Var}[z]$. The expectation of this quadratic form equals $n$, the rank of the variance. The following proof uses the trace function. The trace of a matrix is the sum of its main-diagonal elements, and if $A$ and $B$ are conformable, $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. Moreover, the trace of the expectation equals the expectation of the trace. Consequently:


$$\begin{aligned}
E\left[(z - \mu)^*\Gamma^{-1}(z - \mu)\right] &= E\left[\mathrm{tr}\left((z - \mu)^*\Gamma^{-1}(z - \mu)\right)\right] \\
&= E\left[\mathrm{tr}\left(\Gamma^{-1}(z - \mu)(z - \mu)^*\right)\right] \\
&= \mathrm{tr}\left(E\left[\Gamma^{-1}(z - \mu)(z - \mu)^*\right]\right) \\
&= \mathrm{tr}\left(\Gamma^{-1}E\left[(z - \mu)(z - \mu)^*\right]\right) \\
&= \mathrm{tr}\left(\Gamma^{-1}\Gamma\right) = \mathrm{tr}(I_n) = n
\end{aligned}$$

The analogies above between complex and double-real matrices might suggest the result to be $2n$. However, for real-valued random vectors $E\left[(x - \mu)'\Sigma^{-1}(x - \mu)\right] = n$, and the complex case is a superset of the real. So by extension, the complex case must be the same.

But an insight is available into why the value is $n$, rather than $2n$. Let $x$ and $y$ be $n \times 1$ real-valued random vectors. Assume their means to be zero, and their variances to be identity matrices (with zero covariance between them):

x  0  I 0   n×1 n    ~  μ =  , Σ =   y  0n×1  0 I n 

The quadratic form is:

−1 x I 0  x x n  x 2 y 2  ′ ′ Σ −1 = ′ ′ n = ′ ′ = ′ + ′ =  j + j  [x y ]   [x y ]    [x y ]  x x y y ∑  y 0 I n  y y j=1  1 1 

Since the elements have unit variances, the expectation is:

$$E\left[\begin{bmatrix} x \\ y \end{bmatrix}'\Sigma^{-1}\begin{bmatrix} x \\ y \end{bmatrix}\right] = E\left[\sum_{j=1}^{n}\left(\frac{x_j^2}{1} + \frac{y_j^2}{1}\right)\right] = \sum_{j=1}^{n}\left(\frac{E[x_j^2]}{1} + \frac{E[y_j^2]}{1}\right) = \sum_{j=1}^{n}\left(\frac{1}{1} + \frac{1}{1}\right) = 2n$$

Now let $z$ be the $n \times 1$ complex random vector $x + iy$. Since $E[x + iy] = 0_{n\times 1}$, the variance of $z$ is:


$$\begin{aligned}
\Gamma = \mathrm{Var}[z] &= \mathrm{Var}[x + iy] \\
&= \mathrm{Var}\left[[I_n \;\; iI_n]\begin{bmatrix} x \\ y \end{bmatrix}\right] \\
&= E\left[\left([I_n \;\; iI_n]\begin{bmatrix} x \\ y \end{bmatrix}\right)\left([I_n \;\; iI_n]\begin{bmatrix} x \\ y \end{bmatrix}\right)^*\right] \\
&= [I_n \;\; iI_n]\,E\left[\begin{bmatrix} x \\ y \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}'\right]\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= [I_n \;\; iI_n]\,\mathrm{Var}\begin{bmatrix} x \\ y \end{bmatrix}\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= [I_n \;\; iI_n]\begin{bmatrix} I_n & 0 \\ 0 & I_n \end{bmatrix}\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} \\
&= [I_n \;\; iI_n]\begin{bmatrix} I_n \\ -iI_n \end{bmatrix} = I_n - i^2 I_n = 2I_n
\end{aligned}$$

The complex quadratic form is:

$$z^*\Gamma^{-1}z = z^*(2I_n)^{-1}z = \sum_{j=1}^{n}\frac{\bar{z}_j z_j}{2} = \sum_{j=1}^{n}\frac{(x_j - iy_j)(x_j + iy_j)}{2} = \sum_{j=1}^{n}\frac{x_j^2 + y_j^2}{1 + 1} = \frac{1}{2}\begin{bmatrix} x \\ y \end{bmatrix}'\Sigma^{-1}\begin{bmatrix} x \\ y \end{bmatrix}$$

The complex form is half the real-valued form; hence, its expectation equals $n$. The condensation of the $2n$ real dimensions into $n$ complex ones inverts the order of operations:

n  x 2 y 2  n x 2 + y 2  j + j  ⇒ j j ∑  ∑ j=1  1 1  j=1 1+1


Within the sigma operator, the sum of two quotients becomes the quotient of two sums. A proof for general variance $\Gamma$ involves diagonalizing $\Gamma$, i.e., that $\Gamma$ can be eigen-decomposed as $\Gamma = W\Lambda W^*$, where $\Lambda$ is diagonal and $WW^* = W^*W = I_n$.⁶

At this point we can derive the standard complex normal distribution. The real-valued normal density is

$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

The standard complex normal random variable is formed from two independent real normal variables whose means equal zero and whose variances equal one half:

$$f_Z(z) = \frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{(x - 0)^2}{2(1/2)}} \cdot \frac{1}{\sqrt{2\pi(1/2)}}\,e^{-\frac{(y - 0)^2}{2(1/2)}} = \frac{1}{\sqrt{\pi}}\,e^{-x^2} \cdot \frac{1}{\sqrt{\pi}}\,e^{-y^2} = \frac{1}{\pi}\,e^{-(x^2 + y^2)} = \frac{1}{\pi}\,e^{-\bar{z}z}$$

The distribution of the $n \times 1$ standard complex normal random vector is $f_z(z) = \frac{1}{\pi^n}\,e^{-z^*z}$. A vector so distributed has mean $E[z] = 0_{n\times 1}$ and variance $\mathrm{Var}[z] = E[zz^*] = I_n$.
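A quick Monte Carlo sketch (ours, with NumPy) confirms both the variance of the standard complex normal vector and the expectation of its quadratic form:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 200_000, 3
# Standard complex normal: real and imaginary parts are independent N(0, 1/2)
z = rng.normal(scale=np.sqrt(0.5), size=(m, n)) + \
    1j * rng.normal(scale=np.sqrt(0.5), size=(m, n))

Gamma = z.T @ z.conj() / m                       # sample Var[z]; should be ~I_n
quad = np.einsum('ij,ij->i', z.conj(), z).real   # z* I_n^{-1} z for each draw
print(np.round(Gamma, 2))                        # approximately the identity
print(quad.mean())                               # approximately n = 3
```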

6. COMPLEX VARIANCE, PSEUDOVARIANCE, AND AUGMENTED VARIANCE

Section 4 justified the definition of the variance of a complex random vector as:

$$\Gamma = \mathrm{Var}[z] = E\left[(z - \mu)\overline{(z - \mu)}'\right] = E\left[(z - \mu)(z - \mu)^*\right]$$

The naïve formula differs from this by one critical symbol (prime versus asterisk):

′ C = E(z − µ)(z − µ)   

⁶ Cf. Appendix C for eigen-decomposition and diagonalization. We believe the insight about commuting sums and quotients to be valuable as an abstraction. But of course, a vector of $n$ independent complex random variables of unit variance translates into a vector of $2n$ independent real random variables of half-unit variance, and $\sum_{j=1}^{2n}\frac{1}{2} = n$. Because of the half-unit real variance, the formula in the next paragraph for the standard complex normal distribution, lacking any factors of two, is simpler than the formula for the standard real normal distribution.


This naïveté leads many to conclude that $\mathrm{Var}[iz] = i^2\mathrm{Var}[z] = -\mathrm{Var}[z]$, whereas it is actually:⁷

$$\mathrm{Var}[iz] = E\left[i(z - \mu)\{i(z - \mu)\}^*\right] = E\left[i(z - \mu)\,\bar{i}\,(z - \mu)^*\right] = i\,\bar{i}\,E\left[(z - \mu)(z - \mu)^*\right] = i(-i)\,\mathrm{Var}[z] = \mathrm{Var}[z]$$

Nevertheless, there is a role for the naïve formula, which reduces to:

$$C = E\left[(z - \mu)(z - \mu)'\right] = E\left[(z - \mu)(\bar{z} - \bar{\mu})^*\right] = \mathrm{Cov}[z, \bar{z}]$$

Veeravalli [2006], whose notation we follow, calls $C$ the "relation matrix." The Wikipedia article "Complex normal distribution" calls it the "pseudocovariance matrix." Because of the naïveté that leads many to a false conclusion, we prefer the 'pseudo' terminology (better, "pseudovariance") to something as bland as "relation matrix." However, a useful and non-pejorative concept is what we will call the "augmented variance."

The augmented variance is the variance of the complex random vector $z$ augmented with its conjugate $\bar{z}$, i.e., the $2n \times 1$ vector $\begin{bmatrix} z \\ \bar{z} \end{bmatrix}$. Its expectation is $E\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \begin{bmatrix} E[z] \\ \overline{E[z]} \end{bmatrix} = \begin{bmatrix} \mu \\ \bar{\mu} \end{bmatrix}$. And its variance is (for brevity we ignore the mean):

* z zz  z  Cov[z,z] Cov[z, z] Var  = E    = E [z′ z′] =   z zz  z  Cov[z,z] Cov[z, z]

In two ways this matrix is redundant. First, $\mathrm{Cov}[\bar{z}, \bar{z}] = E[\bar{z}z'] = \overline{E[zz^*]} = \overline{\mathrm{Cov}[z, z]}$; equivalently, $\mathrm{Var}[\bar{z}] = \overline{\mathrm{Var}[z]} = \bar{\Gamma}$. And second, $\mathrm{Cov}[\bar{z}, z] = E[\bar{z}z^*] = \overline{E[zz']} = \overline{\mathrm{Cov}[z, \bar{z}]} = \bar{C}$. Therefore:

$$\mathrm{Var}\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \begin{bmatrix} \mathrm{Cov}[z, z] & \mathrm{Cov}[z, \bar{z}] \\ \mathrm{Cov}[\bar{z}, z] & \mathrm{Cov}[\bar{z}, \bar{z}] \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}$$

⁷ In general, for any complex scalar $\alpha$, $\mathrm{Var}[\alpha z] = \alpha\bar{\alpha}\,\mathrm{Var}[z]$.


As with any valid variance matrix, the augmented variance must be Hermitian. Hence, $\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}^* = \begin{bmatrix} \Gamma^* & (\bar{C})^* \\ C^* & (\bar{\Gamma})^* \end{bmatrix} = \begin{bmatrix} \Gamma^* & C' \\ \bar{C}' & \bar{\Gamma}^* \end{bmatrix}$, from which follow $\Gamma^* = \Gamma$ and $C' = C$. Moreover, it must be at least NND, if not PD. It is important to note from this that the pseudovariance is an essential part of the augmented $z$; it is possible for two random variables to have the same variance and to covary differently with their conjugates. How a complex random vector covaries with its conjugate is useful information; it is even a parameter of the general complex normal distribution, which we will treat next.
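The following sketch (ours) shows, for a scalar $z$, that the variance and the pseudovariance carry different information, and that $\mathrm{Var}[iz] = \mathrm{Var}[z]$ while the pseudovariance of $iz$ flips sign:

```python
import numpy as np

rng = np.random.default_rng(4)
m = 200_000
# Real and imaginary parts with unequal variances, so the pseudovariance C != 0
x = rng.normal(scale=2.0, size=m)
y = rng.normal(scale=1.0, size=m)
z = x + 1j * y

Gamma = np.mean(z * z.conj())    # Var[z] = E[z z*] (mean is zero here)
C = np.mean(z * z)               # pseudovariance E[z z'] = Cov[z, z-bar]

print(Gamma.real)                # ~5 = 2^2 + 1^2
print(C.real)                    # ~3 = 2^2 - 1^2
print(np.mean((1j * z) * (1j * z).conj()).real)  # Var[iz] = Var[z] ~ 5
print(np.mean((1j * z) * (1j * z)).real)         # pseudovariance of iz ~ -3
```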

7. THE COMPLEX NORMAL DISTRIBUTION

All the information for deriving the complex normal distribution of $z_{n\times 1} = x + iy$ is contained in the parameters of the real-valued multivariate normal distribution:

$$\begin{bmatrix} x \\ y \end{bmatrix} \sim N\left(\mu_{2n\times 1} = \begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}, \; \Sigma_{2n\times 2n} = \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\right)$$

According to this variance structure, the real and imaginary parts of $z$ may covary, as long as the covariance is symmetric: $\Sigma_{yx} = \Sigma_{xy}'$. The grand $\Sigma$ matrix must be symmetric and PD. The probability density function of this multivariate normal is:⁸

$$f_{x,y}(x, y) = \frac{1}{\sqrt{(2\pi)^{2n}|\Sigma|}}\,e^{-\frac{1}{2}[x' - \mu_x' \;\; y' - \mu_y']\,\Sigma^{-1}\begin{bmatrix} x - \mu_x \\ y - \mu_y \end{bmatrix}} = \frac{1}{\pi^n\sqrt{2^{2n}|\Sigma|}}\,e^{-\frac{1}{2}[x' - \mu_x' \;\; y' - \mu_y']\,\Sigma^{-1}\begin{bmatrix} x - \mu_x \\ y - \mu_y \end{bmatrix}}$$

⁸ As derived briefly by Judge [1988, pp. 49f]; Chapter 4 of Johnson [1992] is thorough. To be precise, the determinant under the radical should be its absolute value; however, the determinant of a PD matrix must be positive (cf. Judge [1988, A.14(1)]).


z x + iy I n iI n x x Since z = x + iy , the augmented vector is   =   =    = Ξ n   . We will call z x − iy I n − iI n y y

I n iI n  Ξ n =   the augmentation matrix; this linear function of the real-valued vectors produces I n − iI n  the complex vector and its conjugate. An important equation is:

* I n iI n  I n I n  2I n 0  Ξ nΞ n =    =   = 2I 2n I n − iI n − iI n iI n   0 2I n 

Therefore, Ξn has an inverse, viz., one half of its transjugate.

z μ  μ + iμ  μ The augmented mean is E = Ξ x = x y = . The augmented variance is:   n    −    z μ y  μ x iμ y  μ

z * Var  = Ξ nΣΞ n z I iI Σ Σ  I I  = n n xx xy n n  Σ Σ   I n − iI n  yx yy − iI n iI n  Σ + iΣ Σ + iΣ  I I  = xx yx xy yy n n Σ − Σ Σ − Σ    xx i yx xy i yy − iI n iI n  Σ + Σ − i(Σ − Σ ) Σ − Σ + i(Σ + Σ ) = xx yy xy yx xx yy xy yx Σ − Σ − Σ + Σ Σ + Σ + Σ − Σ   xx yy i( xy yx ) xx yy i( xy yx ) Γ C =   C Γ

And so:

−1 −1 z Γ C * −1 * −1 −1 −1 Var   =   = (Ξ nΣΞ n ) = (Ξ n ) Σ (Ξ n ) z C Γ

−1 * −1 z * Γ C −1 This can be reformulated as Ξ nVar  Ξ n = Ξ n   Ξ n = Σ . z C Γ


We now work these augmented forms into the probability density function:

$$\begin{aligned}
f_{x,y}(x, y) &= \frac{1}{\pi^n\sqrt{2^{2n}|\Sigma|}}\,e^{-\frac{1}{2}[x' - \mu_x' \;\; y' - \mu_y']\,\Sigma^{-1}\begin{bmatrix} x - \mu_x \\ y - \mu_y \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{|2I_{2n}|\,|\Sigma|}}\,e^{-\frac{1}{2}[x' - \mu_x' \;\; y' - \mu_y']\,\Xi_n^*\,\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\Xi_n\begin{bmatrix} x - \mu_x \\ y - \mu_y \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{|\Xi_n\Xi_n^*|\,|\Sigma|}}\,e^{-\frac{1}{2}\left(\Xi_n\begin{bmatrix} x - \mu_x \\ y - \mu_y \end{bmatrix}\right)^*\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\left(\Xi_n\begin{bmatrix} x - \mu_x \\ y - \mu_y \end{bmatrix}\right)} \\
&= \frac{1}{\pi^n\sqrt{|\Xi_n\Sigma\Xi_n^*|}}\,e^{-\frac{1}{2}[\bar{z}' - \bar{\mu}' \;\; z' - \mu']\,\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\begin{bmatrix} z - \mu \\ \bar{z} - \bar{\mu} \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{\left|\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]\right|}}\,e^{-\frac{1}{2}[\bar{z}' - \bar{\mu}' \;\; z' - \mu']\,\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\begin{bmatrix} z - \mu \\ \bar{z} - \bar{\mu} \end{bmatrix}}
\end{aligned}$$

However, this is not quite the density function of $z$, since the differential volume has not been considered. The correct formula is $f_z(z)\,dV_z = f_{x,y}(x, y)\,dV_{xy}$. The differential volume in the $xy$ coordinates is $dV_{xy} = \prod_{j=1}^{n} dx_j\,dy_j$. A change of $dx_j$ entails an equal change in the real part of $dz_j$, even as a change of $dy_j$ entails an equal change in the imaginary part of $dz_j$. Accordingly, $dV_z = \left|\prod_{j=1}^{n} dx_j(i \cdot dy_j)\right| = \left|i^n\prod_{j=1}^{n} dx_j\,dy_j\right| = |i^n|\prod_{j=1}^{n} dx_j\,dy_j = 1 \cdot \prod_{j=1}^{n} dx_j\,dy_j = dV_{xy}$. It so happens that $\Xi_n$ does not distort volume; but this had to be demonstrated.⁹

So finally, the probability density function of the complex random vector $z$ is:

⁹ This will be abstruse to some actuaries. However, the integration theory is implicit in the change-of-variables technique outlined in Hogg [1984, pp. 42-46]. That the $n \times n$ determinant represents "the volume function of an $n$-dimensional parallelepiped" is beautifully explained in Chapter 4 of Schneider [1973].


$$\begin{aligned}
f_z(z) = f_z(z) \cdot 1 = f_z(z)\frac{dV_z}{dV_{xy}} = f_{x,y}(x, y)
&= \frac{1}{\pi^n\sqrt{\left|\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]\right|}}\,e^{-\frac{1}{2}[\bar{z}' - \bar{\mu}' \;\; z' - \mu']\,\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]^{-1}\begin{bmatrix} z - \mu \\ \bar{z} - \bar{\mu} \end{bmatrix}} \\
&= \frac{1}{\pi^n\sqrt{\left|\begin{matrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{matrix}\right|}}\,e^{-\frac{1}{2}[\bar{z}' - \bar{\mu}' \;\; z' - \mu']\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}^{-1}\begin{bmatrix} z - \mu \\ \bar{z} - \bar{\mu} \end{bmatrix}}
\end{aligned}$$

This formula is equivalent to the one found in the Wikipedia article "Complex normal distribution." Although $|\Gamma|\,\left|\bar{\Gamma} - \bar{C}\Gamma^{-1}C\right|$ appears within the radical of that article's formula, it can be shown that $\left|\begin{matrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{matrix}\right| = |\Gamma|\,\left|\bar{\Gamma} - \bar{C}\Gamma^{-1}C\right|$. As far as allowable parameters are concerned, $\mu$ may be any complex vector. $\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}$ is allowed if and only if $\Xi_n^*\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}\Xi_n = 4\Sigma$ is real-valued and PD.

Veeravalli [2006] defines a "proper" complex variable as one whose pseudo[co]variance matrix is $0_{n\times n}$. Inserting zero for $C$ into the formula, we derive the probability density function of a proper complex random variable whose variance is $\Gamma$:


−1 1 Γ C z−μ − [z′−μ′ z′−μ′]    1 2 C Γ z−μ f z (z) = e Γ C π n C Γ

−1 1 Γ 0  z−μ − [z′−μ′ z′−μ′]    1 Γ − = e 2 0  z μ Γ 0 π n 0 Γ

1  −1 −1  1 − (z′−μ′)Γ (z−μ)+(z′−μ′)Γ (z−μ) = e 2  π n Γ Γ

1 1 − {(z′−μ′)Γ−1 (z−μ)+(z′−μ′)Γ−1 (z′−μ′)} = e 2 π n Γ Γ

1 1 − {(z′−μ′)Γ−1 (z−μ )+(z′−μ′)Γ−1 (z−μ )} = e 2 π n Γ 2

1 −(z′−μ′)Γ−1 (z−μ ) = e π n Γ

1 * −1 = e −(z−μ ) Γ (z−μ ) π n Γ

The transformations in the last several lines rely on the fact that $\Gamma$ is Hermitian and PD. Now the standard complex normal random vector is a proper complex random vector with mean zero and variance $I_n$. Therefore, in confirmation of Section 5, its density function is $\frac{1}{\pi^n}\,e^{-z^*z}$.
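For the scalar case ($n = 1$, variance $\gamma$), the proper density $f_Z(z) = e^{-|z - \mu|^2/\gamma}\big/(\pi\gamma)$ can be checked numerically against the product of the two underlying real normal densities (a sketch, ours):

```python
import numpy as np

# Scalar proper complex normal with variance gamma: x, y ~ iid N(., gamma/2).
gamma, mu = 2.5, 1.0 - 0.5j
z = 0.3 + 0.8j
x, y = z.real, z.imag

f_complex = np.exp(-abs(z - mu)**2 / gamma) / (np.pi * gamma)
f_real = (np.exp(-(x - mu.real)**2 / gamma) / np.sqrt(np.pi * gamma) *
          np.exp(-(y - mu.imag)**2 / gamma) / np.sqrt(np.pi * gamma))
print(np.isclose(f_complex, f_real))   # True: the two densities agree
```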

8. THE COMPLEX LOGNORMAL RANDOM VECTOR AND ITS MOMENTS

A complex lognormal random vector is the elementwise exponentiation of a complex normal random vector: $w_{n\times 1} = e^{z_{n\times 1}}$. Its conjugate also is lognormal, since $\bar{w} = \overline{e^z} = e^{\bar{z}}$. Deriving the probability density function of $w$ is precluded by the fact that $e: z \to w$ is many-to-one. Specifically, $e^z = w = e^{z + i(2\pi k)}$ for any integer $k$. So unlike the real-valued lognormal random variable, whose density function can be found in Klugman [1998, §A.4.11], an analytic form for the complex lognormal density is not available. However, even for the real-valued lognormal the density function is of little value; its moments are commonly derived from the moment generating function of the normal variable on which it is based. So too, the moment generating function of the complex normal random vector is available for deriving the lognormal moments.

We hereby define the moment generating function of the complex $n \times 1$ random vector $z$ as $M_z(s_{n\times 1}, t_{n\times 1}) = E\left[e^{s'z + t'\bar{z}}\right]$. Since this definition may differ from other definitions in the sparse literature, we should justify it. First, because we will take derivatives of this function with respect to $s$ and $t$, the function must be differentiable. This demands simple transposition in the linear combination, i.e., $s'z + t'\bar{z}$ rather than the transjugation $s^*z + t^*\bar{z}$. For transjugation would involve derivatives of the form $\frac{d\bar{s}}{ds}$, which do not exist, as they violate the Cauchy-Riemann condition.¹⁰ Second, even though moments of $\bar{z}$ are conjugates of moments of $z$, we will need second-order moments involving both $z$ and $\bar{z}$. For this reason both terms must be in the exponent of the moment generating function.

¹⁰ Cf. Appendix D.1.3 of Havil [2003]. Express $f(z = x + iy)$ in terms of real-valued functions, i.e., as $u(x, y) + i \cdot v(x, y)$. The derivative is based on the matrix of real-valued partial derivatives $\begin{bmatrix} \partial u/\partial x & \partial v/\partial x \\ \partial u/\partial y & \partial v/\partial y \end{bmatrix}$. For the derivative to be the same in both directions, the Cauchy-Riemann condition must hold, viz., that $\partial u/\partial x = \partial v/\partial y$ and $\partial v/\partial x = -\partial u/\partial y$. But for $f(z) = \bar{z} = x - iy$, the partial-derivative matrix is $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$; hence $\partial u/\partial x \neq \partial v/\partial y$. The Cauchy-Riemann condition becomes intuitive when one regards a valid complex derivative as a double-real $2 \times 2$ matrix (Section 3). Compare this with $f(z) = z = x + iy$, whose matrix is $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, which represents the complex number 1.


z x + iy I n iI n x x We start with terminology from Section 7, viz., that   =   =    = Ξ n   and z x − iy I n − iI n y y

x  μ  Σ Σ  that ~ Nμ = x , Σ = xx xy  . According to Appendix A, the moment    2n×1   2n×2n Σ Σ  y  μ y   yx yy 

x generating function of the real-valued normal random vector   is: y

x μ x  1 Σxx Σxy a   [a′ b′]  [a′ b′] + [a′ b′]   a     Σ Σ     =  y  = μ y  2  yx yy b M x    E e e   b   y     

Consequently:

$$\begin{aligned}
M_z(s, t) &= E\left[e^{s'z + t'\bar{z}}\right] \\
&= E\left[e^{[s' \; t']\begin{bmatrix} I_n & iI_n \\ I_n & -iI_n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}}\right] \\
&= E\left[e^{[s' + t' \;\; i(s' - t')]\begin{bmatrix} x \\ y \end{bmatrix}}\right] \\
&= M_{x,y}\left(\begin{bmatrix} s + t \\ i(s - t) \end{bmatrix}\right) \\
&= e^{[(s + t)' \; i(s - t)']\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix} + \frac{1}{2}[(s + t)' \; i(s - t)']\begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}\begin{bmatrix} s + t \\ i(s - t) \end{bmatrix}}
\end{aligned}$$

It is so that we could invoke it here that Appendix B went to the trouble of proving that complex values are allowed in this moment generating function.

But in two ways we can simplify this expression. First:

$$[(s + t)' \; i(s - t)']\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix} = (s + t)'\mu_x + i(s - t)'\mu_y = s'(\mu_x + i\mu_y) + t'(\mu_x - i\mu_y) = s'\mu_z + t'\bar{\mu}_z$$


z Γ C Σ Σ  And second, again from Section 7, Var = = Ξ Σ Ξ* = Ξ xx xy Ξ* , or     n 2n×2n n n Σ Σ  n z C Γ  yx yy 

Ξ* Γ C Ξ Σ Σ  equivalently, n n = xx xy . Hence:   Σ Σ  2 C Γ 2  yx yy 

* ′ ′ Σ xx Σ xy  s + t  ′ ′ Ξ Γ C Ξ  s + t  (s + t) i(s − t)  = (s + t) i(s − t)  n n  Σ Σ          yx yy i(s − t) 2 C Γ 2 i(s − t)

Ξ n  s + t  1 I n iI n  s + t  1 (s + t)− (s − t) t On the right side,   =    =   =   . And on the 2 i(s − t) 2 I n − iI n i(s − t) 2 (s + t)+ (s − t) s

* ′ ′ Ξ 1 ′ ′  I n I n  left side, (s + t) i(s − t)  n = (s + t) i(s − t)  = [s′ t′]. So:      2 2 − iI n iI n 

′ ′ Σ xx Σ xy  s + t  Γ Ct (s + t) i(s − t)  = [s′ t′]  Σ Σ       yx yy i(s − t) C Γs

And the simplified expression, based on the mean and the augmented variance of $z$, is:

$$M_z(s, t) = E\left[e^{s'z + t'\bar{z}}\right] = e^{s'\mu_z + t'\bar{\mu}_z + \frac{1}{2}[s' \; t']\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}\begin{bmatrix} t \\ s \end{bmatrix}} = e^{s'\mu + t'\bar{\mu} + \frac{1}{2}\left(s'\Gamma t + s'Cs + t'\bar{C}t + t'\bar{\Gamma}s\right)}$$

As in Appendix A, let $e_j$ denote the $j$th unit vector of $\Re^n$, or even better, of $\mathbb{C}^n$. Then:

$$E\left[e^{z_j}\right] = M_z(e_j, 0) = e^{\mu_j + \frac{1}{2}C_{jj}}$$
$$E\left[\overline{e^{z_j}}\right] = E\left[e^{\bar{z}_j}\right] = M_z(0, e_j) = e^{\bar{\mu}_j + \frac{1}{2}\bar{C}_{jj}} = \overline{E\left[e^{z_j}\right]}$$

Moreover:

$$E\left[e^{z_j}e^{z_k}\right] = E\left[e^{z_j + z_k}\right] = M_z(e_j + e_k, 0) = e^{\mu_j + \mu_k + \frac{1}{2}(C_{jj} + C_{jk} + C_{kj} + C_{kk})}$$


According to Section 6, $C$ is symmetric ($C' = C$). This and further simplification leads to:

$$E\left[e^{z_j}e^{z_k}\right] = e^{\mu_j + \mu_k + \frac{1}{2}(C_{jj} + C_{jk} + C_{kj} + C_{kk})} = e^{\left(\mu_j + \frac{1}{2}C_{jj}\right) + \left(\mu_k + \frac{1}{2}C_{kk}\right) + \frac{1}{2}(C_{jk} + C_{jk})} = E\left[e^{z_j}\right]E\left[e^{z_k}\right] \cdot e^{C_{jk}}$$

Hence, mindful of the transjugation (*) in the definition of complex covariance, we have:

$$\mathrm{Cov}\left[e^{z_j}, \overline{e^{z_k}}\right] = E\left[e^{z_j}e^{z_k}\right] - E\left[e^{z_j}\right]E\left[e^{z_k}\right] = E\left[e^{z_j}\right]E\left[e^{z_k}\right] \cdot \left(e^{C_{jk}} - 1\right)$$

In terms of $w = e^z$ this translates as $\mathrm{Cov}[w, \bar{w}] = \left(E[w]\,E[w]'\right) \circ \left(e^C - 1_{n\times n}\right)$, in which the '$\circ$' operator represents elementwise multiplication.¹¹ So too:

$$\mathrm{Cov}\left[\overline{e^{z_j}}, e^{z_k}\right] = \overline{\mathrm{Cov}\left[e^{z_j}, \overline{e^{z_k}}\right]} = \overline{E\left[e^{z_j}\right]}\,\overline{E\left[e^{z_k}\right]} \cdot \left(e^{\bar{C}_{jk}} - 1\right)$$

This translates as $\mathrm{Cov}[\bar{w}, w] = \left(\overline{E[w]}\,\overline{E[w]}'\right) \circ \left(e^{\bar{C}} - 1_{n\times n}\right)$.

The remaining combination is the mixed form $E\left[e^{z_j}\overline{e^{z_k}}\right]$:

$$E\left[e^{z_j}\overline{e^{z_k}}\right] = E\left[e^{z_j}e^{\bar{z}_k}\right] = E\left[e^{z_j + \bar{z}_k}\right] = M_z(e_j, e_k) = e^{\mu_j + \bar{\mu}_k + \frac{1}{2}\left(\Gamma_{jk} + C_{jj} + \bar{C}_{kk} + \bar{\Gamma}_{kj}\right)}$$

Since $\Gamma$ is Hermitian, $\bar{\Gamma}_{kj} = \overline{\overline{\Gamma}_{jk}} = \Gamma_{jk}$. Hence:

$$E\left[e^{z_j}\overline{e^{z_k}}\right] = e^{\mu_j + \bar{\mu}_k + \frac{1}{2}\left(\Gamma_{jk} + C_{jj} + \bar{C}_{kk} + \Gamma_{jk}\right)} = e^{\left(\mu_j + \frac{1}{2}C_{jj}\right) + \left(\bar{\mu}_k + \frac{1}{2}\bar{C}_{kk}\right) + \frac{1}{2}(\Gamma_{jk} + \Gamma_{jk})} = E\left[e^{z_j}\right]\overline{E\left[e^{z_k}\right]} \cdot e^{\Gamma_{jk}}$$

Therefore, $\mathrm{Cov}\left[e^{z_j}, e^{z_k}\right] = E\left[e^{z_j}\overline{e^{z_k}}\right] - E\left[e^{z_j}\right]\overline{E\left[e^{z_k}\right]} = E\left[e^{z_j}\right]\overline{E\left[e^{z_k}\right]} \cdot \left(e^{\Gamma_{jk}} - 1\right)$, which translates as $\mathrm{Cov}[w, w] = \left(E[w]\,\overline{E[w]}'\right) \circ \left(e^{\Gamma} - 1_{n\times n}\right)$. By conjugation, $\mathrm{Cov}\left[\overline{e^{z_j}}, \overline{e^{z_k}}\right] = \overline{E\left[e^{z_j}\right]}\,E\left[e^{z_k}\right] \cdot \left(e^{\bar{\Gamma}_{jk}} - 1\right)$, which translates as $\mathrm{Cov}[\bar{w}, \bar{w}] = \left(\overline{E[w]}\,E[w]'\right) \circ \left(e^{\bar{\Gamma}} - 1_{n\times n}\right)$.

¹¹ Elementwise multiplication is formally known as the Hadamard, or Hadamard-Schur, product, of which we will make use in Appendices A and C. Cf. Million [2007].


We conclude this section by expressing it all in terms of $w_{n\times 1} = e^{z_{n\times 1}}$. Let $z$ be complex normal with mean $\mu$ and augmented variance $\mathrm{Var}\begin{bmatrix} z \\ \bar{z} \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar{C} & \bar{\Gamma} \end{bmatrix}$. And let $D$ be the $n \times 1$ vector consisting of the main diagonal of $C$. Then $\bar{D}$ is the vectorization of the diagonal of $\bar{C}$. So the augmented mean of $w$ is $E\begin{bmatrix} w \\ \bar{w} \end{bmatrix} = \begin{bmatrix} e^{\mu + D/2} \\ e^{\bar{\mu} + \bar{D}/2} \end{bmatrix}$. And the augmented variance of $w$ is:

$$\begin{aligned}
\mathrm{Var}\begin{bmatrix} w \\ \bar{w} \end{bmatrix} &= \begin{bmatrix} \mathrm{Cov}[w, w] & \mathrm{Cov}[w, \bar{w}] \\ \mathrm{Cov}[\bar{w}, w] & \mathrm{Cov}[\bar{w}, \bar{w}] \end{bmatrix} \\
&= \begin{bmatrix} \left(E[w]\,\overline{E[w]}'\right) \circ \left(e^{\Gamma} - 1_{n\times n}\right) & \left(E[w]\,E[w]'\right) \circ \left(e^{C} - 1_{n\times n}\right) \\ \left(\overline{E[w]}\,\overline{E[w]}'\right) \circ \left(e^{\bar{C}} - 1_{n\times n}\right) & \left(\overline{E[w]}\,E[w]'\right) \circ \left(e^{\bar{\Gamma}} - 1_{n\times n}\right) \end{bmatrix} \\
&= \left(\begin{bmatrix} E[w] \\ \overline{E[w]} \end{bmatrix}\begin{bmatrix} E[w] \\ \overline{E[w]} \end{bmatrix}^*\right) \circ \left(e^{\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]} - 1_{2n\times 2n}\right) \\
&= \left(E\begin{bmatrix} w \\ \bar{w} \end{bmatrix}E\begin{bmatrix} w \\ \bar{w} \end{bmatrix}^*\right) \circ \left(e^{\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]} - 1_{2n\times 2n}\right)
\end{aligned}$$

Scaling all the lognormal means to unity (or setting $\mu = -D/2$), we can say that the coefficient-of-lognormal-augmented-variation matrix equals $e^{\mathrm{Var}\left[\begin{smallmatrix} z \\ \bar{z} \end{smallmatrix}\right]} - 1_{2n\times 2n}$, which is analogous with the well-known coefficient of lognormal variation $\sqrt{e^{\sigma^2} - 1}$.
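A simulation sketch (ours) of the scalar lognormal mean formula $E[e^z] = e^{\mu + C/2}$, with the pseudovariance $C = \sigma^2 - \tau^2 + 2i\rho\sigma\tau$ anticipated from Section 9:

```python
import numpy as np

rng = np.random.default_rng(5)
m = 400_000
sigma, tau, rho = 0.5, 0.3, 0.4
# Scalar complex normal z = x + iy with correlated real parts, mean zero
cov = [[sigma**2, rho * sigma * tau], [rho * sigma * tau, tau**2]]
xy = rng.multivariate_normal([0.0, 0.0], cov, size=m)
z = xy[:, 0] + 1j * xy[:, 1]

C = sigma**2 - tau**2 + 2j * rho * sigma * tau   # pseudovariance of z
print(np.exp(z).mean())          # simulated E[e^z]
print(np.exp(C / 2))             # formula: E[e^z] = e^{mu + C/2}, mu = 0
```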

9. THE COMPLEX LOGNORMAL RANDOM VARIABLE

The previous section derived the augmented mean and variance of the lognormal random vector; this section provides some intuition into it. The complex lognormal random variable, or scalar, derives from the real-valued normal bivariate $\begin{bmatrix} X \\ Y \end{bmatrix} \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma^2 & \rho\sigma\tau \\ \rho\sigma\tau & \tau^2 \end{bmatrix}\right)$. Zero is not much of a restriction; since $e^{CN(\mu, V)} = e^{\mu + CN(0, V)} = e^{\mu} \cdot e^{CN(0, V)}$, the normal mean affects only the scale of the lognormal. The variance is written in correlation form, where $-1 \leq \rho \leq 1$. As usual, $0 < \sigma, \tau < \infty$.

Define $\begin{bmatrix} Z \\ \bar{Z} \end{bmatrix} = \Xi_1\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} 1 & i \\ 1 & -i \end{bmatrix}\begin{bmatrix} X \\ Y \end{bmatrix} = \begin{bmatrix} X + iY \\ X - iY \end{bmatrix}$. Its mean is zero, and according to Section 7 its variance (the augmented variance) is:

$$\begin{aligned}
\mathrm{Var}\begin{bmatrix} Z \\ \bar{Z} \end{bmatrix} &= \begin{bmatrix} 1 & i \\ 1 & -i \end{bmatrix}\begin{bmatrix} \sigma^2 & \rho\sigma\tau \\ \rho\sigma\tau & \tau^2 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -i & i \end{bmatrix} \\
&= \begin{bmatrix} \sigma^2 + \tau^2 - i(\rho\sigma\tau - \rho\sigma\tau) & \sigma^2 - \tau^2 + i(\rho\sigma\tau + \rho\sigma\tau) \\ \sigma^2 - \tau^2 - i(\rho\sigma\tau + \rho\sigma\tau) & \sigma^2 + \tau^2 + i(\rho\sigma\tau - \rho\sigma\tau) \end{bmatrix} \\
&= \begin{bmatrix} \sigma^2 + \tau^2 & \sigma^2 - \tau^2 + 2i\rho\sigma\tau \\ \sigma^2 - \tau^2 - 2i\rho\sigma\tau & \sigma^2 + \tau^2 \end{bmatrix}
\end{aligned}$$

We will say little about non-zero correlation ($\rho \neq 0$); but at this point a digression on complex correlation is apt. The coefficient of correlation between $Z$ and its conjugate is:

$$\rho_{Z\bar{Z}} = \frac{\sigma^2 - \tau^2 + 2i\rho\sigma\tau}{\sigma^2 + \tau^2} = \overline{\left(\frac{\sigma^2 - \tau^2 - 2i\rho\sigma\tau}{\sigma^2 + \tau^2}\right)} = \bar{\rho}_{\bar{Z}Z}$$

As a form of covariance, correlation is Hermitian. Moreover:

$$0 \leq \rho_{Z\bar{Z}}\,\bar{\rho}_{Z\bar{Z}} = \rho_{Z\bar{Z}}\,\rho_{\bar{Z}Z} = \frac{(\sigma^2 - \tau^2)^2 + 4\rho^2\sigma^2\tau^2}{(\sigma^2 + \tau^2)^2} \leq \frac{(\sigma^2 - \tau^2)^2 + 4(1)\sigma^2\tau^2}{(\sigma^2 + \tau^2)^2} = \frac{(\sigma^2 + \tau^2)^2}{(\sigma^2 + \tau^2)^2} = 1$$

So, the magnitude of complex correlation is not greater than unity. The imaginary part of the correlation is zero unless some correlation exists between the real and imaginary parts of the underlying bivariate. More interesting are the two limits: $\lim_{\tau^2 \to 0^+}\rho_{Z\bar{Z}} = 1$ and $\lim_{\sigma^2 \to 0^+}\rho_{Z\bar{Z}} = -1$. In the first case, $Z \to \bar{Z}$ in a statistical sense, and the correlation approaches one. In the second case, $Z \to -\bar{Z}$, and the correlation approaches negative one.

Now if $W = e^Z$, by the formulas of Section 8, $E[W] = e^{0 + (\sigma^2 - \tau^2 + 2i\rho\sigma\tau)/2} = e^{(\sigma^2 - \tau^2)/2} \cdot e^{i\rho\sigma\tau}$ and $E[\bar{W}] = e^{(\sigma^2 - \tau^2)/2} \cdot e^{-i\rho\sigma\tau}$. And the augmented variance is:

$$\begin{aligned}
\mathrm{Var}\begin{bmatrix} W \\ \bar{W} \end{bmatrix} &= \left(E\begin{bmatrix} W \\ \bar{W} \end{bmatrix}E\begin{bmatrix} W \\ \bar{W} \end{bmatrix}^*\right) \circ \left(e^{\mathrm{Var}\left[\begin{smallmatrix} Z \\ \bar{Z} \end{smallmatrix}\right]} - 1_{2\times 2}\right) \\
&= e^{\sigma^2 - \tau^2}\begin{bmatrix} 1 & e^{2i\rho\sigma\tau} \\ e^{-2i\rho\sigma\tau} & 1 \end{bmatrix} \circ \begin{bmatrix} e^{\sigma^2 + \tau^2} - 1 & e^{\sigma^2 - \tau^2 + 2i\rho\sigma\tau} - 1 \\ e^{\sigma^2 - \tau^2 - 2i\rho\sigma\tau} - 1 & e^{\sigma^2 + \tau^2} - 1 \end{bmatrix} \\
&= \begin{bmatrix} e^{2\sigma^2} - e^{\sigma^2 - \tau^2} & e^{2\sigma^2 - 2\tau^2 + 4i\rho\sigma\tau} - e^{\sigma^2 - \tau^2} \cdot e^{2i\rho\sigma\tau} \\ e^{2\sigma^2 - 2\tau^2 - 4i\rho\sigma\tau} - e^{\sigma^2 - \tau^2} \cdot e^{-2i\rho\sigma\tau} & e^{2\sigma^2} - e^{\sigma^2 - \tau^2} \end{bmatrix}
\end{aligned}$$

2 + W  σ2 2 1 W  σ2 σ2 1 1 In the first case above, as τ → 0 , E  → e   and Var  → e (e −1)  . Since W  1 W  1 1 the complex part Y becomes probability-limited to its mean of zero, the complex lognormal degenerates to the real-valued W = e X . The limiting result is oblivious to the underlying correlation

ρ, since W → W .


τ 2 −τ 2 + W  − 2 1 W  −τ 2 e −1 e −1 2 → → τ 2 → In the second case, as σ 0 , E  e   and Var  e  −τ 2 τ 2  . As W  1 W  e −1 e −1  in the first case, both E[W ]= E[W ] and the underlying correlation ρ has disappeared. Nevertheless,

the variance shows W and its conjugate to differ; in fact, their correlation is the real-valued

−τ 2 τ 2 −τ 2 2 ρWW = (e −1) (e −1)= −e = −E[W ]. Since τ > 0 , −1 < ρWW < 0 and

0 < E[W ]= E[W ]<1.

Both these cases are understandable from the "geometry" of $W = e^Z = e^{X + iY} = e^X e^{iY}$. The complex exponential function is the basis of polar coordinates; $e^X$ is the magnitude of $W$, and $Y$ is the angle of $W$ in radians counterclockwise from the real axis of the complex plane. Imagine a cannon whose angle and range can be set. In the first case, the angle is fixed at zero, but the range is variable. This makes for a lognormal distribution along the positive real axis. In the second case, the cannon's angle varies, but its range is fixed at $e^0 = 1$. This makes all the shots land on the complex unit circle; hence, their mean lies within the circle, i.e., $|E[W]| < 1$. Moreover, the symmetry of $Y$ as $N(0, \tau^2)$-distributed guarantees $E[W]$ to fall on the real axis, or $-1 < E[W] < 1$.

Furthermore, since the normal density function strictly decreases in both directions from the mean, more shots land to the right of the imaginary axis than to the left, so $0 < E[W] = e^{-\tau^2/2} < 1$. A "right-handed" cannon, or a cannon whose angle is measured clockwise from the real axis, fires $\bar{W} = e^X e^{-iY}$ shots.


A shot from an unrestricted cannon will "almost surely" not land on the real axis.¹² If we desire negative values from the complex lognormal random variable, as a practical matter we must extract them from its real or imaginary parts, e.g., $U = \mathrm{Re}(W)$. One can see in the second case that as $\tau^2$ grows larger, so too grows larger the probability that $U < 0$. As $\tau^2 \to \infty$, the probability approaches one half. In the limit, the shots are uniformly distributed around the complex unit circle. In this specialized case ($\sigma^2 \to 0^+$ and $\tau^2 \to \infty$), the distribution of $U = \mathrm{Re}(W)$ is¹³

$$f_U(u) = \frac{1}{\pi\sqrt{1 - u^2}}, \quad -1 \leq u \leq 1$$

This suggests a third case, in which $\tau^2 \to \infty$ while $\sigma^2$ remains at some positive amount. An intriguing feature of complex variables is that infinite variance in $Y$ leads to a uniform distribution of $e^{iY}$.¹⁴ So if $W = e^Z = e^X e^{iY}$, $U = \mathrm{Re}(W) = e^X\cos Y$ will be something of a reflected lognormal; both its tails will be as heavy as the lognormal's.¹⁵ In this case:

$$E\begin{bmatrix} W \\ \bar{W} \end{bmatrix} = \lim_{\tau^2 \to \infty}\begin{bmatrix} e^{(\sigma^2 - \tau^2 + 2i\rho\sigma\tau)/2} \\ e^{(\sigma^2 - \tau^2 - 2i\rho\sigma\tau)/2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

$$\mathrm{Var}\begin{bmatrix} W \\ \bar{W} \end{bmatrix} = \lim_{\tau^2 \to \infty}\begin{bmatrix} e^{2\sigma^2} - e^{\sigma^2 - \tau^2} & e^{2\sigma^2 - 2\tau^2 + 4i\rho\sigma\tau} - e^{\sigma^2 - \tau^2} \cdot e^{2i\rho\sigma\tau} \\ e^{2\sigma^2 - 2\tau^2 - 4i\rho\sigma\tau} - e^{\sigma^2 - \tau^2} \cdot e^{-2i\rho\sigma\tau} & e^{2\sigma^2} - e^{\sigma^2 - \tau^2} \end{bmatrix} = \begin{bmatrix} e^{2\sigma^2} & 0 \\ 0 & e^{2\sigma^2} \end{bmatrix}$$

Again, $\rho$ has disappeared from the limiting distribution; but in this case $\rho_{W\bar{W}} = 0$.

¹² For an event almost surely to happen means that its probability is unity; for an event almost surely not to happen means that its probability is zero. The latter case means not that the event will not happen, but rather that the event has zero probability mass. For example, if $X \sim \mathrm{Uniform}[0, 1]$, $\mathrm{Prob}[X = \tfrac{1}{2}] = 0$. So $X$ almost surely does not equal ½, even though ½ is as possible as any other number in the interval.

¹³ For more on this bimodal Arcsine(-1, 1) distribution see Wikipedia, "Arcsine distribution."

¹⁴ The next section expands on this important subject. "Infinite variance in $Y$" means "as the variance of $Y$ approaches infinity." It does not mean that $e^{iY}$ is uniform for a variable $Y$ whose variance is infinite, e.g., for a Pareto random variable whose shape parameter is less than or equal to two.

¹⁵ Cf. Halliwell [2013] for a discussion on the right tails of the lognormal and other loss distributions.


In practical work with $U = \mathrm{Re}(W)$,¹⁶ the angular part $e^{iY}$ will be more important than the lognormal range $e^X$. For example, one who wanted the tendency for the larger magnitudes of $U = \mathrm{Re}(W)$ to be positive might set the mean of $Y$ at $-\pi/2$ and the correlation $\rho$ to some positive value. Thus, greater than average values of $Y$, angling off into quadrants 4 and 1 of the complex plane, would correlate with larger than average values of $X$ and hence of $e^X$. Of course, $\mathrm{Var}[Y] = \tau^2$ would have to be small enough that deviations of $\pm\pi$ from $E[Y] = -\pi/2$ would be tolerably rare. Equivalently, one could set the mean of $Y$ at $\pi/2$ and the correlation $\rho$ to some negative value. As a second example, one who wanted negative values of $U$ to be less frequent than positive might set both the mean of $Y$ and $\rho$ to zero, and set the variance of $Y$ so that $\mathrm{Prob}[|Y| > \pi/2]$ is desirably small. Some distributions of $U$ for $\tau^2 \gg \sigma^2$ are bimodal, as in the specialized case $\sigma^2 \to 0^+$ and $\tau^2 \to \infty$. But less extreme parameters would result in unimodal distributions for $U$ over the entire real number line.

¹⁶ In the absence of an analytic distribution, practical work with the complex lognormal would seem to require simulating its values from the underlying normal distribution.
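In that spirit, a simulation sketch (ours) of $U = \mathrm{Re}(W) = e^X\cos Y$ under the first example's setup; all parameter values are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(6)
m = 500_000
sigma, tau, rho = 0.6, 0.8, 0.4
mu_y = -np.pi / 2   # mean angle per the first example; rho > 0 ties range to angle
cov = [[sigma**2, rho * sigma * tau], [rho * sigma * tau, tau**2]]
xy = rng.multivariate_normal([0.0, mu_y], cov, size=m)
U = np.exp(xy[:, 0]) * np.cos(xy[:, 1])   # U = Re(W) = e^X cos(Y)

print((U < 0).mean())       # share of negative values
print(U.mean(), U.std())    # sample moments of the two-tailed U
```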

10. THE COMPLEX UNIT-CIRCLE RANDOM VARIABLE

In the previous section we claimed that as the variance $\tau^2$ of the normal random variable $Y$ approaches infinity, $e^{iY}$ approaches a uniform distribution over the complex unit circle. The explanation and justification of this claim in this section prepare for an important implication in the next.

Let real-valued random variable $Y$ be distributed as $N[\mu, \sigma^2]$, and let $W = e^{iY}$. According to the moment-generating formula of Section 8, $M_Y(it) = E[e^{itY}] = e^{it\mu + (it)^2\sigma^2/2} = e^{it\mu - t^2\sigma^2/2}$. Although the formula applies to complex values of $t$, here we'll restrict it to real values. With $t \in \Re$, $M_Y(it)$ is known as the characteristic function of real variable $Y$. And so:

$$\lim_{\sigma^2 \to \infty} M_Y(it) = \lim_{\sigma^2 \to \infty} e^{it\mu - t^2\sigma^2/2} = e^{it\mu}\lim_{\sigma^2 \to \infty} e^{-t^2\sigma^2/2} = \delta_{t0} = \begin{cases} 1 & \text{if } t = 0 \\ 0 & \text{if } t \neq 0 \end{cases}$$

It is noteworthy, and indicative of a uniformity of some sort, that $\mu$ drops out of the result.

Next, let real-valued random variable $\Theta$ be uniformly distributed over $[a, a + 2\pi n]$, where $n$ is a positive integer; in symbols, $\Theta \sim U[a, a + 2\pi n]$. Then:

$$M_\Theta(it) = E[e^{it\Theta}] = \int_{\theta = a}^{a + 2\pi n} e^{it\theta}\,\frac{d\theta}{2\pi n} = \left.\frac{e^{it\theta}}{2\pi i t n}\right|_{a}^{a + 2\pi n} = e^{ita}\,\frac{e^{2\pi i t n} - 1}{2\pi i t n} = \begin{cases} 1 & \text{if } t = 0 \\ 0 & \text{if } tn = \pm 1, \pm 2, \ldots \\ \neq 0 & \text{if } tn \text{ not integral} \end{cases}$$

Letting n approach infinity, we have:

$$\lim_{n \to \infty} M_\Theta(it) = e^{ita}\lim_{n \to \infty}\frac{e^{2\pi i t n} - 1}{2\pi i t n} = \delta_{t0} = \begin{cases} 1 & \text{if } t = 0 \\ 0 & \text{if } t \neq 0 \end{cases}$$

Hence, $\lim_{n \to \infty} M_\Theta(it) = \lim_{\sigma^2 \to \infty} M_Y(it) = \delta_{t0}$. The equality of the limits of the characteristic functions of the random variables implies the identity of the limits of their distributions; hence, the diffuse uniform $U[a, a + \infty]$ is "the same" as the diffuse normal $N[\mu, \infty]$.¹⁷

¹⁷ Quotes are around 'the same' because the limiting distributions are not proper distributions. The notion of diffuse distributions comes from Venter [1996, pp. 406-410], who shows there how different diffuse distributions result in different Bayesian estimates. But here every continuous random variable $Y$ diffuses through the periodicity of $e^{iY}$ into the same limiting distribution, viz., the Kronecker $\delta_{t0}$ (note 31).


Indeed, for the limit to be $\delta_{t0}$ it is not required that $n$ be an integer. But for $\Theta \sim U[a, a + 2\pi n]$, the integral moments of $W = e^{i\Theta}$ are:

$$E[W^j] = E[e^{ij\Theta}] = \begin{cases} 1 & \text{if } j = 0 \\ 0 & \text{if } jn = \pm 1, \pm 2, \ldots \\ \neq 0 & \text{if } jn \text{ not integral} \end{cases}$$

So if $n$ is an integer, $jn$ will be an integer, and all the integral moments of $W$ will be zero, except for the zeroth. Therefore, the integral moments of $W = e^{i\Theta}$ are invariant to $n$, as long as the $n$ in $2\pi n$, the width of the interval of $\Theta$, is a whole number. Hence, although we hereby define the unit-circle random variable as $e^{i\Theta}$ for $\Theta \sim U[0, 2\pi]$, the choice of $a = 0$ and $n = 1$ is out of convenience, rather than out of necessity. The probability for $e^{i\Theta}$ to be in an arc of this circle of length $l$ equals $l/2\pi$.

The integral moments of the conjugate $\bar{W} = e^{-i\Theta}$ are the same, for $E[\bar{W}^j] = E[\overline{W^j}] = \overline{E[W^j]} = \bar{\delta}_{j0} = \delta_{j0} = E[W^j]$. Alternatively, $E[\bar{W}^j] = E[e^{-ij\Theta}] = \delta_{(-j)0} = \delta_{j0}$. And the $jk$th mixed moment is $E[W^j\bar{W}^k] = E[e^{ij\Theta}e^{-ik\Theta}] = E[e^{i(j - k)\Theta}] = \delta_{(j - k)0} = \delta_{jk}$. Since $E[W] = E[\bar{W}] = 0$, the augmented variance of the unit-circle random variable is:

$$\mathrm{Var}\begin{bmatrix} W \\ \bar{W} \end{bmatrix} = E\left[\begin{bmatrix} W \\ \bar{W} \end{bmatrix}[\bar{W} \;\; W]\right] = E\begin{bmatrix} W\bar{W} & WW \\ \bar{W}\bar{W} & \bar{W}W \end{bmatrix} = \begin{bmatrix} \delta_{11} & \delta_{20} \\ \delta_{02} & \delta_{11} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I_2$$

Hence, $W = e^{i\Theta}$ for $\Theta \sim U[0, 2\pi]$ is not just a unit-circle random variable; having zero mean and unit variance, it is the standard unit-circle random variable.



Multiplying $W$ by a complex constant $\alpha \neq 0$ affects the radius of the random variable, whose $jk$th mixed moment is:

$$E\left[(\alpha W)^j\overline{(\alpha W)}^k\right] = \alpha^j\bar{\alpha}^k E[W^j\bar{W}^k] = \alpha^j\bar{\alpha}^k\delta_{jk} = \begin{cases} (\alpha\bar{\alpha})^j & \text{if } j = k \\ 0 & \text{if } j \neq k \end{cases}$$

The augmented variance is $\mathrm{Var}\begin{bmatrix} \alpha W \\ \overline{\alpha W} \end{bmatrix} = \alpha\bar{\alpha}\,\mathrm{Var}[W] = \alpha\bar{\alpha}\,I_2$. One may consider $\alpha$ as an instance of a complex random variable $A$. Due to the independence of $A$ from $W$, the $jk$th mixed moment of $AW$ is $E\left[(AW)^j\overline{(AW)}^k\right] = E[A^j\bar{A}^k]E[W^j\bar{W}^k] = E[A^j\bar{A}^k]\delta_{jk} = E[(A\bar{A})^j]\delta_{jk}$. Its augmented variance is $\mathrm{Var}\begin{bmatrix} AW \\ \overline{AW} \end{bmatrix} = E[A\bar{A}]\,\mathrm{Var}[W] = \left\{\mathrm{Var}[A] + E[A]\overline{E[A]}\right\}\mathrm{Var}[W]$. Unlike the one-dimensional $W$, $AW$ can cover the whole complex plane. However, like $W$, it too possesses the desirable property that $E[(AW)^j] = \delta_{j0}$.¹⁸
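A simulation sketch (ours) of the unit-circle moments $E[W^j] = \delta_{j0}$ and the mixed moments $E[W^j\bar{W}^k] = \delta_{jk}$:

```python
import numpy as np

rng = np.random.default_rng(7)
m = 1_000_000
theta = rng.uniform(0.0, 2.0 * np.pi, size=m)
W = np.exp(1j * theta)                     # standard unit-circle random variable

for j in range(4):
    print(j, np.round(np.mean(W**j), 3))   # ~1 for j = 0, ~0 otherwise

# Mixed moments E[W^j conj(W)^k] = delta_jk, e.g.:
print(np.round(np.mean(W**2 * np.conj(W)**2), 3))  # ~1 (j = k = 2)
print(np.round(np.mean(W**2 * np.conj(W)**1), 3))  # ~0 (j != k)
```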

11. UNIT-CIRCULARITY AND DETERMINISM

The single most important quality of a random variable is its mean. In fact, just having reliable estimates of mean values would satisfy many users of actuarial analyses. Stochastic advances in actuarial science over the last few decades notwithstanding, much actuarial work remains deterministic. Determinism is not the reduction of a stochastic answer $Y = f(X)$ to its mean $E[Y] = E[f(X)]$. Rather, the deterministic assumption is that the expectation of a function of a random variable equals the function of the expectation of the random variable; in symbols, $E[Y] = E[f(X)] = f(E[X])$. Because this assumption is true for linear $f$, it was felt to be a reasonable or necessary approximation for non-linear $f$.

¹⁸ The existence of the moments $E[A^j\bar{A}^k]$ needs to be ascertained. In particular, moments for $j$ and $k$ as negative integers will not exist unless $\mathrm{Prob}[A = 0] = 1 - \mathrm{Prob}[A \neq 0] = 1 - \mathrm{Prob}[A\bar{A} > 0] = 0$.

Advances in computing hardware and software, as well as increased technical sophistication, have made determinism more avoidable and less acceptable. However, the complex unit-circular random variable provides a habitat for the survival of determinism. To see this, let $f$ be analytic over the domain of complex random variable $Z$. From Cauchy's Integral Formula (Havil [2003, Appendix D.8 and D.9]) it follows that within the domain of $Z$, $f$ can be expressed as a convergent series $f(z) = a_0 + a_1 z + \cdots = a_0 + \sum_{j=1}^{\infty} a_j z^j$. Taking the expectation, we have:

$$E[f(Z)] = a_0 + \sum_{j=1}^{\infty} a_j E[Z^j]$$

But if for every positive integer $j$, $E[Z^j] = E[Z]^j$, then:

$$E[f(Z)] = a_0 + \sum_{j=1}^{\infty} a_j E[Z^j] = a_0 + \sum_{j=1}^{\infty} a_j E[Z]^j = f(E[Z])$$

Therefore, determinism conveniently works for analytic functions of random variables whose moments are powers of their means.

Now a real-valued random variable whose moments are powers of its mean would have the characteristic function:

$$M_X(it) = E[e^{itX}] = 1 + \sum_{j=1}^{\infty}\frac{(it)^j}{j!}E[X^j] = 1 + \sum_{j=1}^{\infty}\frac{(it)^j}{j!}E[X]^j = e^{itE[X]} = M_{E[X]}(it)$$

This is the characteristic function of the "deterministic" random variable, i.e., the random variable whose probability is massed at one point, its mean. So determinism with real-valued random variables requires "deterministic" random variables. But some complex random variables, such as the unit-circle, have the property $E[Z^j] = E[Z]^j$ without being deterministic.

In fact, when $E[Z^j] = E[Z]^j$, not only is $E[f(Z)] = f(E[Z])$: for positive integer $k$, $f^k(z)$ is as analytic as $f$ itself; hence, $E[f^k(Z)] = f^k(E[Z])$. So the determinism with these complex random variables is valid for all moments; nothing is lost.

In Section 10 we saw that for the unit-circle random variable $W = e^{i\Theta}$ and for $j = \pm 1, \pm 2, \ldots$, $E[W^{-j}] = E[W^j] = 0 = E[W]^j$. Can determinism extend to non-analytic functions which involve the negative moments? For example, let $g(z) = 1/(\eta - z)$, for some complex $\eta \neq 0$. The function is singular at $z = \eta$; but within the disc $\{z : |z/\eta| < 1\} = \{z : |z| < |\eta|\}$ the function equals the convergent series:

$$g(z) = \frac{1}{\eta - z} = \frac{1}{\eta} \cdot \frac{1}{1 - \dfrac{z}{\eta}} = \frac{1}{\eta}\left(1 + \frac{z}{\eta} + \left(\frac{z}{\eta}\right)^2 + \cdots\right) = \frac{1}{\eta} + \frac{z}{\eta^2} + \frac{z^2}{\eta^3} + \cdots$$

Outside the disc, or for $\{z : |z/\eta| > 1\} = \{z : |z| > |\eta|\}$, another convergent series represents the function:

$$g(z) = \frac{1}{\eta - z} = -\frac{1}{z} \cdot \frac{1}{1 - \dfrac{\eta}{z}} = -\frac{1}{z}\left(1 + \frac{\eta}{z} + \left(\frac{\eta}{z}\right)^2 + \cdots\right) = -\frac{1}{z} - \frac{\eta}{z^2} - \frac{\eta^2}{z^3} - \cdots$$

So, if $|\eta| > 1$, then $|W| = 1 < |\eta|$. In this case:


$$E[g(W)] = E\left[\frac{1}{\eta} + \frac{W}{\eta^2} + \frac{W^2}{\eta^3} + \cdots\right] = \frac{1}{\eta} + \frac{E[W]}{\eta^2} + \frac{E[W^2]}{\eta^3} + \cdots = \frac{1}{\eta} + \frac{0}{\eta^2} + \frac{0}{\eta^3} + \cdots = \frac{1}{\eta}$$

However, if $|\eta| < 1$, then $|W| = 1 > |\eta|$. So in this case:

$$E[g(W)] = E\left[-\frac{1}{W} - \frac{\eta}{W^2} - \frac{\eta^2}{W^3} - \cdots\right] = -E[W^{-1}] - \eta E[W^{-2}] - \eta^2 E[W^{-3}] - \cdots = -0 - \eta \cdot 0 - \eta^2 \cdot 0 - \cdots = 0$$

Both answers are correct; however, only the first satisfies the deterministic equation $E[g(W)] = g(E[W]) = g(0) = 1/\eta$.

To understand why the answer depends on whether $\eta$ is inside or outside the complex unit circle, let us evaluate $E[g(W)]$ directly:

$$E[g(W)] = E[1/(\eta - W)] = E\left[1/\left(\eta - e^{i\Theta}\right)\right] = \int_{\theta = 0}^{2\pi}\frac{1}{\eta - e^{i\theta}}\,\frac{d\theta}{2\pi}$$

The next step is to transform from $\theta$ into $z = e^{i\theta}$. So $dz = ie^{i\theta}d\theta = iz\,d\theta$, and the line integral transforms into a contour integral over the unit circle $C$:


$$\begin{aligned}
E[g(W)] &= \int_{\theta = 0}^{2\pi}\frac{1}{\eta - e^{i\theta}}\,\frac{d\theta}{2\pi} \\
&= \oint_C \frac{1}{\eta - z}\,\frac{1}{iz}\,\frac{dz}{2\pi} \\
&= \frac{1}{2\pi i}\oint_C \frac{dz}{z(\eta - z)} \\
&= \frac{1}{\eta} \cdot \frac{1}{2\pi i}\oint_C \left(\frac{1}{z} + \frac{1}{\eta - z}\right)dz \\
&= \frac{1}{\eta}\left(\frac{1}{2\pi i}\oint_C \frac{dz}{z} + \frac{1}{2\pi i}\oint_C \frac{dz}{\eta - z}\right) \\
&= \frac{1}{\eta}\left(\frac{1}{2\pi i}\oint_C \frac{dz}{z - 0} - \frac{1}{2\pi i}\oint_C \frac{dz}{z - \eta}\right)
\end{aligned}$$

Now the value of each of these integrals is one if its singularity is within the unit circle C, and zero if it is not.¹⁹ Of course, the singularity of the first integral at $z = 0$ lies within C; hence, its value is one. The second integral's singularity at $z = \eta$ lies within C if and only if $|\eta| < 1$. Therefore:

$$E[g(W)] = \frac{1}{\eta}\left(\frac{1}{2\pi i}\oint_C \frac{dz}{z - 0} - \frac{1}{2\pi i}\oint_C \frac{dz}{z - \eta}\right) = g(E[W]) \cdot \begin{cases} 1 & \text{if } |\eta| > 1 \\ 0 & \text{if } |\eta| < 1 \end{cases}$$

So the deterministic equation will hold for one of the Laurent series according to which the domain of the non-analytic function is divided into regions of convergence. Fascinating enough is how the function $\varphi(\eta) = \frac{1}{2\pi i}\oint_C \frac{dz}{z - \eta} = \begin{cases} 1 & \text{if } |\eta| < 1 \\ 0 & \text{if } |\eta| > 1 \end{cases}$ serves as the indicator of a state, viz., the state of being inside or outside the complex unit circle.
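This indicator behavior is simple to check numerically. The following sketch (ours, not part of the paper) averages g(W) = 1/(η − W) over a fine grid of θ, once for an η outside and once for an η inside the unit circle:

    import numpy as np

    theta = np.linspace(0.0, 2.0 * np.pi, 200_000, endpoint=False)
    w = np.exp(1j * theta)                       # W = e^(i*theta) on a fine grid

    for eta in (2.0 + 0.0j, 0.5 + 0.0j):         # |eta| > 1, then |eta| < 1
        e_g = np.mean(1.0 / (eta - w))           # approximates E[g(W)]
        print(eta, e_g, 1.0 / eta)               # equals 1/eta only when |eta| > 1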

¹⁹ Technically, the integral has no value if the singularity lies on C; but there are some practical advantages to "splitting the difference" in that case.


12. THE LINEAR STATISTICAL MODEL

Better known as “regression” models, linear statistical models extend readily into the realm of complex numbers. A general real-valued form of such models is presented and derived in Halliwell

[1997, Appendix C]:

$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\beta + \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}, \qquad \operatorname{Var}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$$

The subscript ‘1’ denotes observations, ‘2’ denotes predictions. Vector y1 is observed; the whole

design matrix X is hypothesized, as well as the fourfold ‘Σ’ variance structure. Although the

variance structure may be non-negative-definite (NND), the variance of the observations Σ11 must

be positive-definite (PD). Also, the observation design X1 must be of full column rank. The last

two requirements ensure the existence of the inverses $\Sigma_{11}^{-1}$ and $(X_1'\Sigma_{11}^{-1}X_1)^{-1}$. The best linear unbiased predictor of $y_2$ is $\hat{y}_2 = X_2\hat\beta + \Sigma_{21}\Sigma_{11}^{-1}(y_1 - X_1\hat\beta)$. The variance of prediction error $y_2 - \hat{y}_2$ is
$$\operatorname{Var}[y_2 - \hat{y}_2] = (X_2 - \Sigma_{21}\Sigma_{11}^{-1}X_1)\operatorname{Var}[\hat\beta](X_2 - \Sigma_{21}\Sigma_{11}^{-1}X_1)' + \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}.$$
Embedded in these formulas are the estimator of β and its variance:
$$\hat\beta = (X_1'\Sigma_{11}^{-1}X_1)^{-1}X_1'\Sigma_{11}^{-1}y_1 = \operatorname{Var}[\hat\beta]\cdot X_1'\Sigma_{11}^{-1}y_1.$$

For the purpose of introducing complex numbers into the linear statistical model we will concern ourselves here only with the estimation of the parameter β. So we drop the subscripts '1' and '2' and simplify the observation as $y = X\beta + e$, where $\operatorname{Var}[e] = \Gamma$. Again, X must be of full column rank and Γ must be Hermitian PD. According to Section 4, transjugation is to complex matrices what transposition is to real-valued matrices. Therefore, the short answer for a complex model is:
$$\hat\beta = (X^*\Gamma^{-1}X)^{-1}X^*\Gamma^{-1}y = \operatorname{Var}[\hat\beta]\cdot X^*\Gamma^{-1}y.$$
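Because NumPy handles complex matrices natively, this short answer is only a few lines of code. The following sketch is ours, with simulated inputs; it assumes only that X has full column rank and that Gamma is Hermitian PD:

    import numpy as np

    def complex_gls(X, y, Gamma):
        # sketch of the complex GLS estimator (X* G^-1 X)^-1 X* G^-1 y
        Gi = np.linalg.inv(Gamma)
        Xh = X.conj().T                        # the transjugate X*
        var_b = np.linalg.inv(Xh @ Gi @ X)     # Var[beta-hat]
        return var_b @ (Xh @ Gi @ y), var_b

    rng = np.random.default_rng(0)
    t, k = 200, 2
    X = rng.normal(size=(t, k)) + 1j * rng.normal(size=(t, k))
    beta = np.array([1.0 - 2.0j, 0.5 + 1.0j])
    y = X @ beta + 0.1 * (rng.normal(size=t) + 1j * rng.normal(size=t))
    b, _ = complex_gls(X, y, np.eye(t))        # identity: simplest Hermitian PD Gamma
    print(b)                                   # close to beta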


However, deriving the solution from the double-real representation in Section 3 will deepen the understanding. The double-real form of the observation is:
$$\begin{bmatrix} y_r & -y_i \\ y_i & y_r \end{bmatrix} = \begin{bmatrix} X_r & -X_i \\ X_i & X_r \end{bmatrix}\begin{bmatrix} \beta_r & -\beta_i \\ \beta_i & \beta_r \end{bmatrix} + \begin{bmatrix} e_r & -e_i \\ e_i & e_r \end{bmatrix}$$
All the vectors and matrices in this form are real-valued. The subscripts 'r' and 'i' denote the real and imaginary parts of y, X, β, and e. Due to the redundancy of the double-real representation, we may retain just the left column:
$$\begin{bmatrix} y_r \\ y_i \end{bmatrix} = \begin{bmatrix} X_r & -X_i \\ X_i & X_r \end{bmatrix}\begin{bmatrix} \beta_r \\ \beta_i \end{bmatrix} + \begin{bmatrix} e_r \\ e_i \end{bmatrix}$$

Note that if X is real-valued, then $X_i = 0$, and $y_r$ and $y_i$ become two "data panels," each with its own parameter, $\beta_r$ and $\beta_i$ respectively.²⁰
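The equivalence of the left-column double-real form with the complex model can be seen concretely in a sketch of ours: solve the stacked real system by least squares and reassemble the complex parameter.

    import numpy as np

    rng = np.random.default_rng(1)
    t, k = 100, 2
    X = rng.normal(size=(t, k)) + 1j * rng.normal(size=(t, k))
    beta = np.array([2.0 + 1.0j, -1.0 + 0.5j])
    y = X @ beta                                     # noiseless, for a clean check

    D = np.block([[X.real, -X.imag],                 # the left-column design
                  [X.imag,  X.real]])
    rhs = np.concatenate([y.real, y.imag])
    b_ri, *_ = np.linalg.lstsq(D, rhs, rcond=None)
    print(b_ri[:k] + 1j * b_ri[k:])                  # recovers the complex beta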

Now let $\Xi_t = \begin{bmatrix} I_t & iI_t \\ I_t & -iI_t \end{bmatrix}$, the augmentation matrix of Section 7, where t is the number of observations. Since the augmentation matrix is non-singular, premultiplying the left-column form by it yields equivalent but insightful forms:

²⁰ This assumes zero covariance between the error vectors.


It iIt y r  It iIt Xr − Xi βr  It iIt er     =     +    It − iIt yi  It − iIt Xi Xr βi  It − iIt ei 

y r + iyi  Xr + iXi − Xi + iXr βr  er + iei    =    +   y r − iyi  Xr − iXi − Xi − iXr βi  er − iei 

y X iXβr  e   =    +   y X − iXβi  e

y Xβr + iXβi  e   =   +   y Xβr − iXβi  e y Xβ e   =   +   y Xβ e y X 0 β e   =    +   y  0 Xβ e

The first insight is that $\bar{y} = \bar{X}\bar\beta + \bar{e}$ is as much observed as $y = X\beta + e$. The second insight is that $\operatorname{Var}\begin{bmatrix} e \\ \bar{e} \end{bmatrix}$ is an augmented variance, whose general form according to Section 6 is $\operatorname{Var}\begin{bmatrix} e \\ \bar{e} \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar{C} & \bar\Gamma \end{bmatrix}$. Therefore, the general form of the observation of a complex linear model is
$$\begin{bmatrix} y \\ \bar{y} \end{bmatrix} = \begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}\begin{bmatrix} \beta \\ \bar\beta \end{bmatrix} + \begin{bmatrix} e \\ \bar{e} \end{bmatrix}, \qquad \operatorname{Var}\begin{bmatrix} e \\ \bar{e} \end{bmatrix} = \begin{bmatrix} \Gamma & C \\ \bar{C} & \bar\Gamma \end{bmatrix}.$$
Not only is $\bar{y}$ as observable as y, but also $\bar\beta$ is as estimable as β. Furthermore, although the augmented variance may default to $C = 0$, the complex linear statistical model does not require $\begin{bmatrix} e \\ \bar{e} \end{bmatrix}$ to be "proper complex," as defined in Section 7.

Since X is of full column rank, so too must be $\begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}$. And since Γ is Hermitian PD, both it and its conjugate $\bar\Gamma$ are invertible. But the general form of the observation requires $\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar\Gamma \end{bmatrix}$ to be Hermitian PD, hence invertible. A consequence is that both the "determinant" forms $\Gamma - C\bar\Gamma^{-1}\bar{C}$ and $\bar\Gamma - \bar{C}\Gamma^{-1}C$ are Hermitian PD and invertible. With this background it can be shown, and the reader should verify, that
$$\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar\Gamma \end{bmatrix}^{-1} = \begin{bmatrix} H & K \\ \bar{K} & \bar{H} \end{bmatrix}, \qquad H = (\Gamma - C\bar\Gamma^{-1}\bar{C})^{-1}, \quad K = -\Gamma^{-1}C\bar{H}.$$
The important point is that inversion preserves the augmented-variance form.
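That verification is also straightforward numerically. The sketch below (ours) builds a sample augmented variance from a deliberately non-proper complex vector and compares its inverse with the stated H and K blocks:

    import numpy as np

    rng = np.random.default_rng(2)
    n, N = 3, 200_000
    u = rng.normal(size=(n, N))
    v = rng.normal(size=(n, N))
    z = u + 1j * (0.5 * u + v)               # complex vector with non-zero C

    Gamma = (z @ z.conj().T) / N             # Gamma = E[z z*]
    C = (z @ z.T) / N                        # C = E[z z'] (the pseudo-covariance)

    M = np.block([[Gamma, C], [C.conj(), Gamma.conj()]])
    H = np.linalg.inv(Gamma - C @ np.linalg.inv(Gamma.conj()) @ C.conj())
    K = -np.linalg.inv(Gamma) @ C @ H.conj()
    print(np.allclose(np.linalg.inv(M),
                      np.block([[H, K], [K.conj(), H.conj()]])))   # True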

y X 0 β e e Γ C The solution of the complex linear model   =    +   , where Var  =   , is: y  0 Xβ e e C Γ

− * −1 1 * −1 βˆ   X 0  Γ C X 0  X 0  Γ C y   =   ˆ     Γ      Γ   β   0 X C   0 X  0 X C  y −1  X* 0 Η KX 0  X* 0 Η Ky =              0 X′K Η 0 X  0 X′K Ηy −1 X*ΗX X*KX X*Ηy + X*Ky =     X′KX X′ΗX X′Ky + X′Ηy 

And $\operatorname{Var}\begin{bmatrix} \hat\beta \\ \hat{\bar\beta} \end{bmatrix} = \begin{bmatrix} X^*HX & X^*K\bar{X} \\ X'\bar{K}X & X'\bar{H}\bar{X} \end{bmatrix}^{-1}$, which must exist since it is a quadratic form based on the Hermitian PD $\begin{bmatrix} \Gamma & C \\ \bar{C} & \bar\Gamma \end{bmatrix}^{-1}$ and the full-column-rank $\begin{bmatrix} X & 0 \\ 0 & \bar{X} \end{bmatrix}$. Reformulate this as:
$$\begin{bmatrix} X^*HX & X^*K\bar{X} \\ X'\bar{K}X & X'\bar{H}\bar{X} \end{bmatrix} \begin{bmatrix} \hat\beta \\ \hat{\bar\beta} \end{bmatrix} = \begin{bmatrix} X^*Hy + X^*K\bar{y} \\ X'\bar{K}y + X'\bar{H}\bar{y} \end{bmatrix}$$

The conjugates of the two equations in $\hat\beta$ and $\hat{\bar\beta}$ are the same equations in $\hat{\bar\beta}$ and $\hat\beta$:
$$\begin{bmatrix} X'\bar{H}\bar{X} & X'\bar{K}X \\ X^*K\bar{X} & X^*HX \end{bmatrix} \begin{bmatrix} \overline{\hat\beta} \\ \overline{\hat{\bar\beta}} \end{bmatrix} = \begin{bmatrix} X'\bar{H}\bar{y} + X'\bar{K}y \\ X^*K\bar{y} + X^*Hy \end{bmatrix}$$
After the two equations are swapped, this is the original system with $\overline{\hat{\bar\beta}}$ and $\overline{\hat\beta}$ standing in the places of $\hat\beta$ and $\hat{\bar\beta}$. Since the system has a unique solution:

Therefore, $\overline{\begin{bmatrix} \hat\beta \\ \hat{\bar\beta} \end{bmatrix}} = \begin{bmatrix} \hat{\bar\beta} \\ \hat\beta \end{bmatrix}$. It is well known that the estimator of a linear function of a random variable is the linear function of the estimator of the random variable. But conjugation is not a linear function. Nevertheless, we have just proven that the estimator of the conjugate is the conjugate of the estimator.
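That conjugacy can also be seen numerically. In the following sketch (ours), the error is constructed so that Γ and C are known in closed form; the second half of the augmented solution comes out as the conjugate of the first, and both halves estimate β and its conjugate:

    import numpy as np

    rng = np.random.default_rng(4)
    t, k = 400, 2
    X = rng.normal(size=(t, k)) + 1j * rng.normal(size=(t, k))
    beta = np.array([1.0 + 2.0j, -0.5 + 0.25j])
    u, v = rng.normal(size=t), rng.normal(size=t)
    y = X @ beta + (u + 1j * (0.5 * u + v))        # non-proper error e = u + i(u/2 + v)

    # for this construction, Gamma = E[e e*] = 2.25 I and C = E[e e'] = (-0.25 + i) I
    Gamma, C = 2.25 * np.eye(t), (-0.25 + 1.0j) * np.eye(t)
    Z = np.zeros((t, k))
    Xa = np.block([[X, Z], [Z, X.conj()]])          # the augmented design diag(X, X-bar)
    ya = np.concatenate([y, y.conj()])
    Mi = np.linalg.inv(np.block([[Gamma, C], [C.conj(), Gamma.conj()]]))

    XaH = Xa.conj().T
    b = np.linalg.solve(XaH @ Mi @ Xa, XaH @ Mi @ ya)
    print(np.allclose(b[k:], b[:k].conj()), b[:k])  # True; b[:k] is near beta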

13. ACTUARIAL APPLICATIONS OF COMPLEX RANDOM VARIABLES

How might casualty actuaries put complex random variables to work? Since the support of most

complex random variables is a plane, rather than a line, their obvious application is bivariate. An

example is a random variable whose real part is loss and whose imaginary part is LAE. Another

application might pertain to copulas. According to Venter [2002, p. 69], “copulas are joint

distributions of unit random variables.” One could translate these joint distributions into

distributions of complex variables whose support is the complex unit square, i.e., the square whose

vertices are the points z = 0, 1, 1+ i, i . However, for now it seems that real-valued bivariates provide the necessary theory and technique for these purposes.

Actuaries who have applied log-linear models to triangles with paid increments have been frustrated

applying them to incurred triangles. The problem is that incurred increments are often negative, and

the logarithm of a negative number is not real-valued. This has led Glenn Meyers [2013] to seek

modified lognormal distributions whose support includes the negative real numbers. The persistent

intractability of the log-linear problem was a major reason for our attention to the lognormal

random vector $w_{n\times 1} = e^{z_{n\times 1}}$ in Section 8. But to model an incurred loss as the exponential function of a complex number suffers from two drawbacks. First, to model a real-valued loss as $e^z = e^x \cdot e^{iy}$ requires y to be an integral multiple of π. The mixed random variable that equals $e^X$ with probability p and


$-e^X$ with probability $1-p$ is not lognormal. No more suitable are such "denatured" random variables as $\operatorname{Re}(e^Z)$. Second, one still cannot model the eminently practical value of zero, because for all z, $e^z \neq 0$.²¹ At present it does not appear that complex random variables will give birth to

useful distributions of real-valued random variables. Even the unit-circle and indicator random

variables of Sections 10 and 11, as interesting as they are in the theory of analytic functions, most likely will engender no distributions valuable to actuarial work.

The complex version of the linear model in Section 12 showed us that conjugates of observations are themselves observations and that conjugates of estimators are estimators of conjugates.

Moreover, there we found a use for augmented variance. Nonetheless we are still fairly bound to our conclusion to Section 3, that one who lacked either the confidence or the software to work with complex numbers could probably do a work-around with double-real matrices.

So how can actuarial science benefit from complex random variables? The great benefit will come from new ways of thinking. The first step will be to overcome the habit of picturing a complex number as half real and half imaginary. Historically, it was only after numbers had expanded from rational to irrational that the whole set was called “real.” Numbers ultimately are sets; zero is just the empty set. How real are sets? Regardless of their mathematical reality, they are not physically real. Complex numbers were deemed “real” because mathematicians needed them for the solution of polynomial equations. In the nineteenth century this spurred the development of abstract algebra. At first new ways of thinking amount to differences in degree; at some point many develop

²¹ If $e^a = 0$ for some a, then for all z, $e^z = e^{z-a+a} = e^{z-a}e^a = e^{z-a} \cdot 0 = 0$. One who sees that $\lim_{x\to-\infty} e^x \cdot e^{iy} = 0 \cdot e^{iy} = 0$ might propose to add the ordinate $\operatorname{Re}(z) = x = -\infty$ to the complex plane. But not only is this proposal artificial; it also militates against the standard theory of complex variables, according to which all points infinitely far from zero constitute one and the same point at infinity.

into differences in kind. One might argue, "Why study Euclidean geometry? It all derives from a few axioms." True, but great theorems (e.g., that the sum of the angles of a triangle is the sum of two right angles) can be a long way from their axioms. A theorem means more than the course of its proof; often there are many proofs of a theorem. Furthermore, mathematicians often work backwards from accepted or desired truths to efficient and elegant sets of axioms. Perhaps the most wonderful thing about mathematics is its "unreasonable effectiveness in the natural sciences," to quote physicist Eugene Wigner. The causality between pure and applied mathematics works in both directions. Therefore, it is likely that complex random variables and vectors will find their way into actuarial science. But it will take years, even decades, and technology and education will have to prepare for it.

14. CONCLUSION

Just as physics divides into different areas, e.g., theoretical, experimental, and applied, so too actuarial science, though perhaps more concentrated on business application, justifiably has and needs a theoretical component. Theory and application cross-fertilize each other. In this paper we have proposed to add complex numbers to the probability and statistics of actuarial theory. With patience, the technically inclined actuary should be able to understand the theory of complex random variables delineated herein. In fact, our multivariate approach may be even more difficult to understand than the complex-function theory; but both belong together. Although all complex matrices and operations were formed from double-real counterparts, we believe that the "sum is greater than the parts," i.e., that the assimilation of this theory will lead to higher-order thinking and creativity. In the sixteenth century the "fiction" of $i = \sqrt{-1}$ allowed mathematicians to solve more equations. Although at first complex solutions were deemed "extraneous roots," eventually their practicality became recognized; so that today complex numbers are essential for science and

engineering. Applying complex numbers to probability has lagged; but even now it is part of signal processing in electrical engineering. Knowing how rapidly science has developed with nuclear physics, molecular biology, space exploration, and computers, who would dare to bet against the usefulness of complex random variables to actuarial science by the mid-2030s, when many scientists and futurists expect nuclear fusion to be harnessed and available for commercial purposes?


REFERENCES

[1.] Halliwell, Leigh J., Conjoint Prediction of Paid and Incurred Losses, CAS Forum, Summer 1997, 241-380, www.casact.org/pubs/forum/97sforum/97sf1241.pdf.

[2.] Halliwell, Leigh J., “Classifying the Tails of Loss Distributions,” CAS E-Forum, Spring 2013, Volume 2, www.casact.org/pubs/forum/13spforumv2/Haliwell.pdf.

[3.] Havil, Julian, Gamma: Exploring Euler’s Constant, Princeton University Press, 2003.

[4.] Healy, M. J. R., Matrices for Statistics, Oxford, Clarendon Press, 1986.

[5.] Hogg, Robert V., and Stuart A. Klugman, Loss Distributions, New York, Wiley, 1984.

[6.] Johnson, Richard A., and Dean Wichern, Applied Multivariate Statistical Analysis (Third Edition), Englewood Cliffs, NJ, Prentice Hall, 1992.

[7.] Judge, George G., Hill, R. C., et al., Introduction to the Theory and Practice of Econometrics (Second Edition), New York, Wiley, 1988.

[8.] Klugman, Stuart A., et al., Loss Models: From Data to Decisions, New York, Wiley, 1998.

[9.] Meyers, Glenn, “The and Beyond,” Actuarial Review, May 2013, p. 15, www.casact.org/newsletter/pdfUpload/ar/AR_May2013_1.pdf.

[10.] Million, Elizabeth, The Hadamard Product, 2007, http://buzzard.ups.edu/courses/2007spring/projects/million-paper.pdf.

[11.] Schneider, Hans, and George Barker, Matrices and Linear Algebra, New York, Dover, 1973.

[12.] Veeravalli, V. V., Proper Complex Random Variables and Vectors, 2006, http://courses.engr.illinois.edu/ece461/handouts/notes6.pdf.

[13.] Venter, G.G., "Credibility," Foundations of Casualty Actuarial Science (Third Edition), Casualty Actuarial Society, 1996.

[14.] Venter, G.G., "Tails of Copulas," PCAS LXXXIX, 2002, 68-113, www.casact.org/pubs/proceed/proceed02/02068.pdf.

[15.] Weyl, Hermann, The Theory of Groups and Quantum Mechanics, New York, Dover, 1950 (ET by H. P. Robertson from second German Edition 1930).

[16.] Wikipedia contributors, “Arcsine distribution,” Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/wiki/Arcsine_distribution (accessed October 2014).

[17.] Wikipedia contributors, “Complex normal distribution,” Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/wiki/Complex_normal_distribution (accessed October 2014).

[18.] Wikipedia contributors, “Complex number,” Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/wiki/Complex_number (accessed October 2014).


APPENDIX A

REAL-VALUED LOGNORMAL RANDOM VECTORS

Feeling that the treatment of lognormal random vectors in Section 8 would be too long, we have decided to prepare for it in Appendices A and B. According to Section 7, the probability density function of real-valued n×1 normal random vector x with mean μ and variance Σ is:
$$f_x(x) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)}$$

Therefore, $\int_{x\in\Re^n} f_x(x)\,dV = 1$. The single integral over $\Re^n$ represents an n-multiple integral over each $x_j$ from $-\infty$ to $+\infty$; $dV = dx_1 \cdots dx_n$.

The moment generating function of x is $M_x(t) = E[e^{t'x}] = E\left[e^{\sum_{j=1}^n t_j x_j}\right]$, where t is a suitable n×1 vector.²² Partial derivatives of the moment generating function evaluated at $t = 0_{n\times 1}$ equal moments of x, since:
$$\left.\frac{\partial^{k_1+\cdots+k_n} M_x(t)}{\partial t_1^{k_1} \cdots \partial t_n^{k_n}}\right|_{t=0} = \left. E\left[x_1^{k_1} \cdots x_n^{k_n} e^{t'x}\right]\right|_{t=0} = E[x_1^{k_1} \cdots x_n^{k_n}]$$

But lognormal moments are values of the function itself. For example, if $t = e_j$, the jth unit vector, then $M_x(e_j) = E[e^{e_j'x}] = E[e^{X_j}]$. Likewise, $M_x(e_j + e_k) = E[e^{X_j}e^{X_k}]$. The moment generating function of x, if it exists, is the key to the moments of $e^x$.

²² All real-valued t vectors are suitable; Appendix B will extend the suitability to complex t.


The moment generating function of the real-valued multivariate normal x is:
$$M_x(t) = E[e^{t'x}] = \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)} e^{t'x}\,dV = \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}\{(x-\mu)'\Sigma^{-1}(x-\mu) - 2t'x\}}\,dV$$

A multivariate "completion of the square" results in the identity:
$$(x-\mu)'\Sigma^{-1}(x-\mu) - 2t'x = (x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t]) - 2t'\mu - t'\Sigma t$$
We leave it for the reader to verify. By substitution, we have:

$$\begin{aligned}
M_x(t) &= \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}\{(x-\mu)'\Sigma^{-1}(x-\mu) - 2t'x\}}\,dV \\
&= \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}\{(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t]) - 2t'\mu - t'\Sigma t\}}\,dV \\
&= \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t])}\,dV \cdot e^{t'\mu + t'\Sigma t/2} \\
&= 1 \cdot e^{t'\mu + t'\Sigma t/2} = e^{t'\mu + t'\Sigma t/2}
\end{aligned}$$

The reduction of the integral to unity in the second-last line is due to the fact that $\frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t])}$ is the probability density function of the real-valued n×1 normal random vector with mean $\mu + \Sigma t$ and variance Σ. This new mean is valid if it is real-valued, which will be so if t is real-valued. In fact, $\mu + \Sigma t$ is real-valued if and only if t is real-valued.


So the moment generating function of the real-valued normal multivariate $x \sim N(\mu, \Sigma)$ is $M_x(t) = e^{t'\mu + t'\Sigma t/2}$, which is valid at least for $t \in \Re^n$.²³ As a check:
$$\frac{\partial M_x(t)}{\partial t} = (\mu + \Sigma t)e^{t'\mu + t'\Sigma t/2} \;\Rightarrow\; E[x] = \left.\frac{\partial M_x(t)}{\partial t}\right|_{t=0} = \mu$$

And for the second derivative:

$$\frac{\partial^2 M_x(t)}{\partial t\,\partial t'} = \frac{\partial (\mu + \Sigma t)e^{t'\mu + t'\Sigma t/2}}{\partial t'} = \left[\Sigma + (\mu + \Sigma t)(\mu + \Sigma t)'\right]e^{t'\mu + t'\Sigma t/2}$$
$$\Rightarrow\; E[xx'] = \left.\frac{\partial^2 M_x(t)}{\partial t\,\partial t'}\right|_{t=0} = \Sigma + \mu\mu' \;\Rightarrow\; \operatorname{Var}[x] = E[xx'] - \mu\mu' = \Sigma$$
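The closed form invites a quick Monte Carlo confirmation. This sketch (ours, with arbitrary μ, Σ, and t) compares the empirical E[e^{t'x}] with e^{t'μ + t'Σt/2}:

    import numpy as np

    rng = np.random.default_rng(5)
    mu = np.array([0.1, -0.2])
    Sigma = np.array([[1.0, 0.3],
                      [0.3, 0.5]])
    x = rng.multivariate_normal(mu, Sigma, size=1_000_000)

    t = np.array([0.4, -0.7])
    print(np.mean(np.exp(x @ t)))                 # empirical E[exp(t'x)]
    print(np.exp(t @ mu + 0.5 * t @ Sigma @ t))   # closed form: they agree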

The lognormal moments follow from the moment generating function:
$$E[e^{X_j}] = E[e^{e_j'x}] = M_x(e_j) = e^{e_j'\mu + e_j'\Sigma e_j/2} = e^{\mu_j + \Sigma_{jj}/2}$$

The second moments are conveniently expressed in terms of first moments:
$$\begin{aligned}
E[e^{X_j}e^{X_k}] &= E\left[e^{(e_j+e_k)'x}\right] = e^{(e_j+e_k)'\mu + (e_j+e_k)'\Sigma(e_j+e_k)/2} \\
&= e^{\mu_j + \mu_k + (\Sigma_{jj}+\Sigma_{jk}+\Sigma_{kj}+\Sigma_{kk})/2} \\
&= e^{\mu_j + \Sigma_{jj}/2} \cdot e^{\mu_k + \Sigma_{kk}/2} \cdot e^{(\Sigma_{jk}+\Sigma_{kj})/2} \\
&= e^{\mu_j + \Sigma_{jj}/2} \cdot e^{\mu_k + \Sigma_{kk}/2} \cdot e^{\Sigma_{jk}} \\
&= E[e^{X_j}]\,E[e^{X_k}] \cdot e^{\Sigma_{jk}}
\end{aligned}$$

²³ The vector formulation of partial differentiation is explained in Appendix A.17 of Judge [1988].


So, $\operatorname{Cov}[e^{X_j}, e^{X_k}] = E[e^{X_j}e^{X_k}] - E[e^{X_j}]E[e^{X_k}] = E[e^{X_j}]E[e^{X_k}](e^{\Sigma_{jk}} - 1)$, which is the multivariate equivalent of the well-known scalar formula $CV^2[e^X] = \operatorname{Var}[e^X]\big/E[e^X]^2 = e^{\sigma^2} - 1$. Letting $E[e^x]$ denote the n×1 vector whose jth element is $E[e^{X_j}]$,²⁴ and $\operatorname{diag}(E[e^x])$ its n×n diagonalization, we have $\operatorname{Var}[e^x] = \operatorname{diag}(E[e^x])\{e^{\Sigma} - 1_{n\times n}\}\operatorname{diag}(E[e^x])$. Because $\operatorname{diag}(E[e^x])$ is diagonal in positive elements (hence, symmetric and PD), $\operatorname{Var}[e^x]$ is NND [or PD] if and only if $e^{\Sigma} - 1_{n\times n}$ is NND [or PD].²⁵ Symmetry is no issue here, because for real-valued matrices, Σ is symmetric if and only if $e^{\Sigma} - 1_{n\times n}$ is symmetric.
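These formulas translate directly into code. The following sketch (ours, with arbitrary μ and Σ) computes the lognormal mean vector and variance matrix and confirms the latter's positive-definiteness:

    import numpy as np

    mu = np.array([0.0, 0.5, -0.3])
    Sigma = np.array([[0.40, 0.10, 0.05],
                      [0.10, 0.25, 0.08],
                      [0.05, 0.08, 0.30]])

    m = np.exp(mu + 0.5 * np.diag(Sigma))       # E[e^(X_j)] = exp(mu_j + Sigma_jj / 2)
    T = np.exp(Sigma) - 1.0                     # elementwise e^Sigma minus 1_{n x n}
    V = np.diag(m) @ T @ np.diag(m)             # Var[e^x] = diag(m) {e^Sigma - 1} diag(m)
    print(np.linalg.eigvalsh(V))                # all positive here: V is PD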

The relation between Σ and $T = e^{\Sigma} - 1_{n\times n}$ merits a discussion whose result will be clear from a consideration of 2×2 matrices. Since $\Sigma_{2\times 2}$ is symmetric, it is defined in terms of three real numbers:

a b 2 Σ =   . Now Σ is NND if and only if 1) a ≥ 0, 2) d ≥ 0, and 3) ad − b ≥ 0. Σ is PD if and b d

only if these three conditions are strictly greater than zero. If a or d is zero, by the third condition b

also must be zero.26 Since we are not interested in degenerate random variables, which are

effectively constants, we will require a and d to be positive. With this requirement, Σ is NND if and

24 This would follow naturally from the “elementwise” interpretation of e A , i.e., that the exponential function of matrix A is the matrix of the exponential functions of the elements of A. But if A is a square matrix, e A may have the ∞ j “matrix” interpretation In + ∑ A j!. j=1 25 PD [positive-definite] and NND [non-negative-definite] are defined in Section 4.

 X 1  Cov[X 1 , X 1 ] Cov[X 1 , X 2 ] a b 26 Since NND matrices represent variances, Σ = Var  =   =   . The X 2  Cov[X 2 , X 1 ] Cov[X 2 , X 2 ] b d fact that a or d equals 0 implies that b equals 0 means that a random variable can’t covary with another random variable unless it covaries with itself.

Casualty Actuarial Society E-Forum, Fall 2015 50 Complex Random Variables

only if b 2 ≤ ad , and PD if and only if b 2 < ad . Since a and d are positive, so too is ad, as well as

the geometric mean γ = ad . So Σ is NND if and only if − γ ≤ b ≤ γ , and PD if and only if

a + d − γ < b < γ . It is well-known that min(a, d ) ≤ γ ≤ ≤ max(a, d ) with equality if and only if 2

a = d .

Now the same three conditions determine the definiteness of $T = e^{\Sigma} - 1_{2\times 2} = \begin{bmatrix} e^a - 1 & e^b - 1 \\ e^b - 1 & e^d - 1 \end{bmatrix}$.

Since we required a and d to be positive, both $e^a - 1$ and $e^d - 1$ are positive. This leaves the definiteness of T dependent on the relation between $(e^b - 1)(e^b - 1)$ and $(e^a - 1)(e^d - 1)$. We will next examine this relation according to the three cases $b = 0$, $b > 0$, and $b < 0$, all of which must be subject to $-\gamma \leq b \leq \gamma$.

First, if $b = 0$, then $-\gamma < b < \gamma$ and Σ is PD. Furthermore, $(e^b - 1)(e^b - 1) = 0 < (e^a - 1)(e^d - 1)$. Therefore, in this case, the lognormal transformation $\Sigma \to T = e^{\Sigma} - 1_{2\times 2}$ is from PD to PD. And zero covariance in the normal pair produces zero covariance in the lognormal pair. In fact, since zero covariance between normal bivariates implies independence (cf. §2.5.7 of Judge [1988]), the lognormal bivariates also are independent.

In the second case, $b > 0$, or more fully, $0 < b \leq \gamma$; Σ is PD if and only if $b < \gamma$. Define the function $\phi(x) = \ln((e^x - 1)/x)$ for positive real x (or $x \in \Re^+$). A graph will show that the function strictly increases, i.e., $\phi(x_1) < \phi(x_2)$ if and only if $x_1 < x_2$. Moreover, the function is concave upward. This means that the line segment between points $(x_1, \phi(x_1))$ and $(x_2, \phi(x_2))$ lies above


the curve φ(x) for intermediate values of x. In particular, $\frac{\phi(x_1) + \phi(x_2)}{2} \geq \phi\left(\frac{x_1 + x_2}{2}\right)$. Equivalently, for all $x_1, x_2 \in \Re^+$, $2\phi\left(\frac{x_1 + x_2}{2}\right) \leq \phi(x_1) + \phi(x_2)$, with equality if and only if $x_1 = x_2$.

Therefore, since a and d are positive, $2\phi\left(\frac{a+d}{2}\right) \leq \phi(a) + \phi(d)$. And since $0 < \gamma = \sqrt{ad} \leq \frac{a+d}{2}$, $2\phi(\gamma) \leq 2\phi\left(\frac{a+d}{2}\right) \leq \phi(a) + \phi(d)$. So $2\phi(\gamma) \leq \phi(a) + \phi(d)$ with equality if and only if $a = d$. Furthermore, since in this case $0 < b \leq \gamma$, $2\phi(b) \leq 2\phi(\gamma) \leq \phi(a) + \phi(d)$. Hence, $2\phi(b) \leq \phi(a) + \phi(d)$. The equality prevails if and only if $b = \gamma = a = d$, or if and only if $a = b = d$. If $a = b = d$ then Σ is not PD; otherwise Σ is PD. Hence:
$$2\ln\left(\frac{e^b - 1}{b}\right) = 2\phi(b) \leq \phi(a) + \phi(d) = \ln\left(\frac{e^a - 1}{a}\right) + \ln\left(\frac{e^d - 1}{d}\right)$$

The inequality is preserved by exponentiation:
$$\left(\frac{e^b - 1}{b}\right)^2 \leq \left(\frac{e^a - 1}{a}\right)\left(\frac{e^d - 1}{d}\right)$$

This leads at last to the inequality:
$$(e^b - 1)^2 = b^2\left(\frac{e^b - 1}{b}\right)^2 \leq \frac{b^2}{ad}(e^a - 1)(e^d - 1) = \left(\frac{b}{\gamma}\right)^2 (e^a - 1)(e^d - 1) \leq 1 \cdot (e^a - 1)(e^d - 1)$$

Therefore, in this case $(e^b - 1)^2 \leq (e^a - 1)(e^d - 1)$, with equality if and only if $a = b = d$. This means that if $b > 0$, the lognormal transformation $\Sigma \to T = e^{\Sigma} - 1_{2\times 2}$ is from NND to NND. But T is merely NND only if $(e^b - 1)^2 = (e^a - 1)(e^d - 1)$, or only if $a = b = d$. Otherwise, T is PD. So, when $b > 0$, $T = e^{\Sigma} - 1_{2\times 2}$ is PD except when all four elements of Σ have the same positive value. Even

the NND matrix $\Sigma = \begin{bmatrix} a & \sqrt{ad} \\ \sqrt{ad} & d \end{bmatrix}$ log-transforms into a PD matrix, unless $a = d$. So all PD and most NND normal variances transform into PD lognormal variances. A NND lognormal variance indicates that at least one element of the normal random vector is duplicated.

In the third and final case, $b < 0$, or more fully, $-\gamma \leq b < 0$. This is equivalent to $0 < -b \leq \gamma$, or to the second case with $-b$. In that case, $(e^{-b} - 1)^2 \leq (e^a - 1)(e^d - 1)$, with equality if and only if $a = -b = d$. But from this, as well as from the fact that $0 < e^{2b} < e^{2\cdot 0} = 1$, it follows:
$$(e^b - 1)^2 = e^{2b}(1 - e^{-b})^2 = e^{2b}(e^{-b} - 1)^2 \leq e^{2b}(e^a - 1)(e^d - 1) < 1 \cdot (e^a - 1)(e^d - 1)$$

So in this case the inequality is strict: $(e^b - 1)^2 < (e^a - 1)(e^d - 1)$, and T is PD. Therefore, if $b < 0$, the lognormal transform $\Sigma \to T = e^{\Sigma} - 1_{2\times 2}$ is PD, even if Σ is NND.

To summarize, the lognormal transformation $\Sigma \to T = e^{\Sigma} - 1_{2\times 2}$ is PD if Σ is PD. Even when Σ is not PD, but merely NND, T is almost always PD. Only when Σ is so NND as to conceal a random-variable duplication is its lognormal transformation NND.
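A quick numerical check (ours) of this summary: a merely-NND Σ with all four elements equal maps to a merely-NND T, while an NND Σ with b = γ but a ≠ d maps to a PD T.

    import numpy as np

    S_equal = np.array([[0.5, 0.5],
                        [0.5, 0.5]])                 # NND, all four elements equal
    S_nnd = np.array([[1.0, np.sqrt(2.0)],
                      [np.sqrt(2.0), 2.0]])          # NND: b = gamma, but a != d

    print(np.linalg.eigvalsh(np.exp(S_equal) - 1.0)) # one (numerically) zero eigenvalue
    print(np.linalg.eigvalsh(np.exp(S_nnd) - 1.0))   # both eigenvalues positive: PD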

The Hadamard (elementwise) product and Schur's product theorem allow for an understanding of the general lognormal transformation $\Sigma \to T = e^{\Sigma} - 1_{n\times n}$. Denoting the elementwise nth power of Σ as $\Sigma^{\circ n} = \Sigma \circ \cdots \circ \Sigma$ (n factors), we can express elementwise exponentiation as $e^{\Sigma} = \sum_{j=0}^{\infty} \Sigma^{\circ j}/j!$. So $T = e^{\Sigma} - 1_{n\times n} = \sum_{j=1}^{\infty} \Sigma^{\circ j}/j!$. According to Schur's theorem (§3 of Million [2007]), the Hadamard

product of two NND matrices is NND.²⁷ Since Σ is NND, its powers $\Sigma^{\circ j}$ are NND, as well as the terms $\Sigma^{\circ j}/j!$. Being the sum of a countable number of NND matrices, T also must be NND.²⁸

But if just one of the terms of the sum is PD, the sum itself must be PD. Therefore, if Σ is PD, then Τ also is PD.
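The Hadamard series is easy to exhibit numerically. This sketch (ours, with an arbitrary Σ) accumulates the truncated series and watches it converge to e^Σ − 1_{n×n}:

    import numpy as np
    from math import factorial

    Sigma = np.array([[0.40, 0.10],
                      [0.10, 0.25]])

    T_series = np.zeros_like(Sigma)
    term = np.ones_like(Sigma)                 # Sigma^(o0): the matrix of ones
    for j in range(1, 20):
        term = term * Sigma                    # Hadamard power Sigma^(oj)
        T_series += term / factorial(j)

    print(np.allclose(T_series, np.exp(Sigma) - 1.0))   # True: the series converges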

Now the kernel of an m×n matrix A is the set of all $x \in \Re^n$ such that $Ax = 0$, or $\ker(A) = \{x : Ax = 0\}$. The kernel is a linear subspace of $\Re^n$ and its dimensionality is $n - \operatorname{rank}(A)$.

By the Cholesky decomposition the NND matrix U can be factored as $U_{n\times n} = W'W$. The quadratic form in U is $x'Ux = x'W'Wx = (Wx)'(Wx)$. If $x'Ux = 0$, then $Wx = 0_{n\times 1}$, and $Ux = W'Wx = W'0_{n\times 1} = 0_{n\times 1}$. Conversely, if $Ux = 0_{n\times 1}$, then $x'Ux = 0$. So the kernel of NND matrix U is precisely the solution set of $x'Ux = 0$, i.e., $x'Ux = 0$ if and only if $x \in \ker(U)$.

Therefore, the kernel of $T = \sum_{j=1}^{\infty} \Sigma^{\circ j}/j!$ is the intersection of the kernels of the $\Sigma^{\circ j}$, or $\ker(T) = \bigcap_{j=1}^{\infty} \ker(\Sigma^{\circ j})$. It is possible for this intersection to be of dimension zero, i.e., for it to equal $\{0_{n\times 1}\}$, even though the kernel of no $\Sigma^{\circ j}$ is. Because of the accumulation of intersections in $\sum_{j=1}^{k\to\infty} \Sigma^{\circ j}/j!$, the lognormal transformation of NND matrix Σ tends to be "more PD" than Σ itself.

²⁷ Appendix C provides a proof of this theorem.
²⁸ For the quadratic form of a sum equals the sum of the quadratic forms: $x'\left(\sum_k U_k\right)x = \sum_k x'U_k x$. In fact, if the coefficients $a_j$ are non-negative, then $\sum_{j=1}^{\infty} a_j \Sigma^{\circ j}$ is NND, provided that the sum converges, as $e^{\Sigma}$ does.


We showed above that if Σ is PD, then $T = e^{\Sigma} - 1_{n\times n}$ is PD. But even if Σ is NND, T will be PD, unless Σ contains a 2×2 subvariance $\begin{bmatrix} \Sigma_{jj} & \Sigma_{jk} \\ \Sigma_{kj} & \Sigma_{kk} \end{bmatrix}$ whose four elements are all equal.


APPENDIX B

THE NORMAL MOMENT GENERATING FUNCTION AND COMPLEX ARGUMENTS

In Appendix A we saw that the moment generating function of the real-valued normal multivariate $x \sim N(\mu, \Sigma)$, viz., $M_x(t) = e^{t'\mu + t'\Sigma t/2}$, is valid at least for $t \in \Re^n$. The validity rests on the identity
$$\int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}(x-[\mu+\Sigma t])'\Sigma^{-1}(x-[\mu+\Sigma t])}\,dV = 1$$
for real-valued $\xi = \mu + \Sigma t$. But in Section 8 we must know the value of
$$\varphi(\xi) = \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}(x-\xi)'\Sigma^{-1}(x-\xi)}\,dV$$
when ξ is a complex n×1 vector. So in this appendix, we will prove that for all $\xi \in C^n$, $\varphi(\xi) = 1$.

The proof begins with diagonalization. Since Σ is symmetric and PD, $\Sigma^{-1}$ exists and is symmetric and PD. According to the Cholesky decomposition (Healy [1986, §7.2]), there exists a real-valued n×n matrix W such that $W'W = \Sigma^{-1}$. Due to theorems on matrix rank, W must be non-singular, or invertible. So the transformation $y = Wx$ is one-to-one. And letting $\zeta = W\xi$, we have $y - \zeta = W(x - \xi)$. Moreover, the volume element in the y coordinates is:
$$dV_y = |W|\,dV_x = \sqrt{|W|^2}\,dV_x = \sqrt{|W'||W|}\,dV_x = \sqrt{|W'W|}\,dV_x = \sqrt{|\Sigma^{-1}|}\,dV_x = \frac{dV_x}{\sqrt{|\Sigma|}}$$

Hence:


$$\begin{aligned}
\varphi(\xi) &= \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}(x-\xi)'\Sigma^{-1}(x-\xi)}\,dV = \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}(x-\xi)'W'W(x-\xi)}\,dV_x \\
&= \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n |\Sigma|}}\, e^{-\frac{1}{2}[W(x-\xi)]'[W(x-\xi)]}\,dV_x = \int_{y\in\Re^n} \frac{1}{\sqrt{(2\pi)^n}}\, e^{-\frac{1}{2}(y-\zeta)'(y-\zeta)}\,dV_y \\
&= \int_{y\in\Re^n} \frac{1}{\sqrt{(2\pi)^n}}\, e^{-\frac{1}{2}\sum_{j=1}^n (y_j - \zeta_j)^2}\,dV = \int_{y\in\Re^n} \prod_{j=1}^n \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_j - \zeta_j)^2}\,dV \\
&= \prod_{j=1}^n \int_{y_j=-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(y_j - \zeta_j)^2}\,dy_j = \prod_{j=1}^n \psi(\zeta_j)
\end{aligned}$$

In the last line $\psi(\zeta) = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{x=a}^{b} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x-\zeta)^2}\,dx$. Obviously, if ζ is real-valued, $\psi(\zeta) = 1$. So the issue of the value of a moment generating function of a complex variable resolves into the issue of the "total probability" of a unit-variance normal random variable with a complex mean.²⁹

To evaluate ψ(ζ) requires some complex analysis with contour integrals. First, consider the standard-normal density function with a complex argument: $f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$. By function-composition rules, since both $z^2$ and the exponential function are "entire" functions (i.e., analytic

composition rules, since both z 2 and the exponential function are “entire” functions (i.e., analytic

²⁹ We deliberately put 'total probability' in quotes because the probability density function with complex ζ is not proper; it may produce negative and even complex values for probability densities.


over the whole complex plane), so too is f(z). Therefore, $\oint_C f(z)\,dz = 0$ for any closed contour C (cf. Appendix D.7 of Havil [2003]). Let C be the parallelogram traced from vertex $z = b$ to vertex $z = a$ to vertex $z = a - \zeta$ to vertex $z = b - \zeta$ and finally back to vertex $z = b$. Therefore:
$$0 = \oint_C f(z)\,dz = \int_b^a f(z)\,dz + \int_a^{a-\zeta} f(z)\,dz + \int_{a-\zeta}^{b-\zeta} f(z)\,dz + \int_{b-\zeta}^{b} f(z)\,dz$$

The line segments along which the second and fourth integrals proceed are finite; their common length is $L = |(a - \zeta) - a| = |-\zeta| = |\zeta| = |b - (b - \zeta)|$, where $|\zeta| = \sqrt{\zeta\bar\zeta} \geq 0$. By the triangle inequality, $\left|\int_a^{a-\zeta} f(z)\,dz\right| \leq \int_a^{a-\zeta} |f(z)\,dz| = \int_a^{a-\zeta} |f(z)|\,|dz|$. But $|f(z)|$ is a continuous real-valued function, so over a closed interval it must be upper-bounded by some positive real number M. Hence,
$$\left|\int_a^{a-\zeta} f(z)\,dz\right| \leq \int_a^{a-\zeta} |f(z)|\,|dz| \leq \operatorname{Sup}\left(|f(z \in [a, a-\zeta])|\right) \int_a^{a-\zeta} |dz| = M(a) \cdot L.$$
Likewise, $\left|\int_{b-\zeta}^{b} f(z)\,dz\right| \leq \operatorname{Sup}\left(|f(z \in [b, b-\zeta])|\right) \int_{b-\zeta}^{b} |dz| = M(b) \cdot L$.

Now, in general:
$$|f(z)| = \sqrt{f(z)\,\overline{f(z)}} = \sqrt{\frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-\bar{z}^2/2}} = \frac{1}{\sqrt{2\pi}}\, e^{-(z^2+\bar{z}^2)/4} = \frac{1}{\sqrt{2\pi}}\, e^{-\operatorname{Re}(z^2)/2} \propto e^{-\operatorname{Re}(z^2)/2}$$
since $z^2 + \bar{z}^2 = 2\operatorname{Re}(z^2)$.

Therefore, since $a \in \Re$:
$$\lim_{a\to-\infty} M(a) \cdot L \propto \lim_{a\to-\infty} \operatorname{Sup}\left(e^{-\operatorname{Re}(z^2)/2};\; z \in [a, a-\zeta]\right) \cdot L = \lim_{|a|\to+\infty} \operatorname{Sup}\left(e^{-\frac{a^2}{2}\operatorname{Re}\left((z/a)^2\right)};\; \frac{z}{a} \in \left[1,\, 1 - \frac{\zeta}{a}\right]\right) \cdot L \propto e^{-\infty \cdot \operatorname{Re}(1)} \cdot L = 0$$

Similarly, $\lim_{b\to+\infty} M(b) \cdot L \propto 0$. So in the limit as $a \to -\infty$ and $b \to +\infty$ on the real axis, the second and fourth integrals approach zero. Accordingly:
$$0 = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \left\{\int_b^a f(z)\,dz + \int_a^{a-\zeta} f(z)\,dz + \int_{a-\zeta}^{b-\zeta} f(z)\,dz + \int_{b-\zeta}^{b} f(z)\,dz\right\} = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \left\{\int_b^a f(z)\,dz + \int_{a-\zeta}^{b-\zeta} f(z)\,dz\right\}$$

And so, $\lim\limits_{\substack{a\to-\infty \\ b\to+\infty}} \int_{a-\zeta}^{b-\zeta} f(z)\,dz = -\lim\limits_{\substack{a\to-\infty \\ b\to+\infty}} \int_b^a f(z)\,dz = \lim\limits_{\substack{a\to-\infty \\ b\to+\infty}} \int_a^b f(z)\,dz = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz = 1$


So, at length:
$$\psi(\zeta) = \lim_{\substack{a\to-\infty \\ b\to+\infty}} \int_{x=a}^{b} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x-\zeta)^2}\,dx = \lim \int_{x=a}^{b} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}((z+\zeta)-\zeta)^2}\,d(z+\zeta) = \lim \int_{z=a-\zeta}^{b-\zeta} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\,dz = \lim \int_{z=a-\zeta}^{b-\zeta} f(z)\,dz = 1$$

So, working backwards, what we proved for one dimension, viz., $\psi(\zeta) = \int_{x=-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x-\zeta)^2}\,dx = 1$, applies n-dimensionally: for all $\xi \in C^n$, $\varphi(\xi) = \int_{x\in\Re^n} \frac{1}{\sqrt{(2\pi)^n|\Sigma|}}\, e^{-\frac{1}{2}(x-\xi)'\Sigma^{-1}(x-\xi)}\,dV = 1$. Therefore, even for complex t, $M_x(t) = e^{t'\mu + t'\Sigma t/2}$. Complex values are allowable as arguments in the moment generating function of a real-valued normal vector. This result is critical to Section 8.
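The one-dimensional result ψ(ζ) = 1 can also be checked by brute-force quadrature. The following sketch (ours; the interval and grid size are arbitrary choices) integrates the complex-shifted density over a wide real interval with a manual trapezoid rule:

    import numpy as np

    def psi(zeta, a=-40.0, b=40.0, n=400_001):
        # integrate (1/sqrt(2*pi)) * exp(-(x - zeta)**2 / 2) over [a, b]
        x = np.linspace(a, b, n)
        fx = np.exp(-0.5 * (x - zeta) ** 2) / np.sqrt(2.0 * np.pi)
        dx = x[1] - x[0]
        return (fx.sum() - 0.5 * (fx[0] + fx[-1])) * dx    # trapezoid rule

    print(psi(0.0))           # 1.0, the ordinary total probability
    print(psi(1.5 - 2.0j))    # also (1 + 0j), to within quadrature error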

Though we believe the contour-integral proof above to be worthwhile for its instructional value, a simpler proof comes from the powerful theorem of analytic continuation (cf. Appendix D.12 of Havil [2003]). This theorem concerns two functions that are analytic within a common domain. If the functions are equal over any smooth curve within the domain, no matter how short,³⁰ then they

³⁰ The length of the curve must be positive; equality at single points, or punctuated equality, does not qualify.


b 1 1 − (x−ζ)2 are equal over all the domain. Now ψ(ζ) = lim ∫ e 2 dx is analytic over all the complex a→−∞ x=a 2π b→+∞ plane. And for all real-valued ζ, ψ(ζ) = 1. So ψ(ζ) and f (ζ) = 1 are two functions analytic over the complex plane and identical on the real axis. Therefore, by analytic continuation ψ(ζ) must equal one for all complex ζ. Analytic continuation is analogous with the theorem in real analysis that two smooth functions equal over any interval are equal everywhere. Analytic continuation derives from the fact that a complex derivative is the same in all directions. It is mistaken to regard the real and imaginary parts of the derivative as partial derivatives, as if they applied respectively to the real and imaginary axes of the independent variable. Rather, the whole derivative applies in every direction.


APPENDIX C

EIGEN-DECOMPOSITION AND SCHUR’S PRODUCT THEOREM

Appendix A quoted Schur's Product Theorem, viz., that the Hadamard product of non-negative-definite (NND) matrices is NND. Million [2007] proves it as Theorem 3.4; however, we believe our proof in this appendix to be simpler; moreover, it affords a review of eigen-decomposition. Those familiar with eigen-decomposition may skip to the last paragraph.

Let Γ be an n×n Hermitian NND matrix. As explained in Section 4, 'Hermitian' means that $\Gamma = \Gamma^*$; 'NND' means that for every complex n×1 vector z (or $z \in C^n$), $z^*\Gamma z \geq 0$. Positive definiteness [PD] is a stricter condition, in which $z^*\Gamma z = 0$ if and only if $z = 0_{n\times 1}$.

Complex scalar λ and non-zero vector ν form an "eigenvalue-eigenvector" pair of Γ if $\Gamma\nu = \lambda\nu$. Since $\nu = 0_{n\times 1}$ is excluded as a trivial solution, vector ν can be scaled to unity, or $\nu^*\nu = 1$. But $\Gamma\nu = \lambda\nu$ if and only if $(\Gamma - \lambda I_n)\nu = 0_{n\times 1}$. If $\Gamma - \lambda I_n$ is non-singular, or invertible, then:
$$\nu = I_n\nu = (\Gamma - \lambda I_n)^{-1}(\Gamma - \lambda I_n)\nu = (\Gamma - \lambda I_n)^{-1}\, 0_{n\times 1} = 0_{n\times 1}$$

Hence, allowable eigenvectors require $\Gamma - \lambda I_n$ to be singular, or its determinant $|\Gamma - \lambda I_n|$ to be zero. Since the determinant is an nth-degree polynomial in λ (with complex coefficients based on the elements of Γ), it has n root values of λ, not necessarily distinct. So the determinant can be factored, up to sign, as $f(\lambda) = |\Gamma - \lambda I_n| = (\lambda - \lambda_1)\cdots(\lambda - \lambda_n)$. Since $f(\lambda_j) = |\Gamma - \lambda_j I_n| = 0$, there exist non-zero solutions to $(\Gamma - \lambda_j I_n)\nu = 0_{n\times 1}$. So for every eigenvalue there is a non-zero eigenvector, even a non-zero eigenvector of unit magnitude.


The first important result is that the eigenvalues of Γ are real-valued and non-negative. Consider the jth eigenvalue-eigenvector pair, which satisfies the equation $\Gamma\nu_j = \lambda_j\nu_j$. Therefore, $\nu_j^*\Gamma\nu_j = \lambda_j\nu_j^*\nu_j$. Since Γ is NND, $\nu_j^*\Gamma\nu_j$ is real-valued and non-negative. Also, $\nu_j^*\nu_j$ is real-valued and positive. Therefore, their quotient $\lambda_j$ is a real-valued and non-negative scalar. Furthermore, if Γ is PD, $\nu_j^*\Gamma\nu_j$ is positive, as well as $\lambda_j$.

The second important result is that eigenvectors paired with unequal eigenvalues are orthogonal. Let the two unequal eigenvalues be $\lambda_j \neq \lambda_k$. Because the eigenvalues are real-valued, $\bar\lambda_j = \lambda_j$. The eigenvector equations are $\Gamma\nu_j = \lambda_j\nu_j$ and $\Gamma\nu_k = \lambda_k\nu_k$. The following string of equations relies on Γ's being Hermitian (so $\Gamma = \Gamma^*$):
$$\begin{aligned}
(\lambda_j - \lambda_k)\nu_j^*\nu_k &= \lambda_j\nu_j^*\nu_k - \lambda_k\nu_j^*\nu_k = \bar\lambda_j\nu_j^*\nu_k - \lambda_k\nu_j^*\nu_k \\
&= (\lambda_j\nu_k^*\nu_j)^* - \lambda_k\nu_j^*\nu_k = (\nu_k^*\Gamma\nu_j)^* - \nu_j^*\Gamma\nu_k \\
&= \nu_j^*\Gamma^*\nu_k - \nu_j^*\Gamma\nu_k = \nu_j^*\Gamma\nu_k - \nu_j^*\Gamma\nu_k = 0
\end{aligned}$$

Because $\lambda_j - \lambda_k \neq 0$, the eigenvectors must be orthogonal, or $\nu_j^*\nu_k = 0$. If all the eigenvalues are distinct, the eigenvectors form an orthogonal basis of $C^n$. But even if not, the kernel of each eigenvalue, or $\ker(\Gamma - \lambda_j I_n) = \{z \in C^n : \Gamma z = \lambda_j z\}$, is a linear subspace of $C^n$ whose rank or dimensionality equals the multiplicity of the root $\lambda_j$ in the characteristic equation $f(\lambda) = |\Gamma - \lambda I_n|$. This means that the number of mutually orthogonal eigenvectors paired with an eigenvalue equals


how many times that eigenvalue is a root of its characteristic equation. Consequently, there exist n eigenvalue-eigenvector pairs $(\lambda_j, \nu_j)$ such that $\Gamma\nu_j = \lambda_j\nu_j$ and $\nu_j^*\nu_k = \delta_{jk}$.³¹

Now, define W as the partitioned matrix $W_{n\times n} = [\nu_1 \;\cdots\; \nu_n]$. The jkth element of $W^*W$ equals $\nu_j^*\nu_k = \delta_{jk}$; hence, $W^*W = I_n$. A matrix whose transjugate is its inverse is called "unitary," as is W.³² Furthermore, define Λ as the n×n diagonal matrix whose jjth element is $\lambda_j$. Then:
$$\Gamma W = \Gamma[\nu_1 \;\cdots\; \nu_n] = [\Gamma\nu_1 \;\cdots\; \Gamma\nu_n] = [\lambda_1\nu_1 \;\cdots\; \lambda_n\nu_n] = [\nu_1 \;\cdots\; \nu_n]\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} = W\Lambda$$

And so, $\Gamma = \Gamma I_n = \Gamma WW^* = W\Lambda W^*$, and Γ is said to be "diagonalized." Thus have we shown, assuming the theory of equations,³³ the third important result, viz., that every NND Hermitian matrix can be diagonalized. Other matrices can be diagonalized; the NND [or PD] property consists in the fact that all the eigenvalues of this diagonalization are non-negative [or positive].

The fourth and final "eigen" result relies on the identity $W^*\nu_j = e_j$, which just extracts the jth column of each side of $W^*W = I_n$. As in Appendix A, $e_j$ is the jth unit vector. Therefore:

³¹ The Kronecker delta, $\delta_{jk}$, is the function $\mathrm{IF}(j = k,\, 1,\, 0)$.
³² To be precise, at this point $W^*$ is only the left-inverse of W. But by matrix-rank theorems, the rank of W equals n, so W has a unique full inverse $W^{-1}$. Then $W^* = W^*I_n = W^*(WW^{-1}) = (W^*W)W^{-1} = I_nW^{-1} = W^{-1}$.

³³ The theory of equations guarantees the existence of the roots of the nth-degree equation $f(\lambda) = |\Gamma - \lambda I_n|$. Appendix D.9 of Havil [2003] contains a quick and lucid proof of this, the fundamental theorem of algebra.


$$\begin{aligned}
\Gamma &= W\Lambda W^* = W\left(\sum_{j=1}^n \lambda_j e_j e_j^*\right)W^* = W\left(\sum_{j=1}^n \lambda_j (W^*\nu_j)(W^*\nu_j)^*\right)W^* \\
&= W\left(\sum_{j=1}^n \lambda_j W^*\nu_j\nu_j^*W\right)W^* = WW^*\left(\sum_{j=1}^n \lambda_j \nu_j\nu_j^*\right)WW^* = I_n\left(\sum_{j=1}^n \lambda_j \nu_j\nu_j^*\right)I_n = \sum_{j=1}^n \lambda_j \nu_j\nu_j^*
\end{aligned}$$

The form $\sum_{j=1}^n \lambda_j\nu_j\nu_j^*$ is called the "spectral decomposition" of Γ (§7.4 of Healy [1986]), which plays the leading role in the following succinct proof of Schur's product theorem.

If Σ and T are two n×n Hermitian NND matrices, we may spectrally decompose them as $\Sigma = \sum_{j=1}^n \lambda_j\nu_j\nu_j^*$ and $T = \sum_{j=1}^n \kappa_j\eta_j\eta_j^*$, where all the λ and κ scalars are non-negative. Accordingly:


$$\begin{aligned}
(\Sigma \circ T)_{jk} &= (\Sigma)_{jk}(T)_{jk} = \left(\sum_{r=1}^n \lambda_r\nu_r\nu_r^*\right)_{jk}\left(\sum_{s=1}^n \kappa_s\eta_s\eta_s^*\right)_{jk} \\
&= \left(\sum_{r=1}^n \lambda_r(\nu_r)_j\overline{(\nu_r)_k}\right)\left(\sum_{s=1}^n \kappa_s(\eta_s)_j\overline{(\eta_s)_k}\right) = \sum_{r=1}^n\sum_{s=1}^n \lambda_r\kappa_s\, (\nu_r)_j(\eta_s)_j\, \overline{(\nu_r)_k(\eta_s)_k} \\
&= \sum_{r=1}^n\sum_{s=1}^n \lambda_r\kappa_s\, (\nu_r \circ \eta_s)_j\, \overline{(\nu_r \circ \eta_s)_k} = \left(\sum_{r=1}^n\sum_{s=1}^n \lambda_r\kappa_s\, (\nu_r \circ \eta_s)(\nu_r \circ \eta_s)^*\right)_{jk}
\end{aligned}$$

Hence, $\Sigma \circ T = \sum_{r=1}^n\sum_{s=1}^n \lambda_r\kappa_s\, (\nu_r \circ \eta_s)(\nu_r \circ \eta_s)^*$. Since each matrix $(\nu_r \circ \eta_s)(\nu_r \circ \eta_s)^*$ is Hermitian NND, and each scalar $\lambda_r\kappa_s$ is non-negative, $\Sigma \circ T$ must be Hermitian NND. Therefore, the Hadamard product of NND matrices is NND.
