The Ellipse

✧ 1 / 28 The Variance Ellipse

For bivariate data, like velocity, the variability can be spread out in not one but two dimensions. In this case, the variance is now a matrix, and the spread of the data is characterized by an ellipse.

The eccentricity of this variance ellipse indicates the extent to which the variability is anisotropic or directional, and its orientation tells the direction in which the variability is concentrated.

✧ 2 / 28 Variance Ellipse Example

Variance ellipses are a very useful way to analyze velocity data.

This example compares velocities observed by a mooring array in Fram Strait with velocities in two numerical models.

From Hattermann et al. (2016), “Eddy-driven recirculation of Atlantic Water in Fram Strait”, Geophysical Research Letters.

Variance ellipses can be powerfully combined with lowpassing and bandpassing to reveal the geometric structure of variability in different frequency bands.

✧ 3 / 28 Understanding Ellipses

This section will focus on understanding the properties of the variance ellipse.

To do this, it is not really possible to avoid matrix algebra.

Therefore we will first review some relevant mathematical background.

✧ 4 / 28 Review: Rotations

The most important action on a vector z ≡ [u v]ᵀ is a ninety-degree rotation. This is carried out through the matrix

$$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \mathbf{z} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} -v \\ u \end{bmatrix}.$$

Note the mathematically positive direction is counterclockwise. A general rotation is carried out by the rotation matrix

$$\mathbf{J}(\theta) \equiv \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

$$\mathbf{J}(\theta)\,\mathbf{z} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} u\cos\theta - v\sin\theta \\ u\sin\theta + v\cos\theta \end{bmatrix}.$$

The ninety-degree rotation matrix is J(π/2), while J(π), the 180 degree rotation matrix, just changes the sign of z.

✧ 5 / 28 Review: Matrix Basics

Recall that a matrix M is said to be unitary if

$$\mathbf{M}^T \mathbf{M} = \mathbf{I} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

where “T” represents the matrix transpose, and I is called the identity matrix. For a unitary matrix, MᵀM z = z, i.e. when the matrix and its transpose operate in succession, nothing happens.

We can see that the rotation matrix J(θ) is unitary since

$$\mathbf{J}^T(\theta)\,\mathbf{J}(\theta) = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

We also note that Jᵀ(θ) = J(−θ): the transpose of a rotation matrix is the same as a rotation in the opposite direction. Makes sense!
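These properties are easy to check numerically. Below is a minimal numpy sketch, not part of the original lecture materials; the helper `jmat` is an illustrative stand-in, not the jmat2 routine mentioned in the homework.

```python
# Minimal numpy sketch of the rotation matrix J(theta) and its properties.
import numpy as np

def jmat(theta):
    """Counterclockwise 2x2 rotation matrix J(theta)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

z = np.array([1.0, 2.0])                                     # a sample vector [u, v]
print(jmat(np.pi / 2) @ z)                                   # ninety-degree rotation gives [-v, u]
theta = 0.3
print(np.allclose(jmat(theta).T @ jmat(theta), np.eye(2)))   # J^T J = I (unitary): True
print(np.allclose(jmat(theta).T, jmat(-theta)))              # J^T(theta) = J(-theta): True
```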

✧ 6 / 28 Complex Notation

A pair of time series can also be grouped into a single complex-valued time series

zn = un + i vn,   n = 0, 1, 2, …, N − 1

where i = √−1. The real part represents east-west, and the imaginary part represents north-south.

We will use both the complex-valued and vector representations.

Complex notation turns out to be highly useful not only for bivariate data, but also in the analysis of real-valued time series.
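As a quick illustration, here is a minimal numpy sketch (made-up values, not lecture data) of grouping u and v into a complex series and recovering them:

```python
# Minimal sketch: grouping u and v into a complex-valued series z_n = u_n + i v_n.
import numpy as np

u = np.array([1.0, 0.5, -0.2])   # made-up eastward velocities
v = np.array([0.3, -1.0, 0.8])   # made-up northward velocities
z = u + 1j * v                   # complex-valued time series
print(z.real, z.imag)            # the real and imaginary parts recover u and v
```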

Complex numbers are reviewed in detail in another lecture.

✧ 7 / 28 The Mean of Bivariate Data

Next we look at the mean and variance for the case of bivariate data, which we represent as the vector-valued time series zn .

The sample mean of the vector time series zn is also a vector,

$$\overline{\mathbf{z}} \equiv \frac{1}{N}\sum_{n=0}^{N-1} \mathbf{z}_n = \begin{bmatrix} \overline{u} \\ \overline{v} \end{bmatrix}$$

that consists of the sample means of the u and v components of zn.

✧ 8 / 28 Variance of Bivariate Data

The variance of the vector-valued time series zn is not a scalar or a vector, it is a 2 × 2 matrix

$$\boldsymbol{\Sigma} \equiv \frac{1}{N}\sum_{n=0}^{N-1} \left(\mathbf{z}_n - \overline{\mathbf{z}}\right) \left(\mathbf{z}_n - \overline{\mathbf{z}}\right)^T$$

where “T” represents the matrix transpose, with z = [u v]ᵀ.

Carrying out the matrix multiplication leads to

$$\boldsymbol{\Sigma} = \frac{1}{N}\sum_{n=0}^{N-1} \begin{bmatrix} (u_n - \overline{u})^2 & (u_n - \overline{u})(v_n - \overline{v}) \\ (u_n - \overline{u})(v_n - \overline{v}) & (v_n - \overline{v})^2 \end{bmatrix}$$

The diagonal elements of Σ are the sample variances σu² and σv², while the off-diagonal gives the covariance between un and vn. Note that the two off-diagonal elements are identical.
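A minimal numpy sketch (with synthetic velocities, not data from the lecture) of forming this matrix directly from its definition:

```python
# Minimal sketch: the 2x2 velocity covariance matrix from synthetic u, v data.
import numpy as np

rng = np.random.default_rng(0)
u = 2.0 * rng.standard_normal(1000)              # synthetic eastward velocities
v = 0.5 * u + 0.5 * rng.standard_normal(1000)    # synthetic northward velocities, correlated with u

du, dv = u - u.mean(), v - v.mean()              # deviations from the sample means
Sigma = np.array([[np.mean(du * du), np.mean(du * dv)],
                  [np.mean(du * dv), np.mean(dv * dv)]])
print(Sigma)   # diagonal: sample variances of u and v; off-diagonal: their covariance
```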

✧ 9 / 28

Σ is generally called the velocity covariance matrix.

We can still define a scalar-valued standard deviation σ. This is done by taking the mean of an inner product rather than an outer product,

$$\boldsymbol{\Sigma} \equiv \frac{1}{N}\sum_{n=0}^{N-1} \left(\mathbf{z}_n - \overline{\mathbf{z}}\right) \left(\mathbf{z}_n - \overline{\mathbf{z}}\right)^T, \qquad \sigma^2 \equiv \frac{1}{N}\sum_{n=0}^{N-1} \left(\mathbf{z}_n - \overline{\mathbf{z}}\right)^T \left(\mathbf{z}_n - \overline{\mathbf{z}}\right).$$

The squared velocity standard deviation σ² is related to the covariance matrix as the sum of the diagonal elements:

$$\sigma^2 = \Sigma_{uu} + \Sigma_{vv} = \sigma_u^2 + \sigma_v^2.$$

The sum of the diagonal elements of a matrix is known as the trace of the matrix, denoted tr. Thus σ² = tr{Σ}.
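A minimal numpy check of this relation, using synthetic data (not from the lecture):

```python
# Minimal sketch: sigma^2 from an inner-product average equals the trace of Sigma.
import numpy as np

rng = np.random.default_rng(1)
uv = rng.standard_normal((1000, 2)) * [2.0, 0.5]   # synthetic (u, v) pairs as rows
d = uv - uv.mean(axis=0)                           # deviations from the vector mean
Sigma = (d.T @ d) / len(d)                         # outer-product average (1/N convention)
sigma2 = np.mean(np.sum(d**2, axis=1))             # inner-product average
print(np.isclose(sigma2, np.trace(Sigma)))         # sigma^2 = tr{Sigma}: True
```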

Note σ² is only a factor of two away from the eddy kinetic energy, K = σ²/2. Clearly we only need to use one of these quantities.

✧ 10 / 28 Eigenvalue Decomposition

For bivariate data zn, the second moment (the velocity covariance matrix) takes on a geometric aspect that can be highly informative.

We will show that the covariance matrix Σ can be written as

$$\boldsymbol{\Sigma} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} a^2 & 0 \\ 0 & b^2 \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$

or more compactly as Σ = J(θ) D(a, b) Jᵀ(θ), where we have introduced the diagonal matrix D(a, b) defined as

$$\mathbf{D}(a,b) \equiv \begin{bmatrix} a^2 & 0 \\ 0 & b^2 \end{bmatrix}.$$

This is the eigenvalue decomposition of the covariance matrix Σ.
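As a minimal numerical illustration (using numpy's eigh rather than the specdiag routine mentioned in the homework; the example covariance matrix is made up):

```python
# Minimal sketch: recovering a, b, and theta from the eigenvalue decomposition of Sigma.
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])                  # an example covariance matrix
evals, evecs = np.linalg.eigh(Sigma)            # eigenvalues ascending, orthonormal eigenvectors
a, b = np.sqrt(evals[1]), np.sqrt(evals[0])     # semi-major and semi-minor axes
theta = np.arctan2(evecs[1, 1], evecs[0, 1])    # orientation of the major-axis eigenvector

J = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
D = np.diag([a**2, b**2])
print(np.allclose(J @ D @ J.T, Sigma))          # Sigma = J(theta) D(a,b) J^T(theta): True
```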

Generally, the eigenvalue decomposition is found numerically, though for the 2 × 2 case this is not necessary because there are simple expressions for a, b, and θ, as will be shown later.

✧ 11 / 28 Diagonalization

The operation of the eigenvalue decomposition is to diagonalize the covariance matrix. In other words,

$$\boldsymbol{\Sigma} = \mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta)$$

implies that

$$\mathbf{J}^T(\theta)\,\boldsymbol{\Sigma}\,\mathbf{J}(\theta) = \mathbf{D}(a,b)$$

which means that if we rotate the observed velocities by −θ, we obtain an ellipse with its major axis oriented along the x-axis, and with no correlation between the x- and y-velocities.

These rotated velocities are given by

$$\tilde{\mathbf{z}} \equiv \begin{bmatrix} \tilde{u} \\ \tilde{v} \end{bmatrix} \equiv \mathbf{J}^T(\theta)\,\mathbf{z}$$

with ũ being the component of the velocity along the major axis, and ṽ the component of the velocity along the minor axis.

✧ 12 / 28 Diagonalization

If we form the covariance matrix of the velocities rotated by the angle of θ that comes out of the eigenvalue decomposition, we find

$$\begin{aligned}
\tilde{\boldsymbol{\Sigma}} &\equiv \frac{1}{N}\sum_{n=0}^{N-1} \left(\tilde{\mathbf{z}}_n - \overline{\tilde{\mathbf{z}}}\right) \left(\tilde{\mathbf{z}}_n - \overline{\tilde{\mathbf{z}}}\right)^T \\
&= \frac{1}{N}\sum_{n=0}^{N-1} \left(\mathbf{J}^T(\theta)\,\mathbf{z}_n - \overline{\mathbf{J}^T(\theta)\,\mathbf{z}}\right) \left(\mathbf{J}^T(\theta)\,\mathbf{z}_n - \overline{\mathbf{J}^T(\theta)\,\mathbf{z}}\right)^T \\
&= \mathbf{J}^T(\theta) \left[\frac{1}{N}\sum_{n=0}^{N-1} \left(\mathbf{z}_n - \overline{\mathbf{z}}\right) \left(\mathbf{z}_n - \overline{\mathbf{z}}\right)^T\right] \mathbf{J}(\theta) \\
&= \mathbf{J}^T(\theta) \left[\mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta)\right] \mathbf{J}(\theta) = \mathbf{D}(a,b)
\end{aligned}$$

using (Az)ᵀ = zᵀAᵀ. Thus the eigenvalue matrix D(a, b) is simply the covariance matrix computed in a rotated frame.
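This can also be checked numerically; a minimal sketch with synthetic velocities (not lecture data):

```python
# Minimal sketch: rotating the velocities by -theta diagonalizes the sample covariance.
import numpy as np

rng = np.random.default_rng(2)
z = rng.multivariate_normal([0.0, 0.0], [[2.0, 0.8], [0.8, 1.0]], size=2000)  # rows are (u, v)
d = z - z.mean(axis=0)
Sigma = (d.T @ d) / len(d)

evals, evecs = np.linalg.eigh(Sigma)
theta = np.arctan2(evecs[1, 1], evecs[0, 1])          # major-axis orientation
J = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
zt = d @ J                                            # each row is J^T(theta) applied to z_n
Sigma_tilde = (zt.T @ zt) / len(zt)
print(np.round(Sigma_tilde, 3))                       # diagonal gives (a^2, b^2); off-diagonal ~ 0
```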

The eigenvalue decomposition has found the rotation for which the covariance between the rotated velocity components vanishes.

✧ 13 / 28 The Variance Ellipse

The covariance matrix describes an ellipse with semi-major axis a and semi-minor axis b, oriented at an angle θ with respect to the x-axis.

The usual equation for an ellipse with semi-major axis a oriented along the u-axis and semi-minor axis b oriented along the v-axis is

$$\frac{u^2}{a^2} + \frac{v^2}{b^2} = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} 1/a^2 & 0 \\ 0 & 1/b^2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \mathbf{z}^T \mathbf{D}^{-1}(a,b)\,\mathbf{z} = 1$$

where the “−1” denotes the matrix inverse. Recall the inverse of a matrix M is defined to give M⁻¹M = I. Thus

$$\mathbf{z}^T \boldsymbol{\Sigma}^{-1} \mathbf{z} = \mathbf{z}^T \left[\mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta)\right]^{-1} \mathbf{z} = \left[\mathbf{J}^T(\theta)\,\mathbf{z}\right]^T \mathbf{D}^{-1}(a,b) \left[\mathbf{J}^T(\theta)\,\mathbf{z}\right] = 1$$

is the equation for an ellipse with semi-major axis a, semi-minor axis b, and oriented θ radians counterclockwise from the x-axis.
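A minimal sketch of how one might trace this ellipse for plotting, by mapping the unit circle through J(θ) and the square root of D (the example Σ is made up):

```python
# Minimal sketch: points on the variance ellipse z^T Sigma^{-1} z = 1.
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
evals, evecs = np.linalg.eigh(Sigma)
a, b = np.sqrt(evals[1]), np.sqrt(evals[0])
theta = np.arctan2(evecs[1, 1], evecs[0, 1])
J = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

phi = np.linspace(0.0, 2.0 * np.pi, 200)
circle = np.vstack([np.cos(phi), np.sin(phi)])          # unit circle, shape (2, 200)
ellipse = J @ np.diag([a, b]) @ circle                  # rows are the u and v coordinates to plot
vals = np.sum(ellipse * np.linalg.solve(Sigma, ellipse), axis=0)
print(np.allclose(vals, 1.0))                           # every point satisfies z^T Sigma^{-1} z = 1
```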

✧ 14 / 28 The Variance Ellipse

Thus we have shown the covariance matrix Σ of a bivariate time series zn defines an ellipse that captures how the data is spread out about its mean value, as claimed.

✧ 15 / 28 Expressions for the Axes

Exact expressions can be found for a, b, and θ.

Here we introduce some new notation. tr{M} denotes the matrix trace, which is defined to be the sum of all diagonal elements of M. Similarly det{M} denotes the determinant.

For Σ we have tr{Σ} = σu² + σv² and det{Σ} = ΣuuΣvv − Σuv².

The eigenvalues of Σ are given explicitly by

$$a^2 = \frac{1}{2}\,\mathrm{tr}\{\boldsymbol{\Sigma}\} + \frac{1}{2}\sqrt{\left[\mathrm{tr}\{\boldsymbol{\Sigma}\}\right]^2 - 4\det\{\boldsymbol{\Sigma}\}}$$
$$b^2 = \frac{1}{2}\,\mathrm{tr}\{\boldsymbol{\Sigma}\} - \frac{1}{2}\sqrt{\left[\mathrm{tr}\{\boldsymbol{\Sigma}\}\right]^2 - 4\det\{\boldsymbol{\Sigma}\}}$$

as can easily be shown by inserting the values for tr{Σ} and det{Σ}.
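A minimal numerical check of these formulas against numpy's eigenvalue routine (the example matrix is made up):

```python
# Minimal sketch: a^2 and b^2 from the trace and determinant of Sigma.
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
tr, det = np.trace(Sigma), np.linalg.det(Sigma)
a2 = 0.5 * tr + 0.5 * np.sqrt(tr**2 - 4.0 * det)
b2 = 0.5 * tr - 0.5 * np.sqrt(tr**2 - 4.0 * det)
print(np.allclose([b2, a2], np.linalg.eigvalsh(Sigma)))   # matches the eigenvalues: True
```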

Born and Wolf (1959), Principles of Optics
Samson (1980), “Comments on polarization and coherence”

✧ 16 / 28 Expression for the Angle

To find the angle θ, we carry out the matrix multiplications, giving

$$\mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta) = \frac{1}{2}\left(a^2 + b^2\right)\mathbf{I} + \frac{1}{2}\left(a^2 - b^2\right) \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$$

and we also rewrite the covariance matrix Σ in the form

$$\boldsymbol{\Sigma} = \begin{bmatrix} \Sigma_{uu} & \Sigma_{uv} \\ \Sigma_{uv} & \Sigma_{vv} \end{bmatrix} = \frac{1}{2}\left(\Sigma_{uu} + \Sigma_{vv}\right)\mathbf{I} + \frac{1}{2}\begin{bmatrix} \Sigma_{uu} - \Sigma_{vv} & 2\Sigma_{uv} \\ 2\Sigma_{uv} & \Sigma_{vv} - \Sigma_{uu} \end{bmatrix}.$$

Equating terms in the anisotropic parts of these matrices leads to

$$\Sigma_{uu} - \Sigma_{vv} = \left(a^2 - b^2\right)\cos 2\theta, \qquad 2\Sigma_{uv} = \left(a^2 - b^2\right)\sin 2\theta$$

and dividing these two expressions, we find θ satisfies

$$\tan(2\theta) = \frac{2\Sigma_{uv}}{\Sigma_{uu} - \Sigma_{vv}}.$$

✧ 17 / 28 Isotropy and Polarization

The variance ellipse can alternately be decomposed into (i) directional variability in direction θ, with variance a² − b², plus (ii) purely isotropic or directionless variability, with variance 2b².

✧ 18 / 28 The Polarization Ratio

To show this, let n̂ = [cos θ sin θ]ᵀ be the unit vector pointing in the direction of the major axis. Then Σ may be written as

$$\boldsymbol{\Sigma} = \overbrace{\left(a^2 - b^2\right)\hat{\mathbf{n}}\,\hat{\mathbf{n}}^T}^{\text{polarized}} + \overbrace{b^2\,\mathbf{I}}^{\text{unpolarized}}$$

as may be readily verified using trigonometric identities.

The first component gives variability associated with the particular direction ˆn, while the second is associated with all directions. The first term is said to be “polarized” or anisotropic, while the second is said to be “unpolarized” or isotropic.

Because tr{n̂ n̂ᵀ} = 1, the variance associated with the first term is a² − b². Because tr{I} = 2, that associated with the second is 2b².
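A minimal numerical check of this split (the example Σ is made up; the major-axis direction is taken from the eigenvector):

```python
# Minimal sketch: Sigma = (a^2 - b^2) n n^T + b^2 I, the polarized/unpolarized split.
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
evals, evecs = np.linalg.eigh(Sigma)
a2, b2 = evals[1], evals[0]                     # squared semi-major and semi-minor axes
n = evecs[:, 1]                                 # unit vector along the major axis
recon = (a2 - b2) * np.outer(n, n) + b2 * np.eye(2)
print(np.allclose(recon, Sigma))                # the decomposition reproduces Sigma: True
```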

The polarization ratio P ≡ (a² − b²) / (a² + b²) tells the extent to which the variability is organized along a particular direction.

✧ 19 / 28 Three Dimensions

What if the vector zn has three components, zn = [u v w]ᵀ?

This is no different. The general form for the eigenvalue decomposition of a real-valued symmetric N × N matrix is

$$\boldsymbol{\Sigma} = \mathbf{U}\,\mathbf{D}\,\mathbf{U}^T$$

where U is an orthogonal matrix (essentially, a generalized rotation), with UᵀU = UUᵀ = I, and D is a diagonal matrix of eigenvalues.

For any symmetric matrix, the eigenvalues are real-valued. For a covariance matrix such as Σ in particular, the eigenvalues are also non-negative. A matrix having such a property is said to be positive semi-definite.

✧ 20 / 28 Three Dimensions

In three dimensions, it is relatively easy to show that this becomes

$$\boldsymbol{\Sigma} = \mathbf{J}(\alpha, \beta, \theta)\,\mathbf{D}(a, b, c)\,\mathbf{J}^T(\alpha, \beta, \theta)$$

where J(α, β, θ) is a three-dimensional rotation matrix and

$$\mathbf{D}(a, b, c) \equiv \begin{bmatrix} a^2 & 0 & 0 \\ 0 & b^2 & 0 \\ 0 & 0 & c^2 \end{bmatrix}.$$

The interpretation is that the covariance matrix Σ describes an ellipsoid in three dimensions, with semi-major axis a, first semi-minor axis b, and second semi-minor axis c, oriented in space as described by the rotation matrix J(α, β, θ).
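A minimal three-dimensional sketch with synthetic (u, v, w) samples (not from the lecture):

```python
# Minimal sketch: the 3x3 covariance matrix and its variance ellipsoid axes.
import numpy as np

rng = np.random.default_rng(3)
z = rng.standard_normal((5000, 3)) * [3.0, 2.0, 1.0]    # synthetic (u, v, w) samples as rows
d = z - z.mean(axis=0)
Sigma = (d.T @ d) / len(d)                              # 3x3 covariance matrix

evals, U = np.linalg.eigh(Sigma)                        # U orthogonal, eigenvalues ascending
a, b, c = np.sqrt(evals[::-1])                          # ellipsoid semi-axes, largest first
print(a, b, c)                                          # close to 3, 2, 1 for this synthetic case
print(np.allclose(U @ np.diag(evals) @ U.T, Sigma))     # Sigma = U D U^T: True
```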

✧ 21 / 28 Covariance Invariances

The covariance matrix Σ has several important invariances. We will look at these in a general way, enabling us to connect the 2 × 2 case we have been working with to the case of any number of dimensions.

✧ 22 / 28 Determinant Invariance

The first invariance is the determinant of the covariance matrix. We have seen that det{Σ} = a²b² = A²/π², where A is the ellipse area.

Intuitively, we expect we should be able to rotate the coordinate system without changing the ellipse area.

This invariance relates to a property of the matrix determinant. If A and B are any two square matrices of the same size, then det {AB} = det {A} det {B} .

It follows that, for any angle ϑ, the rotated covariance matrix Σ̃ ≡ J(ϑ) Σ Jᵀ(ϑ) has the same determinant as the original matrix,

$$\det\{\tilde{\boldsymbol{\Sigma}}\} = \det\{\mathbf{J}(\vartheta)\}\,\det\{\boldsymbol{\Sigma}\}\,\det\{\mathbf{J}^T(\vartheta)\} = \det\{\boldsymbol{\Sigma}\}$$

using the fact that we have previously verified det {J(ϑ)} = 1.
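A minimal numerical check (the example Σ and angle are made up); the same check also shows that the trace, the subject of the next slide, is unchanged:

```python
# Minimal sketch: the determinant (and trace) of Sigma are unchanged by rotation.
import numpy as np

Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
vartheta = 0.7                                          # an arbitrary rotation angle
J = np.array([[np.cos(vartheta), -np.sin(vartheta)],
              [np.sin(vartheta),  np.cos(vartheta)]])
Sigma_rot = J @ Sigma @ J.T                             # covariance in the rotated frame
print(np.isclose(np.linalg.det(Sigma_rot), np.linalg.det(Sigma)))   # same ellipse area: True
print(np.isclose(np.trace(Sigma_rot), np.trace(Sigma)))             # same total variance: True
```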

✧ 23 / 28 Trace Invariance

The second invariance is the trace of the covariance matrix. We saw that tr{Σ} = σu² + σv² = 2K, where K is the eddy kinetic energy.

Intuitively, we should be able to rotate the coordinate system without changing the kinetic energy.

This invariance relates to a property of the matrix trace. If U is any invertible matrix, such that U⁻¹ exists with U⁻¹U = I, then

$$\tilde{\boldsymbol{\Sigma}} \equiv \mathbf{U}\,\boldsymbol{\Sigma}\,\mathbf{U}^{-1}$$

is said to be a similarity transformation of Σ. The trace has the property that

$$\mathrm{tr}\{\boldsymbol{\Sigma}\} = \mathrm{tr}\{\mathbf{U}\,\boldsymbol{\Sigma}\,\mathbf{U}^{-1}\} = \mathrm{tr}\{\tilde{\boldsymbol{\Sigma}}\}$$

which is called similarity invariance. Since the 2 × 2 rotation matrix J(ϑ) is invertible for any angle ϑ, the trace is not changed under rotations.

Note it is not the case that tr{AB} = tr{A} tr{B}!

✧ 24 / 28 Generalization to N > 2

Let's say that instead of two signal components, u and v, we have many, z ≡ [z1 z2 … zN]ᵀ.

A common application is where the zn are taken from a spatial grid, which we re-organize into a single column vector for convenience.

We then form the covariance matrix and take its eigenvalue decomposition as before

$$\boldsymbol{\Sigma} \equiv \frac{1}{N}\sum_{n=0}^{N-1} \left(\mathbf{z}_n - \overline{\mathbf{z}}\right) \left(\mathbf{z}_n - \overline{\mathbf{z}}\right)^T = \mathbf{U}\,\mathbf{D}\,\mathbf{U}^T$$

but where now U is an orthogonal matrix (a "generalized rotation"), with UᵀU = UUᵀ = I, and D is a diagonal matrix of eigenvalues.

In this N-dimensional case, tr{Σ}/N is the average variance across the N components, while √det{Σ} is a measure of the volume enclosed by the variability.

✧ 25 / 28 Generalization to N > 2

Transforming the original variables as

$$\tilde{\mathbf{z}} \equiv \begin{bmatrix} \tilde{z}_1 & \tilde{z}_2 & \cdots & \tilde{z}_N \end{bmatrix}^T \equiv \mathbf{U}^T \mathbf{z}$$

we obtain z̃1 as the linear combination of the original time series (under an orthogonal transformation, which preserves the vector length ‖z‖) having the maximum variance.

Another way to say this is that z̃1 is the time series along the "generalized direction" that maximizes variance, z̃2 is the time series along the generalized direction that maximizes variance and that is also orthogonal to the direction of z̃1, etc.

These are the principal components. The columns of U correspond to spatial patterns if we rearrange them back onto the grid. The magnitudes of these components are the elements of the diagonal matrix D, d1, d2, …, dN. That's EOF analysis in a nutshell.
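A minimal numpy sketch of this recipe on synthetic gridded data (the pattern, noise level, and sizes here are all made up for illustration):

```python
# Minimal sketch of EOF analysis: eigenvectors of the covariance matrix are the spatial
# patterns, and projecting onto them gives the principal-component time series.
import numpy as np

rng = np.random.default_rng(4)
pattern = np.sin(np.linspace(0.0, np.pi, 20))            # a made-up spatial pattern on 20 grid points
data = np.outer(rng.standard_normal(500), pattern)       # 500 time steps dominated by that pattern
data += 0.3 * rng.standard_normal(data.shape)            # plus spatially white noise

d = data - data.mean(axis=0)
Sigma = (d.T @ d) / len(d)                               # 20x20 covariance matrix
evals, U = np.linalg.eigh(Sigma)                         # eigenvalues ascending
eofs = U[:, ::-1]                                        # columns are EOFs, largest variance first
pcs = d @ eofs                                           # principal-component time series
print(np.corrcoef(eofs[:, 0], pattern)[0, 1])            # leading EOF ~ the imposed pattern (up to sign)
```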

The variance ellipse is a geometric expression of a 2 × 2 EOF analysis.

✧ 26 / 28 Homework

1. Review the notes.
2. For your variance ellipse (or bravo or m1244), verify that using the a, b, and θ output by specdiag in Σ = J(θ) D(a, b) Jᵀ(θ) does indeed recover the covariance matrix Σ that you started with. You can use jmat2 to form the rotation matrix if you like.
3. Numerically find the rotation such that the real part of your complex velocity has the largest variance. Verify that this matches θ as output by specdiag.
4. Multiply out the expansion Σ = J(θ) D(a, b) Jᵀ(θ) to verify that

$$\boldsymbol{\Sigma} = \frac{1}{2}\begin{bmatrix} \left(a^2 + b^2\right) + \left(a^2 - b^2\right)\cos 2\theta & \left(a^2 - b^2\right)\sin 2\theta \\ \left(a^2 - b^2\right)\sin 2\theta & \left(a^2 + b^2\right) - \left(a^2 - b^2\right)\cos 2\theta \end{bmatrix}$$

and note that this matches the expression at the top of page 17.
5. Find the trace and determinant of this matrix.
6. From these, verify the formulas for a² and b² on page 16. This shows how the ellipse axes are found from the components of Σ.
7. Verify the polarization expansion of Σ at the top of page 19.

✧ 27 / 28 Pop Quiz!

1. If J(θ) is the 2 × 2 rotation matrix, what is J(θ) J(θ) Jᵀ(θ) J(θ) Jᵀ(θ) J(−θ) J(−θ) Jᵀ(−θ)?

2. What is the trace and determinant of J(θ)? Of the 2 × 2 identity matrix I?
3. What is the rotation matrix about the z-axis in three dimensions?
4. A mooring observes a velocity distribution with a mean velocity of [u v]ᵀ = [1 0]ᵀ and a variance ellipse with a = 1/2, b = 1/2, and θ = 0. Draw (possible) contours of its velocity distribution on the (u, v) plane.
5. A mooring observes a velocity distribution with a mean velocity of [u v]ᵀ = [1 −1]ᵀ and a variance ellipse with a = √2, b = √2/2, and θ = 45 degrees. Draw (possible) contours of its velocity distribution on the (u, v) plane.
6. A mooring sits at a location where the bathymetry shoals to the north. The currents are purely eastward and the variance ellipse is isotropic. Draw a possible mean temperature distribution on the (u, v) plane that could accomplish an onshore heat flux.

✧ 28 / 28