The Variance Ellipse
✧ 1 / 28 The Variance Ellipse
For bivariate data, like velocity, the variability can be spread out in not one but two dimensions. In this case, the variance is now a matrix, and the spread of the data is characterized by an ellipse.
The variance ellipse's eccentricity indicates the extent to which the variability is anisotropic, or directional, and its orientation tells the direction in which the variability is concentrated.
✧ 2 / 28 Variance Ellipse Example
Variance ellipses are a very useful way to analyze velocity data.
This example compares velocities observed by a mooring array in Fram Strait with velocities in two numerical models.
From Hattermann et al. (2016), “Eddy-driven recirculation of Atlantic Water in Fram Strait”, Geophysical Research Letters.
Variance ellipses can be powerfully combined with lowpass and bandpass filtering to reveal the geometric structure of variability in different frequency bands.
✧ 3 / 28 Understanding Ellipses
This section will focus on understanding the properties of the variance ellipse.
To do this, it is not really possible to avoid matrix algebra.
Therefore we will first review some relevant mathematical background.
✧ 4 / 28 Review: Rotations
The most important action on a vector $\mathbf{z} \equiv [\,u \;\; v\,]^T$ is a ninety-degree rotation. This is carried out through the matrix multiplication
$$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \mathbf{z} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} -v \\ u \end{bmatrix}.$$
Note the mathematically positive direction is counterclockwise. A general rotation is carried out by the rotation matrix
$$\mathbf{J}(\theta) \equiv \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
$$\mathbf{J}(\theta)\,\mathbf{z} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} u\cos\theta - v\sin\theta \\ u\sin\theta + v\cos\theta \end{bmatrix}.$$
The ninety-degree rotation matrix is J(π/2), while J(π), the 180 degree rotation matrix, just changes the sign of z.
✧ 5 / 28 Review: Matrix Basics
Recall that a matrix $\mathbf{M}$ is said to be unitary if
$$\mathbf{M}^T \mathbf{M} = \mathbf{I} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
where “T” represents the matrix transpose, and $\mathbf{I}$ is called the identity matrix. For a unitary matrix, $\mathbf{M}^T \mathbf{M}\,\mathbf{z} = \mathbf{z}$, i.e. when the matrix and its transpose operate in succession, nothing happens.
We can see that the rotation matrix $\mathbf{J}(\theta)$ is unitary since
$$\mathbf{J}^T(\theta)\,\mathbf{J}(\theta) = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$
We also note that $\mathbf{J}^T(\theta) = \mathbf{J}(-\theta)$: the transpose of a rotation matrix is the same as a rotation in the opposite direction. Makes sense!
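These rotation-matrix properties are easy to check numerically. Below is a minimal NumPy sketch (the course's own tools are MATLAB-based, such as the jmat2 function mentioned in the homework; the function name `jmat` here is just an illustrative stand-in):

```python
import numpy as np

def jmat(theta):
    """2x2 counterclockwise rotation matrix J(theta)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

z = np.array([2.0, 1.0])          # the vector [u, v]

# A ninety-degree rotation sends [u, v] to [-v, u]
assert np.allclose(jmat(np.pi / 2) @ z, [-1.0, 2.0])

# J(theta) is unitary: J^T J = I
theta = 0.7
assert np.allclose(jmat(theta).T @ jmat(theta), np.eye(2))

# The transpose is a rotation in the opposite direction: J^T(theta) = J(-theta)
assert np.allclose(jmat(theta).T, jmat(-theta))
```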
✧ 6 / 28 Complex Notation
A pair of time series can also be grouped into a single complex valued time series
$$z_n = u_n + \mathrm{i}\,v_n, \qquad n = 0, 1, 2, \ldots, N-1$$
where $\mathrm{i} = \sqrt{-1}$. The real part represents east-west, and the imaginary part represents north-south.
We will use both the complex-valued and vector representations.
Complex notation turns out to be highly useful not only for bivariate data, but also in the analysis of real-valued time series.
Complex numbers are reviewed in detail in another lecture.
✧ 7 / 28 The Mean of Bivariate Data
Next we look at the mean and variance for the case of bivariate data, which we represent as the vector-valued time series zn .
The sample mean of the vector time series zn is also a vector,
$$\overline{\mathbf{z}} \equiv \frac{1}{N} \sum_{n=0}^{N-1} \mathbf{z}_n = \begin{bmatrix} \overline{u} \\ \overline{v} \end{bmatrix}$$
that consists of the sample means of the u and v components of zn .
✧ 8 / 28 Variance of Bivariate Data
The variance of the vector-valued time series $\mathbf{z}_n$ is not a scalar or a vector; it is a 2 × 2 matrix
$$\boldsymbol{\Sigma} \equiv \frac{1}{N} \sum_{n=0}^{N-1} \left( \mathbf{z}_n - \overline{\mathbf{z}} \right) \left( \mathbf{z}_n - \overline{\mathbf{z}} \right)^T$$
where “T” represents the matrix transpose, $\mathbf{z}^T = [\,u \;\; v\,]$.
Carrying out the matrix multiplication leads to
$$\boldsymbol{\Sigma} = \frac{1}{N} \sum_{n=0}^{N-1} \begin{bmatrix} (u_n - \overline{u})^2 & (u_n - \overline{u})(v_n - \overline{v}) \\ (u_n - \overline{u})(v_n - \overline{v}) & (v_n - \overline{v})^2 \end{bmatrix}$$
The diagonal elements of $\boldsymbol{\Sigma}$ are the sample variances $\sigma_u^2$ and $\sigma_v^2$, while the off-diagonal elements give the covariance between $u_n$ and $v_n$. Note that the two off-diagonal elements are identical.
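As a concrete illustration, the covariance matrix can be formed directly as the mean outer product of the velocity anomalies. The following NumPy sketch uses synthetic data with made-up variances:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000

# Synthetic bivariate "velocity" record with correlated components
u = 2.0 * rng.standard_normal(N)
v = 0.5 * rng.standard_normal(N) + 0.3 * u

z = np.stack([u, v])                       # shape (2, N)
anom = z - z.mean(axis=1, keepdims=True)   # remove the sample mean

# Covariance matrix as the mean outer product of the anomalies (1/N normalization)
Sigma = anom @ anom.T / N

# Diagonal elements are the sample variances of u and v
assert np.allclose(Sigma[0, 0], u.var())
assert np.allclose(Sigma[1, 1], v.var())
# The two off-diagonal elements (the covariance) are identical
assert np.isclose(Sigma[0, 1], Sigma[1, 0])
```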
✧ 9 / 28 Standard Deviation
Σ is generally called the velocity covariance matrix.
We can still define a scalar-valued standard deviation σ. This is done by taking the mean of an inner product rather than an outer product,
$$\boldsymbol{\Sigma} \equiv \frac{1}{N} \sum_{n=0}^{N-1} \left( \mathbf{z}_n - \overline{\mathbf{z}} \right) \left( \mathbf{z}_n - \overline{\mathbf{z}} \right)^T, \qquad \sigma^2 \equiv \frac{1}{N} \sum_{n=0}^{N-1} \left( \mathbf{z}_n - \overline{\mathbf{z}} \right)^T \left( \mathbf{z}_n - \overline{\mathbf{z}} \right).$$
The squared velocity standard deviation $\sigma^2$ is related to the covariance matrix as the sum of the diagonal elements:
$$\sigma^2 = \Sigma_{uu} + \Sigma_{vv} = \sigma_u^2 + \sigma_v^2.$$
The sum of the diagonal elements of a matrix is known as the trace of the matrix, denoted tr. Thus $\sigma^2 = \mathrm{tr}\{\boldsymbol{\Sigma}\}$.
Note $\sigma^2$ is only a factor of two away from the eddy kinetic energy, $K = \frac{1}{2}\sigma^2$. Clearly we only need to use one of these quantities.
✧ 10 / 28 Eigenvalue Decomposition
For bivariate data zn , the second moment—the velocity covariance matrix—takes on a geometric aspect that can be highly informative.
We will show that the covariance matrix $\boldsymbol{\Sigma}$ can be written as
$$\boldsymbol{\Sigma} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} a^2 & 0 \\ 0 & b^2 \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$
or more compactly as $\boldsymbol{\Sigma} = \mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta)$, where we have introduced the diagonal matrix $\mathbf{D}(a,b)$ defined as
$$\mathbf{D}(a,b) \equiv \begin{bmatrix} a^2 & 0 \\ 0 & b^2 \end{bmatrix}.$$
This is the eigenvalue decomposition of the covariance matrix Σ.
Generally, the eigenvalue decomposition is found numerically, though for the 2 × 2 case this is not necessary because there are simple expressions for a, b, and θ, as will be shown later.
✧ 11 / 28 Diagonalization
The operation of the eigenvalue decomposition is to diagonalize the covariance matrix. In other words,
$$\boldsymbol{\Sigma} = \mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta)$$
implies that
$$\mathbf{J}^T(\theta)\,\boldsymbol{\Sigma}\,\mathbf{J}(\theta) = \mathbf{D}(a,b)$$
which means that if we rotate the observed velocities by −θ, we obtain an ellipse with its major axis oriented along the x-axis, and with no correlation between the x- and y-velocities.
These rotated velocities are given by
$$\tilde{\mathbf{z}} \equiv \begin{bmatrix} \tilde{u} \\ \tilde{v} \end{bmatrix} \equiv \mathbf{J}^T(\theta)\,\mathbf{z}$$
with $\tilde{u}$ being the component of the velocity along the major axis, and $\tilde{v}$ the component of the velocity along the minor axis.
✧ 12 / 28 Diagonalization
If we form the covariance matrix of the velocities rotated by the angle θ that comes out of the eigenvalue decomposition, we find
$$\tilde{\boldsymbol{\Sigma}} \equiv \frac{1}{N} \sum_{n=0}^{N-1} \left( \tilde{\mathbf{z}}_n - \overline{\tilde{\mathbf{z}}} \right) \left( \tilde{\mathbf{z}}_n - \overline{\tilde{\mathbf{z}}} \right)^T = \frac{1}{N} \sum_{n=0}^{N-1} \left( \mathbf{J}^T(\theta)\,\mathbf{z}_n - \mathbf{J}^T(\theta)\,\overline{\mathbf{z}} \right) \left( \mathbf{J}^T(\theta)\,\mathbf{z}_n - \mathbf{J}^T(\theta)\,\overline{\mathbf{z}} \right)^T$$
$$= \mathbf{J}^T(\theta) \left[ \frac{1}{N} \sum_{n=0}^{N-1} \left( \mathbf{z}_n - \overline{\mathbf{z}} \right) \left( \mathbf{z}_n - \overline{\mathbf{z}} \right)^T \right] \mathbf{J}(\theta) = \mathbf{J}^T(\theta) \left[ \mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta) \right] \mathbf{J}(\theta) = \mathbf{D}(a,b)$$
using $(\mathbf{A}\mathbf{z})^T = \mathbf{z}^T \mathbf{A}^T$. Thus the eigenvalue matrix $\mathbf{D}(a,b)$ is simply the covariance matrix computed in a rotated frame.
The eigenvalue decomposition has found the rotation for which the covariance between the rotated velocity components vanishes.
✧ 13 / 28 The Variance Ellipse
The covariance matrix describes an ellipse with major axis a and minor axis b, oriented at an angle θ with respect to the x-axis.
The usual equation for an ellipse with major axis a oriented along the u-axis and minor axis b oriented along the v-axis is
$$\frac{u^2}{a^2} + \frac{v^2}{b^2} = [\,u \;\; v\,] \begin{bmatrix} 1/a^2 & 0 \\ 0 & 1/b^2 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \mathbf{z}^T \mathbf{D}^{-1}(a,b)\,\mathbf{z} = 1$$
where the “−1” denotes the matrix inverse. Recall the inverse of a matrix $\mathbf{M}$ is defined to give $\mathbf{M}^{-1}\mathbf{M} = \mathbf{I}$. Thus
$$\mathbf{z}^T \boldsymbol{\Sigma}^{-1} \mathbf{z} = \mathbf{z}^T \left[ \mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta) \right]^{-1} \mathbf{z} = \left[ \mathbf{J}^T(\theta)\,\mathbf{z} \right]^T \mathbf{D}^{-1}(a,b) \left[ \mathbf{J}^T(\theta)\,\mathbf{z} \right] = 1$$
is the equation for an ellipse with semi-major axis a, semi-minor axis b, and oriented θ radians counterclockwise from the x-axis.
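A quick numerical check of the last two slides, in NumPy: build a covariance matrix from known (a, b, θ), confirm that rotating by −θ diagonalizes it, and confirm that points on the resulting ellipse satisfy $\mathbf{z}^T \boldsymbol{\Sigma}^{-1} \mathbf{z} = 1$. The parameter values below are arbitrary:

```python
import numpy as np

# A covariance matrix with known ellipse parameters: Sigma = J(theta) D(a,b) J^T(theta)
a, b, theta = 2.0, 1.0, np.pi / 6
J = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
D = np.diag([a**2, b**2])
Sigma = J @ D @ J.T

# Rotating by -theta diagonalizes Sigma: J^T Sigma J = D
assert np.allclose(J.T @ Sigma @ J, D)

# Points on the variance ellipse: z(phi) = J(theta) [a cos(phi), b sin(phi)]^T
phi = np.linspace(0, 2 * np.pi, 50)
ellipse = J @ np.stack([a * np.cos(phi), b * np.sin(phi)])

# Every point on the ellipse satisfies z^T Sigma^{-1} z = 1
quad_form = np.einsum('in,ij,jn->n', ellipse, np.linalg.inv(Sigma), ellipse)
assert np.allclose(quad_form, 1.0)
```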
✧ 14 / 28 The Variance Ellipse
Thus we have shown the covariance matrix Σ of a bivariate time series zn defines an ellipse that captures how the data is spread out about its mean value, as claimed.
✧ 15 / 28 Expressions for the Axes
Exact expressions can be found for a, b, and θ.
Here we introduce some new notation. tr{M} denotes the matrix trace, which is defined to be the sum of all diagonal elements of M. Similarly det{M} denotes the determinant.
For $\boldsymbol{\Sigma}$ we have $\mathrm{tr}\{\boldsymbol{\Sigma}\} = \sigma_u^2 + \sigma_v^2$ and $\det\{\boldsymbol{\Sigma}\} = \Sigma_{uu}\Sigma_{vv} - \Sigma_{uv}^2$.
The eigenvalues of $\boldsymbol{\Sigma}$ are given explicitly by
$$a^2 = \frac{1}{2}\,\mathrm{tr}\{\boldsymbol{\Sigma}\} + \frac{1}{2} \sqrt{\left[\mathrm{tr}\{\boldsymbol{\Sigma}\}\right]^2 - 4\det\{\boldsymbol{\Sigma}\}}$$
$$b^2 = \frac{1}{2}\,\mathrm{tr}\{\boldsymbol{\Sigma}\} - \frac{1}{2} \sqrt{\left[\mathrm{tr}\{\boldsymbol{\Sigma}\}\right]^2 - 4\det\{\boldsymbol{\Sigma}\}}$$
as can easily be shown by inserting the values for $\mathrm{tr}\{\boldsymbol{\Sigma}\}$ and $\det\{\boldsymbol{\Sigma}\}$.
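These closed-form expressions are straightforward to verify against a numerical eigenvalue decomposition; the sketch below uses an arbitrary symmetric, positive-definite $\boldsymbol{\Sigma}$:

```python
import numpy as np

# Any valid 2x2 covariance matrix (symmetric, positive definite)
Sigma = np.array([[3.0, 1.2],
                  [1.2, 1.5]])

tr, det = np.trace(Sigma), np.linalg.det(Sigma)
disc = np.sqrt(tr**2 - 4 * det)

# Closed-form eigenvalues a^2 >= b^2 from the trace and determinant
a2 = 0.5 * tr + 0.5 * disc
b2 = 0.5 * tr - 0.5 * disc

# Agrees with a numerical eigenvalue decomposition
eigs = np.linalg.eigvalsh(Sigma)          # ascending order: [b^2, a^2]
assert np.allclose([b2, a2], eigs)
```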
Born and Wolf (1959), Principles of Optics
Samson (1980), “Comments on polarization and coherence”
✧ 16 / 28 Expression for the Angle
To find the angle θ, we carry out the matrix multiplications, giving
$$\mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta) = \frac{1}{2}(a^2 + b^2)\,\mathbf{I} + \frac{1}{2}(a^2 - b^2) \begin{bmatrix} \cos 2\theta & \sin 2\theta \\ \sin 2\theta & -\cos 2\theta \end{bmatrix}$$
and we also rewrite the covariance matrix Σ in the form
$$\boldsymbol{\Sigma} = \begin{bmatrix} \Sigma_{uu} & \Sigma_{uv} \\ \Sigma_{uv} & \Sigma_{vv} \end{bmatrix} = \frac{1}{2}(\Sigma_{uu} + \Sigma_{vv})\,\mathbf{I} + \frac{1}{2} \begin{bmatrix} \Sigma_{uu} - \Sigma_{vv} & 2\Sigma_{uv} \\ 2\Sigma_{uv} & \Sigma_{vv} - \Sigma_{uu} \end{bmatrix}.$$
Equating terms in the anisotropic parts of these matrices leads to
$$\Sigma_{uu} - \Sigma_{vv} = (a^2 - b^2)\cos 2\theta, \qquad 2\Sigma_{uv} = (a^2 - b^2)\sin 2\theta$$
and dividing these two expressions, we find θ satisfies
$$\tan(2\theta) = \frac{2\Sigma_{uv}}{\Sigma_{uu} - \Sigma_{vv}}.$$
✧ 17 / 28 Isotropy and Polarization
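Putting the expressions for the axes and the angle together, a, b, and θ can be computed directly from $\boldsymbol{\Sigma}$ and then used to rebuild it. A NumPy sketch (using arctan2 rather than a bare arctangent so that the quadrant of 2θ comes out right; the numbers are arbitrary):

```python
import numpy as np

Sigma = np.array([[3.0, 1.2],
                  [1.2, 1.5]])

tr, det = np.trace(Sigma), np.linalg.det(Sigma)
a2 = 0.5 * tr + 0.5 * np.sqrt(tr**2 - 4 * det)
b2 = 0.5 * tr - 0.5 * np.sqrt(tr**2 - 4 * det)

# tan(2 theta) = 2 Sigma_uv / (Sigma_uu - Sigma_vv); arctan2 picks the right quadrant
theta = 0.5 * np.arctan2(2 * Sigma[0, 1], Sigma[0, 0] - Sigma[1, 1])

# Reconstruct Sigma = J(theta) D(a, b) J^T(theta)
J = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(J @ np.diag([a2, b2]) @ J.T, Sigma)
```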
The variance ellipse can alternately be decomposed into (i) directional variability in direction θ, with variance $a^2 - b^2$, plus (ii) purely isotropic or directionless variability, with variance $2b^2$.
✧ 18 / 28 The Polarization Ratio
To show this, let $\hat{\mathbf{n}} = [\cos\theta \;\; \sin\theta]^T$ be the unit vector pointing in the direction of the major axis. Then $\boldsymbol{\Sigma}$ may be written as
$$\boldsymbol{\Sigma} = \underbrace{(a^2 - b^2)\,\hat{\mathbf{n}}\hat{\mathbf{n}}^T}_{\text{polarized}} + \underbrace{b^2\,\mathbf{I}}_{\text{unpolarized}}$$
as may be readily verified using trigonometric identities.
The first component gives variability associated with the particular direction ˆn, while the second is associated with all directions. The first term is said to be “polarized” or anisotropic, while the second is said to be “unpolarized” or isotropic.
Because $\mathrm{tr}\{\hat{\mathbf{n}}\hat{\mathbf{n}}^T\} = 1$, the variance associated with the first term is $a^2 - b^2$. Because $\mathrm{tr}\{\mathbf{I}\} = 2$, that associated with the second is $2b^2$.
The polarization ratio $P \equiv (a^2 - b^2)/(a^2 + b^2)$ tells the extent to which the variability is organized along a particular direction.
✧ 19 / 28 Three Dimensions
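The polarized/unpolarized split and the polarization ratio are also easy to check numerically; a small NumPy sketch with arbitrary ellipse parameters:

```python
import numpy as np

a2, b2, theta = 4.0, 1.0, np.pi / 3
nhat = np.array([np.cos(theta), np.sin(theta)])   # major-axis unit vector

J = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Sigma = J @ np.diag([a2, b2]) @ J.T

# Polarized (directional) plus unpolarized (isotropic) parts
polarized = (a2 - b2) * np.outer(nhat, nhat)
unpolarized = b2 * np.eye(2)
assert np.allclose(Sigma, polarized + unpolarized)

# Polarization ratio P = (a^2 - b^2) / (a^2 + b^2); here P = 3/5
P = (a2 - b2) / (a2 + b2)
assert np.isclose(P, 0.6)
```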
What if the vector $\mathbf{z}_n$ has three components, $\mathbf{z}_n = [\,u \;\; v \;\; w\,]^T$?
This is no different. The general form for an eigenvalue decomposition of a real-valued N × N matrix is
$$\boldsymbol{\Sigma} = \mathbf{U}\,\mathbf{D}\,\mathbf{U}^T$$
where $\mathbf{U}$ is an orthogonal matrix (essentially, a generalized rotation), with $\mathbf{U}^T\mathbf{U} = \mathbf{U}\mathbf{U}^T = \mathbf{I}$, and $\mathbf{D}$ is a diagonal matrix of eigenvalues.
For any symmetric matrix, the eigenvalues are real-valued. For a covariance matrix such as Σ in particular, the eigenvalues are also non-negative. A matrix having such a property is said to be positive semidefinite.
✧ 20 / 28 Three Dimensions
In three dimensions, it is relatively easy to show that this becomes
$$\boldsymbol{\Sigma} = \mathbf{J}(\alpha, \beta, \theta)\,\mathbf{D}(a, b, c)\,\mathbf{J}^T(\alpha, \beta, \theta)$$
where $\mathbf{J}(\alpha, \beta, \theta)$ is a three-dimensional rotation matrix and
$$\mathbf{D}(a, b, c) \equiv \begin{bmatrix} a^2 & 0 & 0 \\ 0 & b^2 & 0 \\ 0 & 0 & c^2 \end{bmatrix}.$$
The interpretation is that the covariance matrix Σ describes an ellipsoid in three dimensions, with major axis a, first semi-minor axis b, and second semi-minor axis c, oriented in space as described by the rotation matrix J(α, β, θ).
✧ 21 / 28 Covariance Invariances
The covariance matrix Σ has several important invariances. We will look at these in a general way, enabling us to connect the 2 × 2 case we have been working with to the case of any number of dimensions.
✧ 22 / 28 Determinant Invariance
The first invariance is the determinant of the covariance matrix. We have seen that $\det\{\boldsymbol{\Sigma}\} = a^2 b^2 = A^2/\pi^2$ where A is the ellipse area.
Intuitively, we expect we should be able to rotate the coordinate system without changing the ellipse area.
This invariance relates to a property of the matrix determinant. If $\mathbf{A}$ and $\mathbf{B}$ are any two square matrices of the same size, then
$$\det\{\mathbf{A}\mathbf{B}\} = \det\{\mathbf{A}\}\,\det\{\mathbf{B}\}.$$
It follows that, for any angle ϑ, the rotated covariance matrix $\tilde{\boldsymbol{\Sigma}} \equiv \mathbf{J}(\vartheta)\,\boldsymbol{\Sigma}\,\mathbf{J}^T(\vartheta)$ has the same determinant as the original matrix,
$$\det\{\tilde{\boldsymbol{\Sigma}}\} = \det\{\mathbf{J}(\vartheta)\}\,\det\{\boldsymbol{\Sigma}\}\,\det\{\mathbf{J}^T(\vartheta)\} = \det\{\boldsymbol{\Sigma}\}$$
using the fact that we have previously verified $\det\{\mathbf{J}(\vartheta)\} = 1$.
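This determinant invariance is easy to confirm numerically; the sketch below rotates an arbitrary covariance matrix through a range of angles:

```python
import numpy as np

Sigma = np.array([[3.0, 1.2],
                  [1.2, 1.5]])

for vartheta in np.linspace(0, 2 * np.pi, 9):
    J = np.array([[np.cos(vartheta), -np.sin(vartheta)],
                  [np.sin(vartheta),  np.cos(vartheta)]])
    rotated = J @ Sigma @ J.T
    # det{J} = 1, so rotation leaves the determinant (the ellipse area) unchanged
    assert np.isclose(np.linalg.det(rotated), np.linalg.det(Sigma))
```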
✧ 23 / 28 Trace Invariance
The second invariance is the trace of the covariance matrix. We saw that $\mathrm{tr}\{\boldsymbol{\Sigma}\} = \sigma_u^2 + \sigma_v^2 = 2K$ where K is the eddy kinetic energy.
Intuitively, we should be able to rotate the coordinate system without changing the kinetic energy.
This invariance relates to a property of the matrix trace. If $\mathbf{U}$ is any invertible matrix, such that $\mathbf{U}^{-1}$ exists with $\mathbf{U}^{-1}\mathbf{U} = \mathbf{I}$, then
$$\tilde{\boldsymbol{\Sigma}} \equiv \mathbf{U}\,\boldsymbol{\Sigma}\,\mathbf{U}^{-1}$$
is said to be a similarity transformation of $\boldsymbol{\Sigma}$. The trace has the property that
$$\mathrm{tr}\{\boldsymbol{\Sigma}\} = \mathrm{tr}\{\mathbf{U}\,\boldsymbol{\Sigma}\,\mathbf{U}^{-1}\} = \mathrm{tr}\{\tilde{\boldsymbol{\Sigma}}\}$$
which is called similarity invariance. Since the 2 × 2 rotation matrix $\mathbf{J}(\vartheta)$ is invertible for any angle ϑ, the trace is not changed under rotations. Note it is not the case that $\mathrm{tr}\{\mathbf{A}\mathbf{B}\} = \mathrm{tr}\{\mathbf{A}\}\,\mathrm{tr}\{\mathbf{B}\}$!
✧ 24 / 28 Generalization to N > 2
Let's say that instead of two signal components, u and v, we have many, $\mathbf{z} \equiv [\,z_1 \;\; z_2 \;\; \cdots \;\; z_N\,]^T$.
A common application is where the zn are taken from a spatial grid, which we re-organize into a single column vector for convenience.
We then form the covariance matrix and take its eigenvalue decomposition as before
$$\boldsymbol{\Sigma} \equiv \frac{1}{N} \sum_{n=0}^{N-1} \left( \mathbf{z}_n - \overline{\mathbf{z}} \right) \left( \mathbf{z}_n - \overline{\mathbf{z}} \right)^T = \mathbf{U}\,\mathbf{D}\,\mathbf{U}^T$$
but where now $\mathbf{U}$ is an orthogonal matrix (a “generalized rotation”), with $\mathbf{U}^T\mathbf{U} = \mathbf{U}\mathbf{U}^T = \mathbf{I}$, and $\mathbf{D}$ is a diagonal matrix of eigenvalues.
In this N-dimensional case, $\mathrm{tr}\{\boldsymbol{\Sigma}\}/N$ is the average variance across the N components, while $\sqrt{\det\{\boldsymbol{\Sigma}\}}$ is a measure of the volume enclosed by the variability.
✧ 25 / 28 Generalization to N > 2
Transforming the original variables as
$$\tilde{\mathbf{z}} \equiv [\,\tilde{z}_1 \;\; \tilde{z}_2 \;\; \cdots \;\; \tilde{z}_N\,]^T \equiv \mathbf{U}^T \mathbf{z}$$
we obtain $\tilde{z}_1$ as the orthogonal transformation (a linear combination preserving the vector length $\|\mathbf{z}\|$) of the original time series having the maximum variance.
Another way to say this is that $\tilde{z}_1$ is the time series along the “generalized direction” that maximizes variance, $\tilde{z}_2$ is the time series along the generalized direction that maximizes variance and that is also orthogonal to the direction of $\tilde{z}_1$, etc.
These are the principal components. The columns of $\mathbf{U}$ correspond to spatial patterns if we rearrange them back onto the grid. The magnitudes of these components are the elements of the diagonal matrix $\mathbf{D}$: $d_1, d_2, \ldots, d_N$. That's EOF analysis in a nutshell.
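As a minimal illustration of EOF analysis in this sense, the NumPy sketch below builds a synthetic multivariate record dominated by a single made-up spatial pattern, forms its covariance matrix, and extracts principal components via the eigenvalue decomposition:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 4, 5000          # N grid points ("signal components"), T time steps

# Synthetic multivariate record: one dominant spatial pattern plus weak noise
pattern = np.array([1.0, 0.5, -0.5, -1.0])
data = np.outer(pattern, rng.standard_normal(T)) + 0.1 * rng.standard_normal((N, T))

anom = data - data.mean(axis=1, keepdims=True)
Sigma = anom @ anom.T / T

# Eigenvalue decomposition Sigma = U D U^T, with eigenvalues sorted descending
d, U = np.linalg.eigh(Sigma)
d, U = d[::-1], U[:, ::-1]

# Principal components: the original series projected onto the "generalized directions"
pcs = U.T @ anom

# The leading PC carries the most variance, and the PCs are mutually uncorrelated
assert pcs[0].var() >= pcs[1].var()
assert np.allclose(pcs @ pcs.T / T, np.diag(d), atol=1e-10)
```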
The variance ellipse is the geometric expression of a 2 × 2 EOF analysis.
✧ 26 / 28 Homework
1. Review the notes.
2. For your variance ellipse (or bravo or m1244), verify that using a, b, and θ output by specdiag in $\boldsymbol{\Sigma} = \mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta)$ does indeed recover the covariance matrix $\boldsymbol{\Sigma}$ that you started with. You can use jmat2 to form the rotation matrix if you like.
3. Numerically find the rotation such that the real part of your complex velocity has the largest variance. Verify that this matches θ as output by specdiag.
4. Multiply out the expansion $\boldsymbol{\Sigma} = \mathbf{J}(\theta)\,\mathbf{D}(a,b)\,\mathbf{J}^T(\theta)$ to verify that
$$\boldsymbol{\Sigma} = \frac{1}{2} \begin{bmatrix} (a^2+b^2) + (a^2-b^2)\cos 2\theta & (a^2-b^2)\sin 2\theta \\ (a^2-b^2)\sin 2\theta & (a^2+b^2) - (a^2-b^2)\cos 2\theta \end{bmatrix}$$
and note that this matches the expression at the top of page 17.
5. Find the trace and determinant of this matrix.
6. From these, verify the formula for $a^2$ and $b^2$ on page 16. This shows how the ellipse axes are found from the components of $\boldsymbol{\Sigma}$.
7. Verify the polarization expansion of $\boldsymbol{\Sigma}$ at the top of page 19.
✧ 27 / 28 Pop Quiz!
1. If $\mathbf{J}(\theta)$ is the 2 × 2 rotation matrix, what is $\mathbf{J}(\theta)\,\mathbf{J}(\theta)\,\mathbf{J}^T(\theta)\,\mathbf{J}(\theta)\,\mathbf{J}^T(\theta)\,\mathbf{J}(-\theta)\,\mathbf{J}(-\theta)\,\mathbf{J}^T(-\theta)$?
2. What are the trace and determinant of $\mathbf{J}(\theta)$? Of the 2 × 2 identity matrix $\mathbf{I}$?
3. What is the rotation matrix about the z-axis in three dimensions?
4. A mooring observes a velocity distribution with a mean velocity of $[\,u \;\; v\,]^T = [\,1 \;\; 0\,]^T$ and a variance ellipse with a = 1/2, b = 1/2, and θ = 0. Draw (possible) contours of its velocity distribution on the (u, v) plane.
5. A mooring observes a velocity distribution with a mean velocity of $[\,u \;\; v\,]^T = [\,1 \;\; -1\,]^T$ and a variance ellipse with $a = \sqrt{2}$, $b = \sqrt{2}/2$, and θ = 45 degrees. Draw (possible) contours of its velocity distribution on the (u, v) plane.
6. A mooring sits at a location where the bathymetry shoals to the north. The currents are purely eastward and the variance ellipse is isotropic. Draw a possible mean temperature distribution on the (u, v) plane that could accomplish an onshore heat flux.
✧ 28 / 28