Chapter 1 Some Results on Linear Algebra, Matrix Theory and Distributions
We need some basic knowledge to understand the topics in the analysis of variance.
Vectors: A vector $Y$ is an ordered $n$-tuple of real numbers. A vector can be expressed as a row vector or a column vector:
$Y = (y_1, y_2, \ldots, y_n)'$ is a column vector of order $n \times 1$, and
$Y' = (y_1, y_2, \ldots, y_n)$ is a row vector of order $1 \times n$.
If $y_i = 0$ for all $i = 1, 2, \ldots, n$, then $Y' = (0, 0, \ldots, 0)$ is called the null vector.
If $X = (x_1, x_2, \ldots, x_n)'$, $Y = (y_1, y_2, \ldots, y_n)'$ and $Z = (z_1, z_2, \ldots, z_n)'$ are vectors and $k$ is a scalar, then
$X + Y = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)'$, $kY = (ky_1, ky_2, \ldots, ky_n)'$,
$X + (Y + Z) = (X + Y) + Z$,
$X'(Y + Z) = X'Y + X'Z$,
$k(X'Y) = (kX)'Y = X'(kY)$,
$k(X + Y) = kX + kY$,
$X'Y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$.
Analysis of Variance | Chapter 1 | Linear Algebra, Matrix Theory and Dist. | Shalabh, IIT Kanpur

Orthogonal vectors: Two vectors $X$ and $Y$ are said to be orthogonal if $X'Y = Y'X = 0$.
The null vector is orthogonal to every vector X and is the only such vector.
Linear combination:
If $x_1, x_2, \ldots, x_m$ are $m$ vectors and $k_1, k_2, \ldots, k_m$ are $m$ scalars, then $t = \sum_{i=1}^{m} k_i x_i$ is called a linear combination of $x_1, x_2, \ldots, x_m$.
Linear independence
Vectors $X_1, X_2, \ldots, X_m$ are said to be linearly independent if $\sum_{i=1}^{m} k_i X_i = 0$ implies $k_i = 0$ for all $i = 1, 2, \ldots, m$.
If there exist scalars $k_1, k_2, \ldots, k_m$, with at least one $k_i$ nonzero, such that $\sum_{i=1}^{m} k_i x_i = 0$, then $x_1, x_2, \ldots, x_m$ are said to be linearly dependent. Any set of vectors containing the null vector is linearly dependent. Any set of non-null pairwise orthogonal vectors is linearly independent. If $m > 1$ vectors are linearly dependent, it is always possible to express at least one of them as a linear combination of the others.
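The rank criterion for linear dependence can be checked numerically — a minimal sketch with NumPy, using hypothetical vectors (the number of linearly independent vectors equals the rank of the matrix whose columns are those vectors):

```python
import numpy as np

x1 = np.array([1.0, 0.0, 2.0])
x2 = np.array([0.0, 1.0, 1.0])
x3 = x1 + 2 * x2          # linearly dependent on x1 and x2 by construction

# Stack the vectors as columns; the matrix rank counts the
# linearly independent vectors among them.
A = np.column_stack([x1, x2, x3])
r = np.linalg.matrix_rank(A)
print(r)  # 2: only two of the three vectors are linearly independent
```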
Linear function:
Let $K = (k_1, k_2, \ldots, k_m)'$ be an $m \times 1$ vector of scalars and $X = (x_1, x_2, \ldots, x_m)'$ an $m \times 1$ vector of variables. Then $K'X = \sum_{i=1}^{m} k_i x_i$ is called a linear function or linear form, and $K$ is called the coefficient vector. For example, the mean of $x_1, x_2, \ldots, x_m$ can be expressed as
$\bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i = \frac{1}{m} 1_m' X,$
where $1_m$ is an $m \times 1$ vector with all elements unity.

Contrast:
The linear function $K'X = \sum_{i=1}^{m} k_i x_i$ is called a contrast in $x_1, x_2, \ldots, x_m$ if $\sum_{i=1}^{m} k_i = 0$. For example, the linear functions $x_1 - x_2$, $x_1 + x_2 - 2x_3$ and $x_2 - x_3$ are contrasts.
A linear function $K'X$ is a contrast if and only if it is orthogonal to the linear function $\sum_{i=1}^{m} x_i$, or equivalently to the linear function $\bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i$.
The contrasts $x_1 - x_2, x_1 - x_3, \ldots, x_1 - x_j$ are linearly independent for all $j = 2, 3, \ldots, m$.
Every contrast in $x_1, x_2, \ldots, x_m$ can be written as a linear combination of the $(m - 1)$ contrasts $x_1 - x_2, x_1 - x_3, \ldots, x_1 - x_m$.
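The defining property of a contrast and its orthogonality to the mean can be verified numerically — a sketch with a hypothetical coefficient vector:

```python
import numpy as np

m = 4
k = np.array([1.0, -1.0, 0.0, 0.0])   # coefficients of the contrast x1 - x2
ones = np.ones(m)

# A linear function k'x is a contrast when its coefficients sum to zero,
# and every contrast is orthogonal to the coefficient vector (1/m) 1_m of the mean.
is_contrast = np.isclose(k.sum(), 0.0)
orthogonal_to_mean = np.isclose(k @ (ones / m), 0.0)
print(is_contrast, orthogonal_to_mean)  # True True
```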
Matrix: A matrix is a rectangular array of real numbers. For example,
$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$
is a matrix of order $m \times n$ with $m$ rows and $n$ columns. If $m = n$, then $A$ is called a square matrix.
If $m = n$ and $a_{ij} = 0$ for all $i \neq j$, then $A$ is a diagonal matrix and is denoted as $A = \mathrm{diag}(a_{11}, a_{22}, \ldots, a_{mm})$.
If $m = n$ (square matrix) and $a_{ij} = 0$ for $i > j$, then $A$ is called an upper triangular matrix. On the other hand, if $m = n$ and $a_{ij} = 0$ for $i < j$, then $A$ is called a lower triangular matrix.
If $A$ is an $m \times n$ matrix, then the matrix obtained by writing the rows of $A$ as columns and the columns of $A$ as rows is called the transpose of $A$ and is denoted $A'$. If $A = A'$, then $A$ is a symmetric matrix. If $A = -A'$, then $A$ is a skew-symmetric matrix. A matrix whose elements are all equal to zero is called a null matrix.
An identity matrix is a square matrix of order $p$ whose diagonal elements are unity (ones) and whose off-diagonal elements are zero. It is denoted $I_p$.
If $A$ and $B$ are matrices of order $m \times n$, then $(A + B)' = A' + B'$.
If $A$ and $B$ are matrices of order $m \times n$ and $n \times p$ respectively and $k$ is any scalar, then $(AB)' = B'A'$ and $(kA)B = A(kB) = k(AB) = kAB$.
If $A$ is $m \times n$ and $B$ and $C$ are $n \times p$, then $A(B + C) = AB + AC$.
If $A$ is $m \times n$, $B$ is $n \times p$ and $C$ is $p \times q$, then $A(BC) = (AB)C$.
If $A$ is $m \times n$, then $I_m A = A I_n = A$.
Trace of a matrix:
The trace of an $n \times n$ matrix $A$, denoted $\mathrm{tr}(A)$ or $\mathrm{trace}(A)$, is defined to be the sum of the diagonal elements of $A$, i.e. $\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii}$.
If $A$ is of order $m \times n$ and $B$ is of order $n \times m$, then $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.
If $A$ is an $n \times n$ matrix and $P$ is any nonsingular $n \times n$ matrix, then $\mathrm{tr}(A) = \mathrm{tr}(P^{-1}AP)$. If $P$ is an orthogonal matrix, then $\mathrm{tr}(A) = \mathrm{tr}(P'AP)$.
If $A$ and $B$ are $n \times n$ matrices and $a$ and $b$ are scalars, then $\mathrm{tr}(aA + bB) = a\,\mathrm{tr}(A) + b\,\mathrm{tr}(B)$.
If $A$ is an $m \times n$ matrix, then $\mathrm{tr}(A'A) = \mathrm{tr}(AA') = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2$, and $\mathrm{tr}(A'A) = \mathrm{tr}(AA') = 0$ if and only if $A = 0$.
If $A$ is an $n \times n$ matrix, then $\mathrm{tr}(A') = \mathrm{tr}(A)$.
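The trace identities above can be checked numerically — a minimal sketch with a hypothetical random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # arbitrary 3 x 5 matrix
B = rng.standard_normal((5, 3))

# tr(AB) = tr(BA) even though AB is 3x3 and BA is 5x5
lhs = np.trace(A @ B)
rhs = np.trace(B @ A)

# tr(A'A) = tr(AA') = sum of squared elements of A
ss = np.trace(A.T @ A)
print(np.isclose(lhs, rhs), np.isclose(ss, (A ** 2).sum()))  # True True
```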
Rank of matrices
The rank of an $m \times n$ matrix $A$ is the number of linearly independent rows in $A$. A square matrix of order $m$ is called nonsingular if it has full rank. Let $B$ be any other matrix of order $n \times q$. Then:
$\mathrm{rank}(AB) \leq \min(\mathrm{rank}(A), \mathrm{rank}(B))$;
$\mathrm{rank}(A + B) \leq \mathrm{rank}(A) + \mathrm{rank}(B)$;
$\mathrm{rank}(A)$ is equal to the maximum order of all nonsingular square submatrices of $A$;
$\mathrm{rank}(A'A) = \mathrm{rank}(AA') = \mathrm{rank}(A) = \mathrm{rank}(A')$.
$A$ is of full row rank if $\mathrm{rank}(A) = m < n$; $A$ is of full column rank if $\mathrm{rank}(A) = n < m$.
Inverse of a matrix
The inverse of a square matrix $A$ of order $m$ is a square matrix of order $m$, denoted $A^{-1}$, such that $A^{-1}A = AA^{-1} = I_m$.
The inverse of $A$ exists if and only if $A$ is nonsingular.
$(A^{-1})^{-1} = A$.
If $A$ is nonsingular, then $(A')^{-1} = (A^{-1})'$.
If $A$ and $B$ are nonsingular matrices of the same order, then their product is also nonsingular and $(AB)^{-1} = B^{-1}A^{-1}$.
Idempotent matrix:
A square matrix $A$ is called idempotent if $A^2 = AA = A$.
If $A$ is an $n \times n$ idempotent matrix with $\mathrm{rank}(A) = r \leq n$, then:
the eigenvalues of $A$ are 1 or 0;
$\mathrm{trace}(A) = \mathrm{rank}(A) = r$;
if $A$ is of full rank $n$, then $A = I_n$.
If $A$ and $B$ are idempotent and $AB = BA$, then $AB$ is also idempotent. If $A$ is idempotent, then $(I - A)$ is also idempotent and $A(I - A) = (I - A)A = 0$.
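A standard example of an idempotent matrix in the analysis of variance is the projection ("hat") matrix $H = X(X'X)^{-1}X'$. A sketch with a hypothetical design matrix verifies the properties listed above:

```python
import numpy as np

# Hypothetical 4 x 2 design matrix of full column rank
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
H = X @ np.linalg.inv(X.T @ X) @ X.T   # projection matrix onto the columns of X

print(np.allclose(H @ H, H))           # True: H is idempotent
print(np.isclose(np.trace(H), 2.0))    # True: trace = rank = number of columns
eigs = np.linalg.eigvalsh(H)
print(np.allclose(np.sort(eigs), [0, 0, 1, 1]))  # True: eigenvalues are 0 or 1
```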
Quadratic forms:
If $A$ is a given matrix of order $m \times n$ and $X$ and $Y$ are two given vectors of order $m \times 1$ and $n \times 1$ respectively, then
$X'AY = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} x_i y_j,$
where the $a_{ij}$ are the nonstochastic elements of $A$.
If $A$ is a square matrix of order $m$ and $X = Y$, then
$X'AX = a_{11}x_1^2 + \cdots + a_{mm}x_m^2 + (a_{12} + a_{21})x_1 x_2 + \cdots + (a_{m-1,m} + a_{m,m-1})x_{m-1}x_m.$
If $A$ is also symmetric, then
$X'AX = a_{11}x_1^2 + \cdots + a_{mm}x_m^2 + 2a_{12}x_1 x_2 + \cdots + 2a_{m-1,m}x_{m-1}x_m = \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} x_i x_j$
is called a quadratic form in the $m$ variables $x_1, x_2, \ldots, x_m$, or a quadratic form in $X$. To every quadratic form corresponds a symmetric matrix and vice versa; the matrix $A$ is called the matrix of the quadratic form. The quadratic form $X'AX$ and the matrix $A$ of the form are called:
positive definite if $X'AX > 0$ for all $X \neq 0$;
positive semidefinite if $X'AX \geq 0$ for all $X \neq 0$;
negative definite if $X'AX < 0$ for all $X \neq 0$;
negative semidefinite if $X'AX \leq 0$ for all $X \neq 0$.
If $A$ is a positive semidefinite matrix, then $a_{ii} \geq 0$, and if $a_{ii} = 0$ then $a_{ij} = 0$ and $a_{ji} = 0$ for all $j$.
If $P$ is any nonsingular matrix and $A$ is any positive definite (or positive semidefinite) matrix, then $P'AP$ is also positive definite (or positive semidefinite).
A matrix $A$ is positive definite if and only if there exists a nonsingular matrix $P$ such that $A = PP'$. A positive definite matrix is nonsingular.
If $A$ is an $m \times n$ matrix with $\mathrm{rank}(A) = m < n$, then $AA'$ is positive definite and $A'A$ is positive semidefinite. If $A$ is $m \times n$ with $\mathrm{rank}(A) = k < m < n$, then both $A'A$ and $AA'$ are positive semidefinite.
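For a symmetric matrix, definiteness can be read off the eigenvalues (all positive for positive definite, all non-negative for positive semidefinite). A sketch with a hypothetical full-row-rank matrix illustrates the last statement:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 4))   # 2 x 4, full row rank with probability 1

AAt = A @ A.T     # 2 x 2: positive definite
AtA = A.T @ A     # 4 x 4, rank 2: positive semidefinite only

print(np.all(np.linalg.eigvalsh(AAt) > 0))   # True: all eigenvalues positive
eigs = np.linalg.eigvalsh(AtA)
# All eigenvalues non-negative (up to rounding), and some are zero
print(np.all(eigs > -1e-10) and np.any(eigs < 1e-10))  # True
```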
Simultaneous linear equations
A set of $m$ linear equations in $n$ unknowns $x_1, x_2, \ldots, x_n$, with known scalars $a_{ij}$ and $b_i$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, of the form
$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2$
$\vdots$
$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m$
can be formulated as $AX = b$, where $A = (a_{ij})$ is the $m \times n$ real matrix of known scalars called the coefficient matrix, $X = (x_1, x_2, \ldots, x_n)'$ is the $n \times 1$ vector of variables, and $b = (b_1, b_2, \ldots, b_m)'$ is the $m \times 1$ real vector of known scalars.
If $A$ is an $n \times n$ nonsingular matrix, then $AX = b$ has a unique solution.
Let $B = [A, b]$ be the augmented matrix. A solution to $AX = b$ exists if and only if $\mathrm{rank}(A) = \mathrm{rank}(B)$. If $A$ is an $m \times n$ matrix of rank $m$, then $AX = b$ has a solution. The linear homogeneous system $AX = 0$ has a solution other than $X = 0$ if and only if $\mathrm{rank}(A) < n$. If $AX = b$ is consistent, then it has a unique solution if and only if $\mathrm{rank}(A) = n$.
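The rank test for consistency can be sketched numerically with hypothetical matrices:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [4.0, 6.0]])          # third row = first row + second row, so rank 2
b_consistent = np.array([3.0, 7.0, 10.0])    # lies in the column space of A
b_inconsistent = np.array([3.0, 7.0, 11.0])  # does not

def consistent(A, b):
    # A solution exists if and only if rank(A) == rank([A, b])
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(np.column_stack([A, b]))

print(consistent(A, b_consistent))    # True
print(consistent(A, b_inconsistent))  # False
```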
If $a_{ii}$ is the $i$th diagonal element of an orthogonal matrix, then $-1 \leq a_{ii} \leq 1$.
Let the $n \times n$ matrix $A$ be partitioned as $A = [a_1, a_2, \ldots, a_n]$, where $a_i$ is the $n \times 1$ vector of elements of the $i$th column of $A$. A necessary and sufficient condition for $A$ to be an orthogonal matrix is given by the following:
(i) $a_i' a_i = 1$ for $i = 1, 2, \ldots, n$;
(ii) $a_i' a_j = 0$ for $i \neq j = 1, 2, \ldots, n$.
Orthogonal matrix
A square matrix $A$ is called an orthogonal matrix if $AA' = A'A = I$, or equivalently if $A^{-1} = A'$.
An orthogonal matrix is nonsingular. If $A$ is orthogonal, then $A'$ is also orthogonal.
If $A$ is an $n \times n$ matrix and $P$ is an $n \times n$ orthogonal matrix, then the determinants of $A$ and $P'AP$ are the same.
Random vectors:
Let $Y_1, Y_2, \ldots, Y_n$ be $n$ random variables; then $Y = (Y_1, Y_2, \ldots, Y_n)'$ is called a random vector. The mean vector of $Y$ is
$E(Y) = (E(Y_1), E(Y_2), \ldots, E(Y_n))'.$
The covariance matrix or dispersion matrix of $Y$ is
$\mathrm{Var}(Y) = \begin{pmatrix} \mathrm{Var}(Y_1) & \mathrm{Cov}(Y_1, Y_2) & \cdots & \mathrm{Cov}(Y_1, Y_n) \\ \mathrm{Cov}(Y_2, Y_1) & \mathrm{Var}(Y_2) & \cdots & \mathrm{Cov}(Y_2, Y_n) \\ \vdots & \vdots & & \vdots \\ \mathrm{Cov}(Y_n, Y_1) & \mathrm{Cov}(Y_n, Y_2) & \cdots & \mathrm{Var}(Y_n) \end{pmatrix},$
which is a symmetric matrix.
If $Y_1, Y_2, \ldots, Y_n$ are independently distributed, then the covariance matrix is a diagonal matrix. If, in addition, $\mathrm{Var}(Y_i) = \sigma^2$ for all $i = 1, 2, \ldots, n$, then $\mathrm{Var}(Y) = \sigma^2 I_n$.
Linear function of random variables:
If $Y_1, Y_2, \ldots, Y_n$ are $n$ random variables and $k_1, k_2, \ldots, k_n$ are scalars, then $\sum_{i=1}^{n} k_i Y_i$ is called a linear function of the random variables $Y_1, Y_2, \ldots, Y_n$.
If $Y = (Y_1, Y_2, \ldots, Y_n)'$ and $K = (k_1, k_2, \ldots, k_n)'$, then $K'Y = \sum_{i=1}^{n} k_i Y_i$;
the mean of $K'Y$ is $E(K'Y) = K'E(Y) = \sum_{i=1}^{n} k_i E(Y_i)$, and
the variance of $K'Y$ is $\mathrm{Var}(K'Y) = K'\,\mathrm{Var}(Y)\,K$.
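The formulas $E(K'Y) = K'E(Y)$ and $\mathrm{Var}(K'Y) = K'\,\mathrm{Var}(Y)\,K$ can be checked by simulation — a sketch with a hypothetical mean vector and covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
K = np.array([1.0, -1.0, 2.0])

# Simulate many draws of Y and form the linear function K'Y for each draw
Y = rng.multivariate_normal(mu, Sigma, size=200_000)
samples = Y @ K

print(samples.mean())     # close to K'mu = 1 - 2 + 6 = 5
print(samples.var())      # close to K' Sigma K
print(K @ Sigma @ K)
```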
Multivariate normal distribution
A random vector $Y = (Y_1, Y_2, \ldots, Y_n)'$ has a multivariate normal distribution with mean vector $\mu = (\mu_1, \mu_2, \ldots, \mu_n)'$ and dispersion matrix $\Sigma$ if its probability density function is
$f(Y \mid \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(Y - \mu)'\Sigma^{-1}(Y - \mu)\right],$
assuming $\Sigma$ is a nonsingular matrix.
Chi-square distribution
If $Y_1, Y_2, \ldots, Y_k$ are independently distributed normal random variables with common mean 0 and common variance 1, then the distribution of $\sum_{i=1}^{k} Y_i^2$ is called the $\chi^2$-distribution with $k$ degrees of freedom. The probability density function of the $\chi^2$-distribution with $k$ degrees of freedom is given as
$f(x) = \frac{1}{\Gamma(k/2)\, 2^{k/2}} \, x^{\frac{k}{2} - 1} \exp\left(-\frac{x}{2}\right), \quad 0 < x < \infty.$
If $Y_1, Y_2, \ldots, Y_k$ are independently distributed following the normal distribution with common mean 0 and common variance $\sigma^2$, then $\frac{1}{\sigma^2} \sum_{i=1}^{k} Y_i^2$ has the $\chi^2$ distribution with $k$ degrees of freedom.
If the random variables $Y_1, Y_2, \ldots, Y_k$ are independently normally distributed with non-null means $\mu_1, \mu_2, \ldots, \mu_k$ but common variance 1, then the distribution of $\sum_{i=1}^{k} Y_i^2$ is the noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \sum_{i=1}^{k} \mu_i^2$.
If $Y_1, Y_2, \ldots, Y_k$ are independently normally distributed with means $\mu_1, \mu_2, \ldots, \mu_k$ but common variance $\sigma^2$, then $\frac{1}{\sigma^2} \sum_{i=1}^{k} Y_i^2$ has the noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{\sigma^2} \sum_{i=1}^{k} \mu_i^2$.
If $U$ has a chi-square distribution with $k$ degrees of freedom, then $E(U) = k$ and $\mathrm{Var}(U) = 2k$. If $U$ has a noncentral chi-square distribution with $k$ degrees of freedom and noncentrality parameter $\lambda$, then $E(U) = k + \lambda$ and $\mathrm{Var}(U) = 2k + 4\lambda$.
If $U_1, U_2, \ldots, U_k$ are independently distributed random variables, each $U_i$ having a noncentral chi-square distribution with $n_i$ degrees of freedom and noncentrality parameter $\lambda_i$, $i = 1, 2, \ldots, k$, then $\sum_{i=1}^{k} U_i$ has a noncentral chi-square distribution with $\sum_{i=1}^{k} n_i$ degrees of freedom and noncentrality parameter $\sum_{i=1}^{k} \lambda_i$.
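The moment formulas $E(U) = k + \lambda$ and $\mathrm{Var}(U) = 2k + 4\lambda$ for the noncentral chi-square distribution can be checked by simulation — a sketch with hypothetical means:

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([0.5, 1.0, -0.5])   # hypothetical means of the normal variables
k = len(mu)
lam = np.sum(mu ** 2)             # noncentrality parameter = 1.5

# U = sum of squares of independent N(mu_i, 1) variables
Y = rng.standard_normal((500_000, k)) + mu
U = (Y ** 2).sum(axis=1)

print(U.mean())   # close to k + lambda = 4.5
print(U.var())    # close to 2k + 4*lambda = 12
```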
Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution with mean vector $\mu$ and positive definite covariance matrix $\Sigma$. Then $X'AX$ is distributed as noncentral $\chi^2$ with $k$ degrees of freedom if and only if $A\Sigma$ is an idempotent matrix of rank $k$.
Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution with mean vector $\mu$ and positive definite covariance matrix $\Sigma$, and consider two quadratic forms:
$X'A_1X$, distributed as $\chi^2$ with $n_1$ degrees of freedom and noncentrality parameter $\mu'A_1\mu$, and
$X'A_2X$, distributed as $\chi^2$ with $n_2$ degrees of freedom and noncentrality parameter $\mu'A_2\mu$.
Then $X'A_1X$ and $X'A_2X$ are independently distributed if $A_1\Sigma A_2 = 0$.

t-distribution
If $X$ has a normal distribution with mean 0 and variance 1, $Y$ has a $\chi^2$ distribution with $n$ degrees of freedom, and $X$ and $Y$ are independent random variables, then the distribution of the statistic $T = \frac{X}{\sqrt{Y/n}}$ is called the t-distribution with $n$ degrees of freedom. The probability density function of $T$ is
$f_T(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{t^2}{n}\right)^{-\frac{n+1}{2}}, \quad -\infty < t < \infty.$
If the mean of $X$ is $\lambda \neq 0$, then the distribution of $\frac{X}{\sqrt{Y/n}}$ is called the noncentral t-distribution with $n$ degrees of freedom and noncentrality parameter $\lambda$.
F-distribution
If $X$ and $Y$ are independent random variables having $\chi^2$-distributions with $m$ and $n$ degrees of freedom respectively, then the distribution of the statistic $F = \frac{X/m}{Y/n}$ is called the F-distribution with $m$ and $n$ degrees of freedom. The probability density function of $F$ is
$f_F(f) = \frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right)\Gamma\left(\frac{n}{2}\right)} \left(\frac{m}{n}\right)^{m/2} f^{\frac{m}{2} - 1} \left(1 + \frac{m}{n} f\right)^{-\frac{m+n}{2}}, \quad 0 < f < \infty.$
If $X$ has a noncentral chi-square distribution with $m$ degrees of freedom and noncentrality parameter $\lambda$, $Y$ has a $\chi^2$ distribution with $n$ degrees of freedom, and $X$ and $Y$ are independent random variables, then the distribution of $F = \frac{X/m}{Y/n}$ is the noncentral F-distribution with $m$ and $n$ degrees of freedom and noncentrality parameter $\lambda$.
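The construction of the F statistic from two independent chi-square variables can be illustrated by simulation; for a central F with $n > 2$, $E(F) = n/(n-2)$ (a sketch with hypothetical degrees of freedom):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 4, 10
# Independent chi-square numerator and denominator
X = rng.chisquare(m, size=500_000)
Y = rng.chisquare(n, size=500_000)
F = (X / m) / (Y / n)

print(F.mean())   # close to n/(n-2) = 1.25
```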
Linear model:
Suppose there are $n$ observations. In the linear model we assume that these observations are the values taken by $n$ random variables $Y_1, Y_2, \ldots, Y_n$ satisfying the following conditions:
1. $E(Y_i)$ is a linear combination of $p$ unknown parameters $\beta_1, \beta_2, \ldots, \beta_p$:
$E(Y_i) = x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ip}\beta_p, \quad i = 1, 2, \ldots, n,$
where the $x_{ij}$'s are known constants.
2. $Y_1, Y_2, \ldots, Y_n$ are uncorrelated and normally distributed with variance $\mathrm{Var}(Y_i) = \sigma^2$.
The linear model can be rewritten by introducing independent normal random variables $\varepsilon_i$ following $N(0, \sigma^2)$ as
$Y_i = x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ip}\beta_p + \varepsilon_i, \quad i = 1, 2, \ldots, n.$
These equations can be written in matrix notation as
$Y = X\beta + \varepsilon,$
where $Y$ is an $n \times 1$ vector of observations, $X$ is an $n \times p$ matrix of $n$ observations on each of the variables $X_1, X_2, \ldots, X_p$, $\beta$ is a $p \times 1$ vector of parameters and $\varepsilon$ is an $n \times 1$ vector of random error components with $\varepsilon \sim N(0, \sigma^2 I_n)$. Here $Y$ is called the study or dependent variable, $X_1, X_2, \ldots, X_p$ are called explanatory or independent variables and $\beta_1, \beta_2, \ldots, \beta_p$ are called regression coefficients.
Alternatively, since $Y \sim N(X\beta, \sigma^2 I)$, the linear model can also be expressed in expectation form as a normal random variable $Y$ with
$E(Y) = X\beta, \quad \mathrm{Var}(Y) = \sigma^2 I.$
Note that $\beta$ and $\sigma^2$ are unknown but $X$ is known.
Estimable functions:
A linear parametric function $\lambda'\beta$ of the parameters is said to be an estimable parametric function, or estimable, if there exists a linear function $\ell'Y$ of the random variables $Y = (Y_1, Y_2, \ldots, Y_n)'$ such that $E(\ell'Y) = \lambda'\beta$, where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_p)'$ and $\ell = (\ell_1, \ell_2, \ldots, \ell_n)'$ are vectors of known scalars.
Best linear unbiased estimate (BLUE)
The unbiased linear estimate $\ell'Y$ of an estimable function $\lambda'\beta$ having minimum variance is called the best linear unbiased estimate of $\lambda'\beta$.
Suppose $\ell_1'Y$ and $\ell_2'Y$ are the BLUEs of $\lambda_1'\beta$ and $\lambda_2'\beta$ respectively. Then $(a_1\ell_1 + a_2\ell_2)'Y$ is the BLUE of $(a_1\lambda_1 + a_2\lambda_2)'\beta$.
If $\lambda'\beta$ is estimable, its best estimate is $\lambda'\hat{\beta}$, where $\hat{\beta}$ is any solution of the normal equations $X'X\beta = X'Y$.
Least squares estimation
The least-squares estimate of $\beta$ in $Y = X\beta + \varepsilon$ is the value $\hat{\beta}$ of $\beta$ which minimizes the error sum of squares $\varepsilon'\varepsilon$.
Let
$S = \varepsilon'\varepsilon = (Y - X\beta)'(Y - X\beta) = Y'Y - 2\beta'X'Y + \beta'X'X\beta.$
Minimizing $S$ with respect to $\beta$ gives
$\frac{\partial S}{\partial \beta} = 0 \;\Rightarrow\; X'X\beta = X'Y,$
which are termed the normal equations. Assuming $\mathrm{rank}(X) = p$, the normal equations have the unique solution
$\hat{\beta} = (X'X)^{-1}X'Y.$
Note that $\frac{\partial^2 S}{\partial \beta \, \partial \beta'} = 2X'X$ is a positive definite matrix, so $\hat{\beta} = (X'X)^{-1}X'Y$ is the value of $\beta$ which minimizes $\varepsilon'\varepsilon$; it is termed the ordinary least squares estimator of $\beta$.
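The ordinary least squares estimator can be sketched numerically by solving the normal equations directly — hypothetical data with known true coefficients:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 100, 3
# Design matrix with an intercept column and two random regressors
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
beta = np.array([2.0, -1.0, 0.5])          # true (hypothetical) coefficients
Y = X @ beta + rng.standard_normal(n)      # errors with sigma = 1

# Solve the normal equations X'X beta = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
residuals = Y - X @ beta_hat

print(beta_hat)                            # close to (2, -1, 0.5)
print(np.allclose(X.T @ residuals, 0))     # True: residuals orthogonal to columns of X
```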
In this case $\beta_1, \beta_2, \ldots, \beta_p$ are estimable and consequently all linear parametric functions are estimable.
$E(\hat{\beta}) = (X'X)^{-1}X'E(Y) = (X'X)^{-1}X'X\beta = \beta$
$\mathrm{Var}(\hat{\beta}) = (X'X)^{-1}X'\,\mathrm{Var}(Y)\,X(X'X)^{-1} = \sigma^2 (X'X)^{-1}$
If $\lambda'\hat{\beta}$ and $\mu'\hat{\beta}$ are the estimates of $\lambda'\beta$ and $\mu'\beta$ respectively, then
$\mathrm{Var}(\lambda'\hat{\beta}) = \lambda'\,\mathrm{Var}(\hat{\beta})\,\lambda = \sigma^2 [\lambda'(X'X)^{-1}\lambda]$
$\mathrm{Cov}(\lambda'\hat{\beta}, \mu'\hat{\beta}) = \sigma^2 [\lambda'(X'X)^{-1}\mu].$
$Y - X\hat{\beta}$ is called the residual vector, and $E(Y - X\hat{\beta}) = 0$.
Linear model with correlated observations:
In the linear model $Y = X\beta + \varepsilon$ with $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2\Sigma$ and $\varepsilon$ normally distributed, we find
$E(Y) = X\beta, \quad \mathrm{Var}(Y) = \sigma^2\Sigma.$
Assuming $\Sigma$ to be positive definite, there exists a nonsingular matrix $P$ such that $P\Sigma P' = I$. Premultiplying $Y = X\beta + \varepsilon$ by $P$, we get
$PY = PX\beta + P\varepsilon \quad \text{or} \quad Y^* = X^*\beta + \varepsilon^*,$
where $Y^* = PY$, $X^* = PX$ and $\varepsilon^* = P\varepsilon$.
Note that in this model $E(\varepsilon^*) = 0$ and $\mathrm{Var}(\varepsilon^*) = \sigma^2 P\Sigma P' = \sigma^2 I$.
Distribution of $\ell'Y$:
In the linear model $Y = X\beta + \varepsilon$, $\varepsilon \sim N(0, \sigma^2 I)$, consider a linear function $\ell'Y$, which is normally distributed with
$E(\ell'Y) = \ell'X\beta, \quad \mathrm{Var}(\ell'Y) = \sigma^2(\ell'\ell).$
Then
$\frac{\ell'Y}{\sigma\sqrt{\ell'\ell}} \sim N\left(\frac{\ell'X\beta}{\sigma\sqrt{\ell'\ell}}, 1\right).$
Further, $\frac{(\ell'Y)^2}{\sigma^2\,\ell'\ell}$ has a noncentral chi-square distribution with one degree of freedom and noncentrality parameter $\frac{(\ell'X\beta)^2}{\sigma^2\,\ell'\ell}$.
Degrees of freedom:
A linear function $\ell'Y$ of the observations ($\ell \neq 0$) is said to carry one degree of freedom. A set of linear functions $LY$, where $L$ is an $r \times n$ matrix, is said to have $M$ degrees of freedom if there exist $M$ linearly independent functions in the set and no more. Alternatively, the degrees of freedom carried by the set $LY$ equal $\mathrm{rank}(L)$. When the set $LY$ consists of the estimates of $\lambda'\beta$, the degrees of freedom of the set $LY$ will also be called the degrees of freedom for the estimates of $\lambda'\beta$.
Sum of squares:
If $\ell'Y$ is a linear function of the observations, then the projection of $Y$ on $\ell$ is the vector $\frac{\ell'Y}{\ell'\ell}\,\ell$. The squared length of this projection, $\frac{(\ell'Y)^2}{\ell'\ell}$, is called the sum of squares (SS) due to $\ell'Y$. Since $\ell'Y$ has one degree of freedom, the SS due to $\ell'Y$ has one degree of freedom.
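The sum of squares due to a linear function can be computed as the squared projection length — a minimal sketch with hypothetical vectors:

```python
import numpy as np

Y = np.array([3.0, 1.0, 2.0, 4.0])
l = np.array([1.0, 1.0, 1.0, 1.0])   # coefficient vector of the total

proj = (l @ Y) / (l @ l) * l          # projection of Y on l
ss = (l @ Y) ** 2 / (l @ l)           # sum of squares due to l'Y

print(np.isclose(ss, proj @ proj))    # True: SS equals the squared projection length
print(ss)                             # 10**2 / 4 = 25.0
```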
The sums of squares and degrees of freedom arising out of mutually orthogonal sets of functions can be added together to give the sum of squares and degrees of freedom for the set of all the functions taken together, and vice versa.
Fisher-Cochran theorem
Let $X = (X_1, X_2, \ldots, X_n)'$ have a multivariate normal distribution with mean vector $\mu$ and positive definite covariance matrix $\Sigma$, and let
$X'\Sigma^{-1}X = Q_1 + Q_2 + \cdots + Q_k,$
where $Q_i = X'A_iX$ with $\mathrm{rank}(A_i) = N_i$, $i = 1, 2, \ldots, k$. Then the $Q_i$'s are independently distributed, $Q_i$ having a noncentral chi-square distribution with $N_i$ degrees of freedom and noncentrality parameter $\mu'A_i\mu$, if and only if $\sum_{i=1}^{k} N_i = n$, in which case
$\Sigma^{-1} = \sum_{i=1}^{k} A_i \quad \text{and} \quad \mu'\Sigma^{-1}\mu = \sum_{i=1}^{k} \mu'A_i\mu.$
Derivatives of quadratic and linear forms:
Let $X = (x_1, x_2, \ldots, x_n)'$ and let $f(X)$ be any function of the $n$ independent variables $x_1, x_2, \ldots, x_n$. Then
$\frac{\partial f(X)}{\partial X} = \left(\frac{\partial f(X)}{\partial x_1}, \frac{\partial f(X)}{\partial x_2}, \ldots, \frac{\partial f(X)}{\partial x_n}\right)'.$
If $K = (k_1, k_2, \ldots, k_n)'$ is a vector of constants, then
$\frac{\partial K'X}{\partial X} = K.$
If $A$ is an $n \times n$ matrix, then
$\frac{\partial X'AX}{\partial X} = (A + A')X,$
which equals $2AX$ when $A$ is symmetric.
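The gradient formula for a quadratic form can be verified against a finite-difference approximation — a sketch with a hypothetical matrix and point:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 3))   # not necessarily symmetric
x = rng.standard_normal(3)

# Analytic gradient of f(x) = x'Ax
analytic = (A + A.T) @ x

# Central-difference approximation, coordinate by coordinate
eps = 1e-6
numeric = np.zeros(3)
for i in range(3):
    e = np.zeros(3)
    e[i] = eps
    numeric[i] = ((x + e) @ A @ (x + e) - (x - e) @ A @ (x - e)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-4))  # True
```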
Independence of linear and quadratic forms:
Let $Y$ be an $n \times 1$ vector having the multivariate normal distribution $N(\mu, I)$ and let $B$ be an $m \times n$ matrix. Then the $m \times 1$ linear form $BY$ is independent of the quadratic form $Y'AY$ if $BA = 0$, where $A$ is a symmetric matrix of known elements.
Let $Y$ be an $n \times 1$ vector having the multivariate normal distribution $N(\mu, \Sigma)$ with $\mathrm{rank}(\Sigma) = n$. If $B\Sigma A = 0$, then the quadratic form $Y'AY$ is independent of the linear form $BY$, where $B$ is an $m \times n$ matrix.