Down with Determinants!
Sheldon Axler
21 December 1994
1. Introduction
Ask anyone why a square matrix of complex numbers has an eigenvalue, and you'll probably get the wrong answer, which goes something like this: The characteristic polynomial of the matrix (which is defined via determinants) has a root (by the fundamental theorem of algebra); this root is an eigenvalue of the matrix.

What's wrong with that answer? It depends upon determinants, that's what. Determinants are difficult, non-intuitive, and often defined without motivation. As we'll see, there is a better proof, one that is simpler, clearer, provides more insight, and avoids determinants.
This paper will show how linear algebra can be done better without determinants. Without using determinants, we will define the multiplicity of an eigenvalue and prove that the number of eigenvalues, counting multiplicities, equals the dimension of the underlying space. Without determinants, we'll define the characteristic and minimal polynomials and then prove that they behave as expected. Next, we will easily prove that every matrix is similar to a nice upper-triangular one. Turning to inner product spaces, and still without mentioning determinants, we'll have a simple proof of the finite-dimensional Spectral Theorem.

Determinants are needed in one place in the undergraduate mathematics curriculum: the change of variables formula for multi-variable integrals. Thus at the end of this paper we'll revive determinants, but not with any of the usual abstruse definitions. We'll define the determinant of a matrix to be the product of its eigenvalues (counting multiplicities). This easy-to-remember definition leads to the usual formulas for computing determinants. We'll derive the change of variables formula for multi-variable integrals in a fashion that makes the appearance of the determinant there seem natural.
This work was partially supported by the National Science Foundation. Many people made comments that helped improve this paper. I especially thank Marilyn Brouwer, William Brown, Jonathan Hall, Paul Halmos, Richard Hill, Ben Lotto, and Wade Ramey.
A few friends who use determinants in their research have expressed unease at the title of this paper. I know that determinants play an honorable role in some areas of research, and I do not mean to belittle their importance when they are indispensable. But most mathematicians and most students of mathematics will have a clearer understanding of linear algebra if they use the determinant-free approach to the basic structure theorems.

The theorems in this paper are not new; they will already be familiar to most readers. Some of the proofs and definitions are new, although many parts of this approach have been around in bits and pieces, but without the attention they deserved. For example, at a recent annual meeting of the AMS and MAA, I looked through every linear algebra text on display. Out of over fifty linear algebra texts offered for sale, only one obscure book gave a determinant-free proof that eigenvalues exist, and that book did not manage to develop other key parts of linear algebra without determinants. The anti-determinant philosophy advocated in this paper is an attempt to counter the undeserved dominance of determinant-dependent methods.
This paper focuses on showing that determinants should be banished from much of the theoretical part of linear algebra. Determinants are also useless in the computational part of linear algebra. For example, Cramer's rule for solving systems of linear equations is already worthless for $10 \times 10$ systems, not to mention the much larger systems often encountered in the real world. Many computer programs efficiently calculate eigenvalues numerically; none of them uses determinants. To emphasize the point, let me quote a numerical analyst. Henry Thacher, in a review (SIAM News, September 1988) of the Turbo Pascal Numerical Methods Toolbox, writes,

    I find it hard to conceive of a situation in which the numerical value of a
    determinant is needed: Cramer's rule, because of its inefficiency, is com-
    pletely impractical, while the magnitude of the determinant is an indication
    of neither the condition of the matrix nor the accuracy of the solution.
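To see just how bad Cramer's rule is, here is a rough back-of-the-envelope sketch in Python (my own illustration, not from the paper). It uses the standard recurrence for the multiplication count of cofactor expansion, $M(1) = 0$, $M(n) = n\,(M(n-1) + 1)$, and the standard $n^3/3$ estimate for Gaussian elimination; both are textbook figures, not claims made in this paper.

```python
def cofactor_mults(n: int) -> int:
    # Multiplications to expand an n x n determinant by cofactors:
    # M(1) = 0, M(n) = n * (M(n-1) + 1).
    m = 0
    for size in range(2, n + 1):
        m = size * (m + 1)
    return m

def cramer_mults(n: int) -> int:
    # Cramer's rule evaluates n + 1 such determinants (divisions ignored).
    return (n + 1) * cofactor_mults(n)

def elimination_mults(n: int) -> int:
    # Gaussian elimination needs roughly n**3 / 3 multiplications.
    return n ** 3 // 3

for n in (3, 10):
    print(f"n={n}: Cramer ~{cramer_mults(n):,} mults, "
          f"elimination ~{elimination_mults(n):,} mults")
```

For a $10 \times 10$ system this naive Cramer's rule already needs tens of millions of multiplications, against a few hundred for elimination, which is the sense in which Cramer's rule is "completely impractical."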
2. Eigenvalues and Eigenvectors
The basic objects of study in linear algebra can be thought of as either linear transformations or matrices. Because a basis-free approach seems more natural, this paper will mostly use the language of linear transformations; readers who prefer the language of matrices should have no trouble making the appropriate translation. The term linear operator will mean a linear transformation from a vector space to itself; thus a linear operator corresponds to a square matrix (assuming some choice of basis).

Notation used throughout the paper: $n$ denotes a positive integer, $V$ denotes an $n$-dimensional complex vector space, $T$ denotes a linear operator on $V$, and $I$ denotes the identity operator.
A complex number $\lambda$ is called an eigenvalue of $T$ if $T - \lambda I$ is not injective. Here is the central result about eigenvalues, with a simple proof that avoids determinants.
Theorem 2.1 Every linear operator on a finite-dimensional complex vector space has an eigenvalue.
Proof. To show that $T$ (our linear operator on $V$) has an eigenvalue, fix any non-zero vector $v \in V$. The vectors $v, Tv, T^2v, \dots, T^n v$ cannot be linearly independent, because $V$ has dimension $n$ and we have $n + 1$ vectors. Thus there exist complex numbers $a_0, \dots, a_n$, not all 0, such that

$$a_0 v + a_1 Tv + \dots + a_n T^n v = 0.$$

Make the $a$'s the coefficients of a polynomial, which can be written in factored form as

$$a_0 + a_1 z + \dots + a_n z^n = c(z - r_1) \cdots (z - r_m),$$

where $c$ is a non-zero complex number, each $r_j$ is complex, and the equation holds for all complex $z$. We then have

$$0 = (a_0 I + a_1 T + \dots + a_n T^n)v = c(T - r_1 I) \cdots (T - r_m I)v,$$

which means that $T - r_j I$ is not injective for at least one $j$. In other words, $T$ has an eigenvalue.
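The proof above is entirely constructive, and it can be walked through numerically. The following Python sketch does so for a small illustrative example of my own choosing (the matrix, vector, and coefficients below are not from the paper): it builds $v, Tv, T^2v$, uses a linear dependence among them, factors the resulting polynomial with the quadratic formula, and checks that one factor $T - r_j I$ kills a non-zero vector.

```python
import cmath

def apply(M, v):
    # Multiply a 2x2 matrix by a vector in C^2.
    return (M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1])

def shift(T, r):
    # The matrix of T - r*I.
    return [[T[0][0] - r, T[0][1]], [T[1][0], T[1][1] - r]]

# Illustrative choice: T has matrix [[2, 1], [0, 3]] on C^2, and v = (0, 1).
T = [[2, 1], [0, 3]]
v = (0, 1)
Tv = apply(T, v)       # (1, 3)
TTv = apply(T, Tv)     # (5, 9)

# Three vectors in C^2 must be linearly dependent; solving
# a0*v + a1*Tv + a2*TTv = 0 with a2 = 1 gives a0 = 6, a1 = -5.
a0, a1, a2 = 6, -5, 1
assert all(a0*v[i] + a1*Tv[i] + a2*TTv[i] == 0 for i in range(2))

# Factor 6 - 5z + z^2 via the quadratic formula (the fundamental theorem
# of algebra in miniature); its roots r1, r2 are the candidate eigenvalues.
d = cmath.sqrt(a1*a1 - 4*a2*a0)
r1, r2 = (-a1 + d) / 2, (-a1 - d) / 2   # 3 and 2

# Since c(T - r1 I)(T - r2 I) v = 0, some factor kills a non-zero vector.
w = apply(shift(T, r2), v)              # (T - r2 I) v, non-zero here
assert w != (0, 0)
assert apply(shift(T, r1), w) == (0, 0)  # so r1 is an eigenvalue of T
```

No determinant and no characteristic polynomial appear anywhere; the eigenvalue falls out of the factored dependence relation exactly as in the proof.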
Recall that a vector $v \in V$ is called an eigenvector of $T$ if $Tv = \lambda v$ for some eigenvalue $\lambda$. The next proposition (which has a simple, determinant-free proof) obviously implies that the number of distinct eigenvalues of $T$ cannot exceed the dimension of $V$.

Proposition 2.2 Non-zero eigenvectors corresponding to distinct eigenvalues of $T$ are linearly independent.
Proof. Suppose that $v_1, \dots, v_m$ are non-zero eigenvectors of $T$ corresponding to distinct eigenvalues $\lambda_1, \dots, \lambda_m$. We need to prove that $v_1, \dots, v_m$ are linearly independent. To do this, suppose $a_1, \dots, a_m$ are complex numbers such that

$$a_1 v_1 + \dots + a_m v_m = 0.$$

Apply the linear operator $(T - \lambda_2 I)(T - \lambda_3 I) \cdots (T - \lambda_m I)$ to both sides of the equation above, getting

$$a_1 (\lambda_1 - \lambda_2)(\lambda_1 - \lambda_3) \cdots (\lambda_1 - \lambda_m) v_1 = 0.$$

Thus $a_1 = 0$. In a similar fashion, $a_j = 0$ for each $j$, as desired.
3. Generalized eigenvectors
Unfortunately, the eigenvectors of $T$ need not span $V$. For example, the linear operator on $\mathbf{C}^2$ whose matrix is

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

has only one eigenvalue, namely 0, and its eigenvectors form a one-dimensional subspace of $\mathbf{C}^2$. We will see, however, that the generalized eigenvectors (defined below) of $T$ always span $V$.
A vector $v \in V$ is called a generalized eigenvector of $T$ if

$$(T - \lambda I)^k v = 0$$

for some eigenvalue $\lambda$ of $T$ and some positive integer $k$. Obviously, the set of generalized eigenvectors of $T$ corresponding to an eigenvalue $\lambda$ is a subspace of $V$. The following lemma shows that in the definition of generalized eigenvector, instead of allowing an arbitrary power of $T - \lambda I$ to annihilate $v$, we could have restricted attention to the $n$th power, where $n$ equals the dimension of $V$. As usual, ker is an abbreviation for kernel (the set of vectors that get mapped to 0).
Lemma 3.1 The set of generalized eigenvectors of $T$ corresponding to an eigenvalue $\lambda$ equals $\ker(T - \lambda I)^n$.

Proof. Obviously, every element of $\ker(T - \lambda I)^n$ is a generalized eigenvector of $T$ corresponding to $\lambda$. To prove the inclusion in the other direction, let $v$ be a generalized eigenvector of $T$ corresponding to $\lambda$. We need to prove that $(T - \lambda I)^n v = 0$. Clearly, we can assume that $v \neq 0$, so there is a smallest non-negative integer $k$ such that $(T - \lambda I)^k v = 0$. We will be done if we show that $k \le n$. This will be proved by showing that

$$v,\ (T - \lambda I)v,\ (T - \lambda I)^2 v,\ \dots,\ (T - \lambda I)^{k-1} v \tag{3.2}$$

are linearly independent vectors; we will then have $k$ linearly independent elements in an $n$-dimensional space, which implies that $k \le n$.

To prove the vectors in (3.2) are linearly independent, suppose $a_0, \dots, a_{k-1}$ are complex numbers such that

$$a_0 v + a_1 (T - \lambda I)v + \dots + a_{k-1} (T - \lambda I)^{k-1} v = 0. \tag{3.3}$$

Apply $(T - \lambda I)^{k-1}$ to both sides of the equation above, getting $a_0 (T - \lambda I)^{k-1} v = 0$, which implies that $a_0 = 0$. Now apply $(T - \lambda I)^{k-2}$ to both sides of (3.3), getting $a_1 (T - \lambda I)^{k-1} v = 0$, which implies that $a_1 = 0$. Continuing in this fashion, we see that $a_j = 0$ for each $j$, as desired.
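The $2 \times 2$ example at the start of this section, together with Lemma 3.1, can be checked directly by hand or by machine. Here is a minimal Python sketch (the vector choices are mine, made for illustration):

```python
def apply(M, v):
    # Multiply a 2x2 matrix by a vector.
    return (M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1])

# The paper's example: T has matrix [[0, 1], [0, 0]] on C^2; its only
# eigenvalue is 0, and its eigenvectors are the multiples of (1, 0).
T = [[0, 1], [0, 0]]

# (0, 1) is not an eigenvector: T(0, 1) = (1, 0), not a multiple of (0, 1).
assert apply(T, (0, 1)) == (1, 0)

# But (0, 1) is a generalized eigenvector: (T - 0I)^2 annihilates it.
assert apply(T, apply(T, (0, 1))) == (0, 0)

# Indeed (T - 0I)^n = 0 with n = 2 = dim V, so ker (T - 0I)^2 is all of
# C^2, as Lemma 3.1 allows: every vector is a generalized eigenvector,
# and the generalized eigenvectors span C^2 even though the eigenvectors
# span only a one-dimensional subspace.
for u in [(1, 0), (0, 1), (1, 1), (2, -3)]:
    assert apply(T, apply(T, u)) == (0, 0)
print("all checks passed")
```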
The next result is the key tool we'll use to give a description of the structure of a linear operator.
Proposition 3.4 The generalized eigenvectors of $T$ span $V$.

Proof. The proof will be by induction on $n$, the dimension of $V$. Obviously, the result holds when $n = 1$.

Suppose that $n > 1$ and that the result holds for all vector spaces of dimension less than $n$. Let $\lambda$ be any eigenvalue of $T$ (one exists by Theorem 2.1). We first show that

$$V = \underbrace{\ker(T - \lambda I)^n}_{V_1} \oplus \underbrace{\operatorname{ran}(T - \lambda I)^n}_{V_2}; \tag{3.5}$$

here, as usual, ran is an abbreviation for range. To prove (3.5), suppose $v \in V_1 \cap V_2$. Then $(T - \lambda I)^n v = 0$ and there exists $u \in V$ such that $(T - \lambda I)^n u = v$. Applying $(T - \lambda I)^n$ to both sides of the last equation, we have $(T - \lambda I)^{2n} u = 0$. This implies that $(T - \lambda I)^n u = 0$ (by Lemma 3.1), which implies that $v = 0$. Thus

$$V_1 \cap V_2 = \{0\}. \tag{3.6}$$

Because $V_1$ and $V_2$ are the kernel and range of a linear operator on $V$, we have

$$\dim V = \dim V_1 + \dim V_2. \tag{3.7}$$

Equations (3.6) and (3.7) imply (3.5).

Note that $V_1 \neq \{0\}$ (because $\lambda$ is an eigenvalue of $T$), and thus $\dim V_2 < n$. Furthermore, because $T$ commutes with $(T - \lambda I)^n$, we easily see that $T$ maps $V_2$ into $V_2$. By our induction hypothesis, $V_2$ is spanned by the generalized eigenvectors of $T|_{V_2}$, each of which is obviously also a generalized eigenvector of $T$. Everything in $V_1$ is a generalized eigenvector of $T$, and hence (3.5) gives the desired result.

A nice corollary of the last proposition is that if 0 is the only eigenvalue of $T$, then $T$ is nilpotent (recall that an operator is called nilpotent if some power of it equals 0). Proof: If 0 is the only eigenvalue of $T$, then every vector in $V$ is a generalized eigenvector of $T$ corresponding to the eigenvalue 0 (by Proposition 3.4); Lemma 3.1 then implies that $T^n = 0$.

Non-zero eigenvectors corresponding to distinct eigenvalues are linearly independent (Proposition 2.2). We need an analogous result with generalized eigenvectors replacing eigenvectors. This can be proved by following the basic pattern of the proof of Proposition 2.2, as we now do.

Proposition 3.8 Non-zero generalized eigenvectors corresponding to distinct eigenvalues of $T$ are linearly independent.

Proof. Suppose that $v_1, \dots, v_m$ are non-zero generalized eigenvectors of $T$ corresponding to distinct eigenvalues $\lambda_1, \dots, \lambda_m$. We need to prove that $v_1, \dots, v_m$ are linearly independent. To do this, suppose $a_1, \dots, a_m$ are complex numbers such that

$$a_1 v_1 + \dots + a_m v_m = 0. \tag{3.9}$$

Let $k$ be the smallest positive integer such that $(T - \lambda_1 I)^k v_1 = 0$. Apply the linear operator

$$(T - \lambda_1 I)^{k-1} (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n$$

to both sides of (3.9), getting

$$a_1 (T - \lambda_1 I)^{k-1} (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n v_1 = 0, \tag{3.10}$$

where we have used Lemma 3.1. If we rewrite $(T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n$ in (3.10) as
Note that V 6= f0g (b ecause is an eigenvalue of T ), and thus dim V 1 2 n Furthermore, b ecause T commutes with (T I ) ,we easily see that T maps V 2 into V . By our induction hyp othesis, V is spanned by the generalized eigenvectors 2 2 of T j ,each of whichisobviously also a generalized eigenvector of T .Everything V 2 in V is a generalized eigenvector of T , and hence (3.5) gives the desired result. 1 A nice corollary of the last prop osition is that if 0 is the only eigenvalue of T , then T is nilp otent (recall that an op erator is called nilp otent if some p ower of it equals 0). Pro of: If 0 is the only eigenvalue of T , then every vector in V is a generalized eigenvector of T corresp onding to the eigenvalue 0 (by Prop osition 3.4); Lemma 3.1 n then implies that T =0. Non-zero eigenvectors corresp onding to distinct eigenvalues are linearly indep en- dent (Prop osition 2.2). We need an analogous result with generalized eigenvectors replacing eigenvectors. This can b e proved by following the basic pattern of the pro of of Prop osition 2.2, as wenowdo. Prop osition 3.8 Non-zero generalized eigenvectors corresp onding to distinct eigen- values of T are linearly indep endent. Pro of. Supp ose that v ;:::;v are non-zero generalized eigenvectors of T corre- 1 m sp onding to distinct eigenvalues ;:::; .We need to prove that v ;:::;v are 1 m 1 m linearly indep endent. To do this, supp ose a ;:::;a are complex numb ers such that 1 m a v + + a v =0: (3.9) 1 1 m m @ det @ 6 k Let k b e the smallest p ositiveinteger suchthat(T I ) v = 0. Apply the linear 1 1 op erator k 1 n n (T I ) (T I ) :::(T I ) 1 2 m to b oth sides of (3.9), getting k 1 n n a (T I ) (T I ) :::(T I ) v =0; (3.10) 1 1 2 m 1 n n where wehave used Lemma 3.1. If we rewrite (T I ) :::(T I ) in (3.10) as 2 m