A NUMERICAL METHOD FOR THE

INVERSE STOCHASTIC SPECTRUM PROBLEM

MOODY T. CHU* AND QUANLIN GUO†

Abstract. The inverse stochastic spectrum problem involves the construction of a stochastic matrix with a prescribed spectrum. A differential equation aimed at bringing forth the steepest descent flow that reduces the distance between isospectral matrices and nonnegative matrices, represented in terms of some general coordinates, is described. The flow is further characterized by an analytic singular value decomposition to maintain numerical stability and to monitor the proximity to singularity. This flow approach can be used to design Markov chains with specified structure. Applications are demonstrated by numerical examples.

Key words. stochastic matrix, least squares, steepest descent, isospectral flow, structured Markov chain, analytic singular value flow

AMS(MOS) subject classifications. 65F, 65H

1. Introduction. Inverse eigenvalue problems concern the reconstruction of matrices from prescribed spectral data. The spectral data may involve complete or partial information of eigenvalues or eigenvectors. Generally, a problem without any restrictions on the matrix is of little interest. In order that the inverse eigenvalue problem be meaningful, it is often necessary to restrict the construction to special classes of matrices, such as symmetric Toeplitz matrices or matrices with other special structures. In this paper we limit our attention to the so-called stochastic matrices, i.e., matrices with nonnegative elements whose row sums are all equal to one. We propose a numerical procedure for the construction of a stochastic matrix so that its spectrum agrees with a prescribed set of complex values. If the set of prescribed values turns out to be infeasible, the method produces a best approximation in the sense of least squares. To our knowledge, this inverse eigenvalue problem for stochastic matrices has not been studied extensively, probably due to its difficulty, as we shall discuss below. Nevertheless, for a variety of physical problems that can be described in the context of Markov chains, an understanding of the inverse eigenvalue problem for stochastic matrices and a capacity to solve the problem would make it possible to construct a system from its natural frequencies. The method proposed in this paper appears to be the first attempt at tackling this problem numerically with some success. Our technique can also be applied as a numerical way to solve the long-standing inverse eigenvalue problems for nonnegative matrices.

Associated with every inverse eigenvalue problem are two fundamental questions: the theoretical issue of solvability and the practical issue of computability. The major effort in solvability has been to determine a necessary or a sufficient condition under which an inverse eigenvalue problem has a solution, whereas the main concern in computability has been to develop an algorithm by which, knowing a priori that the given spectral data are feasible, a matrix can be constructed numerically. Both questions are difficult and challenging. Searching through the literature, we have found only a handful of inverse eigenvalue problems that have been completely understood or solved. The focus of this paper is on the computability for stochastic matrices.

* Department of Mathematics, North Carolina State University, Raleigh, NC. This research was supported in part by the National Science Foundation under grant DMS.
† Department of Mathematics, North Carolina State University, Raleigh, NC.

Fig. 1. The region Θ_n characterized by the Karpelevič theorem.

For stochastic matrices the inverse eigenvalue problem is particularly difficult, as can be seen from the involved nature of the best known existence result, due to Karpelevič [14]. Karpelevič completely characterized the set Θ_n of points in the complex plane that are eigenvalues of stochastic n × n matrices. In particular, the region Θ_n is symmetric about the real axis. It is contained within the unit circle, and its intersections with the unit circle are the points z = e^{2πia/b}, where a and b run over all integers satisfying 0 ≤ a < b ≤ n. The boundary of Θ_n consists of these intersection points and of curvilinear arcs connecting them in circular order. These arcs are characterized by specific parametric equations whose formulas can be found in [14, 16]. For example, a complex number is an eigenvalue of some n × n stochastic matrix if and only if it belongs to a region Θ_n such as the one shown in Figure 1. Complicated though it may seem, the Karpelevič theorem characterizes only one complex value at a time and does not provide further insight into when two or more points in Θ_n are eigenvalues of the same stochastic matrix. Minc [16] distinctively called the problem we are considering, where the entire spectrum is given, the inverse spectrum problem.

It is known that the inverse eigenvalue problem for nonnegative matrices is virtually equivalent to that for stochastic matrices. For example, a complex nonzero number λ is an eigenvalue of a nonnegative matrix with a positive maximal eigenvalue r if and only if λ/r is an eigenvalue of a stochastic matrix. Our problem is much more complicated because it involves the entire spectrum. Fortunately, based on the following theorem, we can proceed with our computation once a nonnegative matrix is found.

Theorem 1.1. If A is a nonnegative matrix with positive maximal eigenvalue r and a positive maximal eigenvector x, then r^{-1} D^{-1} A D is a stochastic matrix, where D := diag{x_1, ..., x_n}.
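The diagonal scaling of Theorem 1.1 is straightforward to carry out; a minimal MATLAB sketch, assuming that the computed maximal eigenvector is strictly positive after a sign adjustment, reads as follows.

    % Given a nonnegative matrix A, form the stochastic matrix r^{-1} D^{-1} A D.
    [V, E]   = eig(A);
    [r, idx] = max(real(diag(E)));      % the maximal (Perron) eigenvalue
    x = real(V(:, idx));                % the corresponding eigenvector
    x = x / sign(sum(x));               % fix the overall sign so that x > 0
    D = diag(x);
    B = (D \ (A * D)) / r;              % row sums of B are all equal to one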

We thus should turn our attention to the inverse eigenvalue or spectrum problems for nonnegative matrices, a subject that has received considerable interest in the literature. Some necessary and a few sufficient conditions on whether a given set of complex numbers could be the spectrum of a nonnegative matrix can be found, for example, in the works cited in our reference list and the references contained therein. Yet numerical methods for constructing such a matrix, even if the spectrum is feasible, still need to be developed. Some discussion can be found in the references. Regardless of all the efforts, the inverse eigenvalue problem for nonnegative matrices has not been completely resolved to this date.

In an earlier paper [6], the first author developed an algorithm that can construct symmetric nonnegative matrices with prescribed spectra by means of differential equations. Symmetry was needed there because the techniques by then were for flows in the group of orthogonal matrices only. Upon realizing the existence of an analytic singular value decomposition (ASVD) for a real analytic path of matrices [5], we are able to advance the techniques in [6] to general matrices in this paper.

This paper is organized as follows. We reformulate the inverse stochastic spectrum problem as that of finding the shortest distance between isospectral matrices and nonnegative matrices. In §2 we introduce a general coordinate system to describe these two types of matrices and discuss how this setting naturally leads to a steepest descent flow. This approach generalizes what has been done before but requires the inversion of matrices, which is potentially dangerous. In §3 we argue that the steepest descent flow is in fact analytic and hence an analytic singular value decomposition exists. We therefore are able to describe the flow by a more stable vector field. We illustrate the application of this differential equation to the inverse spectrum problem by numerical examples in §5.

2. Basic Formulation. The given spectrum {λ_1, ..., λ_n} may be complex-valued. It is not difficult to create a simple, say tridiagonal, real-valued matrix Λ carrying the same spectrum. For multiple eigenvalues one should also consider the possible real-valued Jordan canonical form, depending on the geometric multiplicity. Matrices in the set

    M(Λ) := {PΛP^{-1} | P ∈ R^{n×n} is nonsingular}

obviously are isospectral to Λ. Let

    π(R^{n×n}) := {B∘B | B ∈ R^{n×n}}

denote the cone of all nonnegative matrices, where ∘ means the Hadamard product of matrices. Our basic idea is to find the intersection of M(Λ) and π(R^{n×n}). Such an intersection, if it exists, results in a nonnegative matrix isospectral to Λ. Furthermore, if the condition in Theorem 1.1 holds, i.e., if the eigenvector corresponding to the positive maximal eigenvalue is positive, then we will have solved the inverse spectrum problem for stochastic matrices by a diagonal similarity transformation. The difficulty, as we have pointed out earlier, is the lack of means to determine whether the given spectrum is feasible. An arbitrarily given set of values {λ_1, ..., λ_n}, even if |λ_i| ≤ 1 for all 1 ≤ i ≤ n, may not be the spectrum of any nonnegative matrix. In this case it is reasonable to ask for only the best possible approximation. To handle both problems at the same time, we reformulate the inverse spectrum problem as that of finding the shortest distance between M(Λ) and π(R^{n×n}):

(1)    minimize  F(P, R) := (1/2) ||PΛP^{-1} − R∘R||^2,

where ||·|| represents the Frobenius matrix norm. Obviously, if Λ is feasible, then F(P, R) = 0 for some suitable P and R. Note that the variable P in (1) resides in the open set of nonsingular matrices, whereas R is simply a general matrix in R^{n×n}. The optimization in (1) is subject to no other significant constraint. Since the optimization is over an unbounded open domain, it is possible that the minimum does not exist. We shall comment more on this point later.
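To make the formulation concrete, the following MATLAB sketch builds one convenient choice of Λ, a real block diagonal (hence tridiagonal) matrix with a 2 × 2 block for each complex-conjugate pair, and evaluates the objective F in (1). The helper names lambda_block and obj_F are ours, and we assume that the prescribed eigenvalues are listed with conjugate pairs adjacent; each function can be placed in its own file or kept as a local function.

    % Build a real matrix Lambda carrying a prescribed spectrum: a real eigenvalue
    % gives a 1-by-1 block, a complex pair a +/- bi gives the block [a b; -b a].
    function Lambda = lambda_block(lam)
        Lambda = [];
        k = 1;
        while k <= numel(lam)
            if abs(imag(lam(k))) < eps
                Lambda = blkdiag(Lambda, real(lam(k)));
                k = k + 1;
            else
                a = real(lam(k));  b = imag(lam(k));
                Lambda = blkdiag(Lambda, [a b; -b a]);
                k = k + 2;                       % skip the conjugate partner
            end
        end
    end

    % Objective F(P,R) = 0.5 * || P*Lambda*inv(P) - R.*R ||_F^2 from (1).
    function val = obj_F(P, R, Lambda)
        Delta = P*Lambda/P - R.*R;
        val   = 0.5 * norm(Delta, 'fro')^2;
    end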

The Fréchet derivative of F at (P, R) acting on (H, K) is calculated as follows:

(2)    F'(P, R)·(H, K) = ⟨PΛP^{-1} − R∘R, HΛP^{-1} − PΛP^{-1}HP^{-1} − 2R∘K⟩
                       = ⟨(PΛP^{-1} − R∘R)(PΛP^{-1})^T P^{-T} − (PΛP^{-1})^T (PΛP^{-1} − R∘R) P^{-T}, H⟩
                         − 2⟨(PΛP^{-1} − R∘R)∘R, K⟩,

where ⟨·, ·⟩ denotes the Frobenius inner product of two matrices. Define, for abbreviation,

       M(P) := PΛP^{-1},
       Δ(P, R) := M(P) − R∘R.

The norm of Δ(P, R) represents how close we are able to solve the inverse spectrum problem. With respect to the product topology on R^{n×n} × R^{n×n}, we can easily read off the gradient ∇F of the objective F from (2):

(3)    ∇F(P, R) = ( [Δ(P, R), M(P)^T] P^{-T},  −2 Δ(P, R)∘R ).

Therefore the flow (P(t), R(t)) defined by the differential equations

(4)    dP/dt = [M(P)^T, Δ(P, R)] P^{-T},

(5)    dR/dt = 2 Δ(P, R)∘R,

where [A, B] := AB − BA denotes the Lie bracket of two matrices, signifies in fact the steepest descent flow for the objective function F.

An important advance we have made here is that the gradient ∇F(P, R) no longer needs to be projected, as was required in [6], since P need not be orthogonal. On the other hand, a possible frailty of this advance is that the solution flow P(t) is susceptible to becoming unbounded.
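For illustration, the vector field (4)-(5) can be coded directly in MATLAB. The sketch below, which is not the code used for the experiments reported in §5, stacks vec(P) and vec(R) into one state vector so that a standard initial value solver applies; the explicit appearance of P^{-1} is exactly the feature that the ASVD reformulation of the next section is designed to stabilize.

    % Right-hand side of the steepest descent flow (4)-(5); y = [vec(P); vec(R)].
    function dy = descent_rhs(~, y, Lambda)
        n  = size(Lambda, 1);
        P  = reshape(y(1:n*n),     n, n);
        R  = reshape(y(n*n+1:end), n, n);
        M  = P*Lambda/P;                 % M(P) = P*Lambda*P^{-1}
        D  = M - R.*R;                   % Delta(P,R) = M(P) - R o R
        dP = (M'*D - D*M') / P';         % [M(P)^T, Delta(P,R)] * P^{-T}
        dR = 2*(D.*R);                   % 2 * Delta(P,R) o R
        dy = [dP(:); dR(:)];
    end

A typical call would be [t, y] = ode15s(@(t, y) descent_rhs(t, y, Lambda), [0, T], [P0(:); R0(:)]) for some final time T, where P0 is any nonsingular starting matrix and R0 encodes the desired zero pattern discussed next.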

The differential system (4) and (5) has another interesting property that is useful for constructing Markov chains with a designated structure. The Hadamard product in (5) implies that dr_{ij}/dt = 2Δ_{ij}(P, R) r_{ij}, so if r_{ij} = 0 then dr_{ij}/dt = 0. Thus the zero structure in the original matrix R(0) is preserved throughout the integration. We may use this property to explore the possibility of constructing a Markov chain with prescribed linkages and spectrum.

3. ASVD Flow. A somewhat worrisome feature in the differential system (4) and (5) is the involvement of P^{-1}. In this section we propose using the ASVD as a stable way to carry out the computation. Also, we have pointed out earlier that the minimization over two open sets may not have a minimum. It is possible during the integration that the flow P(t) from one particular starting value gradually moves toward the boundary, i.e., the closed subset of singular matrices in R^{n×n}, and becomes more and more nearly singular. The ASVD technique allows us to monitor the situation. If the singular values indicate that P(t) is nearly rank deficient, we can abort the integration and restart from a new initial value.

An analytic singular value decomposition of the path of matrices P(t) is an analytic path of factorizations

(6)    P(t) = X(t) S(t) Y(t)^T,

where X(t) and Y(t) are orthogonal and S(t) is diagonal. In [5] Bunse-Gerstner et al. prove that an ASVD exists if P(t) is analytic. The fact that P(t) defined by (4) and (5) is indeed analytic follows from the Cauchy-Kovalevskaya theorem [18], since the coefficients of the vector field in (4) and (5) are analytic. With this understanding we may proceed to describe the differential equations for the ASVD of P(t).

It is worth pointing out that the two matrices P and R are used, respectively, as coordinates to describe the isospectral matrices and the nonnegative matrices. We may have used more dimensions of variables than necessary to describe the underlying matrices, but that does no harm. When the flows P(t) and R(t) are introduced, in a sense a flow in M(Λ) and a flow in π(R^{n×n}) are also introduced. To stabilize the computation, we further describe the motion of the coordinate P by three other variables X, S, and Y according to (6). The flows of X(t), S(t), and Y(t) can be found in the following way, due to Wright [21].

Differentiating both sides of (6), we obtain the following equation after some suitable multiplications:

(7)    X^T (dP/dt) Y = X^T (dX/dt) S + dS/dt + S (dY/dt)^T Y.

Define

(8)    Q(t) := X^T (dP/dt) Y,

(9)    Z(t) := X^T (dX/dt),

(10)   W(t) := (dY/dt)^T Y.

Note that Q(t) is known from (4), where the inverse of P(t) is calculated from

(11)   P^{-1} = Y S^{-1} X^T.
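In MATLAB terms, (11) means that the inverse never has to be formed by Gaussian elimination once the ASVD coordinates are carried along; for instance (with tol a user-chosen small threshold),

    Pinv = Y * diag(1 ./ diag(S)) * X';      % P^{-1} = Y S^{-1} X^T as in (11)
    if min(diag(S)) < tol                    % P(t) is nearly singular
        error('P(t) is nearly singular; restart from a new initial value.');
    end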

The diagonal entries of S = diag{s_1, ..., s_n} provide us with information about the proximity of P(t) to singularity. On the other hand, comparing the diagonal entries on both sides of (7), we obtain the differential equation for S(t):

(12)   dS/dt = diag(Q),

since both Z(t) and W(t) are skew-symmetric. Comparing the off-diagonal entries on both sides of (7), we obtain the linear system

(13)   q_{jk} = z_{jk} s_k + s_j w_{jk},

(14)   q_{kj} = −z_{jk} s_j − s_k w_{jk}.

If s_k^2 ≠ s_j^2, we can solve this system and obtain

(15)   z_{jk} = (s_k q_{jk} + s_j q_{kj}) / (s_k^2 − s_j^2),

(16)   w_{jk} = −(s_j q_{jk} + s_k q_{kj}) / (s_k^2 − s_j^2)

for all j ≠ k. Once Z(t) and W(t) are known, the differential equations for X(t) and Y(t) are given, respectively, by

(17)   dX/dt = X Z,

(18)   dY/dt = Y W^T.

By now we have developed a complete coordinate system (X(t), S(t), Y(t), R(t)) for matrices in M(Λ) × π(R^{n×n}). The differential equations (5), (12), (17), and (18), with the relationship (11), describe how these coordinates should be varied in t to produce the steepest descent flow for the objective function F. This flow is ready to be integrated numerically by any initial value problem solver. We have thus proposed a numerical method for solving the inverse stochastic spectrum problem.
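A compact way to organize the computation is to integrate the coordinates (X, S, Y, R) as one long vector. The MATLAB sketch below assembles the right-hand side from (4), (5), (8)-(12), and (13)-(18); it is meant only to illustrate the bookkeeping, with our own naming and with no safeguard against nearly equal singular values.

    % d/dt of the coordinates (X, S, Y, R); y = [vec(X); diag(S); vec(Y); vec(R)].
    function dy = asvd_rhs(~, y, Lambda)
        n  = size(Lambda, 1);
        X  = reshape(y(1:n*n), n, n);
        s  = y(n*n+1:n*n+n);
        Y  = reshape(y(n*n+n+1:2*n*n+n), n, n);
        R  = reshape(y(2*n*n+n+1:end),   n, n);

        P    = X * diag(s) * Y';                   % (6)
        Pinv = Y * diag(1 ./ s) * X';              % (11)
        M    = P * Lambda * Pinv;                  % M(P)
        D    = M - R.*R;                           % Delta(P,R)
        dP   = (M'*D - D*M') * Pinv';              % (4)
        dR   = 2 * (D .* R);                       % (5)

        Q  = X' * dP * Y;                          % (8)
        ds = diag(Q);                              % (12)
        Z  = zeros(n);  W = zeros(n);
        for j = 1:n
            for k = j+1:n
                den    = s(k)^2 - s(j)^2;          % assumed nonzero, cf. (15)-(16)
                Z(j,k) =  ( s(k)*Q(j,k) + s(j)*Q(k,j)) / den;
                W(j,k) = -( s(j)*Q(j,k) + s(k)*Q(k,j)) / den;
                Z(k,j) = -Z(j,k);  W(k,j) = -W(j,k);
            end
        end
        dX = X * Z;                                % (17)
        dY = Y * W';                               % (18)
        dy = [dX(:); ds; dY(:); dR(:)];
    end

The initial vector can be assembled from the singular value decomposition of the starting matrix: [X0, S0, Y0] = svd(P0); y0 = [X0(:); diag(S0); Y0(:); R0(:)].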

4. Convergence. When assessing the convergence properties of the foregoing approach, we must distinguish carefully the means used to measure the convergence. First of all, the approach fails only on two occasions: either P(t) becomes singular in finite time, or F(P(t), R(t)) converges to a nonzero constant. The former case, detected by examining the singular values of P(t), requires a restart from a new initial value with the hope of avoiding the singularity. The latter case indicates that a least squares local solution has been found, but that solution has not yet solved the inverse spectrum problem. A restart may help to locate an exact solution if the prescribed spectrum is feasible, or to move to another least squares approximation that may produce a different objective value.

In all cases the function

       G(t) := F(P(t), R(t))

enjoys the property that

       dG/dt = −||∇F(P(t), R(t))||^2 ≤ 0

along any solution curve (P(t), R(t)). It follows that G(t) is monotone decreasing and that dG/dt = 0 only when a local stationary point of F(P, R) is reached. Suppose that P(t) remains nonsingular throughout the integration, an assumption that seems generic according to our experience. Then G(t) has to converge. It is in this sense that our method is globally convergent.
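In practice the monotonicity of G provides a cheap sanity check on the integration. For instance, given output times t and states y (one row per output time) returned by an integration of the coordinate system of §3, the objective can be evaluated with the obj_F sketch of §2 and checked to be nonincreasing; the slack 1e-8 below is an arbitrary allowance for integration error.

    G = zeros(numel(t), 1);
    for i = 1:numel(t)
        Xi = reshape(y(i, 1:n*n), n, n);
        si = y(i, n*n+1:n*n+n).';
        Yi = reshape(y(i, n*n+n+1:2*n*n+n), n, n);
        Ri = reshape(y(i, 2*n*n+n+1:end),   n, n);
        G(i) = obj_F(Xi*diag(si)*Yi', Ri, Lambda);
    end
    assert(all(diff(G) <= 1e-8));    % G(t) should be monotone decreasing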

In [6] the coordinate matrix P(t) is limited to be orthogonal, hence is bounded and exists for all t. This constraint is not imposed on the approach discussed in the current paper. Generally there is no guarantee that P(t) is bounded. However, in the case that the solution flow (P(t), R(t)) corresponding to a certain initial value is indeed bounded and exists for all t, then we can conclude from Lyapunov's second method (see, e.g., [4]) that limit points of P(t) exist and that each limit point satisfies ∇F(P, R) = 0. In other words, limit points of the flow are necessarily stationary points. Because the vector field always points in the steepest descent direction and other types of stationary points are unstable, any limit point reached through numerical computation will most likely be a local minimizer of F. The structure of the limit set of the differential system (4) and (5) can be further analyzed in a way similar to that in [6]. For example, if the limit set of a flow contains a point at which F(P, R) = 0, then that point is the only element in the limit set. The flow hence converges to that limit point. We shall not repeat the detailed argument here. Our experience seems to indicate that our method works reasonably well for solving the inverse spectrum problem.

5. Numerical Experiment. In this section we report some of our experience in applying the differential equations to the inverse problem. The computation is carried out in MATLAB on an ALPHA LX workstation. The solvers used for the initial value problem are ode113 and ode15s from the MATLAB ODE suite [19]. The code ode113 is a PECE implementation of Adams-Bashforth-Moulton methods for nonstiff systems. The code ode15s is a quasi-constant step size implementation of the Klopfenstein-Shampine family of numerical differentiation formulas for stiff systems. The statistics about the cost of integration can be obtained directly from the odeset option built into the integrator. More details of these codes can be found in the documentation [19]. The reason for using these two codes is simply convenience and illustration; any other ODE solver can certainly be used instead.

In our experiments the same stringent tolerance is imposed on both the absolute error and the relative error. This criterion is used to control the accuracy in following the solution path; the high accuracy we require here has little to do with the dynamics of the underlying vector field. We examine the output values at time intervals of length 10. The integration terminates automatically when the norm of Δ(P, R), or the relative improvement of Δ(P, R) between two consecutive output points, falls below a prescribed small threshold, indicating that either a stochastic matrix with the prescribed spectrum or, in the case of an infeasible spectrum, a least squares solution has been found. So as to fit the data comfortably in the running text, we report only the case of a single small n and display all numbers with five digits.
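The stopping criterion just described is conveniently implemented through the Events facility of the MATLAB ODE suite. The sketch below reuses the asvd_rhs and y0 of §3; the tolerances 1e-12, the final time 5000, and the threshold 1e-10 are illustrative choices rather than a prescription.

    opts = odeset('RelTol', 1e-12, 'AbsTol', 1e-12, ...       % illustrative tolerances
                  'Events', @(t, y) stop_event(t, y, Lambda));
    [t, y] = ode15s(@(t, y) asvd_rhs(t, y, Lambda), 0:10:5000, y0, opts);

    % Local function: place at the end of the script or in its own file.
    function [value, isterminal, direction] = stop_event(~, y, Lambda)
        n     = size(Lambda, 1);
        X     = reshape(y(1:n*n), n, n);
        s     = y(n*n+1:n*n+n);
        Y     = reshape(y(n*n+n+1:2*n*n+n), n, n);
        R     = reshape(y(2*n*n+n+1:end),   n, n);
        Delta = X*diag(s)*Y' * Lambda * (Y*diag(1./s)*X') - R.*R;
        value      = norm(Delta, 'fro') - 1e-10;   % illustrative stopping threshold
        isterminal = 1;                            % halt the integration
        direction  = -1;                           % only when decreasing through the threshold
    end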

Example 1. To ensure the feasibility of the test data, we start with a randomly generated stochastic matrix and use its eigenvalues as the target spectrum. To demonstrate the robustness of our approach, the initial values of the differential equations are also generated randomly. Reported below is one typical run in our experiments.
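Such a feasible test spectrum can be produced, for example, by normalizing the rows of a random nonnegative matrix; in the MATLAB sketch below the size n = 5 is only an illustrative choice.

    n   = 5;                              % an illustrative size
    A   = rand(n);
    A   = diag(1 ./ sum(A, 2)) * A;       % scale each row so that its sum is one
    lam = eig(A);                         % a feasible target spectrum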

The random matrix A generated in this way is stochastic, and its spectrum, also random but feasible, is used as the target. We note that the presence of complex-conjugate pairs of eigenvalues in the spectrum is quite common. The orthogonal matrices X(0), Y(0) and the diagonal matrix S(0) from the singular value decomposition P(0) = X(0) S(0) Y(0)^T of a randomly generated matrix P(0), together with the matrix R(0), a scalar multiple of the matrix with all entries equal to one, are used as the initial values for X(t), Y(t), S(t), and R(t), respectively. Figure 2 depicts the history of F(P(t), R(t)) throughout the integration. As is expected, F(P(t), R(t)) is monotone decreasing in t. The flow P(t) converges to a nonnegative matrix with the prescribed spectrum, which by Theorem 1.1 is converted into a stochastic matrix B.

Note that B is not expected to be correlated with A other than through the spectrum, since no other information about A is used in the calculation.

Fig. 2. A log-log plot of F(P(t), R(t)) versus t for Example 1.

While the history of F(P(t), R(t)) is independent of the integrator used, Figure 3 indicates the number of steps taken in each interval of length 10 by the nonstiff solver ode113 and by the stiff solver ode15s. Both solvers seem to work reasonably well, although the stiff solver clearly is advancing with much larger step sizes, at the cost of solving implicit algebraic equations. Figure 4 summarizes the statistics of the cost when using ode15s. It should be pointed out that the numerical computation of the partial derivatives and the related function evaluations could have been saved if the interval of output points had been larger.

Suppose we merely change the initial value R(0) in the above to another random matrix. Then the resulting stochastic matrix C turns out to be different from B, illustrating the nonuniqueness of the solution of the inverse spectrum problem and also the robustness of our differential equation approach.

Fig. 3. A comparison of the number of steps taken per interval of length 10 by ode113 and ode15s for Example 1.

Fig. 4. Cost of integration per interval of length 10 when using ode15s for Example 1 (function evaluations, linear solves, successful steps, LU decompositions, and evaluations of partial derivatives).

Example 2. In this example we illustrate the application of our approach to structured stochastic matrices. Suppose we want to find a stochastic matrix with a prescribed set of eigenvalues. Furthermore, suppose we want the Markov chain to be such that the states form a ring and each state is linked to at most its two immediate neighbors. We begin with the initial matrices P(0) and R(0), where R(0) is chosen to carry the ring structure, i.e., the only entries of R(0) allowed to be nonzero are those in positions corresponding to the permitted links of the ring.

As we pointed out earlier, the zeros in R(0) are invariant under the integration of (4) and (5). Thus we are maintaining the ring structure while searching for a matrix with the matched spectrum. It turns out that the limit point of the solution flow is a stochastic matrix D that carries the ring structure and possesses the desired spectrum.
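For completeness, one way to generate a ring-patterned initial value R(0) in MATLAB is sketched below; whether the diagonal (self-loop) positions are allowed is a modeling choice, and here they are included.

    n    = 5;                                                 % an illustrative size
    ring = eye(n) + circshift(eye(n), 1) + circshift(eye(n), -1);
    R0   = ring .* rand(n);               % zeros outside the ring stay zero along (5)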

Example 3. By a result of Dmitriev and Dynkin, a complex number λ with |arg λ| ≤ 2π/n is an eigenvalue of an n × n stochastic matrix if and only if λ lies either in the triangle with vertices 0, 1, and e^{2πi/n} or in the triangle with vertices 0, 1, and e^{−2πi/n}. Consequently, replacing the complex-conjugate pair in the spectrum of Example 1 with another pair of complex-conjugate values taken from these two triangles does not tamper with the fact that every individual value is an eigenvalue of a certain stochastic matrix. However, whether these values are eigenvalues of the same stochastic matrix is difficult to confirm.

We experiment with, for instance, one such pair of complex-conjugate eigenvalues. Using the same initial values X(0), S(0), Y(0), and R(0) as in Example 1, we have experienced extremely slow convergence for this case. The history of F(P(t), R(t)) in Figure 5 clearly indicates this observation. The limit point E of the flow exhibits an unexpected zero structure that we think is the cause of the slow convergence. The variation of the smallest singular value in the ASVD is plotted in Figure 6, indicating that the matrices P(t) stay away from singularity at a good distance.

Fig. 5. A log-log plot of F(P(t), R(t)) versus t for Example 3.

Fig. 6. History of the smallest singular value of P(t) for Example 3.

Suppose we modify the initial value to reflect this structure by simply setting the corresponding entries in the original R(0) to zero. Then the flow converges to another limit point F, at an almost equally slow pace. The spectra of both E and F agree with the desired data up to the integration error.

6. Conclusion. The theory of solvability for the inverse spectrum problem for stochastic or nonnegative matrices is yet to be developed. Nevertheless, we have proposed an ODE approach that is capable of constructing numerically stochastic or nonnegative matrices with the desired spectrum, if the spectrum is feasible. The method is easy to implement with existing ODE solvers. The method can also be used to approximate least squares solutions or to construct linearly structured matrices.

Acknowledgement. The authors wish to thank Bart De Moor for bringing this problem to their attention and Carl Meyer for kindly pointing them to the reference.

REFERENCES

[1] W. W. Barrett and C. R. Johnson, Possible spectra of totally positive matrices, Linear Alg. Appl.
[2] A. Berman and R. J. Plemmons, Nonnegative Matrices in the Mathematical Sciences, Classics in Applied Mathematics, SIAM, Philadelphia.
[3] M. Boyle and D. Handelman, The spectra of nonnegative matrices via symbolic dynamics, Ann. Math.
[4] F. Brauer and J. A. Nohel, Qualitative Theory of Ordinary Differential Equations, Benjamin, New York.
[5] A. Bunse-Gerstner, R. Byers, V. Mehrmann, and N. K. Nichols, Numerical computation of an analytic singular value decomposition of a matrix valued function, Numer. Math.
[6] M. T. Chu and K. R. Driessel, Constructing symmetric nonnegative matrices with prescribed eigenvalues by differential equations, SIAM J. Math. Anal.
[7] B. De Moor, Private communication.
[8] M. Fiedler, Eigenvalues of nonnegative symmetric matrices, Linear Algebra and Appl.
[9] S. Friedland, On an inverse problem for nonnegative and eventually nonnegative matrices, Israel J. Math.
[10] S. Friedland and A. A. Melkman, On the eigenvalues of nonnegative Jacobi matrices, Linear Algebra and Appl.
[11] G. M. L. Gladwell, Inverse Problems in Vibration, Martinus Nijhoff Publishers, Dordrecht, Netherlands.
[12] D. Hershkowitz, Existence of matrices with prescribed eigenvalues and entries, Linear and Multilinear Algebra.
[13] R. Loewy and D. London, A note on an inverse problem for nonnegative matrices, Linear and Multilinear Algebra.
[14] F. I. Karpelevič, On the characteristic roots of matrices with nonnegative elements, Izv. Akad. Nauk SSSR Ser. Mat. (in Russian).
[15] V. Mehrmann and W. Rath, Numerical methods for the computation of analytic singular value decompositions, Elec. J. Numer. Anal.
[16] H. Minc, Nonnegative Matrices, Wiley, New York.
[17] G. N. Oliveira, Nonnegative matrices with prescribed spectrum, Linear Alg. Appl.
[18] M. Renardy and R. C. Rogers, An Introduction to Partial Differential Equations, Springer-Verlag, New York.
[19] L. F. Shampine and M. W. Reichelt, The MATLAB ODE suite, preprint.
[20] G. W. Soules, Constructing symmetric nonnegative matrices, Linear Alg. Appl.
[21] K. Wright, Differential equations for the analytic singular value decomposition of a matrix, Numer. Math.