
Fast algorithms with preprocessing for matrix-vector multiplication problems

I. Gohberg and V. Olshevsky

School of Mathematical Sciences
Raymond and Beverly Sackler Faculty of Exact Sciences
Tel Aviv University, Ramat Aviv, Israel
e-mail: gohberg@math.tau.ac.il, vadim@math.tau.ac.il

Journal of Complexity

Abstract

In this paper the complexity of multiplying a matrix by a vector is studied for Toeplitz, Hankel, Vandermonde and Cauchy matrices and for matrices connected with them (i.e., for transpose, inverse and transpose to inverse matrices). The proposed algorithms have complexities of at most $O(n \log^2 n)$ flops and in a number of cases improve the known estimates. In these algorithms, all the actions on the preparation of a given matrix, aimed at reducing the complexity of the second stage of computations directly connected with the multiplication by an arbitrary vector, are singled out in a separate preprocessing phase. Incidentally, effective algorithms for computing the Vandermonde determinant and the determinant of a Cauchy matrix are given.

Introduction

Let the matrix $A \in C^{n \times n}$ be given by all its $n^2$ entries. The problem is to compute the product $Ab$ of the matrix $A$ by an input vector $b \in C^n$. Using the standard rule of matrix times vector multiplication, this problem can be solved in $2n^2 - n$ flops (i.e., floating point operations of addition, subtraction, multiplication and division). In the following three simple examples the special structure of a matrix enables faster computation of the product by a vector.

Examples

Band matrices

Let $A = (a_{ij})_{i,j=1}^{n}$ be a band matrix with band width $2k+1$, i.e., by definition, $a_{ij} = 0$ while $|j - i| > k$. Obviously, in this case computing the product $Ab$ costs at most $(4k+1)n$ flops.

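Small rank matrices

Let the matrix $A \in C^{n \times n}$ with rank $R$ be given in the form of the outer sum of $R$ terms:

$$A = \sum_{m=1}^{R} g_m h_m^{T}, \qquad g_m, h_m \in C^n.$$

This representation shows that $A$ can be multiplied by an arbitrary vector $b$ in fewer than $4Rn$ flops: $R$ inner products $h_m^{T} b$, $R$ scalings of the vectors $g_m$, and $R - 1$ vector additions.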

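Semiseparable matrices

Such a matrix $A = (a_{ij})_{i,j=1}^{n}$ is defined by the equalities

$$a_{ij} = \begin{cases} \sum_{m=1}^{R} f_{mi}\, g_{mj}, & i \ge j, \\ 0, & i < j, \end{cases}$$

where $f_m = (f_{mi})_{i=1}^{n}$ and $g_m = (g_{mi})_{i=1}^{n}$ $(m = 1, \dots, R)$ are given vectors from $C^n$. In this case

$$A = \sum_{m=1}^{R} \operatorname{diag}(f_m)\, L\, \operatorname{diag}(g_m),$$

where $L$ is the lower triangular matrix whose entries on and below the main diagonal all equal one, and $\operatorname{diag}(f)$ denotes the diagonal matrix whose entries on the main diagonal equal the coordinates of the vector $f \in C^n$. Since $L$ is applied to a vector by $n - 1$ successive additions, the product of a semiseparable matrix by an arbitrary vector can be computed from this representation in $O(Rn)$ flops. Obviously, the analogous estimate holds for the transposes of matrices of this form.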
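The semiseparable example can be illustrated with a minimal NumPy sketch. It assumes the lower triangular convention ($a_{ij} = \sum_m f_{mi} g_{mj}$ for $i \ge j$ and zero otherwise); the function name `semiseparable_matvec` and the array layout (generators stored as rows of `F` and `G`) are choices made for this illustration only.

```python
import numpy as np

def semiseparable_matvec(F, G, b):
    """Return A @ b where A[i, j] = sum_m F[m, i] * G[m, j] for i >= j, else 0.

    F and G have shape (R, n). The cumulative sum plays the role of the
    lower triangular all-ones matrix, so the cost is O(R * n).
    """
    return np.sum(F * np.cumsum(G * b, axis=1), axis=0)

rng = np.random.default_rng(0)
R, n = 3, 6
F = rng.standard_normal((R, n))
G = rng.standard_normal((R, n))
b = rng.standard_normal(n)

A_dense = np.tril(F.T @ G)          # dense reference, O(n^2) storage
y_fast = semiseparable_matvec(F, G, b)
y_ref = A_dense @ b
```

The dense matrix is built only to check the result; the fast routine never forms it.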

Let the matrix $A \in C^{n \times n}$ be given. The task is to compute the products $Ab_1, Ab_2, \dots$ of the matrix $A$ by input vectors $b_1, b_2, \dots \in C^n$ in the smallest possible time. Such a situation arises naturally in a number of computational problems, for example, when we have to carry out iterations with a given matrix. The problem of computing the product of a matrix by a matrix can also be solved in this framework; in that case the columns of the second matrix are interpreted as input vectors.

In accordance with the accepted scheme we single out all the computations which do not depend on the input vectors $b_1, b_2, \dots$. Accordingly, the proposed algorithms are divided into two separate phases:

I. Preprocessing for the matrix $A \in C^{n \times n}$.
II. Application to the vector.

The first phase also contains the preparation of the given matrix which enables the second phase to be accomplished more effectively.

In the present paper we embody this scheme and propose a number of fast algorithms for matrices with a certain structure, namely for the transposed Vandermonde matrix, for the transpose to the inverse of a Vandermonde matrix, for Cauchy matrices, and for matrices connected with them (i.e., for transpose, inverse and transpose to inverse matrices).

Background

Basic algorithms

In this paper a limited number of well-known algorithms is used intensively. These algorithms are listed below, accompanied by estimates of their complexities. Particular implementations of these basic algorithms and their complexity analysis can be found in various sources; see, for example, Aho, Hopcroft and Ullman.

BA1. Evaluation algorithm.
An algorithm for evaluating a polynomial of degree $n - 1$ at $n$ points. The complexity of this algorithm will be denoted by $\varepsilon(n)$. It is well known that

$$\varepsilon(n) = O(n \log^2 n).$$

BA2. Interpolation algorithm.
An algorithm for interpolating a polynomial of degree $n - 1$ from its values at $n$ points. The complexity of the interpolation algorithm is denoted by $\iota(n)$ and, as is well known,

$$\iota(n) = O(n \log^2 n).$$

BA3. Fast Fourier Transform algorithm.
The discrete Fourier transform of the vector $r = (r_i)_{i=0}^{n-1} \in C^n$ is, by definition, the vector

$$\Big( \sum_{i=0}^{n-1} r_i\, \omega^{ik} \Big)_{k=0}^{n-1} \in C^n,$$

where $\omega$ is a primitive $n$th root of unity. The discrete Fourier transform can be computed by well-known methods, collectively named the Fast Fourier Transform (FFT). It is well known that for the complexity $\nu(n)$ of computing one FFT of order $n$ the following estimate holds:

$$\nu(n) = O(n \log n).$$

Basic matrices

In this subsection we consider some matrices which are known to have a complexity of multiplication with a vector lower than $O(n^2)$ flops. Note that the methods for fast computation of the products of these basic matrices by vectors are based on the algorithms BA1 to BA3.

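BM1. Vandermonde matrix.
By definition, the Vandermonde matrix $V(t)$ is the matrix of the form

$$V(t) = \begin{bmatrix} 1 & t_1 & \cdots & t_1^{n-1} \\ 1 & t_2 & \cdots & t_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & t_n & \cdots & t_n^{n-1} \end{bmatrix} \quad \text{with } t = (t_i)_{i=1}^{n} \in C^n.$$

Obviously, the problem of computing the product $V(t)b$ of the matrix $V(t)$ by a vector $b = (b_i)_{i=0}^{n-1}$ is equivalent to the problem of evaluating the polynomial $p(\lambda) = \sum_{i=0}^{n-1} b_i \lambda^{i}$ at the $n$ points $t_1, t_2, \dots, t_n$. Thus the product $V(t)b$ can be computed using the evaluation algorithm BA1 in $\varepsilon(n)$ flops.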
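The equivalence between the product $V(t)b$ and polynomial evaluation can be checked with a short sketch. Note that it uses Horner's rule at every point, which costs $O(n^2)$ flops overall; it is not the fast multipoint evaluation algorithm the text refers to, only an illustration of the equivalence. The helper name `vandermonde_matvec` is hypothetical.

```python
import numpy as np

def vandermonde_matvec(t, b):
    """Return V(t) @ b by evaluating p(x) = sum_k b[k] * x**k at every t[i].

    Horner's rule is applied to all points at once (O(n^2) flops in total).
    """
    y = np.zeros(len(t), dtype=complex)
    for coeff in b[::-1]:          # highest power first
        y = y * t + coeff
    return y

rng = np.random.default_rng(1)
t = rng.standard_normal(5)
b = rng.standard_normal(5)

y_eval = vandermonde_matvec(t, b)
y_dense = np.vander(t, increasing=True) @ b   # rows are [1, t_i, ..., t_i^(n-1)]
```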

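BM2. Inverse of Vandermonde matrix.
Consider the problem of applying to a vector the matrix $A \in C^{n \times n}$ which is defined as the inverse of a given Vandermonde matrix $V(t)$, $t \in C^n$. It is easy to see that the product $Ab$ is the vector of coefficients of the polynomial of degree $n - 1$ which takes the values $b_i$ at the points $t_i$, so it can be computed using the interpolation algorithm BA2. Thus, for the inverse of a Vandermonde matrix, the application to a vector costs $\iota(n)$ flops.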

BM3. Fourier matrix and its inverse.
Let $\omega = e^{2\pi i / n}$ be a primitive $n$th root of unity. Consider the Fourier matrix $F = \frac{1}{\sqrt{n}} V(w)$ with $w = (\omega^{i})_{i=0}^{n-1}$ and its inverse $F^{-1} = F^{*}$, where the superscript $*$ means conjugate transpose. The products $Fb$ and $F^{*}b$ can be computed using the FFT algorithm BA3 in $\nu(n)$ flops.

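BM4. Factor circulant.
By $\operatorname{Circ}_\varphi(r)$ will be denoted the $\varphi$-circulant with the first column $r = (r_i)_{i=0}^{n-1}$, i.e., the matrix of the form

$$\operatorname{Circ}_\varphi(r) = \begin{bmatrix} r_0 & \varphi r_{n-1} & \cdots & \varphi r_1 \\ r_1 & r_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \varphi r_{n-1} \\ r_{n-1} & \cdots & r_1 & r_0 \end{bmatrix}.$$

The matrix $\operatorname{Circ}_\varphi(r)$ is referred to as a factor circulant. It is known (see Cline, Plemmons and Worm) that the matrix $\operatorname{Circ}_\varphi(r)$ admits the following decomposition:

$$\operatorname{Circ}_\varphi(r) = D_\varphi^{-1} F^{*} \Lambda_\varphi(r) F D_\varphi,$$

where $F$ is the Fourier matrix, $\Lambda_\varphi(r) = \operatorname{diag}(\sqrt{n}\, F D_\varphi r)$, and $D_\varphi = \operatorname{diag}(\xi^{i})_{i=0}^{n-1}$ with $\xi \in C$ satisfying the condition $\xi^{n} = \varphi$. From this decomposition it follows that if the coordinates of the vector $(\xi^{i})_{i=0}^{n-1} \in C^n$ are given, then the product of the matrix $\operatorname{Circ}_\varphi(r)$ by an arbitrary vector can be computed in

$$2\nu(n) + O(n)$$

flops. In this case the preprocessing phase consists of computing the central factor $\Lambda_\varphi(r)$ in the right-hand side of the decomposition and costs

$$\nu(n) + O(n)$$

flops.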
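The two phases for a factor circulant can be sketched as follows. The sketch assumes the convention that entry $(i, j)$ of the $\varphi$-circulant equals $r_{i-j}$ for $i \ge j$ and $\varphi\, r_{n+i-j}$ for $i < j$, and it uses NumPy's unnormalized FFT, so the stored diagonal equals $\sqrt{n}\, F D_\varphi r$. The function names are hypothetical.

```python
import numpy as np

def preprocess_factor_circulant(r, phi):
    """Phase I: one FFT. Returns the diagonal of D_phi and the eigenvalues."""
    n = len(r)
    xi = complex(phi) ** (1.0 / n)       # one root of xi**n == phi
    d = xi ** np.arange(n)               # diagonal of D_phi
    lam = np.fft.fft(d * r)              # central diagonal factor
    return d, lam

def apply_factor_circulant(d, lam, b):
    """Phase II: two FFTs plus O(n) scalings."""
    return np.fft.ifft(lam * np.fft.fft(d * b)) / d

def dense_factor_circulant(r, phi):
    """Dense reference matrix, used only for checking."""
    n = len(r)
    C = np.empty((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            C[i, j] = r[i - j] if i >= j else phi * r[n + i - j]
    return C

rng = np.random.default_rng(2)
r = rng.standard_normal(6)
b = rng.standard_normal(6)
phi = 2.0 - 1.0j

d, lam = preprocess_factor_circulant(r, phi)
y_fast = apply_factor_circulant(d, lam, b)
y_ref = dense_factor_circulant(r, phi) @ b
```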

BM5. Toeplitz matrices.
The product of a Toeplitz matrix $A = (a_{i-j})_{i,j=1}^{n}$ by a vector can be computed by embedding the matrix $A$ in a circulant of double size. Correspondingly, the amount of operations is

$$2\nu(2n) + O(n) = 4\nu(n) + O(n)$$

flops after

$$\nu(2n) + O(n) = 2\nu(n) + O(n)$$

flops of preprocessing.
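A minimal sketch of the circulant embedding, assuming the Toeplitz matrix is given by its first column `col` and first row `row` (with `col[0] == row[0]`). The names are hypothetical; the preprocessing is one FFT of order $2n$ and each application uses two more.

```python
import numpy as np

def preprocess_toeplitz(col, row):
    """Phase I: FFT of the first column of the 2n x 2n circulant whose
    top-left n x n block is the Toeplitz matrix."""
    c = np.concatenate([col, [0.0], row[:0:-1]])   # length 2n
    return np.fft.fft(c)

def apply_toeplitz(lam, b):
    """Phase II: pad b with zeros, multiply by the circulant, keep n entries."""
    n = len(b)
    padded = np.concatenate([b, np.zeros(n)])
    return np.fft.ifft(lam * np.fft.fft(padded))[:n]

rng = np.random.default_rng(3)
n = 5
col = rng.standard_normal(n)
row = np.concatenate([[col[0]], rng.standard_normal(n - 1)])
b = rng.standard_normal(n)

# Dense reference: T[i, j] = col[i - j] if i >= j else row[j - i]
T = np.array([[col[i - j] if i >= j else row[j - i] for j in range(n)]
              for i in range(n)])

lam = preprocess_toeplitz(col, row)
y_fast = apply_toeplitz(lam, b)
y_ref = T @ b
```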

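BM6. Inverses of Toeplitz matrices.
In computations with the inverses of Toeplitz matrices the Gohberg-Semencul formula (see Gohberg and Semencul, or Gohberg and Feldman) is useful. This formula represents the inverse of a Toeplitz matrix as a sum of products of triangular Toeplitz matrices. Namely, if for the Toeplitz matrix $T$ the equations

$$T x = e_0, \qquad T y = e_{n-1},$$

with $e_0 = [1, 0, \dots, 0]^{T}$ and $e_{n-1} = [0, \dots, 0, 1]^{T}$, have solutions $x = (x_i)_{i=0}^{n-1}$, $y = (y_i)_{i=0}^{n-1}$, and $x_0 \ne 0$, then $T$ is invertible and

$$T^{-1} = \frac{1}{x_0} \left( \begin{bmatrix} x_0 & & & 0 \\ x_1 & x_0 & & \\ \vdots & \ddots & \ddots & \\ x_{n-1} & \cdots & x_1 & x_0 \end{bmatrix} \begin{bmatrix} y_{n-1} & y_{n-2} & \cdots & y_0 \\ & y_{n-1} & \ddots & \vdots \\ & & \ddots & y_{n-2} \\ 0 & & & y_{n-1} \end{bmatrix} - \begin{bmatrix} 0 & & & 0 \\ y_0 & 0 & & \\ \vdots & \ddots & \ddots & \\ y_{n-2} & \cdots & y_0 & 0 \end{bmatrix} \begin{bmatrix} 0 & x_{n-1} & \cdots & x_1 \\ & 0 & \ddots & \vdots \\ & & \ddots & x_{n-1} \\ 0 & & & 0 \end{bmatrix} \right).$$

On the basis of this formula a number of fast algorithms for the inversion of Toeplitz matrices were elaborated (see Brent, Gustavson and Yun; de Hoog; Chun and Kailath), where methods for solving equations of the above form are suggested. Denote by $\tau(n)$ the complexity of solving one equation of this form. According to Brent, Gustavson and Yun, de Hoog, and Chun and Kailath,

$$\tau(n) = O(n \log^2 n).$$

Thus, if the matrix $A \in C^{n \times n}$ is the inverse of a given Toeplitz matrix, then it follows from the above arguments that, after a preprocessing which consists of solving the two equations in $2\tau(n)$ flops and preparing the four triangular Toeplitz factors as in BM5, the product of $A$ by an arbitrary vector can be computed by the Gohberg-Semencul formula with a fixed number of FFTs of order $2n$ plus $O(n)$ flops. Below we show how this complexity can be reduced.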

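If for the Toeplitz matrix $T \in C^{n \times n}$ the equations $Tx = e_0$, $Ty = e_{n-1}$ have solutions $x = (x_i)_{i=0}^{n-1}$, $y = (y_i)_{i=0}^{n-1}$, and $x_0 \ne 0$, then

$$T^{-1} = \frac{1}{x_0 (\varphi - \psi)} \Big( \operatorname{Circ}_\psi(x)\, \operatorname{Circ}_\varphi(Z_\varphi y) - \operatorname{Circ}_\psi(Z_\psi y)\, \operatorname{Circ}_\varphi(x) \Big),$$

where the numbers $\varphi \ne \psi$ are arbitrary and

$$Z_\varphi = \begin{bmatrix} 0 & \cdots & 0 & \varphi \\ 1 & & & 0 \\ & \ddots & & \vdots \\ 0 & & 1 & 0 \end{bmatrix}$$

is the $\varphi$-cyclic lower shift matrix. This formula was obtained in Ammar and Gader for positive definite Toeplitz matrices and for the parameters $1$ and $-1$ (circulant and skew-circulant factors), and was extended to the general case in Gohberg and Olshevsky. Note that in the latter paper one can also find other factor circulant decompositions for the inverses of Toeplitz matrices which are useful in fast computations with Toeplitz matrices. Furthermore, representing each factor circulant in this formula in the form $\operatorname{Circ}_\varphi(r) = D_\varphi^{-1} F^{*} \Lambda_\varphi(r) F D_\varphi$ of BM4, we finally get

$$T^{-1} = \frac{1}{x_0 (\varphi - \psi)} \Big( D_\psi^{-1} F^{*} \Lambda_\psi(x) F D_\psi D_\varphi^{-1} F^{*} \Lambda_\varphi(Z_\varphi y) F D_\varphi - D_\psi^{-1} F^{*} \Lambda_\psi(Z_\psi y) F D_\psi D_\varphi^{-1} F^{*} \Lambda_\varphi(x) F D_\varphi \Big),$$

where $F$ is the Fourier matrix and the diagonal matrices $D_\varphi$ and $\Lambda_\varphi(r)$ $(r \in C^n)$ are defined as in BM4. From this representation it follows that after

$$2\tau(n) + 4\nu(n) + O(n)$$

flops of preprocessing the product of the inverse of the Toeplitz matrix by an arbitrary vector can be computed in

$$6\nu(n) + O(n)$$

flops: the transform $F D_\varphi b$ and the final transform $F^{*}$ are shared by the two terms, and each term needs two further FFTs. The last estimate is obtained here without the additional requirement for the matrix $A$ to be Hermitian. In the Hermitian case this result was obtained in the earlier paper of Ammar and Gader. Note that these two estimates imply that the product of the inverse of a Toeplitz matrix by an arbitrary matrix can be computed in $2\tau(n) + (6n + 4)\nu(n) + O(n^2)$ flops.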

BM7. Hankel matrices.
Let us recall that an arbitrary Hankel matrix $H = (a_{i+j})_{i,j=1}^{n}$ can be transformed into a Toeplitz matrix by multiplication from the right by the reverse identity matrix

$$J = \begin{bmatrix} 0 & \cdots & 0 & 1 \\ \vdots & & 1 & 0 \\ 0 & 1 & & \vdots \\ 1 & 0 & \cdots & 0 \end{bmatrix}.$$

Therefore the complexity of multiplication with a vector for the class of Hankel matrices coincides with that for the class of Toeplitz matrices.
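The reduction of a Hankel product to a Toeplitz product rests on $Hb = (HJ)(Jb)$, where $HJ$ is Toeplitz and $Jb$ is the reversed vector. A small sketch, assuming zero-based storage `H[i, j] = a[i + j]` with `a` of length $2n - 1$ (the function name is hypothetical):

```python
import numpy as np

def hankel_matvec(a, b):
    """Return H @ b for H[i, j] = a[i + j], via the Toeplitz matrix T = H J."""
    n = len(b)
    col = a[n - 1:]                # first column of T: T[i, 0] = a[i + n - 1]
    row = a[n - 1::-1]             # first row of T:    T[0, j] = a[n - 1 - j]
    c = np.concatenate([col, [0.0], row[:0:-1]])      # circulant embedding of T
    padded = np.concatenate([b[::-1], np.zeros(n)])   # J b, padded to length 2n
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(padded))[:n]

rng = np.random.default_rng(4)
n = 5
a = rng.standard_normal(2 * n - 1)
b = rng.standard_normal(n)

idx = np.arange(n)
H = a[np.add.outer(idx, idx)]      # dense reference
y_fast = hankel_matvec(a, b)
y_ref = H @ b
```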

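Transpose to Vandermonde and to inverse of Vandermonde matrices

Relations between Vandermonde matrices and Bezoutians

Let $n$ be the maximal degree of two polynomials $f(\lambda)$ and $g(\lambda)$; then the bilinear form

$$B_{f,g}(\lambda, \mu) = \frac{f(\lambda) g(\mu) - f(\mu) g(\lambda)}{\lambda - \mu} = \sum_{i,j=0}^{n-1} b_{ij} \lambda^{i} \mu^{j}$$

is called the Bezoutian of $f$ and $g$. The matrix

$$B_{f,g} = (b_{ij})_{i,j=0}^{n-1},$$

whose entries are determined by the coefficients of the bilinear form $B_{f,g}(\lambda, \mu)$, will be referred to as the Bezout matrix which corresponds to the polynomials $f$ and $g$. Obviously,

$$B_{f,g}(\lambda, \mu) = \begin{bmatrix} 1 & \lambda & \cdots & \lambda^{n-1} \end{bmatrix} B_{f,g} \begin{bmatrix} 1 & \mu & \cdots & \mu^{n-1} \end{bmatrix}^{T}.$$

This equality yields the following useful property of the Bezout matrix:

$$V(s)\, B_{f,g}\, V(t)^{T} = \big( B_{f,g}(s_i, t_j) \big)_{i,j=1}^{n},$$

where $s = (s_i)_{i=1}^{n}$, $t = (t_i)_{i=1}^{n} \in C^n$. From this property and the obvious equality

$$B_{f,g}(\lambda, \lambda) = f'(\lambda) g(\lambda) - f(\lambda) g'(\lambda)$$

it follows that if $t_1, t_2, \dots, t_n$ are $n$ simple roots of the polynomial $f(\lambda)$, then

$$V(t)\, B_{f,g}\, V(t)^{T} = \operatorname{diag}\big( f'(t_i)\, g(t_i) \big)_{i=1}^{n}.$$

The latter formula appeared in Lander and can also be found in Heinig and Rost. On the basis of it, a fast algorithm for multiplication of a transposed Vandermonde matrix by a vector is presented in the next subsection.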
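The two Bezoutian identities above can be checked numerically. The sketch below assumes the sign convention $B_{f,g}(\lambda,\mu) = (f(\lambda)g(\mu) - f(\mu)g(\lambda))/(\lambda - \mu)$ and builds the Bezout matrix from the classical entrywise formula $b_{ij} = \sum_k (f_{j+k+1} g_{i-k} - f_{i-k} g_{j+k+1})$; the function name `bezout_matrix` and the test polynomials are illustrative only.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def bezout_matrix(f, g):
    """Bezout matrix of f and g, given as coefficient arrays of length n + 1
    in ascending powers (O(n^3) reference construction)."""
    n = len(f) - 1
    B = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            for k in range(min(i, n - 1 - j) + 1):
                B[i, j] += f[j + k + 1] * g[i - k] - f[i - k] * g[j + k + 1]
    return B

t = np.array([-0.7, 0.2, 0.9, 1.5])            # simple roots of f
s = np.array([-1.1, 0.4, 0.6, 2.0])
n = len(t)
f = P.polyfromroots(t)                          # monic, ascending coefficients
g = np.array([0.5, -1.0, 0.0, 2.0, 1.0])        # an arbitrary polynomial of degree n

B = bezout_matrix(f, g)
Vs = np.vander(s, increasing=True)
Vt = np.vander(t, increasing=True)

# Property 1: V(s) B V(t)^T has entries B(s_i, t_j)
fs, gs = P.polyval(s, f), P.polyval(s, g)
ft, gt = P.polyval(t, f), P.polyval(t, g)
lhs = Vs @ B @ Vt.T
rhs = (np.outer(fs, gt) - np.outer(gs, ft)) / (s[:, None] - t[None, :])

# Property 2: V(t) B V(t)^T = diag(f'(t_i) g(t_i)) when the t_i are roots of f
diag_lhs = Vt @ B @ Vt.T
diag_rhs = np.diag(P.polyval(t, P.polyder(f)) * gt)
```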

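Transpose to Vandermonde matrix and its inverse

For a vector $t = (t_i)_{i=1}^{n}$ ($t_i \ne t_j$ while $i \ne j$) set

$$f(\lambda) = \prod_{i=1}^{n} (\lambda - t_i) = \sum_{i=0}^{n} r_i \lambda^{i} \quad \text{and} \quad g(\lambda) = \lambda^{n} - \varphi,$$

where $\varphi$ is such that $t_i^{n} \ne \varphi$ $(i = 1, \dots, n)$. Under these conditions the diagonal matrix in the right-hand side of the identity $V(t) B_{f,g} V(t)^{T} = \operatorname{diag}(f'(t_i) g(t_i))_{i=1}^{n}$ is invertible. Hence all the matrices in its left-hand side are also invertible and, moreover,

$$V(t)^{T} = B_{f,g}^{-1}\, V(t)^{-1} \operatorname{diag}\big( f'(t_i)\, g(t_i) \big)_{i=1}^{n}.$$

Furthermore, the matrix $B_{f,g}$ is the Bezout matrix of two polynomials, one of them having the special form $g(\lambda) = \lambda^{n} - \varphi$. This fact allows us to obtain for $B_{f,g}$ and for its inverse a representation involving a factor circulant. To prove this, let us compute the Bezoutian of the polynomials $f$ and $g$:

$$B_{f,g}(\lambda, \mu) = \frac{f(\lambda)(\mu^{n} - \varphi) - f(\mu)(\lambda^{n} - \varphi)}{\lambda - \mu} = - \sum_{i=0}^{n-1} r_i \sum_{k=0}^{n-i-1} \lambda^{i+k} \mu^{n-1-k} - \varphi \sum_{i=1}^{n} r_i \sum_{k=0}^{i-1} \lambda^{k} \mu^{i-1-k},$$

with $r_n = 1$. From the last identity it can easily be seen that the corresponding Bezout matrix $B_{f,g}$ has the special form

$$B_{f,g} = - \begin{bmatrix} \varphi r_1 & \varphi r_2 & \cdots & \varphi r_{n-1} & r_0 + \varphi \\ \varphi r_2 & & & r_0 + \varphi & r_1 \\ \vdots & & & & \vdots \\ \varphi r_{n-1} & r_0 + \varphi & & & r_{n-2} \\ r_0 + \varphi & r_1 & \cdots & r_{n-2} & r_{n-1} \end{bmatrix} = - \operatorname{Circ}_\varphi(r + \varphi e_0)\, J,$$

where $r = (r_i)_{i=0}^{n-1}$, $e_0 = [1, 0, \dots, 0]^{T}$ and $J$ is, as above, the reverse identity matrix. Substituting this into the representation of $V(t)^{T}$, we have

$$V(t)^{T} = - J\, \operatorname{Circ}_\varphi(r + \varphi e_0)^{-1}\, V(t)^{-1} \operatorname{diag}\big( f'(t_i)\, g(t_i) \big)_{i=1}^{n}.$$

On the basis of the last formula the following algorithm is proposed.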

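Algorithm. Computing the product of a transposed Vandermonde matrix with an arbitrary vector.

I. Preprocessing.
Complexity: $\iota(n) + \varepsilon(n) + \nu(n) + 2n \log n + O(n)$ flops.
Input: Vector $t = (t_i)_{i=1}^{n} \in C^n$ ($t_i \ne t_j$ while $i \ne j$).
Output: Parameters of the representation of $V(t)^{T}$ in the form

$$V(t)^{T} = - J D_\varphi^{-1} F^{*} \Lambda_\varphi(r + \varphi e_0)^{-1} F D_\varphi\, V(t)^{-1} D_{f,g}.$$

1. Compute the coefficients of the polynomial $f(\lambda) = \prod_{i=1}^{n} (\lambda - t_i) = \sum_{i=0}^{n} r_i \lambda^{i}$ in $\iota(n)$ flops.
2. Compute in $O(n)$ flops the coefficients of the polynomial $f'(\lambda)$.
3. Evaluate in $\varepsilon(n)$ flops the values $f'(t_i)$ $(i = 1, \dots, n)$.
4. Evaluate in at most $2n \log n + n$ flops the values $g(t_i) = t_i^{n} - \varphi$ $(i = 1, \dots, n)$, with $\varphi$ as fixed in step 6, using, for example, the algorithms for the evaluation of powers in Knuth (Seminumerical algorithms).
5. Compute in $n$ flops the entries of the matrix $D_{f,g} = \operatorname{diag}(f'(t_i)\, g(t_i))_{i=1}^{n}$.
6. Set $\xi$ to a number exceeding $\max_{1 \le i \le n} |t_i|$. Compute in $O(n)$ flops the numbers $\xi^{i}$ $(i = 0, 1, \dots, n)$. Set $\varphi = \xi^{n}$ and $D_\varphi = \operatorname{diag}(\xi^{i})_{i=0}^{n-1}$.
7. Compute in $O(n)$ flops the entries of the matrix $D_\varphi^{-1}$.
8. Compute the coordinates of the vector $F D_\varphi y$ with $y = r + \varphi e_0$ in $\nu(n) + O(n)$ flops using BM3, where $r = (r_i)_{i=0}^{n-1}$.
9. Compute in $O(n)$ flops the inverse of the diagonal matrix $\Lambda_\varphi(r + \varphi e_0) = \operatorname{diag}(\sqrt{n}\, F D_\varphi y)$.

II. Application to the vector.
Complexity: $\iota(n) + 2\nu(n) + O(n)$ flops.
Input: Vector $b \in C^n$.
Output: Vector $V(t)^{T} b \in C^n$.

1. Compute the vector $V(t)^{T} b$ from the above representation, using BM2 for the factor $V(t)^{-1}$ and BM4 for the factor circulant.

The product of a transposed Vandermonde matrix with an arbitrary matrix can be computed using this algorithm in

$$(n + 1)\,\iota(n) + \varepsilon(n) + (2n + 1)\,\nu(n) + O(n^2)$$

flops.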
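The two phases for the transposed Vandermonde product can be sketched as below. The sketch assumes the sign conventions $g(\lambda) = \lambda^{n} - \varphi$ and $B_{f,g} = -\operatorname{Circ}_\varphi(r + \varphi e_0) J$, chooses $\xi = 1 + \max |t_i|$ as one admissible value, and replaces the fast interpolation and evaluation steps by dense NumPy calls, so it illustrates the factorization rather than the stated flop counts. The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def preprocess_vt(t):
    """Phase I: parameters of V(t)^T = -J Circ_phi(y)^{-1} V(t)^{-1} D_fg."""
    n = len(t)
    f = P.polyfromroots(t)                   # r_0, ..., r_n (ascending), r_n = 1
    xi = 1.0 + np.max(np.abs(t))             # guarantees t_i**n != xi**n
    phi = xi ** n
    d = xi ** np.arange(n)                   # diagonal of D_phi
    d_fg = P.polyval(t, P.polyder(f)) * (t ** n - phi)   # f'(t_i) * g(t_i)
    y = f[:n].astype(complex)
    y[0] += phi                              # y = r + phi * e_0
    lam = np.fft.fft(d * y)                  # eigenvalues of Circ_phi(y)
    return d, d_fg, lam

def apply_vt(t, params, b):
    """Phase II: return V(t)^T @ b."""
    d, d_fg, lam = params
    V = np.vander(t, increasing=True)
    w = np.linalg.solve(V, d_fg * b)         # dense stand-in for interpolation
    z = np.fft.ifft(np.fft.fft(d * w) / lam) / d   # Circ_phi(y)^{-1} w
    return -z[::-1]                          # multiply by -J

t = np.array([-0.9, -0.3, 0.1, 0.5, 0.8])
b = np.random.default_rng(5).standard_normal(len(t))

params = preprocess_vt(t)
y_fast = apply_vt(t, params, b)
y_ref = np.vander(t, increasing=True).T @ b
```

In a genuinely fast implementation the call to `np.linalg.solve` would be replaced by an $O(n \log^2 n)$ interpolation routine; the FFT steps are already of the stated order.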

Note that representations of the matrix $V(t)^{T}$ in the form of the product of the matrix $V(t)^{-1}$ by some matrices with lower complexity of multiplication with vectors can be found in Heinig and Rost and in Canny, Kaltofen and Yagati. In Canny, Kaltofen and Yagati, for example, the matrix $V(t)^{T}$ is represented in the form

$$V(t)^{T} = H\, V(t)^{-1},$$

where the matrix $H = V(t)^{T} V(t)$ turns out to be a Hankel matrix. For computing the entries of $H$ it is proposed in Canny, Kaltofen and Yagati to interpolate the $n$th degree polynomial from its zeros at $n$ points and then to solve an equation of the type $Tx = b$ with a Toeplitz matrix $T$ from $C^{n \times n}$. Let $\tau(n)$ be, as above, the complexity of solving one Toeplitz equation. Thus computing the entries of the matrix $H$ by the scheme in Canny, Kaltofen and Yagati, and afterwards the preparation of the Hankel matrix $H$ according to BM7, costs $\iota(n) + \tau(n) + 2\nu(n) + O(n)$ flops. This complexity exceeds the complexity of preprocessing in the algorithm above. Furthermore, the appearance of the factor circulant in our representation instead of the Hankel matrix is reflected in an additional economy of $2\nu(n)$ flops at the stage of application to the vector.

The same formula also enables us to propose a fast algorithm for computing the product of the matrix $V(t)^{-T}$ by a vector. Here the preprocessing phase consists of computing the parameters of the representation of $V(t)^{-T}$ in the form analogous to the above representation of $V(t)^{T}$, and costs

$$\iota(n) + \varepsilon(n) + \nu(n) + 2n \log n + O(n)$$

flops. Afterwards the application to the vector costs

$$\varepsilon(n) + 2\nu(n) + O(n)$$

flops. The product of the matrix $V(t)^{-T}$ by an arbitrary matrix can be computed in

$$\iota(n) + (n + 1)\,\varepsilon(n) + (2n + 1)\,\nu(n) + O(n^2)$$

flops.

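The identity $V(t) B_{f,g} V(t)^{T} = \operatorname{diag}(f'(t_i) g(t_i))_{i=1}^{n}$ and the representation $B_{f,g} = -\operatorname{Circ}_\varphi(r + \varphi e_0) J$ yield the following representation for the inverse of a Vandermonde matrix:

$$V(t)^{-1} = - \operatorname{Circ}_\varphi(r + \varphi e_0)\, J\, V(t)^{T} \operatorname{diag}\Big( \frac{1}{f'(t_i)\,(t_i^{n} - \varphi)} \Big)_{i=1}^{n}.$$

Here, as above, $J$ stands for the reverse identity matrix, $f(\lambda) = \prod_{i=1}^{n} (\lambda - t_i) = \sum_{i=0}^{n} r_i \lambda^{i}$ and $r = (r_i)_{i=0}^{n-1}$. The last formula can be used for computing the entries of the inverse of a Vandermonde matrix in $O(n^2)$ flops.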

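Cauchy matrices and matrices connected with them

By definition, the Cauchy matrix $C(s,t)$ has the form

$$C(s,t) = \Big( \frac{1}{s_i - t_j} \Big)_{i,j=1}^{n},$$

where $s = (s_i)_{i=1}^{n}$ and $t = (t_i)_{i=1}^{n}$ are given vectors from $C^n$ such that $t_i, s_i$ $(i = 1, \dots, n)$ are $2n$ different complex numbers. As in the previous section, set

$$f(\lambda) = \prod_{i=1}^{n} (\lambda - t_i) = \sum_{i=0}^{n} r_i \lambda^{i}, \qquad g(\lambda) = \lambda^{n} - \varphi,$$

where $\varphi$ is such that $t_i^{n} \ne \varphi$ $(i = 1, \dots, n)$. Since $f(t_j) = 0$, the property $V(s) B_{f,g} V(t)^{T} = (B_{f,g}(s_i, t_j))_{i,j=1}^{n}$ of the Bezout matrix yields

$$\operatorname{diag}\big(f(s_i)\big)_{i=1}^{n}\; C(s,t)\; \operatorname{diag}\big(g(t_i)\big)_{i=1}^{n} = \Big( \frac{f(s_i)\, g(t_j)}{s_i - t_j} \Big)_{i,j=1}^{n} = V(s)\, B_{f,g}\, V(t)^{T}.$$

From here follows

$$C(s,t) = \operatorname{diag}\Big( \frac{1}{f(s_i)} \Big)_{i=1}^{n} V(s)\, B_{f,g}\, V(t)^{T} \operatorname{diag}\Big( \frac{1}{g(t_i)} \Big)_{i=1}^{n}.$$

This equality and the identity $V(t) B_{f,g} V(t)^{T} = \operatorname{diag}(f'(t_i) g(t_i))_{i=1}^{n}$ yield the following representation of the Cauchy matrix:

$$C(s,t) = \operatorname{diag}\Big( \frac{1}{f(s_i)} \Big)_{i=1}^{n} V(s)\, V(t)^{-1} \operatorname{diag}\big( f'(t_i) \big)_{i=1}^{n}.$$

This formula enables us to propose the following algorithm.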

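Algorithm. Computing the product of the Cauchy matrix by a vector.

I. Preprocessing.
Complexity: $\iota(n) + 2\varepsilon(n) + O(n)$ flops.
Input: Vectors $s = (s_i)_{i=1}^{n}$ and $t = (t_i)_{i=1}^{n}$ ($t_i \ne t_j$ while $i \ne j$).
Output: Parameters of the representation of $C(s,t)$ in the form $C(s,t) = \operatorname{diag}(1/f(s_i))_{i=1}^{n}\, V(s)\, V(t)^{-1} \operatorname{diag}(f'(t_i))_{i=1}^{n}$.

1. Compute the coefficients of the polynomial $f(\lambda) = \prod_{i=1}^{n} (\lambda - t_i)$ in $\iota(n)$ flops.
2. Evaluate in $\varepsilon(n) + O(n)$ flops the values $1/f(s_i)$ $(i = 1, \dots, n)$.
3. Compute in $O(n)$ flops the coefficients of the polynomial $f'(\lambda)$.
4. Evaluate in $\varepsilon(n)$ flops the values $f'(t_i)$ $(i = 1, \dots, n)$.

II. Application to the vector.
Complexity: $\varepsilon(n) + \iota(n) + O(n)$ flops.
Input: Vector $b \in C^n$.
Output: Vector $C(s,t)\,b \in C^n$.

1. Compute $C(s,t)\,b$ from the above representation, using BM1 for the factor $V(s)$ and BM2 for the factor $V(t)^{-1}$.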
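The Cauchy product via the representation $C(s,t) = \operatorname{diag}(1/f(s_i))\, V(s)\, V(t)^{-1} \operatorname{diag}(f'(t_i))$ can be sketched as follows. As an assumption of this illustration, interpolation and evaluation are done with dense NumPy routines rather than the $O(n \log^2 n)$ algorithms BA1 and BA2; the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def preprocess_cauchy(s, t):
    """Phase I: the two diagonal factors 1/f(s_i) and f'(t_i)."""
    f = P.polyfromroots(t)                     # f(x) = prod (x - t_i)
    inv_f_s = 1.0 / P.polyval(s, f)
    df_t = P.polyval(t, P.polyder(f))
    return inv_f_s, df_t

def apply_cauchy(s, t, params, b):
    """Phase II: interpolate at t, evaluate at s, scale."""
    inv_f_s, df_t = params
    Vt = np.vander(t, increasing=True)
    coeffs = np.linalg.solve(Vt, df_t * b)     # V(t)^{-1}: dense stand-in for interpolation
    return inv_f_s * P.polyval(s, coeffs)      # V(s): evaluation at the points s_i

s = np.array([1.5, 2.0, 2.7, 3.1])
t = np.array([-1.0, -0.4, 0.3, 0.9])
b = np.random.default_rng(6).standard_normal(len(s))

params = preprocess_cauchy(s, t)
y_fast = apply_cauchy(s, t, params, b)
y_ref = (1.0 / (s[:, None] - t[None, :])) @ b
```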

In an earlier paper of Gerasoulis another algorithm for computing the product of a Cauchy matrix with a vector was proposed. Moreover, this algorithm is not restricted to Cauchy matrices and is valid for more general classes of matrices. Part of the operations required by Gerasoulis' algorithm for computing the product $C(s,t)\,b$ can likewise be incorporated into a preprocessing stage.

Note that the product of $C(s,t)$ by an arbitrary matrix can be computed using the algorithm above in

$$(n + 1)\,\iota(n) + (n + 2)\,\varepsilon(n) + O(n^2)$$

flops.

The representation of $C(s,t)$ above also enables fast algorithms of multiplication with vectors to be elaborated for the matrices $C(s,t)^{T}$, $C(s,t)^{-1}$ and $C(s,t)^{-T}$. For example, it implies the following representation for the inverse of a Cauchy matrix:

$$C(s,t)^{-1} = \operatorname{diag}\Big( \frac{1}{f'(t_i)} \Big)_{i=1}^{n} V(t)\, V(s)^{-1} \operatorname{diag}\big( f(s_i) \big)_{i=1}^{n}.$$

This formula shows that the product of the inverse of a Cauchy matrix by a vector can be computed in

$$\varepsilon(n) + \iota(n) + O(n)$$

flops after the same preprocessing as in the algorithm above. In fact, the latter proposition can be formulated as follows: a system of linear equations with an invertible Cauchy matrix can be solved in $O(n \log^2 n)$ flops.

Determinants of Vandermonde and Cauchy matrices

In the recent paper of Pan an outline of some algorithms for computing, up to a sign, the determinants of Vandermonde and Cauchy matrices with real entries is suggested. For computing the Vandermonde determinant it is proposed therein first to compute, using the algorithm from Canny, Kaltofen and Yagati, the entries of the Hankel matrix $H = V(t)^{T} V(t)$. This step costs $\iota(n) + \tau(n) + O(n)$ flops. Then one computes $\det H = (\det V(t))^2$. This seems to be quite expensive. For the determinant of a Cauchy matrix, a complicated procedure of reducing this problem to the analogous problem for some close-to-Toeplitz matrix is proposed in Pan. It turns out that well-known formulas lead to simple and significantly more effective algorithms for computing the determinants of Vandermonde and Cauchy matrices with complex entries. Indeed, as is well known,

$$\det V(t) = \prod_{1 \le i < j \le n} (t_j - t_i).$$

The square of the expression in the right-hand side of this formula is, by definition, the discriminant $D(f)$ of the polynomial $f(\lambda) = \prod_{i=1}^{n} (\lambda - t_i)$. It is well known (see, for example, van der Waerden) and can easily be seen that $D(f) = (-1)^{n(n-1)/2} \prod_{i=1}^{n} f'(t_i)$. Thus

$$\big( \det V(t) \big)^2 = (-1)^{n(n-1)/2} \prod_{i=1}^{n} f'(t_i).$$

This formula enables the following algorithm to be proposed.

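Algorithm. Computing the Vandermonde determinant.

Complexity: $\iota(n) + \varepsilon(n) + O(n)$ flops plus one extraction of a square root.
Input: Vector $t = (t_i)_{i=1}^{n} \in C^n$ ($t_i \ne t_j$ while $i \ne j$).
Output: Value $\det V(t)$ up to a sign.

1. Compute in $\iota(n)$ flops the coefficients of the polynomial $f(\lambda) = \prod_{i=1}^{n} (\lambda - t_i)$.
2. Compute in $O(n)$ flops the coefficients of the polynomial $f'(\lambda)$.
3. Evaluate the values $f'(t_i)$ $(i = 1, \dots, n)$ in $\varepsilon(n)$ flops.
4. Compute $(\det V(t))^2 = (-1)^{n(n-1)/2} \prod_{i=1}^{n} f'(t_i)$ in $O(n)$ flops.
5. Recover $\det V(t)$ up to a sign in one operation of the extraction of a square root.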
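The steps above can be sketched in a few lines. The sketch assumes the identity $(\det V(t))^2 = (-1)^{n(n-1)/2} \prod_i f'(t_i)$ and uses NumPy's dense polynomial helpers in place of the fast interpolation and evaluation routines; the function name is hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def vandermonde_det_squared(t):
    """Return (det V(t))**2 from the values f'(t_i), f(x) = prod (x - t_i)."""
    n = len(t)
    f = P.polyfromroots(t)                       # step 1: coefficients of f
    df_t = P.polyval(t, P.polyder(f))            # steps 2-3: f'(t_i)
    sign = (-1) ** (n * (n - 1) // 2)
    return sign * np.prod(df_t)                  # step 4

t = np.array([-1.3, -0.2, 0.4, 1.1, 2.5])
det_sq = vandermonde_det_squared(t)
det_abs = np.sqrt(det_sq)                        # step 5: determinant up to sign
det_ref = np.linalg.det(np.vander(t, increasing=True))
```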

Furthermore, if all the coordinates of the vector $t$ are real numbers (this situation was considered in Pan), then the explicit expression $\det V(t) = \prod_{1 \le i < j \le n} (t_j - t_i)$ allows one to determine the sign of $\det V(t)$ in $O(n \log n)$ time. Indeed, in this case the sign of $\det V(t)$ is determined by the number of pairs $(t_i, t_j)$ satisfying the condition $t_i > t_j$ while $i < j$. Thus the sign of $\det V(t)$ is determined by the number of inversions in the permutation $k = (k_1, k_2, \dots, k_n)$, where $k_i$ is the position of the element $t_i$ in the sequence derived from $\{t_i\}_{i=1}^{n}$ by rearranging all the elements in increasing order. To compute this number of inversions we rearrange the two-component sequence $\{(t_i, i)\}_{i=1}^{n}$ in increasing order of the first components and then extract the second components into a separate sequence $m = (m_1, m_2, \dots, m_n)$. It is easy to see that $m$ is the inverse of the permutation $k$ and consequently has the same number of inversions. Therefore, to determine the sign of $\det V(t)$ it remains to compute the number $\operatorname{In}(m)$ of inversions in the permutation $m$. The algorithm based on these arguments is proposed below. Note that here we deviate from the rule accepted in the present paper and estimate the complexity in terms of time and not in flops. The reason is that we use here algorithms for sorting and for computing the number of inversions in a permutation. These two algorithms do not involve floating point operations; they involve cheaper operations of comparison and exchange.

Algorithm. Determination of the sign of the Vandermonde determinant.

Complexity: $O(n \log n)$ time.
Input: Vector $t = (t_i)_{i=1}^{n} \in R^n$ ($t_i \ne t_j$ while $i \ne j$).
Output: Sign of $\det V(t)$.

1. Sort in $O(n \log n)$ time the sequence of the pairs $\{(t_i, i)\}_{i=1}^{n}$ in increasing order of the first components, using, for example, a sorting algorithm from Knuth (Sorting and searching).
2. For the permutation $m$ formed from the second components of the sorted sequence, compute in $O(n \log n)$ time the number $\operatorname{In}(m)$ of its inversions, using, for example, the algorithm from the corresponding problem in Knuth.
3. Set $\operatorname{sign} \det V(t) = 1$ if the number $\operatorname{In}(m)$ is even, and $\operatorname{sign} \det V(t) = -1$ in the opposite case.
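A minimal sketch of the sign computation, assuming real and pairwise distinct $t_i$. The inversions are counted with a merge sort, which is one standard $O(n \log n)$ method and not necessarily the one cited from Knuth; the function names are hypothetical.

```python
import numpy as np

def count_inversions(m):
    """Return (number of inversions, sorted list) using merge sort."""
    m = list(m)
    if len(m) < 2:
        return 0, m
    mid = len(m) // 2
    inv_left, left = count_inversions(m[:mid])
    inv_right, right = count_inversions(m[mid:])
    merged, inv = [], inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left) - i      # right[j] precedes all remaining left items
    merged += left[i:] + right[j:]
    return inv, merged

def vandermonde_det_sign(t):
    """Sign of det V(t) for real, pairwise distinct t_i."""
    m = np.argsort(t)                 # second components of the sorted pairs (t_i, i)
    inv, _ = count_inversions(m)
    return 1 if inv % 2 == 0 else -1

t = np.array([0.3, -1.2, 2.0, 0.7])
sign = vandermonde_det_sign(t)
sign_ref = np.sign(np.linalg.det(np.vander(t, increasing=True)))
```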

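Furthermore, for the determinant of a Cauchy matrix the following expression is well known:

$$\det C(s,t) = \frac{\prod_{1 \le i < j \le n} (s_j - s_i)(t_i - t_j)}{\prod_{i,j=1}^{n} (s_i - t_j)}$$

(see, for example, Polya and Szego, where it is attributed to Cauchy). Using the formula $\det V(t) = \prod_{1 \le i < j \le n} (t_j - t_i)$, the latter formula can be rewritten in the form

$$\det C(s,t) = (-1)^{n(n-1)/2}\, \frac{\det V(s)\, \det V(t)}{\prod_{i=1}^{n} f(s_i)},$$

where, as above, $f(\lambda) = \prod_{i=1}^{n} (\lambda - t_i)$. Using this formula and the algorithm for the Vandermonde determinant, one can compute the value $\det C(s,t)$ up to a sign in

$$2\iota(n) + 3\varepsilon(n) + O(n)$$

flops plus one operation of the extraction of a square root.

Furthermore, in the case when $s_i$ and $t_i$ are real, this formula and the algorithm for the sign of the Vandermonde determinant allow one to determine the sign of the value $\det C(s,t)$ in $O(n \log n)$ time.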
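The rewritten determinant formula can be checked numerically. In this sketch the Vandermonde determinants are obtained from the direct $O(n^2)$ product formula, which is only a simple stand-in for the fast algorithm; the sign convention $\det C(s,t) = (-1)^{n(n-1)/2} \det V(s) \det V(t) / \prod_i f(s_i)$ is the one assumed, and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def vandermonde_det(t):
    """det V(t) = prod_{i<j} (t_j - t_i), direct O(n^2) product."""
    n = len(t)
    return np.prod([t[j] - t[i] for i in range(n) for j in range(i + 1, n)])

def cauchy_det(s, t):
    """det C(s, t) for C[i, j] = 1 / (s_i - t_j)."""
    n = len(s)
    f_s = P.polyval(s, P.polyfromroots(t))       # f(s_i), f(x) = prod (x - t_i)
    sign = (-1) ** (n * (n - 1) // 2)
    return sign * vandermonde_det(s) * vandermonde_det(t) / np.prod(f_s)

s = np.array([1.5, 2.0, 2.7, 3.1])
t = np.array([-1.0, -0.4, 0.3, 0.9])
det_fast = cauchy_det(s, t)
det_ref = np.linalg.det(1.0 / (s[:, None] - t[None, :]))
```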

References

Aho, A.V., Hopcroft, J.E. and Ullman, J.D., The design and analysis of computer algorithms, Addison-Wesley, Reading, Mass.

Ammar, G. and Gader, P., New decompositions of the inverse of a Toeplitz matrix, in Signal processing, Scattering and Operator Theory, and Numerical Methods, Proc. Int. Symp. MTNS, vol. III (M.A. Kaashoek, J.H. van Schuppen and A.C.M. Ran, Eds.), Birkhäuser, Boston.

Brent, R., Gustavson, F. and Yun, D., Fast solutions of Toeplitz systems of equations and computation of Padé approximants, Journal of Algorithms.

Canny, J.F., Kaltofen, E. and Yagati, L., Solving systems of nonlinear equations faster, in Proc. ACM-SIGSAM Internat. Symp. Symbolic Algebraic Comput., ACM, New York.

Chun, J. and Kailath, T., Divide-and-conquer solutions of least-squares problems for matrices with displacement structure, SIAM Journal of Matrix Analysis Appl.

Cline, R.E., Plemmons, R.J. and Worm, G., Generalized inverses of certain Toeplitz matrices, Linear Algebra Appl.

Gerasoulis, A., A fast algorithm for the multiplication of generalized Hilbert matrices with vectors, Math. of Computation.

Gohberg, I. and Feldman, I., Convolution equations and projection methods for their solutions, Translations of Mathematical Monographs, Amer. Math. Soc.

Gohberg, I. and Olshevsky, V., Circulants, displacements and decompositions of matrices, Integral Equations and Operator Theory.

Gohberg, I. and Semencul, A., On the inversion of finite Toeplitz matrices and their continuous analogs (in Russian), Matem. Issled., Kishinev.

Heinig, G. and Rost, K., Algebraic methods for Toeplitz-like matrices and operators, Operator Theory, Birkhäuser Verlag, Basel.

de Hoog, F., A new algorithm for solving Toeplitz systems of equations, Linear Algebra Appl.

Knuth, D., The art of computer programming, vol. 2: Seminumerical algorithms, Addison-Wesley, Reading, Mass.

Knuth, D., The art of computer programming, vol. 3: Sorting and searching, Addison-Wesley, Reading, Mass.

Lander, F.I., The Bezoutian and the inversion of Hankel and Toeplitz matrices (in Russian), Matem. Issled., Kishinev.

Pan, V., On computations with dense structured matrices, Math. of Computation.

Polya, G. and Szego, G., Problems and theorems in analysis, Springer Verlag, Berlin, New York.

van der Waerden, B.L., Algebra I, Springer Verlag, Berlin, New York.