<<

LETT]~R~, AL NU0V0 ClMENTO VOL. 38, N. 16 17 Dicembrc 1983

Geometrical . Identification. of and Information Theories.

E. R. CA~A~IET.LO(*) Joint Institute ]or Nuclear Research, Laboratory o/ Theoretical l~hysics - Dubna

(ricevuto il 15 Settembre 1983)

PACS. 03.65. - Quantum theory; quantum .

Quantum mechanics and information theory both deal with (( phenomena )) (rather than (~ noumena ,) as does ) and, in a way or another, with (( uncertainty ~>, i.e. lack of information. The issue may be far more general, and indeed of a fundamental nature to all science; e.g. recently KALMA~ (1) has shown that (~ uncertain data )~ lead to (~ uncertain models )) and that even the simplest linear models must be profoundly modified when this fact is taken into account (( in all branches of science, including time-series analysis, economic forecasting, most of econometrics, psychometrics and elsewhere, (2). The emergence of (~ discrete levels ~) in every sort of ~( structured sys- tems ~ (a) may point in the same direction. Connections between statistics, and information theory have been studied in remarkable papers (notably by TRIBUS (4) and JAY~s (s)) based on the maximum (Shannon) entropy principle. We propose here to show that such a connection can be obtained in the (( natural meeting ground ~) of . Information theory and several branches of statistics have classic geometric realizations, in which the role of (( distance ~)is played by the infinitesimal difference between two probability distributions: the basic concept is here cross-entropy, i.e. information (which we much prefer to entropy: the latter is essentially (~ static )~, while the former correlates a posterior to a prior situa- tion and invites thus to dynamics). The (( information distance ~ is here ~ complexi- fled ~> and generalized in a way clearly forced by the formalism adopted. Quantum mechanics, if presented as (~ quantum geometry ~ (as proposed in previous occasions

(*) On leave from the University of Salerno. (1) R. E. KALMiN: a) Proceedings International Symposium on Dynamical Systems, edited by ~k. BEDNAREK (New York, N.Y., 1982); b) Current Developments in the Interlace: Economics, Eco- nometrics, Mathematics, edited by S. M. ~[~AZENWINKELand A. It. G. RrNNOY KAN (Dordrect, 1982). (i) R. E. KALMAN: Current Developments in the Inter/ace: Economics, Econometrics. Mathematics, edited by S. M. HAZENWINKEL and A. It. G. RINNOY KA~ (Dordrecht, 1982), p. 135. (a) E.R. CAIANIELLO: Biol. Cybernetics, 26, 11 (1977); Int. J. Gen. Systems, 8, 81 (1982) (with G. SCAR- PETI~A and G. SIMONCEI,LI). (') M. TRIBUS: Thermostattcs and (New York, N.Y., 1961). (~) ]L T. 5AYNES: In Brang~eis Theor. Phys. Lectures on Statistical Physics, Yol. 3 (New York, .~. Y., 1962),

539 540 E.R. CA.IANIELLO by us (~)), appears then to be a particular instance of this wider ((complex informa- tion geometry ~.

In/ormation geometry. Metric. Let x ~ {x (1), x (2) ..... x r and z ~- {z(1), z(2)..... z(m)} be two sets of real variables; in the current literature of statistics or estimation theory x is a point in <( paramcter space ~) S ~, z represents random variables. Thus, in the Gaussian distribution

1 [ (z-,)~] (l) QG.... -- ~/2~ exp ~ J , we read z -= {z(,)}, -= {x(~) =/(~)(~, o), z(~) =/(~)(~, ~)}

This interpretation of the roles of x and z is, however, not essential, and may be modified whenever convenient (it being a matter of ~ identification ~), provided formal manipu- lations stay the same. The Kullback-Leibler information (7) (or eross-entlopy):

(2) I(1, 2) --j~(--? 'x'flz) '1o g~(x~lz~(xllZ) ) dz discriminates between distributions at points x I and x2; setting x 1 -~ x, x~= x + dx, one has from (2)

(3) 2I(x + dx, x) = ds 2 = gak(x) dx~ dx k , where (~h = 8/~x~),

(4) g~(z) = g~(z) =~(zlz) ~ log Q(xiz)% log ~(xlz) dz J is the Fisher (7,s), or in/ormation metric; of course ~(xlz)~ 0 and

(5) fQ(xlz)dz = 1

C o n n e eti o n. The geometrical representation of a model (or theory, according to one's philosophy) requires essentially the choice of a metric G and a connection /~, and the identification of a reference frame; these depend upon the ((universe ~) one

(e) E. 1%. CAIANIELLO: a) Left. Nuovo Cimenlo: 25, 225 (1979); 27, 89 (1980); 32, 65 (1981); 35, 381 (1982). Nuovo Cimento B, 59, 350 (1980); Proceedings VI JINR International Con]erenee on the Problems o/ Quantum Fled Theory, zilushta (1981); Quanlizalion as Geometry in Phese Space, IV and V Con/erence on Quantum Theory and the Structure o/ Time and Space, Tulzing (1980) and 1982: the latter to be published in Semin. Mat. Fis., University of Milano. Left. Nuovo Clmento 30, 469 (1981) (with G. VILASI); 33, 555 (1982) (with S. DE FILIPPO and S. G. VILASI); 34, 112 (1982) (with S. DE FILIePO, G. I~IARMO, G. VILASI); 34, 301 (1982) ~with &. GIOV&NNINI); 36, 487 (1983) (with G. lVIARMO and G. SGARPETT•). b) iet~. Naouo Cimento, 37, 399 (1983) (with G. ~VIARMO and G. SCARPETTA). (~) S. C. KULLBACK; In/ormation Theory and Statistics (New York, N. Y., 1959). (s) R.A. FISHm%: Proc. Cambridge Philos. See., 122, 700 (1925). GEOMETRICAl. (~ IDENTIFICATION ~) OF QUANTUM AND INFORMATION THEORIES 5~1 wishes to model. For purposes of statistics and estimation the most general connection is of the form (~)

(6) /'o~ = [i~;/~] -- ~ ~ log q ~ log 0 ~ iog 0 dz with a an arbitrary real parameter. The only case of interest to us, in our comparison with quantum geometry, will be ~ = 0 (nonmetricity (~) does not appear desirable at this stage). It is instructive, however, to consider first the general case, on the example of the Gaussian distribution (1), which one may write (u) as

z [x(~)]2 1 ~ 1 (7) eGau.s(~lZ) : exp X[1)-- Z2.T~(2)-1 -~ ~logx ( ~--~logn ] 4:09 (2) with

x(1) = ~, x(2) = 1 then (n)

G =-- {gh~} : 2pa~ 4/t~ a~ ~_ 204 and tf (8) Rln2 = (1 -- ~2) as.

Thus, the choice ~ ~ 1 renders Gaussian distributions as straight lines when an appro- priate (( natural ~ frame is taken in parameter space (~ =- 1 does the same for n ~ mixture distributions,: ~(xlz ) = (1 --Z x(n)) Q0(z) + Z xch~Q~(z) (0< x(~)< 1)). Except- hffil h=l 0 ing these eases, and in particular with the metric connection /~ ~ [i]; p] of interest to us, (8) tells that the curvature tensor expresses our lack o] information: it vanishes 0nly when a ---- 0, i.e. when the z is known with absolute certainty. Such possibility is the main intuitive notion we wish to draw from this example, as an introduegon to our:

Quant~r~ geometry. - We refer here to the realization of one-particle quantum mechanics treated in earlier works (e) ; it will suffice here to recall that its main ingre- dients are

1) a Hcrmitian metric G in 8-dimensional relativistic phase space (which may derive from)

2) an anti-Hermitian connection I'~, such that the ensuing Riemann tensor expres- ses the I-Ieisenberg commutation relations, i.e. our (~ uncertainty ~) or lack of information; both position and momentum operators become covariant derivatives taken along the axes of

($) 1~r. N~ CHENTZOV:~latistical Decision Rules and Optimal Gonclusions (Moscow, 1972) (in Russian). (10) j. A. SCHOUTEN: Ricci Calculus, 2nd edition (Berlin, 1954). Q~) S. AM~UI: Raag Reps., 106, February 1980; Technical Report Fac.-Engineering, University of Tokyo, METR 81-1, April 1981. 542 E.R. CA/ANIELLO

3) a quantum /rame, whose fixing (or (~ polarization )~) is a prerequisite and eor- corresponds to (( identification ~) (often ignored by physicists but central to systems theory).

Statistics etc. use mainly probability distributions 0, while quantum mechanics prefers their (( square roots ~, i.e. amplitudes. We shall need, therefme, first of allto re- formulate information geometry in terms of amplitudes. Information and systems theory deal with uncertainties in various ways, none of which excludes the theoretical possibility that they may be all rendered as small as wanted (though this view may have to be changed (~)) ; quantum physics sets instead theoretical limitations to this possibility. In model-thinking, the difference may appear less relevant, or not relevant at all, depending upon how much of theoretical or technical limitations one wishes to build within the model, or leave out as (~ error ~>. The development of physics is a paradigm not to be overlooked by sciences which go from (~ soft ,) to (( hard ~). We take the in- formation metric (4) as basic; we wish, though, to treat with it also situations in which 0(x [z) is not given a priori, but has to be determined by means of additional considerations. The meaning of x and z will thus depend on the particular situation: again, (~ identifi- cation ~). We start with the remark that the metric element (4) can also be written as

(9) so that (lO) = f = f

The variable z may of course take also discrete values: f--~)-~, so that situations can be conceived in which (9), (10) simply express ga, in terms of a holonomic vielbein. Thus, if ~(~r = ~V/O(xlz = :~), (9) reads in the familiar way

(11) = ~ 0a~(~(x ) 0,~<~(z) .

Many other situations can be expressed through the scalar product (9) (hypercomplete sets of states, etc.); the consideration of this typical (~system-model)) play will not concern us here. The wanted generalization of the information metric (to include also our quan- tum metric qnk(x) = ~/g~k(x)) is now obvious: it reads (neglecting the irrelevant numer- ical factor)

if y%(xlz ) = OhqJ(xlz) (12) yields the general holonomic case; if 0a ~ -- 0a% we fall back into the standard information metric (9) or (4). The choice of the connection F u is the next step in the geometrical construction of a model, or theory, as was evidenced in the discussion of the Gaussian and mixture distributions (a ~ 4-1). Comparison with our work on quantum geometry(6) re- quires, as already stated, a = 0. It is interesting to see what happens in this case to the connection (6). 0 _Pu reads, in the standard form (4) (11):

(13) Fi~,(x)o = f q[O~ log q Ok log q -t- Oi log q O~ log 0 Ok log 0] dz. GEOMETRICAL (( IDENTIFICATION ~) OF QUANTUM AND INFORMATION THEORIES ~

From

k k (14) YI a, log e = 2~ e-~ ~ 1-I a, V6, t~l i=I one finds

This unexpected simplification can hardly be a coincidence; it strongly suggests that (9) is more convenient, or (( natural ~>, than (4) for the construction of a wider, complexified geometry. It is important to remind also that the addition (if we start with (12)) of anti-Iffermitian terms to (15) leaves ds ~ invariant: it amounts to a gauge trans- formation (~). In this perspective, our formulation of quantum geometry obtains by restricting in a suitable way the (( complex information geometry )~ sketched here and, of course, by suitably identifying its mathematical objects with those typical of quantum mechanics (fixing the quantum frame). Obviously, information geometry is another restriction of that kind. We suggest therefore, in conclusion, that (~ complex information geometry ~> may be an apt tool for attempting (~ technological transfer ~> among fields of investigation alien thus far to one another. As a final remark, we note that (~ infinitesimal distance ~) ds ~ and (~infinitesimal cross-entropy ~> dH c (~3) now coincide (to within appropiate identifications) :

(16) ds ~ ~- dH~ = g~k dxh d-~ -- ~Ph dx~ v~ d-~ dz. J

Then, the requirement imposed by relativity for particles to be physical (~4),

(17) ds ~ ~ dg 2__1 d2~> 0, C2 coincides with the requiremen~

08) dRo>0, which physical phenomena must satisfy. The issue, whether or how ~h(x) in (12) can be connected with solutions of (quantum mechanical) equations, will be considered in a next note.

The author is greatly indebted to Profs. S. AMARI and R. E. KAT.~N for kindly com- municating to him their impoltant and stimulating results. Warmest thanks are also due to Prof. N. N. BOGOr.VBOV and N. N. BOGOLUROV jr. for their so cordial hospitality at JINR in Dubna.

(1,) E. R. CAIAI~IEImO, G. MAnMO and G. SCARPETTA: Lett. Nuovo Ctmento, 37, 399 (1983). (1') In measure-theoretical treatments cross-entropy as used here is often called entropy: cfr. ref. (i) and (7) above. (1,) It becomes ds I ~ dt ~ --(lJc 2) dx' + (~2//~'e6)[dE2--dp ~] > 0 in our quantum geometry; this leads to a * maximal acceleration J: cfr. ref. (a) above.