<<

https://ntrs.nasa.gov/search.jsp?R=19690016211 2020-03-12T04:06:43+00:00Z View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by NASA Technical Reports Server

Quantum Detection and Estimation Theory Carl W. Helstrom*

Department of Applied Electrophys ic s 'Bniversity of California, San Diego ' La Jolla, California 92037

ABSTRACT

A review. Quantum is a reformulation, in

quantum-mechanical terms, of statistical as applied

to the detection of in random . Density operators take

the place of the density functions of conventional .

The optimum procedure for chooeing between two hypotheses, and an

approximate procedure valid at small si nal-to-noise ratios and call

threshold detection, are prersented. Quantum estimation theory seeks

best of of a density operator. A quantum counter=

part of the Crapdr-Rao inequality of conventional statirstics sets a lower

bound to the -square errors of such estimates, Applications at I present are praarily to the detection and estimation of signals of optical

frequencies in the presence of thermal radiation.

Keywords :

detecti6n, detection theory, estimation, ~ statistical estimation, estimation theory, quantum theory, 'decision theory, hypothesis testing, statistical decisions.

1 I. Quantum Statis tical Theory

Much of can be viewed as the calculation of expected values. Classically, a system characterized by the variables xl, x2, , x has associated with it a probability density function . . . n

(p. d. f. 1 P(X1’ X2’ Y xn), and the expectations of certain measurable functions f(xl , x2, .. . , Xn) ’

are required. Quantum-mechanically a system is described by a density operator p, which is a function of the dynamical variables of the system, and the of an observable whose quantum- 1 mechanical operator is F is given by the trace

-E(F) = Tr(pF). (1.2)

The density operator p is the quantum counterpart of the p. d. f. p(xl,.. , 5).When, as in the classical limit, p is diagonal in a re- presentation based on the simultaneous eigenstates x x ) of the I 1’ . . n operators X ,X corresponding to the variables x1 , ,x 1’ . . . n . . . n’

the expectatipn in Eq. (1.2) reduces to Eq. (1. l), with

f(xl, ) = (xl.. { Flxl. .x }. ... ,xn .x n - n (1.4) 2 Quantum statistical theory includes the classical as a special case .

Modern statistical theory also has a normative and methodological aspect, .which appears in its treatment of hypothesis testing and estimation.

2 It seeks the best procedures for making statements ab,Qvt the condition of a system under observation, statements that are framed as decisions among hypotheses about the system, or as estimates of numerical parameters characterizing it. The statements are based on observa- tional data subject to unavoidable random error. The best methods are those that minimize the influence of error, and by evaluating their quality it is possible to determine the ultimate limits imposed by statisti- 3 cal uncertainty on the accuracy of decisions and measurements .

In classical physics statistical uncertainty is largely due to the

I. presence of random noise, which originates primarily in molecular chaos, Statistical hypothesis -testing or decision theory has been ex- tensively applied to the detection of acoustic and electromagnetic signals in noise and permits defining the weakest signal that can be detected with a specified probability of error, as a function of the strength of the interfering n~ise~-~.Estimation theory has been applied to the measurement of signal parameters such as amplitude, carrier frequency, and time of arrival, which are important in tele- metry and . The noise sets a limit to the accuracy of such measurements.

The subject of this review is the formulation of statistical decision and estimation theory in quantum-mechanical terms. It involves re- placing the probability density functions that appear in the classical theory by quantum-mechanical density operators. Although the context will be the detection of signals at optical frequencies and the estimation

3 of their parameters, the application of these concepts is not limited

thereto. The aim of quantum detection and estimation theory is to determine how the reliability of decisions and parameter estimates

is affected both by random noise and by quantum-mechanical uncertainty.

Classical Decision and Estimation Theory

Decision theory treats the choice among hypotheses about the

system at hand. In the simplest binary decision there are two hypo-

theses, exemplified by the absence or presence of a signal s(t) of known form in the input x(t) to a receiver during a certain observation interval

(0, TI. The hypotheses are then

(null hypothesis): x(t) = n(t), HO (alternative hypothesis): x(t) = n(t) s(t), H1 t where n(t) is a random process representing noise with certain specified

statistical properties. We suppose that the decision is to be based on

n samples x = x(t.) of the input x(t) during the interval (0, T), (i = 1, i 1 2, . . ., n). The p.d.f.'s p (x . . . , x ) and p (x ,x ) of these o 1' n 1 1" .. n data under the two hypotheses are known. The best method of deciding between them is sought.

-The adjective "best" is principally defined in two ways. In

"Bayesian" decision theory the observer knows the prior

6 and (1-6) of hypotheses Ho and H and he also knows the four costs C 1' ij 10 of choosing hypothesis H. when H. is true (i, j = 0, 1) . The costs 1 J are entailed by the actions and circumstances following the decision,

4 which is to be made in such a way that the average cost is minimum. 11 This so-called "Bayes strategy" requires H to be selected whenever 1

P1(X1' .'7Xn) 2 ~(C,o-Coo) A(Xl,. . . ,x ) = = A. (1.5) n p0(x1, ' > Xn' (1- 6)(Co,-C1 I 1

Otherwise H is selected. The function A(x . . , x ) is called the 0 1' * n likelihood ratio.

Decisions among more than two hypotheses can be treated in a similar manner. Often the costs associated with the various errors can be set equal, whereupon it is the average probability of error that is to be minimized. The best strategy is then to choose the hypothesis whose posterior or conditional probability, given the data (xl, . . . , xn), is greatest12. The posterior probability can be expressed in terms of likelihood ratios between pairs of p. d. f. 's for the data under the several hypotheses.

The second way of defining a "best" binary decision procedure is 13, 14 provided by the theory of Neyman and Pearson . Two kinds of errors can occur. Choosing H when H is true is called an error of 1 0 the first kind, or false alarm; its probability under a given decision strategy is denoted by Q,. Choosing H when H is true is an error 0 1 of the second kind, or false dismissal; its probability is Q The 1' 15 complement Q = 1 - Ql is often called the probability of detection . d That strategy is now considered best that attains the maximum proba- i bility Q 'of detection for a set false-alarm dobability Q It leads d 0'

5 to the same comparison of the likelihood ratio A(x . ,x ) with a 1’ . . n decision level A as in Eq. (1.31, but with A fixed so that the false- 0 0 alarm probability equals the pre-assigned value 16 . The Neyman-

Pearson criterion dispenses with the prior probabilities and costs needed for the Bayesian approach, but is not easily generalized to decisions among more than two hypotheses.

Estimation theory typically treats data E = , , x ) whose (x1’ . . n joint p. d. f. p(xl, , x ; ,0 ) = p(~;0) depends on some unknown . . . n el, . . . m - parameters 0 = (el,. ,0 ) that are to be estimated. For instance, - . . m the data may be samples x = x(t.) of the input j J x(t) = s(t; -0) t n(t) to a receiver, composed of noise n(t) of known statistical properties

= ,0 ) such and a signal s(t; E) depending on parameters - (el, . . . m as amplitude, time of arrival, and carrier frequency. On the basis

of the n data E, the values of these parameters are to be estimated

as accurately as possible.

Estimation theory sets up a measure of the cost or seriousness A/\ A of errors in the estimates = , , 0 ) of the parameters. The ft (9 1’ . . m most common cost function is a weighted sum of the squared errors,

6 AA The problem is to find estimates 0 = 0 (x , x ) as such functions k k 1” . . n of the data that the average cost is minimum. Of interest also are lower bounds on the sizes of the errors, measured usually by the mean-

A 2 square deviations E(O- k-’k) , as well as the of each estimate, defined as the deviation A

17 of the expected value of the estimate from the true value of the parameter .

The Generalization to Quantum Theory

Central to classical decision and estimation theory are the p. d. f. ‘s po(z), p (x), and p(z; 0) of the outcomes of observations of the system. 1- - It is natural to consider analogous theories based instead on quantum- mechanical density operators p 0’ PI’ p (0- ) of the sys tem, a generali- 18-20 zation that leads to quantum decision and estimation theory

The system under observation might, for instance, be a lossless cavity that functions as an ideal receiver of electromagnetic radiation.

The cavity is initially empty. In one wall is an aperture that faces the source of the signal, and during an interval (0, T) when the signal, if present in the external field, is expected to arrive, the aperture is open. At the T the aperture is closed, and thereafter the cavity contains background radiation and, possibly, a field due to the signal.

The density operator of the field will be p when only background 0 radiation is present (hypothesis H ) and p when a signal of the speci- 0 1 fied type has arrived (hypothesis H ). Detection involves a choice 1

7 between these hypotheses. In particular, one would like to know the weakest signal that can be detected with a certain probability Q as a d function of the false-alarm probability Q and the nature of the back- 0 ground radiation.

If, on the other hand, the signal field is known to be present, it may be necessary to measure certain of its parameters, such as its amplitude or carrier frequency. These can be regarded as parameters of the density operator p = p(8 ,8 ) of the net field in the cavity. (5) 1’ . . . m One would like to know the minimum mean-square errors with which the field parameters can be estimated, as functions of the character- istics of the signal and background fields.

Crucial in quantum decision and estimation theory is the question of which dynamical variables of the system shall be measured. In the classical theory it is possible in principle to measure all the variables and to conceive of their having the joint probability density functions po(z), p (E), and p(x; 0) required for setting up the optimum procedures. 1 -- Quantum-mechanically only observables - - dynamical variables repre- sented by Hermitian operators -- can be measured, and since they are to be measured simultaneously on the same system, their operators must commute. Different sets of commuting observables may yield different costs in a Bayes decision or estimation strategy, and the problemaremains of finding the set that entails the lowest cost of all.

If there exists a representation in which all the density operators

8 involved are simultaneously diagonal, they all commute, and by working in this representation, the decision or estimation problem can be re- duced to one that can be handled by the classical theory. Quantum- mechanical decision and estimation theory is presently formulated entirely within the framework of the conventional interpretation of quantum mechanics , and questions of the simultaneous measurability of variables whose operators do not commute have not been treated.

11. Binary Decisions

The Detection Operator

A choice is to be made between two hypotheses about a system,

(H ) that its density operator is p and (H ) that its density operator 0 0 1 is p The prior probability of H is 6 and of H (1-c), and the cost 1’ 0 1 attendant upon choosing H. when H, is true (i, j = 0, 1) is C.. . Suppose 1 J 1J that some set of commuting observables X X2, , has been mea- 1’ . . . sured, with outcomes x x The decision will be based on the 1’ 2’ . . . . value of some function f(x x . ) of the outcomes. Equivalently, 1’ 2’ . . it could be based on the outcome of a measurement of the operator f(Xl, X2, . . . ). What operator should this be?

All that we really require is that the outcome be one of two numbers,

0 and 1, and we choose H if it is 0, H if it is 1. The operator 0 1 f(Xl, X2, . . . ) should therefore be one whose only eigenvalues are0 and 1, and such an operator is a projection operator. We denote it by ll and call it the detection operator.

9 Which of all the projection operators II for the system is best?

To determine it we put down an expression for the average cost and minimize it over the set of all II's. The average cost depends on the probabilities Q and Q of errors of the first and second kinds. The 0 1 former is the probability under hypothesis H that H is chosen, that is, 0 1 that measurement of II yields the value 1,

Q, = Pr In --f 1 I H 3 = E(II~H =~rpII. 0- 0 0 Similarly

Q1 = Tr[pl(l- II)] = 1 - TrplII, and the average cost is - C = c[C (l-QO)f CloQoI f (1-C) CColQl -i- Cll(l = 00 - Ql)I

cco0 (l-c)col-(1 - 6)(co1-Cll) Tr(pl-Apo)Il, (2.3) where

- Since C > Cll, C will be minimum if Tr(p hp is maximum. 01 1 - o) Choose a representation in terms of the eigenstates 11 ) of the k operator p ApO, whose eigenvalues we suppose discrete; 1 - (P -h0)1 Tk) = Vkl Tk) -

It is then necessary to maximize Tp(p 1- '6) 0)' = L Tk(Tk1 nl Vk), and this will be accomplished if

(Tkl'lTk) = '3 Yk' '9

10 Hence the best projection operator to measure in order to choose between H and H is 0 1

“ilk“

Equivalently, p -hp is measured, and H is chosen if the outcome is 1 0 1 19, 21 positive

and the minimum average cost is

Denote the eigenvalues of the density operators p and p1 by P 0 Ok and Plk, respectively, numbering them in descending order. If the

operators are completely continuous, these eigenvalues form discrete

spectra. A theorem in analysis then assures us that the eigenvalues

of p1 - hp are also discrete, that its k-th positive eigenvalue is 1M 0 less than or equal to P and that its k-th negative eigenvalue is greater lk’ than or equal to - hPOk. Here the positive eigenvalues are counted by beginning with the largest, the negative ones by beginning with the 22 most negative .

11 If the density operators p and p commute, the eigenvalues of 0 1 -hp are P -hPOk, and these are positive when p1 0 lk plk/pOk > A.

The best procedure is then to measure either p or a suitable 0’ pl’ operator commuting with both. When the system is found in the kth

common eigenstate, choose H if P h, H if P /POk

Let the system be a simple harmonic oscillator, such as a single mode of the field in our ideal receiver, and assume it to be in thermal

equilibrium with an average number of photons equal to N (hypothesis 0 23 H ) or to N (hypothesis H1). The density operators are 0 1

(2.10) m Pkm = (1 - v )v v = N /(Nktl), k = 0, 1, kk’ k k in terms of the eigenstates Im) of the number operator n. It then

suffices to measure n itself and to choose hypotheais H when 1

where m is the outcome of the measurement.

The Choice between Pure States

There are few pairs of noncommuting,operators p and p for which 0 1 I the eigenvalue problem in Eq. (2. 5) has been solved. One general case

12 of interest is that in which the system is in a pure state under each 24,25 hypo the s is 9

(2.11)

There are then just two states I I,), 1 Il ) satisfying Eq.(2.5) with non- zero eigenvalues, and they are linear combinations of I $ ) and I$ 1), 0 (2.12) 17,) =z ) t z-..I$ } , k = 0, 1. kQ kl 1

By substituting Eqs. (2. 11) and (2.12) into Eq. (2. 5), a set of linear homogeneous equations for z z is obtained. A solution kO’ kl exists only when the determinant of their coefficients vanishes, which yields a quadratic equation for the eigenvalues 7 and The solution 0 7 1’ is k 7 =Q (1-A) - (-1) R, k=0, 1, k (2. 13)

The detection operator to be measured is = 17 ) (T , the false-alarm II 11I and detection probabilities are 2 Q, = I (7,IJi0)I = (Il-q)/2R, (2. 14)

- and the minimum average cost C can be calculated by eq. (2.9),in which the suzn now has a single term 17 1’ In the choice between two coherent states lpo) and Ipl) of a harmonic oscillator such as the field in a single mode of our ideal

13 receiver, now devoid of background radiation, the parameter q entering 26 Eq. (2. 13) is

If, for instance, p = 0, the choice is between the presence and the 0 absence of a coherent signal in the mode, and the probabilities of error 2 depend, through q = 1 - exp(-N ), only on the mean number N = Iw1I S S of signal photons, as in Eq. (2. 14).

The Coherent Signal in Thermal Radiation

Let hypothesis H assert the presence, H the absence, of a 1 0

coherent signal of complex amplitude p in a single mode of a cavity in

thermal equilibrium at absolute temperature$ If the thermal radiation

were gone, the oscillator representing the mode would be in a coherent 26 state Ip). The density operators are, in the P-representation ,

where% = Planck's constant h/2n, R = the angular frequency of the mode,

and K = Boltzmann's constant. The diagonalization of p -hpo, as in' 1 .- .- - - __ Eq. 12. Ei), with these density operators remains an outstanding unsolved

problem of quantum detection theory. By taking p as real, which voids

no generality, and using the co-ordinate (9-)representation, Eq. (2.5)

can be expressed as a homogeneous integral equation, whose kernel is

14 27 a linear combination of Gaussian functions . Evaluation of the pro-

bability of detection woutd parmit specifying th

coherent signal of kno phase in the presence

radiation.

When, as is most reasonable at optical frequencies, the absolute

phase of the complex signal amplitude is unknown and is assigned a

uniform prior distribution over (0, T IT), both p and p are diagonal in 0 the number representation, and the best detector simply measures the 18 energyinthe mode .

If a coherent signal of random phase is present in a number of

modes of a receiver cavity in thermal equilibrium, a linear transform-

ation of the mode amplitudes permits approximate reduction of the 19 problem to the detection of a signal in a single harmonic oscillator .

For this it is required that the signal occupy a frequency band so narrow

that the average number of thermal photons is the same for all the modes

that it excites. In effect, the optimum processing of the field creates _- - I a single mode t'matched't to the signal, and it is the energy or the ex-

citation level of this composite mode that is to be measured.

o signal is present whenever the number

d mode is less than an integer M. The

alarm probability is then

15 -1 Qd = 1 - (N+1) exp(-Ns/(Ntl)) __M-1 - -1iN/(Ntl)j Im L (-N /i--N(Ntl)l), ms (2.16) m=O 19, 28 where L,(x) is the mth Laguerre polynomial

If this receiver is designed to meet the Neyman-Pearson criterion, randomization will in general be necessary in order to attain the pre- assigned false-alarm probability. There will then be a certain photon count MI for which hypothesis H (signal present) is chosen with pro- 1 bability f, H with probability 1-f. For counts less than MI, H is 0 0 always chosen, for counts greater than MI, H1. The required value of f is easily calculated. Graphs of detection probability versus signal 29 strength for such a receiver have been published .

111. Threshold Detection

The Classical Threshold Receiver

It would be useful if a receiver set to incur a fixed false-alarm probability attained maximum detection probability for all expected amplitudes of the signal. This is seldom the case with a receiver based on the classical likelihood-ratio test. Only in particularly simple in- stances, as when the signal is completely known except for amplitude and phase and is received in Gaussian noise, is the likelihood-ratio test uniformly most powerful with respect to signal amplitude. It is usually necessary to set it up for a "standard" signal of specific ampli- tude. and to accept less than maximum probability of detecting signals of

16 other amplitudes. Furthermore, the likelihood ratio is often difficult to generate from a receiver input.

In a compromise that is often expedient, the likelihood ratio is replaced by the so-called threshold statis tic

is the likelihood ratio, with p (x , , x ; A) the p. d. f. of the data 1 1’ . . n when a signal of strength A is present; p 9 x ) = pl(xl, , ; 0). 0 (xl”** n . . . x n The threshold statistic is the logarithmic derivative, with respect to A, of the likelihood ratio for detecting a signal of strength A, evaluated in the limit of vanishing amplitude. It is compared with a decision level U and hypothesis H “signal present”, is selected when U > U 0’ 1’ 0’ The measure A of signal strength is so chosen that the derivative in

Eq. (3.1) does not vanish; it is usually proportional to the energy of 30 the signal .

This threshold statistic U is most nearly optimum when the decision is based on data collected in, a large number M of independent trials.

Compared with the decision level U then is the sum U t U t . . 0 1 2 . oi the threshold statistics calculated from the data obtained in each

trial. The sum has nearly a Gaussian distribution, by virtue of the

central limit theorem, and the false-alarm and detection probabilities

are approximately

17 Qd= erfc (x -MH D), where D is an equivalent signal-to-noise ratio defined by

2 *- '2 D = i E(UIHQ - gul L- (3.4) with Var U the of the statistic U in the absence of the signal. 0

In Eq. (3. 3), x is related to the decision level U on the sum of the 0 threshold statistics.

The false-alarm and detection probabilities will be given approxi- mately as in Eq. (3. 3) for any statistic U(x , ) when the decision 1' . . . xn is based on the sum of such statistics for a large number M of indepen- dent trials. For a fixed pair of probabilities Q and Q and for M >> 1, 0 d that detector is best for which the equivalent signal-to-noise ratio D is largest, for such a detector will require the least number of M of in- dependent trials. The threshold detector as defined in Eqs. (3. 1) and 31 (3.2) is best in this sense .

The Quantum Threshold Receiver

The quantum counterpart of the likelihood-ratio receiver is one in which the optimum detection operator II is measured. It has been found uniformly most powerful with respect to signal amplitude only for de- tecting a known signal of random phase in the presence of thermal noise, a detection problem in which, as we have seen, the density operators commute and the classical likelihood-ratio test is optimum.

18 Furthermore, the mathematical problem of determining the optimum projection operator II presents great difficulty in most cases of practical interest. For these reasons, a quantum-mechanical counterpart to the classical threshold statistic is of interest. 18 The quantum threshold statistic is defined as II 8

where is that operator for which the equivalent signal-to-noise II a ratio D given by

3- 3- ?2 I Trpl(A) II - Trp II 2L a a.j D= o 2 2 (3.6) TrPOIIa - (moria) is maximum. This equivalent signal-to-noise ratio is the quantum-

(3.4); p (A) is the density mechanical form of the one defined in Eq. 1 operator of the observed system when a signal of strength A is present, and p There is no loss of generality if is so defined that 0 = ~~(0). II a

Tr polla = 0, (3.7) since an arbitrary multiple of the identity operator -1 can be subtracted 2 Erom II without changing D . a We define the Hermitian operator 0 (A) as the solution of the equation

and we show that = 0. First of all, II a

Tr(P1 -p ) = 0 =* Tr(pOO +0p ) = Trp 0, 0 0 0 so that Eq. (3.7) is satisfied. We now show that II = 0 maximizes a 2" -c2 2 D =t Tr(p - p,) II i /Tr(poIIa). (3.9) L 1 a_i Substituting from Eq. (3. 81, we find

2 2 = Tr(pO@1 Tr(Pona) by the Schwarz inequality for traces. Hence 2 2 D Tr(p 0 ) 0 with equality when = 0. II a 32 The threshold operator is thus

II, = a@(A)/aAlA = .

As the solution of the operator equation

apl(A)/aAIA= 0 = $(Pone + II8P0)' (3.10) it can be regarded as the symmetrized logarithmic derivative (s. 1. d. ) of p (A), evaluated at A = 0.

In the quantum threshold receiver the operator Is is measured 8 and the outcome compared with a decision level n set to yield a pre- 8 assigned false-alarm probability. The operator is not a projection II 8 operator; the equivalent projection operator is

20 with 10) the eigenstate of n, with eigenvalue 9 , assumed here part of a continuous spectrum.

Threshold Detection of a Coherent Signal

In the cavity that furnishes our model of a quantum receiver the electric field at time t at point 5 is represented by a quantum-mechanical operator -c (r -7 t), which is conveniently decomposed into its positive- and negative-frequency parts ,

the one being the Hermitian conjugate of the other. In terms of the mode eigenfunctions u (r), which are solutions of the Helmholtz equation --m-- with suitable boundary conditions at the walls of the cavity, the positive- 33 frequency part of the electric-field operator is written as

(3.11)

where w is the angular frequency of mode ,m. The mode index m 5 accounts for both the spatial configuration and the polarization of the mode. t The operator a and its Hermitian conjugate a are the annihilation m m c 5 and creation operators for photons in mode and obey the usual com- mutation rules,

t t t a a -a a =[a,a I=& ?2! E?? EE E.2, (3.12) t [a ,a]=[a , at] s.2 %-n =o.

21 The number operator for mode m is n = a t a . 3r - E% Suppose that under hypothesis H the cavity is filled with random 0

Gaussian radiation characterized by the mode correlation matrix CJ ,

whose elements are

(3. 13)

The density operator p for L modes of the field is then, in the P-repre- 0 26 sentation ,

(3.14)

where 5a is a column vector of complex mode variables =cc t ia 5 px Q?Y ' t t *r4. CY is the Hermitian conjugate row vector, a = . . . CY . . . - - -m 3, 2L and d E da dcr is the element of integration in the space EX EY of thea 's. Here E

is the Glauber coherent state for a field with complex amplitude cc F in mode ,m. In thermal equilibrium at absolute temperaturer,

-1 N =[exp(hwk/K-g- 13 (3. 15) k . PI - Were a coherent signal of amplitude A and known phase present in the absence of the random radiation, the field would be in a coherent

State 14) , in which the complex amplitude in mode m- is 4 If @ this coherent signal is superimposed on the random radiation described

of (3. 14), the density operator for the field is by p 0 Eq, (3. 16) which can also be written as 34

(3.17)

(3. 18) where

(3. 19) with 2 the column vector of annihilation operators a and m c t t a = (..., a . . . ) the row vector of the creation operators for the - -m' modes. I- is the identity matrix.

The threshold operator for deciding whether the coherent field with amplitudes Ap is present is the operator given by Eq. (3. 18), F? II 8 as can be verified by differentiating p (A) with respect to A, setting 1 A = 0, and comparing with Eq. (3. 10). The outcomes of measurements of the operator TI have a Gaussian distribution under each hypothesis, 8 and the false-alarm and detection probabilities are given exactly by

(3. 3), with related to the decision level 7~ with which the out- Eq. x 0 comes are compared. The equivalent signal-to-noise ratio D is given by 2 2t D2 = Tr( p B ) = 4A (I-t Zz)-'k. (3.20) 00 For detection in thermal radiation this signal-to-noise ratio reduces to 2 D = 4N /(2N t l), (3.21) S where N = E /hQ is the average number of photons in the field of S S

23 the coherent signal and N is given by Eq. (2.15). For Eq. (3.21) to hold it is necessary that the average numbers N of thermal photons m- in all modes excited by the signal be nearly equal to N, as will be the case when, as usually, the signal occupies only a narrow band of fre- quencies about 0.

In the classical limit the term 2cp dominates in the factor (It 2g)-l z - in Eq. (3.19). and the threshold operator becomes, except for an additive constant, proportional to the logarithm of the classical likeli- hood ratio for choosing between hypotheses H and H The false- 0 1' alarm and detection probabilities for the classically optimum detector 2 2 t -1 are given by Fq. (3. 3) with D = 2A 2 2 E, or for thermal equilibrium, 2 D = 2E /KT.Thus in this case the quantum threshold operator becomes S 35 equivalent in the classical limit to the optimum likelihood- ratio statistic .

Detection of Gaussian Radiation

If the signal field itself has the character of random Gaussian

radiation, the density operator p has the same form as p in Eq. (3. 14). 1 0 We suppose that under hypothesis H the mode correlation matrix, 0 defined by Eq. (3.13), under hypothesis H it is is2 0' 1

where A2 is the mode correlation matrix of the random signal com- S ponents of the field. This is the quantum-mechanical counterpart of what is sometimes called the "noise-in-noise" detection problem, and

it corresponds to the detection of light from an incoherent source.

24 The optimum detection operator II for deciding between hypotheses

H and H remains undiscovered. The threshold operator, however, 0 1 36 can be calculated . It is an Hermitian quadratic form in the annihi- lation and creation operators of the modes,

(3.23)

where the matrix Q- is the solution of the equation

z9s = 9,a_cr+To) f (p20) Qg0* (3.24)

The constant b serves to make Tr (p TI ) vanish. 0' e The p. d. f. 's, under the two hypotheses, of the outcomes of measure-

ments of I'i e are difficult to calculate, and only approximate forms of the false-alarm and detection probabilities are accessible in the general t case. The moment-generating functions of the observable Q' = 2 QA 36 are given by

(3.25)

-P = exp (zQ) -2, i = 0, 1.

The p. d. f of the outcome of a measurement of Q' is the inverse Laplace

transform of h.(-s), and approximation methods, such as the method of 1 steepest descents, are available for calculating the false-alarm and 37 detection probabilities .

25 Reception at an Aperture

An unsatisfactory aspect of quantum detection theory is its formu- lation in terms of simultaneous measurements of the electromagnetic field in a closed volume. An optical instrument, such as a telescope, is more appropriately considered as processing the field at its aperture throughout a finite observation interval. An advantage of the threshold operator for detecting a Gaussian random field is that it can be trans- lated into a form involving only the field operators at the aperture of the receiver, and it can thus be applied to the detection of light from 38 an incoherent source . This translation is possible because the classi- cal mode amplitudes for the cavity receiver after its aperture is closed are linearly related to the field at the aperture itself during the obser- vation interval (0, T).

In order for the threshold operator for detecting incoherent light in the presence of thermal background radiation to take a simple form when expressed in terms of the aperture fields, it is necessary that the duration T of the observation interval be much longer than the reciprocal of the bandwidth W of the light to be detected (WT >> l), and that the diameter of the aperture fl be much greater than the correlation length hc/Krof the thermal radiation. Both these conditions are normally met.

The threshold operator is then proportional to

(3.26)

26 in which for simplicity a scalar field (+I (-1 $(E,t) = JI (5,t) + $ (x,t) has been assumed. Here

is the mutual coherence function of the signal field, where p is its S density operator in the absence of the thermal background. A similar receiver has been derived by Kuriksha on the basis of the classical

39, 40 likelihood ratio

The moment-generating function of the threshold statistic can also be expressed in terms of the mutual coherence function of the signal field at the aperture by similarly translating the form given in Eq.

(3.25), and from this the false-alarm and detection probabilities can 38 be approximated. Details are presented elsewhere .

IV. Choices Among Many Hypotheses.

The choice among M hypotheses, of which the kth asserts, "The system has the density operator p I' k = 1, 2,. M, can be based k' . . on the outcome of a measurement of M commuting projection operators

*. .. ,i3 forming a resolution of the identity operator -1 n,, n2, m- =l. (4.1) n, tn2 t. .. ti3 m-

Quantum logic was formulated in terrns of projection operators by 41 von Neumann . Our problem is to pick such a set of operators II k

that the decision among the M hypotheses can be made with minirnurn

average cost. It will arise, for instance, in designing and evaluating

27 the best receiver for a communication system in which messages are

coded into analphabetof more than two symbols, a different signal be-

ing transmitted for each.

Let 5 be the prior probability of hypothesis and C be the cost k 3 ij incurred upon choosing H. when H. is true. The average cost per 1 J decision is

which is to be minimized by a set of commuting projection operators

II that satisfy Eq. (4. 1). If in particular C.. = 0, Cij = 1, i # j, k 11 equals the average probability of error. This problem of minimizing s C remains unsolved for M >2, except when the M operators p commute, j whereupon it reduces to a standard problem in class cal decision theory.

If under each hypothesis the system is in a pure

the projection operators will have the form

where the 171.) are linear combinatiom of the states ). J I$ k - To find what linear combination minimizes C is also, for M > 2, an

unsolved problem, although one that appears simpler than the general

problem.

V. Estimation Theory

A A Bayesian estimation theory determines a strategy 0(~)= O(xl x2, . . . , xn)

for estimating a parameter 0 of the p. d. f. p(5; 0) =p(xl, , ; 0) x2,. . . xn

28 of the data = (x . . . ,x ) by minimizing the average cost z 1' x2' n

where z(8) is the prior probability density function of the parameter n 8 and C(8, 8) is the cost associated with a discrepancy between the A 42 estimate 8 and the true value of the parameter .

Quantum-mechanically the parameter 8 of a density operator p(8) 41 is estimated by of a resolution of the identity

J dE(8') =A , (5.2) where dE(0 ') is a projection operator corresponding to the statement,

"The value of the parameter 8 lies between 8' and 8' t d8'."

Equivalently we can define such an operator

8 = 8' dE(8') (5.3) "1 n that the outcome of a measurement of 8 yields the value of the estimate of the parameter 8. Corresponding to Eq. (5. l), the average cost associated with the estimate is

(5.4)

The best of the parameter 8 is that resolution of the identity A dE(8'), or the associated operator 0, for which the average cost is minimum. How to find it remains an unsolved problem. If estimation is viewed as a choice among a continuum of hypotheses about the system,

Eq. (5.4) is the counterpart of Eq. (4.2). If there is a representation in which the density operators p(8) are simultaneously diagonal, they

29 all commute, and the problem reduces to the classical one of minimizing

of Eq. (5.1).

Even in classical statistical estimation the full apparatus of the

Bayesian theory is seldom called upon, for prior probability density functions of the parameters are usually unknown. Instead, estimators are sought that have small or zero bias and at the same time incur a small mean-square error over a broad range of true values of the para- n meters. In quantum-mechanical terms the bias of an estimate 8 of a parameter 8 of the density operator p(0) is defined by

(5.5)

A where 8 is the operator whose measurement yields the value of the estimate of 8. (Parameters are c-numbers. ) The mean-square error

An estimate that has zero bias and attains the minimum value of efor

all values of the parameter 8 is said to have uniformly minimum variance.

The Cramgr-Rao Inequality 44 In classical statistics an inequality due to Cram&r43and Rao

sets a lower bound to the mean-square error attainable by any estimator

of a parameter 8 of a p.d.f. p(z; e), A 2 E(8 - 8) 2

where b'(0) = db(Q)/d8and b(8) is the bias. For unbiased estimates

b'(8) = 0..

Furthermore, equality is achieved in Eq. (5.7) by an estimator n 0 (2)satisfying the equation 30 with k(8) independent of the data -x, provided that such an estimator exists. If it exists, it is unbiased and a , and it is called an ---efficient estimator. n In order for a function 0(~)to be a sufficient statistic for estimating

0, it must be possible to factor the density function p(3; 0) into a part A depending on the data2 only throughV0(%)anda remainder that is inde- pendent of the parameter 8,

Such a factorization is seldom possible. 45 An analogous lower bound exists in quantum estimation theory . A Let 0 be an operator, the outcome of a measurement of which provides an estimate of a parameter 8 of the density operator p(0). Then the mean-square error is bounded below by

where L is the symmetrized logarithmic derivative (s. 1. d. ) of p(0) with respect to 0, defined by

The inequality becomes an equality if

A L = k(8)(0 - 02), (5.11) with k(8) a numerical function of the true value 0 only. This requires the density operator p(0) to have the form

31 (5.12)

A where p is independent of the parameter 8, and V(0; e) is an operator b satisfying the equation A av/ae = +VL = $+(e) v(e - eg (5.13) and depending on the dynamical variables of the system only through

A h the operator 0, of which it is a function. If such an estimator 0 exists, it is unbiased, attains the minimum variance k(e)]-',and is termed ., 46 an efficient estimator,

An example is the estimation of the amplitude A of a coherent field in the presence of incoherent Gaussian radiation. The density

operator p (A) is then given by pl(A) of Eqs. (3. 16) and (3. 17), with p of Eq. (3. 14) taking the place of p Comparing Eqs. (3. 18) and 0 b' (5. 13), we find the s. 1. d.

t L = II e - 4Ap- (It ZCJI-)-~~, where is the threshold operat or for detecting the field with mode II 8 amplitudes Ap in the presence of the same type of background radi- m z ation. This threshold operator is given in Eq. (3.19).

An efficient estimator of the amplitude A of the field is, by virtue of Eq. (5. ll), the operator

(5. 14) and it attains the minimum variance

(5.15)

For background radiation of the thermal variety and a narrow-band signal field, this estimator provides a relative variance

(5.16)

32 where N is the mean number of photons in the signal field and N is the S mean number of thermal photons per mode. In the classical limit this

.~. -1 minimum relative variance becomes equal to (2E /K ) , which is the S same as for a classical efficient estimator of the amplitude of a coher-

ent signal of energy E and known phase in thermal noise of absolute S .- .- .. temperature 2'.

Efficient estimators can be expected to be at least as rare in

quantum estimation theory as in the classical theory, and no general method has been found for producing estimators that come close to the lower bound set by the quantum counterpart, Eq. (5.9), of the

C ram6r - Rao inequality.

Sufficient Statistics

The density operator ~(8)can sometimes be factored as in Eq. n (5.12) into two parts, V(8; 8) and its Hermitian conjugate, that depend

A on the dynamical variables of the system only through the operator 8,

and a third part p independent of the unknown parameter 8. The b A operator 8 might then, in analogy with the classical terminology, be

called a sufficient estimator, or a sufficient statistic for estimation. A The operator A in Eq. (5. 14) is sufficent for estimating the amplitude

of the signal field.

In classical detection theory the sufficient statistic for estimating

the amplitude of a coherent signal in Gaussian noise is also sufficent

for detecting the signal; that is, the likelihood ratio for detection

33 depends on the input to the receiver only through that statistic, and the

optimum decision about the presence of the signal can be based upon it.

In the corresponding quadtum-detection problem, the amplitude esti- A mator A does not provide the optimum detection operator, as is evident

from the treatment of detection in the absence of thermal noise, given

in Section II. For a coherent signal in Gaussian noise the efficient

estimator of signal amplitude is related rather to the threshold statistic

for detection. The concept of a sufficient statistic does not, therefore,

seem to have the range in quantum-mechanical decision and estimation 46 Bheory that it possesses in the classical theory.

Multiple Estimation

Thus far we have treated only the estimation of a single para- meter of the density operator of the system. In the classical theory

the CramAr-Rao inequality has been generalized to cover the simulta- 43, 44 neous estimation of several unknown parameters , and a corres-

ponding generalization is possible in quantum estimation theory as 47 well . In discussing it we restrict ourselves for simplicity to un- biased estimates.

Let there be m parameters = (8 19 , 8 ) of the density . . . m A operator p (e) to be estimated, and let 8. be the operator whose measure- - J ment yields a number that is taken as the estimate of the parameter 8 j' Since the estimates are assumed to be unbiased,

34 we define

as the operator providing the error in the estimate of 8 The co- j' variance of simultaneous estimates of the parameters 8 and 8. is then i J nn (5. 17)

These form an m x m matrix l3, whose diagonal elements n are the of the errors in the estimates. If the operators 0 i are to be measured on the same system, they must commute in order 48 for the covariances B.. to have a clearly defined physical meaning . 1J The sizes of the errors and their correlations are conveniently visualized in terms of the concentration ellipsoid in an m-dimensional

N space with Cartesian co-ordinates ,Z = (z z , z ); its equation 1, 2' . . . m is 49 -1 5-ZB _Z=mt2, (5. 18)

N where ,Z is a column vector, 5 its transposed row vector. The larger

this ellipsoid, the greater the mean-square errors, and an elongation

of the ellipsoid in a direction aslant to the co-ordinate axes indicates

a correlation among the estimates.

The generalized Cram6 r- Rao inequality for multiple estimation

places this concentration ellipsoid outside the ellipsoid

N ---ZAZ = m t 2, (5. 19)

35 A.. = STr ~(L.L.t L.L.) = Tr(ap/aei)Lj (5.20) 1J 1J J1 with L. the s.1.d. of p(5) with respect to e., defined as in Eq. (5.10). 1 1 That is, for any column vector ,Z of m real elements,

A1 t e rna tiv ely , for any column vector of real elements,

(5.21)

by picking appropriate values of = (y,, , y ), one can set lower 2 . . . m bounds to variances and covariances of unbiased estimates of the unknown parameters. In particular,

h A 2 -1 B.. = Var8. = Trp(0. - 9.1) 2 (,A )ii, (5.22) 11 1 1 1- -1 which is the i-th diagonal element of the inverse matrix -A . The symmetrized logarithmic derivatives needed in the error bounds on both single and multiple estimates can be worked out for parameters of coherent fields and random Gaussian fields observed

in the presence of random Gaussian background fields. The density

operators have then the forms given by Eqs. (3.14) and (3.16). Details 47 have been presented elsewhere .

VI. Conclusion

We have omitted from this review the analysis of actual receivers

in which quantum effects are significant and the extension of information

theory to channels embodying such receivers. Optical heterodyne

receiver’s and optical detectors of incoherent light have been extensively

36 studied, the types of noise encountered in them have been classified and measured, and methods and data required for their design have been compiled. To simplified models, such as the photon-counting 50-59 receivers, classical detection and estimation theory have been applied

Capacities and information rates of communication channels embodying such receivers have been calculated in order to extend into the quantum domain the results of classical 60-70

A review of quantum detection and estimation theory itself can at the present time be little more than a recital of unsolved problems.

Indeed, a collection of ideas in which suchfundamental matters as optimum Bayes estimation and optimum multiple- hypothesis testing re- main unresolved can hardly be called a theory at all. Nevertheless, it is eminently reasonable that such a theory should exist. If it can be elaborated sufficiently, it will permit us to specify the ultimate limits that the thermal and quantum properties of nature set to the reliable detection of signals and to accurate measurement of parameters of physical systems.

37 Footnotes

*This paper was prepared under grant NGR-05-009-079 from the

National Aeronautics and Space Administration.

1. For a review of the theory of density operators, see U. Fano, Revs. Mod. Phys. -29, 74 (1957), D. Ter Haar, (Institute of

Physics and the Physical Society, London) 2, 304 (1961).

2. The word l'classicall'is used here to mean "nonquantum-mechanical".

It does not indicate venerability, most of what is here called

"classical statistical theory" being younger than quantum

mechanics itself.

3. E. Lehmann, Testing of Statistical Hypotheses (John Wiley & Sons, Inc.,

New York, 1959).

4. D. Middleton, An Introduction to Statistical Communication Theory

(McGraw-Hill Book Company, Inc., New York, 1960), chs. 18 - 23.

5. I. Selin, Detection Theory (Princeton University Press, Princeton,

N. J., 1965).

6. J. C. Hancock and P. A. Wintz, Signal Detection Theory (McGraw-

Hill Book Company, Inc., New York, 1966)

7. A. V. Balakrishnan (ea. ), Communication Theory (McGraw Hill ., Book Company, Inc., New York, 1968), chs. 3, 4.

8. 6. W. Helstrom, Statistical Theory of Signal Detection (Pergamon

Press, Ltd., Oxford, England, 1968), 2nd ed.

38 9. H. Van Trees, Detection, Estimation, and Modulation Theory

(John Wiley & Sons, Inc., New York, 1968), vol. 1.

10. The Bayes whose name figures so largely here was the Rev. Thomas

Bayes (1702 - 1761), whose paper, "An Essay Toward Solving

a Problem in the Doctrine of Chances," Phil. Trans. Roy. SOC.

(London) !Z3, 370 (1763) proposed basing decisions on posterior

or conditional probabilities. It has been reprinted in Biometrika

-45, 293 (1958).

11. Ref. 8, pp. 82, 91; ref. 9, p. 26.

12. Ref, 8, p. 194; ref. 9, p. 46.

13. J. Neyman and E. Pearson, Proc. Camb. Phil. SOC.-29, 492 (1933).

14. 3. Neyrnan and E. Pearson, Phil. Trans. Roy. SOC.(London) -A231,

289 (1933).

15. The notations Q = cy and Qd = i3 are common in the statistical liter- 0 ature, where these probabilities are termed the size and the

power of the test, respectively.

16.' Ref. 8, pp, 87, 93; ref. 9, p. 33.

17. Ref. 8, ch. VIII, p. 249; ref. 9, 52.4, . 52. '

18. C. W. Helstrom, Information and Cont 01 I10, 254 (1967)

19. C. W. Helstrorn, Int. J. Theor. Phy . 1, 37 (1968).

2 0. C. W. Helstrom, Information and Ccntrol _.13, 156 (1968).

21. A derivation taking into account the possibility that randomization may be necessary is given in :ef. 18. i/-

39 . .. 9. . .. J i

22. F. Riesz and B. Se. -Nagy, Functional Analysis (F. Ungar Publishing

Co. , New York, 1955), p. 238. I am indebted to Professor

E. Masry for pointing out this theorem.

23. W. H. Louisell, Radiation- and Noise in- Quantum Electronics

(McGraw-Hill Book Company, Inc. , New York, 1964), p. 242.

24. Ref. 20, $8.

25. P. A. Bakut and S. S. Shchurov, Problemy Peredachi Informatsii -4,

no. 1, 77 (1968).

26. R. .J. Glauber, Phys. Rev. -131, 2766 (1966)

27. Ref. 19, p. 45. 28. G. Lachs, Phys. Rev. - 138, B1012 (1965). 29. 6. W. Helstrom, Trans. IEEE --AES-5, (1969).

30. D. Middleton, Trans. IEEE IT-12, 230 (1966).

31. Ref. 18, p. 273.

32. Ref. 18, p. 275.

33. Ref. 23, p, 153.

34. Ref. 20, p. 165f.

35. Ref. 8, Ch. IV.

36. Ref. 18, p. 284ff.

11 37. G. Doetsch, Handbuch der Laplace-Transformation (Birkhauser

Verlag, Base1 and Stuttgart, 1955) vol. 2, ch. 3.

38. C. W’. Helstrom, submitted to J. Opt. SOC.Am.

40 39. A. A. Kuriksha, Problemy Peredachi Informatsii -3, no. 1, 42 (1967).

40. A. A. Kuriksha, Radiotekhnika i Elektronika 13,- 1790 (1968)

41. J. von Neumann, Mathematische- Grundlagen- der Quantenmechanik

(Springer Verlag, Berlin, 1932) p. 130.

42. Ref. 8, ch. VIII; ref. 9, $2.4, p. 52.

43. H. Cramgr, Mathematical- Methods of Statistics-- (Princeton University

Press, Princeton, N. J., 1946), p, 473ff.

44. C. R. Rao, Bull. Calcutta Math. SOC. I37, 81 (1945).

45. C. W. Helstrom, Phys. Letters -25A, -- 101 (1967).

46. Ref. 20, pp. 164ff.

47. C. W. Helstrom, Trans. LEEE I---IT-14, 234 (1968).

48. H. Margenau and R. N. Hill, Progr. Theoret. Phys. (Kyoto) -26,

722 (1961).

49. L. Schmetterer, Mathematische Statistik (Springer Verlag, New York,

1966), p. 63.

50. B. Reiffen and H. Sherman, Proc. IEEE --51, 1316 (1963).

51. C. W. Helstrom, Trans. IEEE IT-10, 274 (1964).

52. J..W. Goodman, IEEE J. Quant. Elec. QE-1, 180 (1965).

53. J. W. Goodman, Trans. IEEEAES-2, 526 (1966).

54. P. A. Bakut, V. G. Vygon, A. A. Kuriksha, V. G. Repin, G. P.

Tartakovskii, Problemy Peredachi Informatsii -2, no. 4, 39 (1966).

55. V. L: Stefanyuk, --ibid. 2, no. 1, 58 (1966).

56. P. A. Bakut, Radio Engrg. & Electr. Phys. -11, 551 (1966).

41 57, S. A. Ginzburg, --ibid. 11, 1972 (1966).

58. P. A. Bakut, --ibid. 12, 1 (1967)

59. I. Bar-David, Trans. IEEE IT-15, 31. (1969),

60. T. E. Stern, Trans. IRE -IT-6, 435 (1960)

61. J. P. Gordon, Proc. IRE -50, 1898 (1962)

62. B. E. Goodwin and L. P. Bolgiano, Jr. Proc. IEEE -53, 1745 (1965)

63. L. F. Jelsma and L. P. Bolgiano, IEEE Ann. Commun. Conv.

Record, 635 (1965).

64. L.' B'. Levitin, Problemy Peredachi Informatsii -1, no. 3, 71 (1965).

65. H. Takahasi, Advances in Communication Systems (A. V. Balakrishnan,

ed., Academic Press, New York) -1 , 227 (1965).

66. D. S. Lebedev and L. B. Levitin, Information and Control -9, 1 (1966).

67. V. V. Mityugov, Problemy Peredachi Informatsii -2, no. 3, 4%(1966).

68. R. L. Stratonovich, --ibid. 2, no. 1, 45 (1966).

69. S. I. Borovitskii, V. V. Mityugov, --ibid. 3, no. 1, 35 (1967).

70. G. Lachs and G. Fillmore, Trans. IEEE IT-15, (1969).

42