Reasoning Foundations of Medical Diagnosis
Author(s): Robert S. Ledley and Lee B. Lusted
Source: Science, New Series, Vol. 130, No. 3366 (Jul. 3, 1959), pp. 9-21
Published by: American Association for the Advancement of Science
Stable URL: http://www.jstor.org/stable/1758070


American Association for the Advancement of Science is collaborating with JSTOR to digitize, preserve and extend access to Science.

SCIENCE, 3 July 1959, Volume 130, Number 3366

Reasoning Foundations of Medical Diagnosis

Symbolic logic, probability, and value theory aid our understanding of how physicians reason.

Robert S. Ledley and Lee B. Lusted

The purpose of this article is to ana- ... fitted into a definite dis ...

... ance are the ones who do remember and consider the most possibilities." Computers are especially suited to help the physician collect and process clinical information and remind him of diagnoses which he may have overlooked. In many cases computers may be as simple as a set of hand-sorted cards, whereas in other cases the use of a large-scale digital electronic computer may be indicated. There are other ways in which computers may serve the physician, and some of these are suggested in this paper. For example, medical students might find the computer an important aid in learning the methods of differential diagnosis. But to use the computer thus we must understand how the physician makes a medical diagnosis. This, then, brings us to the subject of our investiga- ...

... clinicopathological exercises from Massachusetts General Hospital were studied. It has been necessary to simplify the case illustrations in order to demonstrate the calculations in their entirety.

Two well-known mathematical disciplines, symbolic logic and probability, contribute to our understanding of the reasoning foundations of medical diagnosis; a third mathematical discipline, value theory, can aid the choice of an optimum treatment. These three basic concepts are inherent in any medical diagnostic procedure, even when the diagnostician utilizes them subconsciously, or on an "intuitive" level.

As is shown below, the logical concepts inherent in medical diagnosis emphasize the fundamental importance of considering combinations of symptoms or symptom complexes in conjunction with combinations of diseases or disease complexes. This point is emphasized because often an evaluation is made of a sign or symptom (4) by itself with respect to each possible disease by itself, whereas consideration of the combinations of signs and symptoms that the patient does and does not have in relation to possible combinations of diseases is of primary importance in diagnosis.

The probabilistic concepts inherent in medical diagnosis arise because a medical diagnosis can rarely be made with absolute certainty; the end result of the diagnostic process usually gives a "most likely" diagnosis. The logical considerations present alternative possible disease complexes that the patient can have; the purpose of the probabilistic considerations is to determine which of these alternative disease complexes is "most likely" for this patient.

The value theory concepts inherent in medical diagnosis and treatment are concerned with the important value decisions that the diagnostician frequently faces when he is choosing between alternative methods of treatment. The problem facing the physician is to choose that treatment which will maximize the chance of curing the patient under the ethical, social, economic, and moral constraints of our society. As is discussed below, Von Neumann's so-called "theory of games" can be used to analyze such value decisions.

Logical Concepts

There are three ingredients to the logical concepts inherent in medical diagnosis; these are (i) medical knowledge, (ii) the signs and symptoms presented by the patient, and (iii) the final medical diagnosis itself. Medical knowledge presents certain information about relationships that exist between the symptoms and the diseases. The patient's symptoms (4) present further information associated with this particular patient. With these two sources of available information, and by means of logical reasoning, the diagnosis is made.

Symbolism. The first step in making a logical analysis of this process is to review some symbolism associated with the propositional calculus of symbolic logic. Such symbolism enables the more precise communication of the concepts involved in logical processes. The symbols $x, y, \ldots$ are used to represent "attributes" a patient may have such as, for instance, a sign "..." or a disease "...," and so forth. Corresponding capital letters $X, Y, \ldots$ are used to represent statements about these attributes. For example, $Y$ represents the sentence:

    The patient has the attribute y.

The negation of this statement:

    The patient does not have the attribute y.

is represented by $\bar{Y}$, where the bar (called negation) over the $Y$ indicates "not." The combination of symbols $X \cdot Y$ represents the combined statement:

    The patient has both the attribute x and the attribute y.

where the center dot (called logical product) indicates "and." The combination of symbols $X + Y$ represents the combined statement:

    The patient has attribute x or attribute y, or both.

where the plus sign (called logical sum) indicates "or," that is, the "inclusive or." The sentence:

    If the patient has attribute x, then he has attribute y.

is symbolized by $X \rightarrow Y$.

[Fig. 1. Combinations of attributes. Panels: (a) Y and not Y; (b) $X \cdot Y$, both X and Y; (c) $X + Y$, X or Y (or both); (d) the four classes: not X and not Y, X and not Y, not X and Y, both X and Y; (e) if X then Y.]

Table 1. Symbolic representation of combinations of attributes.

Symbols              Name              Interpretation
$\bar{Y}$            Negation          Not Y
$X \cdot Y$          Logical product   X and Y
$X + Y$              Logical sum       X or Y (or both)
$X \rightarrow Y$    Implies           If X then Y

All these symbols and their meanings are summarized in Table 1. But they can be most easily visualized by considering, for example, the population of patients illustrated in Fig. 1. The cross-hatched patients of Fig. 1a have attribute y, that is, they are those for whom $Y$ holds. If we now consider a second attribute x for some of our patients (cross-hatched in the other direction), then Fig. 1b indicates these patients for whom $X \cdot Y$ holds. Similarly, Fig. 1c indicates those patients for whom $X + Y$ holds. In fact, with two attributes, our patients can be put into four classes, as indicated by $C_0$, $C_1$, $C_2$, and $C_3$ of Fig. 1d.

Figure 1e illustrates a population of patients where the attributes x and y have the property that "if X then Y." Here, note that the patients for whom $X \rightarrow Y$ holds are those of $C_0$, $C_2$, and $C_3$ only. The situation $C_1$ cannot occur (because $C_1$ represents patients with X but not Y); hence, $C_1$ has been crossed out.

Of course, in general, more than two attributes are considered, and usually more complicated expressions can be formed by making combinations of attributes. Such expressions are called "Boolean functions" and are generally denoted in terms of the usual functional notation $f(X, Y, \ldots, Z)$. Similarly, for more than two attributes, we can classify the patients into more than four classes $C_i$. In fact, it is easy to see that for m attributes, there are $2^m$ possible ways the patients can and cannot have the m attributes, that is, there are $2^m$ of the classes $C_i$, namely, $C_0, C_1, \ldots, C_{2^m - 1}$.

For our purposes, we need only introduce attributes that are symptoms and diseases. Let the symbol $S(1)$ mean, "The patient has symptom 1," and similarly for $S(2)$, and so forth. Let the symbol $D(1)$ mean, "The patient has disease 1," and similarly for $D(2)$, and so forth. In general, a set of n symptoms,

$$S(1), S(2), \ldots, S(n)$$

and a set of m diseases,

$$D(1), D(2), \ldots, D(m)$$

will be under consideration. Which symptoms and diseases are to be included in such sets is usually dictated by the circumstances. For example, an ophthalmologist is interested in a certain collection of symptoms and diseases, whereas an orthopedist is interested in another collection.

Logical problem. By means of our symbolic language, each of the three aforementioned ingredients of medical diagnosis can be expressed in terms of Boolean functions. The relationships between diseases and symptoms that comprise medical knowledge can be expressed as a Boolean function of the diseases and symptoms under consideration, say

$$E(S(1), \ldots, S(n), D(1), \ldots, D(m))$$

Similarly, the symptoms presented by a patient can be expressed as a Boolean function of the symptoms alone, say

$$G(S(1), \ldots, S(n))$$

Then the diagnosis itself can be expressed as a Boolean function of the diseases alone, say

$$f(D(1), \ldots, D(m))$$

To illustrate these three functions E, G, and f, let us for simplicity limit our consideration to two diseases, $D(1)$ and $D(2)$, and two symptoms, $S(1)$ and $S(2)$. Let us first discuss E. Suppose the following statements were made in a diagnostic textbook concerning the relationships between $D(1)$, $D(2)$, $S(1)$, and $S(2)$:

    If a patient has disease 2, he must have symptom 1:
        $D(2) \rightarrow S(1)$
    If a patient has disease 1 and not disease 2, then he must have symptom 2:
        $D(1) \cdot \bar{D}(2) \rightarrow S(2)$
    If a patient has disease 2 and not disease 1, then he cannot have symptom 2:
        $\bar{D}(1) \cdot D(2) \rightarrow \bar{S}(2)$
    If a patient has either or both of the symptoms, then he must have one or both of the diseases:
        $S(1) + S(2) \rightarrow D(1) + D(2)$

Since all of these relations are to hold according to medical knowledge, we have for E:

$$E = [D(2) \rightarrow S(1)] \cdot [D(1) \cdot \bar{D}(2) \rightarrow S(2)] \cdot [\bar{D}(1) \cdot D(2) \rightarrow \bar{S}(2)] \cdot [S(1) + S(2) \rightarrow D(1) + D(2)] \qquad (1)$$

To illustrate the G function is much simpler. A particular patient might present symptom 2 and not symptom 1; then we have

$$G = \bar{S}(1) \cdot S(2)$$

Note that symptoms the patient does not have are included as well as those the patient does have. If it is not known whether the patient does or does not have a symptom (for example, if the symptom is determined as the result of a laboratory test not yet accomplished), then this symptom does not appear explicitly in the function G. Thus, if the patient has symptom 2 and it is not known whether or not he has symptom 1, then $G = S(2)$.

Function f may be illustrated as follows. If the patient has disease 1 and not disease 2, then

$$f = D(1) \cdot \bar{D}(2)$$

Of course, function f is computed when the functions E and G are known. For example, as we shall presently show, if E as illustrated above describes the medical knowledge concerning $D(1)$, $D(2)$, $S(1)$, and $S(2)$, and if a patient presents

$$G = \bar{S}(1) \cdot S(2)$$

then it turns out that

$$f = D(1) \cdot \bar{D}(2)$$

Although we shall discuss a specific example below, it is important to first state the logical problem of medical diagnosis in more abstract terms. The logical aspect of the medical diagnosis problem is to determine the diseases f such that if medical knowledge E is known, then: if the patient presents symptoms G, he has diseases f. In terms of our symbolic language, the problem is to determine a Boolean function f that satisfies the following formula:

$$E \rightarrow (G \rightarrow f) \qquad (2)$$

This is the fundamental formula of medical diagnosis. That this is truly the diagnosis in an intuitive sense can be readily seen. For it is easy to show that the fundamental formula can be equivalently written as

$$E \rightarrow (\bar{f} \rightarrow \bar{G})$$

which means in a sense that if the diseases f are cured, then the patient's symptoms will disappear. It can be shown that a solution f always exists. We shall actually illustrate below an elementary computational technique for determining the function f in a simple situation involving two symptoms and two diseases; however, for more complicated situations where many more symptoms and diseases are involved, more advanced and powerful techniques for computing f must be used (5-7).

Logical basis. To illustrate the application of the elementary computational method to a specific example, we must first consider the concept of a logical basis. Actually, we have already introduced this concept in a preliminary way in Fig. 1d, for a logical basis displays all conceivable combinations of the attributes under consideration that a patient may have. For two attributes (as considered in Fig. 1d) there are four such combinations. Figure 2 illustrates how these are displayed in a logical basis corresponding to the attributes $D(1)$ and $D(2)$. The 0 indicates that the corresponding disease does not occur; the 1 indicates that it does. Each column $C_i$ represents a disease complex, that is, $C_0$ represents $\bar{D}(1) \cdot \bar{D}(2)$, $C_1$ represents $D(1) \cdot \bar{D}(2)$, and so forth. The columns represent an exhaustive list of conceivable complexes, that is, a patient must fit into one of these complexes. The complexes are mutually exclusive, that is, a particular patient can fit into only one of the complexes at a time.

Fig. 2. Logical basis for D(1) and D(2).

        C_0   C_1   C_2   C_3
D(1)     0     1     0     1
D(2)     0     0     1     1

Fig. 3. Logical basis for S(1) and S(2).

        C^0   C^1   C^2   C^3
S(1)     0     1     0     1
S(2)     0     0     1     1

Similarly, we can form a logical basis for two symptoms, as is shown in Fig. 3, where the columns are now labeled by $C^k$, with a superscript, and represent all conceivable symptom complexes. If we consider the four attributes $S(1)$, $S(2)$, $D(1)$, and $D(2)$, then all conceivable combinations of disease complexes and symptom complexes can be summarized by the columns on the logical basis of Fig. 4. Each column represents a different product $C^k \cdot C_i$; let us denote such a column simply by $C_i^k$. For example, the demarcated column in Fig. 4 corresponds to $C^1 \cdot C_2$, and we denote it by $C_2^1$. Thus this column $C_2^1$ represents the conceivable situation of a patient's having $S(1)$ but not $S(2)$, and at the same time $D(2)$ but not $D(1)$, that is,

$$S(1) \cdot \bar{S}(2) \cdot \bar{D}(1) \cdot D(2)$$

similarly, column $C_3^2$ (that is, $C^2 \cdot C_3$) represents the case

$$\bar{S}(1) \cdot S(2) \cdot D(1) \cdot D(2)$$

and so forth. For n symptoms and m diseases, the combined logical basis will have $2^{n+m}$ columns representing all conceivable combinations of symptom-disease complexes. The reader who is familiar with the binary number system will note that the columns of a logical basis with k rows simply form the binary numbers from 0 to $(2^k - 1)$.

Example of elementary computation. Although a logical basis lists all conceivable symptom-disease complex combinations, it is obvious that many of these do not actually occur. Which do occur and which do not occur is information included in medical knowledge, and therefore it is natural for us to look to the E function for such information. Thus the role of the E function that embodies medical science is to reduce the logical basis from all conceivable combinations of disease-symptom complexes to only those that are actually possible. As an illustration, consider the E function of the above example (see Eq. 1). First note that it contains as a term the expression $D(2) \rightarrow S(1)$. This means that if a patient has $D(2)$ then he must have $S(1)$, and hence the combination of a patient having $D(2)$ and not $S(1)$, that is, $\bar{S}(1) \cdot D(2)$, cannot occur; thus, for example, column $C_2^0$, namely

        C_2^0
S(1)      0
S(2)      0
D(1)      0
D(2)      1

cannot occur. Similarly it can be checked that columns $C_2^2$, $C_3^0$, and $C_3^2$ cannot occur, for each of these represents patients who have at least disease $D(2)$ but do not have symptom $S(1)$ (see Fig. 4 and Fig. 1e). Also the expression

$$D(1) \cdot \bar{D}(2) \rightarrow S(2)$$

is included in E; hence columns $C_1^0$ and $C_1^1$ must be eliminated. From the expression

$$\bar{D}(1) \cdot D(2) \rightarrow \bar{S}(2)$$

we find that columns $C_2^2$ and $C_2^3$ must be eliminated. Finally, the expression

$$S(1) + S(2) \rightarrow D(1) + D(2)$$

eliminates columns $C_0^1$, $C_0^2$, and $C_0^3$. Thus the reduced basis that includes the medical science information (that is, Fig. 4 with the appropriate columns omitted) is shown in Fig. 5.

We now come to the following point: If the patient presents a particular symptom complex, what possible disease complexes does he have? Consider, for example, a patient that presents the case $C^2$, that is,

$$G = \bar{S}(1) \cdot S(2)$$

The only column in our reduced basis that contains this symptom complex is $C_1^2$, that is,

        C_1^2
S(1)      0
S(2)      1
D(1)      1
D(2)      0

(see Fig. 5). Since this is the only disease-symptom complex combination that can occur (according to medical knowledge) that includes the symptom complex $\bar{S}(1) \cdot S(2)$, it follows that the diagnosis is $C_1$, that is,

$$f = D(1) \cdot \bar{D}(2)$$

or the patient has disease $D(1)$ but not disease $D(2)$.

As another example, suppose the patient presented $C^1$, that is,

$$G = S(1) \cdot \bar{S}(2)$$

then we must consider both column $C_2^1$ and column $C_3^1$, since both of these columns include the $S(1) \cdot \bar{S}(2)$ symptom complex. Thus there are two possible disease complexes that the patient may have, $C_2$ or $C_3$. Thus,

$$f = \bar{D}(1) \cdot D(2) + D(1) \cdot D(2)$$

that is, the patient has disease $D(2)$ and it is not known whether he has $D(1)$ or not; either further tests must be taken or else medical knowledge cannot tell whether or not he has $D(1)$ under these circumstances.
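The elimination procedure just described lends itself to a short computational sketch. The Python below is an illustration only, not the computing technique of the article: it enumerates the 16 columns of the combined logical basis, keeps those consistent with the E of Eq. 1, and reads off the disease complexes compatible with a given G. The names `implies`, `E`, `REDUCED_BASIS`, and `diagnose` are hypothetical, and G is assumed to be supplied as a predicate over the two symptoms.

```python
from itertools import product


def implies(a, b):
    """Truth value of the implication a -> b."""
    return (not a) or b


def E(s1, s2, d1, d2):
    """Medical knowledge of Eq. 1 for two symptoms and two diseases."""
    return (
        implies(d2, s1)
        and implies(d1 and not d2, s2)
        and implies(d2 and not d1, not s2)
        and implies(s1 or s2, d1 or d2)
    )


# Reduced basis: every column (s1, s2, d1, d2) that E does not eliminate.
REDUCED_BASIS = [col for col in product([False, True], repeat=4) if E(*col)]


def diagnose(G):
    """Disease complexes (d1, d2) that occur with symptoms satisfying G.

    G is a predicate over (s1, s2); a symptom whose status is unknown
    is simply left out of the predicate.
    """
    return sorted({(d1, d2) for s1, s2, d1, d2 in REDUCED_BASIS if G(s1, s2)})


# Symptom 2 without symptom 1: only disease 1 without disease 2 remains.
print(diagnose(lambda s1, s2: (not s1) and s2))   # [(True, False)]
# Symptom 1 without symptom 2: disease 2 is certain, disease 1 is open.
print(diagnose(lambda s1, s2: s1 and (not s2)))   # [(False, True), (True, True)]
```

Exhaustive enumeration is workable only for a handful of attributes, since the basis doubles with each symptom or disease added; that is the situation for which the text calls for more powerful techniques.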

Next, suppose the patient has $S(2)$, and it is not known whether he has $S(1)$ or not, that is, $C^2$ or $C^3$, or

$$G = \bar{S}(1) \cdot S(2) + S(1) \cdot S(2)$$

In this case we consider $C_1^2$, $C_1^3$, and $C_3^3$, whence the patient has $C_1$ or $C_3$, that is,

$$f = D(1) \cdot \bar{D}(2) + D(1) \cdot D(2)$$

or the patient certainly has $D(1)$ but it is not known whether he has $D(2)$ or not.

We have thus demonstrated how, from the reduced basis that embodies medical knowledge and from the symptom complexes presented by the patient, we can determine the possible disease complexes the patient may have, which is the medical diagnosis.

Fig. 4. Logical basis for S(1), S(2), D(1), and D(2).

             C_0              C_1              C_2              C_3
        C^0 C^1 C^2 C^3  C^0 C^1 C^2 C^3  C^0 C^1 C^2 C^3  C^0 C^1 C^2 C^3
S(1)     0   1   0   1    0   1   0   1    0   1   0   1    0   1   0   1
S(2)     0   0   1   1    0   0   1   1    0   0   1   1    0   0   1   1
D(1)     0   0   0   0    1   1   1   1    0   0   0   0    1   1   1   1
D(2)     0   0   0   0    0   0   0   0    1   1   1   1    1   1   1   1

[Fig. 5. Reduced basis that includes medical knowledge.]

Probabilistic Concepts

Need for probabilities. In the previous section we considered statements such as, "If a patient has disease 2, he must have symptom 2." While such positive statements have a place when, for example, some laboratory tests are being discussed, it is also evident that in many cases, the statement would read, "If a patient has disease 2, then there is only a certain chance that he will have symptom 2, that is, say, approximately 75 out of 100 patients will have symptom 2." Since "chance" or "probabilities" enter into "medical knowledge," then chance, or probabilities, enter into the diagnosis itself.

At present it may generally be said that specific probabilities are rarely known; medical diagnostic textbooks rarely give numerical values, although they may use words such as "frequently," "very often," and "almost always." However, as is shown below, it is a relatively simple matter to collect such statistics. Since we are considering topics from an essentially academic point of view, we shall assume that the probabilities are known or can be easily obtained, and we shall discuss methods of utilizing such probabilities in the medical diagnosis. Actually, such a discussion makes clear in any particular circumstances precisely which statistics should be taken and presents methods for rapidly collecting them in the most useful form.

Total and conditional probabilities. The first step in discussing a probabilistic analysis of medical diagnosis is to review some definitions and important properties of probabilities. The concept of total probability is concerned with the following question. Suppose we select at random from our population of patients one single patient; what is the chance, or total probability, that the patient chosen has certain specified attributes $f(x, y, \ldots, z)$? By definition, the total probability is the ratio of the number of patients that have these attributes to the total number of patients from which the random selection is made. If the total number of patients is $N$, and if $N(f)$ is the number of these patients with attributes f, then the total probability that a patient has attributes f is:

$$P(f) = N(f)/N \qquad (3)$$

For example, the probability that a patient has disease complex $C_i$ becomes:

$$P(C_i) = N(C_i)/N \qquad (4)$$

The conditional probability is analogous to the total probability, where the selection is made only from that subpopulation of patients that have the specified condition. The conditional probability, denoted by $P(G \mid f)$, that from patients having condition or attributes f, a single patient selected at random will also have attributes G is defined as the ratio of the number of patients with both attributes $G \cdot f$ to the number of patients having attributes f. [Note: In this notation the condition appears to the right, and the attribute of selection to the left, of the vertical bar: P(attribute | condition).] Thus we can write:

$$P(G \mid f) = P(G \cdot f)/P(f) \qquad (5)$$

For example, the conditional probability that a patient with disease complex $C_i$ has symptom complex $C^k$ becomes:

$$P(C^k \mid C_i) = N(C^k \cdot C_i)/N(C_i) \qquad (6)$$

Table 3. Summary of values associated with treatment-disease combinations.

T        C_2        C_3
T(1)     90/100     30/100
T(2)     10/100     100/100

Probabilistic problem. The results of the logical analysis of medical diagnosis often leave a choice about the possible disease complexes that the patient may have. The problem now is: Which of these choices is most probable, that is, which of the disease complexes given by the logical diagnosis function f is the patient most likely to have? In terms of conditional probabilities, the probabilistic aspect of the diagnosis problem is to determine the probability that a patient has diseases f where it is known that the particular patient presents symptoms G, that is, the probabilistic aspect of medical diagnosis is to evaluate $P(f \mid G)$ for a particular patient.

The data upon which the evaluation of $P(f \mid G)$ is based must, of course, come from medical knowledge. Such medical knowledge is generally also given in the form of conditional probabilities, namely, the probability that a patient having disease complex $C_i$ will have symptom complex $C^k$, or $P(C^k \mid C_i)$. The reason medical knowledge takes this form is because this conditional probability is relatively independent of local environmental factors such as geography, season, and others, and depends primarily on the physiological-pathological aspects of the disease complex itself. Thus the study of the disease processes as a cause for the resulting possible symptom complexes can be expressed as such conditional probabilities: of having a symptom complex on condition that the patient has the disease complex. It is interesting to note that this is also the reason most diagnostic textbooks discuss the symptoms associated with a disease, rather than the reverse, the diseases associated with a symptom.

The question that naturally arises at this point is: If medical knowledge is in the form $P(C^k \mid C_i)$, that is, probability of having the symptoms given the patient having the diseases, then how can we make the diagnosis $P(f \mid G)$, that is, the probability of having the disease given the patient having the symptoms? The answer lies in the well-known Bayes' formula (8) of probability. Let us first discuss the simpler case where $f = C_i$ and $G = C^k$; then it can be shown that

$$P(C_i \mid C^k) = \frac{P(C_i)\,P(C^k \mid C_i)}{\sum_{\sigma} P(C_\sigma)\,P(C^k \mid C_\sigma)} \qquad (7)$$

where $\sigma$ under $\Sigma$ indicates summation over all possible disease complexes (that is, if there are m diseases under consideration, then $\sigma$ takes on values from 0 through $2^m - 1$). The important part of Eq. 7 is the numerator of the right-hand side. It has two factors, $P(C^k \mid C_i)$ and $P(C_i)$. The former is just the relation between $C^k$ and $C_i$ given by medical knowledge, which we would certainly expect as a factor in the diagnosis. However, observe the latter factor: it is the total probability that the patient has the disease complex in question, irrespective of any symptoms. This is the factor that takes account of the local aspects: geographical location, seasonal influence, occurrence of epidemics, and so forth. This factor explains why a physician might tell a patient over the telephone: "Your symptoms of ..., mild fever, and so forth, indicate that you probably have Asian flu; it's around our community now, you know." And the physician is more than likely right; he is using the $P(C_i)$ factor in making the diagnosis.

In the more general case, the following adaptation of Bayes' formula can be made for our purposes:

$$P(f \mid G) = \frac{\sum_{k \in G} \sum_{i \in f} P(C_i)\,P(C^k \mid C_i)}{\sum_{k \in G} \sum_{\sigma} P(C_\sigma)\,P(C^k \mid C_\sigma)} \qquad (8)$$

Example of a simple computation. Table 2 gives hypothetical probabilities for our example that are consistent with our previous example of two diseases and two symptoms. These conditional probabilities and total probabilities were supposed to have been obtained from clinical statistical data and medical knowledge. We can immediately observe that the conditional probabilities corresponding to columns that were eliminated by means of the logical analysis are zero. This is because these columns represent unrelated disease-symptom combinations, according to medical knowledge, and hence there are no patients having these disease-symptom complexes (see cross-hatched columns of Fig. 5).

Now suppose a patient presented symptom complex

$$G = S(1) \cdot \bar{S}(2) = C^1$$

Logical analysis shows that the diagnosis is

$$f = \bar{D}(1) \cdot D(2) + D(1) \cdot D(2)$$

The problem now is: Which disease complex does the patient most likely have,

$$C_2 = \bar{D}(1) \cdot D(2) \quad \text{or} \quad C_3 = D(1) \cdot D(2)?$$

To solve this problem, we calculate both $P(C_2 \mid C^1)$ and $P(C_3 \mid C^1)$ by means of Eq. 7 and Table 2, as follows:

$$P(C_2 \mid C^1) = \frac{P(C_2)P(C^1 \mid C_2)}{P(C_0)P(C^1 \mid C_0) + P(C_1)P(C^1 \mid C_1) + P(C_2)P(C^1 \mid C_2) + P(C_3)P(C^1 \mid C_3)}$$

$$= \frac{(25/1000)(1)}{(910/1000)(0) + (50/1000)(0) + (25/1000)(1) + (15/1000)(2/3)} = \frac{25}{25 + 10} = \frac{5}{7}$$

Similarly, we have

$$P(C_3 \mid C^1) = \frac{(15/1000)(2/3)}{(910/1000)(0) + (50/1000)(0) + (25/1000)(1) + (15/1000)(2/3)} = \frac{10}{25 + 10} = \frac{2}{7}$$

Hence the chances are 5:2 that the patient has disease 2 but not disease 1, rather than both disease 1 and disease 2.

Next, suppose the patient presented

$$G = S(1) \cdot S(2) = C^3$$

The logical analysis tells us that

$$f = D(1) \cdot \bar{D}(2) + D(1) \cdot D(2)$$

That is, the patient has either

$$C_1 = D(1) \cdot \bar{D}(2) \quad \text{or} \quad C_3 = D(1) \cdot D(2)$$
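The arithmetic of Eq. 7 can be checked mechanically. The following Python is a sketch under stated assumptions: the lists `prior` and `likelihood` transcribe the illustrative values of Table 2, the function names are hypothetical, and `posterior_general` assumes that Eq. 8 is read as $P(f \cdot G)/P(G)$ with f and G given as sets of complex indices. Exact fractions are used so that the results can be compared with the text directly.

```python
from fractions import Fraction as F

# Table 2: prior[i] = P(C_i); likelihood[i][k] = P(C^k | C_i).
prior = [F(910, 1000), F(50, 1000), F(25, 1000), F(15, 1000)]
likelihood = [
    [F(1), F(0), F(0), F(0)],
    [F(0), F(0), F(3, 5), F(2, 5)],
    [F(0), F(1), F(0), F(0)],
    [F(0), F(2, 3), F(0), F(1, 3)],
]


def posterior(k):
    """Eq. 7: P(C_i | C^k) for every disease complex i."""
    joint = [prior[i] * likelihood[i][k] for i in range(len(prior))]
    total = sum(joint)
    return [j / total for j in joint]


def posterior_general(f, G):
    """Eq. 8 read as P(f.G)/P(G); f and G are sets of indices i and k."""
    num = sum(prior[i] * likelihood[i][k] for i in f for k in G)
    den = sum(prior[s] * likelihood[s][k] for s in range(len(prior)) for k in G)
    return num / den


p = posterior(1)     # the patient presents C^1
print(p[2], p[3])    # 5/7 2/7
```

Calling `posterior(3)` gives the corresponding split between $C_1$ and $C_3$ for a patient who presents $C^3$.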

Table 2. Illustrative values of $P(C^k \mid C_i)$ and $P(C_i)$.

Disease complex   P(C^0|C_i)   P(C^1|C_i)   P(C^2|C_i)   P(C^3|C_i)   P(C_i)
C_0               1            0            0            0            910/1000
C_1               0            0            3/5          2/5          50/1000
C_2               0            1            0            0            25/1000
C_3               0            2/3          0            1/3          15/1000

Determining the conditional probabilities $P(C_1 \mid C^3)$ and $P(C_3 \mid C^3)$ according to Table 2, we find:

$$P(C_1 \mid C^3) = \frac{20}{20 + 5} = \frac{4}{5}$$

and

$$P(C_3 \mid C^3) = \frac{5}{20 + 5} = \frac{1}{5}$$

Hence the chances are 4:1 that the patient has disease 1 and not disease 2, rather than both diseases 1 and 2.

Statistics. In our use of probabilities we have tacitly made one subtle assumption that does not belong in the realm of the reasoning foundations of medical diagnosis, but rather in statistics. The assumption is that even though our probabilities, $P(C_i)$ and $P(C^k \mid C_i)$, by definition, apply only to a randomly selected patient from a known population, we of course are applying the same probabilities to a new patient (not among the known population) who comes to the physician for diagnosis and treatment. The reason we can apply these probabilities to this patient anyway is beyond the scope of this article; it depends on statistical considerations which, by the way, have proved exceedingly useful for solving practical problems in many walks of life. However, certain general aspects of the statistical problem can serve to illustrate some properties of our probabilistic approach to medical diagnosis.

Note that the physician has no direct control over which particular person will come to him as a patient at any time, and hence his patients are certainly randomly chosen in this sense. Also note that although the patient is not a member of the known population upon which the probabilities were based, the probabilities will apply to him if he is a person who lives under approximately the same circumstances as those of the known population. By "circumstances" we mean geographical area, local community, season of the year, and so forth.

The important results of these observations are twofold. First, since the probabilities, particularly $P(C_i)$, depend upon such circumstances, then for each physician or clinic there is a $P(C_i)$. That is to say, in general, nearly all the patients of an individual physician or clinic will be subject to the same circumstances. Thus each such physician or clinic will have its own $P(C_i)$ which, in general, will be different at different times. As discussed above, the $P(C^k \mid C_i)$ can be used by many physicians over a longer period of time.

Second, if these probabilities are so variable, from place to place and from time to time, the question arises as to how they can be evaluated at all. The answer to this is based on the fact that once a diagnosis has been made for a patient by a particular physician or clinic at a certain time, the symptom-disease complex combination that this patient has becomes itself a statistic and can be included in a recalculation of the probabilities for this physician or clinic at that time. In other words, the patient for whom the diagnosis has been made automatically becomes a part of the known population upon which the probabilities for those circumstances are based. Thus the known population becomes simply the already-diagnosed cases. Hence the probabilities $P(C_i)$ and $P(C^k \mid C_i)$ are continuously changing as successive diagnoses are made. Of course, the probabilities should be based on relatively current statistics; hence, after a time, the older cases are dropped from this known population. Actually this recalculation of probabilities is not hard to do. This problem is discussed below.

Value Theory Concepts

Value decisions for treatment: complicated conflict situation. After the diagnosis has been established, the physician must further decide upon the treatment. Often this is a relatively simple, straightforward application of the currently accepted available therapeutic measures relating to the particular diagnosis. On the other hand, and perhaps just as often, the choice of treatment involves an evaluation and estimation of a complicated conflict situation that not only depends on the established diagnosis but also on therapeutic, moral, ethical, social, and economic considerations concerning the individual patient, his family, and the society in which he lives. Similar complicated decision problems frequently arise in military, economic, and political situations; and to aid a more analytical and quantitative approach to these problems, mathematicians have developed "value theory." The striking similarity between these decision problems and the value decisions frequently facing the physician indicates that value theory methods can be applied to the medical decision problem as well. Of the several mathematical forms value theory has taken, we have chosen to discuss that developed principally by Von Neumann (9, 10), often called "game theory."

Expected value. One of the basic concepts upon which value theory rests is that of expected value (8). Suppose we consider 7000 patients, for all of whom two tentative diagnoses, $C_2$ or $C_3$, have been made, with probability 5/7 and 2/7, respectively. Suppose, also, that there exists a treatment T(1) that is 90 percent effective against disease complex $C_2$ and 30 percent effective against disease complex $C_3$. If we use this treatment, what proportion of the 7000 patients should we expect to cure? The answer is given in terms of the "expected value" of the proportion E, which is the sum of the products of the value of the treatment for curing the disease complex and the probability that a patient has the disease complex. For example, about (5/7)(7000), or 5000, will have disease complex $C_2$, and of these we expect that 90 percent, or 4500, will be cured by T(1); similarly, for those with disease complex $C_3$, we expect that 30 percent, or 600, will be cured by T(1). Altogether, we expect that

$$\left[\left(\frac{90}{100}\right)\left(\frac{5}{7}\right) + \left(\frac{30}{100}\right)\left(\frac{2}{7}\right)\right] 7000$$

will be cured by T(1). Here

$$\left(\frac{90}{100}\right)\left(\frac{5}{7}\right) + \left(\frac{30}{100}\right)\left(\frac{2}{7}\right) = \frac{51}{70}$$

is the expected value of the proportion of patients cured by T(1).

Suppose, on the other hand, that there is an alternative treatment T(2) for these diseases; it is 10 percent effective against $C_2$ but 100 percent effective against $C_3$. The problem is: With which treatment will we expect to cure more patients (see Table 3)? The expected value of the proportion cured by T(2) becomes:

$$\left(\frac{10}{100}\right)\left(\frac{5}{7}\right) + \left(\frac{100}{100}\right)\left(\frac{2}{7}\right) = \frac{25}{70}$$

and hence we would expect to cure more patients with T(1) than with T(2). On the other hand, suppose the probability that a patient has $C_2$ is 2/7, that he has $C_3$, 5/7. Then, calculating the expected value of the proportion who will be cured by both T(1) and T(2) respectively, we find:

$$\left(\frac{90}{100}\right)\left(\frac{2}{7}\right) + \left(\frac{30}{100}\right)\left(\frac{5}{7}\right) = \frac{33}{70}$$

$$\left(\frac{10}{100}\right)\left(\frac{2}{7}\right) + \left(\frac{100}{100}\right)\left(\frac{5}{7}\right) = \frac{52}{70}$$

Thus T(2) becomes the treatment of choice.

The process of choosing the best treatment can be described in the terminology of games. There are two players, the physician and nature. The physician is trying to determine the best strategy from his limited knowledge of nature. The matrix representation of values given in Table 3 constitutes the payoffs: what the physician will "win," and nature "lose."

Table 4. Values associated with treatment-disease combinations.

T        C_2     C_3
T(1)     +5      -10
T(2)     -5      +8

For the values of the treatments as given in Table 3, let us see how the expected value E, and hence the choice of treatment, depends on the probability that the patient has $C_2$ or $C_3$. If $P$ is the probability that a patient has $C_2$, then $(1 - P)$ must be the probability that the patient has $C_3$ (since by supposition the patient has either $C_2$ or $C_3$ but not both). Hence, by Table 3, the expected value $E_1$ with treatment T(1) becomes:

$$E_1 = .9P + .3(1 - P)$$

and the expected value $E_2$ with treatment T(2) becomes:

$$E_2 = .1P + (1 - P)$$

Figure 6 illustrates the graphs of these two equations, where the points for $P = 5/7$ and $P = 2/7$, discussed above, are indicated. Hence T(1) is the treatment of choice for $P$ to the right of where the lines cross, and T(2) is the treatment of choice for $P$ to the left of where the lines cross.

... tion will show that when we are maximizing the expected number of people cured, we are really maximizing the probability that any individual patient will be cured. Hence we need not actually have, say, 7000 patients; we can apply our results to a single patient. The same argument holds when more complicated values are involved.

The second point is that the decision involved for assigning the value to a treatment-disease combination was not discussed at all. Then what is the advantage of our new technique? The advantage is that we have enabled the separation of the strategy problem from the decision of values problem; however, only the strategy problem was solved.

... three ways we can choose the treatment: (i) treat all patients by T(1), (ii) treat all patients by T(2), and (iii) treat some patients by T(1) and others by T(2). The first two ways are called "pure strategies," the third, a "mixed strategy."

Consider the values of Table 3, and suppose we choose the third way of treatment (which really includes the first two anyway). Let $Q$ be the fraction of patients to be treated by T(1), then $(1 - Q)$ is the fraction to be treated by T(2). Observe that if all the patients had $C_2$, we would expect to cure

$$\left[\frac{90}{100}Q + \frac{10}{100}(1 - Q)\right] 7000$$

patients. We have called the bracketed expression $E(C_2)$ and have graphed it in Fig. 7. Similarly, if all the patients had $C_3$, we would expect to cure

$$\left[\frac{30}{100}Q + \frac{100}{100}(1 - Q)\right] 7000$$

patients; we have also graphed this bracketed expression in Fig. 7.
Evidently, Up to now we have considered the The decision of values problem fre- for a particular value of Q, the lower value of a treatment with respect to a quently involves intangibles such as (thick) line in Fig. 7 represents the disease complex as being measured di- moral and ethical standards which must, minimum number of patients that we rectly by its effectiveness in curing the in the last analysis, be left to the phy- can expect to cure. For Q=.6, this diseases. This, however, may not always sician's judgment. minimum number is a maximum, and be the case. For example, certain kinds Mixed strategy. In our development we would expect to cure 58 percent of of do involve a marked risk; of the reasoning foundations of medical the patients (or 4060 patients); hence if the surgery is successful, the patient diagnosis for treatment, we first sketched (.6) (7000) patients should be treated by will be cured or benefited; if it is unsuc- the logical principles involved in the T(1) and the rest, (.4) (7000), should based on the cessful, the patient may die. Hence the diagnosis; alternative diag- be treated by T(2). value associated with this treatment is noses presented by the logic, we calcu- To arrange for such a treatment is more difficult to define. As an illustra- lated probabilities for these alternatives; easy: Separate the patients at ran- tion, suppose values were chosen be- based on these probabilities, we sketched dom into two groups, one containing tween -10 and + 10, as is shown in a technique for choosing between meth- (.6) (7000) = 4200 patients, the other Table 4. Then, if the probability that ods of treatment. However at the present containing (.4) (7000) = 2800 patients, the patient has C2 is 5/7 and the proba- time, as we observed above, data are not the former group to receive T(1), the available to enable the latter there is bility that he has C3 is 2/7, generally proba- T(2). 
However, another bilities to be computed; and in rare dis- way of arranging for such a treatment, E1= + = (5) (5/7) (- 10) (2/7) 5/7 eases such data will be difficult to obtain. as follows: As each patient comes up for + E,2= (-5) (5/7) (8) (2/7) =-4/7 Hence selection of the method of treat- treatment, spin the wheel of chance so that T(1) is the treatment of choice. ment must frequently be made based on shown in Fig. 8. If the wheel stops oppo- If the probabilities were the other way the logical diagnostic results alone. We site one of the numbers 0, 1, 2, 3, 4, or around, that is, if C,= 2/7 and C,= now consider a method for determining 5, the patient receives T(1); if it stops 5/7, then we would have E =- 40/7, the best treatment under such circum- opposite 6, 7, 8, or 9, the patient re- E2 = 30/7, and T(2) would be the treat- stances. ceives T(2). Since there is an equal ment of choice. Again consider 7000 patients with chance that the wheel will stop opposite Two points still require further dis- identical diagnoses of C2 or C3, and sup- any number, then about 0.6 of the pa- cussion. First, we have considered our pose the effectiveness of alternative tients will receive T(1) and 0.4 will problem from the point of view of many treatments T(1) or T(2) are as given receive T(2). This process is called patients all of whom have the diagnosis in Table 3. But this time we do not "choosing a random number from 0 to C2 or C3, and we have seen how to know the probabilities that the patients 9." Actually, one does not need to spin choose that treatment which will max- have C2 or C3. Our problem is again to a wheel of chance to get random num- imize the number of patients cured or choose that treatment which will insure bers: books have been published con- maximize some other value for the pa- that we cure the largest number of peo- taining nothing but millions of random tients. However, in private practice, the ple-that is, to maximize the minimum numbers (11, 12). 
physician is usually concerned with a possible number of patients that we ex- Why do we bring up random numbers single individual patient. A little reflec- pect will be cured. There are actually when all we really needed to do was

divide our 7000 patients into two groups? To treat the 7000 patients, the two-group technique is perfectly adequate; but let us consider again the physician who is concerned at the moment with a single patient. He cannot very well divide up the patient into two groups. To help this physician out, we interpret Q as the probability that the patient should receive T(1), and then (1 - Q) is the probability that the patient should receive T(2). With this interpretation, the above discussion shows that by choosing Q to be .6, the chance or probability of curing the patient is maximized to .58. Hence the physician chooses a single random integer: if it is 0, 1, 2, 3, 4, or 5, the patient gets T(1); if it is 6, 7, 8, or 9, the patient gets T(2). This is the concept of a mixed strategy applied to a single case.

Such a method for choosing the treatment may be very hard to appreciate at first contact, but this is just the method used every day when probabilities are applied to single situations. Of course, in actual practice, some further information bearing on the choice of treatment would be sought; that is to say, the formulation of the problem of which treatment to give the patient is far more complicated than that posed by the single problem discussed above. In conclusion, we may quote J. D. Williams (13) on the role of game theory: "While there are specific applications today, despite the current limitations of the theory, perhaps its greatest contribution so far has been an intangible one: the general orientation given to people who are faced with overcomplex problems. Even though these problems are not strictly solvable-it helps to have a framework in which to work on them. The concepts of a strategy, the representations of the payoffs, the concepts of pure and mixed strategies, and so on, give valuable orientation to persons who must think about complicated situations."

Simplified Illustration

A case history. A 5-week-old female infant was observed by the mother to have progressive difficulty in breathing during a 5-day period. No respiratory problem had been present immediately after birth.

Physical examination showed a well-nourished infant with hemangiomas (blood vessel tumors) on the lower neck anteriorly, on the left ear, and lower lip. The physical examination was otherwise negative, and all the laboratory tests were normal. X-ray examination of the chest showed a mass in the anterior superior mediastinum which displaced the trachea to the right and posteriorly. There was some narrowing of the trachea caused by the mass. Several small flecks of calcium were placed anteriorly within this mass.

The physician is thus faced with this problem: A 5-week-old infant presents increasing respiratory distress which must be relieved or the infant will die. First, what differential diagnosis should he make and, second, what should the treatment be?

The physician decided that one or more of three abnormalities might be causing the respiratory distress: (i) A prominent thymus gland [hereafter referred to as D(1)], since it is well recognized that a large thymus can cause such distress; (ii) A deep hemangioma in the mediastinum, D(2), must be considered because the infant has three surface hemangiomas and therefore should have another hemangioma below the surface. (The hemangiomas had enlarged since birth.) Also, calcium such as that seen in the mass on the chest x-ray is found in blood vessel tumors; (iii) A dermoid cyst, D(3), could be present in the mediastinum. The calcium in the mass suggests this possibility.

What treatments should be used? The physician decides that some treatment is absolutely necessary and that there are two possibilities, x-ray therapy to the mass or surgery. There are some arguments for and some against each treatment. This type of problem is susceptible to value theory analysis. The physicians set up the arguments pro and con for each treatment as follows:

1) X-ray therapy to the mass [hereafter referred to as T(1)]. Argument pro. (i) If the mass is thymus, the x-ray treatment will cause it to decrease in size. (ii) If the mass is a hemangioma composed of small blood vessels, it may decrease with radiation. (iii) This treatment can be done quickly with little discomfort or immediate danger to the patient.

Argument con. (i) Radiation to the mass may cause cancer of the thyroid to develop later (14). (ii) Radiation will not affect the mass if it is a dermoid cyst or a large vessel-type hemangioma.

2) Surgery [hereafter referred to as T(2)]. Argument pro. (i) Surgical exploration will permit the surgeon to inspect the mass and to make a definite diagnosis. (ii) If the mass is found to be a dermoid cyst, it can be removed. If the mass is thymus or hemangiomas, partial or total removal may be possible.

Argument con. (i) The infant is subject to the risks of a surgical procedure (these are concerned with general anes-

Fig. 6. Mathematical expectation of treatment. (Horizontal axis: probability P, with P = 2/7 and P = 5/7 marked.)

Fig. 7. Mathematical expectation in mixed strategy. (Horizontal axis: fraction, or probability, Q.)

Fig. 8. Gambling wheel.
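The expected-value comparison of Fig. 6 and the mixed-strategy calculation of Fig. 7 can be checked with a short sketch. It assumes only the effectiveness figures quoted in the text for Table 3 (T(1): 90 percent against C2, 30 percent against C3; T(2): 10 percent and 100 percent). The names `CURE`, `expected_value`, `worst_case`, `maximin_mix`, and `choose_treatment` are illustrative and do not come from the article, and the crossing-point formula assumes the two lines of Fig. 7 cross for some Q between 0 and 1, as they do here.

```python
from fractions import Fraction as F

# Effectiveness quoted in the text for Table 3 (proportion of patients cured).
CURE = {"T1": {"C2": F(9, 10), "C3": F(3, 10)},
        "T2": {"C2": F(1, 10), "C3": F(1)}}

def expected_value(treatment, p_c2):
    """Expected proportion cured when P(C2) = p_c2 and P(C3) = 1 - p_c2."""
    row = CURE[treatment]
    return row["C2"] * p_c2 + row["C3"] * (1 - p_c2)

def worst_case(q):
    """Smaller of E(C2) and E(C3) when a fraction q of patients receives T(1)."""
    return min(q * CURE["T1"][c] + (1 - q) * CURE["T2"][c] for c in ("C2", "C3"))

def maximin_mix():
    """Fraction Q at which E(C2) and E(C3) cross, and the value guaranteed there."""
    a, b = CURE["T1"]["C2"], CURE["T2"]["C2"]   # E(C2) = b + (a - b) Q
    c, d = CURE["T1"]["C3"], CURE["T2"]["C3"]   # E(C3) = d + (c - d) Q
    q = (d - b) / ((a - b) - (c - d))
    return q, worst_case(q)

def choose_treatment(digit, q=F(6, 10)):
    """Wheel-of-chance rule: with q = 0.6, digits 0-5 give T(1) and 6-9 give T(2)."""
    return "T1" if F(digit, 10) < q else "T2"

if __name__ == "__main__":
    for p in (F(5, 7), F(2, 7)):
        print(p, expected_value("T1", p), expected_value("T2", p))
    print(maximin_mix())
```

Exact fractions are used so that the results can be compared directly with the values 51/70, 25/70, 33/70, and 52/70 given in the text.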

Fig. 9. Reduced logical basis for the illustrative example. (Rows: symptoms S(1), S(2), S(3) and diseases D(1), D(2), D(3); columns grouped by symptom complex superscript and disease complex subscript.)
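The reduced basis of Fig. 9 can be regenerated by enumeration. The sketch below assumes the three observations of medical knowledge used in the illustration (D(1) with D(2) or D(3) implies S(1) and S(3); absence of D(2) implies absence of S(2); D(2) and D(3) without D(1) imply S(3)) and the restriction that every patient has at least one symptom and at least one disease. The function names are illustrative. The indexing of complexes, with item (1) as the least significant binary digit, is an assumption chosen so that C1 is "thymus only," as in the text.

```python
from itertools import product

def consistent(s, d):
    """True if symptom bits s = (S1, S2, S3) and disease bits d = (D1, D2, D3)
    satisfy the population restriction and the three observations."""
    s1, s2, s3 = s
    d1, d2, d3 = d
    if not any(s) or not any(d):                  # at least one symptom, one disease
        return False
    if d1 and (d2 or d3) and not (s1 and s3):     # observation 1
        return False
    if not d2 and s2:                             # observation 2
        return False
    if not d1 and d2 and d3 and not s3:           # observation 3
        return False
    return True

def index(bits):
    """Complex index with item (1) as the least significant bit (assumed convention)."""
    return sum(b << i for i, b in enumerate(bits))

BITS = list(product((0, 1), repeat=3))
# Columns of the 64-column logical basis that survive elimination.
REDUCED_BASIS = [(s, d) for s in BITS for d in BITS if consistent(s, d)]

def diagnose(known):
    """Indices of disease complexes compatible with the known symptoms.
    `known` maps symptom number (1, 2, 3) to 0 or 1; omitted symptoms are unknown."""
    return sorted({index(d) for s, d in REDUCED_BASIS
                   if all(s[n - 1] == v for n, v in known.items())})

if __name__ == "__main__":
    print(diagnose({1: 1, 2: 1, 3: 1}))   # all three symptoms present
    print(diagnose({1: 1, 2: 1}))         # x-ray not yet taken
    print(diagnose({1: 1, 2: 1, 3: 0}))   # x-ray negative
```

Leaving a symptom out of `known` models a test that has not yet been performed, which is how the case "x-ray not yet taken" is handled.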

thesia and a chest operation). (ii) If the mass is a hemangioma, an attempt at surgical removal might result in bleeding which would be difficult to control and thereby add to the risk of the operation.

Setting up the illustration. The above case history suggests an appropriate simplification that we can make for purposes of illustration. Let us limit our attention to just the three diseases D(1), D(2), and D(3) (large thymus, deep hemangioma, and dermoid cyst, respectively), the three symptoms S(1), S(2), and S(3) (respiratory distress, several surface hemangiomas, and mediastinal mass on chest x-ray, respectively), and the two treatments T(1) and T(2) (x-ray therapy and surgery, respectively). Of course a realistic application of the techniques developed above would require consideration of the hundreds of diseases and symptoms associated with, say, a particular medical specialty. However, within the limited space allowed the present article, we are forced to confine our attention to the three diseases and three symptoms suggested by the case history. The discussion of a method permitting the feasible application of our techniques to more realistic circumstances is given in the following section.

We shall now digress for a moment from the case history in order to set up the illustration. Since we are considering only three symptoms, there are $2^3 = 8$ conceivable symptom complexes; for our three diseases there are likewise $2^3 = 8$ conceivable disease complexes; therefore there are $2^{3+3} = 64$ columns in our logical basis that represents all conceivable symptom-disease complex combinations (see Fig. 9). Further, let us suppose that the population of patients under consideration is such that they can have no other symptoms or diseases than those given above, and that each patient must have at least one of the symptoms and at least one of the diseases. Let us suppose that medical knowledge consists of the following three observations:

1. A patient having D(1) and also either D(2) or D(3) must have both symptoms S(1) and S(3): $D(1) \cdot [D(2) + D(3)] \rightarrow S(1) \cdot S(3)$

2. If a patient does not have D(2) then he does not have S(2): $\bar{D}(2) \rightarrow \bar{S}(2)$

3. If a patient does not have D(1) but does have both D(2) and D(3), then he has symptom S(3): $\bar{D}(1) \cdot D(2) \cdot D(3) \rightarrow S(3)$

Under these observations of medical knowledge and under the limitations imposed on the population of patients under consideration, Fig. 9 represents the reduced basis embodying medical knowledge, where the noncrosshatched columns represent possible symptom-disease complex combinations consistent with medical knowledge and the population of patients selected.

Examples of logical diagnosis. Now we are ready to return to our case history. Here the patient presented symptoms S(1), S(2), and S(3), that is,

$$G = S(1) \cdot S(2) \cdot S(3)$$

By the technique described above, it is easy to see the logical diagnosis:

$$f = \bar{D}(1) \cdot D(2) \cdot \bar{D}(3) + D(1) \cdot D(2) \cdot \bar{D}(3) + \bar{D}(1) \cdot D(2) \cdot D(3) + D(1) \cdot D(2) \cdot D(3) = D(2)$$

which means that the patient certainly has D(2), and may or may not have D(1) and D(3). Here, then, the logical diagnosis results in four possible disease complexes that the patient may have.

Consider next a patient who presents symptoms S(1) and S(2), but where the x-ray has not yet been taken, that is, $G = S(1) \cdot S(2)$. By the above techniques, we find that the logical diagnosis is

$$f = \bar{D}(1) \cdot D(2) \cdot \bar{D}(3) + D(1) \cdot D(2) \cdot \bar{D}(3) + \bar{D}(1) \cdot D(2) \cdot D(3) + D(1) \cdot D(2) \cdot D(3)$$

Note that this is the same diagnosis as for the patient with symptoms $G = S(1) \cdot S(2) \cdot S(3)$. In other words, if, when the x-ray was taken, positive results were obtained, the diagnosis remains the same as it was before the x-ray results were known. On the other hand, suppose the x-ray turned out negative; then the patient's symptoms would be

$$G = S(1) \cdot S(2) \cdot \bar{S}(3)$$

whence it is easy to see that the diagnosis becomes

$$f = \bar{D}(1) \cdot D(2) \cdot \bar{D}(3)$$

In this case the additional information obtained from the x-ray film enabled the diagnosis to be reduced from four disease complex possibilities to a unique disease complex diagnosis. This example illustrates the interesting fact that additional diagnostic information may or may not result in further differentiation between disease complexes, depending on the circumstances.

As a final example of logical diagnosis, consider a patient who presents

$$G = \bar{S}(1) \cdot \bar{S}(2) \cdot S(3)$$

Here we find

$$f = D(1) \cdot \bar{D}(2) \cdot \bar{D}(3) + \bar{D}(1) \cdot D(2) \cdot \bar{D}(3) + \bar{D}(1) \cdot \bar{D}(2) \cdot D(3) + \bar{D}(1) \cdot D(2) \cdot D(3)$$

Thus the patient must have one of these four possible disease complexes. In this case the logical diagnosis, while narrowing down the possibilities, does not seem sufficient. Therefore let us determine which of these disease complexes the patient most probably has.

Examples of probabilistic diagnosis. In order to present these examples we must have a table of conditional and total probabilities. In Fig. 10 we present such a table; however, the numbers in the table do not have any basis in fact; they were just made up for the purposes of the illustration. They are, however, self-consistent and consistent with the logical assumptions made above. The cross-hatched probabilities are all 0 and correspond to symptom-disease complex combinations that are not possible according to medical knowledge.

Fig. 10. Values of conditional probabilities and total probabilities for the illustrative example. (Entries: P(symptom complex | disease complex), arranged by symptom complex superscript and disease complex subscript, with P(disease complex) in the final column.)

Consider the patient with symptom complex

$$G = \bar{S}(1) \cdot \bar{S}(2) \cdot S(3) = C^4$$

We found by logical analysis that the patient can have one of the following disease complexes:

$$D(1) \cdot \bar{D}(2) \cdot \bar{D}(3) = C_1$$
$$\bar{D}(1) \cdot D(2) \cdot \bar{D}(3) = C_2$$
$$\bar{D}(1) \cdot \bar{D}(2) \cdot D(3) = C_4$$
$$\bar{D}(1) \cdot D(2) \cdot D(3) = C_6$$

Hence, by the techniques described above, we have:

$$P(C_1 \mid C^4) = \frac{(.600)(.333)}{(.600)(.333) + (.150)(.067) + (.050)(.300) + (.005)(.200)} = .885$$

and, similarly,

$$P(C_2 \mid C^4) = .044, \qquad P(C_4 \mid C^4) = .067, \qquad P(C_6 \mid C^4) = .004$$

Thus it becomes clear that the patient most likely has

$$C_1 = D(1) \cdot \bar{D}(2) \cdot \bar{D}(3)$$

that is, an enlarged thymus only.

Analysis of the treatment. Let us continue further with this case and determine the treatment of greatest value for the patient. For this we need a table giving the values of the two treatments under consideration for each of the disease complexes the patient may have. To fill in this table we have used the physician's considered judgment with regard to the pro and con of each treatment in relation to the disease complex. The values have been chosen between +10 and -10, the greatest value (the best treatment for a particular situation) being +10, the smallest (for the worst treatment) being -10 (see Table 5). If statistics were available on the outcomes of the different treatments for the various disease complexes, then the judgment could be replaced by a calculated probabilistic value. However, this cannot always be done, for the value of some treatments may involve ethical, social, and moral considerations as well.

Table 5. Values of treatments for disease complexes.

Treatment        C1    C2    C4    C6
X-ray T(1)       +3    -2    -3    -2
Surgery T(2)     -2    +6   +10    +8

For our patient who presented symptoms

$$G = \bar{S}(1) \cdot \bar{S}(2) \cdot S(3)$$

we determine the value of treatment T(1) (the x-ray treatment) by means of the techniques described above, as follows:

$$(3)(.885) - (2)(.044) - (3)(.067) - (2)(.004) = 2.358$$

On the other hand, the value of treatment T(2) becomes

$$-(2)(.885) + (6)(.044) + (10)(.067) + (8)(.004) = -.804$$

Obviously, then, the treatment of greatest value to this patient is T(1), the x-ray treatment.

On the other hand, suppose we did not know or could not calculate the probabilities $P(C_1 \mid C^4)$, $P(C_2 \mid C^4)$, $P(C_4 \mid C^4)$, and $P(C_6 \mid C^4)$ due to lack of sufficient statistical data or for other reasons. The problem is to choose the treatment which will maximize the minimum gain for the patient. The graphical solution of this problem according to the techniques discussed above is given in Fig. 11. Hence T(1) should be chosen with probability 0.61 over T(2) with probability 0.39.

Fig. 11. Determining the best treatment.

Conditional Probability or Learning Device

A device often called a conditional probability or learning machine can be used to implement the foregoing logical and probabilistic analysis of medical diagnosis. The particular form of such a device that we shall describe was chosen for its extreme simplicity and ready availability. It can collect data rapidly, and it easily recalculates the probabilities at each use. With such a device the variation of $P(C_i)$ with location, season, and so forth, can be checked, as well as the relative stability of $P(C^k \mid C_i)$. As described here, it is essentially an experimental tool, but undoubtedly more sophisticated forms of the device could be further developed.

Consider the logical analysis of medical diagnosis first. In a realistic application perhaps 300 diseases and 400 symptoms must be considered as, for example, might occur within a medical specialty. The logical basis for such a set of symptoms and diseases would require $2^{700}$ columns (more than $10^{200}$) from which the elimination of columns for the reduced basis would be made. This is obviously impracticable. However, the columns to be eliminated correspond to disease-symptom complexes that will never occur; the reduced basis corresponds to columns that will occur. Hence, by listing many cases by disease-symptom complex combination, the reduced basis will soon be generated. This can be done, for example, with marginal notched cards, as follows: Positions along the edge of a card are assigned to the diseases and symptoms under consideration. After a case has been diagnosed, the positions on the edge of a single card are notched corresponding to the diseases the patient has, as well as the presented symptoms. This card then represents a column of the desired reduced basis. In this way the entire reduced basis can soon be generated (see Fig. 12).

Fig. 12. Cards notched to indicate columns of logical basis.

The probabilistic analysis of medical diagnosis is obtained by notching a card for every patient who has been diagnosed. Then there will be, in general, more than one card representing a single column of the logical basis. The number of cards representing column $C^k \cdot C_i$ is then just $N(C^k \cdot C_i)$ of Eq. 6. After a sufficient number of patients have been so recorded, that is, after a sufficient number of disease-symptom complex combination cards have been obtained, the entire deck of such cards is ready to be used.

The cards are sorted as illustrated in Fig. 13. To separate those cards that are notched in a certain position from those that are unnotched in that position, put a rod in the corresponding position and the notched cards will fall; the unnotched cards will not fall. Then, by means of a rod through the holes in the upper right-hand corner of the cards, the unnotched cards are removed from the notched ones.

Fig. 13. Sorting the cards.

To make a diagnosis, sort out those cards that correspond to the symptom complex presented by the patient. The disease complex part of these cards gives all possible disease complexes the patient can have. Separate these cards by the disease complexes: the thicknesses of the resulting separated decks will be proportional to the probability of the patient's having the respective disease complexes (see Fig. 14).

Fig. 14. For a patient presenting symptom complex $C^1$, the conditional probabilities for diagnoses $C_2$ and $C_3$ are read from the respective thicknesses of the decks as $P(C_2 \mid C^1) = 5/(5 + 2)$ and $P(C_3 \mid C^1) = 2/(5 + 2)$.

To determine $P(C_i)$, sort the cards for $C_i$; then $P(C_i)$ is the ratio of the thickness of the sorted cards to the thickness of the entire deck of cards. To determine $P(C^k \mid C_i)$, sort the cards for $C_i$ and measure their thickness; then sort these for $C^k$ and measure their thickness; then $P(C^k \mid C_i)$ is the ratio of the latter to the former measurement.

After each diagnosis is made, a card is notched accordingly and placed with the deck. Old cards are periodically thrown away. This keeps the statistics current. In general, the decks will grow exceedingly rapidly. In a clinic it is often normal to diagnose over 100 patients per day; at this rate only 10 days will result in 1000 cards.

It is important to observe that we are using past diagnoses to aid in making future diagnoses. Any wrong past diagnoses may therefore lead to a perpetuation of errors. Hence it is clear that only carefully evaluated or definitely verified diagnoses should be used in making up the deck, or at least there should be provision for review and removal of incorrect diagnoses.

Conclusions

Three factors are involved in the logical analysis of medical diagnosis: (i) medical knowledge that relates disease complexes to symptom complexes; (ii) the particular symptom complex presented by the patient; and (iii) the disease complexes that are the final diagnosis. The effect of medical knowledge is to eliminate from consideration disease complexes that are not related to the symptom complex presented. The resulting diagnosis computed by means of logic is essentially a list of the possible disease complexes that the patient can have that are consistent with medical knowledge and the patient's symptoms. Equation 2 is the fundamental formula for the logical analysis of medical diagnosis.

The "most likely" diagnosis is determined by calculating the conditional probability that a patient presenting these symptoms has each of the possible disease complexes under consideration. This probability depends upon two contributing factors. The first factor is the conditional probability that a patient with a certain disease complex will have a particular symptom complex (that is, just the reverse of the afore-mentioned conditional probability); it remains relatively independent of local factors and depends primarily on the physiopathological effects of the disease complex itself. The second factor is the effect on the medical diagnosis of the circumstances surrounding the patient or, more precisely, the total probability that any person chosen from the particular population sample under consideration will have the particular disease complex under consideration; this may depend on the geographical location of the population sample, or the season when the sample is chosen, or whether the population sample is chosen during an epidemic, or whether the sample is composed of patients visiting a particular type of specialist or clinic, and so forth. The afore-mentioned probabilities are continually changing; each diagnosis, as it is made, itself becomes a statistic that changes the value of these probabilities. Such changing probabilities reflect the spread of new epidemics, or new strains of antibiotic-resistant bacteria, or the discovery of new and better techniques of diagnosis and treatment, or new preventive measures, or changes in social and economic standards, and so forth. This observation emphasizes the greater significance and value of current statistics; it depreciates the significance of past statistics. Equation 8 above, which is an adaptation of Bayes' formula, summarizes the probabilistic analysis of medical diagnosis.

Use of value theory enables the systematic computation of the optimum strategy to be used in any situation. It does not, however, determine the values of the treatments involved. It is quite evident that the choice of such values involves intangibles which must be evaluated and judged by the physician. However, by clearly separating the strategy problem from the values judgment problem, the physician is left free to concentrate his whole attention on the latter. One of the most important and novel contributions of value theory for our purpose is the concept of the mixed strategy for approaching value decisions.

The mathematical techniques that we have discussed and the associated use of computers are intended to be an aid to the physician. This method in no way implies that a computer can take over the physician's duties. Quite the reverse; it implies that the physician's task may become more complicated. The physician may have to learn more; in addition to the knowledge he presently needs, he may also have to know the methods and techniques under consideration in this paper. However, the benefit that we hope may be gained to offset these increased difficulties is the ability to make a more precise diagnosis and a more scientific determination of the treatment plan (15).

References and Notes

1. See, for example: C. F. Paycha, "Memoire diagnostique," Montpellier méd. 47, 588 (1955) (punched-card symptom registration for differential diagnosis of ophthalmological diseases); "Diagnosis by slide rule," What's New (Abbott Labs.) No. 189 (1955); F. A. Nash, "Differential diagnosis: an apparatus to assist the logical faculties," Lancet 1, 874 (1954); E. Baylund and G. Baylund, "Use of record cards in practice, prescription and diagnostic records," Ugeskrift Laeger 116, 3 (1954); "Ereignisstatistik und Symptomenkunde" (Incidence statistics and symptomatology), Med. Monatsschr. 6, 12 (1952); M. Lipkin and J. D. Hardy, "Mechanical correlation of data in differential diagnosis of hematological diseases," J. Am. Med. Assoc. 166, 2 (1958).
2. L. B. Lusted, "Medical electronics," New Engl. J. Med. 252, 580 (1955) (a review with 53 references).
3. L. Clendening and E. H. Hashinger, Methods of Diagnosis (Mosby, St. Louis, Mo., 1947), chaps. 1 and 2.
4. For the purposes of this paper, the terms symptom, sign, and laboratory test are considered as synonymous. The physician must determine whether a symptom exists or not.
5. R. S. Ledley, "Mathematical foundations and computational methods for a digital logic machine," J. Operations Research Soc. Am. 2, 3 (1954).
6. R. S. Ledley, "Digital computational methods in symbolic logic with examples in biochemistry," Proc. Natl. Acad. Sci. U.S. 41, 7 (1955), where f is determined by finding a consequence solution of the form G -> f to the "equations" E.
7. R. S. Ledley, Digital Computer and Control Engineering (McGraw-Hill, New York), in press.
8. J. V. Uspensky, Introduction to Mathematical Probability (McGraw-Hill, New York, 1937).
9. J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior (Princeton Univ. Press, Princeton, N.J., 1944).
10. R. D. Luce and H. Raiffa, Games and Decisions (Wiley, New York, 1957).
11. Rand Corporation, A Million Random Digits with 100,000 Normal Deviates (Free Press, Glencoe, Ill., 1955).
12. M. G. Kendall and B. Babington Smith, Tables of Random Sampling Numbers, "Tracts for Computers XXIV" (Cambridge Univ. Press, Cambridge, 1939).
13. J. D. Williams, The Compleat Strategyst (McGraw-Hill, New York, 1954).
14. D. E. Clark, "Association of irradiation with cancer of the thyroid in children and adolescents," J. Am. Med. Assoc. 159, 1007 (1955).
15. We are grateful to Thomas Bradley of the National Academy of Sciences-National Research Council, to George Schonholtz of the Walter Reed Army Medical Center, and to Scott Swisher of the University of Rochester for their encouragement and advice in connection with this study.
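The probabilistic diagnosis and treatment analysis of the illustrative example can be retraced numerically. The sketch below assumes the made-up probabilities quoted in the text for the patient with symptom complex C^4 (total probabilities .600, .150, .050, .005 and conditional probabilities .333, .067, .300, .200 for C1, C2, C4, C6) and the Table 5 values. The function names are illustrative, and the grid search in `best_mix` is a stand-in for the graphical solution of Fig. 11, not the authors' procedure. Because the text rounds the conditional probabilities before computing treatment values, the unrounded results here differ from 2.358 and -.804 in the third decimal place.

```python
# Made-up illustrative figures quoted in the text for symptom complex C^4.
PRIOR = {1: 0.600, 2: 0.150, 4: 0.050, 6: 0.005}        # P(C_i)
LIKELIHOOD = {1: 0.333, 2: 0.067, 4: 0.300, 6: 0.200}   # P(C^4 | C_i)
# Table 5: value of each treatment for disease complexes C1, C2, C4, C6.
VALUE = {"T1": {1: 3, 2: -2, 4: -3, 6: -2},
         "T2": {1: -2, 2: 6, 4: 10, 6: 8}}

def posterior(prior, likelihood):
    """Bayes' formula: P(C_i | C^k) from P(C_i) and P(C^k | C_i)."""
    joint = {i: prior[i] * likelihood[i] for i in prior}
    total = sum(joint.values())
    return {i: joint[i] / total for i in joint}

def treatment_value(treatment, post):
    """Expected value of a treatment under the posterior probabilities."""
    return sum(VALUE[treatment][i] * post[i] for i in post)

def guaranteed_value(q):
    """Smallest expected value over the four complexes when T(1) is chosen with probability q."""
    return min(q * VALUE["T1"][i] + (1 - q) * VALUE["T2"][i] for i in VALUE["T1"])

def best_mix(steps=1000):
    """Grid search for the q that maximizes the guaranteed value."""
    return max((k / steps for k in range(steps + 1)), key=guaranteed_value)

if __name__ == "__main__":
    post = posterior(PRIOR, LIKELIHOOD)
    print({i: round(p, 3) for i, p in post.items()})
    print(treatment_value("T1", post), treatment_value("T2", post))
    print(best_mix())
```

The maximin mixture lies where the C1 line (5Q - 2) meets the C2 line (6 - 8Q), at Q = 8/13, which agrees with the 0.61 read from Fig. 11.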

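The notched-card deck described under "Conditional Probability or Learning Device" amounts to counting cases by symptom-disease complex combination. The sketch below is a software analogue under that reading: each card is modelled as a pair (symptom complex index, disease complex index), and deck thickness is modelled as a card count. The class `CardDeck` and its method names are hypothetical and are not part of the article; periodic discarding of old cards is not modelled.

```python
from collections import Counter

class CardDeck:
    """A deck of diagnosed cases; each card is a (symptom_complex, disease_complex) pair."""

    def __init__(self):
        self.cards = Counter()

    def add_case(self, symptom_complex, disease_complex):
        """Notch one card for a diagnosed patient and place it with the deck."""
        self.cards[(symptom_complex, disease_complex)] += 1

    def p_disease(self, i):
        """P(C_i): cards sorted for disease complex i, relative to the whole deck."""
        total = sum(self.cards.values())
        return sum(n for (_, d), n in self.cards.items() if d == i) / total

    def p_symptom_given_disease(self, k, i):
        """P(C^k | C_i): cards for both k and i, relative to the cards for i."""
        with_i = sum(n for (_, d), n in self.cards.items() if d == i)
        return self.cards[(k, i)] / with_i

    def diagnose(self, k):
        """Relative thickness of each disease-complex pile among cards showing symptom complex k."""
        piles = {d: n for (s, d), n in self.cards.items() if s == k}
        total = sum(piles.values())
        return {d: n / total for d, n in piles.items()}

if __name__ == "__main__":
    deck = CardDeck()
    for _ in range(5):
        deck.add_case(1, 2)      # five cards: symptom complex C^1, disease complex C2
    for _ in range(2):
        deck.add_case(1, 3)      # two cards: symptom complex C^1, disease complex C3
    print(deck.diagnose(1))      # proportions 5/7 and 2/7, as in Fig. 14
```

The methods divide by a pile size, so they assume at least one card exists for the complex being queried.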