arXiv:0808.0781v1 [stat.ME] 6 Aug 2008 ovrainwt oadRaiffa Howard Fienberg E. with 1947–1967 Stephen Years: Conversation Statistical A Early The 08 o.2,N.1 136–149 1, DOI: No. 23, Vol. 2008, Science Statistical 51-80 S (e-mail: USA Pennsylvania 15213-3890, Pittsburgh, University, Department, Mellon Learning Carnegie Machine and Department Science, Statistics Social of and Statistics of Professor

tpe .Febr sMuieFl University Falk Maurice is Fienberg E. Stephen c ttsia Science article Statistical original the the of by reprint published electronic an is This ern iesfo h rgnli aiainand pagination in detail. original typographic the from differs reprint nttt fMteaia Statistics Mathematical of Institute 10.1214/088342307000000104 nttt fMteaia Statistics Mathematical of Institute , eiinMaking. Decision 11 written and is R dissertations book analysis. new doctoral 90 than and more analysi pervised analysis decision risk as theory, th known game , field decision the Governme statistical span of of interests search creation School the Kennedy Sc Ra in Graduate the P. pioneer Frank the and in the Administration of (Emeritus) now Business member is Managerial a he in been where Chair mathemati University, has in Harvard Raiffa Ph.D. at 1957, faculty his Since and Michigan. statistics of University in degree master’s his Abstract. adUiest.Telte a wre n2002. in awarded was latter Nege The University. the of vard of University University Gurion the Ben University, honorary University, holds Northwestern Mellon He Carnegie Resolution. from Dispute degrees for Awar Achievement Institute Lifetime CPR a the and Conflict for tion for School Pri Business Negotiation Melamed Chicago the of and University analy America; the decision M of of Ramsey Society field P. Research Frank the Operations the to Contri Analysis; contributions Risk Distinguished outstanding of the for 1972 Society include from the awards capacity from and Award that honors in cr many serving a to His Director, Analysis helped first Systems Raiffa its Applied 1998. became for in Institute Year International the Inst the of Resources) Book Public was for Resolution Keeney, Center Dispute the Ralph as and known Hammond (formerly CPR John students doctoral former 08 o.2,N.1 136–149 1, No. 23, Vol. 2008, [email protected] 2008 , oadRiaere i ahlrsdge nmathematics, in degree bachelor’s his earned Raiffa Howard eere odMdlfo h nentoa Associa- International the from Medal Gold a earned He . eoito nlss h cec n r fCollaborative of Art and Science The Analysis: Negotiation nte book, Another This . ). in mr Choices Smart 1 nit”FrteDprmn fSaitc eelabo- 1947–1967. years he the Statistics on of rated Sci- Department Decision the a For of entist.” Roots cere- Analytical The Dickson “The Estelle. the was: mony at years, presentation 55 Howard’s of of Jay topic wife Kass, Raiffa’s Rob and Eddy, Kadane William the included in participating discussion the and from Raiffa present Science Howard in Others which University. Prize Dickson at 1999 pre- seminar the a 2000, received day 3, a at April by Statistics on ceding of University Department Mellon the Carnegie in seminar formal hscnesto okpaea ato nin- an of part as place took conversation This h r n cec of Science and Art The oatoe ihhis with co-authored , oy behavioral eory, iahssu- has aiffa i rmthe from sis dh later he nd n Har- and v ok.His books. Michigan, sa the at cs ,hsre- his s, o1975. to tt for itute doctor’s efrom ze olof hool from d bution t A nt. msey eate edal the the 2 S. E. FIENBERG

of the leaders of that Monday afternoon seminar was Howard Raiffa. Raiffa: That was called the Decision Under Un- certainty seminar, the DUU seminar, and it was one of the exciting parts of my life as well. And it went on for four years, from 1961 to 1964. I ran that seminar with my colleague Robert Schlai- fer. Between the weekly sessions of the seminar, a few of us exchanged a flood of memos comment- ing on what was discussed and what should have been discussed. Although there was a demand for air time, we never scheduled starting a new topic until we felt that we had completed the train of re- search thinking on the table. Half-baked ideas were given priority. Fienberg: I was lucky to be able to attend for a couple of them! I brought along with me a couple of slices of statistical history this afternoon: Applied Statistical Decision Theory (Raiffa and Schlaifer, 1961) and Introduction to Statistical Decision Theory (Pratt, Raiffa and Schlaifer, 1995). They come from the same period as the DUU seminar and both books had a tremendous impact on the Bayesian revival of the 1960s. I’m hoping we will have a chance to talk about them in a few minutes, but let’s go back a lit- tle further in time and start with how you became a statistician. Raiffa: Let me start at a decision node. I was Fig. 1. Howard Raiffa following the interview. April, 2000. a First Lieutenant in the Air Force in charge of a radar-blind-landing system at Tachikawa Airfield near Tokyo—ground-controlled approach it was Fienberg: In 1964 I arrived as a graduate student called—and I was due for a routine discharge after at Harvard and in my first class on statistical in- completing 44 months in the armed services. Should ference, a faculty member, whose name I will not I continue in Japan acting in the same extraordinar- mention, began teaching inference from a Neyman– ily exciting capacity as a civilian, earning oodles of Pearson perspective, that is, hypothesis testing and money, or should I return to the States to complete confidence intervals. In fact, we studied Erich my bachelor’s degree, and if so, where, and to study Lehmann’s book on hypothesis testing. It was clear what? I was 22. I thought of becoming an engineer to me that this wasn’t the way I wanted to think building upon my practical experience with radar. about statistics. I asked around the department about I learned from an army buddy of a field I had never what alternatives were available to me and some- heard of: actuarial mathematics. Merit counted in one said: “On Monday afternoons they have a semi- the actuarial profession since budding actuaries had nar at the business school across the Charles River.” to pass nine competitive exams. The place to go to So I just showed up one Monday afternoon. It was prepare for the first three of them was University of one of the most wonderful experiences of my gradu- Michigan. ate career. Although I have no memory of who was Fienberg: Howard, I thought you went to City speaking that first afternoon, I recall that it was the College. most animated and heated discussion I had engaged Raiffa: Before going into the army, I was more in at any point in my career up to that point. It interested in athletics than scholarship. I was the turned out that the seminar and the heated discus- captain of my high school basketball team in New sion were replicated every Monday afternoon. One York City—a high school of 14,000 students—at a CONVERSATION WITH HOWARD RAIFFA 3 time when basketball was the rage in NYC. I was a the course using the R. L. Moore pedagogical style. mediocre B student in my freshman and sophomore Are you familiar with this approach, Steve? years at City College of New York (CCNY), the col- Fienberg: Yes, I am. But please tell us a little lege of choice for poor and middle-income students about the R. L. Moore pedagogy anyway. I under- in New York—I was on the poor side. Four years stand that Jimmy Savage also became enamored later, when I returned to college at the University with studying mathematics after being exposed to of Michigan, I became a superb student. That sur- the Moore approach. prised me. I also was a married man determined to Raiffa: I never knew this until I attended a memo- become employable after getting my bachelors de- rial service for Jimmy. Well, I was similarly affected. gree. For a while I thought the students at CCNY I was studying for a master’s degree in statistics; were just brighter than those in Michigan, but on going for a doctorate was not an alternative in my reflection I didn’t like that explanation. Something personal decision space—it just wasn’t, but it should changed within me. I was shooting for straight A’s have been if I had only known more about making and getting them. smart choices. Fienberg: We could statistically investigate this Copeland’s first assignment was weird: “Here are change if we had the right sort of data. But Howard, some seemingly unrelated mathematical curiosities. did you ever take those actuarial exams and how Think about them. Try to make some conjectures come you stayed on and got your master’s degree in about them. Try to prove your conjectures. Try to statistics? discover something of interest to talk about.” I drew Raiffa: I got my bachelor’s in Actuarial Math and a blank and I came to class with nothing to con- passed the first three exams and was on my way to tribute. So did twenty other students. becoming an actuary preparing for the fourth exam At the beginning of class on that first day the when I decided that I really wanted to study some- instructor asked, “Does anybody have any contri- thing more cerebral—something more theoretical. butions to make?” We sat and sat and sat and ten Fienberg: Why statistics? minutes went by and he said, “Class dismissed.” He Raiffa: The second actuarial exam was on proba- added, “The same assignment tomorrow.” The fol- bility theory and we actuarial students took a course lowing day, he started the class: “Anybody have any- that delighted me. It used Whitworth’s Choice and thing to say?” Finally, someone raised their hand Chance, full of intriguing puzzlers on combinations and asked a question. The course was pure R. L. and permutations. Brain teasers galore. The book Moore. No books were used, absolutely no books. was short on theory and long on problems and I loved It was taboo to look at the literature because you it, and I was good at it. I just couldn’t put it down. might find hints. You should act as if you were a I was good enough that my professors encouraged mathematician in the 17th century trying to prove me to go on and get a master’s degree. So I took a something new. No matter that we were “discover- master’s degree in statistics. Frankly, I was also dis- ing” well-known results; it was new to us. We stu- appointed in that as well because, at that time, the dents did not study mathematics; we did mathemat- statistics program in the mathematics department ics. The R. L. Moore method of teaching turned me was short on theory and depth and long on compu- on. I knew then that I wanted to become a math- tational manipulations. I remember having to invert ematician because it was so much fun and, to my an 8 × 8 matrix. People are laughing because now surprise, I found out that I was pretty good at it. you enter an 8 × 8 matrix in your computer and So I became a student of pure mathematics and poof, out comes the answer. But in those days we I was deliriously happy. Thanks to my wife, who had a mechanical Marchant calculator and I hated by that time became an elementary school teacher it. Even though I won a prize for speed, it did me and could support me in a manner that I grew ac- in. customed to, we went from abject poverty to solid Fienberg: So how come you stayed on for a doc- riches. torate? In the year I studied statistics, I don’t think I heard Raiffa: Along with the courses I took in statistics, the word Bayes. As a way of inference it was nonexis- I also took for cultural curiosity a course in the foun- tent; inference was all strictly done from the Neyman– dations of mathematics by a Professor Copeland Pearson perspective. And the version of Neyman– who later became one of my mentors. Copeland taught Pearson statistics I was exposed to wasn’t very the- 4 S. E. FIENBERG oretical as well. They talked about tests of signifi- that each player has a dominant strategy so the op- cance but they really didn’t talk about the power of timal thing for each to do is to choose this strategy. the tests. But the rub is that if each player plays wisely, they Eddy: Who were your professors? each get miserable payoffs. The paradox is that two Raiffa: Paul Dwyer and Cecil C. Craig. wise players do worse than two dumb players. Still, Fienberg: And they were all stalwarts of the In- in a single-shot situation, each player should choose stitute of Mathematical Statistics in its first couple wisely. Rational individual choice leads to group in- of decades. So you weren’t turned on by the kind of efficiencies. The anomaly lies in the structure of the statistics they did? game. Raiffa: They were good mathematicians and good When the game is repeated for a pre-specified statisticians; but to them, statistics meant some- number of iterations, equilibrium analysis specifies thing quite restrictive. Dwyer was computational that each player should use his or her dominating and Craig did multivariate sampling theory. They strategy at each trial with the result that each does were very good at what they did, but there was not miserably trial after trial. The game presents a so- a decision bone in their bodies. cial pathology. If the players had pre-play commu- Fienberg: So you are now studying pure mathe- nication and could make binding agreements, they matics. How did your interest in start? would agree to cooperate at each trial by taking Raiffa: For ten hours a week I worked as a research the myopically dominated strategy. In the finitely assistant on an Office of Naval Research (ONR)- repeated game, double crossing at each trial (i.e., sponsored research program administered jointly by choosing the non-cooperative strategy at each trial) the Mathematics Department and the School of En- is the best retort if the other guy acts that way. But double crossing at each trial is not the best retort gineering. My role was to attend national meetings against someone who is not playing the double cross and listen to applied problems that people were talk- strategy at every trial. In the laboratory, most ana- ing about that related to our ONR project and to try lytically inclined subjects start by cooperating but to formulate interesting mathematical problems for switch to a belligerent stance toward the end of the my more mathematical-oriented colleagues to work number of trials. But there is where the on. I became quite adept at taking ill-formed situa- switch will take place. tions and translating them into mathematical prob- In Part A of a report I wrote in 1950 on the lems that other people could work on, including my- two-person non-zero-sum game for the ONR project, self. I considered the two-person Prisoner’s Dilemma game Because submarine warfare was a hot topic, I read repeated a fixed number of times (say 20). I as- von Neumann’s and Morgenstern’s book on Game signed a subjective probability distribution over the Theory—or at least parts of it. Jerry Thompson, way I thought others would play in order to figure a fellow student, and I developed a paper on how out the best way I should play. In my naivete, with- to solve two-person zero-sum games and we found out any theory or anything like that, I did what out to our delight that this same algorithm could I now recognize as a prescriptive analysis for one solve linear programming problems. At that time we party, making use of a descriptive modeling pro- didn’t know anything about the simplex method, so cess for the possible decisions of the other parties. for maybe two weeks we had the best algorithm for The descriptive modeling process involved assessing solving linear programming problems. Incidentally, judgmental probability distributions. I slipped into Jerry has spent his distinguished academic career being a subjectivist without realizing how radical here at Carnegie Mellon University. I was behaving. That was the natural thing to do. At that time there was no well-developed theory No big deal. of the simplest two-person, non-zero-sum games— Part B of the report dealt with complex non-zero- there still isn’t. I (and probably hundreds of others sum games where there’s no solution. What would unknown to me) investigated the many qualitatively I do if two buddies of mine came over to me and said, different bi-matrix games having two strategies for “Look, Howard, we can’t solve this game; there’s no each player. I, naturally, became intrigued with a solution. You resolve it for us. What’s fair?” Essen- game now known as the Prisoner’s Dilemma game. tially I sought an arbitration rule that would pro- I know you are familiar with this game. The point is pose a compromise solution for any non-zero-sum CONVERSATION WITH HOWARD RAIFFA 5 game. At that time I was familiar with the sem- write a thesis, and I was through before I thought inal work of Kenneth Arrow. Arrow sought a so- I started. cial welfare function that would combine individual Fienberg: So that was the end of your graduate preferences to arrive at a social or group preference. education! Did you start immediately looking for a He examined a set of very plausible constraints on job? this social welfare function only to prove that these Raiffa: That was in April; it was too late to go on requirements were incompatible. No social welfare to the job market and I didn’t know what to do. The function exists that satisfies properties X, Y and Z. Departments of Mathematics and Psychology initi- I adopted the Arrow approach: in my state of confu- ated an interdisciplinary seminar on Mathematics sion, I proposed a set of reasonable desiderata for an in the Social Sciences and as a post-doc I was hired arbitration scheme to satisfy and then I investigated to be the rapporteur of the seminar. I was charged their joint implications. to record what was said during the meetings and I remember distinctly how I started my research “what should have been said.” I had a ball! For that on arbitration rules. I attended a lecture by a labor one year I steeped myself mostly in psychological arbitrator by the name of William Haber. During measurement theory, working with Clyde Coombs his talk about arbitration, I experienced an “aha” from psychology and Larry Klein from economics. inspiration; I jumped out of my chair while the lec- That involvement constituted an important part of ture was going on, I went back to my study and my analytical roots. wrote vigorously for hours without a break on how During my post-doc year I also gave a series of I would arbitrate non-zero-sum games. That consti- seminar talks to the statistical faculty and doctoral tuted Part B of my ONR report on non-zero-sum students on Abraham Wald’s newly published book on Statistical Decision Theory. I was invited to do so games. I used the Kakutani Fixed Point Theorem because the book was very mathematical and made to show the existence of equilibria strategies. extensive use of game theory in existence proofs. That report was published informally in the Engi- Wald’s book was full of Bayesian decision rules. Not neering Department. It was not peer reviewed; it was as a way of making decisions but as a way of elimi- simply an informal report. At the time I was prepar- nating noncontenders—inadmissible rules. Wald never ing to take my oral qualifying exam and searching used subjective probabilities or judgments to choose for a thesis topic in linear, normed spaces, Banach a decision rule. Bayesian analysis was just a math- spaces. This was in April of 1950. For the oral quali- ematical technique for finding out complete classes fying exams in the Math Department, the candidate of admissible decision rules. first had to write a report on what he or she would The following year I was ready to go on the job like to be examined on; then, depending upon the market and I had several offers from mathemat- report, the examiners structured the oral exam—its ics departments, but there were also two statistics breadth and depth. In my written proposal I ex- ones: one from Columbia University’s Department of amined how all sorts of mathematical ideas found Mathematical Statistics, the other one was working their way into the theory of stochastic processes. with George Shannon at Bell Labs. The Bell Labs And then a surprising thing happened. job paid a lot more than Columbia. But Columbia My wife, Estelle, received a telephone call from the presented a unique opportunity for me. A year ear- very famous algebraist, Richard Brauer, who was lier, Abraham Wald, who was the star of the statis- the chairman of my oral examining committee. He tics department at Columbia, was killed in an air- informed her that, on the basis of my written re- plane accident over India and his colleagues Jack port, the committee decided to excuse me from my Wolfowitz and Jack Keifer left for Cornell when Wald oral exam. And then he said, “By the way, the com- died. The department was decimated but still there mittee would like to talk to me about my thesis.” remained Ted Anderson, Henry Scheffe, Howard Lev- I came in the next day all excited about the fact ene and Herbert Solomon. But they needed someone that I didn’t need to take my oral exam and was desperately to teach Wald’s stuff and to supervise told that the committee thought it appropriate that his many doctoral students. I was supposed to fill I use my recently completed Engineering Report as that bill because I knew Wald’s book. my doctoral dissertation. I was stunned. So I ended I really was not prepared. Wald’s doctoral stu- up not having to take an oral exam, not having to dents knew more statistics than I and there were 6 S. E. FIENBERG

Fig. 2. Raiffa with his wife Estelle and Carnegie Mellon President Jared L. Cohon at Dickson Prize Ceremony in April, 2000. times when I hadn’t the faintest idea what they tral to what the field should be about. So I repeated were talking about. I had to learn and give lec- what I knew how to do best. I went back to the Ar- tures on advanced topics in statistics. I did what row approach and in an axiomatic style examined I called “just-in-time” teaching. I used a book by what primitive things that I believed in and explored Blackwell and Girshick. At that time it was avail- their joint implications. At that time I should have able only in a pre-printed form and it hadn’t yet known about some of the work of Jimmy Savage, but been published. But it was a wonderful book. Black- I didn’t; I didn’t even know who Jimmy Savage was. well and Girshick were more inclined toward the But in 1954 I came across a paper by Herman Cher- Bayesian viewpoint than Wald, but not really. (At noff justifying the Laplace solution to these prob- least not then, although Blackwell later became an lems. Essentially in a state of ignorance it argued ardent supporter of the Bayesian perspective.) The for using a uniform prior distribution over states. book carefully skirted issues of measurability by con- I had my reservations about the full analysis, but fining itself to the denumerable case; it did nothing Chernoff used an axiom he attributed to Herman continuous. I bracketed their presentation by exam- Rubin, called the Sure Thing Principle. I embraced ining more closely the finite case, spending a lot of it wholeheartedly. The implications were devastat- time on n = 2, and then going abstract by consider- ing. It argued for making inferences and decisions ing the most advanced measure-theoretic version of based on the likelihood function. It also ruled out the ideas. I produced copious notes for the course. the Neyman–Pearson theory that preached that you I was a nervous wreck because my statistical col- can’t infer what you should do based on an observed leagues all audited that course. I also taught the sample outcome until you think what you would do first course in statistics. I taught Neyman–Pearson for all potential sample outcomes that could have oc- theory, tests of hypotheses, confidence intervals, un- curred but didn’t. “Nonsense,” says the sure thing biased estimation and preached about the dangers principle. Out went tests and unbiased estimates of optional stopping that I no longer believe to be and confidence intervals and lots of stuff on optional true. stopping. Not much left. Now that we had destroyed Gradually I became disillusioned. I just didn’t be- so much, it was time to really seek an alternative to lieve that the basic concepts I was teaching were cen- NP theory. CONVERSATION WITH HOWARD RAIFFA 7

Consider the standard problem involving an un- others weren’t interested in my class of problems. known population proportion p. That’s the usual To me they were primarily inference people. binomial model. The sample produces nine F ’s and Fienberg: I notice from your resume that in 1957 one S in that order and the likelihood function is you published the highly acclaimed book on Games (1−p)9p1. Now compare that with the stopping rule and Decisions with Duncan Luce (Luce and Raiffa, that samples until the first success appears; and sup- 1957). You must have written this during the same pose that the first S occurs on the tenth trial. The period of time you were primarily learning and teach- likelihood function would be the same whether you ing and establishing your philosophical roots in statis- had the first stopping rule or the second stopping tics. Could you tell us about your game theory ac- rule, and therefore the inference should be the same. tivities during this period? But using Neyman–Pearson tests of hypothesis, the Raiffa: A group of us started the interdisciplinary answers are different—you have to worry about not Behavioral Models Project at Columbia and we hired only what happened, but what could have happened a mathematical psychologist, Duncan Luce, to su- according to the sampling plan. pervise the project. Duncan was not affiliated with I pushed the axiomatics and convinced myself that any department at the time. The principals were it made sense to assign a prior probability distri- Paul Lazersfeld from psychology, Ernest Nagel from bution over the states of the problem and maxi- philosophy, Bill Vickery from economics, and I sup- mize expected . It was for me akin to a reli- pose I should include myself from statistics. As the gious conversion—from being a Neyman–Pearsonian junior member of that steering committee, I acted as to being a Bayesian. I became a closet Bayesian. Chairman of the project and Duncan and I carried I didn’t come out of the closet because my asso- the burden of pushing the project along. We had ex- ciates, whom I admired, were vociferously opposed ternal financial support and ran seminars and hired to Bayesianism. They thought it was a step back- pre-docs. We proposed publishing a series of short ward. They’d say, “Look, Howard, what are you try- 50-page monographs on the topics that featured the ing to do? Are you trying to introduce squishy judg- use of mathematics in the social sciences—topics mental, psychological stuff into something which we such as learning theory, psychological measurement think is science?” Jimmy Savage, I think, had the theory, game theory, informatics and cybernetics— best retort to this. He said: “Yes, I would rather but we couldn’t get any authors to submit manu- build an edifice on the shifting sands of subjective scripts. So Duncan and I decided to write a 50-page probabilities than building on a void.” document on games and decisions, on game theory The biggest difference between me and my col- really. We eventually gave up writing that fifty-page leagues at Columbia was the kind of problems that document and wrote instead a 500-page book on we worked on. They were basically driven by the games and decisions. It took us two years to write. problem of inference. They paid lip service to de- In 1953–1954, Duncan was at the Center for Ad- cision problems by considering whether one should vanced Study in the Behavioral Sciences at Stanford reject a null hypothesis, but really basically what and I was at Columbia. The following year I went they were interested in was problems of statistical to that institute and he was back at Columbia. We sampling and inference going from observations to were together face to face for five days in these two parameters. I came from a background in game the- years, in an era well before e-mail, and we wrote ory and operational research, so for me a proto- the Games and Decisions book. It included a lot of typical problem was: how much should you stock material from my unpublished “non-dissertation.” of a product when demand was unknown. For me, Fienberg: That was a landmark book in many that unknown demand was the population value and senses and it’s still in print as a Dover paperback that population value had a probability distribution. almost a half a century later. What happened next The whole point of doing sampling was to get bet- in your career? ter information, to get more informed probability Raiffa: The book still sells a few thousand copies distributions. Decisions were tied to real economic a year. O.K., it is now ’57 and out of the blue I got problems, not phony ones like should you accept a two offers: one from the University of California, null hypothesis. That was really the divide, because Berkeley—not in statistics—and one from Harvard my Columbia colleagues, whom I greatly admired, University—a joint appointment with the newly cre- Henry Scheff´e, Ted Anderson, Herb Robbins and ated Statistics Department and the Business School. 8 S. E. FIENBERG

The newly formed Statistics Department was led by and economics of aviation engines. By some involve- Fred Mosteller, a wonderful man and statistician, ment, by some fluke, he received an appointment at and it also included Bill Cochran, another great the Business School but he had no specialty. The statistician. So I was very, very flattered, except single professor at the B-School who taught a prim- I was worried. How would they receive my new con- itive course in statistics retired at that time and version to Bayesianism? I talked to Fred Mosteller Robert was asked to teach that course. Thus his- about that and he was lukewarm, but he was tol- tory was made. Trained as a classical historian, he erant. And Cochran said: “Well, you’ll grow up.” knew nothing about statistics, so he read the “clas- At that time I literally was at Columbia for five sics”: R. A. Fisher, Neyman and Pearson—not Wald years and I never knew that Columbia had a busi- and not Savage—and he concluded that standard ness school all this time. I really didn’t know any- statistical pedagogy did not address the main prob- thing about business and the only reason I decided lem of a businessman: how to make decisions un- to go to Harvard was because of the Statistics De- der uncertainty. Not knowing anything about the partment. They were willing to double my Columbia subjective/objective philosophical divide, he threw salary. Columbia, belatedly, agreed to match it, and away the books and invented Bayesian decision the- promote me, but we decided to go to Harvard. It ory from scratch. Since he had little mathematics, was a close call in making that decision. most of his examples involved discrete problems or My wife, Estelle, and I stewed about the Harvard the univariate case, like an unknown population pro- offer because there were many conflicting objectives portion. we had to balance. We did a sort of formal anal- Robert had had only one course in mathematics, ysis of this decision problem. Our analysis involved in the calculus. But he had raw mathematical abil- ten objectives that we scored and weighted. My wife ities, provided he could see how it might be put to is not mathematically inclined at all, but for this use. He was single-minded in his pursuit of relevance case she joined me in making all the assessments. It to the real world. When I came there, he was thrilled turned out that we agreed on practically everything. that here was a kindred soul that could tutor him in Harvard was the clear winner. Of course, there were just the kind of mathematics he needed. I spent most some dimensions where Columbia was better, so it of my days teaching Robert Schlaifer mathematics— wasn’t a dominating solution, but the formalization first calculus, then linear algebra. I would teach him helped us really decide that it wasn’t a close call at something about linear algebra in the morning and all. We then followed some advice that was given to he would show me how it could be applied in the af- us by Patty Lazersfeld. She advised that in decisions ternoon. He was not only smarter than other people, of this kind, don’t ever make your choice without but he worked longer hours than anyone else. testing it. You tell your friends that you’re going Fienberg: I’ve heard you being alluded to as Mr. to Harvard, you tell your family, but you don’t tell . What’s the story behind that? the administration. Then before you officially com- Raiffa: Already in my book with Luce on game mit yourself, you see how you sleep for a week. And theory, published in 1957, I used game trees to define that’s what we did. We slept well, we felt content, games in extensive form. There were player nodes and we ended up at Harvard. and chance nodes, but all chance nodes had associ- Fienberg: But when I arrived at Harvard in 1964 ated, objective probabilities in the common knowl- you were in essence full-time at the business school. edge domain. When I started working on individual How did this shift occur? decision problems at the B-School I modified the Raiffa: Surprisingly to me, my academic life didn’t game tree into a decision tree that featured chance revolve around the statistics department; it revolved nodes with probability distributions subjectively as- around a place called the Business School. At the sessed by the decision maker. My nonmathemati- B-School I worked closely with Robert O. Schlaifer. cally inclined audiences found it impossible to follow He’s probably the person who influenced me more the logic of the analysis without an accompanying in my life than anybody else. He was trained as a decision tree to keep track of the discussion. My use classical historian and classical Greek scholar. Dur- of decision trees started as a search for pedagogi- ing the war he worked for the underwater laboratory cal simplicity, but I gradually became dependent on writing prose for technical reports. He ended up at them myself. It’s interesting to reflect why I never the end of the war writing a tome on the engineering used decision trees earlier in teaching elementary CONVERSATION WITH HOWARD RAIFFA 9

Fig. 3. Harvard Statistics Department, 1957. From left to right: John Pratt, Raiffa, William Cochran, Arthur Dempster and Frederick Mosteller. statistics at Columbia. In Applied Statistical Deci- ter, θ. Move 2 is in chance’s domain and the sample sion Theory (ASDT), I present a schematic decision outcome is symbolically denoted by z. At move 3 tree depicting the prototypical or canonical statisti- the DM must choose a terminal act a and at move cal decision problem. [At this point Raiffa went to 4 chance reveals the true population parameter θ. the blackboard.] This requires specifying the marginal probability of At move 1, a decision node, the decision maker z at move 2 and at move 4 the conditional or pos- (DM) has a choice of experiments or information- terior probability of θ given z. To make these prob- gathering alternatives including the null experiment, ability assessments, the DM usually starts with a which means acting now without gathering further subjectively assessed prior distribution of θ and an information about the unknown population parame- objective, model-based conditional sampling distri- 10 S. E. FIENBERG

Fig. 4. Raiffa drawing the decision tree. bution of z given each value of θ. The DM then uses in the shown figure is too special and not indicative Bayes’ theorem to find the probabilities required at of the broader class of decision problems; and, in move 4 and at 1. This is so standard that analysts this wider class, subjective probability assignments using this methodology are called Bayesians. Classi- are often made without invoking Bayes’s formula. cists mistakenly choose not to assign probabilities at Fienberg: So how would you like to be called? moves 2 and 4 because these involve subjective prob- Raiffa: I think of myself as a decision analyst who ability inputs and are taboo. Hence, for them, there believes in using subjective probabilities. I would is no gain to be had from considering this schematic prefer being called a “subjectivist” than a “Bayesian.” decision tree. Robert and I divided Bayesians into two groups: en- Fienberg: In 1957, when you really launched your gineers and scientists, or “echt” and the “nonecht.” efforts into the Bayesian direction, the word Bayesian The really true subjectivists were the engineers. The was not used except perhaps in a pejorative or math- nonecht scientists never elicited judgmental ques- ematical formalism kind of way. Jimmy Savage, Jack tions from anybody. For them it’s all abstract. The Good and Dennis Lindley did not call themselves echt folk got their hands dirty. In the early 1960s Bayesians in the early parts of the fifties. Yet by the we had a series of distinguished Bayesians (Lind- time you wrote the book with Bob, it had become ley, Box and Tiao), who each spent a semester at second nature to identify yourself as such. How did the Business School. They were primarily wonderful that happen? How did Bayesian inference become statisticians of the non-echt variety. The most no- known as “Bayesian”? table of our echt visitors was Amos Tversky—but Raiffa: In the preface to ASDT we refer to “the he visited us in the early 1980s. Next to Schlaifer, so-called Bayesian approach.” I’m not sure who is Amos was the person who influenced me most. But responsible for the nomenclature. But I dislike us- Amos was not a statistician. ing the term “Bayesian” to refer to decision ana- Fienberg: Howard, why don’t you continue with lysts who believe in using subjective probabilities the story of your early interactions with Schlaifer. because, once we generalize from the standard classi- Raiffa: Robert and I taught an elective course to- cal statistical paradigm, the schematic decision tree gether in statistics at the B-School and it was a ball. CONVERSATION WITH HOWARD RAIFFA 11

I opted for teaching both the objectivist and sub- function. It’s not hard to see how the mathematics jectivist points of view side by side with an openly would go. So I guess I can take credit for that. declared preference for the subjective school. Robert Fienberg: You should! thought that was a cop-out. He said, “Look, we’re Raiffa: I’ll take responsibility for that one. It just professors. We are supposed to know what’s right. seemed all so natural. You teach what’s right; you don’t teach what’s Our effort turned into the book you mentioned at wrong.” “But our students are going to have to read the beginning, Applied Statistical Decision Theory, the literature,” I retorted. “Since when do business- which I’m proud to say has been republished by Wi- men—never women—read the literature?” He ley in their classics series. Originally the book was wouldn’t have anything to do with teaching Neyman– published not by a regular publishing house but by Pearson theory; it was judgmental probability all the the Harvard Business School Division of Research, which had never published anything mathematical way right down the line. The closest I could push before. The book must have sold maybe three hun- him toward the classical school was to examine de- dred copies. cision problems from the normal—as contrasted to Jimmy Savage reviewed the manuscript very fa- the extensive—form of analysis. Not only was he vorably and he called the notation dazzlingly in- smart, very smart, but he was the most opinionated tricate. He didn’t like the notational conventions. person I ever met. I take responsibility for the intricate hieroglyphics. One objection we encountered in using the subjec- It works for me and seems to work also for novices tivist approach to statistics was that it was too hard. who have not been brainwashed with usages of other Robert and I explored ways to simplify it. In 1959, notations. The theory is intricate enough so that after I was at the B-School two years, we started when I’m away from the field for long periods of time writing a book together on what you have alluded and then return, I have a tough time remembering to as Bayesian statistics. It was not so much a book the theory and the notation comes to my rescue. Let to be read or even a textbook but a compendium me illustrate what I’m talking about. Let me go to of results for the specialist. The theme of our book the blackboard and show you. We consider a pop- was: it doesn’t have to be too complicated; anything ulation parameter, designated by the Greek symbol the classicists can do we can do also—only better. µ; since µ is an uncertain quantity, we flag it with We discovered a simple algebraic way to go from a tilde sign,µ ˜. We distinguish between prior distri- priors to posteriors for sampling distributions that butions and posterior distributions ofµ ˜ by primes ′ ′′ admitted fixed-dimensional sufficient statistics, like and double primes, giving respectivelyµ ˜ andµ ˜ . the exponential distributions. We distinguish the prior mean—that is, the mean ofµ ˜′—which we labelµ ¯′—from the posterior mean Kadane: You mean the use of conjugate priors. ′′ Fienberg: Well, you certainly succeeded in push- µ¯ . Now we get more complicated. From a prior ing the ideas. I’m opening the book right now, and point of view, we might be interested in the as-yet- here, almost at the beginning, there is a classical re- unknown posterior mean. The distribution of this sult on minimal sufficient statistics and exponential quantity was dubbed by Robert, the pre-posterior distribution and renamed by our students as the pre- families. And then suddenly, as if out of nowhere, posterous distribution. The prior variance of the as- you introduce conjugate families and conjugate pri- ′ yet-unknown posterior mean is connoted by v . The ors. Clearly, the ideas were around for special cases, posterior distribution ofµ ˜ depends upon a sufficient going back at least into the nineteenth century, but statistic z, which we bring into our notational fold, I haven’t found any other source that laid the ap- and so it goes. proach out in full generality. How did you come to Fienberg: While that first printing of ASDT by this idea? the Business School may have not sold very many Raiffa: Well, it was pretty obvious that if we’re copies, a later paperback version brought out by going to get a systematic Bayesian approach for MIT Press was widely used and important to those the exponential-family distributions, we needed to of us who tried to take the conjugate prior frame- get something where updating could be done alge- work into statistical problems beyond decision the- braically in a formal sort of way. We needed the ory. But then fairly soon after the book first ap- prior and posterior to belong to the same family of peared, you began to turn to related problems. How distributions and to conform well to the likelihood did this happen? 12 S. E. FIENBERG

Fig. 5. Raiffa recreating the ASDT notation at the blackboard.

Raiffa: ASDT was written in 1959 and 1960 and Business School, in the area of finance. Grayson was published in ’61. Schlaifer and I would discuss some interested in financial decisions of oil wildcatters. ideas and my style was to start writing when I was Through his interaction with me, his thesis was ex- still confused. The task of writing focused my mind. panded from a purely descriptive to a prescriptive Schlaifer had trouble writing first drafts. He would perspective. How should wildcatters accumulate in- look at what I had written and invariably his reac- formation from geologic surveys, seismic soundings, tion was: “This is terrible,” and he would tear up exploratory wells, expert judgments? The drilling of my version and write something better. But he had an exploratory well was simultaneously a terminal- trouble starting without something to criticize. action and an information-gathering move. How In the academic year 1960–1961, I was awfully should they form syndicates for the sharing of risks? busy since I organized, at the request of the Ford This leads to decision problems galore, and the prob- Foundation, a special 11-month program for 40 lems did not easily conform to the classical statisti- research-oriented professors, teaching in management cal decision paradigm involving sampling to gather schools, who felt the need to learn more mathemat- information about an unknown population param- ics. That program, The Institute for Basic Math- eter. The statistical decision paradigm seemed too ematics for Application to Business (IBMAB), was restrictive, too hobbling, too narrow. Schlaifer con- reputed to be a huge success and the next generation curred with me and we began to think of ourselves of deans at such prestigious business schools as Har- more as decision analysts than as statisticians. I don’t vard, Stanford and Northwestern were all graduates know why it took so long to make the shift, but in of that program. Naturally, in the IBMAB program my mind every problem that I thought of up until I taught statistics from a subjectivist perspective that point was cast in the old statistical paradigm with a heavy decision orientation and the gospel ra- of going from a prior to a posterior distribution diated outward in schools of management. of a population parameter. With my new orienta- In the academic year 1961–1962, another radical tion I saw problems all over the place in business, thing happened to me. I was a second reader of a in medicine, in engineering, in public policy where thesis proposal by Jack Grayson, a student in the the decision problems under uncertainty did not fit CONVERSATION WITH HOWARD RAIFFA 13 comfortably into the classical mold. We were excited I had a doctoral student who investigated the prob- about our new vision of the world of uncertainty lem of overconfidence under my supervision. Experts with a vast new agenda and we started the Decision did not calibrate very well; they were surprised a under Uncertainty Seminar that you, Steve, referred surprising number of times and we had to make our to earlier. experts aware of this tendency. Our motivation was Fienberg: This was also around the time that John not description, but prescription, but nevertheless Pratt worked with you and Robert on the introduc- our students and we did a lot of work in what is tory book that appeared in its preliminary edition now called behavioral decision making. in 1965. It was called Introduction to Statistical De- In the mid-1960s Robert introduced a required cision Theory—ISDT in contrast to ASDT*. I stud- course in the first year of the MBA entitled Man- ied from that unpublished manuscript. But then you agerial Economics. All 800 students were exposed never quite polished it up and finished it. What hap- to cases that featured decision making under un- pened? certainty. It was an heroic effort that was not uni- Raiffa: The reason why we didn’t publish this mas- versally appreciated by some of our nonnumerate sive 900-page book after we essentially finished it students. But it was like an existence theorem: it was that Schlaifer and I no longer believed in the demonstrated that decision analysis was relevant and centrality of the standard statistical paradigm as de- teachable to future managers. In retrospect I think picted in the above figure. The book was put aside the effort was done too quickly without enough at- because we had a new exciting agenda to explore. tention to palatable pedagogy. At semester’s end the But we should have finished it. The fact that the students burned one of Robert’s books, one that book was available in a pre-publication form also I believe deserved a prize for innovation. took off the pressure for actually publishing it. That In the mid-1960s, I was offered and accepted a book wasn’t finished until 1995, when I retired and joint chair between the Business School and the Eco- had more time and more maturity. Pratt did most nomics Department. I relished the fact that I never of the polishing and added more theoretical mate- took a course in economics. I taught decision anal- rial, but Schlaifer chose not to get involved in the ysis, which in my mind had a different agenda than revisions. courses I once taught in statistical decision theory. We are now back in 1963 and with ISDT on a back I drifted away from Robert as I started to work on burner, we eagerly pursued our new agenda. We problems more in the public sector and to do re- had to learn how to elicit judgments from people— search on multiple conflicting objectives and nego- probabilities and . Skeptics asserted that we tiations. But that research, together with my expe- couldn’t get real experts to provide these subjec- riences at IIASA, are other chapters in my career tive assessments. Well, we demonstrated to our sat- which I’ll discuss tomorrow at the Dickson Award isfaction that was not the case. We worked with ceremonies. engineers and market experts who were more than Fienberg: As you look back over the field of statis- willing to give us their subjective probability assess- tics, don’t you have a sense of satisfaction about how ments for real problems. But were these assessments your work with Robert and John has influenced oth- any good? Garbage in and garbage out. We realized ers and the growth of the Bayesian school? that if we wanted to elicit judgments in a credible Raiffa: Certainly I’m proud of what I’ve contributed manner, there was the delicate issue on how you in this field. But still I’m a little disappointed. If we asked the questions. Framing was crucial. For ex- made a survey of the way statistics is taught across ample, we learned that if we asked questions about the country, it would be dominated by the old stuff, utility judgments in terms of incremental amounts, the Neyman–Pearson theory. Carnegie Mellon is a we would get different answers than if we asked peo- maverick; is an exception. The subjectivist school of ple questions about their asset positions. And then decision making is not being taught in many places. we had to decide which set of responses should be Kass: I think it’s turned a corner. used. Schlaifer and I convinced ourselves that ques- Raiffa: Here, at Carnegie Mellon for sure, but. . . tions should be posed in terms of asset positions and Kass: Well, no, I think in the world, in the last not in terms of incremental amounts of money, be- ten years or so I think it’s really started to change. cause increments invited all kinds of zero illusions. And I don’t think it’s only in statistics but in lots of 14 S. E. FIENBERG other fields, applied areas that really use statistical Kass: Not only has the Bayesian world become ideas and methods. more intimately involved in applications since your Raiffa: Well, let’s look at the curricula of most early efforts, but statistics as a whole has moved in of the universities. I think that overwhelmingly the this direction. classical school is still dominant at most universities. Raiffa: I hope you are correct but it’s painfully I was instrumental in helping to start the Kennedy slow. I look forward to the day that there will be School of Government at Harvard and in the begin- Departments of Decision Sciences in other universi- ning I had some input in what was taught. In the ties besides Carnegie-Mellon and Duke. Statistics is early days the curriculum was decision and policy a broad subject encompassing data analysis, mod- oriented and Bayesian statistics and decision anal- eling and inference as well as decisions. I just don’t ysis were taught and integrated in the curriculum. want the decision component to disappear. But new teachers were hired and they taught what Fienberg: Well, Howard, we all look forward to a they had learned as students and Bayesianism dis- continuation of this conversation when you can tell appeared. us more about your analytical roots as a decision Kass: I have another question; it’s almost the same scientist and about your experiences after 1967. as Steve’s, involving the Bayes part of it. One of the Raiffa: I look forward to it. things that’s so interesting to many of us is to ex- amine the way one body of work influences another. REFERENCES And in the case of your book with Schlaifer, it’s easy to see how that influenced, for instance, Mor- Hammond, J. S., Keeney, R. L. and Raiffa, H. (1998). rie DeGroot’s book, which came a decade or so later, Smart Choices. Harvard Business School Press, Boston. Keeney, R. L. and Raiffa, H. (1976). Decisions with Mul- and then, much later, Jim Berger’s book which most tiple Objectives: Preferences and Value Tradeoffs. Wiley, students today are familiar with in modern statisti- New York. Reprinted, Cambridge Univ. Press, New York cal decision theory. In retrospect it’s clear how your (1993). MR0449476 book inspired the material and presentation in DeG- Luce, R. D. and Raiffa, H. (1957). Games and Decisions: root’s book. To see these three books, one right after Introduction and Critical Survey. Wiley, New York. Paper- back reprint, Dover, New York. MR0087572 another, it’s very easy to trace the influences back- Pratt, J. W., Raiffa, H. and Schaifer, R. (1995). Intro- wards, but how do we trace things back to see the duction to Statistical Decision Theory. MIT Press, Cam- influences on your work with Robert as a Bayesian? bridge, MA. MR1326829 Raiffa: Schlaifer was driven by the need to coor- Raiffa, H. (1968). Decision Analysis: Introductory Lectures dinate statistics with business decision making and on Choices Under Uncertainty. Addison-Wesley, Reading, he truly discovered from scratch the basic ideas of MA. Raiffa, H. (1982). The Art and Science of Negotiation. Har- what you refer as Bayesianism. I, on the other hand, vard Univ. Press, Cambridge, MA. was brainwashed into the classical tradition and had Raiffa, H. (2002). Negotiation Analysis. Harvard Univ. to go through a religious conversion. Press, Cambridge, MA. Kadane: What about Jimmy Savage’s work and Raiffa, H., Richardson, J. and Metcalfe, D. (2003). Ne- his 1954 book? gotiation Analysis: The Science and Art of Collaborative Decision. Harvard Univ. Press, Cambridge, MA. Raiffa: Somehow I just was not aware of that book Raiffa, H. and Schaifer, R. (1961). Applied Statistical until I left Columbia. I already mentioned Herman Decision Theory. Division of Research, Harvard Business Chernoff’s paper and Herman Rubin’s sure thing School, Boston. 1968 paperback edition, MIT Press, Cam- principle. That had a profound effect on me. bridge, MA. Wiley Classics Library edition (2000).