Who Solved the Secretary Problem? Author(s): Thomas S. Ferguson Source: Statistical Science, Vol. 4, No. 3 (Aug., 1989), pp. 282-289 Published by: Institute of Mathematical Stable URL: http://www.jstor.org/stable/2245639 Accessed: 21/11/2010 11:01

Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use.

Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at http://www.jstor.org/action/showPublisher?publisherCode=ims.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to Statistical Science.

http://www.jstor.org StatisticalScience 1989, Vol. 4, No. 3, 282-296 Who Solved the SecretaryProblem? Thomas S. Ferguson

Abstract.In MartinGardner's Mathematical Games column in the February 1960issue of Scientific American, there appeared a simpleproblem that has cometo be knowntoday as theSecretary Problem, or the Marriage Problem. It has sincebeen taken up and developedby many eminent probabilists and statisticiansand has been extendedand generalizedin manydifferent directionsso that now one can say that it constitutesa "field"within mathematics-probability-optimization.The object of this articleis partly historical(to givea freshview of the origins of the problem, touching upon Cayleyand Kepler),partly review of the field (listing the subfields of recent interest),partly serious (to answerthe questionposed in the title),and partlyentertainment. The contentsof this paper werefirst given as the AllenT. Craiglecture at theUniversity of Iowa, 1988. Keywords and phrases: Secretary problem, marriage problem, search prob- lem,relative ranks, stopping times, rules.

1. INTRODUCTION thinkyou will findthe journeyinteresting, and the conclusionsurprising. In thelate 1950'sand early1960's there appeared a simple,partly recreational, problem known as the 2. STATEMENTOF THE PROBLEM secretaryproblem, or the marriageproblem, or the dowryproblem, that made its wayaround the mathe- The reader'sfirst reaction to thetitle might well be maticalcommunity. The problemhas a certainappeal. to ask, "Whichsecretary problem?". After all, as I It is easy to state and has a strikingsolution. It have just implied,there are manyvariations on the was immediatelytaken up and developedby certain problem.The secretaryproblem in its simplestform eminentprobabilists and statisticians,among them has thefollowing features. Lindley(1961), Dynkin (1963), Chow, Moriguti, Rob- 1. Thereis one secretarialposition available. bins and Samuels'(1964),and Gilbertand Mosteller The numbern ofapplicants is known. (1966). Since that time,the problemhas been ex- 2. 3. The applicantsare interviewedsequentially in tendedand generalizedin manydifferent directions randomorder, each orderbeing equally likely. so thatnow one can saythat it constitutesa "field"of 4. It is assumedthat you can rankall theapplicants study withinmathematics-probability-optimization. frombest to worstwithout ties. The decisionto One can see fromthe review paper by Freeman (1983) acceptor rejectan applicantmust be based only how extensiveand vast the fieldhas become;more- on the relativeranks of thoseapplicants inter- over,the field has continuedits exponential growth in viewedso far. theyears since that paper appeared. 5. An applicantonce rejectedcannot later be re- One mainobjective of the present article is histori- called. cal, to reviewthe history of the problem with the aim 6. You are veryparticular and willbe satisfiedwith ofdetermining who was thefirst to solvethe secretary nothingbut the verybest. (That is, yourpayoff problem.The historicalreview may take us far,but I is 1 ifyou choose the best of the n applicantsand Ootherwise.) This basic problemhas a remarkablysimple solu- Thomas S. Ferguson is Professorof Mathematics at tion.First, one showsthat attention can be restricted Universityof California,Los Angeles. His mailingad- to theclass ofrules that for some integer r > 1 rejects dress is: Department of Mathematics, Universityof the firstr - 1 applicants,and thenchooses the next California,405 HilgardAvenue, Los Angeles,California applicantwho is best in the relativeranking of the 90024. observedapplicants. For such a rule,the probability,

282 WHO SOLVED THE SECRETARY PROBLEM? 283 /0(r),of selectingthe best applicantis 1/nfor r = 1, fianceproblem at a conferenceon mathematicalprob- and,for r > 1, lemsin logisticsheld at GeorgeWashington Univer- sityin January1950 (personal communication, 1988). n=E Jthapplicant is best I personallyremember working on extensionsof the (2.1) j=nr)J=r \ andyou select it problemin the summerof 1959. In any case, many peopleknew of the problemby the timeit appeared in print. j=r \fl/- 11 ( n j=rJ Lindley(1961) seems to be the firstto solve the The optimalr is the one that maximizesthis proba- problem in a scientificjournal. He extends the bility.For smallvalues of n, the optimalr can easily problemto an arbitraryutility based on the rank be computed.Of interestare the approximatevalues of the applicantselected, and considersin particular ofthe optimal r forlarge n. Ifwe let n tendto infinity the problemof minimizingthe expectedrank of the and writex as the limitof r/n,then using t forj/n applicantselected, rank 1 beingbest. However, the dif- and dt for1/n, the sumbecomes a Riemannapproxi- ficultproblem of findingthe asymptotic,for large n, mationto an integral, optimalstrategies and expectedrank was leftopen, and finallysolved neatly by Chow,Moriguti, Robbins and Samuels (1964). Dynkin (1963) considersthe (2.2) ( i)j( 1)(n) problemas an applicationof the theoryof Markov stoppingtimes, and showsthat, properly interpreted, - x f (-) dt = -x log(x). theproblem is monotoneso thatthe one-stagelook- ahead ruleis optimal. The value of x thatmaximizes this quantity is easily Then,in 1966,came the basic paperof Gilbertand foundby setting the derivative with respect to x equal Mosteller,with elegant derivations and extensionsin to zeroand thensolving for x. Whenthis is done we a numberof importantdirections. In particular,they findthat allow r choicesto obtainthe best; theyconsider the problemof obtainingthe best or the nextbest; they (2.3) optimal x = lle = .367879 *.. , and optimalprobability = l/e. treatthe "full-information"case, in whichone is al- lowedto observethe actual values of the applicants Thus forlarge n, it is approximatelyoptimal to wait (presumedto be chosenindependently from a given untilabout 37% of the applicantshave been inter- distribution),for both the best-choice problem and for viewedand thento selectthe next relatively best one. theminimum-rank problem; they analyze some game The probabilityof success is also about37%. theoreticversions of the problem;etc. This paper, This derivationmay seem a littleloose, but it gives morethan the others,foreshadowed the explosionof theright answer. For a lucidpresentation of these and ideas, generalizations,and effortthat wouldimpact relatedresults, see the basic paper of Gilbertand thisarea startingin 1972and thatcontinues strongly Mosteller(1966). today.Some ofthese contributions will be mentioned in Section5. 3. HISTORICALBACKGROUND This, briefly,is the officialearly historyof the secretaryproblem. Since we areto attemptto discover There is some obscurityas to the originsof this whofirst solved this problem, we shall,as is customary problem.It seemsto be generallyagreed among work- in historicalpapers, proceed backward in time, looking ersin thefield that the first statement of the problem forthe germ of the idea hiddenin forgottenliterature. to appear in printoccurred in the February1960 The firstpossibility occurs in the workof Arthur ,column of Martin Gardner in ScientificAmerican, whereit is attributedto Fox and Marnie.A solution Cayley. to theproblem is outlinedin the March1960 issue of 4. CAYLEY'S PROBLEM Scientific American and attributed to Moser and Pounder.In 1963,it appearedas a problemin The The distinguishedEnglish mathematician, Arthur American MathematicalMonthly contributed by Bis- Cayley (1821-1895),is perhapsbest knownfor his singerand Siegel(1963); in 1964,a solutionappeared seminalwork in thetheory of algebraic invariants. He dueto Bosch.But Mostellerlearned of the problem in was also one ofthe mostprolific mathematicians the 1955 fromAndrew Gleason, "who claimedto have worldhas ever known.His collectedworks contain heardit fromanother," (Gilbert and Mosteller,1966). some966 paperstouching on manysubjects in math- HerbRobbins recalls discussing the problem in 1953- ematics,theoretical dynamics and astronomy.Paper 54 at ColumbiaUniversity, and MerrillFlood recalls #705contains some 50 pagesof problems and solutions presentinga versionof the problem that he calledthe thatCayley submitted to theEducational Times from 284 T. S. FERGUSON 1871to 1894.One ofthese problems Cayley (1875), is and Chowand Robbins(1963): Randomvariables X1, as follows: X2 *... are observedsequentially at a costof c > 0 per observation;if you stop afterobserving X,, thenyou 4528. (Proposedby Professor Cayley) A lotteryis receiveXn - nc. If recall of past observationsis arrangedas follows:There are n ticketsrepre- allowed,the payoff for stopping after observing Xn is sentinga, b, c, ... pounds respectively.A person max(X1,..., XX) - nc; such problemswere treatedby drawsonce; looks at his ticket;and ifhe pleases, MacQueen and Miller (1960), Derman and Sacks drawsagain (out ofthe remainingn - 1 tickets); (1960) and Chow and Robbins (1961). In Karlin and so on, drawingin all not morethan k times; (1962),the problemis solvedwith a discountrather and he receivesthe value of the last ticket drawn. than a cost. See DeGroot(1970) fora fulleraccount Supposingthat he regulateshis drawingsin the ofthese developments. mannermost advantageous to him accordingto This class ofproblems forms a ratherdistinct set of the theoryof probabilities,what is the value of problemsthat is stillbeing confused with the secretary his expectation? problems.Since thereare so manyvariations of the basicsecretary problem (each of the 6 conditionslisted Fromthe "Solutionby the Proposer,"we see that in Section2 has beenmodified by at leastone author), Cayleybelieves it to be understoodthat k, n, and the I thinkit is worthwhileto tryto definewhat a secre- a, b, c, ... are known numbers. (Note that here k, taryproblem is. My definitionis: A secretaryproblem ratherthan n, representsthe total numberof draw- is a sequential observationand selectionproblem in ings.) He solvesthe problemby whatis now known which the payoffdepends on the observationsonly as the methodof backwardinduction of dynamic throughtheir relative ranks and not otherwiseon their programming.As an example,he takes n = 4, and actual values. a, b, c, d = 1, 2, 3, 4, and findsfor k = 1, 2, 3, 4, the Withthis definitionthen, Cayley's problem is not values of the most advantageousexpectation to be evena secretaryproblem. We mustlook elsewhere to 10/4 38/12, 85/24, 4 resp. see whosolved the secretary problem. Proceeding far- Cayley'sproblem was resurrectedfrom oblivion by ther back in time,we come to the firstpractical Moser(1956), who also reformulatedthe problem in a applicationI couldfind of these sequential observation neaterguise which may be viewedas an approximation and selectiontechniques: the selectionof a wifeby to the Cayleyproblem when n is verylarge and a, b, JohannesKepler. c, ... are 1, 2, ..., n: You observe, sequentially, randomvariables X1, ... , known to be iid froma Xk 5. KEPLER'S PROBLEM uniformdistribution on theinterval (0, 1); ifyou stop afterobserving Xj, thenyou receive Xj as yourreward. Whenthe celebrated German astronomer, Johannes The optimalrule, as foundby Moser,is to stopwhen Kepler (1571-1630),lost his firstwife to cholerain thereare m observationsleft to be observedif the 1611,he set aboutfinding a newwife using the same value of the presentobservation is greaterthan Em, methodicalthoroughness and carefulconsideration of wherethe Em are definedrecursively by Eo = 0 and the data thathe used in findingthe orbitof Mars to Em+1= (1 - E2)/2. The correspondingequations are be an ellipse.His first,not altogether happy, marriage discussedin Guttman(1960) forthe normaldistribu- had been arrangedfor him, and this time he was tion,in Karlin(1962) forthe exponential distribution determinedto makehis owndecision. In a longletter and in Gilbertand Mosteller(1966) forthe inverse to a Baron Strahlendorfon October23, 1613,written powerdistribution. afterhe had madehis selection,he describesin great Althoughthere are strongpoints of similaritybe- detailthe problemshe facedand the reasonsbehind tweenCayley's problem and the secretaryproblem, each of the decisionshe made.He arrangedto inter- thereis one importantdifference. The payoffis not viewand to choosefrom among no fewerthan eleven one or zerodepending on whetheryou selectthe best candidatesfor his hand.The processconsumed much or not; it is a numericalquantity depending on the of his attentionand energyfor nearly 2 years,what intrinsicvalue of the objectselected. This difference withthe investigations into the virtues and drawbacks playsa big rolein the feelingof the problem and has of each candidate,her dowry,negotiations with her led to a class of problems,called variously the house parents,natural hesitations, the advice of friends, etc. huntingproblem, the problemof sellingan asset, The book of ArthurKoestler (1960) contains an or the searchproblem, with a literatureas largeas entertainingand insightfulexposition of the process. forthe secretaryproblems. The basic problemis the The bookof Carola Baumgardt (1951) containsmuch Cayley-Moserproblem with an infinitehorizon, with supplementaryinformation. thepayoff modified to makecertain that one will wish Sufficeit to saythat of the eleven candidates inter- to stopin a finitetime. For example,Sakaguchi (1961) viewed,Kepler eventually decided on thefifth. It may WHO SOLVED THE SECRETARY PROBLEM? 285 be noted that when n = 11, the function0n(r) of formulationof Stadje (1980),Gnedin (1983), Berezov- (2.1) takeson itsmaximum value when r = 5. Perhaps, skiy,Baryshnikov and Gnedin(1986) or Samuelsand ifKepler had been aware of the theory of the secretary Chotlos(1986). Possibly,Kepler was concernedwith problem,he couldhave saved himself a lotof time and the actual value of his brideand not just withher trouble. rankingamong the other candidates. This wouldmake Ofcourse, in all practicalapplications of theoretical it not a secretaryproblem at all, but a stopping-rule results,the assumptionsare neverexactly satisfied, problemmore like Cayley'sproblem or the search and in the presentinstance this is especiallytrue as problemdiscussed in theprevious section. we can see fromKepler's letter.For example,after It is clearthat much more research needs to be done interviewingcandidate number 5 and beingstrongly to clarifywhich of these problems Kepler was actually attractedto her, Kepler listenedto the advice of solving.Whichever one it was,there can be no doubt friendswho were concernedwith her lack of high thatthe outcome was favorablefor him. His newwife, rank,wealth, parentage and dowry(she was an or- whoseeducation, as he says in his letter,must take phan),and whopersuaded him to proposeto number theplace ofa dowry,bore him seven children, ran his 4. Thus, clearlyKepler thought he could recallpast householdefficiently, and seemsto haveprovided the applicantsin violationof assumption5 of Section2. necessarytranquil homelife for his erraticgenius to Certainly,Kepler would have been interestedin the flourish. papersof Yang (1974),Petruccelli (1981, 1984), Rose (1984),Ferenstein and Enns (1988) and others,who 6. THE GAME OF GOOGOL allow backwardsolicitation with a cost or with a Let us returnto thequestion: Of which of the many probabilityq of beingaccepted. The probabilityq is differentversions of the secretary problem am I trying not1 in Kepler'scase sincecandidate number 4 turned to findthe solver?As historians,we shouldtake as himdown. He had waitedtoo long. thesecretary problem, the problem as it firstappeared That Keplerwent on afterhis failurewith number in print,in MartinGardner's February 1960 column 4 showsthat he was notjust interestedin gettingthe in ScientificAmerican, where it was called the game best (assumption6 of Section 2). Perhaps he was ofgoogol and describedas follows. minimizingthe expectedrank or some otherutility function,in whichcase thepapers of Chow, Moriguti, Ask someoneto take as manyslips of paper as Robbinsand Samuels(1964), Mucci (1973), Lorenzen he pleases, and on each slip writea different (1979), Frank and Samuels (1980), etc.,would have positivenumber. The numbersmay rangefrom interestedhim. Since he had been marriedbefore, it small fractionsof 1 to a numberthe size of a is unrealisticto assumethat he knewnothing about googol (1 followedby a hundred0's) or even women(assumption 4 of Section2); he wouldhave larger.These slipsare turned face down and shuf- enjoyedthe full-informationproblem of Gilbertand fledover the top of a table. One at a timeyou Mosteller(1966) or Tamaki (1986). But it is also turnthe slipsface up. The aim is to stopturning unreasonableto assume that he knew everything whenyou cometo the numberthat you guessto aboutwomen (who does?), so the modelswith partial be the largestof the series.You cannotgo back informationand learningof Stewart(1978), Samuels and pickup a previouslyturned slip. If youturn (1981), Campbelland Samuels (1981) or Campbell overall slips,then of courseyou mustpick the (1982) are moreto thepoint. last one turned. On the otherhand, he actuallyinterviewed all 11 candidatesand could have gone on. Perhapshe was The astutereader may notice that this is not the expectinga randomnumber of availablecandidates simpleform of the secretaryproblem described in (violatingcondition 2 of Section2), in whichcase he Section2. The actual values of the numbersare re- wouldhave enjoyedreading the papers of Presman vealedto thedecision maker in violationof condition and Sonin (1972), Gianini-Pettitt(1979), Abdel- 4. Also, there is this "someone"who chooses the Hamid,Bather and Trustrum(1982), and Bruss and numbers,presumably to makeyour problem of select- Samuels(1987). Or perhapsthere was a costof obser- ing the largestas difficultas possible.The game of vationas in Bartoszynskiand Govindarajulu(1978), googolis reallya two-persongame. Lorenzen (1981) or Samuels (1985) or a discount This raisestwo questions. First, can youguarantee factoras in Rasmussenand Pliska (1976). He would a higherprobability of selecting the largest number if certainlybe interestedin the randomarrival models you allowyour decision rule to dependon the actual ofSakaguchi (1976, 1986), Cowan and Zabczyk(1978) valuesof the numbers? In otherwords, does the stated and Bruss (1987), in the game theoreticmodels of solutiongive the lowervalue ofthe game?Second, if Presmanand Sonin (1975),Fushimi (1981) and Enns youare toldhow this "someone" is choosingthe num- and Ferenstein(1987), and in the multiplecriteria bersto place on the slips,can you now guaranteea 286 T. S. FERGUSON higherprobability of selectingthe largestnumber? In where the three-parameterPareto distribution,Pa(k, otherwords, does the value of the game exist,and can lo, uo) with k > 0 and 1l < uo, is the distributionwith we find optimal or c-optimal strategies for the se- density quence chooser? These questions were not addressed in the solution presented in ScientificAmerican in g(ae,: kglog uo) March 1960. (7.2) = k(k + l)(uo - lo)"

The statementof the problemas it appeared in The (fa)k?+2 I(< ,u< ) AmericanMathematical Monthly in 1963 is much the same, but somewhatmore nebulous since you are not where I representsthe indicator function.This is a told where the numbers come from.Perhaps it is a conjugate familyof distributionsfor the uniformdis- game against nature, so the second question above tributions,and the posterior distributionof (a, ) does not arise. In any case, the solutionpresented for givenX1, ..*, Xj is also Pareto, the problemin 1964 contains the same oversight. (a , , given X1, *.., Those distinguished statisticians who worked on 3) Xj, the secretaryproblem in the 1960's were more careful is Pa(k + j, 1j, uj), where in theirstatements of the problemin specifyingwhat (7.3) = minJ1o,X1,...,Xj and informationcould be used in the decision rule, but none of them attacked the above problem.Therefore, Uj= maxfuo,X1, *.., Xjl. to see who firstsolved the problem,we must proceed into the 1970's and beyond. Thus, one can interpretthe priorinformation as being Suppose you were this "someone" who must choose equivalent to a sample of size k froma uniformdistri- the numbersto writeon the slips. How would you go bution with a minimumof lo and a maximumof uo. about choosing the numbersto make it as difficultas Stewart obtained a rather strikingresult for this possible for me to obtain the largest? You could not priordistribution of the Xj, as follows.First, the payoff if just choose the numbers 1, 2, *.., n, because then I is changed so that you win only you stop on the could wait until the number n appeared and thus largestXj and if that Xj is greaterthan uo. Then, one obtain the largest number with probability 1. You shows that could not just choose them iid fromsome fixeddistri- P (Xi Un~I Xis* Xi bution,because this would lead to the full-information (7.4) case solved by Gilbert and Mosteller (1966), who k + j + 1 showed that I can obtain the largest number with k + n + 1(24 = uj) probabilityat least 0.58016 ... , which is the limiting This implies that attentioncan be restrictedto rules value forlarge n. So, you must choose the numbersin that depend only on the relative ranks of the obser- some dependentfashion. But you mightas well choose vations including uo. In fact, the problem becomes themto be an exchangeableprocess since the numbers equivalent to the secretaryproblem of Section 2 with are put in random order anywaybefore being shown n + k + 1 applicants, in which you start with k + 1 to me. Thus, you are drawnto the partial information applicants already rejected,the largest having value models of Stewart (1978), Petruccelli (1980) and uO. In particular,the Bayes rule among rules that use Samuels (1981). all the informationis the rule that rejects the first r'- 1 applicants, and selects the next applicant who is relativelybest (and better than uo), where r' = 7. PARTIALINFORMATION MODELS max(1, r - k - 1) and r is the optimal value of r for the secretaryproblem of Section 2 with n + k + 1 Let X1, *... denote the values of the numbers Xn applicants. In addition,for all values of k, the proba- on the slips. These may be consideredas the parame- bility of win under an optimal rule tends to l/e as tersof our statisticalproblem. We want to finda prior n -* oo!Since forsufficiently large n, the maximumXj distributionfor X1, ... , Xn, withrespect to whichthe will be greaterthan uo withprobability close to 1, this Bayes rule is the usual optimal rule based on the means that given any e > 0, there is an N and a relativeranks of the Xj. distributionof the form(7.1) such that for n > N if In the paper of Stewart (1978), the are chosen Xj (a, O) is chosen fromthis distributionand then the iid froma uniformdistribution on the interval (a, f), Xj are chosen as froma uniformdistribution on (a, d), denoted by U(a, f), and (a, d) is chosen from the the probabilityof win in the game of googol is less Pareto distribution. three-parameter Specifically, than l/e + e. Thus Stewart has solved googol asymptoticallyfor (7.1) (a, O) is Pa(k, loguo), and n large. Unfortunately,for fixed finiten, one cannot X1,.., Xn, given (ae, O3, are iid U(ae, f3, find,for all e > 0, e-optimaldistributions for choosing WHO SOLVED THE SECRETARY PROBLEM? 287 theXj amongthe distributionshe suggests.We must conjugateprior for U(O, 0), and containsthe posterior lookfurther. The paperof Petruccelli (1980) considers distributionof 0 givenX1, * **, Xj: some otherdistributions for the Xj. If the Xj are 0, given X1, ***, is Pa(a + uniformon the interval(0 - 0.5, 0 + 0.5), then the Xj, j, mj), bestinvariant stopping rule gives a probabilityof win (8.3) where mj = max(mo,X1, *.., Xj). to 0.43517... as n -* oo. If the distribu- asymptotic Let us pretendthat we areplaying a gameof googol, tions are normalwith mean and variance1, the g I choosingthe and youchoosing the stoppingrule. situationis even worse (fromour point of view). Xj For a givene > 0, I will choosethe accordingto The best invariantrule gives a probabilityof win Xj (8.1), and findan a (sufficientlyclose to zero)so that asymptoticto the full-informationcase, namely, you will win with a probabilityless than + where 0.58016.--. Asymptotically,you mightas well tell O5n E, is the maximum probabilityyou can guarantee youropponent what youare using. O5n g usingstrategies that depend only on therelative ranks, Finally,we come to the paper of Samuels (1981), On= maxrqn(r). who extendsthe resultsof Stewart.Samuels shows In fact,I willgive you an additionalslight advantage that in the modelwhere the Xj are chosenfrom the and stillkeep your probability of successbelow 'On + uniformdistribution on (a, ,B),the usual rule (the e. I willsay thatyou win if you stop at the largest optimalrule based on relativeranks) is minimaxfor Xj, or if all the are no greaterthan 1. This willallow each n. Since the usual rule is an equalizerrule (it Xj you to restrictattention to stoppingrules that stop givesthe same probabilityof successno matterhow onlyat a relativelylargest that is greaterthan 1. the are chosen),we see thatit is minimaxagainst Xj Xj The probabilityyou win is general,nonparametric alternatives as well.In other words,the usual solutionachieves the lowervalue of (8.4) P(win) = P(all Xj c 1) + P(win*), thegame. Thus, I believethat Steve Samuels deserves creditfor having solved (the more difficult half of) the wherewin* represents the event Imax X > 1 and you secretaryproblem as it appearedin The American choose it}. The firstterm does not dependon your MathematicalMonthly. strategyand is easilycomputed: However,this exhausts my search through the rel- 00 evant literature.What can be said about the game P(all Xj < 1) = P(all Xj < 1 1O)g(O I a, 1) do of googol?I can finallygive you my answerto the questionin the titleof this article.Who solvedthe (8.5) 00 secretaryproblem? Nobody. = a J (1/0)n(1/o)a+1 dO

= a/(n + a). 8. A RESOLUTION Yourproblem is to maximizethe second term of (8.4). Let me hastento apologizefor this anticlimax, and First,we findthe probability, conditional at stagej, to venturethe opinionthat the reason no one has thatmj > 1 is alreadyas largeas it willget. solvedthis problem is thatpossibly no one was inter- ... ested in googolas a game,or perhapsrealized there P(Mi = Mn IXi, ,Xi) was a problemyet to be solved.To remedythe situa- =E(P(Mi = Mn I a, Xi, * * Xj)J X1, .. * Xj) tion,let us tryto findthe solution now. With the hint givenby the paperof Stewart,it turnsout not to be (8.6) I a + hard. = Jmj (mi/M)n-ig(O j, mj) dO I In fact,since we areconsidering only the best choice problem,we mayconsider a simplerclass ofdistribu- = (a + j)/(a + n) tionsthan that of Stewart-theone-parameter uni- independent of X1, *.., This is an analog of formand the Xj. two-parameterPareto distributions. Stewart'sresult (7.4). If you have a newcandidate at Thus,we take stagej, thatis ifXj = mji,it is optimalto selectit if 0 is and and onlyif (8.1 Pa(a, 1), (8.1) X1, *.., Xn, given0, are iid U(0, 0), a +j wherethe two-parameterPareto distribution,Pa(a, ae + n MO),is the distributionwith density > P(win*with best strategy from stage j + 1 on). (8.2) g(0I a, mo) = aiMa/0a+lI(0 > MO), The rightside of this inequalityis a nonincreasing wherea > 0 and mo > 0. For simplicity,we take functionof j, sinceany strategy available at stagej + mO=1. This class of Pareto distributionsforms a 2 is also availableat stagej + 1. Since the leftside of 288 T. S. FERGUSON

the inequalityis an increasingfunction ofj, an optimal of the field but also a careful exposition and new informationas rule may be found among rules of the formfor some well. In theirdiscussion of the partial informationmodel of Stewart, theyuse the priordistribution (8.1) and derive (8.6). However,this - - r 1: reject the first-r 1 applicants and accept the is only used to prove the asymptoticminimaxity result of Stewart. next applicant forwhich Xj = mj, if any. Using such a strategy,the probabilityof a win*may be computedas REFERENCES

n ABDEL-HAMID, A. R., BATHER, J. A. and TRUSTRUM, G. B. (1982). P(win*) = E P(select j)P(j is best I select j). The secretaryproblem with an unknownnumber of candidates. j=r J. Appl. Probab. 19 619-630. BARTOSZYNSKI, R. and GOVINDARAJULU, Z. (1978). The secretary The probabilitythat j is best givenyou select it is just problemwith interviewcost. Sankhya Ser. B 40 11-28. (8.6). The probabilitythat you select j can be found BAUMGARDT, CAROLA (1951). Johannes Kepler: Life and Letters. using (8.6): Philosophical Library,Inc., New York. BEREZOVSKIY, B. A., BARYSHNIKOV, Yu. M. and GNEDIN, A. V. P(select j) = P(mr1 = mjn1and mj > mjr1) (1986). On a class of best choice problems. Inform.Sci. 39 111-127. = a + r-1{ _a + j -1 BISSENGER, B. H. and SIEGEL, C. (1963). Advanced Problem 5086. a +j- i-a \ +j! Amer. Math. Monthly 70 336. (Solution by A. J. Bosch. 71 (1964) 329-330.) a + r-1 BRUSS, F. T. (1987). On an optimal selection problem of Cowan and Zabczyk. J. AppL Probab. 24 918-928. (a + j - 1)(a + j) BRUSS, F. T. and SAMUELS,S. M. (1987). A unifiedapproach to a with a random numberof Combiningthese into (8.4) and letting0n(r, a) denote class of optimal selection problems options.Ann. Probab. 15 824-830. of a we find the probability win, CAMPBELL, G. (1982). The maximum of a sequence with prior information.Sequential Anal. 1 177-191. On(r, a) (a: + n) +( 1)jzr (a +-J1) CAMPBELL, G. and SAMUELS, S. (1981). Choosing the best of the currentcrop. Adv. in Appl. Probab. 13 510-532. CAYLEY, Mathematical questions with their solutions. Now, note that a) is continuous in a for a > 0, A. (1875). On(r, The Educational Times 23 18-19. See The CollectedMathe- and that as a -O, n(r, a) -X-n(r) forall r= 1, ** *, maticalPapers ofArthur Cayley 10 587-588 (1896). Cambridge n, whereOn(r) is givenby (2.1). Hence, as a 0, Univ. Press, Cambridge. CHOW, Y. S., MORIGUTI,S., ROBBINS, H. and SAMUELS, S. M. -- = maxr qn(r, a) maxr qn(r) .n- (1964). Optimal selection based on relative rank (The "secre- tary"problem). Israel J. Math. 2 81-90. Therefore,an c-optimalmethod of choosingthe Xj is CHOW,Y. S. and ROBBINS, H. (1961). A martingalesystem theorem given by (8.1), wherea = a(n, c) is chosen so that and applications. Proc. Fourth BerkeleySymp. Math. Statist. Probab. 1 93-104. Univ. CaliforniaPress. On < Imaxr qn(r, ao) I e. CHOW, Y. S. and ROBBINS, H. (1963). On optimal stoppingrules. This derivation may be considered an alternate Z. Wahrsch.verw. Gebiete 2 33-49. proofof the minimaxityresult of Samuels mentioned COWAN, R. and ZABCZYK, J. (1978). An optimal selection problem associated with the Poisson process. TheoryProbab. Appl. 23 in Section 7. It is interestingto note that this result 584-592. cannot be obtained using the distributionsof (7.1). DEGROOT, M. H. (1970). Optimal Statistical Decisions. McGraw- The optimal rule based on relative ranks is exactly Hill, New York. Stewart's rule with k = -1; but k must be positive for DERMAN, C. and SACKS, J. (1960). Replacement of periodically (7.1) to be a distribution;hence, we cannot approxi- inspectedequipment. Naval Res. Logist. Quart. 7 597-607. of the instant for = DYNKIN, E. B. (1963). The optimum choice mate the case k -1 with distributions.(In addition, stoppinga Markov process. Soviet Math. Dokl. 4 627-629. the term correspondingto (8.5) does not go to zero.) ENNS, E. G. and FERENSTEIN, E. Z. (1987). On a multipersontime- If it seems strange that the distributions(7.1) were sequential game withpriorities. Sequential Anal. 6 239-256. consideredbefore the simplerdistributions (8.1), the FERENSTEIN, E. Z. and ENNS, E. G. (1988). Optimal sequential reason is that Stewart and Samuels treated in their selectionfrom a knowndistribution with holding costs. J. Amer. Statist.Assoc. 83 382-386. with more general payoffs,not just papers problems FRANK, A. Q. and SAMUELS,S. M. (1980). On an the best choice problem,and needed a broader class problemof Gusein-Zade. StochasticProcess. Appl. 10 299-311. of distributions.Unfortunately, (7.1) does not contain FREEMAN, P. R. (1983). The secretaryproblem and its extensions- (8.1). If there is a moral to this, maybe it is that the A review.Internat. Statist. Rev. 51 189-206. simplercases should always be examined first. FUSHIMI, M. (1981). The secretaryproblem in a competitivesitua- tion. J. Oper. Res. Soc. Japan 24 350-359. GARDNER, M. (1960). ScientificAmerican Feb., Mar. (See also NoteAdded in Proof Martin Gardner's New Mathematical Diversions (1966) 35. Steve Samuels has sent me a copy of the book, Problemsof Best Simon and Schuster,New York.) Choice (in Russian) by B. A. Berezovskiyand A. V. Gnedin, 1984, GIANINI-PErrIrr,J. (1979). Optimal selection based on relative Akademia Nauk, USSR, Moscow. This book is devotedsolely to the ranks with a random number of individuals. Adv. in Appl. secretaryproblem and its variations.It contains not only a review Probab. 11 720-736. WHO SOLVED THE SECRETARY PROBLEM? 289

GILBERT, J. and MOSTELLER, F. (1966). Recognizingthe maximum PRESMAN, E. L. and SONIN, I. M. (1975). Equilibriumpoints in a of a sequence. J. Amer. Statist.Assoc. 61 35-73. game relatedto the best choice problem.Theory Probab. Appl. GNEDIN, A. V. (1983). Effective stopping on a Pareto-optimal 20 770-781. option.Avtomat. i Telemekh.44 (3) 166-170. (In Russian). RASMUSSEN, W. T. and PLISKA, S. R. (1976). Choosing the maxi- GUTTMAN, I. (1960). On a problemof L. Moser. Canad. Math. Bull. mum froma sequlencewith a discount function.Appl. Math. 3 35-39. Optim.2 279-289. KARLIN, S. (1962). Stochastic models and optimalpolicy forselling ROSE, J. (1984). Optimal sequential selection based on relative an asset. In Studies in Applied Probabilityand Management ranks with renewablecall options. J. Amer. Statist.Assoc. 79 Science (K. J. Arrow,S. Karlin and H. Scarf, eds.) 148-158. 430-435. StanfordUniv. Press, Stanford,Calif. SAKAGUCHI, M. (1961). Dynamic programmingof some sequential KOESTLER, A. (1960). The Watershed.A Biographyof Johannes samplingdesign. J. Math. Anal. Appl. 2 446-466. Kepler. AnchorBooks, Garden City,NY. SAKAGUCHI, M. (1976). Optimal stopping problems for randomly LINDLEY, D. V. (1961). Dynamic programmingand . arrivingoffers. Math. Japon. 21 201-217. Appl. Statist. 10 39-51. SAKAGUCHI, M. (1986). Best choice problemsfor randomly arriving LORENZEN, T. J. (1979). Generalizingthe secretaryproblem. Adv. offersduring a random lifetime.Math. Japon. 31 107-117. in Appl. Probab. 1 1 384-396. SAMUELS,S. M. (1981). Minimax stoppingrules when the under- LORENZEN, T. J. (1981). Optimal stoppingwith sampling cost. Ann. lying distribution is uniform. J. Amer. Statist. Assoc. 76 Probab. 9 167-172. 188-197. MACQUEEN, J. and MILLER, R. P., JR. (1960). Optimal persistence SAMUELS,S. M. (1985). A best-choiceproblem with linear travel policies. Oper. Res. 8 362-380. cost. J. Amer. Statist.Assoc. 80 461-464. MOSER, L. (1956). On a problem of Cayley. Scripta Math. 22 SAMUELS, S. M. and CHOTLOS, B. (1986). A multiple criteria 289-292. optimal selection problem. In Adaptive Statistical Procedures MuccI, A. G. (1973). On a class of secretaryproblems. Ann. Probab. and Related Topics (J. Van Ryzin, ed.) 62-78. IMS, Hayward, 1 417-427. Calif. PETRUCCELLI, J. D. (1980). On a best choice problemwith partial STADJE, W. (1980). Efficientstopping of a randomseries of partially information.Ann. Statist. 8 1171-1174. orderedpoints. Multiple CriteriaDecision Making. Theoryand PETRUCCELLI, J. D. (1981). Best-choice problemsinvolving uncer- ApplicationLecture Notes in Econ. Math. Syst. 177 430-447. tainty of selection and recall of observation.J. Appl. Probab. Springer,New York. 18 415-425. STEWART, T. J. (1978). Optimal selection froma random sequence PETRUCCELLI, J. D. (1984). Best-choice problems involvingrecall with learningof the underlyingdistribution. J. Amer. Statist. withuncertainty of selection when the numberof observations Assoc. 73 775-780. is random.Adv. in Appl. Probab. 16 111-130. TAMAKI, M. (1986). A full-informationbest-choice problem with PRESMAN, E. L. and SONIN, I. M. (1972). The best-choiceproblem finitememory. J. Appl. Probab. 23 718-735. for a random number of objects. Theory Probab. Appl. 17 YANG,M. C. K. (1974). Recognizingthe maximumof a sequence 657-668. withbackward solicitation.J. Appl. Probab. 1 1 504-512.

Comment:Who Will Solve the Secretary Problem? StephenM. Samuels

Justlike Johannes Kepler, who threw a newcurve Given n, eitherfind an exchangeablesequence of at the solar system,Tom Fergusonhas givena dif- continuous random variables, X1, X2, *---, Xn, for ferentslant to the SecretaryProblem. To its many which,among all stoppingrules, r, basedon theX's, practitionerswho ritually begin by saying "all thatwe can observeare the relativeranks," Ferguson (citing sup P{XT= max(Xl, X2, XXn), historicalprecedent), in effect,responds "let's not take thatassumption for granted." The heartof his paper, is achievedby a rulebased onlyon the relativeranks as I see it,is the followingFerguson Secretary Problem: ofthe X's-or provethat no suchsequence exists. Fergusonhas come withinepsilon of solvingthis problem.He has exhibitedexchangeable sequences, foreach n and e > 0, such that the best rulebased Stephen M. Samuels is Professor of Statistics and onlyon relativeranks has successprobability within Mathematics at Purdue University.His mailing ad- e ofthe supremum.But he has leftopen the question dress is: Departmentof Statistics,Mathematical Sci- ofwhether this supremum can actuallybe attained. ences Building, Purdue University,West Lafayette, For n = 2, the answeris easy; thereis no such Indiana 47907. sequence.The followingelementary argument, which