Piet Groeneboom, Jan van Mill, Aad van der Vaart Statistics as both a purely mathematical activity and an applied science NAW 5/18 nr. 1 maart 2017 55

Piet Groeneboom, Delft Institute of Applied Mathematics, Delft University of Technology
Jan van Mill, KdV Institute for Mathematics, University of Amsterdam
Aad van der Vaart, Mathematical Institute, Leiden University

In Memoriam Kobus Oosterhoff (1933–2015) Statistics as both a purely mathematical activity and an applied science

On 27 May 2015 Kobus Oosterhoff passed away at the age of 82. Kobus was employed at the Mathematisch Centrum in Amsterdam from 1961 to 1969, at the Roman Catholic University of Nijmegen from 1970 to 1974, and then as professor at the Vrije Universiteit Amsterdam from 1975 until his retirement in 1996. In this obituary Piet Groeneboom, Jan van Mill and Aad van der Vaart look back on his life and work.

Kobus (officially: Jacobus) Oosterhoff was born on 7 May 1933 in Leeuwarden, the capital of the province of Friesland in the north of the Netherlands. He was the eldest of three sons of Wijbo Jacobus Oosterhoff and Gerarda Elisabeth Johanna Oosterhoff-Hoefer Wijehen. The Second World War broke out during his early school days, but Kobus escaped famine and other war miseries, as his father had sent him to his uncle's farm, while himself continuing his practice as a lawyer in Leeuwarden.

After the war Kobus attended the Stedelijk Gymnasium in Leeuwarden (a top-level high school with a curriculum that includes ancient Greek and Latin), where he graduated in 1951 in the science ('beta') track. He next started studies in geography at the University of Amsterdam, in line with a passion for traveling and collecting minerals that he maintained throughout his life. However, he switched to mathematics, in which he obtained his 'kandidaatsdiploma' in 1959 and his 'doctoraaldiploma' (comparable to a master's degree) in 1963. His favorite lecturer was the topologist J. de Groot, who seems to have deeply influenced Kobus' outlook on mathematics. During his student days he married Käthe Hötte, with whom he was to share the rest of his life and to have two sons, Wijbo and Rutger; his eldest was born before his graduation.

Career
From 1961 on Kobus was employed at the Mathematisch Centrum in Amsterdam (now CWI), as an assistant until he obtained his master's degree in 1963, then as a 'medewerker', and from 1967 to 1969 as the deputy director ('sous-chef') of the department of Mathematical Statistics. The head of this department was J. Hemelrijk, and the Mathematisch Centrum was a thriving place for both applied and mathematical statistics. Kobus was involved in consulting projects and also worked on a PhD thesis. In his thesis Kobus acknowledges the regular contact with Hemelrijk and the encouragement received from him, but he did his thesis under the direction of Willem van Zwet who, one year younger than Kobus, had been a professor at Leiden University since 1965. Kobus became Willem's first PhD student, defending his dissertation Combination of One-sided Statistical Tests on 26 June 1969 at Leiden University.

After his graduation Kobus spent a year as a visiting assistant professor at the University of Oregon in Eugene, where he became friends with Don Truax (who was professor of Statistics there) and his family. In 1970 he was appointed as 'lector' at the (then) Roman Catholic University of Nijmegen, and then in 1975 as full professor in Mathematical Statistics at the Vrije Universiteit Amsterdam, a position he held until his retirement in 1996. It is ironic that Kobus switched from a Catholic to a Protestant university, being not at all religious himself. Nowadays both universities have a general outlook, but this was different in the 1970s. The Vrije Universiteit had to grant Kobus official (and rare) dispensation from endorsing the Christian objectives of the university, and required him to write a statement that he would respect the Protestant character of the university.

Kobus always kept to this promise, which was made easy by the fact that his political ideas were in tune with Christian ethics and his colleagues in the mathematics department were liberal, even if in majority adhering to the religious character of the university. On occasions such as a PhD defense, where the chairperson had to pronounce a prayer, Kobus would have himself replaced for this part of the ceremony.

The Mathematisch Centrum in Amsterdam remained a meeting place for statisticians in the Netherlands, and Kobus returned to the Centrum as an advisor in 1971, a position he kept until 1981. In this capacity he became PhD advisor to Richard Gill and Piet Groeneboom, who were appointed at the Centrum and both obtained their PhD in 1979. They were the second and third of eight doctoral students under Kobus' guidance, all defending at the Vrije Universiteit. The first student was Wilbert Kallenberg, who had started with Kobus in Nijmegen and defended in Amsterdam in 1977. Kobus enjoyed working with students, and was an encouraging teacher. Although he left his students free to pursue whatever subject they chose, he read every word of their dissertations, and his preference for exact and clear expression made him want to rewrite every sentence. Later he more than once jokingly claimed that perhaps Richard Gill had taught him survival analysis, but certainly he had taught Richard (a native speaker of course) to write correct English. Typical feedback might start with "You make people work so hard. Why don't you give more explanation!", but would quickly turn into positive comments. The Mathematisch Centrum also organised courses, usually one week long, for mathematicians, including some from Belgium. Kobus was involved in two such courses: one on contiguity and efficiency of rank statistics, and one on efficiency of tests and large deviations (with Piet Groeneboom and Rob Potharst as lecturers). These were important topics of research at the time; the second was the topic of the theses of three of Kobus' students.

Kobus Oosterhoff in 1998. Photo: Andrew S. Tanenbaum

Music
Kobus played the violin in his youth, and was interested in music, both classical music and jazz. Piet knows this because Kobus told him, but his sons do not remember having heard him play. Piet still remembers the following events, which he considers somehow typical for his relationship with Kobus. During the big ISI congress in Amsterdam in 1985 a 'gathering of friends' (Kobus, Henry Daniels, Ronald Pyke, Jon Wellner and Piet) was planned at Piet's house in the town of Utrecht, which is about an hour's drive from Amsterdam. It was a day of extreme rainfall, which made driving difficult. The plan was that Kobus would take some of the friends in his car and Piet the remaining persons, and that they would meet at the first gasoline station on the highway from Amsterdam to Utrecht, after which Kobus would follow Piet to his house in Utrecht. Piet waited for a long time at the first gasoline station, but did not see his PhD supervisor appear. He reluctantly got out of the car into the heavy rain to call Kobus' wife from the station (this was before mobile phones!), but she also had absolutely no idea where her husband could be. What had happened was that Kobus had taken a second entrance to the highway, missing the first gasoline station.

Anyway, the group gathered at the second gasoline station (close to Utrecht), and everybody finally arrived at Piet's house. Ronald Pyke then said: "Henry and Piet, why don't you play something for us?" And Henry Daniels immediately seated himself at the piano, after which it was decided that Händel's famous and beautiful D major sonata for violin and continuo would be played. Henry's favourite instrument was a British invention, the concertina (a kind of accordion), but he also was an excellent pianist. Piet met Kobus again the next morning and Kobus told him: "I did not exactly have high expectations of how this was going to be with Henry and you, but to my relief it was actually pretty nice, I enjoyed it!"

A modest man
Kobus suffered from asthmatic bronchitis, which he kept under control by regularly sniffing from a small inhaler. For the rest he was a strong man, and would never complain about health issues. He loved rowing and walking in the mountains. He also loved to cook and enjoyed fine dining. He took his wife and sons at least once a year to a Michelin-starred restaurant, and when taking out scientific guests for dinner, it was usually Kobus who knew the best restaurants. One everyday memory of Kobus is that he always carried a black purse, in which he kept his money and also the inhaler. At a statistical meeting in Palermo, Sicily, this was snatched away by boys on a passing scooter, but this did nothing to change this habit. One also remembers his interest in Chinese porcelain, especially the soft blue-green qingbai, made during the Song and Yuan dynasties, of which he had a collection in a display cabinet in his bedroom, shown to special house guests.

Kobus was a modest man, probably too modest. He once observed that many hard-working individuals end up failing repeatedly in life, through their inability to accept their own shortcomings. It is much easier, Kobus said, to know yourself and take it from there. Kobus was honest with others too, and of the highest integrity. Although he was aware that being honest is not the same as being right, he was sometimes direct in giving his honest opinions. He generously supported the young assistant professors in statistics (first Wilbert Kallenberg and later Aad van der Vaart and Mathisca de Gunst), enabling them to travel and to develop in teaching and research, and pointing out less obvious details of academic life.

'Algemene Statistiek'
At the time of his appointment at the Vrije Universiteit in 1975, the department of mathematics there was on its way to finding a balance between theoretical and applied mathematics. There had been recent appointments in applied analysis and numerical mathematics, and computer science was developing. Although he also taught pure mathematical statistics courses, some advanced, Kobus naturally weighed in on the applied side. A large proportion of students would conclude their studies with an internship in industry on some statistical project. A course in applied data analysis, using state-of-the-art computing with the GLIM package, was unique in the Netherlands. The lecture notes of Kobus' undergraduate course, named 'Algemene Statistiek' to reach out also to non-mathematics students, became the basis for a Dutch-language book that appeared in the Epsilon series in 2013. The richness of statistics as both a purely mathematical activity and an applied science was central to Kobus' thinking, as is also expressed in his farewell lecture ('Wiskunde of niet', 1996; the title is also a pun on his own initial choice not to study mathematics, perhaps, as he recalls, in response to his high school rector's opinion that mathematics and Greek and Latin have much in common: eminent value and total absence of usefulness).

Administrator
Soon after his appointment at the Vrije Universiteit it became clear that Kobus had strong administrative skills. He served several times as chairman of the Subfaculteit Wiskunde, and later became dean of the Faculty of Mathematics and Computer Science. He was an effective leader, respected and trusted not only by the mathematicians, but also by the computer scientists. When in 2002 the Division of Mathematics and Computer Science was split into two independent departments, Kobus (emeritus professor by then) drafted the financial plan. Although nobody understood the figures, everybody agreed within days, including the computer scientists. Given the strained relationship between the two parties, mostly caused by financial disputes, this was one of the best compliments to him ever given.

Kobus was full of ideas, especially on new education plans. He was dean when in 1990 the study program Bedrijfswiskunde en Informatica (now Business Analytics) was founded at the Vrije Universiteit. It became a great success, although co-founder Gerke Nieuwland once jokingly complained that Kobus together with Bert Kersten had turned his concept of a modern information-technology-based mathematics into mere applied statistics. In later years Kobus was thinking about how mathematics could be integrated with life sciences in order to attract more students, and worked out the financial basis for the successful exchange program with Eastern European universities.

Kobus retired early at age 62 to make room for the younger generation, after having first put pressure on other professors to do so too. In the years to follow he remained involved and became a source of information for newly appointed and much younger administrators, and also for the board of the newly established Faculty of Science (comprising mathematics, computer science, physics and chemistry). At one time Kobus produced a financial formula that the dean did not understand, but as the 'Oosterhoff formula' became a matter of heated discussion during board meetings of the Faculty of Science.

Kobus was also active as an administrator outside the Vrije Universiteit. He was president of the Netherlands Statistical Society from 1976 to 1979 and associate editor of the Annals of Statistics in the 1980s. During his term as treasurer of the Bernoulli Society in the 1990s he helped establish the journal Bernoulli, now among the best journals in statistics. He was the main advisor when the board of the Koninklijk Wiskundig Genootschap took over the Epsilon publishing house. Kobus' wisdom and responsibility were highly valued in such administrative matters. Kobus was also involved in statistical education in Indonesia and in setting up a program in 'Bedrijfswiskunde en Informatica' in Potchefstroom, South Africa, after the end of apartheid.

Kobus was elected as a member of the International Statistical Institute in 1975 and received the Commemorative Medal of the Faculty of Mathematics and Physics of Charles University, Prague, in 1988.

He truly was a man of many talents.

Work
Kobus' research output was modest, but spans his career and includes significant contributions, mostly to statistical testing theory.

Statistical decision theory
His first major paper was joint with W. R. van Zwet (see [14]) and was published in the Annals of Mathematical Statistics, the premier journal in the area (nowadays split into the Annals of Statistics and the Annals of Probability). In the reprint of his thesis [10], finished two years later, Kobus was to write "The present study is a […] $> 0$, where the inequality on $\nu = (\nu_1, \ldots, \nu_k)$ is understood coordinatewise. There is no 'best' solution to this problem, because the most powerful level $\alpha$ test for testing $H_0$ versus $H_1'\colon \nu = \nu_1$, given by the Neyman–Pearson lemma, depends on $\nu_1$. To overcome this Oosterhoff and van Zwet undertook to find a most stringent test: one that minimizes the maximum shortcoming to the Neyman–Pearson test. If $\beta^{+}(\nu_1)$ is the power of the Neyman–Pearson test, this means to find for a given level $\alpha \in (0, 1)$ the measurable set (critical region) $K \subset \mathbb{R}^k$ with $\Pr_{\nu = 0}(T \in K) = \alpha$ that minimizes

$$\sup_{\nu > 0} \bigl[\beta^{+}(\nu) - \beta_K(\nu)\bigr], \qquad \beta_K(\nu) = \Pr_{\nu}(T \in K).$$

It turns out that the solution depends strongly on the value of $\alpha$. One highlight in the paper is that for $\alpha = 0.05$ a region of the form

$$K = \bigl\{(t_1, t_2) : e^{r(\alpha) t_1} + e^{r(\alpha) t_2} \ge c(\alpha)\bigr\}$$

gives a most stringent test. Such an exponential combination of tests was and is unusual. Its optimality results from the fact that it is Bayes relative to a certain prior that is supported at two points of maximal shortcoming. In his dissertation Kobus manages to extend a number of key results to dimension $k > 2$, but solves the full problem only up to a conjecture (see […] equivalent to this test. Two other chapters are on the combination of t-tests, and on multinomial tests, and the thesis also contains numerical investigations. The two panels in Figure 1 are reprinted from Figures 2.5.3 and 2.5.5, and Kobus thankfully acknowledges the programmer of the 'plotter of the EL-X1'. The thesis makes a solid impression and gained Kobus a cum laude PhD.

Figure 1 Contour plot of shortcoming as a function of the alternative $\nu_1 \in \mathbb{R}^2$ of the minimax exponential combination test (left) and Fisher's combination test (right).

The Gaussian testing problem arises as a limit when the sample size tends to infinity. Another joint paper with Willem van Zwet (see [12]) is concerned with contiguity, a concept introduced by Lucien Le Cam, which can be used to make this connection mathematically rigorous. The paper gives necessary and sufficient conditions for connecting contiguity of product measures to local asymptotic normality and the Hellinger distance. It is known for its clarity of results and exposition.

Large deviations and efficiencies
Testing problems with level tending to zero were also the motivation of the theses of Kobus' students Wilbert Kallenberg, Piet Groeneboom and Arnold Kester, on large deviations and Bahadur efficiency and deficiency; Kobus published several joint papers with Piet on this subject [6–8]. The relative efficiency of a sequence of test statistics $\{T_n^{(2)}\}$ with respect to another such sequence $\{T_n^{(1)}\}$ at given $0 <$ […] and power at $\theta$ at least $\beta$ (probability of correctly rejecting the null hypothesis). A large value indicates that $\{T_n^{(2)}\}$ is superior to $\{T_n^{(1)}\}$, since fewer observations are needed to achieve the same power. The relative efficiency is usually difficult to determine exactly, but one can simplify by taking a limit, sending $\alpha$ to 0, $\beta$ to 1, or letting $\theta$ approach the null hypothesis, making the testing problem harder in each case. Letting $\alpha$ tend to 0, meanwhile keeping $\beta$ and $\theta$ fixed, yields the Bahadur efficiency, which is directly connected to large deviation theory. Because the alternative is kept fixed, the p-values $L_n^{(j)}$ of the tests (defined as the probability of a more extreme value of the test statistic than observed, calculated under the assumption that the null hypothesis is true) usually tend to zero exponentially fast in the number of observations, almost surely under the alternative, and $-(2/n) \log L_n^{(j)}$ tends to a positive limit, called the Bahadur slope. Then the Bahadur efficiency is the ratio of the Bahadur slopes: under the probability defined by the alternative $\theta$, almost surely:

$$\lim_{\alpha \downarrow 0} \operatorname{eff}\bigl(T^{(2)}, T^{(1)}; \alpha, \beta, \theta\bigr) \overset{\text{a.s.}}{=} \lim_{n \to \infty} \frac{\log L_n^{(2)}}{\log L_n^{(1)}}.$$

Many test statistics can be written as a functional $T(P_n)$ of the empirical distribution $P_n$ of the data points. This is the uniform distribution on the data points:

$P_n(B)$ is the fraction of data points falling in a set $B$. For such test statistics calculation of the Bahadur slope comes down to establishing limits of the type […] or more generally $\lim_{n \to \infty} n^{-1} \log \Pr\{P_n \in \Omega\}$, for a set $\Omega$, with the data points a random sample from the distribution $P$ determined by the alternative $\theta$. Sanov's theorem (see [13], which has some difficulties, however; a version on Polish spaces can be deduced from [2]) guarantees the existence of such a limit for Polish sample spaces and sets $\Omega$ such that $K(\operatorname{int}(\Omega), P) = K(\operatorname{cl}(\Omega), P)$. Here $K(\Omega, P)$ is the infimum over $Q \in \Omega$ of the Kullback–Leibler divergence $K(Q, P)$ between the probability measures $Q$ and $P$, and $\operatorname{int}(\Omega)$ and $\operatorname{cl}(\Omega)$ the interior and closure of the set $\Omega$ with respect to the weak topology on the set of all Borel probability measures (which is the topology generated by all maps $Q \mapsto \int f \, dQ$, for bounded continuous functions $f$). The limit is then given by

$$\lim_{n \to \infty} n^{-1} \log \Pr\{P_n \in \Omega\} = -K(\Omega, P).$$

In [8] this was improved to sample spaces that are just Hausdorff and the topology on the set of probability measures generated by all maps $Q \mapsto Q(B)$, for Borel sets $B$, called the $\tau$-topology, which seems to be the right topology for the Sanov theorem, although it is non-metrizable. Because this is stronger than the weak topology, the result is obtained for a wider class of sets $\Omega$, which is essential to its application to statistical functionals. The $\tau$-topology has later been used in the more general probabilistic theory on large deviations as well (see, e.g., [1]), but has some peculiar properties. For example, in contrast with the weak topology a sequence of discrete distributions cannot converge to a continuous non-atomic distribution, but a net of discrete distributions can.

Goodness-of-fit tests
Towards the end of his career Kobus concentrated on goodness-of-fit tests. Such tests have the purpose of investigating whether a dataset can be a sample from a given distribution. A chi square statistic is a certain quadratic form in the vector of counts of the numbers of observations belonging to the sets of a given partition of the sample space. It compares this to the expected number if the null hypothesis were true. Such statistics are popular, but are sensitive to the choice of the partition, which is up to the user. Kobus (with co-authors Wilbert Kallenberg and Bert Schriever, see [9]) studied this in the large sample setting, allowing the size of the partition to tend to infinity when the number of observations increases indefinitely. One finding was that for contamination alternatives the partition size should remain finite if the expectation of the $r$-th power of the quotient of the null and contamination densities is finite for some $r > 4/3$ and tend to infinity if this norm is infinite for some $r < 4/3$. This precise criterion contradicts the naive idea that the partitions should always be enriched when more and more observations become available.

Another problem connected to goodness-of-fit testing is that the null hypothesis typically consists of more than one distribution, making it necessary to estimate the expected null counts. Kobus (with co-authors, see [5]) studied this problem for general statistics that compare the empirical distribution of the data to its expected value under the null hypothesis, finding that non-robust estimators of location and scale of the null distribution can improve the power for heavy-tailed alternatives. With the same co-authors Kobus also studied power approximations for a class of tests encompassing the likelihood ratio and chi square tests (see [3, 4]), with a main finding that the usual non-central chi square approximation can be much improved, in particular for the likelihood ratio test.

Kobus published his last paper in 1994 [11], two years before his retirement: a comparison of the merits of trimmed means versus the median.

References
1 E. Bolthausen, Markov process large deviations in τ-topology, Stochastic Process. Appl. 25(1) (1987), 95–108.
2 M. D. Donsker and S. R. S. Varadhan, Asymptotic evaluation of certain Markov process expectations for large time. III, Comm. Pure Appl. Math. 29(4) (1976), 389–461.
3 F. C. Drost, W. C. M. Kallenberg, D. S. Moore and J. Oosterhoff, Asymptotic error bounds for power approximations to multinomial tests of fit, in L. J. Gleser et al., eds., Contributions to Probability and Statistics, Springer, 1989, pp. 429–446.
4 F. C. Drost, W. C. M. Kallenberg, D. S. Moore and J. Oosterhoff, Power approximations to multinomial tests of fit, J. Amer. Statist. Assoc. 84(405) (1989), 130–141.
5 F. C. Drost, W. C. M. Kallenberg and J. Oosterhoff, The power of EDF tests of fit under nonrobust estimation of nuisance parameters, Statist. Decisions 8(2) (1990), 167–182.
6 P. Groeneboom and J. Oosterhoff, Bahadur efficiency and probabilities of large deviations, Statistica Neerlandica 31(1) (1977), 1–24.
7 P. Groeneboom and J. Oosterhoff, Bahadur efficiency and small-sample efficiency, Internat. Statist. Rev. 49(2) (1981), 127–141.
8 P. Groeneboom, J. Oosterhoff and F. H. Ruymgaart, Large deviation theorems for empirical probability measures, Ann. Probab. 7(4) (1979), 553–586.
9 W. C. M. Kallenberg, J. Oosterhoff and B. F. Schriever, The number of classes in chi-squared goodness-of-fit tests, J. Amer. Statist. Assoc. 80(392) (1985), 959–968.
10 J. Oosterhoff, Combination of One-sided Statistical Tests, Mathematical Centre Tracts, Vol. 28, Mathematisch Centrum, Amsterdam, 1969.
11 J. Oosterhoff, Trimmed mean or sample median? Statist. Probab. Lett. 20(5) (1994), 401–409.
12 J. Oosterhoff and W. R. van Zwet, A note on contiguity and Hellinger distance, in J. Jurečková, ed., Contributions to Statistics, Reidel, 1979, pp. 157–166.
13 I. N. Sanov, On the probability of large deviations of random magnitudes, Mat. Sb. N. S. 42(84) (1957), 11–44.
14 W. R. van Zwet and J. Oosterhoff, On the combination of independent test statistics, Ann. Math. Statist. 38 (1967), 659–680.