THE ISBA BULLETIN

Vol. 12 No. 4 December 2005

The official bulletin of the International Society for Bayesian Analysis

A MESSAGE FROM THE PRESIDENT
by Sylvia Richardson
ISBA President
[email protected]

This year has been very productive for ISBA, with a strong society engaged worldwide, the establishment of Sections to encourage diversity, many successful meetings (my personal highlight was the MCMski meeting in Bormio), and the launch of our electronic journal Bayesian Analysis under the stewardship of Rob Kass. Following on from the launch, a proposal to distribute paper copies under the umbrella of the IMS is under discussion by both Societies. Both the Bulletin and Bayesian Analysis are flagships for our Society, and we are indebted to Rob Kass and Andrés Christen for their work on the editorial side.

This year has passed too quickly (don't they all?) and I look forward to carrying on working with our new President, Alan Gelfand, to advance some of the unfinished business, in particular concerning the arrangements for the upkeep of Bayes' grave. I have enjoyed working with the ISBA community and I would like to thank all members of the Board and of the Executive, in particular the past President Jim Berger and the treasurer Bruno Sansó, for being so responsive. I would also like to thank all of you who have accepted new responsibilities on both the prize committees and programme committees, and to congratulate our newly elected board members: Marilena Barbieri, Wes Johnson, Steve MacEachern and Jim Zidek, as well as our new President Elect, Peter Green.

I wish you all a happy and successful year in 2006 and look forward to seeing you at our next ISBA 2006 conference in Valencia. ▲

A MESSAGE FROM THE EDITOR
by J. Andrés Christen
[email protected]

The last issue of the year presents an assorted selection of interesting articles, from regular sections to a contribution from Tony O'Hagan; I hope you enjoy reading it as much as I did.

The treasurer of ISBA, Bruno Sansó, wishes to acknowledge a donation to the Valencia 8 meeting (to be held next June) by BEST, LLC. We thank BEST, LLC, again for this generous donation.

Finally, please do not hesitate to send me any suggestions about articles that you may wish to see published in the Bulletin, or send me any free contribution you might feel is of general interest for the ISBA community. Taking the opportunity of this December issue, I wish you all a great and exciting Bayesian 2006, and also a terrific summer ... to all our Bayesian friends in the southern hemisphere. ▲

Contents
➤ ANNOTATED BIBLIOGRAPHY ☛ Page 2
➤ APPLICATIONS ☛ Page 6
➤ SOFTWARE ☛ Page 7
➤ CONTRIBUTIONS ☛ Page 9
➤ NEWS FROM THE WORLD ☛ Page 10

ISBA Bulletin, 12(4), December 2005

ANNOTATED BIBLIOGRAPHY

ADVANCED MARKOV CHAIN MONTE CARLO METHODS
by Faming Liang
[email protected]

Markov chain Monte Carlo (MCMC) methods are rooted in the work of physicists such as Metropolis and von Neumann during the period 1945-1955, when they employed modern electronic computers for the simulation of probabilistic problems in atomic bomb design. After five decades of continual development, they have become the dominant methodology for many classes of computational problems of central importance to science and technology. MCMC methods have numerous application areas, such as Bayesian statistical inference, spin-glass simulations, chip design, image processing, economics and finance, signal processing, machine learning, biological sequence analysis, phylogeny inference, protein structure prediction, and microarray data analysis, among others.

In brief, an MCMC method simulates a Markov chain to draw samples proportionally (with respect to the invariant distribution) from each part of the sample space, and then conducts statistical inference based on the samples drawn during the simulation process. The local trap phenomenon occurs when the energy function, i.e. the negative log-posterior density function in Bayesian statistics, has multiple local minima separated by high energy barriers. In this situation the Markov chain will be trapped in a local energy minimum indefinitely. Consequently, the simulation process may fail to sample from the relevant parts of the sample space, and the quantities of interest cannot be estimated correctly. Many applications of MCMC methods, such as protein folding, combinatorial optimization, and spin-glasses, could be dramatically enhanced if we had better algorithms which allowed the process to avoid being trapped in local minima. Developing advanced MCMC methods that are immune to the local trap problem has long been considered one of the most important research topics in scientific computing. A non-exhaustive list of the works in this direction is as follows.

• Berg, B.A. and Neuhaus, T. (1991). Multicanonical algorithms for 1st order phase transitions. Physics Letters B, 267, 249-253.
The authors propose the multicanonical algorithm, which seeks to generate samples from a trial distribution under which the energy variable is approximately uniformly distributed, and propose an iterative procedure for constructing such a trial distribution.

• Chen, M.-H. and Schmeiser, B.W. (1993). Performance of the Gibbs, hit-and-run, and Metropolis samplers. Journal of Computational and Graphical Statistics, 2, 251-272.
The authors propose a general form of the hit-and-run algorithm, which behaves like a random-direction Gibbs sampler and allows for a complete exploration of a randomly chosen direction. The hit-and-run algorithm is particularly useful when the sample space is sharply constrained.

• Duane, S., Kennedy, A.D., Pendleton, B.J. and Roweth, D. (1987). Hybrid Monte Carlo. Physics Letters B, 195, 216-222.
The authors propose the hybrid Monte Carlo algorithm, which combines the basic idea of molecular dynamics with the Metropolis acceptance-rejection rule to produce Monte Carlo samples from a complex distribution.

• Edwards, R.G. and Sokal, A.D. (1988). Generalization of the Fortuin-Kasteleyn-Swendsen-Wang representation and Monte Carlo algorithm. Physical Review D, 38, 2009-2012.
The authors propose the slice sampler, which seeks to generate samples uniformly distributed in the region under the surface of the target density function. A marginal distribution of such a sample is identical to the target distribution.

• Gelfand, A.E. and Smith, A.F.M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398-409.
The authors demonstrate that the conditional distributions needed in the Gibbs sampler are commonly available in many Bayesian and likelihood computations.

• Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721-741.
The authors propose the Gibbs sampler, which turns out to be a special scheme of the Metropolis-Hastings algorithm with the proposal distributions being the conditional distributions derived from the target distribution. In the Gibbs sampler, the components of

the parameter vector (multidimensional) can be updated in a systematic or random order.

• Geyer, C.J. (1991). Markov chain Monte Carlo maximum likelihood. Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface (ed. E.M. Keramidas), 156-163, Interface Foundation, Fairfax.
The author proposes the parallel tempering algorithm, which falls into the class of multiple-chain MCMC algorithms. The invariant distributions of the multiple Markov chains are constructed by scaling (or tempering) the target distribution along a given temperature ladder. The swapping operation, an exchange of samples between neighbouring Markov chains, accelerates the convergence of the Markov chains at low temperature levels.

• Geyer, C.J. and Thompson, E.A. (1995). Annealing Markov chain Monte Carlo with applications to pedigree analysis. Journal of the American Statistical Association, 90, 909-920.
The authors consider practical issues of simulated tempering (Marinari and Parisi, 1992), for example, how to set the temperature ladder and how to estimate the pseudo-normalizing constants for each of the trial distributions constructed by scaling (or tempering) the target distribution along the ladder. The authors also demonstrate the usefulness of the algorithm through a biomedical example.

• Gilks, W.R., Roberts, G.O. and George, E.I. (1994). Adaptive direction sampling. The Statistician, 43, 179-189.
The authors propose adaptive direction sampling algorithms within the framework of multiple-chain MCMC simulations.

• Green, P.J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82, 711-732.
The author presents a treatment, based on variable transformations, of Metropolis-Hastings moves between spaces of different dimensions, and names these moves reversible jumps. Reversible jumps have wide applications in Bayesian model selection.

• Grenander, U. and Miller, M.I. (1994). Representations of knowledge in complex systems. Journal of the Royal Statistical Society B, 56, 549-603.
The authors propose the Langevin algorithm, which produces Monte Carlo samples by simulating a diffusion process with the target distribution as its stationary distribution. The diffusion process can be discretized and moderated by the Metropolis-Hastings algorithm.

• Hastings, W.K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97-109.
The author generalizes the Metropolis algorithm to the case where the proposal distribution is asymmetric.

• Hesselbo, B. and Stinchcombe, R.B. (1995). Monte Carlo simulation and global optimization without parameters. Physical Review Letters, 74, 2151-2155.
The authors propose the 1/k-ensemble sampling algorithm, which is similar in spirit to the multicanonical algorithm (Berg and Neuhaus, 1991) and seeks to produce samples from a trial distribution under which the configuration entropy variable is approximately uniformly distributed. The trial distribution can be constructed by the same procedure as that used in the multicanonical algorithm.

• Hukushima, K. and Nemoto, K. (1996). Exchange Monte Carlo method and application to spin glass simulations. Journal of the Physical Society of Japan, 65, 1604-1608.
The authors propose the exchange Monte Carlo algorithm, which is a reinvention of parallel tempering (Geyer, 1991).

• Kirkpatrick, S., Gelatt, C.D. and Vecchi, M.P. (1983). Optimization by simulated annealing. Science, 220, 671-680.
The authors propose simulated annealing (SA) as a general-purpose optimization algorithm. SA employs a temperature parameter to control the simulation or optimization of the target distribution. As shown by Geman and Geman (1984), if the temperature decreases sufficiently slowly (i.e., at a logarithmic rate), SA can reach the global energy minima with probability 1 as the running time goes to infinity.

• Kou, S.C., Zhou, Q. and Wong, W.H. (2005). Equi-energy sampler with applications in statistical inference and statistical mechanics. Technical Report, Department of Statistics, Harvard University.

The authors propose the equi-energy sampler, which can be viewed as a new implementation of the multicanonical algorithm (Berg and Neuhaus, 1991) in the style of multiple-chain MCMC.

• Liang, F. (2002). Dynamically weighted importance sampling in Monte Carlo computation. Journal of the American Statistical Association, 97, 807-821.
The author proposes the dynamic importance sampling algorithm, in which the trial distribution can be self-learned and the importance weight becomes a random variable.

• Liang, F. (2005). Generalized Wang-Landau algorithm for Monte Carlo computation. Journal of the American Statistical Association, in press.
The author generalizes the Wang-Landau algorithm (Wang and Landau, 2001) based on a partition of the sample space. The generalized algorithm is applicable to many statistical problems, such as model selection and sampling from complex distributions. It also incorporates some features of 1/k-ensemble sampling (Hesselbo and Stinchcombe, 1995) and is hence attractive for optimization.

• Liang, F., Liu, C. and Carroll, R.J. (2005). Stochastic approximation in Monte Carlo computation. Technical Report, Department of Statistics, Texas A&M University.
The authors propose the stochastic approximation Monte Carlo algorithm, which can be regarded as a stochastic approximation extension of the Wang-Landau algorithm (Wang and Landau, 2001). This work also represents a new development of the stochastic approximation method, extending its applications to Monte Carlo computation.

• Liang, F. and Wong, W.H. (2000). Evolutionary Monte Carlo: applications to Cp model sampling and change point problem. Statistica Sinica, 10, 317-342.
The authors propose the evolutionary Monte Carlo algorithm, which incorporates the genetic algorithm into MCMC simulations. The algorithm is useful in variable selection and change-point identification.

• Liang, F. and Wong, W.H. (2001). Real-parameter evolutionary Monte Carlo and Bayesian neural network forecasting. Journal of the American Statistical Association, 96, 653-666.
The authors extend the evolutionary Monte Carlo algorithm (Liang and Wong, 2000) to the case where the parameters are real variables. Direction sampling algorithms, such as the snooker algorithm (Gilks, Roberts and George, 1994), are adopted as crossover operators.

• Liu, C. (2003). Alternating subspace-spanning resampling to accelerate Markov chain Monte Carlo simulation. Journal of the American Statistical Association, 98, 110-117.
The author proposes to accelerate MCMC algorithms, such as the data augmentation algorithm (Tanner and Wong, 1987) and the Gibbs sampler (Geman and Geman, 1984), via partial resampling.

• Liu, J.S., Liang, F. and Wong, W.H. (2000). The use of multiple-try method and local optimization in Metropolis sampling. Journal of the American Statistical Association, 95, 121-134.
The authors propose the multiple-try Metropolis-Hastings algorithm, which can be viewed as an importance sampling-based Metropolis-Hastings algorithm. This paper provides a general framework for incorporating optimization procedures, such as steepest descent and conjugate gradient, into MCMC simulations.

• Liu, J.S. and Sabatti, C. (2000). Generalized Gibbs sampler and multigrid Monte Carlo for Bayesian computation. Biometrika, 87, 353-369.
The authors propose a general form of the conditional sampling that originated in the Gibbs sampler (Geman and Geman, 1984). The generalization is done via groups of transformations.

• Liu, J.S., Wong, W.H. and Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika, 81, 27-40.
Based on theoretical results on the convergence rate of the Gibbs sampler, the authors argue that grouping highly correlated components together (i.e., updating them jointly) can improve its efficiency.

• Marinari, E. and Parisi, G. (1992). Simulated tempering: a new Monte Carlo scheme. Europhysics Letters, 19, 451-458.

The authors propose the simulated tempering algorithm, in which the sample space is augmented by an auxiliary variable, the index of temperatures. Simulated tempering leads to a random walk along the temperature ladder.

• Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E. (1953). Equation of state calculations by fast computing machines. Journal of Chemical Physics, 21, 1087-1091.
The authors propose the Metropolis algorithm, which forms the cornerstone of Markov chain-based Monte Carlo methods. At each iteration, the Metropolis algorithm suggests a possible move according to a symmetric proposal distribution and then employs an acceptance-rejection rule to moderate the move such that the detailed-balance condition is satisfied. The detailed-balance condition ensures invariance of the target distribution.

• Neal, R. (2003). Slice sampling (with discussion). Annals of Statistics, 31, 705-767.
The author proposes improvements to the standard slice sampling algorithm based on random walk suppression, which can be done for univariate slice sampling by "overrelaxation" and for multivariate slice sampling by "reflection" from the edges of the slice.

• Propp, J. and Wilson, D. (1996). Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms, 9, 223-252.
The authors propose the coupling from the past (CFTP) algorithm, which can be used to draw exact samples from a distribution defined on a finite state space.

• Swendsen, R.H. and Wang, J.S. (1987). Nonuniversal critical dynamics in Monte Carlo simulations. Physical Review Letters, 58, 86-88.
The authors propose a clustering algorithm which reduces the critical slowing down of the Ising and Potts models and has many applications in image analysis. The idea of the algorithm has been extended to a class of auxiliary-variable-based MCMC algorithms.

• Tanner, M.A. and Wong, W.H. (1987). The calculation of posterior distributions by data augmentation (with discussion). Journal of the American Statistical Association, 82, 528-540.
The authors propose the data augmentation algorithm, which first links the Gibbs sampler structure with statistical missing-data problems and the EM algorithm.

• Wang, F. and Landau, D.P. (2001). Efficient, multiple-range random walk algorithm to calculate the density of states. Physical Review Letters, 86, 2050-2053.
The authors propose the Wang-Landau algorithm, which can be viewed as an improved implementation of the multicanonical algorithm (Berg and Neuhaus, 1991).

• Wong, W.H. and Liang, F. (1997). Dynamic weighting in Monte Carlo and optimization. Proceedings of the National Academy of Sciences USA, 94, 14220-14224.
The authors propose the dynamic weighting algorithm, in which the importance weight becomes a random variable and helps the system escape from local energy minima. The authors also introduce the concept of invariance with respect to the importance weights and propose to use it as a general guideline for MCMC simulations in place of the detailed-balance condition used by the Metropolis-Hastings algorithm. ▲
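Most of the algorithms annotated above elaborate on the basic Metropolis step. As a concrete point of reference, here is a minimal random-walk Metropolis sampler in Python. This is a generic sketch for a one-dimensional standard normal target, not the implementation from any of the papers listed; the target, step size, and sample count are illustrative choices.

```python
import math
import random

def metropolis(log_target, x0, step, n_samples, rng):
    """Random-walk Metropolis: propose x' = x + step * Z with Z ~ N(0, 1),
    then accept with probability min(1, target(x') / target(x))."""
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        x_new = x + step * rng.gauss(0.0, 1.0)
        log_p_new = log_target(x_new)
        # Symmetric proposal, so the Hastings ratio reduces to the target ratio.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = x_new, log_p_new
        samples.append(x)
    return samples

# Target: standard normal; an unnormalised log-density is sufficient.
rng = random.Random(42)
draws = metropolis(lambda x: -0.5 * x * x, 0.0, 1.0, 50000, rng)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

Because only the ratio of target densities enters the acceptance step, the normalizing constant of the target never needs to be known, which is what makes the scheme usable for posterior distributions known only up to proportionality.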

SUGGESTIONS
PLEASE, FEEL COMPLETELY FREE TO SEND US SUGGESTIONS THAT MIGHT IMPROVE THE QUALITY OF THE BULLETIN
[email protected]

APPLICATIONS

BAYESIAN METHODS IN STATISTICAL DISCLOSURE LIMITATION
by Jerome Reiter
[email protected]

When releasing data to the public, statistical agencies are obligated by law to protect the confidentiality of respondents' identities and sensitive attributes. Stripping unique identifiers, like names, addresses, and government-issued identification codes, from the file may not adequately protect confidentiality, because data snoopers may be able to link records in the released data to records in external databases by matching on common values in the two files. Most agencies therefore alter the original data before disseminating them, for example by coarsening or adding noise. In this article, I review the role Bayesian methods play in the evaluation and development of confidentiality protection strategies.

To select confidentiality protection strategies, agencies seek to maximize the usefulness of the released data for acceptable levels of disclosure risk. Many agencies employ Bayesian techniques to quantify risk and utility, often implicitly and sometimes explicitly. To measure risk, agencies can apply Bayes' rule to compute probabilities of identification for the records in the observed file, given the released data and publicly available information about target records. Another approach is to estimate the number of records in the observed dataset that have unique characteristics in the population. Such estimations can be improved by incorporating prior information and borrowing strength across small areas, tasks ideally handled by Bayesian methodology. To measure data utility, agencies often compare ad hoc summaries of the released data, such as first and second moments, to corresponding summaries in the original data. Another approach is to determine the amount of overlap in posterior distributions of specific parameters when estimated using the original and the altered data. More broadly, one can frame utility analysis as an exercise in Bayesian decision theory.

Bayesian thinking underpins several approaches for releasing confidential data. For record-level data, one proposal is to release multiply-imputed, synthetic datasets. These data are simulated from posterior predictive distributions estimated using the original data. Inferences from the synthetic datasets are obtained using methods like those from multiple imputation for missing data, although different rules are used for combining point and variance estimates across the multiple datasets. For tabular data, one proposal is to generate new tables from posterior distributions conditional on certain marginal counts. These are generated using importance sampling and techniques from computational algebra.

There is much opportunity and need for Bayesian statisticians to advance research on data confidentiality. To learn about this area of research, I suggest starting with the sources listed below.

1. Willenborg, L. and de Waal, T. (2000). Elements of Statistical Disclosure Control. Springer-Verlag: New York.
This is an overview of disclosure limitation techniques used by national statistical agencies.

2. Bill Winkler's bibliography on data confidentiality: http://www.niss.org/affiliates/totalsurveyerrorworkshop200503/presentations/WinklerConfidRef050211.pdf.

3. Web sites for the National Institute of Statistical Sciences Digital Government projects (http://www.niss.org/projects.html) and the ASA Committee on Privacy and Data Confidentiality (http://www.amstat.org/comm/cmtepc/index.cfm?fuseaction=main).
These web sites contain links to a variety of research.

4. UNECE workshop on data confidentiality (http://www.unece.org/stats/documents/2005.11.confidentiality.htm).
See the papers by Karr et al., Forster, Polettini and Stander, Ting et al., and Trottini.

5. CHANCE magazine, Summer 2004.
This is a special issue on data confidentiality with introductions to synthetic data, tabular data, and risk/utility tradeoffs. ▲
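As a toy illustration of the synthetic-data approach described above (draw parameters from their posterior, then draw records from the posterior predictive distribution), the following sketch releases several synthetic copies of a univariate dataset under a normal model with known variance. The model, the vague prior, and all numbers are illustrative assumptions, not any agency's actual procedure.

```python
import random
import statistics

rng = random.Random(0)

# Toy "confidential" data: n observations of a sensitive numeric attribute.
original = [rng.gauss(50.0, 1.0) for _ in range(200)]
n = len(original)

# Normal model with known sigma and a vague Normal(0, 100^2) prior on mu,
# giving a conjugate normal posterior for mu.
sigma = 1.0
prior_var = 100.0 ** 2
post_var = 1.0 / (1.0 / prior_var + n / sigma ** 2)
post_mean = post_var * (sum(original) / sigma ** 2)

def synthetic_dataset():
    """One synthetic copy: draw mu from its posterior, then draw all
    records from the resulting posterior predictive distribution."""
    mu = rng.gauss(post_mean, post_var ** 0.5)
    return [rng.gauss(mu, sigma) for _ in range(n)]

# Release m multiply-imputed synthetic datasets instead of the original.
releases = [synthetic_dataset() for _ in range(5)]
synthetic_means = [statistics.fmean(d) for d in releases]
```

Drawing a fresh mu for every copy is what propagates parameter uncertainty into the released datasets; combining estimates across the copies then uses rules analogous to, but not identical with, those of multiple imputation.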

SOFTWARE REVIEW

BAYESIAN CLINICAL TRIAL SOFTWARE FROM MDACC
by John D. Cook
M. D. Anderson Cancer Center
[email protected]

Introduction

This note briefly describes some of the software developed by the Department of Biostatistics and Applied Mathematics at M. D. Anderson Cancer Center for Bayesian clinical trial methods. All applications presented here have a Windows graphical user interface and are freely available from the following web site:

http://biostatistics.mdanderson.org/SoftwareDownload

CRMSimulator

There have been numerous variations of the CRM (Continual Reassessment Method) for phase I dose-finding based on toxicity. The CRMSimulator strips the CRM down to only those features we have seen most commonly used in practice and emphasizes simplicity rather than generality. It is intended as a pedagogical tool, an easy-to-use application for those new to adaptive clinical trial methods. Graphical controls allow the user to visualize the prior probabilities of toxicity and the target toxicity.

Multc Lean

Multc Lean is a "lean" version of the Multc99 software implementing the multiple comparison safety monitoring method of Thall, Simon, and Estey [4]. Multc Lean monitors only two outcomes, efficacy and toxicity, rather than the complex combinations supported by its predecessor. And while Multc99 is a command line application, Multc Lean has a Windows user interface. Multc Lean offers the user one option not available in Multc99: the ability to simulate expected trial duration.

Adaptive Randomization

The Adaptive Randomization software provides a unified environment for simulating a wide variety of adaptively randomized trials. Both binary and time-to-event endpoints are supported. See Berry and Eick [1] for some background on adaptive randomization.

Predictive Probabilities

For trials with either binary or time-to-event outcomes, Predictive Probabilities computes the predictive probabilities of various events, such as concluding that one arm or the other is superior or stopping a trial due to futility. This software is often used in designing clinical trials and in conducting interim analyses.

Inequality Calculator

Given two random variables X and Y, the Inequality Calculator calculates P(X > Y + δ) and produces graphs of the densities of the two random variables. This sort of random inequality is at the heart of many safety monitoring rules, such as [4] and [5]. The supported distribution families are Beta, Gamma, Inverse Gamma, Log-normal, Normal, and Weibull.

Included in the Inequality Calculator is Parameter Solver, also available separately, for solving for distribution parameters given two quantiles or a mean and variance.

EffTox

The EffTox application implements the method of Thall and Cook [3] for dose-finding based on efficacy and toxicity outcomes. Rather than searching for a moderately toxic dose, the EffTox method attempts to find a dose maximizing efficacy and minimizing toxicity, using utility trade-offs elicited from physicians. The method is thus useful for phase I/II trials, evaluating safety and efficacy at the same time.

The software allows physicians to visualize their utility contours, aiding the elicitation process. One may enter model hyperparameters directly, but the software can also solve for hyperparameters that cause the model to match elicited prior probabilities. EffTox supports trial simulation and conduct.

References

[1] Donald A. Berry and G. E. Eick. Adaptive assignment versus balanced randomization in

clinical trials: a decision analysis. Statistics in Medicine, vol. 14, 231-246 (1995).

[2] Donald A. Berry. Bayesian statistics and the ethics of clinical trials. Statistical Science, vol. 19(1), 175-187 (2004).

[3] Peter F. Thall and John D. Cook. Dose-finding based on efficacy-toxicity trade-offs. Biometrics, vol. 60, 684-693 (2004).

[4] Peter Thall, Richard Simon, and Elihu Estey. Bayesian sequential monitoring designs for single-arm clinical trials with multiple outcomes. Statistics in Medicine, vol. 14, 357-379 (1995).

[5] Peter F. Thall, Leiko H. Wooten, and Nizar M. Tannir. Monitoring event times in early phase clinical trials: some practical issues. Clinical Trials, to appear (2005). ▲
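When no closed form is convenient, a random inequality of the kind the Inequality Calculator handles, P(X > Y + δ), can be approximated by straightforward Monte Carlo. The sketch below uses illustrative Beta parameters, for instance posteriors on two response rates; it is only a sketch of the underlying computation, not the MDACC implementation, which also plots the two densities.

```python
import random

def prob_greater(draw_x, draw_y, delta, n, rng):
    """Monte Carlo estimate of P(X > Y + delta) from n independent
    draws of each variable."""
    hits = sum(1 for _ in range(n) if draw_x(rng) > draw_y(rng) + delta)
    return hits / n

rng = random.Random(1)
# Illustrative example: response rates with Beta posteriors on two arms.
p = prob_greater(lambda r: r.betavariate(8, 2),   # arm A rate ~ Beta(8, 2)
                 lambda r: r.betavariate(2, 8),   # arm B rate ~ Beta(2, 8)
                 0.10, 100000, rng)
```

A monitoring rule would then compare such an estimate against a pre-specified cutoff, for example stopping when the probability that one arm beats the other by at least δ crosses a threshold.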

Figure 1: Working environment in EffTox application

THE 2006 MITCHELL PRIZE

The Mitchell Prize committee invites nominations for the 2006 Mitchell Prize. The Prize is currently awarded every other year in recognition of an outstanding paper that describes how a Bayesian analysis has solved an important applied problem. The Prize is jointly sponsored by the ASA Section on Bayesian Statistical Science (SBSS), the International Society for Bayesian Analysis (ISBA), and the Mitchell Prize Founders' Committee, and consists for 2006 of an award of $1000 and a commemorative plaque. The 2006 Prize selection committee members are Tony O'Hagan (chair), Dave Higdon and Marina Vannucci. This information is reproduced from http://www.bayesian.org/awards/mitchell.html, where more details may be found.

CONTRIBUTIONS

MY EARLY DAYS AS A BAYESIAN
by Tony O'Hagan
[email protected]

How I became a Bayesian

I took the BSc Statistics course at University College London from 1966 to 1969. In my second year, Dennis Lindley moved to UCL to take over as Professor from Maurice Bartlett. Until then, the teaching had of course been exclusively frequentist, but Dennis introduced a short lecture course for final-year students on Decision Theory. This dealt with Bayesian hypothesis testing and point estimation as decisions, as well as conjugate priors and value of information.

Dennis was a superb lecturer, who conveyed enthusiasm for his subject and challenged our perceptions. I enjoyed his course probably more than any other in my final year. However, I left UCL unconvinced that all the frequentist theory I had been taught could be useless. After all, at least 999 of every thousand statisticians were happy with the standard methods: could they all be wrong?

I left UCL to work for the Central Electricity Generating Board (CEGB), in the small Statistics section that was based in their Computing centre in Bankside, London. (We were right next to Bankside power station, which is now the famous Tate Modern art gallery.) My work involved designing and analysing experiments for the CEGB's scientists in their research laboratories. I was particularly responsible for the work of their nuclear power laboratories at Berkeley, Gloucestershire. My first task was analysing a large factorial experiment with split plots and cyclical factors, and of course I did this using standard classical methods.

However, it was during the two years that I worked for the CEGB that my conversion took place. And it happened because of how the scientists interpreted the results that I gave them. When I presented them with an estimate for some parameter, I could see them mentally comparing it with their expectations, with results from similar experiments, and so on. As a result, they would invariably feel that the true value was more likely to be on one side of my estimate than the other. And when I presented them with a confidence interval, they would automatically interpret it as a probability statement about the uncertain parameter (for the fixed interval). Because I had received Dennis Lindley's excellent undergraduate course at UCL, I realised that the scientists I was working with were all natural Bayesians. The 999 frequentist statisticians in every thousand were in fact heavily outnumbered by Bayesian non-statisticians!

I remain convinced to this day that everyone is born a Bayesian, and only loses this state of grace when corrupted by the fallacious teachings of frequentists.

An early proposal for sampling-based inference

One piece of work that I did at the CEGB illustrates how my thinking had shifted firmly to a Bayesian perspective in less than two years. The work at Berkeley at that time was strongly directed towards research for the new advanced gas-cooled reactors (AGRs). Within the reactor core, assemblies of 36 fuel rods had gas passed over them to take the heat generated in the fuel away to the turbines. It was important that the transfer of heat from the stainless-steel casing of the fuel rod to the gas was very efficient, so as to prevent over-heating in the assembly. That first experiment I had analysed was measuring the heat transfer at many points in a fuel rod assembly (actually one for the previous generation of magnox reactors).

The analysis of such experiments yielded a fitted regression model, but the question of interest was: what would be the lowest heat transfer value achieved at any point in the assembly? If this was too low, there was a risk of the fuel casing rupturing, with devastating results. Thinking as a Bayesian, I could see that the experiment ought to provide a posterior distribution for this minimum heat transfer coefficient. I could also see that with weak prior distributions the posterior distribution for the parameters of the fitted regression model would be like the standard frequentist analysis. But the only way I could see of calculating the posterior distribution of interest was by simulating from the posterior distribution of the regression parameters and computing the minimum point from each simulated regression.

This idea was never carried out, partly because of the limited computational power we had then. The CEGB had one of the largest and fastest computers in the UK at the time. It occupied its own air-conditioned room to which only the operators had access, but it had only 384kB of RAM. The PC that I am writing this on has about a thousand times as much main memory and is probably at least a thousand times as fast. Any program I wanted to run had to be submitted on punched cards, and I would

get the output on fan-fold paper a couple of hours later. Writing and debugging a program (even the very simple programs we could run in those days) took weeks when we could only do two or three runs a day.

My method was set out in an internal report that I wrote. For some reason, the report was classified as confidential, and I long ago lost the only copy I had. I doubt very much whether the successors of the CEGB have archives of those old reports, so I suppose it really is lost now. The idea of simulating from the posterior is a commonplace feature of modern Bayesian methodology, in particular as a feature of MCMC, but no doubt there were others using the idea before that. Does anyone know of any other instances as early as 1971?

Return to UCL

I had always intended to do a PhD. Indeed, when I left UCL I already had a grant. At that time, the Science Research Council were running a scheme whereby one could be awarded a grant for a PhD on graduation, but this grant was to be taken up after spending from one to five years in industry or teaching. This was a wonderful idea, and I have always been grateful for the opportunity it gave me to experience the real world before becoming an academic. All other PhD grants then were awarded to university departments to give to the best students who applied to study there (as is basically the case now). Mine was awarded to me personally, and I could take it to any university that would have me: another excellent feature of the scheme. Having been converted to Bayesianism, I naturally took myself back to UCL to study for my PhD under Dennis Lindley's supervision.

I brought with me topics that I wanted to work on, and which had been suggested by my time at the CEGB. One of these was how to formulate optimal experimental design so that the solution did not involve putting points at the limits of the design space. My experience had shown me that this dependence on an arbitrary definition of the design space was quite unrealistic in practice. I did not solve this until my 1978 paper in JRSSB (which is now remembered for my early use therein of a Gaussian process to model the regression function). Instead, I started working on the simpler optimisation problem of inference about the location of the maximum of a response surface. This led to my first published paper (in Biometrika in 1973), but I got stuck on the question of inference about a ratio of parameters and switched to a different topic that Dennis suggested (simultaneous equations models in econometrics). My thesis is therefore a rather undistinguished mixture of ideas. However, it is worth noting a part of it which again anticipated important Bayesian developments.

I needed to integrate an intractable multidimensional posterior density, a familiar problem that has received enormous attention and now is usually tackled by MCMC. I only dared to try two dimensions. My solution was to use two-dimensional (product) Gauss-Hermite quadrature. My bivariate posterior exhibited substantial correlation, and I realised that it would be necessary to rotate the axes to make the quadrature more efficient. I was adopting essentially the same approach as the influential paper of John Naylor and Adrian Smith in Applied Statistics in 1982, except that I located the principal axis by searching for the maximum of the density on a small semicircle around the mode. I claim no special priority for having done this in 1974; no doubt there were others coming up with similar pragmatic solutions, but it was Naylor and Smith who saw the wide value of this approach, researched it thoroughly and published it. Theirs was the most widely used way of integrating multidimensional posterior densities before MCMC.

I hope that ISBA members will find in these rather self-indulgent memories of mine some echoes of their own experiences. Our roads may have been different, but we have all had to reach the realisation that the Bayesian paradigm is the right one, and we have all faced practical problems in implementing it! ▲
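The sampling scheme described under "An early proposal for sampling-based inference" above (draw the regression parameters from their posterior, then record the minimum of each simulated fit) is trivial to carry out today. The following sketch, in modern Python with numpy, illustrates the idea; the quadratic response surface, data and all numbers are invented for illustration and are not taken from the CEGB experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: "heat-transfer" measurements y at design points x
# (a simple quadratic response surface, not the original assembly model).
n = 40
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x, x**2])
y = 5.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.2, n)

# With a weak prior, the posterior for the coefficients is approximately
# N(beta_hat, s2 * (X'X)^-1), matching the standard frequentist fit.
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
s2 = resid @ resid / (n - X.shape[1])

# Simulate regressions from the posterior and take the minimum of each
# fitted surface over a fine grid of positions.
g = np.linspace(0.0, 1.0, 200)
grid = np.column_stack([np.ones(200), g, g**2])
draws = rng.multivariate_normal(beta_hat, s2 * XtX_inv, size=5000)
minima = (draws @ grid.T).min(axis=1)

# 'minima' is a sample from the posterior of the lowest response value,
# from which credible bounds can be read off directly.
print(np.percentile(minima, [2.5, 50, 97.5]))
```

The minimum is a non-linear functional of the coefficients, so no closed form is available even in this toy case; the simulation delivers its posterior directly, which is exactly the point of the original proposal.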
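The rotate-then-quadrature idea from the thesis, later developed systematically by Naylor and Smith, can also be sketched briefly. The example below integrates a deliberately correlated bivariate Gaussian "posterior" by product Gauss-Hermite quadrature along rotated and rescaled axes; for simplicity it finds the principal axes from an eigendecomposition of the curvature at the mode rather than by the semicircle search described above, so it illustrates the idea rather than reconstructing the 1974 method.

```python
import numpy as np

# Unnormalised, strongly correlated bivariate Gaussian "posterior".
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

def density(th):
    return np.exp(-0.5 * th @ Sigma_inv @ th)

# Gauss-Hermite rule: nodes/weights for integrals against exp(-x^2).
nodes, weights = np.polynomial.hermite.hermgauss(10)

# Rotate and scale to the principal directions of the posterior.
evals, evecs = np.linalg.eigh(Sigma)

total = 0.0
for xi, wi in zip(nodes, weights):
    for xj, wj in zip(nodes, weights):
        # Map standard nodes into parameter space along the rotated axes.
        z = np.array([xi * np.sqrt(2 * evals[0]), xj * np.sqrt(2 * evals[1])])
        th = evecs @ z
        # Divide out the exp(-x^2) weight built into the quadrature rule.
        total += wi * wj * density(th) * np.exp(xi**2 + xj**2)

# Jacobian of the change of variables.
total *= np.sqrt(4 * evals[0] * evals[1])
print(total, 2 * np.pi * np.sqrt(np.linalg.det(Sigma)))  # should agree
```

Without the rotation, the product rule would place its nodes on an axis-aligned lattice that covers the correlated density poorly; aligning the lattice with the principal axes is what makes a small number of nodes per dimension sufficient.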


NEWS FROM THE WORLD

by Alexandra M. Schmidt
[email protected]

I would like to encourage those who are organizing any event around the world to get in touch with me to announce it here.

Events

Statistics for Biological Networks, EURANDOM, Eindhoven, The Netherlands, January 16th - 18th, 2006. Detailed information concerning the invited speakers and the logistics may be found at the workshop website: http://euridice.tue.nl/ frigat/sbn.htm. The main topics of the workshop are:

• Gene Regulatory Networks,
• Statistical Analysis of Neuronal Data,
• Graphical Models and Bayesian Networks.

The deadline for applications to participate as a contributed speaker or as a poster presenter is December 1st, 2005. Although contributions are primarily expected to be focussed on the three main topics of the workshop, applications more loosely connected with these main themes will also be considered.

IceBUGS: A Workshop about the development and use of the BUGS programme, Tvärminne Zoological Station, University of Helsinki, Finland, February 10th - 13th, 2006.

The aim of this workshop is to bring together people working with and on BUGS, providing a platform for BUGS users and developers to discuss and exchange ideas about using BUGS in data analysis. The workshop will consist of both oral and poster sessions, as well as discussion sessions where BUGS experts can discuss your problems and suggest solutions. We are planning for about 30 participants, so the meeting will be fairly small and informal.

The following speakers have already confirmed their attendance: (UK), Nicky Best (UK), Martyn Plummer (France), Brad Carlin (USA) and Andrew Thomas (Finland). If you are interested in attending and for more details, please email Bob O'Hara ([email protected].fi).

Meeting on High Performance Computing and Statistical Inference, University of Dublin, Trinity College, Ireland, August 23rd - 25th, 2006.

The conference topics are:

• Implementation of statistical analysis with distributed computing;
• Computation for simulation and statistical inference in very large stochastic systems;
• Developments in Markov chain Monte Carlo (MCMC);
• The suitability to parallelization of statistical methods (with a focus on competing approaches to MCMC);
• Challenging statistical applications that test the boundaries of available computing power;
• Grid technologies for statistical analysis;
• Quantum computing and statistical inference.

Contributed papers, in the form of an extended abstract of up to 3 pages, are sought for presentation as a talk or a poster. Deadline for submission of contributed papers is April 25th, 2006. More details can be found at http://www.tcd.ie/Statistics/hpcsi/ or contact [email protected] for more information.

COMPSTAT 2006: 17th Conference of the International Association for Statistical Computing (IASC), Rome, Italy, August 28th - September 1st, 2006.

IMPORTANT DEADLINES:
January 15, 2006: Deadline for submission of contributed papers for possible publication in Conference Proceedings.
May 2, 2006: Deadline for submission of contributed abstracts (not included in Conference Proceedings).

More details can be found at w3.uniroma1.it/compstat2006. ▲

THE 2006 DEGROOT PRIZE

The DeGroot Prize is awarded to the author or authors of a published book in Statistical Science. The Prize is named for Morris ("Morrie") H. DeGroot, and recognizes the impact and importance of his work in Statistics and Decision Theory, and his marked influence on the evolution of the discipline over several decades through his personal scholarship, educational and professional leadership. Award winning books will be textbooks or monographs concerned with fundamental issues of statistical inference, decision theory and/or statistical applications. Nominations for the 2006 award must be received by Friday, 6th January 2006. Only books published during the 5 year period ending December 31, 2004 are eligible for consideration. The winner of the 2006 DeGroot Prize will be announced at the Valencia/ISBA 8 World Meeting on Bayesian Statistics, June, 2006. The webpage http://www.bayesian.org/awards/DeGrootPrize.html contains the full list of the committee members and their addresses, nomination procedure and further information about the prize.

Executive Committee
President: Sylvia Richardson
Past President: Jim Berger
President Elect: Alan Gelfand
Treasurer: Bruno Sansó
Executive Secretary: Deborah Ashby
Web page: http://www.bayesian.org

Program Council
Chair: Kerrie Mengersen
Vice Chair: Peter Müller
Past Chair: José Miguel Bernardo

Board Members
Carmen Fernandez, Valen Johnson, Peter Müller, Fernando Quintana, Brad Carlin, Merlise Clyde, David Higdon, David Madigan, Michael Goldstein, Jun Liu, Christian Robert, Marina Vannucci.

EDITORIAL BOARD

Editor

J. Andrés Christen

Associate Editors

Annotated Bibliography: Marina Vannucci
Applications: Catherine Calder
Interviews: Brunero Liseo
Bayesian History: Antonio Pievatolo
Software Review: Ramses Mena
Student's Corner: Robert Gramacy
News from the World: Alexandra M. Schmidt
