THE ANNALS of APPLIED STATISTICS
Total Page:16
File Type:pdf, Size:1020Kb
ISSN 1932-6157 (print) ISSN 1941-7330 (online) THE ANNALS of APPLIED STATISTICS AN OFFICIAL JOURNAL OF THE INSTITUTE OF MATHEMATICAL STATISTICS Articles Refining cellular pathway models using an ensemble of heterogeneous data sources ALEXANDER M. FRANKS,FLORIAN MARKOWETZ AND EDOARDO M. AIROLDI 1361 Statistical shape analysis of simplified neuronal trees ADAM DUNCAN,ERIC KLASSEN AND ANUJ SR I VA S TAVA 1385 TPRM: Tensor partition regression models with applications in imaging biomarker detection . MICHELLE F. MIRANDA,HONGTU ZHU AND JOSEPH G. IBRAHIM 1422 Complex-valued time series modeling for improved activation detection in fMRI studies...........DANIEL W. ADRIAN,RANJAN MAITRA AND DANIEL B. ROWE 1451 Optimal multilevel matching using network flows: An application to a summer reading intervention.........................SAMUEL D. PIMENTEL,LINDSAY C. PAGE, MATTHEW LENARD AND LUKE KEELE 1479 Topological data analysis of single-trial electroencephalographic signals YUAN WANG,HERNANDO OMBAO AND MOO K. CHUNG 1506 Joint significance tests for mediation effects of socioeconomic adversity on adiposity via epigenetics...............................................YEN-TSUNG HUANG 1535 Adaptive-weight burden test for associations between quantitative traits and genotype data withcomplexcorrelations...........XIAOWEI WU,TING GUAN,DAJIANG J. LIU, LUIS G. LEÓN NOVELO AND DIPANKAR BANDYOPADHYAY 1558 Bayesian aggregation of average data: An application in drug development SEBASTIAN WEBER,ANDREW GELMAN,DANIEL LEE,MICHAEL BETANCOURT, AKI VEHTARI AND AMY RACINE-POON 1583 BayCount: A Bayesian decomposition method for inferring tumor heterogeneity using RNA-Seq counts . FANGZHENG XIE,MINGYUAN ZHOU AND YANXUN XU 1605 Exploring the conformational space for protein folding with sequential Monte Carlo.......................SAMUEL W. K. WONG,JUN S. LIU AND S. C. KOU 1628 Sequential double cross-validation for assessment of added predictive ability in high-dimensional omic applications . MAR RODRÍGUEZ-GIRONDO,PERTTU SALO, TOMASZ BURZYKOWSKI,MARKUS PEROLA, JEANINE HOUWING-DUISTERMAAT AND BART MERTENS 1655 Joining the incompatible: Exploiting purposive lists for the sample-based estimation of species richness . ALESSANDRO CHIARUCCI,ROSA MARIA DI BIASE, LORENZO FATTORINI,MARZIA MARCHESELLI AND CATERINA PISANI 1679 A general framework for association analysis of heterogeneous data GEN LI AND IRINA GAYNANOVA 1700 ConfidentinferenceforSNPeffectsontreatmentefficacy.................YING DING, YING GRACE LI,YUSHI LIU,STEPHEN J. RUBERG AND JASON C. HSU 1727 Nonparametric Bayesian learning of heterogeneous dynamic transcription factor networks..................................XIANGYU LUO AND YINGYING WEI 1749 continued Vol. 12, No. 3—September 2018 ISSN 1932-6157 (print) ISSN 1941-7330 (online) THE ANNALS of APPLIED STATISTICS AN OFFICIAL JOURNAL OF THE INSTITUTE OF MATHEMATICAL STATISTICS Articles—Continued from front cover Estimating and comparing cancer progression risks under varying surveillance protocols.............JANE M. LANGE,ROMAN GULATI,AMY S. LEONARDSON, DANIEL W. LIN,LISA F. NEWCOMB,BRUCE J. TROCK, H. BALLENTINE CARTER,PETER R. CARROLL,MATTHEW R. COOPERBERG, JANET E. COWAN,LAWRENCE H. KLOTZ AND RUTH ETZIONI 1773 Analysing plant closure effects using time-varying mixture-of-experts Markov chain clustering..................SYLVIA FRÜHWIRTH-SCHNATTER,STEFAN PITTNER, ANDREA WEBER AND RUDOLF WINTER-EBMER 1796 Using missing types to improve partial identification with application to a study of HIV prevalenceinMalawi..........................ZHICHAO JIANG AND PENG DING 1831 A coupled ETAS-I2GMM point process with applications to seismic fault detection...........YICHENG CHENG,MURAT DUNDAR AND GEORGE MOHLER 1853 Functional principal variance component testing for a genetic association study of HIV progression . DENIS AGNIEL,WEN XIE,MYRON ESSEX AND TIANXI CAI 1871 Estimating a common covariance matrix for network meta-analysis of gene expression datasets in diffuse large B-cell lymphoma ANDERS ELLERN BILGRAU,RASMUS FROBERG BRØNDUM, POUL SVANTE ERIKSEN,KAREN DYBKÆR AND MARTIN BØGSTED 1894 Tree-based reinforcement learning for estimating optimal dynamic treatment regimes........................YEBIN TAO,LU WANG AND DANIEL ALMIRALL 1914 A frequency-calibrated Bayesian search for new particles SHIRIN GOLCHI AND RICHARD LOCKHART 1939 Bayesian randomized response technique with multiple sensitive attributes: The case of information systems resource misuse RAY S. W. CHUNG,AMANDA M. Y. CHU AND MIKE K. P. SO 1969 Direct likelihood-based inference for discretely observed stochastic compartmental models of infectious disease LAM SI TUNG HO,FORREST W. CRAWFORD AND MARC A. SUCHARD 1993 Vol. 12, No. 3—September 2018 THE ANNALS OF APPLIED STATISTICS Vol. 12, No. 3, pp. 1361–2021 September 2018 INSTITUTE OF MATHEMATICAL STATISTICS (Organized September 12, 1935) The purpose of the Institute is to foster the development and dissemination of the theory and applications of statistics and probability. IMS OFFICERS President: Xiao-Li Meng, Department of Statistics, Harvard University, Cambridge, Massachusetts 02138-2901, USA President-Elect: Susan Murphy, Department of Statistics, Harvard University, Cambridge, Massachusetts 02138-2901, USA Past President: Alison Etheridge, Department of Statistics, University of Oxford, Oxford, OX1 3LB, United Kingdom Executive Secretary: Edsel Peña, Department of Statistics, University of South Carolina, Columbia, South Carolina 29208-001, USA Treasurer: Zhengjun Zhang, Department of Statistics, University of Wisconsin, Madison, Wisconsin 53706-1510, USA Program Secretary: Ming Yuan, Department of Statistics, Columbia University, New York, NY 10027-5927, USA IMS PUBLICATIONS The Annals of Statistics. Editors: Edward I. George, Department of Statistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Tailen Hsing, Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109-1107, USA The Annals of Applied Statistics. Editor-In-Chief : Tilmann Gneiting, Heidelberg Institute for Theoretical Studies, HITS gGmbH, Schloss-Wolfsbrunnenweg 35, 69118 Heidelberg, Germany, and Institute for Stochastics, Karlsruhe Institute of Technology, Englerstr. 2, 76128 Karlsruhe, Germany The Annals of Probability. Editor: Amir Dembo, Department of Statistics and Department of Mathematics, Stan- ford University, Stanford, California 94305, USA The Annals of Applied Probability. Editor: Bálint Tóth, School of Mathematics, University of Bristol, University Walk, BS8 1TW, Bristol, U.K. and Institute of Mathematics, Technical University Budapest, Egry József u. 1, H-1111 Budapest, Hungary Statistical Science. Editor: Cun-Hui Zhang, Department of Statistics, Rutgers University, Piscataway, New Jersey 08854, USA The IMS Bulletin. Editor: Vlada Limic, UMR 7501 de l’Université de Strasbourg et du CNRS, 7 rue René Descartes, 67084 Strasbourg Cedex, France The Annals of Applied Statistics [ISSN 1932-6157 (print); ISSN 1941-7330 (online)], Volume 12, Number 3, September 2018. Published quarterly by the Institute of Mathematical Statistics, 3163 Somerset Drive, Cleveland, Ohio 44122, USA. Periodicals postage pending at Cleveland, Ohio, and at additional mailing offices. POSTMASTER: Send address changes to The Annals of Applied Statistics, Institute of Mathematical Statistics, Dues and Subscriptions Office, 9650 Rockville Pike, Suite L 2310, Bethesda, Maryland 20814-3998, USA. Copyright © 2018 by the Institute of Mathematical Statistics Printed in the United States of America The Annals of Applied Statistics 2018, Vol. 12, No. 3, 1361–1384 https://doi.org/10.1214/16-AOAS915 © Institute of Mathematical Statistics, 2018 REFINING CELLULAR PATHWAY MODELS USING AN ENSEMBLE OF HETEROGENEOUS DATA SOURCES1 BY ALEXANDER M. FRANKS,FLORIAN MARKOWETZ2,3 AND EDOARDO M. AIROLDI3 University of California, Santa Barbara, University of Cambridge and Temple University Improving current models and hypotheses of cellular pathways is one of the major challenges of systems biology and functional genomics. There is a need for methods to build on established expert knowledge and reconcile it with results of new high-throughput studies. Moreover, the available sources of data are heterogeneous, and the data need to be integrated in different ways depending on which part of the pathway they are most informative for. In this paper, we introduce a compartment specific strategy to integrate edge, node and path data for refining a given network hypothesis. To carry out inference, we use a local-move Gibbs sampler for updating the pathway hypothesis from a compendium of heterogeneous data sources, and a new network regression idea for integrating protein attributes. We demonstrate the utility of this ap- proach in a case study of the pheromone response MAPK pathway in the yeast S. cerevisiae. REFERENCES ALBERTS,B.,JOHNSON,A.,LEWIS,J.,RAFF,M.,ROBERTS,K.andWALTER, P. (2002). Molec- ular Biology of the Cell, 4th ed. Garland Science, New York. BALBIN,O.A.,PRENSNER,J.R.,SAHU,A.,YOCUM,A.,SHANKAR,S.,MALIK,R., FERMIN,D.,DHANASEKARAN,S.M.,CHANDLER,B.,THOMAS,D.,BEER,D.G., CAO,X.,NESVIZHSKII,A.I.andCHINNAIYAN, A. M. (2013). Reconstructing targetable pathways in lung cancer by integrating diverse omics data. Nat. Commun. 4 Article ID 2617. DOI:10.1038/ncomms3617. BERNARD,A.andHARTEMINK, A. J. (2005). Informative structure priors: Joint learning of dy- namic regulatory networks from multiple types of data. In Pacific Symposium on Biocomputing 459–470. BREM,R.B.andKRUGLYAK, L. (2005). The landscape