Statistical Modelling of Citation Exchange Between Statistics Journals

Total Page:16

File Type:pdf, Size:1020Kb

Statistical Modelling of Citation Exchange Between Statistics Journals 52 Discussion on the Paper by Varin, Cattelan and Firth Pengsheng Ji (University of Georgia, Athens), Jiashun Jin (Carnegie Mellon University, Pittsburgh) and Zheng Tracy Ke (University of Chicago) We congratulate Varin, Cattelan and Firth for a very stimulating paper. They use the Stigler model on cross-citation data and provide a mode-based method to rank statistical journals. Their approach allows for evaluation of uncertainty of rankings and sheds light on how to avoid overinterpretation of the insignificant difference between journal rankings. In a related context, we study social networks for authors (instead of journals) with a data set that we collect (based on all papers in the Annals of Statistics, Biometrika, the Journal of the American Statistical Association and the Journal of the Royal Statistical Society, Series B, 2003–2012). The data set will be publicly available soon. The data set provides a fertile ground for studying networks for statisticians. In Ji and Jin (2014) we have presented results including (a) ‘hot’ authors and papers, (b) many meaningful communities and (c) research trends. Here, we report results only on community detection of the citation network (for authors). Intuitively, network communities are groups of nodes that have more edges within than across (Jin, 2015). The goal of community detection is to identify such groups (i.e. clustering). We have analysed the citation network with the method of directed scores (Ji and Jin, 2014; Jin, 2015) and identified three meaningful communities; Fig. 16. The first community is ‘large-scale multiple testing’, including (a) a Bayesian group, James Berger and Peter Muller,¨ (b) a Carnegie Mellon group, Christopher Genovese, Jiashun Jin, Isabella Verdinelli and Larry Wasser- man, (c) a causal inference group, Donald Rubin and Paul Rosenbaum, (d) three Berkeley–Stanford groups, (i) Bradley Efron, David Siegmund and John Storey, (ii) David Donoho, Iain Johnstone, Mark Low (University of Pennsylvania) and John Rice and (iii) Eric Lehmann and Joseph Romano, and (e) a Tel Aviv group, Felix Abramovich, Yoav Benjamini, Abba Krieger (University of Pennsylvania) and Daniel Yekutieli. The second community is ‘spatial statistics’ and can be further split into three subgroups: (a) a non-parametric spatial statistics subgroup, including David Blei, Alan Gelfand, Yi Li and Trivel- lore Raghunathan; (b) a parametric spatial statistics subgroup, including Tilmann Gneiting, Douglas Nychka, Anthony O’Hagan, Adrian Raftery, Nancy Reid and Michael Stein; (c) a semiparametric–non-parametric statistics (subgroup), including Raymond Carroll, Ciprian Craniceanu, David Ruppert and Naisyin Wang. The third community is ‘variable selection’ including researchers on dimension reduction (Dennis Cook), quantile regression (Xuming He), variable selection (Peter Bickel, Peter Buhlmann,¨ Emmanuel Candes, Jianqing Fan, Peter Hall, Trevor Hastie, Runze Li, Terrence Tao, Robert Tibshirani, Alexandre Tsybakov, Ming Yuan, Cun-Hui Zhang, Ji Zhu and Hui Zou). Our results must be interpreted with caution, for the scope of the data set is limited. Also, it is not our intention to rank authors or papers. Jon R. Kettenring (Drew University, Madison) This paper provides an excellent comprehensive discussion of citation analysis with emphasis on ranking journals. Citations are a weak form of data. The hope is that the data will nevertheless be sufficiently rich to produce useful insights. With this in mind, my comments focus on Section 3, clustering journals, which the authors suggest ‘can help to establish relatively homogeneous subsets of journals that might reasonably be ranked together’. A complete linkage hierarchical clustering algorithm is used to produce the dendrogram shown in Fig. 2. Six clusters and two singletons are identified by cutting the tree. How well determined are they? For comparison, I repeated the analysis by using the average and minimax (Bien and Tibshirani, 2011) linkage Discussion on the Paper by Varin, Cattelan and Firth 53 John Rice David L Donoho David Siegmund John D Storey Joseph P Romano Bradley Efron Subhashis Ghosal Daniel Yekutieli Abba M Krieger Sanat K Sarkar Donald B Rubin Aad van der Vaart Jiashun Jin Larry Wasserman Zhiyi Chi Yoav Benjamini D R Cox Peter Muller E L Lehmann Christian P Robert James O Berger Mark G Low Christopher Genovese Paul R Rosenbaum Felix Abramovich Iain M Johnstone (a) Ming−Hui Chen Paul Fearnhead R Todd Ogden Yi Li Gareth Roberts David RuppertAmy H Herring Omiros Papaspiliopoulos Simon N Wood Gary L Rosner Jeffrey S Morris Joseph G Ibrahim Mohammad Hosseini−Nasab Ciprian M Crainiceanu Steven N MacEachern Ulrich Stadtmuller Athanasios Kottas Raymond J Carroll Alan E Gelfand Naisyin Wang Sudipto BanerjeeMark F J Steel Alan H Welsh Andrew O Finley Anthony OHagan Brian S Caffo Michael L SteinMontserrat Fuentes Yongtao Guan Hao Purdue Zhang Marc G Genton Michael Sherman Huiyan Sang Theo Gasser Tilmann Gneiting Douglas W Nychka Martin Schlather Adrian E Raftery N Reid Jonathan Tawn Laurens de Haan Robin Henderson (b) Michael Kosorok Bin Yu Terence Tao Jian HuangHansheng Wang Dennis Cook Alexandre Tsybakov Helen Zhang Emmanuel Candes Lixing Zhu Jinchi Lv Nicolai Meinshausen Hui Zou Trevor Hastie Cun−Hui Zhang Peter Bickel Yi Lin Heng Peng Peter Buhlmann Joel Horowitz Dan Yu Lin Jianqing Fan Ming Yuan Runze Li Elizaveta LevinaJi Zhu Robert Tibshirani Peter Hall Hua Liang Xuming He Tony Cai Bernard Silverman Jianhua Huang Jonathan Taylor Jane−Ling Wang Fang Yao Hans−Georg Muller Mohsen Pourahmadi Marina Vannucci Xihong Lin (c) Fig. 16. Communities found in the citation network: (a) ‘large-scale multiple testing’ (359 nodes; only 26 nodes with 24 or more citers are shown); (b) ‘spatial statistics’ (1010 nodes; only 42 nodes with 24 or more citers are shown); (c) ‘variable selection’ community (1285 nodes; only 40 nodes with 54 or more citers are shown).
Recommended publications
  • BERNOULLI NEWS, Vol 24 No 2 (2017)
    Vol. 24 (2), November 2017 Published twice per year by the Bernoulli Society ISSN 1360–6727 CONTENTS News from the Bernoulli A VIEW FROM THE PRESIDENT Society p. 1 Awards and Prizes p. 2 New Executive Members in the Bernoulli Society p. 3 Articles and Letters On Bayesian Measures for Some Statistical Inverse Problems with Partial Differential Equations p. 5 Obituary Alastair Scott p. 10 Past Conferences, Susan A. Murphy receives the Bernoulli Book from Sara van de Geer during the General Assembly of the Bernoulli Society ISI World Congress in Marrakech, Morocco. Meetings and Workshops p. 11 Dear Members of the Bernoulli Society, Next Conferences, Meetings and Workshops It is an honor to assume the role of Bernoulli Society president, particularly be- cause this is a very exciting time to be a statistician or probabilist! As all of us have and Calendar of Events become, ever-more acutely aware, the role of data in society and in science is dramat- p. 13 ically changing. Many new challenges are due to the complex, and vast amounts of, data resulting from the development of new data collection tools such as wearable Book Reviews sensors in clothing, on eyeglasses, in toothbrushes and most commonly on our phones. Indeed there are now wearable radar sensors that provide data that might be used Editor to improve the safety of bicyclists or help visually impaired individuals gain greater MIGUEL DE CARVALHO independence. There are also wearable respiratory sensors that provide data that School of Mathematics could be used to help us investigate the impact of dietary and exercise regimens, or THE UNIVERSITY of EDINBURGH EDINBURGH, UK identify nutritional imbalances.
    [Show full text]
  • 2015 Conference Directory
    RSS International Conference 2015 The conference for all statisticians and users of data sponsored by 7 – 10 September 2015 Exeter University www.rss.org.uk/conference2015 Welcome Outline Timetable Wednesday 9 September 9am – 10am Invited/Contributed sessions (5) It is my great honour to welcome you all to Exeter for the 10.05am – 11.25pm 2015 RSS Conference! After listening to your feedback, you Monday 7 September Invited sessions (6) will see that this year we have adopted a new streaming 9.30am – 5.30pm format to Conference which should allow you to easily 11.25am – 11.50am Pre-conference short courses and workshops identify sessions that are directly relevant to your area of Refreshment break (The Forum) expertise and interests. That being said, I do hope that you 11.50am – 1.10pm will take advantage of this unique opportunity to pop into 7pm – 9pm Invited sessions (7) sessions that you may not have access to elsewhere. Welcome Reception (Royal Albert Memorial Museum) 1.10pm – 2.10pm Lunch We have a great line-up of speakers and sessions organised for you, but I also hope that you will find some time for 2.10pm – 3.10pm exploration. The Conference will be hosted in three Tuesday 8 September Plenary 3 – Significance lecture outstanding venues - The Forum (which all you Broadchurch fans may recognise as the 3.15pm – 4.15pm 9am – 9.30am Courtroom in the second series), the award winning Royal Albert Memorial Museum Invited/Contributed sessions (8) and the University’s Great Hall. The Young Statisticians’ Guide to the Conference 4.15pm – 4.40pm Exeter is an ancient city with many notable landmarks, including the Cathedral which 9.30am – 10.30am Refreshment break boasts an astronomical clock and the longest uninterrupted vaulted ceiling in England.
    [Show full text]
  • BERNOULLI NEWS, Vol 24 No 1 (2017)
    Vol. 24 (1), May 2017 Published twice per year by the Bernoulli Society ISSN 1360–6727 CONTENTS News from the Bernoulli Society p. 1 Awards and Prizes p. 2 New Executive Members A VIEW FROM THE PRESIDENT in the Bernoulli Society Dear Members of the Bernoulli Society, p. 3 As we all seem to agree, the role and image of statistics has changed dramatically. Still, it takes ones breath when realizing the huge challenges ahead. Articles and Letters Statistics has not always been considered as being very necessary. In 1848 the Dutch On Bayesian Measures of Ministry of Home Affairs established an ofŮice of statistics. And then, thirty years later Uncertainty in Large or InŮinite minister Kappeyne van de Coppelo abolishes the “superŮluous” ofŮice. The ofŮice was Dimensional Models p. 4 quite rightly put back in place in 1899, as “Centraal Bureau voor de Statistiek” (CBS). Statistics at the CBS has evolved from “simple” counting to an art requiring a broad range On the Probability of Co-primality of competences. Of course counting remains important. For example the CBS reports in of two Natural Numbers Chosen February 2017 that almost 1 out of 4 people entitled to vote in the Netherlands is over the at Random p. 7 age of 65. But clearly, knowing this generates questions. What is the inŮluence of this on the outcome of the elections? This calls for more data. Demographic data are combined with survey data and nowadays also with data from other sources, in part to release the “survey pressure” that Ůirms and individuals are facing.
    [Show full text]
  • Coauthorship and Citation Networks for Statisticians
    Submitted to the Annals of Applied Statistics arXiv: arXiv:0000.0000 COAUTHORSHIP AND CITATION NETWORKS FOR STATISTICIANS By Pengsheng Jiy and Jiashun Jinz University of Georgiay and Carnegie Mellon Universityz We have collected and cleaned two network data sets: Coauthor- ship and Citation networks for statisticians. The data sets are based on all research papers published in four of the top journals in statis- tics from 2003 to the first half of 2012. We analyze the data sets from many different perspectives, focusing on (a) centrality, (b) commu- nity structures, and (c) productivity, patterns and trends. For (a), we have identified the most prolific/collaborative/highly cited authors. We have also identified a handful of \hot" papers, suggesting \Variable Selection" as one of the \hot" areas. For (b), we have identified about 15 meaningful communities or research groups, including large-size ones such as \Spatial Statis- tics", \Large-Scale Multiple Testing", \Variable Selection" as well as small-size ones such as \Dimensional Reduction", \Objective Bayes", \Quantile Regression", and \Theoretical Machine Learning". For (c), we find that over the 10-year period, both the average number of papers per author and the fraction of self citations have been decreasing, but the proportion of distant citations has been increasing. These suggest that the statistics community has become increasingly more collaborative, competitive, and globalized. Our findings shed light on research habits, trends, and topological patterns of statisticians. The data sets provide a fertile ground for future researches on or related to social networks of statisticians. 1. Introduction. It is frequently of interest to identify \hot" areas and key authors in a scientific community, and to understand the research habits, trends, and topological patterns of the researchers.
    [Show full text]
  • IMS Bulletin 39(4)
    Volume 39 • Issue 4 IMS1935–2010 Bulletin May 2010 Meet the 2010 candidates Contents 1 IMS Elections 2–3 Members’ News: new ISI members; Adrian Raftery; It’s time for the 2010 IMS elections, and Richard Smith; we introduce this year’s nominees who are IMS Collections vol 5 standing for IMS President-Elect and for IMS Council. You can read all the candi- 4 IMS Election candidates dates’ statements, starting on page 4. 9 Amendments to This year there are also amendments Constitution and Bylaws to the Constitution and Bylaws to vote Letter to the Editor 11 on: they are listed The candidate for IMS President-Elect is Medallion Preview: Laurens on page 9. 13 Ruth Williams de Haan Voting is open until June 26, so 14 COPSS Fisher lecture: Bruce https://secure.imstat.org/secure/vote2010/vote2010.asp Lindsay please visit to cast your vote! 15 Rick’s Ramblings: March Madness 16 Terence’s Stuff: And ANOVA thing 17 IMS meetings 27 Other meetings 30 Employment Opportunities 31 International Calendar of Statistical Events The ten Council candidates, clockwise from top left, are: 35 Information for Advertisers Krzysztof Burdzy, Francisco Cribari-Neto, Arnoldo Frigessi, Peter Kim, Steve Lalley, Neal Madras, Gennady Samorodnitsky, Ingrid Van Keilegom, Yazhen Wang and Wing H Wong Abstract submission deadline extended to April 30 IMS Bulletin 2 . IMs Bulletin Volume 39 . Issue 4 Volume 39 • Issue 4 May 2010 IMS members’ news ISSN 1544-1881 International Statistical Institute elects new members Contact information Among the 54 new elected ISI members are several IMS members. We congratulate IMS IMS Bulletin Editor: Xuming He Fellow Jon Wellner, and IMS members: Subhabrata Chakraborti, USA; Liliana Forzani, Assistant Editor: Tati Howell Argentina; Ronald D.
    [Show full text]
  • Statistical Modelling of Citation Exchange Between Statistics Journals
    J. R. Statist. Soc. A (2016) 179, Part 1, pp. 1–63 Statistical modelling of citation exchange between statistics journals Cristiano Varin, Universit`a Ca’ Foscari, Venezia, Italy Manuela Cattelan Universit`a degli Studi di Padova, Italy and David Firth University of Warwick, Coventry, UK [Read before The Royal Statistical Society at a meeting organized by the General Applica- tions Section on Wednesday, May 13th, 2015, Professor P. Clarke in the Chair ] Summary. Rankings of scholarly journals based on citation data are often met with scepticism by the scientific community. Part of the scepticism is due to disparity between the common per- ception of journals’ prestige and their ranking based on citation counts. A more serious concern is the inappropriate use of journal rankings to evaluate the scientific influence of researchers. The paper focuses on analysis of the table of cross-citations among a selection of statistics jour- nals. Data are collected from the Web of Science database published by Thomson Reuters. Our results suggest that modelling the exchange of citations between journals is useful to highlight the most prestigious journals, but also that journal citation data are characterized by consid- erable heterogeneity, which needs to be properly summarized. Inferential conclusions require care to avoid potential overinterpretation of insignificant differences between journal ratings. Comparison with published ratings of institutions from the UK’s research assessment exercise shows strong correlation at aggregate level between assessed research quality and journal citation ‘export scores’ within the discipline of statistics. Keywords: Bradley–Terry model; Citation data; Export score; Impact factor; Journal ranking; Research evaluation; Stigler model 1.
    [Show full text]
  • Applied Statistics
    ISSN 1932-6157 (print) ISSN 1941-7330 (online) THE ANNALS of APPLIED STATISTICS AN OFFICIAL JOURNAL OF THE INSTITUTE OF MATHEMATICAL STATISTICS Articles Elicitability and backtesting: Perspectives for banking regulation NATALIA NOLDE AND JOHANNA F. ZIEGEL 1833 Discussion...............................HAJO HOLZMANN AND BERNHARD KLAR 1875 Discussion.....................................................PATRICK SCHMIDT 1883 Discussion.................................................... MARK H. A. DAV I S 1886 Discussion...........................................................CHEN ZHOU 1888 Discussion.........................................................MARIE KRATZ 1894 Rejoinder...............................NATALIA NOLDE AND JOHANNA F. ZIEGEL 1901 Estimating average causal effects under general interference, with application to a social networkexperiment.....................PETER M. ARONOW AND CYRUS SAMII 1912 Co-evolution of social networks and continuous actor attributes NYNKE M. D. NIEZINK AND TOM A. B. SNIJDERS 1948 Bayesian sensitivity analysis for causal effects from 2 × 2 tables in the presence of unmeasured confounding with application to presidential campaign visits LUKE KEELE AND KEVIN M. QUINN 1974 Model-based clustering with data correction for removing artifacts in gene expression data WILLIAM CHAD YOUNG,ADRIAN E. RAFTERY AND KA YEE YEUNG 1998 A unified framework for variance component estimation with summary statistics in genome-wideassociationstudies...................................XIANG ZHOU 2027 Photodegradation modeling
    [Show full text]
  • Sequential Monte Carlo Methods for Inference and Prediction of Latent Time-Series
    SSStttooonnnyyy BBBrrrooooookkk UUUnnniiivvveeerrrsssiiitttyyy The official electronic file of this thesis or dissertation is maintained by the University Libraries on behalf of The Graduate School at Stony Brook University. ©©© AAAllllll RRRiiiggghhhtttsss RRReeessseeerrrvvveeeddd bbbyyy AAAuuuttthhhooorrr... Sequential Monte Carlo Methods for Inference and Prediction of Latent Time-series A Dissertation presented by Iñigo Urteaga to The Graduate School in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Electrical Engineering Stony Brook University August 2016 Stony Brook University The Graduate School Iñigo Urteaga We, the dissertation committee for the above candidate for the Doctor of Philosophy degree, hereby recommend acceptance of this dissertation. Petar M. Djuri´c,Dissertation Advisor Professor, Department of Electrical and Computer Engineering Mónica F. Bugallo, Chairperson of Defense Associate Professor, Department of Electrical and Computer Engineering Yue Zhao Assistant Professor, Department of Electrical and Computer Engineering Svetlozar Rachev Professor, Department of Applied Math and Statistics This dissertation is accepted by the Graduate School. Nancy Goroff Interim Dean of the Graduate School ii Abstract of the Dissertation Sequential Monte Carlo Methods for Inference and Prediction of Latent Time-series by Iñigo Urteaga Doctor of Philosophy in Electrical Engineering Stony Brook University 2016 In the era of information-sensing mobile devices, the Internet- of-Things and Big Data, research on advanced methods for extracting information from data has become extremely critical. One important task in this area of work is the analysis of time-varying phenomena, observed sequentially in time. This endeavor is relevant in many applications, where the goal is to infer the dynamics of events of interest described by the data, as soon as new data-samples are acquired.
    [Show full text]
  • Gareth Roberts
    Publication list: Gareth Roberts [1] Hongsheng Dai, Murray Pollock, and Gareth Roberts. Bayesian fusion: Scalable unification of distributed statistical analyses. arXiv preprint arXiv:2102.02123, 2021. [2] Christian P Robert and Gareth O Roberts. Rao-blackwellization in the mcmc era. arXiv preprint arXiv:2101.01011, 2021. [3] Dootika Vats, Fl´avioGon¸calves, KrzysztofLatuszy´nski,and Gareth O Roberts. Efficient Bernoulli factory MCMC for intractable likelihoods. arXiv preprint arXiv:2004.07471, 2020. [4] Marcin Mider, Paul A Jenkins, Murray Pollock, Gareth O Roberts, and Michael Sørensen. Simulating bridges using confluent diffusions. arXiv preprint arXiv:1903.10184, 2019. [5] Nicholas G Tawn and Gareth O Roberts. Optimal temperature spacing for regionally weight-preserving tempering. arXiv preprint arXiv:1810.05845, 2018. [6] Joris Bierkens, Kengo Kamatani, and Gareth O Roberts. High- dimensional scaling limits of piecewise deterministic sampling algorithms. arXiv preprint arXiv:1807.11358, 2018. [7] Cyril Chimisov, Krzysztof Latuszynski, and Gareth Roberts. Adapting the Gibbs sampler. arXiv preprint arXiv:1801.09299, 2018. [8] Cyril Chimisov, Krzysztof Latuszynski, and Gareth Roberts. Air Markov chain Monte Carlo. arXiv preprint arXiv:1801.09309, 2018. [9] Paul Fearnhead, Krzystof Latuszynski, Gareth O Roberts, and Giorgos Sermaidis. Continuous-time importance sampling: Monte Carlo methods which avoid time-discretisation error. arXiv preprint arXiv:1712.06201, 2017. [10] Fl´avioB Gon¸calves, Krzysztof GLatuszy´nski,and Gareth O Roberts. Exact Monte Carlo likelihood-based inference for jump-diffusion processes. arXiv preprint arXiv:1707.00332, 2017. [11] Andi Q Wang, Murray Pollock, Gareth O Roberts, and David Stein- saltz. Regeneration-enriched Markov processes with application to Monte Carlo. to appear in Annals of Applied Probability, 2020.
    [Show full text]
  • Computational and Financial Econometrics (CFE 2017)
    CFE-CMStatistics 2017 PROGRAMME AND ABSTRACTS 11th International Conference on Computational and Financial Econometrics (CFE 2017) http://www.cfenetwork.org/CFE2017 and 10th International Conference of the ERCIM (European Research Consortium for Informatics and Mathematics) Working Group on Computational and Methodological Statistics (CMStatistics 2017) http://www.cmstatistics.org/CMStatistics2017 Senate House & Birkbeck University of London, UK 16 – 18 December 2017 ⃝c ECOSTA ECONOMETRICS AND STATISTICS. All rights reserved. I CFE-CMStatistics 2017 ISBN 978-9963-2227-4-2 ⃝c 2017 - ECOSTA ECONOMETRICS AND STATISTICS Technical Editors: Gil Gonzalez-Rodriguez and Marc Hofmann. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any means without the prior permission from the publisher. II ⃝c ECOSTA ECONOMETRICS AND STATISTICS. All rights reserved. CFE-CMStatistics 2017 International Organizing Committee: Ana Colubi, Erricos Kontoghiorghes, Marc Levene, Bernard Rachet, Herman Van Dijk. CFE 2017 Co-chairs: Veronika Czellar, Hashem Pesaran, Mike Pitt and Stefan Sperlich. CFE 2017 Programme Committee: Knut Are Aastveit, Alessandra Amendola, Josu Arteche, Monica Billio, Roberto Casarin, Gianluca Cubadda, Manfred Deistler, Jean-Marie Dufour, Ekkehard Ernst, Jean-David Fermanian, Catherine Forbes, Philip Hans Franses, Marc Hallin, Alain Hecq, David Hendry, Benjamin Holcblat, Jan Jacobs, Degui Li, Alessandra Luati, Richard Luger, J Isaac Miller, Claudio Morana, Bent
    [Show full text]
  • By Alex Reinhart Yosihiko Ogata
    STATISTICAL SCIENCE Volume 33, Number 3 August 2018 A Review of Self-Exciting Spatio-Temporal Point Processes and Their Applications .................................................................................Alex Reinhart 299 Comment on “A Review of Self-Exciting Spatiotemporal Point Process and Their Applications”byAlexReinhart.............................................Yosihiko Ogata 319 Comment on “A Review of Self-Exciting Spatio-Temporal Point Process and Their Applications”byAlexReinhart..........................................Jiancang Zhuang 323 Comment on “A Review of Self-Exciting Spatio-Temporal Point Processes and Their Applications”byAlexReinhart..................................Frederic Paik Schoenberg 325 Self-Exciting Point Processes: Infections and Implementations .............Sebastian Meyer 327 Rejoinder: A Review of Self-Exciting Spatio-Temporal Point Processes and Their Applications..................................................................Alex Reinhart 330 On the Relationship between the Theory of Cointegration and the Theory of Phase Synchronization..................Rainer Dahlhaus, István Z. Kiss and Jan C. Neddermeyer 334 Confidentiality and Differential Privacy in the Dissemination of Frequency Tables ......................Yosef Rinott, Christine M. O’Keefe, Natalie Shlomo and Chris Skinner 358 Piecewise Deterministic Markov Processes for Continuous-Time Monte Carlo .....................Paul Fearnhead, Joris Bierkens, Murray Pollock and Gareth O. Roberts 386 Fractionally Differenced Gegenbauer Processes
    [Show full text]
  • Network Cross-Validation by Edge Sampling
    Biometrika (2020), 107,2,pp. 257–276 doi: 10.1093/biomet/asaa006 Printed in Great Britain Advance Access publication 4 April 2020 Network cross-validation by edge sampling By TIANXI LI Department of Statistics, University of Virginia, B005 Halsey Hall, 148 Amphitheater Way, Charlottesville, Virginia 22904, U.S.A. [email protected] Downloaded from https://academic.oup.com/biomet/article/107/2/257/5816043 by guest on 01 July 2021 ELIZAVETA LEVINA AND JI ZHU Department of Statistics, University of Michigan, 459 West Hall, 1085 South University Avenue, Ann Arbor, Michigan 48105, U.S.A. [email protected] [email protected] Summary While many statistical models and methods are now available for network analysis, resampling of network data remains a challenging problem. Cross-validation is a useful general tool for model selection and parameter tuning, but it is not directly applicable to networks since splitting network nodes into groups requires deleting edges and destroys some of the network structure. In this paper we propose a new network resampling strategy, based on splitting node pairs rather than nodes, that is applicable to cross-validation for a wide range of network model selection tasks. We provide theoretical justification for our method in a general setting and examples of how the method can be used in specific network model selection and parameter tuning tasks. Numerical results on simulated networks and on a statisticians’ citation network show that the proposed cross-validation approach works well for model selection. Some key words: Cross-validation; Model selection; Parameter tuning; Random network. 1. Introduction Statistical methods for analysing networks have received a great deal of attention because of their wide-ranging applications in areas such as sociology, physics, biology and the medical sciences.
    [Show full text]