Multivariate Statistical Methods to Analyse Multidimensional Data in Applied Life Science

Dissertation for the award of the doctoral degree in natural sciences (Dr. rer. nat.), submitted to the Naturwissenschaftliche Fakultät III (Agrar- und Ernährungswissenschaften, Geowissenschaften und Informatik) of the Martin-Luther-Universität Halle-Wittenberg

Submitted by Diana Trutschel (née Boronczyk), born on 18.02.1979 in Hohenmölsen

Reviewers:
1. Prof. Dr. Ivo Grosse, MLU Halle/Saale
2. Dr. Steffen Neumann, IPB Halle/Saale
3. Prof. Dr. André Scherag, FSU Jena

Date of defence: 18.04.2019

Eidesstattliche Erklärung / Declaration under Oath

Ich erkläre an Eides statt, dass ich die Arbeit selbstständig und ohne fremde Hilfe verfasst, keine anderen als die von mir angegebenen Quellen und Hilfsmittel benutzt und die den benutzten Werken wörtlich oder inhaltlich entnommenen Stellen als solche kenntlich gemacht habe.

I declare under penalty of perjury that this thesis is my own work entirely and has been written without any help from other people. I used only the sources mentioned and included all the citations correctly both in word or content.

__________________________ ____________________________________________
Datum / Date Unterschrift des Antragstellers / Signature of the applicant

This is a cumulative thesis comprising five research articles that have previously been published in peer-reviewed international journals. The publications are listed below; the first authors are underlined and my name (Trutschel) is marked in bold.

1. Trutschel, Diana and Schmidt, Stephan and Grosse, Ivo and Neumann, Steffen, "Experiment design beyond gut feeling: statistical tests and power to detect differential metabolites in mass spectrometry data", Metabolomics, 2015, available at: https://link.springer.com/content/pdf/10.1007/s11306-014-0742-y.pdf
2. Trutschel, Diana and Schmidt, Stephan and Grosse, Ivo and Neumann, Steffen, "Joint analysis of dependent features within compound spectra can improve detection of differential features", Frontiers in Bioengineering and Biotechnology, 2015, available at: https://www.frontiersin.org/articles/10.3389/fbioe.2015.00129/full
3. Mönchgesang, Susann and Strehmel, Nadine and Trutschel, Diana and Westphal, Lore and Neumann, Steffen and Scheel, Dierk, "Plant-to-Plant Variability in Root Metabolite Profiles of 19 Arabidopsis thaliana Accessions Is Substance-Class-Dependent", International Journal of Molecular Sciences, 2016, available at: http://www.mdpi.com/1422-0067/17/9/1565
4. Trutschel, Diana and Palm, Rebecca and Holle, Bernhard and Simon, Michael, "Methodological approaches in analysing observational data: a practical example on how to address clustering and selection bias", International Journal of Nursing Studies, 2017, available at: http://www.sciencedirect.com/science/article/pii/S0020748917301426?via%3Dihub
5. Palm, Rebecca and Trutschel, Diana and Simon, Michael and Bartholomeyczik, Sabine and Holle, Bernhard, "Differences in Case Conferences in Dementia Specific vs Traditional Care Units in German Nursing Homes: Results from a Cross-Sectional Study", Journal of the American Medical Directors Association, 2016, available at: https://www.jamda.com/article/S1525-8610(15)00557-5/fulltext

I hereby declare that the copyright of the content of the articles Trutschel et al., 2015b (2), Mönchgesang et al., 2016 (3) and Trutschel et al., 2017 (4) is held by the authors (under Creative Commons License).

I hereby declare that the copyright of the content of the article Trutschel et al., 2015a (1) is held by Springer Science+Business Media New York (© 2015).

I hereby declare that the copyright of the content of the article Palm et al., 2016 (5) is held by AMDA – The Society for Post-Acute and Long-Term Care Medicine (© 2016).
Zusammenfassung

Applied life sciences are interdisciplinary research fields that require extensive statistical and computational methods to organise, visualise and analyse collected data, particularly as the complexity of the data in these fields keeps increasing. For data analysis, in turn, it is important to design studies carefully and to choose suitable analysis methods in order to obtain valid results for scientific decisions. Achieving this requires knowledge of the structure and properties of the data, regardless of the scientific field. User-friendly programs that package complicated mathematical and computational methods and make them accessible to practitioners have likewise become indispensable.

The focus of this dissertation is on the methodological development of such procedures as well as on their application to the analysis of data from real studies. The challenges in data analysis arise from the differing properties of the data and are illustrated here using two life sciences: metabolomics and health care. My work in both areas has shown that, although the two sciences try to answer different questions, the methodological procedure and the mathematical approaches are similar.

Metabolomics is a key discipline in systems biology. It studies the complete set of small molecules in an organism, the metabolome. Mass spectrometry methods are often used to identify and quantify these small molecules (metabolites) in such complex mixtures. Metabolome research deals with the metabolic and regulatory mechanisms that influence the growth, development and stress response of organisms. Experiments of an analytical character make up a large part of the work needed to obtain this information. The data from such experiments must, however, be evaluated with suitable methods.

Nursing science is one branch of health care. Among other things, it aims to guide nursing practice and to improve the care and quality of life of patients. Because of the ageing of society, there is now increased interest in reducing health care costs and preserving the quality of life of people with neurocognitive disorders (dementia), as well as in easing care and protecting caregivers from extreme workloads. Addressing these questions sometimes means studying very complex systems, so the collection and analysis of data cannot follow a fixed pattern; instead, flexible answers are needed, depending on the question asked about the care of people with dementia.

In both scientific fields specific scientific questions are asked. The properties of the data obtained to answer these questions can differ greatly between the two fields or be similar. But regardless of how much the two fields differ, in most cases the data contain dependencies and correlations between different variables, which are analysed with multivariate methods.

This thesis contains three methodological articles that investigate different multivariate methods, and two articles that present the analysis of a real study using these methods.
Summary

Applied life sciences are interdisciplinary fields that require profound statistical and computational methods to organize, visualize and analyse the obtained data, especially as the complexity of the data has grown. For data analysis, carefully designed studies and appropriate methods are important so that conclusions rest on valid results. This requires knowledge of data structure and characteristics, regardless of the scientific field. Furthermore, user-friendly applications that make difficult mathematical and computational methods available to practitioners are essential for these applied sciences.

This thesis takes a methodological point of view and also shows the application of the provided methods to the analysis of data from real studies. The challenges of data analysis, which depend on several data characteristics, are shown within two applied life sciences: metabolomics and health care. My work in both fields has shown that the possible solutions and mathematical approaches are similar although the two sciences address different issues.

Metabolomics, a key discipline in systems biology, investigates the metabolome, the complete set of small molecules in an organism. Mass spectrometry based methods are often used to identify and quantify these molecules, the metabolites, in complex mixtures. Metabolomic research helps to gain insights into the metabolic and molecular regulatory mechanisms controlling the growth, development and stress responses of organisms. A major part of this work is therefore conducting experiments of an analytical character, which have to be analysed with appropriate methods to obtain these insights.

Nursing science is one part of health care, in which clinical nursing service research aims to guide nursing practice and to improve the care and quality of life of patients. Owing to population ageing, there is nowadays a special interest within nursing service research in neurocognitive disorders, popularly