Contribution of Prof. H. Akaike Tsttitilmdlito Statistical Modeling of Complex Systems

Total Page:16

File Type:pdf, Size:1020Kb

Contribution of Prof. H. Akaike Tsttitilmdlito Statistical Modeling of Complex Systems Contribution of Prof. H. Akaike tSttitilMdlito Statistical Modeling of Complex Systems Research Organization of Information and Systems Genshiro Kitagawa Research Organization of ISI, Dublin, August 23, 2011 Information and Systems Hirotugu Akaike(1927-2009) Research Organization of Information and Systems 2 Brief History 1952 Graduated from Math. Dept., Tokyo University Researcher of the Institute of Statistical Mathematics 1962 Head of 2nd Section, 1st Division 1973 Director of 5th Division 1985 Director of Dept. of Prediction and Control 1986 Director-General of ISM (-1994) 1988 Member of Science Council of Japan (- 1991) Head of Dept. of Statistical Science, Graduate University of Advanced Study 1994 PfEProf. Emerit itThIus, The Inst tSttitMth. Statist. Math. Prof. Emeritus, Graduate Univ. for Advance Study Research Organization of Information and Systems 3 Prizes 1972 Ishikawa Prize (Establishment of statistical analysis and control method for dynamic systems) 1980 Okochi Prize (Research and realization of optimal steam temperature control of thermal electric plant) 1989 AhiPiAsahi Prize (Research on statistics, in particular theory and applications of AIC) The Purple Ribbon Medal (Statistics, in particular time series analysis and its applications) 1996 The 1st Japan Statistical Society Prize (Contributions to statistical theory and its applications ) 2000 The Order of the Sacred Treasure 2006 Kyoto Prize ( Major contribution to statistical science and modeling with the development of AIC) Fellow of ASA,,,, RSS, IMS, IEEE, JSS Research Organization of Information and Systems 4 Laureate of 22nd Kyoto Prize "Major contribution to statistical science and modeling with the development of the Akaike Information Criterion (AIC)” Research Organization of Information and Systems 5 Ontline 1. Launching period: Mathematical analysis and structural modeling (1950’s) 2. Frequency domain analysis (1960’s) 3. Time domain time series modeling (1970’s) 4. AIC and statistical modelingg( (1970’s) 5. Bayes modeling (1980’s) Research Organization of Information and Systems 6 1. Launching Period (1952-1960) Mathematical Analysis • DiiDecision process Bird cage • Evaluation of probability distribution • Monte Carlo method solving linear equation • Computation of eigenvalues • Converggpgence of optimum gradient method Convergence Analysis of Optimal Gradient Method AISM(1959) Analyygzed limiting behavior of the op pgtimal gradient method and showed the poor convergence property, known as bird cage. Became the fffoundation of nonlinear optimization methods such as Conjugate Gradient method and Quasi Newton method … Kowarik and Osbone (1968) Research Organization of Information and Systems 7 Structural Modeling At this time, he was rather interested in real-world problems. And f’from very early stage in the1950’, he realized the conventional linear stationary models are unrealistic and the importance of developing a model that fully takes into account of the structure of the process. Control of filature production process (Joint work with Dr. Shimazaki of the Sericultural Experiment Station) • Developed a control method based on gap process modeling • This method provided a reference process to detect abnormalities in actual reeling process. •Brouggght significant innovation for silk production in Ja pan Research Organization of Information and Systems 8 2. Frequency Domain Analysis: (1960-70) • By 1960, he established contacts with engineers It took almost 10 years for me to develop substantial contacts, and this happened largely by coincidence. Statistical Science (1995) • He realized that linear stationary modeling is utilized and many unsolved important problems are left for statisticians • Smoothing method for power spectrum estimation (Suspension system of a car, Dr. Kaneshige, Isuzu Motor Co.) Akaike windows •Estimation of frequency response function (Dr. Yamanouchi, Transportation Tech. Res. Inst.) Developed a method of estimating frequency response function from observations under normal steering (without using sinusoidal inputs) Research Organization of Information and Systems 9 2. Frequency Domain Analysis: (1960-70) Developed practical method of analyzing complex systems Organized workshop on practical use of time series method • Ship’s yawing, rolling (Yamanouchi, Kawashima) • Rolling of a car(Kawamura) • Engine of a car(Kaneshige) • Response of an air-plane to side wind(Takeda) • Hydroelectric power plant(Nakamura) • Estimation of underground structure through micro tremor • TiTsunami (Kinosita) • EEG analysis (Suhara) Research Organization of Information and Systems AISM supplement (1964) 10 Difficulty in Feedback Systems rotary kiln Cementrotary rotary kiln kiln Cement Rotary Kiln, Chichibu Cement Company rotary kiln •rotkilitary kiln is a comp litdlicated fdbktfeedback system, consitiisting of many variables such as raw material feed, fuel feed, gas damper angle, temperature at various locations, etc. • Conventional spectral analysis methods were useless to identify the source of fluctuation • In the frequency domain analysis, we cannot explicitly utilize the physical realizability of the system n1 A Limitation of frequency y n 2 y n1 domain approach Return to time domain modeling B Madison syypmposium (1967)(()1967) n2 Research Organization of Information and Systems 11 3. Time Domain Time Series Modeling (1968-) y ( y , , y )T Vector time series n n1 n wn1 Structural model B m 1 y ni kij ynk, j ni n1 ji k 1 A1 ni ki nk,i wni k 1 y n2 y n1 A2 VAR model representation n2 m y A y w n k nk n B2 k 1 Estimation of kij and ki through Ak wn2 Research Organization of Information and Systems 12 Spectrum & Power Contribution VAR model Power spectrum of yni m y A y w w ~ N(0,) 2 2 n k nk n n pii ( f ) | bij ( f ) | j k1 j1 Cross spectrum * p( f ) B( f ) B( f ) Power contribution m B( f ) (b ) A ep(exp(2i f ) 2 2 ij | bij ( f ) | j 1 r ij ( f ) pii ( f ) Order m is unknown! Proportion of the contribution from noise input e (n) in the Proper selection of the order j power spectrum of yi(n) at is crucial freqqyuency f. Research Organization of Information and Systems 13 Predictive Point of View and FPE Conventional fitting view point Predictive view point “True” “True” Distribution Model Distribution Estimation Estimation Prediction Future Present Present Model Data Data Data Evaluation Scalar AR model m 2 Multivariate case: MFPE, FPEC y n ak ynk wn , wn ~ N(0, m ) k 1 FPE is a forerunner of AIC FPE (Final Prediction Error) 2 nm1 n log FPEm nlogˆm nlog nm1 nm1 2 2 FPEm ˆm nlogˆ (m1) nm1 m Research Organization of Information and Systems Akaike (1969) 14 Realization of Statistical Control Cement rotary kiln VARX model m m yn Aj yn j Bjrn j n j1 j1 State –Space Representation xn Fxn1 Grn n yn Hxn Electric power plant Nonlinear control Japan Bailay Ozaki-Peng Criterion Function L T T J Exn Qxn rn Rrn Roll-reducing j1 Nakamura-Akaike(1981) control Mitsui ship building Statistical Optimal controller Autopilot Oda * rn Kxn Hrn1 Noise-adaptive controller Yokokawa Kitagawa-Akaike- TIMSAC Akaike-Nakagawa(1972) Ohtsu (2006) Ohtsu-Horigome-Kitagawa(1978) Research Organization of Information and Systems 1970 1980 2000 15 Analysis and Control of Feedback Systems Metabolism Mulivariate Wada AR model Ship’s Economics dynamics Tanaka-Sato Ohtsu-Horigome Finance, CDS Nuclear Tsuda Power power plant contribution Hitachi ((Fukunihsi) Man-machine system IhiIshiguro-Ohya Akaike’s linear time series modeling starts from multivariate case. 1968 1977 1993 2000 Research Organization of Information and Systems 16 Main Contributions in Time Domain Modeling 1. Developed practical method of VAR modeling • Analysis of feedback systems • Statistical controller 2. Use of state-space model • For optimal controller design • ML E sti mati on and id entifi cat ion o f VARMA mo de l • BAYSEA: Bayesian method for seasonal adjustment 3. Oder determination • FPE, MFPE, FPEC •AIC 4. Developed TIMSAC (Time Series Analysis and Control program package) Research Organization of Information and Systems 17 4. AIC and Statistical Modeling (1971-) Predictive Point of View “True” Distribution Prediction Predictive Estimation Future Pont of view Present Model Data Data Evaluation MFPE (1) Predictive point of view (2) Prediction by distribution AIC (3) Evaluation by K-L information Factor analysis Research Organization of Information and Systems 18 Predictive Point of View and AIC Model g(y): distribution of future data, f(y): model Kullback-Leibler Information g I(g; f ) EY log EY log g EY log f f Expected Log-Likelihood Unknown constant E log f (Y ) log f ( y)g( y)ddy Y Log-Likelihood x1 , , xn ~ g (x) 1 n log f ( X i ) EY log f (Y ) n i1 MLE ˆ ˆ () log f (X |) max ( ) (X ) Multi-model Situation: Bias correction AIC Research Organization of Information and Systems Akaike (1973,1974) 19 Bias Correction ˆ ˆ b(G) EX log f (X |(X )) nEY log f (Y |(X )) EX [D] Log-Likelihood Generic Information Criterion logfX ( | ( X )) D1 log f ( X |ˆ( X )) b(G) D D2 1 D b(G) trI(G)J(G) m 3 I(G): Fisher Information EEdxpected J(G): -(Expected Hessian) Log-Likelihood EY log f (Y |(X)) ˆ AIC 2log f (X |(X )) 2m * 2(log L(ˆ)) 2(dim of ) Research Organization of Information and Systems 20 Number of papers citing Akaike’s paper by Thomson (ISI) Year by year cites Cited Research Area 1200 5142 IEEEac(1974) AK(1973) 2907 Akad. Kiado (1973) 1000 2550 Math. Stat. IEEE(1974) 1115 Psychometrika (1987) 800 1987 Medical, Health 818 AISM(1969) 600 431 AISM(1970) 400 359 Math. Science (1978) 1868 Bio logy 200 281 Appl. Stat. (1977) 1833 Engineering 265 Bayesian Stat. (1980) 0 1973 1978 1983 1988 1993 1998 2003 2008 204 System Ident. (1980) 1755 Computer, Info. Sci. 135 Celebration (1986) 12000 Cumulative cites 1376 Psychology 127 AISM(1969) 10000 AK(1973) 123 AISM(1974) IEEE(1974) 1070 Econom ics 8000 13104 total 785 Social Science 6000 773 Environmental Sci. BByy GoogGooglele ScScholarholar 4000 673 Physics, Chemistry 14265 IEEE-ac (1974) 610 Geophysics 2000 8588 Acad.
Recommended publications
  • Strength in Numbers: the Rising of Academic Statistics Departments In
    Agresti · Meng Agresti Eds. Alan Agresti · Xiao-Li Meng Editors Strength in Numbers: The Rising of Academic Statistics DepartmentsStatistics in the U.S. Rising of Academic The in Numbers: Strength Statistics Departments in the U.S. Strength in Numbers: The Rising of Academic Statistics Departments in the U.S. Alan Agresti • Xiao-Li Meng Editors Strength in Numbers: The Rising of Academic Statistics Departments in the U.S. 123 Editors Alan Agresti Xiao-Li Meng Department of Statistics Department of Statistics University of Florida Harvard University Gainesville, FL Cambridge, MA USA USA ISBN 978-1-4614-3648-5 ISBN 978-1-4614-3649-2 (eBook) DOI 10.1007/978-1-4614-3649-2 Springer New York Heidelberg Dordrecht London Library of Congress Control Number: 2012942702 Ó Springer Science+Business Media New York 2013 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer.
    [Show full text]
  • Press Release
    Japan Statistical Society July 10, 2020 The Institute of Statistical Mathematics (ISM) The Japan Statistical Society (JSS) Announcement of the Awardee of the Third Akaike Memorial Lecture Award Awardee/Speaker: Professor John Brian Copas (University of Warwick, UK) Lecture Title: “Some of the Challenges of Statistical Applications” Debater: Professor Masataka Taguri(Yokohama City University) Dr. Masayuki Henmi (The Institute of Statistical Mathematics) Date/Time: 16:00-18:00 September 9, 2020 1. Welcome Speech and Explanation of the Akaike Memorial Lecture Award 2. Akaike Memorial Lecture 3. Discussion with Young Scientists and Professor Copas’s rejoinder Venue: Toyama International Conference Center (https://www.ticc.co.jp/english/) 1-2 Ote-machi Toyama-city Toyama 930-0084MAP,Japan • Detailed information will be uploaded on the website of ISM (http://www.ism.ac.jp/index_e.html) and other media. Professor John B. Copas (current age, 76) Emeritus Professor, University of Warwick, UK Professor Copas, born in 1943, is currently Professor Emeritus at the University of Warwick. With a focus on both theory and application, his works provide a deep insight and wide perspective on various sources of data bias. His contributions span over a wide range of academic disciplines, including statistical medicine, ecometrics, and pscyhometrics. He has developed the Copas selection model, a well-known method for sensitivity analysis, and the Copas rate, a re-conviction risk estimator used in the criminal justice system. Dr. Hirotugu Akaike The late Dr. Akaike was a statistician who proposed the Akaike Information Criterion (AIC). He established a novel paradigm of statistical modeling, characterized by a predictive point of view, that was completely distinct from traditional statistical theory.
    [Show full text]
  • IMS Bulletin 39(5)
    Volume 39 • Issue 5 IMS1935–2010 Bulletin June 2010 IMS Carver Award: Julia Norton Contents Julia A. Norton, Professor Emerita in the 1 IMS Carver Award Department of Statistics and Biostatistics at California State University, East Bay in 2–3 Members’ News: new US National Academy mem- Hayward, California, USA has received bers; Donald Gaver; Marie the 2010 Carver Medal from the Institute Davidian; tributes to Kiyosi of Mathematical Statistics. The presenta- Itô, Ehsanes Saleh tion of the medal will take place August 4 Laha Award recipients 10, 2010 at a special ceremony during the IMS Annual Meeting in Gothenburg, 5 Statistics museum Sweden. Rietz lecture: Michael Stein 6 Professor Norton receivesThe candidate the award for IMS President-Elect is Medallion Preview: Marek for contributions to the IMS throughout 7 Ruth Williams Biskup her career, and especially for her conscien- tious and pivotal service as IMS Treasurer 8 NISS/SAMSI award; during the period when the IMS Business Samuel Kotz Julia A. Norton Office was moved from California to 9 Obituary: Hirotugu Akaike Ohio. 10 Annual survey She said she was greatly surprised to receive the award, commenting, “I just said yes to all sorts of new ideas that came the way of the Institute during my two terms 11 Rick’s Ramblings: with Larry Shepp as Treasurer.” She added, “Like most things, the best part of the job is the fantastic teamwork displayed by the staff. I am most proud of my hand in helping to hire and Terence’s Stuff: Changes, 12 keep [Executive Director] Elyse in our employ.” courses and committees The Carver Medal was created by the IMS in 2002 13 IMS meetings in honor of Harry C.
    [Show full text]
  • Akaike Receives 2006 Kyoto Prize
    Akaike Receives 2006 Kyoto Prize In June 2006 the Inamori global network of closely linked systems. Modern Foundation announced statistical sciences are expected to help our un- the laureates who will re- derstanding of the study of such complicated, un- ceive its 22nd annual certain, and dynamic systems, or the study of sit- Kyoto Prize, Japan’s high- uations where only incomplete information is est private award for life- available. The main role of statistical sciences must time achievement, pre- be to give useful scientific methodologies for the sented to individuals and development of new technologies and for the fur- groups worldwide who ther development of society, which is characterized have contributed signifi- by increased uncertainty. One of the characteris- cantly to humankind’s tics of the statistical sciences is their interdisci- betterment. Receiving the plinary nature, with applicability in various fields. prize in the Basic Sciences For example, the role of the statistical sciences has Hirotugu Akaike category is the Japanese recently become increasingly important in the un- statistical mathematician derstanding and forecasting of phenomena in eco- HIROTUGU AKAIKE, professor emeritus at the Institute nomic-related fields, such as finance and insur- of Statistical Mathematics. The prize consists of a ance; safety-related fields, including pharma- diploma, a Kyoto Prize Medal of 20-karat gold, and ceuticals, food, and transportation; natural phe- a cash gift of 50 million yen (approximately nomena, such as weather, natural disasters, and the US$446,000). environment; and in the management of huge sys- tems. The Work of Hirotugu Akaike Starting in the early 1970s Akaike explained the Rapid advances in science and technology in the importance of modeling in analysis and in fore- twentieth century brought numerous benefits to so- casting.
    [Show full text]
  • Preface: Special Issue in Honor of Dr. Hirotugu Akaike
    Ann Inst Stat Math (2010) 62:1–2 DOI 10.1007/s10463-009-0269-6 Preface: Special issue in honor of Dr. Hirotugu Akaike Published online: 10 December 2009 © The Institute of Statistical Mathematics, Tokyo 2009 We deeply regret to announce that Dr. Hirotugu Akaike passed away on August 4, 2009. In 2007, because of his excellent contribution to the field of statistical sci- ence, we planned to publish “Special Issue of the Annals of the Institute of Statistical Mathematics in Honor of Dr. Hirotugu Akaike”. We received the sad news of his death, just after all editorial process has completed. This issue begins with an article by Dr. Akaike’s last manuscript. Dr. Hirotugu Akaike was awarded the 2006 Kyoto Prize for his major contribution to statistical science and modeling with the Akaike Information Criterion (AIC) with the praise that “In the early 1970’s, Dr. Hirotugu Akaike formulated the Akaike Infor- mation Criterion (AIC), a new practical, yet versatile criterion for the selection of statistical models, based on basic concepts of information mathematics. This criterion established a new paradigm that bridged the world of data and the world of mod- eling, thus contributing greatly to the information and statistical sciences” (Inamori Foundation). In 1973, he proposed the AIC as a natural extension of the log-likelihood. The most natural way of applying the AIC was to use it as the model selection or order selection criterion. In the minimum AIC procedure, the model with the minimum value of the AIC is selected as the best one among many possible models.
    [Show full text]
  • Model Selection and Inference
    INFORMATION THEORY AND LOG-LIKELIHOOD MODELS: A BASIS FOR MODEL SELECTION AND INFERENCE Full reality cannot be included in a model; thus we seek a good model to approximate the effects or factors supported by the empirical data. The selection of an appropriate approximating model is critical to statistical inference from many types of empirical data. Information theory (see Guiasu 1977 and Cover and Thomas 1991) and, in particular the Kullback-Leibler (Kullback and Liebler 1951) “distance," or “information" forms the deep theoretical basis for data-based model selection. Akaike (1973) found a simple relationship between expected Kullback-Leibler information and Fisher's maximized log-likelihood function (see deLeeuw 1992 for a brief review). This relationship leads to a simple, effective, and very general methodology for selecting a parsimonious model for the analysis of empirical data. Some Background Well over a century ago measures were derived for assessing the “distance" between two models or probability distributions. Most relevant here is Boltzmann's (1877) concept of generalized entropy in physics and thermodynamics (see Akaike 1985 for a brief review). Shannon (1948) employed entropy in his famous treatise on communication theory. Kullback and Leibler (1951) derived an information measure that happened to be the negative of Boltzmann's entropy: now referred to as the Kullback-Leibler (K-L) distance. The motivation for Kullback and Leibler's work was to provide a rigorous definition of “information" in relation to Fisher's “sufficient statistics." The K-L distance has also been called the K-L discrepancy, divergence, information and number – these terms are synonyms, we tend to use distance or information in the material to follow.
    [Show full text]
  • Emanuel Parzen and a Tale of Two Kernels
    DEPARTMENT OF STATISTICS University of Wisconsin 1300 University Ave. Madison, WI 53706 TECHNICAL REPORT NO. 1183 February 28, 2017 Emanuel Parzen and a Tale of Two Kernels An Appreciation of Manny Parzen. Grace Wahba1 Department of Statistics, Department of Computer Sciences and Department of Biostatistics and Medical Informatics University of Wisconsin, Madison 1Research supported in part by CPCP and NSF Grant DMS-1308877 Emanuel Parzen and a Tale of Two Kernels Grace Wahba1 Department of Statistics, Department of Computer Sciences and Department of Biostatistics and Medical Informatics University of Wisconsin, Madison February 27, 2017 Abstract I was Manny Parzen's fifth student, according to the Mathemat- ical Genealogy Project, receiving my PhD under his supervision in 1966, and remaining at Stanford with him for a postdoc, leaving for the University of Wisconsin-Madison in 1967, where I remain as I write. To be Manny's PhD student at Stanford in the 60's was noth- ing short of bliss. Many was full of ideas, the weather was warm and sunny, sometimes classes were held on the grass in front of the old Sequoia Hall with Reproducing Kernel Hilbert Spaces on the impro- vised blackboard. A lovely, elegant dinner at their Stanford home that Manny and Carol threw for a group of graduate students bring back fond memories. Manny and Carol remained lifelong friends. Manny launched me on a 50 year academic year, buttressed by what I learned about RKHS from him in those days. In this article I first describe many fond memories over the years, and then present some techni- cal results and thoughts relating to RKHS, Parzen density estimates, Statistical Machine learning and related topics.
    [Show full text]
  • Towards Group-Based Configuration
    18th International Configuration Workshop Proceedings of the 18th International Configuration Workshop within CP 2016 Conference Edited by Élise Vareilles, Lars Hvam, Cipriano Forza and Caroline Becker September 5-6, 2016 Toulouse, France Organized by Centre Génie Industriel, Mines Albi, France 2 ISBN: 979-10-91526-04-3 École des Mines d’Albi-Carmaux Campus Jarlard Route de Teillet Albi 81013 Cedex 09 France Proceedings of the 18th International Configuration Workshop September 5-6, 2016, Toulouse, France 3 18th International Configuration Workshop Chairs Élise Vareilles, Mines Albi, France Lars Hvam, Technical University of Denmark, Denmark Cipriano Forza, University of Padova, Italy Caroline Becker, PROS Toulouse, France Program Committee Michel Aldanondo, Mines Albi, France Andres Barco, Mines Albi, France Caroline Becker, PROS, France Jean-Guillaume Fages, COSLING S.A.S., France Andreas Falkner, Siemens AG, Austria Hélène Fargier, IRIT, Université de Toulouse, France Alexander Felfernig, Graz University of Technology, Austria Cipriano Forza, Universita di Padova, Italy Gerhard Friedrich, Alpen-Adria-Universitaet Klagenfurt, Austria Paul Gaborit, Mines Albi, France Luis Garcés, Mines Albi, France Chiara Grosso, Universita di Padova, Italy Albert Haag, SAP SE, Germany Alois Haselboeck, Siemens AG, Austria Lothar Hotz, HITeC e.V. / University of Hamburg, Germany Lars Hvam, Technical University of Denmark, Denmark Dietmar Jannach, TU Dortmund, Germany Manuel Korell, Finja AB / Achoice, Denmark Thorsten Krebs, encoway GmbH, Germany Katrin Kristjansdottir,
    [Show full text]
  • Factor Analysis and Aic Hirotugu Akaike
    PSYCHOMETRIKA--VOL.52, NO. 3, 317-332 SEPTEMBER 1987 SPECIAL SECTION FACTOR ANALYSIS AND AIC HIROTUGU AKAIKE THE INSTITU~ OF STATISTICAL MATHEMA'F/CS The information criterion AIC was introduced to extend the method of maximum likelihood to the multimodel situation. It was obtained by relating the successful experience of the order determination of an autoregressive model to the determination of the number of factors in the maximum likelihood factor analysis. The use of the AIC criterion in the factor analysis is particu- larly interesting when it is viewed as the choice of a Bayesian model. This observation shows that the area of application of AIC can be much wider than the conventional i.i.d, type models on which the original derivation of the criterion was based. The observation of the Bayesian structure of the factor analysis model leads us to the handling of the problem of improper solution by introducing a natural prior distribution of factor loadings. Key words: factor analysis, maximum likelihood, information criterion AIC, improper solution, Bayesian modeling. 1. Introduction The factor analysis model has been producing thought provoking statistical prob- lems. The model is typically represented by y(n) = Ax(n) + u(n), n = 1, 2, ..., N where y(n) denotes a p-dimensional vector of observations, x(n) a k-dimensional vector of factor scores, u(n) a p-dimensional vector of specific variations. It is assumed that the variables with different n's are mutually independent and that x(n) and u(n) are mutually independently distributed as Gaussian random variables with variance covariance matrices I k x ~ and W, respectively, where W is a diagonal matrix.
    [Show full text]
  • Jim Pitman I Take This Opportunity to Report on Some Slower in Statistics
    Volume 35 • Issue 9 IMS Bulletin November 2006 From the IMS President: Jim Pitman I take this opportunity to report on some slower in statistics. CONTENTS of the new initiatives currently being At the Rio meeting 1 President’s Message undertaken by IMS, and to invite you to in July, IMS council support these initiatives in various ways. supported the 2–3 IMS Members’ News: ASA Fellows; Marvin Recognizing that the interests of IMS creation of two new Zelen; Brad Efron; Ciprian members now span an enormous range electronic journals in Crainiceanu; Mariel Finucane; from pure mathematical theory to col- statistics: a research Jim Pitman Hirotugu Akaike laborations with scientists in numerous journal to be called fields, IMS is launching a new journal, the Electronic Journal of Statistics, and a 4 IT Project Manager The Annals of Applied Statistics, aimed at survey journal, Statistics Surveys. What 5 Nominate: Laha Awards papers in the applied half of this range. is most needed now for success of these 6 Nominate: Other IMS Published quarterly in both print and journals is authors willing to submit high Awards electronic form, our goal is to provide a quality articles to these electronic-only timely and unified forum for all areas of outlets. And wherever you submit your 7 Obituary: Raja Deo; LNMS News applied statistics. The first Editor-in-Chief articles, to increase their visibility and is Brad Efron, supported by Stephen impact, please be sure to place a copy on Profile: David Blackwell 8 Fienberg, Editor for social science, gov- ArXiv, preferably at the time of submis- 9 Meeting report: SPA06 ernment, sample surveys, and economics, sion.
    [Show full text]
  • Model Selection
    12:14 Friday 13th November, 2015 See updates and corrections at http://www.stat.cmu.edu/~cshalizi/mreg/ Lecture 21: Model Selection 36-401, Fall 2015, Section B 10 November 2015 Contents 1 Generalization and Optimism 2 2 Mallow's Cp Statistic 4 2.1 R2 and Adjusted R2 .........................5 3 Akaike Information Criterion (AIC) 6 3.1 Why −dim(S)?............................7 4 Leave-one-out Cross-Validation (LOOCV) 9 4.1 Short-cut Based on Leverage . 10 4.2 Summing Up Cp, AIC, LOOCV . 11 5 Other Model Selection Criteria 11 5.1 k-Fold Cross-Validation . 11 5.2 BIC . 12 6 Stepwise Model Selection 13 7 Inference after Selection 13 7.1 Data Splitting . 15 8 R Practicalities 15 8.1 A Demo . 17 9 Further Reading 19 10 Exercises 20 1 2 1 Generalization and Optimism We estimated our model by minimizing the mean squared error on our data: 1 βb = argmin (y − xb)T (y − xb) b n Different linear models will amount to different choices of the design matrix x | we add or drop variables, we add or drop interactions or polynomial terms, etc., and this adds or removes columns from the design matrix. We might consider doing selecting among models themselves by minimizing the MSE. This is a very bad idea, for a fundamental reason: Every model is too optimistic about how well it will actually predict. Let's be very clear about what it would mean to predict well. The most challenging case would be that we see a new random point, with predictor values X1;:::Xp and response Y , and our old βb has a small expected squared error: 2 23 0 0 p 11 6 ^ X ^ 7 E 4@Y − @β0 + XjβjAA 5 j=1 Here both Y and the X's are random (hence the capital letters), so we might be asking the model for a prediction at a point it never saw before.
    [Show full text]
  • Estat Ic — Display Information Criteria
    Title stata.com estat ic — Display information criteria Syntax Menu for estat Description Option Remarks and examples Stored results Methods and formulas References Also see Syntax estat ic , n(#) Menu for estat Statistics > Postestimation > Reports and statistics Description estat ic displays Akaike’s and Schwarz’s Bayesian information criteria. Option n(#) specifies the N to be used in calculating BIC; see[ R] BIC note. Remarks and examples stata.com estat ic calculates two information criteria used to compare models. Unlike likelihood-ratio, Wald, and similar testing procedures, the models need not be nested to compare the information criteria. Because they are based on the log-likelihood function, information criteria are available only after commands that report the log likelihood. In general, “smaller is better”: given two models, the one with the smaller AIC fits the data better than the one with the larger AIC. As with the AIC, a smaller BIC indicates a better-fitting model. For AIC and BIC formulas, see Methods and formulas. Example 1 In[ R] mlogit, we fit a model explaining the type of insurance a person has on the basis of age, gender, race, and site of study. Here we refit the model with and without the site dummies and compare the models. 1 2 estat ic — Display information criteria . use http://www.stata-press.com/data/r13/sysdsn1 (Health insurance data) . mlogit insure age male nonwhite (output omitted ) . estat ic Akaike's information criterion and Bayesian information criterion Model Obs ll(null) ll(model) df AIC BIC . 615 -555.8545 -545.5833 8 1107.167 1142.54 Note: N=Obs used in calculating BIC; see [R] BIC note .
    [Show full text]