

Probability Distributions Used in Reliability Engineering

Andrew N. O’Connor Mohammad Modarres Ali Mosleh

Center for Risk and Reliability 0151 Glenn L Martin Hall University of Maryland College Park, Maryland

Published by the Center for Risk and Reliability

International Standard Book Number (ISBN): 978-0-9966468-1-9

Copyright © 2016 by the Center for Reliability Engineering University of Maryland, College Park, Maryland, USA

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from The Center for Reliability Engineering, Reliability Engineering Program.

The Center for Risk and Reliability University of Maryland College Park, Maryland 20742-7531

In memory of Willie Mae Webb

This book is dedicated to the memory of Miss Willie Webb, who passed away on April 10, 2007 while working at the Center for Risk and Reliability at the University of Maryland (UMD). She initiated the concept of this book as an aid for students conducting studies in Reliability Engineering at the University of Maryland. Upon passing, Willie bequeathed her belongings to fund a scholarship providing financial assistance to Reliability Engineering students at UMD.

Preface

Reliability engineers are required to combine a practical understanding of science and engineering with statistics. The reliability engineer's understanding of statistics is focused on the practical application of a wide variety of accepted statistical methods. Most reliability texts provide only a basic introduction to probability distributions, or only provide a detailed reference to a few distributions. Most texts in statistics provide theoretical detail which is outside the scope of likely reliability engineering tasks. As such, the objective of this book is to provide a single reference text of closed form probability formulas and approximations used in reliability engineering.

This book provides details on 22 probability distributions. Each distribution section provides a graphical visualization and formulas for distribution parameters, along with distribution formulas. Common statistics such as moments and percentile formulas are followed by likelihood functions and, in many cases, the derivation of maximum likelihood estimates. Bayesian non-informative and conjugate priors are provided, followed by a discussion of the distribution's characteristics and applications in reliability engineering. Each section concludes with online and hardcopy references which can provide further information, followed by the relationship to other distributions.

The book is divided into six parts. Part 1 provides a brief coverage of the fundamentals of probability distributions within a reliability engineering context. Part 1 is limited to concise explanations aimed to familiarize readers. For further understanding the reader is referred to the references. Part 2 to Part 6 cover Common Life Distributions, Bathtub Life Distributions, Univariate Continuous Distributions, Univariate Discrete Distributions, and Bivariate and Multivariate Distributions, respectively.

The authors would like to thank the many students in the Reliability Engineering Program, particularly Reuel Smith, for proofreading.

Contents

PREFACE

CONTENTS

1. FUNDAMENTALS OF PROBABILITY DISTRIBUTIONS

1.1. Probability Theory
1.1.1. Theory of Probability
1.1.2. Interpretations of Probability Theory
1.1.3. Laws of Probability
1.1.4. Law of Total Probability
1.1.5. Bayes' Law
1.1.6. Likelihood Functions
1.1.7. Fisher Information Matrix

1.2. Distribution Functions
1.2.1. Random Variables
1.2.2. Statistical Distribution Parameters
1.2.3. Probability Density Function
1.2.4. Cumulative Distribution Function
1.2.5. Reliability Function
1.2.6. Conditional Reliability Function
1.2.7. 100α% Percentile Function
1.2.8. Mean Residual Life
1.2.9. Hazard Rate
1.2.10. Cumulative Hazard Rate
1.2.11. Characteristic Function
1.2.12. Joint Distributions
1.2.13. Marginal Distribution
1.2.14. Conditional Distribution
1.2.15. Bathtub Distributions
1.2.16. Truncated Distributions
1.2.17. Summary

1.3. Distribution Properties
1.3.1. Median / Mode
1.3.2. Moments of Distribution
1.3.3. Covariance and Correlation

1.4. Parameter Estimation
1.4.1. Probability Plotting Paper
1.4.2. Total Time on Test Plots
1.4.3. Least Mean Square Regression
1.4.4. Method of Moments
1.4.5. Maximum Likelihood Estimates
1.4.6. Bayesian Estimation
1.4.7. Confidence Intervals

1.5. Related Distributions

1.6. Supporting Functions
1.6.1. Beta Function B(x, y)
1.6.2. Incomplete Beta Function B(t; x, y)
1.6.3. Regularized Incomplete Beta Function I(t; x, y)
1.6.4. Complete Gamma Function Γ(k)
1.6.5. Upper Incomplete Gamma Function Γ(k, t)
1.6.6. Lower Incomplete Gamma Function γ(k, t)
1.6.7. Digamma Function ψ(x)
1.6.8. Trigamma Function ψ′(x)

1.7. Referred Distributions
1.7.1. Inverse Gamma Distribution IG(α, β)
1.7.2. Student T Distribution T(α; μ, σ)
1.7.3. F Distribution F(n₁, n₂)
1.7.4. Chi-Square Distribution χ²(v)
1.7.5. Hypergeometric Distribution Hypergeometric(k; n, m, N)
1.7.6. Wishart Distribution Wishart(x; Σ, n)

1.8. Nomenclature and Notation

2. COMMON LIFE DISTRIBUTIONS

2.1. Exponential Continuous Distribution
2.2. Lognormal Continuous Distribution
2.3. Weibull Continuous Distribution

3. BATHTUB LIFE DISTRIBUTIONS

3.1. 2-Fold Mixed Weibull Distribution
3.2. Exponentiated Weibull Distribution
3.3. Modified Weibull Distribution

4. UNIVARIATE CONTINUOUS DISTRIBUTIONS

4.1. Beta Continuous Distribution
4.2. Birnbaum Saunders Continuous Distribution
4.3. Gamma Continuous Distribution
4.4. Logistic Continuous Distribution
4.5. Normal (Gaussian) Continuous Distribution
4.6. Pareto Continuous Distribution
4.7. Triangle Continuous Distribution
4.8. Truncated Normal Continuous Distribution
4.9. Uniform Continuous Distribution

5. UNIVARIATE DISCRETE DISTRIBUTIONS

5.1. Bernoulli Discrete Distribution
5.2. Binomial Discrete Distribution
5.3. Poisson Discrete Distribution

6. BIVARIATE AND MULTIVARIATE DISTRIBUTIONS

6.1. Bivariate Normal Continuous Distribution
6.2. Dirichlet Continuous Distribution
6.3. Multivariate Normal Continuous Distribution
6.4. Multinomial Discrete Distribution

7. REFERENCES



1. Fundamentals of Probability Distributions

1.1. Probability Theory

1.1.1. Theory of Probability

The theory of probability formalizes the representation of probabilistic concepts through a set of rules. The most common reference for formalizing the rules of probability is the set of axioms proposed by Kolmogorov in 1933, where E_i is an event in the event space Ω = ∪_{i=1}^{n} E_i with n different events:

0 ≤ P(E_i) ≤ 1

P(Ω) = 1 and P(∅) = 0

P(E_1 ∪ E_2) = P(E_1) + P(E_2)

when E_1 and E_2 are mutually exclusive.

Other representations of uncertainty exist, such as fuzzy logic and the theory of evidence (Dempster-Shafer model), which do not follow the theory of probability; however, almost all reliability concepts are defined based on probability as the metric of uncertainty. For a justification of probability theory see (Singpurwalla 2006).

1.1.2. Interpretations of Probability

The two most common interpretations of probability are:

• Frequentist Interpretation. In the frequentist interpretation of probability, the probability of an event (failure) is defined as:

P(E) = lim_{n→∞} n_f / n

where n_f is the number of occurrences of the event in n trials. Also known as the classical approach, this interpretation assumes there exists a real probability of an event, p. The analyst uses the observed frequency of the event to estimate the value of p. The more historic events that have occurred, the more confident the analyst is of the estimate of p. This approach does have limitations; for instance, when data from events are not available (e.g., no failures occur in a test) p cannot be estimated, and this method cannot incorporate "soft evidence" such as expert opinion.

• Subjective Interpretation. The subjective interpretation of probability is also known as the Bayesian school of thought. This method defines the probability of an event as the degree of belief the analyst has in the occurrence of the event. This means probability is a product of the analyst's state of knowledge. Any evidence which would change the analyst's degree of belief must be considered when calculating the probability (including soft evidence). The assumption is made that the probability assessment is made by a coherent person, where any coherent person having the same state of knowledge would make the same assessment.


The subjective interpretation has the flexibility of including many types of evidence to assist in estimating the probability of an event. This is important in many reliability applications where the events of interest (e.g., system failure) are rare.

1.1.3. Laws of Probability

The following rules of logic form the basis for many mathematical operations within the theory of probability.

Let X = E_i and Y = E_j be two events within the sample space Ω, where i ≠ j.

The Boolean laws of probability are (Modarres et al. 1999, p.25):

Commutative Law:
X ∪ Y = Y ∪ X
X ∩ Y = Y ∩ X

Associative Law:
X ∪ (Y ∪ Z) = (X ∪ Y) ∪ Z
X ∩ (Y ∩ Z) = (X ∩ Y) ∩ Z

Distributive Law:
X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)
X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z)

Idempotent Law:
X ∪ X = X
X ∩ X = X

Complementation Law:
X ∩ X̄ = ∅
X ∪ X̄ = Ω
(X̄)̄ = X

De Morgan's Theorem:
(X ∪ Y)̄ = X̄ ∩ Ȳ
(X ∩ Y)̄ = X̄ ∪ Ȳ

Two events are mutually exclusive if:

X ∩ Y = ∅,   P(X ∩ Y) = 0

Two events X and Y are independent if one event occurring does not affect the probability of the second event occurring:

P(X|Y) = P(X)

The rules for evaluating the probability of compound events are:

Addition Rule:
P(X ∪ Y) = P(X) + P(Y) − P(X ∩ Y) = P(X) + P(Y) − P(Y)·P(X|Y)

Multiplication Rule:
P(X ∩ Y) = P(X)·P(Y|X) = P(Y)·P(X|Y)

When X and Y are independent:

P(X ∪ Y) = P(X) + P(Y) − P(X)·P(Y)

P(X ∩ Y) = P(X)·P(Y)
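These rules can be checked by direct enumeration on a small finite sample space. The sketch below (two fair dice, with events chosen purely for illustration and not taken from the text) verifies the addition and multiplication rules:

```python
from fractions import Fraction

# Sample space: two fair six-sided dice; every outcome equally likely.
omega = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def P(event):
    """Probability of an event (a predicate on outcomes) by enumeration."""
    return Fraction(sum(1 for o in omega if event(o)), len(omega))

X = lambda o: o[0] >= 5          # event X: first die shows 5 or 6
Y = lambda o: o[0] + o[1] == 7   # event Y: the dice sum to 7

union = lambda o: X(o) or Y(o)
inter = lambda o: X(o) and Y(o)

# Conditional probability P(X|Y) by restricting the sample space to Y.
P_X_given_Y = Fraction(sum(1 for o in omega if X(o) and Y(o)),
                       sum(1 for o in omega if Y(o)))

# Addition rule: P(X u Y) = P(X) + P(Y) - P(X n Y)
assert P(union) == P(X) + P(Y) - P(inter)

# Multiplication rule: P(X n Y) = P(Y) * P(X|Y)
assert P(inter) == P(Y) * P_X_given_Y

# These particular X and Y happen to be independent: P(X|Y) = P(X)
assert P_X_given_Y == P(X)
print(P(X), P(Y), P(inter))
```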

Generalizations of these equations:

P(E_1 ∪ E_2 ∪ … ∪ E_n) = [P(E_1) + P(E_2) + ⋯ + P(E_n)]
  − [P(E_1 ∩ E_2) + P(E_1 ∩ E_3) + ⋯ + P(E_{n−1} ∩ E_n)]
  + [P(E_1 ∩ E_2 ∩ E_3) + P(E_1 ∩ E_2 ∩ E_4) + ⋯]
  − ⋯ + (−1)^{n+1} P(E_1 ∩ E_2 ∩ … ∩ E_n)

P(E_1 ∩ E_2 ∩ … ∩ E_n) = P(E_1) · P(E_2|E_1) · P(E_3|E_1 ∩ E_2) ⋯ P(E_n|E_1 ∩ E_2 ∩ … ∩ E_{n−1})

1.1.4. Law of Total Probability

The probability of X can be obtained by the following summation:

P(X) = Σ_{i=1}^{n_A} P(A_i) · P(X|A_i)

where A = {A_1, A_2, …, A_n} is a partition of the sample space Ω: all the elements of A are mutually exclusive, A_i ∩ A_j = ∅ for i ≠ j, and the union of all elements covers the complete sample space, ∪_{i=1}^{n} A_i = Ω.

For example:

P(X) = P(X ∩ Y) + P(X ∩ Ȳ) = P(Y)·P(X|Y) + P(Ȳ)·P(X|Ȳ)

1.1.5. Bayes' Law

Bayes' law can be derived from the multiplication rule and the law of total probability as follows:

P(θ)P(E|θ) = P(E)P(θ|E)

P(θ|E) = P(θ)P(E|θ) / P(E) = P(θ)P(E|θ) / Σ_i P(E|θ_i)P(θ_i)

where:

θ is the unknown of interest (UOI).

E is the observed data, the evidence.

P(θ) is the prior state of knowledge about θ without the evidence. Also denoted as π_0(θ).

P(E|θ) is the likelihood of observing the evidence given the UOI. Also denoted as L(E|θ).

P(θ|E) is the posterior state of knowledge about θ given the evidence. Also denoted as π(θ|E).

Σ_i P(E|θ_i)P(θ_i) is the normalizing constant.

Thus Bayes' formula enables us to use a piece of evidence, E, to make inference about the unobserved θ.

The continuous form of Bayes' law can be written as:

π(θ|E) = π_0(θ)L(E|θ) / ∫ π_0(θ)L(E|θ) dθ

In Bayesian statistics the state of knowledge (uncertainty) of an unknown of interest is quantified by assigning a probability distribution to its possible values. Bayes' law provides a mathematical means by which this uncertainty can be updated given new evidence.
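As an illustrative sketch of a discrete Bayes update (the prior support values and the evidence below are hypothetical, not from the text), note how the law of total probability supplies the normalizing constant:

```python
# Discrete Bayes update -- an illustrative sketch; the prior values and the
# observed evidence are hypothetical, not from the text.
priors = {0.01: 0.5, 0.05: 0.3, 0.10: 0.2}   # P(theta): failure prob. per demand

def likelihood(theta, failures, demands):
    """Likelihood of the evidence given theta (binomial shape).

    The combinatorial constant is omitted, as in the text: it cancels
    in the normalization."""
    return theta**failures * (1 - theta)**(demands - failures)

evidence = (1, 20)   # observed: 1 failure in 20 demands

# The law of total probability gives the normalizing constant P(E).
norm = sum(p * likelihood(t, *evidence) for t, p in priors.items())

# Bayes' law: P(theta|E) = P(theta) * P(E|theta) / P(E)
posterior = {t: p * likelihood(t, *evidence) / norm for t, p in priors.items()}

assert abs(sum(posterior.values()) - 1.0) < 1e-12
for t in sorted(priors):
    print(f"theta={t:.2f}  prior={priors[t]:.2f}  posterior={posterior[t]:.3f}")
```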

1.1.6. Likelihood Functions

The likelihood function is the probability of observing the evidence (e.g., sample data), E, given the distribution parameters, θ. The probability of observing a set of independent events is the product of the individual event likelihoods:

L(θ|E) = c ∏_i L(θ|t_i)

where c is a combinatorial constant which quantifies the number of combinations in which the observed evidence could have occurred. Methods which use the likelihood function in parameter estimation do not depend on this constant, and so it is omitted.

The following table summarizes the likelihood functions for different types of observations:

Table 1: Summary of Likelihood Functions (Klein & Moeschberger 2003, p.74)

Type of Observation | Likelihood Function | Description
Exact Lifetimes | L_i(θ|t_i) = f(t_i|θ) | Failure time t_i is known
Right Censored | L_i(θ|t_i) = R(t_i|θ) | Component survived to time t_i
Left Censored | L_i(θ|t_i) = F(t_i|θ) | Component failed before time t_i
Interval Censored | L_i(θ|t_i) = F(t_i^R|θ) − F(t_i^L|θ) | Component failed between t_i^L and t_i^R
Left Truncated | L_i(θ|t_i) = f(t_i|θ) / R(t_L|θ) | Component failed at time t_i where observations are truncated before t_L
Right Truncated | L_i(θ|t_i) = f(t_i|θ) / F(t_U|θ) | Component failed at time t_i where observations are truncated after t_U
Interval Truncated | L_i(θ|t_i) = f(t_i|θ) / [F(t_U|θ) − F(t_L|θ)] | Component failed at time t_i where observations are truncated before t_L and after t_U

The likelihood function is used in Bayesian estimation and maximum likelihood parameter estimation techniques. In both instances any constant in front of the likelihood function becomes irrelevant. Such constants are therefore not included in the likelihood functions given in this book (nor in most references).

For example, consider the case where a test is conducted on n components with an exponential time to failure distribution. The test is terminated at t_s, during which n_F = r components failed at times t_1, t_2, …, t_r and n_S = n − r components survived. Using the exact lifetime and right censored likelihood functions from Table 1 to construct the likelihood function we obtain:

L(λ|E) = ∏_{i=1}^{n_F} f(λ|t_i^F) · ∏_{i=1}^{n_S} R(λ|t_i^S)

       = ∏_{i=1}^{n_F} λe^{−λ t_i^F} · ∏_{i=1}^{n_S} e^{−λ t_i^S}

       = λ^{n_F} e^{−λ Σ_{i=1}^{n_F} t_i^F} · e^{−λ Σ_{i=1}^{n_S} t_i^S}

       = λ^{n_F} e^{−λ (Σ_{i=1}^{n_F} t_i^F + Σ_{i=1}^{n_S} t_i^S)}

Alternatively, because the test described is a homogeneous Poisson process¹, the likelihood function could also have been constructed using a Poisson distribution. The data can be stated as seeing n_F failures in time t_T, where t_T is the total time on test, t_T = Σ_{i=1}^{n_F} t_i^F + Σ_{i=1}^{n_S} t_i^S. Therefore the likelihood function would be:

L(λ|E) = f(λ|n_F, t_T)

       = (λ t_T)^{n_F} e^{−λ t_T} / n_F!

       = c λ^{n_F} e^{−λ t_T}

       = c λ^{n_F} e^{−λ (Σ_{i=1}^{n_F} t_i^F + Σ_{i=1}^{n_S} t_i^S)}

As mentioned earlier, in estimation procedures within this book the constant c can be ignored. As such, the two likelihood functions are equal. For more information see (Meeker & Escobar 1998, p.36) or (Rinne 2008, p.403).
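Setting the derivative of the log-likelihood to zero, d(ln L)/dλ = n_F/λ − t_T = 0, gives the closed-form maximum likelihood estimate λ̂ = n_F/t_T. A minimal sketch, with hypothetical failure and censoring times:

```python
import math

# Exponential test with right censoring -- a sketch with hypothetical data.
t_fail = [55.0, 120.0, 200.0, 310.0]   # n_F exact failure times
t_surv = [400.0, 400.0, 400.0]         # n_S survivors, censored at test end

n_f = len(t_fail)
t_total = sum(t_fail) + sum(t_surv)    # total time on test t_T

def log_likelihood(lam):
    """ln L(lambda|E) for L = lambda**n_F * exp(-lambda * t_T)."""
    return n_f * math.log(lam) - lam * t_total

# Setting d(ln L)/d(lambda) = n_F/lambda - t_T = 0 gives the MLE:
lam_hat = n_f / t_total

# Numerical check that lam_hat maximizes the log-likelihood.
for factor in (0.99, 1.01):
    assert log_likelihood(lam_hat) > log_likelihood(lam_hat * factor)
print(f"lambda_hat = {lam_hat:.6f} failures per unit time")
```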

1.1.7. Fisher Information Matrix

The Fisher Information Matrix has many uses but in reliability applications it is most often used to create Jeffery’s non-informative priors. There are two types of Fisher information matrices, the Expected Fisher Information Matrix ( ), and the Observed Fisher Information Matrix ( ). 𝐼𝐼 𝜃𝜃 1 Homogeneous in 𝐽𝐽time,𝜃𝜃 where it does not matter if you have components on test at once (exponential test), or you have a single component on test which is replaced after failure times (Poisson process), the evidence produced will be𝑛𝑛 the same.

𝑛𝑛 Introduction 7 Prob Theory

The Expected Fisher Information Matrix is obtained from the log-likelihood function of a single random variable. The random variable is replaced by its expected value.

For a single parameter distribution:

I(θ) = −E[∂²Λ(θ|x)/∂θ²] = E[(∂Λ(θ|x)/∂θ)²]

where Λ is the log-likelihood function and E[U] = ∫ U f(x) dx. For a distribution with p parameters the Expected Fisher Information Matrix has elements:

[I(θ)]_{jk} = −E[∂²Λ(θ|x)/∂θ_j ∂θ_k],   j, k = 1, …, p

The Observed Fisher Information Matrix is obtained from a likelihood function constructed from n observed samples from the distribution. The expectation term is dropped.

For a single parameter distribution:

J_n(θ) = −Σ_{i=1}^{n} ∂²Λ(θ|x_i)/∂θ²

For a distribution with p parameters the Observed Fisher Information Matrix has elements:

[J_n(θ)]_{jk} = −Σ_{i=1}^{n} ∂²Λ(θ|x_i)/∂θ_j ∂θ_k,   j, k = 1, …, p

It can be seen that as n becomes large, the average value of the random variable approaches its expected value, and so the following asymptotic relationship exists between the observed and expected Fisher information matrices:

plim_{n→∞} (1/n)·J_n(θ) = I(θ)

For large n the following approximation can be used:

J_n(θ) ≈ n·I(θ)

When evaluated at θ = θ̂, the inverse of the observed Fisher information matrix estimates the variance-covariance matrix of the parameter estimates:

V = [J_n(θ = θ̂)]⁻¹ =
[ Var(θ̂_1)        Cov(θ̂_1, θ̂_2)  ⋯  Cov(θ̂_1, θ̂_d) ]
[ Cov(θ̂_1, θ̂_2)  Var(θ̂_2)        ⋯  Cov(θ̂_2, θ̂_d) ]
[ ⋮                ⋮                ⋱  ⋮               ]
[ Cov(θ̂_1, θ̂_d)  Cov(θ̂_2, θ̂_d)  ⋯  Var(θ̂_d)       ]
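For the single-parameter exponential case, where Λ(λ|t_i) = ln λ − λt_i gives J_n(λ) = n/λ² in closed form, the observed information and the resulting variance estimate can be checked numerically (a sketch with hypothetical data):

```python
import math

# Observed Fisher information for an exponential sample -- an illustrative
# sketch with hypothetical failure times, not data from the text.
t = [0.7, 1.3, 2.1, 0.4, 1.8, 0.9, 2.6, 1.1]
n = len(t)
lam_hat = n / sum(t)   # exponential MLE

def loglik(lam):
    # Total log-likelihood: sum over i of ln(lambda) - lambda*t_i
    return n * math.log(lam) - lam * sum(t)

def d2_loglik(lam, h=1e-4):
    """Second derivative of the log-likelihood by central difference."""
    return (loglik(lam + h) - 2 * loglik(lam) + loglik(lam - h)) / h**2

J = -d2_loglik(lam_hat)   # observed Fisher information J_n at the MLE
var_hat = 1.0 / J         # approximate Var(lambda_hat) from the inverse

# Closed-form check: J_n(lambda) = n / lambda**2 for the exponential.
assert abs(J - n / lam_hat**2) < 1e-3 * J
print(f"lambda_hat = {lam_hat:.4f}, Var(lambda_hat) ~ {var_hat:.6f}")
```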

1.2. Distribution Functions

1.2.1. Random Variables

Probability distributions are used to model random events for which the outcome is uncertain, such as the time of failure for a component. Before placing a demand on that component, the time at which it will fail is unknown. The probability of failure at different times is modeled by a probability distribution. In this book random variables are denoted by capital letters, such as T for time. When the random variable assumes a particular value we denote this by a lower case letter, such as t. For example, if we wish to find the probability that the component fails before time t_1 we would find P(T ≤ t_1).

Random variables are classified as either discrete or continuous. In a discrete distribution the random variable can take on a distinct or countable number of possible values, such as the number of demands to failure. In a continuous distribution the random variable is not constrained to distinct possible values, such as a time-to-failure distribution.

This book will denote continuous random variables as X or T, and discrete random variables as K.

1.2.2. Statistical Distribution Parameters

The parameters of a distribution are the variables which need to be specified in order to completely define the distribution. Often parameters are classified by the effect they have on the distribution. Shape parameters define the shape of the distribution, scale parameters stretch the distribution along the random variable axis, and location parameters shift the distribution along the random variable axis. The reader is cautioned that the parameterization of a distribution may change depending on the text. Therefore, before using formulas from other sources the parameterization needs to be confirmed.

Understanding the effect of changing a distribution’s parameter value can be a difficult task. At the beginning of each section a graph of the distribution is shown with varied parameters.

1.2.3. Probability Density Function

A probability density function (pdf), denoted as f(t), is any function which is always non-negative and has unit area:

∫_{−∞}^{∞} f(t) dt = 1,   Σ_k f(k) = 1

The probability of an event occurring between limits a and b is the area under the pdf:

P(a ≤ T ≤ b) = ∫_a^b f(t) dt = F(b) − F(a)

P(a ≤ K ≤ b) = Σ_{k=a}^{b} f(k) = F(b) − F(a − 1)

The instantaneous value of a discrete pdf at k_i can be obtained by shrinking the limits to [k_{i−1}, k_i]:

P(K = k_i) = P(k_{i−1} < K ≤ k_i) = f(k_i)

The instantaneous value of a continuous pdf is infinitesimal. This result can be seen by shrinking the limits to [t, t + Δt]:

P(T = t) = lim_{Δt→0} P(t < T ≤ t + Δt) = lim_{Δt→0} f(t)·Δt

Therefore the reader must remember that in order to calculate the probability of an event, an interval for the random variable must be used. Furthermore, a common misunderstanding is that a pdf cannot have a value above one, because the probability of an event occurring cannot be greater than one. As can be seen above, this is true for discrete distributions only because Δk = 1. However, for the continuous case the pdf is multiplied by a small interval Δt, which ensures that the probability of an event occurring within the interval is less than one.

Figure 1: Left: continuous pdf; right: discrete pdf.

To derive the continuous pdf relationship to the cumulative distribution function (cdf), F(t):

lim_{Δt→0} f(t)·Δt = lim_{Δt→0} P(t < T ≤ t + Δt) = lim_{Δt→0} [F(t + Δt) − F(t)] = lim_{Δt→0} ΔF(t)

f(t) = lim_{Δt→0} ΔF(t)/Δt = dF(t)/dt

The shape of the pdf can be obtained by plotting a normalized histogram of an infinite number of samples from a distribution.
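A short numerical sketch of these relationships, using an exponential distribution with an arbitrary rate, including the point that a continuous pdf may exceed one:

```python
import math

# Exponential distribution with an arbitrary rate lambda = 0.5.
lam = 0.5
F = lambda t: 1 - math.exp(-lam * t)     # cdf
f = lambda t: lam * math.exp(-lam * t)   # pdf

# f(t) is the limit of [F(t + dt) - F(t)] / dt as dt -> 0.
t, dt = 2.0, 1e-6
approx = (F(t + dt) - F(t)) / dt
assert abs(approx - f(t)) < 1e-5

# The probability of an event is an area under the pdf: P(1 <= T <= 3).
p = F(3.0) - F(1.0)
assert 0.0 < p < 1.0

# A continuous pdf may exceed one: Uniform(0, 0.5) has f(t) = 2 on its
# support, yet f(t)*dt for any small interval dt stays below one.
f_uniform = 1 / (0.5 - 0)
assert f_uniform == 2.0
print(approx, f(t), p)
```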

It should be noted that when plotting a discrete pdf the points for each discrete value should not be joined. For ease of explanation using the area-under-the-graph argument the step plot is intuitive, but it implies a non-integer random variable. Instead, stem plots or column plots are often used.

Figure 2: Discrete data plotting. Left: stem plot. Right: column plot.

1.2.4. Cumulative Distribution Function

The cumulative distribution function (cdf), denoted by F(t), is the probability of the random event occurring before t, P(T ≤ t). For a discrete cdf the height of each step is the pdf value f(k_i).

F(t) = P(T ≤ t) = ∫_{−∞}^{t} f(x) dx,   F(k) = P(K ≤ k) = Σ_{k_i ≤ k} f(k_i)

−∞ 𝑡𝑡lim ∞ ( ) = 0≤, 𝑘𝑘 ≤ ∞( 1) = 0

𝑡𝑡→−∞ 𝐹𝐹 𝑡𝑡 𝐹𝐹 − lim ( ) = 1 , lim ( ) = 1

𝑡𝑡→∞ 𝐹𝐹 𝑡𝑡 𝑘𝑘→∞ 𝐹𝐹 𝑘𝑘 The cdf can be used to find the probability of the random even occurring between two limits: ( ) = ( ) = ( ) ( ) 𝑏𝑏

𝑃𝑃 𝑎𝑎 ≤ 𝑇𝑇 ≤ 𝑏𝑏 �𝑎𝑎 𝑓𝑓 𝑡𝑡 𝑑𝑑𝑑𝑑 𝐹𝐹 𝑏𝑏 − 𝐹𝐹 𝑎𝑎 ( ) = 𝑏𝑏 ( ) = ( ) ( 1)

𝑃𝑃 𝑎𝑎 ≤ 𝐾𝐾 ≤ 𝑏𝑏 � 𝑓𝑓 𝑘𝑘 𝐹𝐹 𝑏𝑏 − 𝐹𝐹 𝑎𝑎 − 𝑖𝑖=𝑎𝑎 12 Probability Distributions Used in Reliability Engineering

Figure 3: Left: continuous cdf/pdf; right: discrete cdf/pdf.
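A quick discrete check of the interval formula P(a ≤ K ≤ b) = F(b) − F(a − 1), using a Poisson pmf with an arbitrary mean:

```python
import math

# Poisson pmf and cdf with an arbitrary mean mu = 2.0.
mu = 2.0
f = lambda k: math.exp(-mu) * mu**k / math.factorial(k)   # pmf f(k)
F = lambda k: sum(f(i) for i in range(k + 1))             # cdf F(k) = P(K <= k)

# Discrete interval probability: P(a <= K <= b) = F(b) - F(a - 1).
a, b = 1, 3
direct = sum(f(k) for k in range(a, b + 1))
assert abs(direct - (F(b) - F(a - 1))) < 1e-12

# Using F(b) - F(a) instead would wrongly drop the term f(a).
assert abs((F(b) - F(a)) + f(a) - direct) < 1e-12

# Limit behaviour: F(k) -> 1 as k grows.
assert F(60) > 1 - 1e-12
print(direct)
```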

1.2.5. Reliability Function

The reliability function, also known as the survival function, is denoted as R(t). It is the probability that the random event (time of failure) occurs after t.

R(t) = P(T > t) = 1 − F(t),   R(k) = P(K > k) = 1 − F(k)

R(t) = ∫_t^∞ f(x) dx,   R(k) = Σ_{i=k+1}^{∞} f(k_i)

It should be noted that in most publications the discrete reliability function is defined as R*(k) = P(K ≥ k) = Σ_{i=k}^{∞} f(k_i). This definition results in R*(k) ≠ 1 − F(k). Despite this problem it is the most common definition and is used in all the references in this book except (Xie, Gaudoin, et al. 2002).


Figure 4: Left: continuous cdf; right: continuous survival function.
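The difference between the two discrete conventions, R(k) = P(K > k) used in this book and R*(k) = P(K ≥ k) found in most publications, can be seen with a geometric distribution (parameter value chosen for illustration):

```python
# Two conventions for the discrete reliability function, illustrated with
# a geometric distribution f(k) = p*(1-p)**(k-1) on k = 1, 2, ...
p = 0.2
f = lambda k: p * (1 - p)**(k - 1)      # pmf
F = lambda k: 1 - (1 - p)**k            # cdf

R = lambda k: 1 - F(k)                  # this book: R(k) = P(K > k)

def R_star(k, kmax=200):
    """Common alternative: R*(k) = P(K >= k), summed to a large cutoff."""
    return sum(f(i) for i in range(k, kmax))

k = 5
assert abs(R(k) - (1 - F(k))) < 1e-12      # consistent with 1 - F(k)
assert abs(R_star(k) - (1 - F(k))) > 1e-6  # R*(k) != 1 - F(k)
assert abs(R_star(k) - R(k - 1)) < 1e-12   # in fact R*(k) = R(k - 1)
print(R(k), R_star(k))
```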

1.2.6. Conditional Reliability Function

The conditional reliability function, denoted as m(x), is the probability of the component surviving an additional time x given that it has survived to time t:

m(x) = R(x|t) = R(t + x) / R(t)

where:

t is the given time for which we know the component survived.

x is a new random variable defined as the time after t; x = 0 at t.

1.2.7. 100α% Percentile Function

The 100α% percentile function gives the time t_α defining the interval [0, t_α] for which the area under the pdf is α:

t_α = F⁻¹(α)

1.2.8. Mean Residual Life

The mean residual life (MRL), denoted as u(t), is the expected remaining life given the component has survived to time t:

u(t) = ∫_0^∞ R(x|t) dx = (1 / R(t)) ∫_t^∞ R(x) dx

1.2.9. Hazard Rate

The hazard function, denoted as h(t), is the conditional probability that a component fails in a small time interval, given that it has survived from time zero until the beginning of the time interval. For the continuous case, the probability that an item will fail in a time interval Δt given the item was functioning at time t is:

P(t < T ≤ t + Δt | T > t) = P(t < T ≤ t + Δt) / P(T > t) = [F(t + Δt) − F(t)] / R(t) = ΔF(t) / R(t)

Dividing this probability by Δt and finding the limit as Δt → 0 gives the hazard rate:

h(t) = lim_{Δt→0} P(t < T ≤ t + Δt | T > t) / Δt = lim_{Δt→0} ΔF(t) / (Δt·R(t)) = f(t) / R(t)

The discrete hazard rate is defined as (Xie, Gaudoin, et al. 2002):

h(k) = P(K = k) / P(K ≥ k) = f(k) / R(k − 1)

This unintuitive result is due to a popular definition R*(k) = Σ_{i=k}^{∞} f(k_i), in which case h(k) = f(k)/R*(k). This definition has been avoided here because it violates the formula R(k) = 1 − F(k). The discrete hazard rate cannot be used in the same way as a continuous hazard rate, with the following differences (Xie, Gaudoin, et al. 2002):

• h(k) is defined as a probability and so is bounded by [0, 1].
• h(k) is not additive for series systems.
• For the cumulative hazard rate, H(k) = −ln[R(k)] ≠ Σ_{i=0}^{k} h(i).
• When a set of data is analyzed using a discrete counterpart of the continuous distribution, the values of the hazard rate do not converge.

A function called the second failure rate has been proposed (Gupta et al. 1997):

r(k) = ln[ R(k − 1) / R(k) ] = −ln[1 − h(k)]

This function overcomes the previously mentioned limitations of the discrete hazard rate function and maintains the monotonicity property. For more information, the reader is referred to (Xie, Gaudoin, et al. 2002).

Care should be taken not to confuse the hazard rate with the Rate of Occurrence of Failures (ROCOF). ROCOF is the probability that a failure (not necessarily the first) occurs in a small time interval. Unlike the hazard rate, the ROCOF is the absolute rate at which system failures occur and is not conditional on survival to time t. ROCOF is used in measuring the change in the rate of failures for repairable systems.
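As a sketch of h(t) = f(t)/R(t), using a Weibull distribution in the common shape/scale (β, η) parameterization with arbitrary parameter values:

```python
import math

# Weibull distribution in the shape/scale (beta, eta) parameterization,
# with arbitrary illustrative parameter values.
beta, eta = 2.5, 1000.0

f = lambda t: (beta / eta) * (t / eta)**(beta - 1) * math.exp(-(t / eta)**beta)
R = lambda t: math.exp(-(t / eta)**beta)
h = lambda t: f(t) / R(t)   # hazard rate h(t) = f(t)/R(t)

# For the Weibull the hazard has the closed form (beta/eta)*(t/eta)**(beta-1).
for t in (100.0, 500.0, 1500.0):
    closed = (beta / eta) * (t / eta)**(beta - 1)
    assert abs(h(t) - closed) < 1e-12 * closed

# With beta > 1 the hazard increases with t (wear-out behaviour).
assert h(100.0) < h(500.0) < h(1500.0)
print(h(100.0), h(1500.0))
```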

1.2.10. Cumulative Hazard Rate

The cumulative hazard rate, denoted as H(t) in the continuous case, is the area under the hazard rate function. This function is useful for calculating average failure rates:

H(t) = ∫_0^t h(u) du = −ln[R(t)]

H(k) = −ln[R(k)]

For a discussion on the discrete cumulative hazard rate, see the hazard rate section.


1.2.11. Characteristic Function

The characteristic function of a random variable completely defines its probability distribution. It can be used to derive properties of the distribution from transformations of the random variable. (Billingsley 1995)

The characteristic function is defined as the expected value of the function exp(iωX), where X is the random variable of the distribution with cdf F(x), ω is a parameter that can have any real value, and i = √−1:

φ_X(ω) = E[e^{iωX}] = ∫_{−∞}^{∞} e^{iωx} dF(x)

A useful property of the characteristic function is that the characteristic function of the sum of independent random variables is the product of the individual characteristic functions. It is often easier to use the natural log of the characteristic function when conducting this operation:

φ_{X+Y}(ω) = φ_X(ω)·φ_Y(ω)

ln[φ_{X+Y}(ω)] = ln[φ_X(ω)] + ln[φ_Y(ω)]

For example, the addition of two exponentially distributed random variables with the same λ gives the gamma distribution with k = 2:

X ~ Exp(λ),   Y ~ Exp(λ)

φ_X(ω) = λ/(λ − iω),   φ_Y(ω) = λ/(λ − iω)

φ_{X+Y}(ω) = φ_X(ω)·φ_Y(ω) = λ²/(λ − iω)²

This is the characteristic function of the gamma distribution with k = 2, so X + Y ~ Gamma(k = 2, λ).

The moment generating function can be calculated from the characteristic function:

M_X(ω) = φ_X(−iω)

The nth raw moment can be calculated by differentiating the characteristic function n times. For more information on moments see section 1.3.2.

E[Xⁿ] = i⁻ⁿ φ_X⁽ⁿ⁾(0) = i⁻ⁿ [dⁿφ_X(ω)/dωⁿ]|_{ω=0}
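The result X + Y ~ Gamma(k = 2, λ) can be spot-checked by simulation: the first two moments of the simulated sum should match the gamma values k/λ and k/λ². A Monte Carlo sketch (sample size and seed are arbitrary, not from the text):

```python
import random
from statistics import fmean

# Monte Carlo spot-check that Exp(lam) + Exp(lam) ~ Gamma(k=2, lam).
random.seed(42)
lam = 0.5
n = 100_000

sums = [random.expovariate(lam) + random.expovariate(lam) for _ in range(n)]

mean = fmean(sums)
var = sum((s - mean)**2 for s in sums) / n

# Gamma(k=2, lam): mean = k/lam = 4.0, variance = k/lam**2 = 8.0
assert abs(mean - 2 / lam) < 0.1
assert abs(var - 2 / lam**2) < 1.0
print(mean, var)
```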

1.2.12. Joint Distributions

Dist Functions Dist Joint distributions are multivariate distributions with, random variables ( > 1). An example of a bivariate distribution ( = 2) may be the distribution of failure for a vehicle tire which with random variables time, , and distance𝑑𝑑 travelled, . The dependence𝑑𝑑 between these two variables can be 𝑑𝑑quantified in terms of correlation and covariance. See section 1.3.3 for more discussion. For more𝑇𝑇 on properties of multivariate𝑋𝑋 distributions see (Rencher 1997). The continuous and discrete random variables will be denoted as:

= 𝑥𝑥1 , = 𝑘𝑘1 𝑥𝑥2 𝑘𝑘2 𝒙𝒙 � � 𝒌𝒌 � � ⋮ ⋮ 𝑑𝑑 𝑑𝑑 Joint distributions can be derived from 𝑥𝑥the conditional𝑘𝑘 distributions. For the bivariate case with random variables and :

𝑥𝑥 (𝑦𝑦 , ) = ( | ) ( ) = ( | ) ( )

For the more general case:𝑓𝑓 𝑥𝑥 𝑦𝑦 𝑓𝑓 𝑦𝑦 𝑥𝑥 𝑓𝑓 𝑥𝑥 𝑓𝑓 𝑥𝑥 𝑦𝑦 𝑓𝑓 𝑦𝑦

( ) = ( | , … , ) ( , … , ) = ( ) ( | ) … ( | , … , ) ( | , … , ) 1 2 𝑑𝑑 2 𝑑𝑑 𝑓𝑓 𝒙𝒙 𝑓𝑓 𝑥𝑥 𝑥𝑥 𝑥𝑥 𝑓𝑓 𝑥𝑥 𝑥𝑥 1 2 1 𝑛𝑛−1 1 𝑛𝑛−2 𝑛𝑛 1 𝑛𝑛 If the random variables 𝑓𝑓are𝑥𝑥 independent,𝑓𝑓 𝑥𝑥 𝑥𝑥 𝑓𝑓 their𝑥𝑥 joint𝑥𝑥 distribution𝑥𝑥 𝑓𝑓 𝑥𝑥 is 𝑥𝑥simply𝑥𝑥 the product of the marginal distributions:

f(x) = ∏_{i=1}^{d} f(xᵢ)   where xᵢ ⊥ xⱼ for i ≠ j

f(k) = ∏_{i=1}^{d} f(kᵢ)   where kᵢ ⊥ kⱼ for i ≠ j

A general multivariate cumulative probability function with n random variables (T₁, T₂, …, T_n) is defined as:

F(t₁, t₂, …, t_n) = P(T₁ ≤ t₁, T₂ ≤ t₂, …, T_n ≤ t_n)

The survivor function is given as:

R(t₁, t₂, …, t_n) = P(T₁ > t₁, T₂ > t₂, …, T_n > t_n)

Different from univariate distributions is the relationship between the CDF and the survivor function (Georges et al. 2001):

F(t₁, t₂, …, t_n) + R(t₁, t₂, …, t_n) ≤ 1

If F(t₁, t₂, …, t_n) is differentiable then the probability density function is given as:


f(t₁, t₂, …, t_n) = ∂ⁿF(t₁, t₂, …, t_n) / (∂t₁ ∂t₂ ⋯ ∂t_n)

For a discussion on multivariate hazard rate functions and the construction of joint distributions from marginal distributions see (Singpurwalla 2006).
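For independent random variables the joint pdf reduces to a product of marginals, which makes evaluation trivial in code. A minimal sketch of ours (the exponential marginals and function names are illustrative, not from the text):

```python
import math

def exp_pdf(x, lam):
    """Exponential marginal density f(x) = lam * exp(-lam * x) for x >= 0."""
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0

def joint_pdf_independent(xs, lams):
    """f(x) = prod_i f(x_i) when the components are independent."""
    p = 1.0
    for x, lam in zip(xs, lams):
        p *= exp_pdf(x, lam)
    return p

# bivariate example: two independent failure variables
print(joint_pdf_independent([1.0, 2.0], [0.5, 1.0]))
```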

1.2.13. Marginal Distribution

The marginal distribution of a single random variable in a joint distribution can be obtained by integrating (or summing) out the remaining random variables:

f(x₁) = ∫_{x_d} ⋯ ∫_{x₃} ∫_{x₂} f(x) dx₂ dx₃ ⋯ dx_d

f(k₁) = Σ_{k₂} Σ_{k₃} ⋯ Σ_{k_d} f(k)

1.2.14. Conditional Distribution

If the values of some random variables are known, the conditional distribution of the remaining random variables is:

f(x₂, …, x_d | x₁) = f(x) / f(x₁) = f(x) / ∫⋯∫ f(x) dx₂ ⋯ dx_d

f(k₂, …, k_d | k₁) = f(k) / f(k₁) = f(k) / Σ_{k₂} ⋯ Σ_{k_d} f(k)

1.2.15. Bathtub Distributions

Elementary texts on reliability introduce the hazard rate of a system as a bathtub curve. The bathtub curve has three regions: infant mortality (decreasing failure rate), useful life (constant failure rate) and wear out (increasing failure rate). Bathtub distributions have not been a popular choice for modeling life distributions when compared to the exponential, Weibull and lognormal distributions. This is because bathtub distributions are generally more complex, typically lack closed form moments, and have parameters which are more difficult to estimate.

Sometimes more complex shapes are required than simple bathtub curves; as such, generalizations and modifications to the bathtub curve have been studied. These include an increase in the failure rate followed by a bathtub curve, and rollercoaster curves (decreasing followed by a unimodal hazard rate). For further reading including applications see (Lai & Xie 2006).


1.2.16. Truncated Distributions

Truncation arises when the existence of a potential observation would be unknown if it were to occur in a certain region. An example of truncation is when the existence of a defect is unknown due to the defect's amplitude being less than the inspection threshold: the number of flaws below the inspection threshold is unknown. This is not to be confused with censoring, which occurs when there is a bound for observing events. An example of right censoring is when a test is time terminated and the failures of the surviving components are not observed; however, we know how many components were censored. (Meeker & Escobar 1998, p.266)

A truncated distribution is the conditional distribution that results from restricting the domain of another probability distribution. The following general formulas apply to truncated distribution functions, where f₀(x) and F₀(x) are the pdf and cdf of the non-truncated distribution. For further reading specific to common distributions see (Cohen 1991).

Probability Density Function:

f(x) = f₀(x) / [F₀(b) − F₀(a)]   for x ∈ (a, b]
f(x) = 0   otherwise

Cumulative Distribution Function:

F(x) = 0   for x ≤ a
F(x) = [F₀(x) − F₀(a)] / [F₀(b) − F₀(a)]   for x ∈ (a, b]
F(x) = 1   for x > b
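The truncation formulas translate directly into code. The sketch below is our own (the unit-exponential base distribution and helper names are illustrative):

```python
import math

def trunc_pdf(x, a, b, pdf0, cdf0):
    """pdf of a distribution truncated to (a, b]: f0(x) / (F0(b) - F0(a))."""
    if a < x <= b:
        return pdf0(x) / (cdf0(b) - cdf0(a))
    return 0.0

def trunc_cdf(x, a, b, cdf0):
    """cdf of the truncated distribution: 0 below a, 1 above b."""
    if x <= a:
        return 0.0
    if x > b:
        return 1.0
    return (cdf0(x) - cdf0(a)) / (cdf0(b) - cdf0(a))

# base distribution: exponential with rate 1
pdf0 = lambda x: math.exp(-x)
cdf0 = lambda x: 1.0 - math.exp(-x)

print(trunc_cdf(3.0, 1.0, 3.0, cdf0))   # reaches 1 at the upper bound
```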

1.2.17. Summary

Table 2: Summary of important reliability function relationships

Each function can be expressed in terms of the others:

f(t) = F′(t) = −R′(t) = h(t) exp[−∫₀ᵗ h(x)dx] = H′(t) exp[−H(t)]

F(t) = ∫₀ᵗ f(x)dx = 1 − R(t) = 1 − exp[−∫₀ᵗ h(x)dx] = 1 − exp[−H(t)]

R(t) = 1 − ∫₀ᵗ f(x)dx = 1 − F(t) = exp[−∫₀ᵗ h(x)dx] = exp[−H(t)]

h(t) = f(t) / [1 − ∫₀ᵗ f(x)dx] = F′(t) / [1 − F(t)] = −R′(t) / R(t) = H′(t)

H(t) = −ln[∫ₜ^∞ f(x)dx] = −ln[1 − F(t)] = −ln[R(t)] = ∫₀ᵗ h(x)dx

u(t) = ∫₀^∞ x f(t + x)dx / ∫ₜ^∞ f(x)dx = ∫ₜ^∞ [1 − F(x)]dx / [1 − F(t)] = ∫ₜ^∞ R(x)dx / R(t) = ∫ₜ^∞ exp[−∫₀ˣ h(s)ds]dx / exp[−∫₀ᵗ h(s)ds] = ∫ₜ^∞ exp[−H(x)]dx / exp[−H(t)]

where u(t) is the mean residual life.


1.3. Distribution Properties

1.3.1. Median / Mode

The median of a distribution, denoted t₀.₅, is the point where the cdf and reliability function are equal to 0.5:

t₀.₅ = F⁻¹(0.5) = R⁻¹(0.5)

The mode, t_m, is the highest point of the pdf. This is the point where a failure has the highest probability density; samples from the distribution occur most often around the mode.

1.3.2. Moments of Distribution

The moments of a distribution are given by:

μₙ = ∫_{−∞}^{∞} (x − c)ⁿ f(x) dx,   μₙ = Σᵢ (kᵢ − c)ⁿ f(kᵢ)

When c = 0 the moments, μ′ₙ, are called the raw moments, described as moments about the origin. In respect to probability distributions the first two raw moments are important. μ′₀ always equals one, and μ′₁ is the distribution's mean, which is the expected value of the random variable for the distribution:

μ′₀ = ∫_{−∞}^{∞} f(x) dx = 1,   μ′₀ = Σᵢ f(kᵢ) = 1

mean = E[X] = μ:

μ′₁ = ∫_{−∞}^{∞} x f(x) dx,   μ′₁ = Σᵢ kᵢ f(kᵢ)

Some important properties of the expected value E[X] when transformations of the random variable occur are:

E[X + b] = μ_X + b

E[X + Y] = μ_X + μ_Y

E[aX] = aμ_X

E[XY] = μ_X μ_Y + Cov(X, Y)

𝑋𝑋 𝑌𝑌 When = the moments, , are𝐸𝐸 𝑋𝑋 called𝑋𝑋 𝜇𝜇 the𝜇𝜇 central𝐶𝐶𝐶𝐶𝐶𝐶 𝑋𝑋moments𝑌𝑌 , described as moments about the mean. In this book, the first five central moments are important. is equal to 𝑛𝑛 = 1.𝑐𝑐 𝜇𝜇is the variance which𝜇𝜇 quantifies the amount the random variable deviates from 0 the′ mean. and are used to calculate the and . 𝜇𝜇 0 1 𝜇𝜇 𝜇𝜇 𝜇𝜇2 𝜇𝜇3 = ( ) = 1, = ( ) = 1 ∞ 𝜇𝜇0 � 𝑓𝑓 𝑥𝑥 𝑑𝑑𝑑𝑑 𝜇𝜇0 � 𝑓𝑓 𝑘𝑘𝑖𝑖 = ( −∞) ( ) = 0, = 𝑖𝑖 ( ) ( ) = 0 ∞ 𝜇𝜇1 � 𝑥𝑥 − 𝜇𝜇 𝑓𝑓 𝑥𝑥 𝑑𝑑𝑑𝑑 𝜇𝜇1 � 𝑘𝑘𝑖𝑖 − 𝜇𝜇 𝑓𝑓 𝑘𝑘𝑖𝑖 −∞ 𝑖𝑖 Introduction 21 Properties Dist

variance = E[(X − E[X])²] = E[X²] − (E[X])² = σ²:

μ₂ = ∫_{−∞}^{∞} (x − μ)² f(x) dx,   μ₂ = Σᵢ (kᵢ − μ)² f(kᵢ)

Some important properties of the variance when transformations of the random variable occur are:

Var[X + b] = Var[X]

Var[X ± Y] = σ_X² + σ_Y² ± 2 Cov(X, Y)

Var[aX] = a² σ_X²

Var[XY] ≈ (μ_X μ_Y)² [ σ_X²/μ_X² + σ_Y²/μ_Y² + 2 Cov(X, Y)/(μ_X μ_Y) ]

(the last being a first-order approximation)

The skewness is a measure of the asymmetry of the distribution.

γ₁ = μ₃ / μ₂^{3/2}

The kurtosis is a measure of whether the data is peaked or flat.

γ₂ = μ₄ / μ₂²

1.3.3. Covariance

Covariance is a measure of the dependence between random variables.

Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[XY] − μ_X μ_Y

A normalized measure of covariance is correlation, ρ. The correlation has the limits −1 ≤ ρ ≤ 1. When ρ = 1 the random variables have a positive linear dependency (i.e., an increase in X will result in a proportional increase in Y). When ρ = −1 the random variables have a negative linear dependency (i.e., an increase in X will result in a proportional decrease in Y). The relationship between covariance and correlation is:

ρ_{X,Y} = Corr(X, Y) = Cov(X, Y) / (σ_X σ_Y)

If the two random variables are independent then the correlation is equal to zero; however, the reverse is not always true. If the correlation is zero the random variables need not be independent. For derivations and more information the reader is referred to (Dekking et al. 2007, p.138).
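Sample covariance and correlation follow directly from the definitions above. A small sketch of ours (helper names are illustrative):

```python
def covariance(xs, ys):
    """Population covariance: average of (x - mu_x)(y - mu_y) over the sample."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """rho = Cov(X, Y) / (sigma_X sigma_Y); always within [-1, 1]."""
    sx = covariance(xs, xs) ** 0.5
    sy = covariance(ys, ys) ** 0.5
    return covariance(xs, ys) / (sx * sy)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]       # y = 2x: perfect positive linear dependence
print(correlation(xs, ys))      # ≈ 1.0
```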


1.4. Parameter Estimation

1.4.1. Probability Plotting Paper

Most plotting methods transform the available data into a straight line for a specific distribution. From a line of best fit the parameters of the distribution can be estimated. Most plotting paper plots the random variable (time or demands) against the pdf, cdf or hazard rate, and transforms the data points to a linear relationship by adjusting the scale of each axis. Probability plotting is done using the following steps (Nelson 1982, p.108):

1. Order the data such that x₁ ≤ x₂ ≤ ⋯ ≤ xᵢ ≤ ⋯ ≤ xₙ.

2. Assign a rank to each failure. For complete data this is simply the value i. Censored data is discussed after step 7.

3. Calculate the plotting position. The cdf may simply be calculated as i/n; however, this produces a biased result. Instead the following non-parametric Blom estimates are recommended as suitable for many cases by (Kimball 1960):

ĥ(tᵢ) = 1 / [(n − i + 0.625)(tᵢ₊₁ − tᵢ)]

F̂(tᵢ) = (i − 0.375) / (n + 0.25)

R̂(tᵢ) = (n − i + 0.625) / (n + 0.25)

f̂(tᵢ) = 1 / [(n + 0.25)(tᵢ₊₁ − tᵢ)]

Other proposed estimators are:

Naive: F̂(tᵢ) = i/n
Median (approximate): F̂(tᵢ) = (i − 0.3)/(n + 0.4)
Midpoint: F̂(tᵢ) = (i − 0.5)/n
Mean: F̂(tᵢ) = i/(n + 1)
Mode: F̂(tᵢ) = (i − 1)/(n − 1)

4. Plot the points on probability paper. The choice of distribution should be from experience, or multiple distributions should be used to assess the best fit. Probability paper is available from http://www.weibull.com/GPaper/.


5. Assess the data and chosen distributions. If the data plots in a straight line then the distribution may be a reasonable fit.

6. Draw a line of best fit. This is a subjective assessment which minimizes the deviation of the points from the chosen line.

7. Obtain the desired information. This may be the distribution parameters or estimates of reliability or hazard rate trends.

When multiple failure modes are observed, only one failure mode should be plotted, with the other failures treated as censored. Two popular methods to treat censored data are:

Rank Adjustment Method (Manzini et al. 2009, p.140). Here the adjusted rank, jₜᵢ, is calculated only for non-censored units (with i still being the rank for all ordered times). This adjusted rank is used for step 2 with the remaining steps unchanged:

jₜᵢ = jₜᵢ₋₁ + (n + 1 − jₜᵢ₋₁) / (2 + n − i)

Kaplan-Meier Estimator. Here the estimate for reliability is:

R̂(tᵢ) = ∏_{tⱼ<tᵢ} [1 − dⱼ/(n − j + 1)]

where dⱼ is the number of failures in rank j (for non-grouped data dⱼ = 1). From this estimate a cdf can be given as F̂(tᵢ) = 1 − R̂(tᵢ). For a detailed derivation and properties of this estimator see (Andersen et al. 1996, p.255).

Probability plots are fast, do not depend on complex numerical methods, and can be used without a detailed knowledge of statistics. They provide a visual representation of the data from which qualitative statements can be made, and can be useful in estimating initial values for numerical methods. A limitation of this technique is that it is not objective: two different people making the same plot may obtain different answers. It also does not provide confidence intervals. For more detail on probability plotting the reader is referred to (Nelson 1982, p.104) and (Meeker & Escobar 1998, p.122).
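The Kaplan-Meier estimator is straightforward to implement for ranked, right-censored data. A minimal sketch of ours (it assumes distinct ordered times with one event per rank, i.e., dⱼ = 1):

```python
def kaplan_meier(times, censored):
    """Reliability estimate at each failure time for right-censored data.

    times: event times sorted ascending; censored[i] is True if unit i was
    censored rather than failed. Returns (time, R_hat) pairs at failure
    times, with n - i units at risk just before the (i+1)-th ordered event.
    """
    n = len(times)
    r_hat = 1.0
    out = []
    for i, (t, c) in enumerate(zip(times, censored)):
        if not c:
            r_hat *= 1.0 - 1.0 / (n - i)   # one failure among n - i at risk
            out.append((t, r_hat))
    return out

times = [10.0, 15.0, 20.0, 25.0, 30.0]
censored = [False, True, False, True, False]
print(kaplan_meier(times, censored))
```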

1.4.2. Total Time on Test Plots

A total time on test (TTT) plot is a graph which provides a visual representation of the hazard rate trend, i.e., increasing, constant or decreasing. This assists in identifying the distribution from which the data may come. To construct a TTT plot (Rinne 2008, p.334):

1. Order the data such that x₁ ≤ x₂ ≤ ⋯ ≤ xᵢ ≤ ⋯ ≤ xₙ.

2. Calculate the TTT positions:

TTTᵢ = Σ_{j=1}^{i} (n − j + 1)(xⱼ − xⱼ₋₁),   i = 1, 2, …, n   (with x₀ = 0)

3. Calculate the normalized TTT positions:

TTT*ᵢ = TTTᵢ / TTTₙ,   i = 1, 2, …, n

4. Plot the points (i/n, TTT*ᵢ).

5. Analyze the graph:

Figure 5: Time on test plot interpretation

Compared to probability plotting, TTT plots are simple, scale invariant and can represent any data set, even those from different distributions, on the same plot. However, they only provide an indication of failure rate properties and cannot be used directly to estimate parameters. For more information about TTT plots the reader is referred to (Rinne 2008, p.334).
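The TTT positions in steps 2-4 can be computed as follows. A sketch of ours, assuming complete data and x₀ = 0:

```python
def ttt_positions(xs):
    """Return (i/n, TTT*_i) pairs for ordered failure times xs."""
    xs = sorted(xs)
    n = len(xs)
    totals = []
    running = 0.0
    prev = 0.0
    for j, x in enumerate(xs, start=1):
        running += (n - j + 1) * (x - prev)   # (n - j + 1)(x_j - x_{j-1})
        totals.append(running)
        prev = x
    ttt_n = totals[-1]
    return [(i / n, t / ttt_n) for i, t in enumerate(totals, start=1)]

print(ttt_positions([2.0, 5.0, 9.0]))
```

For complete data TTTₙ equals the sum of the failure times, which is a handy sanity check.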

1.4.3. Least Mean Square Regression

When the relationship between two variables, x and y, is assumed linear (y = mx + c), an estimate of the line's parameters can be obtained from n sample data points (xᵢ, yᵢ) using least mean square (LMS) regression. The least square method minimizes the square of the residuals:

S = Σ_{i=1}^{n} rᵢ²

The residual can be defined in many ways.

Minimize y residuals:

rᵢ = yᵢ − f(xᵢ; m, c)

m̂ = [nΣxᵢyᵢ − ΣxᵢΣyᵢ] / [nΣxᵢ² − (Σxᵢ)²]

ĉ = Σyᵢ/n − m̂ Σxᵢ/n

Minimize x residuals:

rᵢ = xᵢ − f(yᵢ; m, c)

m̂ = [nΣyᵢ² − (Σyᵢ)²] / [nΣxᵢyᵢ − ΣxᵢΣyᵢ]

ĉ = Σyᵢ/n − m̂ Σxᵢ/n

Figure 6: Left minimize y residual, right minimize x residual

The LMS method can be used to estimate the line of best fit when using plotting parameter estimation methods. When plotting on a regular scale in software such as Microsoft Excel, it is often easy to conduct linear LMS regression using built-in functions. Where available, this book provides the formulas to plot the sample data in a straight line in a regular scale plot. It also provides the transformation from the linear LMS regression estimates m̂ and ĉ to the distribution parameter estimates.

For more on least square methods in a reliability engineering context see (Nelson 1990, p.167). LMS regression can also be conducted on multivariate distributions, see (Rao & Toutenburg 1999), and can also be conducted on non-linear data directly, see (Björck 1996).
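The minimize-y-residuals estimators above are only a few lines of code. A sketch of ours:

```python
def lms_fit(xs, ys):
    """Least mean square line y = m x + c, minimizing the y residuals."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = sy / n - m * sx / n          # c_hat = y_bar - m_hat * x_bar
    return m, c

m, c = lms_fit([0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.1, 7.0])
print(round(m, 3), round(c, 3))
```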

1.4.4. Method of Moments

To estimate the distribution parameters using the method of moments the sample moments are equated to the parameter moments and solved for the unknown parameters. The following sample moments can be used:

The sample mean is given as:

x̄ = (1/n) Σ_{i=1}^{n} xᵢ

The unbiased sample variance is given as:

S² = [1/(n − 1)] Σ_{i=1}^{n} (xᵢ − x̄)²

The method of moments is not as accurate as Bayesian or maximum likelihood estimates but is easy and fast to calculate. The method of moments estimates are often used as a starting point for numerical methods to optimize maximum likelihood and least square estimators.
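As an example, for a gamma distribution with mean k/λ and variance k/λ², equating sample and distribution moments gives λ̂ = x̄/S² and k̂ = x̄²/S². A sketch of ours (function names are illustrative):

```python
def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    """Unbiased sample variance with the n - 1 divisor."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def gamma_method_of_moments(xs):
    """Solve mean = k/lam and variance = k/lam^2 for (k, lam)."""
    m, s2 = sample_mean(xs), sample_var(xs)
    lam = m / s2
    k = m * lam
    return k, lam

k_hat, lam_hat = gamma_method_of_moments([1.0, 2.0, 2.0, 3.0, 4.0])
print(k_hat, lam_hat)
```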

1.4.5. Maximum Likelihood Estimates

Maximum likelihood estimates (MLE) are based on a frequentist approach to parameter estimation usually obtained by maximizing the natural log of the likelihood function.

Λ(θ|E) = ln L(θ|E)

Algebraically this is done by setting the first order partial derivatives of the log-likelihood function to zero and solving. This calculation has been included in this book for distributions where the result is in closed form. Otherwise the log-likelihood function can be maximized directly using numerical methods.

The MLE θ̂ is obtained by solving for θ:

∂Λ/∂θ = 0

Denote the true parameters of the distribution as θ₀. MLEs have the following properties (Rinne 2008, p.406):

• Consistency. As the number of samples increases the difference between the estimated and actual parameter decreases:

plim_{n→∞} θ̂ = θ₀

• Asymptotic normality.

lim_{n→∞} θ̂ ~ Norm(θ₀, [Iₙ(θ₀)]⁻¹)

where Iₙ(θ) = nI(θ) is the Fisher information matrix. Therefore θ̂ is asymptotically unbiased:

lim_{n→∞} E[θ̂] = θ₀

• Asymptotic efficiency.

lim_{n→∞} Var[θ̂] = [Iₙ(θ₀)]⁻¹

• Invariance. The MLE of f(θ₀) is f(θ̂) if f(·) is a continuous and continuously differentiable function.

The advantages of MLE are that it is a very common technique that has been widely published and is implemented in many software packages. The MLE method can easily handle censored data. The disadvantages of MLE are the bias introduced for small sample sizes and the unbounded estimates that may result when no failures have been observed. The numerical optimization of the log-likelihood function may be non-trivial, with high sensitivity to starting values and the presence of local maxima.

For more information in a reliability context see (Nelson 1990, p.284).
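For the exponential distribution the log-likelihood Λ(λ) = n ln λ − λ Σtᵢ gives the closed form λ̂ = n/Σtᵢ. A quick sketch of ours verifying that the closed form does maximize Λ:

```python
import math

def exp_mle(times):
    """Closed-form MLE of the exponential rate: lam_hat = n / sum(t)."""
    return len(times) / sum(times)

def log_likelihood(lam, times):
    return len(times) * math.log(lam) - lam * sum(times)

times = [2.0, 3.0, 5.0, 10.0]
lam_hat = exp_mle(times)
print(lam_hat)   # 0.2

# nearby rates give a lower log-likelihood
assert log_likelihood(lam_hat, times) > log_likelihood(0.9 * lam_hat, times)
assert log_likelihood(lam_hat, times) > log_likelihood(1.1 * lam_hat, times)
```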

1.4.6. Bayesian Estimation

Bayesian estimation uses a subjective interpretation of the theory of probability and, for parameter estimation and confidence intervals, uses Bayes' rule to update our state of knowledge of the unknown of interest (UOI). Recall Bayes' rule from section 1.1.5:

π(θ|E) = π₀(θ) L(E|θ) / ∫ π₀(θ) L(E|θ) dθ,   P(θᵢ|E) = P(E|θᵢ) P(θᵢ) / Σᵢ P(E|θᵢ) P(θᵢ)

respectively for the continuous and discrete forms of the variable θ.

The Prior Distribution π₀(θ)

The prior distribution is a probability distribution of the UOI, θ, which captures our state of knowledge of θ prior to the evidence being observed. It is common for this distribution to represent soft evidence or intervals about the possible values of θ. If the distribution is dispersed it represents little being known about the parameter. If the distribution is concentrated in an area then it reflects a good knowledge about the likely values of θ.

Prior distributions should be a proper probability distribution of θ. A distribution is proper when it integrates to one and improper otherwise. The prior should also not be selected based on the form of the likelihood function. When the prior has a constant which does not affect the posterior distribution (such as with improper priors) it will be omitted from the formulas within this book.

Non-informative Priors. Occasions arise when it is not possible to express a subjective prior distribution due to lack of information, time or cost. Alternatively a subjective prior distribution may introduce unwanted bias through model convenience (conjugates) or due to elicitation methods. In such cases a non-informative prior may be desirable. The following methods exist for creating a non-informative prior (Yang and Berger 1998):

• Principle of Indifference - Improper Uniform Priors. An equal probability is assigned across all the possible values of the parameter. This is done using an improper uniform distribution with a constant, usually 1, over the range of the possible values for θ. When placed in Bayes' formula the constant cancels out; however, the denominator is integrated over all possible values of θ. In most cases this prior distribution will result in a proper posterior, but not always. Improper uniform priors may be chosen to enable the use of conjugate priors.

For example, using an exponential likelihood model with an improper uniform prior, 1, over the limits [0, ∞), with evidence of n_F failures in total time t_T:

Prior: π₀(λ) = 1 ∝ Gamma(1, 0)

Likelihood: L(E|λ) = λ^{n_F} e^{−λ t_T}

Posterior: π(λ|E) = 1 · L(E|λ) / ∫₀^∞ 1 · L(E|λ) dλ

Using the conjugate relationship (see Conjugate Priors for calculations):

λ ~ Gamma(λ; 1 + n_F, t_T)

• Principle of Indifference - Proper Uniform Priors. An equal probability is assigned across the values of the parameter within a range defined by the uniform distribution. The uniform distribution is obtained by estimating the far left and right bounds (a and b) of the parameter, giving π₀(θ) = 1/(b − a) = c, where c is a constant. When placed in Bayes' formula the constant cancels out; however, the denominator is integrated over the bounds [a, b]. Care needs to be taken in choosing a and b because, no matter how much evidence suggests otherwise, the posterior distribution will always be zero outside these bounds.

Using an exponential likelihood model with a proper uniform prior, c, over the limits [a, b], with evidence of n_F failures in total time t_T:

Prior: π₀(λ) = 1/(b − a) = c ∝ TruncatedGamma(1, 0)

Likelihood: L(E|λ) = λ^{n_F} e^{−λ t_T}

Posterior: π(λ|E) = c · L(E|λ) / [c ∫ₐᵇ L(E|λ) dλ]   for a ≤ λ ≤ b

Using the conjugate relationship this results in a truncated gamma distribution:

π(λ|E) = c′ · Gamma(λ; 1 + n_F, t_T)   for a ≤ λ ≤ b
π(λ|E) = 0   otherwise

where c′ is the constant that re-normalizes the truncated distribution.

• Jeffreys' Prior. Proposed by Jeffreys in 1961, this prior is defined as π₀(θ) ∝ √(det I(θ)), where I(θ) is the Fisher information matrix. This derivation is motivated by the fact that it is not dependent upon the set of parameter variables that is chosen to describe parameter space. Jeffreys himself suggested the need to make ad hoc modifications to the prior to avoid problems in multidimensional distributions. Jeffreys' prior is normally improper. (Bernardo et al. 1992)

• Reference Prior. Proposed by Bernardo in 1979, this prior maximizes the expected posterior information from the data, therefore reducing the effect of the prior. When there are no nuisance parameters and certain regularity conditions are satisfied, the reference prior is identical to the Jeffreys' prior. Due to the need to order or group the importance of parameters, different posteriors may result from the same data depending on the importance the user places on each parameter. This prior overcomes the problems which arise when using Jeffreys' prior in multivariate applications.


• Maximal Data Information Prior (MDIP). Developed by Zellner in 1971, this prior maximizes the likelihood function with relation to the prior. (Berry et al. 1995, p.182)

For further detail on the differences between each type of non-informative prior see (Berry et al. 1995, p.179)

Conjugate Priors. Calculating posterior distributions can be extremely complex and in most cases requires expensive computations. A special case exists, however, in which the posterior distribution is of the same form as the prior distribution. The Bayesian updating mathematics can then be reduced to simple calculations to update the model parameters. As an example, the gamma distribution is a conjugate prior to a Poisson likelihood function:

Prior:

π₀(λ) = β^α λ^{α−1} e^{−βλ} / Γ(α)

Likelihood (single observation):

Lᵢ(tᵢ|λ) = (λtᵢ)^{kᵢ} e^{−λtᵢ} / kᵢ!

Likelihood (all evidence):

L(E|λ) = ∏_{i=1}^{n_F} Lᵢ(tᵢ|λ) = [∏ᵢ tᵢ^{kᵢ} / ∏ᵢ kᵢ!] λ^{Σk} e^{−λΣtᵢ}

Posterior:

π(λ|E) = π₀(λ) L(E|λ) / ∫₀^∞ π₀(λ) L(E|λ) dλ = λ^{α−1+Σk} e^{−λ(β+Σtᵢ)} / ∫₀^∞ λ^{α−1+Σk} e^{−λ(β+Σtᵢ)} dλ

where the constants common to the numerator and denominator have cancelled. Using the identity Γ(z) = ∫₀^∞ x^{z−1} e^{−x} dx we can calculate the denominator using the change of variable u = λ(β + Σtᵢ). This results in λ = u/(β + Σtᵢ) and dλ = du/(β + Σtᵢ), with the limits of u the same as λ. Letting z = α + Σk:

∫₀^∞ λ^{α−1+Σk} e^{−λ(β+Σtᵢ)} dλ = (β + Σtᵢ)^{−(α+Σk)} ∫₀^∞ u^{z−1} e^{−u} du = Γ(α + Σk) / (β + Σtᵢ)^{α+Σk}

Substituting back into the posterior equation gives:

π(λ|E) = (β + Σtᵢ)^{α+Σk} λ^{α−1+Σk} e^{−λ(β+Σtᵢ)} / Γ(α + Σk)

Let α′ = α + Σk, β′ = β + Σtᵢ:

π(λ|E) = β′^{α′} λ^{α′−1} e^{−β′λ} / Γ(α′)

As can be seen, the posterior is a gamma distribution with parameters α′ = α + Σkᵢ, β′ = β + Σtᵢ. Therefore the prior and posterior are of the same form, and Bayes' rule does not need to be re-calculated for each update. Instead the user can simply update the parameters with the new evidence.
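With the conjugate pair the whole update collapses to two additions. A sketch of ours (function name and numbers are illustrative):

```python
def gamma_poisson_update(alpha, beta, ks, ts):
    """Gamma prior + Poisson evidence -> Gamma posterior.

    alpha' = alpha + sum(k_i), beta' = beta + sum(t_i).
    """
    return alpha + sum(ks), beta + sum(ts)

# prior Gamma(alpha = 1, beta = 10); observe k_i failures in exposure t_i
alpha_post, beta_post = gamma_poisson_update(1.0, 10.0, [2, 0, 1], [50.0, 30.0, 40.0])
print(alpha_post, beta_post)    # 4.0 130.0
print(alpha_post / beta_post)   # posterior mean of the rate
```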

The Likelihood Function L(E|θ)

The reader is referred to section 1.1.6 for a discussion on the construction of the likelihood function.

The Posterior Distribution π(θ|E)

The posterior distribution is a probability distribution of the UOI, θ, which captures our state of knowledge of θ including all prior information and the evidence.

Point Estimate. From the posterior distribution we may want to give a point estimate of θ. The Bayesian estimator when using a quadratic loss function is the posterior mean (Christensen & Huffman 1985):

θ̂_π = E[π(θ|E)] = ∫ θ π(θ|E) dθ = μ

For more information on utility, loss functions and estimators in a Bayesian context see (Berger 1993).

1.4.7. Confidence Intervals

Assuming a random variable is distributed by a given distribution, there exist true distribution parameters, θ₀, which are unknown. The parameter point estimates, θ̂, may or may not be close to the true parameter values. Confidence intervals provide the range over which the true parameter values may exist with a certain level of confidence. Confidence intervals only quantify uncertainty due to error arising from a limited number of samples. Uncertainty due to an incorrect model or incorrect assumptions is not included. (Meeker & Escobar 1998, p.49)

Increasing the desired confidence, γ, results in an increased confidence interval width. Increasing the sample size generally decreases the confidence interval width. There are many methods to calculate confidence intervals. Some popular methods are:

• Exact Confidence Intervals. It may be mathematically shown that the parameter estimator of a distribution itself follows a distribution. In such cases exact confidence intervals can be derived. This is only the case for very few distributions.

• Fisher Information Matrix (Nelson 1990, p.292). For a large number of samples, the asymptotic normality property can be used to estimate confidence intervals:

lim_{n→∞} θ̂ ~ Norm(θ₀, [nI(θ₀)]⁻¹)

Combining this with the asymptotic property θ̂ → θ₀ as n → ∞ gives the following estimate for the distribution of θ̂:

lim_{n→∞} θ̂ ~ Norm(θ̂, [Jₙ(θ̂)]⁻¹)

100γ% approximate confidence intervals are calculated using quantiles of the normal distribution. If the range of θ is unbounded, (−∞, ∞), the approximate two sided confidence intervals are:

θ_γ = θ̂ − Φ⁻¹((1 + γ)/2) √([Jₙ(θ̂)]⁻¹)

θ̄_γ = θ̂ + Φ⁻¹((1 + γ)/2) √([Jₙ(θ̂)]⁻¹)

If the range of θ is (0, ∞) the approximate two sided confidence intervals are:

θ_γ = θ̂ · exp[ −Φ⁻¹((1 + γ)/2) √([Jₙ(θ̂)]⁻¹) / θ̂ ]

θ̄_γ = θ̂ · exp[ +Φ⁻¹((1 + γ)/2) √([Jₙ(θ̂)]⁻¹) / θ̂ ]

If the range of θ is (0, 1) the approximate two sided confidence intervals are:

θ_γ = θ̂ · { θ̂ + (1 − θ̂) exp[ +Φ⁻¹((1 + γ)/2) √([Jₙ(θ̂)]⁻¹) / (θ̂(1 − θ̂)) ] }⁻¹

θ̄_γ = θ̂ · { θ̂ + (1 − θ̂) exp[ −Φ⁻¹((1 + γ)/2) √([Jₙ(θ̂)]⁻¹) / (θ̂(1 − θ̂)) ] }⁻¹

The advantage of this method is that it can be calculated for all distributions and is easy to calculate. The disadvantage is that the assumption of a normal distribution is asymptotic, and so sufficient data is required for the confidence interval estimate to be accurate. The number of samples needed for an accurate estimate changes from distribution to distribution. It also produces symmetrical confidence intervals, which may be very inaccurate. For more information see (Nelson 1990, p.292).

• Likelihood Ratio Intervals (Nelson 1990, p.292). The test statistic for the likelihood ratio is:

D = 2[Λ(θ̂) − Λ(θ)]

D is approximately chi-square distributed with one degree of freedom, so the 100γ% confidence region is the set of θ for which:

D = 2[Λ(θ̂) − Λ(θ)] ≤ χ²(γ; 1)

The two sided confidence limits θ_γ and θ̄_γ are calculated by solving:

Λ(θ) = Λ(θ̂) − χ²(γ; 1)/2

The limits are normally solved numerically. The likelihood ratio intervals are always within the limits of the parameter and give asymmetrical confidence limits. This method is much more accurate than the Fisher information matrix method, particularly for one sided limits, although it is more complicated to calculate. It must be solved numerically and so will not be discussed further in this book.

• Bayesian Confidence Intervals. In Bayesian statistics the uncertainty of a parameter, θ, is quantified by a distribution π(θ). Therefore the two sided 100γ% confidence intervals are found by solving:

(1 − γ)/2 = ∫_{−∞}^{θ_γ} π(θ) dθ,   (1 − γ)/2 = ∫_{θ̄_γ}^{∞} π(θ) dθ

Other methods exist to calculate approximate confidence intervals. A summary of some techniques used in reliability engineering is included in (Lawless 2002).
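As an illustration of the Fisher information method on a (0, ∞) parameter, consider the exponential rate with λ̂ = n/Σt and observed information Jₙ(λ̂) = n/λ̂², so the standard-error term √(Jₙ⁻¹)/λ̂ reduces to 1/√n. The sketch below is our own (including the bisection inverse-normal helper), not from the text:

```python
import math

def norm_ppf(p):
    """Standard normal quantile by bisection on the erf-based cdf."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 0.5 * (1.0 + math.erf(mid / math.sqrt(2.0))) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exp_rate_ci(times, gamma=0.90):
    """Approximate two-sided 100*gamma% interval for an exponential rate,
    on the log scale so the interval stays within (0, inf)."""
    n = len(times)
    lam_hat = n / sum(times)
    half = norm_ppf((1.0 + gamma) / 2.0) / math.sqrt(n)
    return lam_hat * math.exp(-half), lam_hat * math.exp(half)

lo, hi = exp_rate_ci([12.0, 25.0, 8.0, 30.0, 25.0], gamma=0.90)
print(lo, hi)   # brackets lam_hat = 0.05
```

Note the interval is asymmetric about λ̂ but symmetric on the log scale, so the product of the limits equals λ̂².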

1.5. Related Distributions

Figure 7: Relationships between common distributions (Leemis & McQueston 2008).

Many relations are not included such as central limit convergence to the normal distribution and many transforms which would have made the figure unreadable. For further details refer to individual sections and (Leemis & McQueston 2008).


1.6. Supporting Functions

1.6.1. Beta Function B(x, y)

B(x, y) is the Beta function and is the Euler integral of the first kind:

B(x, y) = ∫₀¹ u^{x−1} (1 − u)^{y−1} du

where x > 0 and y > 0.

Relationships:

B(x, y) = B(y, x)

B(x, y) = Γ(x) Γ(y) / Γ(x + y)

B(x, y) = Σ_{n=0}^{∞} (n − y choose n) / (x + n)

More formulas, definitions and special values can be found in the Digital Library of Mathematical Functions on the National Institute of Standards and Technology (NIST) website, http://dlmf.nist.gov.

1.6.2. Incomplete Beta Function B(t; x, y)

B(t; x, y) is the incomplete Beta function expressed by:

B(t; x, y) = ∫₀ᵗ u^{x−1} (1 − u)^{y−1} du

1.6.3. Regularized Incomplete Beta Function I(t; x, y)

I(t; x, y) is the regularized incomplete Beta function:

$$I(t; x, y) = \frac{B(t; x, y)}{B(x, y)}$$

For integer x and y it has the closed form:

$$I(t; x, y) = \sum_{j=x}^{x+y-1} \frac{(x+y-1)!}{j!\,(x+y-1-j)!}\; t^{j}\,(1-t)^{x+y-1-j}$$

Properties:

$$I(0; x, y) = 0$$

$$I(1; x, y) = 1$$

$$I(t; x, y) = 1 - I(1-t; y, x)$$
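These properties, and the integer-argument closed form, can be verified against SciPy's implementation (a minimal sketch; the values of x, y and t are arbitrary):

```python
from math import comb, isclose

from scipy.special import betainc  # regularized incomplete beta I(t; x, y)

x, y, t = 3, 5, 0.4

# Boundary properties: I(0; x, y) = 0 and I(1; x, y) = 1
assert betainc(x, y, 0.0) == 0.0
assert betainc(x, y, 1.0) == 1.0

# Symmetry: I(t; x, y) = 1 - I(1-t; y, x)
assert isclose(betainc(x, y, t), 1.0 - betainc(y, x, 1.0 - t), rel_tol=1e-12)

# Closed form for integer x, y (a binomial tail sum)
closed = sum(comb(x + y - 1, j) * t**j * (1 - t) ** (x + y - 1 - j)
             for j in range(x, x + y))
assert isclose(betainc(x, y, t), closed, rel_tol=1e-10)
```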

1.6.4. Complete Gamma Function Γ(k)

Γ(k) is a generalization of the factorial function k! to include non-integer values.


For $k > 0$:

$$\Gamma(k) = \int_0^{\infty} t^{k-1} e^{-t}\,dt$$

Integrating by parts gives the recurrence:

$$\Gamma(k) = \left[-t^{k-1}e^{-t}\right]_0^{\infty} + (k-1)\int_0^{\infty} t^{k-2} e^{-t}\,dt = (k-1)\,\Gamma(k-1)$$

When k is an integer:

$$\Gamma(k) = (k-1)!$$

Special values:

$$\Gamma(1) = 1$$

Γ(2) = 1

$$\Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi}$$

Relation to the incomplete gamma functions:

$$\Gamma(k) = \Gamma(k, t) + \gamma(k, t)$$

More formulas, definitions and special values can be found in the Digital Library of Mathematical Functions on the National Institute of Standards and Technology (NIST) website, http://dlmf.nist.gov.
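The special values and the recurrence above can be confirmed with the standard library alone (a minimal sketch):

```python
from math import factorial, gamma, isclose, pi, sqrt

# Gamma(k) = (k-1)! for integer k
for k in range(1, 8):
    assert isclose(gamma(k), factorial(k - 1), rel_tol=1e-12)

# Special values: Gamma(1) = 1, Gamma(2) = 1, Gamma(1/2) = sqrt(pi)
assert gamma(1) == 1.0
assert gamma(2) == 1.0
assert isclose(gamma(0.5), sqrt(pi), rel_tol=1e-12)

# Recurrence Gamma(k) = (k-1) Gamma(k-1), checked at a non-integer point
assert isclose(gamma(4.7), 3.7 * gamma(3.7), rel_tol=1e-12)
```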

1.6.5. Upper Incomplete Gamma Function Γ(k, t)

For $k > 0$:

$$\Gamma(k, t) = \int_t^{\infty} x^{k-1} e^{-x}\,dx$$

When k is an integer:

$$\Gamma(k, t) = (k-1)!\; e^{-t} \sum_{n=0}^{k-1} \frac{t^n}{n!}$$

More formulas, definitions and special values can be found on the NIST website, http://dlmf.nist.gov.
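For integer k the finite-sum form can be checked against SciPy, which returns the regularized function $Q(k,t) = \Gamma(k,t)/\Gamma(k)$ (a minimal sketch; k and t are arbitrary):

```python
from math import exp, factorial, gamma, isclose

from scipy.special import gammaincc  # regularized upper incomplete gamma Q(k, t)

k, t = 4, 2.5

# Closed form for integer k: Gamma(k, t) = (k-1)! e^{-t} sum_{n=0}^{k-1} t^n / n!
closed = factorial(k - 1) * exp(-t) * sum(t**n / factorial(n) for n in range(k))

# SciPy's gammaincc is regularized, so multiply back by Gamma(k)
assert isclose(gammaincc(k, t) * gamma(k), closed, rel_tol=1e-12)
```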

1.6.6. Lower Incomplete Gamma Function γ(k, t)

For $k > 0$:

$$\gamma(k, t) = \int_0^{t} x^{k-1} e^{-x}\,dx$$

When k is an integer:

$$\gamma(k, t) = (k-1)!\left[1 - e^{-t} \sum_{n=0}^{k-1} \frac{t^n}{n!}\right]$$


More formulas, definitions and special values can be found on the NIST website, http://dlmf.nist.gov.

1.6.7. Digamma Function ψ(x)

ψ(x) is the digamma function, defined as:

$$\psi(x) = \frac{d}{dx}\ln[\Gamma(x)] = \frac{\Gamma'(x)}{\Gamma(x)} \quad \text{for } x > 0$$

1.6.8. Trigamma Function ψ′(x)

ψ′(x) is the trigamma function, defined as:

$$\psi'(x) = \frac{d^2}{dx^2}\ln\Gamma(x) = \sum_{i=0}^{\infty} (x+i)^{-2}$$


1.7. Referred Distributions

1.7.1. Inverse Gamma Distribution IG(α, β)

The pdf of the inverse gamma distribution is:

$$f(x; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{-\alpha-1}\, e^{-\beta/x}, \qquad x \in (0, \infty)$$

with mean:

$$\mu = \frac{\beta}{\alpha - 1} \quad \text{for } \alpha > 1$$

1.7.2. Student T Distribution T(α, μ, σ²)

The pdf of the standard Student t distribution, with μ = 0 and σ = 1, is:

$$f(x; \alpha) = \frac{\Gamma[(\alpha+1)/2]}{\sqrt{\alpha\pi}\;\Gamma(\alpha/2)} \left(1 + \frac{x^2}{\alpha}\right)^{-\frac{\alpha+1}{2}}$$

The generalized Student t distribution is:

$$f(x; \alpha, \mu, \sigma^2) = \frac{\Gamma[(\alpha+1)/2]}{\sigma\sqrt{\alpha\pi}\;\Gamma(\alpha/2)} \left(1 + \frac{(x-\mu)^2}{\alpha\sigma^2}\right)^{-\frac{\alpha+1}{2}}$$

with mean μ.

1.7.3. F Distribution F(n₁, n₂)

Also known as the Variance Ratio or Fisher-Snedecor distribution, the pdf is:

$$f(x; n_1, n_2) = \frac{1}{x\,B(n_1/2,\, n_2/2)} \sqrt{\frac{(n_1 x)^{n_1}\; n_2^{n_2}}{(n_1 x + n_2)^{n_1+n_2}}}$$

with cdf:

$$F(x; n_1, n_2) = I\!\left(t;\; \frac{n_1}{2}, \frac{n_2}{2}\right) \quad \text{where } t = \frac{n_1 x}{n_1 x + n_2}$$

1.7.4. Chi-Square Distribution χ²(v)

The pdf of the chi-square distribution is:
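The F-distribution cdf identity above can be checked directly against SciPy (a minimal sketch; the degrees of freedom and evaluation point are arbitrary):

```python
from math import isclose

from scipy.special import betainc  # regularized incomplete beta I(t; x, y)
from scipy.stats import f  # Fisher-Snedecor F distribution

n1, n2, x = 5, 12, 2.3

# F(x; n1, n2) = I(t; n1/2, n2/2) with t = n1*x / (n1*x + n2)
t = n1 * x / (n1 * x + n2)
assert isclose(f.cdf(x, n1, n2), betainc(n1 / 2, n2 / 2, t), rel_tol=1e-10)
```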

$$f(x; v) = \frac{x^{(v-2)/2}\, \exp(-x/2)}{2^{v/2}\; \Gamma(v/2)}$$

with mean:

$$\mu = v$$

1.7.5. Hypergeometric Distribution Hypergeo(k; n, m, N)

The hypergeometric distribution models the probability of k successes in n Bernoulli trials, drawn without replacement from a population of size N containing m successes, so that p = m/N. The pdf of the hypergeometric distribution is:

$$f(k; n, m, N) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}$$

with mean:

$$\mu = \frac{nm}{N}$$

1.7.6. Wishart Distribution Wishartd(x; Σ, n)
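The pmf and mean can be verified with SciPy (a minimal sketch; note that SciPy's argument order differs from the notation above, and the population values are arbitrary):

```python
from math import comb, isclose

from scipy.stats import hypergeom

# Population of N=50 containing m=5 successes; draw n=10 without replacement.
# SciPy's calling convention is hypergeom(M=N, n=m, N=n).
N, m, n = 50, 5, 10
rv = hypergeom(N, m, n)

# pmf matches the closed form f(k) = C(m,k) C(N-m, n-k) / C(N, n)
for k in range(0, m + 1):
    closed = comb(m, k) * comb(N - m, n - k) / comb(N, n)
    assert isclose(rv.pmf(k), closed, rel_tol=1e-10)

# Mean = n*m/N = 10*5/50 = 1.0
assert isclose(rv.mean(), n * m / N, rel_tol=1e-12)
```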

The Wishart distribution is the multivariate generalization of the gamma distribution. The pdf is given as:

$$f(\mathbf{x}; \boldsymbol{\Sigma}, n) = \frac{|\mathbf{x}|^{(n-d-1)/2}}{2^{nd/2}\, |\boldsymbol{\Sigma}|^{n/2}\, \Gamma_d(n/2)} \exp\!\left(-\tfrac{1}{2}\,\mathrm{tr}\!\left(\mathbf{x}\boldsymbol{\Sigma}^{-1}\right)\right)$$

with mean:

$$\boldsymbol{\mu} = n\boldsymbol{\Sigma}$$

1.8. Nomenclature and Notation

Functions are presented in the following form:

f(random variable; distribution parameters | given variables)

n: In continuous distributions, the number of items under test, n = n_F + n_S + n_I. In discrete distributions, the total number of trials.

n_F: The number of items which failed before the conclusion of the test.

n_S: The number of items which survived to the end of the test.

n_I: The number of items which have interval data.

t_i^F: The time at which a component fails.

t_i^S: The time to which a component survived. The item may have been removed from the test for a reason other than failure.

t_i^UI: The upper limit of a censored interval in which an item failed.

t_i^LI: The lower limit of a censored interval in which an item failed.

t_L: The lower truncated limit of the sample.

t_U: The upper truncated limit of the sample.

t_T: Time on test, t_T = Σt_i^F + Σt_i^S.

X or T: Continuous random variable (T is normally a random time).

K: Discrete random variable.

x or t: A continuous random variable with a known value.

k: A discrete random variable with a known value.

x̂: The hat denotes an estimated value.

x (bold): A bold symbol denotes a vector or matrix.

θ: Generalized unknown of interest (UOI).

θ̄: Upper confidence interval for the UOI.

θ̲: Lower confidence interval for the UOI.

X ~ Norm_d: The random variable X is distributed as a d-variate normal distribution.


2. Common Life Distributions


2.1. Exponential Continuous Distribution

[Plots not reproduced: Probability Density Function f(t), Cumulative Density Function F(t), and Hazard Rate h(t) for λ = 0.5, 1, 2 and 3.]


Parameters & Description

λ: λ > 0. Equal to the hazard rate.

Limits: t ≥ 0.

Function (Time Domain / Laplace Domain)

PDF:
$$f(t) = \lambda e^{-\lambda t} \qquad\qquad f(s) = \frac{\lambda}{s+\lambda}, \quad s > -\lambda$$

CDF:
$$F(t) = 1 - e^{-\lambda t} \qquad\qquad F(s) = \frac{\lambda}{s(s+\lambda)}$$

Reliability:
$$R(t) = e^{-\lambda t} \qquad\qquad R(s) = \frac{1}{s+\lambda}$$

Conditional Survivor Function, $P(T > x + t \mid T > t)$:
$$m(x) = e^{-\lambda x} \qquad\qquad m(s) = \frac{1}{s+\lambda}$$

where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life:
$$u(t) = \frac{1}{\lambda} \qquad\qquad u(s) = \frac{1}{\lambda s}$$

Hazard Rate:
$$h(t) = \lambda \qquad\qquad h(s) = \frac{\lambda}{s}$$

Cumulative Hazard Rate:
$$H(t) = \lambda t \qquad\qquad H(s) = \frac{\lambda}{s^2}$$

Properties and Moments

Median: $\ln(2)/\lambda$
Mode: 0
Mean (1st raw moment): $1/\lambda$
Variance (2nd central moment): $1/\lambda^2$
Skewness (3rd central moment): 2
Excess kurtosis (4th central moment): 6
Characteristic function: $\dfrac{i\lambda}{t + i\lambda}$
100α% percentile function: $t_\alpha = -\dfrac{1}{\lambda}\ln(1-\alpha)$
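The tabulated time-domain formulas can be spot-checked against SciPy's exponential implementation (a minimal sketch; the value of λ is arbitrary, and note SciPy parameterizes by scale = 1/λ):

```python
from math import exp, isclose, log

from scipy.stats import expon

lam = 2.0
rv = expon(scale=1.0 / lam)  # SciPy uses scale = 1/lambda
t = 0.7

assert isclose(rv.pdf(t), lam * exp(-lam * t), rel_tol=1e-12)        # f(t)
assert isclose(rv.cdf(t), 1 - exp(-lam * t), rel_tol=1e-12)          # F(t)
assert isclose(rv.sf(t), exp(-lam * t), rel_tol=1e-12)               # R(t)
assert isclose(rv.pdf(t) / rv.sf(t), lam, rel_tol=1e-12)             # h(t) = lambda
assert isclose(rv.median(), log(2) / lam, rel_tol=1e-12)             # median
assert isclose(rv.mean(), 1 / lam, rel_tol=1e-12)                    # mean
assert isclose(rv.var(), 1 / lam**2, rel_tol=1e-12)                  # variance
assert isclose(rv.ppf(0.9), -log(1 - 0.9) / lam, rel_tol=1e-12)      # percentile
```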

Parameter Estimation

Plotting Method (Least Squares, y = mx + c): X-axis $t_i$, Y-axis $\ln[1 - F(t_i)]$. Then $\hat{\lambda} = -m$.

Likelihood Function:

$$L(E|\lambda) = \underbrace{\lambda^{n_F} \prod_{i=1}^{n_F} e^{-\lambda t_i^F}}_{\text{failures}} \cdot \underbrace{\prod_{i=1}^{n_S} e^{-\lambda t_i^S}}_{\text{survivors}} \cdot \underbrace{\prod_{i=1}^{n_I} \left[e^{-\lambda t_i^{LI}} - e^{-\lambda t_i^{UI}}\right]}_{\text{interval failures}}$$

When there is no interval data this reduces to:

$$L(E|\lambda) = \lambda^{n_F} e^{-\lambda t_T} \quad \text{where } t_T = \sum t_i^F + \sum t_i^S = \text{total time in test}$$

Log-Likelihood Function:

$$\Lambda(E|\lambda) = n_F \ln(\lambda) - \lambda\sum_{i=1}^{n_F} t_i^F - \lambda\sum_{i=1}^{n_S} t_i^S + \sum_{i=1}^{n_I} \ln\!\left[e^{-\lambda t_i^{LI}} - e^{-\lambda t_i^{UI}}\right]$$

When there is no interval data this reduces to:

$$\Lambda(E|\lambda) = n_F \ln(\lambda) - \lambda t_T$$

Setting $\partial\Lambda/\partial\lambda = 0$ and solving for λ gives $\hat{\lambda}$:

$$\frac{\partial\Lambda}{\partial\lambda} = \frac{n_F}{\lambda} - \sum_{i=1}^{n_F} t_i^F - \sum_{i=1}^{n_S} t_i^S - \sum_{i=1}^{n_I} \frac{t_i^{LI} e^{-\lambda t_i^{LI}} - t_i^{UI} e^{-\lambda t_i^{UI}}}{e^{-\lambda t_i^{LI}} - e^{-\lambda t_i^{UI}}} = 0$$

Point Estimates: when there is only complete and right-censored data the point estimate is:

$$\hat{\lambda} = \frac{n_F}{t_T} \quad \text{where } t_T = \sum t_i^F + \sum t_i^S = \text{total time in test}$$

Fisher Information:

$$I(\lambda) = \frac{1}{\lambda^2}$$

100γ% Confidence Intervals (excluding interval data)

Type I (time terminated): lower two-sided $\dfrac{\chi^2_{(1-\gamma)/2}(2n_F)}{2 t_T}$; upper two-sided $\dfrac{\chi^2_{(1+\gamma)/2}(2n_F+2)}{2 t_T}$; upper one-sided $\dfrac{\chi^2_{\gamma}(2n_F+2)}{2 t_T}$.

Type II (failure terminated): lower two-sided $\dfrac{\chi^2_{(1-\gamma)/2}(2n_F)}{2 t_T}$; upper two-sided $\dfrac{\chi^2_{(1+\gamma)/2}(2n_F)}{2 t_T}$; upper one-sided $\dfrac{\chi^2_{\gamma}(2n_F)}{2 t_T}$.

where $\chi^2_{\alpha}(\cdot)$ is the 100αth percentile of the Chi-squared distribution. (Modarres et al. 1999, pp.151-152) Note: these confidence intervals are only valid for complete and right-censored data, or when approximations of interval data are used (such as the median). They are exact confidence bounds

and therefore approximate methods, such as use of the Fisher information matrix, need not be used.

Bayesian

Non-informative priors π(λ) (Yang and Berger 1998, p.6):

Uniform proper prior, 1/(b−a), with limits λ ∈ [a, b]: the posterior is a truncated Gamma distribution, $c \cdot Gamma(\lambda;\, 1 + n_F,\, t_T)$ for a ≤ λ ≤ b; otherwise π(λ) = 0.

Uniform improper prior, 1 ∝ Gamma(1, 0), with limits λ ∈ [0, ∞): posterior $Gamma(\lambda;\, 1 + n_F,\, t_T)$.

Jeffrey's prior, $1/\sqrt{\lambda} \propto Gamma(\tfrac{1}{2}, 0)$ when λ ∈ [0, ∞): posterior $Gamma(\lambda;\, \tfrac{1}{2} + n_F,\, t_T)$.

Novick and Hall, $1/\lambda \propto Gamma(0, 0)$ when λ ∈ [0, ∞): posterior $Gamma(\lambda;\, n_F,\, t_T)$.

where $t_T = \sum t_i^F + \sum t_i^S$ = total time in test.

Conjugate Priors

UOI: λ from the Exponential model Exp(t; λ). Evidence: n_F failures in t_T units of time. Prior: Gamma(k₀, Λ₀). Posterior parameters: k = k₀ + n_F, Λ = Λ₀ + t_T.

Description, Limitations and Uses

Example. Three vehicle tires were run on a test area for 1000 km each and had punctures at the following distances:

Tire 1: no punctures
Tire 2: 400 km, 900 km
Tire 3: 200 km

Punctures are random failures with a constant failure rate, so an exponential distribution is appropriate. Because the exponential distribution is homogeneous in time, the renewal process of the second tire failing twice (with repair) can be treated as two separate tires on test with single failures. See the example in section 1.1.6.

Total distance on test is 3 × 1000 = 3000 km. The total number of failures is 3. Therefore, using the MLE, the estimate of λ is:

$$\hat{\lambda} = \frac{n_F}{t_T} = \frac{3}{3000} = 1\text{E-}3$$

with 90% confidence interval (distance terminated test):

$$\left[\frac{\chi^2_{0.05}(6)}{2 \times 3000} = 0.272\text{E-}3,\;\; \frac{\chi^2_{0.95}(8)}{2 \times 3000} = 2.584\text{E-}3\right]$$

A Bayesian point estimate using the Jeffery non-informative improper prior $Gamma(\tfrac{1}{2}, 0)$, with posterior $Gamma(\lambda;\, 3.5,\, 3000)$, has the point estimate:

$$\hat{\lambda} = \mathrm{E}\!\left[Gamma(\lambda;\, 3.5,\, 3000)\right] = \frac{3.5}{3000} = 1.1\dot{6}\text{E-}3$$

with 90% confidence interval using the inverse Gamma cdf:

$$\left[F_G^{-1}(0.05) = 0.361\text{E-}3,\;\; F_G^{-1}(0.95) = 2.344\text{E-}3\right]$$

Characteristics

Constant Failure Rate. The exponential distribution is defined by a constant failure rate, λ. This means the component is not subject to wear or accumulation of damage as time increases.

f(0) = λ. As can be seen, λ is the initial value of the distribution. Increases in λ increase the probability density at f(0).

HPP. The exponential distribution is the time to failure distribution of a single event in the Homogeneous Poisson Process (HPP).
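The numbers in the tire example above can be reproduced numerically (a sketch using SciPy; `chi2.ppf` and `gamma.ppf` give the percentile values quoted in the example):

```python
from math import isclose

from scipy.stats import chi2, gamma

n_f, t_T = 3, 3000.0  # three punctures over 3 x 1000 km of testing

# MLE point estimate
lam_hat = n_f / t_T
assert isclose(lam_hat, 1e-3, rel_tol=1e-12)

# 90% chi-squared confidence interval, time (distance) terminated test
lower = chi2.ppf(0.05, 2 * n_f) / (2 * t_T)
upper = chi2.ppf(0.95, 2 * n_f + 2) / (2 * t_T)
assert isclose(lower, 0.272e-3, abs_tol=0.5e-5)
assert isclose(upper, 2.584e-3, abs_tol=0.5e-5)

# Bayesian posterior Gamma(3.5, 3000) under Jeffrey's prior
post = gamma(a=3.5, scale=1.0 / t_T)
assert isclose(post.mean(), 3.5 / t_T, rel_tol=1e-12)
assert isclose(post.ppf(0.05), 0.361e-3, abs_tol=0.5e-5)
assert isclose(post.ppf(0.95), 2.344e-3, abs_tol=0.5e-5)
```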

Scaling property: if $T \sim Exp(t; \lambda)$, then

$$aT \sim Exp\!\left(t;\, \frac{\lambda}{a}\right)$$

Minimum property:

$$\min\{T_1, T_2, \ldots, T_n\} \sim Exp\!\left(t;\, \sum_{i=1}^{n} \lambda_i\right)$$

Variate generation property:

$$F^{-1}(u) = \frac{-\ln(1-u)}{\lambda}, \quad 0 < u < 1$$

Memoryless property:

$$\Pr(T > t + x \mid T > t) = \Pr(T > x)$$

Properties from (Leemis & McQueston 2008).

Applications

No Wearout. The exponential distribution is used to model occasions when there is no wearout or cumulative damage. It can be used to approximate the failure rate in a component's useful life period (after burn-in and before wear-out).
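The variate generation property is the basis of inverse-transform sampling. A minimal sketch (the λ value is arbitrary):

```python
import random
from math import isclose, log

from scipy.stats import expon

lam = 0.5

def exp_variate(u: float, lam: float) -> float:
    """Inverse transform: F^{-1}(u) = -ln(1 - u) / lambda."""
    return -log(1.0 - u) / lam

# The inverse transform matches SciPy's percent-point function exactly
for u in (0.1, 0.5, 0.9, 0.99):
    assert isclose(exp_variate(u, lam), expon(scale=1 / lam).ppf(u), rel_tol=1e-12)

# Feeding uniform random numbers through F^{-1} yields exponential samples
random.seed(1)
sample = [exp_variate(random.random(), lam) for _ in range(100_000)]
mean = sum(sample) / len(sample)
assert abs(mean - 1 / lam) < 0.05  # sample mean near 1/lambda = 2.0
```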

46 Common Life Distributions

Homogeneous Poisson Process (HPP). The exponential distribution is used to model the inter-arrival times in a repairable system or the arrival times in queuing models. See the Poisson and Gamma distributions for more detail.

Electronic Components. Some electronic components, such as capacitors or integrated circuits, have been found to follow an exponential distribution. Early efforts at collecting reliability data assumed a constant failure rate, and therefore many reliability handbooks only provide failure rate estimates for components.

Random Shocks. It is common for the exponential distribution to model the occurrence of random shocks. An example is the failure of a vehicle tire due to puncture from a nail (random shock). The probability of failure in the next mile is independent of how many miles the tire has travelled (memoryless). The probability of failure when the tire is new is the same as when the tire is old (constant failure rate).

In general component life distributions do not have a constant failure rate, for example due to wear or early failures. Therefore the exponential distribution is often inappropriate to model most life distributions, particularly mechanical components.

Resources

Online:
http://www.weibull.com/LifeDataWeb/the_exponential_distribution.htm
http://mathworld.wolfram.com/ExponentialDistribution.html
http://en.wikipedia.org/wiki/Exponential_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Balakrishnan, N. & Basu, A.P., 1996. Exponential Distribution: Theory, Methods and Applications 1st ed., CRC.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Relationship to Other Distributions

2-Parameter Exponential Distribution Exp(t; μ, β). Special case:

$$Exp(t; \lambda) = Exp2\!\left(t;\, \mu = 0,\, \beta = \tfrac{1}{\lambda}\right)$$

Gamma Distribution Gamma(t; k, λ). Let

$$T_1, \ldots, T_k \sim Exp(\lambda) \quad \text{and} \quad T = T_1 + T_2 + \cdots + T_k$$

Then:

$$T \sim Gamma(t; k, \lambda)$$

The gamma distribution is the probability density function of the sum of k exponentially distributed random variables sharing the same constant rate of occurrence, λ. This is a Homogeneous Poisson Process.

Special case:

$$Exp(t; \lambda) = Gamma(t;\, k = 1,\, \lambda)$$

Poisson Distribution Pois(k; μ). Let $T_1, T_2, \ldots \sim Exp(t; \lambda)$ be inter-arrival times, and let K be the number of arrivals within the interval [0, t]. Then:

$$K \sim Pois(k;\, \mu = \lambda t)$$

The Poisson distribution is the probability of observing exactly k occurrences within a time interval [0, t] where the inter-arrival times of each occurrence are exponentially distributed. This is a Homogeneous Poisson Process.

Special case: the probability of observing no failures in [0, t] is the exponential reliability function:

$$Pois(k = 0;\, \mu = \lambda t) = R_{Exp}(t; \lambda) = e^{-\lambda t}$$

Weibull Distribution Weibull(t; α, β). Let

$$X \sim Exp(\lambda) \quad \text{and} \quad Y = X^{1/\beta}$$

Then:

$$Y \sim Weibull\!\left(\alpha = \lambda^{-1/\beta},\, \beta\right)$$

Special case:

$$Exp(t; \lambda) = Weibull\!\left(t;\, \alpha = \tfrac{1}{\lambda},\, \beta = 1\right)$$

Geometric Distribution Geometric(k; p). Let

$$X \sim Exp(\lambda) \quad \text{and} \quad Y = \lfloor X \rfloor, \text{ the integer part of } X$$

Then Y is geometrically distributed (with $p = 1 - e^{-\lambda}$). The Geometric distribution is the discrete equivalent of the continuous exponential distribution; the geometric distribution is also memoryless.

Rayleigh Distribution Rayleigh(t; α). Let

$$X \sim Exp(\lambda) \quad \text{and} \quad Y = \sqrt{X}$$

Then:

$$Y \sim Rayleigh\!\left(\alpha = \tfrac{1}{\sqrt{\lambda}}\right)$$

Chi-square Distribution χ²(x; v). Special case:

$$\chi^2(x;\, v = 2) = Exp\!\left(x;\, \lambda = \tfrac{1}{2}\right)$$

Pareto Distribution Pareto(t; θ, α). Let

$$Y \sim Pareto(\theta, \alpha) \quad \text{and} \quad X = \ln(Y/\theta)$$

Then:

$$X \sim Exp(\lambda = \alpha)$$

Logistic Distribution Logistic(μ, s). Let

$$X \sim Exp(\lambda = 1) \quad \text{and} \quad Y = \ln\!\left(\frac{e^{-X}}{1 - e^{-X}}\right)$$

Then:

$$Y \sim Logistic(0, 1)$$

(Hastings et al. 2000, p.127)
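A few of the special cases above can be confirmed numerically by comparing densities (a minimal sketch with SciPy; λ and t are arbitrary):

```python
from math import isclose

from scipy.stats import chi2, expon, gamma, weibull_min

lam, t = 0.8, 1.3
e = expon(scale=1 / lam)

# Gamma(k=1, lam) reduces to Exp(lam)
assert isclose(gamma(a=1, scale=1 / lam).pdf(t), e.pdf(t), rel_tol=1e-12)

# Weibull(alpha=1/lam, beta=1) reduces to Exp(lam)
assert isclose(weibull_min(c=1, scale=1 / lam).pdf(t), e.pdf(t), rel_tol=1e-12)

# Chi-square with v=2 degrees of freedom is Exp(lam=1/2)
assert isclose(chi2(df=2).pdf(t), expon(scale=2).pdf(t), rel_tol=1e-12)
```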


2.2. Lognormal Continuous Distribution

[Plots not reproduced: Probability Density Function f(t), Cumulative Density Function F(t), and Hazard Rate h(t) for parameter combinations of μ′ and σ′.]


Parameters & Description

μ_N: −∞ < μ_N < ∞. Scale parameter: the mean of the normally distributed ln(x). This parameter only determines the scale and not the location, as it would in a normal distribution.

$$\mu_N = \ln\!\left(\frac{\mu^2}{\sqrt{\sigma^2 + \mu^2}}\right)$$

σ_N: σ_N > 0. Shape parameter: the standard deviation of the normally distributed ln(x). This parameter only determines the shape and not the scale, as it would in a normal distribution.

$$\sigma_N^2 = \ln\!\left(\frac{\sigma^2 + \mu^2}{\mu^2}\right)$$

Limits: t > 0.

Distribution Formulas

PDF:

$$f(t) = \frac{1}{\sigma_N t \sqrt{2\pi}} \exp\!\left[-\frac{1}{2}\left(\frac{\ln t - \mu_N}{\sigma_N}\right)^2\right] = \frac{1}{\sigma_N t}\,\phi\!\left(\frac{\ln t - \mu_N}{\sigma_N}\right)$$

where φ is the standard normal pdf.

CDF:

$$F(t) = \int_0^t \frac{1}{\sigma_N t^* \sqrt{2\pi}} \exp\!\left[-\frac{1}{2}\left(\frac{\ln t^* - \mu_N}{\sigma_N}\right)^2\right] dt^*$$

where t* is the time variable over which the pdf is integrated. Equivalently,

$$F(t) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\!\left(\frac{\ln t - \mu_N}{\sigma_N \sqrt{2}}\right) = \Phi\!\left(\frac{\ln t - \mu_N}{\sigma_N}\right)$$

where Φ is the standard normal cdf.

Reliability:

$$R(t) = 1 - \Phi\!\left(\frac{\ln t - \mu_N}{\sigma_N}\right)$$

Conditional Survivor Function, $P(T > x + t \mid T > t)$:

$$m(x) = R(x|t) = \frac{R(t+x)}{R(t)} = \frac{1 - \Phi\!\left(\dfrac{\ln(x+t) - \mu_N}{\sigma_N}\right)}{1 - \Phi\!\left(\dfrac{\ln t - \mu_N}{\sigma_N}\right)}$$

where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life:

$$u(t) = \frac{\int_t^{\infty} R(x)\,dx}{R(t)} \qquad \text{with} \qquad \lim_{t\to\infty} u(t) \approx \frac{\sigma_N^2\, t}{\ln t - \mu_N}\,[1 + o(1)]$$

where o(1) is Landau's notation. (Kleiber & Kotz 2003, p.114)

Hazard Rate:

$$h(t) = \frac{\phi\!\left(\dfrac{\ln t - \mu_N}{\sigma_N}\right)}{t\,\sigma_N\!\left[1 - \Phi\!\left(\dfrac{\ln t - \mu_N}{\sigma_N}\right)\right]}$$

Cumulative Hazard Rate:

$$H(t) = -\ln[R(t)]$$

Properties and Moments

Median: $e^{\mu_N}$
Mode: $e^{\mu_N - \sigma_N^2}$
Mean (1st raw moment): $e^{\mu_N + \sigma_N^2/2}$
Variance (2nd central moment): $\left(e^{\sigma_N^2} - 1\right) e^{2\mu_N + \sigma_N^2}$
Skewness (3rd central moment): $\left(e^{\sigma_N^2} + 2\right)\sqrt{e^{\sigma_N^2} - 1}$
Excess kurtosis (4th central moment): $e^{4\sigma_N^2} + 2e^{3\sigma_N^2} + 3e^{2\sigma_N^2} - 6$
Characteristic function: deriving a unique characteristic equation is not trivial, and complex series solutions have been proposed. (Leipnik 1991)
100α% percentile function: $t_\alpha = e^{\mu_N + z_\alpha \sigma_N} = e^{\mu_N + \sigma_N \Phi^{-1}(\alpha)}$, where $z_\alpha$ is the 100αth percentile of the standard normal distribution.

Parameter Estimation

Plotting Method (Least Squares, y = mx + c): X-axis $\ln(t_i)$, Y-axis $\Phi^{-1}[F(t_i)]$. Then $\hat{\mu}_N = -\dfrac{c}{m}$ and $\hat{\sigma}_N = \dfrac{1}{m}$.
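The parameter conversions between ordinary moments (μ, σ) and log-space parameters (μ_N, σ_N) round-trip cleanly; a minimal sketch with SciPy (the values of μ and σ are arbitrary):

```python
from math import exp, isclose, log, sqrt

from scipy.stats import lognorm

# Convert ordinary moments (mu, sigma) to log-space parameters (mu_N, sigma_N)
mu, sigma = 10.0, 4.0
mu_N = log(mu**2 / sqrt(sigma**2 + mu**2))
sigma_N = sqrt(log((sigma**2 + mu**2) / mu**2))

# SciPy parameterizes the lognormal as lognorm(s=sigma_N, scale=exp(mu_N))
rv = lognorm(s=sigma_N, scale=exp(mu_N))

# Round trip: the distribution's moments recover mu and sigma
assert isclose(rv.mean(), mu, rel_tol=1e-12)
assert isclose(rv.std(), sigma, rel_tol=1e-12)

# Median from the properties table: e^{mu_N}
assert isclose(rv.median(), exp(mu_N), rel_tol=1e-12)
```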

52 Common Life Distributions

Maximum Likelihood Function

Likelihood function, with $z_i^x = \dfrac{\ln(t_i^x) - \mu_N}{\sigma_N}$:

$$L(\mu_N, \sigma_N | E) = \underbrace{\prod_{i=1}^{n_F} \frac{1}{\sigma_N t_i^F}\,\phi(z_i^F)}_{\text{failures}} \cdot \underbrace{\prod_{i=1}^{n_S} \left[1 - \Phi(z_i^S)\right]}_{\text{survivors}} \cdot \underbrace{\prod_{i=1}^{n_I} \left[\Phi(z_i^{UI}) - \Phi(z_i^{LI})\right]}_{\text{interval failures}}$$

Log-likelihood function:

$$\Lambda(\mu_N, \sigma_N | E) = \sum_{i=1}^{n_F} \ln\!\left[\frac{1}{\sigma_N t_i^F}\,\phi(z_i^F)\right] + \sum_{i=1}^{n_S} \ln\!\left[1 - \Phi(z_i^S)\right] + \sum_{i=1}^{n_I} \ln\!\left[\Phi(z_i^{UI}) - \Phi(z_i^{LI})\right]$$

Setting $\partial\Lambda/\partial\mu_N = 0$ gives the MLE $\hat{\mu}_N$:

$$\frac{\partial\Lambda}{\partial\mu_N} = \frac{1}{\sigma_N}\sum_{i=1}^{n_F} z_i^F + \frac{1}{\sigma_N}\sum_{i=1}^{n_S} \frac{\phi(z_i^S)}{1 - \Phi(z_i^S)} - \frac{1}{\sigma_N}\sum_{i=1}^{n_I} \frac{\phi(z_i^{UI}) - \phi(z_i^{LI})}{\Phi(z_i^{UI}) - \Phi(z_i^{LI})} = 0$$

Setting $\partial\Lambda/\partial\sigma_N = 0$ gives $\hat{\sigma}_N$:

$$\frac{\partial\Lambda}{\partial\sigma_N} = \frac{1}{\sigma_N}\sum_{i=1}^{n_F} \left[(z_i^F)^2 - 1\right] + \frac{1}{\sigma_N}\sum_{i=1}^{n_S} \frac{z_i^S\,\phi(z_i^S)}{1 - \Phi(z_i^S)} - \frac{1}{\sigma_N}\sum_{i=1}^{n_I} \frac{z_i^{UI}\phi(z_i^{UI}) - z_i^{LI}\phi(z_i^{LI})}{\Phi(z_i^{UI}) - \Phi(z_i^{LI})} = 0$$

MLE Point Estimates. When there is only complete failure data the point estimates can be given as:

$$\hat{\mu}_N = \frac{\sum \ln(t_i^F)}{n_F} \qquad \hat{\sigma}_N^2 = \frac{\sum\left[\ln(t_i^F) - \hat{\mu}_N\right]^2}{n_F}$$

Note: in almost all cases the MLE methods for a normal distribution can be used by taking ln(X). However, normal distribution estimation methods cannot be used with interval data. (Johnson et al. 1994, p.220)

In most cases the unbiased estimators are used:

$$\hat{\mu}_N = \frac{\sum \ln(t_i^F)}{n_F} \qquad \hat{\sigma}_N^2 = \frac{\sum\left[\ln(t_i^F) - \hat{\mu}_N\right]^2}{n_F - 1}$$

Fisher Information:

$$I(\mu_N, \sigma_N^2) = \begin{bmatrix} \dfrac{1}{\sigma^2} & 0 \\ 0 & \dfrac{1}{2\sigma^4} \end{bmatrix}$$

(Kleiber & Kotz 2003, p.119)

100γ% Confidence Intervals (for complete data)

For μ_N: one-sided lower $\hat{\mu}_N - t_{\gamma}(n_F-1)\dfrac{\hat{\sigma}_N}{\sqrt{n_F}}$; two-sided $\hat{\mu}_N \pm t_{\left(\frac{1+\gamma}{2}\right)}(n_F-1)\dfrac{\hat{\sigma}_N}{\sqrt{n_F}}$.

For σ_N²: two-sided $\left[\dfrac{\hat{\sigma}_N^2\,(n_F-1)}{\chi^2_{\left(\frac{1+\gamma}{2}\right)}(n_F-1)},\;\; \dfrac{\hat{\sigma}_N^2\,(n_F-1)}{\chi^2_{\left(\frac{1-\gamma}{2}\right)}(n_F-1)}\right]$

where $t_{\gamma}(n_F-1)$ is the 100γth percentile of the t-distribution with $n_F - 1$ degrees of freedom, and $\chi^2_{\gamma}(n_F-1)$ is the 100γth percentile of the χ²-distribution with $n_F - 1$ degrees of freedom. (Nelson 1982, pp.218-219)

For the lognormal mean, the Cox approximation gives:

One-sided lower: $\exp\!\left(\hat{\mu}_N + \dfrac{\hat{\sigma}_N^2}{2} - Z_{1-\alpha}\sqrt{\dfrac{\hat{\sigma}_N^2}{n_F} + \dfrac{\hat{\sigma}_N^4}{2(n_F-1)}}\right)$

Two-sided: $\exp\!\left(\hat{\mu}_N + \dfrac{\hat{\sigma}_N^2}{2} \pm Z_{1-\alpha/2}\sqrt{\dfrac{\hat{\sigma}_N^2}{n_F} + \dfrac{\hat{\sigma}_N^4}{2(n_F-1)}}\right)$

where $Z_p = \Phi^{-1}(p)$, the inverse of the standard normal cdf. (Zhou & Gao 1997) Zhou & Gao recommend using the parametric bootstrap method for small sample sizes. (Angus 1994)

Bayesian

Non-informative priors for μ_N when σ_N² is known, π₀(μ_N) (Yang and Berger 1998, p.22):

Uniform proper prior, 1/(b−a), with limits μ_N ∈ [a, b]: the posterior is a truncated normal distribution, $c \cdot Norm\!\left(\mu_N;\; \frac{\sum_{i=1}^{n_F} \ln t_i^F}{n_F},\; \frac{\sigma_N^2}{n_F}\right)$ for a ≤ μ_N ≤ b; otherwise π(μ_N) = 0.

All other non-informative priors, 1, when μ_N ∈ (−∞, ∞): posterior $Norm\!\left(\mu_N;\; \frac{\sum_{i=1}^{n_F} \ln t_i^F}{n_F},\; \frac{\sigma_N^2}{n_F}\right)$.

Non-informative priors for σ_N² when μ_N is known, π₀(σ_N²) (Yang and Berger 1998, p.23):

Uniform proper prior, 1/(b−a), with limits σ_N² ∈ [a, b]: the posterior is a truncated inverse gamma distribution, $c \cdot IG\!\left(\sigma_N^2;\; \frac{n_F-2}{2},\; \frac{S_N^2}{2}\right)$; otherwise π(σ_N²) = 0.

Uniform improper prior, 1, with limits σ_N² ∈ (0, ∞): posterior $IG\!\left(\sigma_N^2;\; \frac{n_F-2}{2},\; \frac{S_N^2}{2}\right)$. See section 1.7.1.

Jeffery's, Reference and MDIP priors, 1/σ_N², with limits σ_N² ∈ (0, ∞): posterior $IG\!\left(\sigma_N^2;\; \frac{n_F}{2},\; \frac{S_N^2}{2}\right)$. See section 1.7.1.

Non-informative priors when both μ_N and σ_N² are unknown, π₀(μ_N, σ_N²) (Yang and Berger 1998, p.23):

Improper uniform prior, 1, with limits μ_N ∈ (−∞, ∞), σ_N² ∈ (0, ∞): $\pi(\mu_N|E) \sim T\!\left(\mu_N;\; n_F - 3,\; \bar{t}_N,\; \frac{S_N^2}{n_F(n_F-3)}\right)$ (see section 1.7.2) and $\pi(\sigma_N^2|E) \sim IG\!\left(\sigma_N^2;\; \frac{n_F-3}{2},\; \frac{S_N^2}{2}\right)$ (see section 1.7.1).

Jeffery's prior, 1/σ_N⁴: when μ_N ∈ (−∞, ∞), $\pi(\mu_N|E) \sim T\!\left(\mu_N;\; n_F + 1,\; \bar{t}_N,\; \frac{S_N^2}{n_F(n_F+1)}\right)$; when σ_N² ∈ (0, ∞), $\pi(\sigma_N^2|E) \sim IG\!\left(\sigma_N^2;\; \frac{n_F+1}{2},\; \frac{S_N^2}{2}\right)$.

Reference prior with ordering {φ₁, φ₂}, $\pi_o(\mu_N, \sigma_N) \propto \dfrac{1}{\sigma_N^2\,(2 + \phi_1^2)}$ where $\phi_1 = \mu_N/\sigma_N$: no closed form posterior.

Reference prior where μ_N and σ_N are separate groups, and MDIP prior: when μ_N ∈ (−∞, ∞), $\pi(\mu_N|E) \sim T\!\left(\mu_N;\; n_F - 1,\; \bar{t}_N,\; \frac{S_N^2}{n_F(n_F-1)}\right)$; when σ_N² ∈ (0, ∞), $\pi(\sigma_N^2|E) \sim IG\!\left(\sigma_N^2;\; \frac{n_F-1}{2},\; \frac{S_N^2}{2}\right)$.

where

$$S_N^2 = \sum_{i=1}^{n_F} \left(\ln t_i - \bar{t}_N\right)^2 \quad \text{and} \quad \bar{t}_N = \frac{1}{n_F}\sum_{i=1}^{n_F} \ln t_i$$

Conjugate Priors

UOI: 1/σ_N² from the lognormal model LogN(t; μ_N, σ_N²) with known μ_N; evidence n_F failures at times t_i. Prior: Gamma(k₀, λ₀). Posterior parameters: $k = k_0 + \frac{n_F}{2}$, $\lambda = \lambda_0 + \frac{1}{2}\sum_{i=1}^{n_F}(\ln t_i - \mu_N)^2$.

UOI: μ_N from the lognormal model with known σ_N²; evidence n_F failures at times t_i. Prior: Norm(μ₀, σ₀²). Posterior parameters:

$$\mu = \frac{\dfrac{\mu_0}{\sigma_0^2} + \dfrac{\sum \ln(t_i)}{\sigma_N^2}}{\dfrac{1}{\sigma_0^2} + \dfrac{n_F}{\sigma_N^2}} \qquad \sigma^2 = \left(\frac{1}{\sigma_0^2} + \frac{n_F}{\sigma_N^2}\right)^{-1}$$

Description, Limitations and Uses

Example. 5 components are put on a test with the following failure times: 98, 116, 2485, 2526, 2920 hours.

Taking the natural log of these failure times allows us to use a normal distribution to estimate the parameters:

ln(t_i): 4.590, 4.752, 7.979, 7.818, 7.834

MLE estimates:

$$\hat{\mu}_N = \frac{\sum \ln(t_i^F)}{n_F} = \frac{32.974}{5} = 6.595$$

$$\hat{\sigma}_N^2 = \frac{\sum\left[\ln(t_i^F) - \hat{\mu}_N\right]^2}{n_F - 1} = 3.091$$

90% confidence interval for μ_N:

$$\left[\hat{\mu}_N - t_{0.95}(4)\frac{\hat{\sigma}_N}{\sqrt{4}},\;\; \hat{\mu}_N + t_{0.95}(4)\frac{\hat{\sigma}_N}{\sqrt{4}}\right] = [4.721,\; 8.469]$$

90% confidence interval for σ_N²:

$$\left[\frac{\hat{\sigma}_N^2\,(4)}{\chi^2_{0.95}(4)},\;\; \frac{\hat{\sigma}_N^2\,(4)}{\chi^2_{0.05}(4)}\right] = [1.303,\; 17.396]$$

A Bayesian point estimate using the Jeffery non-informative improper prior 1/σ_N⁴, with posteriors μ_N ~ T(6, 6.595, 0.412) and σ_N² ~ IG(3, 6.182), has point estimates:

$$\hat{\mu}_N = \mathrm{E}[\,T(6,\, 6.595,\, 0.412)\,] = 6.595$$

$$\hat{\sigma}_N^2 = \mathrm{E}[\,IG(3,\, 6.182)\,] = \frac{6.182}{2} = 3.091$$

with 90% confidence intervals:

$$\mu_N: \left[F_T^{-1}(0.05) = 5.348,\;\; F_T^{-1}(0.95) = 7.842\right]$$

$$\sigma_N^2: \left[1/F_G^{-1}(0.95) = 0.982,\;\; 1/F_G^{-1}(0.05) = 7.560\right]$$

Characteristics

μ_N Characteristics. μ_N determines the scale and not the location, as it would in a normal distribution. The distribution is fixed at f(0) = 0, and an increase in the scale parameter stretches the distribution across the x-axis. This has the effect of increasing the mode, mean and median of the distribution.
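The point estimates and chi-squared interval from the worked example above can be reproduced numerically (a sketch using SciPy; small differences from the quoted figures come from rounding in the example):

```python
from math import isclose, log

from scipy.stats import chi2

times = [98, 116, 2485, 2526, 2920]  # failure times in hours
n_f = len(times)
logs = [log(x) for x in times]

# Point estimates (unbiased variance, as used in the example)
mu_hat = sum(logs) / n_f
var_hat = sum((x - mu_hat) ** 2 for x in logs) / (n_f - 1)
assert isclose(mu_hat, 6.595, abs_tol=0.005)
assert isclose(var_hat, 3.091, abs_tol=0.01)

# 90% chi-squared confidence interval for sigma_N^2
lower = var_hat * (n_f - 1) / chi2.ppf(0.95, n_f - 1)
upper = var_hat * (n_f - 1) / chi2.ppf(0.05, n_f - 1)
assert isclose(lower, 1.303, abs_tol=0.01)
assert isclose(upper, 17.396, abs_tol=0.05)
```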

σ_N Characteristics. σ_N determines the shape and not the scale, as it would in a normal distribution. For values of σ_N > 1 the distribution rises very sharply at the beginning and decreases with a shape similar to an Exponential or a Weibull with 0 < β < 1. As σ_N → 0 the mode, mean and median converge to $e^{\mu_N}$. The distribution becomes narrower and approaches a Dirac delta function at $t = e^{\mu_N}$.

Hazard Rate. (Kleiber & Kotz 2003, p.115) The hazard rate is unimodal, with h(0) = 0 and a slow decrease to zero as t → ∞. The mode of the hazard rate is at:

$$t_m = \exp(\mu_N + z_m \sigma_N)$$

where $z_m$ is given by:

$$z_m + \sigma_N = \frac{\phi(z_m)}{1 - \Phi(z_m)}$$

therefore $-\sigma_N < z_m < -\sigma_N + \frac{1}{\sigma_N}$, and hence:

$$e^{\mu_N - \sigma_N^2} < t_m < e^{\mu_N - \sigma_N^2 + 1}$$

As $\sigma_N \to \infty$, $t_m \to e^{\mu_N - \sigma_N^2}$, and so for large $\sigma_N$:

$$\max h(t) \approx \frac{\exp\!\left(-\left(\mu_N - \tfrac{1}{2}\sigma_N^2\right)\right)}{\sigma_N \sqrt{2\pi}}$$

As $\sigma_N \to 0$, $t_m \to e^{\mu_N - \sigma_N^2 + 1}$, and so for small $\sigma_N$:

$$\max h(t) \approx \frac{1}{\sigma_N^2\, e^{\mu_N - \sigma_N^2 + 1}}$$

Mean / Median / Mode:

$$mode(X) < median(X) < \mathrm{E}[X]$$

Scale/Product Property. Let:

$$X_j \sim LogN\!\left(\mu_{Nj}, \sigma_{Nj}^2\right)$$

If the $X_j$ are independent:

$$\prod a_j X_j \sim LogN\!\left(\sum\left[\mu_{Nj} + \ln(a_j)\right],\; \sum \sigma_{Nj}^2\right)$$

Lognormal versus Weibull. In fitting life data to these distributions it is often the case that both may be a good fit, especially in the middle of the distribution. The Weibull distribution has an earlier lower tail and produces a more pessimistic estimate of the component life. (Nelson 1990, p.65)

Applications

General Life Distributions. The lognormal distribution has been found to accurately model many life distributions and is a popular choice for them. The increasing hazard rate in early life models the weaker subpopulation (burn-in), and the later decreasing hazard rate describes the main population. In particular this has been applied to some electronic devices and fracture data. (Meeker & Escobar 1998, p.262)

Failure Modes from Multiplicative Errors. The lognormal distribution is very suitable for failure processes that are a result of multiplicative errors. Specific applications include failure of components due to fatigue cracks. (Provan 1987)

Repair Times. The lognormal distribution has commonly been used to model repair times. It is natural for a repair time probability to increase quickly to a mode value. For example very few repairs have an immediate or quick fix. However, once the time of repair passes the mean it is likely that there are serious problems, and the repair will take a substantial amount of time.

Parameter Variability. The lognormal distribution can be used to model parameter variability. This was done when estimating the uncertainty in the parameter λ in a Nuclear Reactor Safety Study (NUREG-75/014).

Theory of Breakage. The distribution models particle sizes observed in breakage processes. (Crow & Shimizu 1988)

Resources

Online:

http://www.weibull.com/LifeDataWeb/the_lognormal_distribution.htm
http://mathworld.wolfram.com/LogNormalDistribution.html
http://en.wikipedia.org/wiki/Log-normal_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Crow, E.L. & Shimizu, K., 1988. Lognormal distributions, CRC Press.

Aitchison, J.J. & Brown, J., 1957. The Lognormal Distribution, New York: Cambridge University Press.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Relationship to Other Distributions

Normal Distribution Norm(t; μ, σ²). Let:

$$X \sim LogN\!\left(\mu_N, \sigma_N^2\right) \quad \text{and} \quad Y = \ln(X)$$

Then:

$$Y \sim Norm\!\left(\mu_N, \sigma_N^2\right)$$

where:

$$\mu_N = \ln\!\left(\frac{\mu^2}{\sqrt{\sigma^2 + \mu^2}}\right), \qquad \sigma_N^2 = \ln\!\left(\frac{\sigma^2 + \mu^2}{\mu^2}\right)$$

2.3. Weibull Continuous Distribution

[Plots not reproduced: Probability Density Function f(t), Cumulative Density Function F(t), and Hazard Rate h(t) for parameter combinations of α and β.]


Parameters & Description

α: α > 0. Scale parameter: the value of α equals the 63.2th percentile and has the same units as t. Note that α is not equal to the mean.

β: β > 0. Shape parameter: also known as the slope (referring to a linear CDF plot), β determines the shape of the distribution.

Limits: t ≥ 0.

Distribution Formulas

PDF:

$$f(t) = \frac{\beta t^{\beta-1}}{\alpha^{\beta}}\, e^{-\left(\frac{t}{\alpha}\right)^{\beta}}$$

CDF:

$$F(t) = 1 - e^{-\left(\frac{t}{\alpha}\right)^{\beta}}$$

Reliability:

$$R(t) = e^{-\left(\frac{t}{\alpha}\right)^{\beta}}$$

Conditional Survivor Function, $P(T > x + t \mid T > t)$:

$$m(x) = R(x|t) = \frac{R(t+x)}{R(t)} = e^{\left(\frac{t}{\alpha}\right)^{\beta} - \left(\frac{t+x}{\alpha}\right)^{\beta}}$$

where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life (Kleiber & Kotz 2003, p.176):

$$u(t) = e^{\left(\frac{t}{\alpha}\right)^{\beta}} \int_t^{\infty} e^{-\left(\frac{x}{\alpha}\right)^{\beta}}\,dx$$

which has the asymptotic property:

$$u(t) \approx \frac{\alpha^{\beta}}{\beta}\, t^{1-\beta} \quad \text{as } t \to \infty$$

Hazard Rate:

$$h(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}$$

Cumulative Hazard Rate:

$$H(t) = \left(\frac{t}{\alpha}\right)^{\beta}$$

Properties and Moments

Median: $\alpha\,(\ln 2)^{1/\beta}$
Mode: $\alpha\left(\dfrac{\beta-1}{\beta}\right)^{1/\beta}$ if β ≥ 1; otherwise no mode exists

Mean (1st raw moment):

$$\alpha\,\Gamma\!\left(1 + \frac{1}{\beta}\right)$$

Variance (2nd central moment):

$$\alpha^2\left[\Gamma\!\left(1 + \frac{2}{\beta}\right) - \Gamma^2\!\left(1 + \frac{1}{\beta}\right)\right]$$

Skewness (3rd central moment):

$$\frac{\Gamma_3\,\alpha^3 - 3\mu\sigma^2 - \mu^3}{\sigma^3}$$

Excess kurtosis (4th central moment):

$$\frac{-6\Gamma_1^4 + 12\Gamma_1^2\Gamma_2 - 3\Gamma_2^2 - 4\Gamma_1\Gamma_3 + \Gamma_4}{\left(\Gamma_2 - \Gamma_1^2\right)^2}$$

where:

$$\Gamma_i = \Gamma\!\left(1 + \frac{i}{\beta}\right)$$

Characteristic function:

$$\sum_{n=0}^{\infty} \frac{(it)^n \alpha^n}{n!}\, \Gamma\!\left(1 + \frac{n}{\beta}\right)$$

100p% percentile function:

$$t_p = \alpha\,[-\ln(1-p)]^{1/\beta}$$

Parameter Estimation

Least Mean Square (y = mx + c): X-Axis: $\ln(t_i)$; Y-Axis: $\ln\!\left[\ln\!\left(\frac{1}{1-F}\right)\right]$. Then $\hat\beta = m$ and $\hat\alpha = e^{-c/m}$.

Maximum Likelihood Function

Likelihood Function:
$L(\alpha,\beta|E) = \underbrace{\prod_{i=1}^{n_F} \frac{\beta (t_i^F)^{\beta-1}}{\alpha^{\beta}}\, e^{-(t_i^F/\alpha)^{\beta}}}_{\text{failures}} \cdot \underbrace{\prod_{i=1}^{n_S} e^{-(t_i^S/\alpha)^{\beta}}}_{\text{survivors}} \cdot \underbrace{\prod_{i=1}^{n_I} \left[e^{-(t_i^{LI}/\alpha)^{\beta}} - e^{-(t_i^{RI}/\alpha)^{\beta}}\right]}_{\text{interval failures}}$

Log-Likelihood Function:
$\Lambda(\alpha,\beta|E) = n_F\ln(\beta) - n_F\beta\ln(\alpha) + (\beta-1)\sum_{i=1}^{n_F}\ln(t_i^F) - \sum_{i=1}^{n_F}\left(\frac{t_i^F}{\alpha}\right)^{\beta} - \sum_{i=1}^{n_S}\left(\frac{t_i^S}{\alpha}\right)^{\beta} + \sum_{i=1}^{n_I}\ln\!\left[e^{-(t_i^{LI}/\alpha)^{\beta}} - e^{-(t_i^{RI}/\alpha)^{\beta}}\right]$

Solve $\frac{\partial\Lambda}{\partial\alpha} = 0$ for α to get $\hat\alpha$:
$\frac{\partial\Lambda}{\partial\alpha} = \frac{-\beta\, n_F}{\alpha} + \underbrace{\frac{\beta}{\alpha^{\beta+1}}\sum_{i=1}^{n_F}(t_i^F)^{\beta}}_{\text{failures}} + \underbrace{\frac{\beta}{\alpha^{\beta+1}}\sum_{i=1}^{n_S}(t_i^S)^{\beta}}_{\text{survivors}} + \underbrace{\sum_{i=1}^{n_I} \frac{\frac{\beta}{\alpha}\left(\frac{t_i^{LI}}{\alpha}\right)^{\beta} e^{-(t_i^{LI}/\alpha)^{\beta}} - \frac{\beta}{\alpha}\left(\frac{t_i^{RI}}{\alpha}\right)^{\beta} e^{-(t_i^{RI}/\alpha)^{\beta}}}{e^{-(t_i^{LI}/\alpha)^{\beta}} - e^{-(t_i^{RI}/\alpha)^{\beta}}}}_{\text{interval failures}} = 0$

Solve $\frac{\partial\Lambda}{\partial\beta} = 0$ for β to get $\hat\beta$:
$\frac{\partial\Lambda}{\partial\beta} = \frac{n_F}{\beta} + \underbrace{\sum_{i=1}^{n_F}\left[\ln\!\left(\frac{t_i^F}{\alpha}\right) - \left(\frac{t_i^F}{\alpha}\right)^{\beta}\ln\!\left(\frac{t_i^F}{\alpha}\right)\right]}_{\text{failures}} - \underbrace{\sum_{i=1}^{n_S}\left(\frac{t_i^S}{\alpha}\right)^{\beta}\ln\!\left(\frac{t_i^S}{\alpha}\right)}_{\text{survivors}} + \underbrace{\sum_{i=1}^{n_I} \frac{\left(\frac{t_i^{RI}}{\alpha}\right)^{\beta}\ln\!\left(\frac{t_i^{RI}}{\alpha}\right)e^{-(t_i^{RI}/\alpha)^{\beta}} - \left(\frac{t_i^{LI}}{\alpha}\right)^{\beta}\ln\!\left(\frac{t_i^{LI}}{\alpha}\right)e^{-(t_i^{LI}/\alpha)^{\beta}}}{e^{-(t_i^{LI}/\alpha)^{\beta}} - e^{-(t_i^{RI}/\alpha)^{\beta}}}}_{\text{interval failures}} = 0$

MLE Point Estimates: When there is only complete failure and/or right censored data the point estimates can be solved using (Rinne 2008, p.439):

$\hat\alpha = \left[\frac{\sum (t_i^F)^{\hat\beta} + \sum (t_i^S)^{\hat\beta}}{n_F}\right]^{1/\hat\beta}$

$\hat\beta = \left[\frac{\sum (t_i^F)^{\hat\beta}\ln(t_i^F) + \sum (t_i^S)^{\hat\beta}\ln(t_i^S)}{\sum (t_i^F)^{\hat\beta} + \sum (t_i^S)^{\hat\beta}} - \frac{1}{n_F}\sum \ln(t_i^F)\right]^{-1}$

Note: Numerical methods are needed to solve for $\hat\beta$, which is then substituted to find $\hat\alpha$. Numerical methods to find Weibull MLE estimates for complete and censored data for the 2 parameter and 3 parameter Weibull distributions are detailed in (Rinne 2008).
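For complete (uncensored) data the equations above reduce to a one-dimensional root-find in $\hat\beta$ followed by a substitution for $\hat\alpha$. A sketch using simple bisection (an assumption of this example; any root-finder works), with the five failure times from the worked example later in this section:

```python
# Sketch: Weibull MLE for complete data by bisection on the beta equation.
import math

times = [535.0, 613.0, 976.0, 1031.0, 1875.0]   # failure times (hours)
n = len(times)
mean_log = sum(math.log(t) for t in times) / n

def g(beta):
    """Score equation in beta: sum(t^b ln t)/sum(t^b) - 1/b - mean(ln t)."""
    s = sum(t ** beta for t in times)
    sl = sum(t ** beta * math.log(t) for t in times)
    return sl / s - 1.0 / beta - mean_log

lo, hi = 0.1, 20.0          # bracket with g(lo) < 0 < g(hi)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if g(mid) < 0.0:
        lo = mid
    else:
        hi = mid
beta_hat = 0.5 * (lo + hi)
alpha_hat = (sum(t ** beta_hat for t in times) / n) ** (1.0 / beta_hat)
print(round(beta_hat, 3), round(alpha_hat))   # ~2.275 and ~1140
```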

Fisher Information Matrix (Rinne 2008, p.412):
$I(\alpha,\beta) = \begin{bmatrix} \dfrac{\beta^{2}}{\alpha^{2}} & -\dfrac{1-\gamma}{\alpha} \\[1ex] -\dfrac{1-\gamma}{\alpha} & \dfrac{\pi^{2}/6 + (1-\gamma)^{2}}{\beta^{2}} \end{bmatrix} \approx \begin{bmatrix} \dfrac{\beta^{2}}{\alpha^{2}} & -\dfrac{0.422784}{\alpha} \\[1ex] -\dfrac{0.422784}{\alpha} & \dfrac{1.823680}{\beta^{2}} \end{bmatrix}$
where γ ≈ 0.577216 is the Euler-Mascheroni constant and $\Gamma'(2) = 1-\gamma = 0.422784$.

100γ% Confidence Interval (complete data): The asymptotic variance-covariance matrix of $(\hat\alpha, \hat\beta)$ is (Rinne 2008, pp.412-417):
$\mathrm{Cov}\big(\hat\alpha,\hat\beta\big) = \left[J\big(\hat\alpha,\hat\beta\big)\right]^{-1} = \frac{1}{n_F}\begin{bmatrix} 1.1087\,\dfrac{\hat\alpha^{2}}{\hat\beta^{2}} & 0.2570\,\hat\alpha \\[1ex] 0.2570\,\hat\alpha & 0.6079\,\hat\beta^{2} \end{bmatrix}$

Bayesian

Bayesian analysis is applied to either one of two re-parameterizations of the Weibull Distribution: (Rinne 2008, p.517)

$f(t; \lambda, \beta) = \lambda\beta\, t^{\beta-1}\exp\!\left(-\lambda t^{\beta}\right)$, where $\lambda = \alpha^{-\beta}$, or

$f(t; \theta, \beta) = \frac{\beta}{\theta}\, t^{\beta-1}\exp\!\left(-\frac{t^{\beta}}{\theta}\right)$, where $\theta = \alpha^{\beta} = \frac{1}{\lambda}$

Non-informative Priors $\pi_0(\lambda)$ (Rinne 2008, p.517):

Uniform proper prior with β known and limits $\lambda \in [a, b]$. Prior: $\frac{1}{b-a}$ for $a \le \lambda \le b$; otherwise $\pi(\lambda) = 0$. Posterior: truncated Gamma distribution, $\mathrm{Gamma}(\lambda;\, 1+n_F,\, t_T^{\beta})$ for $a \le \lambda \le b$.

Jeffreys prior when β is known. Prior: $\propto \frac{1}{\lambda}$, $\lambda \in [0, \infty)$. Posterior: $\mathrm{Gamma}(\lambda;\, n_F,\, t_T^{\beta})$.

Jeffreys prior for unknown θ and β. Prior: $\frac{1}{\theta\beta}$. Posterior: no closed form. (Rinne 2008, p.527)

where $t_T^{\beta} = \sum (t_i^F)^{\beta} + \sum (t_i^S)^{\beta}$ = adjusted total time in test.

Conjugate Priors: It was found by Soland that no joint continuous prior distribution exists for the Weibull distribution. Soland did however propose a procedure which used a continuous distribution for α and a discrete distribution for β, which will not be included here. (Martz & Waller 1982)

UOI | Likelihood Model | Evidence | Dist of UOI | Prior Parameters | Posterior Parameters

λ | Weibull(t; α, β) with β known | $n_F$ failures at times $t_i$ | Gamma | $k_0$, $\Lambda_0$ | $k = k_0 + n_F$, $\Lambda = \Lambda_0 + t_T^{\beta}$ (Rinne 2008, p.520)

$\theta = \alpha^{\beta}$ | Weibull(t; α, β) with β known | $n_F$ failures at times $t_i$ | Inverted Gamma | $\alpha_0$, $\beta_0$ | $\alpha = \alpha_0 + n_F$, $\beta = \beta_0 + t_T^{\beta}$ (Rinne 2008, p.524)

Description, Limitations and Uses

Example: 5 components are put on a test with the following failure times:

535, 613, 976, 1031, 1875 hours

$\hat\beta$ is found by numerically solving:
$\hat\beta = \left[\frac{\sum t_i^{\hat\beta}\ln(t_i)}{\sum t_i^{\hat\beta}} - 6.8118\right]^{-1} = 2.275$

$\hat\alpha$ is then found by solving:
$\hat\alpha = \left[\frac{\sum t_i^{\hat\beta}}{n_F}\right]^{1/\hat\beta} = 1140$

The covariance matrix is:
$\mathrm{Cov}\big(\hat\alpha,\hat\beta\big) = \frac{1}{5}\begin{bmatrix} 1.1087\,\dfrac{\hat\alpha^{2}}{\hat\beta^{2}} & 0.2570\,\hat\alpha \\[1ex] 0.2570\,\hat\alpha & 0.6079\,\hat\beta^{2} \end{bmatrix} = \begin{bmatrix} 55679 & 58.596 \\ 58.596 & 0.6293 \end{bmatrix}$

90% confidence interval for $\hat\alpha$:
$\left[\hat\alpha\,\exp\!\left(\frac{-\Phi^{-1}(0.95)\sqrt{55679}}{\hat\alpha}\right),\ \hat\alpha\,\exp\!\left(\frac{\Phi^{-1}(0.95)\sqrt{55679}}{\hat\alpha}\right)\right] = [811,\ 1602]$

90% confidence interval for $\hat\beta$:
$\left[\hat\beta\,\exp\!\left(\frac{-\Phi^{-1}(0.95)\sqrt{0.6293}}{\hat\beta}\right),\ \hat\beta\,\exp\!\left(\frac{\Phi^{-1}(0.95)\sqrt{0.6293}}{\hat\beta}\right)\right] = [1.282,\ 4.037]$

Note that with only 5 samples the assumption that the parameter distribution is approximately normal is probably inaccurate, and therefore the confidence intervals must be used with caution.
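The interval arithmetic in this example can be reproduced in a few lines. A sketch (the 1.1087 / 0.2570 / 0.6079 coefficients are the complete-data values from Rinne 2008; Φ⁻¹ comes from the standard library):

```python
# Sketch reproducing the normal-approximation confidence intervals above.
import math
from statistics import NormalDist

n_f = 5
alpha_hat, beta_hat = 1140.0, 2.275
var_alpha = 1.1087 * alpha_hat ** 2 / (beta_hat ** 2 * n_f)   # ~55679
var_beta = 0.6079 * beta_hat ** 2 / n_f                       # ~0.6293
z = NormalDist().inv_cdf(0.95)        # one-sided 95% quantile -> 90% two-sided

def interval(est, var):
    """Log-normal style interval: [est/w, est*w] with w = exp(z*sqrt(var)/est)."""
    w = math.exp(z * math.sqrt(var) / est)
    return est / w, est * w

print([round(x) for x in interval(alpha_hat, var_alpha)])   # [811, 1602]
print([round(x, 3) for x in interval(beta_hat, var_beta)])  # [1.282, 4.037]
```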

Characteristics: The Weibull distribution is also known as a "Type III asymptotic distribution for minimum values".

β Characteristics:
β < 1: The hazard rate decreases with time.
β = 1: The hazard rate is constant (exponential distribution).
β > 1: The hazard rate increases with time.
1 < β < 2: The hazard rate increases at a decreasing rate.
β = 2: The hazard rate increases with a linear relationship to time.
β > 2: The hazard rate increases at an increasing rate.
β < 3.447798: The distribution is positively skewed (tail to right).
β ≈ 3.447798: The distribution is approximately symmetrical.
β > 3.447798: The distribution is negatively skewed (tail to left).
3 < β < 4: The distribution approximates a normal distribution.
β > 10: The distribution approximates a Smallest Extreme Value distribution.
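The first few regimes in this list can be checked directly from the closed-form hazard rate $h(t) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1}$. A small sketch with α = 1 (an arbitrary choice for illustration):

```python
# Numerical check of the hazard-rate regimes for different beta (alpha = 1).
def hazard(t, beta, alpha=1.0):
    """h(t) = (beta/alpha) * (t/alpha)**(beta - 1)"""
    return (beta / alpha) * (t / alpha) ** (beta - 1)

assert hazard(2.0, 0.5) < hazard(1.0, 0.5)        # beta < 1: decreasing
assert hazard(2.0, 1.0) == hazard(1.0, 1.0)       # beta = 1: constant
assert hazard(2.0, 2.0) == 2 * hazard(1.0, 2.0)   # beta = 2: linear in t
assert hazard(2.0, 3.0) > 2 * hazard(1.0, 3.0)    # beta > 2: faster than linear
```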

Note that for β = 0.999, f(0) = ∞, but for β = 1.001, f(0) = 0. This rapid change creates complications when maximizing likelihood functions (Weibull.com). As β → ∞, the mode → α.

α Characteristics: Increasing α stretches the distribution over the time scale. With the f(0) point fixed this also has the effect of increasing the mode, mean and median. The value α is at the 63.2% percentile: F(α) = 0.632.

For $X \sim \mathrm{Weibull}(\alpha, \beta)$:

Scaling property (Leemis & McQueston 2008): $kX \sim \mathrm{Weibull}(k\alpha, \beta)$

Minimum property (Rinne 2008, p.107), when β is fixed: $\min\{X_1, X_2, \ldots, X_n\} \sim \mathrm{Weibull}\!\left(\alpha n^{-1/\beta}, \beta\right)$

Variate generation property: $F^{-1}(u) = \alpha\left[-\ln(1-u)\right]^{1/\beta}, \quad 0 < u < 1$

Lognormal versus Weibull: In fitting life data to these distributions it is often the case that both may be a good fit, especially in the middle of the distribution. The Weibull distribution has an earlier lower tail and produces a more pessimistic estimate of the component life. (Nelson 1990, p.65)
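The variate generation property above gives inverse-transform sampling directly. A sketch (the sample-mean check against $\alpha\Gamma(1+1/\beta)$ is an illustration, not from the text):

```python
# Inverse-transform sampling: t = alpha * (-ln(1 - u))**(1/beta), u ~ U(0,1).
import math
import random

def weibull_rv(alpha, beta, rng):
    u = rng.random()
    return alpha * (-math.log(1.0 - u)) ** (1.0 / beta)

rng = random.Random(0)                 # fixed seed for reproducibility
alpha, beta = 1000.0, 2.0
sample = [weibull_rv(alpha, beta, rng) for _ in range(100_000)]
mean_est = sum(sample) / len(sample)
mean_true = alpha * math.gamma(1 + 1 / beta)   # closed-form mean
print(abs(mean_est - mean_true) / mean_true < 0.01)   # True (within ~1%)
```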

Applications: The Weibull distribution is by far the most popular life distribution used in reliability engineering. This is due to its variety of shapes and its generalization or approximation of many other distributions. Analysis assuming a Weibull distribution already includes the exponential life distribution as a special case.

There are many physical interpretations of the Weibull distribution. Due to its minimum property, a physical interpretation is the weakest link: a system such as a chain will fail when the weakest link fails. It can also be shown that the Weibull distribution can be derived from a cumulative wear model (Rinne 2008, p.15).

The following is a non-exhaustive list of applications where the Weibull distribution has been used:
• Acceptance sampling
• Warranty analysis
• Maintenance and renewal
• Strength of material modeling
• Wear modeling
• Electronic failure modeling
• Corrosion modeling

A detailed list with references to practical examples is contained in (Rinne 2008, p.275)

Resources Online:
http://www.weibull.com/LifeDataWeb/the_weibull_distribution.htm
http://mathworld.wolfram.com/WeibullDistribution.html
http://en.wikipedia.org/wiki/Weibull_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (interactive web calculator)
http://www.qualitydigest.com/jan99/html/weibull.html (how to conduct Weibull analysis in Excel, William W. Dorner)

Books: Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Murthy, D.N.P., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Relationship to Other Distributions

Three Parameter Weibull Distribution: The three parameter model adds a locator parameter γ to the two parameter Weibull distribution, allowing a shift along the x-axis. This creates a period of guaranteed zero failures at the beginning of the product life and is therefore only used in special cases.

$\mathrm{Weibull}(t; \alpha, \beta, \gamma)$. Special Case: $\mathrm{Weibull}(t; \alpha, \beta) = \mathrm{Weibull}(t; \alpha, \beta, \gamma = 0)$

Exponential Distribution: Let $X \sim \mathrm{Weibull}(\alpha, \beta)$ and $Y = X^{\beta}$. Then $Y \sim \mathrm{Exp}\!\left(\lambda = \alpha^{-\beta}\right)$.
Special Case: $\mathrm{Exp}(t; \lambda) = \mathrm{Weibull}\!\left(t;\ \alpha = \frac{1}{\lambda},\ \beta = 1\right)$

Rayleigh Distribution. Special Case: $\mathrm{Rayleigh}(t; \alpha) = \mathrm{Weibull}(t; \alpha, \beta = 2)$

χ Distribution. Special Case: $\chi(t \mid v = 2) = \mathrm{Weibull}\!\left(t;\ \alpha = \sqrt{2},\ \beta = 2\right)$


3. Bathtub Life Distributions


3.1. 2-Fold Mixed Weibull Distribution

All shapes shown are variations from p = 0.5, α₁ = 2, β₁ = 0.5, α₂ = 10, β₂ = 20.

Probability Density Function - f(t)
[Plots omitted: panels vary β₁ (0.1, 0.5, 1, 5), (α₂, β₂) ((10, 20), (10, 10), (10, 1), (5, 10)), and p (0, 0.3, 0.7, 1).]

Cumulative Density Function - F(t)
[Plots omitted.]

Hazard Rate - h(t)
[Plots omitted.]


Parameters & Description

Parameters:
αᵢ > 0, Scale Parameters: the scale for each Weibull distribution.
βᵢ > 0, Shape Parameters: the shape of each Weibull distribution.
p, 0 ≤ p ≤ 1, Mixing Parameter: determines the weight each Weibull distribution has on the overall density function.

Limits: t ≥ 0

Distribution Formulas

PDF: $f(t) = p f_1(t) + (1-p) f_2(t)$, where $f_i(t) = \frac{\beta_i t^{\beta_i-1}}{\alpha_i^{\beta_i}}\, e^{-(t/\alpha_i)^{\beta_i}}$ and $i \in \{1, 2\}$

CDF: $F(t) = p F_1(t) + (1-p) F_2(t)$, where $F_i(t) = 1 - e^{-(t/\alpha_i)^{\beta_i}}$ and $i \in \{1, 2\}$

Reliability: $R(t) = p R_1(t) + (1-p) R_2(t)$, where $R_i(t) = e^{-(t/\alpha_i)^{\beta_i}}$ and $i \in \{1, 2\}$

Hazard Rate: $h(t) = w_1(t) h_1(t) + w_2(t) h_2(t)$, where $w_i(t) = \frac{p_i R_i(t)}{\sum_{j=1}^{2} p_j R_j(t)}$ with $p_1 = p$ and $p_2 = 1-p$

Properties and Moments

Median: solved numerically
Mode: solved numerically
Mean - 1st Raw Moment: $p\,\alpha_1\Gamma\!\left(1+\frac{1}{\beta_1}\right) + (1-p)\,\alpha_2\Gamma\!\left(1+\frac{1}{\beta_2}\right)$
Variance - 2nd Central Moment:
$p\,\mathrm{Var}[T_1] + (1-p)\,\mathrm{Var}[T_2] + p\,(E[T_1]-E[T])^{2} + (1-p)\,(E[T_2]-E[T])^{2}$
$= p\,\alpha_1^{2}\left[\Gamma\!\left(1+\tfrac{2}{\beta_1}\right)-\Gamma^{2}\!\left(1+\tfrac{1}{\beta_1}\right)\right] + (1-p)\,\alpha_2^{2}\left[\Gamma\!\left(1+\tfrac{2}{\beta_2}\right)-\Gamma^{2}\!\left(1+\tfrac{1}{\beta_2}\right)\right] + p\left[\alpha_1\Gamma\!\left(1+\tfrac{1}{\beta_1}\right)-E[T]\right]^{2} + (1-p)\left[\alpha_2\Gamma\!\left(1+\tfrac{1}{\beta_2}\right)-E[T]\right]^{2}$
100p% Percentile Function: solved numerically

Parameter Estimation

Plotting Method (Jiang & Murthy 1995)
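The 2-fold mixture pdf, reliability, hazard and mean are straightforward to implement. A minimal sketch (not from the text; parameter values are arbitrary), with the closed-form mean checked against numerical integration of the reliability function, $E[T] = \int_0^\infty R(t)\,dt$:

```python
# Sketch of the 2-fold mixed Weibull formulas, with a numeric mean check.
import math

def wbl_pdf(t, a, b):
    """Two-parameter Weibull pdf: scale a, shape b."""
    return (b * t ** (b - 1) / a ** b) * math.exp(-((t / a) ** b))

def wbl_sf(t, a, b):
    """Two-parameter Weibull reliability R_i(t)."""
    return math.exp(-((t / a) ** b))

def mix_pdf(t, p, a1, b1, a2, b2):
    return p * wbl_pdf(t, a1, b1) + (1 - p) * wbl_pdf(t, a2, b2)

def mix_sf(t, p, a1, b1, a2, b2):
    return p * wbl_sf(t, a1, b1) + (1 - p) * wbl_sf(t, a2, b2)

def mix_hazard(t, p, a1, b1, a2, b2):
    return mix_pdf(t, p, a1, b1, a2, b2) / mix_sf(t, p, a1, b1, a2, b2)

p, a1, b1, a2, b2 = 0.5, 2.0, 2.0, 10.0, 5.0
mean_closed = (p * a1 * math.gamma(1 + 1 / b1)
               + (1 - p) * a2 * math.gamma(1 + 1 / b2))
# E[T] = integral of R(t) over [0, inf); trapezoid on [0, 40] (tail ~ 0 there)
n, T = 200_000, 40.0
h = T / n
mean_num = h * (0.5 * mix_sf(0.0, p, a1, b1, a2, b2)
                + sum(mix_sf(i * h, p, a1, b1, a2, b2) for i in range(1, n)))
print(abs(mean_num - mean_closed) < 1e-3)   # True
```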

Plot points on a Weibull Probability Plot: X-Axis: $x = \ln(t)$; Y-Axis: $y = \ln\!\left[\ln\!\left(\frac{1}{1-F}\right)\right]$

Using the Weibull Probability Plot the parameters can be estimated. Jiang & Murthy (1995) provide a comprehensive coverage of this procedure and detail errors in previous methods.

[Plot omitted: a typical WPP for a 2-fold Weibull mixture model, with α₁ = 5, β₁ = 0.5, α₂ = 10, β₂ = 5.]

Sub Populations: The dotted lines in the WPP are the lines representing the subpopulations:
$L_1:\ y = \beta_1[x - \ln(\alpha_1)]$
$L_2:\ y = \beta_2[x - \ln(\alpha_2)]$

Asymptotes (Jiang & Murthy 1995):
As $x \to -\infty$ ($t \to 0$) there exists an asymptote approximated by:
$y \approx \beta_1[x - \ln(\alpha_1)] + \ln(c)$
where $c = p$ when $\beta_1 \neq \beta_2$, and $c = p + (1-p)\left(\frac{\alpha_1}{\alpha_2}\right)^{\beta}$ when $\beta_1 = \beta_2 = \beta$.
As $x \to \infty$ ($t \to \infty$) the asymptote straight line can be approximated by:
$y \approx \beta_1[x - \ln(\alpha_1)]$

Parameter Estimation: Jiang and Murthy divide the parameter estimation procedure into three cases.

Well Mixed Case ($\beta_2 \neq \beta_1$ and $\alpha_1 \approx \alpha_2$):
- Estimate the parameters $\alpha_1$ and $\beta_1$ of $L_1$ from the right asymptote.
- Estimate the parameter p from the separation distance between the left and right asymptotes.
- Find the point where the curve crosses $L_1$ (point I). The slope at point I is $\bar\beta = p\beta_1 + (1-p)\beta_2$.
- Determine the slope at point I and use it to estimate $\beta_2$.
- Draw a line through the intersection point I with slope $\beta_2$ and use the intersection point to estimate $\alpha_2$.

Well Separated Case ($\beta_2 \neq \beta_1$ and $\alpha_1 \gg \alpha_2$ or $\alpha_1 \ll \alpha_2$):
- Determine visually if the data is scattered along the bottom (or top) to determine if $\alpha_1 \ll \alpha_2$ (or $\alpha_1 \gg \alpha_2$).
- If $\alpha_1 \ll \alpha_2$ ($\alpha_1 \gg \alpha_2$) locate the inflection point $y_a$ to the left (right) of point I, where $y_a \cong \ln[-\ln(1-p)]$ {or $y_a \cong \ln[-\ln(p)]$}. Using this formula estimate p.
- Estimate $\alpha_1$ and $\alpha_2$:
  • If $\alpha_1 \ll \alpha_2$ calculate the points $y_1 = \ln\!\left[-\ln\!\left(1-p+\frac{p}{\exp(1)}\right)\right]$ and $y_2 = \ln\!\left[1+\ln\!\left(\frac{1}{1-p}\right)\right]$. Find the x coordinates $x_1$ and $x_2$ where $y_1$ and $y_2$ intersect the WPP curve; at these points estimate $\alpha_1 = e^{x_1}$ and $\alpha_2 = e^{x_2}$.
  • If $\alpha_1 \gg \alpha_2$ calculate the points $y_1 = \ln\!\left[1+\ln\!\left(\frac{1}{p}\right)\right]$ and $y_2 = \ln\!\left[-\ln\!\left(p+\frac{1-p}{\exp(1)}\right)\right]$. Find the x coordinates $x_1$ and $x_2$ where $y_1$ and $y_2$ intersect the WPP curve; at these points estimate $\alpha_1 = e^{x_1}$ and $\alpha_2 = e^{x_2}$.
- Estimate the shape parameters:
  • If $\alpha_1 \ll \alpha_2$ draw an approximate $L_2$ ensuring it intersects the x-axis at $\ln(\alpha_2)$; estimate $\beta_2$ from the slope of $L_2$.
  • If $\alpha_1 \gg \alpha_2$ draw an approximate $L_1$ ensuring it intersects the x-axis at $\ln(\alpha_1)$; estimate $\beta_1$ from the slope of $L_1$.
- Find the point where the curve crosses $L_1$ (point I). The slope at point I is $\bar\beta = p\beta_1 + (1-p)\beta_2$; determine the slope at point I and use it to estimate the remaining shape parameter.

Common Shape Parameter ($\beta_2 = \beta_1 = \beta$)

If $\left(\frac{\alpha_2}{\alpha_1}\right)^{\beta} \approx 1$ then:
- Estimate the parameters $\alpha_1$ and $\beta_1$ of $L_1$ from the right asymptote.
- Estimate the parameter p from the separation distance between the left and right asymptotes.
- Draw a vertical line through $x = \ln(\alpha_1)$. The intersection $y_1$ with the WPP can yield an estimate of $\alpha_2$ using:
$y_1 = \ln\!\left[-\ln\!\left(\frac{p}{\exp(1)} + (1-p)\exp\!\left(-\left(\frac{\alpha_1}{\alpha_2}\right)^{\beta_1}\right)\right)\right]$

If $\left(\frac{\alpha_2}{\alpha_1}\right)^{\beta} \ll 1$ then:
- Find the inflection point and estimate its y coordinate $y_r$. Estimate p using $y_r \cong \ln[-\ln(1-p)]$.
- Calculate the points $y_1 = \ln\!\left[-\ln\!\left(1-p+\frac{p}{\exp(1)}\right)\right]$ and $y_2 = \ln\!\left[1+\ln\!\left(\frac{1}{1-p}\right)\right]$. Find the x coordinates $x_1$ and $x_2$ where $y_1$ and $y_2$ intersect the WPP curve; at these points estimate $\alpha_1 = e^{x_1}$ and $\alpha_2 = e^{x_2}$.
- Using the left or right asymptote estimate $\beta_1 = \beta_2$ from the slope.

Maximum Likelihood and Bayesian: MLE and Bayesian techniques can be applied using numerical methods; however, estimates obtained from the graphical methods are useful as initial guesses. A literature review of MLE and Bayesian methods is covered in (Murthy et al. 2003).

Description, Limitations and Uses

Characteristics

Hazard Rate Shape: The hazard rate can be approximated at its limits by (Jiang & Murthy 1995):
Small t: $h(t) \approx c\,h_1(t)$
Large t: $h(t) \approx h_1(t)$

This result proves that the hazard rate (increasing or decreasing) of $h_1$ will dominate the limits of the mixed Weibull distribution. Therefore the hazard rate cannot be a bathtub curve shape. Instead the possible shapes of the hazard rate are:
• Decreasing
• Unimodal
• Decreasing followed by unimodal (rollercoaster)
• Bi-modal

The reason this distribution has been included as a bathtub distribution is that on many occasions the hazard rate of a complex product may instead follow the "rollercoaster" shape, which is a decreasing followed by unimodal shape.

The shape of the hazard rate is determined only by the two shape parameters $\beta_1$ and $\beta_2$. A complete study on the characterization of the 2-fold mixed Weibull distribution is contained in Jiang and Murthy (1998).

p Values: The mixture ratio, $p_i$, for each Weibull distribution may be used to estimate the percentage of each subpopulation. However this is not a reliable measure and is known to be misleading (Berger & Sellke 1987).

N-Fold Distribution (Murthy et al. 2003): A generalization of the 2-fold mixed Weibull distribution is the n-fold case. This distribution is defined as:

$f(t) = \sum_{i=1}^{n} p_i f_i(t)$, where $f_i(t) = \frac{\beta_i t^{\beta_i-1}}{\alpha_i^{\beta_i}}\, e^{-(t/\alpha_i)^{\beta_i}}$ and $\sum_{i=1}^{n} p_i = 1$,

and the hazard rate is given as:

$h(t) = \sum_{i=1}^{n} w_i(t) h_i(t)$, where $w_i(t) = \frac{p_i R_i(t)}{\sum_{j=1}^{n} p_j R_j(t)}$

It has been found that in many instances a higher number of folds will not significantly increase the accuracy of the model but does impose a significant overhead in the number of parameters to estimate. The 3-fold Weibull has been studied by Jiang and Murthy (1996).

2-Fold Weibull 3-Parameter Distribution: A common variation to the model presented here is to have the second Weibull distribution modeled with three parameters.

Resources Books / Journals:
Jiang, R. & Murthy, D., 1995. Modeling Failure-Data by Mixture of 2 Weibull Distributions: A Graphical Approach. IEEE Transactions on Reliability, 44, 477-488.

Murthy, D., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Jiang, R. & Murthy, D., 1996. A mixture model involving three Weibull distributions. In Proceedings of the Second Australia– Japan Workshop on Stochastic Models in Engineering, Technology and Management. Gold Coast, Australia, pp. 260-270.


Jiang, R. & Murthy, D., 1998. Mixture of Weibull distributions - parametric characterization of failure rate function. Applied Stochastic Models and Data Analysis, (14), 47-65.

Balakrishnan, N. & Rao, C.R., 2001. Handbook of Statistics 20: Advances in Reliability 1st ed., Elsevier Science & Technology.

Relationship to Other Distributions

Weibull Distribution. Special Case:
$\mathrm{Weibull}(t; \alpha, \beta) = \mathrm{2FoldWeibull}(t;\ \alpha_1 = \alpha,\ \beta_1 = \beta,\ p = 1)$
$\mathrm{Weibull}(t; \alpha, \beta) = \mathrm{2FoldWeibull}(t;\ \alpha_2 = \alpha,\ \beta_2 = \beta,\ p = 0)$


3.2. Exponentiated Weibull Distribution

Probability Density Function - f(t)
[Plots omitted: α=5, β=1.5 with v = 0.2, 0.5, 0.8, 3 (left); α=5, v=0.5 with β = 1, 1.5, 2, 3 (right).]

Cumulative Density Function - F(t)
[Plots omitted.]

Hazard Rate - h(t)
[Plots omitted.]


Parameters & Description

Parameters:
α > 0: Scale parameter.
β > 0: Shape parameter.
v > 0: Shape parameter.

Limits: t ≥ 0

Distribution Formulas

PDF: $f(t) = v\,\frac{\beta t^{\beta-1}}{\alpha^{\beta}}\left[1 - \exp\!\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v-1} \exp\!\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right) = v\,[F_W(t)]^{v-1} f_W(t)$,
where $F_W(t)$ and $f_W(t)$ are the cdf and pdf of the two parameter Weibull distribution respectively.

CDF: $F(t) = \left[1 - \exp\!\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v} = [F_W(t)]^{v}$

Reliability: $R(t) = 1 - \left[1 - \exp\!\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v}$

Conditional Survivor Function: $m(x) = R(x|t) = \frac{R(t+x)}{R(t)} = \frac{1 - \left[1 - \exp\left(-\left(\frac{t+x}{\alpha}\right)^{\beta}\right)\right]^{v}}{1 - \left[1 - \exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v}} = P(T > t+x \mid T > t)$, where t is the given time we know the component has survived to and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life: $u(t) = \frac{\int_t^{\infty}\left\{1 - \left[1 - \exp\left(-\left(\frac{x}{\alpha}\right)^{\beta}\right)\right]^{v}\right\}dx}{1 - \left[1 - \exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v}}$

Hazard Rate: $h(t) = \frac{v\,\frac{\beta t^{\beta-1}}{\alpha^{\beta}}\left[1 - \exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v-1}\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)}{1 - \left[1 - \exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v}}$

For small t (Murthy et al. 2003, p.130): $h(t) \approx \frac{\beta v}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta v - 1}$
For large t (Murthy et al. 2003, p.130): $h(t) \approx \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta - 1}$

Properties and Moments

Median: $\alpha\left[-\ln\!\left(1 - \left(\tfrac{1}{2}\right)^{1/v}\right)\right]^{1/\beta}$
Mode: for βv > 1 the mode can be approximated by a closed-form expression given in Murthy et al. (2003, p.130); for βv ≤ 1 the pdf is monotonically decreasing.
Mean - 1st Raw Moment: solved numerically, see Murthy et al. 2003, p.128
Variance - 2nd Central Moment: solved numerically, see Murthy et al. 2003, p.128
100p% Percentile Function: $t_p = \alpha\left[-\ln\!\left(1 - p^{1/v}\right)\right]^{1/\beta}$

Parameter Estimation

Plotting Method (Jiang & Murthy 1999)
Plot points on a Weibull Probability Plot: X-Axis: $x = \ln(t)$; Y-Axis: $y = \ln\!\left[\ln\!\left(\frac{1}{1-F}\right)\right]$
Using the Weibull Probability Plot the parameters can be estimated. Jiang & Murthy (1999) provide a comprehensive coverage of this. A typical WPP for an exponentiated Weibull distribution is:

[Plot omitted: WPP for the exponentiated Weibull distribution with α = 5, β = 2, v = 0.4.]
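The exponentiated Weibull cdf $F(t) = [F_W(t)]^v$ and its closed-form percentile $t_p = \alpha[-\ln(1 - p^{1/v})]^{1/\beta}$ invert each other exactly, which a short sketch can verify (parameter values follow the plotted example; the pdf is cross-checked against a numerical derivative of the cdf):

```python
# Sketch of the exponentiated Weibull cdf, pdf and percentile function.
import math

def ew_cdf(t, alpha, beta, v):
    return (1.0 - math.exp(-((t / alpha) ** beta))) ** v

def ew_pdf(t, alpha, beta, v):
    fw = (beta * t ** (beta - 1) / alpha ** beta) * math.exp(-((t / alpha) ** beta))
    Fw = 1.0 - math.exp(-((t / alpha) ** beta))
    return v * Fw ** (v - 1) * fw          # f = v * F_W^(v-1) * f_W

def ew_percentile(p, alpha, beta, v):
    return alpha * (-math.log(1.0 - p ** (1.0 / v))) ** (1.0 / beta)

alpha, beta, v = 5.0, 2.0, 0.4
for p in (0.1, 0.5, 0.9):
    t_p = ew_percentile(p, alpha, beta, v)
    assert abs(ew_cdf(t_p, alpha, beta, v) - p) < 1e-9   # round trip
print(round(ew_percentile(0.5, alpha, beta, v), 3))      # ~2.205
```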

Asymptotes (Jiang & Murthy 1999):
As $x \to -\infty$ ($t \to 0$) there exists an asymptote approximated by:
$y \approx \beta v\,[x - \ln(\alpha)]$
As $x \to \infty$ ($t \to \infty$) the asymptote straight line can be approximated by:
$y \approx \beta\,[x - \ln(\alpha)]$
Both asymptotes intersect the x-axis at $\ln(\alpha)$; however both have different slopes unless v = 1, in which case the WPP is the same as for a two parameter Weibull distribution.

Parameter Estimation: Plot estimates of the asymptotes ensuring they cross the x-axis at the same point. Use the right asymptote to estimate α and β. Use the left asymptote to estimate v.

Maximum Likelihood and Bayesian: MLE and Bayesian techniques can be used in the standard way; however, estimates obtained from the graphical methods are useful as initial guesses when using numerical methods to solve the equations. A literature review of MLE and Bayesian methods is covered in (Murthy et al. 2003).

Description, Limitations and Uses

Characteristics

PDF Shape (Murthy et al. 2003, p.129):
βv < 1: The pdf is monotonically decreasing, f(0) = ∞.
βv = 1: The pdf is monotonically decreasing, f(0) = 1/α.
βv > 1: The pdf is unimodal, f(0) = 0.
The pdf shape is determined by βv in a similar way to β for a two parameter Weibull distribution.

Hazard Rate Shape (Murthy et al. 2003, p.129):
β ≤ 1 and βv ≤ 1: The hazard rate is monotonically decreasing.
β ≥ 1 and βv ≥ 1: The hazard rate is monotonically increasing.
β < 1 and βv > 1: The hazard rate is unimodal.
β > 1 and βv < 1: The hazard rate is a bathtub curve.

Weibull Distribution: The Weibull distribution is a special case of the exponentiated Weibull distribution when v = 1. When v is an integer greater than 1, the cdf represents a multiplicative Weibull model.

Standard Exponentiated Weibull (Xie et al. 2004): When α = 1 the distribution is the standard exponentiated Weibull distribution with cdf:
$F(t) = \left[1 - \exp(-t^{\beta})\right]^{v}$

Minimum Failure Rate (Xie et al. 2004): When the hazard rate is a bathtub curve (β > 1 and βv < 1) the minimum failure rate point is:
$t' = \alpha\left[-\ln(1-y_1)\right]^{1/\beta}$
where $y_1 \in (0,1)$ is the solution of a transcendental equation in y, involving β, v and $\ln(1-y)$, given in Xie et al. (2004).

Maximum Mean Residual Life (Xie et al. 2004): By setting the derivative of the MRL function to zero, the maximum MRL is found at:
$t^{*} = \alpha\left[-\ln(1-y_2)\right]^{1/\beta}$
where $y_2$ is the solution to (in standardized units, α = 1):
$\beta v\,(1-y_2)\,y_2^{v-1}\left[-\ln(1-y_2)\right]^{1-\frac{1}{\beta}} \int_{[-\ln(1-y_2)]^{1/\beta}}^{\infty}\left[1 - \left(1 - e^{-x^{\beta}}\right)^{v}\right]dx - \left(1 - y_2^{v}\right)^{2} = 0$

Resources Books / Journals:
Mudholkar, G. & Srivastava, D., 1993. Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Transactions on Reliability, 42(2), 299-302.

Jiang, R. & Murthy, D., 1999. The exponentiated Weibull family: a graphical approach. IEEE Transactions on Reliability, 48(1), 68-72.

Xie, M., Goh, T.N. & Tang, Y., 2004. On changing points of mean residual life and failure rate function for some generalized Weibull distributions. Reliability Engineering and System Safety, 84(3), 293–299.

Murthy, D., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Balakrishnan, N. & Rao, C.R., 2001. Handbook of Statistics 20: Advances in Reliability 1st ed., Elsevier Science & Technology.

Relationship to Other Distributions

Weibull Distribution. Special Case:
$\mathrm{Weibull}(t; \alpha, \beta) = \mathrm{ExpWeibull}(t;\ \alpha,\ \beta,\ v = 1)$


3.3. Modified Weibull Distribution

Probability Density Function - f(t)
[Plots omitted: b = 0.5, 0.8, 1, 2 with (a, λ) = (5, 1), (10, 5), (20, 10).]

Cumulative Density Function - F(t)
[Plots omitted.]

Hazard Rate - h(t)
[Plots omitted.]

Note: The hazard rate plots are on a different scale to the PDF and CDF.


Parameters & Description

Parameters:
a > 0: Scale parameter.
b ≥ 0: Shape parameter. The shape of the distribution is completely determined by b; when 0 < b < 1 the distribution has a bathtub shaped hazard rate.
λ ≥ 0: Scale parameter.

Limits: t ≥ 0

Distribution Formulas

PDF: $f(t) = a(b + \lambda t)\,t^{b-1} e^{\lambda t}\exp\!\left(-a t^{b} e^{\lambda t}\right)$

CDF: $F(t) = 1 - \exp\!\left(-a t^{b} e^{\lambda t}\right)$

Reliability: $R(t) = \exp\!\left(-a t^{b} e^{\lambda t}\right)$

Mean Residual Life: $u(t) = \exp\!\left(a t^{b} e^{\lambda t}\right)\int_t^{\infty}\exp\!\left(-a x^{b} e^{\lambda x}\right)dx$

Hazard Rate: $h(t) = a(b + \lambda t)\,t^{b-1} e^{\lambda t}$

Properties and Moments

Median: solved numerically (see 100p%)
Mode: solved numerically
Mean - 1st Raw Moment: solved numerically
Variance - 2nd Central Moment: solved numerically
100p% Percentile Function: solve numerically for $t_p$: $a\,t_p^{b}\exp(\lambda t_p) = -\ln(1-p)$

Parameter Estimation

Plotting Method (Lai et al. 2003)
Plot points on a Weibull Probability Plot: X-Axis: $\ln(t_i)$; Y-Axis: $\ln\!\left[\ln\!\left(\frac{1}{1-F}\right)\right]$
Using the Weibull Probability Plot the parameters can be estimated (Lai et al. 2003).
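A sketch of these formulas, with a bisection solve of the percentile equation $a t^b e^{\lambda t} = -\ln(1-p)$ and a numerical check of the bathtub minimum $t^* = (\sqrt{b}-b)/\lambda$ quoted later in this section (Xie et al. 2004). The parameter values follow the WPP example (a = 5, b = 0.2, λ = 10):

```python
# Sketch of the modified Weibull hazard, cdf, and numerical percentile.
import math

a, b, lam = 5.0, 0.2, 10.0      # 0 < b < 1 and lambda > 0: bathtub hazard

def hazard(t):
    """h(t) = a (b + lam t) t^(b-1) e^(lam t)"""
    return a * (b + lam * t) * t ** (b - 1) * math.exp(lam * t)

def cdf(t):
    """F(t) = 1 - exp(-a t^b e^(lam t))"""
    return 1.0 - math.exp(-a * t ** b * math.exp(lam * t))

def percentile(p):
    """Solve F(t) = p by bisection (F is increasing on [0, inf))."""
    lo, hi = 0.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t_p = percentile(0.5)
assert abs(cdf(t_p) - 0.5) < 1e-6

t_star = (math.sqrt(b) - b) / lam                     # bathtub minimum point
t_grid = min(hazard(i * 1e-4) for i in range(1, 5000))
assert hazard(t_star) <= t_grid + 1e-9                # t_star is the minimizer
print(round(t_star, 4))   # 0.0247
```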

Asymptotes (Lai et al. 2003): Modified Weibull Distribution 83

As ( 0) the asymptote straight line can be approximated as: + ln( ) 𝑥𝑥 → −∞ 𝑡𝑡 → As ( ) the asymptote straight𝑦𝑦 ≈ 𝑏𝑏𝑏𝑏 line can𝑎𝑎 be approximated as (not used for parameter estimate but more for model validity): 𝑥𝑥 → ∞ 𝑡𝑡 → ∞ exp( ) =

Intersections (Lai et al. 2003): 𝑦𝑦 ≈ 𝜆𝜆 𝑥𝑥 𝜆𝜆𝜆𝜆

Y-Axis Intersection (0, ) ln( ) + + = 0 0 ( 𝑥𝑥, 0) 𝑥𝑥0

X-Axis Intersection Modified Weibull 0 𝑎𝑎ln( 𝑏𝑏)𝑥𝑥+ =𝜆𝜆𝑒𝑒 0 𝑦𝑦 0 Solving these gives an approximate value𝑎𝑎 for𝜆𝜆 each𝑦𝑦 parameter which can be used as an initial guess for numerical methods solving MLE or Bayesian methods.

A typical WPP for an Modified Weibull Distribution is:

10

9

8

7

6

5

4

3

2

1

0

-1 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 WPP for modified Weibull distribution = 5, = 0.2, = 10

Description, Limitations and Uses

Characteristics

Parameter Characteristics (Lai et al. 2003):
0 < b < 1 and λ > 0: The hazard rate has a bathtub curve shape; h(t) → ∞ as t → 0 and h(t) → ∞ as t → ∞.
b ≥ 1 and λ > 0: The distribution has an increasing hazard rate function; h(0) = 0 for b > 1 (h(0) = ab for b = 1) and h(t) → ∞ as t → ∞.
λ = 0: The function has the same form as a Weibull distribution.

Minimum Failure Rate (Xie et al. 2004): When the hazard rate is a bathtub curve (0 < b < 1 and λ > 0) the minimum failure rate point is:
$t^{*} = \frac{\sqrt{b} - b}{\lambda}$

Maximum Mean Residual Life (Xie et al. 2004): By setting the derivative of the MRL function to zero, the maximum MRL is found by solving for t:
$a(b + \lambda t)\,t^{b-1} e^{\lambda t}\int_t^{\infty}\exp\!\left(-a x^{b} e^{\lambda x}\right)dx - \exp\!\left(-a t^{b} e^{\lambda t}\right) = 0$

Shape: The shape of the hazard rate cannot have a flat "usage period" and a strong "wear out" gradient.

Resources Books / Journals:

Modified Weibull Modified Lai, C., Xie, M. & Murthy, D., 2003. A modified Weibull distribution. IEEE Transactions on Reliability, 52(1), 33-37.

Murthy, D.N.P., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Xie, M., Goh, T.N. & Tang, Y., 2004. On changing points of mean residual life and failure rate function for some generalized Weibull distributions. Reliability Engineering and System Safety, 84(3), 293–299.

Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Balakrishnan, N. & Rao, C.R., 2001. Handbook of Statistics 20: Advances in Reliability 1st ed., Elsevier Science & Technology.

Relationship to Other Distributions

Weibull Distribution. Special Case:
$\mathrm{Weibull}(t; \alpha, \beta) = \mathrm{ModifiedWeibull}(t;\ a = \alpha^{-\beta},\ b = \beta,\ \lambda = 0)$


4. Univariate Continuous Distributions



4.1. Beta Continuous Distribution

Probability Density Function - f(t)
[Plots omitted: (α, β) = (1, 1), (2, 2), (5, 5), (0.5, 0.5) (left); (5, 2), (1, 2), (0.5, 2) (right).]

Cumulative Density Function - F(t)
[Plots omitted.]

Hazard Rate - h(t)
[Plots omitted.]


Parameters & Description

Parameters:
α > 0: Shape parameter.
β > 0: Shape parameter.
$a_L$ ($-\infty < a_L < b_U$): Lower bound; $a_L$ has also been called a location parameter. In the standard Beta distribution $a_L = 0$.
$b_U$ ($a_L < b_U < \infty$): Upper bound. In the standard Beta distribution $b_U = 1$. The scale parameter may also be defined as $b_U - a_L$.

Limits: $a_L \le t \le b_U$

Distribution Formulas

( ; , , , ) = . Beta ( ) ( ) ( 𝛼𝛼−1 ) 𝛽𝛽−1 Γ 𝛼𝛼 𝛽𝛽 𝑡𝑡 − 𝑎𝑎𝐿𝐿 𝑏𝑏𝑈𝑈 − 𝑡𝑡 𝐿𝐿 𝑈𝑈 𝛼𝛼+𝛽𝛽−1 𝑓𝑓 𝑡𝑡 𝛼𝛼 𝛽𝛽 𝑎𝑎 𝑏𝑏 𝑈𝑈 𝐿𝐿 When = 0, = 1: Γ 𝛼𝛼 Γ 𝛽𝛽 𝑏𝑏 − 𝑎𝑎 PDF ( + ) 𝐿𝐿 𝑈𝑈 ( | , ) = . (1 ) 𝑎𝑎 𝑏𝑏 ( ) ( ) Γ 𝛼𝛼1 𝛽𝛽 𝛼𝛼−1 𝛽𝛽−1 𝑓𝑓 𝑡𝑡 𝛼𝛼 𝛽𝛽 = . 𝑡𝑡 (1 −) 𝑡𝑡 Γ (𝛼𝛼 ,Γ )𝛽𝛽 𝛼𝛼−1 𝛽𝛽−1 𝑡𝑡 − 𝑡𝑡 𝐵𝐵 𝛼𝛼 𝛽𝛽 ( + ) F(t) = (1 ) ( ) ( ) 𝑡𝑡 Γ (𝛼𝛼 | ,𝛽𝛽 ) 𝛼𝛼−1 𝛽𝛽−1 CDF = �0 𝑢𝑢 − 𝑢𝑢 𝑑𝑑𝑑𝑑 Γ 𝛼𝛼( Γ 𝛽𝛽) 𝑡𝑡 , = 𝐵𝐵( 𝑡𝑡| 𝛼𝛼, 𝛽𝛽) 𝐵𝐵 𝛼𝛼 𝛽𝛽 𝐼𝐼𝑡𝑡 𝑡𝑡 𝛼𝛼 𝛽𝛽 Reliability R(t) = 1 ( | , )

( + x𝑡𝑡) 1 ( + | , ) ( ) = ( | ) = − 𝐼𝐼 =𝑡𝑡 𝛼𝛼 𝛽𝛽 ( ) 1 ( | , ) Conditional 𝑡𝑡 Where 𝑅𝑅 𝑡𝑡 − 𝐼𝐼 𝑡𝑡 𝑥𝑥 𝛼𝛼 𝛽𝛽 Survivor Function 𝑚𝑚 𝑥𝑥 𝑅𝑅 𝑥𝑥 𝑡𝑡 𝑡𝑡 is the given time we know the𝑅𝑅 component𝑡𝑡 −has𝐼𝐼 survive𝑡𝑡 𝛼𝛼 𝛽𝛽 d to. is a random variable defined as the time after . Note: = 0 at . 𝑡𝑡 𝑥𝑥 { ( , ) (x| , )}dx𝑡𝑡 𝑥𝑥 𝑡𝑡 Mean Residual ( ) = ∞ ( , ) (t| , ) Life ∫t 𝐵𝐵 α β − 𝐵𝐵x α β (Gupta and Nadarajah𝑢𝑢 𝑡𝑡 2004, p.44) 𝐵𝐵 α β − 𝐵𝐵t α β 88 Univariate Continuous Distributions

Hazard Rate
    h(t) = t^(α−1) (1 − t)^(β−1) / {B(α, β) − B_t(t|α, β)}
    (Gupta and Nadarajah 2004, p.44)

Properties and Moments

Median    Numerically solve for t:  t_0.5 = F⁻¹(0.5; α, β)

Mode    (α − 1)/(α + β − 2)  for α > 1 and β > 1

Mean - 1st Raw Moment    α/(α + β)

Variance - 2nd Central Moment    αβ / [(α + β)²(α + β + 1)]

Skewness - 3rd Central Moment    2(β − α)√(α + β + 1) / [(α + β + 2)√(αβ)]

Excess kurtosis - 4th Central Moment    6[α³ + α²(1 − 2β) + β²(1 + β) − 2αβ(2 + β)] / [αβ(α + β + 2)(α + β + 3)]
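To illustrate the moment formulas (a sketch, not from the text), the mean, variance and mode expressions can be cross-checked against SciPy:

```python
from scipy.stats import beta as beta_dist

a, b = 5.0, 2.0

# Closed-form moments from the table above
mean = a / (a + b)
var = a * b / ((a + b) ** 2 * (a + b + 1))
mode = (a - 1) / (a + b - 2)      # valid because a > 1 and b > 1

# Numerical cross-check of mean and variance
m_scipy, v_scipy = beta_dist.stats(a, b, moments='mv')
print(mean, var, mode)
```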

Characteristic Function    ₁F₁(α; α+β; it)

    where ₁F₁ is the confluent hypergeometric function defined as:
        ₁F₁(α; β; x) = Σ_{k=0}^∞ [(α)_k / (β)_k] · x^k / k!
    (Gupta and Nadarajah 2004, p.44)

100p% Percentile Function    Numerically solve for t:  t_p = F⁻¹(p; α, β)

Parameter Estimation

Maximum Likelihood Function

Likelihood Functions
    L(α, β|E) = [Γ(α+β) / (Γ(α)Γ(β))]^(n_F) · ∏_{i=1}^{n_F} t_i^(α−1) (1 − t_i)^(β−1)    (failures)

Log-Likelihood Functions
    Λ(α, β|E) = n_F {ln[Γ(α+β)] − ln[Γ(α)] − ln[Γ(β)]} + (α − 1) Σ_{i=1}^{n_F} ln(t_i) + (β − 1) Σ_{i=1}^{n_F} ln(1 − t_i)

    ∂Λ/∂α = −n_F[ψ(α) − ψ(α+β)] + Σ_{i=1}^{n_F} ln(t_i) = 0
    where ψ(x) = d/dx ln[Γ(x)] is the digamma function, see section 1.6.7.

(Johnson et al. 1995, p.223)

    ∂Λ/∂β = −n_F[ψ(β) − ψ(α+β)] + Σ_{i=1}^{n_F} ln(1 − t_i) = 0
    (Johnson et al. 1995, p.223)

Point Estimates    Point estimates are obtained by using numerical methods to solve the simultaneous equations above.

Fisher Information Matrix
    I(α, β) = [ ψ′(α) − ψ′(α+β)      −ψ′(α+β)         ]
              [ −ψ′(α+β)             ψ′(β) − ψ′(α+β)  ]
    where ψ′(x) = d²/dx² ln[Γ(x)] = Σ_{i=0}^∞ (x + i)⁻² is the Trigamma function. See section 1.6.8.2. (Yang and Berger 1998, p.5)

Confidence Intervals    For a large number of samples the Fisher information matrix can be used to estimate confidence intervals. See section 1.4.7.
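A numerical sketch of solving the two digamma score equations (assumed helper code, not from the text), using the five failure-on-demand fractions from the worked example in this section:

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.special import digamma

# Failure-on-demand fractions for the five switches
data = np.array([0.1176, 0.1488, 0.3684, 0.8123, 0.9783])
mean_log_t = np.log(data).mean()          # ~ -1.0549
mean_log_1mt = np.log(1.0 - data).mean()  # ~ -1.25

def score(params):
    a, b = params
    # psi(a) - psi(a+b) = mean ln(t_i);  psi(b) - psi(a+b) = mean ln(1 - t_i)
    return [digamma(a) - digamma(a + b) - mean_log_t,
            digamma(b) - digamma(a + b) - mean_log_1mt]

a_hat, b_hat = fsolve(score, [1.0, 1.0])
print(a_hat, b_hat)   # close to the text's 0.7369 and 0.6678
```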

Bayesian

Non-informative Priors

Jeffery's Prior    √det[I(α, β)]  where I(α, β) is given above.

Conjugate Priors

UOI    Likelihood Model    Evidence    Dist. of UOI    Prior Para    Posterior Parameters

p from Bernoulli(k; p)    k failures in 1 trial    Beta    α₀, β₀    α = α₀ + k
                                                                     β = β₀ + 1 − k

p from Binomial(k; p, n)    k failures in n trials    Beta    α₀, β₀    α = α₀ + k
                                                                        β = β₀ + n − k

Description, Limitations and Uses

Example    For examples on the use of the beta distribution as a conjugate prior see the Bernoulli and Binomial distributions.

Example
    A non-homogeneous (operating in different environments) population of 5 switches has the following probabilities of failure on demand:
        0.1176, 0.1488, 0.3684, 0.8123, 0.9783

    Estimate the population variability function:
        (1/n_F) Σ ln(t_i) = −1.0549
        (1/n_F) Σ ln(1 − t_i) = −1.25

    Numerically solving:
        ψ(α) − ψ(α+β) = −1.0549
        ψ(β) − ψ(α+β) = −1.25
    Gives:
        α̂ = 0.7369,  β̂ = 0.6678

        I(α̂, β̂) = [ 1.5924    −1.0207 ]
                   [ −1.0207    2.0347 ]

        [J_n(α̂, β̂)]⁻¹ = [n_F I(α̂, β̂)]⁻¹ = [ 0.1851    0.0929 ]
                                             [ 0.0929    0.1449 ]

    90% confidence interval for α:
        [ α̂·exp{−Φ⁻¹(0.95)√0.1851/α̂},  α̂·exp{+Φ⁻¹(0.95)√0.1851/α̂} ] = [0.282, 1.92]

    90% confidence interval for β:
        [ β̂·exp{−Φ⁻¹(0.95)√0.1449/β̂},  β̂·exp{+Φ⁻¹(0.95)√0.1449/β̂} ] = [0.262, 1.71]

Characteristics    The Beta distribution was originally known as a Pearson Type I distribution (and Type II distribution, which is a special case of a Type I).

    Beta(α, β) is the mirror distribution of Beta(β, α): if X ~ Beta(α, β) and Y = 1 − X, then Y ~ Beta(β, α).

    Location / Scale Parameters (NIST Section 1.3.6.6.17). a_L and b_U can be transformed into a location and scale parameter:
        location = a_L
        scale = b_U − a_L
    (Gupta and Nadarajah 2004, p.41)

    Shapes:
        0 < α < 1. As x → 0, f(x) → ∞.
        0 < β < 1. As x → 1, f(x) → ∞.
        α > 1, β > 1. As x → 0, f(x) → 0. There is a single mode at (α−1)/(α+β−2).
        α < 1, β < 1. The distribution is a U shape. There is a single anti-mode at (α−1)/(α+β−2).
        α > 0, β > 0. There exist inflection points at:
            (α−1)/(α+β−2) ± √[(α−1)(β−1)/(α+β−3)] / (α+β−2)
        α = β. The distribution is symmetrical about x = 0.5. As α = β becomes large, the beta distribution approaches the normal distribution. The Standard Uniform Distribution arises when α = β = 1.
        α = 1, β = 2 or α = 2, β = 1. Straight line.
        (α − 1)(β − 1) < 0. J shaped.

    Hazard Rate and MRL (Gupta and Nadarajah 2004, p.45):
        α ≥ 1, β ≥ 1. h(t) is increasing. u(t) is decreasing.
        α ≤ 1, β ≤ 1. h(t) is decreasing. u(t) is increasing.
        α > 1, 0 < β < 1. h(t) is bathtub shaped and u(t) is an upside down bathtub shape.
        0 < α < 1, β > 1. h(t) is upside down bathtub shaped and u(t) is bathtub shaped.

Applications
    Parameter Model. The Beta distribution is often used to model parameters which are constrained to an interval. In particular, modeling a probability parameter 0 ≤ p ≤ 1 with the Beta distribution is popular.

    Bayesian Analysis. The Beta distribution is often used as a conjugate prior in Bayesian analysis for the Bernoulli, Binomial and Geometric distributions to produce closed form posteriors. The Beta(0,0) distribution is an improper prior sometimes used to represent ignorance of parameter values. The Beta(1,1) is a standard uniform distribution which may be used as a non-informative prior. When used as a conjugate prior to a Bernoulli or Binomial process, the parameter α may represent the number of successes and β the total number of failures, with the total number of trials being n = α + β.

    Proportions. Used to model proportions. An example of this is the likelihood ratios for estimating uncertainty.

Resources    Online:
    http://mathworld.wolfram.com/BetaDistribution.html
    http://en.wikipedia.org/wiki/Beta_distribution
    http://socr.ucla.edu/htmls/SOCR_Distributions.html (interactive web calculator)
    http://www.itl.nist.gov/div898/handbook/eda/section3/eda366h.htm

    Books:
    Gupta, A.K. & Nadarajah, S., 2004. Handbook of beta distribution and its applications, CRC Press.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2 2nd ed., Wiley-Interscience.

Relationship to Other Distributions

Chi-square Distribution  χ²(t; v)
    Let X_i ~ χ²(v_i) and Y = X₁/(X₁ + X₂)
    Then Y ~ Beta(α = v₁/2, β = v₂/2)

Uniform Distribution  Unif(t; a, b)
    Let X_i ~ Unif(0, 1) and X_(1) ≤ X_(2) ≤ … ≤ X_(n)
    Then X_(r) ~ Beta(r, n − r + 1)
    where n and r are integers.

    Special Case:
        Beta(t; 1, 1, a, b) = Unif(t; a, b)

Normal Distribution  Norm(t; μ, σ)
    For large α and β with fixed α/β:
        Beta(α, β) ≈ Norm( μ = α/(α+β), σ² = αβ/[(α+β)²(α+β+1)] )
    As α and β increase the mean remains constant and the variance is reduced.

Gamma Distribution  Gamma(t; k, λ)
    Let X₁, X₂ ~ Gamma(k_i, λ) and Y = X₁/(X₁ + X₂)
    Then Y ~ Beta(α = k₁, β = k₂)

Dirichlet Distribution  Dir_d(x; α)
    Special Case:
        Dir_{d=1}(x; [α₁, α₀]) = Beta(k = x; α = α₁, β = α₀)
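The normal approximation above can be checked numerically (a sketch using SciPy; α = β = 100 is chosen only for illustration):

```python
from math import sqrt
from scipy.stats import beta as beta_dist, norm

a = b = 100.0
mu = a / (a + b)
sigma = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

# Compare the Beta cdf with its normal approximation near the mean
t = 0.55
f_beta = beta_dist.cdf(t, a, b)
f_norm = norm.cdf(t, loc=mu, scale=sigma)
print(f_beta, f_norm)
```

For symmetric parameters of this size the two cdfs agree to roughly two decimal places.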


4.2. Birnbaum Saunders Continuous Distribution

[Figure: Birnbaum-Saunders distribution plots - Probability Density Function f(t), Cumulative Density Function F(t) and Hazard Rate h(t), shown for α=0.2 β=1, α=0.5 β=1, α=1 β=1, α=2.5 β=1, and for α=1 with β=0.5, 1, 2, 4.]

Parameters & Description

    β > 0    Scale parameter. β is equal to the median.
    α > 0    Shape parameter.

Limits    0 < t < ∞

Distribution Formulas

PDF
    f(t) = [√(t/β) + √(β/t)] / (2αt) · (1/√(2π)) · exp{ −[√(t/β) − √(β/t)]² / (2α²) }
         = [√(t/β) + √(β/t)] / (2αt) · φ(z_BS)
    where φ(z) is the standard normal pdf and:
        z_BS = [√(t/β) − √(β/t)] / α

CDF
    F(t) = Φ( [√(t/β) − √(β/t)] / α ) = Φ(z_BS)

Reliability
    R(t) = Φ( [√(β/t) − √(t/β)] / α ) = Φ(−z_BS)

Conditional Survivor Function    m(x) = R(x|t) = P(T > t + x | T > t)
    m(x) = R(t+x)/R(t) = Φ(−z′_BS)/Φ(−z_BS)
    where
        z_BS = [√(t/β) − √(β/t)] / α,   z′_BS = [√((t+x)/β) − √(β/(t+x))] / α
    t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life
    u(t) = ∫ₜ^∞ Φ(−z_BS(x)) dx / Φ(−z_BS(t))

Hazard Rate
    h(t) = [√(t/β) + √(β/t)] / (2αt) · φ(z_BS)/Φ(−z_BS)

Cumulative Hazard Rate
    H(t) = −ln[Φ(−z_BS)]
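A minimal numerical sketch of these formulas (not from the text), using only the standard library; Φ is built from `math.erf`:

```python
from math import sqrt, erf

def std_normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def bs_cdf(t, alpha, beta):
    """Birnbaum-Saunders cdf: F(t) = Phi((sqrt(t/b) - sqrt(b/t)) / a)."""
    z = (sqrt(t / beta) - sqrt(beta / t)) / alpha
    return std_normal_cdf(z)

def bs_reliability(t, alpha, beta):
    return 1.0 - bs_cdf(t, alpha, beta)

# The scale parameter beta is the median, so F(beta) = 0.5 exactly
alpha, beta = 1.763, 601.949      # point estimates from the example in this section
print(bs_cdf(beta, alpha, beta))
```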

Properties and Moments

Median    β

Mode    Numerically solve for t:
        t³ + β(1 + α²)t² + β²(3α² − 1)t − β³ = 0

Mean - 1st Raw Moment    β(1 + α²/2)

Variance - 2nd Central Moment    (αβ)²(1 + 5α²/4)

Skewness - 3rd Central Moment    4α(11α² + 6) / (5α² + 4)^(3/2)    (Lemonte et al. 2007)

Excess kurtosis - 4th Central Moment    6α²(93α² + 40) / (5α² + 4)²    (Lemonte et al. 2007)

100γ% Percentile Function
    t_γ = (β/4) · [ αΦ⁻¹(γ) + √(4 + [αΦ⁻¹(γ)]²) ]²

Parameter Estimation

Maximum Likelihood Function

Likelihood Function    For complete data:
    L(α, β|E) = ∏_{i=1}^{n_F} [√(t_i/β) + √(β/t_i)] / (2αt_i√(2π)) · exp{ −[√(t_i/β) − √(β/t_i)]² / (2α²) }    (failures)

Log-Likelihood Function
    Λ(α, β|E) = −n_F ln(αβ) + Σ_{i=1}^{n_F} ln[ (β/t_i)^(1/2) + (β/t_i)^(3/2) ] − (1/(2α²)) Σ_{i=1}^{n_F} ( t_i/β + β/t_i − 2 )    (failures)

    ∂Λ/∂α = −n_F/α + (1/α³) Σ_{i=1}^{n_F} ( t_i/β + β/t_i − 2 ) = 0

    ∂Λ/∂β = −n_F/(2β) + Σ_{i=1}^{n_F} 1/(t_i + β) + (1/(2α²β²)) Σ_{i=1}^{n_F} t_i − (1/(2α²)) Σ_{i=1}^{n_F} 1/t_i = 0

MLE Point Estimates
    β̂ is found by solving:
        β² − β[2R + g(β)] + R[S + g(β)] = 0
    where
        g(β) = [ (1/n_F) Σ_{i=1}^{n_F} 1/(β + t_i) ]⁻¹,   S = (1/n_F) Σ_{i=1}^{n_F} t_i,   R = [ (1/n_F) Σ_{i=1}^{n_F} 1/t_i ]⁻¹

    The point estimate for α is:
        α̂ = √( S/β̂ + β̂/R − 2 )
    (Lemonte et al. 2007)

Fisher Information Matrix
    I(α, β) = [ 2/α²     0                                 ]
              [ 0        [α(2π)^(−1/2) k(α) + 1] / (α²β²)  ]
    where
        k(α) = α√(π/2) − π e^(2/α²) [1 − Φ(2/α)]
    (Lemonte et al. 2007)

100γ% Confidence Intervals    Calculated from the Fisher information matrix. See section 1.4.7. For a literature review of proposed confidence intervals see (Lemonte et al. 2007).

Description, Limitations and Uses

Example    5 components are put on a test with the following failure times:
        98, 116, 2485, 2526, 2920 hours

    S = (1/n_F) Σ t_i = 1629
    R = [ (1/n_F) Σ 1/t_i ]⁻¹ = 250.432

    Solving:
        β² − β[2R + g(β)] + R[S + g(β)] = 0,   g(β) = [ (1/n_F) Σ 1/(β + t_i) ]⁻¹
    gives:
        β̂ = 601.949
        α̂ = √( S/β̂ + β̂/R − 2 ) = 1.763
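These point estimates can be reproduced numerically (a sketch, not from the text; the `brentq` bracketing values were chosen by inspection):

```python
from math import sqrt
from scipy.optimize import brentq

times = [98.0, 116.0, 2485.0, 2526.0, 2920.0]
n = len(times)
S = sum(times) / n                               # arithmetic mean, 1629
R = 1.0 / (sum(1.0 / t for t in times) / n)      # harmonic mean, ~250.43

def g(b):
    return 1.0 / (sum(1.0 / (b + t) for t in times) / n)

def beta_equation(b):
    return b**2 - b * (2.0 * R + g(b)) + R * (S + g(b))

beta_hat = brentq(beta_equation, 400.0, 1000.0)  # bracket found by trial
alpha_hat = sqrt(S / beta_hat + beta_hat / R - 2.0)
print(beta_hat, alpha_hat)   # close to 601.949 and 1.763
```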

    90% confidence interval for α:
        [ α̂·exp{−Φ⁻¹(0.95)√(α̂²/(2n_F))/α̂},  α̂·exp{+Φ⁻¹(0.95)√(α̂²/(2n_F))/α̂} ] = [1.048, 2.966]

    90% confidence interval for β:
        k(α̂) = α̂√(π/2) − π e^(2/α̂²)[1 − Φ(2/α̂)] = 1.442
        I_ββ = [α̂(2π)^(−1/2) k(α̂) + 1] / (α̂²β̂²) = 10.335×10⁻⁶
        [ β̂·exp{−Φ⁻¹(0.95)/(β̂√(n_F I_ββ))},  β̂·exp{+Φ⁻¹(0.95)/(β̂√(n_F I_ββ))} ] = [100.4, 624.5]

    Note that this confidence interval uses the assumption of the parameters being normally distributed, which is only true for large sample sizes. Therefore these confidence intervals may be inaccurate. Bayesian methods must be done numerically.

Characteristics    The Birnbaum-Saunders distribution is a stochastic model of Miner's rule.

    Characteristic of α. As α decreases the distribution becomes more symmetrical around the value of β.

    Hazard Rate. The hazard rate is always unimodal. The hazard rate has the following asymptotes (Meeker & Escobar 1998, p.107):
        h(0) = 0
        lim_{t→∞} h(t) = 1/(2α²β)
    The change point of the unimodal hazard rate must be solved numerically for α < 0.6; however, for α > 0.6 it can be approximated using (Kundu et al. 2008):
        t_c = β(−0.4604 + 1.8417α)⁻²

    Lognormal and Inverse Gaussian Distribution. The shape and behavior of the Birnbaum-Saunders distribution is similar to that of the lognormal and inverse Gaussian distributions. This similarity is seen primarily in the center of the distributions. (Meeker & Escobar 1998, p.107)

    Let:
        T ~ BS(t; α, β)

    Scaling property (Meeker & Escobar 1998, p.107):
        cT ~ BS(t; α, cβ)  where c > 0

    Inverse property (Meeker & Escobar 1998, p.107):
        1/T ~ BS(t; α, 1/β)

Applications
    Fatigue-Fracture. The distribution has been designed to model crack growth to a critical crack size. The model uses Miner's rule, which allows for non-constant fatigue cycles through accumulated damage. The assumption is that the crack growth during any one cycle is independent of the growth during any other cycle, and the growth for each cycle has the same distribution from cycle to cycle. This is different from the proportional degradation model used to derive the lognormal distribution model, in which the rate of degradation depends on the accumulated damage. (http://www.itl.nist.gov/div898/handbook/apr/section1/apr166.htm)

Resources    Online:
    http://www.itl.nist.gov/div898/handbook/eda/section3/eda366a.htm
    http://www.itl.nist.gov/div898/handbook/apr/section1/apr166.htm
    http://en.wikipedia.org/wiki/Birnbaum%E2%80%93Saunders_distribution

    Books:
    Birnbaum, Z.W. & Saunders, S.C., 1969. A New Family of Life Distributions. Journal of Applied Probability, 6(2), 319-327.

    Lemonte, A.J., Cribari-Neto, F. & Vasconcellos, K.L., 2007. Improved statistical inference for the two-parameter Birnbaum-Saunders distribution. Computational Statistics & Data Analysis, 51(9), 4656-4681.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2, 2nd ed., Wiley-Interscience.

Rausand, M. & Høyland, A., 2004. System reliability theory, Wiley- IEEE.


4.3. Gamma Continuous Distribution

[Figure: Gamma distribution plots - Probability Density Function f(t), Cumulative Density Function F(t) and Hazard Rate h(t), shown for k=0.5 λ=1, k=3 λ=1, k=6 λ=1, k=12 λ=1, and for k=3 with λ=1.8, 1.2, 0.8, 0.4.]


Parameters & Description

    λ > 0    Scale Parameter: Equal to the rate (frequency) of events/shocks. Sometimes defined as 1/θ, where θ is the average time between events/shocks.
    k > 0    Shape Parameter: As an integer, k can be interpreted as the number of events/shocks until failure. When not restricted to an integer, k can be interpreted as a measure of the ability to resist shocks.

Limits    t ≥ 0

Distribution Formulas
    Γ(k) is the complete gamma function. Γ(k, t) and γ(k, t) are the upper and lower incomplete gamma functions, see section 1.6.

PDF
    When k is an integer:    f(t) = λ^k t^(k−1) e^(−λt) / (k−1)!
    When k is continuous:    f(t) = λ^k t^(k−1) e^(−λt) / Γ(k)
    with Laplace transformation:
        f(s) = [λ/(λ + s)]^k

CDF
    When k is an integer:    F(t) = 1 − e^(−λt) Σ_{n=0}^{k−1} (λt)^n / n!
    When k is continuous:    F(t) = γ(k, λt)/Γ(k) = (1/Γ(k)) ∫₀^{λt} x^(k−1) e^(−x) dx

Reliability
    When k is an integer:    R(t) = e^(−λt) Σ_{n=0}^{k−1} (λt)^n / n!
    When k is continuous:    R(t) = Γ(k, λt)/Γ(k) = (1/Γ(k)) ∫_{λt}^∞ x^(k−1) e^(−x) dx

Conditional Survivor Function    m(x) = P(T > t + x | T > t)
    m(x) = R(t+x)/R(t) = Γ(k, λ(t+x)) / Γ(k, λt)
    When k is an integer:
        m(x) = e^(−λx) Σ_{n=0}^{k−1} [λ(t+x)]^n/n!  /  Σ_{n=0}^{k−1} (λt)^n/n!
    Where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life
    u(t) = ∫ₜ^∞ R(x) dx / R(t) = ∫ₜ^∞ Γ(k, λx) dx / Γ(k, λt)
    The mean residual life does not have a closed form but has the expansion:
        u(t) = (1/λ) [ 1 + (k−1)/(λt) + (k−1)(k−2)/(λt)² + O(t⁻³) ]
    where O(t⁻³) is Landau's notation. (Kleiber & Kotz 2003, p.161)

Hazard Rate
    h(t) = λ^k t^(k−1) e^(−λt) / Γ(k, λt)
    When k is an integer:
        h(t) = λ^k t^(k−1) / [ (k−1)! Σ_{n=0}^{k−1} (λt)^n/n! ]
    Series expansion of the hazard rate (Kleiber & Kotz 2003, p.161):
        h(t) = λ [ 1 + (k−1)/(λt) + (k−1)(k−2)/(λt)² + O(t⁻³) ]⁻¹

    Limits of h(t) (Rausand & Høyland 2004):
        lim_{t→0} h(t) = ∞  and  lim_{t→∞} h(t) = λ  when 0 < k < 1
        lim_{t→0} h(t) = 0  and  lim_{t→∞} h(t) = λ  when k ≥ 1
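For integer k the CDF sum form can be evaluated directly; this sketch (not from the text) cross-checks it against SciPy's regularized incomplete gamma function and reproduces the F(200; 4, 0.01) = 0.1429 value used later in this section:

```python
from math import exp, factorial
from scipy.special import gammainc   # regularized lower incomplete gamma P(k, x)

def gamma_cdf_integer(t, k, lam):
    """F(t) = 1 - exp(-lam*t) * sum_{n=0}^{k-1} (lam*t)^n / n!  (integer k)."""
    m = lam * t
    return 1.0 - exp(-m) * sum(m**n / factorial(n) for n in range(k))

t, k, lam = 200.0, 4, 0.01
f_sum = gamma_cdf_integer(t, k, lam)
f_scipy = gammainc(k, lam * t)       # gamma(k, lam*t) / Gamma(k)
print(f_sum, f_scipy)                # both ~0.1429
```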

Cumulative Hazard Rate
    When k is an integer:    H(t) = λt − ln[ Σ_{n=0}^{k−1} (λt)^n / n! ]
    When k is continuous:    H(t) = −ln[ Γ(k, λt)/Γ(k) ]

Properties and Moments

Median    Numerically solve for t when:
        t_0.5 = F⁻¹(0.5; k, λ)  or  γ(k, λt) = Γ(k, λt)
    where γ(k, λt) is the lower incomplete gamma function, see section 1.6.6.

Mode    (k − 1)/λ  for k ≥ 1. No mode for 0 < k < 1.

Mean - 1st Raw Moment    k/λ

Variance - 2nd Central Moment    k/λ²

Skewness - 3rd Central Moment    2/√k

Excess kurtosis - 4th Central Moment    6/k

Characteristic Function    (1 − it/λ)^(−k)

100α% Percentile Function    Numerically solve for t:
        t_α = F⁻¹(α; k, λ)

Parameter Estimation

Maximum Likelihood Function

Likelihood Functions
    L(k, λ|E) = λ^(k·n_F) / Γ(k)^(n_F) · ∏_{i=1}^{n_F} t_i^(k−1) e^(−λt_i)    (failures)

Log-Likelihood Functions
    Λ(k, λ|E) = k n_F ln(λ) − n_F ln[Γ(k)] + (k − 1) Σ_{i=1}^{n_F} ln(t_i) − λ Σ_{i=1}^{n_F} t_i

    ∂Λ/∂k = 0:    n_F ln(λ) − n_F ψ(k) + Σ_{i=1}^{n_F} ln(t_i) = 0
    where ψ(x) = d/dx ln[Γ(x)] is the digamma function, see section 1.6.7.

    ∂Λ/∂λ = 0:    k n_F/λ − Σ_{i=1}^{n_F} t_i = 0

Point Estimates    Point estimates for k̂ and λ̂ are obtained by using numerical methods to solve the simultaneous equations above. (Kleiber & Kotz 2003, p.165)

Fisher Information Matrix
    I(k, λ) = [ ψ′(k)     −1/λ  ]
              [ −1/λ      k/λ²  ]
    where ψ′(x) = d²/dx² ln[Γ(x)] = Σ_{i=0}^∞ (x + i)⁻² is the Trigamma function. (Yang and Berger 1998, p.10)

Confidence Intervals    For a large number of samples the Fisher information matrix can be used to estimate confidence intervals.

Bayesian

Non-informative Priors, π(k, λ) (Yang and Berger 1998, p.6)

    Type                                Prior                       Posterior
    Uniform Improper Prior              1                           No Closed Form
      with limits k ∈ (0, ∞), λ ∈ (0, ∞)
    Jeffrey's Prior                     (1/λ)√(k·ψ′(k) − 1)         No Closed Form
    Reference, Order: {k, λ}            (1/λ)√(ψ′(k) − 1/k)         No Closed Form
    Reference, Order: {λ, k}            (1/λ)√(ψ′(k))               No Closed Form

    where ψ′(x) = d²/dx² ln[Γ(x)] = Σ_{i=0}^∞ (x + i)⁻² is the Trigamma function.

Conjugate Priors

UOI    Likelihood Model    Evidence    Dist. of UOI    Prior Para    Posterior Parameters

λ from Exp(t; λ)    n_F failures in t_T    Gamma    k₀, Λ₀    k = k₀ + n_F
                                                              Λ = Λ₀ + t_T

λ from Pois(k; λt)    n_F failures in t_T    Gamma    k₀, Λ₀    k = k₀ + n_F
                                                                Λ = Λ₀ + t_T

α^(−β) from Weib(t; α, β), with β known    n_F failures at times t_i    Gamma    k₀, λ₀    k = k₀ + n_F
                                                                                           λ = λ₀ + Σ_{i=1}^{n_F} t_i^β
    (Rinne 2008, p.520)

1/σ² from Norm(x; μ, σ²), with μ known    n_F failures at times t_i    Gamma    k₀, λ₀    k = k₀ + n_F/2
                                                                                          λ = λ₀ + ½ Σ_{i=1}^{n_F} (t_i − μ)²

λ from Gamma(x; λ, k_E), with k_E known    n_F failures in t_T    Gamma    η₀, Λ₀    η = η₀ + n_F k_E
                                                                                     Λ = Λ₀ + t_T

α from Pareto(t; θ, α), with θ known    n_F failures at times t_i    Gamma    k₀, λ₀    k = k₀ + n_F
                                                                                        λ = λ₀ + Σ_{i=1}^{n_F} ln(t_i/θ)

where t_T = Σ t_i^F + Σ t_i^S = total time in test.

Description, Limitations and Uses

Example 1    For an example using the gamma distribution as a conjugate prior see the Poisson or Exponential distributions.

    A renewal process has an exponential time between failure with parameter λ = 0.01 under the homogeneous Poisson process conditions. What is the probability that the fourth failure will occur before 200 hours?

        F(200; 4, 0.01) = 0.1429

Example 2    5 components are put on a test with the following failure times:
        38, 42, 44, 46, 55 hours

    Solving:
        0 = 5k/λ − 225
        0 = 5 ln(λ) − 5ψ(k) + 18.9954

    Gives:
        k̂ = 21.377
        λ̂ = 0.4749

    90% confidence interval for k:
        I(k̂, λ̂) = [ 0.0479    0.4749 ]
                   [ 0.4749    4.8205 ]

        [J_n(k̂, λ̂)]⁻¹ = [n_F I(k̂, λ̂)]⁻¹ = [ 179.979    −17.730 ]
                                             [ −17.730     1.7881 ]

        [ k̂·exp{−Φ⁻¹(0.95)√179.979/k̂},  k̂·exp{+Φ⁻¹(0.95)√179.979/k̂} ] = [7.6142, 60.0143]

    90% confidence interval for λ:
        [ λ̂·exp{−Φ⁻¹(0.95)√1.7881/λ̂},  λ̂·exp{+Φ⁻¹(0.95)√1.7881/λ̂} ] = [0.0046, 48.766]

    Note that this confidence interval uses the assumption of the parameters being normally distributed, which is only true for large sample sizes. Therefore these confidence intervals may be inaccurate. Bayesian methods must be done numerically.

Characteristics    The gamma distribution was originally known as a Pearson Type III distribution. This distribution includes a location parameter γ which shifts the distribution along the x-axis:

        f(t; k, λ, γ) = λ^k (t − γ)^(k−1) e^(−λ(t−γ)) / Γ(k)

    When k is an integer, the Gamma distribution is called an Erlang distribution.

    Characteristics:
        k < 1. f(0) = ∞. There is no mode.
        k = 1. f(0) = λ. The gamma distribution reduces to an exponential distribution with failure rate λ. Mode at t = 0.
        k > 1. f(0) = 0.
        Large k. The gamma distribution approaches a normal distribution with μ = k/λ, σ = √k/λ.

    Homogeneous Poisson Process (HPP). Components with an exponential time to failure which undergo instantaneous renewal with an identical item undergo a HPP. The Gamma distribution is the probability distribution of the kth failed item and is derived from the convolution of exponentially distributed random variables, T_i. (See related distributions, exponential distribution.)

        T_k ~ Gamma(k, λ)

    Scaling property:
        aT ~ Gamma(k, λ/a)
    Convolution property:
        T₁ + T₂ + … + T_n ~ Gamma(Σk_i, λ)
    where λ is fixed. Properties from (Leemis & McQueston 2008).

Applications
    Homogeneous Poisson Process. Used to model a renewal process where the component time to failure is exponentially distributed and the component is replaced instantaneously with a new identical component. The HPP can also be used to model ruin theory (used in risk assessments) and queuing theory.

    System Failure. Can be used to model system failure with k backup systems.

    Life Distribution. The gamma distribution is flexible in shape and can give good approximations to life data.

    Bayesian Analysis. The gamma distribution is often used as a prior in Bayesian analysis to produce closed form posteriors.

Resources    Online:
    http://mathworld.wolfram.com/GammaDistribution.html
    http://en.wikipedia.org/wiki/Gamma_distribution
    http://socr.ucla.edu/htmls/SOCR_Distributions.html (interactive web calculator)
    http://www.itl.nist.gov/div898/handbook/eda/section3/eda366b.htm

    Books:
    Artin, E., 1964. The Gamma Function, New York: Holt, Rinehart & Winston.
    Johnson, N.L., Kotz, S. & Balakrishnan, N., 1994. Continuous Univariate Distributions, Vol. 1, 2nd ed., Wiley-Interscience.
    Bowman, K.O. & Shenton, L.R., 1988. Properties of estimators for the gamma distribution, CRC Press.

Relationship to Other Distributions

Generalized Gamma Distribution  GGamma(t; k, λ, γ, ξ)
    f(t; k, λ, γ, ξ) = ξ λ^(ξk) (t − γ)^(ξk−1) / Γ(k) · exp{ −[λ(t − γ)]^ξ }

    λ - Scale parameter
    k - Shape parameter
    γ - Location parameter
    ξ - Second shape parameter

    The generalized gamma distribution has been derived because it is a generalization of a large number of probability distributions, such as:
        GGamma(t; 1, λ, 0, 1) = Exp(t; λ)
        GGamma(t; 1, 1/β, μ, 1) = Exp(t; μ, β)
        GGamma(t; 1, 1/β, 0, α) = Weibull(t; α, β)
        GGamma(t; 1, 1/β, γ, α) = Weibull(t; α, β, γ)
        GGamma(t; n/2, 1/2, 0, 1) = χ²(t; n)
        GGamma(t; 1, 1/(σ√2), 0, 2) = Rayleigh(t; σ)

Exponential Distribution  Exp(t; λ)
    Let
        T₁ … T_k ~ Exp(λ)  and  T = T₁ + T₂ + … + T_k
    Then
        T ~ Gamma(k, λ)
    This gives the Gamma distribution its convolution property.

    Special Case:
        Exp(t; λ) = Gamma(t; k = 1, λ)

Poisson Distribution  Pois(k; λt)
    The Poisson distribution is the probability that exactly k failures have been observed in time t. This is the probability that t is between T_k and T_{k+1}:
        f_Pois(k; λt) = F_Gamma(t; k, λ) − F_Gamma(t; k+1, λ)
    where k is an integer.
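The Poisson-Gamma identity above can be verified with the integer-k CDF sum (a standard-library sketch, not from the text):

```python
from math import exp, factorial

def gamma_cdf_int(t, k, lam):
    m = lam * t
    return 1.0 - exp(-m) * sum(m**n / factorial(n) for n in range(k))

def poisson_pmf(k, m):
    return exp(-m) * m**k / factorial(k)

lam, t, k = 0.01, 200.0, 3
lhs = poisson_pmf(k, lam * t)                                   # P(exactly k failures by t)
rhs = gamma_cdf_int(t, k, lam) - gamma_cdf_int(t, k + 1, lam)   # F_Gamma(t;k) - F_Gamma(t;k+1)
print(lhs, rhs)
```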

Normal Distribution  Norm(t; μ, σ)
    Special Case for large k:
        lim_{k→∞} Gamma(k, λ) = Norm( μ = k/λ, σ = √k/λ )

Chi-square Distribution  χ²(t; v)
    Special Case:
        χ²(t; v) = Gamma(t; k = v/2, λ = 1/2)
    where v is an integer.

Inverse Gamma Distribution  IG(t; α, β)
    Let
        X ~ Gamma(k, λ)  and  Y = 1/X
    Then
        Y ~ IG(α = k, β = λ)

Beta Distribution  Beta(t; α, β)
    Let
        X₁, X₂ ~ Gamma(k_i, λ)  and  Y = X₁/(X₁ + X₂)
    Then
        Y ~ Beta(α = k₁, β = k₂)

Dirichlet Distribution  Dir_d(x; α)
    Let:
        Y_i ~ Gamma(λ, k_i)  i.i.d.*  and  V = Σ_{i=1}^d Y_i
    Then:
        V ~ Gamma(λ, Σk_i)
    Let:
        Z = (Y₁/V, Y₂/V, …, Y_d/V)
    Then:
        Z ~ Dir_d(α₁, …, α_k)
    *i.i.d: independent and identically distributed

Wishart Distribution  Wishart_d(n; Σ)
    The Wishart Distribution is the multivariate generalization of the gamma distribution.

4.4. Logistic Continuous Distribution

[Figure: Logistic distribution plots - Probability Density Function f(t), Cumulative Density Function F(t) and Hazard Rate h(t), shown for μ=0, 2, 4 with s=1, and for μ=0 with s=0.7, 1, 2.]


Parameters & Description

    μ    −∞ < μ < ∞    Location parameter. μ is the mean, median and mode of the distribution.
    s    s > 0         Scale parameter. Proportional to the standard deviation of the distribution.

Limits    −∞ < t < ∞

Distribution Formulas

PDF
    f(t) = e^(−z) / [s(1 + e^(−z))²] = e^z / [s(1 + e^z)²]
         = (1/(4s)) sech²[(t − μ)/(2s)]
    where
        z = (t − μ)/s

CDF
    F(t) = 1/(1 + e^(−z)) = e^z/(1 + e^z)
         = 1/2 + (1/2) tanh[(t − μ)/(2s)]

Reliability
    R(t) = 1/(1 + e^z)

Conditional Survivor Function    m(x) = R(x|t) = P(T > t + x | T > t)
    m(x) = R(t+x)/R(t) = [1 + exp((t − μ)/s)] / [1 + exp((t + x − μ)/s)]
    Where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life
    u(t) = s(1 + e^z) ln(1 + e^(−z))

Hazard Rate
    h(t) = F(t)/s = 1/[s(1 + e^(−z))] = 1/[s + s·exp((μ − t)/s)]

Cumulative Hazard Rate
    H(t) = ln[1 + exp((t − μ)/s)]
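A standard-library sketch of these formulas (not from the text), checking the tanh form of the CDF and the h(t) = F(t)/s identity:

```python
from math import exp, tanh

def logistic_cdf(t, mu, s):
    return 1.0 / (1.0 + exp(-(t - mu) / s))

def logistic_pdf(t, mu, s):
    z = (t - mu) / s
    return exp(-z) / (s * (1.0 + exp(-z)) ** 2)

def logistic_hazard(t, mu, s):
    # h(t) = f(t)/R(t), which reduces to F(t)/s for the logistic distribution
    return logistic_pdf(t, mu, s) / (1.0 - logistic_cdf(t, mu, s))

mu, s, t = 10.0, 2.0, 12.5
F = logistic_cdf(t, mu, s)
F_tanh = 0.5 + 0.5 * tanh((t - mu) / (2.0 * s))
print(F, F_tanh, logistic_hazard(t, mu, s), F / s)
```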

Properties and Moments

Median    μ

Mode    μ

Mean - 1st Raw Moment    μ

Variance - 2nd Central Moment    s²π²/3

Skewness - 3rd Central Moment    0

Excess kurtosis - 4th Central Moment    6/5

Characteristic Function    e^(iμτ) B(1 − isτ, 1 + isτ)  for |sτ| < 1

100γ% Percentile Function    t_γ = μ + s·ln[γ/(1 − γ)]

Parameter Estimation

Plotting Method
    Least Mean Square, y = mx + c:
        X-Axis: t_i    Y-Axis: ln[F] − ln[1 − F]
        ŝ = 1/m,   μ̂ = −c·ŝ

Maximum Likelihood Function

Likelihood Function    For complete data:
    L(μ, s|E) = ∏_{i=1}^{n_F} exp[−(t_i − μ)/s] / { s [1 + exp(−(t_i − μ)/s)]² }    (failures)

Log-Likelihood Function
    Λ(μ, s|E) = −n_F ln(s) − Σ_{i=1}^{n_F} (t_i − μ)/s − 2 Σ_{i=1}^{n_F} ln[1 + exp(−(t_i − μ)/s)]

    ∂Λ/∂μ = n_F/s − (2/s) Σ_{i=1}^{n_F} [1 + exp((t_i − μ)/s)]⁻¹ = 0

    ∂Λ/∂s = −n_F/s + (1/s) Σ_{i=1}^{n_F} [(t_i − μ)/s] · [1 − exp(−(t_i − μ)/s)] / [1 + exp(−(t_i − μ)/s)] = 0

MLE Point Estimates
    The MLE estimates μ̂ and ŝ are found by solving the following equations:

        (1/n_F) Σ_{i=1}^{n_F} [1 + exp((t_i − μ)/s)]⁻¹ − 1/2 = 0

        (1/n_F) Σ_{i=1}^{n_F} [(t_i − μ)/s] · [1 − exp(−(t_i − μ)/s)] / [1 + exp(−(t_i − μ)/s)] − 1 = 0

    These estimates are biased. (Balakrishnan 1991) provides tables derived from Monte Carlo simulation to correct the bias.

Fisher Information Matrix
    I(μ, s) = [ 1/(3s²)     0              ]
              [ 0           (3 + π²)/(9s²) ]
    (Antle et al. 1970)

100γ% Confidence Intervals
    Confidence intervals are most often obtained from tables derived from Monte Carlo simulation. Corrections when using the Fisher information matrix method are given in (Antle et al. 1970).

Bayesian

Non-informative Priors π₀(μ, s)

    Type             Prior
    Jeffery Prior    1/s

Description, Limitations and Uses

Example    The accuracy of a cutting machine used in manufacturing is to be measured. 5 cuts at the required length are made and measured as:
        7.436, 10.270, 10.466, 11.039, 11.854 mm

    Numerically solving the MLE equations gives:
        μ̂ = 10.446
        ŝ = 0.815

    This gives a mean of 10.446 and a variance of 2.183. Compared to the same data used in the Normal distribution section, it can be seen that this estimate is very similar to a normal distribution.
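This fit can be reproduced with SciPy's built-in MLE routine (a sketch, not from the text):

```python
import numpy as np
from scipy.stats import logistic

cuts = np.array([7.436, 10.270, 10.466, 11.039, 11.854])

# scipy's fit() maximizes the logistic likelihood over loc (mu) and scale (s)
mu_hat, s_hat = logistic.fit(cuts)
variance = s_hat**2 * np.pi**2 / 3.0
print(mu_hat, s_hat, variance)   # close to 10.446, 0.815 and 2.183
```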

    90% confidence interval for μ:
        [ μ̂ − Φ⁻¹(0.95)√(3ŝ²/n_F),  μ̂ + Φ⁻¹(0.95)√(3ŝ²/n_F) ] = [9.408, 11.4844]

    90% confidence interval for s:
        [ ŝ·exp{−Φ⁻¹(0.95)√(9ŝ²/((3 + π²)n_F))/ŝ},  ŝ·exp{+Φ⁻¹(0.95)√(9ŝ²/((3 + π²)n_F))/ŝ} ] = [0.441, 1.501]

    Note that this confidence interval uses the assumption of the parameters being normally distributed, which is only true for large sample sizes. Therefore these confidence intervals may be inaccurate.

Bayesian methods must be calculated using numerical methods.

Characteristics The is most often used to model growth rates (and has been used extensively in biology and chemical applications). In reliability engineering it is most often used as a life distribution.

Shape. There is no shape parameter and so the logistic distribution

is always a bell shaped curve. Increasing shifts the curve to the right, increasing increases the spread of the curve. 𝜇𝜇 𝑠𝑠 Logistic Normal Distribution. The shape of the logistic distribution is very similar to that of a normal distribution with the logistic distribution having slightly ‘longer tails’. It would take a large number of samples to distinguish between the distributions. The main difference is that the hazard rate approaches 1/ for large . The has historically been preferred over the normal distribution because of its simplified form. (Meeker & Escobar𝑠𝑠 1998,𝑡𝑡 p.89)

Alternative Parameterization. It is equally popular to present the logistic distribution using the true standard deviation σ = πs/√3. This form is used in the reference book by Balakrishnan (1991) and gives the following cdf:

    F(t) = 1 / (1 + exp{ −π(t − μ)/(√3·σ) })

Standard Logistic Distribution. The standard logistic distribution has μ = 0, s = 1. The standard logistic random variable Z is related to the logistic random variable X by:

    Z = (X − μ)/s
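As a quick numerical check, the two parameterizations give identical probabilities when σ = πs/√3. A minimal sketch (function names are illustrative, not from the text):

```python
import math

def logistic_cdf(t, mu, s):
    # F(t) = 1 / (1 + exp(-(t - mu)/s)), the (mu, s) parameterization
    return 1.0 / (1.0 + math.exp(-(t - mu) / s))

def logistic_cdf_sigma(t, mu, sigma):
    # Same cdf written with the true standard deviation sigma = pi*s/sqrt(3)
    return 1.0 / (1.0 + math.exp(-math.pi * (t - mu) / (math.sqrt(3) * sigma)))

# Using the estimates from the example above
mu, s = 10.446, 0.815
sigma = math.pi * s / math.sqrt(3)
```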


Let:
    T ~ Logistic(t; μ, s)

Scaling property (Leemis & McQueston 2008):
    aT ~ Logistic(t; aμ, as)

Rate Relationships. The logistic distribution has the following rate relationships, which make it suitable for modeling growth (Hastings et al. 2000, p.127):

    h(t) = f(t)/R(t) = F(t)/s

    z = ln[F(t)/R(t)] = ln[F(t)] − ln[1 − F(t)]

where z = (t − μ)/s. When μ = 0 and s = 1:

    f(t) = dF(t)/dt = F(t)·R(t)
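These rate relationships can be verified numerically; a small sketch (function names are illustrative):

```python
import math

def logistic_F(t, mu=0.0, s=1.0):
    # cdf
    return 1.0 / (1.0 + math.exp(-(t - mu) / s))

def logistic_f(t, mu=0.0, s=1.0):
    # pdf
    z = (t - mu) / s
    return math.exp(-z) / (s * (1.0 + math.exp(-z)) ** 2)

def logistic_h(t, mu=0.0, s=1.0):
    # hazard rate h(t) = f(t)/R(t); algebraically this reduces to F(t)/s
    return logistic_f(t, mu, s) / (1.0 - logistic_F(t, mu, s))
```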

Applications

Growth Model. The logistic distribution's most common use is as a growth model.

Probability of Detection. The cdf of the logistic distribution is commonly used to represent the probability of detection of damaged materials by sensors and detection instruments; for example, the probability of detection of embedded flaws in metals using ultrasonic signals.

Life Distribution. In reliability applications it is used as a life distribution. It is similar in shape to a normal distribution and so is often used instead of a normal distribution due to its simplified form. (Meeker & Escobar 1998, p.89)

Logistic Regression. Logistic regression is a generalized linear model used to predict binary outcomes. (Agresti 2002)

Resources Online: http://mathworld.wolfram.com/LogisticDistribution.html http://en.wikipedia.org/wiki/Logistic_distribution http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc) http://www.weibull.com/LifeDataWeb/the_logistic_distribution.htm

Books: Balakrishnan, 1991. Handbook of the Logistic Distribution 1st ed., CRC.


Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2 2nd ed., Wiley-Interscience.

Relationship to Other Distributions

Exponential Distribution Exp(t; λ):
    Let X ~ Exp(λ = 1) and Y = ln(e^X − 1). Then Y ~ Logistic(0, 1).
    (Hastings et al. 2000, p.127)

Pareto Distribution Pareto(t; θ, α):
    Let X ~ Pareto(θ, α) and Y = −ln[(X/θ)^α − 1]. Then Y ~ Logistic(0, 1).
    (Hastings et al. 2000, p.127)

Gumbel Distribution Gumbel(α, β):
    Let X₁, X₂ ~ Gumbel(α, β) and Y = X₁ − X₂. Then Y ~ Logistic(0, β).
    (Hastings et al. 2000, p.127)


4.5. Normal (Gaussian) Continuous Distribution

[Figures: Probability Density Function f(t), Cumulative Density Function F(t), and Hazard Rate h(t) of the Normal distribution for various combinations of μ and σ.]

Parameters & Description

    μ     −∞ < μ < ∞    Location parameter: the mean of the distribution.
    σ²    σ² > 0        Scale parameter: σ² is the variance; σ is the standard deviation.

Limits:  −∞ < t < ∞

Distribution Formulas

PDF:
    f(t) = 1/(σ√(2π)) · exp{ −½[(t − μ)/σ]² }
         = (1/σ)·φ((t − μ)/σ)
where φ is the standard normal pdf with μ = 0 and σ² = 1.

CDF:
    F(t) = ∫₋∞ᵗ 1/(σ√(2π)) · exp{ −½[(θ − μ)/σ]² } dθ
         = ½ + ½·erf[(t − μ)/(σ√2)]
         = Φ((t − μ)/σ)
where Φ is the standard normal cdf with μ = 0 and σ² = 1.

Reliability:
    R(t) = 1 − Φ((t − μ)/σ) = Φ((μ − t)/σ)

Conditional Survivor Function P(T > x + t | T > t):
    m(x) = R(t + x)/R(t) = Φ((μ − t − x)/σ) / Φ((μ − t)/σ)
where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life:
    u(t) = ∫ₜ∞ R(x) dx / R(t)

Hazard Rate:
    h(t) = φ((t − μ)/σ) / [σ·Φ((μ − t)/σ)]

Cumulative Hazard Rate:
    H(t) = −ln[ Φ((μ − t)/σ) ]
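Since Φ can be written in terms of the error function, the formulas above can be computed with only the standard library. A minimal sketch (function names are illustrative):

```python
import math

def norm_pdf(t, mu=0.0, sigma=1.0):
    # f(t) = (1/sigma) * phi((t - mu)/sigma)
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def norm_cdf(t, mu=0.0, sigma=1.0):
    # F(t) = 1/2 + 1/2 * erf((t - mu)/(sigma*sqrt(2)))
    return 0.5 + 0.5 * math.erf((t - mu) / (sigma * math.sqrt(2.0)))

def norm_hazard(t, mu=0.0, sigma=1.0):
    # h(t) = f(t) / R(t)
    return norm_pdf(t, mu, sigma) / (1.0 - norm_cdf(t, mu, sigma))
```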

Properties and Moments

    Median:                                μ
    Mode:                                  μ
    Mean (1st raw moment):                 μ
    Variance (2nd central moment):         σ²
    Skewness (3rd central moment):         0
    Excess kurtosis (4th central moment):  0
    Characteristic function:               exp(iμt − ½σ²t²)
    100α% percentile function:             t_α = μ + σΦ⁻¹(α) = μ + σ√2·erf⁻¹(2α − 1)

Parameter Estimation

Plotting Method

Least Mean Square fit y = mx + c:
    X-axis: t_i,  Y-axis: Φ⁻¹[F(t_i)]
    μ̂ = −c/m,  σ̂ = 1/m

Maximum Likelihood Function

Likelihood function, for complete data:
    L(μ, σ|E) = ∏ᵢ₌₁ⁿᶠ 1/(σ√(2π)) · exp{ −½[(t_i − μ)/σ]² }
              = (σ√(2π))^(−n_F) · exp{ −1/(2σ²) Σᵢ₌₁ⁿᶠ (t_i − μ)² }

Log-likelihood function:
    Λ(μ, σ|E) = −n_F ln(σ√(2π)) − 1/(2σ²) Σᵢ₌₁ⁿᶠ (t_i − μ)²

Solve ∂Λ/∂μ = 0 to get the MLE μ̂:
    ∂Λ/∂μ = −n_F μ/σ² + (1/σ²) Σᵢ₌₁ⁿᶠ t_i = 0

Solve ∂Λ/∂σ = 0 to get σ̂:
    ∂Λ/∂σ = −n_F/σ + (1/σ³) Σᵢ₌₁ⁿᶠ (t_i − μ)² = 0

MLE Point Estimates. When there is only complete failure data the point estimates can be given as:

    μ̂ = (1/n_F) Σᵢ₌₁ⁿᶠ t_i,    σ̂² = (1/n_F) Σᵢ₌₁ⁿᶠ (t_i − μ̂)²

In most cases the unbiased estimators are used:

    μ̂ = (1/n_F) Σᵢ₌₁ⁿᶠ t_i,    σ̂² = 1/(n_F − 1) Σᵢ₌₁ⁿᶠ (t_i − μ̂)²

Fisher Information:
    I(μ, σ²) = [ 1/σ²    0
                 0       1/(2σ⁴) ]

100γ% Confidence Intervals (for complete data)

    μ:   1-sided lower:  μ̂ − t_γ(n−1)·σ̂/√n
         2-sided lower:  μ̂ − t_{(1+γ)/2}(n−1)·σ̂/√n
         2-sided upper:  μ̂ + t_{(1+γ)/2}(n−1)·σ̂/√n

    σ²:  1-sided lower:  σ̂²(n−1)/χ²_γ(n−1)
         2-sided lower:  σ̂²(n−1)/χ²_{(1+γ)/2}(n−1)
         2-sided upper:  σ̂²(n−1)/χ²_{(1−γ)/2}(n−1)

(Nelson 1982, pp.218-220.) Where t_γ(n−1) is the 100γth percentile of the t-distribution with n−1 degrees of freedom, and χ²_γ(n−1) is the 100γth percentile of the χ²-distribution with n−1 degrees of freedom.

Bayesian

Non-informative Priors when σ² is known, π₀(μ) (Yang and Berger 1998, p.22):

    Type: Uniform Proper Prior with limits μ ∈ [a, b]
    Prior: 1/(b − a)
    Posterior: Truncated Normal Distribution
        c·Norm(μ; Σt_i/n_F, σ²/n_F)  for a ≤ μ ≤ b
        Otherwise π(μ) = 0

    Type: All (Uniform Improper, Jeffreys, Reference, MDIP)
    Prior: 1
    Posterior: Norm(μ; Σt_i/n_F, σ²/n_F)  when μ ∈ (−∞, ∞)

Non-informative Priors when μ is known, π₀(σ²) (Yang and Berger 1998, p.23):

    Type: Uniform Proper Prior with limits σ² ∈ [a, b]
    Prior: 1/(b − a)
    Posterior: Truncated Inverse Gamma Distribution
        c·IG(σ²; (n_F − 2)/2, S/2)  for a ≤ σ² ≤ b
        Otherwise π(σ²) = 0

    Type: Uniform Improper Prior with limits σ² ∈ (0, ∞)
    Prior: 1
    Posterior: IG(σ²; (n_F − 2)/2, S/2). See section 1.7.1.

    Type: Jeffreys, Reference, MDIP Prior with limits σ² ∈ (0, ∞)
    Prior: 1/σ²
    Posterior: IG(σ²; n_F/2, S/2). See section 1.7.1.

Non-informative Priors when μ and σ² are unknown, π₀(μ, σ²) (Yang and Berger 1998, p.23):

    Type: Improper Uniform with limits μ ∈ (−∞, ∞), σ² ∈ (0, ∞)
    Prior: 1
    Posterior: π(μ|E) ~ T(μ; n_F − 3, t̄, S/[n_F(n_F − 3)]). See section 1.7.2.
               π(σ²|E) ~ IG(σ²; (n_F − 3)/2, S/2). See section 1.7.1.

    Type: Jeffreys Prior when μ ∈ (−∞, ∞), σ² ∈ (0, ∞)
    Prior: 1/σ⁴
    Posterior: π(μ|E) ~ T(μ; n_F + 1, t̄, S/[n_F(n_F + 1)]). See section 1.7.2.
               π(σ²|E) ~ IG(σ²; (n_F + 1)/2, S/2). See section 1.7.1.

    Type: Reference Prior with ordering {φ, σ}, where φ = μ/σ
    Prior: π₀(φ, σ) ∝ 1/[σ√(2 + φ²)]
    Posterior: No closed form.

    Type: Reference Prior where μ and σ are separate groups, and MDIP Prior
    Prior: 1/σ
    Posterior: π(μ|E) ~ T(μ; n_F − 1, t̄, S/[n_F(n_F − 1)]). See section 1.7.2.
               π(σ²|E) ~ IG(σ²; (n_F − 1)/2, S/2). See section 1.7.1.

where
    S = Σᵢ₌₁ⁿᶠ (t_i − t̄)²   and   t̄ = (1/n_F) Σᵢ₌₁ⁿᶠ t_i

Conjugate Priors

    UOI: μ from Norm(t; μ, σ²) with known σ²
    Evidence: n_F failures at times t_i
    Dist of UOI: Normal;  Prior parameters: u₀, v₀²
    Posterior parameters:
        u  = (u₀/v₀² + Σᵢ₌₁ⁿᶠ t_i/σ²) / (1/v₀² + n_F/σ²)
        v² = (1/v₀² + n_F/σ²)⁻¹

    UOI: λ = 1/σ² from Norm(t; μ, σ²) with known μ
    Evidence: n_F failures at times t_i
    Dist of UOI: Gamma;  Prior parameters: k₀, λ₀
    Posterior parameters:
        k = k₀ + n_F/2
        λ = λ₀ + ½ Σᵢ₌₁ⁿᶠ (t_i − μ)²

    UOI: μ_N from Logn(t; μ_N, σ_N²) with known σ_N²
    Evidence: n_F failures at times t_i
    Dist of UOI: Normal;  Prior parameters: u₀, v₀²
    Posterior parameters:
        u  = (u₀/v₀² + Σᵢ₌₁ⁿᶠ ln(t_i)/σ_N²) / (1/v₀² + n_F/σ_N²)
        v² = (1/v₀² + n_F/σ_N²)⁻¹

Description, Limitations and Uses

Example. The accuracy of a cutting machine used in manufacturing is desired to be measured. 5 cuts at the required length are made and measured as:
    7.436, 10.270, 10.466, 11.039, 11.854

MLE estimates are:
    μ̂ = Σt_i/n_F = 10.213
    σ̂² = 1/(n_F − 1) Σ(t_i − μ̂)² = 2.789
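These point estimates can be reproduced directly with the standard library; a minimal sketch:

```python
import statistics

cuts = [7.436, 10.270, 10.466, 11.039, 11.854]

mu_hat = statistics.fmean(cuts)        # MLE of the mean
var_hat = statistics.variance(cuts)    # unbiased (n - 1) variance estimator
```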

90% confidence interval for μ:
    [ μ̂ − t_{0.95}(4)·σ̂/√5 ,  μ̂ + t_{0.95}(4)·σ̂/√5 ]
    = [10.163, 10.262]

90% confidence interval for σ²:
    [ σ̂²·4/χ²_{0.95}(4) ,  σ̂²·4/χ²_{0.05}(4) ]
    = [1.176, 15.697]

A Bayesian point estimate using the Jeffreys non-informative improper prior 1/σ⁴, with posteriors μ ~ T(6, 10.213, 0.558²) and σ² ~ IG(3, 5.578), has point estimates:

    μ̂ = E[T(6, 10.213, 0.558²)] = 10.213
    σ̂² = E[IG(3, 5.578)] = 5.578/2 = 2.789

with 90% confidence intervals:

    μ:  [ F_T⁻¹(0.05) = 8.761,  F_T⁻¹(0.95) = 11.665 ]
    σ²: [ 1/F_G⁻¹(0.95) = 0.886,  1/F_G⁻¹(0.05) = 6.822 ]

Characteristics

Also known as a Gaussian distribution or bell curve.

Unit Normal Distribution. Also known as the standard normal distribution, it has μ = 0 and σ = 1, with pdf φ(z) and cdf Φ(z). If X is normally distributed with mean μ and standard deviation σ then the following transformation is used:
    z = (x − μ)/σ

Central Limit Theorem. Let X₁, X₂, …, X_n be a sequence of independent and identically distributed (i.i.d.) random variables each having a mean of μ and a variance of σ². As the sample size n increases, the distribution of the sample average of these random variables approaches the normal distribution with mean μ and variance σ²/n irrespective of the shape of the original distribution. Formally:
    S_n = X₁ + ⋯ + X_n
If we define new random variables:
    Z_n = (S_n − nμ)/(σ√n)   and   Y = S_n/n
then the distribution of Z_n converges to the standard normal distribution, and the distribution of Y converges to a normal distribution with mean μ and standard deviation σ/√n.

Sigma Intervals. Often intervals of the normal distribution are expressed in terms of distance away from the mean in units of sigma.
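A small simulation illustrates the theorem: averages of uniform random variables (far from normal individually) concentrate around μ with spread σ/√n. A sketch with an arbitrary seed:

```python
import random
import statistics

random.seed(12345)

n = 1000        # observations per sample average
trials = 2000   # number of sample averages

# X ~ Unif(0,1) has mu = 0.5 and sigma^2 = 1/12
averages = [statistics.fmean(random.random() for _ in range(n))
            for _ in range(trials)]

mean_of_avgs = statistics.fmean(averages)     # close to mu = 0.5
spread_of_avgs = statistics.stdev(averages)   # close to sqrt(1/(12*n))
```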

The following are approximate values for each sigma interval:

    Interval    Φ(μ + nσ) − Φ(μ − nσ)
    μ ± σ       68.2689492137%
    μ ± 2σ      95.4499736104%
    μ ± 3σ      99.7300203937%
    μ ± 4σ      99.9936657516%
    μ ± 5σ      99.9999426697%
    μ ± 6σ      99.9999998027%

Truncated Normal. Often in reliability engineering a truncated normal distribution may be used due to the limitation that t ≥ 0. See the Truncated Normal Continuous Distribution.

Inflection Points. Inflection points occur one standard deviation away from the mean (μ ± σ).
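Each row of the table is Φ(μ + nσ) − Φ(μ − nσ) = erf(n/√2), which is easy to check with the standard library (a sketch):

```python
import math

def sigma_coverage(n):
    # P(mu - n*sigma < X < mu + n*sigma) = Phi(n) - Phi(-n) = erf(n / sqrt(2))
    return math.erf(n / math.sqrt(2.0))
```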

Mean / Median / Mode. The mean, median and mode are always equal to μ.

Hazard Rate. The hazard rate is increasing for all t. The standard normal distribution's hazard rate approaches h(t) = t as t becomes large.

Let X_i ~ Norm(μ_i, σ_i²).

Convolution property:
    Σᵢ₌₁ⁿ X_i ~ Norm(Σμ_i, Σσ_i²)

Scaling property:
    aX + b ~ Norm(aμ + b, a²σ²)

Linear combination property:
    Σᵢ₌₁ⁿ (a_iX_i + b_i) ~ Norm(Σ(a_iμ_i + b_i), Σa_i²σ_i²)

Applications

Approximations to Other Distributions. The origin of the normal distribution was as an approximation of the binomial distribution. Due to the central limit theorem the normal distribution can be used to approximate many distributions, as detailed under 'Related Distributions'.
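The convolution and scaling properties above are implemented directly by Python's statistics.NormalDist, so they can be checked without simulation (a sketch):

```python
from statistics import NormalDist

X1 = NormalDist(mu=1.0, sigma=3.0)
X2 = NormalDist(mu=2.0, sigma=4.0)

# Convolution property: X1 + X2 ~ Norm(mu1 + mu2, sigma1^2 + sigma2^2)
S = X1 + X2          # sigma of S is sqrt(3^2 + 4^2) = 5

# Scaling property: aX + b ~ Norm(a*mu + b, a^2 * sigma^2)
Y = 2 * X1 + 5       # Norm(7, 6^2)
```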

Stress-Strength Interference. When the strength of a component follows a distribution, and the stress that component is subjected to follows a distribution, there exists a probability that the stress will be greater than the strength. When both distributions are normal distributions, there is a closed-form solution to the interference probability.

Life Distribution. When used as a life distribution, a truncated normal distribution may be used due to the constraint t ≥ 0. However it is often found that the difference in results is negligible. (Rausand & Høyland 2004)

Time Distributions. The normal distribution may be used to model simple repair or inspection tasks that have a typical duration with variation which is symmetrical about the mean. This is typical for inspection and preventative maintenance times.

Analysis of Variance (ANOVA). A test used to analyze variance and dependence of variables. A popular model used to conduct ANOVA assumes the data comes from a normal population.

Six Sigma Quality Management. Six sigma is a business management strategy which aims to reduce costs in manufacturing processes by removing variance in quality (defects). Current manufacturing standards aim for an expected 3.4 defects out of one million parts: 2Φ(−6). (Six Sigma Academy 2009)

Resources Online:
http://www.weibull.com/LifeDataWeb/the_normal_distribution.htm
http://mathworld.wolfram.com/NormalDistribution.html
http://en.wikipedia.org/wiki/Normal_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution 2nd ed., CRC.

Simon, M.K., 2006. Probability Distributions Involving Gaussian Random Variables: A Handbook for Engineers and Scientists, Springer.

Relationship to Other Distributions

Truncated Normal Distribution TNorm(x; μ, σ², a_L, b_U):
    Let X ~ Norm(μ, σ²), X ∈ (−∞, ∞). Then truncating X to [a_L, b_U] gives
    Y ~ TNorm(μ, σ², a_L, b_U),  Y ∈ [a_L, b_U]

Lognormal Distribution Logn(t; μ_N, σ_N²):
    Let X ~ Logn(μ_N, σ_N²) and Y = ln(X). Then Y ~ Norm(μ, σ²), where:
    μ = ln[ μ_N² / √(σ_N² + μ_N²) ],   σ² = ln[ (σ_N² + μ_N²) / μ_N² ]

Rayleigh Distribution Rayleigh(t; σ):
    Let X₁, X₂ ~ Norm(0, σ²) and Y = √(X₁² + X₂²). Then Y ~ Rayleigh(σ).

Chi-square Distribution χ²(t; v):
    Let X_k ~ Norm(μ, σ²) and Y = Σ_{k=1}^{v} [(X_k − μ)/σ]². Then Y ~ χ²(v).

Binomial Distribution Binom(k; n, p). Limiting case for constant p:
    lim_{n→∞} Binom(k; n, p) = Norm(k; μ = np, σ² = np(1 − p))
The normal distribution can be used as an approximation of the binomial distribution when np ≥ 10 and n(1 − p) ≥ 10, with a continuity correction:
    Binom(k; n, p) ≈ Norm(t = k + 0.5; μ = np, σ² = np(1 − p))

Poisson Distribution Pois(k; μ):
    lim_{μ→∞} F_Pois(k; μ) = F_Norm(k; μ, σ = √μ)
This is a good approximation when μ > 1000. When μ > 10 the same approximation can be made with a correction:
    F_Pois(k; μ) ≈ F_Norm(k; μ − 0.5, σ = √μ)

Beta Distribution Beta(t; α, β). For large α and β with fixed α/β:
    Beta(α, β) ≈ Norm( μ = α/(α + β), σ² = αβ/[(α + β)²(α + β + 1)] )
As α and β increase the mean remains constant and the variance is reduced.

Gamma Distribution Gamma(k, λ). Special case for large k:
    lim_{k→∞} Gamma(k, λ) = Norm( μ = k/λ, σ² = k/λ² )


4.6. Pareto Continuous Distribution

[Figures: Probability Density Function f(t), Cumulative Density Function F(t), and Hazard Rate h(t) of the Pareto distribution for various parameter values.]

Parameters & Description

    θ    θ > 0    Location parameter: θ is the lower limit of t. Sometimes referred to as t-minimum.
    α    α > 0    Shape parameter. Sometimes called the Pareto index.

Limits:  θ ≤ t < ∞

Distribution Formulas

PDF:
    f(t) = αθ^α / t^(α+1)

CDF:
    F(t) = 1 − (θ/t)^α

Reliability:
    R(t) = (θ/t)^α

Conditional Survivor Function P(T > x + t | T > t):
    m(x) = R(t + x)/R(t) = t^α / (t + x)^α
where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life:
    u(t) = ∫ₜ∞ R(x) dx / R(t)

Hazard Rate:
    h(t) = α/t

Cumulative Hazard Rate:
    H(t) = α·ln(t/θ)

Properties and Moments

    Median:                         θ·2^(1/α)
    Mode:                           θ
    Mean (1st raw moment):          αθ/(α − 1),  for α > 1
    Variance (2nd central moment):  αθ² / [(α − 1)²(α − 2)],  for α > 2
    Skewness (3rd central moment):  [2(1 + α)/(α − 3)]·√((α − 2)/α),  for α > 3

    Excess kurtosis (4th central moment):  6(α³ + α² − 6α − 2) / [α(α − 3)(α − 4)],  for α > 4
    Characteristic function:               α(−iθt)^α Γ(−α, −iθt)
    100γ% percentile function:             t_γ = θ(1 − γ)^(−1/α)

Parameter Estimation

Plotting Method. Least Mean Square fit y = mx + c:
    X-axis: ln(t_i),  Y-axis: ln[1 − F]
    α̂ = −m,  θ̂ = exp(c/α̂)

Maximum Likelihood Function

Likelihood function, for complete data:
    L(θ, α|E) = ∏ᵢ₌₁ⁿᶠ αθ^α / t_i^(α+1)

Log-likelihood function:
    Λ(θ, α|E) = n_F ln(α) + n_F α ln(θ) − (α + 1) Σᵢ₌₁ⁿᶠ ln(t_i)

Solve ∂Λ/∂α = 0 to get α̂:
    ∂Λ/∂α = n_F/α + n_F ln(θ) − Σᵢ₌₁ⁿᶠ ln(t_i) = 0

MLE Point Estimates. The likelihood function increases as θ increases. Therefore the MLE point estimate is the largest θ which satisfies θ ≤ t_i < ∞:
    θ̂ = min(t₁, …, t_{n_F})

Substituting θ̂ gives the MLE for α:
    α̂ = n_F / Σᵢ₌₁ⁿᶠ [ln(t_i) − ln(θ̂)]

Fisher Information:
    I(θ, α) = [ 1/α²    0
                0       α/θ² ]
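A minimal sketch of these point estimates (with illustrative data chosen so the answer is exact, rather than the example data from the text):

```python
import math

def pareto_mle(times):
    # theta_hat = min(t_i);  alpha_hat = n / sum(ln t_i - ln theta_hat)
    theta_hat = min(times)
    alpha_hat = len(times) / sum(math.log(t / theta_hat) for t in times)
    return theta_hat, alpha_hat

# illustrative data: log-spacings of 0, 1 and 2 give alpha_hat = 3/3 = 1
theta_hat, alpha_hat = pareto_mle([1.0, math.e, math.e ** 2])
```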

100γ% Confidence Intervals (for complete data)

    If θ is unknown:
        1-sided lower:  (α̂/2n_F)·χ²_{1−γ}(2n_F − 2)
        2-sided lower:  (α̂/2n_F)·χ²_{(1−γ)/2}(2n_F − 2)
        2-sided upper:  (α̂/2n_F)·χ²_{(1+γ)/2}(2n_F − 2)

    If θ is known:
        1-sided lower:  (α̂/2n_F)·χ²_{1−γ}(2n_F)
        2-sided lower:  (α̂/2n_F)·χ²_{(1−γ)/2}(2n_F)
        2-sided upper:  (α̂/2n_F)·χ²_{(1+γ)/2}(2n_F)

(Johnson et al. 1994, p.583.) Where χ²_γ(n) is the 100γth percentile of the χ²-distribution with n degrees of freedom.

Bayesian

Non-informative Priors when θ is known, π₀(α) (Yang and Berger 1998, p.22):

    Type: Jeffreys and Reference
    Prior: 1/α

Conjugate Priors

    UOI: b (upper limit) of Unif(t; a, b) with known a
    Evidence: n_F failures at times t_i
    Dist of UOI: Pareto;  Prior parameters: θ₀, α₀
    Posterior parameters:
        θ = max{θ₀, t₁, …, t_{n_F}}
        α = α₀ + n_F

    UOI: θ from Pareto(t; θ, α) with known α
    Evidence: n_F failures at times t_i
    Dist of UOI: Pareto;  Prior parameters: Θ₀, a₀ (where a₀ > αn_F)
    Posterior parameters:
        Θ = Θ₀
        a = a₀ − αn_F

    UOI: α from Pareto(t; θ, α) with known θ
    Evidence: n_F failures at times t_i
    Dist of UOI: Gamma;  Prior parameters: k₀, λ₀
    Posterior parameters:
        k = k₀ + n_F
        λ = λ₀ + Σᵢ₌₁ⁿᶠ ln(t_i/θ)

Description, Limitations and Uses

Example. 5 components are put on a test with the following failure times:
    108, 125, 458, 893, 13437 hours

MLE estimates are:
    θ̂ = 108
Substituting θ̂ gives the MLE for α:
    α̂ = 5 / Σᵢ₌₁⁵ [ln(t_i) − ln(108)] = 0.8029

90% confidence interval for α:


    [ (α̂/10)·χ²_{0.05}(8) ,  (α̂/10)·χ²_{0.95}(8) ] = [0.2194, 1.2451]

Characteristics

80/20 Rule. Most commonly described as the basis for the "80/20 rule" (in a quality context, for example, 80% of manufacturing defects will be a result of 20% of the causes).

Conditional Distribution. The conditional probability distribution of a Pareto random variable, given that the event is greater than or equal to a value θ₁ exceeding θ, is a Pareto distribution with the same index α but with minimum θ₁ instead of θ.

Types. This distribution is known as a Pareto distribution of the first kind. The Pareto distribution of the second kind (not detailed here) is also known as the Lomax distribution. Pareto also proposed a third distribution, now known as a Pareto distribution of the third kind.

Pareto and the Lognormal Distribution. The lognormal distribution models similar physical phenomena as the Pareto distribution. The two distributions have different weights at the extremities.

Minimum property. Let X_i ~ Pareto(θ, α_i) for constant θ. Then:
    min{X₁, X₂, …, X_n} ~ Pareto(θ, Σᵢ₌₁ⁿ α_i)

Applications

Rare Events. The survival function 'slowly' decreases compared to most life distributions, which makes it suitable for modeling rare events which have large outcomes. Examples include natural events such as the distribution of daily rainfall, or the size of manufacturing defects.

Resources Online:
http://mathworld.wolfram.com/ParetoDistribution.html
http://en.wikipedia.org/wiki/Pareto_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Arnold, B., 1983. Pareto Distributions, Fairland, MD: International Co-operative Pub. House.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1994. Continuous Univariate Distributions, Vol. 1 2nd ed., Wiley-Interscience.

Relationship to Other Distributions

Exponential Distribution Exp(t; λ):
    Let Y ~ Pareto(θ, α) and X = ln(Y/θ). Then X ~ Exp(λ = α).

Chi-Squared Distribution χ²(x; v):
    Let Y ~ Pareto(θ, α) and X = 2α·ln(Y/θ). Then X ~ χ²(v = 2).
    (Johnson et al. 1994, p.526)

Logistic Distribution Logistic(μ, s):
    Let X ~ Pareto(θ, α) and Y = −ln[(X/θ)^α − 1]. Then Y ~ Logistic(0, 1).
    (Hastings et al. 2000, p.127)


4.7. Triangle Continuous Distribution

[Figures: Probability Density Function f(t), Cumulative Density Function F(t), and Hazard Rate h(t) of the Triangle distribution for various combinations of a, b and c.]

Parameters & Description

    a    −∞ < a < b    Minimum value: a is the lower bound.
    b    a < b < ∞     Maximum value: b is the upper bound.
    c    a ≤ c ≤ b     Mode value: c is the mode of the distribution (top of the triangle).

Random variable:  a ≤ t ≤ b

Distribution Formulas

PDF:
    f(t) = 2(t − a) / [(b − a)(c − a)]   for a ≤ t ≤ c
    f(t) = 2(b − t) / [(b − a)(b − c)]   for c ≤ t ≤ b

CDF:
    F(t) = (t − a)² / [(b − a)(c − a)]       for a ≤ t ≤ c
    F(t) = 1 − (b − t)² / [(b − a)(b − c)]   for c ≤ t ≤ b

Reliability:
    R(t) = 1 − (t − a)² / [(b − a)(c − a)]   for a ≤ t ≤ c
    R(t) = (b − t)² / [(b − a)(b − c)]       for c ≤ t ≤ b

Properties and Moments

    Median:  a + √[(b − a)(c − a)/2]   for c ≥ (a + b)/2
             b − √[(b − a)(b − c)/2]   for c < (a + b)/2
    Mode:    c
    Mean (1st raw moment):          (a + b + c)/3
    Variance (2nd central moment):  (a² + b² + c² − ab − ac − bc)/18
    Skewness (3rd central moment):
        √2·(a + b − 2c)(2a − b − c)(a − 2b + c) / [5(a² + b² + c² − ab − ac − bc)^(3/2)]
    Excess kurtosis (4th central moment):  −3/5

    Characteristic function:
        −2[(b − c)e^(ita) − (b − a)e^(itc) + (c − a)e^(itb)] / [(b − a)(c − a)(b − c)t²]
    100γ% percentile function:
        t_γ = a + √[γ(b − a)(c − a)]          for γ < F(c)
        t_γ = b − √[(1 − γ)(b − a)(b − c)]    for γ ≥ F(c)

Parameter Estimation

Maximum Likelihood Function

Likelihood function:
    L(a, b, c|E) = ∏ᵢ₌₁ʳ 2(t_i − a)/[(b − a)(c − a)] · ∏ᵢ₌ᵣ₊₁ⁿᶠ 2(b − t_i)/[(b − a)(b − c)]
                   (failures to the left of c)        (failures to the right of c)
where failure times are ordered t₁ ≤ t₂ ≤ ⋯ ≤ t_{n_F}, r is the number of failure times less than c, and s is the number of failure times greater than c. Therefore n_F = r + s.

MLE Point Estimates. The MLE estimates â, b̂ and ĉ are obtained by numerically calculating the likelihood function for different r and selecting the maximum, where ĉ = t_r:

    max_{a ≤ c ≤ b} L(a, b, c|E) = (2/(b − a))^{n_F} · max{ M(a, b, r̂(a, b)) }

where
    M(a, b, r) = ∏ᵢ₌₁ʳ⁻¹ (t_i − a)/(t_r − a) · ∏ᵢ₌ᵣ₊₁ⁿᶠ (b − t_i)/(b − t_r)
    r̂(a, b) = argmax_{r ∈ {1, …, n_F}} M(a, b, r)

Note that the MLE estimates for a and b are not the same as for the uniform distribution:
    â ≠ min(t₁, t₂, …),  b̂ ≠ max(t₁, t₂, …)
(Kotz & Dorp 2004)

Description, Limitations and Uses

Example. When eliciting an opinion from an expert on the possible value of a quantity, x, the expert may give:
    - Lowest possible value = 0
    - Highest possible value = 1
    - Estimate of most likely value (mode) = 0.7

The corresponding distribution for x may be a triangle distribution with parameters:
    a = 0, b = 1, c = 0.7
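The elicited distribution can be evaluated through the percentile (inverse-cdf) function given earlier; a minimal sketch (the function name is illustrative; Python's random.triangular implements the same transform for sampling):

```python
import math

def triangle_ppf(g, a, b, c):
    # 100g% percentile of Triangle(a, b, c)
    Fc = (c - a) / (b - a)          # F(c), the cdf value at the mode
    if g < Fc:
        return a + math.sqrt(g * (b - a) * (c - a))
    return b - math.sqrt((1.0 - g) * (b - a) * (b - c))
```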


Characteristics

Standard Triangle Distribution. The standard triangle distribution has a = 0, b = 1. This distribution has a mean of (1 + c)/3 and a median of √(c/2) for c ≥ ½, or 1 − √((1 − c)/2) for c < ½.

Symmetrical Triangle Distribution. The symmetrical triangle distribution occurs when c = (a + b)/2. The symmetrical triangle distribution is formed from the average of two uniform random variables (see related distributions).

Applications

Subjective Representation. The triangle distribution is often used to model subjective evidence, where a and b are the bounds of the estimation and c is an estimate of the mode.

Substitution for the Beta Distribution. Because the triangle distribution has bounded support, it may be used in place of the beta distribution.

Monte Carlo Simulation. Used to approximate distributions of variables when the underlying distribution is unknown. A distribution of interest is obtained by conducting Monte Carlo simulation of a model using triangle distributions as inputs.

Resources Online: http://mathworld.wolfram.com/TriangularDistribution.html http://en.wikipedia.org/wiki/Triangular_distribution

Books: Kotz, S. & Dorp, J.R.V., 2004. Beyond Beta: Other Continuous Families Of Distributions With Bounded Support And Applications, World Scientific Publishing Company.

Relationship to Other Distributions

Uniform Distribution Unif(t; a, b):
    Let X_i ~ Unif(a, b) and Y = (X₁ + X₂)/2. Then:
    Y ~ Triangle(a, (a + b)/2, b)

Beta Distribution Beta(t; α, β). Special cases:
    Beta(1, 2) = Triangle(0, 0, 1)
    Beta(2, 1) = Triangle(0, 1, 1)

4.8. Truncated Normal Continuous Distribution

[Figures: Probability Density Function f(t), Cumulative Density Function F(t), and Hazard Rate h(t) of the Truncated Normal distribution for various combinations of μ, σ and truncation bounds.]

Parameters & Description

    μ     −∞ < μ < ∞       Location parameter: the mean of the (untruncated) distribution.
    σ²    σ² > 0           Scale parameter: the variance of the (untruncated) distribution.
    a_L   −∞ < a_L < b_U   Lower bound. The standard normal transform of a_L is z_a = (a_L − μ)/σ.
    b_U   a_L < b_U < ∞    Upper bound. The standard normal transform of b_U is z_b = (b_U − μ)/σ.

Limits:  a_L < x ≤ b_U

Distribution Formulas

Left Truncated Normal, x ∈ [0, ∞):

    PDF:
        f(x) = φ(z_x) / [σ·Φ(−z₀)]   for 0 ≤ x < ∞
        f(x) = 0                     otherwise

    CDF:
        F(x) = 0                               for x < 0
        F(x) = [Φ(z_x) − Φ(z₀)] / Φ(−z₀)       for 0 ≤ x < ∞

    Reliability:
        R(x) = 1                     for x < 0
        R(x) = Φ(−z_x) / Φ(−z₀)      for 0 ≤ x < ∞

General Truncated Normal, x ∈ [a_L, b_U]:

    PDF:
        f(x) = φ(z_x) / [σ·(Φ(z_b) − Φ(z_a))]   for a_L ≤ x ≤ b_U
        f(x) = 0                                 otherwise

    CDF:
        F(x) = 0                                          for x < a_L
        F(x) = [Φ(z_x) − Φ(z_a)] / [Φ(z_b) − Φ(z_a)]      for a_L ≤ x ≤ b_U
        F(x) = 1                                          for x > b_U

    Reliability:
        R(x) = 1                                          for x < a_L
        R(x) = [Φ(z_b) − Φ(z_x)] / [Φ(z_b) − Φ(z_a)]      for a_L ≤ x ≤ b_U
        R(x) = 0                                          for x > b_U

where φ is the standard normal pdf with μ = 0 and σ = 1, Φ is the standard normal cdf with μ = 0 and σ = 1, and z_i = (i − μ)/σ.
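A minimal sketch of the general truncated-normal pdf and cdf defined above, using math.erf for Φ (function names are illustrative):

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)

def _phi(z):
    # standard normal pdf
    return math.exp(-0.5 * z * z) / SQRT2PI

def _Phi(z):
    # standard normal cdf via erf
    return 0.5 + 0.5 * math.erf(z / SQRT2)

def tnorm_pdf(x, mu, sigma, a_L, b_U):
    if x < a_L or x > b_U:
        return 0.0
    za, zb, zx = (a_L - mu) / sigma, (b_U - mu) / sigma, (x - mu) / sigma
    return _phi(zx) / (sigma * (_Phi(zb) - _Phi(za)))

def tnorm_cdf(x, mu, sigma, a_L, b_U):
    if x < a_L:
        return 0.0
    if x > b_U:
        return 1.0
    za, zb, zx = (a_L - mu) / sigma, (b_U - mu) / sigma, (x - mu) / sigma
    return (_Phi(zx) - _Phi(za)) / (_Phi(zb) - _Phi(za))
```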

Conditional Survivor Function P(T > x + t | T > t):

    Left truncated, for 0 ≤ t < ∞:
        m(x) = R(t + x)/R(t) = [1 − Φ(z_{t+x})] / [1 − Φ(z_t)] = Φ((μ − t − x)/σ) / Φ((μ − t)/σ)

    General, for a_L ≤ t ≤ b_U:
        m(x) = R(t + x)/R(t) = [Φ(z_b) − Φ(z_{t+x})] / [Φ(z_b) − Φ(z_t)]
        m(x) = 0 for t > b_U

where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t. This operation is the equivalent of replacing the lower bound with t.

Mean Residual Life:
    u(t) = ∫ₜ∞ R(x) dx / R(t)

Hazard Rate:

    Left truncated:
        h(x) = 0                             for x < 0
        h(x) = φ(z_x) / [σ·(1 − Φ(z_x))]     for 0 ≤ x < ∞

    General:
        h(x) = 0                                   for x < a_L
        h(x) = φ(z_x) / [σ·(Φ(z_b) − Φ(z_x))]      for a_L ≤ x ≤ b_U
        h(x) = 0                                   for x > b_U

Cumulative Hazard Rate:
    H(t) = −ln[R(t)]

Properties and Moments (Left Truncated Normal, x ∈ [0, ∞); General Truncated Normal, x ∈ [a_L, b_U])

    Median:  no closed form

    Mode:
        Left:     μ where μ ≥ 0;  0 where μ < 0
        General:  μ where μ ∈ [a_L, b_U];  a_L where μ < a_L;  b_U where μ > b_U

    Mean (1st raw moment):
        Left:     μ + σφ(z₀)/Φ(−z₀),  where z₀ = −μ/σ
        General:  μ + σ·[φ(z_a) − φ(z_b)] / [Φ(z_b) − Φ(z_a)],  where z_a = (a_L − μ)/σ, z_b = (b_U − μ)/σ

    Variance (2nd central moment):
        σ²[1 − Δ₁ − Δ₀²]

where, for the left truncated and general cases respectively:
    Δ_k = z₀^k φ(z₀) / Φ(−z₀)
    Δ_k = [z_b^k φ(z_b) − z_a^k φ(z_a)] / [Φ(z_b) − Φ(z_a)]

    Skewness (3rd central moment):
        −[2Δ₀³ + (3Δ₁ − 1)Δ₀ + Δ₂] / V^(3/2),  where V = 1 − Δ₁ − Δ₀²

    Excess kurtosis (4th central moment):
        [−3Δ₀⁴ − 6Δ₁Δ₀² − 2Δ₀² − 4Δ₂Δ₀ − 3Δ₁ − Δ₃ + 3] / V² − 3

    Characteristic function: see (Abadir & Magdalinos 2002, pp.1276-1287)

    100α% percentile function:
        Left:     t_α = μ + σΦ⁻¹{ α + Φ(z₀)(1 − α) }
        General:  t_α = μ + σΦ⁻¹{ αΦ(z_b) + Φ(z_a)(1 − α) }

Parameter Estimation

Maximum Likelihood Function

Likelihood function, for limits [a_L, b_U]:
    L(μ, σ, a_L, b_U|E) = ∏ᵢ₌₁ⁿᶠ 1/[σ(Φ(z_b) − Φ(z_a))√(2π)] · exp{ −½[(x_i − μ)/σ]² }
                        = [σ√(2π)(Φ(z_b) − Φ(z_a))]^(−n_F) · exp{ −1/(2σ²) Σᵢ₌₁ⁿᶠ (x_i − μ)² }

For limits [0, ∞):
    L(μ, σ|E) = ∏ᵢ₌₁ⁿᶠ 1/[σΦ(−z₀)√(2π)] · exp{ −½[(x_i − μ)/σ]² }
              = [Φ(−z₀)σ√(2π)]^(−n_F) · exp{ −1/(2σ²) Σᵢ₌₁ⁿᶠ (x_i − μ)² }

Log-likelihood function, for limits [a_L, b_U]:
    Λ(μ, σ, a_L, b_U|E) = −n_F ln[Φ(z_b) − Φ(z_a)] − n_F ln(σ√(2π)) − 1/(2σ²) Σᵢ₌₁ⁿᶠ (x_i − μ)²

For limits [0, ∞):

∞ Truncated Normal Continuous Distribution 139

1 ( , | ) = ln( { }) ln 2 𝑛𝑛𝐹𝐹 ( ) 2 2 Λ 𝜇𝜇 𝜎𝜎 𝐸𝐸 −𝑛𝑛𝐹𝐹 𝛷𝛷 −𝑧𝑧0 − 𝑛𝑛𝐹𝐹 �𝜎𝜎√ 𝜋𝜋� − 2 � 𝑥𝑥𝑖𝑖 − 𝜇𝜇 𝜎𝜎 𝑖𝑖=1 ����������������������������������� failures
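As an illustrative sketch (not from the text), the left-truncated log-likelihood above can be evaluated with only the Python standard library; the function names are hypothetical. When the truncation point is far in the lower tail (μ/σ large), the correction term −n_F ln Φ(μ/σ) vanishes and the result coincides with the ordinary normal log-likelihood.

```python
import math

def norm_cdf(x):
    # Standard normal CDF, Phi(x), via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def log_likelihood_left_truncated(data, mu, sigma):
    # Lambda = -n*ln(Phi(mu/sigma)) - n*ln(sigma*sqrt(2*pi)) - sum((x-mu)^2)/(2*sigma^2)
    # Note Phi(-z0) = Phi(mu/sigma) because z0 = -mu/sigma.
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return (-n * math.log(norm_cdf(mu / sigma))
            - n * math.log(sigma * math.sqrt(2.0 * math.pi))
            - ss / (2.0 * sigma ** 2))
```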

Setting the partial derivatives to zero:
  ∂Λ/∂μ = −n_F [φ(z_a) − φ(z_b)] / (σ[Φ(z_b) − Φ(z_a)]) + (1/σ²) Σ_{i=1}^{n_F} (x_i − μ) = 0   (failures)
  ∂Λ/∂σ = −n_F [z_a φ(z_a) − z_b φ(z_b)] / (σ[Φ(z_b) − Φ(z_a)]) − n_F/σ + (1/σ³) Σ_{i=1}^{n_F} (x_i − μ)² = 0   (failures)

MLE Point Estimates. First estimate the values for ẑ_a and ẑ_b by numerically solving the simultaneous equations (Cohen 1991, p.33):
  H₁(z_a, z_b) = (Q_a − Q_b − z_a) / (z_b − z_a) = (x̄ − a_L) / (b_U − a_L)
  H₂(z_a, z_b) = [1 + z_a Q_a − z_b Q_b − (Q_a − Q_b)²] / (z_b − z_a)² = s² / (b_U − a_L)²

Where:
  Q_a = φ(z_a) / [Φ(z_b) − Φ(z_a)],  Q_b = φ(z_b) / [Φ(z_b) − Φ(z_a)]
  z_a = (a_L − μ)/σ,  z_b = (b_U − μ)/σ
  x̄ = (1/n_F) Σ_{i=1}^{n_F} x_i,  s² = (1/n_F) Σ_{i=1}^{n_F} (x_i − x̄)²

The distribution parameters can then be estimated using:
  σ̂ = (b_U − a_L) / (ẑ_b − ẑ_a),  μ̂ = a_L − σ̂ẑ_a

(Cohen 1991, p.44) provides a graphical procedure to estimate parameters to use as the starting point for numerical solvers.

For the case where the limits are [0, ∞), first numerically solve for ẑ₀:
  [1 − Q₀(Q₀ − z₀)] / (Q₀ − z₀)² = s²/x̄²
where
  Q₀ = φ(z₀) / [1 − Φ(z₀)]

The distribution parameters can then be estimated using:
  σ̂ = x̄ / (Q₀ − ẑ₀),  μ̂ = −σ̂ẑ₀

When the limits a_L and b_U are unknown, the likelihood function is maximized when the difference Φ(z_b) − Φ(z_a) is at its minimum. This occurs when the difference b_U − a_L is at its minimum. Therefore the MLE estimates for a_L and b_U are:
  â_L = min(t₁^F, t₂^F, …),  b̂_U = max(t₁^F, t₂^F, …)

Fisher Information (Cohen 1991, p.40):
  I(μ, σ) = (1/σ²) [ 1 − Q′_a − Q′_b                  2(x̄ − μ)/σ − λ_a + λ_b                ]
                   [ 2(x̄ − μ)/σ − λ_a + λ_b          1 + η_a − η_b − 3[s² + (x̄ − μ)²]/σ²   ]
Where:
  Q′_a = Q_a(Q_a − z_a),  Q′_b = Q_b(Q_b + z_b)
  λ_a = a_L Q′_a + Q_a,  λ_b = Q_b − b_U Q′_b
  η_a = a_L(λ_a + Q_a),  η_b = b_U(λ_b + Q_b)

100γ% Confidence Intervals: Calculated from the Fisher information matrix. See section 1.4.7. For further detail and examples see (Cohen 1991, p.41).

Bayesian: No closed form solutions for priors exist.

Description, Limitations and Uses

Example 1. The size of washers delivered from a manufacturer is to be modeled. The manufacturer has already removed all washers below 15.95 mm and all washers above 16.05 mm. The washers received have the following diameters:

  15.976, 15.970, 15.955, 16.007, 15.966, 15.952, 15.955 mm

From the data: x̄ = 15.973, s² = 4.3950E-4

Using a numerical solver, the MLE estimates for z_a and z_b are:
  ẑ_a = 0,  ẑ_b = 3.3351

Therefore:
  σ̂ = (b_U − a_L) / (ẑ_b − ẑ_a) = 0.029984
  μ̂ = a_L − σ̂ẑ_a = 15.95

To calculate confidence intervals, first calculate:
  Q′_a = 0.63771,  Q′_b = 0.010246
  λ_a = 10.970,  λ_b = −0.16138
  η_a = 187.71,  η_b = −2.54087

90% confidence intervals:
  I(μ, σ) = [ 391.57   −10699 ]
            [ −10699   209183 ]
  [J_n(μ̂, σ̂)]⁻¹ = [n_F I(μ̂, σ̂)]⁻¹ = [ 1.1835E-4   6.0535E-6 ]
                                        [ 6.0535E-6   2.2154E-7 ]

90% confidence interval for μ:
  [μ̂ − Φ⁻¹(0.95)√(1.1835E-4), μ̂ + Φ⁻¹(0.95)√(1.1835E-4)] = [15.932, 15.968]

90% confidence interval for σ:
  [σ̂·exp(−Φ⁻¹(0.95)√(2.2154E-7)/σ̂), σ̂·exp(Φ⁻¹(0.95)√(2.2154E-7)/σ̂)] = [2.922E-2, 3.0769E-2]

An estimate can be made of how many washers the manufacturer discards:

The distribution of washer sizes is a Normal Distribution with estimated parameters μ̂ = 15.95, σ̂ = 0.029984. The percentage of washers which pass is:

  F(16.05) − F(15.95) = 49.96%

It is likely that there is too much variance in the manufacturing process for this system to be efficient.
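The pass-percentage figure above can be reproduced with the standard library; this is a sketch, and the helper names are chosen here for illustration.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Estimated underlying (non-truncated) washer-size distribution
mu_hat, sigma_hat = 15.95, 0.029984

def pass_fraction(lower, upper, mu, sigma):
    # Fraction of production falling inside the acceptance limits
    return norm_cdf((upper - mu) / sigma) - norm_cdf((lower - mu) / sigma)

p = pass_fraction(15.95, 16.05, mu_hat, sigma_hat)
print(round(p, 4))  # → 0.4996, i.e. the 49.96% quoted above
```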

Example 2. The following example adjusts the calculations used in the Normal Distribution to account for the fact that the limit on distance is [0, ∞).

The accuracy of a cutting machine used in manufacturing is to be measured. Five cuts at the required length are made and measured as:

  7.436, 10.270, 10.466, 11.039, 11.854 mm

From the data: x̄ = 10.213, s² = 2.789

Using a numerical solver, the MLE estimate for z₀ is:
  ẑ₀ = −4.5062

Therefore:
  σ̂ = x̄ / (Q₀ − ẑ₀) = 2.26643
  μ̂ = −σ̂ẑ₀ = 10.213

To calculate confidence intervals, first calculate:
  Q′₀ = 7.0042E-5,  λ₀ = 1.5543E-5,  λ_b = −0.16138

90% confidence intervals:
  I(μ, σ) = [ 0.19466      −2.9453E-6 ]
            [ −2.9453E-6   0.12237    ]
  [J_n(μ̂, σ̂)]⁻¹ = [n_F I(μ̂, σ̂)]⁻¹ = [ 1.0274      2.4728E-5 ]
                                        [ 2.4728E-5   1.6343    ]

90% confidence interval for μ:
  [μ̂ − Φ⁻¹(0.95)√1.0274, μ̂ + Φ⁻¹(0.95)√1.0274] = [8.546, 11.88]

90% confidence interval for σ:
  [σ̂·exp(−Φ⁻¹(0.95)√1.6343/σ̂), σ̂·exp(Φ⁻¹(0.95)√1.6343/σ̂)] = [0.8962, 5.732]

To compare these results to a non-truncated normal distribution:

                         90% Lower CI   Point Est   90% Upper CI
  Norm μ   - Classical       10.163       10.213       10.262
  Norm σ²  - Classical        1.176        2.789       15.697
  Norm μ   - Bayesian         8.761       10.213       11.665
  Norm σ²  - Bayesian         0.886        2.789        6.822
  TNorm μ                     8.546       10.213       11.88
  TNorm σ²                    0.80317      5.1367      32.856
  *Note: The TNorm σ estimate and interval are squared.

The truncated normal produced wider confidence intervals for the parameter estimates; however, the point estimates were within each other's confidence intervals. In this case the truncation correction might be ignored for ease of calculation.
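The numerical step in Example 2, solving for ẑ₀, can be sketched with a simple bisection on the left-truncation equation; the bracketing interval and function names are assumptions, relying on the left-hand side being increasing in z₀ over this range.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def g(z0):
    # Left-hand side of [1 - Q0(Q0 - z0)] / (Q0 - z0)^2 = s^2/xbar^2,
    # where Q0 = phi(z0)/(1 - Phi(z0))
    q0 = norm_pdf(z0) / (1.0 - norm_cdf(z0))
    return (1.0 - q0 * (q0 - z0)) / (q0 - z0) ** 2

def solve_z0(ratio, lo=-20.0, hi=3.0):
    # Bisection: g is increasing in z0 on [lo, hi]
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Self-consistency check: recover a known z0 from its own ratio
print(round(solve_z0(g(-1.0)), 6))  # → -1.0
```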

Characteristics. For large μ/σ the truncation may have a negligible effect. In this case the Normal Continuous Distribution may be used as an approximation.

Let:
  X_i ~ TNorm(μ_i, σ_i²) where X_i ∈ [a_i, b_i]

Convolution Property. The sum of truncated normal distribution random variables is not a truncated normal distribution. When truncation is symmetrical about the mean, the sum of truncated normal random variables is well approximated using:
  Y = Σ_{i=1}^n X_i, where μ_i = (a_i + b_i)/2
  Y ≈ TNorm(Σμ_i, ΣVar(X_i)), where Y ∈ [Σa_i, Σb_i]

Linear Transformation Property (Cozman & Krotkov 1997):
  Y = cX + d ~ TNorm(cμ + d, c²σ²), where Y ∈ [ca + d, cb + d]

Applications. Life Distribution. When used as a life distribution, a truncated Normal Distribution may be used due to the constraint t ≥ 0. However it is often found that the difference in results is negligible. (Rausand & Høyland 2004)

Repair Time Distributions. The truncated normal distribution may be used to model simple repair or inspection tasks that have a typical duration with little variation, using the limits [0, ∞).

Failures After Pre-test Screening. When a customer receives a product from a vendor, the product may have already been subject to burn-in testing. The customer will not know the number of failures which occurred during the burn-in, but may know the duration. As such the failure distribution is left truncated. (Meeker & Escobar 1998, p.269)

Flaws under the inspection threshold. When a flaw is not detected due to the flaw’s amplitude being less than the inspection threshold the distribution is left truncated. (Meeker & Escobar 1998, p.266)

Worst Case Measurements. Sometimes only the worst performers from a population are monitored and have data collected. Therefore the threshold which determined that the item be monitored is the truncation limit. (Meeker & Escobar 1998, p.267)

Screening Out Units With Large Defects. In quality control processes it may be common to remove defects which exceed a limit. The remaining population of defects delivered to the customer has a right truncated distribution. (Meeker & Escobar 1998, p.270)

Resources Online: http://en.wikipedia.org/wiki/Truncated_normal_distribution http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc) http://www.ntrand.com/truncated-normal-distribution/

Books:
Cohen, 1991. Truncated and Censored Samples, 1st ed., CRC Press.

Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution 2nd ed., CRC.

Schneider, H., 1986. Truncated and censored samples from normal populations, M. Dekker.

Relationship to Other Distributions

Normal Distribution, Norm(x; μ, σ²). Let:
  X ~ Norm(μ, σ²),  X ∈ (−∞, ∞)
Then X ~ TNorm(μ, σ², −∞, ∞), and truncating to [a_L, b_U] gives:
  Y ~ TNorm(μ, σ², a_L, b_U),  Y ∈ [a_L, b_U]
For further relationships see the Normal Continuous Distribution.


4.9. Uniform Continuous Distribution

[Figure: Probability Density Function f(t), Cumulative Density Function F(t) and Hazard Rate h(t) for the Uniform distribution with parameter sets (a=0, b=1), (a=0.75, b=1.25) and (a=0.25, b=1.5)]

Parameters & Description

Parameters:
  a: Minimum Value. a is the lower bound of the uniform distribution. 0 ≤ a < b.
  b: Maximum Value. b is the upper bound of the uniform distribution. a < b < ∞.

Random Variable: a ≤ t ≤ b

Distribution Formulas (Time Domain | Laplace Domain)

PDF:
  f(t) = 1/(b − a) for a ≤ t ≤ b; 0 otherwise
  = [u(t − a) − u(t − b)]/(b − a)
  f(s) = (e^{−as} − e^{−bs}) / (s(b − a))
  where u(t − a) is the Heaviside step function.

CDF:
  F(t) = 0 for t < a; (t − a)/(b − a) for a ≤ t ≤ b; 1 for t > b
  = [(t − a)/(b − a)]{u(t − a) − u(t − b)} + u(t − b)
  F(s) = (e^{−as} − e^{−bs}) / (s²(b − a))

Reliability:
  R(t) = 1 for t < a; (b − t)/(b − a) for a ≤ t ≤ b; 0 for t > b
  R(s) = 1/s + (e^{−bs} − e^{−as}) / (s²(b − a))

Conditional Survivor Function, m(x) = P(T > x + t | T > t) = R(t + x)/R(t):
For t < a:
  m(x) = 1 for t + x < a; [b − (t + x)]/(b − a) for a ≤ t + x ≤ b; 0 for t + x > b
For a ≤ t ≤ b:
  m(x) = [b − (t + x)]/(b − t) for t + x ≤ b; 0 for t + x > b
For t > b:
  m(x) = 0
Where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life:
For t < a:
  u(t) = (a + b)/2 − t
For a ≤ t ≤ b:
  u(t) = (b − t)/2

For t > b:
  u(t) = 0

Hazard Rate:
  h(t) = 1/(b − t) for a ≤ t < b; 0 otherwise

Cumulative Hazard Rate:
  H(t) = 0 for t < a; −ln[(b − t)/(b − a)] for a ≤ t ≤ b; ∞ for t > b

Properties and Moments
  Median: (a + b)/2
  Mode: Any value between a and b
  Mean (1st raw moment): (a + b)/2
  Variance (2nd central moment): (b − a)²/12
  Skewness (3rd central moment): 0
  Excess kurtosis (4th central moment): −6/5
  Characteristic Function: (e^{itb} − e^{ita}) / (it(b − a))
  100α% Percentile Function: t_α = α(b − a) + a

Parameter Estimation

Maximum Likelihood Function
Likelihood Functions:
  L(a, b|E) = [1/(b − a)]^{n_F} · ∏_{i=1}^{n_S} (b − t_i^S)/(b − a) · ∏_{i=1}^{n_I} (t_i^{RI} − t_i^{LI})/(b − a)
  (failures · survivors · interval failures)

This assumes that all times are within the bounds a and b.

When there is only complete failure data:
  L(a, b|E) = [1/(b − a)]^{n_F}, where a ≤ t_i ≤ b

Point Estimates. The likelihood function is maximized when (b − a) is as small as possible, with the restriction that all times are between a and b. Thus:
  â = min(t₁, t₂, …),  b̂ = max(t₁, t₂, …)

When a = 0 and b is estimated with complete data, the following estimates may be used, where t_max = max(t₁, t₂, …, t_n). (Johnson et al. 1995, p.286)
  1. MLE: b̂ = t_max
  2. Minimum Mean Square Error: b̂ = [(n + 2)/(n + 1)] t_max
  3. Unbiased Estimator: b̂ = [(n + 1)/n] t_max
  4. Closest Estimator: b̂ = 2^{1/n} t_max

Procedures for parameter estimation when there is censored data are detailed in (Johnson et al. 1995, p.286).
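The four estimators listed above can be computed directly; this sketch reuses an illustrative sample and assumes a = 0 is known.

```python
def b_estimators(data):
    # Estimators for the upper bound b of Unif(0, b) from complete data
    # (Johnson et al. 1995, p.286)
    n = len(data)
    t_max = max(data)
    return {
        "mle": t_max,                          # 1. MLE
        "min_mse": (n + 2) / (n + 1) * t_max,  # 2. minimum mean square error
        "unbiased": (n + 1) / n * t_max,       # 3. unbiased
        "closest": 2 ** (1 / n) * t_max,       # 4. closest
    }

est = b_estimators([240, 585, 223, 751, 255])
print(est["mle"], round(est["unbiased"], 1))  # → 751 901.2
```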

Fisher Information:
  I(a, b) = [ 1/(a − b)²    −1/(a − b)² ]
            [ −1/(a − b)²   1/(a − b)²  ]

Bayesian. The Uniform distribution is widely used in Bayesian methods as a non-informative prior or to model evidence which only suggests bounds on the parameter.

Non-informative Prior. The Uniform distribution can be used as a non-informative prior. As can be seen below, the only effect the uniform prior has on Bayes' equation is to limit the range of the parameter over which the denominator integrates.

  π(θ|E) = L(E|θ)·[1/(b − a)] / ∫_a^b L(E|θ)·[1/(b − a)] dθ = L(E|θ) / ∫_a^b L(E|θ) dθ

Parameter Bounds. This type of distribution allows an easy method to mathematically model soft data where only the parameter bounds can be estimated. An example is where the uniform distribution can model a person's opinion on the value of θ, where they know that it could not be lower than a or greater than b, but are unsure of any particular value θ could take.

Non-informative Priors
  Jeffrey's Prior: 1/(b − a)

Description, Limitations and Uses

Example. For an example of the uniform distribution being used in Bayesian updating as a prior, Beta(1,1), see the binomial distribution.

Given the following data, calculate the MLE parameter estimates:
  240, 585, 223, 751, 255

  â = 223
  b̂ = 751

Characteristics. The Uniform distribution is a special case of the Beta distribution when α = β = 1.

The uniform distribution has an increasing failure rate with lim_{t→b} h(t) = ∞.

The Standard Uniform Distribution has parameters a = 0 and b = 1. This results in f(t) = 1 for a ≤ t ≤ b and 0 otherwise.

Let T ~ Unif(a, b).

Uniformity Property. If t > a and t + Δ < b then:
  P(t → t + Δ) = ∫_t^{t+Δ} dt/(b − a) = Δ/(b − a)
The probability that a random variable falls within any interval of fixed length is independent of the location t, and is only dependent on the interval size Δ.

Variate Generation Property:
  F⁻¹(u) = u(b − a) + a

Residual Property. If k is a real constant where a < k < b then:
  Pr(T | T > k) ~ Unif(a = k, b)

Applications. Random Number Generator. The uniform distribution is widely used as the basis for the generation of random numbers for other statistical distributions. The random uniform values are mapped to the desired distribution by solving the inverse cdf.
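The inverse-cdf mapping just described can be sketched as follows; the exponential target and the fixed u values are illustrative only.

```python
import math

def uniform_variate(u, a, b):
    # Map a standard uniform sample u to Unif(a, b): F^-1(u) = u(b - a) + a
    return u * (b - a) + a

def exponential_variate(u, lam):
    # Map a standard uniform sample u to Exp(lam) by inverting
    # F(t) = 1 - exp(-lam*t), giving t = -ln(1 - u)/lam
    return -math.log(1.0 - u) / lam

print(uniform_variate(0.25, 1.0, 3.0))          # → 1.5
print(round(exponential_variate(0.5, 2.0), 4))  # → 0.3466
```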

Bayesian Inference. The uniform distribution can be used as a non-informative prior and to model soft evidence.

Special Case of Beta Distribution. In applications like Bayesian statistics the uniform distribution is used as an uninformative prior by using a beta distribution with α = β = 1.

Resources Online:
http://mathworld.wolfram.com/UniformDistribution.html
http://en.wikipedia.org/wiki/Uniform_distribution_(continuous)
http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books:
Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2, 2nd ed., Wiley-Interscience.

Relationship to Other Distributions

Beta Distribution. Let
  X_i ~ Unif(0, 1), with order statistics X_(1) ≤ X_(2) ≤ … ≤ X_(n)
Then
  X_(r) ~ Beta(r, n − r + 1)
where r and n are integers.

Special Case:
  Beta(t; a, b | α = 1, β = 1) = Unif(t; a, b)

Exponential Distribution. Let
  X ~ Exp(λ), i.e. Exp(t; λ), and Y = exp(−λX)
Then
  Y ~ Unif(0, 1)


5. Univariate Discrete Distributions



5.1. Bernoulli Discrete Distribution

[Figure: Probability Density Function f(k), Cumulative Density Function F(k) and Hazard Rate h(k) for the Bernoulli distribution with p = 0.3 and p = 0.55]

Parameters & Description

Parameters:
  p: Bernoulli probability parameter. 0 ≤ p ≤ 1. Probability of success.

Random Variable: k ∈ {0, 1}

Question: The probability of getting exactly k (0 or 1) successes in 1 trial with probability p.

Distribution Formulas

PDF:
  f(k) = p^k (1 − p)^{1−k} = { 1 − p for k = 0; p for k = 1 }

CDF:
  F(k) = (1 − p)^{1−k} = { 1 − p for k = 0; 1 for k = 1 }

Reliability:
  R(k) = 1 − (1 − p)^{1−k} = { p for k = 0; 0 for k = 1 }

Hazard Rate:
  h(k) = { 1 − p for k = 0; 1 for k = 1 }

Properties and Moments
  Mode: ‖p‖ (p rounded to the nearest integer) when p ≠ 0.5; {0, 1} when p = 0.5
  Mean (1st raw moment): p
  Variance (2nd central moment): p(1 − p)
  Skewness (3rd central moment): (q − p)/√(pq), where q = 1 − p
  Excess kurtosis (4th central moment): (6p² − 6p + 1) / (p(1 − p))
  Characteristic Function: (1 − p) + pe^{it}

Parameter Estimation

Maximum Likelihood Function
Likelihood Function:
  L(p|E) = p^{Σk_i} (1 − p)^{n − Σk_i}
where n is the number of Bernoulli trials, k_i ∈ {0, 1}, and Σk_i = Σ_{i=1}^n k_i.

Solve dL/dp = 0 for p:
  dL/dp = (Σk_i) p^{Σk_i − 1}(1 − p)^{n − Σk_i} − (n − Σk_i) p^{Σk_i}(1 − p)^{n − 1 − Σk_i} = 0
  (Σk_i) p^{Σk_i − 1}(1 − p)^{n − Σk_i} = (n − Σk_i) p^{Σk_i}(1 − p)^{n − 1 − Σk_i}
  (Σk_i)(1 − p) = (n − Σk_i) p

MLE Point Estimate:
  p̂ = Σk_i / n

Fisher Information:
  I(p) = 1 / (p(1 − p))

Confidence Intervals: See the discussion in the binomial distribution.

Bayesian. Non-informative priors for p, π(p):
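A minimal sketch of the point estimate and its Fisher-information-based standard error, assuming complete 0/1 data; the function name is hypothetical.

```python
import math

def bernoulli_mle(trials):
    # p_hat = sum(k_i)/n; the large-sample standard error is
    # 1/sqrt(n * I(p_hat)) = sqrt(p_hat*(1 - p_hat)/n)
    n = len(trials)
    p_hat = sum(trials) / n
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, se

p_hat, se = bernoulli_mle([1, 0, 1, 1])
print(p_hat)  # → 0.75
```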

(Yang and Berger 1998, p.6)

  Type: Uniform Proper Prior with limits p ∈ [a, b]
    Prior: 1/(b − a)
    Posterior: Truncated Beta Distribution, c·Beta(p; 1 + k, 2 − k) for a ≤ p ≤ b; otherwise π(p) = 0
  Type: Uniform Improper Prior with limits p ∈ [0, 1]
    Prior: 1 = Beta(p; 1, 1)
    Posterior: Beta(p; 1 + k, 2 − k)
  Type: Jeffrey's Prior / Reference Prior
    Prior: 1/√(p(1 − p)) = Beta(p; ½, ½)
    Posterior: Beta(p; ½ + k, 1.5 − k) when p ∈ [0, 1]
  Type: MDIP
    Prior: 1.6186·p^p (1 − p)^{1−p}
    Posterior: Proper, no closed form
  Type: Novick and Hall
    Prior: p⁻¹(1 − p)⁻¹ = Beta(0, 0)
    Posterior: Beta(p; k, 1 − k) when p ∈ [0, 1]

Conjugate Priors

  UOI: p from Bernoulli(k; p)
  Likelihood Model: Bernoulli
  Evidence: k failures in 1 trial
  Dist of UOI: Beta
  Prior Parameters: α₀, β₀
  Posterior Parameters: α = α₀ + k, β = β₀ + 1 − k

Description, Limitations and Uses

Example. When a demand is placed on a machine it undergoes a Bernoulli trial, with success defined as a successful start. It is known that the probability of a successful start, p, equals 0.8. Therefore the probability the machine does not start is f(0) = 0.2.

For an example with multiple Bernoulli trials see the binomial distribution.

Characteristics. A Bernoulli process is a probabilistic trial that can have one of two outcomes: success (k = 1), where the probability of success is p, and failure (k = 0), where the probability of failure is q ≡ 1 − p.

Single Trial. It is important to emphasize that the Bernoulli distribution is for a single trial or event. The case of multiple Bernoulli trials with replacement is the binomial distribution. The case of multiple Bernoulli trials without replacement is the hypergeometric distribution.

Let K_i ~ Bernoulli(k; p_i).

Maximum Property:
  max{K₁, K₂, …, K_n} ~ Bernoulli(k; p = 1 − Π(1 − p_i))

Minimum Property:
  min{K₁, K₂, …, K_n} ~ Bernoulli(k; p = Πp_i)

Product Property:
  Π_{i=1}^n K_i ~ Bernoulli(k; p = Πp_i)

Applications. Used to model a single event which has only two outcomes. In reliability engineering it is most often used to model demands or shocks to a component where the component will fail with probability p.

In practice it is rare for only a single event to be considered and so a binomial distribution is most often used (with the assumption of replacement). The conditions and assumptions of a Bernoulli trial however are used as the basis for each trial in a binomial distribution. See ‘Related Distributions’ and binomial distribution for more details.

Resources Online:

http://mathworld.wolfram.com/BernoulliDistribution.html http://en.wikipedia.org/wiki/Bernoulli_distribution http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Collani, E.V. & Dräger, K., 2001. Binomial distribution handbook for scientists and engineers, Birkhäuser.

Johnson, N.L., Kemp, A.W. & Kotz, S., 2005. Univariate Discrete Distributions 3rd ed., Wiley-Interscience. Relationship to Other Distributions The Binomial distribution counts the number of successes in n independent observations of a Bernoulli process.

Binomial Distribution. Let
  K_i ~ Bernoulli(k_i; p) and Y = Σ_{i=1}^n K_i
Then
  Y ~ Binomial(k' = Σk_i | n, p), where k' ∈ {0, 1, 2, …, n}

Special Case:
  Bernoulli(k; p) = Binomial(k; p | n = 1)


5.2. Binomial Discrete Distribution

[Figure: Probability Density Function f(k), Cumulative Density Function F(k) and Hazard Rate h(k) for the Binomial distribution with parameter sets (n=1, p=0.4), (n=5, p=0.4), (n=15, p=0.4) and (n=20, p=0.2), (n=20, p=0.5), (n=20, p=0.8)]

Parameters & Description

Parameters:
  n: Number of Trials. n ∈ {1, 2, …, ∞}.
  p: Bernoulli probability parameter. 0 ≤ p ≤ 1. Probability of success in a single trial.

Random Variable: k ∈ {0, 1, 2, …, n}

Question: The probability of getting exactly k successes in n trials.

Distribution Formulas

PDF:
  f(k) = C(n, k) p^k (1 − p)^{n−k}
where C(n, k) is the number of combinations of k from n:
  C(n, k) = n! / (k!(n − k)!)

CDF:
  F(k) = Σ_{j=0}^k [n!/(j!(n − j)!)] p^j (1 − p)^{n−j} = I_{1−p}(n − k, k + 1)
where I_x(a, b) is the Regularized Incomplete Beta function. See section 1.6.3.

When n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10, this can be approximated by a Poisson distribution with μ = np:
  F(k) ≅ Σ_{j=0}^k e^{−μ} μ^j / j! = Γ(k + 1, μ)/k! = 1 − F_{χ²}(2μ; 2k + 2)

When np ≥ 10 and n(1 − p) ≥ 10, the cdf can be approximated using a normal distribution:
  F(k) ≅ Φ( (k + 0.5 − np) / √(np(1 − p)) )

Reliability:
  R(k) = 1 − Σ_{j=0}^k [n!/(j!(n − j)!)] p^j (1 − p)^{n−j}
       = Σ_{j=k+1}^n [n!/(j!(n − j)!)] p^j (1 − p)^{n−j}
       = I_p(k + 1, n − k)
where I_p(a, b) is the Regularized Incomplete Beta function. See section 1.6.3.

Hazard Rate:
  h(k) = [ 1 + ((1 + θ)^n − Σ_{j=0}^k C(n, j)θ^j) / (C(n, k)θ^k) ]^{−1}
where
  θ = p/(1 − p)
(Gupta et al. 1997)

Properties and Moments

  Median: either ⌊np⌋ or ⌈np⌉
  Mode: ⌊(n + 1)p⌋
  Mean (1st raw moment): np
  Variance (2nd central moment): np(1 − p)
  Skewness (3rd central moment): (1 − 2p)/√(np(1 − p))
  Excess kurtosis (4th central moment): (6p² − 6p + 1) / (np(1 − p))
  Characteristic Function: (1 − p + pe^{it})^n
  100α% Percentile Function: Numerically solve for k (which is not arduous for n ≤ 10):
    k_α = F⁻¹(α; n, p)
  For np ≥ 10 and n(1 − p) ≥ 10 the normal approximation may be used:
    k_α ≅ Φ⁻¹(α)√(np(1 − p)) + np − 0.5

Parameter Estimation

Maximum Likelihood Function
Likelihood Function. For complete data only:
  L(p|E) = ∏_{i=1}^{n_B} p^{k_i} (1 − p)^{n_i − k_i} = p^{Σk_i} (1 − p)^{Σn_i − Σk_i}
Where n_B is the number of Binomial processes, Σk_i = Σ_{i=1}^{n_B} k_i, Σn_i = Σ_{i=1}^{n_B} n_i, and the combinatory term is ignored (see section 1.1.6 for discussion).

Solve dL/dp = 0 for p:
  dL/dp = (Σk_i) p^{Σk_i − 1}(1 − p)^{Σn_i − Σk_i} − (Σn_i − Σk_i) p^{Σk_i}(1 − p)^{Σn_i − 1 − Σk_i} = 0
  (Σk_i) p^{Σk_i − 1}(1 − p)^{Σn_i − Σk_i} = (Σn_i − Σk_i) p^{Σk_i}(1 − p)^{Σn_i − 1 − Σk_i}
  (Σk_i)(1 − p) = (Σn_i − Σk_i) p

MLE Point Estimate:
  p̂ = Σk_i / Σn_i

Fisher Information:
  I(p) = 1 / (p(1 − p))

Confidence Intervals. The confidence interval for the binomial distribution parameter p is a controversial subject which is still debated. The Wilson interval is recommended for both small and large n. (Brown et al. 2001)

  p_lower = [np̂ + κ²/2 − κ√(κ²/4 + np̂(1 − p̂))] / (n + κ²)
  p_upper = [np̂ + κ²/2 + κ√(κ²/4 + np̂(1 − p̂))] / (n + κ²)
where
  κ = Φ⁻¹((γ + 1)/2)

It should be noted that most textbooks use the Wald interval (normal approximation) given below; however many articles have shown these estimates to be erratic and they cannot be trusted. (Brown et al. 2001)

  p_upper = p̂ + κ√(p̂(1 − p̂)/n)
  p_lower = p̂ − κ√(p̂(1 − p̂)/n)

For a comparison of binomial confidence interval estimates the reader is referred to (Brown et al. 2001). The following webpage has links to online calculators which use many different methods:
http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval
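The Wilson interval above is straightforward to implement; a sketch, where κ is passed in directly (e.g. κ = Φ⁻¹(0.95) ≈ 1.64485 for a two-sided 90% interval). The values in the comment correspond to k = 12 failures in n = 50 trials.

```python
import math

def wilson_interval(k, n, kappa):
    # Wilson confidence bounds for a binomial proportion;
    # kappa = Phi^-1((gamma + 1)/2) for a 100*gamma% two-sided interval
    p_hat = k / n
    centre = n * p_hat + kappa ** 2 / 2.0
    half = kappa * math.sqrt(kappa ** 2 / 4.0 + n * p_hat * (1.0 - p_hat))
    denom = n + kappa ** 2
    return (centre - half) / denom, (centre + half) / denom

lo, hi = wilson_interval(12, 50, 1.64485)
print(round(lo, 4), round(hi, 3))  # → 0.1557 0.351
```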

Bayesian. Non-informative priors for p given n, π(p|n) (Yang and Berger 1998, p.6):

  Type: Uniform Proper Prior with limits p ∈ [a, b]
    Prior: 1/(b − a)
    Posterior: Truncated Beta Distribution, c·Beta(p; 1 + k, 1 + n − k) for a ≤ p ≤ b; otherwise π(p) = 0
  Type: Uniform Improper Prior with limits p ∈ [0, 1]
    Prior: 1 = Beta(p; 1, 1)
    Posterior: Beta(p; 1 + k, 1 + n − k)
  Type: Jeffrey's Prior / Reference Prior
    Prior: 1/√(p(1 − p)) = Beta(p; ½, ½)
    Posterior: Beta(p; ½ + k, ½ + n − k) when p ∈ [0, 1]
  Type: MDIP
    Prior: 1.6186·p^p (1 − p)^{1−p}
    Posterior: Proper, no closed form
  Type: Novick and Hall
    Prior: p⁻¹(1 − p)⁻¹ = Beta(0, 0)
    Posterior: Beta(p; k, n − k) when p ∈ [0, 1]

Conjugate Priors

  UOI: p from Binomial(k; p, n)
  Likelihood Model: Binomial
  Evidence: k failures in n trials
  Dist of UOI: Beta
  Prior Parameters: α₀, β₀
  Posterior Parameters: α = α₀ + k, β = β₀ + n − k

Description, Limitations and Uses

Example. Five machines are measured for performance on demand. The machines can either fail or succeed in their application. The machines are tested for 10 demands with the following data for each machine:

  Machine | Results over 10 trials
  1       | F = 3, S = 7
  2       | F = 2, S = 8
  3       | F = 2, S = 8
  4       | F = 3, S = 7
  5       | F = 2, S = 8

Assuming the machines are homogeneous, estimate the parameter p.

Using MLE:
  p̂ = Σk_i / Σn_i = 12/50 = 0.24

90% confidence intervals for p:
  κ = Φ⁻¹(0.95) = 1.64485
  p_lower = [np̂ + κ²/2 − κ√(κ²/4 + np̂(1 − p̂))] / (n + κ²) = 0.1557
  p_upper = [np̂ + κ²/2 + κ√(κ²/4 + np̂(1 − p̂))] / (n + κ²) = 0.351

A Bayesian point estimate using a uniform prior distribution Beta(1, 1), with posterior Beta(p; 13, 39), has a point estimate:
  p̂ = E[Beta(p; 13, 39)] = 13/52 = 0.25

With 90% confidence interval using the inverse Beta cdf:
  [F⁻¹(0.05) = 0.1579, F⁻¹(0.95) = 0.3532]

The probability of observing no failures in the next 10 trials with replacement is:
  f(0; 10, 0.25) = 0.0563

The probability of observing five or fewer failures in the next 10 trials with replacement is:
  F(5; 10, 0.25) = 0.9803
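The two probabilities above can be checked with the standard library; a sketch using `math.comb`.

```python
from math import comb

def binom_pmf(k, n, p):
    # Probability of exactly k successes in n trials
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def binom_cdf(k, n, p):
    # Probability of k or fewer successes
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

print(round(binom_pmf(0, 10, 0.25), 4))  # → 0.0563
print(round(binom_cdf(5, 10, 0.25), 4))  # → 0.9803
```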

Characteristics. CDF Approximations. The Binomial distribution is one of the most widely used distributions throughout history. Although simple, the CDF was tedious to calculate prior to the use of computers. As a result, approximations using the Poisson and Normal distributions have been used. For details see 'Related Distributions'.

With Replacement. The Binomial distribution models the probability of k successes in n Bernoulli trials. However, the k successes can occur anywhere among the n trials in C(n, k) different combinations. Therefore the Binomial distribution assumes replacement. The equivalent distribution which assumes sampling without replacement is the hypergeometric distribution.

Symmetrical. The distribution is symmetrical when p = 0.5.

Complement. f(k; n, p) = f(n − k; n, 1 − p). Tables usually only provide values up to n/2, allowing the reader to calculate values from n/2 to n using the complement formula.

Assumptions. The binomial distribution describes the behavior of a count variable K if the following conditions apply:
  1. The number of observations n is fixed.
  2. Each observation is independent.
  3. Each observation represents one of two outcomes ("success" or "failure").
  4. The probability of "success" is the same for each outcome.

Convolution Property. Let K_i ~ Binomial(n_i, p). Then:
  ΣK_i ~ Binomial(Σn_i, p)
when p is fixed.

𝑝𝑝 Applications Used to model independent repeated trials which have two outcomes. Examples used in Reliability Engineering are: • Number of independent components which fail, , from a population, after receiving a shock. • Number of failures to start, , from demands𝑘𝑘 on a component.𝑛𝑛 • Number of independent items𝑘𝑘 defective,𝑛𝑛 , from a population of items. 𝑘𝑘 𝑛𝑛 Resources Online: http://mathworld.wolfram.com/BinomialDistribution.html http://en.wikipedia.org/wiki/Binomial_distribution http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Collani, E.V. & Dräger, K., 2001. Binomial distribution handbook for scientists and engineers, Birkhäuser. Binomial

Johnson, N.L., Kemp, A.W. & Kotz, S., 2005. Univariate Discrete

Distributions 3rd ed., Wiley-Interscience. Relationship to Other Distributions The Binomial distribution counts the number of successes in independent observations of a Bernoulli process. 𝑘𝑘 𝑛𝑛 Let ~ ( ; ) = 𝑛𝑛

𝑖𝑖 i 𝑖𝑖 (k ; p) Then 𝐾𝐾 𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵 k′ 𝑝𝑝 𝑎𝑎𝑎𝑎𝑎𝑎 𝑌𝑌 � 𝐾𝐾 𝑖𝑖=1 ′ Y~ ( k ; , ) where {1, 2, … , } 𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵 ′ i Special Case: 𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵 ∑ 𝑛𝑛 𝑝𝑝 𝑘𝑘 ∈ 𝑛𝑛 ( ; ) = ( ; | = 1)

𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵 𝑘𝑘 𝑝𝑝 𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵 𝑘𝑘 𝑝𝑝 𝑛𝑛 164 Discrete Distributions

Hypergeometric Distribution, HyperGeo(k; n, m, N): The Hypergeometric distribution models the probability of k successes in n Bernoulli trials taken without replacement from a population of size N containing m successes. The Binomial distribution is the limiting case for N ≫ n with p = m/N not near 0 or 1:
    lim_{N→∞} HyperGeo(k; n, m, N) = Binomial(k; n, p = m/N)

Normal Distribution, Norm(t; μ, σ²): Limiting case for constant p:
    lim_{n→∞} Binomial(k; n, p) = Norm(μ = np, σ² = np(1 − p))
The Normal distribution can be used as an approximation of the Binomial distribution when np ≥ 10 and n(1 − p) ≥ 10, applying a continuity correction:
    Binomial(k; n, p) ≈ Norm(k + 0.5; μ = np, σ² = np(1 − p))

Poisson Distribution, Pois(k; μ): Limiting case for constant np:
    lim_{n→∞} Binomial(k; n, p) = Pois(k; μ = np)
The Poisson distribution is the limiting case of the Binomial distribution when n is large but the product np remains constant; hence the Poisson distribution models rare events. The Poisson distribution can be used as an approximation to the Binomial distribution when n ≥ 20 and p ≤ 0.05, or when n ≥ 100 and np ≤ 10. The Binomial distribution is expressed in terms of a probability of success, p, and a total number of trials, n, whereas the Poisson distribution is expressed in terms of a success rate and does not need the total number of trials. The derivation of the Poisson distribution from the Binomial can be found at http://mathworld.wolfram.com/PoissonDistribution.html.

This interpretation can also be used to understand the conditional distribution of a Poisson random variable. Let K₁, K₂ ~ Pois(μᵢ). Given K₁ + K₂ = n (the observed number of events), then:
    K₁ | n ~ Binomial(k; n, p = μ₁/(μ₁ + μ₂))

Multinomial Distribution, Multi_d(k | n, p): Special case:
    Multi_{d=2}(k | n, p) = Binomial(k | n, p)
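The Poisson approximation rule of thumb above can be checked numerically. A minimal sketch in plain Python (no libraries beyond the standard `math` module; the values n = 100, p = 0.01 are chosen so that n ≥ 100 and np ≤ 10 both hold):

```python
import math

def binom_pmf(k, n, p):
    # Binomial(k; n, p) = C(n, k) p^k (1 - p)^(n - k)
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, mu):
    # Pois(k; mu) = mu^k e^(-mu) / k!
    return mu**k * math.exp(-mu) / math.factorial(k)

n, p = 100, 0.01          # n >= 100 and n*p = 1 <= 10
mu = n * p
# largest pointwise difference between the two pmfs over the bulk of the support
max_err = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, mu)) for k in range(20))
```

With these values the largest pointwise difference is below 0.002, consistent with the stated rule of thumb.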

5.3. Poisson Discrete Distribution

[Figure: Probability Density Function f(k), Cumulative Density Function F(k), and Hazard Rate h(k) of the Poisson distribution for μ = 4, 10, 20, 30.]


Parameters & Description

Parameters: μ > 0. Shape parameter: μ is the expected number of events per time period or other physical dimension. If the Poisson distribution is modeling failure events, then μ = λt is the average number of failures that would occur in the time (or space) t. In this case t is fixed and λ becomes the distribution parameter. Some texts use the symbol ρ.

Random Variable: k is an integer, k ≥ 0.

Distribution Formulas

PDF:
    f(k) = μᵏ e^(−μ) / k!  =  (λt)ᵏ e^(−λt) / k!

CDF:
    F(k) = e^(−μ) Σ_{j=0}^{k} μʲ/j!  =  Γ(k + 1, μ) / k!  =  1 − F_χ²(2μ; 2k + 2)
where Γ(·,·) is the upper incomplete gamma function and F_χ²(x | v) is the Chi-square CDF.

When μ > 10, F(k) can be approximated by a normal distribution:
    F(k) ≅ Φ( (k − μ + 0.5)/√μ )

Reliability:
    R(k) = 1 − F(k)
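The pdf, cdf and percentile of the Poisson distribution are simple to evaluate directly; a sketch in plain Python (the chi-square form of the CDF is omitted and the direct sum is used; the percentile is found by direct search, which is the numerical solve the handbook describes):

```python
import math

def poisson_pmf(k, mu):
    # f(k) = mu^k e^(-mu) / k!
    return mu**k * math.exp(-mu) / math.factorial(k)

def poisson_cdf(k, mu):
    # F(k) = e^(-mu) * sum_{j=0}^{k} mu^j / j!
    return sum(poisson_pmf(j, mu) for j in range(k + 1))

def poisson_percentile(alpha, mu):
    # 100*alpha% percentile: smallest integer k with F(k) >= alpha
    k = 0
    while poisson_cdf(k, mu) < alpha:
        k += 1
    return k
```

For example, with μ = 4 the median (α = 0.5) is 4 and the 95th percentile is 8.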

Hazard Rate (Gupta et al. 1997):
    h(k) = [ 1 + (k!/μᵏ)( e^μ − 1 − Σ_{j=1}^{k} μʲ/j! ) ]^(−1)

Properties and Moments

Median: see the 100α% Percentile Function with α = 0.5.
Mode: ⌊μ⌋, where ⌊μ⌋ is the floor function (the largest integer not greater than μ).
Mean (1st raw moment): μ
Variance (2nd central moment): μ

Skewness (3rd standardized moment): 1/√μ
Excess kurtosis (4th standardized moment): 1/μ
Characteristic function: exp{ μ(e^(it) − 1) }
100α% Percentile Function: numerically solve F(k) = α for k (not arduous for μ ≤ 10):
    k_α = F^(−1)(α)
For μ > 10 the normal approximation may be used:
    k_α ≅ √μ Φ^(−1)(α) + μ − 0.5

Parameter Estimation

Maximum Likelihood Estimates

Likelihood function, for complete data:
    L(μ | E) = Π_{i=1}^{n} [ μ^(kᵢ) e^(−μ) / kᵢ! ]
where n is the number of Poisson processes with known counts kᵢ.

Log-likelihood function:
    Λ = −nμ + Σ_{i=1}^{n} { kᵢ ln(μ) − ln(kᵢ!) }

Setting ∂Λ/∂μ = 0:
    ∂Λ/∂μ = −n + (1/μ) Σ_{i=1}^{n} kᵢ = 0

MLE Point Estimates. For complete data, solving ∂Λ/∂μ = 0 gives:
    μ̂ = (1/n) Σ_{i=1}^{n} kᵢ    or    λ̂ = (1/(t n)) Σ_{i=1}^{n} kᵢ

Note that in this context:
    t = the unit of time for which the rate λ is being measured;
    n = the number of Poisson processes for which the exact number of failures, k, was known;
    kᵢ = the number of failures that occurred within the i-th Poisson process.

When there is only one Poisson process this reduces to:
    μ̂ = k    or    λ̂ = k/t

For censored data, numerical methods are needed to maximize the log-likelihood function.

Fisher Information:
    I(λ) = 1/λ

100γ% Confidence Interval (complete data only). Conservative two-sided interval:
    λ_lower = χ²_{(1−γ)/2}(2Σkᵢ) / (2 t n)        λ_upper = χ²_{(1+γ)/2}(2Σkᵢ + 2) / (2 t n)
When k is large (k > 10), two-sided intervals:
    λ_lower = λ̂ − Φ^(−1)((1+γ)/2) √(λ̂/(t n))        λ_upper = λ̂ + Φ^(−1)((1+γ)/2) √(λ̂/(t n))
(Nelson 1982, p.201). Note: the first confidence intervals are conservative in that the coverage is at least 100γ%. Exact confidence intervals cannot be easily achieved for discrete distributions.

Bayesian Non-informative Priors π(λ), in known time interval t

Type, prior and posterior:
• Uniform proper prior with limits λ ∈ [a, b]: prior = 1/(b − a); posterior is a truncated Gamma distribution, c·Gamma(λ; 1 + k, t) for λ ∈ [a, b], otherwise π(λ) = 0.
• Uniform improper prior, Gamma(1, 0), with limits λ ∈ [0, ∞): prior ∝ 1; posterior Gamma(λ; 1 + k, t).
• Jeffreys prior, Gamma(½, 0), when λ ∈ [0, ∞): prior ∝ 1/√λ; posterior Gamma(λ; ½ + k, t).
• Novick and Hall, Gamma(0, 0), when λ ∈ [0, ∞): prior ∝ 1/λ; posterior Gamma(λ; k, t).

Conjugate Priors
UOI: λ from Exponential in unit of time t | Likelihood model: Pois(k; μ) | Evidence: n_F failures in total time t_T | Dist of UOI: Gamma | Prior parameters: k₀, Λ₀ | Posterior parameters: k = k₀ + n_F;  Λ = Λ₀ + t_T

Description, Limitations and Uses

Example. Three vehicle tires were run on a test area for 1000 km each, with punctures at the following distances:
    Tire 1: no punctures
    Tire 2: 400 km, 900 km
    Tire 3: 200 km


Punctures can be modeled as a renewal process with perfect repair and an inter-arrival time modeled by an exponential distribution. Due to the Poisson distribution being homogeneous in time, the test from multiple tires can be combined and considered a test of one tire with multiple renewals. See example in section 1.1.6.

Total time on test is 3 × 1000 = 3000 km. The total number of failures is 3. Therefore, using MLE, the estimate of λ is:
    λ̂ = k/t_T = 3/3000 = 1E-3
With 90% confidence interval (conservative):
    [ χ²₀.₀₅(6)/(2·3000) = 0.272E-3,   χ²₀.₉₅(8)/(2·3000) = 2.584E-3 ]
A Bayesian point estimate using the Jeffreys non-informative improper prior Gamma(½, 0), with posterior Gamma(λ; 3.5, 3000), has the point estimate:
    λ̂ = E[Gamma(λ; 3.5, 3000)] = 3.5/3000 = 1.167E-3
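The two point estimates in this example can be reproduced in a few lines of plain Python (the interval bounds require chi-square and Gamma quantile functions and are omitted here):

```python
punctures = [0, 2, 1]                    # punctures per tire, from the example
km_per_tire = 1000.0
t_total = km_per_tire * len(punctures)   # 3000 km total time on test
k = sum(punctures)                       # 3 failures in total

lam_mle = k / t_total                    # MLE point estimate: 1.0E-3 per km
# Jeffreys prior Gamma(1/2, 0) -> posterior Gamma(k + 1/2, t_total)
lam_bayes = (k + 0.5) / t_total          # posterior mean: ~1.167E-3 per km
```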

With 90% confidence interval using the inverse Gamma CDF:
    [ F_G^(−1)(0.05) = 0.361E-3,   F_G^(−1)(0.95) = 2.344E-3 ]

Characteristics. The Poisson distribution is also known as the Rare Event distribution.

If the following assumptions are met then the process follows a Poisson distribution:
• The chance of two simultaneous events is negligible or impossible (such as renewal of a single component);
• The expected value of the random number of events in a region is proportional to the size of the region;
• The random numbers of events in non-overlapping regions are independent.

μ characteristics:
• μ is the expected number of events for the unit of time being measured.
• When the unit of time varies, μ can be transformed into a rate and time measure, μ = λt.
• For μ ≲ 10 the distribution is skewed to the right.
• For μ ≳ 10 the distribution approaches a normal distribution with mean μ and σ = √μ.

Convolution property: if Kᵢ ~ Pois(μᵢ) independently, then
    K₁ + K₂ + … + K_n ~ Pois(k; Σμᵢ)
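The convolution property can be verified numerically: convolving the pmfs of Pois(1) and Pois(2) recovers the pmf of Pois(3). A sketch in plain Python:

```python
import math

def pois(k, mu):
    # Poisson pmf: mu^k e^(-mu) / k!
    return mu**k * math.exp(-mu) / math.factorial(k)

def convolve(k, mu1, mu2):
    # P(K1 + K2 = k) = sum_j P(K1 = j) P(K2 = k - j)
    return sum(pois(j, mu1) * pois(k - j, mu2) for j in range(k + 1))

# the convolution matches Pois(k; 3) to floating-point accuracy
errs = [abs(convolve(k, 1.0, 2.0) - pois(k, 3.0)) for k in range(15)]
```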

Applications Homogeneous Poisson Process (HPP). The Poisson distribution gives the distribution of exactly k failures occurring in a HPP. See relation to exponential and gamma distributions.

Renewal Theory. Used in renewal theory as the counting function, and may model non-homogeneous (aging) components by using a time-dependent failure rate, λ(t).

Binomial Approximation. Used to model the Binomial distribution when the number of trials is large and μ = np remains moderate. This can greatly simplify Binomial distribution calculations.

Rare Event. Used to model rare events when the number of trials is large compared to the rate at which events occur.

Resources Online:
http://mathworld.wolfram.com/PoissonDistribution.html
http://en.wikipedia.org/wiki/Poisson_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (interactive web calculator)

Books:
Haight, F.A., 1967. Handbook of the Poisson Distribution, New York: Wiley.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Johnson, N.L., Kemp, A.W. & Kotz, S., 2005. Univariate Discrete Distributions 3rd ed., Wiley-Interscience.

Relationship to Other Distributions

Exponential Distribution, Exp(t; λ): Let K ~ Pois(k; μ = λt) count events in time t, with inter-arrival times T₁, T₂, … Then T₁, T₂, … ~ Exp(t; λ): the time between each arrival is exponentially distributed. Special case:
    Pois(k = 0; λt) = e^(−λt) = R_Exp(t; λ), the probability of no failures in time t.

Gamma Distribution, Gamma(t | k, λ): Let T₁, …, T_k ~ Exp(λ) (i.i.d.) and T = T₁ + T₂ + … + T_k. Then T ~ Gamma(k, λ). The Poisson distribution gives the probability that exactly k failures have been observed in time t; this is the probability that t falls between T_k and T_{k+1}:
    f_Pois(k; λt) = F_Gamma(t; k, λ) − F_Gamma(t; k + 1, λ)
where k is an integer.

Binomial Distribution, Binomial(k | n, p): Limiting case for constant np:
    lim_{n→∞} Binomial(k; n, p) = Pois(k | μ = np)
The Poisson distribution is the limiting case of the Binomial distribution when n is large but the product np remains constant; hence the Poisson distribution models rare events. It can be used as an approximation to the Binomial distribution when n ≥ 20 and p ≤ 0.05, or when n ≥ 100 and np ≤ 10. The Binomial distribution is expressed in terms of a probability of success, p, and a total number of trials, n, whereas the Poisson distribution is expressed in terms of a success rate and does not need the total number of trials. The derivation of the Poisson distribution from the Binomial can be found at http://mathworld.wolfram.com/PoissonDistribution.html.

This interpretation can also be used to understand the conditional distribution of a Poisson random variable. Let K₁, K₂ ~ Pois(μᵢ). Given K₁ + K₂ = n (the observed number of events), then:
    K₁ | n ~ Binomial(k; n, p = μ₁/(μ₁ + μ₂))

Normal Distribution, Norm(k | μ′, σ):
    lim_{μ→∞} F_Pois(k; μ) = F_Norm(k; μ′ = μ, σ² = μ)
This is a good approximation when μ > 1000. When μ > 10 the same approximation can be made with a continuity correction:
    F_Pois(k; μ) ≈ F_Norm(k; μ′ = μ − 0.5, σ² = μ)

Chi-square Distribution, χ²(t | v):
    F_Pois(k | μ) = 1 − F_χ²(x = 2μ, v = 2k + 2)

6. Bivariate and Multivariate Distributions


6.1. Bivariate Normal Continuous Distribution

[Figure: Probability density function f(x, y) contour plots for σ_x > σ_y, σ_x = σ_y and σ_x < σ_y, with ρ > 0, ρ = 0 and ρ < 0; the main-axis angle θ ranges from 0° to 180° accordingly. Adapted from (Kotz et al. 2000, p.256).]

Parameters & Description

μ_x, μ_y: Location parameters, −∞ < μ_j < ∞, j ∈ {x, y}: the mean of each random variable.
σ_x, σ_y: Scale parameters, σ_j > 0, j ∈ {x, y}: the standard deviation of each random variable.
ρ: Correlation between the two random variables, −1 ≤ ρ ≤ 1:
    ρ = cov(X, Y)/(σ_x σ_y) = E[(X − μ_x)(Y − μ_y)]/(σ_x σ_y)

Limits: −∞ < x < ∞ and −∞ < y < ∞

Distribution Formulas

PDF:
    f(x, y) = 1/(2π σ_x σ_y √(1 − ρ²)) exp[ −(z_x² − 2ρ z_x z_y + z_y²)/(2(1 − ρ²)) ]
where z_j = (x_j − μ_j)/σ_j for j ∈ {x, y}. The density may also be factored as
    f(x, y) = f(x) f(y | x) = f(y) f(x | y)

Marginal PDF:
    f(x) = ∫ f(x, y) dy = 1/(σ_x √(2π)) exp(−z_x²/2),   i.e. X ~ Norm(μ_x, σ_x²)
    f(y) = ∫ f(x, y) dx = 1/(σ_y √(2π)) exp(−z_y²/2),   i.e. Y ~ Norm(μ_y, σ_y²)

Conditional PDF:
    f(x | y) = Norm( μ_{x|y} = μ_x + ρ(σ_x/σ_y)(y − μ_y),  σ²_{x|y} = σ_x²(1 − ρ²) )
    f(y | x) = Norm( μ_{y|x} = μ_y + ρ(σ_y/σ_x)(x − μ_x),  σ²_{y|x} = σ_y²(1 − ρ²) )

CDF:
    F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} 1/(2π σ_x σ_y √(1 − ρ²)) exp[ −(z_u² − 2ρ z_u z_v + z_v²)/(2(1 − ρ²)) ] dv du
where z_j = (j − μ_j)/σ_j.

Reliability:
    R(x, y) = ∫_{x}^{∞} ∫_{y}^{∞} 1/(2π σ_x σ_y √(1 − ρ²)) exp[ −(z_u² − 2ρ z_u z_v + z_v²)/(2(1 − ρ²)) ] dv du

Properties and Moments

Median: [μ_x, μ_y]ᵀ
Mode: [μ_x, μ_y]ᵀ
Mean (1st raw moment): E[X, Y]ᵀ = [μ_x, μ_y]ᵀ. The means of the marginal distributions are E[X] = μ_x and E[Y] = μ_y. The means of the conditional distributions give the following lines (also called the regression lines):
    E(X | Y = y) = μ_x + ρ(σ_x/σ_y)(y − μ_y)
    E(Y | X = x) = μ_y + ρ(σ_y/σ_x)(x − μ_x)

Variance (2nd central moment):
    Cov[X, Y] = [ σ_x²  ρσ_xσ_y ; ρσ_xσ_y  σ_y² ]
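The regression line and conditional variance can be wrapped in a small helper; a sketch in plain Python (parameter names are illustrative):

```python
import math

def conditional_x_given_y(mu_x, mu_y, sd_x, sd_y, rho, y):
    # E(X | Y=y) follows the regression line; the conditional standard
    # deviation sd_x * sqrt(1 - rho^2) does not depend on the observed y
    mean = mu_x + rho * (sd_x / sd_y) * (y - mu_y)
    sd = sd_x * math.sqrt(1 - rho**2)
    return mean, sd
```

With ρ = 0 the conditional distribution equals the marginal; with ρ = ±1 the conditional standard deviation collapses to zero, matching the perfect-linear-relationship case.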

Variance of the marginal distributions:
    Var(X) = σ_x²,   Var(Y) = σ_y²
Variance of the conditional distributions:
    Var(X | Y = y) = σ_x²(1 − ρ²),   Var(Y | X = x) = σ_y²(1 − ρ²)

100α% Percentile Function. An ellipse containing 100α% of the distribution is (Kotz et al. 2000, p.254):
    (z_x² − 2ρ z_x z_y + z_y²)/(2(1 − ρ²)) = −ln(1 − α)
where z_j = (x_j − μ_j)/σ_j, j ∈ {x, y}. For the standard bivariate normal:
    (x² − 2ρxy + y²)/(2(1 − ρ²)) = −ln(1 − α)

Parameter Estimation

Maximum Likelihood Function

MLE Point Estimates. When there is only complete failure data the MLE estimates can be given as (Kotz et al. 2000, p.294):
    μ̂_x = (1/n_F) Σ_{i=1}^{n_F} xᵢ        σ̂_x² = (1/n_F) Σ_{i=1}^{n_F} (xᵢ − μ̂_x)²
    μ̂_y = (1/n_F) Σ_{i=1}^{n_F} yᵢ        σ̂_y² = (1/n_F) Σ_{i=1}^{n_F} (yᵢ − μ̂_y)²
    ρ̂ = 1/(n_F σ̂_x σ̂_y) Σ_{i=1}^{n_F} (xᵢ − μ̂_x)(yᵢ − μ̂_y)
If one or more of the variables are known, different estimators are given in (Kotz et al. 2000, pp.294-305).

A correction factor of n_F − 1 (in place of n_F) can be introduced in the denominator of the σ̂² estimators to give the unbiased estimators:
    σ̂_x² = 1/(n_F − 1) Σ_{i=1}^{n_F} (xᵢ − μ̂_x)²        σ̂_y² = 1/(n_F − 1) Σ_{i=1}^{n_F} (yᵢ − μ̂_y)²

Bayesian. Non-informative priors: a complete coverage of numerous reference prior distributions with different parameter orderings is contained in (Berger & Sun 2008). For a summary of the general Bayesian priors and conjugates see the multivariate normal distribution.

Description, Limitations and Uses

Example. The accuracy of a cutting machine used in manufacturing is to be measured. Five cuts at the required length are made. The lengths and room temperatures were measured as:
    Length: 7.436, 10.270, 10.466, 11.039, 11.854 mm
    Temperature: 19.51, 21.23, 21.41, 22.78, 26.78 °C

MLE estimates are:
    μ̂_x = (Σ xᵢ)/n = 10.213
    μ̂_T = (Σ tᵢ)/n = 22.342
    σ̂_x² = Σ(xᵢ − μ̂_x)²/(n − 1) = 2.7885
    σ̂_T² = Σ(tᵢ − μ̂_T)²/(n − 1) = 7.5033
    ρ̂ = 1/(n σ̂_x σ̂_T) Σ (xᵢ − μ̂_x)(tᵢ − μ̂_T) = 0.1454

If the temperature is known to be 24 °C, what is the likely cutting distance distribution?
    f(x | t = 24) = Norm( μ_{x|t} = μ_x + ρ(σ_x/σ_t)(t − μ_T),  σ²_{x|t} = σ_x²(1 − ρ²) )
    f(x | t = 24) = Norm(10.303, 2.730)

Characteristics. Also known as the Binormal distribution.

Let U, V and W be three independent normally distributed random variables, and let:
    X = U + V
    Y = V + W
Then (X, Y) has a bivariate normal distribution. (Balakrishnan & Lai 2009, p.483)

Independence. If X and Y are jointly normal random variables, then they are independent when ρ = 0. This gives a contour plot of f(x, y) with concentric ellipses around the mean (circles, for the standard bivariate normal): a given value on the x axis does not assist in estimating the value on the y axis, and therefore X and Y are independent. When X and Y are independent the pdf reduces to:
    f(x, y) = 1/(2π σ_x σ_y) exp[ −(z_x² + z_y²)/2 ]

Correlation Coefficient ρ (Yang et al. 2004, p.49):
• ρ > 0: when X increases, Y also tends to increase. When ρ = 1, X and Y have a perfect positive linear relationship such that Y = c + mX where m is positive.
• ρ < 0: when X increases, Y tends to decrease. When ρ = −1, X and Y have a perfect negative linear relationship such that Y = c + mX where m is negative.
• ρ = 0: increases or decreases in X have no effect on Y; X and Y are independent.

Ellipse Axis (Kotz et al. 2000, p.254). The slope of the main axis from the x-axis is given by:
    θ = ½ tan^(−1)[ 2ρσ_xσ_y / (σ_x² − σ_y²) ]
If σ_x = σ_y, then for positive ρ the main axis of the ellipse is 45° from the x-axis; for negative ρ the main axis is −45° from the x-axis.

Circular Normal Density Function (Kotz et al. 2000, p.255). When σ_x = σ_y and ρ = 0 the bivariate distribution is known as a circular normal density function.

Elliptical Normal Distribution (Kotz et al. 2000, p.255). If ρ = 0 and σ_x ≠ σ_y the distribution may be known as an elliptical normal distribution.

Standard Bivariate Normal Distribution. Occurs when μ = 0 and σ = 1. For positive ρ the main axis of the ellipse is 45° from the x-axis; for negative ρ the main axis is −45° from the x-axis.
    f(x, y) = 1/(2π√(1 − ρ²)) exp[ −(x² − 2ρxy + y²)/(2(1 − ρ²)) ]

Mean / Median / Mode: as with the univariate distribution, the mean, median and mode are equal.

Matrix Form. The bivariate distribution may be written in matrix form as:
    X = [X₁, X₂]ᵀ,   μ = [μ₁, μ₂]ᵀ,   Σ = [ σ₁²  ρσ₁σ₂ ; ρσ₁σ₂  σ₂² ]
when X ∼ Norm₂(μ, Σ):
    f(x) = 1/(2π√|Σ|) exp[ −½ (x − μ)ᵀ Σ^(−1) (x − μ) ]
where |Σ| is the determinant of Σ. This is the form used in the multivariate normal distribution.

The following properties are given in matrix form:

Convolution Property:
    Let X ∼ Norm(μ_x, Σ_x) and Y ∼ Norm(μ_y, Σ_y), with X ⊥ Y (independent).
    Then X + Y ~ Norm(μ_x + μ_y, Σ_x + Σ_y)
Note that if X and Y are dependent then X + Y may not even be normally distributed. (Novosyolov 2006)

Scaling Property:
    Let Y = AX + b, where Y is a p × 1 matrix, b is a p × 1 matrix and A is a p × 2 matrix.
    Then Y ~ Norm(Aμ + b, AΣAᵀ)

Marginalize Property:
    Let [X₁, X₂]ᵀ ~ Norm( [μ₁, μ₂]ᵀ, [σ₁² ρσ₁σ₂; ρσ₁σ₂ σ₂²] )
    Then X₁ ∼ Norm(μ₁, σ₁²)

Conditional Property:
    Let [X₁, X₂]ᵀ ~ Norm( [μ₁, μ₂]ᵀ, [σ₁² ρσ₁σ₂; ρσ₁σ₂ σ₂²] )
    Then f(x₁ | x₂) = Norm(μ_{1|2}, σ_{1|2}²), where
        μ_{1|2} = μ₁ + ρ(σ₁/σ₂)(x₂ − μ₂)
        σ_{1|2} = σ₁√(1 − ρ²)
It should be noted that the standard deviation of the conditional distribution does not depend on the given value.

Applications. The bivariate normal distribution is used in many more applications which are common to the multivariate normal distribution; please refer to the multivariate normal distribution for a more complete coverage.

Graphical Representation of Multivariate Normal. As with all bivariate distributions, having only two dependent variables allows the distribution to be easily graphed (in a three-dimensional plot) and visualized. As such the bivariate normal is popular in introducing higher-dimensional cases.

Resources Online:
http://mathworld.wolfram.com/BivariateNormalDistribution.html
http://en.wikipedia.org/wiki/Multivariate_normal_distribution
http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_binormal_distri.htm (interactive visual representation)

Books:
Balakrishnan, N. & Lai, C., 2009. Continuous Bivariate Distributions, 2nd ed., Springer.
Yang, K. et al., 2004. Multivariate Statistical Methods in Quality Management, 1st ed., McGraw-Hill Professional.
Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution, 2nd ed., CRC.
Tong, Y.L., 1990. The Multivariate Normal Distribution, Springer.

6.2. Dirichlet Continuous Distribution

[Figure: Probability density function Dir₂([x₁, x₂]ᵀ; α) for α = [2, 2, 2]ᵀ, [10, 10, 10]ᵀ, [½, ½, ½]ᵀ, [1, 1, 1]ᵀ, [½, 1, 2]ᵀ and [2, 1, 2]ᵀ.]

Parameters & Description

α = [α₁, α₂, …, α_d, α₀]ᵀ: Shape matrix, αᵢ > 0. Note that the vector α is d + 1 in length.
d ≥ 1 (integer): Dimensions, the number of random variables being modeled.

Limits:
    0 ≤ xᵢ ≤ 1,   Σ_{i=1}^{d} xᵢ ≤ 1

Distribution Formulas

PDF:
    f(x) = 1/B(α) · ( 1 − Σ_{i=1}^{d} xᵢ )^(α₀−1) Π_{i=1}^{d} xᵢ^(αᵢ−1)
where B(α) is the multinomial beta function:
    B(α) = Π_{i=0}^{d} Γ(αᵢ) / Γ( Σ_{i=0}^{d} αᵢ )
The special case of the Dirichlet distribution when d = 1 is the beta distribution.

Marginal PDF:
    Let X = [U, V]ᵀ ~ Dir_d(α), where U = [X₁, …, X_s]ᵀ and V = [X_{s+1}, …, X_d]ᵀ, and let α_Σ = Σ_{j=0}^{d} α_j (the sum of the matrix elements). Then
    U ∼ Dir_s(α_u),   where α_u = [α₁, α₂, …, α_s, α_Σ − Σ_{j=1}^{s} α_j]ᵀ
    f(u) = Γ(α_Σ) / [ Γ(α_Σ − Σ_{j=1}^{s} α_j) Π_{i=1}^{s} Γ(αᵢ) ] · ( 1 − Σ_{i=1}^{s} xᵢ )^(α_Σ−1−Σ_{j=1}^{s} α_j) Π_{i=1}^{s} xᵢ^(αᵢ−1)

When marginalized to one variable:
    Xᵢ ~ Beta(αᵢ, α_Σ − αᵢ)
    f(xᵢ) = Γ(α_Σ) / [ Γ(α_Σ − αᵢ) Γ(αᵢ) ] · (1 − xᵢ)^(α_Σ−αᵢ−1) xᵢ^(αᵢ−1)

Conditional PDF (Kotz et al. 2000, p.488):
    U | V = v ∼ Dir_{d−s}( α_{u|v} ),   where α_{u|v} = [α_{s+1}, α_{s+2}, …, α_d, α₀]ᵀ

CDF:
    F(x) = P(X₁ ≤ x₁, X₂ ≤ x₂, …, X_d ≤ x_d)
         = ∫₀^{x₁} ∫₀^{x₂} … ∫₀^{x_d} 1/B(α) ( 1 − Σ xᵢ )^(α₀−1) Π xᵢ^(αᵢ−1) dx_d … dx₂ dx₁
Numerical methods have been explored to evaluate this integral; see (Kotz et al. 2000, pp.497-500).

Reliability:
    R(x) = P(X₁ > x₁, X₂ > x₂, …, X_d > x_d)
         = ∫_{x₁} ∫_{x₂} … ∫_{x_d} 1/B(α) ( 1 − Σ xᵢ )^(α₀−1) Π xᵢ^(αᵢ−1) dx_d … dx₂ dx₁
with the upper integration limits bounded by the simplex Σ xᵢ ≤ 1.

Properties and Moments

Median: solve numerically using F(x) = 0.5

Mode: xᵢ = (αᵢ − 1)/(α_Σ − (d + 1)) for αᵢ > 1, otherwise no mode.

Mean (1st raw moment): let α_Σ = Σ_{i=0}^{d} αᵢ:
    E[X] = μ = α/α_Σ
Mean of the marginal distribution:
    E[U] = μ_u = α_u/α_Σ,   E[Xᵢ] = μᵢ = αᵢ/α_Σ
where α_u = [α₁, α₂, …, α_s, α_Σ − Σ_{j=1}^{s} α_j]ᵀ.
Mean of the conditional distribution:
    E[U | V = v] = μ_{u|v} = α_{u|v}/α_Σ

where α_{u|v} = [α_{s+1}, α_{s+2}, …, α_d, α₀]ᵀ.

Variance (2nd central moment): let α_Σ = Σ_{i=0}^{d} αᵢ:
    Var[Xᵢ] = αᵢ(α_Σ − αᵢ) / [ α_Σ² (α_Σ + 1) ]
    Cov[Xᵢ, Xⱼ] = −αᵢαⱼ / [ α_Σ² (α_Σ + 1) ]

Parameter Estimation

Maximum Likelihood Function

MLE Point Estimates. The MLE estimates α̂ᵢ can be obtained from n observations of xᵢ by numerically maximizing the log-likelihood function (Kotz et al. 2000, p.505):
    Λ(α | E) = n [ ln Γ(α_Σ) − Σ_{j=0}^{d} ln Γ(α_j) ] + n Σ_{j=0}^{d} (α_j − 1) [ (1/n) Σ_{i=1}^{n} ln(x_{ij}) ]
The method of moments is used to provide initial guesses of αᵢ for the numerical methods.

Fisher Information Matrix:
    I_{ij} = −n ψ′(α_Σ),   i ≠ j
    I_{ii} = n ψ′(αᵢ) − n ψ′(α_Σ)
where ψ′(x) = d²/dx² ln Γ(x) is the trigamma function; see section 1.6.8. (Kotz et al. 2000, p.506)

100γ% Confidence Intervals: the confidence intervals can be obtained from the Fisher information matrix.

Bayesian

Non-informative Priors

Jeffreys prior: √det(I(α)), where I(α) is the Fisher information matrix given above.

Conjugate Priors
UOI: p from Multi_d(k; n_t, p) | Likelihood model: Multinomial | Evidence: kᵢ failures in n trials with d + 1 possible states | Dist of UOI: Dirichlet_{d+1} | Prior parameters: α_o | Posterior parameters: α = α_o + k

Description, Limitations and Uses

Example. Five machines are measured for performance on demand. Each machine can fail (F), partially fail (P) or succeed (S) in its application. The machines are tested for 10 demands with the following data:

Machine | Results over 10 demands
1 | F = 3, P = 2, S = 5
2 | F = 2, P = 2, S = 6
3 | F = 2, P = 3, S = 5
4 | F = 3, P = 3, S = 4
5 | F = 2, P = 3, S = 5

Estimate the multinomial distribution parameter p = [p_F, p_P, p_S]. Using a non-informative improper prior Dir₃(0, 0, 0), after updating with the pooled counts (12 failures, 13 partial failures and 25 successes in 50 demands):
    α = [12, 13, 25]ᵀ
    E[x] = [12/50, 13/50, 25/50]ᵀ = [0.24, 0.26, 0.50]ᵀ
    Var[x] = [3.58E-3, 3.77E-3, 4.90E-3]ᵀ

Confidence intervals for the parameters p = [p_F, p_P, p_S] can also be calculated using the CDF of the marginal distribution f(xᵢ).

Characteristics

Beta Generalization. The Dirichlet distribution is a generalization of the beta distribution; the beta distribution is obtained when d = 1.
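The posterior summaries in the example follow directly from the mean and variance formulas; a sketch using the pooled counts:

```python
alpha = [12, 13, 25]                 # posterior shape: F, P, S counts
a_sum = sum(alpha)                   # 50 demands in total

post_mean = [a / a_sum for a in alpha]
# Var[x_i] = a_i (a_sum - a_i) / (a_sum^2 (a_sum + 1))
post_var = [a * (a_sum - a) / (a_sum**2 * (a_sum + 1)) for a in alpha]
```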

Interpretation. The higher the sharper and𝑑𝑑 more certain the distribution is. This follows from its use in Bayesian statistics to model 𝑖𝑖 the𝜶𝜶 parameter𝛼𝛼 . As more evidence is used, the values get higher which reduces uncertainty. The values of can also be interpreted as a count for 𝑝𝑝each state of the multinomial 𝑖𝑖 𝑖𝑖 𝛼𝛼distribution. 𝛼𝛼

Alternative Formulation. The most common formulation of the Dirichlet Dirichlet distribution is as follows: = [ , , … , ] where > 0

= [ , , … , ] 𝑇𝑇 where 0 x 1, = 1 1 2 𝑚𝑚 i 𝛂𝛂 𝛼𝛼 𝛼𝛼 𝛼𝛼 𝑇𝑇 1α 𝑚𝑚 1 2 𝑚𝑚 ( ) = i x 𝑖𝑖= 1 𝑖𝑖 𝐱𝐱 𝑥𝑥 𝑥𝑥 𝑥𝑥 B( ≤) ≤m ∑ 𝑥𝑥 αi−1 i 𝑓𝑓 𝐱𝐱 �i=1 This formulation is popular because𝛂𝛂 it is a more simple presentation where the matrix of and are the same size. However it should be noted that last term of the vector is dependent on { … } through the relationship =𝜶𝜶1 𝒙𝒙 . 1 𝑚𝑚−1 𝑚𝑚−1 𝒙𝒙 𝑥𝑥 𝑥𝑥 𝑚𝑚 𝑖𝑖=1 𝑖𝑖 Neutrality. (Kotz𝑥𝑥 et al.− 2000,∑ 𝑥𝑥p.500) If and are non negative

𝑋𝑋1 𝑋𝑋2 186 Bivariate and Multivariate Distributions

random variables such that + 1 then is called neutral if the following are independent: 𝑋𝑋1 𝑋𝑋2 ≤ 𝑋𝑋𝑖𝑖 ( ) 1 𝑋𝑋𝑗𝑗 If ( ) then is a neutral𝑖𝑖 vector with each X being neutral under 𝑋𝑋 ⊥ 𝑖𝑖 𝑖𝑖 ≠ 𝑗𝑗 all permutations of the above definition.− 𝑋𝑋 This property is unique to the 𝑑𝑑 i Dirichlet𝑿𝑿 ∼ 𝐷𝐷𝐷𝐷 𝑟𝑟distribution.𝜶𝜶 𝑿𝑿

Applications. Bayesian Statistics: the Dirichlet distribution is often used as a conjugate prior to the multinomial likelihood function.

Resources Online:
http://en.wikipedia.org/wiki/Dirichlet_distribution
http://www.cis.hut.fi/ahonkela/dippa/node95.html

Books:
Kotz, S., Balakrishnan, N. & Johnson, N.L., 2000. Continuous Multivariate Distributions, Volume 1, Models and Applications, 2nd ed., Wiley-Interscience.
Congdon, P., 2007. Bayesian Statistical Modelling, 2nd ed., Wiley.
MacKay, D.J. & Peto, L.C., 1995. A hierarchical Dirichlet language model. Natural Language Engineering.

Relationship to Other Distributions

Beta Distribution, Beta(x; α, β): Special case:
    Dir_{d=1}(x; [α₁, α₀]) = Beta(x; α = α₁, β = α₀)

Gamma Distribution, Gamma(x; λ, k): Let
    Yᵢ ~ Gamma(λ, kᵢ) i.i.d.,   V = Σ_{i=1}^{d} Yᵢ
Then:
    V ~ Gamma(λ, Σkᵢ)
Let:
    Z = [Y₁/V, Y₂/V, …, Y_d/V]ᵀ
Then:
    Z ~ Dir_d(k₁, …, k_d)
(i.i.d.: independent and identically distributed)


6.3. Multivariate Normal Continuous Distribution

*Note: for a graphical representation see the bivariate normal distribution.

Parameters & Description

μ = [μ₁, μ₂, …, μ_d]ᵀ: Location vector, −∞ < μᵢ < ∞. A d-dimensional vector giving the mean of each random variable.
Σ: Covariance matrix, a d × d symmetric positive definite matrix with σᵢᵢ > 0, which quantifies the random variable variances and dependence. This matrix determines the shape of the distribution:
    Σ = [ σ₁₁ … σ₁d ; ⋮ ⋱ ⋮ ; σ_d1 … σ_dd ]
d ≥ 2 (integer): Dimensions, the number of dependent variables.

Limits: −∞ < xᵢ < ∞

Distribution Formulas

PDF:
    f(x) = 1/( (2π)^(d/2) √|Σ| ) exp[ −½ (x − μ)ᵀ Σ^(−1) (x − μ) ]
where |Σ| is the determinant of Σ.

Marginal PDF:
    Let X = [U, V]ᵀ ~ Norm_d( [μ_u, μ_v]ᵀ, [Σ_uu Σ_uv; Σ_uvᵀ Σ_vv] )
    where X = [X₁, …, X_p, X_{p+1}, …, X_d]ᵀ, U = [X₁, …, X_p]ᵀ and V = [X_{p+1}, …, X_d]ᵀ.
    Then U ∼ Norm_p(μ_u, Σ_uu):
    f(u) = ∫ f(x) dv = 1/( (2π)^(p/2) √|Σ_uu| ) exp[ −½ (u − μ_u)ᵀ Σ_uu^(−1) (u − μ_u) ]

Conditional PDF:
    U | V = v ∼ Norm_p( μ_{u|v}, Σ_{u|v} )

where
    μ_{u|v} = μ_u + Σ_uvᵀ Σ_vv^(−1) (v − μ_v)
    Σ_{u|v} = Σ_uu − Σ_uvᵀ Σ_vv^(−1) Σ_uv

CDF:
    F(x) = ∫_{−∞}^{x} 1/( (2π)^(d/2) √|Σ| ) exp[ −½ (x − μ)ᵀ Σ^(−1) (x − μ) ] dx

Reliability:
    R(x) = ∫_{x}^{∞} 1/( (2π)^(d/2) √|Σ| ) exp[ −½ (x − μ)ᵀ Σ^(−1) (x − μ) ] dx

Properties and Moments

Median: μ
Mode: μ
Mean (1st raw moment): E[X] = μ. Means of the marginal distributions: E[U] = μ_u, E[V] = μ_v. Mean of the conditional distribution: μ_{u|v} = μ_u + Σ_uvᵀ Σ_vv^(−1) (v − μ_v).

Variance (2nd central moment): Cov[X] = Σ. Covariance of the marginal distribution: Cov(U) = Σ_uu. Covariance of the conditional distribution: Cov(U | V) = Σ_uu − Σ_uvᵀ Σ_vv^(−1) Σ_uv.

Parameter Estimation

Maximum Likelihood Function

MLE Point Estimates. When given complete data of n_F samples,
    x_t = [x_{1,t}, x_{2,t}, …, x_{d,t}]ᵀ,   t ∈ (1, 2, …, n_F),
the following MLE estimates are given (Kotz et al. 2000, p.161):
    μ̂ = (1/n_F) Σ_{t=1}^{n_F} x_t
    Σ̂ᵢⱼ = (1/n_F) Σ_{t=1}^{n_F} (x_{i,t} − μ̂ᵢ)(x_{j,t} − μ̂ⱼ)
A review of different estimators is given in (Kotz et al. 2000). When estimates are from a low number of samples (n_F < 30), a correction factor of n_F − 1 (in place of n_F) can be introduced to give the unbiased estimator (Tong 1990, p.53):
    Σ̂ᵢⱼ = 1/(n_F − 1) Σ_{t=1}^{n_F} (x_{i,t} − μ̂ᵢ)(x_{j,t} − μ̂ⱼ)

Fisher Information Matrix:
    I_{i,j} = (∂μ/∂θᵢ)ᵀ Σ^(−1) (∂μ/∂θⱼ)

Bayesian Non-informative Priors when Σ is known, π₀(μ) (Yang and Berger 1998, p.22):
• Uniform, improper, Jeffreys and reference prior, when μ ∈ (−∞, ∞): prior ∝ 1; posterior π(μ | E) ~ Norm_d( μ; (1/n_F) Σ_{t=1}^{n_F} x_t, Σ/n_F ).
• Shrinkage: prior ∝ (μᵀ Σ^(−1) μ)^(−(d−2)/2); no closed form posterior.

Non-informative Priors when μ is known, π₀(Σ) (Yang & Berger 1994):
• Uniform improper prior with limits Σ ∈ (0, ∞): prior ∝ 1; posterior Σ^(−1) | E ~ Wishart_d(n_F − d − 1, S^(−1)).
• Jeffreys prior with limits Σ ∈ (0, ∞): prior ∝ 1/|Σ|^((d+1)/2); posterior Σ^(−1) | E ~ Wishart_d(n_F, S^(−1)).

• Reference prior, ordered {λ₁, λ₂, …, λ_d}: prior ∝ 1/( |Σ| Π_{i<j}(λᵢ − λⱼ) ); posterior proper, no closed form.
• Reference prior, ordered {λ₁, λ_d, λ₂, …, λ_{d−1}}: prior ∝ (log λ₁ − log λ_d)^(d−2) / ( |Σ| Π_{i<j}(λᵢ − λⱼ) ); posterior proper, no closed form.
• MDIP: prior ∝ 1/|Σ|; no closed form posterior.

Non-informative Priors when both μ and Σ are unknown, π₀(μ, Σ), for the bivariate normal. A complete coverage of numerous reference prior distributions with different parameter orderings is contained in (Berger & Sun 2008):
• Uniform improper prior: prior ∝ 1; no closed form posterior.
• Jeffreys prior: prior ∝ 1/|Σ|^((d+1)/2); no closed form posterior.
• Reference prior, ordered {λ₁, λ₂, …, λ_d}: prior ∝ 1/( |Σ| Π_{i<j}(λᵢ − λⱼ) ); no closed form posterior.
• Reference prior, ordered {λ₁, λ_d, λ₂, …, λ_{d−1}}: prior ∝ (log λ₁ − log λ_d)^(d−2) / ( |Σ| Π_{i<j}(λᵢ − λⱼ) ); no closed form posterior.
• MDIP: prior ∝ 1/|Σ|; no closed form posterior.

where λᵢ is the i-th eigenvalue of Σ, R̄ and R are the population and sample multiple correlation coefficients, and:
    S_ij = Σ_{t=1}^{n_F} (x_{i,t} − μ̂ᵢ)(x_{j,t} − μ̂ⱼ)   and   x̄ = (1/n_F) Σ_{t=1}^{n_F} x_t

Conjugate Priors

UOI: μ
Likelihood Model: Multi-variate events from Norm_d(μ, Σ) with Σ known
Evidence: n_F points, x
Dist of UOI: Multi-variate Normal
Prior Parameters: U_0, V_0
Posterior Parameters:
    U = (V_0^{-1} + n_F Σ^{-1})^{-1} (V_0^{-1} U_0 + n_F Σ^{-1} x̄)
    V = (V_0^{-1} + n_F Σ^{-1})^{-1}

Description, Limitations and Uses

Example: See bivariate normal distribution.

Characteristics:

Standard Spherical Normal Distribution. When μ = 0, Σ = I we obtain the standard spherical normal distribution:

    f(x) = 1/(2π)^{d/2} exp(-½ x^T x)

Covariance Matrix. (Yang et al. 2004, p.49)
- Diagonal Elements. The diagonal elements of Σ are the variances of each random variable: σ_ii = Var(X_i).
- Non-Diagonal Elements. The non-diagonal elements give the covariances: σ_ij = Cov(X_i, X_j) = σ_ji. Hence the matrix is symmetric.
- Independent Variables. If Cov(X_i, X_j) = σ_ij = 0 then X_i and X_j are independent.
- σ_ij > 0. When X_i increases then X_j tends to increase.
- σ_ij < 0. When X_i increases then X_j tends to decrease.

Ellipsoid Axes. The density ellipsoids have axes pointing in the directions of the eigenvectors of Σ. The magnitudes of these axes are given by the corresponding eigenvalues.

Mean / Median / Mode: As with the univariate distribution, the mean, median and mode are equal.
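The conjugate update in the table above can be written directly in code. This is a minimal sketch (NumPy assumed; the function name `posterior_mu` is illustrative, not from the text):

```python
import numpy as np

def posterior_mu(U0, V0, Sigma, x):
    """Conjugate update for the mean of Norm_d(mu, Sigma) with Sigma known.

    U0, V0 : prior mean vector and covariance matrix of mu
    Sigma  : known covariance of the observations
    x      : (n_F, d) array of observations
    Returns the posterior mean U and covariance V from the table above.
    """
    n_F = x.shape[0]
    xbar = x.mean(axis=0)
    V0_inv = np.linalg.inv(V0)
    S_inv = np.linalg.inv(Sigma)
    V = np.linalg.inv(V0_inv + n_F * S_inv)        # posterior covariance
    U = V @ (V0_inv @ U0 + n_F * S_inv @ xbar)     # posterior mean
    return U, V

# With a vague prior (large V0) the posterior mean approaches the sample mean.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 2)) + np.array([1.0, -2.0])
U, V = posterior_mu(np.zeros(2), 1e6 * np.eye(2), np.eye(2), x)
assert np.allclose(U, x.mean(axis=0), atol=1e-3)
```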

Convolution Property

Let
    X ~ Norm_d(μ_x, Σ_x), Y ~ Norm_d(μ_y, Σ_y), X ⊥ Y (independent)
Then
    X + Y ~ Norm_d(μ_x + μ_y, Σ_x + Σ_y)

Note if X and Y are dependent then X + Y may not be normally distributed. (Novosyolov 2006)

Scaling Property

Let
    Y = AX + b,  where Y is a p x 1 matrix, b is a p x 1 matrix, A is a p x d matrix
Then
    Y ~ Norm_p(Aμ + b, AΣA^T)

Marginalize Property

Let
    X = [U; V] ~ Norm_d([μ_u; μ_v], [[Σ_uu, Σ_uv], [Σ_uv^T, Σ_vv]])
Then
    U ~ Norm_p(μ_u, Σ_uu),  where U is a p x 1 matrix

Conditional Property

Let
    X = [U; V] ~ Norm_d([μ_u; μ_v], [[Σ_uu, Σ_uv], [Σ_uv^T, Σ_vv]])
Then
    U | V = v ~ Norm_p(μ_{u|v}, Σ_{u|v}),  where U is a p x 1 matrix
Where
    μ_{u|v} = μ_u + Σ_uv Σ_vv^{-1} (v - μ_v)
    Σ_{u|v} = Σ_uu - Σ_uv Σ_vv^{-1} Σ_uv^T

It should be noted that the covariance of the conditional distribution does not depend on the given values in V.
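The conditional property is a direct matrix computation. A minimal sketch (NumPy assumed; `conditional_mvn` is an illustrative name):

```python
import numpy as np

def conditional_mvn(mu, Sigma, p, v):
    """Condition X = [U; V] ~ Norm_d(mu, Sigma) on V = v.

    p : dimension of U (the first p components of X)
    Returns mu_{u|v} and Sigma_{u|v} per the formulas above.
    """
    mu_u, mu_v = mu[:p], mu[p:]
    S_uu = Sigma[:p, :p]
    S_uv = Sigma[:p, p:]
    S_vv = Sigma[p:, p:]
    S_vv_inv = np.linalg.inv(S_vv)
    mu_cond = mu_u + S_uv @ S_vv_inv @ (v - mu_v)
    S_cond = S_uu - S_uv @ S_vv_inv @ S_uv.T
    return mu_cond, S_cond

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.8],
                  [0.8, 2.0]])
m, S = conditional_mvn(mu, Sigma, 1, np.array([2.0]))
# mu_{u|v} = 0 + (0.8/2.0)(2 - 1) = 0.4 ; Sigma_{u|v} = 1 - 0.8^2/2.0 = 0.68
assert np.isclose(m[0], 0.4) and np.isclose(S[0, 0], 0.68)
```

Note that, as stated above, S_cond does not depend on the observed value v.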

Applications

Convenient Properties. (Balakrishnan & Lai 2009, p.477) The popularity of the multivariate normal distribution over other multivariate distributions is due to the convenience of its conditional and marginal distribution properties, which both produce normal distributions.

Kalman Filter. The Kalman filter estimates the current state of a system in the presence of noisy measurements. This process uses multivariate normal distributions to model the noise.
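As an illustration of the Kalman filter idea (a minimal scalar sketch, not from the text; all parameter values are hypothetical), both the process and measurement noise are modeled as normal:

```python
def kalman_1d(measurements, q=1e-3, r=0.5):
    """Minimal scalar Kalman filter: random-walk state, noisy measurements.

    q : process-noise variance, r : measurement-noise variance (assumed known).
    Returns the filtered state estimates after each measurement.
    """
    x, p = 0.0, 1.0          # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + q            # predict: variance grows by process noise
        k = p / (p + r)      # Kalman gain
        x = x + k * (z - x)  # update with the measurement residual
        p = (1 - k) * p
        estimates.append(x)
    return estimates

# Noisy readings of a state near 1.0; the filtered estimate converges toward it.
est = kalman_1d([1.1, 0.9, 1.05, 0.98, 1.02])
assert abs(est[-1] - 1.0) < 0.1
```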

Multivariate Analysis of Variance (MANOVA). A test used to analyze the variance and dependence of variables. A popular model used to conduct MANOVA assumes the data come from a multivariate normal population.

Gaussian Process Regression. This is a model for observations or events that occur in a continuous domain of time or space, where every point is associated with a normally distributed random variable and every finite collection of these random variables has a multivariate normal distribution.
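Prediction in a Gaussian process is just the conditional-normal formula applied to the joint distribution of training and test points. A minimal sketch (NumPy assumed; the RBF kernel and function names are illustrative choices, not from the text):

```python
import numpy as np

def gp_predict(x_train, y_train, x_test, length=1.0, noise=1e-6):
    """Gaussian process regression with an RBF kernel and zero prior mean.

    Every finite set of points is jointly multivariate normal, so the
    predictive mean/covariance follow from the conditional property.
    """
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

    K = k(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = k(x_test, x_train)
    mean = K_s @ np.linalg.solve(K, y_train)
    cov = k(x_test, x_test) - K_s @ np.linalg.solve(K, K_s.T)
    return mean, cov

x = np.array([0.0, 1.0, 2.0])
y = np.sin(x)
mean, cov = gp_predict(x, y, np.array([1.0]))
# At a training point the (nearly) noise-free GP interpolates the observation.
assert np.isclose(mean[0], np.sin(1.0), atol=1e-3)
```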

Multi-Linear Regression. Multi-linear regression attempts to model the relationship between parameters and variables by fitting a linear equation. One approach, maximum likelihood estimation (MLE), fits a distribution to the observed errors, where a multivariate normal distribution is often assumed.
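Under normally distributed errors the MLE of the regression coefficients coincides with the least-squares solution, which can be sketched as follows (NumPy assumed; the data are synthetic):

```python
import numpy as np

# Ordinary least squares: y = X beta + eps, eps ~ Norm(0, sigma^2 I).
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(100), rng.uniform(0, 10, 100)])  # intercept + slope
beta_true = np.array([2.0, 0.5])
y = X @ beta_true + rng.normal(0, 0.1, 100)

# With normal errors, the MLE of beta equals the least-squares solution.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_hat, beta_true, atol=0.1)
```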

Gaussian Bayesian Belief Networks (BBN). BBNs graphically represent the dependence between variables in a probability distribution. When using continuous random variables, BBNs quickly become tremendously complicated. However, due to the multivariate normal distribution's conditional and marginal properties this task is simplified, making Gaussian BBNs popular.

Resources

Online:
http://mathworld.wolfram.com/BivariateNormalDistribution.html
http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_binormal_distri.htm (interactive visual representation)

Books:

Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution, 2nd Edition, CRC.

Tong, Y.L., 1990. The Multivariate Normal Distribution, Springer.

Yang, K. et al., 2004. Multivariate Statistical Methods in Quality Management 1st ed., McGraw-Hill Professional.

Bertsekas, D.P. & Tsitsiklis, J.N., 2008. Introduction to Probability, 2nd Edition, Athena Scientific.

6.4. Multinomial Discrete Distribution

Probability Density Function - f(k)

Trinomial distribution, f([k_1, k_2, k_3]^T) where n = 8, p = [1/3, 1/4, 5/12]^T. Note k_3 is not shown because it is determined using k_3 = n - k_1 - k_2.

Trinomial distribution, f([k_1, k_2, k_3]^T) where n = 20, p = [1/3, 1/2, 1/6]^T. Note k_3 is not shown because it is determined as k_3 = n - k_1 - k_2.

Parameters & Description

n:  n > 0 (integer). Number of Trials. This is sometimes called the index. (Johnson et al. 1997, p.31)

p = [p_1, p_2, ..., p_d]^T:  0 ≤ p_i ≤ 1, Σ_{i=1}^{d} p_i = 1. Event Probability Matrix. The probability of event i occurring; the p_i are often called cell probabilities. (Johnson et al. 1997, p.31)

d:  d ≥ 2 (integer). Dimensions. The number of mutually exclusive states of the system.

Limits

    k_i ∈ {0, ..., n},    Σ_{i=1}^{d} k_i = n

Distribution Formulas

PDF

    f(k) = (n choose k_1, k_2, ..., k_d) ∏_{i=1}^{d} p_i^{k_i}

where

    (n choose k_1, k_2, ..., k_d) = n! / (k_1! k_2! ... k_d!) = n! / ∏_{i=1}^{d} k_i! = Γ(n+1) / ∏_{i=1}^{d} Γ(k_i + 1)

Note that in p there are only d-1 'free' variables as the last p_d = 1 - Σ_{i=1}^{d-1} p_i, giving the distribution:

    f(k) = (n choose k_1, k_2, ..., k_d) [∏_{i=1}^{d-1} p_i^{k_i}] (1 - Σ_{i=1}^{d-1} p_i)^{n - Σ_{i=1}^{d-1} k_i}

Now the special case of the binomial distribution when d = 2 can be seen.

Marginal PDF

Let
    K = [U; V] ~ Mnom_d(n, [p_u; p_v])
where
    K = [K_1, ..., K_d]^T,  U = [K_1, ..., K_s]^T,  V = [K_{s+1}, ..., K_d]^T

Then
    U ~ Mnom_s(n, p_u)  where p_u = [p_1, p_2, ..., p_{s-1}, (1 - Σ_{i=1}^{s-1} p_i)]^T

    f(u) = (n choose k_1, k_2, ..., k_s) ∏_{i=1}^{s} p_i^{k_i}

When there are only two states, p = [p, (1-p)]^T:

    f(k_i) = (n choose k_i) p^{k_i} (1 - p)^{n - k_i}

Conditional PDF

    U | V = v ~ Mnom_s(n_{u|v}, p_{u|v})
where
    n_{u|v} = n - n_v = n - Σ_{i=s+1}^{d} k_i
    p_{u|v} = [p_1, p_2, ..., p_s]^T / Σ_{i=1}^{s} p_i

CDF

    F(k) = P(K_1 ≤ k_1, K_2 ≤ k_2, ..., K_d ≤ k_d)
         = Σ_{j_1=0}^{k_1} Σ_{j_2=0}^{k_2} ... Σ_{j_d=0}^{k_d} (n choose j_1, j_2, ..., j_d) ∏_{i=1}^{d} p_i^{j_i}

Reliability

    R(k) = P(K_1 > k_1, K_2 > k_2, ..., K_d > k_d)
         = Σ_{j_1=k_1+1}^{n} Σ_{j_2=k_2+1}^{n} ... Σ_{j_d=k_d+1}^{n} (n choose j_1, j_2, ..., j_d) ∏_{i=1}^{d} p_i^{j_i}

Properties and Moments

Median³:  Median(k_i) is either {⌊n p_i⌋, ⌈n p_i⌉}

Mode:  Mode(k_i) = ⌊(n + 1) p_i⌋

Mean - 1st Raw Moment:
    E[K] = μ = np

Mean of the marginal distribution:
    E[U] = μ_u = n p_u
    E[K_i] = μ_i = n p_i

Mean of the conditional distribution:
    E[U | V = v] = μ_{u|v} = n_{u|v} p_{u|v}
where
    n_{u|v} = n - n_v = n - Σ_{i=s+1}^{d} k_i
    p_{u|v} = [p_1, p_2, ..., p_s]^T / Σ_{i=1}^{s} p_i

³ ⌊x⌋ is the floor function (largest integer not greater than x); ⌈x⌉ is the ceiling function (smallest integer not less than x).
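The PDF above is usually computed on the log scale via Γ(n+1) = n! to avoid overflow. A minimal sketch using only the standard library (`multinomial_pmf` is an illustrative name):

```python
from math import lgamma, log, exp, comb

def multinomial_pmf(k, p):
    """f(k) = n!/(k_1!...k_d!) * prod p_i^{k_i}, computed stably on the
    log scale using lgamma (log of the gamma function, Gamma(n+1) = n!)."""
    n = sum(k)
    log_coef = lgamma(n + 1) - sum(lgamma(ki + 1) for ki in k)
    log_prob = sum(ki * log(pi) for ki, pi in zip(k, p) if ki > 0)
    return exp(log_coef + log_prob)

# The d = 2 special case reduces to the binomial PMF: C(8,3) 0.25^3 0.75^5.
assert abs(multinomial_pmf([3, 5], [0.25, 0.75])
           - comb(8, 3) * 0.25**3 * 0.75**5) < 1e-12
```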

Variance - 2nd Central Moment

    Var[K_i] = n p_i (1 - p_i)
    Cov[K_i, K_j] = -n p_i p_j,  i ≠ j

Covariance of marginal distributions:
    Var[K_i] = n p_i (1 - p_i)

Covariance of conditional distributions:
    Var[K_{U|V,i}] = n_{u|v} p_{u|v,i} (1 - p_{u|v,i})
    Cov[K_{U|V,i}, K_{U|V,j}] = -n_{u|v} p_{u|v,i} p_{u|v,j}
where
    n_{u|v} = n - n_v = n - Σ_{i=s+1}^{d} k_i
    p_{u|v} = [p_1, p_2, ..., p_s]^T / Σ_{i=1}^{s} p_i

Parameter Estimation

Maximum Likelihood Function

MLE Point Estimates: As with the binomial distribution, the MLE estimates, given the vector k (and therefore n), are: (Johnson et al. 1997, p.51)
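The moment formulas above can be assembled into a mean vector and covariance matrix directly (NumPy assumed; `multinomial_moments` is an illustrative name):

```python
import numpy as np

def multinomial_moments(n, p):
    """Mean vector n*p and covariance matrix with
    Var(K_i) = n p_i (1 - p_i) on the diagonal and
    Cov(K_i, K_j) = -n p_i p_j off the diagonal."""
    p = np.asarray(p)
    mean = n * p
    cov = -n * np.outer(p, p)
    np.fill_diagonal(cov, n * p * (1 - p))
    return mean, cov

mean, cov = multinomial_moments(60, [1/3, 1/2, 1/6])
assert np.allclose(mean, [20, 30, 10])
# Each row of the covariance matrix sums to zero because sum(K_i) = n is fixed.
assert np.allclose(cov.sum(axis=1), 0.0)
```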

    p̂ = k/n

Where there are T observations, each k_t containing n_t trials:

    p̂ = (1/Σ_{t=1}^{T} n_t) Σ_{t=1}^{T} k_t

100γ% Confidence Intervals (Complete Data)

An approximation of the joint interval confidence limits for 100γ%, given by Goodman in 1965, is: (Johnson et al. 1997, p.51)

p_i lower confidence limit:

    [A + 2k_i - sqrt(A(A + 4k_i(n - k_i)/n))] / [2(A + n)]

p_i upper confidence limit:

    [A + 2k_i + sqrt(A(A + 4k_i(n - k_i)/n))] / [2(A + n)]

where Φ is the standard normal CDF and:

    A = Z²_{(d-1+γ)/d},   Z_{(d-1+γ)/d} = Φ^{-1}((d-1+γ)/d)
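The Goodman approximation above needs only the standard normal quantile, which the Python standard library provides. A sketch (function name illustrative; the interval formulas follow Goodman 1965 as given above):

```python
from math import sqrt
from statistics import NormalDist

def goodman_intervals(k, gamma=0.95):
    """Approximate simultaneous 100*gamma% confidence intervals for
    multinomial cell probabilities (Goodman 1965)."""
    n, d = sum(k), len(k)
    z = NormalDist().inv_cdf((d - 1 + gamma) / d)   # Phi^{-1}((d-1+gamma)/d)
    A = z * z
    intervals = []
    for ki in k:
        root = sqrt(A * (A + 4 * ki * (n - ki) / n))
        lo = (A + 2 * ki - root) / (2 * (A + n))
        hi = (A + 2 * ki + root) / (2 * (A + n))
        intervals.append((lo, hi))
    return intervals

ints = goodman_intervals([12, 7, 11, 10, 8, 12])
# Each interval brackets its point estimate k_i / n.
assert all(lo < ki / 60 < hi for (lo, hi), ki in zip(ints, [12, 7, 11, 10, 8, 12]))
```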

A complete coverage of estimation techniques and confidence intervals is contained in (Johnson et al. 1997, pp.51-65). A more accurate method, which requires numerical methods, is given in (Sison & Glaz 1995).

Bayesian

Non-informative Priors, π(p) (Yang and Berger 1998, p.6)

Uniform Prior:
    Prior: 1 = Dir_{d+1}(α_i = 1)
    Posterior: Dir_{d+1}(p | 1 + k)

Jeffreys Prior / One-Group Reference Prior:
    Prior: C / sqrt(∏_{i=1}^{d+1} p_i) = Dir_{d+1}(α_i = 1/2), where C is a constant
    Posterior: Dir_{d+1}(p | 1/2 + k)
    In terms of the reference prior, this approach considers all parameters to be of equal importance. (Berger & Bernardo 1992)

d-group Reference Prior:
    Prior: C / sqrt(∏_{i=1}^{d-1} p_i (1 - Σ_{j=1}^{i} p_j)), where C is a constant
    Posterior: Proper. See the m-group posterior with n_i = 1.
    This approach considers each parameter to be of different importance (group length 1), and so the parameters must be ordered by importance. (Berger & Bernardo 1992)

m-group Reference Prior:
    Prior:
        π_o(p) = C / sqrt( (1 - Σ_{j=1}^{N_m} p_j) ∏_{i=1}^{d-1} p_i ∏_{i=1}^{m-1} (1 - Σ_{j=1}^{N_i} p_j)^{n_{i+1}} )
    where the groups are given by:
        p_1 = [p_1, ..., p_{n_1}]^T,  p_2 = [p_{n_1+1}, ..., p_{n_1+n_2}]^T,  ...,  p_m = [p_{N_{m-1}+1}, ..., p_{N_m}]^T,
        N_j = n_1 + ... + n_j for j = 1, ..., m, and C is a constant.
    Posterior:
        π(p|k) ∝ (1 - Σ_{j=1}^{N_m} p_j)^{k_d - 1/2} / sqrt( ∏_{i=1}^{d-1} p_i ∏_{i=1}^{m-1} (1 - Σ_{j=1}^{N_i} p_j)^{n_{i+1}} )
    This approach splits the parameters into m different groups of importance. Within a group, order is not important, but the groups must be ordered by importance. It is common to have m = 2 and split the parameters into importance and nuisance parameters. (Berger & Bernardo 1992)

MDIP:
    Prior: ∏_{i=1}^{d} p_i^{p_i} = Dir_{d+1}(α_i = p_i + 1)
    Posterior: Dir_{d+1}(p | p_i + 1 + k_i)

Novick and Hall's Prior (improper):
    Prior: ∏_{i=1}^{d} p_i^{-1} = Dir_{d+1}(α_i = 0)
    Posterior: Dir_{d+1}(p | k)

Conjugate Priors (Fink 1997)

UOI: p
Likelihood Model: Multinomial, Mnom_d(k; n, p)
Evidence: k_i failures in n trials with d possible states
Dist of UOI: Dirichlet_{d+1}
Prior Parameters: α_o
Posterior Parameters: α = α_o + k

Description, Limitations and Uses

Example: A six-sided die thrown 60 times produces the following multinomial distribution:

    Face    Times Observed
    1       12
    2       7
    3       11
    4       10
    5       8
    6       12

    k = [12, 7, 11, 10, 8, 12]^T,   p̂ = k/n = [0.2, 0.117, 0.183, 0.167, 0.133, 0.2]^T,   n = 60

Characteristics:

Binomial Generalization. The multinomial distribution is a generalization of the binomial distribution, allowing more than two states of the system. The binomial distribution is the special case where d = 2.
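The Dirichlet conjugate update above is a one-line computation. A sketch using the die-throw data, assuming a uniform Dir(1, ..., 1) prior (a choice made here for illustration):

```python
# Dirichlet-multinomial conjugate update: alpha_posterior = alpha_0 + k.
k = [12, 7, 11, 10, 8, 12]      # observed counts from the die example
alpha0 = [1] * 6                # uniform prior, Dir(1, ..., 1)
alpha_post = [a + ki for a, ki in zip(alpha0, k)]

# Posterior mean of each p_i is alpha_i / sum(alpha).
total = sum(alpha_post)
post_mean = [a / total for a in alpha_post]
assert total == 66
assert abs(sum(post_mean) - 1.0) < 1e-12
```

With a flat prior the posterior means are close to, but slightly shrunk from, the MLE values p̂ = k/n.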

Covariance. All covariances are negative. This is because an increase in one p_i must result in a decrease in the other p_j to satisfy Σ p_i = 1.

With Replacement. The multinomial distribution assumes sampling with replacement. The equivalent distribution which assumes sampling without replacement is the multivariate hypergeometric distribution.

Convolution Property

Let
    K_t ~ Mnom_d(k; n_t, p)
Then
    Σ K_t ~ Mnom_d(Σ k_t; Σ n_t, p)*

*This does not hold when the p parameter differs between the K_t.

Applications

Partial Failures. When the states of a system under demand cannot be modeled with two states (success or failure), the multinomial distribution may be used. Examples of this include modeling discrete states of component degradation.

Resources Online: http://en.wikipedia.org/wiki/Multinomial_distribution http://mathworld.wolfram.com/MultinomialDistribution.html http://www.math.uah.edu/stat/bernoulli/Multinomial.xhtml

Books: Johnson, N.L., Kotz, S. & Balakrishnan, N., 1997. Discrete Multivariate Distributions 1st ed., Wiley-Interscience.

Relationship to Other Distributions

Binomial Distribution, Binom(k | n, p):

Special Case:
    Mnom_{d=2}(k | n, p) = Binom(k | n, p)


7. References

Abadir, K. & Magdalinos, T., 2002. The Characteristic Function from a Family of Truncated Normal Distributions. Econometric Theory, 18(5), p.1276-1287.

Agresti, A., 2002. Categorical data analysis, John Wiley and Sons.

Aitchison, J.J. & Brown, J.A.C., 1957. The Lognormal Distribution, New York: Cambridge University Press.

Andersen, P.K. et al., 1996. Statistical Models Based on Counting Processes, Corrected ed., Springer.

Angus, J.E., 1994. Bootstrap one-sided confidence intervals for the log-normal mean. Journal of the Royal Statistical Society. Series D (The Statistician), 43(3), p.395–401.

Anon, Six Sigma | Process Management | Strategic Process Management | Welcome to SSA & Company.

Antle, C., Klimko, L. & Harkness, W., 1970. Confidence Intervals for the Parameters of the Logistic Distribution. Biometrika, 57(2), p.397-402.

Aoshima, M. & Govindarajulu, Z., 2002. Fixed-width confidence interval for a lognormal mean. International Journal of Mathematics and Mathematical Sciences, 29(3), p.143–153.

Arnold, B.C., 1983. Pareto distributions, Fairland, MD: International Co-operative Pub. House.

Artin, E., 1964. The Gamma Function, New York: Holt, Rinehart & Winston.

Balakrishnan, N., 1991. Handbook of the Logistic Distribution 1st ed., CRC.

Balakrishnan, N. & Basu, A.P., 1996. Exponential Distribution: Theory, Methods and Applications 1st ed., CRC.

Balakrishnan, N. & Lai, C.-D., 2009. Continuous Bivariate Distributions 2nd ed., Springer.

Balakrishnan, N. & Rao, C.R., 2001. Handbook of Statistics 20: Advances in Reliability 1st ed., Elsevier Science & Technology.

Berger, J.O., 1993. Statistical Decision Theory and Bayesian Analysis 2nd ed., Springer.

Berger, J.O. & Bernardo, J.M., 1992. Ordered Group Reference Priors with Application to the Multinomial Problem. Biometrika, 79(1), p.25-37.

Berger, J.O. & Sellke, T., 1987. Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence. Journal of the American Statistical Association, 82(397), p.112-122.

Berger, J.O. & Sun, D., 2008. Objective priors for the bivariate normal model. The Annals of Statistics, 36(2), p.963-982.

Bernardo, J.M. et al., 1992. On the development of reference priors. Bayesian statistics, 4, p.35–60.

Berry, D.A., Chaloner, K.M. & Geweke, J.K., 1995. Bayesian Analysis in Statistics and Econometrics: Essays in Honor of Arnold Zellner 1st ed., Wiley-Interscience.

Bertsekas, D.P. & Tsitsiklis, J.N., 2008. Introduction to Probability 2nd ed., Athena Scientific.

Billingsley, P., 1995. Probability and Measure, 3rd Edition 3rd ed., Wiley-Interscience.

Birnbaum, Z.W. & Saunders, S.C., 1969. A New Family of Life Distributions. Journal of Applied Probability, 6(2), p.319-327.

Björck, A., 1996. Numerical Methods for Least Squares Problems 1st ed., SIAM: Society for Industrial and Applied Mathematics.

Bowman, K.O. & Shenton, L.R., 1988. Properties of estimators for the gamma distribution, CRC Press.

Brown, L.D., Cai, T.T. & DasGupta, A., 2001. Interval estimation for a binomial proportion. Statistical Science, p.101–117.

Christensen, R. & Huffman, M.D., 1985. Bayesian Point Estimation Using the Predictive Distribution. The American Statistician, 39(4), p.319-321.

Cohen, 1991. Truncated and Censored Samples 1st ed., CRC Press.

Collani, E.V. & Dräger, K., 2001. Binomial distribution handbook for scientists and engineers, Birkhäuser.

Congdon, P., 2007. Bayesian Statistical Modelling 2nd ed., Wiley.

Cozman, F. & Krotkov, E., 1997. Truncated Gaussians as Tolerance Sets.

Crow, E.L. & Shimizu, K., 1988. Lognormal distributions, CRC Press.

Dekking, F.M. et al., 2007. A Modern Introduction to Probability and Statistics: Understanding Why and How, Springer.

Fink, D., 1997. A compendium of conjugate priors. See http://www.people.cornell.edu/pages/df36/CONJINTRnew%20TEX.pdf, p.46.

Georges, P. et al., 2001. Multivariate Survival Modelling: A Unified Approach with Copulas. SSRN eLibrary.

Gupta and Nadarajah, 2004. Handbook of beta distribution and its applications, CRC Press.

Gupta, P.L., Gupta, R.C. & Tripathi, R.C., 1997. On the monotonic properties of discrete failure rates. Journal of Statistical Planning and Inference, 65(2), p.255-268.

Haight, F.A., 1967. Handbook of the Poisson distribution, New York: Wiley.

Hastings, N.A.J., Peacock, B. & Evans, M., 2000. Statistical Distributions, 3rd Edition 3rd ed., John Wiley & Sons Inc.

Jiang, R. & Murthy, D.N.P., 1996. A mixture model involving three Weibull distributions. In Proceedings of the Second Australia–Japan Workshop on Stochastic Models in Engineering, Technology and Management. Gold Coast, Australia, pp. 260-270.

Jiang, R. & Murthy, D.N.P., 1998. Mixture of Weibull distributions - parametric characterization of failure rate function. Applied Stochastic Models and Data Analysis, (14), p.47-65.

Jiang, R. & Murthy, D.N.P., 1995. Modeling Failure-Data by Mixture of 2 Weibull Distributions: A Graphical Approach. IEEE Transactions on Reliability, 44, p.477-488.

Jiang, R. & Murthy, D.N.P., 1999. The exponentiated Weibull family: a graphical approach. Reliability, IEEE Transactions on, 48(1), p.68-72.

Johnson, N.L., Kemp, A.W. & Kotz, S., 2005. Univariate Discrete Distributions 3rd ed., Wiley-Interscience.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1994. Continuous Univariate Distributions, Vol. 1 2nd ed., Wiley-Interscience.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2 2nd ed., Wiley-Interscience.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1997. Discrete Multivariate Distributions 1st ed., Wiley-Interscience.

Kimball, B.F., 1960. On the Choice of Plotting Positions on Probability Paper. Journal of the American Statistical Association, 55(291), p.546-560.

Kleiber, C. & Kotz, S., 2003. Statistical Size Distributions in Economics and Actuarial Sciences 1st ed., Wiley-Interscience.

Klein, J.P. & Moeschberger, M.L., 2003. Survival analysis: techniques for censored and truncated data, Springer.

Kotz, S., Balakrishnan, N. & Johnson, N.L., 2000. Continuous Multivariate Distributions, Volume 1, Models and Applications, 2nd ed., Wiley-Interscience.

Kotz, S. & Dorp, J.R. van, 2004. Beyond Beta: Other Continuous Families Of Distributions With Bounded Support And Applications, World Scientific Publishing Company.

Kundu, D., Kannan, N. & Balakrishnan, N., 2008. On the hazard function of Birnbaum- Saunders distribution and associated inference. Comput. Stat. Data Anal., 52(5), p.2692-2702.

Lai, C.D., Xie, M. & Murthy, D.N.P., 2003. A modified Weibull distribution. IEEE Transactions on Reliability, 52(1), p.33-37.

Lai, C.-D. & Xie, M., 2006. Stochastic Ageing and Dependence for Reliability 1st ed., Springer.

Lawless, J.F., 2002. Statistical Models and Methods for Lifetime Data 2nd ed., Wiley-Interscience.

Leemis, L.M. & McQueston, J.T., 2008. Univariate distribution relationships. The American Statistician, 62(1), p.45–53.

Leipnik, R.B., 1991. On Lognormal Random Variables: I-the Characteristic Function. The ANZIAM Journal, 32(03), p.327-347.

Lemonte, A.J., Cribari-Neto, F. & Vasconcellos, K.L.P., 2007. Improved statistical inference for the two-parameter Birnbaum-Saunders distribution. Computational Statistics & Data Analysis, 51(9), p.4656-4681.

Limpert, E., Stahel, W. & Abbt, M., 2001. Log-normal Distributions across the Sciences: Keys and Clues. BioScience, 51(5), p.341-352.

MacKay, D.J.C. & Peto, L.C.B., 1995. A hierarchical Dirichlet language model. Natural Language Engineering.

Manzini, R. et al., 2009. Maintenance for Industrial Systems 1st ed., Springer.

Martz, H.F. & Waller, R., 1982. Bayesian reliability analysis, John Wiley & Sons, New York.

Meeker, W.Q. & Escobar, L.A., 1998. Statistical Methods for Reliability Data 1st ed., Wiley-Interscience.

Modarres, M., Kaminskiy, M. & Krivtsov, V., 1999. Reliability engineering and risk analysis, CRC Press.

Murthy, D.N.P., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Nelson, W.B., 1990. Accelerated Testing: Statistical Models, Test Plans, and Data Analysis, Wiley-Interscience.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Novosyolov, A., 2006. The sum of dependent normal variables may be not normal. http://risktheory.ru/papers/sumOfDep.pdf.

Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution 2nd ed., CRC.

Pham, H., 2006. Springer Handbook of Engineering Statistics 1st ed., Springer.

Provan, J.W., 1987. Probabilistic approaches to the material-related reliability of fracture-sensitive structures. In Probabilistic fracture mechanics and reliability. Dordrecht: Martinus Nijhoff Publishers, p.1–45.

Rao, C.R. & Toutenburg, H., 1999. Linear Models: Least Squares and Alternatives 2nd ed., Springer.

Rausand, M. & Høyland, A., 2004. System reliability theory, Wiley-IEEE.

Rencher, A.C., 1997. Multivariate Statistical Inference and Applications (Methods of Multivariate Analysis, Volume 2), Wiley-Interscience.

Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Schneider, H., 1986. Truncated and censored samples from normal populations, M. Dekker.

Simon, M.K., 2006. Probability Distributions Involving Gaussian Random Variables: A Handbook for Engineers and Scientists, Springer.

Singpurwalla, N.D., 2006. Reliability and Risk: A Bayesian Perspective 1st ed., Wiley.

Sison, C.P. & Glaz, J., 1995. Simultaneous Confidence Intervals and Sample Size Determination for Multinomial Proportions. Journal of the American Statistical Association, 90(429).

Tong, Y.L., 1990. The Multivariate Normal Distribution, Springer.

Xie, M., Gaudoin, O. & Bracquemond, C., 2002. Redefining Failure Rate Function for Discrete Distributions. International Journal of Reliability, Quality & Safety Engineering, 9(3), p.275.

Xie, M., Goh, T.N. & Tang, Y., 2004. On changing points of mean residual life and failure rate function for some generalized Weibull distributions. Reliability Engineering and System Safety, 84(3), p.293–299.

Xie, M., Tang, Y. & Goh, T.N., 2002. A modified Weibull extension with bathtub-shaped failure rate function. Reliability Engineering and System Safety, 76(3), p.279– 285.

Yang and Berger, 1998. A Catalog of Noninformative Priors (DRAFT).

Yang, K. et al., 2004. Multivariate Statistical Methods in Quality Management 1st ed., McGraw-Hill Professional.

Yang, R. & Berger, J.O., 1994. Estimation of a Covariance Matrix Using the Reference Prior. The Annals of Statistics, 22(3), p.1195-1211.

Zhou, X.H. & Gao, S., 1997. Confidence intervals for the log-normal mean. Statistics in medicine, 16(7), p.783–790.
