
14MERC0

Category | L | T | P | Credit
PE       | 3 | 0 | 0 | 3

Preamble

Reliability engineering is the branch of engineering that emphasizes the lifecycle management of a product. Dependability, or reliability, describes the ability of a system or component to function under stated conditions for a specified period of time. Students will be able to identify and manage asset reliability risks that could adversely affect plant or business operations.

Prerequisite

14ME310 - Statistical Techniques

Course Outcomes

At the end of the course, the students will be able to:

CO 1. Explain the basic concepts of Reliability Engineering and its measures. (Understand)
CO 2. Predict the reliability at system level using various models. (Apply)
CO 3. Design the test plan to meet the reliability requirements. (Apply)
CO 4. Predict and estimate the reliability from failure data. (Apply)
CO 5. Develop and implement a successful reliability programme. (Apply)

Mapping with Programme Outcomes

COs  | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12
CO1. | S   | S   | -   | S   | M   | -   | -   | -   | -   | -    | -    | -
CO2. | S   | S   | -   | S   | S   | -   | -   | -   | -   | -    | -    | M
CO3. | S   | S   | S   | S   | S   | -   | -   | -   | -   | -    | -    | M
CO4. | S   | S   | -   | S   | M   | -   | -   | -   | -   | -    | -    | M
CO5. | S   | S   | S   | S   | M   | -   | -   | -   | -   | -    | -    | M

S - Strong; M - Medium; L - Low

Assessment Pattern

Bloom's Category | Continuous Assessment Tests (1 / 2 / 3) | Terminal Examination
Remember   | 20 / 20 / 20 | 20
Understand | 40 / 40 / 40 | 40
Apply      | 40 / 40 / 40 | 40
Analyse    | -  / -  / -  | -
Evaluate   | -  / -  / -  | -
Create     | -  / -  / -  | -

Course Level Assessment Questions

Course Outcome 1 (CO1):
1. Write the concept of Reliability.
2. Define the term "Reliability management".
3. Explain the term "Bath Tub Curve".

Course Outcome 2 (CO2):
1. State and explain the possible causes of low reliability of modern engineering systems.
2. Compare the availability of the following two-unit systems with repair facilities: a) series system with one repair facility, b) series system with two repair facilities.
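One way to begin reasoning about the availability comparison above is sketched below. This is not the course's worked solution: the failure and repair rates are hypothetical, and only the simpler case where each unit has its own repair facility is shown (a single shared repair facility requires a Markov-chain model of the kind covered later in the syllabus).

```python
# Sketch (assumed example, not the course solution): steady-state availability
# of a two-unit series system when each unit has its own repair facility.
# For one repairable unit with failure rate lam and repair rate mu,
# steady-state availability is A = mu / (lam + mu).

def unit_availability(lam: float, mu: float) -> float:
    """Steady-state availability of a single repairable unit."""
    return mu / (lam + mu)

def series_availability(units) -> float:
    """Series system: all units must be up, so availabilities multiply.
    Valid when each unit has an independent repair facility."""
    a = 1.0
    for lam, mu in units:
        a *= unit_availability(lam, mu)
    return a

# Hypothetical rates (failures/hour, repairs/hour) for the two units.
units = [(0.001, 0.1), (0.002, 0.1)]
print(round(series_availability(units), 4))  # -> 0.9707
```

With one shared repair facility, a failed unit may have to queue for repair, so the steady-state availability is lower than this product; quantifying the difference is exactly what the Markov model in the System Reliability unit is for.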

Passed in Board of Studies Meeting held on 26.11.2016. Approved in 53rd Academic Council Meeting held on 22.12.2016. B.E. Degree (Mechanical Engineering) - 2014-15.

Course Outcome 3 (CO3):
1. Calculate a) the expectation, b) the second moment about the origin and c) the variance for the following distribution:

X    : 8    12   16   20   24
p(X) : 1/8  1/6  3/8  1/4  1/12

2. Draw fault-tree diagrams for the systems shown in the following figures:

(Figures: two block diagrams of systems built from components A, B and C.)
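The moment calculation in CO3 can be checked numerically. A minimal sketch, taking part (c) to be the variance:

```python
# Numerical check of the CO3 discrete distribution from the text.
from fractions import Fraction as F

x = [8, 12, 16, 20, 24]
p = [F(1, 8), F(1, 6), F(3, 8), F(1, 4), F(1, 12)]
assert sum(p) == 1  # the probabilities form a valid pmf

mean = sum(xi * pi for xi, pi in zip(x, p))               # E[X]
second_moment = sum(xi**2 * pi for xi, pi in zip(x, p))   # E[X^2], about the origin
variance = second_moment - mean**2                        # Var(X) = E[X^2] - E[X]^2

print(mean, second_moment, variance)  # -> 16 276 20
```

Using `Fraction` keeps the arithmetic exact, which makes the hand calculation easy to verify term by term.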

Course Outcome 4 (CO4):
1. What is failure data analysis?
2. What are the different techniques of failure data analysis?
3. How do you assess the design process in ...

Course Outcome 5 (CO5):
1. Explain the various risk measurement systems in modern industry.
2. Explain the various risk reduction resources in a chemical industry.
3. How the ... will support industrial safety.

Concept Map

Syllabus

Introduction: Basic definitions: Reliability, Availability, Serviceability, Failure rate; Reliability Mathematics; Failure distribution - constant failure rate model; Time dependent failure rate models and their types; Bath tub curve; Case study or videos on Human Reliability and Software Reliability.

System Reliability: Reliability Block Diagram (RBD) - Series, Parallel and combined series-parallel configurations; redundancy - active and passive types; Failure Mode, Effects and Criticality


Analysis (FMECA); Failure Reporting, Analysis and Corrective Action System (FRACAS); Fault Tree Analysis (FTA); System state analysis - Markov model; Availability; Downtime.

Reliability Testing: Failures and types of failures; Intrinsic and extrinsic failures; Failure cascade; Failure mode; Failure rate; MTTF; MTBF; Accelerated Life Testing (ALT) - Qualitative ALT, Quantitative ALT and its types; Acceleration Factor (AF); Samples.

Reliability Estimation and Life Prediction: Types of failure data - Data censoring; Parametric and non-parametric distributions; Probability density function; Exponential, Normal, Lognormal and Weibull distributions; Weibull goodness of fit; Electronics reliability prediction - parts count and parts stress methods; MIL standard; Naval Surface Warfare Center (NSWC).

Reliability Management: Design for Reliability; Relationship between reliability and safety factor; Stress-Strength interference theory; Reliability growth testing; Reliability Centered Maintenance (RCM); Spares planning.
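The series and parallel RBD rules from the System Reliability unit can be illustrated with a short sketch; the component reliabilities are assumed example values, not taken from the text:

```python
# Reliability of series, parallel, and combined series-parallel
# configurations of independent components (RBD rules).

def series(rs):
    """Series: every component must work -> reliabilities multiply."""
    r = 1.0
    for ri in rs:
        r *= ri
    return r

def parallel(rs):
    """Parallel (active redundancy): fails only if every component fails."""
    q = 1.0
    for ri in rs:
        q *= (1.0 - ri)
    return 1.0 - q

# Combined series-parallel: two redundant pairs in series (assumed values).
r = series([parallel([0.9, 0.9]), parallel([0.8, 0.8])])
print(round(r, 4))  # -> 0.9504
```

Note how redundancy lifts each pair well above its individual component reliability (0.99 and 0.96), while the series combination pulls the system below either pair.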

Text Book

1. Kailash C. Kapur, Michael Pecht, "Reliability Engineering", John Wiley & Sons, 2014.

Reference Books

1. Srinath L.S., "Reliability Engineering", Affiliated East-West Press Pvt Ltd, New Delhi, 1998.
2. Modarres M., "Reliability and Risk Analysis", Marcel Dekker Inc., 1993.
3. John Davidson, "The Reliability of Mechanical Systems", Institution of Mechanical Engineers, London, 1988.
4. Smith C.O., "Introduction to Reliability in Design", McGraw Hill, London, 1976.
5. Charles E. Ebeling, "An Introduction to Reliability and Maintainability Engineering", TMH, 2004.
6. Roy Billinton and Ronald N. Allan, "Reliability Evaluation of Engineering Systems", Springer, 2007.
7. Handbook of Reliability Prediction Procedures for Mechanical Equipment, CARDEROCKDIV, NSWC-11, May 2011, West Bethesda, Maryland 20817-5700.

Course Contents and Lecture Schedule

Module No. | Topic | No. of Lectures
1    | INTRODUCTION |
1.1  | Basic definitions: Reliability, Availability, Serviceability, Failure rate | 1
1.2  | Reliability Mathematics, Failure distribution - constant failure rate model | 2
1.3  | Time dependent failure rate models and its types, Bath tub curve | 1
1.4  | Case study or videos on Human Reliability, Software Reliability | 1
2    | SYSTEM RELIABILITY |
2.1  | RBD - Series, Parallel & combined series-parallel configurations | 2
2.2  | Redundancy - active and passive types | 2
2.3  | FMECA, FRACAS, Fault tree analysis (FTA), System state analysis | 1
2.4  | Markov Model, Availability, Downtime | 2
3    | RELIABILITY TESTING |
3.1  | Failures and types of failures; Intrinsic & extrinsic failures | 2
3.2  | Failure cascade; Failure mode; Failure rate, MTTF, MTBF | 2
3.3  | Accelerated life testing (ALT) - Qualitative ALT | 1
3.4  | Quantitative ALT & its types, AF, Samples | 2
4    | RELIABILITY ESTIMATION AND LIFE PREDICTION |
4.1  | Types of failure data - Data censoring | 1
4.2  | Parametric and non-parametric distributions | 2
4.3  | Probability density function; Exponential, Normal, Lognormal & Weibull distributions | 2
4.4  | Weibull goodness of fit | 2
4.5  | Electronics reliability prediction - parts count, parts stress method | 2
4.6  | MIL standard, NSWC | 1
5    | RELIABILITY MANAGEMENT |
5.1  | Design for Reliability | 2
5.2  | Relationship between Reliability and safety factor | 1
5.3  | Stress-Strength interference theory | 2
5.4  | Reliability growth testing | 1
5.5  | RCM, Spares planning | 1
     | TOTAL | 36

Course Designers:

1. S. Karthikeyan, [email protected]

What is Accelerated Life Testing?

Traditional life data analysis involves analyzing times-to-failure data obtained under normal operating conditions in order to quantify the life characteristics of a product, system or component. For many reasons, obtaining such life data (or times-to-failure data) may be very difficult or impossible. The reasons for this difficulty can include the long life times of today's products, the small time period between design and release, and the challenge of testing products that are used continuously under normal conditions. Given these difficulties and the need to observe failures of products to better understand their failure modes and life characteristics, reliability practitioners have attempted to devise methods to force these products to fail more quickly than they would under normal use conditions. In other words, they have attempted to accelerate their failures. Over the years, the phrase accelerated life testing has been used to describe all such practices.

As we use the phrase in this reference, accelerated life testing involves the acceleration of failures with the single purpose of quantifying the life characteristics of the product at normal use conditions. More specifically, accelerated life testing can be divided into two areas: qualitative accelerated testing and quantitative accelerated life testing. In qualitative accelerated testing, the engineer is mostly interested in identifying failures and failure modes without attempting to make any predictions as to the product's life under normal use conditions. In quantitative accelerated life testing, the engineer is interested in predicting the life of the product (or more specifically, life characteristics such as MTTF, B(10) life, etc.) at normal use conditions, from data obtained in an accelerated life test.

Qualitative vs. Quantitative Accelerated Tests

Each type of test that has been called an accelerated test provides different information about the product and its failure mechanisms. These tests can be divided into two types: qualitative tests (HALT, HAST, torture tests, shake and bake tests, etc.) and quantitative accelerated life tests. This reference addresses and quantifies the models and procedures associated with quantitative accelerated life tests (QALT).

Qualitative Accelerated Testing

Qualitative tests are tests which yield failure information (or failure modes) only. They have been referred to by many names, including:

- Elephant tests
- Torture tests
- HALT (highly accelerated life testing)
- HAST (highly accelerated stress test)
- Shake & bake tests

Qualitative tests are performed on small samples with the specimens subjected to a single severe level of stress, to multiple stresses, or to a time-varying stress (e.g., stress cycling, cold to hot, etc.). If the specimen survives, it passes the test. Otherwise, appropriate actions will be taken to improve the product's design in order to eliminate the cause(s) of failure. Qualitative tests are used primarily to reveal probable failure modes. However, if not designed properly, they may cause the product to fail due to modes that would never have been encountered in real life. A good qualitative test is one that quickly reveals those failure modes that will occur during the life of the product under normal use conditions.

In general, qualitative tests are not designed to yield life data that can be used in subsequent quantitative accelerated life data analysis as described in this reference. In general, qualitative tests do not quantify the life (or reliability) characteristics of the product under normal use conditions; however, they provide valuable information as to the types and levels of stresses one may wish to employ during a subsequent quantitative test.

Benefits and Drawbacks of Qualitative Tests

Benefits:
- Increase reliability by revealing probable failure modes.
- Provide valuable feedback in designing quantitative tests, and in many cases are a precursor to a quantitative test.

Drawbacks:
- Do not quantify the reliability of the product at normal use conditions.

Quantitative Accelerated Life Testing

Quantitative accelerated life testing (QALT), unlike the qualitative testing methods described previously, consists of tests designed to quantify the life characteristics of the product, component or system under normal use conditions, and thereby provide reliability information. Reliability information can include the probability of failure of the product under use conditions, life under use conditions, and projected returns and warranty costs. It can also be used to assist in the performance of risk assessments, design comparisons, etc. Quantitative accelerated life testing can take the form of usage rate acceleration or overstress acceleration. Both accelerated life test methods are described next. Because usage rate acceleration test data can be analyzed with typical life data analysis methods, the overstress acceleration method is the testing method relevant to both ALTA and the remainder of this reference.

Quantitative Accelerated Life Tests

For all life tests, some time-to-failure information (or time-to-an-event) for the product is required, since the failure of the product is the event we want to understand. In other words, if we wish to understand, measure and predict any event, we must observe how that event occurs. Most products, components or systems are expected to perform their functions successfully for long periods of time (often years). Obviously, for a company to remain competitive, the time required to obtain times-to-failure data must be considerably less than the expected life of the product.

Two methods of acceleration, usage rate acceleration and overstress acceleration, have been devised to obtain times-to-failure data at an accelerated pace. For products that do not operate continuously, one can accelerate the time it takes to induce/observe failures by testing these products continuously. This is called usage rate acceleration. For products for which usage rate acceleration is impractical, one can apply stress(es) at levels which exceed the levels that a product will encounter under normal use conditions and use the times-to-failure data obtained in this manner to extrapolate to use conditions. This is called overstress acceleration.

Usage Rate Acceleration

For products which do not operate continuously under normal conditions, if the test units are operated continuously, failures are encountered earlier than if the units were tested at normal usage. For example, a microwave oven operates for small periods of time every day. One can accelerate a test on microwave ovens by operating them more frequently until failure. The same could be said of washers. If we assume an average washer use of 6 hours a week, one could conceivably reduce the testing time 28-fold by testing these washers continuously (168 hours a week instead of 6). Data obtained through usage acceleration can be analyzed with the same methods used to analyze regular times-to-failure data. The limitation of usage rate acceleration arises when products, such as computer servers and peripherals, maintain a very high or even continuous usage. In such cases, usage acceleration, even though desirable, is not a feasible alternative, and the practitioner must stimulate the product to fail, usually through the application of stress(es). This method of accelerated life testing is called overstress acceleration and is described next.

Overstress Acceleration

For products with very high or continuous usage, the accelerated life testing practitioner must stimulate the product to fail in a life test. This is accomplished by applying stress(es) that exceed the stress(es) that a product will encounter under normal use conditions. The times-to-failure data obtained under these conditions are then used to extrapolate to use conditions. Accelerated life tests can be performed at high or low temperature, humidity, voltage, pressure, vibration, etc. in order to accelerate or stimulate the failure mechanisms. They can also be performed at a combination of these stresses.

Stresses & Stress Levels

Accelerated life test stresses and stress levels should be chosen so that they accelerate the failure modes under consideration but do not introduce failure modes that would never occur under use conditions. Normally, these stress levels will fall outside the product specification limits but inside the design limits, as illustrated next:

This choice of stresses/stress levels and the process of setting up the test are extremely important. Consult your design engineer(s) and material scientist(s) to determine what stimuli (stresses) are appropriate, as well as to identify the appropriate limits (or stress levels). If these stresses or limits are unknown, qualitative tests should be performed in order to ascertain the appropriate stress(es) and stress levels. Proper use of design of experiments (DOE) methodology is also crucial at this step. In addition to proper stress selection, the application of the stresses must be accomplished in some logical, controlled and quantifiable fashion. Accurate data on the stresses applied, as well as the observed behavior of the test specimens, must be maintained. Clearly, as the stress used in an accelerated test becomes higher, the required test duration decreases (because failures will occur more quickly). However, as the stress level moves farther away from the use conditions, the uncertainty in the extrapolation increases. Confidence intervals provide a measure of this uncertainty in extrapolation.

Understanding Quantitative Accelerated Life Data Analysis

In typical life data analysis one determines, through the use of statistical distributions, a life distribution that describes the times-to-failure of a product. Statistically speaking, one wishes to determine the use level probability density function, or pdf, of the times-to-failure. Appendix A of this reference presents these statistical concepts and provides a basic statistical background as it applies to life data analysis. Once this pdf has been obtained, all other desired reliability results can be easily determined, including:

- Percentage failing under warranty.
- Risk assessment.
- Design comparison.
- Wear-out period (product performance degradation).

In typical life data analysis, this use level pdf of the times-to-failure can be easily determined using regular times-to-failure/suspension data and an underlying distribution such as the Weibull, exponential or lognormal distribution. In accelerated life data analysis, however, we face the challenge of determining the use level pdf from accelerated life test data, rather than from times-to-failure data obtained under use conditions. To accomplish this, we must develop a method that allows us to extrapolate from data collected at accelerated conditions to arrive at an estimation of use level characteristics.
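The extrapolation idea can be pictured with a minimal sketch. Assuming the life-stress model implies a constant acceleration factor AF between the test stress and the use stress, every observed failure time scales by AF, so a Weibull overstress pdf keeps its shape parameter while its scale parameter stretches by AF. All parameter values below are assumed for illustration only:

```python
# Conceptual sketch: under a constant acceleration factor AF, a Weibull
# overstress pdf with parameters (beta, eta_test) maps to a use-level
# Weibull with parameters (beta, AF * eta_test).
import math

def weibull_reliability(t, beta, eta):
    """Weibull reliability function R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-(t / eta) ** beta)

beta, eta_test = 2.0, 500.0   # assumed fit to times-to-failure at the test stress
AF = 10.0                     # assumed acceleration factor from a life-stress model
eta_use = AF * eta_test       # shape unchanged, scale stretched by AF

# Estimated reliability at 1000 hours under use conditions.
print(round(weibull_reliability(1000.0, beta, eta_use), 4))  # -> 0.9608
```

This is only the mapping step; in practice AF itself comes from a fitted life-stress relationship, which the following sections develop.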

Looking at a Single Constant Stress Accelerated Life Test

To understand the process involved with extrapolating from overstress test data to use level conditions, let's look closely at a simple accelerated life test. For simplicity we will assume that the product was tested under a single stress at a single constant stress level. We will further assume that times-to-failure data have been obtained at this stress level. The times-to-failure at this stress level can then be easily analyzed using an underlying life distribution. A pdf of the times-to-failure of the product can be obtained at that single stress level using traditional approaches. This pdf, the overstress pdf, can likewise be used to make predictions and estimates of life measures of interest at that particular stress level. The objective in an accelerated life test, however, is not to obtain predictions and estimates at the particular elevated stress level at which the units were tested, but to obtain these measures at another stress level, the use stress level.

To accomplish this objective, we must devise a method to traverse the path from the overstress pdf to extrapolate a use level pdf. The next figure illustrates a typical behavior of the pdf at the high stress (or overstress level) and the pdf at the use stress level.

To further simplify the scenario, let's assume that the pdf for the product at any stress level can be described by a single point. The next figure illustrates such a simplification, where we need to determine a way to project (or map) this single point from the high stress to the use stress.

Obviously, there are infinite ways to map a particular point from the high stress level to the use stress level. We will assume that there is some model (or a function) that maps our point from the high stress level to the use stress level. This model or function can be described mathematically and can be as simple as the equation for a line. The next figure demonstrates some simple models or relationships.

Even when a model is assumed (e.g., linear, exponential, etc.), the mapping possibilities are still infinite, since they depend on the parameters of the chosen model or relationship. For example, a simple linear model would generate different mappings for each slope value, because we can draw an infinite number of lines through a point. If we tested specimens of our product at two different stress levels, we could begin to fit the model to the data. Clearly, the more points we have, the better off we are in correctly mapping this particular point or fitting the model to our data.
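The two-stress-level idea can be made concrete with a small sketch. Assuming an inverse power law life-stress model (one of the relationships introduced later), linearizing it as ln L = a - n ln V lets two fitted points determine the model exactly; with more stress levels one would fit by least squares instead. The stress levels and characteristic lives below are hypothetical:

```python
# Solve an assumed inverse power law model, ln L = a - n * ln V, from
# characteristic lives fitted at two test stresses, then extrapolate
# to a (lower) use stress.
import math

# Hypothetical fitted characteristic lives (hours) at two voltage stresses.
V1, L1 = 6.0, 300.0
V2, L2 = 9.0, 80.0

n = (math.log(L1) - math.log(L2)) / (math.log(V2) - math.log(V1))  # slope
a = math.log(L1) + n * math.log(V1)                                 # intercept

V_use = 3.0
L_use = math.exp(a - n * math.log(V_use))  # extrapolated life at use stress
print(round(L_use, 1))
```

The extrapolated use-stress life is far longer than either test life, which is exactly why confidence bounds matter: small errors in the fitted slope are amplified by the extrapolation.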

The above figure illustrates that you need a minimum of two higher stress levels to properly map the function to a use stress level.

Life Distributions and Life-Stress Models

The analysis of accelerated life test data consists of (1) an underlying life distribution that describes the product at different stress levels and (2) a life-stress relationship (or model) that quantifies the manner in which the life distribution changes across different stress levels. These elements of analysis are graphically shown next:

The combination of both an underlying life distribution and a life-stress model can be best seen in the next figure where a pdf is plotted against both time and stress.

The assumed underlying life distribution can be any life distribution. The most commonly used life distributions include the Weibull, exponential and lognormal distributions. Along with the life distribution, a life-stress relationship is also used. These life-stress relationships have been empirically derived and fitted to data. An overview of some of these life-stress relationships is presented in the Analysis Method subchapter.

Analysis Method

With our current understanding of the principles behind accelerated life testing analysis, we will continue with a discussion of the steps involved in analyzing life data collected from accelerated life tests like those described in the Quantitative Accelerated Life Tests section.

Select a Life Distribution

The first step in performing an accelerated life data analysis is to choose an appropriate life distribution. Although it is rarely appropriate, the exponential distribution has in the past been widely used as the underlying life distribution because of its simplicity. The Weibull and lognormal distributions, which require more involved calculations, are more appropriate for most uses. The underlying life distributions available in ALTA are presented in detail in the Distributions Used in Accelerated Testing chapter of this reference.

Select a Life-Stress Relationship

After you have selected an underlying life distribution appropriate to your data, the second step is to select (or create) a model that describes a characteristic point or a life characteristic of the distribution from one stress level to another.

The life characteristic can be any life measure, such as the mean, median, R(x), F(x), etc. This life characteristic is expressed as a function of stress. Depending on the assumed underlying life distribution, different life characteristics are considered. Typical life characteristics for some distributions are shown in the next table.

Distribution | Parameters | Life Characteristic
Weibull      | β*, η      | Scale parameter, η
Exponential  | λ          | Mean life (1/λ)
Lognormal    | μ, σ*      | Median, exp(μ)

*Usually assumed constant.

For example, when considering the Weibull distribution, the scale parameter, η, is chosen to be the life characteristic that is stress dependent, while β is assumed to remain constant across different stress levels. A life-stress relationship is then assigned to η. Eight common life-stress models are presented later in this reference:

- Arrhenius Relationship
- Eyring Relationship
- Inverse Power Law Relationship
- Temperature-Humidity Relationship
- Temperature Non-Thermal Relationship
- Multivariable Relationships: General Log-Linear and Proportional Hazards
- Time-Varying Stress Models

Parameter Estimation

Once you have selected an underlying life distribution and life-stress relationship model to fit your accelerated test data, the next step is to select a method by which to perform parameter estimation. Simply put, parameter estimation involves fitting a model to the data and solving for the parameters that describe that model. In our case, the model is a combination of the life distribution and the life-stress relationship (model). The task of parameter estimation can vary from trivial (with ample data, a single constant stress, a simple distribution and simple model) to impossible. Available methods for estimating the parameters of a model include the graphical method, the least squares method and the maximum likelihood estimation (MLE) method. Parameter estimation methods are presented in detail in Appendix B of this reference. Greater emphasis will be given to the MLE method because it provides a more robust solution, and it is the one employed in ALTA.

Derive Reliability Information

Once the parameters of the underlying life distribution and life-stress relationship have been estimated, a variety of reliability information about the product can be derived, such as:

- Warranty time.
- The instantaneous failure rate, which indicates the number of failures occurring per unit time.
- The mean life, which provides a measure of the average time of operation to failure.
- B(X) life, which is the time by which X% of the units will fail.
- etc.

Stress Loading

The discussion of accelerated life testing analysis thus far has included the assumption that the stress loads applied to units in an accelerated test have been constant with respect to time. In real life, however, different types of loads can be considered when performing an accelerated test. Accelerated life tests can be classified as constant stress, step stress, cycling stress, random stress, etc. These types of loads are classified according to the dependency of the stress with respect to time. There are two possible stress loading schemes: loadings in which the stress is time-independent and loadings in which the stress is time-dependent. The mathematical treatment, models and assumptions vary depending on the relationship of stress to time. Both of these loading schemes are described next.

Stress is Time-Independent (Constant Stress)

When the stress is time-independent, the stress applied to a sample of units does not vary. In other words, if temperature is the thermal stress, each unit is tested under the same accelerated temperature (e.g., 100°C), and data are recorded. This is the type of stress load that has been discussed so far.

This type of stress loading has many advantages over time-dependent stress loadings. Specifically:

- Most products are assumed to operate at a constant stress under normal use.
- It is far easier to run a constant stress test (e.g., one in which the chamber is maintained at a single temperature).
- It is far easier to quantify a constant stress test.
- Models for data analysis exist, are widely publicized and are empirically verified.
- Extrapolation from a well-executed constant stress test is more accurate than extrapolation from a time-dependent stress test.

Stress is Time-Dependent

When the stress is time-dependent, the product is subjected to a stress level that varies with time. Products subjected to time-dependent stress loadings will yield failures more quickly, and models that fit them are thought by many to be the "holy grail" of accelerated life testing. The cumulative damage model allows you to analyze data from accelerated life tests with time-dependent stress profiles. The step-stress model, as discussed in [31], and the related ramp-stress model are typical cases of time-dependent stress tests. In these cases, the stress load remains constant for a period of time and then is stepped/ramped to a different stress level, where it remains constant for another time interval until it is stepped/ramped again. There are numerous variations of this concept.

The same idea can be extended to include a stress as a continuous function of time.
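As a concrete instance of a constant-stress (time-independent) test, the Arrhenius relationship listed in this tutorial's contents can be sketched numerically. The activation energy and temperatures below are assumed example values, not taken from the text:

```python
# Sketch of the Arrhenius life-stress relationship: life L(T) = C * exp(B/T)
# with T in kelvin, so the acceleration factor between a use temperature and
# a (hotter) test temperature is AF = exp(B * (1/T_use - 1/T_test)).
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_af(ea_ev: float, t_use_c: float, t_test_c: float) -> float:
    """Acceleration factor for an assumed activation energy ea_ev (eV)."""
    t_use = t_use_c + 273.15    # convert Celsius to kelvin
    t_test = t_test_c + 273.15
    b = ea_ev / BOLTZMANN_EV    # the Arrhenius parameter B = Ea/k
    return math.exp(b * (1.0 / t_use - 1.0 / t_test))

# Assumed example: Ea = 0.7 eV, 40 C use temperature, 100 C test temperature.
print(round(arrhenius_af(0.7, 40.0, 100.0), 1))  # on the order of 65x here
```

The steep dependence on activation energy is why misjudging Ea is one of the larger error sources in temperature-accelerated testing.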

Summary of Accelerated Life Testing Analysis

In summary, accelerated life testing analysis can be conducted on data collected from carefully designed quantitative accelerated life tests. Well-designed accelerated life tests will apply stress(es) at levels that exceed the stress level the product will encounter under normal use conditions in order to accelerate the failure modes that would occur under use conditions. An underlying life distribution (like the exponential, Weibull and lognormal lifetime distributions) can be chosen to fit the life data collected at each stress level, to derive overstress pdfs for each stress level. A life-stress relationship (Arrhenius, Eyring, etc.) can then be chosen to quantify the path from the overstress pdfs in order to extrapolate a use level pdf. From the extrapolated use level pdf, a variety of functions can be derived, including reliability, failure rate, mean life, warranty time, etc.
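The final step of this pipeline, turning an extrapolated use-level pdf into the reliability measures named above, has simple closed forms for the Weibull distribution. The parameter values below are assumed for illustration:

```python
# Deriving summary reliability measures from an assumed extrapolated
# use-level Weibull pdf with shape beta and scale eta.
import math

beta, eta = 2.0, 5000.0  # assumed use-level Weibull parameters (hours)

# Mean life: MTTF = eta * Gamma(1 + 1/beta).
mean_life = eta * math.gamma(1.0 + 1.0 / beta)

# B(10) life: the time by which 10% of units fail, F(t) = 0.10.
b10_life = eta * (-math.log(1.0 - 0.10)) ** (1.0 / beta)

def failure_rate(t: float) -> float:
    """Instantaneous (hazard) rate h(t) = (beta/eta) * (t/eta)^(beta-1)."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)

print(round(mean_life, 1), round(b10_life, 1))  # -> 4431.1 1623.0
```

With beta > 1 the hazard rate increases with time (wear-out behavior), which is also visible in B(10) falling well below the mean life.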


2003 Annual RELIABILITY and MAINTAINABILITY Symposium

Understanding Accelerated Life-Testing Analysis

Pantelis Vassiliou and Adamantios Mettas

Pantelis Vassiliou ReliaSoft Corporation ReliaSoft Plaza 115 South Sherwood Village Drive Tucson, AZ 85710 [email protected]

RF #2003RM-100: page i RF

Summary & Purpose

Accelerated tests are becoming increasingly popular in today’s industry due to the need for obtaining life data quickly. Life testing of products under higher stress levels without introducing additional failure modes can provide significant savings of both time and money. Correct analysis of data gathered via such accelerated life testing will yield parameters and other information for the product’s life under use stress conditions.

This is a brief introductory tutorial on this subject. Its main purpose is to introduce the participant to some of the basic theories and methodologies of accelerated life testing data analysis.

Pantelis Vassiliou and Adamantios Mettas

Mr. Vassiliou directs and coordinates ReliaSoft's R&D efforts to deliver state of the art software tools for applying reliability engineering concepts and methodologies. He is the original architect of ReliaSoft's Weibull++, a renowned expert and lecturer on Reliability Engineering and ReliaSoft's founder. He is currently spearheading the development of new technologically advanced products and services. In addition, he also consults, trains and lectures on reliability engineering topics to Fortune 1000 companies worldwide. Mr. Vassiliou holds an MS degree in Reliability Engineering from the University of Arizona.

Mr. Mettas is the Senior research scientist at ReliaSoft Corporation. He fills a critical role in the advancement of ReliaSoft's theoretical research efforts and formulations in the subjects of Life Data Analysis, Accelerated Life Testing, and System Reliability and Maintainability. He has played a key role in the development of ReliaSoft's software including Weibull++, ALTA, and BlockSim, and has published numerous papers on various reliability methods. Mr. Mettas holds an MS degree in Reliability Engineering from the University of Arizona.


Table of Contents

1. Introduction
2. Types of Accelerated Tests
   2.1. Qualitative Tests
   2.2. ESS and Burn-In
   2.3. Quantitative Accelerated Life Tests
3. Understanding Accelerated Life Test Analysis
   3.1. Looking at a Single Constant Stress Accelerated Life Test
4. Life Distribution and Stress-Life Models
   4.1. Overview of the Analysis Steps
5. Review of some Simple Life Stress Relationships
   5.1. Arrhenius Relationship
   5.2. Eyring Relationship
   5.3. Inverse Power Law Relationship
   5.4. Temperature-Humidity Relationship
   5.5. Temperature-Non-Thermal Relationship
6. Parameter Estimation
7. Reliability Information
8. Stress Loading
   8.1. Stress is Time-Independent (Constant Stress)
   8.2. Stress is Time-Dependent
   8.3. Stress is Quasi Time-Dependent
   8.4. Stress is Continuously Time-Dependent
9. An Introduction to the Arrhenius Relationship
   9.1. A Look at the Parameter B
   9.2. Acceleration Factor
   9.3. Arrhenius Relationship combined with a Life Distribution
   9.4. Other Single Constant Stress Models
10. An Introduction to Two-Stress Models
   10.1. Temperature-Humidity Relationship Introduction
   10.2. Temperature-Non-Thermal Relationship Introduction
11. A Very Simple Tutorial Example
12. References

1. INTRODUCTION

Traditional "Life Data Analysis" involves analyzing times-to-failure data (of a product, system or component) obtained under "normal" operating conditions in order to quantify the life characteristics of the product, system or component. In many situations, and for many reasons, such life data (or times-to-failure data) is very difficult, if not impossible, to obtain. The reasons for this difficulty can include the long life times of today's products, the small time period between design and release, and the challenge of testing products that are used continuously under normal conditions. Given this difficulty, and the need to observe failures of products to better understand their failure modes and life characteristics, reliability practitioners have attempted to devise methods to force these products to fail more quickly than they would under normal use conditions. In other words, they have attempted to accelerate their failures. Over the years, the term "Accelerated Life Testing" has been used to describe all such practices.

A variety of methods, serving different purposes, have been termed "Accelerated Life Testing." As we use the term in this tutorial, "Accelerated Life Testing" involves acceleration of failures with the single purpose of the "quantification of the life characteristics of the product at normal use conditions." This tutorial is solely concerned with this type of accelerated life testing. To avoid confusion, the following section describes the different types of tests that have been called "accelerated tests" and distinguishes between those that are addressed in this tutorial and those that are not.

2. TYPES OF ACCELERATED TESTS

Each type of test that has been called an accelerated test provides different information about the product and its failure mechanisms. Generally, accelerated tests can be divided into three types: Qualitative Tests (Torture Tests or Shake and Bake Tests), ESS and Burn-in, and finally Quantitative Accelerated Life Tests. This tutorial only addresses and quantifies some models and procedures associated with the last type, Quantitative Accelerated Life Tests.

2.1 Qualitative Tests

Qualitative Tests are tests that yield failure information (or failure modes) only. They have been referred to by many names, including:
• Elephant Tests
• Torture Tests
• HALT (Highly Accelerated Life Testing)
• Shake & Bake Tests

Qualitative tests are performed on small samples, with the specimens subjected to a single severe level of stress, to a number of stresses, or to a time-varying stress (i.e., cycling cold to hot, etc.). If the specimen survives, it passes the test. Otherwise, appropriate actions are taken to improve the product's design in order to eliminate the cause(s) of failure. Qualitative tests are used primarily to reveal probable failure modes. However, if not designed properly, they may cause the product to fail due to modes that would never have been encountered in real life. A good qualitative test is one that quickly reveals those failure modes that will occur during the life of the product under normal use conditions. In general, qualitative tests are not designed to yield life data that can be used in subsequent analysis or for "Accelerated Life Test Analysis," and they do not quantify the life (or reliability) characteristics of the product under normal use conditions.

2.1.1 Benefits and Drawbacks of Qualitative Tests
• Benefit: Increased reliability by revealing probable failure modes.
• Unanswered question: What is the reliability of the product at normal use conditions?

2.2 ESS and Burn-In

The second type of accelerated test consists of ESS and Burn-in testing. ESS, Environmental Stress Screening, is a process involving the application of environmental stimuli to products (usually electronic or electromechanical products) on an accelerated basis. The stimuli in an ESS test can include thermal cycling, random vibration, electrical stresses, etc. The goal of ESS is to expose, identify and eliminate latent defects which cannot be detected by visual inspection or electrical testing but which will cause failures in the field. ESS is performed on the entire population and does not involve sampling.

Burn-in can be regarded as a special case of ESS. According to MIL-STD-883C, Burn-in is a test performed for the purpose of screening or eliminating marginal devices: those with inherent defects, or defects resulting from manufacturing aberrations, which cause time- and stress-dependent failures. As with ESS, Burn-in is performed on the entire population. Readers interested in the subject of ESS and Burn-in are encouraged to refer to Kececioglu & Sun on ESS [3] and Burn-in [4].

2.3 Quantitative Accelerated Life Tests

Quantitative Accelerated Life Testing, unlike the qualitative testing methods (i.e., Torture Tests, Burn-in, etc.) described previously, consists of quantitative tests designed to quantify the life characteristics of the product, component or system under normal use conditions, and thereby provide "Reliability Information." Reliability information can include the determination of the probability of failure of the product under use conditions, mean life under use conditions, and projected returns and warranty costs. It can also be used to assist in the performance of risk assessments, design comparisons, etc. Accelerated Life Testing can take the form of "Usage Rate Acceleration" or "Overstress Acceleration"; both methods are described next. Because "Usage Rate Acceleration" test data can be analyzed with typical life data analysis methods, the Overstress Acceleration method is the method relevant to this tutorial.

For all life tests, some time-to-failure information for the product is required, since the failure of the product is the event we want to understand. In other words, if we wish to understand, measure, and predict any event, we must observe the event! Most products, components or systems are expected to perform their functions successfully for long periods of time,

such as years. Obviously, for a company to remain competitive, the time required to obtain times-to-failure data must be considerably less than the expected life of the product. Two methods of acceleration, "Usage Rate Acceleration" and "Overstress Acceleration," have been devised to obtain times-to-failure data at an accelerated pace. For products that do not operate continuously, one can accelerate the time it takes to induce failures by testing these products continuously. This is called "Usage Rate Acceleration." For products for which "Usage Rate Acceleration" is impractical, one can apply stress(es) at levels that exceed the levels the product will encounter under normal use conditions, and use the times-to-failure data obtained in this manner to extrapolate to use conditions. This is called "Overstress Acceleration."

2.3.1 Usage Rate Acceleration

For products that do not operate continuously under normal conditions, if the test units are operated continuously, failures are encountered earlier than if the units were tested at normal usage. For example, a microwave oven operates for small periods of time every day; one can accelerate a test on microwave ovens by operating them more frequently until failure. The same could be said of washers: if we assume an average washer use of 6 hours a week, one could conceivably reduce the testing time 28-fold by testing these washers continuously (168 hours per week instead of 6). Data obtained through usage acceleration can be analyzed with the same methods used to analyze regular times-to-failure data. The limitation of "Usage Rate Acceleration" arises when products, such as computer servers and peripherals, maintain a very high or even continuous usage. In such cases, usage acceleration, even though desirable, is not a feasible alternative, and the practitioner must stimulate the product to fail, usually through the application of stress(es). This method of accelerated life testing is called "Overstress Acceleration" and is described next.

2.3.2 Overstress Acceleration

For products with very high or continuous usage, the accelerated life-testing practitioner must stimulate the product to fail in a life test. This is accomplished by applying stress(es) that exceed the stress(es) the product will encounter under normal use conditions. The times-to-failure data obtained under these conditions are then used to extrapolate to use conditions. Accelerated life tests can be performed at high or low temperature, humidity, voltage, pressure, vibration, and/or combinations of stresses in order to accelerate or stimulate the failure mechanisms. Accelerated life test stresses and stress levels should be chosen so that they accelerate the failure modes under consideration but do not introduce failure modes that would never occur under use conditions. Normally, these stress levels will fall outside the product specification limits but inside the design limits.

[Figure 1: Typical stress limits for a component, product or system.]

This choice of stresses, as well as stress levels, and the process of setting up the experiment are of the utmost importance. Consult your design engineer(s) and material scientist(s) to determine what stimuli (stresses) are appropriate, as well as to identify the appropriate limits (or stress levels). If these stresses or limits are unknown, multiple tests with small sample sizes can be performed in order to ascertain the appropriate stress(es) and stress levels. The adequacy and applicability of these stresses can be confirmed through subsequent failure analysis. Information from the qualitative testing phase (i.e., torture tests, etc.) of a normal product development process can also be utilized in ascertaining the appropriate stress(es). Proper use of Design of Experiments (DOE) methodology is also crucial at this step. In addition to proper stress selection, the application of the stresses must be accomplished in some logical, controlled and quantifiable fashion, and accurate data on the stresses applied, as well as the observed behavior of the test specimens, must be maintained.

It is clear that as the stress used in an accelerated test becomes higher, the required test duration decreases. However, as the stress moves farther away from the use conditions, the uncertainty in the extrapolation increases. This is what we jokingly refer to as the "there is no free lunch" principle. Confidence intervals provide a measure of this uncertainty in extrapolation.

3. UNDERSTANDING ACCELERATED LIFE TEST ANALYSIS

In typical life data analysis one determines, through the use of statistical distributions, a life distribution that describes the times-to-failure of a product. Statistically speaking, one wishes to determine the use level probability density function, or pdf, of the times-to-failure. Once this pdf is obtained, all other desired reliability results can be easily determined, including but not limited to:

• Percentage failing under warranty.
• Risk assessment.
• Design comparison.
• Wear-out period (product performance degradation).

In typical life data analysis, this use level probability density function, or pdf, of the times-to-failure can be easily determined using regular times-to-failure data and an underlying distribution such as the Weibull, exponential or lognormal distribution. In accelerated life testing analysis, however, we face the challenge of determining this use level pdf from accelerated life test data rather than from times-to-failure data obtained under use conditions. To accomplish this, we must develop a method that allows us to extrapolate from data collected at accelerated conditions to arrive at an estimation of use level characteristics.

3.1 Looking at a Single Constant Stress Accelerated Life Test

To understand the process involved with extrapolating from overstress test data to use level conditions, let's look closely at a simple accelerated life test. For simplicity, we will assume that the product was tested under a single stress and at a single constant stress level, and that times-to-failure data have been obtained at this stress level. These times-to-failure can then be easily analyzed using an underlying life distribution, and a pdf of the times-to-failure of the product can be obtained at that single stress level using traditional approaches (for more details see [7, 10]). This overstress pdf can be used to make predictions and estimates of life measures of interest at that particular stress level. The objective in an accelerated life test, however, is not to obtain predictions and estimates at the particular elevated stress level at which the units were tested, but to obtain these measures at another stress level, the use stress level. To accomplish this objective, we must devise a method to traverse the path from the overstress pdf to extrapolate a use level pdf.

[Figure 2: Traversing from a high stress to our use stress (pdf of life plotted at the use stress and at a high stress).]

The first part of Figure 2 illustrates the typical behavior of the pdf at the high stress (or overstress) level and the pdf at the use stress level. To further simplify the scenario, let's assume that a single point can describe the pdf for the product at any stress level. The second part of Figure 2 illustrates such a simplification, where we need to determine a way to project (or map) this single point from the high stress to the use stress.

Obviously, there are infinite ways to map a particular point from the high stress level to the use stress level. We will assume that there is some road map (a model or a function) that maps our point from the high stress level to the use stress level (or shows us the way). This model or function can be described mathematically and can be as simple as the equation for a line. Figure 3 demonstrates some simple models or relationships.

[Figure 3: A simple linear and a simple exponential relationship.]

Even when a model is assumed (i.e., linear, exponential, etc.), the mapping possibilities are still infinite, since they depend on the parameters of the chosen model or relationship. For example, a simple linear model would generate different mappings for each slope value, because we can draw an infinite number of lines through a point. If we tested specimens of our product at two different stress levels, we could begin to fit the model to the data. Obviously, the more points we have, the better off we are in correctly mapping this particular point, or fitting the model to our data. Figure 4 illustrates that a minimum of two stress levels is needed to properly map the function to a use stress level.
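The mapping idea above can be sketched numerically. Assuming a simple linear life-stress line fitted through a characteristic life observed at two hypothetical elevated stress levels (all values below are illustrative, not from the tutorial), extrapolating to the use stress is just evaluating the fitted line:

```python
# Sketch: fit a linear life-stress model through data from two hypothetical
# test stress levels, then extrapolate to a lower use stress level.

def fit_line(p1, p2):
    """Return (slope, intercept) of the line through two (stress, life) points."""
    (s1, l1), (s2, l2) = p1, p2
    slope = (l2 - l1) / (s2 - s1)
    intercept = l1 - slope * s1
    return slope, intercept

# Characteristic life observed at two elevated stress levels (hypothetical):
high_stress_points = [(400.0, 250.0), (450.0, 150.0)]

slope, intercept = fit_line(*high_stress_points)

use_stress = 300.0                      # use level, below both test levels
use_life = slope * use_stress + intercept
print(use_life)                         # → 450.0
```

With only one test stress level, any slope would pass through the single point; two levels pin the line down, which is exactly the point Figure 4 makes.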

[Figure 4: Testing at two (or more) higher stress levels allows us to better fit the model.]

4. LIFE DISTRIBUTION AND STRESS-LIFE MODELS

Analysis of accelerated life test data, then, consists of an underlying life distribution that describes the product at different stress levels and a stress-life relationship (or model) that quantifies the manner in which the life distribution (or the life distribution characteristic under consideration) changes across different stress levels. These elements of the analysis are shown graphically in Figure 5.

[Figure 5: A life distribution and a stress-life relationship (probability plot of unreliability, F(t), vs. time).]

The combination of both an underlying life distribution and a stress-life model can be best seen in Figure 6, where a pdf is plotted against both time and stress.

[Figure 6: A three-dimensional representation of the pdf vs. time and stress, created using ReliaSoft's ALTA 1.0 software [10].]

The assumed underlying life distribution can be any life distribution. The most commonly used life distributions include the Weibull, the exponential and the lognormal. The practitioner should be cautioned against using the exponential distribution unless the underlying assumption of a constant failure rate can be justified. Along with the life distribution, a stress-life relationship is also used. A stress-life relationship can be one of the empirically derived relationships or a new one formulated for the particular stress and application. The data obtained from the experiment are then fitted to both the underlying life distribution and the stress-life relationship.

4.1 Overview of the Analysis Steps

With our current understanding of the principles behind accelerated life testing analysis, we will continue with a discussion of the steps involved in performing an analysis on life data collected from accelerated life tests.

4.1.1 Life Distribution

The first step in performing an accelerated life test analysis is to choose an appropriate life distribution. Although it is rarely appropriate, the exponential distribution, because of its simplicity, is very commonly used as the underlying life distribution. The Weibull and lognormal distributions, which require more involved calculations, are more appropriate for most uses. Note that the exponential distribution is a special case of the Weibull (for β equal to 1).

4.1.2 Stress-Life Relationship

After you have selected an underlying life distribution appropriate to your data, the second step is to select (or create) a model that describes a characteristic point, or life characteristic, of the distribution from one stress level to another. The life characteristic can be any life measure, such as the mean, median, etc., and is expressed as a function of stress. Depending on the assumed underlying life distribution, different life characteristics are considered. Typical life characteristics for some distributions are shown in the next table (Table 1).

[Table 1: Typical life characteristics for common life distributions (the entry marked * is usually assumed constant across stress).]

For example, when considering the Weibull distribution, the scale parameter, η, is chosen to be the "life characteristic" that is stress dependent, while β is assumed to remain constant across different stress levels. A stress-life relationship is then assigned to η.

[Figure 7: A graphical representation of a Weibull reliability function ("Reliability vs. Stress Surface") plotted as both a function of time and stress.]

5. OVERVIEW OF SOME SIMPLE STRESS-LIFE RELATIONSHIPS

5.1 Arrhenius Relationship

The Arrhenius relationship is commonly used for analyzing data for which temperature is the accelerated stress. The Arrhenius model is given by,

    L(V) = C · e^(B/V)

where:
• L represents a quantifiable life measure, such as mean life, characteristic life, median life, B(x) life, etc.
• V represents the stress level (in absolute units if it is temperature).
• C is a model parameter to be determined (C > 0).
• B is another model parameter to be determined.

5.2 Eyring Relationship

The Eyring relationship is also commonly used for analyzing data for which temperature is the accelerated stress. The Eyring model is given by,

    L(V) = (1/V) · e^-(A - B/V)

where:
• L represents a quantifiable life measure, such as mean life, characteristic life, median life, B(x) life, etc.
• V represents the stress level.
• A is one of the model parameters to be determined.
• B is another model parameter to be determined.

5.3 Inverse Power Law Relationship

The inverse power law relationship (or IPL) is commonly used for analyzing data for which the accelerated stress is non-thermal in nature. The IPL model is given by,

    L(V) = 1 / (K · V^n)

where:
• L represents a quantifiable life measure, such as mean life, characteristic life, median life, B(x) life, etc.
• V represents the stress level.
• K is a model parameter to be determined (K > 0).
• n is another model parameter to be determined.

5.4 Temperature-Humidity Relationship

The temperature-humidity relationship is a two-stress relationship. It is commonly used for predicting the life at use conditions when temperature and humidity are the accelerated stresses in a test. This combination model is given by,

    L(U,V) = A · e^(φ/V + b/U)

where:
• φ is one of the three parameters to be determined.
• b is the second of the three parameters to be determined (also known as the activation energy for humidity).
• A is the third of the three parameters to be determined.
• U is the relative humidity.
• V is the temperature (in absolute units).

5.5 Temperature-Non-Thermal Relationship

The temperature-non-thermal relationship is another two-stress model. This relationship is given by,

    L(U,V) = C / (U^n · e^(-B/V))

where:
• U is the non-thermal stress (e.g., voltage).
• V is the temperature (in absolute units).
• B, C and n are the parameters to be determined.
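The five relationships above transcribe directly into code. The sketch below is a plain restatement of the formulas in Section 5 (parameter names follow the text; the argument orders are my own choice):

```python
import math

# Direct transcriptions of the stress-life relationships in Section 5.
# V is the (absolute) temperature where the relationship is thermal,
# U is the second stress; A, B, C, K, n, phi, b are model parameters.

def arrhenius(V, C, B):
    """L(V) = C * exp(B / V)"""
    return C * math.exp(B / V)

def eyring(V, A, B):
    """L(V) = (1 / V) * exp(-(A - B / V))"""
    return (1.0 / V) * math.exp(-(A - B / V))

def inverse_power_law(V, K, n):
    """L(V) = 1 / (K * V**n), with K > 0"""
    return 1.0 / (K * V**n)

def temperature_humidity(U, V, A, phi, b):
    """L(U, V) = A * exp(phi / V + b / U); U is relative humidity."""
    return A * math.exp(phi / V + b / U)

def temperature_nonthermal(U, V, B, C, n):
    """L(U, V) = C / (U**n * exp(-B / V)); U is the non-thermal stress."""
    return C / (U**n * math.exp(-B / V))
```

Each function returns the life characteristic L at the given stress level(s) once the parameters have been estimated from test data (Section 6).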

6. PARAMETER ESTIMATION

Once you have selected an underlying life distribution and a stress-life relationship model to fit your accelerated test data, the next step is to select a method by which to perform parameter estimation. Simply put, parameter estimation involves fitting a model to the data and solving for the parameters that describe that model. In our case the model is a combination of the life distribution and the stress-life relationship. The task of parameter estimation can vary from trivial (with ample data, a single constant stress, a simple distribution and a simple model) to impossible. Available methods for estimating the parameters of a model include the graphical method, the least squares method and the maximum likelihood estimation method. Computer software can be used to accomplish this task [12; 10; 11].

7. RELIABILITY INFORMATION

Once the parameters of the underlying life distribution and stress-life relationship have been estimated, a variety of reliability information about the product can be derived, such as:
• Warranty time.
• The instantaneous failure rate, which indicates the number of failures occurring per unit time.
• The mean life, which provides a measure of the average time of operation to failure.

8. STRESS LOADING

The discussion of accelerated life testing analysis thus far has included the assumption that the stress loads applied to units in an accelerated test have been constant with respect to time. In real life, however, different types of loads can be considered when performing an accelerated test. Accelerated life tests can be classified as constant stress, step stress, cycling stress, or random stress. These types of loads are classified according to the dependency of the stress with respect to time. There are two possible stress loading schemes: loadings in which the stress is time-independent and loadings in which the stress is time-dependent. The mathematical treatment, models and assumptions vary depending on the relationship of stress to time. This tutorial deals with time-independent stresses, the most common type of stress loading. Treatment of time-dependent stresses is complex and well beyond the scope of this tutorial. Participants interested in the analysis of data utilizing time-dependent stresses can refer to [9].

8.1 Stress is Time-Independent (Constant Stress)

When the stress is time-independent, the stress applied to a sample of units does not vary. In other words, if temperature is the thermal stress, each unit is tested under the same accelerated temperature, e.g., 100°C, and data are recorded. This is the type of stress load that has been discussed so far.

[Figure 8: Graphical representation of time vs. stress in a time-independent stress loading.]

This type of stress loading has many advantages over time-dependent stress loadings. Specifically:
• Most products are assumed to operate at a constant stress under normal use.
• It is far easier to run a constant stress test (e.g., one in which the chamber is maintained at a single temperature).
• It is far easier to quantify a constant stress test.
• Models for data analysis exist, are widely publicized and are empirically verified.
• Extrapolation from a well executed constant stress test is more accurate than extrapolation from a time-dependent stress test.

8.2 Stress is Time-Dependent

When the stress is time-dependent, the product is subjected to a stress level that varies with time. Products subjected to time-dependent stress loadings will yield failures more quickly, and models that fit them are thought by many to be the "holy grail" of accelerated life testing. The current state of analysis techniques for time-dependent stress loading schemes can be best expressed by a passage in Dr. Wayne Nelson's accelerated testing book [6]. Dr. Nelson writes, "Such cumulative exposure models are like the weather. Everybody talks about them, but nobody does anything about them. Many models appear in literature, few have been fitted to data and even fewer assessed for adequacy of fit. Moreover, fitting such a model to data requires a sophisticated special computer program. Thus, constant stress tests are generally recommended over step-stress tests for reliability estimation."

8.3 Stress is Quasi Time-Dependent

The step-stress model [6] and the related ramp-stress model are typical cases of quasi time-dependent stress tests. In these cases, the stress load remains constant for a period of time and then is stepped/ramped to a different stress level, where it remains constant for another time interval until it is stepped/ramped again. There are numerous variations of this concept.


[Figure 9: Graphical representation of the step-stress model.]

[Figure 10: Graphical representation of the ramp-stress model.]

8.4 Stress is Continuously Time-Dependent

The concept of stress-life models that include stress as a continuous function of time has not been widely contemplated in the literature. An introduction to these models can be found in [6], and an in-depth discussion and applications in [9]. Analyses of these types of stress models are more complex than those of the quasi time-dependent models and require advanced software packages such as [11] to accomplish.

[Figure 11: Graphical representation of a constantly increasing (or progressive) stress model.]

[Figure 12: Graphical representation of a completely time-dependent stress model.]

9. AN INTRODUCTION TO THE ARRHENIUS RELATIONSHIP

One of the most commonly used stress-life relationships is the Arrhenius. It is an exponential relationship, and it was formulated by assuming that life is proportional to the inverse reaction rate of the process; thus, the Arrhenius stress-life relationship is given by,

    L(V) = C · e^(B/V)    (1)

where:
• L represents a quantifiable life measure, such as mean life, characteristic life, median life, or B(x) life, etc.
• V represents the stress level (formulated for temperature, with temperature values in absolute units, i.e., degrees Kelvin or degrees Rankine; this is a requirement because the model is exponential, so negative stress values are not possible).
• C is one of the model parameters to be determined (C > 0).
• B is another model parameter to be determined.

Since the Arrhenius is a physics-based model derived for temperature dependence, it is strongly recommended that the model be used for temperature-accelerated tests. For the same reason, temperature values must be in absolute units (Kelvin or Rankine), even though eq (1) is unitless.

The Arrhenius relationship can be linearized and plotted on a life vs. stress plot, also called the Arrhenius plot. The relationship is linearized by taking the natural logarithm of both sides of eq (1), or,

    ln(L(V)) = ln(C) + B/V    (2)

In eq (2), ln(C) is the intercept of the line and B is the slope of the line. Note that the inverse of the stress, and not the stress, is the variable. In Figure 13, life is plotted versus stress and not versus the inverse stress, because eq (2) was plotted on a reciprocal scale. On such a scale, the slope B appears to be negative even though it has a positive value. This is because B is actually the slope of the reciprocal of the stress and not the slope of the stress, and the reciprocal of the stress decreases as the stress increases (1/V is decreasing as V is increasing). The two different axes are shown in Figure 14.
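Because eq (2) is linear in 1/V, the life at two (absolute) temperatures is enough to solve for B and C directly; this is the graphical-method idea from Section 6 in two lines of algebra. The two (V, L) pairs below are hypothetical:

```python
import math

# Solving eq (2), ln(L) = ln(C) + B/V, for B and C given the life observed
# at two absolute temperatures. The two (V, L) pairs here are hypothetical.

def arrhenius_from_two_points(V1, L1, V2, L2):
    """Return (B, C) such that L = C * exp(B / V) passes through both points."""
    B = (math.log(L1) - math.log(L2)) / (1.0 / V1 - 1.0 / V2)
    C = L1 / math.exp(B / V1)
    return B, C

B, C = arrhenius_from_two_points(V1=380.0, L1=1200.0, V2=420.0, L2=300.0)

# Sanity check: the fitted model reproduces both input points.
assert abs(C * math.exp(B / 380.0) - 1200.0) < 1e-6
```

With more than two test stress levels, least squares or MLE (Section 6) replaces this exact solve, but the linearized form is the same.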

9.1 A Look at the Parameter B

Depending on the application (and where the stress is exclusively thermal), the parameter B can be replaced by,

    B = E_A / K = (activation energy) / (Boltzmann's constant)    (3)

where K = 8.623 × 10^-5 eV·K^-1.

Note that in this formulation the activation energy must be known a priori. If the activation energy is known, then there is only one model parameter remaining, C. Because in most real-life situations this is rarely the case, all subsequent formulations will assume that this activation energy is unknown and treat B as one of the model parameters. As can be seen in eq (3), B has the same properties as the activation energy; in other words, B is a measure of the effect that the stress (i.e., temperature) has on the life. The larger the value of B, the higher the dependency of the life on the specific stress. Parameter B may also take negative values; in that case, life increases with increasing stress (see Figure 15). An example of this would be plasma-filled bulbs, where low temperature is a higher stress on the bulbs than high temperature.

[Figure 13: The Arrhenius relationship linearized on log-reciprocal paper.]

[Figure 14: An example of both reciprocal and non-reciprocal scales for the Arrhenius relationship.]

[Figure 15: Behavior of the parameter B.]

The Arrhenius relationship is plotted on a reciprocal scale for practical reasons. For example, in Figure 14 it is more convenient to locate the life corresponding to a stress level of 370 K than to take the reciprocal of 370 K (0.0027) first and then locate the corresponding life. The shaded areas shown in Figure 14 are the imposed pdf's at each test stress level. From such imposed pdf's one can see the range of the life at each test stress level, as well as the scatter in life.
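When the activation energy is known a priori, eq (3) reduces B to a single division. A minimal sketch, using the Boltzmann constant value quoted in the text and a hypothetical activation energy:

```python
# Eq (3): when the stress is exclusively thermal and the activation energy
# E_A is known a priori, the Arrhenius parameter B can be replaced by
# B = E_A / K, with K the Boltzmann constant (8.623e-5 eV/K as used here).

BOLTZMANN_EV_PER_K = 8.623e-5

def b_from_activation_energy(e_a_ev):
    """Return the Arrhenius parameter B (in Kelvin) for E_A given in eV."""
    return e_a_ev / BOLTZMANN_EV_PER_K

# A hypothetical activation energy of 0.7 eV gives B on the order of 8e3 K:
B = b_from_activation_energy(0.7)
```

As the text notes, the activation energy is rarely known in practice, so B is usually treated as a free model parameter instead.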

RF #2003RM-100

9.2 Acceleration Factor

Most practitioners use the term acceleration factor to refer to the ratio of the life (or of an acceleration characteristic) between the use stress level and a higher test stress level:

A_F = L_USE / L_Accelerated

For the Arrhenius model this factor is

A_F = L_USE / L_Accelerated = (C e^(B/V_u)) / (C e^(B/V_A)) = e^(B(1/V_u − 1/V_A))

Thus, if B is assumed to be known a priori (using an activation energy), the assumed activation energy alone dictates this acceleration factor!

9.3 Arrhenius Relationship Combined with a Life Distribution

All of the relationships presented must be combined with an underlying life distribution for analysis. The simplest combination is with the exponential distribution, shown next.

9.3.1 Arrhenius Exponential

The pdf of the 1-parameter exponential distribution is given by

f(t) = λ e^(−λt)   (4)

It can easily be shown that the mean life for the 1-parameter exponential distribution is

m = 1/λ   (5)

thus

f(t) = (1/m) e^(−t/m)   (6)

The Arrhenius-exponential model pdf can then be obtained by setting m = L(V) in eq. (6), with

m = L(V) = C e^(B/V)

Substituting for m in eq. (6) yields a pdf that is a function of both time and stress:

f(t,V) = (1/(C e^(B/V))) e^(−t/(C e^(B/V)))

Once the pdf is obtained, all other metrics of interest (reliability, MTTF, etc.) can be easily formulated. For more information see [8, 12].

9.3.2 Arrhenius-Weibull

A more useful variation is the Arrhenius-Weibull formulation, which is obtained by considering the pdf of the 2-parameter Weibull distribution:

f(t) = (β/η) (t/η)^(β−1) e^(−(t/η)^β)   (7)

The scale parameter (or characteristic life) of the Weibull distribution is η. The Arrhenius-Weibull model pdf can then be obtained by setting η = L(V) in eq. (7),

η = L(V) = C e^(B/V)   (8)

and substituting for η in eq. (7):

f(t,V) = (β/(C e^(B/V))) (t/(C e^(B/V)))^(β−1) e^(−(t/(C e^(B/V)))^β)   (9)

An illustration of the pdf at different stresses is shown in Figure 16. As expected, the pdf at lower stress levels is more stretched to the right, with a higher scale parameter, while its shape remains the same (the shape parameter is approximately 3 in Figure 16). This behavior is observed when the parameter B of the Arrhenius model is positive. Figure 17 illustrates the behavior of the reliability function for the same parameter set.

Figure 16: Behavior of the probability density function at different stresses, with the model parameters held constant.
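As a numeric sketch of eqs. (7)-(9), the snippet below evaluates the Arrhenius-Weibull pdf and its reliability function. The parameter values used (β = 3, B = 2000, C = 50) are illustrative assumptions, not values from the text.

```python
import math

def eta(V, B=2000.0, C=50.0):
    # eq. (8): characteristic life as a function of stress, eta = C * exp(B/V)
    return C * math.exp(B / V)

def arrhenius_weibull_pdf(t, V, beta=3.0, B=2000.0, C=50.0):
    # eq. (9): 2-parameter Weibull pdf with the scale parameter set to L(V)
    e = eta(V, B, C)
    return (beta / e) * (t / e) ** (beta - 1) * math.exp(-((t / e) ** beta))

def arrhenius_weibull_reliability(t, V, beta=3.0, B=2000.0, C=50.0):
    # R(t, V) = exp(-(t/eta)^beta), the survival function implied by eq. (9)
    e = eta(V, B, C)
    return math.exp(-((t / e) ** beta))
```

With B positive, a lower stress gives a larger η (longer characteristic life) while β, and hence the shape of the pdf, stays fixed, which is exactly the behavior described for Figure 16.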

The advantage of using the Weibull distribution as the life distribution lies in its flexibility to assume different shapes.

Figure 17: Behavior of the reliability function at different stresses and constant parameter values.

9.3.3 Example

Consider the following times-to-failure data at three different stress levels.

Table 2: Times-to-failure data at three different stress levels.

The data were analyzed jointly, with a complete MLE solution over the entire data set, using [10]. The analysis yields

β̂ = 4.291
B̂ = 1861.618
Ĉ = 58.984

Once the parameters of the model are estimated, extrapolation and other life measures can be directly obtained using the appropriate equations. Using the MLE method, confidence bounds for all estimates can also be obtained. Note in Figure 18 that the more distant the accelerated stress is from the operating stress, the greater the uncertainty of the extrapolation. The degree of uncertainty is reflected in the confidence bounds.

Figure 18: Comparison of the confidence bounds for different use stress levels.

9.4 Other Single Constant Stress Models

The same formulations can be applied to other models, such as the
• Eyring relationship (exponential relationship),
• Inverse Power Law relationship (power relationship),
• Coffin-Manson relationship (power relationship utilizing a ΔV for stress).

One must be cautious in selecting a model. The physical characteristics of the failure mode under consideration must be understood, and the selected model must be appropriate. For example, in cases where the physical mechanisms of the failure mode are based on a power relation, the use of an exponential relationship would be inappropriate; a power model (i.e., the Inverse Power Law model) would be more appropriate.

10. AN INTRODUCTION TO TWO-STRESS MODELS

10.1 Temperature-Humidity Relationship Introduction

A variation of the Eyring relationship is the temperature-humidity (T-H) relationship, which has been proposed for predicting the life at use conditions when temperature and humidity are the accelerated stresses in a test. This combination model is given by

L(U,V) = A e^(φ/V + b/U)

where:
• φ is one of the three parameters to be determined,
• b is the second of the three parameters to be determined (also known as the activation energy for humidity),
• A is a constant, and the third of the three parameters to be determined,
• U is the relative humidity (decimal or percentage),
• V is the temperature (in absolute units).
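Using the parameter estimates from the example above (B̂ = 1861.618, Ĉ = 58.984), the acceleration factor of Section 9.2 can be evaluated directly. The 323 K use and 373 K accelerated temperatures below are hypothetical choices for illustration; they are not stated in the text.

```python
import math

# Parameter estimates from the worked example (ALTA MLE fit)
B_hat = 1861.618
C_hat = 58.984

def life(V):
    # eta(V) = C * exp(B/V), the Arrhenius life-stress relationship
    return C_hat * math.exp(B_hat / V)

def acceleration_factor(V_use, V_acc):
    # A_F = L_use / L_acc = exp(B * (1/V_use - 1/V_acc))
    return math.exp(B_hat * (1.0 / V_use - 1.0 / V_acc))

AF = acceleration_factor(323.0, 373.0)
```

Note that the characteristic-life ratio and the closed-form acceleration factor agree by construction, since C cancels in the ratio.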

Since life is now a function of two stresses, a life vs. stress plot can only be obtained by keeping one of the two stresses constant and varying the other. In Figure 19, data obtained from a temperature and humidity test were analyzed and plotted on log-reciprocal paper. In the first plot, life is plotted versus temperature with relative humidity held at a fixed value; in the second plot, life is plotted versus relative humidity with temperature held at a fixed value.

Note that in Figure 19 the points shown in these plots represent the life characteristics at the test stress levels (the data were fitted to a Weibull distribution, thus the points represent the scale parameter, η). For example, the points shown in the first plot represent η at each of the test temperature levels (two temperature levels were considered in this test).

10.1.1 A Note about T-H Data

When using the T-H relationship, the effect of both temperature and humidity on life is sought. For this reason, the test must be performed with combinations of the different stress levels of the two stress types. For example, assume that an accelerated test is to be performed at two temperature and two humidity levels. The two temperature levels were chosen to be 300K and 343K; the two humidity levels were chosen to be 0.6 and 0.8. It would be wrong to perform the test at only (300K, 0.6) and (343K, 0.8). Doing so would not provide information about the temperature-humidity effects on life, because both stresses are increased at the same time and it is therefore unknown which stress is causing the acceleration in life. A possible combination that would provide information about temperature-humidity effects on life would be (300K, 0.6), (300K, 0.8) and (343K, 0.8). By testing at (300K, 0.6) and (300K, 0.8), the effect of humidity on life can be determined (since temperature remains constant); similarly, the effect of temperature on life can be determined by testing at (300K, 0.8) and (343K, 0.8) (since humidity remains constant).

10.1.2 An Example Using the T-H Model

The following data were collected after testing twelve electronic devices at different temperature and humidity conditions:

Table 3: T-H Data

Using [10], the following results were obtained:

β̂ = 5.874
Â = 0.0000597
b̂ = 0.281
φ̂ = 5630.330
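A small sketch of the T-H model evaluated with these fitted parameters. The 318 K temperature and the 0.4/0.8 relative-humidity levels below are assumed here for illustration only.

```python
import math

# Fitted T-H parameters from the example: A, b (humidity activation
# energy), and phi
A_hat, b_hat, phi_hat = 0.0000597, 0.281, 5630.330

def th_life(U, V):
    # L(U, V) = A * exp(phi/V + b/U); U = relative humidity, V = kelvin
    return A_hat * math.exp(phi_hat / V + b_hat / U)

# Humidity-only acceleration factor at a fixed temperature (318 K assumed):
humidity_af = th_life(0.4, 318.0) / th_life(0.8, 318.0)
```

Because temperature is held constant, A and the temperature term cancel, so the ratio reduces to exp(b (1/U_use − 1/U_acc)), mirroring the single-stress acceleration factor of Section 9.2.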

10.2 Temperature-Non-Thermal Relationship Introduction

When temperature and a second non-thermal stress (e.g., voltage) are the accelerated stresses of a test, the Arrhenius and the inverse power law models can be combined to yield the temperature-non-thermal (T-NT) model. This model is given by

L(U,V) = C / (U^n e^(−B/V))

where:
• U is the non-thermal stress (i.e., voltage, vibration, etc.),
• V is the temperature (in absolute units),
• B, C, and n are the parameters to be determined.

In Figure 20, data obtained from a temperature and voltage test were analyzed and plotted on a log-reciprocal scale. In the first plot, life is plotted versus temperature with voltage held at a fixed value; in the second plot, life is plotted versus voltage with temperature held at a fixed value.

Figure 20: Life vs. stress plots for the Temperature-Non-Thermal model, holding voltage constant in the first plot and temperature constant in the second.

11. A VERY SIMPLE TUTORIAL EXAMPLE

To illustrate the principles behind accelerated testing, consider the following simple example, which involves a paper clip and can easily and independently be performed by the reader. The objective was to determine the mean number of cycles-to-failure of a given paper clip. The use cycles were assumed to be at a 45° bend. The acceleration stress was determined to be the angle to which the clips are bent; thus two accelerated bend stresses of 90° and 180° were used. The paper clips were tested using the following procedure for the 90° bend; a similar procedure was used for the 180° and 45° tests.

1. Open the paper clip.
   1. With one hand, hold the clip by the longer, outer loop.
   2. With the thumb and forefinger of the other hand, grasp the smaller, inner loop.
   3. Pull the smaller, inner loop out and down 90 degrees so that a right angle is formed.
2. Close the paper clip.
   1. With one hand, continue to hold the clip by the longer, outer loop.
   2. With the thumb and forefinger of the other hand, grasp the smaller, inner loop.
   3. Push the smaller, inner loop up and in 90 degrees so that the smaller loop is returned to its original upright position, in line with the larger, outer loop. This completes one cycle.
3. Repeat until the paper clip breaks. Count and record the cycles-to-failure for each clip.

At this point the reader must note that the paper clips used in this example were “Jumbo” paper clips capable of repeated bending; different paper clips will yield different results. Additionally, so that no other stresses are imposed, caution must be taken to ensure that the rate at which the paper clips are cycled remains the same across the experiment.

For the experiment, a sample of six paper clips was tested to failure at both the 90° and 180° bends. A base test sample of six paper clips was tested at a 45° bend (the assumed use stress level) to confirm the analysis. The cycles-to-failure data obtained are given next.


Cycles-to-failure at 90° 16, 17, 18, 21, 22, 23 cycles.

Cycles-to-failure at 180° 4, 5, 5, 5.5, 6, 6.5 cycles.

Cycles-to-failure at 45° 58, 63, 65, 72, 78, 86 cycles.
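As a rough cross-check of these data, a simple two-point inverse power law fit to the sample means can be sketched as below. This is not the published analysis (which used a lognormal distribution with MLE in [10]); it is only an approximation under the stated assumptions.

```python
import math

# Cycles-to-failure data from the paper clip experiment
cycles_90 = [16, 17, 18, 21, 22, 23]
cycles_180 = [4, 5, 5, 5.5, 6, 6.5]

mean_90 = sum(cycles_90) / len(cycles_90)     # sample mean at 90 degrees
mean_180 = sum(cycles_180) / len(cycles_180)  # sample mean at 180 degrees

# Inverse power law L(S) = 1/(K * S^n): solve for n from the two stress levels
n = math.log(mean_90 / mean_180) / math.log(180 / 90)

# Extrapolate the mean cycles-to-failure down to the 45-degree use stress
mtf_45 = mean_90 * (90 / 45) ** n
```

Even this crude fit lands close to the 45° base-test results listed above, which is the point of the tutorial: the accelerated data predict the use-level life.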

The accelerated test data were then analyzed in [10], assuming a lognormal life distribution (fatigue) and an inverse power law relationship (non-thermal) for the stress-life model. The analysis and some of the results are shown in the next figures. The base data were analyzed using [12] and a base MTTF estimated. In this case our accelerated test correctly predicted the MTTF, as verified by the base test. It is interesting to note (see Figure 23) that mathematically one can come up with very high acceleration factors; however, to accomplish this the stresses must be foolishly high (i.e., a 360°+ bend on the paper clips) and would cause the product to fail under modes that are not realistic.

Figure 22: Resulting probability plot for the 90° and 180° bends.

Figure 23: The resulting acceleration factor versus stress plot.

Figure 21: The accelerated test data analyzed in [10].

Figure 24: The resulting life versus stress plot from [10]. Note that from the plot the estimated MTTF at a 45° bend is 71.6 cycles. This was estimated utilizing the 90° and 180° bend data. The base 45° data, analyzed in [12] utilizing a lognormal distribution, yielded an MTTF estimate of 70.33 cycles.

12. ADVANCED CONCEPTS

12.1 Confidence Bounds

The confidence bounds on the parameters, and on a number of other quantities such as the reliability, can be obtained based on the asymptotic theory for maximum likelihood estimates, for complete and censored data. This type of confidence bounds is most commonly referred to as the Fisher matrix bounds.

12.2 Multivariable Relationships

So far in this tutorial the life-stress relationships presented have been either single-stress or two-stress relationships. In most practical applications, however, life is a function of more than one or two variables (stress types). In addition, there are many applications where the life of a product is sought as a function of stress and of some engineering variable other than stress. A multivariable relationship is therefore needed in order to analyze such data. One such relationship is the general log-linear relationship, which describes a life characteristic as a function of a vector of n stresses. Mathematically, the model is given by

L(X) = e^(a_0 + Σ_{i=1}^{n} a_i X_i)

where:
• the a_i are model parameters,
• X is a vector of n stresses.

Note that a reciprocal transformation on X, X = 1/V, results in an exponential life-stress relationship, while a logarithmic transformation, X = ln(V), results in a power life-stress relationship.

12.3 Time-Varying Stress Models

When the test stresses are time-dependent (see Section 8), the life-stress relationships can be extended to account for this type of stress. As an example, consider an exponential life-stress relationship utilizing a time-varying stress:

L(V(t)) = C e^(B/V(t))

Treatment and analysis of time-varying stresses requires further assumptions and more complex analysis techniques [6, 9, 11].

REFERENCES

1. Glasstone, S., Laidler, K. J., and Eyring, H. E., The Theory of Rate Processes, McGraw-Hill, NY, 1941.
2. Groebel, David, Mettas, Adamantios, and Sun, Feng-Bin, "Determination and Interpretation of Activation Energy Using Accelerated Test Data," 47th Reliability and Maintainability Symposium.
3. Kececioglu, Dimitri, and Sun, Feng-Bin, Environmental Stress Screening - Its Quantification, Optimization and Management, Prentice Hall PTR, New Jersey, 1995.
4. Kececioglu, Dimitri, and Sun, Feng-Bin, Burn-In Testing - Its Quantification and Optimization, Prentice Hall PTR, New Jersey, 1997.
5. Mettas, Adamantios, "Modeling & Analysis for Multiple Stress-Type Accelerated Life Data," 46th Reliability and Maintainability Symposium.
6. Nelson, Wayne, Accelerated Testing: Statistical Models, Test Plans, and Data Analyses, John Wiley & Sons, Inc., New York, 1990.
7. ReliaSoft Corporation, Life Data Analysis Reference, ReliaSoft Publishing, Tucson, AZ, 2000. Also published on-line at www.Weibull.com.
8. ReliaSoft Corporation, Accelerated Life Testing Reference, ReliaSoft Publishing, Tucson, AZ, 1998. Also published on-line at www.Weibull.com.
9. ReliaSoft Corporation, ALTA 6 Accelerated Life Testing Reference, ReliaSoft Publishing, Tucson, AZ, 2001.
10. ReliaSoft Corporation, ALTA 6.0 Software Package, Tucson, AZ, www.ReliaSoft.com.
11. ReliaSoft Corporation, ALTA PRO 6.0 Software Package, Tucson, AZ, www.ReliaSoft.com.
12. ReliaSoft Corporation, Weibull++ 6.0 Software Package, Tucson, AZ, www.Weibull.com.
13. Striny, Kurt M., and Schelling, Arthur W., "Reliability Evaluation of Aluminum-Metalized MOS Dynamic RAMs in Plastic Packages in High Humidity and Temperature Environments," IEEE 31st Electronic Components Conference, pp. 238-244, 1981.

Accelerated Reliability Testing for Commercial and Utility PV Inverters

Ron Vidano

February 25, 2015

Abstract

Accelerated testing is an efficient strategy to improve reliability for commercial and utility photovoltaic inverter equipment. The two most often used tests are highly accelerated life testing (HALT) and accelerated life testing (ALT).

HALT yields results within a few days because of the acceleration applied in the test: the unit is subjected to progressively higher stress levels, including combined temperature and vibration. HALT is an invaluable method for uncovering design weaknesses and is used at both the system and the assembly level.

Accelerated life testing (ALT) is useful for determining wear-out mechanisms and lifetime within confidence limits. ALT can determine product reliability within a short period of weeks or months by applying environmental acceleration factors. ALT can find dominant failure mechanisms and is a valuable tool for the discovery of wear-out failures. In addition, ALT methods can serve as qualification criteria against prescribed lifetime confidence limits.

ALT at the system level involves integration of multiple units such as an inverter and power supply within a large environmentally controlled facility. Subsystem life testing can be completed within smaller environmental enclosures or may be accomplished as a component integrated within the inverter at the unit or system level testing facility.

For ALT, the acceleration factor, length of the test, number of samples, required confidence, and test environment are specified. The most common temperature acceleration factor is based upon the Arrhenius model. For PV inverters, another acceleration factor is the duty cycle, whereby testing may run continually as opposed to the sun-cycle restrictions of on-site exposure. In addition, solar simulation provides for the inverter cycling experienced during environmental and solar resource extremes. One element of efficient ALT qualification is envelope performance testing at environmental extremes.

It is advantageous to synergize HALT methods, which determine design weaknesses, with ALT procedures, which provide insight into wear-out lifetimes. Once it has been determined that the inverter design can attain expected lifetimes, burn-in procedures are developed and used to ensure that the product does not contain process or assembly defects.

Methodology - Reliability Assurance Milestones During Inverter Product Lifecycle

AE uses a closed-loop reliability process for reliability, maintainability and manufacturability:
• Design for Reliability – MTBF, DFMEA, fault tree
• Qualification testing (reliability test) – quantitative: ALT, thermal; qualitative: HALT
• Manufacturing assurance (qualification test) – power profile, harmonics, waveform, modulation, control loop, compliance, WCSA, limits, control & communication, burn-in development
• Field monitoring and FRACAS, closing a continuous design improvement loop

Range of PV Inverters for Accelerated Testing

• String Inverters such as the 3TL Gen3 24kW

• Central Inverters such as the 500TX and 500NX

• Utility Inverters such as the 1000NX

AE Reliability Assurance Background

AE Reliability Assurance Program
• AE’s Solar and Precision Power Supply customer base requires a reliability focus.
• All products are required to meet a very low AFR.
• PV inverter products have unique challenges: high firmware content (HIL methods), harsh environments, stringent warranties, 20-year durability, >99% availability, high efficiency, and inverter reliability that must compensate for BOE issues.
• Advanced power supply infrastructure and simulation: grid and solar simulators, monitoring.

Inverter Reliability Assurance Program
• Design for Reliability (DfR) focus areas:
  – Modularity; improves reliability, repair, test, and manufacturing
  – Derating; component and subassembly derating to reduce operating stress
  – Temperature management; achievement of reduced operating temperatures
  – Predictive methods – MTBF, DFMEA, fault tree assessments
• Reliability test (the focus for this presentation):
  – Verification of potential causes based upon DFMEA
  – Subassembly ALT, thermal, thermal cycle
  – Environmental testing – temp/humidity, salt fog
  – HALT
  – System-level ALT
• Experience; reliability growth:
  – Product lifecycle learning experiences fed back into design
  – Improvements based upon assurance testing and field experience

Accelerated Testing Applied to PV Inverters

• Accelerated Life Testing
  – Temperature
  – Humidity, temperature-humidity
  – Voltage
  – Temperature cycling
  – Power cycling
• Highly Accelerated Life Testing
  – Cold step stressing
  – Hot step stressing
  – Rapid thermal transitions
  – Vibration step stressing
  – Combined environments

Performance Testing – Solar Simulation

• AE has installed programmable supplies to perform solar simulation testing
• Example of NREL test profile demonstrated with the 1000NX inverter
• Example of actual site irradiance data programmed for test

Advanced Power Supply AC2000P
Environmental Chamber

Accelerated Life Test (ALT) – Temperature Acceleration

Durability tests, such as subsystem and system-level accelerated life testing (ALT), are key tools to qualify the reliability of new designs. The acceleration factor scales for different activation energies and life test temperatures.

The most common temperature acceleration factor AF(T) is based upon the Arrhenius model

• Kb is Boltzmann’s constant, To is the initial ambient temperature in kelvin, T is the life test temperature in kelvin, and Ea is the activation energy in eV.

AF(T) = exp[(Ea/Kb)(1/To − 1/T)]

λ ∝ Failures / (Total Device Hours × AF(T))

ALT is a gage of the inverter's durability in reaching the end-of-life failure-rate region.

Long Term Life Test Profile Example; System Level ALT
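The AF(T) expression above can be sketched as follows. The 0.7 eV activation energy and the 25 °C ambient / 50 °C test temperature pair are assumed example values, not AE test parameters.

```python
import math

KB_EV = 8.617e-5  # Boltzmann's constant in eV/K

def af_temperature(Ea, T0, T):
    # Arrhenius acceleration factor: AF(T) = exp[(Ea/Kb)(1/To - 1/T)],
    # with both temperatures in kelvin
    return math.exp((Ea / KB_EV) * (1.0 / T0 - 1.0 / T))

# Assumed example: Ea = 0.7 eV, 298 K (25 C) ambient vs 323 K (50 C) test
af = af_temperature(0.7, 298.0, 323.0)
```

A modest 25 K rise already multiplies the effective device-hours by roughly a factor of eight at this assumed activation energy, which is why temperature is the workhorse ALT stress.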

Repeat Cycle

System Environmental Chamber

AE has performed ALT for up to two calendar years on inverters at 50 °C, 24x7.

Short Term Life Test Profile Example; System Level ALT

AE has developed accelerated life test facilities in Fort Collins, CO and Bend, OR, which are capable of grid test simulation at high temperatures using advanced programmable power supplies with solar simulators.

Anderson Electric Controls Supply and Solar Simulator Inverter housed within AE environmental chamber

Using solar simulators, AE has performed ALT for up to two calendar months on inverters at 50 °C, 24x7.

AE Background with HALT, HASS

• Highly accelerated life test (HALT) is a qualitative technique pioneered by leading firms such as HP to develop very reliable printers
• AE adopted the technique to develop reliable precision power supplies used in semiconductor processing
• Several HALT chambers were installed for testing and qualification, as well as highly accelerated stress screening (HASS) chambers
• HALT has been used for the past seven years to test and qualify PV inverter systems and subsystems

Highly Accelerated Life Test - HALT
• HALT is intended to uncover design and design-margin issues
• Five stresses:
  – Cold step stressing

  – Hot step stressing
  – Rapid thermal transitions
  – Vibration step stressing
  – Combined environments; in addition to maximum loading, the inverters are exercised under power
• Corrective actions on test failures; goal of <5% field failures
• Achievement of acceptable design margins: temperature margins, vibration margins, combined stresses

HALT; PV Inverter Subsystems and Systems
• Utility inverters: entire switching assembly (engine), DC contactor assemblies, aux power supplies, cable assemblies, line reactors, communication subsystem, PCBAs (digital control, analog, sensor control)
• String inverters: entire 3TL 24kW, entire 3TL 48kW

System Level Burn-In for Utility Inverters
• Burn-in testing takes place at the unit level to stress the components for a designated period of time, precipitating early-lifetime component mortality (temperature and voltage acceleration factors)
• Weibull statistics are accumulated to assess the burn-in cycle
• The burn-in cycle contains voltage and power cycling, which is done to ensure that power connections such as the bolted-joint assemblies are robust, as well as to test low-power electrical connector interfaces
• Production burn-in reduces the number of failures in the early (decreasing failure rate) lifetime region

Conclusions

• Accelerated life testing can be effectively employed for both subsystem and system level qualification of central, utility and string inverters • HALT qualification is most effective at the subsystem level for central and utility inverters • For string inverters, HALT qualification offers a unique approach for reliability improvement of the entire product

AE World Headquarters
1625 Sharp Point Drive
Fort Collins, CO 80525

800.446.9167 970.221.4670 [email protected]

www.advanced-energy.com

© 2014 Advanced Energy Industries, Inc. All rights reserved.

An Introduction to Fault Tree Analysis (FTA)

Dr Jane Marshall
Product Excellence using 6 Sigma Module

PEUSS 2011/2012 FTA Page 1

Objectives

– Understand purpose of FTA – Understand & apply rules of FTA – Analyse a simple system using FTA – Understand & apply rules of Boolean algebra


Relationship between FMEA & FTA

(Diagram: Product Failure at the top and Part Failure at the bottom. Fault Tree Analysis (FTA) works from the product failure downward; Failure Mode & Effect Analysis (FMEA) works from the part failure upward.)


Fault Tree Analysis

• Is a systematic method of system analysis
• Examines the system from the top down
• Provides graphical symbols for ease of understanding
• Incorporates mathematical tools to focus on critical areas


Fault tree analysis (FTA)

• Key elements:
  – Gates represent the outcome
  – Events represent input to the gates
• FTA is used to:
  – investigate potential faults,
  – their modes and causes,
  – and to quantify their contribution to system unreliability.


Symbols

• Basic Event
• ‘AND’ Gate (output A ∩ B)
• ‘OR’ Gate (output A ∪ B)
• Transfer out
• Transfer in

Example Fault Tree

(Diagrams: a fault tree developed from the top event, through intermediate gates, down to its basic events, and the same tree redrawn ready for analysis.)

Example: redundant fire pumps

Source: http://www.ntnu.no/ross/srt/slides/fta.pdf




Example


Methodology (Preliminary Analysis)

• Set System Boundaries • Understand Chosen System • Define Top Events


Methodology (Rules)

1. The “Immediate, Necessary & Sufficient” Rule 2. The “Clear Statement” Rule 3. The “No Miracles” Rule 4. The “Complete-the-Gate” Rule 5. The “No Gate-to-Gate” Rule 6. The “Component or System Fault?” Rule


Methodology (Rules - 1) – the immediate, necessary and sufficient cause rule

Immediate: closest, in time and derivation, to the event above.
Necessary: the event above could not result from a subset of the causal events; there is nothing superfluous in the statement or gate linkage.
Sufficient: the events will, in all circumstances and at all times, cause the event above.

Methodology (Rules - 2) – the clear statement rule

Write event box statements clearly, stating precisely what the event is and when it occurs


Methodology (Rules - 3) – The ‘component or systems fault’ rule

If the answer to the question: “Can this fault consist of a component failure?” is Yes, – Classify the event as a “State of component fault” If the answer is No, – Classify the event as a “state of system fault”


Methodology (Rules - 4) – the no miracles rule

If the normal functioning of a component propagates a fault sequence, then it is assumed that the component functions normally


Methodology (Rules - 5) – the complete gate rule

All inputs to a particular gate should be completely defined before further analysis of any one of them is undertaken


Methodology (Rules - 6) – the no gate-to-gate rule

Gate inputs should be properly defined fault events, and gates should not be directly connected to other gates


Fault Tree Example

(Circuit diagram: a battery, switch, motor, and connectors A and B wired in series. Top event: the motor does not run when the switch is pressed.)

Fault Tree Example

Top event: motor does not run when switch is pressed
• Motor failed
• No power supply
  – Switch malfunction
    • Switch broken
    • Insufficient force applied
  – No connection
    • Connector A detached
    • Connector B detached
  – Battery is dead

Qualitative Analysis (Combination of Gates)

For a top event Q fed by an AND gate whose inputs are two OR gates (with inputs A, C and D, B respectively), the algebraic representation is:

Q = (A ∪ C) ∩ (D ∪ B)

where ∪ denotes an OR gate and ∩ an AND gate.

11 Qualitative Analysis (Cut Sets)

A listing taken directly from the Fault Tree of the events, ALL of which must occur to cause the TOP Event to happen


Qualitative Analysis (Cut Sets)

The algebraic representation is:

Q = (A ∪ C) ∩ (D ∪ B)

which can be re-written as:

Q = (A ∩ D) ∪ (A ∩ B) ∪ (C ∩ D) ∪ (C ∩ B)
Q = (A • D) + (A • B) + (C • D) + (C • B)

…which is a listing of groupings, each of which is a Cut Set: AD, AB, CD, BC.

Qualitative Analysis (Minimal Cut Sets)

A listing, derived from the Fault Tree Cut Sets and reduced by Boolean Algebra, which is the smallest list of events that is necessary to cause the Top Event to happen


Qualitative Analysis (Boolean Algebra)

Commutative laws:
A ∩ B = B ∩ A  (A • B = B • A)
A ∪ B = B ∪ A  (A + B = B + A)

Associative laws:
A ∩ (B ∩ C) = (A ∩ B) ∩ C  (A • (B • C) = (A • B) • C)
A ∪ (B ∪ C) = (A ∪ B) ∪ C  (A + (B + C) = (A + B) + C)

Distributive laws:
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)  (A • (B + C) = A • B + A • C)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)  (A + (B • C) = (A + B) • (A + C))

Qualitative Analysis (Boolean Reduction)

Idempotent laws:
A • A = A
A + A = A

Absorption law:
A + (A • B) = A

Exercise in deriving Cut Sets

Derive the cut sets of the tree whose top event is:

(A ∪ B) ∩ ((A ∩ C) ∪ (D ∩ B)) ∩ (D ∩ C)

i.e., an AND gate whose three inputs are an OR gate with inputs A and B, an OR gate over (A ∩ C) and (D ∩ B), and an AND gate with inputs D and C.

Solution

(A ∪ B) ∩ ((A ∩ C) ∪ (D ∩ B)) ∩ (D ∩ C)
= (A + B) • (A • C + D • B) • D • C
= AACDC + ADBDC + BACDC + BDBDC
= ACD + ABCD + ABCD + BCD
= ACD + BCD

Minimal Cut Sets: ACD, BCD
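The reduction above can be sketched programmatically: expand the product terms (sets apply the idempotent law automatically), then apply the absorption law to keep only the minimal cut sets.

```python
from itertools import product

# Gate structure of the exercise: (A + B) * (A*C + D*B) * (D*C)
or1 = [{"A"}, {"B"}]
or2 = [{"A", "C"}, {"D", "B"}]
and3 = [{"D", "C"}]

# Distribute the AND over the ORs; set union removes repeated events
# (idempotent law), e.g. AACDC -> {A, C, D}
terms = [frozenset(x | y | z) for x, y, z in product(or1, or2, and3)]

# Absorption law: drop any term that strictly contains another term
minimal_cut_sets = {t for t in terms if not any(u < t for u in terms)}
```

The same two minimal cut sets, ACD and BCD, drop out; the ABCD terms are absorbed because they are supersets of ACD and BCD.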

Design Analysis of Minimal Cut Sets

A Cut Set comprising several components is less likely to fail than one containing a single component

Hint ..... AND Gates at the top of the Fault Tree increase the number of components in a Cut Set OR Gates increase the number of Cut Sets, but often lead to single component Sets


Benefits and limitations

• Prepared in the early stages of a design and further developed in detail concurrently with design development
• Identifies and records systematically the logical fault paths from a specific effect to the prime causes
• Allows easy conversion to probability measures
• But may lead to very large trees if the analysis is extended in depth
• Depends on the skill of the analyst
• Difficult to apply to systems with partial success
• Can be costly in time & effort


Software

• Software packages are available for reliability tools:
  – Relex
  – ReliaSoft
  – others


Exercise 1


One Possible Solution


RBD of an engine

(Reliability block diagram: Ignition system 1 and Ignition system 2, each consisting of LV and HV elements in series, are in parallel with each other; this parallel block is in series with the fuel system (fuel pump, fuel filter, carburettor, jet) and the other engine components.)

Important Probability Distributions

OPRE 6301 Important Distributions. . .

Certain probability distributions occur with such regularity in real-life applications that they have been given their own names. Here, we survey and study basic properties of some of them.

We will discuss the following distributions:

• Binomial
• Poisson
• Uniform
• Normal
• Exponential

The first two are discrete and the last three continuous.

Binomial Distribution. . .

Consider the following scenarios:

— The number of heads/tails in a sequence of coin flips
— Vote counts for two different candidates in an election
— The number of male/female employees in a company
— The number of accounts that are in compliance or not in compliance with an accounting procedure
— The number of successful sales calls
— The number of defective products in a production run
— The number of days in a month your company's computer network experiences a problem

All of these are situations where the binomial distribution may be applicable.

Canonical Framework. . .

There is a set of assumptions which, if valid, would lead to a binomial distribution. These are:

• A set of n experiments or trials is conducted.
• Each trial can result in either a success or a failure.
• The probability p of success is the same for all trials.
• The outcomes of different trials are independent.
• We are interested in the total number of successes in these n trials.

Under the above assumptions, let X be the total number of successes. Then, X is called a binomial random variable, and the probability distribution of X is called the binomial distribution.

Binomial Probability-Mass Function. . .

Let X be a binomial random variable. Then, its probability-mass function is:

P(X = x) = [n! / (x!(n−x)!)] p^x (1−p)^(n−x)   (1)

for x = 0, 1, 2, ..., n.

The values of n and p are called the parameters of the distribution.

To understand (1), note that:

• The probability of observing any particular sequence of n independent trials that contains x successes and n − x failures is p^x (1 − p)^(n−x).
• The total number of such sequences is

C(n, x) ≡ n!/(x!(n − x)!)

(i.e., the total number of possible combinations when we randomly select x objects out of n objects).

Example: Multiple-Choice Exam

Consider an exam that contains 10 multiple-choice questions with 4 possible choices for each question, only one of which is correct. Suppose a student selects the answer for every question randomly. Let X be the number of questions the student answers correctly. Then, X has a binomial distribution with parameters n = 10 and p = 0.25. (Convince yourself that all assumptions for a binomial distribution are reasonable in this setting.) What is the probability for the student to get no answer correct? Answer:

P(X = 0) = [10!/(0! 10!)] (0.25)^0 (1 − 0.25)^10 = (0.75)^10 = 0.0563

What is the probability for the student to get two answers correct? Answer:

P(X = 2) = [10!/(2! 8!)] (0.25)^2 (1 − 0.25)^8 = 45 · (0.25)^2 · (0.75)^8 = 0.2816

What is the probability for the student to fail the test (i.e., to have fewer than 6 correct answers)? Answer:

P(X ≤ 5) = Σ_{i=0}^{5} P(X = i)
         = 0.0563 + 0.1877 + 0.2816 + 0.2503 + 0.1460 + 0.0584
         = 0.9803

Binomial probabilities can be computed using the Excel function BINOMDIST(). Two other examples are given in a separate Excel file.
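Outside Excel, the same probabilities can be reproduced with a short standard-library Python sketch; the helper name `binom_pmf` is ours, not from the text:

```python
from math import comb

def binom_pmf(n, p, x):
    # Equation (1): P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.25  # the multiple-choice exam example
print(round(binom_pmf(n, p, 0), 4))                         # 0.0563
print(round(binom_pmf(n, p, 2), 4))                         # 0.2816
print(round(sum(binom_pmf(n, p, i) for i in range(6)), 4))  # 0.9803
```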

Binomial Mean and Variance. . .

It can be shown that

µ = E(X) = np    and    σ² = V(X) = np(1 − p) .

For the previous example, we have

• E(X) = 10 · 0.25 = 2.5.
• V(X) = 10 · (0.25) · (1 − 0.25) = 1.875.

Poisson Distribution. . .

The Poisson distribution is another family of distributions that arises in a great number of business situations. It usually is applicable in situations where random “events” occur at a certain rate over a period of time.

Consider the following scenarios:
— The hourly number of customers arriving at a bank
— The daily number of accidents on a particular stretch of highway
— The hourly number of accesses to a particular web server
— The daily number of emergency calls in Dallas
— The number of typos in a book
— The monthly number of employees who had an absence in a large company
— Monthly demands for a particular product

All of these are situations where the Poisson distribution may be applicable.

Canonical Framework. . .

Like the Binomial distribution, the Poisson distribution arises when a set of canonical assumptions are reasonably valid. These are:

• The number of events that occur in any time interval is independent of the number of events in any other disjoint interval. (Here, "time interval" is the standard example of an "exposure variable"; other interpretations are possible, e.g., the error rate per page in a book.)
• The distribution of the number of events in an interval is the same for all intervals of the same size.
• For a "small" time interval, the probability of observing an event is proportional to the length of the interval. The proportionality constant corresponds to the "rate" at which events occur.
• The probability of observing two or more events in an interval approaches zero as the interval becomes smaller.

Under the above assumptions, let λ be the rate at which events occur, t be the length of a time interval, and X be the total number of events in that time interval. Then, X is called a Poisson random variable and the probability distribution of X is called the Poisson distribution.

Let µ ≡ λt; then, µ can be interpreted as the average, or mean, number of events in an interval of length t.

Poisson Probability-Mass Function. . .

Let X be a Poisson random variable. Then, its probability-mass function is:

P(X = x) = (µ^x / x!) e^(−µ)    (2)

for x = 0, 1, 2, ....

The value of µ is the parameter of the distribution. For a given time interval of interest, in an application, µ can be specified as λ times the length of that interval.

Example: Typos

The number of typographical errors in a "big" textbook is Poisson distributed with a mean of 1.5 per 100 pages. Suppose 100 pages of the book are randomly selected. What is the probability that there are no typos? Answer:

P(X = 0) = (µ^0 / 0!) e^(−µ) = e^(−1.5) = 0.2231

Suppose 400 pages of the book are randomly selected. What are the probabilities for having no typos and for having five or fewer typos? Answers: With µ = 1.5 · 4 = 6,

P(X = 0) = (6^0 / 0!) e^(−6) = 0.002479

and

P(X ≤ 5) = Σ_{i=0}^{5} P(X = i)
         = 0.0025 + 0.0149 + 0.0446 + 0.0892 + 0.1339 + 0.1606
         = 0.4457

Poisson probabilities can be computed using the Excel function POISSON(). Further numerical examples of the Poisson distribution are given in a separate Excel file.
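The typo example can likewise be checked with a small Python sketch; the helper name `poisson_pmf` is ours:

```python
from math import exp, factorial

def poisson_pmf(mu, x):
    # Equation (2): P(X = x) = mu^x * e^(-mu) / x!
    return mu**x * exp(-mu) / factorial(x)

print(round(poisson_pmf(1.5, 0), 4))        # 0.2231 (100 pages)
mu = 1.5 * 4  # 400 pages at a rate of 1.5 typos per 100 pages
print(round(poisson_pmf(mu, 0), 6))         # 0.002479
print(round(sum(poisson_pmf(mu, i) for i in range(6)), 4))  # 0.4457
```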

Mean and Variance

It can be shown that

E(X) = µ    and    V(X) = µ .

Interpretation of (2)

The form of (2) seems mysterious. The best way to understand it is via the binomial distribution.

Consider a time interval and divide it into n equally-sized subintervals. Suppose n is very large so that either one or zero event can occur in a subinterval. Suppose further that the probability for an event to occur in a subinterval is µ/n, independent of what occurs in other subintervals.

Under these assumptions, the total number of events, X, in that interval has a binomial distribution with parameters n and µ/n. That is,

P(X = x) = [n!/(x!(n − x)!)] (µ/n)^x (1 − µ/n)^(n−x)    (3)

for x = 0, 1, 2, ..., n.

Note that E(X) = n · (µ/n) = µ, suggesting that (3) and (1) are "consistent." Indeed, it can be shown that as n approaches ∞, (3) becomes (2). This useful fact is called the Poisson approximation to the binomial distribution.

We will see several other examples of such limiting approximations in future chapters. They provide simple and accurate approximations to otherwise unmanageable expressions.
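The limiting claim can be checked numerically. Holding µ = 2 fixed and letting n grow, the binomial probability (3) approaches the Poisson probability (2):

```python
from math import comb, exp, factorial

mu, x = 2.0, 3
poisson = mu**x * exp(-mu) / factorial(x)  # limiting value from (2)
for n in (10, 100, 1000):
    p = mu / n
    binom = comb(n, x) * p**x * (1 - p)**(n - x)  # binomial pmf (3)
    print(n, round(binom, 4))
print("Poisson limit:", round(poisson, 4))
```

The printed binomial values approach the Poisson limit 0.1804 as n increases.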

General Continuous Distributions. . .

Recall that a continuous random variable or distribution is defined via a probability density function. Let f(x) (nonnegative) be the density function of variable X. Then, f(x) is the rate at which probability accumulates in the neighborhood of x; in other words, for small h,

f(x) · h ≈ P(x < X ≤ x + h) ,

and more generally

P(x1 < X ≤ x2) = ∫_{x1}^{x2} f(x) dx .    (4)

Since total probability is 1, we also have

∫_{−∞}^{∞} f(x) dx = 1 .

Note that the probability for a continuous random vari- able to assume any particular value is 0; this can be seen by setting x1 = x2 in (4).

Recall further that the integral of a function over an interval is the area under that function over the given interval. We can therefore visualize P(x1 < X ≤ x2) as the area under the density curve f(x) between x1 and x2.

For −∞ < x < ∞, the function

F(x) ≡ P(X ≤ x) = ∫_{−∞}^{x} f(y) dy

(i.e., let x1 = −∞ and x2 = x in (4)) is called the cumulative distribution function of X. F(x) can also be used to describe a random variable, since f(x) is the derivative of F(x).

Various probabilities of interest regarding a variable X can all be computed via either f(x) or F (x).

We next discuss three important continuous distributions: uniform, normal, and exponential.

Uniform Distribution. . .

The uniform distribution is the simplest example of a continuous probability distribution. A random variable X is said to be uniformly distributed if its density function is given by:

f(x) = 1/(b − a)    (5)

for a ≤ x ≤ b. The density curve is flat over [a, b], and the region under it has area (b − a)[1/(b − a)] = 1 (width times height).

The values a and b are the parameters of the uniform distribution. It can be shown that

E(X) = (a + b)/2    and    V(X) = (b − a)²/12 .

The standard uniform density has parameters a = 0 and b = 1; hence f(x) = 1 for 0 ≤ x ≤ 1 and 0 otherwise. The Excel function RAND() "pretends" to generate independent samples from this density function.
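As a sanity check, averaging many RAND()-style draws should reproduce E(X) = 1/2 and V(X) = 1/12 ≈ 0.0833; in Python, `random.random()` plays the role of RAND():

```python
import random

random.seed(0)  # reproducible draws
samples = [random.random() for _ in range(100_000)]  # standard uniform
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 3), round(var, 3))  # close to 0.5 and 0.083
```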

Example: Gasoline Sales

Suppose the amount of gasoline sold daily at a service station is uniformly distributed with a minimum of 2,000 gallons and a maximum of 5,000 gallons. What is the probability that daily sales will fall between 2,500 gallons and 3,000 gallons? Answer:

P(2500 < X ≤ 3000) = (3000 − 2500) · 1/(5000 − 2000) = 0.1667

Visually, the answer corresponds to the area under the flat density over [2,000, 5,000] between 2,500 and 3,000.

What is the probability that the service station will sell at least 4,000 gallons? Answer:

P(X > 4000) = (5000 − 4000) · 1/(5000 − 2000) = 0.3333 .

Visually, this is the area under the density between 4,000 and 5,000.

What is the probability that the service station will sell exactly 2,500 gallons? Answer: P (X = 2500) = 0, since the area of a “vertical line” at 2,500 is 0.

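The three gasoline-sales probabilities follow from one helper (ours, not from the text) that measures interval length under the flat density:

```python
def uniform_prob(a, b, x1, x2):
    # P(x1 < X <= x2) for X ~ Uniform(a, b), clipping the interval to [a, b]
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0) / (b - a)

a, b = 2000, 5000  # gallons
print(round(uniform_prob(a, b, 2500, 3000), 4))  # 0.1667
print(round(uniform_prob(a, b, 4000, b), 4))     # 0.3333
print(uniform_prob(a, b, 2500, 2500))            # 0.0
```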

Normal Distribution. . .

The normal distribution is the most important distribution in statistics, since it arises naturally in numerous applications. The key reason is that large sums of (small) random variables often turn out to be normally distributed; a more complete discussion of this will be given in Chapter 9.

A random variable X is said to have the normal distribution with parameters µ and σ if its density function is given by:

f(x) = (1/(√(2π) σ)) exp{−(1/2)((x − µ)/σ)²}    (6)

for −∞ < x < ∞. It can be shown that

E(X) = µ    and    V(X) = σ² .

Thus, the normal distribution is characterized by a mean µ and a standard deviation σ.

A typical normal density curve is bell shaped and symmetric around the mean µ. The standard deviation σ controls the "flatness" of the curve.


Increasing the mean shifts the density curve to the right ...

Increasing the standard deviation flattens the density curve ...

Calculating Normal Probabilities. . .

A normal distribution whose mean is 0 and standard deviation is 1 is called the standard normal distribution. In this case, the density function assumes the simpler form:

f(x) = (1/√(2π)) e^(−x²/2)    (7)

for −∞ < x < ∞. Table 3 in Appendix B of the text can be used to calculate probabilities associated with the standard normal distribution. The Excel function NORMSDIST() (where "S" is for "standard") can also be used.

Denote by Z a random variable that follows the standard normal distribution. Then, Table 3 gives the probability P(0 < Z ≤ z) for any nonnegative value z; whereas NORMSDIST() returns P(Z ≤ z) for any z from −∞ to ∞, i.e., values of the cumulative distribution function. For general parameter values, the Excel function NORMDIST() (without "S" in the middle) can be used directly. However, ...

A standard practice is to convert a normal random variable X with arbitrary parameters µ and σ into a standardized normal random variable Z with parameters 0 and 1 via the transformation:

Z = (X − µ)/σ ;    (8)

subtracting µ shifts the mean of X to zero, and dividing by σ changes the shape of the curve.

Example 1: Build Time of Computers

Suppose the time required to build a computer is normally distributed with a mean of 50 minutes and a standard deviation of 10 minutes. What is the probability for the assembly time of a computer to be between 45 and 60 minutes? Answer: We wish to compute P(45 < X ≤ 60). To do this, we first rewrite the event of interest in terms of the standardized variable Z = (X − 50)/10, as follows:

P( (45 − 50)/10 < (X − 50)/10 ≤ (60 − 50)/10 ) = P(−0.5 < Z ≤ 1) .

Next, observe that

P(−0.5 < Z ≤ 1) = P(Z ≤ 1) − P(Z ≤ −0.5) .

Using the Excel function NORMSDIST(), we find that P(Z ≤ 1) = 0.8413 and P(Z ≤ −0.5) = 0.3085. Hence, the answer is 0.8413 − 0.3085 = 0.5328.

Table 3 can also be used for this calculation:

P(−0.5 < Z ≤ 1)
  = P(−0.5 < Z ≤ 0) + P(0 < Z ≤ 1)
  = P(0 < Z ≤ 0.5) + P(0 < Z ≤ 1)
  = 0.1915 + 0.3413 = 0.5328 ,

where the first equality follows from splitting the interval at 0, the second equality is due to the fact that the normal density curve is symmetric, and the third equality is from Table 3. Is it reasonable to assume that the build time is normally distributed? Reasoning: The build time can be thought of as the sum of the times needed to build many individual components.

Example 2: Stock Returns

Suppose the return of an investment in a stock over a given time period is normally distributed with a mean of 10% and a standard deviation of 5%. Reasoning: The price movement of a stock over the given period can be thought of as the sum of a "long" sequence of small movements. What is the probability of losing money over the given period? Answer: We wish to determine P(X ≤ 0). Following the steps in the previous example, we obtain

P(X ≤ 0) = P( (X − 10)/5 ≤ (0 − 10)/5 ) = P(Z ≤ −2) = 0.02275 .

What is the effect of doubling the standard deviation to 10? Answer: A similar calculation yields

P(X ≤ 0) = P( (X − 10)/10 ≤ (0 − 10)/10 ) = P(Z ≤ −1) = 0.1587 ,

which is almost 7 times larger than the previous answer. Thus, increasing the standard deviation increases the probability of losing money. This reiterates the fact that the standard deviation is a measure of risk.
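Both normal examples reduce to standardizing via (8) and then evaluating the standard normal CDF; Python's `statistics.NormalDist` can stand in for NORMSDIST():

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Build time: X ~ N(50, 10), P(45 < X <= 60)
print(round(Z.cdf((60 - 50) / 10) - Z.cdf((45 - 50) / 10), 4))  # 0.5328

# Stock return: X ~ N(10, 5), P(X <= 0)
print(round(Z.cdf((0 - 10) / 5), 5))   # 0.02275
# After doubling the standard deviation to 10:
print(round(Z.cdf((0 - 10) / 10), 4))  # 0.1587
```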

Example 3: Midterm Scores

Why did the distribution of the Midterm scores resemble a normal density curve? Reasoning: The total score of an exam is the sum of scores for many individual problems/parts.

Finding "z" for Given Probability. . .

Most of the calculations above are of the form: find the probability P(Z ≤ z) for a given value of z. Oftentimes, we are also interested in an inverse problem: find the value of z_A such that the probability for Z to be greater than z_A equals a specified value A.

Formally, our question is: for what value of z_A do we have

P(Z > z_A) = A ?    (9)

Questions like these will be relevant in statistical inference.

Examples:

Find z_A for A = 0.025 (or 2.5%). That is, what is z_0.025? Answer: Observe that

P(Z > z_0.025) = 1 − P(Z ≤ z_0.025) .

Observe further that

P(Z ≤ z_0.025) = 1 − P(Z > z_0.025) = 1 − 0.025 = 0.975 ,

where the second equality follows from the definition of z_0.025.

Hence, our problem is equivalent to that of finding z_0.025 such that P(Z ≤ z_0.025) = 0.975. That is, we are interested in the inverse of a cumulative distribution function; this is similar to finding a percentile using an ogive. The Excel function NORMSDIST() (which is a cumulative distribution function) has an inverse: NORMSINV(). Using this inverse function with argument 0.975, we find that z_0.025 = 1.96.

For A = 0.05, we have z_0.05 = 1.645.

For A = 0.01, we have z_0.01 = 2.33.
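The inverse calculation maps directly onto `NormalDist.inv_cdf`, which plays the role of NORMSINV(): z_A = inv_cdf(1 − A).

```python
from statistics import NormalDist

Z = NormalDist()
for A in (0.025, 0.05, 0.01):
    # P(Z > z_A) = A  is equivalent to  P(Z <= z_A) = 1 - A
    print(A, round(Z.inv_cdf(1 - A), 3))  # 1.96, 1.645, 2.326
```

(The text rounds the last value to 2.33.)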

Exponential Distribution. . .

Another useful continuous distribution is the exponential distribution, which has the following probability density function:

f(x) = λ e^(−λx)    (10)

for x ≥ 0. This family of distributions is characterized by a single parameter λ, which is called the rate. Intuitively, λ can be thought of as the instantaneous "failure rate" of a "device" at any time t, given that the device has survived up to t.

The exponential distribution is typically used to model time intervals between “random events”...

Examples:

— The length of time between telephone calls
— The length of time between arrivals at a service station
— The lifetime of electronic components, i.e., an inter-failure time

An important fact is that when times between random “events” follow the exponential distribution with rate λ, then the total number of events in a time period of length t follows the Poisson distribution with parameter λt.

If a random variable X is exponentially distributed with rate λ, then it can be shown that

E(X) = 1/λ    and    V(X) = (1/λ)² .

For λ = 0.5, 1, and 2, the shapes of the exponential density curves differ markedly.

Observe that the greater the rate, the faster the curve drops. Or, the lower the rate, the flatter the curve.

Several useful formulas are:

P(X ≤ x) = 1 − e^(−λx)
P(X > x) = e^(−λx)
P(x1 < X ≤ x2) = e^(−λx1) − e^(−λx2)

Example 1: Lifetime of a Battery

The lifetime X of an alkaline battery is exponentially distributed with λ = 0.05 per hour. What are the mean and standard deviation of the battery's lifetime? Answer:

E(X) = SD(X) = 1/0.05 = 20 hours.

What are the probabilities for the battery to last between 10 and 15 hours, and to last more than 20 hours? Answer:

P(10 < X ≤ 15) = e^(−0.05·10) − e^(−0.05·15) = 0.1342
P(X > 20) = e^(−0.05·20) = 0.3679

(The Excel function EXP() can be used for these calculations.)
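The battery calculations use only the exponential formulas above:

```python
from math import exp

lam = 0.05  # failure rate per hour

print(1 / lam)  # mean = standard deviation = 20.0 hours

# P(10 < X <= 15) = e^(-lam*10) - e^(-lam*15)
print(round(exp(-lam * 10) - exp(-lam * 15), 4))  # 0.1342
# P(X > 20) = e^(-lam*20)
print(round(exp(-lam * 20), 4))                   # 0.3679
```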

Example 2: Arrivals at a Gas Station

The arrival rate of cars at a gas station is λ = 40 customers per hour. (This is equivalent to saying that the interarrival times are exponentially distributed with rate 40 per hour.) What is the probability of having no arrivals in a 5-minute interval? Answer:

P(X > 5/60) = e^(−40·(5/60)) = 0.03567

What are the mean and variance of the number, N, of arrivals in 5 minutes? Answer: The variable N has a Poisson distribution with parameter µ = λt = 40 · (5/60) = 3.333. Hence,

E(N) = 3.333    and    V(N) = 3.333 .

What is the probability of having 3 arrivals in a 5-minute interval? Answer:

P(N = 3) = (3.333³ / 3!) e^(−3.333) = 0.2202 .
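The gas-station example combines the exponential interarrival time with the Poisson count:

```python
from math import exp, factorial

lam = 40     # arrivals per hour
t = 5 / 60   # five minutes, in hours

# Exponential: P(no arrival within t) = P(X > t) = e^(-lam*t)
print(round(exp(-lam * t), 5))  # 0.03567

# Poisson: N ~ Poisson(mu) with mu = lam * t (mean and variance alike)
mu = lam * t
print(round(mu, 3))                               # 3.333
print(round(mu**3 / factorial(3) * exp(-mu), 4))  # P(N = 3) = 0.2202
```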

MTBF, MTTR, MTTF & FIT: Explanation of Terms

Purpose

The intent of this White Paper is to provide an understanding of MTBF and other product reliability measures. Understanding the methods for predicting the lifecycle of a product enables the customer to consider the tangible value of the product, beyond its feature set, before purchasing it.

MTBF, MTTR, MTTF and FIT are reliability terms based on methods and procedures for lifecycle predictions for a product. Customers often must include reliability data when determining what product to buy for their application. MTBF (Mean Time Between Failure), MTTR (Mean Time To Repair), MTTF (Mean Time To Failure) and FIT (Failure In Time) are ways of providing a numeric value based on a compilation of data to quantify a failure rate and the resulting time of expected performance. The numeric value can be expressed using any measure of time, but hours is the most common unit in practice.

About the Author

Susan Stanley, Senior Technical Support Engineer, IMC Networks

Susan Stanley has spent the last 16 years in engineering and customer service at technology-related companies such as Brother International, Citoh and, currently, IMC Networks. Her working experience encompasses a wide range of technologies, including operating systems, supporting and troubleshooting peripherals such as IP-based multi-function equipment and scanners, web coding for an Intranet, and using Visual Basic to modify code. Certified as a technical trainer, she has trained all new employees on product knowledge, as well as developing a comprehensive FAQ system for internal use.

Today, Susan Stanley heads the technical support and customer service activities for IMC Networks, a provider of fiber optic access and media conversion solutions for Enterprise, Government and Service Providers' LANs, First-Mile FTTx Networks and Metropolitan Area Networks. She is key in establishing the initial customer service contact and resolving critical issues for IMC Networks products. The ability to convey technical information in layman's terminology is a critical element in quickly resolving an end-user's product issues and questions. She provides feedback from the customer base to the Engineering team, which can result in product improvements or suggestions.

Introduction

MTBF, MTTR, MTTF and FIT

Mean Time Between Failure (MTBF) is a reliability term used to provide the number of failures per million hours for a product. This is the most common inquiry about a product's life span, and it is important in the decision-making process of the end user. MTBF is more important for industries and integrators than for consumers: most consumers are price driven and will not take MTBF into consideration, nor is the data often readily available. On the other hand, when equipment such as media converters or switches must be installed in mission-critical applications, MTBF becomes very important. In addition, MTBF may be an expected line item in an RFQ (Request For Quote); without the proper data, a manufacturer's piece of equipment could be immediately disqualified.

Mean Time To Repair (MTTR) is the time needed to repair a failed hardware module. In an operational system, repair generally means replacing a failed hardware part, so hardware MTTR can be viewed as the mean time to replace a failed hardware module. Taking too long to repair a product drives up the cost of the installation in the long run, due to downtime until the new part arrives and the possible window of time required to schedule the installation. To reduce MTTR, many companies purchase spare products so that a replacement can be installed quickly. Generally, however, customers will inquire about the turn-around time for repairing a product, and indirectly, that can fall into the MTTR category.

Mean Time To Failure (MTTF) is a basic measure of reliability for non-repairable systems. It is the mean time expected until the first failure of a piece of equipment. MTTF is a statistical value and is meant to be the mean over a long period of time and a large number of units. Technically, MTBF should be used only in reference to a repairable item, while MTTF should be used for non-repairable items. However, MTBF is commonly used for both repairable and non-repairable items.

Failure In Time (FIT) is another way of reporting MTBF. FIT reports the number of expected failures per one billion hours of operation for a device. This term is used particularly by the semiconductor industry, but is also used by component manufacturers. FIT can be quantified in a number of ways: 1,000 devices for 1 million hours, or 1 million devices for 1,000 hours each, and other combinations. FIT and CL (Confidence Limits) are often provided together. In common usage, a claim of 95% confidence in something is normally taken as indicating virtual certainty. In statistics, a claim of 95% confidence simply means that the researcher has seen something occur that happens only one time in twenty or less. For example, component manufacturers will take a small sampling of a component, test it for x number of hours, and then determine if there were any failures in the test bed. Based on the number of failures that occur, the CL will then be provided as well.

Reliability Methods & Standards

Several prediction methods have been developed over time to determine reliability, but the two standards most often used when compiling reliability data for media converters are MIL-HDBK-217F Notice 2 (Military Handbook) and Bellcore TR332. MIL-HDBK-217 encompasses two ways to predict reliability: Parts Count Prediction (used to predict the reliability of a product early in its development cycle) and Parts Stress Analysis Prediction (used later in the development cycle, as the product nears production). This is the basis on which the famous "bathtub curve" so adeptly illustrates unit failures in proportion to a period of time. Other methods are applicable to the telecom industry, while still others are useful for analyzing how failure modes would impact a product. The challenge is choosing the method based on the product's functionality.

MTBF

When the failure rate needs to be as low as possible, especially for mission-critical systems, MTBF data can be used to help ensure maximum uptime for an installation. It is a common misconception, however, that the MTBF value is equivalent to the expected number of operating hours before a product fails, or the "service life". There are several variables that can impact failures. Aside from component failures, customer use/installation can also result in failure. For example, if a customer misuses a product and then it malfunctions, should that be considered a failure? If a product is delivered DOA because it was not properly packaged, is that a failure?

The MTBF is often calculated based on an equation that factors in all of a product's components to reach the sum life cycle in hours. In reality, depreciation modes of the product could limit its life much earlier, due to some of the variables listed above. It is very possible to have a product with an extremely high MTBF, but an average, or more realistic, expected service life.

MTBF = 1 / (FR1 + FR2 + FR3 + ... + FRn)

where FR is the failure rate of each component of the system, up to n components.
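As an illustration of the series failure-rate sum, here is a short sketch with made-up component failure rates (the part names and numbers are invented for the example, not taken from any datasheet); it also shows the MTBF-to-FIT conversion described earlier:

```python
# Hypothetical failure rates, in failures per million hours of operation.
fr_per_million_hours = {
    "power supply": 2.5,
    "PHY chip": 0.8,
    "optics": 1.2,
    "passives": 0.5,
}

total_fr = sum(fr_per_million_hours.values())  # failures per 1e6 hours
mtbf_hours = 1_000_000 / total_fr              # series-system MTBF
fit = total_fr * 1000                          # failures per 1e9 hours (FIT)
print(round(mtbf_hours), round(fit))  # 200000 5000
```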

MTBF is not just a simple formula. A person certified and educated in calculating MTBF is a good investment. That person must review the MTBF for every component, as well as other factors such as the operating temperature range, storage temperature range, etc.

Beyond the MTBF calculation, managers should track all reported field failures, as well as the root cause of those product failures, to produce a true snapshot of a product's service life. Since this process takes time, the MTBF and other predictions of reliability for a product are on-going, and MTBF can be subject to change. For example, in 2006, RoHS (Restriction of Hazardous Substances) was mandated by the European Community. If a released product is re-developed in order to meet RoHS compliance, the entire calculation has to be performed again, since non-RoHS components may have a different life cycle than those that meet the RoHS standard.

ISO-9001 can also effectively support MTBF. How? Companies that are ISO certified agree to meet the goals of "zero defect" and "continual improvement". With processes in place, a product is developed and tested in numerous ways, including submission to lab certifications appropriate for the product. The result is that before a product is ever introduced into the market, it is as flawless and as functional as it was intended to be.

Summary

Reliability methods such as MTTR, MTTF and FIT apply to products or to specific components. However, MTBF remains a basic measure of a system's reliability for most products. It is often debated, sometimes even rejected as no longer relevant, and, overall, widely misunderstood, yet it is still regarded as a useful tool when considering the purchase and installation of a product. Remember, along with obtaining an MTBF value, ask questions regarding how current that information is and on what standards it is based, to ensure choosing the most appropriate product for your installation.

About IMC Networks

IMC Networks is a leading ISO 9001 certified manufacturer of optical networking and LAN/WAN connectivity solutions for enterprise, telecommunications and service provider applications. Founded in 1988, with over one million products installed worldwide, IMC Networks offers a wide range of fiber media and mode converters for a variety of applications. Solutions include managed and unmanaged fiber-to-copper converters, TDM-over-fiber extenders and advanced optical Ethernet demarcation devices. Select from a wide range of connectors (SC, ST, LC, RJ-45, and SFP), fiber modes (single, multi), options for increasing fiber capacity (Wavelength Division Multiplexing/CWDM, single-strand fiber), powering options (AC, DC, USB, Power over Ethernet) and extended temperature solutions.

Fiber Consulting Services

IMC Networks' Fiber Consulting Services (FCS) assists network managers and system integrators with the design and development of fiber-based networks. Consulting services are free of charge. For more information about FCS, please contact us at [email protected] or call 800-624-1070 (within the USA) or +1-949-465-3000 (outside the USA).

To learn more about IMC Networks and its products, please visit our website at http://www.imcnetworks.com

www.imcnetworks.com

IMC Networks IMC Networks IMC Networks Headquarters Eastern US/Latin America Europe 19772 Pauling 28050 U.S. Hwy. 19 North, Suite 306 Herseltsesteenweg 268 Foothill Ranch, CA 92610 Clearwater, FL 33761 B-3200 Aarschot | Belgium TEL: 949-465-3000 TEL: 727-797-0300 TEL: +32-16-550880 FAX: 949-465-3020 FAX: 727-797-0331 FAX: +32-16-550888 [email protected] [email protected] [email protected]

Copyright © 2011 IMC Networks. All rights reserved. The information in this document is subject to change without notice. IMC Networks assumes no responsibility for any errors that may appear in this document. Specific product names may be trademarks or registered trademarks and are the property of their respective companies. Document # WP-3011 0311

Probability Distributions Used in Reliability Engineering

Andrew N. O’Connor Mohammad Modarres Ali Mosleh

Center for Risk and Reliability 0151 Glenn L Martin Hall University of Maryland College Park, Maryland

Published by the Center for Risk and Reliability

International Standard Book Number (ISBN): 978-0-9966468-1-9

Copyright © 2016 by the Center for Reliability Engineering University of Maryland, College Park, Maryland, USA

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from The Center for Reliability Engineering, Reliability Engineering Program.

The Center for Risk and Reliability University of Maryland College Park, Maryland 20742-7531

In memory of Willie Mae Webb

This book is dedicated to the memory of Miss Willie Webb who passed away on April 10 2007 while working at the Center for Risk and Reliability at the University of Maryland (UMD). She initiated the concept of this book, as an aid for students conducting studies in Reliability Engineering at the University of Maryland. Upon passing, Willie bequeathed her belongings to fund a scholarship providing financial support to Reliability Engineering students at UMD.

Preface

Reliability engineers are required to combine a practical understanding of science and engineering with statistics. The reliability engineer's understanding of statistics is focused on the practical application of a wide variety of accepted statistical methods. Most reliability texts provide only a basic introduction to probability distributions, or provide a detailed reference to only a few distributions, while most statistics texts provide theoretical detail which is outside the scope of likely reliability engineering tasks. As such, the objective of this book is to provide a single reference text of closed-form probability formulas and approximations used in reliability engineering.

This book provides details on 22 probability distributions. Each distribution section provides a graphical representation of the distribution and formulas for its parameters, along with distribution formulas. Common statistics such as moments and percentile formulas are followed by likelihood functions and, in many cases, the derivation of maximum likelihood estimates. Bayesian non-informative and conjugate priors are provided, followed by a discussion of the distribution's characteristics and applications in reliability engineering. Each section concludes with online and hardcopy references which can provide further information, followed by the relationship to other distributions.

The book is divided into six parts. Part 1 provides a brief coverage of the fundamentals of probability distributions within a reliability engineering context. Part 1 is limited to concise explanations aimed to familiarize readers; for further understanding, the reader is referred to the references. Part 2 to Part 6 cover Common Life Distributions, Bathtub Life Distributions, Univariate Continuous Distributions, Univariate Discrete Distributions and Multivariate Distributions, respectively.

The authors would like to thank the many students in the Reliability Engineering Program particularly Reuel Smith for proof reading.

Contents

PREFACE ...... V

CONTENTS ...... I

1. FUNDAMENTALS OF PROBABILITY DISTRIBUTIONS ...... 1

1.1. Probability Theory ...... 2 1.1.1. Theory of Probability ...... 2 1.1.2. Interpretations of Probability Theory ...... 2 1.1.3. Laws of Probability...... 3 1.1.4. Law of Total Probability ...... 4 1.1.5. Bayes’ Law ...... 4 1.1.6. Likelihood Functions ...... 5 1.1.7. Fisher Information Matrix ...... 6

1.2. Distribution Functions ...... 9 1.2.1. Random Variables ...... 9 1.2.2. Statistical Distribution Parameters ...... 9 1.2.3. Probability Density Function ...... 9 1.2.4. Cumulative Distribution Function ...... 11 1.2.5. Reliability Function ...... 12 1.2.6. Conditional Reliability Function ...... 13 1.2.7. 100α% Percentile Function ...... 13 1.2.8. Mean Residual Life ...... 13 1.2.9. Hazard Rate ...... 13 1.2.10. Cumulative Hazard Rate ...... 14 1.2.11. Characteristic Function ...... 15 1.2.12. Joint Distributions ...... 16 1.2.13. Marginal Distribution ...... 17 1.2.14. Conditional Distribution ...... 17 1.2.15. Bathtub Distributions...... 17 1.2.16. Truncated Distributions ...... 18 1.2.17. Summary ...... 19

1.3. Distribution Properties ...... 20
1.3.1. Median / Mode ...... 20
1.3.2. Moments of Distribution ...... 20
1.3.3. Covariance and Correlation ...... 21

1.4. Parameter Estimation ...... 22
1.4.1. Probability Plotting Paper ...... 22
1.4.2. Total Time on Test Plots ...... 23
1.4.3. Least Mean Square Regression ...... 24
1.4.4. Method of Moments ...... 25
1.4.5. Maximum Likelihood Estimates ...... 26
1.4.6. Bayesian Estimation ...... 27

1.4.7. Confidence Intervals ...... 30

1.5. Related Distributions ...... 33

1.6. Supporting Functions ...... 34
1.6.1. Beta Function B(x, y) ...... 34
1.6.2. Incomplete Beta Function B(t; x, y) ...... 34
1.6.3. Regularized Incomplete Beta Function I(t; x, y) ...... 34
1.6.4. Complete Gamma Function Γ(k) ...... 34
1.6.5. Upper Incomplete Gamma Function Γ(k, t) ...... 35
1.6.6. Lower Incomplete Gamma Function γ(k, t) ...... 35
1.6.7. Digamma Function ψ(x) ...... 36
1.6.8. Trigamma Function ψ′(x) ...... 36

1.7. Referred Distributions ...... 37
1.7.1. Inverse Gamma Distribution IG(α, β) ...... 37
1.7.2. Student T Distribution ...... 37
1.7.3. F Distribution F(n₁, n₂) ...... 37
1.7.4. Chi-Square Distribution χ²(v) ...... 37
1.7.5. Hypergeometric Distribution (k; n, m, N) ...... 38
1.7.6. Wishart Distribution Wishart(x; Σ, n) ...... 38

1.8. Nomenclature and Notation ...... 39

2. COMMON LIFE DISTRIBUTIONS ...... 40

2.1. Exponential Continuous Distribution ...... 41

2.2. Lognormal Continuous Distribution ...... 49

2.3. Weibull Continuous Distribution ...... 59

3. BATHTUB LIFE DISTRIBUTIONS ...... 68

3.1. 2-Fold Mixed Weibull Distribution ...... 69

3.2. Exponentiated Weibull Distribution ...... 76

3.3. Modified Weibull Distribution ...... 81

4. UNIVARIATE CONTINUOUS DISTRIBUTIONS ...... 85

4.1. Beta Continuous Distribution ...... 86

4.2. Birnbaum Saunders Continuous Distribution ...... 93

4.3. Gamma Continuous Distribution ...... 99

4.4. Logistic Continuous Distribution ...... 108

4.5. Normal (Gaussian) Continuous Distribution ...... 115

4.6. Pareto Continuous Distribution ...... 125

4.7. Triangle Continuous Distribution ...... 131

4.8. Truncated Normal Continuous Distribution...... 135

4.9. Uniform Continuous Distribution ...... 145

5. UNIVARIATE DISCRETE DISTRIBUTIONS ...... 151

5.1. Bernoulli Discrete Distribution ...... 152

5.2. Binomial Discrete Distribution ...... 157

5.3. Poisson Discrete Distribution ...... 165

6. BIVARIATE AND MULTIVARIATE DISTRIBUTIONS ...... 172

6.1. Bivariate Normal Continuous Distribution ...... 173

6.2. Dirichlet Continuous Distribution ...... 181

6.3. Multivariate Normal Continuous Distribution ...... 187

6.4. Multinomial Discrete Distribution ...... 193

7. REFERENCES ...... 200



1. Fundamentals of Probability Distributions

1.1. Probability Theory

1.1.1. Theory of Probability

The theory of probability formalizes the representation of probabilistic concepts through a set of rules. The most common reference for formalizing the rules of probability is the set of axioms proposed by Kolmogorov in 1933, where $E_i$ is an event in the event space $\Omega = \cup_{i=1}^{n} E_i$ with $n$ different events:

$$0 \le P(E_i) \le 1$$

$$P(\Omega) = 1 \quad \text{and} \quad P(\phi) = 0$$

$$P(E_1 \cup E_2) = P(E_1) + P(E_2)$$

when $E_1$ and $E_2$ are mutually exclusive.

Other representations of uncertainty exist, such as fuzzy logic and the theory of evidence (Dempster-Shafer model), which do not follow the theory of probability; however, almost all reliability concepts are defined based on probability as the metric of uncertainty. For a justification of probability theory see (Singpurwalla 2006).

1.1.2. Interpretations of Probability

The two most common interpretations of probability are:

• Frequency Interpretation. In the frequentist interpretation of probability, the probability of an event (failure) is defined as:

$$P(K) = \lim_{n \to \infty} \frac{n_f}{n}$$

Also known as the classical approach, this interpretation assumes there exists a real probability of an event, $p$. The analyst uses the observed frequency of the event to estimate the value of $p$. The more historic events that have occurred, the more confident the analyst is in the estimate of $p$. This approach does have limitations: when data from events are not available (e.g., no failures occur in a test), $p$ cannot be estimated, and this method cannot incorporate “soft evidence” such as expert opinion.

• Subjective Interpretation. The subjective interpretation of probability is also known as the Bayesian school of thought. This method defines the probability of an event as the degree of belief the analyst has in the occurrence of the event. This means probability is a product of the analyst’s state of knowledge. Any evidence which would change the analyst’s degree of belief must be considered when calculating the probability (including soft evidence). The assumption is made that the probability assessment is made by a coherent person, where any coherent person having the same state of knowledge would make the same assessment.


The subjective interpretation has the flexibility of including many types of evidence to assist in estimating the probability of an event. This is important in many reliability applications where the events of interest (e.g., system failure) are rare.

1.1.3. Laws of Probability

The following rules of logic form the basis for many mathematical operations within the theory of probability.

Let $X = E_i$ and $Y = E_j$ be two events within the sample space $\Omega$, where $i \neq j$. The Boolean laws of probability are (Modarres et al. 1999, p.25):

Commutative Law: $X \cup Y = Y \cup X$; $\quad X \cap Y = Y \cap X$
Associative Law: $(X \cup Y) \cup Z = X \cup (Y \cup Z)$; $\quad (X \cap Y) \cap Z = X \cap (Y \cap Z)$
Distributive Law: $X \cap (Y \cup Z) = (X \cap Y) \cup (X \cap Z)$
Idempotent Law: $X \cup X = X$; $\quad X \cap X = X$
Complementation Law: $X \cap \bar{X} = \emptyset$; $\quad X \cup \bar{X} = \Omega$; $\quad X \cup (\bar{X} \cap Y) = X \cup Y$
De Morgan’s Theorem: $\overline{X \cup Y} = \bar{X} \cap \bar{Y}$; $\quad \overline{X \cap Y} = \bar{X} \cup \bar{Y}$

Two events are mutually exclusive if:

$$X \cap Y = \emptyset, \qquad P(X \cap Y) = 0$$

Two events $X$ and $Y$ are independent if one event occurring does not affect the probability of the second event occurring:

$$P(X|Y) = P(X)$$

The rules for evaluating the probability of compound events are:

Addition Rule:
$$P(X \cup Y) = P(X) + P(Y) - P(X \cap Y) = P(X) + P(Y) - P(X)P(Y|X)$$

Multiplication Rule:
$$P(X \cap Y) = P(X)P(Y|X) = P(Y)P(X|Y)$$

When $X$ and $Y$ are independent:

$$P(X \cup Y) = P(X) + P(Y) - P(X)P(Y)$$
$$P(X \cap Y) = P(X)P(Y)$$

Generalizations of these equations are:

$$P(E_1 \cup E_2 \cup \cdots \cup E_n) = \left[P(E_1) + P(E_2) + \cdots + P(E_n)\right] - \left[P(E_1 \cap E_2) + P(E_1 \cap E_3) + \cdots + P(E_{n-1} \cap E_n)\right] + \left[P(E_1 \cap E_2 \cap E_3) + P(E_1 \cap E_2 \cap E_4) + \cdots\right] - \cdots + (-1)^{n+1} P(E_1 \cap E_2 \cap \cdots \cap E_n)$$

$$P(E_1 \cap E_2 \cap \cdots \cap E_n) = P(E_1)\,P(E_2|E_1)\,P(E_3|E_1 \cap E_2) \cdots P(E_n|E_1 \cap E_2 \cap \cdots \cap E_{n-1})$$

1.1.4. Law of Total Probability

The probability of $X$ can be obtained by the following summation:

$$P(X) = \sum_{i=1}^{n_A} P(A_i)\,P(X|A_i)$$

where $A = \{A_1, A_2, \ldots, A_{n_A}\}$ is a partition of the sample space $\Omega$: all elements of $A$ are mutually exclusive, $A_i \cap A_j = \emptyset$ for $i \neq j$, and the union of all elements covers the complete sample space, $\cup_{i=1}^{n_A} A_i = \Omega$.

For example:

$$P(X) = P(X \cap Y) + P(X \cap \bar{Y}) = P(Y)P(X|Y) + P(\bar{Y})P(X|\bar{Y})$$

1.1.5. Bayes’ Law

Bayes’ law can be derived from the multiplication rule and the law of total probability as follows:

$$P(\theta)P(E|\theta) = P(E)P(\theta|E)$$

$$P(\theta|E) = \frac{P(\theta)P(E|\theta)}{P(E)} = \frac{P(\theta)P(E|\theta)}{\sum_i P(E|\theta_i)P(\theta_i)}$$

where:

$\theta$ is the unknown of interest (UOI).
$E$ is the observed random variable, the evidence.
$P(\theta)$ is the prior state of knowledge about $\theta$ without the evidence. Also denoted as $\pi_o(\theta)$.
$P(E|\theta)$ is the likelihood of observing the evidence given the UOI. Also denoted as $L(E|\theta)$.
$P(\theta|E)$ is the posterior state of knowledge about $\theta$ given the evidence. Also denoted as $\pi(\theta|E)$.
$\sum_i P(E|\theta_i)P(\theta_i)$ is the normalizing constant.

Thus Bayes’ formula enables us to use a piece of evidence, $E$, to make inference about the unobserved $\theta$. The continuous form of Bayes’ law can be written as:

$$\pi(\theta|E) = \frac{\pi_o(\theta)\,L(E|\theta)}{\int \pi_o(\theta)\,L(E|\theta)\,d\theta}$$

In Bayesian statistics the state of knowledge (uncertainty) of an unknown of interest is quantified by assigning a probability distribution to its possible values. Bayes’ law provides a mathematical means by which this uncertainty can be updated given new evidence.

1.1.6. Likelihood Functions

The likelihood function is the probability of observing the evidence (e.g., sample data), $E$, given the distribution parameters, $\theta$. The probability of observing a set of events is the product of each event’s likelihood:

$$L(\theta|E) = c \prod_i L(\theta|t_i)$$

where $c$ is a combinatorial constant which quantifies the number of combinations in which the observed evidence could have occurred. Methods which use the likelihood function in parameter estimation do not depend on this constant, and so it is omitted.

The following table summarizes the likelihood functions for different types of observations:

Table 1: Summary of Likelihood Functions (Klein & Moeschberger 2003, p.74)

Type of Observation | Likelihood Function | Description
Exact Lifetimes | $L_i(\theta|t_i) = f(t_i|\theta)$ | Failure time $t_i$ is known
Right Censored | $L_i(\theta|t_i) = R(t_i|\theta)$ | Component survived to time $t_i$
Left Censored | $L_i(\theta|t_i) = F(t_i|\theta)$ | Component failed before time $t_i$
Interval Censored | $L_i(\theta|t_i) = F(t_i^{RI}|\theta) - F(t_i^{LI}|\theta)$ | Component failed between $t_i^{LI}$ and $t_i^{RI}$
Left Truncated | $L_i(\theta|t_i) = \dfrac{f(t_i|\theta)}{R(t_L|\theta)}$ | Component failed at time $t_i$ where observations are truncated before $t_L$
Right Truncated | $L_i(\theta|t_i) = \dfrac{f(t_i|\theta)}{F(t_U|\theta)}$ | Component failed at time $t_i$ where observations are truncated after $t_U$
Interval Truncated | $L_i(\theta|t_i) = \dfrac{f(t_i|\theta)}{F(t_U|\theta) - F(t_L|\theta)}$ | Component failed at time $t_i$ where observations are truncated before $t_L$ and after $t_U$

The likelihood function is used in Bayesian and maximum likelihood parameter estimation techniques. In both instances any constant in front of the likelihood function becomes irrelevant. Such constants are therefore not included in the likelihood functions given in this book (nor in most references).

For example, consider the case where a test is conducted on $n$ components with an exponential time to failure distribution. The test is terminated at $t_s$, during which $r$ components failed at times $t_1, t_2, \ldots, t_r$ and $s = n - r$ components survived. Using the exponential distribution to construct the likelihood function, we obtain:

$$L(\lambda|E) = \prod_{i=1}^{n_F} f(\lambda|t_i^F) \prod_{i=1}^{n_S} R(\lambda|t_i^S) = \prod_{i=1}^{n_F} \lambda e^{-\lambda t_i^F} \prod_{i=1}^{n_S} e^{-\lambda t_i^S} = \lambda^{n_F} e^{-\lambda\left(\sum_{i=1}^{n_F} t_i^F + \sum_{i=1}^{n_S} t_i^S\right)}$$

Alternatively, because the test described is a homogeneous Poisson process¹, the likelihood function could also have been constructed using a Poisson distribution. The data can be stated as seeing $r$ failures in time $t_T$, where $t_T$ is the total time on test, $t_T = \sum_{i=1}^{n_F} t_i^F + \sum_{i=1}^{n_S} t_i^S$. Therefore the likelihood function would be:

$$L(\lambda|E) = f(\lambda|n_F, t_T) = \frac{(\lambda t_T)^{n_F} e^{-\lambda t_T}}{n_F!} = c\,\lambda^{n_F} e^{-\lambda t_T}$$

As mentioned earlier, in estimation procedures within this book the constant $c$ can be ignored. As such, the two likelihood functions are equal. For more information see (Meeker & Escobar 1998, p.36) or (Rinne 2008, p.403).

1.1.7. Fisher Information Matrix

The Fisher information matrix has many uses, but in reliability applications it is most often used to create Jeffreys non-informative priors. There are two types of Fisher information matrices: the expected Fisher information matrix, $I(\theta)$, and the observed Fisher information matrix, $J(\theta)$.

¹ Homogeneous in time: it does not matter whether $n$ components are on test at once (exponential test) or a single component is on test and replaced after each failure (Poisson process); the evidence produced will be the same.

The expected Fisher information matrix is obtained from the log-likelihood function of a single random variable, with the random variable replaced by its expected value. For a single-parameter distribution:

$$I(\theta) = -E\left[\frac{\partial^2 \Lambda(\theta|x)}{\partial \theta^2}\right] = E\left[\left(\frac{\partial \Lambda(\theta|x)}{\partial \theta}\right)^2\right]$$

where $\Lambda$ is the log-likelihood function and $E[U] = \int U f(x)\,dx$. For a distribution with $p$ parameters, the expected Fisher information matrix is:

$$I(\boldsymbol{\theta}) = \begin{bmatrix} -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_1^2}\right] & -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_1 \partial \theta_2}\right] & \cdots & -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_1 \partial \theta_p}\right] \\ -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_2 \partial \theta_1}\right] & -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_2^2}\right] & \cdots & -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_2 \partial \theta_p}\right] \\ \vdots & \vdots & \ddots & \vdots \\ -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_p \partial \theta_1}\right] & -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_p \partial \theta_2}\right] & \cdots & -E\left[\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x)}{\partial \theta_p^2}\right] \end{bmatrix}$$

The observed Fisher information matrix is obtained from a likelihood function constructed from $n$ observed samples from the distribution; the expectation term is dropped. For a single-parameter distribution:

$$J_n(\theta) = -\sum_{i=1}^{n} \frac{\partial^2 \Lambda(\theta|x_i)}{\partial \theta^2}$$

For a distribution with $p$ parameters, the observed Fisher information matrix is:

$$J_n(\boldsymbol{\theta}) = \sum_{i=1}^{n} \begin{bmatrix} -\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x_i)}{\partial \theta_1^2} & \cdots & -\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x_i)}{\partial \theta_1 \partial \theta_p} \\ \vdots & \ddots & \vdots \\ -\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x_i)}{\partial \theta_p \partial \theta_1} & \cdots & -\dfrac{\partial^2 \Lambda(\boldsymbol{\theta}|x_i)}{\partial \theta_p^2} \end{bmatrix}$$

As $n$ becomes large, the average value of the random variable approaches its expected value, and so the following asymptotic relationship exists between the observed and expected Fisher information matrices:

$$\operatorname*{plim}_{n \to \infty} \frac{1}{n} J_n(\boldsymbol{\theta}) = I(\boldsymbol{\theta})$$

For large $n$ the following approximation can be used:

$$J_n \approx n\,I(\boldsymbol{\theta})$$

When evaluated at $\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}$, the inverse of the observed Fisher information matrix estimates the variance-covariance matrix:

$$\boldsymbol{V} = \left[J_n(\boldsymbol{\theta} = \hat{\boldsymbol{\theta}})\right]^{-1} = \begin{bmatrix} Var(\hat{\theta}_1) & Cov(\hat{\theta}_1, \hat{\theta}_2) & \cdots & Cov(\hat{\theta}_1, \hat{\theta}_d) \\ Cov(\hat{\theta}_1, \hat{\theta}_2) & Var(\hat{\theta}_2) & \cdots & Cov(\hat{\theta}_2, \hat{\theta}_d) \\ \vdots & \vdots & \ddots & \vdots \\ Cov(\hat{\theta}_1, \hat{\theta}_d) & Cov(\hat{\theta}_2, \hat{\theta}_d) & \cdots & Var(\hat{\theta}_d) \end{bmatrix}$$

1.2. Distribution Functions

1.2.1. Random Variables

Probability distributions are used to model random events for which the outcome is uncertain, such as the time of failure of a component. Before placing a demand on that component, the time at which it will fail is unknown. The distribution of the probability of failure at different times is modeled by a probability distribution. In this book random variables are denoted by capital letters, such as $T$ for time. When the random variable assumes a value we denote this by a lowercase letter, such as $t$ for time. For example, if we wish to find the probability that the component fails before time $t_1$, we would find $P(T \le t_1)$.

Random variables are classified as either discrete or continuous. In a discrete distribution the random variable can take on a distinct or countable number of possible values, such as the number of demands to failure. In a continuous distribution the random variable is not constrained to distinct possible values, such as a time-to-failure distribution.

This book will denote continuous random variables as $X$ or $T$, and discrete random variables as $K$.

1.2.2. Statistical Distribution Parameters

The parameters of a distribution are the variables which need to be specified in order to completely define the distribution. Often parameters are classified by the effect they have on the distribution: shape parameters define the shape of the distribution, scale parameters stretch the distribution along the random variable axis, and location parameters shift the distribution along the random variable axis. The reader is cautioned that the parameterization of a distribution may change from text to text; therefore, before using formulas from other sources the parameterization needs to be confirmed.

Understanding the effect of changing a distribution’s parameter value can be a difficult task. At the beginning of each section a graph of the distribution is shown with varied parameters.

1.2.3. Probability Density Function

A probability density function (pdf), denoted as $f(t)$, is any function which is always non-negative and has unit area:

$$\int_{-\infty}^{\infty} f(t)\,dt = 1, \qquad \sum_k f(k) = 1$$

The probability of an event occurring between limits $a$ and $b$ is the area under the pdf:

$$P(a \le T \le b) = \int_a^b f(t)\,dt = F(b) - F(a)$$

$$P(a \le K \le b) = \sum_{i=a}^{b} f(k_i) = F(b) - F(a-1)$$

The instantaneous value of a discrete pdf at $k_i$ can be obtained by minimizing the limits to $[k_{i-1}, k_i]$:

$$P(K = k_i) = P(k_{i-1} < K \le k_i) = f(k_i)$$

The instantaneous value of a continuous pdf is infinitesimal. This result can be seen by minimizing the limits to $[t, t + \Delta t]$:

$$P(T = t) = \lim_{\Delta t \to 0} P(t < T \le t + \Delta t) = \lim_{\Delta t \to 0} f(t)\,\Delta t$$

Therefore the reader must remember that in order to calculate the probability of an event, an interval for the random variable must be used. Furthermore, a common misunderstanding is that a pdf cannot have a value above one, because the probability of an event occurring cannot be greater than one. As can be seen above, this is true for discrete distributions only because $\Delta k = 1$. For the continuous case, however, the pdf is multiplied by a small interval $\Delta t$, which ensures that the probability of an event occurring within the interval is less than one.

Figure 1: Left: continuous pdf, right: discrete pdf

To derive the continuous pdf relationship to the cumulative distribution function (cdf), $F(t)$:

$$\lim_{\Delta t \to 0} f(t)\,\Delta t = \lim_{\Delta t \to 0} P(t < T \le t + \Delta t) = \lim_{\Delta t \to 0} \left[F(t + \Delta t) - F(t)\right] = \lim_{\Delta t \to 0} \Delta F(t)$$

$$f(t) = \lim_{\Delta t \to 0} \frac{\Delta F(t)}{\Delta t} = \frac{dF(t)}{dt}$$

The shape of the pdf can be obtained by plotting a normalized histogram of an infinite number of samples from the distribution.

It should be noted that when plotting a discrete pdf, the points for each discrete value should not be joined. For ease of explanation using the area-under-the-graph argument, a step plot is intuitive, but it implies a non-integer random variable. Instead, stem plots or column plots are often used.

Figure 2: Discrete data plotting. Left: stem plot. Right: column plot.

1.2.4. Cumulative Distribution Function

The cumulative distribution function (cdf), denoted by $F(t)$, is the probability of the random event occurring before $t$, $P(T \le t)$. For a discrete cdf the height of each step is the pdf value $f(k_i)$.

$$F(t) = P(T \le t) = \int_{-\infty}^{t} f(x)\,dx, \qquad F(k) = P(K \le k) = \sum_{k_i \le k} f(k_i)$$

The limits of the cdf for $-\infty < t < \infty$ and $0 \le k \le \infty$ are given as:

$$\lim_{t \to -\infty} F(t) = 0, \qquad F(-1) = 0$$

$$\lim_{t \to \infty} F(t) = 1, \qquad \lim_{k \to \infty} F(k) = 1$$

The cdf can be used to find the probability of the random event occurring between two limits:

$$P(a \le T \le b) = \int_a^b f(t)\,dt = F(b) - F(a)$$

$$P(a \le K \le b) = \sum_{i=a}^{b} f(k_i) = F(b) - F(a-1)$$

Figure 3: Left: continuous cdf/pdf, right: discrete cdf/pdf

1.2.5. Reliability Function

The reliability function, also known as the survival function, is denoted as $R(t)$. It is the probability that the random event (time of failure) occurs after $t$:

$$R(t) = P(T > t) = 1 - F(t), \qquad R(k) = P(K > k) = 1 - F(k)$$

$$R(t) = \int_t^{\infty} f(t)\,dt, \qquad R(k) = \sum_{i=k+1}^{\infty} f(k_i)$$

It should be noted that in most publications the discrete reliability function is defined as $R^*(k) = P(K \ge k) = \sum_{i=k}^{\infty} f(k_i)$. This definition results in $R^*(k) \neq 1 - F(k)$. Despite this problem it is the most common definition and is used in all the references in this book except (Xie, Gaudoin, et al. 2002).

Figure 4: Left: continuous cdf, right: continuous survival function

1.2.6. Conditional Reliability Function

The conditional reliability function, denoted as $m(x)$, is the probability of the component surviving time $x$ given that it has survived to time $t$:

$$m(x) = R(x|t) = \frac{R(t + x)}{R(t)}$$

where:
$t$ is the given time to which we know the component survived.
$x$ is a new random variable defined as the time after $t$; $x = 0$ at $t$.

1.2.7. 100α% Percentile Function

The 100α% percentile function gives the interval $[0, t_\alpha]$ for which the area under the pdf is $\alpha$:

$$t_\alpha = F^{-1}(\alpha)$$

1.2.8. Mean Residual Life

The mean residual life (MRL), denoted as $u(t)$, is the expected remaining life given that the component has survived to time $t$:

$$u(t) = \int_0^{\infty} R(x|t)\,dx = \frac{1}{R(t)} \int_t^{\infty} R(x)\,dx$$

1.2.9. Hazard Rate

The hazard function, denoted as $h(t)$, is the conditional probability that a component fails in a small time interval, given that it has survived from time zero until the beginning of the time interval. For the continuous case, the probability that an item will fail in the interval $[t, t + \Delta t]$ given the item was functioning at time $t$ is:

$$P(t < T \le t + \Delta t \,|\, T > t) = \frac{P(t < T \le t + \Delta t)}{P(T > t)} = \frac{F(t + \Delta t) - F(t)}{R(t)} = \frac{\Delta F(t)}{R(t)}$$

Dividing this probability by $\Delta t$ and finding the limit as $\Delta t \to 0$ gives the hazard rate:

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t \,|\, T > t)}{\Delta t} = \lim_{\Delta t \to 0} \frac{\Delta F(t)}{\Delta t \, R(t)} = \frac{f(t)}{R(t)}$$

The discrete hazard rate is defined as (Xie, Gaudoin, et al. 2002):

$$h(k) = \frac{P(K = k)}{P(K \ge k)} = \frac{f(k)}{R(k-1)}$$

This unintuitive result is due to the popular definition $R^*(k) = \sum_{i=k}^{\infty} f(k_i)$, in which case $h(k) = f(k)/R^*(k)$. This definition has been avoided because it violates the formula $R(k) = 1 - F(k)$. The discrete hazard rate cannot be used in the same way as a continuous hazard rate, with the following differences (Xie, Gaudoin, et al. 2002):

• $h(k)$ is defined as a probability and so is bounded by $[0, 1]$.
• $h(k)$ is not additive for series systems.
• For the cumulative hazard rate, $H(k) = -\ln[R(k)] \neq \sum_{i=0}^{k} h(i)$.
• When a set of data is analyzed using a discrete counterpart of a continuous distribution, the values of the hazard rate do not converge.

A function called the second failure rate has been proposed (Gupta et al. 1997):

$$r(k) = \ln\left[\frac{R(k-1)}{R(k)}\right] = -\ln[1 - h(k)]$$

This function overcomes the previously mentioned limitations of the discrete hazard rate function and maintains the monotonicity property. For more information, the reader is referred to (Xie, Gaudoin, et al. 2002).

Care should be taken not to confuse the hazard rate with the Rate of Occurrence of Failures (ROCOF). ROCOF is the probability that a failure (not necessarily the first) occurs in a small time interval. Unlike the hazard rate, the ROCOF is the absolute rate at which system failures occur and is not conditional on survival to time $t$. ROCOF is used in measuring the change in the rate of failures for repairable systems.

1.2.10. Cumulative Hazard Rate

The cumulative hazard rate, denoted as $H(t)$ in the continuous case, is the area under the hazard rate function. This function is useful for calculating average failure rates:

$$H(t) = \int_0^t h(u)\,du = -\ln[R(t)]$$

$$H(k) = -\ln[R(k)]$$

For a discussion on the discrete cumulative hazard rate, see the hazard rate section.


1.2.11. Characteristic Function

The characteristic function of a random variable completely defines its probability distribution. It can be used to derive properties of the distribution from transformations of the random variable. (Billingsley 1995)

The characteristic function is defined as the expected value of the function $\exp(i\omega x)$, where $x$ is the random variable of the distribution with cdf $F(x)$, $\omega$ is a parameter that can have any real value, and $i = \sqrt{-1}$:

$$\varphi(\omega) = E\left[e^{i\omega X}\right] = \int_{-\infty}^{\infty} e^{i\omega x}\,dF(x)$$

A useful property of the characteristic function is that the characteristic function of the sum of independent random variables is the product of the random variables’ characteristic functions. It is often easier to use the natural log of the characteristic function when conducting this operation:

$$\varphi_{X+Y}(\omega) = \varphi_X(\omega)\,\varphi_Y(\omega)$$

$$\ln[\varphi_{X+Y}(\omega)] = \ln[\varphi_X(\omega)] + \ln[\varphi_Y(\omega)]$$

For example, the addition of two exponentially distributed random variables with the same $\lambda$ gives the gamma distribution with $k = 2$:

$$X \sim Exp(\lambda), \qquad Y \sim Exp(\lambda)$$

$$\varphi_X(\omega) = \frac{\lambda}{\lambda - i\omega}, \qquad \varphi_Y(\omega) = \frac{\lambda}{\lambda - i\omega}$$

$$\varphi_{X+Y}(\omega) = \varphi_X(\omega)\,\varphi_Y(\omega) = \frac{\lambda^2}{(\lambda - i\omega)^2}$$

This is the characteristic function of the gamma distribution with $k = 2$, so $X + Y \sim Gamma(k = 2, \lambda)$.

The moment generating function can be calculated from the characteristic function:

$$M_X(\omega) = \varphi_X(-i\omega)$$

The $n$th raw moment can be calculated by differentiating the characteristic function $n$ times. For more information on moments see section 1.3.2.

$$E[X^n] = i^{-n}\,\varphi_X^{(n)}(0) = i^{-n} \left[\frac{d^n}{d\omega^n}\,\varphi_X(\omega)\right]_{\omega = 0}$$

1.2.12. Joint Distributions

Joint distributions are multivariate distributions with $d$ random variables ($d > 1$). An example of a bivariate distribution ($d = 2$) is the distribution of failure for a vehicle tire with random variables time, $T$, and distance travelled, $X$. The dependence between these two variables can be quantified in terms of correlation and covariance; see section 1.3.3 for more discussion. For more on properties of multivariate distributions see (Rencher 1997). The continuous and discrete random variables will be denoted as:

$$\boldsymbol{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_d \end{bmatrix}, \qquad \boldsymbol{k} = \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_d \end{bmatrix}$$

Joint distributions can be derived from conditional distributions. For the bivariate case with random variables $x$ and $y$:

$$f(x, y) = f(y|x)\,f(x) = f(x|y)\,f(y)$$

For the more general case:

$$f(\boldsymbol{x}) = f(x_1|x_2, \ldots, x_d)\,f(x_2, \ldots, x_d) = f(x_1)\,f(x_2|x_1) \cdots f(x_d|x_1, \ldots, x_{d-1})$$

If the random variables are independent, their joint distribution is simply the product of the marginal distributions:

$$f(\boldsymbol{x}) = \prod_{i=1}^{d} f(x_i) \quad \text{where } x_i \perp x_j \text{ for } i \neq j$$

$$f(\boldsymbol{k}) = \prod_{i=1}^{d} f(k_i) \quad \text{where } k_i \perp k_j \text{ for } i \neq j$$

A general multivariate cumulative distribution function with $n$ random variables $(T_1, T_2, \ldots, T_n)$ is defined as:

$$F(t_1, t_2, \ldots, t_n) = P(T_1 \le t_1, T_2 \le t_2, \ldots, T_n \le t_n)$$

The survivor function is given as:

$$R(t_1, t_2, \ldots, t_n) = P(T_1 > t_1, T_2 > t_2, \ldots, T_n > t_n)$$

Different from univariate distributions is the relationship between the cdf and the survivor function (Georges et al. 2001):

$$F(t_1, t_2, \ldots, t_n) + R(t_1, t_2, \ldots, t_n) \le 1$$

If $F(t_1, t_2, \ldots, t_n)$ is differentiable, then the probability density function is given as:

$$f(t_1, t_2, \ldots, t_n) = \frac{\partial^n F(t_1, t_2, \ldots, t_n)}{\partial t_1\,\partial t_2 \cdots \partial t_n}$$

For a discussion on multivariate hazard rate functions and the construction of joint distributions from marginal distributions see (Singpurwalla 2006).

1.2.13. Marginal Distribution

The marginal distribution of a single random variable in a joint distribution can be obtained:

$$f(x_1) = \int_{x_d} \cdots \int_{x_3} \int_{x_2} f(\boldsymbol{x})\,dx_2\,dx_3 \cdots dx_d$$

$$f(k_1) = \sum_{k_2} \sum_{k_3} \cdots \sum_{k_d} f(\boldsymbol{k})$$

1.2.14. Conditional Distribution

If the value is known for some random variables, the conditional distribution of the remaining random variables is:

$$f(x_2, \ldots, x_d | x_1) = \frac{f(\boldsymbol{x})}{f(x_1)} = \frac{f(\boldsymbol{x})}{\int f(\boldsymbol{x})\,dx_2 \cdots dx_d}$$

$$f(k_2, \ldots, k_d | k_1) = \frac{f(\boldsymbol{k})}{f(k_1)} = \frac{f(\boldsymbol{k})}{\sum_{k_2} \cdots \sum_{k_d} f(\boldsymbol{k})}$$

1.2.15. Bathtub Distributions

Elementary texts on reliability introduce the hazard rate of a system as a bathtub curve. The bathtub curve has three regions: infant mortality (decreasing failure rate), useful life (constant failure rate) and wear-out (increasing failure rate). Bathtub distributions have not been a popular choice for modeling life distributions when compared to the exponential, Weibull and lognormal distributions. This is because bathtub distributions are generally more complex, lack closed-form moments, and their parameters are more difficult to estimate.

Sometimes more complex shapes are required than simple bathtub curves; as such, generalizations and modifications of the bathtub curve have been studied. These include an increase in the failure rate followed by a bathtub curve, and roller-coaster curves (decreasing followed by unimodal hazard rate). For further reading, including applications, see (Lai & Xie 2006).


1.2.16. Truncated Distributions

Truncation arises when the existence of a potential observation would be unknown if it were to occur in a certain range. An example of truncation is when the existence of a defect is unknown due to the defect’s amplitude being less than the inspection threshold; the number of flaws below the inspection threshold is unknown. This is not to be confused with censoring, which occurs when there is a bound for observing events. An example of right censoring is when a test is time-terminated and the failures of the surviving components are not observed, but we know how many components were censored. (Meeker & Escobar 1998, p.266)

A truncated distribution is the conditional distribution that results from restricting the domain of another probability distribution. The following general formulas apply to truncated distribution functions, where $f_0(x)$ and $F_0(x)$ are the pdf and cdf of the non-truncated distribution. For further reading specific to common distributions see (Cohen 1991).

Probability Density Function:

$$f(x) = \begin{cases} \dfrac{f_0(x)}{F_0(b) - F_0(a)} & \text{for } x \in (a, b] \\ 0 & \text{otherwise} \end{cases}$$

Cumulative Distribution Function:

$$F(x) = \begin{cases} 0 & \text{for } x \le a \\ \dfrac{\int_a^x f_0(t)\,dt}{F_0(b) - F_0(a)} & \text{for } x \in (a, b] \\ 1 & \text{for } x > b \end{cases}$$

1.2.17. Summary

Table 2: Summary of important reliability function relationships

Each row expresses one function in terms of the others:

    f(t) = F′(t) = -R′(t) = h(t)·exp(-∫_0^t h(x)dx) = -d/dt[exp(-H(t))]

    F(t) = ∫_0^t f(x)dx = 1 - R(t) = 1 - exp(-∫_0^t h(x)dx) = 1 - exp(-H(t))

    R(t) = 1 - ∫_0^t f(x)dx = 1 - F(t) = exp(-∫_0^t h(x)dx) = exp(-H(t))

    h(t) = f(t) / [1 - ∫_0^t f(x)dx] = F′(t)/[1 - F(t)] = -R′(t)/R(t) = H′(t)

    H(t) = -ln[1 - ∫_0^t f(x)dx] = -ln[1 - F(t)] = -ln[R(t)] = ∫_0^t h(x)dx

    u(t) = ∫_0^∞ x·f(t + x)dx / [1 - F(t)] = ∫_t^∞ R(x)dx / R(t) = ∫_t^∞ exp(-H(x))dx / exp(-H(t))

where u(t) is the mean residual life.


1.3. Distribution Properties

1.3.1. Median / Mode

The median of a distribution, denoted t_0.5, is the point at which the cdf and reliability function are both equal to 0.5:

    t_0.5 = F⁻¹(0.5) = R⁻¹(0.5)

The mode, t_m, is the highest point of the pdf. This is the point with the highest probability density; samples from the distribution occur most often around the mode.
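As a quick numeric check of the definition t_0.5 = F⁻¹(0.5), here is a minimal Python sketch for an exponential distribution (an assumed example; λ and the closed form are illustrative, not from this section):

```python
import math

def exponential_median(lam):
    # t_0.5 = F^{-1}(0.5); for the exponential cdf F(t) = 1 - exp(-lam*t)
    # this gives t_0.5 = ln(2)/lam.
    return math.log(2) / lam

lam = 2.0
t_50 = exponential_median(lam)
# Verify against the definition: F(t_0.5) = 0.5
assert abs((1.0 - math.exp(-lam * t_50)) - 0.5) < 1e-12
print(t_50)
```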

1.3.2. Moments of Distribution

The moments of a distribution are given by:

Continuous:  μ_n = ∫_{-∞}^{∞} (x - c)^n f(x)dx        Discrete:  μ_n = Σ_i (k_i - c)^n f(k_i)

When c = 0 the moments, μ′_n, are called the raw moments, described as moments about the origin. In respect to probability distributions the first two raw moments are important. μ′_0 always equals one, and μ′_1 is the distribution's mean, which is the expected value of the random variable:

    μ′_0 = ∫_{-∞}^{∞} f(x)dx = 1,        μ′_0 = Σ_i f(k_i) = 1

    mean = E[X] = μ:    μ′_1 = ∫_{-∞}^{∞} x·f(x)dx,        μ′_1 = Σ_i k_i·f(k_i)

Some important properties of the expected value when transformations of the random variable occur are:

    E[X + b] = μ_X + b

    E[X + Y] = μ_X + μ_Y

    E[aX] = a·μ_X

    E[XY] = μ_X·μ_Y + Cov(X, Y)

When c = μ the moments, μ_n, are called the central moments, described as moments about the mean. In this book the first five central moments are important. μ_0 is equal to 1 and μ_1 is equal to 0. μ_2 is the variance, which quantifies the amount the random variable deviates from the mean. μ_3 and μ_4 are used to calculate the skewness and kurtosis.

    μ_0 = ∫_{-∞}^{∞} f(x)dx = 1,        μ_0 = Σ_i f(k_i) = 1

    μ_1 = ∫_{-∞}^{∞} (x - μ)f(x)dx = 0,        μ_1 = Σ_i (k_i - μ)f(k_i) = 0

variance = E[(X - E[X])²] = E[X²] - (E[X])² = σ²:

    μ_2 = ∫_{-∞}^{∞} (x - μ)² f(x)dx,        μ_2 = Σ_i (k_i - μ)² f(k_i)

Some important properties of the variance when transformations of the random variable occur are:

    Var[X + b] = Var[X]

    Var[X ± Y] = σ_X² + σ_Y² ± 2·Cov(X, Y)

    Var[aX] = a²·σ_X²

    Var[XY] ≈ (μ_X μ_Y)²·[σ_X²/μ_X² + σ_Y²/μ_Y² + 2·Cov(X, Y)/(μ_X μ_Y)]

The skewness is a measure of the asymmetry of the distribution.

    γ_1 = μ_3 / μ_2^(3/2)

The kurtosis is a measure of whether the distribution is peaked or flat.

    γ_2 = μ_4 / μ_2²

1.3.3. Covariance

Covariance is a measure of the dependence between random variables:

    Cov(X, Y) = E[(X - μ_X)(Y - μ_Y)] = E[XY] - μ_X·μ_Y

A normalized measure of covariance is correlation, ρ. The correlation has the limits -1 ≤ ρ ≤ 1. When ρ = 1 the random variables have a linear dependency (i.e., an increase in X results in a proportional increase in Y). When ρ = -1 the random variables have a negative linear dependency (i.e., an increase in X results in a proportional decrease in Y). The relationship between covariance and correlation is:

    ρ = Corr(X, Y) = Cov(X, Y) / (σ_X·σ_Y)

If the two random variables are independent then the correlation is equal to zero; however, the reverse is not always true: if the correlation is zero the random variables need not be independent. For derivations and more information the reader is referred to (Dekking et al. 2007, p.138).
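These definitions can be sketched numerically. A minimal Python example with hypothetical data (population rather than sample normalization is assumed throughout):

```python
import math

def covariance(xs, ys):
    # Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]  (population form)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    # rho = Cov(X, Y) / (sigma_X * sigma_Y)
    sx = math.sqrt(covariance(xs, xs))
    sy = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]  # y = 2x: perfect linear dependency
print(correlation(x, y))  # 1.0 (up to floating point)
```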


1.4. Parameter Estimation

1.4.1. Probability Plotting Paper

Most plotting methods transform the data available into a straight line for a specific distribution. From a line of best fit the parameters of the distribution can be estimated. Most plotting paper plots the random variable (time or demands) against the pdf, cdf or hazard rate and transform the data points to a linear relationship by adjusting the scale of each axis. Probability plotting is done using the following steps (Nelson 1982, p.108):

1. Order the data such that x_1 ≤ x_2 ≤ … ≤ x_i ≤ … ≤ x_n.

2. Assign a rank to each failure. For complete data this is simply the value i. Censored data is discussed after step 7.

3. Calculate the plotting position. The cdf may simply be calculated as i/n; however, this produces a biased result. Instead the following non-parametric Blom estimates are recommended as suitable for many cases by (Kimball 1960):

    F̂(t_i) = (i - 0.375)/(n + 0.25)

    R̂(t_i) = (n - i + 0.625)/(n + 0.25)

    f̂(t_i) = 1 / [(n + 0.25)(t_{i+1} - t_i)]

    ĥ(t_i) = 1 / [(n - i + 0.625)(t_{i+1} - t_i)]

Other proposed estimators are:

    Naive:                 F̂(t_i) = i/n
    Median (approximate):  F̂(t_i) = (i - 0.3)/(n + 0.4)
    Midpoint:              F̂(t_i) = (i - 0.5)/n
    Mean:                  F̂(t_i) = i/(n + 1)
    Mode:                  F̂(t_i) = (i - 1)/(n - 1)

4. Plot the points on probability paper. The choice of distribution should be from experience, or multiple distributions should be used to assess the best fit. Probability paper is available from http://www.weibull.com/GPaper/.


5. Assess the data and chosen distributions. If the data plots in a straight line then the distribution may be a reasonable fit.

6. Draw a line of best fit. This is a subjective assessment which minimizes the deviation of the points from the chosen line.

7. Obtain the desired information. This may be the distribution parameters or estimates of reliability or hazard rate trends.

When multiple failure modes are observed, only one failure mode should be plotted, with the other failures being treated as censored. Two popular methods to treat censored data are:

Rank Adjustment Method (Manzini et al. 2009, p.140). Here the adjusted rank, j_{t_i}, is calculated only for non-censored units (with i_{t_i} still being the rank over all ordered times). This adjusted rank is used for step 2, with the remaining steps unchanged:

    j_{t_i} = j_{t_{i-1}} + [(n + 1) - j_{t_{i-1}}] / (2 + n - i_{t_i})

Kaplan-Meier Estimator. Here the estimate for reliability is:

    R̂(t_i) = Π_{t_j < t_i} (1 - d_j/(n - j + 1))

where d_j is the number of failures in rank j (for non-grouped data d_j = 1). From this estimate a cdf can be given as F̂(t_i) = 1 - R̂(t_i). For a detailed derivation and properties of this estimator see (Andersen et al. 1996, p.255).

Probability plots are fast, are not dependent on complex numerical methods, and can be used without a detailed knowledge of statistics. They provide a visual representation of the data about which qualitative statements can be made, and can be useful in estimating initial values for numerical methods. A limitation of this technique is that it is not objective: two different people making the same plot may obtain different answers. It also does not provide confidence intervals. For more detail on probability plotting the reader is referred to (Nelson 1982, p.104) and (Meeker & Escobar 1998, p.122).
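The Kaplan-Meier estimate can be sketched in a few lines. A minimal Python example with hypothetical failure and censoring times, assuming d_j = 1 per event and reporting the step value of R just after each event time:

```python
def kaplan_meier(times, censored):
    # times: ordered event times; censored[j] is True when the j-th ordered
    # unit was removed without failing. Applies
    # R(t_i) = prod over failures up to t_i of (1 - d_j/(n - j + 1)), d_j = 1.
    n = len(times)
    R, curve = 1.0, []
    for j, (t, c) in enumerate(zip(times, censored), start=1):
        if not c:
            R *= 1.0 - 1.0 / (n - j + 1)
        curve.append((t, R))
    return curve

# Hypothetical test of 4 units: failures at 10, 30, 50 h, censoring at 20 h.
curve = kaplan_meier([10, 20, 30, 50], [False, True, False, False])
print(curve)  # [(10, 0.75), (20, 0.75), (30, 0.375), (50, 0.0)]
```

Note how the censored unit at 20 h leaves the reliability estimate unchanged but reduces the number at risk for later failures.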

1.4.2. Total Time on Test Plots

A total time on test (TTT) plot is a graph which provides a visual representation of the hazard rate trend, i.e. increasing, constant or decreasing. This assists in identifying the distribution from which the data may come. To construct a TTT plot (Rinne 2008, p.334):

1. Order the data such that x_1 ≤ x_2 ≤ … ≤ x_i ≤ … ≤ x_n.

2. Calculate the TTT positions (with x_0 = 0):

    TTT_i = Σ_{j=1}^{i} (n - j + 1)(x_j - x_{j-1}),    i = 1, 2, …, n

3. Calculate the normalized TTT positions:

    TTT*_i = TTT_i / TTT_n,    i = 1, 2, …, n

4. Plot the points (i/n, TTT*_i).

5. Analyze the graph (see Figure 5).

[Figure 5: Time on test plot interpretation — TTT*_i plotted against i/n on the unit square, from 0 to 1 on each axis.]

Compared to probability plotting, TTT plots are simple, scale invariant and can represent any data set, even those from different distributions, on the same plot. However, a TTT plot only provides an indication of failure rate properties and cannot be used directly to estimate parameters. For more information about TTT plots the reader is referred to (Rinne 2008, p.334).
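The construction steps above can be sketched as follows (hypothetical sample):

```python
def ttt_plot_points(sample):
    # Returns the (i/n, TTT*_i) pairs defined in steps 1-4, with x_0 = 0.
    xs = sorted(sample)
    n = len(xs)
    ttt, total, prev = [], 0.0, 0.0
    for j, x in enumerate(xs, start=1):
        total += (n - j + 1) * (x - prev)  # TTT_i = sum (n-j+1)(x_j - x_{j-1})
        prev = x
        ttt.append(total)
    return [(i / n, t / ttt[-1]) for i, t in enumerate(ttt, start=1)]

points = ttt_plot_points([3.0, 1.0, 2.0])
print(points)  # last point is always (1.0, 1.0)
```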

1.4.3. Least Mean Square Regression

When the relationship between two variables x and y is assumed linear (y = mx + c), an estimate of the line's parameters can be obtained from n sample data points (x_i, y_i) using least mean square (LMS) regression. The least square method minimizes the sum of the squared residuals:

    S = Σ_{i=1}^{n} r_i²

The residual can be defined in many ways.

Minimize y residuals:

    r_i = y_i - f(x_i; m, c)

    m̂ = [n·Σx_i y_i - Σx_i·Σy_i] / [n·Σx_i² - (Σx_i)²]

    ĉ = Σy_i/n - m̂·Σx_i/n

Minimize x residuals:

    r_i = x_i - f(y_i; m, c)

    m̂ = [n·Σy_i² - (Σy_i)²] / [n·Σx_i y_i - Σx_i·Σy_i]

    ĉ = Σy_i/n - m̂·Σx_i/n

[Figure 6: Left minimize y residual, right minimize x residual — each panel shows the sample points (x_i, y_i), the regression line y = mx + c, and a residual r_i.]

The LMS method can be used to estimate the line of best fit when using plotting parameter estimation methods. When plotting on a regular scale in software such as Microsoft Excel, it is often easy to conduct linear least mean square (LMS) regression using built-in functions. Where available, this book provides the formulas to plot the sample data in a straight line on a regular scale plot. It also provides the transformation from the linear LMS regression estimates of m̂ and ĉ to the distribution parameter estimates.

For more on least square methods in a reliability engineering context see (Nelson 1990, p.167). LMS regression can also be conducted on multivariate distributions, see (Rao & Toutenburg 1999), and can also be conducted on non-linear data directly, see (Björck 1996).
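The minimize-y-residual formulas can be implemented directly. A minimal Python sketch with hypothetical points that lie exactly on a line:

```python
def lms_fit(xs, ys):
    # Minimize y residuals:
    #   m = (n*sum(x*y) - sum(x)*sum(y)) / (n*sum(x^2) - sum(x)^2)
    #   c = mean(y) - m*mean(x)
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = sy / n - m * sx / n
    return m, c

m, c = lms_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(m, c)  # 2.0 1.0 (the points lie exactly on y = 2x + 1)
```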

1.4.4. Method of Moments

To estimate the distribution parameters using the method of moments the sample moments are equated to the parameter moments and solved for the unknown parameters. The following sample moments can be used:

The sample mean is given as:

    x̄ = (1/n)·Σ_{i=1}^{n} x_i

The unbiased sample variance is given as:

    S² = [1/(n - 1)]·Σ_{i=1}^{n} (x_i - x̄)²

The method of moments is not as accurate as Bayesian or maximum likelihood estimates, but is easy and fast to calculate. The method of moments estimates are often used as a starting point for numerical methods to optimize maximum likelihood and least square estimators.
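As an illustration, for a gamma distribution with shape α and rate β (an assumed parameterization with mean = α/β and variance = α/β²; check against the convention in use), equating the sample moments gives closed-form estimates:

```python
def gamma_method_of_moments(sample):
    # Equate the sample mean and unbiased sample variance to the gamma
    # moments: mean = alpha/beta, variance = alpha/beta^2 (assumed
    # shape/rate parameterization).
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    beta_hat = xbar / s2          # rate
    alpha_hat = xbar * beta_hat   # shape = xbar^2 / s2
    return alpha_hat, beta_hat

print(gamma_method_of_moments([1.0, 2.0, 3.0]))  # (4.0, 2.0)
```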

1.4.5. Maximum Likelihood Estimates

Maximum likelihood estimates (MLE) are based on a frequentist approach to parameter estimation usually obtained by maximizing the natural log of the likelihood function.

    Λ(θ|E) = ln{L(θ|E)}

Algebraically this is done by setting the first order partial derivatives of the log-likelihood function to zero and solving. This calculation has been included in this book for distributions where the result is in closed form. Otherwise the log-likelihood function can be maximized directly using numerical methods.

The MLE θ̂ is obtained by solving for θ:

    ∂Λ/∂θ = 0

Denote the true parameters of the distribution as θ_0. MLEs have the following properties (Rinne 2008, p.406):

• Consistency. As the number of samples increases, the difference between the estimated and actual parameter decreases:

    plim_{n→∞} θ̂ = θ_0

• Asymptotic normality.

    lim_{n→∞} θ̂ ~ Norm(θ_0, [I_n(θ_0)]⁻¹)

where I_n(θ) = n·I(θ) is the Fisher information matrix. Therefore θ̂ is asymptotically unbiased:

    lim_{n→∞} E[θ̂] = θ_0

• Asymptotic efficiency.

    lim_{n→∞} Var[θ̂] = [I_n(θ_0)]⁻¹

• Invariance. The MLE of f(θ_0) is f(θ̂) if f(.) is a continuous and continuously differentiable function.

The advantages of MLE are that it is a very common technique that has been widely published and is implemented in many software packages, and that it can easily handle censored data. The disadvantages are the bias introduced for small sample sizes, and the unbounded estimates that may result when no failures have been observed. The numerical optimization of the log-likelihood function may be non-trivial, with high sensitivity to starting values and the presence of local maxima.

For more information in a reliability context see (Nelson 1990, p.284).
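For the exponential distribution the MLE is available in closed form, λ̂ = n_F/t_T. A minimal Python sketch with hypothetical complete failure data, checking that the log-likelihood is indeed maximized at λ̂:

```python
import math

def exponential_mle(failure_times):
    # Closed-form MLE for the exponential rate: lambda_hat = n_F / t_T
    return len(failure_times) / sum(failure_times)

def log_likelihood(lam, failure_times):
    # Lambda(lambda|E) = n_F*ln(lambda) - lambda*t_T for complete data
    return len(failure_times) * math.log(lam) - lam * sum(failure_times)

times = [10.0, 25.0, 40.0, 25.0]  # hypothetical complete failure data
lam_hat = exponential_mle(times)
# The log-likelihood at the MLE exceeds that at nearby values:
assert log_likelihood(lam_hat, times) > log_likelihood(1.1 * lam_hat, times)
assert log_likelihood(lam_hat, times) > log_likelihood(0.9 * lam_hat, times)
print(lam_hat)  # 0.04
```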

1.4.6. Bayesian Estimation

Bayesian estimation uses a subjective interpretation of the theory of probability, and for parameter and confidence interval estimation uses Bayes' rule to update our state of knowledge of the unknown of interest (UOI). Recall Bayes' rule from section 1.1.5:

    π(θ|E) = π_0(θ)·L(E|θ) / ∫ π_0(θ)·L(E|θ)dθ,        P(θ_i|E) = P(E|θ_i)·P(θ_i) / Σ_i P(E|θ_i)·P(θ_i)

for the continuous and discrete forms of the variable θ respectively.

The Prior Distribution π_0(θ)

The prior distribution is the probability distribution of the UOI, θ, which captures our state of knowledge of θ prior to the evidence being observed. It is common for this distribution to represent soft evidence or intervals about the possible values of θ. If the distribution is dispersed it represents little being known about the parameter; if the distribution is concentrated in an area then it reflects good knowledge about the likely values of θ.

A prior distribution should be a proper probability distribution of θ. A distribution is proper when it integrates to one and improper otherwise. The prior should also not be selected based on the form of the likelihood function. When the prior has a constant which does not affect the posterior distribution (such as improper priors), it will be omitted from the formulas within this book.

Non-informative Priors. Occasions arise when it is not possible to express a subjective prior distribution due to lack of information, time or cost. Alternatively a subjective prior distribution may introduce unwanted bias through model convenience (conjugates) or due to elicitation methods. In such cases a non-informative prior may be desirable. The following methods exist for creating a non-informative prior (Yang and Berger 1998):

• Principle of Indifference - Improper Uniform Priors. An equal probability is assigned across all the possible values of the parameter θ. This is done using an improper uniform distribution with a constant value, usually 1, over the range of the possible values for θ. When placed in Bayes' formula the constant cancels out; however, the denominator is integrated over all possible values of θ. In most cases this prior distribution will result in a proper posterior, but not always. Improper uniform priors may be chosen to enable the use of conjugate priors.

For example, using an exponential likelihood model with an improper uniform prior, 1, over the limits [0, ∞), with evidence of n_F failures in total time t_T:

    Prior:       π_0(λ) = 1 ∝ Gamma(1, 0)

    Likelihood:  L(E|λ) = λ^{n_F}·e^{-λt_T}

    Posterior:   π(λ|E) = 1·L(E|λ) / ∫_0^∞ 1·L(E|λ)dλ

Using the conjugate relationship (see Conjugate Priors for calculations):

    λ ~ Gamma(λ; 1 + n_F, t_T)

• Principle of Indifference - Proper Uniform Priors. An equal probability is assigned across the values of the parameter within a range defined by the uniform distribution. The uniform distribution is obtained by estimating the far left and right bounds (a and b) of the parameter, giving π_0(θ) = 1/(b - a) = c, where c is a constant. When placed in Bayes' formula the constant cancels out; however, the denominator is integrated over the bound [a, b]. Care needs to be taken in choosing a and b because, no matter how much the evidence suggests otherwise, the posterior distribution will always be zero outside these bounds.

Using an exponential likelihood model with a proper uniform prior, c, over the limits [a, b], with evidence of n_F failures in total time t_T:

    Prior:       π_0(λ) = 1/(b - a) = c ∝ TruncatedGamma(1, 0)

    Likelihood:  L(E|λ) = λ^{n_F}·e^{-λt_T}

    Posterior:   π(λ|E) = c·L(E|λ) / [c·∫_a^b L(E|λ)dλ]    for a ≤ λ ≤ b

Using the conjugate relationship this results in a truncated gamma distribution:

    π(λ) = c·Gamma(λ; 1 + n_F, t_T)    for a ≤ λ ≤ b
    π(λ) = 0                           otherwise

• Jeffreys' Prior. Proposed by Jeffreys in 1961, this prior is defined as π_0(θ) = √(det I(θ)), where I(θ) is the Fisher information matrix. This derivation is motivated by the fact that it is not dependent upon the set of parameter variables that is chosen to describe parameter space. Jeffreys himself suggested the need to make ad hoc modifications to the prior to avoid problems in multidimensional distributions. Jeffreys' prior is normally improper. (Bernardo et al. 1992)

• Reference Prior. Proposed by Bernardo in 1979, this prior maximizes the expected posterior information from the data, thereby reducing the effect of the prior. When there are no nuisance parameters and certain regularity conditions are satisfied, the reference prior is identical to Jeffreys' prior. Due to the need to order or group the importance of parameters, different posteriors may result from the same data depending on the importance the user places on each parameter. This prior overcomes the problems which arise when using Jeffreys' prior in multivariate applications.


• Maximal Data Information Prior (MDIP). Developed by Zellner in 1971, this prior maximizes the expected information provided by the data relative to that provided by the prior. (Berry et al. 1995, p.182)

For further detail on the differences between each type of non-informative prior see (Berry et al. 1995, p.179)

Conjugate Priors. Calculating posterior distributions can be extremely complex and in most cases requires expensive computations. A special case exists, however, in which the posterior distribution is of the same form as the prior distribution. The Bayesian updating mathematics can then be reduced to simple calculations to update the model parameters. As an example, the gamma distribution is a conjugate prior to a Poisson likelihood function:

    Prior:                 π_0(λ) = β^α·λ^{α-1}·e^{-βλ} / Γ(α)

    Likelihood (single):   L_i(t_i|λ) = (λt_i)^{k_i}·e^{-λt_i} / k_i!

    Likelihood (all):      L(E|λ) = Π_{i=1}^{n_F} L_i(t_i|λ) = [Π t_i^{k_i} / Π k_i!]·λ^{Σk}·e^{-λΣt_i}

    Posterior:             π(λ|E) = π_0(λ)·L(E|λ) / ∫_0^∞ π_0(λ)·L(E|λ)dλ

Substituting the prior and likelihood, the constants cancel:

    π(λ|E) = λ^{α-1+Σk}·e^{-λ(β+Σt_i)} / ∫_0^∞ λ^{α-1+Σk}·e^{-λ(β+Σt_i)}dλ

Using the identity Γ(z) = ∫_0^∞ x^{z-1}·e^{-x}dx, we can calculate the denominator using the change of variable u = λ(β + Σt_i). This gives λ = u/(β + Σt_i) and dλ = du/(β + Σt_i), with the limits of u the same as λ. Letting z = α + Σk and substituting back into the posterior equation gives:

    π(λ|E) = (β + Σt_i)^{α+Σk}·λ^{α-1+Σk}·e^{-λ(β+Σt_i)} / Γ(α + Σk)

Let α′ = α + Σk and β′ = β + Σt_i:

    π(λ|E) = β′^{α′}·λ^{α′-1}·e^{-β′λ} / Γ(α′)

As can be seen, the posterior is a gamma distribution with the parameters α′ = α + Σk_i and β′ = β + Σt_i. Therefore the prior and posterior are of the same form, and Bayes' rule does not need to be re-calculated for each update. Instead the user can simply update the parameters with the new evidence.
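The conjugate update reduces to two additions. A minimal Python sketch with a hypothetical prior and hypothetical count/exposure evidence:

```python
def gamma_poisson_update(alpha, beta, counts, exposures):
    # Conjugate update: alpha' = alpha + sum(k_i), beta' = beta + sum(t_i)
    return alpha + sum(counts), beta + sum(exposures)

# Hypothetical prior Gamma(2, 100) on a failure rate; then 3 failures are
# observed in 50 h of operation and 1 failure in 30 h:
alpha_post, beta_post = gamma_poisson_update(2, 100, [3, 1], [50.0, 30.0])
print(alpha_post, beta_post)  # 6 180.0
print(alpha_post / beta_post)  # posterior mean of lambda
```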

The Likelihood Function L(E|θ)

The reader is referred to section 1.1.6 for a discussion on the construction of the likelihood function.

The Posterior Distribution π(θ|E)

The posterior distribution is a probability distribution of the UOI, θ, which captures our state of knowledge of θ including all prior information and the evidence.

Point Estimate. From the posterior distribution we may want to give a point estimate of θ. The Bayesian estimator when using a quadratic loss function is the posterior mean (Christensen & Huffman 1985):

    θ̂_π = E[π(θ|E)] = ∫ θ·π(θ|E)dθ = μ

For more information on utility, loss functions and estimators in a Bayesian context see (Berger 1993).

1.4.7. Confidence Intervals

Assuming a random variable is distributed by a given distribution, there exist true distribution parameters, θ_0, which are unknown. The parameter point estimates, θ̂, may or may not be close to the true parameter values. Confidence intervals provide the range over which the true parameter values may exist with a certain level of confidence. Confidence intervals only quantify uncertainty due to sampling error arising from a limited number of samples; uncertainty due to an incorrect model or incorrect assumptions is not included. (Meeker & Escobar 1998, p.49)

Increasing the desired confidence γ results in a wider interval, while increasing the sample size generally decreases the width of the confidence interval. There are many methods to calculate confidence intervals. Some popular methods are:

• Exact Confidence Intervals. It may be mathematically shown that a parameter of a distribution itself follows a distribution. In such cases exact confidence intervals can be derived. This is only the case for very few distributions.

• Fisher Information Matrix (Nelson 1990, p.292). For a large number of samples, the asymptotic normal property can be used to estimate confidence intervals:

    lim_{n→∞} θ̂ ~ Norm(θ_0, [I_n(θ_0)]⁻¹)

Combining this with the asymptotic property θ̂ → θ_0 as n → ∞ gives the following estimate for the distribution of θ̂:

    lim_{n→∞} θ̂ ~ Norm(θ̂, [J_n(θ̂)]⁻¹)

where J_n(θ̂) is the Fisher information evaluated at θ̂. 100γ% approximate confidence intervals are calculated using percentiles of the normal distribution. If the range of θ is unbounded, (-∞, ∞), the approximate two-sided confidence intervals are:

    θ_γ = θ̂ - Φ⁻¹((1 + γ)/2)·√([J_n(θ̂)]⁻¹)

    θ̄_γ = θ̂ + Φ⁻¹((1 + γ)/2)·√([J_n(θ̂)]⁻¹)

If the range of θ is (0, ∞) the approximate two-sided confidence intervals are:

    θ_γ = θ̂·exp[-Φ⁻¹((1 + γ)/2)·√([J_n(θ̂)]⁻¹) / θ̂]

    θ̄_γ = θ̂·exp[+Φ⁻¹((1 + γ)/2)·√([J_n(θ̂)]⁻¹) / θ̂]

If the range of θ is (0, 1) the approximate two-sided confidence intervals are:

    θ_γ = θ̂ / {θ̂ + (1 - θ̂)·exp[+Φ⁻¹((1 + γ)/2)·√([J_n(θ̂)]⁻¹) / (θ̂(1 - θ̂))]}

    θ̄_γ = θ̂ / {θ̂ + (1 - θ̂)·exp[-Φ⁻¹((1 + γ)/2)·√([J_n(θ̂)]⁻¹) / (θ̂(1 - θ̂))]}

The advantage of this method is that it can be calculated for all distributions and is easy to calculate. The disadvantage is that the assumption of a normal distribution is asymptotic, and so sufficient data is required for the confidence interval estimate to be accurate; the number of samples needed for an accurate estimate changes from distribution to distribution. It can also produce symmetrical confidence intervals which may be very inaccurate. For more information see (Nelson 1990, p.292).

• Likelihood Ratio Intervals (Nelson 1990, p.292). The likelihood ratio statistic is:

    D = 2[Λ(θ̂) - Λ(θ)]

D is approximately chi-square distributed with one degree of freedom, so the 100γ% confidence region for θ is:

    2[Λ(θ̂) - Λ(θ)] ≤ χ²(γ; 1)

The two-sided confidence limits θ_γ and θ̄_γ are calculated by solving:

    Λ(θ) = Λ(θ̂) - χ²(γ; 1)/2

The limits are normally solved numerically. The likelihood ratio intervals are always within the limits of the parameter and give asymmetrical confidence limits. The method is much more accurate than the Fisher information matrix method, particularly for one-sided limits, although it is more complicated to calculate. This method must be solved numerically and so will not be discussed further in this book.

• Bayesian Confidence Intervals. In Bayesian statistics the uncertainty of a parameter, θ, is quantified as a distribution π(θ). Therefore the two-sided 100γ% confidence intervals are found by solving:

    (1 - γ)/2 = ∫_{-∞}^{θ_γ} π(θ)dθ,        (1 - γ)/2 = ∫_{θ̄_γ}^{∞} π(θ)dθ

Other methods exist to calculate approximate confidence intervals. A summary of some techniques used in reliability engineering is included in (Lawless 2002).
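As a sketch of the Fisher information method, consider an exponential rate with complete data, where J_n(λ̂) = n/λ̂² so the (0, ∞) interval reduces to λ̂·exp(±z/√n). The data are hypothetical, and Φ⁻¹ is computed here by a simple bisection rather than a library routine:

```python
import math

def probit(p):
    # Inverse standard normal cdf Phi^{-1}(p), via bisection on erf.
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exp_rate_ci(n_failures, total_time, gamma_conf=0.90):
    # Normal-approximation interval for the exponential rate on (0, inf):
    # J_n(lam_hat) = n/lam_hat^2, so z*sqrt([J_n]^-1)/lam_hat = z/sqrt(n).
    lam_hat = n_failures / total_time
    z = probit((1 + gamma_conf) / 2)
    w = math.exp(z / math.sqrt(n_failures))
    return lam_hat / w, lam_hat * w

low, high = exp_rate_ci(20, 500.0)
print(low, high)  # an asymmetric interval around lam_hat = 0.04
```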

1.5. Related Distributions

[Figure 7 appeared here as a relationship chart; only its caption and the distributions it connects are recoverable. It linked the beta, Bernoulli, binomial, normal, lognormal, chi-square, chi, Student t, F, gamma, Poisson, exponential, Rayleigh, Weibull, Pareto, uniform and standard uniform distributions through transformations and special cases.]

Figure 7: Relationships between common distributions (Leemis & McQueston 2008).

Many relations are not included such as central limit convergence to the normal distribution and many transforms which would have made the figure unreadable. For further details refer to individual sections and (Leemis & McQueston 2008).


1.6. Supporting Functions

1.6.1. Beta Function B(x, y)

B(x, y) is the Beta function, the Euler integral of the first kind:

    B(x, y) = ∫_0^1 u^{x-1}·(1 - u)^{y-1}du

where x > 0 and y > 0.

Relationships:

    B(x, y) = B(y, x)

    B(x, y) = Γ(x)·Γ(y) / Γ(x + y)

    B(x, y) = Σ_{n=0}^{∞} (n - y choose n) / (x + n)

More formulas, definitions and special values can be found in the Digital Library of Mathematical Functions on the National Institute of Standards and Technology (NIST) website, http://dlmf.nist.gov.
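The Beta-Gamma relationship can be checked numerically; a minimal sketch using Python's standard library:

```python
import math

def beta_fn(x, y):
    # B(x, y) = Gamma(x)*Gamma(y)/Gamma(x + y)
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

# For positive integers, B(x, y) = (x-1)!(y-1)!/(x+y-1)!
print(beta_fn(3, 2))  # 2*1/24 = 1/12 ≈ 0.0833
```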

1.6.2. Incomplete Beta Function B(t; x, y)

B(t; x, y) is the incomplete Beta function, expressed as:

    B(t; x, y) = ∫_0^t u^{x-1}·(1 - u)^{y-1}du

1.6.3. Regularized Incomplete Beta Function I(t; x, y)

I(t; x, y) is the regularized incomplete Beta function:

    I(t; x, y) = B(t; x, y) / B(x, y)

    I(t; x, y) = Σ_{j=x}^{x+y-1} [(x + y - 1)! / (j!·(x + y - 1 - j)!)]·t^j·(1 - t)^{x+y-1-j}    (for integer x and y)

Properties:

    I(0; x, y) = 0
    I(1; x, y) = 1
    I(t; x, y) = 1 - I(1 - t; y, x)

1.6.4. Complete Gamma Function Γ(k)

Γ(k) is a generalization of the factorial function k! to include non-integer values.

Γ 𝑘𝑘 𝑘𝑘 Supporting Func Introduction 35

For k > 0:

    Γ(k) = ∫_0^∞ t^{k-1}·e^{-t}dt

         = (k - 1)·∫_0^∞ t^{k-2}·e^{-t}dt

         = (k - 1)·Γ(k - 1)

When k is an integer:

    Γ(k) = (k - 1)!

Special values:

    Γ(1) = 1
    Γ(2) = 1
    Γ(1/2) = √π

Relation to the incomplete gamma functions:

    Γ(k) = Γ(k, t) + γ(k, t)

More formulas, definitions and special values can be found in the Digital Library of Mathematical Functions on the National Institute of Standards and Technology (NIST) website, http://dlmf.nist.gov.

1.6.5. Upper Incomplete Gamma Function Γ(k, t)

For k > 0:

    Γ(k, t) = ∫_t^∞ x^{k-1}·e^{-x}dx

When k is an integer:

    Γ(k, t) = (k - 1)!·e^{-t}·Σ_{n=0}^{k-1} t^n/n!

More formulas, definitions and special values can be found on the NIST website, http://dlmf.nist.gov.

1.6.6. Lower Incomplete Gamma Function γ(k, t)

For k > 0:

    γ(k, t) = ∫_0^t x^{k-1}·e^{-x}dx

When k is an integer:

    γ(k, t) = (k - 1)!·[1 - e^{-t}·Σ_{n=0}^{k-1} t^n/n!]


More formulas, definitions and special values can be found on the NIST website, http://dlmf.nist.gov.
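For integer k the two closed forms above can be checked against the identity Γ(k) = Γ(k, t) + γ(k, t). A minimal Python sketch:

```python
import math

def upper_incomplete_gamma(k, t):
    # Gamma(k, t) = (k-1)! * e^{-t} * sum_{n=0}^{k-1} t^n/n!  (integer k)
    s = sum(t**n / math.factorial(n) for n in range(k))
    return math.factorial(k - 1) * math.exp(-t) * s

def lower_incomplete_gamma(k, t):
    # gamma(k, t) = (k-1)! * (1 - e^{-t} * sum_{n=0}^{k-1} t^n/n!)
    s = sum(t**n / math.factorial(n) for n in range(k))
    return math.factorial(k - 1) * (1 - math.exp(-t) * s)

k, t = 4, 2.5
total = upper_incomplete_gamma(k, t) + lower_incomplete_gamma(k, t)
print(total, math.gamma(k))  # both ≈ (k-1)! = 6
```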

1.6.7. Digamma Function ψ(x)

ψ(x) is the digamma function, defined as:

    ψ(x) = d/dx ln[Γ(x)] = Γ′(x)/Γ(x)    for x > 0

1.6.8. Trigamma Function ψ′(x)

ψ′(x) is the trigamma function, defined as:

    ψ′(x) = d²/dx² ln[Γ(x)] = Σ_{i=0}^{∞} (x + i)^{-2}


1.7. Referred Distributions

1.7.1. Inverse Gamma Distribution IG(α, β)

The pdf of the inverse gamma distribution is:

    f(x; α, β) = [β^α / Γ(α)]·x^{-(α+1)}·e^{-β/x},    x ∈ (0, ∞)

with mean:

    μ = β/(α - 1)    for α > 1

1.7.2. Student t Distribution T(α, μ, σ²)

The pdf of the standard Student t distribution, with μ = 0 and σ² = 1, is:

    f(x; α) = [Γ((α + 1)/2) / (√(απ)·Γ(α/2))]·(1 + x²/α)^{-(α+1)/2}

The generalized Student t distribution is:

    f(x; α, μ, σ²) = [Γ((α + 1)/2) / (σ·√(απ)·Γ(α/2))]·(1 + (x - μ)²/(ασ²))^{-(α+1)/2}

with mean μ.

1.7.3. F Distribution F(n_1, n_2)

Also known as the Variance Ratio or Fisher-Snedecor distribution, the pdf is:

    f(x; n_1, n_2) = √[(n_1 x)^{n_1}·n_2^{n_2} / (n_1 x + n_2)^{n_1+n_2}] / [x·B(n_1/2, n_2/2)]

with cdf:

    F(x) = I(t; n_1/2, n_2/2),    where t = n_1 x/(n_1 x + n_2)

1.7.4. Chi-Square Distribution χ²(v)

The pdf of the chi-square distribution is:

    f(x; v) = x^{(v-2)/2}·exp(-x/2) / [2^{v/2}·Γ(v/2)]

with mean:

    μ = v

1.7.5. Hypergeometric Distribution Hypergeometric(k; n, m, N)

The hypergeometric distribution models the probability of k successes in n Bernoulli trials drawn, without replacement, from a population of N containing m successes; p = m/N. The pdf of the hypergeometric distribution is:

    f(k; n, m, N) = C(m, k)·C(N - m, n - k) / C(N, n)

with mean:

    μ = nm/N

1.7.6. Wishart Distribution Wishart_d(x; Σ, n)

The Wishart distribution is the multivariate generalization of the gamma distribution. The pdf is given as:

    f(x; Σ, n) = |x|^{(n-d-1)/2}·exp[-½·tr(x·Σ⁻¹)] / [2^{nd/2}·|Σ|^{n/2}·Γ_d(n/2)]

with mean:

    μ = nΣ
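The hypergeometric pdf and mean can be checked numerically. A minimal Python sketch with hypothetical n, m and N:

```python
from math import comb

def hypergeom_pmf(k, n, m, N):
    # f(k; n, m, N) = C(m, k)*C(N-m, n-k)/C(N, n)
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

# Draw n = 5 from N = 20 items of which m = 4 are "successes".
n, m, N = 5, 4, 20
probs = [hypergeom_pmf(k, n, m, N) for k in range(min(n, m) + 1)]
mean = sum(k * p for k, p in enumerate(probs))
print(sum(probs), mean)  # both should be ~1, since the pmf sums to 1 and n*m/N = 1
```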

1.8. Nomenclature and Notation

Functions are presented in the following form:

    f(random variable; parameters | given variables)

n — In continuous distributions, the number of items under test, n = n_F + n_S + n_I. In discrete distributions, the total number of trials.
n_F — The number of items which failed before the conclusion of the test.
n_S — The number of items which survived to the end of the test.
n_I — The number of items which have interval data.
t_i^F — The time at which a component fails.
t_i^S — The time to which a component survived. The item may have been removed from the test for a reason other than failure.
t_i^UI — The upper limit of a censored interval in which an item failed.
t_i^LI — The lower limit of a censored interval in which an item failed.
t_L — The lower truncated limit of the sample.
t_U — The upper truncated limit of the sample.
t_T — Time on test, t_T = Σt_i^F + Σt_i^S.
X or T — Continuous random variable (T is normally a random time).
K — Discrete random variable.
x or t — A continuous random variable with a known value.
k — A discrete random variable with a known value.
x̂ — The hat denotes an estimated value.
x (bold) — A bold symbol denotes a vector or matrix.
θ — Generalized unknown of interest (UOI).
θ̄ — Upper confidence limit for the UOI.
θ̲ — Lower confidence limit for the UOI.
X ~ Norm_d(…) — The random variable X is distributed as a d-variate normal distribution.

2. Common Life Distributions

2.1. Exponential Continuous Distribution

[Figures: probability density function f(t), cumulative density function F(t), and hazard rate h(t) of the exponential distribution for λ = 0.5, 1, 2 and 3, plotted over t = 0 to 4.]


Parameters & Description

λ — Scale parameter, λ > 0. Equal to the hazard rate.

Limits: t ≥ 0

Function (time domain / Laplace domain):

    PDF:  f(t) = λ e^{−λt}         f(s) = λ/(s + λ),  s > −λ
    CDF:  F(t) = 1 − e^{−λt}       F(s) = λ/[s(s + λ)]
    Reliability:  R(t) = e^{−λt}   R(s) = 1/(s + λ)
    Conditional Survivor Function:  m(x) = e^{−λx}   m(s) = 1/(s + λ)
        This is P(T > x + t | T > t), where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.
    Mean Residual Life:  u(t) = 1/λ    u(s) = 1/(λs)
    Hazard Rate:  h(t) = λ             h(s) = λ/s
    Cumulative Hazard Rate:  H(t) = λt  H(s) = λ/s²

Properties and Moments

    Median: ln(2)/λ
    Mode: 0
    Mean (1st raw moment): 1/λ
    Variance (2nd central moment): 1/λ²
    Skewness (3rd central moment): 2
    Excess kurtosis (4th central moment): 6
    Characteristic function: iλ/(iλ + t)
    100α% percentile function: t_α = −(1/λ) ln(1 − α)
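The percentile function above can be checked against scipy's exponential distribution (a sketch, not from the source; scipy parameterizes by scale = 1/λ):

```python
# Sketch: t_alpha = -ln(1 - alpha)/lambda matches scipy's ppf.
import math
from scipy import stats

lam, alpha = 2.0, 0.9
t_alpha = -math.log(1 - alpha) / lam
print(abs(stats.expon(scale=1 / lam).ppf(alpha) - t_alpha) < 1e-12)  # True
```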


Parameter Estimation

Plotting Method (least squares, y = mx + c):

    X-axis: t_i    Y-axis: ln[1 − F(t_i)]    λ̂ = −m

Likelihood Functions

    L(E|λ) = λ^{n_F} ∏_{i=1}^{n_F} e^{−λ t_i^F} · ∏_{i=1}^{n_S} e^{−λ t_i^S} · ∏_{i=1}^{n_I} [ e^{−λ t_i^LI} − e^{−λ t_i^UI} ]
             (failures)                (survivors)             (interval failures)

When there is no interval data this reduces to:

    L(E|λ) = λ^{n_F} e^{−λ t_T}   where t_T = Σt_i^F + Σt_i^S = total time in test

Log-Likelihood Functions

    Λ(E|λ) = n_F ln(λ) − λ Σ_{i=1}^{n_F} t_i^F − λ Σ_{i=1}^{n_S} t_i^S + Σ_{i=1}^{n_I} ln[ e^{−λ t_i^LI} − e^{−λ t_i^UI} ]

When there is no interval data this reduces to:

    Λ(E|λ) = n_F ln(λ) − λ t_T   where t_T = Σt_i^F + Σt_i^S

Solve ∂Λ/∂λ = 0 to get λ̂:

    ∂Λ/∂λ = n_F/λ − Σ_{i=1}^{n_F} t_i^F − Σ_{i=1}^{n_S} t_i^S + Σ_{i=1}^{n_I} [ t_i^UI e^{−λ t_i^UI} − t_i^LI e^{−λ t_i^LI} ] / [ e^{−λ t_i^LI} − e^{−λ t_i^UI} ] = 0

Point Estimates

When there is only complete and right-censored data the point estimate is:

    λ̂ = n_F / t_T   where t_T = Σt_i^F + Σt_i^S = total time in test

Fisher Information

    I(λ) = 1/λ²

100γ% Confidence Intervals (excluding interval data):

                                  2-Sided Lower                2-Sided Upper                    1-Sided Upper
    Type I (Time Terminated):     χ²_{(1−γ)/2}(2n_F)/(2t_T)    χ²_{(1+γ)/2}(2n_F + 2)/(2t_T)    χ²_γ(2n_F + 2)/(2t_T)
    Type II (Failure Terminated): χ²_{(1−γ)/2}(2n_F)/(2t_T)    χ²_{(1+γ)/2}(2n_F)/(2t_T)        χ²_γ(2n_F)/(2t_T)

χ²_α(v) is the 100αth percentile of the chi-squared distribution. (Modarres et al. 1999, pp.151-152) Note: These confidence intervals are only valid for complete and right-censored data, or when approximations of interval data are used (such as the median). They are exact confidence bounds, and therefore approximate methods such as use of the Fisher information matrix need not be used.

Bayesian

Non-informative Priors π(λ) (Yang and Berger 1998, p.6):

    Uniform proper prior 1/(b − a) with limits λ ∈ [a, b]:
        Posterior: truncated gamma distribution; for a ≤ λ ≤ b,
        c·Gamma(λ; 1 + n_F, t_T), otherwise π(λ) = 0.

    Uniform improper prior 1 ∝ Gamma(1, 0) with limits λ ∈ [0, ∞):
        Posterior: Gamma(λ; 1 + n_F, t_T).

    Jeffreys prior 1/√λ ∝ Gamma(½, 0) when λ ∈ [0, ∞):
        Posterior: Gamma(λ; ½ + n_F, t_T).

    Novick and Hall prior 1/λ ∝ Gamma(0, 0) when λ ∈ [0, ∞):
        Posterior: Gamma(λ; n_F, t_T).

where t_T = Σt_i^F + Σt_i^S = total time in test.

Conjugate Priors

    UOI: λ, from the exponential model Exp(t; λ). Evidence: n_F failures in t_T units of time. Prior: Gamma(k₀, Λ₀). Posterior parameters:

        k = k₀ + n_F,    Λ = Λ₀ + t_T

Description, Limitations and Uses

Example: Three vehicle tires were run on a test area for 1000 km each, with punctures at the following distances:

    Tire 1: no punctures
    Tire 2: 400 km, 900 km
    Tire 3: 200 km

Punctures are a random failure with a constant failure rate; therefore an exponential distribution is appropriate. Because the exponential distribution is homogeneous in time, the renewal process of the second tire failing twice (with repair) can be treated as two separate tires on test with single failures. See the example in section 1.1.6.

The total distance on test is 3 × 1000 = 3000 km. The total number of failures is 3. Therefore, using MLE, the estimate of λ is:

    λ̂ = n_F / t_T = 3/3000 = 1E-3

With 90% confidence interval (distance terminated test):

    [ χ²_{0.05}(6)/6000 = 0.272E-3,  χ²_{0.95}(8)/6000 = 2.584E-3 ]

A Bayesian point estimate using the Jeffreys non-informative improper prior Gamma(½, 0), with posterior Gamma(λ; 3.5, 3000), has point estimate:

    λ̂ = E[Gamma(λ; 3.5, 3000)] = 3.5/3000 = 1.167E-3

With 90% confidence interval using the inverse gamma cdf:

    [ F_G⁻¹(0.05) = 0.361E-3,  F_G⁻¹(0.95) = 2.344E-3 ]

Characteristics

Constant Failure Rate. The exponential distribution is defined by a constant failure rate, λ. This means the component is not subject to wear or accumulation of damage as time increases.

f(0) = λ. As can be seen, λ is the initial value of the distribution. Increases in λ increase the probability density at f(0).

HPP. The exponential distribution is the time to failure distribution of a single event in the Homogeneous Poisson Process (HPP).
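The tire example's point estimate, chi-squared bounds and Jeffreys posterior can be sketched with scipy (an illustrative check, not from the source):

```python
# Sketch: tire example - MLE, time-terminated 90% bounds, and the
# Gamma(3.5, 3000) posterior from the Jeffreys prior.
from scipy import stats

n_f, t_T = 3, 3000.0
lam_hat = n_f / t_T                                    # 1.0e-3
lower = stats.chi2.ppf(0.05, 2 * n_f) / (2 * t_T)      # ~0.272e-3
upper = stats.chi2.ppf(0.95, 2 * n_f + 2) / (2 * t_T)  # ~2.584e-3
post = stats.gamma(0.5 + n_f, scale=1 / t_T)           # Jeffreys posterior
print(round(lam_hat, 6), round(post.mean(), 6))        # 0.001 0.001167
```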

Let T ~ Exp(t; λ).

Scaling property:

    aT ~ Exp(t; λ/a)

Minimum property:

    min{T₁, T₂, …, T_n} ~ Exp(t; Σ_{i=1}^{n} λ_i)

Variate generation property:

    F⁻¹(u) = −ln(1 − u)/λ,   0 < u < 1

Memoryless property:

    Pr(T > t + x | T > t) = Pr(T > x)
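A minimal sketch (not from the source) of the variate generation property, drawing exponential samples by inverse transform:

```python
# Sketch: F^-1(u) = -ln(1 - u)/lambda converts uniform variates into
# exponential variates; the sample mean should approach 1/lambda.
import math
import random

random.seed(1)
lam = 2.0
samples = [-math.log(1 - random.random()) / lam for _ in range(100_000)]
print(abs(sum(samples) / len(samples) - 1 / lam) < 0.01)  # True
```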

Properties from (Leemis & McQueston 2008).

Applications

No Wearout. The exponential distribution is used to model occasions when there is no wearout or cumulative damage. It can be used to approximate the failure rate in a component's useful life period (after burn-in and before wear-out).


Homogeneous Poisson Process (HPP). The exponential distribution is used to model the inter-arrival times in a repairable system or the arrival times in queuing models. See the Poisson and Gamma distributions for more detail.

Electronic Components. Some electronic components, such as capacitors or integrated circuits, have been found to follow an exponential distribution. Early efforts at collecting reliability data assumed a constant failure rate, and therefore many reliability handbooks only provide failure rate estimates for components.

Random Shocks. It is common for the exponential distribution to model the occurrence of random shocks. An example is the failure of a vehicle tire due to puncture from a nail (random shock). The probability of failure in the next mile is independent of how many miles the tire has travelled (memoryless). The probability of failure when the tire is new is the same as when the tire is old (constant failure rate).

In general component life distributions do not have a constant failure rate, for example due to wear or early failures. Therefore the exponential distribution is often inappropriate to model most life distributions, particularly mechanical components.

Resources

Online:
    http://www.weibull.com/LifeDataWeb/the_exponential_distribution.htm
    http://mathworld.wolfram.com/ExponentialDistribution.html
    http://en.wikipedia.org/wiki/Exponential_distribution
    http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Balakrishnan, N. & Basu, A.P., 1996. Exponential Distribution: Theory, Methods and Applications 1st ed., CRC.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Relationship to Other Distributions

2-Parameter Exponential Distribution Exp(t; μ, β) — Special case:

    Exp(t; λ) = Exp(t; μ = 0, β = 1/λ)

Gamma Distribution Gamma(t; k, λ) — Let

    T₁ … T_k ~ Exp(λ)   and   T = T₁ + T₂ + ⋯ + T_k

Then

    T ~ Gamma(k, λ)

The gamma distribution is the probability density function of the sum of k exponentially distributed time random variables sharing the same constant rate of occurrence, λ. This is a Homogeneous Poisson Process.

Poisson Distribution Pois(k; μ) — Special case:

    Exp(t; λ) = Gamma(t; k = 1, λ)

Let T₁, T₂, … ~ Exp(t; λ). Given

    time = T₁ + T₂ + ⋯ + T_K + part of T_{K+1}

Then

    K ~ Pois(k; μ = λt)

The Poisson distribution is the probability of observing exactly k occurrences within a time interval [0, t] where the inter-arrival times of each occurrence are exponentially distributed. This is a Homogeneous Poisson Process. Special case:

    Pois(k = 1; μ = λt) = Exp(t; λ)

Weibull Distribution Weibull(t; α, β) — Let

    X ~ Exp(λ)   and   Y = X^{1/β}

Then

    Y ~ Weibull(α = λ^{−1/β}, β)

Special case:

    Exp(t; λ) = Weibull(t; α = 1/λ, β = 1)

Geometric Distribution Geometric(k; p) — Let

    X ~ Exp(λ)   and   Y = ⌊X⌋, the integer part of X

Then

    Y ~ Geometric(p = 1 − e^{−λ})

The geometric distribution is the discrete equivalent of the continuous exponential distribution. The geometric distribution is also memoryless.

Rayleigh Distribution Rayleigh(t; α) — Let

    X ~ Exp(λ)   and   Y = √X

Then

    Y ~ Rayleigh(α = 1/√λ)

Chi-square Distribution χ²(x; v) — Special case:

    χ²(x; v = 2) = Exp(x; λ = ½)

Pareto Distribution Pareto(t; θ, α) — Let

    Y ~ Pareto(θ, α)   and   X = ln(Y/θ)

Then

    X ~ Exp(λ = α)

Logistic Distribution Logistic(μ, s) — Let

    X ~ Exp(λ = 1)   and   Y = ln[ e^{−X} / (1 − e^{−X}) ]

Then

    Y ~ Logistic(0, 1)

(Hastings et al. 2000, p.127)


2.2. Lognormal Continuous Distribution

[Figures: probability density function f(t), cumulative density function F(t), and hazard rate h(t) of the lognormal distribution for parameter sets μ′ = 0, 0.5, 1, 1.5 with σ′ = 1, and μ′ = 1 with σ′ = 1, 1.5, 2.]


Parameters & Description

μ_N — Scale parameter, −∞ < μ_N < ∞. The mean of the normally distributed ln(x). This parameter determines only the scale, and not the location as it does in a normal distribution.

    μ_N = ln[ μ² / √(σ² + μ²) ]

σ_N — Shape parameter, σ_N > 0. The standard deviation of the normally distributed ln(x). This parameter determines only the shape, and not the scale as it does in a normal distribution.

    σ_N² = ln[ (σ² + μ²) / μ² ]

Limits: t > 0

Distribution Formulas

PDF:

    f(t) = 1/(σ_N t √(2π)) · exp{ −½ [(ln(t) − μ_N)/σ_N]² } = (1/(σ_N t)) φ[(ln(t) − μ_N)/σ_N]

where φ is the standard normal pdf.

CDF:

    F(t) = ∫₀ᵗ 1/(σ_N t* √(2π)) exp{ −½ [(ln(t*) − μ_N)/σ_N]² } dt*

where t* is the time variable over which the pdf is integrated.

    F(t) = ½ + ½ erf[ (ln(t) − μ_N)/(σ_N√2) ] = Φ[(ln(t) − μ_N)/σ_N]

where Φ is the standard normal cdf.

Reliability:

    R(t) = 1 − Φ[(ln(t) − μ_N)/σ_N]

Conditional Survivor Function P(T > x + t | T > t):

    m(x) = R(x|t) = R(x + t)/R(t) = { 1 − Φ[(ln(x + t) − μ_N)/σ_N] } / { 1 − Φ[(ln(t) − μ_N)/σ_N] }

where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.
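As an aside (not in the source), scipy's `lognorm` uses s = σ_N and scale = e^{μ_N}, and its pdf matches the φ form above:

```python
# Sketch: f(t) = phi((ln t - mu_N)/sigma_N) / (sigma_N * t) vs scipy.
import math
from scipy import stats

mu_n, sigma_n, t = 1.0, 0.5, 2.0
z = (math.log(t) - mu_n) / sigma_n
pdf = math.exp(-z ** 2 / 2) / (sigma_n * t * math.sqrt(2 * math.pi))
scipy_pdf = stats.lognorm(s=sigma_n, scale=math.exp(mu_n)).pdf(t)
print(abs(scipy_pdf - pdf) < 1e-12)  # True
```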

Mean Residual Life:

    u(t) = [ ∫_t^∞ R(x) dx ] / R(t)

with the asymptotic behavior:

    lim_{t→∞} u(t) ≈ σ_N² t [1 + o(1)] / [ln(t) − μ_N]

where o(1) is Landau's notation. (Kleiber & Kotz 2003, p.114)

Hazard Rate:

    h(t) = φ[(ln(t) − μ_N)/σ_N] / ( t σ_N { 1 − Φ[(ln(t) − μ_N)/σ_N] } )

Cumulative Hazard Rate:

    H(t) = −ln[R(t)]

Properties and Moments

    Median: e^{μ_N}
    Mode: e^{μ_N − σ_N²}
    Mean (1st raw moment): e^{μ_N + σ_N²/2}
    Variance (2nd central moment): (e^{σ_N²} − 1) · e^{2μ_N + σ_N²}
    Skewness (3rd central moment): (e^{σ_N²} + 2) √(e^{σ_N²} − 1)
    Excess kurtosis (4th central moment): e^{4σ_N²} + 2e^{3σ_N²} + 3e^{2σ_N²} − 6
    Characteristic function: deriving a unique characteristic equation is not trivial, and complex series solutions have been proposed. (Leipnik 1991)
    100α% percentile function: t_α = e^{μ_N + z_α σ_N} = e^{μ_N + σ_N Φ⁻¹(α)}, where z_α is the 100αth percentile of the standard normal distribution.

Parameter Estimation

Plotting Method (least squares, y = mx + c):

    X-axis: ln(t_i)    Y-axis: Φ⁻¹[F(t_i)]    μ̂_N = −c/m,  σ̂_N = 1/m

Maximum Likelihood Function

Likelihood:

    L(μ_N, σ_N |E) = ∏_{i=1}^{n_F} (1/(σ_N t_i)) φ(z_i^F) · ∏_{i=1}^{n_S} [1 − Φ(z_i^S)] · ∏_{i=1}^{n_I} [Φ(z_i^UI) − Φ(z_i^LI)]
                     (failures)                    (survivors)              (interval failures)

where

    z_i = [ln(t_i) − μ_N]/σ_N

Log-Likelihood:

    Λ(μ_N, σ_N |E) = Σ_{i=1}^{n_F} ln[ φ(z_i^F)/(σ_N t_i) ] + Σ_{i=1}^{n_S} ln[1 − Φ(z_i^S)] + Σ_{i=1}^{n_I} ln[Φ(z_i^UI) − Φ(z_i^LI)]

Solve ∂Λ/∂μ_N = 0 to get μ̂_N:

    ∂Λ/∂μ_N = (1/σ_N) Σ_{i=1}^{n_F} z_i^F + (1/σ_N) Σ_{i=1}^{n_S} φ(z_i^S)/[1 − Φ(z_i^S)] − (1/σ_N) Σ_{i=1}^{n_I} [φ(z_i^UI) − φ(z_i^LI)]/[Φ(z_i^UI) − Φ(z_i^LI)] = 0

Solve ∂Λ/∂σ_N = 0 to get σ̂_N:

    ∂Λ/∂σ_N = −n_F/σ_N + (1/σ_N) Σ_{i=1}^{n_F} (z_i^F)² + (1/σ_N) Σ_{i=1}^{n_S} z_i^S φ(z_i^S)/[1 − Φ(z_i^S)] − (1/σ_N) Σ_{i=1}^{n_I} [z_i^UI φ(z_i^UI) − z_i^LI φ(z_i^LI)]/[Φ(z_i^UI) − Φ(z_i^LI)] = 0

MLE Point Estimates. When there is only complete failure data the point estimates can be given as:

    μ̂_N = Σ ln(t_i^F) / n_F,    σ̂_N² = Σ[ln(t_i^F) − μ̂_N]² / n_F

Note: In almost all cases the MLE methods for a normal distribution can be used by taking ln(X). However, normal distribution estimation methods cannot be used with interval data. (Johnson et al. 1994, p.220)

In most cases the unbiased estimators are used:

    μ̂_N = Σ ln(t_i^F) / n_F,    σ̂_N² = Σ[ln(t_i^F) − μ̂_N]² / (n_F − 1)

Fisher Information:

    I(μ_N, σ_N²) = [ 1/σ²    0
                     0       1/(2σ⁴) ]

(Kleiber & Kotz 2003, p.119)

100γ% Confidence Intervals (for complete data):

    μ_N:  1-sided lower: μ̂_N − t_γ(n_F − 1) σ̂_N/√n_F
          2-sided: μ̂_N ± t_{(1+γ)/2}(n_F − 1) σ̂_N/√n_F
    σ_N²: 1-sided lower: (n_F − 1) σ̂_N² / χ²_γ(n_F − 1)
          2-sided: [ (n_F − 1) σ̂_N² / χ²_{(1+γ)/2}(n_F − 1),  (n_F − 1) σ̂_N² / χ²_{(1−γ)/2}(n_F − 1) ]

where t_γ(n_F − 1) is the 100γth percentile of the t-distribution with n_F − 1 degrees of freedom and χ²_γ(n_F − 1) is the 100γth percentile of the χ²-distribution with n_F − 1 degrees of freedom. (Nelson 1982, pp.218-219)

Confidence interval for the lognormal mean (1-sided lower and 2-sided bounds):

    exp{ μ̂_N + σ̂_N²/2 − z_γ √[ σ̂_N²/n_F + σ̂_N⁴/(2(n_F − 1)) ] }

    exp{ μ̂_N + σ̂_N²/2 ± z_{(1+γ)/2} √[ σ̂_N²/n_F + σ̂_N⁴/(2(n_F − 1)) ] }

These formulas are the Cox approximation for the confidence intervals of the lognormal distribution mean, where z_p = Φ⁻¹(p), the inverse of the standard normal cdf. (Zhou & Gao 1997) Zhou & Gao recommend using the parametric bootstrap method for small sample sizes. (Angus 1994)

Bayesian

Non-informative Priors when σ_N² is known, π₀(μ_N) (Yang and Berger 1998, p.22):

    Uniform proper prior 1/(b − a) with limits μ_N ∈ [a, b]:
        Posterior: truncated normal distribution; for a ≤ μ_N ≤ b,
        c · Norm( μ_N ; Σ_{i=1}^{n_F} ln(t_i^F)/n_F , σ_N²/n_F ), otherwise π(μ_N) = 0.

    All other non-informative priors, 1, when μ_N ∈ (−∞, ∞):
        Posterior: Norm( μ_N ; Σ_{i=1}^{n_F} ln(t_i^F)/n_F , σ_N²/n_F ).

Non-informative Priors when μ_N is known, π₀(σ_N²) (Yang and Berger 1998, p.23):

    Uniform proper prior 1/(b − a) with limits σ_N² ∈ [a, b]:
        Posterior: truncated inverse gamma distribution; for a ≤ σ_N² ≤ b,
        c · IG( σ_N² ; (n_F − 2)/2 , S_N²/2 ), otherwise π(σ_N²) = 0.

    Uniform improper prior 1 with limits σ_N² ∈ (0, ∞):
        Posterior: IG( σ_N² ; (n_F − 2)/2 , S_N²/2 ). See section 1.7.1.

    Jeffreys, reference and MDIP priors 1/σ_N² with limits σ_N² ∈ (0, ∞):
        Posterior: IG( σ_N² ; n_F/2 , S_N²/2 ). See section 1.7.1.

Non-informative Priors when μ_N and σ_N² are unknown, π₀(μ_N, σ_N²) (Yang and Berger 1998, p.23):

    Improper uniform prior 1 with limits μ_N ∈ (−∞, ∞), σ_N² ∈ (0, ∞):
        π(μ_N |E) ~ T( μ_N ; n_F − 3 , t̄_N , S_N²/(n_F(n_F − 3)) ). See section 1.7.2.
        π(σ_N² |E) ~ IG( σ_N² ; (n_F − 3)/2 , S_N²/2 ). See section 1.7.1.

    Jeffreys prior 1/σ_N⁴ when μ_N ∈ (−∞, ∞), σ_N² ∈ (0, ∞):
        π(μ_N |E) ~ T( μ_N ; n_F + 1 , t̄_N , S_N²/(n_F(n_F + 1)) ). See section 1.7.2.
        π(σ_N² |E) ~ IG( σ_N² ; (n_F + 1)/2 , S_N²/2 ). See section 1.7.1.

    Reference prior with ordering {φ, σ_N}, π₀(φ, σ_N) ∝ 1/(σ_N √(2 + φ²)) where φ = μ_N/σ_N:
        No closed form posterior.

    Reference prior where μ_N and σ_N are in separate groups, 1/σ_N²:
        π(μ_N |E) ~ T( μ_N ; n_F − 1 , t̄_N , S_N²/(n_F(n_F − 1)) ). See section 1.7.2.

    MDIP prior:
        π(σ_N² |E) ~ IG( σ_N² ; (n_F − 1)/2 , S_N²/2 ) when σ_N² ∈ (0, ∞). See section 1.7.1.

where

    S_N² = Σ_{i=1}^{n_F} (ln t_i − t̄_N)²   and   t̄_N = (1/n_F) Σ_{i=1}^{n_F} ln t_i

Conjugate Priors

    UOI: 1/σ_N², from LogN(t; μ_N, σ_N²) with known μ_N. Evidence: n_F failures at times t_i. Prior: Gamma(k₀, λ₀). Posterior parameters:

        k = k₀ + n_F/2 ,   λ = λ₀ + ½ Σ_{i=1}^{n_F} (ln t_i − μ_N)²

    UOI: μ_N, from LogN(t; μ_N, σ_N²) with known σ_N². Evidence: n_F failures at times t_i. Prior: Norm(μ₀, σ₀²). Posterior parameters:

        μ = [ μ₀/σ₀² + (Σ ln t_i)/σ_N² ] / [ 1/σ₀² + n_F/σ_N² ] ,   σ² = [ 1/σ₀² + n_F/σ_N² ]⁻¹

Description, Limitations and Uses

Example: 5 components are put on a test with the following failure times:

    98, 116, 2485, 2526, 2920 hours

Taking the natural log of these failure times allows us to use a normal distribution to approximate the parameters:

    ln(t_i): 4.590, 4.752, 7.979, 7.818, 7.834 (log-hours)

MLE estimates are:

    μ̂_N = Σ ln(t_i^F)/n_F = 32.974/5 = 6.595

    σ̂_N² = Σ[ln(t_i^F) − μ̂_N]²/(n_F − 1) = 3.091

90% confidence interval for μ_N:

    [ μ̂_N − t_{0.95}(4) σ̂_N/√4 ,  μ̂_N + t_{0.95}(4) σ̂_N/√4 ] = [4.721, 8.469]

90% confidence interval for σ_N²:

    [ 4σ̂_N²/χ²_{0.95}(4) ,  4σ̂_N²/χ²_{0.05}(4) ] = [1.303, 17.396]
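The example's estimates can be reproduced directly (a sketch, not from the source; tiny differences from the quoted 6.595 and 3.091 come from rounding in the printed logs):

```python
# Sketch: MLE of mu_N and the unbiased (n_F - 1) estimate of sigma_N^2
# from the logged failure times of the example.
import math

times = [98, 116, 2485, 2526, 2920]
logs = [math.log(t) for t in times]
n = len(logs)
mu_hat = sum(logs) / n
var_hat = sum((x - mu_hat) ** 2 for x in logs) / (n - 1)
print(round(mu_hat, 2), round(var_hat, 2))  # 6.59 3.09
```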

A Bayesian point estimate using the Jeffreys non-informative improper prior 1/σ_N⁴, with posteriors μ_N ~ T(6, 6.595, 0.412²) and σ_N² ~ IG(3, 6.182), has point estimates:

    μ̂_N = E[T(6, 6.595, 0.412²)] = 6.595

    σ̂_N² = E[IG(3, 6.182)] = 6.182/2 = 3.091

With 90% confidence intervals:

    μ_N:  [ F_T⁻¹(0.05) = 5.348 ,  F_T⁻¹(0.95) = 7.842 ]

    σ_N²: [ 1/F_G⁻¹(0.95) = 0.982 ,  1/F_G⁻¹(0.05) = 7.560 ]

Characteristics

μ_N Characteristics. μ_N determines the scale, not the location as in a normal distribution. The distribution is fixed at f(0) = 0, and an increase in the scale parameter stretches the distribution across the x-axis. This has the effect of increasing the mode, mean and median of the distribution.

σ_N Characteristics. σ_N determines the shape, not the scale as in a normal distribution. For values of σ_N > 1 the distribution rises very sharply at the beginning and decreases with a shape similar to an exponential, or a Weibull with 0 < β < 1. As σ_N → 0 the mode, mean and median converge to e^{μ_N}. The distribution becomes narrower and approaches a Dirac delta function at t = e^{μ_N}.

Hazard Rate. (Kleiber & Kotz 2003, p.115) The hazard rate is unimodal, with h(0) = 0, all derivatives of h(t) equal to zero at t = 0, and a slow decrease to zero as t → ∞. The mode of the hazard rate is:

    t_m = exp(μ_N + z_m σ_N)

where z_m is given by:

    φ(z_m)/[1 − Φ(z_m)] = z_m + σ_N

therefore −σ_N < z_m < −σ_N + 1/σ_N, and therefore:

    e^{μ_N − σ_N²} < t_m < e^{μ_N − σ_N² + 1}

As σ_N → ∞, t_m → e^{μ_N − σ_N²}, and so for large σ_N:

    max h(t) ≈ (1/(σ_N √(2π))) exp{ −(μ_N − σ_N²/2) }

As σ_N → 0, t_m → e^{μ_N − σ_N² + 1}, and so for small σ_N:

    max h(t) ≈ 1 / ( σ_N e^{μ_N − σ_N² + 1} )

Mean / Median / Mode:

    mode(X) < median(X) < E[X]

Scale/Product Property. Let X_j ~ LogN(μ_Nj, σ_Nj²). If the X_j are independent:

    ∏ a_j X_j ~ LogN( Σ[μ_Nj + ln(a_j)] , Σ σ_Nj² )

Lognormal versus Weibull. In fitting life data to these distributions it is often the case that both may be a good fit, especially in the middle of the distribution. The Weibull distribution has an earlier lower tail and produces a more pessimistic estimate of the component life. (Nelson 1990, p.65)

Applications General Life Distributions. The lognormal distribution has been found to accurately model many life distributions and is a popular choice for life distributions. The increasing hazard rate in early life models the weaker subpopulation (burn in) and the remaining decreasing hazard rate describes the main population. In particular this has been applied to some electronic devices and fatigue-fracture data. (Meeker & Escobar 1998, p.262)

Failure Modes from Multiplicative Errors. The lognormal distribution is very suitable for failure processes that are a result of multiplicative errors. Specific applications include failure of components due to fatigue cracks. (Provan 1987)

Repair Times. The lognormal distribution has commonly been used to model repair times. It is natural for a repair time probability to increase quickly to a mode value. For example very few repairs have an immediate or quick fix. However, once the time of repair passes the mean it is likely that there are serious problems, and the repair will take a substantial amount of time.

Parameter Variability. The lognormal distribution can be used to model parameter variability. This was done when estimating the uncertainty in the parameter λ in a Nuclear Reactor Safety Study (NUREG-75/014).

Theory of Breakage. The distribution models particle sizes observed in breakage processes. (Crow & Shimizu 1988)

Resources

Online:

    http://www.weibull.com/LifeDataWeb/the_lognormal_distribution.htm
    http://mathworld.wolfram.com/LogNormalDistribution.html
    http://en.wikipedia.org/wiki/Log-normal_distribution
    http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Crow, E.L. & Shimizu, K., 1988. Lognormal distributions, CRC Press.

Aitchison, J.J. & Brown, J., 1957. The Lognormal Distribution, New York: Cambridge University Press.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Relationship to Other Distributions

Normal Distribution Norm(t; μ, σ²) — Let:

    X ~ LogN(μ_N, σ_N²)   and   Y = ln(X)

Then:

    Y ~ Norm(μ_N, σ_N²)

where:

    μ_N = ln[ μ² / √(σ² + μ²) ] ,    σ_N² = ln[ (σ² + μ²) / μ² ]
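A quick sketch (not from the source) of the moment conversion above: mapping the lognormal's (μ, σ) to (μ_N, σ_N²) and recovering the mean:

```python
# Sketch: mu_N = ln(mu^2/sqrt(sigma^2 + mu^2)), sigma_N^2 = ln((sigma^2
# + mu^2)/mu^2); exp(mu_N + sigma_N^2/2) should give back mu.
import math

mu, sigma = 10.0, 3.0
mu_n = math.log(mu ** 2 / math.sqrt(sigma ** 2 + mu ** 2))
sigma_n2 = math.log((sigma ** 2 + mu ** 2) / mu ** 2)
mu_back = math.exp(mu_n + sigma_n2 / 2)
print(abs(mu_back - mu) < 1e-9)  # True
```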

2.3. Weibull Continuous Distribution

[Figures: probability density function f(t), cumulative density function F(t), and hazard rate h(t) of the Weibull distribution for parameter sets α = 1 with β = 0.5, 1, 2, 5, and β = 2 with α = 1, 1.5, 2, 3.]


Parameters & Description

α — Scale parameter, α > 0. The value of α equals the 63.2th percentile and has a unit equal to t. Note that α is not equal to the mean.

β — Shape parameter, β > 0. Also known as the slope (referring to a linear CDF plot), β determines the shape of the distribution.

Limits: t ≥ 0

Distribution Formulas

PDF:

    f(t) = (β t^{β−1} / α^β) e^{−(t/α)^β}

CDF:

    F(t) = 1 − e^{−(t/α)^β}

Reliability:

    R(t) = e^{−(t/α)^β}

Conditional Survivor Function P(T > x + t | T > t):

    m(x) = R(x|t) = R(t + x)/R(t) = exp[ (t/α)^β − ((t + x)/α)^β ]

where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t. (Kleiber & Kotz 2003, p.176)

Mean Residual Life:

    u(t) = e^{(t/α)^β} ∫_t^∞ e^{−(x/α)^β} dx

which has the asymptotic property:

    lim_{t→∞} u(t) = (α^β/β) t^{1−β}

Hazard Rate:

    h(t) = (β/α) (t/α)^{β−1}

Cumulative Hazard Rate:

    H(t) = (t/α)^β

Properties and Moments

    Median: α (ln 2)^{1/β}
    Mode: α [(β − 1)/β]^{1/β} if β ≥ 1; otherwise no mode exists

    Mean (1st raw moment): α Γ(1 + 1/β)
    Variance (2nd central moment): α² [ Γ(1 + 2/β) − Γ²(1 + 1/β) ]
    Skewness (3rd central moment): [ Γ₃ α³ − 3μσ² − μ³ ] / σ³
    Excess kurtosis (4th central moment): [ −6Γ₁⁴ + 12Γ₁²Γ₂ − 3Γ₂² − 4Γ₁Γ₃ + Γ₄ ] / (Γ₂ − Γ₁²)²
        where Γ_i = Γ(1 + i/β)
    Characteristic function: Σ_{n=0}^{∞} [(it)^n α^n / n!] Γ(1 + n/β)
    100p% percentile function: t_p = α [−ln(1 − p)]^{1/β}

Parameter Estimation

Plotting Method

Least squares (y = mx + c):

    X-axis: ln(t_i)    Y-axis: ln{ ln[ 1/(1 − F) ] }    α̂ = e^{−c/m},  β̂ = m

Maximum Likelihood Function

Likelihood:

    L(α, β|E) = ∏_{i=1}^{n_F} (β (t_i^F)^{β−1} / α^β) e^{−(t_i^F/α)^β} · ∏_{i=1}^{n_S} e^{−(t_i^S/α)^β} · ∏_{i=1}^{n_I} [ e^{−(t_i^LI/α)^β} − e^{−(t_i^UI/α)^β} ]
                (failures)                              (survivors)           (interval failures)

Log-Likelihood:

    Λ(α, β|E) = n_F ln(β) − n_F β ln(α) + Σ_{i=1}^{n_F} [ (β − 1) ln(t_i^F) − (t_i^F/α)^β ] − Σ_{i=1}^{n_S} (t_i^S/α)^β + Σ_{i=1}^{n_I} ln[ e^{−(t_i^LI/α)^β} − e^{−(t_i^UI/α)^β} ]

Solve ∂Λ/∂α = 0 to get α̂:

    ∂Λ/∂α = −(β n_F)/α + (β/α) Σ_{i=1}^{n_F} (t_i^F/α)^β + (β/α) Σ_{i=1}^{n_S} (t_i^S/α)^β + (β/α) Σ_{i=1}^{n_I} [ (t_i^LI/α)^β e^{−(t_i^LI/α)^β} − (t_i^UI/α)^β e^{−(t_i^UI/α)^β} ] / [ e^{−(t_i^LI/α)^β} − e^{−(t_i^UI/α)^β} ] = 0

Solve ∂Λ/∂β = 0 to get β̂:

    ∂Λ/∂β = n_F/β + Σ_{i=1}^{n_F} [ ln(t_i^F/α) − (t_i^F/α)^β ln(t_i^F/α) ] − Σ_{i=1}^{n_S} (t_i^S/α)^β ln(t_i^S/α) + Σ_{i=1}^{n_I} [ (t_i^UI/α)^β ln(t_i^UI/α) e^{−(t_i^UI/α)^β} − (t_i^LI/α)^β ln(t_i^LI/α) e^{−(t_i^LI/α)^β} ] / [ e^{−(t_i^LI/α)^β} − e^{−(t_i^UI/α)^β} ] = 0

MLE Point Estimates. When there is only complete failure and/or right-censored data, the point estimates can be solved using (Rinne 2008, p.439):

    α̂ = [ ( Σ_F (t_i^F)^β̂ + Σ_S (t_i^S)^β̂ ) / n_F ]^{1/β̂}

    1/β̂ = [ Σ_F (t_i^F)^β̂ ln(t_i^F) + Σ_S (t_i^S)^β̂ ln(t_i^S) ] / [ Σ_F (t_i^F)^β̂ + Σ_S (t_i^S)^β̂ ] − (1/n_F) Σ_F ln(t_i^F)

Note: Numerical methods are needed to solve for β̂, which is then substituted to find α̂. Numerical methods to find Weibull MLE estimates for complete and censored data, for the 2-parameter and 3-parameter Weibull distributions, are detailed in (Rinne 2008).
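The note above can be illustrated with a minimal numerical sketch (not the book's procedure): bisection on the profile equation for β̂ with complete data, then substitution for α̂, using five illustrative failure times:

```python
# Sketch: bisection on g(b) = sum(t^b ln t)/sum(t^b) - 1/b - mean(ln t),
# whose root is the MLE beta_hat for complete data; alpha_hat follows.
import math

t = [535, 613, 976, 1031, 1875]   # illustrative failure times (hours)
n = len(t)
mean_log = sum(math.log(x) for x in t) / n

def g(b):
    s = sum(x ** b for x in t)
    sl = sum(x ** b * math.log(x) for x in t)
    return sl / s - 1 / b - mean_log

lo, hi = 0.5, 10.0                # bracket with g(lo) < 0 < g(hi)
for _ in range(100):              # bisection
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
beta_hat = (lo + hi) / 2
alpha_hat = (sum(x ** beta_hat for x in t) / n) ** (1 / beta_hat)
print(round(beta_hat, 2), round(alpha_hat))
```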

Fisher Information Matrix (Rinne 2008, p.412). Using Γ′(2) = 1 − γ ≈ 0.422784 (γ is the Euler–Mascheroni constant):

    I(α, β) = [ β²/α²         −Γ′(2)/α
                −Γ′(2)/α      (1 + Γ″(2))/β² ]

            ≅ [ β²/α²         −0.422784/α
                −0.422784/α   1.823680/β² ]

100 % The asymptotic variance-covariance matrix of ( , ) is: (Rinne 2008, Confidence pp.412-417) Interval𝛾𝛾 𝛼𝛼� 𝛽𝛽̂ 1 1.1087 0.2570 , = , = 2 (complete −1 𝛼𝛼� 0.2570 2 0.6079 data) ̂ 𝑛𝑛 ̂ 𝛼𝛼� 𝐶𝐶𝐶𝐶𝐶𝐶�𝛼𝛼� 𝛽𝛽� �𝐽𝐽 �𝛼𝛼� 𝛽𝛽�� � 𝛽𝛽̂ � Weibull 𝑛𝑛𝐹𝐹 2 Bayesian 𝛼𝛼� 𝛽𝛽̂

Bayesian analysis is applied to either one of two re-parameterizations of the Weibull Distribution: (Rinne 2008, p.517)

( ; , ) = exp where = or 𝛽𝛽−1 𝛽𝛽 −𝛽𝛽 𝑓𝑓 𝑡𝑡 𝜆𝜆 𝛽𝛽 𝜆𝜆𝜆𝜆𝑡𝑡 �−𝜆𝜆𝑡𝑡 � 𝜆𝜆 1 𝛼𝛼 ( ; , ) = exp where = = 𝛽𝛽 𝛽𝛽 𝛽𝛽−1 𝑡𝑡 𝛽𝛽 𝑓𝑓 𝑡𝑡 𝜃𝜃 𝛽𝛽 𝑡𝑡 �− � 𝜃𝜃 𝛼𝛼 Non-informative𝜃𝜃 Priors 𝜃𝜃( ) (Rinne 2008,𝜆𝜆 p.517) Type Prior 𝝅𝝅𝟎𝟎 𝝀𝝀 Posterior Uniform Proper Prior with 1 Truncated Gamma Distribution known and limits [ , ] For a b . ( ; 1 + n , t , ) 𝛽𝛽 𝜆𝜆 ∈ 𝑎𝑎 𝑏𝑏 𝑏𝑏 − 𝑎𝑎 ≤ λ ≤ F T β Otherwise𝑐𝑐 𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺 ( )𝜆𝜆= 0 Jeffrey’s Prior when is known. 1 ( ; n , t ) (0,0) 𝜋𝜋 𝜆𝜆 , when [0, ) 𝛽𝛽 𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺 𝜆𝜆 F T β Jeffrey’s Prior for unknown ∝ 𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺𝐺1 No closed form 𝜆𝜆 𝜆𝜆 ∈ ∞ and . (Rinne 2008, p.527) 𝜃𝜃 𝛽𝛽 𝜃𝜃𝜃𝜃 where , = t + t = adjusted total time in test F β S β 𝑡𝑡𝑇𝑇 𝛽𝛽 ∑� iConjugate� ∑� i � Priors It was found by Soland that no joint continuous prior distribution exists for the Weibull distribution. Soland did however propose a procedure which used a continuous distribution for α and a discrete distribution for β which will not be included here. (Martz & Waller 1982) UOI Likelihood Evidence Dist of Prior Posterior Model UOI Para Parameters

where = + =𝜆𝜆 Weibull with failures = + , Gamma , 𝑜𝑜 𝐹𝐹 from− 𝛽𝛽 known at times 𝑘𝑘 𝑘𝑘 𝑛𝑛 𝐹𝐹 (Rinne0 2008,𝑇𝑇 𝛽𝛽 𝜆𝜆 ( 𝛼𝛼; , ) 𝑛𝑛 𝐹𝐹 0 0 Λ p.520)Λ 𝑡𝑡 𝑖𝑖 𝑘𝑘 Λ 𝛽𝛽 𝑡𝑡 𝑊𝑊𝑊𝑊𝑊𝑊 𝑡𝑡 𝛼𝛼 𝛽𝛽 64 Common Life Distributions

| $\theta$ | Weibull with $\beta$ known, $Wei(t;\alpha,\beta)$ | $n_F$ failures at times $t_i$ | Inverted Gamma | $\alpha_0$, $\beta_0$ | $\alpha = \alpha_0 + n_F$, $\beta = \beta_0 + t_T^{\beta}$ (Rinne 2008, p.524) |

Description, Limitations and Uses

Example: 5 components are put on a test with the following failure times:

535, 613, 976, 1031, 1875 hours

$\hat\beta$ is found by numerically solving:

$$\frac{\sum t_i^{\beta}\ln t_i}{\sum t_i^{\beta}} - \frac{1}{\beta} = \frac{1}{n_F}\sum \ln t_i = 6.8118$$

giving $\hat\beta = 2.275$. $\hat\alpha$ is found by solving:

$$\hat\alpha = \left[\frac{1}{n_F}\sum t_i^{\hat\beta}\right]^{1/\hat\beta} = 1140$$

The covariance matrix is:

$$Cov\left(\hat\alpha,\hat\beta\right) = \frac{1}{5}\begin{bmatrix} 1.1087\,\dfrac{\hat\alpha^2}{\hat\beta^2} & 0.2570\,\hat\alpha \\ 0.2570\,\hat\alpha & 0.6079\,\hat\beta^2 \end{bmatrix} = \begin{bmatrix} 55679 & 58.596 \\ 58.596 & 0.6293 \end{bmatrix}$$

90% confidence interval for $\hat\alpha$:

$$\left[\hat\alpha\exp\left(\frac{-\Phi^{-1}(0.95)\sqrt{55679}}{\hat\alpha}\right),\ \hat\alpha\exp\left(\frac{\Phi^{-1}(0.95)\sqrt{55679}}{\hat\alpha}\right)\right] = [811,\ 1602]$$

90% confidence interval for $\beta$:

$$\left[\hat\beta\exp\left(\frac{-\Phi^{-1}(0.95)\sqrt{0.6293}}{\hat\beta}\right),\ \hat\beta\exp\left(\frac{\Phi^{-1}(0.95)\sqrt{0.6293}}{\hat\beta}\right)\right] = [1.282,\ 4.037]$$

Note that with only 5 samples the assumption that the parameter distribution is approximately normal is probably inaccurate, and therefore these confidence intervals should be used with caution.
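The MLE and confidence-interval numbers in this example can be reproduced with a short script; a sketch using bisection on the profile equation and the covariance terms quoted above (tolerances in the comments are loose because the book rounds $\hat\beta$ to 2.275):

```python
import math
from statistics import NormalDist

t = [535, 613, 976, 1031, 1875]              # failure times (hours)
n = len(t)
mean_ln = sum(math.log(x) for x in t) / n    # = 6.8118

def g(beta):
    """MLE profile equation for beta (complete data)."""
    s = sum(x**beta for x in t)
    s_ln = sum(x**beta * math.log(x) for x in t)
    return s_ln / s - 1.0 / beta - mean_ln

lo, hi = 0.1, 10.0                           # bisection for beta_hat
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if g(lo) * g(mid) <= 0:
        hi = mid
    else:
        lo = mid
beta_hat = 0.5 * (lo + hi)                                       # ~2.275
alpha_hat = (sum(x**beta_hat for x in t) / n) ** (1 / beta_hat)  # ~1140

# asymptotic variances from Cov = (1/n) [1.1087 a^2/b^2, ...; ..., 0.6079 b^2]
var_a = 1.1087 * alpha_hat**2 / beta_hat**2 / n  # ~55679
var_b = 0.6079 * beta_hat**2 / n                 # ~0.6293
z = NormalDist().inv_cdf(0.95)

ci_a = (alpha_hat * math.exp(-z * math.sqrt(var_a) / alpha_hat),
        alpha_hat * math.exp(+z * math.sqrt(var_a) / alpha_hat))  # ~[811, 1602]
ci_b = (beta_hat * math.exp(-z * math.sqrt(var_b) / beta_hat),
        beta_hat * math.exp(+z * math.sqrt(var_b) / beta_hat))    # ~[1.28, 4.04]
print(beta_hat, alpha_hat, ci_a, ci_b)
```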

Characteristics The Weibull distribution is also known as a “Type III asymptotic distribution for minimum values”.

β Characteristics:
• β < 1: The hazard rate decreases with time.
• β = 1: The hazard rate is constant (exponential distribution).
• β > 1: The hazard rate increases with time.
• 1 < β < 2: The hazard rate increases at a decreasing rate as time increases.
• β = 2: The hazard rate increases linearly with time.
• β > 2: The hazard rate increases at an increasing rate as time increases.
• β < 3.447798: The distribution is positively skewed (tail to right).
• β ≈ 3.447798: The distribution is approximately symmetrical.
• β > 3.447798: The distribution is negatively skewed (tail to left).
• 3 < β < 4: The distribution approximates a normal distribution.
• β > 10: The distribution approximates a Smallest Extreme Value Distribution.

Note that for β = 0.999, f(0) = ∞, but for β = 1.001, f(0) = 0. This rapid change creates complications when maximizing likelihood functions. (Weibull.com) As β → ∞, the mode → α.

α Characteristics: Increasing α stretches the distribution over the time scale. With the f(0) point fixed, this also has the effect of increasing the mode, mean and median. The value of α is at the 63.2% percentile: F(α) = 0.632.

Scaling property (Leemis & McQueston 2008): If $X \sim Weibull(\alpha,\beta)$, then

$$kX \sim Weibull(k\alpha,\ \beta)$$

Minimum property (Rinne 2008, p.107):

$$\min\{X_1, X_2, \dots, X_n\} \sim Weibull\left(\alpha n^{-1/\beta},\ \beta\right) \quad\text{when } \beta \text{ is fixed}$$

Variate Generation property:

$$F^{-1}(u) = \alpha\left[-\ln(1-u)\right]^{1/\beta}, \quad 0 < u < 1$$

Lognormal versus Weibull: In fitting life data to these distributions it is often the case that both may be a good fit, especially in the middle of the distribution. The Weibull distribution has an earlier lower tail and produces a more pessimistic estimate of the component life. (Nelson 1990, p.65)
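The variate generation property is inverse-transform sampling; a minimal sketch (function names are illustrative):

```python
import math
import random

def weibull_quantile(u, alpha, beta):
    """F^-1(u) = alpha * (-ln(1 - u))**(1/beta), 0 < u < 1."""
    return alpha * (-math.log(1.0 - u)) ** (1.0 / beta)

def weibull_rvs(alpha, beta, n, seed=0):
    """Generate n Weibull variates by inverse-transform sampling."""
    rng = random.Random(seed)
    return [weibull_quantile(rng.random(), alpha, beta) for _ in range(n)]

# F(alpha) = 0.632..., so the quantile at u = 1 - e^-1 recovers alpha
print(weibull_quantile(1 - math.exp(-1), alpha=1140, beta=2.275))
```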

Applications: The Weibull distribution is by far the most popular life distribution used in reliability engineering. This is due to its variety of shapes and its generalization or approximation of many other distributions. Analysis assuming a Weibull distribution already includes the exponential life distribution as a special case.

There are many physical interpretations of the Weibull Distribution. Due to its minimum property, a physical interpretation is the weakest link: a system such as a chain will fail when the weakest link fails. It can also be shown that the Weibull Distribution can be derived from a cumulative wear model (Rinne 2008, p.15).

The following is a non-exhaustive list of applications where the Weibull distribution has been used:
• Acceptance sampling
• Warranty analysis
• Maintenance and renewal
• Strength of material modeling
• Wear modeling
• Electronic failure modeling
• Corrosion modeling

A detailed list with references to practical examples is contained in (Rinne 2008, p.275)

Resources Online:
http://www.weibull.com/LifeDataWeb/the_weibull_distribution.htm
http://mathworld.wolfram.com/WeibullDistribution.html
http://en.wikipedia.org/wiki/Weibull_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (interactive web calculator)
http://www.qualitydigest.com/jan99/html/weibull.html (how to conduct Weibull analysis in Excel, William W. Dorner)

Books: Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Murthy, D.N.P., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Relationship to Other Distributions

Three Parameter Weibull Distribution: The three parameter model adds a location parameter γ to the two parameter Weibull distribution, allowing a shift along the x-axis. This creates a period of guaranteed zero failures at the beginning of the product life and is therefore only used in special cases.

$Weibull(t;\alpha,\beta,\gamma)$ - Special Case:
$$Weibull(t;\alpha,\beta) = Weibull(t;\alpha,\beta,\gamma=0)$$

Exponential Distribution $Exp(t;\lambda)$: Let $X \sim Weibull(\alpha,\beta)$ and $Y = X^{\beta}$. Then $Y \sim Exp(\lambda = \alpha^{-\beta})$.
Special Case:
$$Exp(t;\lambda) = Weibull\left(t;\ \alpha=\tfrac{1}{\lambda},\ \beta=1\right)$$

Rayleigh Distribution $Rayleigh(t;\alpha)$ - Special Case:
$$Rayleigh(t;\alpha) = Weibull(t;\alpha,\beta=2)$$

χ Distribution $\chi(t|v)$ - Special Case:
$$\chi(t|v=2) = Weibull\left(t;\ \alpha=\sqrt{2},\ \beta=2\right)$$


3. Bathtub Life Distributions


3.1. 2-Fold Mixed Weibull Distribution

[Plots of the probability density function f(t), cumulative density function F(t) and hazard rate h(t). All shapes shown are variations from p = 0.5, α₁ = 2, β₁ = 0.5, α₂ = 10, β₂ = 20.]


Parameters & Description

αᵢ > 0 - Scale Parameter: the scale of each Weibull Distribution.
βᵢ > 0 - Shape Parameter: the shape of each Weibull Distribution.
p, 0 ≤ p ≤ 1 - Mixing Parameter: determines the weight each Weibull Distribution has on the overall density function.

Limits: t ≥ 0

Distribution Formulas

PDF:
$$f(t) = p\,f_1(t) + (1-p)\,f_2(t), \qquad f_i(t) = \frac{\beta_i t^{\beta_i-1}}{\alpha_i^{\beta_i}}\,e^{-\left(t/\alpha_i\right)^{\beta_i}}, \quad i \in \{1,2\}$$

CDF:
$$F(t) = p\,F_1(t) + (1-p)\,F_2(t), \qquad F_i(t) = 1 - e^{-\left(t/\alpha_i\right)^{\beta_i}}$$

Reliability:
$$R(t) = p\,R_1(t) + (1-p)\,R_2(t), \qquad R_i(t) = e^{-\left(t/\alpha_i\right)^{\beta_i}}$$

Hazard Rate:
$$h(t) = w_1(t)h_1(t) + w_2(t)h_2(t), \qquad w_i(t) = \frac{p_i R_i(t)}{\sum_{i=1}^{n} p_i R_i(t)}$$

Properties and Moments

Median: Solved numerically
Mode: Solved numerically
Mean - 1st Raw Moment:
$$p\,\alpha_1\Gamma\left(1+\frac{1}{\beta_1}\right) + (1-p)\,\alpha_2\Gamma\left(1+\frac{1}{\beta_2}\right)$$
Variance - 2nd Central Moment:
$$p\,Var[T_1] + (1-p)\,Var[T_2] + p\left(E[T_1]-E[T]\right)^2 + (1-p)\left(E[T_2]-E[T]\right)^2$$
$$= p\,\alpha_1^2\left[\Gamma\left(1+\frac{2}{\beta_1}\right) - \Gamma^2\left(1+\frac{1}{\beta_1}\right)\right] + (1-p)\,\alpha_2^2\left[\Gamma\left(1+\frac{2}{\beta_2}\right) - \Gamma^2\left(1+\frac{1}{\beta_2}\right)\right]$$
$$+\ p\left[\alpha_1\Gamma\left(1+\frac{1}{\beta_1}\right) - E[T]\right]^2 + (1-p)\left[\alpha_2\Gamma\left(1+\frac{1}{\beta_2}\right) - E[T]\right]^2$$

100p% Percentile Function: Solved numerically

Parameter Estimation

Plotting Method (Jiang & Murthy 1995)
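The two-fold mixture formulas above transcribe directly into code; a minimal sketch with hypothetical parameter values:

```python
import math

def wbl_pdf(t, a, b):
    return (b * t ** (b - 1) / a ** b) * math.exp(-((t / a) ** b))

def wbl_cdf(t, a, b):
    return 1.0 - math.exp(-((t / a) ** b))

def mix2_pdf(t, p, a1, b1, a2, b2):
    """f(t) = p f1(t) + (1 - p) f2(t)"""
    return p * wbl_pdf(t, a1, b1) + (1 - p) * wbl_pdf(t, a2, b2)

def mix2_cdf(t, p, a1, b1, a2, b2):
    """F(t) = p F1(t) + (1 - p) F2(t)"""
    return p * wbl_cdf(t, a1, b1) + (1 - p) * wbl_cdf(t, a2, b2)

# p = 1 collapses the mixture to the first Weibull: F(alpha1) = 1 - e^-1
print(mix2_cdf(2.0, 1.0, 2.0, 0.5, 10.0, 5.0))  # ~0.632
```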

Plot points on a Weibull Probability Plot - X-Axis: $x = \ln(t)$; Y-Axis: $y = \ln\left(\ln\left(\frac{1}{1-F}\right)\right)$.

Using the Weibull Probability Plot the parameters can be estimated. Jiang & Murthy (1995) provide comprehensive coverage of this procedure and detail errors in previous methods.

[A typical WPP for a 2-Fold Weibull Mixture Model: α₁ = 5, β₁ = 0.5, α₂ = 10, β₂ = 5]

Sub Populations: The dotted lines in the WPP represent the subpopulations:
$$L_1:\ y = \beta_1\left[x - \ln(\alpha_1)\right]$$
$$L_2:\ y = \beta_2\left[x - \ln(\alpha_2)\right]$$

Asymptotes (Jiang & Murthy 1995): As x → −∞ (t → 0) there exists an asymptote approximated by:
$$y \approx \beta_1\left[x - \ln(\alpha_1)\right] + \ln(c)$$
where
$$c = \begin{cases} p & \text{when } \beta_1 \neq \beta_2 \\ p + (1-p)\left(\alpha_1/\alpha_2\right)^{\beta_1} & \text{when } \beta_1 = \beta_2 \end{cases}$$
As x → ∞ (t → ∞) the asymptote straight line can be approximated by:
$$y \approx \beta_1\left[x - \ln(\alpha_1)\right]$$
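The left asymptote can be checked numerically; a sketch with hypothetical parameters (β₁ < β₂ so that subpopulation 1 dominates as t → 0, and c = p since β₁ ≠ β₂):

```python
import math

def mix2_cdf(t, p, a1, b1, a2, b2):
    F1 = 1 - math.exp(-((t / a1) ** b1))
    F2 = 1 - math.exp(-((t / a2) ** b2))
    return p * F1 + (1 - p) * F2

p, a1, b1, a2, b2 = 0.4, 5.0, 0.5, 10.0, 5.0   # hypothetical mixture
t = 1e-8                       # deep in the left tail
x = math.log(t)                # WPP abscissa
y = math.log(-math.log(1.0 - mix2_cdf(t, p, a1, b1, a2, b2)))
# y ~ b1 [x - ln a1] + ln c with c = p (beta1 != beta2)
asymptote = b1 * (x - math.log(a1)) + math.log(p)
print(y - asymptote)           # close to 0
```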

Parameter Estimation: Jiang and Murthy divide the parameter estimation procedure into three cases:

Well Mixed Case (β₁ ≠ β₂ and α₁ ≈ α₂):
• Estimate the parameters α₁ and β₁ of L₁ from the right asymptote.
• Estimate the parameter p from the separation distance between the left and right asymptotes.
• Find the point where the curve crosses L₁ (point I). The slope at point I is $\bar\beta = p\beta_1 + (1-p)\beta_2$.
• Determine the slope at point I and use it to estimate β₂.
• Draw a line through the intersection point I with slope β₂ and use the intersection point to estimate α₂.

Well Separated Case (β₁ ≠ β₂ and α₁ ≫ α₂ or α₁ ≪ α₂):
• Determine visually if the data is scattered along the bottom (or top) to determine if α₁ ≪ α₂ (or α₁ ≫ α₂).
• If α₁ ≪ α₂ (α₁ ≫ α₂) locate the inflection point to the left (right) of point I. This point has $y_a \cong \ln[-\ln(1-p)]$ (or $y_a \cong \ln[-\ln(p)]$). Using this formula estimate p.
• Estimate α₁ and α₂:
  - If α₁ ≪ α₂ calculate the points $y_1 = \ln\left[-\ln\left((1-p) + p\,e^{-1}\right)\right]$ and $y_2 = \ln\left[-\ln\left((1-p)\,e^{-1}\right)\right]$. Find the x coordinates where y₁ and y₂ intersect the WPP curve. At these points estimate $\alpha_1 = e^{x_1}$ and $\alpha_2 = e^{x_2}$.
  - If α₁ ≫ α₂ calculate the points $y_1 = \ln\left[-\ln\left(p + (1-p)\,e^{-1}\right)\right]$ and $y_2 = \ln\left[-\ln\left(p\,e^{-1}\right)\right]$. Find the x coordinates where y₁ and y₂ intersect the WPP curve. At these points estimate $\alpha_2 = e^{x_1}$ and $\alpha_1 = e^{x_2}$.
• Estimate the shape parameters:
  - If α₁ ≪ α₂ draw an approximate L₂ ensuring it intersects at α₂, and estimate β₂ from the slope of L₂.
  - If α₁ ≫ α₂ draw an approximate L₁ ensuring it intersects at α₁, and estimate β₁ from the slope of L₁.
• Find the point where the curve crosses L₁ (point I). The slope at point I is $\bar\beta = p\beta_1 + (1-p)\beta_2$; determine the slope at point I and use it to estimate the remaining shape parameter.

Common Shape Parameter Case (β₂ = β₁ = β):

If $(\alpha_2/\alpha_1)^{-\beta} \approx 1$ then:
• Estimate the parameters α₁ and β₁ of L₁ from the right asymptote.
• Estimate the parameter p from the separation distance between the left and right asymptotes.
• Draw a vertical line through $x = \ln(\alpha_1)$. The intersection with the WPP can yield an estimate of α₂ using:
$$y_1 = \ln\left[-\ln\left(p\,e^{-1} + (1-p)\exp\left(-\left(\alpha_1/\alpha_2\right)^{\beta_1}\right)\right)\right]$$

If $(\alpha_2/\alpha_1)^{-\beta} \ll 1$ then:
• Find the inflection point and estimate its y coordinate $y_r$. Estimate p using $y_r \cong \ln[-\ln(1-p)]$.
• Calculate the points $y_1 = \ln\left[-\ln\left((1-p) + p\,e^{-1}\right)\right]$ and $y_2 = \ln\left[-\ln\left((1-p)\,e^{-1}\right)\right]$. Find the x coordinates where y₁ and y₂ intersect the WPP curve; at these points estimate $\alpha_1 = e^{x_1}$ and $\alpha_2 = e^{x_2}$.
• Using the left or right asymptote estimate β₁ = β₂ from the slope.

Maximum Likelihood and Bayesian: MLE and Bayesian techniques can be used with numerical methods; estimates obtained from the graphical methods are useful as initial guesses. A literature review of MLE and Bayesian methods is covered in (Murthy et al. 2003).

Description, Limitations and Uses

Characteristics: Hazard Rate Shape. The hazard rate can be approximated at its limits by (Jiang & Murthy 1995):

$$\text{Small } t:\ h(t) \approx c\,h_1(t) \qquad \text{Large } t:\ h(t) \approx h_1(t)$$

This result proves that the hazard rate (increasing or decreasing) of h₁ will dominate the limits of the mixed Weibull distribution. Therefore the hazard rate cannot be a bathtub curve shape. Instead the possible shapes of the hazard rate are:
• Decreasing
• Unimodal
• Decreasing followed by unimodal (rollercoaster)
• Bi-modal

The reason this distribution has been included as a bathtub distribution is that on many occasions the hazard rate of a complex product may instead follow the "rollercoaster" shape, i.e. decreasing followed by unimodal.

The shape of the hazard rate is determined only by the two shape parameters β₁ and β₂. A complete study of the characterization of the 2-Fold Mixed Weibull Distribution is contained in Jiang and Murthy (1998).

p Values: The mixture ratio pᵢ for each Weibull Distribution may be used to estimate the percentage of each subpopulation. However this is not a reliable measure and is known to be misleading (Berger & Sellke 1987).

N-Fold Distribution (Murthy et al. 2003): A generalization to the 2-fold mixed Weibull distribution is the n-fold case. This distribution is defined as:

$$f(t) = \sum_{i=1}^{n} p_i f_i(t)$$

where

$$f_i(t) = \frac{\beta_i t^{\beta_i-1}}{\alpha_i^{\beta_i}}\,e^{-\left(t/\alpha_i\right)^{\beta_i}} \quad\text{and}\quad \sum_{i=1}^{n} p_i = 1$$

and the hazard rate is given as:

$$h(t) = \sum_{i=1}^{n} w_i(t)h_i(t), \qquad w_i(t) = \frac{p_i R_i(t)}{\sum_{i=1}^{n} p_i R_i(t)}$$

It has been found that in many instances a higher number of folds will not significantly increase the accuracy of the model but does impose a significant overhead in the number of parameters to estimate. The 3-Fold Weibull Mixture Distribution has been studied by Jiang and Murthy (1996).
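The n-fold hazard rate is a reliability-weighted combination of the component hazards; a minimal sketch:

```python
import math

def nfold_hazard(t, ps, alphas, betas):
    """h(t) = sum_i w_i(t) h_i(t), with w_i = p_i R_i / sum_j p_j R_j."""
    R = [math.exp(-((t / a) ** b)) for a, b in zip(alphas, betas)]
    h = [(b / a) * (t / a) ** (b - 1) for a, b in zip(alphas, betas)]
    denom = sum(p * r for p, r in zip(ps, R))
    return sum(p * r * hi for p, r, hi in zip(ps, R, h)) / denom

# one component (n = 1) reduces to the plain Weibull hazard (b/a)(t/a)^(b-1)
print(nfold_hazard(2.0, [1.0], [2.0], [3.0]))  # (3/2) * (2/2)^2 = 1.5
```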

2-Fold Weibull 3-Parameter Distribution A common variation to the model presented here is to have the second Weibull distribution modeled with three parameters.

Resources Books / Journals: Jiang, R. & Murthy, D., 1995. Modeling Failure-Data by Mixture of 2 Weibull Distributions: A Graphical Approach. IEEE Transactions on Reliability, 44, 477-488.

Murthy, D., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Jiang, R. & Murthy, D., 1996. A mixture model involving three Weibull distributions. In Proceedings of the Second Australia-Japan Workshop on Models in Engineering, Technology and Management. Gold Coast, Australia, pp. 260-270.


Jiang, R. & Murthy, D., 1998. Mixture of Weibull distributions - parametric characterization of failure rate function. Applied Stochastic Models and Data Analysis, (14), 47-65.

Balakrishnan, N. & Rao, C.R., 2001. Handbook of Statistics 20: Advances in Reliability 1st ed., Elsevier Science & Technology.

Relationship to Other Distributions

Weibull Distribution $Weibull(t;\alpha,\beta)$ - Special Cases:
$$Weibull(t;\alpha,\beta) = 2FoldMixedWeibull(t;\ \alpha_1=\alpha,\ \beta_1=\beta,\ p=1)$$
$$Weibull(t;\alpha,\beta) = 2FoldMixedWeibull(t;\ \alpha_2=\alpha,\ \beta_2=\beta,\ p=0)$$


3.2. Exponentiated Weibull Distribution

[Plots of the probability density function f(t), cumulative density function F(t) and hazard rate h(t) for α = 5 with v varying (0.2, 0.5, 0.8, 3) at β = 1.5, and with β varying (1, 1.5, 2, 3) at v = 0.5.]


Parameters & Description

α > 0 - Scale Parameter.
β > 0 - Shape Parameter.
v > 0 - Shape Parameter.

Limits: t ≥ 0

Distribution Formulas

PDF:
$$f(t) = \frac{\beta v\,t^{\beta-1}}{\alpha^{\beta}}\left[1-\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v-1}\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right) = v\,[F_W(t)]^{v-1}f_W(t)$$

where $F_W(t)$ and $f_W(t)$ are the cdf and pdf of the two parameter Weibull distribution respectively.

CDF:
$$F(t) = \left[1-\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v} = [F_W(t)]^{v}$$

Reliability:
$$R(t) = 1 - \left[1-\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v} = 1 - [F_W(t)]^{v}$$

Conditional Survivor Function $P(T > x+t\,|\,T > t)$:
$$m(x) = \frac{R(t+x)}{R(t)} = \frac{1-\left[1-\exp\left(-\left(\frac{t+x}{\alpha}\right)^{\beta}\right)\right]^{v}}{1-\left[1-\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v}}$$

where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life:
$$u(t) = \frac{\int_t^{\infty}\left\{1-\left[1-\exp\left(-\left(\frac{x}{\alpha}\right)^{\beta}\right)\right]^{v}\right\}dx}{1-\left[1-\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v}}$$

Hazard Rate:
$$h(t) = \frac{\beta v\left(\frac{t}{\alpha}\right)^{\beta-1}\left[1-\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v-1}\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)}{\alpha\left\{1-\left[1-\exp\left(-\left(\frac{t}{\alpha}\right)^{\beta}\right)\right]^{v}\right\}}$$

For small t (Murthy et al. 2003, p.130):
$$h(t) \approx \left(\frac{\beta v}{\alpha}\right)\left(\frac{t}{\alpha}\right)^{\beta v-1}$$
For large t (Murthy et al. 2003, p.130):
$$h(t) \approx \left(\frac{\beta}{\alpha}\right)\left(\frac{t}{\alpha}\right)^{\beta-1}$$

Properties and Moments

Median:
$$\alpha\left[-\ln\left(1-\left(\tfrac{1}{2}\right)^{1/v}\right)\right]^{1/\beta}$$
Mode: For βv > 1 the mode can be approximated in closed form (Murthy et al. 2003, p.130).
Mean - 1st Raw Moment and Variance - 2nd Central Moment: Solved numerically, see Murthy et al. 2003, p.128.
100p% Percentile Function:
$$t_p = \alpha\left[-\ln\left(1-p^{1/v}\right)\right]^{1/\beta}$$

Parameter Estimation

Plotting Method (Jiang & Murthy 1999): Plot points on a Weibull Probability Plot - X-Axis: $x = \ln(t)$; Y-Axis: $y = \ln\left(\ln\left(\frac{1}{1-F}\right)\right)$. Using the Weibull Probability Plot the parameters can be estimated; Jiang & Murthy (1999) provide comprehensive coverage of this. A typical WPP for an exponentiated Weibull distribution is shown below.

[WPP for exponentiated Weibull distribution: α = 5, β = 2, v = 0.4]
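The CDF and the 100p% percentile function of the exponentiated Weibull distribution invert each other, which is easy to check numerically; a minimal sketch using the plotted parameters α = 5, β = 2, v = 0.4:

```python
import math

def exp_weibull_cdf(t, alpha, beta, v):
    """F(t) = [1 - exp(-(t/alpha)^beta)]^v"""
    return (1.0 - math.exp(-((t / alpha) ** beta))) ** v

def exp_weibull_quantile(p, alpha, beta, v):
    """t_p = alpha * [-ln(1 - p^(1/v))]^(1/beta)"""
    return alpha * (-math.log(1.0 - p ** (1.0 / v))) ** (1.0 / beta)

# round trip: F(t_p) = p
t_p = exp_weibull_quantile(0.25, alpha=5.0, beta=2.0, v=0.4)
print(exp_weibull_cdf(t_p, 5.0, 2.0, 0.4))  # ~0.25
```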

Asymptotes (Jiang & Murthy 1999): As x → −∞ (t → 0) there exists an asymptote approximated by:
$$y \approx \beta v\left[x - \ln(\alpha)\right]$$
As x → ∞ (t → ∞) the asymptote straight line can be approximated by:
$$y \approx \beta\left[x - \ln(\alpha)\right]$$
Both asymptotes intersect the x-axis at ln(α); however they have different slopes unless v = 1, in which case the WPP is the same as for a two parameter Weibull distribution.

Parameter Estimation: Plot estimates of the asymptotes ensuring they cross the x-axis at the same point. Use the right asymptote to estimate α and β. Use the left asymptote to estimate v.

Maximum Likelihood and Bayesian: MLE and Bayesian techniques can be used in the standard way; estimates obtained from the graphical methods are useful as initial guesses when using numerical methods to solve the equations. A literature review of MLE and Bayesian methods is covered in (Murthy et al. 2003).

Description, Limitations and Uses

Characteristics: PDF Shape (Murthy et al. 2003, p.129):
• βv < 1: The pdf is monotonically decreasing, f(0) = ∞.
• βv = 1: The pdf is monotonically decreasing, f(0) = 1/α.
• βv > 1: The pdf is unimodal, f(0) = 0.
The pdf shape is determined by βv in a similar way to β for a two parameter Weibull distribution.

Hazard Rate Shape (Murthy et al. 2003, p.129):
• β ≤ 1 and βv ≤ 1: The hazard rate is monotonically decreasing.
• β ≥ 1 and βv ≥ 1: The hazard rate is monotonically increasing.
• β < 1 and βv > 1: The hazard rate is unimodal.
• β > 1 and βv < 1: The hazard rate is a bathtub curve.

Weibull Distribution. The Weibull distribution is a special case of the exponentiated Weibull distribution when v = 1. When v is an integer greater than 1, the cdf represents a multiplicative Weibull model.

Standard Exponentiated Weibull (Xie et al. 2004). When α = 1 the distribution is the standard exponentiated Weibull distribution with cdf:
$$F(t) = \left[1-\exp\left(-t^{\beta}\right)\right]^{v}$$

Minimum Failure Rate (Xie et al. 2004). When the hazard rate is a bathtub curve (β > 1 and βv < 1) the minimum failure rate point is:
$$t' = \alpha\left[-\ln(1-y_1)\right]^{1/\beta}$$
where y₁ is the solution of a transcendental equation in y, β and v given in Xie et al. (2004).

Maximum Mean Residual Life (Xie et al. 2004). By setting the derivative of the MRL function to zero, the maximum MRL is found by solving for t:
$$t^{*} = \alpha\left[-\ln(1-y_2)\right]^{1/\beta}$$
where y₂ is the solution of a second transcendental equation given in Xie et al. (2004).

Resources Books / Journals: Mudholkar, G. & Srivastava, D., 1993. Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Transactions on Reliability, 42(2), 299-302.

Jiang, R. & Murthy, D., 1999. The exponentiated Weibull family: a graphical approach. IEEE Transactions on Reliability, 48(1), 68-72.

Xie, M., Goh, T.N. & Tang, Y., 2004. On changing points of mean residual life and failure rate function for some generalized Weibull distributions. Reliability Engineering and System Safety, 84(3), 293–299.

Murthy, D., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Balakrishnan, N. & Rao, C.R., 2001. Handbook of Statistics 20: Advances in Reliability 1st ed., Elsevier Science & Technology.

Relationship to Other Distributions

Weibull Distribution $Weibull(t;\alpha,\beta)$ - Special Case:
$$Weibull(t;\alpha,\beta) = ExponentiatedWeibull(t;\ \alpha,\ \beta,\ v=1)$$


3.3. Modified Weibull Distribution

[Plots of the probability density function f(t), cumulative density function F(t) and hazard rate h(t) for parameter sets b ∈ {0.5, 0.8, 1, 2}, a ∈ {5, 10, 20}, λ ∈ {1, 5, 10}. Note: the hazard rate plots are on a different scale to the PDF and CDF.]


Parameters & Description

a > 0 - Scale Parameter.
b ≥ 0 - Shape Parameter: the shape of the distribution is completely determined by b. When 0 < b < 1 the distribution has a bathtub shaped hazard rate.
λ ≥ 0 - Scale Parameter.

Limits: t ≥ 0

Distribution Formulas

PDF:
$$f(t) = a(b+\lambda t)\,t^{b-1}\exp(\lambda t)\exp\left(-a\,t^{b}\exp(\lambda t)\right)$$

CDF:
$$F(t) = 1 - \exp\left[-a\,t^{b}\exp(\lambda t)\right]$$

Reliability:
$$R(t) = \exp\left[-a\,t^{b}\exp(\lambda t)\right]$$

Mean Residual Life:
$$u(t) = \exp\left(a\,t^{b}e^{\lambda t}\right)\int_t^{\infty}\exp\left(-a\,x^{b}e^{\lambda x}\right)dx$$

Hazard Rate:
$$h(t) = a(b+\lambda t)\,t^{b-1}e^{\lambda t}$$

Properties and Moments

Median: Solved numerically (see 100p%)
Mode: Solved numerically
Mean - 1st Raw Moment: Solved numerically
Variance - 2nd Central Moment: Solved numerically
100p% Percentile Function - solve for $t_p$ numerically:
$$t_p^{b}\exp\left(\lambda t_p\right) = -\frac{\ln(1-p)}{a}$$

Parameter Estimation

Plotting Method (Lai et al. 2003): Plot points on a Weibull Probability Plot - X-Axis: $x = \ln(t_i)$; Y-Axis: $y = \ln\left(\ln\left(\frac{1}{1-F}\right)\right)$. Using the Weibull Probability Plot the parameters can be estimated (Lai et al. 2003).
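The CDF and hazard rate above are mutually consistent (h = f/R); a numerical check with hypothetical parameter values:

```python
import math

def mw_cdf(t, a, b, lam):
    """F(t) = 1 - exp(-a t^b e^(lam t))"""
    return 1.0 - math.exp(-a * t**b * math.exp(lam * t))

def mw_hazard(t, a, b, lam):
    """h(t) = a (b + lam t) t^(b-1) e^(lam t)"""
    return a * (b + lam * t) * t ** (b - 1) * math.exp(lam * t)

a, b, lam, t = 0.5, 0.8, 0.1, 1.3   # hypothetical parameters
eps = 1e-6
pdf = (mw_cdf(t + eps, a, b, lam) - mw_cdf(t - eps, a, b, lam)) / (2 * eps)
print(pdf / (1.0 - mw_cdf(t, a, b, lam)), mw_hazard(t, a, b, lam))  # nearly equal
```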

Asymptotes (Lai et al. 2003): As x → −∞ (t → 0) the asymptote straight line can be approximated as:
$$y \approx bx + \ln(a)$$
As x → ∞ (t → ∞) the asymptote can be approximated as (not used for parameter estimation but rather for model validity):
$$y \approx \lambda\exp(x) = \lambda t$$

Intersections (Lai et al. 2003):
Y-Axis Intersection $(0, y_0)$: $y_0 = \ln(a) + \lambda$
X-Axis Intersection $(x_0, 0)$: $\ln(a) + b x_0 + \lambda e^{x_0} = 0$

Solving these gives an approximate value for each parameter which can be used as an initial guess for numerical methods solving MLE or Bayesian methods.

A typical WPP for a Modified Weibull Distribution is:

[WPP for modified Weibull distribution: a = 5, b = 0.2, λ = 10]

Description, Limitations and Uses

Characteristics: Parameter Characteristics (Lai et al. 2003):
• 0 < b < 1 and λ > 0: The hazard rate has a bathtub curve shape; h(t) → ∞ as t → 0 and h(t) → ∞ as t → ∞.
• b ≥ 1 and λ > 0: The hazard rate function is increasing; h(0) = 0 and h(t) → ∞ as t → ∞.
• λ = 0: The function has the same form as a Weibull Distribution.

Minimum Failure Rate (Xie et al. 2004). When the hazard rate is a bathtub curve (0 < b < 1 and λ > 0) the minimum failure rate point is:
$$t^{*} = \frac{\sqrt{b}-b}{\lambda}$$

Maximum Mean Residual Life (Xie et al. 2004). By setting the derivative of the MRL function to zero, the maximum MRL is found by solving for t:
$$a(b+\lambda t)\,t^{b-1}e^{\lambda t}\int_t^{\infty}\exp\left(-a\,x^{b}e^{\lambda x}\right)dx - \exp\left(-a\,t^{b}e^{\lambda t}\right) = 0$$

Shape. The shape of the hazard rate cannot have a flat "usage period" and a strong "wear out" gradient.

Resources Books / Journals:
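The minimum-failure-rate point t* = (√b − b)/λ can be verified numerically; a sketch with hypothetical bathtub parameters:

```python
import math

def mw_hazard(t, a, b, lam):
    """h(t) = a (b + lam t) t^(b-1) e^(lam t)"""
    return a * (b + lam * t) * t ** (b - 1) * math.exp(lam * t)

a, b, lam = 2.0, 0.25, 0.5          # 0 < b < 1, lam > 0: bathtub hazard
t_star = (math.sqrt(b) - b) / lam   # = (0.5 - 0.25) / 0.5 = 0.5

# the hazard is higher on either side of t*, so t* is the minimum
for dt in (0.05, 0.1):
    assert mw_hazard(t_star - dt, a, b, lam) > mw_hazard(t_star, a, b, lam)
    assert mw_hazard(t_star + dt, a, b, lam) > mw_hazard(t_star, a, b, lam)
print(t_star)  # 0.5
```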

Lai, C., Xie, M. & Murthy, D., 2003. A modified Weibull distribution. IEEE Transactions on Reliability, 52(1), 33-37.

Murthy, D.N.P., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Xie, M., Goh, T.N. & Tang, Y., 2004. On changing points of mean residual life and failure rate function for some generalized Weibull distributions. Reliability Engineering and System Safety, 84(3), 293–299.

Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Balakrishnan, N. & Rao, C.R., 2001. Handbook of Statistics 20: Advances in Reliability 1st ed., Elsevier Science & Technology.

Relationship to Other Distributions

Weibull Distribution $Weibull(t;\alpha,\beta)$ - Special Case:
$$Weibull(t;\alpha,\beta) = ModifiedWeibull\left(t;\ a=\alpha^{-\beta},\ b=\beta,\ \lambda=0\right)$$


4. Univariate Continuous Distributions



4.1. Beta Continuous Distribution

[Plots of the probability density function f(t), cumulative density function F(t) and hazard rate h(t) for parameter pairs (α, β) drawn from {0.5, 1, 2, 5}.]


Parameters & Description

α > 0 - Shape Parameter.
β > 0 - Shape Parameter.
a_L, −∞ < a_L < b_U - Lower Bound: a_L is the lower bound but has also been called a location parameter. In the standard Beta distribution a_L = 0.
b_U, a_L < b_U < ∞ - Upper Bound: b_U is the upper bound. In the standard Beta distribution b_U = 1. The scale parameter may also be defined as b_U − a_L.

Limits: a_L ≤ t ≤ b_U

Distribution Formulas

B(x, y) is the Beta function, B_t(t|x, y) is the incomplete Beta function, I_t(t|x, y) is the regularized Beta function, and Γ(k) is the complete gamma function, which is discussed in section 1.6.

General Form:
$$f(t;\alpha,\beta,a_L,b_U) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\cdot\frac{(t-a_L)^{\alpha-1}(b_U-t)^{\beta-1}}{(b_U-a_L)^{\alpha+\beta-1}}$$

When a_L = 0, b_U = 1:

PDF:
$$f(t|\alpha,\beta) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,t^{\alpha-1}(1-t)^{\beta-1} = \frac{t^{\alpha-1}(1-t)^{\beta-1}}{B(\alpha,\beta)}$$

CDF:
$$F(t) = \int_0^t \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,u^{\alpha-1}(1-u)^{\beta-1}\,du = \frac{B_t(t|\alpha,\beta)}{B(\alpha,\beta)} = I_t(t|\alpha,\beta)$$

Reliability:
$$R(t) = 1 - I_t(t|\alpha,\beta)$$

Conditional Survivor Function $P(T > x+t\,|\,T > t)$:
$$m(x) = \frac{R(t+x)}{R(t)} = \frac{1-I_{t+x}(t+x|\alpha,\beta)}{1-I_t(t|\alpha,\beta)}$$

where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life (Gupta and Nadarajah 2004, p.44):
$$u(t) = \frac{\int_t^{\infty}\left\{B(\alpha,\beta) - B_x(x|\alpha,\beta)\right\}dx}{B(\alpha,\beta) - B_t(t|\alpha,\beta)}$$

Hazard Rate (Gupta and Nadarajah 2004, p.44):
$$h(t) = \frac{t^{\alpha-1}(1-t)^{\beta-1}}{B(\alpha,\beta) - B_t(t|\alpha,\beta)}$$

Properties and Moments

Median: Numerically solve for t: $t_{0.5} = F^{-1}(\alpha,\beta)$
Mode: $\frac{\alpha-1}{\alpha+\beta-2}$ for α > 1 and β > 1
Mean - 1st Raw Moment: $\frac{\alpha}{\alpha+\beta}$
Variance - 2nd Central Moment: $\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$
Skewness - 3rd Central Moment:
$$\frac{2(\beta-\alpha)\sqrt{\alpha+\beta+1}}{(\alpha+\beta+2)\sqrt{\alpha\beta}}$$
Excess kurtosis - 4th Central Moment:
$$\frac{6\left[\alpha^3 + \alpha^2(1-2\beta) + \beta^2(1+\beta) - 2\alpha\beta(2+\beta)\right]}{\alpha\beta(\alpha+\beta+2)(\alpha+\beta+3)}$$

Characteristic Function: $_1F_1(\alpha;\alpha+\beta;it)$, where $_1F_1$ is the confluent hypergeometric function (Gupta and Nadarajah 2004, p.44):
$$_1F_1(\alpha;\beta;x) = \sum_{k=0}^{\infty}\frac{(\alpha)_k}{(\beta)_k}\cdot\frac{x^k}{k!}$$

100p% Percentile Function: Numerically solve for t: $t_p = F^{-1}(\alpha,\beta)$

Parameter Estimation

Maximum Likelihood Function:
$$L(\alpha,\beta|E) = \left[\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\right]^{n_F}\prod_{i=1}^{n_F}t_i^{\alpha-1}\left(1-t_i\right)^{\beta-1}$$

Log-Likelihood:
$$\Lambda(\alpha,\beta|E) = n_F\left\{\ln\Gamma(\alpha+\beta) - \ln\Gamma(\alpha) - \ln\Gamma(\beta)\right\} + (\alpha-1)\sum_{i=1}^{n_F}\ln t_i + (\beta-1)\sum_{i=1}^{n_F}\ln(1-t_i)$$

Setting $\partial\Lambda/\partial\alpha = 0$:
$$\psi(\alpha) - \psi(\alpha+\beta) = \frac{1}{n_F}\sum_{i=1}^{n_F}\ln(t_i)$$
where $\psi(x) = \frac{d}{dx}\ln\Gamma(x)$ is the digamma function, see section 1.6.7.

(Johnson et al. 1995, p.223)

Setting $\partial\Lambda/\partial\beta = 0$ (Johnson et al. 1995, p.223):
$$\psi(\beta) - \psi(\alpha+\beta) = \frac{1}{n_F}\sum_{i=1}^{n_F}\ln(1-t_i)$$

Point Estimates: Point estimates are obtained by using numerical methods to solve the simultaneous equations above.

Fisher Information Matrix (Yang and Berger 1998, p.5):
$$I(\alpha,\beta) = \begin{bmatrix}\psi'(\alpha)-\psi'(\alpha+\beta) & -\psi'(\alpha+\beta) \\ -\psi'(\alpha+\beta) & \psi'(\beta)-\psi'(\alpha+\beta)\end{bmatrix}$$
where $\psi'(x) = \frac{d^2}{dx^2}\ln\Gamma(x) = \sum_{i=0}^{\infty}(x+i)^{-2}$ is the trigamma function. See section 1.6.8.2.

Confidence Intervals: For a large number of samples the Fisher information matrix can be used to estimate confidence intervals. See section 1.4.7.

Bayesian Non-informative Priors

Jeffery's Prior: $\sqrt{\det I(\alpha,\beta)}$, where I(α, β) is given above.

Conjugate Priors:

| UOI | Likelihood Model | Evidence | Dist. of UOI | Prior Para | Posterior Parameters |
| p | $Bernoulli(k;p)$ | k failures in 1 trial | Beta | $\alpha_0$, $\beta_0$ | $\alpha = \alpha_0 + k$, $\beta = \beta_0 + 1 - k$ |
| p | $Binomial(k;n,p)$ | k failures in n trials | Beta | $\alpha_0$, $\beta_0$ | $\alpha = \alpha_0 + k$, $\beta = \beta_0 + n - k$ |

Description, Limitations and Uses

Example: For examples on the use of the beta distribution as a conjugate prior see the binomial distribution.
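The Binomial row of the table is the familiar Beta-Binomial update; a minimal sketch (the function name is illustrative):

```python
def beta_binomial_update(alpha0, beta0, k, n):
    """Beta(alpha0, beta0) prior on p, k failures observed in n trials
    -> Beta(alpha0 + k, beta0 + n - k) posterior."""
    return alpha0 + k, beta0 + n - k

# uniform Beta(1,1) prior, 3 failures in 10 trials
a, b = beta_binomial_update(1, 1, k=3, n=10)
print(a, b, a / (a + b))  # posterior mean = 4/12
```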

A non-homogeneous (operating in different environments) population of 5 switches has the following probabilities of failure on demand: 0.1176, 0.1488, 0.3684, 0.8123, 0.9783

Estimate the population variability function:

$$\frac{1}{n_F}\sum_{i=1}^{n_F}\ln(t_i) = -1.0549, \qquad \frac{1}{n_F}\sum_{i=1}^{n_F}\ln(1-t_i) = -1.25$$

Numerically solving:
$$\psi(\alpha) - \psi(\alpha+\beta) = -1.0549, \qquad \psi(\beta) - \psi(\alpha+\beta) = -1.25$$
gives $\hat\alpha = 0.7369$, $\hat\beta = 0.6678$.

$$I\left(\hat\alpha,\hat\beta\right) = \begin{bmatrix} 1.5924 & -1.0207 \\ -1.0207 & 2.0347 \end{bmatrix}$$

$$\left[J_n\left(\hat\alpha,\hat\beta\right)\right]^{-1} = \left[n_F\,I\left(\hat\alpha,\hat\beta\right)\right]^{-1} = \begin{bmatrix} 0.1851 & 0.0929 \\ 0.0929 & 0.1449 \end{bmatrix}$$

90% confidence interval for α:
$$\left[\hat\alpha\exp\left(\frac{-\Phi^{-1}(0.95)\sqrt{0.1851}}{\hat\alpha}\right),\ \hat\alpha\exp\left(\frac{\Phi^{-1}(0.95)\sqrt{0.1851}}{\hat\alpha}\right)\right] = [0.282,\ 1.92]$$

90% confidence interval for β:
$$\left[\hat\beta\exp\left(\frac{-\Phi^{-1}(0.95)\sqrt{0.1449}}{\hat\beta}\right),\ \hat\beta\exp\left(\frac{\Phi^{-1}(0.95)\sqrt{0.1449}}{\hat\beta}\right)\right] = [0.262,\ 1.71]$$
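The simultaneous digamma equations of this example can be solved with a damped fixed-point iteration and a finite-difference digamma; a sketch (the solution should land near the quoted α̂ ≈ 0.7369, β̂ ≈ 0.6678, with small numerical differences expected from the book's rounding):

```python
import math

def digamma(x, h=1e-6):
    """Numerical derivative of ln Gamma(x)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

data = [0.1176, 0.1488, 0.3684, 0.8123, 0.9783]
n = len(data)
c1 = sum(math.log(t) for t in data) / n        # ~ -1.0549
c2 = sum(math.log(1 - t) for t in data) / n    # ~ -1.25

a, b = 1.0, 1.0                                 # starting guess
for _ in range(500):
    f1 = digamma(a) - digamma(a + b) - c1
    f2 = digamma(b) - digamma(a + b) - c2
    a = max(1e-3, a - 0.5 * f1)                 # damped step on each equation
    b = max(1e-3, b - 0.5 * f2)
print(round(a, 3), round(b, 3))  # close to the quoted values
```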

Beta(α, β) is the mirror distribution of Beta(β, α): if X ~ Beta(α, β) and Y = 1 − X, then Y ~ Beta(β, α).

Location / Scale Parameters (NIST Section 1.3.6.6.17): a_L and b_U can be transformed into a location and scale parameter: location = a_L, scale = b_U − a_L.

Shapes (Gupta and Nadarajah 2004, p.41):
• 0 < α < 1: As x → 0, f(x) → ∞.
• 0 < β < 1: As x → 1, f(x) → ∞.
• α > 1, β > 1: As x → 0, f(x) → 0. There is a single mode at $\frac{\alpha-1}{\alpha+\beta-2}$.
• α < 1, β < 1: The distribution is a U shape. There is a single anti-mode at $\frac{\alpha-1}{\alpha+\beta-2}$.
• α > 0, β > 0: There exist inflection points at:
$$\frac{\alpha-1}{\alpha+\beta-2} \pm \frac{1}{\alpha+\beta-2}\sqrt{\frac{(\alpha-1)(\beta-1)}{\alpha+\beta-3}}$$

α = β: the distribution is symmetrical about x = 0.5. As α = β becomes large, the beta distribution approaches the normal distribution. The Standard Uniform Distribution arises when α = β = 1.
α = 1, β = 2 or α = 2, β = 1: straight line.
(α - 1)(β - 1) < 0: J-shaped.

Hazard Rate and MRL (Gupta and Nadarajah 2004, p.45):
α ≥ 1, β ≥ 1: h(t) is increasing; u(t) is decreasing.
α ≤ 1, β ≤ 1: h(t) is decreasing; u(t) is increasing.
α > 1, 0 < β < 1: h(t) is bathtub shaped and u(t) is an upside-down bathtub shape.
0 < α < 1, β > 1: h(t) is upside-down bathtub shaped and u(t) is bathtub shaped.

Applications
Parameter Model. The Beta distribution is often used to model parameters which are constrained to an interval. In particular the distribution of a probability parameter 0 ≤ p ≤ 1 is popularly modeled with the Beta distribution.

Bayesian Analysis. The Beta distribution is often used as a conjugate prior in Bayesian analysis for the Bernoulli, Binomial and Geometric distributions to produce closed form posteriors. The Beta(0,0) distribution is an improper prior sometimes used to represent ignorance of parameter values. The Beta(1,1) is a standard uniform distribution which may be used as a non-informative prior. When used as a conjugate prior to a Bernoulli or Binomial process the parameter α may represent the number of successes and β the number of failures, with the total number of trials being n = α + β.

Proportions. Used to model proportions. An example of this is the likelihood ratios for estimating uncertainty.

Resources
Online:
http://mathworld.wolfram.com/BetaDistribution.html
http://en.wikipedia.org/wiki/Beta_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (interactive web calculator)
http://www.itl.nist.gov/div898/handbook/eda/section3/eda366h.htm
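As a concrete sketch of the conjugate update described above (Beta prior, Binomial evidence), in Python; the function name and the example counts are illustrative:

```python
def beta_binomial_update(alpha0, beta0, k, n):
    """Conjugate update described above: prior Beta(alpha0, beta0),
    evidence of k failures in n trials -> posterior Beta(alpha0 + k, beta0 + n - k)."""
    return alpha0 + k, beta0 + (n - k)

# Starting from the non-informative Beta(1,1) (standard uniform) prior,
# after observing 3 failures in 10 demands:
a, b = beta_binomial_update(1, 1, 3, 10)
post_mean = a / (a + b)   # posterior mean of the failure probability p
```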

Books:
Gupta, A.K. & Nadarajah, S., 2004. Handbook of beta distribution and its applications, CRC Press.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2 2nd ed., Wiley-Interscience.

Relationship to Other Distributions

Chi-square Distribution χ²(t; v)
Let X_i ~ χ²(v_i) and Y = X₁/(X₁ + X₂). Then
Y ~ Beta(α = v₁/2, β = v₂/2)

Uniform Distribution Unif(t; a, b)
Let X_i ~ Unif(0,1) with order statistics X₍₁₎ ≤ X₍₂₎ ≤ … ≤ X₍n₎. Then
X₍r₎ ~ Beta(r, n - r + 1)
where n and r are integers.
Special Case: Beta(t; 1, 1, a, b) = Unif(t; a, b)

Normal Distribution Norm(t; μ, σ)
For large α and β with fixed α/β:
Beta(α, β) ≈ Norm( μ = α/(α + β), σ² = αβ/[(α + β)²(α + β + 1)] )
As α and β increase the mean remains constant and the variance is reduced.

Gamma Distribution Gamma(t; k, λ)
Let X₁, X₂ ~ Gamma(k_i, λ) and Y = X₁/(X₁ + X₂). Then
Y ~ Beta(α = k₁, β = k₂)

Dirichlet Distribution Dir_d(x; α)
Special Case:
Dir_(d=1)(x; [α₁, α₀]) = Beta(x; α = α₁, β = α₀)

4.2. Birnbaum Saunders Continuous Distribution

[Figure: Birnbaum-Saunders probability density function f(t), cumulative density function F(t) and hazard rate h(t), plotted for α = 0.2, 0.5, 1, 2.5 with β = 1, and for α = 1 with β = 0.5, 1, 2, 4.]

Parameters & Description
β > 0 — Scale parameter: β is equal to the median.
α > 0 — Shape parameter.

Limits
0 < t < ∞

Distribution Formulas

PDF
f(t) = [ (√(t/β) + √(β/t)) / (2αt) ] · (1/√(2π)) · exp( -(√(t/β) - √(β/t))² / (2α²) )
     = [ (√(t/β) + √(β/t)) / (2αt) ] · φ(z_BS)
where φ(z) is the standard normal pdf and:
z_BS = ( √(t/β) - √(β/t) ) / α

CDF
F(t) = Φ( (√(t/β) - √(β/t)) / α ) = Φ(z_BS)

Reliability
R(t) = Φ( (√(β/t) - √(t/β)) / α ) = Φ(-z_BS)

Conditional Survivor Function
m(x) = R(x|t) = R(t + x)/R(t) = Φ(-z′_BS) / Φ(-z_BS)
where
z_BS = ( √(t/β) - √(β/t) ) / α,   z′_BS = ( √((t + x)/β) - √(β/(t + x)) ) / α,   P(T > t + x | T > t)
t is the given time we know the component has survived to; x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life
u(t) = ∫_t^∞ Φ(-z_BS(x)) dx / Φ(-z_BS(t))

Hazard Rate
h(t) = [ (√(t/β) + √(β/t)) / (2αt) ] · φ(z_BS) / Φ(-z_BS)

Cumulative Hazard Rate
H(t) = -ln[ Φ(-z_BS) ]
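The CDF and reliability formulas above use only the standard normal CDF, so they can be evaluated with the standard library alone. A minimal sketch (function names are illustrative):

```python
import math

def std_normal_cdf(z):
    # Phi(z) via the error function (stdlib only)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bs_cdf(t, alpha, beta):
    """Birnbaum-Saunders CDF F(t) = Phi(z_BS), per the formulas above."""
    z = (math.sqrt(t / beta) - math.sqrt(beta / t)) / alpha
    return std_normal_cdf(z)

def bs_reliability(t, alpha, beta):
    # R(t) = Phi(-z_BS) = 1 - F(t)
    return 1.0 - bs_cdf(t, alpha, beta)
```

At t = β the argument z_BS is zero, so F(β) = 0.5 — consistent with β being the median.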

Properties and Moments
Median: β
Mode: numerically solve for t:
t³ + β(1 + α²)t² + β²(3α² - 1)t - β³ = 0
Mean - 1st Raw Moment: β(1 + α²/2)
Variance - 2nd Central Moment: (αβ)²(1 + 5α²/4)
Skewness - 3rd Central Moment: 4α(11α² + 6) / (5α² + 4)^(3/2)  (Lemonte et al. 2007)
Excess kurtosis - 4th Central Moment: 6α²(93α² + 40) / (5α² + 4)²  (Lemonte et al. 2007)

100γ% Percentile Function
t_γ = (β/4) · [ αΦ⁻¹(γ) + √( [αΦ⁻¹(γ)]² + 4 ) ]²

Parameter Estimation
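The percentile function above inverts the CDF in closed form; note Φ⁻¹(0.5) = 0 recovers the median t₀.₅ = β. A short Python check (the CDF helper is repeated so the snippet is self-contained; names are illustrative):

```python
import math
from statistics import NormalDist

def bs_percentile(gamma, alpha, beta):
    """t_gamma = (beta/4) * (alpha*Phi^-1(gamma) + sqrt((alpha*Phi^-1(gamma))^2 + 4))^2"""
    w = alpha * NormalDist().inv_cdf(gamma)
    return (beta / 4.0) * (w + math.sqrt(w * w + 4.0)) ** 2

def bs_cdf(t, alpha, beta):
    # F(t) = Phi(z_BS) with z_BS = (sqrt(t/beta) - sqrt(beta/t)) / alpha
    z = (math.sqrt(t / beta) - math.sqrt(beta / t)) / alpha
    return NormalDist().cdf(z)

t90 = bs_percentile(0.90, 0.5, 100.0)   # illustrative alpha=0.5, beta=100
```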

Maximum Likelihood Function

Likelihood Function
For complete data:
L(α, β|E) = Π_(i=1)^(n_F) [ (√(t_i/β) + √(β/t_i)) / (2αt_i√(2π)) ] · exp( -(√(t_i/β) - √(β/t_i))² / (2α²) )

Log-Likelihood Function
Λ(α, β|E) = -n_F·ln(αβ√(2π)) + Σ_(i=1)^(n_F) ln( √(β/t_i) + (β/t_i)^(3/2) ) - (1/(2α²)) Σ_(i=1)^(n_F) ( t_i/β + β/t_i - 2 )

∂Λ/∂α = 0:
∂Λ/∂α = -n_F/α + (1/α³) Σ_(i=1)^(n_F) ( t_i/β + β/t_i - 2 ) = 0

∂Λ/∂β = 0:
∂Λ/∂β = -n_F/(2β) + Σ_(i=1)^(n_F) 1/(t_i + β) + (1/(2α²)) [ (1/β²) Σ t_i - Σ 1/t_i ] = 0

MLE Point Estimates
β̂ is found by solving:
β² - β[2R + g(β)] + R[S + g(β)] = 0
where
g(β) = [ (1/n_F) Σ_(i=1)^(n_F) (β + t_i)⁻¹ ]⁻¹,   S = (1/n_F) Σ t_i,   R = [ (1/n_F) Σ 1/t_i ]⁻¹

The point estimate for α is:
α̂ = √( S/β̂ + β̂/R - 2 )
(Lemonte et al. 2007)

Fisher Information
I(α, β) = [[ 2/α², 0 ], [ 0, (α(2π)^(-1/2)·k(α) + 1)/(α²β²) ]]
where
k(α) = α√(π/2) - π·exp(2/α²)·[1 - Φ(2/α)]
(Lemonte et al. 2007)

100γ% Confidence Intervals
Calculated from the Fisher information matrix. See section 1.4.7. For a literature review of proposed confidence intervals see (Lemonte et al. 2007).

Description, Limitations and Uses

Example
5 components are put on a test with the following failure times:
98, 116, 2485, 2526, 2920 hours

S = (1/n_F) Σ t_i = 1629

R = [ (1/n_F) Σ 1/t_i ]⁻¹ = 250.432

Solving:
β² - β[2R + g(β)] + R[S + g(β)] = 0,  with g(β) = [ (1/n_F) Σ (β + t_i)⁻¹ ]⁻¹
gives:
β̂ = 601.949

α̂ = √( S/β̂ + β̂/R - 2 ) = 1.763

90% confidence interval for α:
[ α̂·exp( -Φ⁻¹(0.95)·√(α̂²/(2n_F))/α̂ ),  α̂·exp( +Φ⁻¹(0.95)·√(α̂²/(2n_F))/α̂ ) ] = [1.048, 2.966]

90% confidence interval for β:
k(α̂) = α̂√(π/2) - π·exp(2/α̂²)·[1 - Φ(2/α̂)] = 1.442
Î_ββ = 10.335E-6 = 1/96762
[ β̂·exp( -Φ⁻¹(0.95)/(β̂√(n_F·Î_ββ)) ),  β̂·exp( +Φ⁻¹(0.95)/(β̂√(n_F·Î_ββ)) ) ] = [100.4, 624.5]

Note that this confidence interval uses the assumption that the parameters are normally distributed, which is only true for large sample sizes; therefore these confidence intervals may be inaccurate. Bayesian methods must be done numerically.

Characteristics
The Birnbaum-Saunders distribution is a stochastic model of Miner's rule.

Characteristic of α. As α decreases the distribution becomes more symmetrical around the value of β.

Hazard Rate. The hazard rate is always unimodal and has the following asymptotes (Meeker & Escobar 1998, p.107):
h(0) = 0,   lim_(t→∞) h(t) = 1/(2α²β)
The change point of the unimodal hazard rate for α < 0.6 must be solved numerically; for α > 0.6 it can be approximated using (Kundu et al. 2008):
t_c = β / (-0.4604 + 1.8417α)²

Lognormal and Inverse Gaussian Distributions. The shape and behavior of the Birnbaum-Saunders distribution are similar to those of the lognormal and inverse Gaussian distributions. This similarity is seen primarily in the center of the distributions. (Meeker & Escobar 1998, p.107)

Let:
T ~ BS(t; α, β)

Scaling property (Meeker & Escobar 1998, p.107):
cT ~ BS(t; α, cβ)  where c > 0

Inverse property (Meeker & Escobar 1998, p.107):
1/T ~ BS(t; α, 1/β)

Applications
Fatigue-Fracture. The distribution has been designed to model crack growth to a critical crack size. The model uses Miner's rule, which allows for non-constant fatigue cycles through accumulated damage. The assumption is that the crack growth during any one cycle is independent of the growth during any other cycle, and the growth for each cycle has the same distribution from cycle to cycle. This is different from the proportional degradation model used to derive the lognormal distribution, in which the rate of degradation depends on accumulated damage. (http://www.itl.nist.gov/div898/handbook/apr/section1/apr166.htm)

Resources
Online:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda366a.htm
http://www.itl.nist.gov/div898/handbook/apr/section1/apr166.htm
http://en.wikipedia.org/wiki/Birnbaum%E2%80%93Saunders_distribution

Books:
Birnbaum, Z.W. & Saunders, S.C., 1969. A New Family of Life Distributions. Journal of Applied Probability, 6(2), 319-327.

Lemonte, A.J., Cribari-Neto, F. & Vasconcellos, K.L., 2007. Improved statistical inference for the two-parameter Birnbaum-Saunders distribution. Computational Statistics & Data Analysis, 51(9), 4656-4681.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2, 2nd ed., Wiley-Interscience.

Rausand, M. & Høyland, A., 2004. System reliability theory, Wiley- IEEE.


4.3. Gamma Continuous Distribution

[Figure: Gamma probability density function f(t), cumulative density function F(t) and hazard rate h(t), plotted for k = 0.5, 3, 6, 12 with λ = 1, and for k = 3 with λ = 0.4, 0.8, 1.2, 1.8.]

Parameters & Description
λ > 0 — Scale parameter: equal to the rate (frequency) of events/shocks. Sometimes defined as 1/θ, where θ is the average time between events/shocks.
k > 0 — Shape parameter: as an integer, k can be interpreted as the number of events/shocks until failure. When not restricted to an integer, k can be interpreted as a measure of the ability to resist shocks.

Limits
t ≥ 0

Distribution Formulas
Γ(k) is the complete gamma function; Γ(k, t) and γ(k, t) are the incomplete gamma functions, see section 1.6. When k is an integer the distribution is known as the Erlang distribution.

PDF
f(t) = λ^k t^(k-1) e^(-λt) / Γ(k)
When k is an integer:
f(t) = λ^k t^(k-1) e^(-λt) / (k-1)!
with Laplace transformation:
f(s) = ( λ/(λ + s) )^k

CDF
F(t) = γ(k, λt)/Γ(k) = (1/Γ(k)) ∫_0^(λt) x^(k-1) e^(-x) dx
When k is an integer:
F(t) = 1 - e^(-λt) Σ_(n=0)^(k-1) (λt)^n / n!

Reliability
R(t) = Γ(k, λt)/Γ(k) = (1/Γ(k)) ∫_(λt)^∞ x^(k-1) e^(-x) dx
When k is an integer:
R(t) = e^(-λt) Σ_(n=0)^(k-1) (λt)^n / n!

Conditional Survivor Function
m(x) = R(x|t) = R(t + x)/R(t) = Γ(k, λ(t + x)) / Γ(k, λt)
When k is an integer:
m(x) = e^(-λx) [ Σ_(n=0)^(k-1) [λ(t + x)]^n / n! ] / [ Σ_(n=0)^(k-1) (λt)^n / n! ],   P(T > t + x | T > t)
t is the given time we know the component has survived to; x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life
u(t) = ∫_t^∞ R(x) dx / R(t) = ∫_t^∞ Γ(k, λx) dx / Γ(k, λt)
The mean residual life does not have a closed form but has the expansion:

u(t) = (1/λ) [ 1 + (k-1)/(λt) + (k-1)(k-2)/(λt)² + O(t⁻³) ]
where O(t⁻³) is Landau's notation. (Kleiber & Kotz 2003, p.161)

Hazard Rate
h(t) = λ^k t^(k-1) e^(-λt) / Γ(k, λt)
When k is an integer:
h(t) = λ^k t^(k-1) / [ (k-1)! Σ_(n=0)^(k-1) (λt)^n / n! ]
Series expansion of the hazard rate (Kleiber & Kotz 2003, p.161):
h(t) = λ [ 1 + (k-1)/(λt) + (k-1)(k-2)/(λt)² + O(t⁻³) ]⁻¹

Limits of h(t) (Rausand & Høyland 2004):
lim_(t→0) h(t) = ∞ and lim_(t→∞) h(t) = λ, when 0 < k < 1
lim_(t→0) h(t) = 0 and lim_(t→∞) h(t) = λ, when k ≥ 1

Cumulative Hazard Rate
H(t) = -ln[ Γ(k, λt)/Γ(k) ]
When k is an integer:
H(t) = λt - ln( Σ_(n=0)^(k-1) (λt)^n / n! )

Properties and Moments

Median: numerically solve for t when:
t_0.5 = F⁻¹(0.5; k, λ), i.e. γ(k, λt) = Γ(k, λt)
where γ(k, λt) is the lower incomplete gamma function, see section 1.6.6.
Mode: (k - 1)/λ for k ≥ 1; no mode for 0 < k < 1
Mean - 1st Raw Moment: k/λ
Variance - 2nd Central Moment: k/λ²
Skewness - 3rd Central Moment: 2/√k
Excess kurtosis - 4th Central Moment: 6/k
Characteristic Function: (1 - it/λ)^(-k)
100α% Percentile Function: numerically solve for t:
t_α = F⁻¹(α; k, λ)

Parameter Estimation
Maximum Likelihood Function

Likelihood Function
L(k, λ|E) = ( λ^(k·n_F) / [Γ(k)]^(n_F) ) Π_(i=1)^(n_F) t_i^(k-1) e^(-λt_i)

Log-Likelihood Function
Λ(k, λ|E) = k·n_F·ln(λ) - n_F·ln Γ(k) + (k - 1) Σ_(i=1)^(n_F) ln(t_i) - λ Σ_(i=1)^(n_F) t_i

∂Λ/∂k = 0:
n_F·ln(λ) - n_F·ψ(k) + Σ_(i=1)^(n_F) ln(t_i) = 0
where ψ(x) = d/dx ln[Γ(x)] is the digamma function, see section 1.6.7.

∂Λ/∂λ = 0:
k·n_F/λ - Σ_(i=1)^(n_F) t_i = 0

Point Estimates
Point estimates for k̂ and λ̂ are obtained by using numerical methods to solve the simultaneous equations above. (Kleiber & Kotz 2003, p.165)

Fisher Information Matrix
I(k, λ) = [[ ψ′(k), -1/λ ], [ -1/λ, k/λ² ]]
where ψ′(x) = d²/dx² ln[Γ(x)] = Σ_(i=0)^∞ (x + i)⁻² is the trigamma function. (Yang and Berger 1998, p.10)

Confidence Intervals
For a large number of samples the Fisher information matrix can be used to estimate confidence intervals.

Bayesian
Non-informative Priors, π(k, λ) (Yang and Berger 1998, p.6)

Type | Prior | Posterior
Uniform Improper Prior with limits k ∈ (0, ∞), λ ∈ (0, ∞) | 1 | No Closed Form
Jeffrey's Prior | (1/λ)·√(k·ψ′(k) - 1) | No Closed Form
Reference Prior, order {k, λ} | (1/λ)·√(k·ψ′(k) - 1) | No Closed Form
Reference Prior, order {λ, k} | (1/λ)·√(ψ′(k)) | No Closed Form


where ψ′(x) = d²/dx² ln[Γ(x)] = Σ_(i=0)^∞ (x + i)⁻² is the trigamma function.

Conjugate Priors
UOI | Likelihood Model | Evidence | Dist. of UOI | Prior Para. | Posterior Parameters
λ | Exponential Exp(t; λ) | n_F failures in t_T | Gamma | k₀, λ₀ | k = k₀ + n_F, λ = λ₀ + t_T
λ | Poisson Pois(k; λt) | n_F failures in t_T | Gamma | k₀, λ₀ | k = k₀ + n_F, λ = λ₀ + t_T
λ = α^(-β) | Weibull with β known, Weibull(t; α, β) | n_F failures at times t_i | Gamma | k₀, λ₀ | k = k₀ + n_F, λ = λ₀ + Σ_(i=1)^(n_F) t_i^β (Rinne 2008, p.520)
λ = 1/σ² | Normal with μ known, Norm(x; μ, σ²) | n_F failures at times t_i | Gamma | k₀, λ₀ | k = k₀ + n_F/2, λ = λ₀ + ½ Σ_(i=1)^(n_F) (t_i - μ)²
λ | Gamma with k known, Gamma(x; λ, k) | n_F events in t_T | Gamma | k₀, λ₀ | k = k₀ + n_F·k, λ = λ₀ + t_T
α | Pareto with θ known, Pareto(t; θ, α) | n_F failures at times x_i | Gamma | k₀, λ₀ | k = k₀ + n_F, λ = λ₀ + Σ_(i=1)^(n_F) ln(x_i/θ)
where t_T = Σ t_i^F + Σ t_i^S = total time in test

Description, Limitations and Uses

Example 1
For an example using the gamma distribution as a conjugate prior see the Poisson or Exponential distributions.
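For the exponential/Poisson rows of the table above, the posterior update is pure bookkeeping. A minimal sketch (the function name and the numbers are illustrative):

```python
def gamma_update_exponential(k0, lam0, n_failures, total_time):
    """Conjugate update from the table above: exponential (or HPP/Poisson) data
    with a Gamma(k0, lam0) prior on the rate -> Gamma(k0 + n_F, lam0 + t_T)."""
    return k0 + n_failures, lam0 + total_time

# Illustrative: Gamma(1, 100) prior, then 3 failures in 1500 hours of test time
k1, lam1 = gamma_update_exponential(1.0, 100.0, 3, 1500.0)
post_mean_rate = k1 / lam1   # posterior mean of the failure rate (mean of a Gamma is k/lambda)
```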

A renewal process has an exponential time between failures with parameter λ = 0.01 under homogeneous Poisson process conditions. What is the probability that the fourth failure will occur before 200 hours?

F(200; 4, 0.01) = 0.1429
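For integer k the CDF has the closed form given earlier, which reproduces this result directly. A stdlib-Python sketch:

```python
import math

def erlang_cdf(t, k, lam):
    """Gamma CDF for integer k (Erlang):
    F(t) = 1 - exp(-lam*t) * sum_{n=0}^{k-1} (lam*t)^n / n!"""
    x = lam * t
    s = sum(x ** n / math.factorial(n) for n in range(k))
    return 1.0 - math.exp(-x) * s

# Probability the 4th failure occurs before 200 h with lambda = 0.01
p = erlang_cdf(200.0, 4, 0.01)
```

With k = 1 the helper reduces to the exponential CDF, 1 - e^(-λt).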

Example 2
5 components are put on a test with the following failure times:
38, 42, 44, 46, 55 hours
Solving:
0 = 225 - 5k/λ
0 = 5·ln(λ) - 5·ψ(k) + 18.9954

Gives:
k̂ = 21.377
λ̂ = 0.4749

90% confidence interval for k:
I(k̂, λ̂) = [[0.0479, -0.4749], [-0.4749, 4.8205]]
[J_n(k̂, λ̂)]⁻¹ = [n_F·I(k̂, λ̂)]⁻¹ = [[179.979, 17.730], [17.730, 1.7881]]
[ k̂·exp( -Φ⁻¹(0.95)·√179.979/k̂ ),  k̂·exp( +Φ⁻¹(0.95)·√179.979/k̂ ) ] = [7.6142, 60.0143]

90% confidence interval for λ:
[ λ̂·exp( -Φ⁻¹(0.95)·√1.7881/λ̂ ),  λ̂·exp( +Φ⁻¹(0.95)·√1.7881/λ̂ ) ] = [0.0046, 48.766]

Note that this confidence interval uses the assumption that the parameters are normally distributed, which is only true for large sample sizes; therefore these confidence intervals may be inaccurate. Bayesian methods must be done numerically.

Characteristics
The gamma distribution was originally known as a Pearson Type III distribution. This distribution includes a location parameter γ which shifts the distribution along the x-axis:
f(t; k, λ, γ) = λ^k (t - γ)^(k-1) e^(-λ(t-γ)) / Γ(k)
When k is an integer, the Gamma distribution is called an Erlang distribution.

Characteristics:
k < 1: f(0) = ∞. There is no mode.
k = 1: f(0) = λ. The gamma distribution reduces to an exponential distribution with failure rate λ. Mode at t = 0.
k > 1: f(0) = 0.
Large k: the gamma distribution approaches a normal distribution with μ = k/λ, σ = √(k/λ²).

Homogeneous Poisson Process (HPP). Components with an exponential time to failure which undergo instantaneous renewal with an identical item follow a HPP. The Gamma distribution is the probability distribution of the time to the kth failure and is derived from the convolution of exponentially distributed random variables T_i. (See related distributions, exponential distribution.)
T_k ~ Gamma(k, λ)

Scaling property:
aT ~ Gamma(k, λ/a)

Convolution property:
T₁ + T₂ + … + T_n ~ Gamma(Σk_i, λ)
where λ is fixed.
Properties from (Leemis & McQueston 2008)

Applications
Renewal Theory, Homogeneous Poisson Process. Used to model a renewal process where the component time to failure is exponentially distributed and the component is replaced instantaneously with a new identical component. The HPP can also be used to model ruin theory (used in risk assessments) and queuing theory.

System Failure. Can be used to model system failure with k backup systems.

Life Distribution. The gamma distribution is flexible in shape and can give good approximations to life data.

Bayesian Analysis. The gamma distribution is often used as a prior in Bayesian analysis to produce closed form posteriors.

Resources
Online:
http://mathworld.wolfram.com/GammaDistribution.html
http://en.wikipedia.org/wiki/Gamma_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (interactive web calculator)
http://www.itl.nist.gov/div898/handbook/eda/section3/eda366b.htm

Books:
Artin, E., 1964. The Gamma Function, New York: Holt, Rinehart & Winston.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1994. Continuous Univariate Distributions, Vol. 1 2nd ed., Wiley-Interscience.

Bowman, K.O. & Shenton, L.R., 1988. Properties of estimators for the gamma distribution, CRC Press.

Relationship to Other Distributions

Generalized Gamma Distribution

GenGamma(t; k, λ, γ, ξ):
f(t; k, λ, γ, ξ) = ξ·λ^(ξk)·(t - γ)^(ξk-1)·exp{ -[λ(t - γ)]^ξ } / Γ(k)
λ - scale parameter
k - shape parameter
γ - location parameter
ξ - second shape parameter

The generalized gamma distribution has been derived because it is a generalization of a large number of probability distributions, such as:
GenGamma(t; 1, λ, 0, 1) = Exp(t; λ)
GenGamma(t; 1, 1/β, μ, 1) = Exp(t; μ, β)
GenGamma(t; 1, 1/α, 0, β) = Weibull(t; α, β)
GenGamma(t; 1, 1/α, γ, β) = Weibull(t; α, β, γ)
GenGamma(t; n/2, 1/2, 0, 1) = χ²(t; n)
GenGamma(t; n/2, 1/√2, 0, 2) = χ(t; n)
GenGamma(t; 1, 1/(σ√2), 0, 2) = Rayleigh(t; σ)

Exponential Distribution Exp(t; λ)
Let T₁ … T_k ~ Exp(λ) and T = T₁ + T₂ + … + T_k. Then
T ~ Gamma(k, λ)
This gives the Gamma distribution its convolution property.
Special Case: Exp(t; λ) = Gamma(t; k = 1, λ)

Poisson Distribution Pois(k; λt)
Let T₁ … T_k ~ Exp(λ) and T = T₁ + T₂ + … + T_k, so T ~ Gamma(k, λ). The Poisson distribution is the probability that exactly k failures have been observed in time t; this is the probability that t is between T_k and T_(k+1):
f_Pois(k; λt) = F_Gamma(t; k, λ) - F_Gamma(t; k + 1, λ)
where k is an integer.
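The Poisson-gamma identity above can be checked numerically: P(N(t) = k) equals the difference of two gamma (Erlang) CDFs. A sketch with illustrative values of t, k and λ (the Erlang helper is repeated so the snippet is self-contained):

```python
import math

def erlang_cdf(t, k, lam):
    # Gamma CDF for integer shape k
    x = lam * t
    return 1.0 - math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(k))

def poisson_pmf(k, mu):
    # Probability of exactly k events when the expected count is mu = lam*t
    return math.exp(-mu) * mu ** k / math.factorial(k)

t, k, lam = 150.0, 3, 0.02
lhs = poisson_pmf(k, lam * t)                              # P(N(t) = k)
rhs = erlang_cdf(t, k, lam) - erlang_cdf(t, k + 1, lam)    # P(T_k <= t) - P(T_{k+1} <= t)
```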


Normal Distribution Norm(t; μ, σ)
Special case for large k:
lim_(k→∞) Gamma(k, λ) = Norm( μ = k/λ, σ = √(k/λ²) )

Chi-square Distribution χ²(t; v)
Special Case:
χ²(t; v) = Gamma( t; k = v/2, λ = 1/2 )
where v is an integer.

Inverse Gamma Distribution IG(t; α, β)
Let X ~ Gamma(k, λ) and Y = 1/X. Then
Y ~ IG(α = k, β = λ)

Beta Distribution Beta(t; α, β)
Let X₁, X₂ ~ Gamma(k_i, λ) and Y = X₁/(X₁ + X₂). Then
Y ~ Beta(α = k₁, β = k₂)

Dirichlet Distribution Dir_d(x; α)
Let Y_i ~ Gamma(λ, k_i) i.i.d.* and V = Σ_(i=1)^d Y_i. Then
V ~ Gamma(λ, Σk_i)
Let Z = ( Y₁/V, Y₂/V, …, Y_d/V ). Then
Z ~ Dir_d(α₁, …, α_d)
*i.i.d: independent and identically distributed

Wishart Distribution Wishart_d(n; Σ)
The Wishart Distribution is the multivariate generalization of the gamma distribution.

4.4. Logistic Continuous Distribution

[Figure: Logistic probability density function f(t), cumulative density function F(t) and hazard rate h(t), plotted for μ = 0, 2, 4 with s = 1, and for μ = 0 with s = 0.7, 1, 2.]

Parameters & Description
μ (-∞ < μ < ∞) — Location parameter: μ is the mean, median and mode of the distribution.
s > 0 — Scale parameter: proportional to the standard deviation of the distribution.

Limits
-∞ < t < ∞

Distribution Formulas

PDF
f(t) = e^z / ( s(1 + e^z)² ) = e^(-z) / ( s(1 + e^(-z))² ) = (1/(4s))·sech²(z/2)
where
z = (t - μ)/s

CDF
F(t) = 1/(1 + e^(-z)) = e^z/(1 + e^z) = ½ + ½·tanh(z/2)

Reliability
R(t) = 1/(1 + e^z)

Conditional Survivor Function
m(x) = R(x|t) = R(t + x)/R(t) = ( 1 + exp((t - μ)/s) ) / ( 1 + exp((t + x - μ)/s) ),   P(T > t + x | T > t)
t is the given time we know the component has survived to; x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life
u(t) = s(1 + e^z)·ln(1 + e^(-z))

Hazard Rate
h(t) = F(t)/s = 1/( s(1 + e^(-z)) ) = 1/( s + s·exp((μ - t)/s) )

Cumulative Hazard Rate
H(t) = ln( 1 + exp((t - μ)/s) )
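The reduction h(t) = F(t)/s, and the identity f(t) = F(t)[1 - F(t)]/s implicit in the PDF/CDF pair above, are easy to verify numerically. A minimal sketch (names and test values are illustrative):

```python
import math

def logistic_cdf(t, mu, s):
    return 1.0 / (1.0 + math.exp(-(t - mu) / s))

def logistic_pdf(t, mu, s):
    z = (t - mu) / s
    return math.exp(-z) / (s * (1.0 + math.exp(-z)) ** 2)

def logistic_hazard(t, mu, s):
    # h(t) = f(t)/R(t); the formulas above reduce this to F(t)/s
    return logistic_pdf(t, mu, s) / (1.0 - logistic_cdf(t, mu, s))

t, mu, s = 5.0, 3.0, 2.0   # illustrative values
```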

Properties and Moments
Median: μ
Mode: μ
Mean - 1st Raw Moment: μ
Variance - 2nd Central Moment: s²π²/3
Skewness - 3rd Central Moment: 0
Excess kurtosis - 4th Central Moment: 6/5
Characteristic Function: e^(iμt)·B(1 - ist, 1 + ist) for |st| < 1
100γ% Percentile Function:
t_γ = μ + s·ln( γ/(1 - γ) )

Parameter Estimation

Plotting Method

Least Squares (y = mx + c): x-axis: t_i; y-axis: ln[F/(1 - F)]. Then ŝ = 1/m and μ̂ = -c·ŝ.

Maximum Likelihood Function

Likelihood Function
For complete data:
L(μ, s|E) = Π_(i=1)^(n_F) exp( -(t_i - μ)/s ) / ( s[1 + exp(-(t_i - μ)/s)]² )

Log-Likelihood Function
Λ(μ, s|E) = -n_F·ln(s) - Σ_(i=1)^(n_F) (t_i - μ)/s - 2 Σ_(i=1)^(n_F) ln( 1 + exp(-(t_i - μ)/s) )

∂Λ/∂μ = 0:
∂Λ/∂μ = n_F/s - (2/s) Σ_(i=1)^(n_F) [ 1 + exp((t_i - μ)/s) ]⁻¹ = 0

∂Λ/∂s = 0:
∂Λ/∂s = -n_F/s + (1/s) Σ_(i=1)^(n_F) ((t_i - μ)/s)·( 1 - exp(-(t_i - μ)/s) )/( 1 + exp(-(t_i - μ)/s) ) = 0

MLE Point Estimates
The MLE estimates for μ̂ and ŝ are found by solving the following equations:
(1/n_F) Σ_(i=1)^(n_F) [ 1 + exp((t_i - μ)/s) ]⁻¹ = 1/2
(1/n_F) Σ_(i=1)^(n_F) ((t_i - μ)/s)·( 1 - exp(-(t_i - μ)/s) )/( 1 + exp(-(t_i - μ)/s) ) = 1

These estimates are biased. (Balakrishnan 1991) provides tables derived from Monte Carlo simulation to correct the bias.

Fisher Information
I(μ, s) = [[ 1/(3s²), 0 ], [ 0, (3 + π²)/(9s²) ]]
(Antle et al. 1970)

100γ% Confidence Intervals
Confidence intervals are most often obtained from tables derived from Monte Carlo simulation. Corrections from using the Fisher Information matrix method are given in (Antle et al. 1970).

Bayesian
Non-informative Priors π₀(μ, s)
Type | Prior
Jeffrey's Prior | 1/s

Description, Limitations and Uses

Example
The accuracy of a cutting machine used in manufacturing is to be measured. Five cuts at the required length are made and measured (in mm) as:
7.436, 10.270, 10.466, 11.039, 11.854

Numerically solving the MLE equations gives:
μ̂ = 10.446 mm
ŝ = 0.815
This gives a mean of 10.446 and a variance of 2.183. Compared to the same data used in the Normal distribution section, it can be seen that this estimate is very similar to a normal distribution.

90% confidence interval for μ:
[ μ̂ - Φ⁻¹(0.95)·√(3ŝ²/n_F),  μ̂ + Φ⁻¹(0.95)·√(3ŝ²/n_F) ] = [9.408, 11.4844]


90% confidence interval for s:
[ ŝ·exp( -Φ⁻¹(0.95)·√(9ŝ²/((3 + π²)n_F))/ŝ ),  ŝ·exp( +Φ⁻¹(0.95)·√(9ŝ²/((3 + π²)n_F))/ŝ ) ] = [0.441, 1.501]

Note that this confidence interval uses the assumption that the parameters are normally distributed, which is only true for large sample sizes. Therefore these confidence intervals may be inaccurate.

Bayesian methods must be calculated using numerical methods.

Characteristics
The logistic distribution is most often used to model growth rates (and has been used extensively in biology and chemical applications). In reliability engineering it is most often used as a life distribution.

Shape. There is no shape parameter, so the logistic distribution is always a bell-shaped curve. Increasing μ shifts the curve to the right; increasing s increases the spread of the curve.

Normal Distribution. The shape of the logistic distribution is very similar to that of a normal distribution, with the logistic distribution having slightly longer tails. It would take a large number of samples to distinguish between the distributions. The main difference is that the logistic hazard rate approaches 1/s for large t. The logistic function has historically been preferred over the normal distribution because of its simplified form. (Meeker & Escobar 1998, p.89)

Alternative Parameterization. It is equally popular to present the logistic distribution using the true standard deviation σ = πs/√3. This form is used in the reference book (Balakrishnan 1991) and gives the following cdf:

F(t) = 1 / ( 1 + exp( -π(t - μ)/(√3·σ) ) )

Standard Logistic Distribution. The standard logistic distribution has μ = 0, s = 1. The standard logistic random variable Z is related to the logistic distribution by:
Z = (X - μ)/s
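The two parameterizations are algebraically identical once σ = πs/√3 is substituted, which a quick numerical check confirms (values are illustrative):

```python
import math

def logistic_cdf_s(t, mu, s):
    # parameterization by the scale parameter s
    return 1.0 / (1.0 + math.exp(-(t - mu) / s))

def logistic_cdf_sigma(t, mu, sigma):
    # Balakrishnan's parameterization by the true standard deviation sigma
    return 1.0 / (1.0 + math.exp(-math.pi * (t - mu) / (math.sqrt(3.0) * sigma)))

mu, s = 10.0, 0.8
sigma = math.pi * s / math.sqrt(3.0)   # sigma = pi*s/sqrt(3)
```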


Let:
T ~ Logistic(t; μ, s)

Scaling property (Leemis & McQueston 2008):
aT ~ Logistic(t; aμ, as)

Rate Relationships. The distribution has the following rate relationships, which make it suitable for modeling growth (Hastings et al. 2000, p.127):

h(t) = f(t)/R(t) = F(t)/s

z = ln( F(t)/R(t) ) = ln[F(t)] - ln[1 - F(t)]
where z = (t - μ)/s. When μ = 0 and s = 1:

dF(t)/dt = f(t) = F(t)·R(t)

Applications
Growth Model. The logistic distribution's most common use is as a growth model.

Probability of Detection. The cdf of the logistic distribution is commonly used to represent the probability of detection for damage sensors and detection instruments; for example, the probability of detection of embedded flaws in metals using ultrasonic signals.

Life Distribution. In reliability applications it is used as a life distribution. It is similar in shape to a normal distribution and so is often used instead of a normal distribution due to its simplified form. (Meeker & Escobar 1998, p.89)

Logistic Regression. Logistic regression is a generalized linear model used to predict binary outcomes. (Agresti 2002)

Resources
Online:
http://mathworld.wolfram.com/LogisticDistribution.html
http://en.wikipedia.org/wiki/Logistic_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)
http://www.weibull.com/LifeDataWeb/the_logistic_distribution.htm

Books: Balakrishnan, 1991. Handbook of the Logistic Distribution 1st ed., CRC.


Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2 2nd ed., Wiley-Interscience.

Relationship to Other Distributions

Exponential Distribution Exp(t; λ)
Let X ~ Exp(λ = 1) and Y = ln( (1 - e^(-X))/e^(-X) ). Then
Y ~ Logistic(0, 1)
(Hastings et al. 2000, p.127)

Pareto Distribution Pareto(t; θ, α)
Let X ~ Pareto(θ, α) and Y = -ln( (X/θ)^α - 1 ). Then
Y ~ Logistic(0, 1)
(Hastings et al. 2000, p.127)

Gumbel Distribution Gumbel(α, β)
Let X_i ~ Gumbel(α, β) and Y = X₁ - X₂. Then
Y ~ Logistic(0, β)
(Hastings et al. 2000, p.127)

4.5. Normal (Gaussian) Continuous Distribution

[Figure: Normal probability density function f(t), cumulative density function F(t) and hazard rate h(t), plotted for μ = -3, 0, 3 with σ = 1, and for μ = 0 with σ = 0.6, 1, 2.]

Parameters & Description
μ (-∞ < μ < ∞) — Location parameter: the mean of the distribution.
σ > 0 — Scale parameter: the standard deviation of the distribution.

Limits
-∞ < t < ∞

Distribution Formulas

PDF
f(t) = (1/(σ√(2π)))·exp( -½((t - μ)/σ)² ) = (1/σ)·φ( (t - μ)/σ )
where φ is the standard normal pdf with μ = 0 and σ = 1.

CDF
F(t) = (1/(σ√(2π))) ∫_(-∞)^t exp( -½((θ - μ)/σ)² ) dθ = ½ + ½·erf( (t - μ)/(σ√2) ) = Φ( (t - μ)/σ )
where Φ is the standard normal cdf with μ = 0 and σ = 1.

Reliability
R(t) = 1 - Φ( (t - μ)/σ ) = Φ( (μ - t)/σ )

Conditional Survivor Function
m(x) = R(x|t) = R(t + x)/R(t) = Φ( (μ - t - x)/σ ) / Φ( (μ - t)/σ ),   P(T > t + x | T > t)
t is the given time we know the component has survived to; x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life
u(t) = ∫_t^∞ R(x) dx / R(t)

Hazard Rate
h(t) = φ( (t - μ)/σ ) / ( σ·Φ( (μ - t)/σ ) )

Cumulative Hazard Rate
H(t) = -ln[ Φ( (μ - t)/σ ) ]
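The reliability and conditional survivor functions above can be evaluated with `statistics.NormalDist` from the Python standard library. A minimal sketch (function names and values are illustrative):

```python
from statistics import NormalDist

def normal_reliability(t, mu, sigma):
    """R(t) = 1 - F(t) = Phi((mu - t)/sigma)"""
    return NormalDist().cdf((mu - t) / sigma)

def conditional_survivor(x, t, mu, sigma):
    """m(x) = R(t + x)/R(t): probability of surviving a further x,
    given survival to time t."""
    return normal_reliability(t + x, mu, sigma) / normal_reliability(t, mu, sigma)

R = normal_reliability(9.0, 10.0, 2.0)        # survival past t = 9 for mu=10, sigma=2
m = conditional_survivor(2.0, 9.0, 10.0, 2.0)  # survive 2 more, given survival to 9
```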

Properties and Moments
  Median: µ
  Mode: µ
  Mean (1st raw moment): µ
  Variance (2nd central moment): σ²
  Skewness (3rd central moment): 0
  Excess kurtosis (4th central moment): 0
  Characteristic Function: exp( iµt − (1/2)σ²t² )
  100γ% Percentile Function: t_γ = µ + σΦ⁻¹(γ) = µ + σ√2 erf⁻¹(2γ − 1)

Parameter Estimation

Plotting Method (least mean square, y = mx + c):
  X-Axis: t_i     Y-Axis: Φ⁻¹[ F(t_i) ]
  σ̂ = 1/m,   µ̂ = −c/m

Maximum Likelihood Function
  Likelihood function, for complete data:
    L(µ, σ|E) = Π_{i=1}^{n_F} (1/(σ√(2π))) exp[ −(1/2)((t_i − µ)/σ)² ]
              = (σ√(2π))^{−n_F} exp[ −(1/(2σ²)) Σ_{i=1}^{n_F} (t_i − µ)² ]
  Log-likelihood function:
    Λ(µ, σ|E) = −n_F ln(σ√(2π)) − (1/(2σ²)) Σ_{i=1}^{n_F} (t_i − µ)²
  Solve ∂Λ/∂µ = 0 for µ to get the MLE µ̂:
    ∂Λ/∂µ = (1/σ²)( Σ t_i − n_F µ ) = 0
  Solve ∂Λ/∂σ = 0 for σ to get σ̂:
    ∂Λ/∂σ = −n_F/σ + (1/σ³) Σ (t_i − µ)² = 0

MLE Point Estimates — When there is only complete failure data the point estimates are:
  µ̂ = (1/n_F) Σ_{i=1}^{n_F} t_i,    σ̂² = (1/n_F) Σ_{i=1}^{n_F} (t_i − µ̂)²
In most cases the unbiased estimator of the variance is used instead:
  σ̂² = (1/(n_F − 1)) Σ_{i=1}^{n_F} (t_i − µ̂)²

Fisher Information:
  I(µ, σ²) = [ 1/σ²      0
               0         1/(2σ⁴) ]

100γ% Confidence Intervals (for complete data):
  µ:  1-sided lower:  µ̂ − t_γ(n−1)·σ̂/√n
      2-sided lower:  µ̂ − t_{(1+γ)/2}(n−1)·σ̂/√n
      2-sided upper:  µ̂ + t_{(1+γ)/2}(n−1)·σ̂/√n
  σ²: 1-sided lower:  (n−1)σ̂² / χ²_γ(n−1)
      2-sided lower:  (n−1)σ̂² / χ²_{(1+γ)/2}(n−1)
      2-sided upper:  (n−1)σ̂² / χ²_{(1−γ)/2}(n−1)
  (Nelson 1982, pp.218-220)
  where t_γ(n−1) is the 100γth percentile of the t-distribution with n−1 degrees of freedom, and χ²_γ(n−1) is the 100γth percentile of the χ²-distribution with n−1 degrees of freedom.

Bayesian

Non-informative Priors when σ² is known, π₀(µ) (Yang and Berger 1998, p.22):
  Uniform Proper Prior with limits µ ∈ [a, b]:
    Prior: 1/(b − a) for a ≤ µ ≤ b
    Posterior: truncated normal distribution, c·Norm(µ; Σt_i/n_F, σ²/n_F) for µ ∈ [a, b]; otherwise π(µ) = 0
  All other non-informative priors, with µ ∈ (−∞, ∞):
    Prior: 1
    Posterior: Norm(µ; Σt_i/n_F, σ²/n_F)

Non-informative Priors when µ is known, π₀(σ²) (Yang and Berger 1998, p.23):
  Uniform Proper Prior with limits σ² ∈ [a, b]:
    Prior: 1/(b − a) for a ≤ σ² ≤ b
    Posterior: truncated inverse gamma distribution, c·IG(σ²; (n_F − 2)/2, S/2) for σ² ∈ [a, b]; otherwise π(σ²) = 0
  Uniform Improper Prior with limits σ² ∈ (0, ∞):
    Prior: 1
    Posterior: IG(σ²; (n_F − 2)/2, S/2). See section 1.7.1.
  Jeffreys, Reference, MDIP Priors with limits σ² ∈ (0, ∞):
    Prior: 1/σ²
    Posterior: IG(σ²; n_F/2, S/2). See section 1.7.1.

Non-informative Priors when µ and σ² are unknown, π₀(µ, σ²) (Yang and Berger 1998, p.23):
  Improper Uniform, prior 1, with limits µ ∈ (−∞, ∞), σ² ∈ (0, ∞):
    π(µ|E) ~ T( µ; n_F − 3, t̄, S/(n_F(n_F − 3)) ). See section 1.7.2.
    π(σ²|E) ~ IG( σ²; (n_F − 3)/2, S/2 ). See section 1.7.1.
  Jeffreys Prior, 1/σ⁴:
    π(µ|E) ~ T( µ; n_F + 1, t̄, S/(n_F(n_F + 1)) )
    π(σ²|E) ~ IG( σ²; (n_F + 1)/2, S/2 )
  Reference Prior with ordering {φ, σ}, where φ = µ/σ:
    π₀(φ, σ) ∝ 1/(σ√(2 + φ²)). No closed form posterior.
  Reference Prior where µ and σ are separate groups, and MDIP Prior, 1/σ²:
    π(µ|E) ~ T( µ; n_F − 1, t̄, S/(n_F(n_F − 1)) )
    π(σ²|E) ~ IG( σ²; (n_F − 1)/2, S/2 )
  where S = Σ_{i=1}^{n_F} (t_i − t̄)² and t̄ = (1/n_F) Σ_{i=1}^{n_F} t_i.

Conjugate Priors:
  UOI: µ from Norm(t; µ, σ²) with known σ²; evidence: n_F failures at times t_i.
    Prior: Norm(u₀, v₀). Posterior: Norm(u, v) with
      v = 1 / (1/v₀ + n_F/σ²),    u = v·( u₀/v₀ + Σt_i/σ² )
  UOI: λ = 1/σ² from Norm(t; µ, σ²) with known µ; evidence: n_F failures at times t_i.
    Prior: Gamma(k₀, λ₀). Posterior: Gamma(k, λ) with
      k = k₀ + n_F/2,    λ = λ₀ + (1/2) Σ (t_i − µ)²
  UOI: µ_N from Logn(t; µ_N, σ_N²) with known σ_N²; evidence: n_F failures at times t_i.
    Prior: Norm(u₀, v₀). Posterior: Norm(u, v) with
      v = 1 / (1/v₀ + n_F/σ_N²),    u = v·( u₀/v₀ + Σ ln(t_i)/σ_N² )

Description, Limitations and Uses

Example — The accuracy of a cutting machine used in manufacturing is to be measured. Five cuts at the required length are made and measured (mm) as:
  7.436, 10.270, 10.466, 11.039, 11.854

MLE estimates are:
  µ̂ = Σ t_i / n_F = 10.213
  σ̂² = (1/(n_F − 1)) Σ (t_i − µ̂)² = 2.789

90% confidence interval for µ:
  [ µ̂ − t_{0.95}(4)·σ̂/√5,  µ̂ + t_{0.95}(4)·σ̂/√5 ] = [10.163, 10.262]

90% confidence interval for σ²:
  [ 4σ̂²/χ²_{0.95}(4),  4σ̂²/χ²_{0.05}(4) ] = [1.176, 15.697]

A Bayesian point estimate using the Jeffreys non-informative improper prior 1/σ⁴, with posteriors µ ~ T(6, 10.213, 0.558²) and σ² ~ IG(3, 5.578), gives the point estimates:
  µ̂ = E[ T(6, 10.213, 0.558²) ] = 10.213
  σ̂² = E[ IG(3, 5.578) ] = 5.578/2 = 2.789
with 90% confidence intervals:
  µ:  [ F_T⁻¹(0.05) = 8.761,  F_T⁻¹(0.95) = 11.665 ]
  σ²: [ 1/F_G⁻¹(0.95) = 0.886,  1/F_G⁻¹(0.05) = 6.822 ]

Characteristics

Also known as a Gaussian distribution or bell curve.
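The point estimates in this example can be reproduced numerically (a sketch; the data are the five measurements above):

```python
# Cutting-machine measurements (mm) from the example above
data = [7.436, 10.270, 10.466, 11.039, 11.854]
n = len(data)

mu_hat = sum(data) / n                                         # MLE of the mean
s2_unbiased = sum((t - mu_hat) ** 2 for t in data) / (n - 1)   # unbiased variance

print(round(mu_hat, 3))       # 10.213
print(round(s2_unbiased, 3))  # 2.789
```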

Unit Normal Distribution. Also known as the standard normal distribution, with µ = 0 and σ = 1, pdf φ(z) and cdf Φ(z). If X is normally distributed with mean µ and standard deviation σ, the following transformation is used:
  z = (x − µ)/σ

Central Limit Theorem. Let X₁, X₂, …, X_n be a sequence of independent and identically distributed (i.i.d.) random variables, each having mean µ and variance σ². As the sample size n increases, the distribution of the sample average of these random variables approaches the normal distribution with mean µ and variance σ²/n, irrespective of the shape of the original distribution. Formally:
  S_n = X₁ + ⋯ + X_n
If we define the new random variables
  Z_n = (S_n − nµ)/(σ√n)   and   Y_n = S_n/n
then the distribution of Z_n converges to the standard normal distribution, and the distribution of Y_n converges to a normal distribution with mean µ and standard deviation σ/√n.
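The convergence can be illustrated with a quick simulation (a sketch; uniform summands are an arbitrary choice, not from the text):

```python
import random, statistics

random.seed(0)
n = 30          # summands per sample average
reps = 20000    # number of sample averages

# X_i ~ Uniform(0, 1): mu = 0.5, sigma^2 = 1/12
averages = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

# The sample averages should be close to Normal(0.5, 1/(12*30))
print(statistics.mean(averages))      # ~0.5
print(statistics.variance(averages))  # ~1/360
```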

The following are approximate coverage values for each sigma interval, Φ(µ + nσ) − Φ(µ − nσ):
  µ ± 1σ   68.2689492137%
  µ ± 2σ   95.4499736104%
  µ ± 3σ   99.7300203937%
  µ ± 4σ   99.9936657516%
  µ ± 5σ   99.9999426697%
  µ ± 6σ   99.9999998027%

Truncated Normal. Often in reliability engineering a truncated normal distribution may be used due to the limitation that t ≥ 0. See Truncated Normal Continuous Distribution.

Inflection Points. Inflection points of the pdf occur one standard deviation away from the mean (µ ± σ).
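The tabulated coverage values are 2Φ(n) − 1 = erf(n/√2); a quick check (a sketch):

```python
import math

def sigma_coverage(n):
    # P(mu - n*sigma < X < mu + n*sigma) = Phi(n) - Phi(-n) = erf(n/sqrt(2))
    return math.erf(n / math.sqrt(2))

for n in range(1, 7):
    print(n, f"{100 * sigma_coverage(n):.10f}%")
```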

Mean / Median / Mode. The mean, median and mode are always equal to µ.

Hazard Rate. The hazard rate is increasing for all t. The standard normal distribution's hazard rate approaches h(t) = t as t becomes large.

Let X ~ Norm(µ, σ²).
  Convolution property:
    Σ_{i=1}^{n} X_i ~ Norm( Σ µ_i, Σ σ_i² )
  Scaling property:
    aX + b ~ Norm( aµ + b, a²σ² )
  Linear combination property:
    Σ_{i=1}^{n} (a_i X_i + b_i) ~ Norm( Σ (a_iµ_i + b_i), Σ a_i²σ_i² )

Applications

Approximations to Other Distributions. The origin of the Normal distribution was as an approximation of the Binomial distribution. Due to the Central Limit Theorem the Normal distribution can be used to approximate many distributions, as detailed under ‘Related Distributions’.

Stress-Strength Interference. When the strength of a component follows a distribution, and the stress that component is subjected to follows a distribution, there exists a probability that the stress will be greater than the strength. When both distributions are normal distributions, there is a closed-form solution to the interference probability.

Life Distribution. When used as a life distribution a truncated Normal Distribution may be used due to the constraint t ≥ 0. However it is often found that the difference in results is negligible. (Rausand & Høyland 2004)
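The closed-form normal-normal interference result mentioned above is P(stress > strength) = Φ( −(µ_strength − µ_stress) / √(σ_strength² + σ_stress²) ); a sketch with illustrative numbers (not from the text):

```python
import math

def interference_probability(mu_strength, sd_strength, mu_stress, sd_stress):
    # Strength - Stress ~ Normal(mu_S - mu_s, sd_S^2 + sd_s^2);
    # failure occurs when the difference falls below zero.
    margin = mu_strength - mu_stress
    spread = math.sqrt(sd_strength ** 2 + sd_stress ** 2)
    return 0.5 * math.erfc(margin / (spread * math.sqrt(2)))

# Illustrative values: strength ~ N(50, 5^2), stress ~ N(35, 4^2)
p_fail = interference_probability(50, 5, 35, 4)
print(p_fail)
```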

Time Distributions. The normal distribution may be used to model simple repair or inspection tasks that have a typical duration with variation which is symmetrical about the mean. This is typical for inspection and preventative maintenance times.

Analysis of Variance (ANOVA). A test used to analyze variance and dependence of variables. A popular model used to conduct ANOVA assumes the data comes from a normal population.

Six Sigma. Six Sigma is a business management strategy which aims to reduce costs in manufacturing processes by removing variance in quality (defects). Current manufacturing standards aim for an expected 3.4 defects out of one million parts. (Six Sigma Academy 2009)

Resources
Online:
  http://www.weibull.com/LifeDataWeb/the_normal_distribution.htm
  http://mathworld.wolfram.com/NormalDistribution.html
  http://en.wikipedia.org/wiki/Normal_distribution
  http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution 2nd ed., CRC.

Simon, M.K., 2006. Probability Distributions Involving Gaussian Random Variables: A Handbook for Engineers and Scientists, Springer.

Relationship to Other Distributions

Truncated Normal Distribution, TruncNorm(x; µ, σ², a_L, b_U):
  Let X ~ Norm(µ, σ²) with X ∈ (−∞, ∞). Restricting X to [a_L, b_U] gives Y ~ TruncNorm(µ, σ², a_L, b_U) with Y ∈ [a_L, b_U].

Lognormal Distribution, Logn(t; µ_N, σ_N²):
  Let X ~ Logn(µ_N, σ_N²) and Y = ln(X). Then Y ~ Norm(µ_N, σ_N²), where
    µ_N = ln( µ² / √(µ² + σ²) ),    σ_N² = ln( 1 + σ²/µ² )
  relate the mean µ and variance σ² of the lognormal variable to the parameters µ_N and σ_N².

Rayleigh Distribution, Rayleigh(t; σ):
  Let X₁, X₂ ~ Norm(0, σ²) and Y = √(X₁² + X₂²). Then Y ~ Rayleigh(σ).

Chi-square Distribution, χ²(t; v):
  Let X_k ~ Norm(µ, σ²) and Y = Σ_{k=1}^{v} ((X_k − µ)/σ)². Then Y ~ χ²(v).

Binomial Distribution, Binom(k; n, p):
  Limiting case for constant p:
    lim_{n→∞} Binom(k; n, p) = Norm( k; µ = np, σ² = np(1 − p) )
  The Normal distribution can be used as an approximation of the Binomial distribution when np ≥ 10 and n(1 − p) ≥ 10, with continuity correction:
    Binom(k; p, n) ≈ Norm( t = k + 0.5; µ = np, σ² = np(1 − p) )

Poisson Distribution, Pois(k; µ):
  lim_{µ→∞} F_Pois(k; µ) = F_Norm( k; µ, σ = √µ )
  This is a good approximation when µ > 1000. When µ > 10 the same approximation can be made with a correction:
    F_Pois(k; µ) ≈ F_Norm( k; µ − 0.5, σ = √µ )

Beta Distribution, Beta(t; α, β):
  For large α and β with α/β fixed:
    Beta(α, β) ≈ Norm( µ = α/(α + β), σ² = αβ/[(α + β)²(α + β + 1)] )
  As α and β increase the mean remains constant and the variance is reduced.

Gamma Distribution, Gamma(k, λ):
  Special case for large k:
    lim_{k→∞} Gamma(k, λ) = Norm( µ = k/λ, σ² = k/λ² )
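The binomial approximation above (with continuity correction) can be checked against the exact CDF (a sketch):

```python
import math

def binom_cdf(k, n, p):
    # Exact: sum of binomial pmf terms up to k
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    # Norm(t = k + 0.5; mu = n*p, sigma^2 = n*p*(1-p)) continuity correction
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return 0.5 * (1 + math.erf((k + 0.5 - mu) / (sigma * math.sqrt(2))))

n, p = 100, 0.3   # np = 30 >= 10 and n(1-p) = 70 >= 10
exact = binom_cdf(30, n, p)
approx = normal_approx_cdf(30, n, p)
print(exact, approx)
```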


4.6. Pareto Continuous Distribution

[Figure: Probability Density Function - f(t), for various θ and α]

[Figure: Cumulative Density Function - F(t), for various θ and α]

[Figure: Hazard Rate - h(t), for various θ and α]

Parameters & Description
  θ — Location parameter, θ > 0. θ is the lower limit of t; sometimes referred to as t-minimum.
  α — Shape parameter, α > 0. Sometimes called the Pareto index.
  Limits: θ ≤ t < ∞

Distribution Formulas

  PDF:
    f(t) = αθ^α / t^(α+1)
  CDF:
    F(t) = 1 − (θ/t)^α
  Reliability:
    R(t) = (θ/t)^α
  Conditional Survivor Function, P(T > x + t | T > t):
    m(x) = R(x | t) = R(t + x)/R(t) = t^α / (t + x)^α
    where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.
  Mean Residual Life:
    u(t) = ∫_t^∞ R(x) dx / R(t)
  Hazard Rate:
    h(t) = α/t
  Cumulative Hazard Rate:
    H(t) = α ln(t/θ)

Properties and Moments
  Median: θ·2^(1/α)
  Mode: θ
  Mean (1st raw moment): αθ/(α − 1), for α > 1
  Variance (2nd central moment): αθ² / [(α − 1)²(α − 2)], for α > 2
  Skewness (3rd central moment): [2(1 + α)/(α − 3)]·√((α − 2)/α), for α > 3
  Excess kurtosis (4th central moment): 6(α³ + α² − 6α − 2) / [α(α − 3)(α − 4)], for α > 4
  Characteristic Function: α(−iθt)^α Γ(−α, −iθt)
  100γ% Percentile Function: t_γ = θ(1 − γ)^(−1/α)

Parameter Estimation

Plotting Method (least mean square, y = mx + c):
  X-Axis: ln(t_i)    Y-Axis: ln[1 − F]
  α̂ = −m,   θ̂ = exp(c/α̂)

Maximum Likelihood Function
  Likelihood function, for complete data:
    L(θ, α|E) = α^{n_F} θ^{α n_F} / Π_{i=1}^{n_F} t_i^{α+1}
  Log-likelihood function:
    Λ(θ, α|E) = n_F ln(α) + n_F α ln(θ) − (α + 1) Σ_{i=1}^{n_F} ln(t_i)
  Solve ∂Λ/∂α = 0 for α to get α̂:
    ∂Λ/∂α = n_F/α + n_F ln(θ) − Σ ln(t_i) = 0

MLE Point Estimates — The likelihood function increases as θ increases. Therefore the MLE point estimate is the largest θ which satisfies θ ≤ t_i < ∞:
  θ̂ = min(t₁, …, t_{n_F})
Substituting θ̂ gives the MLE for α:
  α̂ = n_F / Σ_{i=1}^{n_F} ( ln t_i − ln θ̂ )

Fisher Information:
  I(θ, α) = [ α/θ²     0
              0        1/α² ]

100γ% Confidence Intervals (for complete data):
  α, if θ is unknown:
    1-sided lower:  (α̂/2n_F)·χ²_{1−γ}(2n_F − 2)
    2-sided lower:  (α̂/2n_F)·χ²_{(1−γ)/2}(2n_F − 2)
    2-sided upper:  (α̂/2n_F)·χ²_{(1+γ)/2}(2n_F − 2)
  α, if θ is known:
    1-sided lower:  (α̂/2n_F)·χ²_{1−γ}(2n_F)
    2-sided lower:  (α̂/2n_F)·χ²_{(1−γ)/2}(2n_F)
    2-sided upper:  (α̂/2n_F)·χ²_{(1+γ)/2}(2n_F)
  (Johnson et al. 1994, p.583)
  where χ²_γ(n) is the 100γth percentile of the χ²-distribution with n degrees of freedom.

Bayesian

Non-informative Priors when θ is known, π₀(α) (Yang and Berger 1998, p.22):
  Jeffreys and Reference Prior: 1/α

Conjugate Priors:
  UOI: b, the upper limit of Unif(t; a, b) with known a; evidence: n_F failures at times t_i.
    Prior: Pareto(θ_o, α₀). Posterior: Pareto(θ, α) with θ = max{θ_o, t₁, …, t_{n_F}},  α = α₀ + n_F.
  UOI: Θ from Pareto(t; Θ, α) with known α; evidence: n_F failures at times t_i.
    Prior: Pareto(Θ₀, a₀). Posterior: Pareto(Θ₀, a₀ − αn_F), where a₀ > αn_F.
  UOI: α from Pareto(t; θ, α) with known θ; evidence: n_F failures at times t_i.
    Prior: Gamma(k₀, λ₀). Posterior: Gamma(k, λ) with k = k₀ + n_F,  λ = λ₀ + Σ_{i=1}^{n_F} ln(t_i/θ).

Description, Limitations and Uses

Example — Five components are put on a test with the following failure times:
  108, 125, 458, 893, 13437 hours

MLE estimates are:
  θ̂ = min(t_i) = 108
Substituting θ̂ gives the MLE for α:
  α̂ = 5 / Σ_{i=1}^{5} ( ln t_i − ln(108) ) = 0.8029
90% confidence interval for α:
  [ (α̂/10)·χ²_{0.05}(8),  (α̂/10)·χ²_{0.95}(8) ] = [0.2194, 1.2451]

Characteristics

80/20 Rule. Most commonly described as the basis for the “80/20 rule” (in a quality context, for example, 80% of manufacturing defects will result from 20% of the causes).
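The closed-form MLEs θ̂ = min tᵢ and α̂ = n_F / Σ(ln tᵢ − ln θ̂) can be exercised numerically (a sketch; the data below are synthetic and illustrative, not the example above):

```python
import math

def pareto_mle(times):
    # theta_hat is the smallest observation; alpha_hat from the closed form
    theta = min(times)
    alpha = len(times) / sum(math.log(t / theta) for t in times)
    return theta, alpha

theta_hat, alpha_hat = pareto_mle([2.0, 4.0, 8.0])
print(theta_hat, alpha_hat)  # 2.0 and 1/ln(2) ≈ 1.4427
```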

Conditional Distribution. The conditional probability distribution, given that the event is greater than or equal to a value θ₁ exceeding θ, is a Pareto distribution with the same index α but with minimum θ₁ instead of θ.

Types. This distribution is known as a Pareto distribution of the first kind. The Pareto distribution of the second kind (not detailed here) is also known as the Lomax distribution. Pareto also proposed a third distribution, now known as a Pareto distribution of the third kind.

Pareto and the Lognormal Distribution. The Lognormal distribution models similar physical phenomena as the Pareto distribution. The two distributions have different weights at the extremities.

Minimum Property. Let X_i ~ Pareto(θ, α_i). Then for constant θ:
  min{X₁, X₂, …, X_n} ~ Pareto( θ, Σ_{i=1}^{n} α_i )

Applications

Rare Events. The survival function ‘slowly’ decreases compared to most life distributions, which makes it suitable for modeling rare events which have large outcomes. Examples include natural events such as the distribution of daily rainfall, or the size of manufacturing defects.

Resources
Online:
  http://mathworld.wolfram.com/ParetoDistribution.html
  http://en.wikipedia.org/wiki/Pareto_distribution
  http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Arnold, B., 1983. Pareto distributions, Fairland, MD: International Co- operative Pub. House.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1994. Continuous Univariate Distributions, Vol. 1 2nd ed., Wiley-Interscience.

Relationship to Other Distributions

Exponential Distribution, Exp(t; λ):
  Let Y ~ Pareto(θ, α) and X = ln(Y/θ). Then X ~ Exp(λ = α).

Chi-Squared Distribution, χ²(x; v):
  Let Y ~ Pareto(θ, α) and X = 2α ln(Y/θ). Then X ~ χ²(v = 2). (Johnson et al. 1994, p.526)

Logistic Distribution, Logistic(µ, s):
  Let X ~ Pareto(θ, α) and Y = −ln[ (X/θ)^α − 1 ]. Then Y ~ Logistic(0, 1). (Hastings et al. 2000, p.127)
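The exponential relationship can be verified through the CDFs, since P(ln(Y/θ) ≤ x) = 1 − (θ/(θe^x))^α = 1 − e^(−αx). A numerical sketch (θ and α chosen arbitrarily):

```python
import math

theta, alpha = 2.0, 3.0

def pareto_cdf(t):
    # F(t) = 1 - (theta/t)^alpha for t >= theta
    return 1 - (theta / t) ** alpha if t >= theta else 0.0

for x in (0.1, 0.7, 2.5):
    # X = ln(Y/theta) <= x  <=>  Y <= theta * e^x
    lhs = pareto_cdf(theta * math.exp(x))
    rhs = 1 - math.exp(-alpha * x)   # Exp(lambda = alpha) CDF
    print(x, lhs, rhs)
```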


4.7. Triangle Continuous Distribution

[Figure: Probability Density Function - f(t), for various a, b and c]

[Figure: Cumulative Density Function - F(t), for various a, b and c]

[Figure: Hazard Rate - h(t), for various a, b and c]

Parameters & Description
  a — Minimum value, −∞ < a < b. a is the lower bound.
  b — Maximum value, a < b < ∞. b is the upper bound.
  c — Mode value, a ≤ c ≤ b. c is the mode of the distribution (the top of the triangle).
  Random variable: a ≤ t ≤ b

Distribution Formulas
  PDF:
    f(t) = 2(t − a) / [(b − a)(c − a)]    for a ≤ t ≤ c
    f(t) = 2(b − t) / [(b − a)(b − c)]    for c ≤ t ≤ b
  CDF:
    F(t) = (t − a)² / [(b − a)(c − a)]        for a ≤ t ≤ c
    F(t) = 1 − (b − t)² / [(b − a)(b − c)]    for c ≤ t ≤ b
  Reliability:
    R(t) = 1 − (t − a)² / [(b − a)(c − a)]    for a ≤ t ≤ c
    R(t) = (b − t)² / [(b − a)(b − c)]        for c ≤ t ≤ b

Properties and Moments
  Median:
    a + √( (b − a)(c − a)/2 )    for c ≥ (a + b)/2
    b − √( (b − a)(b − c)/2 )    for c < (a + b)/2
  Mode: c
  Mean (1st raw moment): (a + b + c)/3
  Variance (2nd central moment): (a² + b² + c² − ab − ac − bc)/18
  Skewness (3rd central moment):
    √2 (a + b − 2c)(2a − b − c)(a − 2b + c) / [ 5(a² + b² + c² − ab − ac − bc)^(3/2) ]
  Excess kurtosis (4th central moment): −3/5
  Characteristic Function:
    −2 [ (b − c)e^{ita} − (b − a)e^{itc} + (c − a)e^{itb} ] / [ (b − a)(c − a)(b − c)t² ]
  100γ% Percentile Function:
    t_γ = a + √( γ(b − a)(c − a) )          for γ < F(c)
    t_γ = b − √( (1 − γ)(b − a)(b − c) )    for γ ≥ F(c)

Parameter Estimation

Maximum Likelihood Function
  Likelihood function:
    L(a, b, c|E) = Π_{i=1}^{r} 2(t_i − a)/[(b − a)(c − a)] · Π_{i=r+1}^{n_F} 2(b − t_i)/[(b − a)(b − c)]
    (the first product is over failures to the left of c, the second over failures to the right of c)
  where the failure times are ordered T₁ ≤ T₂ ≤ ⋯ ≤ T_r ≤ ⋯ ≤ T_{n_F}, r is the number of failure times less than c, and s is the number of failure times greater than c; therefore n_F = r + s.

MLE Point Estimates — The MLE estimates â, b̂ and ĉ are obtained by numerically calculating the likelihood function for different r and selecting the maximum, where ĉ = t_r̂:
  max_{a ≤ c ≤ b} L(a, b, c|E) = (2/(b − a))^{n_F} · M(a, b, r̂(a, b))
  where
    M(a, b, r) = Π_{i=1}^{r−1} (t_i − a)/(t_r − a) · Π_{i=r+1}^{n_F} (b − t_i)/(b − t_r)
    r̂(a, b) = arg max_{r ∈ {1, …, n_F}} M(a, b, r)
Note that the MLE estimates for a and b are not the same as for the uniform distribution:
  â ≠ min(t₁, t₂, …),    b̂ ≠ max(t₁, t₂, …)
(Kotz & Dorp 2004)

Description, Limitations and Uses

Example — When eliciting an opinion from an expert on the possible value of a quantity x, the expert may give:
  - Lowest possible value = 0
  - Highest possible value = 1
  - Estimate of most likely value (mode) = 0.7

The corresponding distribution for x may be a triangle distribution with parameters:
  a = 0, b = 1, c = 0.7
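The percentile function doubles as an inverse-transform sampler; a sketch using the elicited parameters a = 0, b = 1, c = 0.7:

```python
import math, random

def triangle_ppf(g, a, b, c):
    # Percentile function: invert the piecewise CDF at probability g
    f_c = (c - a) / (b - a)   # F(c)
    if g < f_c:
        return a + math.sqrt(g * (b - a) * (c - a))
    return b - math.sqrt((1 - g) * (b - a) * (b - c))

# Inverse-transform sampling with the elicited parameters
random.seed(1)
sample = [triangle_ppf(random.random(), 0.0, 1.0, 0.7) for _ in range(10000)]
print(sum(sample) / len(sample))  # close to the mean (0 + 1 + 0.7)/3
```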

Characteristics

Standard Triangle Distribution. The standard triangle distribution has a = 0, b = 1. Its mean is (1 + c)/3 and, for c ≤ 1/2, its median is 1 − √((1 − c)/2).

Symmetrical Triangle Distribution. The symmetrical triangle distribution occurs when c = (a + b)/2. The symmetrical triangle distribution is formed from the average of two uniform random variables (see related distributions).

Applications

Subjective Representation. The triangle distribution is often used to model subjective evidence, where a and b are the bounds of the estimation and c is an estimation of the mode.

Substitution for the Beta Distribution. Due to the triangle distribution having bounded support, it may be used in place of the beta distribution.

Monte Carlo Simulation. Used to approximate distributions of variables when the underlying distribution is unknown. A distribution of interest is obtained by conducting Monte Carlo simulation of a model using triangle distributions as inputs.

Resources
Online:
  http://mathworld.wolfram.com/TriangularDistribution.html
  http://en.wikipedia.org/wiki/Triangular_distribution

Books: Kotz, S. & Dorp, J.R.V., 2004. Beyond Beta: Other Continuous Families Of Distributions With Bounded Support And Applications, World Scientific Publishing Company.

Relationship to Other Distributions

Uniform Distribution, Unif(t; a, b):
  Let X_i ~ Unif(a, b) and Y = (X₁ + X₂)/2. Then Y ~ Triangle(a, (a + b)/2, b).

Beta Distribution, Beta(t; α, β):
  Special cases:
    Beta(1, 2) = Triangle(0, 0, 1)
    Beta(2, 1) = Triangle(0, 1, 1)

4.8. Truncated Normal Continuous Distribution

[Figure: Probability Density Function - f(t), for various µ, σ and truncation limits]

[Figure: Cumulative Density Function - F(t), for various µ, σ and truncation limits]

[Figure: Hazard Rate - h(t), for various µ, σ and truncation limits]

Parameters & Description
  µ — Location parameter, −∞ < µ < ∞. The mean of the (untruncated) distribution.
  σ — Scale parameter, σ > 0. The standard deviation of the (untruncated) distribution.
  a_L — Lower bound, −∞ < a_L < b_U. The standard normal transform of a_L is z_a = (a_L − µ)/σ.
  b_U — Upper bound, a_L < b_U < ∞. The standard normal transform of b_U is z_b = (b_U − µ)/σ.
  Limits: a_L < x ≤ b_U

Distribution Formulas — given for the left truncated normal, x ∈ [0, ∞), and the general truncated normal, x ∈ [a_L, b_U]. Here φ is the standard normal pdf (µ = 0, σ = 1), Φ is the standard normal cdf, z_i = (i − µ)/σ, and z₀ = −µ/σ is the transform of the truncation point 0.

  PDF:
    Left truncated:  f(x) = φ(z_x) / ( σ Φ(−z₀) )   for 0 ≤ x ≤ ∞;  f(x) = 0 otherwise
    General:         f(x) = φ(z_x) / ( σ [Φ(z_b) − Φ(z_a)] )   for a_L ≤ x ≤ b_U;  f(x) = 0 otherwise

  CDF:
    Left truncated:  F(x) = 0 for x < 0
                     F(x) = [Φ(z_x) − Φ(z₀)] / [1 − Φ(z₀)]   for 0 ≤ x < ∞
    General:         F(x) = 0 for x < a_L
                     F(x) = [Φ(z_x) − Φ(z_a)] / [Φ(z_b) − Φ(z_a)]   for a_L ≤ x ≤ b_U
                     F(x) = 1 for x > b_U

  Reliability:
    Left truncated:  R(x) = 1 for x < 0
                     R(x) = [1 − Φ(z_x)] / [1 − Φ(z₀)]   for 0 ≤ x < ∞
    General:         R(x) = 1 for x < a_L
                     R(x) = [Φ(z_b) − Φ(z_x)] / [Φ(z_b) − Φ(z_a)]   for a_L ≤ x ≤ b_U
                     R(x) = 0 for x > b_U
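A numerical sketch of the general-case formulas (Φ via erf; passing b_U = ∞ reproduces the left truncated case):

```python
import math

def Phi(z):
    # Standard normal cdf, extended to +/- infinity
    if z == math.inf:
        return 1.0
    if z == -math.inf:
        return 0.0
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def phi(z):
    # Standard normal pdf; zero in the limit at +/- infinity
    if math.isinf(z):
        return 0.0
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def tnorm_cdf(x, mu, sigma, a, b):
    # F(x) = [Phi(z_x) - Phi(z_a)] / [Phi(z_b) - Phi(z_a)] on [a, b]
    za, zb, zx = (a - mu) / sigma, (b - mu) / sigma, (x - mu) / sigma
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (Phi(zx) - Phi(za)) / (Phi(zb) - Phi(za))

def tnorm_mean(mu, sigma, a, b):
    # mean = mu + sigma * (phi(z_a) - phi(z_b)) / (Phi(z_b) - Phi(z_a))
    za, zb = (a - mu) / sigma, (b - mu) / sigma
    return mu + sigma * (phi(za) - phi(zb)) / (Phi(zb) - Phi(za))
```

For µ = 0, σ = 1 truncated to [0, ∞) this gives the half-normal mean √(2/π) ≈ 0.7979.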

Conditional Survivor Function, P(T > x + t | T > t):
  Left truncated, for 0 ≤ t < ∞:
    m(x) = R(x | t) = R(t + x)/R(t) = [1 − Φ(z_{t+x})] / [1 − Φ(z_t)] = Φ((µ − t − x)/σ) / Φ((µ − t)/σ)
  General, for a_L ≤ t ≤ b_U:
    m(x) = R(t + x)/R(t) = [Φ(z_b) − Φ(z_{t+x})] / [Φ(z_b) − Φ(z_t)];  m(x) = 0 for t > b_U
  Here t is the given time we know the component has survived to, and x is a random variable defined as the time after t (x = 0 at t). This operation is equivalent to replacing the lower bound with t.

Mean Residual Life:
  u(t) = ∫_t^∞ R(x) dx / R(t)

Hazard Rate:
  Left truncated:  h(x) = 0 for x < 0
                   h(x) = φ(z_x) / ( σ[1 − Φ(z_x)] )   for 0 ≤ x < ∞
  General:         h(x) = 0 for x < a_L
                   h(x) = φ(z_x) / ( σ[Φ(z_b) − Φ(z_x)] )   for a_L ≤ x ≤ b_U;  h(x) = 0 for x > b_U

Cumulative Hazard Rate:
  H(t) = −ln[ R(t) ]

Properties and Moments — given for the left truncated normal, x ∈ [0, ∞), and the general truncated normal, x ∈ [a_L, b_U]:

  Median: no closed form.
  Mode:
    µ     when µ ∈ [a_L, b_U]  (left truncated: µ when µ ≥ 0, and 0 when µ < 0)
    a_L   when µ < a_L
    b_U   when µ > b_U
  Mean (1st raw moment):
    µ − σΔ₀, i.e.
    Left truncated:  µ + σ φ(z₀)/Φ(−z₀)
    General:         µ + σ [φ(z_a) − φ(z_b)] / [Φ(z_b) − Φ(z_a)]
  Variance (2nd central moment):
    σ² [ 1 − Δ₁ − Δ₀² ]
  where, for the general case,
    Δ_k = [ z_b^k φ(z_b) − z_a^k φ(z_a) ] / [ Φ(z_b) − Φ(z_a) ]
  and for the left truncated case (z_a = z₀, z_b → ∞),
    Δ_k = −z₀^k φ(z₀) / [ 1 − Φ(z₀) ]
  Skewness (3rd central moment, standardized):
    −[ 2Δ₀³ + (3Δ₁ − 1)Δ₀ + Δ₂ ] / V^{3/2},  where V = 1 − Δ₁ − Δ₀²
  Excess kurtosis (4th central moment, standardized):
    [ 3 − 3Δ₁ − Δ₃ − 4Δ₂Δ₀ − 2Δ₀² − 6Δ₁Δ₀² − 3Δ₀⁴ ] / V² − 3

  Characteristic Function: see (Abadir & Magdalinos 2002, pp.1276-1287).
  100α% Percentile Function:
    Left truncated:  t_α = µ + σΦ⁻¹{ α + Φ(z₀)[1 − α] }
    General:         t_α = µ + σΦ⁻¹{ αΦ(z_b) + Φ(z_a)[1 − α] }

Parameter Estimation

Maximum Likelihood Function
  Likelihood function, for limits [a_L, b_U]:
    L(µ, σ, a_L, b_U|E) = Π_{i=1}^{n_F} φ(z_{x_i}) / ( σ[Φ(z_b) − Φ(z_a)] )
      = ( σ√(2π) [Φ(z_b) − Φ(z_a)] )^{−n_F} exp[ −(1/(2σ²)) Σ (x_i − µ)² ]
  For limits [0, ∞):
    L(µ, σ|E) = ( σ√(2π) Φ(−z₀) )^{−n_F} exp[ −(1/(2σ²)) Σ (x_i − µ)² ]
  Log-likelihood function, for limits [a_L, b_U]:
    Λ(µ, σ, a_L, b_U|E) = −n_F ln[Φ(z_b) − Φ(z_a)] − n_F ln(σ√(2π)) − (1/(2σ²)) Σ_{i=1}^{n_F} (x_i − µ)²
  For limits [0, ∞):
    Λ(µ, σ|E) = −n_F ln( Φ(−z₀) ) − n_F ln(σ√(2π)) − (1/(2σ²)) Σ_{i=1}^{n_F} (x_i − µ)²

Solving the score equations:
  ∂Λ/∂µ = −(n_F/σ)·[φ(z_a) − φ(z_b)] / [Φ(z_b) − Φ(z_a)] + (1/σ²) Σ (x_i − µ) = 0
  ∂Λ/∂σ = −(n_F/σ)·[z_aφ(z_a) − z_bφ(z_b)] / [Φ(z_b) − Φ(z_a)] − n_F/σ + (1/σ³) Σ (x_i − µ)² = 0

MLE Point Estimates — First estimate the values for z_a and z_b by solving the simultaneous equations numerically (Cohen 1991, p.33):
  H₁(z_a, z_b) = [ Q_a − Q_b − z_a ] / [ z_b − z_a ] = (x̄ − a_L)/(b_U − a_L)
  H₂(z_a, z_b) = [ 1 + z_aQ_a − z_bQ_b − (Q_a − Q_b)² ] / (z_b − z_a)² = s²/(b_U − a_L)²
where
  Q_a = φ(z_a)/[Φ(z_b) − Φ(z_a)],   Q_b = φ(z_b)/[Φ(z_b) − Φ(z_a)]
  z_a = (a_L − µ)/σ,   z_b = (b_U − µ)/σ
  x̄ = (1/n_F) Σ x_i,   s² = (1/(n_F − 1)) Σ (x_i − x̄)²
The distribution parameters can then be estimated using:
  σ̂ = (b_U − a_L)/(ẑ_b − ẑ_a),   µ̂ = a_L − σ̂ẑ_a
(Cohen 1991, p.44) provides a graphical procedure to estimate parameters to use as the starting point for numerical solvers.

For the case where the limits are [0, ∞), first numerically solve for z₀:
  [ 1 − Q₀(Q₀ − z₀) ] / (Q₀ − z₀)² = s²/x̄²
where
  Q₀ = φ(z₀)/[1 − Φ(z₀)]
The distribution parameters can then be estimated using:
  σ̂ = x̄/(Q̂₀ − ẑ₀),   µ̂ = −σ̂ẑ₀

When the limits a_L and b_U are unknown, the likelihood function is maximized when the difference Φ(z_b) − Φ(z_a) is at its minimum, which occurs when the difference b_U − a_L is at its minimum. Therefore the MLE estimates for a_L and b_U are:
  â_L = min(t₁, t₂, …),   b̂_U = max(t₁, t₂, …)

Fisher Information (Cohen 1991, p.40; see there for the derivation):
  I(µ, σ) = (1/σ²) [ 1 + Q'_a − Q'_b                 2(x̄ − µ)/σ + λ_a − λ_b
                     2(x̄ − µ)/σ + λ_a − λ_b          3[s² + (x̄ − µ)²]/σ² − 1 + η_a − η_b ]
  where
    Q'_a = Q_a(Q_a − z_a),   Q'_b = −Q_b(Q_b + z_b)
    λ_a = a_L Q'_a + Q_a,    λ_b = b_U Q'_b + Q_b
    η_a = a_L λ_a + Q_a,     η_b = b_U λ_b + Q_b

100γ% Confidence Intervals — calculated from the Fisher information matrix. See section 1.4.7. For further detail and examples see (Cohen 1991, p.41).

Bayesian — No closed form solutions to priors exist.

Description, Limitations and Uses

Example 1 — The size of washers delivered from a manufacturer is to be modeled. The manufacturer has already removed all washers below 15.95 mm and washers above 16.05 mm. The washers received have the following diameters:

15.976, 15.970, 15.955, 16.007, 15.966, 15.952, 15.955 From data: = 15.973, = 4.3950 -4 𝑚𝑚𝑚𝑚 2 Using numerical solver𝑥𝑥̅ MLE Estimates𝑠𝑠 for and𝐸𝐸 are:

𝑎𝑎 𝑏𝑏 = 0, = 3.3351𝑧𝑧 𝑧𝑧 Therefore 𝑧𝑧�𝑎𝑎 �𝑧𝑧𝑏𝑏 = = 0.029984 𝑈𝑈 𝐿𝐿 𝑏𝑏 − 𝑎𝑎 𝜎𝜎� 𝑏𝑏 𝑎𝑎 𝑧𝑧�= − 𝑧𝑧� = 15.95

𝐿𝐿 𝑎𝑎 To calculate confidence intervals,𝜇𝜇̂ 𝑎𝑎 − first𝜎𝜎�𝑧𝑧� calculate: Truncated Normal Continuous Distribution 141

= 0.63771, = 0.010246 ′ = 10.970, ′ = 0.16138 𝑎𝑎 𝑏𝑏 𝑄𝑄 = 187.71, 𝑄𝑄 = −2.54087 𝑎𝑎 𝑏𝑏 𝜆𝜆 𝜆𝜆 − 𝑎𝑎 𝑏𝑏 90% confidence intervals𝜂𝜂 : 𝜂𝜂 − 391.57 10699 ( , ) = 10699 209183 − 𝐼𝐼 𝜇𝜇 𝜎𝜎 � 1.1835 -4 � 6.0535 -6 [ ( , )] = [ ( , )−] = − 6.0535 -6 2.2154 -7 −1 −1 𝐸𝐸 − 𝐸𝐸 𝐽𝐽𝑛𝑛 𝜇𝜇̂ 𝜎𝜎� 𝑛𝑛𝐹𝐹𝐼𝐼 𝜇𝜇̂ 𝜎𝜎� � � 90% confidence interval for : − 𝐸𝐸 − 𝐸𝐸 1(0.95) 1.1835 -4, + 1(0.95) 1.1835 -4 𝜇𝜇 − [15.932, 15.968]− − Φ Φ �𝜇𝜇̂ √ 𝐸𝐸 𝜇𝜇̂ √ 𝐸𝐸 � 90% confidence interval for : (0.95) 2.2154 -7 (0.95) 2.2154 -7 . exp , . exp −1 𝜎𝜎 −1 𝛷𝛷 √ 𝐸𝐸 𝛷𝛷 √ 𝐸𝐸 �𝜎𝜎� � [2.922 -2�, 3𝜎𝜎�.0769� -2] �� −𝜎𝜎� 𝜎𝜎� An estimate can be made on𝐸𝐸 how many washers𝐸𝐸 the manufacturer Trunc Normal discards:

The distribution of washer sizes is a Normal Distribution with estimated parameters μ̂ = 15.95, σ̂ = 0.029984. The percentage of washers which pass is:

  F(16.05) − F(15.95) = 49.96%

It is likely that there is too much variance in the manufacturing process for this system to be efficient.
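The pass percentage above can be checked numerically. A minimal sketch using only the standard library (the helper name `norm_cdf` is ours, not from the text):

```python
from math import erf, sqrt

def norm_cdf(x, mu, sigma):
    """CDF of a Normal(mu, sigma) distribution evaluated at x."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

# MLE estimates from Example 1
mu_hat, sigma_hat = 15.95, 0.029984

# Fraction of production inside the screening limits [15.95, 16.05]
frac_pass = norm_cdf(16.05, mu_hat, sigma_hat) - norm_cdf(15.95, mu_hat, sigma_hat)
print(f"{frac_pass:.4f}")   # ~0.4996, i.e. 49.96%
```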

Example 2

The following example adjusts the calculations used in the Normal Distribution to account for the fact that the limit on distance is [0, ∞).

The accuracy of a cutting machine used in manufacturing is to be measured. Five cuts at the required length are made and measured as:

  7.436, 10.270, 10.466, 11.039, 11.854 mm

From the data: x̄ = 10.213, s² = 2.789

Using a numerical solver, the MLE estimate for z_0 is:

  ẑ_0 = −4.5062

Therefore:

  σ̂ = x̄/(Q_0 − ẑ_0) = 2.26643

  μ̂ = −σ̂ ẑ_0 = 10.213

To calculate confidence intervals, first calculate:

  Q'_0 = 7.0042E-5,   λ_0 = 1.5543E-5,   λ_b = −0.16138

90% confidence intervals:

  I(μ, σ) = [ 0.19466      −2.9453E-6 ]
            [ −2.9453E-6   0.12237    ]

  [J_n(μ̂, σ̂)]⁻¹ = [n_F I(μ̂, σ̂)]⁻¹ = [ 1.0274      2.4728E-5 ]
                                     [ 2.4728E-5   1.6343    ]

90% confidence interval for μ:

  [ μ̂ − Φ⁻¹(0.95)√1.0274,  μ̂ + Φ⁻¹(0.95)√1.0274 ] = [8.546, 11.88]

90% confidence interval for σ:

  [ σ̂·exp(−Φ⁻¹(0.95)√1.6343/σ̂),  σ̂·exp(Φ⁻¹(0.95)√1.6343/σ̂) ] = [0.8962, 5.732]

To compare these results to a non-truncated normal distribution:

                       90% Lower CI   Point Est   90% Upper CI
  Norm - Classical μ       10.163       10.213       10.262
  Norm - Classical σ²       1.176        2.789       15.697
  Norm - Bayesian  μ        8.761       10.213       11.665
  Norm - Bayesian  σ²       0.886        2.789        6.822
  TNorm            μ        8.546       10.213       11.88
  TNorm            σ²       0.80317      5.1367      32.856

  *Note: The TNorm σ² estimate and interval are the squared values of the σ estimates above.

The truncated normal produced wider confidence intervals for the parameter estimates; however, the point estimates were within each other's confidence intervals. In this case the truncation correction might be ignored for ease of calculation.

Characteristics

For large μ/σ the truncation may have a negligible effect. In this case the Normal Continuous Distribution may be used as an approximation.

Let X_i ~ TNorm(μ_i, σ_i²) where X_i ∈ [a_i, b_i].

Convolution Property. The sum of truncated normal random variables is not a truncated normal distribution. When truncation is symmetrical about the mean, the sum of truncated normal random variables is well approximated using:

  Y = Σ_{i=1}^{n} X_i,   with μ_i = (a_i + b_i)/2

  Y ≈ TNorm( Σμ_i, ΣVar(X_i) )   where Y ∈ [Σa_i, Σb_i]

Linear Transformation Property (Cozman & Krotkov 1997):

  Y = cX + d ~ TNorm(cμ + d, c²σ²)   where Y ∈ [ca + d, cb + d]

Applications

Life Distribution. When used as a life distribution, a truncated Normal Distribution may be used due to the constraint t ≥ 0. However it is often found that the difference in results is negligible. (Rausand & Høyland 2004)
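The linear transformation property above is easy to sanity-check by simulation; a minimal sketch (rejection sampling is a convenience method chosen here, not the book's):

```python
import random

random.seed(1)

def tnorm_sample(mu, sigma, a, b):
    """One TNorm(mu, sigma^2) variate on [a, b] via rejection sampling."""
    while True:
        x = random.gauss(mu, sigma)
        if a <= x <= b:
            return x

mu, sigma, a, b = 10.0, 2.0, 8.0, 13.0
c, d = 3.0, 1.0                                   # transformation Y = cX + d
ys = [c * tnorm_sample(mu, sigma, a, b) + d for _ in range(5000)]

# Per the property, Y lies in [c*a + d, c*b + d] = [25, 40]
print(min(ys) >= 25.0 and max(ys) <= 40.0)        # True
```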

Repair Time Distributions. The truncated normal distribution may be used to model simple repair or inspection tasks that have a typical duration with little variation, using the limits [0, ∞).

Failures After Pre-test Screening. When a customer receives a product from a vendor, the product may have already been subject to burn-in testing. The customer will not know the number of failures which occurred during the burn-in, but may know the duration. As such the failure distribution is left truncated. (Meeker & Escobar 1998, p.269)

Flaws under the inspection threshold. When a flaw is not detected due to the flaw’s amplitude being less than the inspection threshold the distribution is left truncated. (Meeker & Escobar 1998, p.266)

Worst Case Measurements. Sometimes only the worst performers from a population are monitored and have data collected. Therefore the threshold which determined that the item be monitored is the truncation limit. (Meeker & Escobar 1998, p.267)

Screening Out Units With Large Defects. In quality control processes it may be common to remove defects which exceed a limit. The remaining population of defects delivered to the customer has a right truncated distribution. (Meeker & Escobar 1998, p.270)

Resources Online: http://en.wikipedia.org/wiki/Truncated_normal_distribution http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc) http://www.ntrand.com/truncated-normal-distribution/

Books:
Cohen, A.C., 1991. Truncated and Censored Samples, 1st ed., CRC Press.

Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution 2nd ed., CRC.

Schneider, H., 1986. Truncated and censored samples from normal populations, M. Dekker.

Relationship to Other Distributions

Normal Distribution Norm(x; μ, σ²)
Let:
  X ~ Norm(μ, σ²) with X ∈ (−∞, ∞)
Then truncating X to [a_L, b_U] gives:
  Y ~ TNorm(μ, σ², a_L, b_U) where Y ∈ [a_L, b_U]
For further relationships see the Normal Continuous Distribution.


4.9. Uniform Continuous Distribution

[Figures: Probability Density Function f(t), Cumulative Density Function F(t) and Hazard Rate h(t) of the uniform distribution, plotted for the parameter sets a=0, b=1; a=0.75, b=1.25; a=0.25, b=1.5.]


Parameters & Description

Parameters:
  a — Minimum Value. The lower bound of the uniform distribution, a < b.
  b — Maximum Value. The upper bound of the uniform distribution, a < b < ∞.

Random Variable: a ≤ t ≤ b

Distribution Formulas (time domain and Laplace domain)

PDF:
  f(t) = 1/(b−a)   for a ≤ t ≤ b
       = 0         otherwise

  In terms of the Heaviside step function u(t):
  f(t) = [u(t−a) − u(t−b)]/(b−a)

  Laplace: f(s) = (e^(−as) − e^(−bs)) / (s(b−a))

CDF:
  F(t) = 0             for t < a
       = (t−a)/(b−a)   for a ≤ t ≤ b
       = 1             for t > b

  F(t) = [(t−a)/(b−a)]·[u(t−a) − u(t−b)] + u(t−b)

  Laplace: F(s) = (e^(−as) − e^(−bs)) / (s²(b−a))

Reliability:
  R(t) = 1             for t < a
       = (b−t)/(b−a)   for a ≤ t ≤ b
       = 0             for t > b

  Laplace: R(s) = 1/s + (e^(−bs) − e^(−as)) / (s²(b−a))

Conditional Survivor Function, m(x) = R(t+x)/R(t) = P(T > t+x | T > t):
  For t < a:
    m(x) = 1                 for t+x < a
         = (b−(t+x))/(b−a)   for a ≤ t+x ≤ b
         = 0                 for t+x > b
  For a ≤ t ≤ b:
    m(x) = (b−(t+x))/(b−t)   for a ≤ t+x ≤ b
         = 0                 for t+x > b
  For t > b:
    m(x) = 0
  Where t is the given time we know the component has survived to, and x is a random variable defined as the time after t. Note: x = 0 at t.

Mean Residual Life:
  For t < a:
    u(t) = (a−t) + (b−a)/2
  For a ≤ t ≤ b:
    u(t) = (b−t)/2

  For t > b:
    u(t) = 0

Hazard Rate:
  h(t) = 1/(b−t)   for a ≤ t ≤ b
       = 0         otherwise

Cumulative Hazard Rate:
  H(t) = 0                  for t < a
       = −ln[(b−t)/(b−a)]   for a ≤ t ≤ b
       = ∞                  for t > b

Properties and Moments

  Median                                 (a+b)/2
  Mode                                   any value between a and b
  Mean - 1st Raw Moment                  (a+b)/2
  Variance - 2nd Central Moment          (b−a)²/12
  Skewness - 3rd Central Moment          0
  Excess kurtosis - 4th Central Moment   −6/5
  Characteristic Function                (e^(itb) − e^(ita)) / (it(b−a))
  100α% Percentile Function              t_α = α(b−a) + a

Parameter Estimation

Maximum Likelihood Function

Likelihood Functions:

  L(a,b|E) = [1/(b−a)]^(n_F)  ×  Π_{i=1}^{n_S} [1 − (t_i^S − a)/(b−a)]  ×  Π_{i=1}^{n_I} (t_i^RI − t_i^LI)/(b−a)
              (failures)          (survivors)                               (interval failures)

This assumes that all times are within the bounds a, b.

When there is only complete failure data:

  L(a,b|E) = [1/(b−a)]^(n_F)   where a ≤ t_i ≤ b

Point Estimates: The likelihood function is maximized when a is large and b is small, with the restriction that all times are between a and b. Thus:

  â = min(t_1^F, t_2^F, ...),   b̂ = max(t_1^F, t_2^F, ...)

When a = 0 and b is estimated with complete data, the following estimates may be used, where t_max = max(t_1^F, t_2^F, ..., t_n^F). (Johnson et al. 1995, p.286)

  1. MLE:                     b̂ = t_max
  2. Min Mean Square Error:   b̂ = [(n+2)/(n+1)]·t_max
  3. Unbiased Estimator:      b̂ = [(n+1)/n]·t_max
  4. Closest Estimator:       b̂ = 2^(1/n)·t_max

Procedures for parameter estimation when there is censored data are detailed in (Johnson et al. 1995, p.286).
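The four estimators of b reduce to one line each; a minimal sketch (the data are arbitrary illustrative values, with a taken to be 0):

```python
def b_estimates(data):
    """Estimators of the upper bound b for Uniform(0, b), complete data."""
    n, t_max = len(data), max(data)
    return {
        "MLE": t_max,
        "min MSE": (n + 2) / (n + 1) * t_max,
        "unbiased": (n + 1) / n * t_max,
        "closest": 2 ** (1 / n) * t_max,
    }

est = b_estimates([240, 585, 223, 751, 255])    # illustrative data only
print(est["MLE"], round(est["unbiased"], 1))    # 751 901.2
```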

Fisher Information:

  I(a, b) = [  1/(a−b)²   −1/(a−b)² ]
            [ −1/(a−b)²    1/(a−b)² ]

Bayesian

The Uniform distribution is widely used in Bayesian methods as a non-informative prior or to model evidence which only suggests bounds on the parameter.

Non-informative Prior. The Uniform distribution can be used as a non-informative prior. As can be seen below, the only effect the uniform prior has on Bayes' equation is to limit the range of the parameter over which the denominator integrates.

  π(θ|E) = [L(E|θ)·(1/(b−a))] / [∫_a^b L(E|θ)·(1/(b−a)) dθ] = L(E|θ) / ∫_a^b L(E|θ) dθ

Parameter Bounds. This type of distribution allows an easy method to mathematically model soft data where only the parameter bounds can be estimated. An example is where the uniform distribution models a person's opinion on the value θ: they know θ could not be lower than a or greater than b, but are unsure of any particular value θ could take.

Non-informative Priors

  Jeffrey's Prior: 1/(b−a)

Description, Limitations and Uses

Example: For an example of the uniform distribution being used in Bayesian updating as a Beta(1,1) prior, see the binomial distribution.

Given the following data, calculate the MLE parameter estimates: 240, 585, 223, 751, 255

  â = 223
  b̂ = 751

Characteristics

The Uniform distribution is a special case of the Beta distribution when α = β = 1.

The uniform distribution has an increasing failure rate, with lim_{t→b} h(t) = ∞.

The Standard Uniform Distribution has parameters a = 0 and b = 1. This results in f(t) = 1 for a ≤ t ≤ b and 0 otherwise.

Uniformity Property. Let T ~ Unif(a, b). If t > a and t + Δ < b then:

  P(t → t+Δ) = ∫_t^(t+Δ) dt/(b−a) = Δ/(b−a)

The probability that a random variable falls within any interval of fixed length is independent of the location, t, and is only dependent on the interval size, Δ.

Variate Generation Property:

  F⁻¹(u) = u(b−a) + a

Residual Property. If k is a real constant where a < k < b then:

  Pr(T | T > k) ~ Unif(a = k, b)
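The variate generation property above is how Unif(0,1) variates are mapped onto [a, b]; a minimal sketch:

```python
import random

random.seed(42)

def uniform_variate(a, b):
    """Variate generation property: t = F^-1(u) = u*(b - a) + a."""
    return random.random() * (b - a) + a    # random() ~ Unif(0, 1)

a, b = 2.0, 5.0
samples = [uniform_variate(a, b) for _ in range(10000)]
mean = sum(samples) / len(samples)
print(round(mean, 2))   # close to (a + b)/2 = 3.5
```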

Bayesian Inference. The uniform distribution can be used as a non-informative prior and to model soft evidence.

Special Case of Beta Distribution. In applications like Bayesian statistics the uniform distribution is used as an uninformative prior by using a beta distribution with α = β = 1.

Resources Online:
http://mathworld.wolfram.com/UniformDistribution.html
http://en.wikipedia.org/wiki/Uniform_distribution_(continuous)
http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books:
Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2, 2nd ed., Wiley-Interscience.

Relationship to Other Distributions

Beta Distribution Beta(t; α, β, a, b)
Let:
  X_i ~ Unif(0,1) with order statistics X_(1) ≤ X_(2) ≤ ... ≤ X_(n)
Then:
  X_(r) ~ Beta(r, n − r + 1)
where r and n are integers.

Special Case:
  Beta(t; a, b | α = 1, β = 1) = Unif(t; a, b)

Exponential Distribution Exp(t; λ)
Let:
  X ~ Exp(λ) and Y = exp(−λX)
Then:
  Y ~ Unif(0, 1)
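The exponential relationship above can be checked by simulation; a minimal sketch (X is drawn by inverse-CDF sampling, an assumption of convenience):

```python
import math
import random

random.seed(0)
lam = 1.5

# X ~ Exp(lam) by inverse CDF; the property says Y = exp(-lam * X) ~ Unif(0, 1)
ys = []
for _ in range(20000):
    x = -math.log(1.0 - random.random()) / lam
    ys.append(math.exp(-lam * x))

mean = sum(ys) / len(ys)
print(round(mean, 3))   # a Unif(0,1) sample mean, close to 0.5
```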


5. Univariate Discrete Distributions



5.1. Bernoulli Discrete Distribution

[Figures: Probability Density Function f(k), Cumulative Density Function F(k) and Hazard Rate h(k) of the Bernoulli distribution, plotted for p=0.3 and p=0.55.]


Parameters & Description

Parameters:
  p — Bernoulli probability parameter, 0 ≤ p ≤ 1. Probability of success.

Random Variable: k ∈ {0, 1}

Question: The probability of getting exactly k (0 or 1) successes in 1 trial with probability p.

Distribution Formulas

PDF:
  f(k) = p^k (1−p)^(1−k)
       = 1−p   for k = 0
       = p     for k = 1

CDF:
  F(k) = (1−p)^(1−k)
       = 1−p   for k = 0
       = 1     for k = 1

Reliability:
  R(k) = 1 − (1−p)^(1−k)
       = p     for k = 0
       = 0     for k = 1

Hazard Rate:
  h(k) = 1−p   for k = 0
       = 1     for k = 1

Properties and Moments

  Mode                                   ‖p‖ (nearest integer to p) when p ≠ 0.5; {0, 1} when p = 0.5
  Mean - 1st Raw Moment                  p
  Variance - 2nd Central Moment          p(1−p)
  Skewness - 3rd Central Moment          (q−p)/√(pq), where q = 1−p
  Excess kurtosis - 4th Central Moment   (6p² − 6p + 1)/(p(1−p))
  Characteristic Function                (1−p) + p·e^(it)

Parameter Estimation

Maximum Likelihood Function

Likelihood Function:

  L(p|E) = p^(Σk_i) (1−p)^(n−Σk_i)

where n is the number of Bernoulli trials, k_i ∈ {0,1}, and Σk_i = Σ_{i=1}^{n} k_i.

Maximize by solving dL/dp = 0:

  dL/dp = Σk_i · p^(Σk_i −1) (1−p)^(n−Σk_i) − (n − Σk_i)·p^(Σk_i) (1−p)^(n−1−Σk_i) = 0

  Σk_i (1−p) = (n − Σk_i) p

  p = Σk_i / n

MLE Point Estimates: The MLE point estimate for p:

  p̂ = Σk_i / n

Fisher Information:

  I(p) = 1/(p(1−p))

Confidence Intervals: See discussion in the binomial distribution.

Bayesian

Non-informative Priors for p, π(p) (Yang and Berger 1998, p.6):

  Type                  Prior                     Posterior

  Uniform Proper Prior   1/(b−a)                   Truncated Beta Distribution:
  with limits                                      c·Beta(p; 1+k, 2−k)  for a ≤ p ≤ b
  p ∈ [a, b]                                       Otherwise π(p) = 0

  Uniform Improper       1 = Beta(p; 1, 1)         Beta(p; 1+k, 2−k)
  Prior with limits
  p ∈ [0, 1]

  Jeffrey's Prior /      1/√(p(1−p))               Beta(p; ½+k, 1.5−k)
  Reference Prior        = Beta(p; ½, ½)
                         when p ∈ [0, 1]

  MDIP                   1.6186·p^p (1−p)^(1−p)    Proper - No Closed Form

  Novick and Hall        p⁻¹(1−p)⁻¹ = Beta(0, 0)   Beta(p; k, 1−k)
                         when p ∈ [0, 1]

Conjugate Priors

  UOI   Likelihood Model   Evidence                Dist of UOI   Prior Para   Posterior Parameters
  p     Bernoulli(k; p)    k failures in 1 trial   Beta          α₀, β₀       α = α₀ + k;  β = β₀ + 1 − k

Description, Limitations and Uses

Example: When a demand is placed on a machine it undergoes a Bernoulli trial, with success defined as a successful start. It is known the probability of a successful start, p, equals 0.8. Therefore the probability the machine does not start is f(0) = 0.2. For an example with multiple Bernoulli trials see the binomial distribution.

Characteristics: A Bernoulli process is a probabilistic experiment that can have one of two outcomes: success (k = 1) with probability p, and failure (k = 0) with probability q ≡ 1 − p.

Single Trial. It is important to emphasize that the Bernoulli distribution is for a single trial or event. The case of multiple Bernoulli trials with replacement is the binomial distribution. The case of multiple Bernoulli trials without replacement is the hypergeometric distribution.
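The conjugate update in the table above is a one-liner; a minimal sketch (reading k as the count of the event whose probability is p, per the table):

```python
def beta_update(alpha0, beta0, k):
    """Posterior Beta parameters after one Bernoulli trial:
    alpha = alpha0 + k, beta = beta0 + 1 - k (per the conjugate-prior table)."""
    return alpha0 + k, beta0 + 1 - k

# Uniform Beta(1,1) prior, one trial in which the event occurred (k = 1)
a_post, b_post = beta_update(1, 1, 1)
post_mean = a_post / (a_post + b_post)
print(a_post, b_post, round(post_mean, 3))   # 2 1 0.667
```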

Let K_i ~ Bernoulli(k|p_i).

Maximum Property:
  max{K_1, K_2, ..., K_n} ~ Bernoulli(k; p = 1 − Π(1 − p_i))

Minimum Property:
  min{K_1, K_2, ..., K_n} ~ Bernoulli(k; p = Πp_i)

Product Property:
  Π_{i=1}^{n} K_i ~ Bernoulli(k; p = Πp_i)

Applications: Used to model a single event which has only two outcomes. In reliability engineering it is most often used to model demands or shocks to a component where the component will fail with probability p.

In practice it is rare for only a single event to be considered and so a binomial distribution is most often used (with the assumption of replacement). The conditions and assumptions of a Bernoulli trial however are used as the basis for each trial in a binomial distribution. See ‘Related Distributions’ and binomial distribution for more details.

Resources Online:

http://mathworld.wolfram.com/BernoulliDistribution.html http://en.wikipedia.org/wiki/Bernoulli_distribution http://socr.ucla.edu/htmls/SOCR_Distributions.html (web calc)

Books: Collani, E.V. & Dräger, K., 2001. Binomial distribution handbook for scientists and engineers, Birkhäuser.

Johnson, N.L., Kemp, A.W. & Kotz, S., 2005. Univariate Discrete Distributions 3rd ed., Wiley-Interscience. Relationship to Other Distributions The Binomial distribution counts the number of successes in n independent observations of a Bernoulli process.

Binomial Distribution Binom(k'; n, p)
Let:
  K_i ~ Bernoulli(k_i; p)  and  Y = Σ_{i=1}^{n} K_i
Then:
  Y ~ Binom(k' = Σk_i; n, p)   where k' ∈ {1, 2, ..., n}

Special Case:
  Bernoulli(k; p) = Binom(k; p | n = 1)


5.2. Binomial Discrete Distribution

[Figures: Probability Density Function f(k), Cumulative Density Function F(k) and Hazard Rate h(k) of the binomial distribution, plotted for n=1, n=5, n=15 with p=0.4, and for n=20 with p=0.2, 0.5, 0.8.]


Parameters & Description

Parameters:
  n — Number of Trials. n ∈ {1, 2, ..., ∞}
  p — Bernoulli probability parameter, 0 ≤ p ≤ 1. Probability of success in a single trial.

Random Variable: k ∈ {0, 1, 2, ..., n}

Question: The probability of getting exactly k successes in n trials.

Distribution Formulas

PDF:
  f(k) = C(n,k) p^k (1−p)^(n−k)

  where C(n,k) is the number of combinations of k from n:
  C(n,k) = n! / (k!(n−k)!)

CDF:
  F(k) = Σ_{j=0}^{k} [n!/(j!(n−j)!)] p^j (1−p)^(n−j)
       = I_{1−p}(n−k, k+1)

  where I_x(a, b) is the Regularized Incomplete Beta function. See section 1.6.3.

  When n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10, this can be approximated by a Poisson distribution with μ = np:

  F(k) ≅ Σ_{j=0}^{k} e^(−μ) μ^j / j! = Γ(k+1, μ)/k! = 1 − F_χ²(2μ; 2k+2)

  When np ≥ 10 and n(1−p) ≥ 10 the cdf can be approximated using a normal distribution:

  F(k) ≅ Φ( (k − np + 0.5) / √(np(1−p)) )

Reliability:
  R(k) = 1 − Σ_{j=0}^{k} [n!/(j!(n−j)!)] p^j (1−p)^(n−j)
       = Σ_{j=k+1}^{n} [n!/(j!(n−j)!)] p^j (1−p)^(n−j)
       = I_p(k+1, n−k)

  where I_p(a, b) is the Regularized Incomplete Beta function. See section 1.6.3.
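The quality of the continuity-corrected normal approximation to the CDF is easy to check; a minimal sketch using only the standard library:

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """Exact binomial CDF, P(K <= k)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def binom_cdf_normal(k, n, p):
    """Normal approximation with continuity correction."""
    z = (k - n * p + 0.5) / sqrt(n * p * (1 - p))
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p, k = 50, 0.3, 15        # np = 15 >= 10 and n(1-p) = 35 >= 10
exact = binom_cdf(k, n, p)
approx = binom_cdf_normal(k, n, p)
print(round(exact, 4), round(approx, 4))
```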

Hazard Rate:

  h(k) = C(n,k) θ^k / [ (1+θ)^n − Σ_{j=0}^{k−1} C(n,j) θ^j ]

  where θ = p/(1−p). (Gupta et al. 1997)

Properties and Moments

  Median                                 either ⌊np⌋ or ⌈np⌉
  Mode                                   ⌊(n+1)p⌋
  Mean - 1st Raw Moment                  np
  Variance - 2nd Central Moment          np(1−p)
  Skewness - 3rd Central Moment          (1−2p)/√(np(1−p))
  Excess kurtosis - 4th Central Moment   (6p² − 6p + 1)/(np(1−p))
  Characteristic Function                (1 − p + p·e^(it))^n

  100α% Percentile Function: numerically solve for k (which is not arduous for n ≤ 10):
    k_α = F⁻¹(α; n, p)
  For np ≥ 10 and n(1−p) ≥ 10 the normal approximation may be used:
    k_α ≅ ⌊Φ⁻¹(α)√(np(1−p)) + np − 0.5⌋

Parameter Estimation

Maximum Likelihood Function

Likelihood Function (for complete data only):

  L(p|E) = Π_{i=1}^{n_B} C(n_i, k_i) p^(k_i) (1−p)^(n_i − k_i)
         ∝ p^(Σk_i) (1−p)^(Σn_i − Σk_i)

where n_B is the number of Binomial processes, Σk_i = Σ_{i=1}^{n_B} k_i, Σn_i = Σ_{i=1}^{n_B} n_i, and the combinatory term is ignored (see section 1.1.6 for discussion).

Maximize by solving dL/dp = 0:

  dL/dp = Σk_i · p^(Σk_i −1) (1−p)^(Σn_i −Σk_i) − (Σn_i − Σk_i)·p^(Σk_i) (1−p)^(Σn_i −1−Σk_i) = 0

  Σk_i (1−p) = (Σn_i − Σk_i) p

  p = Σk_i / Σn_i

MLE Point Estimates: The MLE point estimate for p:

  p̂ = Σk_i / Σn_i

Fisher Information:

  I(p) = 1/(p(1−p))

Confidence Intervals: The confidence intervals for the binomial distribution parameter p are a controversial subject which is still debated. The Wilson interval is recommended for both small and large n. (Brown et al. 2001)

  p_lower = [ np̂ + κ²/2 − κ√(κ²/4 + np̂(1−p̂)) ] / (n + κ²)

  p_upper = [ np̂ + κ²/2 + κ√(κ²/4 + np̂(1−p̂)) ] / (n + κ²)

where

  κ = Φ⁻¹((1+γ)/2)

It should be noted that most textbooks use the Wald interval (normal approximation) given below; however many articles have shown these estimates to be erratic and they cannot be trusted. (Brown et al. 2001)

  p_upper = p̂ + κ√(p̂(1−p̂)/n)
  p_lower = p̂ − κ√(p̂(1−p̂)/n)

For a comparison of binomial confidence interval estimates the reader is referred to (Brown et al. 2001). The following webpage has links to online calculators which use many different methods:
http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval
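The Wilson interval is short to implement; a minimal sketch, checked against the machine example later in this section (12 failures in 50 demands):

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(k, n, gamma=0.90):
    """Wilson score interval for a binomial proportion p at confidence gamma."""
    p_hat = k / n
    kappa = NormalDist().inv_cdf((1 + gamma) / 2)
    centre = n * p_hat + kappa ** 2 / 2
    half = kappa * sqrt(kappa ** 2 / 4 + n * p_hat * (1 - p_hat))
    return (centre - half) / (n + kappa ** 2), (centre + half) / (n + kappa ** 2)

lo, hi = wilson_interval(12, 50)
print(round(lo, 4), round(hi, 4))   # 0.1557 0.351
```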

Bayesian

Non-informative Priors for p given n, π(p|n) (Yang and Berger 1998, p.6):

  Type                  Prior                     Posterior

  Uniform Proper Prior   1/(b−a)                   Truncated Beta Distribution:
  with limits                                      c·Beta(p; 1+k, 1+n−k)  for a ≤ p ≤ b
  p ∈ [a, b]                                       Otherwise π(p) = 0

  Uniform Improper       1 = Beta(p; 1, 1)         Beta(p; 1+k, 1+n−k)
  Prior with limits
  p ∈ [0, 1]

  Jeffrey's Prior /      1/√(p(1−p))               Beta(p; ½+k, ½+n−k)
  Reference Prior        = Beta(p; ½, ½)
                         when p ∈ [0, 1]

  MDIP                   1.6186·p^p (1−p)^(1−p)    Proper - No Closed Form

  Novick and Hall        p⁻¹(1−p)⁻¹ = Beta(0, 0)   Beta(p; k, n−k)
                         when p ∈ [0, 1]

Conjugate Priors

  UOI   Likelihood Model   Evidence                 Dist of UOI   Prior Para   Posterior Parameters
  p     Binom(k; p, n)     k failures in n trials   Beta          α₀, β₀       α = α₀ + k;  β = β₀ + n − k

Description, Limitations and Uses

Example: Five machines are measured for performance on demand. The machines can either fail or succeed in their application. The machines are tested for 10 demands each, with the following results:

  Machine 1: F = 3, S = 7
  Machine 2: F = 2, S = 8
  Machine 3: F = 2, S = 8
  Machine 4: F = 3, S = 7
  Machine 5: F = 2, S = 8

Assuming the machines are homogeneous, estimate the parameter p.

Using MLE:

  p̂ = Σk_i / Σn_i = 12/50 = 0.24

90% confidence intervals for p:

  κ = Φ⁻¹(0.95) = 1.64485

  p_lower = [ np̂ + κ²/2 − κ√(κ²/4 + np̂(1−p̂)) ] / (n + κ²) = 0.1557

  p_upper = [ np̂ + κ²/2 + κ√(κ²/4 + np̂(1−p̂)) ] / (n + κ²) = 0.351

A Bayesian point estimate using a uniform prior distribution Beta(1, 1), with posterior Beta(p; 13, 39), has point estimate:

  p̂ = E[Beta(p; 13, 39)] = 13/52 = 0.25

With 90% confidence interval using the inverse Beta cdf:

  [ F⁻¹_Beta(0.05) = 0.1579,   F⁻¹_Beta(0.95) = 0.3532 ]

The probability of observing no failures in the next 10 trials with replacement is:

  f(0; 10, 0.25) = 0.0563

The probability of observing five or fewer failures in the next 10 trials with replacement is:

  F(5; 10, 0.25) = 0.9803
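The Bayesian update and predictive probabilities above reduce to a few lines; a minimal sketch:

```python
from math import comb

# Beta(1,1) prior updated with k = 12 failures in n = 50 demands
a0, b0, k, n = 1, 1, 12, 50
a_post, b_post = a0 + k, b0 + n - k             # Beta(13, 39)
post_mean = a_post / (a_post + b_post)          # 13/52 = 0.25

# Predictions for the next 10 demands using the point estimate p = 0.25
p = post_mean
f0 = (1 - p) ** 10                                                  # P(no failures)
F5 = sum(comb(10, j) * p**j * (1 - p)**(10 - j) for j in range(6))  # P(K <= 5)
print(round(post_mean, 2), round(f0, 4), round(F5, 4))   # 0.25 0.0563 0.9803
```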

Characteristics

CDF Approximations. The Binomial distribution is one of the most widely used distributions throughout history. Although simple, the CDF function was tedious to calculate prior to the use of computers.

As a result, approximations using the Poisson and Normal distributions have been used. For details see 'Related Distributions'.

With Replacement. The Binomial distribution models the probability of k successes in n Bernoulli trials. The k successes can occur anywhere among the n trials, with C(n,k) different combinations. The Binomial distribution therefore assumes sampling with replacement; the equivalent distribution which assumes sampling without replacement is the hypergeometric distribution.

Symmetrical. The distribution is symmetrical when p = 0.5.

Complement. f(k; n, p) = f(n−k; n, 1−p). Tables usually only provide values up to n/2, allowing the reader to calculate the rest using the complement formula.

Assumptions. The binomial distribution describes the behavior of a count variable K if the following conditions apply:

1. The number of observations n is fixed. 2. Each observation is independent. 3. Each observation represents one of two outcomes ("success" or "failure"). 4. The probability of "success" is the same for each outcome.

Convolution Property. Let K_i ~ Binom(n_i, p). When p is fixed:

  Σ K_i ~ Binom(Σn_i, p)

Applications: Used to model independent repeated trials which have two outcomes. Examples used in Reliability Engineering are:
  • Number of independent components which fail, k, from a population, n, after receiving a shock.
  • Number of failures to start, k, from n demands on a component.
  • Number of independent items defective, k, from a population of n items.

Books:
Collani, E.V. & Dräger, K., 2001. Binomial Distribution Handbook for Scientists and Engineers, Birkhäuser.
Johnson, N.L., Kemp, A.W. & Kotz, S., 2005. Univariate Discrete Distributions, 3rd ed., Wiley-Interscience.

Relationship to Other Distributions

Bernoulli Distribution Bernoulli(k; p)
The Binomial distribution counts the number of successes, k, in n independent observations of a Bernoulli process.
Let:
  K_i ~ Bernoulli(k_i; p)  and  Y = Σ_{i=1}^{n} K_i
Then:
  Y ~ Binom(Σk_i; n, p)
Special Case:
  Bernoulli(k; p) = Binom(k; p | n = 1)

Hypergeometric Distribution Hypergeom(k; n, m, N)
The hypergeometric distribution models the probability of k successes in n Bernoulli trials from a population N containing m successes, without replacement.

Limiting Case for N ≫ n and m/N not near 0 or 1:

  lim_{N→∞} Hypergeom(k; n, m, N) = Binom(k; n, p = m/N)

Normal Distribution Norm(t; μ, σ²)
Limiting Case for constant p:

  lim_{n→∞} Binom(k|n, p) = Norm(k; μ = np, σ² = np(1−p))

The Normal distribution can be used as an approximation of the Binomial distribution when np ≥ 10 and n(1−p) ≥ 10:

  Binom(k|p, n) ≈ Norm(k + 0.5; μ = np, σ² = np(1−p))

Poisson Distribution Pois(k; μ)
Limiting Case for constant np:

  lim_{n→∞} Binom(k; n, p) = Pois(k; μ = np)

The Poisson distribution is the limiting case of the Binomial distribution when n is large but the product np remains constant. Hence the Poisson distribution models rare events.

The Poisson distribution can be used as an approximation to the Binomial distribution when n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.

The Binomial is expressed in terms of the probability of success, p, and the total number of trials, n, whereas a Poisson distribution is expressed in terms of a success rate and does not need to know the total number of trials.

The derivation of the Poisson distribution from the binomial can be found at http://mathworld.wolfram.com/PoissonDistribution.html.

This interpretation can also be used to understand the conditional distribution of a Poisson random variable:
Let:
  K_1, K_2 ~ Pois(μ_i)
Given:
  n = K_1 + K_2 (number of events)
Then:
  K_1 | n ~ Binom(k; n, p = μ_1/(μ_1 + μ_2))

Multinomial Distribution
Special Case:
  Multinom_{d=2}(k|n, p) = Binom(k|n, p)

5.3. Poisson Discrete Distribution

[Figures: Probability Density Function f(k), Cumulative Density Function F(k) and Hazard Rate h(k) of the Poisson distribution, plotted for μ = 4, 10, 20, 30.]


Parameters & Description

Parameters:
  μ — Shape Parameter, μ > 0. The value of μ is the expected number of events per time period or other physical dimension. If the Poisson distribution is modeling failure events, then μ = λt, where λ is the average number of failures that would occur in the space t. In this case t is fixed and λ becomes the distribution parameter. Some texts use the symbol ρ.

Random Variable: k is an integer, k ≥ 0

Distribution Formulas

PDF:
  f(k) = μ^k e^(−μ) / k! = (λt)^k e^(−λt) / k!

CDF:
  F(k) = e^(−μ) Σ_{j=0}^{k} μ^j / j! = Γ(k+1, μ)/k! = 1 − F_χ²(2μ; 2k+2)

  where F_χ²(x|v) is the Chi-square CDF.

  When μ > 10, F(k) can be approximated by a normal distribution:

  F(k) ≅ Φ( (k − μ + 0.5)/√μ )

Reliability:
  R(k) = 1 − F(k)

Hazard Rate:
  h(k) = [ 1 + (k!/μ^k)(e^μ − 1 − Σ_{j=1}^{k} μ^j/j!) ]^(−1)

  (Gupta et al. 1997)

Properties and Moments

  Median: see the 100α% Percentile Function with α = 0.5.

  Mode                                   ⌊μ⌋, where ⌊μ⌋ is the floor function (largest integer not greater than μ)
  Mean - 1st Raw Moment                  μ
  Variance - 2nd Central Moment          μ
  Skewness - 3rd Central Moment          1/√μ
  Excess kurtosis - 4th Central Moment   1/μ
  Characteristic Function                exp{μ(e^(it) − 1)}

  100α% Percentile Function: numerically solve for k (which is not arduous for μ ≤ 10):
    k_α = F⁻¹(α)
  For μ > 10 the normal approximation may be used:
    k_α ≅ ⌊Φ⁻¹(α)√μ + μ − 0.5⌋

Parameter Estimation

Maximum Likelihood Estimates

Likelihood Function (for complete data):

  L(μ|E) = Π_{i=1}^{n} μ^(k_i) e^(−μ) / k_i!    (known k_i)

where n is the number of Poisson processes.

Log-Likelihood Function:

  Λ = −nμ + Σ_{i=1}^{n} { k_i ln(μ) − ln(k_i!) }

Setting ∂Λ/∂μ = 0:

  ∂Λ/∂μ = −n + (1/μ) Σ_{i=1}^{n} k_i = 0

MLE Point Estimates: For complete data, solving ∂Λ/∂μ = 0 gives:

  μ̂ = (1/n) Σ_{i=1}^{n} k_i    or    λ̂ = (1/(tn)) Σ_{i=1}^{n} k_i

Note that in this context:
  t = the unit of time for which the rate λ is being measured.
  n = the number of Poisson processes for which the exact number of failures, k, was known.
  k_i = the number of failures that occurred within the ith Poisson process.

When there is only one Poisson process this reduces to:

  μ̂ = k    or    λ̂ = k/t

For censored data numerical methods are needed to maximize the log-likelihood function.

Fisher Information:

  I(λ) = 1/λ

100γ% Confidence Intervals (conservative two-sided):

  λ_lower = χ²_{(1−γ)/2}(2Σk_i) / (2tn)
  λ_upper = χ²_{(1+γ)/2}(2Σk_i + 2) / (2tn)

When k is large (k > 10), two-sided intervals (complete data only):

  λ_lower = λ̂ − Φ⁻¹((1+γ)/2)·√(λ̂/(tn))
  λ_upper = λ̂ + Φ⁻¹((1+γ)/2)·√(λ̂/(tn))

(Nelson 1982, p.201)

Note: The first confidence intervals are conservative in that the coverage is at least 100γ%. Exact confidence intervals cannot be easily achieved for discrete distributions.

Bayesian

Non-informative Priors π(λ), with known time interval t:

Uniform Proper Prior with limits λ ∈ [a, b]: prior 1/(b−a); posterior is a truncated Gamma distribution, c·Gamma(λ; 1+k, t) for a ≤ λ ≤ b, otherwise π(λ|E) = 0.

Uniform Improper Prior with limits λ ∈ [0, ∞): prior 1 ∝ Gamma(1, 0); posterior Gamma(λ; 1+k, t).

Jeffreys Prior with limits λ ∈ [0, ∞): prior 1/√λ ∝ Gamma(½, 0); posterior Gamma(λ; ½+k, t).

Novick and Hall Prior with limits λ ∈ [0, ∞): prior 1/λ ∝ Gamma(0, 0); posterior Gamma(λ; k, t).

Conjugate Priors:
UOI: λ from Exp(t; λ) time-to-failure data. Likelihood model: Pois(k; μ = λt_T). Evidence: n_F failures in t_T units of time. Distribution of UOI: Gamma with prior parameters k₀, Λ₀. Posterior parameters: k = k₀ + n_F, Λ = Λ₀ + t_T.

Description, Limitations and Uses

Example

Three vehicle tires were run on a test area for 1000 km each and had punctures at the following distances:
Tire 1: no punctures
Tire 2: 400 km, 900 km
Tire 3: 200 km


Punctures can be modeled as a renewal process with perfect repair and an inter-arrival time modeled by an exponential distribution. Due to the Poisson distribution being homogeneous in time, the test from multiple tires can be combined and considered a test of one tire with multiple renewals. See example in section 1.1.6.

Total time on test is 3 × 1000 = 3000km. Total number of failures is 3. Therefore using MLE the estimate of :

λ̂ = k/t_T = 3/3000 = 1E-3

With 90% confidence interval (conservative):
[ χ²_{0.05}(6)/(2×3000) = 0.272E-3,  χ²_{0.95}(8)/(2×3000) = 2.584E-3 ]

A Bayesian point estimate using the Jeffreys non-informative improper prior Gamma(½, 0), with posterior Gamma(λ; 3.5, 3000), has a point estimate:
λ̂ = E[Gamma(λ; 3.5, 3000)] = 3.5/3000 = 1.16̇E-3

With 90% confidence interval using the inverse Gamma cdf:
[ F_G⁻¹(0.05) = 0.361E-3,  F_G⁻¹(0.95) = 2.344E-3 ]

Characteristics

The Poisson distribution is also known as the Rare Event distribution.
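As a sanity check, the point estimates in this example can be reproduced in a few lines of Python. This is a minimal sketch; the chi-square interval bounds are omitted since they need tabulated quantiles or a stats library.

```python
# Tire example: 3 tires x 1000 km gives total time on test t_T = 3000 km,
# with k = 3 punctures observed in total.
k, t_T = 3, 3000.0

# MLE of the rate: lambda_hat = k / t_T
lam_mle = k / t_T

# Bayesian point estimate with the Jeffreys prior Gamma(1/2, 0):
# posterior is Gamma(lambda; k + 1/2, t_T) with mean (k + 1/2) / t_T.
lam_jeffreys = (k + 0.5) / t_T

print(lam_mle)       # 0.001
print(lam_jeffreys)  # ~0.0011667
```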

If the following assumptions are met, then the process follows a Poisson distribution:
• The chance of two simultaneous events is negligible or impossible (such as renewal of a single component);

• The expected value of the random number of events in a region is proportional to the size of the region. • The random number of events in non-overlapping regions are independent.

μ characteristics:
• μ is the expected number of events for the unit of time being measured.
• When the unit of time varies, μ can be transformed into a rate and time measure, μ = λt.
• For μ ≲ 10 the distribution is skewed to the right.
• For μ ≳ 10 the distribution approaches a normal distribution with μ′ = μ and σ = √μ.

Convolution property: if K_i ~ Pois(μ_i) are independent, then
K₁ + K₂ + … + K_n ~ Pois(k; Σμ_i)
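The convolution property is easy to verify numerically. The sketch below (plain Python, illustrative names) checks that convolving two Poisson pmfs reproduces Pois(k; μ₁ + μ₂):

```python
import math

def pois_pmf(k, mu):
    """Poisson pmf: f(k; mu) = mu^k e^-mu / k!"""
    return mu ** k * math.exp(-mu) / math.factorial(k)

mu1, mu2 = 2.0, 3.5
k = 4
# Convolution of two independent Poisson pmfs at k ...
conv = sum(pois_pmf(j, mu1) * pois_pmf(k - j, mu2) for j in range(k + 1))
# ... equals a single Poisson pmf with the summed mean.
direct = pois_pmf(k, mu1 + mu2)
```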

Applications

Homogeneous Poisson Process (HPP). The Poisson distribution gives the distribution of exactly k failures occurring in a HPP. See the relation to the exponential and gamma distributions.

Renewal Theory. Used in renewal theory as the counting function, and may model non-homogeneous (aging) components by using a time-dependent failure rate, λ(t).

Binomial Approximation. Used to model the Binomial distribution when the number of trials is large and μ = np remains moderate. This can greatly simplify Binomial distribution calculations.

Rare Event. Used to model rare events when the number of trials is large compared to the rate at which events occur.

Resources

Online:
http://mathworld.wolfram.com/PoissonDistribution.html
http://en.wikipedia.org/wiki/Poisson_distribution
http://socr.ucla.edu/htmls/SOCR_Distributions.html (interactive web calculator)

Books:
Haight, F.A., 1967. Handbook of the Poisson Distribution, New York: Wiley.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Johnson, N.L., Kemp, A.W. & Kotz, S., 2005. Univariate Discrete Distributions 3rd ed., Wiley-Interscience.

Relationship to Other Distributions

Exponential Distribution, Exp(t; λ). Let K ~ Pois(k; μ = λt). Given the event times T₁ ≤ T₂ ≤ … ≤ T_k ≤ T_{k+1} ≤ …, the time between each arrival is exponentially distributed:
T_{k+1} − T_k ~ Exp(t; λ)
Special case: the waiting time to the first event is Exp(t; λ).

Gamma Distribution, Gamma(t; k, λ). Let:
T_i ~ Exp(λ) i.i.d.  and  T_k = T₁ + T₂ + … + T_k
Then:
T_k ~ Gamma(t; k, λ)
The Poisson distribution is the probability that exactly k failures have been observed in time t. This is the probability that t is between T_k and T_{k+1}:
f_Pois(k; λt) = F_Gamma(t; k, λ) − F_Gamma(t; k+1, λ)
where k is an integer.
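For integer shape the Gamma cdf (an Erlang distribution) can be written as a Poisson sum, which makes this relationship checkable with only the standard library. A sketch:

```python
import math

def pois_pmf(k, mu):
    """Poisson pmf: f(k; mu) = mu^k e^-mu / k!"""
    return mu ** k * math.exp(-mu) / math.factorial(k)

def erlang_cdf(t, k, lam):
    """CDF of Gamma(t; k, lam) for integer shape k (an Erlang distribution):
    P(T_k <= t) = P(at least k Poisson events by t) = 1 - sum_{j<k} pmf(j, lam*t)."""
    return 1.0 - sum(pois_pmf(j, lam * t) for j in range(k))

lam, t, k = 0.8, 2.5, 3
lhs = pois_pmf(k, lam * t)
# The identity f_Pois(k; lam*t) = F_Gamma(t; k, lam) - F_Gamma(t; k+1, lam):
rhs = erlang_cdf(t, k, lam) - erlang_cdf(t, k + 1, lam)
```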

Binomial Distribution, Binom(k | p, N). Limiting case for constant np:
lim_{n→∞} Binom(k; n, p) = Pois(k | μ = np)
The Poisson distribution is the limiting case of the Binomial distribution when n is large but the product np remains constant. Hence the Poisson distribution models rare events. The Poisson distribution can be used as an approximation to the Binomial distribution when n ≥ 20 and p ≤ 0.05, or when n ≥ 100 and np ≤ 10.

The Binomial is expressed in terms of a probability of success, p, and a total number of trials, N, whereas a Poisson distribution is expressed in terms of a rate and does not need to know the total number of trials. The derivation of the Poisson distribution from the binomial can be found at http://mathworld.wolfram.com/PoissonDistribution.html.

This interpretation can also be used to understand the conditional distribution of a Poisson random variable. Let K₁ ~ Pois(μ₁) and K₂ ~ Pois(μ₂) be independent, and observe n = K₁ + K₂. Then:
K₁ | n ~ Binom( k; n, p = μ₁/(μ₁ + μ₂) )

Normal Distribution, Norm(k | μ′, σ):
lim_{μ→∞} F_Pois(k; μ) = F_Norm(k; μ′ = μ, σ² = μ)
This is a good approximation when μ > 1000. When μ > 10 the same approximation can be made with a continuity correction:
F_Pois(k; μ) ≈ F_Norm(k; μ′ = μ − 0.5, σ² = μ)

Chi-square Distribution, χ²(t | v):
F_Pois(k | μ) = 1 − F_χ²(x = 2μ; v = 2k + 2)
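The continuity-corrected normal approximation can be checked numerically. In this sketch (plain Python, illustrative names) the agreement for μ = 50 is within a few hundredths:

```python
import math

def pois_cdf(k, mu):
    """Poisson CDF by direct summation of the pmf."""
    return sum(mu ** j * math.exp(-mu) / math.factorial(j) for j in range(k + 1))

def norm_cdf(x, mu, sigma):
    """Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

mu, k = 50.0, 55
exact = pois_cdf(k, mu)
# Continuity-corrected normal approximation: mean mu - 0.5, variance mu.
approx = norm_cdf(k, mu - 0.5, math.sqrt(mu))
```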

6. Bivariate and Multivariate Distributions


6.1. Bivariate Normal Continuous Distribution

Probability Density Function - f(x,y)

[Figure: contour plots of the bivariate normal pdf for σ_x > σ_y, σ_x = σ_y and σ_x < σ_y, with ρ > 0, ρ = 0 and ρ < 0; the main-axis angle θ varies from 0° to 180° accordingly. Adapted from (Kotz et al. 2000, p.256)]

Parameters & Description

μ_x, μ_y — Location parameters, −∞ < μ_j < ∞, j ∈ {x, y}: the mean of each random variable.
σ_x, σ_y — Scale parameters, σ_j > 0: the standard deviation of each random variable.
ρ — The correlation between the two random variables, −1 ≤ ρ ≤ 1:
ρ = cov(X, Y)/(σ_x σ_y) = E[(X − μ_x)(Y − μ_y)]/(σ_x σ_y)

Limits: −∞ < x < ∞ and −∞ < y < ∞

Distribution Formulas

PDF:
f(x, y) = 1/(2π σ_x σ_y √(1 − ρ²)) · exp{ −(z_x² − 2ρ z_x z_y + z_y²) / (2(1 − ρ²)) }
where:
z_j = (j − μ_j)/σ_j,  j ∈ {x, y}

Marginal PDF:
f(x) = ∫_{−∞}^{∞} f(x, y) dy = 1/(σ_x √(2π)) exp(−z_x²/2),  i.e. X ~ Norm(μ_x, σ_x²)
f(y) = ∫_{−∞}^{∞} f(x, y) dx = 1/(σ_y √(2π)) exp(−z_y²/2),  i.e. Y ~ Norm(μ_y, σ_y²)

Conditional PDF:
f(x|y) = Norm( μ_{x|y} = μ_x + ρ(σ_x/σ_y)(y − μ_y),  σ²_{x|y} = σ_x²(1 − ρ²) )
f(y|x) = Norm( μ_{y|x} = μ_y + ρ(σ_y/σ_x)(x − μ_x),  σ²_{y|x} = σ_y²(1 − ρ²) )

CDF:
F(x, y) = ∫_{−∞}^{x} ∫_{−∞}^{y} 1/(2π σ_x σ_y √(1 − ρ²)) · exp{ −(z_u² − 2ρ z_u z_v + z_v²)/(2(1 − ρ²)) } dv du

Reliability:
R(x, y) = ∫_{x}^{∞} ∫_{y}^{∞} 1/(2π σ_x σ_y √(1 − ρ²)) · exp{ −(z_u² − 2ρ z_u z_v + z_v²)/(2(1 − ρ²)) } dv du
where z_u = (u − μ_x)/σ_x and z_v = (v − μ_y)/σ_y.

Properties and Moments

Median: [μ_x, μ_y]ᵀ

Mode: [μ_x, μ_y]ᵀ

Mean - 1st Raw Moment:
E[X, Y]ᵀ = [μ_x, μ_y]ᵀ
The means of the marginal distributions are E[X] = μ_x and E[Y] = μ_y.
The means of the conditional distributions give the following lines (also called the regression lines):
E(X|Y = y) = μ_x + ρ(σ_x/σ_y)(y − μ_y)
E(Y|X = x) = μ_y + ρ(σ_y/σ_x)(x − μ_x)

Variance - 2nd Central Moment:
Cov[X, Y] = [ σ_x², ρσ_xσ_y ; ρσ_xσ_y, σ_y² ]
Variance of the marginal distributions: Var(X) = σ_x², Var(Y) = σ_y².
Variance of the conditional distributions:
Var(X|Y = y) = σ_x²(1 − ρ²)
Var(Y|X = x) = σ_y²(1 − ρ²)

100α% Percentile Function:
An ellipse containing 100α% of the distribution is (Kotz et al. 2000, p.254):
z_x² − 2ρ z_x z_y + z_y² = −2(1 − ρ²) ln(1 − α)
where z_j = (j − μ_j)/σ_j, j ∈ {x, y}. For the standard bivariate normal:
x² − 2ρxy + y² = −2(1 − ρ²) ln(1 − α)

Parameter Estimation

Maximum Likelihood Function

MLE Point Estimates: when there is only complete failure data the MLE estimates can be given as (Kotz et al. 2000, p.294):
μ̂_x = (1/n_F) Σ x_i    σ̂_x² = (1/n_F) Σ (x_i − μ̂_x)²
μ̂_y = (1/n_F) Σ y_i    σ̂_y² = (1/n_F) Σ (y_i − μ̂_y)²
ρ̂ = (1/(n_F σ̂_x σ̂_y)) Σ (x_i − μ̂_x)(y_i − μ̂_y)
If one or more of the variables are known, different estimators are given in (Kotz et al. 2000, pp.294-305).

A correction factor of n_F − 1 can be introduced to the σ̂² estimates to give the unbiased estimators:
σ̂_x² = (1/(n_F − 1)) Σ (x_i − μ̂_x)²    σ̂_y² = (1/(n_F − 1)) Σ (y_i − μ̂_y)²
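These closed-form estimators translate directly to code. A minimal sketch (illustrative function name) using the 1/n_F normalisation of the MLE form above:

```python
import math

def bivariate_mle(xs, ys):
    """MLE estimates (mu_x, mu_y, var_x, var_y, rho) for complete paired data,
    using the 1/n_F normalisation."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    rho = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * math.sqrt(vx * vy))
    return mx, my, vx, vy, rho

# A perfectly linear sample has rho = 1:
mx, my, vx, vy, rho = bivariate_mle([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```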

Bayesian Non-informative Priors: A complete coverage of numerous reference prior distributions with different parameter ordering is contained in (Berger & Sun 2008).

For a summary of the general Bayesian priors and conjugates see the multivariate normal distribution.

Description, Limitations and Uses

Example

The accuracy of a cutting machine used in manufacturing is to be measured. Five cuts at the required length are made, and the lengths and room temperature were recorded as:
Lengths (mm): 7.436, 10.270, 10.466, 11.039, 11.854
Room temperature (°C): 19.51, 21.23, 21.41, 22.78, 26.78

MLE estimates are:
μ̂_x = Σx_i/n = 10.213
μ̂_T = Σt_i/n = 22.342
σ̂_x² = Σ(x_i − μ̂_x)²/(n − 1) = 2.7885
σ̂_T² = Σ(t_i − μ̂_T)²/(n − 1) = 7.5033
ρ̂ = (1/(n σ̂_x σ̂_T)) Σ (x_i − μ̂_x)(t_i − μ̂_T) = 0.1454

If the temperature is known to be 24 °C, what is the likely cutting length distribution?
f(x|t = 24) = Norm( μ_{x|t} = μ̂_x + ρ̂(σ̂_x/σ̂_T)(t − μ̂_T),  σ²_{x|t} = σ̂_x²(1 − ρ̂²) )
f(x|t = 24) = Norm(10.303, 2.730)
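The conditional calculation in this example is a two-line formula. A hedged sketch with a hypothetical helper, checked here on the standard bivariate normal rather than the example's data:

```python
def conditional_normal(mu_x, mu_y, sd_x, sd_y, rho, y):
    """Parameters of X | Y = y for a bivariate normal:
    mean = mu_x + rho*(sd_x/sd_y)*(y - mu_y), variance = sd_x^2*(1 - rho^2)."""
    mean = mu_x + rho * (sd_x / sd_y) * (y - mu_y)
    var = sd_x ** 2 * (1.0 - rho ** 2)
    return mean, var

# Standard bivariate normal with rho = 0.5, conditioned on y = 2:
mean, var = conditional_normal(0.0, 0.0, 1.0, 1.0, 0.5, 2.0)
print(mean, var)  # 1.0 0.75
```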

Characteristics

Also known as the Binormal Distribution.

Let U, V and W be three independent normally distributed random variables, and let:
X = U + V
Y = V + W
Then (X, Y) has a bivariate normal distribution. (Balakrishnan & Lai 2009, p.483)

Independence. If X and Y are jointly normal random variables, then they are independent when ρ = 0. The contour plot of f(x, y) then has concentric ellipses around the mean: a given value on the y axis does not assist in estimating the value on the x axis, and therefore X and Y are independent. When X and Y are independent, the pdf reduces to:
f(x, y) = 1/(2π σ_x σ_y) · exp{ −(z_x² + z_y²)/2 }

Correlation Coefficient ρ. (Yang et al. 2004, p.49)
- ρ > 0: when X increases, Y also tends to increase. When ρ = 1, X and Y have a perfect positive linear relationship such that Y = c + mX where m is positive.
- ρ < 0: when X increases, Y tends to decrease. When ρ = −1, X and Y have a perfect negative linear relationship such that Y = c + mX where m is negative.
- ρ = 0: increases or decreases in X have no effect on Y; X and Y are independent.

Ellipse Axis. (Kotz et al. 2000, p.254) The slope of the main axis from the x-axis is given as:
θ = ½ tan⁻¹( 2ρσ_xσ_y / (σ_x² − σ_y²) )
If σ_x = σ_y, for positive ρ the main axis of the ellipse is 45° from the x-axis; for negative ρ the main axis is −45° from the x-axis.

Circular Normal Density Function. (Kotz et al. 2000, p.255) When σ_x = σ_y and ρ = 0 the bivariate distribution is known as a circular normal density function.

Elliptical Normal Distribution. (Kotz et al. 2000, p.255) If ρ = 0 and σ_x ≠ σ_y, the distribution may be known as an elliptical normal distribution.

Standard Bivariate Normal Distribution. Occurs when μ = 0 and σ = 1. For positive ρ the main axis of the ellipse is 45° from the x-axis; for negative ρ it is −45°:
f(x, y) = 1/(2π√(1 − ρ²)) · exp{ −(x² − 2ρxy + y²)/(2(1 − ρ²)) }

Mean / Median / Mode: as per the univariate distributions the mean, median and mode are equal.

Matrix Form. The bivariate distribution may be written in matrix form as:
X = [X₁; X₂],  μ = [μ₁; μ₂],  Σ = [ σ₁², ρσ₁σ₂ ; ρσ₁σ₂, σ₂² ]
when X ∼ Norm₂(μ, Σ):
f(x) = 1/(2π√|Σ|) · exp{ −½ (x − μ)ᵀ Σ⁻¹ (x − μ) }
where |Σ| is the determinant of Σ. This is the form used in the multivariate normal distribution.
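The standard bivariate normal pdf is simple to implement, and setting ρ = 0 verifies the independence factorisation into two univariate standard normals. A sketch (illustrative names):

```python
import math

def std_bivariate_pdf(x, y, rho):
    """Standard bivariate normal pdf (mu = 0, sigma = 1):
    f(x,y) = 1/(2*pi*sqrt(1-rho^2)) * exp(-(x^2 - 2*rho*x*y + y^2)/(2*(1-rho^2)))."""
    q = (x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho ** 2))
    return math.exp(-q) / (2.0 * math.pi * math.sqrt(1.0 - rho ** 2))

# With rho = 0 the pdf at the origin is 1/(2*pi), the product of two
# univariate standard normal pdfs at 0:
p = std_bivariate_pdf(0.0, 0.0, 0.0)
```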

Convolution Property:
Let X ∼ Norm(μ_x, Σ_x) and Y ∼ Norm(μ_y, Σ_y) with X ⊥ Y (independent). Then:
X + Y ~ Norm(μ_x + μ_y, Σ_x + Σ_y)
Note: if X and Y are dependent then X + Y may not even be normally distributed. (Novosyolov 2006)

Scaling Property:
Let Y = AX + b, where Y is a p×1 matrix, b is a p×1 matrix and A is a p×2 matrix. Then:
Y ~ Norm(Aμ + b, AΣAᵀ)

Marginalize Property:
Let [X₁; X₂] ~ Norm( [μ₁; μ₂], [ σ₁², ρσ₁σ₂ ; ρσ₁σ₂, σ₂² ] ). Then:
X₁ ∼ Norm(μ₁, σ₁²)

Conditional Property:
With the same setup:
f(x₁|x₂) = Norm( μ_{1|2}, σ²_{1|2} )
where:
μ_{1|2} = μ₁ + ρ(σ₁/σ₂)(x₂ − μ₂)
σ²_{1|2} = σ₁²(1 − ρ²)
It should be noted that the standard deviation of the conditional distribution does not depend on the given value.

Applications The bivariate distribution is used in many more applications which are common to the multivariate normal distribution. Please refer to multivariate normal distribution for a more complete coverage.

Graphical Representation of Multivariate Normal. As with all bivariate distributions, having only two dependent variables means the distribution can be easily graphed (as a three-dimensional surface) and visualized. As such the bivariate normal is popular for introducing higher-dimensional cases.

Resources

Online:
http://mathworld.wolfram.com/BivariateNormalDistribution.html
http://en.wikipedia.org/wiki/Multivariate_normal_distribution
http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_binormal_distri.htm (interactive visual representation)

Books:
Balakrishnan, N. & Lai, C., 2009. Continuous Bivariate Distributions 2nd ed., Springer.

Yang, K. et al., 2004. Multivariate Statistical Methods in Quality Management 1st ed., McGraw-Hill Professional.

Patel, J.K, Read, C.B, 1996. Handbook of the Normal Distribution, 2nd Edition, CRC

Tong, Y.L., 1990. The Multivariate Normal Distribution, Springer.

6.2. Dirichlet Continuous Distribution

Probability Density Function - f(x)

[Figure: pdfs of Dir₂([x₁, x₂]ᵀ; α) for α = [2,2,2]ᵀ, [10,10,10]ᵀ, [½,½,½]ᵀ, [1,1,1]ᵀ, [½,1,2]ᵀ and [2,1,2]ᵀ]

Parameters & Description

α = [α₁, α₂, …, α_d, α₀]ᵀ, α_i > 0 — Shape matrix. Note that the matrix is d + 1 in length.
d ≥ 1 (integer) — Dimensions: the number of random variables being modeled.

Limits: 0 ≤ x_i ≤ 1 and Σ_{i=1}^{d} x_i ≤ 1

Distribution Formulas

PDF:
f(x) = (1/B(α)) · (1 − Σ_{i=1}^{d} x_i)^{α₀−1} · ∏_{i=1}^{d} x_i^{α_i−1}
where B(α) is the multinomial beta function:
B(α) = ∏_{i=0}^{d} Γ(α_i) / Γ(Σ_{i=0}^{d} α_i)
The special case of the Dirichlet distribution when d = 1 is the beta distribution.
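The beta special case gives a convenient unit test for an implementation of this pdf. A sketch using the text's [α₁, …, α_d, α₀] parameterisation (function name is illustrative):

```python
import math

def dirichlet_pdf(x, alpha):
    """Dirichlet pdf in the text's parameterisation: x = [x_1..x_d] are the
    free variables, alpha = [a_1..a_d, a_0] with the last entry belonging to
    the slack variable 1 - sum(x)."""
    a0 = alpha[-1]
    slack = 1.0 - sum(x)
    # Multinomial beta function B(alpha) = prod Gamma(a_i) / Gamma(sum a_i)
    b = math.prod(math.gamma(a) for a in alpha) / math.gamma(sum(alpha))
    dens = slack ** (a0 - 1.0)
    for xi, ai in zip(x, alpha[:-1]):
        dens *= xi ** (ai - 1.0)
    return dens / b

# d = 1 reduces to a beta distribution: Dir([0.5]; [2, 2]) = Beta(0.5; 2, 2) = 1.5
p = dirichlet_pdf([0.5], [2.0, 2.0])
```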

𝑑𝑑 Let = ~ ( ) Where = [𝑼𝑼 , … , , , … , ] 𝑿𝑿 � � 𝐷𝐷𝐷𝐷𝑟𝑟𝑑𝑑 𝜶𝜶 Dirichlet = [𝑽𝑽 , … , ] 𝑇𝑇 1 𝑠𝑠 𝑠𝑠+1 𝑑𝑑 𝑿𝑿 = [𝑋𝑋 , …𝑋𝑋, 𝑋𝑋𝑇𝑇 ] 𝑋𝑋 1 𝑠𝑠 Let 𝑼𝑼 = 𝑋𝑋 𝑋𝑋= sum𝑇𝑇 of matrix elements. 𝑠𝑠+1 𝑑𝑑 Marginal PDF 𝑽𝑽 𝑋𝑋 𝑑𝑑 𝑋𝑋 Σ 𝑗𝑗=0 𝑗𝑗 ( ) 𝛼𝛼where∑ 𝛼𝛼= , , …𝜶𝜶, , 𝑠𝑠 𝑇𝑇 𝑼𝑼 ∼ 𝐷𝐷𝐷𝐷𝑟𝑟𝑠𝑠 𝜶𝜶𝒖𝒖 𝜶𝜶𝒖𝒖 �𝛼𝛼1 𝛼𝛼2 𝛼𝛼𝑠𝑠 𝛼𝛼Σ − ∑𝑗𝑗=1 𝛼𝛼𝑗𝑗� ( ) 𝑠𝑠 ( ) = 1 𝑠𝑠 𝛼𝛼Σ−1−∑𝑗𝑗=1 𝛼𝛼𝑗𝑗 x ( ) s Γ αΣ αi−1 𝑠𝑠 s 𝑖𝑖 i 𝑓𝑓 𝐮𝐮 Σ 𝑗𝑗 i � − � 𝑥𝑥 � �i=1 Γ�𝛼𝛼 − ∑𝑗𝑗=1 𝛼𝛼 � ∏i=1 Γ α 𝑖𝑖=1 Dirichlet Distribution 183

When marginalized to one variable: ~ ( , )

𝑋𝑋𝑖𝑖 (𝐵𝐵𝐵𝐵𝐵𝐵𝐵𝐵) 𝛼𝛼𝑖𝑖 𝛼𝛼Σ − 𝛼𝛼𝑖𝑖 ( ) = (1 ) x ( Σ) ( ) Γ α i 𝛼𝛼Σ−𝛼𝛼𝑖𝑖−1 αi−1 𝑖𝑖 𝑖𝑖 i 𝑓𝑓 𝑥𝑥 Σ 𝑖𝑖 − 𝑥𝑥 | = Γ 𝛼𝛼| − where𝛼𝛼 Γ α | = [ , , … , , ] 𝑇𝑇 (Kotz et al. 2000,𝑑𝑑−𝑠𝑠 p.488)𝑢𝑢 𝒗𝒗 𝒖𝒖 𝒗𝒗 𝑆𝑆+1 𝑠𝑠+2 𝑚𝑚 0 𝑼𝑼 𝑽𝑽 𝒗𝒗 ∼ 𝐷𝐷𝐷𝐷𝑟𝑟 �𝜶𝜶 � 𝜶𝜶 𝛼𝛼 𝛼𝛼 𝛼𝛼 𝛼𝛼 Conditional PDF =0 ( | ) = 𝑠𝑠 1 𝑠𝑠 𝛼𝛼0−1 𝑗𝑗 𝑗𝑗 𝑠𝑠 Γ (𝛼𝛼 ) 𝛼𝛼𝑖𝑖−1 �s∑ � 𝑖𝑖 𝑖𝑖 𝑓𝑓 𝐮𝐮 𝐯𝐯 i � − � 𝑥𝑥 � �𝑖𝑖=1𝑥𝑥 ∏i=0 Γ α 𝑖𝑖=1 ( ) = P(X x , X x , … , X x )

1 1 2 2 d d 𝐹𝐹 𝐱𝐱 ≤ ≤ ≤𝛼𝛼0−1 = … 1 𝑑𝑑 x d , … , , CDF 𝑥𝑥1 𝑥𝑥2 𝑥𝑥𝑑𝑑 d αi−1 𝑖𝑖 i 2 1 � � � � − � 𝑥𝑥 � �i=1 𝑑𝑑 𝑑𝑑𝑥𝑥 𝑑𝑑𝑥𝑥 Numerical0 methods0 0 have been𝑖𝑖=1 explored to evaluate this integral, see (Kotz et al. 2000, pp.497-500) ( ) = P(X > x , X > x , … , X > x )

1 1 2 2 d d 𝑅𝑅 𝐱𝐱 𝛼𝛼0−1 Reliability = … 1 𝑑𝑑 x d , … , , ∞ ∞ ∞ d αi−1 𝑖𝑖 i 2 1 �1 �2 �𝑑𝑑 � − � 𝑥𝑥 � �i=1 𝑑𝑑 𝑑𝑑𝑥𝑥 𝑑𝑑𝑥𝑥 𝑥𝑥 𝑥𝑥 𝑥𝑥 𝑖𝑖=1 Properties and Moments Median Solve numerically using ( ) = 0.5

Mode: x_i = (α_i − 1)/(α_Σ − d − 1) for α_i > 1, otherwise no mode

Mean - 1st Raw Moment (with α_Σ = Σ_{i=0}^{d} α_i):
E[X] = μ = α/α_Σ
Mean of the marginal distribution:
E[U] = μ_u = α_u/α_Σ  and  E[X_i] = μ_i = α_i/α_Σ
where α_u = [α₁, α₂, …, α_s, α_Σ − Σ_{j=1}^{s} α_j]ᵀ
Mean of the conditional distribution:
E[U|V = v] = μ_{u|v} = α_{u|v}/α_Σ
where α_{u|v} = [α_{s+1}, α_{s+2}, …, α_m, α₀]ᵀ

Variance - 2nd Central Moment (with α_Σ = Σ_{i=0}^{d} α_i):
Var[X_i] = α_i(α_Σ − α_i) / (α_Σ²(α_Σ + 1))
Cov[X_i, X_j] = −α_i α_j / (α_Σ²(α_Σ + 1))

Parameter Estimation

Maximum Likelihood Function

MLE Point Estimates: the MLE estimates α̂_i can be obtained from n observations of x_i by numerically maximizing the log-likelihood function (Kotz et al. 2000, p.505):
Λ(α|E) = n[ ln Γ(α_Σ) − Σ_{j=0}^{d} ln Γ(α_j) ] + n Σ_{j=0}^{d} (α_j − 1)[ (1/n) Σ_{i=1}^{n} ln x_{ij} ]
The method of moments is used to provide initial guesses of α_i for the numerical methods.

Fisher Information Matrix:
I_ij = −n ψ′(α_Σ),  i ≠ j
I_ii = n ψ′(α_i) − n ψ′(α_Σ)
where ψ′(x) = d²/dx² ln Γ(x) is the trigamma function. See section 1.6.8. (Kotz et al. 2000, p.506)

100γ% Confidence Intervals: the confidence intervals can be obtained from the Fisher information matrix.

Bayesian

Non-informative Priors:
Jeffreys Prior: √det(I(α)), where I(α) is given above.

Conjugate Priors:
UOI: p from Multinom_d(k; n_t, p). Likelihood model: multinomial. Evidence: k_i failures in n_t trials with d + 1 possible states. Distribution of UOI: Dirichlet_{d+1} with prior parameters α_o. Posterior parameters: α = α_o + k.

Description, Limitations and Uses

Example

Five machines are measured for performance on demand. Each machine can either fail (F), partially fail (P) or succeed (S) in its application. The machines are tested for 10 demands each, with the following data:

Machine 1: F = 3, P = 2, S = 5
Machine 2: F = 2, P = 2, S = 6
Machine 3: F = 2, P = 3, S = 5
Machine 4: F = 3, P = 3, S = 4
Machine 5: F = 2, P = 3, S = 5

Estimate the multinomial distribution parameter p = [p_F, p_P, p_S]ᵀ. Using the non-informative improper prior Dir₃(0, 0, 0), updating with the pooled counts k = [12, 13, 25]ᵀ over the 50 trials gives the posterior α = [12, 13, 25]ᵀ, so that:
E[p_F] = 12/50 = 0.24,  Var[p_F] = 12·38/(50²·51) = 3.58E-3
E[p_P] = 13/50 = 0.26,  Var[p_P] = 13·37/(50²·51) = 3.77E-3
E[p_S] = 25/50 = 0.50,  Var[p_S] = 25·25/(50²·51) = 4.90E-3

Confidence intervals for the parameters p = [p_F, p_P, p_S]ᵀ can also be calculated using the cdf of the marginal distribution f(x_i).

Characteristics

Beta Generalization. The Dirichlet distribution is a generalization of the beta distribution; the beta distribution is recovered when d = 1.
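The posterior update in the machine example above is a one-liner. This sketch (plain Python, illustrative names) reproduces the posterior means and the moment-formula variances:

```python
# Improper Dir(0,0,0) prior plus observed counts gives the posterior
# Dir(12, 13, 25); means and variances follow the moment formulas above.
counts = {"F": 12, "P": 13, "S": 25}
alpha = list(counts.values())
a_sum = sum(alpha)                     # 50

means = [a / a_sum for a in alpha]     # [0.24, 0.26, 0.5]
variances = [a * (a_sum - a) / (a_sum ** 2 * (a_sum + 1)) for a in alpha]
```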

Interpretation. The higher the α_i, the sharper and more certain the distribution is. This follows from its use in Bayesian statistics to model the multinomial distribution parameter p: as more evidence is used, the α_i values get higher, which reduces uncertainty. The values of α_i can also be interpreted as a count for each state of the multinomial distribution.

Alternative Formulation. The most common formulation of the Dirichlet distribution is as follows:
α = [α₁, α₂, …, α_m]ᵀ where α_i > 0
x = [x₁, x₂, …, x_m]ᵀ where 0 ≤ x_i ≤ 1 and Σ_{i=1}^{m} x_i = 1
f(x) = (1/B(α)) ∏_{i=1}^{m} x_i^{α_i−1}
This formulation is popular because it is a simpler presentation where the matrices α and x are the same size. However, it should be noted that the last term of the x vector is dependent on {x₁ … x_{m−1}} through the relationship x_m = 1 − Σ_{i=1}^{m−1} x_i.

Neutrality. (Kotz et al. 2000, p.500) If X₁ and X₂ are non-negative random variables such that X₁ + X₂ ≤ 1, then X_i is called neutral if the following are independent:
X_i ⊥ X_j/(1 − X_i),  i ≠ j
If X ∼ Dir_d(α) then X is a neutral vector, with each X_i being neutral under all permutations of the above definition. This property is unique to the Dirichlet distribution.

Applications Bayesian Statistics. The Dirichlet distribution is often used as a conjugate prior to the multinomial likelihood function.

Resources

Online:
http://en.wikipedia.org/wiki/Dirichlet_distribution
http://www.cis.hut.fi/ahonkela/dippa/node95.html

Books: Kotz, S., Balakrishnan, N. & Johnson, N.L., 2000. Continuous Multivariate Distributions, Volume 1, Models and Applications, 2nd Edition 2nd ed., Wiley-Interscience.

Congdon, P., 2007. Bayesian Statistical Modelling 2nd ed., Wiley.

MacKay, D.J.C. & Peto, L.C.B., 1995. A hierarchical Dirichlet language model. Natural Language Engineering.

Relationship to Other Distributions

Beta Distribution, Beta(x; α, β). Special case:
Dir_{d=1}(x; [α₁, α₀]ᵀ) = Beta(x; α = α₁, β = α₀)

Gamma Distribution, Gamma(x; λ, k). Let:
Y_i ~ Gamma(λ, k_i) i.i.d.*  and  V = Σ_{i=1}^{d} Y_i
Then:
V ~ Gamma(λ, Σk_i)
and, letting
Z = [Y₁/V, Y₂/V, …, Y_d/V]ᵀ
then:
Z ~ Dir(k₁, …, k_d)

*i.i.d: independent and identically distributed

6.3. Multivariate Normal Continuous Distribution

*Note: for a graphical representation see the bivariate normal distribution.

Parameters & Description

μ = [μ₁, μ₂, …, μ_d]ᵀ, −∞ < μ_i < ∞ — Location vector: a d-dimensional vector giving the mean of each random variable.
Σ, a d × d matrix with σ_ii > 0 — Covariance matrix: quantifies the random variable variances and dependence, and determines the shape of the distribution. Σ is a symmetric positive definite matrix.
d ≥ 2 (integer) — Dimensions: the number of dependent variables.

Limits: −∞ < x_i < ∞

Distribution Formulas

PDF:
f(x) = 1/((2π)^{d/2} |Σ|^{1/2}) · exp{ −½ (x − μ)ᵀ Σ⁻¹ (x − μ) }
where |Σ| is the determinant of Σ.

Marginal PDF:
Let X = [U; V] ~ Norm_d( [μ_u; μ_v], [ Σ_uu, Σ_uv ; Σ_uvᵀ, Σ_vv ] )
where X = [X₁, …, X_p, X_{p+1}, …, X_d]ᵀ, U = [X₁, …, X_p]ᵀ and V = [X_{p+1}, …, X_d]ᵀ. Then:
U ∼ Norm_p(μ_u, Σ_uu)
f(u) = ∫ f(x) dv = 1/((2π)^{p/2} |Σ_uu|^{1/2}) · exp{ −½ (u − μ_u)ᵀ Σ_uu⁻¹ (u − μ_u) }

Conditional PDF:
U | V = v ∼ Norm_p( μ_{u|v}, Σ_{u|v} )

where:
μ_{u|v} = μ_u + Σ_uv Σ_vv⁻¹ (v − μ_v)
Σ_{u|v} = Σ_uu − Σ_uv Σ_vv⁻¹ Σ_uvᵀ

CDF:
F(x) = ∫_{−∞}^{x} 1/((2π)^{d/2} |Σ|^{1/2}) · exp{ −½ (x − μ)ᵀ Σ⁻¹ (x − μ) } dx

Reliability:
R(x) = ∫_{x}^{∞} 1/((2π)^{d/2} |Σ|^{1/2}) · exp{ −½ (x − μ)ᵀ Σ⁻¹ (x − μ) } dx

Properties and Moments

Median: μ

Mode: μ

Mean - 1st Raw Moment: E[X] = μ
Mean of the marginal distributions: E[U] = μ_u, E[V] = μ_v
Mean of the conditional distribution: μ_{u|v} = μ_u + Σ_uv Σ_vv⁻¹ (v − μ_v)

Variance - 2nd Central Moment: Cov[X] = Σ
Covariance of the marginal distributions: Cov(U) = Σ_uu
Covariance of the conditional distributions: Cov(U|V) = Σ_uu − Σ_uv Σ_vv⁻¹ Σ_uvᵀ
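For d = 2 the matrix-form pdf above can be evaluated with an explicit 2×2 inverse. A minimal sketch (illustrative names):

```python
import math

def mvn2_pdf(x, mu, S):
    """Multivariate normal pdf for d = 2 via the matrix form:
    f(x) = exp(-0.5*(x-mu)^T S^-1 (x-mu)) / ((2*pi)^(d/2) * sqrt(|S|))."""
    dx = [x[0] - mu[0], x[1] - mu[1]]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    # Quadratic form dx^T S^-1 dx, using the explicit 2x2 inverse of S.
    q = (S[1][1] * dx[0] ** 2 - 2.0 * S[0][1] * dx[0] * dx[1]
         + S[0][0] * dx[1] ** 2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

# Standard case mu = 0, S = I gives 1/(2*pi) at the origin:
p = mvn2_pdf([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```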

Parameter Estimation

Maximum Likelihood Function

MLE Point Estimates: when given complete data of n_F samples
x_t = [x_{1,t}, x_{2,t}, …, x_{d,t}]ᵀ,  t = 1, 2, …, n_F
the following MLE estimates are given (Kotz et al. 2000, p.161):
μ̂ = (1/n_F) Σ_{t=1}^{n_F} x_t
Σ̂_ij = (1/n_F) Σ_{t=1}^{n_F} (x_{i,t} − μ̂_i)(x_{j,t} − μ̂_j)
A review of different estimators is given in (Kotz et al. 2000). When estimates are from a low number of samples (n_F < 30) a correction factor of n_F − 1 can be introduced to give the unbiased estimator (Tong 1990, p.53):
Σ̂_ij = (1/(n_F − 1)) Σ_{t=1}^{n_F} (x_{i,t} − μ̂_i)(x_{j,t} − μ̂_j)

Fisher Information Matrix:
I_{i,j} = (∂μ/∂θ_i)ᵀ Σ⁻¹ (∂μ/∂θ_j)

Bayesian

Non-informative priors when Σ is known, π₀(μ) (Yang and Berger 1998, p.22):
- Uniform, Improper, Jeffreys and Reference Prior: π(μ) ∝ 1 for μ ∈ (−∞, ∞); posterior π(μ|E) ~ Norm_d( (1/n_F) Σ_{t=1}^{n_F} x_t, Σ/n_F ).
- Shrinkage prior (μᵀ Σ⁻¹ μ)^{−(d−2)/2}: no closed form posterior.

Non-informative priors when μ is known, π₀(Σ) (Yang & Berger 1994):
- Uniform improper prior with limits Σ ∈ (0, ∞): posterior Σ⁻¹|E ~ Wishart_d( S⁻¹; n_F − d − 1 ).
- Jeffreys prior 1/|Σ|^{(d+1)/2} with limits Σ ∈ (0, ∞): posterior Σ⁻¹|E ~ Wishart_d( S⁻¹; n_F ).
- Reference priors on the ordered eigenvalues {λ₁, …, λ_d} of Σ, e.g. 1/(|Σ| ∏_{i<j}(λ_i − λ_j)) and 1/(|Σ|(log λ₁ − log λ_d)^{d−2} ∏_{i<j}(λ_i − λ_j)): proper, no closed form posterior.
- MDIP prior 1/|Σ|: no closed form posterior.

Non-informative priors when both μ and Σ are unknown, π₀(μ, Σ), for the bivariate normal: a complete coverage of numerous reference prior distributions with different parameter orderings is contained in (Berger & Sun 2008).

- Uniform improper prior: 1; no closed form posterior.
- Jeffreys prior: 1/|Σ|^{(d+1)/2}; no closed form posterior.
- Reference priors on the ordered eigenvalues {λ₁, …, λ_d}, e.g. 1/(|Σ| ∏_{i<j}(λ_i − λ_j)) and 1/(|Σ|(log λ₁ − log λ_d)^{d−2} ∏_{i<j}(λ_i − λ_j)): no closed form posterior.
- MDIP prior: 1/|Σ|; no closed form posterior.

where λ_i is the ith eigenvalue of Σ, R̂ and R are the population and sample multiple correlation coefficients, and:
S_ij = (1/(n_F − 1)) Σ_{t=1}^{n_F} (x_{i,t} − x̄_i)(x_{j,t} − x̄_j)   and   x̄ = (1/n_F) Σ_{t=1}^{n_F} x_t

Conjugate Priors:
UOI: μ from Norm_d(μ, Σ) with known Σ. Likelihood model: multivariate normal. Evidence: n_F events at points x. Distribution of UOI: multivariate normal with prior parameters U₀, V₀. Posterior parameters:
U = V( V₀⁻¹ U₀ + n_F Σ⁻¹ x̄ )
V = ( V₀⁻¹ + n_F Σ⁻¹ )⁻¹

Description, Limitations and Uses

Example: see the bivariate normal distribution.

Characteristics

Standard Spherical Normal Distribution. When μ = 0 and Σ = I we obtain the standard spherical normal distribution:
f(x) = 1/(2π)^{d/2} · exp{ −½ xᵀx }

Covariance Matrix. (Yang et al. 2004, p.49)
- Diagonal elements: the diagonal elements of Σ are the variances of each random variable, σ_ii = Var(X_i).
- Non-diagonal elements: the non-diagonal elements give the covariances, σ_ij = Cov(X_i, X_j) = σ_ji; hence the matrix is symmetric.
- Independent variables: if Cov(X_i, X_j) = σ_ij = 0 then X_i and X_j are independent.
- σ_ij > 0: when X_i increases, X_j tends to increase.
- σ_ij < 0: when X_i increases, X_j tends to decrease.

Ellipsoid Axis. The ellipsoids have axes pointing in the directions of the eigenvectors of Σ. The magnitudes of these axes are given by the corresponding eigenvalues.

Mean / Median / Mode: as per the univariate distributions the mean, median and mode are equal.

Convolution Property:
Let X ∼ Norm_d(μ_x, Σ_x) and Y ∼ Norm_d(μ_y, Σ_y) with X ⊥ Y (independent). Then:
X + Y ~ Norm_d(μ_x + μ_y, Σ_x + Σ_y)
Note: if X and Y are dependent then X + Y may not be normally distributed. (Novosyolov 2006)

Scaling Property:
Let Y = AX + b, where Y is a p×1 matrix, b is a p×1 matrix and A is a p×d matrix. Then:
Y ~ Norm_p(Aμ + b, AΣAᵀ)

Marginalize Property:
Let X = [U; V] ~ Norm_d( [μ_u; μ_v], [ Σ_uu, Σ_uv ; Σ_uvᵀ, Σ_vv ] ). Then:
U ∼ Norm_p(μ_u, Σ_uu), where U is a p×1 matrix.

Conditional Property:
With the same partition:
U | V = v ∼ Norm_p( μ_{u|v}, Σ_{u|v} ), where U is a p×1 matrix,
μ_{u|v} = μ_u + Σ_uv Σ_vv⁻¹ (v − μ_v)
Σ_{u|v} = Σ_uu − Σ_uv Σ_vv⁻¹ Σ_uvᵀ
It should be noted that the covariance of the conditional distribution does not depend on the given values in V.
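The conditional property reduces to a small amount of linear algebra. This sketch (d = 3, U = [X₁], V = [X₂, X₃]; all names are illustrative) implements the block formulas and checks that a diagonal Σ gives back the marginal:

```python
def inv2(m):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def conditional_norm(mu, S, v):
    """Return (mean, variance) of X1 | (X2, X3) = v for a 3-d normal.

    mu : [mu1, mu2, mu3], S : 3x3 covariance matrix, v : [x2, x3].
    Implements mu_u|v = mu_u + S_uv S_vv^-1 (v - mu_v) and
    S_u|v = S_uu - S_uv S_vv^-1 S_uv^T.
    """
    mu_u, mu_v = mu[0], mu[1:]
    S_uu = S[0][0]
    S_uv = [S[0][1], S[0][2]]          # 1x2 block
    S_vv = [[S[1][1], S[1][2]], [S[2][1], S[2][2]]]
    S_vv_inv = inv2(S_vv)
    # w = S_uv S_vv^-1 (a 1x2 row vector)
    w = [S_uv[0] * S_vv_inv[0][j] + S_uv[1] * S_vv_inv[1][j] for j in range(2)]
    mean = mu_u + w[0] * (v[0] - mu_v[0]) + w[1] * (v[1] - mu_v[1])
    var = S_uu - (w[0] * S_uv[0] + w[1] * S_uv[1])
    return mean, var

# With a diagonal covariance the conditional reduces to the marginal:
m, s2 = conditional_norm([1.0, 2.0, 3.0],
                         [[4.0, 0.0, 0.0],
                          [0.0, 9.0, 0.0],
                          [0.0, 0.0, 1.0]],
                         [5.0, 5.0])
```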

Applications

Convenient Properties. (Balakrishnan & Lai 2009, p.477) The popularity of the multivariate normal distribution over other multivariate distributions is due to the convenience of its conditional and marginal distribution properties, which both produce univariate normal distributions.

Kalman Filter. The Kalman filter estimates the current state of a system in the presence of noisy measurements. This process uses multivariate normal distributions to model the noise.

Multivariate Analysis of Variance (MANOVA). A test used to analyze the variance and dependence of variables. A popular model used to conduct MANOVA assumes the data comes from a multivariate normal population.

Gaussian Process Regression. This is a model for observations or events that occur in a continuous domain of time or space, where every point is associated with a normally distributed random variable and every finite collection of these random variables has a multivariate normal distribution.

Multi-Linear Regression. Multi-linear regression attempts to model the relationship between parameters and variables by fitting a linear equation. One model to do such a task (MLE) fits a distribution to the observed variance where a multivariate normal distribution is often assumed.

Gaussian Bayesian Belief Networks (BBN). BBNs graphically represent the dependence between variables in a probability distribution. When using continuous random variables, BBNs quickly become tremendously complicated; however, due to the multivariate normal distribution's conditional and marginal properties this task is simplified, making the Gaussian BBN popular.

Resources

Online:
http://mathworld.wolfram.com/BivariateNormalDistribution.html
http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_binormal_distri.htm (interactive visual representation)

Books:
Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution 2nd ed., CRC.

Tong, Y.L., 1990. The Multivariate Normal Distribution, Springer.

Yang, K. et al., 2004. Multivariate Statistical Methods in Quality Management 1st ed., McGraw-Hill Professional.

Bertsekas, D.P. & Tsitsiklis, J.N., 2008. Introduction to Probability 2nd ed., Athena Scientific.

6.4. Multinomial Discrete Distribution

Probability Density Function - f(k)

Trinomial Distribution, f([k1, k2, k3]^T), where n = 8, p = [1/3, 1/4, 5/12]^T. Note k3 is not shown because it is determined using k3 = n − k1 − k2.

Trinomial Distribution, f([k1, k2, k3]^T), where n = 20, p = [1/3, 1/2, 1/6]^T. Note k3 is not shown because it is determined as k3 = n − k1 − k2.

Parameters & Description

n: Number of Trials. n > 0 (integer). This is sometimes called the index. (Johnson et al. 1997, p.31)

p = [p1, p2, …, pd]^T: Event Probability Matrix. pi is the probability of event i occurring, with 0 ≤ pi ≤ 1 and Σ(i=1..d) pi = 1. The pi are often called cell probabilities. (Johnson et al. 1997, p.31)

d: Dimensions. The number of mutually exclusive states of the system. d ≥ 2 (integer).

Limits

ki ∈ {0, …, n}

Σ(i=1..d) ki = n

Distribution Formulas

f(k) = C(n; k1, k2, …, kd) ∏(i=1..d) pi^ki

where the multinomial coefficient is

C(n; k1, k2, …, kd) = n! / (k1! k2! … kd!) = Γ(n+1) / ∏(i=1..d) Γ(ki+1)

Note that in p there are only d−1 'free' variables, as the last pd = 1 − Σ(i=1..d−1) pi, giving the distribution:

f(k) = C(n; k1, …, kd) [∏(i=1..d−1) pi^ki] [1 − Σ(i=1..d−1) pi]^(n − Σ(i=1..d−1) ki)

From this form the special case of the binomial distribution when d = 2 can be seen.

Marginal PDF

Let K = [U, V]^T ~ Multi_d(n, [pu, pv]^T), where
K = [K1, …, Kd]^T
U = [K1, …, Ks]^T
V = [K(s+1), …, Kd]^T

Then U ~ Multi_s(n, pu), where pu = [p1, p2, …, p(s−1), 1 − Σ(i=1..s−1) pi]^T, and

f(u) = C(n; k1, k2, …, ks) ∏(i=1..s) pi^ki

When there are only two states, p = [p, 1 − p]^T:

f(k) = C(n; k) p^k (1 − p)^(n−k)

Conditional PDF

U | V = v ~ Multi_s(n(u|v), p(u|v)), where
n(u|v) = n − nv = n − Σ(i=s+1..d) ki
p(u|v) = [p1, p2, …, ps]^T / Σ(i=1..s) pi

CDF

F(k) = P(K1 ≤ k1, K2 ≤ k2, …, Kd ≤ kd)
= Σ(j1=0..k1) Σ(j2=0..k2) … Σ(jd=0..kd) C(n; j1, j2, …, jd) ∏(i=1..d) pi^ji

Reliability

R(k) = P(K1 > k1, K2 > k2, …, Kd > kd)
= Σ(j1=k1+1..n) Σ(j2=k2+1..n) … Σ(jd=kd+1..n) C(n; j1, j2, …, jd) ∏(i=1..d) pi^ji

Properties and Moments

Median: Median(Ki) is either ⌊n pi⌋ or ⌈n pi⌉.

Mode: Mode(Ki) = ⌊(n+1) pi⌋

Mean - 1st Raw Moment

E[K] = μ = np

Mean of the marginal distribution:
E[U] = μu = n pu
E[Ki] = μi = n pi

Mean of the conditional distribution:
E[U | V = v] = μ(u|v) = n(u|v) p(u|v)
where
n(u|v) = n − nv = n − Σ(i=s+1..d) ki
p(u|v) = [p1, p2, …, ps]^T / Σ(i=1..s) pi

Note: ⌊x⌋ is the floor function (the largest integer not greater than x) and ⌈x⌉ is the ceiling function (the smallest integer not less than x).
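The mean formula E[Ki] = n pi can be verified by exact enumeration over the support; the sketch below uses a small trinomial with made-up parameters:

```python
from math import factorial, prod
from itertools import product

def pmf(k, p):
    """Multinomial pmf with n = sum(k)."""
    n = sum(k)
    return factorial(n) // prod(factorial(x) for x in k) * prod(pi**x for pi, x in zip(p, k))

# Exact check of E[Ki] = n*pi for a trinomial with n = 6 (illustrative numbers)
n, p = 6, [0.5, 0.3, 0.2]
support = [k for k in product(range(n + 1), repeat=3) if sum(k) == n]
mean = [sum(k[i] * pmf(k, p) for k in support) for i in range(3)]
```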

Variance - 2nd Central Moment

Var[Ki] = n pi (1 − pi)
Cov[Ki, Kj] = −n pi pj, i ≠ j

Covariance of marginal distributions:
Var[Ki] = n pi (1 − pi)

Covariance of conditional distributions:
Var[K(u|v),i] = n(u|v) p(u|v),i (1 − p(u|v),i)
Cov[K(u|v),i, K(u|v),j] = −n(u|v) p(u|v),i p(u|v),j
where
n(u|v) = n − nv = n − Σ(i=s+1..d) ki
p(u|v) = [p1, p2, …, ps]^T / Σ(i=1..s) pi

Parameter Estimation

Maximum Likelihood Function

MLE Point Estimates. As with the binomial distribution, the MLE estimate, given the vector k (and therefore n), is (Johnson et al. 1997, p.51):

p̂ = k / n

Where there are T observations k_t, each containing n_t trials:

p̂ = Σ(t=1..T) k_t / Σ(t=1..T) n_t

100γ% Confidence Intervals (Complete Data). An approximation of the joint 100γ% confidence limits, given by Goodman in 1965, is (Johnson et al. 1997, p.51):

lower confidence limit for pi:
[A + 2ki − √(A(A + 4ki(n − ki)/n))] / (2(n + A))

upper confidence limit for pi:
[A + 2ki + √(A(A + 4ki(n − ki)/n))] / (2(n + A))

where Φ is the standard normal CDF and:
A = Z²((d−1+γ)/d) = [Φ⁻¹((d−1+γ)/d)]²

A complete coverage of estimation techniques and confidence intervals is contained in (Johnson et al. 1997, pp.51-65). A more accurate method, which requires numerical methods, is given in (Sison & Glaz 1995).
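The MLE point estimate and Goodman's approximate joint limits above can be sketched as follows (illustrative counts; `goodman_ci` is a hypothetical helper name):

```python
from math import sqrt
from statistics import NormalDist

def goodman_ci(k_i, n, d, gamma=0.95):
    """Approximate joint 100*gamma% limits for p_i (Goodman 1965 form above)."""
    A = NormalDist().inv_cdf((d - 1 + gamma) / d) ** 2
    half = sqrt(A * (A + 4 * k_i * (n - k_i) / n))
    return ((A + 2 * k_i - half) / (2 * (n + A)),
            (A + 2 * k_i + half) / (2 * (n + A)))

k = [12, 7, 11, 10, 8, 12]         # illustrative counts over d = 6 states
n = sum(k)                         # 60 trials
p_hat = [ki / n for ki in k]       # MLE point estimate p = k/n
lo, hi = goodman_ci(k[0], n, d=6)  # joint 95% limits for p_1
```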

Bayesian

Non-informative Priors, π(p) (Yang and Berger 1998, p.6)

Uniform Prior:
Prior: π(p) = 1, i.e. Dir(αi = 1)
Posterior: Dir(p | 1 + k)

Jeffreys Prior:
Prior: π(p) = C [∏(i=1..d) pi]^(-1/2), i.e. Dir(αi = 1/2), where C is a constant
Posterior: Dir(p | 1/2 + k)

One-Group Reference Prior:
Same as the Jeffreys prior. This approach considers all parameters to be of equal importance. (Berger & Bernardo 1992)

d-group Reference Prior:
Prior: π(p) = C [∏(i=1..d−1) pi (1 − Σ(j=1..i) pj)]^(-1/2), where C is a constant. Proper; see the m-group posterior with all group lengths equal to 1.
This approach considers each parameter to be of different importance (group length 1), and so the parameters must be ordered by importance. (Berger & Bernardo 1992)

m-group Reference Prior:
Prior:
π(p) = C [ (1 − Σ(j=1..Nm) pj) ∏(i=1..d−1) pi ∏(i=1..m−1) (1 − Σ(j=1..Ni) pj)^(n(i+1)) ]^(-1/2)
where the groups are given by:
p1 = [p1, …, p(n1)]^T, p2 = [p(n1+1), …, p(n1+n2)]^T, …, pm = [p(N(m−1)+1), …, p(Nm)]^T
with Nj = n1 + n2 + … + nj for j = 1, …, m, and C is a constant.
Posterior:
π(p | k) ∝ (1 − Σ(j=1..Nm) pj)^(kd − 1/2) [∏(i=1..d−1) pi ∏(i=1..m−1) (1 − Σ(j=1..Ni) pj)^(n(i+1))]^(-1/2)
This approach splits the parameters into m different groups of importance. Within a group order is not important, but the groups need to be ordered by importance. It is common to have m = 2 and split the parameters into parameters of interest and nuisance parameters. (Berger & Bernardo 1992)

MDIP:
Prior: π(p) = ∏(i=1..d) pi^pi, i.e. Dir(αi = pi + 1)
Posterior: Dir(p | p + 1 + k)

Novick and Hall's Prior (improper):
Prior: π(p) = ∏(i=1..d) pi^(-1), i.e. Dir(αi = 0)
Posterior: Dir(p | k)

Conjugate Priors (Fink 1997)

UOI: p from Multi_d(k; n, p)
Evidence: ki failures in n trials with d possible states
Distribution of UOI: Dirichlet
Prior Parameters: α0
Posterior Parameters: α = α0 + k

Description, Limitations and Uses

Example. A six-sided die thrown 60 times produces the following multinomial data:

Face Number: 1 2 3 4 5 6
Times Observed: 12 7 11 10 8 12

k = [12, 7, 11, 10, 8, 12]^T, n = 60, p̂ = k/n = [0.200, 0.117, 0.183, 0.167, 0.133, 0.200]^T

Characteristics

Binomial Generalization. The multinomial distribution is a generalization of the binomial distribution in which more than two states of the system are allowed. The binomial distribution is the special case where d = 2.
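Using the die counts from the example, the conjugate Dirichlet update (α = α0 + k) can be sketched as follows; the uniform Dir(1, …, 1) prior is an assumption for illustration:

```python
# Conjugate update for multinomial data: posterior parameters alpha = alpha0 + k.
k = [12, 7, 11, 10, 8, 12]      # observed face counts, n = 60
alpha0 = [1] * 6                # uniform Dir(1, ..., 1) prior (assumed)
alpha = [a + ki for a, ki in zip(alpha0, k)]

# Posterior mean of the Dirichlet: E[p_i | k] = alpha_i / sum(alpha)
post_mean = [a / sum(alpha) for a in alpha]
```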

Covariance. All covariances are negative. This is because an increase in one parameter pi must result in a decrease in another parameter pj to satisfy Σpi = 1.

With Replacement. The multinomial distribution assumes sampling with replacement. The equivalent distribution which assumes sampling without replacement is the multivariate hypergeometric distribution.

Convolution Property. Let K_t ~ Multi_d(k; n_t, p). Then

Σ K_t ~ Multi_d(Σ k_t; Σ n_t, p)

Note this does not hold when the parameter p differs between the K_t.
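The convolution property can be checked exactly for a small case by convolving two trinomial pmfs and comparing with the single multinomial; all numbers here are illustrative:

```python
from math import factorial, prod
from itertools import product

def pmf(k, p):
    """Multinomial pmf with n = sum(k)."""
    n = sum(k)
    return factorial(n) // prod(factorial(x) for x in k) * prod(pi**x for pi, x in zip(p, k))

p = [0.5, 0.3, 0.2]   # shared p (the property requires identical p)
n1, n2 = 2, 3
target = (2, 2, 1)    # a value of K1 + K2, with n1 + n2 = 5 trials

# Convolution: sum P(K1 = a) P(K2 = target - a) over all valid splits of target
conv = sum(pmf(a, p) * pmf([t - ai for t, ai in zip(target, a)], p)
           for a in product(range(n1 + 1), repeat=3)
           if sum(a) == n1 and all(t - ai >= 0 for t, ai in zip(target, a)))
```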

Applications

Partial Failures. When the states of a system under demand cannot be modeled with two states (success or failure), the multinomial distribution may be used. Examples of this include modeling discrete states of component degradation.

Resources

Online:
http://en.wikipedia.org/wiki/Multinomial_distribution
http://mathworld.wolfram.com/MultinomialDistribution.html
http://www.math.uah.edu/stat/bernoulli/Multinomial.xhtml

Books: Johnson, N.L., Kotz, S. & Balakrishnan, N., 1997. Discrete Multivariate Distributions 1st ed., Wiley-Interscience.

Relationship to Other Distributions

Binomial Distribution

Special Case:
Multi(d=2)(k | n, p) = Binom(k | n, p)

200 Probability Distributions Used in Reliability Engineering

7. References

Abadir, K. & Magdalinos, T., 2002. The Characteristic Function from a Family of Truncated Normal Distributions. Econometric Theory, 18(5), p.1276-1287.

Agresti, A., 2002. Categorical data analysis, John Wiley and Sons.

Aitchison, J.J. & Brown, J.A.C., 1957. The Lognormal Distribution, New York: Cambridge University Press.

Andersen, P.K. et al., 1996. Statistical Models Based on Counting Processes Corrected., Springer.

Angus, J.E., 1994. Bootstrap one-sided confidence intervals for the log-normal mean. Journal of the Royal Statistical Society. Series D (The Statistician), 43(3), p.395–401.

Anon, Six Sigma | Process Management | Strategic Process Management | Welcome to SSA & Company.

Antle, C., Klimko, L. & Harkness, W., 1970. Confidence Intervals for the Parameters of the Logistic Distribution. Biometrika, 57(2), p.397-402.

Aoshima, M. & Govindarajulu, Z., 2002. Fixed-width confidence interval for a lognormal mean. International Journal of Mathematics and Mathematical Sciences, 29(3), p.143–153.

Arnold, B.C., 1983. Pareto distributions, Fairland, MD: International Co-operative Pub. House.

Artin, E., 1964. The Gamma Function, New York: Holt, Rinehart & Winston.

Balakrishnan, 1991. Handbook of the Logistic Distribution 1st ed., CRC.

Balakrishnan, N. & Basu, A.P., 1996. Exponential Distribution: Theory, Methods and Applications 1st ed., CRC.

Balakrishnan, N. & Lai, C.-D., 2009. Continuous Bivariate Distributions 2nd ed., Springer.

Balakrishnan, N. & Rao, C.R., 2001. Handbook of Statistics 20: Advances in Reliability 1st ed., Elsevier Science & Technology.

Berger, J.O., 1993. Statistical Decision Theory and Bayesian Analysis 2nd ed., Springer.

Berger, J.O. & Bernardo, J.M., 1992. Ordered Group Reference Priors with Application to the Multinomial Problem. Biometrika, 79(1), p.25-37.

Berger, J.O. & Sellke, T., 1987. Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence. Journal of the American Statistical Association, 82(397), p.112-122.

Berger, J.O. & Sun, D., 2008. Objective priors for the bivariate normal model. The Annals of Statistics, 36(2), p.963-982.

Bernardo, J.M. et al., 1992. On the development of reference priors. Bayesian statistics, 4, p.35–60.

Berry, D.A., Chaloner, K.M. & Geweke, J.K., 1995. Bayesian Analysis in Statistics and Econometrics: Essays in Honor of Arnold Zellner 1st ed., Wiley-Interscience.

Bertsekas, D.P. & Tsitsiklis, J.N., 2008. Introduction to Probability 2nd ed., Athena Scientific.

Billingsley, P., 1995. Probability and Measure 3rd ed., Wiley-Interscience.

Birnbaum, Z.W. & Saunders, S.C., 1969. A New Family of Life Distributions. Journal of Applied Probability, 6(2), p.319-327.

Björck, A., 1996. Numerical Methods for Least Squares Problems 1st ed., SIAM: Society for Industrial and Applied Mathematics.

Bowman, K.O. & Shenton, L.R., 1988. Properties of estimators for the gamma distribution, CRC Press.

Brown, L.D., Cai, T.T. & DasGupta, A., 2001. Interval estimation for a binomial proportion. Statistical Science, p.101–117.

Christensen, R. & Huffman, M.D., 1985. Bayesian Point Estimation Using the Predictive Distribution. The American Statistician, 39(4), p.319-321.

Cohen, 1991. Truncated and Censored Samples 1st ed., CRC Press.

Collani, E.V. & Dräger, K., 2001. Binomial distribution handbook for scientists and engineers, Birkhäuser.

Congdon, P., 2007. Bayesian Statistical Modelling 2nd ed., Wiley.

Cozman, F. & Krotkov, E., 1997. Truncated Gaussians as Tolerance Sets.

Crow, E.L. & Shimizu, K., 1988. Lognormal distributions, CRC Press.

Dekking, F.M. et al., 2007. A Modern Introduction to Probability and Statistics: Understanding Why and How, Springer.

Fink, D., 1997. A compendium of conjugate priors. See http://www.people.cornell.edu/pages/df36/CONJINTRnew%20TEX.pdf, p.46.

Georges, P. et al., 2001. Multivariate Survival Modelling: A Unified Approach with Copulas. SSRN eLibrary.

Gupta, A.K. & Nadarajah, S., 2004. Handbook of beta distribution and its applications, CRC Press.

Gupta, P.L., Gupta, R.C. & Tripathi, R.C., 1997. On the monotonic properties of discrete failure rates. Journal of Statistical Planning and Inference, 65(2), p.255-268.

Haight, F.A., 1967. Handbook of the Poisson distribution, New York: Wiley.

Hastings, N.A.J., Peacock, B. & Evans, M., 2000. Statistical Distributions 3rd ed., John Wiley & Sons.

Jiang, R. & Murthy, D.N.P., 1996. A mixture model involving three Weibull distributions. In Proceedings of the Second Australia–Japan Workshop on Stochastic Models in Engineering, Technology and Management. Gold Coast, Australia, pp. 260-270.

Jiang, R. & Murthy, D.N.P., 1998. Mixture of Weibull distributions - parametric characterization of failure rate function. Applied Stochastic Models and Data Analysis, (14), p.47-65.

Jiang, R. & Murthy, D.N.P., 1995. Modeling Failure-Data by Mixture of 2 Weibull Distributions: A Graphical Approach. IEEE Transactions on Reliability, 44, p.477-488.

Jiang, R. & Murthy, D.N.P., 1999. The exponentiated Weibull family: a graphical approach. Reliability, IEEE Transactions on, 48(1), p.68-72.

Johnson, N.L., Kemp, A.W. & Kotz, S., 2005. Univariate Discrete Distributions 3rd ed., Wiley-Interscience.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1994. Continuous Univariate Distributions, Vol. 1 2nd ed., Wiley-Interscience.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1995. Continuous Univariate Distributions, Vol. 2 2nd ed., Wiley-Interscience.

Johnson, N.L., Kotz, S. & Balakrishnan, N., 1997. Discrete Multivariate Distributions 1st ed., Wiley-Interscience.

Kimball, B.F., 1960. On the Choice of Plotting Positions on Probability Paper. Journal of the American Statistical Association, 55(291), p.546-560.

Kleiber, C. & Kotz, S., 2003. Statistical Size Distributions in Economics and Actuarial Sciences 1st ed., Wiley-Interscience.

Klein, J.P. & Moeschberger, M.L., 2003. : techniques for censored and truncated data, Springer.

Kotz, S., Balakrishnan, N. & Johnson, N.L., 2000. Continuous Multivariate Distributions, Volume 1: Models and Applications 2nd ed., Wiley-Interscience.

Kotz, S. & Dorp, J.R. van, 2004. Beyond Beta: Other Continuous Families Of Distributions With Bounded Support And Applications, World Scientific Publishing Company.

Kundu, D., Kannan, N. & Balakrishnan, N., 2008. On the hazard function of Birnbaum- Saunders distribution and associated inference. Comput. Stat. Data Anal., 52(5), p.2692-2702.

Lai, C.D., Xie, M. & Murthy, D.N.P., 2003. A modified Weibull distribution. IEEE Transactions on Reliability, 52(1), p.33-37.

Lai, C.-D. & Xie, M., 2006. Stochastic Ageing and Dependence for Reliability 1st ed., Springer.

Lawless, J.F., 2002. Statistical Models and Methods for Lifetime Data 2nd ed., Wiley-Interscience.

Leemis, L.M. & McQueston, J.T., 2008. Univariate distribution relationships. The American Statistician, 62(1), p.45–53.

Leipnik, R.B., 1991. On Lognormal Random Variables: I-the Characteristic Function. The ANZIAM Journal, 32(03), p.327-347.

Lemonte, A.J., Cribari-Neto, F. & Vasconcellos, K.L.P., 2007. Improved statistical inference for the two-parameter Birnbaum-Saunders distribution. Computational Statistics & Data Analysis, 51(9), p.4656-4681.

Limpert, E., Stahel, W. & Abbt, M., 2001. Log-normal Distributions across the Sciences: Keys and Clues. BioScience, 51(5), p.341-352.

MacKay, D.J.C. & Peto, L.C.B., 1995. A hierarchical Dirichlet language model. Natural Language Engineering.

Manzini, R. et al., 2009. Maintenance for Industrial Systems 1st ed., Springer.

Martz, H.F. & Waller, R., 1982. Bayesian reliability analysis, New York: John Wiley & Sons.

Meeker, W.Q. & Escobar, L.A., 1998. Statistical Methods for Reliability Data 1st ed., Wiley-Interscience.

Modarres, M., Kaminskiy, M. & Krivtsov, V., 1999. Reliability engineering and risk analysis, CRC Press.

Murthy, D.N.P., Xie, M. & Jiang, R., 2003. Weibull Models 1st ed., Wiley-Interscience.

Nelson, W.B., 1990. Accelerated Testing: Statistical Models, Test Plans, and Data Analysis, Wiley-Interscience.

Nelson, W.B., 1982. Applied Life Data Analysis, Wiley-Interscience.

Novosyolov, A., 2006. The sum of dependent normal variables may be not normal. http://risktheory.ru/papers/sumOfDep.pdf.

Patel, J.K. & Read, C.B., 1996. Handbook of the Normal Distribution 2nd ed., CRC.

Pham, H., 2006. Springer Handbook of Engineering Statistics 1st ed., Springer.

Provan, J.W., 1987. Probabilistic approaches to the material-related reliability of fracture-sensitive structures. In Probabilistic fracture mechanics and reliability. Dordrecht: Martinus Nijhoff Publishers, p.1–45.

Rao, C.R. & Toutenburg, H., 1999. Linear Models: Least Squares and Alternatives 2nd ed., Springer.

Rausand, M. & Høyland, A., 2004. System reliability theory, Wiley-IEEE.

Rencher, A.C., 1997. Multivariate Statistical Inference and Applications, Volume 2, Methods of Multivariate Analysis Har/Dis., Wiley-Interscience.

Rinne, H., 2008. The Weibull Distribution: A Handbook 1st ed., Chapman & Hall/CRC.

Schneider, H., 1986. Truncated and censored samples from normal populations, M. Dekker.

Simon, M.K., 2006. Probability Distributions Involving Gaussian Random Variables: A Handbook for Engineers and Scientists, Springer.

Singpurwalla, N.D., 2006. Reliability and Risk: A Bayesian Perspective 1st ed., Wiley.

Sison, C.P. & Glaz, J., 1995. Simultaneous Confidence Intervals and Sample Size Determination for Multinomial Proportions. Journal of the American Statistical Association, 90(429).

Tong, Y.L., 1990. The Multivariate Normal Distribution, Springer.

Xie, M., Gaudoin, O. & Bracquemond, C., 2002. Redefining Failure Rate Function for Discrete Distributions. International Journal of Reliability, Quality and Safety Engineering, 9(3), p.275.

Xie, M., Goh, T.N. & Tang, Y., 2004. On changing points of mean residual life and failure rate function for some generalized Weibull distributions. Reliability Engineering and System Safety, 84(3), p.293–299.

Xie, M., Tang, Y. & Goh, T.N., 2002. A modified Weibull extension with bathtub-shaped failure rate function. Reliability Engineering and System Safety, 76(3), p.279– 285.

Yang, R. & Berger, J.O., 1998. A Catalog of Noninformative Priors (DRAFT).

Yang, K. et al., 2004. Multivariate Statistical Methods in Quality Management 1st ed., McGraw-Hill Professional.

Yang, R. & Berger, J.O., 1994. Estimation of a Covariance Matrix Using the Reference Prior. The Annals of Statistics, 22(3), p.1195-1211.

Zhou, X.H. & Gao, S., 1997. Confidence intervals for the log-normal mean. Statistics in medicine, 16(7), p.783–790.
