<<

2016 Proceedings of PICMET '16: Technology Management for Social Innovation

Suitability of Different Probability Distributions for Performing Schedule Risk Simulations in Project Management

J. Krige Visser Department of Engineering and Technology Management, University of Pretoria, Pretoria, South Africa

Abstract--Project managers are often confronted with the The Project Evaluation and Review Technique (PERT), in question on what is the probability of finishing a project within conjunction with the Critical Path Method (CPM), were budget or finishing a project on time. One method or tool that is developed in the 1950’s to address the uncertainty in project useful in answering these questions at various stages of a project duration for complex projects [11], [13]. The is to develop a Monte Carlo simulation for the cost or duration or mean value for each activity of the project network was of the project and to update and repeat the simulations with actual data as the project progresses. The PERT method became calculated by applying the and three popular in the 1950’s to express the uncertainty in the duration estimates for the duration of the activity. The total project of activities. Many other distributions are available for use in duration was determined by adding all the duration values of cost or schedule simulations. the activities on the critical path. The Monte Carlo simulation This paper discusses the results of a project to investigate the (MCS) provides a distribution for the total project duration output of schedule simulations when different distributions, e.g. and is therefore more useful as a method or tool for decision triangular, normal, lognormal or betaPert, are used to express making. the uncertainty in activity durations. Two examples were used to compare the output distributions, i.e. a network with 10 B. Research objectives activities in sequence and a network where some of the activities are performed in parallel. The results indicate that there is no The main objective of this study was to compare the significant difference in the output distributions when different output of a project schedule risk simulation when alternative input distributions with the same mean and values are input distributions are used to express the uncertainty in the used. duration of individual activities of the project. This objective is supported by the following sub-objectives. I. INTRODUCTION  Perform a literature search on comparison of probability distributions used in general Monte Carlo simulation. A. Background  Define two test ‘projects’, one with a number of activities Many factors influence the total project outcome or to be performed in sequence and one with parallel performance. The three dimensional goal for any project is activities, that can be used to compare different well known, i.e. to finish the project on time, within budget distributions. and with satisfactory performance or quality [11]. Large and  Run simulations using different input schedule complex projects typically have high uncertainty in the final distributions for a series and parallel network with a project outcome, e.g. space exploration (International space simulation add-in for MS Excel and standalone simulation station, Mars Curiosity lander), passenger airplane software. development (Airbus A380, Boeing 787) and bridge  Compare the total project duration outputs graphically and construction (Akashi-Kaikyo, Millau). Such complex projects through statistical analysis. are often late and exceed the original budget. Good project and risk management therefore involves the reduction of The following hypothesis was defined. uncertainty in the outcome of the project. H0: The distribution for the total duration of a project One method of reducing general uncertainty in the project comprising ten individual activities that each have a duration is to model the uncertainty in the duration of triangular distribution is not significantly different to the individual activities by means of probability distributions and distribution for the total project duration obtained when to run a simulation with multiple trials to produce a the ten activities have a normal, lognormal, Gumbel or for the total project duration. An , 95% confidence level. initial simulation is usually performed before project H1: The distribution for the total duration of a project implementation and for projects with a long duration the comprising ten individual activities that each have a simulation can be repeated at various stages or milestones of triangular distribution is significantly different to the the project. Actual duration values for completed activities distribution for the total project duration obtained when are then used in place of the initial estimates and 80 or 90% the ten activities have a normal, lognormal, Gumbel or certainty of completion are compared with the original values Weibull distribution, at a 95% confidence level. derived from the first simulation. This was the approach used for the schedule risk management of the Øresund bridge project that was completed 5 months ahead of schedule in 2001 [1].

2031 2016 Proceedings of PICMET '16: Technology Management for Social Innovation

II. LITERATURE AND THEORY and ρ is the random number obtained with the RAND() function of MS Excel or the RandUniform(0,1) function A. Project scheduling provided by SimVoi. Planning and scheduling are tools that assist in achieving the three crucial goals of project management. The work breakdown structure (WBS) is one of the outputs of the The normal distribution, also know as the Gaussian planning function and enables the development of the project distribution, is a continuous and symmetric distribution cost breakdown and the project schedule. Single point defined by two parameters, i.e. the mean value µ and the estimates are used for the duration of activities and the standard deviation, . The density function of the normal activity network enables a calculation of the critical path and distribution is defined over -∞ to +∞ and care should be taken total project duration. In modern projects, a schedule when using this distribution in simulation where the variable simulation is often used to complement the single point is only defined from 0 - ∞, e.g. duration of an activity which deterministic approach by incorporating uncertainty in the cannot be negative. The normal distribution is symmetric duration of the activities [2]. around the mean value whereas the other distributions investigated are non-symmetric. Random variates for the B. Project risk simulation normal distribution can be calculated with the MCS methods have become more popular amongst NORMINV(RAND(), µ, ) function of MS Excel or the project managers and planners in the last decade due to the RandNormal(µ, ) function provided by SimVoi. availability of fast desktop computers and software that is freely available, especially as add-ins for spreadsheets like Lognormal distribution MS Excel. The method is well documented and discussed in The lognormal distribution is a continuous, non- various textbooks [11], [2], [14], and [6]. Wood [16] says symmetric distribution that is often used to model the MCS can be “applied in many diverse fields that require duration of activities or tasks. It applies mostly to novice outcomes to be quantified statistically under conditions of artisans or workers that have to perform non-standard and uncertainty to aid in decision-making.” complex tasks. These tasks often have an overflow especially Various software programs for performing simulations are when something goes wrong. It has two parameters, i.e. the available to model uncertainty in the cost or duration of mean value µ and the standard deviation . The random activities. Two standalone software programs that use variates can be calculated using the MS Excel function discrete event simulation are Arena and GoldSim. Some add- LOGNORM.INV(RAND(), µ ,) or the RandLognormal(µ, ins for performing simulations with MS Excel are @Risk, ) function provided by SimVoi. Crystal Ball, SimVoi and RiskAmp. GoldSim [5] and SimVoi [15] were selected for this research project. The Gumbel distribution is also known as the Smallest C. Probability distributions Extreme Value (SEV) distribution (Type I) and has a positive Triangular distribution location parameter µ and a positive scale parameter β. A The triangular distribution is bounded on the left and right random variate, T, for the Gumbel distribution was calculated and can be symmetric or skew depending on the values of the using (4) and the RandUniform(0,1) function to generate a three parameters that determine the shape of the distribution. random number. This distribution is mainly used in the The three parameters of the triangular distribution are analysis of extreme values and in survival analysis. typically: , , ∙ (4)  a = lower bound (minimum)  m = most likely value () where ρ is a random number.  b = upper bound (maximum) Frechet distribution The density and cumulative distribution functions for the The Fréchet distribution, also known as the inverse triangular distribution are not provided by MS Excel but can Weibull distribution, belongs to the broader family of be calculated using the formulas from [3]. The random extreme value distributions and has a positive shape variates for the triangular distribution are also not provided in parameter α and a positive scale parameter, s. It is often MS Excel but were calculated with (1) to (3) below. applied for natural events or disasters like earthquakes, , , , 0 , , storms, floods and volcanic eruptions. The parameters of the (1) Fréchet distribution cannot be calculated from the mean and standard deviation using analytical equations and a numerical , , , 1 , , 1 (2) equation solver was used to determine the parameters. A random variate, T, for the Fréchet distribution was calculated where , , ⁄ (3) using (5). , , ∙/ (5)

2032 2016 Proceedings of PICMET '16: Technology Management for Social Innovation where ρ is a random number. tailed distributions like the Gumbel, Fréchet and Fisk distributions than for the normal or lognormal distributions. Weibull distribution Hajdu and Bokor [7] tested simulations with uniform, The Weibull distribution is well known for modelling triangular and beta distributions and indicated that ‘the use of reliability of physical assets and humans due to its versatility different distributions with the same three-point estimation with regard to failure rate. Two and three parameter has a smaller effect on the project duration than a 10% distributions are typically used but the 2-parameter version is difference in the values of the three-point estimation’. simpler and easier to apply. The ‘shape’ parameter, α, Sherer et al. [12] investigated the application of a determines whether an item exhibits a decreasing, constant or symmetric triangular distribution as an approximation for the decreasing failure rate. The second parameter, β, is known as normal distribution and a non-symmetric triangular the ‘characteristic life’ when used in reliability applications. distribution for the lognormal distribution. The authors The Weibull distribution is also useful to model task or concluded that the triangular distribution provides a good activity duration in projects since it is one of a few approximation for the normal distribution in the range of the distributions that is skewed towards the left, i.e. a negative mean ± 2.44σ. factor. The general assumption is that the Wood [16] performed simulations with the same input distribution of task duration is skewed to the right and data and found that the output cumulative distributions for the therefore some historical data is needed on specific task uniform, normal, lognormal and triangular distributions could durations to warrant the use of the Weibull distribution. A vary as much as 10%. This contradicts the findings of Hajdu random variate, T, for the Weibull distribution was calculated and Bokor [7]. using (6). III. METHODOLOGY , , 1 (6) A. Overview where ρ is a random number. Two theoretical experiments were performed using a simulation software add-in for MS Excel, i.e. SimVoiTM as BetaPert distribution well as the GoldSimTM standalone software program. The The betaPert distribution was developed in the 1950’s to schedule simulation can be done with MS Excel without describe the uncertainty in activity durations of complex additional software or add-ins using standard functions projects and used the beta distribution as the basis. The beta provided, e.g. RAND() for generating random numbers distribution is rather complex but quite versatile [9]. The between 0 and 1 and FREQUENCY(Array1, Array2, ) to parameters of the beta distribution are not easy to estimate for produce a histogram of random variates. The variates can be the duration of activities but the betaPert simplifies this sorted from smallest to largest and a cumulative distribution process by introducing 3 parameters, i.e. an optimistic, most values can be determined. For the total duration of the likely (mode) and pessimistic estimate for the duration. The project. Random variates from the distributions used in this OPERT software add-in for MS Excel was used to generate study can be determined from analytical expressions except random variates from the betaPert distribution [8]. for the normal distribution for which the built-in function of MS Excel can be used. However, the simulation add-ins or D. Comparison of distributions standalone software automate the simulation process and is Ferson et al. [4] were of the opinion that ‘the results of more convenient, easier and faster to use. probabilistic risk analyses are known to be sensitive to the Two project activity networks, each comprising 10 choice of distributions used as inputs, an effect which is activities were selected for this study i.e. a ‘series’ network undoubtedly even stronger for the tail probabilities’. In some where all activities are performed in sequence and a ‘parallel’ simulation applications, this may be the case, especially when network where some activities take place in parallel. As a a large number of trials are used. The occurrence of very starting point values were assumed for the parameters of the small or very large random variates is more likely for fat- triangular distribution for each of the ten activities. The two networks are shown in Fig. 1 and Fig. 2 below.

Figure1: Series activity network

Figure2: Parallel activity network

2033 2016 Proceedings of PICMET '16: Technology Management for Social Innovation

The values assumed for the parameters of the triangular normal distribution can be compared by performing a two distribution as well as the mean and variance values are sample independent t-test which produces a p-value. For a p- shown in Table 1. The values for the parameters a, m and b in value greater than 0.05 (typical value used) the null Table 1 are arbitrary time units which could typically be days hypothesis, i.e. that the two distributions that are compared or weeks. are significantly different, can be rejected in favour of the alternative hypothesis that they are the same. Another method TABLE 1: VALUES ASSUMED FOR DURATION OF ACTIVITIES 1-10 that was used is to compare two scenarios was the Activity a m b Mean Variance 1 5 7 12 8.000 2.167 Kolmogorov-Smirnov (K-S) test which does not require 2 7 10 15 10.667 2.722 normality of the two distributions that are compared. 3 10 12 17 13.000 2.167 The cumulative distribution values provided by the 4 6 7 15 9.333 4.056 5 7 10 16 11.000 3.500 software add-ins are determined through interpolation of the 6 10 11 16 12.333 1.722 cumulative data and the total duration values at the 80 and 7 5 8 13 8.667 2.722 90% probability values (P80 and P90) were used for 8 6 10 20 12.000 8.667 9 17 20 29 22.000 6.500 comparison. These values are often the most important output 10 3 4 8 5.000 1.167 of a schedule risk simulation and are used for critical schedule related decisions throughout the project execution. The values for the triangular distributions were selected to The percentage variation of the P80 and P90 values as give a right skewed distribution (positive skewness). Many of compared with the values for the triangular distribution were the standard distributions like the lognormal, gamma and used to determine how well the simulation produced useful Gumbel are right skewed. The normal distribution outputs. (symmetric) and Weibull distribution (left skewed if the shape parameter is greater than 1) are exceptions. IV. RESULTS AND FINDINGS

B. Data gathering A. Overview A spreadsheet simulation model for each of the two The SimVoi and GoldSim software provide the raw data, examples, series and parallel activities, was formulated in MS i.e. the random variates calculated for the total project Excel. The mean and variance values calculated for each of duration in this study. Most software also provides some the activities as given in Table 1 were used to determine the statistical data of the output distribution. SimVoi and parameters of the lognormal, normal, Gumbel, Weibull and GoldSim also provide graphical output, typically the Fréchet distributions. A simulation with 20000 trials for probability density and cumulative distribution function of SimVoi and 10000 trials for GoldSim was then performed for the output distribution. each activity network with the different input distributions for The results are provided separately for the two cases, i.e. the activity durations. The output of the simulation is a the series activity network and the parallel activity network. distribution for the total project duration. The descriptive are given for the SimVoi add-in The basic output of some add-in software is a list of all only since GoldSim does not provide all this detail. The P80 the random variates determined for various random numbers. and P90 duration values are compared against the triangular For this study it was an array comprising 20000 values distribution to assess the effect of the tails of the input (10000 for GoldSim) for the total project duration. Some add- distributions. in software offer further analysis of this basic data to provide Statistical tests were performed to determine how well the the mean, standard deviation, skewness and other statistical output distributions for total project duration agreed when the parameters of the total project duration. Values for the Gumbel, lognormal, normal, Weibull and Fréchet input cumulative distribution are also provided by most simulation distributions were used in comparison with the output when software. triangular distribution inputs were used. The difference in cumulative probability at about 20 data points were squared, C. Data analysis added together and the square root was calculated. The The output distributions for different input distributions Kolmogorov-Smirnov test for two samples was done with the for the activity durations were compared using a number of cumulative probability values for the output distributions. different measures. The simplest comparison is to compare the array of duration values for one distribution as input B. Series activity network against the array of duration values for another set of input The SimVoi simulation software add-in provides some distributions. The triangular distribution was chosen as statistical data on the output distribution. The results for the reference and all other distributions were therefore compared different input distributions are given in Table 2 below. against this one. Two distributions that are both close to a

2034 2016 Proceedings of PICMET '16: Technology Management for Social Innovation

TABLE 2: DESCRIPTIVE STATISTICS FOR SERIES ACTIVITY NETWORK, SIMVOI SIMULATION Triangular Gumbel Lognormal Normal Weibull Fréchet Mean 112.05 112.00 112.06 111.98 112.01 111.93 Std Dev 5.94 5.90 5.98 5.94 5.93 5.89 First Quartile 107.94 107.83 107.89 107.95 108.08 107.80 111.90 111.57 111.91 111.96 112.12 111.29 Third Quartile 116.00 115.75 116.02 116.01 116.01 115.36 Skewness 0.132 0.412 0.193 -0.006 -0.112 0.830

The mean and standard deviation values for all 5 input distributions are very similar, varying only a fraction of a GoldSim SimVoi percent. The largest difference is seen in the skewness which 0.08 varies from -0.112 for the Weibull distribution inputs to +0.830 for the Fréchet input distributions. 0.07 0.06 P80 and P90 values 0.05 The difference in the P80 and P90 values of the other input distributions and that of the triangular distribution is 0.04 shown in Fig. 3 below. 0.03 Gumbel Lognormal Normal Weibull Frèchet 0.02

0.1 squares of sum of root Square 0.01 0.0 0.00 Gumbel Lognormal Normal Weibull Frèchet ‐0.1 Figure 4: Square root of the sum of the squares of differences in CDF ‐0.2

‐0.3 The lognormal distribution provided the smallest values

Difference (%) ‐0.4 and therefore gives the closest result in comparison with the triangular distribution. The normal distribution, even though ‐0.5 it’s a symmetric distribution, provided the 2nd best result. The Fréchet distribution gave the worst result vs the triangular P80 P90 distribution. The Goldsim software does not provide the Figure 3: Difference in P80 and P90 values for series network Fréchet distribution and could therefore not be used to perform the simulation. For the normal, lognormal and Weibull distributions the P80 values differ very slightly (< 0.1%) from the P80 values Two sample t-test of the triangular distribution. The P90 values differ somewhat The p-values as determined by the 2-sample independent more (~ 0.3%), except for the Fréchet distribution that differs t-test (T.TEST function of MS Excel) for different input by ~0.5%. It is clear that the longer right hand tail of the distributions are shown in Table 3. The distributions were Fréchet does make some difference in the P80 and P90 compared against the triangular distribution. values. TABLE 3: P-VALUES FOR TWO SAMPLE T-TEST FOR SERIES NETWORK Sum of the squares Gumbel Lognormal Normal Weibull Fréchet The cumulative distributions produced by the simulation SimVoi 0.423 0.756 0.295 0.589 0.044 can be compared by calculating the square root of the sum of GoldSim 0.889 0.811 0.130 0.186 the squares of the difference for a number of duration values. Twenty data points, corresponding to a duration interval of The p-values for the Gumbel, lognormal, normal and 2.5 time units, were used for the comparison and the results Weibull distributions are greater than 0.05 which means that are shown in Fig. 4 for both simulation programs. the output with these input distributions are not significantly different than the output if the triangular distribution is used. However, the p-value for the Fréchet as input distribution is less than 0.05 and the null hypothesis cannot be accepted meaning the Fréchet could produce an output that is

2035 2016 Proceedings of PICMET '16: Technology Management for Social Innovation significantly different than the output for the triangular statistics as calculated by the SimVoi add-in for the total distribution as input. project duration are given in Table 4. As seen in Table 4 the mean and standard deviation (Std. Kolmogorov-Smirnov test Dev.) values for the output distributions are very close. As The Kolmogorov-Smirnov (K-S) test can be used to noticed with the series network the largest difference is in the compare two distributions that are not necessarily normal skewness factors which vary from -0.163 for the Weibull distributions. The difference in values for the cumulative input distributions to +1.333 for the Fréchet input distribution for two cases is determined and compared against distributions. The skewness value for the lognormal a critical factor, Dcrit [10]. The K-S test uses the maximum distribution was the closest to that of the triangular vertical deviation between the two CDF curves as the statistic distribution, i.e. 0.233 vs, 0.144. Dstat. As for the K-S two-sample test, the null hypothesis (at significance level α) is rejected if the difference, D, is P80 and P90 values positive. This difference is determined with (7). The P80 and P90 values as provided by the SimVoi ∆ (7) software were analysed and the percentage difference from where (8) the triangular distribution values were determined. The and Dcrit is the critical value obtained from standard K-S results are shown in Fig. 6 below. tables for the number of data points. Gumbel Lognormal Normal Weibull Frechet The difference D for the K-S-test is shown in Fig. 5 for 0.2 the different input distributions. The triangular distribution 0.1 was used as reference input distribution. 0.0 All values in Fig. 5 are negative and therefore the null ‐0.1 hypothesis can be accepted at a 95% significance level, i.e. ‐0.2 the output distributions from the simulation does not differ significantly from the outputs obtained with the triangular ‐0.3 distribution as input. Difference (%) ‐0.4 ‐0.5 Gumbel Lognormal Normal Weibull Frèchet ‐0.6 0.00 ‐0.7 ‐0.02 D P80 P90  ‐0.04 Figure 6: Difference in P80 and P90 values for parallel network

‐0.06

Difference, Difference, For the normal distribution, the P80 values are nearly ‐0.08 identical to the triangular values while the P90 value is lower by less than 0.2%. For the Gumbel, lognormal and Fréchet ‐0.10 distributions, the P90 values are closer to the triangular GoldSim SimVoi values than the P80 values. The largest difference is for the Fréchet distribution where the P80 value is about 0.62% less. Figure 5: D values for K-S test for series activity network The P80 value for the Weibull distribution is nearly 0.1% larger than the P80 value for the triangular distribution but C. Parallel activity network the P90 value is about 0.2% less. The explanation for this is The same values for the parameters of the triangular because the Weibull distribution is left skewed (negative distribution as used for the series activities were also used for skewness) and drops sharply to the right of the mode of the the parallel situation as shown in Fig.2. The descriptive distribution.

TABLE 4: DESCRIPTIVE STATISTICS FOR PARALLEL ACTIVITY NETWORK, SIMVOI SIMULATION Triangular Gumbel Lognormal Normal Weibull Fréchet Mean 88.33 88.33 88.38 88.39 88.38 88.34 Std. Dev. 5.52 5.50 5.50 5.50 5.50 5.54 First Quartile 84.52 84.49 84.58 84.69 84.72 84.49 Median 88.20 88.00 88.23 88.36 88.58 87.65 Third Quartile 92.01 91.75 91.92 92.09 92.20 91.46 Skewness 0.144 0.441 0.233 0.004 -0.163 1.333

2036 2016 Proceedings of PICMET '16: Technology Management for Social Innovation

Sum of the squares the triangular distribution at a 95% confidence level can be The cumulative distributions produced by the simulation accepted. can be compared by calculating the square root of the sum of Gumbel Lognormal Normal Weibull Frechet the squares of the difference for a number of duration values. 0.00 Twenty data points, corresponding to a duration interval of 2.5 time units, were used for the comparison and the results ‐0.02 D are shown in Fig. 7 for 2 simulation programs.  ‐0.04

GoldSim SimVoi ‐0.06

0.08 Difference, ‐0.08 0.07 ‐0.10 0.06 0.05 GoldSim SimVoi 0.04 Figure 8: D values from K-S test for parallel activity network 0.03

0.02 D. The betaPert vs triangular distribution 0.01 The betaPert distribution was developed in the 1950’s to

Square root of sum squares of sum of root Square 0.00 Gumbel Lognormal Normal Weibull Frechet model the high uncertainty in the duration of activities of highly complex projects. It is derived from the beta Figure 7: Square root of the sum of the squares of differences in CDF distribution to facilitate the input of three duration values, i.e. optimistic (a), most likely (m) and pessimistic (b) estimates. The lognormal distribution provided the smallest values of The output of a simulation for betaPert input values will not about 0.013 and therefore gives the closest result in provide the same output distribution as the triangular comparison with the triangular distribution. The normal distribution for the same input values since the mean and distribution, even though it’s a symmetric distribution, nd standard deviation (Std Dev.) values are different as seen in provided the 2 best result. The Fréchet distribution gave the (9) to (12) below. worst result vs the triangular distribution. The Goldsim BetaPert distribution: software does not provide the Fréchet distribution and could (9) not be used to run the simulation. (10) Two sample T-test Triangular distribution: The p-values for the parallel network as determined by the (11) 2-sample independent t-test are shown in Table 5. (12) TABLE 5: P-VALUES FOR TWO SAMPLE T-TEST FOR PARALLEL The betaPert distribution assigns a greater weight to the NETWORK most likely value in comparison with the triangular Gumbel Lognormal Normal Weibull Fréchet distribution where the 3 parameters have equal weight. SimVoi 0.875 0.423 0.348 0.453 0.981 GoldSim 0.930 0.384 0.234 0.218 The data given in Table 1 for the network with parallel activities was used to compare these two distributions, but The p-values for all the input distributions are greater than without activities 4, 6 and 10. The cumulative distributions 0.05 which supports the output with these input distributions (CDF) for the total duration for both input distributions are are not significantly different than the output if the triangular shown in Fig. 9. distribution is used. betaPert Triangular

Kolmogorov-Smirnov test 1 The difference, D, from the K-S test for the parallel 0.8 network is shown in Fig. 8 for different input distributions. 0.6 The lognormal distribution has the largest negative D 0.4 and is therefore the closest fit to the triangular distribution. 0.2 The Fréchet distribution has the smallest negative D value 0 and is therefore the worst fit vs. the triangular distribution. Cumulative probability 70 75 80 85 90 95 100 All 5 input distributions have negative D values and Total Project Duration therefore the hypothesis that the simulation output from these

5 distributions does not differ significantly from the output of Figure 9: CDF for triangular and betaPert input distributions with the same values for 3 parameters, a; m and b

2037 2016 Proceedings of PICMET '16: Technology Management for Social Innovation

For the second example the mode, m, of the triangular V. CONCLUSIONS distribution was also used as the mode for the betaPert distribution. However, the values of the scale parameters, a Two project activity networks were investigated in this and b, for the betaPert distribution were calculated to give the study, i.e. a series network with ten activities and a parallel same mean value and standard deviation as for the triangular network with ten activities. Duration values were accepted distribution. The output distribution with the betaPert as input for the triangular distribution for each activity and Monte is compared with the output with the triangular distribution as Carlo simulations were performed for this distribution and 5 input in Fig. 10. other distributions. The distributions for the total project duration were compared to determine whether there is a betaPert Triangular significant difference in the output. 1.0 The results of this study indicate that the choice of 0.8 probability distribution for the duration of activities is not critical for performing a schedule simulation for the total 0.6 project. The distributions for the total project duration 0.4 differed only slightly for the 6 different input distributions 0.2 (triangular, lognormal, normal, Gumbel, Fréchet and 0.0 Weibull) as indicated by the results of the t-test and the Cumulative Cumulative Probability 70 75 80 85 90 95 100 Kolmogorov-Smirnov test for similarity. The mean and Total Project Duration standard deviation values for all input distributions were exactly the same but the skewness values were not. The Figure 10: CDF for triangular and betaPert input distributions for the same Weibull distribution is skewed towards the left (negative mode, mean and standard deviation skewness) for the values used in this study, the normal

distribution is symmetric and the other distributions are The output distributions are very similar for these input skewed towards the right (positive skewness). For the two distributions. The square root of the sum of the squares of the network examples used it can be concluded that the skewness difference between the two cumulative distributions in Fig. of the input distribution does not have a significant effect on 10 was 0.023 and the p-value for the t-test was 0.81. The P80 the results of the simulation, except that the skewness of the values differ by 0.03% and the P90 values differ by 0.11%. distributions for the total project duration vary slightly. The D value for the K-S test was -0.052 which supports the The special case of the betaPert input distribution was also hypothesis that the betaPert distribution does not differ investigated to determine whether the output distribution significantly from the triangular distribution when the same differed significantly when compared with the triangular mode, mean and standard deviation values are used. The p- input distribution. If the same values for the three parameters value and similarity in the P80 and P90 values indicate that of the triangular distributions are also used for the betaPert the two output distributions are very similar and that both distributions, the output distribution will be different. these input distributions can be used for a schedule However, if the same mean, standard deviation and mode are simulation. used for both the betaPert and the triangular, the output

distributions will be very similar. E. Summary of findings Considering the results obtained this study, it can be The difference in the P80 and P90 values for the Gumbel, concluded that the choice of input distribution does not have lognormal, normal, Weibull and Fréchet input distributions a significant effect on the output distribution of a schedule compared against the triangular input distribution was less simulation when these input distributions are symmetric or than 1% for the series and parallel network. The lognormal slightly skewed. The results cannot be generalized to highly provided the smallest difference for P80 and P90 for both skewed distributions where the different tails of the activity networks and the Fréchet the largest difference for distributions might lead to different results. This paper both. confirms what might be expected for a small network with The square root of the sum of the differences for about 20 slightly skewed input distributions for the duration of the points of the cumulative distribution function when compared activities. A focus for further study and analysis could be a with the triangular distribution was smaller than 0.08 for the larger network and highly skewed distributions. The output series and parallel networks. The lognormal had the smallest distribution might be distorted in the 80-95% probability difference and the Fréchet had the largest difference for both. region for this scenario. The D values for the K-S test were negative for the The results of this study provide confidence to risk and series and parallel networks for all input distributions project managers that they can use any of the input compared with the triangular distribution. The lognormal had distributions analysed in this study, together with duration the largest negative and the Fréchet the smallest negative data provided by experts, for schedule risk simulations. values. This indicated that the lognormal compares the best to the triangular input distribution.

2038 2016 Proceedings of PICMET '16: Technology Management for Social Innovation

VI. RECOMMENDATIONS ISO 31000 and IEC 62198. 2nd Edition, John Wiley & Sons Ltd. United Kingdom, 2014. [3] Evans, M., N. Hastings and B. Peacock. Statistical distributions, 3rd Two sets of input data were used for this study, one for Edition, John Wiley & Sons, New York. 2000. series activities and the other for activities that can be [4] Ferson, S, Ginzberg, L. and R. Akçakaya, “Whereof one cannot speak: performed in parallel. To make the findings more general, at When input distributions are unknown,” Retrieved 11/01/15 World least one more set of data with highly skewed distributions Wide Web, http://www.ramas.com/whereof.pdf). [5] GoldSim Technology Group, “GoldSimTM Monte Carlo Simulation should be analysed. It would be interesting to see how well software for decision and risk management”, Retrieved 01/27/16 World the Weibull (left skewed) and normal distribution Wide Web, http://www.goldsim.com/Home/. (symmetric) compare with right skewed distributions [6] Grey, S. Practical Risk Assessment for Project Management. John (lognormal, Gumbel and Fréchet) for highly skewed input Wiley & Sons, Chichester, England. 1995. [7] Hajdu, M. and O. Bokor, “The Effects of Different Activity distributions. Distributions on Project Duration in PERT Networks,” Procedia - The schedule simulations were performed with two Social and Behavioral Sciences 119, pp. 766-775, 2014. simulation software packages, i.e. SimVoi and GoldSim. It is [8] Hayes, C. and J. Jacobs, “Openpert Reference Guide”, Retrieved recommended that one more software program, @RiskTM or 11/01/15 World Wide Web, https://openpert. TM googlecode.com/files/openpert_reference_guide.pdf. RiskAmp ) be used to repeat these simulations and compare [9] McDonald, J.B. and Y.J. Xu. “A generalization of the beta distribution the results with the results given in this paper. with applications,” Journal of Econometrics 66, pp. 133-152, 1995. [10] Pham, H. Handbook of Engineering Statistics, Springer-Verlag, UK. ACKNOWLEDGMENTS 2006. [11] Raydugin, Y., Project Risk Management: Essential Methods for Project Teams and Decision Makers, John Wiley & Sons Inc., New I thank the Department of Engineering and Technology Jersey, 2013. Management at the University of Pretoria for the use of the [12] Scherer, W.T., T.A. Pomroy and D.N. Fuller, “The triangular density to facilities and administrative support. I also thank the approximate the normal density,” Reliability Engineering and System Safety Vol. 82, pp. 331–341, 2003. Microsoft, Treeplan Software and GoldSim companies for the [13] Shankar. N.R., S.S. Babu, Y.L.P. Thorani and D. Raghuram, “Right use of their software for this project. skewed distribution of activity times in PERT”, International Journal of Engineering Science and Technology 3, pp. 2932-2938, 2011. REFERENCES [14] Steyn, H.P.J. et al. Project Management: A multi-disciplinary approach. 4th Edition, FPM Publishing, Pretoria, South Africa, 2016. [15] Treeplan Software, SimVoiTM Monte Carlo Simulation Add-in. [1] Christensen, P.J. and J. Rydberg, “Overcoming Obstacles: Strategic Retrieved 01/30/16 World Wide Web, http://simvoi.com. risk management in the Øresund bridge project”, PM Network, 15, (11) [16] Wood, D.A., “Risk Simulation Techniques to aid project cost-time pp. 30-37, 2001. planning and management,” Risk Management 4 (1) pp. 41-60, 2002. [2] Cooper, D., P. Bosnich, S. Grey, G. Purdy, G. Raymond, P. Walker and M. Wood, Project Risk Management Guidelines. Managing Risk with

2039