Tornado Diagrams for Natural Hazard Risk Analysis by Keith Porter, University of Colorado Boulder and SPA Risk LLC


Reader: this short work comes from my Beginner’s Guide to Fragility, Vulnerability, and Risk (Porter 2016), a textbook in progress written for my graduate students who are beginning the study of natural-hazard risk management. Find a URL in the references at the end of this document.

Introduction: which inputs matter most to an uncertain quantity?

Mathematical models of an uncertain real-world quantity y (such as the uncertain future cost to repair a building after an earthquake) often involve a set of uncertain input parameters x (such as how strongly the ground shakes). Analysts are commonly faced with one of two problems in such a situation: (1) it may be costly to study each input x sufficiently to quantify its probability distribution, or (2) each calculation of y may be costly, so it becomes costly to allow all of the inputs x to vary. In the former case, the analyst might want to deeply investigate only the important x values—those that contribute most strongly to uncertainty in y—and accept reasonable guesses as to the distributions of the other x values. In either case, the analyst might want to simplify the model by setting those x values that do not matter much at deterministic, best-estimate values. But how can one defensibly determine which inputs x matter? How much does uncertainty in each input parameter affect the result?

An x might vary wildly but have little effect on y. An x might not vary much at all, but y could be very sensitive to it. We say that the x values whose uncertainty has a strong effect on y are the ones that matter, the ones that we might need to know more about. The others don’t matter, and we can take them at typical or best-estimate values, as if for practical purposes they did not vary at all. Once an analyst identifies the most important uncertainties, he or she can focus on understanding and quantifying those quantities and their effect on the quantity of interest y, and ignore the variability of the others, or at least treat them more casually.

One can use a tool called a tornado-diagram analysis to identify those important uncertainties. The tool produces a diagram (see Figure 1 for an example) that depicts the approximate effect of each uncertain input x on the quantity of interest y in the form of a horizontal bar chart that resembles a tornado in profile, hence the name. The method comes from the field of decision analysis (Howard 1988 may be the earliest work). Porter et al. (2002) may be its first proposed use in earthquake engineering. Other authors have used it in performance-based earthquake engineering and seismology a few times since then. Here first is a brief overview of the procedure; there follow step-by-step instructions, a short list of advantages and disadvantages of the method, and an example problem completely worked out.

A brief overview of tornado-diagram analysis

1. Define the output variable y of interest.
2. Define its inputs x.
3. Create a mathematical function f(x) to relate y to x, that is, y = f(x).
4. Find or guess a low, typical (or best-estimate), and high value of each x, i.e., xlow, xtyp, and xhigh.
5. Evaluate a baseline value of y using all the typical x values, i.e., ybaseline = f(x1typ, x2typ, ...).
6. Estimate y using typical x values except one, which is set at its low value.
7. Repeat with the same input set to its high value. The difference between the last two outputs is referred to as the swing associated with the one input that was varied.
8. Set that input back to its typical value and repeat the process for the next input, again with all the other inputs at their typical values.
9. Sort the labels for the inputs in decreasing order of the swing associated with each input.
10. Create a horizontal bar chart by depicting the swing associated with each input variable as a bar whose ends are at the low and high values of the output produced by changing just that input. The horizontal axis is the value of the output y. The bars are arranged with the input that has the highest swing on the top, then the input with the second-highest swing, etc. A vertical line is drawn through the baseline value.

See Figure 1 for an example. It shows that an output parameter called damage factor depends most on something called assembly capacity, then Sa, then ground motion record, etc. It suggests that the analyst ought to focus on understanding the first two or three parameters to understand variability in damage factor.

[Figure 1 image not reproduced. Bar labels recovered, top to bottom: Assembly capacity, Ground motion record, Unit cost, Damping, F-d multiplier, Mass, O&P. Horizontal axis: Damage factor, 0.00 to 1.00.]

Figure 1. A sample tornado diagram that depicts how the earthquake-induced repair cost for a particular building is affected by various model parameters (Porter et al. 2002).

Tornado-diagram procedure

Now, more precisely, here is how to create a tornado diagram. Suppose you have a function for a quantity y that you care about and want to explore.

y f x12, x ,... xin ,... x  (1) Each xi is an uncertain input quantity, expressed either with a cumulative distribution function or merely low, typical and high values that you might guess or find in the literature. If you want to use a little rigor in guessing low, typical, and high values, think of the values in terms of bets: what value of x would you bet 100:1 is the lowest you would find if you were to do a survey or some rigorous data-collection effort? What value of x would you bet 100:1 would not be exceeded? 50:50? Use those as your low, high, and typical (or best-estimate) values of x. Repeat for each x. Make a table of x like in Table 1.

Table 1. Tabulating input values for a tornado diagram

Input quantity    Low      Typical or best estimate    High
x1                x1low    x1typ                       x1high
x2                x2low    x2typ                       x2high
...
xi                xilow    xityp                       xihigh
...
xn                xnlow    xntyp                       xnhigh

In a research document such as a doctoral thesis, the literature review should provide a basis for completing the table, and one can add a comments column to the right to cite sources.
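One way to read the betting heuristic above is as a choice of quantiles: fair odds of 100:1 that a value will not be exceeded imply a non-exceedance probability of 100/101, or roughly the 99th percentile, and 50:50 odds imply the median. The sketch below makes that reading concrete. The lognormal distribution, its median of 50, and its log-standard deviation of 0.3 are assumptions chosen only for illustration, as are the function names.

```python
from math import exp
from statistics import NormalDist


def odds_to_probability(odds_for: float, odds_against: float = 1.0) -> float:
    """Probability implied by fair odds of odds_for:odds_against in favor of an event."""
    return odds_for / (odds_for + odds_against)


def lognormal_quantile(p: float, median: float, beta: float) -> float:
    """Quantile of a lognormal variable with the given median and log-standard deviation."""
    return median * exp(beta * NormalDist().inv_cdf(p))


# 100:1 odds that x_high is not exceeded: non-exceedance probability 100/101.
p_high = odds_to_probability(100)
# 100:1 odds that nothing falls below x_low: non-exceedance probability 1/101.
p_low = 1.0 - p_high
# 50:50 odds: the median.
p_typ = odds_to_probability(1)

# Hypothetical input distribution (illustrative values only).
x_low, x_typ, x_high = (
    lognormal_quantile(p, median=50.0, beta=0.3) for p in (p_low, p_typ, p_high)
)
print(f"p_low={p_low:.4f}  p_typ={p_typ:.2f}  p_high={p_high:.4f}")
print(f"x_low={x_low:.1f}  x_typ={x_typ:.1f}  x_high={x_high:.1f}")
```

When no distribution is available, the same three probabilities can simply guide a judgment-based guess for each row of the table.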

Now calculate ybaseline = f(x1typ, x2typ, ... xityp, ... xntyp)

This is the baseline y value. Now test the sensitivity of y to the uncertainty in each x, changing one input at a time and holding all the others at their typical values:

y1low = f(x1low, x2typ, ... xityp, ... xntyp), i.e., all typical values except using x1low
y1high = f(x1high, x2typ, ... xityp, ... xntyp), i.e., all typical values except using x1high
y2low = f(x1typ, x2low, ... xityp, ... xntyp), i.e., all typical values including x1typ, except using x2low
y2high = f(x1typ, x2high, ... xityp, ... xntyp), i.e., all typical values including x1typ, except using x2high
...
yilow = f(x1typ, x2typ, ... xilow, ... xntyp)
yihigh = f(x1typ, x2typ, ... xihigh, ... xntyp)
...
ynlow = f(x1typ, x2typ, ... xityp, ... xnlow)
ynhigh = f(x1typ, x2typ, ... xityp, ... xnhigh)

The swing for input i is the absolute difference between its two outputs, swingi = |yihigh - yilow|. Tabulate the results as in Table 2.

Table 2. Tabulating output values for a tornado diagram

Input    ylow     yhigh     Swing
x1       y1low    y1high    swing1
x2       y2low    y2high    swing2
...
xi       yilow    yihigh    swingi
...
xn       ynlow    ynhigh    swingn

Sort the inputs x in decreasing order of swing, e.g., swing6 > swing1 > ... > swing7.
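The calculations up to this point can be sketched in a few lines of code. This is a minimal illustration rather than a definitive implementation: the names tornado_table, TornadoRow, and toy_model are hypothetical, the model is assumed to accept its inputs as keyword arguments, and the toy model and its input values are invented only to exercise the function.

```python
from typing import Callable, Dict, List, NamedTuple, Tuple


class TornadoRow(NamedTuple):
    name: str
    y_low: float
    y_high: float
    swing: float


def tornado_table(
    f: Callable[..., float],
    inputs: Dict[str, Tuple[float, float, float]],
) -> Tuple[float, List[TornadoRow]]:
    """Return the baseline output and one row per input, sorted by decreasing swing.

    inputs maps each input name to its (low, typical, high) values.
    """
    typical = {name: values[1] for name, values in inputs.items()}
    y_baseline = f(**typical)
    rows = []
    for name, (low, _typ, high) in inputs.items():
        # Vary one input at a time; every other input stays at its typical value.
        y_low = f(**{**typical, name: low})
        y_high = f(**{**typical, name: high})
        rows.append(TornadoRow(name, y_low, y_high, abs(y_high - y_low)))
    rows.sort(key=lambda row: row.swing, reverse=True)
    return y_baseline, rows


def toy_model(a, b, c):
    """Hypothetical model used only to demonstrate the function."""
    return a * b + c


y_baseline, rows = tornado_table(
    toy_model, {"a": (1, 2, 3), "b": (9, 10, 11), "c": (0, 5, 6)}
)
print("baseline:", y_baseline)
for row in rows:
    print(row)
```

The loop makes 2n evaluations of the model in addition to the single baseline evaluation, and the sorted rows are in the order in which the bars will be drawn.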

Now make a horizontal bar chart. The horizontal axis measures y. The uppermost (top) horizontal bar in the chart measures yi, where i is the index of the input parameter with the largest swing. Its left end is the smaller of {yilow, yihigh}. Its right end is the larger of {yilow, yihigh}.

The next horizontal bar measures yj where j is the index for the x-parameter with the 2nd-largest swing. Its left end is the smaller of {yjlow, yjhigh}. Its right end is the larger of {yjlow, yjhigh}. The result looks like a tornado in profile. Draw a vertical line at ybaseline. Now you can explore the 2 or 3 or 4 x-parameters that matter most, and ignore the rest or treat them more casually, that is, with less effort to quantify or propagate their uncertainty.
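The drawing rules above (left end at the smaller output, right end at the larger, bars stacked in decreasing order of swing, a vertical line at the baseline) can be sketched with a plain-text chart, which avoids any dependence on a plotting library. The function name render_tornado and the sample values are hypothetical, and the rows are assumed to be sorted already.

```python
def render_tornado(rows, y_baseline, width=40):
    """Draw a text tornado diagram.

    rows: (label, y_low, y_high) tuples, already sorted by decreasing swing.
    """
    ends = [v for _, y_lo, y_hi in rows for v in (y_lo, y_hi)] + [y_baseline]
    y_min, y_max = min(ends), max(ends)
    span = (y_max - y_min) or 1.0  # avoid division by zero if nothing varies

    def col(y):
        """Map an output value to a character column on the horizontal axis."""
        return round((y - y_min) / span * (width - 1))

    label_width = max(len(label) for label, _, _ in rows)
    lines = []
    for label, y_lo, y_hi in rows:
        # The left end is the smaller output and the right end the larger one.
        left, right = col(min(y_lo, y_hi)), col(max(y_lo, y_hi))
        cells = [" "] * width
        for c in range(left, right + 1):
            cells[c] = "#"
        cells[col(y_baseline)] = "|"  # vertical line through the baseline
        lines.append(f"{label:>{label_width}} {''.join(cells)}")
    return "\n".join(lines)


# Hypothetical outputs for three inputs, with a baseline of 25.
sample_rows = [("a", 15, 35), ("c", 20, 26), ("b", 23, 27)]
print(render_tornado(sample_rows, y_baseline=25))
```

A plotting library would produce a more polished figure, but the text version shows the same ranking and the same position of each bar relative to the baseline.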

Advantages and disadvantages

The tornado diagram is relatively easy to create, requiring only reasonable guesses as to the range of values of the input parameters, plus 2n+1 evaluations of the quantity of interest, where n is the number of uncertain input parameters. It does not require absolute minima, maxima, or mean values of the input parameters. The diagram is intuitive to read. It helps the analyst identify which parameters to focus on, to spend the most time quantifying and understanding.

But the diagram does not present probabilistic information. The baseline value does not necessarily represent an expected value of the quantity of interest. The diagram gives only a general sense of the variability of the quantity of interest. By contrast, in a probabilistic analysis the quantity of interest is evaluated as an uncertain function of the jointly distributed uncertain input parameters, or at least all the important ones, that is, the ones at the top of the tornado diagram.
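To put the 2n+1 figure in perspective, the short sketch below counts model evaluations for a few values of n. The full-factorial count (every combination of low, typical, and high values) is not discussed in the text above; it is included only as a hypothetical point of comparison, and both function names are illustrative.

```python
def one_at_a_time_evaluations(n: int) -> int:
    """Model runs for a tornado diagram: one baseline plus a low and a high run per input."""
    return 2 * n + 1


def full_factorial_evaluations(n: int, levels: int = 3) -> int:
    """Model runs if every combination of the low, typical, and high values were evaluated."""
    return levels ** n


for n in (5, 10, 20):
    print(n, one_at_a_time_evaluations(n), full_factorial_evaluations(n))
# 5 11 243
# 10 21 59049
# 20 41 3486784401
```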

Example tornado diagram problem

Let us consider an example problem. Imagine that you are arranging a party for everyone in your local professional society and need to make a budget. You consider five uncertain quantities for your budget, need to budget for approximately the 90th percentile of cost, and have time to investigate only one or two to improve your budget estimate. You model your cost, y, using Equation (2), whose variables are listed in Table 3, along with your estimates of the lower bound, best estimate, and upper bound of each quantity.

y x1  x 2  x 2  x 3  x 4 1  x 5  (2)

Table 3. Tornado diagram example problem

Quantity    Meaning                                        Low       Best      High
x1          Number of current members                      1,900     2,000     2,200
x2          Fraction who will attend                       0.01      0.05      0.15
x3          Fraction who will bring a guest                0.01      0.05      0.10
x4          Cost per meal                                  $25.00    $50.00    $75.00
x5          Meals wasted to accommodate attendee choice    1%        10%       20%

Here are sample calculations of ybaseline and y2low. (This example uses the subscript “best” for best estimate rather than “typ” because in this case one does not know the typical value.)

ybaseline = x1best∙(x2best + x2best∙x3best)∙x4best∙(1 + x5best) = 2000∙(0.05 + 0.05∙0.05)∙50∙(1 + 0.1) = $5,775.00

y2low = x1best∙(x2low + x2low∙x3best)∙x4best∙(1 + x5best) = 2000∙(0.01 + 0.01∙0.05)∙50∙(1 + 0.1) = $1,155.00
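The two sample calculations above can be checked with a short script. The function name party_cost and the dictionary of best estimates are illustrative; the formula and the numbers are those of the sample calculations.

```python
def party_cost(x1, x2, x3, x4, x5):
    """Cost model of the example: members x (attendees + guests) x meal cost x (1 + waste)."""
    return x1 * (x2 + x2 * x3) * x4 * (1 + x5)


# Best-estimate values from Table 3.
best = dict(x1=2000, x2=0.05, x3=0.05, x4=50.00, x5=0.10)

y_baseline = party_cost(**best)
# Same inputs, except the fraction attending (x2) is set to its low value.
y_2low = party_cost(**{**best, "x2": 0.01})

print(f"y_baseline = ${y_baseline:,.2f}")
print(f"y_2low     = ${y_2low:,.2f}")
```

Overriding a single entry of the best-estimate dictionary mirrors the one-at-a-time rule: only x2 changes between the two calls.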

The relevant quantities are calculated in Table 4. The tornado diagram is shown in Figure 2. The table and figure show that most of the uncertainty in cost results from not knowing better what fraction of members will attend. The next most important quantity to know is the unit cost of meals. The other uncertainties hardly matter compared with these two; you might as well budget using your best-estimate values and ignore uncertainty. Don’t bother checking with caterers about likely waste, making the membership committee chair figure out exactly how many members there are, or asking around about how many members will bring guests. But do go to the trouble of asking whoever organized the last meal what fraction of members attended and what each meal cost, and refine your estimate of the probability distributions of both those quantities.

Table 4. Professional society meal cost tornado diagram quantities

Uncertainty                     ylow       yhigh       Swing
x1  Current members             $5,486     $6,353      $866
x2  Fraction attending          $1,155     $17,325     $16,170
x3  Fraction bringing a guest   $5,555     $6,050      $495
x4  Cost per meal               $2,888     $8,663      $5,775
x5  Waste                       $5,303     $6,300      $998
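For readers who want to reproduce Table 4, the following sketch recomputes every row from the Table 3 inputs and ranks the inputs by swing. The variable and function names are illustrative. The script prints values to the cent, whereas Table 4 rounds to the nearest dollar.

```python
def party_cost(x1, x2, x3, x4, x5):
    """Cost model of the example problem."""
    return x1 * (x2 + x2 * x3) * x4 * (1 + x5)


# (low, best, high) values from Table 3.
inputs = {
    "x1": (1900, 2000, 2200),
    "x2": (0.01, 0.05, 0.15),
    "x3": (0.01, 0.05, 0.10),
    "x4": (25.00, 50.00, 75.00),
    "x5": (0.01, 0.10, 0.20),
}
labels = {
    "x1": "Current members",
    "x2": "Fraction attending",
    "x3": "Fraction bringing a guest",
    "x4": "Cost per meal",
    "x5": "Waste",
}

best = {name: values[1] for name, values in inputs.items()}
table = []
for name, (low, _best, high) in inputs.items():
    # Vary one input at a time, holding the rest at their best estimates.
    y_low = party_cost(**{**best, name: low})
    y_high = party_cost(**{**best, name: high})
    table.append((name, y_low, y_high, abs(y_high - y_low)))

# Largest swing first, which is the top-to-bottom order of the bars.
ranked = sorted(table, key=lambda row: row[3], reverse=True)
for name, y_low, y_high, swing in ranked:
    print(f"{name} {labels[name]:<26} ${y_low:>10,.2f} ${y_high:>10,.2f} ${swing:>10,.2f}")
```

The ranking that results (x2, x4, x5, x1, x3) is the order in which the bars are stacked in the tornado diagram for this example.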

[Figure 2 image not reproduced. Bars, top to bottom: Fraction attending, Cost per meal, Waste, Number of members, Bring guest. Horizontal axis: Meal cost, $0 to $20,000.]

Figure 2. Tornado diagram for professional society meal

References cited

Howard, R.A., 1988. Decision analysis: practice and promise. Management Science 34 (6), 679-695, http://goo.gl/0LVJaA

Porter, K., 2016. A Beginner’s Guide to Fragility, Vulnerability, and Risk. University of Colorado Boulder, 91 pp., http://spot.colorado.edu/~porterka/Porter-beginners-guide.pdf

Porter, K.A., J.L. Beck, and R.V. Shaikhutdinov, 2002. Sensitivity of building loss estimates to major uncertain variables. Earthquake Spectra, 18 (4), 719-743, www.sparisk.com/pubs/Porter-2002-Sensitivity.pdf