<<

Center for Turbulence Research Annual Research Briefs 2014

Decomposing high-order statistics for sensitivity analysis

By G. Geraci, P.M. Congedo† AND G. Iaccarino

1. Motivation and objectives

Sensitivity analysis in the presence of uncertainties in operating conditions, material properties, and manufacturing tolerances poses a tremendous challenge to the scientific computing community. In particular, in realistic situations, the presence of a large number of uncertain inputs complicates the task of propagating and assessing output uncertainties; many of the popular techniques, such as stochastic collocation or polynomial chaos, lead to exponentially increasing costs, thus making these methodologies unfeasible (Foo & Karniadakis 2010). Handling uncertain parameters becomes even more challenging when robust design optimization is of interest (Kim et al. 2006; Eldred 2009). One alternative for reducing the cost of Uncertainty Quantification (UQ) methods is based on approaches that attempt to identify the relative importance of the input uncertainties. In the literature, global sensitivity analysis (GSA) aims at quantifying how uncertainties in the input parameters of a model contribute to the uncertainties in its output (Borgonovo et al. 2003). Traditionally, GSA is performed using methods based on the decomposition of the output variance (Sobol 2001), i.e., ANOVA. The ANOVA approach involves splitting a multi-dimensional function into contributions from different groups of its input variables. The ANOVA-based analysis creates a hierarchy of dominant input parameters, for a given output, when variations are computed in terms of the variance. A limitation of this approach is that it is based on the variance, which might not be a sufficient indicator of the overall output variations. The main idea of this work is that the hierarchy of important parameters based on second-order statistical moments (as in the ANOVA analysis) is not the same if a different statistical moment is considered (a first attempt in this direction can be found in Abgrall et al. 2012).
Depending on the problem, the decomposition of the nth-order moment might be more insightful. Our goal is to illustrate a systematic way of investigating the effect of high-order interactions between variables, to understand whether or not they are dominant. We introduce a general method to compute the decomposition of high-order statistics, and then formulate an approach similar to ANOVA but for the skewness and the kurtosis. This is a fundamental step towards formulating innovative optimization methods for obtaining robust designs that account for a complete description of the output statistics. For instance, by knowing the relative importance of each variable (or subset of variables) over the design space, reduced UQ propagation problems can be solved adaptively by retaining only the influential variables. A similar approach (Congedo et al. 2013) using variance-based sensitivity indices has been demonstrated to be effective in the overall reduction of the numerical cost associated with the design optimization of a turbine blade for an Organic Rankine Cycle (ORC) application with a large number of uncertain inputs.

† Inria Bordeaux–Sud-Ouest, France

The methodology proposed in this work illustrates how third- and fourth-order statistical moments can be decomposed (in a way that mimics what has been done for the variance). It is shown that this decomposition is correlated to a polynomial chaos (PC) expansion, enabling us to compute each term and to propose new sensitivity indices. The new decomposition technique is illustrated by considering several test functions. In particular, a functional decomposition based on variance, skewness, and kurtosis is computed, displaying how the sensitivity indices vary according to the order of the statistical moment. Moreover, the decomposition of the high-order statistics is used to drive the reduction of the metamodel. The effect of the high-order decomposition is also evaluated, for several test cases, in terms of its impact on the probability density functions.

2. High-order statistics definition

Let us consider a real function f = f(ξ), with ξ a vector of independent and identically distributed random inputs, ξ ∈ Ξ^d = Ξ_1 × ··· × Ξ_d (Ξ^d ⊂ R^d), and ξ ∈ Ξ^d ↦ f(ξ) ∈ L^4(Ξ^d, p(ξ)), where p(ξ) = ∏_{i=1}^{d} p(ξ_i) is the probability density function of ξ. The central moment of order n can be defined as

μ_n(f) = ∫_{Ξ^d} (f(ξ) − E(f))^n p(ξ) dξ, where E(f) = ∫_{Ξ^d} f(ξ) p(ξ) dξ. (2.1)

In the following, we indicate with σ² = μ_2(f), s = μ_3(f), and k = μ_4(f) the variance (second-order moment), the skewness (third-order), and the kurtosis (fourth-order), respectively. We note here that, according to the standard definitions of the skewness and kurtosis, we should include a normalization factor, namely the third power of the standard deviation and the square of the variance, respectively. However, in this context, interest is only in the relative contribution of each term of the decomposition; thus, distorting the nomenclature somewhat, we refer to skewness and kurtosis following the definitions in Eq. (2.1).
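To make Eq. (2.1) concrete, the central moments can be estimated by plain Monte Carlo sampling. This is a sketch, not the authors' implementation; the test function f(ξ) = ξ_1 + ξ_2 and all tolerances are illustrative choices:

```python
import random

def central_moment(f, n, d, samples=200_000, seed=0):
    """Monte Carlo estimate of the n-th central moment mu_n(f) of Eq. (2.1),
    with xi a vector of d i.i.d. U(0,1) random inputs."""
    rng = random.Random(seed)
    vals = [f([rng.random() for _ in range(d)]) for _ in range(samples)]
    mean = sum(vals) / samples          # E(f)
    return sum((v - mean) ** n for v in vals) / samples

# f(xi) = xi_1 + xi_2: the variance of a sum of two independent U(0,1)
# variables is 2/12, and the distribution is symmetric, so mu_3 is ~0.
f = lambda xi: xi[0] + xi[1]
var = central_moment(f, 2, 2)
skew = central_moment(f, 3, 2)
```

For the unnormalized definitions adopted here, no division by powers of the standard deviation is performed.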

3. Functional ANOVA decomposition

Let us apply the definition of the Sobol functional decomposition (Sobol 2001) to the function f as

f(ξ) = Σ_{i=0}^{N} f_{m_i}(ξ · m_i), (3.1)

where the multi-index m, of cardinality card(m) = d, can contain only elements equal to 0 or 1. The total number of admissible multi-indices m_i is N + 1 = 2^d; this number represents the total number of contributions up to the dth order of the stochastic variables ξ. The scalar product between the stochastic vector ξ and m_i is employed to identify the functional dependences of f_{m_i}. In this framework, the multi-index m_0 = (0, ..., 0) is associated with the term f_{m_0} = ∫_{Ξ^d} f(ξ) p(ξ) dξ. As a consequence, f_{m_0} is equal to the expectation of f, i.e., E(f). In the following, we take the first d indices as the multi-indices associated with the single variables, followed by the second-order terms, and so on. The decomposition Eq. (3.1) is of ANOVA type in the sense of Sobol (Sobol 2001) if all the members in Eq. (3.1) are orthogonal, i.e.,

∫_{Ξ^d} f_{m_i}(ξ · m_i) f_{m_j}(ξ · m_j) p(ξ) dξ = 0 with m_i ≠ m_j, (3.2)

and, for all the terms f_{m_i} except f_{m_0}, it holds that

∫_{Ξ_j} f_{m_i}(ξ · m_i) p(ξ_j) dξ_j = 0 with ξ_j ∈ (ξ · m_i). (3.3)

Each term f_{m_i} of Eq. (3.1) can be expressed as

f_{m_i}(ξ · m_i) = ∫_{Ξ^{d − card(m̂_i)}} f(ξ) p(ξ̄_i) dξ̄_i − Σ_{m_j ⊂ m_i, m_j ≠ m_i} f_{m_j}(ξ · m_j), (3.4)

where ξ̄_i denotes the set of variables not contained in (ξ · m_i). The variance then decomposes as

σ² = Σ_{i=1}^{N} σ²_{m_i}, where σ²_{m_i} = ∫_{Ξ̂_i} f_{m_i}²(ξ · m_i) dμ_i. (3.5)

The notation is made more compact by means of Ξ̂_i = Ξ^{card(m̂_i)}. Because of the properties of the ANOVA terms, all the mixed contributions are zero due to orthogonality. Analogously, the skewness, obtained by taking the third power of f(ξ) − f_0 and by neglecting the orthogonal contributions, is equal to

s = ∫_{Ξ^d} (f(ξ) − f_0)³ dμ = Σ_{p=1}^{N} ∫_{Ξ̂_p} f_{m_p}³ dμ_p + 3 Σ_{p=1}^{N} Σ_{m_q ⊂ m_p} ∫_{Ξ̂_{pq}} f_{m_p}² f_{m_q} dμ_{pq} + 6 Σ_{p=1}^{N} Σ_{q=p+1}^{N} Σ_{r=q+1, m_{pq}=m_r}^{N} ∫_{Ξ̂_{pq}} f_{m_p} f_{m_q} f_{m_r} dμ_{pq}. (3.6)

In the previous expression, the multi-index m_{pq} represents the union of m_p and m_q, also indicated as m_{pq} = m_p ⊞ m_q. After some manipulations, it is possible to demonstrate the following (additive) form: s = Σ_{i=1}^{N} s_{m_i}. In particular, by considering each multi-index m_i associated with a set of 2^{|m_i|} − 1 sub-interactions, and by denoting this set as P_i (with P_{i,≠} shorthand for P_i − {m_i}), each contribution can be expressed as

s_{m_i} = ∫_{Ξ̂_i} f_{m_i}³ dμ_i + 3 Σ_{m_q ∈ P_{i,≠}} ∫_{Ξ̂_i} f_{m_i}² f_{m_q} dμ_i + 6 Σ_{m_p ≠ m_q ∈ P_{i,≠}, m_{pq} = m_i} ∫_{Ξ̂_i} f_{m_i} f_{m_p} f_{m_q} dμ_i. (3.7)

Similar considerations lead to the additive form of the kurtosis, where each conditional term is

k_{m_i} = ∫_{Ξ̂_i} f_{m_i}⁴ dμ_i + 4 Σ_{m_q ∈ P_{i,≠}} ∫_{Ξ̂_i} f_{m_i}³ f_{m_q} dμ_i + 6 Σ_{m_p ≠ m_q ∈ P_i, m_{pq} = m_i} ∫_{Ξ̂_i} f_{m_p}² f_{m_q}² dμ_i

+ 12 Σ_{m_p ∈ P_i} Σ_{m_p ≠ m_q ∈ P_i} Σ_{m_r ∈ P_i, r > q, m_p ⊞ ∩_{qr} = m_i} ∫_{Ξ̂_i} f_{m_p}² f_{m_q} f_{m_r} dμ_i

+ 24 Σ_{m_p ∈ P_i} Σ_{m_q ∈ P_i, q > p} Σ_{m_r ∈ P_i, r > q} Σ_{m_t ∈ P_i, t > r, m_i ⊆ m_{pq} ⊞ ∩_{rt}, m_i ⊆ m_{rt} ⊞ ∩_{pq}} ∫_{Ξ̂_i} f_{m_p} f_{m_q} f_{m_r} f_{m_t} dμ_i. (3.8)

Hereafter, the symbol ∩_{pq} indicates the set of variables contained in both m_p and m_q.
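The ANOVA terms and their variance and skewness contributions (Eqs. 3.4-3.7) can be verified numerically on a small example. The sketch below uses midpoint quadrature and the hypothetical test function f(x, y) = x + y + xy (not one of the paper's test cases), for which the exact conditional terms are σ²_1 = σ²_2 = 0.1875, σ²_12 = 1/144, and s_12 = 0.09375 (all the skewness is carried by the interaction term):

```python
def anova_decomposition(f, n=200):
    """Conditional variance (Eq. 3.5) and skewness (Eq. 3.7) terms of
    f(x, y) with x, y ~ U(0,1), by midpoint quadrature on an n x n grid.
    Returns two dicts keyed by the multi-indices (1,0), (0,1), (1,1)."""
    pts = [(i + 0.5) / n for i in range(n)]
    f0 = sum(f(x, y) for x in pts for y in pts) / n**2      # f_{m_0} = E(f)
    f1 = {x: sum(f(x, y) for y in pts) / n - f0 for x in pts}
    f2 = {y: sum(f(x, y) for x in pts) / n - f0 for y in pts}
    f12 = {(x, y): f(x, y) - f0 - f1[x] - f2[y] for x in pts for y in pts}
    var = {(1, 0): sum(v * v for v in f1.values()) / n,
           (0, 1): sum(v * v for v in f2.values()) / n,
           (1, 1): sum(v * v for v in f12.values()) / n**2}
    # Skewness terms: pure cubes plus, for m_i = (1,1), the mixed
    # contributions whose union of multi-indices equals (1,1).
    skew = {(1, 0): sum(v**3 for v in f1.values()) / n,
            (0, 1): sum(v**3 for v in f2.values()) / n,
            (1, 1): sum(f12[x, y]**3
                        + 3 * f12[x, y]**2 * (f1[x] + f2[y])
                        + 6 * f12[x, y] * f1[x] * f2[y]
                        for x in pts for y in pts) / n**2}
    return var, skew

var, skew = anova_decomposition(lambda x, y: x + y + x * y)
```

The sum of the skewness terms reproduces the total third central moment, illustrating the additive form s = Σ s_{m_i}.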

4. Correlation with the polynomial chaos framework

Variance, skewness, and kurtosis from the functional decomposition are correlated with the terms contained within a polynomial chaos expansion. This correlation establishes a rigorous numerical approach to compute the terms present in the functional (additive) decomposition of the central moments. As usual in the PC framework, a truncated series of P + 1 = (n_0 + d)!/(n_0! d!) terms of total degree n_0 can be obtained as

f(ξ) ≈ f̃(ξ) = Σ_{k=0}^{P} β_k Ψ_k(ξ), with Ψ_k(ξ · m^{⋆,k}) = ∏_{i=1}^{d} ψ_{α_i^k}(ξ_i). (4.1)

Each polynomial Ψ_k(ξ) is a multivariate polynomial form which involves tensorization of 1D polynomials by means of a multi-index α^k ∈ N^d, with Σ_{i=1}^{d} α_i^k ≤ n_0. The multi-index m^{⋆,k} = m^{⋆,k}(α^k) ∈ N^d is a function of α^k: m^{⋆,k} = (m_1^{⋆,k}, ..., m_d^{⋆,k}), with m_i^{⋆,k} = α_i^k / ⌈α_i^k⌉_{≠0}, where the function ⌈·⌉_{≠0} is defined as

⌈α⌉_{≠0} = |α| if α ≠ 0, and ⌈α⌉_{≠0} = 1 if α = 0. (4.2)
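The construction of the total-degree multi-index set and of the map m^{⋆} of Eq. (4.2) is mechanical; a minimal sketch (function names are ours, not the paper's):

```python
from itertools import product
from math import comb

def pc_multi_indices(d, n0):
    """All PC multi-indices alpha in N^d with total degree <= n0;
    their count is P + 1 = (n0 + d)!/(n0! d!), as below Eq. (4.1)."""
    return [a for a in product(range(n0 + 1), repeat=d) if sum(a) <= n0]

def m_star(alpha):
    """ANOVA multi-index m* of a PC multi-index (Eq. 4.2):
    m*_i = alpha_i / |alpha_i| = 1 when alpha_i != 0, and 0 otherwise."""
    return tuple(1 if a != 0 else 0 for a in alpha)

idx = pc_multi_indices(2, 3)     # d = 2, n0 = 3: 10 basis terms
```

Each m^{⋆} simply flags which variables a basis polynomial actually depends on, which is what the sets K_{m_i} below exploit.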

The polynomial basis is chosen according to the Wiener-Askey scheme in order to select orthogonal polynomial terms with respect to the probability density function p(ξ) of the input. Thanks to the orthogonality, the coefficients of the expansion in a non-intrusive spectral projection framework are numerically evaluated as

β_k = ⟨f(ξ), Ψ_k(ξ)⟩ / ⟨Ψ_k(ξ), Ψ_k(ξ)⟩ ∀k, (4.3)

where ⟨·, ·⟩ denotes the L²(Ξ^d, p(ξ)) inner product. For the variance decomposition, each conditional term can be computed as

σ²_{m_i} = Σ_{k ∈ K_{m_i}} β_k² ⟨Ψ_k(ξ), Ψ_k(ξ)⟩, (4.4)

where K_{m_i} represents the set of indices associated with the variables (ξ · m_i):

K_{m_i} = { k ∈ {1, ..., P} | m^{⋆,k} = m^{⋆,k}(α^k) = m_i }. (4.5)

Similar correlations, within the PC framework, can also be found for skewness and kurtosis. The final form for the skewness is equal to

s = Σ_{p=1}^{P} β_p³ ⟨Ψ_p³(ξ)⟩ + 3 Σ_{p=1}^{P} Σ_{q=1, q≠p}^{P} β_p² β_q ⟨Ψ_p²(ξ), Ψ_q(ξ)⟩ Δ_q^p + 6 Σ_{p=1}^{P} Σ_{q=p+1}^{P} Σ_{r=q+1}^{P} β_p β_q β_r ⟨Ψ_p(ξ), Ψ_q(ξ) Ψ_r(ξ)⟩ Δ_{pqr}, (4.6)

where two functions are introduced for the selection, namely

Δ_q^p = 0 if α_j^p = 0 and m_{q_j} = 1 (for some j), and 1 otherwise; Δ_{pqr} = 0 if m_{p_j} + m_{q_j} + m_{r_j} = 1 (for some j), and 1 otherwise. (4.7)

The previous expression reduces, for a fixed m_i, to

s_{m_i} = Σ_{p ∈ K_{m_i}} β_p³ ⟨Ψ_p³(ξ)⟩ + 3 Σ_{p ∈ K_{m_p}} Σ_{q ∈ K_{m_q}, m_{pq} = m_i} β_p² β_q ⟨Ψ_p²(ξ), Ψ_q(ξ)⟩ Δ_q^p + 6 Σ_{p ∈ K_{m_p}} Σ_{q ∈ K_{m_q}, q ≥ p+1} Σ_{r ∈ K_{m_r}, m_{pqr} = m_i} β_p β_q β_r ⟨Ψ_p(ξ), Ψ_q(ξ) Ψ_r(ξ)⟩ Δ_{pqr}. (4.8)

In the case of the kurtosis, the final expression reads

k = Σ_{p=1}^{P} β_p⁴ ⟨Ψ_p⁴⟩ + 4 Σ_{p=1}^{P} Σ_{q=1, q≠p}^{P} β_p³ β_q ⟨Ψ_p³, Ψ_q⟩ Δ_q^p + 6 Σ_{p=1}^{P} Σ_{q=p+1}^{P} β_p² β_q² ⟨Ψ_p², Ψ_q²⟩ + 12 Σ_{p=1}^{P} Σ_{q=1, q≠p}^{P} Σ_{r=q+1, r≠p}^{P} β_p² β_q β_r ⟨Ψ_p², Ψ_q Ψ_r⟩ Δ_{qr}^p + 24 Σ_{p=1}^{P} Σ_{q=p+1}^{P} Σ_{r=q+1}^{P} Σ_{t=r+1}^{P} β_p β_q β_r β_t ⟨Ψ_p Ψ_q, Ψ_r Ψ_t⟩ Δ_{pqrt}, (4.9)

where the function Δ_q^p was already introduced in Eq. (4.7), while

Δ_{qr}^p = 0 if α_j^p = 0 and m_{q_j} + m_{r_j} = 1 (for some j), and 1 otherwise, (4.10)

Δ_{pqrt} = 0 if m_{p_j} + m_{q_j} + m_{r_j} + m_{t_j} = 1 (for some j), and 1 otherwise. (4.11)

The conditional contribution associated with mi is

4 4 3 3 p kmi = βkhΨk(ξ)i +4 βp βqhΨp, Ψqi∆q k∈Km p∈Km q∈K −{p} X i X p mXq mp⊞mq =mi 2 2 2 2 +6 βp βq hΨp, Ψqi p∈K Xmp q∈KXmq −{p} mp⊞mq =mi 2 2 p (4.12) + 12 βp βq βrhΨp, ΨqΨri∆qr

p∈Kmp q∈Kmq −{p} r∈Kmr X X r≥Xq+1 mpqr =mi

+ 24 βpβqβrβthΨpΨq, ΨrΨti∆pqrt.

p∈Kmp q∈Kmq r∈Kmr t∈Kmt X q≥Xp+1 r≥Xq+1 t≥Xr+1 mpqrt=mi

As a final remark, with respect to the computational cost of this procedure, it is evident that the number of terms (multidimensional integrals) to compute grows with both the stochastic dimension and the total polynomial degree. However, the procedure does not require additional model evaluations and can be performed in the post-processing phase by using only the coefficients of the polynomial expansion. Moreover, parallel strategies could very easily be implemented, considering that each integral can be computed independently.
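As a sketch of this post-processing step, the fragment below computes the PC coefficients by non-intrusive spectral projection (Eq. 4.3), with shifted Legendre polynomials for U(0,1) inputs, and groups them into conditional variances (Eqs. 4.4-4.5). The test function f = x + y + xy and the quadrature level are illustrative choices, not taken from the paper:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval
from itertools import product

# Gauss-Legendre rule mapped from [-1,1] to [0,1]; the weights then sum to
# one, matching the U(0,1) probability measure.
nodes, weights = leggauss(6)
x01, w01 = (nodes + 1) / 2, weights / 2

def psi(n, x):
    """Shifted Legendre polynomial of degree n on [0,1]; <psi_n, psi_n> = 1/(2n+1)."""
    return legval(2 * x - 1, [0] * n + [1])

f = lambda x, y: x + y + x * y
n0 = 3
alphas = [a for a in product(range(n0 + 1), repeat=2) if sum(a) <= n0]

# Non-intrusive spectral projection (Eq. 4.3): beta_k = <f, Psi_k>/<Psi_k, Psi_k>.
beta = {}
for a in alphas:
    proj = sum(wi * wj * f(xi, xj) * psi(a[0], xi) * psi(a[1], xj)
               for xi, wi in zip(x01, w01) for xj, wj in zip(x01, w01))
    beta[a] = proj * (2 * a[0] + 1) * (2 * a[1] + 1)

# Group beta_k^2 <Psi_k, Psi_k> by the ANOVA multi-index m* (Eqs. 4.4-4.5).
sigma2 = {}
for a, b in beta.items():
    if a == (0, 0):
        continue      # the mean term does not contribute to the variance
    m = tuple(int(ai != 0) for ai in a)
    sigma2[m] = sigma2.get(m, 0.0) + b * b / ((2 * a[0] + 1) * (2 * a[1] + 1))
```

For this polynomial f the quadrature is exact, and the grouped terms reproduce the analytical conditional variances σ²_1 = σ²_2 = 0.1875 and σ²_12 = 1/144.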

4.1. Additional sensitivity indices and model reduction

Sensitivity indices (SI) for the variance, and likewise for the skewness and the kurtosis, can be defined as

σ_{m_i}^{2,SI} = σ²_{m_i}/σ², s_{m_i}^{SI} = s_{m_i}/s, k_{m_i}^{SI} = k_{m_i}/k. (4.13)

The first d sensitivity indices are commonly referred to as first-order indices because they are associated with the single variables. On the contrary, the remaining terms are associated with the high-order interactions between variables. The sensitivity indices can be used not only to understand the role played by each single variable or group of variables, but also to guide the choice of a surrogate representation of the function f(ξ). For instance, referring to the functional approximation Eq. (4.1), it is possible to obtain a polynomial series which includes only terms of interaction order up to a fixed value t, namely

f(ξ) = Σ_{m_i} f_{m_i} ≃ Σ_{card(m̂_i) ≤ t} f_{m_i} = f̂(ξ). (4.14)

Henceforth, we refer to the polynomial approximation f̂(ξ) as the metamodel of f(ξ) of order t. Note that card(m̂_i) defines the number of non-null elements in m_i, i.e., it measures the order of interaction between variables in the ANOVA sense. Obtaining a metamodel which neglects the interaction terms of order greater than t is strictly related to the idea of effective dimension in the superposition sense introduced in Caflisch et al. (1997). Another measure of sensitivity is the so-called Total Sensitivity Index (TSI) associated with each variable. This measure of sensitivity is computed by including the overall influence of a single variable:

TSI_j = Σ_{ξ_j ∈ (ξ · m_i)} σ_{m_i}^{2,SI}, TSI_j^s = Σ_{ξ_j ∈ (ξ · m_i)} s_{m_i}^{SI}, TSI_j^k = Σ_{ξ_j ∈ (ξ · m_i)} k_{m_i}^{SI}. (4.15)

The information associated with the TSI can also be exploited for reducing a model. If a threshold is fixed, the dimensionality of the surrogate model can be reduced by neglecting the variables whose TSI is lower than the threshold. In this case, the metamodel choice relies on the definition of effective dimension in the truncation sense (Caflisch et al. 1997). In the following section, for brevity, we do not present results related to the reduction of the model in the truncation sense, but rather several results on the truncation in the superposition sense. The interested reader can refer to Gao & Hesthaven (2011) and Congedo et al. (2013) for further discussions on the importance of the TSI measure for model reduction in the uncertainty quantification and robust optimization settings, respectively.
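A sketch of Eqs. (4.13) and (4.15): given a dictionary of conditional contributions keyed by multi-index, the TSI of variable j sums the normalized terms of every multi-index containing it. The numerical values below are the variance terms of the illustrative function f(x, y) = x + y + xy, not of the paper's test cases:

```python
def total_sensitivity(cond, d):
    """Total sensitivity index of each variable (Eq. 4.15): the sum of the
    normalized conditional contributions of every multi-index containing it."""
    total = sum(cond.values())
    return [sum(v for m, v in cond.items() if m[j] == 1) / total
            for j in range(d)]

# Illustrative variance terms for f(x, y) = x + y + xy:
# sigma_1^2 = sigma_2^2 = 0.1875 and sigma_12^2 = 1/144.
cond = {(1, 0): 0.1875, (0, 1): 0.1875, (1, 1): 1 / 144}
tsi = total_sensitivity(cond, 2)
```

Because the interaction term is counted in the TSI of both variables, the TSIs generally sum to more than one.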

5. Numerical results

The numerical test cases are chosen to highlight high-order conditional contributions in the presence of multiple sources of uncertainty.

5.1. Computing conditional statistics by means of PC

In this section, the problem of the computation of the high-order conditional terms is analyzed by means of the PC expansion series. Consider the following function

f(ξ) = ∏_{i=1}^{d} sin(πξ_i), (5.1)

where each variable ξ_i ∼ U(0, 1) and the dimension d is up to three. The sensitivity indices' (relative) errors are systematically computed with respect to the analytical solution. Conditional statistics can be computed with a PC approach using Eqs. (4.6) and (4.9). In Figures 1 and 2, we consider the case d = 2, and we report the errors in the first-order statistics σ²_1, s_1, and k_1 (where, by symmetry, σ²_1 = σ²_2, s_1 = s_2, and k_1 = k_2) and in the interaction terms (σ²_12, s_12, k_12), computed with respect to the analytical solution. These statistics are well converged at N = (n_0 + 1)² = 121. In Figures 3, 4, and 5, we consider the case d = 3, and percentage errors for the conditional statistics are reported. Convergence of the conditional statistics is attained at nearly N = 1500. The results show that, in this case, all the moments converge in a reasonably consistent way; it is also clear that the dimensionality of the problem has a strong impact on the convergence, especially when high-order contributions are non-negligible. In the following section, the high-order conditional statistics are employed to show the importance of the high-order interactions between uncertain parameters for the reduction of a numerical model.
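For the product-of-sines function, the reference statistics follow from E[sin(πU)] = 2/π and E[sin²(πU)] = 1/2. The sketch below checks, for d = 2, the internal consistency of the exact variance decomposition and a Monte Carlo estimate; the sample size and tolerances are our choices:

```python
import math
import random

# Exact reference moments for f(xi) = sin(pi xi_1) sin(pi xi_2), xi_i ~ U(0,1).
mu = 2 / math.pi                       # E[sin(pi U)]
exact_mean = mu**2
exact_var = 0.25 - mu**4               # E[f^2] - E[f]^2 with E[sin^2] = 1/2
exact_s1 = mu**2 * (0.5 - mu**2)       # first-order conditional variance
exact_s12 = (0.5 - mu**2) ** 2         # interaction term

# Monte Carlo estimate of mean and variance for comparison.
rng = random.Random(1)
N = 400_000
vals = [math.sin(math.pi * rng.random()) * math.sin(math.pi * rng.random())
        for _ in range(N)]
mean = sum(vals) / N
var = sum((v - mean) ** 2 for v in vals) / N
```

By construction, the decomposition closes: σ² = 2σ²_1 + σ²_12, since (1/2 − μ²)(1/2 + μ²) = 1/4 − μ⁴.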

5.2. On the advantages of high-order indices for global sensitivity analysis

The importance of including the computation of high-order conditional terms in the statistical analysis is demonstrated in this section by means of several test functions.


Figure 1. First-order interactions for Eq. (5.1) with d = 2. Variance, continuous line; skewness, dashed line; kurtosis, dash-dot line.

Figure 2. Second-order interactions for Eq. (5.1) with d = 2. Variance, continuous line; skewness, dashed line; kurtosis, dash-dot line.

Let us consider the classical Sobol function (with four stochastic dimensions)

f(ξ) = ∏_{i=1}^{4} (|4ξ_i − 2| + a_i)/(1 + a_i), (5.2)

where ξ_i ∼ U(0, 1). Two possible choices of the coefficients are considered here: (i) a_i = (i − 1)/2, the so-called linear g-function fglin, or (ii) a_i = i², the so-called quadratic g-function fgquad. In Figure 6, the sensitivity indices for the linear g-function fglin are reported. Several differences can be noticed between the sensitivity indices computed using the variance and those computed using the higher-order moments. The variance-based analysis shows that the first-order sensitivity indices are higher than the second-order ones, which in turn are higher than those of the third and fourth order. This is not the case for skewness and kurtosis, where the second-order contributions are higher than the first-order and third-order ones. This behavior indicates that a variance analysis is able to represent the absolute ranking of the variables in terms of first-order contributions, but the importance associated with higher-order interactions between the parameters is lost. From a practical point of view, underestimating the importance of high-order interactions between variables can lead to wrong decisions in a dimension-reduction strategy. The sum of the first-order contributions to the variance exceeds 0.8, whereas for skewness and kurtosis it does not attain 0.1. This can be demonstrated to be very influential if the accuracy of reduced models is considered. In Table 1, the total sensitivity indices for the four variables are reported. The same functional form can lead to slightly different results if the quadratic coefficients are considered. In Figure 7, the sensitivity indices for the g-function with a quadratic dependence of the coefficients are reported. In this case, the difference between the first-order contributions and the high-order terms is even more evident. For the variance, the first-order contributions exceed 0.98, while a value larger than 0.5 is computed for the high-order interactions when considering skewness and kurtosis.
For both skewness and kurtosis, attaining a level equal to 0.8 is only possible by including the second-order interaction between the first and the second variable. In Table 2, the total sensitivity indices are


Figure 3. First-order interactions for Eq. (5.1) with d = 3. Variance, continuous line; skewness, dashed line; kurtosis, dash-dot line.

Figure 4. Second-order interactions for Eq. (5.1) with d = 3. Variance, continuous line; skewness, dashed line; kurtosis, dash-dot line.


Figure 5. Third order interactions for Eq. (5.1) with d = 3. Variance, continuous line; skewness, dashed line; and kurtosis, dash-dot line.

Variable   TSI    TSI^s   TSI^k
ξ1         0.57   0.79    0.86
ξ2         0.29   0.56    0.64
ξ3         0.17   0.36    0.44
ξ4         0.11   0.24    0.31

Table 1. Total sensitivity indices for the linear g-function Eq. (5.2) based on a PC series with total degree n0 = 5.
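The variance-based indices of the Sobol g-function admit a closed form (the classical result D_i = 1/(3(1 + a_i)²) for each single variable, with total variance D = ∏_i (1 + D_i) − 1), which provides an independent sanity check on the ranking discussed above. A short sketch for the linear coefficients:

```python
# Analytic first-order variance sensitivity indices of the Sobol g-function.
# Each factor (|4 xi - 2| + a)/(1 + a) has variance 1/(3 (1 + a)^2).
a_lin = [(i - 1) / 2 for i in range(1, 5)]   # linear g-function coefficients
D_i = [1 / (3 * (1 + a) ** 2) for a in a_lin]
D = 1.0
for di in D_i:
    D *= 1 + di
D -= 1                                       # total variance
first_order = [di / D for di in D_i]         # first-order variance SI
```

The indices decrease with i, and their sum slightly exceeds 0.8, consistent with the variance-based observations made for fglin in the text.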

reported for the four variables. In this case, the variance contributions for both the third and


Figure 6. Sensitivity indices for the linear g-function fglin Eq. (5.2) obtained with a PC series with total degree n0 = 5. Variance terms are indicated in white, skewness in black and kurtosis in gray.


Figure 7. Sensitivity indices for the quadratic g-function fgquad Eq. (5.2) obtained with a PC series with total degree n0 = 5. Variance terms are indicated in white, skewness in black, and kurtosis in gray.

Figure 7. Sensitivity indices for the quadratic g-function fgquad Eq. (5.2) obtained with a PC series with total degree n0 = 5. Variance terms are indicated in white, skewness in black, and kurtosis in gray. fourth variables are below 0.05, while for both skewness and kurtosis, only the fourth variable contribution takes a TSI value of 0.04, which can be considered non significant. A low level of TSI for the variables ξ3 and ξ4 could suggest truncating the dimensionality of the model to the first two variables or neglecting the contributions related to an order higher than one. Let us now consider the following functions

f_1(ξ) = ξ_1^{ξ_2} e^{ξ_3^{ξ_2 + 1}} + ξ_1 ξ_2 and f_2(ξ) = ∏_{i=1}^{3} (2ξ_i + 1)/2, (5.3)

where the parameters are ξ_i ∼ U(0, 1).

Variable   TSI    TSI^s   TSI^k
ξ1         0.82   0.95    0.97
ξ2         0.14   0.47    0.44
ξ3         0.04   0.13    0.12
ξ4         0.01   0.04    0.04

Table 2. Total sensitivity indices for the quadratic g-function fgquad Eq. (5.2) based on a PC series with total degree n0 = 5.


Figure 8. Sensitivity indices for the first function f1 Eq. (5.3) obtained with a PC series with total degree n0 = 7. Variance terms are indicated in white, skewness in black, and kurtosis in gray.

Sensitivity indices associated with the first function f1 are reported in Figure 8. For the function f1, the most important variable is ξ1. For the variance, the first-order sensitivity index relative to ξ1 is also the most important SI. By contrast, for both skewness and kurtosis, the highest SI is associated with the second-order interaction between the first and the second variable. In this case, inspection of the total sensitivity indices, reported in Table 3, suggests that the third variable ξ3 does not contribute to the variance. However, if this information is used together with the high-order total sensitivity indices, the choice of ignoring the third variable should be considered more carefully. This reflects the importance of ξ3 in the actual form of the probability density function of f1, even if the variance is not heavily influenced by it. The last example, i.e., the function f2 of Eq. (5.3), includes an equal contribution of the three variables. However, looking at Figure 9, note that the variance is concentrated only in the first-order contributions of the single variables, and their sum exceeds 0.9. The skewness and kurtosis contributions, on the other hand, are concentrated in the second-order interactions; for the kurtosis, the third-order interaction is the highest contribution. Note that even if the sum of the first-order variance contributions exceeds 0.9, a reduction of the model that neglects the high orders of interaction could lead to wrong conclusions. In this case, the skewness associated with the first-order metamodel does not include any

Variable   TSI    TSI^s   TSI^k
ξ1         0.79   0.96    0.97
ξ2         0.26   0.96    0.67
ξ3         0.02   0.10    0.10

Table 3. Total sensitivity indices for the first function f1 Eq. (5.3) based on a PC series with total degree n0 = 7.


Figure 9. Sensitivity indices for the function f2 Eq. (5.3) obtained with a PC series with total degree n0 = 7. Variance terms are indicated in white, skewness in black, and kurtosis in gray.

Variable   TSI    TSI^s   TSI^k
ξ1         0.36   0.70    0.71
ξ2         0.36   0.70    0.71
ξ3         0.36   0.70    0.71

Table 4. Total sensitivity indices for the function f2 Eq. (5.3) based on a PC series with total degree n0 = 7.

information associated with the non-symmetric behavior of the probability distribution of the function f2. Values of the total sensitivity indices for this case are reported in Table 4. Note that the sum of the total sensitivity indices over the three variables is much higher for skewness and kurtosis than for the variance; both therefore correctly point to an intrinsically high-order (in the interaction sense) function (see Eq. (5.3) for the definition of f2). From a practical point of view, the information related to the high-order interactions can be exploited for driving a model reduction. With regard to the function fglin (see Figure 6 and Table 1), the partial contribution of the first-order interactions to the variance, since it exceeds 0.8, could lead to the decision to build a metamodel including


Figure 10. PDFs for the complete linear g-function fglin (bold line) and the reduced models: first order (dash-dot) and second order (dashed with symbols).

only the first order, i.e., four ANOVA terms. However, the high-order SI indicate the importance of, at least, the second-order interactions. The comparison, in terms of the probability density function of fglin, is reported in Figure 10. It is evident that the first-order model has zero skewness. The situation greatly improves by including contributions up to the second order. A similar behavior can be observed by applying a model reduction to the function fgquad. Even if the variance is almost fully captured by the first-order contributions of the ANOVA expansion (see Figure 7), the information relative to the skewness is mostly related to the second-order contributions. This is evident in terms of the probability distributions, as shown in Figure 11. Moreover, the function f1 features a different behavior in terms of sensitivity indices. The skewness is associated with the second-order interaction between the first and the second variable. The effect of neglecting the second-order terms is evident in Figure 12, where the probability distributions are shown. The first-order metamodel completely misrepresents the tails of the distribution, whereas a great improvement is associated with the second-order terms. The right tail is well captured, but the model still fails to capture correctly the left tail. In Figure 13, the pdfs of the complete model f2 and of the first- and second-order metamodels are reported. Even if more than 90% of the variance is included in the first-order model, its pdf contains no information about the skewness, and the tails appear to be totally lost. However, if the second-order interactions between variables are included, the quality of the pdf improves consistently.
The numerical test cases presented in this section illustrate how the information provided by variance-based sensitivity indices can be incomplete if the true dependence of a model on its variables is to be understood. In particular, the variance gives more weight to low-order interactions than the sensitivity indices associated with skewness and kurtosis do. This factor could be even more important if the aim is to reduce the dimensionality of the problem and to build an accurate metamodel.
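The effect of truncating the metamodel at a given interaction order (Eq. 4.14) on the skewness can be reproduced on a toy problem. For the illustrative function f(x, y) = x + y + xy (not one of the paper's test cases), the first-order metamodel has exactly zero skewness while the full model does not, mirroring the behavior observed for fglin:

```python
def truncated_skewness(f, order, n=200):
    """Third central moment of the ANOVA-truncated metamodel (Eq. 4.14) of
    f(x, y), x, y ~ U(0,1): order 1 keeps f_0 + f_1 + f_2, order 2 keeps all."""
    pts = [(i + 0.5) / n for i in range(n)]
    f0 = sum(f(x, y) for x in pts for y in pts) / n**2
    f1 = {x: sum(f(x, y) for y in pts) / n - f0 for x in pts}
    f2 = {y: sum(f(x, y) for x in pts) / n - f0 for y in pts}
    def g(x, y):
        if order >= 2:
            return f(x, y)           # full model: all ANOVA terms retained
        return f0 + f1[x] + f2[y]    # first-order metamodel
    m = sum(g(x, y) for x in pts for y in pts) / n**2
    return sum((g(x, y) - m) ** 3 for x in pts for y in pts) / n**2

f = lambda x, y: x + y + x * y
s1 = truncated_skewness(f, order=1)   # interaction term dropped
s2 = truncated_skewness(f, order=2)   # full model
```

Here the first-order surrogate captures most of the variance yet carries none of the skewness, which is exactly the failure mode the high-order indices are meant to flag.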


Figure 11. PDFs for the complete quadratic g-function fgquad (bold line) and the reduced models: first (dash-dot) and second order (dashed with symbols).


Figure 12. PDFs for the complete f1 (bold line) and the reduced models: first (dash-dot) and second order (dashed with symbols).

Figure 13. PDFs for the complete f2 (bold line) and the reduced models: first order (dash-dot) and second order (dashed with symbols).

6. Conclusions and future perspectives

This study focused on the decomposition of high-order statistical moments of multi-variate stochastic functions, obtained as outputs of problems subject to uncertainty in the inputs. A correlation was found between the functional decomposition, in the sense of Sobol, and the polynomial chaos expansion. This allows each term in the decomposition to be clearly defined and also provides a practical way to compute all these terms. The procedure is assessed on several analytic test cases by computing the convergence curves obtained by using PC. Furthermore, sensitivity indices based on the skewness and kurtosis decompositions are introduced. The importance of ranking the predominant uncertainties in terms not only of the variance but also of the higher-order moments (thus extending the ANOVA analysis to higher-order statistical moments) was demonstrated. Future work will be directed towards the application of the high-order sensitivity information, as already done for the variance, for adaptive dimensional reduction in the context of robust design optimization. Moreover, efforts to estimate, a priori, the quality of a polynomial metamodel in which the high-order interactions cannot be accurately assessed are currently under investigation.

Acknowledgments The authors acknowledge Mike Emory for insightful comments on a preliminary draft of this work. The associated team AQUARIUS (Inria-UQ Lab Stanford) is also acknowl- edged for financial support.

REFERENCES

Abgrall, R., Congedo, P. M., Geraci, G. & Iaccarino, G. 2012 Decomposition of high-order statistics. Technical Report, INRIA.

Borgonovo, E., Apostolakis, G. E., Tarantola, S. & Saltelli, A. 2003 Comparison of global sensitivity analysis techniques and importance measures in PSA. Reliab. Eng. Syst. Safe. 79, 175–185.

Caflisch, R. E., Morokoff, W. & Owen, A. B. 1997 Valuation of mortgage backed securities using Brownian bridges to reduce effective dimension. J. Comput. Finance 1, 26–47.

Congedo, P. M., Geraci, G., Abgrall, R., Pediroda, V. & Parussini, L. 2013 TSI metamodels-based multi-objective robust optimization. Eng. Computation 30, 1032–1053.

Eldred, M. S. 2009 Recent advances in non-intrusive polynomial chaos and stochastic collocation methods for uncertainty analysis and design. AIAA Paper 2009–2274.

Foo, J. & Karniadakis, G. E. 2010 Multi-element probabilistic collocation method in high dimensions. J. Comput. Phys. 229, 1536–1557.

Gao, Z. & Hesthaven, J. S. 2011 Efficient solution of ordinary differential equations with high-dimensional parametrized uncertainty. Commun. Comput. Phys. 10, 253–278.

Kim, N. H., Wang, H. & Queipo, N. V. 2006 Efficient shape optimization under uncertainty using polynomial chaos expansions and local sensitivities. AIAA J. 44, 1112–1115.

Sobol, I. 2001 Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math. Comp. Simulat. 55, 271–280.