Parametric measures of variability induced by risk measures

Fabio Bellini∗ Tolulope Fadina† Ruodu Wang‡ Yunran Wei§

December 10, 2020

Abstract

We study general classes of parametric measures of variability with applications in risk management. Particular focus is put on variability measures induced by three classes of popular risk measures: the Value-at-Risk, the Expected Shortfall, and the expectiles. Properties of these variability measures are explored in detail, and a characterization result is obtained via the mixture of inter-ES differences. Convergence properties and asymptotic normality of their empirical are established. We provide an illustration of the three classes of variability measures applied to financial and analyze their relative advantages.

Keywords: Measures of variability, quantiles, Value-at-Risk; Expected Shortfall; expec- tiles arXiv:2012.05219v1 [q-fin.RM] 9 Dec 2020

and Quantitative Methods, University of Milano-Biccoca, Italy. [email protected] †Department of Mathematical Sciences, University of Essex, United Kingdom. [email protected] ‡Department of Statistics and , University of Waterloo, Canada. [email protected] §Department of Statistics and Actuarial Science, Northern Illinois University, USA. [email protected]

1 1 Introduction

Various measures of distributional variability are widely used in statistics, probability, , finance, physical sciences, and other disciplines. In this paper, we study the theory for measures of variability with an emphasis on three symmetric classes generated by popular parametric risk measures. These risk measures are the Value-at-Risk (VaR), the Expected Shortfall (ES), and the expectiles, and they respectively induce the inter-quantile difference, the inter-ES difference, and the inter-expectile difference. The great popularity and convenient properties of these risk measures lead to various features for the corresponding variability measures. The literature on risk measures is extensive, and a standard reference is F¨ollmerand Schied(2016). As classic risk measures, VaR is a quantile and ES is a coherent risk measure in the sense of Artzner et al.(1999). Both VaR and ES are implemented in current banking and insurance regulation; see McNeil et al.(2015) for a comprehensive background and Wang and Zitikis(2020) for a recent account. Expectiles, originally introduced by Newey and Powell (1987) in regression, have received an increasing attention in risk management, as they are the only elicitable coherent risk measures (Ziegel(2016)). We refer to Bellini et al.(2014) and Bellini and Di Berhardino(2015) for the theory and applications of expectiles. For a comparison of the above risk measures in the context of regulatory capital calculation, see Embrechts et al.(2014) and Emmer et al.(2015). The theory of variability measures has been studied with various different definitions; see David(1998) for a review in the context of measuring statistical dispersion. A mathe- matical formulation close to our variability measures is the measure of Rockafellar et al.(2006), which is further developed by Grechuk et al.(2009, 2010). A similar notion of measures of variability was proposed by Furman et al.(2017) with an emphasis on the Gini deviation. We will explain in Section2 the differences between our definition and the ones in the literature; in particular, the inter-quantile difference does not satisfy the definition of a deviation measure of Rockafellar et al.(2006). Our main contribution is a collection of results on a theory for variability measures, with particular results on the three parametric classes mentioned above. Several novel properties are studied to discuss the special role these measures play among other variability measures. Since for VaR, ES, and expectiles is well developed, the estimation of the corresponding variability measures is also straightforward; see e.g., Shorack and Wellner (2009) for statistical inference of VaR and Kr¨atschmer and Z¨ahle(2017) for that of expectiles.

2 The three classes of variability measures of our main interest are not new. The inter- quantile difference is a popular measure of statistical dispersion widely used in box plots, sometimes under the name inter- (e.g., David(1998)). The other two objects are, as far as we know, relatively new: the inter-ES difference appears in Example 4 of Wang et al.(2020b) as a signed Choquet integral, and the inter-expectile difference is studied by Bellini et al.(2020) via a connection to option prices. Nevertheless, this paper is the first unifying study of these measures, in particular focusing on their qualitative and quantitative advanatges. The rest of the paper is organized as follows. In the remainder of this section, we introduce some notation. The definitions of the three classes of variability measures induced by VaR, ES, and the expectiles is presented in Section2, with some basic properties. In Section3, we summarize many properties of some common variability measures which are arguably desirable in practice. A characterization result of these measures established. In Section4 we discuss non-parametric estimation of the three classes of variability measures. We obtain the asymptotic normality and the asymptotic explicitly for the empirical estimators. It may be undesirable and financial unjustifiable to choose the same probability level for the three classes of variability measures induced by VaR, ES, and the expectiles; see Li and Wang(2019) for a detailed analysis on plausible equivalent probability levels when ES is to replace VaR. A simple analysis of a cross-comparison of an equivalent probability level for the variability measures using different distributions is carried out in Section5.A small empirical analysis using the variability measures on the S&P 500 index is conducted in Section6, where we observe the differences between these variability measures during different economic regimes. In Section7 we conclude the paper with some discussions on the suitability of the three classes in different situations. AppendixA contains a list of classic variability measures, and proofs of all results are put in AppendixB.

Notation

Throughout, Lq is the set of all random variables in an atomless probability space ∞ (Ω, A, P) with finite q-th , q ∈ (0, ∞) and L is the set of essentially bounded random 0 variables. X = L is the set of all random variables, and M is the set of all distributions on R. 0 −1 For any X ∈ L , FX represents the distribution function of X, FX its left-quantile function, −1 and UX is a uniform [0, 1] such that FX (UX ) = X almost surely. The exis- tence of such uniform random variable UX for any X is given, for example, in Lemma A.32 of F¨ollmerand Schied(2016). Two random variables X and Y are comonotonic if there exist

3 increasing functions f and g such that X = f(X + Y ) and Y = g(X + Y ). We write X =d Y if X and Y have the same distribution. In this paper, terms “increasing” and “decreasing” are in the non-strict sense.

2 Definitions

2.1 Basic requirements for variability measures

A variability measure is a functional ν : X → [0, ∞] which quantifies the magnitude of variability of random variables. In order for our definition of variability measures to be as

general as possible, we only require three natural properties on a functional X → R to be a variability measure.

Definition 1. A variability measure is a functional ν : X → [0, ∞] satisfying the following properties.

(A1) Law invariance: if X ∈ X and Y ∈ X have the same distribution under P, denoted as X =d Y , then ν(X) = ν(Y ).

(A2) Standardization: ν(m) = 0 for all m ∈ R.

(A3) Positive homogeneity: there exists α ∈ [0, ∞) such that ν(λX) = λαν(X) for any λ > 0 and X ∈ X . The number α is called the homogeneity index of ν.

Some examples of classic variability measures are given in AppendixA. There are some relative measures of variability in the literature which are only formulated for positive random variables, such as the Gini coefficient or the relative deviation (see AppendixA). In this paper, we omit these variability measures, although our definition can be easily amended to include

them by replacing X with a positive convex cone. We call the set Xν = {X ∈ X : ν(X) < ∞} the effective domain of ν.

Remark 1. A deviation measure ν of Rockafellar et al.(2006) satisfies, in addition to stan- dardization and positive homogeneity with index 1, that ν(X) > 0 for all non-constant X and ν is subadditive. The latter two properties are not satisfied by the inter-quantile difference, which we will see later in Table1. Thus, to study the three parametric classes of variabil- ity measures, our set of definitions is more suitable than that of Rockafellar et al.(2006). Alternatively, Furman et al.(2017) defined measures of variability via a location-invariance property instead of positive homogeneity. This property is not satisfied by the relative vari- ability measures in AppendixA. In view of the above discussions, we use the three properties

4 in Definition1 as the defining properties of variability measures, and all other properties (such as location invariance and subadditivity) will be additional properties that may or may not be satisfied by a variability measure; see Section3 for a detailed analysis.

2.2 Three parametric families of risk measures

Below we list three parametric families of risk measures which are popular in risk man- agement (e.g., Embrechts et al.(2014) and Emmer et al.(2015)). We define two versions of quantiles and ES for notational convenience.

(i) The right-quantile (right VaR): for p ∈ (0, 1),

Qp(X) = inf{x ∈ R : P(X 6 x) > p},X ∈ X .

The left-quantile (left VaR): for p ∈ (0, 1),

− Qp (X) = inf{x ∈ R : P(X 6 x) > p},X ∈ X .

(ii) The ES: for p ∈ (0, 1),

1 Z 1 ESp(X) = Qr(X)dr, X ∈ X . 1 − p p

The left-ES: for p ∈ (0, 1),

Z p − 1 ESp (X) = Qr(X)dr, X ∈ X . p 0

(iii) The expectile: for p ∈ (0, 1),

1 exp(X) = min{x ∈ R : pE[(X − x)+] 6 (1 − p)E[(X − x)−]},X ∈ L .

− − In the above risk measures, Qp and Qp are finite on X , and ESp, ESp and exp are finite on L1. We only define expectiles on L1 as generalizing expectiles beyond L1 is not natural; on the other hand, ES can be defined naturally on a set larger than L1 by taking possibly infinite values.

2.3 Three variability measures based on risk measures

We now define variability measures derived from three popular parametric families of risk measures, the main objects of the paper.

5 (i) The inter-quantile difference: for p ∈ [1/2, 1),

Q − ∆p (X) = Qp(X) − Q1−p(X),X ∈ X .

Q 0 It is obvious that ∆p is finite on X = L .

(ii) The inter-ES difference: for p ∈ (0, 1),

ES − ∆p (X) = ESp(X) − ES1−p(X),X ∈ X .

− Here, ESp takes value in (−∞, ∞], and ES1−p takes value in [−∞, ∞), and hence the ES above ∆p is well defined on X .

(iii) The inter-expectile difference: for p ∈ (1/2, 1),

ex 1 ∆p (X) = exp(X) − ex1−p(X),X ∈ L ,

ex 1 and ∆p (X) = ∞ for X ∈ X \ L .

In addition to the above three classes, it is also natural to define the limiting variability measure Q ES ex ∆1 (X) = ∆1 (X) = ∆1 (X) = ess-sup(X) − ess-inf(X),X ∈ X ,

Q ES which is the range functional, and it is simply denoted by ∆1. Both ∆p and ∆p belong to the class of distortion riskmetrics (Wang et al.(2020a,b)), with many convenient theoretical ex properties. On the other hand, ∆p does not belong to this class, but it also has many nice properties, inherited from expectiles. These properties are summarized in Table1 in Section 3.

Remark 2. Analogously, we can define a variability measure ∆ρ1,ρ2 by

ρ1,ρ2 ∆ (X) = ρ1(X) − ρ2(X),X ∈ X , for any risk measures ρ1 and ρ2 satisfying ρ1 > ρ2. In this paper, we focus mainly on the three parametric families of variability measures above, since they are generated by the three most popular and practical risk measures with convenient properties.

Q In Theorems1-2 and Table1 below, the range of p is p ∈ [1/2, 1) for ∆p , p ∈ (1/2, 1) ex ES for ∆p , and p ∈ (0, 1) for ∆p .

6 Theorem 1. For each p, the following statements hold.

Q ES ex (i) ∆p , ∆p , ∆p and ∆1 are variability measures.

Q ES ex 0 1 1 ∞ (ii) The effective domains of ∆p , ∆p , ∆p and ∆1 are L , L , L , and L , respectively.

Q ES ex (iii) Each of ∆p , ∆p and ∆p is increasing in p.

(iv) For X ∈ X , the following alternative formulations hold

Q ∆p (X) = Qp(X) + Qp(−X), ES ∆p (X) = ESp(X) + ESp(−X), ex ∆p (X) = exp(X) + exp(−X).

ES It is straightforward to check that a special case of ∆p by taking p = 1/2 is two times the deviation (MMD; see AppendixA). The next proposition suggests that for ES ∆p , it suffices to consider p ∈ [1/2, 1).

ES ES ES 1 R 1 Q Proposition 1. For each p ∈ (0, 1), (1 − p)∆p = p∆1−p, and ∆p = 1−p p ∆q dq.

ES Using Proposition1, we only need to discuss properties and estimation of ∆ p for p ∈ [1/2, 1), and we will tacitly assume p ∈ [1/2, 1) in most of our statements for the three classes of variability measures.

3 Comparative properties and characterization

Q ES ex In this section, we study comparative advantages of ∆p , ∆p and ∆p , among with several other measures of variability, namely the (STD), the , the mean absolute deviation (MAD), the Gini deviation (Gini-D), and the range; see Appendix A for the definition of these classic variability measures. We consider the following properties of a variability measure ν, which are all arguably desirable in some situations. In what follows, ≺cx is the convex order, that is, for two random variables X and Y , X ≺cx Y if E[φ(X)] 6 E[φ(Y )] for all convex φ : R → R such that the above two expectations exist.

(B1) Relevance: ν(X) > 0 if X is not a constant, and there exists β ∈ (0, ∞) such that

ν(X) 6 β for all X ∈ X with |X| 6 1.

(B2) Continuity: ν((X ∧ M) ∨ (−M)) → ν(X) as M → ∞ for all X ∈ X .

7 (B3) Symmetry: ν(X) = ν(−X) for all X ∈ X .

(B4) Comonotonic additivity (C-additivity): ν(X + Y ) = ν(X) + ν(Y ) for all comonotonic X,Y ∈ X .

(B5) Convex order consistency (Cx-consistency): ν(X) 6 ν(Y ) if X ≺cx Y .

(B6) Convexity: ν(λX + (1 − λ)Y ) 6 λν(X) + (1 − λ)ν(Y ) for all X,Y ∈ X and λ ∈ [0, 1].

(B7) Mixture concavity (M-concavity):ν ˆ is concave, whereν ˆ : M → [0, ∞] is defined by νˆ(F ) = ν(X) for X ∼ F .

(B8) Location invariance (L-invariance): ν(X + c) = ν(X) for X ∈ X and c ∈ R.

Relevance (B1) requires ν to report a positive value for all non degenerate distributions, and the value of ν(X) should not explode if |X| 6 1. Continuity (B2) is very weak and unspecific q q to the effective domain of ν(R). If ν is finite on L for some q > 1, then (B2) is implied by L continuity. Symmetry (B3) that ν is indifferent to the positive and the negative sides of the distribution, and this property is in sharp contrast to classic risk measures of which positive and negative values are interpreted very differently (deficit/surplus or loss/profit). C-additivity (B4) is a convenient functional property which allows for a characterization result below. The properties (B5)-(B7) describe natural requirements for ν to increase when the underlying distribution is more spread out in some sense; see Wang et al.(2020a) for explanations of these properties. Finally, (B8) is a simple requirement that variability is measured independently of the location of the distribution and is imposed by Furman et al. (2017) for measures of variability. In Table1 below, α represents the homogeneity index. Table1 shows properties of differ- ent variability measures including the inter-quantile, inter-ES, and inter-expectile differences, as well as some classic variability measures.

Theorem 2. The statements in Table1 hold true.

The proof of Theorem2, thus checking the properties in Table1, relies on several existing results on properties of risk measures and distortion riskmetrics from Newey and Powell (1987), Bellini et al.(2014, 2018), Liu et al.(2020) and Wang et al.(2020a). Notably, the inter-ES difference satisfies all properties (B1)-(B8), along with the Gini deviation and the range. Next, we establish that any variability measure satisfying (B1)-(B8) ES admits a representation as a mixture of ∆p for p ∈ (0, 1].

8 Q ES ex ∆p ∆p ∆p variance STD MAD Gini-D range relevance NO YES YES YES YES YES YES YES continuity YES YES YES YES YES YES YES YES symmetry YES YES YES YES YES YES YES YES C-additivity YES YES NO NO NO NO YES YES Cx-consistency NO YES YES YES YES YES YES YES convexity NO YES YES YES YES YES YES YES M-concavity NO YES NO YES YES NO YES YES L-invariance YES YES YES YES YES YES YES YES homogeneity (α) 1 1 1 2 1 1 1 1 effective domain L0 L1 L1 L2 L2 L1 L1 L∞

Table 1: Properties of variability measures

Theorem 3. The following statements are equivalent for a variability measure ν : X → [0, ∞]:

(i) ν satisfies (B1)-(B8).

(ii) ν satisfies (B1)-(B4) and one of (B5)-(B6).

ES (iii) ν is a mixture of ∆p , that is, there exists a finite Borel measure µ 6= 0 on (0, 1] such that

Z 1 ES ν(X) = ∆p (X)dµ(p),X ∈ X . (1) 0

The measure µ in (1) for a given ν is generally not unique. Using Proposition1, we can require µ in (1) to be supported on [1/2, 1] instead of (0, 1].

Example 1. There are three examples in Table1 that satisfy all of (B1)-(B8), and they each admit a representation in Theorem3. We give a corresponding measure µ for each of them.

ES 1.∆ p for p ∈ (0, 1): µ = δp.

2. The Gini deviation: µ(dx) = (1 − x)dx on [0, 1].

3. The range ∆1: µ = δ1.

Q ES ex As we have seen from Theorem2, all of ∆ p , ∆p , ∆p are invariant under location shift. Q ES ex In the next result, we show that each of the families ∆p , ∆p , ∆p characterize a symmetric distribution up to location shift.

9 Proposition 2. Suppose that X has a symmetric distribution, i.e., X =d −X. Each of the Q ES ex curves p 7→ ∆p (X), p 7→ ∆p (X) and p 7→ ∆p (X) for p ∈ (1/2, 1), if it is finite, uniquely determines the distribution of X.

Q ES Remark 3. If the distribution of X is not symmetric, none of p 7→ ∆p (X), p 7→ ∆p (X) ex and p 7→ ∆p (X) for p ∈ (1/2, 1) determines its distribution up to location shift. This is − because the quantile difference curve p 7→ Qp − Q1−p does not determine the quantile curve p 7→ Qp. For instance, for a quantile curve p 7→ Qp(X), we can define another quantile curve p 7→ Qp(Y ) by

Qp(Y ) = Qp(X) + f(p), p ∈ (0, 1), where f(p) is any continuous function satisfying f(p) = f(1 − p) for p ∈ (0, 1/2), such − that Qp(Y ) is increasing in p. The quantile difference curves p 7→ Qp(X) − Q1−p(X) and − p 7→ Qp(Y ) − Q1−p(Y ) are the same, but the distributions of X and Y are not the same up to location shift unless f is a constant function.

4 Non-parametric estimators

Q ES ex The non-parametric estimation of ∆p (X), ∆p (X) and ∆p (X) is straightforward from non-parametric estimators of VaR, ES and expectiles, which we will explain in this section. 1 Suppose X1,X2 · · · ∈ L are iid sample for a random variable X. Recall that the empir- ical distribution Fbn of X1,...,Xn is given by

n 1 X Fbn(x) = 1{X x}, x ∈ . n j 6 R j=1

Q Q Q Let ∆b p (n) be the empirical of ∆p (X), that is, applying ∆p to the empirical ES ex distribution of X1,...,Xn. Similarly, let ∆b p (n) and ∆b p (n) be the empirical estimators of ES ex ∆p (X) and ∆p (X), respectively. We will establish consistency and asymptotic normality of the empirical estimators, based on results on empirical estimators of VaR, ES and expectiles in the literature, e.g., Chen and Tang(2005), Chen(2008), and Kr¨atschmer and Z¨ahle(2017). We make a standard regularity assumption on the distribution of the random variable X.

(R) The distribution F of X ∈ L1 is supported on an interval and has a positive density function f on the support.

Denote by g = f ◦ F −1 and let p ∈ (1/2, 1). We will show in the next theorem that the

10 Q ES asymptotic variances of the empirical estimators for ∆p and ∆p are given by, respectively,

p(1 − p) p(1 − p) (1 − p)2 σ2 = + − 2 , (2) Q (g(p))2 (g(1 − p))2 g(p)g(1 − p) Z Z ! 2 1 s ∧ t − st σES = 2 −2 dtds, (3) (1 − p) [p,1]2∪[0,1−p]2 [p,1]×[0,1−p] g(t)g(s)

ex and that for ∆p is given by

2 ex ex ex σex = sp + s1−p − 2cp , (4) where for r ∈ {p, 1 − p},

(1 − r)1 + r1 ex {t6exr(X)} {t>exr(X)} fr,F (t) = , t ∈ R, (1 − 2r)F (exr(X)) + r Z ∞ Z ∞ ex ex ex sr = fr,F (t)fr,F (s)F (t ∧ s)(1 − F (t ∨ s))dtds, −∞ −∞ Z ∞ Z ∞ ex ex ex cr = fr,F (t)f1−r,F (s)F (t ∧ s)(1 − F (t ∨ s))dtds. −∞ −∞

Theorem 4. Suppose that p ∈ (1/2, 1) and Assumption (R) holds.

Q p Q ES p ES ex p ex (i) ∆b p (n) → ∆p (X), ∆b p (n) → ∆p (X) and ∆b p (n) → ∆p (X) as n → ∞.

(ii) If X ∈ L2+δ for some δ > 0, then

√ Q Q d 2 n(∆b p (n) − ∆p (X)) → N(0, σQ), √ ES ES d 2 n(∆b p (n) − ∆p (X)) → N(0, σES), √ ex ex d 2 n(∆b p (n) − ∆p (X)) → N(0, σex),

2 2 2 where σQ, σES and σex are given in (2), (3) and (4), respectively.

Simulation results are presented in Figure1 for normal and Pareto risks with p = 0.9, which confirm the asymptotic normality of the empirical estimators in Theorem4. Because of location-invariance and positive homogeneity, the parameters of the normal distribution is not important, and we take a standard normal distribution. The Pareto distribution is −4 chosen with tail index 4, i.e., P(X > x) = x for x > 1. Asymptotic results on the empirical estimators for α-mixing data can be established similarly using results in Chen(2008) and Kr¨atschmer and Z¨ahle(2017). For the interest of space we omit discussions on results in the case of dependent observations.

11 N(2.563, 5.195/n) N(0.752, 1.761/n) 20 30 15 20 10 Density Density 10 5 0 0

2.50 2.55 2.60 2.65 0.72 0.74 0.76 0.78 0.80

inter−quantile difference for normal samples inter−quantile difference for Pareto samples

Q Q (a) of ∆b p (n) for N(0, 1) (b) Histogram of ∆b p (n) for Pareto(4)

N(3.51, 6.966/n) N(1.358, 10.47/n) 15 12 8 10 6 Density Density 5 4 2 0 0

3.45 3.50 3.55 3.60 1.25 1.30 1.35 1.40 1.45 1.50

inter−ES difference for normal samples inter−ES difference for Pareto samples

ES ES (c) Histogram of ∆b p (n) for N(0, 1) (d) Histogram of ∆b p (n) for Pareto(4)

N(1.723, 1.53/n) N(0.669, 2.011/n) 25 30 20 15 Density Density 10 5 0 0

1.68 1.70 1.72 1.74 1.76 0.62 0.64 0.66 0.68 0.70 0.72 0.74

inter−expectile difference for normal samples inter−expectile difference for Pareto samples

ex ex (e) Histogram of ∆b p (n) for N(0, 1) (f) Histogram of ∆b p (n) for Pareto(4) Figure 1: of empirical estimators for simulated normal and Pareto risks, plotted against the density of their asymptotic normal distributions in Theorem4 (with variance normalized by the sample size n). Each histogram is computed from 5,000 replications with sample size n = 10, 000. The parameter p is set to 0.9 in all simulation .

12 5 A rule of thumb for cross comparison

As mentioned in the introduction, we are interested in comparing the inter-quantile, the inter-ES and the inter-expectile differences. Due to the different meanings of the parameter Q ES ex p in VaRp, ESp and exp, there is no reason to directly compare ∆p , ∆p and ∆p using the same probability level p. For a fair cross comparison, we may calibrate p, q, r such that the variability measures have the same value, that is,

Q ES ex ∆p = ∆q = ∆r , for some common choices of distributions. In particular, we will consider normal (N), t- and exponential distributions as benchmarks, and the curves of q and r in terms of p for these distributions are plotted in Figure2. We observe that the values of r is typically much closer to 1 than the corresponding p or q. The value of q is smaller than the corresponding p but the relationship between q and p is close to linear; a corresponding observation on comparing VaR and ES is noted by Li and Wang(2019), where they obtained the ratios (1 − q)/(1 − p) ≈ 2.5 for normal risks and (1 − q)/(1 − p) = e ≈ 2.72 for exponential risks (this corresponds to the straight line in Figure 2b). In empirical studies, it may be convenient to use the matching values for normal distribu- tion as a rule of thumb for general comparisons; note that the location and scale parameters are irrelevant for such a comparison due to location-invariance and positive homogeneity. Roughly, we obtain Q ES ex ∆p ≈ ∆q ≈ ∆r for (p, q, r) ∈ {(0.9, 0.75, 0.97), (0.95, 0.875, 0.99), (0.99, 0.97, 0.999)}. For the particular choice Q ES ex of p = 0.95, it means that ∆0.95 ≈ ∆0.875 ≈ ∆0.99 for normal risks. We will compare these variability measures on real data in the next section, confirming the well-known fact that financial return data are not normally distributed.

6 Empirical analysis

In this section, we analyze the financial data for the three classes of variability measures studied in this paper, and observe the difference between their performances during different periods of time (different economic regimes). Our data are the historical price movements spanned from 01/04/1999 to 06/30/2020 of the S&P 500 index.1 We use its daily log-loss

1The source of the price data is Yahoo Finance.

13 q,r q,r

q q r r 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0

0.5 0.6 0.7 0.8 0.9 1.0 0.6 0.7 0.8 0.9 1.0

p p

(a) q, r for p ∈ (0.5, 1) in N(0, 1) (b) q, r for p ∈ (0.5, 1) in exp(1) q,r q,r

q q r r 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0

0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0

p p

(c) q, r for p ∈ (0.5, 1) in t(4) (d) q, r for p ∈ (0.5, 1) in t(10)

Q ES ex Figure 2: q, r such that ∆p = ∆q = ∆r for p ∈ (0.5, 1)

14 data2 over the observation period with moving window of 253 days for daily estimation of the Q variability measures. For example, an estimation of ∆p on 01/02/2000 use the log-loss data from Jan 1999 to Dec 1999. ES ex To compare the relative performance of the three measures, we report the ratios ∆q /∆r ES Q and ∆q /∆p for the S&P 500 daily log-losses in Figures3 and4 using the rule of thumb for (p, q, r) obtained in Section5 induced by the normal distribution. In Figure3, spikes in the ES Q ratio of ∆q /∆p are located around the 2008 subprime crisis and the COVID-19 period. On ES ex the other hand, the ratio ∆q /∆r in Figure4 experiences a down-slide around the subprime ES crisis and the COVID-19 period. These results suggest that ∆q is more sensitive to extremely Q ex ES large losses than ∆p , but ∆r is even more sensitive than ∆q . Recall that these ratios should ES Q be 1 if the underlying losses are normally distributed, whereas we observe ∆q /∆p > 1 and ES ex ES ex ∆q /∆r < 1 for most dates during the period of 2000 - 2020 (∆q /∆r is almost always smaller than 1). Hence, Figures3 and4 confirm that the data from S&P 500 are not normally distributed, and in fact, they are heavy tailed. The tail-heaviness of financial return data is well studied; see McNeil et al.(2015) for more discussions.

1.30 1.5

1.25 1.4 1.20

1.3 5 0 9

9 1.15 . . Q 0 Q 0 / / 5 5 7 7 . S 8 . 1.2 S 1.10 E 0 E 0

1.05 1.1

1.00 1.0 0.95

0.9 0.90 2000 2004 2008 2012 2016 2020 2000 2004 2008 2012 2016 2020 year year

ES Q Figure 3: The ratio of ∆q to ∆p using S&P 500 daily log-loss data (Jan 2000 - Jun 2020). Left: (p, q) = (0.9, 0.75). Right: (p, q) = (0.95, 0.875).

2We use the log-loss (negative log-return) to be consistent with most studies on financial asset return data.

15 1.0 1.0

0.9 0.9 9 7

9 0.8 9 . x . x 0.8 e 0 e 0 / / 5 5 7 7 . S 8 . S E 0 E 0 0.7 0.7

0.6 0.6

0.5 2000 2004 2008 2012 2016 2020 2000 2004 2008 2012 2016 2020 year year

ES ex Figure 4: The ratio of ∆q to ∆r using S&P 500 daily log-loss data (Jan 2000 - Jun 2020). Left: (q, r) = (0.75, 0.97). Right: (q, r) = (0.875, 0.99).

7 Conclusion

In this paper, we introduce variability measures induced by three very popular parametric families of risk measures, that is, the inter-quantile, the inter-ES, and the inter-expectile differences. The three classes of variability measures enjoy many nice theoretical properties (Theorem1); in particular, each of them characterizes symmetric distributions up to a location shift (Proposition2). We study several desirable functional properties of general variability measures including the above three classes and many other classic ones; a grand summary is obtained in Theorem2 and Table1. The family of variability measures that satisfy a set of desirable properties is characterized as mixtures of inter-ES differences (Theorem3). It is important to note that the three classes of variability measures introduced in this paper are well defined on L1 and they each has one intuitive parameter which allows for flexible applications. This distinguishes them from other deviation measures (e.g., Rockafellar et al.(2006)) where no parametric family is given. The empirical estimators of the the inter- quantile, inter-ES, and the inter-expectile differences can be formulated based on those of VaR, ES and the expectile, and the asymptotic normality of the estimators is established (Theorem4). Applied to financial asset return data, we observe that the behaviour of the variability measures is similar to the corresponding parametric family. However, a comparison of different ratio of the variability measures reveals that ∆ex is the most sensitive to extreme losses, and ∆Q is the least sensitive. For the end-user, if tail risk is of particular concern, then ∆ex may be a better variability measure to use, as it captures tail-heaviness quite effectively. However, ∆ex is usually cum- bersome in computation and optimization because of the lack of explicit formulas in terms of

16 quantile or distribution functions; another technical disadvantage is that ∆ex is not concave in mixtures. On the other hand, if robustness is more important and tail risk is not relevant, then ∆Q is a good choice, because quantiles are easy to compute and they are generally more robust than coherent risk measures including ES and expectiles (see Cont et al.(2010)). Moreover, ∆Q is well defined on risks without a finite mean; nevertheless we should keep in mind that ∆Q ignores tail risk just like a quantile. Finally, ∆ES lies somewhere in between ∆Q and ∆ex regarding the above considerations, which giving rise to a good compromise; further, it is the only one among the three classes that is concave in mixtures (see Table1), and it is the building block for many other measures of variability (see Theorem3). In the literature, risk measures are commonly defined on a space of both positive and negative random variables. For this reason, our variability measures are also defined on such spaces, and we omit a detailed study of relative variability measures which are defined only for positive random variables. Relative variability measures include important examples such as the relative deviation and the Gini coefficient; see AppendixA. By replacing classic risk measures with relative risk measures (e.g., Peng et al.(2012)), one could define new classes of relative risk measures. On the other hand, other parametric families of risk measures, such as entropic risk measures (e.g., F¨ollmerand Schied(2016)) and RVaR (e.g., Embrechts et al. (2018)), can also be used to design flexible variability measures.

Acknowledgements

T. Fadina and R. Wang are supported by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-03823, RGPAS-2018-522590).

A Classic variability measures

Below we list some classic variability measures, which are formulated on their respective effective domains.

(i) The variance (Var) 2 2 E[(X − E[X]) ],X ∈ L .

(ii) The standard deviation (STD):

pVar(X),X ∈ L2.

(iii) The range (∆1): ess-sup(X) − ess-inf(X),X ∈ L∞.

17 (iv) The mean absolute deviation (MAD):

1 E[|X − E[X]|],X ∈ L .

(v) The mean median deviation (MMD):

1 min E[|X − x|] = E[|X − Q1/2(X)|],X ∈ L . x∈R

(vi) The Gini deviation (Gini-D):

1 [|X − X |],X ∈ L1,X ,X ,X are iid. 2E 1 2 1 2

(vii) The relative deviation: SD(X) 2 ,X ∈ L+. E[X]

(viii) The Gini coefficient:

E[|X1 − X2|] Gini(X) 1 = ,X ∈ L+,X1,X2,X are iid. 2E[X] E[X]

q q Here, L+, q ∈ [0, ∞] is the set of all non-negative random variables X in L with P(X > 0) > 0. B Proofs of main results

Proof of Theorem1. (i) Law invariance (A1) is obvious. For standardization (A2), note − − that the risk measures ρ ∈ {Qp,Qp , ESp, ESp , exp} are all monetary (F¨ollmerand Schied Q (2016)) and satisfies ρ(m) = m for any constant m. Hence, for a constant m, ∆p (m) = ES ex − − ∆p (m) = ∆p (m) = 0. Positive homogeneity follows from that of Qp, Qp , ESp, ESp

and exp.

(ii) The effective domains of these variability measures can be easily checked from the ef- fective domain of the corresponding risk measures.

− Q (iii) Since Qp is increasing in p and Q1−p is decreasing in p, ∆p is increasing in p. The same ES ex applies to ∆p and ∆p .

(iv) By definition, for X ∈ L0,

Qp(−X) = inf{x ∈ R : P(−X 6 x) > p} = inf{x ∈ R : P(X < −x) < 1 − p}.

18 Moreover,

inf{x ∈ R : P(X 6 x) > 1 − p} = sup{x ∈ R : P(X 6 x) < 1 − p},

and hence

− −Q1−p(X) = − inf{x ∈ R : P(X 6 x) > 1 − p}

= − sup{x ∈ R : P(X 6 x) < 1 − p}

= inf{−x ∈ R : P(X 6 x) < 1 − p} = inf{x ∈ R : P(X 6 −x) < 1 − p}.

− Q Thus, Qp(−X) = −Q1−p(X) and ∆p (X) = Qp(X) + Qp(−X).

ES − The formula for ∆p , ESp(X) − ES1−p(X) = ESp(X) + ESp(−X), follows directly from definition.

ex Next we show the formula for ∆p . From Newey and Powell(1987), the expectile ex p(X), for p ∈ (1/2, 1) is the unique solution x to

pE[(X − x)+] = (1 − p)E[(X − x)−]. (5)

Hence, the expectile of −X satisfies

(1 − p)E[(−X − ex1−p(−X))+] = pE[(−X − ex1−p(−X))−].

This is equivalent to

pE[(X + ex1−p(−X))+] = (1 − p)E[(X + ex1−p(−X))−].

− The uniqueness of solution x to (5) implies −ex1−p(X) = exp(−X). Hence,

ex − ∆p (X) = exp(X) − ex1−p(X) = exp(X) + exp(−X),

thus the desired formula.

Proof of Proposition1. By definition and Theorem1 (iv), for X ∈ L1,

Z 1 ES 1 ES (1 − p)∆p (X) = (1 − p) (Qr(X) + Qr(−X)) dr = p∆1−p(X) 1 − p p

19 and

Z 1 Z 1 Z 1 ES 1 1 1 Q ∆p (X) = Qq(X)dq + Qq(−X)dq = ∆q (X)dq. 1 − p p 1 − p p 1 − p p

Hence, the desired statements hold.

Proof of Theorem2. We first explain some general observations on all variability measures in Table1. The effective domains and the homogeneity indices follow directly from definition. Continuity (B2) is implied by Lq continuity since all variability measures are finite and thus continuous on their effective domains. Symmetry (B3) and location invariance (B8) are straightforward to check, and they hold for all variability measures in Table1. The conditions (B5)-(B7) are connected. In particular, Theorem 3 of Wang et al.(2020a) shows that (B5)-(B7) are equivalent for distortion riskmetrics, which are functionals satis- fying (A1), (B4) and some continuity assumptions. It is well known that the inter-quantile differences and the inter-ES differences are distortion riskmetrics. Next, we explain that convexity (B6) implies Cx-consistency (B5) for all variability measures we consider. By Theorem 2.2 of Liu et al.(2020), all law-invariant convex risk functionals, i.e., functionals satisfying (A1), (B6) and (B8), can be written as the supremum of a family of convex distortion riskmetrics. Since each distortion riskmetric is Cx-consistent by Theorem 3 of Wang et al.(2020a), (B5) is implied by (B6). The only negative statement for (B5) is made for the inter-quantile difference, which is a non-convex distortion riskmetric; see Table 1 of Wang et al.(2020a). Hence, the inter-quantile difference does not satisfy any of (B5)-(B7). It remains to verify (B1), (B4), (B6), (B7) for each variability measure.

Q (i) The following example shows that ∆p does not satisfy (B1). Take ε > 0 such that Q p + ε < 1 and X ∼ Bernoulli(1 − p − ε). Notice that X is not a constant but ∆p (X) = − Q Qp(X) − Q1−p(X) = 0 − 0 = 0. C-additivity (B4) is satisfied since ∆p is a distortion riskmetric. (B6)-(B7) are explained above.

ES (ii)∆ p , Gini-D and range are all convex distortion riskmetrics; see Table 1 of Wang et al. (2020a). Hence, they all satisfy (B4)-(B7). Relevance (B1) can be easily verified.

(iii) If X is not a constant, by Newey and Powell(1987, Theorem 1), ex p is strictly increasing ex in p ∈ (0, 1), which means that ∆p (X) = exp(X) − ex1−p(X) > 0 for p ∈ (1/2, 1). By

Proposition 7 of Bellini et al.(2014), ex p is increasing in X, so for |X| 6 1, −1 6

20 ex exp(X) 6 1 for p ∈ (0, 1). Thus ∆p (X) 6 2 and Relevance (B1) is satisfied. Convexity (B6) is satisfied by Theorem1 (iv) and convexity of expectiles.

ex We show that M-concavity (B7) is not satisfied by ∆p (X) via the following example from Bellini et al.(2018). Take p = 1/10. Define X by P(X = −1) = 1/2, and ex 8 P(X = 1) = 1/2; Y by P(Y = 0) = 2/3, P(Y = 5) = 1/3. Then ∆1/10(X) = − 5 and ex 800 ∆1/10(Y ) = − 209 .

9 1 Let F = 10 FX + 10 FY and Z ∼ F . Then

2531 9 1 9524 ∆ex (Z) = − < ∆ex (X) + ∆ex (Y ) = − , 1/10 1311 10 1/10 10 1/10 5225

ex and hence ∆p is not mixture concave.

C-additivity (B4) is not satisfied since by Theorem3, a variability measure satisfying (B1)-(B5) must satisfy (B7).

(iv) For the variance, Relevance (B1) can be easily verified. Variance does not satisfy (B4) since (B4) requires the homogeneity index to be 1. For (B6), the variance is well known to be convex (Deprez and Gerber(1985)); see also Example 2.2 of Liu et al.(2020). The variance satisfies M-concavity (B7) because of the well known equality

2 2 2 σ (X) = min E[(X − x) ],X ∈ L . x∈R

Since σ2 is the minimum of mixture-linear functionals, we know that it is mixture concave.

(v) For STD, Relevance (B1) can be easily verified. C-additivity (B4) is not satisfied by STD since STD is not additive for comonotonic random variables X and Y with correlation less than 1. STD is convex (B6); see Example 2.1 of Liu et al.(2020). To show that STD 1 satisfies M-concavity (B7), take X,Y ∈ L and let Z ∼ λFX + (1 − λ)FY for λ ∈ [0, 1]. By definition,

σ2(Z) − (λσ(X) + (1 − λ)σ(Y ))2

2 2  = λ(1 − λ) E[X ] + E[Y ] − 2E[X]E[Y ] − 2σ(X)σ(Y ) 2 2 2 2  = λ(1 − λ) E [X] + σ (X) + E [Y ] + σ (Y ) − 2E[X]E[Y ] − 2σ(X)σ(Y )  2 2 = λ(1 − λ) (E[X] − E[Y ]) + (σ(X) − σ(Y )) > 0,

21 which is equivalent to σ(Z) > λσ(X) + (1 − λ)σ(Y ).

(vi) For the mean absolute deviation (MAD), Relevance (B1) can be easily verified. MAD satisfies convexity (B6), since, for λ ∈ [0, 1] and X,Y ∈ L1,

E[|λX + (1 − λ)Y − λE[X] − (1 − λ)E[Y ]|]

6 E[|λX − λE[X]|] + E[|(1 − λ)(Y − E[Y ])|] = λE[|X − E[X]|] + (1 − λ)E[|Y − E[Y ]|].

We give an example showing that MAD does not satisfy M-concavity (B7). Take X ∼ d 1 1 Bernoulli(1/3), and Y = −X. Let F = 2 FX + 2 FY and Z ∼ F . It is easy to calculate that E[X] = 1/3, E[Y ] = −1/3, E[Z] = 0, and E[|X − E[X]| = E[|Y − E[Y ]| = 4/9. On the other hand, 1 1 1 [|Z − [Z]|] = [|X|] + [|Y |] = . E E 2E 2E 3 Therefore, 1 1 [|Z − [Z]|] < [|X − [X]| + [|Y − [Y ]|, E E 2E E 2E E and hence MAD is not mixture concave.

C-additivity (B4) is not satisfied by MAD since by Theorem3, a variability measure satisfies (B1)-(B5) must satisfy (B7).

R 1 ES Proof of Theorem3. Write the functional νµ = 0 ∆p dµ(p), which is the right-hand side of (1). First, obviously (i) implies (ii). It is also straightforward to check that (iii) implies (i), ES since ∆p for p ∈ (0, 1] satisfies (B1)-(B8) by Theorem2, and so is νµ; the only non-trivial statement is (B2) of νµ which is guaranteed by Theorem 5 of Wang et al.(2020a). Below, we show (ii)⇒(iii).

Let Xν be the effective domain of ν. Take X ∈ Xν such that ν(X) > 0. By (B4), ν(2X) = ν(X) + ν(X) = 2ν(X). Hence, the homogeneity index of ν is 1. 0 d Suppose that Cx-consistency (B5) holds. Take any X,Y ∈ Xν and let X = X and 0 d 0 0 0 0 Y = X such that X and Y are comonotonic. It is well known that X + Y ≺cx X + Y ; see e.g., Theorem 3.5 of R¨uschendorf(2013). Using (B4) and (B5), we have

0 0 0 0 ν(X + Y ) 6 ν(X + Y ) = ν(X ) + ν(Y ) = ν(X) + ν(Y ).

Therefore, ν is subadditive, that is,

ν(X + Y ) 6 ν(X) + ν(Y ) for all X,Y ∈ X . (6)

22 Note that convexity (B6) and homogeneity (A3) with α = 1 together also imply subadditivity. Hence, either assuming (B5) or (B6), we get (6). It follows from (6) and (B1) that there exists

β > 0 such that ν(Y ) − ν(X) 6 ν(Y − X) 6 βkY − Xk∞ where kY − Xk∞ is the essential supremum of |Y − X|. Hence, ν is uniformly continuous with respect to the supremum norm. ∞ Moreover, as a consequence of (B1), (A3) and (6), Xν is a convex cone that contains L . Theorem 1 of Wang et al.(2020a) suggests that a real functional on a convex cone that is uniformly continuous with respect to the supremum norm, law-invariant, and satisfying (B2) and (B4) is a distortion riskmetric in the sense of that paper; see (7) below. Further, using Theorem 3 of Wang et al.(2020a), we know that any of (B5)-(B7) implies that ν is 1 a convex distortion riskmetric on Xν ∩ L . By Theorem 5 of Wang et al.(2020a), ν has a representation, for some finite measures µ1 and µ2,

Z 1 Z 1 1 ν(X) = ESp(X)dµ1(p) + ESp(−X)dµ2(p),X ∈ Xν ∩ L . (7) 0 0

By symmetry (B3), we know

Z 1 Z 1 1 ν(X) = ν(−X) = ESp(X)dµ2(p) + ESp(−X)dµ1(p),X ∈ Xν ∩ L . 0 0

Hence, we can take µ = (µ1 + µ2)/2, and get

Z 1 ES 1 ν(X) = ∆p (X)dµ(p),X ∈ Xν ∩ L . 0

1 ES Relevance (B1) implies µ 6= 0, which in turn implies Xν ⊂ L , as the effective domain of ∆p 1 ∞ is L for p ∈ (0, 1). Hence, the two functionals ν and νµ coincide on Xν which contains L .

Also note that both ν and νµ satisfy continuity (B2), and hence one can approximate any random variable outside Xν with truncated random variables, and obtain that ν and νµ also coincide on X .

Proof of Proposition2. (i) If X has a symmetric distribution, then by Theorem1 (iv), we have

Q − − − − ∆p (X) = Qp(X) − Q1−p(X) = −Q1−p(−X) − Q1−p(X) = −2Q1−p(X).

Q Q Assume X1 and X2 are symmetric distributions with finite ∆p (X1) = ∆p (X2) for 1 − − 1 p ∈ ( 2 , 1). It follows that Qp (X1) = Qp (X2) for p ∈ (0, 2 ). By the left-continuity of − − the left-quantile, Q1/2(X1) = Q1/2(X2). By symmetry of the distribution of X, we have

23 − − Qp (X1) = Qp (X2) almost every p, and thus X1 and X2 have the same distribution.

ES − (ii) If X has a symmetric distribution, then similarly to (i), we have ∆p (X) = −2ES1−p(X). ES ES Assume X1 and X2 are symmetric distributions with finite ∆p (X1) = ∆p (X2) for 1 1 − − p ∈ ( 2 , 1). It follows that for p ∈ (0, 2 ), ESp (X1) = ESp (X2) holds, which means

Z p Z p Qr(X1)dr = Qr(X2)dr. (8) 0 0

1 We know the set of discontinuities of Qr is at most countable, so for each p ∈ (0, 2 ),

there exists p0 < p such that Qr is continuous on (p0, p). By (8), we have

Z r0 Z p Z r0 Z p Qr(X1)dr + Qr(X1)dr = Qr(X2)dr + Qr(X2)dr. 0 r0 0 r0

By taking the derivative with respect to p, we get

Qp(X1) = Qp(X2).

1 This argument can be applied to any p ∈ (0, 2 ). Similarly to part (i), we conclude that

X1 and X2 have the same distribution.

(iii) If X has a symmetric distribution, then similarly to (i), we have

ex ∆p (X) = 2exp(X) = −2ex1−p(X).

ex ex Suppose X1 and X2 have symmetric distributions with finite ∆p (X1) = ∆p (X2) for 1 1 1 p ∈ ( 2 , 1). Then exp(X1) = exp(X2) for p ∈ (0, 2 ) ∪ ( 2 , 1). By symmetry, we observe

that E[X1] = E[X2] = 0, which means ex 1 (X1) = ex 1 (X2) = 0, so exp(X1) = exp(X2) 2 2 for p ∈ (0, 1).

The expectile has alternative definitions from Newey and Powell(1987),

2p − 1 ex (X) = [X] + [(X − ex (X)) ], p E 1 − p E p +

which leads to

E[(X1 − exp(X1))+] = E[(X2 − exp(X2))+].

24 Since exp(X) is continuous in p and takes all values in the range of X, we know

E[(X1 − x)+] = E[(X2 − x)+]

for all x ∈ R, implying that the distributions of X1 and X2 are identical.

Proof of Theorem4. (i) Let Qbp(n), ESc p(n), and exb p(n) be the empirical estimators of

Qp(X), ESp(X), and exp(X) based on n sample data points. It is well known (e.g., p Bahadur(1966)) that Qbr(n) → Qr(X) at each r of continuous point of Qr(X), which im- Q p Q plies ∆b p (n) → ∆p (X) under assumption (R). Since ESp and exp are law-invariant con- p vex risk measures, by Theorem 2.6 of Kr¨atschmer et al.(2014), ESc r(n) → ESr(X) and p ES p ES ex p ex exb r(n) → exr(X) for each r. Hence we have ∆b p (n) → ∆p (X) and ∆b p (n) → ∆p (X).

(ii) By Proposition 1 of Shorack and Wellner(2009, p.640), if assumption (R) is satisfied, then we have

√   d Bp n Qbp(n) − Qp(X) → . (9) g(p)

− where Bp is a standard Brownian bridge. With assumption (R), Qp(X) = Qp (X). Hence,

√   d Bp B1−p n ∆b Q(n) − ∆Q(X) → − , p p g(p) g(1 − p)

which has a Gaussian distribution. Using the property of the Brownian

bridge, that is, Cov[Bt,Bs] = s − st for s < t, we have

 B B  (1 − p)2 Cov p , 1−p = . g(p) g(1 − p) g(p)g(1 − p)

√ Q Q d 2 2 Therefore, n(∆b p (n) − ∆p (X)) → N(0, σQ), where σQ is in (2), namely,

p(1 − p) p(1 − p) (1 − p)2 σ2 = + − 2 . Q g2(p) g2(1 − p) g(p)g(1 − p)

Next, we address the inter-ES difference. Applying the convergence in (9) to ESp, we obtain Z 1 √   d 1 Bs n ESc p(n) − ESp(X) → ds, 1 − p p g(s)

25 and thus

Z 1 Z 1−p √ ES ES d 1 Bs 1 Bs n(∆b p (n) − ∆p (X)) → ds − ds. 1 − p p g(s) 1 − p 0 g(s)

Note that

 Z 1   Z 1 Z 1  1 Bs 1 BsBt Var ds = E 2 dtds 1 − p p g(s) (1 − p) p p g(s)g(t) 1 Z 1 Z 1 s ∧ t − st = 2 dtds, (1 − p) p p g(s)g(t) and

1 Z 1 1 Z 1−p 1  1 Z 1 Z 1−p s ∧ t − st 2 Cov Btdt, Btdt = 2 dtds. (1 − p) p g(t) 0 g(t) (1 − p) p 0 g(t)g(s)

√ ES ES d 2 2 Hence, n(∆b p (n) − ∆p (X)) → N(0, σES), with σES given in (3), namely,

Z Z ! 2 1 s ∧ t − st σES = 2 −2 dtds. (1 − p) [p,1]2∪[0,1−p]2 [p,1]×[0,1−p] g(t)g(s)

For the inter-expectile difference, we use Theorem 3.2 of Kr¨atschmer and Z¨ahle(2017). The conditions for this theorem are satisfied in our setting noting that X ∈ L2+δ; see Remark 3.4 of Kr¨atschmer and Z¨ahle(2017). We obtain, for p ∈ (1/2, 1),

√ ex n(exb p(n) − exp(X)) → N(0, sp ) where for r ∈ {1 − p, p},

Z ∞ Z ∞ ex ex ex sr = fr,F (t)fr,F (s)F (t ∧ s)(1 − F (t ∨ s))dtds, −∞ −∞ and (1 − r)1 + r1 ex {t6exr(X)} {t>exr(X)} fr,F (t) = . (1 − 2r)F (exr(X)) + r Similar arguments as above lead to

√ ex ex d ex ex ex n(∆b p (n) − ∆p (X)) → N(0, sp + s1−p − 2cp ),

26 where Z ∞ Z ∞ ex ex ex cp = fp,F (t)f1−p,F (s)F (t ∧ s)(1 − F (t ∨ s))dtds. −∞ −∞ This completes the proof.

References

Artzner, P., Delbaen, F., Eber, J.-M. and Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3), 203–228. Bahadur, R. R. (1966). A note on quantiles in large samples. The Annals of , 37(3), 577–580. Bellini, F., Bignozzi, V. and Puccetti, G. (2018). Conditional expectiles, time consistency and mixture convexity properties. Insurance: Mathematics and Economics, 82, 117–123. Bellini, F. and Di Berhardino, E. (2015). Risk management with expectiles. The European Journal of Finance, 23(6), 487–506. Bellini, F., Klar, B., M¨uller, A. and Rosazza Gianin, E. (2014). Generalized quantiles as risk measures. Insurance: Mathematics and Economics, 54(1), 41–48. Bellini, F., Mercuri, L. and Rroji, E. (2020). On the dependence structure between S&P500, VIX and implicit Interexpectile Differences. Quantitative Finance, 20(11), 1839–1848. Chen, S. X. (2008). Nonparametric estimation of Expected Shortfall. Journal of Financial , 6, 87–107. Chen, S. X. and Tang, C. Y. (2005). Nonparametric inference of Value-at-Risk for dependent financial returns. Journal of Financial Econometrics, 3, 227–255. Cont, R., Deguest, R. and Scandolo, G. (2010). Robustness and sensitivity analysis of risk measurement procedures. Quantitative Finance, 10(6), 593–606. David, H. A. (1998). Early sample measures of variability. Statistical Science, 13(4), 368–377. Deprez, O. and Gerber, H. U. (1985). On convex principles of premium calculation. Insurance: Mathematics and Economics, 4(3), 179–189. Embrechts, P., Puccetti, G., R¨uschendorf, L., Wang, R. and Beleraj, A. (2014). An academic response to Basel 3.5. Risks, 2(1), 25-48. Embrechts, P., Liu, H. and Wang, R. (2018). Quantile-based risk sharing. Operations Research, 66(4), 936–949. Emmer, S., Kratz, M. and Tasche, D. (2015). What is the best risk measure in practice? A comparison of standard measures. Journal of Risk, 18(2), 31–60.

27 F¨ollmer,H. and Schied, A. (2016). Stochastic Finance. An Introduction in Discrete Time. Walter de Gruyter, Berlin, Fourth Edition. Furman, E., Wang, R. and Zitikis, R. (2017). Gini-type measures of risk and variability: Gini shortfall, capital allocation and heavy-tailed risks. Journal of Banking and Finance, 83, 70–84. Grechuk, B., Molyboha, A. and Zabarankin, M. (2009). Maximum entropy principle with general deviation measures. Mathematics of Operations Research, 34(2), 445–467. Grechuk, B., Molyboha, A. and Zabarankin, M. (2010). Chebyshev inequalities with law invariant deviation measures, Probability in the Engineering and Informational Sciences, 24(1), 145–170. Kr¨atschmer, V., Schied, A. and Z¨ahle,H. (2014). Comparative and quantitiative robustness for law-invariant risk measures. Finance and Stochastics, 18(2), 271–295. Kr¨atschmer, V. and Z¨ahle,H. (2017). Statistical inference for expectile-based risk measures. Scandinavian Journal of Statistics, 44(2), 425–454. Li, H. and Wang, R. (2019). PELVE: Probability equivalent level of VaR and ES. SSRN : 3489566. Liu, F., Cai, J., Lemieux, C. and Wang R. (2020). Convex risk functionals: representation and applications. Insurance: Mathematics and Economics, 90, 66–79. McNeil, A. J., Frey, R. and Embrechts, P. (2015). Quantitative Risk Management: Concepts, Techniques and Tools. Revised Edition. Princeton, NJ: Princeton University Press. Newey, W. and Powell, J. (1987). Asymmetric estimation and testing. Econo- metrica, 55(4), 819–847. Peng, L., Qi, Y., Wang, R. and Yang, J. (2012). Jackknife empirical likelihood method for some risk measures and related quantities. Insurance: Mathematics and Economics, 51(1), 142–150. Rockafellar, R. T., Uryasev, S. and Zabarankin, M. (2006). Generalized deviation in risk analysis. Finance and Stochastics, 10, 51–74. R¨uschendorf, L. (2013). Mathematical Risk Analysis. Dependence, Risk Bounds, Optimal Al- locations and Portfolios. Springer, Heidelberg. Shorack, G. and Wellner, J. (2009). Empirical Processes with Applications to Statistics. Society for Industrial and Applied Mathematics. Wang, Q., Wang, R. and Wei, Y. (2020a). Distortion riskmetrics on general spaces. ASTIN Bulletin, 50(4), 827–851.

28 Wang, R., Wei, Y. and Willmot, G. E. (2020b). Characterization, robustness and aggregation of signed Choquet integrals. Mathematics of Operations Research, 45(3), 993–1015. Wang, R. and Zitikis, R. (2020). An axiomatic foundation for the Expected Shortfall. Man- agement Science, DOI: 10.1287/mnsc.2020.3617. Ziegel, F. (2016). Coherence and elicitability. Mathematical Finance, 26(4), 901–918.

29