Beta and Coskewness Pricing: Perspective from Probability Weighting

YUN SHI, XIANGYU CUI, XUN YU ZHOU∗

ABSTRACT

The security market line is often flat. We hypothesize that probability weighting plays a role and that one ought to differentiate periods in which agents overweight and underweight tails.

Overweighting inflates the probability of extremely bad events and demands greater compensation for beta risk, whereas underweighting is the opposite. Overall, these two effects offset each other, resulting in a nearly flat return-beta relationship. Similarly, overweighting the tails enhances the negative relationship between return and coskewness, while underweighting reduces it. We offer a psychology-based explanation regarding the switch between overweighting and underweighting, and we support our theories empirically.

∗Shi is at Academy of Statistics and Interdisciplinary Sciences, Faculty of Economics and Management, East China Normal University, Shanghai, China. Email: [email protected]. Cui is at School of Statistics and Management, Shanghai Institute of International Finance and Economics, Shanghai University of Finance and Economics, Shanghai, China. Email: [email protected]. Zhou is at Department of Industrial Engineering and Operations Research and The Data Science Institute, Columbia University, New York, NY 10027, USA. Email: [email protected]. Shi acknowledges financial support from the National Natural Science Foundation of China under grants 71971083, 71601107, and from the State Key Program of National Natural Science Foundation of China under grant 71931004. Cui acknowledges financial support from the National Natural Science Foundation of China under grant 71601107. Zhou acknowledges financial support through a start-up grant at Columbia University and through the Nie Center for Intelligent Asset Management.

1 Introduction

The classical CAPM theory of Sharpe (1964), Lintner (1965), and Mossin (1966) asserts that expected returns increase with beta, leading to an upward-sloping security market line. However, empirical studies have shown that the return–beta slope is flat or even downward-sloping; see, e.g., Black, Jensen, and Scholes (1972), Fama and French (1992), and Baker, Bradley, and Wurgler (2011). Various explanations have been offered for why beta is not priced or is negatively priced, ranging from misspecification of risk (Jagannathan and Wang, 1996), to inefficiency of market proxies (Roll and Ross, 1994), to market frictions (Black et al., 1972; Frazzini and Pedersen, 2014), and to aggregate disagreement (Hong and Sraer, 2016).

We approach the beta anomaly from a different perspective, one that involves probability weighting. The central task of asset pricing is to characterize how expected returns are related to risk and to investors’ perceptions of risk. Probability weighting affects the perception of risk, especially in the two tails of the market returns; therefore, it is poised to play a role in beta pricing. On the other hand, a positive/negative skewness measures the risk of large positive/negative realizations and can be viewed as a part of tail risk. As a result, probability weighting is also relevant to skewness/coskewness pricing because substantially right-/left-skewed events, such as winning a lottery (Barberis and Huang, 2008; Bordalo, Gennaioli, and Shleifer, 2012) or encountering a catastrophe (Kelly and Jiang, 2014; Bollerslev, Todorov, and Xu, 2015; Kozhan, Neuberger, and Schneider, 2013), are more attractive/undesirable to an investor under probability weighting.

In this paper, we provide a theory for beta and coskewness pricing using rank-dependent utilities (Quiggin, 1982; Schmeidler, 1989; Abdellaoui, 2002; Quiggin, 2012). The key difference between the rank-dependent utility theory (RDUT) and the classical expected utility theory (EUT) is that in the former the utility is weighted by a probability weighting function assigned to ranked outcomes.1 So RDUT nests EUT but also captures risk attitude toward probabilities, especially those represented in the two tails of the return distributions. We first derive an equilibrium asset pricing formula for a representative RDUT economy, and we then examine the signs and magnitudes of the risk premiums in covariance and coskewness. Our results show that the pricing kernel in this economy is the product of the marginal utility and the derivative of probability weighting, which implies that the shape of the weighting function matters for pricing.

1 Another prominent theory that extends EUT to include probability weighting is the cumulative prospect theory (CPT; Tversky and Kahneman (1979) and Tversky and Kahneman (1992)). Based on CPT, Barberis, Mukherjee, and Wang (2016) construct a TK factor and find that its predictive power derives mainly from probability weighting. Hence, we focus on RDUT in this paper.

The dominant view in the behavioral finance literature is that individuals overweight probabilities of both extremely good and extremely bad events, rendering an inverse S-shaped probability weighting function (Tversky and Kahneman, 1992; Wu and Gonzalez, 1996; Prelec, 1998; Hsu, Krajbich, Zhao, and Camerer, 2009; Tanaka, Camerer, and Nguyen, 2010). Some authors, however, have documented empirical evidence that individuals sometimes underweight tail events (Hertwig, Barron, Weber, and Erev, 2004; Humphrey and Verschoor, 2004; Harrison, Humphrey, and Verschoor, 2009; Henrich, Heine, and Norenzayan, 2010), leading to S-shaped weighting functions. Accordingly, Barberis (2013) comments that the coexistence of overweighting and underweighting poses a challenge in the study of extreme events. Employing the method proposed in Polkovnichenko and Zhao (2013), we use S&P 500 index option data from January 4, 1996 to April 29, 2016 to back out option-implied probability weighting functions, updated monthly. In about 89.3% of the 244 months tested in our study, the implied weighting function is either S-shaped (about 37.7%) or inverse S-shaped (about 51.6%). This observation suggests that individuals tend to overweight or underweight the two tails simultaneously (Tversky and Kahneman, 1992; Gonzalez and Wu, 1999; Humphrey and Verschoor, 2004). We further use the implied weighting functions to construct a monthly probability weighting index that reflects investors’ perceptions of and responses to tail events, based on which we divide the market into two regimes: overweighting and underweighting. The market is in the overweighting regime in about 57% of the months and in the underweighting regime in about 43%. Given this empirical finding, we can separate the two regimes and analyze beta and coskewness pricing in each.

Our main theoretical result is a closed-form expression of a three-moment CAPM, with risk premiums of the covariance and coskewness of any given asset with respect to the market. These premiums depend on the utility function, as well as on the probability weighting function evaluated at the probability value of having a positive market return. During an overweighting month (in which the weighting function is inverse S-shaped), the agent overweights both tails. Overweighting the left tail makes the RDUT agent more risk averse toward bad states (related to the convex part of the weighting function) than her EUT counterpart, whereas overweighting the right tail makes her less risk averse (the concave part) toward good states. While these two effects seem to offset each other, the empirical fact that the average probability of having a positive market return is sufficiently high for any reasonably long sample period (about 0.65 to 0.70 for annual S&P 500 returns; see Section 3.2) indicates that only the convex part of the weighting function is relevant. The intuition is that, when the probability of having a positive market return is large enough, the chance of observing an extreme negative realization is much smaller than that of observing a positive realization of the same size. Thus, the impact of extremely bad states is much stronger than that of extremely good states, which causes the agent to pay more attention to the former. This one-sided effect results in an enhanced, positive risk premium with respect to the covariance with the market. Symmetrically, during an underweighting month in which the agent underweights both tails of the market return distribution, she is less risk averse toward bad states and less risk seeking toward good ones. Once again, the left tail effect dominates and only the former behavior is relevant because the probability of having a positive market return now falls in the concave domain of the weighting function. In this case, the agent demands less compensation for the covariance with the market compared to an EUT maximizer, and she may even be willing to pay for the risk if and when she sufficiently underweights the bad states. Finally, for a sufficiently long sample period covering comparable runs of overweighting and underweighting months, the elevated beta premium during the former periods and the reduced beta premium during the latter periods may cancel each other, leading to an overall flat or possibly downward-sloping security market line (see Figure 7(a)).

Kraus and Litzenberger (1976) and Harvey and Siddique (2000) show that a typical EUT agent is willing to pay for the coskewness, implying a negative premium in coskewness. Probability weighting is likely to strengthen this skewness-inclined preference during an overweighting period. This can be explained as follows. When the asset has positive coskewness with the market, the asset return has a fat right tail with respect to the market portfolio. The hope of a larger gain relative to the market, which is amplified by overweighting, further increases the attraction of the risky asset and requires even less compensation when compared to the case of no probability weighting. Symmetrically, when the coskewness is negative, the fear of a larger loss, which is also amplified by the overweighting, reduces the attraction of the risky asset and hence demands greater compensation. Combined, overweighting the tails inflates the negative relationship between expected return and coskewness, generating a significantly negative premium of the coskewness (see Figure 7(b)). When the agent underweights both tails of the market return, the probability weighting introduces a skewness-averse preference. Underweighting the tails deflates the negative relationship between expected return and coskewness, resulting in an insignificant (or even potentially positive) premium.

Our empirical study confirms the above key theoretical predictions. To carry out this study, the first important task is to identify the characteristics and level of probability weighting for any given period. We develop several indices for this purpose, using S&P 500 option data for the period January 4, 1996 – April 29, 2016. After deriving the non-parametric, option-implied probability weighting functions as described earlier, we propose an “implied fear index” (IFI) and an “implied hope index” (IHI) as the average values of the derivatives of the weighting function around 1 and 0, respectively. The two indices have highly similar dynamics (see Figure 6(a)), with a correlation coefficient of 0.8. Hence, we use the average of the two to define one of our probability weighting indices, PWI1, to reflect the level of probability weighting. To cross-examine how robust PWI1 is, we introduce a second probability weighting index, PWI2, by fitting the parametric specification of Prelec’s weighting function.2 PWI1 and PWI2 have a correlation coefficient of 0.949 (see also Figure 6(b) and Figure 6(c) for a comparison of their dynamics). Therefore, we use PWI1 alone as the probability weighting index in our empirical study, and we use its median to divide between the overweighting regime and the underweighting one.

2 Prelec’s weighting function generates inverse S-shaped and S-shaped functions, determined by a single parameter.

We then perform portfolio analysis to study the impact of probability weighting on the mean-covariance and mean-coskewness relationships, respectively. We use all of the common stocks listed on the NYSE, AMEX and NASDAQ for the period January 1991 – April 2016. The one-way sort portfolio analysis (Table 2 and Figure 7) shows that the covariance spread portfolio has a significantly positive return during overweighting months, a significantly negative return during underweighting months, and a slightly positive (but insignificant) average return for the entire period. The coskewness spread portfolio, on the other hand, has a significantly negative return during overweighting months, an insignificantly negative return during underweighting months, and a notably negative average return for the entire period. Moreover, the two-way sort portfolio analysis (used to separate the effects of covariance and coskewness) yields similar results (Table 3). Finally, a Fama–MacBeth cross-sectional regression analysis (Fama and MacBeth, 1973) further confirms all of the results (Table 4).

The results themselves regarding conditional risk-return tradeoffs during periods when investors overweight or underweight tail probabilities do not, however, answer the question as to why individuals switch between overweighting and underweighting. Dierkes (2013) estimates a probability weighting function month by month from index options and finds that it varies over time in a way that is related to a sentiment index proposed by Baker and Wurgler (2006). Yet that paper stops short of providing a psychological explanation as to why this variation takes place. Baker and Wurgler (2006)’s index is constructed based on Initial Public Offering (IPO) numbers, first-day returns of IPOs, closed-end fund discount, turnover ratio, and share of equity/debt issues, attributes to which individuals do not typically pay direct attention. So it is unlikely that the changes in this sentiment index are responsible for the variation in probability weighting. In the present paper, we propose a psychology-based hypothesis, involving both beliefs and preferences, to explain how probability weighting responds directly to changes in the outside environment. We use Tversky and Kahneman (1973)’s availability heuristic to understand how investors’ estimation of tail probabilities is impacted by recently experienced rare yet significant events, such as extreme price movements and massive volatility, which are often portrayed sensationally by the news media. We also apply recent theories in the psychology and behavioral finance literatures (Herold and Netzer, 2010; Woodford, 2012; Steiner and Stewart, 2016), which hold that probability weighting, as a choice of preferences, is a matter of responding optimally to various challenges associated with decision-making under uncertainty, to deduce that probability weighting varies endogenously as the market environment changes.

We then test the above hypothesis empirically. We choose different groups of variables that capture, directly or indirectly, changes in the market environment, including price return variables, newspaper-based variables, investor survey variables, VIX, and Baker and Wurgler (2006)’s sentiment index. We find high correlations between PWI1 and both the return variables and the media variables, supporting the view that a direct experience of extreme events and/or grandstanding media coverage can prompt people to adjust their beliefs/preferences. There are considerable correlations between PWI1 and the survey variables, providing (albeit perhaps circumstantial) evidence that probability weighting is related to investors’ opinions regarding the market. Interestingly, our study finds that PWI1 and Baker and Wurgler (2006)’s sentiment index are unrelated, suggesting that, rather intuitively, factors such as IPOs and equity/debt issues are not sources of time variation in probability weighting because individuals typically do not pay attention directly to these factors. In an online appendix, we use these highly correlated variables as alternative probability weighting indices to repeat the empirical analysis on the conditional return and covariance/coskewness relationship, and we find that the results continue to hold. This additional empirical check demonstrates that our main results are robust in that they are independent of the choice of a particular probability weighting index.

The paper proceeds as follows. We present and solve the portfolio selection problem under RDUT in Section 2. We derive the equilibrium asset pricing formula and the three-moment CAPM in Section 3. We present the empirical analysis of the conditional return and covariance/coskewness relationship in Section 4. We propose and empirically test a psychology-based explanation of the time variation in probability weighting in Section 5. We conclude in Section 6. All proofs are presented in the Appendix.

2 Portfolio Selection under Rank-dependent Utility

2.1 Basic Setting

Consider a one-period, two-date market. The set of possible states of nature at date 1 is $\Omega$ and the set of events at date 1 is a $\sigma$-algebra $\mathcal{F}$ of subsets of $\Omega$. There are $n$ assets (one risk-free asset and $n - 1$ risky assets), whose prices at time $t$ are $P_{t1}, P_{t2}, \cdots, P_{tn}$, $t = 0, 1$. These assets are priced by

$$P_{0i} = \mathbb{E}[\tilde m P_{1i}], \quad i = 1, 2, \cdots, n,$$

where $\tilde m$ is the pricing kernel, which is an $\mathcal{F}$-measurable random variable such that $\mathbb{P}(\tilde m > 0) = 1$, $\mathbb{E}[\tilde m] < \infty$ and $\mathbb{E}[\tilde m P_{1i}] < \infty$ ($i = 1, 2, \cdots, n$). The representative agent in the market has an RDUT preference given by

$$U(\tilde X) = \int u(x)\, w'(1 - F_{\tilde X}(x))\, dF_{\tilde X}(x),$$

where $u(\cdot)$ is an (outcome) utility function, $w(\cdot)$ a probability weighting function, and $F_{\tilde X}(\cdot)$ the cumulative distribution function (CDF) of the random payoff $\tilde X$. Specific parametric classes of probability weighting functions proposed in the literature include those by Tversky and Kahneman (1992)

$$w(p) = \frac{p^\alpha}{(p^\alpha + (1 - p)^\alpha)^{1/\alpha}},$$

and by Prelec (1998)

$$w(p) = \exp(-(-\log(p))^\alpha),$$

where, in both cases, $0 < \alpha < 1$ corresponds to inverse S-shaped weighting functions (i.e., first concave and then convex; hence overweighting both tails) and $\alpha > 1$ corresponds to S-shaped ones (i.e., first convex and then concave; hence underweighting both tails). In this paper, we will use Prelec’s weighting functions, which are plotted in Figure 1 with different $\alpha$’s. The smaller the $\alpha$ ($< 1$), the higher the degree of overweighting, and the larger the $\alpha$ ($> 1$), the higher the degree of underweighting.
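As a rough numerical illustration of the two parametric families above, the following sketch evaluates both weighting functions at a few probabilities. The function names and the chosen values of α are illustrative assumptions, not taken from the paper.

```python
import math


def prelec(p: float, alpha: float) -> float:
    """Prelec (1998) weighting function w(p) = exp(-(-log p)^alpha)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


def tversky_kahneman(p: float, alpha: float) -> float:
    """Tversky and Kahneman (1992): p^a / (p^a + (1 - p)^a)^(1/a)."""
    num = p ** alpha
    return num / (num + (1.0 - p) ** alpha) ** (1.0 / alpha)


# Illustrative alpha values (assumptions): 0.5 = inverse S-shaped,
# 1.0 = no weighting, 2.0 = S-shaped.
for alpha in (0.5, 1.0, 2.0):
    row = [round(prelec(p, alpha), 4) for p in (0.01, 0.5, 0.99)]
    print(f"alpha={alpha}: w(0.01), w(0.5), w(0.99) = {row}")
```

With α < 1 the small probability 0.01 is weighted up and 0.99 is weighted down, so both tails of the ranked outcomes receive extra weight; with α > 1 the pattern reverses.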

[Insert Figure 1 here]
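To make the RDUT functional concrete, here is a minimal sketch of its discrete analogue: outcome x_i, ranked from worst to best, receives the decision weight w(P(X ≥ x_i)) − w(P(X > x_i)). The paper works with an atomless probability space, so the discrete lottery, the square-root utility, and all function names below are assumptions made purely for illustration.

```python
import math


def prelec(p, alpha):
    """Prelec weighting function with the boundary values w(0)=0, w(1)=1."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


def decision_weights(probs, w):
    """Weights w(P(X >= x_i)) - w(P(X > x_i)); probs ordered worst to best."""
    weights = []
    tail = 1.0  # P(X >= current outcome)
    for i, p in enumerate(probs):
        is_last = i == len(probs) - 1
        # Force the final tail to exactly 0 so rounding error is not
        # magnified by the steep slope of w near 0.
        next_tail = 0.0 if is_last else tail - p
        weights.append(w(tail) - w(next_tail))
        tail = next_tail
    return weights


def rdu(outcomes, probs, u, w):
    """Rank-dependent utility of a finite lottery."""
    pairs = sorted(zip(outcomes, probs))  # rank outcomes from worst to best
    xs = [x for x, _ in pairs]
    ps = [p for _, p in pairs]
    return sum(u(x) * dw for x, dw in zip(xs, decision_weights(ps, w)))


# Hypothetical lottery over gross returns, with a CRRA-type utility.
outcomes, probs = [0.5, 1.0, 2.0], [0.05, 0.90, 0.05]
u = lambda x: 2.0 * math.sqrt(x)
for alpha in (0.5, 1.0, 2.0):
    w = lambda p, a=alpha: prelec(p, a)
    print(alpha, [round(d, 4) for d in decision_weights(probs, w)],
          round(rdu(outcomes, probs, u, w), 4))
```

With the identity weighting function (α = 1) the decision weights equal the probabilities and the value reduces to expected utility, which is the sense in which RDUT nests EUT.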

We make the following assumptions on the market.

ASSUMPTION 1.
(i) The probability space $(\Omega, \mathcal{F}, \mathbb{P})$ admits no atom.
(ii) $u$ is strictly increasing, strictly concave, continuously differentiable on $(0, \infty)$, and satisfies the Inada condition: $u'(0+) = \infty$, $u'(\infty) = 0$. Moreover, without loss of generality, $u(\infty) > 0$. The asymptotic elasticity of $u$ is strictly less than one: $\lim_{x \to \infty} \frac{x u'(x)}{u(x)} < 1$.
(iii) $w$ is strictly increasing and continuously differentiable on $[0, 1]$ and satisfies $w(0) = 0$, $w(1) = 1$.

2.2 Portfolio Selection

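Use $x_i$ to denote the portfolio weight in asset $i$ and let shorting be disallowed. Then, the representative investor faces a one-period portfolio selection problem:

$$(P) \qquad \begin{array}{ll} \max\limits_{\{x_i\}} & \displaystyle\int u(x)\, w'(1 - F_{\tilde R_P}(x))\, dF_{\tilde R_P}(x) \\ \text{s.t.} & \tilde R_P = \sum_{i=1}^n x_i \tilde R_i, \\ & \sum_{i=1}^n x_i = 1, \\ & x_i \ge 0, \quad i = 1, 2, \cdots, n, \end{array} \tag{1}$$

where $\tilde R_P$ is the total return of the portfolio and $\tilde R_i$ is the total return of asset $i$.

Noting $P_{0i} = \mathbb{E}[\tilde m P_{1i}]$, we have $\mathbb{E}[\tilde m \tilde R_i] = 1$. Thus, problem $(P)$ is, equivalently,

$$(P') \qquad \begin{array}{ll} \max\limits_{\tilde R_P \in \mathcal{F}} & \displaystyle\int u(x)\, w'(1 - F_{\tilde R_P}(x))\, dF_{\tilde R_P}(x) \\ \text{s.t.} & \mathbb{E}[\tilde m \tilde R_P] = 1, \\ & \mathbb{P}(\tilde R_P \ge 0) = 1. \end{array} \tag{2}$$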
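The step from the pricing relation to the budget constraint of the reformulated problem can be written out as below. This is a sketch that assumes the total return of asset $i$ is defined as $\tilde R_i = P_{1i}/P_{0i}$ with $P_{0i} > 0$, a definition the text uses implicitly but does not state.

```latex
\begin{align*}
1 &= \frac{\mathbb{E}[\tilde m P_{1i}]}{P_{0i}}
   = \mathbb{E}\!\left[\tilde m \, \frac{P_{1i}}{P_{0i}}\right]
   = \mathbb{E}[\tilde m \tilde R_i], \qquad i = 1, 2, \cdots, n, \\
\mathbb{E}[\tilde m \tilde R_P]
  &= \sum_{i=1}^{n} x_i \, \mathbb{E}[\tilde m \tilde R_i]
   = \sum_{i=1}^{n} x_i = 1 .
\end{align*}
```

The second line uses only the linearity of expectation and the constraint that the portfolio weights sum to one.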

This problem is a special case of the one solved by Xia and Zhou (2016).3 Define a function $N$ by

$$N(q) = -\int_{\bar w^{-1}(q)}^{1-} Q^-_{\tilde m}(1 - p)\, dp, \quad q \in [0, 1],$$

where $Q^-_{\tilde m}(p) := \sup\{x \in \mathbb{R} \mid F_{\tilde m}(x) < p\}$, $p \in (0, 1]$, is the lower quantile function of $\tilde m$, and $\bar w$ is the dual of $w$: $\bar w(p) = 1 - w(1 - p)$.

Applying Theorem 3.3 of Xia and Zhou (2016), we have the following result.

THEOREM 1. Assume $\tilde m$ has a continuous CDF $F_{\tilde m}$, and

$$\int_0^{1-} (u')^{-1}\big(\mu \hat N'(\bar w(p))\big)\, Q^-_{\tilde m}(1 - p)\, dp < \infty$$

for all $\mu > 0$, where $\hat N$ is the concave envelope of $N$ and $\hat N'$ the right derivative of $\hat N$. Then the total return of the optimal portfolio is

$$\tilde R_P^* = (u')^{-1}\big(\lambda^* \hat N'(1 - w(F_{\tilde m}(\tilde m)))\big),$$

where the Lagrange multiplier $\lambda^*$ is determined by $\mathbb{E}\big[\tilde m\, (u')^{-1}\big(\lambda^* \hat N'(1 - w(F_{\tilde m}(\tilde m)))\big)\big] = 1$.

3 Equilibrium Asset Pricing and Three-Moment CAPM

3.1 Equilibrium Asset Pricing Formula

Denote by

$$\tilde R_M = \sum_{i=1}^n x_i^M \tilde R_i$$

the total return of the market portfolio, where $x_i^M$ is the market capitalization weight of asset $i$ with $\sum_{i=1}^n x_i^M = 1$, and by $\tilde r_M$, $\tilde r_i$ and $r_f$ the return rates of the market portfolio, the asset $i$, and the risk-free asset, respectively. Define a function $m$ by

$$m(x) = \frac{w'(1 - F_{\tilde R_M}(x))\, u'(x)}{(1 + r_f)\, \mathbb{E}\big[w'(1 - F_{\tilde R_M}(\tilde R_M))\, u'(\tilde R_M)\big]}, \quad x > 0. \tag{3}$$

3 In Problem (2.2) of Xia and Zhou (2016), setting $I = 1$, $u_{0i}(\cdot) = 0$, $u_{1i}(\cdot) = u(\cdot)$, $w_i(\cdot) = w(\cdot)$, $\beta_i = 1$, $\mathbb{E}[\tilde\rho \tilde e_{1i}] = 1$, $e_{0i} = 0$, and $\tilde c_{1i} = \tilde R_P$, we recover our Problem $(P')$.

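THEOREM 2. When the market is in equilibrium with $\tilde R_M$ having a continuous CDF $F_{\tilde R_M}$, the pricing kernel is given by

$$\tilde m = m(\tilde R_M) \equiv \frac{w'(1 - F_{\tilde R_M}(\tilde R_M))\, u'(\tilde R_M)}{(1 + r_f)\, \mathbb{E}\big[w'(1 - F_{\tilde R_M}(\tilde R_M))\, u'(\tilde R_M)\big]}. \tag{4}$$

Moreover, if $\tilde R_M$ has a continuously differentiable density function $f_{\tilde R_M}$, then, for any risky asset $i$, we have the three-moment CAPM:

$$\mathbb{E}[\tilde r_i] = r_f + A\, \mathrm{Cov}(\tilde r_i, \tilde r_M) + \frac{1}{2} B\, \mathrm{Cov}(\tilde r_i, \tilde r_M^2) + o(\tilde r_M^2), \tag{5}$$

where the risk premiums $A$ and $B$ are given as

$$\begin{aligned} A &= -(1 + r_f)\, m'(1) \\ &= -\mathbb{E}^{-1}\big[w'(1 - F_{\tilde R_M}(\tilde R_M))\, u'(\tilde R_M)\big] \cdot \Big[ w'(1 - F_{\tilde R_M}(1))\, u''(1) - w''(1 - F_{\tilde R_M}(1))\, u'(1)\, f_{\tilde R_M}(1) \Big], \end{aligned} \tag{6}$$

$$\begin{aligned} B &= -(1 + r_f)\, m''(1) \\ &= -\mathbb{E}^{-1}\big[w'(1 - F_{\tilde R_M}(\tilde R_M))\, u'(\tilde R_M)\big] \cdot \Big[ w'(1 - F_{\tilde R_M}(1))\, u'''(1) + w'''(1 - F_{\tilde R_M}(1))\, u'(1)\, f^2_{\tilde R_M}(1) \\ &\qquad - w''(1 - F_{\tilde R_M}(1))\big[2 u''(1)\, f_{\tilde R_M}(1) + u'(1)\, f'_{\tilde R_M}(1)\big] \Big]. \end{aligned} \tag{7}$$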
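The closed forms of A and B in (6) and (7) follow from differentiating the function m defined in (3). The intermediate steps can be sketched as follows, where $F$ and $f$ are shorthand for the CDF and density of the market's total return and $c$ denotes the constant denominator of (3); this shorthand is introduced here for readability and is not the paper's notation.

```latex
\begin{align*}
m(x)   &= c^{-1}\, w'(1 - F(x))\, u'(x),
          \qquad c := (1 + r_f)\,\mathbb{E}\big[w'(1 - F(\tilde R_M))\, u'(\tilde R_M)\big], \\
m'(x)  &= c^{-1}\Big[ w'(1 - F(x))\, u''(x) - w''(1 - F(x))\, u'(x)\, f(x) \Big], \\
m''(x) &= c^{-1}\Big[ w'(1 - F(x))\, u'''(x) + w'''(1 - F(x))\, u'(x)\, f^2(x) \\
       &\qquad\quad - w''(1 - F(x))\,\big( 2 u''(x)\, f(x) + u'(x)\, f'(x) \big) \Big].
\end{align*}
```

Each differentiation of $w^{(k)}(1 - F(x))$ contributes a factor $-f(x)$ by the chain rule. Evaluating at $x = 1$ and multiplying by $-(1 + r_f)$ gives the bracketed expressions in (6) and (7).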

The equilibrium pricing kernel $\tilde m$ (given by expression (4)) is influenced not only by the utility function $u$, but also by the probability weighting function $w$ evaluated at $1 - F_{\tilde R_M}(\tilde R_M)$ and, consequently, by the market return distribution. Figure 2 demonstrates how the three work together in determining the pricing kernel. There, we consider Prelec’s probability weighting function, the CRRA utility function, and two types of market scenarios:

• Market I: $1 - F_{\tilde R_M}(1) > e^{-1}$, i.e., $1 - F_{\tilde R_M}(1)$ lies in the convex domain (concave domain) of $w(\cdot)$ when $\alpha < 1$ ($\alpha > 1$).

• Market II: $1 - F_{\tilde R_M}(1) < e^{-1}$, i.e., $1 - F_{\tilde R_M}(1)$ lies in the concave domain (convex domain) of $w(\cdot)$ when $\alpha < 1$ ($\alpha > 1$).4

4 Note that $e^{-1}$ is the solution of the equation $\exp(-(-\log(p))^\alpha) = p$ for any $\alpha$; it is also the inflection point separating the convex and concave domains of Prelec’s weighting function.

In Figure 2, both market scenarios are considered. For Market I, we take $\tilde r_M$ to be skew-normal with mean 7.6%, standard deviation 15.8% and skewness −0.339, which can be easily verified to be a case of Market I. For Market II, we take $\tilde r_M$ to be skew-normal with mean −7.6%, standard deviation 15.8% and skewness −0.339.5 Figures 2(a) and 2(c) show the plots of the pricing kernel against the market return. They show that, in both market scenarios, the RDUT agent overweighting both tails pays higher prices in the extreme states, resulting in a U-shaped pricing kernel.6 The case of underweighting both tails has the opposite characteristics, with a bell-shaped pricing kernel.

5 Later we will show empirically that the US markets have historically been in Market I.

6 U-shaped pricing kernels have been presented in the literature under various settings. For example, Shefrin (2008) finds that the pricing kernel has a U-shape when agents have heterogeneous risk tolerance. Baele, Driessen, Ebert, Londono, and Spalt (2019) observe that pricing kernels under the cumulative prospect theory (CPT) are also U-shaped.

Figures 2(b) and 2(d) depict the implied relative risk aversion, $-m'(\tilde R_M) \cdot \tilde R_M / m(\tilde R_M)$, at $\tilde R_M = 1$, which can be interpreted as the investor’s risk attitude at the initial wealth level, as a function of $\alpha$. The classical CRRA case (the black dotted line) has a flat relative risk aversion, as expected. The red and blue solid lines show how probability weighting can change risk attitudes by influencing investors’ perceptions of tail risk. In Market I, the implied relative risk aversion is a decreasing function of $\alpha$; so a higher level of overweighting the tails leads to a larger relative risk aversion and hence demands a higher risk compensation. In particular, the risk aversion is negative when one sufficiently underweights, yielding a risk-seeking behavior. In contrast, the implied relative risk aversion is increasing in $\alpha$ under Market II, which implies that more overweighting (or less underweighting) of the tails entails less risk aversion.

It may seem puzzling why overweighting tails strengthens risk aversion in Market I, but weakens it in Market II. The reason lies in the interaction between probability weighting and market return distribution. Under Market I, in which the probability of a positive market return is sufficiently high, the chance of observing or experiencing an extreme negative realization is smaller than that of a positive realization of the same size. Thus, the impact of an extremely bad event, once it occurs, is stronger than that of extremely good states, making the agent pay more attention to the left tail. As a result, the additional risk aversion arising from overweighting the left tail dominates the effect of overweighting the right tail, strengthening the overall risk aversion of the agent. In contrast, Market II features a smaller probability of a positive market return; hence a symmetric reasoning yields that risk aversion is overall weakened.7

[Insert Figure 2 here]
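The U-shaped and bell-shaped pricing kernels described for Figure 2 can be reproduced qualitatively with a short sketch. It evaluates the unnormalized kernel w'(1 − F(x))·u'(x) on a grid of gross market returns. Several simplifications are assumptions of this sketch rather than the paper's setup: the market return is taken to be normal (the paper uses a skew-normal), γ = 2 is an arbitrary CRRA coefficient, and the positive normalizing constant in (4) is dropped because it does not affect the shape.

```python
import math
from statistics import NormalDist


def prelec_derivative(p, alpha):
    """w'(p) for Prelec's w(p) = exp(-(-log p)^alpha), with 0 < p < 1."""
    L = -math.log(p)
    return alpha * L ** (alpha - 1.0) * math.exp(-(L ** alpha)) / p


def unnormalized_kernel(x, alpha, gamma, market):
    """w'(1 - F(x)) * u'(x) with CRRA marginal utility u'(x) = x^(-gamma)."""
    p = 1.0 - market.cdf(x)  # decumulative probability at gross return x
    return prelec_derivative(p, alpha) * x ** (-gamma)


# Assumption: gross market return ~ Normal(1.076, 0.158), mimicking Market I.
market = NormalDist(mu=1.076, sigma=0.158)
gamma = 2.0  # illustrative CRRA coefficient
grid = (0.6, 0.8, 1.0, 1.2, 1.4, 1.6)

for alpha in (0.5, 1.0, 2.0):
    values = [unnormalized_kernel(x, alpha, gamma, market) for x in grid]
    print(f"alpha={alpha}:", [round(v, 4) for v in values])
```

On this grid the kernel for α = 0.5 is lowest at an interior point and rises toward both extremes, the kernel for α = 2 peaks in the interior, and α = 1 recovers the monotonically decreasing CRRA kernel.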

3.2 Risk Premiums

In the three-moment CAPM, the terms $A$ and $B$ are the risk premiums of the covariance and the coskewness, respectively. The signs of the two parameters determine the risk preference with respect to covariance and coskewness. As $A$ and $B$ depend on the utility function, the weighting function, and the distribution of the market return in a complex way, it is difficult to analytically study their signs in general. However, if we choose a specific weighting function (in this case, Prelec’s), then an analytical examination of the signs becomes possible.

As noted earlier, the market return distribution also impacts pricing and risk premiums. We have already introduced two scenarios, Market I and Market II, depending on whether $1 - F_{\tilde R_M}(1)$ lies in the convex or concave domain of $w$. Notice that $1 - F_{\tilde R_M}(1) = \mathbb{P}(\tilde R_M > 1)$, i.e., the probability that the market has a positive return rate. So this quantity reflects the trend of the overall economy/market. Let us first examine which type of market is more plausible for the S&P 500. Following De Giorgi and Legg (2012), we assume that the S&P 500 annual return rate, $\tilde r_M$, follows either a normal distribution, a skew-normal distribution, or a log-normal distribution. Based on the annual historical data of the S&P 500 from January 1946 to January 2009, the annual return rate, $\tilde r_M$, has a mean of 7.6% and a standard deviation of 15.8%. Moreover, in the case of skew-normal, its skewness is estimated to be −0.339. Based on these values, assuming $\tilde r_M$ is respectively normal, skew-normal, and log-normal, we estimate the corresponding value of $1 - F_{\tilde R_M}(1)$ to be 0.6847, 0.6980, and 0.6543, respectively, all greater than $e^{-1}$. Next, we consider the S&P 500 monthly return. Following Polkovnichenko and Zhao (2013), we use the daily historical data of the S&P 500 from January 4, 1996 to April 29, 2016 and estimate an EGARCH model for the S&P 500 using the 1,250 most recent daily index returns up to the end of each month. Then, we simulate 100,000 samples based on the EGARCH model and compute the nonparametric probability density function for the monthly return of the S&P 500. We find that $1 - F_{\tilde R_M}(1) > e^{-1}$ holds for all the months tested. Therefore, we conclude that Market I is a more likely scenario for the S&P 500, and hence we focus on this case in the following discussion.

7 This also explains why, with overweighting of tail events, the left tail of the U-shaped pricing kernel is more prominent in Market I and the right tail of the U-shaped pricing kernel is more prominent in Market II; see Figures 2(a) and 2(c).
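The Market I check for the normal case is a one-line calculation and can be sketched as follows, using the annual mean and standard deviation quoted above. The skew-normal and log-normal cases are not reproduced here because the text does not spell out their parameterization.

```python
import math
from statistics import NormalDist

mean, stdev = 0.076, 0.158  # annual S&P 500 moments quoted in the text

# Normal case: P(r_M > 0) for the return rate, i.e. 1 - F(1) for the
# total (gross) return of the market.
p_positive = 1.0 - NormalDist(mean, stdev).cdf(0.0)
threshold = math.exp(-1.0)  # inflection point of Prelec's weighting function

print(f"P(r_M > 0) = {p_positive:.4f}, e^-1 = {threshold:.4f}")
print("Market I" if p_positive > threshold else "Market II")
```

The computed probability matches the 0.6847 reported for the normal case and lies well above e^{-1} ≈ 0.368, so the normal specification falls in Market I.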

Let us now examine the sign of $A$. From its expression on the right-hand side of (6), the first term (including the factor in front of the brackets) is positive, representing the part of the risk aversion arising from the outcome utility function $u$. The second term, representing the part of the risk aversion arising from the probability weighting, is more complicated: its sign depends on whether the quantity $1 - F_{\tilde R_M}(1)$ lies in the convex or concave domain of the weighting function $w$.

PROPOSITION 1. Assume that the market is in equilibrium with $\tilde R_M$ having a density function $f_{\tilde R_M}$, $w(p) = \exp(-(-\log(p))^\alpha)$ and $1 - F_{\tilde R_M}(1) > e^{-1}$ (Market I). Then,

(i) $A > 0$ if $0 < \alpha < 1$ (overweighting case).

(ii) $A < 0$ if $\alpha > \alpha^*$ for some $\alpha^* > 1$ (underweighting case).

To interpret Proposition 1, consider first the case $0 < \alpha < 1$, when $w$ is inverse S-shaped and the agent overweights both extremely good and extremely bad states. Overweighting the left tail makes the RDUT agent more risk averse toward bad states than her EUT counterpart, and overweighting the right tail makes her less risk averse toward good states. While these two effects seem to offset each other, as explained intuitively earlier, the condition $1 - F_{\tilde R_M}(1) > e^{-1}$, which is consistent with the empirical observation for the S&P 500, dictates that only the former effect is relevant.8 This results in a positive risk premium with respect to the covariance with the market; see the solid red line in Figure 3.

There is more to it. Write

$$A = \frac{w'(1 - F_{\tilde R_M}(1))\, u'(1)}{\mathbb{E}\big[w'(1 - F_{\tilde R_M}(\tilde R_M))\, u'(\tilde R_M)\big]} \left[ -\frac{u''(1)}{u'(1)} + \frac{w''(1 - F_{\tilde R_M}(1))}{w'(1 - F_{\tilde R_M}(1))}\, f_{\tilde R_M}(1) \right]. \tag{8}$$

The first term inside the brackets, $-\frac{u''(1)}{u'(1)} > 0$, measures the risk aversion arising from the outcome utility function $u$ in the classical EUT framework. The second term, $\frac{w''(1 - F_{\tilde R_M}(1))}{w'(1 - F_{\tilde R_M}(1))}\, f_{\tilde R_M}(1)$, which arises from the probability weighting independent of the outcome utility function, is also positive under Market I and when $0 < \alpha < 1$. This second term elevates the level of risk aversion. This may shed light on the equity premium puzzle (Mehra and Prescott, 1985): the empirical data of the S&P 500 may imply an implausibly high level of risk aversion within the classical EUT framework because the latter does not account for the part of the risk aversion emanating from probability weighting.

When $\alpha > 1$, $w$ is S-shaped and the agent underweights the two tails of the market return distribution. This entails less risk aversion toward bad states and less risk seeking toward good states. However, the condition $1 - F_{\tilde R_M}(1) > e^{-1}$ implies that $1 - F_{\tilde R_M}(1)$ is now in the concave domain of $w$ and hence only the former behavior is relevant. In this case, the agent demands less compensation for the beta risk compared with a classical utility maximizer, and, indeed, even becomes willing to pay for the risk (i.e., $A < 0$) when she sufficiently underweights bad states (i.e., when $\alpha$ is sufficiently large). This case is illustrated by the dashed blue line in Figure 3.9

[Insert Figure 3 here]
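The sign pattern in Proposition 1 can be checked numerically through the bracketed term of (8), since the factor in front of the brackets is positive. The sketch below uses the closed form w''(p)/w'(p) = [αL^(α−1) − (α−1)/L − 1]/p with L = −log p, obtained by differentiating log w'(p) for Prelec's function. The normal market return, the CRRA coefficient γ = 2, the bisection bounds, and the function names are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def prelec_curvature(p, alpha):
    """w''(p) / w'(p) for Prelec's w(p) = exp(-(-log p)^alpha), 0 < p < 1."""
    L = -math.log(p)
    return (alpha * L ** (alpha - 1.0) - (alpha - 1.0) / L - 1.0) / p


def bracket_term(alpha, gamma, market):
    """Bracket in (8): -u''(1)/u'(1) + [w''(p0)/w'(p0)] f(1), CRRA utility."""
    p0 = 1.0 - market.cdf(1.0)  # probability of a positive market return
    f1 = market.pdf(1.0)        # density of the gross return at 1
    return gamma + prelec_curvature(p0, alpha) * f1


def threshold_alpha(gamma, market, lo=1.0, hi=10.0, tol=1e-10):
    """Bisection for the alpha* > 1 at which the bracket (and A) changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bracket_term(mid, gamma, market) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumptions: gross market return ~ Normal(1.076, 0.158); gamma = 2.
market = NormalDist(mu=1.076, sigma=0.158)
gamma = 2.0
for alpha in (0.5, 1.0, 1.5, 2.0):
    print(alpha, round(bracket_term(alpha, gamma, market), 4))
print("alpha* ~", round(threshold_alpha(gamma, market), 4))
```

For α < 1 the weighting term adds to γ, at α = 1 the bracket equals γ exactly, and for sufficiently large α the bracket turns negative, which corresponds to A < 0. The threshold α* found here depends on the assumed γ and return distribution.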

Next we investigate the sign of B. First, when there is no probability weighting, we have

$$
B = -E^{-1}\big[u'(\tilde R_M)\big]\,u'''(1) < 0,
$$

assuming that $u''' > 0$, which is satisfied by most commonly used utility functions.10 This implies that a typical EUT agent is willing to pay for the coskewness; see also Harvey and Siddique (2000). In the presence of probability weighting, the first term of B (see (7)) is negative if, again, $u''' > 0$. The second and third terms are more complicated, in that their signs depend on the utility function, the weighting function, and the market return distribution, all intertwined. However, as in the case of A, whether $1 - F_{\tilde R_M}(1)$ lies in the convex or concave domain of w is also critical for the sign of B. We provide several sufficient conditions for B to be positive or negative.

8 This observation is straightforward from the mathematical expression of A, (6), which depends on w only through its value at $1 - F_{\tilde R_M}(1)$, a point in the convex domain of w.

9 Figure 3 is completely consistent with Figure 2(b).

10 Brockett and Golden (1987) refer to the class of increasing utility functions with derivatives that alternate in sign as the class that contains “all commonly used utility functions.” In particular, $u''' > 0$ is implied by nonincreasing absolute risk aversion, which is one of the essential properties of a risk-averse individual; see Harvey and Siddique (2000).

PROPOSITION 2. Assume that the market is in equilibrium with $\tilde R_M$ having a continuously differentiable density function $f_{\tilde R_M}$, and that $u(x) = \frac{1}{1-\gamma} x^{1-\gamma}$ where γ > 0, $w(p) = \exp(-(-\log(p))^{\alpha})$ where α > 0, and $1 - F_{\tilde R_M}(1) > e^{-1}$. Then,

(i) B < 0 if 0 < α < 1 and $\gamma \ge \gamma^* := \frac{f'_{\tilde R_M}(1)}{2 f_{\tilde R_M}(1)}$.

(ii) B < 0 if α > 1 and γ > γ(α), where γ(α) is a threshold depending on α.

(iii) B > 0 if α > 1 and γ∗ ≤ γ < γ(α), where γ(α) is a threshold depending on α.

We have noted earlier that in the absence of probability weighting, B < 0 for “most commonly used utility functions,” suggesting an inherent skewness-inclined preference in the classical EUT framework. Now, when the agent overweights both tails of the market return (the case of 0 < α < 1), the proof of Proposition 2 (Appendix C) shows that $w'''(1 - F_{\tilde R_M}(1)) > 0$. Thus probability weighting is likely to enhance the skewness-inclined preference when overweighting occurs. This enhancement can be explained intuitively as follows. When the asset has positive coskewness with the market, the asset return has a fat right tail with respect to the market portfolio. The hope of a larger gain relative to the market, which is strengthened by overweighting, further increases the attraction of the risky asset and requires even less compensation when compared to the case of no probability weighting. Symmetrically, when the coskewness is negative, the fear of a larger loss, which is also amplified by the overweighting, reduces the attraction of the risky asset and hence demands greater compensation. Combined, overweighting the tails inflates the negative relationship between expected return and coskewness, generating a significantly negative B. This enhancement is confirmed by comparing the empirical results across Panels A and B in Table 1.11

When the agent underweights both tails of the market return (i.e., α > 1), $w'''(1 - F_{\tilde R_M}(1))$ becomes negative, which implies that the probability weighting introduces a skewness-averse preference. Underweighting the tails deflates the negative relationship between expected return and coskewness. Whether B is eventually positive or negative depends on which one dominates: the skewness-inclination inherent in u or the skewness-aversion in w. Proposition 2-(ii) and -(iii) provide sufficient conditions under which one of these is the case. These results are illustrated by the dashed blue line in Figure 4 and by the values of B in Panel C of Table 1.12

[Insert Table 1 here]

[Insert Figure 4 here]

4 Empirical Analysis

Our empirical analysis tests our preceding theory regarding how probability weighting impacts the risk premiums on covariance and coskewness. To this end, we first identify the characteristics and level of probability weighting from option data. Then, we provide one-way sort and two-way sort portfolio analyses to study the effect of probability weighting on the mean-covariance and mean-coskewness relationships, respectively. Finally, we run a Fama-MacBeth regression to investigate the impact of probability weighting on cross-sectional asset pricing.

4.1 Option-implied Probability Weighting Functions and Indices

Option data have been used to estimate empirical pricing kernels (see, e.g., Rosenberg and Engle, 2002). As the pricing kernel under RDUT is the product of the marginal utility and the derivative of probability weighting (Theorem 3.1), upon specifying a suitable utility function we can obtain implied probability weighting functions from option price data.

11 The additional condition γ ≥ γ∗ in Proposition 2-(i) is purely technical, as it is used to control the product term $u''w''$ in the proof (Appendix C). Indeed, from that proof it is clear that this condition is sufficient, but by no means necessary. In Panel A of Table 1, all the B values are negative even if the condition is violated. Moreover, a numerical plot when α = 0.5 shows that B < 0 for any γ > 0; see Figure 4(a).

12 Again, the condition γ ≥ γ∗ in Proposition 2-(iii) is only a technical condition used in the proof. In Panel C of Table 1 we observe positive B when γ < γ∗.

We use S&P 500 index option data and S&P 500 index data to derive the option-implied probability weighting functions and to construct the corresponding probability weighting indices for any given period of time (monthly in this paper). These indices are then used to gauge the overall probability weighting level in the market for the period concerned. The option data and index data are obtained from the OptionMetrics daily files, with a time window from January 4, 1996 to April 29, 2016.13

More specifically, following the procedure of Polkovnichenko and Zhao (2013), with some variations, we derive implied probability weighting functions in three steps. First, at the end of each month during the period January 4, 1996 to April 29, 2016, we use both the S&P 500 index and index option data to construct the implied volatility curves for the options with the nearest maturity and the second nearest maturity. We then compute the implied volatility curve for the options with one-month maturity by spline interpolation. Next, we compute the one-month maturity option prices based on the implied volatility curve and extract the nonparametric risk-neutral probability density function for the one-month return of the S&P 500 index, $f^Q_{\tilde R_M}$, by taking the second-order derivative of the one-month maturity option prices with respect to the strike prices (see Bliss and Panigirtzoglou, 2002 and Kostakis, Panigirtzoglou, and Skiadopoulos, 2011 for details).14 Second, at the end of each month, we estimate an EGARCH model for the S&P 500 index using 1250 daily S&P 500 index returns up to the end of the month. We then simulate 100,000 samples based on the EGARCH model and compute the nonparametric real-world probability density function for the one-month return of the S&P 500 index, $f_{\tilde R_M}$.15 Third, we estimate the probability weighting function according to the following equation:

$$
\begin{aligned}
w(1-p) &= \int_p^1 w'(1-y)\,dy = \int_{F^{-1}_{\tilde R_M}(p)}^{+\infty} w'(1 - F_{\tilde R_M}(x))\, f_{\tilde R_M}(x)\,dx \\
&= \int_{F^{-1}_{\tilde R_M}(p)}^{+\infty} \frac{m(x)}{(1+r_f)\,E\big[w'(1 - F_{\tilde R_M}(\tilde R_M))\,u'(\tilde R_M)\big]\,u'(x)}\, f_{\tilde R_M}(x)\,dx \\
&= \left( \int_0^{+\infty} \frac{f^Q_{\tilde R_M}(x)}{u'(x)}\,dx \right)^{-1} \int_{F^{-1}_{\tilde R_M}(p)}^{+\infty} \frac{f^Q_{\tilde R_M}(x)}{u'(x)}\,dx,
\end{aligned} \tag{9}
$$

where the function m is defined in (3). Based on this we derive w numerically by specifying the utility function to be CRRA with γ = 2.16 The solid red lines in Figure 5 show two typical types of implied probability weighting functions derived in our study, an inverse S-shaped one (overweighting both tails) and an S-shaped one (underweighting both tails). In about 89.3% of the 244 months tested in our study, the implied weighting function is either S-shaped or inverse S-shaped, an observation consistent with the well-documented result that individuals tend to underweight or overweight the two tails simultaneously.

13 OptionMetrics is a provider of historical options data for empirical and econometric research. We access OptionMetrics via Wharton Research Data Services (WRDS).

14 Different from Polkovnichenko and Zhao (2013), who directly select monthly option quotes closest to 28, 45, and 56 days from expiration and compute the risk-neutral probability density function for the 28, 45, and 56-day returns, we focus on the last trading day of each month. As there are no traded options with one-month maturity at the end of each month in our experiments, we need to numerically compute the prices of one-month maturity options based on the options with other maturities. Therefore, we apply the nonparametric method in Bliss and Panigirtzoglou (2002) and Kostakis et al. (2011) to derive the risk-neutral probability density function, which is more suitable for our analysis.

15 We use the empirical S&P 500 index returns to estimate the physical probability density function. Investors’ actual beliefs about the return distribution can differ from the empirical one. However, the probability weighting function is already the result of a combination of (largely erroneous) beliefs and (biased) preferences; see a detailed discussion in Section 5.1.
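The last line of (9) can be sketched on a grid of gross returns. The code below is an illustration rather than the authors' implementation: it takes a risk-neutral density and physical survival probabilities as given (the spline, option-price, and EGARCH steps are not reproduced), uses a trapezoidal rule, and assumes CRRA marginal utility u'(x) = x^(−γ). The self-check builds a synthetic risk-neutral density from a known Prelec weighting function and a lognormal physical distribution, both hypothetical choices, and confirms that the formula recovers the weighting function.

```python
import math

def prelec_w(p, alpha):
    return math.exp(-((-math.log(p)) ** alpha))

def prelec_w1(p, alpha):
    L = -math.log(p)
    return prelec_w(p, alpha) * alpha * L ** (alpha - 1.0) / p

def lognormal_pdf(x, mu, sigma):
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def lognormal_sf(x, mu, sigma):
    """Survival function 1 - F(x); erfc keeps accuracy in the tails."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def implied_weighting(xs, f_q, survival, gamma=2.0):
    """Last line of (9) on an increasing grid xs of gross returns.

    f_q      : risk-neutral density values on xs (any positive scaling)
    survival : physical survival probabilities 1 - F(x) on xs
    Returns pairs (q, w(q)) with q = 1 - F(x_k).
    """
    # Integrand f^Q(x) / u'(x), with CRRA marginal utility u'(x) = x^(-gamma).
    g = [fq * x ** gamma for x, fq in zip(xs, f_q)]
    # Trapezoidal tail integrals from x_k to the right end of the grid.
    tail = [0.0] * len(xs)
    for k in range(len(xs) - 2, -1, -1):
        tail[k] = tail[k + 1] + 0.5 * (g[k] + g[k + 1]) * (xs[k + 1] - xs[k])
    total = tail[0]
    return [(q, t / total) for q, t in zip(survival, tail)]

def synthetic_check(alpha=0.7, gamma=2.0, mu=0.01, sigma=0.05, n=4000):
    """Max recovery error when f^Q is generated from a known Prelec w."""
    lo, hi = math.exp(mu - 6.0 * sigma), math.exp(mu + 6.0 * sigma)
    xs = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    surv = [lognormal_sf(x, mu, sigma) for x in xs]
    # Unnormalized f^Q proportional to w'(1 - F(x)) * u'(x) * f(x).
    f_q = [prelec_w1(s, alpha) * x ** (-gamma) * lognormal_pdf(x, mu, sigma)
           for x, s in zip(xs, surv)]
    curve = implied_weighting(xs, f_q, surv, gamma)
    return max(abs(w - prelec_w(q, alpha)) for q, w in curve if 0.01 < q < 0.99)

print(synthetic_check())
```

Because the formula normalizes by the full integral, the scaling of the risk-neutral density does not matter. The small residual error in the check comes from truncating the grid at six standard deviations.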

[Insert Figure 5 here]

Probability weighting is captured by the derivative of the weighting function w, mainly around the two ends of its domain, p = 1 and p = 0. We propose two indices, those of “implied fear” and “implied hope”, to capture the representative agent’s biases toward extremely bad and extremely good events, respectively:

$$
\text{Implied fear index (IFI)} = \sum_{i=95}^{99} 0.2\,w'(0.01i),
$$

$$
\text{Implied hope index (IHI)} = \sum_{i=1}^{5} 0.2\,w'(0.01i),
$$

which are the average values of the derivatives of the probability weighting function around 1 and 0, respectively. When IFI > 1 (IHI > 1), the agent overweights extreme bad (good) events. Figure 6(a) shows the monthly time series of these indices. Evidently, the two indices have very similar dynamics and move together most of the time (their correlation coefficient is 0.8). This reiterates the aforementioned simultaneous underweighting/overweighting of the two tails.

16 The derived probability weighting function depends on the choice of the utility function. However, we have tested many alternative settings, such as CRRA with γ = 1, 3, 4 and exponential utility $u(x) = -e^{-ax}$ with a = 1, 2, 3, 4, and found that our empirical results still hold, including the sort portfolio analysis and Fama-MacBeth regression.

Given the highly correlated dynamics of IFI and IHI, we use their average as an option-implied probability weighting index to reflect the level of probability weighting:

$$
\mathrm{PWI}_1 = \frac{\mathrm{IFI} + \mathrm{IHI}}{2}. \tag{10}
$$

Figure 6(b) depicts the movement of PWI1 during the testing period.
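The two indices and their average (10) are simple to compute once w' is available. The sketch below uses the closed-form derivative of Prelec's weighting function as a stand-in for the nonparametric, option-implied w' (an assumption made purely for illustration); any callable returning w'(p) could be passed instead.

```python
import math

def prelec_w1(p, alpha):
    """Derivative of Prelec's w(p) = exp(-(-log p)^alpha); a stand-in for the implied w'."""
    L = -math.log(p)
    return math.exp(-(L ** alpha)) * alpha * L ** (alpha - 1.0) / p

def implied_fear_index(w1):
    """IFI: average of w' over p = 0.95, 0.96, ..., 0.99."""
    return sum(0.2 * w1(0.01 * i) for i in range(95, 100))

def implied_hope_index(w1):
    """IHI: average of w' over p = 0.01, 0.02, ..., 0.05."""
    return sum(0.2 * w1(0.01 * i) for i in range(1, 6))

def pwi1(w1):
    """Equation (10): the average of IFI and IHI."""
    return 0.5 * (implied_fear_index(w1) + implied_hope_index(w1))

for alpha in (0.7, 1.0, 1.5):
    w1 = lambda p, a=alpha: prelec_w1(p, a)
    print(alpha, implied_fear_index(w1), implied_hope_index(w1), pwi1(w1))
```

For this stand-in, α = 0.7 gives both indices above 1 (overweighting of both tails), α = 1.5 gives both below 1, and α = 1 gives exactly the no-weighting benchmark of 1.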

To cross-examine how robustly PWI1 captures the degree of probability weighting in the market over time, we introduce a second probability weighting index by fitting the parametric specification of Prelec’s weighting function. Specifically, we calibrate Prelec’s parameter α by minimizing the Euclidean distance between Prelec’s weighting function and the obtained nonparametric weighting function:

$$
\hat\alpha = \arg\min_{\alpha} \sum_{i=1}^{99} \left| w(0.01i) - \exp\!\big(-(-\log(0.01i))^{\alpha}\big) \right|^2,
$$

which we call the implied α. Note that for Prelec’s weighting function, 0 < α < 1 corresponds to inverse S-shaped (overweighting) and α > 1 to S-shaped (underweighting), with α = 1 dividing the two. The smaller (greater) α, the greater the degree of overweighting (underweighting). This suggests that the following index, PWI2, also reflects the level of probability weighting:

$$
\mathrm{PWI}_2 = \frac{1}{\hat\alpha}. \tag{11}
$$
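A minimal sketch of the calibration behind (11) follows. The text does not state which optimizer is used, so the two-stage grid search, its bounds, and its step sizes are illustrative assumptions; the input is the list of nonparametric values w(0.01), w(0.02), ..., w(0.99).

```python
import math

def prelec_w(p, alpha):
    return math.exp(-((-math.log(p)) ** alpha))

def sse(w_values, alpha):
    """Sum of squared gaps between w(0.01 i) and Prelec's function, i = 1..99."""
    return sum((w_values[i - 1] - prelec_w(0.01 * i, alpha)) ** 2
               for i in range(1, 100))

def implied_alpha(w_values, lo=0.05, hi=5.0, coarse=0.05, fine=0.001):
    """Two-stage grid search for the implied alpha (bounds and steps are illustrative)."""
    n_coarse = int(round((hi - lo) / coarse))
    grid = [lo + coarse * k for k in range(n_coarse + 1)]
    best = min(grid, key=lambda a: sse(w_values, a))
    # Refine on a finer grid around the best coarse point.
    start = max(lo, best - coarse)
    n_fine = int(round(2 * coarse / fine))
    local = [start + fine * k for k in range(n_fine + 1)]
    return min(local, key=lambda a: sse(w_values, a))

def pwi2(w_values):
    """Equation (11): the reciprocal of the implied alpha."""
    return 1.0 / implied_alpha(w_values)

# Synthetic input: a Prelec curve with a known parameter, to check the recovery.
w_values = [prelec_w(0.01 * i, 0.637) for i in range(1, 100)]
print(implied_alpha(w_values), pwi2(w_values))
```

Taking the reciprocal makes PWI2 move in the same direction as PWI1: values above 1 indicate overweighting and values below 1 indicate underweighting.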

The percentages of time when these indices indicate the occurrence of overweighting are remarkably close to each other: investors overweight extreme events 56%-59% of the time and underweight extreme events 41%-44% of the time. Moreover, Figure 6(b) and (c) show that PWI1 and PWI2 have very similar dynamics. Indeed, they have a correlation coefficient of 0.949. Thus, in the following analysis, we focus on PWI1, and we use it to separate the whole sample period (from January 1996 to April 2016) into two categories: overweighting periods and underweighting periods. Specifically, we calculate the median level of PWI1 over the entire period. When the PWI1 value at the end of month t is above its median level, the next month t + 1 is labelled as an overweighting period; otherwise, t + 1 is an underweighting period.17
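The labelling rule can be sketched as follows. The month labels and PWI1 values are made-up inputs; as in the text, the median is taken over the entire period and the label is assigned to the following month.

```python
from statistics import median

def label_periods(pwi1_by_month):
    """Label month t+1 from the PWI1 value at the end of month t.

    pwi1_by_month: list of (month_label, pwi1_value) pairs in time order.
    Returns a dict: next-month label -> "overweighting" or "underweighting".
    """
    cutoff = median(value for _, value in pwi1_by_month)
    labels = {}
    for (_, value), (next_month, _) in zip(pwi1_by_month, pwi1_by_month[1:]):
        labels[next_month] = "overweighting" if value > cutoff else "underweighting"
    return labels

# Hypothetical index values, for illustration only.
series = [("1996-01", 1.2), ("1996-02", 0.8), ("1996-03", 1.0),
          ("1996-04", 1.5), ("1996-05", 0.9)]
print(label_periods(series))
```

The first month in the sample receives no label because there is no preceding index value, and a value exactly at the median is treated as underweighting, following the "otherwise" clause above.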

[Insert Figure 6 here]

4.2 One-way Sort Portfolio Analysis

We now conduct a one-way sort portfolio analysis to test the mean-covariance and mean-coskewness relationships during different probability weighting periods. In our study, we use all the common stocks listed on the NYSE, AMEX, and NASDAQ to perform cross-sectional analysis. Data on prices, returns, and shares outstanding are obtained from the Center for Research in Security Prices (CRSP) monthly files, which range from January 1991 to April 2016.18

At the end of each month t from January 1991 to April 2016, we use the immediate prior 60-month sample covariance and sample coskewness to estimate the pre-ranking covariance and coskewness in month t,

$$
\begin{aligned}
\mathrm{covar}_{i,t} &= \frac{1}{59}\sum_{j=0}^{59} (\tilde r_{i,t-j} - \bar r_i)(\tilde r_{M,t-j} - \bar r_M), \\
\mathrm{coskew}_{i,t} &= \frac{1}{59}\sum_{j=0}^{59} (\tilde r_{i,t-j} - \bar r_i)\big(\tilde r^2_{M,t-j} - \overline{r^2_M}\big),
\end{aligned}
$$

where $\tilde r_{i,t-j}$ is the monthly return rate of stock i in month t − j, $\tilde r_{M,t-j}$ is the monthly return rate of the value-weighted market portfolio in month t − j, and $\bar r_i$, $\bar r_M$, and $\overline{r^2_M}$ are the sample means of $\tilde r_{i,t}$, $\tilde r_{M,t}$, and $\tilde r^2_{M,t}$, respectively.

At each t, we classify stocks into 10 covariance-sorted groups based on their pre-ranking covariance. Group 1 (10) contains stocks with the lowest (highest) covariance. For each group, we build an equal-weighted portfolio. Then, we construct a spread portfolio (10-1) that longs Portfolio 10 and shorts Portfolio 1. We then calculate the next month (t + 1)’s post-ranking portfolio returns for these portfolios and repeat the procedure for the next month. Similarly, we construct the coskewness-sorted portfolios.

17 Given that the percentages of overweighting periods and underweighting periods are 56%-59% and 41%-44% respectively, it is reasonable to use the median as the separating point. We have also used the value 1 to separate these periods, and we found the main results to be the same.

18 CRSP is a provider of historical stock market data for use in empirical research and econometric studies. We access CRSP via Wharton Research Data Services (WRDS) at http://www.whartonwrds.com.
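The pre-ranking measures and the 10-1 spread can be sketched in a few lines. This is an illustration, not the authors' code: with 60 monthly observations, dividing by n − 1 reproduces the 1/59 factor in the formulas, and the grouping is deliberately naive (ties and any remainder after dividing the stocks into groups are not handled specially). The function and argument names are hypothetical.

```python
def sample_covar(r_i, r_m):
    """Sample covariance of stock and market returns, normalized by n - 1."""
    n = len(r_i)
    mean_i = sum(r_i) / n
    mean_m = sum(r_m) / n
    return sum((a - mean_i) * (b - mean_m) for a, b in zip(r_i, r_m)) / (n - 1)

def sample_coskew(r_i, r_m):
    """Sample covariance of the stock return with the squared market return."""
    n = len(r_i)
    mean_i = sum(r_i) / n
    mean_m2 = sum(b * b for b in r_m) / n
    return sum((a - mean_i) * (b * b - mean_m2) for a, b in zip(r_i, r_m)) / (n - 1)

def spread_return(signal, next_ret, n_groups=10):
    """Equal-weighted next-month return of the top group minus the bottom group.

    signal, next_ret: dicts keyed by stock id (pre-ranking measure, month t+1 return).
    """
    ranked = sorted(signal, key=signal.get)
    size = len(ranked) // n_groups
    bottom, top = ranked[:size], ranked[-size:]
    average = lambda ids: sum(next_ret[s] for s in ids) / len(ids)
    return average(top) - average(bottom)

# Toy example with 20 hypothetical stocks whose next-month return rises with the signal.
signal = {k: float(k) for k in range(20)}
next_ret = {k: 0.01 * k for k in range(20)}
print(spread_return(signal, next_ret))
```

Passing the covariance estimates as `signal` gives the covariance-sorted spread, and passing the coskewness estimates gives the coskewness-sorted one.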

Table 2 reports the average next month’s returns of covariance-sorted portfolios and coskewness-sorted portfolios for the entire period, overweighting periods, and underweighting periods, respectively, where the overweighting periods and underweighting periods are determined by the index PWI1, as described earlier. We observe that the spread covariance-sorted portfolio (10-1) has a significantly positive monthly return during the overweighting periods and a significantly negative monthly return during the underweighting periods. Meanwhile, the spread coskewness-sorted portfolio (10-1) has a significantly negative monthly return during the overweighting periods and an insignificantly negative monthly return during the underweighting periods. Moreover, the negative return of this spread portfolio nearly doubles its size from an average of -1.01% monthly over the entire period to an average of -1.81% monthly during the overweighting periods.

[Insert Table 2 here]

Figures 7(a) and (b) plot the regression lines of the average next month’s returns on the average covariance and on the average coskewness, respectively. The return–covariance line has a steep upward (downward) slope during the overweighting (underweighting) periods. The return–coskewness line has a steep downward slope during the overweighting periods and a nearly flat slope during the underweighting periods. These findings are consistent with our theoretical predictions in Section 3.2. Finally, for the entire sample period, the return–covariance line is almost flat, reconciling with the beta anomaly, and the return–coskewness line is downward-sloping, conforming to the skewness-inclined preference that is well documented in the literature.

[Insert Figure 7 here]

4.3 Two-way Sort Portfolio Analysis

Next, we conduct a two-way sort portfolio analysis to separate the effects of covariance and coskewness. For each month t from January 1996 to April 2016, we classify all the stocks into five coskewness-sorted groups based on their pre-ranking coskewness. Then, in each coskewness-sorted group, we further group the stocks into five covariance-sorted groups based on their pre-ranking covariance. Thus, there are now a total of 25 groups. For each group, we construct the equal-weighted portfolio, calculate next month (t + 1)’s post-ranking portfolio returns, and repeat the procedure from the next month onwards. As the stocks have similar values of coskewness within the same coskewness-sorted group, we can thereby control the influence of the coskewness and compare the performance of the five covariance-sorted portfolios. Similarly, we also analyze the 25 covariance-coskewness-sorted portfolios based first on their pre-ranking covariance and then on their pre-ranking coskewness.
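The dependent (sequential) double sort described above can be sketched as follows. The helper names are hypothetical and the breakpoints are simple equal-count cuts, which is an assumption; the text does not specify how ties or uneven group sizes are treated.

```python
def split_into_groups(ids, key, n_groups):
    """Sort ids by key and cut them into n_groups consecutive, nearly equal groups."""
    ranked = sorted(ids, key=key)
    n = len(ranked)
    bounds = [round(g * n / n_groups) for g in range(n_groups + 1)]
    return [ranked[bounds[g]:bounds[g + 1]] for g in range(n_groups)]

def dependent_double_sort(first, second, n_groups=5):
    """Sort on `first`, then within each group on `second`.

    first, second: dicts keyed by stock id (e.g. coskewness, then covariance).
    Returns a dict (g1, g2) -> list of stock ids, with groups numbered from 1.
    """
    cells = {}
    outer = split_into_groups(list(first), first.get, n_groups)
    for g1, members in enumerate(outer, start=1):
        inner = split_into_groups(members, second.get, n_groups)
        for g2, cell in enumerate(inner, start=1):
            cells[(g1, g2)] = cell
    return cells

# Toy data: 100 hypothetical stocks with distinct coskewness and covariance values.
coskew = {s: float(s) for s in range(100)}
covar = {s: float((s * 37) % 100) for s in range(100)}
cells = dependent_double_sort(coskew, covar)
print(len(cells), cells[(1, 1)])
```

Within a fixed first-sort group g1, comparing cell (g1, 5) with cell (g1, 1) gives the 5-1 spread on the second variable with the first one held roughly constant. Swapping the two arguments gives the sort that controls for covariance instead.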

Table 3 reports the average next month’s returns for the entire period, overweighting periods, and underweighting periods, respectively, where the overweighting periods and underweighting periods are determined by PWI1. Panel A shows that with the coskewness controlled, the spread portfolios (5-1) still have significantly positive monthly returns during the overweighting periods and significantly negative monthly returns during the underweighting periods. Panel B, on the other hand, shows that with covariance controlled, the spread portfolios (5-1) have significantly negative monthly returns during the overweighting periods and insignificant monthly returns during the underweighting periods.

[Insert Table 3 here]

Our sort portfolio analysis suggests that one can significantly improve the performance of the long-short covariance-sorted and coskewness-sorted strategies by making use of the additional information from the probability weighting indices proposed in this paper. It confirms that probability weighting is an important factor for pricing securities and building profitable portfolios.

4.4 Fama-MacBeth Cross-Sectional Regression

We conduct a Fama-MacBeth regression to investigate the impact of probability weighting on cross-sectional asset pricing. Following Bali, Engle, and Murray (2016), we first compute the covariance and coskewness of risky stock i for each month t from January 1996 to April 2016, by means of a 60-month-window regression:

$$
\tilde r_{i,\tau} = k_{i,t} + \mathrm{covariance}_{i,t}\,\tilde r_{M,\tau} + \mathrm{coskewness}_{i,t}\,\tilde r^2_{M,\tau} + \epsilon_{i,\tau}, \quad \tau = t-59, \cdots, t,
$$

where $k_{i,t}$ is the intercept and $\epsilon_{i,\tau}$ is the residual. In contrast to computing the covariance and coskewness from the samples directly, as in Subsections 4.2 and 4.3, the above regression rules out interactions between covariance and coskewness. Next, for each t, we consider the following cross-sectional regression:

$$
\tilde r_{i,t+1} = k_t + A_t\,\mathrm{covariance}_{i,t} + B_t\,\mathrm{coskewness}_{i,t} + \varepsilon_{i,t+1}, \quad i = 1, \cdots, n.
$$

The significance of the risk premiums A and B is determined by a t-test on the time series $\{A_t\}$ and $\{B_t\}$.19
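The two regression stages and the final t-test can be sketched with a small least-squares helper. This is a minimal illustration using only the standard library: the normal equations are solved by Gauss-Jordan elimination, and the t-statistic is the plain one for a time-series mean, with no autocorrelation adjustment (the text does not say whether one is applied). All function names are hypothetical.

```python
import math

def ols(X, y):
    """Least-squares coefficients for y = X b; rows of X include a leading 1.0."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    M = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(k):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [v - factor * u for v, u in zip(M[r], M[c])]
    return [M[r][k] / M[r][r] for r in range(k)]

def first_stage(r_i, r_m):
    """Time-series regression of r_i on r_M and r_M^2: (covariance, coskewness)."""
    X = [[1.0, m, m * m] for m in r_m]
    _, cov, cosk = ols(X, r_i)
    return cov, cosk

def second_stage(next_ret, cov, cosk):
    """Cross-sectional regression for one month: (A_t, B_t)."""
    X = [[1.0, c, s] for c, s in zip(cov, cosk)]
    _, a_t, b_t = ols(X, next_ret)
    return a_t, b_t

def t_stat(series):
    """t-statistic of the mean of a series such as {A_t} or {B_t}."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / (n - 1)
    return mean / math.sqrt(var / n)
```

In use, `first_stage` is run per stock on each 60-month window, `second_stage` is run once per month across stocks, and `t_stat` is applied to the resulting monthly series, either over the whole sample or separately for overweighting and underweighting months.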

Panel A of Table 4 reports the Fama-MacBeth regression results, in which the market overweighting and underweighting periods are divided by the index PWI1. If we do not separate the time periods based on probability weighting, the overall risk premium of covariance, A, is not significant and the risk premium of coskewness, B, is modestly negative. These are consistent with the empirical findings in the literature. When we do separate the whole period into two regimes (overweighting and underweighting), however, A becomes significantly positive during overweighting periods and significantly negative during underweighting periods, and B almost doubles its overall value during overweighting periods yet becomes insignificant during underweighting periods. Again, these results reaffirm our theoretical and prior empirical findings.

[Insert Table 4 here]

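19 Besides Bali et al. (2016)’s approach, Harvey and Siddique (2000) propose to compute $\mathrm{covariance}_{i,t}$ and $\mathrm{coskewness}_{i,t}$ by the following:

$$
\tilde r_{i,\tau} = k_{i,t} + \mathrm{covariance}_{i,t}\,\tilde r_{M,\tau} + \epsilon_{i,\tau}, \quad \tau = t-59, \cdots, t,
$$

$$
\mathrm{coskewness}_{i,t} = \frac{\frac{1}{59}\sum_{j=0}^{59} \epsilon_{i,t-j}\big(\tilde r^2_{M,t-j} - \overline{r^2_M}\big)}{\sqrt{\frac{1}{60}\sum_{j=0}^{59} \epsilon^2_{i,t-j}} \cdot \frac{1}{59}\sum_{j=0}^{59} (\tilde r_{M,t-j} - \bar r_M)^2},
$$

where $\bar r_M$ and $\overline{r^2_M}$ are the sample means of $\tilde r_{M,\tau}$ and $\tilde r^2_{M,\tau}$, respectively. We obtain the same results based on this method (omitted here due to space).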

Finally, we note that during overweighting periods, the mean-covariance tradeoff and mean-coskewness tradeoff are also economically significant. Indeed, the standard deviation of covariance is 0.8586 and that of coskewness is 10.8123. Thus, during an overweighting period, a one-standard-deviation increase in covariance is associated with a roughly 0.76% (the std of covariance times A, i.e., $0.8586 \times 8.888 \times 10^{-3}$) increase in expected monthly return, and a one-standard-deviation decrease in coskewness is associated with a roughly 0.89% (the std of coskewness times B, i.e., $(-10.8123) \times (-0.823) \times 10^{-3}$) increase in expected monthly return. This finding strengthens the notion that probability weighting has important implications in pricing and investment.
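The two magnitudes can be reproduced directly from the figures quoted in the text; the variable names below are ours.

```python
# Figures quoted in the text for overweighting periods.
std_cov, A = 0.8586, 8.888e-3
std_cosk, B = 10.8123, -0.823e-3

effect_cov = std_cov * A          # one-standard-deviation increase in covariance
effect_cosk = (-std_cosk) * B     # one-standard-deviation decrease in coskewness

print(f"{100 * effect_cov:.2f}% {100 * effect_cosk:.2f}%")  # prints "0.76% 0.89%"
```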

4.5 Robustness with Control Variables

Finally, we conduct a Fama-MacBeth regression to test the robustness of our main findings after controlling for six other variables. These control variables are Size (Size), Book-to-market ratio (BM), Momentum (Mom), Short-term reversal (Rev), Idiosyncratic volatility (IVol), and Idiosyncratic skewness (ISkew). Among them, the first four are well-studied factors long known to be relevant in asset pricing, and the last two are relatively new in the literature. Liu, Stambaugh, and Yuan (2018) argue that the beta anomaly arises from beta’s positive correlation with idiosyncratic volatility (IVol). Barberis and Huang (2008) find that CPT investors are willing to pay more for lottery-like stocks. An, Wang, Wang, and Yu (2020) observe that lottery-like stocks significantly underperform their non-lottery-like counterparts, especially among stocks where investors have lost money. Stocks of this type have high idiosyncratic skewness, which could be largely independent of the coskewness with the market. If probability weighting has an impact on the latter, as we have demonstrated, then, very likely, it has an impact on the former as well.

Following Bali et al. (2016), these control variables are computed as follows:

$$
\begin{aligned}
\mathrm{Size}_{i,t} &= \log\left(\frac{PRC_{i,t} \cdot SHROUT_{i,t}}{1000}\right), \\
BM_{i,t} &= \log\left(\frac{1000 \cdot BE_{i,y}}{PRC_{i,y} \cdot SHROUT_{i,y}}\right), \\
Mom_{i,t} &= 100\left[\prod_{j=1}^{11} (\tilde r_{i,t-j} + 1) - 1\right], \\
Rev_{i,t} &= 100 \cdot \tilde r_{i,t}, \\
IVol_{i,t} &= 100 \cdot \sqrt{\frac{1}{56}\sum_{j=0}^{59} \varepsilon^2_{i,t-j}} \cdot \sqrt{12}, \\
ISkew_{i,t} &= \frac{\frac{1}{60}\sum_{j=0}^{59} \varepsilon^3_{i,t-j}}{\left(\frac{1}{60}\sum_{j=0}^{59} \varepsilon^2_{i,t-j}\right)^{3/2}},
\end{aligned}
$$

where y is the end of a fiscal year, t is the end of a month between June of year y + 1 and May of year y + 2, $PRC_{i,t}$ is the price of stock i at t, $SHROUT_{i,t}$ is the number of shares outstanding of stock i at t, $BE_{i,y}$ is the book value of common equity of stock i at y, $PRC_{i,y}$ is the price of stock i at y, $SHROUT_{i,y}$ is the number of shares outstanding of stock i at y, $\tilde r_{i,t-j}$ is the return rate of stock i during month t − j, and $\varepsilon_{i,t-j}$, j = 0, ···, 59, are the residuals of the following regression:

$$
\tilde r_{i,\tau} = k_{i,t} + \delta^1_{i,t}\,MKT_\tau + \delta^2_{i,t}\,SMB_\tau + \delta^3_{i,t}\,HML_\tau + \varepsilon_{i,\tau}, \quad \tau = t-59, \cdots, t,
$$

where $MKT_\tau$ is the return of the market factor during month τ and $SMB_\tau$ and $HML_\tau$ are the returns of the size and value factor mimicking portfolios, respectively, during month τ.20
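The control variables translate into short functions. This is a sketch under stated assumptions: the three-factor residuals are taken as given, and the default `n_params=4` reflects our reading of the 1/56 factor as 60 observations minus four estimated parameters (the intercept and three factor loadings). Function and argument names are hypothetical.

```python
import math

def size(prc, shrout):
    """Size: log of price times shares outstanding, divided by 1000."""
    return math.log(prc * shrout / 1000.0)

def book_to_market(be, prc_y, shrout_y):
    """BM: log of 1000 * book equity over fiscal-year-end market capitalization."""
    return math.log(1000.0 * be / (prc_y * shrout_y))

def momentum(past_returns):
    """Mom: 100 * (cumulative gross return over months t-1, ..., t-11, minus 1)."""
    gross = 1.0
    for r in past_returns:
        gross *= 1.0 + r
    return 100.0 * (gross - 1.0)

def reversal(r_t):
    """Rev: the month-t return in percent."""
    return 100.0 * r_t

def idio_vol(residuals, n_params=4):
    """IVol: annualized residual standard deviation in percent."""
    dof = len(residuals) - n_params  # 60 - 4 = 56 for a 60-month window
    return 100.0 * math.sqrt(sum(e * e for e in residuals) / dof) * math.sqrt(12.0)

def idio_skew(residuals):
    """ISkew: third residual moment over the second moment to the power 3/2."""
    n = len(residuals)
    m2 = sum(e ** 2 for e in residuals) / n
    m3 = sum(e ** 3 for e in residuals) / n
    return m3 / m2 ** 1.5

# Hypothetical residuals: 60 months alternating between +2% and -2%.
residuals = [0.02, -0.02] * 30
print(idio_vol(residuals), idio_skew(residuals))
```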

The Fama-MacBeth regression results with control variables are reported in Panel B of Table 4. They show that even with the commonly used variables controlled in a cross-sectional analysis, our main empirical finding that the risk premium of covariance (coskewness) is significantly positive (negative) during overweighting periods and significantly negative (insignificant) during underweighting periods still stands.

It is interesting to note that, in Panel B of Table 4, IVol exhibits a significantly negative relationship with expected return only during underweighting periods. In other words, the mispricing (the negative risk premiums for beta and idiosyncratic volatility) only appears when investors underweight the tail risk, which echoes the finding in Liu et al. (2018). On the other hand, similar to coskewness, ISkew shows a significantly negative relation with expected return only during overweighting periods. This implies that overweighting both tails leads to RDUT investors developing a strong taste for both coskewness and idiosyncratic skewness.

20 The $IVol_{i,t}$ and $ISkew_{i,t}$ are estimated based on monthly data in a rolling window fashion. We have also estimated the $IVol_{i,t}$ and $ISkew_{i,t}$ based on daily data in each month t, and the main results still hold.

5 Dynamism of Probability Weighting

We have uncovered a conditional relationship, both theoretically and empirically, between the covariance/coskewness and the average return, where the conditioning variable is probability weighting. However, we have left an important question unanswered: Why do individuals sometimes overweight tail events and sometimes underweight them? In this section we offer a psychological explanation of the switch between overweighting and underweighting, and then test our theory against empirical evidence.

5.1 A Hypothesis on Time Variation in Probability Weighting

Fox and Tversky (1998) propose a two-step model of the psychology of tail events. The first step is about beliefs: an individual, based on her beliefs, estimates the probabilities of tail events. This step does not yet involve any decisions. The second step is about preferences: with these estimated probabilities at hand, the individual assigns weights to the tail events to assist her decision. So let us say that the true probability of a tail event is p, which is unknown to the individual; she will first assess it to be w1(p) – a subjective estimation of p – and then further weight it, relevant to her decision, to be w2(w1(p)). The final probability weighting function w – such as the one we have been dealing with in this paper – is indeed a composition of the two distortion functions: w = w2 ◦ w1.

Research in psychology distinguishes between decisions from description and those from experience (see e.g., Barberis, 2013). A decision from description occurs when the true probabilities are known to the individual, in which case the first step in the Fox and Tversky (1998) model can be skipped. A decision from experience occurs when the individual is not privy to the objective probabilities and hence has to learn them by experiencing them over time. In this case, she will undergo both steps. It is widely held in the psychology literature that, when making decisions from description, there is consistent overweighting of tails; that is, the aforementioned w2 function, and hence the w function (w ≡ w2 ◦ w1 = w2), are both more likely to be inverse S-shaped. However, Hertwig et al. (2004) argue via lab data that when making decisions from experience people may underweight tails.21 This observation reconciles with the S-shape of the weighting functions during certain months inferred from option data in the present paper. Mathematically, it is also consistent with – or, at least not contradictory to – the formula w = w2 ◦ w1: while w2 is robustly inverse S-shaped, the presence of w1 may render the final w a different shape, including S-shaped.

The above theory may explain the presence of overweighting or underweighting of tail distributions in stock investing, but not the switch between overweighting in some months and underweighting in others. To understand this dynamism of probability weighting we need to understand the underlying psychological mechanism that triggers the switch. For this we propose a coherent hypothesis, involving both beliefs and preferences and built on Fox and Tversky (1998)’s two-step model, to explain how probability weighting reacts to changes in the outside environment, and hence to account for the switch.

Tversky and Kahneman (1973) put forth the so-called availability heuristic, which refers to an apparent tendency to give more weight to easily recalled events in estimating the probability of an event happening. Several studies have examined the availability heuristic in finance, ranging from asymmetric reactions of stock prices to consumer sentiment reports (Akhtar, Faff, Oliver, and Subrahmanyam, 2012) to investors’ tendency to repurchase previously held stocks (Nofsinger and Varma, 2013). Goetzmann, Kim, and Shiller (2017) test the availability heuristic directly on investors’ probability estimates of a major stock crash – a left tail extreme event. They observe that recently experienced exogenous rare events significantly – although often incorrectly – impact an investor’s estimates of crash probabilities. The authors present evidence that articles in the financial press change investor crash beliefs: those with negative sentiment lead to higher crash probability assessments following market downturns, but those with positive sentiment have almost no effect. Finally, they find that subjective crash probabilities are highly dynamic: they vary significantly over time and are correlated with measures of jump risk such as the VIX and with the occurrence of extreme negative stock returns. The availability heuristic thereby provides a psychological ground as to why people’s probability beliefs react closely to recent exogenous shocks and, as a consequence, change over time. Moreover, Barberis (2013) argues that the availability heuristic can explain both overestimation and underestimation of tails for stock investing: during or immediately after a major stock crash (such as the most recent one due to COVID-19), events like market circuit-breaks and spectacular rebounds “receive disproportionate media coverage, making it easier for us to recall them, and tricking us into judging them more probable than they actually are”. On the other hand, in the years leading up to a crash (such as the big bull run before the pandemic), massive one-day stock market ups and downs (both of, say, more than 10%) are rare at best, for which reason we hardly recall tail events and hence assign lower probabilities to them.

21 The robustness and interpretation of this finding, however, are still being debated; see, e.g., pp. 613–614 of Barberis (2013).

While beliefs change dynamically, so do preferences. Many studies argue that probability weighting, while seemingly biased and irrational, actually serves a purpose. It is employed – most likely subconsciously or unconsciously – to respond optimally to various challenges associated with decision-making under uncertainty, such as limited information processing capacity (including limited attention), information loss, or over-optimism. Herold and Netzer (2010) show that inverse S-shaped probability weighting is an optimal response to S-shaped valuation of rewards. Woodford (2012) argues that the choices of probability weighting – termed perception strategies – are themselves optimal in minimizing the mean-square error of the subjective evaluation of probabilities. Steiner and Stewart (2016) demonstrate that overweighting of small probability events optimally mitigates errors due to randomness. Now, if probability weighting is endogenous and is an optimal response to the external environment, then the former will be forced to change over time because, inevitably, the latter will do so. Finally, like underestimation, if investors have not seen tail events for a long time, then they may underweight or even simply ignore them so as to optimally divert their ability and attention to what they think are more important scenarios.

Given the support from the psychology and behavioral finance literatures for the feedback nature of probability weighting on tail events (that is, environmental changes are fed back to our perceptions of the tails in both beliefs and preferences), we hypothesize that probability weighting is time varying. Most of the time, the change is slow and insignificant, especially when no extreme events are occurring, during which time a prior overweighting may gradually change to underweighting. However, a sudden and significant event, a so-called “black swan”, such as a pandemic, a stock crash or even an unexpected news report, may trigger a major, characteristic change in the probability weighting, including that from underweighting to overweighting.22

5.2 Empirical Tests

We now test empirically our hypothesis that switches between overweighting and underweighting are associated with (sometimes dramatic) changes in the market environment. We choose variables in different categories that capture these changes. The first category consists of proxies of extreme price movement. Let Max_i and Min_i be, respectively, the largest (most positive) and smallest (most negative) daily returns of asset i in a month, and Maxave and Minave be the respective average values across all stocks. Moreover, the monthly returns of all stocks form a cross-sectional distribution, whose 1% and 99% quantiles are denoted by Q01 and Q99, respectively. We also take the largest and smallest daily returns of the S&P 500 in each month, denoted by MaxSP and MinSP, respectively, and the standard deviation of the daily excess returns of the market portfolio, denoted by σ_MKT.
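The extreme-return proxies defined above can be illustrated with a minimal sketch. The input layout (a dictionary mapping a stock identifier to its daily returns within one month), the function name, and the toy tickers are assumptions made for illustration only; they are not the data pipeline used in the paper.

```python
from statistics import mean


def extreme_return_proxies(daily_returns_by_stock):
    """Compute Max_i, Min_i, Maxave and Minave for a single month.

    daily_returns_by_stock maps a (hypothetical) stock identifier to the
    list of that stock's daily returns within the month.
    """
    # Largest and smallest daily return of each stock in the month
    max_i = {s: max(r) for s, r in daily_returns_by_stock.items()}
    min_i = {s: min(r) for s, r in daily_returns_by_stock.items()}
    return {
        "Max_i": max_i,
        "Min_i": min_i,
        # Cross-sectional averages of the per-stock extremes
        "Maxave": mean(list(max_i.values())),
        "Minave": mean(list(min_i.values())),
    }


# Toy month with two made-up stocks
toy_month = {"AAA": [0.01, -0.03, 0.05], "BBB": [-0.02, 0.04, 0.00]}
proxies = extreme_return_proxies(toy_month)
print(proxies["Maxave"], proxies["Minave"])
```

MaxSP and MinSP follow the same pattern applied to the single series of S&P 500 daily returns.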

The next natural choice is the VIX, the Chicago Board Options Exchange (CBOE) Market Volatility Index, a well-known measure of the stock market’s expectation of volatility based on S&P 500 index options. This index has been widely accepted as the best available barometer of investors’ fear since its introduction in the 1990s.

We also consider two media-based variables. The first is the Equity Market Volatility (EMV) tracker, which is based on eleven major U.S. newspapers.23 The EMV index counts the newspaper articles that contain at least one term in each of E: {economic, economy, financial}, M: {stock market, equity, equities, Standard and Poors (and variants)}, V: {volatility, volatile, uncertain, uncertainty, risk, risky}. The second variable we consider is the Financial Stress Indicator (FSI), which is based on five U.S. newspapers.24 This index also counts articles related to financial distress. Specifically, four financial sentiment dictionaries are used to first measure the sentiment of each article’s title, and then the FSI is the total number of articles in a month with net negative connotations.

In addition to these news-based sentiment indices, we also consider a more general and widely used sentiment index proposed by Baker and Wurgler (2006). This index is constructed as the first principal component of five different sentiment proxies: IPO numbers, first-day return of IPO, closed-end fund discount, turnover ratio and share of equity/debt issues.

The final category is the survey variables. We consider Shiller’s U.S. Crash Confidence Index, which is the percentage of respondents who think that the probability of a U.S. market crash occurring in the next six months is strictly less than 10%.25 We also take the Bull and Bear Indices based on the Investor Sentiment Survey of the American Association of Individual Investors (AAII). The Investor Sentiment Survey, which has been administered weekly to members of the AAII since 1987, measures the percentage of individual investors who are bullish, neutral, or bearish on the stock market for the next six months.

We present all of the above variables, together with the PWI1 index, and their summary statistics in Table 5. Figure 8 graphs the PWI1 index together with variables selected from each group in Table 5.26 It is evident from Figure 8 that all of the indices, except the sentiment index, move along with PWI1.

22 We reiterate that our theory here is both belief- and preference-based. Beliefs and preferences are always entangled, and it is both challenging and interesting to disentangle them even for classical EUT (Ross, 2015). It is even thornier to decouple them in the context of probability weighting (Barberis, 2013, Goetzmann et al., 2017). Luckily, for our purpose in this paper we need to understand only the combined effect of the dynamics of beliefs and preferences on probability weighting.

23 The eleven newspapers are Boston Globe, Chicago Tribune, Dallas Morning News, Houston Chronicle, Los Angeles Times, Miami Herald, New York Times, San Francisco Chronicle, USA Today, Wall Street Journal, and Washington Post. The monthly EMV data can be downloaded from http://www.policyuncertainty.com/EMV_monthly.html.

24 The five U.S. newspapers are Boston Globe, Chicago Tribune, Los Angeles Times, Wall Street Journal and Washington Post. The monthly FSI data can be downloaded from http://www.policyuncertainty.com/financial_stress.html.

25 The monthly data of this index is available at https://som.yale.edu/faculty-research-centers/centers-initiatives/international-center-for-finance/data/stock-market-confidence-indices/united-states-stock-market-confidence-indices.

26 There are a total of 15 variables in Table 5. It would be visually incoherent were we to plot them all in the same figure. Therefore, we choose a representative from each group to display in Figure 8. For example, we select MaxSP and MinSP from the return variables to represent the extreme market movements. If we instead had used Maxave and Minave as the representatives, they would have displayed patterns similar to the green and red bars shown in Figure 8.
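The EMV counting rule described in this subsection (an article qualifies if it contains at least one term from each of the sets E, M and V) can be sketched as follows. This is only an illustration under simplifying assumptions: matching is done by plain lowercase substring search, only one literal spelling of "Standard and Poors" is included although the text mentions variants, and the function names and toy articles are hypothetical. It is not the tracker's actual implementation.

```python
# Term sets as listed in the text
E_TERMS = ("economic", "economy", "financial")
M_TERMS = ("stock market", "equity", "equities", "standard and poors")
V_TERMS = ("volatility", "volatile", "uncertain", "uncertainty", "risk", "risky")


def is_emv_article(text):
    """Return True if the text has at least one term from each of E, M and V."""
    lowered = text.lower()
    return all(
        any(term in lowered for term in terms)
        for terms in (E_TERMS, M_TERMS, V_TERMS)
    )


def emv_article_count(articles):
    """Count the qualifying articles in an iterable of article texts."""
    return sum(is_emv_article(a) for a in articles)


toy_articles = [
    "The economy slowed as stock market volatility rose.",   # E, M, V present
    "The stock market rallied on strong earnings.",          # only M present
    "Financial regulators see risk in equities.",            # E, M, V present
]
count = emv_article_count(toy_articles)
print(count)
```

Substring matching is deliberately crude (for instance, "risk" also matches inside longer words); a tokenized match would be a natural refinement.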

[Insert Table 5 here]

[Insert Figure 8 here]

Panel A of Table 6 reports the correlations between PWI1 and the other variables. VIX has the highest correlation with PWI1 (0.815). The return variables representing extreme events are all highly correlated with PWI1; the smallest correlation, that of Q99, is still 0.338. The media variables also have high correlations, as does the Crash index in the survey group. The Bull/Bear survey indices are correlated with PWI1, though less strongly. Finally, the sentiment index is uncorrelated with PWI1. Panel B of Table 6 reports the number and proportion of months for which an alternative variable and PWI1 give different labels (overweighting/underweighting). We can see that the variables highly correlated with PWI1, such as VIX, Minave, and Maxave, show few discrepancies.
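The label comparison behind Panel B can be sketched as below, using the median rule stated in the caption of Table 6: a month is labelled overweighting when the variable is above its median, except for Q01, Minave and MinSP, where it is labelled overweighting when the variable is below its median. The function names and the toy series are assumptions for illustration, and the two series are assumed to cover the same months in the same order.

```python
from statistics import median

# Variables for which a value BELOW the median marks an overweighting month
LOWER_MEANS_OVERWEIGHTING = {"Q01", "Minave", "MinSP"}


def overweighting_labels(name, series):
    """Label each month True (overweighting) or False (underweighting)."""
    med = median(series)
    if name in LOWER_MEANS_OVERWEIGHTING:
        return [x < med for x in series]
    return [x > med for x in series]


def label_discrepancy(name, series, pwi1):
    """Number and proportion of months where the labels differ from PWI1's."""
    alt = overweighting_labels(name, series)
    ref = overweighting_labels("PWI1", pwi1)
    n_diff = sum(a != b for a, b in zip(alt, ref))
    return n_diff, n_diff / len(ref)


# Made-up six-month series, for illustration only
pwi1 = [0.2, 1.5, 2.4, 0.4, 3.0, 0.9]
vix = [12.0, 25.0, 30.0, 15.0, 18.0, 22.0]
min_sp = [-0.5, -3.0, -4.0, -1.0, -2.5, -0.8]

vix_diff = label_discrepancy("VIX", vix, pwi1)
minsp_diff = label_discrepancy("MinSP", min_sp, pwi1)
print(vix_diff, minsp_diff)
```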

[Insert Table 6 here]

Figure 9 further plots the switching patterns of the overweighting periods and the underweighting periods under five indices. These are remarkably similar.

[Insert Figure 9 here]

The findings summarized by Table 6, Figure 8, and Figure 9 strongly support our theory that the dynamics of PWI1 are driven by market movements. The high correlation between PWI1, the return variables (and their proxies such as VIX), and the media variables confirms our conjecture that a direct experience of extreme events and/or sensational media coverage quickly causes investors to pay more attention to tail risk and to adjust their beliefs/preferences accordingly. The considerable correlation between PWI1 and the survey variables underlines a synchronization between probability weighting and investors’ sentiments regarding the market. Intriguingly, the fact that PWI1 and Baker and Wurgler (2006)’s sentiment index are largely unrelated corroborates our hypothesis from a different angle: the constituents of this particular index (IPO numbers, first-day return of IPO, closed-end fund discount, turnover ratio, and share of equity/debt issues) are obscure and opaque, neither directly observable nor easily comprehensible to investors; hence, investors do not respond to them, at least not immediately or dramatically.

In addition, we use these highly correlated variables as alternative probability weighting indices to repeat the empirical analysis on the conditional return and covariance/coskewness relationship, and we find that the results remain consistent. The results are presented in an online appendix. This additional empirical check demonstrates that our main results are robust in that they are independent of the choice of a specific probability weighting index.

6 Conclusions

We show that probability weighting is a key driver of beta and coskewness pricing. Individuals tend to overweight or underweight the two tails simultaneously. Whether they overweight or underweight changes dynamically over time, for which we offer a psychology-based explanation supported by the empirical evidence we present. We find that the total length of overweighting periods is comparable to that of underweighting periods. If we analyze a sufficiently long sample period, the aggregate effects of overweighting and underweighting may cancel out, potentially deceiving us into overlooking the significance of probability weighting. It is, therefore, important to separate the overweighting and underweighting periods and to study them individually. In doing so, as demonstrated in this paper, we are able to understand the significant role that probability weighting plays in pricing and in building profitable portfolios.

We carry out the theoretical analysis using an RDUT model and confirm its implications with empirical tests. We believe that our model offers a partial explanation of the beta anomaly and hints at a viable way in which to potentially resolve other puzzles in asset pricing.

Appendix

A: Proof of Theorem 2

When the market is in equilibrium, the total return of the optimal portfolio of the representative investor equals the total return of the market portfolio, i.e., $\tilde{R}_P^* = \tilde{R}_M$. Applying Theorem 5.2 of Xia and Zhou (2016), we obtain

\[ \tilde{m} = (\lambda^*)^{-1} w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M). \]

However, the pricing formula of the risk-free asset yields

\[ 1 = E[\tilde{m}(1 + r_f)] = E[\tilde{m}](1 + r_f), \]

which in turn implies

\[ \lambda^* = (1 + r_f)\, E\left[ w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M) \right]. \tag{12} \]

This establishes (4).

Next we rewrite

\[ \tilde{m} = (\lambda^*)^{-1} w'(1 - F_{\tilde{R}_M}(1 + \tilde{r}_M))\, u'(1 + \tilde{r}_M). \]

Applying the Taylor expansion up to the second order of $\tilde{r}_M$, we can express the pricing kernel as follows:27

\begin{align*}
\tilde{m} ={}& (\lambda^*)^{-1} w'(1 - F_{\tilde{R}_M}(1))\, u'(1) \\
& + (\lambda^*)^{-1} \left[ w'(1 - F_{\tilde{R}_M}(1))\, u''(1) - w''(1 - F_{\tilde{R}_M}(1))\, u'(1) f_{\tilde{R}_M}(1) \right] \tilde{r}_M \\
& + \frac{1}{2} (\lambda^*)^{-1} \Big\{ w'(1 - F_{\tilde{R}_M}(1))\, u'''(1) - w''(1 - F_{\tilde{R}_M}(1)) \left[ 2u''(1) f_{\tilde{R}_M}(1) + u'(1) f'_{\tilde{R}_M}(1) \right] \\
& \qquad + w'''(1 - F_{\tilde{R}_M}(1))\, u'(1) f^2_{\tilde{R}_M}(1) \Big\} \tilde{r}_M^2 + o(\tilde{r}_M^2). \tag{13}
\end{align*}

Applying the pricing formula to the risky assets, we have

\[ E[\tilde{m}(1 + \tilde{r}_i)] = 1, \quad i = 2, 3, \cdots, n. \]

Hence,

\[ \mathrm{Cov}(\tilde{m}, (1 + \tilde{r}_i)) + E[1 + \tilde{r}_i]\, E[\tilde{m}] = 1, \]

or

\[ E[\tilde{r}_i] = \frac{1 - \mathrm{Cov}(\tilde{m}, \tilde{r}_i)}{E[\tilde{m}]} - 1 = r_f + A\, \mathrm{Cov}(\tilde{r}_i, \tilde{r}_M) + \frac{1}{2} B\, \mathrm{Cov}(\tilde{r}_i, \tilde{r}_M^2) + o(\tilde{r}_M^2), \]

thanks to (12) and (13).

27 Here, we Taylor expand $F_{\tilde{R}_M}(1 + \tilde{r}_M)$ around 1, then expand $w'(1 - F_{\tilde{R}_M}(1 + \tilde{r}_M))$ around $1 - F_{\tilde{R}_M}(1)$, and expand $u'(1 + \tilde{r}_M)$ around 1.
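The last equality in the proof of Theorem 2 combines (12) and (13) in one step. A sketch of the intermediate algebra is given below; the shorthand $c_1$ and $c_2$ for the first- and second-order coefficients in (13) is introduced here purely for illustration and does not appear in the paper.

```latex
\begin{align*}
c_1 &:= w'(1 - F_{\tilde{R}_M}(1))\, u''(1) - w''(1 - F_{\tilde{R}_M}(1))\, u'(1) f_{\tilde{R}_M}(1), \\
c_2 &:= w'(1 - F_{\tilde{R}_M}(1))\, u'''(1)
      - w''(1 - F_{\tilde{R}_M}(1)) \left[ 2u''(1) f_{\tilde{R}_M}(1) + u'(1) f'_{\tilde{R}_M}(1) \right]
      + w'''(1 - F_{\tilde{R}_M}(1))\, u'(1) f^2_{\tilde{R}_M}(1), \\[4pt]
E[\tilde{m}] &= \frac{1}{1 + r_f}
  \quad\Longrightarrow\quad
  E[\tilde{r}_i] = \frac{1 - \mathrm{Cov}(\tilde{m}, \tilde{r}_i)}{E[\tilde{m}]} - 1
                 = r_f - (1 + r_f)\, \mathrm{Cov}(\tilde{m}, \tilde{r}_i), \\[4pt]
\mathrm{Cov}(\tilde{m}, \tilde{r}_i)
  &= (\lambda^*)^{-1} \left[ c_1\, \mathrm{Cov}(\tilde{r}_i, \tilde{r}_M)
     + \tfrac{1}{2} c_2\, \mathrm{Cov}(\tilde{r}_i, \tilde{r}_M^2) \right] + o(\tilde{r}_M^2)
  \quad \text{by (13)}, \\[4pt]
(1 + r_f)(\lambda^*)^{-1}
  &= \frac{1}{E\left[ w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M) \right]}
  \quad \text{by (12)}, \\[4pt]
\text{so that}\quad
A &= -\frac{c_1}{E\left[ w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M) \right]},
\qquad
B = -\frac{c_2}{E\left[ w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M) \right]}.
\end{align*}
```

The constant term of (13) drops out of the covariance, which is why only $c_1$ and $c_2$ enter the expected-return formula.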

B: Proof of Proposition 1

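i) When $\alpha < 1$, $w$ is inverse S-shaped. Hence, $1 - F_{\tilde{R}_M}(1)$ lies within the convex domain of $w$ under the assumptions of the proposition. Therefore, the two terms inside the brackets of (6) are both negative. This implies $A > 0$.28

ii) When $\alpha > 1$, $w$ is S-shaped, and $1 - F_{\tilde{R}_M}(1)$ lies within the concave domain of $w$. With $w(p) = \exp(-(-\log(p))^{\alpha})$, we have

\begin{align*}
A &\equiv -\frac{w'(1 - F_{\tilde{R}_M}(1))}{E\left[ w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M) \right]} \left[ u''(1) - \frac{w''(1 - F_{\tilde{R}_M}(1))}{w'(1 - F_{\tilde{R}_M}(1))}\, u'(1) f_{\tilde{R}_M}(1) \right] \\
&= -\frac{w'(1 - F_{\tilde{R}_M}(1))}{E\left[ w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M) \right]} \left[ u''(1) - \frac{\alpha y_0^{\alpha} - \alpha + 1 - y_0}{(1 - F_{\tilde{R}_M}(1))\, y_0}\, u'(1) f_{\tilde{R}_M}(1) \right],
\end{align*}

where $y_0 = -\log(1 - F_{\tilde{R}_M}(1))$. It follows from the assumptions of the proposition that $0 \le y_0 < 1$, which implies $\lim_{\alpha \to +\infty} (\alpha y_0^{\alpha} - \alpha) = -\infty$. Hence,

\[ \lim_{\alpha \to +\infty} -\frac{\alpha y_0^{\alpha} - \alpha + 1 - y_0}{(1 - F_{\tilde{R}_M}(1))\, y_0}\, u'(1) f_{\tilde{R}_M}(1) = +\infty. \]

Thus, there must exist $\alpha^* > 1$ such that when $\alpha > \alpha^*$,

\[ -\frac{\alpha y_0^{\alpha} - \alpha + 1 - y_0}{(1 - F_{\tilde{R}_M}(1))\, y_0}\, u'(1) f_{\tilde{R}_M}(1) > -u''(1), \]

implying $A < 0$.

28 This part of the proof does not depend on the specific form of $w$; it only requires $1 - F_{\tilde{R}_M}(1)$ to be in the convex domain of $w$.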
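The closed form of $w''/w'$ used in part ii) is stated without derivation. A short sketch of how it can be obtained for the Prelec function, for $p \in (0, 1)$ and with the shorthand $y = -\log p$ (so that $y = y_0$ at $p = 1 - F_{\tilde{R}_M}(1)$), is:

```latex
\begin{align*}
w(p) &= \exp(-y^{\alpha}), \qquad y = -\log p, \qquad \frac{dy}{dp} = -\frac{1}{p}, \\
w'(p) &= \frac{\alpha y^{\alpha - 1}}{p}\, w(p), \\
\log w'(p) &= -y^{\alpha} + \log \alpha + (\alpha - 1) \log y - \log p, \\
\frac{w''(p)}{w'(p)} &= \frac{d}{dp} \log w'(p)
  = \frac{\alpha y^{\alpha - 1}}{p} - \frac{\alpha - 1}{p\, y} - \frac{1}{p}
  = \frac{\alpha y^{\alpha} - \alpha + 1 - y}{p\, y}.
\end{align*}
```

Evaluating the last expression at $p = 1 - F_{\tilde{R}_M}(1)$ gives the ratio that appears inside the brackets of $A$.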

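C: Proof of Proposition 2

With $w(p) = \exp(-(-\log(p))^{\alpha})$, we have

\begin{align*}
B &= -\frac{w'(1 - F_{\tilde{R}_M}(1))\, u'(1)}{E\left[ w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M) \right]} \Bigg[ \gamma(\gamma + 1) - \frac{\alpha y_0^{\alpha} - \alpha + 1 - y_0}{(1 - F_{\tilde{R}_M}(1))\, y_0} \left( -2\gamma f_{\tilde{R}_M}(1) + f'_{\tilde{R}_M}(1) \right) \\
&\qquad + \frac{(\alpha y_0^{\alpha} - \alpha + 1 - y_0)^2 + (1 - \alpha) + (1 - \alpha - y_0)(\alpha y_0^{\alpha} - y_0)}{(1 - F_{\tilde{R}_M}(1))^2\, y_0^2}\, f^2_{\tilde{R}_M}(1) \Bigg] \\
&:= -\frac{w'(1 - F_{\tilde{R}_M}(1))\, u'(1)}{E\left[ w'(1 - F_{\tilde{R}_M}(\tilde{R}_M))\, u'(\tilde{R}_M) \right]} \left[ B_1 + B_2 + B_3 \right],
\end{align*}

where $B_1$, $B_2$, $B_3$ denote the three terms in the brackets, and $y_0 = -\log(1 - F_{\tilde{R}_M}(1))$. Clearly, $B_1 > 0$.

i) When $0 < \alpha < 1$ and $\gamma \ge \frac{f'_{\tilde{R}_M}(1)}{2 f_{\tilde{R}_M}(1)}$, we have $B_2 \ge 0$. We now examine $B_3$. It follows from $1 - F_{\tilde{R}_M}(1) > e^{-1}$ that $0 \le y_0 < 1$. On the other hand, the first-order condition of the function $(\alpha y_0^{\alpha} - y_0)$ is

\[ \frac{\partial}{\partial y_0} (\alpha y_0^{\alpha} - y_0) = \alpha^2 y_0^{\alpha - 1} - 1 = 0, \]

whose solution is $y_0^* = \alpha^{\frac{2}{1 - \alpha}} \in (0, 1)$. Then, the extreme values of this function on $y_0 \in [0, 1]$ are

\[ \max_{y_0 \in [0, 1]} (\alpha y_0^{\alpha} - y_0) = \max \left\{ \alpha 0^{\alpha} - 0,\ \alpha 1^{\alpha} - 1,\ \alpha^{\frac{1 + \alpha}{1 - \alpha}} - \alpha^{\frac{2}{1 - \alpha}} \right\} = \alpha^{\frac{1 + \alpha}{1 - \alpha}} - \alpha^{\frac{2}{1 - \alpha}} > 0, \]

\[ \min_{y_0 \in [0, 1]} (\alpha y_0^{\alpha} - y_0) = \min \left\{ \alpha 0^{\alpha} - 0,\ \alpha 1^{\alpha} - 1,\ \alpha^{\frac{1 + \alpha}{1 - \alpha}} - \alpha^{\frac{2}{1 - \alpha}} \right\} = \alpha - 1 < 0. \]

When $1 - \alpha - y_0 > 0$, we have

\[ (1 - \alpha) + (1 - \alpha - y_0)(\alpha y_0^{\alpha} - y_0) \ge (1 - \alpha) + (1 - \alpha - 0)(\alpha - 1) = (1 - \alpha)\alpha > 0. \]

When $1 - \alpha - y_0 < 0$, we have

\[ (1 - \alpha) + (1 - \alpha - y_0)(\alpha y_0^{\alpha} - y_0) \ge (1 - \alpha) + (1 - \alpha - 1)\left( \alpha^{\frac{1 + \alpha}{1 - \alpha}} - \alpha^{\frac{2}{1 - \alpha}} \right) = \left( 1 - \alpha^{\frac{2}{1 - \alpha}} \right)(1 - \alpha) > 0. \]

When $1 - \alpha - y_0 = 0$, we have

\[ (1 - \alpha) + (1 - \alpha - y_0)(\alpha y_0^{\alpha} - y_0) = (1 - \alpha) > 0. \]

This establishes that $B_3 > 0$, leading to $B < 0$. (Here we have actually proved that, in this case, $w'''(1 - F_{\tilde{R}_M}(1)) > 0$.)

ii) When $\alpha > 1$, we consider $B_1 + B_2 + B_3$ as a function of $\gamma$ and $\alpha$:

\[ B_1 + B_2 + B_3 = \gamma^2 + (1 + 2 g_1(\alpha) f_{\tilde{R}_M}(1))\gamma - g_1(\alpha) f'_{\tilde{R}_M}(1) + g_2(\alpha) f^2_{\tilde{R}_M}(1), \]

where $g_1(\alpha)$ and $g_2(\alpha)$ are continuous functions of $\alpha$:

\begin{align*}
g_1(\alpha) &:= \frac{\alpha y_0^{\alpha} - \alpha + 1 - y_0}{(1 - F_{\tilde{R}_M}(1))\, y_0}, \\
g_2(\alpha) &:= \frac{(\alpha y_0^{\alpha} - \alpha + 1 - y_0)^2 + (1 - \alpha) + (1 - \alpha - y_0)(\alpha y_0^{\alpha} - y_0)}{(1 - F_{\tilde{R}_M}(1))^2\, y_0^2}.
\end{align*}

It is easy to show that, for any given finite $\alpha > 1$, there exists a threshold

\[ \bar{\gamma}(\alpha) := \begin{cases} \dfrac{-(1 + 2 g_1(\alpha) f_{\tilde{R}_M}(1)) + \sqrt{(1 + 2 g_1(\alpha) f_{\tilde{R}_M}(1))^2 - 4\left( -g_1(\alpha) f'_{\tilde{R}_M}(1) + g_2(\alpha) f^2_{\tilde{R}_M}(1) \right)}}{2}, & \text{if } (1 + 2 g_1(\alpha) f_{\tilde{R}_M}(1))^2 - 4\left( -g_1(\alpha) f'_{\tilde{R}_M}(1) + g_2(\alpha) f^2_{\tilde{R}_M}(1) \right) \ge 0, \\[8pt] 0, & \text{if } (1 + 2 g_1(\alpha) f_{\tilde{R}_M}(1))^2 - 4\left( -g_1(\alpha) f'_{\tilde{R}_M}(1) + g_2(\alpha) f^2_{\tilde{R}_M}(1) \right) < 0, \end{cases} \]

such that $B_1 + B_2 + B_3 > 0$ (and, hence, $B < 0$) whenever $\gamma > \bar{\gamma}(\alpha)$.

iii) As $0 \le y_0 < 1$ and $1 - F_{\tilde{R}_M}(1)$ lies within the concave domain of $w$, we have

\[ \alpha y_0^{\alpha} - \alpha + 1 - y_0 \le 0. \]

Thus, $B_2 \le 0$ when $\gamma \ge \gamma^*$. Now,

\[ B_1 + B_3 = \gamma(\gamma + 1) + g_2(\alpha) f^2_{\tilde{R}_M}(1). \]

It is easy to prove that, for any given finite $\alpha > 1$, there must exist a threshold

\[ \underline{\gamma}(\alpha) := \begin{cases} \dfrac{\sqrt{1 - g_2(\alpha) f^2_{\tilde{R}_M}(1)} - 1}{2}, & \text{if } g_2(\alpha) \le 0, \\[8pt] 0, & \text{if } g_2(\alpha) > 0, \end{cases} \]

such that $B_1 + B_3 < 0$ when $0 \le \gamma < \underline{\gamma}(\alpha)$. This completes the proof.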
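As a numerical illustration of the quantities in this proof, the sketch below evaluates $g_1(\alpha)$, $g_2(\alpha)$ and the three thresholds for the normal specification of Table 1 (µ = 1.076, σ = 0.158) with α = 1.5. It assumes that γ∗ equals the bound $f'_{\tilde{R}_M}(1) / (2 f_{\tilde{R}_M}(1))$ from part i), and it codes the two thresholds exactly as they are written in parts ii) and iii); the variable names are illustrative. The results can be compared with the Parameters column reported for the normal case in Panel C of Table 1.

```python
import math

MU, SIGMA = 1.076, 0.158   # normal market-return distribution (Table 1)
ALPHA = 1.5                # Prelec parameter, underweighting case


def normal_pdf(x):
    z = (x - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf((x - MU) / (SIGMA * math.sqrt(2.0))))


f1 = normal_pdf(1.0)                   # f(1)
df1 = f1 * (MU - 1.0) / SIGMA ** 2     # f'(1) for a normal density
p0 = 1.0 - normal_cdf(1.0)             # 1 - F(1)
y0 = -math.log(p0)

# g1 and g2 as defined in part ii) of the proof
n0 = ALPHA * y0 ** ALPHA - ALPHA + 1.0 - y0
g1 = n0 / (p0 * y0)
g2 = (n0 ** 2 + (1.0 - ALPHA)
      + (1.0 - ALPHA - y0) * (ALPHA * y0 ** ALPHA - y0)) / (p0 * y0) ** 2

# Assumed definition of gamma*: the bound used in part i)
gamma_star = df1 / (2.0 * f1)

# Upper threshold: larger root of the quadratic in gamma from part ii)
b = 1.0 + 2.0 * g1 * f1
c = -g1 * df1 + g2 * f1 ** 2
disc = b * b - 4.0 * c
gamma_upper = (-b + math.sqrt(disc)) / 2.0 if disc >= 0 else 0.0

# Lower threshold, coded as printed in part iii)
gamma_lower = (math.sqrt(1.0 - g2 * f1 ** 2) - 1.0) / 2.0 if g2 <= 0 else 0.0

print(round(gamma_star, 4), round(gamma_lower, 4), round(gamma_upper, 4))
```

Only the standard library is used; the normal CDF is obtained from `math.erf`, so no numerical integration is needed for these three parameters.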

References

Abdellaoui, Mohammed, 2002, A genuine rank-dependent generalization of the von Neumann-Morgenstern expected utility theorem, Econometrica 70, 717–736.

Akhtar, S., R. Faff, B. Oliver, and A. Subrahmanyam, 2012, Stock salience and the asymmetric market effect of consumer sentiment news, Journal of Banking & Finance 36, 3289–3301.

An, Li, Huijun Wang, Jian Wang, and Jianfeng Yu, 2020, Lottery-related anomalies: The role of reference-dependent preferences, Management Science 66, 473–501.

Baele, Lieven, Joost Driessen, Sebastian Ebert, Juan M. Londono, and Oliver G. Spalt, 2019, Cumulative prospect theory, option returns, and the variance premium, The Review of Financial Studies 32, 3667–3723.

Baker, M., and J. Wurgler, 2006, Investor sentiment and the cross-section of stock returns, The Journal of Finance 61, 1645–1680.

Baker, Malcolm, Brendan Bradley, and Jeffrey Wurgler, 2011, Benchmarks as limits to arbitrage: Understanding the low-volatility anomaly, Financial Analysts Journal 67, 40–54.

Bali, Turan G., Robert F. Engle, and Scott Murray, 2016, Empirical asset pricing: the cross section of stock returns (John Wiley & Sons).

Barberis, Nicholas, 2013, The psychology of tail events: Progress and challenges, American Economic Review 103, 611–16.

Barberis, Nicholas, and Ming Huang, 2008, Stocks as lotteries: The implications of probability weighting for security prices, The American Economic Review 98, 2066–2100.

Barberis, Nicholas, Abhiroop Mukherjee, and Baolian Wang, 2016, Prospect theory and stock returns: An empirical test, Review of Financial Studies 29, 3068–3107.

Black, Fischer, Michael C. Jensen, and Myron Scholes, 1972, The capital asset pricing model: Some empirical tests, 9–124 (Praeger Publishers Inc.).

Bliss, Robert R., and Nikolaos Panigirtzoglou, 2002, Testing the stability of implied probability density functions, Journal of Banking & Finance 26, 381–422.

Bollerslev, Tim, Viktor Todorov, and Lai Xu, 2015, Tail risk premia and return predictability, Journal of Financial Economics 118, 113–134.

Bordalo, Pedro, Nicola Gennaioli, and Andrei Shleifer, 2012, Salience theory of choice under risk, The Quarterly Journal of Economics 127, 1243–1285.

Brockett, Patrick L., and Linda L. Golden, 1987, A class of utility functions containing all the common utility functions, Management Science 33, 955–964.

De Giorgi, Enrico G, and Shane Legg, 2012, Dynamic portfolio choice and asset pricing with narrow framing and probability weighting, Journal of Economic Dynamics and Control 36, 951–972.

Dierkes, Maik, 2013, Probability weighting and asset prices, SSRN.

Fama, Eugene F, and Kenneth R French, 1992, The cross-section of expected stock returns, The Journal of Finance 47, 427–465.

Fama, Eugene F, and James D MacBeth, 1973, Risk, return, and equilibrium: Empirical tests, Journal of Political Economy 81, 607–636.

Fox, Craig R, and Amos Tversky, 1998, A belief-based account of decision under uncertainty, Management Science 44, 879–895.

Frazzini, Andrea, and Lasse Heje Pedersen, 2014, Betting against beta, Journal of Financial Economics 111, 1–25.

Goetzmann, William N., Dasol Kim, and Robert J. Shiller, 2017, Crash beliefs from investor surveys, NBER Working Paper No. 22143.

Gonzalez, Richard, and George Wu, 1999, On the shape of the probability weighting function, Cognitive Psychology 38, 129–166.

Harrison, Glenn W, Steven J Humphrey, and Arjan Verschoor, 2009, Choice under uncertainty: evidence from Ethiopia, India and Uganda, The Economic Journal 120, 80–104.

Harvey, Campbell R, and Akhtar Siddique, 2000, Conditional skewness in asset pricing tests, The Journal of Finance 55, 1263–1295.

Henrich, Joseph, Steven J Heine, and Ara Norenzayan, 2010, The weirdest people in the world?, Behavioral and Brain Sciences 33, 61–83.

Herold, Florian, and Nick Netzer, 2010, Probability weighting as evolutionary second-best, University of Zurich Socioeconomic Institute Working Paper 1005.

Hertwig, Ralph, Greg Barron, Elke U Weber, and Ido Erev, 2004, Decisions from experience and the effect of rare events in risky choice, Psychological Science 15, 534–539.

Hong, Harrison, and David A Sraer, 2016, Speculative betas, The Journal of Finance 71, 2095–2144.

Hsu, Ming, Ian Krajbich, Chen Zhao, and Colin F Camerer, 2009, Neural response to reward anticipation under risk is nonlinear in probabilities, Journal of Neuroscience 29, 2231–2237.

Humphrey, Steven J, and Arjan Verschoor, 2004, The probability weighting function: experimental evidence from Uganda, India and Ethiopia, Economics Letters 84, 419–425.

Jagannathan, Ravi, and Zhenyu Wang, 1996, The conditional CAPM and the cross-section of expected returns, The Journal of Finance 51, 3–53.

Kelly, Bryan, and Hao Jiang, 2014, Tail risk and asset prices, The Review of Financial Studies 27, 2841–2871.

Kostakis, Alexandros, Nikolaos Panigirtzoglou, and George Skiadopoulos, 2011, Market timing with option-implied distributions: A forward-looking approach, Management Science 57, 1231–1249.

Kozhan, Roman, Anthony Neuberger, and Paul Schneider, 2013, The skew risk premium in the equity index market, The Review of Financial Studies 26, 2174–2203.

Kraus, Alan, and Robert H Litzenberger, 1976, Skewness preference and the valuation of risk assets, The Journal of Finance 31, 1085–1100.

Lintner, John, 1965, Security prices, risk, and maximal gains from diversification, The Journal of Finance 20, 587–615.

Liu, Jianan, Robert F Stambaugh, and Yu Yuan, 2018, Absolving beta of volatility’s effects, Journal of Financial Economics 128, 1–15.

Mehra, Rajnish, and Edward C Prescott, 1985, The equity premium: A puzzle, Journal of Monetary Economics 15, 145–161.

Mossin, Jan, 1966, Equilibrium in a capital asset market, Econometrica 34, 768–783.

Nofsinger, J.R., and A. Varma, 2013, Availability, recency, and sophistication in the repurchasing behavior of retail investors, Journal of Banking & Finance 37, 2572–2585.

Polkovnichenko, Valery, and Feng Zhao, 2013, Probability weighting functions implied in options prices, Journal of Financial Economics 107, 580–609.

Prelec, Drazen, 1998, The probability weighting function, Econometrica 66, 497–527.

Quiggin, John, 1982, A theory of anticipated utility, Journal of Economic Behavior & Organization 3, 323–343.

Quiggin, John, 2012, Generalized expected utility theory: The rank-dependent model (Springer Science & Business Media).

Roll, Richard, and Stephen A Ross, 1994, On the cross-sectional relation between expected returns and betas, The Journal of Finance 49, 101–121.

Rosenberg, Joshua V, and Robert F Engle, 2002, Empirical pricing kernels, Journal of Financial Economics 64, 341–372.

Ross, S., 2015, The recovery theorem, The Journal of Finance 70, 615–648.

Schmeidler, David, 1989, Subjective probability and expected utility without additivity, Econometrica 57, 571–587.

Sharpe, William F, 1964, Capital asset prices: A theory of market equilibrium under conditions of risk, The Journal of Finance 19, 425–442.

Shefrin, Hersh, 2008, A Behavioral Approach to Asset Pricing (New York: Elsevier).

Steiner, Jakub, and Colin Stewart, 2016, Perceiving prospects properly, American Economic Review 106, 1601–1631.

Tanaka, Tomomi, Colin F Camerer, and Quang Nguyen, 2010, Risk and time preferences: Linking experimental and household survey data from Vietnam, American Economic Review 100, 557–71.

Tversky, Amos, and Daniel Kahneman, 1973, Availability: A heuristic for judging frequency and probability, Cognitive Psychology 5, 207–232.

Tversky, Amos, and Daniel Kahneman, 1979, Prospect theory: An analysis of decision under risk, Econometrica 47, 263–291.

Tversky, Amos, and Daniel Kahneman, 1992, Advances in prospect theory: Cumulative representation of uncertainty, Journal of Risk and Uncertainty 5, 297–323.

Woodford, Michael, 2012, Prospect theory as efficient perceptual distortion, American Economic Review 102, 41–46.

Wu, George, and Richard Gonzalez, 1996, Curvature of the probability weighting function, Management Science 42, 1676–1690.

Xia, Jianming, and Xun Yu Zhou, 2016, Arrow–Debreu equilibria for rank-dependent utilities, Mathematical Finance 26, 558–588.

Table 1 The values of B under different settings. The probability weighting function is that of Prelec with α = 0.5, 1 and 1.5. The utility function is CRRA with various values of relative risk aversion parameter γ. The distribution of the market portfolio’s total return is normal (µ = 1.076, σ = 0.158), skew-normal (µ = 1.076, σ = 0.158, skewness = −0.339), or log-normal (µ = 1.076, σ = 0.158). The parameters $\gamma^*$, $\underline{\gamma}(\alpha)$ and $\bar{\gamma}(\alpha)$ are defined in Proposition 2 and its proof (Appendix C).

Panel A: Overweighting case (α = 0.5)

Distribution of $\tilde{R}_M$ | Parameters | γ = 1 | γ = 2 | γ = 3 | γ = 4 | γ = 5
Normal | $\gamma^*$ = 1.5222 | -34.6634 | -39.9503 | -43.4616 | -44.6584 | -43.4897
Skew-normal | $\gamma^*$ = 2.3799 | -30.2979 | -36.4819 | -41.0653 | -43.2547 | -42.8122
Log-normal | $\gamma^*$ = 0.7816 | -32.5639 | -34.8329 | -35.4924 | -16.2108 | -12.7188

Panel B: No probability weighting case (α = 1)

Distribution of $\tilde{R}_M$ | Parameters | γ = 1 | γ = 2 | γ = 3 | γ = 4 | γ = 5
Normal | $\gamma^*$ = 1.5222 | -2.1102 | -6.5066 | -13.0431 | -21.2156 | -30.1970
Skew-normal | $\gamma^*$ = 2.3799 | -2.0897 | -6.3884 | -12.7085 | -20.5003 | -28.8738
Log-normal | $\gamma^*$ = 0.7816 | -2.1385 | -6.3540 | -12.2682 | -19.2142 | -26.2404

Panel C: Underweighting case (α = 1.5)

Distribution of $\tilde{R}_M$ | Parameters | γ = 1 | γ = 2 | γ = 3 | γ = 9 | γ = 10
Normal | $\gamma^*$ = 1.5222, $\underline{\gamma}(\alpha)$ = 1.4765, $\bar{\gamma}(\alpha)$ = 8.2577 | 8.9976 | 15.8929 | 20.7699 | -8.4330 | -20.4957
Skew-normal | $\gamma^*$ = 2.3799, $\underline{\gamma}(\alpha)$ = 1.4129, $\bar{\gamma}(\alpha)$ = 7.1000 | -1.3199 | 4.8151 | 9.0720 | -16.7676 | -26.2754
Log-normal | $\gamma^*$ = 0.7816, $\underline{\gamma}(\alpha)$ = 1.6202, $\bar{\gamma}(\alpha)$ = 8.8817 | 21.4745 | 28.4943 | 32.8405 | -1.3287 | -12.4790

Table 2 One-way sort results based on PWI1. The overweighting periods and underweighting periods are separated by the index PWI1. All periods (All), overweighting periods (Over), and underweighting periods (Under) include 244 months, 122 months, and 122 months, respectively. The spread portfolio (10-1) is constructed by longing the 10th portfolio and shorting the 1st portfolio. The t-statistics of the returns of the spread portfolios are reported. ***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively.

Panel A: covariance-sorted portfolios

Periods | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | (10-1) | t-statistic of (10-1)
All | 1.05% | 1.16% | 1.18% | 1.21% | 1.34% | 1.27% | 1.21% | 1.33% | 1.18% | 1.18% | 0.13% | (0.23)
Over | 1.09% | 1.16% | 1.21% | 1.25% | 1.47% | 1.63% | 1.63% | 2.08% | 2.24% | 2.75% | 1.65%* | (1.77)
Under | 1.01% | 1.16% | 1.16% | 1.16% | 1.22% | 0.91% | 0.80% | 0.58% | 0.12% | -0.40% | -1.40%** | (-2.48)

Panel B: coskewness-sorted portfolios

Periods | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | (10-1) | t-statistic of (10-1)
All | 1.73% | 1.46% | 1.42% | 1.40% | 1.20% | 1.14% | 1.20% | 0.93% | 0.91% | 0.72% | -1.01%*** | (-3.04)
Over | 2.85% | 2.24% | 1.97% | 1.79% | 1.47% | 1.49% | 1.53% | 1.07% | 1.05% | 1.04% | -1.81%*** | (-3.01)
Under | 0.61% | 0.67% | 0.87% | 1.01% | 0.93% | 0.79% | 0.87% | 0.80% | 0.78% | 0.39% | -0.21% | (-0.79)

Table 3 Two-way sort results based on PWI1. The overweighting periods (Over) and underweighting periods (Under) are separated by PWI1. The spread portfolio (5-1) is constructed by longing the 5th portfolio and shorting the 1st portfolio. The t-statistics of the returns of the spread portfolios are reported. ***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively.

Panel A: coskewness-covariance-sorted portfolios (rows: coskewness quintile; columns: covariance quintile)

Periods | coskewness | 1 | 2 | 3 | 4 | 5 | (5-1) | t-statistic of (5-1)
All | 1 | 1.40% | 1.36% | 1.74% | 1.55% | 1.07% | -0.32% | (-0.83)
All | 2 | 1.15% | 1.29% | 1.34% | 1.08% | 1.03% | -0.13% | (-0.29)
All | 3 | 1.11% | 1.23% | 1.30% | 1.20% | 1.23% | 0.13% | (0.31)
All | 4 | 0.92% | 1.02% | 1.18% | 1.23% | 1.23% | 0.31% | (0.70)
All | 5 | 0.86% | 1.27% | 1.04% | 1.09% | 1.37% | 0.51% | (0.87)
Over | 1 | 1.79% | 1.69% | 2.25% | 2.51% | 2.05% | 0.26% | (0.39)
Over | 2 | 1.13% | 1.23% | 1.56% | 1.49% | 2.37% | 1.25%* | (1.68)
Over | 3 | 1.08% | 1.19% | 1.37% | 1.53% | 2.37% | 1.29%* | (1.90)
Over | 4 | 0.83% | 1.00% | 1.24% | 1.55% | 2.39% | 1.56%** | (2.18)
Over | 5 | 0.67% | 1.43% | 1.46% | 2.08% | 3.01% | 2.33%** | (2.28)
Under | 1 | 1.01% | 1.04% | 1.22% | 0.59% | 0.09% | -0.91%** | (-2.30)
Under | 2 | 1.18% | 1.34% | 1.12% | 0.68% | -0.32% | -1.50%*** | (-3.36)
Under | 3 | 1.13% | 1.27% | 1.23% | 0.86% | 0.09% | -1.04%** | (-2.36)
Under | 4 | 1.02% | 1.04% | 1.12% | 0.90% | 0.08% | -0.94%* | (-1.91)
Under | 5 | 1.04% | 1.10% | 0.61% | 0.10% | -0.26% | -1.30%** | (-2.32)

Panel B: covariance-coskewness-sorted portfolios (rows: covariance quintile; columns: coskewness quintile)

Periods | covariance | 1 | 2 | 3 | 4 | 5 | (5-1) | t-statistic of (5-1)
All | 1 | 1.70% | 1.24% | 1.08% | 1.09% | 0.76% | -0.94%*** | (-2.71)
All | 2 | 1.59% | 1.29% | 1.25% | 1.05% | 0.80% | -0.79%*** | (-3.30)
All | 3 | 1.61% | 1.32% | 1.23% | 1.13% | 1.05% | -0.55%** | (-2.30)
All | 4 | 1.72% | 1.19% | 1.24% | 1.13% | 1.07% | -0.66%** | (-2.47)
All | 5 | 1.12% | 1.04% | 1.36% | 1.19% | 1.03% | -0.09% | (-0.33)
Over | 1 | 2.32% | 1.33% | 1.00% | 1.00% | 0.49% | -1.83%*** | (-2.88)
Over | 2 | 2.02% | 1.32% | 1.21% | 0.98% | 0.72% | -1.30%*** | (-3.11)
Over | 3 | 2.11% | 1.59% | 1.24% | 1.26% | 1.12% | -0.99%** | (-2.45)
Over | 4 | 2.79% | 1.93% | 1.70% | 1.42% | 1.43% | -1.35%*** | (-2.99)
Over | 5 | 2.42% | 2.32% | 2.63% | 2.59% | 2.29% | -0.13% | (-0.26)
Under | 1 | 1.08% | 1.16% | 1.15% | 1.18% | 1.03% | -0.05% | (-0.19)
Under | 2 | 1.16% | 1.26% | 1.28% | 1.11% | 0.87% | -0.29% | (-1.24)
Under | 3 | 1.10% | 1.04% | 1.23% | 1.01% | 0.99% | -0.12% | (-0.46)
Under | 4 | 0.66% | 0.46% | 0.78% | 0.84% | 0.70% | 0.04% | (0.14)
Under | 5 | -0.18% | -0.25% | 0.09% | -0.21% | -0.23% | -0.06% | (-0.21)

Table 4 Fama-MacBeth regression results. “All periods” is from January 1996 to April 2016, while “Underweighting periods” and “Overweighting periods” are determined by the PWI1 index. Panel A reports results without control variables. Panel B reports results with two, four, and six control variables. The average values of the coefficients, which are multiplied by 1000, are reported above and the corresponding t-statistics are reported below, in parentheses. ***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively. The row labelled “Adj. R2” reports the average adjusted R-squared of each regression. The row labelled “n” reports the average number of stocks in each cross-sectional regression.

Panel A: Fama-MacBeth regression results without control variables

 | All periods | Overweighting periods | Underweighting periods
A | 1.51 | 8.89** | -5.87***
 | (0.65) | (2.20) | (-2.74)
B | -0.44*** | -0.82*** | -0.05
 | (-2.87) | (-2.89) | (-0.53)
Adj. R2 | 0.02 | 0.02 | 0.02
n | 2695 | 2819 | 2571

Panel B: Fama-MacBeth regression results with control variables (within each period group, the three columns use two, four, and six control variables, respectively)

 | All periods | | | Overweighting periods | | | Underweighting periods | |
A | 1.24 | 2.30 | 1.72 | 7.47** | 8.74*** | 7.23** | -4.99*** | -4.14** | -3.79**
 | (0.65) | (1.20) | (1.03) | (2.28) | (2.65) | (2.59) | (-2.73) | (-2.35) | (-2.25)
B | -0.38*** | -0.33*** | -0.24** | -0.66*** | -0.60** | -0.43** | -0.09 | -0.06 | -0.05
 | (-2.94) | (-2.63) | (-2.47) | (-2.78) | (-2.56) | (-2.51) | (-1.05) | (-0.72) | (-0.53)
IVol | 0.01 | -0.06 | -0.05 | 0.10 | 0.01 | 0.01 | -0.09* | -0.12*** | -0.12***
 | (0.15) | (-1.34) | (-1.35) | (1.02) | (0.07) | (0.13) | (-1.69) | (-2.76) | (-2.69)
ISkew | -0.72 | -1.03* | -0.74 | -2.46** | -2.60** | -1.92** | 1.01 | 0.54 | 0.44
 | (-1.17) | (-1.68) | (-1.39) | (-2.32) | (-2.48) | (-2.18) | (1.64) | (0.89) | (0.74)
Size | | -1.66*** | -1.43*** | | -2.23** | -1.66** | | -1.09** | -1.21***
 | | (-3.27) | (-3.23) | | (-2.45) | (-2.13) | | (-2.43) | (-2.79)
BM | | 1.68** | 1.93*** | | 1.30 | 1.64 | | 2.06** | 2.22***
 | | (2.35) | (2.71) | | (1.11) | (1.39) | | (2.51) | (2.76)
Mom | | | 0.01 | | | -0.02 | | | 0.04**
 | | | (0.45) | | | (-0.59) | | | (2.25)
Rev | | | -0.34*** | | | -0.54*** | | | -0.15***
 | | | (-6.38) | | | (-5.94) | | | (-2.81)
Adj. R2 | 0.03 | 0.04 | 0.05 | 0.03 | 0.04 | 0.06 | 0.02 | 0.03 | 0.04
n | 2693 | 2691 | 2689 | 2817 | 2815 | 2813 | 2569 | 2567 | 2565

Table 5 The variables used to test for the dynamism of probability weighting, and their summary statistics. The data are collected from OptionMetrics (OM), the Center for Research on Security Prices (CRSP), Kenneth R. French’s data library (FF), the Economic Policy Uncertainty website (EPU), the survey data from Robert Shiller’s Investor Confidence Surveys (ICS), the Investor Sentiment Survey of the American Association of Individual Investors (AAII), or Jeffrey Wurgler’s website (BW). The starting months when the data are available are different for these indices, which are reported under the “Start Time” column, and the ending month is set to be April 2016 uniformly. For example, the period of VIX we use to compute summary statistics is from January 1990 to April 2016.

Panel A: Variable Descriptions

Index Name      Start Time  Source  Description

Option-implied variables:
PWI1            Jan. 1996   OM      The probability weighting index at the end of a month, derived from the S&P500 index options.
VIX             Jan. 1990   CBOE    The 30-day expected volatility of the U.S. stock market at the end of a month, derived from the S&P500 index options.

Return variables:
Maxave, Minave  Jun. 1968   CRSP    For each stock, the largest and smallest daily returns during a month are obtained. Maxave and Minave are the averages of the largest and smallest daily returns across all stocks, respectively.
Q01, Q99        Jun. 1968   CRSP    In each month, the monthly returns of all stocks form a cross-sectional distribution. Q01 (Q99) is the 1st (99th) percentile of this cross-sectional distribution.
MaxSP, MinSP    Jun. 1968   CRSP    MaxSP and MinSP are the largest and smallest daily returns of the S&P500 in each month.
σMKT            Jun. 1968   FF      The standard deviation of the daily excess returns of the market portfolio during a month.

Media variables:
EMV             Jan. 1985   EPU     The newspaper-based and Policy-Related Equity Market Volatility (EMV) tracker, which is based on eleven major U.S. newspapers (the Boston Globe, Chicago Tribune, Dallas Morning News, Houston Chronicle, Los Angeles Times, Miami Herald, New York Times, San Francisco Chronicle, USA Today, Wall Street Journal, and Washington Post) and is updated monthly.
FSI             Jun. 1968   EPU     The newspaper-based Financial Stress Indicator of the U.S. market, which is constructed from five U.S. newspapers (the Boston Globe, Chicago Tribune, Los Angeles Times, Wall Street Journal, and Washington Post) and is updated monthly.

Survey variables:
Crash           Jul. 2001   ICS     The average crash probability reported by individual investors, which is 1 minus the crash confidence level. The Crash Confidence Index is the percentage of individual respondents who think that the probability of a market crash in the next six months is strictly less than 10%.
Bull, Bear      Sep. 1987   AAII    The percentages of individual investors who are bullish and bearish on the stock market for the next six months.

Sentiment variable:
Sentiment       Jun. 1968   BW      The sentiment index in Baker and Wurgler (2006), based on the first principal component of five (standardized) sentiment proxies.
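As an illustration of the return variables described in Panel A, the following minimal Python sketch computes Maxave and Minave from daily returns and Q01 and Q99 from a cross-section of monthly returns. The function names, the input layout, and the linear-interpolation quantile convention are assumptions made for this example only; the variable descriptions above do not specify them.

```python
from statistics import mean


def max_min_ave(daily_returns_by_stock):
    """Return (Maxave, Minave) for one month.

    daily_returns_by_stock maps a stock identifier to the list of that
    stock's daily returns within the month (hypothetical input layout).
    """
    maxima = [max(r) for r in daily_returns_by_stock.values() if r]
    minima = [min(r) for r in daily_returns_by_stock.values() if r]
    return mean(maxima), mean(minima)


def cross_sectional_quantile(monthly_returns, q):
    """Quantile of the cross-section of monthly returns, 0 <= q <= 1.

    Linear interpolation between order statistics is an assumed convention.
    """
    xs = sorted(monthly_returns)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


# Toy data, for illustration only.
daily = {
    "AAA": [0.01, -0.02, 0.03],
    "BBB": [0.05, -0.01, 0.00],
    "CCC": [-0.04, 0.02, 0.01],
}
max_ave, min_ave = max_min_ave(daily)

monthly = [float(i) for i in range(101)]  # stand-in cross-section
q01 = cross_sectional_quantile(monthly, 0.01)
q99 = cross_sectional_quantile(monthly, 0.99)
print(max_ave, min_ave, q01, q99)
```

Applying the two functions month by month yields one time series per variable, in the same monthly form as the series summarized in Panel B.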

Panel B: Summary Statistics

            N     Mean     StDev   Min      P25      Median   P75      Max

Option-implied variables:
PWI1        244   1.51     1.21    0.00     0.40     1.31     2.44     5.10
VIX         316   19.85    7.52    10.42    14.02    18.24    23.73    59.89

Return variables:
Maxave      577   7.04     2.17    3.69     5.37     6.57     8.25     18.77
Minave      577   -5.74    1.75    -17.22   -6.75    -5.42    -4.43    -3.01
Q99         577   23.68    11.66   -3.45    16.57    21.93    29.01    94.24
Q01         577   -18.53   7.25    -53.59   -21.95   -17.29   -14.04   -2.82
MaxSP       577   1.80     1.11    0.34     1.10     1.53     2.18     11.58
MinSP       577   -1.73    1.39    -20.47   -2.09    -1.49    -1.00    -0.01
σMKT        577   14.01    8.16    4.41     9.24     11.73    15.84    78.92

Media variables:
EMV         377   19.68    8.11    8.03     14.76    17.59    22.30    69.84
FSI         577   100.87   0.84    98.76    100.30   100.80   101.32   105.89

Survey variables:
Crash       179   66.68    7.38    51.12    61.08    67.48    71.16    85.43
Bull        346   38.53    8.76    18.00    32.06    38.75    43.69    64.40
Bear        346   30.31    8.21    14.00    24.25    29.13    35.50    59.00

Sentiment variable:
Sentiment   577   0.08     1.00    -2.42    -0.37    0.09     0.58     3.20

Table 6
Panel A reports correlations among 15 variables. The correlations with Crash are computed based on the monthly data from January 2001 to April 2016. The other correlations are computed based on the monthly data from January 1996 to April 2016. Panel B reports the number of months for which an alternative variable and PWI1 give different labels of overweighting/underweighting. For Q01, Minave and MinSP, an overweighting period is one in which the value is smaller than the median of all periods. For the other indices, an overweighting period is one in which the value is larger than the median of all periods.

Panel A: Correlations among variables

           PWI1    VIX     Minave  Maxave  Q01     Q99     MinSP   MaxSP   σMKT    EMV     FSI     Crash   Bear    Bull
VIX        0.815
Minave     -0.715  -0.708
Maxave     0.729   0.726   -0.927
Q01        -0.402  -0.417  0.792   -0.561
Q99        0.338   0.375   -0.253  0.554   0.260
MinSP      -0.529  -0.601  0.752   -0.590  0.646   0.007
MaxSP      0.609   0.715   -0.750  0.715   -0.536  0.160   -0.738
σMKT       0.581   0.730   -0.818  0.701   -0.686  0.056   -0.893  0.904
EMV        0.589   0.611   -0.789  0.676   -0.652  0.033   -0.767  0.771   0.829
FSI        0.635   0.614   -0.756  0.705   -0.562  0.194   -0.660  0.634   0.679   0.704
Crash      0.516   0.586   -0.385  0.363   -0.264  0.143   -0.454  0.458   0.487   0.392   0.167
Bear       0.217   0.283   -0.178  0.035   -0.285  -0.241  -0.412  0.382   0.473   0.349   0.212   0.428
Bull       -0.160  -0.127  0.094   0.076   0.275   0.361   0.245   -0.183  -0.244  -0.205  -0.030  -0.351  -0.637
Sentiment  0.039   -0.126  -0.254  0.226   -0.270  0.063   0.035   -0.031  -0.038  0.101   0.107   -0.210  -0.189  0.087
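The labeling rule stated in the caption of Table 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper names and the toy numbers are hypothetical, both series are assumed to cover the same months, and the median is assumed to be taken over that common sample.

```python
from statistics import median

# For these indices a value BELOW the median marks an overweighting period
# (as stated in the Table 6 caption); for all others, a value ABOVE it does.
LOWER_IS_OVERWEIGHTING = {"Q01", "Minave", "MinSP"}


def label_overweighting(name, values):
    """Return a list of booleans, True where the month is labeled overweighting."""
    med = median(values)
    if name in LOWER_IS_OVERWEIGHTING:
        return [v < med for v in values]
    return [v > med for v in values]


def count_disagreements(labels_a, labels_b):
    """Number of months in which two label series differ."""
    return sum(a != b for a, b in zip(labels_a, labels_b))


# Toy monthly series, for illustration only (not the paper's data).
pwi1 = [0.2, 1.5, 2.4, 0.4, 3.0, 1.0]
minsp = [-0.5, -2.5, -1.0, -0.8, -3.0, -2.0]

pwi_labels = label_overweighting("PWI1", pwi1)
minsp_labels = label_overweighting("MinSP", minsp)
n_diff = count_disagreements(pwi_labels, minsp_labels)
proportion = n_diff / len(pwi1)
print(n_diff, proportion)
```

The count and its share of all months correspond to the "Number" and "Proportion" entries reported for each alternative variable.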

Panel B: Number of differently labeled months

Variable    Number  Proportion
VIX         38      15.6%
Minave      34      13.9%
Maxave      34      13.9%
Q01         79      32.4%
Q99         90      36.9%
MinSP       72      29.5%
MaxSP       68      27.9%
σMKT        76      31.1%
EMV         50      20.5%
FSI         58      23.8%
Crash       59      24.2%
Bear        113     46.3%
Bull        132     54.1%
Sentiment   115     47.1%

Figure 1. Prelec’s probability weighting functions with different α’s. When α < 1, it has an inverse S-shape (i.e., overweighting both tails). When α > 1, it has an S-shape (i.e., underweighting both tails). The point p∗ = e^{−1} separates the convex and concave domains of the probability weighting functions.

[Plot for Figure 1: w(p; α) on the vertical axis (0 to 1) against p on the horizontal axis (0 to 1), with three curves: overweighting (α = 0.5), no probability weighting (α = 1), and underweighting (α = 1.5).]
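The shapes described in the caption of Figure 1 can be checked numerically. The sketch below assumes the standard single-parameter Prelec form w(p; α) = exp(−(−ln p)^α), which is consistent with the point p∗ = e^{−1} named in the caption; the function name and the sample probabilities are illustrative choices.

```python
import math


def prelec_weight(p, alpha):
    """Single-parameter Prelec weighting, w(p) = exp(-(-ln p)^alpha).

    Assumed functional form; the endpoints are handled explicitly so that
    w(0) = 0 and w(1) = 1.
    """
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


# The fixed point p* = 1/e is shared by every alpha.
p_star = math.exp(-1.0)

for alpha in (0.5, 1.0, 1.5):
    small = prelec_weight(0.01, alpha)   # weight given to a small probability
    large = prelec_weight(0.99, alpha)   # weight given to a large probability
    fixed = prelec_weight(p_star, alpha)
    print(f"alpha={alpha}: w(0.01)={small:.4f}, w(0.99)={large:.4f}, w(1/e)={fixed:.4f}")
```

With α = 0.5 the small probability is inflated and the large one deflated (inverse S-shape), with α = 1.5 the pattern reverses (S-shape), and with α = 1 the function reduces to the identity.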

Figure 2. The pricing kernels and implied relative risk aversions for CRRA and RDUT, under both market scenarios. The CRRA model has relative risk aversion γ = 2 and the RDUT model uses Prelec’s probability weighting functions. The dotted black lines are those of the CRRA model, the solid red lines represent the RDUT model with overweighting of the tails, and the dashed blue lines show the RDUT model with underweighting of the tails. For Market I, the market return follows a skew-normal distribution with mean 7.6%, standard deviation 15.8%, and skewness -0.339. For Market II, the market return follows a skew-normal distribution with mean -7.6%, standard deviation 15.8%, and skewness -0.339. In (a), (b), α = 0.5 for overweighting and α = 1.5 for underweighting.

[Plots for Figure 2, four panels: (a) Pricing kernel (Market I); (b) Implied risk aversion at ReM = 1 (Market I); (c) Pricing kernel (Market II); (d) Implied risk aversion at ReM = 1 (Market II). Panels (a) and (c) plot the pricing kernel against the market return for overweighting (α = 0.5), CRRA (α = 1), and underweighting (α = 1.5). Panels (b) and (d) plot the implied risk aversion at initial wealth against α (0 to 2), with the overweighting, underweighting, and CRRA cases marked.]
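The market scenarios in Figure 2 are specified through the mean, standard deviation, and skewness of a skew-normal distribution. As a minimal sketch, the following Python code converts these three moments into the usual location, scale, and shape parameters using the standard moment formulas of the skew-normal family. The function names are hypothetical, and the code makes no claim about how the calibration was actually carried out for the figure.

```python
import math


def skew_normal_params(mean, sd, skew):
    """Return (location, scale, shape) of a skew-normal distribution
    matching the given mean, standard deviation, and skewness."""
    c = ((4.0 - math.pi) / 2.0) ** (2.0 / 3.0)
    s = abs(skew) ** (2.0 / 3.0)
    # delta = shape / sqrt(1 + shape^2), obtained by inverting the skewness formula
    delta = math.copysign(math.sqrt(math.pi / 2.0 * s / (s + c)), skew)
    if abs(delta) >= 1.0:
        raise ValueError("skewness is outside the range attainable by a skew-normal")
    shape = delta / math.sqrt(1.0 - delta ** 2)
    scale = sd / math.sqrt(1.0 - 2.0 * delta ** 2 / math.pi)
    location = mean - scale * delta * math.sqrt(2.0 / math.pi)
    return location, scale, shape


def skew_normal_skewness(shape):
    """Skewness implied by a shape parameter (used as a round-trip check)."""
    delta = shape / math.sqrt(1.0 + shape ** 2)
    m = delta * math.sqrt(2.0 / math.pi)
    return (4.0 - math.pi) / 2.0 * m ** 3 / (1.0 - m ** 2) ** 1.5


# Moments quoted in the caption: Market I and Market II differ only in the mean.
market_1 = skew_normal_params(0.076, 0.158, -0.339)
market_2 = skew_normal_params(-0.076, 0.158, -0.339)
print(market_1, market_2)
print(skew_normal_skewness(market_1[2]))  # should reproduce -0.339 up to rounding
```

Because the two markets share the same standard deviation and skewness, they share the same scale and shape parameters; only the location shifts.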

Figure 3. Relationship between A and α. The probability weighting function is that of Prelec with parameter α and the utility is CRRA with relative risk aversion γ = 2. The market return follows a skew-normal distribution with mean 7.6%, standard deviation 15.8%, and skewness -0.339.

[Plot for Figure 3: A on the vertical axis (-4 to 4) against α on the horizontal axis (0.5 to 1.5), with regions labeled Overweighting and Underweighting and a marked point α∗.]

Figure 4. Relationship between B and γ. The probability weighting function is that of Prelec with α = 0.5 or α = 1.5 and the utility is CRRA with relative risk aversion γ. The market return follows a skew-normal distribution with mean 7.6%, standard deviation 15.8%, and skewness -0.339.

[Plots for Figure 4, two panels of B (vertical axis) against γ (horizontal axis): (a) Overweighting case, α = 0.5; (b) Underweighting case, α = 1.5.]

Figure 5. Two typical option-implied probability weighting functions from our empirical study. The weighting function on September 30, 2008 is inverse S-shaped, implying overweighting of both tails. Among the 244 months tested, there are 126 months (about 51.6%) in which the implied weighting functions are inverse S-shaped. The weighting function on December 30, 2006 is S-shaped, yielding underweighting of both tails. Among the 244 months, there are 92 months (about 37.7%) in which the implied weighting functions are S-shaped.

[Plots for Figure 5, two panels of distorted probability (vertical axis, 0 to 1) against probability (horizontal axis, 0 to 1): (a) 2008-09-30; (b) 2006-12-30.]

Figure 6. Time series of the implied fear index (IFI), the implied hope index (IHI), and the probability weighting indices PWI1 and PWI2. The dashed black line in panel (a) separates overweighting and underweighting. The dashed blue lines in panels (b) and (c) are the medians of the indices, which also separate overweighting and underweighting.

[Plots for Figure 6, three panels against time (1996 to 2016): (a) Time series of IHI and IFI; (b) Time series of PWI1; (c) Time series of PWI2.]

Figure 7. The relationship between the average next month’s returns and the average covariance (coskewness) of the ten covariance-sorted portfolios (ten coskewness-sorted portfolios).

(a) Covariance-sorted portfolios

(b) Coskewness-sorted portfolios
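The construction behind Figure 7 can be sketched as a simple sort-and-average step. The code below is a minimal illustration under assumptions that the caption does not state: portfolios are equal-weighted, stocks are split into ten groups of nearly equal size, and each stock is represented by a hypothetical pair (sorting variable, next month's return).

```python
from statistics import mean


def sorted_portfolios(stocks, n_groups=10):
    """Sort stocks by the first element of each pair and form n_groups portfolios.

    stocks: list of (sort_value, next_month_return) pairs, where sort_value
    is the covariance or coskewness estimate (hypothetical input layout).
    Returns one (average sort_value, average next-month return) pair per
    portfolio, assuming equal weights within each portfolio.
    """
    ordered = sorted(stocks, key=lambda s: s[0])
    n = len(ordered)
    portfolios = []
    for g in range(n_groups):
        group = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue  # fewer stocks than groups
        avg_value = mean(v for v, _ in group)
        avg_return = mean(r for _, r in group)
        portfolios.append((avg_value, avg_return))
    return portfolios


# Toy cross-section, for illustration only.
toy = [(i, 2 * i) for i in range(20)]
result = sorted_portfolios(toy)
print(result[0], result[-1])
```

Repeating this step each month and averaging the ten pairs over time gives the points that such a figure would plot, one per portfolio.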

Figure 8. Plots of the probability weighting index (PWI1), VIX index, Equity Market Volatility (EMV) index, annualized volatility of the Fama-French market factor, maximum daily return of S&P500, minimum daily return of S&P500, crash belief index, and sentiment index from 1996 to 2016. The values are computed semiannually from the original monthly data. The probability weighting, crash belief, and sentiment indices are scaled or adjusted in order to fit into the same figure. More specifically, the probability weighting index is scaled to have the same standard deviation as VIX, the sentiment index is scaled to have half the standard deviation of VIX, and the crash belief index is adjusted to be the original crash belief index minus 55.
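The adjustments described in the caption of Figure 8 amount to a multiplicative rescaling and a constant shift. A minimal Python sketch follows; the helper names and toy numbers are hypothetical, the rescaling is assumed to be a pure multiplication (no demeaning), and the semiannual values are assumed to be averages over consecutive six-month blocks, which the caption does not specify.

```python
from statistics import mean, pstdev


def scale_to_std(series, target_std):
    """Multiply a series by a constant so that its standard deviation equals target_std."""
    current = pstdev(series)
    if current == 0:
        raise ValueError("cannot rescale a constant series")
    factor = target_std / current
    return [x * factor for x in series]


def semiannual_means(monthly):
    """Average consecutive blocks of six monthly values (assumed aggregation rule)."""
    return [mean(monthly[i:i + 6]) for i in range(0, len(monthly), 6)]


# Toy monthly series, for illustration only.
vix = [12.0, 18.0, 25.0, 40.0, 16.0, 14.0]
pwi1 = [0.4, 1.3, 2.4, 5.1, 1.0, 0.2]
sentiment = [0.1, -0.3, 0.6, -2.0, 0.2, 0.4]
crash = [61.0, 66.0, 70.0, 85.0, 64.0, 58.0]

pwi1_scaled = scale_to_std(pwi1, pstdev(vix))                 # same std as VIX
sentiment_scaled = scale_to_std(sentiment, 0.5 * pstdev(vix))  # half the std of VIX
crash_adjusted = [c - 55.0 for c in crash]                     # original minus 55
print(semiannual_means(pwi1_scaled), semiannual_means(crash_adjusted))
```

Whether the rescaling is applied before or after the semiannual aggregation is not stated in the caption; the sketch rescales the monthly series first.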

[Plot for Figure 8: time series from 1996 to 2016 (vertical axis from -10 to 60) of the annual volatility of the FF market factor, the maximum daily return of S&P500, the minimum daily return of S&P500, the probability weighting index, crash belief, sentiment, VIX, and equity market volatility (EMV).]

Figure 9. Overweighting periods and underweighting periods generated by the probability weighting index (PWI1), VIX index (VIX), Equity Market Volatility index (EMV), average maximum daily return of all stocks (Maxave), and crash belief index (Crash) from 1996 to 2016. For each index, the higher part of the line corresponds to overweighting periods and the lower part of the line corresponds to underweighting periods.

[Plot for Figure 9: one line each for PWI1, Maxave, EMV, VIX, and Crash over 1996 to 2016.]