The Hidden Correlation of

Collateralized Debt Obligations

N. N.
Kellogg College, University of Oxford

A thesis submitted in partial fulfillment of the MSc in Mathematical Finance

April 13, 2009

Acknowledgments

I would like to thank my supervisor Dr Alonso Peña for his advice and help and for encouraging me to follow the approaches taken in this thesis. I want to thank my wife Blanca for her direct (by reading and correcting) and indirect (by motivating me) support of this work.

I would like to thank my former employer d-fine for the possibility to attend the MSc Programme in Mathematical Finance at the University of Oxford. For their engagement in this programme, I particularly would like to thank Dr Christoph Reisinger, the course director, and Prof Dr Sam Howison. Thanks also go to the other lecturers of the Mathematical Finance Programme. Finally, I thank my friend Ari Pankiewicz for his effort in reading and correcting this thesis.

Abstract

We propose a model for the correlation structure of reference portfolios of collateralized debt obligations. The model is capable of exhibiting typical characteristics of the implied correlation smile (or skew, respectively) observed in the market. Moreover, it features a simple economic interpretation and is computationally inexpensive, as it naturally integrates into the factor model framework.

Contents

List of Figures

List of Tables

1 Introduction to Collateralized Debt Obligations
  1.1 The CDO Market
  1.2 Valuation of STCDOs
  1.3 Outline

2 Modeling of Multivariate Default Risk
  2.1 Structural Models
  2.2 Copula Functions
  2.3 Factor Models

3 Default Correlation
  3.1 Implied Correlation
  3.2 Correlation Smile

4 Modeling the Correlation Matrix
  4.1 Empirical Correlations and RMT
  4.2 Group Models
  4.3 Attainable Correlation Smiles

5 Numerical Implementation
  5.1 Integration Schemes
  5.2 Existence and Uniqueness of Correlation
  5.3 Performance

Conclusions


A Inverse Transform Sampling

B Correlation in Student-t Factor Model

C Used Parameters

Bibliography

List of Figures

3.1 Example of correlation smile/skew observed in the market.
3.2 Loss distribution, un-skewed vs. skewed.

4.1 Example of implied correlations for non-flat correlation matrix.
4.2 Implied corr. for varying µ in the two-layer hierarchical model.
4.3 Implied corr. for varying ∆ in the two-layer hierarchical model.
4.4 Implied corr. for varying n in the two-layer hierarchical model.
4.5 Implied corr. for varying ν in the two-layer hierarchical model.

4.6 Implied corr. for varying ∆n in the two-layer hierarchical model.
4.7 Implied corr. for varying µρ in the two-layer hierarchical model.
4.8 Implied corr. for varying ∆ρ in the two-layer hierarchical model.
4.9 Implied corr. for varying ∆n in the block model.
4.10 Implied corr. for varying ∆ρ in the block model.

5.1 Integration schemes for the 1-factor Gaussian copula.
5.2 Integration schemes for the 2-factor Student-t copula.
5.3 Integration schemes for the 2-factor Gaussian copula.
5.4 Uniqueness and existence of implied compound correlations.

List of Tables

4.1 Types of group models.

5.1 Test cases for the validation of the developed program code.
5.2 Tranche spreads: LHP vs. numerical results in factor model framework.
5.3 CPU times for CDO pricing in different parameter regimes.

C.1 Used parameters for the validation of integration schemes.
C.2 Used parameters for the implied correlations of group models.

Chapter 1

Introduction to Collateralized Debt Obligations

1.1 The CDO Market

After rapid growth over the past ten years, the market for collateralized debt obligations (CDOs) has seen drastic and devastating changes in the months prior to and during the creation of this thesis. Being at the center of the subprime mortgage crisis, the value of existing CDO positions has dropped sharply (if a price can be assigned at all, given the almost non-existent market) and the issuance of new CDOs slowed down by almost 90% in 2008 compared to 2007 [1]. For the riskiest CDO tranches on subprime mortgages, the term "toxic waste" has been coined, and the underlying mortgage-backed securities (MBS) are the main focus of the US$ 700 billion "Troubled Assets Relief Program" (commonly referred to as the "bailout") of the US government [2]. We will later come to the involvement of CDOs in these recent developments, but first lay out the basic ideas behind CDOs.

CDOs are structured finance products that transfer and distribute credit risk on a reference portfolio of assets into tranches of increasing seniority. The risk is transferred from the arranger of the CDO, who acts as buyer of protection, to the investor, who acts as seller of protection. The common feature of all CDOs is the splitting of the underlying reference portfolio into different tranches, which defines a sequential allocation of the losses that occur in the portfolio. The lowest-lying tranche (equity tranche) is the first one to absorb losses, with the losses resulting in a payment from the investor to the arranger (and a decrease of the premiums paid to the investor, see below), up to a coverage limit which defines the size of the tranche (e.g. 3% of the CDO notional). If the cumulative losses exceed this limit, the next tranche (mezzanine tranche, e.g. 3%-6%) is affected, and so on. The CDO investors thus

take on exposure to a particular tranche, where the higher risk of the lower (junior) tranches is naturally rewarded with a higher premium, whereas the higher (senior) tranches are protected by the junior tranches, hence yielding a much lower spread.

In essence, CDOs are based on the idea that the idiosyncratic risk of the names in the portfolio can be diversified out and that the diversified pool of risky assets has a relatively predictable loss pattern, where losses hardly ever touch the senior tranches. This allows the senior tranches to achieve a higher rating than the average rating of the underlying portfolio. However, the residual systematic risk crucially depends on the dependency structure (loosely termed "correlation" in the following, see Chapters 2 and 3 for details) of the portfolio and gives rise to correlation risk and correlation trading. Importantly, the correlation risk varies among the tranches: a high correlation benefits equity investors, whereas the price of senior tranches drops as correlation rises, see Chapter 3.

Originally, the aim of the first CDO deals was the management of the balance sheet of the originating banks by making use of the possibility to transfer credit risk and free up regulatory capital (balance-sheet CDOs), with the underlying assets typically being cash assets like bank loans, bonds, or asset-backed securities (ABS, MBS). The ownership of these assets then needs to be transferred to a separate legal entity (SPV, special purpose vehicle) which acts as arranger of the CDO and issues the tranches (cash CDO). In contrast, synthetic CDOs are backed by credit default swaps (CDS); the arranger typically does not hold these CDS positions beforehand but enters into them only in order to set up the CDO. This is done to take advantage of a possible spread difference between the average yield of the underlying CDS portfolio and the spread paid on the tranches (arbitrage CDO)¹. The spread difference can be seen as a service charge for bundling the portfolio and reflects the benefit for the investor, who otherwise would have to assemble the portfolio himself.

As a further step in the product development of synthetic CDOs, single-tranche CDOs (STCDOs) entered the market in 2003 and in the same year already accounted for 90% of the issued synthetic CDOs [12]. In a STCDO, only one isolated tranche is sold to an investor, thereby carving out a piece of the loss profile. From the investor's point of view, the popularity of STCDOs originates from the flexibility in the risk structure (the underlying credit portfolio, level of subordination, and tranche size can be chosen by the investor according to his needs) while providing a substantially higher yield than similarly rated investments, whereas the arrangers appreciate the comparably

¹ An arbitrage CDO can also be set up as a cash CDO.

easy set-up of the transaction. However, as the arranger sells only part of the CDO structure, he exposes himself to the "remaining" credit risk and needs to hedge that risk. STCDOs are often written on standardized CDS indices like CDX and iTraxx, which facilitates this hedging for the arranger. Obviously, STCDOs are leveraged products with respect to the spreads of the underlying names: if the portfolio losses increased from, e.g., 3% to 6% due to a change in the credit spreads of the same order of magnitude, a 3%-6% STCDO would suffer a total loss. In the course of the product evolution, more sophisticated products have also appeared on the market, of which we only want to give two examples: CDO² are CDOs backed by tranches of other, underlying CDOs (and, typically, ABS), where additional leverage is achieved by the overlap of names in the reference portfolios of the underlying CDOs (this can be further enhanced to CDOⁿ). Callable STCDOs are the equivalent of Bermudan options in equity markets and are particularly difficult to price due to their loss path dependence (cp. footnote 2 in Chapter 2). The focus of this work lies on standard CDOs, though.

With respect to financial stability, the CDO market was believed to have a positive impact, as it facilitates the distribution of credit risk to non-bank investors and increases the degree of completeness of the credit market [12]. However, the narrowing of corporate spreads in the years before the financial crisis (which, due to the leverage, had a multiplicative effect on the CDO spreads) increased the pressure to find higher-yield/higher-risk underlyings to keep engagement in CDOs profitable for investors and arrangers, bringing subprime mortgage loans into focus. However, this selective choice of loans also implied a higher correlation than anticipated from historical estimates: the subprime loans are more correlated because they are all underwritten with weak credit standards. This was not adequately reflected by the credit rating agencies, which continued to assign first-class ratings to the senior CDO tranches, nor by the market participants, who continued to invest. As default rates on mortgage loans started to rise in 2007, the affected CDOs experienced severe losses and rating downgrades. For the case of synthetic CDOs, systemic risk also emerges from the fact that the number of liquid CDS names is limited and arrangers are forced to constantly draw from the same pool of names. Consequently, the same name is contained in many CDO deals, greatly amplifying the impact of a default. Also, an important point to consider is the back-action of the growth of CDO issuance on other markets: as pointed out above, the arranger of a STCDO needs

to hedge the credit risk he is exposed to, namely by selling protection on the CDS market, thereby contributing to the narrowing of the spreads. Due to the leverage of the STCDOs, this has to be done with a multiple of the notional value of the sold tranches. Moreover, these developments were probably supported by the high complexity of the involved products, with models and techniques that were not always sufficiently tested and understood. Especially on the investor's side, the lack of expert knowledge probably led to CDO investments without the ability to correctly assess and manage the associated risk.

1.2 Valuation of STCDOs

As pointed out above, the investor in a STCDO covers only the losses of the tranche he is invested in, i.e. the losses in the interval $[a_L, a_H]$, where $a_L$ ($a_H$) is referred to as the lower (upper) attachment point and the size (or notional) of the tranche is defined as $a_H - a_L$. Each loss within this tranche results in a payment equal to the loss amount from the investor to the arranger; in analogy to a credit default swap, this is called the protection leg of the STCDO. As compensation, the investor receives periodic payments (premium leg), usually paid quarterly in arrears², at a spread $s$ on an outstanding principal which is initially equal to the notional of the tranche and stochastically declines with each loss by the size of the loss. Based on this, the STCDO can be valued as a swap contract, with its value (from the investor's point of view) equal to the expected present value of the premium payments less the expected present value of the protection payments. The expected tranche loss at time $t$ reads

$$
\mathbb{E}\left[L_{tr}(t)\right] = \mathbb{E}\left[\left(\min(L(t), a_H) - a_L\right)^+\right], \tag{1.1}
$$

where the expectation is taken with respect to the risk-neutral measure and $L(t)$ is the portfolio loss, see Eq. (2.1). If the loss distribution $F(x;t) \equiv \mathbb{P}(L(t) \le x)$ is known (e.g. in the LHP approximation, see Sec. 2.3.5), the expected tranche loss can be written as

$$
\mathbb{E}\left[L_{tr}(t)\right] = \int_{a_L}^{I} \left(\min(x, a_H) - a_L\right) dF(x; t), \tag{1.2}
$$

where $I$ is the maximal loss, i.e. the initial portfolio amount.

² Except for the equity tranche, where a part of the premium is usually paid upfront.


For a portfolio of finite size, we use the (numerically obtained) discrete loss probabilities $P(w; t) \equiv \mathbb{P}(L(t) = wu)$ (see Sec. 2.3.3 for the meaning of the loss unit $u$),

$$
\mathbb{E}\left[L_{tr}(t)\right] = \sum_{w} P(w; t)\left(\min(wu, a_H) - a_L\right)^+. \tag{1.3}
$$

With this, the value of the premium leg evaluates to

$$
V_{\mathrm{premium}} = s \sum_{i=1}^{n} D(t_i)\,\Delta t_i \left(a_H - a_L - \mathbb{E}\left[L_{tr}(t_i)\right]\right), \tag{1.4}
$$

where $s$ is the spread, $0 < t_1 < \ldots < t_n = T$ are the payment dates, $D(t_i)$ is the discount factor for time $t_i$, and $\Delta t_i \approx t_i - t_{i-1}$ is the accrual factor. The value of the protection leg reads

$$
V_{\mathrm{protection}} = \mathbb{E}\left[\int_0^T D(t')\, dL_{tr}(t')\right]; \tag{1.5}
$$

however, one commonly assumes that the payments are made on the next periodic payment date after the default (otherwise, one would also have to compute the loss distribution for each banking day in between the payment dates), thereby essentially neglecting the interest that the payment would accrue since the default,

$$
V_{\mathrm{protection}} = \sum_{i=1}^{n} D(t_i)\left(\mathbb{E}\left[L_{tr}(t_i)\right] - \mathbb{E}\left[L_{tr}(t_{i-1})\right]\right). \tag{1.6}
$$

The fair (or par) spread $s_{\mathrm{Par}}$ is the spread that sets the total value of the tranche to zero,

$$
s_{\mathrm{Par}} = \frac{V_{\mathrm{protection}}}{\sum_i D(t_i)\,\Delta t_i \left(a_H - a_L - \mathbb{E}\left[L_{tr}(t_i)\right]\right)}. \tag{1.7}
$$

Hence, central to the valuation of the STCDO is the determination of the portfolio loss distribution $F(x;t)$, respectively $P(w;t)$. This will be the main focus of Chapter 2.
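To make the valuation formulas concrete, the following minimal Python sketch evaluates Eqs. (1.3)-(1.7) for a single tranche. It assumes the discrete loss probabilities $P(w; t_i)$, the loss unit, discount factors, and accrual factors are supplied from elsewhere; all function and variable names are illustrative, not part of the thesis code.

```python
import numpy as np

def expected_tranche_loss(loss_probs, loss_unit, a_L, a_H):
    """Eq. (1.3): E[L_tr(t)] from the discrete loss probabilities P(w; t)."""
    w = np.arange(len(loss_probs))
    tranche_loss = np.clip(np.minimum(w * loss_unit, a_H) - a_L, 0.0, None)
    return loss_probs @ tranche_loss

def par_spread(loss_probs_per_date, loss_unit, a_L, a_H,
               discount_factors, accrual_factors):
    """Eqs. (1.4)-(1.7): par spread of the [a_L, a_H] tranche;
    loss_probs_per_date[i] holds P(w; t_i) for payment date t_i."""
    etl = np.array([expected_tranche_loss(p, loss_unit, a_L, a_H)
                    for p in loss_probs_per_date])
    # Premium leg per unit spread, Eq. (1.4): outstanding notional per date.
    premium_pv01 = np.sum(discount_factors * accrual_factors
                          * (a_H - a_L - etl))
    # Protection leg, Eq. (1.6): losses are paid on the next payment date.
    protection = np.sum(discount_factors
                        * (etl - np.concatenate(([0.0], etl[:-1]))))
    return protection / premium_pv01       # Eq. (1.7)
```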

1.3 Outline

From the discussion in Sec. 1.1 above, it should have become apparent that the correlation among the names in the reference portfolio of a CDO is of crucial importance. It is therefore the primary topic of this thesis. In Chapter 2, we will cover the basics of the modeling of multivariate default risk in order to derive the loss distribution. It will become clear how the correlation

between the names in the underlying portfolio of the CDO enters into the valuation. The Gaussian copula framework with flat correlation is introduced as the standard market model for the valuation of CDOs.

The different concepts of historical vs. implied correlation are pointed out in Chapter 3, with the emphasis on the two methodologies for implied correlation, namely compound and base correlation. The properties, advantages, and pitfalls of these two measures are discussed. The starting point for the investigations in this thesis is the empirical observation that the market reports a non-constant correlation across the different tranches of a CDO (called smile, or skew, respectively) that contradicts the assumption of a constant correlation in the Gaussian copula framework.

In order to reproduce the correlation smile, a model for the correlation matrix is proposed in Chapter 4, based on the investigation of empirical correlation matrices of stock price returns and on economic intuition. The shapes of the implied correlation curves emerging from this model are analyzed in detail for different parameter regimes.

Chapter 5 finally focuses on some numerical issues that one encounters when implementing a CDO valuation code.

Chapter 2

Modeling of Multivariate Default Risk

The aim of this chapter is the derivation of the portfolio loss distribution $F(x;t)$ in Eq. (1.2), respectively the discrete loss probabilities¹ $P(w;t)$ from Eq. (1.3). To this end, we employ different techniques from default risk modeling like structural models, copulas, and factor models. Let us introduce some notation: we consider a reference portfolio of $N$ obligors subject to default risk with random default times $\tau_1, \tau_2, \ldots, \tau_N$. Modeling of multivariate default risk denotes the modeling of the joint distribution $H(t_1, t_2, \ldots) \equiv \mathbb{P}(\tau_1 \le t_1, \tau_2 \le t_2, \ldots)$ with marginal CDFs $p_i(t) \equiv \mathbb{P}(\tau_i \le t) = \mathbb{E}[\mathbf{1}_{\tau_i \le t}]$. $p_i(t)$ is called the default probability of name $i$ and is assumed to be known, calibrated from credit curves (e.g. risky bonds or credit default swaps) on name $i$. Thus, one only needs to specify the dependency function (copula) for a full characterization of $H$, see Sec. 2.2.²

The default of obligor $i$ results in a loss $l_i = N_i(1 - R_i)$ (for recovery-type CDOs), where $N_i$ is the notional of the $i$-th credit and $R_i \in [0, 1]$ is its recovery rate, i.e. the percentage of the notional that is expected to be recovered in case of default by selling assets of the obligor. The random portfolio loss is then given by

$$
L(t) = \sum_{i=1}^{N} l_i\, \mathbf{1}_{\tau_i \le t}, \tag{2.1}
$$

¹ We will henceforth use the term "loss distribution" also for $P(w;t)$.
² To be more precise, we point out that the valuation of a CDO doesn't require knowledge of the full joint distribution of default times, which also contains all inter-temporal correlations, but rather the portfolio loss distribution, which is the marginal distribution of losses at a fixed point in time. As described in Ref. [5], for the CDO valuation only the loss distribution at the discrete set of payment dates is needed.

with expectation value

$$
\mathbb{E}\left[L(t)\right] = \sum_{i=1}^{N} \int_0^t \mathbb{E}\left[l_i \mid \tau_i = s\right] dp_i(s). \tag{2.2}
$$

Note that no default dependence enters here, i.e. the expected loss over the whole portfolio (respectively tranche structure) depends only on the single-name default probabilities. The expected loss suffered by part of the structure, like a tranche, is significantly affected by the default dependence between the obligors, though. This makes the modeling of the default dependence in Sec. 2.2 a key issue for the valuation and risk management of CDOs. As a first step, we cover the modeling of univariate default risk, though. The factor model approach that is followed in Sec. 2.3 derives from Merton's structural model approach [33] for univariate default risk; we will therefore introduce structural models in the following.³

2.1 Structural Models

Structural models (also termed asset value models or firm value models) define the time of default of a firm as the moment when the firm’s assets fall below its liabilities.

The asset value is modeled as a random process $X(t)$ and, looking at a fixed time scale (maturity) $t$, default is given by

$$
\tau \le t \;\Leftrightarrow\; X(t) \le C, \tag{2.3}
$$

where the barrier $C$ stands for the liabilities. The Black-Cox approach [8] extends this model by allowing default at any time before maturity, at the first hitting time of a (possibly time-varying) barrier,

$$
\tau = \inf\{t : X(t) \le C(t)\}. \tag{2.4}
$$

We will focus on the simplified case of static models that suppress the time dependence of $X$ by treating it as a random variable; let $C$ be an increasing function of time. This relates the random variables $X$ and $\tau$ by $X = C(\tau)$.

³ We will not cover intensity-based models [25] in this work. We will, however, sometimes use the notion of the hazard rate $h$, borrowed from the intensity framework. For our purposes, the default probability and the hazard rate are related by $p(t) = 1 - e^{-ht}$.


In order to calibrate the static model to the market implied default curve p(t), the level of the barrier is adjusted. We have

$$
p(t) = \mathbb{P}(X \le C(t)) = \mathbb{P}(F_X(X) \le F_X(C(t))) = \mathbb{P}(U \le F_X(C(t))) = F_X(C(t)), \tag{2.5}
$$

where $F_X$ is the distribution of $X$, and $U$ denotes a uniformly distributed random variable (we have used the fact that $F_X(X)$ is uniformly distributed, cp. Appendix A). The calibration equation for $C(t)$ then reads

$$
C(t) = F_X^{-1}(p(t)). \tag{2.6}
$$
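For illustration, a short Python sketch of the calibration (2.6) for a Gaussian latent variable $X \sim \mathcal{N}(0,1)$ and the flat hazard rate of footnote 3; the hazard rate value and function names are assumptions made for this example only.

```python
import numpy as np
from scipy.stats import norm

def default_barrier(t, h):
    """Eq. (2.6) with F_X = Phi: barrier level reproducing the
    default curve p(t) = 1 - exp(-h*t) of footnote 3."""
    p_t = 1.0 - np.exp(-h * t)     # default probability from a flat hazard rate
    return norm.ppf(p_t)           # C(t) = F_X^{-1}(p(t))

# Example: 5-year barrier for a (hypothetical) 2% flat hazard rate.
print(default_barrier(5.0, 0.02))  # approx. -1.31
```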

Structural models naturally extend to the modeling of multiple dependent defaults since a default dependence can easily be incorporated by a correlation between the individual asset processes (variables, respectively). This will be useful in the context of factor models below.

2.2 Copula Functions

The specification of a joint distribution of default times is central to the valuation of basket credit derivatives. It is economically plausible and empirically shown that default events of obligors are not independent of each other but exhibit positive dependence, i.e. due to economic cycles or firm interactions, defaults tend to cluster together. Consequently, the joint default probability cannot simply be modeled by the product of the individual default probabilities. Obviously, the marginal single-name default probabilities discussed in Sec. 2.1 don't uniquely fix the joint distribution. A convenient and powerful approach to specify the missing part, i.e. the default dependence between the obligors, is the use of copula functions. In general, copulas allow for the separate treatment of marginal distributions and dependency structure by linking given marginal distributions to a joint distribution. In this section a very brief and informal introduction to copulas is given; see e.g. Ref. [14] for further reading and proofs. A copula is defined as a multivariate distribution function on the unit hypercube with uniformly distributed marginals,

$$
C(u_1, \ldots, u_n) = \mathbb{P}(U_1 \le u_1, \ldots, U_n \le u_n), \qquad C(1, \ldots, 1, u_i, 1, \ldots, 1) = u_i, \quad u_i \in [0, 1]. \tag{2.7}
$$

The idea of the analysis of dependency with copula functions is expressed by Sklar's theorem [36]: if $F(x_1, \ldots, x_n) : \mathbb{R}^n \to [0, 1]$ is an $n$-dimensional distribution function


with marginal distributions $F_1(x_1), \ldots, F_n(x_n)$, then there exists a copula function $C$ ($C$ is unique if the $F_i(x_i)$ are all continuous) such that for all $(x_1, \ldots, x_n) \in \mathbb{R}^n$

$$
F(x_1, \ldots, x_n) = C(F_1(x_1), \ldots, F_n(x_n)) \tag{2.8}
$$

holds, i.e. $C$ joins the marginal distributions to the joint distribution.

In practice, if a joint distribution function with a suitable dependency structure has been identified, the copula containing the dependency structure can be extracted by

$$
C(u_1, \ldots, u_n) = F(F_1^{-1}(u_1), \ldots, F_n^{-1}(u_n)). \tag{2.9}
$$

Then, a set of given marginal distributions (e.g. the individual credit curves $p_i(t)$ obtained from the market) is plugged into Eq. (2.8), resulting in a joint distribution function with the given marginal distributions and the extracted dependency structure. For the popular multivariate Gaussian, the extracted Gaussian copula reads ($\Phi$ being the standard normal CDF)

$$
C_\rho(u_1, \ldots, u_n) = \Phi_\rho(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_n)) = \frac{1}{(2\pi)^{n/2}\sqrt{\det\rho}} \int_{-\infty}^{\Phi^{-1}(u_1)} \!\!\cdots\! \int_{-\infty}^{\Phi^{-1}(u_n)} \exp\left(-\frac{1}{2}\mathbf{x}^T \rho^{-1} \mathbf{x}\right) d\mathbf{x}. \tag{2.10}
$$

For the special case of the Gaussian copula, the dependency structure is fully described by the matrix of pairwise correlations $\rho$. The (linear) correlation between two random variables $X$ and $Y$ with nonzero finite standard deviations $\sigma(X)$ and $\sigma(Y)$ is defined as

$$
\rho(X, Y) = \frac{\mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]}{\sigma(X)\,\sigma(Y)}. \tag{2.11}
$$

Linear correlation is a measure of linear dependence, since $Y = aX + b$ almost surely iff $|\rho(X, Y)| = 1$. For joint distributions other than the multivariate Gaussian, the correlation does not completely define the dependency structure.⁴ Other measures of dependency have been proposed; however, since the Gaussian copula is the industry standard in credit risk modeling, the correlation is commonly used as measure of dependency, despite its limited applicability. Due to its particular importance for the valuation of CDOs, correlation is also used as a quotation device, see Chapter 3.

⁴ This even applies to other distributions belonging to the class of elliptical distributions, like the Student-t distribution, see Appendix B.


2.3 Factor Models

Factor models are a widely used simplification in the context of credit risk modeling and provide an efficient way to deal with a large number of credits with default dependence [18]. We assume a low- ($d$-)dimensional random variable, the factor,

$Z \sim G_Z$, conditional upon which default times become independent, i.e. the $Z$-conditional defaults are assumed independent events with probability

$$
\mathbb{P}(\tau_i \le t \mid Z = z) \equiv p_i(t \mid Z). \tag{2.12}
$$

The independence of the conditional default probabilities allows for the application of recursion arguments that lead to large gains in the efficiency of the calculation of the (discrete) conditional loss probabilities $P(w; t \mid Z) \equiv \mathbb{P}(L(t) = wu \mid Z)$, see Sec. 2.3.3. The unconditional loss distribution then follows from integrating $P(w; t \mid Z)$ over the distribution $G_Z$ of $Z$,

$$
P(w; t) = \int_\Omega P(w; t \mid Z = z)\, dG_Z(z). \tag{2.13}
$$

The integration domain $\Omega$ depends on the properties of the random variable(s) that are taken as factor(s), e.g. $\Omega = \mathbb{R}^d$ for a $d$-dimensional Gaussian factor structure, see Sec. 2.3.1, and $\Omega = \mathbb{R}^d \times \mathbb{R}_+$ for a $d$-dimensional Student-t factor structure, see Sec. 2.3.2.

The form of the conditional default probability pi(t|Z): R+ ×Ω → [0, 1] emerges from the used factor model with the constraint that the (unconditional) default probabili- ties of the individual firms are preserved,

Z ! pi(t|Z = z) dGZ (z) = pi(t). (2.14) Ω Following the structural model approach and notation, the random variable X de- scribing the asset value in a factor model is given by 5 q 2 Xi = βiZ + 1 − βi i. (2.15) The systematic factor Z (which can be thought of as a common macroeconomic driver or, loosely speaking, ’the market’) is shared by all Xi, and i is the residual driver idiosyncratic to firm i and is independent of Z and j ∀j 6= i. The β’s are called factor loadings. In the following, we describe two common models: the Gaussian factor model and the Student-t factor model. 5 The products in general need to be taken as scalar products, βiZ ≡ βi · Z, βiβj ≡ βi · βj , 2 βi ≡ βi · βi


2.3.1 Gaussian Copula

The Gaussian copula model [28] is the industry standard model. For a Gaussian copula, $Z$ and $\epsilon$ are independent $\mathcal{N}(0, 1)$ distributed. Being a weighted sum of two

standard Gaussian random variables, $X_i$ is also standard Gaussian, and the correlation $\rho$ between two obligors directly evaluates to

$$
\rho(X_i, X_j) = \begin{cases} \beta_i \beta_j, & \text{for } i \ne j \\ 1, & \text{for } i = j \end{cases} \tag{2.16}
$$

i.e. $X = (X_1, \ldots, X_N)$ is an $N$-dimensional Gaussian variable with correlation matrix $\rho$. With $\Phi$ the standard normal distribution, the conditional default probability reads

$$
p_i(t \mid Z) = \mathbb{P}(X_i < C_i(t) \mid Z) = \Phi\left(\frac{C_i(t) - \beta_i Z}{\sqrt{1 - \beta_i^2}}\right). \tag{2.17}
$$

Since $X_i \sim \mathcal{N}(0, 1)$, and following Eq. (2.6), the calibration equation for $C_i(t)$ reads

$$
C_i(t) = \Phi^{-1}(p_i(t)). \tag{2.18}
$$
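The one-factor Gaussian model is compact enough to sketch in a few lines of Python; the quadrature check of the constraint (2.14) is added for illustration only (the node count and parameter values are arbitrary choices):

```python
import numpy as np
from scipy.stats import norm

def cond_default_prob_gauss(p_t, beta, z):
    """Eqs. (2.17)-(2.18): conditional default probability in the
    one-factor Gaussian model, given a factor realization z."""
    C_t = norm.ppf(p_t)                                  # Eq. (2.18)
    return norm.cdf((C_t - beta * z) / np.sqrt(1.0 - beta**2))

# Consistency check of Eq. (2.14): Gauss-Hermite integration over the
# factor Z ~ N(0,1) recovers the unconditional default probability.
x, w = np.polynomial.hermite.hermgauss(40)
z = np.sqrt(2.0) * x                                     # change of variables
p = np.sum(w * cond_default_prob_gauss(0.05, 0.4, z)) / np.sqrt(np.pi)
print(p)                                                 # approx. 0.05
```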

2.3.2 Student-t Copula

The Student-t distribution has fatter tails than the Gaussian distribution and is known to generate tail dependence in the joint distribution, two properties that are also observed in the market. We make use of the following model:

$$
X_i = \beta_i \sqrt{\frac{\nu}{g}}\, V + \sqrt{1 - \beta_i^2}\, \sqrt{\frac{\nu}{g}}\, V_i = \sqrt{\frac{\nu}{g}} \left(\beta_i V + \sqrt{1 - \beta_i^2}\, V_i\right), \tag{2.19}
$$

where $V$ and $V_i$ are independent $\mathcal{N}(0, 1)$, and $g$ is an independent $\chi^2(\nu)$ random variable that is common to both the systematic process $Z = \sqrt{\nu/g}\, V$ and the idiosyncratic process $\epsilon_i = \sqrt{\nu/g}\, V_i$. In order to prevent infinite low-order moments of $X$, we impose the restriction $\nu > 2$. Note that the factor $Z$ is now two-dimensional and has a bivariate distribution function. The asset variable $X_i$, though, reads $X_i = \sqrt{\nu/g}\, Y_i$ with $Y_i \sim \mathcal{N}(0, 1)$, which amounts to a Student-t distribution with $\nu$ degrees of freedom, $X_i \sim t_\nu$, as desired. Also, recall that with $Y$ being an $N$-dimensional standard Gaussian variable with correlation matrix $\rho$, and $g$ a scalar $\chi^2(\nu)$ random variable, $X = \sqrt{\nu/g}\, Y$ follows an


$N$-dimensional Student-t distribution⁶ with correlation matrix $\rho$⁷ and $\nu$ degrees of freedom. Thus, we have in fact constructed a Student-t copula. For the conditional default probability we now obtain

$$
p_i(t \mid V, g) = \Phi\left(\frac{\sqrt{\frac{g}{\nu}}\, C_i(t) - \beta_i V}{\sqrt{1 - \beta_i^2}}\right), \tag{2.20}
$$

where the calibration equation reads $C_i(t) = t_\nu^{-1}(p_i(t))$. Here, the strength of the model chosen above becomes apparent: if we had chosen two independent $\chi^2$-distributed variables for the systematic and the idiosyncratic factor, then $X_i$ would have been a sum of two Student-t variables whose distribution cannot be expressed in closed form. The calibration of $C$ would therefore have required a numerical root finding, in contrast to the analytic expression above. On the downside, the model above requires a double integral in Eq. (2.13) over both factor components $g$ and $V$.
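Analogously, a hedged sketch of the conditional default probability (2.20) in Python; note the additional conditioning variable $g$, which is what later forces the double integral (the function name is illustrative):

```python
import numpy as np
from scipy.stats import norm, t as student_t

def cond_default_prob_student(p_t, beta, v, g, nu):
    """Eq. (2.20): conditional default probability in the two-factor
    Student-t model, given V = v and the chi^2(nu) variable g."""
    C_t = student_t.ppf(p_t, df=nu)        # calibration C_i(t) = t_nu^{-1}(p_i(t))
    return norm.cdf((np.sqrt(g / nu) * C_t - beta * v)
                    / np.sqrt(1.0 - beta**2))
```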

2.3.3 The Portfolio Loss Distribution

From the conditional default probabilities, the conditional loss distribution $P(w; t \mid Z)$ can be calculated.

Let us first consider the case of a homogeneous setup, i.e. $l_i \equiv l$, $p_i \equiv p$, and $\beta_i \equiv \beta$ are identical for all obligors. It follows that the loss distribution (conditional and unconditional) is related to the distribution of the number $K$ of defaulted names by the transformation $P(w; t) = \mathbb{P}(K = w/l; t)$, i.e. we only have to consider the density of the number of defaults. Given the conditional independence of defaults, the density is given by the binomial density function,

$$
\mathbb{P}(K = n; t \mid Z) = \binom{N}{n}\, p(t \mid Z)^n\, (1 - p(t \mid Z))^{N-n} = b(n; N, p(t \mid Z)). \tag{2.21}
$$

In the more general inhomogeneous setup, the conditional loss distribution is given by the convolution product of all conditional single-name loss distributions.⁸ This convolution can be computed by established methods for the calculation of continuous

⁶ The generalization of the univariate Student-t distribution to a multivariate distribution is not unique. We use the definition used in Ref. [7].
⁷ Hence, the construction in Eq. (2.19) preserves the Gaussian correlation structure, i.e. as before we get $\rho(X_i, X_j) = \beta_i \beta_j$; for the derivation see Appendix B. This means that the factor loadings $\beta$ have the same meaning as before and form a decomposition of the correlation matrix.
⁸ The distribution of the sum $X + Y$ of two independent random variables with respective distributions $G_X(x)$ and $G_Y(x)$ is given by the convolution $G(x) = \sum_{x'=-\infty}^{\infty} G_X(x')\, G_Y(x - x')$. The distribution of $X_1 + \ldots + X_n$, $X_i$ independent, can be computed from this by induction.


convolutions of independent random variables; see Ref. [20] for a Fourier-transform-based method. A particularly simple and efficient method [7] is based on recursion: a loss unit $u$ is

introduced, such that the individual losses $l_i$ are discretized in integer units $w_i$ of $u$, $l_i = w_i u$.⁹ If $L_n$ is the loss associated with a portfolio containing the first $n$ names

(in arbitrary order), then $P_n(w; t \mid Z) \equiv \mathbb{P}(L_n(t) = wu \mid Z)$ is recursively given by

$$
P_{n+1}(w; t \mid Z) = P_n(w - w_{n+1}; t \mid Z)\, p_{n+1}(t \mid Z) + P_n(w; t \mid Z)\, (1 - p_{n+1}(t \mid Z)). \tag{2.22}
$$

From this, the loss distribution $P(w; t \mid Z) = P_N(w; t \mid Z)$ can be built, starting from

the boundary case of the empty portfolio, $P_0(w; t \mid Z) = \delta_{w,0}$. Note that Eq. (2.22) corresponds to a nested loop: new names are successively added to the portfolio, and for each new name the probabilities for all possible losses

$w = 0, \ldots, w_{max,n} + w_{n+1}$ need to be computed. Therefore, the computational time of numerical implementations will typically depend roughly quadratically on the size of the portfolio (cp. Sec. 5.3).
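A possible Python rendering of the recursion (2.22), vectorized over the loss grid; the function signature is an illustrative choice:

```python
import numpy as np

def conditional_loss_distribution(cond_probs, loss_units):
    """Recursion (2.22): conditional loss distribution P(w; t|Z).

    cond_probs[i]  -- conditional default probability p_i(t|Z) of name i
    loss_units[i]  -- integer loss w_i of name i in units of u
    Returns the array P with P[w] = P(L(t) = w*u | Z)."""
    w_max = int(sum(loss_units))
    P = np.zeros(w_max + 1)
    P[0] = 1.0                        # empty portfolio: P_0(w) = delta_{w,0}
    top = 0                           # maximal attainable loss so far
    for p, w in zip(cond_probs, loss_units):
        top += w
        # Eq. (2.22): the new name either defaults (shift by w) or survives.
        P[w:top + 1] = P[:top + 1 - w] * p + P[w:top + 1] * (1.0 - p)
        P[:w] *= (1.0 - p)
    return P
```

The unconditional loss distribution then follows by integrating the output over the factor distribution as in Eq. (2.13).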

2.3.4 Fitting the Factor Loadings

Suppose we have inferred a correlation matrix $\rho$ for the reference portfolio from the market, e.g. from asset correlations (cp. Chapter 3). In general, the factor loadings cannot be chosen such that the correlation matrix is exactly reproduced, since $\rho$ has $N(N-1)/2$ independent entries whereas there are only $N$ factor loadings for each factor. Therefore, one aims to find factor loadings $\beta$ that approximate $\rho$ best with a given number of factors $M$. This optimization problem can be numerically solved with a number of methods; an iterative algorithm based on principal component analysis (PCA) is sketched in Ref. [7]. From the spectral decomposition of the correlation matrix, $\rho = V \Lambda V^T$, only the eigenvectors of the $M$ largest eigenvalues are used, $\boldsymbol{\beta}_1 = \sqrt{\lambda_1}\, \mathbf{v}_1, \ldots, \boldsymbol{\beta}_M = \sqrt{\lambda_M}\, \mathbf{v}_M$ (where the eigenvalues are labeled in decreasing order of magnitude), yielding an approximation $\tilde\rho$ to $\rho$, $\tilde\rho = \sum_{i=1}^{M} \boldsymbol{\beta}_i \boldsymbol{\beta}_i^T$.

In general, the factor loadings $\boldsymbol{\beta}_i$ are not optimal yet, since the optimization procedure above minimizes $\mathrm{tr}\left[(\rho - \tilde\rho)(\rho - \tilde\rho)^T\right]$, i.e. it uses a Frobenius norm on all elements of $\rho - \tilde\rho$. However, the diagonal elements of $\rho$ don't need to be approximated, since the

⁹ See Ref. [7] for a discussion of the optimal size of the loss unit in order to keep rounding errors on the one hand and computational effort on the other hand at a tolerable level.


construction of the factor models (2.15) assures that $\rho(X_i, X_i) = 1$, and we only need to consider the non-diagonal elements of $\rho$. Therefore, a modified correlation matrix

$$
\rho_{ij}^{(2)} = \begin{cases} \rho_{ij}, & \text{for } i \ne j \\ \tilde\rho_{ij}, & \text{for } i = j \end{cases} \tag{2.23}
$$

is approximated in the next iteration, yielding $\boldsymbol{\beta}_i^{(2)}$, etc. In this thesis we will follow a different approach, though, as we will construct factor loadings that build a model correlation matrix with some predefined structure, see Chapter 4.
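The iterative PCA scheme sketched above can be written compactly; the following is a minimal, illustrative rendering (convergence control is omitted, and the fixed iteration count is an arbitrary choice):

```python
import numpy as np

def fit_factor_loadings(rho, M, n_iter=50):
    """Iterative PCA fit (Sec. 2.3.4, following the sketch in Ref. [7]):
    approximate rho by beta @ beta.T with M factors, ignoring the diagonal."""
    rho_k = rho.copy()
    for _ in range(n_iter):
        lam, V = np.linalg.eigh(rho_k)               # ascending eigenvalues
        lam, V = lam[::-1][:M], V[:, ::-1][:, :M]    # keep the M largest
        beta = V * np.sqrt(np.clip(lam, 0.0, None))  # beta_m = sqrt(lam_m) v_m
        # Eq. (2.23): keep the off-diagonal targets, replace the diagonal.
        rho_k = rho.copy()
        np.fill_diagonal(rho_k, np.diag(beta @ beta.T))
    return beta
```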

2.3.5 Large Homogeneous Portfolio Approximation

For an asymptotically large, homogeneous portfolio it is possible to derive a closed-form solution for the loss distribution. Specifically, we make the following assumptions:

• The portfolio consists of an infinite number of entities $N$.

• The notional amounts, single-name default probabilities, recovery rates, and pair-wise correlations are identical for all entities.

• There is a single factor determining the conditional default probabilities.

This is referred to as the Large Homogeneous Portfolio (LHP) approximation. We now consider the average fraction of defaulted obligors $K/N$ and use the fact that, conditional on the factor $Z$, defaults are independent and follow the law of large numbers,

$$
\mathbb{P}\left(\frac{K}{N} = p(Z) \,\Big|\, Z\right) = 1, \tag{2.24}
$$

i.e. the average fraction of defaulted obligors equals the conditional default probability almost surely. As above, we obtain the unconditional (now continuous) distribution by integrating out the factor,

$$
F(\alpha) = \mathbb{P}\left(\frac{K}{N} \le \alpha\right) = \int_{-\infty}^{\infty} \mathbb{P}\left(\frac{K}{N} \le \alpha \,\Big|\, Z = z\right) dG_Z(z) = \int_{-\infty}^{\infty} \mathbb{P}(p(Z) \le \alpha \mid Z = z)\, dG_Z(z). \tag{2.25}
$$

$p(z)$ is not a random variable but a deterministic function, and we can restrict the integration domain to the region where $p(z) \le \alpha$, i.e. where $z \ge p^{-1}(\alpha)$ (for $p(z)$ is a decreasing function),

$$
F(\alpha) = \int_{p^{-1}(\alpha)}^{\infty} dG_Z(z) = 1 - G_Z(p^{-1}(\alpha)). \tag{2.26}
$$


2.3.5.1 Gaussian Copula

For the Gaussian copula, Eq. (2.26) evaluates to

$$
F(\alpha) = 1 - \Phi\left(\frac{C - \sqrt{1-\beta^2}\, \Phi^{-1}(\alpha)}{\beta}\right) = \Phi\left(\frac{\sqrt{1-\beta^2}\, \Phi^{-1}(\alpha) - C}{\beta}\right), \tag{2.27}
$$

where we have used the symmetry of the Gaussian density function in the last step. The density function reads

$$
f(\alpha) = \frac{d}{d\alpha} F(\alpha) = -g_Z(p^{-1}(\alpha))\, \frac{1}{p'(Z(\alpha))} = \varphi\left(\frac{\sqrt{1-\beta^2}\, \Phi^{-1}(\alpha) - C}{\beta}\right) \frac{\sqrt{1-\beta^2}}{\beta\, \varphi(\Phi^{-1}(\alpha))}, \tag{2.28}
$$

where $\varphi$ is the density of the standard normal distribution.
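Eqs. (2.27) and (2.28) translate directly into code; a brief sketch (input validation omitted, function names are illustrative):

```python
import numpy as np
from scipy.stats import norm

def lhp_cdf(alpha, beta, C):
    """Eq. (2.27): LHP distribution of the fraction of defaulted obligors."""
    return norm.cdf((np.sqrt(1.0 - beta**2) * norm.ppf(alpha) - C) / beta)

def lhp_pdf(alpha, beta, C):
    """Eq. (2.28): the corresponding density."""
    s = np.sqrt(1.0 - beta**2)
    x = norm.ppf(alpha)
    return norm.pdf((s * x - C) / beta) * s / (beta * norm.pdf(x))
```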

2.3.5.2 Student-t Copula

As pointed out above, the conditional default probability (2.20) in the Student-t copula model differs from the Gaussian one by being conditioned on two independent variables, thereby failing the conditions for the LHP approximation stated above. This issue is resolved in Ref. [35] by introducing a new factor variable η,

$$
\eta := \sqrt{\frac{g}{\nu}}\, C - \beta V, \tag{2.29}
$$

thus transforming Eq. (2.20) into

$$
p(\eta) = \Phi\left(\frac{\eta}{\sqrt{1-\beta^2}}\right). \tag{2.30}
$$

Eq. (2.26) then turns into

$$
F(\alpha) = \int_{-\infty}^{p^{-1}(\alpha)} dG_\eta(\eta) = G_\eta(p^{-1}(\alpha)), \tag{2.31}
$$

since $p(\eta)$ is an increasing function.

It is shown in Ref. [35] that the density function $g_\eta$ can be written in closed form as a finite sum over incomplete gamma functions,¹⁰

$$
g_\eta(t) = A\, e^{-\frac{\nu t^2}{2(C^2 + \nu\beta^2)}} \sum_{k=0}^{\nu-1} \binom{\nu-1}{k} (Bt)^{\nu-1-k}\, I_k(-Bt), \tag{2.32}
$$

¹⁰ Assuming that $\nu \in \mathbb{N}$ and $p < 50\%$ [35].


where for each $k \in \mathbb{N}_0$ the function $I_k : \mathbb{R} \to \mathbb{R}$ is given by

$$
I_k(x) = \begin{cases} 2^{\frac{k-1}{2}}\, \Gamma\left(\frac{k+1}{2}\right) \Gamma\left(\frac{k+1}{2}, \frac{x^2}{2}\right), & \text{for } x \ge 0 \\ I_k(0) + (-1)^k \left(I_k(0) - I_k(-x)\right), & \text{for } x < 0 \end{cases} \tag{2.33}
$$

with $\Gamma(k, x) = \frac{1}{\Gamma(k)} \int_x^\infty e^{-t}\, t^{k-1}\, dt$, and the constants $A$ and $B$ are defined as

$$
A := \frac{\beta^{\nu-1} \left(\frac{\nu}{C^2 + \nu\beta^2}\right)^{\frac{\nu}{2}}}{\sqrt{\pi}\; 2^{\frac{\nu-1}{2}}\; \Gamma\left(\frac{\nu}{2}\right)}, \qquad B := \frac{C}{\beta \sqrt{C^2 + \nu\beta^2}}. \tag{2.34}
$$

ν ν−1  ν  2 β C2+νβ2 C A := ,B := . (2.34) √ ν−1 ν p 2  2 2 π 2 Γ 2 β C + νβ This can be used to express the density function of the fraction of defaulted obligors as 1 f(α) = g (p−1(α)) η p 0(η(α)) p p  1 − β2 = g 1 − β2 Φ−1(α) . (2.35) η ϕΦ−1(α)

Chapter 3

Default Correlation

As pointed out in the previous chapters, the valuation of CDO tranches involves a number of parameters: the reference portfolio is described by single-name credit spreads, recovery rates, and losses, as well as the correlation between the obligors; the CDO being a cash-flow instrument, its valuation naturally also requires risk-free interest rates and accrual factors. The correlation, though, stands out from this list as it is the only parameter specific to CDOs (or other basket credit derivatives), whereas the others can be observed from the liquid CDS and interest rate markets.

Empirical investigations suggest that defaults tend to cluster together and that there is positive correlation even between companies in different sectors; therefore, the assumption of uncorrelated defaults is not justified and correlation has to be considered. Moreover, the strength of the correlation crucially affects the prices and risk profiles of the CDO tranches. The determination of the correlation therefore receives particular interest and can be approached from two opposite directions.

One can consider default correlation as a model input that needs to be estimated externally, to the best of one's knowledge. This is complicated by the fact that there is little empirical data on the relationship between defaults, and the underlying process driving the defaults is not easily observable. In theory, information about default correlation should be extracted from historical defaults; this generally suffers from the low number of observed defaults, though. For the estimation of the default correlation one therefore typically resorts to the correlation of historical equity market returns, which is used as a proxy for the asset return correlation of latent variable models like the Merton model (2.3).


3.1 Implied Correlation

Conversely, one can think of correlation as a quotation convention. With the advent of standardized STCDOs on liquid CDS indices and the resulting increased liquidity in the CDO market, prices can be expected to follow from supply and demand rather than from a theoretical model. The correlation implied from these prices can be used as a quotation device to facilitate a comparison of prices across tranches and products, the line of thought being similar to implied volatilities in equity markets.¹ Given the developments in the standardization of the credit derivative markets, there is also a need for a commonly agreed method of quoting this implied correlation. As standard model, the one-factor Gaussian copula with a flat correlation, $\beta_i \equiv \beta$, $\rho_{ij} \equiv \rho$ $\forall i, j$, is widely used. However, like the Black-Scholes model in equity markets, the one-factor Gaussian copula is known to be too simplistic to reflect market reality. It can therefore not be expected that a single correlation number is sufficient to fit the prices of all tranches simultaneously; rather, each tranche trades at its own implied correlation. In the following, we describe two popular methods that have been developed for quoting the implied correlation with respect to the tranche.

3.1.1 Compound Correlation

Compound correlation is the direct extension of the Black-Scholes implied volatility and was the quotation convention initially used in the market. By inverting the standard pricing model, it is possible to find the level of default correlation that equates the observed market spread and the theoretical spread for a tranche. The (compound) correlation affects the different tranches in different ways. In general, an increase in default correlation means that it becomes more likely to observe many or few defaults. An equity tranche, for instance, is not much affected by the occurrence of many defaults, as it absorbs losses only up to its detachment point. However, the occurrence of only a few defaults reduces the expected loss in the tranche and therefore the fair spread. Thus, in terms of prices, an equity tranche investor is long correlation. Conversely, if defaults in the portfolio are completely uncorrelated, the probability of the occurrence of many losses is very low, i.e. the losses are very unlikely to hit the senior tranche. Therefore, senior tranche investors are short correlation. Mezzanine tranches, finally, are in general not monotonic in (compound) correlation: a long position in a mezzanine tranche with attachment point $a_L$ and detachment

¹ However, market prices may be observed that are not attainable by any choice of correlation.


point $a_H$ can be seen as a long position in an equity tranche with detachment point $a_H$ plus a short position in an equity tranche with detachment point $a_L$, since the expected loss can be decomposed into

$$
\mathbb{E}\left[L_{a_L, a_H}\right] = \mathbb{E}\left[L_{0, a_H}\right] - \mathbb{E}\left[L_{0, a_L}\right]. \tag{3.1}
$$

Since both "virtual" tranches are long correlation but enter with opposite signs, the root-finding algorithm for the determination of the compound correlation often doesn't yield a unique compound correlation but two correlations that produce the same tranche PV, see Sec. 5.2 for an example. This ambiguity is one of the biggest drawbacks of compound correlation. Moreover, as we will discuss in Sec. 3.2, compound correlation is not easy to extend to the pricing of tranches with non-standard strikes. The upside of compound correlation is its transparency and intuitive meaning, as it directly maps to the asset correlation of latent variable models.
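The root-finding problem, including its possible ambiguity, can be sketched as follows; tranche_pv stands for any tranche pricer built on the machinery of Chapter 2 and is a placeholder, not a function defined in this thesis:

```python
import numpy as np
from scipy.optimize import brentq

def implied_compound_correlations(market_spread, tranche_pv,
                                  lo=1e-4, hi=0.999, n_grid=200):
    """Scan for flat correlations at which the tranche PV vanishes.
    tranche_pv(spread, rho) is a placeholder for a user-supplied pricer.
    For mezzanine tranches the PV need not be monotonic in rho, so zero,
    one, or two roots may be found (cp. Sec. 5.2)."""
    f = lambda rho: tranche_pv(market_spread, rho)
    grid = np.linspace(lo, hi, n_grid)
    vals = [f(r) for r in grid]
    return [brentq(f, a, b)
            for a, b, fa, fb in zip(grid[:-1], grid[1:], vals[:-1], vals[1:])
            if fa * fb < 0]              # may be empty: spread not attainable
```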

3.1.2 Base Correlation

Base correlation was developed by JP Morgan [32] and makes use of the monotonicity of equity tranches. The basic idea of base correlation is the decomposition of all tranches into combinations of (virtual, non-traded) base (= equity) tranches, as given in Eq. (3.1). In contrast to compound correlation, where the two virtual base tranches are priced at the same correlation, base correlation allows for pricing each base tranche at a different correlation. This is done via a bootstrapping mechanism where the base correlation from the first tranche is used to solve for the base correlation of the second one, and so on:

1. Let $V_{a-b}(s, \rho)$ denote the PV (from the investor's perspective) of the tranche with attachment point $a$ and detachment point $b$, with spread $s$ and (flat)

correlation $\rho$. The base correlation of the equity tranche, $\rho_{0-K_1}$, solves the

equation

$$
V_{0-K_1}(s_{0-K_1}, \rho_{0-K_1}) \overset{!}{=} 0, \tag{3.2}
$$

where $s_{0-K_1}$ is the market spread for the $0-K_1$ tranche (i.e. the equity tranche). This is equivalent to the determination of the compound correlation for the equity tranche; hence, base correlation and compound correlation are identical for the equity tranche.


2. The next tranche $K_1-K_2$ is equivalent to a portfolio of a long position in a

$0-K_2$ base tranche and a short position in a $0-K_1$ base tranche, both using the

$K_1-K_2$ market spread. We define the base correlation $\rho_{0-K_2}$ as the correlation that solves

$$
\underbrace{V_{K_1-K_2}(s_{K_1-K_2}, \rho_{K_1-K_2})}_{=0 \text{ by definition of } \rho_{K_1-K_2}} \overset{!}{=} V_{0-K_2}(s_{K_1-K_2}, \rho_{0-K_2}) - \underbrace{V_{0-K_1}(s_{K_1-K_2}, \rho_{0-K_1})}_{\text{known}, \; <0 \text{ since } s_{K_1-K_2} < s_{0-K_1}}. \tag{3.3}
$$

3. The procedure continues in the same way for the higher tranches.

Since the PV of the $0-K_2$ base tranche, $V_{0-K_2}(s_{K_1-K_2}, \rho_{0-K_2})$, is the only changing component in Eq. (3.3) (the base correlation of the lower strike, $\rho_{0-K_1}$, is fixed during the root finding) and is an increasing function of the base correlation $\rho_{0-K_2}$, the base correlation is guaranteed to be unique. Moreover, by means of Eq. (3.3), the existence of base correlations for all tranches guarantees that Eq. (3.1) holds, which is a basic arbitrage-free requirement. With respect to the whole tranche structure this means that the sum of the protection legs of all tranches equals the sum of the protection legs of all underlying CDS. This is also the reason why base correlation sometimes fails to find a solution for the super senior tranche: since the expected loss (and hence the spread) over the whole tranche structure doesn't depend on the correlation (cp. Eq. (2.2)), a market spread that deviates even slightly from this spread cannot be reproduced by any base correlation value.
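The bootstrap of Eqs. (3.2)-(3.3) is a sequence of one-dimensional root searches; a schematic sketch, with base_pv a placeholder for a base-tranche pricer:

```python
from scipy.optimize import brentq

def bootstrap_base_correlations(detachments, market_spreads, base_pv):
    """Base correlation bootstrap, Eqs. (3.2)-(3.3).
    base_pv(K, s, rho) is a placeholder pricer for the 0-K base tranche
    at spread s and flat correlation rho."""
    base_rhos, prev_K, prev_rho = [], 0.0, None
    for K, s in zip(detachments, market_spreads):
        if prev_rho is None:                          # equity tranche, Eq. (3.2)
            target = lambda rho: base_pv(K, s, rho)
        else:                                         # higher tranches, Eq. (3.3)
            pv_lower = base_pv(prev_K, s, prev_rho)   # known short leg
            target = lambda rho: base_pv(K, s, rho) - pv_lower
        rho = brentq(target, 1e-4, 0.999)             # unique by monotonicity
        base_rhos.append(rho)
        prev_K, prev_rho = K, rho
    return base_rhos
```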

However, there are some pitfalls with base correlation, too, the most obvious being the difficulty of building intuition about it. Since every tranche is actually characterized by two correlations (one for the lower, one for the upper strike), it is not obvious how base correlation changes with default or asset correlation. In fact, an increased default correlation can result in a decrease of the base correlation for some tranches [37]. One also needs to keep in mind that the determination of the base correlation for a particular tranche requires knowledge of the prices of all more junior tranches, and that base correlation is only unique given the set of attachment points [37]. The base correlation of the $a-b$ tranche thus depends not only on $a$ and $b$ but also on the position of all prior attachment points, making the comparison e.g. between the European and American markets difficult due to the different tranche structures.


3.2 Correlation Smile

As pointed out above, the implied correlation is typically not constant across the tranches but is known to exhibit a "smile" (in terms of compound correlation) or a skew (in terms of base correlation), reflecting the difference between real-world pricing and the assumptions of the (simplistic) Gaussian copula model, see Fig. 3.1. The main features of the smile/skew are relatively high spreads for equity and senior tranches (resulting in a low implied compound correlation for equity tranches and a high correlation for senior tranches) and low spreads for mezzanine tranches.

Figure 3.1: Example of a correlation smile/skew observed in the market (DJ iTraxx, 13 April 2006). Figure was taken from Ref. [15].

The reasons for the shape of the smile/skew are manifold, including model effects and supply-demand conditions. On the model side, it is obvious that a single correlation number isn't able to capture the dependency structure of a portfolio of 125 credits; also, the Gaussian copula model does not reflect the empirical joint distribution of default times, as it underestimates the probability of a very low or very high number of defaults. On the supply-demand side it is worth noting that mezzanine tranches are very popular among investors, resulting in a tightening of spreads. On the other hand, the CDO market is segmented, i.e. different parts of the seniority structure are driven by different investors (hedge funds buying equity tranches, retail investors looking for mezzanine tranches, and pension funds focusing on senior tranches), and there are only a few market participants that seek to exploit perceived mispricing across different tranches. A number of other effects have been suggested as explanations, see e.g. Refs. [4, 32].


An important application of the correlation smile/skew is interpolation in order to calculate spreads for customized tranches on standard indices with non-standard attachment points. In this context, the shape of the base correlation skew, which is typically close to linear, appears to be much better suited for interpolation than the more complicated smile shape of compound correlation. However, as shown in Ref. [37], the spread errors from a linear interpolation of the base correlation skew can still be large. In rare cases, the interpolation can even yield negative expected losses for some tranches.

3.2.1 Modeling Approaches

A starting point for modeling approaches is the loss distribution implied by the market. Since tranche prices are essentially integrals over the loss distribution, cp. Eqs. (1.2) and (1.3), the implied loss distribution can be backed out from the market by differentiating. Compared to the loss distribution generated from the Gaussian copula, one observes a fat upper tail, indicating an increased probability of high-loss scenarios (yielding a high spread for senior tranches), and a lower probability of zero (or few) defaults (equivalent to a high spread for the equity tranche), see Fig. 3.2. Whereas the fat upper tail can be modeled e.g. by switching from the Gaussian to a Student-t copula, the challenge is to do so without also raising the probability of small losses.


Figure 3.2: Distribution of losses, as produced by a Gaussian copula ("un-skewed") and observed in the market ("skewed"). Figure was taken from Ref. [24].

A number of alternative copulas have been proposed to overcome this issue, including Double-t, Clayton, Archimedean, and Marshall-Olkin copulas; we refer to Ref. [11] for an overview. Extensions to the Gaussian copula model have also been investigated: stochastic correlation interprets the correlation as a (binary) stochastic variable

23 3.2 Correlation Smile 3 Default Correlation

[10], which –in a further extension termed ”random factor loadings” [6]– possibly de- pends on the factor Z (low correlation in bullish markets, high correlation in bearish markets). Random recovery [6] in turn treats recovery rates stochastically, following e.g. a Gaussian, Student-t, or beta distribution.

24 Chapter 4

Modeling the Correlation Matrix

In the previous chapter, a number of modeling techniques for the correlation smile were mentioned. In this work, we want to pursue a different approach, namely considering the full matrix of pairwise correlations between the obligors instead of a constant and equal correlation between all names. As an example, the left panel of Fig. 4.1 shows the implied compound and base correlation for a non-flat correlation matrix whose structure is indicated in the right panel. Apparently, the correlation seen by the individual tranches differs, giving rise to a non-flat implied correlation curve.


Figure 4.1: Implied compound and base correlation curve (left panel) for the block structure of the correlation matrix indicated in the right panel. Used parameters: see Tab. C.2 in Appendix C.

Within the factor model framework, this approach leaves us with $N \cdot d$ free parameters (the factor loadings), with $d$ being the number of factors. The fitting of the correlation

smile thus turns into a high-dimensional optimization problem, which was tackled in Ref. [23] by using Evolutionary Algorithms. In contrast, the idea of this work is to impose a structure on the correlation matrix in order to drastically reduce the number of fitting parameters while still keeping enough degrees of freedom to allow for rich variations of the implied correlation curve. To this end, we will let ourselves be guided by the structure of empirical correlation matrices obtained from the market.

4.1 Empirical Correlations and RMT

As mentioned in Chapter 3, the correlation of equity returns is often used as a proxy for default correlation. Denoting the equity return (price change) of name $i$ by $\delta x_i(t)$ and assuming that the $\delta x$'s have been rescaled to constant unit volatility, the correlation reads

$$
C_{ij} = \frac{1}{T} \sum_{t=1}^{T} \delta x_i(t)\, \delta x_j(t), \qquad C = \frac{1}{T}\, M M^T, \tag{4.1}
$$

where $M$ is an $N \times T$ matrix ($N$ being the number of assets), representing the $N$ time series of price changes of length $T$. The period $T$ is in practice always limited, resulting in empirical correlations that are to a large degree noisy (i.e. random). Random matrix theory (RMT) [13] makes predictions about universal properties of (real and symmetric) random matrices, in particular spectral properties. Deviations from these universal predictions then identify non-random, system-specific properties. The density $\rho_A(\lambda)$ of eigenvalues of a matrix $A$ is defined as

$$
\rho_A(\lambda) = \frac{1}{N} \frac{dn(\lambda)}{d\lambda} = \frac{1}{N} \sum_{i=1}^{N} \delta(\lambda - \lambda_i), \tag{4.2}
$$

where $n(\lambda)$ is the number of eigenvalues of $A$ smaller than $\lambda$. Assuming that the matrix $M$ from above is completely random, i.e. its elements are iid random with arbitrary distribution, $\rho_C(\lambda)$ is universal and exactly known in the limit $N, T \to \infty$ with a fixed ratio $Q = T/N \ge 1$, and reads [31]

$$
\rho_C(\lambda) = \frac{Q}{2\pi\sigma^2}\, \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda}, \quad \text{with} \quad \lambda_{\max/\min} = \sigma^2\left(1 + 1/Q \pm 2\sqrt{1/Q}\right), \tag{4.3}
$$


with $\lambda \in [\lambda_{\min}, \lambda_{\max}]$, and $\sigma^2$ the variance of the elements of $M$. Eq. (4.3) is called the Marčenko-Pastur law [31]; an important prediction, besides the functional form of the spectral density, is that $\rho_C$ vanishes outside the interval $[\lambda_{\min}, \lambda_{\max}]$.¹
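The prediction is easy to check numerically; a small illustrative experiment with arbitrary dimensions N and T:

```python
import numpy as np

# Illustrative check of Eq. (4.3): eigenvalue spectrum of a pure-noise
# correlation matrix versus the Marcenko-Pastur bounds.
N, T = 125, 500
Q = T / N
M = np.random.default_rng(0).standard_normal((N, T))
M /= M.std(axis=1, keepdims=True)        # rescale to unit volatility
C = M @ M.T / T                          # Eq. (4.1)
eigvals = np.linalg.eigvalsh(C)
lam_min = (1.0 - np.sqrt(1.0 / Q))**2    # sigma^2 = 1
lam_max = (1.0 + np.sqrt(1.0 / Q))**2
print(eigvals.min(), eigvals.max(), (lam_min, lam_max))
# Up to finite-N effects, all eigenvalues fall inside [lam_min, lam_max].
```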

The investigation of empirical correlation matrices of stock price returns in various markets in Refs. [27, 34] now reveals that the spectrum obtained from these empirical matrices shows, within the Marčenko-Pastur bounds $\lambda_{\min}, \lambda_{\max}$, good agreement with the "pure noise" hypothesis as expressed in Eq. (4.3). However, approximately 6% of the eigenvalues exceed the upper bound significantly, in particular the highest eigenvalue with a value of $\lambda_1 \approx 25 \cdot \lambda_{\max}$. It is shown in Ref. [27] that the corresponding eigenvector has almost equal components on all names and thus can be identified with the market. Ref. [34] elaborates further on the structure of the eigenvectors corresponding to the other large eigenvalues and shows that subsets (groups) of names of different sizes contribute to them.

To summarize the most important findings from the investigation of empirical correlation matrices of equity returns, we assert that:

• Only a small number of (large) eigenvalues deviates from the Marčenko-Pastur law (4.3) and thus carries some information.²

• The highest eigenvalue is much larger than all other eigenvalues and can be identified with the market, i.e. there is a common systematic risk factor to which all names are significantly exposed.

• The correlations that account for the other large eigenvalues are neither completely localized (i.e. only few names contribute to them) nor very extended. Instead, they span groups of names of different sizes.

In the following we discuss the task of constructing correlation matrices with these properties in a factor model framework.

¹ For finite $N$, the edges of the interval are smoothed, i.e. there is a small probability to find eigenvalues outside the interval. We will not account for finite-$N$ effects here.
² However, care must be taken here: the Marčenko-Pastur law applies to independent sample elements, whereas the correlation matrix is only a measure of linear dependence. If sampled from heavy-tailed distributions, e.g. a multivariate Student-t distribution, even uncorrelated random data may lead to an eigenspectrum that doesn't correspond to the Marčenko-Pastur law. Hence, the number of eigenvalues exceeding the upper bound might not give a correct indication of the number of real driving factors [17].


4.2 Group Models

It is intuitively clear that there is a close connection between the eigenvectors of the correlation matrix and the factor loading vectors in the factor framework, since both form a decomposition of the correlation matrix. The case of the completely random correlation matrix reflected by the Marčenko-Pastur law corresponds to the zero-factor model, i.e. there is no systematic factor in Eq. (2.15). The correlation matrix in Eq. (2.16) then equals the identity matrix. Note that this is the model correlation matrix of the zero-factor model, i.e. without noise dressing. By sampling finite time series from the zero-factor model and computing the correlation matrix from the sample, one would retrieve the noise-dressed matrix with the Marčenko-Pastur spectrum.
Going to the one-factor model, the model spectrum exhibits a large eigenvalue associated with the systematic variable (the market): for simplicity we assume equal factor loadings for all names, β_i ≡ β, so that the correlation matrix can be written as a weighted sum of the identity matrix I_N and the unit matrix (i.e. a matrix consisting of all ones) J_N,

C = \begin{pmatrix} 1 & \beta^2 & \cdots \\ \beta^2 & 1 & \\ \vdots & & \ddots \end{pmatrix} = (1 - \beta^2)\, I_N + \beta^2\, J_N,   (4.4)

and the characteristic equation for C reads

\det\!\left((1 - \beta^2 - \lambda)\, I_N + \beta^2\, J_N\right) = (1 - \beta^2 - \lambda)^{N-1} \left(1 + (N-1)\,\beta^2 - \lambda\right) \overset{!}{=} 0,   (4.5)

where we have used det(a I_N + b J_N) = a^{N-1}(a + Nb) [29]. Hence, the spectrum of the one-factor model is composed of one large eigenvalue λ_1 = 1 + (N − 1)β², which approximately scales with N, and N − 1 degenerate eigenvalues λ_2 = ... = λ_N = 1 − β².
As a straightforward extension, one can consider a multifactor model with d factors, where in the simplest case the dynamics of each name is governed by only one factor and by the idiosyncratic noise, i.e. the portfolio consists of d non-overlapping groups. The correlation matrix is then block diagonal and its spectral density is given by the superposition of the spectral densities of the d one-factor models of the individual groups, resulting in d large eigenvalues.
As a generalization, we want to allow for a hierarchical overlap of the groups introduced above [21]. The resulting general structure of the correlation matrix is termed "(homogeneous) group model" and is given by


C = \begin{pmatrix} C_{\rho_1} & \rho & \cdots & \rho \\ \rho & C_{\rho_2} & & \vdots \\ \vdots & & \ddots & \rho \\ \rho & \cdots & \rho & C_{\rho_l} \end{pmatrix}, \qquad C_{\rho_k} = \begin{pmatrix} 1 & \rho_k & \cdots & \rho_k \\ \rho_k & 1 & & \vdots \\ \vdots & & \ddots & \rho_k \\ \rho_k & \cdots & \rho_k & 1 \end{pmatrix},   (4.6)

where each diagonal block C_{ρ_k} collects the names of group k and all off-diagonal blocks are constant with value ρ; here ρ_k, k ∈ {1, . . . , l}, is the intra-sector correlation of group k and ρ the inter-sector correlation, without limitation on the number of groups l. In addition, groups can be nested (not shown in Eq. (4.6) for simplicity), e.g. group 1 could contain a number of subgroups 1, . . . , m with intra-sector correlations ρ_{11}, . . . , ρ_{1m}, ρ_1 then being the inter-sector correlation between these subgroups. Thus, a hierarchical structure of layers emerges where each layer partitions the set of variables in a group into non-overlapping subgroups, and so on. In that sense, the correlation matrix (4.6) can be considered a group with intra-sector correlation ρ which is partitioned into l subgroups. Furthermore, we assume a homogeneous structure, i.e. the correlation between each pair of obligors in group k is equal to ρ_k (the groups are homogeneous in economic activity) and all inter-sector correlations are equal to ρ, with the obvious generalization for hierarchical models. Hence, this model features one very large eigenvalue resulting from ρ as well as a number of other large eigenvalues, reflecting the size n_i and correlation strength ρ_i of the associated groups, as empirically observed [29]. On the other hand, the spectrum of the noise-dressed matrix is still compatible with the Marčenko-Pastur law for the low eigenvalues. The model also reflects the economically intuitive and quantitatively verified [30] feature of industry sectors within which companies are more strongly correlated.
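For concreteness, the spectral structure just described can be checked numerically. The following is a minimal, illustrative Python sketch (the thesis implementation itself is written in MATLAB; the helper name group_corr and all parameter values are hypothetical): it assembles a two-group correlation matrix of the form (4.6) and confirms that the spectrum splits into one very large "market" eigenvalue, one large eigenvalue per group, and a bulk of eigenvalues of order one.

```python
import numpy as np

def group_corr(sizes, rhos, rho, N):
    """Two-layer group model, Eq. (4.6): intra-sector correlation rhos[k]
    inside group k, inter-sector correlation rho everywhere else."""
    C = np.full((N, N), rho)
    start = 0
    for size, rho_k in zip(sizes, rhos):
        C[start:start + size, start:start + size] = rho_k
        start += size
    np.fill_diagonal(C, 1.0)
    return C

# two groups of 40 and 20 names in a portfolio of 125 names
C = group_corr([40, 20], [0.6, 0.5], rho=0.2, N=125)
evals = np.sort(np.linalg.eigvalsh(C))[::-1]
print(evals[:4])   # one very large eigenvalue, two group eigenvalues, rest O(1)
```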

Term                              Parameter values                    Description
Flat model                        All corr. equal to ρ                Degeneration to standard flat correlation
Block model                       ρ = 0, no nesting                   Block diagonal correlation matrix
Hierarchical model with n layers  All other values, n nested layers   General case; Eq. (4.6) shows a hierarchical model with two layers.

Table 4.1: Types of group models.


4.2.1 Implementation in Factor Models

For the two-layer hierarchical model in Eq. (4.6), we consider each obligor as belonging to one specific (sub-)group (or sector) k that influences its dynamics,

X_i = \phi_{k(i)}\, Z_{k(i)} + \sqrt{1 - \phi_{k(i)}^2}\, \epsilon_i,   (4.7)

where Z_{k(i)} and ε_i are independent standard Gaussian variables and k(i) denotes the sector to which obligor i belongs; Z_{k(i)} is the common driving factor for this sector.

In order to reflect the hierarchical structure of the model, the risk factors Z_k are themselves related via a factor structure,

Z_k = \lambda_k\, Z + \sqrt{1 - \lambda_k^2}\, W_k.   (4.8)

The process for X_i then reads

X_i = \phi_{k(i)}\, \lambda_k\, Z + \phi_{k(i)}\, \sqrt{1 - \lambda_k^2}\, W_k + \sqrt{1 - \phi_{k(i)}^2}\, \epsilon_i.   (4.9)

This is a two-factor structure with the factors Z and W_k,

X_i = \beta_i^{(1)}\, Z + \beta_i^{(k)}\, W_k + \sqrt{1 - \big(\beta_i^{(1)}\big)^2 - \big(\beta_i^{(k)}\big)^2}\, \epsilon_i,   (4.10)

with the factor loadings β_i^{(1)} = φ_{k(i)} λ_k and β_i^{(k)} = φ_{k(i)} \sqrt{1 - λ_k²}.
Two names in different sectors are only related via the common factor Z, therefore \sqrt{ρ} = φ_k λ_k must hold. From Eq. (4.7) we see that two names in the same sector k are correlated via Z_k, so we have \sqrt{ρ_k} = φ_k. This gives λ_k = \sqrt{ρ/ρ_k} and the factor loadings evaluate to³

\beta_i^{(1)} = \sqrt{\rho} \qquad \text{and} \qquad \beta_i^{(k)} = \begin{cases} \sqrt{\rho_k - \rho}, & \text{if name } i \text{ is in sector } k \\ 0, & \text{else.} \end{cases}   (4.11)

In total, l + 1 factors are required to generate the correlation structure in Eq. (4.6), see Eq. (4.10): one common factor Z plus one factor W_k for each group k. A block model (cp. Tab. 4.1) of l blocks requires only l factors.⁴ The factor construction can be extended to more layers by introducing a new factor structure for each new hierarchy layer, as in Eq. (4.8). Again, this adds one factor for the inter-sector correlation on that level and one more factor for each additional group.

³ Note that this is only one possible decomposition of the correlation matrix into factor loadings, C = β · β^T. The factor loadings found by the PCA algorithm in Sec. 2.3.4 are in general different (they are eigenvectors and therefore orthogonal) but also approximate the group structure perfectly.
⁴ This holds for the Gaussian copula. For the Student-t copula, one additional factor for the common χ² variable is required for all models.
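As an illustration of Eq. (4.11), the following sketch (again illustrative Python with a hypothetical helper group_loadings; parameter values are arbitrary) assembles the loading matrix β of a two-layer hierarchical model and verifies that β · β^T, completed with a unit diagonal by the idiosyncratic terms, reproduces the group correlation matrix:

```python
import numpy as np

def group_loadings(N, groups, rho):
    """Loadings of Eq. (4.11): column 0 is the common factor Z (loading
    sqrt(rho)), column k+1 the factor W_k of sector k (loading
    sqrt(rho_k - rho) for its members, 0 otherwise)."""
    beta = np.zeros((N, len(groups) + 1))
    beta[:, 0] = np.sqrt(rho)
    for k, (members, rho_k) in enumerate(groups):
        beta[members, k + 1] = np.sqrt(rho_k - rho)
    return beta

N = 125
beta = group_loadings(N, [(np.arange(60), 0.6)], rho=0.1)  # one block of 60 names
C = beta @ beta.T
np.fill_diagonal(C, 1.0)   # idiosyncratic noise completes the unit variances
assert np.isclose(C[0, 1], 0.6) and np.isclose(C[0, 100], 0.1)
```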


4.3 Attainable Correlation Smiles

We are ultimately interested in whether group models are able to generate a correlation smile/skew and which forms it can take. In the following, we present numerical results for the implied correlations as generated by different types of group models (cp. Tab. 4.1). If the implied (compound) correlation for a mezzanine tranche is not unique, cp. Sec. 5.2, the lower of the two correlations is always indicated, following market convention.

4.3.1 Two-Layer Hierarchical Model

4.3.1.1 One Block

We start off with the simplest form of a hierarchical model where there is one block of size n of highly correlated obligors with pairwise correlation ρ_1 on a background of weakly correlated obligors with pairwise correlation ρ. As indicated in Fig. 4.1, this simple correlation structure with only three free parameters is already capable of generating a correlation smile. For investigating the effect of a variation of the correlation strengths, we use the mean and the difference of the correlations, which supposedly have "orthogonal" effects on the implied correlation curves,

\mu = \frac{\rho_1 + \rho}{2}, \qquad \Delta = \rho_1 - \rho.   (4.12)

Variation of Correlation Strengths

Fig. 4.2 shows the impact of µ on the compound and base correlation. As expected, changing µ to higher (lower) values essentially shifts the curves up (down). This is not true, however, for the 3%–6% mezzanine tranche, whose implied compound correlation is almost unaffected. For µ = 0.4 and µ = 0.5, it is even lower than ρ, i.e. the implied correlation can be lower than the lowest pairwise correlation in the correlation matrix.
Conversely, setting ∆ to larger (smaller) values primarily pronounces (mitigates) the compound correlation smile and steepens (flattens) the base correlation skew, see Fig. 4.3. The behavior of the overall implied correlation level with respect to ∆ depends on the block size n, though. If n ≈ N, most entries in the correlation matrix belong to the block and take on the value ρ_1 = µ + ∆/2: large values for ∆ will yield large implied correlations. However, for n = 60 the number of pairwise correlations in the block (which scales with n²) is small compared to the number of entries outside the block (in the "bulk"), which take on the value ρ = µ − ∆/2. This results in an inverse shift of the correlation curve with respect to the value of ∆.

Figure 4.2: Implied compound and base correlations for varying µ in the two-layer hierarchical model, see Eq. (4.12). Variation of µ basically causes a parallel shift of the implied correlation curves, except for the compound correlation of the 3%–6% mezzanine tranche. Parameters: ∆ = 0.5, block size n = 60; further parameters: see Tab. C.2 in Appendix C.

Figure 4.3: Implied compound and base correlations for varying ∆ in the two-layer hierarchical model, see Eq. (4.12). Larger values for ∆ result in a more pronounced smile in the compound correlation curve and a steeper base correlation curve. Parameters: µ = 0.35, block size n = 60; further parameters: see Tab. C.2 in Appendix C.


Variation of Block Size

The effect of a change of the block size as shown in Fig. 4.4 seems to be twofold. On the one hand, one can observe an upwards (downwards) shift of the compound and base correlation curves for larger (smaller) block size, reflecting the obvious fact that more (fewer) entries in the correlation matrix adopt the value ρ_1 > ρ. On the other hand, the compound correlation smile is deepened for large block size.
Regarding the base correlation curve, we recall that for the equity tranche base correlation and compound correlation are identical. For large block sizes, the base correlation curve starts off at a high implied correlation for the equity tranche; the slope of the curve is then squeezed into the remaining gap and hence less steep.

Figure 4.4: Implied compound and base correlations for different block sizes (n = 30, 60, 90) in the two-layer hierarchical model. Parameters: µ = 0.3, ∆ = 0.5; further parameters: see Tab. C.2 in Appendix C.

Variation of Degrees of Freedom for Student-t Copula

Compared to the Gaussian copula, the fatter tails of the Student-t copula increase the probability of extreme events, i.e. the probability of many or few defaults. Following the line of argumentation in Sec. 3.1.1, this amounts to an increase in compound correlation for the equity and senior tranches and hence a more pronounced smile. This is shown in Fig. 4.5 (the results for the Student-t copula with 30 degrees of freedom can be considered identical to those for a Gaussian copula).
Note that the spreads emerging from a Student-t copula with 3 degrees of freedom cannot be repriced in the Gaussian copula framework by any choice of (compound) correlation; hence the associated curve is missing. The base correlation for this case exists, though, but has a negative skew.

Figure 4.5: Implied compound and base correlations for a varying number ν of degrees of freedom for the Student-t copula in the two-layer hierarchical model. Fatter tails from lower ν yield a more pronounced smile. For ν = 3, compound correlation fails to match prices for some tranches. Parameters: µ = 0.3, ∆ = 0.5, block size n = 60; further parameters: see Tab. C.2 in Appendix C.

4.3.1.2 Two Blocks

We extend the two-layer hierarchical model with one block by adding a second block of size n_2 with correlation ρ_2. The transformed parameters in this setup read

\mu_\rho = \frac{\rho_1 + \rho_2}{2}, \qquad \Delta_\rho = \rho_1 - \rho_2,
\mu_n = \frac{n_1 + n_2}{2}, \qquad \Delta_n = n_1 - n_2.   (4.13)

The implementation of the two-layer, two-block hierarchical model requires three factors (for the Gaussian copula).

Variation of Block Sizes

Setting the intra-group correlations equal for both blocks, Fig. 4.6 displays the effect of a growing size difference between the two blocks (for symmetry reasons, we can restrict ourselves to the case n_1 ≥ n_2, or ∆_n ≥ 0). The correlation structure is changed from correlations of mid-sized extension (∆_n = 0) to long-ranged correlations (∆_n large), for ∆_n/2 = µ_n (i.e. n_2 = 0) finally degenerating into a one-block model.
It can be seen that the case of large asymmetry (i.e. the regime close to the one-block model) yields the best results in the sense that it is able to generate larger compound correlations for the senior tranches, which in the other cases are only on the level of the equity tranche correlation or below. As for the other investigated parameter regimes in this model (see below), the base correlation is almost flat (or even downwards skewed) and does not feature the rising slope commonly observed in the market.

The effect of a variation of the mean block size µ_n is not shown here as it basically results in a shift of the curves. Hence, the additional block does not provide an improvement over the one-block model, at least with respect to the additional size degree of freedom.

Figure 4.6: Implied compound and base correlations for varying size difference ∆_n between the two blocks in the two-layer hierarchical model. Parameters: ρ = 0.2, µ_ρ = 0.7, ∆_ρ = 0, µ_n = 35; further parameters: see Tab. C.2 in Appendix C.

Variation of Correlation Strengths

As can be seen from Figs. 4.7 and 4.8, the mean correlation strength as well as the correlation difference between the two blocks are capable of influencing the depth of the correlation smile, while there is little effect on the skew. For large ∆_ρ, the compound correlation for the senior tranche can be significantly elevated; the associated base correlation has a concavely curved shape. Note that this regime again approaches the one-block model: the large block (∆_n > 0, n_1 > n_2) has a high correlation whereas the correlation in the small block is close to the inter-sector correlation.
We do not examine the variation of the inter-sector correlation ρ in detail, as it does not exhibit features qualitatively different from the already investigated parameter regimes. It essentially leads to a shift in the curves and a slight modification of the smile depth in the compound correlation.


Figure 4.7: Implied compound and base correlations for varying µ_ρ in the two-layer hierarchical model, see Eq. (4.13). Parameters: ρ = 0.2, ∆_ρ = 0, µ_n = 35, ∆_n = 30; further parameters: see Tab. C.2 in Appendix C.

Figure 4.8: Implied compound and base correlations for varying correlation difference ∆_ρ between the two blocks in the two-layer hierarchical model. Parameters: ρ = 0.2, µ_ρ = 0.6, µ_n = 35, ∆_n = 30; further parameters: see Tab. C.2 in Appendix C.

4.3.2 Block Model

4.3.2.1 Three Blocks

As an example of a block model, we finally look at a block diagonal correlation matrix with three blocks, characterized by their sizes n_i and their intra-sector correlations ρ_i, i ∈ {1, 2, 3}. This model is of the same computational complexity as the two-layer, two-block hierarchical model, requiring three factors when using the Gaussian copula.


Again, the means and differences of sizes and correlations shall be used,

\mu_\rho = \frac{\rho_1 + \rho_2 + \rho_3}{3}, \qquad \Delta_\rho = \rho_1 - \rho_2 = \rho_2 - \rho_3,
\mu_n = \frac{n_1 + n_2 + n_3}{3}, \qquad \Delta_n = n_1 - n_2 = n_2 - n_3.   (4.14)

Note that this choice of parameters is a special case: variation of ∆_ρ, for instance, increases the correlation in the first block, decreases the correlation in the third block, and leaves the correlation in the second block unchanged; likewise for ∆_n. This lowers the number of free parameters of the model from six to four. Hence, the full parameter space is not investigated; however, we expect this choice of parameters to be capable of capturing all relevant dynamics of the model.

Variation of Block Sizes

The variation of the mean block size µ_n just results in a shift of the implied correlation curves and is not separately shown here. More interestingly, sweeping the size difference between the blocks from symmetric to strongly asymmetric (with the largest block featuring the highest correlation) enhances the smile depth by moderately increasing the compound correlation for the equity tranche and significantly increasing the one for the senior tranche, see Fig. 4.9. This is similar to the case of large asymmetry in Fig. 4.8 and also exhibits the curved shape in the base correlation.

Figure 4.9: Implied compound and base correlations for varying size difference ∆_n between the three blocks in the block model. Parameters: µ_ρ = 0.5, ∆_ρ = 0.3, µ_n = 30; further parameters: see Tab. C.2 in Appendix C.


Variation of Correlation Strengths

For a large correlation asymmetry between the blocks, the correlation structure can be tuned to a regime with an even more pronounced curvature in the base correlation, see Fig. 4.10. For the most extreme case (∆_ρ = 0.4), the obtained spreads can no longer be reproduced in the compound correlation framework at all.

Figure 4.10: Implied correlations for varying correlation difference ∆_ρ between the three blocks in the block model, see Eq. (4.14). Parameters: µ_ρ = 0.5, µ_n = 30, ∆_n = 20; further parameters: see Tab. C.2 in Appendix C.

4.3.3 Summary of Results

In summary, the one-block hierarchical model provides a simple but versatile way to generate characteristic features in the implied correlation curves, like a potentially deep smile in the compound correlation and an upwards sloping base correlation skew. The two-block hierarchical model does not seem to provide an improvement over this; rather, it is computationally more expensive and generally creates fairly flat base correlation curves. The implied correlations produced by the block model do not differ fundamentally from the other models either; however, the flexibility in shaping the base correlation curve seems to be higher than in the two-block hierarchical model, occasionally exhibiting strongly curved base correlations that cannot be repriced in the compound correlation framework. In general, group models are capable of producing very low implied compound correlations for mezzanine tranches (lower than the lowest pairwise correlation), thereby generating deep smiles. Throughout all investigated regimes, group models fail to exhibit strongly upwards skewed base correlations, though.

The model calculations in this chapter have been performed with a homogeneous reference portfolio (i.e. homogeneous single-name default probabilities, cp. Tab. C.2 in Appendix C). Since the group model parameterization by itself introduces inhomogeneities in the correlation matrix (e.g. entries belonging to a group have a different impact on the implied correlation than entries not belonging to a group), the location of obligors in the correlation matrix starts to play a role. Although the effect of positioning would require further investigation, it is probably advisable to put obligors with larger default risk into groups rather than into the "bulk", hence allowing for a potentially larger impact on the implied correlation curve.

Chapter 5

Numerical Implementation

The implementation of all pricing models laid out in the previous chapters was done in the numerical computing environment MATLAB. The developed program is able to handle an arbitrary number of obligors and payment dates, and up to three systematic factors for the used factor model; a Gaussian or Student-t copula can be used, with a number of different integration schemes for calculating the unconditional loss distribution from the conditional loss distribution, see Sec. 5.1.

The code has been thoroughly tested, see the test cases in Tab. 5.1. All tests were passed.

5.1 Integration Schemes

In order to calculate an unconditional property A from its associated conditional property A(Z) (most prominently the loss distribution, Eq. (2.13)), we need to integrate out the factor(s) Z that A(Z) is conditioned on,

A(X = x; t) = \int_\Omega A(X = x; t \,|\, Z = z)\, dG_Z(z).   (5.1)

One can use the probability density function g_Z(z) of Z instead and replace dG_Z(z) by g_Z(z) dz,

A(X = x; t) = \int_\Omega A(X = x; t \,|\, Z = z)\, g_Z(z)\, dz,   (5.2)

effectively integrating the function f(x, z) ≡ A(X = x; t | Z = z) g_Z(z) over z.

In the following, we sketch some approaches to perform this integration numerically. We first assume the factor structure to be one-dimensional; the generalization to multiple dimensions is discussed in Sec. 5.1.4. In Sec. 5.1.5, we validate the different integration schemes with respect to their feasibility and accuracy for the pricing of CDOs.


Test case: expected result.
• Convergence of recursive scheme to binomial distribution: In a homogeneous setup, the recursively derived conditional loss distribution should equal the binomial distribution.
• Retrieval of default curve: Calculating the conditional single-name default probability and subsequent integration should recover the original default curve in the limit of a large number of integration nodes.
• Comparison of beta-fit with example in literature: Beta-fit results for the example correlation matrix given in Ref. [24].
• Comparison of CDO pricing with example in literature: Results for loss distribution and par spreads given in Ref. [19], Tab. 1 and Tab. 2.
• Comparison of implied correlation smile with example in literature: Results for correlation smiles of a single sector with high correlation vs. a background of low correlation given in Ref. [22], Fig. 2 and Fig. 8.
• Convergence of Student-t copula to Gaussian copula: For a large number of degrees of freedom, the spreads from the Student-t copula should equal the ones from the Gaussian copula.
• Convergence of numerical results to LHP approximation: For a large and homogeneous portfolio, the obtained numerical spreads should equal the analytical spreads from the LHP approximation, see below.

Table 5.1: Test cases for the validation of the developed program code.


Often, numerical integration methods are suited to finding approximations to definite integrals \int_a^b, whereas the integral in Eq. (5.2) runs over an unbounded domain. Therefore, the interval [a, b] must be chosen such that the values of f drop sufficiently fast outside the interval. In the spirit of the definition of the Riemann integral, the straightforward discretization of Eq. (5.2) then looks like

A(X = x; t) = \sum_{i=0}^{N-1} f(x, z_i)\, (z_{i+1} - z_i),   (5.3)

where z_0 = a, z_N = b, and the nodes are equidistantly spaced, z_i = z_0 + i\, \frac{z_N - z_0}{N}. This scheme always multiplies the size of the subinterval, z_{i+1} - z_i = \frac{z_N - z_0}{N}, with the function value at the left side of the subinterval. The value f(x, z_N) at the right boundary of the integration interval does not enter into the calculation of A, thus introducing an asymmetry.

5.1.1 Midpoint Rectangle Rule

A symmetric version of Eq. (5.3) is given by the midpoint rectangle rule,

A(x; t) = \frac{z_N - z_0}{N} \sum_{i=0}^{N-1} f\!\left(x, \frac{z_{i+1} + z_i}{2}\right),   (5.4)

where the size of the subinterval has been pulled out of the sum. The midpoint rectangle rule resembles the scheme in Eq. (5.3) with the nodes shifted by (z_{i+1} - z_i)/2.
Other related methods for numerical integration like the trapezoidal rule,

A(x; t) = \frac{z_N - z_0}{N} \left( \frac{f(x, z_0) + f(x, z_N)}{2} + \sum_{i=1}^{N-1} f(x, z_i) \right),   (5.5)

or Simpson's rule basically modify the weights of the function values at the interval endpoints, compared to the simple scheme in Eq. (5.3). Since we aim to choose the interval such that A(x | Z = z_0) ≈ 0 and/or G_Z(z_0) ≈ 0 (and likewise for z_N), these more advanced methods are not expected to provide a significant improvement.
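To make the scheme concrete, here is a minimal Python sketch (the thesis code itself is MATLAB; the names midpoint and cond_p are hypothetical): the midpoint rule of Eq. (5.4) integrates the conditional single-name default probability of a one-factor Gaussian model against the factor density and recovers the unconditional default probability, in the spirit of the "retrieval of default curve" test in Tab. 5.1.

```python
import numpy as np
from scipy.stats import norm

# conditional single-name default probability in a one-factor Gaussian model
p, rho = 0.05, 0.3
c = norm.ppf(p)
cond_p = lambda z: norm.cdf((c - np.sqrt(rho) * z) / np.sqrt(1 - rho))

def midpoint(f, a, b, n):
    """Midpoint rectangle rule, Eq. (5.4), on [a, b] with n subintervals."""
    h = (b - a) / n
    z = a + (np.arange(n) + 0.5) * h
    return h * np.sum(f(z))

approx = midpoint(lambda z: cond_p(z) * norm.pdf(z), -5.0, 5.0, 25)
print(approx, p)   # integrating out the factor recovers the default probability
```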

5.1.2 Sample Mean

Eq. (5.2) can be seen as the expectation value of A(Z) with respect to g_Z. By sampling from the distribution of Z and averaging the values of A, we get the sample mean, which is an estimator for the expectation value,

A(x; t) = \frac{1}{N} \sum_{i=1}^{N} A(X = x; t \,|\, Z = z_i), \qquad z_i \sim G_Z.   (5.6)

Here, the integration nodes z_i are not equidistantly spaced but distributed according to G_Z. This is achieved by applying the inverse CDF to equidistantly spaced (i.e. uniformly distributed) points u_i ∈ [\frac{1}{2N}, 1 - \frac{1}{2N}],

z_i = G_Z^{-1}(u_i) \sim G_Z,   (5.7)

which is known as inverse transform sampling, cp. Appendix A. u_i is taken from [\frac{1}{2N}, 1 - \frac{1}{2N}] rather than from (0, 1) (note that G_Z^{-1}(0) = -\infty and G_Z^{-1}(1) = \infty, therefore both values are excluded from the interval) because we want the nodes to lie in the middle of the subintervals, like for the midpoint rectangle rule. It also reflects that we take into account the far end of the distribution tail as we go to a higher number of nodes.
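Eqs. (5.6)-(5.7) translate into a few lines of illustrative Python (same test integrand as in the previous sketch):

```python
import numpy as np
from scipy.stats import norm

p, rho = 0.05, 0.3
c = norm.ppf(p)
A = lambda z: norm.cdf((c - np.sqrt(rho) * z) / np.sqrt(1 - rho))

N = 25
u = (np.arange(N) + 0.5) / N   # equidistant points in [1/(2N), 1 - 1/(2N)]
z = norm.ppf(u)                # inverse transform sampling, Eq. (5.7)
print(np.mean(A(z)), p)        # sample mean, Eq. (5.6)
```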


5.1.3 Gaussian Quadrature

The integration methods pointed out above can be stated as a weighted sum of function values of A(Z) at specified nodes z_i with specified weights (proportional to g_Z(z_i) for the rectangle rule, constant for the sample mean method).

In the same way, Gaussian quadrature is a general term for choosing the nodes zi and weights wi such that certain types of functions (depending on the actual quadrature scheme) are integrated optimally (or even exactly) with a given number of nodes.

5.1.3.1 Factor with Gaussian Distribution

For a normally distributed factor, we use the Gauss-Hermite scheme, which allows for integrating functions of the type f(z) = A(z) e^{-z^2}. The integration is exact if A(z) is a polynomial of degree at most 2N − 1 (N being the number of nodes) and very accurate if A(z) is well approximated by such a polynomial,

\int_{-\infty}^{\infty} f(z)\, dz = \int_{-\infty}^{\infty} e^{-z^2} A(z)\, dz \approx \sum_{i=1}^{N} w_i\, A(z_i).   (5.8)

Here, z_i is the i-th of the N roots of the Hermite polynomial H_N(z) and the weight w_i is given by

w_i = \frac{2^{N-1}\, N!\, \sqrt{\pi}}{N^2\, [H_{N-1}(z_i)]^2}.   (5.9)

As a simple example consider f(z) = e^{-z^2}, i.e. A(z) ≡ 1 constant,

\int_{-\infty}^{\infty} e^{-z^2}\, dz = \sum_{i=1}^{N} w_i = \sqrt{\pi}.   (5.10)

The last equality holds since the weights obtained from Eq. (5.9) always add up to \sqrt{\pi}, independent of N. This reflects the fact that the constant function A(z) equals the 0-th order Hermite polynomial and therefore can be exactly integrated by any number of nodes.

Eq. (5.2) with G_Z being a standard normal distribution reads

A(X = x; t) = \int_{-\infty}^{\infty} A(X = x; t \,|\, Z = z)\, \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz,   (5.11)

which after the substitution y = z/\sqrt{2} turns into

A(x; t) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} A(x; t \,|\, Z = \sqrt{2}\, y)\, e^{-y^2}\, dy \approx \frac{1}{\sqrt{\pi}} \sum_{i=1}^{N} w_i\, A(x; t \,|\, Z = \sqrt{2}\, y_i).   (5.12)
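Eq. (5.12) in code (illustrative Python; NumPy's hermgauss returns the roots y_i of H_N and the weights w_i of Eq. (5.9) for the weight function e^{-y²}):

```python
import numpy as np
from scipy.stats import norm

p, rho = 0.05, 0.3
c = norm.ppf(p)
A = lambda z: norm.cdf((c - np.sqrt(rho) * z) / np.sqrt(1 - rho))

y, w = np.polynomial.hermite.hermgauss(25)   # nodes and weights, Eq. (5.9)
approx = np.sum(w * A(np.sqrt(2) * y)) / np.sqrt(np.pi)   # Eq. (5.12)
print(approx, p)
```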


5.1.3.2 Factor with χ²-Distribution

For the Student-t copula, one factor variable is χ²-distributed,

g_Z(z; \nu) = \begin{cases} \dfrac{z^{\nu/2 - 1}\, e^{-z/2}}{2^{\nu/2}\, \Gamma(\nu/2)}, & \text{for } z \ge 0 \\ 0, & \text{else.} \end{cases}   (5.13)

The integration over this variable can be performed using the Gauss-Laguerre scheme, which is the appropriate method for integrating functions of the form f(z) = A(z) e^{-z}, with the integration again being exact if A(z) is a polynomial of degree at most 2N − 1. For the Gauss-Laguerre scheme, z_i is the i-th root of the Laguerre polynomial L_N(z) and the weight w_i is given by

w_i = \frac{z_i}{(N+1)^2\, [L_{N+1}(z_i)]^2}.   (5.14)

Eq. (5.2) with G_Z being a χ² distribution reads

A(X = x; t) = \int_{0}^{\infty} A(X = x; t \,|\, Z = z)\, \frac{z^{\nu/2-1}\, e^{-z/2}}{2^{\nu/2}\, \Gamma(\nu/2)}\, dz.   (5.15)

We substitute y = z/2 and get

A(x; t) = \frac{1}{\Gamma(\nu/2)} \int_{0}^{\infty} A(x; t \,|\, Z = 2y)\, y^{\nu/2-1}\, e^{-y}\, dy \approx \frac{1}{\Gamma(\nu/2)} \sum_{i=1}^{N} w_i'\, A(x; t \,|\, Z = 2y_i),   (5.16)

where w_i' = w_i\, y_i^{\nu/2-1}.
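Eq. (5.16) in code (illustrative; the placeholder integrand stands in for the conditional property and is chosen so that the exact value is known from the χ² moment generating function):

```python
import numpy as np
from scipy.special import gamma

nu = 3
y, w = np.polynomial.laguerre.laggauss(25)   # nodes and weights, Eq. (5.14)
A = lambda z: np.exp(-z / nu)                # placeholder conditional property
w_mod = w * y ** (nu / 2 - 1)                # modified weights w_i', Eq. (5.16)
approx = np.sum(w_mod * A(2 * y)) / gamma(nu / 2)
print(approx, (1 + 2 / nu) ** (-nu / 2))     # exact: E[e^{-Z/nu}] for Z ~ chi2_nu
```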

5.1.4 Multivariate Integration

The obvious way to perform a d-dimensional integration with any of the integration schemes is a decomposition into a sequence of nested one-dimensional integrations, where for each dimension the same nodes and weights as in the one-dimensional integration are used.
However, this approach has at least two potential disadvantages. Firstly, the corners of the multidimensional integration domain are weighted with very low weights (the product of weights which are already small individually) and essentially waste computational time as they contribute almost nothing to the integral value. Secondly, the nodes on the diagonals have a larger spacing, \sqrt{\sum_{j=1}^{d} \big(z_{i+1}^{(j)} - z_i^{(j)}\big)^2}, than the nodes on the axes. Both drawbacks are related in that they do not account for the (potential) symmetry properties of the problem. In more extreme cases, function values that significantly contribute to the integral value might be mainly concentrated

44 5 Numerical Implementation 5.1 Integration Schemes in a small region (e.g. along an axis), thereby rendering a large part of the integration domain useless. For multivariate Gauss-Hermite integration, this issue is discussed in Ref. [26]. Since multivariate functions integrated by Gauss-Hermite are typically of a Gaussian type and have a Gaussian dependency structure, the article [26] focuses on integration schemes for rotational symmetric and correlated Gaussians and puts forward an integration in polar coordinates. For the integration of the conditional loss distribution discussed here, we stick to the simple nesting of one-dimensional integrations, though. The construction of the Gaussian copula factor model as well as of the Student-t copula model in fact imposes a symmetry with respect to the (Gaussian) factors which might favor the use of po- lar coordinates. This symmetry is broken by the (in general) different beta factors, though, making the determination of an optimal integration scheme a demanding topic by itself. Therefore, we leave a further refinement of optimal multivariate inte- gration techniques for CDOs up to further research.

5.1.5 Validation of Different Schemes for CDO Pricing

The integration schemes laid out above were validated with respect to their accuracy for the pricing of CDOs. This was done by calculating the tranche spreads of an example CDO using the different integration schemes with a varying number of nodes and comparing the results to the analytic results from the corresponding LHP approximation. As a measure of accuracy, we use the sum of squared relative spread differences normalized by the number of tranches K,

\epsilon = \frac{1}{K} \sum_{k=1}^{K} \left( \frac{s_k - s_k^{LHP}}{s_k^{LHP}} \right)^2.   (5.17)

For a good agreement between LHP and the numerical results, the numerical results are based on a homogeneous reference portfolio with 500 names. For a comparison, the continuous LHP CDF (2.27) (PDF (2.35) for the Student-t copula, respectively) has to be discretized, i.e. the probabilities for a discrete number n of defaulted names have to be derived. This is (approximately) achieved by

P(K = n) = \begin{cases} F\!\left(\frac{n+0.5}{N}\right) - F\!\left(\frac{n-0.5}{N}\right), & \text{for } 0 < n < N \\ F\!\left(\frac{0.5}{N}\right), & \text{for } n = 0 \\ F(1) - F\!\left(\frac{N-0.5}{N}\right), & \text{for } n = N \end{cases}   (5.18)

for the Gaussian copula (i.e. the probability of n defaults is the difference between n − 0.5 and n + 0.5 defaults, adjusted at the boundaries), and

P(K = n) = \frac{1}{N}\, f\!\left(\frac{n}{N}\right)   (5.19)

for the Student-t copula.

Fig. 5.1 shows a logarithmic plot of the results for a one-factor Gaussian copula structure, for an investment grade reference portfolio (left figure) and a high yield portfolio (right figure). ε = 0 corresponds to the analytic LHP result (5.18). The investigated integration schemes are two symmetric rectangle schemes (i.e. z_0 = −\frac{1}{2}·size, z_N = +\frac{1}{2}·size in Eq. (5.4)) with different interval sizes, two asymmetric rectangle schemes ([z_0, z_N] = [−4, 2] and [−7, 3], respectively), as well as the sample mean and the Gauss-Hermite scheme.

Figure 5.1: Convergence behavior of different integration schemes for the one-factor Gaussian copula, for an investment grade reference portfolio (constant hazard rate 1%, left panel) and a high yield portfolio (constant hazard rate 10%, right panel). Used parameters: see Tab. C.1 in Appendix C.

With a higher number of integration nodes, the error in general decreases for all schemes, with some peculiarities, though. First, as the LHP approximation is only exact for an infinite number of obligors, and due to the inexact discretization, the numerical results converge to a value different from zero. They might also coincidentally come close to the LHP solution for a given number of nodes and then deviate again towards their convergence point, resulting in a positive gradient in parts of the curve. For the investment grade portfolio, the Gauss-Hermite quadrature, the asymmetric rectangle schemes, and the symmetric rectangle scheme with large interval size converge to roughly the same value for a large number of integration nodes, whereas the sample mean scheme converges very slowly and the symmetric rectangle scheme with small interval size appears to converge to a different value. The reason is that, given the low default probabilities in that portfolio, the default events happen in the far tail of the Gaussian distribution. These tail events are mostly not covered by the small interval of the rectangle scheme, and the sample mean scheme only includes them for a very high number of nodes. This motivates the use of the asymmetric rectangle schemes, which are chosen to cover the interesting region of the distribution tail. This effect is alleviated for the high yield portfolio with its larger default probabilities, leading to faster convergence.
In summary, the asymmetric rectangle scheme with large interval size can be considered the best integration scheme for the one-factor Gaussian copula. As a compromise between accuracy and computational time, 25 integration nodes seem to be a good choice (cp. Tab. 5.2 for an impression of the associated error).


Figure 5.2: Convergence behavior of different integration schemes for the two-factor Student-t copula, with 3 degrees of freedom (left panel), and 10 degrees of freedom (right panel). Both factors always have the same number of nodes. Used parameters: see Tab. C.1 in Appendix C.

Figure 5.2 gives the results for the two-factor (i.e. one χ² factor and one Gaussian factor) Student-t copula structure, for different numbers of degrees of freedom. The different integration schemes refer to the integration of the χ² variable; for the integration of the Gaussian factor variable, an asymmetric rectangle scheme with interval [−7, 3] has been used throughout all calculations. The size of the rectangle scheme interval now depends on the number of degrees of freedom ν in order to account for the broadening of the χ² distribution with increasing ν. A sample mean scheme or Gauss-Laguerre scheme with 25 integration nodes appears to be a suitable choice for the Student-t copula.
The results for the two-factor Gaussian copula structure are found in Figure 5.3. Since the LHP approximation is only applicable for a single factor structure, the factor loadings have to be chosen such that a single factor structure is mimicked,

X = \beta_1 Z_1 + \beta_2 Z_2 + \sqrt{1 - \beta_1^2 - \beta_2^2}\, \epsilon \overset{!}{=} \sqrt{\rho}\, Z + \sqrt{1 - \rho}\, \epsilon.   (5.20)


This is easily achieved by choosing β_1 and β_2 such that ρ = β_1² + β_2² (because β_1 Z_1 + β_2 Z_2 ∼ N(0, β_1² + β_2²) and the correlation ρ(X_1, X_2) = β_1² + β_2² is preserved).


Figure 5.3: Convergence behavior of different integration schemes for the two-factor Gaussian copula, for an investment grade reference portfolio (constant hazard rate 1%, left panel), and a high yield portfolio (constant hazard rate 10%, right panel). The integration scheme and the number of nodes always refers to the integration of both factors. Used parameters: see Tab. C.1 in Appendix C.

Qualitatively, the same behavior as for the one-factor Gaussian is observed. However, good accuracy can already be achieved with a lower number of nodes per dimension (the total number of nodes still being higher, though).

Note that spline functions have been chosen in order to interpolate the data in the plots above. Due to the complexity and nonlinearity of the problem, there is no known or expected convergence behavior for the plotted errors that would suggest a particular fitting or interpolation function. The spline functions have therefore only been chosen for their ability to smoothly interpolate the data.

5.2 Existence and Uniqueness of Correlation

As mentioned in Chapter 3 and experienced in Sec. 4.3, implied compound and base correlations for given spreads are not guaranteed to exist or to be unique. To illustrate this situation, Fig. 5.4 displays, for two mezzanine tranches, the difference between the "market" (group model) spread and the spread from pricing with the indicated compound correlation. The spread emerging from using a Student-t copula with 3 degrees of freedom cannot be attained by any choice of compound correlation (cp. Fig. 4.5), whereas the other curve shows the more common situation of two compound correlations fitting the market spread (cp. Fig. 4.3).


Tranche spreads [bps]
Method       0%–3%   3%–6%   6%–9%   9%–12%   12%–22%
LHP           1846     221      66       25         5
Numerical     1849     229      69       27         5

Table 5.2: Comparison of tranche spreads for the LHP approximation and numerical results in the factor model framework (investment grade portfolio, asymmetric rectangle scheme with large interval size, 25 nodes; all other parameters as in Fig. 5.1). The differences in spreads listed here are an upper limit for the error of the integration scheme since the numerical results do not converge to the LHP approximation (due to the finite basket size). Typically, the integration errors in this thesis do not exceed ≈ 3% in terms of implied correlation.

Figure 5.4: Uniqueness and existence of implied compound correlations for mezzanine tranches: spread difference as a function of compound correlation for the 6%–9% tranche in the Student-t one-block hierarchical model (3 degrees of freedom, µ = 0.3, ∆ = 0.5, n = 60) and the 3%–6% tranche in the Gaussian one-block hierarchical model (µ = 0.3, ∆ = 0.5, n = 30). Used parameters: see Tab. C.2 in Appendix C.

5.3 Performance

Tab. 5.3 provides some performance figures for the developed program code. All times are in seconds on a 2.53 GHz dual core processor. No simplifying assumptions like homogeneity etc. have been made. The underlying deal maturity is five years on a quarterly schedule, i.e. 20 payment dates (the calculation of the default distribution must be repeated for each payment date).
Further investigation shows that the major part of the CPU time (approximately 80%) goes into the computation of the recursion relation for the conditional loss distribution, Eq. (2.22). That recursion is of order O(N²), which is reflected by the roughly quadratic relation between computational time and number of obligors in Tab. 5.3.


CDO pricing: CPU times [s]
                              1 factor                  2 factors
Nodes per factor        10      20      30          10       20
Gaussian
  50 obligors          0.06    0.11    0.15        0.63     2.75
  100 obligors         0.21    0.43    0.64        2.32     9.96
  200 obligors         0.84    1.81    2.76        9.46    38.58
Student-t (ν = 3)
  50 obligors          n.a.    n.a.    n.a.        0.73     2.98
  100 obligors         n.a.    n.a.    n.a.        2.44    10.01
  200 obligors         n.a.    n.a.    n.a.        9.22    35.33

Table 5.3: CPU times for CDO pricing in different parameter regimes.

As expected, the CPU time grows roughly linearly with the total number of (d-dimensional) nodes that need to be evaluated and integrated. Thus, the computation time grows exponentially with the factor dimension d, with the base being the number n of nodes per dimension, t ∝ n_tot = n^d. Pricing a CDO with a Student-t copula is per se not significantly more expensive than with a Gaussian copula; however, a Student-t copula requires at least a two-dimensional factor, and the number of integration nodes required for a sufficient accuracy might be higher for low degrees of freedom, see Sec. 5.1.5 above.
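For orientation, the following Python fragment sketches a loss recursion of this type (the exact form of Eq. (2.22) is not reproduced here; this is the standard construction in the spirit of Ref. [7], adding obligors one at a time with an O(N) update per obligor and hence O(N²) overall):

```python
import numpy as np
from scipy.stats import binom

def cond_loss_dist(q):
    """Conditional distribution of the number of defaults; q[i] is the
    conditional default probability of obligor i given the factor."""
    p = np.zeros(len(q) + 1)
    p[0] = 1.0                      # no obligors processed: surely no defaults
    for i, qi in enumerate(q):
        # O(N) update per obligor -> O(N^2) overall, cp. Tab. 5.3
        p[1:i + 2] = p[1:i + 2] * (1 - qi) + p[:i + 1] * qi
        p[0] *= 1 - qi
    return p

# homogeneous check from Tab. 5.1: the recursion reproduces the binomial law
q = np.full(100, 0.05)
assert np.allclose(cond_loss_dist(q), binom.pmf(np.arange(101), 100, 0.05))
```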

For the calculation of implied correlations (compound or base), a root-finding algorithm must be employed. Numerical root-finding methods use iteration, which effectively amounts to a sequence of full CDO valuations (since the recursion needs to be performed in each iteration step). For the developed code, the built-in MATLAB function fzero was used. This function uses a combination of bisection, secant, and inverse quadratic interpolation methods for the root finding [9, 16] and typically needs around 10 evaluations to arrive at a very good accuracy. This amounts to approx. 5 sec. for the calculation of an implied correlation of a typical index tranche (the implied correlation structure is flat and Gaussian by definition, therefore one Gaussian factor is used; the number of factor nodes has been chosen to be 25).
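Schematically, with SciPy's brentq playing the role of fzero (the stand-in pricer below is purely illustrative, chosen smooth and monotone so that the bracketing root finder applies; in practice each evaluation would be one full CDO valuation):

```python
from scipy.optimize import brentq

def tranche_spread(rho):
    """Toy stand-in for one full CDO valuation under flat correlation rho."""
    return 2000.0 * (1.0 - rho) ** 2 + 5.0

market_spread = 500.0
# each evaluation inside the root finder corresponds to one full CDO pricing
rho_implied = brentq(lambda r: tranche_spread(r) - market_spread, 0.0, 1.0)
print(rho_implied)
```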

Conclusions

This work is devoted to the effect of the dependency structure of reference portfolios on the value of CDO tranches written on these portfolios. In particular, we proposed a model for the correlation matrix of the reference portfolio and showed the suitability of this model to exhibit a smile (skew) shape in the implied compound (base) correlation curve of the CDO. This modeling approach for the correlation smile differs from previous ones in that only the correlation structure is utilized to generate the smile; recovery rates, the used copula, etc. remain unchanged compared to the standard industry model (i.e. the Gaussian copula model with flat correlation). This also allows for an efficient numerical implementation, as the chosen parameterization of the correlation matrix can be naturally integrated into the common factor model framework, only requiring a small number (e.g. one) of additional factors.

In order to identify suitable structures of the correlation matrix, we made use of results from the investigation of empirical correlations of stock price returns (which are commonly used as a proxy for default correlation) that suggest that correlation matrices observed from the market are to a large degree random, i.e. they only contain a small number of driving risk factors to which the names in the portfolio are significantly exposed. These studies are based on the comparison of the spectra of the empirical correlation matrices to the predictions of random matrix theory and also provide some insights into the underlying structure of the correlation matrices, which appear to be organized in groups of strongly correlated names.

Based on these findings, we proposed a general group structure of the correlation matrix which allows for arbitrary nesting of groups, reflecting the economic intuition of industry sectors (e.g. technology sector) within which companies are more strongly correlated and which possibly can be divided further into sub-groups of even higher correlation (e.g. telecommunication sector within the technology sector). We showed how these group structures can be directly mapped onto factor models, with the number of required factors depending on the complexity of the group model.


Studying the shape of the implied compound and base correlation curves emerging from these group model correlation structures, we found that group models are capable of generating characteristic features like a potentially deep smile in the compound correlation and an upwards sloping base correlation skew. It turned out that the best compromise between complexity (in terms of the number of adjustable parameters and computational costs) and flexibility (in terms of the tunability of the correlation curves) is provided by one of the simplest investigated models, namely one block of highly correlated obligors on a background of weakly correlated obligors (which requires only two factors in the factor model framework). As for the other models, the parameter regimes of this model were investigated in detail and the impact of the variation of each parameter on the compound and base correlation curve was discussed. However, although the forming of correlation smiles and skews from the investigated models is an interesting and considerable result, the emerging smiles/skews in some aspects fail to accurately reproduce the observed market behavior, in particular strongly skewed base correlations and high compound correlations for the super senior tranche.

Finally, we studied the performance of different integration schemes for integrating over the distribution of the factor variables in order to calculate unconditional properties in the factor model framework. For the Gaussian copula as well as for the Student-t copula with different degrees of freedom, the accuracy (with respect to the analytical result obtained from the large homogeneous portfolio approximation) was assessed. Surprisingly, a simple rectangle scheme turned out to be superior (in most cases) to the Gauss-Hermite quadrature scheme for standard normally distributed factors, whereas the Gauss-Laguerre quadrature appeared to be best suited for the χ²-distributed factor arising in the Student-t copula.

Appendix A

Inverse Transform Sampling

For the sample mean integration scheme, a number of nodes z_i with a given distribution need to be generated, i.e. the spacing between the nodes follows a density function g. Let G be the corresponding CDF, G(z) = P(Z ≤ z). For a uniformly distributed random variable U we have by definition P(U ≤ u) = u. We now define a mapping from z to u,

z \mapsto u = G(z),   (A.1)

and get

u = G(z) = P(Z \le z) = P(G(Z) \le G(z)) = P(G(Z) \le u),   (A.2)

where we have used the monotonicity of G. We see that G(Z) is uniformly distributed, i.e. we can obtain a uniformly distributed variable U via U = G(Z). Inversely, G^{-1}(U) is distributed according to G, which we use to create our integration nodes, z_i = G_Z^{-1}(u_i).
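A quick numerical check of this construction (illustrative Python; the Kolmogorov-Smirnov test is merely one convenient way to verify the distribution of the transformed sample):

```python
import numpy as np
from scipy.stats import norm, kstest

rng = np.random.default_rng(0)
u = rng.uniform(size=10_000)   # U uniform on (0, 1)
z = norm.ppf(u)                # z = G^{-1}(u) is distributed according to G
print(kstest(z, "norm"))       # normality is not rejected
```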

Appendix B

Correlation in Student-t Factor Model

We use the definition of the correlation given in Eq. (2.11) and examine its individual terms for two Student-t distributed asset variables (ν degrees of freedom) X_i, X_j following the model in Eq. (2.19):¹

• E[X_i] = E[X_j] = 0, see Ref. [3].

• σ(X_i) = σ(X_j) = \sqrt{\frac{\nu}{\nu-2}}, see Ref. [3].

For calculating E[X_i X_j], we use the fact that X_i and X_j share the common variables g and V,

E[X_i X_j] = E\!\left[ \frac{\nu}{g} \left( \beta_i \beta_j V^2 + \beta_i \bar\beta_j V V_j + \bar\beta_i \beta_j V_i V + \bar\beta_i \bar\beta_j V_i V_j \right) \right],   (B.1)

where \bar\beta_i \equiv \sqrt{1 - \beta_i^2}.

Using the independence of g, V, V_i, and V_j (for i ≠ j), we obtain

E[X_i X_j] = E\!\left[\frac{\nu}{g}\right] \cdot \beta_i \beta_j \cdot E[V^2],   (B.2)

with E[ν/g] = \frac{\nu}{\nu-2} being the mean of a scale-inverse chi-square distribution with scale parameter 1 (see Ref. [3]), and E[V²] = 1 the mean of a chi-square distribution with 1 degree of freedom [3]. Putting everything together, we arrive at

\rho(X_i, X_j) = \begin{cases} \beta_i \beta_j, & \text{for } i \ne j \\ 1, & \text{for } i = j. \end{cases}   (B.3)

¹ As mentioned in Sec. 2.3.2, we impose the restriction ν > 2.


Finally, note that by setting β_i = 0 in Eq. (2.19), we get X_i = \sqrt{\nu/g}\, V_i, i.e. even for zero correlation, X_i and X_j still depend on each other via the common driver g. This shows that for a multivariate Student-t distribution, the correlation does not fully capture the dependency structure.
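Eq. (B.3) can also be checked by simulation; a sketch (illustrative Python, assuming the factor structure with the common mixing variable g and common Gaussian factor V used above):

```python
import numpy as np

rng = np.random.default_rng(1)
nu, bi, bj, n = 8, 0.7, 0.5, 200_000
g = rng.chisquare(nu, n)               # common chi-square mixing variable
V = rng.standard_normal(n)             # common Gaussian factor
Vi, Vj = rng.standard_normal((2, n))   # idiosyncratic Gaussian terms
s = np.sqrt(nu / g)
Xi = s * (bi * V + np.sqrt(1 - bi**2) * Vi)
Xj = s * (bj * V + np.sqrt(1 - bj**2) * Vj)
print(np.corrcoef(Xi, Xj)[0, 1], bi * bj)   # sample correlation vs. Eq. (B.3)
```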

Appendix C

Used Parameters

Parameter                 All        Gaussian 1d    Student-t    Gaussian 2d
# of obligors             500
# of payment dates        4 (1y)
Risk-free interest rate   5%
Loss notional             1
Recovery rate             40%
Accrual factors           0.25
Correlation               60%
Attachment points         0%–3%, 3%–6%, 6%–9%, 9%–12%, 12%–22%
Hazard rate                          1%/10%         5%           1%/10%
Degrees of freedom                                  3/10

Table C.1: Used valuation parameters for the validation of integration schemes in Sec. 5.1.5.


Parameter                      All                         Group Model    Implied Correlation
# of obligors                  125
# of payment dates             20 (5y)
Risk-free interest rate        5%
Loss notional                  1
Recovery rate                  40%
Accrual factors                0.25
Attachment points              0%–3%, 3%–6%, 6%–9%, 9%–12%, 12%–22%
Hazard rate                    1%
Integration scheme             Rectangle asymm., size 10
Integration nodes per factor                               15             25

Table C.2: Used valuation parameters for the implied correlations of group models in Sec. 4.3 and Sec. 5.2. The column "Group Model" indicates the parameters used for the calculation of the spreads generated from the group models; the column "Implied Correlation" lists the parameters used in the course of the root finding for the implied correlations.


Bibliography

[1] Global CDO Market Issuance Data. Technical report, Securities Industry and Financial Markets Association, 2009. URL www.sifma.org/uploadedFiles/Research/Statistics/SIFMA_GlobalCDOData.pdf.

[2] White House Press Release, October 3, 2008. URL http://www.whitehouse.gov/news/releases/2008/10/20081003-17.html.

[3] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions. 1972.

[4] J. D. Amato and J. Gyntelberg. CDS index tranches and the pricing of credit risk correlations. BIS Quarterly Review, Bank for International Settlements, March 2005.

[5] L. Andersen. Portfolio Losses in Factor Models: Term Structures and Intertemporal Loss Dependence. 2006.

[6] L. Andersen and J. Sidenius. Random Recovery and Random Factor Loadings. Journal of Credit Risk, 1(1):29–70, 2004.

[7] L. Andersen, J. Sidenius, and S. Basu. All your hedges in one basket. RISK, pages 67–72, November 2003.

[8] F. Black and J. C. Cox. Valuing Corporate Securities: Some Effects of Indenture Provisions. Journal of Finance, 31(2):351–367, 1976.

[9] R. Brent. Algorithms for Minimization Without Derivatives. Prentice-Hall, 1973.

[10] X. Burtschell, J. Gregory, and J.-P. Laurent. Beyond the Gaussian Copula: Stochastic and Local Correlation. Journal of Credit Risk, 3(1):31–62, 2007.

[11] X. Burtschell, J. Gregory, and J.-P. Laurent. A comparative analysis of CDO pricing models. 2008.


[12] O. Cousseran and I. Rahmouni. The CDO market: Functioning and implications in terms of financial stability. Technical report, Banque de France, 2005.

[13] A. Edelman and N. R. Rao. Random Matrix Theory. Acta Numerica, pages 1–65, 2005.

[14] P. Embrechts, F. Lindskog, and A. McNeil. Modelling Dependence with Copulas and Applications to Risk Management. In S. Rachev, editor, Handbook of Heavy Tailed Distributions in Finance, chapter 8, pages 329–384. Elsevier, 2003.

[15] C. Ferrarese. A comparative analysis of correlation skew modeling techniques for CDO index tranches. Master’s thesis, 2006.

[16] G. E. Forsythe, M. A. Malcolm, and C. B. Moler. Computer Methods for Mathematical Computations. Prentice-Hall, 1976.

[17] G. Frahm and U. Jaekel. Random Matrix Theory and Robust Estimation for Financial Data, 2005. URL http://www.citebase.org/abstract?id=oai:arXiv.org:physics/0503007.

[18] R. Frey and A. J. McNeil. Dependent Defaults in Models of Portfolio Credit Risk. Journal of Risk, 6(1):59–92, 2003.

[19] M. S. Gibson. Understanding the Risk of Synthetic CDOs. 2004.

[20] J. Gregory and J.-P. Laurent. I will survive. RISK, 16(6):103–107, 2003.

[21] J. Gregory and J.-P. Laurent. In the Core of Correlation. RISK, 17(10):87–91, 2004.

[22] S. Hager and R. Schöbel. A Note on the Correlation Smile. 2005.

[23] S. Hager and R. Schöbel. Deriving the dependence structure of portfolio credit derivatives using Evolutionary Algorithms. Lecture Notes in Computer Science, 3994:340–347, 2006.

[24] C. T. Hille. Synthetic credit baskets. 2007.

[25] R. A. Jarrow and S. M. Turnbull. Pricing Derivatives on Financial Securities Subject to Credit Risk. Journal of Finance, 50(1):53–85, 1995.


[26] P. Jäckel. A note on multivariate Gauss-Hermite quadrature. 2005.

[27] L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters. Noise Dressing of Financial Correlation Matrices. Phys. Rev. Lett, 83(7):1467–1470, 1999.

[28] D. X. Li. On Default Correlation: A Copula Function Approach. Journal of Fixed Income, 9(4):43–54, 2000.

[29] F. Lillo and R. N. Mantegna. Spectral density of the correlation matrix of factor models: A random matrix theory approach. Phys. Rev. E, 72(016219): 1–10, 2005.

[30] R. N. Mantegna. Hierarchical structure in financial markets. Eur. Phys. J. B, 11(1):193–197, 1999.

[31] V. A. Marčenko and L. A. Pastur. Distribution of eigenvalues for some sets of random matrices. Math. USSR Sbornik, 1(4):457–483, 1967.

[32] L. McGinty and R. Ahluwalia. Credit Correlation: A Guide. Technical report, JP Morgan, 2004.

[33] R. C. Merton. On the Pricing of Corporate Debt: The Risk Structure of Interest Rates. Journal of Finance, 29(2):449–470, 1974.

[34] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. Nunes Amaral, and H. E. Stanley. Universal and Nonuniversal Properties of Cross Correlations in Financial Time Series. Phys. Rev. Lett, 83(7):1471–1474, 1999.

[35] L. Schloegl and D. O’Kane. A note on the large homogeneous portfolio approximation with the Student-t copula. Finance and Stochastics, 9:577–584, 2005.

[36] A. Sklar. Random variables, distribution functions, and copulas – a personal look backward and forward. In L. Rüschendorf, B. Schweizer, and M. Taylor, editors, Distributions With Fixed Marginals and Related Topics, pages 1–14. 1996.

[37] S. Willemann. An Evaluation of the Base Correlation Framework. Journal of Credit Risk, 1(4):180–190, 2005.
