Modelling Credit in portfolios of consumer loans: Transition Matrix Model for Consumer Credit Ratings

Madhur Malik and Lyn Thomas 1 School of Management, University of Southampton, United Kingdom, SO17 1BJ

Abstract

The corporate literature has many studies modelling the change in the credit risk of corporate bonds over time. There is far less analysis of the credit risk for portfolios of consumer loans. However behavioural scores, which are commonly calculated on a monthly basis by most consumer lenders are the analogues of ratings in corporate credit risk. Motivated by studies in corporate credit risk, we develop a Markov chain model based on behavioural scores to establish the credit risk of portfolios of consumer loans. We motivate the different aspects of the model – the need for a second order Markov chain, the inclusion of economic variables and the age of the loan – using data on a credit card portfolio from a major UK bank.

JEL classification: C25; G21; G33

Keywords: Markov chain; Credit risk; Logistic regression; Credit scoring

1. Introduction Since the mid 1980s, banks’ lending to consumers has exceeded that to companies ( Crouhy et al 2001). However it was only with the subprime mortgage crisis of 2007 and the subsequent credit crunch that it was realised what an impact such lending had on the banking sector and also how under researched it is compared with corporate lending models. In particular the need for robust models of the credit risk of portfolios of

1 E-mail: [email protected]

1 consumer loans has been brought into sharp focus by the failure of the ratings agencies to accurately assess the credit of Mortgage Backed Securities (MBS) and collateralized debt obligations (CDO) which are based on such portfolios. There are many reasons put forward for the subprime mortgage crisis and the subsequent credit crunch ( Hull 2009, Demyanyk and van Hemert 2008) but one reason that the former led to the latter was the lack of an easily updatable model of the credit risk of portfolios of consumer loans. This lack of suitable models of portfolio level consumer risk was first highlighted during the development of the Basel Accord, when a corporate credit risk model was used to calculate the regulatory capital for all types of loans ( BCBS 2005) even though the basic idea of such a model – that default occurs when debts exceed assets – is not the reason why consumers default. This paper develops a model for the credit risk of portfolios of consumer loans based on behavioural scores for the individual consumers, whose loans make up that portfolio. Such a model would be attractive to lenders, since almost all lenders calculate behavioural scores for all their borrowers on a monthly basis. The behavioural score can be translated into the default probability in a fixed time horizon ( usually one year) in the future for that borrower, but one can consider it as a surrogate for the unobservable creditworthiness of the borrower. We build a Markov chain credit risk model based on behavioural scores for consumers which is related to how the reduced form mark to market corporate credit risk models use a Markov chain approach built on the rating agencies‘ grades, ( Jarrow, Lando, Turnbull 1997). The methodology constructs an empirical forecasting model to derive a multi-period distribution of default rate for long time horizons based on migration matrices built from a historical database of behavioural scores. In our model development we have used the lenders’ behavioural scores but we can use the same methodology on generic bureau scores. This approach helps lenders take long term lending decisions by estimating the risk associated with the change in the quality of portfolio of loans over time. The models also assist in complying with the stress testing requirements in the Basel Accord and other regulations. In addition, the model provides insights on portfolio profitability, the determination of appropriate capital reserves, and creating estimates of portfolio value by generating portfolio level credit loss distributions.

2 There have been a few recent papers which look at modelling the credit risk in consumer loan portfolios. Rosch and Scheule ( Rosch and Scheule 2004) take a variant of the one factor Credit Metrics model , which is the basis of the Basel Accord. They use empirical correlations between different consumer loan types and try to build in economic variables to explain the differences during different parts of the business cycle. Perli and Nayda 2004) also take the corporate credit risk structural models and seek to apply it to consumer lending assuming that a consumer defaults if his assets are lower than a specified threshold. However consumer defaults are usually more about cash flow problems, financial naiveté or fraud and so such a model misses some of the aspects of consumer defaults. Musto and Souleles (2005) use equity pricing as an analogy for changes in the value of consumer loan portfolios, The do look at behavioural scores but take the monthly differences in behavioural scores as the return on assets when applying their equity model. Andrade and Thomas ( 2007) describe a structural model for the credit risk of consumer loans where the behavioural score is a surrogate for the creditworthiness of the borrower. A default occurs if the value of this reputation for creditworthiness , in terms of access to further credit drops below the cost of servicing the debt. Using a case study based on Brazilian credit bureau they found that a random walk was the best model for the idiosyncratic part of creditworthiness. Malik and Thomas (2007) developed a model of time to default for consumer loans where the risk factors were the behavioural score, the age of the loan and economic variables, and used it to develop a credit risk model for portfolios of consumer loans. Bellotti and Crook ( 2008) also used proportional to develop a default risk model for consumer loans. They investigated which economic variables ,might be the most appropriate though they did not use behavioural scores in their model. Thomas (2009b) reviews the consumer credit risk models and points out the analogies with some of the established corporate credit risk models. Since the seminal paper by Jarrow et al ( Jarrow et al 1997), the Markov chain approach has proved popular in modelling the dynamics of the credit risk in corporate portfolios. The idea is to describe the dynamics of the risk in terms of the transition probabilities between the different grades the rating agencies’ award to the firm’s bonds. There are papers which look at how economic conditions as well as the industry sector of the firm

3 affects the transitions matrices, ( Nickell et al 2000) while others generalise the original Jarrow, Lando Turnbull idea, (Hurd and Kuznetsov 2006, ) by using Affine Markov chains or continuous time processes ( Lando Skodeberg 2002). However none of these suggest increasing the order of the Markov chain or considering the age of the loan which are two of the features which we introduce in order to model consumer credit risk using Markov chains. Markov chain models have been used in the consumer lending context before, but none use the behavioural score as the state space nor is the objective of the models to estimate the credit risk at the portfolio level. The first application was by Cyert (1962) who developed a Markov chain model of customer’s repayment behaviour. Subsequently more complex models have been developed by Ho et al (2004) and Trench et al (2003). Schneiderjans and Lock (1994) used Markov chain models to model the marketing aspects of customer relationship management in the banking environment. In section two, we review the properties of behavioural scores and Markov chains, while in section three we describe the Markov chain behavioural score based consumer credit risk model developed. This is parameterised by using cumulative logistic regression to estimate the transition probabilities of the Markov chain. The motivation behind the model and the accuracy of the model’s forecasts are given by means of a case study and section four describes the details of the data used in the case study. Sections five, six and seven give the reasons why one needs to include in the model higher order transition matrices (section five), economic variables to explain the non stationarity of the chain (section six) and the age of the loan (section seven). Section eight describes the full model used, while section nine reports the results of out of time and out of time and out of sample forecasts using the model. The final section draws some conclusions including how the model could be used. It also identifies one issue – which economic variables drive consumer credit risk – where further investigation would benefit all models of consumer credit risk.

2. Behaviour Score Dynamics and Markov Chain models

4 Consumer lenders use behavioural scores updated every month to assess the credit risk of individual borrowers. The score is a sufficient statistic of the probability a borrower will be “Bad” and so default within a certain time horizon (normally taken to be the next twelve months). Borrowers who are not Bad are classified as “Good”. So at time t, a typical borrower with characteristics x(t) ( which may describe recent repayment and usage performance, the current information available at a credit bureau on the borrower, and socio-demographic details) has a score s(x(t),t) so pBxtt( | (),)= pBsxtt ( |((),)) (1) Most scores are log odds score so the direct relationship between the score and the probability of being Bad is given by PG(|((),) sxt t  1 sxtt( ( ), )= log  ⇔ PBsxtt ( | ( ( ), )) = s( xt ( ), t ) (2) PBsxtt(|((),)  1 + e

Applying Bayes theorem to (2) gives the expansion where if p G(t) is the proportion of the population who are Good at time t (p B(t) is the proportion who are Bad) one has

PGsxtt(|((),)pG( t )   PsxttGt (((),)|,)  s( x ( t ), t )= log =+ log  log   =+spop ( t ) woe t ( s ( x ( t ), t )) (3) PBsxtt(|((),) ptB ()   PsxttBt (((),)|,)  The first term is the log of the population odds at time t and the second term is the weight of evidence for that score, (Thomas 2009a). The s pop (t) is common to the scores of all borrowers and plays the role of a systemic factor which affects the default risk of all the borrowers in a portfolio. Normally the time dependence of a behavioural score is ignored by lenders. Lenders are usually only interested in ranking borrowers in terms of risk and they believe that the second term ( the weight of evidence ) in (3), which is the only one that affects the ranking, is time independent over horizons of two or three years. However the time dependence is important because it describes the dynamics of the credit risk of the borrower. Given the strong analogies between behavioural scores in consumer credit and the credit ratings used for corporate credit risk, one obvious way of describing the dynamics of behavioural scores is to use a Markov chain approach similar to the reduced form mark to market models of corporate credit risk (Jarrow at al 1997). To use a Markov chain approach to behavioural scores, we divide the score range into a number of intervals each of which represents a state of the Markov chain, and hereafter when we

5 mention behavioural scores we are thinking of this Markov chain version of the score, where states are intervals of the original score range. Markov chains have proved ubiquitous stochastic processes because their simplicity belies their power to model a variety of situations. Formally, we define a discrete time

{t0,t1,..., tn ,...: n ∈N} and a finite state space S = {1,2,..., s} first order markov chain as a stochastic process {X(t n)} n∈N with the property that for any s 0, s 1, …,s n-1, i, j ∈ S PXt[ ( ) === j|Xt( ) s,Xt( ) s,...,Xt( ) = s ,Xt( ) = i ] n+1 0011n− 11 n − n =PXt[]()()n+1 = j|Xt n == i pt,t ijnn()+ 1 ()4 where pij (t n,t n+1 ) denotes the transition probability of going from state i at time t n to state j at time t n+1 . The s ×s matrix of elements pij (., .), denoted P(t n,t n+1 ), is called the first order transition probability matrix associated with the stochastic process {X(t n)} n∈N. If

π(tn) =π( 1 ( t n) ,..., π sn( t )) describes the probability distribution of the states of the process at time t n, the Markov property implies that the distribution at time t n+1 can be obtained from that at time t u by π(tn+1) = π ( tPt,t n) ( nn + 1 ) . This extends to a m -stage transition matrix so that the distribution at time t n+m for m ≥ 2is given by

π(tnm+) = π ( t n) Pt,t( nn +1) ...Pt( nmnm +−+ 1 ,t ) The Markov chain is called time homogeneous or stationary provided

pt,tij( n n+1 ) = p ij ∀ nN. ∈ (5)

Suppose the process {X(t n)} n∈N is a nonstationary Markov chain. If one has a sample of histories of previous customers, let ni (tn), i∈S, be the number who are in state i at time tn, whereas let nij (tn,tn+ 1) be the number who move from state i at time t n to state j at time tn+1 .

The maximum likelihood estimator of pij( t n ,t n +1 ) is then

nij( t n ,t n +1 ) ˆpt,tij() n n +1 = . ()6 ni() t n If one assumed that the Markov chain was stationary, then given the data for T+1 time periods n= 0, 1, 2,…, T, the Transition probability estimates become

T −1 ∑ nij() t n ,t n +1 ˆp = n=0 ()5 ij T −1 ∑ ni() t n n=0

6 One can weaken the Markov property and require the information about the future is not all in the current state, but is in the current and the last state of the process. This is called a second order Markov chain which is equivalent to the process being a first order Markov chain but with state space S × S. The concept can be generalized to defining kth order Markov chains for any k , though of course, the state space and the size of the transition probability matrices goes up exponentially as k increases.

3. Behavioural score based Markov Chain model of Consumer Credit Risk

One could describe the behavioural score B t of a borrower as an observable variable related to the underlying unobservable variable U t which is the “credit worthiness” of the borrower. Assume that the borrower’s behavioural score is in one of a finite number of states, namely {s 0=D, s1,…sn, C} where s i i>0 describes an interval in the behavioural score range; s 0 =D means the borrower has defaulted and C is the state when the borrower closed his loan or credit card account having repaid everything ( an absorbing state). The Markov property means that the dynamics from time t onwards of the behavioural score is conditional on the score state at time t-1,B t-1. Given the behavioural score is in state s i

i at time t-1, we write the latent variable U t at time t as Ut . For the active accounts, the

i relationship between B t and Ut is that

i i i Bst=⇔≤≤ j j U t j+1, j = 0,1,.. n with 0 =−∞ , n + 1 =∞ (6)

i Assume the dynamics of the underlying variable Ut is determined by the explanatory

i ' i variable vector xt-1 in the linear form Ut=−β it x −1 +ε t , where βi is a column vector of

i i regression coefficients and ε t are random error terms. If the εt are standard logistic distributions. Then this is a cumulative regression model and the transition probabilities of B t are given by

i ' Prob() Bt= s01 |B ti− =s = logit( +β 11 itx − ) , i' i' Prob() Btti= s11 |B− =s = logit()() +β 2 itx− 1 − logit +β 11 it x − 1 , () 7

M M M i ' ProbB()tnti= s |B−1 =s =− 1 logit () n + β ix t −1 .

7 Estimating cumulative logistic model using usual maximum likelihood means that conditional on the realization of a time covariate vector xt-1, transitions to various states for different borrowers in the next time period are independent both cross-sectionally and through time. This has strong parallels with some of the corporate credit risk models. In Credit Metrics for example (Gordy 2000) the transition in corporate ratings are given by changes in the underlying “asset” variables in a similar fashion. The cumulative logistic model outlined in (7) leads to a Markov chain model for the behavioural score where the transitions depend on the explanatory variables x t . If t is a calendar time measured in quarters, then for a given initial state at time t-1, the i creditworthiness of a borrower, represented by a latent variable U t, at time t is determined by the relationship

K i i Ut=−∑ a ik State tki− − bEcoVar t −1 − cMoB i t − 1 +ε t ()8 k =2

where State t-k is a vector of indicator variables denoting borrower’s state at time t-k,

EcoVar t-1 is a vector of economic variables at time t-1, MoB t-1 is a vector of indicator variables denoting borrower’s months on books at time t-1 and a, b, and c are the With th such a model the dynamics of the behavioural score B t is described by a K order Markov chain where the transitions depend on economic variables and on the length of time the loan has been repaid. This last term does not occur in any corporate credit models but is of real importance in consumer lending ( Breeden 2007, Stepanova and Thomas 2002)

4. Data Description The original dataset contains records of customers of a major UK bank who were on the books as of January 2001 together with all those who joined between January 2001 and December 2005. The data set consists of customers time series of monthly behavioural score along with the information on their time since account opened, time to default or time when account get closed within the above duration. We considered approximately 50,000 randomly selected borrowers for training data during Jan 2001 – Dec 2004. We

8 tested our results using customer’s performance during 2005 from a subsample of the 50,000 and also holdout sample of approximately 15,000 customers. Anyone, who become 90 days delinquent, charged off or was declared bankrupt, is considered as having defaulted.

To analyse the changes in the distribution of behavioural score we first coarse classify behavioural score into various segments. Initially, we segment the behavioural score into deciles. Since behavioural score is a time covariate, the population considered for the above is not only the active people in certain month but all the people with behavioural scores in the training sample over the entire duration of 2001 to 2004. We use the chi- square statistic to decide whether to combine adjacent deciles if their transition probabilities are sufficiently similar. This technique of coarse classifying is standard in scorecard building (Thomas 2009a) to deal with continuous variables where the relationship with default in non linear. In this case it led to a reduction to five scorebands, namely s1={113-680}, s2={681-700}, s3={701-715}, s4={716-725} and s5={726 and above}. As well as these five states there are two more corresponding to Default and Account Closed. Behavioural scores are generated or updated every month for each individual so it would be possible to estimate a 1-month time step transition matrix. Since transitions between some states will have very few 1 month transitions, such a model may lead to less than robust estimates of the parameters. Hence we use 3-month time steps. Longer time steps, say six or twelve months, would start making short term forecasting difficult and quarters are an appropriate time period for economic measurements. In the following sections we shall consider various components of behavioural score transition matrix and provide a preliminary analysis of the effects of time varying macroeconomic and months on books covariates on behavioural score transitions.

5. Order of the Transition Matrix We first estimate the average transition matrix, assuming the Markov chain is stationary and first order using the whole duration of the sample from January 2001 to December 2004. Table 1 shows the 3-month time step transition matrix for that sample, where the

9 figures in brackets are the standard sampling errors. As one might expect, once a borrower is in the least risky state ( s 5 ) there is a high probability, 86%, they will stay there in the next quarter. More surprisingly the state with the next highest probability of the borrower staying there is s 1, the riskiest state, while borrowers in the other states move around more. The probabilities of defaulting in the next quarter are monotone with, as one would expect, 13-680 being the most risky state with a default probability of 6.7% and 726-high the least risky state with a default probability of 0.2%. Note that there is the obvious stochastic dominance ( ∑pij≤ ∑ p ij+1 ) for all the active states, which shows jk≥ jk ≥ that the behavioural score correctly reflects future score changes as well as future defaults.

Table 1: First Order Average Transition Matrix

Initial State Transition State

13-680 681-700 701-715 716-725 726-high Closed Default

13-680 49.0 22.1 9.6 4.0 4.0 4.7 6.7 (0.2) (0.2) (0.1) (0.1) (0.1) (0.1) (0.1) 681-700 15.7 34.7 25.1 9.6 11.2 2.8 0.8 (0.1) (0.2) (0.2) (0.1) (0.1) (0.1) (0.0) 701-715 6.0 13.6 35.9 18.1 23.4 2.6 0.5 (0.1) (0.1) (0.2) (0.1) (0.1) (0.1) (0.0) 716-725 3.0 6.1 15.7 28.3 44.1 2.5 0.3 (0.1) (0.1) (0.1) (0.2) (0.2) (0.1) (0.0) 726-high 0.7 1.2 2.7 4.3 88.4 2.4 0.2 (0.0) (0.0) (0.0) (0.0) (0.0) (0.0) (0.0) This first order Markov chain model assumes that the current state has all the information needed to estimate the probability of the transitions next quarter and so these are unaffected by the borrower’s previous states. If this is not true .one should use a second order Markov chain model Table 2 displays the estimates of the transition matrix for such a second order chain, obtained in a similar way as Table 1. Analysing Table 2 shows that there are substantial changes in the transition probabilities based on the previous state of the borrower. Consider for example if the current state is the risky one s1= {13-680}. If borrowers were also in the risky state last quarter then the chance of staying on it or defaulting in the next quarter is 58% +7%=65%.; if they were in the least risky state in the last quarter { 726+} but are now in s 1 , the chance of being in s+1 or default next quarter is 22.8%+7.7%=30.5%. Higher order Markov chains can also be considered. However, the size of the resultant transition matrices grows exponentially with the order,

10 and data sparsity and robustness of predictions become problems. Hence, we will use a second order chain to model the dynamics of the behavioural scores. Table 2: Second Order Average Transition Matrix

(Previous State, Current State) Terminal State 13-680 681-700 701-715 716-725 726-high Closed Default (13-680,13-680) 58.0 19.2 6.9 2.3 1.6 5.0 7.0 (681-700,13-680) 42.2 27.8 12.2 4.2 3.2 3.8 6.6 (701-715,13-680) 36.7 28.3 13.0 6.5 5.2 4.2 6.1 (716-725,13-680) 34.7 23.8 15.4 8.4 7.0 3.8 6.9 (726-high,13-680) 22.8 18.9 16.0 9.5 19.9 5.2 7.7 (13-680,681-700) 24.5 36.7 21.3 7.0 6.6 3.1 0.8 (681-700,681-700) 14.0 40.4 25.7 8.2 7.9 3.1 0.7 (701-715,681-700) 12.4 34.4 29.4 10.1 10.3 2.7 0.7 (716-725,681-700) 13.8 27.7 26.8 12.9 15.5 2.5 0.8 (726-high,681-700) 9.3 20.9 23.0 15.0 28.5 2.4 1.0 (13-680,701-715) 14.2 19.0 28.2 17.6 17.0 3.6 0.5 (681-700,701-715) 7.6 19.8 36.6 15.8 17.1 2.5 0.6 (701-715,701-715) 4.7 12.2 45.7 17.7 16.7 2.6 0.4 (716-725,701-715) 4.2 11.0 36.6 22.5 22.6 2.6 0.5 (726-high,701-715) 4.3 8.9 24.1 18.3 41.3 2.6 0.6 (13-680,716-725) 9.9 11.8 16.7 20.9 37.1 3.2 0.6 (681-700,716-725) 4.9 11.3 19.8 22.6 37.7 3.4 0.2 (701-715,716-725) 3.0 7.5 21.6 28.9 36.0 2.7 0.3 (716-725,716-725) 2.4 4.5 15.5 42.1 32.9 2.4 0.3 (726-high,716-725) 1.8 4.1 12.3 23.6 55.4 2.5 0.3 (13-680,726-high) 5.5 5.6 7.9 8.5 69.3 3.1 0.2 (681-700,726-high) 3.1 6.4 10.2 12.1 64.7 3.2 0.3 (701-715,726-high) 2.1 4.1 9.6 12.2 68.8 2.9 0.3 (716-725,726-high) 1.5 3.0 6.6 12.1 73.8 2.8 0.2 (726-high,726-high) 0.5 0.8 2.0 3.4 90.7 2.4 0.2

6. Macro Economic Variables Traditionally behavioural score models are built on customers performance with the bank over the previous twelve months using characteristics like average account balance, number of times in arrears and current credit bureau information. So the behavioural score can be considered as capturing the borrower’s specific risk. However, in corporate credit risk models (Das et al, 2007), it was shown that though borrower specific risk is a major factor, during economic slowdowns factors emerge and have had a substantial effect on the default risk in a portfolio of loans. The decomposition of the behavioural score in (3) suggests this is also the case in consumer lending, since the population log odds s pop (t) must be affected by such systemic changes in the economic environment. The question is which economic variables affect the default risk of consumers. We investigate five variables which have been suggested as important in

11 consumer finance ( Tang et al 2007, Liu and Xu 2003), together with one variable that reflects market conditions in consumer lending. The variables considered are: (a) Percentage Change in Consumer Price Index over 12 Months: reflects the inflation felt by customers and high levels may cause rise in customer default rate. (b) Monthly average Sterling Inter-bank lending rate: higher values correspond to general tightness in the economy as well as increases in debt service payments. (c) Annual Return on FTSE 100: gives the yield from stock market and reflects the buoyancy of industry. (d) Percentage change in GDP compared with equivalent Quarter in Previous Year: (e) UK Unemployment Rate. (f) Percentage Change in Net Lending over 12 Months: this gives an indication of the funds being made available for consumer lending.

There is a general perception (Figlewski et al, 2007) that change in economic conditions do not have an instantaneous effect on default rate. To allow for this, we use lagged values of the macroeconomic covariates in the form of weighted average over a six months period with an exponentially declining weight of 0.88. This choice is motivated by the recent study made by (Figlewski et al, 2007). Since macro economic variables represent the general health of the economy they are expected to show some degree of correlation. Table 3 below shows the pairwise correlation matrix for the above six macroeconomic variables. The values in brackets measures the statistical significance of the zero pairwise correlation between the macroeconomic variables. Thus at 95% significance level interest rate is negatively correlated with CPI and positively correlated with GDP and FTSE 100. Similarly, Net Lending is negatively correlated with Unemployment rate and positively correlated with GDP and FTSE 100 at 95% significance level. The presence of non zero correlation between variables is not a threat to the model, but the degree of association between the explanatory variables can affect parameter estimation. Table 3: Correlation matrix of macroeconomic factors

12 Interest Rate CPI GDP Net Lending Unemployment Rate Return on FTSE100 Interest Rate 1 -0.51166 0.33527 0.1428 0.00732 0.38843 (0.0002) (0.0198) (0.3329) (0.9606) (0.0064) CPI -0.51166 1 -0.10844 -0.23537 -0.44686 -0.093 (0.0002) (0.4632) (0.1073) (0.0015) (0.5295) GDP 0.33527 -0.10844 1 0.8488 -0.71439 0.86745 (0.0198) (0.4632) (<.0001) (<.0001) (<.0001) Net Lending 0.1428 -0.23537 0.8488 1 -0.4876 0.70393 (0.3329) (0.1073) (<.0001) (0.0004) (<.0001) Unemployment Rate 0.00732 -0.44686 -0.71439 -0.4876 1 -0.73078 (0.9606) (0.0015) (<.0001) (0.0004) (<.0001) Return on FTSE100 0.38843 -0.093 0.86745 0.70393 -0.73078 1 (0.0064) (0.5295) (<.0001) (<.0001) (<.0001) Figure 1 shows the variation of 3-month observed log(Default Odds) and macroeconomic factors for the sample duration of January 2001 to December 2004, where macroeconomic factors values are represented by primary y-axis and log(Default Odds) by secondary y-axis. In the benign environment of 2001-4 there are no large swings in any variable and the log of the default odds - -spop (t) – is quite stable.

Figure 1:3-Month Observed log(Odds Default) and Macroeconomic variables

25.0% -4

22.5% -4.2

20.0% -4.4 17.5% -4.6 15.0% UR -4.8 Int Rate 12.5% CPI 10.0% -5 FTSE 100 Net Lending 7.5% -5.2 GDP logOdds 5.0% -5.4 2.5% -5.6 0.0%

-5.8 -2.5% Jul-03 Jan-01 Jun-01 Apr-02 Nov-01 Feb-03 Dec-03 Sep-02 May-04 -5.0% -6

In Table 4, we estimate the first order transition probability matrices for two different twelve months calendar time periods between Jan 2001 to December 2004 to judge the effect of calendar time on transition probabilities. The first matrix is based on sample of customers who were on books during Jan-Dec 2001 and the second is for the duration Sept03 – Oct04. Both transition matrices show considerable similarities with the whole sample average transition matrix in Table 1, with the probability of moving into default decreasing as the behavioural score increases and the stochastic dominance effect still

13 holding. However there are some significant differences between the transition probabilities of the two matrices in Table 3. For example, borrowers who were in current state of s1={13-680} during Jan-Dec 2001 have a lower probability of defaulting in the next quarter -5.5% - than those who were in the same state during Sept03 – Oct04 where the value is 8.22%. We test the difference between the corresponding transition probabilities in the two matrices in Table 3 using the two-proportion z-test with unequal variances. The entries in bold under the z-statistics in Table 3 below identify those transition probabilities where the differences between the corresponding terms in the two matrices are significant at the 95% level. Table 4: Comparison of transition matrices at different calendar times

Initial State Terminal State Obligor 13-680 681-700 701-715 716-725 726-high Closed Default Quarters Jan-Dec 2001

13-680 52.90 21.77 9.24 3.62 3.67 3.31 5.50 24075 681-700 17.80 35.56 23.86 9.51 10.40 2.14 0.72 25235 701-715 6.74 14.84 35.25 17.90 22.72 2.16 0.40 31477 716-725 3.28 6.99 16.84 27.85 42.64 2.12 0.29 27781 726-high 0.72 1.35 2.86 4.30 88.39 2.10 0.26 220981

Initial State Terminal State Obligor 13-680 681-700 701-715 716-725 726-high Closed Default Quarters Oct 03-Sept 04

13-680 46.24 22.68 9.30 4.03 4.16 5.35 8.22 24060 681-700 14.79 35.62 25.25 9.80 10.99 2.74 0.82 29111 701-715 5.42 13.42 37.30 18.20 22.89 2.33 0.43 42200 716-725 2.68 5.63 16.17 29.34 43.79 2.05 0.33 38932 726-high 0.62 1.14 2.65 4.69 88.80 1.90 0.19 289814

z-statistics

13-680 10.346 -1.7094 -0.1824 -1.6562 -1.9582 -7.8341 -8.3934 681-700 6.7077 -0.1106 -2.645 -0.7928 -1.5598 -3.1709 -0.9468 701-715 5.2315 3.86247 -4.057 -0.7583 -0.39142 -1.1187 -0.4543 716-725 3.1677 5.03343 1.62888 -2.9847 -2.10987 0.4573 -0.6513 726-high 3.0222 4.9304 3.22696 -4.7262 -3.25315 3.4742 3.6772 7. Months on Books Effects As is well known in consumer credit modeling (Breeden 2007, Stepanova and Thomas 2002), the age of the loan (the number of months since the account was opened) is an important factor in default risk. To investigate this we split age into seven segments namely, 0-6 months , 7-12 months, 13-18 months , 19-24 months , 25-36 months , 37-48 months , more than 48 months.. The effect of age on behavioural score transition probabilities can be seen in Table 4, which shows the first order probability transition

14 matrices for borrowers who were on books between one to twelve months ( upper table) and more than 48 months ( lower table). Again the overall structure is similar to Table 1, but there are significant differences between the transition probabilities of the two matrices. Borrowers who are new on the books are more at risk of defaulting or of their behavioural score dropping than those who were with the bank for more than four years. Table 5: Comparison of transition matrices for loans of different ages

Initial State Terminal State Obligor 13-680 681-700 701-715 716-725 726-high Closed Default Quarters 1-12 Months (New Obligors)

13-680 51.0 22.3 8.1 3.1 2.0 5.8 7.6 24858 681-700 18.2 35.6 24.2 9.3 8.7 3.2 0.8 22019 701-715 8.1 15.9 30.5 17.8 24.6 2.7 0.5 21059 716-725 4.5 8.2 14.7 21.4 48.6 2.2 0.3 18050 726-high 1.8 3.0 5.7 7.6 79.3 2.3 0.2 59767

Initial State Terminal State Obligor 13-680 681-700 701-715 716-725 726-high Closed Default Quarters 49-high Months (Old Obligors)

13-680 44.1 23.5 11.3 4.9 7.0 4.0 5.3 28604 681-700 13.6 32.5 25.6 10.7 14.4 2.5 0.6 39835 701-715 4.7 11.8 37.2 18.8 24.8 2.5 0.3 66389 716-725 2.1 5.0 14.9 30.4 44.7 2.6 0.3 67660 726-high 0.4 0.9 2.1 3.7 90.4 2.4 0.2 698782

z-statistics

13-680 11.202 -2.2853 -8.7964 -7.2482 -20.541 6.8244 7.8834 681-700 10.618 5.59996 -2.8827 -3.7659 -15.6301 3.2103 1.7809 701-715 12.659 10.947 -13.264 -2.3181 -0.4927 1.0212 2.6596 716-725 11.604 10.9884 -0.3266 -18.581 6.856612 -2.2478 0.1627 726-high 22.089 26.7697 32.5756 30.3574 -55.0491 -1.802 1.0586

8. Modelling Transition Probabilities Behavioural score segments have a natural ordering structure with low behavioural score associated with high default risk and vice versa. This is the structure that is exploited when using cumulative (ordered) logistic regression to model borrowers' transitions probabilities as suggested in section 3. (Zavoina and McElvey, 1975). The cumulative logistic regression model is appropriate for modelling the movement between the behavioural scorebands and the defaulted state. If we wished also to model whether the borrowers close their accounts one would need to use a two stage model. In the first stage, one would use logistic regression to estimate the probability of the borrower closing the account in the next quarter given his current state. The second stage

15 would be the model presented here of the movement between the different scorebands including default conditional on the borrower not closing the account. To arrive at the final transition probabilities one would need to multiply the probabilities obtained in this second stage by the chance the account is not closed obtained from the first stage. So we now fit the cumulative logistic model to estimate the transition probabilities of a borrower’s movement in behavioural score from being in state i at time t-1 Bt−1 = s i to where the borrower will be at time t, Bt . These transitions depend on the previous state of the borrower, Bt−2 , the lagged economic variables and the age of the loan ( Months on Books or MoB). So one uses the model given by (6) and (8) but restricted to the second order case, namely Bs=⇔≤≤i U i i , j = 0,1,.. n with =−∞ , =∞ t j j t j+1 0n + 1 (9) i i Ut=− a iti State−2 − b EcoVar titt − 1 − c MoB − 1 + ε In order to choose which economic variables to include , we recall that Table 3 described the correlation between the variables. To reduce the effect of such correlations (so that the coefficients of the economic variables are understandable), we considered various combinations of macro economic variables as an input in a cumulative logistic model. In Table 6 we present parameter estimates for cumulative logistic models for each behavioural score segment with only two macroeconomic variables, namely interest rate and net lending, along with months on books and the previous state. The model with just these two variables provided a better fit in terms of the likelihood ratio of the model than other pairs of macroeconomic variables. We employ stepwise selection procedure to keep only variables that contribute significantly (95%) to the explanatory power of the model. The likelihood ratios and the associated p-values show that for each current behavioural score segment, transitions to other states in the next time period are significantly influenced by current macroeconomic factors, current months on books and information on previous state, represented by a Secstate variable in Table 6. This model fits the data better than the first order average transition matrix. A positive sign of the coefficient in the model is associated with a decrease in creditworthiness and vice versa. So the creditworthiness of borrowers decreases in the next time period with an increase in interest rates all current behavioural score segments.

16 Borrowers who are currently less than 18 months on books have higher default and downgrading risks than the others. This confirms the market presumption that new borrowers have higher default risk than older borrowers in any give time period. The coefficients of the Secstate variable, with one exception, decrease monotonically in value from the s1={13-680} category to the s 5 ={726-high} state. Those with lower behavioural score last quarter are more likely to have lower behavioural score next quarter than those with the same behavioural score currently but who came from higher behavioural score bands. So the idea of credit risk continuing in the same direction is not supported Table 6: Parameters for second order Markov chain with age and economic variables

Parameter Estimates Initial Behavioural Score

13-680 Std Error 681-700 Std Error 701-715 Std Error 716-725 Std Error 726-high Std Error

Interest Rate 0.0334 (0.0161) 0.092 (0.0143) 0.0764 (0.0123) 0.0834 (0.0134) 0.0778 (0.00885) Net Lending 0.0129 (0.00489)

Months on Books

0-6 -0.027 (0.0351) 0.0161 (0.0347) -0.2182 (0.0368) -0.1637 (0.0448) -0.0849 (0.0315) 7-12 0.2019 (0.0241) 0.1247 (0.0225) 0.2051 (0.0226) 0.2317 (0.0261) 0.3482 (0.018) 13-18 0.2626 (0.0262) 0.2663 (0.0236) 0.2301 (0.0228) 0.2703 (0.0268) 0.2554 (0.0193) 19-24 -0.07 (0.0275) -0.0796 (0.0251) -0.1001 (0.0241) -0.0873 (0.0284) 0.031 (0.0206) 25-36 -0.0015 (0.0244) -0.0521 (0.0223) 0.00191 (0.0198) -0.00487 (0.0229) -0.0254 (0.0162) 37-48 -0.0703 (0.0262) -0.0519 (0.0243) 0.019 (0.0206) -0.0801 (0.0241) -0.00709 (0.0166) 49-high -0.2957 -0.2235 -0.13781 -0.16603 -0.51721

SecState

13-680 0.8372 (0.0165) 0.6762 (0.0168) 0.5145 (0.0222) 0.3547 (0.0337) 0.381 (0.0399) 681-700 0.2365 (0.0201) 0.2847 (0.0139) 0.3598 (0.0146) 0.1942 (0.0224) 0.5168 (0.024) 701-715 -0.0111 (0.0249) 0.0491 (0.0168) 0.1314 (0.0119) 0.1255 (0.0164) 0.2991 (0.0178) 716-725 -0.1647 (0.0345) -0.1764 (0.0239) -0.1795 (0.016) 0.0098 (0.0152) 0.0525 (0.0162) 726-high -0.8979 -0.8336 -0.8262 -0.6842 -1.2494

Intercept/Barrier

Default -3.213 (0.0756) -5.4389 (0.0826) -5.8904 (0.1285) -6.011 (0.0967) -5.1834 (0.0506) 13-680 -0.2078 (0.0734) -2.179 (0.0657) -3.2684 (0.1175) -3.6011 (0.0648) -3.8213 (0.0436) 681-700 1.022 (0.0736) -0.3978 (0.0649) -1.9492 (0.1168) -2.461 (0.062) -2.9445 (0.0421) 701-715 1.9941 (0.0746) 0.861 (0.065) -0.1796 (0.1165) -1.2049 (0.0611) -2.06 (0.0415) 716-725 2.7666 (0.0764) 1.6267 (0.0656) 0.7317 0.171 (0.0609) -1.326 (0.0413) Likelihhod Ratio 3661.078 3379.459 4137.587 2838.765 20400.65 P-value <0.0001 <0.0001 <0.0001 <0.0001 <0.0001 9. Forecasting Multi-Period Transition Probabilities The model with the parameters given in Table 6 was tested by forecasting the future distributions of the scorebands in the portfolio, including those who have defaulted. The forecast uses the Markov assumption and so multiplies the probability transition matrix by itself the appropriate number of times to get the forecasts. In the first case we consider all non-defaulted borrowers in December 2004 and used the model to predict their distribution over the various behavioral score bands and the default state at the end of

17 each quarter of 2005. Not to add extra uncertainty to the forecast, the 2005 values of the two economic variables were used. The results are shown in Table 7. The initial distribution column gives the distribution of borrowers into each behavioural score segment in the test sample in December 2005. The observed column gives the observed distribution of borrowers at the end of each quarter in 2005. The other two columns gives the expected number of borrowers in each segment at the end of each quarter of 2004 as predicted by the second order average transition matrix in Table 2 and those predicted by the model in Table 6. The second order Markov chain model with economic variables gave predictions, particularly for defaults, which were very close to the actual values for the first and second quarters, but begin to overestimate the risks thereafter. So by the fourth quarter the first order Markov chain model which just takes the average of the transition probabilities is superior. Table 7: Distribution at the end of each time period on out of time sample test sample (2005)

Behavioural Score 1-Period 2-Period 3-Period 4-Period Segments Initial Average Model Observed Average Model Observed Average Model Observed Average Model Observed Distribution Matrix Predicted Matrix Predicted Matrix Predicted Matrix Predicted 13-680 571 520 560 457 498 561 384 475 566 424 457 573 368 681-700 659 659 696 595 635 702 594 612 711 604 592 719 592 701-715 1094 1011 1066 982 969 1065 918 935 1073 1007 908 1081 938 716-725 973 936 1027 952 902 1036 1038 878 1044 971 859 1049 943 726-high 7436 7535 7304 7666 7589 7208 7644 7627 7098 7511 7647 6989 7612 Default 0 72 80 81 140 160 155 206 241 216 270 322 280 The analysis was repeated on an out of time and out of sample portfolio. Again the distribution of the portfolio at the start of the period ( Spring 2005) was given and estimates for the next three quarters obtained using the model in Table 6. The results in Table 8 show that the second order model (Table 6) is better at predicting the actual number of defaults than the first order model (Table 3) even though both approaches slightly under predict. The second order model is better at predicting the numbers in the default and high risk states, while the first order model is better at predicting the numbers in the low risk categories. Table 8 Distribution at the end of each time period on out of time out of sample test sample (2005)

Behavioural Score 1-Period 2-Period 3-Period Segments Initial Average Model Observed Average Model Observed Average Model Observed Distribution Matrix Predicted Matrix Predicted Matrix Predicted 13-680 1428 949 1040 1199 879 983 1080 769 889 1043 681-700 1278 1054 1117 1096 978 1061 1076 894 996 1001 701-715 1379 1291 1384 1257 1262 1393 1316 1216 1363 1219 716-725 876 1047 1178 812 1051 1228 774 1044 1234 718 726-high 7514 7994 7621 7968 8059 7535 7943 8208 7596 8074 Default 0 139 134 143 245 274 286 344 397 420

18 Conclusions The paper has investigated how one could use a Markov chain approach based on behavioural scores to estimate the credit risk of portfolios of consumer loans. This is an attractive approach since behavioural scores are calculated monthly by almost all lenders in consumer finance, both for internal decision purposes and for Basel Accord requirements. The paper emphasises that behavioural scores are dynamic and since they do have a “systemic” factor – the population odds part of the score- the dynamics depends on changes in economic conditions. The paper also suggests one needs to consider carefully the appropriate order of the Markov chain. In the case study, the second order chain was superior to the first order one. Unlike corporate credit risk, one also needs to include the age of the loan into the modelling as this affects the credit risk. Such models are relatively easy for banks to develop since they have all the information readily available. The model would be useful for a number of purposes – debt provisioning estimation, stress testing in the Basel context as well as investigating the relationship between Point in Time Behaviour Scores and through the cycle probabilities of default, by running the model through an economic cycle. The model could also be used by ratings agencies to update their risk estimates of the securitized products based on consumer loan portfolios. This would require then to obtain regular updates of the behavioural scores of the underlying loans rather than the present approach of just making one initial rating based on an application or bureau score. This is extra work but would help avoid the failures of the rating of the mortgage backed securities (MBS) seen in 207 and 2008. There are still issues to be resolved in modelling the credit risk of consumer loan portfolios. One important one is to identify what economic variables most affect consumer credit risk and hence should be included on such models. One would expect some differences with those which have been recognised in corporate credit risk modelling, and one may want to use different variables for different types of consumer lending. House price movements will be important for mortgages but may be less important for credit cards. One also feels that some of the variables in the models should reflect the market conditions as well as the economic conditions, because the tightening in consumer lending which prevented customers refinancing did exacerbate the problems

19 of 2007 and 2008. This paper has described how such information on economic and market conditions can be used in conjunction with behavioural scores to estimate portfolio level consumer credit risks.

References Andrade F.W.M., Thomas L.C., ( 2007), Structural models in consumer credit, European Journal of Operational Research 183, 1569-1581. Basel Committee on Banking Supervision (2005, comprehensive version 2006), International convergence of capital measurement and capital standards – a revised framework, Bank for International Settlements, Basel Bellotti T., Crook, J.N.. Credit scoring with macroeconomic variables using survival analysis. Journal of the Operational Research Society , 2008 online.

Breeden J.L., (20070, Modeling data with multiple time dimensions, Computational Statistics and Data Analysis 51, 4761-4785.

Crouhy M , Galai D., Mark R, ( 2001), Risk Management, McGraw Hill, New York

Cyert R.M., Davidson H.J., Thompson G.L., (1962), Estimation of allowance for doubtful accounts by Markov Chains, Management Science 8, 287-303.

Das, S R., Duffie D., Kapadia N., Saita, L. (2007) Common Failings: How Corporate Defaults are Correlated. Journal of Finance 62, 93-117.

Demyanyk Y.S., van Hemert O., (2008), Understanding the subprime mortgage crisis, Working Paper, Federal Reserve Bank of Cleveland and New York University .URL: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1020396 .

Figlewski S, Frydman H and Liang W (2007). Modelling the Effect of Macroeconomic Factors on Corporate Default and Credit Rating Transitions. Working Paper No. FIN-06- 007, NYU Stern School of Business.

Gordy M (2000), A Comparative Anatomy of Credit Risk Models, Journal of Banking and Finance 24, 119-149.

Ho J., (2001), Modelling bank Customers Behaviour using Data Warehouses and Incorporating Economic indicators, Ph.D. thesis, University of Edinburgh, Edinburgh.

Hull J.C., (2009), The credit crunch of 2007: what went wrong?Why? What lessons can be learnt?, Journal of Credit Risk, 5, Number 2, 3-18.

Hurd T.R., Kuznetsov, (2006), Affine Markov chian model of multiform credit migration, Journal of Credit Risk 3, 3-29

20 Jarrow R.A., Lando D. Turnbull S., (1997). A Markov model for the term structure of credit risk spreads, Review of Financial Studies, 19, 481-523

Lando D, Skodeberg T, (2002), Analyzing rating transitions and rating drift with continuous observations, Journal of Banking and Finance 26, 423-444

Liu J., Xu X.E., (2003), The predictive power of economic indicators in consumer credit risk management, RMA Journal , September

McKelvey, R.D. and Zavoina, W., (1975). A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology 4, pp. 103–120.

Malik M., Thomas L.C.,(2009), Modelling the credit risk of portfolios of consumer loans , online at Journal of the Operational Research Society

Musto D.K., Souleles N., (2005), A portfolio view of consumer credit, Working Paper 05-25, Federal Reserve Bank of Philadelphia, Philadelphia.

Nickell P., Perraudin W., Varoli S.,(2001), Stability of ratings transitions, Journal of Banking and Finance 24, 203-227.

Perli R., Nayda W.I.,(2004) Economic and Regulatory Capital Allocation for revolving retail exposures. Journal of Banking and Finance, 28, 789-809).

Rosch D, Scheule H, (2004), Forecasting retail portfolio credit risk, Journal of Risk Finance, 5, Volume 2, 16-32

Schniederjans M.J., Loch K.D., (1994), An aid for strategic marketing in the banking industry: a Markov analysis, Computers and Operations Research 21, 281-287.

Stepanova M and Thomas L C (2002). Survival Analysis Methods for Personal Loan Data. Operations Research. 50: 277-289.

Tang L, Thomas L C, Thomas S and Bozzetto J-F (2007). It's the economy stupid: Modelling financial product purchases. International Journal of Bank Marketing. 2 5: 22-38.

Trench M.S., Pederson S.P., Lau E.T., Lizhi M., Hui W., Nair S.K., (2003), Managing Credit Lines for bank One credit cards, Interfaces 33, Issue 5, 4-22.

Thomas L.C., (2009a) Consumer Credit Models: Pricing, Profit and Portfolios, Oxford University Press, Oxford (2009)

Thomas L.C, ( 2009b): Modelling the Credit Risk for Portfolios of Consumer Loans: Analogies with corporate loan models, Mathematics and Computers in Simulation 20, 2525-2534.

21