Data Integrity in Financial Modeling
Total Page:16
File Type:pdf, Size:1020Kb
Data Integrity in Financial Modeling By Eric Kolchinsky, NAIC Director, Structured Securities fundamental assumptions of statistics and how they can Group limit the usefulness of financial models. Next, we review the differences between the mortgage loan information used to Introduction analyze pre-crisis RMBS and the loans’ true characteristics. With the benefit of hindsight, we can see 2009 was the be- Lastly, we discuss how the SSG and the Valuation of Securi- ginning of a slow recovery from the nadir of the recent global ties (E) Task Force are working to ensure the appropriate financial crisis. However, as the mortgage-induced crisis con- use of financial models for structured securities purchased tinued, the NAIC took a brave and a radical step—no longer by insurance companies. were ratings to be used to set the capital for residential mort- gage-backed securities (RMBS). RMBS, along with collateral- “Garbage in, Garbage out” ized debt obligations (CDOs), were at the epicenter of the crisis and had leveled financial giants such as Bear Stearns, “Pray, Mr. Babbage, if you put into the machine wrong Lehman Brothers, Fannie Mae, Freddie Mac and AIG. figures, will the right answers come out?” —Question posed to Charles Babbage, the creator of The NAIC decided to directly engage analytical vendors who the first mechanical computer. would work under the direct supervision of the Securities Valuation Office (SVO). PIMCO was chosen as the vendor for It is a common failure of human nature to find purpose in RMBS and, the following year, BlackRock Solutions was cho- complexity, and financial models are not immune. The sen as the vendor for commercial mortgage-backed securi- question posed to Mr. Babbage, above, is an illustration of ties (CMBS). The focus for these year-end valuation projects this timeless principle.2 were pre-crisis legacy securities. While there was little doubt the new issues market would return, the NAIC chose Beyond the numbers and the lengthy equations, it is vital to not to expend the effort at the time to set up a system for remember financial models are only techniques used to analyzing new securities. make sense of complicated relationships. They are not an independent source of new information. Financial models Flash forward five years—the new issue RMBS and CMBS are not an Oracle (of the Delphi variety), but simply a fancy markets have been coming back in spurts. While the $16 abacus. billion in newly issued RMBS1 is a far cry from the $1.2 tril- lion issued in 2006, it is double the $8 billion from 2012. All financial models are based on two closely related sub- CMBS has had a much healthier recovery: $64 billion issued jects: statistics and probability theory. Statistics pertains to year-to-date down from a peak of $229 billion in 2007. As the collection and interpretation of data, while probability the new issue market revives, insurance companies and the theory is the mathematical study of random variables. To- newly formed NAIC Structured Securities Group (SSG) are gether, they allow us to study the relationships between beginning to deal with issues raised by using a system de- various random variables and to build models that try to signed for surveillance to analyze new transactions. The predict the outcome of one variable (the dependent varia- fundamental question raised by some insurance companies ble) based on the values of one or more other observed is, “Why can’t these financial models be used to analyze variables (the independent variables). newly issued securities directly?” Model Appropriateness The answer lies in the weakness of financial models in gen- In simple terms, a financial model cannot, and does not, eral and the lessons learned from the crisis on how these predict the future. It just determines the relationships be- weaknesses were exploited. Specifically, financial models tween historical variables3 and attempts to determine how are only as good as the data used as inputs. Very little atten- a new set of variables would perform within that historical tion was paid pre-crisis to ensure the data powering the relationship. rating agencies’ or investors’ models bore any semblance to (Continued on page 12) reality. The result was massive differences between the predicted and the actual losses. In order to prevent the 1 November 2013 year-to-date. Source: Securities Industry and Financial Markets same mistakes, extra care needs to be taken to ensure data Association (SIFMA), www.sifma.org/research/statistics.aspx. 2 Another common example is the relationship between correlation and causality. integrity. For example, statistics can show there is a relationship between air temperature and the propensity of people to wear hats. However, it cannot determine if the hats cause temperatures to drop or vice versa. This “causality” needs to be determined This article will explore the connection between financial by other means. models and the data inputs. We begin by discussing the 3 This process is called model calibration and is not the subject of this paper. January 2014 | CIPR Newsletter 11 Data Integrity in Financial Modeling (Continued) However, history is of little use in finance—we typically into their qualitative analyses. The absence of that seek to understand future performance of some product. In credit risk meant that the credit enhancements they order to make statistics more useful for this “fortune tell- required were insufficient, the tranches bearing AAA ing,” we make a major assumption: that the future relation- ratings were too large, and the ratings they issued ships between the variables remain exactly the same. In were too optimistic. fact, we also assume the meaning of the variables also does … not change.4 Despite being on notice about the problem [of mort- gage fraud] and despite assertions about the im- This brings us to the first major concern: the extent to portance of loan data quality in the ratings process for which a model is appropriate for a given task. Many finan- structured finance securities, neither Moody’s nor S&P cial models use similar types of inputs to determine the established procedures to account for the possibility of final result. For example, many consumer credit models use fraud in its ratings process. For example, neither com- credit scores and income as model inputs. However, be- pany took any steps to ensure that the loan data pro- cause the models are developed on different sets of data, vided for specific RMBS loan pools had been reviewed the impact of each variable will not be the same. for accuracy.”6 A model based on prime borrowers will not be an accurate Data Integrity in Pre-Crisis RMBS description of the behavior of subprime borrowers without Financial models were used by market participants and adjustments. A prime borrower model, which is based on rating agencies to measure the risk of each RMBS mortgage verified income, will not be helpful with stated income bor- pool and tranche. While there is evidence the models them- rowers, and so on. selves were poorly designed, this problem was greatly exac- erbated by the use of false data as inputs to the models. As a result, the first step in any financial modeling process is 5 to verify, qualitatively, the model is appropriate for the Most mortgage credit models use information about the task at hand. This step involves comparing the pool to be borrower and the property to gauge the risk of the mort- analyzed with the pool upon which the model was calibrat- gage. The best source for this information is at the source— ed and making a judgment if the same relationships could the borrower. The user of the information is at the end of still hold. Even if the model is appropriate, however, the the process. In between, there are a number of parties with results could be rendered useless if the data inputs do not varied incentives; most were interested in simply passing reflect reality. the loan to the next participant and extracting a fee for the service. This made the “transport” of information from the Input Data Integrity source to the model fraught with peril. The phrase “garbage in, garbage out” is used in many con- texts, but usually stands for the proposition that starting Loans were typically originated by independent brokers with poor input for a process will result in a poor result. This working directly with the borrower. These two parties is true for financial models, even if a model is properly cali- would populate the relevant information about the borrow- brated and is appropriate to the task. er and the property to send to the originator. Neither party had any incentive to make sure this information was accu- While no one questions the proposition financial models rate. The money for the loan could have come from a bank, should be populated with good data, pre-crisis practices but just as likely it would have come from a specialized paid little attention to data quality. The Permanent Subcom- mortgage originator. Companies like Countrywide and New mittee on Investigations of the U.S. Senate Committee on Century originated the mortgages solely for the purpose of Homeland Security and Governmental Affairs conducted a structuring them into RMBS. thorough investigation of the causes of the recent financial (Continued on page 13) crisis, with one focus being rating agency practices in RMBS and CDO arenas: 4 For example, if the income of the borrower is an input to a model, does a “$100,000” number mean the same thing if the income was verified through two annual tax returns or if it was merely stated with no verification? From a credit “A final factor that contributed to inaccurate credit perspective, the former has much more predictive information than the latter.