Explain Artificial Intelligence for Credit Risk Management
Post 2008 crisis, Paul Willmott and risk measurement frameworks, depending At the same time, so-called machine- Emmanuel Derman were already pointing on portfolios characteristics for regulatory, learning method development is out one of the major challenges of financial financial or business decision-making skyrocketting. The progresses in institutions in the Modeler’s Hippocratic purposes. In addition, start-ups and computers’ processing allows the use Oath: “Nor will I give the people who fintechs are developing AI components of methods such as Deep Learning, use my model false comfort about its very quickly, powered by digital growth and tree-based algorithms (Decision Trees, accuracy. Instead, I will make explicit the increasing amount of available data. Random Forest and Gradient-Boosting its assumptions and oversights.” In Machines such as the most recent and addition to this needed transparency, this To align with those new agents, banks powerful XGBoost and Light GBM), and emphasizes the ethical point related to must develop more reliable models in ensembling techniques that combine the model use and understanding. order to reduce the decision time and ouputs of machine-learning models (such With the development of computational develop better business. as Stacking). methods, understanding the output In the credit risk industry, the usage of provided by complex models is Quantitative modeling techniques are used Machine Learning techniques for model becoming stronger than ever. to get more insights from data, reduce cost development faces skepticism, notably for and increase overall profitability. Every regulatory purposes because of the lack of After the financial crisis, regulators have model contains inherent deficiencies and it transparency and the known “black box” put a great focus on risk management is important to keep the focus on reducing effect of these techniques. supervision and expect financial model errors. institutions to have transparent, auditable Artificial Intelligence for Credit Risk Management
Although Artificial Intelligence can help model developers to reduce model risk and improve general model predictive power, a wide part of the financial industry remains careful regarding the explainability barrier faced by machine learning techniques. Indeed, the progress observed in the accuracy of models, are often made at the cost of their explainability. Moreover, this lack of explanation constitutes both a practical and an ethical issue for credit professionals, as said by Guidotti and al. (2018) 1
As pointed out by the latest reports produced by the World Economic Forum and the French prudential authority (ACPR), Artificial Intelligence as a topic has reached on inflection point. In the short term, it seems important that the development of Artificial Intelligence in the banking and insurance sectors satisfy minimum criteria for governance and control. This reflection The toolkit opening the black-box The Zen Risk toolbox aims to explain both should cover the proof of the reliability of Deloitte has designed the Zen Risk an isolated observation and the overall the algorithms used (with a view to their platform, which enable its users access decisions taken by the algorithms. These internal auditability as well as external), to the most advanced and modern tools techniques are instrumental in the era models’ explainability and interactions at each modelling and validation step. Its of increasingly precise models, to the between humans and intelligent objective is to provide pre-approval with detriment of their interpretability. algorithms. tools enabling them to automate a part of their work, to compare their model with The approach flows smoothly and Artificial Intelligence is already transforming different approaches and, finally, to give gradually, through many dimensions the financial ecosystem, offering a wide them the keys to integrate transparent (model explanation, important features, range of opportunities and challenges, rules identified by Artificial Intelligence into outcome global explanation, individual across different sectors (deposit existing models. forecast explanation). and lending, insurance, investment management, etc.), therefore the definition Originally designed for challenging internal Machine learning modelling of AI model governance is becoming models, this approach could be adapted The process starts with an advanced data a key concern. As a consequence, to underwriting and preapproval credit pre-processing step, where a wide range of understanding and explaining the output processes, building eligibility scores or machine learning tools are used to improve of machine learning is becoming a top underwriting scorecards. This black-box data quality. The process includes data priority for banks and regulators. opening approach can also be extended imputation, when relevant, data filtering to collections and recoveries, loss for very sparse data, and detection and management, including litigation recoveries management of outliers. The choice of the models, recovery forecasting models or techniques used is left to the modeller, restructuring and discount models. with the possibility to visualize the real time effect of the methods used on the dataset.
1 A Survey of Methods for Explaining Black Box Models, Guidotti and al. (2018)
2 Artificial Intelligence for Credit Risk Management
The solution then explores many models, from the well-known logistic regression to the most recent Boosting (Light GBM, XGBoost) and Neural Networks.
The table below sums-up some of the most widely used methods:
Algorithms considered:
Heterogeneous classifiers: Homogeneous classifiers:
Combines predictions from Combines predictions from multiple similar various types of models. models. Diversity through sampling.
Stacked Bagging Average Random Forest (RF) Hill Climbing Boosting algorithm (HCES) Individuals classifier: The eventual scorecard consists of a single classification model.
Logistic Regression (LR) Support Vector Machines (SVM) Neural Network (NN) Decision Tree (CART)
For these complex models that need a Generally, genetic algorithms consist in But if a machine-learning model performs high degree of parameter tuning, the keeping the most resilient parents models. well, why do not we just trust the model and use of hyperparameter optimization Crossing over parents and allowing genetic ignore why it made a certain decision? 2 algorithms such as genetic mutation creates further generations of Are we ready to believe the outputs with algorithms are necessary to reach the models. The process is then iterated until high degree of confidence, without taking best performance. reaching satisfactory results. into account ethical consideration? Of At the cost of its complex implementation, course, the answer is no, and it becomes Indeed, the number of hyperparameters using the right parameters leads to an necessary to open the black-box! is too important to test all set of possible increase of performance metrics (ROC combinations. Borrowed from biostatistics, AUC, F1 Scores, …) in a reasonable amount The development of methods for opening genetic algorithms are inspired by the of time. Indeed, in a case for a PD model the black box has increased considerably in concept of natural selection and help to development, optimizing hyperparameters recent years. fight back this issue. could lead to a 10% increase in AUC compared to default parameters.
Random Forests Features ALE Plot Contribution for a SHAP Given Prediction ICE Plot LIME Grad-CAM
Dec 2016
July 2014 Apr 2015 Feb 2017 Apr 2017 Dec 2018 Mar 2014 Aug 2015 Feb 2016 Mar 2017 May 2017 Apr 2001 Oct 2015
Random Forests LRP MACEM Features DeepLIFT Importance Decision Threshold RETAIN PDP Distribution (Freidman) (Airbnb)
2 Christoph Molnar, A guide for Making Black Box Models Explainable
3 Artificial Intelligence for Credit Risk Management
Model explainability For example, in the graph below, we The first model explainability tool is often can observe that the credit amount, the provided by the model itself. Indeed, borrower’s age or the maturity (duration) and this is especially true for tree-based are the most important variables in the methods, the algorithms can assess the dataset. On the contrary, some variables importance of each variables, giving a have a limited impact and bring almost hierarchy of features importance. To do no information. so, it computes the impact of changing a variable in a tree (by another random one) This method could be used for variable on the model evaluation metric. The more selection, before the application of a the model quality decreases in average, the most usual kind of model such as a more the variable is important. logistic regression.
Light M eatures importance
Credit amount Age Duration Purpose_car Job Check_no_inf Purpose_radio/TV Check_moderate Housing_own Sex_male Savings_no_inf