Managing Machine Learning Model Risk


May 13, 2019
Agus Sudjianto, Harsh Singhal and Jie Chen
2019 Wells Fargo Bank, N.A. All rights reserved. For public use.

Master Class Agenda
• Introduction (15 minutes) – Agus
• Overview of Machine Learning
  – Ensemble Model Methodology and Examples: Random Forest and GBM (60 minutes) – Jie
  – Deep Learning Methodology and Examples: Feedforward, Recurrent, and Generative Adversarial Network (60 minutes) – Jie
• Natural Language Processing (45 minutes) – Harsh
  – Language Models
  – Neural Architecture
• Machine Learning Interpretability (90 minutes) – Jie
  – Post-hoc methodology
  – Model distillation
• Structured-Interpretable Models – Agus
• Validation of Machine Learning Models (90 min) – Harsh
  – Inputs/Data: bias and privacy test
  – Model specification: interpretability
  – Performance: fairness and performance testing
  – Model Monitoring and change control
  – Fail safe and disclosure
Optional Lunch Time Bonus: Deep Learning Techniques for Derivatives Pricing – Bernhard

Machine Learning Methodology: Ensemble Model Methodology and Examples
May 13, 2019
Jie Chen, Ph.D.
MD, Head of Statistics and Machine Learning, Corporate Model Risk

Outline
• Statistics vs Machine learning
• Introduction to machine learning
  – Supervised Learning
  – Unsupervised Learning
  – Semi-supervised learning
  – Reinforcement Learning
• Decision Tree and CART
• Ensemble algorithms
  – Bagging
  – Random forest
  – Boosting
• Probability Calibration
• Classification Example

Statistics vs ML
• Leo Breiman: Two modelling paradigms: data model and algorithmic model
  – Breiman (2001) Statistical Modeling: The Two Cultures, Statistical Science
• Traditional Statistics (data model)
  – View: data are generated by some underlying parametric model; the goal is inference and interpretation of the model
  – Extensive interaction between data and data analyst
    o Summary, visualization, identification of outliers, shapes of distributions, transformation, …
  – Parameter estimation, testing, confidence intervals, asymptotic theory → based on model assumptions and theory
  – Dimensionality is a curse → variable selection
  – Model validation: goodness-of-fit tests, residual diagnostics
  – Tailored for small data sets, a small number of variables, structured data
  – Driven by statisticians
• Criticism
  – A simple parametric model is imposed on data generated by a complex system. The information obtained may be questionable.
  – Omnibus goodness-of-fit (GOF) tests, which test in many directions, have low power and will not reject until the lack of fit is large.
  – Feature engineering has to be done manually, which involves a lot of hand-crafting and is impractical for a large number of variables.

Statistics vs ML
• Machine Learning (algorithmic model)
  – View: the data mechanism is unknown and there is no intrinsic interest in the data generation process. The goal is to get the most accurate model, however complicated.
  – Very little direct interaction with the data
  – Emphasis on better algorithms, speed, efficiency of computing, parameter tuning
    o Data mining: exploratory data analysis on steroids
    o Neural networks, boosting algorithms, etc.
  – Algorithms are black boxes → hard to interpret
  – Dimensionality is a blessing → variable selection is not needed, feature creation is encouraged (SVM)
  – Model validation: check prediction accuracy on a testing set
  – Tailored for large data sets, with a large number of variables, unstructured data
  – Driven by computer scientists, engineers, and a few statisticians
• Criticism
  – Lack of interpretability

Statistics vs ML
• Michael Jordan: the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics.
• The distinction is blurring …
• Some statisticians have adopted methods from machine learning, leading to a combined field that they call statistical learning
• Data Science has emerged as an alternative term to combine both fields … but it also includes DBM and computing

Machine Learning vs Artificial Intelligence (wiki and other sources)
§ Machine Learning:
  – Term coined by Arthur Samuel (IBM) in 1959
  – Gives "computers the ability to learn without being explicitly programmed"
  – Study and construction of algorithms that can learn from data, summarize features, recognize patterns, make predictions, and take actions …
  – Related to statistics ("computational statistics") but with different paradigms
  – A key pathway to AI
§ Artificial Intelligence: concerned with making computers behave like humans
  – Term coined in 1956 by John McCarthy (MIT)
  – Study of "intelligent agents": devices that perceive the environment and take actions that maximize their chance of success at some goal
  – Long history: formal reasoning in philosophy, logic, …
  – Resurgence of AI techniques in the last decade: advances in computing power, computing and data architectures, sizes of training data, and theoretical understanding
  – Deep Learning Neural Networks: at the core of recent advancements in AI, specifically for certain classes of ML tasks (Reinforcement Learning and Representation Learning)
  – Applications:
    • Pattern recognition: speech (Siri), image (DeepFace), handwriting, …
    • Autonomous systems: drones, self-driving cars
    • Recommender systems, drug discovery, marketing, …

Machine Learning: Tasks and Techniques
• Tasks:
  • Supervised Learning:
    • Regression and classification
  • Unsupervised Learning:
    • Discover underlying structure
    • Dimension reduction, clustering, …
  • Semi-supervised learning
  • Reinforcement Learning:
    • Identifying how to make good decisions from context: observe, learn, and optimize
    • Deep reinforcement learning
  • Representation Learning:
    • Feature selection and engineering

Supervised Machine Learning
§ Supervised learning means the desired outcome is known, i.e., the response variable is given.
§ Learning is supervised by the response: the algorithm minimizes the error between the prediction and the response.
§ Algorithms that fall under this category:
  – K-nearest neighbor
  – LASSO, Elastic Net
  – Support vector machine
  – Decision trees
  – Ensemble methods
  – Neural networks
    • Artificial feedforward NN
    • More complex NN for DL

Supervised Machine Learning
§ Machine learning algorithms usually come with hyper-parameters that control the complexity of the algorithm.
  – For example, trees have depth, number of terminal nodes, etc., to define the tree structure.
  – Neural networks have number of layers, number of neurons per layer, activation function, etc., to define the network structure.
§ Complexity is related to the bias-variance trade-off. The expected squared prediction error can be decomposed into squared bias, variance, and irreducible noise.

Bias and variance trade-off
§ Bias: $f(x) - E[\hat{f}(x)]$. Simpler models have larger bias, and vice versa.
§ Variance: $\mathrm{Var}(\hat{f}(x))$. Simpler models have smaller variance, and vice versa.
§ The best model is the one that achieves a good balance between bias and variance → hyper-parameter tuning

Supervised Machine Learning: Tuning
§ Hyper-parameter tuning means finding the hyper-parameters that give the most accurate machine learning model. It is the key to the success of machine learning algorithms.
§ A simple model structure with a small data set calls for a less complicated algorithm, while a more complicated model structure with a large data set calls for a more complicated algorithm. The hyper-parameters are therefore data dependent, and they need to be tuned to get the best model.
§ Tuning involves a search routine and an evaluation routine. For each hyper-parameter setting, fit the model and evaluate its performance; then use the search routine to find the hyper-parameter setting/model that optimizes the model performance.

Supervised Machine Learning: Tuning
§ Search routine. Some popular ones are:
  – Grid search: define a grid of parameters and search the entire grid.
  – Randomized search: randomly select parameters from a distribution to search.
  – Bayesian hyper-parameter optimization: model the prediction performance as a Gaussian process.
§ Evaluation routine. The model performance is measured by:
  – Continuous response: mean squared error
  – Categorical response: AUC/Gini (binary response), error rate, log loss
§ It is well known that a model that minimizes the loss/error on the training data is likely to overfit. To avoid this, the performance is measured on a separate validation set, or using cross-validation.
§ Cross-validation. The typical K-fold cross-validation works as follows:
  1. Randomly divide the data into K folds. Stratification may be needed for imbalanced data.
  2. For each i = 1, …, K:
     1. Leave the i-th fold out and build a model using the remaining K-1 folds.
     2. Predict on the i-th fold.
  3. After obtaining the cross-validation predictions for the entire data, compute the loss/error. This is the cross-validation model performance.
§ Since both training data and validation data are used in the construction of the best model, the model performance has to be evaluated on a separate test set.

Unsupervised learning
§ Unsupervised learning means there is no response. The observations are unlabeled.
§ It is used for clustering, dimension reduction, anomaly detection, etc.
§ Algorithms that fall under this category:
  – Clustering
    • K-Means
    • Hierarchical clustering
    • Mixture models
  – Visualization and dimensionality reduction
    • PCA
    • Kernel PCA
    • Locally-linear embedding
    • t-distributed stochastic neighbor embedding (t-SNE)
  – Association rule learning

Semi-supervised learning
§ Sometimes it is very expensive or hard to obtain labels, so only part of the data is labeled.
  – Unlabeled data
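
The tuning procedure described in the Supervised Machine Learning: Tuning slides (a grid search as the search routine, K-fold cross-validation with mean squared error as the evaluation routine) can be sketched in a few lines of Python. This is only an illustrative sketch, not the presenters' implementation: the k-nearest-neighbor regressor, the toy quadratic data, and all function names (`knn_predict`, `kfold_indices`, `cv_mse`, `grid_search`) are assumptions made for the example, and only the standard library is used.

```python
import random
from statistics import mean


def knn_predict(train, x, k):
    """Hypothetical model: average the responses of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def kfold_indices(n, n_folds, rng):
    """Step 1: randomly divide the indices 0..n-1 into n_folds folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cv_mse(data, k, folds):
    """Steps 2-3: predict each fold from the other folds, then compute the MSE."""
    preds = [None] * len(data)
    for held_out in folds:
        held = set(held_out)
        train = [p for i, p in enumerate(data) if i not in held]
        for i in held_out:
            preds[i] = knn_predict(train, data[i][0], k)
    return mean((p - y) ** 2 for p, (_, y) in zip(preds, data))


def grid_search(data, grid, n_folds=5, seed=0):
    """Search routine: evaluate every hyper-parameter value on the same folds."""
    folds = kfold_indices(len(data), n_folds, random.Random(seed))
    scores = {k: cv_mse(data, k, folds) for k in grid}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Toy data (an assumption for illustration): y = x^2 plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(-1.0, 1.0) for _ in range(100)]
data = [(x, x * x + rng.gauss(0.0, 0.1)) for x in xs]

grid = [1, 3, 5, 10, 25, 50]
best_k, scores = grid_search(data, grid)
print("best k:", best_k, "cross-validation MSE:", round(scores[best_k], 4))
```

Here the number of neighbors plays the role of the complexity hyper-parameter: a small value gives a flexible, high-variance fit and a large value gives a smooth, high-bias fit. As the slides note, the cross-validation error of the selected setting is optimistic because it was used for selection, so the final performance would still need to be measured on a separate test set.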