Statistical Decision Theory: Concepts, Methods and Applications
(Special Topics in Probabilistic Graphical Models)

FIRST COMPLETE DRAFT
November 30, 2003

Supervisor: Professor J. Rosenthal
STA4000Y
Anjali Mazumder
950116380

Part I: DECISION THEORY – Concepts and Methods

Decision theory, as the name implies, is concerned with the process of making decisions. The extension to statistical decision theory includes decision making in the presence of statistical knowledge, which provides some information about the uncertainties involved. The elements of decision theory are quite logical and perhaps even intuitive. The classical approach to decision theory facilitates the use of sample information in making inferences about the unknown quantities. Other relevant information includes the possible consequences, which are quantified by a loss, and the prior information which arises from statistical investigation. The use of Bayesian analysis in statistical decision theory is natural, and their unification provides a foundational framework for building and solving decision problems. The basic ideas of decision theory and of decision-theoretic methods lend themselves to a variety of applications and to computational and analytic advances.

This initial part of the report introduces the basic elements of (statistical) decision theory and reviews some of the basic concepts of both frequentist statistics and Bayesian analysis. This provides a foundational framework for developing the structure of decision problems. The second section presents the main concepts and key methods involved in decision theory. The last section of Part I extends this to statistical decision theory – that is, to decision problems with some statistical knowledge about the unknown quantities. Together these provide a comprehensive overview of the decision-theoretic framework.

Section 1: An Overview of the Decision Framework: Concepts & Preliminaries

Decision theory is concerned with the problem of making decisions. The term statistical decision theory pertains to decision making in the presence of statistical knowledge, which sheds light on some of the uncertainties involved in the problem. For most of this report, unless otherwise stated, it may be assumed that these uncertainties can be considered as unknown numerical quantities, denoted by θ. Decision making under uncertainty draws on probability theory and graphical models. This report, and more particularly this Part, focuses on the methodology and the mathematical and statistical concepts pertinent to statistical decision theory. This initial section presents the decision framework and introduces the notation used to model decision problems.

Section 1.1: Rationale

A decision problem in itself is not complicated to comprehend or describe, and can be summarized with a few basic elements. However, before proceeding any further, it is important to note that this report focuses on rational decision or choice models based upon individual rationality. Models of strategic rationality (small-group behavior) or competitive rationality (market behavior) branch into the areas of game theory and asset pricing theory, respectively. For the purposes of this report, these latter models have been set aside, as the interest of study is statistical decision theory based on individual rationality.
“In a conventional rational choice model, individuals strive to satisfy their preferences for the consequences of their actions given their beliefs about events, which are represented by utility functions and probability distributions, and interactions among individuals are governed by equilibrium conditions” (Nau, 2002[1]).

Decision models lend themselves to a decision-making process which involves the consideration of the set of possible actions from which one must choose, the circumstances that prevail, and the consequences that result from taking any given action. The optimal decision is to make a choice in such a way as to make the consequences as favorable as possible.

As mentioned above, the uncertainty in decision making, defined as an unknown quantity θ describing the combination of “prevailing circumstances and governing laws”, is referred to as the state of nature (Lindgren, 1971). If this state is known, it is simple to select the action according to the favorable degree of the consequences resulting from the various actions and the known state. However, in many real problems, and those most pertinent to decision theory, the state of nature is not completely known. Since these situations create ambiguity and uncertainty, the consequences and subsequent results become complicated.

Decision problems under uncertainty involve “many diverse ingredients” – loss or gain of money, security, satisfaction, etc. (Lindgren, 1971). Some of these “ingredients” can be assessed, while some may be unknown. Nevertheless, in order to construct a mathematical framework in which to model decision problems, while providing a rational basis for making decisions, a numerical scale is assumed to measure consequences. Because monetary gain is often neither an adequate nor an accurate measure of consequences, the notion of utility is introduced to quantify preferences among the various prospects with which a decision maker may be faced.

Usually something is known about the state of nature, allowing a consideration of a set of states as being admissible (or at least theoretically so), and thereby ruling out many that are not. It is sometimes possible to take measurements or conduct experiments in order to gain more information about the state. A decision process is referred to as “statistical” when experiments of chance related to the state of nature are performed. The results of such experiments are called data or observations. These provide a basis for the selection of an action, defined as a statistical decision rule.

To summarize, the “ingredients” of a decision problem include (a) a set of available actions, (b) a set of admissible states of nature, and (c) a loss associated with each combination of a state of nature and an action; a small worked example is sketched at the start of Section 1.2 below. When only these make up the elements of a decision problem, it is referred to as the “no-data” or “without experimentation” decision problem. If, however, (d) observations from an experiment defined by the state of nature are included with (a) to (c), then the decision problem is known as a statistical decision problem. This initial overview of the decision framework allows for a clear presentation of the mathematical and statistical concepts, notation and structure involved in decision modeling.

Section 1.2: The Basic Elements

The previous section summarized the basic elements of decision problems.
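To make these ingredients concrete before formalizing them, the following minimal sketch models a “no-data” decision problem in Python. The crop-planting actions, weather states and loss values are hypothetical, chosen only for illustration, and the minimax criterion used to pick an action is only one of the optimality measures taken up in Section 2.

# A minimal sketch of a "no-data" decision problem: a set of actions,
# a set of admissible states of nature, and a loss for each
# (action, state) pair. All numbers here are hypothetical.

actions = ["plant_wheat", "plant_corn"]
states = ["dry", "wet"]  # admissible states of nature

# loss[(a, theta)]: loss incurred by taking action a when state theta prevails
loss = {
    ("plant_wheat", "dry"): 0,
    ("plant_wheat", "wet"): 8,
    ("plant_corn", "dry"): 10,
    ("plant_corn", "wet"): 1,
}

# With theta unknown and no data, one conservative criterion is minimax:
# choose the action whose worst-case loss over all states is smallest.
minimax_action = min(
    actions,
    key=lambda a: max(loss[(a, theta)] for theta in states),
)
print(minimax_action)  # -> "plant_wheat" (worst-case loss 8 vs. 10)

Note that if the state were known (say, “wet”), the choice would be trivial: pick the action minimizing loss in that column; the difficulty arises precisely because θ is unknown.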
For brevity, this section will not repeat the description of the two types of decision models, and simply states the mathematical structure associated with each element. It is assumed that a decision maker can specify the following basic elements of a decision problem.

1. Action Space: A = {a}. A single action is denoted by a, while the set of all possible actions is denoted by A. It should be noted that the decision literature uses the term “actions” rather than “decisions”, though the two can be used somewhat interchangeably. Thus, a decision maker is to select a single action a ∈ A from the space of all possible actions.

2. State Space: Θ = {θ} (or Parameter Space). The decision process is affected by the unknown quantity θ ∈ Θ, which signifies the state of nature. The set of all possible states of nature is denoted by Θ. Thus, a decision maker perceives that a particular action a results in a corresponding state θ.

3. Consequence: C = {c}. The consequence of choosing a possible action under its state of nature may be multidimensional and can be mathematically stated as c(a, θ) ∈ C.

4. Loss Function: l(a, θ), defined on A × Θ. The objectives of a decision maker are described by a real-valued loss function l(a, θ), which measures the loss (or negative utility) of the consequence c(a, θ).

5. Family of Experiments: E = {e}. Typically, experiments are performed to obtain further information about each θ ∈ Θ. A single experiment is denoted by e, while the set of all possible experiments is denoted by E. Thus, a decision maker may select a single experiment e from a family of potential experiments, which can assist in determining the importance of possible actions or decisions.

6. Sample Space: X = {x}. An outcome of a potential experiment e ∈ E is denoted by x ∈ X. The role of experiments was described in (5) and hence is not repeated here. However, it should be noted that when a statistical investigation (such as an experiment) is performed to obtain information about θ, the subsequently observed outcome is a random variable X. The set of all possible outcomes is the sample space, while a particular realization of X is denoted by x. Notably, X is a subset of ℝ^n.

7. Decision Rule: δ(x) ∈ A. If a decision maker is to observe an outcome X = x and then choose a suitable action δ(x) ∈ A, then the result is to use the data to minimize the loss l(δ(x), θ). Sections 2 and 3 focus on the appropriate measures of minimization in decision processes.

8. Utility Evaluation: u(·, ·, ·, ·) on E × X × A × Θ. The quantification of a decision maker’s preferences is described by a utility function u(e, x, a, θ), which is assigned to a particular conduct of e, a resulting observed x, and the choice of a particular action a, with a corresponding θ. The evaluation of the utility function u takes into account the costs of an experiment as well as the consequences of the specific action, which may be monetary and/or of other forms.

Section 1.3: Probability Measures

Statistical decision theory is based on probability theory and utility theory. Focusing on the former, this subsection presents the elementary probability theory used in decision processes. The probability distribution of a random variable such as X, which is dependent on θ as stated above, is denoted by Pθ(E) or Pθ(X ∈ E), where E is an event.
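To tie the elements above to the Pθ notation, the following sketch assembles a small statistical decision problem in Python. The binomial experiment, squared-error loss and sample-proportion decision rule are illustrative assumptions rather than examples from the report; the expected loss computed here anticipates the minimization measures taken up in Sections 2 and 3.

# A minimal sketch of a statistical decision problem: an experiment e
# (n Bernoulli(theta) trials), a sample space X = {0, 1, ..., n}, a
# decision rule delta(x), and a loss l(a, theta). The setup is an
# illustrative assumption, not the report's running example.

from math import comb

n = 10  # experiment: observe x successes in n = 10 trials

def p_theta(x: int, theta: float) -> float:
    """P_theta(X = x): binomial probability of observing outcome x."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)

def loss(a: float, theta: float) -> float:
    """Squared-error loss l(a, theta) for estimating theta by action a."""
    return (a - theta) ** 2

def delta(x: int) -> float:
    """A decision rule delta(x) in A: estimate theta by the sample proportion."""
    return x / n

def expected_loss(theta: float) -> float:
    """E_theta[l(delta(X), theta)]: expected loss of the rule delta under theta."""
    return sum(loss(delta(x), theta) * p_theta(x, theta) for x in range(n + 1))

# For this rule and loss, the expected loss works out to theta*(1 - theta)/n.
for theta in (0.2, 0.5, 0.8):
    print(f"theta = {theta}: expected loss = {expected_loss(theta):.4f}")

Since δ(x) = x/n is unbiased for θ here, its expected squared-error loss is simply the variance θ(1 − θ)/n, which the exact summation above reproduces.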