Minimax Efficient Random Designs with Application to Model-Robust Design

Minimax Efficient Random Designs with Application to Model-Robust Design

Minimax efficient random designs with application to model-robust design for prediction Tim Waite [email protected] School of Mathematics University of Manchester, UK Joint work with Dave Woods S3RI, University of Southampton, UK Supported by the UK Engineering and Physical Sciences Research Council 8 Aug 2017, Banff, AB, Canada Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 1 / 36 Outline Randomized decisions and experimental design Random designs for prediction - correct model Extension of G-optimality Model-robust random designs for prediction Theoretical results - tractable classes Algorithms for optimization Examples: illustration of bias-variance tradeoff Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 2 / 36 Randomized decisions A well known fact in statistical decision theory and game theory: Under minimax expected loss, random decisions beat deterministic ones. Experimental design can be viewed as a game played by the Statistician against nature (Wu, 1981; Berger, 1985). Therefore a random design strategy should often be beneficial. Despite this, consideration of minimax efficient random design strategies is relatively unusual. Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 3 / 36 Game theory Consider a two-person zero-sum game. Player I takes action θ 2 Θ and Player II takes action ξ 2 Ξ. Player II experiences a loss L(θ; ξ), to be minimized. A random strategy for Player II is a probability measure π on Ξ. Deterministic actions are a special case (point mass distribution). Strategy π1 is preferred to π2 (π1 π2) iff Eπ1 L(θ; ξ) < Eπ2 L(θ; ξ) : Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 4 / 36 However, Player I's choice of θ is unknown to Player II. To account for uncertainty about θ, the standard choice is to play (if it exists) a minimax strategy, π∗, such that max Eπ∗ L(θ; ξ) = inf max Eπ L(θ; ξ) : θ2Θ π θ2Θ If both action spaces Θ and Ξ are finite (and not too large), minimax random strategies can be computed easily by solving a related linear programming problem. Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 5 / 36 Example: paper-rock-scissors, Θ = Ξ = fP; R; Sg, with loss matrix L(θ; ξ) below ξ PRS P 0 1 −1 θ R −1 0 1 S 1 −1 0 Let δ be any deterministic strategy and π = U(fP; R; Sg), then 1 1 1 E L(θ; ξ) = × (−1) + × 0 + × 1 = 0 ; 8θ 2 Θ : π 3 3 3 Hence maxθ Eπ L(θ; ξ) = 0 and maxθ Eδ L(θ; ξ) = 1. Thus π is preferable to any deterministic design. Indeed π is optimal. Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 6 / 36 Frequentist decision-theoretic experimental design In optimal (exact) design, attention is usually restricted to a deterministic choice n of design, ξ = (x1;:::; xn) 2 Ξ = X , a set of n points in design space X . We prefer a design with the lowest possible value of the risk R(θ; ξ) = Eyjξ;θ L(θ; ξ; y) However the risk often depends on a vector θ 2 Θ of fixed unknowns (e.g. model parameters in a nonlinear model). Hence, it is unknown which designs have minimum risk. [Design selection can be viewed as a game with loss L(θ; ξ) = R(θ; ξ). Player I: Nature, chooses θ; Player II: the Statistician, chooses ξ.] Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 7 / 36 Minimax design There is thus a need to account for uncertainty about θ when choosing ξ. Many frequentists are reluctant to use prior distributions. In this case, typically a deterministic minimax design is sought, i.e. a ξ∗ 2 Ξ that minimizes maxθ2Θ R(θ; ξ). Random designs Considerations from game theory and statistical decision theory would suggest that we also allow a random design, i.e. a probability measure π on Ξ. Interpretation: choose the realized design ξ at random by sampling from π. (A deterministic design corresponds to a point mass distribution.) Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 8 / 36 Expected loss for random designs If L is truly the loss function, utility theory implies that the performance of π is to be measured via R(θ; π) = Ey;ξjθ L(θ; ξ; y) : This makes intuitive sense: For a deterministic design we considered the repeated sampling distribution for L over hypothetical replications of the entire experiment. For a random design we do the same, but now for a different hypothetical replication, a different ξ will be sampled from π. A minimax random design π∗ satisfies max R(θ; π∗) = inf max R(θ; π) : θ π θ Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 9 / 36 Example: Fisherian randomization Consider a linear model contaminated by fixed unknown additive unit effects, T u = (u1;:::; un) 2 U, T 2 yi = f (xi )β + ui + i ; i ∼ N(0; σ ) : It was shown in many cases that the minimax random design strategy is Fisherian randomization of a standard design. π minimizes maxu2U R(u; π). Assumptions about the structure of the experimental units, e.g. exchangeability/blocks, described by a permutation group G. Different loss functions considered, e.g. A, L-optimality. [Wu, 1981; Li, 1983; Hooper, 1989; Bhaumik and Mathew, 1995]. Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 10 / 36 Fisherian randomization `is one of the greatest contributions of R. A. Fisher to science and statistics' (Wu, 1981). It seems to us a weakness of standard optimal design theory that Fisherian randomization does not arise as a necessary mathematical consequence. The preceding slide shows that it does arise directly from random designs and the minimax principle. This seems to be a big hint that minimax random designs are worth considering more widely. Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 11 / 36 Design for point prediction Suppose we have a normal theory linear model, T 2 yi = f (xi )β + i ; i ∼ N(0; σ ) ; q with design points x1;:::; xn 2 X ⊆ [−1; 1] . Suppose the goal is prediction at an unknown point x, with squared error loss L(θ; ξ; y) = [fT(x)β − fT(x)β^]2 ; depending on θ = (β; x). ^ T −1 T Above, β = (Fξ Fξ) Fξ y is the least squares estimator, Fξ is the model matrix. Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 12 / 36 Given a design ξ and σ2, the risk is 2 T T −1 R(x; ξ) = σ f (x)(Fξ Fξ) f(x) : Thus the minimax deterministic design minimizes T T −1 Φ(ξ) = max f (x)(Fξ Fξ) f(x) ; x2X i.e. it is the classic G-optimal design. However, this may be beaten by a minimax random design π∗, which minimizes T T −1 Φ(π) = max f (x)Eπf(Fξ Fξ) gf(x) : x2X Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 13 / 36 Example: quadratic model, 2 factors, X = {−1; 0; 1g2, n = 6. There are 76 possible non-singular designs up to permutations of run order. The minimax deterministic design has maximum expected loss 2:75σ2. Using linear programming, an optimal random design can be obtained; it has maximum expected loss 1:55σ2. The efficiency of the deterministic design is just 56%. Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 14 / 36 π(ξ1) = 0.2105 π(ξ2) = 0.1184 π(ξ3) = 0.1184 1.0 1.0 1.0 0.5 0.5 0.5 2 2 2 x x x 0.0 0.0 0.0 -1.0 -1.0 -1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 x1 x1 x1 π(ξ4) = 0.1184 π(ξ5) = 0.1053 π(ξ6) = 0.1053 1.0 1.0 1.0 0.5 0.5 0.5 2 2 2 x x x 0.0 0.0 0.0 -1.0 -1.0 -1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 x1 x1 x1 π(ξ7) = 0.1053 π(ξ8) = 0.1184 best deterministic 1.0 1.0 1.0 0.5 0.5 0.5 2 2 2 x x x 0.0 0.0 0.0 -1.0 -1.0 -1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 -1.0 -0.5 0.0 0.5 1.0 x1 x1 x1 Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 15 / 36 Model-robust design Variance-based optimality criteria Classic D; A; E-optimality etc. all assume that a particular parametric model is correct. Bias Box & Draper (1959) - polynomials of uncertain degree, found if model incorrect better off choosing the design to minimize bias, ignoring variance. Their investigation focussed only on polynomials. More sophisticated treatments of model-robust design exist (e.g. Wiens, 2015). Tim Waite (U. Manchester) Minimax efficient random designs Banff - 8 Aug 2017 16 / 36 Suppose that x 2 X , with X ⊆ [−1; 1]q, the design space. We assume y ∼ N[µ(x); σ2] ; and λ(X ) > 0, where λ is Lebesgue measure.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    36 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us