arXiv:1605.00265v1 [math.ST]

On the feasibility of semi-algebraic sets in Poisson regression

Thomas Kahle
Otto-von-Guericke Universität Magdeburg, Germany
[email protected], https://www.thomas-kahle.de
arXiv:1605.00265v1 [math.ST] 1 May 2016

Abstract. Designing experiments for generalized linear models is difficult because optimal designs depend on unknown parameters. The local optimality approach is to study the regions in parameter space where a given design is optimal. In many situations these regions are semi-algebraic. We investigate regions of optimality using computer tools such as yalmip, qepcad, and Mathematica.

Keywords: algebraic statistics, optimal experimental design, Poisson regression, semi-algebraic sets

1 Introduction

Generalized linear models are a mainstay of statistics, but optimal experimental designs for them are hard to find, as they depend on unknown parameters of the model. A common approach to this problem is to study local optimality, that is, to determine an optimal design for each fixed set of parameters. In practice, this means that appropriate parameters have to be guessed a priori, or fixed by other means. In [12] the authors approached this problem from a global perspective. They study the regions of optimality of fixed designs and demonstrate that these are often defined by semi-algebraic constraints. Their main tool is a general equivalence theorem due to Kiefer and Wolfowitz, which directly yields polynomial inequalities in the parameters. This makes these problems amenable to the toolbox of real algebraic geometry.

In this extended abstract we pursue this direction for the Rasch Poisson counts model, which is used in psychometry [6] in the design of mental speed tests. Analyzing saturated designs for this model amounts to studying the feasibility of polynomial inequality systems. We examine the state of computer algebra tools for this purpose and find that there is room for improvement.
Acknowledgement. The author is supported by the Research Focus Dynamical Systems (CDS) of the state Saxony-Anhalt.

2 Polynomial inequality systems in statistics

For brevity we omit any details of statistical theory and focus on mathematical and computational problems. The interested reader should consult [12] and its references. We also stick to that paper's notation.

Throughout, fix a positive integer $k$, the number of rules, and another positive integer $d \le k$, the interaction order. A rule setting is a binary string $x = (x_1,\dots,x_k) \in \{0,1\}^k$. The regression function of interaction order $d$ is the function $f : \{0,1\}^k \to \{0,1\}^p$ whose components are all square-free monomials of degree at most $d$ in the indeterminates $x_1,\dots,x_k$. The value $p$ equals the number of square-free monomials of degree at most $d$ and depends on $d$ and $k$. For any $\beta \in \mathbb{R}^p$, the intensity of the rule setting $x \in \{0,1\}^k$ is

$$\lambda(x,\beta) = e^{f(x)^T \beta}.$$

The information matrix of $x$ at $\beta$ is the rank one matrix

$$M(x,\beta) = \lambda(x,\beta)\, f(x) f(x)^T.$$

The information matrix polytope is

$$P(\beta) = \operatorname{conv}\{M(x,\beta) : x \in \{0,1\}^k\}.$$

The case $d = 1$ and $k$ arbitrary is known as the model with $k$ independent rules. In this case $f(x) = (1, x_1,\dots,x_k)$ and $p = 1 + k$. Then $P(0)$ is known as the correlation polytope, a well-studied polytope in combinatorial optimization. This case is particularly well-behaved and relevant for practitioners. It was investigated in depth in [7,8,9,12].

The pairwise interaction model arises for $d = 2$, where

$$f(x) = (1, x_1,\dots,x_k, x_1x_2, x_1x_3,\dots,x_{k-1}x_k)$$

and $p = 1 + k + \binom{k}{2}$. This situation is already so intricate that neither an algebraic description of the model (the set of vectors $(\lambda(x,\beta))_{x\in\{0,1\}^k}$ parametrized by $\beta \in \mathbb{R}^p$) nor an explicit description of the polytope $P(\beta)$ is known.

An approximate design is a vector $(w_x)_{x\in\{0,1\}^k} \in [0,1]^{2^k}$ of non-negative weights with $\sum_x w_x = 1$. To each approximate design there is a matrix

$$M(w,\beta) = \sum_x w_x M(x,\beta) \in P(\beta).$$
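The definitions of this section can be made concrete in a few lines of code. The following is a minimal sketch in plain Python, not taken from the paper or from [12]; the function names (regression_function, num_parameters, intensity, information_matrix, design_matrix) are hypothetical and chosen only for illustration. It assumes the monomials are ordered by degree and then lexicographically, which reproduces the ordering $(1, x_1,\dots,x_k, x_1x_2,\dots,x_{k-1}x_k)$ used above.

```python
from itertools import combinations, product
from math import comb, exp


def regression_function(x, d):
    """Evaluate all square-free monomials of degree <= d at the binary vector x."""
    k = len(x)
    values = []
    for degree in range(d + 1):
        for subset in combinations(range(k), degree):
            # The empty subset (degree 0) gives the constant monomial 1.
            values.append(int(all(x[i] == 1 for i in subset)))
    return values


def num_parameters(k, d):
    """p = number of square-free monomials of degree <= d in k indeterminates."""
    return sum(comb(k, j) for j in range(d + 1))


def intensity(x, beta, d):
    """lambda(x, beta) = exp(f(x)^T beta)."""
    fx = regression_function(x, d)
    return exp(sum(f_i * b_i for f_i, b_i in zip(fx, beta)))


def information_matrix(x, beta, d):
    """Rank one matrix M(x, beta) = lambda(x, beta) f(x) f(x)^T as nested lists."""
    fx = regression_function(x, d)
    lam = intensity(x, beta, d)
    return [[lam * a * b for b in fx] for a in fx]


def design_matrix(weights, beta, d):
    """M(w, beta) = sum_x w_x M(x, beta); weights maps rule settings to w_x."""
    p = len(beta)
    total = [[0.0] * p for _ in range(p)]
    for x, w in weights.items():
        m = information_matrix(x, beta, d)
        for i in range(p):
            for j in range(p):
                total[i][j] += w * m[i][j]
    return total


# Example: k = 2 independent rules (d = 1), uniform weights on all rule settings.
uniform = {x: 0.25 for x in product((0, 1), repeat=2)}
M_uniform = design_matrix(uniform, [0.0, 0.0, 0.0], 1)
```

At $\beta = 0$ all intensities equal 1, so M_uniform is simply the average of the matrices $f(x)f(x)^T$ over the four rule settings, a point of the correlation polytope $P(0)$.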
The main problem of classical design theory is to find designs $w$ that are optimal with regard to some criterion. We limit ourselves to D-optimality, where the determinant ought to be maximized. To simplify the problem, we also only consider maximizing the determinant over $P(\beta)$, and not finding explicit weights $w$ that realize an optimal matrix in $P(\beta)$. In non-linear regression, such as the Poisson regression considered here, this optimal solution depends on $\beta$ (in linear regression it does not). Our approach is to consider the set of optimization problems for all $\beta$ and subdivide them into regions where the optima are structurally similar. These regions of optimality are semi-algebraic.

In our setting, there are always matrices with positive determinant in $P(\beta)$. Since the vertices are rank one matrices, the optimum cannot be attained on any face that is the convex hull of fewer than $p$ vertices. A design $w$ is saturated if it achieves this lower bound, that is, $|\operatorname{supp}(w)| = p$.

As the logarithm of the determinant is concave, for each given $\beta$, the optimization problem can be treated with the tools of convex optimization. The design problem is to determine the changes in the optimal solution as $\beta$ varies.

A special design, relevant for practitioners and studied in [12], is the corner design $w^*_{k,d}$. It is the saturated design with equal weights $w_x = 1/p$ for all $x \in \{0,1\}^k$ with $|x|_1 \le d$. For example, for $k = 3$ rules and interaction order $d = 2$ the regression function is

$$f(x_1, x_2, x_3) = (1, x_1, x_2, x_3, x_1x_2, x_1x_3, x_2x_3)$$

and there are $p = 7$ parameters. The corner design has weight $1/7$ on the seven binary 3-vectors different from $(1,1,1)$.

Saturated designs are mathematically attractive due to their combinatorial nature. This is reflected in the following classical theorem of Kiefer and Wolfowitz, which is a main tool in the theory of optimal designs. See [15, Section 9.4] or [13] for details and proofs.

Theorem 1.
Let $X \subset \{0,1\}^k$ be of size $p$. There is a matrix with optimal determinant in the face $\operatorname{conv}\{M(x,\beta) : x \in X\}$ if and only if for all $x \in \{0,1\}^k$

$$\lambda(x,\beta)\, (F^{-T} f(x))^T\, \psi^{-1}(\beta)\, (F^{-T} f(x)) \le 1,$$

where $F$ is the $(p \times p)$-matrix with rows $f(x)$, $x \in X$, and $\psi$ is the diagonal matrix $\operatorname{diag}(e^{\beta_1},\dots,e^{\beta_p})$. If this is the case, then the optimal point is $\frac{1}{p} \sum_{x\in X} M(x,\beta)$, the geometric center of the face.

After changing the scale by introducing the parameters $\mu_i = e^{\beta_i}$, Theorem 1 yields a system of rational polynomial inequalities in the $\mu_i$. Together with the requirements $\mu_i > 0$, we find a semi-algebraic characterization of regions of optimality for saturated designs. For example, the inequalities corresponding to the corner design are the topic of [12]. It can be seen that there always exist parameters $\beta_1,\dots,\beta_p$ that satisfy the inequalities in Theorem 1. A good benchmark for our understanding of the semi-algebraic geometry of the Rasch Poisson counts model is to understand the other saturated designs, a problem raised as [12, Question 3.7].

Question 1. When $\beta_i < 0$ for all $i = 1,\dots,p$, is the corner design the only saturated design $w$ that admits parameters $\beta$ such that $w$ is D-optimal for $\beta$?

For $d = 1$, $k = 3$, Question 1 has been answered by Graßhoff et al. They have shown that, up to fractional factorial designs at $\beta = 0$, only the corner design yields a feasible system [9]. Using computer algebra, the case $d = 1$, $k = 4$ can be attacked.

3 Non-optimality of saturated designs for four predictors

Our benchmark problem for computational treatment of inequality systems is an extension of the content of [9] to the case $d = 1$ and $k = 4$. Together with Philipp Meissner, at the time of writing a master's student, we have undertaken computational experiments. In this situation $p = 5$ and a saturated design is specified by a choice of its support $X \subset \{0,1\}^4$ with $|X| = 5$. A number of reductions applies.
For example, if all 5 points lie in a three-dimensional cube, the determinant can be seen to be equal to zero throughout the face, so that optimality is precluded from the beginning. The hyperoctahedral symmetry acts on the designs and the inequalities. Therefore only one representative of each orbit has to be considered. After these reductions we are left with 17 systems of inequalities, one for each orbit of supports of saturated designs. One orbit corresponds to the corner design, for which there always exist parameters at which it is optimal. It is conjectured that the remaining 16 saturated designs admit no parameters under which they are optimal. Theorem 1 translates this conjecture into the infeasibility of 16 inequality systems. The most complicated looking among them is the following.

\begin{align*}
4\mu_1\mu_2\mu_3\mu_4 + \mu_1\mu_3 + \mu_1\mu_2 + 4\mu_2\mu_3 + \mu_4 - 9\mu_2\mu_3\mu_4 &\le 0 \\
4\mu_1\mu_2\mu_3\mu_4 + \mu_2\mu_3 + \mu_1\mu_2 + 4\mu_1\mu_3 + \mu_4 - 9\mu_1\mu_3\mu_4 &\le 0 \\
4\mu_1\mu_2\mu_3\mu_4 + \mu_2\mu_3 + \mu_1\mu_3 + 4\mu_1\mu_2 + \mu_4 - 9\mu_1\mu_2\mu_4 &\le 0 \\
\mu_1\mu_2\mu_3\mu_4 + \mu_2\mu_3 + \mu_1\mu_3 + \mu_1\mu_2 + \mu_4 - 9\mu_1\mu_2\mu_3 &\le 0 \\
\mu_1\mu_2\mu_3\mu_4 + \mu_1\mu_3 + \mu_2\mu_3 + 4\mu_1\mu_2 + 4\mu_4 - 9\mu_3\mu_4 &\le 0 \\
\mu_1\mu_2\mu_3\mu_4 + \mu_1\mu_2 + 4\mu_1\mu_3 + \mu_2\mu_3 + 4\mu_4 - 9\mu_2\mu_4 &\le 0 \\
\mu_1\mu_2\mu_3\mu_4 + \mu_1\mu_2 + 4\mu_2\mu_3 + \mu_1\mu_3 + 4\mu_4 - 9\mu_1\mu_4 &\le 0 \\
\mu_1\mu_2\mu_3\mu_4 + 4\mu_1\mu_3 + 4\mu_2\mu_3 + \mu_1\mu_2 + \mu_4 - 9\mu_3 &\le 0 \\
\mu_1\mu_2\mu_3\mu_4 + 4\mu_1\mu_2 + \mu_1\mu_3 + 4\mu_2\mu_3 + \mu_4 - 9\mu_2 &\le 0 \\
\mu_1\mu_2\mu_3\mu_4 + 4\mu_1\mu_2 + \mu_2\mu_3 + 4\mu_1\mu_3 + \mu_4 - 9\mu_1 &\le 0 \\
4\mu_1\mu_2\mu_3\mu_4 + \mu_1\mu_2 + \mu_1\mu_3 + \mu_2\mu_3 + 4\mu_4 - 9 &\le 0 \\
\mu_1 > 0,\ \mu_2 > 0,\ \mu_3 > 0,\ \mu_4 > 0. &
\end{align*}
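Before handing such a system to an exact tool, it can be helpful to evaluate it numerically. The following is a minimal sketch in plain Python, not the method used in the paper: it transcribes the eleven left-hand sides above and samples random positive points on a logarithmic scale. The names system_values, is_feasible and random_search, as well as the sampling range, are assumptions made for illustration. A search that finds no point is only the absence of a witness; it proves nothing about infeasibility, which requires exact methods such as those named in the abstract.

```python
import random
from math import exp


def system_values(m1, m2, m3, m4):
    """Left-hand sides of the eleven inequalities, in the order listed above."""
    q = m1 * m2 * m3 * m4
    return [
        4*q + m1*m3 + m1*m2 + 4*m2*m3 + m4 - 9*m2*m3*m4,
        4*q + m2*m3 + m1*m2 + 4*m1*m3 + m4 - 9*m1*m3*m4,
        4*q + m2*m3 + m1*m3 + 4*m1*m2 + m4 - 9*m1*m2*m4,
        q + m2*m3 + m1*m3 + m1*m2 + m4 - 9*m1*m2*m3,
        q + m1*m3 + m2*m3 + 4*m1*m2 + 4*m4 - 9*m3*m4,
        q + m1*m2 + 4*m1*m3 + m2*m3 + 4*m4 - 9*m2*m4,
        q + m1*m2 + 4*m2*m3 + m1*m3 + 4*m4 - 9*m1*m4,
        q + 4*m1*m3 + 4*m2*m3 + m1*m2 + m4 - 9*m3,
        q + 4*m1*m2 + m1*m3 + 4*m2*m3 + m4 - 9*m2,
        q + 4*m1*m2 + m2*m3 + 4*m1*m3 + m4 - 9*m1,
        4*q + m1*m2 + m1*m3 + m2*m3 + 4*m4 - 9,
    ]


def is_feasible(mu):
    """True if mu = (mu1, ..., mu4) is positive and satisfies all eleven inequalities."""
    return all(m > 0 for m in mu) and all(v <= 0 for v in system_values(*mu))


def random_search(trials=10000, seed=0, log_range=5.0):
    """Sample mu_i = exp(u) with u uniform in [-log_range, log_range].

    Returns the first feasible point found, or None if none was found.
    None is NOT a proof of infeasibility.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        mu = tuple(exp(rng.uniform(-log_range, log_range)) for _ in range(4))
        if is_feasible(mu):
            return mu
    return None
```

For instance, at $\mu = (1,1,1,1)$ the fourth inequality holds with value $-4$ while the other ten left-hand sides all equal $2$, so this point is not feasible.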
