Discrete Convexity and Log-Concave Distributions In Higher Dimensions

1 Basic definitions and notation

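1. $\mathbb{R}^n$: The usual vector space of real $n$-tuples.

2. Convex set: A subset $C$ of $\mathbb{R}^n$ is said to be convex if $\lambda x + (1 - \lambda) y \in C$ for any $x, y \in C$ and $0 < \lambda < 1$.

3. Convex hull: Let $S \subseteq \mathbb{R}^n$. The convex hull of $S$, denoted by $\mathrm{conv}(S)$, is the intersection of all convex sets containing $S$ (equivalently, the set of all convex combinations of points of $S$).

4. Convex function: Let $f$ be a function from $C \subseteq \mathbb{R}^n$ to $(-\infty, \infty]$, where $C$ is a convex set (for example, $C = \mathbb{R}^n$). Then $f$ is convex on $C$ iff $f((1 - \lambda) x + \lambda y) \le (1 - \lambda) f(x) + \lambda f(y)$ for all $x, y \in C$ and $0 < \lambda < 1$.

5. Log-concave function: A function $f$ is said to be log-concave if $\log f$ is concave (equivalently, $-\log f$ is convex).

6. Epigraph of a function: Let $f : S \subseteq \mathbb{R}^n \to \mathbb{R} \cup \{\pm\infty\}$. The set $\{(x, \mu) \mid x \in S,\ \mu \in \mathbb{R},\ \mu \ge f(x)\}$ is called the epigraph of $f$ (denoted by $\mathrm{epi}(f)$).

7. Effective domain: The effective domain (denoted by $\mathrm{dom}(f)$) of a convex function $f$ on $S$ is the projection on $\mathbb{R}^n$ of $\mathrm{epi}(f)$: $\mathrm{dom}(f) = \{x \mid \exists \mu,\ (x, \mu) \in \mathrm{epi}(f)\} = \{x \mid f(x) < \infty\}$.

8. Lower semi-continuity: An extended real-valued function $f$ given on a set $S \subset \mathbb{R}^n$ is said to be lower semi-continuous at a point $x \in S$ if $f(x) \le \liminf_{i \to \infty} f(x_i)$ for every sequence $x_1, x_2, \ldots \in S$ such that $x_i \to x$ and the limit of $f(x_1), f(x_2), \ldots$ exists in $[-\infty, \infty]$.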

2 Preliminaries

In this section, we recall some basic concepts and results of Convex Analysis.

1. (Alternative definitions for convexity)

(a) $f$ is convex on $S \subset \mathbb{R}^n$ (given that $S$ is convex) iff $\mathrm{epi}(f)$ is convex as a subset of $\mathbb{R}^{n+1}$.
(b) Convexity of $f$ is equivalent to that of the restriction of $f$ to $\mathrm{dom}(f)$.
(c) A non-empty subset $D \subset \mathbb{R}^n$ is convex iff its indicator function is convex.

Note that in the context of Convex Analysis, the indicator function of a non-empty subset $S \subset \mathbb{R}^n$ can be defined as

$$\psi_S(x) = \begin{cases} 0, & x \in S, \\ +\infty, & \text{otherwise.} \end{cases}$$

2. (Rockafellar, Corollary 10.1.1) A convex function finite on all of $\mathbb{R}^n$ is necessarily continuous.

3. A convex function $f$ is proper if $\mathrm{epi}(f) \ne \emptyset$ and $\mathrm{epi}(f)$ contains no vertical lines. In other words, a proper convex function is obtained by taking a finite convex function on a non-empty convex set $C$ and then extending it to all of $\mathbb{R}^n$ by setting $f(x) = \infty$ for $x \notin C$.

4. A convex function $f : \mathbb{R}^n \to [-\infty, \infty]$ is said to be closed if its epigraph is closed.

5. (Topological properties of convex functions) Let $f$ be an arbitrary function from $\mathbb{R}^n$ to $[-\infty, \infty]$. Then the following are equivalent:

(a) $f$ is lower semi-continuous throughout $\mathbb{R}^n$.
(b) $\{x \mid f(x) \le \alpha\}$ is closed for every $\alpha \in \mathbb{R}$.
(c) The epigraph of $f$ is a closed subset of $\mathbb{R}^{n+1}$.

6. Let $f \in C^2(C)$, where $C \subset \mathbb{R}^n$ is an open convex set. Then $f$ is convex on $C$ iff its Hessian matrix is positive semi-definite for all $x \in C$.

3 Discrete convexity in higher-dimensions

In higher dimensions, there is no single agreed definition of discrete convexity (equivalently, concavity): in $\mathbb{Z}^d$, $d \ge 2$, several non-equivalent definitions are in use. The following generalization of the concept of log-concavity for discrete multivariate distributions was proposed by Bapat ’88.

Definition 1. (generalized log-concave) Let $S \subset \mathbb{N}^d$, $d \ge 2$, and let $f : S \to (0, \infty)$. $f$ is said to be generalized log-concave on $S$ if, for any $k \in \mathbb{N}^d$ such that $k + e_i + e_j \in S$ for all $i, j = 1, 2, \ldots, d$, the positive symmetric $d \times d$ matrix $((f(k + e_i + e_j)))$ has exactly one (simple) positive eigenvalue. Here $e_i$ denotes the $i$-th standard unit vector.

The following theorem follows from the above definition and an intermediate lemma (the interested reader may refer to Bapat, "Discrete multivariate distributions and generalized log-concavity", ’88).

Theorem 2. Let $S \subset \mathbb{N}^d$, $d \ge 2$, and let $f : S \to (0, \infty)$. Suppose there exist log-concave functions $f_i : \mathbb{N} \to (0, \infty)$, $i = 1, 2, \ldots, d$, such that $f(x) = \prod_{i=1}^{d} f_i(x_i)$, where $x = (x_1, x_2, \ldots, x_d) \in S$. Then $f$ is generalized log-concave.

Remarks 1.

1. Using Theorem 2, it can be verified that distributions such as the hypergeometric, negative hypergeometric and multinomial are generalized log-concave (g.l.c.).

2. The definition of g.l.c. need not be restricted to $\mathbb{N}^d$ and can easily be extended to $\mathbb{Z}^d$.
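As an illustration of Definition 1 and Theorem 2 in the smallest case $d = 2$, the following Python sketch takes the product function $f(x) = 1/(x_1!\, x_2!)$, whose factors $f_i(n) = 1/n!$ are log-concave, and checks the eigenvalue condition for a few values of $k$. The choice of $f$, the function names and the range of $k$ are assumptions made for this example only. The check uses the fact that a symmetric $2 \times 2$ matrix with positive trace and negative determinant has exactly one positive eigenvalue.

```python
from fractions import Fraction
from math import factorial

def f(x1, x2):
    # Illustrative product of log-concave factors f_i(n) = 1/n!
    # (an assumption for this sketch, not an example taken from Bapat).
    return Fraction(1, factorial(x1) * factorial(x2))

def glc_matrix(k1, k2):
    # Entries f(k + e_i + e_j) of the symmetric 2 x 2 matrix [[a, b], [b, c]].
    a = f(k1 + 2, k2)      # i = j = 1
    b = f(k1 + 1, k2 + 1)  # i != j
    c = f(k1, k2 + 2)      # i = j = 2
    return a, b, c

def has_exactly_one_positive_eigenvalue(k1, k2):
    a, b, c = glc_matrix(k1, k2)
    trace, det = a + c, a * c - b * b
    # The two eigenvalues sum to the trace and multiply to the determinant,
    # so trace > 0 and det < 0 means one positive and one negative eigenvalue.
    return trace > 0 and det < 0

all_ok = all(has_exactly_one_positive_eigenvalue(k1, k2)
             for k1 in range(6) for k2 in range(6))
```

Exact rational arithmetic is used so that the sign of the determinant is not affected by rounding.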

Murota and Shioura ’01 provided a detailed survey of convex functions and sets in higher dimensions. Among the notions they discuss, we are particularly interested in the following three.

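Definition 3. (Separable-convex) Consider a function $h : \mathbb{Z}^d \to \mathbb{R} \cup \{\infty\}$ and define the domain $\mathrm{dom}(h) = \{z \in \mathbb{Z}^d \mid h(z) < \infty\}$. $h$ is said to be separable-convex if $h(z) = \sum_{i=1}^{d} h_i(z_i)$ $(z \in \mathbb{Z}^d)$ for a finite family of discrete convex functions $h_i : \mathbb{Z} \to \mathbb{R} \cup \{\infty\}$, $i \in \{1, 2, \ldots, d\}$. That is, $(\Delta h_i)(z) \ge 0$ for all $z \in \mathbb{Z}$ and for all $i$.

Here $\Delta$ is the discrete Laplacian operator, $(\Delta h)(z) = h(z - 1) - 2h(z) + h(z + 1)$.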

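Definition 4. (Discretely-convex) For $x \in \mathbb{R}^d$, define the set $N_0(x) = \{z \in \mathbb{Z}^d \mid \lfloor x \rfloor \le z \le \lceil x \rceil\}$, where the floor, the ceiling and the inequalities are taken componentwise. $h$ is said to be discretely-convex if, for any $x, y \in \mathrm{dom}(h)$ and $\alpha \in [0, 1]$,

$$\min\{h(z) \mid z \in N_0(\alpha x + (1 - \alpha) y)\} \le \alpha h(x) + (1 - \alpha) h(y).$$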

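Similarly, a set $S \subseteq \mathbb{Z}^d$ is said to be discretely-convex if, for any $x, y \in S$ and any $\alpha \in [0, 1]$, $N_0(\alpha x + (1 - \alpha) y) \cap S \ne \emptyset$.

Note 1. Alternatively, Murota and Shioura defined $S$ to be discretely-convex if its indicator function $\delta_S$ (equal to $0$ on $S$ and $+\infty$ elsewhere) is discretely-convex.

Definition 5. (Convex-extendible) $h$ is said to be convex-extendible if $h(z) = \bar{h}(z)$ for all $z \in \mathbb{Z}^d$, where $\bar{h}$ is the convex closure of $h$, defined as

$$\bar{h}(x) = \sup_{\alpha \in \mathbb{R},\ \beta \in \mathbb{R}^d} \{\alpha + \beta^\top x \mid \alpha + \beta^\top z \le h(z) \text{ for all } z \in \mathbb{Z}^d\}.$$

Similarly, a set $S \subseteq \mathbb{Z}^d$ is said to be convex-extendible if $\bar{S} \cap \mathbb{Z}^d = S$, where $\bar{S}$ is the convex closure of $S$.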
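To make Definition 5 concrete in the simplest setting, the following Python sketch treats a univariate $h$ that is finite on $\{0, 1, \ldots, n-1\}$ and $+\infty$ elsewhere. In that case the convex closure restricted to the domain is the lower convex envelope of the points $(z, h(z))$, which the sketch computes with a standard monotone-chain hull. The function names are hypothetical, and the restriction to one dimension and a finite domain is an assumption of the example.

```python
from fractions import Fraction

def cross(o, a, b):
    # Positive when the path o -> a -> b turns counter-clockwise.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_closure_on_lattice(values):
    """values[z] = h(z) for z = 0, ..., n-1 (all finite); h = +inf elsewhere.
    Returns the convex closure of h evaluated at z = 0, ..., n-1."""
    pts = [(Fraction(z), Fraction(v)) for z, v in enumerate(values)]
    hull = []  # vertices of the lower convex envelope, left to right
    for p in pts:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    closure = []
    for z, _ in pts:
        for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
            if x0 <= z <= x1:
                # Linear interpolation along the envelope segment.
                closure.append(y0 + (y1 - y0) * (z - x0) / (x1 - x0))
                break
        else:
            closure.append(hull[0][1])  # domain consists of a single point
    return closure

def is_convex_extendible(values):
    # h is convex-extendible iff it agrees with its convex closure on the lattice.
    return convex_closure_on_lattice(values) == [Fraction(v) for v in values]
```

For example, `is_convex_extendible([0, 1, 4, 9])` holds, whereas for `[0, 3, 1, 4]` the closure takes the value $1/2$ at $z = 1$, below $h(1) = 3$, so that function is not convex-extendible.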

Remarks 2.

1. For a discrete convex-extendible function $h$, its convex extension is a closed, continuous convex function $h^{\mathbb{R}}$ defined on $\mathbb{R}^d$ such that $h^{\mathbb{R}}(z) = h(z)$ for all $z \in \mathbb{Z}^d$.

2. The convex closure $\bar{h}(x)$ is convex, and it is the greatest convex extension of $h$.

3. Separable-convex ⊂ (discretely-convex ∩ convex-extendible) (By Murota and Shioura ’01).

4. The following result gives a criterion for checking whether a given discrete function is convex-extendible.

Lemma 6. (Murota and Shioura) Let $h : \mathbb{Z}^d \to \mathbb{R} \cup \{\infty\}$ be some function. Then $h(z) = \bar{h}(z)$ for all $z \in \mathbb{Z}^d$ iff there exists a closed convex extension of $h$.

Example 1. Consider $h(z) = z^\top A z$, where $z \in \mathbb{Z}^d$ and $A$ is a symmetric $d \times d$ positive semi-definite matrix. Then $h^{\mathbb{R}}(x) = x^\top A x$, $x \in \mathbb{R}^d$, is a convex extension of $h$ (convexity of $h^{\mathbb{R}}$ follows from Rockafellar, Theorem 4.5 on page 27). Since $h^{\mathbb{R}}$ is continuous, it is closed. Thus, by Lemma 6, $h$ is convex-extendible.

Example 2. Consider the set,

$$S = \{z \in \mathbb{Z}^3 \mid z_1 + z_2 + z_3 = 2,\ z_i \ge 0,\ i = 1, 2, 3\} \cup \{(1, 2, 0)\} \cup \{(0, 1, 2)\} \cup \{(2, 0, 1)\}.$$

Define $h(z) = 0$ on $S$ and $\infty$ on $\mathbb{Z}^3 \setminus S$. Note that $S$ and $h$ are discretely-convex; however, $\tfrac{1}{3}(1, 2, 0) + \tfrac{1}{3}(0, 1, 2) + \tfrac{1}{3}(2, 0, 1) = (1, 1, 1) \in \bar{S} \cap \mathbb{Z}^3$ but $(1, 1, 1) \notin S$. Thus, $h$ is not convex-extendible.

Example 3. Let $S = \{x = (0, 0),\ y = (2, 1)\}$ and let the function $h$ be defined as before, that is, $h = 0$ on $S$ and $\infty$ elsewhere.

The convex closure of $S$ is the segment between $(0, 0)$ and $(2, 1)$. Thus, $\bar{S} \cap \mathbb{Z}^2 = S$. We conclude that $h$ and $S$ are convex-extendible. On the other hand, $h$ is not discretely-convex, since $N_0(0.5x + 0.5y) = N_0((1, 0.5)) = \{(1, 0), (1, 1)\}$, so $N_0(\alpha x + (1 - \alpha) y) \cap S = \emptyset$ for $\alpha = 0.5$.
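Examples 2 and 3 can be checked numerically with a short Python sketch. The helper names are hypothetical, and the discrete-convexity test only samples $\alpha$ on a finite grid, so a positive answer is a sanity check under that assumption rather than a proof.

```python
from fractions import Fraction
from itertools import combinations, product
from math import ceil, floor

def N0(x):
    """Integer points z with floor(x) <= z <= ceil(x), componentwise."""
    return set(product(*[range(floor(c), ceil(c) + 1) for c in x]))

def discretely_convex_on_grid(S, steps=12):
    """Check that N0(a*x + (1-a)*y) meets S, for a = 0, 1/steps, ..., 1 only."""
    for x, y in combinations(sorted(S), 2):
        for i in range(steps + 1):
            a = Fraction(i, steps)
            point = [a * xi + (1 - a) * yi for xi, yi in zip(x, y)]
            if not (N0(point) & S):
                return False
    return True

# Example 2: lattice points of the simplex z1 + z2 + z3 = 2 plus three extra points.
simplex = {z for z in product(range(3), repeat=3) if sum(z) == 2}
extra = {(1, 2, 0), (0, 1, 2), (2, 0, 1)}
S2 = simplex | extra
# Average of the three extra points; it lies in the convex hull of S2.
centroid = tuple(Fraction(sum(p[i] for p in extra), 3) for i in range(3))

# Example 3: two points whose midpoint has no integer neighbour in the set.
S3 = {(0, 0), (2, 1)}
midpoint = (Fraction(1), Fraction(1, 2))
```

Here `centroid` equals $(1, 1, 1)$, which is not in `S2`, while `N0(midpoint)` is $\{(1, 0), (1, 1)\}$ and is disjoint from `S3`.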

4 Discrete log-concave distributions in higher dimensions

Throughout this section, we focus on PMFs defined on $\mathbb{Z}$. First, we introduce the notion of a log-concave probability mass function in the one-dimensional setting. This is due to Balabdaoui et al. ’13.

Definition 7. (Balabdaoui et al. ’13) Let $p : \mathbb{Z} \to [0, 1]$ denote a PMF. $p$ is said to be log-concave if, for any $z \in \mathbb{Z}$, $(\Delta h)(z) = h(z - 1) - 2h(z) + h(z + 1) \ge 0$, where $h(z) = -\log p(z)$.

Remarks 3. $\Delta$ denotes the discrete Laplacian operator, which can also be expressed as $(\Delta h)(z) = h(z - 1) - h(z) - (h(z) - h(z + 1))$. This is the second difference of $h$; thus the condition $\Delta h \ge 0$ matches the second-derivative characterization of convexity in the continuous setting.
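The second-difference condition of Definition 7 is easy to test numerically. The Python sketch below assumes the PMF is given as a list of strictly positive probabilities on consecutive integers, so that only interior points of the support need to be checked (outside the support $h = +\infty$). The function names, the tolerance and the two test distributions are choices made for this illustration.

```python
from math import comb, log

def second_difference(h, z):
    # (Delta h)(z) = h(z - 1) - 2 h(z) + h(z + 1)
    return h[z - 1] - 2 * h[z] + h[z + 1]

def is_log_concave(pmf, tol=1e-12):
    """pmf: strictly positive probabilities on consecutive integers 0, 1, ..., n."""
    h = [-log(q) for q in pmf]
    # Only interior points are checked; tol absorbs floating-point rounding.
    return all(second_difference(h, z) >= -tol for z in range(1, len(h) - 1))

# Binomial(10, 0.3): log-concave.
binomial = [comb(10, k) * 0.3**k * 0.7**(10 - k) for k in range(11)]
# A bimodal PMF: the second difference of -log p is negative at z = 1.
bimodal = [0.4, 0.1, 0.1, 0.4]
```

With these inputs, `is_log_concave(binomial)` holds while `is_log_concave(bimodal)` does not.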

Definition 8. (R. Bapat) A PMF $p$ on $\mathbb{N}^d$ with support $S = \{z \in \mathbb{N}^d \mid p(z) > 0\}$ is said to be generalized log-concave if

$$p(z) = \prod_{i=1}^{d} p_i(z_i), \quad z \in S,$$

where each $p_i$ satisfies $(\Delta \log p_i)(z_i) \le 0$.

Note that each $p_i$ is a univariate discrete log-concave function (though not necessarily a PMF, so this definition is slightly different from separable log-concavity).

Definition 9. (Yan Hua Tian ’08) A PMF $p : \mathbb{Z}^d \to [0, 1]$ is said to be e-log concave (eLC) if $\log(p(z))$ is concave-extendible.

Remarks 4.

1. When $d = 1$, the class of all eLC PMFs defined on $\mathbb{Z}^d$, say $\mathcal{P}_0$, agrees with the class of discrete log-concave distributions.

2. If a $\mathbb{Z}^d$-valued random variable $X = (X_1, X_2, \ldots, X_d)$ has a distribution which is e-log concave and the components $X_1, X_2, \ldots, X_d$ are known to be mutually independent, then the PMF of $X$ can be written as $p(z) = e^{\varphi(z)}$, where $-\varphi(z)$ is separable-convex.

Note 2. The definition of generalized log-concavity is more restrictive than the eLC definition. We have the following relationship.

Proposition 10. Suppose that $p$ is generalized log-concave with support $S$. If $S$ is convex-extendible, then $p \in \mathcal{P}_0$.

Proof. For any $z \in S$, we have $h(z) = -\log p(z) = \sum_{i=1}^{d} \{-\log p_i(z_i)\} = \sum_{i=1}^{d} h_i(z_i)$, where each $p_i$ is discrete log-concave on $\mathbb{Z}$ and hence each $h_i = -\log p_i$ is discrete convex. Now, for each $i$, let $S_i = \{k \in \mathbb{Z} : (k)_i \in S\}$. Each $h_i$ is defined on $S_i$. Thus, by definition, $h$ is separable-convex on $S$. Hence, by Remarks 2 (item 3), it is convex-extendible.

Remarks 5. Based on the above proposition and the work of Bapat ’88 and Johnson et al. ’97, it can be shown that distributions such as the multinomial, negative multinomial, multivariate hypergeometric and multivariate negative hypergeometric are also eLC. This can be done by checking that their support is convex-extendible.
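For the multinomial case, the product form required by Definition 8 can be written out explicitly. The following is a sketch using the standard multinomial PMF with parameters $n$ and $\pi = (\pi_1, \ldots, \pi_d)$; this notation is assumed here and does not appear elsewhere in the text. The constant $n!$ can be absorbed into any one of the factors.

```latex
p(z) = n! \prod_{i=1}^{d} \frac{\pi_i^{z_i}}{z_i!},
\qquad
z \in S = \Bigl\{ z \in \mathbb{N}^d : \sum_{i=1}^{d} z_i = n \Bigr\},
\\[6pt]
p_i(k) = \frac{\pi_i^{k}}{k!},
\qquad
\frac{p_i(k-1)\, p_i(k+1)}{p_i(k)^2} = \frac{k}{k+1} \le 1
\quad\Longrightarrow\quad
(\Delta \log p_i)(k) = \log\frac{k}{k+1} \le 0 .
```

The support $S$ consists of the lattice points of the simplex $\{x \ge 0,\ \sum_i x_i = n\}$, and every integer point of that simplex already belongs to $S$, which is the convex-extendibility condition on the support.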

5 Properties of the class of eLC PMFs

Suppose that $p \in \mathcal{P}_0$.

1. The support of $p$, $S = \{z \in \mathbb{Z}^d \mid p(z) > 0\}$, is convex-extendible.

Proof. Let $h(z) = -\log p(z)$; then $h$ is convex-extendible by assumption. By Lemma 6, there exists a closed convex extension of $h$, say $h^{\mathbb{R}}(x)$, defined on $\mathbb{R}^d$. Therefore, the effective domain of $h^{\mathbb{R}}$, $\{x \mid h^{\mathbb{R}}(x) < \infty\}$, is a closed convex set in $\mathbb{R}^d$ (Rockafellar, Theorem 7.1 on page 51). The latter follows since, for a closed function, its epigraph must be closed and the effective domain is the projection of the epigraph onto $\mathbb{R}^d$. Since such a projection of a closed set must be closed, it follows that the effective domain is closed. Therefore, $S \subset \bar{S} \subseteq \{x \mid h^{\mathbb{R}}(x) < \infty\}$. Furthermore, since $h^{\mathbb{R}}(z) = h(z) = -\log p(z)$ is finite exactly when $p(z) > 0$, we have that $S = \mathbb{Z}^d \cap \{x \mid h^{\mathbb{R}}(x) < \infty\}$. It follows that $S = \bar{S} \cap \mathbb{Z}^d$. By definition, $S$ is convex-extendible.

2. For $A \subset S$, let

$$\tilde{p}(z) \propto \begin{cases} p(z), & z \in A, \\ 0, & \text{otherwise.} \end{cases}$$

If $A$ is convex-extendible, then $\tilde{p} \in \mathcal{P}_0$.

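Proof. Let $h(z) = -\log p(z)$; then $h$ is convex-extendible by assumption. By Lemma 6, there exists a closed convex extension $h^{\mathbb{R}}(x)$ of $h(z)$. Define

$$\tilde{h}^{\mathbb{R}}(x) = \begin{cases} h^{\mathbb{R}}(x) - \log c, & x \in \tilde{A}, \\ +\infty, & \text{otherwise,} \end{cases}$$

for $c = 1 / \sum_{z \in A} p(z)$, where $\tilde{A}$ denotes the convex closure of $A$. Clearly, $\tilde{h}^{\mathbb{R}}$ is also a closed convex function. Also, for all $z \in A$,

$$-\log \tilde{p}(z) = -\log(c\, p(z)) = h(z) - \log c = h^{\mathbb{R}}(z) - \log c = \tilde{h}^{\mathbb{R}}(z).$$

For $z \in \mathbb{Z}^d \setminus A$, convex-extendibility of $A$ gives $z \notin \tilde{A}$, so $\tilde{h}^{\mathbb{R}}(z) = +\infty = -\log \tilde{p}(z)$. Thus, $\tilde{h}^{\mathbb{R}}$ is a convex extension of $-\log \tilde{p}$. Therefore, $\tilde{p} \in \mathcal{P}_0$.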

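3. Let $p_1, p_2 \in \mathcal{P}_0$ with supports $S_1 = \{z_1 \in \mathbb{Z}^{d_1} \mid p_1(z_1) > 0\}$ and $S_2 = \{z_2 \in \mathbb{Z}^{d_2} \mid p_2(z_2) > 0\}$. Then $p(z) = p_1(z_1) p_2(z_2)$, $z = (z_1, z_2)$, has support $S_1 \times S_2 \subset \mathbb{Z}^{d_1 + d_2}$, and $p \in \mathcal{P}_0$.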

Proof. Let $h_1(z_1) = -\log p_1(z_1)$ and $h_2(z_2) = -\log p_2(z_2)$. Then $h_1, h_2$ are convex-extendible by assumption. Let $x_1 \in \mathbb{R}^{d_1}$, $x_2 \in \mathbb{R}^{d_2}$. By Lemma 6, there exist convex extensions $h_1^{\mathbb{R}}(x_1)$, $h_2^{\mathbb{R}}(x_2)$ of $h_1(z_1)$, $h_2(z_2)$ respectively. By definition, they are closed convex on $\mathbb{R}^{d_1}$ and $\mathbb{R}^{d_2}$. Define $h^{\mathbb{R}}(x_1, x_2) = h_1^{\mathbb{R}}(x_1) + h_2^{\mathbb{R}}(x_2)$.

Claim: $h^{\mathbb{R}}$ is convex on $\mathbb{R}^{d_1 + d_2}$ and closed.

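Proof. To prove convexity, let $x', x'' \in \mathbb{R}^{d_1 + d_2}$ with $x' = (x_1', x_2')$ and $x'' = (x_1'', x_2'')$, where $x_1', x_1'' \in \mathbb{R}^{d_1}$ and $x_2', x_2'' \in \mathbb{R}^{d_2}$. For any $\alpha \in [0, 1]$,

$$\begin{aligned}
h^{\mathbb{R}}(\alpha x' + (1 - \alpha) x'') &= h^{\mathbb{R}}(\alpha x_1' + (1 - \alpha) x_1'',\ \alpha x_2' + (1 - \alpha) x_2'') \\
&= h_1^{\mathbb{R}}(\alpha x_1' + (1 - \alpha) x_1'') + h_2^{\mathbb{R}}(\alpha x_2' + (1 - \alpha) x_2'') \\
&\le \alpha h_1^{\mathbb{R}}(x_1') + (1 - \alpha) h_1^{\mathbb{R}}(x_1'') + \alpha h_2^{\mathbb{R}}(x_2') + (1 - \alpha) h_2^{\mathbb{R}}(x_2'') \\
&= \alpha \bigl(h_1^{\mathbb{R}}(x_1') + h_2^{\mathbb{R}}(x_2')\bigr) + (1 - \alpha) \bigl(h_1^{\mathbb{R}}(x_1'') + h_2^{\mathbb{R}}(x_2'')\bigr) \\
&= \alpha h^{\mathbb{R}}(x_1', x_2') + (1 - \alpha) h^{\mathbb{R}}(x_1'', x_2'') \\
&= \alpha h^{\mathbb{R}}(x') + (1 - \alpha) h^{\mathbb{R}}(x'').
\end{aligned}$$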

To prove that $h^{\mathbb{R}}$ is closed, note that both $h_1^{\mathbb{R}}$ and $h_2^{\mathbb{R}}$ are closed, hence lower semi-continuous, and the sum of lower semi-continuous functions is again lower semi-continuous. We conclude that $h^{\mathbb{R}}$ is closed (Rockafellar, Theorem 7.1 on page 51).

This proves the claim. Finally, $h^{\mathbb{R}}(z) = h_1^{\mathbb{R}}(z_1) + h_2^{\mathbb{R}}(z_2) = h_1(z_1) + h_2(z_2) = -\log(p_1(z_1) p_2(z_2)) = -\log p(z)$, where $z = (z_1, z_2)$. Therefore, $p \in \mathcal{P}_0$.

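4. Suppose that $p \in \mathcal{P}_0$ with support in $\mathbb{Z}^d$ and let $z = (z_1, z_2)$, where $z_1 \in \mathbb{Z}^{d_1}$, $z_2 \in \mathbb{Z}^{d_2}$ with $d_1 + d_2 = d$. Then the conditional distribution $p(z_1 \mid z_2) = p(z_1, z_2) / p_2(z_2)$, where $p_2$ is the marginal of $z_2$, belongs to $\mathcal{P}_0$.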

Proof. Let $h(z) = -\log p(z)$; then $h$ is convex-extendible by assumption. Let $h^{\mathbb{R}}$ be the convex extension of $h$, which is closed convex on $\mathbb{R}^d$. Let $p_2$ denote the marginal of $p$, i.e., $p_2(z_2) = \sum_{z_1 \in \mathbb{Z}^{d_1}} p(z_1, z_2)$, and fix $z_2 \in \mathbb{Z}^{d_2}$ with $p_2(z_2) > 0$. Define $\tilde{h}^{\mathbb{R}}(x_1) = h^{\mathbb{R}}(x_1, x_2 = z_2) + \log p_2(z_2)$, where $x_1 \in \mathbb{R}^{d_1}$ and $z_2 \in \mathbb{Z}^{d_2}$ is fixed. We show that $\tilde{h}^{\mathbb{R}}$ is the convex extension of $-\log p(z_1 \mid z_2)$.

First, for any $z_1 \in \mathbb{Z}^{d_1}$, $\tilde{h}^{\mathbb{R}}(z_1) = h^{\mathbb{R}}(z_1, z_2) + \log p_2(z_2) = -\log p(z_1, z_2) + \log p_2(z_2) = -\log p(z_1 \mid z_2)$.

$\tilde{h}^{\mathbb{R}}(x_1)$ is convex since $h^{\mathbb{R}}(x)$ is convex in $x_1$ and $\log p_2(z_2)$ is a constant. We prove that $\tilde{h}^{\mathbb{R}}$ is closed. The argument is similar to Rockafellar, Theorem 7.1 on page 51.

Note that the level set of $h^{\mathbb{R}}(x_1, x_2)$ is closed since $h^{\mathbb{R}}(x) = h^{\mathbb{R}}(x_1, x_2)$ is closed. Let $C = \{x \in \mathbb{R}^d \mid h^{\mathbb{R}}(x) \le \alpha < \infty\}$. By Krantz ’91, Proposition 5.5, for any Cauchy sequence $\{x_n\} \subset C$, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})$ for $i = 1, 2, \ldots, n, \ldots$, its limit $x_0 = (x_{01}, x_{02}, \ldots, x_{0d})$ is also an element of $C$.

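For a fixed $z_2 = (z_{21}, \ldots, z_{2d_2}) \in \mathbb{Z}^{d_2}$, we first show that $h^{\mathbb{R}}(x_1, z_2)$ is closed as a function of $x_1$. For the same $\alpha$, the corresponding level set of $h^{\mathbb{R}}(x_1, z_2)$ is $\tilde{C} = \{x_1 \in \mathbb{R}^{d_1} \mid h^{\mathbb{R}}(x_1, z_2) \le \alpha\}$. We consider the following two cases.

(a) If $(x_1, z_2) \notin C$ for every $x_1 \in \mathbb{R}^{d_1}$, then $\tilde{C} = \emptyset$, which is closed.

(b) If $(x_1, z_2) \in C$ for some $x_1$, then each Cauchy sequence $\{\tilde{x}_n\} \subset \tilde{C}$, where $\tilde{x}_i = (\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{id_1})$, can be extended to a $d$-dimensional Cauchy sequence $\{x_n\}$ with $x_i = (\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{id_1}, z_{21}, \ldots, z_{2d_2})$ for all $i = 1, 2, \ldots, n, \ldots$. Clearly, $\{x_n\} \subset C$; let $x_0 = (x_{01}, x_{02}, \ldots, x_{0d_1}, z_{21}, \ldots, z_{2d_2}) = (\tilde{x}_0, z_2) \in C$ denote its limit. Thus, $h^{\mathbb{R}}(\tilde{x}_0, z_2) \le \alpha$. We conclude that $\tilde{x}_0 \in \tilde{C}$. By construction, $\tilde{x}_0$ is the limit of $\{\tilde{x}_n\}$. Hence, by Krantz ’91, Proposition 5.5, $\tilde{C}$ is closed.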

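Therefore, $h^{\mathbb{R}}(x_1, z_2)$ is closed. This implies that $\tilde{h}^{\mathbb{R}}(x_1)$ is closed, since $\log p_2(z_2)$ is a constant. Hence $\tilde{h}^{\mathbb{R}}$ is a closed convex function and thus the convex extension of $-\log p(z_1 \mid z_2)$. Hence $p(z_1 \mid z_2)$ is also eLC.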
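The construction in this proof can be illustrated numerically. The Python sketch below uses a hypothetical PMF $p(z) \propto \exp(-z^\top A z)$ on the box $\{-3, \ldots, 3\}^2$, with $A$ the positive definite matrix with rows $(1, 1/2)$ and $(1/2, 1)$, so that $-\log p$ is a quadratic of the type in Example 1 restricted to a box. It then checks that $-\log p(z_1 \mid z_2)$ has non-negative second differences in $z_1$. The PMF, the box and the function names are assumptions made for the example.

```python
from math import exp, log

BOX = range(-3, 4)

def quad(z1, z2):
    # z^T A z with A = [[1, 1/2], [1/2, 1]] (positive definite).
    return z1 * z1 + z1 * z2 + z2 * z2

# Joint PMF proportional to exp(-z^T A z) on the box.
weights = {(z1, z2): exp(-quad(z1, z2)) for z1 in BOX for z2 in BOX}
total = sum(weights.values())
p = {z: w / total for z, w in weights.items()}

def conditional(z2):
    # p(z1 | z2) = p(z1, z2) / p2(z2), with p2 the marginal of the second coordinate.
    p2 = sum(p[(z1, z2)] for z1 in BOX)
    return [p[(z1, z2)] / p2 for z1 in BOX]

def second_differences(pmf):
    # Second differences of h = -log(pmf) at the interior points.
    h = [-log(q) for q in pmf]
    return [h[i - 1] - 2 * h[i] + h[i + 1] for i in range(1, len(h) - 1)]

diffs = second_differences(conditional(1))
```

Since $-\log p(z_1 \mid z_2 = 1)$ equals $z_1^2 + z_1$ plus a constant, every entry of `diffs` is $2$ up to rounding error.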

5. Let $Z$ be a discrete random variable with PMF $p \in \mathcal{P}_0$ and support $S$. Consider the transformation $\tilde{Z} = AZ + b$, where $A$ is a $d \times d$ matrix and $b$ is a vector of length $d$. Let $\tilde{p}$ denote the PMF of $\tilde{Z}$, with support $\tilde{S}$. If

(a) $\tilde{S}$ is a subset of $\mathbb{Z}^d$, and
(b) the matrix $A$ is invertible,

then $\tilde{p} \in \mathcal{P}_0$.

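Proof. Let $h(z) = -\log p(z)$; then $h$ is convex-extendible by assumption. By Lemma 6, there exists a closed convex extension $h^{\mathbb{R}}(x)$ of $h(z)$. Now, $\tilde{p}(z) = p(A^{-1}(z - b))$ for any $z \in \tilde{S}$. Define $\tilde{h}^{\mathbb{R}}(x) = h^{\mathbb{R}}(A^{-1}(x - b))$ for any $x \in \mathrm{conv}(\tilde{S})$. By construction, $\tilde{h}^{\mathbb{R}}$ is closed and convex. Moreover,

$$\tilde{h}^{\mathbb{R}}(z) = h^{\mathbb{R}}(A^{-1}(z - b)) = h(A^{-1}(z - b)) = -\log p(A^{-1}(z - b)) = -\log \tilde{p}(z)$$

for any $z \in \tilde{S}$. Hence, $\tilde{h}^{\mathbb{R}}$ is the convex extension of $-\log \tilde{p}$ and therefore $\tilde{p} \in \mathcal{P}_0$.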
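The identity $\tilde{p}(z) = p(A^{-1}(z - b))$ used in this proof can be illustrated with a small Python sketch. The matrix $A$, the shift $b$ and the four-point PMF below are hypothetical choices; $A$ is taken to be an integer matrix with determinant $1$, so that both $A$ and $A^{-1}$ map lattice points to lattice points and condition (a) holds. The sketch illustrates only the change of variables, not the eLC property itself.

```python
from fractions import Fraction

A = ((1, 1), (0, 1))        # integer matrix with determinant 1
A_INV = ((1, -1), (0, 1))   # its inverse, also an integer matrix
b = (1, -2)

def apply(M, v):
    # Matrix-vector product for a 2 x 2 matrix.
    return tuple(sum(M[i][j] * v[j] for j in range(2)) for i in range(2))

def transform(z):
    # z -> A z + b
    Az = apply(A, z)
    return (Az[0] + b[0], Az[1] + b[1])

def inverse_transform(z):
    # z -> A^{-1} (z - b)
    return apply(A_INV, (z[0] - b[0], z[1] - b[1]))

# Toy PMF of Z on four lattice points (an assumption for the example).
p = {(0, 0): Fraction(1, 2), (1, 0): Fraction(1, 4),
     (0, 1): Fraction(1, 8), (1, 1): Fraction(1, 8)}

# PMF of AZ + b: each support point is moved, its probability is kept.
p_tilde = {transform(z): prob for z, prob in p.items()}
```

Every `z` in `p_tilde` satisfies `p_tilde[z] == p[inverse_transform(z)]`, which is the change-of-variables formula from the proof.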

REFERENCES

1. F. Balabdaoui, H. Jankowski, K. Rufibach and M. Pavlides. Asymptotics of the discrete log-concave maximum likelihood estimator and related applications. J. R. Stat. Soc. Ser. B Stat. Methodol., 2013.

2. K. Murota and A. Shioura. Relationship of M-/L-convex functions with discrete convex functions by Miller and Favati-Tardella. Discrete Appl. Math., 2001.

3. R. B. Bapat. Discrete multivariate distributions and generalized log-concavity. Sankhyā, 1988.

4. R. T. Rockafellar. Convex Analysis. Princeton Mathematical Series, No. 28. Princeton University Press, Princeton, NJ, 1970.

5. S. G. Krantz. Real Analysis and Foundations. Textbooks in Mathematics. CRC Press, Boca Raton, FL, third edition, 1991.

6. Yan Hua Tian. Maximum likelihood estimation of discrete log concave distribution with applications. York University, 2008.
