Dependent Relational Gamma Process Models for Longitudinal Networks

Dependent Relational Gamma Process Models for Longitudinal Networks Sikun Yang 1 Heinz Koeppl 1 Abstract as friendships or interactions in a social network, is often dynamic since the relations among the entities within the A probabilistic framework based on the covariate- network may appear or disappear over time (Mucha et al., dependent relational gamma process is developed 2010). Accordingly, the latent groups composed of those to analyze relational data arising from longitu- temporally connected entities also form and decay over time. dinal networks. The proposed framework char- Hence, appropriate models are needed to enable a better acterizes networked nodes by nonnegative node- understanding of the formation and evolution of dynamic group memberships, which allow each node to networks (Phan & Airoldi, 2015). belong to multiple latent groups simultaneously, and encodes edge probabilities between each pair A probabilistic framework is proposed to model such dy- of nodes using a Bernoulli Poisson link to the namic networks by assuming the network of interest is com- embedded latent space. Within the latent space, posed of a set of latent groups. Each node of the observed our framework models the birth and death dynam- network is hence associated with a time-dependent member- ics of individual groups via a thinning function. ships vector that governs its involvement in multiple groups Our framework also captures the evolution of indi- and interactions with other nodes. The node-group member- vidual node-group memberships over time using ships are assumed to be gamma distributed, thus, naturally gamma Markov processes. Exploiting the recent nonnegative real-valued. Moreover, to capture time-evolving advances in data augmentation and marginaliza- interactions between groups of nodes, we model the birth tion techniques, a simple and efficient Gibbs sam- and death dynamics of individual groups explicitly via a pler is proposed for posterior computation. Experi- dependent relational gamma process (dRGaP). The ideal mental results on a simulation study and three real- number of latent groups can be adaptively learned from data world temporal network data sets demonstrate the via the shrinkage mechanism of the dRGaP. model’s capability, competitive performance and Explicitly modelling group birth/death dynamics can be use- scalability compared to state-of-the-art methods. ful in many applications. For instance, latent groups in a network of military disputes between countries could mean alliances such as NATO coordinating collective defence 1. Introduction to attacks by external forces (Schein et al., 2016a). These The study of relational data arising from various networks groups can be born and die afterwards. For example, the including social, biological and physical networks is becom- Warsaw Pact was established during the Cold War and dis- ing increasingly important due to the emergence of massive solved in later years. We demonstrate that our model can relational data collected from these domains. Many efforts discover interpretable latent structure on a real network of have been dedicated to develop statistical models in terms of military interstate disputes (Ghosn et al., 2004) that agrees community detection and missing link prediction for analyz- with our knowledge of international relations (Section 5). ing relational data arising from static networks, where either Furthermore, it is reasonable to model the time-evolving a single snapshot of the network of interest or an aggregated memberships of each individual node to interpret its joining network over time is presented; see (Goldenberg et al., 2010) and withdrawing behavior to these groups. Hence, we cap- for a review of the literature. However, network data, such ture the dynamics of individual node-group memberships evolving over time via gamma Markov processes. 1Department of Electrical Engineering and Information Tech- In contrast to dynamic network modelling using logistic nology, Technische Universitat¨ Darmstadt, Germany. Correspon- or probit mapping functions (Foulds et al., 2011; Heauku- dence to: Sikun Yang <[email protected]>, Heinz lani et al., 2013; Durante et al., 2014a), we leverage the Koeppl <[email protected]>. Bernoulli-Poisson link (BPL) function (Dunson & Herring, Proceedings of the 35 th International Conference on Machine 2005; Zhou, 2015) to generate edges from the latent space Learning, Stockholm, Sweden, PMLR 80, 2018. Copyright 2018 representation, which makes the computational cost of our by the author(s). Dependent Relational Gamma Process Models for Longitudinal Networks model to scale linearly with the number of edges, rather 2.2. The Thinned CRMs Framework than quadratically with the number of nodes. In addition, Let Π = f(x ; θ ; r )g1 be generated by a Poisson pro- the Bernoulli-Poisson link is also a more appropriate model k k k k=1 cess on the augmented product space X × Θ × with for imbalanced binary data (Zhou, 2017; Hu et al., 2015), R>0 mean measure ν(dx; dθ; dr). Let G = P1 r δ be which makes the proposed model appealing for analyzing k=1 k (xk,θk) a CRM on X × Θ × , and let T denote the time set real-world relational data that are usually extremely sparse. R>0 as the coveriate. Our goal is to construct a family of ran- To perform inference, we present an efficient Gibbs sam- dom measures fG(t)g dependent on covariate values pling algorithm exploiting the Polya-gamma´ data augmenta- t2T t 2 T . To achieve this, we generate a set of binary ran- tion technique (Polson et al., 2013) and the data augmenta- (t) tion and marginalization technique for discrete data (Zhou dom variables bk for each point (xk; rk; θk) 2 Π such that (t) et al., 2015). p(bk = 1) = Pxk (t), where Px : T! [0; 1] denotes the thinning function which determines the probability that atom The paper is organized as follows. In Section 2, we shortly k in the global measure G appears in the local measure G(t) review the gamma process and the thinned completely ran- at covariate value t. Then, the set of covariate-dependent dom measure framework. In Section 3, we present our gen- (t) CRMs fG gt2T can be specified as erative model. In Section 4, we discuss some related work. 1 Experimental results are provided in Section 5. The com- X (t) G(t) = b r δ ; t 2 T : plete Gibbs sampling algorithm and additional experimental k k θk results are presented in the supplementary material. k=1 The new CRMs are well-defined by the mapping theorem 2. Covariate-Dependent Random Measures for the Poisson processes (Kingman, 1993), that is proved in (Foti et al., 2013). As a concrete example, we exploit a Our dynamic network model is based on the thinned com- thinned gamma process (tGaP) to model the global atoms pletely random measures (tCRMs) framework, originally and their activity/inactivity at multiple time points originally proposed in (Foti et al., 2013) for the construction of developed for dynamic topic models. Let ν(dx; dθ; dr) = −1 −cr covariate-dependent topic models and latent feature models. H(dx)G0(dθ)ν0(dr), where ν0(dr) = cr e dr is the We generalize this construction for longitudinal network Levy´ measure of the gamma process. We transform a Gaus- modelling. More specifically, a set of latent groups that con- sian basis kernel pointwise using a logistic function as the stitute the underlying structure of the observed dynamic thinning function: network is generated. Via the tCRMs framework, the gen- ( T ) erated groups are allowed to form and decay over time. To X 2 Px (t) = σ !0k + !lk exp[−φk(t − l) ] ; facilitate understanding, we shortly review the gamma pro- k l=1 cess and the thinned CRMs. where σ(x) = 1=(1 + exp(−x)) denotes the logistic func- 2.1. Gamma Process tion. We fix the centres of these kernels to the T discrete time points in covariate space T . We characterize The gamma process (GaP) is a completely random mea- each location xk 2 X by a set of T + 1 kernel weights sure (CRM) (Kingman, 1967) defined on the product space !lk 2 R, and a (shared) kernel width φk uniformly drawn Θ × G ∼ GaP(G ; c) c scale ∗ ∗ R>0 as 0 , where is a pa- from a fixed dictionary fφ1; : : : ; φDg of size D. To en- rameter, and G0 is a finite and continuous base mea- courage sparsity of the kernel weights, we place a normal- sure over a complete separable metric space Θ, such that inverse gamma prior over !lk, i.e., !lk ∼ N IG(!lk; 0; 1; 1). G(Sk) ∼ Gamma(G0(Sk); c) are independent gamma Hence, the base measure H(dx) can be expressed as fS g1 Θ ∗ ∗ random variables for disjoint subsets k k=1 of . The H(dx) = NIG(!lk; 0; 1; 1)Cat(φk; fφ1; : : : ; φDg). The positive Levy´ measure of the gamma process can be ex- generative procedure can be expressed as −1 −cr pressed as ν(dr) = cr e dr. As a completely random 1 measure, the gamma process can be regarded as a Pois- X G = rkδ(x ,θ ) ∼ CRM(ν(dx; dθ; dr)); (1) son process on Θ × with mean measure ν(dθ; dr). k k R>0 k=1 A sample from this Poisson process consists of countably ∗ ∗ RR !lk ∼ N IG(0; 1; 1); φk ∼ Cat(φ1; : : : ; φD); infinite atoms because Θ× ν(dθ; dr) = 1. Thus, R>0 ( T ) a sample from the gamma process can be expressed as X 2 1 P Pxk (t) = σ !0k + !lk exp[−φk(t − l) ] ; G = k=1 rkδθk ∼ GaP(G0; c). More detailed information about the gamma process can be found in (Wolpert l=1 1 et al., 1998; 2011). (t) h i (t) X (t) bk ∼ Bernoulli Pxk (t) ;G = bk rkδθk : k=1 Dependent Relational Gamma Process Models for Longitudinal Networks 3.

Load more