
Configuration models for preserving local quantities of nodes and hyperedges

Kazuki Nakajima1, Kazuyuki Shudo1, and Naoki Masuda2,3

1 Department of Mathematical and Computing Science, Tokyo Institute of Technology
2 Department of Mathematics, University at Buffalo
3 Computational and Data-Enabled Science and Engineering Program, University at Buffalo

Empirical data in which unit interactions generally include those among three or more nodes are increasingly available [1]. By extending a class of configuration models called the dK-series, which was originally proposed for dyadic graphs [2], we propose a family of configuration models for hypergraphs that preserve local properties of the given hypergraph to different extents. The proposed model specifies the joint degree distributions of nodes in the subgraphs of size dv or less and the joint degree distributions of hyperedges in the subgraphs of size de or less for the bipartite graph that corresponds to the given hypergraph. We consider dv ∈ {0, 1, 2, 2.5} and de ∈ {0, 1}.

Figure 1 shows four properties of nodes, i.e., (a) the node's degree distribution (DD), (b) the node's degree correlation (DC), (c) the so-called degree-dependent node redundancy coefficient (DRC), and (d) the distribution of the shortest path length (SPL), for an empirical hypergraph and the corresponding configuration models with de = 0. The empirical hypergraph is a drug network from the National Drug Code Directory (NDC-classes) data set with 1,149 nodes and 1,047 hyperedges. The model with dv = 0 only intends to preserve the number of edges in the original bipartite graph. Therefore, it does not accurately approximate any of the four node quantities. The model with dv = 1 preserves the node's degree distribution of the given hypergraph (Fig. 1(a)), as intended, but not the other three quantities (Fig. 1(b)–(d)). The model with dv = 2 preserves the node's degree distribution and roughly preserves the degree correlation (Fig. 1(a)–(b)). The model with dv = 2.5 further captures the abundance of triadic relationships (Fig. 1(c)).
Furthermore, as dv increases from 0 to 2.5, the model better approximates the distribution of the shortest path length, although the model is not designed to preserve it (Fig. 1(d)). The present family of configuration models is expected to serve as reference models when one investigates the structure and dynamics of empirical and synthetic hypergraphs.

[Figure 1 panels: (a) DD: probability vs. node's degree; (b) DC: average of neighbor's degree vs. node's degree; (c) DRC: degree-dependent node redundancy vs. node's degree; (d) SPL: probability vs. shortest path length. Curves: Empirical, dv = 0, dv = 1, dv = 2, dv = 2.5.]

Figure 1: Distributions of node properties for the configuration models for hypergraphs. We set de = 0.
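
To make the two simplest members of the family concrete, the following Python sketch randomizes a hypergraph under dv = 0 and dv = 1 with de = 0. It is an illustration under stated assumptions, not the authors' implementation: the function names `node_degrees` and `randomize` are hypothetical, the hypergraph is stored as a list of node lists (one per hyperedge), and each node stub is attached to a hyperedge chosen uniformly at random. Under these assumptions a node may appear more than once in a hyperedge and a hyperedge may end up empty; the sketch does not correct for either.

```python
import random
from collections import Counter


def node_degrees(hyperedges):
    """Count, for each node, how many hyperedge slots it occupies."""
    return Counter(v for e in hyperedges for v in e)


def randomize(hyperedges, nodes, dv, seed=None):
    """Hypothetical sketch of the dv in {0, 1}, de = 0 reference models.

    The hypergraph is viewed as a bipartite graph between nodes and
    hyperedges. Both variants keep the number of hyperedges and the
    number of bipartite edges; dv = 1 also keeps every node's degree.
    """
    rng = random.Random(seed)
    n_hyperedges = len(hyperedges)
    n_bipartite_edges = sum(len(e) for e in hyperedges)

    if dv == 0:
        # Only the number of bipartite edges is kept: draw each
        # node endpoint uniformly at random (assumed procedure).
        stubs = [rng.choice(nodes) for _ in range(n_bipartite_edges)]
    elif dv == 1:
        # Keep each node's degree: one stub per original membership.
        stubs = [v for e in hyperedges for v in e]
    else:
        raise ValueError("this sketch only covers dv = 0 and dv = 1")

    # de = 0: hyperedge sizes are not constrained, so each stub goes
    # to a uniformly random hyperedge (assumed procedure).
    new_hyperedges = [[] for _ in range(n_hyperedges)]
    for v in stubs:
        new_hyperedges[rng.randrange(n_hyperedges)].append(v)
    return new_hyperedges


if __name__ == "__main__":
    toy = [[0, 1, 2], [1, 2], [2, 3], [0, 2, 3, 4]]  # toy data, not NDC-classes
    shuffled = randomize(toy, nodes=[0, 1, 2, 3, 4], dv=1, seed=0)
    print(node_degrees(shuffled) == node_degrees(toy))
```

Comparing `node_degrees` before and after randomization is the kind of check that Fig. 1(a) reports: with dv = 1 the degrees match exactly by construction, whereas with dv = 0 only the total number of bipartite edges is kept.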

References

[1] F. Battiston, G. Cencetti, I. Iacopini, V. Latora, M. Lucas, A. Patania, J.-G. Young, and G. Petri. Networks beyond pairwise interactions: structure and dynamics. Physics Reports, 874:1–92, 2020.
[2] C. Orsini, M. M. Dankulov, P. Colomer-de-Simón, A. Jamakovic, P. Mahadevan, A. Vahdat, K. E. Bassler, Z. Toroczkai, M. Boguñá, G. Caldarelli, S. Fortunato, and D. Krioukov. Quantifying randomness in real networks. Nature Communications, 6:8627, 2015.