As Strong as the Weakest Link: Mining Diverse Cliques in Weighted Graphs
Appendix

Petko Bogdanov1, Ben Baumer2, Prithwish Basu3, Amotz Bar-Noy4, and Ambuj K. Singh1

1 University of California, Santa Barbara, CA 93106, USA, {petko,
[email protected]
2 Smith College, Northampton, MA 01063, USA,
[email protected]
3 Raytheon BBN Technologies, 10 Moulton St., Cambridge, MA 02138, USA,
[email protected]
4 The City University of New York, New York, NY 10016-4309, USA,
[email protected]

1 The choice of group score

We model a group as a collection of pairwise interactions. Ideally, we would also consider higher-order interactions, such as subgroups of size 3, 4, or more, and incorporate them into our model. However, this approach has several limitations:

1. Although graphs may not be the ideal framework for modeling higher-order interactions, and hypergraphs or simplicial complexes [3, 15, 7] may be the preferred approach, algorithms for the latter settings are computationally demanding.
2. To validate the developed theories, one would need empirical datasets featuring a sufficient number of instances where a subgroup has interacted, and such data is hard to find for higher-order interactions.

To elaborate on the second point, data at the subgroup level is either sparser (in the case of sports) or mostly unavailable (in the case of protein/gene interactions). For bigger subgroups in sports, there are fewer games (observations) in which the same group participates, which limits one's ability to measure subgroup performance with high statistical confidence. In the case of gene networks, major technologies allow for testing only pairwise interactions, while the overall goal is to understand the system at the complex/pathway level.
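The sparsity argument for sports data can be illustrated with a minimal sketch. The lineups below are hypothetical toy data, and the helper names `subgroup_counts` and `mean_observations` are assumptions introduced only for this illustration; they are not part of the method described here. The sketch counts how many games each subgroup of size k played together and reports the average number of observations per observed subgroup.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each game lists the players who took part together.
games = [
    {"A", "B", "C", "D"},
    {"A", "B", "C", "E"},
    {"A", "B", "D", "E"},
    {"A", "C", "D", "E"},
]


def subgroup_counts(games, k):
    """Count, for every k-player subgroup, the games in which all k played."""
    counts = Counter()
    for lineup in games:
        # Sort so that each subgroup maps to one canonical tuple key.
        for subgroup in combinations(sorted(lineup), k):
            counts[subgroup] += 1
    return counts


def mean_observations(games, k):
    """Average number of games per observed k-player subgroup."""
    counts = subgroup_counts(games, k)
    return sum(counts.values()) / len(counts)


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, mean_observations(games, k))
```

On this toy data the average number of shared games per subgroup shrinks as k grows from pairs to full lineups, which is the effect that makes subgroup-level performance hard to estimate with confidence.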