Ego-Centric Graph Pattern Census

Walaa Eldin Moustafa, Amol Deshpande, Lise Getoor
Department of Computer Science, University of Maryland, College Park, USA
{walaa,amol,[email protected]

Abstract: There is increasing interest in analyzing networks of all types including social, biological, sensor, computer, and transportation networks. Broadly speaking, we may be interested in global network-wide analysis (e.g., centrality analysis, community detection) where the properties of the entire network are of interest, or local ego-centric analysis where the focus is on studying the properties of nodes (egos) by analyzing their neighborhood subgraphs. In this paper we propose and study ego-centric pattern census queries, a new type of graph analysis query, where a given structural pattern is searched for in every node's neighborhood and the counts are reported or used in further analysis. This kind of analysis is useful in many domains in social network analysis including opinion leader identification, node classification, link prediction, and role identification. We propose an SQL-based declarative language to support this class of queries, and develop a series of efficient query evaluation algorithms for it. We evaluate our algorithms on a variety of synthetically generated graphs. We also show an application of our language in a real-world scenario for predicting future collaborations from DBLP data.

[Figure 1: panels (a) and (b) show patterns over nodes labeled M, F, and S connected by is_friend and is_married edges; panel (c) shows five A, B, C triads labeled Coordinator, Consultant, Gatekeeper, Representative, and Liaison.]
Fig. 1. (a) Pattern that captures two couples that are friends with each other; such a pattern may be useful in a targeted marketing application. (b) Example pattern used in the node classification application. (c) Different brokerage patterns: the colors denote organizations, and the function of the broker (the middle node) depends on the organizations that the three nodes belong to (e.g., B is a coordinator if all three are in the same organization).

I. INTRODUCTION

Network analysis is an area of growing importance in a variety of domains including the social sciences, biology, communications, ecology, finance, the internet, law enforcement, national security, social media and others. Most of the work in network analysis focuses on either (a) global macro-level analysis characterizing properties of the entire network and communities (subgraphs) within the network, or (b) local micro-level analysis characterizing properties of the actors (nodes) in the network. Macro-level analysis is important for characterizing networks and understanding the evolution of networks; examples of such analysis include measuring structural properties like degree distribution, diameter, and graph cohesion [30], and discovery of patterns or motifs in the network [28], [27], [6], [7], [21], [38]. Micro-level analysis focuses instead on measuring properties of the actors in the network, e.g., degree centrality, second-order degree, and local clustering coefficient. This is often called ego network analysis, because it looks at the individual actors (egos) and their neighbors (called alters) in the network. Although global network analysis is well-understood and efficient computational tools are well-developed, similar tools are not yet available for the harder problem of ego-centric network analysis that requires analyzing a very large number of small, largely overlapping networks.

In this paper, we address one such problem in ego-centric analysis, namely counting the number of structural patterns or motifs that occur either in the k-hop neighborhoods of the nodes (for a given k), or in subgraphs defined by intersections or unions of k-hop neighborhoods of pairs of nodes; the k-hop neighborhood of a node n is defined to be the incident subgraph on the nodes reachable from n in k hops or less. We refer to this type of query as an ego-centric pattern census. We illustrate the need for such analysis using a few motivating applications:

Targeted Marketing: Viral marketing is proving to be an effective tool for product advertisement. Selected consumers are given a product with the hope that they will like the product and recommend it to their friends. These consumers must be chosen wisely to minimize the cost and maximize the benefits of advertising. Simple criteria such as picking consumers with the most friends, or consumers that are connected to many other consumers through short paths, are typically used. However, the ability to identify richer structures is desirable in many cases. For example, a travel agency may wish to identify couples that have either the largest number of couples in their combined network, or the largest number of couple pairs, i.e., couples who are friends with couples. The latter structure is depicted in Figure 1(a).

Node Classification: In collective classification [32], a node's neighborhood is used to predict the node's own class label. For example, in a collaboration network, a scientist who collaborates mostly with scientists from a specific field (e.g., databases or software engineering) is likely to be from the same field. In a family relationship network (with "is married to" and "is parent of" relationships), for each child we may wish to count the number of relatives up to 3 hops away who are smokers (or obese), with their parents also being smokers (or obese). This could be a measure of the risk of being a smoker (or obese) for the child, and can be used for predicting that risk. The pattern of this query is depicted in Figure 1(b).

Structural Balance: In social balance theory, signed networks are networks that have positive and negative signs on their links, denoting whether a link expresses a positive tie (e.g., friendship) or a negative tie (e.g., foe) [10]. In signed networks, triangles with an odd number of negative links (one or three) are considered unstable. In such networks, we can measure the amount of instability in each node's ego network by counting the number of unstable triangles in its k-hop neighborhood.

Brokerage Analysis: In organizational theory, management scientists are interested in the roles different individuals play, both within an organization, and across organizations. For example, in a transaction network, the middle node in a directed triad (e.g., node B in the triad A → B → C) is called a coordinator if all three are in the same organization, or a gatekeeper if A is in a different organization than B and C. Figure 1(c) shows the different brokerage types. A brokerage score of a given type can then be computed for a node by counting the number of patterns of that type in which the node occurs as the middle node.

Graph Indexing: Finally, counts of specific structural patterns in every node's k-hop neighborhood in a large graph are regarded as node signatures and are often used for subgraph pattern matching to prune the search space [43]. Our algorithms can be used to efficiently build sophisticated signatures that can be used when searching for large, complex subgraphs.

The problem of ego-centric pattern census combines elements from micro-level ego-centric network analysis, subgraph pattern matching, and motif counting. The goal of subgraph pattern matching is to find all matches of a given query graph in the database graph.

[...] motifs that a node is part of) has also been recently used as a tool to classify a node's role. For example, Kerrebroeck et al. [39] count the number of loops a node is part of, and use it as a measure to quantify the node's importance in the network. Pržulj [31] proposes the notion of local graphlet degree distribution as a means for network comparison.

Several ego-centric measures can be expressed as ego-centric pattern census queries with very simple input patterns, and can be seen as special cases of ego-centric pattern census. For example, an ego-centric measure like node degree corresponds to searching for a pattern of a node in a 1-hop neighborhood, and clustering coefficient (and its k-clustering coefficient version [22]) can be expressed in terms of counting edge patterns in the 1-neighborhood (and k-neighborhood, respectively). Similarly, the Jaccard coefficient of a pair of nodes can be computed by counting a node pattern in the nodes' 1-neighborhood intersection and union.

Our approach and contributions: In this work we investigate ego-centric pattern census queries, and develop efficient techniques for executing them. Our key contributions include:
• We introduce a flexible declarative SQL-based query language for specifying ego-centric pattern census queries. Our query language allows users to specify the neighborhood size (k), the set of focal nodes (or pairs of nodes), and the pattern to be counted.
• We propose an efficient graph pattern matching algorithm, and show that it outperforms GraphQL [20], a recent graph pattern matching system.
• We introduce two query evaluation algorithms for the ego-centric census queries, one based on searching from nodes to patterns (node-driven) and another based on searching from patterns to nodes (pattern-driven).
• We empirically evaluate our algorithms on a variety of real-world and synthetic data.

Outline: The rest of the paper is organized as follows. In Section II we present the data model and our language specification. In Section III, we propose an efficient pattern matching algorithm that is used as part of the algorithms in Section IV that we propose for evaluating pattern census queries. In Section V we discuss experimental performance evaluation through synthetic datasets and workloads. In addition, we use our language to design a link prediction experiment over [...]
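
To make the census idea concrete, the following is a minimal Python sketch of the structural balance application described in the introduction: for every node, it collects the nodes reachable in at most k hops and counts the unstable triangles (an odd number of negative links) among them. It is a brute-force illustration only, not the query language or the node-driven and pattern-driven algorithms the paper proposes. The function names, the edge representation (a dictionary keyed by frozenset pairs), and the example graph are all assumptions made for this sketch.

```python
from collections import deque
from itertools import combinations


def k_hop_nodes(adj, ego, k):
    """Return the set of nodes reachable from `ego` in at most k hops (BFS)."""
    seen = {ego}
    frontier = deque([(ego, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond k hops
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen


def unstable_triangle_census(signed_edges, k):
    """Count, per node, the unstable triangles inside its k-hop neighborhood.

    signed_edges (hypothetical format): frozenset({u, v}) -> +1 or -1,
    describing an undirected signed graph with sortable node labels.
    A triangle is unstable when it has an odd number of negative links.
    """
    # Build an adjacency map from the edge dictionary.
    adj = {}
    for edge in signed_edges:
        u, v = tuple(edge)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    census = {}
    for ego in adj:
        hood = k_hop_nodes(adj, ego, k)
        count = 0
        # Naive enumeration of node triples in the neighborhood.
        for a, b, c in combinations(sorted(hood), 3):
            sides = (frozenset((a, b)), frozenset((b, c)), frozenset((a, c)))
            if all(s in signed_edges for s in sides):
                negatives = sum(1 for s in sides if signed_edges[s] < 0)
                if negatives % 2 == 1:
                    count += 1
        census[ego] = count
    return census


# Hypothetical example: triangle a-b-c with one negative link, plus a path c-d-e.
example_edges = {
    frozenset(("a", "b")): +1,
    frozenset(("b", "c")): +1,
    frozenset(("a", "c")): -1,
    frozenset(("c", "d")): +1,
    frozenset(("d", "e")): +1,
}
print(unstable_triangle_census(example_edges, 1))
print(unstable_triangle_census(example_edges, 2))
```

With k = 1, node d sees only c and e and therefore reports no unstable triangle; with k = 2 its neighborhood grows to include a and b, so the unstable triangle a-b-c is counted for d as well. Enumerating all triples in every neighborhood repeats work across heavily overlapping neighborhoods, which is the cost the paper's evaluation algorithms are meant to reduce.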