Using Strong Triadic Closure to Characterize Ties in Social Networks
Total Page:16
File Type:pdf, Size:1020Kb
Using Strong Triadic Closure to Characterize Ties in Social Networks Stavros Sintos Panayiotis Tsaparas Department of Computer Science and Department of Computer Science and Engineering Engineering University of Ioannina University of Ioannina Ioannina, Greece Ioannina, Greece [email protected] [email protected] ABSTRACT 1. INTRODUCTION In the past few years there has been an explosion of social The past few years have been marked by the emergence networks in the online world. Users flock these networks, and explosive growth of online social networks. Facebook, creating profiles and linking themselves to other individuals. LinkedIn, and Twitter are three prominent examples of such Connecting online has a small cost compared to the physi- online networks, which have become extremely popular, en- cal world, leading to a proliferation of connections, many of gaging hundreds of millions of users all over the world. On- which carry little value or importance. Understanding the line social networks grow much faster than physical social strength and nature of these relationships is paramount to networks since the \cost" of creating and maintaining con- anyone interesting in making use of the online social net- nections is much lower. The average user in Facebook has a work data. In this paper, we use the principle of Strong few hundreds of friends, and a sizeable fraction of the net- Triadic Closure to characterize the strength of relationships work has more than a thousands friends [20]. The social cir- in social networks. The Strong Triadic Closure principle cle of the average user contains connections with true friends, stipulates that it is not possible for two individuals to have but also with forgotten high-school classmates, distant rel- a strong relationship with a common friend and not know atives, and acquaintances made through brief encounters. each other. We consider the problem of labeling the ties Many of the online connections correspond to weak, or no of a social network as strong or weak so as to enforce the relationships in the physical world. Strong Triadic Closure property. We formulate the prob- Understanding the strength and nature of online relation- lem as a novel combinatorial optimization problem, and we ships is paramount to anyone interested in extracting some study it theoretically. Although the problem is NP-hard, utility out of the online social network data. For moneti- we are able to identify cases where there exist efficient al- zation purposes, knowing which relationships correspond to gorithms with provable approximation guarantees. We per- true friendships is of critical importance to advertisers who form experiments on real data, and we show that there is want to profile users based of their social circle and initiate a correlation between the labeling we obtain and empirical viral marketing campaigns. For sociologists, knowing the metrics of tie strength, and that weak edges act as bridges relative importance of online relationships can have a signif- between different communities in the network. Finally, we icant effect on the way they model and interpret dynamics study extensions and variations of our problem both theo- and norms in the social network. For friendship suggestion retically and experimentally. algorithms, knowing which friends matter more can have an important impact on the produced link recommendations. The problem of understanding the strength and nature Categories and Subject Descriptors of social ties has been studied in the past [12, 7, 6, 22]. J.4 [Computer Applications]: Social and behavioral sci- Previous approaches rely on user characteristics in order to ences; H.2.8 [Database Applications]: Data Mining; H.4 estimate the true affinity between two users. In this work we [Information Systems Applications]: Miscellaneous use solely the graph structure in order to derive the charac- terization of the ties within a social network. To this end we make use of the Strong Triadic Closure (STC) principle [4]. Keywords The STC principle has its roots into early works in Psychol- Strong Triadic Closure, Social Networks, Approximation Al- ogy [3, 15, 8], and it has been used in the study of social gorithms networks [4, 8]. Informally, the STC principle assumes that there are two types of ties, strong and weak, and it stip- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed ulates that it is not possible for two individuals to have a for profit or commercial advantage and that copies bear this notice and the full citation strong relationship with a common friend and not know each on the first page. Copyrights for components of this work owned by others than the other. That is, it is not possible to have an open triangle author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or in the network graph where both edges of the triangle are republish, to post on servers or to redistribute to lists, requires prior specific permission labeled strong. and/or a fee. Request permissions from [email protected]. We use the STC property to characterize the ties of a so- KDD’14, August 24–27, 2014, New York, NY, USA. Copyright is held by the owner/author(s). Publication rights licensed to ACM. cial network by asking for a labeling of the edges of the social ACM 978-1-4503-2956-9/14/08 ...$15.00. graph into strong and weak such that the STC property is http://dx.doi.org/10.1145/2623330.2623664. satisfied. There is a trivial solution to this problem which is • We study our algorithms experimentally on real datasets. to label all edges weak. However, we believe that creating We show that there is a correlation between the label of strong relationships is the main motivation for users to join, an edge and naturally defined notions of tie strength, and actively engage with a social network, online or other- and that the weak edges act like bridges between dif- wise. Therefore, we look for a labeling that also maximizes ferent communities in the network. the number of strong ties (or minimizes the number of weak ties). The rest of the paper is structured as follows. In Section 2 We thus obtain the following two problems: the maxSTC we review some of the related work. In Section 3 we formally problem where we ask for a labeling of the graph such that define the problems we will study. In Section 4 we study the the STC property holds and the number of strong edges is complexity of our problem, and in Section 5 we consider maximized, and the minSTC problem where we seek to min- approximation algorithms. Section 6 considers extensions imize the number of weak edges. These are two novel com- to the basic problem. Section 7 contains the experimental binatorial optimization problems that are of independent evaluation, and Section 8 concludes the paper. theoretical interest. Both problems are NP-hard, and we thus look for efficient algorithms with approximation guar- 2. RELATED WORK antees. We show that this is not possible for the maxSTC In this paper we build upon the Strong Triadic Closure problem. For the problem, we show that it can be minSTC principle from Psychology. Strong Triadic Closure was first expressed as a graph vertex cover problem on an appropri- defined by Granovetter [8] in his seminal paper\The Strength ately defined graph, a problem known to have a constant of Weak Ties". Previously, Davis [3] and Newcomb [15] dis- factor approximation algorithm. cuss some evidence that this property exists in social net- We also extend our formulation to capture more complex works. The Strong Triadic Closure is discussed in detail in problems where new edges may be added to the graph, or the book of Easley and Kleinberg [4]. They discuss the effect ties in the network may be of different types. Of particular of the property on the structure of the network, and possible interest is the problem where we seek to en- minMultiSTC relaxations, but they do not consider the problem of labeling force a variant of the STC property in the presence of multi- the edges of the graph to enforce the property. In the dis- ple types of strong edges. The problem of understanding the cussion they also consider recent experiments [10, 16] which \type" of an edge is something that arises naturally in prac- demonstrate a correlation between structural properties of tice. Although there are specialized online social networks an edge and a notion of strength measured in practice. We catering to different needs of the users (social, professional, perform similar experiments in Section 7 where we study the informational), the boundaries between these different cir- correlation between the label of an edge and the empirical cles are not always clear. It is often the case that there are tie strength. multiple types of relationships in a single network. There- Recent work has considered the problem of assessing the fore, it is important to be able to not only distinguish be- link strength in a social network, using data from e-mails [14], tween strong and weak ties, but also to differentiate between phone calls [16], and social media [7]. Kahanda and Neville [12] different types of relationships and social circles. develop a supervised learning approach to predict link strength We test our algorithms experimentally on real datasets. from transactional information (communication, file trans- Our experiments demonstrate that the labeling we obtain fers, etc) and differentiate between strong and weak rela- based on the structural graph property of Strong Triadic tionships in large-scale social networks. Gilbert and Kara- Closure correlates well with empirical measures of tie strength.