Characterization of FriendFeed – A Web-based Social Aggregation Service
Trinabh Gupta (IIT Delhi, India)
Sanchit Garg (IIT Delhi, India)
Anirban Mahanti (NICTA, Australia)
Niklas Carlsson (University of Calgary, Canada)
Martin Arlitt (HP Labs, USA)

Abstract

Many Web users have accounts with multiple social networking services. This scenario has prompted development of “social aggregation” services such as FriendFeed that aggregate the information available through various services. Using five weeks of activity of more than 100,000 FriendFeed users, we consider questions such as what types of services users aggregate content from, the relative popularity of services, who follows the aggregated content feeds, and why.

Introduction

With the increasing popularity of online social networking (OSN) and content sharing services, many Web users have accounts on several of these services. This scenario results in the scattering of information, and has motivated the development of “social aggregation” services that seamlessly collate content posted by a user on various services and facilitate easy dissemination of the collated content. FriendFeed (www.friendfeed.com) is one such service.

FriendFeed allows aggregation of information from a number of services that include popular social networking, video sharing, photo sharing, and blogging services. A FriendFeed user can choose to aggregate content from among the supported services into the user’s FriendFeed profile page. A FriendFeed user can “follow” the activity of other users of this service by subscribing to them as “friends”. A friend on FriendFeed is a unidirectional relationship. Note that a user following another need not be collating information from the same set of services, and that any user with a public profile can be subscribed to as a friend.

On FriendFeed, users can comment and start discussions on the aggregated content, similar to functionalities provided by typical OSNs. Commenting on aggregated content facilitates information dissemination in the FriendFeed network. For example, consider three users A, B, and C. Suppose that A follows B, and B follows C, but A does not follow C. If B comments on some of C’s aggregated content, A will become aware of both B’s comment and the aggregated content under consideration on C’s profile page. This process may also result in A deciding to follow C as well.

This paper studies the FriendFeed service, with emphasis on social aggregation properties and user activity patterns. We are interested in understanding what types of services users aggregate content from, who follows the aggregated content, and why. We are also interested in understanding the characteristics of the FriendFeed social network and how they relate to the characteristics of the social network services that it aggregates. Note that FriendFeed, being an aggregation service, enables us to study different services from one common observation point, and allows us to get a unique “sneak peek” at how these social networking and content sharing services are being used by a common set of users. We believe that social aggregation services are of interest to information-hungry Web users, especially those who like to always stay connected with the goings-on in the cyber world.

Social aggregation services are a recent phenomenon, and, to the best of our knowledge, there has been no prior work on characterizing such services. Our work fills this void and provides the following insights (Gupta et al. 2009):

• The FriendFeed social network has significantly smaller network distances than some of the OSNs it aggregates. This is likely due to its “celebrity”-driven nature, wherein a large fraction of the users are friends of a small set of users (e.g., famous bloggers). The node degree distribution properties of the FriendFeed network, however, are very similar to those reported for other OSNs.
• The active usage of services (as measured by the number of posts, for example) is significantly more skewed than the passive usage of services (as measured by subscriptions to services, for example). Micro-blogging services are the most popular in terms of both active and passive usage. Services such as video sharing (e.g., YouTube) and photo sharing (e.g., Flickr) see high subscription rates but relatively low activity.
• The activity level of users is independent of the number of followers (or friends) they have and also of the number of services they aggregate content from. Instead, it is typically dependent on the type of services subscribed to.
• We observe higher user retention rates for certain services than have been previously reported for them. This is likely due to low-activity users being more reluctant to join a service like FriendFeed. This hypothesis is further supported by the short tail of low-activity users that we observe.

Copyright © 2009, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Related Work

Properties of the network graph formed by users of various OSNs have been studied. Kumar et al. (2006) studied how the density, average diameter, and effective diameter of the Yahoo! 360 and Flickr OSNs have evolved over time. Their work categorized the users as passive members (singletons), users who migrate their offline social network to the online (isolated communities), and linkers who fully participate in the evolution of the network. Java et al. (2007) studied Twitter’s social network and found that users generally reciprocate links, and that the node degree distribution follows a power law. Characteristics of three social networking services were compared by Ahn et al. (2007); they report a multi-scaling behavior of the degree distribution and attribute it to the presence of heterogeneous types of users on the network. Shi et al. (2007) compare several network properties of two network samples of the Blogosphere, differing in time duration and data collection methodology, and find them to be remarkably consistent. Leskovec et al. (2008) analyzed link formation in four OSNs and then presented a model to synthesize networks of arbitrary scale.

Recent work has considered how users use OSNs. Krishnamurthy et al. (2008) studied Twitter usage. They found that users can be classified as broadcasters, acquaintances, or spammers, based on indegree, outdegree, and number of tweets transmitted. Java et al. (2007) characterized Twitter users into information source, information seeker, and friends, based on the content of the tweets. Caverlee and Webb (2008) studied characteristics of MySpace users such as the number of friends they have, the number of comments they make, their gender, and their privacy preferences.

Data Collection Methodology

Our data collection consisted of two phases. The first phase captured the network of FriendFeed users, while the second phase captured the activity of the users identified in the first phase over a period of five weeks. Please consult Gupta et al. (2009) for details of the data collection.

2008. Each post contains the time when it was published, the service on which it was published (“internal” being FriendFeed itself), and comments (if any) made on this post.

Table 1 summarizes our data set. We found 111,877 users, with over 1 million directed edges among them. Among the users discovered, 12.7% had “private” profiles. For these users, we could not gather any information on posts and the users they follow; we ignore these private users in our analyses. In total, there were more than 10 million posts aggregated from various services. The amount of activity per day was non-stationary, with more activity on weekdays than on weekends.

Table 1: Summary of Data Set.
Number of users                   111,877
Private users (%)                 12.7
Total number of directed edges    1,404,446
Average degree of a user          25.21
Content data collection period    30 Sept - 4 Nov 2008
Total number of observed posts    10,678,120

Characterization Results

Network Properties

Table 2 summarizes some fundamental metrics capturing important network properties of FriendFeed’s social network, along with known results for some OSNs (Ahn et al. 2007; Mislove et al. 2007; Kumar, Novak, and Tomkins 2006; Java et al. 2007).

Table 2: Properties of the FriendFeed Social Network.
Network        Reciprocity    Diameter    Path Length
FriendFeed     0.53           12          4.02
Twitter        0.58           -           6
Flickr         0.70           27          5.67
LiveJournal    -              20          5.58
YouTube        -              21          5.10

The degree distributions of FriendFeed’s social network follow power laws (Gupta et al. 2009). Specifically, the number of users subscribed to a user (indegree) and the number of users followed by a user (outdegree) both follow power laws. Using techniques described by Clauset et al. (2007), we find that the indegree and outdegree distributions follow power laws with shape parameters 2.28 and 2.12, respectively. Although FriendFeed allows for directional connections, these results are strikingly similar to those typically observed for OSNs in which bidirectional relationships are required.

Another dimension of interest is the degree of reciprocity in relationships between nodes (Kumar, Novak, and Tomkins 2006). The FriendFeed network has a reciprocity of 0.53; that is, given a directed edge from one node to another, the edge in the reverse direction is also present with probability 0.53. Similar reciprocity values have been observed in other non-aggregating networks.
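The reciprocity measure discussed above can be illustrated with a small computation. This is a minimal sketch, assuming reciprocity is the fraction of directed edges whose reverse edge is also present; the exact definition used for the reported value may differ. The function name `reciprocity` and the toy edge list are hypothetical and are not drawn from the FriendFeed data set.

```python
def reciprocity(edges):
    """Return the fraction of directed edges (u, v) whose reverse (v, u) exists.

    Assumption: self-loops and duplicate edges are ignored.
    """
    # Deduplicate edges and drop self-loops so each directed edge counts once.
    edge_set = {(u, v) for u, v in edges if u != v}
    if not edge_set:
        return 0.0
    # Count edges for which the opposite direction is also in the graph.
    reciprocated = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return reciprocated / len(edge_set)


# Toy follower graph (illustrative only): A and B follow each other,
# B follows C, and C follows D.
toy_edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D")]
toy_value = reciprocity(toy_edges)
print(toy_value)  # prints 0.5: two of the four directed edges are reciprocated
```

Under this reading, a value of 0.53 would mean that slightly more than half of all "follow" edges are returned by the followed user.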
At the end of the first phase, for all users with publicly visible profiles, we had information on the users they followed and the services they were subscribed to. Because the FriendFeed API does not directly provide the users fol-

We studied whether the “small world” phenomenon applies here and, if it does, to what extent and why. We evaluated the diameter and average path length of the FriendFeed network (Gupta et al.