
What is the Difference between Mobile and Online ?

ABSTRACT

In recent years, online social networks (OSNs) have emerged as a new medium and flourished as never before. Meanwhile, constant developments in mobile technology have changed the way people engage in social media. Another kind of social network, the mobile social network (MSN), has emerged in response and is becoming more and more popular. The goal of this paper is to compare communication behavior in mobile and online social networks.

We use the Reality Mining data set as a mobile social network sample and Twitter data as an online social network case. We compare communication in three aspects: time, location and topology. Communication in the mobile social network presents a more significant periodicity than in the online social network. Besides, both the reply time and the conversation duration in the mobile social network are shorter. However, the reply rate of most users in the mobile social network is between 0.2 and 0.4, while in the online social network a larger portion of users have a high reply rate. We also discover that in both social networks the elapsed time between replies decreases during a conversation. Geographic distance influences mobile and online social networks differently: communication in the online social network is less location-sensitive than in the mobile social network. User communication in the mobile social network is in better accordance with social theories, whereas in the online social network it behaves inconsistently.

To the best of our knowledge, this work is the first quantitative comparison of communication between mobile and online social networks.

Categories and Subject Descriptors
H.3.3 [Information Systems]: Social and behavioral sciences

General Terms
Human Factors, Measurement

Keywords
Mobile social network, Online social network, User behavior

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.

1. INTRODUCTION

In recent years, online social networks (OSNs) have emerged as a new medium and flourished as never before. Meanwhile, constant developments in mobile technology have changed the way people engage in social media. Another kind of social network, the mobile social network (MSN), has emerged in response and is becoming more and more popular. In order to improve social services for users of mobile devices, we need to understand the difference between the intrinsic usage characteristics of mobile and online social networks. In other words, how do users behave in different kinds of social networks?

Essentially, both mobile and online social networks provide a platform where individuals with similar interests or commonalities can be connected with one another. However, mobile and online social networks each have unique features, such as communication periodicity and communication diversity, and people use the two channels differently. Therefore, it is important to understand how people's behavior varies between online and mobile environments. The work in this area so far has primarily focused on identifying the distinguishing features of user behavior in online and mobile social networks independently. In this paper, we start from an analysis of the differences and compare online and mobile social networks from an integrated point of view. To the best of our knowledge, our work is among the first to compare users' communication behavioral patterns between mobile and online social networks.

In this paper, we analyze the Reality Mining data as a mobile social network sample. It covers 106 subjects using mobile phones pre-installed with several pieces of software that recorded and sent the researchers data about call logs, SMS logs, GSM cell information and Bluetooth scans for more than 9 months. To make a comparison, we select Twitter as the most influential online social network. We crawled 103,452 users and 38,554,741 following links, and also extracted all tweets posted by these users, in total 211,509,594 tweets from Jan 1st to Oct 14th, 2010.

Based on the collected data, the comparison in this paper includes three aspects: temporal, geographic and topological analysis. Firstly, we focus on the properties of replies and, through the reply analysis, establish the concepts of instant reply and ephemeral conversation, which depend on particular factors in different social networks. Secondly, we study the influence of location on communication in different social networks. Finally, we analyze topological characteristics in different social networks and bring in social status theory to make a comparison.

Generally, we analyze the data to answer the following key questions:

1. Does communication in different social networks have distinguishing temporal properties?

2. How does the geographic location of users influence communication in different social networks?

3. Is there a significant difference in topological structure between the communication networks?

Our data analysis reveals several interesting findings. We find that:

1. Communication in different social networks shows different temporal properties, such as periodicity, reply time, reply rate, and ephemeral communication duration.

2. Geographic distance influences mobile and online social networks differently. Communication in the online social network is less location-sensitive than in the mobile social network.

3. The topological structure of the communication network varies between mobile and online social networks, and therefore influences user behavioral patterns differently.

This paper is organized as follows. Section 2 introduces the mobile and online social network data sets and how they were collected. We then compare the two networks in three dimensions: time, location and topology. We conduct the analysis of instant reply and ephemeral conversation in Section 3. In Section 4 we study how location influences communication in different social networks. We focus on the topology of the communication network structure and introduce social status theory in Section 5. Section 6 covers related work and puts our work in perspective. In Section 7 we conclude.

2. DATA SET

To build the comparison between mobile and online social networks, we study two data sets. For the mobile social network, we use the Reality Mining data set captured by the MIT Media Lab. For the online social network, we crawled a Twitter data set using an Application Programming Interface (API) that Twitter offers. In the following subsections we introduce the collection procedure and describe the data sets we evaluate in this paper.

Feature                 MSN Data set    OSN Data set
users                   106             103,452
communication counts    162,699         965,042
communication links     12,952          55,911
Avg. comm counts        1,535           9.32
Avg. comm links         122             0.54

Table 1: Statistics of mobile and online social network data sets.

2.1 Mobile Social Network Data Set

To confirm the results in this paper, we introduce a mobile social network data set (MSN data set) collected in the Reality Mining project. The original Reality Mining experiment [4] (http://reality.media.mit.edu/) is one of the largest projects attempted in academia. It captured communication, proximity, location, and activity information from 106 subjects at MIT over the course of the 2004-2005 academic year. This data represents over 350,000 hours (~40 years) of continuous data on human behavior.

The data consists of mobile phone communication logs as well as location changes in the form of GSM cell information. There are 162,699 communication records in total (128,542 call logs and 34,157 short message logs) over the whole data collection period. Most of the data was gathered from Nokia 6600 phones programmed to automatically run the ContextLog application as a background process at all times.

The communication log lists calls and text messages, including date and time, duration, direction and originator/recipient of the communication. Each call log is also labeled as missed or not. The location data is in the form of GSM cell information, such as location area code (LAC) and cell identifier (CellId). Subjects also labeled some typical places such as home and work, so we can infer some location information from the cell identifier. Proximity is inferred from repeated Bluetooth scans. A Bluetooth device records other Bluetooth devices within a range of 5-10 m, so we have a proximity record whenever two subjects' phones are within 5-10 meters of each other.

Finally, it is important for us to stress the ethical implications of this study. No user's or correspondent's name or phone number is recorded in the data set. Each user and correspondent is assigned a unique id to ensure that all records are anonymous.

2.2 Online Social Network Data Set

We need an online social network data set (OSN data set) as a comparison to the mobile social network. We choose a Twitter data set since Twitter is one of the most popular online social networks. Another reason is that its text-based posts (tweets) are up to 140 characters long and have been described as "the SMS of the Internet", so tweets may be a good counterpart to SMS in the mobile social network.

We crawled and collected the profiles of 103,452 users and their tweets starting on Jan 1st and lasting until Oct 14th, 2009, in total 211,509,594 tweets. From the user profile we extract the full name, the location, and the numbers of tweets, followings and followers of each user. There exist 38,554,741 directed relations of following and being followed.

Unlike traditional communication methods in the mobile social network, such as calls and SMS, a user's tweets are publicly visible to all users. However, Twitter provides other ways of directional communication, such as replies and retweets. Users are provided with a button to reply to any given message, which can easily be seen as one-to-one communication. The prototypical formulation of a retweet is "RT @user ABC", where the referenced user is the original author and ABC is the original tweet's content.
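The reply and retweet conventions described above could be turned into directed communication records roughly as follows. This is a minimal sketch, not the extraction procedure used for the OSN data set: it assumes that a reply starts with `@user` and a retweet with `RT @user`, and the names `Communication` and `extract_communication` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed text conventions (not confirmed by the data set description):
# a retweet starts with "RT @user", a reply starts with "@user".
RETWEET_RE = re.compile(r"^RT @(\w+)")
REPLY_RE = re.compile(r"^@(\w+)")


class Communication(NamedTuple):
    sender: str
    receiver: str
    kind: str  # "reply" or "retweet"


def extract_communication(author: str, text: str) -> Optional[Communication]:
    """Return a directed record if the tweet addresses another user, else None."""
    text = text.strip()
    match = RETWEET_RE.match(text)
    if match:
        return Communication(author, match.group(1), "retweet")
    match = REPLY_RE.match(text)
    if match:
        return Communication(author, match.group(1), "reply")
    return None  # an ordinary public tweet, not counted as communication
```

Calling `extract_communication("alice", "@carol thanks!")` yields a reply record from `alice` to `carol`, while a plain status update yields `None`.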

Retweets are usually thought to be a tool of information diffusion. However, some studies have found the phenomenon of ego retweets [3], in which two users repeatedly retweet the same tweet as a conversation, so we also consider these retweets as a kind of one-to-one communication in this social network. We found that 43.9% of tweets contain replies and 11.1% of tweets contain retweets. In total, we finally extract 965,042 communication records between the users we captured.

Table 1 shows the statistics of the MSN data set and the OSN data set.

3. TEMPORAL ANALYSIS OF COMMUNICATION

Studying micro-level user behavior in different networks provides insights into the fundamental difference between online and mobile social networks. In this section, we emphasize the time factor of user interaction in mobile and online social networks. We first study the periodicity of communication in the different social networks, then focus on two typical patterns of interaction between users, i.e., the instant reply and the ephemeral conversation.

3.1 Periodicity Analysis

Periodicity is a fundamental feature of users' communication behavior, which usually exhibits 24-hour periodicity. Figure 1(a) shows the average probability of communication in the mobile and online social networks during a day. Communication in the mobile social network, either call or SMS, shows a clear 24-hour periodicity, with lower probability in the morning and higher probability at night. In the online social network, although communication still roughly follows a 24-hour periodicity, the probability of communication fluctuates much less. Users' communication in the online social network is much more dispersed over a day than in the mobile social network.

Figure 1: Communication periodicity in mobile and online social network. The periodicity is obvious in the mobile social network but becomes indistinct in the online social network. (a) Probability of communication versus time of day. (b) Probability of communication versus day of week. Series: MSN-call, MSN-SMS, OSN.

Then we turn to one-week periodicity. Figure 1(b) shows the average probability of the different communication methods on each day of the week. The trend of communication probability during a week appears the same. As the weekend approaches, users communicate more, either to finish the week's work or to prepare weekend activities. Communication on Mondays shows a sharp contrast, perhaps because users focus on work on Mondays. However, the variance of the probability during a week differs between the kinds of social network. In the mobile social network, the variance of the call probability during a week is 2.5 × 10^-4 and that of the SMS probability is 1.1 × 10^-4, possibly because SMS communication is more flexible than calls. In the online social network, the variance of the communication probability during a week is 1.8 × 10^-5, less than a fifth of the variance in the mobile social network. The probability of communication in the online social network fluctuates less during a week than in the mobile social network, which shows that one-week periodicity is not a notable feature of online social network communication, whereas it plays an important role in mobile social network communication.

3.2 Instant Reply

Reply Time Comparison. The time a user takes to reply to a received message reflects to what extent the user is involved in the interaction process. We compare the reply time in the online and mobile social networks. Figure 2 shows the probability distribution of the elapsed time before a user replies. The curves of the online and mobile social networks peak at different time intervals. The online social network peaks at about 60 s, while the mobile social network peaks at a much shorter time interval. This reveals that people in the online social network take more time to reply to others than people in the mobile social network. The observation reflects different characteristics of the two networks. The mobile social network is closer to our physical social network and therefore draws more of the users' attention to stay active and involved in it. In contrast, the online social network allows users to be more relaxed.

Figure 2: Time lag between a reply and the latest communication. Online social network peaks at about 60s, while mobile social network peaks at a much shorter time interval, 20s. (Axes: elapsed time in seconds versus probability; series: MSN, OSN.)

By distinguishing related messages from unrelated messages more clearly, it is possible to better organize user communication for higher-level observations. Thus we give the definition of instant reply:

Definition 1. Instant Reply: Given a time limit θ, if user a sends a message to user b at time t, and user b sends a message back to user a within the time limit θ, the message sent back is defined as an instant reply.

We empirically set the time limit θ = 600 s. We conduct a series of comparisons of instant replies between the online and mobile social networks.

Reply Rate Comparison. Figure 3 shows the corresponding Cumulative Distribution Function (CDF) curves of users with respect to the instant reply rate. We observe that almost all mobile users fall within the reply rate interval of 0.2-0.4, while the reply rate of online users varies. One discovery is the low reply rate in the mobile social network. Replying to others with a mobile device may be limited by the environment, conditions, and many other constraints. Thus the users reply to less than half of the messages they receive. Users of the online social network are more likely to be indoors. Thus, it is possible for a user in the online social network to achieve a reply rate of more than 0.8. The result also implies that users in the mobile social network follow a more regular behavior pattern, which is strongly correlated with daily real life. The online social network, due to its separation from the physical social network, allows users to behave differently from their usual daily behavior.

Figure 3: CDF of instant reply rate per user. 80% of mobile users fall within the reply rate interval of 0.2-0.4, while the reply rate of online users varies.

3.3 Ephemeral Conversation

There are many isolated messages in social networks, which are not correlated with other messages and therefore cannot reflect interactions between users. Some messages might be correlated to form a semantic conversation, but the time span between two messages is too long to represent the dynamics of the social network. In this section, we investigate the ephemeral conversation, which consists of a continuous series of instant replies. Recognizing the characteristics of ephemeral conversations can help discover the underlying differences in structure and principles between mobile and online social networks, and can benefit further research and applications.

Definition 2. Ephemeral Conversation: If user a and user b iteratively send instant replies to each other, all the messages sent during this period form an ephemeral conversation.

Figure 4: The distribution of the duration of ephemeral conversations. In mobile social network 55% of ephemeral conversations are shorter than 100s while in online social network the percentage is 13%. (Axes: conversation duration in seconds versus CDF; series: MSN, OSN.)

Figure 5: Elapsed time of the n-th reply in ephemeral conversations. When n becomes larger, the reply time decreases in both social networks.

Conversation Duration Analysis. We plot the CDF curves of the conversation duration on both data sets, as illustrated in Figure 4. The first observation is that most conversations last less than 1000 seconds. It implies that people exchange a limited amount of information during a conversation. It can also be observed that short conversations (shorter than 100 seconds) are rare in the online social network, as illustrated by the curve, while short conversations account for about 55% in the mobile social network data set. People take less time to communicate with each other on the mobile social network, reflecting their need to achieve more effective interactions.
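Definitions 1 and 2 can be sketched in code as follows. This is only an illustration under stated assumptions, not the procedure used in the paper: messages are hypothetical `(timestamp, sender, receiver)` tuples, and a message counts as an instant reply only when it answers the immediately preceding message exchanged by the same pair of users within θ = 600 s. The function names are hypothetical as well.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Tuple[float, str, str]  # (timestamp in seconds, sender, receiver)
THETA = 600.0  # time limit θ from Definition 1


def ephemeral_conversations(messages: List[Message],
                            theta: float = THETA) -> List[List[Message]]:
    """Group messages into maximal chains of instant replies (Definition 2)."""
    # Collect the messages of each unordered user pair in time order.
    by_pair: Dict[frozenset, List[Message]] = defaultdict(list)
    for msg in sorted(messages):
        by_pair[frozenset(msg[1:])].append(msg)

    conversations = []
    for msgs in by_pair.values():
        current = [msgs[0]]
        for prev, cur in zip(msgs, msgs[1:]):
            # Instant reply: the previous receiver answers within theta seconds.
            if cur[1] == prev[2] and cur[0] - prev[0] <= theta:
                current.append(cur)
            else:
                if len(current) > 1:
                    conversations.append(current)
                current = [cur]
        if len(current) > 1:
            conversations.append(current)
    return conversations


def duration(conversation: List[Message]) -> float:
    """Time between the first and the last message of a conversation."""
    return conversation[-1][0] - conversation[0][0]
```

Every message in a returned conversation except the first is an instant reply, so reply counts and conversation durations such as those plotted in Figure 4 can both be derived from the same grouping.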

The situation in the online social network is fundamentally different, since people are more likely to spend more time communicating with others.

Details of Conversation. To capture the micro-level characteristics of a conversation, we plot the elapsed time of each round of a conversation, as illustrated in Figure 5. Clearly, the average elapsed time at the beginning of a conversation is longer (150-200 s). But as the conversation goes on, each reply takes less and less time on average. This phenomenon can be observed in both the mobile and the online social network. It can be concluded that people tend to communicate with each other faster and faster during a conversation. Both curves reveal how a conversation gradually captures people's attention. Another observation is that the average elapsed time in the mobile social network is significantly shorter than in the online social network. It supports our observation in Section 3.2.

In this section we have discussed the temporal factor of communication in mobile and online social networks. Communication in the mobile social network presents a more significant periodicity than in the online social network. Besides, both the reply time and the conversation duration in the mobile social network are shorter. However, the reply rate of most users in the mobile social network is between 0.2 and 0.4, while in the online social network a larger portion of users have a high reply rate. We also discover that in both social networks the elapsed time between replies decreases during a conversation.

4. GEOGRAPHICAL ANALYSIS OF COMMUNICATION

There have been numerous studies on large-scale location-based social networks [12, 13]. However, few have analyzed location characteristics on a small scale, and they do not investigate how the influence of location on communication differs between mobile and online social networks. We are concerned with how the location of a user influences her/his communication in different social networks. The online social network, the mobile social network and the physical social network reflect three levels of geographic scale of people's social circles. The online social network, because it is convenient and free, allows users all around the world to communicate with each other; mobile communication more usually takes place between people in a college, a city, or a nation; the physical social network only allows people to communicate face-to-face. We analyze the correlation between users' location and communication in the mobile and online social networks.

4.1 Geographic Homophily

Homophily is the tendency that "a contact between similar people occurs at a higher rate than among dissimilar people" [14]. Here we investigate homophily in the context of geographic location. In the mobile data set, we count the communication between users located at "home" or "work", and classify it according to the direction and locations concerned: "home->home", "home->work", "work->work" and "work->home". In the online social network, we compare the amounts of domestic and international communication. The result is illustrated in Figure 6.

Figure 6: Geographical homophily. (a) shows communication rate of sender and receiver in different location types in mobile social network; (b) exhibits domestic and international communication rate in online social network.

A significant difference can be observed in the result. In the mobile social network, 74% of communication occurs when both users are at home or both are at the workplace. It supports the claim that users in nearer locations or similar environments are more likely to communicate with each other, namely geographic homophily. However, in the online social network, international communication accounts for 83% of all communication, about 5 times as much as domestic communication, which violates geographic homophily.

We find that geographic homophily is supported in the mobile social network, whereas the online social network exhibits totally different geographic properties. The online social network is more often used for international communication than for domestic communication, probably due to its convenience and low cost. Another possible factor is the motivation of users in different social networks, which directly influences their behavior. Online users tend to discover new friends and expand their social circles through the online social network, while the mobile social network is more often used for interaction between acquaintances, close friends, family members, etc.

4.2 Distance-based Analysis

To capture the geographic characteristics of different social networks, we analyze the trends of communication with regard to geographic distance.

Online social network. Twitter users self-report their location. It is hard to parse the location because it is free-form text. Instead, we take the time zone of a user as an approximate indicator of the user's location to measure geographic distance in the online social network. Figure 7 shows the probability distribution of communication with respect to time zone difference. Most communication occurs between two users with 1 hour of time zone difference. As the time zone difference increases, the communication probability decreases. However, when the time zone difference exceeds 4 hours, the communication probability flattens out. The result indicates that online users indeed communicate more with people from nearer regions, where more of their close friends or family members live. More importantly, the online social network is commonly used for international communication, without being impaired by the geographic gap. It implies the convenience and low cost of the online social network, particularly for communication on a global geographic scale.

Figure 7: Time zone difference vs. communication probability. The probability of communication between users in the same time zone is not the highest.

Another interesting observation is the relatively lower probability of communication between users in the same time zone. Although a user usually has the most friends and acquaintances in the same time zone, she/he does not tend to contact them through the online social network. The mobile social network is probably preferred in this case, which reduces the corresponding online communication probability. It implies an underlying interplay between the online and mobile social networks.

Mobile social network. People move around in the mobile social network, so the geographic distance between two users is dynamic and hard to measure directly. We instead count the geographic co-occurrences of two users as an estimate of the average geographic distance between them. The more co-occurrences are discovered, the shorter the average distance is. The communication distribution with respect to the number of co-occurrences is shown in Figure 8.

Figure 8: Co-occurrence count vs. average number of communications. For more frequent co-occurrence in Bluetooth scale (5-10m), the communication increases. (Series: Bluetooth device, Cell information.)

We apply two approaches to measure the co-occurrence of two subjects. One is the Bluetooth device, which records proximity within a 5-10 m range; the other is GSM cell information, where one GSM cell tower covers a distance of about 100 meters. We can see that as the Bluetooth co-occurrence of a pair of users increases, their communication also rises notably. In contrast, when a pair of users have more cell-id co-occurrences, their communication frequency does not become higher. This observation illustrates that even within the mobile social network, different location scales have different levels of influence on communication. When the scale becomes smaller, users prefer to communicate more with subjects who are closer in location. However, when the scale becomes larger, this pattern is attenuated. This conclusion also reflects the difference in the influence of location between mobile and online social networks.

5. TOPOLOGICAL ANALYSIS OF COMMUNICATION

The temporal and spatial nature of communication reflects the behavior pattern of users. Besides the time and location factors, the topological structure of the communication network in the mobile and online social networks, which we study next, is also critical.

5.1 Visualized Structure

Before further study and comparison of the mobile and online social networks, we should first understand the communication structure of our data, as presented in Figure 9.

Figure 9: Communication Network Structure. (a) Mobile Social Network. (b) Online Social Network.

Figure 9(a) shows the Reality Mining communication structure. The red nodes are subjects from whom we captured data, while grey nodes are people that the subjects communicate with.

We have omitted the nodes with degree less than one for clarity. A link, which indicates that two users communicated with each other more than 5 times, is colored red or blue depending on whether it connects two red nodes or a red node and a grey node. The communication strength is shown by the width of the line. Figure 9(b) shows the communication structure in the online social network. The peripheral nodes form many isolated communicating groups, while the inner nodes constitute a large subnetwork with many communicating communities visible in the graph.

5.2 Node-level Analysis

Degree Property. We regard each user as a node, and establish an edge between users who have communicated with each other. Then we calculate the degree of each node and plot the CDF curves in Figure 10. It can be observed that over 60% of nodes in the online social network have degree below 50, whereas fewer than 20% do in the mobile social network. Both curves reach 1.0 at a degree of approximately 400 to 500. The result indicates that a larger portion of users in the online social network behave passively, as they communicate with fewer people. In comparison, users in the mobile social network keep a lower bound on degree: about 90% of users in the mobile social network have degree above 40. It underlines the significant difference between the online social network and our real social network. About 10% of nodes in the online social network have degree 0, while an ordinary adult in a real social network is not likely to isolate herself/himself from all social communication.

Figure 10: Distribution of degree of nodes.

To describe the communication of a node more accurately, we consider the direction of edges. An arc from A to B is established if A makes more communication to B. Based on the newly generated directed graph, we consider the ratio of in-degree to out-degree (d_in / d_out), and plot the corresponding CDF for both the mobile and the online social network in Figure 11. In the online social network, more than 30% of nodes have no in-degree and more than 25% have no out-degree. Besides, about half of the nodes in the online social network have more out-degree than in-degree and another 40% have more in-degree. The remaining 10% have exactly equal in-degree and out-degree. In the mobile social network, about 85% of users have more out-degree, but few have no in-degree or no out-degree. The ratio in the mobile social network varies from 0.1 to less than 10, whereas the ratio in the online social network can vary from less than 0.1 to more than 10. From this observation, we can infer that less than half of the registered users in the online social network are "active", i.e., users who both propagate influence and receive influence from others. The considerable number of users without effective interaction reveals the huge gap between the current online social network and our physical social network.

Figure 11: Distribution of in-degree/out-degree ratio.

Figure 14: Social Status Theory in different social networks. Mobile social network has a significantly high triangle rate of tri101 pattern. (a) Social Status Theory (triad types tri000, tri100, tri011, tri101, tri110, tri111 and the unsatisfied triangle). (b) Triangle Distribution (probability of each type of triad; series: MSN, OSN).

nDCG@k          100      500      1000     5000     10000
MSN  Avg-Deg    0.175    0.200    0.244    0.399    0.402
MSN  CN-Num     0.239    0.243    0.256    0.256    0.305
MSN  CN-Str     0.315    0.226    0.254    0.287    0.313
OSN  Avg-Deg    0.508    0.598    0.631    0.686    0.703
OSN  CN-Num     0.793    0.794    0.796    0.792    0.784
OSN  CN-Str     0.653    0.699    0.717    0.745    0.753

Table 2: Experiment results of communication detection.

ize the top-k correspondents per user based on the frequency of communications. This property essentially describes the users' favoritism in communication. Figure 12 highlights the median portion of users' communications (both incoming and outgoing) covered by the top-k correspondents of users in the mobile and online social networks. For half of the user population, 58% of mobile social network communications and 50% of online social network communications are covered by the set of top-5 correspondents. Half of the user population communicate with fewer than 50 correspondents in the online social network, while half of the user population communicate with more than 100 correspondents in the mobile social network. It shows that users in the mobile social network have more correspondents but are more likely to communicate with a small number of people, while users in the online social network have fewer correspondents and their communication is slightly more dispersed.

Figure 12: Communication covered by top-k correspondents. Top-5 correspondents cover 59% and 50% of communications in mobile and online social network.

5.3 Pair-wised Analysis

We also analyze each pair of users to find out whether the network structure influences the communication between users. A basic intuition is that users who share many common neighbors in a social network probably belong to the same community. Thus we plot the number of communications between two users with respect to their common neighbor count, as illustrated in Figure 13. In the mobile social network, the average number of communications between two users increases dramatically with their common neighbor count. In contrast, this phenomenon becomes weak and indistinguishable in the online social network: users with different numbers of common neighbors have almost the same average number of communications.

Figure 13: Common neighbor count vs. average number of communication. More common neighbors result in more communications in mobile social network while not in online social network.

This discovery reveals the strong tendency of mobile users to communicate within their own community, since the communication network is highly related to the physical social network of our daily life. Online users, however, communicate with people regardless of whether they belong to the same community. It implies a stronger motivation of online users to connect with more "unfamiliar" friends. The observed difference distinguishes mobile and online social networks by their functions in people's social life. The mobile social network is more like an

Communication Reciprocity.
Depending on following auxiliary network of people’s daily life, thus it is usually and follower relationship, Twitter shows a low level of reci- more similar to the real social network. The online social procity. In our data, 74.25% of user pairs with any link network plays a role to help people extend their social net- between them are connected one-way, and only 25.75% have work. reciprocal relationship between them. Previous studies have Communication Detection. Here we use common neigh- also reported the reciprocity on Twitter as 22.1%[9]. The bor factor to detect communication to show how the topo- actual communication relationship in Twitter also shows al- logical difference between mobile and social network influ- most the same reciprocal rate, 25.66%. Surprisingly, in mo- ence user behavior. We extract the same time period, four bile social network, only 21.9% of user pairs have reciprocal months of data from both social network and split them into relationship between users, even less than in online social earlier and later two-month halves. What we want to do is network. 78.1% are unidirectional. We conclude to a result to detect unprecedented communication links in the second that, a user pair that really communicates in online social time period by ranking the non-communication user pairs in network has a higher probability to have a feedback than the first time period, utilizing the topological characteristics. in mobile social network, which goes against to our com- We offer three kinds of ranking approaches as below. mon sense. Twitter’s low reciprocity mark a deviation from known characteristics of human social networks. We have • discovered that mobile social network also has this proper- Average Degree(Ave-Deg) Users with more correspon- ty. dents may have a higher likelihood to build new com- munication links. So here we rank the user pair which Communication Diversity. 
To investigate the commu- has at least 1 common neighbors with the harmonic nicating strength in different social networks, we character- average of two users’ degrees, computed as below. 2d(r)d(s) AveDeg(r, s) = (1) B indicates that A has a higher status than B, then all d(r) + d(s) triangles of three nodes in the network should be acyclic. where d(s) means the communication degree of user Figure 14(a) lists 7 types of triads. For easy understand- s. ing, given a triad (A, B, C), we use 1 to denote a directed arch and 0 to denote an undirected edge. Thus label 011 • Common Neighbor Number(CN-Num) Users with more denotes that A and B connected by an undirected edge, common neighbors may like to communicate with each while the other two directed arches emit from C to A and other, so in this method we use the number of common B respectively. The last triad violates the social theory, and neighbor of the user pair to rank it. is labeled as unsatisfied triad. We make a statistic of all • Common Neighbor Strength(CN-Str) The communica- 7 types of triangles on both mobile social network and on- tion strength can be measured by the number of com- line social network, as presented in Figure 14(b). In mobile munication in user pairs. As a modification, here we social network, more than 90% of the triangles satisfy the regard the sum of the harmonic average of communi- social status theory, while in online social network there are cation count between each common neighbor and two 13% of the triangles violate the theory. Another significant users as the ranking index, computed as below: difference between two users is the case ”101”. In mobile so- cial network its probability is more than 35% but in online ∑ 2m(v, r)m(v, s) CNStr(r, s) = (2) social network it is only around 10%. It indicates that some m(v, r) + m(v, s) v∈N(r,s) users in mobile social network are significant leaders, who usually spread their opinions to a wider population. 
where N(r, s) is the set of all common neighbors of user r and s, m(v, r) is the communication count be- The obviously different pattern of triangle reveals differ- tween user v and r. ent properties of mobile social network and online social net- work. Mobile social network acts as an extension of our real We use nDCG measurement to evaluate our experiment social network, reflects the social status of users in their results as shown in Table 2. In fact there is only 1,104 of daily life. However, users in online social network may have 7,794,542(about 0.014%) non-communication user pairs in an absolute different behavior. The unusual communication the first time period build the communication link in the pattern obscures the social status hierarchy in online social second time period in mobile social network and the ratio network, and may not follow the social status theory. is even lower in online social network, 16,408 to 223,390,999 (about 0.007%), so the results are acceptable that all ap- 6. RELATED WORK proaches are helpful to communicating detection. There are three interesting discoveries we can extract from the results. Various online and social networking services has spurred Firstly, communication detection in online social network research into their characteristics and many work has for- through all approaches has a significantly better effect than ayed into characteristics and user interactions in traditional that in mobile social network, which means that a user in online social networking such as [2, 16]. online social network have a more likelihood to make friends Twitter, as a new form of online social network, has at- with friends of his friend while less in mobile social network. tracted much attention in recent years. Sue et al. analyze It is ironical that a user pair that already knows each other information spreading and impact of retweet over the entire with more common neighbors communicates more frequent- Twitter sphere [9]. Wu et al. 
investigates the flow of in- ly while an unknown user pair with more common neighbors formation among different categories of users [19]. But all is not tend to communicate with each other in mobile social the works are based on the relationship between following network. Secondly, in both social networks the performance and follower. Huberman et al. reports that the number of CN-Num and CN-Str approaches is better than that of of friends is actually smaller than the number of followers Ave-Deg approach when the cut-off k of ranking is small. or followings [6]. We shift attention from the relationship However with the cut-off k increasing, the performance of of following and follower to the one-to-one communication Ave-Deg approach has a notable raise. It indicates that on- such as reply between users. ly largest number or highest strength of common neighbor Mobile social network, as a traditional and typical kind lead to a higher probability to communicate while others of social network, has been studied a lot on communication make no sense. The user pair with median average degree behavior. Onnela et al. analyze structure and tie strengths are more likely to get acquainted with each other. Finally, in mobile call-based communication networks [17]. Banjo et in mobile social network CN-Str approach performs better al. develop a theoretical model for which social effects of cell than CN-Num approach but it is on the contrast in online phone usage in public places[1]. Miklas et al. investigates social network. In online social network a user would like how mobile systems could exploit people’s social interaction- to communicate with a stranger depending on how many s [15]. Yuan et al. infer user emotional status from mobile common friends they have. However the communication communication network [20]. Wang et al. 
mine the influen- strength is more important which means a common neigh- tial nodes in mobile social network based on communication bor which communicate with both users frequently may lead [18]. to the communication between these two users. Some communication analysis on other kinds of social net- work have been studied such as Karagiannis et al.’s work 5.4 Network-based Analysis on email social network [8] and Leskovec et al.’s work on We first introduce social status theory before we focus our instant-messaging network [10]. Kamvar et al. present a analysis on the collective network. logs-based comparison of search patterns across different platforms [7]. However our work marks the first to com- Social status theory[5, 11]. This theory is based on pare communicating behavior in online social network and the directed social network. Suppose an arch from A to mobile social network. 7. CONCLUSION [8] T. Karagiannis and M. Vojnovic. Behavioral profiles We presented comparison of user communication between for advanced email features. In J. Quemada, G. Le´l˝on, mobile and online social network. We compare communica- Y. S. Maarek, and W. Nejdl, editors, WWW, pages tion in three aspects: time, location and topology. In tem- 711–720. ACM, 2009. poral analysis, we analyze the periodicity properties. In ad- [9] H. Kwak, C. Lee, H. Park, and S. B. Moon. What is dition, we bring two definition: instant reply and ephemeral twitter, a social network or a news media? In WWW, conversation. We found out the instant reply rate in mobile pages 591–600, 2010. social network is concentrated among 0.2-0.4 while users in [10] J. Leskovec and E. Horvitz. Planetary-scale views on a online social network polarize on instant reply rate. We also large instant-messaging network. In WWW, pages found that during an ephemeral convesation the reply time 915–924, 2008. reduces gradually and finally the converged elapsed time in [11] J. Leskovec, D. P. Huttenlocher, and J. M. Kleinberg. 
mobile social network is much smaller than that in online Signed networks in social media. CoRR, social network. These temporal differences reflect users’ dis- abs/1003.2424, 2010. similar customs and behavioral patterns in mobile and online [12] N. Li and G. Chen. Analysis of a location-based social social network. Then we analyzed the influence of geograph- network. In CSE (4), pages 263–270. IEEE Computer ical location on different social networks. We have found Society, 2009. that on a smaller scale in mobile social network, some level [13] N. Li and G. Chen. Geographic community analysis of of homophily can be observed while in online social network mobile social networks. In Proceedings of the it becomes obscured. In the topological analysis, we conduc- Workshop on Social Networks, Applications, and t both node-level and pair-wised comparisons. We further Systems (SNAS), Boston, MA, Aug. 2009. explore whether these results can benefit communication de- [14] M. McPherson, L. Smith-Lovin, and J. M. Cook. tection. Then we induce social status theory and achieve the Birds of a feather: Homophily in social networks. conclusion that mobile social network acts as an extension Annual Review of Sociology, 27(1):415–444, 2001. of our real social network while the unusual communication [15] A. Miklas, K. Gollu, K. Chan, S. Saroiu, pattern obscures the social status hierarchy in online social K. Gummadi, and E. de Lara. Exploiting Social network. Interactions in Mobile Systems. In Proceedings of There are many potential future directions of this work. UbiComp 2007, pages 409–428, Sept. 2007. We may import some experiments to confirm and utilize the difference between two kinds of social network. We can pre- [16] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, dict whether a user will instantly reply your communication and B. Bhattacharjee. Measurement and analysis of based on the difference we found in different social network- online social networks. 
In Proceedings of the 7th ACM s. Another potential issue is inferring users’ emotion and SIGCOMM conference on Internet measurement, IMC activity status in different social networks. ’07, pages 29–42, New York, NY, USA, 2007. ACM. [17] J. P. Onnela, J. Saramaki, J. Hyvonen, G. Szabo, D. Lazer, K. Kaski, J. Kertesz, and A. L. Barabasi. 8. REFERENCES Structure and tie strengths in mobile communication [1] O. Banjo, Y. Hu, and S. S. Sundar. Cell Phone Usage networks. PNAS, 104:7332–7336, 2006. and Social Interaction with Proximate Others:Ringing [18] Y. Wang, G. Cong, G. Song, and K. Xie. in a Theoretical Model. In ICA, 2006. Community-based greedy algorithm for mining top-k [2] F. Benevenuto, T. Rodrigues, M. Cha, and influential nodes in mobile social networks. In KDD, V. Almeida. Characterizing user behavior in online pages 1039–1048, 2010. social networks. In Proceedings of the 9th ACM [19] S. Wu, J. M. Hofman, W. A. Mason, and D. J. Watts. SIGCOMM conference on Internet measurement Who says what to whom on twitter. In S. Srinivasan, conference, IMC ’09, pages 49–62, New York, NY, K. Ramamritham, A. Kumar, M. P. Ravindra, USA, 2009. ACM. E. Bertino, and R. Kumar, editors, WWW, pages [3] D. Boyd, S. Golder, and G. Lotan. Tweet, tweet, 705–714. ACM, 2011. retweet: Conversational aspects of retweeting on [20] Y. Zhang, J. Tang, J. Sun, Y. Chen, and J. Rao. twitter. In HICSS, pages 1–10. IEEE Computer Moodcast: Emotion prediction via dynamic continuous Society, 2010. factor graph model. In ICDM, pages 1193–1198, 2010. [4] N. Eagle, A. Pentland, and D. Lazer. Inferring Social Network Structure using Mobile Phone Data. PNAS, 2007. [5] R. V. Guha, R. Kumar, P. Raghavan, and A. Tomkins. Propagation of trust and distrust. In WWW, pages 403–412, 2004. [6] B. A. Huberman, D. M. Romero, and F. Wu. Social networks that matter: Twitter under the microscope. First Monday, 14(1), 2009. [7] M. Kamvar, M. Kellar, R. Patel, and Y. Xu. 
Computers and and mobile phones, oh my!: a logs-based comparison of search users on different devices. In J. Quemada, G. Le´l˝on, Y. S. Maarek, and W. Nejdl, editors, WWW, pages 801–810. ACM, 2009.
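
As an illustration of the communication detection experiment in Section 5.3, the following Python sketch computes the three ranking scores (Ave-Deg from Equation (1), CN-Num, and CN-Str from Equation (2)) and evaluates a ranking with nDCG@k. It is a minimal sketch under stated assumptions, not the implementation used in the paper: the input format (a list of user pairs), all function names, the binary-relevance form of nDCG with a log2 discount, and the choice to restrict every ranking to pairs with at least one common neighbor are assumptions made here for illustration.

```python
import math
from itertools import combinations


def build_graph(records):
    """Return graph[u][v] = number of communications between users u and v.

    `records` is an iterable of (user_a, user_b) pairs; direction is ignored.
    """
    graph = {}
    for a, b in records:
        if a == b:
            continue  # self-communication carries no pair information
        graph.setdefault(a, {}).setdefault(b, 0)
        graph.setdefault(b, {}).setdefault(a, 0)
        graph[a][b] += 1
        graph[b][a] += 1
    return graph


def harmonic(x, y):
    """Harmonic average 2xy / (x + y), taken as 0 when x + y == 0."""
    return 2.0 * x * y / (x + y) if x + y > 0 else 0.0


def common_neighbors(graph, r, s):
    """N(r, s): users who have communicated with both r and s."""
    return graph[r].keys() & graph[s].keys()


def ave_deg(graph, r, s):
    """Equation (1): harmonic average of the two communication degrees."""
    return harmonic(len(graph[r]), len(graph[s]))


def cn_num(graph, r, s):
    """CN-Num: number of common neighbors of r and s."""
    return len(common_neighbors(graph, r, s))


def cn_str(graph, r, s):
    """Equation (2): sum over common neighbors v of harmonic(m(v, r), m(v, s))."""
    return sum(harmonic(graph[v][r], graph[v][s])
               for v in common_neighbors(graph, r, s))


def rank_candidates(graph, score):
    """Rank pairs that did not communicate in the first period, best first.

    Assumption: only pairs with at least one common neighbor are candidates.
    """
    candidates = [(r, s) for r, s in combinations(sorted(graph), 2)
                  if s not in graph[r] and common_neighbors(graph, r, s)]
    return sorted(candidates, key=lambda p: score(graph, *p), reverse=True)


def ndcg_at_k(ranked, new_links, k):
    """Binary-relevance nDCG@k: a pair is relevant if it appears in new_links."""
    gains = [1.0 if pair in new_links else 0.0 for pair in ranked]

    def dcg(values):
        return sum(g / math.log2(i + 2) for i, g in enumerate(values[:k]))

    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Toy first-period log and a hypothetical set of second-period new links.
    first_period = [("a", "c"), ("a", "c"), ("b", "c"),
                    ("a", "d"), ("b", "d"), ("b", "d"), ("b", "d")]
    second_period_links = {("a", "b")}
    g = build_graph(first_period)
    for name, score in [("Ave-Deg", ave_deg), ("CN-Num", cn_num), ("CN-Str", cn_str)]:
        ranking = rank_candidates(g, score)
        print(name, ranking, ndcg_at_k(ranking, second_period_links, k=2))
```

Candidate pairs are stored as sorted tuples, so the set of second-period links must use the same ordering. Enumerating all pairs is quadratic in the number of users, which is acceptable for a toy example but would need a neighbor-of-neighbor expansion for data at the scale reported in Section 5.3.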
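
The node-level measures of Section 5.2 can be sketched in the same way. The Python example below computes the reciprocity rate (the share of communicating user pairs with traffic in both directions) and the portion of a user's communications covered by the top-k correspondents, together with the median of that portion over all users. It is a hedged sketch: the message format (sender, receiver), the function names, and the rule that a single message is enough to create a directed link are assumptions introduced here, not details taken from the paper.

```python
import statistics
from collections import Counter


def reciprocity(messages):
    """Share of communicating pairs with at least one message in each direction."""
    directed = {(src, dst) for src, dst in messages if src != dst}
    pairs = {frozenset(arc) for arc in directed}
    if not pairs:
        return 0.0
    # Each reciprocal pair contributes two arcs whose reverse also exists.
    mutual = sum(1 for src, dst in directed if (dst, src) in directed) // 2
    return mutual / len(pairs)


def top_k_coverage(messages, user, k):
    """Fraction of `user`'s incoming and outgoing communications that involve
    the k correspondents the user communicates with most often."""
    counts = Counter()
    for src, dst in messages:
        if src == dst:
            continue
        if src == user:
            counts[dst] += 1
        elif dst == user:
            counts[src] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for _, n in counts.most_common(k)) / total


def median_top_k_coverage(messages, k):
    """Median over all users of the top-k coverage."""
    users = {u for message in messages for u in message}
    return statistics.median(top_k_coverage(messages, u, k) for u in users)


if __name__ == "__main__":
    # Toy message log: (sender, receiver)
    log = [("a", "b"), ("b", "a"), ("a", "c"), ("a", "c"), ("d", "a")]
    print("reciprocity:", reciprocity(log))
    print("top-1 coverage of a:", top_k_coverage(log, "a", 1))
    print("median top-1 coverage:", median_top_k_coverage(log, 1))
```

Recomputing the per-user counts on every call keeps the sketch short; for a large log, a single pass that builds one counter per user would be the more practical design.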