
Hip and Trendy: Characterizing Emerging Trends on Twitter

Mor Naaman School of Communication and Information, Rutgers University, 4 Huntington St., New Brunswick, NJ 08901. E-mail: [email protected]

Hila Becker and Luis Gravano Computer Science Department, Columbia University, 1214 Amsterdam Ave., New York, NY 10027. E-mail: {hila, gravano}@cs.columbia.edu

Twitter, Facebook, and other related systems that we call social awareness streams are rapidly changing the information and communication dynamics of our society. These systems, where hundreds of millions of users share short messages in real time, expose the aggregate interests and attention of global and local communities. In particular, emerging temporal trends in these systems, especially those related to a single geographic area, are a significant and revealing source of information for, and about, a local community. This study makes two essential contributions for interpreting emerging temporal trends in these information systems. First, based on a large dataset of Twitter messages from one geographic area, we develop a taxonomy of the trends present in the data. Second, we identify important dimensions according to which trends can be categorized, as well as the key distinguishing features of trends that can be derived from their associated messages. We quantitatively examine the computed features for different categories of trends, and establish that significant differences can be detected across categories. Our study advances the understanding of trends on Twitter and other social awareness streams, which will enable powerful applications and activities, including user-driven real-time information services for local communities.

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 62(5):902–918, 2011. Received July 30, 2010; revised December 20, 2010; accepted December 21, 2010. © 2011 ASIS&T. Published online 7 March 2011 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/asi.21489

Introduction

In recent years, a class of communication and information platforms we call social awareness streams (SAS) has been shifting the manner in which we consume and produce information. Available from services such as Facebook, Twitter, FriendFeed, and others, these hugely popular networks allow participants to post streams of lightweight content artifacts, from short status messages to links, pictures, and videos. These SAS platforms have already shown considerable impact on the information, communication, and media infrastructure of our society (Johnson, 2009), as evidenced during major global events such as the Iran election or the reaction to the earthquake in Haiti (Kwak, Lee, Park, & Moon, 2010), as well as in response to local events and emergencies (Shklovski, Palen, & Sutton, 2008; Starbird, Palen, Hughes, & Vieweg, 2010).

SAS allow for rapid, immediate sharing of information aimed at known contacts or the general public. The content of the often-public shared items ranges from personal status updates to opinions and information sharing (Naaman, Boase, & Lai, 2010). In aggregate, however, the postings by hundreds of millions of users of Facebook, Twitter, and other systems expose global interests, happenings, and attitudes in almost real time (Kwak et al., 2010).

These interests and happenings as reflected in SAS data change rapidly. The strong temporal nature of SAS information allows for the detection of significant events and other temporal trends in the stream data. Such trends may reflect a varied set of occurrences, including local events (e.g., a baseball game or “fire on 34th street”), global news events (e.g., Michael Jackson’s death), televised events (e.g., the final episode of ABC’s Lost), Internet-only and platform-specific memes (e.g., a “fad” of users describing various things they object to using the #idonotsupport keyword), and hot topics of discussion (e.g., healthcare reform or the tween idol Justin Bieber).

Most related SAS research so far has focused on Twitter, due to its wide global reach and popularity, and because its contents are mostly public and are easily downloaded with automated tools. Several research efforts focused on characterizing or analyzing content from individual events on Twitter (Diakopoulos, Naaman, & Kivran-Swaine, 2010; Nagarajan, Gomadam, Sheth, Ranabahu, Mutharaju, & Jadhav, 2009; Sakaki, Okazaki, & Matsuo, 2010; Starbird et al., 2010; Shamma, Kennedy, & Churchill, 2010; Yardi & boyd, 2010). Other research efforts have addressed the problem of detecting and identifying trends in Twitter and other SAS data. “Bursts” of interest and attention can be detected in this data in hindsight (Becker, Naaman, & Gravano, 2010; Chen & Roy, 2009; Kleinberg, 2003; Rattenbury, Good, & Naaman, 2007) or in almost real time (Sakaki et al., 2010; Sankaranarayanan, Samet, Teitler, Lieberman, & Sperling, 2009). Most recently, some work has focused on characterizing aggregate general trend characteristics, for example, showing a power law distribution of participation for manually identified terms that correspond to events (Singh & Jain, 2010).

Indeed, SAS systems in general, and Twitter in particular, reflect an ever-updating live image of our society. However, the lack of a well-established structure and semantics for this data limits its utility. Our interest in this article is in characterizing the features that can help identify and differentiate the types of trends that we can find on Twitter. Better understanding of the semantics of SAS trends could provide critical information for systems that build on this emerging data. The outcome will be a more robust and nuanced reflection of emerging trends that captures key aspects of relevance and importance.

We focus here on content that is produced and shared within a specific geographic community and trends detected in that content. The relationship between geography and neighborhood and community has been long studied and argued (Campbell, 1990; Hampton & Wellman, 2003; Tilly, 1974), particularly in view of the Internet’s effect on local community ties (Hampton & Wellman, 2003; Putnam, 2000). It is clear, though, that social ties are still more likely between geographically proximate individuals (Mok, Carrasco, & Wellman, 2010; McPherson, Smith-Lovin, & Cook, 2001), and those patterns persist in online networks as well (Scellato, Mascolo, Musolesi, & Latora, 2010). On Twitter in particular, Scellato et al. (2010) and Takhteyev, Gruzd, and Wellman (2010) show that a significant proportion of the connections are local, although significant “global” patterns of connections exist. Beyond the higher likelihood of connections and ties, people living in the same geographic area are more similar (McPherson et al., 2001), and likely to share interests and information needs (Yardi & boyd, 2010).

Therefore, we posit that trends that appear in content produced by individuals in a geographic community can be critical and useful to detect or report to others in this community. On the other hand, this type of information can also become distracting and meaningless if these interests are not reported or harvested correctly. In this work, the focus on a specific geographic community helps us effectively reason about emerging trends with global and local impact.

This article offers the following contributions:

1. A taxonomy of trends that can be detected from Twitter for a specific geographic community using popular, widely accepted methods.
2. A characterization of the data associated with each trend along a number of key characteristics, including social features, time signatures, and textual features.

This improved understanding of emerging information on Twitter in particular, and in SAS in general, will allow researchers to design and create new tools to enhance the use of SAS as information systems in different contexts and applications, including the filtering, search, and visualization of real-time SAS information as it pertains to local geographic communities.

To this end, we begin with an introduction to Twitter and a review of related efforts and background to this work. We then formally describe our dataset of trends and their associated messages. Later, we describe a qualitative study exposing the types of trends found on Twitter. Finally, in the bulk of this article we identify and analyze emerging trends using the unique social, temporal, and textual characteristics of each trend that can be automatically computed from Twitter content.

Twitter

Twitter is a popular SAS service, with tens of millions of registered users as of June 2010. Twitter’s core function allows users to post short messages, or tweets, which are up to 140 characters long. Twitter supports posting (and consumption) of messages in a number of different ways, including through Web services and “third party” applications. Importantly, a large fraction of the Twitter messages are posted from mobile devices and services, such as Short Message Service (SMS) messages. A user’s messages are displayed as a “stream” on the user’s Twitter page.

In terms of social connectivity, Twitter allows a user to follow any number of other users. The Twitter contact network is directed: user A can follow user B without requiring approval or a reciprocal connection from user B. Users can set their privacy preferences so that their updates are available only to each user’s followers. By default, the posted messages are available to anyone. In this work, we only consider messages posted publicly on Twitter. Users consume messages mostly by viewing a core page showing a stream of the latest messages from people they follow, listed in reverse chronological order.

The conversational aspects of Twitter play a role in our analysis of the Twitter temporal trends. Twitter allows several ways for users to directly converse and interact by referencing each other in messages using the @ symbol. A retweet is a message from one user that is “forwarded” by a second user to the second user’s followers, commonly using the “RT @username” text as prefix to credit the original (or previous) poster (e.g., “RT @justinbieber Tomorrow morning watch me on the today show”). A reply is a public message from one user that is a response to another user’s message, and is identified by the fact that it starts with the replied-to user @username (e.g., “@mashable check out our new study on Twitter trends”). Finally, a mention is a message that includes some other username in the text of the message (e.g., “attending a talk by @informor”). Twitter allows users to easily see all recent messages in which they were retweeted, replied to, or mentioned.
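The retweet, reply, and mention conventions described above are purely textual, so they can be approximated with pattern matching. The following is a minimal sketch under stated assumptions: the function name `classify_tweet` and the regular expressions are hypothetical illustrations of those conventions, not Twitter's own parsing rules and not the authors' code.

```python
import re

# Hypothetical patterns encoding the conventions described in the text.
# A retweet commonly uses "RT @username" as a prefix.
RETWEET_RE = re.compile(r"^\s*RT\s+@\w+")
# A reply starts with the replied-to user's @username.
REPLY_RE = re.compile(r"^\s*@\w+")
# A mention includes some other @username anywhere in the text.
# The lookbehind skips e-mail-like strings such as "name@example.com".
MENTION_RE = re.compile(r"(?<!\w)@\w+")


def classify_tweet(text: str) -> str:
    """Label a tweet as 'retweet', 'reply', 'mention', or 'regular'."""
    if RETWEET_RE.match(text):
        return "retweet"
    if REPLY_RE.match(text):
        return "reply"
    if MENTION_RE.search(text):
        return "mention"
    return "regular"
```

The checks run from most to least specific, because a retweet or a reply also contains an @username and would otherwise be counted as a mention. Retweets that place "RT @username" somewhere other than the start of the message are not caught by this sketch.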

Finally, Twitter supports a hashtag annotation format so that users can indicate what their posted messages are about. This general “topic” of a tweet is, by convention, indicated with the hash sign, #. For example, #iranelections was a popular hashtag with users posting about the Iran election events.

Related Work and Background

The general topic of studying Twitter trends, as well as Twitter content related to real-life events, has recently received considerable research interest. Research efforts often examined a small number of such trends to produce some descriptive and comparative characteristics of Twitter trends or popular terms. Cheong and Lee (2009) looked at four trending topics and two control terms, and a subset of the messages associated with each, commenting on features such as the time-based frequency (volume of messages) for each term and the category of users and type of devices used to post the associated messages. Yardi and boyd (2010) examined the characteristics of content related to three topics on Twitter, two topics representing geographically local news events and one control topic. The authors studied the messages posted for each topic (i.e., messages containing terms manually selected by the authors to capture related content) and the users who posted them. Among other findings, the authors suggest that local topics feature denser social connectivity between the posting users. Similarly, Sakaki et al. (2010) suggest that the social connectivity for breaking events is lower, but have only examined content related to two manually chosen events. Singh and Jain (2010) examine Twitter messages with select terms and show that the content for each such set follows a power-law distribution in terms of popularity, time, and geo-location. Kwak et al. (2010) show that different trending terms on Twitter have different characteristics in terms of the number of replies, mentions, retweets, and “regular” tweets that appear in the set of tweets for each term, but do not reason about why and how exactly these trends are different. Some of the metrics we use here for characterizing trends are similar to those used in these studies, but we go further and perform a large-scale analysis of trends according to manual assignments of these trends to distinct categories.

On a slightly larger scale, Kwak et al. (2010) also examined the time series volume data of tweets for each trending term in their dataset, namely, a sample of 4,000 of the trending terms computed and published by Twitter. The authors based their analysis on the findings of Crane and Sornette (2008), who analyzed time series viewing data for individual YouTube videos. Crane and Sornette observed that YouTube videos fall into two categories, based on their view patterns. When a time series shows an immediate and fast rise in a video’s views, Crane and Sornette assert that the rise is likely caused by external factors (i.e., attention was drawn to the video from outside the YouTube community) and, therefore, dub this category of videos “exogenous.” When there is no such rise, the authors suggest that a video’s popularity is due to “endogenous” factors. Videos are also classified as “critical” or “sub critical,” again according to the time series data. Kwak et al. (2010) use these guidelines to classify the Twitter trends in each of these two categories, showing how many trends fit each type of time-series signature. However, the two groups of authors never verified that the trends or videos labeled as exogenous or endogenous indeed matched their labels. Here we use the time series data (among other characteristics) while manually coding identified trends as exogenous or endogenous in order to observe whether these categories show different time series effects.

While trend and event detection in news and blog posts has been studied in depth (Allan, 2002; Kleinberg, 2003; Sayyadi, Hurst, & Maykov, 2009), the detection of trends on Twitter is a topic that is still in its infancy (Petrovic, Osborne, & Lavrenko, 2010; Sakaki et al., 2010). For example, Sankaranarayanan et al. (2009) use clustering methods to identify trending topics (corresponding to news events) and their associated messages on Twitter. Looking at social text stream data from blogs and email messages, Zhao, Mitra, and Chen (2007) detect events using textual, social, and temporal document characteristics in the context of clustering with temporal segmentation and information flow-based graph cuts. Other research considers event and trend detection in other social media data, such as Flickr photographs (Becker et al., 2010; Chen & Roy, 2009; Rattenbury et al., 2007).

The related problem of information dissemination has also attracted substantial attention. As a notable example, recent work studies the diffusion of information in news and blogs (Gruhl, Guha, Liben-Nowell, & Tomkins, 2004; Leskovec, Backstrom, & Kleinberg, 2009). As another example, Jansen, Zhang, Sobel, and Chowdury (2009) study word-of-mouth activity around brands on Twitter. Trends identified in the Twitter data are, of course, both products and generators of information dissemination processes.

Several recent efforts attempt to provide analytics for trends and events detected or tracked on Twitter. Sakaki et al. (2010) study social, spatial, and temporal characteristics of earthquake-related tweets, and De Longueville, Smith, and Luraschi (2009) describe a method for using Twitter to track forest fires and the response to the fires by Twitter users. Starbird et al. (2010) described the temporal distribution, sources of information, and locations in tweets from the Red River Valley floods of April 2009. Nagarajan et al. (2009) downloaded Twitter data for three events over time and analyzed the topical, geographic, and temporal importance of descriptors (e.g., different keywords) that can help visualize the event data. Finally, Shamma et al. (2010), Diakopoulos and Shamma (2010), and Diakopoulos, Naaman, and Kivran-Swaine (2010) analyze the tweets corresponding to large-scale media events (e.g., the United States President’s annual State of the Union speech) to improve event reasoning, visualization, and analytics. These tasks may all be improved or better automated with the enhanced understanding of the Twitter trends that is the result of the work presented here.

FIG. 1. Trending terms, on the dark blue (middle) banner, on Twitter’s home page.

Trends on Twitter

Because of the quick and transient nature of its user posts, Twitter is an information system that provides a “real time” reflection of the interests and thoughts of its users, as well as their attention. As a consequence, Twitter serves as a rich source for exploring the mass attention of millions of its users, reflected in “trends” that can be extracted from the site.

For the purposes of this work, a trend on Twitter (sometimes referred to as a trending topic) consists of one or more terms and a time period, such that the volume of messages posted for the terms in the time period exceeds some expected level of activity (e.g., in relation to another time period or to other terms). According to this definition, trends on Twitter include our examples above, such as Michael Jackson’s death (with terms “Michael” and “Jackson,” and time period June 25, 2009), the final episode of Lost (with terms “Lost” and “finale,” and time period May 23, 2010), and the healthcare reform debate (with term “HCR” and time period May 25, 2010). This definition conveys the observation by Kleinberg (2003) that the “appearance of a topic in a document stream is signaled by a burst of activity, with certain features rising sharply in frequency as the topic emerges” but does not enforce novelty (i.e., a requirement that the topic was not previously seen). In Twitter’s own (very informal) definition, trends “are keywords that happen to be popping up in a whole bunch of tweets.” Figure 1 captures Twitter’s home page with several trending topics displayed at the top.

In this article, each trend t is then identified by a set Rt of one or more terms and a time period pt. For example, Figure 1 highlights one trend t that is identified by a single term, iOS4 (referring to the release of Apple’s mobile operating system). To analyze a trend t, we study the set Mt of associated messages during the time period that contain the trend terms (in our example, all messages with the string “iOS4”). Note that, of course, alternative definitions and representations of trends are possible (e.g., based on message clustering; Sankaranarayanan et al., 2009). However, for this work we decided to concentrate on the above term-based formulation, which reflects a model commonly used in other systems (e.g., by Twitter as well as other commercial engines such as OneRiot).

While detecting trends is an interesting research problem, we focus here instead on characterizing the trends that can be detected on Twitter with existing baseline approaches. For this, we collect detected trends from two different sources. First, we collect local trends identified and published hourly by Twitter; the trends are available via an application programmer interface (API) from the Twitter service. Second, to complement and expand the Twitter-provided trends, we run a simple burst-detection algorithm over a large Twitter dataset to identify additional trends. We describe these two trend-collection methods next.

Collecting Trend Data

In this section we describe the two methods we use to compile trends on Twitter, and also how we select the set of trends for analysis and how we get the associated messages, or tweets, for each trend. The set of trends T that we will analyze in this article consists of the union of the trends compiled using both methods below. We use two methods in order to control, at least to some degree, for bias in the type of trends that may be detected by one system, but not another. While other algorithms for trend detection exist, we strongly believe our selected methods will provide a representative sample of the type of trends that can be detected. The set of detected trends might be skewed towards some trend types in comparison to other methods, but this skewness does not affect the analysis in this work. We further address this issue in the limitations discussion below.

In subsequent sections we qualitatively examine a subset TQual of the trends in T to extract the key types of trends that are present in Twitter data and develop a set of dimensions according to which trends can be categorized. We then use the categories to compare the trends in (a different) subset of T, TQuant, according to several features computed from the data associated with each trend, such as the time dynamics of each trend and the interaction between users in the trend’s tweets. We examine whether trends from different categories show a significant difference in their computed features.

Tweets Dataset

The “base” dataset used for our study consists of over 48,000,000 Twitter messages posted by New York City users of Twitter between September 2009 and March 2010. This dataset is used in one of our methods described below to detect trends on Twitter (i.e., to generate part of our trend set T). The dataset is also used for identifying the set of tweets Mt for each trend t in our trend set T. (Recall that T consists of all the trends that we analyze, compiled using both methods discussed below.) We collected the tweets via a script for querying the Twitter API. We used a “whitelisted” server, allowed to make a larger number of API calls per day than the default quota, to continuously query the Twitter API for the most recent messages posted by New York City users (i.e., by Twitter users whose location, as entered by the users and shown on their profile, is in the New York City area). This querying method results in a highly significant set of tweets, but it is only a subsample of the posted content. First, we do not get content from New York users who did not identify their home location. Second, the Twitter search API returns a subsample of matching content for most queries. Still, we collected over 48,000,000 messages from more than 855,000 unique users.

For each tweet in our dataset, we record its textual content, the associated timestamp (i.e., the time at which the tweet was published), and the user ID of the user who published the tweet.

Trend Dataset I: Collecting Twitter’s Local Trending Terms

As mentioned above, one of our trend datasets consists of the trends computed by, and made available from, the Twitter service. Twitter computes these trends hourly, using an unpublished method. This source of trend data is commonly used in research efforts related to trends on Twitter (e.g., Kwak et al., 2010; Cheong & Lee, 2009).

The Twitter-provided trends are computed for various geographic scales and regions. For example, Twitter computes and publishes the trends for New York City, as well as for the United States, and across all the Twitter service (e.g., those shown in Figure 1). From the data, we can observe that location-based trends are not necessarily disjoint: for example, New York City trends can reflect national trends or overlap with other cities’ trends.

We collected over 8,500 trends published by Twitter for the New York City area during the months of February and March of 2010. The data included the one or two terms associated with each published trend, as well as the trend’s associated time period, expressed as a date and time of day. We use the notation Ttw (for “Twitter”) to denote this set of trends.

Trend Dataset II: Collecting Trends With Burst Detection

We derived the second trend dataset using a simple trend-detection mechanism over our Tweets dataset described above. This simple approach is similar to those used in other efforts (Nagarajan et al., 2009) and, as noted by Phelan, McCarthy, and Smyth (2009), it “does serve to provide a straightforward and justifiable starting point.” The trend-detection mechanism relies conceptually on the TF-IDF score (Salton, 1983) of terms, highlighting terms that appear in a certain time period much more frequently than expected for that time of day and day of the week. We tune this approach so that it does not assign a high score to weekly recurring events, even if they are quite popular, to ensure that we include a substantial fraction of trends that represent “one-time,” nonrecurring events, adding to the diversity of our analysis.

Specifically, to identify terms that appear more frequently than expected, we will assign a score to terms according to their deviation from an expected frequency. Assume that M is the set of all messages in our Tweets dataset, R is a set of one or more terms to which we wish to assign a score, and h, d, and w represent an hour of the day, a day of the week, and a week, respectively. We then define M(R, h, d, w) as the set of every Twitter message in M such that (1) the message contains all the terms in R and (2) the message was posted during hour h, day d, and week w. With this information, we can compare the volume in a specific day/hour in a given week to the same day/hour in other weeks (e.g., 10 am on Monday, March 15, 2010, vs. the activity for other Mondays at 10 am).

To define how we score terms precisely, let

$$\mathrm{Mean}(R, h, d) = \frac{1}{n} \sum_{i=1}^{n} |M(R, h, d, w_i)|$$

be the number of messages with the terms in R posted each week on hour h and day d, averaged over the weeks $w_1$ through $w_n$ covered by the Tweets dataset. Correspondingly, SD(R, h, d) is the standard deviation of the number of messages with the terms in R posted each week on day d and hour h, over all the weeks. Then, the score of a set of terms R over a specific

TABLE 1. Summary of event datasets.

Notation | Data source | Selection for analysis
Ttw | Twitter’s own trends as retrieved from the Twitter API | Selected from complete set of trends published by Twitter
Ttf | Trends computed from raw Twitter data using term frequency measures | Selected from top-scoring terms for each day
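The term-frequency scoring behind Ttf (the Trend Dataset II method summarized in Table 1) compares the message count for one hour, day, and week slot against the mean and standard deviation of the same hour/day slot across all weeks. Below is a minimal sketch of that computation for single-term sets R. Several details are assumptions rather than anything stated in the source: the helper names, identifying a week by its ISO (year, week) pair, whitespace tokenization, the use of the population standard deviation, and returning 0 when the standard deviation is 0.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def week_id(ts):
    """Identify a week by its ISO (year, week number) pair (an assumption)."""
    iso = ts.isocalendar()
    return (iso[0], iso[1])


def slot_counts(messages, term):
    """Return |M(R, h, d, w)| for a single-term R, keyed by (h, d, w).

    `messages` is an iterable of (datetime, text) pairs and `term` must be
    lowercase. Whitespace tokenization is a simplification for illustration.
    """
    counts = Counter()
    for ts, text in messages:
        if term in text.lower().split():
            counts[(ts.hour, ts.weekday(), week_id(ts))] += 1
    return counts


def burst_score(counts, hour, day, week, all_weeks):
    """Compute (|M(R,h,d,w)| - Mean(R,h,d)) / SD(R,h,d) over `all_weeks`."""
    per_week = [counts.get((hour, day, w), 0) for w in all_weeks]
    sd = pstdev(per_week)  # population SD; the source does not say which
    if sd == 0:
        return 0.0  # assumption: a completely flat history yields no burst
    return (counts.get((hour, day, week), 0) - mean(per_week)) / sd
```

Passing `all_weeks` explicitly matters: weeks in which the term never appeared must still contribute a zero count to the mean and standard deviation, otherwise rarely used terms would look less bursty than they are.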

hour h, day d, and week w is defined as

$$\mathrm{score}(R, h, d, w) = \frac{|M(R, h, d, w)| - \mathrm{Mean}(R, h, d)}{\mathrm{SD}(R, h, d)}.$$

Using this definition, we computed the score for every individual term in our dataset (in other words, we computed the scores for all R sets where each R is a set with a single 1-gram that appears in M). We computed the score for each R over all h, d, and w values for the weeks covered by our Tweets dataset. For each day d and week w, we identified the R and h pairs such that (1) M(R, h, d, w) contains at least 100 messages and (2) the score(R, h, d, w) value is among the top-30 scores for day d and week w across all term-hour pairs. Each selected pair defines a trend with set of terms R and associated time period specified by h, d, and w. (Note that certain terms could be repeated if they scored highly for multiple hours in the same day; such repetition is also possible for the trend set Ttw. We compute a trend’s “real” peak after we choose the trends for analysis, as described below.) We use the notation Ttf (for “term frequency”) to denote the resulting set of 1,500 trends.

For reference, the sources and properties of the event datasets are summarized in Table 1.

Selecting Trends for Analysis

After identifying the above two sets of trends, namely, Ttw and Ttf, our goal is to perform both a quantitative and a qualitative analysis of these trends. To be meaningful, this analysis will rely on a manual coding of the trends, but an exhaustive manual processing of all trends in Ttw and Ttf would, unfortunately, be prohibitively expensive. Therefore, our analysis will focus on a carefully selected subset of the two trend sets (see the Trend Taxonomy and Dimensions section). This selection of trends should (1) reflect the diversity of trends in the original sets and (2) include only trends that could be interpreted and understood by a human, through inspection of the associated Twitter messages.

For both sets Ttw and Ttf, one of the authors performed a random selection of trends to serve as an initial dataset. For each trend in this initial selection we attempted to identify the topic reflected in the trend by inspecting associated messages (posted on the corresponding day, and with the corresponding terms). If we could not identify the topic or reason for the trend, we removed it from the selected set to satisfy condition (2). In addition, after the first round of coding trends according to the categories described below, we manually inspected the trends from the initial sets Ttw and Ttf that were not yet selected for analysis. Instead of randomly choosing among them, we randomly chose a date and then purposefully selected additional trends from that date from underrepresented categories, satisfying condition (1). Note that we attempted to create a comprehensive, but not necessarily proportional, sample of trends in the data. In other words, some types of trends may be over- or underrepresented in the selected trends dataset. At the same time, the sample of trends in each category is representative of trends in the category overall. Our aim here is to provide insight about the categories of trends and features of trends in each category, rather than discuss the magnitude of each category in the data, a figure likely to shift, for example, with changes to the detection algorithms.

The result of this process was a set of trends T that combines trends from both Ttw and Ttf. We split the set T into two subsets. The first subset of selected trends, TQual, consisting of trends in T through February 2010, was used for the qualitative analysis described next. The second subset of T, TQuant, consisting of trends in T from March 2010, was used for the quantitative analysis described below. Table 2 lists several of the trends selected for the analysis: for each trend t, we list its description, time period, and number of associated messages (i.e., the cardinality of Mt). Next, we explain how we identify Mt for each trend t.

TABLE 2. Sample trends and their explanation.

Trend terms | Explanation | Date | # Tweets
#TEDxNYED | A New York City conference on media, technology, and education | March 6, 2010 | 556
Sparklehorse | The suicide of Mark Linkous, of the band Sparklehorse | March 8, 2010 | 230
Burger | Reaction to a tweet by Lady Gaga: “once you kill a cow, you gotta make a burger” | March 12, 2010 | 3,249
Masters | Tiger Woods’s announcement of his return to golf at the Masters | March 16, 2010 | 693
itsreallyannoying | Twitter meme: users sharing their annoyances | March 23, 2010 | 2,707
Seder | Passover-eve meal | March 28, 2010 | 316
iPad | Launch of the Apple iPad | March 29, 2010 | 1,714

Table 3 provides a summary of the datasets described in this section, along with their respective size.

Identifying Tweets Associated with Trends

For our statistical analysis of trend features, for each trend t in TQuant we need to know the set of tweets Mt associated with t. Each trend includes the terms that identify the trend

and the associated time period, as discussed (e.g., a trend might consist of term “Passover” on March 29, 2010, for the hour starting at 4 pm). To define Mt, we first collect every message in our Tweets dataset that contains all of t’s terms and such that it was posted up to 10 days before or after the time period for t. We sort these messages according to the time at which they were posted and we aggregate them into hourly bins. Since the identifying term(s) may be popular at various times (e.g., as is the case for a trend that persists for several hours), we identify the peak time for the trend by selecting the bin with the largest number of messages. Finally, after anchoring the trend in its new associated time period, we retrieve all messages posted up to 72 hours before or after the new time period; this set is Mt, the set of messages associated with trend t. On average, the set Mt for each trend in TQuant consists of 1,350 tweets, and the median cardinality of Mt is 573.

TABLE 3. Details of the trend datasets produced and used in this work.

| Dataset | Number of trends |
| --- | --- |
| Initial set of Twitter trends (Ttw) | >8,500* |
| Initial set of burst detection trends (Ttf) | >1,500* |
| Selected trends for qualitative analysis (TQual) | 50 |
| Selected trends for quantitative analysis (TQuant) | 200 |

*Including duplicate trending terms in different hours.

Note again that other methods exist for trend detection that may associate content with trends not only by simple term matching as we do here (Becker et al., 2010). However, most current systems rely on term matching to identify related content. In addition, many of the characteristics we extract for each trend’s content would directly apply, or apply with minor changes, to sets of content collected via other methods. We further discuss this issue as part of our limitations below.

Trend Taxonomy and Dimensions

We now describe the qualitative analysis that we performed to characterize the Twitter trends in the TQual set of trends described above. The analysis was geared to identify the different types of trends that occur in Twitter data from one metropolitan area and relies on a taxonomy of the trends.

Many Twitter trends correspond to events that are reflected on Twitter by its users. The definition and characterization of “event” has received substantial attention across academic fields, from philosophy (“Events,” 2002) to cognitive psychology (Zacks and Tversky, 2001). Media events have been characterized by Dayan and Katz (1992) into three generic types of scripts that these events tend to follow, namely, “contest,” “conquest,” and “coronation,” for events such as a presidential debate, an unfolding visit by a leader to a foreign state, and a leader’s funeral, respectively. Boll and Westermann (2003) present discussion of events in the area of personal multimedia collections. In information retrieval, the concept of events has prominently been addressed in the area of topic detection in news events (Allan, 2002; Kleinberg, 2003; Yang, Pierce, & Carbonell, 1998). To summarize, the aforementioned research from multiple disciplines is closely related to our work. However, the taxonomies available in these literatures do not capture the variety of trends that emerge in a social information system such as Twitter, which is our focus here.

Our qualitative analysis of trends is based on a variation of the affinity diagram method, an inductive process (LeCompte & Schensul, 1999) to extract themes and patterns from qualitative data. For this analysis we used sticky notes to represent each trend in TQual and recorded the terms and the explanation of the trend if needed, which happened when the terms associated with the trend did not immediately offer an idea of the content. Two of the authors of this article then put together the different items into groups and categories in an iterative process of comparing, contrasting, integrating, and dividing the grouped trends. According to the affinity process, we considered the relationship between categories as well as the items that are grouped and linked together.

Indeed, the categories that emerged could be described and differentiated according to one key dimension: whether the trends in the category are exogenous or endogenous. Trends in exogenous categories capture an activity, interest, or event that originated outside of the Twitter system (e.g., an earthquake). Trends in endogenous categories are Twitter-only activities that do not correspond to external events (e.g., a popular post by a celebrity). Having this dimension at the top level of the taxonomy reflects and highlights the substantial differences on Twitter between exogenous and endogenous trends regarding their importance and use scenarios. The top level of the taxonomy thus separates nonvirtual external events from activities that only pertain to the Twitter system.

The groups of trends that emerged are described below, with sample trends to illustrate each category.

Exogenous Trends

• Broadcast-media events:
  ◦ Broadcast of local media events: “fight” (boxing event), “Ravens” (football game).
  ◦ Broadcast of global/national media events: “Kanye” (Kanye West acts up at the MTV Video Music Awards), “Lost Finale” (series finale of Lost).
• Global news events:
  ◦ Breaking news events: “earthquake” (Chile earthquake), “Tsunami” (Hawaii Tsunami warning), “Beyoncé” (Beyoncé cancels Malaysia concert).
  ◦ Nonbreaking news events: “HCR” (health care reform), “Tiger” (Tiger Woods apologizes), “iPad” (toward the launch of Apple’s popular device).
• National holidays and memorial days: “Halloween,” “Valentine’s.”
• Local participatory and physical events:
  ◦ Planned events: “marathon,” “superbowl” (Super Bowl viewing parties), “patrick’s” (St. Patrick’s Day Parade).
  ◦ Unplanned events: “rainy,” “.”

Endogenous Trends

• Memes: #in2010 (in December 2009, users imagine their near future), “November” (users marking the beginning of the month on November 1).

• Retweets (users “forwarding” en masse a single tweet from a popular user): “determination” (users retweeting LL Cool J’s post about said concept).
• Fan community activities: “2pac” (the anniversary of the death of hip-hop artist Tupac Shakur).

Needless to say, the above set of categories might not be comprehensive (i.e., other trends that are not in our data might not comfortably fit in any of these categories). However, we developed this set of categories after an exhaustive, thorough analysis of a large-scale set of trends, as described above. Therefore, we believe that this categorization is both sufficiently broad and, at the same time, simple enough to enable a meaningful study of the “trends in trend data.”

Our quantitative analysis below focuses on a limited number of dimensions extracted from the taxonomy that capture key differences between trends. We identified the dimensions to focus on according to two criteria: (a) significance, or the importance of being able to extract differences between the selected trend categories, and (b) the likelihood that these categories will result in measurable differences between trends.

The first dimension we examine is the high-level exogenous and endogenous categories of trends. Such comparison will allow us to reason about this most distinguishing aspect of any Twitter trend.

Within exogenous trends, in this work we chose to concentrate on two important dimensions. First, whether the exogenous activity falls into the local participatory and physical events category above. These “local trends” represent physical events, located in one geographic area (e.g., the New York marathon) that are currently underrepresented in the detected trends, but naturally play an important role in local communities. The second dimension chosen is whether the exogenous trends are breaking news events, global news events that are surprising and have not been anticipated (e.g., an earthquake), as opposed to all other events and trends that are planned or expected (e.g., a vote in the Senate, or a holiday). This dimension will allow us to separate “news-worthy” versus “discussion-worthy” trends, which may lead to a different manner in which systems use and display these different trend types.

Similarly, within endogenous trends in this work we chose to investigate the differences between trends in the two main categories of this group of events, namely, memes and retweets, as explained above.

Next, these dimensions help us guide the quantitative study of the trends detected in Twitter data, as we label each trend according to categories derived from the dimensions above.

Characterizing Trends

The next step in our analysis is to characterize each Twitter trend using features of its associated messages. These features are later used to reason about differences between the various trend dimensions described in the previous section. For this analysis, we use the trend set TQuant defined above. For each trend t we compute features automatically, based on its associated set of tweets Mt. These features range from aggregate statistics of the content of each individual message (e.g., number of hashtags, URLs) to social network connections between the authors of the messages in Mt and the temporal characteristics of Mt.

Content Features

Our first set of features (Table 4a) provides descriptive characteristics for a trend t based on the content of the messages in Mt. These features include aggregate characteristics such as the average length of a message in Mt and the percentage of messages with URLs or hashtags, or measures of the textual similarity of the tweets in Mt.

Interaction Features

The interaction features (Table 4b) capture the interaction between users in a trend’s messages as indicated on Twitter by the use of the @ symbol followed by a username. These interactions have somewhat different semantics on Twitter, and include “retweets” (forwarding information), replies (conversation), or mentions of other users.

Time-Based Features

The time-based features (Table 4c) capture different temporal patterns of information spread that might vary across trends. To capture these features for a trend t, we fit a family of functions to the histogram describing the number of Twitter messages associated with the trend over the time period spanned by the tweet set Mt (by construction, as discussed, Mt has the matching messages produced up to 72 hours before and after t’s peak). We aggregate all messages in Mt into hourly bins. We refer to all bins before the peak as the head of the time period, while all bins after the peak are the tail of the time period.

We proceed to fit the bin volume data, for both the head and the tail of the time period, separately, to exponential and logarithmic functions. Using the least squares method, we compute logarithmic and exponential fit parameters for the head and tail periods for each trend, considering the full time period of 72 hours, which we refer to as the Log72 fit and the Exp72 fit, respectively. We proceed in the same manner for a limited time period of 8 hours before and after the peak, which we refer to as the Log8 fit and the Exp8 fit, respectively. The focus on the shorter time periods will allow us to better match rapidly rising or declining trends (Leskovec et al., 2009).

In sum, our features for each trend thus include the fit parameters for 8-hour and 72-hour spans for both the head and the tail periods; and for each period and span we calculate the logarithmic and exponential fit parameters. In addition, for each combination we also computed the R² statistic, which measures the quality of each fit.

Participation Features

Trends can have different patterns of participation, in terms of authorship of messages related to the trend. The

TABLE 4. Features for a trend t. As usual, Mt denotes the set of tweets associated with t.

Explanation

(a) Content features

Average number of words/characters Let words(m) be the number of words in a tweet m and let char(m) be the number of characters in tweet m. Then the average number of words per message is $\frac{1}{|M_t|}\sum_{m \in M_t} \mathrm{words}(m)$, and the average number of characters per message is $\frac{1}{|M_t|}\sum_{m \in M_t} \mathrm{char}(m)$.

Proportion of messages with URLs Let Ut ⊆ Mt be the set of messages with URLs out of all messages for trend t. Then the proportion of messages with URLs is |Ut|/|Mt|.

Proportion of unique URLs Let URL(m) be the set of URLs that appear in tweet m. The set of unique URLs for t is UUt, where UUt = {u: u ∈ URL(m) for a message m ∈ Mt}, and the proportion of unique URLs is |UUt|/|Mt|. (Note that the set semantics ensures that each unique URL is only counted once.)

Proportion of messages with hashtags Let Ht ⊆ Mt be the set of messages with hashtags in Mt. Then the proportion of messages with hashtags is |Ht|/|Mt|.

Proportion of messages with hashtags, excluding trend terms Let H′t ⊆ Mt be the set of messages with hashtags in Mt, excluding messages where the hashtag is a term in Rt, the set of terms associated with trend t. Then the proportion of messages with hashtags excluding the trend’s terms is |H′t|/|Mt|.

Top unique hashtag? Whether there is at least one hashtag that appears in at least 10% of the messages in Mt. This measure captures agreement on the terms most topically related to the trend.

Similarity to centroid We represent each message m ∈ Mt as a TF-IDF vector (Salton, 1983), where the IDF value is computed with respect to all messages in the Tweets dataset. We compute the average TF-IDF score for each term across all messages in Mt to define the centroid Ct. Using Ct, we then compute the average cosine similarity (Salton, 1983) $\frac{1}{|M_t|}\sum_{m \in M_t} \mathrm{sim}(C_t, m)$ as well as the corresponding standard deviation. These features help indicate content cohesiveness within a trend.

(b) Interaction features

Proportion of retweets Let RTt ⊆ Mt be the set of messages in Mt that are “retweets” (i.e., these messages include a string of the form “RT @user”). Then the proportion of retweets is |RTt|/|Mt|.

Proportion of replies Let RPt ⊆ Mt be the set of messages in Mt that are “replies” (i.e., these messages begin with a string of the form “@user”). Then the proportion of replies is |RPt|/|Mt|.

Proportion of mentions Let MNt ⊆ Mt be the set of messages in Mt that are “mentions” (i.e., these messages include a string of the form “@user” but are not replies or retweets as defined above). Then the proportion of mentions is |MNt|/|Mt|.

(c) Time-based features

Exponential fit (head) Best fit parameters (p0, p1, p2) and goodness of fit R² for function $M(h) = p_1 e^{-p_0 |h|} + p_2$, where M(h) represents the volume of messages during the h-th hour before the peak. Computed for 72- and 8-hour periods before the peak.

Logarithmic fit (head) Best fit parameters (p0, p1) and goodness of fit R² for function $M(h) = p_0 \log(h) + p_1$, where M(h) represents the volume of messages during the h-th hour before the peak. Computed for 72- and 8-hour periods before the peak.

Exponential fit (tail) Similar to above, but over 72- and 8-hour periods after the trend’s peak.

Logarithmic fit (tail) Similar to above, but over 72- and 8-hour periods after the trend’s peak.

(d) Participation features

Messages per author Let At = {a: a is an author of a message m ∈ Mt}. Then the number of messages per author is |Mt|/|At|.

Proportion of messages from top author We designate a′ ∈ At as the top author if a′ produced at least as many messages in Mt as any other author. Then the proportion of messages from top author is |{m: m ∈ Mt and m was posted by a′}|/|Mt|.

Proportion of messages from top 10% of authors Let A10t be the set of the top 10% of the authors in terms of the number of messages produced in Mt. Then the proportion of messages from top 10% authors is |{m: m ∈ Mt and m was posted by a ∈ A10t}|/|Mt|.

(e) Social network features

Level of reciprocity The fraction of reciprocal connections out of the total number of connections |Et|, where authors a1,a2 ∈ At form a reciprocal connection if (a1,a2) ∈ Et and (a2,a1) ∈ Et.

Maximal eigenvector centrality The eigenvector centrality of an author measures the importance of this author in At by computing the eigenvector of the largest eigenvalue in the adjacency matrix of the network graph. We pick the author with the highest eigenvector centrality value over all a ∈ At. A high value suggests the existence of a dominant node in the network.

Maximal degree centrality The degree centrality of an author a ∈ At is the fraction of authors it is connected to. We compute the highest degree centrality value over all a ∈ At. A high value suggests the existence of a dominant node in the network.


Transitivity Authors a1,a2,a3 ∈ At form a triad if one of them shares an edge with the other two authors. If each author shares an edge with both remaining authors, then they also form a triangle. The transitivity of the graph is computed as 3·triangles(Gt)/triads(Gt), and is a measure of the connectivity of the authors in the network.

Density The ratio of the number of edges |Et| in the author social network, out of the total number of possible edges |At|·(|At|−1). This is an indication of how closely-knit the group of authors At is.

Average component size A strongly connected component of Gt is a maximal set of authors AC ⊆ At such that for each a1,a2 ∈ AC there exists a directed path from a1 to a2 and vice versa. The average component size is the number of authors (i.e., nodes in the network) divided by the number of strongly connected components. This is a normalized measure of how “broken apart” the network of authors is.

Proportion of largest component Let AC’ be the largest connected component. Then, |AC’|/|At| is a measure of how connected the network of authors is.
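As a rough illustration of two of the social network features in Table 4e, the sketch below computes the level of reciprocity and the density from a list of follower edges. The function names, the encoding of an edge as a tuple (a1, a2) meaning "a1 follows a2", and the decision to count each direction of a mutual pair as one reciprocal connection are assumptions made for this example; the table does not spell out these details.

```python
def reciprocity(edges):
    """Fraction of directed edges (a1, a2) whose reverse (a2, a1) is also present.

    Assumption: each direction of a mutual pair counts once toward the numerator.
    """
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    reciprocal = sum(1 for (a1, a2) in edge_set if (a2, a1) in edge_set)
    return reciprocal / len(edge_set)


def density(authors, edges):
    """Number of edges divided by |At| * (|At| - 1), the count of possible directed edges."""
    n = len(set(authors))
    if n < 2:
        return 0.0
    return len(set(edges)) / (n * (n - 1))


# Hypothetical three-author network: ann and bob follow each other, cat follows ann.
authors = ["ann", "bob", "cat"]
edges = [("ann", "bob"), ("bob", "ann"), ("cat", "ann")]
print(reciprocity(edges))       # two of the three edges are reciprocated
print(density(authors, edges))  # three edges out of six possible
```

Both functions deduplicate edges first, so a follower list that repeats a pair does not inflate either measure.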

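The logarithmic fit listed in Table 4c is linear in log(h), so ordinary least squares has a closed form. The sketch below is one possible way to obtain (p0, p1) and R² for a list of hourly volumes; the function name `log_fit`, the convention that the first list entry corresponds to h = 1, and the sample volumes are assumptions for illustration only. The exponential fit has three parameters and would need a nonlinear solver, which is not shown here.

```python
import math


def log_fit(volumes):
    """Least-squares fit of M(h) = p0 * log(h) + p1 for h = 1..len(volumes).

    Returns (p0, p1, r_squared). Assumes volumes[0] is the bin at h = 1.
    """
    if len(volumes) < 2:
        raise ValueError("need at least two hourly bins")
    xs = [math.log(h) for h in range(1, len(volumes) + 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(volumes) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, volumes))
    p0 = sxy / sxx
    p1 = mean_y - p0 * mean_x
    ss_res = sum((y - (p0 * x + p1)) ** 2 for x, y in zip(xs, volumes))
    ss_tot = sum((y - mean_y) ** 2 for y in volumes)
    # A constant series is fit exactly by p0 = 0, so report R^2 = 1 in that case.
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return p0, p1, r_squared


# Hypothetical 8-hour tail whose volume decays exactly logarithmically.
tail = [100 - 20 * math.log(h) for h in range(1, 9)]
p0, p1, r2 = log_fit(tail)
print(round(p0, 6), round(p1, 6), round(r2, 6))
```

Calling the same function on the first 8 bins and on all 72 bins of a head or tail would give analogues of the Log8 and Log72 fits, under the assumptions above.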
participation features (Table 4d) characterize a trend using statistics about the participation of authors that produced the trend’s associated messages; in particular, we capture the skew in participation (i.e., to which extent a small portion of authors produced most of the content).

Social Network Features

Our final group of features for a trend t focuses on the set At of the authors of the messages in Mt. Specifically, the social network features (Table 4e) capture the properties of the social network Gt of authors. To model this network, we used the Twitter API to collect the list of followers for each author, consisting of other Twitter users in At that subscribe to the author’s message feed. (We ignore followers that are not among the At authors. We also ignore followers of authors who restrict access to this information and those who have suspended Twitter accounts.) In other words, our social network graph is a directed graph Gt(At,Et), such that there exists an edge e ∈ Et from a1 to a2 if and only if a1 is a follower of a2 on Twitter. We computed various features of the social network graph Gt for each trend t, capturing the connectivity and structure of connections in the graph (Wasserman and Faust, 1994).

Categorizing Trends in Different Dimensions

In addition to the automatically extracted features, we manually categorized the trends in TQuant according to the dimensions picked for analysis (e.g., whether the trend belongs to the “exogenous” or “endogenous” category). We manually associated every trend with one category in each dimension. Later, we examined how the categories differ according to the automatically computed features described above.

We required a content description of each trend in order to properly label it according to the categories introduced in the previous section. The trend detection methods only output the trend terms and a time period. This type of output (e.g., “Bacall on March 8th”) was often not enough to discern the content of the trend to correctly assign it to different categories. One of the authors examined each of the trends to generate a short description. The sources used for this examination were, first, the actual Twitter messages associated with the trend. If that examination did not prove informative enough, we used news search tools (e.g., Google News) to inspect corresponding news reports for that day and those terms. At the end of the process we had a description for 200 of the trends in our trend dataset TQuant, after removing 29 trends that could not be resolved (e.g., “challenging” on March 14, 2010) from our dataset. We computed these features for the 200 resolved trends in our trend dataset TQuant. This data is the basis for our analysis, described below.

We mapped each of the trends into categories based on the dimensions for analysis. Two of the authors independently annotated each of the trends. If an annotator could not assign a value for some dimension, either a “not applicable” or an “unknown” label was used. In each dimension, after removing trends marked “not applicable” or “unknown” by at least one of the annotators, the inter-annotator agreement of the labeled trends in each dimension was very high (the remaining number of trends for each dimension is reported below, in the analysis). For the final analysis in each dimension we removed all “not applicable” and “unknown” entries for that dimension, as well as any remaining disagreements between the annotators. In other words, we ignored those trends for which we had reason to doubt the assignment to a category.

Quantitative Analysis of Trends

The main drivers for our analysis are the coded categories of trends, as detailed above. In other words, we compared the samples of trends according to their categorization in different dimensions (e.g., exogenous vs. endogenous) and according to the features we computed from the data (e.g., the percentage of messages with URLs). Our hypotheses, listed below, are guided by intuitions about deviations in the characteristics of trends in different categories, and are geared towards confirming the expected deviations between the trend categories. Such confirmation would allow, later on, for the

development of automated systems to detect the trend type or provide better visualizations or presentation of the trend data. We continue by listing the key hypotheses that guided our analysis.

Exogenous vs. Endogenous Trends

H1. We hypothesize that exogenous and endogenous trends will have different quantitative characteristics. In particular:

H1.1: Content features of exogenous trends will be different than those of endogenous trends; in particular, they will have a higher proportion of URLs and a smaller proportion of hashtags in tweets.
H1.2: Interaction features of exogenous trends will be different than those of endogenous trends; in particular, exogenous trends will have fewer retweets (forwarding), and a similar number of replies (conversation).
H1.3: Time features of exogenous trends will be different for the head period before the trend peak but will exhibit similar time features in the tail period after the trend peak, compared to endogenous trends.
H1.4: Social network features of exogenous trends will be different than those of endogenous trends, with fewer connections (and less reciprocity) in the social network of the trend authors.

Breaking News vs. Other Exogenous Trends

H2. We hypothesize that breaking news events will have different quantitative characteristics compared to other exogenous trends. In particular:

H2.1: Interaction features of breaking events will be different than those of other exogenous trends, with more retweets (forwarding), but fewer replies (conversation).
H2.2: Time features of breaking events will be different for the head period, showing more rapid growth, and a better fit to the functions’ curve (i.e., less noise) compared to other exogenous trends.
H2.3: Social network features of breaking events will be different than those of other exogenous trends.

Local Events vs. Other Exogenous Trends

H3. We hypothesize that local participatory and physical events will have different quantitative characteristics compared to other exogenous trends. In particular:

H3.1: Content features of local events will be different than those of other exogenous trends.
H3.2: Interaction features of local events will be different than those of other exogenous trends; in particular, local events will have more replies (conversation).
H3.3: Time features of local events will be different than those of other exogenous trends.
H3.4: Social network features of local events will be different than those of other exogenous trends; in particular, local events will have denser networks, more connectivity, and higher reciprocity.

Memes vs. Retweet Endogenous Trends

H4. We hypothesize that memes will have different quantitative characteristics compared to retweet trends. In particular:

H4.1: Content features of memes will be different than those of retweet trends.
H4.2: Interaction features of memes will be different than those of retweet trends; in particular, retweet trends will have significantly more retweet (forwarding) messages (this hypothesis is included as a “sanity check” since the retweet trends are defined by having a large proportion of retweets).
H4.3: Time features of memes will be different than those of retweet trends.
H4.4: Participation features of memes will be different than those of retweet trends.
H4.5: Social network features of memes will be different than those of retweet trends; in particular, meme trends will have more connectivity and higher reciprocity than retweet trends.

Method

We performed our analysis on the 200 resolved trends in TQuant. The analysis was based on a pairwise comparison of trends according to the trends’ categorization in different dimensions, following our hypotheses above. For each such pair we performed a set of two-tailed t-tests to show whether there are differences between the two sets of trends in terms of the dependent variables, namely, our automatically extracted trend features. However, since each sub-hypothesis involved multiple dependent variables (e.g., we computed seven different social network features), we controlled for the multiple t-tests by using the Bonferroni correction, which asks for a significance level of α/n when conducting n tests at once. We thus only report here results with significance level of p < 0.008.

As is common in studies of social-computing activities, many of our dependent variables were not normally distributed, but rather they were most often skewed to the right. Following Osborne (2002), we used logarithms (adding a small constant to handle zero values as needed) or square root functions to transform these variables in order to improve their normality. For most variables, such transformation indeed generated a normal distribution. In the cases where we performed a variable transformation, whenever we find significant differences between the transformed means in the analysis we also report here the original variable means and medians. For variables that were still skewed after the transformation, we performed the Mann–Whitney test for nonnormal distributions, and note when that is the case. For the one dependent variable in our data that was nominal, we used the chi-square test.

Finally, following Asur and Huberman (2010), in the analysis we considered the temporal features only for trends that peaked on a US weekday (Monday through Friday), as the

TABLE 5. Summary of results.

| Categories compared | | Content | Interaction | Time | Participation | Social |
| --- | --- | --- | --- | --- | --- | --- |
| Exogenous vs. endogenous | Hypothesis | H1.1 | H1.2 | H1.3 | None | H1.4 |
| | Found | Yes∗ ✔ | Yes∗ ✔ | No ✗ | No | Yes ✔ |
| Breaking news vs. other exogenous | Hypothesis | None | H2.1 | H2.2 | None | H2.3 |
| | Found | No | Yes ✔ | No ✗ | No | No ✗ |
| Local vs. other exogenous | Hypothesis | H3.1 | H3.2 | H3.3 | None | H3.4 |
| | Found | No ✗ | Yes∗ ✔ | No ✗ | No | No ✗ |
| Memes vs. retweets (endogenous) | Hypothesis | H4.1 | H4.2 | H4.3 | H4.4 | H4.5 |
| | Found | Yes ✔ | Yes ✔ | Yes∗ ✔ | Yes ✔ | Yes ✔ |

∗Starred entries represent partial findings or findings that diverged somewhat from the detailed hypothesis.

temporal aspects in particular might be influenced by the different patterns of Twitter usage during weekends.

Results and Discussion

We report below the results from our analysis. For convenience, an overview of the results and findings as they related to the hypothesis is provided in Table 5.

Exogenous vs. Endogenous Trends

Exogenous trends were found to be different than endogenous trends in content, interaction, and social features, supporting most of the hypotheses under H1 as shown in Table 5. In our dataset we had 115 exogenous trends and 55 endogenous trends (for some parts of the analysis the numbers are lower due to missing data). The detailed numerical results are shown in Table 6a,b.

In terms of content features (H1.1), exogenous trends had a higher proportion of messages with URLs than endogenous trends (results were similar for the proportion of unique URLs appearing in the trend’s content). In addition, the average term length for exogenous trends was somewhat shorter than the length of terms used in endogenous trends. We found only some differences in the presence of hashtags in the content: exogenous trends did not have a higher proportion of messages with hashtags, even when excluding the trending terms. However, fewer exogenous trends had a unique hashtag appearing in at least 10% of the messages compared to endogenous trends. This finding indicates less agreement between authors of exogenous trends on the ad-hoc “semantics” of the trend (in other words, the chosen community representation for what that trend content is about), which may stem from the fact that exogenous trends are seeded at once from many users who choose different hashtags to represent the trend.

In terms of interaction features (H1.2), we found that exogenous trends had a smaller proportion of retweets in the trend’s tweets compared to endogenous trends. This finding suggests that users created more original content based on exogenous sources, rather than retransmit and forward content that was already in the “system” as often happens for endogenous trends. Interestingly, we also found that exogenous trends tend to have more “conversation”: the proportion of replies in exogenous trends was higher than endogenous ones.

In terms of time features (H1.3), the hypothesis was not supported: our data did not show exogenous trends to have different time features for the head period. The tail period time parameters were, as we hypothesized, not found to be different for exogenous and endogenous trends.

Finally, in terms of social network features (H1.4), we found differences in the level of reciprocity between exogenous and endogenous trends. Social network connections in exogenous trends had less reciprocity than those of endogenous trends. Other differences were found but with marginal significance.

Breaking News vs. Other Exogenous Trends

Trends corresponding to breaking events were found to have different interaction characteristics from other exogenous trends, but no other differences were found, giving only partial support to hypothesis H2 (Table 5). In our dataset, we had 33 breaking events and 63 other exogenous events (for some parts of the analysis the numbers are lower due to missing data). The detailed numerical results are shown in Table 6c.

In terms of interaction features (H2.1), we found that breaking exogenous trends have a larger proportion of retweet messages than other exogenous events. Breaking trends also have a smaller proportion of reply messages than other exogenous events. These findings show the informational nature of breaking events, which focus on information transmission rather than conversation.

Hypotheses H2.2 and H2.3 were not confirmed, however, finding no significant differences in time features

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, May 2011. DOI: 10.1002/asi

TABLE 6a. Quantitative analysis results of exogenous/endogenous categories.

        URL Proportion∗         Top Unique Hashtag¶     Term Length (chars)†    Retweet Proportion‡     Reply Proportion§
        Exo         Endo        Exo (Y/N)   Endo (Y/N)  Exo         Endo        Exo         Endo        Exo         Endo
N       115         55          47/68       36/19       115         55          115         55          115         55
Mean    0.34        0.144       –           –           5.31        6.13        0.32        0.47        0.094       0.083
Median  0.307       0.058       –           –           –           –           0.26        0.38        0.081       0.028

∗log-transformed, t = 6.117, p < 0.001. ¶χ² = 9, p < 0.002. †t = −5.119, p < 0.001. ‡sqrt-transformed, t = −3.865, p < 0.001. §log-transformed, t = 2.98, p < 0.003.
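The Top Unique Hashtag column compares counts of trends rather than means, so it is reported with a χ² statistic. The following is a minimal sketch, assuming an uncorrected Pearson chi-square on the 2×2 table of Y/N counts; the table does not state which variant was used, and the function name is illustrative. Under that assumption the Table 6a counts give a value close to the reported χ² = 9.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    # Closed form for a 2x2 table: N * (ad - bc)^2 / (product of the margins).
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Counts from Table 6a: trends with / without a top unique hashtag.
exo_yes, exo_no = 47, 68
endo_yes, endo_no = 36, 19

chi2 = pearson_chi_square_2x2(exo_yes, exo_no, endo_yes, endo_no)
print(round(chi2, 2))  # 9.0
```

The sketch computes only the statistic; obtaining a p-value would additionally require the chi-square distribution with one degree of freedom.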

TABLE 6b. Quantitative analysis results of exogenous/endogenous categories.

        Reciprocity†
        Exo         Endo
N       114         54
Mean    0.26        0.33
Median  –           –

†t = −6.87, p < 0.001.

between breaking exogenous events and other exogenous events. We note, however, that the R² quality of fit on the Exp72 time fit parameters for the tail period was significantly different between breaking and other events, with breaking events having a better fit on average than other events. Similar yet marginal differences were found for Log72 fit parameters. This difference might suggest that the breaking events, after the peak, are less noisy than other exogenous events, with discussion levels dropping more “smoothly.”

Local Events vs. Other Exogenous Trends

We found limited support that local events have different characteristics from other exogenous trends (H3). In particular, our data surfaced differences between interaction features of local events and other exogenous trends (Table 5). In our dataset we had 12 local events and 96 other exogenous trends (for some parts of the analysis the numbers are lower due to missing data). We note that the analysis was limited by the small number of local events in our trends dataset. The detailed numerical results are shown in Table 6d.

We could confirm only one difference in terms of interaction features between local and other exogenous trends (H3.2), where local events have a smaller proportion of messages that are retweets than other exogenous trends. In addition, our analysis suggests that local events might be more conversational, in terms of the proportion of messages that are replies, than other exogenous trends; however, the result for replies is not significant at the level we require for reporting in this paper, and thus cannot be fully confirmed. We thus can provide only partial support to H3.2.

Finally, we found no support for H3.3, as the differences in time features between local events and other exogenous trends could not be confirmed.

Memes vs. Retweet Endogenous Trends

Looking at endogenous trends, “retweet” trends were found to be different from “meme” trends (H4) in content, interaction, time, participation, and social network features (Table 5). In our dataset, we had 29 memes and 27 retweet trends (for some parts of the analysis the numbers are lower due to missing data). The detailed numerical results are shown in Table 6e–g.

In terms of content (H4.1), we found several differences between the retweet trends and meme trends. Retweet trends have a larger proportion of messages with URLs than meme trends, and a higher proportion of unique URLs. More meme trends have a single hashtag that appears in more than 10% of the trend’s messages. Accordingly, meme trends have a larger proportion of hashtags per message than retweet trends, but are not different when we remove the trending terms from consideration (indeed, memes are often identified by the hashtag that the relevant messages contain). Finally, retweet trends have more textual terms in the tweets than meme trends, and the retweet trend tweets are longer on average than meme trend tweets, and are even longer when counting characters in URLs (not reported here). However, these differences may be attributed to the “RT @username” phrase added to individual retweet messages, which were more common, of course, for retweet trends (this observation was true at the time the data was collected; since then, Twitter has changed its format for retweet messages, so that “RT @username” does not always appear in the data).

Supporting H4.2, retweet trends naturally have a significantly greater proportion of messages that are retweets than meme trends. Retweet trends also have a greater proportion of replies, showing that they are slightly more conversational than meme trends. The retweet and meme trend categories are not different in their proportions of mentions.

Regarding time features, addressing H4.3, we found one difference between the time fit parameters: the head fit parameter Log8 head p1, indicating different growth for the retweet trends. This finding may suggest that retweet trends develop in a different manner than meme trends.

TABLE 6c. Quantitative analysis results of breaking/other categories.

        Retweet Proportion∗     Reply Proportion¶       Exp72 tail R²†          Log72 tail R²§
        Breaking    Other       Breaking    Other       Breaking    Other       Breaking    Other
N       33          63          33          63          26          37          26          39
Mean    0.39        0.28        0.058       0.11        0.0119      0.0041      0.589       0.386
Median  0.35        0.21        0.049       0.101       –           –           –           –

∗sqrt-transformed, t = 3.2, p < 0.002. ¶log-transformed, t = −2.7, p < 0.008. †t = 3.554, p < 0.001. §t = 4.508, p < 0.001.
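Several footnotes in these tables report a t statistic computed after a square-root or log transformation of a proportion. The sketch below illustrates the general idea only: it assumes Welch's unequal-variance form of the two-sample t statistic (the tables do not say which form was used), and the input values are made up for illustration rather than taken from the study's data.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    standard_error = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


def sqrt_transform(proportions):
    """Square-root transform, commonly applied to skewed proportions."""
    return [math.sqrt(p) for p in proportions]


# Hypothetical retweet proportions per trend (illustrative values only).
breaking = [0.45, 0.38, 0.52, 0.30, 0.41]
other = [0.22, 0.31, 0.18, 0.27, 0.25]

t_stat = welch_t(sqrt_transform(breaking), sqrt_transform(other))
print(t_stat > 0)  # True: the first group has the larger transformed mean
```

A log transformation would follow the same pattern, but proportions equal to zero would need special handling before taking the logarithm.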

TABLE 6d. Quantitative analysis results of local/other categories.

        Retweet Proportion∗
        Local       Other
N       12          96
Mean    0.18        0.345
Median  0.148       0.2

∗sqrt-transformed, t = −4.82, p < 0.001.

Looking at the participation features (H4.4), we found a number of significant differences between retweet and meme trends, supporting the hypothesis. Meme trends have more messages per author on average than retweet trends in a statistically significant manner (we performed the Mann–Whitney test due to the nonnormal distribution of this parameter). In addition, meme trends have a higher proportion of messages from the single top author than retweet trends, as well as a higher proportion of messages from the top 10% of authors than retweet trends. These results show a significant difference in participation between these types of trends, where meme trends are more skewed, and a limited number of users are responsible for a fairly significant part of the content, and retweet trends are more “democratic” and participatory.

Finally, retweet trends were significantly different from meme trends in a number of social network features, confirming H4.5. In terms of the proportion of reciprocation in the social network of the trends’ authors, retweet trends had a lower level of reciprocated ties than meme trends. Meme trends also had a higher average size of strongly connected components than retweet trends. These findings suggest that retweet trends are supported by a network that, while showing the same density, builds on directional, informational ties more than meme trends that are supported by communication and reciprocity.

Discussion

The results of our quantitative analysis provide a strong indication that we can use the characteristics of the messages associated with a trend to reason about the trend, for example, to better understand the trend’s origin and context.

In particular, we found that exogenous trends, originating from outside the Twitter system but reflected in the activity of users in the system, are different in a number of important features from endogenous trends, which start and develop in the Twitter “universe.” Connections between the authors of messages in endogenous trends tend to be more symmetrical (i.e., with higher reciprocity) than in exogenous trends, suggesting perhaps that endogenous trends require stronger ties to be “transmitted.” However, we also expected the density of the endogenous trend networks and the average degree of their nodes to be higher, but did not find any such differences. Differences between these two categories of trends were not evident in the temporal features, where the results did not support our hypothesis of a more rapid curve leading to the peaks of the exogenous trends. However, the differences between these categories are further supported by the deviations in content and interaction features between the categories: more URLs, unique URLs, and unique hashtags, as well as a smaller proportion of retweets, show that exogenous trends generate more independent contributions than endogenous trends do.

In a deeper examination of the differences between categories of exogenous trends, we found only interaction differences between trends representing “breaking” events and other types of exogenous trends: breaking events are, naturally perhaps, more “informational” and less “conversational” in nature than other trends. Significantly, we could not confirm the hypothesis from Sakaki et al. (2010) that breaking events will be more disconnected, as multiple contributors will independently contribute messages with less in-network coordination. However, one possible reason for not seeing this effect in the data is the long period of content (72 hours before and after a trend’s peak) over which we calculate the author networks. Perhaps focusing on the connection between authors in the first hours of a trend would capture these differences between breaking events and other trends.

Trends capturing local events were found to be only slightly different from other exogenous trends, mainly with respect to the interaction features. People discuss more, and forward information less, in the context of local events as compared to other exogenous trends. Note again that we have a low number of local events represented in our trend dataset and these findings should be considered tentative. Yet, it is possible that differences between local events and other trends would be even more pronounced when more data is available.

Finally, we have shown that even endogenous trends, which grow and develop from within the Twitter system

TABLE 6e. Quantitative analysis results of meme/retweet categories.

        URL Proportion∗         Unique URL Proportion¶  Hashtag Proportion†     Length (term)‡          Length (chars)§
        Memes       RTs         Memes       RTs         Memes       RTs         Memes       RTs         Memes       RTs
N       27          22          27          22          27          22          27          22          27          22
Mean    0.044       0.24        0.035       0.064       0.989       0.275       13.19       17.83       80.3        106.1
Median  0.032       0.103       0.029       0.06        0.998       0.092       –           –           –           –

∗log-transformed, t = −4.231, p < 0.001. ¶log-transformed, t = −2.759, p < 0.008. †t = 5.552, p < 0.001. ‡t = −5.156, p < 0.001. §t = −4.621, p < 0.001.
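The content features in Table 6e are aggregates over the messages of a trend. The following is a rough sketch of how such aggregates could be computed; the regular expressions, the whitespace tokenization, and the exact definitions (for example, unique URLs divided by the number of messages, and hashtag proportion as the share of messages containing a hashtag) are assumptions made for illustration, not the study's definitions. The sample messages are invented.

```python
import re

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")


def content_features(messages):
    """Aggregate simple content statistics over one trend's messages."""
    n = len(messages)
    if n == 0:
        raise ValueError("need at least one message")
    all_urls = [url for m in messages for url in URL_RE.findall(m)]
    return {
        # Share of messages containing at least one URL.
        "url_proportion": sum(bool(URL_RE.search(m)) for m in messages) / n,
        # Distinct URLs relative to the number of messages (assumed definition).
        "unique_url_proportion": len(set(all_urls)) / n,
        # Share of messages containing at least one hashtag (assumed definition).
        "hashtag_proportion": sum(bool(HASHTAG_RE.search(m)) for m in messages) / n,
        # Average length in whitespace-separated terms and in characters.
        "length_terms": sum(len(m.split()) for m in messages) / n,
        "length_chars": sum(len(m) for m in messages) / n,
    }


# Hypothetical messages, not taken from the study's dataset.
sample = [
    "RT @user: big news http://example.com/a",
    "so true #musicmonday",
    "reading http://example.com/a now #news",
    "good morning",
]
features = content_features(sample)
print(features["url_proportion"])         # 0.5
print(features["unique_url_proportion"])  # 0.25
print(features["hashtag_proportion"])     # 0.5
```

Note that the "RT @user:" prefix counts toward both length measures in this sketch, which is the effect the text cautions about when comparing retweet and meme trends.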

TABLE 6f. Quantitative analysis results of meme/retweet categories.

        Term Lengths (chars)∗   Top Unique Hashtag¶     Retweet Proportion†     Reply Proportion‡       Log8 head p1§
        Memes       RTs         Memes (Y/N) RTs (Y/N)   Memes       RTs         Memes       RTs         Memes       RTs
N       27          22          27/0        7/15        27          22          27          22          7           15
Mean    6.65        5.57        –           –           0.309       0.692       0.029       0.079       0.055       −0.109
Median  –           –           –           –           0.277       0.739       0.018       0.0565      –           –

∗t = 4.017, p < 0.001. ¶χ² = 27.99, p < 0.001. †sqrt-transformed, t = −8.633, p < 0.001. ‡log-transformed, t = −3.704, p < 0.001. §t = 3.549, p < 0.002.
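The retweet and reply proportions in Table 6f depend on labeling each message by its interaction type. A minimal sketch follows, assuming the conventions of the period: a retweet contains the “RT @username” phrase mentioned in the text, a reply starts with “@username”, and a mention is any other message containing “@username”. These rules and the function name are illustrative assumptions; the sample messages are invented.

```python
import re

RETWEET_RE = re.compile(r"\bRT @\w+")
MENTION_RE = re.compile(r"@\w+")


def interaction_features(messages):
    """Proportions of retweets, replies, and other mentions in a trend."""
    n = len(messages)
    if n == 0:
        raise ValueError("need at least one message")
    retweets = replies = mentions = 0
    for message in messages:
        if RETWEET_RE.search(message):
            retweets += 1
        elif message.startswith("@"):
            replies += 1
        elif MENTION_RE.search(message):
            mentions += 1
    return {
        "retweet_proportion": retweets / n,
        "reply_proportion": replies / n,
        "mention_proportion": mentions / n,
    }


# Hypothetical messages, one of each kind.
sample = [
    "RT @alice: concert tonight",
    "@bob are you going?",
    "great show, thanks @carol",
    "just landed in NYC",
]
interaction = interaction_features(sample)
print(interaction["retweet_proportion"])  # 0.25
print(interaction["reply_proportion"])    # 0.25
```

Because the retweet format later changed so that “RT @username” does not always appear, a text-only rule like this one would undercount retweets in more recent data.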

TABLE 6g. Quantitative analysis results of meme/retweet categories.

        Messages/author∗          Messages/top author¶      Messages/top-10% author†  SCC size (avg)‡           Reciprocated Ties§
        Memes        RTs          Memes        RTs          Memes        RTs          Memes        RTs          Memes        RTs
N       27           22           27           22           27           22           27           22           27           22
Mean    2.067        1.171        0.042        0.019        0.382        0.182        1.861        1.332        0.363        0.3
Median  –            –            0.018        0.01         0.383        0.157        1.582        1.239        –            –

∗Mann–Whitney Z = 5.44, p < 0.001. ¶log-transformed, t = 2.793, p < 0.008. †sqrt-transformed, t = 8.814, p < 0.001. ‡log-transformed, t = 3.53, p < 0.001. §t = 3.936, p < 0.001.
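Table 6g combines participation features (how messages are distributed over authors) with social network features (reciprocated ties and strongly connected components). The sketch below shows one plausible way to compute such quantities. The definitions are assumptions made for illustration: reciprocity is taken as the share of directed edges whose reverse edge also exists, the top 10% of authors is rounded up to at least one author, and components are found with Kosaraju's algorithm. The example data is invented.

```python
from collections import Counter, defaultdict


def participation_features(authors):
    """authors: one author id per message in the trend."""
    counts = sorted(Counter(authors).values(), reverse=True)
    top_k = max(1, (len(counts) + 9) // 10)  # ceil(10% of authors), at least 1
    return {
        "messages_per_author": len(authors) / len(counts),
        "top_author_share": counts[0] / len(authors),
        "top_10pct_share": sum(counts[:top_k]) / len(authors),
    }


def reciprocity(edges):
    """Share of directed edges (u, v) whose reverse (v, u) also exists."""
    edge_set = set(edges)
    return sum((v, u) in edge_set for u, v in edge_set) / len(edge_set)


def average_scc_size(nodes, edges):
    """Mean size of strongly connected components (Kosaraju's algorithm)."""
    forward, backward = defaultdict(list), defaultdict(list)
    for u, v in edges:
        forward[u].append(v)
        backward[v].append(u)

    def dfs(start, graph, seen):
        # Iterative depth-first search; returns reached nodes in finish order.
        order, stack = [], [(start, iter(graph[start]))]
        seen.add(start)
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()
        return order

    seen, finish_order = set(), []
    for node in nodes:
        if node not in seen:
            finish_order.extend(dfs(node, forward, seen))

    # Second pass on the reversed graph: each search tree is one component.
    seen, sizes = set(), []
    for node in reversed(finish_order):
        if node not in seen:
            sizes.append(len(dfs(node, backward, seen)))
    return sum(sizes) / len(sizes)


# Hypothetical trend: six messages by four authors, and a small follow graph.
authors = ["u1", "u1", "u1", "u2", "u3", "u4"]
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d")]

stats = participation_features(authors)
print(stats["messages_per_author"])               # 1.5
print(reciprocity(edges))                         # 0.5
print(round(average_scc_size(nodes, edges), 2))   # 1.33
```

In the example graph only the a/b pair is mutually connected, so the components are {a, b}, {c}, and {d}, giving an average size of 4/3.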

and are not a reflection of external events, can fall into categories that differ in a number of key features. Retweet trends, where users respond to and forward a message from a single popular user, are different in many characteristics (including content, interaction, time, participation, and social characteristics) from meme trends.

Limitations and Other Considerations

Before we conclude, we list several important considerations about our study, acknowledging a few limitations and biases in the work. One limitation is in the dataset used in this work, which is incomplete for two reasons. First, we generated the initial set of trends to analyze using two specific, albeit well-established, methods. (As we discussed, the focus of this article is not on the trend detection but rather on the analysis of the trends.) However, better methods for trend detection and identification of related content, exploiting both textual and nontextual information (Becker et al., 2010), might assist in capturing additional trends. At the same time, we believe that the sample of trends reflects the span of trend categories that can reasonably be detected by any method. The second reason why our dataset is incomplete relates to the selection of the tweets that we used for both trend extraction and characterization: specifically, we only included content from New York City users who disclosed their hometown location in their profile and hence excluded content from other local users without an explicit profile location. (Automatically matching locations and users with no explicit geographical information in their profiles is the subject of interesting future work.) In addition, we defined each trend using terms, and we retrieved the messages associated with each trend via simple keyword search. In the future, more complete message sets for each trend could be assembled by using clustering, so that we could associate a message with a trend even if the message and the trend do not include the same terms (Becker et al., 2010).

Furthermore, our analysis focuses on a single system (i.e., Twitter) and a single location (i.e., New York City). Other dynamics and trend characteristics may exist in other systems and locations (e.g., involving Facebook data and concerning users based in Paris, France). Indeed, the dynamics we observed, and some of the characteristics we extracted, are unique to Twitter. However, Twitter is an important

communication and information service that has already made considerable impact on our society, and is important to study regardless of generalization to other SAS. Moreover, we have no reason to believe that, other than message volume, trends involving New York City users are significantly different from trends for other locations.

The metrics that we have used to characterize trends can be extended or further developed. For example, for the time-based characterization, one could experiment with different fitting functions, identifying peaks in different ways (e.g., considering the expected volume of tweets for each time of the day), using different time periods before and after the peak, and so forth. In another example, the social network characteristics could consider the social network of authors that appeared in the first 24 hours of the trends, following Yardi and boyd (2010), which might produce networks of different characteristics. These different methods could expose more pronounced differences between trend categories. However, we believe the wide-ranging set of metrics presented here can serve as a good starting point for analysis, and has already helped identify key differences between types of trends.

Conclusions

“If Twitter had trending topics for Portland, #rain would be our Justin Bieber.”
@waxpancake on Twitter

Temporal patterns in data on social awareness streams such as Twitter are becoming increasingly important to our society’s information and communication landscape (Yardi & boyd, 2010; Takhteyev et al., 2010). These temporal patterns (for example, emerging trends in SAS data for local communities) are thus deserving of in-depth understanding and analysis. In this work we categorized and characterized Twitter trends (or “trending topics”) for one geographic area, New York City, and showed that not all trends are created equal. There are different types and categories of trends that are reflected in the data for this local community, and these trends are different in a number of ways that can be automatically computed. Our findings suggest directions for automatically distinguishing between different types of trends, perhaps using machine learning or model-based approaches, utilizing the trend characteristics we propose above as well as others. In particular, perhaps most interesting to us is the identification of local happenings and events that might be underrepresented in other sources, such as traditional news media, but which are of great interest to local communities. Given a robust classification of trends, which could follow from the work described above, we can improve prioritization, ranking, and filtering of extracted trends on Twitter and other SAS, as well as provide a more targeted and specialized visualization of content associated with each trend. Such a set of automated tools will significantly increase the utility of social awareness streams for individuals, organizations, and communities.

Acknowledgments

This material is based on work supported by the National Science Foundation under Grants IIS-0811038, IIS-1017845, and IIS-1017389, and by two Google Research Awards. In accordance with Columbia University reporting requirements, Professor Gravano acknowledges ownership of Google stock as of the writing of this article.

References

Allan, J. (2002). Introduction to topic detection and tracking. In J. Allan (Ed.), Topic detection and tracking: Event-based information organization (pp. 1–16). Amsterdam: Kluwer Academic Publisher.

Asur, S., & Huberman, B.A. (2010). Predicting the future with social media. Technical Report HPL-2010-53, HP.

Becker, H., Naaman, M., & Gravano, L. (2010). Learning similarity metrics for event identification in social media. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM ’10) (pp. 291–300). New York: ACM Press.

Boll, S., & Westermann, U. (2003). MediAEther: An event space for context-aware multimedia experiences. In Proceedings of the 2003 ACM SIGMM Workshop on Experiential Telepresence (ETP ’03) (pp. 21–30). New York: ACM Press.

Campbell, K.E. (1990). Networks past: A 1939 Bloomington neighborhood. Social Forces, 69(1), 139–155.

Chen, L., & Roy, A. (2009). Event detection from Flickr data through wavelet-based spatial analysis. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM ’09) (pp. 523–532). New York: ACM Press.

Cheong, M., & Lee, V. (2009). Integrating Web-based intelligence retrieval and decision-making from the Twitter trends knowledge base. In Proceedings of the Second ACM Workshop on Social Web Search and Mining (SWSM ’09) (pp. 1–8). New York: ACM Press.

Crane, R., & Sornette, D. (2008). Robust dynamic classes revealed by measuring the response function of a social system. Proceedings of the National Academy of Sciences, 105(41), 15649–15653.

Dayan, D., & Katz, E. (1992). Media events: The live broadcasting of history. Cambridge, MA: Harvard University Press.

De Longueville, B., Smith, R.S., & Luraschi, G. (2009). “OMG, from here, i can see the flames!”: A use case of mining location based social networks to acquire spatio-temporal data on forest fires. In Proceedings of the 2009 International Workshop on Location Based Social Networks (LBSN ’09) (pp. 73–80). New York: ACM Press.

Diakopoulos, N.A., Naaman, M., & Kivran-Swaine, F. (2010). Diamonds in the rough: Social media visual analytics for journalistic inquiry. In Proceedings of the IEEE Symposium on Visual Analytics Science & Technology (VAST ’10) (pp. 115–122). Washington, DC: IEEE Press.

Diakopoulos, N.A., & Shamma, D.A. (2010). Characterizing debate performance via aggregated Twitter sentiment. In Proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI ’10) (pp. 1195–1198). New York: ACM Press.

Events (2002). In Stanford Encyclopedia of Philosophy. Retrieved from http://plato.stanford.edu/entries/events/

Gruhl, D., Guha, R., Liben-Nowell, D., & Tomkins, A. (2004). Information diffusion through blogspace. In Proceedings of the 13th International Conference on World Wide Web (WWW ’04) (pp. 491–501). New York: ACM Press.

Hampton, K., & Wellman, B. (2003). Neighboring in netville: How the Internet supports community and social capital in a wired suburb. City & Community, 2(4), 277–311.

Jansen, B.J., Zhang, M., Sobel, K., & Chowdury, A. (2009). Twitter power: Tweets as electronic word of mouth. Journal of the American Society for Information Science and Technology, 60(11), 2169–2188.

Johnson, S. (2009). How Twitter will change the way we live. Time Magazine. Retrieved from http://www.time.com/time/business/article/0,8599,1902604,00.html

Kleinberg, J. (2003). Bursty and hierarchical structure in streams. Data Mining and Knowledge Discovery, 7(4), 373–397.

Kwak, H., Lee, C., Park, H., & Moon, S. (2010). What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web (WWW ’10) (pp. 591–600). New York: ACM Press.

LeCompte, M., & Schensul, J. (1999). Designing and conducting ethnographic research. Walnut Creek, CA: Altamira Press.

Leskovec, J., Backstrom, L., & Kleinberg, J. (2009). Meme-tracking and the dynamics of the news cycle. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’09) (pp. 497–506). New York: ACM Press.

McPherson, M., Smith-Lovin, L., & Cook, J.M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27, 415–444.

Mok, D., Carrasco, J.-A., & Wellman, B. (2010). Does distance still matter in the age of the Internet? Urban Studies, 47(13), 2747–2783. DOI: 10.1177/0042098010377363

Naaman, M., Boase, J., & Lai, C.-H. (2010). Is it really about me? Message content in social awareness streams. In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work (CSCW ’10) (pp. 189–192). New York: ACM Press.

Nagarajan, M., Gomadam, K., Sheth, A.P., Ranabahu, A., Mutharaju, R., & Jadhav, A. (2009). Spatio-temporal-thematic analysis of citizen sensor data: Challenges and experiences. In Web Information Systems Engineering (WISE 2009). Lecture Notes in Computer Science, 5802, 539–553.

Osborne, J. (2002). Notes on the use of data transformations. Practical Assessment, Research and Evaluation, 8(6).

Petrovic, S., Osborne, M., & Lavrenko, V. (2010). Streaming first story detection with application to Twitter. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (pp. 181–189). East Stroudsburg, PA: ACL.

Phelan, O., McCarthy, K., & Smyth, B. (2009). Using Twitter to recommend real-time topical news. In Proceedings of the Third ACM Conference on Recommender Systems (RecSys ’09) (pp. 385–388). New York: ACM Press.

Putnam, R.D. (2000). Bowling alone: The collapse and revival of American community. New York: Simon and Schuster.

Rattenbury, T., Good, N., & Naaman, M. (2007). Towards automatic extraction of event and place semantics from Flickr tags. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’07) (pp. 103–110). New York: ACM Press.

Sakaki, T., Okazaki, M., & Matsuo, Y. (2010). Earthquake shakes Twitter users: Real-time event detection by social sensors. In Proceedings of the 19th International Conference on World Wide Web (WWW ’10) (pp. 851–860). New York: ACM Press.

Salton, G. (1983). Introduction to modern information retrieval. New York: McGraw-Hill.

Sankaranarayanan, J., Samet, H., Teitler, B.E., Lieberman, M.D., & Sperling, J. (2009). Twitterstand: News in tweets. In Proceedings of the 17th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (GIS ’09) (pp. 42–51). New York: ACM Press.

Sayyadi, H., Hurst, M., & Maykov, A. (2009). Event detection and tracking in social streams. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM ’09).

Scellato, S., Mascolo, C., Musolesi, M., & Latora, V. (2010). Distance matters: Geo-social metrics for online social networks. In Proceedings of the 3rd Workshop on Online Social Networks (WOSN ’10) (p. 8). Berkeley, CA: USENIX.

Shamma, D.A., Kennedy, L., & Churchill, E. (2010). Statler: Summarizing media through short-message services. Paper presented at the 2010 ACM Conference on Computer Supported Cooperative Work (CSCW ’10), Savannah, GA.

Shklovski, I., Palen, L., & Sutton, J. (2008). Finding community through information and communication technology in disaster response. In Proceedings of the 2008 ACM Conference on Computer Supported Cooperative Work (CSCW ’08) (pp. 127–136). New York: ACM Press.

Singh, V.K., & Jain, R. (2010). Structural analysis of the emerging event-web. In Proceedings of the 19th International Conference on World Wide Web (WWW ’10) (pp. 1183–1184). New York: ACM Press.

Starbird, K., Palen, L., Hughes, A.L., & Vieweg, S. (2010). Chatter on the red: What hazards threat reveals about the social life of microblogged information. In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work (CSCW ’10) (pp. 241–250). New York: ACM Press.

Takhteyev, Y., Gruzd, A., & Wellman, B. (2010). Geography of Twitter networks. Manuscript submitted for publication. Available from http://takhteyev.org/papers/Takhteyev-Wellman-Gruzd-2010.pdf

Tilly, C. (1974). An urban world. Boston: Little, Brown.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications, volume 8. Cambridge, MA: Cambridge University Press.

Yang, Y., Pierce, T., & Carbonell, J. (1998). A study of retrospective and on-line event detection. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’98) (pp. 28–36). New York: ACM Press.

Yardi, S., & boyd, d. (2010). Tweeting from the town square: Measuring geographic local networks. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM ’10) (pp. 194–201). Menlo Park, CA: AAAI.

Zacks, J.M., & Tversky, B. (2001). Event structure in perception and conception. Psychological Bulletin, 127.

Zhao, Q., Mitra, P., & Chen, B. (2007). Temporal and information flow based event detection from social text streams. In Proceedings of the 22nd AAAI Conference on Artificial Intelligence (AAAI ’07) (pp. 1501–1506). Menlo Park, CA: AAAI.
