Profiling Online Social Network Platforms: Twitter Vs. Instagram
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the 54th Hawaii International Conference on System Sciences | 2021 Profiling Online Social Network Platforms: Twitter vs. Instagram Veruska Ayora Flávio Horita Carlos Kamienski Federal University of ABC, Brazil Federal University of ABC, Brazil Federal University of ABC, Brazil [email protected] [email protected] [email protected] Abstract governments, policymakers, authorities, and businesses to understand market trends and behavioral patterns Online Social Networks (OSN) have been increasingly [38]. used as sources of information for different Several studies have also demonstrated the applications, ranging from business, politics, and potential of OSN in defining people's sentiments about public services. However, there is a lack of information events, incidents, products, and services [41]. The on OSN platforms' behavior that may impact big data potential of Twitter as a platform for improving processing and real-time services. In this paper, two of awareness over variables of interest and thus the most widely used social networks, Instagram and supporting a more informed decision-making process Twitter, are investigated to broaden the understanding has been highlighted in the literature for different of how each platform's message characteristics areas, such as earthquake detection [46], influenza influence data completeness and latency. We virus epidemics [1][6], disasters [14][25], elections performed a series of experiments to emulate data [21], mobility [10], organizational issues [43], or posting and collection automatically. Our results crimes [33]. increase the level of transparency of the platforms' Existing social media methods are mostly focused internal behavior, showing that both can deliver data on event detection [20][52], content analysis [27], and with reasonably low latencies and high completeness, rumor analysis [13], which can describe specific but Twitter can be up to eight times faster when it phenomena often retrieve data to transform it into comes to multimedia messages. services. The challenge now is to investigate how the behavior of such platforms can influence data 1. Introduction completeness and latency. This additional understanding is necessary as OSN platforms were not originally designed to support real-time services, even The increasing use of Online Social Networks though they belong to private providers and data (OSN) in different areas reveals a critical and mostly audition is not allowed [7]. unexplored question regarding the performance In this context, the main goal of this paper is to provided by their underlying platforms. With the rapid examine the influence of message characteristics of growth and proliferation of OSN platforms, a vast OSN platforms (in particular, photo, video, hashtags) amount of user-generated content becomes a valuable on data completeness and latency. We shed some light information source for applications in different areas. on the currently unexplored and poorly understood Also, data is widely accessible since it can be collected OSN platform behaviors, increasing the level of through web-crawlers or public APIs. These two transparency of their internal working. Particularly, we characteristics, i.e., massive and open data, represent seek to answer the following research questions: the primary motivation for most OSN research [41]. User-generated messages on OSN platforms, such 1 2 RQ1: Are there messages shown to the Twitter and as Instagram and Twitter , have emerged as powerful, Instagram users, but not readily shown to other users? real-time means of information sharing on the Internet. If so, what proportion of posted messages can be By the online communication of billions of individual reliably retrieved? users, these platforms have involuntarily created a global participatory sensing network that can be RQ2: Does any message characteristic (e.g., photo, harnessed as an observatory of social events. Data video, hashtag) affect the decision to make them collected from OSN provides social, economic, and available to other users? cultural information that can be utilized by RQ3: After a user posts a message, how long does 1 it take for Twitter and Instagram to make it available https://www.instagram.com/about/us to other users? 2 https://about.twitter.com/ URI: https://hdl.handle.net/10125/70955 978-0-9981331-4-0 Page 2792 (CC BY-NC-ND 4.0) information that could shed light on why and how the Here, we analyze two social networks, namely, the events and patterns measured by physical sensors microblogging platform Twitter and the photo- and emerge. video-sharing platform Instagram, as they have The potential of this approach has also been millions of users worldwide and allow data to be confirmed by [3], which mentions the power of OSN collected by automated processes. We utilized two key during emergencies: “In July 2013, a Boeing 777 aircraft metrics to understand OSN platform characteristics, crashed on landing at the San Francisco International i.e., completeness, measured as the message return Airport, after a transpacific flight from Seoul, South Korea. rate, and latency, measured as the time it takes for a An observer waiting to board another flight snapped a message to be available to other users. We are aware photograph of the accident with her mobile phone and uploaded it to Twitter less than 1 min after the impact. Within that OSN data distribution policies and algorithms may 30 min, there had been more than 44,000 tweets about the change according to different criteria. Still, we believe accident, including photos and videos taken by survivors as there are solid reasons to understand and track them they escaped from the wreckage (International Air Transport over time, by taking snapshots of their most significant Association, 2014).” behaviors as perceived by their users. Despite all the attention to OSN, using data without Hence, the main contributions of this paper are clearly understanding what it comprises might be very threefold. Firstly, we introduce a methodology for problematic [17]. The more we understand a system's evaluating OSN platforms based on heterogeneous data inner workings, the more justifiable it can be governed quality and its impact on the users. Secondly, we and held accountable [4]. Transparency provides examine two of the most widely used OSNs, Twitter, additional reliability and validity for algorithms in and Instagram to broaden the understanding of the charge of decisions that affect society. Also, it makes it similarities and differences in data quality across possible for users to evaluate the correctness of OSN platforms. For example, our results reveal that the outputs and identify incorrect data [44]. Twitter platform reposts up to 18% of the tweets However, even though transparency mechanisms shown in the user timeline. On the other hand, this ideally can empower users to question and critique the behavior has not been observed on Instagram. Thirdly, system, Ananny & Crawford [4] have highlighted we help developers by summarizing relevant OSN some technical limitations of the difficulties in getting technical limitations when designing scenarios for to know how algorithms operate. They suggested that different applications. imposing transparency is not as simple as it seems, as In the remainder of this paper, Section 2 presents system developers themselves are often unable to related works, and Section 3 explains the proposed explain how a complex system works or which parts research methods. Experimental results are presented are essential for its operation [9]. in Section 4. Section 5 discusses the lessons learned, Transparency becomes even more relevant, as and finally, Section 6 draws some conclusions and many OSN platforms are owned by commercial presents future work. corporations that profit over data and interaction [30]. Eslami et al. [18] showed that the majority of 2. Background & Related Work Facebook users have a lack of awareness or understanding about their News Feed being structured Recently, due to the broad adoption of mobile by an algorithm. When users miss an important post platforms, OSN represents an increasingly up-to-date from a friend, they usually blame themselves, ignoring digital reflection of society. It supports people in that the OSN has an algorithm making decisions that interacting with places around the city, with other could be misleading or hiding important content from people, and businesses, while talking about topics of them. On the other hand, even if users are aware of common interest, such as sentiments, political beliefs, such complex algorithms, they have no way of social interactions, human mobility, presence in knowing how it works, which may prevent them from specific events, likes [47]. being sure about the results of their actions [18]. OSN behaves as "human as sensors," recording human activities and preferences [22]. This data 2.1. Social Networks Platforms becomes a crucial asset since it can be transformed into valuable knowledge, helping the decision-making From the literature review, it is possible to process [37]. Furthermore, enabled by OSN, these understand how to integrate OSN platforms as data human sensors provide 24 per 7 real-time data streams distributors for various services that can be provided at virtually no cost [16][46][1][6][33]. Also, [15] have for the needs of a variety of smart applications. For argued that OSN updates shared by the citizens can instance, Anthopoulos & Fitsilis [5] explore