Pub Crawling at Scale: Tapping Untappd to Explore Social Drinking
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the Tenth International AAAI Conference on Web and Social Media (ICWSM 2016) Pub Crawling at Scale: Tapping Untappd to Explore Social Drinking Martin J. Chorley∗ Luca Rossi Gareth Tyson Matthew J. Williams Cardiff University Aston University Queen Mary University of Birmingham [email protected] [email protected] University of London [email protected] [email protected] Abstract breweries opening each year (Naylor 2014). Thus, study- ing such a dynamic ecosystem could prove extremely fruit- There has been a recent surge of research looking at the re- ful, offering potential insight into the behaviour of people porting of food consumption on social media. The topic of alcohol consumption, however, remains poorly investigated. from around the world.This is because it is well known that Social media has the potential to shed light on a topic that, tra- many countries form social communities around the drink- ditionally, is difficult to collect fine-grained information on. ing of alcohol, which are quite distinct from those formed One social app stands out in this regard: Untappd is an app around eating (Clarke et al. 2000). There are also clear health that allows users to ‘check-in’ their consumption of beers. It considerations that such data contribute to. Excessive alco- operates in a similar fashion to other location-based appli- hol consumption is one of the most significant preventable cations, but is specifically tailored to the collection of infor- causes of death today. It caused 79,000 deaths and 2.3 mil- mation on beer consumption. In this paper, we explore beer lion years of potential life lost each year in 2001–2005 in consumption through the lens of social media. We crawled the United States alone (Kanny et al. 2011). Typically, such Untappd in real time over a period of 112 days, across 40 things can only be studied using labour-intensive (e.g., in- cities in the United States and Europe. Using this data, we shed light on the drinking habits of over 369k users. We fo- terviews and surveys) or coarse-grained (e.g., alcohol sales cus on per-user and per-city characterisation, highlighting key statistics) collection methods. Gathering reliable, up-to-date behavioural trends. and fine-grained data on such matters could be extremely helpful for exploring the role and impact of alcohol in soci- ety. 1 Introduction With these considerations in mind, one recent social app We are witnessing a convergence in our online and offline stands out: Untappd centres on the consumption of drinks personas. Activities that were once considered solely the (primarily beer), allowing users to ‘check-in’ beers that they domain of the real-world, have begun to encroach onto on- are drinking in a given location. These checkins are shared line territory. An example of this is the consumption of food with friends, allowing users to interact. Although the princi- and drink. It is now common for users to share details of ples underpinning the app may sound unusual, it has proven their meals online, as part of the so-called #foodporn revo- extremely successful. In 2014 its userbase surpassed 1 mil- lution (Mejova et al. 2015). As such, there has been a flurry lion, with 60 million beers being checked-in over just 3 three of recent research looking at consumption habits through the years1. This popularity shows little sign of abating. Hence, lens of social media (Abbar, Mejova, and Weber 2014). Untappd has the potential to shed extremely fine-grained in- While there has been a bulk of food-related research us- sight into the social drinking habits of a huge population ing social media data, a topic that has received less attention of users. Furthermore, unlike work modelling consumption is that of drinking alcohol. This is partly due to the percep- habits using free-text (e.g., via Twitter), there is no need tion that alcoholic drinks have far less diversity than food. to perform interpretation or translation of data. Instead, all Consequently, past papers have looked at the topic from check-ins are represented using a formal schema. a rather narrow perspective, e.g., identifying alcohol abuse In this paper, we offer the first characterisation of Un- by identifying tweets containing the word ‘hangover’ (Cu- tappd, exploring user drinking habits around the world. To lotta 2013). We argue that this, however, misses a signif- this end, we have crawled their publicly available data over icant opportunity. For example, the craft beer community a 112 day period, to gather all checkins for users in 40 cities. has grown dramatically in recent years. In 2013, craft beer We begin by characterising the dataset to reveal the app’s sales saw a year-on-year increase of 79%, with 74 million scale and popularity (§3). Following this, we separate our pints being sold in the UK alone (CGAStrategy 2013). There analysis into three broad themes. First, we shed light on in- are significant supply-side expansions too, with 200 new dividual user behaviour, highlighting drinking habits, as well ∗Authors listed alphabetically Copyright c 2016, Association for the Advancement of Artificial 1Untappd Official Blog, ‘Untappd is 1,000,000 strong and Intelligence (www.aaai.org). All rights reserved. growing’, 2014. http://blog.untappd.com/post/73638076039 62 as how users interact with Untappd (§4). Second, we explore more fine-grained information, allowing individual locations which beers are most popular, and how these vary across and drinks to be captured, as well as other metadata, e.g., in- cultural boundaries (§5). Then, third, we inspect the social tervals between drinks and ABV. side of Untappd, exploring Untappd’s social graph and ho- A key contribution of our work is the temporal and spa- mophily (§6). Our key findings can be summarised as fol- tial analysis of user drinking habits. Silva et al. (Silva et al. lows: 2014) explored similar spatial and temporal food/drink pref- • Untappd is a remarkably large-scale service. Over a 4 erences via Foursquare. Interestingly, they found that cul- month period, we witness in excess of 5 million check- tural factors were often stronger than spatial (e.g., French ins across 40 monitored cities. food has more in common with Brazil than British food). A common issue with studies such as these is that they use • Most users are responsible in their drinking (75% checkin venue choices as a proxy for specific food and drink pref- a maximum of 4 drinks per day), yet we observe a core erences. This contrasts with Untappd, which provides both group of very heavy drinkers. We find that 7% consume venue choices and individual drinks explicitly. We also note in excess of 10 drinks at least once, with 7.5% of users that work in this area has typically been limited to food, checking in excess of 50 drinks during the period. rather than drink, with particular focus on health implica- • We identify distinct diurnal drinking patterns, revealing tions. We expand this work by studying alcohol consump- differing trends across weekdays, weekends and typical tion, and go beyond health-based implications to explore the working hours. Cultural trends can also be extracted with social nature of the app. Another important difference be- different temporal patterns in cities (e.g., drinking peaks tween food and drink is the frequency and rate of usage, in Paris around 9PM compared to 7PM in London). with often many drinks being consumed per night. • There have also been various studies of more general- We observe clear preferences for particular types of beers purpose location-based social networks (LBSNs). Most no- across different cities. Through this, it becomes possible table is Foursquare, which also happens to provide the venue to cluster users into geographic regions based on their database for Untappd. Cramer et al. explored the reasons checkins. why people used LBSNs via in-depth interviews (Cramer, • A nascent Untappd social network exists, with a median Rost, and Holmquist 2011). They found many motivations friendship group size of 9. Despite its sparsity, we find for using LBSNs. Obvious examples include socially driven distinct homophily in the social network, with a tendency desires (e.g., knowing what friends are doing), although for friends to consume similar types of drinks. more unexpected reasons were discovered as well, such as the endorsement of venues and coordinating meet-ups. It We conclude the paper by highlighting key implications is logical to think that Untappd might hold many synergies from our work (§7 and §8). These include a range a possible here; however, we conjecture that the specific nature of the apps that could be underpinned by our data and findings, application likely leads to rather novel motivations as well particularly in relation to health monitoring (e.g., enabling (e.g., tracking popular beers). interventions through the social network). In this study, we build on past work to offer a window into the world of online ‘social’ drinking. On one hand, we pro- 2 Related Work vide the first empirical study of a new social app, which, as There has been a flurry of recent work into the relation- of yet, lacks even a rudimentary analysis. On the other hand, ship between food consumption and social media. Abbar et we extract and analyse this data to shed light on a number of al. (Abbar, Mejova, and Weber 2014) investigated the food social patterns, which appear unique to drinking. The rest of that people report eating on Twitter. They monitored 210k the paper explores this topic in depth. users to discover what food they reported consuming.