
arXiv:1412.8700v1 [cs.SI] 30 Dec 2014

Bewelcome.org – a non-profit democratic hospex service set up for growth

Rustam Tagiew
Alumni of TU Bergakademie Freiberg
[email protected]

Abstract. This paper presents an extensive data-based analysis of the non-profit democratic hospitality exchange service bewelcome.org. We hereby pursue the goal of determining the factors influencing its growth. It also provides general insights on internet-based hospitality exchange services. The other investigated services are hospitalityclub.org and couchsurfing.org. Communities using the three services are interconnected – comparing their data provides additional information.

I. INTRODUCTION

The rite of gratuitous hospitality provided by local residents to strangers is common among almost all known human cultures, especially traditional ones. We can cite melmastia of Pashtun [1, p. 14], terranga of Wolof [2, p. 17] and hospitality of Eskimo [3] as examples of traditional hospitality. A stranger hereby receives shelter, food, protection and other help. In return, the guest is expected to contribute at least symbolically to his hosts' well-being.

In the modern world the share of tourists among the strangers grew. Further, we will use the term travelers for adventurous budget tourists and concentrate only on them, since they are the ones seeking gratuitous hospitality. Because travelers can turn into local residents and vice versa, the idea of hospitality exchange, abbreviated as hospex, became relevant. Hospex participants are organized in a community, where gratuitous hospitality does not have to be returned directly to the host, but can rather be returned by hospitality to another member. Further, we will term everybody who had a real-life interaction with another member as member.

The first hospex service was initiated by Servas International starting in 1949 [4]. A hospex service administrates so-called profiles, which are basically descriptions of the contact and location of a participant. Hospex services and organizations create different communities. It is disputed whether these are different communities or just one single international hospex community, since some people use more than one hospex service. We don't discuss this topic.

A hospex service is easier to maintain once a central online database of profiles is set up. In addition to profiles, recording accommodation reports, aka references or comments, became popular. Such a database is accessible over a web- and/or app-based front-end. The first such service, called 'Hospex', was created in 1992 [6]. Installations of other services followed. Hospitality Club [5], abbreviated as HC, became the biggest of them with over 100k profiles in 2006 [7]. Later, Couchsurfing, abbreviated as CS, outdistanced HC – CS has now got a number of profiles over 3M [8]. We have to underline here the difference between the terms user, which is basically a profile owner, and member – not every website user had a real-life interaction with other members in order to be called a member. Members are a subset of users.

Hospex services always started as non-profit, donation- and volunteer-driven organizations, since making profit on gratuitous hospitality is considered unethical in common modern culture. Nevertheless, profit is made on gratuitous hospitality at least in the mediation of au-pairs to host families [9, e.g.].

Unfortunately, we can not access the data from CS – the biggest hospex service. CS became a for-profit corporation in 2011 and shut down the access of public science to its data. Scientists possessing the pre-incorporation CS data are prohibited from sharing it with third parties. HC never allowed access of public science to its data. Therefore, we can only rely on the data kindly mirrored for us by Bewelcome.org, abbreviated as BW, on 04.03.2014, on mirrored Google search data and on the mirrored public statistics from the official CS website of 2011. The original CS statistics page is no longer available. For what it's worth, the statistics presented in subsequent versions of the CS website are heavily reduced, and their figures are clearly different from the statistics published in 2011.

II. TERMINOLOGY

Term        Description
Traveler    Adventurous budget tourist.
Hospex      Indirect exchange of gratuitous hospitality.
Profile     Description concerning the contact and location of a person agreed to hospex.
Service     A communication hub, which facilitates hospex arrangements.
User        A person, who created a profile using a hospex service by signing up.
Member      A person, who had a real-life interaction with another hospex community member.
Community   People interconnected by fulfilled hospex.

III. RELATED WORK

We can identify three categories of existing scientific publications claiming hospex as their subject matter – non-scientific articles, analysis of survey data and analysis of real data. Survey data analysis gives insights into mindsets, but not into real behavior and processes. Currently we know about 4 research teams, who used pre-incorporation CS data [10], [11], [12], [13]. All papers written by these teams concentrate solely on the aspect of trust among CS users. In particular, they don't provide the general insight aimed at in this paper. Further, the correctness of their work can not be double-checked, because they are not allowed to share the data anymore.

IV. GENERAL DEVELOPMENT OF HOSPEX SERVICES

Progressional Google search volume for the name of a hospex service and the rate of people signing up there are significantly correlated. The correlation coefficients are .971 for CS in the years 2005–2011 and .913 for BW in the years 2011–2013. The Google search volume of BW is only about 1-2% of that of CS. Since we have no signup data for HC, we concentrate on CS. Fig.1 shows scaled graphs of monthly Google search and signup data for CS. The p-value for their linear correlation is below 10^-15.

Figure 1. High linear correlation of .971 between Google search volume for "Couchsurfing" and CS monthly signup in the years 2005–2011 allows linear prediction for the subsequent development. [Plot omitted; axis: Google search volume in % of maximum before Aug 2011, Aug 04 to Aug 13; legend: 99% confidence interval for prediction, recorded monthly signup, Google search volume.]

First, this allows us to make a linear prediction/approximation of the hidden CS signup data based on the available Google data. Summing up the approximated monthly signup, we can say with 99% confidence that the total number of CS signups was between 6.6M and 6.9M in December 2013, if the linear relationship did not change over the years. Second, relying on the linear relationship, we can estimate the HC signup or even use the Google search data as an equivalent for signup data. Fig.2 shows the Google search volume for HC, CS and HC+CS.
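The linear prediction described in this section can be sketched as an ordinary least-squares fit of monthly signup on search volume, followed by summing the predicted months. This is a minimal sketch under stated assumptions: the numbers below are hypothetical illustration values, not the CS or Google data, the helper name `fit_line` is invented for this example, and the 99% prediction interval from the paper is not reproduced here.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope
    a = my - b * mx               # intercept
    r = sxy / sqrt(sxx * syy)     # Pearson correlation coefficient
    return a, b, r

# Hypothetical months with both search volume and recorded signup.
search_known = [10, 20, 35, 50, 70, 90]   # search volume in % of maximum
signup_known = [12, 21, 38, 52, 69, 93]   # monthly signup in thousands

a, b, r = fit_line(search_known, signup_known)

# Hypothetical later months where only the search volume is available.
search_hidden = [80, 85, 95]
predicted = [a + b * s for s in search_hidden]
total_hidden = sum(predicted)             # approximated signup over those months

print(f"r = {r:.3f}, predicted signup over hidden months = {total_hidden:.1f}k")
```

The estimate is only as good as the assumption that the linear relationship stays unchanged over the predicted period, which is the same caveat the text states.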

Figure 2. Google search volume for HC, CS and HC+CS. Pie chart for adjusted mean distribution of Google search for HC+CS over months. [Plot omitted; axis: Google search volume in % of CS maximum, Aug 04 to Aug 13. Pie chart shares: Jan 7.4%, Feb 7.2%, Mar 7.9%, Apr 7.6%, May 7.8%, Jun 9.1%, Jul 10.6%, Aug 11.3%, Sep 8.7%, Oct 7.9%, Nov 7.4%, Dec 7.1%.]

Figure 3. Power function fits for CS total signup in Jan 2004 - Aug 2011 and in Aug 2009 - Aug 2011. Monthly CS signup deviation from its seasonalized power function fit as gray curve. [Plot omitted; axes: total signup in millions of profiles and monthly signup deviation from seasonalized power function fit in %, 2004 to 2015; legend: real data for CS total signup, fit for Jan 2004 - Aug 2011, fit for Aug 2009 - Aug 2011, monthly signup deviation; the CS database crash is marked.]
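The monthly shares printed on the pie chart of Figure 2 can be checked with a few lines of arithmetic. The sketch below only re-uses the twelve percentages from the figure; the variable names are chosen for this example.

```python
# Monthly shares of growth-adjusted Google search interest (Figure 2 pie chart), in %.
shares = {
    "Jan": 7.4, "Feb": 7.2, "Mar": 7.9, "Apr": 7.6, "May": 7.8, "Jun": 9.1,
    "Jul": 10.6, "Aug": 11.3, "Sep": 8.7, "Oct": 7.9, "Nov": 7.4, "Dec": 7.1,
}

peak_month = max(shares, key=shares.get)   # month with the highest share
low_month = min(shares, key=shares.get)    # month with the lowest share

# Compare the five months Jun-Oct with the remaining seven months Nov-May.
jun_to_oct = sum(shares[m] for m in ("Jun", "Jul", "Aug", "Sep", "Oct"))
nov_to_may = sum(shares.values()) - jun_to_oct

print(peak_month, low_month, round(jun_to_oct, 1), round(nov_to_may, 1))
```

With these values the five months Jun–Oct carry 47.6% of the interest and the seven months Nov–May carry 52.4%.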

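The power function fits named in the caption of Figure 3 can be illustrated with a log-log least-squares fit. This is a simplified stand-in and not the paper's procedure, which refers to a Box-Cox transformation [14]; the function name `fit_power` and the synthetic series are assumptions made for this sketch. The fitted exponent k is the quantity of interest: k near 3 means cubic growth, k near 2 quadratic growth.

```python
from math import exp, log

def fit_power(t, y):
    """Fit y = c * t**k by least squares on (log t, log y); returns (c, k)."""
    lx = [log(v) for v in t]
    ly = [log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    k = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
         / sum((a - mx) ** 2 for a in lx))
    c = exp(my - k * mx)
    return c, k

# Synthetic total-signup series (hypothetical): months since start, t = 1..24.
months = list(range(1, 25))
cubic_series = [2.0 * t ** 3 for t in months]       # exact cubic growth
quadratic_series = [5.0 * t ** 2 for t in months]   # exact quadratic growth

c3, k3 = fit_power(months, cubic_series)
c2, k2 = fit_power(months, quadratic_series)
print(f"cubic series: k = {k3:.2f}; quadratic series: k = {k2:.2f}")
```

On real signup totals the choice of the time origin strongly influences the fitted exponent, so the start month would have to be fixed before comparing exponents across periods.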
As you can see, the interest in hospex services is seasonal and growing. Using Box-Cox transformation [14], we can fit a power function to HC+CS. Averaging the deviations between HC+CS and the fitting power function gives us a growth-adjusted average distribution of interest in hospex services over months. August has the highest share with 11.3% and December the lowest with 7.1%. The five months Jun–Oct have almost the same share as Nov–May. Therefore we use August to mark the time axis on all plots. This seasonality in Google search volume is common for all services like [...] or Booking.com.

You see on fig.2 that the interest in CS overtook HC in 2007, whereby the interest in HC started to decrease. Both curves show seasonal variations with peaks in August. The seasonal variations for HC visibly flattened from August 2009 on. The growth of the monthly signup for CS decreased from August 2009 on as well. The gray curve on fig.3 shows the deviation between the CS monthly signup and its seasonalized power function fit. On this curve, you see the chasm of the CS database crash in summer 2006 and the clear slowdown after August 2009. This slowdown caused the change in the slope of the total signup curve for CS as depicted on fig.3. The two power function fits, for Jan 2004 - Aug 2011 and for Aug 2009 - Aug 2011, yield different predictions. The growth of CS users dropped to almost quadratic from being over cubic before August 2009. By the way, the power function fit after August 2009 gives a prediction of 6.9M users for December 2013. Since CS and HC are equivalent services, we assume a user migration period in the years 2006-2009, which powered the faster growth of CS and the decline of HC.

The self-proclaimed purpose of the first hospex service Servas is "world peace, goodwill and understanding by providing opportunities for personal contacts among people of different cultures, backgrounds and nationalities" [4]. CS adapted it as "a better world – one couch at a time" [8]. Combining World Bank and CS data reveals that 75% of the world population, which lives in poor countries with a per capita GDP under $10,000, represented only 10% of CS users in 2011. Well, at least for CS the admission to the 'better world' seems to be tightly restricted by income level.

V. INSIGHTS FOR BW

The growth of interest in BW does not display seasonal variation, as is obvious on Fig.5, but is rather driven by the protest of CS users. There is a major peak in BW signups and the Google search volume in September 2011 – CS had been secretly privatized for the personal enrichment of a few a month before [15]. The three peaks between August 2012 and August 2013 are also consequences of changes in CS, which have been considered to be unethical by the CS community. The first of these peaks is caused by the update in the CS ToU, according to which the CS data can be sold to third parties.

A data set of 68320 profile entries was received from BW on 04.03.2014. 68030 of those entries possess e-mail domain names – local parts had been removed before the data transfer for privacy reasons. The remaining 290 entries are either invalid or officially deleted. Fig.4 shows the domain name categorization and the corresponding histogram. E-mail services administrated by Google, Microsoft and Yahoo make 75.4% of all signups. Domain names of educational organizations like universities and schools from all around the world make 1.2%. We have introduced the category of alternative providers, which emphasize privacy and/or certain political goals. These are riseup.[org|net], no-log.org, lavabit.com, biomail.de, spamfreemail.de, openmailbox.org, jpberlin.de, mailoo.org, safe-mail.net, immerda.[ch|de] and posteo.*. This category makes only .5% of all signups. Although signing up at BW is driven by the incorporation of CS and its consequences, the majority's choice of e-mail providers can not be explained as ethically dictated.

Figure 4. Domain name categorization and corresponding histogram in absolute numbers. [Histogram omitted; logarithmic axis from 1 to 10000.]

Category                      Count    Category                      Count
Google                        27405    AOL                             368
Microsoft                     15181    Alternative emailproviders      363
Yahoo                          8933    Other .org-domains              316
GMX                            2527    Other .it-domains               289
Other .com-domains             1929    Italiaonline                    286
Web.de GmbH                    1359    Other .fr-domains               256
Mail.ru Group                  1160    Apple                           204
Other .net-domains              933    Tencent QQ                      197
Other .de-domains               897    Other .pl-domains               193
Educational organizations       847    Other .cz-domains               190
Orange                          545    Other .ch-domains               156
Yandex                          448    Rambler&Co                      145
La Poste                        435    Other .co.-domains              144
freenet group                   431    Others                         1485
Wirtualna Polska                408

Not every signup at BW leads to subsequent real life interaction. 4921 out of 68030 profiles never logged in after signing up. Fig.5 illustrates this fact. The rate of users who never logged in grew with those signing up after September 2011 and stays at around 10%. The incentives for logging in on a hospex website are searching for a stay, replying to a request and general communication. We consider posting, inviting and contacting members for the organization of activities and other minor reasons as examples of general communication. A user interested only in hosting would not have any incentive to log in until an appropriate hospitality request is forwarded to his e-mail postbox. If communication obviously stimulates login frequency, one is interested in what is stimulating communication.

Figure 5. BW growth and 'no login' rate. [Plot omitted; axes: BW Google search volume and signup in % of maximum and 'no login' rate in %, Aug 07 to Aug 13; legend: 'no login' rate in %, Signup, Google Search.]

Figure 6. Average number of messages received, if neither sent nor posted. [Plot omitted; grouped by hosting status (Yes, Welcome; Maybe; No, Sorry) and gender (female, male, I Don't Tell, other).]

By applying decision tree based feature selection [16], we could determine two factors, which have the most impact on the number of received messages in case neither messages have been sent nor posts have been published. These factors are the indicated gender and the hosting status. Fig.6 shows the average number of messages received depending on these two factors. The profile text was not among the investigated features, because it had been replaced by its length to protect users' privacy. For your information, 41.7% of BW users indicated to be female and 56.7% to be male.
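The shares quoted above for the e-mail provider categories and the 'no login' count can be recomputed from the Figure 4 counts. The sketch below is plain arithmetic on numbers taken from the text and the histogram; which denominator the text uses for its percentages is not stated, so both candidates are computed, and the helper `share` is invented for this example.

```python
# Counts taken from the text of this section and the Figure 4 histogram.
entries_total = 68320         # all profile entries received from BW
entries_with_domain = 68030   # entries that possess an e-mail domain name
never_logged_in = 4921        # profiles that never logged in after signing up

counts = {
    "Google": 27405,
    "Microsoft": 15181,
    "Yahoo": 8933,
    "Educational organizations": 847,
    "Alternative emailproviders": 363,
}

def share(count, total):
    """Percentage of `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)

big_three = counts["Google"] + counts["Microsoft"] + counts["Yahoo"]

print("big three / all entries:       ", share(big_three, entries_total))
print("big three / entries w. domain: ", share(big_three, entries_with_domain))
print("educational:                   ", share(counts["Educational organizations"], entries_total))
print("alternative providers:         ", share(counts["Alternative emailproviders"], entries_total))
print("never logged in:               ", share(never_logged_in, entries_with_domain))
```

Dividing by all 68320 entries gives 75.4% for the three largest providers, while dividing by the 68030 entries with a domain name gives 75.7%; the educational and alternative shares round to 1.2% and 0.5% under either denominator. The overall share of profiles that never logged in is 7.2% of the 68030 profiles.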

Users, who set their hosting status to "No, sorry", are considered to be mostly travelers. The reason that they receive messages without own initiation are the so-called "Welcome messages" sent by BW volunteers. "Yes, Welcome" does not mean that every hospitality request will be accepted – it only shifts a profile to the top of the search results for an entered location. "Maybe" is the default case.

Since the text of private messages and public posts has been removed in order to protect users' privacy, we can not apply text mining to categorize messages and to determine relationships between them. The correlation of the sum of sent messages and published posts with the number of received messages is .81 among members. It is higher than the correlation between sent and received messages, which is only .72. Therefore, we conclude that publishing posts triggers private messages to some extent. As you can see in fig.7, the rate of users, which did not receive any message yet, is over 40%. This indicates a low share of travelers and/or an unequal distribution of existing hospitality requests. The use of public forums on BW for communication is marginal, as you can see in fig.7 – 98% of posts are created by 5% of users. Tab.I shows that the chance to fall into the category of BW users, who sent any private message and/or posted, but did not yet receive any, is low.

Figure 7. Communication rates among BW users. [Plot omitted; axes: rate of users in % over timepoint of the signup, Aug 07 to Aug 13; legend: Sent|posted & received, Not yet received, Not yet sent, Not yet posted.]

Table I. DISTRIBUTION OF PARTICIPATION IN COMMUNICATION

                    neither sent nor posted yet   sent or posted   sum
not yet received    43.3%                         2.7%             46%
received            26.2%                         27.8%            54%
sum                 69.5%                         30.5%            100% ≡ 68028

A calculation on the data set of 227695 messages involving 38425 BW users shows that the median time delay between the first sent/published message/post and the first received message is 35 hours and the average is 68 days. New users send more than one message before they have actually received any. This is legitimate, since 68.2% of all initiated message exchanges, aka conversations, are not continued. 364 successful conversations between two users result in a median reply time delay of 16 hours and an average of 10 days. 60% of replies arrive in 24 hours and 86.8% of replies arrive in a week. Therefore a time limit of a day means an estimated reply probability of 19.1%, and of a week 27.6%. That means that 10 private messages to new conversation partners give an estimated 96% probability of receiving at least one reply within a week. Achieving the same probability within 24 hours needs 15 messages.

BW as a hospex service does not only facilitate on-line communication, it also creates a hospex community of members interconnected by real life interactions. The only way to estimate the size of this community is to investigate the accommodation reports, called comments on BW. The number of real-life interactions is at least as big as the number of comments. A comment can be left by a user for a user. If it has not been disputed, we can assume a real life interaction between those users and regard them as members, even if the comment is not reciprocated yet. Under this assumption, the size of the BW community was at least 11115 members by 04.03.2014. Fig.8 shows the development of the BW community. The monthly growth of members correlates only by .44 with the monthly growth of the number of users. The average number of connections per BW member drops after August 2011 and indicates the joining of new members into the community.

Figure 8. Development of BW community. [Plot omitted; axes: monthly growth at users/members and average number of real life connections, Aug 07 to Aug 13; legend: average real life connections, monthly growth at members, monthly growth at users.]

VI. CONCLUSION

The interest in hospex services is as seasonal as the interest in other accommodation services. Market leadership of a hospex service according to the user number is superable, but takes a relatively long period of time to overcome. The migration period from HC to CS took 4 years – from 2006 to 2009. The interest in BW is not seasonal, but correlates with major changes of CS, which are considered to be unethical by the CS community. Most BW users do not apply ethical concerns to their choice of e-mail providers. BW lacks utilities facilitating hospex communication. To achieve a 96% reply probability within a week for a hospitality request before 04.03.2014, a traveler needed 10 messages on average, and 15 messages for the same probability within 24 hours. The average number of real life interactions in the BW community varies between 3 and 5.

ABOUT AND ACKNOWLEDGMENT

This work was partially published on Bewelcome.org under a Creative Commons License [17]. I want to express my gratitude to the co-members of Bevolunteer for their help. We also thank the people, who provided the Weka library [16].

REFERENCES

[1] R. Schultheis, Hunting Bin Laden: How Al-Qaeda Is Winning the War on Terror. 2008.
[2] A. Sylla, La philosophie morale des Wolof. Atelier Reproduction des thèses, Université de Lille III, 1980.
[3] A. J. Rubel, "Partnership and wife-exchange among the eskimo and aleut of northern north america," in Anthropological papers of the University of Alaska, 1961, pp. 59–71.
[4] "Servas international," servas.org/who-we-are.php.
[5] "Hospitality club," deutsch.hospitalityclub.org/aboutdeu.htm.
[6] W. Sylwestrzak, "Hospex - a database for international exchange of hospitality for travellers," hospex.org.
[7] J. Schmidt, "Hospitality club: Netzwerk eines träumers," Spiegel Online Reise, 2006.
[8] "Couchsurfing inc." couchsurfing.org.
[9] "Aupairworld," aupair-world.net, 2014.
[10] P. Victor, C. Cornelis, M. D. Cock, and E. Herrera-Viedma, "Bilattice-based aggregation operators for gradual trust and distrust," in World Scientific Proceedings Series on Computer Engineering and Information Science, 2010, pp. 505–510.
[11] P. Dandekar, "Analysis and generative model for trust networks."
[12] J. Overgoor, E. Wulczyn, and C. Potts, "Trust propagation with mixed-effects models," in ICWSM, 2012.
[13] D. Lauterbach, H. Truong, T. Shah, and L. A. Adamic, "Surfing a web of trust: Reputation and reciprocity on couchsurfing.com," in CSE (4), 2009, pp. 346–353.
[14] G. E. P. Box and D. R. Cox, "An analysis of transformations," Journal of the Royal Statistical Society Series B, vol. 26, no. 2, pp. 211–252, 1964.
[15] "Couchsurfing 'conversion' issues," couchwiki.org, 2011.
[16] I. H. Witten and E. Frank, Data Mining. Morgan Kaufmann, 2005.
[17] Bevolunteer.org, "'hospex data research' and 'bewelcome charts'," bewelcome.org/wiki, April 2014.