A Data Mining Method for Facebook Social Network: Take "New Row Mian (Beef Noodle)" in Taiwan for Example

Jong-Shin Chen1, Chi-Yueh Hsu2*, Cheng-Ying Yang3, Ching-Chuan Wei4, Han Guo Ciang5
1,4,5 Department of Information and Communication Engineering, Chaoyang University of Technology, Wufeng, Taichung 41349, Taiwan, R.O.C.
2 Department of Leisure Services Management, Chaoyang University of Technology, Wufeng, Taichung 41349, Taiwan, R.O.C.
3 Department of Computer Science, University of , Taipei 10048, Taiwan, R.O.C.
E-mail: [email protected]*

Abstract—Taiwan has the highest Facebook penetration rate in the world: as of July 2015, the number of daily users had reached 13 million in a population of approximately 23 million. The location-based Facebook check-in service is a hot topic, and numerous Facebook users visit the places that interest them and check in there. Taiwan beef noodle is considered a national dish. For the 2011 Taipei International Beef Noodle Festival, Taiwan beef noodle was translated as "New Row Mian". The naming imitates Japanese Sushi and Korean Kimchi, which are transliterated from the local language, and highlights the unique culture. Cultural and human activities in turn produce related Facebook check-in places and check-in behaviors. In this study, we propose a method to collect the big data of Facebook check-in places, find the places related to "New Row Mian", and position these places in administrative regions.

Keywords—big data; Facebook; public opinion; New Row Mian

I. INTRODUCTION

Taiwan has the highest Facebook penetration rate in the world: as of July 2015, the number of daily users in Taiwan had reached 13 million in a population of approximately 23 million. Check-in is a location-based function of Facebook. A Facebook user can go to a famous scenic spot (a Facebook place) or participate in an activity and check in on Facebook to show that he or she took part [1-2]. If no suitable Facebook place exists, the user can create a new place for the scenic spot. In the rest of this article, 'place' simply refers to a 'Facebook place', and "…" is used to present the name of a specific Facebook place. After several years, there are numerous places and numerous check-in behaviors at these places in Taiwan. In this study, we intend to explore public opinion from Facebook places for a special topic.

Beef noodle soup exists in a variety of forms throughout East Asia and Southeast Asia. It was first created by the Hui people (a Chinese Muslim ethnic group) during the Tang Dynasty of China. The red braised beef noodle soup was invented in Kaohsiung, Taiwan, by veterans who fled from mainland China during the Chinese civil war. In Taiwan it is considered a national dish, and Taipei City has held the International Beef Noodle Festival several times, where a variety of chefs and restaurants compete for the 'best beef noodle' title in Taiwan. For the 2011 Taipei International Beef Noodle Festival, Taiwan beef noodle was translated as "New Row Mian". The naming imitates Japanese Sushi and Korean Kimchi, which are transliterated from the local language, and highlights the unique characteristics of the dish. Accordingly, we selected "New Row Mian" as the topic for exploring related places in the Facebook social network.

Numerous places and the check-in behaviors at these places can form public opinion, for example hot places and regions with a high density of places. How can we know in which administrative regions the related places are popular? Unfortunately, it is difficult to get the administrative region of a place from Facebook. The 1st and 2nd divisions on Taiwan Island include 6 special municipalities, 3 cities, and 10 counties. The 3rd division of Taiwan includes 352 regions, which are 170 districts, 13 county-controlled cities, and 169 townships [3-4]. In this study, we position the places in the 1st, 2nd, and 3rd type divisions of administrative regions.

In this study, we propose a data mining method for the Facebook social network and take "New Row Mian" in Taiwan as an example. Section II reviews related work. Section III introduces the methods to collect places, to update the data of places, and to position places in administrative regions. Section IV presents the results. Finally, the conclusions are given in Section V.

II. RELATED WORK

Current research on check-ins in social networks is divided into two types. The first type is based on big data and uses technologies of Geographic Information Science [5-8]. The other type is the ethnography-style study. Big-data-based check-in studies generally depend on open Application Programming Interfaces (APIs) to acquire data from open social networking platforms and then perform data mining and analysis. The disadvantage of this method is that the findings cannot be discussed in depth with the persons concerned.

These studies are usually based on Foursquare as the research field. Facebook is the most popular check-in platform in the world; however, few studies use it as the research field. One of the major reasons is that the Facebook platform only allows limited data access. The other reason is that the Facebook API often changes. Therefore, it is difficult to acquire mass data from the Facebook platform by programming. Foursquare is another platform on which users can check in at places. When a Foursquare user checks in at a place, the information is simultaneously displayed on Twitter. All of the information on Twitter is open. Accordingly, mass data related to Foursquare can be acquired from the Twitter platform.

Ethnography is the systematic study of people and cultures. It is designed to explore cultural phenomena where the researcher observes society from the point of view of the subject of the study. An ethnography is a means to represent graphically and in writing the culture of a group. The resulting field study or case report reflects the knowledge and the system of meanings in the lives of a cultural group. For ethnography-style studies, our study can help to find the hot regions and hot locations. It is worth mentioning that each Facebook place has a corresponding web page. The names of the Facebook users who checked in at this place can be found on this page. By following the hyperlinks of the Facebook users on the place page, we can visit the web pages of these users and acquire some information related to them. Indirectly, and without much difficulty, we can contact the real persons behind the Facebook users. These specific Facebook users could be candidates for an ethnography-style study. For the general public, many of the Facebook places in this paper are popular locations that many people have visited. Moreover, a Facebook place page with a high like count indicates that many Facebook users follow this page. Such a place can serve as an outdoor recreation location, and its web page is worth visiting on the Internet.

III. RESEARCH METHOD

Our research method is divided into 3 steps: place collection, content maintenance, and place positioning [11].

A. Place Collection

The data of Facebook places is open data. With an access token ak, a developer can acquire the data from Facebook servers. A developer can request an ak from the web page with url "https://developers.facebook.com". Each Facebook place has a unique identification, termed id. A developer can acquire several identifications by sending a request to a Facebook server over the Hypertext Transfer Protocol Secure (https) protocol. The format of this request is shown in Fig. 1, where c is a latitude and longitude coordinate, r is a distance, and n is a limit number. When a Facebook server receives this request, it returns at most n identifications in the geographical circle-area A with center c and radius r. If the number of places in A is larger than n, the response contains a next url. With this url, the Facebook server provides further identifications in A. This process is repeated until no next url is returned by the Facebook server. An example is shown in Fig. 1, where c is 24.069093 (latitude) and 120.7127943 (longitude), r is 150 meters, ak is 1368411763..., and n is 50. Notably, for security, several characters of ak are masked as *.

The response from the Facebook server is in JSON format [9] and encoded in UTF-8 [10]. Fig. 2 presents an overview of the JSON-formatted data, which has data and paging fields. The data field contains the data of n places. The data of each place include the identification, name, location, and category of this place. The next field contains the next url if there are other places in this area.

Format: https://graph.facebook.com/search?type=place&center=c&distance=r&access_token=ak&limit=n
Example: https://graph.facebook.com/search?type=place&center=24.069093,120.7127943&distance=150&access_token=1368411763*****|k95dZzlRoYVqg9I9NF_QxU*****&limit=50

Fig. 1. A https format of requesting identifications from Facebook server

{
  "data": [ {data of the first place}, {data of the second place}, ..., {data of the nth place} ],
  "paging": {
    "cursors": {...},
    "next": "the next url"
  }
}

Fig. 2. An overview of the responded JSON formatted data from Facebook server

B. Content Maintenance

The detailed data of a place can be acquired by sending a request with its place id, according to the format shown in Fig. 3. After the Facebook server receives the request, it returns the data of this place. The data include the name, location (latitude and longitude), category, description, about, checkins, and so on, where 'checkins' is the number of check-in behaviors at this place.

Format: https://graph.facebook.com/id/?access_token=ak
Example: https://graph.facebook.com/1789770481248040/?access_token=1368411763*****|k95dZzlRoYVqg9I9NF_QxU*****

Fig. 3. A request of maintaining data of a place

C. Place Positioning

The boundary B of a geographical area A is defined as in (1), where (x_i, y_i) is a latitude and longitude coordinate for i = 0, 1, ..., n. The boundary is formed by the segments from (x_0, y_0) to (x_1, y_1), (x_1, y_1) to (x_2, y_2), ..., (x_{n-1}, y_{n-1}) to (x_n, y_n), and (x_n, y_n) to (x_0, y_0). A node (x_t, y_t) is extended to a line L(x_t, y_t) as in (2). If the extended line L(x_t, y_t) meets the boundary B an odd number of times, (x_t, y_t) is in A. Otherwise, (x_t, y_t) is not in A.

B = {(x_0, y_0), (x_1, y_1), ..., (x_n, y_n)}    (1)

L(x_t, y_t) = {(x_k, y_t) | x_k ≥ x_t}    (2)

Using the collection method above, the geographical area of Taiwan is separated into more than 960 thousand subareas with a radius of 150 meters, and we obtained more than 1.1 million places. Then, we positioned these places in the 1st, 2nd, and 3rd type divisions of administrative regions.

As shown in Fig. 4, the boundary of Kaohsiung City in Taiwan is depicted with the Google Earth software. In this example, the extended line of p_0 meets the boundary 2 times, so p_0 is not in Kaohsiung City. The extended line of p_1 meets the boundary 1 time, so p_1 is in Kaohsiung City. Using a similar approach, a place can be positioned within the 3rd division administrative regions. As shown in Fig. 5, the 40 places whose data contain the words 'Beef Noodle' are positioned in Kaohsiung City.

Fig. 4. The boundary of Kaohsiung City in Taiwan

Fig. 5. Kaohsiung City Sanmin District in Taiwan has 90 places

IV. RESULTS

A. Place Category

The places related to "New Row Mian" are found by searching the data of places for those whose name, about, or description fields contain the keywords "牛肉麵", the traditional Chinese words for "Beef Noodle". In total there are 4024 related places, of which 3743 places have the keywords in their name fields, 465 places have them in their about fields, and 248 places have them in their description fields. Almost all of these places on Facebook map to physical stores, and the place names are usually the names of these physical stores. Stores provide several goods to the public, and New Row Mian is just one of them. However, 93.02% (3743/4024) of the stores use the words "New Row Mian" as a part of their store names to highlight this characteristic.

B. Place Distribution

The distribution of the 4024 places in Taiwan is shown in Fig. 6 and Table I. Among the 1st and 2nd divisions of administrative regions, Taichung City ranks first with 739 places, followed by Kaohsiung City with 654 places and Taipei City with 430 places. The number of places in each region is listed in Table I. The geographical locations of these places in Taiwan are depicted in Fig. 6, in which Taipei City, New Taipei City, Taichung City, and Kaohsiung City are high-density regions. Over all 1st and 2nd divisions of administrative regions, the average number of places is 211.74 and the standard deviation is 207.38. In terms of place density, the first three regions are Taipei City, Chiayi City, and Hsinchu City, whose values are 1.582, 1.183, and 0.643 per km2, respectively.

TABLE I. PLACE DISTRIBUTION

#   Region             No. of places
1   Taichung City      739
2   Kaohsiung City     654
3   Taipei City        430
4   New Taipei City    385
5   Taoyuan City       297
6   Changhua County    262
7   Tainan City        224
8   Nantou County      159
9   Pingtung County    130
10  Yunlin County      125
11  Miaoli County      108
12  Taitung County     99
13  Hsinchu County     99
14  Yilan County       95
15  Chiayi City        71
16  Hsinchu City       67
17  Chiayi County      65
18  Keelung City       9
19  Hualien County     5
Average: 211.74, Standard deviation: 207.38

Fig. 6. Place distribution in Taiwan

C. Hot Places

The hot places, whose check-in counts are larger than 10000, are shown in Table II. Two places have check-in counts larger than 100000. A Facebook place related to New Row Mian is almost always a store in the real world. A place whose check-in count is larger than 100000 has been visited by numerous Facebook users who enjoyed the food there. The web page of a place can be visited at https://www.facebook.com/id. Some information about this place, including comments, can be found on the web page.

TABLE II. HOT PLACES

#   Id                Location                          Check-ins  Likes
1   164402176945881   Taipei City Zhongshan District    110015     15702
2   188995511131356   Taipei City                       102742     4229
3   191024864264295   Taoyuan City                      74049      6699
4   189057044467859   Taipei City Da’an District        47134      1641
5   173773912666756   Kaohsiung City Yancheng District  41452      1972
6   114629905285950   Kaohsiung City Zuoying District   38264      2734
7   316125325149468   Pingtung County Chaozhou          31605      2329
8   153555914702137   Taipei City                       23860      1022
9   519285271483011   Taipei City Wanhua District       23013      1365
10  253608867990317   Taitung County                    22119      4067
11  187449717961594   Hukou Township                    21386      1119
12  182487775125528   New Taipei City Ruifang District  21331      546
13  127967877268995   Taitung County Taimali Township   19207      314
14  125146537553788   Hsinchu County Zhubei City        17975      1137
15  121154351289626   Taoyuan City Zhongli District     17741      659
16  189049247790636   Nantou County Jiji Township       17695      609
17  101857676617902   Hsinchu County Zhudong Township   15939      1441
18  213317278999639   Taoyuan City Taoyuan District     15568      6770
19  200238353320831   Taipei City Da’an District        14841      1197
20  382094198527268   Taitung County Taimali Township   13800      220
21  192588694096965   Nantou County Puli Township       13614      517
22  140818355977859   Taoyuan City Zhongli District     13497      756
23  193604177454770   Taichung City South District      13410      1616
24  121506961274405   Yilan County Toucheng Township    12430      4567
25  175520232494026   Pingtung County Pingtung City     11490      1078
26  166137110077305   Yilan County Wujie Township       11427      2256
27  1581210898783590  Taichung City West District       11326      8781
28  1548595512107500  Taichung City West District       11002      1888
29  159260344123485   Tainan City                       10761      524
30  195594733787488   Nantou County Nantou City         10471      309
31  135675536497879   Taipei City Zhongshan District    10242      565
32  178816792155667   Taipei City Zhongshan District    10222      851

V. CONCLUSIONS

In this study, we proposed a data mining method for the Facebook social network and took "New Row Mian" in Taiwan as an example. A Facebook place is a location in the real world. The web page of this place lists the Facebook users that have been there and the comments posted by Facebook users after attending specific activities at this place. Taichung City has 739 Facebook places related to "New Row Mian". The hottest places have more than 100000 check-ins. This study provides a method to find the Facebook places and Facebook users.

For ethnography-style studies, our study can help to find the hot regions, the hot locations, and the Facebook users related to these regions for a given topic. More statistical data is listed in Table III and Table IV.

ACKNOWLEDGMENT

This research was partially supported by the Ministry of Science and Technology, Taiwan (ROC), under contract no. MOST 103-2632-E-324-001-MY3.

REFERENCES

[1] Wikipedia, Facebook. Retrieved March 5, 2017, from https://zh.wikipedia.org/wiki/Facebook
[2] EPOCH TIMES, Facebook global monthly active users reached 1.65 billion Taiwan 18 million. Retrieved March 5, 2017, from http://www.epochtimes.com/b5/16/7/19/n8115301.htm
[3] Ministry of the Interior, R.O.C. (Taiwan), Monthly report. Retrieved March 5, 2017, from http://sowf.moi.gov.tw/stat/month/list.htm
[4] http://www.president.gov.tw/portals/0/images/IntroductionROC/govermentorganizations/Local-governments.html
[5] A. Bawa-Cavia, "Sensing the urban: Using location-based social network data in urban analysis," First Workshop on Pervasive Urban Applications (PURBA), San Francisco, CA, September 2010.
[6] Z. Cheng, J. Caverlee, K. Lee, and D. Sui, "Exploring millions of footprints in location sharing services," Fifth International AAAI Conference on Weblogs and Social Media, Barcelona, ES, July 2011.
[7] D. J. Crandall, L. Backstrom, D. Cosley, S. Suri, D. Huttenlocher, and J. Kleinberg, "Inferring social ties from geographic coincidences," Proceedings of the National Academy of Sciences, 107(52), 22436-22441, 2010.
[8] J. Cranshaw, R. Schwartz, J. Hong, and N. Sadeh, "The livehoods project: Utilizing social media to understand the dynamics of a city," Sixth International AAAI Conference on Weblogs and Social Media, Dublin, IE, May 2012.
[9] RFC 4627, https://www.ietf.org/rfc/rfc4627.txt
[10] RFC 3629, https://tools.ietf.org/html/rfc3629
[11] J. S. Chen et al., "Public Option Analysis for Hot check-in places at Taiwan," International Conference on Advanced Information Technologies/Forum on Taiwan Association for Web Intelligence Consortium, pp. 745-755, 2017.

TABLE III. STATISTICAL DATA OF 1ST AND 2ND TYPE DIVISION (APPENDIX A)

Each value is followed by its rank in parentheses. Area is in km2; densities are per km2.

Region              No. of places  No. of check-ins  Avg. check-ins  Area           Place density  Check-in density
Taichung City       739 (1)        313130 (2)        423.72 (14)     2214.9 (6)     0.334 (4)      141.375 (5)
Kaohsiung City      654 (2)        292662 (3)        447.5 (12)      2951.85 (4)    0.222 (7)      99.145 (6)
Taipei City         430 (3)        674111 (1)        1567.7 (1)      271.8 (16)     1.582 (1)      2480.18 (1)
New Taipei City     385 (4)        166010 (5)        431.19 (13)     2052.57 (9)    0.188 (8)      80.879 (7)
Taoyuan City        297 (5)        259749 (4)        874.58 (3)      1220.95 (14)   0.243 (6)      212.743 (4)
Changhua County     262 (6)        48921 (13)        186.72 (17)     1074.4 (15)    0.244 (5)      45.533 (10)
Tainan City         224 (7)        137465 (6)        613.68 (8)      2191.65 (7)    0.102 (9)      62.722 (9)
Nantou County       159 (8)        90570 (8)         569.62 (9)      4106.44 (2)    0.039 (16)     22.056 (17)
Pingtung County     130 (9)        83634 (9)         643.34 (7)      2775.6 (5)     0.047 (14)     30.132 (13)
Yunlin County       125 (10)       32627 (15)        261.02 (16)     1290.83 (13)   0.097 (10)     25.276 (15)
Miaoli County       108 (11)       51976 (12)        481.26 (11)     1820.31 (11)   0.059 (13)     28.553 (14)
Taitung County      99 (12)        83197 (10)        840.37 (4)      3515.25 (3)    0.028 (18)     23.667 (16)
Hsinchu County      99 (13)        102716 (7)        1037.54 (2)     1427.54 (12)   0.069 (11)     71.953 (8)
Yilan County        95 (14)        79373 (11)        835.5 (5)       2143.62 (8)    0.044 (15)     37.028 (11)
Chiayi City         71 (15)        22921 (16)        322.83 (15)     60.0256 (19)   1.183 (2)      381.854 (3)
Hsinchu City        67 (16)        45173 (14)        674.22 (6)      104.153 (18)   0.643 (3)      433.719 (2)
Chiayi County       65 (17)        9350 (17)         143.85 (18)     1903.64 (10)   0.034 (17)     4.912 (18)
Keelung City        9 (18)         4763 (18)         529.22 (10)     132.759 (17)   0.068 (12)     35.877 (12)
Hualien County      5 (19)         564 (19)          112.8 (19)      4628.57 (1)    0.001 (19)     0.122 (19)
Average             211.74         131521.68         578.77          1888.78        0.28           221.99
Standard deviation  207.38         161794.98         351.69          1316.31        0.42           560.15
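The derived columns of Table III appear to follow from the three measured ones: average check-ins as check-ins divided by places, and the two densities as places and check-ins divided by area. The paper does not state these formulas explicitly, so the Python sketch below is an assumption inferred from the table values; the function name `derived_columns` and the variable names are hypothetical. It also recomputes the average and the sample standard deviation of the place counts.

```python
from statistics import mean, stdev


def derived_columns(places: int, checkins: int, area_km2: float) -> dict:
    """Derived statistics for one region (assumed formulas, inferred from Table III)."""
    return {
        "avg_checkins": checkins / places,       # check-ins per place
        "place_density": places / area_km2,      # places per km2
        "checkin_density": checkins / area_km2,  # check-ins per km2
    }


# "No. of places" column of Table III, in table order.
PLACE_COUNTS = [739, 654, 430, 385, 297, 262, 224, 159, 130, 125,
                108, 99, 99, 95, 71, 67, 65, 9, 5]

# Taichung City row: 739 places, 313130 check-ins, 2214.9 km2.
taichung = derived_columns(739, 313130, 2214.9)
print(round(taichung["avg_checkins"], 2), round(taichung["place_density"], 3))

# Average and sample (n - 1) standard deviation over the 19 regions.
print(round(mean(PLACE_COUNTS), 2), round(stdev(PLACE_COUNTS), 2))
```

For the Taichung City row this gives 423.72 and 0.334, and the place counts give 211.74 and 207.38, matching the table. Check-in densities recomputed from the rounded areas can differ from the table in the last digit, presumably because the authors used unrounded areas.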

TABLE IV. PARTIAL STATISTICAL DATA OF 3RD TYPE DIVISION (APPENDIX B)

Each value is followed by its rank in parentheses. Area is in km2; densities are per km2.

#   Region                            No. of places  No. of check-ins  Avg. check-ins  Area           Place density  Check-in density
1   Kaohsiung City Sanmin District    90 (1)         29977 (19)        333.08 (101)    19.7866 (234)  4.549 (13)     1515.02 (19)
2   Kaohsiung City Fengshan District  87 (2)         22712 (30)        261.06 (127)    26.759 (216)   3.251 (19)     848.761 (31)
3   Changhua County Changhua City     81 (3)         15870 (42)        195.93 (147)    65.6947 (103)  1.233 (38)     241.572 (60)
4   Taichung City                     78 (4)         31558 (17)        404.59 (83)     39.8467 (168)  1.958 (26)     791.985 (33)
5   Kaohsiung City                    71 (5)         69825 (6)         983.45 (24)     19.3823 (236)  3.663 (17)     3602.51 (10)
6   Taichung City                     67 (6)         20752 (32)        309.73 (106)    62.7034 (110)  1.069 (42)     330.955 (54)
7   Taichung City West District       67 (7)         62360 (7)         930.75 (26)     5.7042 (267)   11.746 (3)     10932.3 (5)
8   New Taipei City                   66 (8)         25779 (27)        390.59 (86)     23.1373 (223)  2.853 (20)     1114.17 (23)
9   Taoyuan City Taoyuan District     63 (9)         49292 (9)         782.41 (34)     34.8046 (187)  1.81 (28)      1416.25 (20)
10  Taichung City                     62 (10)        19539 (35)        315.14 (104)    28.8758 (212)  2.147 (24)     676.657 (37)
11  Kaohsiung City Lingya District    61 (11)        28974 (23)        474.98 (65)     8.1522 (259)   7.483 (7)      3554.13 (12)
12  Taipei City Daan District         60 (12)        136915 (4)        2281.92 (10)    11.3614 (251)  5.281 (10)     12050.9 (4)
13  Taipei City Zhongzheng District   58 (13)        80553 (5)         1388.84 (19)    7.6071 (260)   7.624 (5)      10589.2 (6)
14  Pingtung County Pingtung City     58 (14)        30652 (18)        528.48 (58)     65.067 (105)   0.891 (47)     471.084 (45)
15  Taoyuan City Zhongli District     55 (15)        142427 (3)        2589.58 (7)     76.52 (73)     0.719 (50)     1861.3 (16)
16  Taitung County Taitung City       54 (16)        40412 (11)        748.37 (39)     109.769 (40)   0.492 (65)     368.155 (51)
17  New Taipei City                   53 (17)        15161 (44)        286.06 (113)    120.226 (35)   0.441 (71)     126.104 (83)
18  Taipei City Xinyi District        53 (18)        39961 (12)        753.98 (36)     11.2077 (253)  4.729 (12)     3565.5 (11)
19  Taipei City Zhongshan District    51 (19)        167197 (2)        3278.37 (4)     13.6821 (248)  3.728 (16)     12220.1 (3)
20  Taichung City                     51 (20)        20661 (33)        405.12 (82)     31.2578 (201)  1.632 (31)     660.987 (38)
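Assigning each place to a 3rd type division, as in Table IV, relies on the boundary-crossing test of Section III-C: a node is inside an area when the line extended from it meets the boundary an odd number of times. The Python sketch below illustrates that test with a standard even-odd crossing count. It is not the authors' implementation; the function names, the square test boundaries, and the region names "A" and "B" are hypothetical, and points lying exactly on a boundary are not treated specially.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def count_crossings(point: Point, boundary: List[Point]) -> int:
    """Count how often the line {(x_k, y_t) | x_k >= x_t} meets the closed boundary."""
    xt, yt = point
    crossings = 0
    n = len(boundary)
    for i in range(n):
        x1, y1 = boundary[i]
        x2, y2 = boundary[(i + 1) % n]  # the last vertex connects back to the first
        # An edge can meet the line only if its endpoints lie on opposite sides of y = y_t.
        if (y1 > yt) != (y2 > yt):
            # x coordinate at which the edge reaches height y_t (y1 != y2 is guaranteed here)
            x_cross = x1 + (yt - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > xt:
                crossings += 1
    return crossings


def is_inside(point: Point, boundary: List[Point]) -> bool:
    """A node is in the area when its extended line meets the boundary an odd number of times."""
    return count_crossings(point, boundary) % 2 == 1


def locate(point: Point, regions: Dict[str, List[Point]]) -> Optional[str]:
    """Return the name of the first region whose boundary contains the point, or None."""
    for name, boundary in regions.items():
        if is_inside(point, boundary):
            return name
    return None


# Hypothetical boundaries: two adjacent squares standing in for administrative regions.
regions = {
    "A": [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)],
    "B": [(4.0, 0.0), (8.0, 0.0), (8.0, 4.0), (4.0, 4.0)],
}
print(locate((6.0, 1.0), regions))
```

With the square "A", a node at (-1, 2) meets the boundary twice and is outside, while a node at (2, 2) meets it once and is inside, mirroring the roles of p_0 and p_1 in Fig. 4.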