Build Sentiment Classification Prediction Model for O2O Service

Long-Sheng Chen
Department of Information Management
168, Jifeng E. Rd., Wufeng District, Taichung, 41349 Taiwan, R.O.C.
886-4-23323000 ext. 5001
[email protected]

Wan-Ting Chien
Department of Information Management
168, Jifeng E. Rd., Wufeng District, Taichung, 41349 Taiwan, R.O.C.
886-4-23323000 ext. 4167
[email protected]

Hua-Nan Chang
Department of Golden-Ager Management
168, Jifeng E. Rd., Wufeng District, Taichung, 41349 Taiwan, R.O.C.
886-4-23323000 ext. 6001
[email protected]

ABSTRACT
With the rapid development of information and communication technology, the O2O (Online to Offline) business model has attracted a lot of attention from enterprises. In such a fast-growing environment, some studies have indicated that a lack of trust can do great damage to O2O business. In addition, some published works have pointed out that negative comments in social communities decrease consumers' trust in O2O companies and platforms. It is therefore necessary for enterprises to understand the important factors that affect the sentiment of consumers' textual reviews. This study aims to build prediction models by using Support Vector Machines Recursive Feature Elimination (SVM-RFE) and the Least Absolute Shrinkage and Selection Operator (LASSO), respectively. We attempt not only to build sentiment classification models, but also to find the important factors that affect the sentiments of comments. The findings can serve as a reference for enterprises in the O2O market when answering customers' comments, so as to improve customers' trust and service quality.

CCS Concepts
• Information systems ➝ World Wide Web ➝ Web applications ➝ Electronic commerce ➝ Online shopping

Keywords
O2O; Sentiment classification; Feature selection; SVM-RFE; LASSO

The 2017 3rd International Conference on Industrial and Business Engineering (ICIBE 2017), August 17-19, 2017, in Sapporo, Japan. Copyright 2017 ACM.

1. INTRODUCTION
O2O (Online to Offline or Offline to Online) means that consumers buy products or services from a network platform (online) and then receive the purchased products or enjoy the services in physical stores (offline). Many studies have pointed out that many companies in China are greatly concerned about the development of O2O. They compete to enter the O2O market because O2O is considered to have great growth potential (Carsten et al., 2016; Zhang and Huang, 2015). Worldwide, many O2O services have been developed successfully, such as group buying (Groupon, gomaji), food and beverage services (OpenTable, EzTable, Dianping), transportation (Uber and Zipcar), travel services (Airbnb and TripAdvisor) and so on. The rise of O2O has led to a gradual change in consumer spending and payment, and has contributed to the development of the online cash flow industry (Zhang and Huang, 2015). According to a report from OpenTable, the world-famous restaurant booking website, it handled more than 1 billion orders from its establishment to 2016, and more than 40,000 restaurants have cooperated with it. At EzTable, another well-known restaurant O2O website in Taiwan, the number of cooperating high-level restaurants reached 10,000 between 2008 and 2016. With the recent rapid development of O2O, more and more enterprises therefore face fierce competition when entering this market, and it has also become harder to attract customers in the O2O market.

In Taiwan, O2O platforms have developed very rapidly, but many problems have occurred. Take "meal vouchers" in group buying as an example. There have been many consumer disputes: the restaurant service information provided by platforms is inconsistent with the actual stores, the usage information for meal vouchers is not detailed, meal coupon trading platforms are mixed with fraud, and so on.

According to previous reports, before O2O users make a purchase, they first browse others' reviews to judge the authenticity of the platform information, its integrity, its credibility, and other elements of trust. Only with this initial trust will online transactions be completed. Consequently, scholars believe that consumer trust is an important factor in the success of O2O online transactions (Kim et al., 2005; Liang et al., 2014). Some studies have also confirmed that customer trust is one of the important issues in O2O development. Liang et al. (2014) indicated that consumer trust affects the development of O2O. Wu et al. (2015) also believe that confusion and insufficient trust pose a big problem for O2O. To sum up, we believe that customer trust is a critical factor for O2O, so it is very important to enhance customer trust in O2O platforms and manufacturers.

Textual comments in social media can be regarded as electronic word of mouth. These textual reviews in social communities have become one of the major references for consumers making purchase decisions (Yan et al., 2016). According to the previous literature, the reputations of companies and website platforms are part of customer trust, and Sparks and Browning (2011) indicated that electronic word of mouth contributes to the development of reputation and customer trust. Textual comments contain positive and negative sentiments, and sentiment classification aims to classify textual comments accordingly. If we can learn the sentiment of customers from textual comments, O2O service providers can learn the acceptance levels of the products and services they provide. Moreover, they can further improve the quality of the service or product and give consumers appropriate responses (Prabowo and Thelwall, 2009; Pang et al., 2002). Hence, they can enhance customer trust in O2O business. In addition, some published works have pointed out that negative comments in social communities decrease consumers' trust in O2O companies and platforms. It is therefore necessary for enterprises to understand the important factors that affect the sentiment of consumers' textual reviews.

In the available O2O literature, this issue is often studied with questionnaire surveys (Wu et al., 2015; Li et al., 2016). Questionnaire surveys cannot capture customers' immediate thinking, and they require a lot of manpower and time. Therefore, this study aims to build prediction models by using Support Vector Machines Recursive Feature Elimination (SVM-RFE) and the Least Absolute Shrinkage and Selection Operator (LASSO), respectively. We attempt not only to build sentiment classification models, but also to find the important factors that affect the sentiments of comments. The findings can serve as a reference for enterprises in the O2O market when answering customers' comments, so as to improve customers' trust and service quality.

2. LITERATURE REVIEW

2.1 O2O (Online to Offline)
Recently, many business people and researchers have paid much attention to O2O business. O2O can be considered a shopping experience that connects online business and offline entities through any device. More scholarly work is needed on how firms can make the O2O model more attractive and maintain a competitive advantage (Tsaia et al., 2015; Wu et al., 2015; He et al., 2016).

However, with the rapid development of O2O, some scholars have noticed that there may be a trust problem in the O2O environment (Wu et al., 2015; Liang et al., 2014). In the available literature, research on the customer trust issue in O2O is not sufficient, so this study attempts to extend the discussion of this topic.

At present, almost all O2O studies use questionnaires to do qualitative research (Li et al., 2016; Wu et al., 2015). This approach has some problems. For example, we cannot be sure whether the respondents really use O2O services. Collecting questionnaires also requires a lot of manpower and time, among other issues. This study therefore attempts to use customer-generated content (reviews) as research data.

2.2 Sentiment Classification
With the development of social media, user-generated content in online communities has great power to influence other users. Online customer reviews are a good example. Online comments and ratings have become a major reference for consumers making purchase decisions (Yan et al., 2016). Companies and suppliers can also find useful information about products or services in these comments (Engler et al., 2015) and then respond appropriately. Therefore, sentiment classification, which aims to identify positive and negative comments, has become a hot issue. If we can learn the sentiment of customers from textual comments, vendors can learn the acceptance levels of the products and services they provide. Moreover, they can further improve the quality of the service or product and give consumers appropriate responses (Prabowo and Thelwall, 2009; Pang et al., 2002). Customers, in turn, can use the results of sentiment classification as a basis for deciding whether to order services or products.

There are many sentiment classification methods. One of them is semantic orientation (SO), which sets up a positive and a negative vocabulary and then calculates scores, based on the relationship between a specific textual document and the vocabulary, to determine the semantic orientation of the document. Many studies have successfully applied semantic orientation methods to sentiment classification. For example, Murthy and Suresha (2015) used the semantic orientation method to effectively classify XML pages, and Chaovalit and Zhou (2005) used semantic orientation together with unsupervised and supervised learning methods to classify film reviews.

Another common kind of sentiment classification method is machine learning. Machine learning methods can achieve higher prediction accuracy, but they require a lot of time to define class labels and train the model. If SO methods are used instead of a machine learning model, the positive and negative term sets need to be updated, and the vocabulary must grow dramatically as the number of reviews grows. So, this study uses SO methods to help define class labels and then uses machine learning methods to build the sentiment classification model.

3. METHODOLOGY
There are several steps in this experiment. We need to define the factors influencing positive and negative sentiments, collect data, establish the training set, prepare the data, implement feature selection by using SVM-RFE and LASSO, build a predictive model by using SVM, assess the selected important factors, and finally draw conclusions. The steps are described concisely below.

Step 1: Define factors influencing positive and negative sentiments
After collecting the available literature, this study attempts to define the potential factors that affect the sentiments of reviews on the O2O platform.

Step 2: Data collection
In this study, we collect textual reviews from the OpenTable website. A crawler tool is used to crawl the webpage for items such as the member's account, date, comments, tag comments, ratings and so on.

Step 3: Establish training set
In this step, we use the factors defined in Step 1 to describe the collected data and form the training set.

Step 4: Prepare data
In this study, we adopt 5-fold cross-validation to make the experimental results more reliable. We divide the collected data into five equal parts. Each part in turn serves as the test data, and the other four parts serve as the training data.

Step 5: Implement feature selection methods
In this study, we use two well-known methods, SVM-RFE and LASSO, and compare them to find the better one.

Step 6: Build predictive model by using SVM
In this step, LibSVM is used to verify the experimental results of SVM-RFE and LASSO. The C-SVM classification parameter selection tool grid.py, with the RBF (radial basis function) kernel, is used to obtain the best parameters and verify the classification efficiency.

Step 7: Assess selected important factors.

Step 8: Make conclusions.

4. EXPERIMENTAL RESULTS

4.1 Defined Factors
According to the work of Liang and Yang (2014), O2O has four dimensions: 1. Consumer; 2. Vendor & their Product and Service; 3. O2O Website Platform; 4. Trading Environment. However, the Consumer, O2O Website Platform and Trading Environment dimensions cannot be obtained from textual comments. Most comments concern products and consumer experiences. So, this study focuses on Vendor & Product and Service. The defined factors influencing positive and negative sentiments are listed in Table 1.

Table 1. The Defined Potential Factors
Notation | X (Independent variables) | Notation | X (Independent variables)
SP | Service speed | L | Location
F | Friendly | P | Price
C | Clean | UC | Understand customer needs
MD | Menu design | TD | Text Difficulty/diversity
RA | Restaurant atmosphere | TTR | Type-Token Ratio
E | Environment | GL | Grade Level
AR | Ambience Rating | T | Token
SR | Service Rating | S | Sentimental features
SF | Food Rating | RS | Recommendation strength
SA | All Satisfaction Rating | Ex | Expertise
PS | Portion size | |
Notation | Y (Dependent variable)
Y | Sentiment

4.2 Collected Data
This study uses the import.io tool to crawl OpenTable and selects reviews of the restaurants "The Five Fields" and "tramshed" in London, posted from 2013 to 2017, as our experimental data. Figure 1 gives an example of a textual review. In Figure 1, red marked box 1 is the overall rating, box 2 is the comment, and box 3 is the rating of specific items. A total of 918 restaurant reviews have been collected. Among the collected reviews, 76.03% were positive, 14.49% were negative and 9.48% had no special sentiment.

4.3 Results of Feature Selection

4.3.1 SVM-RFE
Based on the merit that SVM-RFE assigns to each factor, we list the 10 factors with the highest merit in each fold. The results are shown in Table 2. From this table, we build three feature sets according to occurrence frequency: SVM-RFE #1 {E, SP, SA, GL, T, SF, AR, SR} (factors selected in all five folds), SVM-RFE #2 {E, SP, SA, GL, T, SF, AR, SR, MD} (at least four folds), and SVM-RFE #3 {E, SP, SA, GL, T, SF, AR, SR, MD, RS, S} (at least two folds).

Table 2. Results of SVM-RFE Feature Selection
Factors | Selections across fold1 to fold5 | Occurrence frequency
E | V V V V V | 5
SP | V V V V V | 5
SA | V V V V V | 5
GL | V V V V V | 5
T | V V V V V | 5
SF | V V V V V | 5
AR | V V V V V | 5
SR | V V V V V | 5
MD | V V V V | 4
RS | V V | 2
S | V V | 2
P | V | 1
L | V | 1

4.3.2 LASSO
Next, we use LASSO to estimate the factor coefficients and select the factors whose coefficients are not equal to 0. The results are given in Table 3. According to their occurrence frequency, we build three feature sets: LASSO #1 {SA, SR}, LASSO #2 {SA, SR, RA, RS}, and LASSO #3 {SA, SR, RA, RS, SP, SF}.

Table 3. Results of LASSO Feature Selection
Factors | fold1 | fold2 | fold3 | fold4 | fold5 | Occurrence frequency
SA | 0.23 | 0.23 | 0.22 | 0.24 | 0.23 | 5
SR | 0.10 | 0.04 | 0.08 | 0.08 | 0.08 | 5
RA | 0.04 | 0 | 0.03 | 0.03 | 0.02 | 4
RS | 0.08 | 0 | 0.05 | 0.11 | 0.10 | 4
SP | 0.04 | 0 | 0 | 0 | 0 | 1
SF | 0 | 0 | 0.0020 | 0 | 0 | 1
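The selection rule in Section 4.3.2 can be sketched in a few lines. The example below is a minimal sketch, not the authors' actual procedure: it takes the per-fold LASSO coefficients from Table 3, counts in how many folds each factor has a non-zero coefficient, and groups the factors by that occurrence frequency. The thresholds (all 5 folds, at least 4 folds, at least 1 fold) and the helper names `occurrence_frequency` and `feature_set` are assumptions made for illustration; the paper states only that the sets are built according to occurrence frequency.

```python
from collections import Counter

# LASSO coefficients per fold (fold1..fold5), copied from Table 3.
lasso_coefficients = {
    "SA": [0.23, 0.23, 0.22, 0.24, 0.23],
    "SR": [0.10, 0.04, 0.08, 0.08, 0.08],
    "RA": [0.04, 0.0, 0.03, 0.03, 0.02],
    "RS": [0.08, 0.0, 0.05, 0.11, 0.10],
    "SP": [0.04, 0.0, 0.0, 0.0, 0.0],
    "SF": [0.0, 0.0, 0.0020, 0.0, 0.0],
}


def occurrence_frequency(coefficients):
    """Count, for each factor, the folds in which its coefficient is non-zero."""
    counts = Counter()
    for factor, per_fold in coefficients.items():
        counts[factor] = sum(1 for value in per_fold if value != 0)
    return counts


def feature_set(counts, min_folds):
    """Factors selected in at least `min_folds` folds (assumed threshold rule)."""
    return [factor for factor, n in counts.items() if n >= min_folds]


counts = occurrence_frequency(lasso_coefficients)
lasso_1 = feature_set(counts, 5)  # selected in every fold
lasso_2 = feature_set(counts, 4)  # selected in at least four folds
lasso_3 = feature_set(counts, 1)  # selected in at least one fold

for name, factors in [("LASSO #1", lasso_1), ("LASSO #2", lasso_2), ("LASSO #3", lasso_3)]:
    print(name, factors)
```

With the Table 3 values, the three thresholds give two, four, and six factors respectively. The same counting step would apply to the SVM-RFE marks in Table 2 if each "V" were encoded as a non-zero entry.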

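The evaluation in Section 4.3.3 summarizes each feature set by the mean and standard deviation of its five fold accuracies. As a rough check of how those summary rows relate to the per-fold rows, the sketch below recomputes both figures for the original feature set from its reported fold accuracies (65.57%, 65.03%, 65.03%, 68.85%, 67.03%). Using the sample (n - 1) standard deviation is an assumption, chosen because it reproduces the reported value of 1.643; the function name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Fold accuracies (%) of the original feature set, fold1..fold5, as reported.
original_set = [65.57, 65.03, 65.03, 68.85, 67.03]


def summarize(accuracies):
    """Return (mean, sample standard deviation) of the fold accuracies."""
    return mean(accuracies), stdev(accuracies)


orig_mean, orig_sd = summarize(original_set)
print(f"original set: mean={orig_mean:.2f}%  st.dev.={orig_sd:.3f}")
```

Because the reported fold accuracies are already rounded to two decimals, recomputing the summary rows for other feature sets this way may differ from the published figures in the last digit.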
Figure 1. An example of comment in OpenTable.

4.3.3 Evaluation by SVM
In this step, we build sentiment classification models by SVM using the original feature set and the three reduced feature sets SVM-RFE #1, #2, and #3. The results can be found in Table 4. From this table, the SVM-RFE #1 feature set has the best overall accuracy (mean 67.5%), and its training time (0.0064 s) is far shorter than that of the original set (0.1568 s).

Table 4. Evaluation Results of SVM-RFE Feature Selection
Experiment | Original set | SVM-RFE#1 | SVM-RFE#2 | SVM-RFE#3
fold1 | 65.57% | 68.85% | 66.12% | 64.48%
fold2 | 65.03% | 64.48% | 66.67% | 66.67%
fold3 | 65.03% | 66.67% | 65.57% | 64.48%
fold4 | 68.85% | 70.49% | 72.13% | 71.04%
fold5 | 67.03% | 67.03% | 66.49% | 64.86%
Mean | 66.3% | 67.5% | 67.4% | 66.3%
St. Dev. | 1.643 | 2.281 | 2.679 | 2.795
Training Time (sec.) | 0.1568 | 0.0064 | 0.4944 | 0.0048

Next, we build sentiment classification models by using the original feature set and the three reduced feature sets LASSO #1, #2, and #3. The results can be found in Table 5. From this table, the LASSO #1 feature set has the best overall accuracy (mean 67.39%, matched by LASSO #3), and its training time (0.0104 s) is far shorter than that of the original set (0.1568 s).

Table 5. Evaluation Results of LASSO Feature Selection
Experiment | Original set | LASSO#1 | LASSO#2 | LASSO#3
fold1 | 65.57% | 71.58% | 60.11% | 67.21%
fold2 | 65.03% | 68.85% | 62.84% | 65.57%
fold3 | 65.03% | 60.11% | 54.64% | 65.57%
fold4 | 68.85% | 71.04% | 66.67% | 70.49%
fold5 | 67.03% | 65.41% | 62.70% | 68.11%
Mean | 66.30% | 67.39% | 61.39% | 67.39%
St. Dev. | 1.643 | 4.743 | 4.438 | 2.047
Training Time (sec.) | 0.1568 | 0.0104 | 0.0064 | 0.0216

Table 6 compares the best-performing LASSO and SVM-RFE feature sets. From this table, we can see that SVM-RFE feature set #1 has the best performance. Therefore, we use this set to build the sentiment classification model and to select the important factors. Finally, according to the result of the comparison, we select the 8 important factors shown in Table 7.

Table 6. Comparison between SVM-RFE, LASSO Selected Feature Sets and Original Feature Set
Experiment | Original set | LASSO#1 | SVM-RFE#1
fold1 | 65.57% | 71.58% | 68.85%
fold2 | 65.03% | 68.85% | 64.48%
fold3 | 65.03% | 60.11% | 66.67%
fold4 | 68.85% | 71.04% | 70.49%
fold5 | 67.03% | 65.41% | 67.03%
Mean | 66.30% | 67.39% | 67.5%
St. Dev. | 1.643 | 4.743 | 2.281
Training Time (sec.) | 0.1568 | 0.0104 | 0.0064

Table 7. The Extracted Important Factors
Notation | Factors | Definitions
E | Environment | The satisfaction about the environment, such as the level of noise, brightness and so on
SP | Service speed | The satisfaction about the speed or efficiency of service
SA | All Satisfaction Rating | The overall satisfaction with the restaurant
GL | Grade Level | Readability, mapped to American school grade levels
T | Token | The length of the textual review
SF | Food Rating | Consumers' satisfaction with meals
AR | Ambience Rating | Consumers' satisfaction with the restaurant environment
SR | Service Rating | Consumers' satisfaction with restaurant services

5. CONCLUSIONS
The purpose of this study is to explore the factors that affect the sentiments of reviews in the O2O environment. We use textual reviews as samples. This work builds prediction models by using Support Vector Machines Recursive Feature Elimination (SVM-RFE) and the Least Absolute Shrinkage and Selection Operator (LASSO), respectively. We attempt not only to build sentiment classification models, but also to find the important factors that affect the sentiments of comments.

According to the results, the eight important factors are "Environment", "Service speed", "All Satisfaction Rating", "Grade Level", "Token", "Food Rating", "Ambience Rating", and "Service Rating". These factors can inform businesses of consumer preferences, so that when adjusting operations and managing comments they can pay particular attention to these characteristics and then take appropriate response measures. The findings can also serve as a reference for enterprises in the O2O market when answering customers' comments, so as to improve customers' trust and service quality.

6. ACKNOWLEDGMENTS
This work was supported in part by the National Science Council of Taiwan, R.O.C. (Grant No. MOST 104-2410-H-324-009-MY2).

7. REFERENCES
[1] Carsten, P., "China's Wanda, Tencent, to set up $814 million e-commerce company," 2016, from: http://www.reuters.com/article/us-wanda-tencent-baidu-idUSKBN0GT04020140829
[2] Z. Wu, "Service Recommendation Method on Multiple Dimension O2O," International Conference on Intelligent Transportation, PP. 713-716, 2015.
[3] B. Zhang, L. Huang, "The research status of O2O industry analysis , for example," International Conference on Logistics, Informatics and Service Sciences (LISS), 2015.
[4] D. J Kim, "A Multidimensional Trust Formation Model in B-to-C E-commerce: A Conceptual Framework and Content Analyses of Academia," Decision Support Systems, VOL. 3, NO. 8, PP. 143-166, 2005.
[5] M. Liang, X. Yang, H. Ou, "The Measurement of the Consumer Trust to O2O E-commerce Based on Fuzzy Evaluation," Seventh International Joint Conference on Computational Sciences and Optimization, PP. 113-116, 2014, Beijing.
[6] Q. Yan, S. Wu, L. Wang, "E-WOM from e-commerce websites and social media: Which will consumers adopt?," Electronic Commerce Research and Applications, VOL. 17, PP. 62-73, 2016.
[7] B. A. Sparks, V. Browning, "The impact of online reviews on hotel booking intentions and perception of trust," Tourism Management, VOL. 32, PP. 1310-1323, 2011.
[8] X. Li, "Research on construction of takeout O2O platform service quality evaluation system," The 13th International Conference on Service Systems and Service Management (ICSSSM), 2016.
[9] T. M. Tsaia, W. N. Wanga, Y. T. Lina, "An O2O commerce service framework and its effectiveness analysis with application to proximity commerce," 6th International Conference on Applied Human Factors and Ergonomics (AHFE 2015) and the Affiliated Conferences, VOL. 3, PP. 3498-3505, 2015.
[10] Z. He, T. C. E. Cheng, J. Cheng, S. Wanf, "Evolutionary location and pricing strategies for service merchants in competitive O2O markets," European Journal of Operational Research, VOL. 254, NO. 2, PP. 595-609, 2016.
[11] J. Y. Lin, "A Study on Identifying Review Manipulation of ," Thesis, Department of Information Management, Chaoyang University of Technology, 2013.
[12] S. Gagić, D. Tešanović, Ana Jovičić, "The Vital Components of Restaurant Quality that Affect Guest Satisfaction," TURIZAM, VOL. 17, Issue 4, PP. 166-176, 2013.
[13] M. Bujisic, J. Hutchinson, H.G. Parsa, "The effects of restaurant quality attributes on customer behavioral intentions," International Journal of Contemporary Hospitality Management, VOL. 26, Issue 8, PP. 1270-1292, 2014.
[14] T. H. Engler, P. Winter, M. Schulz, "Understanding online product ratings: A customer satisfaction model," Journal of Retailing and Consumer Services, VOL. 27, PP. 113-120, 2015.
[15] R. Prabowo and M. Thelwall, "Sentiment analysis: A combined approach," Journal of Informetrics, VOL. 3, NO. 2, PP. 143-157, 2009.
[16] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up?: Sentiment Classification Using Machine Learning Techniques," Annual Meeting of the ACL, Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, VOL. 10, PP. 79-86, 2002.
[17] A. K. Murthy, Suresha, "XML URL Classification based on their semantic structure orientation for Web Mining Applications," Procedia Computer Science, VOL. 46, PP. 143-150, 2015.
[18] P. Chaovalit, L. Zhou, "Movie Review Mining: a Comparison between Supervised and Unsupervised Classification Approaches," Proceedings of the 38th Hawaii International Conference on System Sciences, ISSN 1530-1605, PP. 1-9, 2005.
[19] S. Tan and J. Zhang, "An Empirical Study of Sentiment Analysis for Chinese Documents," Expert Systems with Applications, VOL. 34, NO. 4, PP. 2622-2629, 2008.

