Netnography and Text Mining to Understand Perceptions of Indian Travellers Using Online Travel Services
Total Page:16
File Type:pdf, Size:1020Kb
(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 9, 2020 Netnography and Text Mining to Understand Perceptions of Indian Travellers using Online Travel Services Dashrath Mane1, Dr. Prateek Srivastava2 Department of CSE, Sir Padampat Singhania University [SPSU] Udaipur, India Abstract—Advancements in the electronic commerce industry The best costs and cashback bargains on air are assured with have helped online travel services (OTA) in many ways. The yatra.com. paper examines the overall impact of traveller’s using online services and their sentiments derived from a collection of reviews Cleartrip online travel service provider offers internet for online travel service providers known as online travel agents booking for train and flight tickets for domestic and (OTA) in India. Customer reviews from different identified international bookings. Goibibo.com is another name which sources are collected and the satisfaction of travellers using offers the best arrangements/bookings in trains, flight. From various online travel services is analyzed using netnographic booking universal/local occasions to encouraging Foreign analysis and text mining. This paper also covers a detailed Exchange, Visa, Passport, protection for movement purposes. process of data collection, analysis using netnography and text Thomas cook does this everything for the travellers. mining methods which helps us for the analysis and deriving Travelguru.com, regularly appraised as a standout amongst the sentiments from collected reviews. Various results obtained are most went by Indian travel destinations on the web, is a presented as part of token lists, keyword analysis, and service- fortune place of treats with regards to booking air tickets, specific analysis. The statistical analysis of different results is lodging offices. tested to understand the relationship between various services and OTA. Netnography and Tourism industry: Netnography is Method used for studying culture and communities online. It Keywords—Consumer; travellers; netnography; text mining; is a tool to understand social interaction in digital OTA; sentiment; perception communications [6]. Netnography is a known tool accepted as a virtual ethnography tool, is being explored along with text I. INTRODUCTION mining. The approach used in this paper also produces an A. Tourism Market in India analysis of perceptions of travellers in India using text mining techniques. India, the country is known for its rich history, cultures. India placed 40th out of 136 nations as per the report of Travel Phases in the Netnography Process [11]: Fig. 1 explains & Tourism Competitiveness for the year 2017 [1]. The various phases in the process of netnographic analysis; below tourism industry in India is playing a significant role in the are the various phases in it: economy [2]. 1) Research planning: Research planning [3] is the initial The current scenario shows that India has a bright future in phase in netnographic studies. This phase speaks about the tourism sector as lots of development taking place in the defining the problem, defining the research objectives, should Indian tourism industry. In the last few years, the Ministry of talk about translating the research objectives into a specific set Tourism, Government of India, has initiated various attempts of questions. to lift tourism in a country that has opened research opportunities in it. The plans like PRASAD, e-tourist Visa, 2) Entrée: In this phase, online communities, blogs, Mobile applications for tourists are helping out to gain a new groups are identified and allows an ethnographic to enter into tourist footprint [3]. the communities to get a better knowledge of cultures and communities. B. Online travel services in India 3) Data collection: Data collection is a very vital phase in It is expected that by the end of 2020, the count of the this type of study. In this phase, Netnographer collects data bookings made online is likely to reach 4 billion in India.[4] from Internet data, interview data, and field notes. Travellers‘ Makemytrip.com has been positioning high on the rundown of reviews downloaded using various rating & referral sites like movement sites that fill in as a one-stop look for the whole mouthshut.com and consumeraffairs.com. These are the group. The online booking travel related firm is a commonly known name these days & it has been collaborating with websites that allow the user to collect millions of reviews imperative players in the industry. Yatra.com[5] is the best for posted by travellers in India. The whole reviews were movement specialists, long-standing corporate customers, and collected using systematics steps in the netnographic easygoing hikers alike & is an excellent travel site in India. process[7]. 559 | P a g e www.ijacsa.thesai.org (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 9, 2020 4) Interpretations/Data analysis: This makes use of One this online travel-related online services in India. Fig. 3 shows of the data analysis technique known as Analytical coding, the overall rating for Cleartrip on mouthshut.com. Renal data analysis consist of coding, noting, abstracting, Fig. 4 represents the sample to review and rating for checking and refining, generalizing and theorizing as shown Cleartrip [7]. In addition to actual comment, Reviews contain in Fig. 2. values from 1 (low) to 5(high) for all the website components. 5) The data collection process was performed from Fig. 5 is a sample of the overall rating for MakeMyTrip. sources like mouthshut.com [8] and Consumer Affairs.com for Referring to the Fig. 5, it is very much clear that for all five this study. More than 2000 reviews from these platforms have website components, the overall rating for MakeMyTrip is 2 been collected and maintained in an excel sheet. [8]. Data collected from the identified referral, rating sites is stored in the Excel sheet. This data contains various fields as 1) Date of review, 2) Reviewer name, 3) Gender, 4) Age, 5) Location, 6) Review, 7) Source, 8) Review rating, 9) Service and support, 10) Information depth, 11) Content, 12) User-friendly, 13) Time to load. Fig. 1. Phases in Netnography Process [11]. Fig. 6 shows an Excel document of reviews collected from different sources of online travel agents. A. Text Pre-Processing The reviews were collected from various online platforms & referral rating sites need to have a series of text preprocessing before referring them for analysis using Text mining. The concept of text preprocessing consists of a series Fig. 2. Analytical Data Analysis [11]. of stages, which include spelling in normalization, filtering, lemmatization. The various text preprocessing tasks are being II. RESEARCH METHODOLOGY essential and this includes content cleanup, tokenization, The online travel agents considered in this research work grammatical form tagging [9]. includes MakeMyTrip, Goibibo, Cleartrip, redBus, and Yatra. Table I represent various phases in determining positive A good number of reviews for each of these agents [12] negative reviews the process of determining sentiments selected are evaluated using netnographic studies. The consist of some of the essential processes like splitting words, analysis performed as netnographic analysis is used to carry POS tagging, lemmatization, joining and then sentiment out the sentiments, the levels of sentiments of travellers using analysis. Fig. 3. The Overall Rating for Cleartrip (Source: Mouthshut.com). Fig. 4. Sample Review and Rating for Cleartrip (Source: Mouthshut.com). 560 | P a g e www.ijacsa.thesai.org (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 11, No. 9, 2020 Fig. 5. Overall Rating for MMT (Source: Mouthshut.com). Fig. 6. Different Fields in Format of Reviews Collected. TABLE I. VARIOUS PHASES IN DETERMINING POSITIVE-NEGATIVE REVIEWS Process Output Horrible experience, hotels don't provide basic amenities and charge for everything. Initial Review Customer service part is worst. Lowercase Horrible experience, hotels don't provide basic amenities and charge for everything. Customer service part is worst. Punctuation Horrible experience hotels don't provide basic amenities and charge for everything customer service part is worst Split words ['horrible', 'experience', 'hotels', "don't", 'provide', 'basic', 'amenities', 'and', 'charge', 'for', 'everything', 'customer', 'service', 'part', 'is', 'worst'] [('horrible', 'JJ'), ('experience', 'NN'), ('hotels', 'NNS'), ("don't", 'VBP'), ('provide', 'VB'), ('basic', 'JJ'), ('amenities', 'NNS'), ('and', 'CC'), POS tagging ('charge', 'NN'), ('for', 'IN'), ('everything', 'NN'), ('customer', 'NN'), ('service', 'NN'), ('part', 'NN'), ('is', 'VBZ'), ('worst', 'JJ')] Lemmatization ['horrible', 'experience', 'hotel', "don't", 'provide', 'basic', 'amenity', 'and', 'charge', 'for', 'everything', 'customer', 'service', 'part', 'be', 'bad'] Joining horrible experience hotel don't provide basic amenity and charge for everything customer service part be bad Sentiment Analysis {'negative': 0.333, 'neutral': 0.667, 'positive': 0.0, 'compound': -0.7906} Sentiment Positive / Negative Fig. 7 and 8 graphically represents the top 5 tokens in III. RESULTS AND DISCUSSIONS positive and negative category respectively The results obtained from this research are presented in this section. 2) Gender-specific (Male) Analysis of tokens: In the gender-specific category of tokens obtained, the Table III 1) Token identification: