
Analyzing sentiments in tweets for Tesla Model 3 #analyticsx
Tejaswi Jha, Praneeth Guggila
MS in Business Analytics, Oklahoma State University – Stillwater

Abstract
Tesla Model 3 is making news in the history of automobiles as never seen before. The new model already has more than 400,000 reservations and counting. We carried out a descriptive analysis of sales of all Tesla models and found that the number of reservations to date is more than three times the sales of all previous Tesla cars combined. Clearly there is a lot of buzz surrounding this car, and such buzz influences consumers’ opinions and sentiments, which in turn leads to additional reservations. This paper aims to summarize findings about people’s opinions, reviews and sentiments about Tesla’s new car, Model 3, using textual analysis of tweets. For this, we used live streaming data from Twitter over time and studied its pattern based on the booking timeline. We have been collecting data since March 2016, when the interest of people in this model spiked suddenly. A sample of 1,000 tweets was analyzed from a total of about 10,000 collected via Flume. We used SAS® Enterprise Miner to evaluate key questions pertaining to the analysis, such as the following: What features do people think about? What are the factors that motivate people to reserve a Tesla? What factors are discouraging them?

Approach
• For this paper, a sample of 1,000 tweets was analyzed. We considered only tweets in the English language. The sample tweets were cleaned by removing duplicate tweets, retweets, hyperlinks and usernames.
• The tweets, in the form of an Excel file, were imported into SAS Enterprise Miner and analyzed using NLP techniques: lemmatization, concept linking and use of synonyms.
• Once the tweets were imported into SAS Enterprise Miner, we used the Text Parsing, Text Filter, Text Cluster and Text Topic nodes to analyze the text, as shown in Fig 2.
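The cleaning step described in the Approach (dropping duplicate tweets and retweets, stripping hyperlinks and usernames) could be sketched in Python roughly as follows. This is a minimal illustration, not the authors' actual procedure: the function name `clean_tweets`, the regular expressions, and the rule that a retweet is any text starting with "RT @" are all assumptions made for this example.

```python
import re

# Assumed patterns for this sketch; the poster does not say how each item was detected.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
USER_RE = re.compile(r"@\w+")
RETWEET_RE = re.compile(r"^\s*RT\s+@", re.IGNORECASE)


def clean_tweets(tweets):
    """Return cleaned tweet texts with retweets and duplicates dropped."""
    seen = set()
    cleaned = []
    for text in tweets:
        if RETWEET_RE.match(text):
            continue  # skip retweets entirely
        text = URL_RE.sub("", text)    # remove hyperlinks
        text = USER_RE.sub("", text)   # remove @usernames
        text = " ".join(text.split())  # collapse leftover whitespace
        key = text.lower()             # duplicates are compared case-insensitively
        if not text or key in seen:
            continue  # skip empty results and duplicates
        seen.add(key)
        cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    sample = [
        "RT @fan: Model 3 is here",
        "Reserved my #TeslaModel3 today! https://t.co/abc",
        "reserved my #TeslaModel3 today! @elonmusk",
        "Love it @elonmusk",
    ]
    for tweet in clean_tweets(sample):
        print(tweet)
```

Comparing duplicates after the links and usernames are removed means two tweets that differ only in a shortened URL or a mention count as the same tweet; whether the original analysis did this is not stated.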

Fig 2. Data Analysis
• Concept links can be viewed in the interactive filter viewer, opened from the properties panel of the Text Filter node.
• Concept linking is a type of association analysis between the terms used. To create the concept links, we treated some keywords as synonyms, such as “innovation”, “breakthrough”, “revolution” and “disruptive”.

Methods
Data Preparation and Analysis
The primary source of data for this research was the Twitter feeds posted by users. Tweets posted around the time of the announcement of Tesla Model 3 were collected using Flume, which uses the exposed Twitter API to save tweets. The process flow for the analysis is shown in Fig 1.
Fig 1. Process flow
• Flume saved the tweets on the Memory Channel (MEM), after which the data was sent to the Hadoop Distributed File System (HDFS) sink, and then to HDFS.
• HDFS is used to store data collected from social networking sites. Since the data is in an unstructured format (.json), it is converted into a structured format (.csv) using Python. Over a period of 18 weeks, 10,000 tweets were collected under the hashtag #TeslaModel3. 90% of the tweets obtained under this hashtag were tweeted within 60 days after Tesla Model 3 was unveiled.

Results
Concept Links
We generated concept links to answer two questions regarding people’s choice.
1) Why an electric car?
• The link shows the term being analyzed, electric car, in the center, and the terms it is most often used with as links.
• The width of a link is directly proportional to the strength of association of the term with electric car. It also shows how many times the two terms co-occur in a tweet regarding Tesla Model 3.
• Most noticeable association: affordable. Most tweets show that, with minimized styling, the basic model of Tesla Model 3 is the most affordable electric car.
Fig 3. Concept Link 1
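The co-occurrence idea behind a concept link (how many tweets mention a linked term together with the central term) can be illustrated with a simple count. The sketch below is only that: a raw tweet-level co-occurrence count, not the association-strength computation that SAS Enterprise Miner performs. The function name `cooccurrence_counts` is hypothetical, and the code assumes tweets are already cleaned and whitespace-separated.

```python
from collections import Counter


def cooccurrence_counts(tweets, target):
    """Count, for each other term, how many tweets it shares with `target`."""
    target = target.lower()
    token = target.replace(" ", "_")  # treat a multi-word target as one term
    counts = Counter()
    for text in tweets:
        # A set is used so each term is counted at most once per tweet.
        terms = set(text.lower().replace(target, token).split())
        if token in terms:
            terms.discard(token)
            counts.update(terms)
    return counts


if __name__ == "__main__":
    sample = [
        "affordable electric car at last",
        "the electric car is affordable",
        "love the design",
    ]
    for term, n in cooccurrence_counts(sample, "electric car").most_common(3):
        print(term, n)
```

In a concept-link diagram, larger counts of this kind would correspond to the terms drawn closest to, or most thickly linked with, the central term.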

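The JSON-to-CSV conversion mentioned under Data Preparation and Analysis might look something like the sketch below. It assumes that the collected file holds one JSON tweet object per line and that the fields of interest are `id_str`, `created_at`, `lang` and `text`; the poster does not state the file layout or which fields were kept, so both are assumptions, as is the function name `tweets_json_to_csv`.

```python
import csv
import io
import json

# Assumed columns for this sketch; the poster does not list the fields that were kept.
FIELDS = ["id_str", "created_at", "lang", "text"]


def tweets_json_to_csv(json_lines, out_file, fields=FIELDS):
    """Write one CSV row per parseable JSON tweet; return the number of rows."""
    writer = csv.DictWriter(out_file, fieldnames=fields)
    writer.writeheader()
    rows = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed records
        if not isinstance(tweet, dict):
            continue  # skip JSON values that are not objects
        # Keep only the chosen fields; missing ones become empty strings.
        writer.writerow({f: tweet.get(f, "") for f in fields})
        rows += 1
    return rows


if __name__ == "__main__":
    raw = [
        '{"id_str": "1", "created_at": "x", "lang": "en", "text": "hi", "extra": 1}',
        "not json",
    ]
    buffer = io.StringIO()
    print(tweets_json_to_csv(raw, buffer))
    print(buffer.getvalue())
```

For a real file, `json_lines` can be an open file handle and `out_file` a file opened with `newline=""`, as the `csv` module documentation recommends.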
Results Continued
2) Why a Tesla Model 3?
• Most frequent words: Elon Musk and SpaceX.
• An ardent fan base of Elon Musk was found on Twitter, with mostly positive tweets about him, Tesla Model 3 and his other venture, SpaceX. The tweets draw parallels between the technology used in Tesla Model 3 and SpaceX.
• The links clearly show the mindset of users regarding Tesla Model 3. The tweets associate this car with words like future, reservation, love and wait.
Fig 4. Concept Link 2

Text Topic Analysis
• After connecting the Text Filter node in SAS® Enterprise Miner™, we joined the Text Topic node, which enabled us to combine the terms into topics for further analysis.
• Using the Text Topic node and by carefully selecting terms, 8 user topics were defined, as shown in Fig 5.

Topic ID | Topic terms | Explanation
1 | +electric car, +market, +affordable, +teslamodel3, +future | This topic is a group of terms that describe the Tesla Model 3 car.
2 | +, +elonmusk, +great | This topic talks about Elon Musk, his qualities and ventures.
3 | +elonmusk, +teslamodel3, love | This topic is about love for Tesla Model 3.
4 | +reservation, +company, +electric car, +elonmusk | This topic is a group of terms that talk about reservation of the car and brand value.
5 | +wait, +love, +excite, +deposit, +look | This topic is a group of terms that talk about excitement and the waiting period of the Tesla Model 3 car.
6 | +buy, +stock, +pay, +teslamotors | This topic is a group of terms that talk about stocks and market shares of Tesla.
7 | +reserve, +gm, +Chevy bolt, +pre-order | This topic compares Tesla Model 3 with its competitors, like General Motors and the Chevy Bolt.
8 | +teslamodelx, +tech, +prototype | This topic is a group of terms that compare the Tesla Model 3 car with other Tesla cars.
Fig 5. Text topics

Cluster Analysis
• Four clusters were generated and all of them were well separated from each other.
• The pie chart shows the distribution of the cluster frequencies for the four prominent clusters. The frequencies were well distributed amongst all four clusters, with cluster 3 showing a slightly higher frequency than the rest. This shows that most of the words contribute to the story regarding increasing reservations, anticipation and the wait period for this car.

Cluster ID | Descriptive terms | Percentage | Storyline
1 | +video, +cost, +electricvehicle, +success, +Pre-orders, +Priority | 20% | This cluster is a group of terms that describe the factors related to the success of the Tesla Model 3 car.
2 | +tax break, +subsidy, +government, +huge, +buyer | 20% | This cluster describes the subsidy benefits of the car.
3 | +reservation, +Plant, +factory, +elonmusk, +wait | 33% | This cluster is a group of terms related to production of the Tesla Model 3 car.
4 | +affordable, +sell, +year, +stock, +autonews, +price, +autopilot, +interior | 26% | This cluster is a group of terms that describe the factors associated with the features and market value of the Tesla Model 3 car.
Fig 6. Text Clusters
Fig 5. Cluster frequencies
Fig 7. Distance between clusters

Conclusion
The large amount of information contained in microblogging websites makes them an attractive source of data for opinion mining and sentiment analysis. This research sheds light on the views and opinions of Twitter users about Tesla Model 3 and found that there was a very strong liking among the users for this car. There were negligible negative tweets on Tesla Model 3.

Future Scope
As a part of future scope, over an extended timeline, the Twitter corpus can be analyzed using SAS® Sentiment Analysis Studio to find whether the nature of sentiments in comments (positive or negative) changes over time. Also, a multilingual corpus of Twitter data can be collected and compared to identify the characteristics of the Twitter corpus across different languages and regions.

Acknowledgement
We thank Dr. Goutam Chakraborty, Ralph A. and Peggy A. Brenneman Professor of Marketing, and Dr. Miriam McGaugh, Professor at Oklahoma State University, for their guidance and support throughout the research.