Using Verbs and Adjectives to Automatically Classify Blog Sentiment
Paula Chesley
Department of Linguistics
University at Buffalo
[email protected]

Bruce Vincent and Li Xu
Department of Computer Science and Engineering
University at Buffalo
{bvincent,lixu}@buffalo.edu

Rohini K. Srihari
Center of Excellence for Document Analysis and Recognition (CEDAR)
University at Buffalo
[email protected]

Copyright © 2005, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

This paper presents experiments on subjectivity and polarity classification of blog posts, making novel use of a linguistic feature, verb class information, and of an online resource, the Wikipedia dictionary. The verb classes we use express objectivity and polarity, i.e., a positive or negative opinion. We derive the polarity of adjectives from their entries in the online dictionary. Each post from a blog is classified as objective, subjective-positive, or subjective-negative. Our approach is topic- and genre-independent, and our method of determining the polarity of adjectives has an accuracy rate of 90.9%. Accuracy rates of two verb classes demonstrating polarity are 89.3% and 91.2%. Initial classifier results show accuracies of 72.4% on objective posts, 84.2% for positive posts, and 80.3% for negative posts. These accuracies all represent substantial increases above the established baseline classification.

Introduction

As the blogging phenomenon continues its exponential growth, its increasingly influential role in the global marketplace of ideas and opinions is becoming widely acknowledged. In the Harvard Business Review of February 2005, Mohanbir Sawhney notes that "The 'blogosphere' is gaining the power to influence what people think and do...Bloggers are driven by a desire to share their ideas and opinions with anyone who cares to tune in" (Sawhney 2005). Consequently, new techniques to automatically extract and analyze sentiment expressed in blogs constitute lively research topics.

In this paper we describe textual and linguistic features we extract and a classifier we develop to categorize blog posts with respect to sentiment. We ask whether a given blog post expresses subjectivity (vs. objectivity), and whether the sentiment a post expresses represents positive ("good") or negative ("bad") polarity.

Although the expression of subjectivity vs. objectivity ideally falls on a continuous scale, we simplify by using a binary classification. Similarly, we aim for a binary classification of positive vs. negative sentiment. In this research we address the following issues related to blog sentiment analysis:

1. How effectively can classes of verbs in blog posts categorize sentiment?

2. How can we utilize online lexical resources, such as Wikipedia's dictionary, to automatically categorize adjectives expressing polarity?

3. What is the best way to propagate sentiment information up from the lexical level to the post level?

4. Are blogs different from other genres expressing sentiment, and if so, how?

Having manually examined hundreds of blogs, we find that the answer to (4) seems to reside precisely in the diverse rhetorical structure of blog posts and their large-scale references to other media. An op-ed piece from a newspaper often states its opinion in the first paragraph of the article and reiterates this opinion when closing (so much so that examining simply the first and last paragraphs for sentiment would potentially yield robust results). However, the majority of a blog post is often a quoted excerpt of an article with which the blogger agrees or disagrees, followed by a short comment expressing his or her agreement or disagreement. We believe such nested speech events (Wiebe, Wilson, & Cardie 2004) are a common occurrence in blogs and represent a significant challenge when classifying their sentiment.

To aid in the sentiment classification task, we make use of verb-class information, since exploiting lexical information contained in verbs has been shown to be a successful technique for classifying documents (Klavans & Kan 1998). To obtain this information we use InfoXtract (Srihari et al. 2003), an automatic text analyzer which groups verbs into classes that often correspond to our polarity classification. Additionally, we utilize Wikipedia's online dictionary, the Wiktionary[1], to determine the polarity of adjectives seen throughout the posts. With adjective accuracy at 90.9% and initial evaluations on verb classes showing 89.3% and 91.2% accuracy, these methods for extracting lexical polarity prove very robust.

We then propagate this lexical classification information up to the post level. This approach could be extended to an entire blog by categorizing the blog as subjective if the majority of its posts are subjective. Presumably, an author writes a post about one topic in order to preserve the coherence of the text. A reader generally expects a blogger's opinion within a post to stay consistent, unless a post expresses mixed sentiment, at which point the sentiment should be classified as mixed. Given this topic-independent approach, our sentiment classifier is well-suited to integration with a blog topic/subtopic classifier and a blog search engine. We anticipate that such a search engine, which we are currently prototyping, will be useful not only for corporations interested in public-opinion information but also for the general public.

After examining related work, this paper describes our experiments in the classification of blog posts. We first describe our training data. We then discuss the features we use in the classification experiments, followed by a discussion of our results. Finally, we conclude and offer new directions for research.

[1] This dictionary is available at http://en.wiktionary.org/wiki/.

Related Work

This work can be positioned within the fields of text classification and web mining, genre detection, and sentiment analysis. The majority of work on text classification is topic-related and, to a lesser extent, event-related ((Klavans & Kan 1998), MUC conferences, etc.). Genre detection is related to sentiment analysis in that some genres, such as editorials, are inherently more subjective than others, e.g., "breaking news" reportage.

Recently there has been a rapid expansion of work on sentiment analysis, and online sentiment analysis in particular, in many cases for commercial purposes. Since many consumer-oriented websites include fields for user feedback, it is quite interesting to examine consumer reviews of products. In this light, (Hu & Liu 2004) query online documents for positive and negative opinions of various product features, such as the size of a digital camera. The polarity of adjectives is determined by using a seed list of adjectives of known orientation and assigning unknown adjectives the orientation of their synonyms with known orientation. (Glance et al. 2005) examine online message boards and blogs to see what opinions consumers have of a given product. These authors determine sentiment orientation via techniques used in (Hurst & Nigam 2004), which make use of manual determination of polarized adjectives for a specific domain. They too examine digital cameras and note that the adjective blurry is most likely negative, while crisp is in all likelihood positive.

Work on sentiment analysis can be divided into approaches working at the sentence or clause level ((Wilson, Wiebe, & Hoffmann 2005), (Glance et al. 2005), (Hu & Liu 2004)) and those examining the document level ((Turney 2002), (Dave, Lawrence, & Pennock 2003), (Mishne 2005)). Working at the document level represents a challenge in that, for multiple-sentence documents, polarity information must be propagated up from individual sentences. Our work differs in several respects:

1. (Mishne 2005) takes up this task but uses different data and features;

2. The verb class information we take into account. Most experiments determining polarity restrict themselves to polar adjectives;

3. The resources we use to determine polarity. We use InfoXtract to obtain verb polarity information and Wikipedia's online dictionary, the Wiktionary, to obtain adjective polarity information;

4. A topic- and genre-independent classification. Of the document-level analyses, Turney as well as Dave et al. examine only one genre, reviews.

Building training and test datasets

Our classifier was trained using a dataset of objective, subjective-positive, and subjective-negative documents. Training text from the web was obtained by a semi-automated, manually verified approach comprising two steps:

1. Obtain web documents via RSS and Atom web syndication feeds. A customized aggregator was developed using APIs of the Apache Software Foundation's ROME (Rss and atOM utilitiEs) project[2]. Compared to the possible alternatives of web crawling or "page scraping", the XML-based syndication formats RSS (Really Simple Syndication) and Atom can capture web content in a more straightforward manner from a very large, fast-growing number of websites. The websites strategically selected for each training category provide diverse content of the objective or subjective categories. Although the test dataset was manually chosen strictly from a set of blog posts on diverse topics, the training data were obtained from traditional websites as well as blogs. Many non-blog sites were prolific sources of syndicated news and topical writing that consistently met our criteria for an objective text. Objective feeds came from sites providing content such as world and national news (CNN, NPR, etc.), local news (Atlanta Journal and Constitution, Seattle Post-Intelligencer, etc.), and various sites focused on topics such as health, science, business, and technology. For the subjective categories, traditional websites were also key sources of positive and negative text meeting our criteria that documents be categorically positive or negative. Subjective feeds included content from newspaper columns (Charles Krauthammer, E. J. Dionne, etc.), letters to the editor (Washington Post, Boston Globe, etc.), reviews (DvdVerdict.com, RottenTomatoes.com, etc.), and political blogs (Powerline, Huffington Post, etc.).

2. Manually verify each document written by the aggregator
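The feed aggregation in step 1 was built in Java on the ROME library; as a rough standard-library analogue of what such an aggregator extracts, the following Python sketch pulls item titles and descriptions out of an RSS 2.0 document with xml.etree. The feed text is a fabricated example, not one of the paper's actual sources.

```python
import xml.etree.ElementTree as ET

# Toy RSS 2.0 feed standing in for a syndicated news source; the paper's
# aggregator fetched real feeds with the Java ROME library instead.
RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example News</title>
  <item><title>Markets close higher</title>
        <description>Stocks rose on Tuesday.</description></item>
  <item><title>Storm hits coast</title>
        <description>Heavy rain expected overnight.</description></item>
</channel></rss>"""

def extract_items(rss_text: str) -> list:
    """Return (title, description) pairs for each <item> in an RSS feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("description"))
            for item in root.iter("item")]

for title, desc in extract_items(RSS):
    print(title, "->", desc)
```

Syndication formats make this extraction trivial compared with scraping arbitrary HTML, which is the advantage the paper cites.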
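The paper derives adjective polarity from Wiktionary entries, but this excerpt does not spell out the algorithm. One plausible reading is to count known-polarity words in the definition gloss; the sketch below follows that reading. The seed lists, the classify_adjective helper, and the toy glosses are illustrative assumptions, not the paper's actual resources.

```python
# Hypothetical sketch: label an adjective by counting known-polarity seed
# words in its dictionary definition (gloss). Seed sets are toy examples.
POSITIVE_SEEDS = {"good", "pleasant", "excellent", "agreeable", "happy"}
NEGATIVE_SEEDS = {"bad", "unpleasant", "poor", "disagreeable", "unhappy"}

def classify_adjective(gloss: str) -> str:
    """Return 'positive', 'negative', or 'unknown' for a definition string."""
    tokens = [t.strip(".,:;()").lower() for t in gloss.split()]
    pos = sum(t in POSITIVE_SEEDS for t in tokens)
    neg = sum(t in NEGATIVE_SEEDS for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "unknown"

# Toy glosses standing in for Wiktionary entries:
print(classify_adjective("marvelous: exceedingly good or pleasant"))  # positive
print(classify_adjective("dreadful: extremely bad or unpleasant"))    # negative
```

A real implementation would look the gloss up in the Wiktionary rather than take it as an argument, and would need to handle multiple senses per entry.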
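The seed-list technique attributed above to (Hu & Liu 2004), where an unknown adjective receives the orientation of a synonym with known orientation, can be sketched as breadth-first propagation over a synonym graph. The tiny SYNONYMS graph below is a toy stand-in for a real thesaurus, and the propagate helper is an illustration rather than Hu and Liu's implementation.

```python
from collections import deque

# Toy synonym graph; a real system would consult a thesaurus such as WordNet.
SYNONYMS = {
    "good": ["fine", "superior"],
    "fine": ["crisp"],
    "bad": ["poor"],
    "poor": ["blurry"],
}

def propagate(seeds: dict) -> dict:
    """Breadth-first propagation of orientation along synonym links."""
    orientation = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        for syn in SYNONYMS.get(word, []):
            if syn not in orientation:
                orientation[syn] = orientation[word]
                queue.append(syn)
    return orientation

labels = propagate({"good": "positive", "bad": "negative"})
print(labels["crisp"])   # positive
print(labels["blurry"])  # negative
```

With this toy graph the sketch reproduces the blurry/crisp intuition quoted above, though unlike (Hurst & Nigam 2004) it ignores domain: crisp is positive for camera images but not, say, for wilted lettuce.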
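Question 3, propagating sentiment from the lexical level up to the post level, can be illustrated with a bare majority rule over lexical cues. The cue lexicons and the classify_post helper are illustrative assumptions: the paper trains a classifier over such lexical features rather than applying a fixed rule, and it notes that genuinely mixed posts need a separate treatment.

```python
# Toy cue lexicons standing in for the paper's polar verb classes and
# Wiktionary-derived adjectives.
POSITIVE_CUES = {"praise", "admire", "great", "excellent"}
NEGATIVE_CUES = {"criticize", "condemn", "terrible", "awful"}

def classify_post(text: str) -> str:
    """Label a post objective, subjective-positive, or subjective-negative
    by majority vote over polar lexical cues (a deliberate simplification)."""
    tokens = [t.strip(".,:;!?()").lower() for t in text.split()]
    pos = sum(t in POSITIVE_CUES for t in tokens)
    neg = sum(t in NEGATIVE_CUES for t in tokens)
    if pos == 0 and neg == 0:
        return "objective"
    return "subjective-positive" if pos >= neg else "subjective-negative"

print(classify_post("I admire this excellent proposal."))  # subjective-positive
print(classify_post("The committee met on Tuesday."))      # objective
```

The same counting scheme extends to a whole blog, as the introduction suggests: a blog could be called subjective when the majority of its posts are.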