A Study of Question Effectiveness Using “Ask Me Anything” Threads

Kristjan Arumae, Guo-Jun Qi, Fei Liu
University of Central Florida, 4000 Central Florida Blvd., Orlando, Florida 32816
[email protected], [email protected], [email protected]

Abstract

Asking effective questions is a powerful social skill. In this paper we seek to build computational models that learn to discriminate effective questions from ineffective ones. Armed with such a capability, future advanced systems can evaluate the quality of questions and provide suggestions for effective question wording. We create a large-scale, real-world dataset that contains over 400,000 questions collected from Reddit "Ask Me Anything" threads. Each thread resembles an online press conference where questions compete with each other for attention from the host. This dataset enables the development of a class of computational models for predicting whether a question will be answered. We develop a new convolutional neural network architecture with variable-length context and demonstrate the efficacy of the model by comparing it with state-of-the-art baselines and human judges.

Figure 1: A snippet of an AMA thread. It includes the title (shown in bold), a brief intro from the AMA host, and four questions (Q1-Q4) posted by Reddit users, among which only the last question was answered by the host (A4).

Title: IAMA Holocaust survivor who immigrated to the US in 1951 with my husband and twin daughters. AMA.
Intro: I am sitting with my 89-year-old grandmother who is always looking for a new audience. She has a spectacularly clear memory and important stories to tell. Here is her brief self-introduction: I was born in Tluste, Poland (which is now the Ukraine) in 1923...
Q1: Do you think it takes a horrible experience to not take life for granted? (No Answer)
Q2: How has your experience changed your view of the world compared to the younger generations? (No Answer)
Q3: How does she feel about the situation in Greece now? (No Answer)
Q4: Do you sometimes have dreams, nightmares or flashbacks about things you saw/experienced then?
A4: Oh yes. In the beginning I dreamed of hiding my babies and then it was hiding my grandchildren…

Introduction

Learning to ask effective questions is important in many scenarios. For example, doctors are trained to ask effective questions to gather necessary information from patients in a short time (Molla and Vicedo 2007); journalists frame their questions carefully to elicit answers (Vlachos and Riedel 2014); and students post questions on class discussion forums to seek help on assignments (Moon, Potdar, and Martin 2014). Naturally, the more effective a question, the better the information-seeking or problem-solving purpose is served.

Despite its importance, the problem of "what constitutes a good question" is largely unexplored. This paper seeks to close the gap by designing and evaluating algorithms that learn to discriminate effective questions (questions that elicit answers) from those that do not. We conduct a large-scale, data-driven study using Reddit "Ask Me Anything" (henceforth "AMA") discussion threads.

Reddit is a vibrant internet community with more than 36 million registered users (Reddit 2015). It was founded on June 23rd, 2005 and is currently ranked as the 24th most frequently visited website. The vast stream of data provides an unprecedented opportunity to study question effectiveness. The focus of this work is the "Ask Me Anything" subreddit, a popular subforum with more than 15 million subscribers. Each AMA thread emulates an online press conference. An AMA host initiates a discussion thread with "Ask Me Anything" or "Ask Me Almost/Absolutely Anything." The host provides a brief background description and invites others to ask any questions about any topic. An example is illustrated in Figure 1. The questions in an AMA thread compete with each other for attention from the host, who chooses to answer some questions while leaving others behind.

In this paper we investigate whether it is possible to automatically predict which questions will be answered by the host. We hypothesize that question wording is important. For example, "How does she feel about..." (Figure 1) appears to be a difficult question that requires a delicate answer. Further, some topics are considered more favorable or unpleasant than others by AMA hosts. We draw on recent developments in deep neural networks to predict question answerability. The advantage of neural models is the ability to automatically derive question representations that encode both syntactic structure and semantic knowledge. We develop a novel convolutional neural network (CNN) architecture that considers variable-length context. We draw an analogy between linear classifiers with n-gram features and the new CNN model, and demonstrate improved results. The contributions of this work include:

• A large-scale, real-world question-answering dataset collected from Reddit (available at http://www.nlp.cs.ucf.edu/downloads/). The dataset contains over 10 million posts and 400,000 questions generated by Reddit users. It is a valuable resource for future research on question answering and question suggestion.

• A new variable-length context convolutional neural network model, whose efficacy we demonstrate against state-of-the-art baselines. We additionally present a human evaluation study to gain further insight into question answerability.

Related Work

We discuss related work on community question answering and Reddit-inspired natural language processing studies.

A domain of research related to this study is community question answering (CQA), where users post questions on discussion forums and solicit answers from the community. Popular CQA websites include Yahoo! Answers, Stack Exchange, and others. The question-answering patterns on these websites are different from those of the Reddit AMA. For example, AMA questions are largely opinion-eliciting and can be "any questions about any topic," whereas CQA questions are problem-solving oriented and focus on technical topics. Multiple factors can contribute to unanswered questions on CQA websites, including the posting time, question quality, user reputation, and reward mechanisms (Anderson et al. 2012; Li et al. 2012; Liu et al. 2013; Ravi et al. 2014; Nakov et al. 2016). CQA corresponds to a single-inquirer, multiple-responders setting, whereas AMA corresponds to a multiple-inquirers, single-responder setting, because the AMA host single-handedly responds to all questions posted to the discussion thread. Whether a question will be answered depends on the question content, that is, whether it generates sufficient interest to warrant an answer from the host. Danish, Dahiya, and Talukdar (2016) analyze a range of factors that influence AMA question answerability. In contrast to their work, we focus on developing neural networks that automatically learn question representations incorporating syntactic and semantic knowledge.

Reddit has been explored for studying various language-related problems. Bendersky and Smith (2012) investigate characteristics of "quotable" phrases; the machine-detected phrases are quotable if they obtain endorsement from Reddit users. Wallace et al. (2014) leverage contextual information for irony detection on Reddit. Schrading et al. (2015) develop classifiers to predict texts that are indicative of domestic violence relationships. Jaech et al. (2015) introduce a comment ranking task that predicts the popularity of comments based on language features such as informativeness and relevance. Ouyang and McKeown (2015) create a corpus of personal narratives using Reddit data. Most recently, Tan et al. (2016) and Wei et al. (2016) respectively present studies on understanding the mechanisms behind "persuasion." They acquire discussion threads from the "ChangeMyView" subreddit, where users state their views on certain topics, invite others to challenge their views, and finally indicate whether the discussion has successfully altered their opinions. These studies find that both interaction patterns and language cues are predictive of persuasiveness. While Reddit has become a rising platform for language-related studies, the Reddit AMA corpus described next is expected to drive forward research in question answering and question answerability.

The Reddit AMA Corpus

We describe a methodology for creating a large-scale, real-world dataset from the Reddit "Ask Me Anything" subforum. Our goal is to collect all AMA threads and their associated posts over a span of eight years. The process is non-trivial, considering the sheer volume of data. In particular, we are faced with two challenges. First, the Reddit API returns at most 1,000 threads per search query. To collect all threads, we therefore query within 30-minute windows and retrieve the threads posted during each time period. Second, a thread is represented as a JSON object, but certain posts may be missing: the JSON object is restricted to contain at most 1,500 posts in order to avoid excessive page loading time, whereas the largest AMA threads can exceed 20,000 posts. To build a complete thread structure, we make additional API calls to retrieve missing posts and insert them back into the thread. An AMA thread contains a collection of posts organized into a tree structure. Each tree node is a post that includes a variety of useful information, such as the author, body text, creation time, and replying posts.
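As a rough illustration, the windowed crawling and thread-completion strategy can be sketched as follows. This is a minimal sketch, not the authors' collection code: the helpers fetch_threads and fetch_post, and the field names id, posts, and reply_ids, are hypothetical stand-ins for whatever Reddit API wrapper is used.

```python
WINDOW = 30 * 60  # 30-minute query window, in seconds


def collect_ama_threads(fetch_threads, fetch_post, start_epoch, end_epoch):
    """Windowed collection sketch; fetch_threads(after, before) and
    fetch_post(post_id) are hypothetical Reddit API wrappers."""
    threads = []
    t = start_epoch
    while t < end_epoch:
        # A 30-minute window keeps each query under the API's
        # 1,000-thread-per-search limit.
        threads.extend(fetch_threads(after=t, before=t + WINDOW))
        t += WINDOW

    for thread in threads:
        # A thread's JSON object may omit posts (capped around 1,500),
        # so missing replies are fetched individually and reattached.
        posts = {p["id"]: p for p in thread["posts"]}
        queue = list(posts.values())
        while queue:
            post = queue.pop()
            for child_id in post.get("reply_ids", []):
                if child_id not in posts:
                    child = fetch_post(child_id)
                    posts[child_id] = child
                    queue.append(child)
        thread["posts"] = list(posts.values())
    return threads
```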
The dataset contains a total of 67,723 discussion threads and over 10 million posts; each thread contains 248 posts on average. The dataset spans eight years, from the beginning of Reddit AMA (May 2009) to the end of the data collection period (April 2016). A number of celebrities held AMA sessions during this time, including Barack Obama, Arnold Schwarzenegger, astronaut Chris Hadfield, and Bill Nye. Other community members with unique experiences are also invited to host AMA sessions. The dataset is near-complete, missing only 250 threads (0.4% of all threads), mostly due to server timeouts. More data statistics are presented in Table 1. The AMA dataset provides a highly valuable resource for future research on Reddit and question answering.

Year | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | Total/Avg
# of threads | 5,055 | 11,987 | 21,381 | 11,555 | 7,517 | 5,514 | 4,054 | 660 | 67,723
Avg # of posts per thread | 87.55 | 97.04 | 94.12 | 214.08 | 308.35 | 395.46 | 338.98 | 447.02 | 247.83
Avg # of sentences per post | 3.19 | 3.03 | 2.87 | 2.79 | 2.62 | 2.57 | 2.65 | 2.65 | 2.80
Avg # of words per post | 49.82 | 46.44 | 43.49 | 41.98 | 38.07 | 37.02 | 39.02 | 40.37 | 42.03
# of question posts | 697 | 2,689 | 27,835 | 90,522 | 109,969 | 109,569 | 62,874 | 16,064 | 420,219
% of question posts w/ answer | 44.76 | 30.83 | 30.03 | 20.48 | 19.66 | 19.77 | 22.02 | 18.56 | 25.76

Table 1: Statistics of the Reddit AMA corpus. The data range from the beginning of Reddit AMA (May 2009) to the end of the data collection period (April 2016). Macro-average scores over all years are reported in the final column.

Preprocessing. The goal of preprocessing is to create a collection of question posts and their associated labels. If a question post is replied to by the AMA host, it is assigned a label of 1, and otherwise 0. We perform a series of thread-level and post-level filtering operations on the dataset to achieve this goal. First, threads that contain fewer than 100 first-tier posts are removed; these threads may contain spam or be of low quality. Next, "AMA Request" threads are ignored: these threads request that certain AMA sessions be held but are not real AMA discussions. On the post level, we restrict question posts to be first-tier and within the AMA host's active period, defined as the interval from the beginning of the thread to the last post of the host. This setting ensures that the questions are directed towards the host and that the host is aware of them. We further require question posts to contain a single sentence ending with a question mark. This allows us to focus on the question content and remove confounding factors such as question length (a few words vs. several paragraphs), multiple questions in a single post, follow-up questions, and non-question content. The above filtering process generates 420,219 questions, of which about 26% are answered by AMA hosts (see Table 1).
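For concreteness, the thread- and post-level filtering rules can be sketched as below. This is a simplified reading of the procedure, assuming a flattened post representation with placeholder fields (author, body, created, parent, replies, is_request); the actual Reddit schema and the authors' implementation may differ.

```python
import re


def label_questions(thread, host, min_first_tier=100):
    """Return (question_text, label) pairs for one AMA thread.

    Sketch of the rules described above: keep threads with >= 100 first-tier
    posts, skip "AMA Request" threads, keep first-tier single-sentence
    questions ending with '?' posted within the host's active period, and
    label a question 1 if the host replied to it, else 0.
    """
    if thread.get("is_request"):
        return []
    first_tier = [p for p in thread["posts"] if p["parent"] is None]
    if len(first_tier) < min_first_tier:
        return []

    host_times = [p["created"] for p in thread["posts"] if p["author"] == host]
    if not host_times:
        return []
    active_end = max(host_times)  # active period ends at the host's last post

    examples = []
    for post in first_tier:
        text = post["body"].strip()
        # Crude single-sentence heuristic: exactly one sentence-final mark,
        # and the post ends with a question mark.
        if not text.endswith("?") or len(re.findall(r"[.!?]", text)) != 1:
            continue
        if post["created"] > active_end:
            continue
        answered = any(r["author"] == host for r in post["replies"])
        examples.append((text, int(answered)))
    return examples
```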

Our Approach

Having described a large-scale, real-world dataset for predicting question effectiveness, we now introduce our proposed solutions. We define effective questions as those that successfully elicit responses from AMA hosts in a time-competitive environment (throughout the paper, we use question effectiveness and answerability interchangeably). The task naturally lends itself to a classification formulation where questions that receive answers from AMA hosts are labeled 1, and otherwise 0. We do not make use of the answer text in this study; the model learns to discriminate effective questions from ineffective ones based on the question body alone. We introduce a new convolutional neural network architecture for this study. Deep neural models have demonstrated success on a range of natural language processing tasks, most notably machine translation (Sutskever, Vinyals, and Le 2014; Luong and Manning 2016), language generation (Rush, Chopra, and Weston 2015), and image captioning (Xu et al. 2015). This study presents a novel, context-aware convolutional neural network (CNN) architecture that considers both context length and context variety.

Concretely, let x = {x_1, x_2, ..., x_N} be a single-sentence question post consisting of N word tokens, where N is the sequence length. If a question contains more than N words, it is truncated; otherwise, zeros are padded to the end of the word sequence. Each word is replaced by a word embedding (x_i ∈ R^d) before it is fed to the CNN model. With a slight abuse of notation, we use x_i to represent the word index in the vocabulary and x_i (boldface) to represent its embedding. We use the 300-dimension (d = 300) word2vec embeddings pre-trained on the Google News dataset of about 100 billion words (https://code.google.com/archive/p/word2vec/). Words that do not have embeddings are set to all zeros; the out-of-vocabulary rate for this dataset is about 2%. We additionally trained embeddings on the Reddit dataset, but they did not yield improved performance over the Google word2vec embeddings.

An important building block of the CNN model is the convolutional layer. It takes a word span as input, concatenates the word embeddings (denoted by x_{j-n+1:j} for position j and a span of n words, see Eq. (1)), applies a filter function h_j = f(w^T x_{j-n+1:j} + b), and produces a scalar-valued feature representation h_j for the word span (Eq. (2)). The filter is expected to activate when the word span encodes a certain feature; it thus "memorizes" the feature using the parameters w and b. The convolution process applies each filter to a sliding window of n words that runs over the input sequence, producing a feature map for each filter. Max-over-time pooling is then applied to each feature map to select the most prominent feature ĥ in the input sequence (Eq. (3)).

x_{j-n+1:j} = [x_{j-n+1}; x_{j-n+2}; ...; x_j]        (1)
h_j = f(w^T x_{j-n+1:j} + b)                          (2)
ĥ = max_{j=1..N-n+1} h_j                              (3)

In the aforementioned process, the word span can be of varying sizes (n = 1, 2, 3, 4, 5, ...), and multiple filters can be applied to word spans of each size. The combination of the two factors empowers the model to capture a flexible, enriched representation of the question body. But how can we search for the optimal configuration of word span sizes (i.e., context length) and the number of filters applied to word spans of each size (i.e., context variety), in order to best capture the prominent features encoded in the word spans?
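For concreteness, Eqs. (1)-(3) for a single filter can be transcribed directly into NumPy as below. The choice of ReLU for the nonlinearity f is an assumption made here for illustration, and the function and variable names are ours rather than the authors'.

```python
import numpy as np


def conv_max_pool(X, w, b, n, f=lambda z: np.maximum(z, 0.0)):
    """One filter of the convolutional layer, following Eqs. (1)-(3).

    X : (N, d) matrix of word embeddings for a (padded) question of N tokens.
    w : (n * d,) filter weights; b : scalar bias; n : word-span size.
    f : nonlinearity (ReLU assumed here; the paper leaves f unspecified).
    Returns the max-over-time feature h_hat.
    """
    N, d = X.shape
    h = np.empty(N - n + 1)
    for j in range(n - 1, N):                       # position j (0-indexed)
        x_span = X[j - n + 1 : j + 1].reshape(-1)   # Eq. (1): [x_{j-n+1}; ...; x_j]
        h[j - n + 1] = f(w @ x_span + b)            # Eq. (2): h_j = f(w^T x + b)
    return h.max()                                  # Eq. (3): max-over-time pooling
```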
Existing studies use brute force to search for optimal configurations. Kim (2014) explores window sizes of {3,4,5} words, each with 100 filters, combined in a CNN architecture with one convolution/max-pooling layer followed by a softmax output layer. Albeit simple, this CNN architecture outperforms several state-of-the-art neural models (Socher et al. 2013; Le and Mikolov 2014). Lei, Barzilay, and Jaakkola (2015) exploit non-consecutive word spans, using tensor decomposition to recognize patterns with intervening words; they explore spans of {2,3} words and {50,100,200} filters per word span. Zhang and Wallace (2015) experiment with various combinations of window sizes and numbers of filters per window size. Their results suggest that searching over the range of 100 to 600 for the optimal number of filters is reasonable.

Different from previous studies, we draw an analogy between linear classifiers with n-gram features and convolutional neural networks, and use it to derive the architecture configuration. We hypothesize that greater context leads to a higher degree of variety. Take n-grams as an example: trigrams in theory yield O(|V|^3) varieties, where |V| is the vocabulary size. The intuition is that the number of filters assigned to a word span should correlate with the number of effective features that can be encoded by a span of n words.

To this end, we examine the n-gram statistics of large-scale, real-world datasets (Table 2), including the Europarl corpus (Koehn 2005) and the Google N-grams dataset (Brants and Franz 2006). The Europarl corpus contains the European Parliament proceedings collected since 1996. In this dataset, the number of bigrams is about 15 times the number of unigrams (denoted by ↑u 15x), whereas the numbers of 3-grams, 4-grams, and 5-grams are of similar sizes, corresponding on average to 2.4x the bigram count (denoted by ↑b 2.4x). This is not Zipf's law, since we count the number of unique n-grams, regardless of their frequencies, in order to induce a relationship between window sizes and numbers of effective features. Note that n-grams appearing only once are filtered out from the Europarl corpus, since they are unlikely to be effective features. In the much larger Google N-grams dataset (Brants and Franz 2006), where n-grams appearing fewer than 40 times are filtered out, similar ratios hold.

Order | Europarl #Ngrams | Increase | Google N-grams #Ngrams | Increase
1-gram | 53,253 | — | 13,588,391 | —
2-gram | 816,091 | ↑u 15x | 314,843,401 | ↑u 23x
3-gram | 2,070,512 | ↑b 2.5x | 977,069,902 | ↑b 3.1x
4-gram | 2,222,226 | ↑b 2.7x | 1,313,818,354 | ↑b 4.2x
5-gram | 1,557,598 | ↑b 1.9x | 1,176,470,663 | ↑b 3.7x

Table 2: N-gram statistics of two large-scale, real-world datasets. Unique n-grams are counted; low-frequency ones are discarded.
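Counts of the kind shown in Table 2 can be reproduced on any tokenized corpus with a few lines of Python. The sketch below assumes the corpus fits in memory as a list of token lists; the frequency cutoff (2 for Europarl, 40 for Google N-grams, per the text) is left as a parameter.

```python
from collections import Counter


def unique_ngram_counts(sentences, max_order=5, min_freq=2):
    """Count unique n-grams per order, discarding low-frequency ones.

    sentences: list of token lists (iterated once per order).
    n-grams occurring fewer than min_freq times are dropped, mirroring the
    filtering described above.
    """
    counts = {}
    for n in range(1, max_order + 1):
        ngrams = Counter(
            tuple(tokens[i:i + n])
            for tokens in sentences
            for i in range(len(tokens) - n + 1)
        )
        counts[n] = sum(1 for c in ngrams.values() if c >= min_freq)
    return counts
```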

To summarize, because a grid search over combinations of window sizes and numbers of filters per window size is prohibitively expensive, we rely on heuristics derived from n-gram statistics: we combine five window sizes of {1,2,3,4,5} words and set the numbers of filters to {x, 20x, 60x, 60x, 60x} respectively, where x is a parameter tunable according to the available computing power. The ratios are obtained by averaging across the two corpora. We advocate a convolutional neural network architecture that combines multiple window sizes and applies a varying number of filters per window size. The approach brings together two considerations, context length and context variety, where greater context leads to higher variety. Finally, the convolution/max-pooling layer produces a vector representation for each question (illustrated in Figure 2), and a softmax layer on top predicts whether the question will be answered by the AMA host.

Figure 2: A toy example illustrating a convolutional neural network architecture that combines context windows of {1,2,3}-grams and applies a varying number of filters to each window size. The example processes the sentence "how does it feel to be so awesome"; max-over-time pooling over each feature map produces the sentence vector.
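A minimal PyTorch sketch of this architecture is given below, assuming the {x, 20x, 60x, 60x, 60x} filter allocation over window sizes {1,...,5}. The class name ContextAwareCNN, the ReLU nonlinearity, and the random embedding initialization (the paper instead loads pre-trained word2vec vectors) are our own choices for illustration, not details taken from the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContextAwareCNN(nn.Module):
    """Multi-window CNN: {x, 20x, 60x, 60x, 60x} filters for windows 1-5."""

    def __init__(self, vocab_size, embed_dim=300, x=5, num_classes=2):
        super().__init__()
        # In the paper, embeddings are initialized from Google News word2vec;
        # here they are randomly initialized for brevity.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        windows = [1, 2, 3, 4, 5]
        filters = [x, 20 * x, 60 * x, 60 * x, 60 * x]
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, k, kernel_size=n)
             for n, k in zip(windows, filters)]
        )
        self.fc = nn.Linear(sum(filters), num_classes)

    def forward(self, tokens):
        # tokens: (batch, N) padded word indices -> (batch, embed_dim, N)
        e = self.embed(tokens).transpose(1, 2)
        # One feature map per filter, then max-over-time pooling per map.
        pooled = [F.relu(conv(e)).max(dim=2).values for conv in self.convs]
        sentence_vec = torch.cat(pooled, dim=1)   # question representation
        return F.log_softmax(self.fc(sentence_vec), dim=1)
```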

Experiments

Having described the context-aware CNN architecture in the previous section, we next compare it to a range of baselines and report performance on the Reddit AMA dataset.

Baselines and Evaluation Metric

Our goal is to design and evaluate algorithms that learn to discriminate effective questions from ineffective ones. The task naturally lends itself to a text classification setting. We compare the context-aware CNN approach with two baselines. The first baseline is a logistic regression classifier with bag-of-words features. We use the LIBLINEAR implementation with l2 regularization (Fan et al. 2008). The model contains over 100k unigram features, with stopwords removed. A bigram logistic regression model would be computationally prohibitive for this task due to the enormous feature space. We use the logistic regression baseline because of its proven track record on text classification; the same approach has been adopted in prior work on classifying Reddit posts (Danish, Dahiya, and Talukdar 2016).

The second baseline is a state-of-the-art convolutional neural network architecture inspired by Kim (2014). It has a flat structure with one convolution layer followed by a max-over-time pooling layer to build the sentence representation. We use a trigram context window and set the number of filters to 100, a setting shown to perform robustly in previous work (Kim 2014; Lei, Barzilay, and Jaakkola 2015). Because the dataset is imbalanced (only about 25% of the questions are answered), we create a balanced dataset during training by setting the weights of answered/unanswered questions to 4/1. This applies to all CNN training runs and has shown success on a number of classification tasks (Huang et al. 2016).

We create the train/valid/test splits by randomly sampling 100k/10k/100k questions from the 420,219-question collection. The train/valid/test data are temporally uniform, meaning there is an even representation of data instances from the entire timeline in each split. The metric used for model evaluation is the area under the ROC curve (AUC); higher AUC scores are better. When the dataset is highly imbalanced, AUC has been shown to be a more appropriate metric than F-scores (Murphy 2012).
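The logistic regression baseline and AUC evaluation can be approximated with scikit-learn as follows. This is only an analogue of the paper's LIBLINEAR setup, and applying the 4:1 class weighting to the linear model (the text describes it for the CNNs) is an assumption made here for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def run_lr_baseline(train_texts, train_labels, test_texts, test_labels):
    """Bag-of-words logistic regression baseline evaluated with AUC."""
    # Unigram bag-of-words with stopwords removed, capped near 100k features.
    vectorizer = CountVectorizer(stop_words="english", max_features=100_000)
    X_train = vectorizer.fit_transform(train_texts)
    X_test = vectorizer.transform(test_texts)

    # l2-regularized logistic regression (liblinear solver). The 4:1
    # answered/unanswered reweighting described for the CNNs is applied
    # here as well, purely for illustration.
    clf = LogisticRegression(solver="liblinear", penalty="l2",
                             class_weight={1: 4, 0: 1})
    clf.fit(X_train, train_labels)

    scores = clf.predict_proba(X_test)[:, 1]
    return roc_auc_score(test_labels, scores)
```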
System Evaluation Results

We report the performance of the three systems in Table 3. Because the neural network approaches show performance variation due to randomness in parameter initialization, 5 independent runs are performed for each test condition; their mean, min, and max are reported.

System | Mean | Min | Max
Logistic Regression | 0.512 | 0.512 | 0.512
Baseline CNN | 0.516 | 0.509 | 0.525
Context-aware CNN (x=5) | 0.523 | 0.503 | 0.537

Table 3: Results on the 100k test set as evaluated by AUC scores.

Overall, we find that the context-aware CNN model produces better performance than logistic regression and the basic CNN. Predicting whether a question will be answered by the AMA host is a challenging task, as illustrated by the AUC scores of all systems. There could be multiple reasons. First, the three systems employed in this study rely exclusively on text features. While this allows us to push to the extreme of what can be achieved using text-only features, other useful non-text information, such as the question posting time, could be incorporated in the future. Second, each AMA host may have personalized preferences regarding what types of questions to answer. Some AMA hosts stay longer and answer all questions, while others, particularly celebrity hosts, may only answer questions sparsely for a few hours. Since this is a first and preliminary work on predicting question effectiveness, we plan to incorporate these factors in future studies.

Because the baseline CNN uses an unoptimized combination of context length (window size) and context variety (number of filters), we are curious how the two factors individually affect its efficacy in predicting question effectiveness. A set of experiments is conducted with varying window sizes of {1,2,3,4,5}-grams and numbers of filters of {5, 100, 300}. Results are reported on the 10k validation set; each condition is again evaluated over 5 runs, and the mean scores are reported in Table 4. With only 5 filters (first row), the best performance is achieved using a context window of 4 words; the neural model appears to memorize a few cue phrases that are particularly good for making predictions. With 300 filters, the model yields the best performance using a window of 1 word, suggesting the network relies on a combination of unigrams to make the prediction. The context-aware CNN model in this study combines five such sub-optimal baseline CNN settings, namely (1-gram, 5), (2-gram, 100), (3-gram, 300), (4-gram, 300), and (5-gram, 300), and yields a better performance of 0.537 on the same dataset.

Filters | 1-gram | 2-gram | 3-gram | 4-gram | 5-gram
5 | 0.512 | 0.510 | 0.510 | 0.515 | 0.511
100 | 0.518 | 0.516 | 0.510 | 0.519 | 0.520
300 | 0.525 | 0.517 | 0.515 | 0.517 | 0.507

Table 4: Results on the 10k validation set. Baseline CNN performance with different window sizes and varying numbers of filters per window.

We further explore the effect of different training data sizes. Results are reported in Table 5. When the training data is small (1k to 2k), the context-aware CNN model shows signs of instability. With more data, the performance steadily improves.

Train Size | Mean | Min | Max
1k | 0.521 | 0.507 | 0.530
2k | 0.511 | 0.501 | 0.518
5k | 0.519 | 0.507 | 0.529
10k | 0.523 | 0.508 | 0.542
20k | 0.521 | 0.512 | 0.528
50k | 0.525 | 0.515 | 0.535
100k | 0.537 | 0.531 | 0.544

Table 5: Results on the 10k validation set using varied training data sizes.
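As a usage note, the x=5 configuration used for the context-aware CNN in the experiments above corresponds to filter counts of {5, 100, 300, 300, 300} over windows {1,...,5}. With the hypothetical ContextAwareCNN class sketched earlier, it would be instantiated as follows (the vocabulary size is a placeholder, not a value from the paper):

```python
# x = 5 gives filter counts {5, 100, 300, 300, 300}, i.e., the (1-gram, 5),
# (2-gram, 100), (3-gram, 300), (4-gram, 300), (5-gram, 300) combination.
model = ContextAwareCNN(vocab_size=100_000, embed_dim=300, x=5)
```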
Human Evaluation Results

Finally, we examine how well humans can distinguish whether one question is more effective than another. The hypotheses we investigate include: (1) Do human annotators reach consensus on whether certain questions are more likely to be answered than others? (2) Do the questions preferred by human annotators agree with those answered by AMA hosts? (3) Are system prediction results closer to the human annotators or to the AMA hosts?

To answer these questions, we select 300 pairs of questions from our AMA dataset. In each pair, question posts A and B come from the same thread; one has received an answer, while the other remains unanswered. They are temporally adjacent to each other, so that the AMA host had a chance to see both questions within a small timeframe (the host receives a notification when new content is posted to the thread) but picked only one to respond to. We use Amazon Mechanical Turk (mturk.com) for this study. In each HIT (Human Intelligence Task), a turker is presented with 3 pairs of questions collected from an AMA thread; one question of each pair was answered by the host. The question order is randomized so that the turkers cannot exploit any specific pattern. The turkers are also provided with the thread title and background information from the AMA host. 5 turkers are recruited, and they are rewarded $0.05 for completing each HIT. We require the turkers to be located in the U.S. and to have a HIT approval rate of 90% or more.

We observe that the turkers demonstrate a reasonable degree of consensus among themselves: 63% of the selections are agreed upon by 4 or more turkers, and 30% are agreed upon by all 5 turkers. On the other hand, there appears to be some discrepancy between the questions chosen by turkers and those answered by AMA hosts. Adopting the majority-vote selection from the turkers, we find that turkers agree with AMA hosts on 55% of the pairs regarding whether question A or B would be answered. The context-aware CNN has slightly higher agreement with the turkers (54.33%) than with the AMA hosts (49.33%). Additionally, we hired an undergraduate student to perform the task; the student yields a higher level of agreement with the turkers (68.6%) than with the AMA hosts (55%). These findings suggest that the answered questions reflect the preferences of AMA hosts. Considering that the subreddit has served as an informational as well as a promotional platform for the hosts, the AMA hosts may have chosen questions in order to promote personal opinions, and these questions may not be the ones the audience would most like to have answered. The human evaluation allows us to derive insights into why some questions are preferred over others; these findings will be incorporated in future work.

Conclusion

This paper seeks to build computational models that learn to discriminate "answerable" questions from those that are not. We construct a large-scale question-answering dataset that contains over 10 million posts and 400,000 single-sentence questions collected from the Reddit AMA subforum. As a preliminary study, we present a novel context-aware convolutional neural network architecture that considers both context length and context variety for predicting question answerability. While other model architectures such as recurrent neural networks can be explored in the future, this study presents a first step toward using data-driven approaches for studying question effectiveness.

References

Anderson, A.; Huttenlocher, D.; Kleinberg, J.; and Leskovec, J. 2012. Discovering value from community activity on focused question answering sites: A case study of Stack Overflow. In Proceedings of KDD.
Bendersky, M., and Smith, D. A. 2012. A dictionary of wisdom and wit: Learning to extract quotable phrases. In Proceedings of the Workshop on Computational Linguistics for Literature.
Brants, T., and Franz, A. 2006. Web 1T 5-gram version 1, LDC2006T13. Philadelphia: Linguistic Data Consortium.
Danish; Dahiya, Y.; and Talukdar, P. 2016. Discovering response-eliciting factors in social question answering: A Reddit-inspired study. In Proceedings of ICWSM.
Fan, R.-E.; Chang, K.-W.; Hsieh, C.-J.; Wang, X.-R.; and Lin, C.-J. 2008. LIBLINEAR: A library for large linear classification. JMLR 9:1871–1874.
Huang, C.; Li, Y.; Loy, C. C.; and Tang, X. 2016. Learning deep representation for imbalanced classification. In Proceedings of CVPR.
Jaech, A.; Zayats, V.; Fang, H.; Ostendorf, M.; and Hajishirzi, H. 2015. Talking to the crowd: What do people react to in online discussions? In Proceedings of EMNLP.
Kim, Y. 2014. Convolutional neural networks for sentence classification. In Proceedings of EMNLP.
Koehn, P. 2005. Europarl: A parallel corpus for statistical machine translation. In MT Summit.
Le, Q., and Mikolov, T. 2014. Distributed representations of sentences and documents. In Proceedings of ICML.
Lei, T.; Barzilay, R.; and Jaakkola, T. 2015. Molding CNNs for text: Non-linear, non-consecutive convolutions. In Proceedings of EMNLP.
Li, B.; Jin, T.; Lyu, M. R.; King, I.; and Mak, B. 2012. Analyzing and predicting question quality in community question answering services. In Proceedings of WWW.
Liu, J.; Wang, Q.; Lin, C.-Y.; and Hon, H.-W. 2013. Question difficulty estimation in community question answering services. In Proceedings of EMNLP.
Luong, M.-T., and Manning, C. D. 2016. A hybrid word-character approach to open vocabulary neural machine translation. In Proceedings of ACL.
Molla, D., and Vicedo, J. L. 2007. Special section on restricted-domain question answering. Computational Linguistics 33(1):41–62.
Moon, S.; Potdar, S.; and Martin, L. 2014. Identifying student leaders from MOOC discussion forums through language influence. In Proceedings of EMNLP.
Murphy, K. P. 2012. Machine Learning: A Probabilistic Perspective. The MIT Press.
Nakov, P.; Marquez, L.; Moschitti, A.; Magdy, W.; Mubarak, H.; Freihat, A. A.; Glass, J.; and Randeree, B. 2016. SemEval-2016 task 3: Community question answering. In Proceedings of SemEval.
Ouyang, J., and McKeown, K. 2015. Modeling reportable events as turning points in narrative. In Proceedings of EMNLP.
Ravi, S.; Pang, B.; Rastogi, V.; and Kumar, R. 2014. Great question! Question quality in community Q&A. In Proceedings of ICWSM.
Reddit. 2015. Happy 10th birthday to us! Celebrating the best of 10 years of reddit. https://redditblog.com/2015/06/23/happy-10th-birthday-to-us-celebrating-the-best-of-10-years-of-reddit/.
Rush, A. M.; Chopra, S.; and Weston, J. 2015. A neural attention model for abstractive sentence summarization. In Proceedings of EMNLP.
Schrading, N.; Alm, C. O.; Ptucha, R.; and Homan, C. M. 2015. An analysis of domestic abuse discourse on Reddit. In Proceedings of EMNLP.
Socher, R.; Perelygin, A.; Wu, J. Y.; Chuang, J.; Manning, C. D.; Ng, A. Y.; and Potts, C. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of EMNLP.
Sutskever, I.; Vinyals, O.; and Le, Q. V. 2014. Sequence to sequence learning with neural networks. In Proceedings of NIPS.
Tan, C.; Niculae, V.; Danescu-Niculescu-Mizil, C.; and Lee, L. 2016. Winning arguments: Interaction dynamics and persuasion strategies in good-faith online discussions. In Proceedings of WWW.
Vlachos, A., and Riedel, S. 2014. Fact checking: Task definition and dataset construction. In Proceedings of the ACL Workshop on Language Technologies and Computational Social Science.
Wallace, B. C.; Choe, D. K.; Kertz, L.; and Charniak, E. 2014. Humans require context to infer ironic intent (so computers probably do, too). In Proceedings of ACL.
Wei, Z.; Liu, Y.; and Li, Y. 2016. Is this post persuasive? Ranking argumentative comments in the online forum. In Proceedings of ACL.
Xu, K.; Ba, J.; Kiros, R.; Cho, K.; Courville, A.; Salakhutdinov, R.; Zemel, R.; and Bengio, Y. 2015. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of ICML.
Zhang, Y., and Wallace, B. C. 2015. A sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification. http://arxiv.org/abs/1510.03820v4.