Using Language to Predict Kickstarter Success
Kartik Sawhney, Caelin Tran, Ramon Tuason
Stanford University, Department of Computer Science
{kartiks2, cktt13, [email protected]}

Abstract

Kickstarter is a popular crowdfunding platform that people use to seek support for a variety of campaigns. Existing literature has already tackled the problem of predicting campaign success (meeting the funding target by the deadline), typically analyzing the evolution of a campaign's funding and number of backers over time. While time-dependent features are effective in predicting success, especially as a campaign nears its deadline, we took a different approach to the problem and built a binary classifier that predicts the success of a campaign by analyzing campaign content, linguistic features, and meta-information at the time of its inception (through techniques such as sentiment analysis and latent Dirichlet allocation). With sufficient training, our classifier achieves over 70% accuracy on large test data sets. We show that language and campaign properties alone can be used to make moderate predictions about the success of a campaign, which helps us understand how the subtleties of persuasion and the choice of campaign topics can influence campaign donors, and informs the process of evaluating and developing effective campaign pitches.

1 Introduction

Kickstarter is an internet service that people use to raise money through crowdfunding. For a person to receive any resources, his or her campaign must meet its target by a deadline set at the time of publishing. In order to be successful (i.e., achieve full funding by the deadline), a campaign must effectively convey its mission to potential backers and persuade them to support it over other campaigns. Previous literature on predicting the success of Kickstarter campaigns has focused on applying machine learning to time-series data (e.g., the amount of funds raised over time) and social networking. In this paper, we investigate whether it is possible to use information present at the beginning of a campaign to predict its success without using time-dependent data or data external to the Kickstarter campaign itself. We have built a binary predictor of Kickstarter success that focuses on the actual content of a campaign rather than its performance over time, capturing (1) people's initial reactions to the language used in a campaign and the description of its purpose, and (2) the characteristics and types of campaigns that attract people.

2 Related Works

Literature on this topic can be broadly divided into three categories: work aimed specifically at understanding crowdfunding dynamics, natural language understanding systems that predict audience reaction, and general theories of persuasion that can be used to develop our work further.

Several studies leverage machine learning techniques to predict the success of a campaign. Vincent Etter et al. (see references section) analyzed the social network around a campaign by constructing a projects-backers graph and monitoring Twitter for tweets that mention the project. They also discuss predictions based on the time series of early funding; combining these features with social information substantially improved their model. A study by Ethan Mollick on Kickstarter dynamics found that higher funding goals and longer project durations lead to lower chances of success, while the inclusion of a video in a project pitch and frequent updates on the campaign increase the likelihood of full funding.

While these studies rely on time-dependent features and social information for predictions, recent advances in natural language understanding have made predictions based on linguistic considerations possible as well. For instance, studying the linguistic aspects of politeness, Danescu-Niculescu-Mizil et al. show that Wikipedia editors are more likely to be promoted to a higher status when they are polite. Furthermore, Althoff et al. analyze social features in literature to determine relations that can predict whether an altruistic request will be accepted by a donor. In addition, by analyzing the language and words used in daily conversation, tools such as LIWC (Pennebaker et al.) can infer psychologically meaningful styles and social behavior from unstructured text. Other resources, like the NTU Sentiment Dictionary and SentiWordNet, also help with opinion mining on text-based content (Pang and Lee).

Finally, we turn to more general theories of influence, noting the important role of reciprocity in the social dynamics of investment situations, as shown by Berg et al. In fact, Cialdini's iconic work on the psychology of persuasion demonstrates the significant influence of reciprocity, in the form of limited rewards, on the success of a Kickstarter campaign.

3 Implementation of our Predictor

3.1 Input-Output Behavior

The input-output behavior of campaign prediction is straightforward: after training a predictor on labeled (success/failure) Kickstarter campaigns, we run the predictor on arbitrary campaigns (input) and see whether we expect them to succeed (output). For our raw dataset, we used Vitulskis et al.'s pre-existing, regularly updated, labeled dataset, created by crawling all Kickstarter pages.

The evaluation metric we use for a predictor is its classification accuracy on test campaigns. Given an input campaign, a predictor classifies it as a success (1) or a failure (0).

As a concrete example, the following is a sample object representing a Kickstarter campaign, containing the relevant fields that our predictor will use. This campaign failed; perhaps its unoriginal idea and unexciting language were contributing factors.

{
  "name": "Spinning Pots: Transformations of Clay through Human Hands",
  "blurb": "Creating a small business making functional pots/wares with clay and heat. Raising funds to purchase a kiln.",
  "goal": 3000,
  "created at": 1431738817,
  "deadline": 1434680329
}

Figure 1, Campaign Example: Note that the creation time and deadline are expressed as Unix timestamps.

3.2 Baseline and Oracle

To test the reliability of linguistic features in predicting a campaign's success, we implemented, as our baseline, a naive Bayes (NB) classifier that uses unigram occurrences as features. Unigrams are individual words, such as "spinning," "pots," and "transformations" from the sample above. We chose unigram occurrences as the features of our baseline because they provide a very basic analysis of language and content: they offer no insight into how words are used or how they relate to one another.

We randomly split the campaigns in our dataset (160,000+ campaigns) into 80% training data and 20% test data. Applying our classifier, we obtained a classification accuracy of 65%. This suggests that it is possible to classify Kickstarter campaigns with significant accuracy.

To determine an upper limit for the Kickstarter prediction problem, we implemented as our oracle a support vector machine that considers unigram occurrences along with information such as how many backers a campaign had by the end of its run, whether it was featured on the Kickstarter website's "Spotlight" page, and whether it was presented as a "Staff Recommendation." Applying this classifier, we got an average classification accuracy of 92%. The oracle illustrates the upper limit of any Kickstarter predictor based on all the data it can receive from a campaign's start time to its end time, which reflects our challenge of predicting a campaign's outcome from just the information available at its inception. After all, if you incorporate more information as the campaign progresses, your prediction accuracy quickly increases, which is reflected in predictor performance in the current literature.

We have a gap of about 27% between our oracle, which looks at the future, and our baseline, which looks only at language from the beginning of campaigns. However, the problem is not so intractable that language has no part in determining a campaign's success. Despite its simple nature, our naive Bayes classifier actually performed above our expectations, yielding an accuracy well above the random-chance threshold of 50%. Naive Bayes assumes that, given a certain success state ("True" or "False"), the unigram features of a campaign are conditionally independent. Of course, this assumption is not accurate, because a campaign about a given topic is likely to have many related words and linguistic structures. This suggests that, with a more intelligent feature extractor and a more advanced predictor, we can further improve our classification accuracy.

4 Predictor Design and Implementation

In order to achieve better classification accuracy, we focused the ...

... algorithms that operate on the entire dataset to draw comparisons between campaigns, analyzing the individual campaign features extracted in the first step. Examples of algorithms used here include latent Dirichlet allocation (LDA). Assignment consists of (1) applying algorithms to compare campaigns extracted from our training data and (2) assigning the results to those campaigns as features for use in predictor training.

Because there are tens of thousands of features that our data points can have (e.g., the presence of specific unigrams), most feature values for any given campaign are equal to zero. Therefore, with our high-dimensional, sparse datasets, we save ample memory by using sparse matrices, storing only the non-zero components of each feature vector in memory. We use the sklearn library's
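The sparse-matrix argument above can be illustrated with scipy's CSR format, which sklearn's text vectorizers emit natively. This is a generic sketch with a toy matrix, not the authors' code:

```python
import numpy as np
from scipy.sparse import csr_matrix

# A toy "campaign x unigram" count matrix: most entries are zero,
# since any one campaign uses only a tiny fraction of the vocabulary.
dense = np.zeros((3, 10), dtype=np.int64)
dense[0, 2] = 1   # campaign 0 contains unigram 2 once
dense[1, 7] = 3   # campaign 1 contains unigram 7 three times
sparse = csr_matrix(dense)

# CSR stores only the non-zero values plus their index arrays,
# so memory scales with the number of non-zeros, not with rows * columns.
print(sparse.nnz)              # 2 non-zero entries out of 30 cells
print(sparse.toarray().shape)  # (3, 10): round-trips to the dense form
```

With real vocabularies in the tens of thousands, the savings dominate: a campaign with a few hundred distinct words stores a few hundred values instead of one value per vocabulary entry.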
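A minimal version of the unigram naive Bayes baseline from Section 3.2 can be written with sklearn's CountVectorizer and MultinomialNB. The tiny corpus and labels here are invented for illustration; the real system trains on the 160,000+ crawled campaigns:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-ins for campaign blurbs and success labels (1 = funded).
texts = [
    "handmade ceramic pots and functional wares",
    "an exciting new card game with stunning art",
    "help me pay my rent this month",
    "innovative smartwatch with a week of battery",
]
labels = [1, 1, 0, 1]

# CountVectorizer extracts sparse unigram-occurrence features;
# MultinomialNB is the naive Bayes classifier over those counts.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Classify an unseen campaign blurb as success (1) or failure (0).
prediction = model.predict(["beautiful handmade pots for your kitchen"])
print(prediction[0])  # 1: shares "handmade" and "pots" with a funded example
```

The conditional-independence assumption criticized in the text is exactly what MultinomialNB encodes: each word contributes to the class score independently of every other word.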
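The LDA-based assignment step described in Section 4 can be sketched with sklearn as well: fit a topic model over the whole training corpus, then attach each campaign's topic mixture as extra features for the predictor. The corpus and the topic count here are hypothetical:

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy campaign texts spanning two rough themes (pottery vs. games).
texts = [
    "clay pots kiln ceramics studio",
    "card game art illustration deck",
    "ceramic kiln glaze pottery wheel",
    "board game dice miniatures campaign",
]
counts = CountVectorizer().fit_transform(texts)

# Fit a small topic model on the full dataset, then transform each
# campaign into a topic-probability vector usable as predictor features.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_features = lda.fit_transform(counts)

print(topic_features.shape)  # (4, 2): one topic mixture per campaign
```

Each row is a probability distribution over topics, so these features compare campaigns by what they are about rather than by the exact words they use.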
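The "created at" and "deadline" fields in Figure 1 are Unix timestamps, so meta-features such as campaign duration (which Mollick links to success rates) fall out of simple arithmetic. This snippet just demonstrates the conversion:

```python
from datetime import datetime, timezone

created_at = 1431738817  # "created at" from the Figure 1 sample
deadline = 1434680329    # "deadline" from the Figure 1 sample

# Duration in days: timestamps are seconds since the epoch.
duration_days = (deadline - created_at) / 86400  # 86400 seconds per day
print(round(duration_days, 1))

# Rendering the creation timestamp as a UTC calendar date.
print(datetime.fromtimestamp(created_at, tz=timezone.utc).date())
```

This sample campaign ran for about 34 days, close to Kickstarter's default 30-day window.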