Extracting the Wisdom of Crowds from Crowdsourcing Platforms
EXTRACTING THE WISDOM OF CROWDS FROM CROWDSOURCING PLATFORMS

Qianzhou Du

Dissertation submitted to the faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Business Information Technology

G. Alan Wang, Chair
Weiguo Fan
Lara Khansa
Roberta S. Russell
Onur Seref

June 6th, 2019
Blacksburg, VA

Keywords: crowdsourcing, the wisdom of crowds, statistical learning, opinion aggregation, crowdfunding

Copyright 2019, Qianzhou Du

Abstract

Enabled by the wave of online crowdsourcing activities, extracting the Wisdom of Crowds (WoC) has become an emerging research area, one that aggregates judgments, opinions, or predictions from a large group of individuals for improved decision making. However, the existing literature mostly focuses on eliciting the wisdom of crowds in an offline context, without tapping into the vast amount of data available on online crowdsourcing platforms. Extracting WoC from participants on online platforms poses at least three challenges: social influence, suboptimal aggregation strategies, and data sparsity. This dissertation aims to answer the research question of how to effectively extract WoC from crowdsourcing platforms for the purpose of making better decisions. In the first study, I designed a new opinion aggregation method, Social Crowd IQ (SCIQ), using a time-based decay function to eliminate the impact of social influence on crowd performance. In the second study, I proposed a statistical learning method, CrowdBoosting, instead of a heuristic-based method, to improve the quality of crowd wisdom. In the third study, I designed a new method, Collective Persuasibility, to solve the challenge of data sparsity in a crowdfunding platform by inferring the backers' preferences and persuasibility.
My work shows that people can obtain business benefits from crowd wisdom, and it provides several effective methods to extract wisdom from online crowdsourcing platforms, such as StockTwits, Good Judgment Open, and Kickstarter.

General Audience Abstract

Since Web 2.0 and mobile technologies have inspired increasing numbers of people to contribute and interact online, crowdsourcing provides a great opportunity for businesses to tap into a large group of online users who possess varied capabilities, creativity, and knowledge levels. Howe (2006) first defined crowdsourcing as a method for obtaining necessary ideas, information, or services by asking for contributions from a large group of individuals, especially participants in online communities. Many online platforms have been developed to support various crowdsourcing tasks, including crowdfunding (e.g., Kickstarter and Indiegogo), crowd prediction (e.g., StockTwits, Good Judgment Open, and Estimize), crowd creativity (e.g., Wikipedia), and crowdsolving (e.g., Dell IdeaStorm). The explosive growth of data generated by these platforms creates a good opportunity for business benefits. Specifically, guided by the Wisdom of Crowds (WoC) theory, we can aggregate multiple opinions from a crowd of individuals to improve decision making. In this dissertation, I apply WoC to three crowdsourcing tasks: stock return prediction, event outcome forecasting, and crowdfunding project success prediction. My studies show the effectiveness of WoC and make both theoretical and practical contributions to the literature on WoC.

Acknowledgements

First, I would like to acknowledge and thank my advisor, Dr. G. Alan Wang, for his guidance and support throughout my time in the Ph.D. program. Dr. Wang set a great example for me of how to be an excellent teacher and researcher.
He devoted himself to offering great courses and conducting rigorous research that showed me how to excel in these areas. His support, patience, and trust helped me prevail throughout my time in the Ph.D. program.

I am sincerely grateful to my committee members. Dr. Weiguo Fan always provided me with constructive suggestions and direction whenever I had questions about teaching, research, and even life. I would like to thank Dr. Roberta Russell for her constant guidance and encouragement. My gratitude also goes to Dr. Lara Khansa for helping me refine my dissertation. I would like to thank Dr. Onur Seref for his research guidance and suggestions.

My gratitude also goes to the Department of Business Information Technology. I would like to thank Dr. Cliff Ragsdale for accepting me into this wonderful Ph.D. program and supporting me during my job search. I would also like to thank Dr. Roberta Russell, as well as all the other faculty and staff in the Department of Business Information Technology, for all their support throughout my time in the Ph.D. program.

In addition, I would like to especially thank three other professors, Dr. Zhongju Zhang, Dr. Pengfei Ye, and Dr. Zhilei Qiao. Although they were not on my committee, they provided much valuable advice about finishing my Ph.D. program and seeking a job. I will be grateful all my life.

Last but not least, I would like to thank my parents and my love, Miss Mu. Without your love, care, encouragement, and companionship, I would not have been able to survive my long and difficult Ph.D. life.

Table of Contents

1 INTRODUCTION
2 SOCIAL CROWD IQ: EXTRACTING WISDOM FROM SOCIAL CROWDS
  2.1 INTRODUCTION
  2.2 RELATED WORKS
  2.3 SOCIAL CROWD IQ: OPINION AGGREGATION FOR SOCIAL CROWDS
    2.3.1 The Weighting Procedure
    2.3.2 The Aggregation Procedure
  2.4 STUDY 1: STOCK RETURN PREDICTION
    2.4.1 Data Collection
    2.4.2 Performance Measure
    2.4.3 Crowd Size
    2.4.4 Comparison of Opinion Aggregation Models
    2.4.5 Functional Testing
    2.4.6 Financial Portfolio Simulation
    2.4.7 Additional Analysis on the Predictive Power of SCIQ
  2.5 STUDY 2: FORECASTING EVENTS
    2.5.1 Data Collection
    2.5.2 Comparison of Opinion Aggregation Models
    2.5.3 Functional Testing
  2.6 CONCLUSIONS
  REFERENCES
3 CROWDBOOSTING: A BOOSTING-BASED MODEL FOR OPINION AGGREGATION IN ONLINE CROWDS
  3.1 INTRODUCTION
  3.2 RELATED WORKS
    3.2.1 Existing Opinion Aggregation Methods
    3.2.2 Biases in Heuristics
  3.3 A STATISTICAL LEARNING BASED OPINION AGGREGATION METHOD: CROWDBOOSTING
    3.3.1 Statistical Learning Theory
    3.3.2 CrowdBoosting
  3.4 EVALUATION
    3.4.1 Data Collection and Processing
    3.4.2 Model Representation
    3.4.3 Performance Measure
    3.4.4 Comparison