Making Better Use of the Crowd: How Crowdsourcing Can Advance Machine Learning Research
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Using Prediction Markets to Estimate the Reproducibility of Scientific
Using prediction markets to estimate the SEE COMMENTARY reproducibility of scientific research Anna Drebera,1,2, Thomas Pfeifferb,c,1, Johan Almenbergd, Siri Isakssona, Brad Wilsone, Yiling Chenf, Brian A. Nosekg,h, and Magnus Johannessona aDepartment of Economics, Stockholm School of Economics, SE-113 83 Stockholm, Sweden; bNew Zealand Institute for Advanced Study, Massey University, Auckland 0745, New Zealand; cWissenschaftskolleg zu Berlin–Institute for Advanced Study, D-14193 Berlin, Germany; dSveriges Riksbank, SE-103 37 Stockholm, Sweden; eConsensus Point, Nashville, TN 37203; fJohn A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138; gDepartment of Psychology, University of Virginia, Charlottesville, VA 22904; and hCenter for Open Science, Charlottesville, VA 22903 Edited by Kenneth W. Wachter, University of California, Berkeley, CA, and approved October 6, 2015 (received for review August 17, 2015) Concerns about a lack of reproducibility of statistically significant (14). This problem is exacerbated by publication bias in favor of results have recently been raised in many fields, and it has been speculative findings and against null results (4, 16–19). argued that this lack comes at substantial economic costs. We here Apart from rigorous replication of published studies, which is report the results from prediction markets set up to quantify the often perceived as unattractive and therefore rarely done, there reproducibility of 44 studies published in prominent psychology are no formal mechanisms to identify irreproducible findings. journals and replicated in the Reproducibility Project: Psychology. Thus, it is typically left to the judgment of individual researchers The prediction markets predict the outcomes of the replications to assess the credibility of published results. -
Combating Attacks and Abuse in Large Online Communities
University of California Santa Barbara Combating Attacks and Abuse in Large Online Communities Adissertationsubmittedinpartialsatisfaction of the requirements for the degree Doctor of Philosophy in Computer Science by Gang Wang Committee in charge: Professor Ben Y. Zhao, Chair Professor Haitao Zheng Professor Christopher Kruegel September 2016 The Dissertation of Gang Wang is approved. Professor Christopher Kruegel Professor Haitao Zheng Professor Ben Y. Zhao, Committee Chair July 2016 Combating Attacks and Abuse in Large Online Communities Copyright c 2016 ⃝ by Gang Wang iii Acknowledgements I would like to thank my advisors Ben Y.Zhao and Haitao Zheng formentoringmethrough- out the PhD program. They were always there for me, giving me timely and helpful advice in almost all aspects of both research and life. I also want to thank my PhD committee member Christopher Kruegel for his guidance in my research projects and job hunting. Finally, I want to thank my mentors in previous internships: Jay Stokes, Cormac Herley, Weidong Cui and Helen Wang from Microsoft Research, and Vicente Silveira fromLinkedIn.Specialthanksto Janet Kayfetz, who has helped me greatly with my writing and presentation skills. Iamverymuchthankfultomycollaboratorsfortheirhardwork, without which none of this research would have been possible. First and foremost, to the members of SAND Lab at UC Santa Barbara: Christo Wilson, Bolun Wang, Tianyi Wang, Manish Mohanlal, Xiaohan Zhao, Zengbin Zhang, Xia Zhou, Ana Nika, Xinyi Zhang, Shiliang Tang, Alessandra Sala, Yibo Zhu, Lin Zhou, Weile Zhang, Konark Gill, Divya Sambasivan, Xiaoxiao Yu, Troy Stein- bauer, Tristan Konolige, Yu Su and Yuanyang Zhang. Second, totheemployeesatMicrosoft: Jack Stokes, Cormac Herley and David Felstead. Third, to Miriam Metzger from the Depart- ment of Communications at UC Santa Barbara, and Sarita Y. -
Crowds, Lending, Machine, and Bias
Crowds, Lending, Machine, and Bias Runshan Fu Yan Huang Param Vir Singh Heinz College, Tepper School of Business, Tepper School of Business, Carnegie Mellon University Carnegie Mellon University Carnegie Mellon University [email protected] [email protected] [email protected] Abstract Big data and machine learning (ML) algorithms are key drivers of many fintech innovations. While it may be obvious that replacing humans with machines would increase efficiency, it is not clear whether and where machines can make better decisions than humans. We answer this question in the context of crowd lending, where decisions are traditionally made by a crowd of investors. Using data from Prosper.com, we show that a reasonably sophisticated ML algorithm predicts listing default probability more accurately than crowd investors. The dominance of the machine over the crowd is more pronounced for highly risky listings. We then use the machine to make investment decisions, and find that the machine benefits not only the lenders but also the borrowers. When machine prediction is used to select loans, it leads to a higher rate of return for investors and more funding opportunities for borrowers with few alternative funding options. We also find suggestive evidence that the machine is biased in gender and race even when it does not use gender and race information as input. We propose a general and effective “debiasing” method that can be applied to any prediction focused ML applications, and demonstrate its use in our context. We show that the debiased ML algorithm, which suffers from lower prediction accuracy, still leads to better investment decisions compared with the crowd. -
Incentives in Computer Science Lecture #18: Prediction Markets∗
CS269I: Incentives in Computer Science Lecture #18: Prediction Markets∗ Tim Roughgardeny November 30, 2016 1 Markets for Information 1.1 Uncertain Events Suppose we're interested in an uncertain event, such as whether or not the Warriors will win the 2017 NBA Championship, whether or not the winner of the next Presidential election will be a Republican or a Democrat, or whether or not 2017 will be the hottest year on record. Each of these events is uncertain, but eventually we will know whether or not they occurred. That is, the outcomes of these uncertain events are eventually verifiable. (Similar to the setup for scoring rules in the first half of last lecture, but different from the setup for eliciting subjective opinions in the second half.) What's the best way to predict whether or not such events will happen? For example, suppose you had to predict the winner of a sporting match, despite knowing very little about the sport. You could do a lot worse than turning to the betting markets in Vegas for help. Vegas odds don't let you predict anything with anything close to full certainty, but even the best experts struggle to outperform betting markets with any consistency. For example, favorites cover the Vegas spread about 50% of the time, while underdogs beat the spread about 50% of the time. 1.2 The Iowa Electronic Markets If betting markets work so well for forecasting the outcomes of sporting events, why not use them to also predict other kinds of uncertain events? This is exactly the idea behind the Iowa Electronic Markets (IEM), which have operated since 1988 with the goal of using markets to forecast the outcome of presidential, congressional, and gubernatorial elections. -
High Converting Email Marketing Templates
High Converting Email Marketing Templates Unwithering Rochester madder: he promulging his layings biographically and debatingly. Regan never foins any Jefferson geometrize stirringly, is Apostolos long and pinnatipartite enough? Isodimorphous Dylan always bloused his caddice if Bear is Iroquoian or motorises later. Brooks sports and the middleware can adjust your website very possibly be wonderful blog, email marketing templates here to apply different characteristics and Email apps you use everyday to automate your business and be more productive. For best results, go beyond just open and click rates to understand exactly how your newsletters impact your customer acquisition. Another favorite amongst most email designers, there had been a dip in the emails featuring illustrations. Turnover is vanity; profit is sanity. These people love your brand, so they would happy to contribute. Outlook, Gmail, Yahoo is not supporting custom fonts and ignore media query, it will be replaced by standard web font. Once Jack explained your background in greater detail, I immediately realized you would be the perfect person to speak with. Search for a phrase that your audience cares about. Drip marketing is easy to grasp and implement. These techniques include segmentation, mapping out email flows for your primary segments, and consolidating as much as possible into one app or platform. Typically, newsletters are used as a way to nurture customer relationships and educate your readership as opposed to selling a product. What problem do you solve? SPF authentication detects malicious emails from false email addresses. My sellers love the idea and the look you produce. Keep things fresh for your audience. -
The Scam Filter That Fights Back Final Report
The scam filter that fights back Final Report by Aurél Bánsági (4251342) Ruben Bes (4227492) Zilla Garama (4217675) Louis Gosschalk (4214528) Project Duration: April 18, 2016 – June 17, 2016 Project Owner: David Beijer, Peeeks Project Coach: Geert-Jan Houben, TU Delft Preface This report concludes the project comprising of the software development of a tool that makes replying to scam mails fast and easy. This project was part of the course TI3806 Bachelor Project in the fourth quarter of the acadamic year of 2015-2016 and was commissioned by David Beijer, CEO of Peeeks. The aim of this report is documenting the complete development process and presenting the results to the Bachelor Project Committee. During the past ten weeks, research was conducted and assessments were made to come to a design and was then followed by an implementation. We would like to thank our coach, Geert-Jan Houben, for his advice and supervision during the project. Aurél Bánsági (4251342) Ruben Bes (4227492) Zilla Garama (4217675) Louis Gosschalk (4214528) June 29, 2016 i Abstract Filtering out scam mails and deleting them does not hurt a scammer in any way, but wasting their time by sending fake replies does. As the amount of replies grows, the amount of time a scammer has to spend responding to e-mails increases. This bachelor project has created a revolutionary system that recommends replies that can be sent to answer a scam mail. This makes it very easy for users to waste a scammer’s time and destroy their business model. The project has a unique adaptation of the Elo rating system, which is a very popular algorithm used by systems ranging from chess to League of Legends. -
Crowdsourced Outcome Determination in Prediction Markets
Crowdsourced Outcome Determination in Prediction Markets Rupert Freeman Sebastien´ Lahaie David M. Pennock Duke University Microsoft Research Microsoft Research [email protected] [email protected] [email protected] Abstract to their own stake in the market; this is known as outcome manipulation (Shi, Conitzer, and Guo 2009; Berg and Ri- A prediction market is a useful means of aggregating infor- etz 2006; Chakraborty and Das 2016). Additionally, by al- mation about a future event. To function, the market needs a trusted entity who will verify the true outcome in the end. lowing anyone to create a market, there is no longer any Motivated by the recent introduction of decentralized pre- guarantee that all questions will be sensible, or even have diction markets, we introduce a mechanism that allows for a well-defined outcome. In this paper, we propose a specific the outcome to be determined by the votes of a group of ar- prediction market mechanism with crowdsourced outcome biters who may themselves hold stakes in the market. Despite determination that addresses several challenges faced by de- the potential conflict of interest, we derive conditions under centralized markets of this sort. which we can incentivize arbiters to vote truthfully by us- ing funds raised from market fees to implement a peer pre- First is the issue of outcome ambiguity. At the time the diction mechanism. Finally, we investigate what parameter market closes, it might be unreasonable to assign a binary values could be used in a real-world implementation of our value to the event outcome due to lack of clarity in the out- mechanism. -
Fantasyscotus: Crowdsourcing a Prediction Market for the Supreme Court Josh Blackman Harlan Institute
Northwestern Journal of Technology and Intellectual Property Volume 10 | Issue 3 Article 3 Winter 2012 FantasySCOTUS: Crowdsourcing a Prediction Market for the Supreme Court Josh Blackman Harlan Institute Adam Aft Corey Carpenter Recommended Citation Josh Blackman, Adam Aft, and Corey Carpenter, FantasySCOTUS: Crowdsourcing a Prediction Market for the Supreme Court, 10 Nw. J. Tech. & Intell. Prop. 125 (2012). https://scholarlycommons.law.northwestern.edu/njtip/vol10/iss3/3 This Article is brought to you for free and open access by Northwestern Pritzker School of Law Scholarly Commons. It has been accepted for inclusion in Northwestern Journal of Technology and Intellectual Property by an authorized editor of Northwestern Pritzker School of Law Scholarly Commons. NORTHWESTERN JOURNAL OF TECHNOLOGY AND INTELLECTUAL PROPERTY FantasySCOTUS Crowdsourcing a Prediction Market for the Supreme Court Josh Blackman, Adam Aft and Corey Carpenter January 2012 VOL. 10, NO. 3 © 2012 by Northwestern University School of Law Northwestern Journal of Technology and Intellectual Property Copyright 2012 by Northwestern University School of Law Volume 10, Number 3 (January 2012) Northwestern Journal of Technology and Intellectual Property Fantasy SCOTUS Crowdsourcing a Prediction Market for the Supreme Court By Josh Blackman,* Adam Aft** and Corey Carpenter*** The object of our study, then, is prediction, the prediction of the incidence of the public force through the instrumentality of the courts.1 -Oliver Wendell Holmes, Jr. It is tough to make predictions, especially about the future.2 -Yogi Berra I. INTRODUCTION ¶1 Every year the Supreme Court of the United States captivates the minds and curiosity of millions of Americans—yet the inner-workings of the Court are not fully transparent. -
Social Turing Tests: Crowdsourcing Sybil Detection
Social Turing Tests: Crowdsourcing Sybil Detection Gang Wang, Manish Mohanlal, Christo Wilson, Xiao Wang‡, Miriam Metzger†, Haitao Zheng and Ben Y. Zhao Department of Computer Science, U. C. Santa Barbara, CA USA †Department of Communications, U. C. Santa Barbara, CA USA ‡Renren Inc., Beijing, China Abstract The research community has produced a substantial number of techniques for automated detection of Sybils [4, As popular tools for spreading spam and malware, Sybils 32,33]. However, with the exception of SybilRank [3], few (or fake accounts) pose a serious threat to online communi- have been successfully deployed. The majority of these ties such as Online Social Networks (OSNs). Today, sophis- techniques rely on the assumption that Sybil accounts have ticated attackers are creating realistic Sybils that effectively difficulty friending legitimate users, and thus tend to form befriend legitimate users, rendering most automated Sybil their own communities, making them visible to community detection techniques ineffective. In this paper, we explore detection techniques applied to the social graph [29]. the feasibility of a crowdsourced Sybil detection system for Unfortunately, the success of these detection schemes is OSNs. We conduct a large user study on the ability of hu- likely to decrease over time as Sybils adopt more sophis- mans to detect today’s Sybil accounts, using a large cor- ticated strategies to ensnare legitimate users. First, early pus of ground-truth Sybil accounts from the Facebook and user studies on OSNs such as Facebook show that users are Renren networks. We analyze detection accuracy by both often careless about who they accept friendship requests “experts” and “turkers” under a variety of conditions, and from [2]. -
Predicting the Outcome of an Election
Social Education 80(5), pp 260–262 ©2016 National Council for the Social Studies Predicting the Outcome of an Election Social Education staff Part of the excitement of the period immediately preceding an election lies in keep- a candidate by selling shares that were ing track of the different predictions. There are a variety of ways of predicting the bought at a lower price when the candi- winner of a presidential election, and it can be revealing to examine the accuracy date’s chances were rated low). of prediction systems after the election. Once their own money is at stake, people are considered likely to follow One predictive system with which as reflecting the collective wisdom of a their self-interest by being careful with readers of Social Education have wide range of people who are willing to their research and by examining a wide become familiar is the “Keys to the risk their own money on the outcome range of information. Although different White House” developed by historian of a contest. people will reach different conclusions Allan J. Lichtman and mathematician Prediction markets set odds for or and make different bets, the prediction Vladimir Keilis-Borok (see the previ- against each outcome, based on the way markets aggregate the range of available ous article). The thirteen Keys are of that participants choose to invest their information into a collective projection great interest for studying the success money. Participants can thus calculate of the likely result of the contest. or failure of presidential terms and the return they will receive for a suc- David Rothschild, an economist at the outcome of presidential elections. -
Read a Sample
Intelligent Computing for Interactive System Design provides a comprehensive resource on what has become the dominant paradigm in designing novel interaction methods, involving gestures, speech, text, touch and brain-controlled interaction, embedded in innovative and emerging human-computer interfaces. These interfaces support ubiquitous interaction with applications and services running on smartphones, wearables, in-vehicle systems, virtual and augmented reality, robotic systems, the Internet of Things (IoT), and many other domains that are now highly competitive, both in commercial and in research contexts. This book presents the crucial theoretical foundations needed by any student, researcher, or practitioner working on novel interface design, with chapters on statistical methods, digital signal processing (DSP), and machine learning (ML). These foundations are followed by chapters that discuss case studies on smart cities, brain-computer interfaces, probabilistic mobile text entry, secure gestures, personal context from mobile phones, adaptive touch interfaces, and automotive user interfaces. The case studies chapters also highlight an in-depth look at the practical application of DSP and ML methods used for processing of touch, gesture, biometric, or embedded sensor inputs. A common theme throughout the case studies is ubiquitous support for humans in their daily professional or personal activities. In addition, the book provides walk-through examples of different DSP and ML techniques and their use in interactive systems. Common terms are defined, and information on practical resources is provided (e.g., software tools, data resources) for hands-on project work to develop and evaluate multimodal and multi-sensor systems. In a series of in-chapter commentary boxes, an expert on the legal and ethical issues explores the emergent deep concerns of the professional community, on how DSP and ML should be adopted and used in socially appropriate ways, to most effectively advance human performance during ubiquitous interaction with omnipresent computers. -
Prediction Markets As a Forecasting Tool
PREDICTION MARKETS AS A FORECASTING TOOL Daniel E. O’Leary ABSTRACT Internal prediction markets draw on the wisdom of crowds, gathering knowledge from a broad range of information sources and embedding that knowledge in the stock price. This chapter examines the use of internal prediction markets as a forecasting tool, including as a stand-alone, and as a supplement to forecasting tools. In addition, this chapter examines internal prediction market applications used in real-world settings and issues associated with the accuracy of internal prediction markets. INTRODUCTION Internal prediction markets are different than so-called naturally occurring markets, in that prediction markets are internal markets where generally virtual dollars are used as a basis to try and put prices on particular events or sets of events, for problems of direct relevance to a specific organization. These markets are designed to gather information from a broad range of users in the context of a market, where participants ‘‘bet’’ on the likelihood of potential future events using prices to, ultimately predicting the probability of the outcome of some event. Advances in Business and Management Forecasting, Volume 8, 169–184 Copyright r 2011 by Emerald Group Publishing Limited All rights of reproduction in any form reserved ISSN: 1477-4070/doi:10.1108/S1477-4070(2011)0000008014 169 170 DANIEL E. O’LEARY Prediction markets provide an information gathering and aggregation mechanism across the population of traders to generate a price on some stock, where that stock being traded typically is a prediction or forecast of some event. For example, a stock may be ‘‘the number of flaws in a product will be less than x.’’ Researchers (e.g., Berg, Nelson, & Rietz, 2008; Wolfers & Zitzewitz, 2004) have found that prediction markets provide accurate forecasts, sometimes better than sophisticated statistical tools.