Crowdsourcing

Total Page:16

File Type:pdf, Size:1020Kb

Crowdsourcing Crowdsourcing reCAPTCHA Completely Automated Public Turing test to tell Computers and Humans Apart Luis von Ahn • Guatemalan entrepreneur • Consulting Professor at Carnegie Mellon University in Pittsburgh, Pennsylvania. • Known as one of the pioneers of crowdsourcing. The problem: "Anybody can write a program to sign up for millions of accounts, and the idea was to prevent that" Luis von Ahn Ealier CAPTCHAs 2010: Luis invents CAPTCHA Business model B2B/B2C: The Captcha company sells captchas for around 30 $ per 1000 of them The idea Situation before 2007: CAPTCHAs were many and working well The thought of Luis: • Hundreds of thousands of combined human hours were being wasted each day, 200 million captchas were solved daily • Book digitalisation: at the tim Optical Character Recognition (OCR) software couln’t solve 30 % of the ammount of words to be digitalized • A CAPTCHA could be used with the intention of exploiting that to digitalize old books and articles reCAPTCHAs 2007: reCAPTCHA is found and a partnership is established to digitized the previous 20 years of New York ? Times issues within a few months, 13 million articles RESCUED crowdsourcing CARPATHIA ICEBERG reCAPTCHAs In 2009, reCAPTCHA was purchased by Google for an undisclosed amount for Google Books library, which is now one of the largest digital libraries in the world, or identify street names and addresses from Google Maps Street View. If you are not paying for the product, you are the product. Before and after reCAPTCHA : The ESP game and duoligo First of von Ahn projects: giving a name After CAPTCHA: Duoligo to an image: 10 M people playing Providing translations for the web through bought by Google Images a free language learning program.
Recommended publications
  • Recaptcha: Human-Based Character Recognition Via Web Security
    REPORTS on September 12, 2008 and blogs. For example, CAPTCHAs prevent www.sciencemag.org reCAPTCHA: Human-Based Character ticket scalpers from using computer programs to buy large numbers of concert tickets, only to re Recognition via Web Security Measures sell them at an inflated price. Sites such as Gmail and Yahoo Mail use CAPTCHAs to stop spam Luis von Ahn,* Benjamin Maurer, Colin McMillen, David Abraham, Manuel Blum mers from obtaining millions of free e mail accounts, which they would use to send spam CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) are e mail. Downloaded from widespread security measures on the World Wide Web that prevent automated programs from According to our estimates, humans around abusing online services. They do so by asking humans to perform a task that computers cannot yet the world type more than 100 million CAPTCHAs perform, such as deciphering distorted characters. Our research explored whether such human every day (see supporting online text), in each case effort can be channeled into a useful purpose: helping to digitize old printed material by asking spending a few seconds typing the distorted char users to decipher scanned words from books that computerized optical character recognition failed acters. In aggregate, this amounts to hundreds of to recognize. We showed that this method can transcribe text with a word accuracy exceeding 99%, thousands of human hours per day. We report on matching the guarantee of professional human transcribers. Our apparatus is deployed in more an experiment that attempts to make positive use than 40,000 Web sites and has transcribed over 440 million words.
    [Show full text]
  • ACM Bytecast Episode Title: Luis Von Ahn
    Podcast Title: ACM Bytecast Episode Title: Luis Von Ahn Welcome to the ACM Bytecast podcast, where researchers, practitioners, and innovators share about their experiences, lessons, and future visions in the field of computing research. In this episode, host Rashmi Mohan welcomes Luis Von Ahn—founder of Duolingo, the most popular foreign language learning app. The conversation starts off with a quick look at Luis’s background, what he does, and what drew him into this field as a whole. Luis’s mother began his interest in computers at 8 years old and he’s never looked back since. While he started off his training in the field of computer science, he has now started 3 businesses, one of which is Duolingo. Rashmi taps into the history on Luis’s first two businesses before jumping into Duolingo. Learn about the creation of CAPTCHA and re-CAPTCHA. While one of these is a challenge-response test to help computers determine whether the user on websites was human or not, the other protects website from spam. The cru- cial foundation for Luis’s businesses were simply real world problems that he sought to help fix. Luis integrated these real issues in the industry and worked on them from the academic world. Being used across countless websites today, Luis’s work proved pivotal. Rashmi asks Luis about his perspective on integrating academia and business more—don’t miss out on his thoughts! Rashmi shifts the conversation to learning the details involved in the transition from academia to actually running a business and the learning curve associated with this.
    [Show full text]
  • Luis Von Ahn - Episode 14 Transcript
    ACM ByteCast Luis von Ahn - Episode 14 Transcript Rashmi Mohan: This is ACM ByteCast, a podcast series from the Association for Computing Machinery, the world's largest educational and scientific computing society. We talk to researchers, practitioners, and innovators who are at the intersection of computing research and practice. They share their experiences, the lessons they've learned, and their own visions for the future of computing. I am your host, Rashmi Mohan. Rashmi Mohan: If you want to boost your brain power, improve your memory, or enhance your multitasking skills, then you're often recommended to learn a foreign language. For many of us, that option has become a reality, thanks to our next guest and his creation. Luis von Ahn is a serial entrepreneur and Founder and CEO of Duolingo. An accomplished researcher and consulting professor of Computer Science at Carnegie Mellon University, he straddles both worlds seamlessly. He's a winner of numerous awards, including the prestigious Lemelson-MIT Prize and the MacArthur Fellowship often known as The Genius Grant. Louis, welcome to ACM ByteCast. Luis von Ahn: Thank you. Thank you for having me. Rashmi Mohan: Wonderful. I'd love to lead with a simple question that I ask all of my guests. If you could please introduce yourself and talk about what you currently do, and also give us some insight into what drew you into the field of computer science. Luis von Ahn: Sure. So my name is Luis. I am currently the CEO and co-founder of a company called Duolingo. Duolingo is a language learning platform.
    [Show full text]
  • Larry Page Developing the Largest Corporate Foundation in Every Successful Company Must Face: As Google Word.” the United States
    LOWE —continued from front flap— Praise for $19.95 USA/$23.95 CAN In addition to examining Google’s breakthrough business strategies and new business models— In many ways, Google is the prototype of a which have transformed online advertising G and changed the way we look at corporate successful twenty-fi rst-century company. It uses responsibility and employee relations——Lowe Google technology in new ways to make information universally accessible; promotes a corporate explains why Google may be a harbinger of o 5]]UZS SPEAKS culture that encourages creativity among its where corporate America is headed. She also A>3/9A addresses controversies surrounding Google, such o employees; and takes its role as a corporate citizen as copyright infringement, antitrust concerns, and “It’s not hard to see that Google is a phenomenal company....At Secrets of the World’s Greatest Billionaire Entrepreneurs, very seriously, investing in green initiatives and personal privacy and poses the question almost Geico, we pay these guys a whole lot of money for this and that key g Sergey Brin and Larry Page developing the largest corporate foundation in every successful company must face: as Google word.” the United States. grows, can it hold on to its entrepreneurial spirit as —Warren Buffett l well as its informal motto, “Don’t do evil”? e Following in the footsteps of Warren Buffett “Google rocks. It raised my perceived IQ by about 20 points.” Speaks and Jack Welch Speaks——which contain a SPEAKS What started out as a university research project —Wes Boyd conversational style that successfully captures the conducted by Sergey Brin and Larry Page has President of Moveon.Org essence of these business leaders—Google Speaks ended up revolutionizing the world we live in.
    [Show full text]
  • Search the Full Text of Books and Discover New Ones
    Search the full text of books and discover new ones With Google Book Search, you can quickly search the full text of books, from the first word on the first page to the last word in the final chapter. Find the perfect books for your purposes or discover ones you never even knew existed. Book Search works like web search Try a search on Google Book Search (http://books.google.com) or on Google.com. When we find a book whose content contains a match for your search terms, we’ll link to it in your search results. Browse books online Clicking on a book result, you’ll be able to see everything from a few short excerpts to the entire book, depending on a few different factors. • Full view: If we’ve determined that a book is out of copyright or the publisher or rightsholder has given us permission, you’ll be able to page through the entire book from start to finish. • Limited preview: If the publisher or author has provided the book through the Google Books Partner Program, you’ll be able to preview sample pages – typically about 20% of the book. • Snippet view: If a book is under copyright and the publisher or author is not part of the Partner Program, you’ll find bibliographic about the book and a few snippets of text from the book, showing your search term in context. • No preview available: For books where we’re unable to show snippets, you’ll see only bibliographic information. Search within the book Once you a find a title of interest, you can search within the book.
    [Show full text]
  • The Future of the Internet and How to Stop It the Harvard Community Has
    The Future of the Internet and How to Stop It The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters. Jonathan L. Zittrain, The Future of the Internet -- And How to Citation Stop It (Yale University Press & Penguin UK 2008). Published Version http://futureoftheinternet.org/ Accessed July 1, 2016 4:22:42 AM EDT Citable Link http://nrs.harvard.edu/urn-3:HUL.InstRepos:4455262 This article was downloaded from Harvard University's DASH Terms of Use repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms- of-use#LAA (Article begins on next page) YD8852.i-x 1/20/09 1:59 PM Page i The Future of the Internet— And How to Stop It YD8852.i-x 1/20/09 1:59 PM Page ii YD8852.i-x 1/20/09 1:59 PM Page iii The Future of the Internet And How to Stop It Jonathan Zittrain With a New Foreword by Lawrence Lessig and a New Preface by the Author Yale University Press New Haven & London YD8852.i-x 1/20/09 1:59 PM Page iv A Caravan book. For more information, visit www.caravanbooks.org. The cover was designed by Ivo van der Ent, based on his winning entry of an open competition at www.worth1000.com. Copyright © 2008 by Jonathan Zittrain. All rights reserved. Preface to the Paperback Edition copyright © Jonathan Zittrain 2008. Subject to the exception immediately following, this book may not be reproduced, in whole or in part, including illustrations, in any form (beyond that copying permitted by Sections 107 and 108 of the U.S.
    [Show full text]
  • Google Books for Publisher Websites
    Google Books Google Books for Publisher Websites There are two complementary ways to add a customized Google Books search to your website: 1. Custom Book Search: a search box that allows users to conduct full-text searches across your complete catalog. 2. Google Preview: a button that is placed on each book’s catalog page and, when clicked, displays a limited preview of the book. Note that both of these tools are different from the regular search box you have on your website - while a traditional search tool indexes the contents of your website, Custom Book Search and Google Preview search the content of your books themselves. As a re- sult, the Google Books search tools should be used in addition to your regular search box. Custom Book Search Custom Book Search (also known as Co-branded Search) is a Google Books search engine for your catalog, on your own website. Implementing Custom Book Search enables visitors to your website to conduct full-text searches right from your site; when your visitors type a term into the Custom Book Search box, they are searching the full text of all the books you have in Google Books. They can then browse the relevant titles using Google’s secure limited preview and easily purchase the books that they browse. Setup is easy; go to the Product Setup tab of your Partner Center and click “Co-branded Book Search” in the green bar up top, then click “Add new co-branded site”. From there you can fill in your site’s basic information, upload your logo, and decide what you want the search box and search results page to look like.
    [Show full text]
  • The Future of the Internet and How to Stop It the Harvard Community Has
    The Future of the Internet and How to Stop It The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters. Citation Jonathan L. Zittrain, The Future of the Internet -- And How to Stop It (Yale University Press & Penguin UK 2008). Published Version http://futureoftheinternet.org/ Accessed February 18, 2015 9:54:33 PM EST Citable Link http://nrs.harvard.edu/urn-3:HUL.InstRepos:4455262 Terms of Use This article was downloaded from Harvard University's DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms- of-use#LAA (Article begins on next page) YD8852.i-x 1/20/09 1:59 PM Page i The Future of the Internet— And How to Stop It YD8852.i-x 1/20/09 1:59 PM Page ii YD8852.i-x 1/20/09 1:59 PM Page iii The Future of the Internet And How to Stop It Jonathan Zittrain With a New Foreword by Lawrence Lessig and a New Preface by the Author Yale University Press New Haven & London YD8852.i-x 1/20/09 1:59 PM Page iv A Caravan book. For more information, visit www.caravanbooks.org. The cover was designed by Ivo van der Ent, based on his winning entry of an open competition at www.worth1000.com. Copyright © 2008 by Jonathan Zittrain. All rights reserved. Preface to the Paperback Edition copyright © Jonathan Zittrain 2008. Subject to the exception immediately following, this book may not be reproduced, in whole or in part, including illustrations, in any form (beyond that copying permitted by Sections 107 and 108 of the U.S.
    [Show full text]
  • Luis Von Ahn
    Luis von Ahn Professional 2011-present A. Nico Habermann Associate Professor. Computer Science Department, Employment Carnegie Mellon University. 2009-present Staff Research Scientist. Google, Inc. 2006-2011 Assistant Professor. Computer Science Department, Carnegie Mellon University. 2008-2009 Founder and CEO. ReCAPTCHA, Inc. (acquired by Google, Inc., in 2009). 2005-2006 Post-Doctoral Fellow. Computer Science Department, Carnegie Mellon University. Education Carnegie Mellon University, Pittsburgh, PA. Ph.D. in Computer Science, 2005. Advisor: Manuel Blum Thesis Title: Human Computation Carnegie Mellon University, Pittsburgh, PA. M.S. in Computer Science, 2003. Duke University, Durham, NC. B.S. in Mathematics (Summa Cum Laude), 2000. Research I am working to develop a new area of computer science that I call Human Computation. In Interests particular, I build systems that combine the intelligence of humans and computers to solve large-scale problems that neither can solve alone. An example of my work is reCAPTCHA, in which over 750 million people—more than 10% of humanity—have helped digitize books and newspapers. Selected MacArthur Fellow, 2006-2011. Honors Packard Fellow, 2009-2014. Discover Magazine: 50 Best Brains in Science, 2008. Fast Company: 100 Most Creative People in Business, 2010. Silicon.com: 50 Most Influential People in Technology, 2007. Microsoft New Faculty Fellow, 2007. Sloan Fellow, 2009. CAREER Award, National Science Foundation, 2011-2015. Smithsonian Magazine: America’s Top Young Innovators in the Arts and Sciences, 2007. Technology Review’s TR35: Young Innovators Under 35, 2007. IEEE Intelligent Systems “Ten to Watch for the Future of AI,” 2008. Popular Science Magazine Brilliant 10 Scientists of 2006. Herbert A.
    [Show full text]
  • Marketing Your Books Online Enhance Your Website, Easily and for Free
    Marketing Your Books Online Enhance your website, easily and for free Google Blogger What is it? A blog is a web journal that can be updated as often as you’d like. The most effective blog forms for presses tend to be publicity or news blogs, CEO or Publisher blogs, and author blogs. Where can I learn more? blogger.com (or do a Google search to find other free blogging tools) Google Calendar What is it? A public calendar that is easily added to your website and allows you to keep your readers abreast of readings and events. Where can I learn more? google.com/calendar Google Books for your site What is it? • Cobranded Search A search box for your homepage that searches the full text of all books you have in Google Books • Google Preview Individual book previews that can be embedded right into your website. Where can I learn more? Ask your Partner Manager for information on Cobranded Search and Google Preview. © Copyright 2009. Google is a trademark of Google Inc. All other company and product names may be trademarks of the respective companies with which they are associated. Marketing Your Books Online Drive readers to your website and books Google Books What is it? A powerful tool that displays books in response to regular Google queries. Your e-commerce site appears first among the “Buy this book” links, and because anyone with a website can embed book previews, your titles show up safely and securely all over the web. Where can I learn more? Ask your Partner Manager to fill you in on recent Google Books developments.
    [Show full text]
  • Invited Talk: Human Computation Luis Von Ahn Carnegie Mellon University Pittsburgh, PA USA [email protected]
    Invited Talk: Human Computation Luis von Ahn Carnegie Mellon University Pittsburgh, PA USA [email protected] ABSTRACT algorithm—it must be proven correct, its efficiency can be Tasks like image recognition are trivial for humans, but analyzed, a more efficient version can supersede a less effi- continue to challenge even the most sophisticated computer cient one, and so on. Instead of using a silicon processor, programs. This talk discusses a paradigm for utilizing hu- these “algorithms” run on a processor consisting of ordi- man processing power to solve problems that computers nary humans interacting with computers over the Internet. cannot yet solve. Traditional approaches to solving such “Games with a purpose” have a vast range of applications problems focus on improving software. I advocate a novel in areas as diverse as security, computer vision, Internet approach: constructively channel human brainpower using accessibility, adult content filtering, and Internet search. computer games. For example, the ESP Game, described in Two such games under development at Carnegie Mellon this talk, is an enjoyable online game – many people play University, the ESP Game and Peekaboom, demonstrate over 40 hours a week – and when people play, they help how humans, as they play, can solve problems that com- label images on the Web with descriptive keywords. These puters can’t yet solve. keywords can be used to significantly improve the accuracy of image search. People play the game not because they LABELING RANDOM IMAGES want to help, but because they enjoy it. Several important online applications such as search en- gines and accessibility programs for the visually impaired Categories and Subject Descriptors require accurate image descriptions.
    [Show full text]
  • Google Books
    Google Books About This Guide This guide explains what is available through Google Books and how to navigate this web-based search engine. About Google Books Google Books enables you to find full-text books or information about books that exist on your topic. Some magazines are also now included with Google Books. Depending on the title and its copyright or author’s permissions, one of the following viewing options will be made available: Full View, Limited Preview, Snippet View, or No Preview Available. For more information about these viewing options, visit http://books.google.com/intl/en/googlebooks/screenshots.html. Google Books also provides full-text searching capabilities so that you can locate books containing your search terms. Titles that are out of copyright, as well as those afforded full-text viewing rights by the author, may be viewed in their entirety. Some of these full-text titles may be downloaded, saved and printed. Note: Google Books does not provide an exhaustive catalog or full-text availability for all books that may be relevant to your topic. Therefore, you should also search the Leatherby Libraries Catalog, and possibly additional library catalogs, in conjunction with your Google Books searches. For instances when the full-text of a book is not available, you may request an Interlibrary Loan. Please see the “Interlibrary Loan” section of this guide for more information. How to Access Google Books The direct link is http://books.google.com. Or, from the Google homepage, www.google.com, select “Books” from the “More” drop-down menu. Advanced Search Capabilities: Once Google books opens, enter your search terms and click on “Search Books.” On the results page, scroll down to the bottom of the page to access “Advanced Search.” The “Advanced Book Search” page allows you to search by title and author, search for exact phrases, and more.
    [Show full text]