Report on Advances in the Field of Artificial Intelligence Attributed to CAPTCHA
Utah State University
DigitalCommons@USU
All Graduate Plan B and other Reports, Graduate Studies
12-2011

Report on Advances in the Field of Artificial Intelligence Attributed to CAPTCHA
Craig M. Schow
Utah State University

Follow this and additional works at: https://digitalcommons.usu.edu/gradreports
Part of the Computer Sciences Commons

Recommended Citation
Schow, Craig M., "Report on Advances in the Field of Artificial Intelligence Attributed to CAPTCHA" (2011). All Graduate Plan B and other Reports. 69. https://digitalcommons.usu.edu/gradreports/69

This Report is brought to you for free and open access by the Graduate Studies at DigitalCommons@USU. It has been accepted for inclusion in All Graduate Plan B and other Reports by an authorized administrator of DigitalCommons@USU. For more information, please contact [email protected].

REPORT ON ADVANCES IN THE FIELD OF ARTIFICIAL INTELLIGENCE ATTRIBUTED TO CAPTCHA

By Craig M. Schow

A report submitted in partial fulfillment of the requirements for the degree of MASTERS OF SCIENCE in Computer Science

Approved:
Dr. Donald Cooley, Major Professor
Dr. Stephen Clyde, Committee Member
Dr. Nicholas Flann, Committee Member

UTAH STATE UNIVERSITY
Logan, Utah
2011

ABSTRACT

Report on Advances in the Field of Artificial Intelligence Attributed to CAPTCHA
by Craig M. Schow, Master of Science
Utah State University, 2011
Major Professor: Dr. Donald Cooley
Department: Computer Science

A CAPTCHA is a specialized human interaction proof that exploits gaps between human and computer recognition abilities. By design, the hardness of a CAPTCHA is based on the difficulty of advancing the underlying artificial intelligence (AI) technology to a level that eliminates any exploitable gap.
Because of this, computer scientists have concluded that the widespread use of CAPTCHA would accelerate research in the underlying fields of AI, eventually leading to near-human capabilities in certain AI systems. Despite these predictions, no attempt has been made to identify advances in AI that can be attributed to the use of CAPTCHA. The goal of this report is to explore the concept of CAPTCHA as a catalyst for advancement in AI. As part of this goal I examine the underlying basis for expected contributions, provide direct examples of documented advancements that have already been made, evaluate the strengths and weaknesses of the CAPTCHA model and, based on the results, identify the specific areas of AI most likely to benefit from CAPTCHA in the future.

My research shows that CAPTCHA has led to some advancement, but weaknesses in many CAPTCHA implementations have limited these advancements, which have often fallen short of expectations. As these weaknesses have been identified, new methods of implementation have been introduced, but many of these have limitations as well. In exploring these challenges I provide a basis for a more accurate understanding of the processes involved, one that allows others to continue building on the work that has already been done.

CONTENTS

ABSTRACT
CHAPTER
I. INTRODUCTION
II. OVERVIEW OF CAPTCHA
    CAPTCHA
    How CAPTCHA Works
    Gap Amplification
    Uses of CAPTCHA
III. BASIS OF CAPTCHA AS A CATALYST
    CAPTCHA as a Catalyst
    Precisely Stating the Problem
    Inducing Research
IV. DIRECT EXAMPLES
    Mori and Malik
V. OPPOSING VIEWS
    Human Tolerance and Accessibility
    Specialization of Recognizers
    Useless Answers
VI. ONGOING AND FUTURE IMPROVEMENTS
    The Strengths of CAPTCHA
    Real Pattern Recognition
    Human Cognitive Sciences
    Linguistic Cognition
VII. CONCLUSION
BIBLIOGRAPHY

CHAPTER 1
INTRODUCTION

A CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) is a type of challenge-response authentication that exploits gaps between human and computer recognition abilities in order to determine whether the user is human. By design, the hardness of a CAPTCHA is based on the difficulty of advancing the underlying artificial intelligence (AI) technology to a level that eliminates any exploitable gap. This type of system is similar to some forms of public key cryptography, where the underlying hardness is based on the difficulty of factoring large numbers. Thus, an attacker who can solve the underlying problem can defeat the protocol. For this reason, many computer scientists have concluded that the widespread use of CAPTCHA would accelerate research in the underlying field of AI, eventually leading to near-human capabilities in certain AI systems, much as cryptography has motivated research on the algorithms and hardware used for factoring [1, 2, 3, 4, 5, 17].
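As a minimal sketch of the challenge-response structure described above (it is not taken from the report), the following Python fragment generates a random answer and checks a user's response against it. The names `new_challenge` and `verify`, the alphabet, and the six-character length are assumptions made for illustration. Rendering the answer as a distorted image, which is where the gap between human and computer recognition is actually exploited, is deliberately omitted.

```python
import hmac
import secrets
import string

# Hypothetical alphabet for illustration: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length=6):
    """Return a random answer string.

    A real CAPTCHA would render this string as a distorted image (or audio)
    and show only the rendering to the user; that step is not sketched here.
    """
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def verify(expected, response):
    """Compare the user's response with the stored answer.

    The comparison ignores case and surrounding whitespace and uses a
    constant-time check so that timing does not leak the answer.
    """
    cleaned = response.strip().upper()
    return hmac.compare_digest(expected.upper().encode(), cleaned.encode())


if __name__ == "__main__":
    answer = new_challenge()          # kept server-side, never sent in the clear
    print("Challenge issued (answer hidden from the client).")
    print("Correct response accepted:", verify(answer, answer.lower()))
    print("Wrong response accepted:", verify(answer, "not-it"))
```

In this sketch the security rests entirely on the client being unable to recover the answer from the rendered challenge, which mirrors the report's point that an attacker who solves the underlying recognition problem defeats the protocol.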
Despite the common acceptance of CAPTCHA as a catalyst for advancing research in some fields of AI, very little work has been done to check the validity of this assumption or to identify weaknesses that could be addressed. This report is intended to fill the knowledge gap between the expectations that have been expressed and the actual results that have been produced. I have divided this report into the following chapters:

Overview of CAPTCHA
In this chapter I detail the main uses of CAPTCHA, describe how it works, and introduce the terms that are relevant to the report. This chapter contains enough detail to allow an individual who is unfamiliar with CAPTCHA to follow the ideas expressed in other sections of the report.

Basis of CAPTCHA as a Catalyst
In this chapter I explore the initial reasoning behind the idea that CAPTCHA would have a positive effect on the field of AI. I provide examples of related problems on which this reasoning is based and draw conclusions concerning expected outcomes. I also detail the underlying assumptions that have been made.

Direct Examples
In this chapter I identify specific examples of recent advancements in the field of AI that can be attributed in part to the use of CAPTCHA. Based on my findings, I identify strengths that have led to success and areas in which advancements have fallen short of expectations.

Opposing Views
In this chapter I explore many of the weaknesses that have been identified in current implementations of CAPTCHA. As weaknesses are presented I identify how each limits the potential effectiveness of CAPTCHA in advancing AI.

Future Improvements
In this chapter I take what has been learned and project that knowledge forward in an attempt to outline future contributions that could be made. I identify specific areas for improvement and outline the areas of AI most likely to be impacted.
Finally, I identify a general framework that can be applied to future problems.

CHAPTER 2
OVERVIEW OF CAPTCHA

CAPTCHA

A CAPTCHA is a type of challenge-response authentication scheme that asks “Are you human?” [2]. The term CAPTCHA stands for “Completely Automated Public Turing Test to Tell Computers and Humans Apart” [7]. Many authors refer to CAPTCHAs as a type of Human Interactive Proof (HIP). In order to differentiate between a human and a computer, the CAPTCHA program generates a test that is easily solved by humans but difficult to solve using a computer [1]. Generally these tests take the form of distorted text, images or audio and are often found