AUTOMATED ESSAY ANALYSIS

by

Isaac Persing

APPROVED BY SUPERVISORY COMMITTEE:

Vincent Ng, Chair

Nicholas Ruozzi

Latifur Khan

Vibhav Gogate

Copyright © 2017

Isaac Persing

All rights reserved

AUTOMATED ESSAY ANALYSIS

by

ISAAC PERSING, AS, BS, MS

DISSERTATION

Presented to the Faculty of

The University of Texas at Dallas

in Partial Fulfillment

of the Requirements

for the Degree of

DOCTOR OF PHILOSOPHY IN

COMPUTER SCIENCE

THE UNIVERSITY OF TEXAS AT DALLAS

May 2017

ACKNOWLEDGMENTS

First I would like to thank my advisor Professor Vincent Ng. It is difficult to capture the full scope of the guidance and support he has given me throughout the last decade. Whenever I asked him for help on some matter, whether it was directly related to my research with him, or just some oddball request like the time I asked him to write a letter on my behalf for an obscure insurance matter, he was always quick to respond. Thanks to his extensive knowledge of the NLP field, his dedication, and his creativity, many of the ideas contained in this document could best be described as my interpretations of his original inspirations.

I would also like to thank Professors Latifur Khan, Vibhav Gogate, and Nicholas Ruozzi for their guidance and for taking time out of their busy schedules to serve on my committee. Next I would like to thank Professor Ng's lab for their help on a variety of matters throughout the years. Members of Professor Ng's lab whom I would particularly like to thank are Alan Davis, who made many early contributions to the essay analysis project, including co-authoring Chapter 3 of this dissertation, and Kazi Saidul Hasan, who answered a lot of questions for me early on. I would like to thank the UT Dallas community in general for fostering an environment that helped me to complete this work.

I am grateful to my parents Edward Garza and Teri Persing for always being there for me throughout my time at UT Dallas and in my life in general, and to my extended family who in turn were there for my parents when I could not be.

Finally I would like to thank the donors who funded the Excellence in Education Doctoral Fellowship and the Jonsson Distinguished Research Fellowship, without whose support I could not have completed this research. I am also grateful to the National Science Foundation, whose grants IIS-0812261, IIS-1147644, IIS-1219142, and IIS-1528037 supported much of this work.

March 2017

AUTOMATED ESSAY ANALYSIS

Isaac Persing, PhD
The University of Texas at Dallas, 2017

Supervising Professor: Vincent Ng, Chair

Automated essay analysis is one of the most important educational applications of natural language processing. The practical value of automated essay analysis systems is that they can perform essay analysis tasks faster, more cheaply, and more consistently than a human can, and can be made available for student use at any time. For example, a human teacher may spend hours grading a class’s essay assignments, and due to issues like fatigue may not grade the last essay with as much care as the first. An automated essay analysis system, by contrast, can grade a stack of essays in seconds, and can be guaranteed to grade the last essay as carefully as the first.

Unfortunately, automated essay analysis is complicated by the fact that essay analysis tasks often require a deep understanding of essay text, so designing accurate automated essay analysis systems is not a trivial task. For example, we cannot judge how well an essay is organized from just the words it contains. Instead, we must often develop task-specific features in order to help a computer make sense of it.

This dissertation focuses on advancing the state-of-the-art in automated essay analysis. Specifically, we define and present new computational approaches to seven different essay analysis tasks, namely 1) scoring how well an essay is organized, 2) scoring the clarity of its thesis, 3) detecting which errors it makes that hinder its thesis's clarity, 4) scoring how well it adheres to the prompt it was written in response to, 5) scoring its argument's quality, 6) detecting the stance its author takes on a given topic, and 7) detecting the structure of the argument it makes. For each of these tasks, our approach significantly outperforms competing approaches in an evaluation on student essays annotated for the task. To stimulate future work on automated essay analysis, we make the annotations we produced for these tasks publicly available.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
   1.1 Why Are These Tasks Important?
   1.2 What Makes These Tasks Challenging?
   1.3 Our Goals and Contributions
      1.3.1 Scoring Tasks
      1.3.2 Argument Analysis Tasks
   1.4 Roadmap
CHAPTER 2 RELATED WORK
   2.1 Automated Essay Scoring
      2.1.1 Holistic Essay Scoring
      2.1.2 Prompt Adherence
      2.1.3 Coherence
   2.2 Argumentation Mining
   2.3 Other Essay Analysis
   2.4 Automated Essay Analysis Systems
CHAPTER 3 ORGANIZATION SCORING
   3.1 Introduction
   3.2 Corpus Information
   3.3 Corpus Annotation
   3.4 Function Labeling
   3.5 Heuristic-Based Organization Scoring
      3.5.1 Aligning Paragraph Sequences
      3.5.2 Aligning Sentence Sequences
   3.6 Learning-Based Organization Scoring
      3.6.1 Linear Kernel
      3.6.2 String Kernel
      3.6.3 Alignment Kernel
      3.6.4 Combining Kernels
   3.7 Evaluation
      3.7.1 Evaluation Metrics
      3.7.2 Results and Discussion
   3.8 Summary
CHAPTER 4 THESIS CLARITY SCORING
   4.1 Introduction
   4.2 Corpus Information
   4.3 Corpus Annotation
   4.4 Error Classification
      4.4.1 Model Training and Application
      4.4.2 Baseline Features
      4.4.3 Novel Features
   4.5 Score Prediction
   4.6 Evaluation
      4.6.1 Error Identification
      4.6.2 Scoring
   4.7 Summary
CHAPTER 5 PROMPT ADHERENCE SCORING
   5.1 Introduction
   5.2 Corpus Information
   5.3 Corpus Annotation
   5.4 Score Prediction
      5.4.1 Model Training and Application
      5.4.2 Baseline Features
      5.4.3 Novel Features
   5.5 Evaluation
      5.5.1 Experimental Setup
      5.5.2 Results and Discussion
      5.5.3 Feature Ablation
      5.5.4 Analysis of Predicted Scores
   5.6 Summary
CHAPTER 6 ARGUMENT QUALITY SCORING
   6.1 Introduction
   6.2 Corpus Information
   6.3 Corpus Annotation
   6.4 Score Prediction
   6.5 Baseline Systems
      6.5.1 Baseline 1: Most Frequent Baseline
      6.5.2 Baseline 2: Learning-based Ong et al.
   6.6 Our Approach
   6.7 Evaluation
      6.7.1 Scoring Metrics
      6.7.2 Results and Discussion
      6.7.3 Feature Ablation
      6.7.4 Analysis of Predicted Scores
   6.8 Summary
CHAPTER 7 STANCE CLASSIFICATION
   7.1 Introduction
   7.2 Corpus
   7.3 Baseline Stance Classification Systems
      7.3.1 Agree Strongly Baseline
      7.3.2 N-Gram Baseline
      7.3.3 Duplicated Faulkner Baseline
      7.3.4 N-Gram+Duplicated Faulkner Baseline
   7.4 Our Approach
      7.4.1 Stancetaking Path-Based Features
      7.4.2 Knowledge-Based Features
   7.5 Evaluation
      7.5.1 Experimental Setup
      7.5.2 Results and Discussion
      7.5.3 Additional Experiments
      7.5.4 Error Analysis
   7.6 Summary
CHAPTER 8 ARGUMENTATION MINING WITH INTEGER LINEAR PROGRAMMING
   8.1 Introduction
   8.2 Related Work
   8.3 Corpus
   8.4 Pipeline-Based Argument Mining
      8.4.1 Argument Component Identification
      8.4.2 Relation Identification
   8.5 Joint Inference for Argument Mining
      8.5.1 Motivation
      8.5.2 Basic ILP Approach
      8.5.3 Enforcing Consistency Constraints
      8.5.4 F-score Maximizing Objective Function
   8.6 Evaluation
      8.6.1 Experimental Setup
      8.6.2 Results and Discussion
      8.6.3 Ablation Results
      8.6.4 Error Analysis and Future Work
   8.7 Summary
CHAPTER 9 ARGUMENTATION MINING WITH STRUCTURED PERCEPTRONS
   9.1 Introduction
   9.2 Related Work
   9.3 Corpus
   9.4 Baselines
      9.4.1 Pipeline (PIPE)
      9.4.2 Integer Linear Programming (ILP)
   9.5 Approach
   9.6 Evaluation
      9.6.1 Experimental Setup
      9.6.2 Results and Discussion
      9.6.3 Error Analysis and Future Work
   9.7 Summary
CHAPTER 10 UNSUPERVISED ARGUMENTATION MINING
   10.1 Introduction
   10.2 Related Work
   10.3 Corpus
   10.4 Baselines
   10.5 Unsupervised Approach
      10.5.1 Heuristic Labeling
      10.5.2 Self-Training
      10.5.3 Building the Argument Tree
   10.6 Evaluation
      10.6.1 Experimental Setup
      10.6.2 Results and Discussion
      10.6.3 Error Analysis and Future Work
   10.7 Summary
CHAPTER 11 CONCLUSION
REFERENCES
BIOGRAPHICAL SKETCH
CURRICULUM VITAE

LIST OF FIGURES

7.1 Automatic dependency parse of a prompt part.
7.2 Automatic dependency parse of an essay sentence.

LIST OF TABLES

3.1 Some example writing topics.
3.2 Description of each organization score.
3.3 Distribution of organization scores.
3.4 Description of paragraph function labels.
3.5 Description of sentence function labels.
3.6 Organization results.
4.1 Description of each thesis clarity score.
4.2 Distribution of thesis clarity scores.
4.3 Description of thesis clarity errors.
4.4 Distribution of thesis clarity errors.
4.5 Thesis clarity results.
4.6 Distribution of predicted thesis clarity scores.
5.1 Description of each prompt adherence score.
5.2 Distribution of prompt adherence scores.
5.3 Prompt adherence results.
5.4 Prompt adherence feature ablation results.
5.5 Distribution of predicted prompt adherence scores.
6.1 Description of each argument quality score.
6.2 Distribution of argument quality scores.
6.3 Sentence labeling rules.
6.4 Argument component candidate extraction rules.
6.5 Argument quality results.
6.6 Argument quality feature ablation results.
6.7 Distribution of predicted argument quality scores.
7.1 Example prompts and prompt parts.
7.2 Stance label counts and definitions.
7.3 Stance results.
7.4 Coarse-grained stance results.
8.1 Argument mining corpus statistics (v1).
8.2 Rules for extracting ACC boundary locations.
8.3 Argument mining ILP results.
8.4 Argument mining ILP ablation results.
9.1 Argument mining corpus statistics (v2).
9.2 Argument mining structured perceptron results.
10.1 Heuristic argument component extraction contexts.
10.2 Argument mining unsupervised results.

CHAPTER 1

INTRODUCTION

Automated essay analysis is one of the most important educational applications of Natural Language Processing (NLP). Given an essay, an automated essay analysis system's goal is to provide annotations on the essay with respect to some specific task without human intervention. Some tasks an automated essay processor might perform include scoring essays along any of several dimensions of quality (e.g., applying a score between 0 and 6 to describe how coherent the essay is or how well it adheres to its prompt), identifying various types of errors that occur in it (e.g., finding spelling or grammatical errors in its text or logical errors in its argumentation structure), labeling its sentences according to what function they perform in the overall essay (e.g., identifying thesis sentences), or more abstract tasks like constructing a tree from the essay's text to describe how its argument is structured.

1.1 Why Are These Tasks Important?

While it is interesting that these tasks can be performed by a computer without human intervention, of what practical use is an automated essay processing system? Some major reasons for preferring an automated essay processing system over a human annotator include time, cost, consistency, and availability.

The amount of time a human annotator requires to perform one of these essay analysis tasks is heavily dependent on the nature of the task. For example, she may be able to assign the essay a holistic score relatively quickly, as assigning such a score only requires the annotator to read the essay and then to make one scoring decision based on the essay's quality. By contrast, it would take the annotator much longer to perform some task that involves making decisions about each sentence or clause in the essay (as in some argument analysis tasks). Regardless of the complexity of the task, however, the annotator has to, at a minimum, read an essay in order to provide any annotation on it, and this may take many minutes depending on the length of the essay. An automated essay processing system, by contrast, can usually generate annotations on an essay in a fraction of a second.

Designing an automated essay analysis system for performing some task, however, can take years and requires a lot of specialist knowledge. As such skilled labor is expensive, it seems at first glance that cost would be a major reason to prefer human annotation over automated essay analysis. However, once such a system has been created, it can be applied to an unlimited number of essays essentially for free. Human essay annotation is comparatively expensive on a per-essay basis. Thus an organization like the Educational Testing Service, which evaluates many thousands of essays written for the GRE and TOEFL tests, may find automated essay analysis to be an appealing alternative to human annotation.

Humans can also introduce inconsistencies into the annotation process. For example, it is often impractical to arrange for the same human to process all the essays that need to be processed. If some annotators are more lenient than others (and thus prefer to give higher scores on scoring tasks), it is conceivable that a student's likelihood of passing an essay examination depends at least partly on luck, where a luckier student is one whose essay is evaluated by a lenient annotator, or by an annotator who is in a good mood when she evaluates the essay.

Finally, since computers do not need to sleep, an automated essay analysis system can be made available to evaluate and provide feedback on a student's essay at any time. A human teacher, by contrast, may see her students only a few times a week. And even when her class does meet, she may not have time to give each student much individual attention.

1.2 What Makes These Tasks Challenging?

As we will discuss in Chapter 2, a considerable amount of work has been done on a wide variety of automated essay analysis tasks. However, many of the tasks we address in this dissertation are new to the NLP community. Since they have not been addressed before, there is not an existing framework to help us understand how to think about the problems.

This would not be a problem, however, if it were the case that these tasks could successfully be addressed by posing them to a standard machine learning algorithm using common NLP features. Unfortunately, this is not the case. A deeper understanding of an essay's text is required in order to perform these tasks well. We must therefore form new linguistic insights into each task and then develop ways to incorporate these new insights into automated essay analysis systems in order to achieve good performance on these tasks. In the next section we discuss how we accomplish this by introducing new linguistically-motivated features and making adjustments to our models for solving the problems.

1.3 Our Goals and Contributions

Our goal in this dissertation is to design state-of-the-art approaches to a variety of automated essay analysis tasks. For each task, we demonstrate our approach’s quality by showing that it significantly outperforms a reasonable baseline approach for the task. We categorize the tasks we address into ones where the goal is to score essays along some dimension of quality, and ones where the goal is to form an understanding of essay argumentative structure.

1.3.1 Scoring Tasks

The first task we address is automated organization scoring of student essays. The goal in this task is to assign each essay a score on a scale of 1-4 to indicate how well the essay is organized. We describe our approach to this task in Chapter 3.

Our main insight into this task is that it is possible to abstract an essay's organization as a sequence of sentence or paragraph labels. For example, a five-paragraph essay might start with an introductory (I) paragraph, followed by three body (B) paragraphs which offer support for the essay's thesis, and finally a conclusion (C) paragraph which (re)states the essay's thesis and summarizes its main points. For organization scoring, the order in which these paragraphs appear (in this example, IBBBC) is more important than, say, any of the words the essay uses. For this reason, in our approach we develop methods for automatically applying such labels to all of an essay's paragraphs and sentences. We then incorporate various flat and structured features into our approach based on an essay's sequence of sentence or paragraph labels.

In Chapter 4, we develop an approach to the task of automatically scoring the clarity of an essay's thesis on a scale of 1-4. For this task, we notice that normal essays have a fairly clear thesis, and thus should get a high thesis clarity score. When an essay receives a lower thesis clarity score, it is usually because the essay has committed some error that makes its thesis less clear. For this reason, we constructed a list of five common errors that cause essays to receive low thesis clarity scores. We then developed linguistically-motivated features for detecting each of these errors. Our approach, then, incorporates all of these error detection features into a standard regressor. To make our system more useful, we additionally train classifiers for detecting the presence of each error, as we believe this could be used to give students more informative feedback about their thesis clarity scores.

All of the essays in the corpora we use for developing our approaches to automated essay analysis tasks are written in response to an essay prompt, which asks the student to express and defend their opinion on some specific topic. In Chapter 5, we develop an approach for the task of scoring how well an essay adheres to this prompt on a scale of 1-4. For this task, we notice that an essay that adheres to its prompt consistently stays on topic and is free of irrelevant digressions. For this reason, our approach employs a variety of features that measure an essay's similarity to its prompt through the use of the same words, semantically similar words, or topically similar words.

The last essay scoring task we address is argument quality scoring. In Chapter 6, we develop an approach for the task of scoring an essay's argument quality on a scale of 1-4. For this task, we notice that an essay with a high argument quality score should present a sound argument and should be able to persuade readers of its thesis. To get at soundness, we develop a variety of features that capture whether the argument is correctly structured. For example, some of our features attempt to capture whether the essay has a thesis, or whether it contains enough evidence to support its claims. We get at persuasiveness by developing features that try to understand the semantic content of a student's essay. For example, students who advocate for unpopular views in their essays will have a hard time persuading readers, so we use linguistic features to detect whether a student is expressing an unpopular view, then employ this classification as a feature for argument quality scoring.

1.3.2 Argument Analysis Tasks

The first major non-scoring automated essay analysis task we undertake is stance classification. Recall that all of our essays are written in response to a prompt which expresses a position on some topic. Given an essay and the prompt to which it responds, the goal of a stance classifier is to determine whether the essay agrees or disagrees with the prompt and the intensity of that (dis)agreement. Our approach to this task is described in Chapter 7.

Our main insight into this task is that essays typically explicitly state their position with respect to the prompt’s stance in only one or two sentences in the entire essay. The rest of the essay does not provide much additional useful information for stance classification. To exploit this observation, we develop heuristics for detecting where these explicit statements of the author’s stance occur. We then develop linguistically-motivated features based on our attempt to extract stance information out of these statements.
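To make the idea of locating explicit stance statements concrete, the toy sketch below flags sentences containing first-person (dis)agreement markers. It is only an illustration of the general kind of heuristic described above; the regular expression is our own assumption and is not the set of rules actually used in Chapter 7.

    import re

    # Toy illustration: flag sentences that explicitly state the author's stance.
    # The pattern below is our own assumption, not the heuristic used in Chapter 7.
    STANCE_PATTERN = re.compile(
        r"\bI\s+(strongly\s+|completely\s+|somewhat\s+)?(dis)?agree\b", re.IGNORECASE)

    def explicit_stance_sentences(sentences):
        """Return the sentences that appear to state the author's stance explicitly."""
        return [s for s in sentences if STANCE_PATTERN.search(s)]

    essay = ["Some people think museums should be free.",
             "I strongly disagree with this statement.",
             "Free admission would reduce funding for exhibits."]
    print(explicit_stance_sentences(essay))  # ['I strongly disagree with this statement.']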

The final task we address is argumentation mining in student essays, which involves two challenging subtasks: identifying the components of the essay's argument and identifying the relations that occur between them. In Chapters 8, 9, and 10 we propose three different approaches to these tasks. Our main insight into this task is that the relatedness of its subtasks means that we can benefit from solving both problems jointly rather than sequentially (i.e., performing both tasks at once rather than first identifying arguments and then identifying relationships between them).

In Chapter 8 we perform joint inference on the tasks in order to build an essay's argument tree via integer linear programming (ILP). This gives us an elegant way of enforcing essay-level constraints (e.g., an essay must include one thesis statement), while building trees that maximize the subtasks' expected average F-score. However, our ILP approach can be thought of as making a set of unrelated local decisions which it then stitches together into a consistent argument tree. It cannot exploit tree-level information (like the depth of the proposed argument tree or how many claims it contains) to determine whether a proposed argument tree is good. To address this concern, in Chapter 9, we pose argumentation mining as a structure prediction problem, wherein we use a structured perceptron with tree-level features to identify which of a set of trees is most likely to be the essay's true argument tree. Both of these approaches, however, are supervised, meaning that they require a lot of annotated data in order to work. In Chapter 10, we develop new linguistic insights into the task that allow us to address it in an unsupervised manner. More specifically, we first use our insights to heuristically label some training data. We then augment this data with additional automatically labeled (but not manually annotated) instances obtained via self-training. We finally train a learner on this data.
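As a hedged illustration of how essay-level constraints can be written down in an ILP, the sketch below uses the PuLP library to choose one label per candidate sentence while requiring exactly one thesis. The local scores, label set, and constraints are simplified assumptions for illustration only, not the actual formulation developed in Chapter 8.

    import pulp  # pip install pulp

    # Hypothetical local scores P(label | sentence) for three candidate sentences.
    scores = {  # these numbers are made up for illustration
        (0, "Thesis"): 0.7, (0, "Claim"): 0.2, (0, "None"): 0.1,
        (1, "Thesis"): 0.4, (1, "Claim"): 0.5, (1, "None"): 0.1,
        (2, "Thesis"): 0.1, (2, "Claim"): 0.3, (2, "None"): 0.6,
    }
    sentences = [0, 1, 2]
    labels = ["Thesis", "Claim", "None"]

    prob = pulp.LpProblem("toy_argument_mining", pulp.LpMaximize)
    x = {(s, l): pulp.LpVariable(f"x_{s}_{l}", cat="Binary")
         for s in sentences for l in labels}

    # Objective: maximize the summed score of the chosen labels.
    prob += pulp.lpSum(scores[s, l] * x[s, l] for s in sentences for l in labels)

    # Each sentence receives exactly one label.
    for s in sentences:
        prob += pulp.lpSum(x[s, l] for l in labels) == 1

    # Essay-level constraint: exactly one thesis statement.
    prob += pulp.lpSum(x[s, "Thesis"] for s in sentences) == 1

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print({s: next(l for l in labels if x[s, l].value() == 1) for s in sentences})

Running this toy program selects the highest-scoring labeling that still satisfies the thesis constraint, which is the essential idea behind enforcing consistency at the essay level.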

1.4 Roadmap

The remainder of this dissertation is organized as follows. In Chapter 2, we discuss previous work on automated essay analysis. Each of the next five chapters describes a particular essay analysis task, presents our approach to solving the task, and analyzes our approach's performance on the task compared to a reasonable baseline approach. More specifically, the tasks covered in these chapters are organization scoring (Chapter 3), thesis clarity scoring (Chapter 4), prompt adherence scoring (Chapter 5), argument quality scoring (Chapter 6), and stance classification (Chapter 7). The next three chapters perform the same kinds of analyses on the essay argumentation mining task, with each chapter focusing on a different approach to the task. These approaches include integer linear programming (Chapter 8), structured perceptrons (Chapter 9), and unsupervised argumentation mining (Chapter 10). Finally, we conclude in Chapter 11.

CHAPTER 2

RELATED WORK

In this chapter we will discuss the existing work on automated essay analysis.

2.1 Automated Essay Scoring

In this section we will discuss several of the dimensions along which essays can be automatically scored.

2.1.1 Holistic Essay Scoring

By far the most common automated essay scoring task is holistic essay scoring. That is, given an essay, the goal of a holistic automated essay scorer is to evaluate its overall quality. This type of evaluation abstracts away questions about an essay's quality along specific dimensions (e.g., how well it is organized, how strong its argument is in the case of argumentative essays, or how technically correct its grammar is), and instead tries to represent the essay's quality along these dimensions in one score.

While we will find that some of the works described in later sections also develop systems for predicting an essay's holistic score, those works' ultimate goals are narrower. For example, such a work may ask whether it is possible to demonstrate the importance of a specific dimension of essay quality by integrating features targeted at that dimension into a state-of-the-art holistic essay scorer. In this section, however, we will focus solely on works aimed at developing a holistic automated essay scoring system.

In Burstein et al. (1998), the authors develop the holistic essay scoring system e-rater. Given an essay, e-rater's goal is to assign the essay a score in the range of 0-6 describing the essay's overall quality, with a score of 0 being associated with the poorest essays and a score of 6 being associated with the best essays. To perform this task, the authors train a model from score-annotated training data using three categories of features.

In order to generate their syntactic features, they first syntactically parse all sentences in their corpus. From these syntactic parses, they extract features such as the frequencies of complement clauses, subordinate clauses, relative clauses, and occurrences of subjunctive modal auxiliary verbs (e.g., “would”, “could”, “should”).
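As a rough sketch of how such frequency features can be computed, the snippet below counts subjunctive modal auxiliaries and several clause types using spaCy part-of-speech and dependency labels. This is our own approximation for illustration; it is not e-rater's actual feature extractor, and the dependency-based clause counts are only a proxy for what a full syntactic parse provides.

    import spacy  # assumes the en_core_web_sm model is installed

    nlp = spacy.load("en_core_web_sm")
    SUBJUNCTIVE_MODALS = {"would", "could", "should"}

    def syntactic_feature_counts(essay_text):
        """Count a few clause-level phenomena, normalized by sentence count."""
        doc = nlp(essay_text)
        n_sents = max(1, sum(1 for _ in doc.sents))
        modals = sum(1 for tok in doc
                     if tok.tag_ == "MD" and tok.lower_ in SUBJUNCTIVE_MODALS)
        # Approximate clause counts from dependency labels.
        complement_clauses = sum(1 for tok in doc if tok.dep_ == "ccomp")
        relative_clauses = sum(1 for tok in doc if tok.dep_ == "relcl")
        subordinate_clauses = sum(1 for tok in doc if tok.dep_ == "advcl")
        return {
            "subjunctive_modals_per_sent": modals / n_sents,
            "complement_clauses_per_sent": complement_clauses / n_sents,
            "relative_clauses_per_sent": relative_clauses / n_sents,
            "subordinate_clauses_per_sent": subordinate_clauses / n_sents,
        }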

To generate rhetorical structure analysis features, they first run each essay through a program that automatically segments it into its component arguments based on syntactic and paragraph-based distribution of cue words (e.g., “first”, “second”) and labels its argument units (e.g., which subsegments of an argument mark its beginning, and which subsegments develop the argument). They extract features from these annotations such as the total number of pronouns in subsegments marked as beginning an argument and total number of words in subsegments that develop an argument.

Finally, they generate two topic analysis features, both of which are just predictions of an essay's score based on the vocabulary it uses. To build the first of these features, they first construct seven vocabulary vectors from the training set, one for each 0-6 score. Each element in a vector indicates the frequency with which a particular word occurs in essays having the associated score. They create another of these frequency vectors from each test essay, and assign the essay a score by computing its cosine similarity with each of the seven training set vocabulary vectors. The test essay's predicted score, then, is the score associated with the training vector that was found to be most similar. The rationale for this is that high scoring essays should use precise vocabulary to discuss a topic, while low scoring essays will use more general vocabulary. So essays with similar scores should have similar vocabulary vectors.
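A minimal sketch of this first topic analysis feature is given below, assuming simple whitespace tokenization, raw word-count vectors, and cosine similarity; the actual e-rater implementation may differ in preprocessing and weighting.

    from collections import Counter
    import math

    def cosine(u, v):
        """Cosine similarity between two sparse word-count vectors."""
        dot = sum(u[w] * v.get(w, 0) for w in u)
        norm = (math.sqrt(sum(c * c for c in u.values()))
                * math.sqrt(sum(c * c for c in v.values())))
        return dot / norm if norm else 0.0

    def build_score_vectors(training_essays):
        """training_essays: list of (text, score) pairs with scores 0-6."""
        vectors = {score: Counter() for score in range(7)}
        for text, score in training_essays:
            vectors[score].update(text.lower().split())
        return vectors

    def predict_score(essay_text, score_vectors):
        """Assign the score whose vocabulary vector is most similar to the essay's."""
        essay_vec = Counter(essay_text.lower().split())
        return max(score_vectors, key=lambda s: cosine(essay_vec, score_vectors[s]))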

The second topic analysis feature is just another essay score prediction generated using broadly the same method. Particularly, they still create seven vocabulary vectors from the training essays, one for each score. Each element in the vector still represents one word in the vocabulary, but the value of the element is calculated from a formula involving the word's frequency in essays having the same score, its frequency in all training set essays, and the frequency of the most frequent word in essays with the associated score. For each test essay, one of these vectors is generated for each argument (as automatically delineated in the rhetorical structure analysis description). Each argument, then, is scored by taking its vector's cosine similarity with the seven training vectors and selecting the score associated with the most similar vector. Finally, they take an adjusted mean of an essay's argument scores as the second topic analysis feature.

For evaluation the authors use a corpus of about 9,570 essays written by college students for the GMAT and the TWE tests in response to 15 different prompts. Each of the essays is assigned a holistic score by two different human annotators. For each prompt, the authors train their regressor on 270 essays and evaluate its performance on the remaining essays responding to the same prompt. One of their major findings is that e-rater's performance is only slightly worse than a human annotator's, as it disagreed with human annotations only slightly more often than the humans disagreed with each other's annotations. They also notice that this performance is consistent regardless of whether or not the essay is written by a native English speaker. Further analysis of e-rater's performance with regard to essays written by non-native English speakers can be found in Burstein and Chodorow (1999).

One difficulty automated essay scoring systems encounter is that they attempt to assign an essay a score based on the entirety of its text. However, it may be the case that only a subset of an essay's text actually represents salient concepts, and features derived from the rest of it may merely confuse the learner. To address this problem, Burstein and Marcu (2000) experiment with summarizing essays before attempting to score them. In particular, they apply two methods of summarization to both training essays and test essays. The positional summarizer summarizes text based on its position in the essay.

Particularly, in order to produce a k% summarization of an essay, it extracts only the first k% of words appearing in the essay to represent it. The discourse-based summarizer is considerably more sophisticated. It first automatically discourse parses an essay into a binary rhetorical structure tree, where each leaf node is associated with a clause in the text, each internal node is associated with a contiguous span of text that subsumes its children, and each edge describes a relationship between text spans (e.g., one text span may provide evidence backing up a claim made in another, or two text spans may contain contrasting information). In the process of constructing a tree, each node is also assigned a status of nucleus or satellite, where nucleus nodes represent text that is more important to an argument, and satellite nodes represent text that is subsidiary to a nucleus's text. Each leaf is assigned an importance value depending on its proximity to the tree's root, its status, and the structure of the tree. To construct a k% summarization, then, the summarizer chooses the most important clauses until k% of words in the essay have been included.
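The positional summarizer, at least as described here, is straightforward to sketch; the whitespace tokenization below is a simplifying assumption of ours.

    def positional_summary(essay_text, k):
        """Return the first k% of the essay's words as its summary (e.g., k=40)."""
        words = essay_text.split()
        cutoff = max(1, int(len(words) * k / 100))
        return " ".join(words[:cutoff])

The discourse-based summarizer would instead rank clauses by the importance values derived from the rhetorical structure tree and keep the top-ranked clauses until the k% word budget is reached.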

For evaluation, the authors use 15,400 holistically scored essays written by college students for the GMAT test. For each prompt represented in this data, they run many experiments using 270 essays for training and 500 essays for testing. The experiments involve 20%, 40%, 60% and 100% (full text) positional and discourse-based summarizations of both the training and test essays, and they compare each set of predictions derived using summarization to the predictions made using the full texts. Since this paper's focus is on summarization, they just use the second topic analysis feature we described from e-rater, which, recall, is a score prediction in its own right and was found to be the most predictive feature in the previous work.

The authors find that 40% and 60% discourse-based summaries always perform significantly better than full texts, regardless of whether only the test essays are summarized, or both the training and test essays are summarized. All 20% and positional summaries, however, always perform worse than full texts.

As we come to the end of this section, we notice that, while we have discussed several works and their approaches to automated holistic essay scoring, we have not addressed why holistic essay scoring is useful. This is important because other finer-grained automated essay analysis tasks are much more straightforwardly useful. For example, an automated essay analysis system designed for detecting thesis statements can inform a student if her essay has no thesis statement, and this would help the student improve her writing because she would know specifically what needs to be changed to correct her essay. A holistic essay scorer cannot provide the student such feedback because a holistic score does not make clear to the student what aspect of her writing a poor score can be attributed to.

While Powers et al. (2000) does not describe any new methods for holistic automated essay scoring, it does provide a rationale for automated holistic scoring. More specifically, when a student is asked to write an essay for evaluation, it is not the quality of the essay itself we are interested in. Instead, we expect this essay's quality to be indicative of the quality of work the student is able to produce. Potential graduate students are required to write essays for the Graduate Record Examination (GRE), for example, because the graduate schools into which these students want to be admitted believe that a student who writes a high quality essay for this test can do well in graduate school, while a student who writes a poor essay may not be ready for graduate work. The authors of this work attempt to determine whether this is a reasonable assumption by calculating the correlation between human and automatically assigned scores for essays and other student quality metrics unrelated to the essays. For example, they measure whether students who reported having significant writing accomplishments in the past (e.g., getting works published) are also more likely to get high scores on the essays they write for the GRE. They also measure whether students who do well on the GRE essays have high GPAs in writing courses. The authors reason that if GRE essay scores are correlated strongly enough with other measures of achievement, then it is reasonable to use GRE essay scores as a proxy for expected student performance.

In an evaluation of 1700 students who completed the GRE, the authors find that holistic GRE essay score is indeed moderately correlated with some of these other measures of student success, with the degree of correlation varying depending on whether human-annotated or automatically assigned holistic essay scores are used and which student success metric is being compared. In general, human-assigned GRE essay scores are more strongly correlated with student success than are scores automatically assigned by e-rater, the automated essay scoring system used by the Educational Testing Service (ETS).

However, ETS requires that all GRE essays be evaluated by two human annotators in order to ensure consistency of scoring, so the authors also measure the correlation between the various metrics of student success and the average of the two human-assigned scores. The authors find that this considerably improves the correlation between human-annotated score and student success, justifying ETS’s decision to require two human annotators per essay.

However, the authors find that if they replace one of the two human-annotated scores with a holistic score automatically assigned by e-rater before taking the average, the correlation between all performance metrics and the average score drops only slightly compared to when two human-assigned scores are used. These findings suggest that ETS's essay scoring costs can be almost halved by simply replacing one human annotator on each essay with e-rater with very little drop in performance, thus justifying the use of automated essay scoring systems like e-rater.

2.1.2 Prompt Adherence

Prompt adherence is the dimension of essay quality dealing with whether, and to what degree, an essay is on topic. That is, the type of essay we are typically concerned with in automated essay scoring (the type that is written by a student for some assignment or test) is written in response to a particular prompt. The essay writer's goal is to write an essay on the topic described by this prompt. If the essay for some reason does not accomplish this, we say that the essay does not adhere to its prompt. This problem is typically treated as a binary classification problem, where an essay is considered either on or off topic. However, it is also possible to rate essays on a numerical scale based on how well their authors adhere to the prompt, or to determine which sentences in an essay address the prompt and which are off topic.

Higgins et al. (2006) is one of the works that treats prompt adherence scoring as a binary classification task. The authors identify five common categories of off-topic essays: empty essays (the writer did not write anything), unexpected topic (the writer wrote an essay on a different topic than the one the prompt describes), banging-on-the-keyboard (the essay consists of gibberish as might result from banging on a keyboard), copied prompt (the entirety of the essay is a possibly slightly modified copy of the prompt), and irrelevant musings (the writer knowingly wrote something unrelated to the prompt). This work is concerned only with identifying instances of unexpected topic, copied prompt, and irrelevant musings, the latter two of which are collectively called "bad-faith".

The authors describe several methods for detecting these types of off-topic essays. One method they use is to represent each essay and each prompt as a vector using content vector analysis. If the cosine similarity between a given essay and all other essays having the same prompt is beneath some threshold, it is classified as off topic. Similarly, if the essay's cosine similarity with its prompt is beneath some threshold, the essay is classified as off topic.

Another method involves assigning a weight to each word in the training set such that the word's weight is high if it occurs frequently in essays responding to the essay under consideration's prompt and infrequently in essays for other prompts. The word's weight should be low otherwise. They then average the weights of all words in the essay under consideration. If this average is beneath some threshold, the essay is considered off topic.
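Both of these methods reduce to comparing a quantity against a threshold. A minimal sketch, assuming whitespace tokenization, raw word-count vectors, and thresholds chosen arbitrarily for illustration (the published system's exact representation and thresholds may differ), is:

    from collections import Counter
    import math

    def cosine(u, v):
        """Cosine similarity between two sparse word-count vectors."""
        dot = sum(u[w] * v.get(w, 0) for w in u)
        norm = (math.sqrt(sum(c * c for c in u.values()))
                * math.sqrt(sum(c * c for c in v.values())))
        return dot / norm if norm else 0.0

    def off_topic_by_prompt_similarity(essay_text, prompt_text, threshold=0.1):
        """First method (simplified): flag the essay if it is too dissimilar to its prompt."""
        essay_vec = Counter(essay_text.lower().split())
        prompt_vec = Counter(prompt_text.lower().split())
        return cosine(essay_vec, prompt_vec) < threshold

    def off_topic_by_word_weights(essay_text, word_weights, threshold=0.2):
        """Second method (simplified): flag the essay if its average word weight is too low.
        word_weights maps each training-set word to a weight that is high when the word is
        frequent in essays for this prompt and rare in essays for other prompts."""
        words = essay_text.lower().split()
        if not words:
            return True
        avg = sum(word_weights.get(w, 0.0) for w in words) / len(words)
        return avg < threshold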

The authors also present some category-specific methods for identifying off-topic essays. Their method for identifying unexpected topic essays again represents the essay under consideration, all prompts in the training set, and the essay under consideration's prompt as vectors using content vector analysis. Then, the cosine similarity between the essay and all these prompts is calculated and the prompts are ranked according to how similar they are to the essay. If the correct prompt is not one of the top few prompts in this list, the essay is considered off-topic. Finally, the authors present a method for identifying essays written in bad faith. They represent each essay as a vector that includes features like the count of words in the essay that are not in the prompt and the frequency of direct address markers (e.g., "hello"). They then train an SVM to recognize bad-faith essays using these features.

Since the tasks are slightly different and the available essays are not all annotated exactly the same way, the methods are not all evaluated on the same dataset. But taken together, all the datasets used for evaluation include 11,138 different essays written on no fewer than 36 prompts by students ranging in age from 6th grade to college. The authors find via cross-validation experiments that, while the first two methods are better at identifying whether an essay is off-topic, the latter two methods might be preferred because they require no training data written on the same topic as test essays. This would allow teachers to give students new essay prompts while still allowing them to evaluate their students' essays automatically.

Louis and Higgins (2010) also view prompt adherence scoring as a binary classification, where an essay is either on topic or off topic. As a baseline against which to compare their systems for this task, they use Higgins et al. (2006)'s system for identifying unexpected topic essays described earlier. That is, they represent a test essay and several prompts (including the prompt the essay was written in response to) as vectors. They then take the cosine similarities between the essay and each prompt, and if the most similar prompt is the prompt the essay was written in response to, the essay is considered on topic. Otherwise it is considered off topic.

A problem with this approach is that essay writers might not use exactly the same words that appear in the prompt when discussing the essay's topic. So for example, given a prompt about "friends", an author might instead discuss "buddies". This use of a synonym for "friend" would result in a lower cosine similarity between the essay and its prompt than is desired since the essay is conceptually on topic. Motivated by problems like this, the authors develop several methods for adding prompt-related words to prompts before encoding them as vectors. So when a modified prompt's cosine similarity with a conceptually similar essay is calculated, the similarity score should still be high even if the essay does not use the same vocabulary as the prompt.

The methods the authors use to expand the essay prompts are 1) to add inflected forms of each word in the prompt to the prompt (e.g., for the prompt word “friend”, they would add words like “friendly” and “friendliness”), 2) to add synonyms of each prompt word to the prompt, 3) to add words that appear in similar contexts as the prompt words to the prompt (e.g., for the prompt word “friendly”, they may add “polite” or “cheerful”), 4) to add words associated with each prompt word to the prompt, where “associated” words are found via psychological word free association experiments conducted by Nelson et al. (2004).
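A sketch of the second expansion strategy (adding synonyms), using NLTK's WordNet interface, might look like the following; it is only an illustration, and the other three strategies would additionally require a morphological generator, a distributional similarity model, and the Nelson et al. (2004) association norms, respectively.

    from nltk.corpus import wordnet as wn  # assumes the WordNet data has been downloaded

    def expand_prompt_with_synonyms(prompt_text):
        """Return the prompt's word set augmented with WordNet synonyms of each word."""
        expanded = set(prompt_text.lower().split())
        for word in list(expanded):
            for synset in wn.synsets(word):
                for lemma in synset.lemma_names():
                    expanded.add(lemma.replace("_", " ").lower())
        return expanded

    # Prints a sorted sample of the expanded word set for the prompt word "friend".
    print(sorted(expand_prompt_with_synonyms("friend"))[:10])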

The authors additionally experiment with automatically correcting the spelling of essays before calculating cosine similarities in order to ensure that spelling errors do not cause their system to believe an essay is off topic.

In an evaluation on four different essay corpora written by college-age students including both native English speakers and learners of English as a second language, the authors find that their methods in some cases greatly reduce the number of essays wrongly classified as off topic compared to the baseline system.

2.1.3 Coherence

Coherence is the essay dimension that describes how well an essay's meaning can be constructed. That is, a coherent essay is one whose meaning can easily be understood by a reader, while an essay whose meaning cannot be understood is considered incoherent.

While coherence contributes to an essay's overall quality, it is only one of several contributing factors. To illustrate this point, note that persuasive essays with insufficient evidence to back up their points and essays that do not adhere to their prompts may receive low holistic scores while remaining highly coherent. Indeed, Burstein et al. (2013) measure the correlation between annotations of essay holistic scores and coherence scores on a corpus of 1,555 student essays. The measured Pearson correlation coefficients are between 0.46 and 0.58, suggesting that there is a medium to strong correlation between essay coherence and overall quality, but that overall quality still captures aspects of essay writing unrelated to coherence.

Burstein et al. (2013) describe a scheme for annotating essays with respect to the coherence dimension on a three-point scale. A score of 1 on this scale means that an essay is incoherent so that no meaning can be constructed from it. A score of 2 means that an essay is mostly coherent so its meaning can still be constructed, but one or two points in the text remain confusing to the reader. Finally, a score of 3 indicates a highly coherent essay whose meaning can easily be constructed. However, because fine-grained analysis of essay scoring tasks is often difficult, there was not sufficient inter-annotator agreement on this task. So after annotation, the authors treated automated essay coherence scoring as a binary classification task, where essays annotated with a score of 1 are considered incoherent, and essays with a score of 2 or 3 are considered coherent.

The authors also construct a system for automatically predicting whether a given essay is coherent or not. Their decision tree classifier employs grammatical error features from e-rater (Attali and Burstein, 2006), entity-grid transition probabilities (Barzilay and Lapata, 2008), rhetorical structure theory features (Mann and Thompson, 1988), and entity type/token ratios recovered from the aforementioned entity-grid.

They employ cross-validation to evaluate their system on the aforementioned 1,555 essay corpus, which includes summarization, expository, and persuasive essays written by native and non-native English speakers ranging in age from sixth grade to the college graduate level. Experimental results show that their system outperforms baselines which employ only the features derived from e-rater and which employ the e-rater features and grammar error features.

In Burstein et al. (2010) the authors describe another system for binary coherence classification, where each essay is classified as having either low coherence or high coherence. The primary contribution of this work is its proposed feature set, which consists of features developed for other tasks and settings. More specifically, the authors train a C5.0 decision tree classifier using features from e-rater, grammar, usage, and mechanics error features, token type ratio features, and entity transition features described in Barzilay and Lapata (2008). The authors evaluate their system's performance using n-fold cross-validation on 800 essays submitted for various tests (e.g., TOEFL, GRE). The essays are written by native and non-native speakers of English ranging in age from middle school to adult. By comparing performances of systems trained using different subsets of the aforementioned features, the authors find that different subsets of the features are most useful for different types of essays (e.g., the feature subset most useful for predicting coherence on the GRE essays is different from the subset most useful for predicting coherence on TOEFL essays). But regardless of which essay subset is used for evaluation, the best system always involves some version of entity transition ratios.

In order to measure the degree to which topic shifts impact coherence, Miltsakaki and Kukich (2004) use centering theory to generate features on which a coherence prediction system could be based. Broadly, the idea behind this is that each of an essay's paragraphs should be topically uniform. If an essay's intra-paragraph topics shift in a particular way, this is a sign that the essay is incoherent.

More specifically, the authors represent each essay as a sequence of paragraphs, each of which in turn is viewed as a sequence of utterances, where utterances are defined as a main clause paired with all its associated dependent clauses. Each utterance in turn mentions a number of entities, each of which is assigned a salience rank according to what role it plays in the utterance. For example, the subject of a main clause is ranked higher than the main clause's object, which in turn is ranked higher than any of the subjects in the utterance's dependent clauses. If we define Cb(i) as the highest-ranked entity from utterance i-1 that also appears in utterance i, then the authors define a rough-shift as follows: if Cb(i) ≠ Cb(i-1), and the highest-ranked entity in utterance i is not Cb(i), this constitutes a rough-shift.
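A simplified reading of this definition can be turned into a rough-shift counter as below. Here each utterance is assumed to be given as a list of entity names already ordered by salience, and edge cases (e.g., a first utterance with no predecessor, or utterances that share no entities) are handled in a way we have chosen for illustration, not necessarily as Miltsakaki and Kukich define them.

    def backward_center(prev_utterance, curr_utterance):
        """Cb(i): the highest-ranked entity of utterance i-1 that also appears in utterance i."""
        for entity in prev_utterance:  # prev_utterance is ordered by salience rank
            if entity in curr_utterance:
                return entity
        return None

    def count_rough_shifts(utterances):
        """utterances: list of lists of entities, each ordered from most to least salient."""
        rough_shifts = 0
        prev_cb = None  # Cb of the previous utterance
        for i in range(1, len(utterances)):
            cb = backward_center(utterances[i - 1], utterances[i])
            preferred_center = utterances[i][0] if utterances[i] else None
            if cb != prev_cb and preferred_center != cb:
                rough_shifts += 1
            prev_cb = cb
        return rough_shifts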

To construct an automatic essay scorer for the coherence dimension, the authors argue that it would be useful to incorporate a count of rough-shifts that occur within each essay as a feature. While they do not build such an essay scorer, they infer the rough-shift count's usefulness by performing some statistical analyses on a set of 100 essays written for the Graduate Management Admission Test. For each essay, they know a human-assigned annotation of the essay's holistic score as well as a prediction assigned by the e-rater essay scoring system.

They find there is a correlation between the count of rough-shifts and an essay's human-annotated holistic score, and furthermore, they use the technique of jackknifing to determine that e-rater would make better holistic score predictions if it incorporated rough-shift counts.

The authors rely on holistic essay scores for their analysis rather than coherence scores only because coherence scores were not available to them for this corpus.

A lexical chain in a text is a sequence of (usually non-contiguous) related words and phrases. Given an essay, Somasundaran et al. (2014) constructed lexical chains consisting of noun, noun-noun, and adjective-noun phrases in the following way. If two words/phrases anywhere in the essay are exactly the same (exact string match), they are considered to have an extra-strong relationship, and must belong to the same chain. If two words/phrases are synonyms for each other, they are considered to have a strong relationship, and must belong to the same chain only if they occur within six sentences of each other. Finally, if two words/phrases are fairly semantically similar but are not synonyms, they are considered to have a medium-strong relationship, and must only belong to the same chain if they occur within three sentences of each other. The result of this process is a clustering of the words and phrases in an essay, where at one extreme, some clusters may contain only a single word or phrase which was not related strongly enough to any nearby words or phrases to be added to a chain. At the other extreme, some clusters may contain many words and phrases and run the entire length of the essay.

The authors believe that these chains may be useful in predicting several aspects of an essay's coherence. Particularly, coherent essays usually focus on one main theme, so a coherent essay is likely to have a chain extending over the essay's entire length whose words and phrases are related to the essay's topic. Coherent essays are also likely to elaborate on some points to aid the reader's understanding of them, resulting in chains containing many words and phrases used to explain the author's idea in more depth or in different ways. Frequent repetition of the same word can harm the discourse quality, so a coherent essay should generate lexical chains containing a variety of similar words that do not have an extra-strong relationship. Finally, a coherent essay must be well-organized, with discourse connectives that describe things such as when the expression of one idea ends and another begins. So the authors believe, for example, that new discourse chains should begin immediately after discourse connectives like "secondly" or "finally" which indicate that a new idea is about to be expressed.

For these reasons, the authors extract a set of 176 features from an essay's lexical chains to exploit the observations above. They then train a gradient boosting regressor to predict essay coherence on a four-point scale (with 1 indicating poor coherence and 4 indicating excellent coherence). On a 10-fold cross-validation evaluation of 876 expository, persuasive, descriptive, narrative, contrastive summary, and subject matter essays written by students ranging from elementary through graduate school-aged, the authors find that the addition of their lexical chain features to a state-of-the-art coherence prediction system significantly improves its performance.
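A rough sketch of the chaining procedure is shown below. It assumes each candidate phrase comes with its sentence index, that the synonymy and similarity tests are supplied by the caller (e.g., via WordNet and a semantic similarity model), and that the 0.8 similarity threshold and the union-find bookkeeping are our own choices for illustration rather than details of the published system.

    def build_lexical_chains(phrases, is_synonym, similarity):
        """phrases: list of (text, sentence_index). Returns a list of chains (lists of phrases).
        is_synonym(a, b) and similarity(a, b) are assumed to be supplied by the caller."""
        parent = list(range(len(phrases)))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        def union(i, j):
            parent[find(i)] = find(j)

        for i, (text_i, sent_i) in enumerate(phrases):
            for j, (text_j, sent_j) in enumerate(phrases[:i]):
                distance = abs(sent_i - sent_j)
                if text_i.lower() == text_j.lower():                      # extra-strong: always chain
                    union(i, j)
                elif is_synonym(text_i, text_j) and distance <= 6:        # strong: within 6 sentences
                    union(i, j)
                elif similarity(text_i, text_j) > 0.8 and distance <= 3:  # medium-strong: within 3
                    union(i, j)

        chains = {}
        for i, phrase in enumerate(phrases):
            chains.setdefault(find(i), []).append(phrase)
        return list(chains.values())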

Rather than concerning themselves with essay coherence as a whole, Higgins et al. (2004) decompose essay coherence into four separate dimensions, each of which is annotated at the sentence level. They are 1) how related sentences are to the essay's prompt (DIMP), 2) how related sentences are to the thesis (DIMT), 3) how related sentences are to other nearby sentences (DIMS), and 4) how flawed sentences are with respect to some technical errors (DIME). The authors manually partition essays into discourse segments, each of which is assigned a discourse label such as "background" or "thesis" or "support" and consists of one or more sentences. They then assign coherence scores for each of the four dimensions at the sentence level as follows. Sentences that appear in certain discourse element types (e.g., background, main idea) and which are related to the prompt receive high DIMP scores. Sentences that appear in certain other discourse element types such as background material and conclusion are assigned a high DIMT score if they are related to the thesis. Sentences are assigned a high DIMS score if they are related to at least one other sentence appearing in the same discourse segment. Finally, sentences receive a high DIME score if they do not contain too many grammatical, word usage, or mechanics errors. These are (nearly) binary classification tasks, so any sentence not receiving a high score for one of the dimensions receives a low score for the dimension. Exceptions occur only when a sentence does not appear in a discourse element of the appropriate type in order for the dimension to be relevant (e.g., sentences in thesis segments do not get assigned DIMT scores).

DIMP and DIMT are predicted using support vector machines. The feature representation of each sentence varies depending on which dimension is being learned, though most features involve the use of random indexing (RI) (Sahlgren, 2005), which is technique for representing words in a low dimensional space such that semantically similar words appear close together in the space. For example, one of these features measures the cosine similarity between

21 a sentence’s RI vector and the essay’s prompt’s RI vector, where a sentence’s vector is calculated by simply summing the vectors of its component words. The approach the authors use for predicting DIME is a bit more heuristic in style, as they simply feed the essay into the Criterion online writing service (Burstein et al., 2004) and assign sentences a high score if

Criterion reported few of the technical errors mentioned above. No system was constructed for predicting DIMS. They evaluated the performances of these systems using 10-fold cross- validation on 989 annotated persuasive or expository essays written by 12th graders and college freshmen.

The most important aspect of this work, perhaps, is that it clearly illustrates that many of the dimensions of essay quality we deal with in automated essay scoring are not, in fact, orthogonal. For example, DIMP shows that the authors think of a sentence's relatedness to its essay's prompt as an aspect of coherence. However, the description of this dimension sounds very much like prompt adherence, which we described in Section 2.1.2.

Higgins and Burstein (2007) follow up on this work as it applies to DIMP. More specifically, recall that Higgins et al. (2004) predicted DIMP on a sentence by extracting several features from it based on RI, then feeding those features into a naive Bayes classifier. Higgins and Burstein (2007) use only the sentence's RI cosine similarity to the essay's prompt in order to predict whether the sentence should get a high or low DIMP score. If the similarity is above a given threshold, they classify the sentence as having a high DIMP score, and classify it as having a low DIMP score otherwise.

The contribution of this work, however, is not that it constructs the best DIMP predictor. Instead, its main purpose is to explain how to overcome a theoretical difficulty that arises from the straightforward application of RI similarity described earlier.

Particularly, the authors explain that it is inappropriate to obtain a vector representation for a sentence by simply summing the RI vectors of its words because this tends to cause longer sentences to have higher cosine similarities. They instead calculate the mean vector of all words (with each word’s contribution weighted by its occurrence frequency), then subtract this mean vector from the vector for each word. A sentence’s vector, then, is the sum of the mean-subtracted vectors of its words. This technique, they note, has the added benefit of making stopword removal unnecessary since stop words, being frequent, tend to lie near the mean vector. In an analysis of 2330 sentences from 292 essays from the previously mentioned essay corpus, the authors found that representing sentences in this way yielded better accuracy (70.1%) for DIMP scoring than representing sentences as merely the sum of their words’ RI vectors (67.1%). Additionally, they found that combining this similarity measure between a sentence and its essay’s prompt with their LSA similarity yielded further improvement, bringing prediction accuracy up to 72.1%.
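To make the mean-subtraction trick concrete, the sketch below builds sentence vectors from pre-computed word vectors and thresholds the cosine similarity to the prompt. It is only a minimal illustration of the idea, not Higgins and Burstein's implementation: the toy vocabulary, the random stand-in vectors, and the 0.5 decision threshold are all assumptions.

```python
import numpy as np

def sentence_vector(words, word_vecs, mean_vec):
    """Sum the mean-subtracted vectors of a sentence's words."""
    vecs = [word_vecs[w] - mean_vec for w in words if w in word_vecs]
    return np.sum(vecs, axis=0) if vecs else np.zeros_like(mean_vec)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

# Toy word vectors standing in for random indexing (RI) vectors.
word_vecs = {w: np.random.randn(50) for w in
             ["dreaming", "imagination", "science", "the", "a", "of"]}
word_freqs = {"dreaming": 3, "imagination": 2, "science": 4,
              "the": 40, "a": 25, "of": 30}

# Frequency-weighted mean vector; frequent stop words end up near this mean,
# so subtracting it makes explicit stopword removal unnecessary.
total = sum(word_freqs.values())
mean_vec = sum(word_freqs[w] * word_vecs[w] for w in word_vecs) / total

prompt = ["science", "imagination"]
sentence = ["the", "dreaming", "of", "a", "imagination"]
sim = cosine(sentence_vector(sentence, word_vecs, mean_vec),
             sentence_vector(prompt, word_vecs, mean_vec))
print("high DIMP" if sim > 0.5 else "low DIMP")  # hypothetical threshold
```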

2.2 Argumentation Mining

Argumentation mining in student essays deals with the task of automatically discovering the argumentative structure of persuasive or argumentative essays. For example, some works such as Faulkner (2014) predict which side of an issue an argumentative essay is arguing for, while other works like Falakmasir et al. (2014) and Burstein and Marcu (2003b) attempt to identify important components that make up an argument, such as an essay’s thesis or its conclusion. Still others like Stab and Gurevych (2014b) additionally identify the claims and premises within an essay that make up its argument and predict how these components are related to each other (e.g., which premise supports which claim).

Burstein and Marcu (2003b) describe two methods for identifying which sentences in an essay best capture its thesis and conclusion. Though the authors do not describe what they do as argumentation mining, these are important concepts in argumentation mining because an essay’s thesis sentence explains its stance with respect to the essay prompt (the position it is arguing for), and the conclusion summarizes the argument the writer has made.

In order to perform this task, the authors first annotate a collection of student essays to indicate which sentences describe the essays’ theses, and which sentences describe the essays’ conclusions. They treat this as a classification task, where each sentence is either a thesis, a conclusion, or neither, and they allow multiple sentences in a single essay to take any one of these labels (since, for example, a thesis can be expressed in multiple sentences).

The authors train a C5.0 classifier for identifying thesis and conclusion sentences using four types of features. The first feature type indicates a sentence’s position within its paragraph and the essay in several ways. For example, one of these features indicates whether the sentence occurs in the last paragraph. The second feature type is based on rhetorical structure theory. The authors automatically parse the essays with a discourse parser (Marcu, 2000) in order to extract discourse relationships from which they construct these rhetorical structure theory features. For example, one feature indicates whether the sentence forms the nucleus of a relationship (i.e., whether it is the most essential of the sentences it is related to for understanding the writer’s intention). The third feature type indicates the presence of expressions that link discourse spans and signal semantic relationships. These cue phrase features include expressions like “second”, which indicates that the sentence describes the second point in a list. Finally, the authors construct features that capture language commonly used in essays as observed in Burstein et al. (2001) and Burstein and Marcu (2003a).

Some examples of this language are words like “should” and “might”. Some of these essay words like “opinion” and “feel” are likely to appear in thesis sentences, while other essay words like “in conclusion” are associated with conclusions.
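Most of these features are cheap to compute from a sentence and its position; the sketch below illustrates hedged versions of the positional, cue-phrase, and essay-language features. The word lists and feature names are illustrative stand-ins rather than the authors' actual resources, and the rhetorical structure theory features are omitted because they require a discourse parser.

```python
THESIS_WORDS = {"opinion", "feel", "agree", "believe"}         # illustrative
CONCLUSION_CUES = {"in conclusion", "to conclude", "finally"}  # illustrative
LIST_CUES = {"first", "second", "third"}                       # illustrative

def sentence_features(sentence, sent_idx, para_idx, n_paras):
    """Build a flat feature dict for one sentence (no RST features here)."""
    text = sentence.lower()
    tokens = text.split()
    return {
        "in_first_paragraph": para_idx == 0,
        "in_last_paragraph": para_idx == n_paras - 1,
        "sentence_index_in_paragraph": sent_idx,
        "has_thesis_word": any(w in tokens for w in THESIS_WORDS),
        "has_conclusion_cue": any(c in text for c in CONCLUSION_CUES),
        "has_list_cue": any(c in tokens for c in LIST_CUES),
    }

print(sentence_features("In conclusion, I feel this is wrong.", 0, 4, 5))
```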

In an evaluation on 1200 persuasive and informative essays written by 12th graders and college freshmen in response to six different prompts, the authors find that their approach outperforms a heuristic baseline which classifies sentences as theses or conclusions based only on their position in the essay and the length of the essay. This observation holds true regardless of which of two settings is used for evaluation. In the first setting, essays from all prompts appear in both the training and the test set. In the second setting, the essays from one prompt make up the entirety of the test set, and so there are no essays written in response to that prompt in the training set. Further, the authors find that, despite the second setting being the more difficult of the two (because the training set less closely resembles the test set), their system in the second setting still performs competitively compared to its performance in the first setting. This is important because the second setting simulates the real-life situation where a teacher has asked students to respond to some prompt for which there is no thesis or conclusion annotated data available.

Falakmasir et al. (2014) follow up on the previous work, also predicting which sentences in an essay are its theses or conclusions. Like the aforementioned work, the authors ask annotators to label each sentence in an essay as a thesis, a conclusion, or other, allowing multiple sentences in each essay to take any of these labels. Unlike the previous work, however, annotators are also asked to rate each thesis or conclusion sentence on a three-point scale, with a score of 1 indicating a vague or incomplete thesis or conclusion, a score of 2 indicating a simple but acceptable thesis or conclusion, and a score of 3 indicating a sophisticated thesis or conclusion.

The authors use this labeled data to simultaneously perform three prediction tasks. Particularly, they predict whether each sentence is a thesis, whether each sentence is a conclusion, and whether each essay contains at least one thesis or conclusion. The first two tasks are treated as one three-class prediction task, where one of the three labels “thesis”, “conclusion”, or “other” must be applied to each sentence. For the latter prediction task, they simply aggregate the sentence-level thesis and conclusion predictions into an essay-level prediction (i.e., if they predict that one of the sentences in an essay is a thesis, they also predict that the essay contains a thesis or conclusion). For the purpose of training and prediction, only sentences receiving a score of 2 or 3 on the three-point scale are considered true conclusion or thesis statements, with all unannotated sentences and sentences receiving a score of 1 being considered part of the “other” class.

The authors train three types of classifiers (support vector machines, naive bayes, and decision trees) to detect thesis and conclusion statements in student essays using 19 features, with many of the features being inspired by the previously discussed work. Particularly, they use positional features indicating the paragraph number, the sentence number within the paragraph, and whether the paragraph is the first, last, or a body paragraph. They also use features such as the counts of prepositional and gerund phrases, counts of adjectives and adverbs, counts of phrases like “because” and “due to”, the number of words overlapping with the prompt, and a rhetorical structure theory feature indicating the sentence’s importance within the essay as described in Marcu (1999).

The authors evaluate their systems on a corpus consisting of 432 essays written by high school students on cultural literacy and world literature topics. While the performance of each of the learning algorithms varies depending on which task is evaluated on and how the data is split up for testing (i.e., does the test set contain essays written in response to prompts that also appear in the training set or not?), the authors find that each of the learners always outperforms a purely positional heuristic baseline that predicts all sentences in the first paragraph are theses and all sentences in the last paragraph are conclusions.

In the discussion of the previous two works, we mentioned that it is questionable whether they should be included in a section on argumentation mining. We argued that their inclusion was appropriate because theses and conclusions are important parts of an argumentative essay. After all, an argumentative essay’s thesis expresses its most important claim, and its conclusion summarizes its argument. However, these two labels do not make for a very rich argumentation ontology, as there are many other important components of an argument that are not captured by them.

Addressing this problem, Ong et al. (2014) describe a new ontology for labeling the different components of an argumentative essay. Particularly, they think of an argumentative essay as being structured like a tree, with some sentences representing nodes in the tree with labels like current study, hypothesis, claim, and citation, and other sentences representing edges in the tree, indicating which nodes in the tree support or oppose each other. So for example, the claim sentence “low traffic conditions increases traffic violations and risk while driving” is supported by the citation sentence “Otmanl et al. (2005) found that young drivers faced a significant decrease in alertness while in low traffic conditions”, and the sentence that indicates the support relation is “when one is less alert then he/she willl be likely commit more traffic violations.”

In addition to constructing an ontology for argumentative essays, the authors constructed heuristic rules for automatically applying these sentence labels to an essay’s sentences. So for example, a sentence beginning with a comparison discourse connective like “however” or “but” gets tagged as opposes, while a sentence containing substrings like “suggest”, “evidence”, or “shows” gets tagged as a claim. The authors use eight rules like these to tag all sentences in an essay with an appropriate label from their argumentation ontology. After applying these labels, the authors then heuristically score the essay based on the sentence labels using six rules. Some example rules are that an essay having at least one current study sentence gets one point added to its score, as does an essay containing at least one hypothesis. An essay’s final predicted score is the sum of all points it is awarded through these rules.
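A labeler and scorer of this kind is straightforward to sketch. The rules below paraphrase only the examples given above; the remaining rules and the cue lists are placeholders, not Ong et al.'s full rule set.

```python
OPPOSE_CONNECTIVES = ("however", "but")           # from the example rules
CLAIM_CUES = ("suggest", "evidence", "shows")     # from the example rules

def label_sentence(sentence):
    """Apply a few of the heuristic labeling rules (others omitted)."""
    text = sentence.lower().strip()
    if text.startswith(OPPOSE_CONNECTIVES):
        return "opposes"
    if any(cue in text for cue in CLAIM_CUES):
        return "claim"
    return "other"   # placeholder for the remaining rules

def score_essay(labels):
    """Point-based scoring: one point per satisfied rule (two shown)."""
    score = 0
    if "current_study" in labels:
        score += 1
    if "hypothesis" in labels:
        score += 1
    # ... four more rules in the original system
    return score

essay = ["However, low traffic is not always safer.",
         "The evidence shows that alertness drops in low traffic."]
labels = [label_sentence(s) for s in essay]
print(labels, score_essay(labels))
```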

The authors evaluate their system’s performance on a corpus of 52 argumentative essays written by university undergraduate students. Despite this work’s concern with its argumentative essay ontology, these essays are not annotated with the sentence labels from the ontology. Rather, each essay is given a holistic score on a scale of 1 to 5, with 1 being lowest and 5 being highest. When annotating essays with this score, annotators were asked to take into account grammar usage, flow, organization, and the logic behind the essay’s argument.

The authors find that the essay scores predicted by their heuristic method are highly rank-correlated with the annotated essay scores, suggesting that their ontology provides useful insights into what makes an essay good, though the small dataset size makes it difficult to draw any firm conclusions.

Like Ong et al. (2014), Stab and Gurevych (2014b) also understand arguments as being structured like a tree. However, the ontology they develop in Stab and Gurevych (2014a) is somewhat different. In particular, they describe an argument as a set of related argument components. Each argument component in an essay is associated with a clause-level segment of text and an argument component type. The three argument component types are major claims, which express the author’s stance with respect to the topic (a close analogue to the theses in previously discussed works), claims, which are controversial statements whose truth should not be accepted without additional support, and premises, which underpin the validity of claims.

Unlike in Ong et al. (2014), the relationships between these argument components are not explicitly represented by segments of text, but they are still permitted to have support or attack relations. Aside from the absence of explicit text representing them, these support and attack relations are direct analogues to the support and oppose relationships in that work.

To give some idea of the argumentative structure of a typical persuasive essay under this ontology, an essay is required to have exactly one major claim (thesis). Each of the essay’s paragraphs tends to focus on one claim, which usually supports or attacks the major claim. And finally, each paragraph tends to contain several premises which support or attack the paragraph’s claim. With the exception of the one-major-claim requirement, these are just tendencies. For example, an essay may contain two claims, or it may contain no claims, and premises may also support or attack major claims or other premises.

The authors use the 90 persuasive essays collected from the Essay Forum website (http://www.essayforum.com) and annotated in Stab and Gurevych (2014a) to perform two tasks, argument component identification and relation identification.

In the argument component identification task, the authors are given an essay along with a set consisting of all the essay’s argument component boundaries and sentences that contain no argument components. The goal in this task is to label each of these text segments with one of four labels, major claim, claim, premise, or none (which is the appropriate label for the sentences containing no argument components).

The authors train four types of learners, including support vector machines, naive bayes, C4.5 decision trees, and random forests, to perform this four-class classification task using five categories of features. Structural features include information such as the lengths of the argument component and its containing sentence, paragraph position, and counts of punctuation marks in the argument component. Lexical features include information about the n-grams in the argument component and the presence of modal verbs (e.g., “should” or “could”). Syntactic features include information about the tense of the argument component’s main verb and production rules extracted from automatically generated parse trees. Indicator features include discourse markers that appear at the beginning of some argument components (e.g., “therefore”, “thus”, or “consequently”) and words referring to the author such as “I” or “myself”. Finally, contextual features contain information about the sentences surrounding the argument component like their lengths and their counts of clauses.

In a 10-fold cross-validation evaluation on the 90 annotated essays, the authors find that all of the classifiers outperform a heuristic baseline which just predicts the most frequent class for all argument components. Furthermore, they find that the support vector machine learner performs best out of all their learning systems. This finding is somewhat weak since the setting is unrealistic (the given argument component boundaries are only available on annotated essays) and the baseline is simple, though the ontology is intuitively appealing as a description of argument structure, and there is always the possibility that later works will address these problems.

The authors treat the relation identification task as a binary classification task between pairs of argument components. Particularly, they train the same four types of learning algorithms to predict whether a pair of argument components contains a support relation or not (attack relations, being much less frequent in the corpus, are ignored). The features they use are mostly the same as the features they used for the argument component identification task, though modified to take into account the fact that pairs of argument components are being classified rather than single argument components. Further, they add one new type of feature, which includes information about the type of the argument components as predicted by an argument component identification system.
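A sketch of how such pairwise instances might be assembled appears below; the handful of features shown are a hypothetical subset of the structural, indicator, and predicted-type categories described above, not the authors' actual feature set.

```python
from itertools import permutations

def pair_features(src, tgt, predicted_types):
    """Features for an ordered (source, target) argument component pair."""
    return {
        "src_len": len(src.split()),
        "tgt_len": len(tgt.split()),
        "same_paragraph": True,  # placeholder; would come from segment offsets
        "src_has_indicator": src.lower().startswith(("therefore", "thus")),
        "src_type": predicted_types[src],  # from a component-type classifier
        "tgt_type": predicted_types[tgt],
    }

components = {"Cars should be banned downtown": "claim",
              "Therefore pollution levels would fall": "premise"}
instances = [pair_features(a, b, components)
             for a, b in permutations(components, 2)]
print(len(instances), "candidate pairs")
```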

In another 10-fold cross-validation evaluation on the same 90 essays, the authors again find that their learning-based systems all outperform a heuristic relationship identification system. In this case, the heuristic baseline classifies all argument component pairs as non-support. They also once again find that the support vector machine classifier performs best out of all the learners, though the same criticisms apply to these results as applied to the argument component identification results.

To end on a more general note, though 90 essays seems like a small corpus compared to the sizes of most of the essay corpora we have discussed thus far, it is important to keep in mind that the size of the training corpus needed is heavily dependent on the task to be performed. For example, holistic essay scoring tasks can be expected to require a large number of score-annotated essays since each essay can be used to produce only one training instance. For more fine-grained tasks like Ong et al. (2014)’s sentence labeling task, each annotated essay yields many training examples since an annotated essay contains many labeled sentences.

Song et al. (2014) understand arguments as conforming to different, pre-defined argumentation schemes (i.e., reasoning patterns) (Walton, 1996), which can be useful for predicting essay quality. As an example of one of these argumentation schemes, consider arguments about policy proposals. An argument that adheres to the policy argumentation scheme will likely address several pre-defined categories of critical questions about the policy. For example, in the problem category, the essay may discuss whether the problem the policy addresses is really a problem or whether the problem is well defined. In the goal category, it may discuss whether the goal the policy aims to achieve is desirable. Or in the plan implementation category, it may discuss whether it is practically possible to carry out the proposed policy.

To address the question of whether information about an essay’s argumentation scheme can be useful for understanding essay quality, the authors first select 600 holistically scored (on a scale of 0 to 6) essays written in response to two prompts, and determine which of the pre-defined argumentation schemes the essays responding to each prompt should fit into (e.g., all of the essays responding to a prompt about a policy question should fit the policy argumentation scheme). They then annotate each sentence of the essays according to which pre-defined category of critical question it addresses.

They then use multiple regression analysis to predict the holistic essay scores under three settings. In the first (baseline) setting, they use only features from a state-of-the-art holistic essay scoring system (Burstein et al., 2013). In the second setting, they combine the baseline features with features extracted directly from the argumentation scheme annotations. Particularly, they construct a feature for each category of critical question encoding the frequency of sentences belonging to that category in the essay. In the third setting, they combine the baseline features with one additional feature that encodes the frequency of sentences that respond to any critical question. The rationale for this is that essays containing a lot of sentences that do not respond to any of the critical questions are probably dwelling excessively on irrelevant matters, and should therefore get lower scores.

In an evaluation on 600 of their essays, using 520 essays for training and 80 for testing, they find that the third system performs best. However, they cannot draw very strong conclusions from this result because this additional feature cannot be automatically generated, as it requires that all training and test essays include sentence-level argumentation scheme annotations. The conclusion they do draw from the result, however, is that information about an essay’s argumentation scheme can potentially be useful for essay scoring.

To address this weakness, the authors also train L2-penalized logistic regression models to determine whether or not individual sentences address any critical questions. They train these models using standard NLP features such as word n-grams from each sentence and its surroundings, sentence length, sentence position, presence or absence of POS tags, and some similarity measures between the prompt and the sentence. The resulting classifier yields predictions that are fairly well correlated with the annotations (0.217 ≤ Cohen’s κ ≤ 0.498, depending on which prompt(s) were trained and tested on), though they leave much room for improvement. Ideally, if the quality of this classifier can be improved enough, it can be used to automatically label sentences with enough confidence that the predictions can be used to automatically generate the holistic essay scoring feature from the third setting.
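For illustration, a sentence classifier of this general kind can be sketched with scikit-learn (an assumed toolkit; the authors do not specify theirs), using word n-grams in place of their richer feature set and a tiny invented training sample:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data: 1 = sentence addresses some critical question, 0 = it does not.
sentences = ["The plan cannot be carried out without new funding.",
             "I went to the beach last summer.",
             "Is reducing traffic really a problem worth solving?",
             "My cousin owns a red car."]
labels = [1, 0, 1, 0]

# Word unigrams/bigrams stand in for the full feature set described above.
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(penalty="l2", C=1.0, max_iter=1000))
model.fit(sentences, labels)
print(model.predict(["Implementing this policy is not practically possible."]))
```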

Faulkner (2014) deals with the problem of predicting an argumentative essay’s stance. That is, given an essay and its prompt (which in this case is an expression of support for some proposition), the goal is to determine whether the essay’s writer agrees or disagrees with the prompt.

While this task itself is unlike the previous tasks discussed in this subsection, the author also breaks with previously discussed works on argumentative essays in his understanding of how an argument is expressed. Previous works have focused on identifying particular pieces of an argument like introductions or conclusions or premises, or on building tree structures to represent the argument. Faulkner, however, mostly approaches the task by constructing features out of many segments of text within the essay that appear to express a stance, thus setting aside the idea that perhaps only one or two sentences express the essay’s stance (the major claim or thesis), while the remaining sentences build an argument around these sentence(s).

Faulkner begins by collecting 1319 argumentative essays written by college-aged non-native English speakers. He annotates each of these essays with one of three stance labels: for, which indicates that the writer agrees with the prompt, against, which indicates that the writer disagrees with the prompt, or neither, which indicates that the writer neither agrees nor disagrees with the prompt.

He then attempts to identify stancetaking text within the essay in the following way. First he collects a lexicon of words that indicate a writer is taking a stance on something from Hyland (2005) and the Multi-Perspective Question Answering (MPQA) corpus (Wiebe et al., 2005). For each stancetaking word in the essay, he identifies the nearest opinion-bearing word from the MPQA subjectivity lexicon. The idea here is that writers express their stances by taking a stance on a proposition, where an opinion-bearing word can serve as a proxy for the proposition. As an example, “I can only say that this statement is completely true” expresses the writer’s agreement with the prompt, and it contains a stancetaking word “can” and an opinion-bearing word “true”.

Because there are many ways to express this sentiment, a feature that encodes the presence of this exact string in an essay would not be useful, since the string is only ever likely to occur once. For this reason, Faulkner generalizes the expression in two ways. First, he dependency parses the sentence, keeping only the dependency tree nodes on the path between the stancetaking word and the opinion-bearing word. Then he replaces the nodes between the two words with their part-of-speech tags and prepends the word “not” in case of negation. So from the example above, he would extract the feature “can-V-true” (“say” is the only word appearing between “can” and “true” in the dependency path, and it is a verb).
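The sketch below shows one way to derive such a feature from a dependency parse. The parse for the example sentence is hard-coded (head indices and part-of-speech tags written by hand in place of an automatic parser), the negation handling is omitted, and the path construction is a guess at the spirit of the method rather than a reimplementation of Faulkner's system.

```python
from collections import deque

# (token, POS, head index) for "I can only say that this statement is
# completely true"; a hand-written parse standing in for an automatic parser.
PARSE = [("I", "PRON", 3), ("can", "AUX", 3), ("only", "ADV", 3),
         ("say", "V", -1), ("that", "SCONJ", 9), ("this", "DET", 6),
         ("statement", "NOUN", 9), ("is", "AUX", 9),
         ("completely", "ADV", 9), ("true", "ADJ", 3)]

def dep_path(parse, i, j):
    """Indices on the shortest path between tokens i and j in the tree."""
    adj = {k: set() for k in range(len(parse))}
    for k, (_, _, head) in enumerate(parse):
        if head >= 0:
            adj[k].add(head)
            adj[head].add(k)
    prev, queue = {i: None}, deque([i])
    while queue:
        node = queue.popleft()
        if node == j:
            break
        for nxt in adj[node]:
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    path, node = [], j
    while node is not None:
        path.append(node)
        node = prev[node]
    return list(reversed(path))

def stance_feature(parse, stance_idx, opinion_idx):
    path = dep_path(parse, stance_idx, opinion_idx)
    middle = [parse[k][1] for k in path[1:-1]]   # POS tags of interior nodes
    return "-".join([parse[stance_idx][0]] + middle + [parse[opinion_idx][0]])

print(stance_feature(PARSE, 1, 9))  # -> "can-V-true"
```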

Faulkner also extracts as features any content words appearing in close proximity to stance words if they are sufficiently similar (above some threshold) to one of the content words from the prompt, as measured by the Wikipedia link-based measure (Witten and Milne, 2008). We would expect these features to be effective at predicting stance because the words are closely related to the prompt, and different prompts provoke different amounts of disagreement. For instance, the prompt “feminists have done more harm to the cause of women than good” elicits much more disagreement than normal in the ICLE corpus. So, if essays written in response to this prompt appeared in both the training and test sets, the feature “feminist” would be strongly correlated with the against class. This is unfortunate since intuitively the word “feminist” tells us nothing about the writer’s stance given the prompt.

Faulkner trains a multinomial naive bayes classifier and a support vector machine on these features to predict whether an essay’s stance is for or against the prompt. In an evaluation on the 1235 essays that are both dependency-parsable and annotated with one of these two labels, the author finds that his systems outperform a bag-of-words baseline and a more sophisticated baseline based on the work of Somasundaran and Wiebe (2010) for stance prediction in on-line ideological debates. Furthermore, he finds that the SVM outperforms the naive bayes classifier.

2.3 Other Essay Analysis

There exists research on a number of other essay analysis tasks that do not involve applying scores to entire essays or their components and do not involve argumentation mining. In this section we will present a sample of these tasks.

In Burstein and Wolska (2003), the authors are interested in identifying errors in one aspect of essay style. Particularly, they attempt to identify words whose repetition is distracting enough to interfere with the essay’s overall quality. As the algorithm they present is supervised, the authors first ask two experienced essay graders to identify which words in a set of essays are too repetitive.

The authors then lemmatize the essays and represent each word they contain as a 7-dimensional feature vector. Particularly, the features encode 1) the absolute count of occurrences of the word, 2) the fraction of words in the essay having this lemmatized form, 3) the average frequency of the word per paragraph, 4) the frequency with which the word occurs in the paragraph where it occurs most frequently, 5) the length of the word in characters, 6) whether the word is a pronoun, and 7) the distance in tokens between the word and its previous occurrence. They train a C5.0 learning algorithm to predict word repetitiveness using these features. In a 10-fold cross-validation evaluation on 296 essays written by students ranging in age from 6th grade to college freshmen, the authors find that this feature set yields a learner that can identify excessive repetition with 97% F-score. This system’s prediction F-score is higher than the agreement on this task between the two annotators.
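The seven features are simple corpus statistics; the sketch below computes them for a single lemmatized word, with a simplified pronoun check and a simplified treatment of feature 7 (we take the smallest gap between consecutive occurrences), both of which are our assumptions rather than details from the paper.

```python
PRONOUNS = {"he", "she", "it", "they", "we", "i", "you"}  # simplified check

def repetition_features(paragraphs, word):
    """7-dimensional feature vector for one (lemmatized) word in an essay."""
    tokens = [t for p in paragraphs for t in p]
    count = tokens.count(word)
    per_para = [p.count(word) for p in paragraphs]
    positions = [i for i, t in enumerate(tokens) if t == word]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return [
        count,                               # 1) absolute count
        count / len(tokens),                 # 2) fraction of essay tokens
        count / len(paragraphs),             # 3) average count per paragraph
        max(per_para),                       # 4) count in its densest paragraph
        len(word),                           # 5) word length in characters
        int(word in PRONOUNS),               # 6) is the word a pronoun
        min(gaps) if gaps else len(tokens),  # 7) distance between occurrences
                                             #    (simplified: smallest gap)
    ]

essay = [["it", "be", "good", "it", "be", "bad"], ["it", "happen", "again"]]
print(repetition_features(essay, "it"))
```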

2.4 Automated Essay Analysis Systems

Many of the works discussed thus far describe components of commercially available automated essay scoring systems such as e-rater (Burstein et al., 2013) or the Intelligent Essay Assessor (Landauer et al., 2003). In total, we know of nine such systems, each of which automatically provides a different set of essay annotations such as scores along different dimensions or different types of sentence labels. For an overview of these systems, see Shermis (2014) and Dikli (2006).

CHAPTER 3

ORGANIZATION SCORING1

3.1 Introduction

Automated essay scoring, the task of employing computer technology to evaluate and score written text, is one of the most important educational applications of natural language processing (NLP) (see Shermis and Burstein (2003) and Shermis et al. (2010) for an overview of the state of the art in this task). Recent years have seen a surge of interest in this and other educational applications in the NLP community, as evidenced by the panel discussion on “Emerging Application Areas in Computational Linguistics” at NAACL 2009, as well as increased participation in the series of workshops on “Innovative Use of NLP for Building Educational Applications”. Besides its potential commercial value, automated essay scoring brings about a number of relatively less-studied but arguably rather challenging discourse-level problems that involve the computational modeling of different facets of text structure, such as content, coherence, and organization.

A major weakness of many existing essay scoring engines such as IntelliMetric (Elliot, 2003) and Intelligent Essay Assessor (Landauer et al., 2003) is that they adopt a holistic scoring scheme, which summarizes the quality of an essay with a single score and thus provides very limited feedback to the writer. In particular, it is not clear which dimension of an essay (e.g., coherence, relevance) a score should be attributed to. Recent work addresses this problem by scoring a particular dimension of essay quality such as coherence (Miltsakaki and Kukich, 2004), technical errors, and relevance to prompt (Higgins et al., 2004). Automated systems that provide instructional feedback along multiple dimensions of essay quality such as Criterion (Burstein et al., 2004) have also begun to emerge.

1This chapter was previously published in Persing et al. (2010).

Nevertheless, there is an essay scoring dimension for which few computational models have been developed: organization. Organization refers to the structure of an essay. A high score on organization means that writers introduce a topic, state their position on that topic, support their position, and conclude, often by restating their position (Silva, 1993). A well-organized essay is structured in a way that logically develops an argument. Note that organization is a different facet of text structure than coherence, which is concerned with the transition of ideas at both the global (e.g., paragraph) and local (e.g., sentence) levels. While organization is an important dimension of essay quality, state-of-the-art essay scoring software such as e-rater V.2 (Attali and Burstein, 2006) employs rather simple heuristic-based methods for computing the score of an essay along this particular dimension.

Our goal in this chapter is to develop a computational model for the organization of student essays. While many models of text coherence have been developed in recent years (e.g., Barzilay and Lee (2004), Barzilay and Lapata (2005), Soricut and Marcu (2006), Elsner et al. (2007)), the same is not true for text organization. One reason is the availability of training and test data for coherence modeling. Coherence models are typically evaluated on the sentence ordering task, and hence training and test data can be generated simply by scrambling the order of the sentences in a text. On the other hand, it is not particularly easy to find poorly organized texts for training and evaluating organization models. We believe that student essays are an ideal source of well- and poorly-organized texts. We evaluate our organization model on a data set of 1003 essays annotated with organization scores.

In sum, our contributions in this chapter are two-fold. First, we address a less-studied discourse-level task, predicting the organization score of an essay, by developing a computational model of organization, thus establishing a baseline against which future work on this task can be compared. Second, we annotate a subset of our student essay corpus with organization scores and make this data set publicly available. Since progress in organization modeling is hindered in part by the lack of a publicly annotated corpus, we believe that our data set will be a valuable resource to the NLP community.

3.2 Corpus Information

We use as our corpus the 4.5 million word International Corpus of Learner English (ICLE) (Granger et al., 2009), which consists of more than 6000 essays written by university undergraduates from 16 countries and 16 native-language backgrounds who are learners of English as a Foreign Language. 91% of the ICLE texts are argumentative. The essays we used vary greatly in length, containing an average of 31.1 sentences in 7.5 paragraphs, averaging 4.1 sentences per paragraph. About one quarter of the essays had five or fewer paragraphs, and another quarter contained nine or more paragraphs. Similarly, about one quarter of the essays contained 24 or fewer sentences and the longest quarter contained 36 or more sentences.

We selected a subset consisting of 1003 essays from the ICLE to annotate and use for training and testing of our model of essay organization. While narrative writing asks students to compose descriptive stories, argumentative (also known as persuasive) writing requires students to state their opinion on a topic and to validate that opinion with convincing arguments. For this reason, we selected only argumentative essays rather than narrative pieces, because they contain the discourse structures and kind of organization we are interested in modeling. To ensure representation across native languages of the authors, we selected mostly essays written in response to topics which are well-represented in multiple languages. This avoids many issues that may arise when certain vocabulary is used in response to a particular topic for which essays written by authors from only a few languages are available. Table 3.1 shows some of the twelve topics selected for annotation. Fifteen native languages are represented in the set of essays selected for annotation.

Table 3.1. Some example writing topics.

Topic
Some people say that in our modern world, dominated by science, technology and industrialisation, there is no longer a place for dreaming and imagination. What is your opinion?
Most university degrees are theoretical and do not prepare students for the real world. They are therefore of very little value.
The prison system is outdated. No civilized society should punish its criminals: it should rehabilitate them.
In his novel Animal Farm, George Orwell wrote “All men are equal but some are more equal than others.” How true is this today?
In the words of the old song: “Money is the root of all evil.”
All armies should consist entirely of professional soldiers: there is no value in a system of military service.
Feminists have done more harm to the cause of women than good.
Television is the opium of the masses in modern society. Discuss.
Crime does not pay.

3.3 Corpus Annotation

To develop our essay organization model, human annotators scored 1003 essays using guidelines in an essay annotation rubric. Annotators evaluated the organization of each essay using a numerical score from 1 to 4 at half-point increments. This contrasts with previous work on essay scoring, where the corpus is annotated with a binary decision (i.e., good or bad) for a given scoring dimension (e.g., Higgins et al. (2004)). Hence, our annotation scheme not only provides a finer-grained distinction of organization quality (which can be important in practice), but also makes the prediction task more challenging.

The meaning of each integer score was described and discussed in detail. Table 3.2 shows the description of each score for the organization dimension.

Table 3.2. Description of each organization score.

Score  Description of Essay Organization
4      essay is well structured and is organized in a way that logically develops an argument
3      essay is fairly well structured but could somewhat benefit from reorganization
2      essay is poorly structured and would greatly benefit from reorganization
1      essay is completely unstructured and requires major reorganization

Our annotators were selected from over 30 applicants who were familiarized with the scoring rubric and given sample essays to score. The six who were most consistent with the expected scores were given additional essays to annotate. To ensure consistency in scoring, we randomly selected a large subset of our corpus (846 essays) to be graded by two different annotators. Analysis of these doubly annotated essays reveals that, though annotators only exactly agree on the organization score of an essay 29% of the time, the scores they apply are within 0.5 points of each other for 71% of essays and within 1.0 point for 93% of essays. Additionally, if we treat one annotator’s scores as a gold standard and the other annotator’s scores as predictions, the predicted scores have a mean error of 0.54 and a mean squared error of 0.50. Table 3.3 shows the number of essays that received each of the seven scores for organization.

Table 3.3. Distribution of organization scores.

score    1.0  1.5  2.0  2.5  3.0  3.5  4.0
essays    24   14   35  146  416  289   79
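The agreement statistics reported above are easy to recompute from a table of paired annotator scores; a small sketch with made-up score pairs is given below.

```python
def agreement_stats(pairs):
    """Exact/within-0.5/within-1.0 agreement plus mean error and MSE."""
    diffs = [abs(a - b) for a, b in pairs]
    n = len(pairs)
    return {
        "exact": sum(d == 0 for d in diffs) / n,
        "within_0.5": sum(d <= 0.5 for d in diffs) / n,
        "within_1.0": sum(d <= 1.0 for d in diffs) / n,
        "mean_error": sum(diffs) / n,
        "mse": sum(d * d for d in diffs) / n,
    }

# Hypothetical (annotator 1, annotator 2) organization scores.
pairs = [(3.0, 3.0), (3.5, 3.0), (2.5, 3.5), (4.0, 3.0), (3.0, 2.5)]
print(agreement_stats(pairs))
```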

3.4 Function Labeling

As mentioned before, a high score on organization means that writers introduce a topic, support their position, and conclude. If one or more of these elements are missing or if they appear out of order (e.g., the conclusion appears before the introduction), the resulting essay will typically be considered poorly organized. Hence, knowing the discourse function label of each paragraph in an essay would be helpful for predicting its organization score. Two questions naturally arise. First, how can we obtain the discourse function label of each paragraph? One way is to automatically acquire such labels from a corpus of student essays where each paragraph is annotated with its discourse function label. To our knowledge, however, there is no publicly available corpus that is annotated with such information. As a result, we will resort to labeling a paragraph with its function label heuristically.

Second, which paragraph function labels would be most useful for scoring the organization of an essay? Based on our linguistic intuition, we identify four potentially useful paragraph function labels: Introduction, Body, Rebuttal, and Conclusion. Table 3.4 gives the descriptions of these labels.

Table 3.4. Description of paragraph function labels.

Label  Name          Paragraph Type
I      Introduction  introduces essay topic and states author’s position and main ideas
B      Body          provides reasons, evidence, and examples to support main ideas
C      Conclusion    summarizes and concludes arguments made in body paragraphs
R      Rebuttal      considers counter-arguments to thesis or main ideas

Setting aside for the moment the problem of exactly how to predict an essay’s organization score given its paragraph sequence, the problem of obtaining paragraph labels to use for this task still remains. As mentioned above, we adopt a heuristic approach to paragraph function labeling. The question, then, is: what kind of knowledge sources should our heuristics be based on? We have identified two types of knowledge sources that are potentially useful for paragraph labeling. The first is positional, dealing with where in the essay a paragraph appears. So for example, the first paragraph in an essay is likely to be an Introduction, while the last is likely to be a Conclusion. A paragraph in any other position, on the other hand, is more likely to be a Body or Rebuttal paragraph.

A second potentially useful knowledge source involves the types of sentences appearing in a paragraph. This idea presupposes that, like paragraphs, sentences too can have discourse function labels indicating the logical role they play in an argument. The sentence label schema we propose, which is described in Table 3.5, is based on work in discourse structure by Burstein et al. (2003), but features additional sentence labels.

Table 3.5. Description of sentence function labels.

Label  Name         Sentence Function
P      Prompt       restates the prompt given to the author and contains no new material or opinions
T      Transition   shifts the focus to new topics but contains no meaningful information
H      Thesis       states the author’s position on the topic for which he/she is arguing
M      Main Idea    asserts reasons and foundational arguments that support the thesis
E      Elaboration  further explains reasons and ideas but contains no evidence or examples
S      Support      provides evidence and examples to support the claims made in other statements
C      Conclusion   summarizes and concludes the entire argument or one of the main ideas
R      Rebuttal     considers counter-arguments that contrast with the thesis or main ideas
O      Solution     puts to rest the questions and problems brought up by counter-arguments
U      Suggestion   proposes solutions to the problems brought up by the argument

To illustrate why these sentence function labels may be useful for paragraph labeling, consider a paragraph containing a Thesis sentence. The presence of a Thesis sentence is a strong indicator that the paragraph containing it is either an Introduction or Conclusion. Similarly, a paragraph containing Rebuttal or Solution sentences is more likely to be a Body or Rebuttal paragraph.

Hence, to obtain a paragraph’s function label, we need to first label its sentences. However, we are faced with the same problem: how can we obtain the sentence function labels? One way is to learn them from a corpus where each sentence is manually annotated with its sentence function label, which is the approach adopted by Burstein et al. (2003). However, this annotated corpus is not publicly available. In fact, to our knowledge, there is no publicly available corpus that is annotated with sentence function labels. Consequently, we adopt a heuristic approach to sentence function labeling.

Overall, we created a knowledge-lean set of heuristic rules for labeling paragraphs and sentences. Because many of the paragraph labeling heuristics depend on the availability of sentence labels, we will describe the sentence labeling heuristics first. For each sentence function label x, we identify several features whose presence increases our confidence that a given sentence is an example of x. So for example, the presence of any of the words “agree”, “think”, or “opinion” increases our confidence that the sentence they occur in is a Thesis. If the sentence instead contains words such as “however”, “but”, or “argue”, these increase our confidence that the sentence is a Rebuttal. The features we examine for sentence labeling are not limited to words, however. Each content word the sentence shares with the essay prompt gives us evidence that the sentence is a restatement of the prompt. Having searched a sentence for all these clues, we finally assign the sentence the function label having the most support among the clues found.

The heuristic rules for paragraph labeling are similar in nature, though they depend heavily on the labels of a paragraph’s component sentences. If a paragraph contains Thesis, Prompt, or Background sentences, the paragraph is likely to be an Introduction. However, if a paragraph contains Main Idea, Support, or Conclusion sentences, it is likely to be a Body paragraph. Finally, as mentioned previously, some positional information is used in labeling paragraphs. For example, a paragraph that is the first paragraph in an essay is likely to be an Introduction, but a paragraph that is neither the first nor the last is likely to be either a Rebuttal or Body paragraph. After searching a paragraph for all these features, we gather the pieces of evidence in support of each paragraph label and assign the paragraph the label having the most support.2
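The clue-counting scheme can be sketched as follows; the clue lists shown contain only the few example words mentioned above plus placeholders, not the complete rule set published on our website, and the tie-breaking behavior is an arbitrary choice.

```python
from collections import Counter

SENTENCE_CLUES = {           # a few example clues; the real rule set is larger
    "Thesis":   {"agree", "think", "opinion"},
    "Rebuttal": {"however", "but", "argue"},
}

PARAGRAPH_CLUES = {          # sentence labels that vote for each paragraph label
    "Introduction": {"Thesis", "Prompt"},
    "Body":         {"Main Idea", "Support", "Conclusion"},
}

def label_sentence(sentence, prompt_words):
    votes = Counter()
    words = set(sentence.lower().split())
    for label, clues in SENTENCE_CLUES.items():
        votes[label] += len(words & clues)
    votes["Prompt"] += len(words & prompt_words)   # overlap with the essay prompt
    return votes.most_common(1)[0][0] if sum(votes.values()) else "Support"

def label_paragraph(sentence_labels, is_first, is_last):
    votes = Counter()
    for label, clues in PARAGRAPH_CLUES.items():
        votes[label] += sum(s in clues for s in sentence_labels)
    if is_first:
        votes["Introduction"] += 1                 # positional evidence
    if is_last:
        votes["Conclusion"] += 1
    return votes.most_common(1)[0][0]

prompt_words = {"television", "opium", "masses"}
sents = ["I think television is harmful.", "However, some argue it educates."]
labels = [label_sentence(s, prompt_words) for s in sents]
print(labels, label_paragraph(labels, is_first=True, is_last=False))
```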

3.5 Heuristic-Based Organization Scoring

Having applied labels to each paragraph in an essay, how can we use these labels to predict the essay’s score? Recall that the importance of each paragraph label stems not from the label itself, but from the sequence of labels it appears in. Motivated by this observation, we exploit a technique that is commonly used in bioinformatics: sequence alignment. While sequence alignment has also been used in text and paraphrase generation (e.g., Barzilay and Lee (2002) and Barzilay and Lee (2003)), it has not been extensively applied to other areas of language processing, including essay scoring. In this section, we will present two heuristic approaches to organization scoring, one based on aligning paragraph sequences and the other on aligning sentence sequences.

2See our website at http://www.hlt.utdallas.edu/~alan/ICLE/ for the complete list of heuristics.

3.5.1 Aligning Paragraph Sequences

As mentioned above, our first approach to heuristic organization scoring involves aligning paragraph sequences. Specifically, this approach operates in two steps. Given an essay e in the test set, we (1) find the k essays in the training set that are most similar to e via paragraph sequence alignment, and then (2) predict the organization score of e by aggregating the scores of its k nearest neighbors obtained in the first step. Below we describe these two steps in detail.

First, to obtain the k nearest neighbors of e, we employ the Needleman-Wunsch alignment algorithm (Needleman and Wunsch, 1970), which computes a similarity score for any pair of essays by finding an optimal alignment between their paragraph sequences. To illustrate why we believe sequence alignment can help us determine which essays are most similar, consider two example essays. One essay, which we will call IBBBC, begins with an Introductory paragraph, follows it with three Body paragraphs, and finally ends with a Concluding paragraph. Another essay CRRRI begins with a paragraph stating its Conclusion, follows it with three Rebuttal paragraphs, and ends with a paragraph Introducing the essay’s topic. We can tell by a casual glance at the sequences that any reasonable similarity function should tell us that they are not very similar. The Needleman-Wunsch alignment algorithm has this effect since the score of the alignment it produces would be hurt by the facts that (1) there is not much overlap in the sets of paragraph labels each contains, and (2) the paragraph labels they do share (I and C) do not occur in the same order. The resulting alignment would therefore contain many mismatches or indels.3

If we now consider a third essay whose paragraph sequence could be represented as IBRBC, a good similarity function should tell us that IBBBC and IBRBC are very similar. The Needleman-Wunsch alignment score between the two paragraph sequences has this property, as the alignment algorithm would discover that the two sequences are identical except for the third paragraph label, which could be mismatched for a small penalty. We would therefore conclude that the IBBBC and IBRBC essays should receive similar organization scores.

To fully specify how to find the k nearest neighbors of an essay, we need to define a similarity function between paragraph labels. In sequence alignment, the similarity function S(i, j) tells us how likely it is that symbol i (in our case, a paragraph label) will be substituted with another symbol j. While we expect that in an alignment between high-scoring essays, an Introduction paragraph is most likely to be aligned with another Introduction paragraph, how much worse should the alignment score be if an Introduction paragraph needs to be mismatched with a Rebuttal paragraph or replaced with an indel? We solve this problem by heuristically defining the similarity function as follows: S(i, j) = 1 when i = j, S(i, j) = −1 when i ≠ j, and also S(i, −) = S(−, i) = −1, where ‘−’ is an indel. In other words, the similarity function encourages the alignment between two identical function labels and discourages the alignment between two different function labels, regardless of the type of function labels.

After obtaining the k nearest neighbors of e, the next step is to predict the organization score of e by aggregating the scores of its k nearest neighbors into one number. (Note that we know the organization score of each nearest neighbor, since they are all taken from the training set.) One natural way to do this would be to take the mean, median, or mode of the scores of its k nearest neighboring essays from the training set. Hence, our first heuristic method Hp for scoring organization has three variants.

3In pairwise sequence alignment, a mismatch occurs when one symbol has to be substituted for another to make two sequences match. An indel indicates that in order to transform one sequence to match another, we must either insert a symbol into one sequence or delete a symbol from the other sequence.
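For concreteness, a minimal Needleman-Wunsch implementation using the similarity function above (match +1, mismatch −1, indel −1) is sketched below; it returns only the optimal alignment score, which is all the nearest-neighbor step needs.

```python
def nw_score(seq1, seq2, match=1, mismatch=-1, indel=-1):
    """Needleman-Wunsch global alignment score between two label sequences."""
    n, m = len(seq1), len(seq2)
    # dp[i][j] = best score aligning seq1[:i] with seq2[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * indel
    for j in range(1, m + 1):
        dp[0][j] = j * indel
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq1[i - 1] == seq2[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,   # match or mismatch
                           dp[i - 1][j] + indel,     # indel in seq2
                           dp[i][j - 1] + indel)     # indel in seq1
    return dp[n][m]

# Paragraph label sequences from the examples above.
print(nw_score(list("IBBBC"), list("IBRBC")))  # similar essays: high score (3)
print(nw_score(list("IBBBC"), list("CRRRI")))  # dissimilar essays: low score
```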

3.5.2 Aligning Sentence Sequences

An essay’s paragraph sequence captures information about its organization at a high level, but ignores much of its lower level structure. Since we have also heuristically labeled sentences, it now makes sense to examine the sequences of sentence function labels within an essay’s paragraphs. The intuition is that at least some portion of an essay’s organization score can be attributed to the organization of the sentence sequences of its component paragraphs. To address this concern, we propose a second heuristic approach to organization scoring. Given a test essay e, we first find for each paragraph in e the k paragraphs in the training set that are most similar to it. Specifically, each paragraph is represented by its sequence of sentence function labels. Given this paragraph representation, we can find the k nearest neighbors of a paragraph by applying the Needleman-Wunsch algorithm described in the previous subsection to align sentence sequences, using the same similarity function we defined above.

Next, we score each paragraph pi by aggregating the scores of its k nearest neighbors obtained in the first step, assuming the score of a nearest neighbor paragraph is the same as the organization score of the training set essay containing it. As before, we can employ the mean, median, or mode to aggregate the scores of the nearest neighbors of pi. Finally, we predict the organization score of e by aggregating the scores of its paragraphs obtained in the second step. Again, we can employ the mean, median, or mode to aggregate the scores. Since we have three ways of aggregating the scores of a paragraph’s nearest neighbors and three ways of aggregating the resulting paragraph scores, this second method Hs for scoring organization has nine variants.
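A sketch of the two-level aggregation behind Hs is shown below, under the assumption that the nearest-neighbor paragraphs (and the organization scores of the essays containing them) have already been retrieved.

```python
from statistics import mean, median, mode

AGGREGATORS = {"mean": mean, "median": median, "mode": mode}

def score_essay_hs(paragraph_neighbor_scores, para_agg="median", essay_agg="median"):
    """paragraph_neighbor_scores: for each paragraph of e, the organization
    scores of the essays containing its k nearest-neighbor paragraphs."""
    para_scores = [AGGREGATORS[para_agg](scores)
                   for scores in paragraph_neighbor_scores]
    return AGGREGATORS[essay_agg](para_scores)

# Hypothetical neighbor scores for a 3-paragraph test essay with k = 3.
neighbors = [[3.0, 3.5, 3.0], [2.5, 3.0, 3.0], [4.0, 3.5, 3.0]]
print(score_essay_hs(neighbors))  # one of the nine Hs variants
```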

3.6 Learning-Based Organization Scoring

In the previous section, we proposed two heuristic approaches to organization scoring, one based on aligning paragraph label sequences and the other based on aligning sentence label sequences. In the process of constructing these two systems, however, we created a lot of information about the essays which might also be useful for organization scoring, but which the heuristic systems are unable to exploit. To remedy the problem, we introduce three learning-based systems which abstract the additional information we produced in three different ways. In each system, we use the SVMlight (Joachims, 1999) implementation of regression support vector machines (SVMs) (Cortes and Vapnik, 1995) to train a regressor because SVMs have been frequently and successfully applied to a variety of NLP problems.

3.6.1 Linear Kernel

Owing to the different ways we presented of combining the scores of an essay’s nearest neighbors, the paragraph label sequence alignment approach has three variants, and its sentence label sequence alignment counterpart has nine. Unfortunately, these heuristic approaches suffer from two major weaknesses. First, it is not intuitively clear which of these 12 ways of predicting an essay’s organization score is better than the others. Second, it is not clear that the k nearest neighbors of an essay will always be similar to it with respect to organization score. While we do expect the alignment scores between good essays with reasonable paragraph sequences to be high, poorly organized essays by their nature have more random paragraph sequences. Hence, we have no intuition about the k nearest neighbors of a poor essay, as it may have as high an alignment score with another poorly organized essay as with a good essay.

Our solution to these problems is to use the organization scores obtained by the 12 heuristic variants as features in a linear kernel SVM learner. We believe that using the estimates given by all 12 variants of the two heuristic approaches rather than only one of them addresses the first weakness mentioned above. The second weakness, on the other hand, is addressed by treating the organization score predictions obtained by the nearest neighbor methods as features for an SVM learner rather than as estimates of an essay’s organization score.

The approach we have just described, however, does not exploit the full power of linear kernel SVMs. One strength of linear kernels is that they make it easy to incorporate a wide variety of different types of features. In an attempt to further enhance the prediction capability of the SVM regressor, we will provide it with not only the 12 features derived from the heuristic-based approaches, but also with two additional types of features.

First, to give our learner more direct access to the information we used to heuristically predict essay scores, we can extract paragraph label subsequences4 from each essay and use them as features. To illustrate the intuition behind these features, consider two paragraph subsequences: Introduction–Body and Rebuttal–Introduction. It is fairly typical to see the first subsequence, I–B, at the beginning of a good essay, so its occurrence should give us a small amount of evidence that the essay it occurs in is well-organized. The presence of the second subsequence, R–I, however, should indicate that its essay’s organization is poor because, in general, a good essay should not give a Rebuttal before an Introduction. Because we can envision subsequences of various lengths being useful, we create a binary presence or absence feature in the linear kernel for each paragraph subsequence of length 1, 2, 3, 4, or 5 appearing in the training set.

Second, we employ sentence label subsequences as features in the linear kernel. Recall that when describing our alignment-based nearest neighbor organization score prediction methods, we noted that an essay’s organization score may be partially attributable to how well the sentences within its paragraphs are organized. For example, if one of an essay’s paragraphs contains the sentence label subsequence Main Idea–Elaboration–Support–Conclusion, this gives us some evidence that the essay is overall well-organized since one of its component paragraphs contains this reasonably-organized subsequence. An essay with a paragraph containing the subsequence Conclusion–Support–Thesis–Rebuttal, however, is likely to be poorly organized because this is a poorly-organized subsequence. Since sentence label subsequences of differing lengths may be useful for score prediction, we create a binary presence or absence feature for each sentence label subsequence of length 1, 2, 3, 4, or 5 in the training set.

While the number of nearest neighbor features is manageable, the presence of a large number of features can sometimes confuse a learner. For that reason, we do feature selection on the two types of subsequence features, selecting only the 100 features of each type that have the highest information gain (see Yang and Pedersen (1997) for details). We call the system resulting from the use of these three types of features Rlnps because it uses Regression with a linear kernel to predict essay scores, and it uses nearest neighbor, paragraph subsequence, and sentence subsequence features.

4Note that a subsequence is not necessarily contiguous.
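Extracting the (not necessarily contiguous) label subsequences of length 1 to 5 is a small combinatorial exercise; the sketch below builds the binary presence/absence features, leaving the information gain selection step as a comment since it simply keeps the top 100 subsequences of each type.

```python
from itertools import combinations

def label_subsequences(labels, max_len=5):
    """All (not necessarily contiguous) subsequences of length 1..max_len."""
    subseqs = set()
    for n in range(1, min(max_len, len(labels)) + 1):
        for idxs in combinations(range(len(labels)), n):
            subseqs.add("-".join(labels[i] for i in idxs))
    return subseqs

def binary_features(labels, vocabulary):
    """Presence/absence vector over the training set's subsequence vocabulary."""
    present = label_subsequences(labels)
    return [int(s in present) for s in vocabulary]

# Paragraph label sequence of a hypothetical essay: I B B R B C
essay = ["I", "B", "B", "R", "B", "C"]
vocab = sorted(label_subsequences(essay))   # in practice: built from the training
print(binary_features(essay, vocab)[:10])   # set, then trimmed to the 100
                                            # subsequences of each type with the
                                            # highest information gain
```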

3.6.2 String Kernel

In a traditional learning setting, the feature set employed by an off-the-shelf learning algorithm typically consists of flat features (i.e., features whose values are discrete- or real-valued, as the ones described in the Linear Kernel subsection). Advanced machine learning algorithms such as SVMs, on the other hand, have enabled the use of structured features (i.e., features whose values are structures such as parse trees and sequences), owing to their ability to employ kernels to efficiently compute the similarity between two potentially complex structures.

Perhaps the most obvious advantage of employing structured features is simplicity. To understand this advantage, consider learning in a traditional setting. Recall that we can only employ flat features in this setting, as we did with the linear kernel. Hence, if we want to use information from a parse tree as features, we will need to design heuristics to extract the desired parse-based features from parse trees. For certain tasks, designing a good set of heuristics can be time-consuming and sometimes difficult. On the other hand, SVMs enable a parse tree to be employed directly as a structured feature, obviating the need to design heuristics to extract information from potentially complex structures. However, structured features have only been applied to a handful of NLP tasks such as semantic role labeling (Moschitti, 2004), syntactic parsing and named entity identification (Collins and Duffy, 2002), relation extraction (Bunescu and Mooney, 2005), and coreference resolution (Versley et al., 2008). Our goal here is to explore this rarely-exploited capability of SVMs for the task of essay scoring.

While the vast majority of previous NLP work on using structured features has involved tree kernels, we employ a kernel that is rarely investigated in NLP: string kernels (Lodhi et al., 2002). Informally, a string kernel aims to efficiently compute the similarity between two strings (or sequences) of symbols based on the similarity of their subsequences. We apply string kernels to essay scoring as follows: we represent each essay using its paragraph function label sequence, and employ a string kernel to compute the similarity between two essays based on this representation. Typically, a string kernel takes as input two parameters: K (which specifies the length of the subsequences in the two strings to compare) and λ (which is a value between 0 and 1 that specifies whether matches between non-contiguous subsequences in the two strings should be considered as important as matches between contiguous subsequences).

In our experiments, we select values for these parameters in a somewhat arbitrary manner. In particular, since λ ranges between 0 and 1, we simply set it to 0.5. For K, since in the flat features we considered all paragraph label subsequences of lengths from 1 to 5, we again take the middle value, setting it to 3. We call the system using this kernel Rs because it uses a Regression SVM with a string kernel to predict essay scores.
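The gap-weighted string subsequence kernel of Lodhi et al. (2002) admits a compact recursive formulation. The sketch below is a direct, memoized transcription of that recursion applied to paragraph label sequences with the parameter values chosen above (K = 3, λ = 0.5). It is an illustrative implementation under our own naming, not the dissertation's code, and it omits the kernel normalization discussed later in Section 3.6.4.

```python
from functools import lru_cache

def string_kernel(s, t, k=3, lam=0.5):
    """Gap-weighted subsequence kernel (Lodhi et al., 2002) between two symbol
    sequences s and t (here, paragraph function-label sequences). It sums, over
    all common subsequences of length k, a weight that decays by lam for every
    position a matching subsequence spans, so gappy matches count for less."""
    s, t = tuple(s), tuple(t)

    @lru_cache(maxsize=None)
    def k_prime(i, m, n):
        # Auxiliary kernel over the prefixes s[:m] and t[:n].
        if i == 0:
            return 1.0
        if min(m, n) < i:
            return 0.0
        x = s[m - 1]
        total = lam * k_prime(i, m - 1, n)
        for j in range(1, n + 1):
            if t[j - 1] == x:
                total += k_prime(i - 1, m - 1, j - 1) * lam ** (n - j + 2)
        return total

    @lru_cache(maxsize=None)
    def k_full(m, n):
        # Kernel value over the prefixes s[:m] and t[:n] for subsequence length k.
        if min(m, n) < k:
            return 0.0
        x = s[m - 1]
        total = k_full(m - 1, n)
        for j in range(1, n + 1):
            if t[j - 1] == x:
                total += k_prime(k - 1, m - 1, j - 1) * lam ** 2
        return total

    return k_full(len(s), len(t))
```

For example, string_kernel(["I", "B", "B", "C"], ["I", "B", "C"]) scores the two essays by their shared, possibly non-contiguous, label triples, with longer gaps discounted by powers of λ.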

3.6.3 Alignment Kernel

In general, the purpose of a kernel function is to measure the similarity between two examples.

The string kernel we described in the previous subsection is just one way of measuring the similarity of two essays given their paragraph sequences. While this may be the most obvious way to use paragraph sequence information from a machine learning perspective, our earlier use of the Needleman-Wunsch algorithm suggests a more direct way of extracting structured information from paragraph sequences.

More specifically, recall that the Needleman-Wunsch algorithm finds an optimal alignment between two paragraph sequences, where an optimal alignment is defined as an alignment having the highest possible alignment score. The optimal alignment score can be viewed as another similarity measure between two essays. As such, with some slight modifications, the alignment score between two paragraph sequences can be used as the kernel value for an

Alignment Kernel.5 We call the system using this kernel Ra because it uses a Regression SVM with an alignment kernel to predict essay scores.
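A minimal version of this construction is sketched below, with placeholder match, mismatch, and gap values (the dissertation defines its own alignment scoring scheme earlier in the chapter); the shift so that the theoretical minimum score maps to zero follows the footnote below. All names in the sketch are ours.

```python
def needleman_wunsch(a, b, match=1.0, mismatch=-1.0, gap=-1.0):
    """Optimal global alignment score between two label sequences a and b,
    computed by the standard dynamic program. The match/mismatch/gap values
    here are placeholders, not the dissertation's actual scoring scheme."""
    m, n = len(a), len(b)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                           dp[i - 1][j] + gap,       # gap inserted into b
                           dp[i][j - 1] + gap)       # gap inserted into a
    return dp[m][n]

def alignment_kernel(a, b, theoretical_min):
    """Kernel value: the optimal alignment score shifted so that its
    theoretical minimum (theoretical_min, an assumed precomputed constant)
    maps to 0, keeping all kernel values non-negative."""
    return needleman_wunsch(a, b) - theoretical_min
```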

3.6.4 Combining Kernels

Recall that the flat features are computed using a linear kernel, while the two types of structured features are computed using string and alignment kernels. If we want our learner to make use of more than one of these types of features, we need to employ a composite kernel to combine them. Specifically, we define and employ the following composite kernel:

$$K_c(F_1, F_2) = \frac{1}{n} \sum_{i=1}^{n} K_i(F_1, F_2),$$

5In particular, we note that for theoretical reasons, a kernel function must always return a non-negative value. The alignment score function does not have this property, so we increase all alignment scores until their theoretical minimum value is 0.

where $F_1$ and $F_2$ are the full sets of features (containing both flat and structured features) that represent the two essays under consideration, $K_i$ is the $i$th kernel we are combining, and $n$ is the number of kernels we are combining. To ensure that each kernel under consideration contributes equally to the composite kernel, each kernel value $K_i(F_1, F_2)$ is normalized so that its value falls between 0 and 1.
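A sketch of the composite kernel is shown below. The dissertation states only that each component kernel value is normalized into [0, 1]; the cosine-style normalization used here is one common way to achieve that and is our assumption, as are all names in the sketch.

```python
import math

def cosine_normalize(kernel_fn, x, y):
    """One common way to scale a kernel into [0, 1]:
    K'(x, y) = K(x, y) / sqrt(K(x, x) * K(y, y)).
    The exact normalization used in the dissertation is not specified here,
    so this particular scheme is an assumption."""
    denom = math.sqrt(kernel_fn(x, x) * kernel_fn(y, y))
    return kernel_fn(x, y) / denom if denom > 0 else 0.0

def composite_kernel(x, y, kernel_fns):
    """Average of the normalized component kernels (e.g., linear, string, and
    alignment), i.e. K_c(F1, F2) = (1/n) * sum_i K_i(F1, F2)."""
    return sum(cosine_normalize(k, x, y) for k in kernel_fns) / len(kernel_fns)
```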

3.7 Evaluation

In this section we will discuss how we evaluate our organization scoring system.

3.7.1 Evaluation Metrics

We designed three evaluation metrics to measure the error of our organization scoring system. The simplest metric, S1, is perhaps the most intuitive. It measures the frequency at which a system predicts the wrong score out of the seven possible scores. Hence, a system that predicts the right score only 25% of the time would receive an S1 score of 0.75. The S2 metric is slightly less intuitive than S1, but no less reasonable. It measures the average distance between the system's score and the actual score. This metric reflects the idea that a system that estimates scores close to the annotator-assigned scores should be preferred over a system whose estimations are further off, even if both systems estimate the correct score at the same frequency. Finally, the S3 evaluation metric measures the average square of the distance between a system's organization score estimations and the annotator-assigned scores. The intuition behind this metric is that not only should we prefer a system whose estimations are close to the annotator scores, but we should also prefer one whose estimations are not too frequently very far away from the annotator scores. These three scores are given by:

$$\frac{1}{N} \sum_{A_i \neq E_i} 1, \qquad \frac{1}{N} \sum_{i=1}^{N} |A_i - E_i|, \qquad \frac{1}{N} \sum_{i=1}^{N} (A_i - E_i)^2,$$

Table 3.6. Organization results.

     System     S1     S2     S3
  1  Avg      .585   .412   .348
  2  Hp       .548   .339   .198
  3  Hs       .575   .397   .329
  4  Rlnps    .520   .331   .186
  5  Rs       .577   .369   .222
  6  Ra       .686   .519   .429
  7  Rlsnps   .534   .332   .187
  8  Rlanps   .541   .332   .178
  9  Rsa      .517   .325   .177
 10  Rlsanps  .517   .323   .175

where $A_i$ and $E_i$ are the annotator assigned and system estimated scores respectively for essay i, and N is the number of essays. Since many of the systems we have described assign test essays real-valued organization scores, to obtain $E_i$ for the S1 metric we round the outputs of each system to the nearest of the seven scores the human annotators were permitted to assign (1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0).
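The three metrics and the rounding rule for S1 can be computed directly, as in the sketch below (our code, with the seven permissible scores hard-wired).

```python
def score_metrics(annotator_scores, system_scores):
    """Compute the S1, S2, and S3 error metrics for organization scoring.
    annotator_scores: gold scores A_i; system_scores: real-valued system
    outputs E_i. For S1 only, system outputs are rounded to the nearest of
    the seven permissible scores (1.0, 1.5, ..., 4.0)."""
    valid = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
    n = len(annotator_scores)
    rounded = [min(valid, key=lambda v: abs(v - e)) for e in system_scores]
    s1 = sum(1 for a, e in zip(annotator_scores, rounded) if a != e) / n
    s2 = sum(abs(a - e) for a, e in zip(annotator_scores, system_scores)) / n
    s3 = sum((a - e) ** 2 for a, e in zip(annotator_scores, system_scores)) / n
    return s1, s2, s3
```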

To test our system, we performed 5-fold cross validation on our 1003 essay set, micro- averaging our results into three scores corresponding to the three scoring metrics described above.

3.7.2 Results and Discussion

The average baseline. As mentioned before, there is no standard baseline for organization modeling against which we can compare our systems. To start with, we employ a simple "average" baseline. Avg computes the average organization score of essays in the training set and assigns this score to each test set essay. Results of this baseline are shown in row 1 of Table 3.6. Though simple, this baseline is by no means easy to beat, since 41% of the essays have a score of 3, and 96% of the essays have a score that is within one point of 3.

Heuristic baselines. Recall that we have 12 versions of the two heuristic approaches to organization prediction. We show the best results achieved by the three versions based on aligning paragraph label sequences in row 2 (Hp) and the best results achieved by the nine versions based on aligning sentence label sequences in row 3 (Hs) of Table 3.6. It is clear from the results that the Hp system yielded the best baseline predictions under all three scoring metrics, performing significantly better than both the Avg and Hs systems (p < 0.01) with respect to the S2 and S3 metrics, but its S1 performance is less significant with respect to Avg (p < 0.1) and is indistinguishable at even the p < 0.1 level from Hs.6 In general, however, it appears to be the case that systems based on aligning paragraph label sequences achieve better results than systems that attempt to align sentence label sequences.

Learning-based approaches. Rows 4–6 of Table 3.6 show the results we obtained using each of the three single-kernel systems. When compared to the best baseline, these results suggest that Hp is a pretty good heuristic approach to organization scoring. In fact, only one of these three learning-based systems (Rlnps) performs better than Hp under the three scoring metrics, and in each case, the difference between the two is not significant even at p < 0.1. This suggests that, even though Rlnps performs slightly better than Hp, the only major benefit we have obtained by using the linear kernel is that it has made it unnecessary for us to choose between the 12 proposed heuristic systems.

Considering that the second best one-kernel system, Rs, does not have access to any of the nearest neighbor features, which have already proven useful, its performance seems reasonably good: it at least outperforms the Avg system. This suggests that, even though Rs does not perform exceptionally, it is extracting some useful information for organization scoring from the heuristically assigned paragraph label sequences. The best one-kernel system, Rlnps, however, is significantly better than Rs with respect to all

6All significance tests are two-tailed paired t-tests.

three scoring metrics, with p < 0.1 for S1 and p < 0.05 for S2 and S3. By contrast, it initially appears that the alignment kernel is not extracting any useful information from these paragraph sequences at all, since its S1, S2, and S3 scores are all much worse than those of all the baseline systems. The second best one-kernel system, Rs, performs significantly better than Ra at p < 0.01 for all three scoring metrics.

Next, we explore the impact of composite kernels, which allow our learners to make use of multiple types of flat and structured features. Specifically, the results shown in rows 7–9 are obtained by combining two kernels at a time. These experiments reveal the surprising result that the two worst performing single-kernel systems, Rs and Ra, when combined into Rsa, yield the best two-kernel system results, which are significant with respect to the best one-kernel system results under S3 at p < 0.1. This result suggests that these two different methods of extracting information from paragraph sequences provide us with different kinds of evidence useful for organization scoring, although neither method by itself was exceptionally useful. Though Rsa does not have any access to nearest neighbor information, it still performs significantly better than Hp at p < 0.05 under S1 and S3. While we have already pointed out that Rsa is the best composite two-kernel system, it is not clear which of Rlsnps and Rlanps is second-best. Neither system consistently performs better than the other under all three scoring metrics, and the differences between them are not significant even at p < 0.1. It is clear only that Rsa is better than both, as its scores are statistically significantly better at p < 0.01 with respect to Rlsnps and Rlanps under at least one of the three scoring metrics in each case.

Finally, in the last row of Table 3.6, we combine all three kernels into one SVM learner. The most important lesson we learn from this experiment is that each of the three kernels provides the learner with a different kind of useful information, so that a composite kernel using all three sources of information performs better than any system using fewer kernels. Although the improvements over the best two-kernel system (Rsa) and one-kernel system (Rlnps) are small, they are still statistically significant at p < 0.1 under one of the scoring metrics, S3. When we compare this combined system to the best baseline (Hp), we find that the improvements derived from the three-kernel system are significant at p < 0.05 and p < 0.01 with respect to S1 and S3 respectively.

Feature analysis. To better understand which of the three flat feature types (nearest neighbors, paragraph label sequences, or sentence label sequences) contributes the most to the linear kernel portion of the systems' performances, we analyze the three feature types on Rlnps using the backward elimination feature selection algorithm. First, we remove each of the three feature groups independently from Rlnps's feature set and determine which of the three removals yields the best performance according to each scoring metric. Next, among the remaining two feature groups, we repeat the same step, removing each of the two groups independently from the feature set to determine which of the two removals yields the best performance.

The trend is consistent among all three scoring metrics: the first feature type to remove is paragraph sequences (meaning that they are the least important) and the last to remove is the nearest neighbor features. Nevertheless, performance always drops when a feature type is removed, indicating that all three feature types contribute positively to overall performance.

The fact that flat paragraph sequence features proved to be least useful highlights the importance of the structured methods we presented for using paragraph sequence information.
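The backward elimination procedure described above is a simple greedy loop; the sketch below is our paraphrase of it, where evaluate stands in for retraining the system without some feature groups and measuring S1, S2, or S3 on held-out data.

```python
def backward_elimination(feature_groups, evaluate):
    """Greedy backward elimination over feature groups (e.g., nearest-neighbor,
    paragraph-subsequence, and sentence-subsequence features). `evaluate`
    takes the list of retained groups and returns an error score (lower is
    better). Returns the groups in removal order: the first group removed is
    the least important, the last one remaining is the most important."""
    remaining = list(feature_groups)
    removal_order = []
    while len(remaining) > 1:
        # Remove the group whose removal yields the best (lowest-error) system.
        best = min(remaining,
                   key=lambda g: evaluate([r for r in remaining if r != g]))
        remaining.remove(best)
        removal_order.append(best)
    removal_order.extend(remaining)
    return removal_order
```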

3.8 Summary

We have investigated the relatively understudied problem of modeling organization in student essays. The contributions of our work include the novel application of two techniques from bioinformatics and machine learning — sequence alignment and string kernels, as well as the introduction of alignment kernels — to essay scoring. We showed that each technique makes a significant contribution to a scoring system, and we hope that this work will increase awareness of these powerful techniques among NLP researchers. Finally, to stimulate work on this problem, we make our corpus of annotated essays available to other researchers.

CHAPTER 4

THESIS CLARITY SCORING1

4.1 Introduction

Automated essay scoring, the task of employing computer technology to evaluate and score written text, is one of the most important educational applications of natural language processing (NLP) (see Shermis and Burstein (2003) and Shermis et al. (2010) for an overview of the state of the art in this task). A major weakness of many existing scoring engines such as the Intelligent Essay Assessor™ (Landauer et al., 2003) is that they adopt a holistic scoring scheme, which summarizes the quality of an essay with a single score and thus provides very limited feedback to the writer. In particular, it is not clear which dimension of an essay (e.g., style, coherence, relevance) a score should be attributed to. Recent work addresses this problem by scoring a particular dimension of essay quality such as coherence (Miltsakaki and Kukich, 2004), technical errors, Relevance to Prompt (Higgins et al., 2004), and organization (Persing et al., 2010). Essay grading software that provides feedback along multiple dimensions of essay quality such as E-rater/Criterion (Attali and Burstein, 2006) has also begun to emerge.

Nevertheless, there is an essay scoring dimension for which few computational models have been developed — thesis clarity. Thesis clarity refers to how clearly an author explains the thesis of her essay, i.e., the position she argues for with respect to the topic on which the essay is written.2 An essay with a high thesis clarity score presents its thesis in a way that is easy for the reader to understand, preferably but not necessarily directly, as in essays with

1This chapter was previously published in Persing and Ng (2013).

2An essay’s thesis is the overall message of the entire essay. This concept is unbound from the the concept of thesis sentences, as even an essay that never explicitly states its thesis in any of its sentences may still have an overall message that can be inferred from the arguments it makes.

explicit thesis sentences. It additionally contains no errors such as excessive misspellings that make it more difficult for the reader to understand the writer's purpose.

Our goals in this chapter are two-fold. First, we aim to develop a computational model for scoring the thesis clarity of student essays. Because there are many reasons why an essay may receive a low thesis clarity score, our second goal is to build a system for determining why an essay receives its score. We believe the feedback provided by this system will be more informative to a student than would a thesis clarity score alone, as it will help her understand which aspects of her writing need to be improved in order to better convey her thesis. To this end, we identify five common errors that impact thesis clarity, and our system's purpose is to determine which of these errors occur in a given essay. We evaluate our thesis clarity scoring model and error identification system on a data set of 830 essays annotated with both thesis clarity scores and errors.

In sum, our contributions in this chapter are three-fold. First, we develop a scoring model and error identification system for the thesis clarity dimension on student essays. Second, we use features explicitly designed for each of the identified error types in order to train our scoring model, in contrast to many existing systems for other scoring dimensions, which use more general features developed without the concept of error classes. Third, we make our data set consisting of thesis clarity annotations of 830 essays publicly available in order to stimulate further research on this task. Since progress in thesis clarity modeling is hindered in part by the lack of a publicly annotated corpus, we believe that our data set will be a valuable resource to the NLP community.

4.2 Corpus Information

We use as our corpus the 4.5 million word International Corpus of Learner English (ICLE) (Granger et al., 2009) described in Chapter 3.2. We select a subset consisting of 830 argumentative essays from the ICLE to annotate and use for training and testing of our models of essay thesis clarity. Table 3.1 shows several of the thirteen topics selected for annotation. Fifteen native languages are represented in the set of essays selected for annotation.

Table 4.1. Description of each thesis clarity score.

Score  Description of Thesis Clarity
  4    essay presents a very clear thesis and requires little or no clarification
  3    essay presents a moderately clear thesis but could benefit from some clarification
  2    essay presents an unclear thesis and would greatly benefit from further clarification
  1    essay presents no thesis of any kind and it is difficult to see what the thesis could be

4.3 Corpus Annotation

For each of the 830 argumentative essays, we ask two native English speakers to (1) score it along the thesis clarity dimension and (2) determine the subset of the five pre-defined errors that detracts from the clarity of its thesis.

Scoring. Annotators evaluate the clarity of each essay’s thesis using a numerical score from 1 to 4 at half-point increments (see Table 4.1 for a description of each score). This contrasts with previous work on essay scoring, where the corpus is annotated with a binary decision (i.e., good or bad) for a given scoring dimension (e.g., Higgins et al. (2004)). Hence, our annotation scheme not only provides a finer-grained distinction of thesis clarity (which can be important in practice), but also makes the prediction task more challenging.

To ensure consistency in annotation, we randomly select 100 essays to be graded by both annotators. Analysis of these essays reveals that, though annotators only exactly agree on the thesis clarity score of an essay 36% of the time, the scores they apply are within 0.5 points in 62% of essays and within 1.0 point in 85% of essays. Table 4.2 shows the number of essays that receive each of the seven scores for thesis clarity.

Table 4.2. Distribution of thesis clarity scores.

score    1.0   1.5   2.0   2.5   3.0   3.5   4.0
essays     4     9    52    78   168   202   317

Error identification. To identify what kinds of errors make an essay’s thesis unclear, we ask one of our annotators to write 1–4 sentence critiques of thesis clarity on 527 essays, and obtain our list of five common error classes by categorizing the things he found to criticize. We present our annotators with descriptions of these five error classes (see Table 4.3), and ask them to assign zero or more of the error types to each essay. It is important to note that we ask our annotators to mark an essay with one of these errors only when the error makes the thesis less clear. So for example, an essay whose thesis is irrelevant to the prompt but is explicitly and otherwise clearly stated would not be marked as having a Relevance to Prompt error. If the irrelevant thesis is stated in such a way that its inapplicability to the prompt causes the reader to be confused about what the essay’s purpose is, however, then the essay would be assigned a Relevance to Prompt error.

Table 4.3. Description of thesis clarity errors.

Id    Error                        Description
CP    Confusing Phrasing           The thesis is phrased oddly, making it hard to understand the writer's point.
IPR   Incomplete Prompt Response   The thesis seems to leave some part of a multi-part prompt unaddressed.
R     Relevance to Prompt          The apparent thesis's weak relation to the prompt causes confusion.
MD    Missing Details              The thesis leaves out important detail needed to understand the writer's point.
WP    Writer Position              The thesis describes a position on the topic without making it clear that this is the position the writer supports.

To measure inter-annotator agreement on error identification, we ask both annotators to identify the errors in the same 100 essays that were doubly-annotated with thesis clarity scores. We then compute Cohen's Kappa (Carletta, 1996) on each error from the two sets of annotations, obtaining an average Kappa value of 0.75, which indicates fair agreement. Table 4.4 shows the number of essays assigned to each of the five thesis clarity errors. As we can see, Confusing Phrasing, Incomplete Prompt Response, and Relevance to Prompt are the major error types.

Table 4.4. Distribution of thesis clarity errors.

error     CP   IPR     R    MD    WP
essays   152   123   142    47    39

Relationship between clarity scores and error classes. To determine the relationship between thesis clarity scores and the five error classes, we train a linear SVM regressor using the SVMlight software package (Joachims, 1999) with the five error types as independent variables and the reduction in thesis clarity score due to errors as the dependent variable. More specifically, each training example consists of a target, which we set to the essay's thesis clarity score minus 4.0, and six binary features, each of the first five representing the presence or absence of one of the five errors in the essay, and the sixth being a bias feature which we always set to 1. Representing the reduction in an essay's thesis clarity score with its thesis clarity score minus 4.0 allows us to more easily interpret the error and bias weights of the trained system, as under this setup, each error's weight should be a negative number reflecting how many points an essay loses due to the presence of that error. The bias feature allows for the possibility that an essay may lose points from its thesis clarity score for problems not accounted for in our five error classes. By setting this bias feature to 1, we tell our learner that an essay's default score may be less than 4.0 because these other problems may lower the average score of otherwise perfect essays.

After training, we examined the weight parameters of the learned regressor and found that they were all negative: −0.6 for CP, −0.5998 for IPR, −0.8992 for R, −0.6 for MD, −0.8 for WP, and −0.1 for the bias. These results are consistent with our intuition that each of the enumerated error classes has a negative impact on thesis clarity score. In particular, each has a demonstrable negative impact, costing essays an average of more than 0.59 points when it occurs. Moreover, this set of errors accounts for a large majority of all errors impacting thesis clarity because unenumerated errors cost essays an average of only one-tenth of one point on the four-point thesis clarity scale.
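The setup of this regression can be illustrated as follows. The dissertation trains a linear SVM regressor with SVMlight; the sketch below uses ordinary least squares purely as a stand-in to show how the design matrix (five binary error indicators plus an always-on bias) and the target (score minus 4.0) are constructed. All names are ours.

```python
import numpy as np

ERRORS = ["CP", "IPR", "R", "MD", "WP"]

def fit_error_weights(essays):
    """essays: list of (thesis_clarity_score, set_of_error_ids) pairs.
    Builds one row per essay with five 0/1 error indicators plus a bias
    feature fixed to 1, and a target of score - 4.0, then fits a linear
    model. Ordinary least squares is used here only as an illustrative
    stand-in for the SVM regressor used in the dissertation."""
    X = np.array([[1.0 if e in errs else 0.0 for e in ERRORS] + [1.0]
                  for _, errs in essays])
    y = np.array([score - 4.0 for score, _ in essays])
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(ERRORS + ["bias"], weights))
```

Under this setup, each learned weight can be read directly as the average number of points an essay loses when the corresponding error is present, with the bias weight absorbing score reductions not covered by the five error classes.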

4.4 Error Classification

In this section, we describe in detail our system for identifying thesis clarity errors.

4.4.1 Model Training and Application

We recast the problem of identifying which thesis clarity errors apply to an essay as a multi-label classification problem, wherein each essay may be assigned zero or more of the five pre-defined error types. To solve this problem, we train five binary classifiers, one for each error type, using a one-versus-all scheme. So in the binary classification problem for identifying error ei, we create one training instance from each essay in the training set, labeling the instance as positive if the essay has ei as one of its labels, and negative otherwise. Each instance is represented by seven types of features, including two types of baseline features (Section 4.4.2) and five types of features we introduce for error identification (Section 4.4.3).

After creating training instances for error ei, we train a binary classifier, bi, for identifying which test essays contain error ei. We use SVMlight for classifier training with the regularization parameter, C, set to ci. To improve classifier performance, we perform feature selection. While we employ seven types of features (see Sections 4.4.2 and 4.4.3), only the word n-gram features are subject to feature selection.3 Specifically, we employ the top ni n-gram

3We do not apply feature selection to the remaining feature types since each of them includes only a small number of overall features that are expected to be useful.

features as selected according to information gain computed over the training data (see Yang and Pedersen (1997) for details). Finally, since each classifier assigns a real value to each test essay presented to it indicating its confidence that the essay should be assigned error ei, we employ a classification threshold ti to decide how high this real value must be in order for our system to conclude that an essay contains error ei.

Using held-out validation data, we jointly tune the three parameters in the previous paragraph, ci, ni, and ti,4 to optimize the F-measure achieved by bi for error ei. However, an exact solution to this optimization problem is computationally expensive. Consequently, we find a local maximum by employing the simulated annealing algorithm (Kirkpatrick et al., 1983), altering one parameter at a time to optimize F-measure while holding the remaining parameters fixed.
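One simple reading of this tuning loop, altering one parameter at a time while holding the others fixed, is sketched below; it is our approximation and omits the stochastic acceptance step of full simulated annealing. Here param_grid maps each parameter name to its candidate values, and f_measure stands in for training bi with a given (ci, ni, ti) setting and scoring it on the validation data.

```python
def tune_parameters(param_grid, f_measure, max_rounds=10):
    """Coordinate-wise search over the classifier parameters (c_i, n_i, t_i):
    repeatedly re-optimize one parameter at a time on validation data while
    holding the others fixed, stopping when no single change improves the
    validation F-measure. A simplified stand-in for the tuning described
    above, not the dissertation's code."""
    current = {name: values[0] for name, values in param_grid.items()}
    best_f = f_measure(current)
    for _ in range(max_rounds):
        improved = False
        for name, values in param_grid.items():
            for v in values:
                candidate = dict(current, **{name: v})
                f = f_measure(candidate)
                if f > best_f:
                    current, best_f, improved = candidate, f, True
        if not improved:
            break
    return current, best_f
```

Following footnote 4 below, param_grid would map ci to the powers of ten from 10^2 to 10^6, ni to {3000, 4000, 5000, all}, and ti to the eleven tenth-based thresholds plus the default threshold x.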

After training the classifiers, we use them to classify the test set essays. The test instances are created in the same way as the training instances.

4.4.2 Baseline Features

Our Baseline system for error classification employs two types of features. First, since labeling essays with thesis clarity errors can be viewed as a text categorization task, we employ lemmatized word unigram, bigram, and trigram features that occur in the essay and that have not been removed by the feature selection parameter ni. Because the essays vary greatly in length, we normalize each essay's set of word features to unit length.

4For parameter tuning, we employ the following values. ci may be assigned any of the values 10^2, 10^3, 10^4, 10^5, or 10^6. ni may be assigned any of the values 3000, 4000, 5000, or all, where all means all features are used. For ti, we split the range of classification values bi returns for the test set into tenths. ti may take the values 0.0, 0.1, 0.2, ..., 1.0, and x, where 0.0 classifies all instances as negative, 0.1 classifies only instances bi assigned values in the top tenth of the range as positive, and so on, and x is the default threshold, labeling essays as positive instances of ei only if bi returns for them a value greater than 0. It was necessary to assign ti in this way because the range of values classifiers return varies greatly depending on which error type we are classifying and which other parameters we use. This method gives us reasonably fine-grained thresholds without having to try an unreasonably large number of values for ti.

The second type of baseline features is based on random indexing (Kanerva et al., 2000). Random indexing is "an efficient, scalable and incremental alternative" (Sahlgren, 2005) to Latent Semantic Indexing (Deerwester et al., 1990; Landauer and Dumais, 1997) which allows us to automatically generate a similarity measure between any two words. We train our random indexing model on over 30 million words of the English Gigaword corpus (Parker et al., 2009) using the S-Space package (Jurgens and Stevens, 2010). We expect that features based on random indexing may be particularly useful for the Incomplete Prompt Response and Relevance to Prompt errors because they may help us find text related to the prompt even if some of its components have been rephrased (e.g., an essay may talk about "jail" rather than "prison", which is mentioned in one of the prompts). For each essay, we therefore generate four random indexing features, one encoding the entire essay's similarity to the prompt, another encoding the essay's highest individual sentence's similarity to the prompt, a third encoding the highest entire essay similarity to one of the prompt sentences, and finally one encoding the highest individual sentence similarity to an individual prompt sentence. Since random indexing does not provide a straightforward way to measure similarity between groups of words such as sentences or essays, we use Higgins and Burstein's (2007) method to generate these features.
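Given a word-group similarity function (assumed here to be implemented over random indexing vectors following Higgins and Burstein (2007); its internals are not shown), the four features can be assembled as in the sketch below. All names are ours.

```python
def random_indexing_features(essay_sentences, prompt_sentences, similarity):
    """Builds the four prompt-similarity features described above.
    essay_sentences / prompt_sentences: lists of tokenized sentences.
    similarity: a function scoring the similarity of two word groups,
    assumed to be backed by a trained random indexing model."""
    essay = [w for s in essay_sentences for w in s]
    prompt = [w for s in prompt_sentences for w in s]
    return {
        "essay_vs_prompt": similarity(essay, prompt),
        "best_sentence_vs_prompt": max(similarity(s, prompt) for s in essay_sentences),
        "essay_vs_best_prompt_sentence": max(similarity(essay, p) for p in prompt_sentences),
        "best_sentence_vs_best_prompt_sentence": max(
            similarity(s, p) for s in essay_sentences for p in prompt_sentences),
    }
```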

4.4.3 Novel Features

Next, we introduce five types of novel features.

Spelling. One problem we note when examining the information gain top-ranked features for the Confusing Phrasing error is that there are very few common confusing phrases that can contribute to this error. Errors of this type tend to be unique, and hence are not very useful for error classification (because we are not likely to see the same error in the training and test sets). We notice, however, that there are a few misspelled words at the top of the list. This makes sense because a thesis sentence containing excessive misspellings may be less clear to the reader. Even the most common spelling errors, however, tend to be rare. Furthermore, we ask our annotators to only annotate an error if it makes the thesis less clear. The mere presence of an awkward phrase or misspelling is not enough to justify the Confusing Phrasing label. Hence, we introduce a misspelling feature whose value is the number of spelling errors in an essay's most-misspelled sentence.5

5We employ SCOWL (http://wordlist.sourceforge.net/) as our dictionary, assuming that a word that does not appear in the dictionary is misspelled.

Keywords. Improving the prediction of majority classes can greatly enhance our system's overall performance. Hence, since we have introduced the misspelling feature to enhance our system's performance on one of the more frequently occurring errors (Confusing Phrasing), it makes sense to introduce another type of feature to improve performance on the other two most frequent errors, Incomplete Prompt Response and Relevance to Prompt. For this reason, we introduce keyword features. To use this feature, we first examine each of the 13 essay prompts, splitting it into its component pieces. For our purposes, a component of a prompt is a prompt substring such that, if an essay does not address it, it may be assigned the Incomplete Prompt Response label. Then, for each component, we manually select the most important (primary) and second most important (secondary) words that it would be good for a writer to use to address the component. To give an example, the lemmatized version of the third component of the second essay prompt in Table 3.1 is "it should rehabilitate they" rather than "it should rehabilitate them". For this component we selected "rehabilitate" as a primary keyword and "society" as a secondary keyword.

To compute one of our keyword features, we compute the random indexing similarity between the essay and each group of primary keywords taken from components of the essay's prompt and assign the feature the lowest of these values. If this feature has a low value, that suggests that the essay may have an Incomplete Prompt Response error because the essay probably did not respond to the part of the prompt from which this value came. To compute another of the keyword features, we count the numbers of combined primary and secondary keywords the essay contains from each component of its prompt, and divide each number by the total number of primary and secondary keywords for that component. If the greatest of these fractions has a low value, that indicates the essay's thesis might not be very Relevant to the Prompt.6

6See our website at http://www.hlt.utdallas.edu/~persingq/ICLE/ for the complete list of keyword features.

Aggregated word n-gram features. Other ways we could measure our system's performance (such as macro F-score) would consider our system's performance on the less frequent errors no less important than its performance on the most frequent errors. For this reason, it now makes sense for us to introduce a feature tailored to help our system do better at identifying the least-frequent error types, Missing Details and Writer Position, each of which occurs in fewer than 50 essays. To help with identification of these error classes, we introduce aggregated word n-gram features. While we mention in the previous section one of the reasons regular word n-gram features can be expected to help with these error classes, one of the problems with regular word n-gram features is that the exact same useful phrase seldom occurs very frequently. Additionally, since there are numerous word n-grams, some infrequent ones may just by chance only occur in positive training set instances, causing the learner to think they indicate the positive class when they do not.

To address these problems, for each of the five error classes ei, we construct two Aggregated word features Aw+i and Aw−i. For each essay, Aw+i counts the number of word n-grams we believe indicate that an essay is a positive example of ei, and Aw−i counts the number of word n-grams we believe indicate an essay is not an example of ei. Aw+ n-grams for the Missing Details error tend to include phrases like "there is something" or "this statement", while Aw− n-grams are often words taken directly from an essay's prompt. N-grams used for Writer Position's Aw+ tend to suggest the writer is distancing herself from whatever statement is being made, such as "every person", but n-grams for this error's Aw− feature are difficult to find. Since Aw+i and Aw−i are so error specific, they are only included in an essay's feature representation when it is presented to learner bi. So while aggregated word n-grams introduce ten new features, each learner bi only sees two of these (Aw+i and Aw−i).

We construct the lists of word n-grams that are aggregated for use as the Aw+ and Aw− feature values in the following way. For each error class ei, we sort the list of all features occurring at least ten times in the training set by information gain. A human annotator then manually inspects the top thousand features in each of the five lists and sorts each list's features into three categories. The first category for ei's list consists of features that indicate an essay may be a positive instance. Each word n-gram from this list that occurs in an essay increases the essay's Aw+i value by one. Similarly, any word n-gram sorted into the second category, which consists of features the annotator thinks indicate a negative instance of ei, increases the essay's Aw−i value by one. The third category just contains all the features the annotator did not believe were useful enough to either class, and we make no further use of those features. For most error types, only about 12% of the top 1000 features get sorted into one of the first two categories.
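Once the annotator-curated indicator lists exist, the Aw+i and Aw−i values reduce to simple counts, as in the sketch below (our code; counting per occurrence versus per distinct n-gram is our reading of the text).

```python
def aggregated_word_features(essay_ngrams, aw_plus_list, aw_minus_list):
    """Aw+ and Aw- values for one error class e_i: how many of the essay's
    word n-grams appear on the manually curated positive-indicator and
    negative-indicator lists for that error. Lists are assumed to be sets
    of n-gram tuples; occurrences are counted here."""
    aw_plus = sum(1 for g in essay_ngrams if g in aw_plus_list)
    aw_minus = sum(1 for g in essay_ngrams if g in aw_minus_list)
    return aw_plus, aw_minus
```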

POS n-grams. We might further improve our system's performance on the Missing Details error type by introducing a feature that aggregates part-of-speech (POS) tag n-grams in the same way that the Aw features aggregate word n-gram features. For this reason, we include POS tag 1, 2, 3, and 4-grams in the set of features we sort in the previous paragraph. For each error ei, we select POS tag n-grams from the top thousand features of the information gain sorted list to count toward the Ap+i and Ap−i aggregation features. We believe this kind of feature may help improve performance on Missing Details because the list of features aggregated to generate the Ap+i feature's value includes POS n-gram features like CC " NN " (scare quotes). This feature type may also help with Confusing Phrasing because the list of POS tag n-grams our annotator generated for its Ap+i contains useful features like DT NNS VBZ VBN (e.g., "these signals has been"), which captures noun-verb disagreement.

Semantic roles. Our last aggregated feature is generated using FrameNet-style semantic role labels obtained using SEMAFOR (Das et al., 2010). For each sentence in our data set, SEMAFOR identifies each semantic frame occurring in the sentence as well as each frame element that participates in it. For example, a semantic frame may describe an event that occurs in a sentence, and the event's frame elements may be the people or objects that participate in the event. For a more concrete example, consider the sentence "They said they do not believe that the prison system is outdated". This sentence contains a Statement frame because a statement is made in it. One of the frame elements participating in the frame is the Speaker "they". From this frame, we would extract a feature pairing the frame together with its frame element to get the feature "Statement-Speaker-they". This feature indicates that the essay it occurs in might be a positive instance of the Writer Position error since it tells us the writer is attributing some statement being made to someone else. Hence, this feature, along with several others like "Awareness-Cognizer-we all", is useful when constructing the lists of frame features for Writer Position's aggregated frame features Af+i and Af−i. Like every other aggregated feature, Af+i and Af−i are generated for every error ei.

4.5 Score Prediction

Because essays containing thesis clarity errors tend to have lower thesis clarity scores than essays with fewer errors, we believe that thesis clarity scores can be predicted for essays by utilizing the same features we use for identifying thesis clarity errors. Because our score prediction system uses the same feature types we use for thesis error identification, each essay's vector space representation remains unchanged. Only its label changes to one of the values in Table 4.1 in order to reflect its thesis clarity score. To make use of the fact that some pairs of scores are more similar than others (e.g., an essay with a score of 3.5 is more similar to an essay with a score of 4.0 than it is to one with a score of 1.0), we cast thesis clarity score prediction as a regression rather than classification task.

Treating thesis clarity score prediction as a regression problem removes our need for a classification threshold parameter like the one we use in the error identification problem, but if we use SVMlight’s regression option, it does not remove the need for tuning a regularization parameter, C, or a feature selection parameter, n.7 We jointly tune these two parameters to optimize performance on held-out validation data by performing an exhaustive search in the parameter space.8

After we select the features, construct the essay instances, train a regressor on training set essays, and tune parameters on validation set essays, we can use the regressor to obtain thesis clarity scores on test set essays.

4.6 Evaluation

In this section, we evaluate our systems for error identification and scoring. All the results we report are obtained via five-fold cross-validation experiments. In each experiment, we

7Before tuning the feature selection parameter, we have to sort the list of n-gram features occurring in the training set. To enable the use of information gain as the sorting criterion, we treat each distinct score as its own class.

8The absence of the classification threshold parameter and the fact that we do not need to train multiple learners, one for each score, make it feasible for us to do two things. First, we explore a wider range of values for the two parameters: we allow C to take any value from 10^0, 10^1, 10^2, 10^3, 10^4, 10^5, 10^6, or 10^7, and we allow n to take any value from 1000, 2000, 3000, 4000, 5000, or all. Second, we exhaustively explore the space defined by these parameters in order to obtain an exact solution to the parameter optimization problem.

use 3/5 of our labeled essays for model training, another 1/5 for parameter tuning, and the final 1/5 for testing.

4.6.1 Error Identification

Evaluation metrics. To evaluate our thesis clarity error type identification system, we compute precision, recall, micro F-score, and macro F-score, which are calculated as follows.

Let tpi be the number of test essays correctly labeled as positive by error ei's binary classifier bi; pi be the total number of test essays labeled as positive by bi; and gi be the total number of test essays that belong to ei according to the gold standard. Then, the precision (Pi), recall (Ri), and F-measure (Fi) for each classifier bi and the macro F-score (F̂) of the combined system for one test fold are calculated by

$$P_i = \frac{tp_i}{p_i}, \qquad R_i = \frac{tp_i}{g_i}, \qquad F_i = \frac{2 P_i R_i}{P_i + R_i}, \qquad \hat{F} = \frac{\sum_i F_i}{5}.$$

the less frequent errors. To avoid this problem, we also calculate for each system the micro

precision, recall, and F-measure (P, R, and F), where

P P i tpi i tpi 2PR P = P , R = P , F = . i pi i gi P + R Since we perform five-fold cross-validation, each value we report for each of these measures is an average over its values for the five folds.9
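These quantities can be computed per fold as in the following sketch (our code; the per-error counts are as defined above).

```python
def error_identification_scores(per_error_counts):
    """per_error_counts: list of (tp_i, p_i, g_i) triples, one per error class,
    where tp_i = correctly predicted positives, p_i = predicted positives,
    and g_i = gold positives. Returns micro precision, recall, F, and macro F."""
    def f1(p, r):
        return 2 * p * r / (p + r) if p + r > 0 else 0.0

    per_class_f = []
    for tp, p, g in per_error_counts:
        prec = tp / p if p else 0.0
        rec = tp / g if g else 0.0
        per_class_f.append(f1(prec, rec))
    macro_f = sum(per_class_f) / len(per_class_f)

    tp_sum = sum(tp for tp, _, _ in per_error_counts)
    p_sum = sum(p for _, p, _ in per_error_counts)
    g_sum = sum(g for _, _, g in per_error_counts)
    micro_p = tp_sum / p_sum if p_sum else 0.0
    micro_r = tp_sum / g_sum if g_sum else 0.0
    return micro_p, micro_r, f1(micro_p, micro_r), macro_f
```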

Results and discussion. Results on error identification, expressed in terms of precision, recall, micro F-score, and macro F-score are shown in the first four columns of Table 4.5.

Our Baseline system, which only uses word n-gram and random indexing features, seems to

9This averaging explains why the formula for F does not exactly hold in the Table 4.5 results.

Table 4.5. Five-fold cross-validation results for thesis clarity error identification and scoring.

                   Error Identification           Scoring
    System       P      R      F      F̂        S1     S2     S3
  1 B          24.8   44.7   31.1   24.0      .658   .517   .403
  2 Bm         24.2   44.2   31.2   25.3      .654   .515   .402
  3 Bmk        29.2   44.2   34.9   26.7      .663   .490   .369
  4 Bmkw       28.5   49.6   35.5   31.4      .651   .484   .374
  5 Bmkwp      34.2   49.6   40.4   34.6      .671   .483   .377
  6 Bmkwpf     33.6   54.4   41.4   37.3      .672   .486   .382

perform uniformly poorly across both micro and macro F-scores (F and F̂; see row 1). The per-class results show that, since micro F-score places more weight on the correct identification of the most frequent errors, the system's micro F-score (31.1%) is fairly close to the average of the scores obtained on the three most frequent error classes, CP, IPR, and R, and remains unaffected by very low F-scores on the two remaining infrequent classes.10

When we add the misspelling feature to the baseline, resulting in the system called Bm (row 2), the micro F-score sees a very small, insignificant improvement.11 What is pleasantly surprising, however, is that, even though the misspelling features were developed for the Confusing Phrasing error type, they actually have more of a positive impact on Missing Details and Writer Position, bumping their individual error F-scores up by about 5 and 3 percent respectively. This suggests that spelling difficulties may be correlated with these other essay-writing difficulties, despite their apparent unrelatedness. This effect is strong enough to generate the small, though insignificant, gain in macro F-score shown in the table.

When we add keyword features to the system, micro F-score increases significantly by 3.7 points (row 3). The micro per-class results reveal that, as intended, keyword features

10Since parameters for optimizing micro F-score and macro F-score are selected independently, the per-class F-scores associated with micro F-score are different than those used for calculating macro F-score. Hence, when we discuss per-class changes influencing micro F-score, we refer to the former set, and otherwise we refer to the latter set.

11All significance tests are paired t-tests, with p < 0.05.

improve Incomplete Prompt Response and Relevance to Prompt's F-scores, which rise by 6.4 and 9.2 percentage points respectively. The macro F-scores reveal this too, though the macro F-score gains are 3.2 points and 11.5 points respectively. The macro F-score of the overall system would likely have improved more than shown in the table if the addition of keyword features did not simultaneously reduce Missing Details's score by several points.

While we hoped that adding aggregated word n-gram features to the system (row 4) would be able to improve performance on Confusing Phrasing due to the presence of phrases such as "in university be" in the error's Aw+i list, there turned out to be few such common phrases in the data set, so performance on this class remains mostly unchanged. This feature type does, however, result in major improvements to micro and macro performance on Missing Details and Writer Position, the other two classes this feature was designed to help. Indeed, the micro F-score versions of Missing Details and Writer Position improve by 15.3 and 10.8 percentage points respectively. Since these are minority classes, however, the large improvements result in only a small, insignificant improvement in the overall system's micro F-score. The macro F-score results for these classes, however, improve by 6.5% and 17.6% respectively, giving us a nearly 5-point, statistically significant bump in macro F-score after we add this feature.

Confusing Phrasing has up to now stubbornly resisted any improvement, even when we added features explicitly designed to help our system do better on this error type. When we add aggregated part of speech n-gram features on top of the previous system, that changes dramatically. Adding these features makes both our system's F-scores on Confusing Phrasing shoot up almost 8%, resulting in a significant, nearly 4.9% improvement in overall micro F-score and a more modest but insignificant 3.2% improvement in macro F-score (row 5). The micro F-score improvement can also be partly attributed to a four point improvement in Incomplete Prompt Response's micro F-score. The 13.7% macro F-score improvement of the Missing Details error plays a larger role in the overall system's macro F-score improvement than Confusing Phrasing's improvement, however.

The improvement we see in micro F-score when we add aggregated frame features (row 6) can be attributed almost solely to improvements in classification of the minority classes. This is surprising because, as we mentioned before, minority classes tend to have a much smaller impact on overall micro F-score. Furthermore, the overall micro F-score improvement occurs despite declines in the performances on two of the majority class errors. Missing Details and Writer Position's micro F-score performances increase by 19.1% and 13.4%. The latter is surprising only because of the magnitude of its improvement, as this feature type was explicitly intended to improve its performance. We did not expect this aggregated feature type to be especially useful for Missing Details error identification because very few of these types of features occur in its Af+i list, and there are none in its Af−i list. The few that are in the former list, however, occur fairly often and look like fairly good indicators of this error (both the examples "Event-Event-it" and "Categorization-Item-that" occur in the positive list, and both do seem vague, indicating more details are to be desired). Overall, this system improves our baseline's macro F-score performance significantly by 13.3% and its micro F-score performance significantly by 10.3%.

As we progressed, adding each new feature type to the baseline system, there was no definite and consistent pattern to how the precisions and recalls changed in order to produce the universal increases in the F-scores that we observed for each new system. Both simply tended to progress unevenly upward as new feature types were added. This confirms our intuition about these features, namely that they do not all uniformly improve our performance in the same way. Some aim to improve precision by telling us when essays are less likely to be positive instances of an error class, such as any of the Aw−i, Ap−i, or Af−i features, and others aim to tell us when an essay is more likely to be a positive instance of an error.

4.6.2 Scoring

Scoring metrics. We design three evaluation metrics to measure the error of our thesis clarity scoring system. The S1 metric measures the frequency at which a system predicts the wrong score out of the seven possible scores. Hence, a system that predicts the right score only 25% of the time would receive an S1 score of 0.75.

The S2 metric measures the average distance between the system's score and the actual score. This metric reflects the idea that a system that estimates scores close to the annotator-assigned scores should be preferred over a system whose estimations are further off, even if both systems estimate the correct score at the same frequency.

Finally, the S3 metric measures the average square of the distance between a system's thesis clarity score estimations and the annotator-assigned scores. The intuition behind this metric is that not only should we prefer a system whose estimations are close to the annotator scores, but we should also prefer one whose estimations are not too frequently very far away from the annotator scores. These three scores are given by:

$$\frac{1}{N} \sum_{A_j \neq E'_j} 1, \qquad \frac{1}{N} \sum_{j=1}^{N} |A_j - E_j|, \qquad \frac{1}{N} \sum_{j=1}^{N} (A_j - E_j)^2$$

where $A_j$, $E_j$, and $E'_j$ are the annotator assigned, system estimated, and rounded system estimated scores12 respectively for essay j, and N is the number of essays.

Results and discussion. Results on scoring are shown in the last three columns of Table 4.5. We see that the thesis clarity score predicting variation of the Baseline system, which employs as features only word n-grams and random indexing features, predicts the wrong score 65.8% of the time. Its predicted score is on average 0.517 points off of the actual score, and the average squared distance between the predicted and actual scores is 0.403.

We observed earlier that a high number of misspellings may be positively correlated with one or more unrelated errors. Adding the misspelling feature to the scoring systems,

12Since our regressor assigns each essay a real value rather than an actual valid thesis clarity score, it would be difficult to obtain a reasonable S1 score without rounding the system estimated score to one of the possible values. For that reason, we round the estimated score to the nearest of the seven scores the human annotators were permitted to assign (1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0) only when calculating S1.

Table 4.6. Distribution of predicted thesis clarity scores.

         S1 (Bmkw)          S2 (Bmkwp)         S3 (Bmk)
Gold    .25   .50   .75    .25   .50   .75    .25   .50   .75
1.0     3.5   3.5   3.5    3.0   3.2   3.5    3.1   3.2   3.3
1.5     2.5   3.0   3.0    2.8   3.1   3.2    2.6   3.0   3.2
2.0     3.0   3.0   3.5    3.0   3.2   3.5    3.0   3.1   3.4
2.5     3.0   3.5   3.5    3.0   3.3   3.6    3.0   3.3   3.5
3.0     3.0   3.5   3.5    3.1   3.4   3.5    3.1   3.3   3.5
3.5     3.5   3.5   4.0    3.2   3.4   3.6    3.2   3.4   3.5
4.0     3.5   3.5   4.0    3.4   3.6   3.8    3.4   3.5   3.7

however, only yields minor, insignificant improvements to their performances under the three scoring metrics.

While adding keyword features on top of this system does not improve the frequency with which the right score is predicted, it both tends to move the predictions closer to the actual thesis clarity score value (as evidenced by the significant improvement in S2) and ensures that predicted scores will not too often stray too far from the actual value (as evidenced by the significant improvement in S3). Overall, the scoring model employing the Bmk feature set performs significantly better than the Baseline scoring model with respect to two out of three scoring metrics.

The only remaining feature type whose addition yields a significant performance improvement is the aggregated word feature type, which improves system Bmk's S2 score significantly while having an insignificant impact on the other S measures.

Neither of the remaining aggregative features yields any significant improvements in performance. This is a surprising finding since, up until we introduced aggregated part-of-speech tag n-gram features into our regressor, each additional feature that helped with error classification made at least a small but positive contribution to at least two out of the three S scores. These aggregative features, which proved to be very powerful when assigning error labels, are not as useful for thesis clarity scoring.

To more closely examine the behavior of the best scoring systems, in Table 4.6 we chart the distributions of scores they predict for each gold standard score. As an example of how to read this table, consider the number 2.8 appearing in row 1.5 in the .25 column of the S2 (Bmkwp) region. This means that 25% of the time, when system Bmkwp (which obtains the best S2 score) is presented with a test essay having a gold standard score of 1.5, it predicts that the essay has a score less than or equal to 2.8 for the S2 measure.

From this table, we see that each of the best systems has a strong bias toward predicting more frequent scores, as there are no numbers less than 3.0 in the 50% columns, and about 82.8% of all essays have gold standard scores of 3.0 or above. Nevertheless, no system relies entirely on bias, as evidenced by the fact that each column in the table has a tendency for its scores to ascend as the gold standard score increases, implying that the systems have some success at predicting lower scores for essays with lower gold standard scores. Finally, we note that the difference in error weighting between the S2 and S3 scoring metrics appears to be having its desired effect, as there is a strong tendency for each entry in the S3 subtable to be less than or equal to its corresponding entry in the S2 subtable due to the greater penalty the S3 metric imposes for predictions that are very far away from the gold standard scores.

4.7 Summary

We examined the problem of modeling thesis clarity errors and scoring in student essays. In addition to developing these models, we proposed novel features for use in our thesis clarity error model and used these features, each of which was explicitly designed for one or more of the error types, to train our scoring model. We make our thesis clarity annotations publicly available in order to stimulate further research on this task.

CHAPTER 5

PROMPT ADHERENCE SCORING1

5.1 Introduction

Automated essay scoring, the task of employing computer technology to evaluate and score written text, is one of the most important educational applications of natural language processing (NLP) (see Shermis and Burstein (2003) and Shermis et al. (2010) for an overview of the state of the art in this task). A major weakness of many existing scoring engines such as the Intelligent Essay Assessor™ (Landauer et al., 2003) is that they adopt a holistic scoring scheme, which summarizes the quality of an essay with a single score and thus provides very limited feedback to the writer. In particular, it is not clear which dimension of an essay (e.g., style, coherence, relevance) a score should be attributed to. Recent work addresses this problem by scoring a particular dimension of essay quality such as coherence (Miltsakaki and Kukich, 2004), technical errors, organization (Persing et al., 2010), and thesis clarity (Persing and Ng, 2013). Essay grading software that provides feedback along multiple dimensions of essay quality such as E-rater/Criterion (Attali and Burstein, 2006) has also begun to emerge.

Our goal in this chapter is to develop a computational model for scoring an essay along an under-investigated dimension — prompt adherence. Prompt adherence refers to how related an essay’s content is to the prompt for which it was written. An essay with a high prompt adherence score consistently remains on the topic introduced by the prompt and is free of irrelevant digressions.

To our knowledge, little work has been done on scoring the prompt adherence of student essays since Higgins et al. (2004). Nevertheless, there are major differences between Higgins et al.'s work and our work with respect to both the way the task is formulated and the approach. Regarding task formulation, while Higgins et al. focus on classifying each sentence as having either good or bad adherence to the prompt, we focus on assigning a prompt adherence score to the entire essay, allowing the score to range from one to four points at half-point increments. As far as the approach is concerned, Higgins et al. adopt a knowledge-lean approach to the task, where almost all of the features they employ are computed based on a word-based semantic similarity measure known as Random Indexing (Kanerva et al., 2000). On the other hand, we employ a large variety of features, including lexical and knowledge-based features that encode how well the concepts in an essay match those in the prompt, LDA-based features that provide semantic generalizations of lexical features, and "error type" features that encode different types of errors the writer made that are related to prompt adherence.

1This chapter was previously published in Persing and Ng (2014).

In sum, our contributions in this chapter are two-fold. First, we develop a scoring model for the prompt adherence dimension on student essays using a feature-rich approach. Second, in order to stimulate further research on this task, we make our data set consisting of prompt adherence annotations of 830 essays publicly available. Since progress in prompt adherence modeling is hindered in part by the lack of a publicly annotated corpus, we believe that our data set will be a valuable resource to the NLP community.

5.2 Corpus Information

We use as our corpus the 4.5 million word International Corpus of Learner English (ICLE) (Granger et al., 2009) described in Chapter 3.2. We select a subset consisting of 830 argumentative essays from the ICLE to annotate for training and testing of our essay prompt adherence scoring system. Table 3.1 shows several of the 13 topics selected for annotation. Fifteen native languages are represented in the set of annotated essays.

Table 5.1. Description of each prompt adherence score.

Score  Description of Prompt Adherence
4      essay fully addresses the prompt and consistently stays on topic
3      essay mostly addresses the prompt or occasionally wanders off topic
2      essay does not fully address the prompt or consistently wanders off topic
1      essay does not address the prompt at all or is completely off topic

5.3 Corpus Annotation

We ask human annotators to score each of the 830 argumentative essays along the prompt adherence dimension. Our annotators were selected from over 30 applicants who were familiarized with the scoring rubric and given sample essays to score. The six who were most consistent with the expected scores were given additional essays to annotate. Annotators evaluated how well each essay adheres to its prompt using a numerical score from one to four at half-point increments (see Table 5.1 for a description of each score). This contrasts with previous work on prompt adherence essay scoring, where the corpus is annotated with a binary decision (i.e., good or bad) (e.g., Higgins et al. (2004; 2006), Louis and Higgins (2010)). Hence, our annotation scheme not only provides a finer-grained distinction of prompt adherence (which can be important in practice), but also makes the prediction task more challenging.

To ensure consistency in annotation, we randomly select 707 essays to be graded by multiple annotators. Analysis reveals that the Pearson's correlation coefficient computed over these doubly annotated essays is 0.243. Though annotators exactly agree on the prompt adherence score of an essay only 38% of the time, the scores they apply fall within 0.5 points in 66% of essays and within 1.0 point in 89% of essays. For the sake of our experiments, whenever annotators disagree on an essay's prompt adherence score, we assign the essay the average of all annotations rounded to the nearest half point. Table 5.2 shows the number of essays that receive each of the seven scores for prompt adherence.

Table 5.2. Distribution of prompt adherence scores.

score    1.0  1.5  2.0  2.5  3.0  3.5  4.0
essays     0    0    8   44  105  230  443

5.4 Score Prediction

In this section, we describe in detail our system for predicting essays’ prompt adherence scores.

5.4.1 Model Training and Application

We cast the problem of predicting an essay's prompt adherence score as 13 regression problems, one for each prompt. Each essay is represented as an instance whose label is the essay's true score (one of the values shown in Table 5.2) with up to seven types of features, including the baseline features (Section 5.4.2) and six other feature types proposed by us (Section 5.4.3). Our regressors may assign an essay any score in the range of 1.0−4.0. Using regression captures the fact that some pairs of scores are more similar than others (e.g., an essay with a prompt adherence score of 3.5 is more similar to an essay with a score of 4.0 than it is to one with a score of 1.0). A classification system, by contrast, may sometimes believe that the scores 1.0 and 4.0 are most likely for a particular essay, even though these scores are at opposite ends of the score range. Using a different regressor for each prompt captures the fact that it may be easier for an essay to adhere to some prompts than to others, and common problems students have writing essays for one prompt may not apply to essays written in response to another prompt. For example, in essays written in response to the prompt "Marx once said that religion was the opium of the masses. If he was alive at the end of the 20th century, he would replace religion with television," students sometimes write essays about all the evils of television, forgetting that their essay is only supposed to be about whether it is "the opium of the masses". Students are less likely to make an analogous mistake when writing for the prompt "Crime does not pay."

After creating training instances for prompt p_i, we train a linear regressor, r_i, with regularization parameter c_i for scoring test essays written in response to p_i, using the linear SVM regressor implemented in the LIBSVM software package (Chang and Lin, 2001). All SVM-specific learning parameters are set to their default values except c_i, which we tune to maximize performance on held-out validation data.

After training the regressors, we use them to score the test set essays. The test instances are created in the same way as the training instances.
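To make this training-and-scoring setup concrete, the following is a minimal sketch of the per-prompt regression loop. scikit-learn's LinearSVR is used here only as a stand-in for the LIBSVM linear SVM regressor named above, and the inputs (X_train, y_train, X_dev, y_dev, X_test) are assumed to be produced by the feature extractors described in the next two subsections.

    import numpy as np
    from sklearn.svm import LinearSVR
    from sklearn.metrics import mean_absolute_error

    C_VALUES = [10.0 ** k for k in range(7)]  # 10^0 ... 10^6, mirroring the tuning grid

    def train_prompt_regressor(X_train, y_train, X_dev, y_dev):
        """Train one linear regressor for a single prompt, tuning C on held-out data."""
        best_model, best_err = None, float("inf")
        for c in C_VALUES:
            model = LinearSVR(C=c, max_iter=20000).fit(X_train, y_train)
            err = mean_absolute_error(y_dev, model.predict(X_dev))  # S2-style error
            if err < best_err:
                best_model, best_err = model, err
        return best_model

    def score_essays(model, X_test):
        """Predict prompt adherence scores, clipped to the valid 1.0-4.0 range."""
        return np.clip(model.predict(X_test), 1.0, 4.0)

One such regressor would be trained for each of the 13 prompts.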

5.4.2 Baseline Features

Our baseline system for score prediction employs various features based on Random Indexing.

1. Random Indexing Random Indexing (RI) is "an efficient, scalable and incremental alternative" (Sahlgren, 2005) to Latent Semantic Indexing (Deerwester et al., 1990; Landauer and Dumais, 1997) which allows us to automatically generate a semantic similarity measure between any two words. We train our RI model on over 30 million words of the English Gigaword corpus (Parker et al., 2009) using the S-Space package (Jurgens and Stevens, 2010). We expect that features based on RI will be useful for prompt adherence scoring because they may help us find text related to the prompt even if some of its concepts have been rephrased (e.g., an essay may talk about "jail" rather than "prison", which is mentioned in one of the prompts), and because they have already proven useful for the related task of determining which sentences in an essay are related to the prompt (Higgins et al., 2004).

For each essay, we therefore attempt to adapt the RI features used by Higgins et al. (2004) to our problem of prompt adherence scoring. We do this by generating one feature encoding the entire essay's similarity to the prompt, another encoding the essay's highest individual sentence's similarity to the prompt, a third encoding the highest entire essay similarity to one of the prompt sentences, another encoding the highest individual sentence similarity to an individual prompt sentence, and finally one encoding the entire essay's similarity to a manually rewritten version of the prompt that excludes extraneous material (such as "In his novel Animal Farm, George Orwell wrote," which is introductory material from the fourth prompt in Table 3.1). Our RI feature set necessarily excludes those features from Higgins et al. that are not easily translatable to our problem since we are concerned with an entire essay's adherence to its prompt rather than with each of its sentences' relatedness to the prompt. Since RI does not provide a straightforward way to measure similarity between groups of words such as sentences or essays, we use Higgins and Burstein's (2007) method to generate these features.

5.4.3 Novel Features

Next, we introduce six types of novel features.

2. N-grams As our first novel feature, we use the 10,000 most important lemmatized unigram, bigram, and trigram features that occur in the essay. N-grams can be useful for prompt adherence scoring because they can capture useful words and phrases related to a prompt. For example, words and phrases like “university degree”, “student”, and “real world” are relevant to the second prompt in Table 3.1, so it is more likely that an essay adheres to the prompt if they appear in the essay.

We determine the “most important” n-gram features using information gain computed over the training data (Yang and Pedersen, 1997). Since the essays vary greatly in length, we normalize each essay’s set of n-gram features to unit length.
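The selection and normalization step can be sketched as follows; CountVectorizer and mutual_info_classif are stand-ins for the lemmatized n-gram extraction and the information gain ranking of Yang and Pedersen (1997), and train_texts, train_scores, and test_texts are assumed inputs.

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_selection import mutual_info_classif
    from sklearn.preprocessing import normalize

    def ngram_features(train_texts, train_scores, test_texts, k=10000):
        """Keep the k highest-scoring 1-3 gram features and L2-normalize each essay."""
        vec = CountVectorizer(ngram_range=(1, 3), lowercase=True)
        X_train = vec.fit_transform(train_texts)
        # Treat the seven possible scores as discrete class labels for the ranking step.
        labels = np.asarray(train_scores, dtype=str)
        info = mutual_info_classif(X_train, labels, discrete_features=True)
        top = np.argsort(info)[-k:]
        X_train = X_train[:, top]
        X_test = vec.transform(test_texts)[:, top]
        # Essays vary greatly in length, so scale each essay's vector to unit length.
        return normalize(X_train), normalize(X_test)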

3. Thesis Clarity Keywords Our next set of features consists of the keyword features we introduced in our previous work on essay thesis clarity scoring (Persing and Ng, 2013). Below we give an overview of these keyword features and motivate why they are potentially useful for prompt adherence scoring.

The keyword features were formed by first examining the 13 essay prompts, splitting each into its component pieces. As an example of what is meant by a "component piece", consider the second prompt in Table 3.1. The components of this prompt would be "Most university degrees are theoretical", "Most university degrees do not prepare students for the real world", and "Most university degrees are of very little value."

Then the most important (primary) and second most important (secondary) words were selected from each prompt component, where a word was considered “important” if it would be a good word for a student to use when stating her thesis about the prompt. So since the lemmatized version of the third component of the third prompt in Table 3.1 is “it should rehabilitate they”, “rehabilitate” was selected as a primary keyword and “society” as a secondary keyword.

Features are then computed based on these keywords. For instance, one thesis clarity keyword feature is computed as follows. The RI similarity measure is first taken between the essay and each group of the prompt's primary keywords. The feature then gets assigned the lowest of these values. If this feature has a low value, that suggests that the student ignored the prompt component from which the value came when writing the essay.

To compute another of the thesis clarity keyword features, the numbers of combined primary and secondary keywords the essay contains from each component of its prompt are counted. These numbers are then divided by the total count of primary and secondary keywords in their respective components. The greatest of the fractions generated in this way is encoded as a feature because if it has a low value, that indicates the essay's thesis may not be very relevant to the prompt.2

4. Prompt Adherence Keywords The thesis clarity keyword features described above were intended for the task of determining how clear an essay’s thesis is, but since our goal is instead to determine how well an essay adheres to its prompt, it makes sense to adapt keyword features to our task rather than to adopt keyword features exactly as they have been used before. For this reason, we construct a new list of keywords for each prompt component, though since prompt adherence is more concerned with what the student says about the topics than it is with whether or not what she says about them is stated clearly, our keyword lists look a little different than the ones discussed above. For an example, we earlier alluded to the problem of students merely discussing all the evils of television for the prompt “Marx once said that religion was the opium of the masses. If he was alive at the end of the 20th century, he would replace religion with television.” Since the question suggests that students discuss whether television is analogous to religion in this way, our set of prompt adherence keywords for this prompt contains the word “religion” while the previously discussed keyword sets do not. This is because a thesis like “Television is bad” can be stated very clearly without making any reference to religion at all, and so an essay with a thesis like this can potentially have a very high thesis clarity score. It should not, however, have a very high prompt adherence score, as the prompt asked the student to discuss whether television is like religion in a particular way, so religion should be at least briefly addressed for an essay to be awarded a high prompt adherence score.

Additionally, our prompt adherence keyword sets do not adopt the notions of primary and secondary groups of keywords for each prompt component, instead collecting all the keywords for a component into one set, because "secondary" keywords tend to be things that are important when we are concerned with what a student is saying about the topic rather than just how clearly she said it.

We form two types of features from prompt adherence keywords. While both types of features measure how much each prompt component was discussed in an essay, they differ in how they encode the information. To obtain feature values of the first type, we take the RI similarities between the whole essay and each set of prompt adherence keywords from the prompt's components. This results in one to three features, as some prompts have one component while others have up to three.

We obtain feature values of the second type as follows. For each component, we count the number of prompt adherence keywords the essay contains. We divide this number by the number of prompt adherence keywords we identified from the component. This results in one to three features since a prompt has one to three components.
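A minimal sketch of the count-based variant of these features is given below; component_keywords (one keyword set per prompt component) and the simple whitespace tokenization are illustrative assumptions rather than the exact implementation.

    def keyword_fraction_features(essay_text, component_keywords):
        """One feature per prompt component: the fraction of its keywords the essay uses."""
        essay_words = set(essay_text.lower().split())
        features = []
        for keywords in component_keywords:       # one to three components per prompt
            hits = sum(1 for kw in keywords if kw.lower() in essay_words)
            features.append(hits / len(keywords))
        return features

    # Hypothetical keyword set for the "Marx/television" prompt discussed earlier:
    # keyword_fraction_features(essay, [{"religion", "opium", "masses", "television"}])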

5. LDA Topics A problem with the features we have introduced up to this point is that they have trouble identifying topics that are not mentioned in the prompt but are nevertheless related to the prompt. These topics should not diminish the essay's prompt adherence score because they are at least related to prompt concepts. For example, consider the prompt "All armies should consist entirely of professional soldiers: there is no value in a system of military service." An essay containing words like "peace", "patriotism", or "training" is probably not digressing from the prompt, and therefore should not be penalized for discussing these topics. But the various measures of keyword similarities described above will at best not notice that anything related to the prompt is being discussed, and at worst, this might have effects like lowering some of the RI similarity scores, thereby probably lowering the prompt adherence score the regressor assigns to the essay. While n-gram features do not have exactly the same problem, they would still only notice that these example words are related to the prompt if multiple essays use the same words to discuss these concepts. For this reason, we introduce Latent Dirichlet Allocation (LDA) (Blei et al., 2003) features.

In order to construct our LDA features, we first collect all essays written in response to each prompt into its own set. Note that this feature type exploits unlabeled data: it includes all essays in the ICLE responding to our prompts, not just those in our smaller annotated 830 essay dataset. We then use the MALLET (McCallum, 2002) implementation of LDA to build a topic model of 1,000 topics around each of these sets of essays. This results in what we can think of as a soft clustering of words into 1,000 sets for each prompt, where each set of words represents one of the topics LDA identified being discussed in the essays for that prompt. So for example, the five most important words in the most frequently discussed topic for the military prompt we mentioned above are "man", "military", "service", "pay", and "war".

We also use the MALLET-generated topic model to tell us how much of each essay is spent discussing each of the 1,000 topics. The model might tell us, for example, that a particular essay written on the military prompt spends 35% of the time discussing the "man", "military", "service", "pay", and "war" topic and 65% of the time discussing a topic whose most important words are "fully", "count", "ordinary", "czech", and "day". Since the latter topic is discussed so much in the essay and does not appear to have much to do with the military prompt, this essay should probably get a bad prompt adherence score. We construct 1,000 features from this topic model, one for each topic. Each feature's value is obtained by using the topic model to tell us how much of the essay was spent discussing the feature's corresponding topic. From these features, our regressor should be able to learn which topics are important to a prompt adherent essay.

6. Manually Annotated LDA Topics A weakness of the LDA topics feature type is that it may result in a regressor that has trouble distinguishing between an infrequent topic that is adherent to the prompt and one that just represents an irrelevant digression. This is because an infrequent topic may not appear in the training set often enough for the regressor to make this judgment. We introduce the manually annotated LDA topics feature type to address this problem. In order to construct manually annotated LDA topic features, we first build 13 topic models, one for each prompt, just as described in the section on LDA topic features. Rather than requesting models of 1,000 topics, however, we request models of only 100 topics.3 We then go through all 13 lists of 100 topics as represented by their top ten words, manually annotating each topic with a number from 0 to 5 representing how likely it is that the topic is adherent to the prompt. A topic labeled 5 is very likely to be related to the prompt, whereas a topic labeled 0 appears totally unrelated. Using these annotations alongside the topic distribution for each essay that the topic models provide us, we construct ten features. The first five features encode the sum of the contributions to an essay of topics annotated with a number ≥ 1, the sum of the contributions of topics annotated with a number ≥ 2, and so on up to ≥ 5. The next five features are similar, with one feature taking on the sum of the contributions to an essay of topics annotated with the number 0, another the sum of the contributions of topics annotated with the number 1, and so on up to 4. We do not include a feature for topics annotated with the number 5 because it would always have the same value as the feature for topics annotated ≥ 5. Features like these should give the regressor a better idea how much of an essay is composed of prompt-related arguments and discussion and how much of it is irrelevant to the prompt, even if some of the topics occurring in it are too infrequent to judge just from training data.

3We use 100 topics for each prompt in the manually annotated version of LDA features rather than the 1,000 topics we use in the regular version of LDA features because 1,300 topics are not too costly to annotate, but manually annotating 13,000 topics would take too much time.
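The sketch below covers the two LDA-based feature types just described: it derives per-essay topic proportions from an LDA model (feature type 5) and then aggregates them with the manual 0-5 adherence labels (feature type 6). gensim is used only as a stand-in for the MALLET implementation, and tokenized_essays and topic_labels are assumed inputs.

    from gensim import corpora, models

    def topic_proportions(tokenized_essays, num_topics):
        """Fit an LDA model on one prompt's essays and return each essay's topic mix."""
        dictionary = corpora.Dictionary(tokenized_essays)
        bows = [dictionary.doc2bow(tokens) for tokens in tokenized_essays]
        lda = models.LdaModel(bows, id2word=dictionary, num_topics=num_topics, passes=5)
        props = []
        for bow in bows:
            dist = dict(lda.get_document_topics(bow, minimum_probability=0.0))
            props.append([dist.get(t, 0.0) for t in range(num_topics)])
        return lda, props          # feature type 5: one feature per topic (e.g., 1,000)

    def annotated_topic_features(essay_props, topic_labels):
        """Feature type 6: ten aggregates over a 100-topic model with manual 0-5 labels."""
        at_least = [sum(p for p, lab in zip(essay_props, topic_labels) if lab >= k)
                    for k in range(1, 6)]        # contributions of topics labeled >= 1..5
        exactly = [sum(p for p, lab in zip(essay_props, topic_labels) if lab == k)
                   for k in range(0, 5)]         # contributions of topics labeled 0..4
        return at_least + exactly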

7. Predicted Thesis Clarity Errors In our previous work on essay thesis clarity scoring (Persing and Ng, 2013), we identified five classes of errors that detract from the clarity of an essay's thesis:

Confusing Phrasing. The thesis is phrased oddly, making it hard to understand the writer’s point.

Incomplete Prompt Response. The thesis leaves some part of a multi-part prompt unaddressed.

Relevance to Prompt. The apparent thesis’s weak relation to the prompt causes confusion.

Missing Details. The thesis leaves out an important detail needed to understand the writer’s point.

Writer Position. The thesis describes a position on the topic without making it clear that this is the position the writer supports.

We hypothesize that these errors, though originally intended for thesis clarity scoring, could be useful for prompt adherence scoring as well. For instance, an essay that has a Relevance to Prompt error or an Incomplete Prompt Response error should intuitively receive a low prompt adherence score. For this reason, we introduce features based on these errors to our feature set for prompt adherence scoring.4

While each of the essays in our data set was previously annotated with these thesis clarity errors, in a realistic setting a prompt adherence scoring system will not have access to these manual error labels. As a result, we first need to predict which of these errors is present in each essay. To do this, we train five maximum entropy classifiers for each prompt, one for each of the five thesis clarity errors, using MALLET's (McCallum, 2002) implementation of maximum entropy classification. Instances are presented to the classifier for prompt p and error e in the following way. If a training essay is written in response to p, it is used to generate a training instance whose label is 1 if e was annotated for it and 0 otherwise. Since error prediction and prompt adherence scoring are related problems, the features we associate with this instance are features 1−6, which we have described earlier in this section. The classifier is then used to generate probabilities telling us how likely it is that each test essay has error e.

Then, when training our regressor for prompt adherence scoring, we add five additional features to our instances, one for each error. For a training essay, each feature is binary, indicating the presence or absence of the corresponding error; for a test essay, the feature takes on a real value from 0 to 1 indicating how likely the classifier thought it was that the essay had that error.

4See our website at http://www.hlt.utdallas.edu/~persingq/ICLE/ for the complete list of error annotations.
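A minimal sketch of the error-prediction step just described, with scikit-learn's LogisticRegression standing in for MALLET's maximum entropy classifier; X_train, error_labels (a 0/1 matrix with one column per error), and X_test are assumed to hold the feature 1−6 representations of a single prompt's essays.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def thesis_error_features(X_train, error_labels, X_test):
        """Return the five per-error features: gold 0/1 values for training essays,
        predicted probabilities for test essays."""
        test_cols = []
        for e in range(error_labels.shape[1]):           # five thesis clarity errors
            clf = LogisticRegression(max_iter=1000).fit(X_train, error_labels[:, e])
            test_cols.append(clf.predict_proba(X_test)[:, 1])  # P(essay has error e)
        return error_labels.astype(float), np.column_stack(test_cols)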

5.5 Evaluation

In this section, we evaluate our system for prompt adherence scoring. All the results we report are obtained via five-fold cross-validation experiments. In each experiment, we use 3/5 of our labeled essays for model training, another 1/5 for parameter tuning, and the final 1/5 for testing.

5.5.1 Experimental Setup

Scoring Metrics

We employ four evaluation metrics. As we will see below, S1, S2, and S3 are error metrics, so lower scores imply better performance. In contrast, PC is a correlation metric, so higher correlation implies better performance.

The simplest metric, S1, measures the frequency at which a system predicts the wrong score out of the seven possible scores. Hence, a system that predicts the right score only 25% of the time would receive an S1 score of 0.75.

The S2 metric measures the average distance between a system's score and the actual score. This metric reflects the idea that a system that predicts scores close to the annotator-assigned scores should be preferred over a system whose predictions are further off, even if both systems estimate the correct score at the same frequency.

The S3 metric measures the average square of the distance between a system's score predictions and the annotator-assigned scores. The intuition behind this metric is that not only should we prefer a system whose predictions are close to the annotator scores, but we should also prefer one whose predictions are not too frequently very far away from the annotator scores. These three scores are given by:

S1 = \frac{1}{N} \sum_{j : A_j \neq E'_j} 1, \qquad S2 = \frac{1}{N} \sum_{j=1}^{N} |A_j - E_j|, \qquad S3 = \frac{1}{N} \sum_{j=1}^{N} (A_j - E_j)^2,

where A_j, E_j, and E'_j are the annotator assigned, system predicted, and rounded system predicted scores5 respectively for essay j, and N is the number of essays.

The last metric, PC, computes Pearson's correlation coefficient between a system's predicted scores and the annotator-assigned scores. PC ranges from −1 to 1. A positive (negative) PC implies that the two sets of predictions are positively (negatively) correlated.
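For concreteness, a small sketch of the four metrics as defined above; gold holds the annotator-assigned scores, pred the regressor's real-valued outputs, and rounding to the nearest permissible score is applied only for S1, as footnote 5 describes.

    import numpy as np

    VALID_SCORES = np.arange(1.0, 4.01, 0.5)          # 1.0, 1.5, ..., 4.0

    def evaluate(gold, pred):
        gold = np.asarray(gold, dtype=float)
        pred = np.asarray(pred, dtype=float)
        clipped = np.clip(pred, 1.0, 4.0)             # used for S2, S3, and PC
        nearest = VALID_SCORES[np.abs(VALID_SCORES[None, :] - pred[:, None]).argmin(axis=1)]
        s1 = np.mean(nearest != gold)                 # fraction of wrong (rounded) scores
        s2 = np.mean(np.abs(gold - clipped))          # average distance
        s3 = np.mean((gold - clipped) ** 2)           # average squared distance
        pc = np.corrcoef(gold, clipped)[0, 1]         # Pearson's correlation coefficient
        return s1, s2, s3, pc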

Parameter Tuning

As mentioned earlier, for each prompt p_i, we train a linear regressor r_i using LIBSVM with regularization parameter c_i. To optimize our system's performance on the three error measures described previously, we use held-out validation data to independently tune each of the c_i values.6 Note that each of the c_i values can be tuned independently because a c_i value that is optimal for predicting scores for p_i essays with respect to any of the error performance measures is necessarily also the optimal c_i when measuring that error on essays from all prompts. However, this is not the case with Pearson's correlation coefficient, as the PC value for essays from all 13 prompts cannot be expressed as a weighted sum of the PC values obtained on each individual prompt. In order to obtain an optimal result as measured by PC, we jointly tune the c_i parameters to optimize the PC value achieved by our system on the same held-out validation data. However, an exact solution to this optimization problem is computationally expensive, as there are too many (7^13) possible combinations of c values to exhaustively search. Consequently, we find a local maximum by employing the simulated annealing algorithm (Kirkpatrick et al., 1983), altering one c_i value at a time to optimize PC while holding the remaining parameters fixed.

5Since our regressor assigns each essay a real value rather than an actual valid score, it would be difficult to obtain a reasonable S1 score without rounding the system estimated score to one of the possible values. For that reason, we round the estimated score to the nearest of the seven scores the human annotators were permitted to assign (1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0) only when calculating S1. For other scoring metrics, we only round the predictions to 1.0 or 4.0 if they fall outside the 1.0−4.0 range.

Table 5.3. Five-fold cross-validation results for prompt adherence scoring.

System        S1    S2    S3    PC
Baseline     .517  .368  .234  .233
Our System   .488  .348  .197  .360
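Returning to the parameter tuning procedure described above, the following is a minimal sketch of the coordinate-wise simulated annealing search over the 13 c_i values. The helper pc_on_validation(c_values), which would retrain the regressors and return PC on the validation data, is a hypothetical stand-in for the actual training pipeline, and the step count, temperature, and cooling rate are illustrative.

    import math
    import random

    C_GRID = [10.0 ** k for k in range(7)]            # the seven candidate c values

    def anneal_pc(pc_on_validation, num_prompts=13, steps=500, temp=0.05, cooling=0.99):
        current = [random.choice(C_GRID) for _ in range(num_prompts)]
        current_pc = pc_on_validation(current)
        best, best_pc = list(current), current_pc
        for _ in range(steps):
            candidate = list(current)
            candidate[random.randrange(num_prompts)] = random.choice(C_GRID)  # alter one c_i
            cand_pc = pc_on_validation(candidate)
            # Always accept improvements; occasionally accept worse moves (with a
            # probability that shrinks as the temperature cools) to escape local maxima.
            if cand_pc > current_pc or random.random() < math.exp((cand_pc - current_pc) / temp):
                current, current_pc = candidate, cand_pc
            if current_pc > best_pc:
                best, best_pc = list(current), current_pc
            temp *= cooling
        return best, best_pc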

5.5.2 Results and Discussion

Five-fold cross-validation results on prompt adherence score prediction are shown in Table 5.3. On the first line, this table shows that our baseline system, which uses only the RI-based features described in Section 5.4.2, predicts the wrong score 51.7% of the time. Its predictions are off by an average of .368 points, and the average squared distance between its predicted score and the actual score is .234. In addition, its predicted scores and the actual scores have a Pearson correlation coefficient of 0.233.

6For parameter tuning, we employ the following values: c_i may be assigned any of the values 10^0, 10^1, 10^2, 10^3, 10^4, 10^5, or 10^6.

The results from our system, which uses all seven feature types described in Section 5.4, are shown in row 2 of the table. Our system obtains S1, S2, S3, and PC scores of .488, .348, .197, and .360 respectively, yielding a significant improvement over the baseline with respect to S2, S3, and PC with p < 0.05, p < 0.01, and p < 0.06 respectively.7 While our system yields improvements by all four measures, its improvement over the baseline S1 score is not significant. These results mean that the greatest improvements our system makes are that it ensures that our score predictions are not too often very far away from an essay's actual score, as making such predictions would tend to drive up S3, yielding a relative error reduction in S3 of 15.8%, and it also ensures a better correlation between predicted and actual scores, thus yielding the 16.6% improvement in PC.8 It also gives more modest improvements in how frequently exactly the right score is predicted (S1) and is better at predicting scores closer to the actual scores (S2).
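As a quick sanity check, plugging the Table 5.3 scores into the relative error reduction formula from footnote 8 reproduces the reported figures:

\frac{B - O}{B - P} = \frac{0.234 - 0.197}{0.234 - 0} \approx 0.158 \ \text{(S3)}, \qquad \frac{0.233 - 0.360}{0.233 - 1} \approx 0.166 \ \text{(PC)}.

For S2, the same formula gives a smaller reduction of about 5.4%.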

5.5.3 Feature Ablation

To gain insight into how much impact each of the feature types has on our system, we perform feature ablation experiments in which we remove the feature types from our system one-by-one.

Results of the ablation experiments when performed using the four scoring metrics are shown in Table 5.4. The top line of each subtable shows what our system's score would be if we removed just one of the feature types from our system. So to see how our system performs by the S1 metric if we remove only predicted thesis clarity error features, we would look at the first row of results of Table 5.4(a) under the column headed by the number 7, since predicted thesis clarity errors are the seventh feature type introduced in Section 5.4. The number here tells us that our system's S1 score without this feature type is .502. Since Table 5.3 shows that when our system includes this feature type (along with all the other feature types), it obtains an S1 score of .488, this feature type's removal costs our system .014 S1 points, and thus its inclusion has a beneficial effect on the S1 score.

7All significance tests are paired t-tests.

8These numbers are calculated as (B − O)/(B − P), where B is the baseline system's score, O is our system's score, and P is a perfect score. Perfect scores for the error measures and PC are 0 and 1 respectively.

From row 1 of Table 5.4(a), we can see that removing feature 4 yields a system with the best S1 score in the presence of the other feature types in this row. For this reason, we permanently remove feature 4 from the system before we generate the results on line 2. Thus, we can see what happens when we remove both feature 4 and feature 5 by looking at the second entry in row 2. And since removing feature 6 harms performance least in the presence of row 2's other feature types, we permanently remove both 4 and 6 from our feature set when we generate the third row of results. We iteratively remove the feature type that yields a system with the best performance in this way until we get to the last line, where only one feature type is used to generate each result.

Since the feature type whose removal yields the best system is always the rightmost entry in a line, the order of column headings indicates the relative importance of the feature types, with the leftmost feature types being most important to performance and the rightmost feature types being least important in the presence of the other feature types. This being the case, it is interesting to note that while the relative importance of different feature types does not remain exactly the same if we measure performance in different ways, we can see that some feature types tend to be more important than others in a majority of the four scoring metrics. Features 2 (n-grams), 3 (thesis clarity keywords), and 6 (manually annotated LDA topics) tend to be the most important feature types, as they tend to be the last feature types removed in the ablation subtables. Features 1 (RI) and 5 (LDA topics) are of middling importance, with neither ever being removed first or last, and each tending to have a moderate effect on performance. Finally, while features 4 (prompt adherence keywords) and 7 (predicted thesis clarity errors) may by themselves provide useful information to our system, in the presence of the other feature types they tend to be the least important to performance as they are often the first feature types removed.

Table 5.4. In each subtable, the first row shows how our system would perform if each feature type was removed. We remove the least important feature type, and show in the next row how the adjusted system would perform without each remaining type. For brevity, a feature type is referred to by its feature number: (1) RI; (2) n-grams; (3) thesis clarity keywords; (4) prompt adherence keywords; (5) LDA topics; (6) manually annotated LDA topics; and (7) predicted thesis clarity errors.

(a) Results using the S1 metric
   3     5     1     7     2     6     4
 .527  .502  .512  .502  .511  .500  .488
 .527  .502  .512  .501  .513  .500
 .525  .508  .505  .505  .504
 .513  .527  .520  .513
 .523  .520  .506
 .541  .527

(b) Results using the S2 metric
   2     6     3     1     4     5     7
 .356  .350  .348  .350  .349  .348  .348
 .351  .349  .348  .348  .348  .347
 .351  .349  .348  .348  .347
 .350  .349  .348  .348
 .358  .351  .349
 .362  .352

(c) Results using the S3 metric
   2     6     1     5     4     7     3
 .221  .201  .197  .197  .197  .197  .196
 .215  .201  .197  .196  .196  .196
 .212  .203  .199  .197  .196
 .212  .203  .199  .197
 .212  .203  .199
 .223  .204

(d) Results using the PC metric
   6     3     2     1     7     5     4
 .326  .332  .303  .344  .348  .348  .361
 .326  .332  .304  .343  .348  .348
 .324  .337  .292  .345  .352
 .322  .337  .297  .346
 .316  .321  .323
 .218  .325

While there is a tendency for some feature types to always be important (or unimportant) regardless of which scoring metric is used to measure performance, the relative importance of different feature types does not always remain consistent if we measure performance in different ways. For example, while we identified feature 3 (thesis clarity keywords) as one of the most important feature types generally due to its tendency to have a large beneficial impact on performance, when we are measuring performance using S3, it is the least useful feature type. Furthermore, its removal increases the S3 score by a small amount, meaning that its inclusion actually makes our system perform worse with respect to S3. Though feature 3 is an extreme example, all feature types fluctuate in importance, as we see when we compare their orders of removal among the four ablation subtables. Hence, it is important to know how performance is measured when building a system for scoring prompt adherence.

Feature 3 is not the only feature type whose removal sometimes has a beneficial impact on performance. As we can see in Table 5.4(b), the removal of features 4, 5, and 7 improves our system's S2 score by .001 points. The same effect occurs in Table 5.4(c) when we remove features 4, 7, and 3. These examples illustrate that under some scoring metrics, the inclusion of some feature types is actively harmful to performance. Fortunately, this effect does not occur in any other cases than the two listed above, as most feature types usually have a beneficial or at least neutral impact on our system's performance.

For those feature types whose effect on performance is neutral in the first lines of ablation results (feature 4 in S1, features 3, 5, and 7 in S2, and features 1, 4, 5, and 7 in S3), it is important to note that their neutrality does not mean that they are unimportant. It merely means that they do not improve performance in the presence of other feature types. We can see this is the case by noting that they are not all the least important feature types in their respective subtables as indicated by column order. For example, by the time feature 1 gets permanently removed in Table 5.4(c), its removal harms performance by .002 S3 points.

5.5.4 Analysis of Predicted Scores

To more closely examine the behavior of our system, in Table 5.5 we chart the distributions of scores it predicts for essays having each gold standard score. As an example of how to read this table, consider the number 3.06 appearing in row 2.0 in the .25 column of the S3 region. This means that 25% of the time, when our system with parameters tuned for optimizing S3 is presented with a test essay having a gold standard score of 2.0, it predicts that the essay has a score less than or equal to 3.06. From this table, we see that our system has a strong bias toward predicting more frequent scores, as there are no numbers less than 3.0 in the table, and about 93.7% of all essays have gold standard scores of 3.0 or above. Nevertheless, our system does not rely entirely on bias, as evidenced by the fact that each column in the table has a tendency for its scores to ascend as the gold standard score increases, implying that our system has some success at predicting lower scores for essays with lower gold standard prompt adherence scores. Another interesting point to note about this table is that the difference in error weighting between the S2 and S3 scoring metrics appears to be having its desired effect, as every entry in the S3 subtable is less than its corresponding entry in the S2 subtable due to the greater penalty the S3 metric imposes for predictions that are very far away from the gold standard scores.

Table 5.5. Distribution of predicted prompt adherence scores.

           S1                 S2                 S3                 PC
Gold   .25   .50   .75    .25   .50   .75    .25   .50   .75    .25   .50   .75
2.0   3.35  3.56  3.79   3.40  3.52  3.73   3.06  3.37  3.64   3.06  3.37  3.64
2.5   3.43  3.63  3.80   3.25  3.52  3.79   3.24  3.45  3.67   3.24  3.46  3.73
3.0   3.64  3.78  3.85   3.56  3.70  3.90   3.52  3.65  3.74   3.52  3.66  3.79
3.5   3.73  3.81  3.88   3.63  3.78  3.90   3.59  3.70  3.81   3.60  3.74  3.85
4.0   3.76  3.84  3.88   3.70  3.83  3.90   3.63  3.75  3.84   3.66  3.78  3.88

5.6 Summary

We proposed a feature-rich approach to the under-investigated problem of predicting essay- level prompt adherence scores on student essays. In an evaluation on 830 argumentative essays selected from the ICLE corpus, our system significantly outperformed a Random Indexing based baseline by several evaluation metrics. To stimulate further research on this task, we make all our annotations, including our prompt adherence scores, the LDA topic annotations, and the error annotations publicly available.

CHAPTER 6

ARGUMENT QUALITY SCORING1

6.1 Introduction

Automated essay scoring, the task of employing computer technology to evaluate and score written text, is one of the most important educational applications of natural language processing (NLP) (see Shermis and Burstein (2003) and Shermis et al. (2010) for an overview of the state of the art in this task). A major weakness of many existing scoring engines such as the Intelligent Essay Assessor™ (Landauer et al., 2003) is that they adopt a holistic scoring scheme, which summarizes the quality of an essay with a single score and thus provides very limited feedback to the writer. In particular, it is not clear which dimension of an essay (e.g., style, coherence, relevance) a score should be attributed to. Recent work addresses this problem by scoring a particular dimension of essay quality such as coherence (Miltsakaki and Kukich, 2004), technical errors, relevance to prompt (Higgins et al., 2004; Persing and Ng, 2014), organization (Persing et al., 2010), and thesis clarity (Persing and Ng, 2013). Essay grading software that provides feedback along multiple dimensions of essay quality such as E-rater/Criterion (Attali and Burstein, 2006) has also begun to emerge.

Our goal in this chapter is to develop a computational model for scoring the essay dimension of argument quality, which is arguably the most important aspect of argumentative essays. Argument quality refers to the quality of the argument an essay makes for its thesis. An essay with a high argument quality score presents a sound argument for its thesis and would convince most readers.

While there has been work on designing argument schemes (e.g., Burstein et al. (2003), Song et al. (2014), Stab and Gurevych (2014a)) for annotating arguments manually (e.g., Song et al. (2014), Stab and Gurevych (2014b)) and automatically (e.g., Falakmasir et al. (2014), Song et al. (2014)) in student essays, little work has been done on scoring the argument quality of student essays. It is worth mentioning that some work has investigated the use of automatically determined argument labels for heuristic (Ong et al., 2014) and learning-based (Song et al., 2014) essay scoring, but their focus is holistic essay scoring, not argument quality essay scoring.

In sum, our contributions in this chapter are two-fold. First, we develop a scoring model for the argument quality dimension on student essays using a feature-rich approach. Second, in order to stimulate further research on this task, we make our data set consisting of argument quality annotations of 1000 essays publicly available. Since progress in argument quality modeling is hindered in part by the lack of a publicly annotated corpus, we believe that our data set will be a valuable resource to the NLP community.

1This chapter was previously published in Persing and Ng (2015).

6.2 Corpus Information

We use as our corpus the 4.5 million word International Corpus of Learner English (ICLE) (Granger et al., 2009) described in Chapter 3.2. 91% of the ICLE texts are written in response to prompts that trigger argumentative essays. We select 10 such prompts, and from the subset of argumentative essays written in response to them, we select 1000 essays to annotate for training and testing of our essay argument quality scoring system. Table 3.1 shows several of the 10 topics selected for annotation. Fifteen native languages are represented in the set of annotated essays.

6.3 Corpus Annotation

We ask human annotators to score each of the 1000 argumentative essays along the argument quality dimension. Our annotators were selected from over 30 applicants who were familiarized with the scoring rubric and given sample essays to score. The six who were most consistent with the expected scores were given additional essays to annotate. Annotators evaluated the argument quality of each essay using a numerical score from one to four at half-point increments (see Table 6.1 for a description of each score).2 This contrasts with previous work on essay scoring, where the corpus is annotated with a binary decision (i.e., good or bad) for a given scoring dimension (e.g., Higgins et al. (2004)). Hence, our annotation scheme not only provides a finer-grained distinction of argument quality (which can be important in practice), but also makes the prediction task more challenging.

Table 6.1. Description of each argument quality score.

Score  Description of Argument Quality
4      essay makes a strong argument for its thesis and would convince most readers
3      essay makes a decent argument for its thesis and could convince some readers
2      essay makes a weak argument for its thesis or sometimes even argues against it
1      essay does not make an argument or it is often unclear what the argument is

To ensure consistency in annotation, we randomly select 846 essays to be graded by multiple annotators. Though annotators exactly agree on the argument quality score of an essay only 26% of the time, the scores they apply fall within 0.5 points in 67% of essays and within 1.0 point in 89% of essays. For the sake of our experiments, whenever the two annotators disagree on an essay's argument quality score, we assign the essay the average of the two scores rounded down to the nearest half point. Table 6.2 shows the number of essays that receive each of the seven scores for argument quality.

Table 6.2. Distribution of argument quality scores.

score    1.0  1.5  2.0  2.5  3.0  3.5  4.0
essays     2   21  116  342  372  132   15

2See our website at http://www.hlt.utdallas.edu/~persingq/ICLE/ for the complete list of argument quality annotations.

6.4 Score Prediction

We cast the task of predicting an essay's argument quality score as a regression problem. Using regression captures the fact that some pairs of scores are more similar than others (e.g., an essay with an argument quality score of 2.5 is more similar to an essay with a score of 3.0 than it is to one with a score of 1.0). A classification system, by contrast, may sometimes believe that the scores 1.0 and 4.0 are most likely for a particular essay, even though these scores are at opposite ends of the score range. In the rest of this section, we describe how we train and apply our regressor.

Training the regressor. Each essay in the training set is represented as an instance whose label is the essay's gold score (one of the values shown in Table 6.2), with a set of baseline features (Section 6.5) and up to seven other feature types we propose (Section 6.6).

After creating training instances, we train a linear regressor with regularization parameter c for scoring test essays using the linear SVM regressor implemented in the LIBSVM software package (Chang and Lin, 2001). All SVM-specific learning parameters are set to their default values except c, which we tune to maximize performance on held-out validation data.3

Applying the regressor. After training the regressor, we use it to score the test set essays. Test instances are created in the same way as the training instances. The regressor may assign an essay any score in the range of 1.0−4.0.

6.5 Baseline Systems

In this section, we describe two baseline systems for predicting essays’ argument quality scores.

3For parameter tuning, we employ the following c values: 10^0, 10^1, 10^2, 10^3, 10^4, 10^5, or 10^6.

6.5.1 Baseline 1: Most Frequent Baseline

Since there is no existing system specifically for scoring argument quality, we begin by designing a simple baseline. When examining the score distribution shown in Table 6.2, we notice that, while there exist at least a few essays having each of the seven possible scores, the essays are most densely clustered around scores 2.5 and 3.0. A system that always predicts one of these two scores will very frequently be right. For this reason, we develop a most frequent baseline. Given a training set, Baseline 1 counts the number of essays assigned to each of the seven scores. From these counts, it determines which score is most frequent and assigns this most frequent score to each test essay.

6.5.2 Baseline 2: Learning-based Ong et al.

Our second baseline is a learning-based version of Ong et al.'s (2014) system. Recall from the introduction that Ong et al. presented a rule-based approach to predict the holistic score of an argumentative essay. Their approach was composed of two steps. First, they constructed eight heuristic rules for automatically labeling each of the sentences in their corpus with exactly one of the following argument labels: Opposes, Supports, Citation, Claim, Hypothesis, Current Study, or None. After that, they employed these sentence labels to construct five heuristic rules to holistically score a student essay.

We create Baseline 2 as follows, employing the methods described in Section 6.4 for training, parameter tuning, and testing. We employ Ong et al.'s method to tag each sentence of our essays with an argument label, but modify their method to accommodate differences between their and our corpus. In particular, our more informal corpus does not contain Current Study or Citation sentences, so we removed portions of rules that attempt to identify these labels (e.g., portions of rules that search for a four-digit number, as would appear as the year in a citation). Our resulting rule set is shown in Table 6.3. If more than one of these rules applies to a sentence, we tag it with the label from the earliest rule that applies.

Table 6.3. Sentence labeling rules.

#  Rule
1  Sentences that begin with a comparison discourse connective or contain any string prefixes from "conflict" or "oppose" are tagged Opposes.
2  Sentences that begin with a contingency connective are tagged Supports.
3  Sentences containing any string prefixes from "suggest", "evidence", "shows", "Essentially", or "indicate" are tagged Claim.
4  Sentences in the first, second, or last paragraph that contain string prefixes from "hypothes" or "predict", but do not contain string prefixes from "conflict" or "oppose", are tagged Hypothesis.
5  Sentences containing the word "should" that contain no contingency connectives or string prefixes from "conflict" or "oppose" are also tagged Hypothesis.
6  If the previous sentence was tagged Hypothesis and this sentence begins with an expansion connective, it is also tagged Hypothesis.
7  Do not apply a label to this sentence.

After labeling all the sentences in our corpus, we then convert three of their five heuristic scoring rules into features for training a regressor.4 The resulting three features describe (1) whether an essay contains at least one sentence labeled Hypothesis, (2) whether it contains at least one sentence labeled Opposes, and (3) the sum of Claim sentences and Supports sentences divided by the number of paragraphs in the essay. If the value of the last feature exceeds 1, we instead assign it a value of 1. These features make sense because, for example, we would expect essays containing lots of Supports sentences to offer stronger arguments.
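A small sketch of this conversion is shown below; sentence_labels (the rule-assigned label of each sentence, or None) and num_paragraphs are assumed to come from the labeling rules in Table 6.3 and a simple paragraph splitter.

    def baseline2_features(sentence_labels, num_paragraphs):
        """Turn per-sentence argument labels into the three Baseline 2 features."""
        has_hypothesis = float(any(lab == "Hypothesis" for lab in sentence_labels))
        has_opposes = float(any(lab == "Opposes" for lab in sentence_labels))
        support = sum(1 for lab in sentence_labels if lab in ("Claim", "Supports"))
        ratio = min(1.0, support / num_paragraphs)   # capped at 1, as described above
        return [has_hypothesis, has_opposes, ratio]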

6.6 Our Approach

Our approach augments the feature set available to Baseline 2 with seven types of novel features.

4We do not apply the remaining two of their heuristic scoring rules because they deal solely with current studies and citations.

1. POS N-grams (POS) Word n-grams, though commonly used as features for training text classifiers, are typically not used in automated essay grading. The reason is that any list of word n-gram features automatically compiled from a given set of training essays would be contaminated with prompt-specific n-grams that may make the resulting regressor generalize less well to essays written for new prompts. To generalize our feature set in a way that does not risk introducing prompt-dependent features, we introduce POS n-gram features. Specifically, we construct one feature from each sequence of 1−5 part-of-speech tags appearing in our corpus. In order to obtain one of these features' values for a particular essay, we automatically label each essay with POS tags using the Stanford CoreNLP system (Manning et al., 2014), then count the number of times the POS tag sequence occurs in the essay. An example of a useful feature of this type is "CC NN ,", as it is able to capture when a student writes either "for instance," or "for example,". We normalize each essay's set of POS n-gram features to unit length.
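A sketch of this feature type follows; NLTK's tagger is used as a stand-in for the Stanford CoreNLP tagger (and assumes the usual NLTK tokenizer and tagger models are installed).

    from collections import Counter
    import math
    import nltk

    def pos_ngram_features(essay_text, n_max=5):
        """Count POS tag sequences of length 1-5 and scale the vector to unit length."""
        tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(essay_text))]
        counts = Counter()
        for n in range(1, n_max + 1):
            for i in range(len(tags) - n + 1):
                counts[" ".join(tags[i:i + n])] += 1
        norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
        return {ngram: count / norm for ngram, count in counts.items()}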

2. Semantic Frames (SFR) While POS n-grams provide syntactic generalizations of word n-grams, FrameNet-style semantic role labels provide semantic generalizations. For each essay in our data set, we employ SEMAFOR (Das et al., 2010) to identify each semantic frame occurring in the essay as well as each frame element that participates in it. For example, a semantic frame may describe an event that occurs in a sentence, and the event’s frame elements may be the people or objects that participate in the event. For a more concrete example, consider the sentence “I said that I do not believe that it is a good idea”. This sentence contains a Statement frame because a statement is made in it. One of the frame elements participating in the frame is the Speaker “I”. From this frame, we would extract a feature pairing the frame together with its frame element to get the feature “Statement- Speaker-I”. We would expect this feature to be useful for argument quality scoring because we noticed that essays that focus excessively on the writer’s personal opinions and experiences tended to receive lower argument quality scores.

As with POS n-grams, we normalize each essay's set of Semantic Frame features to unit length.

3. Transitional Phrases (TRP) We hypothesize that a more cohesive essay, being easier for a reader to follow, is more persuasive, and thus makes a stronger argument. For this reason, it would be worthwhile to introduce features that measure how cohesive an essay is.

Consequently, we create features based on the 149 transitional phrases compiled by Study Guides and Strategies.5 Study Guides and Strategies collected these transitions into lists of phrases that are useful for different tasks (e.g., a list of transitional phrases for restating points such as "in essence" or "in short"). There are 14 such lists, which we use to generalize transitional features. Particularly, we construct a feature for each of the 14 phrase type lists. For each essay, we assign the feature a value indicating the average number of transitions from the list that occur in the essay per sentence. Despite being phrase-based, transitional phrase features are designed to capture only prompt-independent information, which as previously mentioned is important in essay grading.
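A minimal sketch of these 14 features, assuming phrase_lists holds the 14 transition lists and using a deliberately simple period-based sentence split:

    def transition_features(essay_text, phrase_lists):
        """One feature per phrase list: average transitions per sentence."""
        sentences = [s for s in essay_text.split(".") if s.strip()]
        essay_lower = essay_text.lower()
        features = []
        for phrases in phrase_lists:              # 14 lists, one feature each
            occurrences = sum(essay_lower.count(p.lower()) for p in phrases)
            features.append(occurrences / max(1, len(sentences)))
        return features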

4. Coreference (COR) As mentioned in our discussion of transitional phrases, a strong argument must be cohesive so that the reader can understand what is being argued. While the transitional phrases already capture one aspect of this, they cannot capture when transitions are made via repeated mentions of the same entities in different sentences. We therefore introduce a set of 19 coreference features that capture information such as the fraction of an essay's sentences that mention entities introduced in the prompt, and the average number of total mentions per sentence.6 Calculating these feature values, of course, requires that the text be annotated with coreference information. We automatically coreference-annotate the essays using the Stanford CoreNLP system.

5http://www.studygs.net/wrtstr6.htm

6See our website at http://www.hlt.utdallas.edu/~persingq/ICLE/ for a complete list of coreference features.

5. Prompt Agreement (PRA) An essay's prompt is always either a single statement, or can be split up into multiple statements with which a writer may Agree Strongly, Agree Somewhat, be Neutral, Disagree Somewhat, Disagree Strongly, Not Address, or explicitly have No Opinion on. We believe information regarding which of these categories a writer's opinion falls into has some bearing on the quality of her argument because, for example, a writer who explicitly mentions having no opinion has probably not made a persuasive argument. For this reason, we annotate a subset of 830 of our ICLE essays with these agreement labels. We then train a multiclass maximum entropy classifier using MALLET (McCallum, 2002) for identifying which one of these seven categories an author's opinion falls into. The feature set we use for this task includes POS n-gram and semantic frame features as described earlier in this section, lemmatized word 1-3 grams, the keyword and prompt adherence keyword features we described in Persing and Ng (2013) and Persing and Ng (2014), respectively, and a feature indicating which statement in the prompt we are attempting to classify the author's agreement level with respect to. Our classifier's training set in this case is the subset of prompt agreement annotated essays that fall within the training set of our 1000 essay argument quality annotated data. We then apply the trained classifier to our entire 1000 essay set in order to obtain predictions from which we can then construct features for argument quality scoring. For each prediction, we construct a feature indicating which of the seven classes the classifier believes is most likely, as well as seven additional features indicating the probability the classifier associates with each of the seven classes.

We produce additional related annotations on this 830 essay set in cases when the annotated opinion was neither Agree Strongly nor Disagree Strongly, as the reason the annotator chose one of the remaining five classes may sometimes offer insight into the writer's argument. The classes of reasons we annotate include cases when the writer: (1) offered Conflicting Opinions, (2) Explicitly Stated an agreement level, (3) gave only a Partial Response to the prompt, (4) argued a Subtler Point not capturable by extreme opinions, (5) did not make it clear that the Writer's Position matched the one she argued, (6) only Briefly Discussed the topic, (7) Confusingly Phrased her argument, or (8) wrote something whose Relevance to the topic was not clear. We believe that knowing which reason(s) apply to an argument may be useful for argument quality scoring because, for example, the Conflicting Opinions class indicates that the author wrote a confused argument, which probably deserves a lower argument quality score. We train eight binary maximum entropy classifiers, one for each of these reasons, using the same training data and feature set we use for agreement level prediction. We then use the trained classifiers to make predictions for these eight reasons on all 1000 essays. Finally, we generate features for our argument quality regressor from these predictions by constructing two features from each of the eight reasons. The first binary feature is turned on whenever the maximum entropy classifier believes that the reason applies (i.e., when it assigns the reason a probability of over 0.5). The second feature's value is the probability the classifier assigns for this reason.

6. Argument Component Predictions (ACP) Many of our features thus far do not result from an attempt to build a deep understanding of the structure of the arguments within our essays. To introduce such an understanding into our system, we follow Stab and Gurevych (2014a), who collected and annotated a corpus of 90 persuasive essays (not from the ICLE corpus) with the understanding that the arguments contained therein consist of three types of argument components. In one essay, these argument components typically include a Major Claim, several lesser Claims which usually support or attack the major claim, and Premises which usually underpin the validity of a claim or major claim.

Stab and Gurevych (2014b) trained a system to identify these three types of argument components within their corpus given the components' boundaries. Since our corpus does not contain annotated argument components, we modify their approach in order to simultaneously identify argument components and their boundaries.

We begin by implementing a maximum entropy version of their system using MALLET for performing the argument component identification task. We feed our system the same structural and lexical features they described. We then augment the system in the following ways.

First, since our corpus is not annotated with argument component boundaries, we construct a set of low precision, high recall heuristics for identifying the locations in each sentence where an argument component's boundaries might occur. The majority of these rules depend primarily on a syntactic parse tree we automatically generated for the sentence using the Stanford CoreNLP system. Since a large majority of annotated argument components are substrings of a simple declarative clause (an S node in the parse tree), we begin by identifying each S node in the sentence's tree.

Given one of these clauses, we collect a list of left and right boundaries where an argument component may begin or end. The rules we used to find these boundaries are summarized in Table 6.4.

Given an S node, we use our rules to construct up to l×r argument component candidate instances to feed into our system by combining each left boundary with each right boundary that occurs after it, where l is the number of potential left boundaries our rules found, and r is the number of right boundaries they found.
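A minimal sketch of this candidate generation step is shown below; the function and variable names are illustrative, and each boundary is represented as a (rule id, token index) pair so that the rule that found it can also be recorded (these rule ids are used as features, as described next):

    def candidate_spans(left_boundaries, right_boundaries):
        """Combine each left boundary with each right boundary occurring after it,
        yielding up to l x r candidate argument component spans."""
        candidates = []
        for l_rule, l_pos in left_boundaries:
            for r_rule, r_pos in right_boundaries:
                if r_pos > l_pos:
                    candidates.append({"span": (l_pos, r_pos),
                                       "left_rule": l_rule,
                                       "right_rule": r_rule})
        return candidates

    # Hypothetical boundaries found by the heuristics for one S node:
    lefts = [(1, 0), (4, 5)]     # rule 1 at token 0, rule 4 after a comma at token 5
    rights = [(7, 9), (5, 12)]   # rule 7 before a PP at token 9, rule 5 at token 12
    print(candidate_spans(lefts, rights))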

The second way we augment the system is by adding a boundary rule feature type.

Whenever we generate an argument component candidate instance, we augment its normal feature set with two binary features indicating which heuristic rule was used to find the candidate’s left boundary, and which rule was used to find its right boundary. If two rules

Table 6.4. Rules for extracting candidate argument component boundary locations.

(a) Potential left boundary locations
#  Rule
1  Exactly where the S node begins.
2  After an initial explicit connective, or if the connective is immediately followed by a comma, after the comma.
3  After the nth comma that is an immediate child of the S node.
4  After the nth comma.

(b) Potential right boundary locations
#  Rule
5  Exactly where the S node ends, or if S ends in a punctuation, immediately before the punctuation.
6  If the S node ends in a (possibly nested) SBAR node, immediately before the nth shallowest SBAR.6
7  If the S node ends in a (possibly nested) PP node, immediately before the nth shallowest PP.

can be used to find the same left or right boundary position, the first rule listed in the table is the one used to create the boundary rule feature. This is why, for example, the table contains multiple rules that can find boundaries at comma locations. We would expect some types of commas (e.g., ones following an explicit connective) to be more significant than others.

A last point that requires additional explanation is that several of the rules contain the word "nth". This means that, for example, if a sentence contains multiple commas, we will generate multiple left boundary positions for it using rule 4, and the left boundary rule feature associated with each position will be different (e.g., there is a unique feature for the first comma, another for the second comma, etc.).

The last augmentation we make to the system is that we apply a None label to all argument component candidates whose boundaries do not exactly match those of a gold standard argument component. While Stab and Gurevych also did this, their list of such argument component candidates consisted solely of sentences containing no argument components at

6The S node may end in an SBAR node which itself has an SBAR node as its last child, and so on. In this case, the S node could be said to end with any of these “nested” SBARs, so we use the position before each (nth) one as a right boundary.

all. We could not do this, however, since our corpus is not annotated with argument components and we therefore do not know which sentences these would be. We train our system on all the instances we generated from the 90 essay corpus and apply it to label all the instances we generated in the same way from our 1000 essay ICLE corpus. As a result, we end up with a set of automatically generated argument component annotations on our 1000 essay corpus. We use these annotations to generate five additional features for our argument quality scoring SVM regressor. These features' values are the number of major claims in the essay, the number of claims in the essay, the number of premises in the essay, the fraction of paragraphs that contain either a claim or a major claim, and the fraction of paragraphs that contain at least one argument component of any kind.
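A sketch of how these five feature values might be computed from the automatically generated annotations is given below; the input format (a list of predicted component labels per paragraph) is an assumption made for illustration:

    from collections import Counter

    def acp_features(paragraphs):
        """paragraphs: list of lists of predicted component labels, e.g.
        [["MajorClaim", "Premise"], ["Claim"], []] (hypothetical format)."""
        counts = Counter(label for para in paragraphs for label in para)
        n_paras = len(paragraphs)
        with_claim = sum(1 for para in paragraphs
                         if any(l in ("Claim", "MajorClaim") for l in para))
        with_any = sum(1 for para in paragraphs if para)
        return [counts["MajorClaim"],       # number of major claims in the essay
                counts["Claim"],            # number of claims
                counts["Premise"],          # number of premises
                with_claim / n_paras,       # fraction of paragraphs with a (major) claim
                with_any / n_paras]         # fraction with any argument component

    print(acp_features([["MajorClaim", "Premise"], ["Claim", "Premise"], []]))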

7. Argument Errors (ARE) We manually identified three common problems essays might have that tend to result in weaker arguments, and thus lower argument quality scores. We heuristically construct three features, one for each of these problems, to indicate to the learner when we believe an essay has one of these problems.

It is difficult to make a reasonably strong argument in an essay that is too short. For this reason, we construct a feature that encodes whether the essay has 15 or fewer sentences, as only about 7% of our essays are this short.

In the Stab and Gurevych corpus, only about 5% of paragraphs have no claims or major claims in them. We believe that an essay that contains too many of these claim or major claim-less paragraphs may have an argument that is badly structured, as it is typical for a paragraph to contain one or two (major) claim(s). For this reason, we construct a feature that encodes whether more than half of the essay's paragraphs contain no claims or major claims, as indicated by the previously generated automatic annotations.

Similarly, only 5% of the Stab and Gurevych paragraphs contain no argument components at all. We believe that an essay that contains too many of these component-less paragraphs

is likely to have taken too much space discussing issues that are not relevant to the main argument of the essay. For this reason, we construct a feature that encodes whether more than one of the essay's paragraphs contains no components, as indicated by the previously generated automatic annotations.
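The three heuristic error indicators can be sketched as follows (thresholds taken from the text above; the input format mirrors the ACP sketch and is an assumption):

    def argument_error_features(sentences, paragraphs):
        """sentences: list of essay sentences; paragraphs: predicted component
        labels per paragraph (as in the ACP sketch)."""
        too_short = len(sentences) <= 15
        no_claim = sum(1 for p in paragraphs
                       if not any(l in ("Claim", "MajorClaim") for l in p))
        too_many_claimless = no_claim > len(paragraphs) / 2
        empty = sum(1 for p in paragraphs if not p)
        too_many_empty = empty > 1
        return [float(too_short), float(too_many_claimless), float(too_many_empty)]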

6.7 Evaluation

In this section, we evaluate our system for argument quality scoring. All the results we report are obtained via five-fold cross-validation experiments. In each experiment, we use 60% of our labeled essays for model training, another 20% for parameter tuning and feature selection, and the final 20% for testing. These correspond to the training set, held-out validation data, and test set mentioned in Section 6.4.

6.7.1 Scoring Metrics

We employ four evaluation metrics. As we will see below, S1, S2, and S3 are error metrics, so lower scores on them imply better performance. In contrast, PC is a correlation metric, so higher correlation implies better performance. The simplest metric, S1, measures the frequency at which a system predicts the wrong score out of the seven possible scores. Hence, a system that predicts the right score only 25% of the time would receive an S1 score of 0.75. The S2 metric measures the average distance between a system’s predicted score and the actual score. This metric reflects the idea that a system that predicts scores close to the annotator-assigned scores should be preferred over a system whose predictions are further off, even if both systems estimate the correct score at the same frequency. The S3 metric measures the average square of the distance between a system’s score predictions and the annotator-assigned scores. The intuition behind this metric is that not only should we prefer a system whose predictions are close to the annotator scores, but

Table 6.5. Five-fold cross-validation results for argument quality scoring.

System       S1    S2    S3    PC
Baseline 1   .668  .428  .321  .000
Baseline 2   .652  .418  .267  .061
Our System   .618  .392  .244  .212

we should also prefer one whose predictions are not too frequently very far away from the annotated scores. The three error metric scores are given by:

\[
S1 = \frac{1}{N}\sum_{\substack{j=1\\A_j \neq E'_j}}^{N} 1, \qquad
S2 = \frac{1}{N}\sum_{j=1}^{N}\lvert A_j - E_j\rvert, \qquad
S3 = \frac{1}{N}\sum_{j=1}^{N}(A_j - E_j)^2,
\]
where $A_j$, $E_j$, and $E'_j$ are the annotator-assigned, system-predicted, and rounded system-predicted scores7, respectively, for essay j, and N is the number of essays. The last metric, PC, computes Pearson's correlation coefficient between a system's predicted scores and the annotator-assigned scores. PC ranges from −1 to 1. A positive (negative) PC implies that the two sets of predictions are positively (negatively) correlated.
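The four metrics can be computed in a few lines of Python; the sketch below assumes numpy and scipy and follows the rounding conventions of footnote 7 (the toy scores are invented for illustration):

    import numpy as np
    from scipy.stats import pearsonr

    def score_metrics(gold, predicted):
        gold = np.asarray(gold, dtype=float)
        pred = np.clip(np.asarray(predicted, dtype=float), 1.0, 4.0)
        rounded = np.round(pred * 2) / 2          # nearest half point, used for S1 only
        s1 = np.mean(gold != rounded)             # frequency of wrong scores
        s2 = np.mean(np.abs(gold - pred))         # mean absolute distance
        s3 = np.mean((gold - pred) ** 2)          # mean squared distance
        pc = pearsonr(gold, pred)[0]              # Pearson correlation coefficient
        return s1, s2, s3, pc

    print(score_metrics([2.0, 3.5, 3.0, 2.5], [2.4, 3.1, 3.0, 2.6]))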

6.7.2 Results and Discussion

Five-fold cross-validation results on argument quality score prediction are shown in Table 6.5. The first two rows show our baseline systems' performances. The best baseline system (Baseline 2), which, recall, is a learning-based version of Ong et al.'s (2014) system, predicts the wrong score 65.2% of the time. Its predictions are off by an average of .418 points, the average squared error of its predictions is .267, and its average Pearson correlation coefficient with the gold argument quality score across the five folds is .061. Results of our system are shown on the third row of Table 6.5. Rather than using all of the available features (i.e., Baseline 2's features and the novel features described earlier in this chapter),

7We round all predictions to 1.0 or 4.0 if they fall outside the 1.0−4.0 range and round S1 predictions to the nearest half point.

our system uses only the feature subset selected by the backward elimination feature selection algorithm (Blum and Langley, 1997) that achieves the best performance on the validation data (see Section 6.7.3 for details). As we can see, our system predicts the wrong score only 61.8% of the time, predicts scores that are off by an average of .392 points, has an average squared prediction error of .244, and achieves an average Pearson correlation coefficient with the gold scores of .212. These numbers correspond to relative error reductions8 of 5.2%, 6.2%, 8.6%, and 16.1% over Baseline 2 for S1, S2, S3, and PC, respectively, the last three of which are significant improvements.9 The magnitudes of these improvements suggest that, while our system yields improvements over the best baseline by all four measures, its greatest contribution is that its predicted scores are best-correlated with the gold standard argument quality scores.

6.7.3 Feature Ablation

To gain insight into how much impact each of the feature types has on our system, we perform feature ablation experiments in which we remove the feature types from our system one-by-one.

We show the results of the ablation experiments on the held-out validation data, as measured by the four scoring metrics, in Table 6.6. The top line of each subtable shows what the score of a system that uses all available features would be if we removed just one of the feature types. So, to see how our system performs by the PC metric if we remove only prompt agreement (PRA) features, we would look at the first row of results of Table 6.6(d) under the column headed by PRA. The number here tells us that the resulting system's PC score is .303. Since our system that uses all feature types obtains S1, S2, S3, and PC scores of

8These numbers are calculated as (B − O)/(B − P), where B is the baseline system's score, O is our system's score, and P is a perfect score. Perfect scores for the error measures and PC are 0 and 1, respectively.
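As a quick check, the relative error reductions reported above can be reproduced from Table 6.5 with this formula (a small illustrative calculation, not part of the system itself):

    def relative_error_reduction(baseline, ours, perfect):
        """(B - O) / (B - P), as defined in footnote 8."""
        return (baseline - ours) / (baseline - perfect)

    print(round(relative_error_reduction(.652, .618, 0.0), 3))  # S1: ~0.052
    print(round(relative_error_reduction(.418, .392, 0.0), 3))  # S2: ~0.062
    print(round(relative_error_reduction(.267, .244, 0.0), 3))  # S3: ~0.086
    print(round(relative_error_reduction(.061, .212, 1.0), 3))  # PC: ~0.161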

9All significance tests are paired t-tests with p < 0.05.

Table 6.6. Argument quality feature ablation results. In each subtable, the first row shows how our system would perform on the validation set essays if each feature type was removed. We then remove the least important feature type, and show in the next row how the adjusted system would perform without each remaining type.

(a) Results using the S1 metric
SFR   ACP   TRP   PRA   POS   COR   ARE   BAS
.534  .594  .530  .524  .522  .532  .529  .521
.530  .554  .526  .529  .526  .528  .525
.534  .555  .525  .531  .528  .522
.543  .558  .536  .530  .527
.565  .561  .536  .529
.563  .547  .539
.592  .550

(b) Results using the S2 metric
POS   PRA   ACP   TRP   BAS   SFR   COR   ARE
.370  .369  .375  .367  .367  .366  .366  .365
.369  .369  .375  .366  .366  .365  .365
.370  .371  .372  .367  .366  .365
.374  .374  .376  .368  .366
.377  .375  .374  .368
.381  .377  .376
.385  .382

(c) Results using the S3 metric
POS   PRA   ACP   TRP   BAS   COR   ARE   SFR
.221  .220  .225  .219  .218  .217  .217  .211
.220  .219  .221  .214  .212  .211  .211
.218  .218  .220  .212  .211  .209
.221  .216  .218  .212  .210
.224  .217  .218  .212
.228  .220  .219
.229  .225

(d) Results using the PC metric
POS   ACP   PRA   TRP   BAS   ARE   COR   SFR
.302  .270  .303  .326  .324  .347  .347  .356
.316  .300  .327  .344  .361  .366  .371
.346  .331  .341  .356  .367  .378
.325  .331  .345  .362  .375
.297  .331  .339  .360
.280  .320  .321
.281  .281

.521, .366, .218, and .341 on the validation data respectively, the removal of PRA features costs the complete system .038 PC points, and thus we can infer that the inclusion of PRA features has a beneficial effect. From row 1 of Table 6.6(a), we can see that removing the Baseline 2 feature set (BAS) yields a system with the best S1 score in the presence of the remaining feature types in this row. For this reason, we permanently remove the BAS features from the system before we generate the results on line 2. We iteratively remove the feature type that yields a system with the best performance in this way until we get to the last line, where only one feature type is used to generate each result. Since the feature type whose removal yields the best system is always the rightmost entry in a line, the order of column headings indicates the relative importance of the feature types, with the leftmost feature types being most important to performance and the rightmost

Table 6.7. Distribution of predicted argument quality scores.

            S1                 S2                 S3                 PC
Gold   .25  .50  .75      .25  .50  .75      .25  .50  .75      .25  .50  .75
1.0   2.90 2.90 2.90     2.74 2.74 2.74     2.74 2.74 2.74     2.74 2.74 2.74
1.5   2.69 2.78 2.89     2.36 2.67 2.78     2.52 2.63 2.71     2.52 2.63 2.81
2.0   2.61 2.72 2.85     2.54 2.69 2.79     2.60 2.69 2.78     2.60 2.70 2.80
2.5   2.64 2.71 2.85     2.65 2.75 2.86     2.66 2.75 2.85     2.69 2.79 2.89
3.0   2.73 2.84 2.92     2.71 2.81 2.91     2.70 2.80 2.90     2.72 2.83 2.90
3.5   2.74 2.85 2.97     2.78 2.89 3.02     2.79 2.90 3.00     2.81 2.90 2.98
4.0   2.75 2.87 3.10     2.76 2.85 3.09     2.76 2.83 3.08     2.81 2.86 3.19

feature types being least important in the presence of the other feature types. The score corresponding to the best system is boldfaced for emphasis, indicating that all feature types appearing to its left are included in the best system.10
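The greedy elimination procedure just described can be sketched as follows; the evaluate callable and the toy scores are hypothetical stand-ins for training a regressor on the training folds and scoring it on the validation data:

    def backward_elimination(groups, evaluate):
        """Greedily remove the feature group whose removal hurts least, keeping
        track of the best-scoring subset seen (higher score is better)."""
        remaining = list(groups)
        best_subset, best_score = list(remaining), evaluate(remaining)
        while len(remaining) > 1:
            score, to_remove = max((evaluate([g for g in remaining if g != r]), r)
                                   for r in remaining)
            remaining.remove(to_remove)
            if score > best_score:
                best_subset, best_score = list(remaining), score
        return best_subset, best_score

    # Hypothetical validation scores for each candidate subset:
    toy_scores = {frozenset(["POS", "PRA", "BAS"]): .32, frozenset(["POS", "PRA"]): .34,
                  frozenset(["POS", "BAS"]): .31, frozenset(["PRA", "BAS"]): .27,
                  frozenset(["POS"]): .30, frozenset(["PRA"]): .28}
    print(backward_elimination(["POS", "PRA", "BAS"], lambda s: toy_scores[frozenset(s)]))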

It is interesting to note that while the relative importance of different feature types does not remain exactly the same if we measure performance in different ways, we can see that some feature types tend to be more important than others in a majority of the four scoring metrics.

From these tables, it is clear that POS n-grams (POS), prompt agreement features (PRA), and argument component predictions (ACP) are the most generally important feature types, in roughly that order. They all appear in the leftmost three positions under the tables for metrics S2, S3, and PC, the three metrics by which our system significantly outperforms Baseline 2. Furthermore, removing any of them tends to have a larger negative impact on our system than removing any of the other feature types.

Transitional phrase features (TRP) and Baseline 2 features (BAS), by contrast, are of more middling importance. While both appear in the best feature sets for the aforementioned

10The reason the performances shown in these tables appear so much better than those shown previously is that in these tables we tune parameters and display results on the validation set in order to make it clearer why we chose to remove each feature type. In Table 6.5, by contrast, we tune parameters on the validation set, but display results using those parameters on the test set.

metrics (i.e., they appear to the left of the boldfaced entry in the corresponding ablation tables), the impact of their removal is relatively less than that of POS, PRA, or ACP features.

Finally, while the remaining three feature types might at first glance seem unimportant to argument quality scoring, it is useful to note that they all appear in the best performing feature set as measured by at least one of the four scoring metrics. Indeed, semantic frame features (SFR) appear to be the most important feature type as measured by the S1 metric, despite being one of the least useful feature types as measured by the other performance metrics. From this we learn that when designing an argument quality scoring system, it is important to understand what the ultimate goal is, as the choice of performance metric can have a large impact on what type of system will seem ideal.

6.7.4 Analysis of Predicted Scores

To more closely examine the behavior of our system, in Table 6.7 we chart the distributions of scores it predicts for essays having each gold standard score. As an example of how to read this table, consider the number 2.60 appearing in row 2.0 in the .25 column of the S3 region. This means that 25% of the time, when our system with parameters tuned for optimizing S3 (including the S3 feature set as selected in Table 6.6(c)) is presented with a test essay having a gold standard score of 2.0, it predicts that the essay has a score less than or equal to 2.60.

From this table, we see that our system has a bias toward predicting more frequent scores, as the smallest entry in the table is 2.36 and the largest entry is 3.19, and, as we saw in Table 6.2, 71.4% of essays have gold scores in this range. Nevertheless, our system does not rely entirely on bias, as evidenced by the fact that each column in the table has a tendency for its scores to ascend as the gold standard score increases, implying that our system has some success at predicting lower scores for essays with lower gold standard argument quality scores and higher scores for essays with higher gold standard argument quality scores. The major exception to this rule is line 1.0, but this is to be expected since there are only two essays having this gold score, so the sample from which the numbers on this line are calculated is very small.

6.8 Summary

We proposed a feature-rich approach to the new problem of predicting argument quality scores on student essays. In an evaluation on 1000 argumentative essays selected from the ICLE corpus, our system significantly outperformed a baseline system that relies solely on features derived from heuristically assigned sentence argument function labels, with relative error reductions of up to 16.1%. To stimulate further research on this task, we make all of our annotations publicly available.

CHAPTER 7

STANCE CLASSIFICATION1

7.1 Introduction

State-of-the-art automated essay scoring engines such as E-rater (Attali and Burstein, 2006) do not grade essay content, focusing instead on providing diagnostic trait feedback on categories such as grammar, usage, mechanics, style and organization. Hence, persuasiveness and other content-dependent dimensions of argumentative essay quality are largely ignored in existing automated essay scoring research. While full-fledged content-based essay scoring is still beyond the reach of state-of-the-art essay scoring engines, recent work has enabled us to move one step closer to this ambitious goal by analyzing essay content, attempting to determine the argumentative structure of student essays (Stab and Gurevych, 2014b) and the persuasiveness of the arguments made in these essays (Persing and Ng, 2015).

Stance classification is an important first step in determining how persuasive an argumentative student essay is because persuasiveness depends on how well the author argues w.r.t. the stance she takes using the supporting evidence she provides. For instance, if her stance is Agree Somewhat, a persuasive argument would involve explaining what reservations she has about the given proposition. As another example, an argumentative essay in which the author takes a neutral stance or the author presents evidence that does not support the stance she claims to take should receive a low persuasiveness score.

Given the important role played by stance classification in determining an essay's persuasiveness, our goal in this chapter is to examine stance classification in argumentative student essays. While there is a large body of work on stance classification,2 stance classification in argumentative essays is largely under-investigated and is different from previous

1This chapter was previously published in Persing and Ng (2016b).

2Previous approaches to stance classification have focused on three discussion/debate settings, namely congressional floor debates (Thomas et al., 2006; Bansal et al., 2008; Balahur et al., 2009; Yessenalina et al., 2010; Burfoot et al., 2011), company-internal discussions (Agrawal et al., 2003; Murakami and Raymond, 2010), and online social, political, and ideological debates (Wang and Rosé, 2010; Biran and Rambow, 2011; Walker et al., 2012; Abu-Jbara et al., 2013; Hasan and Ng, 2013; Boltužić and Šnajder, 2014; Sobhani et al., 2015; Sridhar et al., 2015).

work in several respects. First, in automated essay grading, the majority of the essays to be assessed are written by students who are learners of English. Hence our stance classification task could be complicated by the authors' lack of fluency in English. Second, essays are longer and more formally written than the text typically used in previous stance classification research (e.g., debate posts). In particular, a student essay writer typically expresses her stance on the essay's topic in a thesis sentence/clause, while a debate post's author may never even explicitly express her stance. Although the explicit expression of stance in essays seems to make our task easier, identifying stancetaking text in the midst of non-stancetaking sentences in a potentially long essay, as we will see, is by no means a trivial task.

To our knowledge, the essay stance classification task has only been attempted by Faulkner (2014). However, the version of the task we address is different from his. First, Faulkner only performed two-class stance classification: while his corpus contains essays labeled with For (Agree), Against (Disagree), and Neither, he simplified the task by leaving out the arguably most difficult-to-identify stance, Neither. In contrast, we perform fine-grained stance classification, where we allow essay stance to take one of six values: Agree Strongly, Agree Somewhat, Neutral, Disagree Somewhat, Disagree Strongly, and Never Addressed, given the practical need to perform fine-grained stance classification in student essays, as discussed above. Second, given that many essay prompts are composed of multiple simpler propositions (e.g., the prompt "Most university degrees are theoretical and do not prepare students for the real world" has two parts, "Most university degrees are theoretical" and "Most university degrees do not prepare students for the real world."), we manually split such prompts into prompt parts and determine the stance of the author w.r.t. each part, whereas Faulkner assigned an overall stance to a given prompt regardless of whether it is composed of multiple propositions. The distinction is important because an analysis of our annotations described in Section 7.2 shows that essay authors take different stances w.r.t. different prompt parts in 49% of essays, and in 39% of essays, authors even take stances with different polarities w.r.t. different prompt parts.

In sum, our contributions in this chapter are two-fold. First, we propose a computational model for essay stance classification that outperforms four baselines, including our re-implementation of Faulkner's approach. Second, in order to stimulate further research on this task, we make our annotations publicly available. Since progress on this task is hindered in part by the lack of a publicly annotated corpus, we believe that our data set will be a valuable resource for the NLP community.

7.2 Corpus

We use as our corpus the 4.5 million word International Corpus of Learner English (ICLE) (Granger et al., 2009) described in Section 3.2. 91% of the ICLE texts are written in response to prompts that trigger argumentative essays, and thus are expected to take a stance on some issue. We select 11 such prompts, and from the subset of argumentative essays written in response to them, we select 826 essays to annotate for training and testing our stance classification system.3 Table 7.1 shows two of the 11 topics selected for annotation.

We pair each of the 826 essays with each of the prompt parts to which it responds, resulting in 1,593 instances.4 We then familiarize two human annotators, both of whom are native speakers of English, with the stance definitions in Table 7.2 and ask them to assign each instance the stance label they believe the essay's author would have chosen if

3See our website at http://www.hlt.utdallas.edu/~persingq/ICLE/ for the complete list of essay stance annotations.

4We do not segment the essays’ texts according to which prompt part is being responded to. Each (entire) essay is viewed as a response to all of its associated prompt parts.

Table 7.1. Some example essay prompts with their associated parts.

Prompt: Most university degrees are theoretical and do not prepare students for the real world. They are therefore of very little value.
Prompt Parts:
1) Most university degrees are theoretical.
2) Most university degrees do not prepare students for the real world.
3) Most university degrees are of very little value.

Prompt: The prison system is outdated. No civilized society should punish its criminals: it should rehabilitate them.
Prompt Parts:
1) The prison system is outdated.
2) No civilized society should punish its criminals.
3) Civilized societies should rehabilitate criminals.

Table 7.2. Stance label counts and definitions.

Agree Strongly (885): The author seems to agree with and care about the claim.
Agree Somewhat (148): The author generally agrees with the claim, but might be hesitant to choose "Agree Strongly".
Neutral (28): The author agrees with the claim as much as s/he disagrees with it.
Disagree Somewhat (91): The author generally disagrees with the claim, but might be hesitant to choose "Disagree Strongly".
Disagree Strongly (416): The author seems to disagree with and care about the claim.
Never Addressed (25): A stance cannot be inferred because the proposition was never addressed.

asked how strongly she agrees with the prompt part. We additionally furnish the annotators with descriptions of situations that might cause an author to select the more ambiguous classes. For example, an author might choose Agree Somewhat if she appears to mostly agree with the prompt part, but qualifies her opinion in a way that is not captured by the prompt part’s bluntness (e.g., an author who claims the prison system in a lot of countries is outdated would Agree Somewhat with the first part of Table 7.1’s second prompt). Or she may choose Disagree Somewhat if she appears to disagree with the prompt part, but mentions the disagreement only in passing because she does not care much about the topic.

To ensure consistency in annotation, we randomly select 100 essays (187 instances) for annotation by both annotators. Their labels agree in 84.5% of the instances, yielding a Cohen's (1960) Kappa of 0.76. Each case of disagreement is resolved through discussion between the annotators.
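Agreement figures of this kind can be reproduced for any pair of annotations with a few lines of Python (assuming scikit-learn's cohen_kappa_score; the labels below are invented for illustration, not drawn from our annotations):

    from sklearn.metrics import cohen_kappa_score

    annotator_1 = ["Agree Strongly", "Disagree Strongly", "Agree Somewhat", "Agree Strongly"]
    annotator_2 = ["Agree Strongly", "Disagree Strongly", "Agree Strongly", "Agree Strongly"]

    raw = sum(a == b for a, b in zip(annotator_1, annotator_2)) / len(annotator_1)
    kappa = cohen_kappa_score(annotator_1, annotator_2)   # chance-corrected agreement
    print(raw, kappa)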

7.3 Baseline Stance Classification Systems

In this section, we describe four baseline systems.

7.3.1 Agree Strongly Baseline

Given the imbalanced stance distribution shown in Table 7.2, we create a simple but by no means weak baseline, which predicts that every instance has the most frequent class label (Agree Strongly), regardless of the prompt part or the essay's contents.

7.3.2 N-Gram Baseline

Previous work on stance classification, which assumes that stance-annotated training data is available for every topic for which stance classification is performed, has shown that the N-Gram baseline is a strong baseline. Not only is this assumption unrealistic in practice, but it has led to undesirable consequences. For instance, the proposition "feminists have done more harm to the cause of women than good" elicits much more disagreement than normal. So, if instances from this proposition appeared in both the training and test sets, the unigram feature "feminist" would be strongly correlated with the disagreement classes even though intuitively it tells us nothing about stance. This partly explains why the N-Gram baseline was strong in previous work (Somasundaran and Wiebe, 2010). In light of this problem, we perform leave-one-out cross validation where we partition the instances by prompt, leaving the instances created for one prompt out in each test set.

To understand how strong n-grams are when evaluated in our leave-one-prompt-out cross-validation setting, we employ them as features in our second baseline. Specifically, we train a multiclass classifier on our data set using a feature set composed solely of unigram, bigram, and trigram features, each of which indicates the number of times the corresponding n-gram is present in the associated essay.
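A minimal sketch of this baseline, assuming scikit-learn (CountVectorizer for the 1-3 gram counts and LogisticRegression as a stand-in for MALLET's maximum entropy learner; the essays and labels are invented toy data):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    essays = ["Most degrees are theoretical and of little value.",
              "University clearly prepares students for real life."]
    stances = ["Agree Strongly", "Disagree Strongly"]

    vectorizer = CountVectorizer(ngram_range=(1, 3))       # unigram, bigram, trigram counts
    X = vectorizer.fit_transform(essays)

    baseline = LogisticRegression(max_iter=1000).fit(X, stances)
    print(baseline.predict(vectorizer.transform(["Degrees are of little value."])))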

7.3.3 Duplicated Faulkner Baseline

While it is true that no system exists for solving our exact problem, the system proposed by Faulkner (2014) comes fairly close. Hence, as our third baseline, we train a multiclass classifier on our data set for fine-grained essay stance classification using the two types of features proposed by Faulkner, as described below.

Part-of-speech (POS) generalized dependency subtrees. Faulkner first constructs a lexicon of stance words in the style of Somasundaran and Wiebe (2010). The lexicon consists of (1) the set of stemmed first unigrams appearing in all stance-annotated text spans in the Multi-Perspective Question Answering (MPQA) corpus (Wiebe et al., 2005), and (2) the set of boosters (clearly, decidedly), hedges (claim, estimate), and engagement markers (demonstrate, evaluate) from the appendix of Hyland (2005). He then manually removes from this list any words that appear not to be stancetaking, resulting in a 453 word lexicon.

Stance words target propositions, which, Faulkner notes, usually contain some opinion-bearing language that can serve as a proxy for the targeted proposition. In order to find the locations in an essay where a stance is being taken, he first finds each stance word in the essay. Then he finds the shortest path from the stance word to an opinion word in the sentence's dependency tree, using the MPQA subjectivity lexicon of opinion words (Wiebe et al., 2005). If this nearest opinion word appears in the stance word's immediate or embedded clause, he creates a binary feature by concatenating all the words in the dependency path, POS-generalizing all words other than the stance and opinion word, and finally prepending "not" if the stance word is adjacent to a negator in the dependency tree. Thus, given the sentence "I can only say that this statement is completely true.", he would add the feature can-V-true, which suggests agreement with the prompt.

Prompt topic words. Recall that for the previous feature type, a feature was generated whenever an opinion word occurred in a stance word’s immediate or embedded clause. Each content word in this clause is used as a binary feature if its similarity with one of the prompt’s content words meets an empirically determined threshold.

7.3.4 N-Gram+Duplicated Faulkner Baseline

To build a stronger baseline, we employ as our fourth baseline a classifier trained on both n-gram features and duplicated Faulkner’s features.

7.4 Our Approach

Our approach to stance classification is a learning-based approach where we train a multiclass classifier using four types of features: n-gram features (Section 7.3.2), duplicated Faulkner's features (Section 7.3.3), and two novel types of features, stancetaking path-based features (Section 7.4.1) and knowledge-based features (Section 7.4.2).

7.4.1 Stancetaking Path-Based Features

Recall that, in order to identify his POS generalized dependency subtrees, Faulkner relies on two lexica, a lexicon of stancetaking words and a lexicon of opinion-bearing words. He then extracts a feature any time words from the two lexica are syntactically close enough. A major problem with this approach is that the lexica are so broad that nearly 80% of sentences in our

Figure 7.1. Automatic dependency parse of a prompt part.

corpus contain text that can be identified as stancetaking using this method. Intuitively, an essay may state its stance w.r.t. a prompt part in a thesis or conclusion sentence, but most of an essay's text will be at most tangentially related to any particular prompt part. For this reason, our approach to identifying stancetaking text targets only text that appears directly related to the prompt part. Below we first show how we identify and stance-label relevant stancetaking dependency paths, and then describe the features we derive from these paths.

Identifying relevant stancetaking paths

As noted above, we first identify stancetaking text that appears directly related to the prompt part.

To begin, we note that the prompt parts themselves must express a stance on a topic if they can be agreed or disagreed with. By examining the dependency parses5 of the prompt parts, we can recognize elements of how stancetaking text is structured. From the prompt part shown in Figure 7.1, for example, we notice that the important words that express a stance in the sentence are "money", "root", and "evil". By analyzing the dependency structure in this and other prompt parts, we discovered that stancetaking text often consists of (1) a subject word, which is the child in an nsubj or nsubjpass relation, (2) a governor word, which is the subject's parent, and (3) an object, which is a content word from which

5Dependency parsing, POS tagging, and lemmatization are performed automatically using the Stanford CoreNLP system (Manning et al., 2014).

there is a (not always direct) dependency path from the governor. We therefore abstract a stance in an essay as a dependency path from a subject to an object that passes through the governor. Thus, the stancetaking dependency path we identify from the prompt part shown in Figure 7.1 could be represented as money-root-evil.

The obvious problem with identifying stancetaking text in this way is that nearly all sentences contain this kind of stancetaking structure, and just as with Faulkner's dependency paths, there is little reason to believe that any particular path is relevant to an instance's prompt part. Does this mean that nearly all sentences are stancetaking? We would argue that they can be, as even sentences that appear on their face to be mere statements of fact with no apparent value judgment can be viewed as taking a stance on the factuality of the statement, and people often disagree about the factuality of statements. For this reason, after we have identified a stancetaking path, we must determine whether the stance being expressed is relevant to the prompt part before extracting features from it. Accordingly, we ignore all stancetaking paths that do not meet the following three relevance conditions. First, the lemma of the path's governor must match the lemma of a governor in the prompt part. Second, the lemma of the path's object must match the lemma of some content word6 in the prompt part. Finally, the containing sentence must not contain a question mark or a quotation, as such sentences are usually rhetorical in nature.

We do not require that the subject word match the prompt part's subject word because this substantially reduces coverage for various reasons. For one, of the three words (subject, governor, object), the subject is the word most likely to be replaced with some other word like a pronoun, and possibly because the essays were written by non-native English speakers, automatic coreference resolution cannot reliably identify these cases. We also do not fully trust that the subject identified by the dependency parser will reliably match the subject

6For our purpose, a content word (1) is a noun, pronoun, verb, adjective, or adverb, (2) is not a stopword, and (3) is at the root, is a child in a dobj or pobj relation, or is the child in a conj relation whose parent is the child in a dobj or pobj relation in the dependency tree.

Figure 7.2. Automatic dependency parse of an essay sentence.

we are looking for. Given these constraints, we can automatically identify the "itself-root-of-evil" dependency path in Figure 7.2 as a relevant stancetaking path.

Stance-labeling the paths

Next, we determine whether a stancetaking path identified in the previous step appears to agree or disagree with the prompt part.

To begin, we count the number of negations occurring in the prompt part. Any word like "no", "not", or "none" counts as a negation unless it begins a non-negation phrase like "no doubt" or "not only".7 Thus, the count of negations in the prompt part in Figure 7.1 is 0.

After that, we count the number of times the identified stancetaking path is negated. Because these paths occur in student essays and are therefore often not as simply stated as the prompt parts, this is a little bit more complicated than just counting the containing sentence's negations, since the sentence may contain a lot of additional material. To do this, we construct a list of all the dependency nodes in the stancetaking path as well as all of their dependency tree children. We then remove from this list any node that, in the sentence, occurs after the last node in the stancetaking path. The total negation count we are looking for is the number of nodes in this list that correspond to negation words (unless the negation

7See our website at http://www.hlt.utdallas.edu/~persingq/ICLE/ for our list of manually con- structed negation words and non-negation phrases.

word begins a non-negation phrase). Thus, because the word "not" is the child of "root" in the path "itself-root-of-evil" we identified in Figure 7.2, we consider this path to have been negated one time.

Finally, we sum the prompt part negations and the stancetaking path negations. If this sum is even, we believe that the relevant stancetaking path agrees with the prompt part in the instance. If it is odd, however (as in the case of the prompt part and stancetaking text in the dependency tree figures), we believe that it disagrees with the prompt part. To illustrate why we are concerned with whether this sum is even, consider the following examples. If both the prompt part and the stancetaking text are negated, both disagree with the opposite of the prompt part’s stance. Thus, they agree with each other, and their negation sum is even (2). If the stancetaking path was negated twice, however, the sum would be odd (3) due to the stance path’s double negations canceling each other out, and the stancetaking path would disagree with the prompt part.
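The negation-parity decision can be sketched as follows; the negation list and exception phrases here are simplified stand-ins for the manually constructed lists mentioned in footnote 7, and the token-level inputs are an assumption made for illustration:

    NEGATIONS = {"no", "not", "none"}                          # simplified negation list
    NON_NEGATION_PHRASES = {("no", "doubt"), ("not", "only")}  # simplified exceptions

    def count_negations(tokens):
        """Count negation words, skipping those that begin a non-negation phrase."""
        count = 0
        for i, tok in enumerate(tokens):
            if tok.lower() in NEGATIONS:
                nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
                if (tok.lower(), nxt) not in NON_NEGATION_PHRASES:
                    count += 1
        return count

    def path_agrees(prompt_part_tokens, path_region_tokens):
        """The path agrees with the prompt part iff the summed negation count is even."""
        total = count_negations(prompt_part_tokens) + count_negations(path_region_tokens)
        return total % 2 == 0

    print(path_agrees("Money is the root of all evil".split(),
                      "Money is not the root of all evil itself".split()))  # False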

Deriving path-based features

We extract four features from the relevant stancetaking dependency paths identified and stance-labeled so far, as described below.

The first feature encodes the count of relevant stancetaking paths that appear to agree with the prompt part. The second feature encodes the count of relevant stancetaking paths that appear to disagree with the prompt part. While we expect these first two features to be correlated with the agreement and disagreement classes, respectively, they may not be sufficient to distinguish between agreeing and disagreeing instances. It is possible, for example, that both features may be greater than zero in a single instance if we have identified one stancetaking path that appears to agree with the prompt part and another stancetaking path that appears to disagree with the prompt part. It is not clear whether this situation is indicative of only the Neutral class, or perhaps it indicates partial (Somewhat) (Dis)Agreement, or maybe our method of detecting disagreement is not reliable enough, and it therefore makes sense, when we get these conflicting signals, to ignore them entirely and just assign the instance to the most frequent (Agree Strongly) class. For that matter, if neither feature is greater than zero, does this mean that the instance Never Addressed the prompt part, or does it instead mean that our method for identifying stancetaking paths doesn't have high enough recall to work on all instances? We let our learner sort these problems out by adding two more binary features to our instances, one which indicates that both of the first two features are zero, and one that indicates whether both are greater than zero.
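Given the stance labels assigned to the relevant paths, the four features reduce to a few counts (a sketch; the label strings are illustrative):

    def path_based_features(path_labels):
        """path_labels: one "agree"/"disagree" label per relevant stancetaking path."""
        n_agree = sum(1 for label in path_labels if label == "agree")
        n_disagree = sum(1 for label in path_labels if label == "disagree")
        both_zero = float(n_agree == 0 and n_disagree == 0)
        both_nonzero = float(n_agree > 0 and n_disagree > 0)
        return [n_agree, n_disagree, both_zero, both_nonzero]

    print(path_based_features(["agree", "agree", "disagree"]))  # [2, 1, 0.0, 1.0]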

7.4.2 Knowledge-Based Features

Our second feature type is composed of five linguistically informed binary features that correspond to five of the six classes in our fine-grained stance classification task. Intuitively, if an instance has one of these features turned on, it should be assigned to the feature’s corresponding class.

1. Neutral. Stancetaking text indicating neutrality tends to be phrased somewhat differently than stancetaking text indicating any other class. In particular, neutral text often makes claims that are about the prompt part's subject, but which are tangential to the proposition expressed in the prompt part. For this reason, we search the essay for words that match the prompt part's subject lemmatically.

After identifying a sentence that is about the prompt part’s subject in this way, we check whether the sentence begins with any neutral indicating phrase.8 If we find a sentence that both begins with a neutral phrase and is about the prompt part’s subject, we turn the Neutral feature on. Thus, sentences like the following can be captured: “In all probability university

8We construct a list of neutral phrases for introducing another person’s ideas from a writing skills website (http://www.myenglishteacher.eu/question/other-ways-to-say-according-to/).

students wonder whether or not they spend their time uselessly in studying through four or five years in order to take their degree."

2. (Dis)Agree Somewhat. In order to set the values of the features associated with the Somewhat classes, we first identify relevant stancetaking paths as described above. We then trim the list of paths by removing any path whose governor or subject does not have a hedge word as an adverb modifier child in the dependency tree.9 Thus, we are able to determine that the essay containing the sentence "There is nearly no place left for dream and imagination" is likely to belong to one of the Somewhat classes w.r.t. the prompt part "There is no longer a place for dreaming and imagination."

The question now is how to determine which (if any) of the Somewhat classes it should belong to. We analyze all the paths from the list for negation in much the same way we described above, but with one major difference. We hypothesize that when taking a Somewhat stance, students are more likely to explicitly state that the stance being taken is their opinion rather than stating the stance bluntly without attribution. For example, one Disagree Somewhat essay includes the sentence, "I never believed these people were honest if saying that money is just the root of all evil." In order to determine that this sentence contains an indication of the Disagree Somewhat class, we need to account for the negation that occurs at the beginning, far away from the stancetaking path (money-root-of-evil). To do this, we semantically parse the sentence using SEMAFOR (Das et al., 2010). Each of the semantic frames detected by SEMAFOR describes an event that occurs in a sentence, and the event's frame elements may be the people or other entities that participate in the event. One of the semantic frames detected in this example sentence describes a Believer (I) and the content of his or her belief (all the text after "believed"). Because the sentence includes

9See our website at http://www.hlt.utdallas.edu/~persingq/ICLE/ for our manually constructed list of hedge words.

a semantic frame that (1) contains a first person (I, we) Cognizer, Speaker, Perceiver, or Believer element, (2) contains an element that covers all the text in the dependency path (a Content frame element, in this case), and (3) the word that triggers the frame ("believed") has a negator child in the dependency tree, we add one to this relevant stancetaking path's negation count. This makes this hedged stancetaking path's negation count odd, so we believe that this sentence likely disagrees with its instance's prompt part somewhat. If we find a hedged stancetaking path with an odd negation count, we turn on the Disagree Somewhat feature. Similarly, if we find a hedged stancetaking path with an even negation count, we turn on the Agree Somewhat feature.

3. (Dis)Agree Strongly. When we believe there is strong evidence that an instance should belong to one of the Strongly classes, we turn on the corresponding (Dis)Agree Strongly feature. In particular, if we find a relevant stancetaking path that appears to agree with the prompt part (as described in Section 7.4.1), but do not find any such path that appears to disagree with it, we turn on the Agree Strongly feature. Similarly, if we find a relevant stancetaking path that appears to disagree with the prompt part, but do not find a relevant stancetaking path that appears to agree with it, we turn on the Disagree Strongly feature.

7.5 Evaluation

In this section we discuss how we evaluate our approach to stance classification.

7.5.1 Experimental Setup

Data partition. All our results are obtained via leave-one-prompt-out cross-validation experiments. So, in each fold experiment, we partition the instances from our 11 prompts into a training set (10 prompts) and a test set (1 prompt).

Evaluation metrics. We employ two metrics to evaluate our systems: (1) micro F-score, which treats each instance as having equal weight; and (2) macro F-score, which treats each class as having equal weight.10 To gain insights into how different systems perform on different classes, we additionally report per-class F-scores.

Training. We train the baselines and our approach using two learning algorithms, MALLET's (McCallum, 2002) implementation of maximum entropy (MaxEnt) classification and our own implementation of the one nearest neighbor (1NN) algorithm using the cosine similarity metric. Note that these two learners have their own strengths and weaknesses: in comparison to 1NN, MaxEnt is better at exploiting high-dimensional features but less robust to skewed class distributions. For the baseline systems, we select the learner by performing cross validation on the training folds to maximize the average of micro and macro F-scores in each fold experiment.

When training our approach, we perform exhaustive feature selection to determine which subset of the four sets of features (i.e., n-gram, duplicated Faulkner, path-based, and knowledge-based features) should be used. Specifically, we select the feature groups and learner jointly by performing cross validation on the training folds, choosing the combination yielding the highest average of micro and macro F-scores in each fold experiment. To prevent any feature type from dominating the others, to each feature we apply a weight of one divided by the number of features having its type.
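The per-group weighting can be sketched as follows (the group names and values are illustrative, not our actual feature vectors):

    import numpy as np

    def weight_by_group(feature_groups):
        """Scale each feature by 1 / (number of features in its group) and
        concatenate the groups into one instance vector."""
        weighted = [feature_groups[name] / float(len(feature_groups[name]))
                    for name in sorted(feature_groups)]
        return np.concatenate(weighted)

    instance = {"ngram": np.array([1.0, 0.0, 2.0, 1.0]),
                "path": np.array([2.0, 1.0, 0.0, 1.0]),
                "knowledge": np.array([0.0, 1.0, 0.0, 0.0, 1.0])}
    print(weight_by_group(instance))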

Testing. In case of a tie when applying 1NN, the tie is broken by selecting the class appearing higher in Table 7.2.

10Since stance classification is a multiclass, single-label task, micro F-score, precision, recall, and accuracy are all equivalent.

Table 7.3. Cross-validation results for fine-grained essay stance classification, including per-class F-scores for Agree Strongly (A+), Agree Somewhat (A−), Neutral (Neu), Disagree Somewhat (D−), Disagree Strongly (D+), and Never Addressed (Nev).

System                           Micro-F  Macro-F   A+    A−   Neu   D−    D+   Nev
1 Always Agree Strongly            55.6     11.9   71.4   .0    .0   .0    .0    .0
2 N-Gram                           55.4     12.0   71.3   .0    .0   .0    .5    .0
3 Duplicated Faulkner              50.8     15.6   66.8  4.0    .0   .0  22.9    .0
4 N-Gram + Duplicated Faulkner     53.4     15.4   69.1  2.5    .0   .0  20.6    .0
5 Our approach                     60.6     20.1   73.6   .0    .0  2.1  44.8    .0

7.5.2 Results and Discussion

Results on fine-grained essay stance classification are shown in Table 7.3. The first four rows show our baselines' performances. Among the four baselines, Always Agree Strongly performs best w.r.t. micro F-score, obtaining a score of 55.6%, whereas Duplicated Faulkner performs best w.r.t. macro F-score, obtaining a score of 15.6%. Despite its poor performance, Duplicated Faulkner is a state-of-the-art approach on this task. Its poor performance can be attributed to three major factors. First, it was intended to identify only Agree and Disagree instances (note that Faulkner simply removed neutral instances from his experimental setup), which should not prevent it from performing well w.r.t. micro F-score. Second, it is far too permissive, generating features from a large majority of sentences while relevant sentences are far rarer. Third, while it does succeed at predicting Disagree Strongly far more frequently than either of the other baselines that exclude the Faulkner feature set, the problem's class skewness means that a learner is much more likely to be punished for predicting minority classes, which are more difficult to predict with high precision. The fact that it makes an attempt to solve the problem rather than relying on class skewness for good performance makes Duplicated Faulkner a more interesting baseline than either N-Gram or Always Agree Strongly, even though both technically outperform it w.r.t. micro F-score. Similarly, the statistically significant improvements in micro and macro F-score our approach achieves over the best baselines are more impressive when taking the skewness problem into consideration.

The results of our approach, which has access to all four feature groups, are shown in row 5 of the table. It obtains micro and macro F-scores of 60.6% and 20.1%, which correspond to statistically significant relative error reductions over the best baselines of 11.3% and 5.3%, respectively.11

Recall that we turned on one of our knowledge-based features only when we believed there was strong evidence that an instance belonged to its associated class. To get an idea of how useful these features are, we calculate the precision, recall, and F-score that would be obtained for each class if we treated our knowledge-based features as heuristic classifiers. The respective precisions, recalls, and F-scores we obtained are: 0.66/0.28/0.40 (A+), 0.50/0.02/0.04 (A−), 0.00/0.00/0.00 (Neu), 0.50/0.01/0.02 (D−), and 0.63/0.31/0.42 (D+). Since the rule predictions are encoded as features for the learner, they may not necessarily be used by the learner even if the underlying rules are precise. For instance, despite the rule's high precision on the Agree Somewhat class, the learner did not make use of its predictions due to its low coverage.

7.5.3 Additional Experiments

Since all the systems we examined fared poorly on identifying Somewhat classes, one may wonder how these systems would perform if we considered a simplified version of the task where we merged each Somewhat class with the corresponding Strongly class. In particular, since Faulkner’s approach was originally not designed to distinguish between Strongly and Somewhat classes, it may seem fairer to compare our approach against Duplicated Faulkner on the four-class essay stance classification task, where stance can take one of four values:

11All significance tests are approximate randomization tests with p < 0.01. Boldfaced results are significant w.r.t. micro F-score for the Always Agree Strongly baseline, and macro F-score w.r.t. the Duplicated Faulkner baseline.

Agree (created by merging Agree Strongly and Agree Somewhat), Disagree (created by merging Disagree Strongly and Disagree Somewhat), Neutral, and Never Addressed. In the results for the different systems on this four-class stance classification task, shown in Table 7.4, we see that the same patterns we noticed in the six-class version of the task persist. The approaches' relative order w.r.t. micro and macro F-score remains the same, though the scores are adjusted upwards due to the problem's increased simplicity. Our approach's performance on Agree increases (compared to Agree Strongly) because Agree is a bigger class, making predictions of the class safer. Our approach's performance decreases on Disagree (compared to Disagree Strongly) since it is not good at predicting Disagree Somewhat instances, which are part of the class.

Table 7.4. Cross-validation results for four-class essay stance classification for Agree (A), Neutral (Neu), Disagree (D), and Never Addressed (Nev).

System                           Micro-F  Macro-F    A    Neu    D    Nev
1 Always Agree Strongly            64.8     19.7   78.7    .0    .0    .0
2 N-Gram                           64.3     19.7   78.2    .0    .8    .0
3 Duplicated Faulkner              62.3     25.1   75.1    .0  25.2    .0
4 N-Gram + Duplicated Faulkner     62.6     23.7   75.8    .0  19.0    .0
5 Our approach                     67.6     29.1   78.5    .0  38.0    .0

7.5.4 Error Analysis

To gain additional insights into our approach, we analyze its six major sources of error below.

Stances not presented in a straightforward manner. As an example, consider "To my opinion this technological progress triggers off the imagination in a certain way." To identify this sentence as strongly disagreeing with the proposition "there is no longer a place for dreaming and imagination", we need to understand (1) the world knowledge that technological progress is occurring, (2) that "triggers off the imagination in a certain way" means that the technological progress coincides with imagination occurring, (3) that if imagination is occurring, there must be "a place for dreaming and imagination", and (4) that the prompt part is negated. In general, in order to construct reliable features to increase our coverage of essays that express their stance like this, we would need additional world knowledge and a deeper understanding of the text.

Rhetorical statements occasionally misidentified as stancetaking. For example, our method for identifying stancetaking paths misidentifies “I am going to discuss the topic that television is the opium of the masses in modern society” as stancetaking. To handle this, we need to incorporate more sophisticated methods for detecting rhetorical statements than those we are using (e.g., ignoring sentences ending in question marks).

Negation expressed without negation words. Our techniques for capturing negation are unable to detect when negation is expressed without the use of simple negation words. For example, “In this sense money is the root of life” should strongly disagree with “money is the root of all evil”. The author replaced “life” with “evil”, and detecting that this constitutes something like negation would require semantic knowledge about words that are somehow opposite of each other.

Insufficient feature/heuristic coverage of the Disagree Strongly class. Our stancetaking path-based features that we identified as intuitively having a connection to the Disagree Strongly class together cover only 51% of Disagree Strongly instances, meaning that it is in principle impossible for our system to identify the remaining 49%. However, our decision to incorporate only features that are expected to have fairly high precision for some class was intentional, as the lesson we learned from the Faulkner-based system is that it is difficult to learn a good classifier for stance classification using a large number of weakly or non-predictive features. To solve this problem, we would therefore need to exploit other aspects of strongly disagreeing essays that act as reliable predictors of the class.

Rarity of minority class instances. It is perhaps not surprising that our learning-based approach performs poorly on the minority classes. Even though the knowledge-based features were designed in part to improve the prediction of minority classes, our results suggest that the resulting features were not effectively exploited by the learners. To address this problem, one could employ a hybrid rule-based and learning-based approach, writing a small set of high-precision rules for the minority classes and using our machine-learned classifier to classify an instance only if it cannot be classified by any of these rules.
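To make the suggested hybrid strategy concrete, the sketch below shows one possible shape it could take. The rule functions are hypothetical placeholders for high-precision, hand-written stance rules; nothing in this sketch is part of the implemented system.

```python
def hybrid_classify(instance, rules, learned_classifier):
    """Classify a (prompt part, essay) instance with hand-written rules first,
    falling back to the machine-learned classifier only when no rule fires.
    `rules` is a list of functions that return a stance label or None."""
    for rule in rules:                      # high-precision rules take priority
        label = rule(instance)
        if label is not None:
            return label
    return learned_classifier(instance)     # fall back to the learned model
```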

Lack of obvious similarity between instances of the same class. For example, suppose the most straightforward stancetaking sentence in an Agree Somewhat instance reads, "In conclusion, I will not go to such extremes as to declare nihilistically that university does not prepare me for the real world in the least" (given the prompt part "Most university degrees do not prepare us for real life"), and we somehow managed to identify the instance's class as Agree Somewhat. What would the instance have in common with other Agree Somewhat instances? Given the numerous ways of expressing a stance, we believe a deeper understanding of essay text is required in order to automatically detect how instances like this are similar to other instances of the same class, and such similarities are required for learning in general.

7.6 Summary

We examined the new task of fine-grained essay stance classification, in which we determine stance for each prompt part and allow stance to take one of six values. We addressed this task by proposing two novel types of features, stancetaking path-based features and knowledge-based features. In an evaluation on 826 argumentative essays, our learning-based approach, which combines our novel features with n-gram features and Faulkner's features, significantly outperformed four baselines, including our re-implementation of Faulkner's system. Compared to the best baselines, our approach yielded relative error reductions of 11.3% and 5.3% in micro and macro F-score, respectively. Nevertheless, accurately predicting the Somewhat, Neutral, and Never Addressed stances remains a challenging task. To stimulate further research on this task, we make all of our stance annotations publicly available.

CHAPTER 8

ARGUMENTATION MINING WITH INTEGER LINEAR PROGRAMMING1

8.1 Introduction

There has been a surge of interest in argumentation mining in recent years. Argumentation mining typically involves addressing two subtasks: (1) argument component identification (ACI), which consists of identifying the locations and types of the components that make up the arguments (i.e., Major Claims, Claims, and Premises), and (2) relation identification (RI), which involves identifying the type of relation that holds between two argument components (i.e., Support, Attack, None). As a first step towards mining arguments in persuasive essays, Stab and Gurevych (S&G) annotated a corpus of 90 student essays with argument components and their relations (Stab and Gurevych, 2014a). To illustrate, consider the following excerpt from one essay:

From this point of view, I firmly believe that (1) we should attach more importance to cooperation during primary education. First of all, (2) through cooperation, children can learn about interpersonal skills which are significant in the future life of all students. (3) What we acquired from team work is not only how to achieve the same goal with others but more importantly, how to get along with others.

In this example, premise (3) supports claim (2), which in turn supports major claim (1).

Using their annotated corpus, S&G presented initial results on simplified versions of the ACI and RI tasks (Stab and Gurevych, 2014b). Specifically, they applied their learned ACI classifier to classify only gold argument components (i.e., text spans corresponding to a Major Claim, Claim, or Premise in the gold standard) or sentences that contain no gold argument components (as non-argumentative). Similarly, they applied their learned RI classifier to classify only the relation between two gold argument components. In other words,

1This chapter was previously published in Persing and Ng (2016a).

they simplified both tasks by avoiding the challenging task of identifying the locations of argument components. Consequently, their approach cannot be applied in a realistic setting where the input is an unannotated essay. Motivated by this weakness, we examine in this chapter argument mining in persuasive student essays in a considerably more challenging setting than that of S&G: the end-to-end setting. In other words, we perform argument mining on raw, unannotated essays.

Our work makes three contributions. First, we present the first results on end-to-end argument mining in student essays using a pipeline approach, where the ACI task is performed prior to the RI task. Second, to avoid the error propagation problem inherent in the pipeline approach, we perform joint inference over the outputs of the ACI and RI classifiers in an Integer Linear Programming (ILP) framework (Roth and Yih, 2004), where we design constraints to enforce global consistency. Finally, we argue that the typical objective function used extensively in ILP programs for NLP tasks is not ideal for tasks whose primary evaluation metric is F-score, and subsequently propose a novel objective function that enables F-score to be maximized directly in an ILP framework. We believe that the impact of our work goes beyond argument mining, as our F-score optimizing objective function is general enough to be applied to any ILP-based joint inference tasks.

8.2 Related Work

Recall that identifying argumentative discourse structures consists of (1) identifying the locations and types of the argument components, and (2) identifying how they are related to each other. Below we divide related work into six broad categories based on which of these subtasks it addressed.

Argument location identification. Works in this category aimed to classify whether a sentence contains an argument or not (Florou et al., 2013; Moens et al., 2007; Song et al., 2014; Swanson et al., 2015).

Argument component typing. Works in this category focused on determining the type of an argument. The vast majority of previous works perform argument component typing at the sentence level. For instance, Rooney et al. (2012) classified sentences into premises, conclusions, premise-conclusions, and non-argumentative components; Teufel (1999) classified each sentence into one of seven rhetorical classes (e.g., claim, result, purpose); Burstein et al. (2003), Ong et al. (2014), and Falakmasir et al. (2014) assigned argumentative labels (e.g., claim, thesis, conclusion) to an essay's sentences; Levy et al. (2014) detected sentences that support or attack an article's topic; Lippi and Torroni (2015; 2016) detected sentences containing claims; and Rinott et al. (2015) detected sentences containing evidence for a given claim. These sentence-level approaches, however, do not determine which clauses within a sentence are argumentative or where in the sentence they are.

Argument location identification and typing. Some works focused on the more difficult task of clause-level argument component typing (Park and Cardie, 2014; Goudas et al., 2015; Sardianos et al., 2015), training a Conditional Random Field to jointly identify and type argument components.

Relation identification. Nguyen and Litman (2016) showed how context can be exploited to identify relations between argument components.

Argument component typing and relation identification. Given the difficulty of clause-level argument component location identification, recent argument mining works that attempted argument component typing and relation identification are not end-to-end. Specifically, they simplified the task by assuming as input gold argument components (Stab and Gurevych, 2014b; Peldszus and Stede, 2015).

End-to-end argument mining. Besides Persing and Ng (2016a), Palau and Moens (2009) addressed all the argument mining subtasks. They employed a hand-crafted context-free grammar (CFG) to generate (i.e., extract and type) argument components at the clause level and identify the relations between them. A CFG approach is less appealing in the essay domain because (1) constructing a CFG is a time- and labor-intensive task; (2) this construction would be even more difficult in the less rigidly structured essay domain, which contains fewer rhetorical markers indicating component types (e.g., words like "reject" or "dismiss", which indicate a legal document's conclusion); and (3) about 20% of essay arguments' structures are non-projective (i.e., when mapped to the ordered text, their argument trees have edges that cross), and thus cannot be captured by CFGs.

8.3 Corpus

Our corpus consists of 90 persuasive essays collected and annotated by S&G. Some relevant statistics are shown in Table 8.1. Each essay is an average of 4.6 paragraphs (18.6 sentences) in length and is written in response to a topic such as “should high school make music lessons compulsory?” or “competition or co-operation-which is better?”. The corpus annotations describe the essays’ argument structure, including the locations and types of the components that make up the arguments, and the types of relations that hold between them. The three annotated argument component types include: Major Claims, which express the author’s stance with respect to the essay’s topic, Claims, which are controversial statements that should not be accepted by readers without additional support, and Premises, which are reasons authors give to persuade readers about the truth of another argument component statement. The two relation types include: Support, which indicates that one argument component supports another, and Attack, which indicates that one argument component attacks another.

8.4 Pipeline-Based Argument Mining

Next, we describe our end-to-end pipeline argument mining system, which will serve as our baseline. In this system, ACI is performed prior to RI.

Table 8.1. Argument mining corpus statistics (v1).

Essays:               90
Paragraphs:          417
Sentences:         1,673
Major Claims:         90
Claims:              429
Premises:          1,033
Support Relations: 1,312
Attack Relations:    161

8.4.1 Argument Component Identification

We employ a two-step approach to the ACI task, where we first heuristically extract argument component candidates (ACCs) from an essay, and then classify each ACC as either a premise, claim, major claim, or non-argumentative, as described below.

Extracting ACCs

We extract ACCs by constructing a set of low precision, high recall heuristics for identifying the locations in each sentence where an argument component's boundaries might occur. The majority of these rules depend primarily on a syntactic parse tree we automatically generated for all sentences in the corpus using the Stanford CoreNLP (Manning et al., 2014) system. Since argument components are a clause-level annotation, a large majority of annotated argument components are substrings of a simple declarative clause (an S node in the parse tree), so we begin by identifying each S node in a sentence's tree. Given an S clause, we collect a list of left and right boundaries where an argument component may begin or end. The rules we used to find these boundaries are summarized in Table 8.2. We then construct ACCs by combining each left boundary with each right boundary that occurs after it. As a result, we are able to find exact (boundaries exactly match) and approximate (over half of tokens shared) matches for 92.1% and 98.4%, respectively, of all annotated argument components (ACs).

1An additional point that requires explanation is that the last two right boundary rules mention “possibly nested” nodes. In boundary rule 7, for example, this means that the S node might end in a PP node, which itself has a PP node as its last child, and so on. We generate a separate right boundary immediately before each of these PP nodes.

Table 8.2. Rules for extracting ACC boundary locations.

(a) Potential left boundary locations
# Rule
1 Exactly where the S node begins.
2 After an initial explicit connective, or if the connective is immediately followed by a comma, after the comma.
3 After the nth comma that is an immediate child of the S node.
4 After the nth comma.

(b) Potential right boundary locations
# Rule
5 Exactly where the S node ends, or if S ends in a punctuation, immediately before the punctuation.
6 If the S node ends in a (possibly nested) SBAR node, immediately before the nth shallowest SBAR.1
7 If the S node ends in a (possibly nested) PP node, immediately before the nth shallowest PP.
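The sketch below illustrates how a subset of the Table 8.2 rules (rules 1, 4, and 5 only) could be applied to a constituent parse to generate ACC spans. It assumes the parse is available as an nltk Tree (e.g., converted from Stanford CoreNLP output); it is an illustrative approximation, not the exact rule set used in our system.

```python
from nltk.tree import Tree

def token_spans(tree, start=0, out=None):
    """Recursively annotate every subtree with its (start, end) token span."""
    if out is None:
        out = []
    if isinstance(tree, str):               # a leaf token
        return start + 1, out
    end = start
    for child in tree:
        end, _ = token_spans(child, end, out)
    out.append((tree, start, end))
    return end, out

def extract_accs(parse):
    """Generate ACC token spans from the S nodes of a parse, using rules 1, 4, and 5."""
    tokens = parse.leaves()
    _, subtrees = token_spans(parse)
    accs = set()
    for node, s, e in subtrees:
        if node.label() != "S":
            continue
        lefts = {s}                                                 # rule 1: start of the S node
        lefts |= {i + 1 for i in range(s, e) if tokens[i] == ","}   # rule 4: after each comma
        right = e - 1 if tokens[e - 1] in {".", "!", "?"} else e    # rule 5: before final punctuation
        accs.update((l, right) for l in lefts if l < right)
    return sorted(accs)

# Example sentence: "In my opinion, schools should teach music."
parse = Tree.fromstring(
    "(ROOT (S (PP (IN In) (NP (PRP$ my) (NN opinion))) (, ,) (NP (NNS schools)) "
    "(VP (MD should) (VP (VB teach) (NP (NN music)))) (. .)))")
for l, r in extract_accs(parse):
    print(" ".join(parse.leaves()[l:r]))
```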

Training the ACI Classifier

We train a classifier for ACI using MALLET's (McCallum, 2002) implementation of maximum entropy classification. We create a training instance from each ACC extracted above. If the ACC's left and right endpoints exactly match an annotated argument component's, the corresponding training instance's class label is the same as that of the component. Otherwise, its class label is "non-argumentative".

Each training instance is represented using S&G's structural, lexical, syntactic, indicator, and contextual features for solving the same problem. Briefly, the structural features describe an ACC and its covering sentence's length, punctuations, and location in the essay. Lexical features describe the 1−3 grams of the ACC and its covering sentence. Syntactic features are extracted from the ACC's covering sentence's parse tree and include things such as production rules. Indicator features describe any explicit connectives that immediately precede the ACC. Contextual features describe the contents of the sentences preceding and following the ACC primarily in ways similar to how the structural features describe the covering sentence.
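A minimal sketch of this training step is shown below. Because MALLET is a Java library, scikit-learn's logistic regression (an equivalent maximum entropy model) stands in for MALLET's learner here, and extract_features is a placeholder for the structural, lexical, syntactic, indicator, and contextual feature extractors.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def train_aci_classifier(accs, gold_components, extract_features):
    """accs: list of ACC dicts with "start"/"end" token offsets;
    gold_components: dict mapping (start, end) to the annotated component type;
    extract_features: returns a {feature_name: value} dict for an ACC."""
    features, labels = [], []
    for acc in accs:
        # an ACC gets the gold label only if its boundaries match a gold component exactly
        label = gold_components.get((acc["start"], acc["end"]), "non-argumentative")
        features.append(extract_features(acc))
        labels.append(label)
    vectorizer = DictVectorizer()
    classifier = LogisticRegression(max_iter=1000)
    classifier.fit(vectorizer.fit_transform(features), labels)
    return vectorizer, classifier
```

At test time, classifier.predict_proba(vectorizer.transform([extract_features(acc)])) yields the per-class probabilities that the joint inference step in Section 8.5 consumes.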

8.4.2 Relation Identification

We consider RI between pairs of argument components to be a five-class classification problem. Given a pair of ACCs A1 and A2 where A1 occurs before A2 in the essay, either they are unrelated, A1 supports A2, A2 supports A1, A1 attacks A2, or A2 attacks A1. Below we describe how we train and apply our classifier for RI.

We learn our RI classifier using MALLET's implementation of maximum entropy classification. Each training instance, which we call a training relation candidate (RC), consists of a pair of ACCs and one of the above five labels. By default, the instance's label is "no relation" unless each ACC has the exact boundaries of a gold standard argument component and one of the remaining four relations holds between the two gold argument components.

We create training RCs as follows. We construct RCs corresponding to true relations out of all pairs of argument components in a training essay having a gold relation. As the number of potential RCs far exceeds the number of gold relations in an essay, we undersample the "no relation" class in the following way. Given a pair of argument components A and B between which there is a gold relation, we define Ap to be the closest previous ACC in the essay as generated in Section 8.4.1 such that Ap's text doesn't overlap with A. We also define As as the closest succeeding ACC after A such that As and A do not overlap. We define Bp and Bs similarly with respect to B. From these ACCs, we generate the four instances (Ap, B), (As, B), (A, Bp), and (A, Bs), all of which have the "no relation" label, as long as the pairs' text sequences do not overlap. We believe the resulting "no relation" training instances are informative since each one is "close" to a gold relation.

We represent each instance using S&G's structural, lexical, syntactic, and indicator features for solving the same problem. Briefly, RC structural features describe many of the same things about each ACC as did the ACC structural features, though they also describe the difference between the ACCs (e.g., the difference in punctuation counts). Lexical features consist primarily of the unigrams appearing in each ACC and word pairs, where each word from one ACC is paired with each word from the other. Syntactic and indicator features encode the same information about each ACC as the ACC syntactic and indicator features did.
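The undersampling scheme above can be sketched as follows; spans are (start, end) token offsets, accs is the Section 8.4.1 candidate list sorted by start offset, and the helper names are illustrative only.

```python
import bisect

def overlaps(x, y):
    """True if two (start, end) token spans share any tokens."""
    return x[0] < y[1] and y[0] < x[1]

def nearest_non_overlapping(span, accs):
    """Closest preceding and succeeding ACCs whose text does not overlap `span`."""
    starts = [a[0] for a in accs]
    pos = bisect.bisect_left(starts, span[0])
    prev = next((accs[i] for i in range(pos - 1, -1, -1) if not overlaps(accs[i], span)), None)
    succ = next((accs[i] for i in range(pos, len(accs)) if not overlaps(accs[i], span)), None)
    return prev, succ

def no_relation_rcs(gold_relation, accs):
    """Build up to four "no relation" training RCs around one gold relation (A, B)."""
    A, B = gold_relation
    Ap, As = nearest_non_overlapping(A, accs)
    Bp, Bs = nearest_non_overlapping(B, accs)
    pairs = [(Ap, B), (As, B), (A, Bp), (A, Bs)]
    return [(x, y, "no relation") for x, y in pairs
            if x is not None and y is not None and not overlaps(x, y)]
```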

We now apply the classifier to test essay RCs, which are created as follows. Given that this is a pipelined argument mining system, in order to ensure that the RI system's output is consistent with that of the ACI system, we generate test RCs from all possible pairs of ACCs that the ACI system predicted are real components (i.e., it labeled them something other than "non-argumentative").

8.5 Joint Inference for Argument Mining

In this section we discuss our joint inference approach to argument mining.

8.5.1 Motivation

There are two major problems with the pipeline approach described in the previous section.

First, many essay-level within-task constraints are not enforced. For instance, the ACI task has the constraint that each essay has exactly one major claim, and the RI task has the constraint that each claim has no more than one parent. This problem arises because our ACI and RI classifiers, like those of S&G, classify each ACI and RI test instance independently of other test instances. We propose to enforce such essay-level within-task constraints in an ILP framework, employing ILP to perform joint inference over the outputs of our ACI and RI classifiers so that the resulting classifications satisfy these constraints.2

2Note that while we partition documents into folds in our cross-validation experiments, S&G partition instances into folds. Hence, S&G’s evaluation setting prevents them from enforcing essay-level constraints in addition to being unrealistic in practice.

The second problem with the pipeline approach is that errors made early on in the pipeline propagate. For instance, assume that a Support relation exists between two argument components in a test essay. If the pipeline system fails to (heuristically) extract one or both of these argument components, or if it successfully extracts them but misclassifies one or both of them as non-argumentative, then the pipeline system will not be able to identify the relationship between them because no test RCs will be created from them.

The above problem arises because the RI classifier assumes the most probable output of the ACI classifier for each ACC as input. Hence, one possible solution to this problem is to make use of the n-best outputs of the ACI classifier for each argument component type, as this increases the robustness of the pipeline to errors made by the ACI classifier.

We obtain the n-best ACI outputs and employ them as follows. Recall that the ACI system uses a maximum entropy classifier, and therefore its output for each ACC is a list of probabilities indicating how likely it is that the ACC belongs to each of the four classes (premise, claim, major claim, or non-argumentative). This means that it is possible to rank all the ACCs in a text by these probabilities. We use this idea to identify (1) the 3 most likely premise ACCs from each sentence, (2) the 5 most likely claim ACCs from each paragraph, and (3) the 5 most likely major claim ACCs from each essay.3 Given these most likely ACC lists, we combine pairs of ACCs into test RCs for the RI classifier in the following way. As long as the ACCs do not overlap, we pair (1) each likely premise ACC with every other likely ACC of any type occurring in the same paragraph, and (2) each likely claim ACC with each likely major claim ACC. We then present these test RCs to the RI classifier normally, making no other changes to how the pipeline system works.

Employing n-best ACI outputs, however, introduces another problem: the output of the RI classifier may no longer be consistent with that of the ACI classifier because the RI

3We select more premise than claim ACCs (3 per sentence vs 5 per paragraph) because the corpus contains over twice as many premises as claims. We select major claim ACCs only from the first and last paragraph because major claims never occur in middle paragraphs.

classifier may posit a relationship between two ACCs that the ACI classifier labeled non-argumentative. To enforce this cross-task consistency constraint, we also propose to employ ILP. The rest of this section details our ILP-based joint inference approach for argument mining, which addresses both of the aforementioned problems with the pipeline approach.
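A sketch of the n-best selection step is given below; probs[i] holds the ACI classifier's class probabilities for ACC i, and the sentence/paragraph/first-or-last bookkeeping fields are assumptions about how ACCs are indexed, not part of the original implementation.

```python
from collections import defaultdict

def select_n_best(accs, probs, n_premise=3, n_claim=5, n_major=5):
    """Return the ACC indices kept for joint inference: the 3 most likely premises per
    sentence, the 5 most likely claims per paragraph, and the 5 most likely major
    claims per essay (drawn from the first and last paragraphs only)."""
    def top_per_group(group_key, label, k):
        groups = defaultdict(list)
        for i, acc in enumerate(accs):
            groups[acc[group_key]].append(i)
        kept = set()
        for members in groups.values():
            members.sort(key=lambda i: probs[i][label], reverse=True)
            kept.update(members[:k])
        return kept

    premises = top_per_group("sentence", "premise", n_premise)
    claims = top_per_group("paragraph", "claim", n_claim)
    first_last = [i for i, acc in enumerate(accs) if acc["first_or_last_paragraph"]]
    first_last.sort(key=lambda i: probs[i]["major_claim"], reverse=True)
    majors = set(first_last[:n_major])
    return premises, claims, majors
```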

8.5.2 Basic ILP Approach

We perform joint inference over the outputs of the ACI and RI classifiers by designing and enforcing within-task and cross-task constraints in the ILP framework. Specifically, we create one ILP program for each test essay, as described below.

Let Xn_i, Xp_i, Xc_i, and Xm_i be binary indicator variables representing whether the ILP solver believes ACC i has type none, premise, claim, and major claim, respectively. Let Cn_i, Cp_i, Cc_i, and Cm_i be the probabilities that ACC i has type none, premise, claim, and major claim, respectively, as dictated by the ACI maximum entropy classifier's output.4 Let a be the count of ACCs.

Let Yn_{i,j}, Ys_{i,j}, Ya_{i,j}, Yrs_{i,j}, and Yra_{i,j} be binary indicator variables representing whether the ILP solver believes ACCs i and j have no relation, i is supported by j, i is attacked by j, j is supported by i, and j is attacked by i, respectively, where (i, j) appears in the set of RCs B that we presented to the RI system as modified in Section 8.5.1. We assume all other ACC pairs have no relation. Let Dn_{i,j}, Ds_{i,j}, Da_{i,j}, Drs_{i,j}, and Dra_{i,j} be the probabilities that component candidates i and j have no relation, i is supported by j, i is attacked by j, j is supported by i, and j is attacked by i, respectively, as dictated by the modified RI classifier described in Section 8.5.1.4 Given these definitions and probabilities, our ILP program's default goal is to find an assignment of these variables X and Y in order to maximize P(X) + P(Y), where:

4 We additionally reserve .001 probability mass to distribute evenly among Cn_i, Cp_i, Cc_i, and Cm_i (or Dn_{i,j}, Ds_{i,j}, Da_{i,j}, Drs_{i,j}, and Dra_{i,j}) to prevent math errors involving taking the log of 0 which might otherwise occur in the formulas below.

P(X) = \frac{1}{a} \sum_{i=1}^{a} \log(Cn_i Xn_i + Cp_i Xp_i + Cc_i Xc_i + Cm_i Xm_i)    (8.1)

P(Y) = \frac{1}{|B|} \sum_{(i,j) \in B} \log(Dn_{i,j} Yn_{i,j} + Ds_{i,j} Ys_{i,j} + Da_{i,j} Ya_{i,j} + Drs_{i,j} Yrs_{i,j} + Dra_{i,j} Yra_{i,j})    (8.2)

subject to the integrity constraints that: (8.3) an ACC is either not an argument component or it has exactly one of the real argument component types, (8.4) a pair of component candidates (i, j) must have exactly one of the five relation types, and (8.5) if there is a relation between ACCs i and j, i and j must each be real components.5

Xn_i + Xp_i + Xc_i + Xm_i = 1    (8.3)

Yn_{i,j} + Ys_{i,j} + Ya_{i,j} + Yrs_{i,j} + Yra_{i,j} = 1    (8.4)

(Xp_i + Xc_i + Xm_i) + (Xp_j + Xc_j + Xm_j) − 2(Ys_{i,j} + Ya_{i,j} + Yrs_{i,j} + Yra_{i,j}) ≥ 0    (8.5)

In the objective function, a and |B| serve to balance the contribution of the two tasks, preventing one from dominating the other.
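The sketch below shows how this basic ILP could be written with Gurobi's Python interface (the dissertation only states that Gurobi is used, so the choice of API is an assumption). Because constraints 8.3 and 8.4 force exactly one indicator per ACC and per RC to be 1, the logarithm of each inner sum reduces to a sum of log-probability coefficients, which is how the objective is encoded here. C and D are assumed to already carry the smoothing described in footnote 4.

```python
import math
import gurobipy as gp
from gurobipy import GRB

ACI_TYPES = ["n", "p", "c", "m"]          # none, premise, claim, major claim
RI_TYPES = ["nr", "s", "a", "rs", "ra"]   # no relation, supported by j, attacked by j, supports j, attacks j

def build_basic_ilp(C, D):
    """C[i][t]: probability ACC i has ACI type t; D[(i, j)][r]: probability RC (i, j) has RI type r."""
    a, B = len(C), list(D.keys())
    model = gp.Model("argument-mining")
    X = {(i, t): model.addVar(vtype=GRB.BINARY, name=f"X_{i}_{t}") for i in range(a) for t in ACI_TYPES}
    Y = {(i, j, r): model.addVar(vtype=GRB.BINARY, name=f"Y_{i}_{j}_{r}") for (i, j) in B for r in RI_TYPES}

    for i in range(a):                                            # (8.3) exactly one type per ACC
        model.addConstr(gp.quicksum(X[i, t] for t in ACI_TYPES) == 1)
    for (i, j) in B:                                              # (8.4) exactly one relation type per RC
        model.addConstr(gp.quicksum(Y[i, j, r] for r in RI_TYPES) == 1)
        real_i = gp.quicksum(X[i, t] for t in ("p", "c", "m"))
        real_j = gp.quicksum(X[j, t] for t in ("p", "c", "m"))
        related = gp.quicksum(Y[i, j, r] for r in RI_TYPES if r != "nr")
        model.addConstr(real_i + real_j - 2 * related >= 0)       # (8.5) related ACCs must be real components

    # Objective (8.1) + (8.2), linearized via the one-hot constraints above.
    PX = (1.0 / a) * gp.quicksum(math.log(C[i][t]) * X[i, t] for i in range(a) for t in ACI_TYPES)
    PY = (1.0 / len(B)) * gp.quicksum(math.log(D[i, j][r]) * Y[i, j, r] for (i, j) in B for r in RI_TYPES)
    model.setObjective(PX + PY, GRB.MAXIMIZE)
    return model, X, Y
```

Calling model.optimize() and reading off the variables set to 1 then recovers the joint labeling.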

8.5.3 Enforcing Consistency Constraints

So far we have described integrity constraints, but recall that our goal is to enforce consistency by imposing within-task and cross-task constraints, which force the ILP solutions to

5Note that because of previous integrity constraints, the first term in constraint 8.5 is 1 only if we predict that i is a real component, the second term is 1 only if we predict that j is a real component, and the third term is −2 only if we predict that there is a relationship between them. Otherwise, each term is 0. Thus term 3 prevents us from predicting a relationship between i and j unless the first and second terms cancel it out through predicting that i and j are real components.

more closely resemble real essay argument structures. Our consistency constraints fall into four categories.

Our constraints on major claims are that: (8.6) there is exactly one major claim in each essay, (8.7) major claims always occur in the first or last paragraph, and (8.8) major claims have no parents.

\sum_{i=1}^{a} Xm_i = 1    (8.6)

Xm_i = 0  for all i not in the first or last paragraph    (8.7)

Ys_{i,j} + Xm_j ≤ 1,  Ya_{i,j} + Xm_j ≤ 1,  Yrs_{i,j} + Xm_i ≤ 1,  Yra_{i,j} + Xm_i ≤ 1    (8.8)

Our constraints on premises are that: (8.9) a premise has at least one parent, and (8.10) a premise is related only to components in the same paragraph.

\sum_{\{i | (i,j) \in B\}} (Ys_{i,j} + Ya_{i,j}) + \sum_{\{k | (j,k) \in B\}} (Yrs_{j,k} + Yra_{j,k}) − Xp_j ≥ 0    (8.9)

for i < j, i ∉ Par(j): Xp_j − Yn_{i,j} ≤ 0;  for k > j, k ∉ Par(j): Xp_j − Yn_{j,k} ≤ 0    (8.10)

where i < j and j < k mean ACC i appears before j and j appears before k, and Par(j) is the set of ACCs in j's covering paragraph.

Our constraints on claims state that: (8.11) a claim has no more than one parent,6 and (8.12) if a claim has a parent, that parent must be a major claim.

6 The coefficient of Xc_j is equal to the number of terms in the two summations on the left hand side of the equation. The intention is that component j's claim status should have equal weight with the relations it might potentially participate in, so it is possible for j to participate as a child in a relation with all other available components unless it is a claim, in which case it can participate in at most one relation as a child.

(|\{i | (i,j) \in B\}| + |\{k | (j,k) \in B\}|) \, Xc_j + \sum_{\{i | (i,j) \in B\}} (Ys_{i,j} + Ya_{i,j}) + \sum_{\{k | (j,k) \in B\}} (Yrs_{j,k} + Yra_{j,k}) ≤ |\{i | (i,j) \in B\}| + |\{k | (j,k) \in B\}| + 1    (8.11)

for j < k: Xm_k − Xc_j − Yrs_{j,k} − Yra_{j,k} ≥ −1;  for j > i: Xm_i − Xc_j − Ys_{i,j} − Ya_{i,j} ≥ −1    (8.12)

The last category comprises constraints that do not fit well into any other category: (8.13) the boundaries of actual components never overlap, (8.14) each paragraph must have at least one claim or major claim, and (8.15) each sentence may have at most two argument components.7

Xn_i + Xn_j ≥ 1  for all i, j such that i overlaps j    (8.13)

\sum_{i \in P} (Xc_i + Xm_i) ≥ 1  for all paragraphs P    (8.14)

\sum_{i \in S} (Xp_i + Xc_i + Xm_i) ≤ 2  for all sentences S    (8.15)

We solve each ILP program using Gurobi.8

7Unlike the other constraints, the last two constraints are only mostly true: 5% of paragraphs have no claims or major claims, and 1.2% of sentences have 3 or more components.

8http://www.gurobi.com
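Continuing the Gurobi sketch from Section 8.5.2, a representative subset of these consistency constraints could be added to the same model as follows; the first_or_last, paragraphs, and sentences index lists are assumed bookkeeping, not part of the original implementation.

```python
import gurobipy as gp

def add_consistency_constraints(model, X, a, first_or_last, paragraphs, sentences):
    """Add constraints 8.6, 8.7, 8.14, and 8.15 to the model built in the earlier sketch."""
    # (8.6) exactly one major claim per essay
    model.addConstr(gp.quicksum(X[i, "m"] for i in range(a)) == 1)
    # (8.7) major claims may only appear in the first or last paragraph
    for i in range(a):
        if i not in first_or_last:
            model.addConstr(X[i, "m"] == 0)
    # (8.14) every paragraph has at least one claim or major claim
    for paragraph in paragraphs:
        model.addConstr(gp.quicksum(X[i, "c"] + X[i, "m"] for i in paragraph) >= 1)
    # (8.15) no sentence has more than two argument components
    for sentence in sentences:
        model.addConstr(gp.quicksum(X[i, "p"] + X[i, "c"] + X[i, "m"] for i in sentence) <= 2)
```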

8.5.4 F-score Maximizing Objective Function

The objective function we employ in the previous subsection attempts to maximize the average probability of correct assignment of variables over the ACI and RI problems. This kind of objective function, which aims to maximize classification accuracy, was originally introduced by Roth and Yih (2004) in their seminal ILP paper, and has since then been extensively applied to NLP tasks. However, it is arguably not an ideal objective function for our task, where F-score rather than classification accuracy is used as the evaluation metric. In this section, we introduce a novel method for constructing an ILP objective function that directly maximizes the average F-score over the two problems. Recall that F-score can be simplified to:

F = \frac{2TP}{2TP + FP + FN}    (8.16)

where TP, FP, and FN are the counts of true positives, false positives, and false negatives, respectively. Unfortunately, we cannot use this equation for F-score in an ILP objective function for two reasons. First, this equation involves division, which cannot be handled using ILP since ILP can only handle linear combinations of variables. Second, TP, FP, and FN need to be computed using gold annotations, which we don't have in a test document. We propose to instead maximize F by maximizing the following:

G = \alpha \cdot 2TP_e − (1 − \alpha)(FP_e + FN_e)    (8.17)

where TP_e, FP_e, and FN_e are estimated values for TP, FP, and FN, respectively, and α attempts to balance the importance of maximizing the numerator vs. minimizing the denominator.9 We ignore the 2TP term in the denominator because minimizing it would directly reduce the numerator.

9We tune α on the development set, allowing it to take any value from 0.7, 0.8, or 0.9, as this range tended to perform well in early experiments.

To maximize average F-score, we can therefore attempt to maximize the function (G_c + G_r)/2, where G_c and G_r are the values of G in Equation 8.17 as calculated using the estimated values from the ACI and RI problems, respectively.

The question that still remains is: how can we estimate values for TP, FP, and FN in Equation 8.17? Our key idea is inspired by the E-step of the Expectation-Maximization algorithm (Dempster et al., 1977): while we cannot compute the actual TP, FP, and FN due to the lack of gold annotations, we can compute their expected values using the probabilities returned by the ACI and RI classifiers. Using the notation introduced in Section 8.5.2, the expected TP, FP, and FN values for the ACI task can be computed as follows:

TP_e = \sum_{i,g} C_{gi} X_{gi}    (8.18)

FP_e = \sum_{i,g} (1 − C_{gi}) X_{gi}    (8.19)

FN_e = \sum_{i,g} X_{gi} \Big(\sum_{h \neq g} C_{hi}\Big) + \sum_{i} X_{ni} (1 − C_{ni})    (8.20)

where g and h can be any argumentative class from the ACI problem (i.e., premise (p), claim (c), or major claim (m)). The formulas we use to calculate TP_e, FP_e, and FN_e for the RI problem are identical except C is replaced with D, X is replaced with Y, and g and h can be any class from the RI problem other than no-relation.
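Because TP_e, FP_e, and FN_e are linear in the indicator variables, G can be encoded by attaching a single coefficient to each variable. The sketch below computes those coefficients for the ACI task under the formulas above; replacing C with D and the class inventory with the four relation types gives the RI version.

```python
def g_coefficients(C, alpha, arg_classes=("p", "c", "m")):
    """Coefficient of each indicator variable X_{g,i} in
    G = 2*alpha*TPe - (1 - alpha)*(FPe + FNe), per Equations 8.17-8.20.
    C[i][g] is the (smoothed) probability that ACC i has class g; "n" is non-argumentative."""
    coefficients = {}
    for i, probs in C.items():
        for g in arg_classes:
            expected_tp = probs[g]                                      # contribution to (8.18)
            expected_fp = 1.0 - probs[g]                                # contribution to (8.19)
            expected_fn = sum(probs[h] for h in arg_classes if h != g)  # first term of (8.20)
            coefficients[i, g] = 2 * alpha * expected_tp - (1 - alpha) * (expected_fp + expected_fn)
        # labeling i non-argumentative contributes only to FNe (second term of 8.20)
        coefficients[i, "n"] = -(1 - alpha) * (1.0 - probs["n"])
    return coefficients
```

Summing coefficients[i, g] * X[i, g] over all variables gives G_c, and the ILP objective becomes (G_c + G_r)/2 in place of P(X) + P(Y).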

8.6 Evaluation

In this section we discuss how we evaluate our joint inference approach to argument mining.

8.6.1 Experimental Setup

Corpus. As mentioned before, we use as our corpus the 90 essays annotated with argumentative discourse structures by S&G. All of our experiments are conducted via five-fold cross-validation on this corpus. In each fold experiment, we reserve 60% of the essays for training, 20% for development (selecting features and tuning α), and 20% for testing.

Evaluation metrics. To calculate F-score on each task using Equation 8.16, we need to explain what constitutes a true positive, false positive, or false negative on each task. Given that j is a true argument component and i is an ACC, the formulas for the ACI task are:

TP = |\{j \mid \exists i : gl(j) = pl(i) \wedge i \doteq j\}|    (8.21)

FP = |\{i \mid pl(i) \neq n \wedge \nexists j : gl(j) = pl(i) \wedge i \doteq j\}|    (8.22)

FN = |\{j \mid \nexists i : gl(j) = pl(i) \wedge i \doteq j\}|    (8.23)

where gl(j) is the gold standard label of j, pl(i) is the predicted label of i, n is the non-argumentative class, and i ≐ j means i is a match for j. i and j are considered an exact match if they have exactly the same boundaries, whereas they are considered an approximate match if they share over half their tokens.
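The sketch below implements these counts and the resulting F-score for the ACI task. Spans are (start, end) token offsets, the label "n" marks non-argumentative predictions, and the over-half-overlap test is one reasonable reading of the approximate-match criterion rather than the exact implementation.

```python
def approx_match(a, b):
    """Two token spans approximately match if their overlap covers more than half of each."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    return overlap > 0.5 * (a[1] - a[0]) and overlap > 0.5 * (b[1] - b[0])

def exact_match(a, b):
    return a == b

def aci_f_score(gold, predicted, match=approx_match):
    """gold, predicted: lists of (span, label) pairs; implements Equations 8.16 and 8.21-8.23."""
    tp = sum(1 for gs, gl in gold
             if any(pl == gl and match(gs, ps) for ps, pl in predicted))
    fn = len(gold) - tp
    fp = sum(1 for ps, pl in predicted
             if pl != "n" and not any(gl == pl and match(gs, ps) for gs, gl in gold))
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0
```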

We perform most of our analysis on approximate match results rather than exact match results as it can be difficult even for human annotators to identify exactly the same boundaries for an argument component.10 We use the same formulas for calculating these numbers for the RI problem except that j and i represent a true relation and an RC respectively, two relations approximately (exactly) match if both their source and target ACCs approximately (exactly) match, and n is the no-relation class.

10Approximate match has been used in evaluating opinion mining systems (e.g., Choi et al. (2006), Yang and Cardie (2013)), where researchers have also reported difficulties in having human annotators identify exactly the same boundaries for an opinion expression and its sources and targets. They have adopted an even more relaxed notion of approximate match: they consider two text spans an approximate match if they share at least one overlapping token.

8.6.2 Results and Discussion

Approximate and exact match results of the pipeline approach (BASE) and the joint approach (OUR) are shown in Table 8.3. As we can see, using approximate matching, OUR system achieves highly significant11 improvements over the pipelined baseline system by a variety of measures.12 The most important of these improvements is shown in the last column, where our system outperforms the baseline by 13.9% absolute F-score (a relative error reduction of 18.5%). This result is the most important because it most directly measures our performance in pursuit of our ultimate goal, maximizing the average F-score over both the ACI and RI problems. The highly significant improvements in other measures, particularly the improvements of 13.2% and 14.6% in ACI and RI F-score respectively, follow as a consequence of this maximization. Using exact matching, the differences in scores between OUR system and BASE's are smaller and highly significant for fewer measures. In particular, under Exact matching OUR system's performances on the RI-P and RI-R metrics are significant (p < 0.02), while under Approx matching, they are highly significant (p < 0.002).

8.6.3 Ablation Results

To analyze the performance gains yielded by each improvement to our system, we show ablation results in Table 8.4. Each row of the table shows the results of one ablation experiment on the test set. That is, we obtain them by removing exactly one feature set or improvement type from our system.

11Boldfaced results in Table 8.3 are highly significant (p < 0.002, paired t-test) compared to the baseline.

12 All the results in Tables 8.3 and 8.4 are averaged across five folds, so it is not true that F_avg = 2 P_avg R_avg / (P_avg + R_avg). Our F-score averaging method is preferable to calculating F-scores using this formula because the formula can be exploited to give artificially inflated F-scores by alternating between high precision, low recall, and low precision, high recall labelings on different folds.

Table 8.3. Five-fold cross-validation average percentages for argument component identification (ACI) and relation identification (RI) for OUR system and the pipeline-based BASEline system. Column abbreviations are Major Claim F-score (MC-F), Claim F-score (C-F), Premise F-score (P-F), Precision (P), Recall (R), F-score (F), Support F-score (S-F), and Attack F-score (A-F).

                         ACI                            RI               Avg
Match   System  MC-F   C-F   P-F     P     R     F   S-F   A-F     P     R     F     F
Approx  BASE    11.1  26.9  51.9  64.0  33.6  44.0   6.1   0.8   5.7   6.2   5.8  24.9
Approx  OUR     22.2  42.6  66.0  56.6  57.9  57.2  21.3   1.1  16.8  28.0  20.4  38.8
Exact   BASE     7.4  24.2  43.2  50.4  29.6  37.3   4.4   0.8   4.1   4.7   4.3  20.8
Exact   OUR     16.9  37.4  53.4  47.5  46.7  47.1  13.6   0.0  12.7  15.4  12.9  30.0

The Baseline feature sets we remove include those for the ACI task (Cb) from Section 8.4.1,13 and those for the RI task (Rb) from Section 8.4.2. The ILP improvement sets we remove14 are the Default ILP system (Id) from Section 8.5.2; the Major claim (Im), Premise (Ip), Claim (Ic), and Other (Io) constraints from Section 8.5.3, Equations 8.6−8.8, 8.9−8.10, 8.11−8.12, and 8.13−8.15, respectively; and the F-score maximizing objective function (If) from Section 8.5.4.

Broadly, we see from the last column that all of our improvement sets are beneficial (usually significantly15) to the system, as performance drops with their removal. Notice also that whenever removing an ILP improvement set harms average F-score, it also simultaneously harms ACI and RI F-scores, usually significantly. This holds true even when the improvement

13When we remove a baseline feature set, we represent each instance to the corresponding classifier using no features. As a result, the classifier’s predictions are based solely on the frequency of the classes seen during training.

14 Note that removing the default ILP system (Id) necessitates simultaneously removing all other ILP- related improvements. Thus, a system without it is equivalent to BASE, but with RCs generated as described in Section 8.5.1.

15Boldfaced results are significantly lower than ALL, the system with all improvements left intact, with p < 0.05.

Table 8.4. How OUR system performs on one development set as measured by percent Precision, Recall, and F-score if each improvement or feature set is removed.

              ACI                  RI          Avg
Mod      P     R     F       P     R     F      F
ALL   54.9  58.8  56.7    16.7  26.4  20.4   38.6
Cb    44.0  38.5  40.9    14.2  14.8  14.4   27.7
Rb    57.9  58.7  58.2    16.6  24.0  18.2   38.2
Id    64.0  33.6  44.0     6.6  21.8  10.1   27.0
Im    44.6  46.7  45.6    14.7  27.8  19.2   32.4
Ip    51.5  58.4  54.7    13.2  26.1  17.3   36.0
Ic    49.0  51.5  50.2    14.2  27.9  18.8   34.5
Io    40.5  65.9  50.1     7.0  29.2  11.2   30.7
If    61.1  40.4  48.5    22.3  15.7  18.2   33.4

set deals primarily with only one task (e.g., Io for the ACI task), suggesting that our system is benefiting from joint inference over both tasks.

8.6.4 Error Analysis and Future Work

Table 8.3 shows that OUR system has more trouble with the RI task than the ACI task. A closer inspection of OUR system's RI predictions reveals that its low precision is mostly due to predicted relationships wherein one of the participating ACCs is not a true argument component. Since false positives in the ACI task have an outsized impact on RI precision, it may be worthwhile to investigate ILP objective functions that more harshly penalize false positive ACCs.

The RI task's poor recall has two primary causes. The first is false negatives in the ACI task. It is impossible for an RI system to correctly identify a relationship between two ACs if the ACI system fails to identify either one of them as an AC. We believe ACI recall, and by extension, RI recall, can be improved by exploiting the following observations. First, we noticed that many argument components OUR system fails to identify, regardless of their type, contain words that are semantically similar to words in the essay's topic (e.g., if the topic mentions "school", argument components might mention "students"). Hence, one way to improve ACI recall, and by extension, RI recall, would be to create ACI features using a semantic similarity measure such as the Wikipedia Link-based similarity measure (Witten and Milne, 2008). Second, major claims are involved in 32% of all relationships, but OUR system did an especially poor job at identifying them due to their scarcity. Since we noticed that major claims tend to include strong stancetaking language (e.g., words like "should", "must", and "believe"), it may be possible to improve major claim identification by constructing an arguing language lexicon as in Somasundaran and Wiebe (2010), then encoding the presence of any of these arguing words as ACC features.

The second major cause of OUR system's poor RI recall is its failure to identify relationships between two correctly extracted ACs. We noticed many of the missed relationships involve ACs that mention some of the same entities. Thus, a coreference resolver could help us build features that describe whether two ACCs are talking about the same entities.

8.7 Summary

We presented the first results on end-to-end argument mining in persuasive student essays using a pipeline approach, improved this baseline by designing and employing global consistency constraints to perform joint inference over the outputs of the two tasks in an ILP framework, and proposed a novel objective function that enables F-score to be maximized directly by an ILP solver. In an evaluation on Stab and Gurevych's corpus of 90 essays, our approach yields an 18.5% relative error reduction in F-score over the pipeline system.

CHAPTER 9

ARGUMENTATION MINING WITH STRUCTURED PERCEPTRONS

9.1 Introduction

As we discussed in the last chapter, argumentation mining typically involves addressing two subtasks: (1) argument component identification (ACI), which consists of identifying the locations and types of the components that make up the arguments (i.e., major claims, claims, and premises), and (2) relation identification (RI), which involves identifying the type of relation that holds between two argument components (i.e., support, attack, none).

As an example, consider the following text segment taken from a corpus of student essays that Stab and Gurevych (S&G) annotated with argument components and their relations (Stab and Gurevych, 2016):

In my view point, (1) I would agree to idea of taking a break before starting higher education. (2) Students who take break, gain lot of benefits by traveling around the places or by working. (3) They tend to contribute more due to their real world experiences which makes them more mature.

In this example, premise (3) supports claim (2), and (1) is a major claim.

Despite the large amount of recent work on computational argumentation, there is an area of argument mining where progress is fairly stagnant: the development of end-to-end argument mining systems. With the rarest exceptions, existing argument mining systems are not end-to-end. For example, using their annotated corpus, S&G trained an ACI classifier and applied it to classify only gold argument components (i.e., text spans corresponding to a major claim, claim, or premise in the gold standard) or sentences that contain no gold argument components (as non-argumentative). Similarly, they applied their learned RI classifier to classify only the relation between two gold argument components. In other words, they simplified both tasks by avoiding the challenging task of identifying the locations of argument components. Consequently, their approach cannot be applied in a realistic setting

where the input is an unannotated essay. Recent related work has focused on using joint inference (via Integer Linear Programming (ILP), for instance) to improve the ACI and RI classifications (Peldszus and Stede, 2015; Stab and Gurevych, 2016; Wei et al., 2017), but since they continue to make the same assumptions that S&G made, these systems remain non-end-to-end.

In this chapter, we propose a new approach to the challenging but under-investigated task of end-to-end argument mining in student essays by viewing it as a structure learning task. Specifically, we train a structured perceptron to directly induce the argument tree associated with each paragraph. Since a structured perceptron's training instances consist of a paragraph paired with the correct (i.e., hand-annotated) argument tree rather than individual or pairs of text segments that we need to make some decision about, it can encode information about the global structure of an argument tree. So, for example, a structured perceptron can encode information about tree depth, letting us learn how likely 3 is to have children based on how far it is from the tree's root (2) rather than just from information about 3 and any potential children. We can also learn that the tree where 2 is the parent of only one argument component (3) should be preferred to a tree where 2 is the parent of many children, as claims usually have only a small number of children.

We compare our approach to the recent work on end-to-end argument mining in student essays by Persing and Ng (2016a) (P&N), who employed integer linear programming to coordinate the decisions made by an ACI classifier and those made by an RI classifier on heuristically extracted ACCs. In an evaluation on a corpus of 402 essays annotated by S&G, our structured perceptron significantly outperforms Persing and Ng's ILP approach by 4.7%.

9.2 Related Work

As the task we address in this chapter is nearly identical to the task addressed in the previous chapter, related work is discussed in Section 8.2.

9.3 Corpus

While the task we address in this chapter is nearly identical to that of the previous chapter, the corpus we use in this chapter is a newer version of the argumentation mining corpus introduced in Stab and Gurevych (2014a). More specifically, our corpus consists of 402 persuasive essays collected and annotated in Stab and Gurevych (2016). Some relevant statistics are shown in Table 9.1. Each essay is an average of 4.6 paragraphs (16.8 sentences) in length and is written in response to a topic such as “should high school make music lessons compulsory?” or “competition or co-operation-which is better?”.

The corpus annotations describe the essays' argument structure, including the locations and types of the components that make up the arguments, and the types of relations that hold between them. The three annotated argument component types include: major claims, which express the author's stance with respect to the essay's topic, claims, which are controversial statements that should not be accepted by readers without additional support, and premises, which are reasons authors give to persuade readers about the truth of another argument component statement. The two relation types include: support, which

Table 9.1. Argument mining corpus statistics (v2).

Essays:               402
Paragraphs:         1,833
Sentences:          6,741
Major claims:         751
Claims:             1,506
Premises:           3,838
Support relations:  3,613
Attack relations:     219

indicates that one argument component supports another, and attack, which indicates that one argument component attacks another.

9.4 Baselines

In this section, we describe our baseline approaches to the task of end-to-end argument mining.

9.4.1 Pipeline (PIPE)

Our first baseline system is a reimplementation of the pipeline-based baseline used by P&N for the argument mining task, adapted to account for changes in the annotation guidelines for version 2 of S&G’s corpus. It is called a pipeline system because it solves the argument mining subtasks of ACI and RI sequentially. That is, it first identifies the argument components and their types from the essay, and then it identifies the relationships between them.

Argument Component Identification

PIPE's approach to the ACI subtask, in turn, also consists of two steps. In the first step, PIPE identifies a set of argument component candidates (ACCs), sequences of text that could potentially be argument components, from a paragraph. To do this, it first parses each of the paragraph's sentences using the Stanford CoreNLP toolkit (Manning et al., 2014). Then it applies a set of low precision, high recall heuristic rules developed by P&N based on the sentence's syntactic parse tree to extract the ACCs. Since argument components are annotated at the clause level, an example of these rules states that an ACC can be extracted from the text of any simple declarative clause (an S node in the syntactic parse) beginning after the first comma and ending immediately before any terminal punctuation. The rules have sufficiently high recall that they are able to identify an ACC with the same exact boundaries as 92% of all argument components and an ACC with approximately the same boundaries as 98% of all argument components.

Given this set of extracted ACCs, PIPE trains a classifier for ACI using MALLET's (McCallum, 2002) implementation of maximum entropy classification to label the ACCs with argument component labels. Each ACC serves as an instance whose class label is the same as that of the argument component sharing its exact boundaries (major claim, claim, or premise). If there is no AC sharing the ACC's exact boundaries, it is labeled "non-argumentative".

Each instance is represented using S&G's structural, lexical, syntactic, indicator, and contextual features for solving (a simplified version of) the same problem. Briefly, the structural features describe an ACC and its covering sentence's length, punctuations, and location in the essay. Lexical features describe the 1−3 grams of the ACC and its covering sentence. Syntactic features are extracted from the ACC's covering sentence's parse tree and include things such as production rules. Indicator features describe any explicit connectives that immediately precede the ACC. Contextual features describe the contents of the sentences preceding and following the ACC primarily in ways similar to how the structural features describe the covering sentence.

Relation Identification

After the classifier in Section 9.4.1 predicts which ACCs in a paragraph represent real argument components as well as their labels, PIPE is ready to determine how the ACCs are related to each other. It considers RI between pairs of ACCs a five-class classification problem. Given a pair of ACCs A1 and A2 where A1 occurs before A2 in the essay, either they are unrelated, A1 supports A2, A2 supports A1, A1 attacks A2, or A2 attacks A1.

To solve this problem, PIPE also learns an RI maximum entropy classifier. Each training instance, called a relation candidate (RC), consists of a pair of ACCs and one of the above five labels. By default, the instance's label is "no relation" unless each ACC has the exact boundaries of a gold standard argument component and one of the remaining four relations holds between the two gold argument components.

To train the classifier, an RC corresponding to each of the gold relations in all paragraphs is created as an instance of one of the four argumentative labels mentioned above. From each gold relation, PIPE also creates four additional "no relation" RCs by replacing each of the original relationship's participating ACs with nearby ACCs.

PIPE represents each RC using S&G's feature set for (a simpler version of) the same task. This feature set consists of structural, lexical, syntactic, and indicator features very similar to those it used for ACI, though adapted to deal with relationships between pairs of ACCs. For example, the ACI lexical features consist mainly of word unigrams in an ACC. The RI version pairs unigrams from the two participating ACCs together.

Since information about gold ACs is not available in the test set, test RCs must be created differently. PIPE generates test RCs from all possible pairs of ACCs that the ACI system predicted are real components within the same paragraph (i.e., it labeled them something other than “non-argumentative”).

9.4.2 Integer Linear Programming (ILP)

Two problems with the PIPE approach outlined above are that (1) errors made by the ACI subsystem propagate to the RI task, thus reducing RI performance, and (2) there are constraints on the annotations in S&G's corpus that are not enforced by PIPE.

To solve both of these problems, P&N propose solving the ACI and RI tasks jointly using integer linear programming (Roth and Yih, 2004), a technique for solving constrained optimization problems. To understand how their ILP system does this, first recall that an instance labeled by a maximum entropy classifier is assigned a probability distribution over all the instance's possible labels. The PIPE system just assumes that an instance's predicted label is the label the learner assigns the highest probability for that instance. The ILP system instead uses integer linear programming to label all the ACCs and RCs in an essay in order to choose the best labels for all of them. So for example, if a paragraph contains an RC that is very likely to be a support relation (according to the RI classifier), but one of its participating ACCs' probability of being argumentative according to the ACI classifier is just barely too low for the PIPE system to label it as such, the ILP system can choose to label the ACC with one of the argumentative ACI labels in order to allow it to participate in the relationship.

In order to benefit from this kind of reasoning, the ILP system cannot use the same set of ACCs and RCs used by the PIPE system, as the PIPE system excludes from consideration any RCs that can be generated from ACCs that are most probably non-argumentative. To account for this, the ILP system selects (1) the 3 most likely premise ACCs from each sentence, (2) the 5 most likely claim ACCs from each paragraph, and (3) the 5 most likely major claim ACCs from each essay occurring in the first or last paragraph.1 It then constructs RCs from all pairings of these ACCs occurring in the same paragraph as long as at least one was chosen because it might be a premise and neither one was chosen because it might be a major claim, since the corpus adheres to the constraint that all premises are in a relationship and all major claims are not. The test RCs generated in this way are presented to the RI

1The ILP system selects more premise than claim ACCs (3 per sentence vs 5 per paragraph) because the corpus contains over twice as many premises as claims. The corpus constrains major claims so that they may only appear in an essay’s first or last paragraph.

maximum entropy classifier normally, making no other changes to how the pipeline system works. This introduces the problem that the RI classifier may posit a relationship between incompatible ACCs (e.g., one might be labeled non-argumentative). We mentioned above that integer linear programming is a technique for solving constrained optimization problems, so it is possible to construct one that not only assigns the most probable labels to all ACCs and RCs in a paragraph, but also ensures that the ACCs and RCs are consistent. In particular, the constraints encoded into the integer linear program ensure that (1) each ACC is assigned exactly one type, (2) each RC is assigned exactly one type, (3) if there is a relationship between two ACCs, the ACCs must both be assigned argumentative types (i.e., they can't be labeled non-argumentative), (4) major claims can only occur in the first or last paragraphs, (5) major claims have no parents, (6) a premise must have a parent, (7) argumentatively labeled ACCs cannot overlap in the text, (8) a paragraph contains at least one claim or major claim, and (9) a sentence must not have more than two argumentatively labeled ACCs. The optimization problem of finding the best labeling of instances subject to these constraints is solved using Gurobi.2

We claimed above that integer linear programming can be used to find the best assignment of labels to our ACCs and RCs, but did not define how to measure an assignment's quality. Because performance on end-to-end argumentation mining is evaluated as the average of ACI and RI F-scores, the ILP system's objective function is constructed to maximize this number. Recall that the general F-score formula can be simplified to:

F = \frac{2TP}{2TP + FP + FN}    (9.1)

where TP, FP, and FN are counts of true positive, false positive, and false negative predictions, respectively. An integer linear program cannot directly maximize this formula for any task

2http://www.gurobi.com

because division cannot be encoded into an objective function and TP, FP, and FN can only be calculated if the system has access to gold standard test data. For this reason, ILP instead encodes (G_{ACI} + G_{RI})/2 as the objective function its integer linear program maximizes, where G_{ACI} and G_{RI} are the values of G below as calculated on the ACI and RI tasks, respectively:

G = \alpha \cdot 2TP_e − (1 − \alpha)(FP_e + FN_e)    (9.2)

where TP_e, FP_e, and FN_e are expected values for TP, FP, and FN, respectively, and α attempts to balance the importance of maximizing the numerator vs. minimizing the denominator in the F-score formula. These expected values for a complete labeling of all ACCs and RCs are calculated based on the probabilities assigned to the labeled instances as returned by the ACI and RI classifiers.

9.5 Approach

A problem with the baseline approaches described above is that they can be viewed as making a collection of purely local decisions. They cannot exploit argument-tree-level information (e.g., how many claims should an argument tree have) except through the use of hard constraints, which must be painstakingly encoded manually.

To address this shortcoming, we need a learning algorithm that can select the best Candidate Argument Tree (CAT) from each paragraph. A CAT for a paragraph consists of 0 or more labeled ACCs3 in the paragraph's text and 0 or more labeled RCs describing how they are related to each other argumentatively.4 The best CAT, then, is the one whose structure

3In previous sections, a labeled ACC or RC could have a non-argumentative or not related label, respectively. For the remainder of this document, a labeled ACC or RC is assumed to have an argumentative or related label.

4Technically, a CAT (or GAT) may be a forest, as there is no constraint on this corpus ensuring that all ACs are connected by relationships.

most closely matches the paragraph's Gold Argument Tree (GAT), which consists of exactly the set of ACs and relationships between them that have been annotated.

Since there are many ways to select and label text segments and relationships between them, a paragraph can contain many possible CATs. How can we select the best CAT given a paragraph? To perform this task, we employ a variation of the structured perceptron (Collins, 2002) algorithm. Conceptually, a structured perceptron is a learner which accepts as input a set of structured instances (such as our CATs) and outputs the one that is best.

More concretely, a structured perceptron is just a normal perceptron whose input is (the feature representation of) one structured instance (like a CAT) and whose output is a real value. A structured perceptron's output, then, is the instance, out of the set of all instances fed to it, that yields the greatest real value when fed into the normal perceptron.

A structured perceptron’s learning algorithm, too, is just like that of a normal percep- tron. More specifically, our structured perceptron is sequentially fed sets of CATs from each paragraph in the training data. If the CAT it outputs for a paragraph does not match the paragraph’s GAT, the perceptron’s edge weights get updated according to:

wi,j = wi−1,j + β(gj − cj) (9.3)

where wi−1,j is the old weight associated with feature j, wi,j is its new value, β is a learning rate (which we set to 0.1), and gc and cj are GAT and CAT’s values respectively for feature j.
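A sparse implementation of this update might look like the following sketch; the dictionary-based feature representation and the names perceptron_update and score are our illustrative choices, not the dissertation's code.

```python
# A sparse sketch of the update in Equation (9.3).  Feature vectors are plain
# dicts mapping feature names to values.

def perceptron_update(weights, gat_features, cat_features, beta=0.1):
    """Move weights toward the gold tree's features and away from the chosen CAT's."""
    for j in set(gat_features) | set(cat_features):
        g_j = gat_features.get(j, 0.0)  # the GAT's value for feature j
        c_j = cat_features.get(j, 0.0)  # the chosen CAT's value for feature j
        weights[j] = weights.get(j, 0.0) + beta * (g_j - c_j)
    return weights

def score(weights, features):
    """Standard perceptron output w . f for one candidate argument tree."""
    return sum(weights.get(j, 0.0) * v for j, v in features.items())
```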

To initialize our structured perceptron's weights, we first assign each edge a random weight. Then, for each paragraph in the training set, we generate the set of CATs that are adjacent to the paragraph's GAT. A CAT is adjacent to a GAT (or another CAT) if it conforms to all the constraints in Section 9.4.2 and is structurally identical (contains the same labeled ACCs and labeled RCs) to the GAT except that it 1) contains one more labeled ACC, 2) has one labeled ACC removed, 3) has one labeled ACC relabeled (e.g., from claim to premise), 4) has one labeled RC relabeled (e.g., from support to attack), or 5) one of its labeled ACCs has a different parent.5 Then, for each generated CAT, we apply the weight update rule above as if the structured perceptron had chosen the CAT.

We initialize weights in this way because perceptrons with randomly-initialized weights can learn slowly. By initially training it on CATs that are structurally similar to the GAT, we give our perceptron an initial idea about which features are important.
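The following sketch illustrates how the five adjacency operations might be enumerated, assuming a CAT is represented as a pair of dictionaries (labeled ACCs and labeled RCs); the constraint checks described in the footnote are omitted here.

```python
# A simplified sketch of generating CATs adjacent to a given tree.  A CAT is
# modeled as (accs, rcs) with accs: {acc_id: label} and rcs: {(child, parent):
# label}; filtering out constraint-violating neighbors is not shown.
AC_LABELS = ["major_claim", "claim", "premise"]
RC_LABELS = ["support", "attack"]

def adjacent_cats(cat, candidate_accs):
    accs, rcs = cat
    neighbors = []
    # 1) add one more labeled ACC
    for acc in candidate_accs:
        if acc not in accs:
            for label in AC_LABELS:
                neighbors.append(({**accs, acc: label}, dict(rcs)))
    # 2) remove one labeled ACC (and any RCs touching it)
    for acc in accs:
        pruned = {k: v for k, v in rcs.items() if acc not in k}
        neighbors.append(({a: l for a, l in accs.items() if a != acc}, pruned))
    # 3) relabel one ACC / 4) relabel one RC
    for acc, old in accs.items():
        for label in AC_LABELS:
            if label != old:
                neighbors.append(({**accs, acc: label}, dict(rcs)))
    for rc, old in rcs.items():
        for label in RC_LABELS:
            if label != old:
                neighbors.append((dict(accs), {**rcs, rc: label}))
    # 5) give one labeled ACC a different parent
    for (child, parent), label in rcs.items():
        for new_parent in accs:
            if new_parent not in (child, parent):
                moved = {k: v for k, v in rcs.items() if k != (child, parent)}
                moved[(child, new_parent)] = label
                neighbors.append((dict(accs), moved))
    return neighbors
```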

Our structured perceptron is trained differently than a standard structured perceptron as well. Since there are so many possible CATs for each paragraph, it would be impractical to generate and compare all of them with a structured perceptron. For that reason, we employ the following search procedure to find the best CAT.

We maintain a priority queue of the best CATs, where a CAT’s priority is the numerical value it yields when fed to the perceptron. We initialize the priority queue with a CAT extracted from the pipeline system’s predictions. More specifically, we assign each ACC its pipeline-predicted label. We then force the CAT to conform to the corpus’s constraints by removing overlapping ACCs with argumentative labels. If the pipeline system predicted the presence of any premises in a paragraph, but predicted no claims, we relabel the ACC with the highest probability of being a claim as a claim. This allows us to add support RCs for each premise to a claim in the CAT.

We then iterate the following procedure. For each of the CATs in the queue, we generate all its adjacent CATs and add them to the queue if they have never been added to it on a previous iteration. If at the end of an iteration, the size of the queue has not grown, we say that the structured perceptron chooses the CAT at the head of the queue. Otherwise, we truncate the queue to length n6 and start the next iteration.

5See the supplemental file for details on how we apply all these adjacency rules to generate CATs that don’t violate any constraints.

6We use n = 10.

To better understand this algorithm, consider the case where n = 1. This would be a straightforward greedy search where, upon each iteration, we move to the best adjacent CAT until we reach a local maximum where the CAT we have selected obtains a higher score from the perceptron than any of its neighbors. Setting n to larger values makes it less likely that we will get stuck at a local maximum since it means we keep track of several high-quality CATs at once. Intuitively, this algorithm should allow us to walk towards a CAT that is very similar to the GAT because a CAT's rank in the queue represents how much it resembles a generic gold argument tree. We employ this same search algorithm during testing.
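Putting the pieces together, the search might be sketched as follows; score() and adjacent_cats() are the illustrative helpers sketched earlier, and featurize() is an assumed function mapping a CAT to its sparse feature vector.

```python
# A sketch of the bounded best-first search over CATs described above.

def cat_key(cat):
    """Hashable signature of a CAT, used to avoid re-adding a CAT to the queue."""
    accs, rcs = cat
    return (frozenset(accs.items()), frozenset(rcs.items()))

def search_best_cat(initial_cat, candidate_accs, weights, featurize, n=10):
    seen = {cat_key(initial_cat)}
    queue = [initial_cat]
    while True:
        grew = False
        for cat in list(queue):
            for neighbor in adjacent_cats(cat, candidate_accs):
                key = cat_key(neighbor)
                if key not in seen:          # only CATs never queued before
                    seen.add(key)
                    queue.append(neighbor)
                    grew = True
        queue.sort(key=lambda c: score(weights, featurize(c)), reverse=True)
        if not grew:
            return queue[0]                  # local maximum: head of the queue wins
        queue = queue[:n]                    # truncate to the n best CATs
```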

A shortcoming of the structured perceptron algorithm is that by default it attempts to tune its weights to minimize training error. However, as mentioned in Section 9.4.2, this is not the metric by which we will evaluate its performance. Recall that the ILP system used a custom objective function in order to maximize average F-score between the ACI and RI tasks. We believe it is possible to do something similar with structured perceptrons in the following way.

The output of a normal perceptron is given by w · f, where w is the perceptron’s weight vector and f is an instance’s feature representation. By augmenting this dot product with a loss function, we can encourage our structured perceptron to select CATs that will help it make instructive mistakes during training. Our structured perceptron’s output function, then, is given by:

o_i = w \cdot f_i + c\,\alpha \cdot \{FP_{ACI_i}, FN_{ACI_i}, FP_{RI_i}, FN_{RI_i}\} \qquad (9.4)

where oi is the perceptron’s output for CAT i, fi is CAT i’s feature vector, and c is a cost factor associated with the loss function, and FPACIi is the number of false positive labeled ACCs CAT i contains when compared the paragraph’s gold standard argument tree GAT i. α is a vector determining how much impact we want a CAT’s false positives and false negatives on the ACI and RI tasks to have on the perceptron’s training output. By selecting

170 the c and α parameters appropriately, we can ensure that these mistakes help it improve its average F-score.7 Note that this loss term is only applied to the structured perceptron’s output during training. During testing, we use the perceptron’s standard output function. We iterate the training procedure over all the paragraphs in the training set 100 times, as the algorithm’s performance plateaus at around 100 iterations. After each iteration over the training set, we replace the perceptron with a perceptron whose weights are the average of the weights of all the perceptrons learned at the completion of all previous iterations. We implement this behavior in our structured perceptron as well, as it slows the process of overfitting. Thus far, we have discussed how the structured perceptron learning algorithm works, but we have not discussed how we incorporate information about CAT and GAT structures as features for the perceptron. We employ two types of features in our structured perceptron which help us capture structural information about CATs and GATs. The first feature type captures information about the labeled ACCs that comprise a CAT. For each labeled ACC in a CAT, we generate six features. The first three are the sequence of one, two, and three lemmatized words immediately preceding the ACC’s text paired with the ACC’s label (major claim, claim, premise) in this CAT. The second set of three features are the same, but encode lemmatized words immediately succeeding the ACC’s text rather than preceding it. To illustrate why these features would be useful, consider the example text from the introduction. “believe that-MajorClaim” seems like a feature that would appear in a GAT, as “believe that” is a strong indicator that the text following it is a major claim. These features simultaneously convey information to the learner about the numbers of labeled ACCs of each type in a CAT, which is useful since, for example, most GATs contain a few premises.

7We select c = 100 and α = {1, 20, 1, 20} (normalized to unit length) to encourage increased precision.
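As a concrete illustration of Equation (9.4) before we turn to the second feature type, the loss-augmented training score might be computed as in the sketch below; count_errors() is a hypothetical helper that compares a CAT to its paragraph's GAT, and the c and α values are those given in the footnote above.

```python
# Sketch of the loss-augmented training score in Equation (9.4).  count_errors()
# is assumed to return (FP_ACI, FN_ACI, FP_RI, FN_RI) for a CAT vs. its GAT.

def training_score(weights, features, cat, gat, c=100.0,
                   alpha=(1.0, 20.0, 1.0, 20.0)):
    norm = sum(a * a for a in alpha) ** 0.5        # normalize alpha to unit length
    alpha = [a / norm for a in alpha]
    errors = count_errors(cat, gat)                # (FP_ACI, FN_ACI, FP_RI, FN_RI)
    loss = c * sum(a * e for a, e in zip(alpha, errors))
    return sum(weights.get(j, 0.0) * v for j, v in features.items()) + loss
```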

The second feature type captures information about the labeled RCs that comprise a CAT. Given a labeled RC, a feature of this type is comprised of the following triple of values: (any one feature of the previous type from the RC's child, any one feature of the previous type from the RC's parent, the RC's type (support or attack)).

To understand why these features are useful, consider the following sentence: "Given that (1) the Internet is used as an international and public space, (2) the government has no right over the information which may be presented via the Internet." A CAT with the right RC would generate from this sentence's paragraph the feature ('give that', ',', 'supports'), which is a feature that is likely to be generated by a gold relationship.
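The two feature types might be extracted as in the following sketch, which assumes ACCs are token spans over a pre-lemmatized sentence and RCs are (child, parent, relation) triples; these representations and function names are illustrative, not the dissertation's implementation.

```python
# A simplified sketch of the two structural feature types described above.
# Each labeled ACC is (start, end, label) over a list of lemmas; each labeled
# RC is (child_acc, parent_acc, relation).

def acc_features(lemmas, acc):
    start, end, label = acc
    feats = []
    for n in (1, 2, 3):
        before = " ".join(lemmas[max(0, start - n):start])  # preceding n-gram
        after = " ".join(lemmas[end:end + n])                # succeeding n-gram
        feats.append(f"prev:{before}-{label}")
        feats.append(f"next:{after}-{label}")
    return feats

def rc_features(lemmas, rc):
    child, parent, relation = rc
    # every (child feature, parent feature, relation type) triple
    return [(cf, pf, relation)
            for cf in acc_features(lemmas, child)
            for pf in acc_features(lemmas, parent)]
```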

9.6 Evaluation

In this section we discuss how we evaluate our structured perceptron based approach to argument mining.

9.6.1 Experimental Setup

Corpus. As mentioned before, we use as our corpus the 402 essays annotated with argumentative discourse structures by S&G. All of our experiments are conducted via five-fold cross-validation on this corpus. In each fold experiment, we reserve 60% of the essays for training, 20% for development (tuning α), and 20% for testing.

Evaluation metrics. To calculate F-score on each task using Equation 9.1, we need to explain what constitutes a true positive, false positive, or false negative on each task. Given that j is a true argument component and i is an ACC, the formulas for the ACI task are:

TP = \{ j \mid \exists i : gl(j) = pl(i) \wedge i \doteq j \} \qquad (9.5)

FP = \{ i \mid pl(i) \neq n \wedge \nexists j : gl(j) = pl(i) \wedge i \doteq j \} \qquad (9.6)

FN = \{ j \mid \nexists i : gl(j) = pl(i) \wedge i \doteq j \} \qquad (9.7)

where gl(j) is the gold standard label of j, pl(i) is the predicted label of i, n is the non-argumentative class, and i \doteq j means i is a match for j. i and j are considered an exact match if they have exactly the same boundaries and an approximate match if they share over half their tokens.

We perform most of our analysis on approximate match results rather than exact match results as it can be difficult even for human annotators to identify exactly the same boundaries for an argument component.8 We use the same formulas for calculating these numbers for the RI problem except that j and i represent a true relation and an RC respectively, two relations approximately (exactly) match if both their source and target ACCs approximately (exactly) match, and n is the no-relation class.
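Under one reading of these definitions, the counts might be computed as in the sketch below; the exact treatment of "share over half their tokens" (here, each span must cover more than half of the other) is our assumption.

```python
# A sketch of the counts in Equations (9.5)-(9.7).  Components are
# (start, end, label) token spans.

def matches(i, j, exact=False):
    (si, ei, _), (sj, ej, _) = i, j
    if exact:
        return (si, ei) == (sj, ej)
    overlap = max(0, min(ei, ej) - max(si, sj))
    return overlap > 0.5 * (ei - si) and overlap > 0.5 * (ej - sj)

def aci_counts(predicted, gold, non_arg="non_arg", exact=False):
    tp = sum(1 for j in gold
             if any(i[2] == j[2] and matches(i, j, exact) for i in predicted))
    fp = sum(1 for i in predicted if i[2] != non_arg
             and not any(j[2] == i[2] and matches(i, j, exact) for j in gold))
    fn = sum(1 for j in gold
             if not any(i[2] == j[2] and matches(i, j, exact) for i in predicted))
    return tp, fp, fn

def f_score(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```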

9.6.2 Results and Discussion

Approximate and exact match results of the pipeline approach (PIPE), the joint inference approach (ILP), and our structured perceptron approach (SP) are shown in Table 9.2. As we can see, using approximate matching, our SP system achieves significant9 improvements over the baseline systems by a variety of measures. The most important of these improvements is shown in the penultimate column, where our SP system outperforms the best baseline by 6.0% absolute F-score (a relative error reduction of 9.6%). The results in this column measure system performances on the relation identification task, which is the more difficult of the two tasks since, in order to correctly identify a relationship, a system must first correctly identify the two argument components participating in it.

8Approximate match has been used in evaluating opinion mining systems (e.g., Choi et al. (2006), Yang and Cardie (2013)), where researchers have also reported difficulties in having human annotators identify exactly the same boundaries for an opinion expression and its sources and targets. They have adopted an even more relaxed notion of approximate match: they consider two text spans an approximate match if they share at least one overlapping token.

9Boldfaced results in Table 9.2 are significant (p < 0.02, paired t-test) compared to the best baseline system, which is determined on a per-performance-metric basis.

Table 9.2. Five-fold cross-validation F-score percentages for argument component identification (ACI) and relation identification (RI). Column abbreviations are major claim F-score (MC-F), claim F-score (C-F), premise F-score (P-F), Precision (P), Recall (R), F-score (F), support F-score (S-F), and attack F-score (A-F).

                            ACI                            RI               Avg
         System   MC-F  C-F   P-F   P     R     F     S-F   A-F   P     R     F      F
Approx   PIPE     44.0  36.1  69.8  75.2  49.3  59.6  20.1  00.0  22.3  17.4  19.5   39.5
         ILP      50.7  42.6  76.7  61.0  66.9  63.8  32.5  01.4  23.9  47.5  31.8   47.8
         SP       49.2  44.2  74.9  63.6  65.3  64.4  38.9  02.9  37.0  38.7  37.8   51.1
Exact    PIPE     39.6  33.9  61.7  64.0  45.8  53.4  17.2  00.0  18.6  15.2  16.7   35.1
         ILP      41.7  34.9  63.5  50.6  55.1  52.7  23.0  00.0  16.9  33.4  22.4   37.6
         SP       36.4  39.2  61.0  51.9  53.3  52.6  29.4  02.0  27.9  29.2  28.6   40.6

Our SP system also outperforms the baselines with respect to the ACI task. As a result, we see in the last column that on the combined task, whose performance metric is the average of the F-scores for the ACI and RI tasks, our system outperforms the best baseline by 3.3% absolute F-score (a relative error reduction of 6.7%). So our system significantly improves performance on the RI task without harming performance on the ACI task.

Using exact matching, the relationships we described between our SP system's performance and those of the best baseline are mostly similar. Our system significantly outperforms the best baseline with respect to nearly all the same metrics and by similar amounts.

9.6.3 Error Analysis and Future Work

It is obvious from Table 9.2 that our system has considerably more difficulty with the RI task than with the ACI task, regardless of whether evaluation occurs in an exact or approximate match setting. While our approach is considerably better at this task than the baseline systems, it nevertheless has much room for improvement.

A closer inspection of our SP system's RI predictions reveals that its low precision is mostly due to predicted relationships wherein one of the participating labeled ACCs is not a true argument component. Its low RI recall, similarly, is mostly due to cases where a candidate tree does not correctly identify one of the participating argument components as a true argument component. Recalling that our structured perceptron's CAT search algorithm begins with a CAT proposed by one of our baseline systems, and that perceptrons tend not to be as robust to large feature sets as the maximum entropy learner-based baselines, the best way to improve the structured perceptron's performance may be to introduce additional features into the maximum entropy classifier targeted at improving local ACI.

9.7 Summary

We developed a structured perceptron approach to the under-investigated task of end-to-end argument mining in persuasive student essays. While most previously proposed approaches to the task predict argument tree structure by making a collection of local decisions about the argument's structure (i.e., whether a given text span is part of the argument structure and how it is related to other text spans in the argument), our structured perceptron approach judges the quality of a proposed argument tree holistically. In an evaluation on version 2 of Stab and Gurevych's corpus of 402 argument-annotated essays, our approach yields a 9.6% relative error reduction on the subtask of identifying relationships between text segments compared to two state-of-the-art baseline systems that have previously been employed for the end-to-end argument mining task.

CHAPTER 10

UNSUPERVISED ARGUMENTATION MINING

10.1 Introduction

As we discussed in the last two chapters, argumentation mining typically involves addressing two subtasks: (1) argument component identification (ACI), which consists of identifying the locations and types of the components that make up the arguments (i.e., major claims, claims, and premises), and (2) relation identification (RI), which involves identifying the type of relation that holds between two argument components (i.e., support, attack, none).

As an example, consider the following text segment taken from a corpus of student essays that Stab and Gurevych (S&G) annotated with argument components and their relations (Stab and Gurevych, 2016):

In my view point, (1) I would agree to idea of taking a break before starting higher education. (2) Students who take break, gain lot of benefits by traveling around the places or by working. (3) They tend to contribute more due to their real world experiences which makes them more mature.

In this example, premise (3) supports claim (2), and (1) is a major claim.

Despite the large amount of recent work on computational argumentation, there is an area of argument mining where progress is fairly stagnant: the development of end-to-end argument mining systems. With the rarest exceptions, existing argument mining systems are not end-to-end. For example, using their annotated corpus, S&G trained an ACI classifier and applied it to classify only gold argument components (i.e., text spans corresponding to a major claim, claim, or premise in the gold standard) or sentences that contain no gold argument components (as non-argumentative). Similarly, they applied their learned RI classifier to classify only the relation between two gold argument components. In other words, they simplified both tasks by avoiding the challenging task of identifying the locations of argument components. Consequently, their approach cannot be applied in a realistic setting where the input is an unannotated essay. Recent related work has focused on using joint inference (via Integer Linear Programming (ILP), for instance) to improve the ACI and RI classifications (Peldszus and Stede, 2015; Stab and Gurevych, 2016; Wei et al., 2017), but since they continue to make the same assumptions that S&G made, these systems remain non-end-to-end.

In the previous two chapters, we presented two end-to-end approaches for addressing the argumentation mining task. While they are improvements on previous approaches in that they do not make S&G's simplifying assumptions, they nevertheless both share one major shortcoming. Being supervised approaches, they both require a large amount of argument-annotated training data in order to perform the end-to-end argumentation mining task.

What’s worse, argument-labeled essays are more expensive to produce than, say, essays annotated with any of the quality scores described in Chapters 3, 4, 5, and 6. The reason for this is that annotating an essay for one of these scoring tasks requires an annotator to, after one careful reading, assign the essay one score. Argumentation mining annotation, by contrast, requires annotators to identify an average of 15.2 argument components and 9.5 relationships between them per essay.

To address this problem, in this chapter we propose a new unsupervised approach to the challenging but under-investigated task of end-to-end argument mining in student essays. More specifically, we develop heuristics for automatically identifying argument components and their relationships within an essay. We then employ self-training to leverage these heuristically-labeled components and relationships to train an argument mining system whose performance is near state-of-the-art despite requiring no labeled data.

We compare our approach to the recent work on end-to-end argument mining in student essays described in the previous two chapters, which, recall, used integer linear programming and structured perceptrons to argumentatively label essays. In an evaluation on a corpus of 402 essays annotated by S&G, our approach performs competitively with both systems.

10.2 Related Work

As the task we address in this chapter is nearly identical to the task addressed in the previous two chapters, related work is discussed in Section 8.2.

10.3 Corpus

We similarly use the same corpus as that described in the previous chapter, so discussion of our argumentation mining corpus can be found in Section 9.3.

10.4 Baselines

Our first two baselines for argumentation mining on student essays are described in Chapter 9.4. More specifically, our first baseline is a pipeline system (PIPE) wherein we solve the argument mining subtasks of ACI and RI sequentially. This is an adaptation of the pipeline system described in Persing and Ng (2016a) that allows it to handle the newer S&G corpus appropriately. The adapted pipeline system is described in Chapter 9.4.1.

A problem with the pipeline system is that the output of its ACI subsystem is not 100% accurate, but the RI system depends on the ACI system's output in order to determine how argument component candidates are related. Thus errors can propagate from the ACI system to the RI system, which can harm the RI system's output. Our second baseline addresses this problem by solving both the ACI and RI subtasks jointly using integer linear programming (ILP), as originally described in Persing and Ng (2016a). The version of this system we use has also been adapted for use on version 2 of S&G's corpus as described in Chapter 9.4.2.

A problem with the baseline approaches described above is that they can be viewed as making a collection of purely local decisions. They cannot exploit argument tree level information (e.g., how many claims should an argument tree have) except through the use of hard constraints, which must be painstakingly encoded manually. To address this shortcoming, we use as our third baseline a structured perceptron approach (SP) which can select the best Candidate Argument Tree (CAT) from each paragraph. Whereas the instances in the previous two approaches are Argument Component Candidates (ACCs), segments of text that could possibly be major claims, claims, or premises, and Relationship Candidates (RCs), pairs of ACCs that might have a support or attack relationship, the structured perceptron approach treats as its instances entire CATs. A CAT for a paragraph consists of 0 or more labeled ACCs in the paragraph's text and 0 or more labeled RCs describing how they are related to each other argumentatively. The best CAT, then, is the one whose structure most closely matches the paragraph's Gold Argument Tree (GAT), which consists of exactly the set of ACs and relationships between them that have been annotated. Thus a CAT's feature representation in the structured perceptron can include tree level features. This system is described in Chapter 9.5.

10.5 Unsupervised Approach

A problem with all the baseline approaches is that they are purely supervised. That is, they require a lot of argument-annotated essay data in order to perform the argumentation mining task. In this section, we will describe an unsupervised approach to argumentation mining which avoids this problem by using no annotated training data.

Broadly, our unsupervised approach to argumentation mining involves first identifying a large set of ACCs that could potentially be argument components from an essay, as described in Chapter 8.4.1. Our approach heuristically labels a subset of these ACCs with argument component labels, and then iteratively builds a larger training set using the following method. It iteratively 1) trains a maximum entropy classifier on the heuristically labeled data, 2) uses the classifier to make probabilistic predictions about the labels of ACCs in an unlabeled essay set, and 3) adds the ACCs whose labels we are most confident of (according to the probabilistic predictions) to the heuristically labeled set. After several iterations of this process, it heuristically stitches the maximum entropy classifier's output into an argument tree.

10.5.1 Heuristic Labeling

The approaches we proposed for the argumentation mining task in the previous two chapters focus on exploiting properties of the problem's structure rather than introducing insights into what argument components look like or what argumentative relationships look like. In order to heuristically label ACCs and RCs, however, we will need to introduce such insights into our approach. The heuristics we introduce for automatically labeling ACCs rely on three factors: 1) the position of the paragraph the ACC appears in (i.e., whether it is the first, last, or a middle paragraph), 2) the location of the sentence the ACC appears in within its paragraph (i.e., whether it is the first, last, or a middle sentence), and 3) the context n-grams surrounding the ACC.

Heuristically Labeling Major Claims

The reason our heuristics depend on an ACC's paragraph's location within an essay is that the typical argumentative structure of, for example, the first paragraph looks very different from the typical argumentative structure of a body paragraph (which occurs between the first and last paragraphs). Indeed, there is a hard constraint on this corpus that major claims can only appear in the first and last paragraphs, which makes sense because essays are often introduced with a thesis statement (a major claim), and often conclude with one. For this reason, we search only the first and last paragraphs for ACCs to heuristically label as major claims.

To find major claims in the first paragraph, we first observe that each type of argument component is often introduced using some discourse marker like those shown in Table 10.1.

Table 10.1. Context n-grams used to heuristically label argument components of each type.

Preceding AC
  Major Claim: consequently; contend that; feel that; i agree; i believe; i feel; i think; in conclusion ,; in short ,; maintain that; my opinion; my view; so; suppose that; therefore; think that; thus ,
  Claim: consequently; contend that; feel that; finally ,; first ,; first of all ,; firstly ,; i agree; i believe; i feel; i think; in conclusion ,; in short ,; last ,; lastly ,; maintain that; my opinion; my view; second ,; secondly ,; so; suppose that; therefore; think that; third ,; thirdly ,; thus ,
  Premise: for instance ,; further ,; furthermore ,; in addition ,; in fact ,; indeed ,; moreover ,; since; to illustrate ,

Succeeding AC
  Major Claim: because; that is ,
  Claim: because ,; that is ,
  Premise: so; , therefore

In particular, major claims often follow the phrases in the first row of the major claim column. For this reason, we identify the last occurrence of one of these connectives within an essay's first paragraph (because introductory paragraphs are more likely to conclude with a major claim than they are to begin with one). We then find the ACC that appears most closely after the phrase and that ends at the end of a simple, declarative clause (as determined by a syntactic parse of the sentence), heuristically labeling this ACC as a major claim.

If no major claim can be identified in this way, we then search the essay for the last occurrence of one of the n-grams in the major claim column's succeeding row in Table 10.1. The last ACC occurring before this n-gram that exactly covers a simple, declarative clause is heuristically labeled as a major claim, as n-grams in this cell frequently indicate that whatever preceded them is about to be further explained or reasoned about, which in turn indicates the importance of the preceding text to the argument. If no major claim ACC can be identified in either of these ways, we simply do not heuristically label any major claims from this paragraph.

Essays' last paragraphs tend to be structurally very similar to their first paragraphs, containing major claims and little else. For this reason, we follow the same procedure outlined above for identifying major claims in an essay's last paragraph. However, since major claims are sometimes expressed without any nearby discourse markers like those shown in Table 10.1, it is important for us to identify a few major claims without them in order to help our eventual learned system generalize about major claim structure. Since it is a little more common for last paragraphs to include major claims than it is for first paragraphs to include them, and since a paragraph's last sentence is more likely to include a major claim than its first, we employ the following additional rule for identifying major claims in the last paragraph. If an essay's last paragraph is very short, containing two or fewer sentences, and we are not able to heuristically label any other major claims from this last paragraph using any of the methods described above, we heuristically label the longest simple, declarative clause in the last sentence as a major claim.
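The first-paragraph procedure might be sketched as follows; the marker lists are abridged from Table 10.1, and last_occurrence(), accs_after(), accs_before(), and is_simple_declarative_clause() are hypothetical stand-ins for the marker search, ACC extraction, and parse-based clause test described above.

```python
# A condensed sketch of the first-paragraph major claim heuristic.
MC_PRECEDING = ["i believe", "i think", "my opinion", "my view"]      # abridged
MC_SUCCEEDING = ["because", "that is ,"]

def label_major_claim(first_paragraph):
    pos = last_occurrence(first_paragraph, MC_PRECEDING)
    if pos is not None:
        for acc in accs_after(first_paragraph, pos):          # nearest first
            if is_simple_declarative_clause(acc):
                return acc      # closest following ACC ending a declarative clause
    pos = last_occurrence(first_paragraph, MC_SUCCEEDING)
    if pos is not None:
        for acc in reversed(accs_before(first_paragraph, pos)):
            if is_simple_declarative_clause(acc):
                return acc      # last ACC before the succeeding marker
    return None                 # no major claim is heuristically labeled here
```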

Heuristically Labeling Claims

While claims can occur in any paragraph, we only attempt to heuristically label claims occurring in body (middle) paragraphs. To understand the reason why we do this, it is helpful to compare the first two columns of Table 10.1. Notice that the columns' contents are identical except that the discourse markers preceding claims include ordinal markers (e.g., first, second, last). This is because, just as major claims express the main purpose of an essay, claims express the main purpose of a paragraph. Thus, they tend to be introduced using similar phrases. This makes it difficult to distinguish between claims and major claims occurring in an essay's first or last paragraphs.

Because of this similarity between claims and major claims, the way we heuristically identify claims in a body paragraph is identical to the way we identify major claims in the first paragraph except for the following two differences. First, we use discourse markers from the claim column of Table 10.1 rather than ones from the major claim column. Second, we exclude from consideration any ACCs occurring outside of the paragraph's first or last sentences. We do this because, while claims can occur in any sentence, body paragraphs tend to either begin or conclude by stating the claim the paragraph is about. The remaining sentences tend to be filled with premises that support the claim.

Heuristically Labeling Premises

Since premises are rare in first or last paragraphs, as with claims, we also only heuristically label premises occurring in body paragraphs. We also exclude from premise labeling consideration any ACCs occurring in a paragraph's last sentence because, even if we cannot heuristically identify a claim in a last sentence, there is too strong a chance that this is because our claim labeling heuristics do not have a high enough recall and simply failed to recognize an existing claim's presence.

With those exceptions, our premise labeling heuristics are very similar to our claim and major claim labeling heuristics. We first find all occurrences in an essay of preceding premise discourse markers from Table 10.1. For each of these occurrences, we find the closest following ACC that ends at the end of a simple declarative clause and heuristically label it as a premise. We then find all occurrences in an essay of succeeding premise discourse markers from the table. For each of these occurrences, we heuristically label as a premise the nearest preceding ACC having the same boundaries as a simple declarative clause. The selected preceding or succeeding ACC can neither overlap with a previously heuristically labeled ACC, nor can it occur in a different sentence than the discourse marker that triggered its labeling.

Unlike our heuristics for labeling claims and major claims, our heuristics for labeling premises are not limited to labeling one premise per paragraph, as can be seen in the description in the previous paragraph. The reason for this is that it is most common for a paragraph to contain one or fewer claims or major claims, but body paragraphs tend to contain very few non-argumentative sentences. Those body paragraph sentences that do not contain a claim usually contain a premise.

However, despite being more permissive than the claim and major claim heuristics, this premise labeling method still fails to capture the fact that most sentences contain a premise. For this reason, we search all body paragraph sentences except the last sentence for occurrences of any of the phrases from Table 10.1. If we do not find any of these phrases, we take this as a strong indication that the sentence itself is a premise; if one of these phrases is present, it might instead be that our precision-oriented heuristics simply failed to identify an argument component that is actually there. We therefore heuristically label the longest ACC occurring in such a marker-free sentence as a premise.

An exception to the last premise labeling heuristic is that, if the sentence begins with some introductory phrase measuring three tokens or fewer followed by a comma, we instead label the longest ACC occurring after this phrase as the premise. We do this because there are too many phrases that can be used to introduce a premise for us to enumerate, but as the premise column of the table shows, when they do exist, they are usually short and end with a comma.
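The fallback rule and its introductory-phrase exception might be sketched as follows; ALL_MARKERS, accs_in(), and the sentence and ACC attributes used below are assumed to come from the surrounding system.

```python
# A sketch of the fallback premise rule: body-paragraph sentences (other than
# the last) containing none of the Table 10.1 phrases are themselves treated
# as premises.

def default_premises(body_paragraph_sentences):
    premises = []
    for sentence in body_paragraph_sentences[:-1]:    # never the last sentence
        if any(m in sentence.text for m in ALL_MARKERS):
            continue    # a marker is present, so an AC may simply have been missed
        candidates = accs_in(sentence)
        # exception: skip a short introductory phrase (3 tokens or fewer + comma)
        if "," in sentence.tokens[1:4]:
            cut = sentence.tokens.index(",") + 1
            candidates = [a for a in candidates if a.start >= cut]
        if candidates:
            premises.append(max(candidates, key=lambda a: a.end - a.start))
    return premises
```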

10.5.2 Self-Training

A problem with the labels we applied in the previous subsection is that they are intended to be high precision but low recall. Ideally, an ACI system will strike an appropriate balance between precision and recall in order to maximize its ACI F-score. For this reason, our ACI system needs to be more general than the hand-constructed heuristics described above. So rather than treating our heuristic labeling as an ACI system's predicted labels, we instead employ the heuristically labeled data as our initial training data in an unsupervised variation of self-training.

More specifically, we first extract all the ACCs from an unannotated training set. We label all instances that can be labeled heuristically with the appropriate ACI class label (major claim, claim, or premise), and label the remaining ACCs as probably non-argumentative. Given this set of extracted ACCs, we train a four-class classifier for ACI using MALLET's (McCallum, 2002) implementation of maximum entropy classification.

Each instance is represented using S&G's structural, lexical, syntactic, indicator, and contextual features for solving (a simplified version of) the same problem. Briefly, the structural features describe an ACC and its covering sentence's length, punctuations, and location in the essay. Lexical features describe the 1-3 grams of the ACC and its covering sentence. Syntactic features are extracted from the ACC's covering sentence's parse tree and include things such as production rules. Indicator features describe any explicit connectives that immediately precede the ACC. Contextual features describe the contents of the sentences preceding and following the ACC, primarily in ways similar to how the structural features describe the covering sentence.

The result of this process is a maximum entropy classifier that can assign a probability distribution to any ACC describing how likely it is to be a major claim, claim, premise, or non-argumentative. A problem with this classifier is that, recall, it was trained on data wherein the non-argumentative training instances may have been labeled as such simply because we could not heuristically determine with enough confidence that they should have an argumentative label.

For this reason, we now apply the classifier to the non-argumentatively labeled training ACCs. We select the 10 of these instances that the classifier is most confident are argumentative, provided it assigns at least an 80% probability to one of the argumentative classes. We then label these ACCs with the labels the classifier indicates are appropriate and add them to the heuristically labeled dataset before training a new maximum entropy classifier on this labeled data. We repeat this process until there are no more training ACCs labeled with the non-argumentative label that can be relabeled in this way (i.e., the classifier assigns none of them an 80% probability of belonging to one of the argumentative classes).
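The loop might be sketched as follows; train_maxent() and predict_proba() are hypothetical stand-ins for MALLET's maximum entropy learner rather than its actual API.

```python
# A sketch of the self-training loop.  accs is a list of ACC ids and labels
# maps each id to its current label, initialized by the heuristic labeling.
ARG_CLASSES = ("major_claim", "claim", "premise")

def self_train(accs, labels, batch=10, threshold=0.80):
    while True:
        model = train_maxent(accs, labels)
        scored = []
        for acc in (a for a in accs if labels[a] == "non_arg"):
            probs = predict_proba(model, acc)              # {label: probability}
            best = max(ARG_CLASSES, key=probs.get)
            if probs[best] >= threshold:
                scored.append((probs[best], acc, best))
        if not scored:
            return model                                   # nothing left to promote
        # promote the (at most) `batch` most confident argumentative predictions
        scored.sort(key=lambda t: t[0], reverse=True)
        for _, acc, label in scored[:batch]:
            labels[acc] = label
```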

10.5.3 Building the Argument Tree

Recall that in Chapter 9.5 we heuristically constructed a candidate argument tree (CAT) from the predictions made by an ACI maximum entropy classifier just like the one described above. The difference between the classifiers lies only in the method by which the training data was obtained (hand annotated or heuristically labeled). We apply the same technique to construct a CAT for each paragraph from our self-trained classifier predictions with some changes meant to more strongly enforce our observations about the most likely structure of an argument tree. These changes are needed because we have less confidence in the labels of our heuristically labeled training data (particularly the non-argumentatively labeled ACCs) than we would in hand-annotated data.

Because the only ACs appearing in most first or last paragraphs are major claims, and because it is uncommon for more than one major claim to appear in one of these paragraphs, we heuristically label the ACCs in such paragraphs as follows. First, we identify the ACC that is most likely to be a major claim, as indicated by the ACI maximum entropy classifier. If its probability of being a major claim is > .5, we label it as a major claim. Otherwise it is labeled as non-argumentative. All other ACCs in the paragraph are labeled non-argumentative.

As we discussed earlier, body paragraphs are structured differently. A typical body paragraph contains only one claim and many premises that support it. Major claims are not permitted in body paragraphs as per the annotation guidelines. For this reason, to construct our body paragraph CAT, we first identify the ACC in the body paragraph with the highest probability of being a claim, as determined by the ACI maximum entropy classifier.

We label this ACC as a claim. We then rank all remaining ACCs by their probability of being a premise, from most probable to least probable. Progressing through the ranked list sequentially, we label each ACC as a premise if it meets two conditions. First, it must not overlap with any previously-labeled ACC. Second, its probability of being a premise according to the ACI classifier must be greater than both its probability of being a claim and its probability of being a major claim. For each ACC we label as a premise, we add a support relationship between the premise and the ACC we labeled as the paragraph’s claim.
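The body-paragraph rules might be sketched as follows; probs() and overlaps() are hypothetical helpers around the self-trained classifier and the span-overlap test.

```python
# A sketch of the body-paragraph tree-building rules above.  probs(acc) returns
# a dict of label probabilities for an ACC; overlaps() tests span overlap.

def build_body_paragraph_cat(accs, probs, overlaps):
    claim = max(accs, key=lambda a: probs(a)["claim"])     # most probable claim
    labeled = {claim: "claim"}
    relations = {}
    rest = sorted((a for a in accs if a is not claim),
                  key=lambda a: probs(a)["premise"], reverse=True)
    for acc in rest:
        p = probs(acc)
        if any(overlaps(acc, other) for other in labeled):
            continue                                       # condition 1: no overlap
        if p["premise"] > p["claim"] and p["premise"] > p["major_claim"]:
            labeled[acc] = "premise"                       # condition 2 satisfied
            relations[(acc, claim)] = "support"            # premise supports the claim
    return labeled, relations
```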

10.6 Evaluation

In this section we discuss our evaluation of our unsupervised approach to argument mining.

10.6.1 Experimental Setup

We use the same corpus and evaluation metrics as described in Chapter 9.6.

10.6.2 Results and Discussion

Approximate and exact match results of the pipeline approach (PIPE), the joint inference approach (ILP), the structured perceptron approach (SP), and our unsupervised approach (U) are shown in Table 10.2. As we can see, using approximate matching, our unsupervised system achieves results that are statistically indistinguishable1 from those of the best baseline system by a variety of measures. The most important of these results is shown in the last column, where our unsupervised approach, despite using no labeled training data, achieves performance that is only 0.5% lower than that of the structured perceptron baseline. This column measures the average F-score between the two argument mining subtasks (ACI and RI). Our unsupervised system also achieves results that are statistically indistinguishable from the best baseline on each of the subtasks, scoring only 0.9% less in F-score than the SP system on the RI task and 0.1% more in F-score than the SP system on the ACI task. The other measures on which our unsupervised system's approximate performance is indistinguishable from the best baseline are claim F-score, premise F-score, ACI precision, support F-score, and attack F-score.

Due to the difficulty of heuristically detecting an AC's exact boundaries, our unsupervised system's exact matching performance is statistically indistinguishable from fewer baseline performances than was its approximate performance. By the most important measure, however, our unsupervised system's performance is still indistinguishable from that of the SP baseline system, scoring only 2.7% less than it in absolute average F-score.

1Boldfaced results are not significantly lower (p > .02, paired t-test) than the highest-scoring baseline.

Table 10.2. Five-fold cross-validation F-score percentages for argument component identification (ACI) and relation identification (RI). Column abbreviations are major claim F-score (MC-F), claim F-score (C-F), premise F-score (P-F), Precision (P), Recall (R), F-score (F), support F-score (S-F), and attack F-score (A-F).

                            ACI                            RI               Avg
         System   MC-F  C-F   P-F   P     R     F     S-F   A-F   P     R     F      F
Approx   PIPE     44.0  36.1  69.8  75.2  49.3  59.6  20.1  00.0  22.3  17.4  19.5   39.5
         ILP      50.7  42.6  76.7  61.0  66.9  63.8  32.5  01.4  23.9  47.5  31.8   47.8
         SP       49.2  44.2  74.9  63.6  65.3  64.4  38.9  02.9  37.0  38.7  37.8   51.1
         U        24.7  42.2  77.2  73.5  57.5  64.5  37.8  00.0  38.1  35.4  36.7   50.6
Exact    PIPE     39.6  33.9  61.7  64.0  45.8  53.4  17.2  00.0  18.6  15.2  16.7   35.1
         ILP      41.7  34.9  63.5  50.6  55.1  52.7  23.0  00.0  16.9  33.4  22.4   37.6
         SP       36.4  39.2  61.0  51.9  53.3  52.6  29.4  02.0  27.9  29.2  28.6   40.6
         U        18.9  36.4  59.6  57.7  45.2  50.7  25.9  00.0  26.1  24.3  25.2   37.9

10.6.3 Error Analysis and Future Work

It is obvious from Table 10.2 that our system has considerably more difficulty with the RI task than with the ACI task, regardless of whether evaluation occurs in an exact or approximate match setting. Indeed, this is also true of all the baseline systems. In Chapters 8 and 9, we noted that this is mainly because the RI task is inherently more difficult than the ACI task because, in order to correctly identify a relationship, an argument mining system must also correctly identify the two participating argument components.

While this observation is correct, what it misses is that there do not currently exist high quality features for RI. As we can see from Table 8.4, removing all of the features designed by S&G specifically for the RI task from an argument mining system results in only a tiny drop in performance. The reason for this is that, unlike ACs, argumentative relations are often not triggered by discourse markers in the text. This is largely because argumentative relationships can occur between ACs in non-adjacent sentences. Thus a deeper semantic understanding of ACs is necessary in order to develop features or heuristics for detecting argumentative relationships between them.

While our unsupervised approach achieves state-of-the-art performance on the approximate ACI task and near state-of-the-art performance on the approximate RI task, it performs less well on the exact match ACI and RI tasks. This is largely due to the fact that it is difficult to develop heuristics for finding the exact boundaries of an AC. There are simply too many ways for an author to introduce and trigger the termination of an argument component. Indeed, this is why we needed to develop the complicated boundary detection rules illustrated in Table 8.4.1. This is a subtask on which some annotated training data may be useful, so a semi-supervised approach that focuses on annotating essays having a wide variety of sentence structures may help improve our exact matching performances most.

10.7 Summary

We developed a wholly unsupervised approach to the under-investigated task of end-to-end argument mining in persuasive student essays. While all existing approaches employ a large amount of annotated training data for the task, our approach achieves near state-of-the-art performance without using any human annotations. In particular, in an evaluation on version 2 of Stab and Gurevych's corpus of 402 argument-annotated essays, our unsupervised approach yields a performance that is only 0.5% lower in absolute F-score than a state-of-the-art supervised argument mining system.

CHAPTER 11

CONCLUSION

In this dissertation, we have presented new, state-of-the-art approaches to a variety of automated essay analysis tasks. In particular, the tasks we addressed are 1) scoring an essay according to the quality of its organization (Chapter 3), 2) scoring an essay according to how clear its thesis is (Chapter 4), 3) classifying an essay according to which errors it makes that can hinder its thesis's clarity (Chapter 4), 4) scoring an essay according to how well it adheres to its prompt (Chapter 5), 5) scoring the quality of the argument an essay presents for its thesis (Chapter 6), 6) classifying an essay according to what stance it takes with respect to its prompt (Chapter 7), and 7) detecting the argumentative structure of an essay (Chapters 8, 9, and 10).

For each of these seven tasks, we first defined the new problem. In the cases of the scoring tasks, this involved designing new grading rubrics. In the case of thesis clarity error detection, it involved devising and defining the five error classes. For stance classification and argumentation mining, we defined our tasks as more challenging versions of previously-addressed tasks. In each chapter after related work, we presented our approach to one of these tasks alongside one or more reasonable competing approaches. Each of our approaches for each task significantly outperformed the competing approaches in an evaluation on student essays annotated for the task.

So what was novel about our approaches that made them outperform the baseline systems? Each of our approaches is based on incorporating task-specific linguistic insights into a system for performing the task. The methods we used to exploit our task-specific linguistic insights fall into two broad categories.

Most of our insights were incorporated into our approaches as linguistically-motivated heuristics or features on which we train a machine learner. In all, we presented new heuristics or features for all seven of the tasks, so new linguistically-motivated heuristics or features appear in every chapter except for Chapter 8, which focuses exclusively on exploiting the structure of the argumentation mining problem. The other broad category of methods for incorporating our insights into the tasks involved the modification of appropriate machine learning algorithms for solving the problems. Our contributions in this category included the modifications we made to the argumentation mining integer linear program and structured perceptron in Chapters 8 and 9, respectively.

To stimulate future research on automated essay analysis, we make our annotations for the scoring, thesis clarity error classification, and stance classification tasks publicly available.

REFERENCES

Abu-Jbara, A., B. King, M. Diab, and D. Radev (2013, August). Identifying opinion subgroups in Arabic online discussions. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Sofia, Bulgaria, pp. 829–835. Association for Computational Linguistics.

Agrawal, R., S. Rajagopalan, R. Srikant, and Y. Xu (2003). Mining newsgroups using networks arising from social behavior. In Proceedings of the 12th international conference on World Wide Web, pp. 529–535. ACM.

Attali, Y. and J. Burstein (2006). Automated essay scoring with e-rater® v.2. The Journal of Technology, Learning and Assessment 4 (3).

Balahur, A., Z. Kozareva, and A. Montoyo (2009). Determining the polarity and source of opinions expressed in political debates. In International Conference on Intelligent and Computational Linguistics, pp. 468–480. Springer.

Bansal, M., C. Cardie, and L. Lee (2008, August). The power of negative thinking: Exploiting label disagreement in the min-cut classification framework. In Coling 2008: Companion volume: Posters, Manchester, UK, pp. 15–18. Coling 2008 Organizing Committee.

Barzilay, R. and M. Lapata (2005). Modeling local coherence: An entity-based approach. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL-05), pp. 141–148.

Barzilay, R. and M. Lapata (2008). Modeling local coherence: An entity-based approach. Computational Linguistics 34 (1), 1–34.

Barzilay, R. and L. Lee (2002). Bootstrapping lexical choice via multiple-sequence align- ment. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, pp. 164–171.

Barzilay, R. and L. Lee (2003). Learning to paraphrase: An unsupervised approach using multiple-sequence alignment. In Proceedings of the Human Language Technology Confer- ence of the North American Chapter of the Association for Computational Linguistics, pp. 16–23.

Barzilay, R. and L. Lee (2004). Catching the drift: Probabilistic content models, with applications to generation and summarization. In HLT-NAACL 2004: Main Proceedings, pp. 113–120.

Biran, O. and O. Rambow (2011). Identifying justifications in written dialogs. In Proceedings of the 2011 IEEE Fifth International Conference on Semantic Computing, ICSC '11, Washington, DC, USA, pp. 162–168. IEEE Computer Society.

Blei, D. M., A. Y. Ng, and M. I. Jordan (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research 3, 993–1022.

Blum, A. and P. Langley (1997). Selection of relevant features and examples in machine learning. Artificial Intelligence 97 (1–2), 245–271.

Boltužić, F. and J. Šnajder (2014, June). Back up your stance: Recognizing arguments in online discussions. In Proceedings of the First Workshop on Argumentation Mining, Baltimore, Maryland, pp. 49–58. Association for Computational Linguistics.

Bunescu, R. and R. Mooney (2005). A shortest path dependency kernel for relation extraction. In Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing, pp. 724–731.

Burfoot, C., S. Bird, and T. Baldwin (2011, June). Collective classification of congressional floor-debate transcripts. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, Oregon, USA, pp. 1506–1515. Association for Computational Linguistics.

Burstein, J. and M. Chodorow (1999). Automated essay scoring for nonnative English speakers. In Proceedings of a Symposium on Computer Mediated Language Assessment and Evaluation in Natural Language Processing, pp. 68–75. Association for Computational Linguistics.

Burstein, J., M. Chodorow, and C. Leacock (2004). Automated essay evaluation: The Criterion online writing service. AI Magazine 25 (3), 27.

Burstein, J., K. Kukich, S. Wolff, C. Lu, M. Chodorow, L. Braden-Harder, and M. D. Harris (1998). Automated scoring using a hybrid feature identification technique. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics-Volume 1, pp. 206–210. Association for Computational Linguistics.

Burstein, J. and D. Marcu (2000). Towards using text summarization for essay-based feedback. In La 7e Conference Annuelle sur Le Traitement Automatique des Langues Naturelles TALN.

Burstein, J. and D. Marcu (2003a). Automated evaluation of discourse structure in student essays. Automated essay scoring: A cross-disciplinary perspective, 209–229.

Burstein, J. and D. Marcu (2003b). A machine learning approach for identification of thesis and conclusion statements in student essays. Computers and the Humanities 37 (4), 455–467.

Burstein, J., D. Marcu, S. Andreyev, and M. Chodorow (2001). Towards automatic classification of discourse elements in essays. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pp. 98–105. Association for Computational Linguistics.

Burstein, J., D. Marcu, and K. Knight (2003). Finding the write stuff: Automatic identification of discourse structure in student essays. IEEE Intelligent Systems 18 (1), 32–39.

Burstein, J., J. Tetreault, and S. Andreyev (2010). Using entity-based features to model coherence in student essays. In Human language technologies: The 2010 annual conference of the North American chapter of the Association for Computational Linguistics, pp. 681– 684. Association for Computational Linguistics.

Burstein, J., J. Tetreault, M. Chodorow, H. Zinsmeister, and B. Webber (2013). Holistic annotation of discourse coherence quality in noisy essay writing. Dialogue and Discourse 4 (2), 34–52.

Burstein, J., J. Tetreault, and N. Madnani (2013). The e-rater automated essay scoring system. Handbook of automated essay evaluation: Current applications and new directions, 55–67.

Burstein, J. and M. Wolska (2003). Toward evaluation of writing style: finding overly repetitive word use in student essays. In Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics-Volume 1, pp. 35–42. Association for Computational Linguistics.

Carletta, J. (1996). Assessing agreement on classification tasks: The Kappa statistic. Computational Linguistics 22 (2), 249–254.

Chang, C.-C. and C.-J. Lin (2001). LIBSVM: A library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

Choi, Y., E. Breck, and C. Cardie (2006, July). Joint extraction of entities and relations for opinion recognition. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, pp. 431–439. Association for Computational Linguistics.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and psychological measurement 20 (1), 37–46.

Collins, M. (2002). Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing - Volume 10, pp. 1–8. Association for Computational Linguistics.

Collins, M. and N. Duffy (2002). New ranking algorithms for parsing and tagging: Kernels over discrete structures, and the voted perceptron. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 263–270.

Cortes, C. and V. Vapnik (1995). Support-vector networks. Machine Learning 20 (3), 273– 297.

Das, D., N. Schneider, D. Chen, and N. A. Smith (2010). Probabilistic frame-semantic parsing. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, HLT '10, Stroudsburg, PA, USA, pp. 948–956. Association for Computational Linguistics.

Deerwester, S., S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman (1990). Indexing by latent semantic analysis. Journal of American Society of Information Science 41 (6), 391–407.

Dempster, A. P., N. M. Laird, and D. B. Rubin (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 1–38.

Dikli, S. (2006). An overview of automated scoring of essays. The Journal of Technology, Learning and Assessment 5 (1).

Elliot, S. (2003). IntelliMetric: From here to validity. Automated essay scoring: A cross-disciplinary perspective, 71–86.

Elsner, M., J. Austerweil, and E. Charniak (2007). A unified local and global model for discourse coherence. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pp. 436–443.

Falakmasir, M. H., K. D. Ashley, C. D. Schunn, and D. J. Litman (2014). Identifying thesis and conclusion statements in student essays to scaffold peer review. In International Conference on Intelligent Tutoring Systems, pp. 254–259. Springer.

Faulkner, A. (2014). Automated classification of stance in student essays: An approach using stance target information and the Wikipedia link-based measure. Science 376 (12), 86.

Florou, E., S. Konstantopoulos, A. Koukourikos, and P. Karampiperis (2013, August). Argument extraction for supporting public policy formulation. In Proceedings of the 7th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, Sofia, Bulgaria, pp. 49–54. Association for Computational Linguistics.

Goudas, T., C. Louizos, G. Petasis, and V. Karkaletsis (2015). Argument extraction from news, blogs, and the social web. International Journal on Artificial Intelligence Tools 24 (5).

Granger, S., E. Dagneaux, F. Meunier, and M. Paquot (2009). International Corpus of Learner English (Version 2). Presses universitaires de Louvain.

Hasan, K. S. and V. Ng (2013, October). Stance classification of ideological debates: Data, models, features, and constraints. In Proceedings of the Sixth International Joint Conference on Natural Language Processing, Nagoya, Japan, pp. 1348–1356. Asian Federation of Natural Language Processing.

Higgins, D. and J. Burstein (2007). Sentence similarity measures for essay coherence. In Proceedings of the 7th International Workshop on Computational Semantics, pp. 1–12.

Higgins, D., J. Burstein, and Y. Attali (2006). Identifying off-topic student essays without topic-specific training data. Natural Language Engineering 12 (02), 145–159.

Higgins, D., J. Burstein, D. Marcu, and C. Gentile (2004). Evaluating multiple aspects of coherence in student essays. In Human Language Technologies: The 2004 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 185–192.

Hyland, K. (2005). Metadiscourse: Exploring Interaction in Writing. London: Continuum.

Joachims, T. (1999). Making large-scale SVM learning practical. In B. Schölkopf, C. Burges, and A. Smola (Eds.), Advances in Kernel Methods - Support Vector Learning, Chapter 11, pp. 169–184. Cambridge, MA: MIT Press.

Jurgens, D. and K. Stevens (2010). The S-Space package: An open source package for word space models. In Proceedings of the ACL 2010 System Demonstrations, pp. 30–35.

Kanerva, P., J. Kristoferson, and A. Holst (2000). Random indexing of text samples for latent semantic analysis. In Proceedings of the 22nd Annual Conference of the Cognitive Science Society, pp. 103–106.

Kirkpatrick, S., C. D. Gelatt, and M. P. Vecchi (1983). Optimization by simulated annealing. Science 220 (4598), 671–680.

Landauer, T. K. and S. T. Dumais (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review 104 (2), 211–240.

Landauer, T. K., D. Laham, and P. W. Foltz (2003). Automated scoring and annotation of essays with the intelligent essay assessor. Automated essay scoring: A cross-disciplinary perspective, 87–112.

Levy, R., Y. Bilu, D. Hershcovich, E. Aharoni, and N. Slonim (2014). Context dependent claim detection. In Proceedings of the 25th International Conference on Computational Linguistics, pp. 1489–1500.

Lippi, M. and P. Torroni (2015). Context-independent claim detection for argument mining. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 185–191.

Lippi, M. and P. Torroni (2016). Argument mining from speech: Detecting claims in political debates. In Proceedings of the 30th AAAI Conference on Artificial Intelligence.

Lodhi, H., C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. J. C. H. Watkins (2002). Text classification using string kernels. Journal of Machine Learning Research 2, 419–444.

Louis, A. and D. Higgins (2010). Off-topic essay detection using short prompt texts. In Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 92–95. Association for Computational Linguistics.

Mann, W. C. and S. A. Thompson (1988). Rhetorical structure theory: Toward a functional theory of text organization. Text-Interdisciplinary Journal for the Study of Discourse 8 (3), 243–281.

Manning, C. D., M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, and D. McClosky (2014). The Stanford CoreNLP natural language processing toolkit. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55–60.

Marcu, D. (1999). Discourse trees are good indicators of importance in text. Advances in automatic text summarization, 123–136.

Marcu, D. (2000). The theory and practice of discourse parsing and summarization. MIT Press.

McCallum, A. K. (2002). MALLET: A Machine Learning for Language Toolkit. http://mallet.cs.umass.edu.

Miltsakaki, E. and K. Kukich (2004). Evaluation of text coherence for electronic essay scoring systems. Natural Language Engineering 10 (1), 25–55.

Moens, M.-F., E. Boiy, R. Palau, and C. Reed (2007). Automatic detection of arguments in legal texts. In Proceedings of the 11th International Conference on Artificial Intelligence and Law, ICAIL ’07, New York, NY, USA, pp. 225–230. ACM.

Moschitti, A. (2004). A study on convolution kernels for shallow semantic parsing. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics, pp. 335–342.

Murakami, A. and R. Raymond (2010, August). Support or oppose? Classifying positions in online debates from reply activities and opinion expressions. In Coling 2010: Posters, Beijing, China, pp. 869–875. Coling 2010 Organizing Committee.

Needleman, S. B. and C. D. Wunsch (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48 (3), 443–453.

Nelson, D. L., C. L. McEvoy, and T. A. Schreiber (2004). The University of South Florida free association, rhyme, and word fragment norms. Behavior Research Methods, Instruments, & Computers 36 (3), 402–407.

Nguyen, H. and D. Litman (2016, August). Context-aware argumentative relation mining. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, pp. 1127–1137. Association for Computational Linguistics.

Ong, N., D. Litman, and A. Brusilovsky (2014, June). Ontology-based argument mining and automatic essay scoring. In Proceedings of the First Workshop on Argumentation Mining, Baltimore, Maryland, pp. 24–28. Association for Computational Linguistics.

Palau, R. M. and M.-F. Moens (2009). Argumentation mining: The detection, classification and structure of arguments in text. In Proceedings of the 12th International Conference on Artificial Intelligence and Law, ICAIL ’09, New York, NY, USA, pp. 98–107. ACM.

Park, J. and C. Cardie (2014, June). Identifying appropriate support for propositions in online user comments. In Proceedings of the First Workshop on Argumentation Mining, Baltimore, Maryland, pp. 29–38. Association for Computational Linguistics.

Parker, R., D. Graff, J. Kong, K. Chen, and K. Maeda (2009). English Gigaword Fourth Edition. Linguistic Data Consortium, Philadelphia.

Peldszus, A. and M. Stede (2015). Joint prediction in MST-style discourse parsing for argumentation mining. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 938–948.

Persing, I., A. Davis, and V. Ng (2010). Modeling organization in student essays. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pp. 229–239.

Persing, I. and V. Ng (2013). Modeling thesis clarity in student essays. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 260–269.

Persing, I. and V. Ng (2014). Modeling prompt adherence in student essays. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1534–1543.

Persing, I. and V. Ng (2015, July). Modeling argument strength in student essays. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, pp. 543–552. Association for Computational Linguistics.

Persing, I. and V. Ng (2016a, June). End-to-end argumentation mining in student essays. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, California, pp. 1384–1394. Association for Computational Linguistics.

Persing, I. and V. Ng (2016b). Modeling stance in student essays. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2174–2184.

Powers, D. E., J. C. Burstein, M. Chodorow, M. E. Fowles, and K. Kukich (2000). Comparing the validity of automated and human essay scoring. ETS Research Report Series 2000 (2), i–23.

Rinott, R., L. Dankin, C. Alzate Perez, M. M. Khapra, E. Aharoni, and N. Slonim (2015). Show me your evidence - an automatic method for context dependent evidence detection. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 440–450.

Rooney, N., H. Wang, and F. Browne (2012). Applying kernel methods to argumentation mining.

Roth, D. and W.-t. Yih (2004). A linear programming formulation for global inference in natural language tasks. In HLT-NAACL 2004 Workshop: Eighth Conference on Computational Natural Language Learning (CoNLL-2004), pp. 1–8. Association for Computational Linguistics.

Sahlgren, M. (2005). An introduction to random indexing. In Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering.

Sardianos, C., I. M. Katakis, G. Petasis, and V. Karkaletsis (2015). Argument extraction from news. In Proceedings of the 2nd Workshop on Argumentation Mining, pp. 56–66.

Shermis, M. D. (2014). State-of-the-art automated essay scoring: Competition, results, and future directions from a United States demonstration. Assessing Writing 20, 53–76.

Shermis, M. D., J. Burstein, D. Higgins, and K. Zechner (2010). Automated essay scoring: Writing assessment and instruction. International Encyclopedia of Education 4, 20–26.

Shermis, M. D. and J. C. Burstein (2003). Automated essay scoring: A cross-disciplinary perspective. Routledge.

Silva, T. (1993). Toward an understanding of the distinct nature of L2 writing: The ESL research and its implications. TESOL Quarterly 27 (4), 657–677.

Sobhani, P., D. Inkpen, and S. Matwin (2015, June). From argumentation mining to stance classification. In Proceedings of the 2nd Workshop on Argumentation Mining, Denver, CO, pp. 67–77. Association for Computational Linguistics.

Somasundaran, S., J. Burstein, and M. Chodorow (2014). Lexical chaining for measuring discourse coherence quality in test-taker essays. In COLING, pp. 950–961.

Somasundaran, S. and J. Wiebe (2010). Recognizing stances in ideological on-line debates. In Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, CAAGET ’10, Stroudsburg, PA, USA, pp. 116–124. Association for Computational Linguistics.

Song, Y., M. Heilman, B. Beigman Klebanov, and P. Deane (2014, June). Applying argumentation schemes for essay scoring. In Proceedings of the First Workshop on Argumentation Mining, Baltimore, Maryland, pp. 69–78. Association for Computational Linguistics.

Soricut, R. and D. Marcu (2006). Discourse generation using utility-trained coherence models. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pp. 803–810.

Sridhar, D., J. Foulds, B. Huang, L. Getoor, and M. Walker (2015, July). Joint models of disagreement and stance in online debate. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, pp. 116–125. Association for Computational Linguistics.

Stab, C. and I. Gurevych (2014a, August). Annotating argument components and relations in persuasive essays. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, Dublin, Ireland, pp. 1501–1510. Dublin City University and Association for Computational Linguistics.

Stab, C. and I. Gurevych (2014b, October). Identifying argumentative discourse structures in persuasive essays. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, pp. 46–56. Association for Computational Linguistics.

Stab, C. and I. Gurevych (2016). Parsing argumentation structures in persuasive essays. Technical report, arXiv preprint arXiv:1604.07370.

Swanson, R., B. Ecker, and M. Walker (2015, September). Argument mining: Extracting arguments from online dialogue. In Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Prague, Czech Republic, pp. 217–226. Association for Computational Linguistics.

Teufel, S. (1999). Argumentative Zoning: Information Extraction from Scientific Text. Ph.D. thesis, University of Edinburgh.

Thomas, M., B. Pang, and L. Lee (2006, July). Get out the vote: Determining support or opposition from congressional floor-debate transcripts. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, pp. 327–335. Association for Computational Linguistics.

Versley, Y., A. Moschitti, M. Poesio, and X. Yang (2008). Coreference systems based on kernels methods. In Proceedings of the 22nd International Conference on Computational Linguistics, pp. 961–968.

Walker, M., P. Anand, R. Abbott, and R. Grant (2012). Stance classification using dialogic properties of persuasion. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 592–596.

Walton, D. (1996). Argumentation Schemes for Presumptive Reasoning. Mahwah, New Jersey: Lawrence Erlbaum Associates.

Wang, Y.-C. and C. P. Rosé (2010, June). Making conversational structure explicit: Identification of initiation-response pairs within online discussions. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Los Angeles, California, pp. 673–676. Association for Computational Linguistics.

Wei, Z., C. Li, and Y. Liu (2017). A joint framework for argumentative text analysis incorporating domain knowledge. Technical report, arXiv preprint arXiv:1701.05343.

Wiebe, J., T. Wilson, and C. Cardie (2005). Annotating expressions of opinions and emotions in language. Language resources and evaluation 39 (2-3), 165–210.

Witten, I. and D. Milne (2008). An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. In Proceedings of the AAAI Workshop on Wikipedia and Artificial Intelligence: An Evolving Synergy, AAAI Press, Chicago, USA, pp. 25–30.

Yang, B. and C. Cardie (2013, August). Joint inference for fine-grained opinion extraction. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Sofia, Bulgaria, pp. 1640–1649. Association for Computational Linguistics.

Yang, Y. and J. O. Pedersen (1997). A comparative study on feature selection in text categorization. In Proceedings of the 14th International Conference on Machine Learning, pp. 412–420.

Yessenalina, A., Y. Yue, and C. Cardie (2010, October). Multi-level structured models for document-level sentiment classification. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Cambridge, MA, pp. 1046–1056. Association for Computational Linguistics.

BIOGRAPHICAL SKETCH

Isaac Persing obtained an MS in Computer Science in 2011 from The University of Texas at Dallas, a BS in Mathematics and Computer Science in 2008 from The University of Arizona South, and an AS in Computer Science from Cochise College in 2006. Isaac's areas of interest are natural language processing and data mining.

CURRICULUM VITAE

Isaac Persing [email protected]

Interests

Natural Language Processing Data Mining

Education

MS in Computer Science, The University of Texas at Dallas, August 2011. Intelligent Systems Track; GPA: 3.94

BS in Mathematics/Computer Science, The University of Arizona South, May 2008. GPA: 4.0

AS in Computer Science, Cochise College, May 2006. GPA: 4.0

Experience

Research Assistant and "Hourly Worker", The University of Texas at Dallas, August 2008-May 2011, December 2011-May 2016, August 2016-Present. Worked on natural language processing tasks in the aviation safety, software engineering, social polling, and essay grading domains using machine learning.

Teaching Assistant, The University of Texas at Dallas, August 2016-December 2016. Advised students, graded papers, and occasionally taught classes for the Introduction to Machine Learning course.

Graduate Intern, Oak Ridge National Laboratory, June 2016-August 2016. Built an unstructured text analysis module for a cyber security system using machine learning (https://is.gd/B4hCQJ).

Grants, Scholarships, and Other Awards

Excellence in Education Doctoral Fellowship, The University of Texas at Dallas, 2015-2017

Jonsson Distinguished Research Fellowship, The University of Texas at Dallas, 2008-2011

Outstanding Senior of the Year, The University of Arizona South, 2008

Hispanic College Fund Google Scholarship, 2007, $5000

National SMART Grant, 2007, $4000

University of Arizona Grant, 2007, $1790

Northrop Grumman Scholarship, 2006, 2007, $2200

Employee of the Year & Employee of the Month (x2), Windemere Hotel and Conference Center, 2003, 2004

Other Activities

Reviewer for the Innovative Use of NLP for Building Educational Applications NAACL Workshop, 2016

Reviewer for the Computational Approaches to Causality in Language EACL Workshop, 2014

Title V Summer Bridge Program mentor at UA South, 2007

Computer Languages

Experienced with: Java

Familiar with: C, C++, Python, Unix shell scripting

Publications

Isaac Persing and Vincent Ng. 2016. Modeling Stance in Student Essays. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

Isaac Persing and Vincent Ng. 2016. End-to-End Argumentation Mining in Student Essays. Proceedings of Human Language Technologies: The 2016 Conference of the North American Chapter of the Association for Computational Linguistics.

LiGuo Huang, Vincent Ng, Isaac Persing, Mingrui Chen, Zeheng Li, Ruili Geng, and Jeff Tian. 2015. AutoODC: Automated Generation of Orthogonal Defect Classifications. Automated Software Engineering, 22(1), 3-46.

Isaac Persing and Vincent Ng. 2015. Modeling Argument Strength in Student Essays. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).

Isaac Persing and Vincent Ng. 2014. Vote Prediction on Comments in Social Polls. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing.

Isaac Persing and Vincent Ng. 2014. Modeling Prompt Adherence in Student Essays. Main Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics.

LiGuo Huang, Vincent Ng, Isaac Persing, Mingrui Chen, Zeheng Li, Ruili Geng, and Jeff Tian. 2014. AutoODC: Automated Generation of Orthogonal Defect Classifications. Automated Software Engineering: Special Issue on Realizing Synergies in AI and SE.

Isaac Persing and Vincent Ng. 2013. Modeling Thesis Clarity in Student Essays. Main Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics.

LiGuo Huang, Vincent Ng, Isaac Persing, Ruili Geng, Xu Bai, and Jeff Tian. 2011. AutoODC: Automated Generation of Orthogonal Defect Classifications. Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering.

Isaac Persing, Alan Davis, and Vincent Ng. 2010. Modeling Organization in Student Essays. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing.

Isaac Persing and Vincent Ng. 2010. Improving Cause Detection Systems with Active Learning. Proceedings of the 2010 Conference on Intelligent Data Understanding.

Isaac Persing and Vincent Ng. 2009. Semi-supervised Cause Identification from Aviation Safety Reports. Main Proceedings of the Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing.