Combining Multiple Corpora for Readability Assessment for People with Cognitive Disabilities

Victoria Yaneva, Constantin Orăsan, Richard Evans, and Omid Rohanian
Research Institute in Information and Language Processing, University of Wolverhampton, UK
{v.yaneva, c.orasan, r.j.evans, omid.rohanian}@wlv.ac.uk

Abstract

Given the lack of large user-evaluated corpora in disability-related NLP research (e.g. text simplification or readability assessment for people with cognitive disabilities), the question of choosing suitable training data for NLP models is not straightforward. The use of large generic corpora may be problematic because such data may not reflect the needs of the target population. At the same time, the available user-evaluated corpora are not large enough to be used as training data. In this paper we explore a third approach, in which a large generic corpus is combined with a smaller population-specific corpus to train a classifier which is evaluated using two sets of unseen user-evaluated data. One of these sets, the ASD Comprehension corpus, is developed for the purposes of this study and made freely available. We explore the effects of the size and type of the training data used on the performance of the classifiers, and the effects of the type of the unseen test datasets on the classification performance.

1 Introduction

When developing educational tools and applications for students with cognitive disabilities, it is necessary to match the readability of the educational materials to the abilities of the students and to adapt the text content to their needs. Both text adaptation and readability research for people with cognitive disabilities are thus dependent on evaluation involving target users. However, there are two main difficulties in collecting data from users with cognitive disabilities: i) experiments involving those users are expensive to perform and ii) the task of text evaluation is challenging for target users because of their cognitive disability.

Following from the first difficulty, user-evaluated data is scarce and the majority of NLP research for disabled groups is done by exploiting ratings or simplification provided by teachers and experts (Inui et al., 2001; Dell'Orletta et al., 2011; Jordanova et al., 2013). Examples of such corpora are the FIRST corpus (Jordanova et al., 2013), which contains 31 original articles and versions of the articles that had been manually simplified for people with autism, and a corpus of manually simplified sentences for congenitally deaf Japanese readers (Inui et al., 2001). Henceforth in this paper, we refer to such manually simplified corpora as population-specific corpora. These corpora have not been evaluated by end users with disabilities.

As a result of the second difficulty, the fact that people with cognitive disabilities find text evaluation challenging, the size of user-evaluated datasets is rather limited. For example, to the best of our knowledge, there is currently only one readability corpus evaluated by people with intellectual disability, called LocalNews (Feng, 2009). This corpus contains 11 original and 11 simplified news stories. In this paper we present another corpus, evaluated by people with autism, containing a total of 27 documents. Henceforth in the paper, we refer to these types of corpora as user-evaluated corpora.

Given the lack of large population-specific or user-evaluated corpora in disability-related research, the question of choosing suitable training data for NLP models is not straightforward. While the use of large generic corpora as training data may be inadequate, as such data may not reflect the needs of the target population, the use of population-specific and user-evaluated corpora as training data is problematic due to the scarcity of such data.

In this paper we explore a third approach, in which a large generic corpus is combined with a smaller population-specific corpus to train a system to predict the difficulty of text for people with autism. We compare the performance of this approach to: i) an approach exploiting only the large generic corpus and ii) an approach exploiting only the small population-specific corpus. We also compare the performance of the classification models derived from two different machine learning algorithms. All classifiers trained on the different corpora are then evaluated on two small sets of user-evaluated corpora (unseen data), one of which was developed for the purpose of this study (Section 3).

Contributions. We developed the ASD Comprehension corpus, containing 27 documents evaluated by readers with autism and classified as easy or difficult based on participants' answers to comprehension questions. The texts and the answers of each participant to each question are currently available at: https://github.com/victoria-ianeva/ASD-Comprehension-Corpus (the repository also contains the answers of participants from a control group without autism, which were not explored in this article but may be useful to the community for investigating between-group differences; for more information about the control group see Yaneva (2016)). Further, we explore i) the effects of the size and type of the training data on the external validity of the classifiers and ii) the effects of the type of the unseen test datasets (only original versus original + simplified articles) on the classification performance. The system used in these experiments is available at: http://rgcl.wlv.ac.uk/demos/autor_readability

The rest of this paper is organised as follows. The next section presents related work relevant to this research, while Section 3 describes the process of developing the ASD Comprehension corpus. Section 4 describes the corpora used in the study. Section 5 presents the derivation of the classification models, and Section 6 presents a discussion of the main findings, which are summarised in Section 7.

2 Related Work

Previous work from the fields of psycholinguistics (pertaining to language and autism), readability assessment and domain adaptation is relevant to the research presented in this paper.

2.1 Autism Spectrum Disorder

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition affecting communication and social interaction. The difficulties of some people with ASD include, but are not limited to, difficulties resolving ambiguity in meaning (Happé and Frith, 2006; Happé, 1997; Frith and Snowling, 1983; O'Connor and Klein, 2004), difficulties comprehending abstract words (Happé, 1995), difficulties in the syntactic processing of long sentences (Whyte et al., 2014), difficulties identifying the referents of pronouns (O'Connor and Klein, 2004), difficulties in figurative language comprehension (MacKay and Shaw, 2004), and difficulties in making pragmatic inferences (Norbury, 2014). Adults with autism have also been shown to process images inserted in easy-to-read documents differently from non-autistic control participants (Yaneva et al., 2015).

2.2 Readability Assessment

Readability is a construct which has been defined as the ease of comprehension because of the style of writing (Harris and Hodges, 1995). Historically, the readability of texts has been estimated via formulae exploiting shallow features such as word and sentence length (Dubay, 2004) and via cognitive models exploiting features such as age of acquisition of words and text cohesion (McNamara et al., 2014); more recently, thanks to advances in Natural Language Processing (NLP), readability has also been estimated via computational models (Collins-Thompson, 2014; François, 2015). Advances in the fields of NLP and Artificial Intelligence have enabled both the faster computation of existing statistical features and the development of new NLP-enhanced features (e.g., average parse-tree height, average distance between pronouns and their anaphors, etc.), which can be used in more complex methods of assessment based on machine learning. An example of readability models targeted at a specific application of readability assessment are the unigram models of Si and Callan (2001), which have been found particularly suitable for the assessment of Web content, where the presence of links, email addresses and other elements biases the traditional formulae.

In terms of readability assessment for readers with cognitive disabilities, previous research has shown that readability features such as entity density per sentence and lexical chains (synonymy or hyponymy relations between nouns) are useful for estimating the readability of texts for readers with mild intellectual disability (Feng et al., 2010). This is due to the fact that these readers struggle to remember relations that hold within and between sentences (Feng et al., 2010). Similarly, features such as word length or word frequency are more relevant for readability assessment for people with dyslexia, because they struggle with decoding particular letter and syllable combinations (Rello et al., 2012). In the case of autism, an important issue has been the lack of corpora whose reading difficulty levels have been evaluated by people with autism. For this reason most readability research for this population has so far focused on texts simplified by experts (Štajner et al., 2014). User-evaluated texts were used for the first time in a study in which the discriminatory power of a number of features was evaluated on a preliminary dataset of 16 texts considered easy or difficult to comprehend by people with autism (Yaneva and Evans, 2015).

2.3 Domain Adaptation

Supervised machine learning and statistical methods like the ones used in this paper benefit from the availability of large amounts of training data. However, in many cases it is not easy to obtain enough training data for specific domains or applications. As a result, it is not uncommon that researchers train on data from one domain and test on data from a different one. As would be expected, this usually leads to lower levels of performance. The field of domain adaptation addresses this problem by proposing methods that can perform well even when the training and testing domains are different. In many cases this is achieved by exploiting a small training corpus of the same domain as the test documents. Domain adaptation has been used for a variety of tasks in NLP, including statistical machine translation (Axelrod et al., 2011), sentiment analysis (Blitzer et al., 2007; Glorot et al., 2011) and text classification (Xue et al., 2008).

Recent studies in the fields of readability and language proficiency have used a similar approach to the one proposed in this paper. For example, Pilán et al. (2016) tackle the problem of data sparsity when classifying the language proficiency levels of learner-written output by incorporating into the trained model knowledge from another domain, consisting of input texts written by teaching professionals for learners. Their results indicated that a weighted combination of the two types of data performed best, even when compared to systems based on considerably larger amounts of in-domain data. In this paper we go a step further by applying this approach to readability classification for people with cognitive disabilities.

3 Evaluation of Text Passages by Readers with Autism

We present a collection of 27 individual documents for which the readability was evaluated by 27 different people with a formal diagnosis of autism. The collection is henceforth referred to as the ASD Comprehension corpus and is available at: https://github.com/victoria-ianeva/ASD-Comprehension-Corpus. Participants were asked to read text passages and answer three multiple-choice questions (MCQs) per passage. Evaluation of the difficulty of the texts was then based on their answers to the questions (while reading the texts and answering the questions, the eye movements of the participants were recorded using an eye tracker; however, the recorded gaze data was not used in this study, hence we do not report details about it except when describing the data collection procedure; more details can be found in Yaneva (2016)).

Participants. The evaluation of the texts was performed in three cycles of data collection and involved 27 different participants with autism. Texts 1-9 and 21-27 were evaluated by Group 1, consisting of 20 adult participants (13 male, 7 female) with mean age in years µ = 30.75 and standard deviation σ = 8.23; years spent in education, as a factor influencing reading skills, were µ = 15.31, with σ = 2.9. Texts 10-17 were evaluated by Group 2, consisting of 18 adult participants (11 male and 7 female) with mean age µ = 36.83, σ = 10.8, and years spent in education µ = 16, σ = 3.33. Group 3 evaluated texts 18-20 and consisted of 18 adults (12 male and 6 female) with mean age µ = 37.22, σ = 10.3, and years spent in education µ = 16, σ = 3.33. All participants had a confirmed diagnosis of autism and were recruited through 4 local charity organisations. None of the 27 participants had other conditions affecting reading (e.g. dyslexia, intellectual disability, aphasia, etc.). All participants were native speakers of English.

Materials. A total of 27 text passages of varying complexity were collected from the Web. The registers were miscellaneous, covering educational articles (7 documents), news (10 documents) and general articles (3 documents), as well as easy-to-read texts (7 documents). The average number of words per text was µ = 156 with standard deviation σ = 49.94. The texts covered a range of readability levels, with an average of µ = 65.07 and σ = 13.71 according to the Flesch Reading Ease (FRE) score (Flesch, 1949), which is expressed on a scale from 0 to 100 (the higher the score, the easier the text).

A limitation of the study is the small size of the corpus, which was necessary in order to avoid fatigue in the participants and to comply with ethical considerations. By comparison, LocalNews (Feng, 2009), the only other corpus for English whose readability has been evaluated by people with cognitive disabilities, contains 11 original and 11 simplified texts.

Design of the Multiple-Choice Questions. Since people with ASD are generally known to understand many parts of what they read literally (Happé and Frith, 2006; Happé, 1997; Frith and Snowling, 1983; O'Connor and Klein, 2004), it is of interest to examine different types of comprehension of the texts in the ASD corpus. Impairment in specific types of reading comprehension merits the exploration of readability features related to those specific types. Table 1 shows the main types of comprehension we examine in our study, following a taxonomy formulated by Day and Park (2005). The table also shows the relationship between the types of comprehension examined and the reading profile of people with autism. These types of comprehension were examined through the inclusion of three multiple-choice questions per text passage, each of which contained three possible answers. The example below is a question examining the ability to make inferences:

Black peppered moths became more numerous in urban areas because:
a) They were mutants
b) They were camouflaged due to the airborne pollution
c) The airborne pollution blackened the white moths with soot

Apparatus and Procedure. All participants were verbally instructed about the purpose and procedure of the experiment and given a participant information sheet. Once they were familiar with the implications of the research, they signed a consent form, the verbal instruction was reinforced, and demographic data about age, education and diagnosis was collected. Eye-tracking data was recorded (although it is not examined in this study), so the eye tracker was calibrated by each participant before the start of the experiment. Texts were presented on a 19" LCD monitor. In order to maximise the internal validity of the experiment, the texts were presented in random order to each participant. This controlled for factors such as fatigue or participants becoming accustomed to the types of questions asked. The order of the questions after each text was also randomised, so that it would not influence the answers given by the participants. The effects of memory were controlled by having the relevant passage constantly displayed on the screen; participants could therefore refer to it whenever they were not sure about the information it contained. While the effects of background knowledge could not be eliminated entirely, the texts were selected in such a way as to ensure that this effect would be minimised as far as possible. The participants read all texts and answered all questions, taking as many breaks as they requested. At the end of the experiment, participants were debriefed.

Development of the Gold Standard. The 27 texts from the ASD corpus were used for the evaluation of the document-level classifiers. They were divided into classes of easy and difficult texts based on the answers to the multiple-choice questions (MCQs). Each text was evaluated by three MCQs and each correct answer was given 1 point, while each incorrect answer was awarded 0 points. Thus, if a participant had answered two out of three questions correctly for a given text, then that text had an answering score of two for this participant. After that, all answering scores for the participants were summed for each text. The texts were then ranked and partitioned at a threshold into two groups. Application of a Shapiro-Wilk test showed that the data was not normally distributed, and the two groups were therefore compared using the non-parametric Wilcoxon Signed Rank test. The results indicated that the two groups of texts were significantly different from one another (z = -6.091, p < 0.0001). Thus 18 texts were classified as easy and 9 texts were classified as difficult.
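To make this procedure concrete, the short sketch below reproduces the scoring and partitioning steps on toy data. The array shapes, variable names and the use of SciPy are illustrative assumptions on our part; the paper does not specify the software used for its statistical tests.

```python
# Sketch of the gold-standard derivation: each participant contributes
# an answering score of 0-3 per text (1 point per correctly answered
# MCQ); scores are summed per text and the ranked totals are
# partitioned into "difficult" and "easy" groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Toy data: 27 texts x 20 participants, each cell an answering score 0-3.
answering_scores = rng.integers(0, 4, size=(27, 20))

totals = answering_scores.sum(axis=1)       # summed score per text
order = np.argsort(totals)
difficult, easy = order[:9], order[9:]      # 9 difficult / 18 easy texts

# Normality check (the paper reports a non-normal distribution).
print(stats.shapiro(totals))

# The paper reports a Wilcoxon test between the two groups; for two
# independent, unequally sized groups the analogous SciPy call is the
# Mann-Whitney U (Wilcoxon rank-sum) test.
print(stats.mannwhitneyu(totals[easy], totals[difficult]))
```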


Literal
  Characteristics (Day and Park, 2005): Understanding of the straightforward meaning of the text: facts, dates, etc.
  Relation to ASD: Readers with ASD have predominantly literal understanding of language (MacKay and Shaw, 2004).

Reorganisation
  Characteristics (Day and Park, 2005): The ability to combine explicitly given information from different parts of the text: "Maria Kim was born in 1945"; "Maria Kim died in 1990". How old was Maria Kim when she died?
  Relation to ASD: Since this type of question is based on literal understanding, it could provide insights exclusively into the role of context, the use of which is challenging for people with ASD (O'Connor and Klein, 2004).

Inference
  Characteristics (Day and Park, 2005): The ability to use two or more pieces of information to arrive at a third piece of information that is implicit: "He rushed off, leaving his bike unchained" => he left his bicycle vulnerable to theft.
  Relation to ASD: Types of inferences challenging for ASD: inferring given or presupposed knowledge as well as new or implied knowledge derived from mental state words, bridging inferences, figurative language.

Table 1: Types of comprehension examined and their relation to ASD

4 Corpora

This section describes the corpora used for training and evaluation of the readability classifiers. We train classifiers on three corpora, presented below: i) the WeeBit corpus (Vajjala and Meurers, 2012), a comparatively large generic corpus used in readability research; ii) the FIRST corpus, a small corpus containing original and manually simplified texts, a subset of which has been evaluated in terms of readability in experiments involving 100 people with autism (Jordanova et al., 2013); and finally, iii) a combination of the two. After that, we tested our classifiers by applying them to previously unseen user-evaluated data. These data consist of two corpora, the readability of which has been evaluated by people with autism (the ASD Comprehension corpus, presented above) and by people with intellectual disability (the LocalNews corpus (Feng et al., 2009)).

4.1 The WeeBit Corpus

The WeeBit corpus (Vajjala and Meurers, 2012) contains educational documents obtained from the Weekly Reader (http://www.weeklyreader.com/) and BBC-BiteSize (http://www.bbc.co.uk/education) websites and comprises two sub-corpora of the same names. The Weekly Reader is an educational web-newspaper containing fiction, news and science articles, intended for children aged 7-8 (Level 2), 8-9 (Level 3), 9-10 (Level 4) and 9-12 (Senior level). BBC-BiteSize is also an educational site, containing articles at four levels corresponding to educational key stages (KS) for children between the ages of 5-7 (KS1), 7-11 (KS2), 11-14 (KS3) and 14-16 (GCSE). The combined WeeBit corpus comprises five readability levels corresponding to the Weekly Reader's Level 2, Level 3 and Level 4 and to BBC-BiteSize's KS3 and GCSE levels. The corpus contains 615 documents per level. The average document length, measured in number of sentences, is 23.4 sentences at the lowest level and 27.8 sentences at the highest level.

The WeeBit corpus was the most appropriate for the purpose of our work because it contains educational and generally informative articles and because of its large size relative to other readability corpora for English. Examples of other corpora include Encyclopedia Britannica (Barzilay and Elhadad, 2003) (40 documents), Literacyworks (Petersen and Ostendorf, 2007) (around 200 documents), and the Weekly Reader (Allen, 2009) on its own. An alternative was to use Wikipedia and Simple English Wikipedia (http://simple.wikipedia.org/wiki/Main Page), as they contain a very large number of articles; however, claims that Simple English Wikipedia articles are more accessible than English Wikipedia articles have been disputed (Xu et al., 2015; Štajner et al., 2012; Yaneva, 2015).

As the primary purpose of our work is to build two-level readability classifiers, we normalized the WeeBit corpus to include texts of only two readability levels: easy and difficult.

Thus, the difficult texts in our corpus were the ones with the class labels BitGCSE and BitKS3 (ages 11-16) and the easy documents were the ones with the class labels WRLevel2 and WRLevel3 (ages 9-11). Texts from Weekly Reader Level 4 were excluded from the dataset, as they were intended for students aged 9-12, which overlaps with Weekly Reader Level 3 (9-10), BitKS2 (7-11), and BitKS3 (11-14). Thus, the remaining data consisted of 1,610 documents divided into two equally sized classes of easy and difficult documents.

4.2 The FIRST Corpus

The FIRST corpus consists of 25 documents of the registers of popular science (13 texts) and newspaper articles (12 texts) (Jordanova et al., 2013). These texts were presented in both their original and simplified forms, so the corpus contains 25 paired original and simplified documents (50 documents in total). The simplification was performed by 5 experts working with autistic people, who were given the ASD-specific text simplification guidelines specified in Jordanova et al. (2013), which contains full details of the simplification procedure and the characteristics of the corpus. In addition to the 50 texts contained in that corpus, original and simplified versions of 6 additional texts were produced in accordance with the specified guidelines. These 12 texts were then evaluated on a sample of 100 adults with autism as part of the evaluation method in the EC-funded FIRST project (http://www.first-asd.eu/, last accessed 19/05/2017). Statistically significant differences in the levels of comprehension for texts from the two classes are reported (Jordanova et al., 2013). These texts were added to the FIRST corpus, which thus contains 31 original and 31 simplified versions of documents, of which 6 documents per class were evaluated by people with autism.

4.3 The LocalNews Corpus

Similar to the ASD Comprehension corpus, the LocalNews corpus (Feng et al., 2009) is used as test data for evaluating the classifiers. The LocalNews corpus consists of 11 original and 11 simplified news stories and is, to the best of our knowledge, the only other resource in English for which text complexity has been evaluated by people with intellectual disability. The articles were first manually simplified by humans, a process in which long and complex sentences were split and important information contained in complex prepositional phrases was integrated into separate sentences. Lexical simplification included the substitution of rare words with more frequent ones and the deletion of sentences and phrases not closely related to the meaning of the text. The texts were then evaluated by 19 adults with mild intellectual disability, who showed significant differences in their comprehension scores for the two classes of documents (Feng et al., 2009).

5 Model Training and Evaluation

This section presents the experiments comparing the performance of the different classifiers.

5.1 Algorithms

The document-level classifiers were built using supervised learning algorithms implemented in the Weka toolkit (Frank and Witten, 1998). We evaluated a number of algorithms in the Weka toolkit and selected the two which performed best when evaluated using 10-fold cross-validation over the WeeBit corpus (Random Forests) and the FIRST corpus (Bayes Net). The Random Forest algorithm (Breiman, 2001) is a decision tree algorithm which uses multiple random trees to vote for an overall classification of the given input. The Bayes Net classifier is the implementation of a Bayesian network classifier (Heckerman et al., 1995) available in Weka; Bayesian networks are probabilistic graphical models which have been shown to be very successful in domain adaptation problems (for example, Finkel and Manning (2009)). For both learning algorithms we used the default values of their parameters as provided by Weka. Although there is scope for tuning these parameters, we did not have access to enough data to explore this direction.

5.2 Baseline

We use the Flesch-Kincaid Grade Level readability formula (Kincaid et al., 1981) as a baseline for document classification, due to the fact that it is one of the best-performing predictors of text difficulty and has been used as a baseline in other readability estimation models (Vajjala Balakrishna, 2015). The baseline values are computed by using the score of the formula as a single feature in the classification model.
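As a rough, runnable illustration of this setup, the sketch below trains classifiers on a generic corpus, a population-specific corpus and their combination. The paper's experiments were run in Weka; here scikit-learn's RandomForestClassifier and GaussianNB merely stand in for Weka's Random Forests and Bayes Net (which has no exact scikit-learn equivalent), and the toy documents and the three-feature extractor are our own assumptions rather than the authors' pipeline.

```python
# Sketch: train on generic, population-specific and combined corpora,
# mirroring the paper's three training conditions (WeeBit, FIRST,
# WeeBit + FIRST). scikit-learn replaces the Weka toolkit used in the
# paper, and GaussianNB stands in for Weka's Bayes Net classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def fkgl(text):
    """Approximate Flesch-Kincaid Grade Level (vowel-run syllable count)."""
    words = text.split()
    sents = max(1, sum(text.count(c) for c in ".!?"))
    syll = sum(max(1, len([ch for ch in w.lower() if ch in "aeiouy"]))
               for w in words)
    return 0.39 * len(words) / sents + 11.8 * syll / len(words) - 15.59

def features(text):
    # Placeholder for the 43 features of Table 2.
    words = text.split()
    return [len(words), np.mean([len(w) for w in words]), fkgl(text)]

# Toy stand-ins for the corpora (label 0 = easy, 1 = difficult).
easy = ["The dog ran. It was fast. We saw it."] * 10
hard = ["Notwithstanding unprecedented methodological heterogeneity, "
        "the longitudinal investigation substantiated the hypothesis."] * 10
generic = (easy + hard, [0] * 10 + [1] * 10)         # "WeeBit"
specific = (easy[:5] + hard[:5], [0] * 5 + [1] * 5)  # "FIRST"
combined = (generic[0] + specific[0], generic[1] + specific[1])

for name, (docs, y) in [("generic", generic), ("specific", specific),
                        ("combined", combined)]:
    # Replacing features(d) with [fkgl(d)] reproduces the single-feature
    # Flesch-Kincaid baseline of Section 5.2.
    X = np.array([features(d) for d in docs])
    for clf in (RandomForestClassifier(random_state=0), GaussianNB()):
        f1 = cross_val_score(clf, X, y, cv=5, scoring="f1_macro").mean()
        print(name, type(clf).__name__, round(f1, 3))
```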

1. Long words: Proportion of words with 3 or more syllables
2. Average word length: Average number of syllables, all words
3. Possible senses: Sum of all senses for all words in the text
4. Polysemous words: Words with more than one sense in WordNet
5. Polysemous type ratio: Ratio of polysemous word types to all word types
6. Type-token ratio: Total number of types / number of tokens
7. Vocabulary variation: Word types / common words not in the text
8. Numerical expressions: Number of numerical expressions
9. Infrequent words: Words not in the 5,000 most frequent words in English
10. Total number of words: Total number of words in the text
11. Dolch-Fry Index: Fry 1000 Instant Word List / Dolch Word List
12. Number of passive verbs: Number of passive verbs
13. Agentless passive density: Incidence score of passive voice
14. Negations: Number of negations
15. Negation density: Incidence score of negations
16. Long sentences: Proportion of sentences longer than 15 words
17. Words per sentence: Total words / total sentences
18. Average sentence length: Sentence length in words
19. Number of sentences: Total number of sentences
20. Paragraph index: 10 x total paragraphs / total words
21. Semicolons: Number of semicolons
22. Unusual punctuation: Number of occurrences of &, %, etc.
23. Comma index: 10 x total commas / total words
24. Pronoun score: Occurrences of pronouns per 1,000 words
25. Definite description score: Occurrences of definite descriptions per 1,000 words
26. Illative conjunctions: Number of illative conjunctions
27. Comparative conjunctions: Number of comparative conjunctions
28. Adversative conjunctions: Number of adversative conjunctions
29. Word frequency: Average frequency of words
30. Age of acquisition (average): AoA norms from the MRC database
31. Familiarity (average): Familiarity norms from the MRC database
32. Concreteness (average): Concreteness norms from the MRC database
33. Imagability (average): Imagability norms from the MRC database
34. 1st person pronominal references: Number of 1st person pronominal references
35. 2nd person pronominal references: Number of 2nd person pronominal references
36. ARI: ARI readability formula (Smith et al., 1989)
37. Coleman-Liau: Coleman-Liau formula (Coleman, 1971)
38. Fog Index: Fog Index formula (Gunning, 1952)
39. Lix: Lix readability formula (Anderson, 1983)
40. SMOG: SMOG formula (McLaughlin, 1969)
41. FRE: Flesch Reading Ease (Flesch, 1948)
42. FKGL: Flesch-Kincaid Grade Level (Kincaid et al., 1981)
43. FIRST readability index: FIRST readability index (Jordanova et al., 2013)

Table 2: A list of the features used in the experiments and their descriptions. Feature selection was carried out separately for the Random Forests and Bayes Net classifiers trained on each of the WeeBit, FIRST and WeeBit + FIRST corpora.

5.3 Features and Feature Selection

A total of 43 features were used in the experiments. Table 2 presents the features and their descriptions. The features used in this study included lexico-semantic features (numbers 1-14), syntactic features (numbers 15-22), cohesion features (numbers 23-27) and cognitively-motivated features (numbers 28-34), as well as readability formulae (numbers 35-43) (Table 2). The cohesion and cognitively-motivated features were inspired by those used in the Coh-Metrix tool (McNamara et al., 2014). The source of the cognitively-motivated features was the word lists in the MRC Psycholinguistic Database (Coltheart, 1981), in which each word has an assigned score based on human rankings. The number of personal words in a text is hypothesised to improve ease of comprehension (Freyhoff et al., 1998), which is why the numbers of first and second person pronominal references were included as features in the classification model.

Initially, the full feature sets were used to obtain the baseline models, which were subsequently optimised using the attribute selection filter for supervised learning distributed with Weka (Frank and Witten, 1998) and through iterative elimination of redundant features. This was done at the stage of model evaluation through ten-fold cross-validation, and the set of selected features differs across the six combinations of learning algorithm and training corpus. It can be argued that the Random Forest model already performs a certain degree of feature selection and that it may therefore not be necessary to carry out this task in the experiments involving Random Forests. However, analysis of the random trees generated by the algorithm revealed that they contain a larger number of features than those selected by our feature selection step. In addition, by performing feature selection we wanted to learn which linguistic features are good indicators of text complexity.
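For a flavour of how such features can be computed, the sketch below implements a handful of the shallow ones from Table 2 (long words, type-token ratio, negations and sentence-length features). The regular expressions and the tiny negation list are simplifying assumptions; the paper's full feature set also draws on WordNet senses, frequency lists and MRC psycholinguistic norms, none of which are reproduced here.

```python
# Sketch: a few of the Table 2 features computed with simple string
# processing. The paper's full feature set also uses WordNet senses,
# word-frequency lists and MRC psycholinguistic norms.
import re

NEGATIONS = {"no", "not", "never", "none", "nothing", "n't"}  # toy list

def vowel_groups(word):
    """Crude syllable estimate: count runs of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def shallow_features(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        # Feature 1: proportion of words with 3 or more syllables.
        "long_words": sum(vowel_groups(t) >= 3 for t in tokens) / len(tokens),
        # Feature 6: word types over tokens.
        "type_token_ratio": len(set(tokens)) / len(tokens),
        # Feature 14: raw negation count.
        "negations": sum(t in NEGATIONS for t in tokens),
        # Features 16-18: sentence-length statistics.
        "avg_sentence_length": len(tokens) / len(sentences),
        "long_sentences": sum(len(s.split()) > 15 for s in sentences)
                          / len(sentences),
    }

print(shallow_features("He never stopped. The unprecedented investigation "
                       "was not straightforward, but nothing deterred him."))
```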

Table 3: F-score results for 10-fold cross-validation

                 Random Forests                                  Bayes Net
                 Baseline   All features   Selected features    Baseline   All features   Selected features
WeeBit           0.78       0.988          0.984                0.838      0.968          0.978
FIRST            0.651      0.794          0.825                0.778      0.810          0.841
WeeBit+FIRST     0.77       0.957          0.973                0.831      0.953          0.966

Table 4: F-score results for the ASD Comprehension corpus and the LocalNews corpus

ASD Comprehension
                 Random Forests                                  Bayes Net
                 Baseline   All features   Selected features    Baseline   All features   Selected features
WeeBit           0.673      0.927          0.820                0.667      0.746          0.820
FIRST            0.747      0.782          0.782                0.817      0.782          0.784
WeeBit+FIRST     0.746      0.817          0.855                0.667      0.746          0.892

LocalNews
                 Random Forests                                  Bayes Net
                 Baseline   All features   Selected features    Baseline   All features   Selected features
WeeBit           0.818      0.861          0.954                0.817      0.908          0.954
FIRST            0.676      0.76           0.705                0.705      0.705          0.760
WeeBit+FIRST     0.818      0.861          0.908                0.817      0.908          1

5.4 Evaluation

First, all classifiers were evaluated using 10-fold cross-validation, with the WeeBit, FIRST and WeeBit + FIRST corpora as training sets (Table 3). After that, each classifier was tested on previously unseen user-evaluated data. The two sets of unseen data are the ASD Comprehension corpus described in Section 3 and the LocalNews corpus described in Section 4.3. Results for the evaluation on unseen data are presented in Table 4.

For Random Forests, we notice that the model trained on the WeeBit corpus performs best when classifying texts from the ASD Comprehension corpus (F = 0.927) and from the LocalNews corpus (F = 0.954). However, when using the models based on the Bayes Net algorithm, we see that the best external validity for both the ASD Comprehension corpus (F = 0.892) and the LocalNews corpus (F = 1) is achieved by using the combined WeeBit + FIRST training set.
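The external-validity evaluation itself, training on one corpus and computing the F-score on a different, user-evaluated one, can be sketched as follows. The helper reuses the hypothetical `features` extractor and corpus variables from the earlier training sketch; all names here are ours, not the authors'.

```python
# Sketch: evaluation on unseen user-evaluated data, as in Table 4 -
# fit a classifier on one corpus and report macro-averaged F on another.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

def evaluate_external(train_docs, train_labels, test_docs, test_labels,
                      extract):
    """Train on one corpus, score on a held-out, unseen corpus."""
    clf = RandomForestClassifier(random_state=0)
    clf.fit(np.array([extract(d) for d in train_docs]), train_labels)
    predictions = clf.predict(np.array([extract(d) for d in test_docs]))
    return f1_score(test_labels, predictions, average="macro")

# Hypothetical usage, with the corpora and `features` from the earlier
# sketch standing in for WeeBit + FIRST and the ASD Comprehension corpus:
# print(evaluate_external(combined[0], combined[1],
#                         asd_docs, asd_labels, features))
```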

6 Discussion

In terms of the effects of the size and type of the training data used, the results indicate that, in isolation, smaller population-specific corpora (e.g. FIRST) are not sufficient to achieve optimal classification accuracy; however, in certain cases, such as the classification of the LocalNews texts, they do have the potential to boost the performance of a classification model when combined with larger generic corpora (F = 1). Nevertheless, this improvement is subject to choosing a classification algorithm that has optimal performance when trained on the smaller corpus. It is important to note that the most accurate classification of the ASD Comprehension corpus was achieved by training the Random Forests classifier on the WeeBit corpus alone (F = 0.927). Hence, combining population-specific and generic corpora is only useful in certain cases, as discussed below. This is in line with results in other fields. For example, Blitzer et al. (2007) investigate domain adaptation for sentiment analysis. Given a pair of source and target domains, they show how it is possible to improve the performance of a sentiment classifier on the target domain when it is trained on data from the source domain with the help of a small annotated corpus from the target domain. However, they also show that it is necessary to consider the distance between the two domains, as not every pair will lead to good results. For future research, we will consider how to define a distance metric that can prove useful in our context.

Regarding the effect of the type of the unseen data, we notice that, surprisingly, the pairs of original and simplified articles contained in the LocalNews corpus were predicted 100% correctly by the classifier trained on the combination of texts from WeeBit + FIRST. A possible reason for this is that the introduction of the FIRST corpus together with the larger WeeBit one enables the classifier to capture certain simplification operations (e.g. sentence splitting and lexical simplification) that are common to both LocalNews and FIRST. Achieving such a high score may also have been helped by the fact that the genre of the documents contained in the LocalNews corpus is close to the textual genre of the documents in both the WeeBit and the FIRST corpora. However, this result was only achieved when combining FIRST with the larger WeeBit corpus and was not replicated by a classifier trained only on the FIRST data. This implies that relatively large datasets are still a prerequisite for the accurate classification of pairs of original and simplified texts. In both cases, when using Random Forests and Bayes Net, a better classification accuracy was achieved for LocalNews (F = 0.954 and F = 1, respectively) than for the ASD Comprehension corpus (F = 0.927 and F = 0.892, respectively). This suggests that corpora containing pairs of texts in their original and simplified forms are generally easier to classify than corpora containing only texts in their original form. This finding has implications for general readability and text simplification research, where pairs of texts in their original and manually simplified forms are commonly used for evaluation purposes. In other words, evaluating on such corpora may result in overly optimistic classification results which are less likely to be replicated in a "real-world scenario" with naturally written texts.

The experiments presented above have several limitations. First, the small size of the corpora (a key problem in disability-related research, which we target in this article) means that the texts used in this study do not account for the great heterogeneity of natural language. In an attempt to compensate for the small number of texts, we have tried to include documents from miscellaneous registers and with varying levels of readability. Second, both the ASD Comprehension corpus and the LocalNews corpus were evaluated by a relatively small number of participants, which is why individual differences in comprehension may have larger effects on the definition of the gold standard compared to generic readability studies. Nevertheless, as mentioned at the beginning of this article, collecting data from readers with cognitive disabilities is a much-needed but challenging task, and the corpora used in this study are currently the only ones of their kind. We contribute to future research in this area by making the ASD Comprehension corpus available.

7 Conclusion

This paper discussed the effects of algorithm selection, training corpora and evaluation corpora on readability research for people with cognitive disabilities, with a view to addressing the problem of the scarcity of user-evaluated data in this setting. First, we presented a collection of 27 individual documents, the readability of which was evaluated by readers with Autism Spectrum Disorder. We then showed that the corpora used for algorithm selection have an effect on the classification performance of the models and that combining large generic readability corpora with small population-specific ones has the potential to boost classification performance. Finally, we discussed the effects of the type of evaluation data (original articles versus pairs of original and simplified articles) on the classification accuracy, showing that pairs of original and simplified documents are easier to classify and that the combination of generic and population-specific corpora is particularly useful for the classification of such text pairs.

Acknowledgements

This research is part of the AUTOR project, partially supported by University Innovation Funds awarded to the University of Wolverhampton.

References

Melissa L. Allen. 2009. Brief report: Decoding representations: How children with autism understand drawings. Journal of Autism and Developmental Disorders 39(3):539-543. https://doi.org/10.1007/s10803-008-0650-y

Jonathan Anderson. 1983. Lix and Rix: Variations on a little-known readability index. Journal of Reading 26(6):490-496.

Amittai Axelrod, Xiaodong He, and Jianfeng Gao. 2011. Domain adaptation via pseudo in-domain data selection. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 355-362.

Regina Barzilay and Noemie Elhadad. 2003. Sentence alignment for monolingual comparable corpora. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, EMNLP '03, Association for Computational Linguistics, Stroudsburg, PA, USA, pages 25-32. https://doi.org/10.3115/1119355.1119359

John Blitzer, Mark Dredze, and Fernando Pereira. 2007. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic, pages 440-447.

Leo Breiman. 2001. Random forests. Machine Learning 45(1):5-32.

E. B. Coleman. 1971. Developing a technology of written instruction: Some determiners of the complexity of prose. Teachers College Press, Columbia University, New York.

Kevyn Collins-Thompson. 2014. Computational assessment of text readability: A survey of current and future research. ITL - International Journal of Applied Linguistics 165(2):97-135.

Max Coltheart. 1981. The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology Section A 33(4):497-505. https://doi.org/10.1080/14640748108400805

Richard R. Day and Jeong-Suk Park. 2005. Developing reading comprehension questions. Reading in a Foreign Language 17(1).

Felice Dell'Orletta, Simonetta Montemagni, and Giulia Venturi. 2011. READ-IT: Assessing readability of Italian texts with a view to text simplification. In Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies, Association for Computational Linguistics, pages 73-83.

William H. Dubay. 2004. The Principles of Readability. Impact Information. http://www.impact-information.com/

Lijun Feng. 2009. Automatic readability assessment for people with intellectual disabilities. SIGACCESS Accessibility and Computing (93):84-91. https://doi.org/10.1145/1531930.1531940

Lijun Feng, Noémie Elhadad, and Matt Huenerfauth. 2009. Cognitively motivated features for readability assessment. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pages 229-237.

Lijun Feng, Martin Jansche, Matt Huenerfauth, and Noémie Elhadad. 2010. A comparison of features for automatic readability assessment. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, Association for Computational Linguistics, pages 276-284.

Jenny Rose Finkel and Christopher D. Manning. 2009. Hierarchical Bayesian domain adaptation. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 602-610.

Rudolf Flesch. 1948. A new readability yardstick. Journal of Applied Psychology 32(3):221-233.

Rudolf Flesch. 1949. The Art of Readable Writing. Harper, New York.

Thomas François. 2015. When readability meets computational linguistics: A new paradigm in readability. Revue française de linguistique appliquée 20(2):79-97.

Eibe Frank and Ian H. Witten. 1998. Generating accurate rule sets without global optimization. In J. Shavlik, editor, Fifteenth International Conference on Machine Learning, Morgan Kaufmann, pages 144-151.

G. Freyhoff, G. Hess, L. Kerr, B. Tronbacke, and K. Van Der Veken. 1998. Make It Simple: European guidelines for the production of easy-to-read information for people with learning disability. Technical report, ILSMH European Association.

Uta Frith and Maggie Snowling. 1983. Reading for meaning and reading for sound in autistic and dyslexic children. Journal of Developmental Psychology 1:329-342.

Xavier Glorot, Antoine Bordes, and Yoshua Bengio. 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach. In ICML.

Robert Gunning. 1952. The Technique of Clear Writing. McGraw-Hill, New York.

Francesca Happé. 1995. The role of age and verbal ability in the theory of mind task performance of subjects with autism. Child Development 66(3):843-855.

Francesca Happé. 1997. Central coherence and theory of mind in autism: Reading homographs in context. British Journal of Developmental Psychology 15:1-12.

Francesca Happé and Uta Frith. 2006. The weak coherence account: Detail-focused cognitive style in autism spectrum disorders. Journal of Autism and Developmental Disorders 36:5-25.

T. L. Harris and R. E. Hodges. 1995. The Literacy Dictionary: The Vocabulary of Reading and Writing. International Reading Association.

David Heckerman, Dan Geiger, and David M. Chickering. 1995. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning 20(3):197-243.

Kentaro Inui, Satomi Yamamoto, and Hiroko Inui. 2001. Corpus-based acquisition of sentence readability ranking models for deaf people. In NLPRS, pages 159-166.

Vesna Jordanova, Richard Evans, and Arlinda Cerga Pashoja. 2013. FIRST project: Benchmark report (result of piloting task). Central and Northwest London NHS Foundation Trust, London, UK.

J. Peter Kincaid, James A. Aagard, John W. O'Hara, and Larry K. Cottrell. 1981. Computer readability editing system. IEEE Transactions on Professional Communications.

Gilbert MacKay and Adrienne Shaw. 2004. A comparative study of figurative language in children with autistic spectrum disorders. Child Language Teaching and Therapy 20(13).

Harry G. McLaughlin. 1969. SMOG grading: A new readability formula. Journal of Reading, pages 639-646.

Danielle S. McNamara, Arthur C. Graesser, Philip M. McCarthy, and Zhiqiang Cai. 2014. Automated Evaluation of Text and Discourse with Coh-Metrix. Cambridge University Press.

Courtenay Frazier Norbury. 2014. Atypical pragmatic development. Pragmatic Development in First Language Acquisition 10:343.

Irene M. O'Connor and Perry D. Klein. 2004. Exploration of strategies for facilitating the reading comprehension of high-functioning students with autism spectrum disorders. Journal of Autism and Developmental Disorders 34(2):115-127.

Sarah E. Petersen and Mari Ostendorf. 2007. Text simplification for language learners: A corpus analysis. In SLaTE, pages 69-72.

Ildikó Pilán, Elena Volodina, and Torsten Zesch. 2016. Predicting proficiency levels in learner writings by transferring a linguistic complexity model from expert-written coursebooks. In COLING, pages 2101-2111.

Luz Rello, Ricardo Baeza-Yates, Laura Dempere-Marco, and Horacio Saggion. 2012. Frequent words improve readability and shorter words improve understandability for people with dyslexia. (1):22-24.

Luo Si and Jamie Callan. 2001. A statistical model for scientific readability. In Proceedings of the Tenth International Conference on Information and Knowledge Management, CIKM '01, ACM, New York, NY, USA, pages 574-576. https://doi.org/10.1145/502585.502695

Dean R. Smith, A. Jackson Stenner, Ivan Horabin, and Malbert Smith III. 1989. The Lexile scale in theory and practice: Final report. Technical report, MetaMetrics (ERIC Document Reproduction Service No. ED307577), Washington, DC.

Sanja Štajner, Richard Evans, Constantin Orasan, and Ruslan Mitkov. 2012. What can readability measures really tell us about text complexity? In Luz Rello and Horacio Saggion, editors, Proceedings of the LREC'12 Workshop: Natural Language Processing for Improving Textual Accessibility (NLP4ITA), European Language Resources Association (ELRA), Istanbul, Turkey.

Sanja Štajner, Ruslan Mitkov, and Gloria Corpas Pastor. 2014. Simple or not simple? A readability question. Springer-Verlag, Berlin.

S. Vajjala and D. Meurers. 2012. On improving the accuracy of readability classification using insights from second language acquisition. In Proceedings of the Seventh Workshop on Building Educational Applications Using NLP, Association for Computational Linguistics, Stroudsburg, PA, USA, pages 163-173.

Sowmya Vajjala Balakrishna. 2015. Analyzing Text Complexity and Text Simplification: Connecting Linguistics, Processing and Educational Applications. Ph.D. thesis, Universität Tübingen.

Elisabeth M. Whyte, Keith E. Nelson, and K. Suzanne Scherf. 2014. Idiom, syntax, and advanced theory of mind abilities in children with autism spectrum disorders. Journal of Speech, Language, and Hearing Research 57(1):120-130.

Wei Xu, Chris Callison-Burch, and Courtney Napoles. 2015. Problems in current text simplification research: New data can help. Transactions of the Association for Computational Linguistics 3:283-297.

Gui-Rong Xue, Wenyuan Dai, Qiang Yang, and Yong Yu. 2008. Topic-bridged PLSA for cross-domain text classification. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '08, pages 627-634.

Victoria Yaneva. 2015. Easy-read documents as a gold standard for evaluation of text simplification output. In Proceedings of the Student Research Workshop, INCOMA Ltd., Hissar, Bulgaria, pages 30-36. http://www.aclweb.org/anthology/R15-2005

Victoria Yaneva. 2016. Assessing Text and Web Accessibility for People with Autism Spectrum Disorder. Ph.D. thesis.

Victoria Yaneva and Richard Evans. 2015. Six good predictors of autistic text comprehension. In Proceedings of the International Conference Recent Advances in Natural Language Processing, INCOMA Ltd., Hissar, Bulgaria, pages 697-706. http://www.aclweb.org/anthology/R15-1089

Victoria Yaneva, Irina Temnikova, and Ruslan Mitkov. 2015. Accessible texts for autism: An eye-tracking study. In Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility, ASSETS '15, ACM, New York, NY, USA, pages 49-57. https://doi.org/10.1145/2700648.2809852
