Word-Sense Disambiguation System for Text Readability


Rodrigo Alarcon, Lourdes Moreno, Paloma Martinez
Computer Science Department, Universidad Carlos III de Madrid, Madrid, Spain

ABSTRACT
People with cognitive, language and learning disabilities face accessibility barriers when reading texts with complex or specialized words. To address these needs, and in accordance with accessibility guidelines, it is beneficial to provide the user with definitions of complex words. However, human language can be ambiguous, and many words may be interpreted in multiple ways depending on the context. To offer a correct definition, it is often necessary to carry out Word Sense Disambiguation. This paper presents a system, based on Natural Language Processing and Language Resources in the field of easy reading, that provides a contextualized definition for a complex word. An expert linguist, specialized in easy reading and plain language, has evaluated the Word Sense Disambiguation system. This research work is part of the EASIER project, which offers an accessible platform to provide users with the easiest possible Spanish text to understand and read.

CCS CONCEPTS
• Human-centered computing; • Accessibility; • Accessibility technologies; • Computing methodologies; • Artificial intelligence; • Natural language processing;

KEYWORDS
Web accessibility, Readability, WCAG, Tool, Word-sense disambiguation, Natural Language Processing, Cognitive disabilities

ACM Reference Format:
Rodrigo Alarcon, Lourdes Moreno, and Paloma Martinez. 2020. Word-Sense disambiguation system for text readability. In 9th International Conference on Software Development and Technologies for Enhancing Accessibility and Fighting Info-exclusion (DSAI 2020), December 02–04, 2020, Online, Portugal. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3439231.3439257

© 2020 Association for Computing Machinery. ACM ISBN 978-1-4503-8937-2/20/12...$15.00

1 INTRODUCTION
Currently, we have access to an overwhelming amount of information, yet this information is not accessible to all people. Certain individuals face accessibility barriers when reading texts that contain long sentences, unusual words, complex language structures, etc.

Although people with cognitive, language and learning disabilities are directly affected, cognitive accessibility barriers affect other user groups such as the deaf, deaf-blind, elderly, illiterate and immigrants who speak a different native language. According to the PISA report, the majority of the adult population in Spain has difficulties understanding dense texts [1]. In addition, 1.7% of the population is functionally illiterate and there are 277,472 people with some manner of intellectual disability.

To provide universal access to information and make texts more accessible, one of the most noteworthy initiatives is the Easy Reading guidelines, which propose recommendations on how to adapt texts so that they are easier to read and understand. In Spain, there are regulations in this regard, such as the UNE 153101:2018 (Easy to read. Guidelines and recommendations for the elaboration of documents) standard [2]. A second is the Plain Language initiative, which promotes the use of simple language in information society content¹. Related work is the Web Content Accessibility Guidelines (WCAG) [3] within the Web Accessibility Initiative (WAI) of the W3C, which contain specific standards that improve the understanding of texts. Along these lines, another notable initiative is the Accessibility Working Group for Cognitive and Learning Disabilities (COGA TF) [4].

In all these works, the guideline of using language with a simple lexicon is repeated as being essential; techniques such as offering definitions of unusual words and distinguishing them so as to be recognizable by users are also recommended. Following this guideline is costly since, in practice, there are no resources that offer automatic support, making manual compliance unfeasible in most cases. As part of the solution, there are Text Simplification methods in the Natural Language Processing (NLP) field which could provide systematic support to improve the accessibility of textual content and promote compliance with cognitive accessibility guidelines [3][5].

While recent research focuses only on finding the sense of a word in a context, this work adds the procedure of finding the correct definition for a word in a certain context. This has been a key factor in defining the motivation of this research work: to create mechanisms which offer definitions for complex words.

A description of the accessibility guidelines concerning the accessibility of textual content is provided in Section 2. Section 3 presents related work. The description of the Word Sense Disambiguation (WSD) system and the EASIER platform in which it is integrated is presented in Section 4. Section 5 includes the evaluation of the WSD system. Finally, conclusions and future lines of research are presented in Section 6.

¹ Plain Language network web page https://plainlanguagenetwork.org/plain-language/que-es-el-lenguaje-claro/ [Accessed 18 November 2020].

Table 1: Success criterion 3.1.3 techniques (WCAG 2.1)

Option A: If the word or phrase has a unique sense within the web page:
  Main technique: G101: Providing the definition of a word or phrase used in an unusual or restricted way
  Techniques:
    G55: Linking to definitions
      • H40: Using description lists
      • H60: Using the link element to link to a glossary
    G112: Using inline definitions
      • H54: Using the dfn element to identify the defining instance of a word
    G62: Providing a glossary
    G70: Providing a function to search an online dictionary

Option B: If the word or phrase has different senses within the same web page:
  Main technique: G101: Providing the definition of a word or phrase used in an unusual or restricted way
  Techniques:
    G55: Linking to definitions
      • H40: Using description lists
      • H60: Using the link element to link to a glossary
    G112: Using inline definitions
      • H54: Using the dfn element to identify the defining instance of a word

2 ACCESSIBILITY GUIDELINES
The Web Content Accessibility Guidelines (WCAG) [6], part of the W3C’s WAI, include guidelines which, if followed, assist in making web content more accessible for individuals with intellectual and learning disabilities.

Success Criterion 3.1.3 (Unusual Words) indicates that there must be a mechanism available for identifying specific definitions of words or phrases used in an unusual or restricted way, including idioms and jargon. Success Criterion 3.1.3 is included in Guideline 3.1 (Readable), which recommends making text content readable and understandable. Likewise, this guideline belongs to Principle 3 (Understandable), which states that the information and operation of the user interface must be understandable.

Certain disabilities make it difficult to understand nonliteral word usage and specialized words or usage. This Success Criterion may help people with cognitive, language, and learning disabilities who have difficulty understanding words and phrases or those who have difficulty using context to aid understanding.

Success Criterion 3.1.3 provides techniques which can assist these groups of people with disabilities. Fundamentally, said techniques state that the definition of a word used in an unusual or restricted way on a web page must be provided.

As shown in Table 1, in order to follow the techniques and provide the definition for unusual words, it is necessary to differentiate between two situations: a word has a unique meaning within the web page, or different senses of the same word appear within the same web page. Therefore, the first step in applying these techniques is to differentiate between these two situations. Moreover, it is necessary to contextualize the meaning of the unusual word in the textual content. As a solution, in this work, we propose using a WSD system which employs NLP methods.

3 RELATED WORK
Human language is ambiguous, and many words can be interpreted in multiple ways depending on the context. The computational identification of meaning for words in context is called Word Sense Disambiguation [7]. As new words continue to be added to our language, this task becomes increasingly complex. Furthermore, as it is influenced by the domain in which the knowledge is created, producing resources to support knowledge-based WSD is extremely costly. Considering this disadvantage, machine learning approaches have proved to be a good solution by using predictive strategies. Research can be found which has used supervised, unsupervised, and semi-supervised approaches.

Knowledge-based approaches require extensive lexical resources to determine the sense of a target word. Lesk [8] follows a knowledge-based approach by overlapping the word context and the sense definitions from a machine-readable dictionary. The sense that has the greatest number of words in common with the context of the target word is then chosen. This approach depends on finding the exact words in the definition, resulting in poor performance.

Supervised approaches require sense-tagged corpora to train an algorithm to determine which sense is correct. For example, a team in the Senseval-3 competition [9] followed a supervised approach by training Support Vector Machines supported by the neighboring words’ POS tags, single words around the context and syntactic relations, showing good recall when compared with the other participants in the task.

More recently, unsupervised approaches using comparable corpora strategies have been among the most researched [10]. Additionally, with the introduction of word embeddings, many investigations combine these concepts. Moradi et al. [11] trained a Word2Vec model to evaluate similarity measures to disambiguate the Persian language by considering semantic relationships between words.
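The gloss-overlap idea attributed to Lesk [8] above can be illustrated with a minimal sketch. It is not the original algorithm's implementation: the stopword list, the two English glosses for "bank" and the function names are assumptions made purely for illustration.

```python
# Toy stopword list (an assumption; real systems use a fuller list).
STOPWORDS = {"a", "an", "the", "of", "and", "on", "that", "as", "such", "he"}


def content_words(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:") for w in text.lower().split())
    return {w for w in words if w and w not in STOPWORDS}


def simplified_lesk(context, glosses):
    """Return the sense whose gloss shares the most words with the context.

    `glosses` maps a sense label to its dictionary definition.
    Ties are resolved in favour of the sense listed first.
    """
    context_set = content_words(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in glosses.items():
        overlap = len(context_set & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


# Hypothetical glosses for the English word "bank".
GLOSSES = {
    "finance": "an institution that accepts deposits and lends money",
    "river": "sloping land beside a body of water such as a river",
}
CONTEXT = "he sat on the bank of the river and watched the water"

chosen = simplified_lesk(CONTEXT, GLOSSES)
print(chosen)
```

Because the match is on exact word forms, a gloss that says "rivers" or "watery" would not overlap with this context, which is the weakness the paragraph above points out.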

Semi-supervised approaches attempt to overcome the disadvantages of each of the previous approaches. Generally, they consist of training classifiers from a small amount of labeled data and a large amount of unlabeled data. A recent work [12] proposes a semi-supervised method to improve a Long Short-Term Memory (LSTM) model using self-learning, constructing the model from the few labeled data to which the authors had access.

At the same time, many competitions have been organized with the goal of solving WSD problems. The SemEval-2007 competition [13] focused on correctly disambiguating and identifying the semantic relationship between words. The SemEval-2013 competition [14] addressed this task by presenting multilingual sense-annotated corpora, one of which was in Spanish, tagged with WordNet, BabelNet, and Wikipedia. SemEval-2015 [15] addressed both WSD and Entity Linking (EL) to analyze and find ways to solve these tasks with similar methods by providing resources that integrate encyclopedic knowledge and lexicographic information.

However, others have tackled this problem from another point of view. Such is the case of Google, which uses its BERT language representation model (Bidirectional Encoder Representations from Transformers) [16] to solve different NLP tasks by fine-tuning its pretrained models. A research project based on this approach [17] consisted of fine-tuning a BERT model for WSD using WordPiece embeddings as part of the inputs. Good results were obtained, outperforming current approaches in F1 score. The core process of our WSD system follows the aforementioned approach.

This task greatly supports other areas, such as simplification systems [18] oriented to people with cognitive disabilities. For the benefit of people with communication and language disabilities, research can be found on systems that include WSD, such as [19], to provide predictive text functionality, and [20], concerned with the selection of a correct pictogram.

4 WORD-SENSE DISAMBIGUATION (WSD) SYSTEM
This section introduces the EASIER project, into which the WSD system has been integrated, and then describes the WSD system itself.

4.1 EASIER Project
The WSD task, which supports the accessibility requirement of Success Criterion 3.1.3 (Unusual Words), has been integrated into the EASIER [21] platform² ³ described in this section.

The difficulties in understanding texts containing unusual words, which may create accessibility barriers for people with language and learning disabilities, led to the creation of the EASIER project. It is a system that provides various resources which improve cognitive accessibility in Spanish, such as synonyms, definitions, and pictograms for complex words within a text. To provide the definition resource, it was necessary to integrate a WSD task, which is the focus of this work.

The EASIER system is a web system that provides an accessible web interface. It was designed to be compliant with WCAG 2.1 (Level AA) [6]. Furthermore, cognitive accessibility guidelines [4] were followed. Accessibility requirements such as making the purpose of the page obvious, using clear and understandable content, and making each step of the simplification process as clear as possible, including instructions, were taken into consideration. Additionally, a consistent visual design, using graphic symbols that help the user, was used (see Figure 1).

Figure 1: Screenshot of the EASIER system web interface (desktop version)

Its functionality is as follows. Through the web interface, a user can enter a text and, subsequently, a web page with the results is returned. The system provides the same text that was entered, but with the complex words highlighted (see Figure 1). When the user interacts with the complex words, synonyms, definitions and pictograms are provided. Adhering to a responsive design, a user interface for mobile devices is also provided.

Moreover, browser extensions have been developed for the Chrome and Mozilla browsers that offer the function of identifying complex words and providing synonyms for text a user selects on any web page.

In order to provide these language resources, the system integrates various NLP tasks. Firstly, in order to detect complex words within a text, lexical simplification based on Machine Learning is carried out using the Support-Vector Machine (SVM) classification algorithm [22].

To provide a definition of the complex word detected during the previous step, dictionaries are then used and the WSD task is implemented as described in this paper.

Lastly, this service provides a pictogram of the complex word. This is obtained through the ARASAAC resource⁴, which offers graphic elements for people with communication disabilities. The ARASAAC resource provides an available database of pictograms. Pictograms are a tool used by people with cognitive or communication disabilities.

² Easier project git http://github.com/LURMORENO/easier [Accessed 18 November 2020].
³ Easier project webpage http://163.117.129.208:8080/ [Accessed 18 November 2020].
⁴ Arasaac resource webpage http://www.arasaac.org/index.php [Accessed 18 November 2020].

4.2 WSD System
This section introduces the procedure the WSD system uses to select the correct definition for a specific word. In order to understand the core of the system, the BERT model the system relies on is subsequently explained.

4.2.1 Meaning lookup procedure. Figure 2 shows an example of the WSD selection procedure. Two dictionaries are used: the “Real Academia de la Lengua Española” Dictionary (RAE)⁵ and the “Diccionario Facil”⁶, the latter a dictionary of Easy Reading definitions created by the “Plena Inclusión Madrid”⁷ association’s experts and users with cognitive disabilities.

Figure 2: Example of WSD system procedure

The system receives a sentence along with the target word, then creates a list of the target word definitions extracted from the RAE or Easy Dictionary. With the help of the BERT model in the system, the word is masked in the sentence to which it belongs, then the model predicts which words can be substituted for the masked word. This results in a list of words that share a common meaning, thus disambiguating the target word. Later, with the help of a Spanish Spacy⁸ model, we tokenize the sentence and search for verbs, nouns, and adverbs. The words in the sentence that have lexical content are then extracted and added to the list. Finally, these words are lemmatized to enrich the list.

Because the first list created by our system contains words with similar semantics, the two lists (the word list and the definition list) are compared, and the coincidences are counted. The hypothesis followed is that the definition in the second list that has the most words in common with the first list is the correct definition associated with the target word and, consequently, the one chosen by the WSD system. If no coincidences are found, the system selects the first definition in the list (First in).

4.2.2 BERT model. The core module uses a multilingual, pretrained BERT model⁹ released by Google. To import this model, we use the PyTorch interface for BERT by Hugging Face¹⁰ and, as this is a pretrained model, a specific input is required. Said input is as follows:
• The sentence must be surrounded by specific tokens: [CLS] and [SEP], indicating the beginning and the end of the sentence, respectively.
• Tokens that conform with the fixed vocabulary used in BERT.
• Token IDs from BERT’s tokenizer (one of them must be the target word).
• Mask IDs to distinguish between tokens and padding elements.
• Segment IDs to distinguish between sentences. In this instance, said value is always the same since we input one sentence every iteration.
• Positional embeddings.

We need to use BERT’s tokenizer because of its unique way of representing words. This tokenizer was created with a WordPiece model and therefore splits words into smaller sub-words or characters. The tokenizer first checks whether the whole word is in the vocabulary; if not, it tries to break the word into sub-words. Take for example the word “embeddings”, for which the tokenizer outputs four sub-words: “em”, “##bed”, “##ding”, “##s”. This happens because, within the word, the tokenizer finds coincidences between sub-words and the tokenizer’s vocabulary. Later, the sentence is converted to a list of vocabulary indexes and Segment IDs (just one).

As a next step we generate an object from the model. This object has four dimensions: the layer number (12 layers), the batch number (1 sentence), the number of tokens and the feature number (768). At first, the values of the object are grouped by layers, but for our purposes we need them grouped by tokens, so we change the previous dimensions into the following: number of tokens, number of layers, number of features, with the help of the permute function of PyTorch.

Finally, to create our word vectors we combine some of the layer vectors. To know which layers provide the best results, we follow Jay Alammar’s approach on named entity recognition, which concatenates the last four layers of the model. It is worth mentioning that, depending on the task, this strategy could have different results.

These model embeddings are useful for semantic searches and information retrieval. The main difference between this type of embedding and others, such as Word2Vec or FastText, is that BERT produces word representations that are dynamically informed by the context words, while Word2Vec words are represented as unique indexed values. In common word embeddings, each word is represented with a single vector, ignoring polysemic words. In a sense, with this word embedding, each word could have several vectors, one for each of its possible senses. Therefore, these models allow the task of word disambiguation to be addressed when one attempts to discover the meaning of a word.

⁵ RAE Web Page https://dle.rae.es/ [Accessed 18 November 2020].
⁶ Diccionario facil resource webpage http://www.diccionariofacil.org/ [Accessed 18 November 2020].
⁷ Plena Inclusion association webpage https://plenainclusionmadrid.org/ [Accessed 18 November 2020].
⁸ Spacy web page https://spacy.io/ [Accessed 18 November 2020].
⁹ Pytorch implementation: https://github.com/shehzaadzd/pytorch-pretrained-BERT [Accessed 18 November 2020].
¹⁰ Hugging Face git repository https://github.com/huggingface/transformers [Accessed 18 November 2020].
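The input preparation and vector construction described in Section 4.2.2 can be sketched without loading the actual model. The following is a pure-Python illustration under stated assumptions: the nine-entry vocabulary, the function names and the use of nested lists in place of PyTorch tensors are all hypothetical, and positional embeddings are left out because the model adds them internally.

```python
# Toy vocabulary (an assumption): the real multilingual BERT vocabulary is far larger.
VOCAB = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "word", "em", "##bed", "##ding", "##s"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def wordpiece(word):
    """Greedy longest-match-first split of one word into sub-words."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation marker
            if candidate in TOKEN_TO_ID:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no sub-word matched at this position
        pieces.append(piece)
        start = end
    return pieces


def build_inputs(words, max_len):
    """Build tokens, token IDs, mask IDs and segment IDs for one sentence."""
    tokens = ["[CLS]"]
    for word in words:
        tokens.extend(wordpiece(word))
    tokens.append("[SEP]")
    n_real = len(tokens)
    if n_real > max_len:
        raise ValueError("sentence longer than max_len")
    tokens += ["[PAD]"] * (max_len - n_real)
    token_ids = [TOKEN_TO_ID[t] for t in tokens]
    mask_ids = [1] * n_real + [0] * (max_len - n_real)  # 1 = token, 0 = padding
    segment_ids = [0] * max_len  # a single sentence, so always the same value
    return tokens, token_ids, mask_ids, segment_ids


def last_four_concat(hidden):
    """hidden[layer][batch][token] is a feature list; batch size is 1."""
    layers = [layer[0] for layer in hidden]  # drop the batch axis
    n_tokens = len(layers[0])
    # Regroup by token: the nested-list analogue of a permute on a tensor.
    by_token = [[layer[t] for layer in layers] for t in range(n_tokens)]
    # Concatenate the last four layer vectors of each token.
    return [[x for vec in tok[-4:] for x in vec] for tok in by_token]


tokens, token_ids, mask_ids, segment_ids = build_inputs(["word", "embeddings"], max_len=10)

# Toy activations: 12 layers, 1 sentence, 2 tokens, 3 features per layer.
toy_hidden = [[[[float(layer)] * 3 for _ in range(2)]] for layer in range(12)]
word_vectors = last_four_concat(toy_hidden)
```

With the 768 features per layer mentioned above, the same concatenation would give one 3,072-dimensional vector per token.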

5 WSD SYSTEM EVALUATION
The evaluation of the WSD system was carried out by a linguistic expert specialized in easy reading and plain language. The expert received a set of sentences, each associated with the target word and the definition selected by the system. The expert verified whether the definition selected by the system was correct, taking the context of the word in the sentence into consideration.

Table 2 shows the results of the evaluation of the WSD system. 525 instances were evaluated, of which 70.48% were rated as correct, thus indicating our approach has a fair performance.

Table 2: Results in WSD System

Strategy      # Instances   % Correct
BERT Model    117           64.95
First in      408           72.06
Total         525           70.48

The BERT model approach was able to process 117 instances, of which 64.95% were rated as correct. By applying the “First in” strategy, 408 instances were processed, of which 72.06% were rated as correct. These results demonstrate that our approach performs well when dealing with polysemous words but faces some issues on recall. To help understand our results, examples are illustrated below.

With regard to the positive results, we noticed that the system tends to select longer definitions, since a longer definition offers more words that can coincide with the word list, as shown in Table 3. In this instance, the coincidences were the words “declarar” (state) and “exponer” (present). Furthermore, we noticed that the system prefers “Diccionario Fácil” definitions whenever possible. This occurs because this database offers the definition of a word and an example of the word in a sentence. We believe that this is an advantage because some definitions in the RAE dictionary are quite short and do not provide sufficient explanation. Table 4 shows an example of this in which the RAE dictionary provides a very short definition (a) among all the possible definitions. However, the system chooses a correct and more explanatory definition (c).

By analyzing scenarios in which the system does not find any coincidences, we found that, generally speaking, there are no coincidences in generic sentences such as the one in Table 5, where the model does not have much context to analyze and outputs generic words like “resulta” (results), “consiste” (consists), and “ayuda” (help). Moreover, in some cases, we found that the system does not find coincidences because the definition is in another grammatical form. This issue can be corrected by lemmatizing the words in the definition before searching for coincidences.

Lastly, certain scenarios were found by analyzing negative results. As seen in Table 5, the system presents problems when faced with generic sentences. In this case it outputs generic words, therefore selecting incorrect definitions. Another issue found was that when the system encounters the same number of coincidences among definitions, it selects the last processed definition. There must be a mechanism capable of selecting the correct definition in this scenario.

6 CONCLUSIONS
The main objective of this work is to offer correct definitions for complex words, resulting in an improvement in cognitive accessibility by increasing the understanding and readability of texts. To accomplish this objective, a WSD system has been created that uses a context-aware approach supported by a multilingual BERT model, which provides a definition of a word, taking its context into account. This system has been integrated into the EASIER project, which provides various resources that improve cognitive accessibility in Spanish such as synonyms, definitions, and pictograms for complex words within a text.

The system has been evaluated by a linguistic expert, obtaining a precision score of 70.48%. The system showed positive results by selecting correct definitions. However, the coverage was low. It was observed that there were missing coincidences because certain definitions were in a different grammatical form than the words that the WSD system provided. To improve this, as future work, content words of the definition can be lemmatized and added to the WSD process. Additionally, an issue was observed in some cases in which the same number of coincidences was found among definitions. A mechanism to solve this must be developed, taking the main objective of this work into consideration.
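As a consistency check on the figures in Table 2, the overall percentage should equal the instance-weighted average of the two strategies. The short sketch below recomputes it; the implied counts of correct instances are an inference from the rounded percentages, not numbers reported in the paper.

```python
# Figures copied from Table 2: (instances, % rated correct).
RESULTS = {"BERT Model": (117, 64.95), "First in": (408, 72.06)}

total_instances = sum(n for n, _ in RESULTS.values())
# Instance-weighted average of the two percentages.
overall = sum(n * pct for n, pct in RESULTS.values()) / total_instances
# Implied number of correct instances per strategy (inferred, not reported).
correct_counts = {name: round(n * pct / 100) for name, (n, pct) in RESULTS.items()}
# Share of instances that the BERT strategy could process.
bert_share = RESULTS["BERT Model"][0] / total_instances

print(total_instances, round(overall, 2), correct_counts, round(bert_share, 3))
```

The weighted average reproduces the reported total, and the BERT strategy accounts for roughly 22% of the instances, which is in line with the remark that coverage was low.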

Table 3: WSD system positive results example 1

Sentence: “... confianza para todos los ciudadanos”, ha explicado. (“... trust for all citizens,” he explained)
Target word: “explicado” (explained)
Definition options:
• Declarar o exponer cualquier materia, doctrina o texto difícil, con palabras muy claras para hacerlos más perceptibles. (state or present any material, doctrine or difficult text with very clear words so as to make it more understandable)
• Enseñar en la cátedra. (teach from a podium)
• Justificar, exculpar palabras o acciones, declarando que no hubo en ellas intención de agravio. (justify, exculpate words or actions, stating that there was no insult or injury intended)
• Dar a conocer la causa o motivo de algo. (present the cause of or motive for something)
Definition selected by the system: Declarar o exponer cualquier materia, doctrina o texto difícil, con palabras muy claras para hacerlos más perceptibles. (state or present any material, doctrine or difficult text with very clear words so as to make it more understandable)
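Using the four definitions of Table 3, the counting rule of Section 4.2.1 can be sketched as follows. This is a minimal illustration, not the EASIER code: the candidate word list is hypothetical (the paper only reports that “declarar” and “exponer” were the coincidences), the function names are assumptions, and no lemmatization is performed.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a real system would also lemmatize."""
    return set(re.findall(r"\w+", text.lower()))


def select_definition(candidates, definitions):
    """Pick the definition sharing the most words with `candidates`.

    Returns (index, strategy). With no coincidences at all, the first
    definition is returned ("First in"). Ties go to the last definition
    processed, mirroring the behaviour reported in Section 5.
    """
    candidate_set = {c.lower() for c in candidates}
    best_index, best_count = 0, 0
    for index, definition in enumerate(definitions):
        count = len(candidate_set & tokenize(definition))
        if count > 0 and count >= best_count:
            best_index, best_count = index, count
    strategy = "BERT" if best_count > 0 else "First in"
    return best_index, strategy


# Definitions of "explicar" as listed in Table 3.
DEFINITIONS = [
    "Declarar o exponer cualquier materia, doctrina o texto difícil, "
    "con palabras muy claras para hacerlos más perceptibles.",
    "Enseñar en la cátedra.",
    "Justificar, exculpar palabras o acciones, declarando que no hubo "
    "en ellas intención de agravio.",
    "Dar a conocer la causa o motivo de algo.",
]
# Hypothetical word list: predicted substitutes plus lemmatized content words.
CANDIDATES = ["declarar", "exponer", "afirmar", "confianza", "ciudadano"]

selected = select_definition(CANDIDATES, DEFINITIONS)
print(selected)
```

Note that “declarando” in the third definition does not match “declarar” here, which illustrates why Section 5 suggests lemmatizing the definitions before counting.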

Table 4: WSD system on selecting a more explanatory definition

Sentence: “Deben ser reconocidos por su contribución a la mejora en el intercambio y procesamiento de datos e información entre todos los agentes del sistema.” (They should be recognized for their contribution to the improvement in the exchange and processing of data and information between all of the system’s agents.)
Target word: “procesamiento” (processing)
Definition options:
a) Acto de procesar. (the act of processing)
b) Acto por el cual se declara a alguien como presunto autor de unos hechos delictivos a efectos de abrir contra él un proceso penal. (an action by which an individual who has allegedly committed a crime is formally charged and criminal proceedings are initiated)
c) Aplicación sistemática de una serie de operaciones sobre un conjunto de datos, generalmente por medio de máquinas, para explotar la información que estos datos representan. (The systematic application of a series of operations on a set of data, normally by machines, to exploit the information said data represents.)
Definition selected by the system: Aplicación sistemática de una serie de operaciones sobre un conjunto de datos, generalmente por medio de máquinas, para explotar la información que estos datos representan. (The systematic application of a series of operations on a set of data, normally by machines, to exploit the information said data represents.)

Table 5: WSD system negative result

Sentence: “Y todo ello redunda en una mejor salud” (“And all of this results in improved health.”)
Target word: “redunda” (results)
Definition options:
a) Dicho especialmente de un líquido: Rebosar, salirse de sus límites o bordes por demasiada abundancia. (Used especially when referring to a liquid: spill over the edges or borders because the quantity exceeds the capacity.)
b) Dicho de una cosa: Venir a parar en beneficio o daño de alguien o algo. (Used when referring to a thing: to create a benefit for or a damage to someone or something.)
Definition selected by the system: Dicho especialmente de un líquido: Rebosar, salirse de sus límites o bordes por demasiada abundancia. (Used especially when referring to a liquid: spill over the edges or borders because the quantity exceeds the capacity.)

ACKNOWLEDGMENTS
This work was supported by the Research Program of the Ministry of Economy and Competitiveness - Government of Spain (DeepEMR project TIN2017-87548-C2-1-R), and the Accessible Technologies award - INDRA Technologies and the Fundación Universia (www.tecnologiasaccesibles.com). We would also like to express our sincerest gratitude to the association of people with cognitive disabilities "Plena Inclusión Madrid" (plenainclusionmadrid.org), who granted us the use of its easy reading dictionary ("Diccionario Fácil").

REFERENCES
[1] M. de E. OCDE (Organización para la Cooperación y el Desarrollo Económicos), “http://www.educacionyfp.gob.es/prensa/actualidad/2013/10/20131008-piaac.html,” 2013.
[2] “UNE, Asociación Española de Normalización, UNE 153101:2018 (Easy to read. Guidelines and recommendations for the elaboration of documents),” 2018. [Online]. Available: https://www.une.org/encuentra-tu-norma/busca-tu-norma/norma?c=N0060036. [Accessed: 18-Nov-2020].
[3] L. Moreno, R. Alarcon, I. Segura-Bedmar, and P. Martínez, “Lexical simplification approach to support the accessibility guidelines,” Proc. XX Int. Conf. Hum. Comput. Interact. (Interacción ’19), pp. 1–4, 2019.
[4] W3C, “Making Content Usable for People with Cognitive and Learning Disabilities, W3C Working Draft,” 2020. [Online]. Available: https://www.w3.org/TR/coga-usable/. [Accessed: 17-Jul-2020].
[5] L. Moreno, P. Martínez, I. Segura-Bedmar, and R. Revert, “Exploring language technologies to provide support to WCAG 2.0 and E2R guidelines,” Proc. XVI Int. Conf. Hum. Comput. Interact. - Interacción ’15, pp. 1–8, 2015.
[6] W3C, “Web Content Accessibility Guidelines (WCAG),” 2019. [Online]. Available: https://www.w3.org/WAI/standards-guidelines/wcag/.
[7] R. Navigli, “Word Sense Disambiguation: A Survey,” vol. 41, no. 2, 2009.
[8] M. Lesk, “Automatic Sense Disambiguation Using Machine Readable Dictionaries: How to Tell a Pine Cone from an Ice Cream Cone,” pp. 24–26, 1986.
[9] Y. K. Lee, H. T. Ng, and T. K. Chia, “Supervised Word Sense Disambiguation with Support Vector Machines and Multiple Knowledge Sources,” no. July, 2004.
[10] H. Kaji and Y. Morimoto, “Unsupervised Word Sense Disambiguation Using Bilingual Comparable Corpora,” 2005.
[11] B. Moradi, E. Ansari, and Z. Zabokrtsky, “Unsupervised Word Sense Disambiguation Using Word Embeddings,” Proceeding 25th Conf. FRUCT Assoc., 2019.
[12] R. Cao, J. Bai, and H. Shinnou, “Semi-supervised learning for all-words WSD using self-learning and fine-tuning,” no. Paclic 33, pp. 356–361, 2019.
[13] S. S. Pradhan, D. Dligach, and M. Palmer, “SemEval-2007 Task 17: English Lexical Sample, SRL and All Words,” no. June, pp. 87–92, 2007.
[14] R. Navigli, D. Jurgens, and D. Vannella, “SemEval-2013 Task 12: Multilingual Word Sense Disambiguation,” vol. 2, no. SemEval, pp. 222–231, 2013.
[15] A. Moro and R. Navigli, “SemEval-2015 Task 13: Multilingual All-Words Sense Disambiguation and Entity Linking,” no. SemEval, pp. 288–297, 2015.
[16] J. Devlin, M. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” 2019.
[17] J. Du, F. Qi, and M. Sun, “Using BERT for Word Sense Disambiguation,” no. 1, 2019.
[18] J. Medina et al., “Towards Integrating People with Intellectual Disabilities in the Digital World,” no. Id, pp. 348–357, 2016.
[19] H. Al-Mubaid and P. Chen, “Application of word prediction and disambiguation to improve text entry for people with physical disabilities (assistive technology),” Int. J. Soc. Humanist. Comput., vol. 1, no. 1, pp. 10–27, 2008.
[20] L. Sevens, G. Jacobs, V. Vandeghinste, I. Schuurman, and F. Van Eynde, “Improving Text-to-Pictograph Translation Through Word Sense Disambiguation,” pp. 131–135, 2016.
[21] L. Moreno, R. Alarcon, and P. Martínez, “EASIER system. Language resources for cognitive accessibility,” 22nd Int. ACM SIGACCESS Conf. Comput. Access. (virtual), 2020.
[22] R. Alarcon, L. Moreno, I. Segura-Bedmar, and P. Martínez, “Lexical simplification approach using easy-to-read resources,” Proces. del Leng. Nat., pp. 95–102, 2019.