Analysis of Natural Language Processing (NLP) approaches to determine semantic similarity between texts in domain-specific context

Author: Surabhi Som (6248160), [email protected]
Supervisors: Denis Paperno, [email protected]
             Rick Nouwen, [email protected]

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Artificial Intelligence in Faculty of Science, Utrecht University

Table of Contents

Chapter 1 Introduction
    1.1 Problem Description
Chapter 2 Literature Review
    2.1 Natural Language Processing
    2.2 Ontologies and its importance
    2.3 Semantics
    2.4 Semantic Similarity
    2.5 Sentence Semantic Similarity
    2.6 Text Semantics
    2.7 Stemming and Lemmatization