Access to Electronic Thesis

Author: Danica Damljanovic
Thesis title: Natural Language Interfaces to Conceptual Models
Qualification: PhD

This electronic thesis is protected by the Copyright, Designs and Patents Act 1988. No reproduction is permitted without consent of the author. It is also protected by the Creative Commons Licence allowing Attributions-Non-commercial-No derivatives.

If this electronic thesis has been edited by the author it will be indicated as such on the title page and in the text.

Danica D. Damljanović

Natural Language Interfaces to Conceptual Models

Submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy at The University of Sheffield, Department of Computer Science

July 2011

Abstract

Accessing structured data in the form of ontologies currently requires the use of formal query languages (e.g., SeRQL or SPARQL) which pose significant difficulties for non-expert users. One way to lower the learning overhead and make ontology queries more straightforward is through a Natural Language Interface (NLI). While there are existing NLIs to structured data with reasonable performance, they tend to require expensive customisation to each new domain. Additionally, they often require specific adherence to a pre-defined syntax which, in turn, means that users still have to undergo training. In this thesis, we study the usability of NLIs from two perspectives: that of the developer who is customising the NLI system, and that of the end-user who uses it for querying. We investigate whether usability methods such as feedback and clarification dialogs can increase the usability for end users and reduce the customisation effort for the developers. To that end, we have developed two systems, QuestIO and FREyA, whose design, evaluation and comparison with similar systems form the core of the contribution of this thesis.

Acknowledgements

This work would not have been possible without the help and support of many people. I am grateful to my supervisor, Hamish Cunningham, for encouragement, guidance and support. My examiners, Enrico Motta and Mark Stevenson, for an interesting discussion during my viva and useful comments that improved this thesis. Kalina Bontcheva and Valentin Tablan for their support and guidance in the course of the TAO project. The members of the GATE team who were kind enough to participate in the QuestIO user evaluation. The participants of the First GATE Summer School, who participated in the FREyA user evaluation. Members of the NLP Group in Sheffield, and particularly Mark Hepple, Rob Gaizauskas, John Derick, Louise Guthrie, David Guthrie, Sanaz Jabbari, Kumutha Swampillai and Leon Derczynski for encouragement and useful discussions. Johann Petrak, for useful discussions while he was a visiting researcher in the NLP group, and also for developing the Virtual Documents plugin which later became a part of FREyA. Yaoyong Li for answering my questions about statistics. The GOOD OLD AI Network, and all its members who are always a great inspiration, and most of all Vladan Devedžić. Ivan Dimitrijević for pointing me to the JIT library which became a part of FREyA. Nicolas Garcia Belmonte for support with the JIT library. Ontotext, and in particular Ivan Peikov, Danail Kozuranov and Atanas Kyriakov, for their support with OWLIM and LKB which are used in QuestIO and FREyA. Vanessa Lopez from The Knowledge Media Institute for giving me access to the AquaLog system. Abraham Bernstein and Esther Kaufmann from the University of Zurich, for sharing with me the Mooney dataset in OWL format, and J. Mooney from the University of Texas for making this dataset publicly available. David and Kumutha for reading and correcting the English in parts of this thesis; Milan Agatonović for reading the thesis draft and suggesting many structural improvements; Hamish and Kalina also read and provided many useful comments on this thesis draft; Sanaz read parts of this thesis and gave useful comments – their suggestions helped me improve this thesis significantly. Chris Moran, with whom I had lots of discussions on the usability of technologies, and who triggered many interesting questions in my head that, I believe, made this thesis stronger. My family and friends for their love and understanding. Milan Agatonović for bearing with me through all the difficult moments while working on this thesis and helping out with all those ClassNotFoundExceptions – this work would not have been possible without him.

I would like to thank the European Commission for funding the majority of the work in this thesis: the first part of this work has been supported by the EU Sixth Framework Program project TAO (FP6-026460), while the second half has been partially supported by the EU Seventh Framework Program project LarKC (FP7-215535). I offer my regards and blessings to all those who supported me in any respect during the completion of this thesis and whose names I have not mentioned, including many colleagues I met and had very inspirational discussions with during SSMS'07, LREC'08, ESWC'08, ISWC'08, CNL'09, K-CAP'09, LREC'10, ESWC'10, CNL'10, and project meetings in the course of TAO and LarKC.

Publications

The work presented within this thesis is published in the following papers:

• D. Damljanovic, M. Agatonovic, H. Cunningham: FREyA: an Interactive Way of Querying Linked Data using Natural Language. In: Proceedings of the 1st Workshop on Question Answering over Linked Data (QALD-1), collocated with the 8th Extended Semantic Web Conference (ESWC'11). Heraklion, Greece, June 2011.

• H. Wang, D. Damljanovic, T. Payne, N. Gibbins, and K. Bontcheva: Transition of Legacy Systems to Semantically Enabled Applications: TAO Method and Tools. Semantic Web Journal, IOS Press, 2011, to appear. Available from http://www.semantic-web-journal.net/

• H. Wang, D. Damljanovic and J. Sun: Enhanced Semantic Access to Formal Software Models. In the Proceedings of the 12th International Conference on Formal Engineering Methods, Shanghai, China, November 16-19, 2010.

• D. Damljanovic: Towards Portable Controlled Natural Languages for Querying Ontologies. In Rosner, M., Fuchs, N., eds.: Proceedings of the 2nd Workshop on Controlled Natural Language. Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Marettimo Island, Sicily, September 13-15, 2010.

• D. Damljanovic, M. Agatonovic, H. Cunningham: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Extended Semantic Web Conference (ESWC'10), Springer Verlag, Heraklion, Greece, May 31-June 3, 2010.

• D. Damljanovic, M. Agatonovic, H. Cunningham: Identification of the Question Focus: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Language Resources and Evaluation Conference (LREC'10), ELRA 2010, La Valletta, Malta, May 17-23, 2010.

• A. Wyner, K. Angelov, G. Barzdins, D. Damljanovic, B. Davis, N. Fuchs, S. Hoefler, K. Jones, K. Kaljurand, T. Kuhn, M. Luts, J. Pool, M. Rosner, R. Schwitter, J. Sowa: On Controlled Natural Languages: Properties and Prospects. In N. Fuchs, ed.: Controlled Natural Language. Volume 5972 of Lecture Notes in Computer Science, pp. 281–289, Springer Berlin/Heidelberg, 2010.

• D. Damljanovic, K. Bontcheva: Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases. In V. Devedzic and D. Gasevic (Eds.), Special issue on Semantic Web and Web 2.0, Annals of Information Systems, Springer-Verlag, 2009.

• D. Damljanovic, M. Agatonovic, H. Cunningham: Usability of Natural Language Interfaces for Querying Ontologies. Workshop on Controlled Natural Language (CNL'09), Marettimo Island, Italy, June 2009.

• D. Damljanovic, K. Bontcheva: Enhanced Semantic Access to Software Artefacts. In Workshop on Semantic Web Enabled Software Engineering (SWESE'08) held in conjunction with ISWC'08, Karlsruhe, Germany, October 2008.

• V. Tablan, D. Damljanovic, K. Bontcheva: A Natural Language Query Interface to Structured Information. In Proceedings of the 5th European Semantic Web Conference (ESWC'08), Tenerife, June 2008.

• D. Damljanovic, V. Tablan, K. Bontcheva: A Text-based Query Interface to OWL Ontologies. In: 6th Language Resources and Evaluation Conference (LREC'08), Marrakech, Morocco, ELRA, May 2008.

• D. Damljanovic: Natural Language Queries for Enhanced Knowledge Access. Summer School on Multimedia Semantics Analysis, Annotation, Retrieval and Applications (SSMS'07), Glasgow, UK, July 15-21, 2007.

Contents

Abstract i

Acknowledgements iii

Publications v

I What are Natural Language Interfaces to Conceptual Models? 1

1 Introduction 3
1.1 Motivation 5
1.2 Challenges 6
1.3 Contribution 9

2 Conceptual Models 17
2.1 What are Conceptual Models? 17
2.2 Browsing Conceptual Models 19

3 Natural Language Interfaces: a Brief Overview 27
3.1 Natural Language Interfaces to Relational Databases 29
3.2 Open-domain Question-Answering Systems 35
3.3 Interactive Natural Language Interface Systems 39
3.4 Summary and Discussion 42

4 Evaluation of Natural Language Interfaces 47
4.1 Habitability 48
4.2 Usability 50
4.2.1 Effectiveness 51
4.2.2 Efficiency 52
4.2.3 User Satisfaction 52
4.3 Summary 53

II Usability of Natural Language Interfaces to Conceptual Models: State of the Art 55

5 Portability of Natural Language Interfaces to Structured Data 57
5.1 Introduction 57
5.2 ORAKEL 61
5.3 AquaLog and PowerAqua 63
5.4 E-librarian 65
5.5 PANTO 66
5.6 Querix 66
5.7 NLP-Reduce 67
5.8 CPL 67
5.9 Attempto Controlled English (ACE) 68
5.10 Summary and Discussion 69

6 Usability Enhancement Methods 77
6.1 Language Restriction 78
6.2 Feedback 80
6.3 Guided Interfaces 83
6.4 Extending the Vocabulary 85
6.5 How to Deal with Ambiguities? 88
6.5.1 Automatically Solving Ambiguities 88
6.5.2 Clarification Dialogs 89
6.5.3 Query Refinement 90
6.6 Summary and Discussion 92

III Building Natural Language Interfaces to Conceptual Models 97

7 QuestIO 99
7.1 Building the Domain Lexicon 100
7.2 Query Processing 103
7.2.1 Query Interpretation 103
7.2.2 Query Analysis 104
7.3 Coverage 112
7.4 Qualitative and Quantitative Evaluation 115
7.4.1 Correctness and Coverage 117
7.4.2 Portability and Scalability 120
7.5 User-centric Evaluation 123
7.5.1 QuestIO Prototype 123
7.5.2 Dataset 129
7.5.3 Evaluation Scope 129
7.5.4 Experimental Setup 130
7.5.5 Tasks 133
7.5.6 Results 134
7.6 Summary and Discussion 152

8 Towards Better Usability with FREyA: Part I 159
8.1 Feedback 160
8.1.1 Hiding Complexities 161
8.1.2 Identified Context and Tree-based View 161
8.1.3 Linearised List of Concepts 162
8.2 Evaluation 165
8.2.1 Evaluation Scope 165
8.2.2 Experimental Setup 165
8.2.3 Dataset 166
8.2.4 Tasks 166
8.2.5 Participants 168
8.2.6 Results 169
8.2.7 Summary and Discussion 179

9 Towards Better Usability with FREyA: Part II 183
9.1 FREyA Workflow 185
9.1.1 Ontology-based Lookup 185
9.1.2 Syntactic and Analysis 188
9.1.3 Consolidation 189
9.1.4 The Disambiguation Dialog 191
9.1.5 The Mapping Dialog 192
9.1.6 Combining Ontology Concepts into Triples and Generating SPARQL 195
9.1.7 An Illustrative Example 196
9.2 Answer Type Identification 199
9.2.1 QA Detector 201
9.2.2 FOC Finder 202
9.2.3 Consolidation 202
9.2.4 Generating Suggestions 204
9.2.5 An Illustrative Example 206
9.3 What to Show: Presentation of Results to the User 207
9.3.1 Display the Concise Answer 208
9.3.2 Feedback: the Graph-based View 209
9.4 Enriching Lexicon through User Interaction 210
9.5 Learning from the User's Selection 214
9.5.1 Environment 215
9.5.2 Reinforcement Function 216
9.5.3 Value Function 216
9.5.4 Generalisation of the Learning Model 217
9.6 Portability 220
9.7 Evaluation 222
9.7.1 Correctness 222
9.7.2 Learning 224
9.7.3 Ranked Suggestions 228
9.7.4 Answer Type 230
9.7.5 Querying Linked Data with FREyA 234
9.8 Summary 240

IV Conclusion 245

10 Summary of Findings 247

11 Future Challenges 255
11.1 Scalability 255
11.2 What to Show? 256
11.3 Learning 256
11.4 Personalised Vocabulary 257
11.5 Using FREyA in the Open-Domain Scenario 258

Appendices 260

A User-centric Evaluation with QuestIO 261

B User-centric Evaluation with FREyA 265

C Using Large Ontologies 277

Bibliography 281

List of Figures

1.1 Ambiguity 7
1.2 Expressiveness 7

2.1 The initial search page from the MuseumFinland portal 22
2.2 RDF Search by FactForge (www.factforge.net) 23
2.3 The Gruff interface showing the Mountain concept from the geography domain ontology 24
2.4 The DOPE Browser: searching for documents related to 'aspirin' 25

3.1 A sample session with INTELLECT, the example taken from [Harris, 1984][p. 45] 31
3.2 A sample context question introduced in TREC 2002 (the example from [Voorhees, 2002]) 37
3.3 Sample series questions with the target Organization and Person (the example from [Voorhees, 2004]) 38
3.4 A sample session with Jupiter, taken from Zue et al. [2000][p. 101] 41
3.5 An example session with ITSPOKE 42

5.1 An acquisition dialogue from TEAM (taken from [Grosz et al., 1987][p. 10]) 59
5.2 Performance variation based on question complexity and the dataset size 70
5.3 Semi-automated process of creating the domain lexicon from the ontology 72
5.4 Automated process of creating the domain lexicon from the ontology 73

6.1 Synchronising the user and the system vocabularies 78
6.2 Retrieving relevant results 90

7.1 Building the domain lexicon from the ontology 101
7.2 The QuestIO component diagram 103
7.3 The Query Analyser module 105
7.4 Position of ENTITY, LOCATION, and ALIAS classes within the PROTON ontology. ENTITY is most generic as it does not have any super-classes (owl:Thing excluded), followed by ALIAS that has one super-class, followed by LOCATION which is most specific as it has two super-classes (OBJECT and ENTITY) 111
7.5 Specificity score for properties 112
7.6 Supporting relative clauses with QuestIO 113
7.7 Supporting queries expressing conjunction/disjunction with QuestIO 115
7.8 Expressiveness: QuestIO returns the same result for three different variations of the input query 116
7.9 The average execution time across five runs, using the TG and KIM knowledge bases 122
7.10 Browsing ontology to find GATE developers 124
7.11 Using the QuestIO prototype to find the answer to the query What types of POS Tagger are there in GATE?. The document pane lists documents about POS Tagger, while the refinement pane lists the answer to the question which can also be used to find more specific documents 125
7.12 Documents about ANNIE POS Tagger 126
7.13 Searching for GATE developers using QuestIO 126
7.14 The refinement pane showing the list of GATE developers 127
7.15 Refined results after selecting Adam Funk from the list of developers 128
7.16 Traditional ways of searching about GATE components and frequency of search as reported by 12 subjects: 1 pm (per month), 1-2 dpw (days per week), 3-5 dpw (days per week) 136
7.17 Frequency of being asked about the GATE-related components, as reported by 12 subjects: 1 pm (per month), 1-2 dpw (days per week), 3-5 dpw (days per week) 137
7.18 Experience in using GATE 138
7.19 GATE Expertise expressed through the scalar value (0 – never used GATE, 11 – the most experienced GATE expert) 139
7.20 Expertise in semantic web technologies, familiarity expressed in the range from 0 (minimum) to 100 (maximum) 140
7.21 Average time per task 142
7.22 Task difficulty based on the average success rate per task: task completed with ease (0), completed with difficulty (1), failed to complete (2) 143
7.23 The SUS score by participant 145
7.24 Agreement of subjects to the statement that It was easier to finish the task using the prototype in comparison to the traditional ways of search 146
7.25 Relevance of results returned by QuestIO 147
7.26 Was the option to browse the ontology helpful? 148
7.27 Was the refinement pane helpful? 150
7.28 Was it easy to formulate the query using QuestIO? 151
7.29 Finding parameters of the Cebuano gazetteer: the importance of the data structure 155

8.1 The FREyA interface 162
8.2 The FREyA interface: showing results for states bordering Hawaii 163
8.3 The FREyA interface: showing results for runtime parameters of rasp parser 164
8.4 The FREyA interface: showing results for init parameters of rasp parser 164
8.5 The expertise of subjects in using ontologies, ontology editors and SPARQL 169
8.6 Task difficulty based on the success rate per task: finished with ease (0), finished with difficulty (1), not finished (2) 170
8.7 Frequency of different success rates per task 171
8.8 The distribution of SUS scores for 19 participants who completed the questionnaire 172
8.9 Clarity of feedback, all tasks 174
8.10 Clarity of feedback for Task 2 considering only the participants who have finished the task with ease 176
8.11 Query formulation per task 177

9.1 FREyA Workflow 185
9.2 Validation of potential ontology concepts through user interaction: an example 197
9.3 Generated suggestions and the result for city population of the new york city 198
9.4 Generated suggestions and the result for state population of new york state 198
9.5 Sample questions with the identified question type, the answer type and the focus (taken from Moldovan and Harabagiu [2000][p. 3]) 200
9.6 The workflow for the identification of the answer type 201
9.7 Combining the syntactic parse tree with the ontology-based lookup 206
9.8 The clarification dialog preceding the identification of the answer type 207
9.9 The answer to the query Show lakes in Minnesota 209
9.10 The graph showing the system's interpretation of the query Show lakes in Minnesota 210
9.11 Extended Vocabulary in FREyA 211
9.12 A sample dialog in FREyA 213
9.13 Mapping how many people to geo:statePopulation in the ontology 217
9.14 Validation of POCs through the user interaction: preparing for the dialog 219
9.15 Validation of POCs through the user interaction: the user is in control 219
9.16 The distribution of the number of dialogs for 202 correctly answered questions 223
9.17 Precision for the trained vs. baseline system using 10-fold cross-validation 226
9.18 The distribution of the MRR for 343 dialogs 229
9.19 Identification of the question focus: results 231
9.20 Failing to identify the answer type: no identified prepreterminals are nouns/noun phrases 232
9.21 Correctness of the identification of the answer type 233

A.1 Pre-test questionnaire 262
A.2 Post-test questionnaire for each task 263
A.3 Post-test questionnaire (SUS) for the QuestIO prototype 264

Part I

What are Natural Language Interfaces to Conceptual Models?


Chapter 1

Introduction

The context of this thesis is the task of automatically answering Natural Language questions by a machine as if it were a human. This task has long been a subject of research in the Natural Language Processing (NLP) and Knowledge Representation (KR) fields, and is a project still some way from completion. These two fields are subfields of Artificial Intelligence, and complement each other. Recent advances in KR have been influenced by the invention of the World Wide Web, and driven by work on the Semantic Web: the idea to make the Web and all information on the Web interoperable and understandable by computers, so that applications (e.g., agents) can understand, use, share and reason about them. Many KR languages have been invented for this purpose, including RDF – the Resource Description Framework [Manola and Miller, 2004] – and OWL – the Web Ontology Language [Smith et al., 2004]. These languages encapsulate knowledge of the world through a set of concepts and relations between them. These are organised into triples which have the form

SUBJECT <PREDICATE> OBJECT

That is, two concepts – subject and object – related by a predicate. These together form conceptual models or ontologies. In the domain of computer science, the term ontology refers to a logical schema of roles and concepts and the relationships between them (TBox) [Antoniou and van Harmelen, 2008]. A knowledge base refers to the actual data such as instances or individuals that are generated based on the definitions in the ontology (ABox). In practice, the ontology and the knowledge base are often published together, and hence these terms are often used interchangeably in the literature, referring to both the TBox and ABox. In this thesis we will use the term semantic resources to refer to both ontologies and knowledge bases.

Consider the following example:

Mary works for University of Sheffield, which is located in Sheffield. Sheffield is located in the United Kingdom. Mary lives in Sheffield.

As triples:

MARY <IS A> PERSON
UNIVERSITY OF SHEFFIELD <IS A> ORGANISATION
MARY <WORKS FOR> UNIVERSITY OF SHEFFIELD
SHEFFIELD <IS A> CITY
UNIVERSITY OF SHEFFIELD <IS LOCATED IN> SHEFFIELD
UNITED KINGDOM <IS A> COUNTRY
SHEFFIELD <IS LOCATED IN> UNITED KINGDOM
MARY <LIVES IN> SHEFFIELD

If these triples were written in a specific KR language such as OWL, accessing them to answer queries such as In which country does Mary live? would require knowledge of a formal query language such as SPARQL – Simple Protocol And RDF Query Language [Prud'hommeaux and Seaborne, 2008]. This poses significant difficulties for non-expert users, while experts need to be familiar with the existing ontology structure. The role of Natural Language Interfaces (NLIs) to conceptual models is, given a Natural Language question, to find the correct answer in the model. NLIs are more intuitive than alternatives such as formal query languages as they hide the complexities of both the formal languages and the knowledge structure. In this thesis, our main goal is to explore Natural Language Interfaces to Conceptual Models in order to improve the task of answering Natural Language questions by machines.
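
For illustration, the following is a minimal sketch of the SPARQL query that answering In which country does Mary live? would require over the triples above. The namespace and property names (ex:livesIn, ex:isLocatedIn, ex:Country) are hypothetical, invented here only to make the example concrete.

    PREFIX ex: <http://example.org/ontology#>

    # Find the country of the city in which Mary lives
    # (hypothetical namespace and property names, for illustration only).
    SELECT ?country
    WHERE {
      ex:Mary   ex:livesIn      ?city .      # MARY <LIVES IN> SHEFFIELD
      ?city     ex:isLocatedIn  ?country .   # SHEFFIELD <IS LOCATED IN> UNITED KINGDOM
      ?country  a               ex:Country .
    }

Even this two-hop query presupposes knowledge of the query syntax and of the exact property names used in the ontology, which is precisely the barrier that NLIs aim to remove.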


1.1 Motivation

The knowledge representation languages built on RDF and OWL are becoming increasingly popular. With billions of triples being published in recent years, such as those from Linked Open Data (LOD)1, there is a need for more user-friendly interfaces which will bring the advantages of the data to casual users. Research has been very active in developing interfaces for accessing structured knowledge, from faceted search, where knowledge is grouped and represented through taxonomies [Croft et al., 2009], to menu-guided and form-based interfaces such as those offered by the Knowledge and Information Management (KIM) platform [Popov et al., 2003]. While hiding the complexity of underlying query languages such as SPARQL or SeRQL (Sesame RDF Query Language) [Broekstra and Kampman, 2003], these interfaces still require that the user is familiar with the queried knowledge structure. However, casual users need to be able to access the data despite their queries not matching exactly the queried data structures [Hurtado et al., 2009].

According to the interface evaluation conducted in Kaufmann and Bernstein [2007], systems developed to support Natural Language Interfaces are perceived as the most acceptable by end-users. This conclusion is drawn from a usability study which compared four types of query language interfaces to knowledge bases and involved 48 users of general background. The full-sentence query option was significantly preferred to keywords. However, using keywords for querying was preferred to menu-guided or graphical query language interfaces.

On the other hand, evaluation of the CHESt [Linckels and Meinel, 2007] system (dealing with computer history and accepting both keywords and NL queries as input) revealed users' preference for keywords. When asked if they would accept typing full-blown questions instead of keyword-based queries, 22% of users answered positively, 69% said they would accept it only if this yielded better results, and 8% of users disliked this option.

Web users are used to typing primitive questions into the text box of a search engine. Search engines like Google are capable of answering simple questions

1http://linkeddata.org/

like what is the capital of Serbia. However, the power of Linked Data is in the capability to answer more complex queries for which the answer cannot be found through Google, as the question as such is not contained in any document, and requires retrieving various data from different resources, and then combining and possibly reasoning about them. Also, much data on the Web is accessible through the use of applications based on relational databases. According to Iskold [2008], semantic technologies are here to help us represent relational data spread over the entire Web: it is relational queries that semantic search engines would excel at. As it is concluded in Iskold [2008], the semantic web is going to help us resolve complex, inferencing queries asked over the entire Web as if it were a database. Expressing such complex queries requires using Natural Language (NL), as a set of keywords is not sufficient.
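
To make the contrast concrete, the sketch below shows the kind of relational query meant here, written against a DBpedia-style dataset; the property names are assumptions chosen for illustration and may differ in any particular Linked Data source.

    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>

    # Capitals (and populations) of the countries that border Serbia.
    # The answer has to be assembled from several resources rather than
    # found in a single document; property names are assumed for illustration.
    SELECT ?country ?capital ?population
    WHERE {
      ?country  dbo:borders  dbr:Serbia ;
                dbo:capital  ?capital .
      OPTIONAL { ?capital dbo:populationTotal ?population . }
    }
    ORDER BY DESC(?population)

A set of keywords cannot express the join and the ordering involved in such a request, which is why full natural language input becomes attractive.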

1.2 Challenges

Building NLIs to structured data requires handling challenges related to Natural Language understanding such as ambiguity and complexity (e.g. [Church and Patil, 1982]), see Figure 1.1. Ambiguity can be avoided through the use of Controlled Natural Language (CNL): a subset of the respective natural language that is specifically designed to serve as a documentation, specification or knowledge representation language [Fuchs et al., 2006]. A CNL typically includes a vocabulary, grammar rules and restrictions that have to be followed by end-users. In addition, when interpreting questions, CNLs use some predefined strict rules. One example is Purposefully Restricted English (PRE) [Epstein, 1985], a restricted English database query language, which solves ambiguity by following the rule that relative clauses modify the rightmost available heads. Hence, if the query is Find an employee who was hired by a recruiter whose salary is greater than $30000, the relative clause whose salary is greater than $30000 would always modify recruiter, not employee (see Epstein [1985]). The problem with a CNL is that it remains formal, and although more intuitive to casual users than languages like SPARQL, it still must be learned to be used efficiently. Another big challenge is related to the expressiveness of natural language:


Figure 1.1: Ambiguity

it is possible to express the same meaning using different constructions (see Figure 1.2). This increases the difficulty of automatically interpreting natural language [Cimiano et al., 2007], and it is very challenging to build a robust NLI. Robust in this context means being able to interpret in the same way several alternative natural language queries which have the same meaning but are expressed differently (for example, what is the largest city in Germany? vs. give me the largest city in Germany, etc.). In order to support all morphological inflections of words, NLIs usually operate on lemmas rather than on exact string matches. To handle synonyms, many systems use external sources such as WordNet [Fellbaum, 1998]. To support as many grammatical constructions as possible, NLIs often enumerate the question patterns in advance, and then detect the question category based on which the question is further interpreted. The question category is identified either by using manually constructed rules for automatic classification, or by using fully automatically constructed classifiers usually based on Machine Learning algorithms. A problem with the former approach is that it is time-consuming as the rules are hand-crafted. A problem with the latter approach is that the automatic classifiers must be trained using large datasets in order to work effectively.

Figure 1.2: Expressiveness
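
To make the robustness requirement concrete, the sketch below shows a single target query to which both phrasings of the example above (what is the largest city in Germany? and give me the largest city in Germany) should be mapped; the vocabulary (ex:City, ex:isCityOf, ex:population) is hypothetical and serves only to illustrate the idea.

    PREFIX ex: <http://example.org/geography#>

    # One formal interpretation shared by several natural language phrasings.
    SELECT ?city
    WHERE {
      ?city  a              ex:City ;
             ex:isCityOf    ex:Germany ;
             ex:population  ?population .
    }
    ORDER BY DESC(?population)
    LIMIT 1

A robust NLI should arrive at this interpretation regardless of which of the surface forms the user chooses.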

According to [Grosz et al., 1987], a major challenge when building NLIs is to provide the information the system needs to bridge the gap between the way the user thinks about the domain of discourse and the way the domain knowledge is structured for computer processing. This implies that in the context of NLIs to conceptual models, it is very important to consider the ontology structure and content. Two ontologies describing identical domains (e.g., music) can use different modeling conventions. For example, while one ontology can use a datatype property artistName of class Artist, the other one might use instances of a special class to model the artist's name2. A portable NLI system would have to support both types of conventions without sacrificing performance. Portable or transportable NLIs are those that can be adapted easily to new domains (or new ontologies covering the same domains). Although they are considered potentially much more useful than domain-specific systems, constructing transportable systems poses a number of technical and theoretical problems because many of the techniques developed for specialised systems preclude automatic adaptation to new domains [Grosz et al., 1987]. Moreover, it is noted that portability affects retrieval performance: "the more a system is tailored to a domain, the better its retrieval performance is" [Kaufmann and Bernstein, 2007, p.281]. In general, existing NLI systems tend to be either domain independent (i.e., portable) but with lower performance, or more domain-specific (i.e., portable only with prior customisation) but with much better performance. The caveat in the latter case is that customisation tends to be very expensive as it is performed by experts (e.g., domain experts, language engineers). Semantic resources can be constructed to include sufficient lexical information to support a domain-independent query analysis engine. However, due to the different processes used to generate ontologies and knowledge bases, the lexicon might be of varying quality. In addition, some words might have different meanings in two different domains. For example, How big might refer to height, but also to length, area, or population – depending on the question context, but also on the ontology structure. This kind of adjustment, i.e. mapping words or phrases to ontology concepts/relations, is performed during the customisation of NLIs. Finally, while NLIs are intuitive, having only one text box for a query can pose difficulties to users, who need to express their information need

2See for example how class Alias is used in the Proton System Module ontology: http://proton.semanticweb.org/

through a natural language query efficiently [Stojanovic, 2005b]. In order to address this problem, several methods have been developed with the aim either to assist the user in formulating the query, or to communicate the system's interpretation of the query to the user. However, a real challenge when building NLIs is to hide all complexities of the underlying knowledge from the casual user, and to provide them with either the answer or appropriate guidance on how to reformulate the query in order to get the answer. To summarise, the major challenges when building NLIs to conceptual models are:

• Ambiguity: unambiguous transformation from an NL query into a formal query.

• Robustness/Expressiveness: supporting query variations which have the same meaning although expressed using different constructions.

• Portability: being able to easily port an NLI system from one domain or ontology to another.

• Keeping the supported language intuitive.

• Hiding complexities of the queried knowledge structure: showing results without exposing users to the underlying complexities of the structured knowledge.

• Guiding the user through the process of formulating queries.

1.3 Contribution

This thesis reports work from a research programme on minimal-lexicon CNL and NLIs to structured data. This programme began in 2003 as part of the Semantic Knowledge Technologies project3, and was initially concerned with how to reduce the costs and lack of flexibility associated with the need to provide precise and extensive lexical data for each CNL system. NLI and CNL systems are increasingly relevant for information systems fronting rich structured data stores such as RDF and OWL repositories, largely because

3http://www.sekt-project.com/

of the complexity and syntactic unfamiliarity of the underlying triple models and the query languages built on top of them. The first result from the work was CLOnE, a Controlled Language for ONtology Editing [Tablan et al., 2006, Funk et al., 2007], which proved capable of expressing simple ontologies in an English-like language without a sophisticated lexicon (only a fairly small number of key terms and phrases stored in a gazetteer were required). The work reported in this thesis built on this concept and has resulted in two further systems, QuestIO and FREyA. These two systems, their design, implementation, comparison with related work, and quantitative performance evaluation form the core contribution of the thesis. QuestIO, a Question-based Interface to Ontologies [Damljanovic et al., 2008, Damljanovic and Bontcheva, 2008, Tablan et al., 2008], turns from the construction and editing of the knowledge store (as in CLOnE) to the querying of the data. As with the previous work, our motivation was to provide simpler interfaces (as noted above, the leading query language, SPARQL, is prohibitively complex for casual users) while avoiding the cost of producing and maintaining a separate sophisticated lexicon. In this case we are working in the context of existing semantic resources (authored in CLOnE or extracted automatically from text, or generated by other tools and processes). Our approach was therefore to use the terminology explicit in the ontology and the knowledge base, along with the structural relationships between concepts and the properties of concepts (and a CLOnE-like mechanism for analysing key terms and phrases such as "how many...?"). With QuestIO we have addressed the following challenges:

• Ambiguity: QuestIO resolves ambiguities automatically, based on a ranking derived from the ontology structure. The ranking combines similarity measures based on the ontology hierarchy with existing algorithms for string similarity.

• Portability: the domain lexicon is extracted automatically from the ontology and the knowledge base and no customisation is necessary. This approach is very similar to the existing approaches used in the NLIs developed at about the same time as QuestIO. The component implemented for this purpose is a gazetteer called OntoRoot Gazetteer, which became a GATE plugin in 2007 and has since been used extensively by GATE users. Its main application is semantic annotation based on the provided ontology/knowledge base.

• Expressiveness: the supported query language allows different ways of expressing the same meaning as long as the question terms exist in the domain lexicon. QuestIO is one of the first NLIs for the semantic web which supports relaxed queries (ill-formed or incomplete) as well as full-blown grammatically correct questions. This flexibility comes from the above-mentioned gazetteer. The other similar system with respect to expressiveness of the language is NLP-Reduce [Kaufmann et al., 2007], which was developed in Zurich at about the same time.

QuestIO is evaluated on two domains:

• General knowledge data from KIM [Popov et al., 2003], which contains facts about people, organizations, geographical locations and the like.

• Software engineering data created in the TAO project (Transitioning Applications to Ontologies4), together with questions from users in that project.

The results were both positive and negative. On the positive side, our performance was as good as or better than that of related systems (we performed a comparative evaluation with AquaLog [Lopez and Motta, 2004, Lopez et al., 2007], a mature query system from the Open University). On the negative side:

• Resolving ambiguities automatically relies heavily on the ontology structure and hence only results in good performance for small and manually crafted semantic resources, while for larger repositories the query execution time for some queries can be intolerable.

• The expressiveness of the supported query language (language coverage) was sufficiently limited as to regularly cause problems for users:

4www.tao-project.eu


the vocabulary used by the users often differed from the lexicon derived from the semantic resources.

QuestIO and its evaluation form the first half of our work.

The second half of the work was motivated by the problems just referred to, and also by the challenges which arose when using this technology to query resources from the Linked Open Data initiative. It became clear that the significance of our work would be increased if we could develop mechanisms appropriate for working with large linked data as well as with manually crafted data. We investigated whether user interaction coupled with deeper syntactic analysis and usability methods such as feedback, extending the vocabulary, and query refinement can be used in combination to improve the usability of NLIs to conceptual models. These form the basis of the FREyA system: Feedback, Refinement and Extended VocabularY Aggregation [Damljanovic et al., 2009, 2010a,b, Damljanovic, 2010]. Work on FREyA is reported in two parts:

• In the first part we further address the problem of ambiguity by combining automatic ranking (as in QuestIO) with the user's selections. Our approach of resolving ambiguity by involving the user in a dialog is very similar to the ones used in AquaLog [Lopez and Motta, 2004, Lopez et al., 2007] and Querix [Kaufmann et al., 2006], with the difference lying in the underlying ranking and automatic disambiguation mechanisms which precede the dialog. However, the approach of using the dialog to show feedback to the user is novel and has not been researched extensively. We explore the effect of feedback by showing the user the list of system interpretations of the query and the context from which the answer is derived. Unlike QuestIO, which is fully automatic and does not give the user the opportunity to validate the machine interpretation, in FREyA the user can choose alternatives if the one selected by the system does not seem valid. This approach has been evaluated in a task-based user-centric evaluation which specifically assessed whether the new usability features of FREyA had any effect in comparison to QuestIO. The evaluation was conducted using two domains:

– software engineering: the domain used in the evaluation of QuestIO; the data were generated in the TAO project mentioned above
– US geography: the Mooney GeoQuery dataset5, which has been extensively used for evaluating NLIs to databases and more recently NLIs to ontologies/knowledge bases

• The second part is concerned with:

– improved ambiguity resolution: disambiguation is moved from the query interpretation level to the concept interpretation level: instead of trying to automatically interpret the whole question at once, we interpret each concept individually, and engage the user in a dialog only if necessary. This way, we reduce the cognitive overhead for the user, while at the same time each query is interpreted in one unambiguous way where the user is in control.

– a more expressive query language: our approach enriches its own lexicon (generated from RDF data, and extended by WordNet [Fellbaum, 1998]) from the user's language. The lexicon enrichment is powered by a light learning model, which is designed in a way that can be reused by other NLI systems. Our approach to extending the vocabulary is more generic than existing approaches which focus on mapping question terms to ontology relations; examples include AquaLog and ORAKEL [Cimiano et al., 2007].

– deeper grammar analysis: while QuestIO uses very shallow NLP, FREyA uses the parsed syntax tree in combination with the ontology-based lookup in order to interpret NL questions. We implemented a novel consolidation algorithm which attempts to automatically merge the results of the two processes.

– learning from the users: our learning algorithm is a novel approach to using the ontology as the context for improving the system over time and learning to map query terms to ontology concepts and relations. Other similar approaches exist but they

5The original dataset is available from http://www.cs.utexas.edu/users/ml/nldata.html. The dataset used in this thesis is available from http://www.ifi.uzh.ch/ddis/research/talking-to-the-semantic-web/owl-test-data/.


model the context differently and also focus on personalising the vocabulary (e.g., AquaLog).

– ranking algorithms: each dialog consists of a term which needs to be resolved and a list of suggestions for the user to choose from. The implemented ranking is used to reduce the cognitive overhead and combines existing string similarity algorithms with synonym detection based on WordNet.

– the dialog sequence: answering a question correctly might require more than one dialog to be modelled. Selecting the order in which to model dialogs can significantly affect results. We implemented a novel algorithm for deciding the order in which concepts will be disambiguated or mapped to an ontology concept/relation.

– the scope: QuestIO was limited to using one RDF document at a time. In FREyA we extend the scope by making it flexible in terms of the number of ontologies that can be queried. Indeed, it is possible to connect to a remote repository using FREyA (e.g., using a SPARQL endpoint) as well as to load a set of RDF documents (ontologies and knowledge bases) locally into its internal repository. Most of the existing NLI systems were evaluated using one domain at a time, with the exception of the PowerAqua [Lopez et al., 2009b] system, which evolved from AquaLog. PowerAqua aims to serve as a Question-Answering system for the Semantic Web and was evaluated in the open-domain scenario [Lopez et al., 2011] (e.g. through querying the semantic resources indexed by Watson [d'Aquin et al., 2007]).

– showing feedback to the user: the concise answer is derived based on a novel algorithm for identification of the answer type which does not require strict adherence to syntax. Hence, although the user does not have to enter a grammatically correct question, the ontology structure in combination with the grammar analysis is used to correctly identify the answer type. In addition, the user is presented with all concepts and relations that are used before the concise answer is derived.

FREyA was evaluated with the Mooney GeoQuery dataset for the sake of comparison to other similar systems such as PANTO [Wang et al., 2007], NLP-Reduce [Kaufmann et al., 2007] and Querix [Kaufmann et al., 2006]. FREyA outperformed all other similar systems (although it required dialog with the user). In conclusion, our key findings are that:

a) combining syntactic parsing with ontology-based lookup in an interactive process of feedback and query refinement can increase the precision and recall of NLIs to ontologies/knowledge bases, while

b) reducing porting and customisation time by shifting some tasks from application developers to end-users

It is important to distinguish the usability of Natural Language Interfaces from the point of view of application developers who are customising the system, and from that of end-users who are querying it. While addressing ambiguity and expressiveness makes the NLI system more usable for end-users, the portability issue is tightly coupled with usability from the application developer's point of view. The less time the application developers spend customising the system, the more usable it becomes from their point of view. Our proposed methods attempt to strike a balance between heavy customisation and the end user's need to explore the available knowledge without being constrained by the query language.

The rest of the thesis is structured as follows:

Part I introduces Natural Language Interfaces to Conceptual Models and contextualises our work in relation to that of others in this and related fields:

• Chapter 2 briefly introduces conceptual models and the interfaces which are used for browsing them.

• Chapter 3 reviews the history of Natural Language Interfaces including NLIs to databases, open-domain Question-Answering systems, interactive NLIs, and NLIs to ontologies.


• Chapter 4 outlines evaluation strategies for Natural Language Interfaces, making clear the distinction between the point of view of developers customising NLI systems, and users who are querying them.

Part II details and analyses the approaches which have been applied by previously existing NLIs:

• Chapter 5 reviews Natural Language Interfaces to ontologies (with special attention to their customisation).

• Chapter 6 reviews and classifies existing methods for increasing the usability of NLIs from the end-users' point of view.

Part III describes our approach to Natural Language Interfaces to ontologies:

• Chapter 7 details the design and evaluation of QuestIO.

• Chapter 8 reports our initial FREyA design, the implementation of feedback, and its evaluation with users.

• Chapter 9 reports the final FREyA design, the evaluation of its subcomponents, and of the system as a whole.

Part IV concludes with a summary of the contribution of the thesis (Chapter 10) and plans for future work (Chapter 11) in our research programme.

Chapter 2

Conceptual Models

2.1 What are Conceptual Models?

The inventor of the World Wide Web (WWW), Tim Berners-Lee, proposed a new generation of the Web [Berners-Lee, 1999], called the Semantic Web, where data has well-defined meanings expressed in a form that can be easily interpreted by both computers and people. The idea is to have data on the Web defined and linked in such a way that it can be used for more effective discovery, automation, integration, and reuse across various applications [Guha et al., 2003]. As envisaged by Guha et al. [2003], the Semantic Web will contain resources corresponding not just to media entities (such as Web pages, images, audio clips, etc.) as the current Web does, but also to objects such as people, places, organisations and events. Furthermore, the Semantic Web will define structured relations, not just hyperlinks, among the different types of resources mentioned above. Each resource can have metadata attached to it. The metadata is usually described using a special language which is capable of expressing concepts and the different relations between them. These together form conceptual models or an ontology, which is defined as "an explicit specification of a conceptualisation" [Gruber, 1993], where a conceptualisation is "an abstract, simplified view of the world that we wish to represent for some purpose". In other words, an ontology formally describes a domain of discourse by defining concepts and how they relate to each other. For example, in the domain of tourism, a set of concepts which are frequently used are: tourists, destinations, and events, while one of the relations which describes a connection between a tourist and a destination is interestedIn, so that, using formal expressions, it can be written as

TOURIST <interestedIn> DESTINATION

This is an example of the basic element of an ontology, the triple, which has the form

SUBJECT <PREDICATE> OBJECT
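
As a minimal sketch of how such a triple is queried in practice, the SPARQL below asks for the destinations a particular tourist is interested in; all URIs are hypothetical and are used only to illustrate the pattern just described.

    PREFIX tour: <http://example.org/tourism#>

    # Which destinations is the tourist tour:someTourist interested in?
    # (hypothetical URIs, mirroring the TOURIST <interestedIn> DESTINATION triple)
    SELECT ?destination
    WHERE {
      tour:someTourist  tour:interestedIn  ?destination .
      ?destination      a                  tour:Destination .
    }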

Creating instances of ontology classes and connecting them using ontology relations (predicates) leads to generating a knowledge base – as previously discussed in Chapter 1, ontologies and knowledge bases are usually published together and hence both terms are used interchangeably in the literature to refer to both the schema and the data. The W3C recommends OWL, the Web Ontology Language, as a semantic markup language for publishing and sharing ontologies on the World Wide Web1. Early knowledge modelling languages include F-logic [Kifer et al., 1995], OCML [Motta, 1999], DAML+OIL [Horrocks, 2002] and others. Ontologies facilitate semantic search [Davies et al., 2002]. Semantic search is an application of the Semantic Web to search, and it attempts to augment and improve traditional search results (based on Information Retrieval technology) by using data with explicit semantics from the Semantic Web [Guha et al., 2003]. Starting with the initiative of the Linked Open Data2 project, the term Linked Data gained in popularity. According to the definition from the Web site of the Linked Open Data project, the term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. The Linked Data project can be seen as a simplification of the initially proposed

1http://www.w3.org/TR/owl-ref/
2http://linkeddata.org/

idea of the Semantic Web. Linked Data has been described by Tim Berners-Lee (who has also been credited with coining the term) as "the Semantic Web done right".

2.2 Browsing Conceptual Models

One of the most popular tools which, among other features, allows browsing ontologies and knowledge bases is Protégé3. For querying, users can use a template-based form where they specify the parts of a triple they are interested in, and the missing parts will be given as a result, if found in the semantic repository. Another way to query an ontology with Protégé is by writing SPARQL queries: results are given in the form of triples. This platform is very useful for experts who are familiar with query languages, although they also have to be experienced Protégé users. In comparison to Protégé, KIM [Popov et al., 2003] goes one step further in simplifying the browsing process – it provides predefined query templates, where users can construct SeRQL queries using a form-based interface. Consequently, users are either restricted in what they can search for, or they need to be familiar with the underlying ontology structure.

At almost the same time as KIM, the TAP system was developed [Guha et al., 2003]. The idea of TAP is to enable browsing and searching for specific semantic resources. Two graphical interfaces for querying ontologies have as a starting point a node which is described by a URI, and they return a graph describing the given URI. The third interface of TAP is called Search – it takes a string as an input and returns all resources whose title properties contain the string. The title property is specific to the TAP knowledge base. A more widespread approach nowadays is to use the property rdfs:label for the same purpose.

The TAP interfaces were tested with RDF files maintained by the W3C. In addition, for larger applications dealing with musicians, athletes, places, and so forth, HTML scrapers are built to get the data from popular sites such as Amazon or AllMusic. That is, their Web crawler dynamically locates

3http://protege.stanford.edu/

and converts relevant pages into machine-readable data so that they become available for search by the TAP interfaces. The dynamically created knowledge base contains many millions of triples. While trying to improve traditional search by using a denotation of the search term, Guha et al. [2003] encounter the following problems:

Denotation: determining the concept denoted by the search query is not straightforward. The biggest problem is ambiguity, which is solved by preferring some denotations driven by a few heuristic rules, for example: popularity of the term (Paris as the capital of France preferred to Paris in Texas), the user profile, and the search context. Another problem is related to what is called a complex search term: the subsets of the search term map to different nodes. For example, to cite an example from [Guha et al., 2003], the query "eric miller rdf" can be broken down into "eric miller"+"rdf". The first maps to the node corresponding to the person Eric Miller and the second to the Resource Description Framework. Due to the complexity which arises from having several terms together in a query, complex terms are restricted to two denotations only.

What to show: which data to pull from the semantic web and in which order to present them. The node that is the selected denotation of the search term provides a starting point. The next problem is which subgraph around this node to show. A more balanced subgraph is produced by using heuristic rules based on the average branching factor (i.e. bushiness) of the graph around the anchor node. In the GetData interface, for example, there is a possibility to customise the system by specifying which properties should be shown with each resource. These properties are then presented first. If nothing is specified, TAP shows all available properties. The problem with this approach is that it is very difficult to know the names of properties in advance – the user must be very familiar with the structure and available knowledge in order to perform this customisation. However, when a system is not customised and the queried knowledge is large, this approach might confuse the user rather than help find what is being searched for.


Presentation: the main problem is how to present the resulting data/triples.

Seven years later, these problems remain big challenges for semantic search and browse interfaces, as we will discuss later in this thesis. Probably due to its visual similarity to common search engines, the TAP Search interface has received a lot of attention. The main goal with this interface was to make search engines capable of interpreting different occurrences of the same input string as different semantic concepts. In the Search interface, this problem is solved by asking the user to choose between available options. Four types of question are supported by TAP:

1. Entity search e.g. Johnny Depp

2. Comparison of entities e.g. buildings taller than the Tower Bridge

3. Attributes of entities e.g. birthday of Johnny Depp

4. Group queries e.g. birthday of Johnny Depp and Nicole Kidman or countries with population greater than 100 million

These four types of questions are handled by 37 patterns which contain the rules for how to handle and answer them. If the input query/question is not recognised as belonging to one of these 37 patterns, no answer is returned. The user-centric evaluation of the TAP Search interface revealed how dominant the influence of search engines on casual users is. According to the evaluation presented in da Costa et al. [2005], it appears that users expect a semantic search interface to be similar to that of a search engine. Namely, the first prototype of the TAP Search interface rendered the results on the right side of the page, and information about ontology entities (the position of the queried entity inside the ontology) on the left. After the evaluation, this was changed as it was disliked by users, who had "learnt" to ignore the right side, as that is the place for advertisements. TAP changed the interface so that the results are shown underneath the query, in a similar way to popular search engines like Google or Yahoo.


With regard to the previously mentioned form-based interfaces, they are convenient for repetitive searches but not for ad hoc queries [Tran et al., 2010]. Faceted search is very similar to form-based search, with the difference that facets are generated dynamically based on the user’s query, and are not predefined as in the case of forms. It has been argued that faceted search browsers are extremely helpful in cases when the user’s information need is vague [Mäkelä, 2006]. An example of faceted search is displayed in Figure 2.1, which is a screenshot of the MuseumFinland portal4, which uses this kind of interface for searching data about three museums in Finland. Even without any intention to search, the user can browse the available categories and explore the knowledge step by step. Different approaches can also be combined, such as in Wang et al. [2009] where the authors introduce a hybrid query which combines a keyword query with a precise structured query.

Figure 2.1: The initial search page from the MuseumFinland portal

Ontotext5 developed several interfaces for exploring a part of Linked Data, including RDF Search. RDF Search is powered by an auto-complete option (see Figure 2.2), the implementation of which is based on Lucene6, used to index the underlying knowledge. As the user types a string or a URI, suggestions generated from the indexed knowledge are offered. These suggestions are generated not only by considering

4http://www.museosuomi.fi
5www.ontotext.com
6http://lucene.apache.org

available URIs and local names, but also the labels and literals used to describe the nodes.
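The following is a simplified stand-in for the idea behind such an auto-complete index. It is not Ontotext's Lucene-based implementation; an in-memory Python index replaces Lucene, and the triple format and function names are assumptions made for the example. What it preserves is the key point above: suggestions are drawn from URIs, local names, labels, and literals.

# A simplified stand-in (not the actual Lucene-based implementation) for
# RDF Search's auto-complete: suggestion keys include URIs, local names,
# labels, and literal values describing each node.

def build_suggestion_index(triples):
    """triples: iterable of (subject_uri, predicate, object) tuples."""
    index = {}  # maps a lower-cased key to the set of matching subject URIs
    for subject, predicate, obj in triples:
        local_name = subject.rsplit("/", 1)[-1].rsplit("#", 1)[-1]
        keys = {subject, local_name}
        if predicate.endswith("label") or isinstance(obj, str):
            keys.add(str(obj))
        for key in keys:
            index.setdefault(key.lower(), set()).add(subject)
    return index

def suggest(index, prefix, limit=10):
    prefix = prefix.lower()
    hits = {uri for key, uris in index.items() if key.startswith(prefix) for uri in uris}
    return sorted(hits)[:limit]

triples = [
    ("http://example.org/Paris", "http://www.w3.org/2000/01/rdf-schema#label", "Paris"),
    ("http://example.org/Paris", "http://example.org/comment", "Capital of France"),
]
print(suggest(build_suggestion_index(triples), "par"))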

Figure 2.2: RDF Search by FactForge (www.factforge.net)

The most powerful way to query OWL/RDF ontologies is still SPARQL; however, it remains complex and time-consuming even for experts. Franz Inc.7 developed a graphical interface (Gruff, the AllegroGraph triple-store browser) where the user can drag-and-drop nodes and combine them in order to generate SPARQL. A sample graph displaying parts of an ontology from the geography domain is shown in Figure 2.3. While a graph-like structure appears to be the most natural way to display RDF graphs, displaying large amounts of data remains challenging – for example, showing all mountains in Figure 2.3. A user unfamiliar with graphs might be confused by seeing two or more nodes with the name Mountain and wonder whether there is any difference between them. Designers of user interfaces must make a presentation choice which is in line with the expectations of their users. Many systems support semantic search over documents, including the above-mentioned TAP and KIM, the SHOE Search tool [Heflin and Hendler, 2000], the DOPE Browser [Stuckenschmidt et al., 2004] (see Figure 2.4), SemSearch [Lei et al., 2006], and a system developed by Ding et al. [2006].

7http://www.franz.com/


Figure 2.3: The Gruff interface showing the Mountain concept from the geography domain ontology

While a considerable amount of work has been done in this area, our focus is on browsing conceptual models, and thus interfaces for semantic search over documents will not be discussed further in this thesis.

Summary

Many interfaces have been developed for browsing and searching RDF spaces. They range from form-based and faceted searches to graphical interfaces, as well as keyword-based and Natural Language Interfaces. While graphical interfaces put some limitations on the users in terms of what can be queried, free-text searches and Natural Language Interfaces seem more intuitive and less constrained.


Figure 2.4: The DOPE Browser: searching for documents related to ‘aspirin’


Chapter 3

Natural Language Interfaces: a Brief Overview

Research in the area of Natural Language Interfaces (NLIs) has been around for more than four decades. From the end-user’s point of view, natural language is easy to use as it is used every day in human-to-human communication, and is therefore considered a useful and efficient way for people to interact with computers [Ogden and Bernick, 1997]. NLI systems take Natural Language questions as input and are built for various purposes. Most of them are concerned with the knowledge access problem; among these, they further differ in terms of the underlying knowledge structure, and can therefore be grouped into three main categories:

NLIs to structured data. NLIs to structured data allow users to interact with a system using written or spoken language (e.g. English) to perform tasks that usually require knowledge of a formal query language. The intention behind building NLIs to structured data is to enable users with no knowledge of formal languages to use them with minimal (ideally no) training. These systems are often domain-specific, and are usually referred to as closed-domain question answering systems. Two major subgroups include:

• NLIs to relational databases (NLIDBs) translate Natural Language into SQL in order to retrieve answers from the database.


Most of the developed NLIs to structured data belong to this group (e.g., [Popescu et al., 2003], [Thompson et al., 2005], [Hallett et al., 2007], to mention a few recent ones). Recently, these evolved towards interfaces to semantically richer data in the form of ontologies.

• NLIs to ontologies translate a Natural Language query into a formal query language which is used to retrieve the knowledge expressed in one of the knowledge representation languages (such as OWL). The most common query language is SPARQL (a minimal sketch of such a translation is given after this overview). Recently developed systems include ORAKEL [Cimiano et al., 2007], AquaLog [Lopez et al., 2007] and PowerAqua [Lopez et al., 2009b], PANTO [Wang et al., 2007], and Querix [Kaufmann et al., 2006].

NLIs to unstructured or semi-structured data differ from the previous group in that they do not translate the Natural Language query into any formal language but rather process a collection of documents (e.g. news articles on the Web, or Frequently Asked Questions as in Burke et al. [1996]). However, similar to the previously mentioned NLIs, the aim of these systems is also to find the answer to the question posed by the user. The most prominent systems of this kind are open-domain Question-Answering systems which process large collections of documents in order to find answers. Examples include MURAX [Kupiec, 1993], MULDER [Kwok et al., 2001], and AnswerBus [Zheng, 2002]. Another group which belongs here are Reading Comprehension systems such as Deep Read [Hirschman et al., 1999], which are used to test the reading level of children. They find the answer to a set of questions related to a story which is written in simple Natural Language.

Interactive NLIs are dialogue systems [Cimiano et al., 2007], e.g., a chat bot called Asimov which answers simple questions in English (http://asimovsoftware.com). These kinds of systems do not consider a set of questions as an independent collection, but rather act as agents or robots which are involved in a conversation with the user, with the capability to remember the sequence of previously asked questions, to interpret the input from the user, and


to learn the answers to questions which they could not answer before. These are more challenging to develop in comparison to the previously mentioned NLIs, due to the requirement to model multiple conversational turns. These turns can refer to one another, and such systems must have the ability to remember and respond to all of this context.

Lastly, a few NLI systems have been developed for purposes other than knowledge access, such as systems to replace a programming language, e.g., the NLC system [Alan W. Biermann and Sigmon, 1983].
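As referenced above, the following is a minimal illustration of the kind of translation an NLI to ontologies performs: a Natural Language question is mapped to a SPARQL query over an RDF graph. It is not taken from any of the systems mentioned; the example data, the namespace, and the hand-written query are assumptions, and the rdflib library is used only as a convenient way to run the query.

# A hedged, minimal illustration of NL-to-SPARQL translation over RDF data.
# Assumes the rdflib library is installed; the data and query are invented.

from rdflib import Graph

DATA = """
@prefix ex: <http://example.org/> .
ex:Berlin  a ex:City ; ex:locatedIn ex:Germany .
ex:Hamburg a ex:City ; ex:locatedIn ex:Germany .
"""

# Question: "Which cities are located in Germany?"
SPARQL = """
PREFIX ex: <http://example.org/>
SELECT ?city WHERE {
    ?city a ex:City ;
          ex:locatedIn ex:Germany .
}
"""

graph = Graph()
graph.parse(data=DATA, format="turtle")
for row in graph.query(SPARQL):
    print(row.city)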

In order to put the work on NLIs to ontologies in the context of similar systems, we give an overview of NLIDBs (Section 3.1), Question-Answering systems (Section 3.2), and interactive NLI systems (Section 3.3). We end this chapter with a discussion of how these systems can benefit from each other, and how NLIs to ontologies can be used to boost the performance of other similar systems (Section 3.4).

3.1 Natural Language Interfaces to Relational Databases

The first NLIs to relational databases were developed in the late 1960s and early 1970s; among the most popular was LUNAR [Woods et al., 1972]. LUNAR was built on top of a database about the chemical analysis of moon rocks. Soon after, several other systems were developed, such as the dialogue-based RENDEZVOUS [Codd, 1974], which is capable of generating multiple-choice clarification dialogs when it fails to parse the question, and LADDER [Hendrix et al., 1978], which was targeted at large and distributed databases. An impressive feature of LADDER was its use of semantic grammars, similar to PLANES [Waltz, 1975, 1978], which answers questions related to airplane maintenance and flight histories. However, this feature had a trade-off: the requirement to develop a new grammar for each new application domain. PLANES was based on the principle that the input should be non-restrictive for the user (for example, supporting ellipsis – omission of one or more words that can be understood in context), and it also used the dialogue-based features developed by RENDEZVOUS.


In the 1980s, NLIDBs continued to be a popular research topic, with the main focus on portability – at this time many systems were developed, such as:

• CHAT-80¹, which translated a limited subset of NL queries into Prolog queries evaluated against an internal database [Warren and Pereira, 1982].

• TEAM [Grosz et al., 1987], which translated NL queries into the SODA query language.

• PARLANCE [Bates, 1989a], which can be configured by hand or by using a component called Learner to generate domain-specific configurations.

According to Grosz et al. [1987], “One of the main functions of the NLI is to make the necessary transformations and thus to insulate the user from the particularities of the database”. This turned out to be a very hard task, due to the many different designs and ways the data can be encoded in a specific structure such as a database schema. Although several systems proved to perform well, especially in particular application domains, the uptake in industry was very slow [Androutsopoulos et al., 1995] – NLIs have not become a standard option for users of DBMSs, although several commercial options have appeared. One example is INTELLECT [Harris, 1984], which is capable of translating an NL query into SQL. An example used to motivate the usage of INTELLECT in commercial applications is illustrated in Figure 3.1. As described in Kho [2008], among the first users of INTELLECT were employees of Hartford Hospital – one of the largest teaching hospitals and tertiary care centres in New England. Because the data about patients, doctors and procedures were stored in databases closely controlled by the IT department, it was not possible for domain experts to perform ad hoc queries that could answer critical questions, e.g. about identifying new cases of hospital-acquired infections. Therefore, in order to enable easy querying using Natural Language, they deployed INTELLECT

1The code of this system is available from http://nltk.googlecode.com/svn/trunk/doc/howto/chat80.html


Figure 3.1: A sample session with INTELLECT, the example taken from [Harris, 1984][p. 45]

– a product from the Artificial Intelligence (AI) Corporation, which was founded by Larry Harris, Ph.D., who sold it and founded EasyAsk about ten years later. The company was acquired by Progress Software Corp. in 2005 and now offers solutions for e-commerce sites, including EasyAsk for the Enterprise, which gives natural-language based access to data and content. At Hartford Hospital, EasyAsk was first deployed in the payroll and microbiology departments, where users could get answers to questions such as how much vacation time was accrued in a particular department, or on what date a particular employee began working (for purposes of employment verification), or could access hospital-acquired infection records and other patient information through Natural Language queries. EasyAsk is used by many other customers; the full list is available on their website: http://www.easyask.com/customers/index.htm.


In the 1990s, the field of NLIDBs focused on learning approaches, with the work of Mooney and colleagues being dominant. Mooney researched machine learning methods in order to answer the question of whether semantic grammars can be automatically generated from available examples in a specific domain [Zelle and Mooney, 1993]. Semantic grammars have been successfully applied in NLIDBs; however, as previously discussed, each new domain required newly written grammars – the size of the grammar needed by general applications can make manual construction infeasible. Also, according to Mooney [1999], in addition to studying syntactic parsing extensively, researchers should focus on understanding the logical representation of sentence meaning. This logical representation is usually strongly related to the underlying structure of the knowledge (e.g. the structure of the relational database or the ontology).

The early work of Mooney focused on applying Inductive Logic Programming [Zelle and Mooney, 1993, 1996], which was tested within a system called CHILL. The input to CHILL is a set of questions paired with their respective parses. This set is used to train parsers that map NL database queries into an executable logical form. According to Mooney [1999], there is a growing trend in computational linguistics to focus on shallow but broad-coverage NL tasks. Logic-based learning can be used to develop narrower, domain-specific systems that perform deep processing. Although this learning approach is applied to NLIDBs, the query language into which the natural language queries are transformed is a logical form, rather than SQL. This is because the logical form is more straightforward to map from Natural Language and, in addition, translating from an unambiguous logical form into query languages such as SQL can be easily automated [Zelle and Mooney, 1996]. In [Zelle and Mooney, 1996], the authors describe experiments in the domain of United States geography. This was motivated by the availability of the system called Geobase, which was included in the commercial Prolog for PCs (Turbo Prolog 2.0, Borland International 1988), and was an NLI for a simple geography database. The Geobase data covered information about the United States: population, area, capital cities, states, rivers, the highest and the lowest points and their elevations. A sample question and its query representation in Prolog look like the following:


NL: What are the major cities in Kansas?
answer(A,(major(A),city(A),loc(A,B),const(B,stateid(kansas))))

The system they developed is still available as an online demo at http://userweb.cs.utexas.edu/users/ml/geo-demo.html. Other approaches to learning semantic parsers include the application of machine learning for learning from ambiguous data [Kate and Mooney, 2007], and the application of statistical methods such as in Miller et al. [1996] and Wong and Mooney [2006]. The Mooney GeoQuery database was used for the evaluation of many systems, including PRECISE [Popescu et al., 2003], which focused on portability. A demo of PRECISE is available from http://www.cs.washington.edu/research/nli/. The lexicon in PRECISE is generated by automatically extracting value, attribute, and relation names from the database. The authors mention that they manually augmented the lexicon with relevant synonyms, prepositions, etc. [Popescu et al., 2004]. The work in Popescu et al. [2003] focuses on precision, highlighting that PRECISE can distinguish between the questions which it can understand and those it cannot. In a comparative evaluation with the Inductive Logic Programming (ILP) approach of Tang and Mooney [2001] and also with Microsoft’s English Query (EQ), using the Mooney GeoQuery, Job and Restaurant datasets, PRECISE made no errors, thus outperforming the ILP approach, while both systems were significantly better than Microsoft’s EQ. Precision was also a concern for the NLIDB presented in Hallett [2006], which assists users by offering auto-complete options while they are entering the text. Users are guided through the available kinds of questions that can be handled by the system. In a way, users are limited because questions are chosen from a finite set of inferred queries. Their tool automatically infers the set of possible queries that can apply to a given database. On the other hand, the idea of helping users with available options for a query appears to be promising, especially in cases when the user is not familiar with the domain knowledge.

Finally, recent years have seen online systems such as Wolfram Alpha (http://www.wolframalpha.com/) answering factual queries directly by computing the answer from the structured data, rather than providing a list of documents or web pages that might contain the answer, as a search engine would.

Another system of this kind is Powerset (http://www.powerset.com/). On May 11, 2008, the company unveiled a tool for searching a fixed subset of Wikipedia using conversational phrases rather than keywords. On July 1, 2008, it was purchased by Microsoft. Another recent online service of this kind is TrueKnowledge (http://www.trueknowledge.com), which is powered by semantic technologies. Answers are found in two main sources: information that has been imported initially, and facts added by users.

With regard to the architecture of NLIDBs, there are several kinds of systems:

• Pattern-matching systems are simple to design but prone to return incomplete answers if they recognise certain patterns but not whole sentences. For instance, for What is the capital of France?, if the system does not have a pattern for capital followed by Country, but has a pattern for capital only, it might list all capitals it has in its knowledge base.

• Syntax-based systems map the parse tree of the question into the formal language directly.

• Semantic grammar systems also map the parse tree into the formal language, but the non-leaf nodes refer to semantic rather than syntactic concepts. The advantage of such systems is that they can achieve high performance, but the downside is that they are not easily portable.

• Intermediate representation language systems translate an NL query into an intermediate logical query, expressed in some internal meaning representation language and independent of the database structure. This logical query is then translated into the formal query language. In such systems, the syntax rules which link non-leaf nodes in the parse tree to the semantic rules are usually domain-independent, while the leaf nodes and the logic expressions corresponding to them are domain-dependent and are found in the lexicon. (A minimal sketch of this kind of pipeline is given after this list.)
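The sketch below illustrates the intermediate-representation approach from the last item under stated assumptions: it is not taken from any specific system, the lexicon entries, table names and logical-form shape are invented, and the "parsing" is reduced to keyword lookup so that the two translation steps stand out.

# A minimal, illustrative sketch of the intermediate-representation approach:
# an NL question is first mapped to a domain-independent logical form, which is
# then translated into SQL using a small domain-dependent lexicon.

LEXICON = {
    "capital": ("attribute", "capital"),
    "france":  ("entity", "country", "France"),
}

def to_logical_form(question):
    """'What is the capital of France?' -> ('ask', 'capital', ('country', 'France'))"""
    tokens = [t.strip("?").lower() for t in question.split()]
    attribute = next(LEXICON[t][1] for t in tokens if t in LEXICON and LEXICON[t][0] == "attribute")
    _, table, value = next(LEXICON[t] for t in tokens if t in LEXICON and LEXICON[t][0] == "entity")
    return ("ask", attribute, (table, value))

def to_sql(logical_form):
    _, attribute, (table, value) = logical_form
    return f"SELECT {attribute} FROM {table} WHERE name = '{value}'"

lf = to_logical_form("What is the capital of France?")
print(lf)
print(to_sql(lf))   # SELECT capital FROM country WHERE name = 'France'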


3.2 Open-domain Question-Answering Systems

The study of Question-Answering (QA) systems goes back as far as 1965, when Simmons [1965] reviewed the existing question-answering systems for English, which had been developed in the period between 1960 and 1965 [Greenwood, 2006][p.11]. After almost fifty years, the problem of how to automatically answer questions, similar to how a human would do it, remains challenging. While the early question-answering systems extracted answers from a database by computing them (e.g., LUNAR and others mentioned in Section 3.1), open-domain question-answering systems came later and find the answer in free text (a set of documents) [Webb and Webber, 2009].

One of the first such systems was MURAX [Kupiec, 1993], which was capable of answering closed-class questions. A closed-class question is a direct question whose answer is assumed to lie in a set of objects and is expressible as a noun phrase, e.g. Who’s won the most Oscars for costume design?. The answers were found in Grolier’s on-line encyclopaedia2. MURAX was evaluated with 70 questions, which were factoid questions starting with What or Who. The correctness was 53%.

With the emergence of the WWW, the popularity of open-domain QA systems has increased. Unlike Information Retrieval systems, such as Google, which return a list of documents relevant to the user’s query, QA systems search for the answer to the user’s question (which might be located within the collection of relevant documents).

However, the majority of QA systems do rely on Information Retrieval components. Question-Answering systems usually consist of three modules: a question processing module, a document processing module, and an answer extraction module. The task of returning a concise answer from a set of documents is different from that of Information Retrieval or Information Extraction, but requires a combination of the two and depends on both [Strzalkowski and Harabagiu, 2006]. In the question processing module, the question is translated into a set of keywords which are then passed on to an Information Retrieval engine to retrieve relevant documents (those that

2http://teacher.scholastic.com/products/grolier/index.htm

may contain the answer). The document processing module then identifies the passages in these relevant documents in which the answer is most likely to be found. Finally, the answer extraction module extracts the snippet which represents the answer to the posed question.
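As a schematic sketch of the three-module pipeline just described, the code below strings together deliberately simplistic question processing, document processing, and answer extraction steps. It does not correspond to any particular QA system; the stopword list, ranking and extraction heuristics are assumptions made for the example.

# A schematic, toy pipeline: question processing -> document processing ->
# answer extraction, as described above.

STOPWORDS = {"what", "is", "the", "of", "who", "where", "a", "in"}

def process_question(question):
    """Turn the question into a set of keywords for the retrieval engine."""
    return {w.strip("?").lower() for w in question.split()} - STOPWORDS

def retrieve_documents(keywords, documents):
    """Rank documents by keyword overlap (a stand-in for an IR engine)."""
    return sorted(documents, key=lambda d: len(keywords & set(d.lower().split())), reverse=True)

def extract_answer(keywords, passages):
    """Return the sentence with the highest keyword overlap as the answer snippet."""
    sentences = [s.strip() for p in passages for s in p.split(".") if s.strip()]
    return max(sentences, key=lambda s: len(keywords & set(s.lower().split())))

documents = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany.",
]
keywords = process_question("What is the capital of France?")
print(extract_answer(keywords, retrieve_documents(keywords, documents)[:1]))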

Most QA systems contain a classifier module which detects the question category, based on which each question is assigned an answer type. This classification often relies on Machine Learning approaches, which require large amounts of data in order to work effectively. Moreover, classical question-answering approaches often do not apply to all domains. For example, Niu et al. [2003] outline the differences between general and medical QA.

The evaluation of open-domain QA systems was the subject of the competitive evaluation at TREC, the Text REtrieval Conference (http://trec.nist.gov). This initiative started in 1999 with TREC-8, and ran on a yearly cycle until 2007. Participating systems were given a corpus and a set of questions, for which they had to return answers of two lengths: 250 and 50 bytes. The former was usually easier, as all evaluated systems performed better when returning the larger snippet of text. Clearly, these were not always exact answers but rather snippets of text which contained the answer. This changed in TREC 2002, when the additional requirement was that the systems must return the exact answer [Voorhees, 2003].

The first large-scale evaluation of QA systems, in TREC-8, used fact-based, short-answer questions [Voorhees, 1999]. The most accurate system in the more difficult 50-byte run was from Cymfony, Inc., which returned an answer for 72.72% of the questions (144 out of 198). Its Mean Reciprocal Rank (MRR) of 0.66 indicates that if an answer was returned it was almost always correct. The system from Southern Methodist University performed very similarly, returning an answer in 68.18% of the cases, with an MRR of 0.555.

TREC-9 was similar to the previous track, with the main differences being the number of questions (500, plus 198 reformulated questions) and the way the questions were obtained. While for TREC-8 the questions were artificially generated by humans looking at documents and formulating questions, the TREC-9 questions were extracted from various user logs such

as those from Excite3. The best system was Falcon from Southern Methodist University, which answered 65% of questions correctly, with other systems ranked much lower at 42%. The lower performance than in the previous year is not surprising given that the tasks were harder; questions were not generated artificially but collected from users [Voorhees, 2000].

Each year, the TREC QA Track was designed to be more challenging and to bring new, previously unseen complexities: in TREC-10 (2001), the inclusion of questions whose answer does not have to be in the collection, and of questions for which the answer is scattered among several documents [Voorhees, 2001]; in TREC 2002, the requirement that the answer be exact and not a text snippet containing it; and also the consideration of context questions, see Figure 3.2.

Figure 3.2: A sample context question introduced in TREC 2002 (the exam- ple from [Voorhees, 2002])

TREC 2003 contained list and definition questions [Voorhees, 2003]. In 2004, the addition was the grouping of factoid and list questions into series, where each series is associated with a target and each question in the series asks for some information about that target. The target was a person, an organization, or a thing that was a plausible match for the scenario assumed for the task. An example of series questions is illustrated in Figure 3.3.

A new addition in TREC 2005 was a new target type featuring events, and new tasks for document ranking and relationship retrieval. In TREC 2006, systems were required to give the most up-to-date answer found in the corpus. This restriction was more in line with the real world, where users want the best answer to their question, not just any answer.

3Excite was one of the first companies to become famous on the Web in the “dotcom” boom, together with Yahoo, Lycos, and Netscape. Today it is a portal offering many services, not just search: http://www.excite.com/


Figure 3.3: Sample series questions with the target Organization and Person (the example from [Voorhees, 2004])

The relationship task from TREC 2005 was extended with the interactive QA task, with the idea “to push the frontiers of question answering away from factoid questions towards more complex information needs that exist within richer user contexts, and to move away from the one-shot interaction model implicit in previous evaluations towards a model based at least in part on interactions with users” [Dang et al., 2006][p.2]. The main task in the TREC 2007 QA Track repeated the question series format; however, the corpus was not only newswire but also included blogs. Mining blogs for answers introduced significant new challenges in at least two aspects that are crucial for functional QA systems: 1) being able to handle language that is not well-formed, and 2) dealing with discourse structures that are more informal and less reliable than newswire [Dang et al., 2007]. In addition to the main task, the TREC 2007 QA track repeated the complex, interactive QA (ciQA) task of TREC 2006. In 2008, the TREC QA Track was replaced by the TAC QA Track – TAC is sponsored by NIST and other U.S. government agencies and is overseen by an Advisory Committee consisting of representatives from government, industry, and academia. The focus of this track was answering opinion-related questions and also questions requiring summarization [Dang, 2008]. The track was not run in 2009, nor was a call announced for 2010. Performance-wise, the TREC evaluations reveal that there is a huge gap between the best performing system and the rest of the field. The number of correctly answered questions was usually around 70% for most of the TREC


QA tracks, with the best system reaching 76% and 83% for the main task (factoid questions) in TREC 2001 and TREC 2002, respectively. While factoid questions generally reach the highest performance (e.g. 0.76 in TREC 2003), the other types of questions, such as definition and list, perform much worse, reaching only 0.396 and 0.442 in 2003, respectively. A drop in the accuracy of factoid questions occurred in TREC 2006 with the introduction of the task of returning the most up-to-date answer. The accuracy increased to 0.706 for the best performing system (LCC) in the following year (TREC 2007), while other tasks remained challenging. These include interactive Question-Answering and also the opinion-focused QA track which was the focus of TAC 2008. Also, the TREC evaluation did not consider the time to return the answer in its results [Greenwood, 2006]. At TREC 2004, Language Computer Corporation (LCC), which developed PowerAnswer – a very well performing QA system – introduced Palantir, a system which answered questions in under 20 seconds and was used in their dialogue system Ferret. Forcing Palantir to return an answer within 20 seconds involves a trade-off between the precision of the answer and the speed with which the answer is returned [Greenwood, 2006]. When Palantir was compared with PowerAnswer, the factoid score dropped from 0.77 to 0.339 and the overall combined score dropped from 0.6 to 0.254 over the TREC 2004 test set, highlighting the effect of attempting to show results in real time.

3.3 Interactive Natural Language Interface Systems

Answering complex questions in open-domain Question-Answering requires sophisticated approaches, which might involve hypotheses and their consequences, analogies and comparisons, which are not usually found in simple factoid or definition questions and, moreover, cannot be answered through one or more independent questions [Webb and Webber, 2009]. While involving the user in a dialog with the system has not been researched extensively in Question-Answering, the idea of involving the user in dialog is almost as old as Question-Answering systems themselves. One of the earliest

systems, SHRDLU [Winograd, 1972], was designed in the late 1960s and 1970s; the user is engaged in a dialog (through text) with an animated or robotic agent that is capable of performing a limited set of actions and is also aware of its own behaviour, including this interaction. More recent dialogue systems, although with some limitations, support speech [Allen et al., 1996].

A sample conversation4 with SHRDLU looks like the following:

Person: PICK UP A BIG RED BLOCK. Computer: OK. (does it) Person: GRASP THE PYRAMID. Computer: I DON’T UNDERSTAND WHICH PYRAMID YOU MEAN. Person: FIND A BLOCK WHICH IS TALLER THAN THE ONE YOU ARE HOLDING AND PUT IT INTO THE BOX. Computer: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING. Computer: OK. (does it) Person: WHAT DOES THE BOX CONTAIN? Computer: THE BLUE PYRAMID AND THE BLUE BLOCK. Person: WHAT IS THE PYRAMID SUPPORTED BY? Computer: THE BOX. Person: HOW MANY BLOCKS ARE NOT IN THE BOX? Computer: FOUR OF THEM. Person: IS AT LEAST ONE OF THEM NARROWER THAN THE ONE WHICH I TOLD YOU TO PICK UP? Computer: YES, THE RED CUBE. Person: IS IT SUPPORTED? Computer: YES, BY THE TABLE.

In this dialogue, the user is asking or telling the system to perform actions (“Grasp the pyramid.”) and the system either accepts the action, or rejects it by providing an explanation (“I don’t understand which pyramid you mean”).

4Taken from the demo available at http://hci.stanford.edu/winograd/shrdlu/


Jupiter [Zue et al., 2000] is another dialogue system, developed by MIT, which can answer questions about the weather forecast over the phone. More information, including the number which one can dial in order to converse with Jupiter, is available from http://groups.csail.mit.edu/sls/research/jupiter.shtml. According to this website, “Since May 1997, Jupiter has fielded over 300,000 calls, achieving average word accuracies of over 90% on in-domain queries for novice users (over 98% for experienced users), and correct understanding of about 80% of in-domain queries for novice users (over 95% for experienced users). The system has a vocabulary of just over 2000 words.” A sample session with Jupiter is shown in Figure 3.4.

Figure 3.4: A sample session with Jupiter, taken from Zue et al. [2000][p.101]

Note that in order to understand some of the questions, the system must connect them to the previous ones. A similar system called Mercury (also developed by MIT, available from http://groups.csail.mit.edu/sls/research/mercury.shtml) provides information about flight schedules and pricing. Mercury enables users to book and price complex multi-leg travel itineraries to over 200 cities within the United States and around the world. A sample conversation with Mercury is available from http://groups.csail.mit.edu/sls/research/mercury.wav. While in this type of system there is no specific goal, recent work in this area is concerned with so-called tutoring systems. The goal of tutoring systems is very concretely specified, and these systems are used to assess the student’s knowledge, or to correct the student’s errors. One of the well-known systems which also allows spoken interaction is ITSPOKE [Litman and Silliman, 2004, Litman and Forbes-Riley, 2006]. A sample session is shown in Figure 3.5, with the identified emotional states of the students shown in red square brackets (the example is taken from Litman [2006]).

More information is available at http://www.cs.pitt.edu/~litman/itspoke.html.

Figure 3.5: An example session with ITSPOKE

Developing dialogue systems is among the most researched topics in Artificial Intelligence; one sample list of such systems can be found at http://www.ling.gu.se/~sl/dialogue_links.html.

3.4 Summary and Discussion

As noted by Androutsopoulos et al. [1995], the most challenging problems with regard to the linguistic features of NLIDBs (and of any kind of system which deals with Natural Language understanding) are related to correctly handling:

• nominal compounds,
• anaphora,
• disjunction and conjunction,
• quantifiers,
• modifiers,
• elliptical sentences: follow-ups to already asked questions which contain enough information for the context, so that the user can ask small, incomplete questions which can still be understood,

42 Natural Language Interfaces to Conceptual Models

• extragrammatical utterances – linguistic theories describe the structure and meaning of grammatically correct utterances; however, everyday language often contains syntactically ill-formed input. If the main goal of an NLI is to assist the user, then the system must be able to understand the user’s requests even when they are ill-formed.

According to Androutsopoulos et al. [1995], an advantage of NLIs over other kinds of interfaces, such as form-based ones, is that there is no learning overhead because the language is not artificial. At the same time this is a disadvantage, as NLIs usually rely on a controlled language which has to be learnt by the user. If the user is not very familiar with the required controlled language, he might not be able to judge whether the system failed to return an answer because there is no information about the concepts in the knowledge base, or because the query was not properly formulated. In addition, while natural language is potentially easier to use than formal query languages, it is still prone to errors (e.g., misspellings). In comparison to alternative ways of searching, natural language provides an easy way to express some questions, including those using negation and quantification. On the negative side, it is usually unclear to the user what the linguistic coverage of the supported language is. Another problem noted by Androutsopoulos et al. [1995] is known as linguistic vs. conceptual failure: an NLIDB rarely gives the user a clear message about whether the failure to return a result was due to the system not having the information (conceptual failure), or due to the system requiring a reformulation of the question (linguistic failure). Further disadvantages include users’ assumptions that the system is intelligent, the ambiguity of NL, and tedious configuration.

In recent years, the popularity of NLIDBs and even open-domain Question-Answering systems has been overtaken to some extent by a new kind of NLI – those which find answers in an ontology or a set of ontologies. While this is quite a young research topic which has been around for less than a decade, it can also be seen as a continuation of work which has been researched for more than five decades now. There are many similarities between all the systems which have been discussed, mainly related to the ways of solving the language complexity problem. The advantages brought by NLIs to

ontologies are related to the possibility to link word meanings, inherit relationships based on the existing structure, and deal with ambiguities more effectively. Moreover, in comparison to NLIDBs, these new systems have been promoting the benefits of reasoning over structured data, and portability – extracting the lexicon directly from the ontology, without any need for customisation. In other words, as noted by Grosz et al. [1987], who developed the TEAM system, one feature which is missing in NLIDBs is what makes NLIs to ontologies attractive:

[TEAM] shares such constraints of customized interfaces as being restricted to single queries and being able only to retrieve the facts from a database, not to reason about them. [Grosz et al., 1987][p. 237]

A good example demonstrating such reasoning was given by Professor Daniel Weld from the University of Washington during his invited talk at K-CAP 2009 [Weld, 2009]: what vegetable prevents osteoporosis?

If we enter this query into Google, there will be no answer (or rather, no documents which contain the answer). The answer can be derived from documents available on the Web; however, Information Retrieval engines cannot locate it as they do not implement reasoning. Namely, kale is a vegetable which prevents osteoporosis – no document on the Web states this directly, but there are documents which mention the following:

kale is a vegetable (1)
kale contains calcium (2)
calcium prevents osteoporosis (3)
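The toy sketch below illustrates why reasoning is needed here: the answer follows only by chaining statements (1)-(3), not from any single document. It is not a production reasoner; the single hand-written rule and the fact encoding are assumptions made for the illustration.

# A toy illustration of the reasoning needed to answer
# "what vegetable prevents osteoporosis?" from statements (1)-(3).

FACTS = {
    ("kale", "is_a", "vegetable"),            # (1)
    ("kale", "contains", "calcium"),          # (2)
    ("calcium", "prevents", "osteoporosis"),  # (3)
}

# A single hand-written rule: if X contains Y and Y prevents Z, then X prevents Z.
def infer(facts):
    inferred = set(facts)
    for x, p1, y in facts:
        for y2, p2, z in facts:
            if p1 == "contains" and p2 == "prevents" and y == y2:
                inferred.add((x, "prevents", z))
    return inferred

def vegetables_preventing(disease, facts):
    facts = infer(facts)
    return [x for (x, p, o) in facts if p == "prevents" and o == disease
            and (x, "is_a", "vegetable") in facts]

print(vegetables_preventing("osteoporosis", FACTS))  # ['kale']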

NLI systems which interface ontologies are built to answer these and similar kinds of questions. Another important advantage of NLIs to ontologies is interoperability – the possibility to easily combine and merge resources from various locations

on the Web. For the example above, statements 1, 2 and 3 may or may not be contained within one single resource on the Web. Therefore, the knowledge which has been collected for decades can now be merged in order to accomplish what has been a great challenge for such a long time: answering questions automatically using the distributed sources available on the Web. This has not been possible with databases, as they are distributed over the Web and not interoperable, while Question-Answering systems have to process large amounts of unstructured text and use techniques such as Information Retrieval to locate the documents in which the answer may appear. This step can be misleading, as Information Retrieval methods, although they scale well, often do not capture enough semantics – documents with the answer can easily be disregarded if the answer is hidden in a form which is not in line with the patterns expected by the QA systems. Finally, NLIs to ontologies can also use the techniques applied in interactive NLI systems in order to improve the user’s experience. Instead of a single-question session, they can move towards conversational systems which can give answers simulating a human, without being restricted to a topic or a domain.


Chapter 4

Evaluation of Natural Language Interfaces

In this chapter1 we discuss measures used to evaluate NLI systems. First we discuss habitability, which gives an indication of how much effort is required for users to make use of the language supported by the system (Section 4.1). By identifying habitability problems (e.g. difficulties that the user faces) we can focus on correcting these to increase usability. Measuring usability is described in Section 4.2.

1The content of this chapter is an updated and extended version of Section 2 in D. Damljanovic, K. Bontcheva: Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases. In V. Devedzic and D. Gasevic (Eds.), Special issue on Semantic Web and Web 2.0, Annals of Information Systems, Springer-Verlag, 2009. I am grateful for the contribution of K. Bontcheva who read and commented on the initial version of the paper and suggested improvements. The discussion about habitability from this chapter contributed to A. Wyner, K. Angelov, G. Barzdins, D. Damljanovic, B. Davis, N. Fuchs, S. Hoefler, K. Jones, K. Kaljurand, T. Kuhn, M. Luts, J. Pool, M. Rosner, R. Schwitter, J. Sowa: On Controlled Natural Languages: Properties and Prospects. In N. Fuchs, ed.: Controlled Natural Language. Volume 5972 of Lecture Notes in Computer Science, pp. 281–289, Springer Berlin/Heidelberg, 2010, which is an outcome of a collaboration amongst the listed contributors during the CNL’09 workshop. The publication is largely based on the original collaborative document accessible from: http://docs.google.com/Doc?id=dd3zb82w_03976bbfm. My contribution to that document is largely based on the work described in this thesis.


4.1 Habitability

NLIs were invented to assist communication between users and computers. However, some studies ([Chin, 1984], [Krause, 1980]) show that users behave differently when communicating with computers than with humans. In the latter case, their conversation relies heavily on context, whereas with a computer the language they use is restricted, as they make assumptions about what computers can and cannot understand [Ogden and Bernick, 1997]. One particular approach to the human-computer communication problem is to keep it brief and use restricted natural language syntax [Malhotra, 1975]. However, a big challenge when restricting the vocabulary of an NLI system is its habitability. Habitability is a term coined by Watt [1968] – it indicates how easily, naturally, and effectively users can use a language to express themselves within the constraints imposed by the system. If users can express everything they need for their tasks using the constrained system language, then such NLIs are considered habitable [Ogden and Bernick, 1997]. In other words, habitable languages are languages that people can use fluently [Epstein, 1985]. According to Epstein [1985], a language is habitable if 1) users are able to construct expressions of the language which they have not previously encountered, without significant conscious effort; and 2) users are able to easily avoid constructing expressions that are not part of the language. Another way of viewing habitability is as the mismatch between user expectations and the capabilities of an NLI system [Bernstein and Kaufmann, 2006]. Ogden and Bernick [1997] describe habitability in the context of four domains:

The conceptual domain of the language supported by the system describes the area of its coverage, and defines the complete set of objects and actions which are covered. In other words, the conceptual domain determines what can be expressed. This domain is satisfied if the user does not ask about concepts which cannot be processed by the system. To cite the example from Ogden and Bernick [1997], the user should not ask What is the salary of John Smith’s manager? if there is no information about managers in the system.


The conceptual domain of the language can be expanded to inform the user that there is no information about managers in the system.

The functional domain determines how a query to the system can be expressed. Natural Language allows different ways of expressing the same fact, especially taking into account the knowledge of the listener and the context. The functional domain is determined by the number of built-in functions or the knowledge the system has available. If, for example, the answer to a question requires combining several knowledge sources, the system itself might not be able to answer it and would require the user to ask two questions instead of one. A habitable system provides the functions that the user expects. Note that this is different from rephrasing the question in order to get the answer, which is related to the syntactic domain.

The syntactic domain of a language is determined by the number of paraphrases of a single command that the system understands. For example, the system might not be able to understand the question What is the salary of John Smith’s manager?, but could be able to process a rephrased one such as What is the salary of the manager of John Smith?.

The lexical domain is determined by the words available in the lexicon. For example, in order to improve the coverage, many systems extend their lexicon through the use of external sources for finding synonyms.

In order for an NLI to be considered habitable, it should cover all four domains. As mentioned by Ogden and Bernick [1997], the most habitable NLI would be one capable of passing a Turing Test or winning the Loebner Prize2. An interesting fact is that some NLIs have been quite habitable without making use of any Natural Language Processing technologies. For example, COMODA is a conversational natural language information system for distributing information about the disease AIDS to the public [Patrick and Whalen, 1992]. This system won the Loebner Prize by being

2Winning the Loebner Prize involves convincing users that they are conversing with another human, when, in fact, they are communicating with a computer.

able to process the actual words and phrases people used when discussing the topic.

Habitability is an important aspect of a system to measure because it can affect the usability of NLIs. By identifying why a system fails to be habitable, we can identify ways to improve it [Ogden et al., 2006].

4.2 Usability

According to Brooke [1996], usability can be defined as “being a general quality of the appropriateness to a purpose of any particular artefact”. In other words, usability is evaluated in the context in which an NLI system is used, by measuring its appropriateness for that context. Firstly, it is important to identify the system’s target users, and secondly, the tasks that these users will have to perform.

NLIs are used by two types of users:

• application developers, who are responsible for porting a system to a specific domain, and whose task is to customise the system to work with that domain (if the system requires customisation); and

• end-users, who query the customised system in order to retrieve domain knowledge (e.g., domain experts).

Therefore, the usability of NLI systems should be evaluated from the point of view of these two types of users. According to ISO 9241-11, usability measures should cover [Brooke, 1996]:

1. effectiveness – the ability of users to complete tasks using the system, and the quality of output of these tasks;

2. efficiency – the level of resource consumed in performing tasks; and

3. satisfaction – users’ subjective reactions to using the system.


4.2.1 Effectiveness

Customisation

Effectiveness with regard to the customisation of an NLI system is determined by the ability of application developers to complete the customisation process successfully, and also by the quality of the output of this customisation3. Usually, the customisation process includes creating a domain-specific lexicon when the system is ported from one domain to another. The quality of the output can be measured through the coverage of the system. Given a set of questions collected from a real-world application, the percentage of those which are answerable (e.g., covered by the domain lexicon) describes the system’s coverage. The richer the lexicon is, the higher the value of the coverage. This term should not be confused with language coverage, which usually refers to the complexity of the questions covered by an NLI system.

End-user’s point of view

From the end-user’s perspective, effectiveness indicates whether they could find the answer to their question using the system, and also whether the answer was correct. Typically, NLI systems are evaluated in terms of precision and recall, which are measures adapted from Information Retrieval. Precision measures the number of questions correctly answered divided by the number of questions for which some answer is returned [Tang and Mooney, 2001], [Cimiano et al., 2007]. The definition of recall varies; the most widely used definitions are as follows:

• According to Tang and Mooney [2001], recall is defined as the number of correctly produced formal queries divided by the total number of questions. This definition is also used by Cimiano et al. [2007].

• According to Popescu et al. [2003], recall is interpreted as the number of questions answered by an NLI system divided by the total number of questions. While this definition is different from the previous one, in the evaluation of the system described by Popescu et al. [2003],

3Note that this is appropriate only for the systems which can be customised or ported to other domains.


the number of questions answered is equal to the number of those correctly answered, as the precision is 100%.

However, the second definition is used for the evaluation of several NLIs to ontologies where the number of answered questions is not equal to the number of correctly answered ones, for example in Wang et al. [2007] and Kaufmann and Bernstein [2007]. For the evaluation of the systems developed as part of this thesis, we use the first definition. (A small worked example of these measures is given below.)
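The small worked example below computes precision and recall under the first definition above; the counts are hypothetical and chosen purely to illustrate the calculation.

# Precision and recall as used in this thesis (first definition above):
# precision = correctly answered / answered, recall = correctly answered / all questions.

def precision_recall(total_questions, answered, correctly_answered):
    precision = correctly_answered / answered if answered else 0.0
    recall = correctly_answered / total_questions if total_questions else 0.0
    return precision, recall

# Hypothetical evaluation run: 100 questions, 80 received some answer, 70 were correct.
precision, recall = precision_recall(total_questions=100, answered=80, correctly_answered=70)
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.88, recall=0.70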

4.2.2 Efficiency

Efficiency refers to the level of resources consumed in order to perform a specific task, e.g. how fast a user can accomplish a task. In the case of NLI users, this is usually measured by the time needed to customise the system for a specific domain (the developer’s point of view), or by the time needed to successfully find some particular information (the end-user’s point of view). In the latter case, efficiency is usually expressed by the execution time for queries of various complexity.

4.2.3 User Satisfaction

There is no definitive way of measuring user satisfaction. The most common methodology is to engage users in a session with the system, and ask them to fill out a questionnaire where they can express their views on the different features of the system. One of the most popular questionnaires used for evaluating different interfaces is SUS – the System Usability Scale – a simple ten-item scale giving a global view of subjective assessments of usability [Brooke, 1996] (a sketch of its scoring is given at the end of this section). Before conducting a user-centric evaluation, each system should undergo a laboratory evaluation. One of the most popular types is a heuristic evaluation, which has been proven to give significant results even with a small number of users (e.g., the evaluation of the TAP Search Interface presented in da Costa et al. [2005]). Heuristic evaluation is an analysis that utilises history and experience to discover problems with particular user interface designs. It requires experimenting with individual components and noting any disparities

between component design and the suggested user interface design principles. A very effective method for heuristic evaluation is summarised in the ten general principles for user interface design presented in Nielsen [1994]. Other measures of effectiveness and user satisfaction are often related to the specific methods used in NLI systems. The majority of systems express the effectiveness of individual methods by comparing the system’s performance with and without the specific method. For example, to test whether a query refinement module has any effect on performance, it is usually compared to the baseline model of the system – the one without the query refinement. More details about the evaluation of NLI systems are given in Ogden and Bernick [1997] and also in Ogden et al. [2006].
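As mentioned above, SUS is the questionnaire used most often in this thesis' context; the sketch below shows its standard scoring procedure [Brooke, 1996]. The function name and the sample responses are illustrative only.

# The standard SUS scoring procedure: ten items on a 1-5 scale; odd-numbered
# items contribute (response - 1), even-numbered items contribute (5 - response),
# and the sum is multiplied by 2.5 to give a score in the range 0-100.

def sus_score(responses):
    """responses: list of ten integers between 1 and 5, in questionnaire order."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten responses on a 1-5 scale")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)   # items 1,3,5,... sit at even indices
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# A hypothetical participant's answers:
print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # 85.0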

4.3 Summary

Natural Language Interfaces are typically used by two types of users: application developers who customise the system, and end-users who query the customised system. Different evaluation measures are used to test usability from different users’ perspectives, relative to the tasks the users perform. Irrespective of the task type, usability measures should cover: effectiveness – whether the users can finish tasks successfully using the system or not, efficiency – how quickly they can finish tasks, and user satisfaction – the user’s subjective reactions to using the system. Common evaluation strategies with regard to Natural Language Interfaces are related to habitability, which reflects usability from the end-user’s point of view – if the end-users can use the system effectively and easily avoid the constructions that are not supported, then the system is considered habitable. Effectiveness can be measured through the number of correctly handled questions, and in that respect precision and recall measures are commonly used. Usability includes other aspects as well, related to efficiency – how quickly the users can find the answers to their questions – and also user satisfaction, which is usually measured through questionnaires. The most popular questionnaire for measuring user satisfaction is the System Usability Scale.


With regard to usability from the application developers’ point of view, the measures are less standardised and include effectiveness – measuring whether the users can customise the system for a particular domain successfully or not, efficiency – how quickly they can customise it, and also user satisfaction, based on how much they like the customisation interface.

Part II

Usability of Natural Language Interfaces to Conceptual Models: State of the Art


Chapter 5

Portability of Natural Language Interfaces to Structured Data

In this chapter1 we review existing Natural Language Interfaces to conceptual models, with special emphasis on their customisation.

5.1 Introduction

Building portable NLIs is a very challenging task and, as we have discussed previously, it has been addressed in several different ways in previous work. One of the first attempts to enable portability was in the 1980s, when several NLIDBs were developed. One of them is TEAM [Grosz et al., 1987], which is envisaged to be used by:

• a database expert who customises the system through an acquisition dialogue in order to port it to a new database. The customisation

1The content of this chapter is an updated and extended version of Section 3 in D. Damljanovic, K. Bontcheva: Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases. In V. Devedzic and D. Gasevic (Eds.), Special issue on Semantic Web and Web 2.0, Annals of Information Systems, Springer-Verlag, 2009. I am grateful for the contribution of K. Bontcheva who read and commented on the initial version of the paper and suggested improvements.


includes extending the lexicon by adding new verbs, adjectives or synonyms for existing words in the database. In addition, it includes adding information about the fields in the database and the conceptual content they encode, as well as the words and phrases used to refer to these concepts. Therefore, a database expert must be familiar with the database structure and with the domain that it covers, but he does not need any knowledge of language-processing terminology;

end users who query the database. •

There are two lexicons used by the TEAM system:

• An open-class lexicon includes domain-specific words such as nouns, adjectives and verbs. This lexicon is enriched through the acquisition dialogue with a domain expert.

• A closed-class lexicon is built into the system as the initial lexicon of TEAM and does not depend on the domain. This lexicon includes determiners, pronouns and conjunctions (sketched below).
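A rough illustration of this two-lexicon split follows (a hypothetical Python sketch, not TEAM's actual data structures); note that the acquisition dialogue only ever touches the open-class part.

    # Hypothetical sketch of TEAM-style lexicons; names and structure are illustrative only.
    CLOSED_CLASS = {
        "determiners": {"a", "an", "the", "every", "some"},
        "pronouns": {"who", "what", "which", "it"},
        "conjunctions": {"and", "or", "but"},
    }

    # Open-class lexicon: domain-specific words mapped to database fields/concepts.
    open_class = {}

    def acquire(word, part_of_speech, concept):
        """Simulate one step of the acquisition dialogue with the database expert."""
        open_class.setdefault(concept, []).append((word, part_of_speech))

    acquire("employee", "noun", "EMPLOYEES table")
    acquire("earns", "verb", "SALARY field")
    print(open_class)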

An example of the acquisition dialogue in TEAM is shown in Figure 5.1. The difference between portable systems such as TEAM and CHAT-80 [Warren and Pereira, 1982], and the domain-dependent ones (such as those developed in the 1970s, e.g. LUNAR), is that the former do not directly translate NL into a formal language, but instead use an intermediate logical representation which is subsequently translated into the formal language. This logical representation allows for more generality and the possibility of designing more portable systems. Unlike with relational databases, where it is difficult, although not impossible, to attach metadata to the fields in the tables (due to implementation-dependent features being present), with ontologies this is more natural: each concept and each relation in the ontology is envisaged to be accompanied by a human-understandable label which describes it. Therefore, addressing portability in NLIs to ontologies does not seem to be a big issue at first sight, and many systems have been developed with the claim that they are portable. However, especially for ontologies which are generated automatically, many concepts can have missing labels or too many labels, which again complicates the issue.

Figure 5.1: An acquisition dialogue from TEAM (taken from [Grosz et al., 1987][p.10])

In what follows, we review and compare some of the existing NLIs to ontologies, emphasising the effect of customisation on performance. This comparison is not a trivial task, due to the variation in evaluation conditions (e.g., ontologies) and measures used. To begin with, the datasets used to evaluate the different systems are not the same, and their size, coverage and quality vary. In addition, the benchmark queries are of differing complexity. Overall, these differences make comparative system evaluation somewhat unreliable, because the evaluation metrics and, consequently, the reported system results are heavily dependent on which datasets are used and how difficult the queries are. Nevertheless, these results still provide an insight into the achievements in the field.

A brief overall summary is shown in Table 5.1, subdivided by dataset, as no reliable comparison of precision and recall can be made across different datasets. This table covers only a subset of NLI systems to ontologies, i.e., those that reported evaluation results. The main conclusion to be drawn from it is that although systems with zero customisation tend to have reasonable performance, it varies significantly across systems – in general, the more complex the supported queries are, the lower the performance.

Table 5.1: Natural Language Interfaces to Knowledge Bases

Dataset                            System        Precision      Recall          Customis.
Mooney: geography                  PANTO         88.05%         85.86%          zero
                                   Querix        86.08%         87.11%          zero
                                   NLP-Reduce    70.7%          76.4%           zero
Mooney: restaurants                PANTO         90.87%         96.64%          zero
                                   NLP-Reduce    67.7%          69.6%           zero
Mooney: jobs                       PANTO         86.12%         89.17%          zero
Geographical facts about Germany   ORAKEL        80.60-84.23%   45.15%-53.7%    customised
library data                       E-librarian   97%            -               -
biology                            CPL           38%            -               -
chemistry                          CPL           37.5%          -               -
physics                            CPL           19%            -               -


5.2 ORAKEL

ORAKEL is an NLI to knowledge bases [Cimiano et al., 2007] which supports factual questions starting with WH-pronouns such as who, what, where, etc. Factual here means that answers are ground facts as found in the knowledge base, and not complex answers to why or how questions that require explanation. The most important advantage of ORAKEL in comparison to other similar systems is its support for compositional semantic construction, i.e. the ability to handle questions involving quantification, conjunction and negation. ORAKEL's lexicon is composed of two parts:

• A general lexicon, shared among different domains, where words such as what, which, etc. are stored.

• A domain-specific lexicon, which has two parts:
  – An ontological lexicon generated automatically from the domain ontology. It contains lexical entries and the semantics of instances and concepts, which are typically represented by proper nouns and nouns respectively.
  – A lexicon for mapping ontology relations to words: this part is created manually and contains mappings of subcategorisation frames to ontology relations. Subcategorisation frames are essentially linguistic argument structures, e.g. verbs with their arguments, nouns with their arguments, etc. For example, the verb to write, being transitive, requires a subject and an object; this subject-verb-object triple could be considered a subcategorisation frame and could be mapped to the ontology relation writes (a sketch of such a mapping follows below). Subcategorisation frames are created by domain experts who do not have to be familiar with computational linguistics, although they are expected to have some very basic knowledge of subcategorisation frames. The adaptation is performed in several iterative cycles through user interaction sessions, so that the coverage of the lexicon increases with each iteration.
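To illustrate the kind of mapping this manually created part of the lexicon holds, here is a minimal sketch (Python; the frame and relation names are invented for illustration and do not reflect ORAKEL's actual data structures):

    # Hypothetical mapping of subcategorisation frames to ontology relations,
    # in the spirit of ORAKEL's manually created domain lexicon.
    frame_to_relation = {
        # (subject type, verb, object type) -> ontology relation
        ("Person", "write", "Book"): "writes",
        ("River", "flow through", "State"): "flowsThrough",
    }

    def relation_for(subject_type, verb, object_type):
        """Look up the ontology relation for a parsed subject-verb-object frame."""
        return frame_to_relation.get((subject_type, verb, object_type))

    print(relation_for("Person", "write", "Book"))   # -> "writes"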


In the user study carried out by Cimiano et al. [2007] the aim was to test whether it is feasible for users without NLP expertise to customise the system without significant problems. The evaluation knowledge base contained geographical facts about Germany, covering 260 entities in total. The experiment was conducted with 27 users. Three people had to customise the lexicon, while the remaining 24, who did not have any background knowledge in computational linguistics, received a brief explanation about the scope of the covered domain and were told to ask at least 10 questions; they also had to explicitly say if the received answer was correct or not.

Only one of the three people in charge of creating the domain lexicon was very familiar with the lexicon acquisition tool (user A), while the other two (user B and user C) were not and received 10 minutes of training on the software (the FrameMapper tool) and a 10-minute explanation of the different subcategorisation types, illustrated with examples. User A constructed the lexicon in one iteration, whereas users B and C constructed it in two rounds, each lasting 30 minutes. In the first round they created the model from scratch, while in the second round they were presented with those questions which the system had failed to answer in the sessions with the 24 users. Overall, users B and C each had one hour to construct the lexicon. The customisation system of ORAKEL is designed so that in each iteration the created lexicon is extended, and the system is therefore expected to give better performance. Consequently, the more time users spend customising the system, the better the performance of the system is expected to be.

The results showed that querying the system using the lexicons created by users B and C gives precision and recall comparable to that obtained with the lexicon created by user A. Namely, after the second iteration, recall for users B and C was 45.15% and 47.66% respectively, in contrast to 53.67% with the lexicon created by user A. Precision varied from 80.95% (user B) to 84.23% (user A).

One weak point of the approach implemented in ORAKEL is that it maps ontology relations to words. This approach assumes that all classes and instances have understandable and useful lexicalisations, which is not always the case. Moreover, while user interaction is used for customisation, for the end users the system either interprets the question and returns the answer, or it fails. Hence, the end-user has no control over the overall process of interpreting the NL into the formal language.

5.3 AquaLog and PowerAqua

AquaLog [Lopez and Motta, 2004] is a portable question-answering system which takes Natural Language queries and an ontology as input, and returns answers as output. The supported questions are mainly factual queries beginning with what, which, who and the like.

Although customisation of AquaLog is not mandatory (beyond providing the URL of the new ontology), it can increase the performance of the system [Lopez et al., 2007]. The role of the person who customises the system is to associate certain words with relevant concepts from the ontology. For example, where needs to be associated with ontology classes which represent a location, such as City and Country; similarly, who needs to be associated with classes like Person and Organisation. Additionally, it is possible to add so-called pretty names to concepts or relations in case the term used when referring to a concept is not in the ontology. For example, if the property locatedIn is usually lexicalised as in, this will be added as a pretty name for that property. AquaLog also uses WordNet [Fellbaum, 1998] for extending the system vocabulary.
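A minimal sketch of what such a customisation might look like is given below (Python; the names are hypothetical and this is not AquaLog's actual configuration format):

    # Hypothetical AquaLog-style customisation: question words mapped to classes,
    # and "pretty names" added for ontology properties.
    question_word_classes = {
        "where": ["City", "Country"],
        "who": ["Person", "Organisation"],
    }

    pretty_names = {
        "locatedIn": ["in", "located in"],
        "works-for": ["works for", "is employed by"],
    }

    def candidate_properties(surface_form):
        """Return ontology properties whose pretty names match a query phrase."""
        return [prop for prop, names in pretty_names.items()
                if surface_form.lower() in names]

    print(candidate_properties("in"))   # -> ["locatedIn"]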

In an evaluation with 10 users who were familiar neither with the KMI knowledge base2 nor with AquaLog, the participants were given an introduction to the conceptual coverage of the ontology, pointing out that its aim is to model the key elements of a research lab such as people, publications, projects, research areas, etc. They were also told that temporal information is not handled by AquaLog and that the system is not conversational, as each question is resolved on its own without reference to previous questions.

Of the 69 collected questions, 40 (57.97%) were handled correctly [Lopez et al., 2007]. However, this includes:

2The KMI knowledge base is populated based on the AKT ontology http://kmi.open.ac.uk/projects/akt/ref-onto/ and both are part of the KMI semantic portal: http://semanticweb.kmi.open.ac.uk


• 7 queries with failures, which happened when the ontology lexicalisations and query terms did not match, or when the ontology was not designed in line with the system's expectations – for example, when certain terms were modelled as instances, whereas AquaLog would have parsed the query had the same terms been modelled as classes;

• 10 questions for which the answer was not in the knowledge base.

To evaluate portability, AquaLog was trialled with the wine ontology3. To customise the system to work with the new domain, first words like where, when, and who were associated with relevant ontology resources; then synonyms for several ontology resources were manually added. As pointed out in Lopez et al. [2007], this step was not mandatory, but due to the limitations of WordNet coverage it increases the recall. Overall, the system was able to handle 17.64% of the questions correctly. The system failed to answer 51.47% of the questions due to the lack of knowledge inside the ontology4. The lack of knowledge was not the only cause of the low performance. Many problems arose from the problematic ontology structure, which contains a lot of restrictions over properties. In order to be handled properly by AquaLog, the ontology needs a simpler hierarchy; also, the terms in a query can only refer to ontology concepts between which the path length is not greater than two. For example, if the query were which cities are located in Europe, cities might refer to the ontology class City, and Europe might refer to an instance of the class Continent. If these concepts are related so that a City is located in a County, a County is located in a Country, and a Country is located in a Continent, this query could not be handled by AquaLog. However, if County did not exist in this chain and there was a direct relation between City and Country (located in), the query would be processed and answered, as the path length between the terms City and Europe (as a continent) would be two. In addition, all ontology resources should be accompanied by labels, as the performance of the system directly depends on them [Lopez et al., 2007].
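The path-length constraint can be made concrete with a small sketch (Python; a toy graph with invented relation structure, not AquaLog's internals): the query about cities and Europe succeeds only if the two query concepts are at most two hops apart in the ontology graph.

    from collections import deque

    # Toy ontology graph over classes; edges stand for relations (direction ignored here).
    edges = {
        "City": ["County"],
        "County": ["Country"],
        "Country": ["Continent"],
    }

    def path_length(start, goal):
        """Breadth-first search returning the number of hops between two concepts."""
        graph = {}
        for a, targets in edges.items():
            for b in targets:
                graph.setdefault(a, set()).add(b)
                graph.setdefault(b, set()).add(a)
        queue, seen = deque([(start, 0)]), {start}
        while queue:
            node, dist = queue.popleft()
            if node == goal:
                return dist
            for nxt in graph.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return None

    # With County in the chain the distance City-Continent is 3, so the query fails;
    # removing County (a direct City -> Country relation) would bring it down to 2.
    print(path_length("City", "Continent"))   # -> 3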

AquaLog evolved into PowerAqua [Lopez et al., 2009b] – a QA system which

3http://www.w3.org/TR/2003/CR-owl-guide-20030818/
4Note that these numbers do not refer to the precision or recall as defined in Chapter 4.

works with ontologies available on the Web (e.g. crawled through Watson5). In contrast to AquaLog, PowerAqua addresses the challenges related to this heterogeneity: locating the ontology which contains the answer, resolving relations between the recognised resources, filtering duplicate questions which come from different sources, and the like. The downside is the performance: in the initial evaluation with several ontologies (a collection of ontologies saved into an online repository, rather than queried directly from the Web) and 69 questions, the time to answer queries ranged from 0.5 to 78 seconds. PowerAqua successfully handled 48 questions, resulting in a 69.5% success rate. However, these 48 questions include those that did not return any answer because the knowledge was not available in the ontology (conceptual failure).

5.4 E-librarian

The E-librarian [Linckels and Meinel, 2007] system accepts a natural language question as input and returns the result found in the knowledge base as output. The knowledge base contains a set of short multimedia documents (clips), each of which documents one subject or a part of a subject. Hence, the system does not directly return the answer, but rather a clip in which the user can find the answer. All clips are semantically described using an ontology which is also used to interpret the user's question. The ontology serves as a hierarchical dictionary with domain-specific knowledge and relations of words such as synonyms, homonyms, hypernyms and hyponyms. This dictionary is carefully designed and used instead of external sources such as WordNet. There is no evaluation of how expensive it is to build this dictionary; however, it needs to be built manually (see [Linckels and Meinel, 2007]).

The E-librarian service was applied in two applications: one, CHESt, about computer history, and the other, MatES, about fractions in mathematics. The performance of MatES was evaluated with 229 questions created by a mathematics teacher who was not involved in the implementation of the prototype. The system returned the set of documents that contained the

5http://watson.kmi.open.ac.uk/WatsonWUI/

correct answer for 97% of the questions; however, the paper does not present sufficient information on the complexity of those questions.

5.5 PANTO

PANTO [Wang et al., 2007] is a portable NLI to ontologies. Wang et al. [2007] do not specify what types of questions are supported, but it is claimed that the system correctly parsed 170 questions taken from AquaLog's website, so we can assume that PANTO supports a set of questions similar to that supported by AquaLog. As in AquaLog, WordNet is used for vocabulary extension and the user lexicon is configurable: there is no need to manually customise the system unless the user is interested in adding associations to the ontology resources in order to improve the system's performance.

PANTO was evaluated with the test data provided by Mooney6, which had been used previously to evaluate NLIs to databases. This dataset covers three domains: geography, restaurants and jobs. As shown in Table 5.1, precision and recall for this dataset are quite high7. However, the range of supported NL queries is limited to those handled by SPARQL, e.g. questions starting with how many are not supported. Additionally, the authors do not report whether the answer to the question was found in the knowledge base, as is the case with most other systems, but rather whether the generated SPARQL query was correct. It is not clear from Wang et al. [2007] whether the system was customised prior to experimenting with the three different domains.

5.6 Querix

Querix [Kaufmann et al., 2006] is another ontology-based question answering system that translates generic natural language queries into SPARQL. Querix relies on clarification dialogs in the case of ambiguities. When Querix

6http://www.cs.utexas.edu/users/ml/nldata.html
7Note that the recall here is calculated with the number of answered questions, even if they are not all correct.

was evaluated on the Mooney geography domain (215 questions), the precision was 86.08% and the recall 87.11%. Similar to PANTO, if an answer was returned by the system, it was almost always correct. The system vocabulary is derived from the semantic resources and enriched with synonyms from WordNet, so there is no need for customisation. The downside of this approach is that both the lexicalisations attached to the semantic resources and the availability of synonyms in WordNet strongly affect the system's performance.

5.7 NLP-Reduce

NLP-Reduce [Kaufmann et al., 2007] accepts full sentence queries, sentence fragments, or keywords as input. However, this relaxation of the supported queries seems to have a negative impact on performance: when trialled with the Mooney geography and restaurants datasets, the performance was lower than that of similar systems. Similar to Querix, the system requires ontology lexicalisations which are in line with the user's language in order to return the answer.

5.8 CPL

Computer Processable Language (CPL) [Clark et al., 2005] is capable of translating English sentences into a formal Knowledge Representation (KR). The KR is the Knowledge Machine (KM) language – a mature, advanced, frame-based language with well-defined semantics. CPL was evaluated by two users in three domains: biology, physics and chemistry. They each received 6 hours of training individually, followed by one week of using the question-answering system. Our understanding is that the domain knowledge was created using the CPL language; however, it is not clear from Clark et al. [2007] how much time was needed to create the domain knowledge used in the evaluation. In physics, 131 questions were asked, and the correctness of answers was 19%8. This low figure is due to the fact that

8It is important to point out that although Table 5.1 shows these measures as precision, this result is calculated on the overall set of questions, whereas most other systems removed the questions for which the answer was not in the ontology before calculating precision.

some questions were very complex, comprising several sentences. The total number of questions in biology was 146, and the average correctness was 38%. In chemistry, 86 questions were answered with 37.5% correctness.

Examination of the system's failures revealed that one third were caused by the user not formulating the query in a way understandable to the system (some common-sense facts were not expressed explicitly enough). Another third occurred because the knowledge base did not contain an answer, and the last third were caused by mistakes of the CPL interpreter, so that the system failed to find the solution.

5.9 Attempto Controlled English (ACE)

Attempto Controlled English (ACE) [Fuchs et al., 2008] is one of the oldest natural language interfaces developed to serve as a formal language for knowledge representation. Unlike NL, which is ambiguous, vague and potentially inconsistent, Controlled Languages have well-defined syntax and unambiguous semantics, and they can also support formal methods, reasoning in particular [Fuchs et al., 2008]. ACE is a controlled English, "a precisely defined, tractable subset of full English that can automatically and unambiguously be translated into first-order logic" [Fuchs et al., 2008][p.104]. ACE is translated into several variations of first-order logic, such as Discourse Representation Structures (DRS) [Blackburn and Bos, 1999], OWL, SWRL [Horrocks et al., 2004], and RuleML [Boley, 2003]. ACE is supported by many tools, including a reasoner (RACE) which has been developed and used with it. For example, ACE Wiki enables generating text in a subset of ACE, translating it into OWL and SWRL, and uses a predictive editor which supports the user while generating sentences; AceRules translates ACE into rules; and ACE View is a plugin for the ontology editor Protégé.

ACE is applied in domains such as software and hardware specifications, database integrity constraints, agent control, legal and medical regulations, and ontologies.


Unlike the majority of similar systems, ACE supports both knowledge representation and querying. However, querying is feasible only if the knowledge has been generated using ACE sentences; that is, the supported questions work well only if the knowledge itself was generated with ACE. ACE supports yes/no and WH-queries. For example, if the ACE sentence translated into the knowledge representation was

A customer inserts a card.

we can ask a question such as:

Does a customer insert a card?

For another example, given the sentence:

A new customer inserts a valid card manually.

questions that could be answered are as follows:

Who inserts a card?
Which customer inserts a card?
What does a customer insert?
How does a customer insert a card?

What is most impressive about ACE is that the user can write declarative ACE sentences and then ask questions at the end. For example:

John enters a card.
John drinks some water.
What does John drink?
What does he enter?

This feature is unique to ACE and a few other similar systems like CPL.

5.10 Summary and Discussion

Figure 5.2 generalises the performance of NLIs in terms of precision, with regard to two factors: the complexity of the questions and the size of the evaluation dataset. Here we assume that the complexity of questions is low if the system supports a limited set of grammatically correct questions (e.g., a CNL), while it is high if the system supports ill-formed, incomplete and fully grammatical questions alike.

Figure 5.2: Performance variation based on question complexity and the dataset size

Systems that support simple questions and are evaluated on narrow domains tend to have very high precision. The reason is domain limitation: the narrow domain narrows the scope of the questions which can be asked. In addition, the specification of the supported language poses a language limitation for the user, while having a positive influence on performance. As the complexity of questions grows, the language freedom given to the user causes the precision to degrade for several reasons:

• While domain limitation narrows the scope of questions, supporting more relaxed queries gives more freedom to the user and introduces more ambiguities which need to be resolved by the system.

• Systems which support relaxed questions, with no full grammar required, tend to return an incomplete or incorrect answer, as they usually rely on shallow natural language processing and do not support deep understanding of the question.

If systems which support complex questions are trialled with several domains, the precision is affected even more. The reasons related to language freedom still apply, but in addition there is domain freedom. This makes building high-performance NLI systems highly challenging, as in addition to handling any grammatical construction they also need to handle any vocabulary. Finally, systems which support simple questions but are trialled with several ontologies:

• have better performance than the systems which support complex questions, the reason being the language limitation;

• have lower performance than the systems which work on narrow domains, the reason being the domain limitation.

With regard to recall, it is usually higher for systems which support simple questions. In addition, recall is strongly influenced by the customisation: the more customisation is performed, the richer the lexicon and hence the higher the recall. Portability with zero customisation has been claimed for many NLIs to ontologies developed in the last few years. This is achieved by automatically building the lexicon from the ontology. The majority of the described systems use external sources to extend the vocabulary with synonyms, hypernyms and hyponyms. The most common resources are WordNet [Fellbaum, 1998], FrameNet [Ruppenhofer et al., 2005], and OpenCyc9. A few systems use resources accessible through the Semantic Web via the relation owl:sameAs (e.g., PowerAqua, see Section 5.3). However, the more technical the domain gets, the less chance there is that one can rely on lexical matching alone. In fact, the complete lexical knowledge necessary for very technical domains cannot be expected to be present in general resources such as WordNet [Cimiano et al., 2008]. This is especially the case for properties. That is why domain lexicons, which contain only domain-specific vocabulary, tend to be used by systems such as E-librarian or ORAKEL. Manually engineering a lexicon as in the ORAKEL system certainly requires substantial effort, but it allows one to directly control the quality and coverage of the lexicon for the specific domain [Cimiano et al., 2008].

9http://www.opencyc.org/

Moreover, it has been shown that the more time users spend customising the system, the better its performance (see Section 5.2). However, this means that systems which perform well involve three kinds of users (see Figure 5.3):

• Ontology engineers, who design the ontology based on the knowledge from a domain expert.

• Domain experts, who are familiar with the terminology and abstract concepts in the domain.

• Application developers, who align the language of the domain expert with the generated ontology.

Figure 5.3: Semi-automated process of creating the domain lexicon from the ontology

An alternative way to generate or enrich the lexicon for NLIs to ontologies is shown in Figure 5.4. Instead of involving three kinds of users, we can involve only domain experts, although they would then have to use NLIs for knowledge representation, such as ACE [Fuchs et al., 2008], CPL [Clark et al., 2005], or the more recently developed SOS (Sydney OWL Syntax) [Cregan et al., 2007], CLOnE [Funk et al., 2007], and Rabbit [Hart et al., 2008].

Figure 5.4: Automated process of creating the domain lexicon from the ontology
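As a rough illustration of the lexicon-building step that both processes rely on, the following sketch (Python; the resource URIs are invented and the synonym lookup is stubbed, since the actual source could be WordNet or manually added entries) harvests lexical entries from ontology labels:

    # Hypothetical sketch: building a domain lexicon from ontology labels.
    ontology_labels = {
        "http://example.org/onto#City": ["city"],
        "http://example.org/onto#locatedIn": ["located in"],
        "http://example.org/onto#Person": ["person"],
    }

    def synonyms(word):
        # Stub: in a real system this could query WordNet or a curated list.
        return {"person": ["human", "individual"]}.get(word, [])

    def build_lexicon(labels):
        """Map every label (and its synonyms) to the ontology resource it names."""
        lexicon = {}
        for uri, names in labels.items():
            for name in names:
                lexicon[name] = uri
                for syn in synonyms(name):
                    lexicon.setdefault(syn, uri)
        return lexicon

    print(build_lexicon(ontology_labels)["human"])   # -> .../onto#Person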

ACE is probably the most powerful, not only because of its maturity, but also due to its many support tools, such as the OWL Verbaliser, which can be used to generate the lexicon from an externally built ontology; the lexicon can then be updated or enriched by changing or adding ACE sentences. In fact, systems like ACE, which can be used both for generating and for querying the knowledge, perform very well in querying if the knowledge is generated using the same language. While none of the NLIs for knowledge representation is tailored to a specific domain, porting them requires knowledge of the supported language in order to generate or update the domain knowledge. The question is which of the following is the easiest:

1. to learn a required supported language (e.g., a CNL) for knowledge representation;

2. to learn how to use the customisation software (such as FrameMapper in the case of ORAKEL);

3. to learn the ontology structure and then define mappings between certain words such as Where and ontology concepts such as Location through configuration files (as in the case of AquaLog and PANTO);

4. to use any other tools for ontology editing in order to enrich an already existing lexicon in the ontology, so that automatically generating the lexicon from the ontology can be sufficient for reasonable performance.

The last two options are not practical for large ontologies or for sets of ontologies, while the second option might involve domain experts who generate a set of sample questions so that application developers can associate question terms with particular ontology concepts. The first option is the most attractive because it involves only domain experts. However, as the underlying knowledge representation language (such as OWL) relies on logical statements, the natural way to design NLIs for knowledge representation is to use a CNL, which is in fact yet another formal language and needs to be learnt. The task of designing a CNL for knowledge representation is challenging because not all mathematically clear and logical expressions can easily be translated into English. Yet, CNLs need to be intuitive for domain experts, who are often not logicians. In an example from Schwitter et al. [2008][p.7], where the authors compared ACE, Rabbit and SOS, expressing that two things differ from each other looks as follows:

OWL: DifferentIndividuals([Individual(Scotland), Individual(England)])
ACE: Scotland is not England.
RAB: England and Scotland are different things.
SOS: Scotland and England are different individuals.

While no domain expert would have a problem reading and understanding any of these sentences (apart from the OWL one, which is too formal), generating them might be problematic. They sound natural in places, but they often seem too explicit, as in the examples above. In the evaluation of Rabbit with domain experts, each generated ontology was different and had to be corrected by knowledge engineers in order to be useful [Denaux et al., 2009]. This indicates that the proposal in Figure 5.4 is too idealistic: domain experts might not be able to generate the knowledge in first-order logic representations such as OWL without the help of a knowledge engineer.


To conclude, the customisation of NLIs, irrespective of the tools used to accomplish it, significantly affects their performance. While ontologies can ease the process of customisation, building systems which can work with any ontology, and which are thus portable with no customisation, is difficult: ontologies have different designs and varying quality of lexicalisations. Therefore, building a lexicon from ontologies and using external sources such as WordNet might not be sufficient, and customisation might be necessary in real-world applications. The question is in what form this customisation poses the least overhead for the users who need to perform it. On the other side of the coin are the end-users. Even if the system is customised to work very well for a specific domain, the performance is still affected by the way the end-user uses the system. The next chapter is concerned with this topic.


Chapter 6

Usability Enhancement Methods

In the previous chapter, we discussed the portability of NLI systems and its relation to performance (expressed through precision and recall). In this chapter1, we will assume that the system has been ported successfully to the new domain, and that the end-users can pose questions in natural language.

Although NL is intuitive, the simplicity of the interface, which is often a single text box for queries, may cause additional problems for end-users [Stojanovic, 2005b]. Moreover, designing NLI systems is not trivial due to the ambiguities and complexities which arise from natural language itself. One way to approach this problem is to impose simple and explicit semantic limitations [Epstein, 1985], i.e. to restrict the supported vocabulary and grammar.

The usability of an NLI system depends on the level of end-user satisfaction: if the user does not have any difficulties using the system, then it can be considered habitable.

1The content of this chapter is an updated and extended version of Section 4 in D. Damljanovic, K. Bontcheva: Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases. In V. Devedzic and D. Gasevic (Eds.), Special issue on Semantic Web and Web 2.0, Annals of Information Systems, Springer-Verlag, 2009. I am grateful for the contribution of K. Bontcheva, who read and commented on the initial version of the paper and suggested improvements.


A habitable system, as discussed previously in Section 4.1, makes the user aware of the reasons for failures when they happen. By identifying these failures, we can further improve the system and make it more usable. In an attempt to address this problem, many methods have been developed for improving the communication between the user and the system. Figure 6.1 illustrates some of the methods which adapt the system's vocabulary to that of the user (red circle), and those which aim to adapt the user's vocabulary to that of the system (yellow circle).

Figure 6.1: Synchronising the user and the system vocabularies

In what follows we first discuss language restriction (Section 6.1), followed by the effects of feedback (Section 6.2) and of guided interfaces (Section 6.3). In addition, as discussed in Section 5.10, the system vocabulary is often extended from external sources (e.g., WordNet). For more personalised systems, this extension can be user-centric, as the user's vocabulary can be used to extend the system vocabulary (Section 6.4). Once the user is familiar with the system vocabulary, the opposite adaptation needs to take place, as the user's vocabulary needs to be brought in line with that of the system. Methods for assisting the user in that adaptation are also used to solve the ambiguity problem and are discussed in Section 6.5.

6.1 Language Restriction

As we have previously discussed in Section 1.2, a Controlled Natural Language is a subset of a natural language that includes a certain vocabulary, grammar rules and restrictions that have to be followed. The biggest challenge when designing a CNL is restricting the natural language in a way that keeps it intuitive and does not require extensive training for the end user. However, applications in industry prove that CNLs can indeed be learnt and used in practice. For example, AECMA Simplified English [Unwalla, 2004] has been used by the aviation industry since 1986.

Another example comes from CPL's evaluation [Clark et al., 2007]: although users had to be very familiar with CPL in order to use it successfully, they did not have problems working with its grammar restrictions. Several failures occurred due to using language which was not explicit enough for the system (i.e. common-sense facts were not made explicit). The conclusion in Clark et al. [2007] was that the system would benefit from showing the user the derived query interpretation so that any mistakes could be corrected. As is pointed out in Clark et al. [2005, p.510], "a challenge for languages like CPL is to devise methods so that these corrective strategies are taught to the user at just the right time e.g., through the use of good system feedback and problem-specific on-line help".

According to Ogden and Bernick [1997], constraining a user to a limited vocabulary and syntax is inappropriate, as users should be free, but the constraints should come from the task and the domain instead. However, allowing the task and the domain to constrain the language still does not prevent the user from creating ambiguous queries. As natural language itself is ambiguous even in human to human communication, controlled languages have a role to play in reducing the ambiguity. The main drawback of CNLs is their rather steep learning curve. However, according to Zolton-Ford [1984], a limited vocabulary, coupled with a feedback mechanism, means easy training from an end user’s point of view.

An alternative approach to restricting the vocabulary is relaxing the queries to support keyword-based input in addition to full-blown, grammatically correct questions. This allows some flexibility: if the user is not familiar with the full expressiveness of the controlled language, he can try using keywords, while more advanced users have the option of full-blown questions. An example of such a system is NLP-Reduce [Kaufmann et al., 2007], which would give the same result for both the capital France and the what is the capital of France? queries. However, as we will discuss later, allowing users to type in keywords can lead to misconceptions about the system, as users come to expect the functionality of Information Retrieval engines such as Google; moreover, expressing an information need precisely through a set of keywords might be difficult.

6.2 Feedback

Showing the user the system's interpretation of the query in a suitably understandable format is called feedback. Several early studies ([Zolton-Ford, 1984], [Slator et al., 1986]) show that after getting feedback, users become more familiar with the system's interpretations and usually then try to imitate the system's feedback language. In other words, returning feedback to the user helps them understand how the system transforms their queries, motivating them to use similar formulations and to create queries which are understandable to the system. Query formulation and user behaviour have long been researched in Information Retrieval (IR): during the search process, the user performs several actions [Stuckenschmidt et al., 2004]:

• poses a query based on an existing information need,
• after the retrieved results are shown, decides to either:
  – stop, or
  – reformulate the query in a way which promises to improve the result.

This is repeated until the perfect answer (relevant documents in the case of IR systems) is found. As this traditional model is adequate only for simple cases, a so-called berry-picking model [Bates, 1989b] has been proposed, where users take some of the results and move on to a different topic area. This model assumes that the user starts off with a query on a particular topic and, based on the results, can either explore the result set or re-scope the search by re-defining the information need and posing a new query. Although different users behave differently during the search process, it has been shown (see [Stuckenschmidt et al., 2004]) that the majority prefer interactive methods, where the system performs the search, gives feedback to the user and lets him decide about the next steps.

In the evaluation of Querix and three other interfaces for the Semantic Web [Kaufmann et al., 2006], the system was preferred over all the others because it returned the answer in the form of a sentence, in contrast to the list of answers returned by the other three systems. For example, the question How many rivers run through Colorado? was answered by Querix as There are 10, while the other three systems returned a list of rivers and the number of results found. Because of the way Querix replied to questions, users had the impression that the system really understood them, and trusted it more.

CNL systems usually implement some form of feedback. For example, when using ACE View in Protégé, typing in Serbia is a country makes the system show the feedback message: "The sentence was successfully parsed". If the input sentence is changed to Serbia is a country in Europe, the feedback message will be: "The snippet contains ACE snippet errors". It will then highlight Europe, which 'confused' the ACE parser, and it will also show the <> sign at the position where the parsing failed; for example, it will show Serbia is a country <> located in Europe. However, in order to refine this query and get the answer, the user needs to find out which constructions are supported and understandable by the ACE parser. The user might be able to construct the query in a different way if he knew how, or if he had more descriptive feedback (for example, whether the above sentence uses grammar not supported by ACE, or whether the lexicon is the problem). However, feedback messages are usually automatically generated and are based on very simple rules, such as which part of the statement or query is recognised and which is not.

Another example is CPL: in order to correctly formulate questions using CPL, users need to know "a bag of tricks" [Clark et al., 2007]. That is one of the reasons why an interactive process of question-asking was introduced in CPL. After the user poses the question, their Advice System detects CPL errors and returns reformulation advice. There are 106 different advice messages, triggered when the user's question contains grammar that is outside the scope of CPL although correct in English, or when the user omits words, such as a unit of measure after a number. In CPL, the feedback does not contain the input text from the user; rather, the system detects the error and gives advice from a static list of feedback sentences. As Clark et al. [2007] point out, automatic rewording would be very challenging, especially with longer, complex sentences. In addition to the Advice System, an Interpretation Display System is implemented, which shows the user the system's interpretation of the question. After the question is posed, the system generates a set of English paraphrases and shows them to the user. In addition, it generates a graph where nodes are objects or events from the question and arcs are relationships between them. If the user detects an error in the graph or in the English paraphrases, it is possible to rename nodes and arcs, or to reformulate the whole question and inspect the system's interpretation again. According to the evaluation presented in Clark et al. [2007], this graphical representation was well received by users.

Although it might be annoying for users, it is not unusual for systems to fail to answer a question due to unsupported query syntax, even though the same query could be answered if reformulated. Adding extra linguistic coverage is not always easy, due to the need to balance expressiveness with ambiguity. For instance, the evaluation of AquaLog with the KMI ontology [Lopez et al., 2007] shows that 27.53% (19 of 69) of the questions could have been handled correctly by AquaLog if reformulated, which means that 65.51% of failures could have been avoided. Reformulating in this case entails stating the queries in AquaLog's supported language, so that unsupported linguistic constructions, nominal compounds, and unnecessary function words like different, main, or most of are avoided.

A closer look at users' queries and behaviour during the evaluation of CPL presented in Clark et al. [2007] revealed that users rarely "got it right" the first time. The average number of attempts at reformulating a query, before either getting a satisfactory answer from the computer or giving up, was 6.3 and 6.6 in physics and chemistry respectively, while for biology it was only 1.5; this is related to the complexity of the questions: the most common questions in biology were very simple, such as "what is an X?", in contrast to the "story" questions posed in physics and the algebraic questions posed in chemistry. Further analysis of the frequency and types of actions taken to reformulate queries showed that the biggest problem for users was to find the right wording that enabled the system to answer the question. For example, in chemistry one of the questions was whether a given substance is insoluble. Users tried several words to express solubility: soluble, dissolve, solution, insoluble, until finally hitting on solubility, for which the system was able to give the answer.

Summary: By providing the user with feedback in the form of the system's interpretation of the query, users can learn how to generate queries more efficiently. For example, showing the user which words were understandable and which were not helps users familiarise themselves with the system's vocabulary more quickly and avoid repeating mistakes. In cases where the system is not able to interpret the query, the system could provide the user with a suggestion of how the query could be reformulated in order to be answered (e.g., by showing examples of supported types of queries adapted to the particular domain).

6.3 Guided Interfaces

Guided interfaces support the user by suggesting queries which are supported by the system. In Bechhofer et al. [1999], relations between concepts are used to assist users by expressing what it is possible to ask about the concept which is typed in; this way, only meaningful questions should be posed. According to Bullock [1999] there is a need for lucidity in information systems: a system should supply the user with an idea of what is available and which next steps can be taken. Bechhofer et al. [1999] describe a tool for assisting users in formulating queries, driven by the content of the conceptual model. The tool uses constraints known as sanctions, which are added to the ontology and which describe the meaningful compositions that can be built. Sanctions provide the lucidity, or guidance, needed for creating suggestions. The suggested manipulations are:


• restriction – specialising the query by adding more criteria,
• widening – removing criteria from a composite query,
• replacement – replacing the query by a more specific one, and
• sibling replacement – replacing subqueries with sibling concepts.

Hallett [2006] presents an NLIDB that assists users by guiding them through the supported queries. This guided interface automatically infers the set of possible queries that can apply to a given database and generates query frames. A query frame is a system-generated query which contains unfilled WYSIWYM anchors. An anchor, in WYSIWYM terminology, is a span of text in a partially formulated query that can be edited by the user to expand a concept. The evaluation results are very encouraging: according to the usability evaluation, "users can learn how to use the interface after a very brief training and succeed in composing queries of quite a high level of complexity" [Hallett, 2006][p.14]. To evaluate the coverage of their system they used 250 questions from GeoBase (also known as the Mooney GeoQuery dataset used in Tang and Mooney [2001] and Popescu et al. [2003]). Their system could generate query frames for 58% of the questions; the remaining 42% could not be handled (questions requiring inferences over numerical types, such as which is the highest point in Alaska or what is the combined area of all 50 states?). However, whenever a query frame could be generated, the answer was always correct. Although their system is limited because the user cannot pose free-text queries, a precision of 100% on the questions it could handle is encouraging.

A similar approach was applied in an NLI for querying ontologies – Ginseng [Bernstein et al., 2005]2. This system allows access to knowledge bases in OWL format through NL. The evaluation of Ginseng resulted in 92.8% precision and 98.4% recall, which indicates that, although the user is limited in the way questions can be asked, this is counter-balanced by high performance thanks to the offered support. The evaluation of its descendant GINO [Bernstein and Kaufmann, 2006] with six users shows that the use of guided entry overcomes the habitability problem. The GINO system offers

2More details: http://www.ifi.uzh.ch/ddis/research/semweb/talking-to-the-semantic-web/ginseng/

guidance to the users as they formulate NL queries step by step, ensuring that only valid queries are posed.

Another option for guiding the user through the domain and the available concepts is auto-completion. Traditional auto-completion is based on matching input strings against a list of words in a vocabulary, sorted by different criteria, e.g., popularity, user preferences, etc. For ontology-based systems, this concept can be extended to the semantic level, so that in addition to traditional string similarities, relations between ontology resources are used in order to predict the next valid entry. Such semantic auto-completion is described by Hyvönen and Mäkelä [2006] and applied in information retrieval, specifically in multi-faceted search. For example, the semantic portal MuseumFinland3 uses semantic auto-completion on request. The search keywords are matched not only against the actual textual item descriptions, but also against the labels and descriptions of the ontological categories with which the items are annotated and which are organised into view facets. As a result, a new dynamically created facet is shown on user request, containing all categories whose name or other configurable property values, such as alternative labels, match the keyword. For example, if the user types in EU countries, the system shows a list of countries in the dynamically generated facet, from which the user can choose.
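A minimal sketch of the difference between string-based and semantics-aware auto-completion is given below (Python; the tiny ontology, labels and relations are invented purely for illustration):

    # Hypothetical sketch: string-based vs. semantics-aware auto-completion.
    labels = {"City": "city", "Country": "country", "locatedIn": "located in"}
    relations = {("City", "locatedIn", "Country")}

    def string_complete(prefix):
        """Classic completion: match the typed prefix against known labels."""
        return [r for r, label in labels.items() if label.startswith(prefix.lower())]

    def semantic_complete(resource):
        """Semantic completion: suggest only entries that can validly follow."""
        suggestions = []
        for subj, prop, obj in relations:
            if subj == resource:
                suggestions.append((labels[prop], labels[obj]))
        return suggestions

    print(string_complete("c"))          # -> ['City', 'Country']
    print(semantic_complete("City"))     # -> [('located in', 'country')]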

Summary: With guided interfaces, the user is limited because the number of supported questions is limited, but the performance is rather high: once the user formulates the query, it is very likely that he will get the correct answer. A more flexible option is the use of semantic auto-completion, which, contrary to fully guided interfaces, allows more freedom for the user.

6.4 Extending the Vocabulary

As discussed in Section 5.10, many NLIs use external vocabularies such as WordNet in addition to the domain lexicon (for example, to retrieve synonyms). The user's vocabulary can be a good source for extending the system vocabulary as well.

3www.museeosuomi.fi


AquaLog [Lopez et al., 2007] is backed by a learning mechanism, so that its performance improves over time in response to the vocabulary used by the end users. As already discussed in Section 5.3, when porting AquaLog to another domain it is possible to configure its lexicon by defining "pretty names". At runtime, when the system's interpretation of the user's input is ambiguous, it asks the user to help by choosing from several possible interpretations. The user's selection is then saved as a "pretty name" for future disambiguation of the same type. For example, in the evaluation of AquaLog it was noticed that when referring to the relation works-for users chose words such as is working, collaborates, or is involved in. Since the system does not know that collaborates can be interpreted as referring to the property works-for, it prompts the user with the available options and 'learns' the user's choice. In addition to learning a new term, AquaLog records the context in which the term appeared. The context is defined by the name of the ontology, the user information, and the arguments of the question. The arguments of the question are usually the two arguments of the triple, namely two classes or two instances in the ontology connected by a relation.
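A highly simplified sketch of such a learning step follows (Python; the structure of the learned record is hypothetical and does not mirror AquaLog's implementation):

    # Hypothetical sketch of learning a lexical mapping together with its context.
    learned = []   # list of (phrase, relation, context) records

    def learn(phrase, chosen_relation, ontology, user, arguments):
        """Store the user's disambiguation choice for reuse in similar contexts."""
        context = {"ontology": ontology, "user": user, "arguments": arguments}
        learned.append((phrase, chosen_relation, context))

    def lookup(phrase, arguments):
        """Reuse a learned mapping when the same phrase links similar arguments."""
        for p, relation, ctx in learned:
            if p == phrase and ctx["arguments"] == arguments:
                return relation
        return None

    learn("collaborates", "works-for", "KMI", "user A", ("Person", "Organisation"))
    print(lookup("collaborates", ("Person", "Organisation")))   # -> "works-for"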

To evaluate how the Learning Mechanism (LM) affects the overall system performance and the number of user interactions, two experiments were conducted, with the results reported in Lopez et al. [2007]. First, AquaLog was trialled with the LM deactivated. In the second experiment two iterations were performed: in the first, the LM was activated with an empty database of learned concepts; the second iteration was performed over the results obtained from the first.

The results show that using the LM improves performance from 37.77% of answered queries to 64.44%. Queries that could not be answered automatically (i.e. that required at least one iteration with the user) remain quite frequent (35.55%) even when the LM is used, because the LM is applied only to relations, not to terms. Overall, the number of queries that required 2 or 3 iterations is dramatically reduced with the LM, which improved performance even after the first iteration, from 37.77% to 40%, as it uses the notion of context to find similar but not identically learned queries. This means that the LM can help disambiguate a query even the first time that query is presented to the system.

Summary: Although external sources such as WordNet can enrich the system's domain vocabulary, a user-centric vocabulary can play a significant role in increasing the performance of the system, as shown in the evaluation of AquaLog. In addition to maintaining the user's vocabulary, AquaLog's approach can be extended in several directions:

• to explore a user-contributed vocabulary which is not personalised but shared among all users. For example, if user A asks Who works for the University of Sheffield?, the system can map The University of Sheffield to Organisation and Who to Person, but the construction works for could be unknown and not similar to any of the existing ontology relations between the classes Person and Organisation. If there are several relations between these concepts, the system can prompt the user (as would be the case with AquaLog) to choose from the list of available options. If the user chooses the relation employedIn, the system will remember that works for can be related to the relation employedIn and will add this to the user-centric vocabulary. Now if user B asks the same question, and there is no entry for the works for construction in his user-centric vocabulary, the vocabulary of user A could be used to automatically give the answer to user B, or to rank the employedIn relation above all others suggested by the system; a similar feature is discussed in Lopez et al. [2007] as future work for their learning mechanism, where they propose grouping the vocabulary by user profiles;

• hiding complexities: recommendations offered to users are usually the names of potential ontology resources, e.g., names of properties, and these sometimes do not sound natural. For example, properties usually consist of at least two words, such as hasBrother or has-brother. Simple processing of such names can help in deriving more natural wording such as has brother (see the sketch after this list). However, the way ontology resources are named is not standardised, and this feature would have to be customised for each system depending on the domain;

• learning could be applied to terms (classes and instances), not only to relations;

• to put users in control: to allow users to see and modify the created lexicon at any time.
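The sketch referred to in the second item above could look roughly as follows (Python; a heuristic illustration only, since naming conventions vary between ontologies):

    import re

    def humanise(resource_name):
        """Derive a more natural phrase from a property or class identifier."""
        # Split on hyphens/underscores and on lowercase-to-uppercase boundaries (camelCase).
        parts = re.split(r"[-_]|(?<=[a-z])(?=[A-Z])", resource_name)
        return " ".join(p.lower() for p in parts if p)

    print(humanise("hasBrother"))    # -> "has brother"
    print(humanise("has-brother"))   # -> "has brother"
    print(humanise("locatedIn"))     # -> "located in"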

6.5 How to Deal with Ambiguities?

Although controlled natural languages reduce ambiguities to some extent, some issues specific to the domain knowledge still remain. For example, if the knowledge base contains two instances of the class Person with the same name, e.g., Mary, the system is not able to predict which one the user is interested in. This problem is usually solved by using one, or a combination, of the following methods:

1. automatically resolving ambiguities: using heuristics and ontology reasoning to implement ranking algorithms;

2. clarification dialogues: by involving the user;

3. query refinement: in cases when the cause of ambiguity is a vague expression of the user’s information need.

6.5.1 Automatically Solving Ambiguities

The E-librarian system [Linckels and Meinel, 2007] uses a focus function algorithm in case of ambiguities. A focus function returns the best interpretation for a given word in the context of the complete user question. If more than one best interpretation is found, they are all shown, although experience with the system revealed that users generally enter simple questions for which disambiguation is normally successful. OntoNL is an ontology-based NLI for multimedia semantic repositories [Karanastasi et al., 2007]. This system combines domain knowledge with user profiles, both represented in standards such as MPEG-7 and TV-Anytime, to resolve ambiguities and rank results, thus avoiding clarification dialogues. Their system is domain-specific and oriented towards digital libraries.


The large amount of semantic resources available on the Web can be a reasonable source for automatic disambiguation. One example is IBM's Watson4, where DBPedia.org (among other resources) is used to calculate a confidence score based on which low-scored query interpretations are disregarded. Another example is PowerAqua, where WordNet, in combination with the relations found on the Semantic Web, is used for disambiguation. Firstly, the sense-based similarity matcher algorithm disregards ontology terms that are syntactically related, but not semantically equivalent, to the query terms. Semantically equivalent terms are those that appear as hypernyms or hyponyms of the given term, or hold an "is-a" relationship with the synset of the term, based on WordNet. Secondly, the relatedness between ontology terms is calculated based on an analysis of the existing taxonomy and the relationships between semantic resources. If the relatedness is low, the query interpretations are disregarded. As the domain knowledge grows, the task of automatically resolving ambiguities becomes more difficult, and often the only way to resolve them is by engaging the user through a clarification dialogue.

6.5.2 Clarification Dialogs

In case of ambiguities, Querix [Kaufmann et al., 2006] sends them to the user for clarification. In this process users need to disambiguate the sense from a menu of system-provided suggestions, in order to get better retrieval results. For example, if the user enters population size and the system cannot decide whether the user is interested in the property named population density or population, it will ask the user to choose between the two. Similar to Querix, the AquaLog system [Lopez et al., 2007] relies on clarification dialogues when ambiguity arises. In contrast to Querix, AquaLog is backed by the learning mechanism discussed earlier (see Section 6.4). In general, clarification dialogues can help users resolve ambiguities; however, if the suggestions provided by the system are not satisfactory, it is possible that the user's need was not expressed precisely enough in the query, which is the main prerequisite for retrieving relevant information from the knowledge

4http://www.watson.ibm.com

base (see Figure 6.2). In addition, the query must contain the lexicalisations which are found in the domain knowledge lexicon. This is why many existing NLI systems extend their lexicons by using external sources for detecting synonyms: the query posed by the user is very likely to contain words which do not exist in the lexicon, although words with equivalent meaning do.

Figure 6.2: Retrieving relevant results

According to Stojanovic [2005b], there is usually a gap between the information need and the query expressing that need, which is caused by "the usage of short queries, whose meaning can be easily misinterpreted". The indicator of this gap, which is called query ambiguity [Stojanovic, 2005a], can be reduced by the process of query refinement.

6.5.3 Query Refinement

Changing or refining the query in order to obtain more relevant results is called query refinement. When refining the query it is important to know the precise information need as well as which part of the query to change or refine [Stojanovic, 2005b]. Refining usually means adding more constraints to the query, until the quality of the results corresponds to the user's expectation. Librarian Agent [Stojanovic, 2005b], a system created to replace the human librarian in helping users find appropriate books in a library, uses the query refinement technique proposed by Stojanovic [2005a]. Librarian Agent measures query ambiguities with regard to the ontology structure (structure-related ambiguity) and the content of the knowledge base


(content-related ambiguity). Ambiguities are interpreted from the point of view of the user's need, and are implicitly induced by analysing the user's behaviour. Modelling a user's need is not trivial, especially when users are anonymous, as the model of a user's behaviour has to be developed implicitly, i.e. by analysing implicit relevance feedback, whose main purpose is to infer the information need from the user's interaction with the system [Stojanovic, 2005b]. The query refinement process is treated as the process of moving through the query neighbourhood in order to decrease its ambiguity with regard to the user's need [Stojanovic, 2005b]. Librarian Agent defines the query neighbourhood through identification of the query constraints and the ambiguity for each word. The query neighbourhood includes determining:

1. A more specific query. The query is refined so that the set of answers is more specific.

2. A more generic query. The query is refined so that the set of answers is bigger.

3. Equivalent query. The query is rewritten so that the returned results are the same, but this is initiated for other reasons e.g., optimising the execution time.

4. Similar queries. The query is refined so that its results partially overlap with those of the initial query.

The approach is evaluated with 20 questions which cannot be expressed precisely using the defined ontology vocabulary, but whose answers are contained in the information repository, e.g., find researchers with diverse experiences about the semantic web. The goal of the evaluation was to see how the effectiveness of ontology-based querying is affected when the query process is enhanced with the presented refinement facility. Six computer science students with little or almost no knowledge about domain ontologies, and with no knowledge of the system, were asked to retrieve resources for 10 questions in one session, using the two retrieval methods (with and without the refinement). Users were asked to explicitly confirm when they got relevant results.


For each search four measures have been considered: success, quality, number of queries, and search time. Results revealed that the success and the quality of the session were significantly higher (57/85.7%; 0.6/0.9), while the number of queries and the search time per session were significantly lower for the system with query refinement switched on (10.3/5.2; 2023/1203s). Stojanovic [2005b] concludes that if the system can discover and measure ambiguities in a query and support the user in resolving these ambiguities efficiently, the precision and recall of the retrieval process will increase.

Summary: In some cases it is not convenient for users to control the output, either because they are not interested in doing so, or because the system might have enough data to efficiently resolve ambiguities automatically. However, this is strongly related to the domain and the system functionality. The more specific the domain and the simpler the system, the more feasible automatic ambiguity resolution is. For more generic domains, a better solution would be engaging the user to resolve ambiguities through clarification dialogs. In cases of an imprecisely expressed information need, query refinement is likely to be a good solution. However, it is important to observe users, their actions and behaviour during the process of refinement.

6.6 Summary and Discussion

The design of habitable NLIs and the choice of methods to be used rely mainly on the targeted domain:

• Closed-domain NLIs usually work with one or several ontologies covering a narrow domain;

• Open-domain NLIs work with a set of ontologies covering various domains, ideally those available and published on the Web.

Closed-domain NLIs might benefit from guided interfaces, which usually have good performance; however, the queries are fully controlled by the system. This means that the user does not have the freedom to enter queries of any length and form. A more flexible way of guiding the user is using auto-completion, and this can also be applied to open-domain NLIs.


Another important factor when addressing habitability is the supported language: an NLI which supports a CNL usually has good performance, but the user has to learn the CNL. If the NLI supports a more flexible language, the performance is usually degraded. One of the reasons is ambiguity, which arises for one of the following reasons:

• The undefined information need: flexible language usually means supporting keyword-based queries, which are often not enough to express the need precisely. In this case query refinement could be used to derive similar queries, both more specific and more generic ones. Ontologies play a significant role in guiding the query refinement process, e.g., by defining a set of similar queries.

• The broad domain: for closed-domain NLIs, it is possible to resolve ambiguities automatically by calculating the rankings of the query interpretations based on ontology reasoning. For open-domain NLIs, although a ranking mechanism is advisable, it might be challenging to resolve ambiguities automatically and the user might need to be engaged through clarification dialogues to choose between system-provided options. The challenge in this case is generating a list of options for the user that is relevant but not too long.

Methods which can increase the habitability of NLIs irrespective of the domain and the supported language are:

Feedback. By providing feedback to the user, i.e. showing the system's interpretation of a query, the system can help the user learn how to formulate queries efficiently. Moreover, an NLI system should make the user aware of the type of failure when it happens, by showing which habitability domain was affected:

• conceptual failure: the knowledge is not available in the system;
• lexical failure: the user should use different words when asking the question;
• functional failure: the user might need to split the query into several simpler ones;


• syntactic failure: the user should be encouraged to reformulate the question, as the grammar might not be supported by the system but an alternative way to ask the same question and get a result exists.

The real challenge here is how to model feedback so that it does not impose too much overhead on the user.

Extending vocabulary. Although external sources such as WordNet can enrich the system vocabulary, as well as the lexicon which is created individually for each domain, the user-contributed vocabulary can play a significant role in increasing the performance of the system over time. In contrast to a personalised vocabulary, which treats each user as an individual, it would be interesting to simplify this approach and share the vocabulary among different users. In addition to maintaining the user-contributed vocabulary, this approach should allow users to see and modify the created lexicon at any time.

From the discussion in this and the previous chapter, the biggest challenges when building NLIs to ontologies can be summarised through the following requirements:

1. Portability with minimal customisation.

2. Ambiguity: unambiguous transformation from NL into a formal query.

3. Expressiveness/robustness: avoiding the use of a controlled language, but allowing users to enter queries of any length and form.

4. Minimum training for the user and keeping the supported language intuitive.

5. Assisting the user in the process of query construction.

6. Allowing users to control the output by providing a mechanism to follow the system's transformations from the input (query) to the output (result), so that they can disagree with the system's decisions and refine the query at certain points.


7. Hiding complexities of the queried knowledge structure: showing results without exposing users to the underlying complexities of the structured knowledge.

In the next chapters, we present how we have addressed these challenges through the design of two systems:

• QuestIO (Question-based Interface to Ontologies) builds the domain lexicon automatically from the semantic resources, and tries to automatically interpret the user's query based on internal ranking mechanisms which rely on ontology reasoning.

• FREyA (Feedback, Refinement and Extended vocabularY Aggregation) is an interactive system which explores:

1. Feedback, which refers to showing the user query interpretations. First experiments are presented in Chapter 8, where the approach of QuestIO is extended with an interactive method of showing the query interpretations to the user and allowing them to choose the correct one. This approach is evaluated with users, and the modifications which are adopted are presented in Chapter 9.

2. Refinement in FREyA (described in Chapter 9) refers to resolving ambiguities which arise due to broad domain coverage. Ambiguities are resolved by engaging the user with clarification dialogs.

3. Extended vocabulary is described in Chapter 9, where in addition to the automatically generated lexicon (as in QuestIO), the user's vocabulary and synonym detection coupled with a learning mechanism are used to improve the performance of the system.


Part III

Building Natural Language Interfaces to Conceptual Models


Chapter 7

QuestIO

In this chapter1 we present the Question-based Interface to Ontologies and its evaluation. The Question-based Interface to Ontologies (QuestIO) system2 translates

1 The topic of this thesis originated as a part of my work within the TAO (Transitioning Applications to Ontologies) project. The methodology used in TAO is to be published in H. Wang, D. Damljanovic, T. Payne, N. Gibbins, and K. Bontcheva: Transition of Legacy Systems to Semantically Enabled Applications: TAO Method and Tools. Semantic Web Journal, IOS Press, 2011, where my main contribution is writing about ontology learning tools in Section 2.1.2, which goes beyond the scope of this thesis, and the implementation and writing about semantic annotation and querying ontologies using Natural Language in Section 2.1.3 – that section is a summarised version of Sections 7.1 and 7.2 in this chapter. Section 3 in the paper is by and large my contribution, which includes the evaluation described in Sections 7.4.1 and 7.5.6 in this chapter. The rest of the paper was written by the first author, while the other authors read and provided suggestions for improvements of the work.
2 The initial idea about the work described in this thesis was presented publicly for the first time in D. Damljanovic: Natural Language Queries for Enhanced Knowledge Access, Summer School on Multimedia Semantics Analysis, Annotation, Retrieval and Applications (SSMS'07), Glasgow, UK, July 15-21, 2007, and included the description of CLOnE QL. In 2008, CLOnE QL was renamed to QuestIO before it was published in
• V. Tablan, D. Damljanovic, K. Bontcheva: A Natural Language Query Interface to Structured Information. In Proceedings of the 5th European Semantic Web Conference (ESWC'08), Tenerife, Spain, June, 2008, and
• D. Damljanovic, V. Tablan, K. Bontcheva: A Text-based Query Interface to OWL Ontologies. In: 6th Language Resources and Evaluation Conference (LREC'08), Marrakech, Morocco, ELRA, May, 2008.
My contribution to the former paper was writing the initial paper draft. The evaluation was performed by K. Bontcheva and V. Tablan with my assistance and is not included in this thesis. In terms of implementation of the described system, I had initial guidance from V. Tablan, especially when developing the first version of the OntoRoot Gazetteer. He also contributed the backtracking algorithm which became part of QuestIO. With regard to the latter paper, my contribution was writing it in full along with the implementation of the system and the evaluation section. V. Tablan and K. Bontcheva read and commented on the pre-final version of the paper. Sections 7.1, 7.2, 7.3, 7.4.1, and 7.4.2 are updated versions of the latter paper.


Natural Language queries to SeRQL (Sesame RDF Query Language) queries3, which are then executed against the given ontology/knowledge base in order to return results to the user. In this chapter, we give details of QuestIO's design, implementation, and evaluation. First, we present the initialisation of the system, which is performed automatically from the ontology (Section 7.1). Details of the runtime processing of the query are described in Section 7.2, followed by the language coverage supported by QuestIO in Section 7.3. Section 7.4.1 presents the evaluation of correctness and language coverage, Section 7.4.2 presents the evaluation of portability and scalability, while in Section 7.5 results from the user-centric evaluation are discussed.

7.1 Building the Domain Lexicon

To initialise the system automatically, the ontology resources (e.g., classes, instances, properties and property values) are preprocessed, and any human-understandable lexicalisations are extracted. To achieve this, a list of the following is extracted first:

• names of all ontology resources, i.e. fragment identifiers4, and
• assigned property values for all ontology resources (e.g., label and datatype property values).

Each item from the list is further processed so that:

• any name containing a dash ("-") or underscore ("_") character is processed so that each of these characters is replaced by a blank space. For example, Project_Name or Project-Name would become Project Name;

• any name that is written in camelCase style is split into its constituent words, so that ProjectName becomes Project Name.

3 At the time of the initial implementation of QuestIO, we were using GATE ontology-based components which worked with Sesame 1.x and OWLIM 2.8.4, where SPARQL was not supported.
4 An ontology resource is usually identified by a URI concatenated with a set of characters starting with '#'. This set of characters is called a fragment identifier. For example, if the URI of a class is http://gate.ac.uk/ns/gate-ontology#POSTagger, the fragment identifier will be POSTagger.

Each item from this list is analysed separately by the Onto Root Application (see Figure 7.1). The Onto Root Application is a pipeline of several shallow language processing modules provided by GATE [Cunningham et al., 2002]. It first tokenises each list item, then assigns part-of-speech and lemma (i.e. root) information to each token. It is this lemma or set of lemmas which is then added to a dynamic gazetteer list (Ontology Resource Root Gazetteer).

Figure 7.1: Building the domain lexicon from the ontology

For instance, if there is a resource with a fragment identifier ProjectName, with assigned property rdfs:label with value project names, the created list before executing the Onto Root application will contain the following strings:

• ProjectName as a fragment identifier,
• Project Name as the split fragment identifier,
• project names as the value of rdfs:label.

Each of the items is then analysed separately and the results will be:


• For ProjectName and Project Name the output will be the same as the input, as the lemmas are the same as the input tokens.

• For project names the output will be the set of lemmas from the input, resulting in project name.

A dynamic gazetteer list is created directly from the processed ontology resources and is then used by the subsequent components in the process of query processing. It is essential that the gazetteer list is created on the fly, because it needs to be kept in sync with the ontology, as the latter changes over time.
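To make the lexicon-building step above concrete, the following is a minimal Python sketch of the name normalisation and lemma extraction (the actual system is a GATE pipeline; the lemmatise function below is only a crude stand-in for the GATE morphological analyser, and the case-insensitive comparison is an assumption for illustration).

    import re

    def normalise_resource_name(name):
        """Turn a fragment identifier or label into a space-separated phrase."""
        # Replace dashes and underscores with blank spaces.
        name = name.replace("-", " ").replace("_", " ")
        # Split camelCase names such as ProjectName into Project Name.
        name = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", name)
        return re.sub(r"\s+", " ", name).strip()

    def lemmatise(token):
        # Stand-in for the GATE morphological analyser used by the
        # Onto Root Application; a trivial plural stripper, for illustration only.
        return token[:-1] if token.endswith("s") and len(token) > 3 else token

    def gazetteer_entries(fragment_id, labels):
        """Collect the lexicalisations added to the dynamic gazetteer list."""
        items = [fragment_id, normalise_resource_name(fragment_id)] + list(labels)
        entries = set()
        for item in items:
            entries.add(" ".join(lemmatise(t.lower()) for t in item.split()))
        return entries

    # Reproduces the ProjectName example: fragment identifier plus rdfs:label value.
    print(gazetteer_entries("ProjectName", ["project names"]))
    # the set contains 'projectname' and 'project name'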

The overall performance of QuestIO is directly proportional to the quality of the formal descriptions residing inside the knowledge repository: the more human-understandable descriptions are attached to the semantic resources, the richer the lexicon, and hence the better the coverage of the system. However, too many identical descriptions (e.g., identically named entities of different types, such as people who have the same names as locations) might cause low performance due to the high risk of ambiguity. In other words, we can expect a reasonable performance only with very specific domains and manually crafted ontologies where ambiguities are not common. Even narrow domains can be problematic for QuestIO if the data contains a large number of duplicate names. For example, in the domain of music, a group called The Who with the label Who might be misleading for questions starting with Who, such as Who are the members of the Who?, in which case the gazetteer will mark both the WH-phrase Who and the proper noun The Who as referring to the same group, whereas the first annotation should ideally be disregarded. In general, the performance is expected to be better for semantic resources where the TBox is larger in comparison to the ABox. This is because the TBox defines concepts and relations, and in that context ambiguity is avoided if the conceptualisation is carefully designed. The ABox contains individuals that can be named ambiguously, and this can be expected to degrade QuestIO's performance, due to the fact that it performs only a shallow analysis and heavily relies on the lexicalisations attached to the semantic resources.


7.2 Query Processing

QuestIO is an Information Extraction application, based on GATE, which:

• takes a free text query and an ontology as input,
• transforms the query into a set of queries expressed in a formal language, e.g. SeRQL, and

• returns the set of results obtained after executing these queries against the given ontology.

Key components for the query interpretation and analysis are shown in Figure 7.2.

Figure 7.2: The QuestIO component diagram

7.2.1 Query Interpretation

Each user query is interpreted using the Query Interpreter in the User Interface. It is then analysed by two components, each of which represents a separate GATE pipeline application. Firstly, the Key Concept Identification Tool (KCIT) identifies key concepts inside the query (instances, classes, properties or property values from the ontology) by performing ontology-based lookup. We process the query with the same language processing resources used when extracting lemmas in the previous phase (Section 7.1),

so that we can then match the extracted lemmas from the ontology resources and the lemmas from the query. In this way, we are matching all existing morphological inflections of the relevant terms.

Secondly, the Context Collector collects all words from the query that are not recognised by KCIT, but could be useful in the process of generating the formal query:

• keywords such as in, of, from, etc. – used when analysing the direction of a supposed relation between the two concepts that they connect;

• keyphrases – these usually contain a few keywords, or a combination of a keyword and a verb, for example What are, What is or How many;

• chunks – any part of a query that is between two identified key concepts, used later in the relation ranking process.

To give an example, in the query What are the countries located in Europe?, KCIT annotates countries as a mention of the class Country, and Europe as an instance of the class Continent. What are is a keyphrase and in is a keyword, both of which will be annotated by the Context Collector. Additionally, the Context Collector extracts the text between all identified key concepts (i.e., the chunks), which is in this case located in.
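The sketch below is a simplified illustration of this split into key concepts, keyphrases, keywords and chunks (it is not the GATE-based implementation; the key-concept character offsets and the keyword/keyphrase lists are hard-coded for this one example).

    # A minimal sketch of context collection, assuming KCIT has already
    # annotated the key-concept spans in the query text.
    QUERY = "What are the countries located in Europe?"
    KEY_CONCEPTS = [(13, 22, "class Country"),     # "countries"
                    (34, 40, "instance Europe")]   # "Europe"
    KEYPHRASES = ["what are", "what is", "how many"]
    KEYWORDS = ["in", "of", "from"]

    def collect_context(query, key_concepts):
        ordered = sorted(key_concepts)
        # Chunks: the text between every pair of consecutive key concepts.
        chunks = [query[end:start].strip(" ?")
                  for (_, end, _), (start, _, _) in zip(ordered, ordered[1:])]
        lowered = query.lower()
        keyphrases = [k for k in KEYPHRASES if k in lowered]
        keywords = [w for w in KEYWORDS if w in lowered.strip("?").split()]
        return {"chunks": chunks, "keyphrases": keyphrases, "keywords": keywords}

    print(collect_context(QUERY, KEY_CONCEPTS))
    # {'chunks': ['located in'], 'keyphrases': ['what are'], 'keywords': ['in']}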

Next, the Query Analyser uses the identified key concepts from KCIT and all other concepts collected by the Context Collector to perform appropriate transformations, formulate SeRQL queries, execute them and send the results back to the User Interface, where the Result Formatter renders them in a user-friendly manner.

The Query Analyser, presented next, combines the key concepts with key phrases, keywords, and chunks, in order to infer any potential relations that are defined between these concepts inside the ontology.

7.2.2 Query Analysis

When all relevant data are collected, the Query Analyser (QA) performs the following steps (Figure 7.3):


Figure 7.3: The Query Analyser module

1. Filtering the identified key concepts. In human language it is possible to use the same expression in different contexts and express totally different meanings (see e.g. Church and Patil [1982]). When identifying key concepts, more than one annotation can appear over the same token or set of tokens, which needs to be disambiguated. The most common disambiguation rule is to give priority to the longest matching annotations. For example, there is an instance with label ANNIE POS Tagger inside the GATE knowledge base5. The GATE knowledge base contains instances of classes and relations between them based on the GATE domain ontology6. (Note: as part of our evaluation, derived from the TAO project, we use data about GATE such as system documentation, the mailing list and so on. This is not to be confused with our use of GATE to process the queries. See Section 7.4.1.) ANNIE POS Tagger refers to an instance of type POSTagger, which has label

5 http://gate.ac.uk/ns/gate-kb
6 http://gate.ac.uk/ns/gate-ontology


POS Tagger. If a query contains the text ANNIE POS Tagger, two annotations will be created. One will refer to the class POS Tagger, whereas the other one will refer to the instance of that class, namely ANNIE POS Tagger. As the annotation covering the ANNIE POS Tagger string inside the query is longer than the one covering POS Tagger, it is given a higher priority. This disambiguation rule is based on the assumption that longer names usually refer to more specific concepts or instances, whereas shorter ones usually refer to more generic terms.
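A minimal sketch of this longest-match rule (illustrative only, not QuestIO's actual code): when candidate annotations overlap, the one covering the longer span of the query is kept.

    # Sketch of the longest-match disambiguation rule: overlapping candidate
    # annotations are resolved in favour of the longest span.
    def filter_longest(annotations):
        """annotations: list of (start, end, resource) tuples over the query text."""
        kept = []
        # Consider longer spans first so they claim their region of the query.
        for start, end, resource in sorted(annotations,
                                           key=lambda a: a[1] - a[0],
                                           reverse=True):
            overlaps = any(start < k_end and end > k_start
                           for k_start, k_end, _ in kept)
            if not overlaps:
                kept.append((start, end, resource))
        return sorted(kept)

    # "ANNIE POS Tagger" (characters 0-16) vs the shorter "POS Tagger" (6-16):
    candidates = [(0, 16, "instance ANNIE POS Tagger"),
                  (6, 16, "class POSTagger")]
    print(filter_longest(candidates))
    # [(0, 16, 'instance ANNIE POS Tagger')]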

2. Identifying relations between key concepts. This step includes identification of defined ontology relations (properties) between identified key concepts. These relations are very important as they add descriptions to the concepts and define their behaviour by adding rules and constraints. They are retrieved through ontology-based reasoning (we used the OWLIM7 repository and its TRREE engine).

3. Ranking potential relations. Retrieved relations are then scored using a combination of three factors. One of them is based on string similarity and is called a similarity score. The other two relevant factors for scoring the properties are more complex and are based on the property's position in the hierarchy of concepts and properties: they are reflected by a distance score and a specificity score. The next paragraphs provide more information on these.

The Similarity score (simScore) reflects the similarity of the relation's name with the part of the query (a chunk) between identified concepts. The highest score is given to the relation that is the most similar to the chunk. For this comparison we use the Levenshtein distance metric. The Levenshtein distance between two strings is the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character. Scores vary in range from 0 to 1. For instance, if in a query list cities located in Europe, the identified key concepts are cities and Europe, the first

7http://ontotext.com/owlim/


referring to the class City and the latter referring to an instance of the class Continent, the text given between these concepts (located in) will be compared with the names of all defined properties between the identified concepts. If a property with name locatedIn is present in the ontology, the calculated similarity score between 'locatedIn' and 'located In' will be 0.8.

The Specificity score (specScore) reflects the position of the property in comparison to other existing properties in the ontology hierarchy. The intuition behind this score is that properties at the top level usually refer to generic terms, whereas those that are closer to the bottom refer to more specific ones. For example, a property hasBrother could be defined as a subproperty of the property hasSibling, thus being more specific. A higher score is given to the more specific properties (see Figure 7.5). The specificity score is calculated as follows:

• All properties are arranged in two columns, where the first column is a property, and the second column is its direct super-property. To illustrate this with an example, let us assume that an ontology has six properties defined as illustrated in Table 7.1.

Property   Direct super-property
p1         n/a
p2         p1
p3         n/a
p4         n/a
p5         p4
p6         p5

Table 7.1: Properties and their direct super-properties as defined in an ontology

• For each property, its distance from the furthermost super-property is calculated. Following our example, the distances are as shown in Table 7.2.

• The specificity score is calculated by dividing the calculated distance for each property by the maximum distance among


Property   Distance
p1         0
p2         1
p3         0
p4         0
p5         1
p6         2

Table 7.2: Distance of each property from its furthermost super-property

all properties at the ontology level. For the example in Table 7.2, the maximum distance is 2, which appears between p6 and p4. Hence, the specificity scores are as shown in Table 7.3.

Property   Specificity score (normalised distance score)
p1         0
p2         0.5
p3         0
p4         0
p5         0.5
p6         1

Table 7.3: Specificity scores
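The calculation can be sketched as follows (a simplified Python reconstruction, not the actual implementation); it reproduces the values in Tables 7.1-7.3.

    # Sketch of the specificity score: distance from the furthermost
    # super-property, normalised by the maximum distance in the ontology.
    SUPER = {"p1": None, "p2": "p1", "p3": None,
             "p4": None, "p5": "p4", "p6": "p5"}   # direct super-properties (Table 7.1)

    def distance(prop, super_of):
        d = 0
        while super_of.get(prop) is not None:
            prop = super_of[prop]
            d += 1
        return d

    def specificity_scores(super_of):
        dist = {p: distance(p, super_of) for p in super_of}      # Table 7.2
        max_dist = max(dist.values()) or 1
        return {p: d / max_dist for p, d in dist.items()}        # Table 7.3

    print(specificity_scores(SUPER))
    # {'p1': 0.0, 'p2': 0.5, 'p3': 0.0, 'p4': 0.0, 'p5': 0.5, 'p6': 1.0}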

As we can see from the example, the minimum specificity score is assigned to the properties that have no super-properties defined (p1, p3, and p4), while the maximum specificity score is assigned to the property with the largest number of super-properties defined (p6).

The Distance score (distanceScore) reflects the position of the domain and range classes of the property inside the ontology hierarchy. In an ontology, concepts are usually organised in a sub-class hierarchy where the most general ones are at the top, followed by more specific ones lower down. For instance, suppose that, unlike in the previous example, the two properties hasSibling and hasBrother are defined at the same level of the property hierarchy, and the latter has a more specific domain, which is Man, while the domain of the former is Person. If we assume that the class Man is defined as a


sub-class of Person in the ontology, hasBrother will be assigned a higher distance score, because properties with more specific domains and ranges are assigned a higher distance score. In essence, the distance score is an average of the specificity scores of all domain and range classes defined for the property. Hence, it is calculated as follows:

• For each property, find the domain and range classes.
• For each domain and range class, calculate the specificity score following the same principles described above for calculating specificity scores for properties, where the only difference is that we look at the super-classes instead of the super-properties.
• The distance score is the average of all specificity scores found.

The final score (FC) for each property is a weighted sum of the three measures and is calculated as shown in Equation 7.1:

FC = 3 ∗ simScore + 1 ∗ specScore + 1 ∗ distanceScore     (7.1)
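A sketch of how the three scores might be combined is shown below. The normalisation of simScore via difflib is only a stand-in for the Levenshtein-based similarity and is not necessarily the exact normalisation used by QuestIO; specScore and distanceScore are assumed to have been computed as described above, and the example values are taken from Table 7.4.

    # Sketch of the final score FC = 3*simScore + 1*specScore + 1*distanceScore.
    from difflib import SequenceMatcher  # stand-in; QuestIO uses Levenshtein distance

    def sim_score(chunk, property_name):
        # One plausible 0..1 string-similarity normalisation, for illustration only.
        return SequenceMatcher(None, chunk.lower().replace(" ", ""),
                               property_name.lower()).ratio()

    def final_score(sim, spec, dist):
        return 3 * sim + 1 * spec + 1 * dist

    print(round(sim_score("located in", "locatedIn"), 2))  # 1.0 under this stand-in
    # Using the per-property values reported in Table 7.4:
    for name, sim, spec, dist in [("locatedIn",    1.0, 0.0,  0.27),
                                  ("hasOldName",   0.0, 0.25, 0.54),
                                  ("hasMainAlias", 0.0, 0.25, 0.54)]:
        print(name, round(final_score(sim, spec, dist), 2))
    # locatedIn 3.27
    # hasOldName 0.79
    # hasMainAlias 0.79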

These metrics are ontology-motivated and are largely comparable to those used in the AquaLog system and many others.

4. Creating SeRQL queries. When all potential relations are scored and ranked, a formal SeRQL query is created dynamically. The key concepts referring to ontology resources such as classes, instances, or properties are combined together with the derived properties in order to generate the relevant query. However, before generating the formal query, the elements are organised into a special format:

CLASS OR INSTANCE - [PROPERTY - CLASS OR INSTANCE]

where the part marked between '[' and ']' can be repeated N times, where N is the number of key concepts in the query minus 1. The query interpretation with the highest score is then processed to generate the SeRQL query.


For example, if the query is What are the countries located in Europe?, the key concepts are countries and Europe. The first three query interpretations after ranking the relevant properties are:

COUNTRY locatedIn EUROPE (SCORE: 3.27)
COUNTRY hasOldName EUROPE (SCORE: 0.79)
COUNTRY hasMainAlias EUROPE (SCORE: 0.79)
...

The first interpretation has a very high score (3.27) due to the similarity score for locatedIn, see Table 7.4. The specificity score is lowest (0.0) for this property because there is no super-property defined for locatedIn in the property taxonomy. However, both hasOldName and hasMainAlias are subproperties of hasAlias, and hence have a higher specificity score. A similar trend exists with regard to the distance score. This is due to the specificity scores of the domain and range classes of these properties. Namely, all properties have the ENTITY class defined as their domain, while the range of locatedIn is higher in the hierarchy in comparison to the range of the other two properties, and hence the lower distance score, see Figure 7.4.

Property            locatedIn   hasOldName   hasMainAlias
Domain Class        ENTITY      ENTITY       ENTITY
Range Class         LOCATION    ALIAS        ALIAS
Specificity Score   0.0         0.25         0.25
Similarity Score    1.0         0.0          0.0
Distance Score      0.27        0.54         0.54
Total score         3.27        0.79         0.79

Table 7.4: Calculating individual scores for the candidate properties

The SeRQL query generated accordingly from the first-ranked interpretation is as follows:

SELECT c1, p2, i3
FROM {c1} rdf:type {},
     {c1} p2 {i3},
     {i3} rdf:type {}
WHERE p2= AND i3=

Figure 7.4: Position of the ENTITY, LOCATION, and ALIAS classes within the PROTON ontology. ENTITY is the most generic as it does not have any super-classes (owl:Thing excluded), followed by ALIAS, which has one super-class, LEXICAL RESOURCE, followed by LOCATION, which is the most specific as it has two super-classes (OBJECT and ENTITY)

The dynamic creation of formal queries makes QuestIO flexible, yet easily extendable towards any other formal query language e.g., SPARQL.
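The sketch below (hypothetical, in Python rather than the actual Java implementation, and with a made-up namespace URI) illustrates how a SeRQL query string in the shape shown above could be assembled once the top-ranked CLASS-PROPERTY-INSTANCE interpretation is known.

    # Hypothetical sketch of dynamic SeRQL generation from the top-ranked
    # interpretation COUNTRY -- locatedIn -- EUROPE. The namespace URI below is
    # invented for illustration; real queries use the URIs of the loaded ontology.
    def serql_for(property_uri, instance_uri):
        # Mirrors the shape of the generated query shown above: the WHERE clause
        # pins the property variable p2 and the instance variable i3 to the
        # URIs chosen by the ranking step.
        return ("SELECT c1, p2, i3\n"
                "FROM {c1} rdf:type {},\n"
                "     {c1} p2 {i3},\n"
                "     {i3} rdf:type {}\n"
                "WHERE p2=<" + property_uri + "> AND i3=<" + instance_uri + ">")

    NS = "http://example.org/geo#"   # made-up namespace, for illustration only
    print(serql_for(NS + "locatedIn", NS + "Europe"))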

Figure 7.5: Specificity score for properties

7.3 Coverage

Query processing in QuestIO starts by identifying key concepts in the query based on the lexicon extracted as described in Section 7.1. It performs exact string matching between the lemmas of the extracted lexicalisations and the lemmas found in the query. This improves the robustness of the system so that all morphological inflections, such as city and cities, are identified in the query, no matter which of the two forms exists in the repository. However, the system does not extend the vocabulary with any external resources such as WordNet and relies solely on the lexicalisations attached to the semantic resources.


As QuestIO works by recognising key concepts inside the query before performing other disambiguations and interpretations, the number of concepts in the query is not limited. As long as there are relevant relations between the concepts in the ontology, the required formal query can be created and the results returned. An example is shown in Figure 7.6, where three concepts are identified in the given query when run against the GATE knowledge base: parameters – referring to the ResourceParameter class, PR – referring to the ProcessingResource class, and ANNIE – referring to an instance of the GATE Plugin class. Potential relations are identified between these resources and the appropriate SeRQL queries are constructed.

Figure 7.6: Supporting relative clauses with QuestIO

Another important point is that QuestIO does not rely exclusively on the words in the query other than the key concepts (e.g., keywords). As long as it can recognise some key concepts, the remaining parts of the query are used to predict relations and filter the results, but are not required to classify

the type of the user query or the type of the formal query that must be generated.

To illustrate this we give an example. If, for instance, Europe is an instance of the class Continent, and Country is a defined class inside an ontology, the queries Which countries are located in Europe? and countries located in Europe will in most cases give the same results, regardless of the first query being in the form of a fully-fledged question and the latter being more similar to a concept-based search with important keywords only. Therefore, the same SeRQL query can be generated from a number of different natural language queries, thus providing the user with flexibility.

Furthermore, as long as there are relations between the identified key concepts in the ontology, the appropriate SeRQL query will be formulated, regardless of the number of key concepts identified in the query. For example, in the query which are the capitals of countries in Southern Europe, if the key concepts found are capitals, countries and Southern Europe, the resulting query will include all relations where capitals are related to countries (e.g., by the relation locatedIn) and these are in relation with (e.g., by the relation locatedIn) Southern Europe.

Similarly, the order in which key concepts are positioned does not affect the final result. For example, if the query List Processing Resources is run against the GATE knowledge base, all known instances of the class Processing Resource will be returned, because Processing Resources is identified as a key concept referring to the class Processing Resource. List Processing Resources in ANNIE would result in listing all processing resources (i.e. instances of the class Processing Resource) that are in a relation with the instance ANNIE: in the GATE knowledge base, ANNIE is an instance of the class GATE Plugin, and each such instance is related to several Processing Resources by the containsResource relation. As QuestIO does not require strict adherence to syntax, the same results would be given for the queries Processing Resources ANNIE and ANNIE Processing Resources.

Last but not least, QuestIO supports queries including conjunction and disjunction (see Figure 7.7). These types of queries are processed so that, first, concepts connected with 'and' or 'or' are grouped. Next, relations with other identified concepts are found for each member of the group separately.


In this case, SeRQL UNION is used to represent OR, and INTERSECT is used to represent AND.

Figure 7.7: Supporting queries expressing conjunction/disjunction with QuestIO

In the given example, the recognised concepts are parameters – referring to the class ResourceParameter, ANNIE POS Tagger – referring to the instance with this label, and Sentence Splitter – referring to the class with this label. ANNIE POS Tagger and Sentence Splitter are first grouped. Then, potential relations between ResourceParameter and each member of the previously created group are found, and SeRQL queries are created accordingly.
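A simplified sketch of this grouping and combining step is given below; the helper build_query is a placeholder for the SeRQL generation described in Section 7.2.2, not QuestIO's actual API.

    # Illustrative sketch: concepts joined by "and"/"or" are grouped, a query is
    # built for each member of the group against the remaining concept, and the
    # sub-queries are combined with INTERSECT (and) or UNION (or).
    def combine_queries(other_concept, group, connector, build_query):
        sub_queries = [build_query(other_concept, member) for member in group]
        operator = "INTERSECT" if connector == "and" else "UNION"
        return ("\n" + operator + "\n").join(sub_queries)

    def build_query(concept, member):
        # Stand-in for the real SeRQL generation step.
        return "<SeRQL selecting " + concept + " related to " + member + ">"

    print(combine_queries("ResourceParameter",
                          ["ANNIE POS Tagger", "Sentence Splitter"],
                          "and", build_query))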

To illustrate the flexibility of QuestIO's supported syntax, we give an example in Figure 7.8. For three different input queries, expressed in natural language or as incomplete queries, the result will be the same. The inverse property visible at the bottom of the figure, where the results are shown, indicates that the hasCapital property has French Republic as its domain and Paris as its range; however, due to the flattening effect of the graph, we show the results this way.

7.4 Qualitative and Quantitative Evaluation

In this section we present two kinds of evaluation performed with QuestIO. The first one is comparative, demonstrating the advantages/disadvantages


Figure 7.8: Expressiveness: QuestIO returns the same result for three dif- ferent variations of the input query

of QuestIO's language coverage compared to that of AquaLog [Lopez and Motta, 2004, Lopez et al., 2007]. The second one concerns performance, where the same queries are executed against two knowledge bases of different sizes, one being a subset of the other, thus demonstrating how the size of the knowledge base affects the query execution time.

7.4.1 Correctness and Coverage

We evaluate correctness using precision and recall measures (see Section 4 for definitions). We calculate these measures for the two systems, QuestIO and AquaLog, using the same evaluation conditions. We chose the AquaLog system for two reasons. Firstly, our main intention with this comparative evaluation is to assess the coverage of the language supported by QuestIO in comparison to a system with a deeper linguistic analysis. AquaLog satisfies this requirement. Secondly, as QuestIO does not require any customisation when porting from one system to another, the requirement for the system against which it was tested was also to be able to automatically load the desired ontology without any additional configuration.

Dataset

36 questions were collected from the GATE user mailing list, where users enquire about various GATE modules and plugins. These questions were run against the GATE knowledge base, which was created in the TAO8 project and is available from http://gate.ac.uk/ns/gate-kb. This ontology encodes the component model of GATE, the available plugins, the types of modules included in each of the plugins, the parameters for the different modules, and the like. The resulting ontology contains 42 classes, 23 object properties and 594 instances. Firstly, the questions are filtered so that those enquiring about information that is not in the ontology are excluded (14 out of 36 questions). The reason for excluding these questions is that they could not be answered correctly due to the lack of knowledge, nor could the relevant SPARQL queries be generated. Hence, it is our view that it would be inappropriate to report

8http://www.tao-project.eu

the results of the system based on them. On the other hand, if the new knowledge had been added to the ontology, the system might have returned the correct answer. However, we cannot judge this without running the experiment with the updated knowledge base. The remaining 22 questions are used in the experiment.

Results

Results are categorised as follows:

• correctly answered;
• correctly answered after reformulation: when the original question is reformulated, the returned answer is correct;

• partially correct: the generated queries missed out one of the required constraints, so the answer was less precise (these answers were not included when calculating the overall precision and recall values);

• failed queries: when either no query is generated or the generated query is not correct.

Further, we divided these 22 questions into two groups:

1. All questions that were malformed or are not supported by AquaLog [Lopez and Motta, 2004, Lopez et al., 2007], among which there were

• one conjunction query: What are the run parameters of POS Tagger and Sentence splitter?
• one query with brackets: Does GATE have a coreference resolution component (PR)?
• one query starting with How many...
• three queries not in the form of a full-blown question, for example I cannot get WordNet plugin to work.

2. Full-blown correctly structured questions (16 queries)


Regarding group number 1, out of the 6 questions QuestIO was able to correctly answer four, one question was answered partially, and one failed. Regarding group number 2, all questions were executed using the AquaLog system, and then using QuestIO. The results are shown in Table 7.5. QuestIO seems to perform better than AquaLog if we consider only correctly answered questions when calculating precision and recall (64.28% vs. 45.45%, and 56.25% vs. 31.25%, respectively). Reformulating the query in order to be answered with QuestIO did not affect its overall performance, whereas for AquaLog 3 reformulated queries were answered correctly afterwards (resulting in an increase in precision from 45.45% to 72.72% and in recall from 31.25% to 50%). For example, What are the values of the POS Tagger parameters? was correctly answered by AquaLog when reformulated to What are the parameters of the POS Tagger?, whereas both versions of the query were handled correctly by QuestIO.

Table 7.5: Results of running the same set of queries with QuestIO and AquaLog: c. correct – conditionally correct (correct after reformulation), p. correct – partially correct

            QuestIO       AquaLog
correct     9 (56.25%)    5 (31.25%)
c. correct  0             3 (18.75%)
p. correct  5 (31.25%)    3 (18.75%)
failed      2 (12.5%)     5 (31.25%)
precision   64.28%        45.45% / 72.72%
recall      56.25%        31.25% / 50%
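The figures in Table 7.5 are consistent with precision being computed over the answered questions only (correct / (total − failed)) and recall over all 16 questions (correct / total); the quick check below assumes that reading, so it is a sanity check rather than the thesis's own definition.

    # Quick check of the precision/recall figures in Table 7.5, assuming
    # precision = correct / (total - failed) and recall = correct / total.
    TOTAL = 16
    results = {"QuestIO": {"correct": 9, "c_correct": 0, "failed": 2},
               "AquaLog": {"correct": 5, "c_correct": 3, "failed": 5}}

    for system, r in results.items():
        answered = TOTAL - r["failed"]
        strict_p, strict_r = r["correct"] / answered, r["correct"] / TOTAL
        # Counting conditionally correct answers (after reformulation) as correct:
        lenient = r["correct"] + r["c_correct"]
        print(system, round(strict_p, 4), round(strict_r, 4),
              round(lenient / answered, 4), round(lenient / TOTAL, 4))
    # QuestIO 0.6429 0.5625 0.6429 0.5625
    # AquaLog 0.4545 0.3125 0.7273 0.5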

If we consider conditionally correct answers in the results, AquaLog outperforms QuestIO in terms of precision, while QuestIO outperforms AquaLog in terms of recall. This is due to the fact that if a question is formulated following AquaLog's supported language, the system seems to be more precise, and for the questions which it cannot answer correctly it fails more often than QuestIO, while QuestIO returns partially correct answers more often than it fails. This highlights the advantages and disadvantages of the languages supported by the two systems. AquaLog analyses the grammar of the question more carefully while requiring the user to know what kinds of questions are supported. QuestIO relies on shallow language processing

and pattern matching while not requiring strict adherence to syntax. In that regard, AquaLog seems to be more suitable for a question answering system which would be used for searching the knowledge base to find precise facts, while QuestIO seems to be better for browsing through the available knowledge and exploring whether different concepts are related to each other.

Among the questions that were answered by QuestIO at least partially correctly, while not being answered by AquaLog, the most common problem was that they were too long and complicated. Our system was able to recognise at least several concepts and generate SeRQL queries, even though they did not always give the most precise answer.

Moreover, the AquaLog system has a better interface than QuestIO. For example, in cases when the result is an ontology instance only, it is possible to examine all assigned properties for this instance. In QuestIO, it is only possible to see the name of the instance, and therefore the user has to browse the ontology itself in order to find more details (see the lower part of the previously discussed Figure 7.8 which demonstrates how QuestIO renders results). Additionally, in the case of disambiguation, AquaLog will prompt the user with a dialogue, whereas QuestIO would automatically derive the result which is ranked best, or in case of several equal scores, it would return all of them without requiring any input from the user.

7.4.2 Portability and Scalability

To test scalability, in another experiment we trialled QuestIO with two knowledge bases of different sizes, one being a subset of the other. We prepared a set of queries that return identical results when executed against the two knowledge bases, because the goal was to test how the size of the dataset influences the execution time of the query.

The smaller dataset is the Travel Guides Knowledge Base (KB), which contains instances and relations between them from the Travel Guides (TG) Ontology9. The TG ontology is an extension of the PROTON ontology10

9 http://goodoldai.org/ns/tgproton.owl
10 http://proton.semanticweb.org

and contains data about tourism destinations and tourist preferences (see [Damljanovic and Devedzic, 2008] for more details). The core of the Travel Guides Knowledge Base contains geographical data such as data about cities, countries and continents [Damljanovic and Devedzic, 2009]. This core was extracted from the KIM KB [Popov et al., 2004], which contains general data, specifically about organisations, people and locations, and has about 40 times more resources than the Travel Guides KB. The size of both knowledge bases is shown in Table 7.6. During this experiment, the two knowledge bases did not need to be changed or customised to work with the QuestIO system, thus demonstrating portability. The set of queries chosen were of different levels of complexity, where the complexity is directly proportional to the number of identified key concepts. The experiments were run on Ubuntu 9.10, on a computer with a dual Intel(R) Core (TM) 2 CPU [email protected] GHz and 8G of memory. We repeated all experiments five times and here we report the average numbers. As shown in Table 7.6, the initialisation time of QuestIO was much longer when used with the KIM KB. However, this step is performed only once.

Table 7.6: Knowledge Base Sizes

                         TG KB           KIM KB
Classes                  364             335
Object Properties        120             90
Datatype Properties      47              43
Instances                2816            122885
Total size (C + P + I)   3347            123353
Initialisation time      11.68 seconds   318.4 seconds

The execution times were between 2.85 and 59.09 times (average: 13.7) longer when executed with the KIM KB. Still, most of the queries were executed within a few seconds, excluding some exceptional cases. These are visible in Figure 7.9. The query is London capital of any country? took the longest, as the scoring of the properties was not very efficient and several queries returning no results were executed before the correct one was finally found on the fifth attempt.


Figure 7.9: The average execution time across five runs, using the TG and KIM knowledge bases


Table 7.7: Execution times for running the same set of queries with QuestIO. Shown times are in seconds.

Query                                       TG     KIM
countries located in Asia                   0.41   1.38
capitals of countries located in Asia       0.41   1.17
which are the political regions in Europe   0.50   5.3
is London capital of any country?           0.58   34.27
capitals of countries in southern Europe    0.65   2.8
capital country France                      1.06   4.52
Average execution time                      0.60   8.21

7.5 User-centric Evaluation

In order to test the usability of QuestIO, we developed a prototype for accessing documentation about the GATE software and then conducted a user-centric, task-based evaluation11.

7.5.1 QuestIO Prototype

The prototype interface had one text box for a query and a button. Each page always had a link for browsing the ontology, so that users who are more comfortable with the structured format could use it (see Figure 7.10). After the user submits a query, the results are shown in a document pane (the pane where the results are URLs of documents mentioning concepts from the query) and a refinement pane, which is used either to refine the set of returned documents in the document pane or to provide an answer.

11 The QuestIO prototype described in Section 7.5 is published in D. Damljanovic, K. Bontcheva: Enhanced Semantic Access to Software Artefacts. In Workshop on Semantic Web Enabled Software Engineering (SWESE'08) held in conjunction with ISWC'08, Karlsruhe, Germany, October, 2008. My contribution was the full implementation of the system and writing the paper. K. Bontcheva provided comments on the pre-final version of the paper. Section 4 in H. Wang, D. Damljanovic and J. Sun: Enhanced Semantic Access to Formal Software Models. In the Proceedings of the 12th International Conference on Formal Engineering Methods, Shanghai, China, November 16-19, 2010, is largely based on the OntoRoot Gazetteer and the KCIT tool described in Sections 7.1 and 7.2 in this chapter. My other contribution to that paper is the description of the ontology in Section 3.2, which goes beyond the scope of this thesis. H. Wang and J. Sun contributed the remaining sections.


Figure 7.10: Browsing ontology to find GATE developers

Figure 7.11 illustrates the refinement pane showing the answer to the user's question, which is expressed in natural language. However, if the user selects one of the named POS taggers and clicks Refine, the system will show only the documents with the selected type of POS tagger (see Figure 7.12).

Another example, for a question which is not expressed as a full-blown natural language question but rather as a fragment, is shown in Figure 7.13. The document pane lists all documents which contain the term GATE developer, based on the user's query. The refinement pane, however, lists the names of the developers as they appear in the ontology (see Figure 7.14), and the user can now select one of the names so that the list of documents is refined (see Figure 7.15).

Due to scalability issues with QuestIO working with SeRQL and OWLIM 2.8 (which relied on Sesame 1.x), we extended QuestIO to be able to translate NL to SPARQL, as the SPARQL implementation was more optimised and the answers were returned several times faster in comparison to SeRQL. This significantly improved the performance of QuestIO with the more-than-a-million-triples dataset, for which an answer had to be returned in under a second when working with real users. Hence, in the experiments, we used the version of QuestIO which worked with OWLIM 2.8.4, Sesame 1.2 and SPARQL.


Figure 7.11: Using the QuestIO prototype to find the answer to the query What types of POS Tagger are there in GATE?. The document pane lists documents about POS Tagger, while the refinement pane lists the answer to the question which can also be used to find more specific documents


Figure 7.12: Documents about ANNIE POS Tagger

Figure 7.13: Searching for GATE developers using QuestIO


Figure 7.14: The refinement pane showing the list of GATE developers


Figure 7.15: Refined results after selecting Adam Funk from the list of developers


7.5.2 Dataset

We have used the GATE domain ontology, which describes concepts about the GATE software, modules, components, and developers (see Section 7.4.1 for more details). This ontology has been used to annotate existing software documentation and code about GATE such as source code, forum posts, Web pages linked from http://gate.ac.uk, and any other documentation that was available. Annotations are exported into OWL statements, based on an Information Extraction ontology12. By exporting information about annotations and the documents in which they occurred, we could not only perform concept-based search, which would discover relations available in the GATE ontology (for example, between GATE components such as a GATE PR and a parameter), but also find the documents which contain annotations referring to certain GATE concepts. For example, we could find documents which mention POS Tagger. Table 7.8 describes the size of the dataset.

number of annotated documents related to GATE    10 070
number of generated annotations                  183 127
number of statements in the GATE ontology        3948
number of Information Extraction statements      1 138 847
total number of statements                       1 142 795

Table 7.8: Size of the dataset

7.5.3 Evaluation Scope

The aim of this qualitative evaluation was twofold. Firstly, we wanted to test the usability of QuestIO. Secondly, we wanted to validate whether software developers, who are often not experienced in Semantic Web technologies and formalisms, are able to easily find all information relevant to their tasks by using QuestIO. This gives us an insight into the feasibility of using such an interface in comparison to the baseline, which in our case was traditional ways of searching, e.g., Google, at least in the case of technologically literate subjects.

12We have used the Proton ontology: http://proton.semanticweb.org/


7.5.4 Experimental Setup

We carried out a complete counterbalanced13, repeated-measures14, task-based evaluation design (also called a within-subjects design): the same users interact with QuestIO and also use their current working practices and tools in order to complete a given set of tasks. With only two conditions in our case (baseline and QuestIO), one half of the subjects was asked to accomplish each task first using QuestIO, while the other half did it using traditional ways of searching before doing it with QuestIO. As 12 subjects were involved in the experiment, this enabled us to cover each possible order 6 times.
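As an illustration of complete counterbalancing with two conditions, the short sketch below enumerates the k! = 2 possible orderings and assigns 12 subjects to them in round-robin fashion, giving 6 subjects per ordering; the variable names are ours and not part of the study materials.

    # Complete counterbalancing for two conditions: 2! = 2 orderings,
    # each covered by 6 of the 12 subjects.
    from itertools import permutations

    conditions = ["baseline", "QuestIO"]
    orderings = list(permutations(conditions))   # two possible orderings
    assignment = {subject: orderings[subject % len(orderings)]
                  for subject in range(12)}      # subjects 0..11, round-robin

    for subject, order in assignment.items():
        print(subject, "->", " then ".join(order))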

As part of the experiment we collected relevant background data (e.g. experience with GATE, familiarity with semantic web concepts) by asking participants to fill in a multiple-choice pre-test. Participants then had to perform four tasks using both QuestIO and their usual working methods, while we recorded their sessions. After each task they were asked to fill in a questionnaire assessing various features of the QuestIO prototype. At the end, they had to fill in an overall user satisfaction survey and reply to questions related to the interface and the query language. The questionnaires used in this experiment are available in Appendix A.

Training

Before the experiment, each participant watched two videos:

• a video introducing the aim and purpose of the QuestIO prototype, mentioning its motivation and main objectives (1 min 52 secs)15

13In complete counterbalancing each of the possible orderings of the experimental conditions is equally represented. If k is the number of conditions, k! is the number of orderings (see Wuensch [2009]). The reason for using counterbalancing is to minimise sequence effects – results might be contaminated by practice effects (e.g. subjects get better at the task as time passes) or fatigue effects (e.g. subjects get tired as time passes). 14“In this design all subjects appear in both experimental conditions, so halving the number of subjects needed.” [Preece et al., 1994, p. 647] 15http://www.tao-project.eu/researchanddevelopment/demosanddownloads/ movies/gate-case-study-with-refs/gate-case-study-with-refs.html


• a video explaining the language supported by QuestIO, and also the prototype interface (5 min 37 secs)16

After watching the training videos and before being given their tasks, half of the subjects were shown a short introduction:

The goal of this prototype is to enable a single point of access to all knowledge contained in GATE software documentation and code: forum posts, source code, source documentation, Web pages and publications accessible from the GATE website. You will be asked to perform several tasks using first this prototype and then the tools you use on a daily basis, e.g. the GATE support page, Google, Sourceforge.

The other half received a similar introduction, with the difference that they were asked first to use the tools they usually use on a daily basis, and then the prototype.

Software

In order to collect and analyse the user interactions, we used the Morae software17, which comprises:

• Morae Recorder: to set up the study and record the user,

• Morae Observer: to observe the user while performing the tasks, and

• Morae Manager: to analyse the results.

What to measure?

As this is primarily a usability study, we measured:

• efficiency: time spent to complete the tasks using the two approaches – baseline and QuestIO;

16http://www.tao-project.eu/researchanddevelopment/demosanddownloads/ movies/prototype-tutorial/prototype-tutorial.html 17http://www.techsmith.com/morae.asp


• effectiveness: the percentage of completed tasks using the two approaches;

• user satisfaction: a SUS questionnaire as a standard satisfaction measure [Brooke, 1996] to test the usability of QuestIO.

In order to measure more subjective opinions, after finishing each task we asked each participant to fill in a questionnaire reporting on their experience with the prototype. As argued by Nielsen [1994], subjective satisfaction needs to be measured since it is an important usability attribute. We asked all subjects the following questions:

• Were the results returned by QuestIO relevant?

• Did they find the refinement pane helpful?

• Was ontology browsing helpful?

We also asked them to rank the overall experience with the QuestIO prototype in comparison to their everyday working practices, and to give their opinion on whether it was easy to formulate the queries required by QuestIO. The subjects were offered a pre-defined set of answers based on the Likert scale [Likert, 1932], ranging from 1 to 5 (where 1 is Strongly Disagree and 5 is Strongly Agree).

Sample

As the dataset was related to the GATE software, we asked the members of the GATE team in Sheffield to participate in our experiment. We are aware that this choice of evaluation participants could raise questions over their objectivity; however, our choice was limited by the requirement that all our users had knowledge of GATE. On the other hand, the proximity, high motivation and expertise of the group meant that we were able to obtain a rich set of evaluation results (see Section 7.5.6) which influenced our decisions related to the second part of our work (Chapters 8 and 9). Most of the subjects belonged to one or both of these groups:


• GATE developers: a group of people who are actively working (or used to work) on maintaining GATE and developing new GATE software components.

• GATE users: a group of people who are using GATE to perform language processing tasks such as syntactic parsing.

7.5.5 Tasks

A key point when designing the tasks was to make them similar to everyday tasks carried out by GATE developers and users. We also worded them so that, if copy-pasted verbatim into a search engine, they would not directly return the results. The tasks were as follows:

1. Find out what POS (Part-Of-Speech) taggers exist in GATE.

• Our assumptions: this task is designed to test the effectiveness of the semantically enabled prototype; answers are available both on the Web and through the prototype, but the idea is to test how efficiently this task can be performed using the two approaches. We would also like to see whether the subjects struggle with the query language of QuestIO and whether those that are familiar with ontologies would turn to browse the ontology instead of using the text-based interface.

2. Imagine that you are a GATE developer who needs to extend the Cebuano Gazetteer. Your task is to find out the names of all its runtime parameters.

• Our assumptions: this task is designed to resemble those performed on a daily basis by GATE developers. More importantly, it is not possible to find the correct answer on the Web as it is not available even in the source code: the developers have to know the specific place to look for the configuration files, which contain the answer. However, this knowledge is available in the ontology.

3. Find which forum posts are related to the Learning PR.


• Our assumptions: this task is designed to emphasise the benefits of concept-based search in comparison to the keyword-based search currently available through search engines such as Google. While it is possible to complete this task using traditional search engines and also sourceforge.net, we hope that the subjects will get more relevant results more quickly through using the QuestIO interface. We would also like to see if the users will use the refinement pane in this task, because learning is ambiguous in the ontology and the pane can be used to filter out the irrelevant results.

4. Think of any task that you would like to perform using this prototype. For example, find documents which you have written. Try using the refinement pane.

• Our assumptions: with this task we test whether the participants have learned how to use the prototype and whether they can formulate text-based queries without any difficulties; in addition, we want to collect a new set of questions which can be used to test and extend QuestIO's capabilities.

7.5.6 Results

Subjects and their background

With the pre-task questionnaire we wanted to assess the background knowledge of subjects and get an insight into how far they can be considered GATE domain experts, but also how much they know about the semantic web, ontologies, semantic search and ontology editors. We also asked them about traditional ways of searching for help with problems related to GATE: where they search for help and how often. This gives us insight into the feasibility of the QuestIO prototype, which gathers all knowledge about GATE in one place. Figure 7.16 shows the distribution of their answers. Each subject had the choice to select all options that apply, not only one from the list. Most of the subjects reported that they use the GATE support page (http://gate.ac.uk/support.html) as the starting point when trying to learn about GATE or solve a particular problem, but just as many reported that they use the mailing list and the source code. We can see that 6 of the 8 available options are indeed different locations on the Web where subjects search for information about GATE. The lower part of Figure 7.16 shows how often the subjects reported needing to search for various GATE-related information. This gives us insight into the suitability of subjects for our study: more than half of them need to search for information about GATE at least once per week, making them highly relevant for our study.

Figure 7.17 shows how often the subjects are asked about various GATE components. This is an important assessment of their GATE expertise, as Figure 7.16 already shows that in 33% of the cases the subjects reported that they usually ask a member of the GATE team when enquiring about GATE.

Finally, Figure 7.18 shows their experience of using GATE expressed in years. The most common answers were 5 years or less and 10 years or less, which together account for half of the participants; 16.67% of the participants had never used GATE before.

We calculated overall GATE expertise as a linear combination of these three assessments: how often the subjects are asked about GATE (never: 0 points, rarely: 1, 1-2 days per week: 2, 3-5 days per week: 3 points), for how many years they have been using GATE (never used GATE: 0 points, rarely: 1 point, less than 2 years: 2, less than 5 years: 3, less than 10 years: 4, less than 14 years: 5), and also whether they consider themselves GATE users (1 point), GATE developers (2 points), both (3 points) or neither (0 points). In the scalar representation of the latter, a GATE user was considered to be less of an expert than a GATE developer; this might not hold outside of this evaluation, but we are primarily concerned with finding and searching for GATE components, which is something GATE developers do more often than GATE users. Figure 7.19 shows the results.
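A worked example of this scoring scheme follows; the point values are those listed above, while the function name and the example subject are illustrative.

    # Worked example of the GATE-expertise score described above (maximum 11).
    ASKED_POINTS = {"never": 0, "rarely": 1, "1-2 dpw": 2, "3-5 dpw": 3}
    YEARS_POINTS = {"never": 0, "rarely": 1, "<2": 2, "<5": 3, "<10": 4, "<14": 5}
    ROLE_POINTS = {"neither": 0, "user": 1, "developer": 2, "both": 3}

    def gate_expertise(asked, years, role):
        # Linear combination of the three assessments: 3 + 5 + 3 = 11 at most.
        return ASKED_POINTS[asked] + YEARS_POINTS[years] + ROLE_POINTS[role]

    # e.g. a developer asked about GATE 1-2 days per week, using it for
    # less than 5 years, scores 2 + 3 + 2 = 7 out of 11.
    print(gate_expertise("1-2 dpw", "<5", "developer"))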

In order to assess their knowledge of semantic web technologies, we asked them about their familiarity with the terms semantic web, ontologies and semantic search, and also whether they had ever used an ontology editor.


Figure 7.16: Traditional ways of searching about GATE components and frequency of search as reported by 12 subjects: 1 pm (per month), 1-2 dpw (days per week), 3-5 dpw (days per week)


Figure 7.17: Frequency of being asked about the GATE-related components, as reported by 12 subjects: 1 pm (per month), 1-2 dpw (days per week), 3-5 dpw (days per week)

The normalised results are shown in Figure 7.20.18 From this figure we can see that the majority of the subjects were unfamiliar with semantic web related terms, as the most common answer (the mode) was familiarity 0, while the median and mean were 16.67 and 25.69 respectively.

18Answers were chosen from a 5-point Likert scale, see Appendix A. The final results are calculated as the score minus one, so that 1 (Strongly Disagree) scored 0 points, 2 scored 1, and so on, with 5 (Strongly Agree) scoring 4. We then normalised the score onto a scale from 0 to 100, similar to the way SUS scores are normalised as described by Brooke [1996]. There are many discussions in the literature on whether the Likert scale should be interpreted as interval or ordinal data. While opinions are mixed, the answer seems to depend on the design of the experiment, and indeed on what is being measured by the scale. See Jamieson [2004] for more details. Throughout this thesis we interpreted the Likert scale in both ways, depending on the circumstances and the goal of the experiment in question.
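A minimal sketch of this normalisation, assuming answers on the 1–5 Likert scale described in the footnote (the function name is ours):

    # Normalise a set of 1-5 Likert answers to a 0-100 score, as described
    # in the footnote: subtract one from each answer, then rescale.
    def normalise_likert(answers):
        shifted = [a - 1 for a in answers]        # map 1..5 to 0..4
        return 100.0 * sum(shifted) / (4 * len(answers))

    # e.g. answers of 1, 2 and 3 to three familiarity questions give 25.0
    print(normalise_likert([1, 2, 3]))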


Figure 7.18: Experience in using GATE


Figure 7.19: GATE Expertise expressed through the scalar value (0 – never used GATE, 11 – the most experienced GATE expert)


Figure 7.20: Expertise in semantic web technologies, familiarity expressed in the range from 0 (minimum) to 100 (maximum)


Efficiency

To measure efficiency, we measured the time it took each participant to finish each task. We used one-way repeated-measures ANOVA to test the significance of our results.

Figure 7.21 shows the average time spent per task for all participants. The diagram shows that for most tasks the subjects finished considerably faster with QuestIO. On average, it took 46.61% longer to finish each task using the baseline (traditional ways of searching) than using QuestIO (157.08 seconds vs. 107.14 seconds). The only exception is Task 3, where participants spent slightly more time finishing it with the QuestIO prototype. When looking at the results more closely, we noticed that when performing Task 3 using the QuestIO prototype the participants spent the majority of the time clicking on the resulting URLs in order to decide whether the documents were relevant or not. When using a traditional search method they could determine this easily, because the results displayed not only the URL but also short snippets from the content.

In order to test the statistical significance of this difference we used a repeated-measures one-way ANOVA with two levels. This method assumes a normal distribution of the dependent variable (average time per task in our case) for all factors, hence we first performed tests of normality for both groups using the Shapiro-Wilk test with a 95% confidence interval19.

This test suggested that neither group has a normal distribution (p < 0.001 for both groups), and hence we transformed the data using the natural logarithm (Ln). The Shapiro-Wilk test on the transformed data shows that a normal distribution holds (p = 0.353 for the baseline and p = 0.987 for QuestIO, indicating that the latter data have an almost perfect normal distribution). On the transformed data we proceeded with the one-way ANOVA with two levels to further investigate our findings. We set up our null hypothesis as follows: there is no difference in the efficiency of the baseline and QuestIO approaches; if this were true, the observed difference in efficiency would have happened due to chance.

19Another commonly-used test is Kolmogorov-Smirnov. Both methods test whether one distribution is significantly different from a normal distribution. Shapiro-Wilk test is recommended when the sample size is between 3 and 2000 and the Kolmogorov-Smirnov test if the sample size is greater than 2000. See Phillips [1996] for more details.

The ANOVA shows that we can reject the null hypothesis and conclude with a high level of confidence (F(1, 11) = 9.5, p = 0.001) that the subjects were significantly slower when using the baseline (157.08 seconds) than when using QuestIO (107.14 seconds).
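The analysis above can be reproduced roughly as in the following sketch using SciPy; with only two repeated-measures levels the one-way ANOVA is equivalent to a paired t-test (F = t²). The time values are placeholders, not the measured data.

    # Sketch: normality check, log transform and two-level repeated-measures
    # comparison (equivalent to a paired t-test on the transformed times).
    import numpy as np
    from scipy import stats

    baseline = np.array([150.0, 210.0, 95.0, 300.0])  # placeholder times (s)
    questio = np.array([100.0, 120.0, 80.0, 140.0])   # placeholder times (s)

    print(stats.shapiro(baseline), stats.shapiro(questio))  # normality tests

    t, p = stats.ttest_rel(np.log(baseline), np.log(questio))
    print("F =", t ** 2, ", p =", p)  # F = t^2 when there are only two levels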

Figure 7.21: Average time per task

Effectiveness

Effectiveness indicates how successfully the tasks were finished using both systems. We observed each user and graded task success as:

• task completed with ease (0),

• completed with difficulty (1),

• failed to complete (2).

Figure 7.22 shows the difficulty per task based on the success rate for all participants. Two extreme cases were Task 1, which was finished successfully with ease by all participants when using QuestIO, and Task 2, which was not completed by any of the participants when using alternative ways of searching. This is in line with our expectations. Overall, the success rate for performing tasks using QuestIO was 0.355 in comparison to 0.895 using the baseline, on the scale from 0 to 2 (lower is better).


Figure 7.22: Task difficulty based on the average success rate per task: task completed with ease (0), completed with difficulty (1), failed to complete (2)

This difference is highly significant according to the Friedman test, with χ2 = 12 and p = 0.001.20 This indicates that subjects found it easier to finish tasks using QuestIO than using the baseline.

User satisfaction

We chose the SUS questionnaire as our principal measure of software usability because it is the de facto standard. The questionnaire was developed using established techniques based on the Likert scale [Brooke, 1996].

Furthermore, researchers at Fidelity Investments carried out a comparative study of SUS, three other published usability questionnaires and an internal questionnaire used at Fidelity, over a population of 123 subjects, to determine the sample sizes required to obtain consistent, accurate results.

20The Friedman test is the non-parametric alternative to the one-way ANOVA with repeated measures. It is used to test for differences between groups when the dependent variable being measured (success rate in our case) is ordinal.

They found that SUS produced the most reliable results across all sample sizes; they noted a jump in accuracy to 75% at a sample size of 8, but recommended a sample of at least 12–14 subjects [Tullis and Stetson, 2004]. Consequently, for our evaluation, we recruited 12 participants.

As a reference for interpreting the results, SUS scores range from 0 (very little satisfaction) to 100 (very high satisfaction) [Bailey, 2006], and scores from 60 to 70 are considered average. The mean SUS score in our evaluation was 69.38, which is close to both the mode and the median, each of which was 70 (see Figure 7.23). SUS scores ranged from a minimum of 52.5 to a maximum of 85. Moreover, as our SUS scores were almost perfectly normally distributed (Shapiro-Wilk coefficient = 0.974, p = 0.949), we can estimate the 95% confidence interval for the mean SUS score as 63.47 to 75.28. This is a satisfactory result.
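The interval can be reproduced from the mean, the standard deviation and the sample size using the t distribution, as in this sketch; the standard deviation value is a placeholder rather than the observed one.

    # 95% confidence interval for the mean SUS score (n = 12 participants).
    import numpy as np
    from scipy import stats

    n, mean, sd = 12, 69.38, 9.3            # sd is a placeholder value
    sem = sd / np.sqrt(n)                   # standard error of the mean
    low, high = stats.t.interval(0.95, n - 1, loc=mean, scale=sem)
    print(low, high)                        # roughly 63.5 to 75.3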

A Spearman test showed that the SUS result was influenced neither by the GATE expertise of our subjects (Spearman correlation coefficient = −0.134, p = 0.679 for the relation between GATE expertise and the SUS score), nor by their knowledge of semantic web technologies (Spearman correlation coefficient = 0.014, p = 0.965 for the relation between semantic web knowledge and the SUS score).

Subjective measures of user satisfaction

Whereas the above results are based on our judgement from observing the subjects, we were also interested in the subjective view of the subjects themselves. To assess that we asked them five questions, the first of which was whether it was easier to perform tasks using the QuestIO prototype than using the alternative ways of searching. The overall results were in favour of our prototype, with 18.8% disagreeing with this statement, 54.1% agreeing, and 27.1% neutral. Figure 7.24 shows the detailed distribution of results, grouped by task. For Tasks 3, 2 and 1 (sorted by percentage of agreement) the subjects considered QuestIO easier in comparison to the traditional ways of searching, whereas for Task 4 they voted in favour of the latter.


Figure 7.23: The SUS score by participant

Users also reported that:

The prototype was slightly easier, in that it listed them all (except rasp, which it missed) on a single page with no false positives. (user 2, task 1)

but also

I couldn’t find the answer using the standard approaches. (user 1, task 2)

In order to get a subjective measure of result relevance, we asked subjects after each task whether they found the results returned by QuestIO relevant. Across all tasks, 60.4% of subjects agreed, 10.5% disagreed, and 29.2% were neutral. The detailed distribution is shown in Figure 7.25. It is interesting to note that while the first three tasks had no recorded disagreements, Task 4 seemed problematic: 41.7% of subjects disagreed with the statement that the results returned by QuestIO were relevant. The other extreme is Task 2, which has the highest level of agreement (83.3%) in comparison to the other tasks. Let us recall that Task 2 was the most difficult task for all participants, as shown in Figure 7.22.


Figure 7.24: Agreement of subjects with the statement that It was easier to finish the task using the prototype in comparison to the traditional ways of search

A closer analysis revealed that the queries for Task 4 were very similar to those the subjects would have typed into Google, some of which were not related to GATE components at all, although they were related to GATE: for example, queries such as GATE web site or GATE projects. This demonstrates that we failed to explain clearly to the subjects what kind of knowledge was available, and also that the prototype is not a replacement for a general-purpose search engine firing queries against the entire Web and its documents, but rather a system containing knowledge about GATE as a piece of software.

In addition to choosing an answer on the Likert scale, the participants could give specific comments for each question. These comments reveal that different users had different experiences using the QuestIO prototype. For example, one user reported that:

The QuestIO results do not offer any summary or snippet, which makes it difficult to assess their relevance. (user 12)

while another user reported:

It’s useful to have the various sources all available to search via a single interface. (user 4)

Figure 7.25: Relevance of results returned by QuestIO


We asked all participants whether the possibility to browse the ontology in the prototype was helpful. 8.3% disagreed, 54.2% agreed, and 37.5% were neutral, reporting that they did not need to use it. Figure 7.26 shows the detailed distribution of results. For Tasks 2, 3 and 1 (in decreasing order of agreement) this option was rated most helpful, while, again, Task 4 had the highest percentage of disagreement in comparison to the others (25% in comparison to 0% for Tasks 2 and 3, and 8.3% for Task 1).

Figure 7.26: Was the option to browse the ontology helpful?

In one participant's own words: when searching for the developer of the Morphological Analyser the system could not find any results, so the user browsed the ontology and afterwards reported:

It explained why I couldn’t search for Niraj and Morpher together because there wasn’t a relationship defined there. (user 12, task 4)

We asked all participants whether the refinement pane was helpful. The refinement pane showed the answers for Tasks 1 and 2 (given that participants formulated a query which could be parsed successfully), while for Task 3 it was used for refining the results, as the task was to find documents about a specific GATE component (the Learning PR). Overall, 58.4% agreed, 6.3% disagreed, and 35.4% neither agreed nor disagreed. Figure 7.27 shows that for Tasks 1, 2 and 4 the disagreement was equal (8.3%), while for Task 3 all subjects either agreed that the refinement pane was helpful or were neutral. Interestingly, agreement for the first three tasks was similar (75%, 58.3% and 66.7% for Tasks 1, 2 and 3 respectively), and at the same time much higher than the agreement for Task 4 (33.3%). However, most of the comments expressed a positive attitude overall:

It seemed to narrow down the results to the right thing (user 1, task 3).

or

Great! the refinement pane suggested ‘LearningBatchLearningPR’, which was the right choice (user 12, task 3).

We asked all participants whether it was easy to formulate queries using QuestIO. 68.7% agreed, 4.2% disagreed, and 27.1% neither agreed nor disagreed. Figure 7.28 shows the detailed distribution of results, where it is visible that the easiest query to formulate was that for Task 3 (100% agreement), which is in line with one of the comments:

same keyword used with the mailing list - and it works!, (user 7, task 3)

The next easiest was Task 1 (83.3% agreement), followed by Task 2 (50% agreement), and finally Task 4 (41.7% agreement), with which subjects struggled, as they report:


Figure 7.27: Was the refinement pane helpful?

Wanted to search for PRs implemented by Niraj. Couldn’t do it or didn’t know how to do it. (user 7, task 4)

We have also asked the participants for any suggestions and ideas on the interface improvements. Some of the suggestions were as follows:

• “When displaying the results (document links) you need to provide a summary or a lead paragraph or a list of keywords so that the user knows whether a document is relevant”.

• “The answer was in the ‘refine’ section at the bottom of the page; maybe put this closer to the top?”


Figure 7.28: Was it easy to formulate the query using QuestIO?

• “Yes, definitely don’t display everything (papers, pages, posts) but only resources that are specified.”

• “Also navigation, ordering, ranking and grouping of results should be implemented. Highlighting the terms is not supported, too, which I find very useful.”

The evaluation results show that the majority of participants (68.7%) found the QuestIO language easy, and in fact 54.1% found using QuestIO easier than traditional ways of searching. The reason for this might be the possibility of finding all information in one place, rather than searching several locations on the Web. The measured relevance of the results (60.4% agreement) indicates that the SPARQL queries generated from the users' queries were sufficiently accurate. Although 58.4% of subjects agreed that the refinement pane was helpful, the comments they left indicate that our design decision to use the refinement pane both for showing the results of the query (Tasks 1 and 2) and for refining the documents to contain the results of the query (Task 3) was confusing. Although our subjects were not highly experienced with semantic web technologies, the possibility to browse the ontology was rated as helpful in 54.2% of cases. While the five tested features of QuestIO received quite positive feedback in general, we must note that, without exception, Task 4 received the most negative results. Many of the comments left by our subjects indicate that it was difficult for them to understand the scope of the knowledge available in the prototype and also, in cases of failure, to understand why the failure happened. A similar trend is revealed by the effectiveness evaluation, where subjects were generally more successful for all tasks on average, the only exception being Task 4.

In addition, many subjects were frustrated by the fact that they had to browse through documents in order to judge their relevance, and also by the other user interface issues listed above, most of which were related to the subjects' familiarity with search engines and the way these present results. The prototype was taken as a competitor to Google, which was not our intention; however, the comparison, and the fact that we asked the subjects to perform identical tasks using the prototype and any other page on the Internet, led them to such thinking. Nevertheless, this evaluation encouraged us to continue work on the text-based interface, although we revisited our requirements, based on the analysis just shown, before doing any further work.

7.6 Summary and Discussion

In this chapter we described a Question-based Interface to Ontologies which automatically derives the answer by translating a Natural Language or keyword-based question into SeRQL/SPARQL, and returns the answer to the user after executing the formal query against a given ontology. We evaluated QuestIO in several respects; here we present a brief summary and the lessons learned from those evaluations, which assessed the main features of QuestIO: the flexibility of the supported language and the portability without customisation. We also presented the GATE case study and the user-centric evaluation, which assessed the usability of QuestIO, the difficulty of the supported language, and several other features.

Flexibility of the supported language We have evaluated the expressiveness of the supported language using the software engineering domain with 36 questions from the GATE mailing list, and then repeated the experiment with AquaLog. QuestIO's query language showed advantages in comparison to AquaLog's, resulting in higher precision and recall for the same set of questions. The main difference between the two supported languages is that, while QuestIO performs shallower language processing than AquaLog, it supports not only grammatically correct questions, but also question fragments and ill-formed questions.

The language supported by QuestIO proved to be very robust during the user-centric evaluation, judging by the number of different queries which were formulated in order to successfully finish the same tasks. To give an example, in order to complete Task 2, users typed in:

• “cebuano gazetter parameters”

• “What are the runtime parameters of cebuano gazetteer?”

• “what are the parameters of cebuano gazetteer?”

• “Cebuano gazetteer runtime parameters”

• “Runtime parameters of cebuano gazetteer”

• “Cebuano runtime parameters”

• “Cebuano gazeteer”: this example includes a spelling error, due to which the system failed to return the correct answer; it happened twice, for two different users. One of them immediately typed the same query into Google and, based on the ‘Did you mean’ functionality which offered the correct spelling, figured out that he had made a mistake, corrected it and got correct results, while the other one gave up on the query and went to browse the ontology. Nevertheless, he also found the correct answer.

Moreover, our findings showed that encouraging subjects to use keyword-based input was in places misleading. A set of keywords for a search engine is a bag of words which is matched against the available documents using boolean operators. A set of keywords as input for a Natural Language Interface is a set of words with, for example, omitted prepositions or WH-phrases: these keywords are not independent, and the assumption is that the user is interested in the relations that exist between them. These two concepts are quite different, and it might be better to encourage users to use Natural Language questions, even if not fully grammatically correct and with omitted words, instead of encouraging them to use keyword-based queries.

However, the flexibility of the supported language has a trade-off which was highlighted during the user-centric evaluation. Namely, the concepts which appear in the query should be in line with the knowledge structure. An interesting example of the system's failure was the query Cebuano parameters. The system recognised Cebuano as a plugin, and also parameters, but it could not connect the two, as there is no direct relation between them in the ontology. The system also failed to provide useful feedback, so the user decided to browse the ontology in order to figure out what the problem was and find the answer. As shown in Figure 7.29, the system needed to find a Processing Resource, which is the connecting node between the two nodes which appeared in the query; this emphasises the fact that the translation from the NL query to the SPARQL query which will give the correct answer must take the structure of the knowledge into account, and this is what makes the design of NLIs to any structured data extremely hard and expensive. The system could search for connecting nodes between the concepts in the query (and not only for direct relations), but this is very time consuming and, in addition, generates a lot of noise which might be very difficult to filter. Moreover, there might be more than one connecting node between the two concepts, which complicates the issue even more. A sketch of such a two-hop query is shown below.
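The sketch illustrates the shape of such a two-hop SPARQL query once the connecting Processing Resource node is known; the namespace, property and resource names are hypothetical stand-ins for the actual GATE ontology vocabulary.

    # Illustrative two-hop query: the Cebuano plugin and its parameters are
    # only connected through an intermediate Processing Resource node (?pr).
    CONNECTING_NODE_QUERY = """
    PREFIX gate: <http://example.org/gate#>
    SELECT ?param WHERE {
        ?pr gate:memberOfPlugin gate:CebuanoPlugin .  # hop 1: plugin -> PR
        ?pr gate:hasParameter ?param .                # hop 2: PR -> parameter
    }
    """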


Figure 7.29: Finding parameters of the Cebuano gazetteer: the importance of the data structure

Scalability and portability We have assessed QuestIO's scalability by trialling it with two ontologies of different sizes, one being a subset of the other. While the quality of the results was not affected, the initialisation time and the execution time were significantly longer for the larger ontology. More specifically:

• The ranking mechanism, which relies on the properties and string similarity, caused extremely low performance (long execution time) for some queries.

• The best ranking based on the assumptions for one type of ontology design might not hold for another, and therefore automatically returning the answer in all cases might not be preferable.

• Using SPARQL as an alternative to SeRQL is faster for version 2.8.4 of OWLIM, which relies on Sesame 1.x.

Usability With regard to the user-centric evaluation, QuestIO performed quite well when evaluated on the set of predefined questions and also with questions for which the answer was in the ontology. However, for undefined tasks (where subjects had the freedom to type in any query they were interested in) users often were not satisfied with the results, and the reasons can be summarised in the following cases:

Lexical failures Tokenizer vs. Tokeniser: the system did not support spelling variations; nor could it recognise misspelled words such as Gazetteer vs. Gazeteer. Also, horacio saggion articles failed due to the system not being able to find a relation between publications and articles: while publication was in the lexicon, articles was not, and thus the system returned no answer.

Conceptual failures The most common types were:

• Missing concepts: the system failed to return any results due to having no knowledge about the concepts, for example Projects about GATE or GATE web site; although the ontology is about GATE components, even GATE was not recognised, as such a concept does not exist in the ontology (however, GATE plugins, GATE PRs, etc. do exist). The same was true for JAPE, which was not recognised by the prototype; this left some very experienced GATE users very frustrated, since, in their view, a system about GATE must know about JAPE.

• Missing relations between concepts: for example, for the query Developer of Tokeniser the system knows about Developers such as Adam Funk, Niraj Aswani, Hamish Cunningham, etc.; however, there was no relation defined between developers and the components they developed. Therefore the system failed to find them, and it also failed to give a useful message to the user.

• Missing concepts and relations: this case is the combination of the previous two, such as in the query Author of morphological analyser, where the system did not know about Author; the user reformulated the query into Developer of morphological analyser, but still no results were found, this time because there was no relation between Developer and morphological analyser.

While lexical failures are easier to address (e.g. using a spelling algorithm in combination with WordNet) than conceptual failures, both remain challenging. For example, in the last example above, the system does have knowledge about the question terms in the ontology, but it cannot find relations between them and therefore does not give any answer. In the ideal case, the system should communicate a relevant message to the user, indicating whether the answer does not exist or the query needs to be reformulated.


With regard to lexical failures, using WordNet would, for example, help identify articles and publications as synonyms; however, for very specific domains such as the GATE case study, it would fail to connect author with developer. To address these challenges we developed a second NLI system called FREyA, to which we now turn.


Chapter 8

Towards Better Usability with FREyA: Part I

In this chapter1 we present the FREyA system, which is named after Feedback, Refinement and Extended Vocabulary Aggregation. Our intention with FREyA is to combine these usability enhancement methods (discussed in Chapter 6) in order to improve the performance of NLIs to ontologies and address the challenges discussed in Section 7.6:

Conceptual failures: First, we take QuestIO one level up by including the user in the loop when interpreting questions. We experiment with feedback by showing the user all query interpretations and the system's rankings, so that the user can influence the answer by choosing the correct interpretation. This approach is evaluated with users; details are presented in this chapter. Further on, we move towards concept interpretation, where we use clarification dialogs for resolving ambiguities through query refinement; details of this work are presented in Chapter 9.

Lexical failures: we use user-interaction methods to extend the system's lexicon with the user's vocabulary. In addition, we experiment with handling spelling errors and with synonym detection.

1An updated version of this chapter has been submitted as a research article to the Journal of Web Semantics.


Addressing lexical failures is detailed in Chapter 9, where, in addition, we address the following:

• Deeper grammar analysis The identification of the question's semantic meaning is improved (in comparison to QuestIO) by combining the syntactic analysis with the ontology-based lookup.

• What to show The goal is to provide a concise answer to the user's question. The result of the SPARQL query returned by QuestIO is a subgraph (represented by a table) which ideally contains the answer to the user's question. However, it is not trivial to derive one single answer from this subgraph, and showing the whole graph to the user might be overwhelming (as pointed out by users in the evaluation described in Section 7.5).

8.1 Feedback

As we discussed in Chapter 6, feedback increases the user's confidence and, in the case of failures, helps the user understand which habitability domain is affected (see details about habitability domains in Section 4.1). In our initial design of FREyA, we modelled the system's interpretation of the query based on two important aspects:

• The answer is found: feedback can make the user more confident that the answer is indeed correct, and it can also help the user familiarise himself with the queried knowledge structure.

• The answer is not found: feedback should be used to make the user aware of the reasons why there is no answer. This is the more complex case, as it is sometimes hard to identify which habitability domain is affected:

– The answer is not found because the system could not parse the question (a lexical or syntactic failure; the question could be answered if reformulated).

– The answer is not found because the system could not find information about the required concepts (a conceptual failure; the question could not be answered even if reformulated).


– The answer was not found although the system successfully parsed the question. This case is probably the most complex because it might mean that the answer is negative, but also that the information is missing.

8.1.1 Hiding Complexities

One challenge when modelling feedback is how to show the system's interpretation, bearing in mind that NLIs are intended to be used by users who are not necessarily familiar with ontologies. NLIs to ontologies usually translate a natural language query into some intermediate interpretation, a set of triples, which is then translated into a formal query such as SPARQL. Hence, the most natural way from the point of view of the system's developer would be to show either the triples or the SPARQL query. However, as our intention is to develop methods which are suitable for casual users as well as for semantic web experts, in our initial design we want to simplify the system's interpretation and hide complexities as much as possible. Therefore, we take the following decisions:

• show labels instead of URIs;

• show a linear list of elements (instead of triples), in the order in which they appear in the question;

• show relations between the elements by rendering a tree-like structure.

8.1.2 Identified Context and Tree-based View

Implementing these decisions resulted in the Web interface shown in Figure 8.1. After the user posts a question, the system first generates a table with two columns: Identified context, which shows the linear list of elements (recognised concepts and the relations between them as found in the ontology), and Our score, which shows the score by which the Identified contexts are ranked. The system automatically selects the first option, and the results are rendered using the tree-based view. The user has the option of selecting any of the Identified contexts by clicking on the radio button in the desired row; the results are rendered on click.


Figure 8.1: The FREyA interface

Further on, the user can explore the tree-based view by selecting its nodes, for example Country in Figure 8.1, and the instances will be shown in the right pane.

In the case when the system recognises concepts in a query but does not find any results, the concepts are shown in the Identified context and, on selection, the message No relation found within this context is displayed in the area for the tree-based view (see Figure 8.2).

8.1.3 Linearised List of Concepts

We have mentioned previously that the order of the recognised concepts follows the order in which they appear in the question. This is to ensure the user is not confused by the output, and also to try to ‘translate’ the natural language question into the set of recognised concepts. However, due to the presence of properties in each query interpretation (as properties are crucial for getting the correct answer), this can lead to a ‘not so natural’ effect; see for example Figure 8.3. The Identified context is shown including the has run time parameter relation. The interpretation as such is not understandable without additional explanation to the user – users must be trained to understand the role of the property in between the recognised concepts.

Figure 8.2: The FREyA interface: showing results for states bordering Hawaii

The other option which we could consider is to reverse the order and show the interpretation to read:

rasp parser (language analyzer) has run time parameters resource parameter

However, for more complex queries, this approach would require modelling triples. For example, if we look back at the example in Figure 8.1, the first interpretation reads:

capital has capital country sub region of europe (continent)

To make this interpretation more natural, we would have to show:

country has capital capital; country sub region of europe (continent)

However, this makes it harder to follow which question term refers to which ontology concept, and from where the relations were derived. Therefore, we stay with the linearised representation, but decide to model the tree-like view (see the lower left part of Figure 8.3), so that it is indeed clear to the user that, according to the knowledge structure, ‘the rasp parser has run time parameters...’ and not the other way around.


Figure 8.3: The FREyA interface: showing results for runtime parameters of rasp parser

When no results are found, the user will see the message No relations found within this context, see Figure 8.4.

Figure 8.4: The FREyA interface: showing results for init parameters of rasp parser

The tree-like structure shows relations between the concepts, which are grouped so that, for example, one node is rendered for each class rather than for every instance. Instances are shown in the right pane when the user clicks on a node.


8.2 Evaluation

During the GATE Summer School in July 2009, we organised a task-based evaluation with the participants in order to test feedback. The evaluation was a part of the lecture on using GATE to build Natural Language Interfaces to Ontologies.

8.2.1 Evaluation Scope

Our intention was to see whether users could make the correct conclusions based on the system’s feedback, and therefore:

• reformulate the query in order to get better results,

• terminate the task based on the conclusion that

– there is no answer (answer is negative), or

– the knowledge was not available in the system.

In addition, we wanted to assess the efficiency, effectiveness and user satisfaction of FREyA in comparison to QuestIO.

8.2.2 Experimental Setup

Training

The participants listened to a 20-minute talk about Natural Language Interfaces to Ontologies, where they were given a short overview of how the system works; they also had the chance to familiarise themselves with the separate components in GATE in order to understand the various Processing Resources which are used for building QuestIO and FREyA. They were then given a five-minute demo on how to use the Web-based interface2.

2Slides are available from http://gate.ac.uk/sale/talks/gate-course-july09/ slides-pdf/questio.pdf


What to measure?

At the beginning of the experiment, we asked participants to complete a questionnaire about their background (age, profession, knowledge of semantic technologies). Next, they were asked to complete four tasks, and after each task they had to answer several questions:

• whether they could finish the tasks successfully: based on their answer, we measured effectiveness;

• whether the feedback was helpful or not;

• whether it was easy to formulate the queries for the task or not.

The subjects were offered a pre-defined set of answers, with an option to add additional comments as free text. The full task list and the questionnaires given to the participants can be found in Appendix B. After finishing all tasks, subjects were asked to complete the SUS questionnaire as a standard user satisfaction measure. In addition, we measured the time each user spent on each task, and also the number of queries they used.

8.2.3 Dataset

We have initialised FREyA with two domain ontologies. The first one is the same as in the evaluation described in Section 7.5, covering GATE components, while the second one is the Mooney GeoQuery ontology covering the United States geography3.

8.2.4 Tasks

As previously mentioned, subjects were asked to perform four tasks. For each task they had the opportunity to read both the version related to the GATE domain and the one related to United States geography before deciding which one to perform.

3The Mooney geography dataset is available from http://www.ifi.uzh.ch/ddis/ research/talking-to-the-semantic-web/owl-test-data/

If they were not confident in their knowledge about GATE, we hoped they would choose the task related to US geography. The task pairs covering the two domains were of the same complexity. We next list the tasks and describe our incentives for them.

Task 1:

• Task 1a: Find part of speech taggers which exist in GATE. Find out which parameters exist for the POS Tagger of your choice.

• Task 1b: Find mountains which exist in the United States. Find out in which state is the mountain of your choice located.

Our assumptions: This task contains two parts, each of which was a separate task in the previous evaluation of QuestIO. The intention is to compare the efficiency and effectiveness of subjects with those in the previous study. This way we are testing the effect of the new usability features which exist in FREyA but were not part of QuestIO.

Task 2:

• Task 2a: Imagine that you are a GATE developer who needs to extend the RASP Parser. Your task is to find out the names of init parameters.

• Task 2b: Find out which states border hawaii.

Our assumptions: In this task our goal was to see whether the system provided enough feedback to the user when the answer to the question was negative. Both ontologies contain knowledge about the concepts in question; however, the lack of relations between the concepts indicates that there is no answer. What we are interested in here is whether the system's interpretation of the query, together with the message No relations found within the context, is clear to the user, and whether he can make the decision with confidence based on it.

Task 3:

• Task 3a: What are the parameters of the PRs which are included in the same plugin as the Morpher?


• Task 3b: Which rivers flow through the state in which the mountain harvard is located?

Our assumptions: With this complex task we wanted to see whether the users are able to complete it based on the feedback the system provides. The task is complex in the sense that it requires the formulation of at least two queries to get the answer: the second query needs to be formulated based on the answer and the feedback returned for the first query. The subjects need to figure out that there is knowledge in the system about what they are searching for, but that the query they are likely to type in first is too complex and that they therefore need to reformulate it (rather than give up, concluding there is no answer).

Task 4:

• Try exploring the knowledge available in the system. Either search for various components of GATE such as PRs, plugins, LRs, VRs, or explore the United States geography by inquiring about: cities, states, rivers, mountains, highways, etc. Then ask some questions in order to connect these concepts, such as ‘which states border georgia?’ or ‘which rivers flow through states which border california’. Input as many queries as you like.

Our assumptions: with this task we test whether participants have understood what types of tasks can be performed using FREyA. Alternatively, we could compare the results of this task with the ones from the previous study. In addition, this gives us the opportunity to collect a new set of questions which can be used to test and extend FREyA's capabilities.

8.2.5 Participants

Participants were all from outside Sheffield University, and were not known to us before they registered to attend the GATE Summer School. They were almost evenly distributed across researchers, software developers and students, and also across gender. We measured their expertise in ontologies, ontology editors and SPARQL using the Likert scale; the expertise is calculated as a linear combination of these three and then normalised onto a scale from 0 to 100 (similar to how the SUS score is calculated). With the most common value (the mode) at 50, a median of 58.33, and a range from 0 to 100, we conclude that their knowledge of semantic web technologies was neither basic nor advanced (see Figure 8.5), although with a mean of 60.8 and a very high variation (22.98) distributed more towards the higher values, it leans towards the advanced level.

Figure 8.5: The expertise of subjects in using ontologies, ontology editors and SPARQL

8.2.6 Results

While the number of participants at the GATE Summer School was 50, participation in the evaluation was on a voluntary basis, and many did not complete all required tasks and questionnaires. Therefore, we disregarded all incomplete records, which resulted in recorded data for 30 participants who completed the background questionnaire and at least the first three tasks. 11 out of these 30 participants finished Task 4, while 19 completed the SUS questionnaire. However, all of them had previously finished at least three tasks, and therefore we can draw conclusions about user satisfaction based on these records.

Effectiveness

Figure 8.6 illustrates the task difficulty per task, based on the average value of the success rate across all participants. Task 1 was the easiest, while Task 3 was the most difficult to finish.

Figure 8.6: Task difficulty based on the success rate per task: finished with ease (0), finished with difficulty (1), not finished (2)

However, looking into the distribution of different success rates within each task, as shown in Figure 8.7, Task 2 had the most failures (23.33% of participants did not finish the task). Task 1 was completed successfully by all participants, with only four subjects reporting difficulty. Interestingly, those participants who managed to finish Task 2 successfully did not experience any difficulties.

Figure 8.7: Frequency of different success rates per task

Task 3 was not finished in 20% of the cases. In comparison to Task 2 this is slightly better; however, a large portion of those who completed Task 3 reported difficulty in doing so. Task 4 was finished by only 11 participants, the majority of whom reported that they finished it with ease.

User Satisfaction

With regard to the SUS score, the overall mean was 66.97. This is slightly lower than the SUS score of 69.38 for the QuestIO user evaluation presented in Section 7.5. The mode and the median were equal to those in the previous evaluation (70). The range was much wider, starting at a minimum of 25 and spreading to a maximum of 95 (in comparison to the minimum of 52.5 and maximum of 85 in the evaluation of QuestIO). Overall, this is a good result.


Figure 8.8: The distribution of SUS scores for 19 participants who completed the questionnaire


Comparison with QuestIO

Task 1 was intended to test the difference in effectiveness and efficiency between FREyA and QuestIO. In other words, this test evaluates whether the feedback in FREyA improves the usability of the system.

Preparing data As the first task in the FREyA evaluation was equivalent to two tasks (Tasks 1 and 2) from the QuestIO evaluation, we first merged the results of those two into one. For effectiveness, in cases where the success score differed for the two tasks in the previous study, the higher one was picked as the representative; for example, if one of the tasks was marked as task completed with ease (0) and the other as failed to complete (2), the overall assigned score was failed to complete (2). For efficiency, measured through the time spent on the task, we summed the times for Tasks 1 and 2 into one value.

Effectiveness We test the significance of the difference in effectiveness using the Chi-Square test of independence. Our null hypothesis is that there is no relation between the system used (independent variable) and effectiveness measured through the success rate (dependent variable). With p = 0.01 and χ2 = 8.313 we can reject the null hypothesis, leading us to conclude that the difference in effectiveness between the two systems (0.67 for QuestIO, 0.13 for FREyA) is significant. This indicates that the new usability features that exist in FREyA but not in QuestIO had a positive impact on effectiveness.
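The test can be reproduced on a contingency table of systems against success categories, as in the sketch below; the counts shown are placeholders, not the observed frequencies.

    # Chi-Square test of independence between system used and task success.
    from scipy.stats import chi2_contingency

    #            ease  difficulty  failed   (placeholder counts)
    observed = [
        [8, 3, 1],   # QuestIO
        [3, 4, 5],   # baseline
    ]
    chi2, p, dof, expected = chi2_contingency(observed)
    print(chi2, p, dof)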

Efficiency With regard to the efficiency of the two systems, although the overall result differs (180.5 seconds on average for Tasks 1 and 2 for QuestIO, 155.27 seconds for FREyA), a two-tailed independent t-test reveals that this difference is not significant (t=0.188, p=0.852 with equal variances assumed, and t=0.287 with equal variances not assumed), and thus we retain the null hypothesis and conclude that there is no relation between the system used (independent variable) and efficiency measured through the time spent on task (dependent variable). This indicates that the new usability features of FREyA did not have a significant influence on how quickly subjects could finish tasks.
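As an illustration of the statistical procedure only (not a recomputation of the reported values), a minimal Python sketch of the two tests is shown below; the contingency counts and the per-subject timings are placeholders rather than the study's raw data, so the printed values will not match the figures reported above.

# Sketch of the two significance tests used above (illustrative data only).
from scipy import stats

# Success-rate contingency table: rows = system (QuestIO, FREyA),
# columns = success score (0 = ease, 1 = difficulty, 2 = not finished).
# These counts are placeholders, not the study's raw data.
observed = [[14, 4, 6],   # hypothetical QuestIO distribution
            [26, 4, 0]]   # hypothetical FREyA distribution
chi2, p, dof, expected = stats.chi2_contingency(observed)
print(f"chi2={chi2:.3f}, p={p:.3f}")

# Efficiency: two-tailed independent t-test on time-on-task (seconds).
questio_times = [210, 150, 180, 200, 160]   # placeholder samples
freya_times = [140, 170, 150, 160, 155]     # placeholder samples
t, p = stats.ttest_ind(questio_times, freya_times, equal_var=True)
print(f"t={t:.3f}, p={p:.3f}")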


Subjective Measures of User Satisfaction

In order to test the subjects’ subjective perception of the specific features, we asked them about the Identified context and Query formulation.

Identified context Figure 8.9 shows the distribution of the subjects’ subjective judgments on the Identified context. The exception is Task 2, for which we did not ask subjects about the Identified context explicitly, but instead asked them whether it was clear that there were no states (or no parameters, for the GATE domain). With regard to the first three tasks, the Identified context was most useful for the easiest task (Task 1), and the least useful for Task 3.

Figure 8.9: Clarity of feedback, all tasks

It is surprising that a large percentage of subjects found the Identified context confusing or neutral when doing Task 1, although all of them successfully finished the task. Namely, six subjects who found the Identified context confusing when doing Task 1 reported that several of the generated examples were confusing

or nonsensical, e.g., ‘state – is mountain of – rainer’. The reason was that the system showed the recognised elements of the query in the order in which they appeared in the query; since, in the ontology, the relation is mountain of has state as its range and the mountain (rainer) as its domain, the natural way of showing this to the user would be: rainer – is mountain of – state. However, this kind of interpretation is a step towards showing triples to the end-user, and for more complex queries these would need to be multiplied. As previously discussed (see Section 8.1), the intention with this initial prototype of FREyA was to mark question terms as recognised without going deeply into the complexities of the ontology structure.

With regard to the subjects who failed to complete Task 2, this happened for three reasons:

• Two out of seven (28.57%) said the System provided confusing output, so they could not figure out what to do.

• Two out of seven (28.57%) said the System provided no output, so they could not figure out what to do.

• Three out of seven (42.86%) could not find any information about ‘Hawaii’, due to the system failing to recognize ‘Hawaii’ with uppercase.

The last group can be classified as a system failure, and therefore we conclude that the remaining 57.14% of failures happened due to the users struggling to understand the feedback. In other words, four subjects failed to finish Task 2 due to not being able to understand the system’s feedback.

Looking at the results of 23 participants who claimed that they finished task 2 with ease (see Figure 8.10):

• For 8 out of 23 (34.78%), it was not clear that there were no bordering states/no init time parameters for the Rasp parser for that specific task, but they could successfully finish the task by looking at the results of some other queries. For example, some of them said that they determined that the system meant there were no bordering states by querying another state which does have bordering states.


Figure 8.10: Clarity of feedback for Task 2 considering only the participants who have finished the task with ease

• 3 out of 23 (13.04%) experienced the system failure of not recognizing Hawaii spelled with an upper case; they managed to reformulate the query and finish the task successfully.

• For 12 out of 23 (52.17%) participants, the feedback shown by the system was clear enough to conclude that there was no answer.

Overall, 12 subjects struggled to understand the system’s feedback, 4 of whom did not find an alternative way to solve the task. These 12 form 40% of all subjects, which is quite a high number. Only 40% of subjects found the feedback messages useful, even though 76.67% of them reported that they finished the task with ease.

From Task 2, we conclude that the Identified context coupled with the message No relation found within identified context was not useful, even though 76.67% of subjects found a way to complete the task successfully, usually by trying similar queries for which a result existed.

Looking into the details for the six subjects (20%) who failed to complete

task 3, one of the subjects stated that the “system provided confusing output: couldn’t manage to find out how to formulate the query; tried several ones by refinement”. When investigating this user’s queries further, we found that he tried 18 different queries, most of which gave some results; however, they were either too generic (e.g., PRs), or too specific and long, and also very similar to the wording of the actual task, for example “creole plugin prs parameters that are the same as the parameters of gate morphological analyser”.

Query formulation Figure 8.11 illustrates how easy it was to formulate the queries for the tasks. Subjects struggled most with task 3: many of them tried to input the exact wording of the task and then, since the system showed the recognised concepts but no answer, 80% tried to reformulate the query and successfully finished the task, while 20% gave up (task not finished).

Figure 8.11: Query formulation per task

Table 8.1 shows the average number of queries used across all subjects to finish the first three tasks. The lowest number of queries was needed for Task 2.


Task    Avg. #queries    Avg. #queries (successfully finished tasks)    Avg. #queries (tasks not finished)
1       7.27             7.27                                           n/a
2       4.10             3.17                                           7.14
3       6.03             5.5                                            8.17

Table 8.1: Number of queries per task across all subjects

However, it seems that for both tasks 2 and 3, the subjects who finished the task successfully used fewer queries. The logs reveal that the majority of subjects who failed to complete the task could have finished it even after the first query, had they been able to understand the system’s messages.

One of the subjects stated that “[FREyA] is a nice tool but can easily be fake, i.e. try ‘state mountains in the States’ or ‘state apple, monkeys, bananas, mountains in the USA’”. This is an interesting observation, and it is indeed true. The system does not use any predefined rules or syntax which would help it rule out sentences such as these. However, this is left to the user: our intention is to make the user aware of the available knowledge; if, for a given query, the user gets the wrong answer, he will at least know the reason. In the ideal case, the tool would indicate that state at the beginning of the query is recognised as geo:State, and the user, knowing this is not correct, would reformulate the query (i.e., use words such as ‘give me’, ‘show’ or ‘list’ instead of ‘state’ at the beginning of the query).

Comparison with QuestIO As we have assessed the query formulation in the evaluation of QuestIO, we can now compare the results of the two studies. Here we consider only the tasks that are repeated across the two studies. In other words, we compare the perception of the query formulation for Tasks 1 and 2 in the QuestIO evaluation against the perception of the query formulation for Task 1 in the FREyA evaluation. In addition, we compare the perception of the query formulation for the undefined task (Task 4) for both studies.

While the query language used in the two studies does not differ, the comparison will reveal whether other factors (such as the user-interaction features of FREyA and the feedback) influence the perception of the query formulation. Our null hypothesis is that there is no

relation between the system used and the perceived difficulty of the query language. We use the non-parametric Fisher Exact Test to assess this4.

Query language            easy     neutral   difficult
QuestIO defined task      66.7%    29.2%     4.2%
FREyA defined task        80%      6.7%      13.3%
QuestIO undefined task    41.7%    50%       8.3%
FREyA undefined task      63.6%    0%        36.4%

Table 8.2: Query formulation as perceived by subjects in the two studies

Table 8.2 shows the distribution of the answers, indicating that there were more positive answers regarding the perception of the query language in the FREyA evaluation than in the evaluation of QuestIO, for both defined and undefined tasks. For defined tasks, Fisher’s Exact test reveals that this difference is not significant (F=5.255, p=0.071), hence we retain the null hypothesis that there is no difference in how the two groups of subjects perceived the query formulation for defined tasks. For the undefined task, this difference is significant (F=8.016, p=0.015), indicating that subjects had the impression that FREyA’s query language was easier than the one required by QuestIO. This might indicate that the new usability features of FREyA had a positive effect on the user’s perception of the query language and helped boost the user’s experience.
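As a rough sketch of this test only (not a recomputation of the reported values), the following reconstructs approximate counts from the percentages in Table 8.2 for the defined task; scipy’s fisher_exact handles only 2-by-2 tables, so the neutral and difficult categories are collapsed here, whereas the analysis above tests the full 2-by-3 table (e.g., with R’s fisher.test).

# Sketch of a Fisher exact test on the defined-task rows of Table 8.2.
# Counts are reconstructed from the reported percentages and are approximate.
from scipy.stats import fisher_exact

#              easy  not easy
table = [[16, 8],   # QuestIO defined task (approx. 66.7% / 33.3% of 24 subjects)
         [12, 3]]   # FREyA defined task (approx. 80% / 20% of 15 subjects)
odds_ratio, p = fisher_exact(table)
print(f"odds ratio={odds_ratio:.2f}, p={p:.3f}")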

8.2.7 Summary and Discussion

In this chapter we presented the initial implementation of the FREyA system, with special emphasis on feedback, and the task-based evaluation with 30 subjects from the GATE Summer School, which was conducted with the aim of assessing the new usability features of FREyA from the end-user’s point of view and making comparisons, where appropriate, with QuestIO. While we mention the term end-user quite frequently in this thesis, an interesting question is certainly who they are. Are they expected to have a background in semantic technologies, or can we assume that FREyA can be used by

4The Fisher exact test is the exact version of the Chi-square test, usually used for testing 2-by-2 tables, particularly for small samples. As Chi-square is an approximation, it is not as trustworthy as the exact test on data with expected counts less than 5.

casual users of any background? FREyA attempts to render feedback in a user-friendly manner, and its eventual goal is to make the vast amount of structured information available to casual users. However, based on the evaluation presented in this chapter, we can only make claims about the population represented by our sample, which largely included computational linguists, computer scientists, and software developers who were familiar with semantic web technologies, even if not at an advanced level. With regard to application developers, the system was not customised prior to this evaluation. However, it is important to highlight that the system was initialised using two completely different domains, one about the GATE software and the other about United States geography. Due to the huge diversity between the two domains, there were no problems caused by ambiguities, nor was there a need to combine the knowledge of the two for any of the questions. The feedback was provided using two elements:

• The Identified context table, showing all query interpretations to the user, where each interpretation is a linear combination of the concepts and relations between them. The order of the recognised concepts followed the order in which they appeared in the question.

• The tree-based view, showing the concepts and their relations in a tree view, for any selected identified context.

We tested effectiveness, efficiency, and user satisfaction, the latter using the System Usability Scale. We also compared effectiveness and efficiency with the results from the previous study to assess whether feedback makes any significant difference. In addition, we assessed subjective measures of user satisfaction by gathering the users’ opinions about the Identified context and Query formulation. Task 1 was intended for a comparison of the effectiveness and efficiency of FREyA with the results of the previous study with QuestIO. Task 2 was intended to test whether subjects could understand that the answer to their question was negative. Task 3 tested whether subjects could make correct conclusions about query reformulation based on feedback. The wording of the task was such that, if used as a query, the answer would not have been

retrieved unless the query was reformulated or split into at least two queries. Task 4 was important only in the context of collecting new queries, and also for comparison with the previous evaluation where appropriate.

All subjects completed Task 1, although four of them did so with difficulty. This result is significantly better (p = 0.01) than the result for the same task in the previous study of QuestIO, indicating that the new usability features had a positive effect on effectiveness. However, although subjects finished this task faster than in the previous study, the difference is not significant (p ≥ 0.776).

With regard to the Identified context, it was not well received even for Task 1, which was the easiest. For Tasks 2 and 3, the Identified context was not key to success. Instead of understanding that there were no relations within the identified context, as stated by the system, the subjects ended up reformulating the queries many times and trying similar queries in order to finish the task. The average number of queries per task for the tasks completed successfully is much lower than for those that were not completed, indicating that the subjects who did not understand the system’s messages believed that they needed to reformulate the query, which they did many times before eventually giving up. This is specifically the case for Tasks 2 and 3.

Overall, our conclusion is that:

• Feedback had a positive impact on the overall effectiveness of the system, but no significant effect on efficiency.

• Feedback had a positive impact on the subjects’ perception of the difficulty of query formulation.

• The Identified context, showing the linearised list of concepts, was not well accepted.

• The tree-based structure, especially its interactive feature, was well accepted.

• The message showing that the system knows about certain concepts but cannot find any relation between them was not clear.


• For complex queries, feedback was often helpful in the sense that subjects could figure out that they needed to reformulate the query in order to get the correct answer.

Based on the findings from this evaluation, we moved towards the concept-based interpretation, in contrast to the query-based one which was represented by the Identified context. This enabled us to address some of the issues revealed during the evaluation just presented, and also to address challenges highlighted in the QuestIO evaluation (Chapter 7). These challenges form the basis for the improved version of FREyA; its design, implementation and evaluation are detailed next, in Chapter 9.

Chapter 9

Towards Better Usability with FREyA: Part II

In this chapter1, we detail FREyA, which goes one step further in exploring usability enhancement methods by combining feedback with clarification dialogs, with the aim of improving the performance of NLIs to ontologies. This

1 My initial ideas on FREyA are summarised in D. Damljanovic, M. Agatonovic, H. Cunningham: Usability of Natural Language Interfaces for Querying Ontologies, Workshop on Controlled Natural Language (CNL’09), Marettimo Island, Italy, June 08-10, 2009. M. Agatonovic and H. Cunningham provided useful comments and improvements on the initial proposal. In D. Damljanovic, M. Agatonovic, H. Cunningham: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Extended Semantic Web Conference (ESWC’10), Springer Verlag, Heraklion, Greece, May 31-June 3, 2010, I wrote about FREyA, which is largely based on the content of this chapter. While I implemented the system, I had many discussions with M. Agatonovic, whose comments and suggestions improved the system and also the final version of the paper. H. Cunningham commented on an earlier version of this chapter, which was then included in the paper. A similar contribution of both co-authors applies to D. Damljanovic, M. Agatonovic, H. Cunningham: Identification of the Question Focus: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Language Resources and Evaluation Conference (LREC’10), ELRA 2010, La Valletta, Malta, May 17-23, 2010 – Sections 9.2 and 9.7.4 are an updated version of this paper. Section 9.4 is an updated version of D. Damljanovic. Towards Portable Controlled Natural Languages for Querying Ontologies. In Rosner, M., Fuchs, N., eds.: Proceedings of the 2nd Workshop on Controlled Natural Language. Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Marettimo Island, Sicily, September 13-15, 2010. Sections 9.6 and 9.7.5 are an updated version of D. Damljanovic, M. Agatonovic, H. Cunningham: FREyA: an Interactive Way of Querying Linked Data using Natural Language. In: Proceedings of 1st Workshop on Question Answering over Linked Data (QALD-1), Collocated with the 8th Extended Semantic Web Conference (ESWC 2011). Heraklion, Greece, June 2011.

work builds on the experience from the evaluation of QuestIO described in Chapter 7, and also on the evaluation of the initial implementation of feedback described in Chapter 8. This leads to a system which, in contrast to QuestIO and the first version of FREyA, which work with query-based interpretations, moves towards concept-based interpretations, in an attempt to:

• Improve recall by addressing expressiveness: generating the dialog whenever an “unknown” term appears in a question. The initial domain lexicon is extracted from the ontology, as in QuestIO (see Section 7.1), and enriched with WordNet [Fellbaum, 1998]. “Unknown” terms are those identified as candidates to be linked to ontology concepts. These terms are chosen based on an analysis of the syntactic parse tree; however, this analysis does not require strict adherence to syntax and works on ill-formed and incomplete questions as well as on grammatically correct ones (Section 9.4).

• Improve precision by resolving ambiguities more effectively: generating the dialog whenever a question term refers to more than one ontology concept, i.e., for any ambiguities that cannot be resolved automatically.

An important part of the dialog are the suggestions, which are found through ontology reasoning. The system then learns from the user’s selections and improves its performance over time (Section 9.5). The complete workflow, starting with the Natural Language question and ending with the answer, is detailed in Section 9.1. In addition, while in QuestIO the results are shown to the user as a set of triples returned by SPARQL, in FREyA two important aspects are explored:

• the identification of the answer type (Section 9.2), which is then used for
• showing the concise answer to the user (Section 9.3).

With regard to portability, FREyA allows different modes to be used with different datasets, so that the learning model can be built during the training phase and used later (Section 9.6).


The system is evaluated using the Mooney GeoQuery dataset previously used in the evaluation of feedback, and also using the MusicBrainz and DBPedia datasets which are part of the Linked Open Data cloud. The results are presented in Section 9.7.

9.1 FREyA Workflow

Figure 9.1 shows the workflow, which starts with the Natural Language question (or a fragment of one) and ends when the answer is found. Each step in the workflow is explained in detail in the following sections.

Figure 9.1: FREyA Workflow

9.1.1 Ontology-based Lookup

Ontology-based Lookup links question terms to logical forms in the ontology, which we call Ontology Concepts (OCs), without considering any context or

grammar used in the question. Ontology Concepts refer to instances/individuals, classes, properties, or datatype property values such as string literals. The lookup is performed against the domain lexicon extracted using the same principles detailed in Section 7.1. However, the OntoRoot Gazetteer used in QuestIO is memory-based, and hence not always suitable for large-scale datasets, as initialisation of the domain lexicon is time-consuming. The lexicon is derived from the semantic repository by executing a set of SPARQL queries. Moreover, the data can be distributed over various types of servers, which often allow access through SPARQL endpoints. However, depending on the repository used underneath, some SPARQL queries can be highly unoptimised and slow.
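As a rough illustration of how such a lexicon can be derived (a simplified sketch, not the FREyA code), the following pulls rdfs:label lexicalisations from a SPARQL endpoint with SPARQLWrapper; the endpoint URL is a placeholder, and FREyA uses a configurable set of queries rather than a single one.

# Sketch: building a (label -> URIs) lexicon from a SPARQL endpoint.
# The endpoint URL is a placeholder; FREyA uses a configurable set of queries.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://example.org/sparql")  # hypothetical endpoint
endpoint.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?resource ?label
    WHERE { ?resource rdfs:label ?label . }
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()

lexicon = {}
for binding in results["results"]["bindings"]:
    # normalise the lexicalisation before adding it to the gazetteer list
    label = binding["label"]["value"].strip().lower()
    lexicon.setdefault(label, []).append(binding["resource"]["value"])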

One optimisation which we had to perform when attempting to load the DBpedia dataset from the FactForge server2 with FREyA was to use a more scalable gazetteer for ontology-based lookup. The OntoRoot Gazetteer, which loads the lexicon into memory, could not load the DBpedia lexicon even with 20G of RAM, and we had to use a more scalable gazetteer called the Large Knowledge Gazetteer (LKB), developed by Ontotext.

The initial version of LKB (distributed with GATE 5.1) loaded the lexicon by executing one SPARQL query and matched the exact text from the query with the lexicalisations from the ontology as returned by SPARQL. The OntoRoot Gazetteer is more flexible and robust than LKB, as it performs significant post-processing of the ontology lexicalisations returned by SPARQL before adding them to the gazetteer list. Therefore, we collaborated with Ontotext in order to merge the features of the OntoRoot and LKB gazetteers, enabling the lexicon to be loaded from large datasets through more than one SPARQL query, and the lexicalisations returned by SPARQL to be post-processed in order to improve the effectiveness of the lookup.

The initial SPARQL query used for the DBpedia experiment is shown in Appendix C. The queries are not optimised, and it would be worth exploring alternatives that would achieve the same result more efficiently. For this experiment, we did not have control over factForge.net, which already included inferred triples that had to be filtered out to avoid duplicates – hence the SPARQL was more complicated than a query that

2www.factforge.net

would be as effective if the repository contained explicit statements only.

By default, the system assumes that the rdfs:label property is used to name Ontology Concepts. However, for ontologies which use different naming conventions (such as the special class Alias in PROTON, previously mentioned, or dc:title in the MusicBrainz dataset, see Section 9.7.5), it is possible to predefine which properties are used for names. This enables the system to distinguish between creating a datatype property value element and an instance element. This distinction is important, as the different elements are used differently during the Triple Generation and SPARQL Generation steps.

To give an example using the MusicBrainz dataset, for the query When did Kurt Cobain die? Kurt Cobain would be linked to four Ontology Concepts (all individuals), one referring to mm:Artist, one referring to mm:Album, and the other two referring to mm:Track 3, because it is matched with the string literal of the property dc:title of these four URIs, which is Kurt Cobain. However, if the question were Did Kurt Cobain die on April 8, 1994?, April 8, 1994 would be annotated as a datatype property value element related to the individual Kurt Cobain, as it is matched with the value of the property mm:endDate.

The ontology-based lookup relies on the human-understandable lexicalisations of ontology resources, and therefore the quality of the produced annotations depends directly on them. However, it is not always the case that ontology resources are accompanied by human-understandable lexicalisations (e.g., labels). This is especially the case for properties. In addition, Natural Language is so expressive that words like total, smallest, higher than or how many cannot be understood without grammar analysis, nor can they be encoded into the relevant structure without additional processing. Some formal languages such as SPARQL have not even supported some of these structures until recently (e.g., it was not possible to do count in SPARQL before version 1.1).4 In order to capture the semantic meaning of such and similar constructions, we analyse the grammar of the question and then translate certain question terms

3For clarity of presentation, we use the prefix mm: instead of http://musicbrainz.org/mm/mm-2.1# and dc: instead of http://purl.org/dc/elements/1.1/title in the examples.
4http://www.w3.org/TR/sparql11-query/

into the relevant operations with ontology concepts (e.g., a superlative might mean applying a maximum or minimum function to the datatype property value).

9.1.2 Syntactic Parsing and Analysis

The syntactic parsing and analysis generates a parse using the Stanford Parser [Klein and Manning, 2002] and then uses several heuristic rules in order to identify Potential Ontology Concepts (POCs). POCs refer to question terms/phrases which can, but do not necessarily have to, be linked to Ontology Concepts. Although POCs are chosen based on the grammar analysis, strict adherence to syntax is not required and the algorithm works on ill-formed questions and question fragments as well as on grammatically correct ones. For example, noun phrases (NP), nouns (NN*), verbs (VB*), or WH-phrases such as Where, Who, When, How many are expected to be found by our POC Identification algorithm. This algorithm is based on the identification of prepreterminals and preterminals in the parsed tree, and also on their part-of-speech tags. A node is a prepreterminal if all its children are preterminals. A preterminal is defined to be a node with one child which is itself a leaf. The POC Identification algorithm is configurable in the sense that it can be set to ignore or consider specific part-of-speech tags. The high-level pseudo-code looks as follows:

1. find all X where X is a prepreterminal
2. if X is NP or NN* then POC = X
3. if POC contains ADJP or ADVP then split POC into several POCs
4. find all Y where Y is a preterminal AND Y is not identified as a POC at step 2
5. if Y is in the list of POS tags to consider AND Y is not in the list of POS tags to ignore then POC = Y
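A minimal sketch of this procedure over a bracketed parse is shown below (using nltk.Tree in place of the Stanford parse tree; the ADJP/ADVP splitting step and the configurable tag lists are simplified).

# Sketch of POC identification over a bracketed parse tree (simplified:
# the ADJP/ADVP splitting step and the configurable tag lists are reduced).
from nltk import Tree

def is_preterminal(node):
    # a node whose single child is a leaf (a POS tag directly over a word)
    return isinstance(node, Tree) and len(node) == 1 and not isinstance(node[0], Tree)

def is_prepreterminal(node):
    # a node all of whose children are preterminals
    return isinstance(node, Tree) and len(node) > 0 and all(is_preterminal(c) for c in node)

def find_pocs(tree, consider=("NN", "VB"), ignore=("DT", "IN", "WDT")):
    pocs, used_leaves = [], set()
    # steps 1-2: prepreterminal NP/NN* nodes become POCs
    for node in tree.subtrees():
        if is_prepreterminal(node) and (node.label().startswith("NP") or node.label().startswith("NN")):
            span = tuple(node.leaves())
            pocs.append(" ".join(span))
            used_leaves.update(span)
    # steps 4-5: remaining preterminals, filtered by the POS-tag lists
    for node in tree.subtrees():
        if is_preterminal(node) and node[0] not in used_leaves:
            tag = node.label()
            if tag.startswith(consider) and not tag.startswith(ignore):
                pocs.append(node[0])
    return pocs

parse = Tree.fromstring("(SBARQ (WHNP (WDT which)) (SQ (NP (NNS rivers)) "
                        "(VP (VBP flow) (PP (IN through) (NP (NNP Texas))))))")
print(find_pocs(parse))  # ['rivers', 'Texas', 'flow']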


9.1.3 Consolidation

The consolidation algorithm aims to merge the output of the Ontology-based lookup and the Syntactic parsing and analysis by mapping the identified POCs into OCs. While this algorithm attempts to perform this step automatically, it may require attention from the user. This is the case when there are ambiguous OCs in the question which could not be resolved automatically, or when a POC could not be mapped to an OC automatically. More concretely, a Potential Ontology Concept is mapped to an Ontology Concept in two ways:

1. Automatically: if it overlaps with the Ontology Concept in a specific way:

• Both the POC and the OC refer to the same text span in the question (OC == POC). For example, in which rivers flow through Texas?, rivers can be identified as an OC referring to the class geo:River, while it can also be identified as a POC. In this case, the POC is automatically mapped to the OC, as OC == POC (the starting and ending offsets are identical).

• The POC refers to a text span which is contained within the span to which an OC refers (POC ⊂ OC).

• The OC is contained within the POC, which additionally contains a determiner (POC = DT + OC). If we look at the query Give me all former members of the Berliner Philharmoniker., the POC Identification Algorithm will find that the Berliner Philharmoniker is a POC, while the Ontology-based Lookup will find that Berliner Philharmoniker is an OC referring to an instance of mm:Artist. As the only difference between the POC and the OC text is a determiner (the), the consolidation algorithm will resolve this POC by removing it and verifying that this noun phrase refers to the OC (with dc:title Berliner Philharmoniker). (A sketch of these overlap checks is given after this list.)

2. By engaging the user: in cases where the system fails to automatically resolve a POC (or when it is configured to work in the forceDialog mode, see Section 9.6), it will generate a dialog.
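As a minimal sketch of the automatic overlap checks in case 1 (not the FREyA implementation), POCs and OCs can be treated as character spans in the question; the offsets in the example are illustrative.

# Sketch of the automatic POC -> OC consolidation checks, with POCs and OCs
# represented as (start, end, text) character spans in the question.
DETERMINERS = {"the", "a", "an"}

def consolidate(poc, oc):
    p_start, p_end, p_text = poc
    o_start, o_end, o_text = oc
    if (p_start, p_end) == (o_start, o_end):
        return "mapped"            # OC == POC: identical spans
    if o_start <= p_start and p_end <= o_end:
        return "mapped"            # POC contained within the OC span
    parts = p_text.split(maxsplit=1)
    if len(parts) > 1 and parts[0].lower() in DETERMINERS and parts[1] == o_text:
        return "mapped"            # POC = determiner + OC
    return "dialog"                # otherwise, engage the user

# e.g. POC "the Berliner Philharmoniker" vs OC "Berliner Philharmoniker"
print(consolidate((23, 50, "the Berliner Philharmoniker"),
                  (27, 50, "Berliner Philharmoniker")))   # -> "mapped"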


When the system fails to automatically generate the answer (or when it is configured to work in the forceDialog mode, see Section 9.6) it will prompt the user with a dialog. There are two kinds of dialogs in FREyA:

1. The Disambiguation dialog involves the user in resolving identified ambiguities in the question.

2. The Mapping dialog involves the user in mapping a POC to one of the suggested OCs.

While the two types of dialogs look identical from the user’s point of view, there are differences which we will highlight here. Firstly, we give a higher priority to the disambiguation dialog in comparison to the mapping dialog. This is because our assumption is that the question terms which exist in the graph (OCs) should be interpreted before those which do not (POCs). Note that FREyA does not attempt to interpret the whole question at once, but does so for one pair of OCs at a time. In other words, one resolved dialog can be seen as a pair of two OCs: an OC to which a question term is mapped, and the neighbouring OC (context). Secondly, the way the suggestions are generated for the two types of dialogs differs. The disambiguation dialog includes only the suggestions with Ontology Concepts that are the result of the ontology-based lookup (unless it is extended using the forceDialog mode, see Section 9.6). The mapping dialog, in contrast, shows suggestions that are found through ontology reasoning by looking at the closest Ontology Concepts to the POC (the distance is calculated by walking through the parsed tree). For the closest OC X, we identify its neighbouring concepts, which are shown to the user as suggestions. Neighbouring concepts include the properties defined for X, and also its neighbouring classes. Neighbouring classes of a class X are those defined to be 1) the domain of a property P where range(P)=X, and 2) the range of a property P where domain(P)=X. Finally, the sequencing of the dialogs is controlled differently for the two kinds of dialogs:

• The disambiguation dialogs are driven by the question focus or the answer type, whichever is available first: the closer the OC to be disambiguated is to the question focus/answer type, the higher the chance


that it will be disambiguated before any other. The question focus is the term/phrase which identifies what the question is about, while the answer type identifies the expected type of the answer, such as Person in the query Who owns the biggest department store in England?. The focus of this question would be the biggest department store (details of the algorithm for identifying the focus and the answer type are described in Damljanovic et al. [2010a]). After all ambiguities are resolved, the FREyA workflow continues by resolving all POCs through the mapping dialogs.

• The mapping dialogs are driven by the availability of OCs in the neighbourhood. We calculate the distance between each POC and the nearest OC inside the parsed tree, and the POC with the minimum distance is used for a dialog before any other.

9.1.4 The Disambiguation Dialog

For ambiguous OCs that are identified through the Ontology-based lookup, a dialog is modelled and the user needs to disambiguate the specific meaning. This dialog consists of the ambiguous term and the list of OCs. The user is then asked:

I struggle with [ambiguous term]. Is [ambiguous term] related to:
OC1
OC2
...
OCn

In QuestIO, we use an approach of automatic disambiguation of question terms: we consider all possible interpretations of the question and rank them before generating the SPARQL query – this way, we automatically disambiguate terms referring to more than one concept by simply excluding the interpretations which are not ranked first. This approach proved problematic for ontologies which have hundreds of property definitions. For example, the PROTON ontology, which is the core of the Travel Guides ontology (see Section 7.4.2), has more than 150 relations defined between the classes Country

and Continent. Disambiguating relations in this case requires mapping the user’s expression to a certain property, which is difficult due to the expressiveness of Natural Language, but also due to the large number of candidates to which a question term could be mapped. In FREyA, the automatic disambiguation can be corrected by involving the user in a dialog. For example, if someone is inquiring about Mississippi, we might not be able to automatically derive whether the query refers to geo:River5 or geo:State, because we do not have enough context for effective disambiguation. However, if the question is which rivers flow through Mississippi?, the context can help automatically derive that the question is about the Mississippi state, due to an existing relation in the ontology such as geo:River – geo:flowsThrough – geo:State.

9.1.5 The Mapping Dialog

For each POC that could not be automatically resolved, a dialog is modelled which consists of the unknown/POC term and a list of suggestions. The user is then asked:

I struggle with [POC term]. Is [POC term] related to:
suggestion 1 (OC1)
suggestion 2 (OC2)
...
suggestion n (OCn)

Note that while the OCs in the Disambiguation dialog are found through the ontology-based lookup, the OCs (suggestions) in the Mapping dialog are found through ontology reasoning – they are derived based on the closest OC to the POC term6. The closest OC is found by walking through the syntax tree. Based on the type of the closest OC, the rules for generating suggestions vary (see Table 9.1). Generating suggestions based on context ensures that any suggestion selected by the user can be used to generate the answer.

5For clarity of presentation, we use the prefix geo: instead of http://www.mooney.net/geo# in all examples.
6In specific cases the disambiguation dialog can be extended by generating suggestions using the Mapping Dialog rules described in Table 9.1, see Section 9.6.


Table 9.1: Generating suggestions based on the type of the nearest OC

Type of the closest OC              Generating suggestions
class or instance                   get all classes connected to the OC by exactly one property, and all properties defined for this OC
datatype property of type number    maximum, minimum and sum function of the OC
object property                     get all domain and range classes for the OC
datatype property value             get suggestions for the instance to which this value belongs

The option none is always added to the list of suggestions (see Table 9.2), unless FREyA is configured differently (see Section 9.6 on the different modes). This allows the user to ignore the suggestions if they are irrelevant. That is, the system assumes that the POC in the dialog should not be mapped to any of the suggested OCs, and therefore learns that this POC is either 1) incorrectly identified, or 2) cannot be mapped to any OC, as the ontology does not contain the relevant knowledge. While this option will not be of huge benefit to end-users, it is intended to identify flaws in the system and encourage improvements. The task of creating and ranking suggestions before showing them to the user is quite complex, and this complexity grows as the queried knowledge source grows.
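As a rough sketch of the class/instance row of Table 9.1 (a simplification of what FREyA obtains through the reasoner), the following collects the properties defined for a class and its neighbouring classes via rdfs:domain and rdfs:range with rdflib; the ontology file name is a placeholder.

# Sketch: suggestions for a POC whose nearest OC is a class, following the
# class/instance row of Table 9.1 (rdfs:domain/range only; no OWL reasoning).
from rdflib import Graph, RDFS, URIRef

g = Graph().parse("geography.owl")          # hypothetical local copy of the ontology
GEO = "http://www.mooney.net/geo#"

def suggestions_for_class(cls):
    out = set()
    for prop in g.subjects(RDFS.domain, cls):
        out.add(prop)                        # properties defined for the OC
        for rng in g.objects(prop, RDFS.range):
            out.add(rng)                     # classes reachable via one property
    for prop in g.subjects(RDFS.range, cls):
        out.add(prop)
        for dom in g.objects(prop, RDFS.domain):
            out.add(dom)
    return out

for suggestion in suggestions_for_class(URIRef(GEO + "City")):
    print(suggestion)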

Ranking suggestions

The initial ranking in FREyA is based on the string similarity between a POC term and the suggestions, and also on synonym detection:

String similarity. We combine the Monge Elkan7 metric with the Soundex8 algorithm. When comparing two strings, the former gives a very high

7 see http://www.dcs.shef.ac.uk/~sam/stringmetrics.html#monge
8http://en.wikipedia/wiki/Soundex


Table 9.2: Sample queries and generated suggestions for the identified POCs

Query                                                 POC                      Closest OC        Suggestions
population of cities in California                    population               geo:City          1. city population, 2. state, 3. has city, 4. is city of, 5. none
population of California                              population               geo:california    1. state population, 2. state pop density, 3. has low point, ..., n. none
which city has the largest population in California   the largest population   geo:City          1. max(city population), 2. min(city population), 3. sum(city population), 4. none

score to strings which are exact parts of the other. For example, if we compare population with city population, the similarity would be maximised, as the former is contained in the latter. The intuition behind this is that ontology concepts are usually named using camelCased names and are more explicit than the way they are usually referred to in natural language, e.g., cityPopulation, stateArea, projectName, and the like. The Soundex algorithm compensates for any spelling mistakes that the user makes – it gives a very high similarity to two words which are spelled differently but pronounced similarly.

Synonym detection. We use WordNet [Fellbaum, 1998] in order to retrieve synonyms of a POC. For example, if a question is What is the highest peak in the US?, although there is no mention of US in the ontology, WordNet would list The States as a synonym for US. This would match geo:State in the ontology and therefore this option would be ranked very high. (A simplified sketch combining both signals follows below.)
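The sketch below illustrates such a ranking with simplified stand-ins: a containment-based score in place of the Monge Elkan metric, a basic Soundex implementation, and WordNet synonyms via NLTK (which requires the WordNet corpus to be installed); it is not the FREyA ranking code.

# Sketch of suggestion ranking: a containment-based stand-in for the Monge
# Elkan metric, a basic Soundex for misspellings, and WordNet synonyms.
import re
from nltk.corpus import wordnet as wn   # requires the WordNet corpus

def soundex(word):
    codes = {"bfpv": "1", "cgjkqsxz": "2", "dt": "3", "l": "4", "mn": "5", "r": "6"}
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return ""
    encoded = word[0]
    for ch in word[1:]:
        digit = next((d for letters, d in codes.items() if ch in letters), "")
        if digit and digit != encoded[-1]:
            encoded += digit
    return (encoded + "000")[:4]

def synonyms(term):
    return {lemma.name().replace("_", " ").lower()
            for synset in wn.synsets(term) for lemma in synset.lemmas()}

def score(poc, suggestion):
    poc, suggestion = poc.lower(), suggestion.lower()
    s = 0.0
    if poc in suggestion or suggestion in poc:
        s += 1.0                                   # e.g. "population" vs "city population"
    if soundex(poc) == soundex(suggestion.split()[-1]):
        s += 0.5                                   # tolerate spelling mistakes
    if suggestion in synonyms(poc) or poc in synonyms(suggestion):
        s += 0.5                                   # synonym match, as in the US / The States example
    return s

candidates = ["city population", "state", "has city", "is city of"]
print(sorted(candidates, key=lambda c: score("population", c), reverse=True))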

When ambiguous OCs and all POCs are resolved, the query is interpreted as a set of OCs. At this point, there is enough information for identifying the answer type. Before going into details about the Answer type algorithm,

we first explain how generated OCs are combined into triples and used to generate the SPARQL query.

9.1.6 Combining Ontology Concepts into Triples and Generating SPARQL

The list of Ontology Concepts is prepared to conform to a structure that is suitable for generating triples. As the triples are of the form

SUBJECT - PREDICATE - OBJECT
CLASS/INSTANCE - PROPERTY - CLASS/INSTANCE/LITERAL

we first insert any potential joker elements in between OCs, if necessary. Jokers are wildcards or variables used instead of classes, instances, literals or properties to generate query interpretations in a triple format. At the time of generating these interpretations it is not known what kind of elements can be expected, and hence the jokers are used. The rules for inserting joker elements are as follows:

• If the first or the last element is a property, then we add a Joker element at the beginning or at the end of the list, respectively; the joker here is a variable representing a class, an instance or a datatype property value (literal).

• If any two classes, instances, or datatype property values in the list of OCs are next to each other, we insert a Joker element representing a property between them.

• If any two properties in the list of OCs are next to each other, we insert a Joker element representing a class/datatype property value between them.

For example, if the first two OCs derived from a question refer to a property and a class respectively, one joker class would be added before them. For instance, the query what is the highest point of the state bordering Mississippi? would be translated into the following list of OCs:

geo:isHighestPointOf   geo:State   geo:border   geo:mississippi
PROPERTY               CLASS       PROPERTY     INSTANCE

These elements are transformed into the following:

?       geo:isHighestPointOf   geo:State   geo:border   geo:mississippi
JOKER   PROPERTY1              CLASS1      PROPERTY2    INSTANCE
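A minimal sketch of the joker-insertion rules listed above, reproducing this transformation over a list of (concept, type) pairs (the type tags are simplified strings, not the FREyA element classes):

# Sketch of the joker-insertion rules over a typed list of Ontology Concepts.
# "thing" stands for a class/instance/datatype-property-value element.
def insert_jokers(ocs):
    result = list(ocs)
    if not result:
        return result
    if result[0][1] == "property":
        result.insert(0, ("?joker", "thing"))
    if result[-1][1] == "property":
        result.append(("?joker", "thing"))
    out = [result[0]]
    for concept, kind in result[1:]:
        prev_kind = out[-1][1]
        if kind != "property" and prev_kind != "property":
            out.append(("?joker", "property"))     # two non-properties adjacent
        elif kind == "property" and prev_kind == "property":
            out.append(("?joker", "thing"))        # two properties adjacent
        out.append((concept, kind))
    return out

query_ocs = [("geo:isHighestPointOf", "property"), ("geo:State", "class"),
             ("geo:border", "property"), ("geo:mississippi", "instance")]
# -> [('?joker', 'thing'), ('geo:isHighestPointOf', 'property'), ('geo:State', 'class'),
#     ('geo:border', 'property'), ('geo:mississippi', 'instance')]
print(insert_jokers(query_ocs))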

The next step is generating a set of triples from OCs, taking into account the domain and the range of the properties. For example, from the previous list, two triples would be generated9:

? - geo:isHighestPointOf - geo:State;
geo:State - geo:borders - geo:mississippi (geo:State);

The last step is generating the SPARQL query. The set of triples is combined and, based on the OC type, the relevant parts are added to the SELECT and WHERE clauses. Following the previous example, the SPARQL query would look like the following:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix geo: <http://www.mooney.net/geo#>
select ?firstJoker ?p0 ?c1 ?p2 ?i3
where {
  { ?firstJoker ?p0 ?c1 . filter (?p0=geo:isHighestPointOf) . }
  ?c1 rdf:type geo:State .
  ?c1 ?p2 ?i3 . filter (?p2=geo:borders) .
  ?i3 rdf:type geo:State . filter (?i3=geo:mississippi) .
}
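Such a generated query can then be executed against the repository; a minimal sketch with rdflib over a local copy of the GeoQuery ontology is shown below (the file name is a placeholder, and FREyA itself queries the semantic repository rather than an in-memory graph).

# Sketch: executing the generated SPARQL query over a local copy of the
# GeoQuery ontology with rdflib (the file name is a placeholder).
from rdflib import Graph

g = Graph().parse("mooney-geography.owl")

query = """
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix geo: <http://www.mooney.net/geo#>
select ?firstJoker ?p0 ?c1 ?p2 ?i3
where {
  { ?firstJoker ?p0 ?c1 . filter (?p0 = geo:isHighestPointOf) . }
  ?c1 rdf:type geo:State .
  ?c1 ?p2 ?i3 . filter (?p2 = geo:borders) .
  ?i3 rdf:type geo:State . filter (?i3 = geo:mississippi) .
}
"""
for row in g.query(query):
    print(row.firstJoker, row.c1)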

9.1.7 An Illustrative Example

Figure 9.2 shows the syntax tree for the query what is the population of new york. As new york is identified as referring to both geo:State and geo:City

9Note that if geo:isHighestPointOf had geo:State as its domain, the triple would look like: geo:State - geo:isHighestPointOf - ?;.

in the ontology, we first ask the user to disambiguate (see Figure 9.2 a)). If he selects city (geo:City), we start iterating through the list of POCs. The first POC (new york as a city) overlaps with an already identified ontology concept, which causes its immediate verification, so we skip it. The next one (population) is used to generate suggestions. Among them there will be city population (geo:cityPopulation), and after the user selects this from the list of available options, we verify that population refers to the datatype property geo:cityPopulation (see Figure 9.2 b)).

Figure 9.2: Validation of potential ontology concepts through user interaction: an example

An example of the generated suggestions for the same query is shown in Figure 9.3. Suggestions are made based on city (geo:City), which is the closest OC. If the user had selected state (geo:State), the list of suggestions would contain different options, starting with state population (geo:statePopulation) (see Figure 9.4). We can see the difference in the generated suggestions in the cases where the user selected that new york means the city and the state,

respectively. The answer that follows differs as well.

Figure 9.3: Generated suggestions and the result for city population of the new york city

Figure 9.4: Generated suggestions and the result for state population of new york state


9.2 Answer Type Identification

Most NLI systems classify questions based on their type, such as What, Why, Who, How, or Where, which is followed by the identification of an answer type. The answer type refers to the type of the expected answer, such as Person or Organisation for questions starting with Who.

In NLIs to unstructured data, such as open domain QA systems, the answer type is derived following the question classification. The two most common approaches are [Greenwood, 2006]: 1) manually constructed rules for automatic classification, and 2) fully automatically constructed classifiers – usually based on Machine Learning algorithms such as Nearest Neighbour (NN), decision trees (DT) and support vector machines (SVM). Both approaches have drawbacks. In the former case, rules are hand-crafted and therefore take a considerable amount of time to generate. In the latter case, automatic classifiers work well only if trained with a large amount of data, but even then their performance at runtime is a problem.

However, identifying the answer type is not always sufficient for finding answers. In Moldovan and Harabagiu [2000] the identification of the answer type is followed by the identification of the focus. According to Moldovan and Harabagiu [2000], a focus is a word or a sequence of words which define the question and disambiguate it by indicating what the question is looking for. For example, in what is the largest city in Germany? the focus is largest city. Figure 9.5 shows a part of the table from Moldovan and Harabagiu [2000] with examples of question categories, subcategories, answer types and focuses of questions. Unlike their approach, which is in line with traditional approaches used in open-domain QA systems, we skip the identification of the question category and first try to identify the question focus, which is used in subsequent steps to identify the answer type.

In NLIs to unstructured data, the answer type is derived from predefined taxonomies which are more or less fine-grained. In the case of NLIs to structured data, the answer type is usually aligned with the queried knowledge structure, as this can change over time (for example, if the ontology being queried changes, or if the system is ported to work with a different domain/ontology). It is not trivial to translate an arbitrary question


into a relevant logical representation or a formal query which will lead to the correct answer [Tang and Mooney, 2001].

Figure 9.5: Sample questions with the identified question type, the answer type and the focus (taken from Moldovan and Harabagiu [2000], p. 3)

However, when querying ontologies in order to find the answer to a question, question classification approaches can be avoided: unlike documents, which are unstructured and thus have to be processed carefully in order to locate the answer, with ontologies the definitions between concepts already exist, and taking advantage of this makes it possible to avoid strict adherence to syntax. In what follows, we describe our approach for deriving the answer type without classifying the question, but rather by combining the syntax tree with the semantics found in the ontology.

Figure 9.6 shows the workflow for the identification of the answer type. The QA Detector combines syntactic parsing with a set of heuristic rules in order to identify the focus. For a specific type of question, the focus is not so important for the identification of the answer type; these questions usually have an Answer Type Identifier (ATI). For example, while the focus in How long is Mississippi? is Mississippi, what we need to know in order to find the answer type is what How long refers to. Therefore, the QA Detector


would identify that How long is the ATI. During consolidation, How long is used together with the first ontology concept (geo:Mississippi) to generate suggestions for the user. The user’s selection is then saved as the answer type.

Figure 9.6: The workflow for the identification of the answer type

9.2.1 QA Detector

Similar to the POC Identification algorithm described previously in Section 9.1, the QA Detector combines the syntax tree with several heuristic rules. The high-level pseudo-code is as follows:

1. find X: the first prepreterminal
2. if X is NP or NN* then focus = X
3. if X is WHADVP || WHNP || WHADJP || ADJP
   - if the first child is WRB, and the second is JJ or ADJP, then Answer Type Identifier (ATI) = X
   - if there is only WRB then ATI = WRB
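A minimal sketch of these rules over a bracketed parse is given below (the WH-phrase handling is simplified, and the helpers mirror those in the POC sketch of Section 9.1.2).

# Sketch of the QA Detector rules: find the first prepreterminal and decide
# whether it yields the focus or an Answer Type Identifier (ATI).
from nltk import Tree

def is_preterminal(node):
    return isinstance(node, Tree) and len(node) == 1 and not isinstance(node[0], Tree)

def is_prepreterminal(node):
    return isinstance(node, Tree) and len(node) > 0 and all(is_preterminal(c) for c in node)

def detect(tree):
    for node in tree.subtrees():
        if not is_prepreterminal(node):
            continue
        label = node.label()
        if label.startswith("NP") or label.startswith("NN"):
            return ("focus", " ".join(node.leaves()))
        if label in ("WHADVP", "WHNP", "WHADJP", "ADJP"):
            children = [c.label() for c in node]
            if children[0] == "WRB" and len(children) > 1 and children[1] in ("JJ", "ADJP"):
                return ("ATI", " ".join(node.leaves()))      # e.g. "How long"
            if children == ["WRB"]:
                return ("ATI", node.leaves()[0])             # e.g. "Where"
        return ("no match", None)
    return ("no match", None)

parse = Tree.fromstring("(SBARQ (WHADVP (WRB How) (JJ long)) "
                        "(SQ (VBZ is) (NP (NNP Mississippi))))")
print(detect(parse))   # -> ('ATI', 'How long')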

The output of this algorithm can be:


• The focus of the question.

• The Answer Type Identifier (ATI). This indicates that an additional input is required from the user in order to assign the answer type to the question.

• No match found: the parser finds neither the focus nor the ATI.

9.2.2 FOC Finder

The FOC Finder identifies the First Ontology Concept (FOC) in a question which is of the class or datatype property type. This is because the answer type eventually refers to one of these two types of concepts in the ontology. Therefore, if the first ontology concept found refers to another type of ontology concept, the procedure is as follows:

• If the FOC refers to an object property: perform the consolidation with the domain or range classes of this property.

• If the FOC refers to an instance: perform the consolidation with a class of that instance.

9.2.3 Consolidation

For each query, the goal is to identify the answer type. Consolidation is an attempt to achieve this by merging the output of the QA Detector/HeadFinder with the FOC. While the focus itself is important for capturing relevant information which helps in finding the correct answer, the head of the focus is what we use in order to find the answer type. We identify the head of the focus using the ModCollinsHeadFinder class of the Stanford Parser package, which is a variant of the HeadFinder described in Collins [1999]. The consolidation algorithm can be described using the following pseudo-code:

if ATI != null then
  1. generate suggestions for the user
  2. the Answer Type = the user's selection

else if the head of the focus != null then
  consolidate it with the FOC and identify the Answer Type

Depending on the relation between the head of the focus and the FOC, we apply different rules in order to identify the answer type. Both the head of the focus and the FOC refer to a word or a set of words in the question. Therefore, they can either overlap, or be placed one before/after the other. In the case when the two overlap, the consolidation is performed as follows:

• Exact match: both the head of the focus and the FOC refer to the same word(s) in the question. Therefore, the FOC becomes the Answer Type of the question. For example, in What is the capital of Texas?, capital is the head of the focus (as capital is the head of the capital). The same string (capital) is annotated as the FOC, referring to geo:Capital in the ontology. As these two overlap, meaning that their start and end offsets are equal, the answer type of the question is geo:Capital.

• The FOC is contained within the head of the focus, or vice versa: the user is asked to decide whether the identified head of the focus refers to the FOC or not.

When the head of the focus and the FOC do not overlap, the consolidation is performed as follows:

• The head of the focus is before the FOC: for example, in what is the area of Idaho?, the focus is the area, and the answer type cannot be resolved without a dialog because the only ontology-based annotation in this question is Idaho, referring to geo:Idaho, a state; the user must choose one of the suggestions for area, generated based on the neighbouring Ontology Concept in the question (geo:Idaho). This is based on the same principles explained earlier in Section 9.1.5. More details are given in Section 9.2.4.

• The system failed to identify the focus or the ATI in the previous step: in this case the FOC becomes the answer type. For example, in what is the most populous state? the focus is not identified due to our


algorithm relying on prepreterminals; in this case, as the FOC refers to geo:State, geo:State becomes the answer type.

• The head of the focus is after the FOC: in this case, the FOC becomes the answer type. For example, in what state borders Michigan?, borders is incorrectly identified as the focus while state is annotated as the FOC; therefore, the answer type is consolidated into geo:State. This consolidation rule is usually used to correct mistakes of the parser. Another example is the query which rivers flow through Nevada?, where the parser identifies the focus to be Nevada, which is incorrect. During the consolidation phase, if rivers is identified as the FOC referring to geo:River, the identified focus is ignored and the answer type becomes geo:River. (A condensed sketch of these consolidation rules follows.)
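As noted above, a condensed sketch of these consolidation rules is given here, with the head of the focus and the FOC represented as character spans and the dialog branches reduced to a marker; this is a simplification, not the FREyA code.

# Sketch of answer-type consolidation between the head of the focus and the
# First Ontology Concept (FOC), both given as (start, end, concept) spans.
def consolidate_answer_type(head, foc):
    if head is None:
        return foc[2]                    # no focus/ATI found: FOC is the answer type
    h_start, h_end, _ = head
    f_start, f_end, foc_concept = foc
    if (h_start, h_end) == (f_start, f_end):
        return foc_concept               # exact match: FOC becomes the answer type
    if h_start <= f_start <= f_end <= h_end or f_start <= h_start <= h_end <= f_end:
        return "dialog"                  # one contained in the other: ask the user
    if h_end <= f_start:
        return "dialog"                  # head before the FOC: suggestions needed
    return foc_concept                   # head after the FOC: FOC wins (parser mistake)

# "what state borders Michigan?": head of focus = "borders", FOC = "state" (geo:State)
print(consolidate_answer_type((11, 18, "borders"), (5, 10, "geo:State")))  # -> geo:State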

9.2.4 Generating Suggestions

A list of suggestions is created based on the ontology reasoning rules, and ranked using a combination of synonym detection and string similarity, as previously described in Section 9.1.5. For example, in the case of How big is Alaska?, where Alaska is recognised as an instance of geo:State in the ontology, the suggestions would include the datatype properties related to geo:State, such as geo:stateArea and geo:statePopulation. Table 9.3 shows several examples of the identified ATIs and the suggestions generated based on the FOC in the question – the answer type will be identified after the user makes a selection. Table 9.4 shows the answer type identified for the questions which did not have any ATI.

During the consolidation phase (described in Section 9.2.3), we give priority to ontology concepts, particularly to the First Ontology Concept, when consolidating it with the focus. However, when consolidating an ATI with an ontology concept, the ATI is usually prioritised. This is because the ATI usually refers to WH-phrases, which occur at the beginning of the question. One example is How long is Mississippi?. Our algorithm would identify How long to be an ATI. On the other hand, long would be annotated as referring to the mountain Longs in the ontology. The reason is that the name longs is lowercased in the ontology, and therefore our gazetteer, which matches the root of a question term with a root from the ontology lexicalisations, matches long in How long with the root of longs. As this partially overlaps with the ATI How long, we ignore it, and proceed with generating suggestions and asking the user to choose what How long refers to. The suggestions are generated using the neighbouring ontology concept geo:mississippi, and among them there will be geo:length, which is the correct one.


Table 9.3: Sample queries with the identified ATI and the generated suggestions: the answer type will be the user's selection

Query                                       ATI        FOC                      Suggestions
How big is Alaska?                          How big    geo:Alaska (geo:State)   1. geo:stateArea, 2. geo:statePopulation, 3. geo:isCityOf, ..., n. none
How high is the highest point in America?   How high   geo:hiPoint              1. geo:hiElevation, 2. geo:isHighestPointOf, 3. geo:hasHighPoint, 4. none
Where is the highest point in Hawaii?       Where      geo:isHighestPointOf     1. geo:HiPoint, 2. geo:State, 3. none

Table 9.4: Sample queries with the identified focus and the answer type

Query                                  Focus                               FOC                      Suggestions                                         Answer Type
What rivers run through Colorado?      rivers (head: rivers)               geo:River                -                                                   geo:River
What is the smallest city in Alaska?   the smallest city (head: city)      geo:City                 -                                                   geo:City
What is the population of Idaho?       the population (head: population)   geo:Idaho (geo:State)    1. geo:statePopulation, 2. geo:stateArea, 3. none   the user's selection


One special case is when no ontology concepts are found in the question (FOC = null). In this case, we generate suggestions by showing the most generic concepts to the user, such as top classes and properties. We then ask the user to relate the identified focus to one of the suggested options.

9.2.5 An Illustrative Example

The result of running the algorithm for the identification of the answer type over what are the highest points of states bordering Mississippi? is shown in Figure 9.7.

Figure 9.7: Combining the syntactic parse tree with the ontology-based lookup

Words highlighted in red (states, mississippi) are those that refer to ontology concepts. Red lines (borders) are relations found based on ontology reasoning. The blue highlight (the highest points) refers to the focus identified by our algorithm. As, in this example, the identified focus is not related to any ontology concept, it is used to generate suggestions for the user. The user will be prompted with the dialog shown in Figure 9.8

206 Natural Language Interfaces to Conceptual Models

– the selected suggestion will be used to infer the answer type; for example, if he selects the first option the answer type will become geo:hiPoint.

Figure 9.8: The clarification dialog preceding the identification of the answer type

The trade-off is that with an initially untrained system the user will see many more options than the ones he is interested in; however, the learning mechanism which works behind the scenes will move the correct ones towards the top over time. This is explained in Section 9.5.

9.3 What to Show: Presentation of Results to the User

In this section we describe how we use the answer type, identified as described in Section 9.2, in order to show a concise answer to the user and also in order to show feedback.

Natural Language Interfaces for querying ontologies translate Natural Language into formal languages such as SPARQL. This translation is what most of the existing NLIs focus on, and the problem of showing the results to the user is somewhat de-emphasised. In Chapters 5 and 6 we discussed the performance of various NLIs to ontologies, and analysed the low performance and error rates, which often seem to be caused by the way the result is shown (or not shown) to the user. Some of the reasons which were elaborated are:

• The knowledge is not in the ontology/knowledge base, but the system is not capable of guiding the user to change the topic (as the answer to the initial query cannot be found due to the lack of knowledge).

• Feedback messages are not helpful, i.e. the user cannot figure out how to proceed further.

• Users have assumptions/misconceptions about the system capabilities and the supported language.

In Chapter 6 we discussed the importance of various usability methods which can address some of these problems. One of the most important aspects to consider is that the user should feel confidence and trust when using the system, and one method which matters in that context is feedback: showing the system's interpretation to the user and communicating clearly what the system understood. Based on the results of the user-centric evaluation of feedback in Chapter 8, we modified our initial implementation so that, in addition to the question interpretation, we use the identified answer type in FREyA to:

• display the concise answer to the user, and
• show feedback in the graph-based view.

9.3.1 Display the Concise Answer

As previously discussed, the answer type in FREyA is mapped to an ontology concept which can be either a class or a datatype property. Other ontology concepts are resolved to these two during the consolidation phase (see Section 9.2). Based on the type of the ontology concept, we use different, albeit similar, patterns for displaying the concise answer:

• Answer type is mapped to a class: in this case, the answer is usually the list of instances of this class, and the pattern looks like the following:

  CLASS (number of answers): instance 1, instance 2, ..., instance n

• Answer type is mapped to a datatype property: the answer is the value of this property and the pattern is as follows:

  DATATYPE PROPERTY (number of values): value 1, value 2, ..., value n

For example, in the case of Show lakes in Minnesota, lakes is identified as referring to the ontology concept geo:Lake, which is the answer type of the question. As geo:Lake is a class, we render it as shown in Figure 9.9.
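A minimal sketch of this rendering step (the method and the instance labels below are illustrative, not FREyA's implementation):

    import java.util.List;

    // Sketch of the shared display pattern: the label of the answer type (a class or a
    // datatype property) followed by the number of answers and the answers themselves.
    public class ConciseAnswer {

        static String render(String answerTypeLabel, List<String> answers) {
            StringBuilder sb = new StringBuilder();
            sb.append(answerTypeLabel).append(" (").append(answers.size()).append("):\n");
            for (String a : answers) sb.append("  ").append(a).append('\n');
            return sb.toString();
        }

        public static void main(String[] args) {
            // "Show lakes in Minnesota": the answer type geo:Lake is a class, so the
            // answers are instances; for a datatype property they would be literal values.
            System.out.print(render("geo:Lake", List.of("superior", "lake of the woods")));
        }
    }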

Figure 9.9: The answer to the query Show lakes in Minnesota

9.3.2 Feedback: the Graph-based View

According to the user-centric evaluation presented in Chapter 8, the users liked the interactive feature of the tree-based representation, where they could click on the node of interest and explore details further. However, the tree-based view had some disadvantages which could not be easily overcome: translating the graph returned by SPARQL into the tree-like structure required by our interface caused problems in some cases. Therefore, we adopted a more intuitive approach which renders the graph with all its nodes, and which can be navigated when there are too many of them. A sample graph is shown in Figure 9.10, where the answer type is placed in the centre, while the answer is available on the nearest circle. The user can click on any node in order to investigate it further – each click will cause the graph to be re-rendered with the clicked node placed in the centre. We used the JIT library10 for the visualisation of this graph (as well as for the visualisation of the tree presented in Chapter 8).

10www.thejit.org

Figure 9.10: The graph showing the system’s interpretation of the query Show lakes in Minnesota

9.4 Enriching Lexicon through User Interaction

QuestIO's approach, described in Section 7.1, generates the initial lexicon automatically from the ontology lexicalisations. With FREyA, we extend that approach with WordNet [Fellbaum, 1998], and by involving the user in the loop to enable the incremental enrichment of the lexicon over time (see Figure 9.11). When a user starts using the system, if a question term is not found in the lexicon, the Mapping dialog is modelled and the user is asked to map the unknown term to an ontology concept11; following his selection, the new term is added to the lexicon. In addition, the lexicon carries the semantics related to the context in which a certain word appeared.

The approach of extending the vocabulary is very similar to the one used in AquaLog [Lopez et al., 2007]; however, there are a few differences which we will highlight here. Firstly, our model does not distinguish input from individual users: building the lexicon is a collaborative effort where the input of one user is used for all the others. Our decision not to personalise this learning feature is influenced by the recent emergence of social networks, which have shown the advantages of collaborative intelligence. Secondly, to the best of our knowledge, the learning mechanism in AquaLog is used for learning ontology relations only, when parts of the linguistic triples (verbs, or sometimes nouns) are associated with relations by involving the user in a dialog. Our approach is more generic in that it is applied to any ontology element, not only relations. The notion of using context is also inherited from AquaLog; however, the context is modelled differently, as will be detailed in Section 9.5.

11If the unknown term cannot be mapped to any of the generated suggestions, the user can choose the option none, which will cause the unknown term in question to be ignored.

Figure 9.11: Extended Vocabulary in FREyA

Extending the existing lexicon from the user’s vocabulary is performed through the following steps:

1. Perform the ontology-based lookup. This step is previously described in Section 9.1.1, and its role is to link question terms to logical forms in the ontology. For example, in What is the population of New York?, New York would be linked to two Ontology Concepts, one referring to geo:newYork and the other referring to geo:newYorkNY, because it matches the label of these two URIs, which is New York. This step is identical to the one used in QuestIO (see Section 7.1).


2. Analyse grammar and identify the candidate words which could be referring to an Ontology Concept; these are called Potential Ontology Concepts or POCs. This step is previously described in Section 9.1.2. For example, if there is no word population in the ontology which can be used to annotate the above question about New York, then, as population is a noun, it would be identified as a POC. However, we do not know to which concept in the ontology this noun refers, and therefore we model the dialog.

3. Dialog modelling: if a POC cannot be mapped to a logical form automatically, ask the user to map the unknown term (POC) to the specific ontology concept using the Mapping Dialog (see Section 9.1.5). In addition, if a question term refers to more than one OC, generate the disambiguation dialog and ask the user to choose (see Section 9.1.4). For example, What is the population of New York? is ambiguous, as it can be translated into two interpretations: the first is the state population of New York (state) and the other the city population of New York (city).

4. Add the term to the lexicon as a description of the OC. This description includes the context in which the term appears, so that it can be reused in a similar context. If the term was already in the lexicon, its ranking in the specific context is updated, as will be detailed in Section 9.5. We previously discussed the What is the population of new york? example in Section 9.1.7. Figure 9.3 illustrates how population is mapped to geo:cityPopulation whenever it appears together with New York as a city. If the same word (population) is used together with New York state, then it might need to be mapped to a different form such as geo:statePopulation (see the previously discussed Figure 9.4)12. A minimal sketch of this step is given after this list.
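The sketch below assumes a simple in-memory lexicon; the entry fields mirror the JSON lexicon shown later in this section, but the field and class names are illustrative rather than FREyA's actual data structures:

    import java.util.*;

    // Sketch of step 4: after the user maps an unknown term (POC) to an ontology
    // concept, the term is stored together with its context (the closest OC) and an
    // optional function, so that it can be reused for similar questions.
    public class LexiconEnrichment {

        record Entry(String term, String contextConcept, String mappedConcept, String function) {}

        static final List<Entry> lexicon = new ArrayList<>();

        static void learn(String term, String contextConcept, String mappedConcept, String function) {
            lexicon.add(new Entry(term, contextConcept, mappedConcept, function));
        }

        static Optional<Entry> lookup(String term, String contextConcept) {
            return lexicon.stream()
                          .filter(e -> e.term().equals(term) && e.contextConcept().equals(contextConcept))
                          .findFirst();
        }

        public static void main(String[] args) {
            // "What is the smallest city in Alaska?": the user selects min(geo:cityPopulation)
            learn("smallest", "geo:City", "geo:cityPopulation", "min");
            System.out.println(lookup("smallest", "geo:City").orElseThrow());
        }
    }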

Table 9.5 shows several questions and the question terms which are initially not found in the lexicon. The query term (POC) and the context (OC) are used when generating suggestions, as explained in Section 9.1.5. Following the user's selection, the new term will be added to the lexicon. For example, if smallest was not in the lexicon initially, and the user selects min(geo:cityPopulation) from the list of suggestions, then this term would be added to the lexicon together with its context, which is geo:City in this case.

Figure 9.12: A sample dialog in FREyA

Table 9.5: Sample queries and generated suggestions

Query | Query term | Context (OC) | Candidates
What is the smallest city in Alaska? | smallest | geo:City | 1. min(geo:cityPopulation), 2. max(geo:cityPopulation), 3. sum(geo:cityPopulation), 4. none
What is the population of Idaho? | population | geo:Idaho (geo:State) | 1. geo:statePopulation, 2. geo:stateArea, 3. none

12Note that the system can also work in an automatic mode where it simulates the user's selection of the best ranked option, without the need to engage the user in a dialog. This is discussed later in Section 9.6.

The lexicon, dynamically enriched from the user-defined vocabulary, is used by FREyA13; however, it can easily be used by any other NLI system. Currently, its format is JSON and looks like the following:

"Key: largest http://www.mooney.net/geo#State", "identifier": "http://www.mooney.net/geo#stateArea", "function":"max" which means that if largest occurs followed by a lexicalisation of geo:State, then this should be mapped to geo:stateArea with the maximum function.

13http://gate.ac.uk/freya


The lexicon is enriched with the term largest, which did not have any semantics attached before the user selected it through the dialog. Translating this JSON format into a knowledge representation such as OWL, in a way which can then be used by any NLI system, is straightforward. For example, the format of the OWL file could be such that the ACE OWL Verbaliser14 generates proper ACE sentences, so that the lexicon (content words) of ACE can be enriched.

9.5 Learning from the User’s Selection

Applying learning to NLIs seems analogous to applying it to Information Retrieval (IR) systems. In IR, the input is a query which is usually a set of keywords; these provide the most obvious set of features on which classification can be based [Belew, 2000]. This results in very large and sparse learning problems. Supervised learning requires a set of questions with the right answers in order to give satisfactory performance. Unfortunately, as noted by Belew [2000], there are many situations where we do not know the correct answers.

In supervised learning, every aspect of the learner's action can be contrasted with the corresponding features of the correct action. On the other hand, a semi-supervised approach such as Reinforcement Learning (RL) aggregates all these features into a single measure of performance. Therefore, reinforcement seems better suited for users, as there is less cognitive overhead. In addition, as pointed out by Sutton and Barto [1998, p.32]:

“A supervised learning system cannot be said to learn to control its environment because it follows, rather than influences, the instructive information it receives. Instead of trying to make its environment behave in a certain way, it tries to make itself behave as instructed by its environment.”

Semi-supervised learning such as RL allows starting with an empty model which can be randomly initialised, and then updated based on the user’s input.

14http://attempto.ifi.uzh.ch/site/tools/


In our case we use a simplified approach inspired by RL to improve the ranking of the suggestions which are shown to the user:

• in the case of ambiguities, i.e. when a query concept is mapped to several ontology concepts;

• when a query concept is not automatically mapped to an ontology concept, but our system identifies it as a potential ontology concept.

Our goal is to learn the ranking of the suggestions shown to the user. We decided to use a semi-supervised approach for several reasons. Firstly, supervised learning goes hand in hand with automatic classification of the question, where each question is usually identified as belonging to one predefined category. Our intention is to avoid this automatic classification and allow users the freedom to enter queries of any form. Secondly, we want to minimise the customisation of the NLI system which is required when using supervised learning in order to map parts of the query to the underlying structure. For example, we want the system to suggest that where should be mapped to a specific ontology concept such as Location, rather than the application developer browsing the ontology structure in order to define this mapping.

In RL, an agent learns how to achieve correct rankings by trial-and-error interactions with its environment. Based on the knowledge which is available to the agent, suggestions are ranked and shown to the user. In the standard reinforcement learning model, an agent interacts with its environment by sensing it, and based on this sensory input chooses an action to perform in the environment. The action changes the environment in some manner, and this change is communicated to the agent through a scalar reinforcement signal. There are three fundamental parts of a reinforcement learning problem: the environment, the reinforcement function, and the value function.

9.5.1 Environment

The environment encodes all knowledge which is exposed to the agent; in our case this is a set of three states: the beginning state, the desired state and the undesired state. The initial state is represented by a list of initially ranked suggestions. The set of actions available at the initial state is the set of generated suggestions. Depending on the action which is taken (the suggestion selected from the list), the agent might end up in either the desired or the undesired state. The desired state is the state which is reached after the user selects a suggestion from the list of those which are available.

9.5.2 Reinforcement Function

Our learning algorithm is inspired by a pure delayed reward reinforcement function [Sutton and Barto, 1998]: the reward is zero after the user selects a clarification option, except when an action results in a win (a satisfying answer) or a loss (a wrong answer or no answer), in which case the agent receives a reinforcement of +1 for a win and -1 for a loss. Because the agent is trying to maximise the reinforcement, it will learn that states corresponding to a win are goal states, and that states resulting in a loss are to be avoided.

9.5.3 Value Function

The value function is a mapping from states to state values, and is expressed using the Bellman equation (Equation 9.1). In order to decide which action to take, an agent usually follows a policy – a mapping from states to actions.

The value of state x_t under the optimal policy is the sum of the reinforcements r(x_t) obtained when starting from state x_t and performing optimal actions until a terminal state is reached:

    V^*(x_t) = r(x_t) + \gamma V^*(x_{t+1})                  (9.1)

We initialise the value function based on string similarity and synonym detection, as described in Section 9.1.5. The discount factor (γ) is 1. When the user changes the selection (selects an option other than the first one suggested by the agent), the agent learns that the previous rankings were not correct, and recalculates its value function. We assume that the action selected by the user is the desired one, and therefore give a reinforcement of +1 to that action, while we give -1 to all the others. Therefore, if the initial ranking was wrong, there is a good chance that it is corrected after only one user chooses the right option. For example, for the question how many people live in florida?, the closest OC to the POC people is geo:florida, which is a state. Our ranking mechanism would place the correct suggestion (geo:statePopulation) in 14th place, as there is no significant similarity between people and state population, at least according to our initial ranking algorithm (Section 9.1.5). Figure 9.13 shows the initial state values, the reinforcement received after the user selects geo:statePopulation, and finally the rankings after recalculation15.
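A minimal sketch of this update; the suggestion list is reduced to three illustrative entries with made-up initial scores, and the class is ours rather than FREyA's implementation:

    import java.util.*;

    // Sketch of the simplified, RL-inspired update: state values are initialised from
    // the string-similarity/synonym score, the suggestion selected by the user receives
    // a reinforcement of +1 and all others -1, and the list is re-ranked by value.
    public class SuggestionRanker {

        static final Map<String, Double> value = new LinkedHashMap<>();

        static void initialise(Map<String, Double> similarityScores) {
            value.putAll(similarityScores);
        }

        static void reinforce(String selected) {
            for (String s : value.keySet())
                value.merge(s, s.equals(selected) ? 1.0 : -1.0, Double::sum);
        }

        static List<String> ranked() {
            return value.entrySet().stream()
                        .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                        .map(Map.Entry::getKey).toList();
        }

        public static void main(String[] args) {
            // illustrative initial similarity scores for "how many people live in florida?"
            initialise(Map.of("geo:isCityOf", 0.4, "geo:borders", 0.3, "geo:statePopulation", 0.1));
            reinforce("geo:statePopulation");   // the user picks the correct suggestion
            System.out.println(ranked());       // geo:statePopulation is now ranked first
        }
    }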

Figure 9.13: Mapping how many people to geo:statePopulation in the ontol- ogy

9.5.4 Generalisation of the Learning Model

We use the ontology as the source for designing a generic learning model. When an OC is related to another concept with an rdfs:subClassOf relation, that concept is also used to learn the model. For example, if the features are extracted for an OC of type class, e.g. geo:Capital, the same features would be applicable for the OC geo:City, because geo:Capital rdfs:subClassOf geo:City.
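A minimal sketch of this generalisation over a tiny, hard-coded class hierarchy; the rule store and all names below are illustrative only:

    import java.util.*;

    // Sketch of the generalisation step: a rule learned in the context of a class is
    // also stored for its superclasses (geo:Capital rdfs:subClassOf geo:City), so that
    // features extracted for geo:Capital are also applicable for geo:City.
    public class LearningGeneralisation {

        // child class -> parent class
        static final Map<String, String> subClassOf = Map.of("geo:Capital", "geo:City");

        // (term + context class) -> learned mapping
        static final Map<String, String> rules = new HashMap<>();

        static void learn(String term, String contextClass, String mapping) {
            for (String c = contextClass; c != null; c = subClassOf.get(c))
                rules.putIfAbsent(term + "|" + c, mapping);
        }

        static String lookup(String term, String contextClass) {
            return rules.get(term + "|" + contextClass);
        }

        public static void main(String[] args) {
            learn("population", "geo:Capital", "geo:cityPopulation");
            // the rule learned for geo:Capital is reused for its superclass geo:City
            System.out.println(lookup("population", "geo:City"));
        }
    }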

15For the sake of clarity, we only show a subset of generated suggestions in Figure 9.13.


In addition, we do not update our learning model per question, but per combination of a POC and the closest OC. We also preserve a function over the selected suggestion, such as minimum, maximum, or sum (applicable to datatype property values). This way we may extract several learning rules from a single question, and if the same combination of a POC and an OC appears in another question, we can reuse it. Table 9.6 shows several sample questions and the derived features which are used to learn the model.

Table 9.6: Features used for learning the model

Question | POC | Context | Function | Correct rank
what is the smallest city in the us? | smallest | geo:City | min | geo:cityPopulation
What is the population of tempe arizona? | population | geo:City | – | geo:cityPopulation
what is the population of the capital of the smallest state? | population | geo:Capital | – | geo:cityPopulation
(same question) | smallest | geo:State | min | geo:statePopulation

Figure 9.14 demonstrates how our learning algorithm works for the query What is the highest point of the state with the largest area?. There is only one token (state) annotated as referring to an OC, whereas there are three POCs. We start with the last POC, largest area. Suggestions are generated based on the closest OC, which is geo:State in this case (see Figure 9.15). As one of the options will be a datatype property referring to geo:stateArea, the user is very likely to select this from the list of available options. We would then resolve that area refers to geo:stateArea, whereas largest is still a candidate for generating further suggestions and asking the user whether it relates to an operation over geo:stateArea, such as finding its minimum or maximum value. With time, the system will learn to associate largest area with the maximum function of geo:stateArea, even if this combination appears in a context which is not the same but similar.

We then skip the next POC (state) as it overlaps with the ontology concept geo:State. The last POC, the lowest point, is then used to generate suggestions. In this step we use the closest OC, which is, again, geo:State. There will be several suggestions, and the user is very likely to select the property named geo:isLowestPointOf, although this one will be ranked in third place. Note that although lowest is a superlative, it will not be further used to generate suggestions for the user, as geo:isLowestPointOf is an object property. However, for the next user, the system will have learned to rank isLowestPointOf first.

Figure 9.14: Validation of POCs through the user interaction: preparing for the dialog

Figure 9.15: Validation of POCs through the user interaction: the user is in control


9.6 Portability

FREyA is a portable NLI in the sense that it can easily be ported to work with a different ontology, or a set of ontologies, available either from the Web or on the local file system. It can either preload the ontologies into its own repository, which is based on OWLIM16, or connect to an already existing repository, which can be local or remote.

In order to perform the ontology-based lookup at query processing time, FREyA requires extracting the ontology lexicalisations, processing them, and adding them to an index. The extraction of ontology lexicalisations requires reading the whole repository through a set of SPARQL queries. The number of SPARQL queries depends on the size of the schema which describes the dataset.
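As an illustration only – the actual set of queries issued by FREyA depends on the schema of the loaded dataset, and rdfs:label is just one possible source of lexicalisations – a query of this kind might look as follows:

    // Illustration only: one kind of SPARQL query that an initialisation step could use
    // to read lexicalisations (here rdfs:label values) for indexing.
    public class LexicalisationQuery {
        public static void main(String[] args) {
            String query =
                "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" +
                "SELECT ?resource ?label\n" +
                "WHERE { ?resource rdfs:label ?label }";
            System.out.println(query);
        }
    }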

FREyA does not require strict adherence to syntax; however, it relies on the ontology-based lookup. Trying a sample query What is the capital of France? with FREyA initialised with a superset of DBpedia (accessed through the http://www.factforge.net/sparql repository) revealed that, according to the extracted lexicon, each word in the question refers to at least one Ontology Concept. If there were no automatic disambiguation or heavy grammar analysis, the system would model the first dialog asking What is 'what'? Is 'what' related to: LIST OF URIs. A similar dialog would be modelled for 'is': the system would ask the user whether is is related to be, was, or were. And so on, for each word in the question.

These situations must be resolved either by performing automatic disambiguation (which might be expensive for datasets with billions of triples) or by constraining the supported language and allowing the user to type in only a limited set of question types. If the system fails to automatically interpret the question, it can seek help from the user, as is the case with FREyA. The fine balance lies in the combination of these approaches: disambiguate as much as possible using ranking mechanisms (e.g., those that exist in FREyA, or any other methods for effective ranking such as in Lopez et al. [2009a]), and correct the interpretations if necessary using the interactive features of FREyA.

16http://ontotext.com/owlim


The expressiveness of the language supported by FREyA comes with a trade-off, which is especially pronounced when coupled with the heterogeneity of Linked Open Data. Indeed, when trying any dataset with FREyA for the first time, it is advisable to use the dialog as much as possible in order to check the system's interpretations and correct them if necessary. In that regard, there are several modes that can be used:

• Automatic mode: the dialog in FREyA is designed so that it can be configured based on a level of confidence. The maximum confidence level allows using FREyA in the automatic mode, meaning that the system will generate the answer by simulating the selection of the best ranked options. This mode is used when the confidence that the ranking is effective is high, or when the system has been trained enough and can make the decisions on its own.

• ForceDialog mode operates on two levels:

1. Ignoring the system's attempt to perform the mapping by adding a 'None element'. This element is used to ignore the system's attempt to map a question term to an OC. That is, the system will assume that the question term in the dialog should not be mapped to any of the suggested OCs, and will therefore learn over time that this POC/OC is either: 1) incorrectly identified, or 2) cannot be mapped to any OC because the ontology does not contain the relevant knowledge. As previously discussed, this option is not likely to provide much benefit to the end-users, but it is intended to identify flaws in the system and encourage improvements.

2. Extending the disambiguation dialog by adding more suggestions, in addition to the OCs identified through the Ontology-based Lookup. This option is important when the knowledge base has a large number of names (e.g., MusicBrainz), so that any question contains a rich set of Ontology Concepts while the underlying grammar would be somewhat ignored. For example, in the question Which members of the Beatles are dead?, due to the huge number of string literals dead appearing in the ontology, this element would be annotated as referring to several OCs (such as instances of rdf:type mm:Album) while it actually needs to be mapped to the property endDate.

9.7 Evaluation

While QuestIO and the feedback in FREyA were evaluated with users, in order to compare FREyA with the state of the art we evaluated the system using 250 questions from the Mooney GeoQuery dataset. Although the ontology contains a rather small portion of the knowledge about United States geography, the questions are quite complex, and the system must have a good understanding of their semantics in order to answer them correctly. In addition, other NLIs have been evaluated using this dataset, and therefore, by conducting the evaluation with the same ontology and the same set of questions, we can compare our performance with the state of the art. We evaluate correctness (Section 9.7.1), learning (Section 9.7.2), ranked suggestions (Section 9.7.3), and answer type identification (Section 9.7.4).

Further, to demonstrate the portability of FREyA and its suitability for querying Linked Data in a realistic scenario, we present experiments in Section 9.7.5.

9.7.1 Correctness

We report the correctness of FREyA in terms of precision and recall (see Section 4.2 for the definitions of precision and recall).

Recall and precision values are equal, reaching 94.4%. This is because FREyA always returns an answer, although possibly partial or incorrect. 34 questions were answered correctly without requiring any dialog with the user, while the remaining 202 required at most 4 dialogs in order to return the correct answer (see Figure 9.16). The system failed to answer 14 questions (5.6%), 5 of which require features not supported by the system, such as negation or comparison, e.g. which states have points higher than the highest point in colorado?. The remaining 9 were incorrectly interpreted.


Figure 9.16: The distribution of the number of dialogs for 202 correctly answered questions

Although FREyA required quite significant user input, its performance compares favourably to other similar systems. PANTO [Wang et al., 2007] is a similar system which was evaluated with the Mooney geography dataset of 877 questions (they removed duplicates from the original set of 879); they reported precision and recall of 88.05% and 85.86% respectively. NLP-Reduce [Kaufmann et al., 2007] was evaluated with the original dataset, reporting 70.7% precision and 76.4% recall. Kaufmann et al. [2006] selected 215 questions which syntactically represent the original set of 879 queries; they reported evaluation results over this subset for their system Querix, with 86.08% precision and 87.11% recall. Our 250 questions are a superset of these 215.

In order to test the statistical significance of our results, we calculated a 95% confidence interval for the precision and recall. As we only have one test set, we used the bootstrap sampling technique17. This method was also used in the CoNLL-03 competition (see Sang and Meulder [2003] and also Li et al. [2009]). The 95% confidence interval with 1000 samplings ranges from 91.6% to 97.2%. As the lower bound is still higher than the best previously evaluated system (88.05% precision of PANTO [Wang et al., 2007], and 87.11% recall of Querix [Kaufmann et al., 2006]), we conclude that the precision and recall values obtained with FREyA are significantly better (p=0.05) than the precision and recall of other systems trialled with the same dataset. It should be noted, however, that this high performance of FREyA was achieved by engaging the user in dialog. Querix also relies on dialogs, while PANTO answers questions automatically. What makes FREyA stand out is the possibility of putting the user in control and improving the performance incrementally with each user's new question, by boosting the rankings through learning from the user's clicks. In the next section, we describe the evaluation of our learning mechanism and its effect on performance.

17Given the set of 250 test samples that we used for computing the precision and recall, T = s1, s2, s3, ..., s250, with P = 94.4, we performed the sampling 1000 times: in step i, we obtain a set Ti by randomly sampling T with replacement. Ti has 250 samples, but some of them may be repeated. We then compute the precision on Ti to obtain Pi. Once the 1000 precisions P1, P2, ..., P1000 are calculated, we sort them from low to high to obtain P1', P2', ..., P1000'. The 95% confidence interval is then [P25', P975'].
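The bootstrap procedure described in footnote 17 can be sketched as follows; the vector of per-question outcomes is synthetic (236 correct out of 250, matching the reported 94.4%), and the class name is ours rather than part of FREyA:

    import java.util.*;

    // Sketch of the bootstrap: resample the 250 test outcomes with replacement 1000
    // times, compute the precision of each resample, and take the 2.5th and 97.5th
    // percentiles of the sorted precisions as the 95% confidence interval.
    public class BootstrapCI {
        public static void main(String[] args) {
            int n = 250, correct = 236, resamples = 1000;
            boolean[] outcomes = new boolean[n];
            Arrays.fill(outcomes, 0, correct, true);

            Random rnd = new Random(42);
            double[] precisions = new double[resamples];
            for (int i = 0; i < resamples; i++) {
                int hits = 0;
                for (int j = 0; j < n; j++)
                    if (outcomes[rnd.nextInt(n)]) hits++;
                precisions[i] = 100.0 * hits / n;
            }
            Arrays.sort(precisions);
            // the 25th and 975th sorted values, i.e. [P25', P975']
            System.out.printf("95%% CI: [%.1f, %.1f]%n", precisions[24], precisions[974]);
        }
    }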

9.7.2 Learning

We evaluate our learning algorithm using cross-validation on 202 questions which are a subset of the above 250 – those that could be answered correctly and which required at least one dialog.

Cross-validation is a statistical method used to evaluate and compare learning algorithms by dividing data into two segments: one used to train a model and the other used to test it. In typical cross-validation, the training and validation sets must cross over in successive rounds such that each data point has a chance of being validated against [Refaeilzadeh et al., 2009]. The basic form of cross-validation is k-fold cross-validation, where the data is first partitioned into k equally (or nearly equally) sized folds. Subsequently, k iterations of training and validation are performed such that within each iteration a different fold of the data is held out for testing while the remaining (k-1) folds are used for training. We performed a 10-fold evaluation using the subset of the Mooney GeoQuery questions which could be correctly answered:

• 5 questions were not supported by the system and were removed, as there was no possibility of mapping them to the relevant ontology concepts and obtaining the correct answer,

• 9 questions were misinterpreted by the system,

• 34 could be answered automatically, so they were removed.

This resulted in 202 questions requiring 343 dialogs in total. In the 10 iterations, 181/182 questions were used for training the model, while the remaining 21/20 were used for testing it. Before executing the test, we generated a gold standard in two steps:

• We ran FREyA in the automatic mode where, for any required dialog, the system would choose the first available option, save the learning items and move on to the next question.

• We then manually examined the output and corrected invalid entries. If we had to change an entry, we marked it as incorrect. This enabled us to measure the performance of the baseline system.

The goal of this evaluation was to test whether our learning algorithm can improve the performance of the system. In order to assess this, we compare the precision of the trained system with the performance of the baseline. The results are shown in Table 9.7 and in Figure 9.17. The average precision for the system trained with 9/10 of the questions was 0.48, which is 0.2324 higher than the baseline. While this is a good improvement over the baseline model, the performance is not outstanding. Looking into the questions which could not be answered using our trained system, the reasons are:


Table 9.7: Precision for 250 questions evaluated using 10-fold cross-validation

Fold | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Avg.
Baseline | .3 | .15 | .2 | .25 | .24 | .3 | .3 | .35 | .15 | .19 | 0.2476
Learning | .65 | .4 | .65 | .4 | .24 | .55 | .5 | .6 | .35 | .48 | 0.48

Figure 9.17: Precision for the trained vs. baseline system using 10-fold cross-validation


Ambiguity: 30 questions were not correctly answered due to ambiguity. The advantage of our learning model is its simplicity: it is based on very few features, ensuring that questions with similar word pairs benefit from training with similar, and not necessarily identical, questions. However, this is at the same time a drawback, as it can introduce ambiguities. For example, if the system learns from what is the highest point of nebraska? that point refers to geo:HiPoint whenever it appears in the context of geo:Country, then, for similar albeit drastically different questions, the system would use knowledge which might be wrong. Namely, for the question what point is the lowest in california? the system would find the previously learned mapping and associate point with geo:HiPoint, whereas the correct mapping is geo:LoPoint. This indicates that we should extend the context of our learning model to consider the whole phrase in which the 'unknown' term appeared, so that for the mentioned example, whenever point appears in the context of geo:Country

• AND highest, map it to geo:HiPoint.
• AND lowest, map it to geo:LoPoint.

Sparsity: 65 questions contained a learning item which was seen only once across all questions. For example, the only questions which included greatest were: what state has the greatest population density? and what state has the greatest population?. However, while in the former case greatest is paired with geo:statePopDensity, in the latter it is paired with geo:statePopulation. Therefore, these two questions cannot benefit from each other. This suggests a possible improvement of our learning model: instead of using the exact words to match against the learning model, we could make it more robust by matching against all the synonyms of greatest.

While the performance of the baseline is quite low, we should note that this figure does not take into consideration the cases when an 'unknown' or 'ambiguous' term can be mapped to more than one ontology concept. In addition, a question is marked as correct only if all its dialogs have the correct ranking placed first. However, in some cases it is very difficult to judge automatically which suggestion to place first: it is very likely that different users would select different suggestions for questions phrased the same way. This emphasises the importance of dialog when modelling NLI systems. To assess this, we evaluate the ranking of suggestions in isolation.

9.7.3 Ranked Suggestions

We use Mean Reciprocal Rank (MRR) to report the performance of our ranking algorithm. MRR is a statistic for evaluating any process that produces a list of possible responses (suggestions in our case) to a query, ordered by probability of correctness. The reciprocal rank of a suggestion is the multiplicative inverse of the correct rank. The mean reciprocal rank is the average of the reciprocal ranks of the results for a sample of queries (see Equation 9.2).

    MRR = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{rank_i}                  (9.2)
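The computation over a set of dialogs is straightforward; a minimal sketch with illustrative rank values:

    // Sketch: Mean Reciprocal Rank over a set of dialogs, where ranks[i] is the
    // 1-based position at which the correct suggestion appeared for dialog i.
    public class MeanReciprocalRank {
        static double mrr(int[] ranks) {
            double sum = 0;
            for (int r : ranks) sum += 1.0 / r;
            return sum / ranks.length;
        }
        public static void main(String[] args) {
            System.out.println(mrr(new int[] {1, 1, 2, 3, 1, 14}));
        }
    }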

We manually labelled the correct ranking for the suggestions generated when running FREyA with the above set of 202 questions. This was the gold standard against which our ranking mechanism achieved an MRR of 0.76. However, the median and mode were both 1, indicating that the majority of rankings were correct. Indeed, as shown in Figure 9.18, in 69.7% of the cases the correct ranking is placed first, while in 87.5% of the cases the correct ranking is among the first five.

From the above set of 343 dialogs, we selected 103 randomly, then ran our initial ranking algorithm and compared the results with the manually labelled gold standard. The MRR was 0.72. Table 9.8 shows the distribution of the rankings.

We then grouped the 103 dialogs by OC, and randomly chose training and evaluation sets from each group. We repeated this twice. Table 9.9 shows the structure of the dataset grouped by OC for both iterations. Note that the two iterations are independent – both are performed starting with an untrained system.


Figure 9.18: The distribution of the MRR for 343 dialogs

Table 9.8: Evaluation with 103 dialogs from the Mooney geography dataset

Correct rank | Number of questions
1 | 64 (62.13%)
2 or 3 | 22 (21.36%)
4 or more | 17 (16.5%)


Table 9.9: The distribution of the training and evaluation sets for 103 dialogs

OC | Iteration 1 Training | Iteration 1 Evaluation | Iteration 2 Training | Iteration 2 Evaluation
geo:State | 26 | 19 | 19 | 26
geo:City/Capital | 20 | 19 | 19 | 20
geo:River | 12 | 6 | 9 | 9
geo:Mountain | 1 | 0 | 0 | 1
total | 59 | 44 | 47 | 56

After training the model with the 59 dialogs from iteration 1, the MRR for the evaluation set (44 dialogs) reached 0.98. The overall MRR (for all 103 dialogs) increased from 0.72 to 0.77. After training the model with 47 items during iteration 2, the overall MRR increased to 0.79. The average MRR over these two experiments was 0.78, an increase of 0.06 in comparison to the MRR of the initial rankings. Therefore, we conclude that for this selection of 103 dialogs from the Mooney GeoQuery dataset, our learning algorithm improved the initial ranking by 6%.

9.7.4 Answer Type

First, we experimented with the QA Detector in isolation, and calculated to what extent it was possible to identify the question focus/ATI using the algorithm described in Section 9.2.1. This shows the correctness of the QA Detector irrespective of whether the answer type was correctly found in the subsequent steps or not. The second experiment evaluates the correctness of the consolidation algorithm from Section 9.2.3, which is used to identify the answer type.

QA Detector algorithm

We first manually labelled the correct focus/ATI for all 250 questions; this was the gold standard for this step. Out of the 250 questions, the ATI was correctly identified for 45 of them (questions starting with how big, how large, where and the like). The results for the remaining 205 were as follows (see Figure 9.19):

• Correct: For 174 out of 205 (84.88%), the focus was identified correctly.

Figure 9.19: Identification of the question focus: results

• Not found: For 2 questions (0.97%), our algorithm could identify neither the focus nor the ATI. This is due to a complex structure in which the prepreterminals were not tagged as noun phrases or nouns. For example, in the questions what is the most populated state bordering Oklahoma? and what is the most populous state?, the correct focus is the most populous state for both questions; however, this noun phrase is not a prepreterminal, as most populous is tagged as an Adjective Phrase – ADJP (see Figure 9.20).

• Incorrect: The remaining 29 questions (14.15%) had an incorrectly identified focus; the errors can be grouped into the following patterns:

– Negation: one sentence with negation was parsed incorrectly: in What rivers do not run through Tennessee?, the parser tagged rivers as RB (adverb), while it should be a noun. Interestingly, the same sentence with not omitted is parsed correctly (i.e. rivers is tagged as a noun (NNS)).

– What NP: such as in What capital is the largest in the US? and What city has the most people?; while the parser correctly identified the span which contains the focus (What capital and What city respectively), the head finder identified the head of both phrases to be What.

– Give me NP: the personal pronoun me was tagged as PRP, which is correct. However, as it was identified as a part of the prepreterminal noun phrase, our algorithm wrongly identified it as the focus. For example, in Give me the cities in Virginia? or Give me the largest state?, the correct focus is the cities and the largest state respectively, and not the personal pronoun me as identified by our algorithm.

– State vs. Borders: when occurring together, these two words were tagged incorrectly by the parser. For instance, in Which states borders Arkansas?, states is identified as VBZ (verb, present tense, 3rd person singular), while borders Arkansas is an NP consisting of NN (borders) and NNS (Arkansas). Therefore, the focus is identified to be borders Arkansas, which is incorrect.

Figure 9.20: Failing to identify the answer type: no identified prepreterminals are nouns/noun phrases

Consolidation

Further, we evaluated the consolidation algorithm in order to identify:

• The number of questions for which the focus could be used to successfully identify the answer type, with and without engaging the user.

• The number of questions for which the answer type was identified correctly although the focus was incorrectly identified in the previous step.

Figure 9.21: Correctness of the identification of the answer type

Table 9.10: Results of identifying the answer type using the consolidation algorithm

Focus | Answer type: required 1 dialog with the user | Answer type: automatically consolidated | Answer type: incorrectly consolidated
correct (174) | 65 | 106 | 3
not found (2) | 0 | 2 | 0
incorrect (29) | 4 | 25 | 0
ATI found (45) | 45 | 0 | 0
total (250) | 114 | 133 | 3

Figure 9.21 shows the percentage of correctly identified answer types and also the distribution of the results based on whether this identification required the user to provide input (45.6% of the cases) or not (53.2% of the cases). Table 9.10 shows more details. All questions which had the focus identified incorrectly in the previous step had the answer type identified correctly after the consolidation phase; however, 4 out of these 29 questions involved the user in a dialog in order to place this mapping. With regard to the 174 questions for which the correct focus was found in the previous step, 106 (60.92%) could be mapped to an ontology concept automatically. 65 questions (37.36%) required a dialog with the user in order to map the answer type correctly, 6 of which did not have any FOC identified, and were answered by modelling suggestions as explained in Section 9.2.4.

3 questions (1.72%) had a wrongly identified answer type after the consolidation. This was the case for compound-nominal expressions which contain several nouns, each of which is annotated as referring to an ontology concept. For example, the phrase state capital refers to geo:Capital in what is the largest state capital in population?. However, both state and capital are annotated as referring to different ontology concepts (geo:State and geo:Capital), and our algorithm gives priority to state as the first ontology concept in the question. In future, we will consider giving priority to ontology concepts which exactly match the identified head of the focus, as in this case.

While identification of the answer type through the engagement of the user can be seen as cognitive overhead, our intention is to see whether our learning mechanism can reduce this overhead over time. In addition, by engaging the user in dialog, he has full control of the system's interpretations and can therefore train it towards very good performance even in cases when the ontology (or the set of ontologies being queried) does not have human-understandable lexicalisations.

9.7.5 Querying Linked Data with FREyA

In this section we report the performance of FREyA using the MusicBrainz and DBpedia datasets provided within the 1st Workshop on Question Answering over Linked Data (QALD-1) challenge18.

We preloaded the data into our local repository (BigOWLIM 3.4, on top of Sesame19) and then initialised the system using SPARQL queries. Another option was to connect to the SPARQL endpoint provided by the QALD-1 challenge organisers20; however, this was a difficult path due to the limited server timeout, which was not sufficient for executing all the required queries.

18http://www.sc.cit-ec.uni-bielefeld.de/qald-1
19http://openrdf.org
20http://greententacle.techfak.uni-bielefeld.de:5171/sparql


Generating the index, which is required for performing the ontology-based lookup, is a mandatory step, but it is done only once per dataset, although it might be time-consuming depending on the size of the data. Table 9.11 shows the statistics of loading the two datasets into the OWLIM repository and generating the index.

Table 9.11: Initialisation of the system and the size of datasets

 | MusicBrainz | DBpedia
#explicit statements | 14 926 841 | 328 318 709
#statements | 19 202 664 | 372 110 845
#entities | 5 490 237 | 96 515 478
#SPARQL queries executed | 30 | 361623
initialisation time | 1380s (0.38h) | 182779s (50.77h)

After the index is generated, it is used at query execution time. We first ran the 50 training queries for both datasets and measured the overall precision, recall and f-measure. We then repeated the process with the 50 test questions for each dataset. This experiment was conducted with FREyA in the forceDialog mode. Results are shown in Table 9.12 (footnote 21). MusicBrainz was a challenging dataset due to the existence of the properties beginDate and endDate, which do not have any domain defined, and which, moreover, are used extensively throughout the ontology, especially in combination with blank nodes. Several failures were due to the malfunction of the Triple Generator when these two properties were mapped to the wrong entity. For example, Since when is Tom Araya a member of Slayer? resulted in generating the following mappings:

Since when >> beginDate
Tom Araya >> Tom Araya (Artist)
a member >> memberOfBand
of >> toArtist
Slayer >> Slayer (Artist)

Our Triple Generator then generated:

21Demos showing FREyA answering the QALD-1 challenge questions are available from http://gate.ac.uk/sale/dd/.


Table 9.12: Performance of FREyA using the QALD-1 datasets: the left figures exclude, while the right figures include, the questions correctly answered after reformulation (RF questions). The number of dialogs per question includes only the questions that could be answered correctly, with or without reformulation. NS (not supported) questions are those that could not be mapped to the correct SPARQL query due to the limited language coverage, for example questions requiring negation, temporal reasoning (such as Which bands were founded in 2010?) or quantification (such as Which locations have more than two caves?). PC (partially correct) questions are those that returned a portion or a superset of the correct results.

 | MusicBrainz Training | MusicBrainz Testing | DBpedia Training | DBpedia Testing
Precision | 0.75/0.77 | 0.66/0.8 | 0.74/0.85 | 0.49/0.63
Recall | 0.66/0.68 | 0.54/0.66 | 0.58/0.66 | 0.42/0.54
F-measure | 0.70/0.74 | 0.59/0.71 | 0.67/0.72 | 0.45/0.58
# NS questions | 6 | 9 | 11 | 7
# RF questions | 1 | 6 | 4 | 6
avg. # dialogs per question | 3.4 | 3.65 | 2.7 | 2.85
# PC questions | 1 | 1 | 3 | 12

?joker1 - beginDate - Tom Araya (Artist)
Tom Araya (Artist) - member of band - ?joker2
?joker2 - toArtist - Slayer (Artist)

and the corresponding SPARQL was:

    prefix mm: <...>
    prefix mma: <...>
    SELECT DISTINCT ?firstJoker0
    WHERE {
      {{?i1 ?p0 ?firstJoker0} UNION {?firstJoker0 ?p0 ?i1} .
       FILTER (?p0=mm:beginDate) . }
      FILTER (?i1=mma:362105d1-8f4f-4ba1-949f-3e70183880b5) .
      {{?classJoker4 ?p2 ?i1} UNION {?i1 ?p2 ?classJoker4} .
       FILTER (?p2=<...>) . }
      {{?i4 ?p3 ?classJoker4} UNION {?classJoker4 ?p3 ?i4} .
       FILTER (?p3=<...>) . }
      FILTER (?i4=mma:72de5171-38cf-4734-bc8a-6ac374dea523 ||
              ?i4=mma:bdacc37b-8633-4bf8-9dd5-4662ee651aec) .
    }

This query resulted in retrieving the birthday of Tom Araya, and not the date when he joined the group, which is the correct answer.

Other challenges related to the ontology design in MusicBrainz include the existence of the property trackList, which has a container of type rdf:Seq as its range. In addition, the statements with the releaseType property use subclasses of the class Type rather than instances of that class, which caused several failures. For example, the question Who is the creator of the audiobook the Hobbit? requires retrieving instances with the lexicalisation the Hobbit which are at the same time related to the class TypeAudiobook using the releaseType property, while FREyA expects them to be related using the rdf:type relation.

The main challenge with DBpedia was the selection of the property to use, due to the large number of suggestions that were always present. For example, Who created English Wikipedia? could be mapped to ?joker dbp:created :English Wikipedia, while the correct answer is returned only when using the dbo:author relation instead of dbp:created22. In addition, there are many quality issues, such as in the question Who designed the Brooklyn Bridge?, where designed was mapped to dbp:architect instead of dbp:designer, which resulted in retrieving http://dbpedia.org/resource/John_Augustus_Roebling, while using dbp:designer the result is http://dbpedia.org/page/John_A._Roebling. However, as no mapping exists between the two URIs, the former is not the same as the latter, and hence this is marked as an incorrect answer. Interestingly, the former URL redirects to the latter, which indicates that the two URIs should also be connected using the property sameAs in the dataset.

Another challenge specific to DBpedia was the lack of domain and range classes for properties. Therefore, some questions could not be correctly mapped to the underlying Ontology Concepts. In some cases, the reformulation of queries could help (such as using spouse instead of married to). However, reformulation was not always sufficient. For example, in Which states border Utah?, border needs to be mapped to eight properties: dbp:north, dbp:south, dbp:east, dbp:west, dbp:northwest, dbp:northeast, dbp:southwest, and dbp:southeast. As none of these have any domain or range, they did not appear in the suggestions, and hence the only way to answer the question using FREyA is to ask eight questions such as Which states are north of Utah?, Which states are south of Utah?, and so on for each property.

It is interesting to observe that the 12 incorrectly answered questions from the DBpedia test set were indeed partially correct. The correct mappings could only be placed if we were more familiar with the knowledge structure inherent in the dataset. This also explains the difference in the performance of FREyA on the training and the testing sets of DBpedia.

22We use dbp for the http://dbpedia.org/property and dbo for the http://dbpedia.org/ontology namespaces.

Failures that were common to both datasets are related to the equal treatment of datatype property values. For example, the question How many compilations are there? failed to be answered correctly because FREyA found all compilations that had the user-defined tag 'jazz', matched case-insensitively (using FILTER regex(str(?var), "^jazz$", "i")); therefore it also included 'Jazz', which led to the incorrect answer. On the other hand, some entries were missed when fuzzy matching was necessary, such as in Which companies are in the computer software industry?, which requires finding not only companies with the property industry 'computer software' but also 'computer hardware, software', 'computer software and engineering', and the like. At the moment, datatype property values in FREyA are supported by including the exact match (case insensitive) only. In future, we might extend our approach to support a more sophisticated treatment of strings, so that the treatment differs depending on the context.
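A minimal sketch of how such a filter could be assembled, following the FILTER regex example above; regex metacharacters and quotes in the value are not escaped here, which a production version would have to handle:

    // Sketch: build a case-insensitive exact-match filter over a literal value.
    public class LiteralFilter {
        static String exactMatchFilter(String variable, String value) {
            return "FILTER (regex(str(?" + variable + "), \"^" + value + "$\", \"i\"))";
        }
        public static void main(String[] args) {
            // prints: FILTER (regex(str(?var), "^jazz$", "i"))
            System.out.println(exactMatchFilter("var", "jazz"));
        }
    }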

Several reformulations for both datasets resulted in a significant increase in precision and recall, e.g. adding quotes, as in Which artists performed the song "Over the Rainbow"?. Without quotes, Over was parsed as a preposition and the whole question failed to be answered, while with quotes it was part of the Noun Phrase, which led to a correctly answered question.

Learning: To measure the effect of the learning mechanism, we ran the experiment in two iterations: we first answered the 50 testing questions using an empty learning model, and then using the system trained with the 50 training questions. Results are shown in Table 9.13. The learning mechanism improved the overall ranking of suggestions by 0.05 for MusicBrainz, and only by 0.02 for DBpedia. The reason is the size of the datasets and the relatively small number of training questions. However, an improvement of 0.02 is still an achievement considering that DBpedia has almost 100 million entities.

Table 9.13: Mean Reciprocal Rank for the testing set with and without learning

 | MusicBrainz untrained | MusicBrainz trained | DBpedia untrained | DBpedia trained
MRR | 0.63 | 0.68 | 0.52 | 0.54

The execution time for queries that could be answered correctly fluctuates based on the complexity of the question (e.g. the number of required dialogs). This is due to our on-the-fly mechanism for finding suggestions, which requires executing a large number of SPARQL queries in order to generate a dialog. Long execution is also affected by the complexity of the final generated SPARQL which is used to retrieve the answer. For example, queries which include FILTER statements over literal strings such as FILTER (regex(?var, "^jazz$", "i")) can currently take more than ten minutes to execute23. The size of the dataset influences the execution time as well. For MusicBrainz, the time per dialog was in the range from 0.073 to 11.4 seconds, or 8.5 seconds on average per question. For DBpedia, the execution time was much longer: from 5 to 232 seconds per dialog, and 36 seconds on average per question. This is quite slow; however, it can be optimised (e.g. by using caching mechanisms for suggestions).

The evaluation using the DBpedia and MusicBrainz testing datasets leads to f-measures of 0.58 and 0.71 respectively, which compares favourably to the other systems that participated in the QALD-1 challenge (PowerAqua 0.5 using DBpedia, SWIP 0.66 using MusicBrainz). More importantly, FREyA was the only system tested with both the MusicBrainz and DBpedia datasets, which demonstrates its portability. The learning mechanism improved the results by 5% and 2% for the MusicBrainz and DBpedia datasets respectively.

23Experiments were run using a CentOS 5.2 Linux virtual machine running on an AMD Opteron 2431 2.40GHz CPU with 2 cores and 20GB RAM.


9.8 Summary

In this chapter we discussed the final design, implementation and evaluation of FREyA, an interactive Natural Language Interface that combines syntactic parsing with ontology-based lookup in order to interpret a Natural Language query or a fragment of one. The query is mapped into the formal query language SPARQL in order to find the correct answer. If FREyA fails to perform the mappings automatically, or in case of ambiguities, it generates a dialog and involves the user in the loop. The user's selection is saved and used for training the system so that it improves its performance over time.

In contrast to QuestIO, which is closed and single-domain, FREyA’s scope is extended towards multiple domains, or rather to any semantic repository that may contain a large number of ontologies and knowledge bases originating from different sources. However, FREyA needs to generate its index offline in order to perform the ontology-based lookup at query analysis time. This process requires executing a number of SPARQL queries and can be time-consuming; however, it is performed only once.

We discussed earlier (see the end of Section 7.1) that QuestIO is not suitable even for narrow domains that contain a large number of identical names, due to the data ambiguity problem. In FREyA this problem is reduced by performing a deep grammar analysis of the question. This helps in resolving data ambiguities caused either by diverse data sources or by repositories where the ABox is much larger than the TBox. However, the supported language remains flexible: grammatically correct questions, fragments and ill-formed queries are all supported. This flexibility has a trade-off (discussed previously in Section 7.6) related to the fact that it is not trivial for the user to translate his information need into a question. Hence, we looked at combining the usability enhancement methods of feedback and clarification dialogs in order to improve precision by asking the user to disambiguate, but also in order to extend the system’s vocabulary (derived from the semantic resources and enriched by WordNet) with that of the user.

The vocabulary extension reduces the lexical failures discussed in Section 7.6. Moreover, lexical failures are avoided thanks to our ranking algorithm, which relies on Soundex – a phonetic algorithm that assigns a very high similarity to two words which are spelled differently but pronounced similarly. Soundex is combined with the Monge Elkan string similarity algorithm, which assigns a high similarity to two strings where one is contained in the other (e.g. the question term population is very similar to the ontology lexicalisation state population according to Monge Elkan). Combining the two algorithms makes it possible to go beyond the existing lexicalisations attached to semantic resources, and to “understand” words which are either misspelled or expressed differently in comparison to how they are verbalised in the semantic repository.
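To make the combination concrete, the sketch below shows simplified versions of both measures. It is not FREyA’s implementation: the Soundex variant is simplified, difflib’s ratio stands in for the inner similarity used by Monge Elkan, and the rule of taking the maximum of the two scores is an assumption made purely for illustration.

from difflib import SequenceMatcher

def soundex(word):
    """Simplified American Soundex code: first letter plus three digits."""
    mapping = {c: d for d, letters in
               {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                "4": "L", "5": "MN", "6": "R"}.items() for c in letters}
    word = "".join(c for c in word.upper() if c.isalpha())
    if not word:
        return "0000"
    code, prev = word[0], mapping.get(word[0], "")
    for c in word[1:]:
        digit = mapping.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "HW":            # H and W do not separate duplicate codes
            prev = digit
    return (code + "000")[:4]

def monge_elkan(a, b):
    """Average, over the tokens of a, of the best match among the tokens of b."""
    sim = lambda x, y: SequenceMatcher(None, x, y).ratio()
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return 0.0
    return sum(max(sim(t, u) for u in tb) for t in ta) / len(ta)

def combined_score(question_term, lexicalisation):
    phonetic = 1.0 if soundex(question_term) == soundex(lexicalisation) else 0.0
    return max(phonetic, monge_elkan(question_term, lexicalisation))

print(combined_score("population", "state population"))   # 1.0: contained term
print(combined_score("mizouri", "missouri"))               # 1.0: phonetic match

The first example reproduces the state population case from the text, while the second shows how a misspelled question term can still be matched to an existing lexicalisation.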

In order to avoid the conceptual failures discussed in Section 7.6, FREyA can be used in the forceDialog mode. In this mode a dialog is modelled for each attempt to map a question term into an ontology concept. This is a slight modification of the approach described in the first version of FREyA (Chapter 8), where conceptual failures were handled by showing the user all query interpretations at once. The move from a query-based interpretation towards a concept-based one was largely influenced by the feedback from users in the user-centric evaluation described in Section 8.2.

The concept-based interpretation discussed in this chapter is guided by the dialog sequence algorithm. While other existing approaches start by generating linguistic triples from a question (even if in an iterative fashion) and then attempt to generate ontology triples in the form Subject-Predicate-Object, our approach operates on a pair of Ontology Concepts at a time, which can be Subject-Predicate, Predicate-Object or Subject-Object. In that sense our approach is more flexible, as it operates on a unit smaller than a triple.

Question interpretation starts with syntactic parsing and analysis to find the focus – the question term or phrase that identifies what the question is about – which is then consolidated with the ontology-based lookup. The output of the consolidation algorithm is the focus mapped to one or several Ontology Concepts. Subsequently, other Potential Ontology Concepts are also mapped to Ontology Concepts using the same consolidation algorithm. Potential Ontology Concepts are candidate question terms or phrases that are extracted by combining the parsed tree with a set of heuristic rules. The order in which the Potential Ontology Concepts are mapped is controlled by the dialog sequence algorithm, which favours mapping those concepts that are closer to the focus (as per the distance calculated by looking into the parsed tree).
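The ordering step can be pictured with a small sketch. The parent-pointer representation, the toy sentence and the function names below are illustrative only and do not reflect FREyA’s actual data structures, which operate on the syntax tree produced by the parser.

def path_to_root(node, parent):
    """Nodes from `node` up to the root, following child -> parent links."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def tree_distance(a, b, parent):
    """Number of edges between two nodes of the same tree."""
    up_a = {n: i for i, n in enumerate(path_to_root(a, parent))}
    for steps_b, n in enumerate(path_to_root(b, parent)):
        if n in up_a:                      # lowest common ancestor reached
            return up_a[n] + steps_b
    raise ValueError("nodes are not in the same tree")

def dialog_order(focus, pocs, parent):
    """Map the Potential Ontology Concepts closest to the focus first."""
    return sorted(pocs, key=lambda poc: tree_distance(focus, poc, parent))

# Toy tree for "which rivers flow through the state in which the mountain
# harvard is located": the focus is "rivers", the POCs are "state" and
# "mountain harvard".
parent = {"rivers": "flow", "state": "flow",
          "located": "state", "mountain harvard": "located"}
print(dialog_order("rivers", ["mountain harvard", "state"], parent))
# -> ['state', 'mountain harvard']

In this toy example state is two edges away from the focus while mountain harvard is four, so state would be resolved (and, if necessary, clarified with the user) first.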

For any ambiguity or uncertainty, the system generates a dialog and the user is asked to choose from a set of available options. If the system is run in the automatic mode, it will return the answer automatically by simulating the selection of the best ranked options. To give an example, for which is the largest city in California?, the focus largest city will be interpreted first. Indeed, the algorithm first attempts to resolve the head of the focus (city), consolidates it with the ontology-based lookup (e.g., the ontology class City), and then continues to resolve largest. Only after these are interpreted does the algorithm move on to resolve California. The consolidation algorithm may resolve this mapping automatically based on the exact match between California and an existing ontology lexicalisation.

Note that for true ambiguities the automatic mode might not be the best choice even with a perfectly trained system. For instance, if somebody asks How big is New York state? we might be unable to decide automatically whether How big refers to state area or state population. In this situation, as the system learns from the users’ selections, the automatic mode would work in favour of the majority of users. However, if the majority of users refer to state area when mentioning size, the minority still have a chance to get the correct answer by using FREyA in the forceDialog mode and mapping big to state population.

In addition, in contrast to the tree-based feedback representation described in Chapter 8, we decided to use a graph-based one. This is because trees proved impractical when tested with different ontologies, as the relations between all nodes could not always be represented clearly using a tree. However, the interactive features remain the same, as they were widely accepted by users in the evaluation in Section 8.2.

All algorithms, as well as the system as a whole, were evaluated using the Mooney GeoQuery dataset for the sake of comparison with other similar systems. The MRR for the initial ranking using 250 questions from the Mooney GeoQuery reached 0.76. The answer type identification algorithm, which is fundamental for answering the question precisely, correctly returned the answer type for 98.18% of questions, although 37.36% required one dialog with the user. The learning algorithm improved the baseline model by 23.24%. The overall precision and recall with this dataset reached 94.4%, which is significantly better than other similar systems evaluated using the same dataset.

FREyA also participated in the QALD-1 challenge, where the organisers released two datasets, DBpedia 3.6 and an RDF export of MusicBrainz, which differ not only in size but also in the complexity of the ontology structure. Each dataset was initially released with 50 training questions accompanied by the correct SPARQL queries and answers. The organisers then released a testing set of 50 questions per dataset, and all participants had to provide the correct answers, the SPARQL queries, or both. The evaluation results can be accessed from http://www.sc.cit-ec.uni-bielefeld.de/sites/www.sc.cit-ec.uni-bielefeld.de/files/overall.pdf.

FREyA was the only one of the three participating systems that provided results for both datasets, thus demonstrating portability. Moreover, FREyA outperformed the other systems although it was used in the forceDialog mode and required considerable engagement from the user. Nevertheless, this demonstrates that the methods and algorithms implemented in FREyA can be a good starting point for a more ambitious goal: Question Answering on the open Web. We discuss the suitability of FREyA for this task in Chapter 11.

At first sight, the two modes described above (forceDialog and automatic) look like a perfect match for the two types of users of FREyA: ideally, application developers can use the system in the forceDialog mode until they are satisfied with the system’s interpretations of the questions. At that point, the end-users can take over the system and use it in the automatic mode to ask questions. However, the real scenario might be completely different. The system’s mode can be changed easily; hence, if a user uses FREyA in the automatic mode and obtains unsatisfactory results, he can immediately switch to the forceDialog mode in order to investigate the mappings. His input will then improve the system for the next user.


Easy switching between modes makes FREyA a system that can be used by end-users and application developers at the same time. In fact, the border between the customisation of the system performed by application developers and the customised version of the system used by end-users is not strict. Hence, the roles of the two types of users overlap to some extent, which allows the end-user to control the answer to the question, or at least to understand how the Natural Language query is mapped to the formal query. This leaves us with the same question that we asked in the previous chapter about the end-users. Who are they? For the current state of the methods and algorithms as implemented in this thesis, the end-users probably need not know about semantic technologies if the system works with narrow domains such as the Mooney GeoQuery. As soon as we move towards large-scale data such as DBpedia, and datasets which are characterised by a large amount of redundant, duplicate, and often false data, FREyA becomes a tool for semantic web experts who can explore the available knowledge by asking questions and engaging in the dialog. Using FREyA in the forceDialog mode with low-quality data can serve not only to get familiarised with the dataset, but also to discover existing inconsistencies. It is left for future work to further develop and test mechanisms for using FREyA in this kind of scenario, and also in scenarios where these large knowledge bases are queried by end-users who are not familiar with semantic technologies at all.

Part IV

Conclusion


Chapter 10

Summary of Findings

This dissertation investigates the usability of Natural Language Interfaces to ontologies from the point of view of:

• application developers who are customising the system (Chapter 5): the less time they spend customising the system, the more usable it becomes;

• end-users who are querying the system (Chapter 6): the higher the precision and recall, the more usable the system becomes.

The thesis around which our work is centred is stated in Chapter 1 as:

a) combining syntactic parsing with ontology-based lookup in an interactive process of feedback and query refinement can increase the precision and recall of NLIs to ontologies, while

b) reducing porting and customisation time by shifting some tasks from application developers to end-users

In what follows we reflect on the status of this hypothesis, in the light of the methods and results which have been tested through building two systems, QuestIO and FREyA.


QuestIO (Chapter 7) is concerned with portability and relaxation of the user’s queries – like similar systems, it does not require any customisation in order to be ported from one ontology to another. QuestIO is one of the first NLIs to ontologies which supports relaxed queries (incomplete or ill-formed) as well as grammatically correct ones. The other similar system is NLP-Reduce, which was developed in Zurich at about the same time. QuestIO’s query language has been tested using questions from end users, and the test indicated that QuestIO is as good as or better than the AquaLog system [Lopez and Motta, 2004, Lopez et al., 2007], which supports grammatically correct questions only (Section 7.4.1). With regard to portability, QuestIO, like other similar systems, was trialled with ontologies which cover different, but narrow, domains. Portability was tested by demonstrating that all that is required to port the system is the URI of the ontology – the system automatically generates the domain-lexicon by reading and processing ontology lexicalisations (Section 7.4.2).

QuestIO then uses this domain-lexicon to perform ontology-based lookup over a query and produces all possible query interpretations. It also ranks them, and returns the answer based on the first interpretation for which the answer is non-empty. On the positive side, returning the best possible answer automatically is very good for users, especially if there were many interpretations of the query – they do not have to see those that would return no answer anyway. On the negative side, no answer does not necessarily imply that the interpretation of the query is wrong – it might be that the interpretation is correct, but the answer is missing, or negative. Another problematic issue (see Section 7.4.2) is the ranking of the interpretations which, in the case of QuestIO, relies on the ontology structure. QuestIO is built with the assumption that ontologies are perfect, namely:

• Each concept/relation in the ontology has a human lexicalisation which describes it – not necessarily a definition, but rather a term which a human would use to refer to this concept/relation.

• Each concept/relation is positioned carefully in the taxonomy: all super-concepts/relations are more generic, and all sub-concepts/relations are more specific.


Therefore, this tool, although portable in the sense that it can be easily plugged in with another ontology, is directly dependent on the quality of data available in the ontologies. At about the time of QuestIO’s development, an initiative started to encourage people to publish their own data (through the Linked Open Data project) and to generate ontologies from databases (such as DBpedia), and soon after a large number of ontologies were made available and interlinked with each other. None of these were perfect – lexicalisations do exist, but they often do not reflect “a term which a human would use to refer to this concept” – and this is especially the case for properties. In addition, flat structures are dominant. One of the reasons for this is scalability: tractable reasoners do not scale well if the structure of the ontology is complex.

Another observation from the evaluation of QuestIO (Section 7.5) is that encouraging users to use keyword-based queries is, at times, misleading. There is a difference between keyword-based searches used to answer a question and those used to query search engines. The intention of the former is to find the answer to the question which is interpreted through the set of keywords, while for the latter the aim is to find relevant documents which contain the given keywords. However, encouraging users to use keyword-based queries makes them expect results similar to those which would be found and shown by Google (see Section 7.6).

QuestIO was evaluated with users, who were given three defined tasks and one undefined task. Defined tasks were concrete problems such as find parameters of POS Tagger, while the undefined task gave users the freedom to type in whatever they were interested in, such as think of any task you would like to perform using the system. The evaluation results emphasised the importance of usability methods (Chapter 6) which can improve the confidence of the user when querying the system. Namely, as all defined tasks had an answer, the users did not struggle much to finish them. However, for the undefined task, the performance of the system was poor, and the users disliked it: many errors happened due to users not understanding the way the system interpreted the query. For example, it was not clear whether the system did not parse the question, or whether it did not have any knowledge about certain query terms.


Therefore, we have used the experience and the evaluation of QuestIO to explore these interfaces further and address some of the problems which arose along the way, considering also the changes which have happened on the Web. Our assumption has now changed with the new challenges which have appeared, the main one being that ontologies are not perfect, and that tools which work with them must take this into consideration. Therefore, our exploration resulted in building a fully interactive system, FREyA, which makes certain assumptions but confirms none of them automatically – the user has to verify each one of them. In contrast to QuestIO, which is fully automatic, FREyA involves the user in the loop. The user is put in focus with our new approach, which is based on the assumption that no ranking will be perfect (because ontologies are not perfect and ranking relies on ontology reasoning).

In this respect, we first explored feedback: showing the user all query interpretations as a list of linear combinations of the recognized concepts (see Chapter 8). The user can then choose which of the system’s interpretations is correct. On the negative side, if there are too many interpretations, and especially if the ranking is not effective, it might be tedious and time-consuming for the user to go through the list in order to find the one which correctly interprets his question. On the positive side, the user is aware of the concepts which are known to the system, and if the concepts are recognized but no relations between them are found, the user could assume that the reason is an empty result (the answer is negative). In the evaluation with users (Section 8.2), while the system’s interpretation was helpful for complex queries, as users could figure out that they needed to reformulate the question, for queries with negative results feedback as such was not perceived as useful by a significant percentage of users. Showing the user that the concepts were recognized but the answer was not found was often perceived as the system’s failure. In addition, this approach, although with greater potential than the automatic approach taken by QuestIO, has several difficulties:

• Firstly, interpreting the query as a whole might be problematic: even if only one query term is interpreted incorrectly, the whole interpretation would be invalid.


• Secondly, for several ambiguous terms in the question (which is quite realistic given the current state of the Linked Data), the number of interpretations would be large, producing a huge cognitive overhead on the user.

• Thirdly, the question terms are linked to ontology concepts based on the existing lexicalisations for that concept: if the lexicalisation is missing, the query term would not be understood. This is especially problematic for adjectives, for example. To parse the query What is the biggest city in Europe? lexicalisations from the ontology are not enough. Understanding biggest in this example means looking into the properties of the word which it modifies (city) and then finding the biggest value out of them all.

• Finally, recognising WH-phrases can be of great importance for understanding the meaning of the question, and therefore they should be considered more carefully.

Therefore, in order to address all these observations we have moved towards:

• concept-based interpretation: instead of interpreting the query as a whole and showing all query interpretations to the user, each query term is interpreted separately, and the user is engaged in a dialog to resolve ambiguities on the concept level, if necessary;

• enriching the domain-lexicon by integrating the vocabulary of the end-users: if an unknown term appears in the question, we model the dialog and ask the user to attach a meaning by choosing one of the listed ontology concepts. These ontology concepts are found based on the ontology knowledge, and take into account the context in which the term appears.

The user’s input is saved and used to update the learning model, which is used to train the system towards better performance. The learning model is context-based and uses ontology reasoning in order to generalise itself whenever feasible. The aim of this new approach (presented in Chapter 9) is to interpret a user’s query in one unambiguous way, by solving the ambiguities through the dialog, but at the same time understanding the user’s language and trying to map it to the logical form by considering the context. For example, How big can be mapped to different ontology concepts depending on the context, where the context is identified by the ontology concepts with which it appeared in the question. If How big appears in the context of a state, it could be mapped to the state population, while when appearing with a city, the same phrase could be mapped to the city population. Moreover, if we have already learned that How big is a city... can be mapped to city population - city, the same rule would enable us to conclude that How big is a capital... could be mapped to city population - capital, because of the existing relation in the ontology: capital - rdfs:subClassOf - city.
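A minimal sketch of that generalisation step is shown below. The dictionary-based store and the geo: identifiers are illustrative assumptions and stand in for FREyA’s actual learning model and the GeoQuery ontology.

# Toy fragment of a class hierarchy: capital rdfs:subClassOf city (assumed).
SUPERCLASS = {"geo:Capital": "geo:City"}

# Learned rules: (phrase, context class) -> ontology property.
learned = {("how big", "geo:City"):  "geo:cityPopulation",
           ("how big", "geo:State"): "geo:statePopulation"}

def with_superclasses(cls):
    """Yield the class and then its superclasses, most specific first."""
    while cls is not None:
        yield cls
        cls = SUPERCLASS.get(cls)

def lookup(phrase, context_class):
    """Reuse a rule learned for a superclass if none exists for the class itself."""
    for cls in with_superclasses(context_class):
        if (phrase, cls) in learned:
            return learned[(phrase, cls)]
    return None

print(lookup("how big", "geo:Capital"))   # geo:cityPopulation, generalised from City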

Domain-independent words such as WH-phrases (such as Where), and especially those which contain adverbs (such as How big) or adjectives (such as largest city), can be crucial for understanding the question, and they can also modify the meaning of the question terms. However, in different domains and in different contexts these words have different meanings. What is the largest city in California and What is the largest lake in California require mapping largest to two different properties, namely city population and lake area, respectively. Machine Learning approaches suggested by Tang and Mooney [2001] and Wong and Mooney [2006] solve this by labelling the questions and applying an Inductive Logic Programming approach, which, for known sets of questions and for small ontologies, can work quite well. However, for larger repositories and real-world applications such as Linked Data, we might not know the structure of the ontology in advance. Therefore, it might be very hard, if not impossible, to label sets of questions and map them to certain ontology concepts in the vast collection, which must be browsed or queried in order to be understood. In addition, for unseen questions this approach would not work well.

With our approach, this problem is addressed by modelling the dialog and learning from the user’s selections, thereby potentially improving the performance of the system with each user posing a question. The dialog is modelled based on the combination of syntactic parsing and ontology-based lookup.


Our proposed methods strike a balance between the heavy customisation usually required of application developers to port an NLI system to a different domain, and the needs of end users who want to explore the available knowledge without being constrained by a query language. Our evaluation with the Mooney GeoQuery dataset shows that FREyA, with precision and recall reaching 94.4%, outperforms other similar systems (see Section 9.7.1). This supports our hypothesis. The following contribute to its overall performance:

• Initial ranking. Although the user is in focus and has a large influence on the ranking of suggestions which appear in dialogs, initial ranking is very important in order to reduce the cognitive load on users. We have implemented an algorithm which combines string similarity with synonym detection (see Section 9.1.5), and the evaluation of this algorithm reaches an MRR of 0.76 (see Section 9.7.3).

• Identification of the answer type dynamically. This algorithm combines syntactic parsing with several heuristic rules in order to identify the focus, or the Answer Type Identifier, of the question. These are further combined with ontology-based lookup in order to identify the answer type. If necessary, the user is engaged in a dialog in order to resolve ambiguities and precisely identify the answer type (Section 9.2). Our evaluation with 250 questions from the Mooney GeoQuery dataset shows that the answer type is correctly identified for 98.18% of questions, including 37.36% which required one dialog with the user (see Section 9.7.4). Identification of the question category is usually based on static rules which categorise questions based on their syntax. For example, questions starting with Where would be in a different category from questions starting with What. This approach is used in various guises in many similar NLIs to ontologies such as ORAKEL [Cimiano et al., 2007], PANTO [Wang et al., 2007], Querix [Kaufmann et al., 2006], and AquaLog [Lopez et al., 2007]. Our approach is different in that we try to avoid strict adherence to syntax, while engaging the user in dialog in order to map certain syntactic structures to ontology concepts.

• Learning from the users incrementally. The system is able to learn from the user’s selections and train itself to perform better by integrating the knowledge of its users. Our learning model (inspired by Reinforcement Learning, see Section 9.5), evaluated using the Mooney GeoQuery dataset, showed an improvement of our initial ranking by 6% (see Section 9.7.3). In addition, as presented in Section 9.7.2, a 10-fold cross-validation measurement on the Mooney GeoQuery dataset has shown that the precision of the baseline model was improved by 23.24%. While learning to map syntax trees to semantics has not been extensively researched in the domain of NLIs to ontologies, several promising approaches have been tried and evaluated in other domains such as NLIs to databases (e.g., [Ge and Mooney, 2009]). Supervised approaches such as learning the semantic parser based on statistical machine translation [Wong and Mooney, 2007], statistical disambiguation models [Ge and Mooney, 2009], or a hidden-variable approach for learning to interpret sentences in context [Zettlemoyer and Collins, 2009] could all be seen as complementary to our semi-supervised approach.

Finally, our approach to portability shifts some effort from application developers to end users. The knowledge of the end-users is used to train and improve the system for others. This knowledge is used to update the learning model, which is also preserved for sharing with other similar systems. Therefore FREyA is not only incrementally enriching its own lexicon, but it is also preserving it in a way that other NLI systems can benefit from. The portability of FREyA and its suitability for a realistic Linked Data querying scenario were tested through the experiments using the QALD-1 challenge datasets, MusicBrainz and DBpedia (Section 9.7.5).

Chapter 11

Future Challenges

The work described in this thesis can be improved in many aspects. Here we outline some ideas.

11.1 Scalability

The availability of Linked Open Data has changed the way we think about ontologies and structured data. In that respect, there are several challenges which remain to be addressed or improved in comparison to how they have been addressed in the course of this thesis. In addition to the previously discussed issue of ontologies not being perfect, scale becomes an issue, as do the incompleteness, heterogeneity, and noise inherent in these data. A huge number of ontologies interlinked with each other means a high probability of redundant information, which needs to be filtered out by the systems used for querying these data. Moreover, with such an enormous knowledge base, queries can return thousands or millions of hits, e.g. show fungi. The result of this query is more than 2000 instances of different types of fungi. The question is which ones to show first and how to filter out duplicates – this is an important direction towards increased quality of LOD in the near future, which will lead to better performance of the interfaces used to query these data. In addition, ranking algorithms (e.g., [Lopez et al., 2009a]) become increasingly important.

Large datasets introduce other challenges such as data ambiguity. Unlike language ambiguity, where one term might have several different meanings, data ambiguity arises when one term refers to several URIs in the semantic repository, all of which have the identical meaning. This happens when, for example, one term refers to several Ontology Concepts, each belonging to a different ontology namespace. Ideally, all these concepts should be related by owl:sameAs; however, based on the current state of the Linked Data, this is often not the case, and the systems that query these data need to handle such situations properly.
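As a rough illustration of how such cases could at least be detected, the sketch below groups URIs by label and reports labels whose URIs are not linked by owl:sameAs. It assumes an rdflib graph already loaded with the data, only checks direct owl:sameAs statements, and is illustrative rather than a proposed component of FREyA.

from collections import defaultdict
from rdflib import Graph
from rdflib.namespace import OWL, RDFS

def ambiguous_labels(graph: Graph):
    """Yield (label, uris) pairs where one label names several unlinked URIs."""
    by_label = defaultdict(set)
    for uri, label in graph.subject_objects(RDFS.label):
        by_label[str(label).lower()].add(uri)
    same_as = set(graph.subject_objects(OWL.sameAs))
    for label, uris in by_label.items():
        if len(uris) < 2:
            continue
        linked = any((a, b) in same_as or (b, a) in same_as
                     for a in uris for b in uris if a != b)
        if not linked:
            yield label, uris

# Example usage (the file name is hypothetical):
# g = Graph(); g.parse("lod-sample.ttl")
# for label, uris in ambiguous_labels(g):
#     print(label, "->", len(uris), "candidate URIs")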

11.2 What to Show?

As NLIs for querying ontologies aim to find the answer to a question posed in Natural Language, most of the existing approaches focus on the problem of translating NL into formal languages. However, once the answer is found, it is very important to present it to the user in a user-friendly manner. The way the answer is shown to the user has a large impact on their confidence when using the system, and there is room for researching this topic more carefully. One interesting approach would be to use Natural Language Generation tools such as the one described in Davis et al. [2008]. That means that instead of rendering a graph or a list of results as in FREyA, the user would receive the answer in Natural Language. For example:

USER: Which countries are located in Europe?
SYSTEM: There are countries. Countries are France and Austria. Countries are Belgium and Serbia.

11.3 Learning

We previously discussed the disadvantages of our learning model (see Section 9.7.2). Namely, the simplicity which makes the model attractive and re-usable across similar but not identical questions is at the same time a drawback, as it might cause ambiguities. Our current model is in the form of:

if POC or OC appears in the context with the closest OC then map it to candidate, function

This model can be relaxed so that instead of a POC we preserve the phrase (e.g. the noun phrase) in which it appears. This would enable more precise mappings and would solve the problem, discussed in Section 9.7.2, of point being mapped to geo:HiPoint or geo:LoPoint depending on whether it is preceded by lowest or highest.
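The difference between the current and the relaxed rule can be sketched as follows; the dictionaries stand in for the learning model, and the geo: identifiers are taken from the example above.

# Current rule form (simplified): keyed on the POC text alone, hence ambiguous.
by_poc = {"point": "geo:HiPoint"}          # the same key would also be needed for geo:LoPoint

# Relaxed rule form: keyed on the noun phrase that contains the POC.
by_phrase = {"highest point": "geo:HiPoint",
             "lowest point":  "geo:LoPoint"}

def resolve(noun_phrase):
    """Prefer the phrase-level rule and fall back to the bare POC head."""
    head = noun_phrase.split()[-1]
    return by_phrase.get(noun_phrase, by_poc.get(head))

print(resolve("lowest point"))    # geo:LoPoint
print(resolve("highest point"))   # geo:HiPoint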

11.4 Personalised Vocabulary

All methods discussed in this thesis can be employed (and potentially improved) in combination with quality user profiles. However, creating and maintaining quality user profiles requires analysing the domain space (e.g., available domain knowledge) and the user space (e.g., user interests and preferences) and making the connection between the two. The nature of ontologies is convenient for designing and intersecting these two spaces, which could be accomplished through:

1. Creating domain space: creating or locating the domain ontology with defined concepts and relations between them so that they explain the domain precisely. Instantiating the concepts and creating relations between the instances.

2. Creating user space: creating or locating the user ontology with defined concepts and relations between them so that they explain user interests, preferences and activities precisely. Instantiating the concepts and creating relations between the instances.

3. Intersection of two spaces: connecting the two spaces would result in defining user profiles. In practice, this would mean defining relations between concepts from the domain and user space, i.e. the domain and user ontologies.


11.5 Using FREyA in the Open-Domain Scenario

While FREyA has been tested with large and diverse datasets through the experiments with DBpedia and MusicBrainz, the approach still differs from a truly open-domain scenario, where users ask questions using a system that would crawl the whole Web, or rather the whole Semantic Web, in order to find answers. In this section we outline the obvious next steps that would need to be taken in order to use FREyA in that kind of scenario. We identify two feasible approaches. The first approach requires less investigation within FREyA, in the sense that all algorithms and methods can be reused, with a potential requirement for optimisation due to scale. However, the approach requires development of an infrastructure that would:

• Crawl the whole Web – this could be performed using existing semantic search engines such as Watson [d’Aquin et al., 2007] or Sindice [Tummarello et al., 2007].

• Load all crawled files into the centralised repository.
• Repeat the previous two steps regularly in order to update the existing data.

However, developing this infrastructure might be extremely expensive, and it is even questionable whether it would be possible to update the data regularly enough to follow updates of the RDF documents available on the Web. Nevertheless, by locating and transforming the decentralised data on the Web into a centralised repository, it is possible to fully reuse the existing algorithms and methods in FREyA, as the index necessary for performing the ontology-based lookup can be generated, and the centralised repository can also be used to evaluate the final SPARQL query generated by FREyA in order to find the answer. Due to the large amount of data available on the Semantic Web, some existing algorithms might need to be optimised. The most obvious optimisations are as follows:

The learning model. The learning model in FREyA is saved to the file system in JSON format. The model can easily grow and cause scalability issues if the system is used with large-scale data. This can be solved by using a more scalable implementation, for example one based on Lucene.

Presentation of dialogs and answers. While this is not a mandatory requirement for using FREyA in the open-domain scenario, it would be worth exploring user-friendly ways of showing URIs to the user. By default, FREyA uses values of rdfs:label instead of URIs. This can also be customised/configured for specific datasets, as it is often the case that special properties are used for names instead of labels. It would be worth developing a service that would return the preferred label for each URI, extracted from all available label and name properties.
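One possible shape of such a service is sketched below; the list of name-like properties, the language preference and the fallback to the URI’s local name are assumptions for illustration, and rdflib is used only as a convenient way to read the data.

from rdflib import Graph, URIRef
from rdflib.namespace import FOAF, RDFS

# Properties tried in order of preference; the list is an assumption.
NAME_PROPERTIES = [RDFS.label, FOAF.name]

def preferred_label(graph: Graph, uri: URIRef, lang: str = "en") -> str:
    """Return a human-readable label for a URI, or its local name as a fallback."""
    for prop in NAME_PROPERTIES:
        for value in graph.objects(uri, prop):
            if getattr(value, "language", None) in (lang, None):
                return str(value)
    return str(uri).rsplit("#", 1)[-1].rsplit("/", 1)[-1]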

In the second approach, while the majority of the methods and algorithms available in FREyA can be reused, in addition to the optimisations mentioned above the following aspects would require further investigation:

Ontology-based Lookup currently requires generating an index offline, which is then used at query analysis time. This is currently performed using a gazetteer which attaches semantic annotations to the question terms; an annotation is characterised by a URI for classes, instances and properties, and by an instance URI and a property URI for literals. Using FREyA in the open-domain scenario would require a reliable service for semantic annotation with regard to the semantic resources available on the open Web. This would be a replacement for the currently used gazetteer, and one possibility is to use services such as Watson. This approach has been taken by PowerAqua; however, as pointed out in Lopez et al. [2011], the resources on the open Web that can be accessed through Watson seem to have quality issues: many ontologies are not populated, and there is much redundant, noisy and incomplete data (for example, schemas may be missing). While this scenario is extremely powerful and can be a great demonstration of what can be done with a large amount of structured knowledge, the current tools and services seem to fall short of their full potential largely because of the low quality of the available data. Hence, one interesting direction for future work concerns the quality of this data (e.g., Hartig and Zhao [2009]).

Finding and loading relevant triples on the fly would need to be implemented. Currently, FREyA either generates a new semantic repository and populates it using the data available from predefined URLs (either from the local system or from the Web), or it connects to an existing repository which can be local or remote. However, the assumption behind the current implementation is that all data are loaded into the centralised repository – the same one from which the index for the Ontology-based Lookup is generated. FREyA then generates the SPARQL query, which is evaluated against the centralised repository in order to answer the question. As triples are distributed across the Web in the truly open-domain scenario, FREyA would need to implement a mechanism to locate and load the relevant triples into its repository before it generates and executes SPARQL. One way to implement this is using so-called virtual documents, as suggested in Damljanovic et al. [2011]. However, the real question here is how to identify only the relevant triples. The easiest way would probably be to query Sindice using the query terms as keywords, and then load all RDF documents identified as relevant. However, this approach is problematic, as both the list of identified documents and the content of each RDF document might be large, hence loading them might be time-consuming.
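A minimal sketch of the loading step is given below, using rdflib. The document-discovery step (e.g. the keyword lookup against a Sindice-like service) is deliberately left out, and the URLs are assumed to come from it.

from rdflib import Graph

def load_relevant_documents(doc_urls):
    """Merge RDF documents identified as relevant into one local graph, so that
    the SPARQL query generated by the system can be evaluated against it."""
    repository = Graph()
    for url in doc_urls:
        try:
            repository.parse(url)          # rdflib picks a parser from the response content type
        except Exception as error:         # skip documents that cannot be fetched or parsed
            print("skipping", url, ":", error)
    return repository

# Hypothetical usage, once a discovery step has produced `urls`:
# repository = load_relevant_documents(urls)
# for row in repository.query("SELECT ?s WHERE { ?s ?p ?o } LIMIT 10"):
#     print(row.s)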

Appendix A

User-centric Evaluation with QuestIO

This section includes the questionnaires given to the subjects in the evaluation described in Section 7.5. Figure A.1 presents the pre-task background questionnaire. Figure A.2 shows the questionnaire answered by participants after each task, and Figure A.3 shows the System Usability Scale (SUS) questionnaire answered by each participant at the end of the experiment.

Figure A.1: Pre-test questionnaire

Figure A.2: Post-test questionnaire for each task

Figure A.3: Post-test questionnaire (SUS) for the QuestIO prototype

Appendix B

User-centric Evaluation with FREyA

This section includes the questionnaires given to the subjects in the evaluation described in Section 8.2: the questionnaires completed after finishing each task, followed by the background questionnaire and the SUS usability survey.

DO NOT DO BOTH TASK 1a and TASK 1b, BUT CHOOSE ONE OF THEM

Task 1a: Find part of speech taggers which exist in GATE. Find out which parameters exist for the POS Tagger of your choice. After you finish the task answer the following questions:

1. Which of the following is truth:
• I completed the task with ease
• I completed the task with difficulty
• I failed to complete the task. If you ticked this box please provide the reasons for this:
  – System provided no useful output so I could not figure out what to do
  – System provided confusing output so I could not figure out what to do
  – System provided no output
  – Other reasons:

2. Was it easy to formulate the query for this task:
• Yes
• No

3. Did you find Identified context:
• Confusing
• Helpful
• Other:

Enter any comments/suggestions/problems you may have:

DO NOT DO BOTH TASK 1a and TASK 1b, BUT CHOOSE ONE OF THEM

Task 1b: Find mountains which exist in United States. Find out in which state is the mountain of your choice located. After you finish the task answer the following questions:

1. Which of the following is truth:
• I completed the task with ease
• I completed the task with difficulty
• I failed to complete the task. If you ticked this box please provide the reasons for this:
  – System provided no useful output so I could not figure out what to do
  – System provided confusing output so I could not figure out what to do
  – System provided no output
  – Other reasons:

2. Was it easy to formulate the query for this task:
• Yes
• No

3. Did you find Identified context:
• Confusing
• Helpful
• Other:

Enter any comments/suggestions/problems you may have:

DO NOT DO BOTH TASK 2a and TASK 2b, BUT CHOOSE ONE OF THEM

Task 2a: Imagine that you are a GATE developer who needs to extend the RASP Parser. Your task is to find out the names of init parameters. After you finish the task answer the following questions:

1. Which of the following is truth:
• I completed the task with ease
• I completed the task with difficulty
• I failed to complete the task. If you ticked this box please provide the reasons for this:
  – System provided no useful output so I could not figure out what to do
  – System provided confusing output so I could not figure out what to do
  – System provided no output
  – Other reasons:

2. Was it easy to formulate the query for this task:
• Yes
• No

3. Was it clear for you that there are no init parameters for RASP Parser?
• Yes
• No. If you tick this box indicate why:

Enter any comments/suggestions/problems you may have:

DO NOT DO BOTH TASK 2a and TASK 2b, BUT CHOOSE ONE OF THEM

Task 2b: Find out which states border hawaii. After you finish the task answer the following questions:

1. Which of the following is truth:
• I completed the task with ease
• I completed the task with difficulty
• I failed to complete the task. If you ticked this box please provide the reasons for this:
  – System provided no useful output so I could not figure out what to do
  – System provided confusing output so I could not figure out what to do
  – System provided no output
  – Other reasons:

2. Was it easy to formulate the query for this task:
• Yes
• No

3. Was it clear from the answer that there are no border states?
• Yes
• No. If you tick this box indicate why:

Enter any comments/suggestions/problems you may have:

DO NOT DO BOTH TASK 3a and TASK 3b, BUT CHOOSE ONE OF THEM

Task 3a: What are the parameters of the PRs which are included in the same plugin as the Morhper? After you finish the task answer the following questions:

1. Which of the following is truth:
• I completed the task with ease
• I completed the task with difficulty
• I failed to complete the task. If you ticked this box please provide the reasons for this:
  – System provided no useful output so I could not figure out what to do
  – System provided confusing output so I could not figure out what to do
  – System provided no output
  – Other reasons:

2. Was it easy to formulate the query for this task:
• Yes
• No

3. Did you find Identified context:
• Confusing
• Helpful
• Other:

Enter any comments/suggestions/problems you may have:

DO NOT DO BOTH TASK 3a and TASK 3b, BUT CHOOSE ONE OF THEM

Task 3b: Which rivers flow through the state in which the mountain harvard is located? After you finish the task answer the following questions:

1. Which of the following is truth:
• I completed the task with ease
• I completed the task with difficulty
• I failed to complete the task. If you ticked this box please provide the reasons for this:
  – System provided no useful output so I could not figure out what to do
  – System provided confusing output so I could not figure out what to do
  – System provided no output
  – Other reasons:

2. Was it easy to formulate the query for this task:
• Yes
• No

3. Did you find Identified context:
• Confusing
• Helpful
• Other:

Enter any comments/suggestions/problems you may have:

Task 4: Try exploring the knowledge available in the system. Either search for various components of GATE such as PRs, plugins, LRs, VRs, or explore geography of United States by inquiring about: cities, states, rivers, mountains, highways just to get the idea of what you can search for. Then ask some questions in order to connect these concepts such as ’which states border georgia?’ or ’which rivers flow through states which border california’. Input as many queries as you like. After you finish the task answer the following questions:

1. Which of the following is truth:
• I completed the task with ease
• I completed the task with difficulty
• I failed to complete the task. If you ticked this box please provide the reasons for this:
  – System provided no useful output so I could not figure out what to do
  – System provided confusing output so I could not figure out what to do
  – System provided no output
  – Other reasons:

2. Was it easy to formulate the query for this task:
• Yes
• No

3. Did you find Identified context:
• Confusing
• Helpful
• Other:

Enter any comments/suggestions/problems you may have:

(tick one box only: Strongly disagree / Disagree / Neutral / Agree / Strongly agree)

1. I understand the term ontologies.
2. I have used ontology editors to view ontologies.
3. I have used SPARQL before.
4. What is your age (optional)?
5. Your profession?
6. Which username you want to use? (optional)

(Strongly disagree / Disagree / Neutral / Agree / Strongly agree)

1. I think that I would like to use this system frequently.
2. I found the system unnecessarily complex
3. I thought the system was easy to use
4. I think that I would need the support of a technical person to be able to use this system
5. I found the various functions in this system were well integrated
6. I thought there was too much inconsistency in this system
7. I would imagine that most people would learn to use this system very quickly
8. I found the system very cumbersome to use
9. I felt very confident using the system
10. I needed to learn a lot of things before I could get going with this system

Do you have any specific problems to report?

Do you have any suggestions for improving the system?


Appendix C

Using Large Ontologies

In this section we present the SPARQL queries used to initialise FREyA, and in particular the LKB gazetteer used by FREyA, with the DBpedia dataset. Loading the DBpedia lexicon is performed in several phases. The first phase is finding the filter classes using the following SPARQL query:

SELECT ?T ?T1 WHERE { ?T . # Selects only the top level classes of DBPedia OPTIONAL { ?T ?T1. FILTER (regex(str(?T1), "^http://dbpedia\\.org/.*$", "i")). FILTER (?T1!=?T).} FILTER (!bound(?T1)). # DBPedian namespace filter FILTER (regex(str(?T), "^http://dbpedia\\.org/.*$", "i")). }

When this query was executed against the FactForge SPARQL endpoint (http://factforge.net/sparql) it returned 2803 filter classes. For each filter class one further SPARQL query is executed in order to generate the domain lexicon. That SPARQL query looked similar to the following:

SELECT DISTINCT ?E ?T ?L ?P


WHERE { { ?E ?T. # Label Retrieval sub-component OPTIONAL { ?E ?P ?L. FILTER (?P = ). Filter(lang(?L)="en") } } UNION

#### Query component extracting the entities knowledge { ?E ?T. FILTER (regex(str(?T), "^http://dbpedia\\.org/.*$", "i")). # Direct-type enforcing criterion OPTIONAL { ?E ?T1. ?T1 ?T. FILTER (?T1!=?T).} FILTER (!bound(?T1)). # Label Retrieval sub-component ?E ?P ?L. # Remove sub-properties because they duplicate the base property OPTIONAL { ?P ?P1. FILTER (regex(str(?P1), "^http://dbpedia\\.org/.*$", "i")). } FILTER (!bound(?P1)). # Forces predicates to be from the DBPedia domain OR rdfs:label FILTER (regex(str(?P), "^http://dbpedia\\.org/.*$", "i") || (?P = )). FILTER isLiteral(?L). # Forces labels to have at least one latin letter FILTER (regex(str(?L), "^.*[A-Z].*$", "i")). # Forces label to have english language tag FILTER (langMatches(lang(?L), "en")). Filter(lang(?L)="en")


} # General domain name filter for the Entity FILTER (regex(str(?E), "^http://dbpedia\\.org/.*$", "i")) . ?E .} #LIMIT 10

The only difference between the 2803 SPARQL queries was in the last filter statement, where FILTER-CLASS is the URI of the class:

?E FILTER-CLASS.}

Loading the DBpedia dictionary took 19 days and resulted in 2,022,854 terms being extracted into the domain lexicon. The experiment was conducted on a CentOS 5.2 Linux virtual machine running on an AMD Opteron 2431 2.40GHz CPU with 2 cores and 20GB of RAM allocated to that particular instance.


Bibliography

Bruce W. Ballard, Alan W. Biermann, and Anne H. Sigmon. An Experimental Study of Natural Language Programming. International Journal of Man-Machine Studies, 18(1):71–87, 1983.

James F. Allen, Bradford W. Miller, Eric K. Ringger, and Teresa Sikorski. A Robust System for Natural Spoken Dialogue. In Proceedings of the 34th annual meeting on Association for Computational Linguistics, pages 62–70, Morristown, NJ, USA, 1996. Association for Computational Lin- guistics. doi: http://dx.doi.org/10.3115/981863.981872.

Ion Androutsopoulos, Graeme D. Ritchie, and Peter Thanisch. Natural Language Interfaces to Databases - An Introduction. Journal of Natural Language Engineering, 1:29–81, 1995.

Grigoris Antoniou and Frank van Harmelen. A Semantic Web Primer. MIT Press, 2nd edition, 2008.

Bob Bailey. Getting the Complete Picture with Usability Testing. Usability Updates Newsletter, U.S. Department of Health and Human Services, March 2006. URL http://www.usability.gov/articles/newsletter/ pubs/030106news.html.

Madeleine Bates. Rapid Porting of the PARLANCE Natural Language Inter- face. In HLT ’89: Proceedings of the workshop on Speech and Natural Lan- guage, pages 83–88, Morristown, NJ, USA, 1989a. Association for Com- putational Linguistics. doi: http://dx.doi.org/10.3115/100964.100966.

Marcia J. Bates. The Design of Browsing and Berrypicking Techniques for the Online Search Interface. Online Review, 13(5):407–424, 1989b. URL http://www.gseis.ucla.edu/faculty/bates/berrypicking.html.

Sean Bechhofer, Robert Stevens, Gary Ng, Alex Jacoby, and . Guiding the User: an Ontology Driven Interface. User Interfaces to Data Intensive Systems, 1999. Proceedings, pages 158–161, 1999. doi: 10.1109/ UIDIS.1999.791472.

Richard K. Belew. Finding Out About: A Cognitive Perspective on Search Engine Technology and the WWW. Cambridge University Press, Cam- bridge, United Kingdom, 2000. ISBN ISBN-13: 9780521630283 — ISBN- 10: 0521630282.

Tim Berners-Lee. Weaving the Web. Orion Business Books, 1999.

Abraham Bernstein and Esther Kaufmann. GINO – A Guided Input Natural Language Ontology Editor. In 5th International Semantic Web Conference (ISWC2006), 2006.

Abraham Bernstein, Esther Kaufmann, and Christian Kaiser. Querying the Semantic Web with Ginseng: A Guided Input Natural Language Search Engine. In 15th Workshop on Information Technologies and Systems, pages 112–126, Las Vegas, NV, 2005.

Patrick Blackburn and Johan Bos. Working with Discourse Representation Structures. In Representation and Inference for Natural Language: A First Course in Computational Linguistics, volume 2, September 1999.

Harold Boley. The Rule Markup Language: RDF-XML Data Model, XML Schema Hierarchy, and XSL Transformations. In Proceedings of the Ap- plications of Prolog 14th international conference on Web knowledge man- agement and decision support, INAP’01, pages 5–22, Berlin, Heidelberg, 2003. Springer-Verlag. ISBN 3-540-00680-X. URL http://portal.acm. org/citation.cfm?id=1767370.1767373.

Jeen Broekstra and Arjohn Kampman. SeRQL: A Second Generation RDF Query Language. In In Proceedings of the SWAD-Europe Workshop on Semantic Web Storage and Retrieval, pages 13–14, 2003.


John Brooke. SUS: a “Quick and Dirty” Usability Scale. In P.W. Jordan, B. Thomas, B.A. Weerdmeester, and A.L. McClelland, editors, Usability Evaluation in Industry. Taylor and Francis, London, UK, 1996. URL http://www.usabilitynet.org/trump/documents/Suschapt.doc.

Joe Bullock. Informed Navigation: Based Hypermedia Linking. PhD thesis, , UK, 1999.

Robin D. Burke, Kristian J. Hammond, Vladimir Kulyukin, Steven L. Lyti- nen, Noriko Tomuro, and Scott Schoenberg. Question Answering from Frequently-Asked Question Files: Experiences with the FAQ Finder Sys- tem. Technical report, AI Magazine, 1996.

David Chin. An Analysis of Scripts Generated in Writing Between Users and Computer Consultants. National Computer Conference, pages 637–642, 1984.

Kenneth Church and Ramesh Patil. Coping with Syntactic Ambiguity or How to Put the Block in the Box. American Journal of Computational Linguistics, 8(3-4), 1982.

Philipp Cimiano, Peter Haase, and Jörg Heizmann. Porting Natural Language Interfaces Between Domains: an Experimental User Study with the ORAKEL System. In IUI ’07: Proceedings of the 12th international conference on Intelligent user interfaces, pages 180–189, New York, NY, USA, 2007. ACM. ISBN 1-59593-481-2. doi: http://doi.acm.org/10.1145/1216295.1216330.

Philipp Cimiano, Peter Haase, Jörg Heizmann, Matthias Mantel, and Rudi Studer. Towards Portable Natural Language Interfaces to Knowledge Bases – the Case of the ORAKEL System. Data and Knowledge Engineering, 65(2):325–354, May 2008. URL http://www.sciencedirect.com/science/article/B6TYX-4R68N7K-2/1/cb51e20a3f7e9877671960f8a1336595.

Peter Clark, Philip Harrison, Thomas Jenkins, John Thompson, and Richard H. Wojcik. Acquiring and Using World Knowledge Using a Restricted Subset of English. In Ingrid Russell and Zdravko Markov, editors, Proceedings of the 18th International FLAIRS Conference (FLAIRS’05), pages 506–511. AAAI Press, 2005. ISBN 1-57735-234-3. URL http://www.cs.utexas.edu/users/pclark/papers/flairs.pdf.

Peter Clark, Shaw-Yi Chaw, Ken Barker, Vinay Chaudhri, Philip Har- rison, James Fan, Bonnie John, Bruce Porter, Aaron Spaulding, John Thompson, and Peter Yeh. Capturing and Answering Ques- tions Posed to a Knowledge-Based System. In Proceedings of the 4th International Conference on Knowledge Capture (K-CAP’07), 2007. URL http://www.cs.utexas.edu/users/pclark/papers/kcap07.pdf. http://www.cs.utexas.edu/users/pclark/papers/kcap07.ppt.

Edgar F. Codd. Seven Steps to Rendezvous with the Casual User. In IFIP Working Conference Data Base Management, pages 179–200, 1974.

Michael Collins. Head-Driven Statistical Models for Natural Language Parsing. PhD thesis, University of Pennsylvania, 1999.

Anne Cregan, Rolf Schwitter, and Thomas Meyer. Sydney OWL Syntax - Towards a Controlled Natural Language Syntax for OWL 1.1. In OWLED, 2007.

Bruce Croft, Donald Metzler, and Trevor Strohman. Search Engines: Information Retrieval in Practice. Addison Wesley, 1 edition, February 2009. ISBN 0136072240. URL http://www.amazon.com/exec/obidos/redirect?tag=citeulike07-20&path=ASIN/0136072240.

Hamish Cunningham, Diana Maynard, Kalina Bontcheva, and Valentin Tablan. GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL’02), 2002.

Paulo Cesar G. da Costa, Kathryn B. Laskey, Kenneth J. Laskey, and Michael Pool, editors. End-User Evaluations of Semantic Web Technologies, 2005.

Danica Damljanovic. Towards Portable Controlled Natural Languages for Querying Ontologies. In Michael Rosner and Norbert Fuchs, editors, Proceedings of the 2nd Workshop on Controlled Natural Language, Lecture Notes in Computer Science. Springer Berlin / Heidelberg, Marettimo Island, Sicily, September 2010.

Danica Damljanovic and Kalina Bontcheva. Enhanced Semantic Access to Software Artefacts. In Workshop on Semantic Web Enabled Software Engineering (SWESE), Karlsruhe, Germany, October 2008.

Danica Damljanovic and Vladan Devedzic. Applying Semantic Web to E-Tourism. In Zongmin Ma and Huaiqing Wang, editors, The Semantic Web for Knowledge and Data Management: Technologies and Practices. Information Science Reference (IGI Global), 2008.

Danica Damljanovic and Vladan Devedzic. Semantic Web and E-tourism. In Mehdi Khosrow-Pour, editor, Encyclopedia of Information Science and Technology, Second edition. IGI Global, 2009.

Danica Damljanovic, Valentin Tablan, and Kalina Bontcheva. A Text-based Query Interface to OWL Ontologies. In 6th Language Resources and Evaluation Conference (LREC), Marrakech, Morocco, May 2008. ELRA.

Danica Damljanovic, Milan Agatonovic, and Hamish Cunningham. Usability of Natural Language Interfaces for Querying Ontologies (poster). In Workshop on Controlled Natural Language (CNL 2009), 2009.

Danica Damljanovic, Milan Agatonovic, and Hamish Cunningham. Identification of the Question Focus: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In 7th Language Resources and Evaluation Conference (LREC), La Valletta, Malta, May 2010a. ELRA.

Danica Damljanovic, Milan Agatonovic, and Hamish Cunningham. Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Extended Semantic Web Conference (ESWC 2010), Lecture Notes in Computer Science, Heraklion, Greece, June 2010b. Springer-Verlag.

Danica Damljanovic, Johann Petrak, Mihai Lupu, Hamish Cunningham, Mats Carlsson, Gunnar Engstrom, and Bo Andersson. Random Indexing for Finding Similar Nodes within Large RDF Graphs. In Proceedings of the Fourth International Workshop on Resource Discovery, Collocated with the 8th Extended Semantic Web Conference (ESWC 2011), Heraklion, Greece, June 2011.

Hoa Trang Dang. Overview of the TAC 2008 Opinion Question Answering and Summarization Tasks, 2008.

Hoa Trang Dang, Jimmy Lin, and Diane Kelly. Overview of the TREC 2006 Question Answering Track. In Proceedings of the Fifteenth Text REtrieval Conference, 2006.

Hoa Trang Dang, Diane Kelly, and Jimmy Lin. Overview of the TREC 2007 Question Answering Track. In Proceedings of the Sixteenth Text REtrieval Conference, 2007.

Mathieu d'Aquin, Claudio Baldassarre, Laurian Gridinoc, Sofia Angeletou, Marta Sabou, and Enrico Motta. Watson: A Gateway for Next Generation Semantic Web Applications. Poster, ISWC 2007, 2007. URL http://iswc2007.semanticweb.org/papers/Paper366-Watson-poster-ISWC07.pdf.

John Davies, Dieter Fensel, and Frank van Harmelen, editors. Towards the Semantic Web: Ontology-driven Knowledge Management. Wiley, 2002.

Brian Davis, Ahmad Ali Iqbal, Adam Funk, Valentin Tablan, Kalina Bontcheva, Hamish Cunningham, and Siegfried Handschuh. RoundTrip Ontology Authoring. In Amit P. Sheth, Steffen Staab, Mike Dean, Massimo Paolucci, Diana Maynard, Timothy W. Finin, and Krishnaprasad Thirunarayan, editors, International Semantic Web Conference, volume 5318 of Lecture Notes in Computer Science, pages 50–65. Springer, 2008. ISBN 978-3-540-88563-4.

Ronald Denaux, Vania Dimitrova, Anthony G. Cohn, Catherine Dolbear, and Glen Hart. Rabbit to OWL: Ontology Authoring with a CNL-Based Tool. In Norbert E. Fuchs, editor, CNL, volume 5972 of Lecture Notes in Computer Science, pages 246–264. Springer, 2009. ISBN 978-3-642-14417-2.

Yihong Ding, David Embley, and Stephen Liddle. Automatic Creation and Simplified Querying of Semantic Web Content: An Approach Based on Information-Extraction Ontologies. In Proceedings of the 1st Asian Semantic Web Conference, pages 400–414. Springer Berlin/Heidelberg, September 2006.

Samuel S. Epstein. Transportable Natural Language Processing through Simplicity—the PRE System. ACM Trans. Inf. Syst., 3(2):107–120, 1985. ISSN 1046-8188. doi: http://doi.acm.org/10.1145/3914.3985.

Christiane Fellbaum, editor. WordNet - An Electronic Lexical Database. MIT Press, 1998.

Norbert E. Fuchs, Kaarel Kaljurand, Tobias Kuhn, Gerold Schneider, Loic Royer, and Michael Schröder. Attempto Controlled English and the Semantic Web. Deliverable I2D7, REWERSE Project, April 2006. URL http://rewerse.net/deliverables/m24/i2-d7.pdf.

Norbert E. Fuchs, Kaarel Kaljurand, and Tobias Kuhn. Attempto Controlled English for Knowledge Representation. In Cristina Baroglio, Piero A. Bonatti, Jan Małuszyński, Massimo Marchiori, Axel Polleres, and Sebastian Schaffert, editors, Reasoning Web, Fourth International Summer School 2008, number 5224 in Lecture Notes in Computer Science, pages 104–124. Springer, 2008.

Adam Funk, Valentin Tablan, Kalina Bontcheva, Hamish Cunningham, Brian Davis, and Siegfried Handschuh. CLOnE: Controlled Language for Ontology Editing. In Proceedings of the 6th International Semantic Web Conference (ISWC 2007), Busan, Korea, November 2007. URL http://gate.ac.uk/sale/iswc07/clone/clone.pdf.

Ruifang Ge and Raymond J. Mooney. Learning a Compositional Semantic Parser Using an Existing Syntactic Parser. In ACL-IJCNLP '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, pages 611–619, Morristown, NJ, USA, 2009. Association for Computational Linguistics. ISBN 978-1-932432-46-6.

Yolanda Gil and Natasha Fridman Noy, editors. Proceedings of the 5th International Conference on Knowledge Capture (K-CAP 2009), September 1-4, 2009, Redondo Beach, California, USA, 2009. ACM. ISBN 978-1-60558-658-8.

Mark Greenwood. Open-Domain Question Answering. PhD thesis, Univer- sity of Sheffield, Department of Computer Science, 2006.

Barbara J. Grosz, Douglas E. Appelt, Paul A. Martin, and Fernando C. N. Pereira. TEAM: An Experiment in the Design of Transportable Natural-Language Interfaces. Artificial Intelligence, 32(2):173–243, 1987.

Thomas R. Gruber. Towards Principles for the Design of Ontologies Used for Knowledge Sharing. In N. Guarino and R. Poli, editors, Formal Ontology in Conceptual Analysis and Knowledge Representation, Deventer, The Netherlands, 1993. Kluwer Academic Publishers. URL http://www.citeulike.org/group/1808/article/821199.

Ramanathan Guha, Rob McCool, and Eric Miller. Semantic Search. In WWW '03: Proceedings of the 12th international conference on World Wide Web, pages 700–709, New York, NY, USA, 2003. ACM. ISBN 1-58113-680-3. doi: http://doi.acm.org/10.1145/775152.775250.

Catalina Hallett. Generic Querying of Relational Databases Using Natural Language Generation Techniques. In Proceedings of the Fourth International Natural Language Generation Conference, pages 95–102. Association for Computational Linguistics, July 2006.

Catalina Hallett, Donia Scott, and Richard Power. Composing Questions through Conceptual Authoring. Computational Linguistics, 33(1):105–133, 2007.

Larry Harris. Experience with INTELLECT: Artificial Intelligence Technology Transfer. AI Magazine, 5(2), 1984. ISSN 0738-4602. URL http://www.aaai.org/ojs/index.php/aimagazine/article/view/437/373.

Glen Hart, Martina Johnson, and Catherine Dolbear. Rabbit: Developing a Control Natural Language for Authoring Ontologies. In Proceedings of the European Semantic Web Conference ESWC 2008, Tenerife, Spain, pages 348–360. Springer, June 1-5 2008. doi: 10.1007/978-3-540-68234-9_27. URL http://www.springerlink.com/content/e6370k7l62gnm514.

Olaf Hartig and Jun Zhao. Using Web Data Provenance for Quality Assessment. In Proceedings of the First International Workshop on the Role of Semantic Web in Provenance Management (SWPM'09) at the International Semantic Web Conference (ISWC'09), Washington D.C., USA, 2009.

Jeff Heflin and James Hendler. Searching the Web with SHOE. In AAAI Workshop 2000, pages 35–40, 2000. URL http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.8348.

Gary G. Hendrix, Earl D. Sacerdoti, Daniel Sagalowicz, and Jonathan Slocum. Developing a Natural Language Interface to Complex Data. ACM Transactions on Database Systems, 3:105–147, 1978.

Lynette Hirschman, Marc Light, Eric Breck, and John D. Burger. Deep Read: A Reading Comprehension System. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pages 325–332, 1999.

Ian Horrocks. DAML+OIL: a Description Logic for the Semantic Web. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 25(1):4–9, March 2002.

Ian Horrocks, Peter F. Patel-Schneider, Harold Boley, Said Tabet, Benjamin Grosof, and Mike Dean. SWRL: A Semantic Web Rule Language Combining OWL and RuleML. W3C Member Submission, May 2004. URL http://www.w3.org/Submission/SWRL/.

Carlos A. Hurtado, Alexandra Poulovassilis, and Peter T. Wood. Ranking Approximate Answers to Semantic Web Queries. In Lora Aroyo, Paolo Traverso, Fabio Ciravegna, Philipp Cimiano, Tom Heath, Eero Hyvönen, Riichiro Mizoguchi, Eyal Oren, Marta Sabou, and Elena Paslaru Bontas Simperl, editors, ESWC, volume 5554 of Lecture Notes in Computer Science, pages 263–277. Springer, 2009. ISBN 978-3-642-02120-6.

Eero Hyvönen and Eetu Mäkelä. Semantic Autocompletion. In Proceedings of the first Asian Semantic Web Conference (ASWC 2006), Beijing. Springer-Verlag, New York, August 4-9 2006.

Alex Iskold. Semantic Search: The Myth and Reality. ReadWriteWeb, May 2008. URL http://www.readwriteweb.com/archives/semantic_search_the_myth_and_reality.php.

Susan Jamieson. Likert Scales: How to (Ab)Use Them. Medical Education, 38(12):1217–1218, 2004. URL http://www.ncbi.nlm.nih.gov/pubmed/15566531.

Anastasia Karanastasi, Alexandros Zotos, and Stavros Christodoulakis. The OntoNL Framework for Natural Language Interface Generation and a Domain-Specific Application. In Digital Libraries: Research and Development, pages 228–237. Springer Berlin / Heidelberg, 2007.

Rohit J. Kate and Raymond J. Mooney. Learning Language Semantics from Ambiguous Supervision. In AAAI, pages 895–900. AAAI Press, 2007. ISBN 978-1-57735-323-2.

Esther Kaufmann and Abraham Bernstein. How Useful are Natural Language Interfaces to the Semantic Web for Casual End-users? In Proceedings of the Fourth European Semantic Web Conference (ESWC 2007), Innsbruck, Austria, June 2007.

Esther Kaufmann, Abraham Bernstein, and Renato Zumstein. Querix: A Natural Language Interface to Query Ontologies Based on Clarification Dialogs. In 5th International Semantic Web Conference (ISWC 2006), pages 980–981. Springer, November 2006.

Esther Kaufmann, Abraham Bernstein, and Lorenz Fischer. NLP-Reduce: A Naive but Domain-independent Natural Language Interface for Querying Ontologies. In Proceedings of the European Semantic Web Conference ESWC 2007, Innsbruck, Austria. Springer, June 4-5 2007.

Nancy Davis Kho. Letting End Users Ask Questions, Stat! June 2008.

Michael Kifer, Georg Lausen, and James Wu. Logical Foundations of Object-oriented and Frame-based Languages. Journal of the ACM, 42:741–843, July 1995. ISSN 0004-5411. doi: http://doi.acm.org/10.1145/210332.210335. URL http://doi.acm.org/10.1145/210332.210335.

Dan Klein and Christopher D. Manning. Fast Exact Inference with a Factored Model for Natural Language Parsing. In Suzanna Becker, Sebastian Thrun, and Klaus Obermayer, editors, Advances in Neural Information Processing Systems 15 - Neural Information Processing Systems, NIPS 2002, pages 3–10. MIT Press, 2002. ISBN 0-262-02550-7. URL http://books.nips.cc/papers/files/nips15/CS01.pdf.

Jürgen Krause. Natural Language Access to Information Systems. An Evaluation Study of its Acceptance by End Users. Information Systems, 5:297–319, 1980.

Julian Kupiec. MURAX: A Robust Linguistic Approach for Question Answering Using an On-Line Encyclopedia. In Robert Korfhage, Edie M. Rasmussen, and Peter Willett, editors, SIGIR, pages 181–190. ACM, 1993. ISBN 0-89791-605-0.

Cody Kwok, Oren Etzioni, and Daniel S. Weld. Scaling Question Answering to the Web. ACM Trans. Inf. Syst., 19(3):242–262, 2001. ISSN 1046-8188. doi: http://doi.acm.org/10.1145/502115.502117.

Yuangui Lei, Victoria Uren, and Enrico Motta. SemSearch: a Search Engine for the Semantic Web. In Managing Knowledge in a World of Networks, pages 238–245. Springer Berlin / Heidelberg, 2006.

Yaoyong Li, Kalina Bontcheva, and Hamish Cunningham. Adapting SVM for Data Sparseness and Imbalance: A Case Study on Information Extraction. Natural Language Engineering, 15(2):241–271, 2009. URL http://journals.cambridge.org/repo_A45LfkBD.

Rensis Likert. A Technique for the Measurement of Attitudes. Archives of Psychology, 22(140):1–55, 1932.

Serge Linckels and Christoph Meinel. Semantic Interpretation of Natural Language User Input to Improve Search in Multimedia Knowledge Base. it - Information Technologies, 49(1):40–48, 2007.

Diane Litman. Spoken Dialogue for Intelligent Tutoring Systems: Opportunities and Challenges. http://www.cs.pitt.edu/~litman/keynote06_noav.ppt, 2006.

Diane Litman and Kate Forbes-Riley. Correlations between Dialogue Acts and Learning in Spoken Tutoring Dialogues. Natural Language Engineering, 12(02):161–176, 2006. doi: 10.1017/S1351324906004165. URL http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=439981&fulltextType=RA&fileId=S1351324906004165.

Diane J. Litman and Scott Silliman. ITSPOKE: an Intelligent Tutoring Spoken Dialogue System. In HLT-NAACL ’04: Demonstration Papers at HLT-NAACL 2004, pages 5–8, Morristown, NJ, USA, 2004. Association for Computational Linguistics.

Vanessa Lopez and Enrico Motta. Ontology-driven Question Answering in AquaLog. In NLDB 2004 (9th International Conference on Applications of Natural Language to Information Systems), Manchester, UK, 2004.

Vanessa Lopez, Victoria Uren, Enrico Motta, and Michele Pasin. AquaLog: An Ontology-driven Question Answering System for Organizational Semantic Intranets. Web Semantics: Science, Services and Agents on the World Wide Web, 5(2):72–105, June 2007.

Vanessa Lopez, Andriy Nikolov, Miriam Fernandez, Marta Sabou, Victoria Uren, and Enrico Motta. Merging and Ranking Answers in the Semantic Web: The Wisdom of Crowds. In Asunción Gómez-Pérez, Yong Yu, and Ying Ding, editors, The Semantic Web, volume 5926 of Lecture Notes in Computer Science, pages 135–152. Springer Berlin / Heidelberg, 2009a. doi: 10.1007/978-3-642-10871-6_10. URL http://dx.doi.org/10.1007/978-3-642-10871-6_10.

Vanessa Lopez, Victoria S. Uren, Marta Sabou, and Enrico Motta. Cross Ontology Query Answering on the Semantic Web: an Initial Evaluation. In Gil and Noy [2009], pages 17–24. ISBN 978-1-60558-658-8.

Vanessa Lopez, Miriam Fernandez, Enrico Motta, and Nico Stieler. PowerAqua: Supporting Users in Querying and Exploring the Semantic Web Content. Semantic Web Journal, 2011.

Eetu Mäkelä. View-Based Search Interfaces for the Semantic Web. Master's thesis, University of Helsinki, June 2006.

Ashok Malhotra. Design criteria for a Knowledge based English Language System for Management: An Experimental Analysis. PhD thesis, Massachusetts Institute of Technology, Alfred P. Sloan School of Management, 1975.

Frank Manola and Eric Miller. RDF Primer. W3C recommendation, W3C, February 2004. Published online on February 10th, 2004 at http://www.w3.org/TR/2004/REC-rdf-primer-20040210/.

Scott Miller, David Stallard, Robert Bobrow, and Richard Schwartz. A Fully Statistical Approach to Natural Language Interfaces. In Proceedings of the 34th annual meeting on Association for Computational Linguistics, pages 55–61, Morristown, NJ, USA, 1996. Association for Computational Linguistics. doi: http://dx.doi.org/10.3115/981863.981871.

Dan Moldovan and Sanda M. Harabagiu. The Structure and Performance of an Open-domain Question Answering System. In Proceedings of the 38th Meeting of the Association for Computational Linguistics (ACL-2000), pages 563–570, 2000.

Raymond J. Mooney. Learning for Semantic Interpretation: Scaling Up without Dumbing Down. In James Cussens and Saso Dzeroski, editors, Learning Language in Logic, volume 1925 of Lecture Notes in Computer Science, pages 57–66. Springer, 1999. ISBN 3-540-41145-3.

Enrico Motta. Reusable Components for Knowledge Modelling: Case Studies in Parametric Design Problem Solving. IOS Press, Amsterdam, The Netherlands, 1st edition, 1999. ISBN 1586030035.

Jakob Nielsen. Heuristic Evaluation. In J. Nielsen and R. L. Mack, editors, Usability Inspection Methods. John Wiley and Sons, New York, 1994.

Jakob Nielsen. Enhancing the Explanatory Power of Usability Heuristics. In CHI '94: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 152–158, New York, NY, USA, 1994. ACM. ISBN 0-89791-650-6. doi: http://doi.acm.org/10.1145/191666.191729.

Yun Niu, Graeme Hirst, Gregory McArthur, and Patricia Rodriguez-Gianolli. Answering Clinical Questions with Role Identification. In Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine, pages 73–80, 2003.

William Ogden and Philip Bernick. Using Natural Language Interfaces. In M. Helander, editor, Handbook of Human-Computer Interaction. Elsevier Science Publishers B.V. (North-Holland), 1997.

William Ogden, James Mcdonald, Philip Bernick, and Roger Chadwick. Habitability in Question-Answering Systems, Series: Text, Speech and Language Technology. In T. Strzalkowski and S. Harabagiu, editors, Advances in Open Domain Question Answering, volume 32, pages 457–473. Springer, Netherlands, 2006.

Andrew S. Patrick and Thomas E. Whalen. Field Testing a Natural Language Information System: Usage Characteristics and Users' Comments. Interacting with Computers, 4(2):218–230, 1992.

John L. Phillips. How to Think about Statistics. W. H. Freeman and Company, New York, 1996.

Ana-Maria Popescu, Oren Etzioni, and Henry Kautz. Towards a Theory of Natural Language Interfaces to Databases. In IUI '03: Proceedings of the 8th international conference on Intelligent user interfaces, pages 149–157, New York, NY, USA, 2003. ACM. ISBN 1-58113-586-6. doi: http://doi.acm.org/10.1145/604045.604070.

Ana-Maria Popescu, Alex Armanasu, Oren Etzioni, David Ko, and Alexander Yates. Modern Natural Language Interfaces to Databases: Composing Statistical Parsing with Semantic Tractability. In COLING '04: Proceedings of the 20th international conference on Computational Linguistics, page 141, Morristown, NJ, USA, 2004. Association for Computational Linguistics. doi: http://dx.doi.org/10.3115/1220355.1220376.

Borislav Popov, Atanas Kiryakov, Angel Kirilov, Dimitar Manov, Damyan Ognyanoff, and Miroslav Goranov. KIM – Semantic Annotation Platform. In 2nd International Semantic Web Conference (ISWC2003), pages 484–499, Berlin, 2003. Springer.

Borislav Popov, Atanas Kiryakov, Damyan Ognyanoff, Dimitar Manov, and Angel Kirilov. KIM – A Semantic Platform for Information Extraction and Retrieval. Natural Language Engineering, 10:375–392, 2004.

Jenny Preece, Yvonne Rogers, Helen Sharp, David Benyon, Simon Holland, and Tom Carey. Human-Computer Interaction. Addison-Wesley, Wokingham, UK, 1994.

Eric Prud'hommeaux and Andy Seaborne. SPARQL Query Language for RDF. W3C Recommendation, 15 January 2008, W3C, 2008. Available at http://www.w3.org/TR/rdf-sparql-query/.

Payam Refaeilzadeh, Lei Tang, and Huan Liu. Cross-Validation. In Ling Liu and M. Tamer Özsu, editors, Encyclopedia of Database Systems, pages 532–538. Springer US, 2009. ISBN 978-0-387-35544-3, 978-0-387-39940-9.

Josef Ruppenhofer, Michael Ellsworth, Miriam R. L. Petruck, Christopher R. Johnson, and Jan Scheffczyk. FrameNet II: Extended Theory and Practice. Technical report, ICSI, 2005. URL http://framenet.icsi.berkeley.edu/book/book.pdf.

Erik F. Tjong Kim Sang and Fien De Meulder. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition. In Proceedings of CoNLL-2003, pages 142–147. Edmonton, Canada, 2003.

Rolf Schwitter, Kaarel Kaljurand, Anne Cregan, Catherine Dolbear, and Glen Hart. A Comparison of Three Controlled Natural Languages for OWL 1.1. In 4th OWL Experiences and Directions Workshop (OWLED 2008 DC), 2008.

Robert F. Simmons. Answering English Questions by Computer: a Survey. Commun. ACM, 8(1):53–70, 1965. ISSN 0001-0782. doi: http://doi.acm.org/10.1145/363707.363732.

Brian M. Slator, Matthew P. Anderson, and Walt Conley. Pygmalion at the Interface. Communications of the ACM, 29(7):599–604, 1986.

Michael K. Smith, Chris Welty, and Deborah L. McGuinness. OWL Web Ontology Language Guide. World Wide Web Consortium, Recommendation REC-owl-guide-20040210, February 2004.

Nenad Stojanovic. On the Role of a User’s Knowledge Gap in an Information Retrieval Process. In Proceedings of the Third International Conference on Knowledge Capture, October 2005a.

Nenad Stojanovic. On the Query Refinement in the Ontology-based Searching for Information. Information Systems, 30(7):543–563, 2005b.

Tomek Strzalkowski and Sanda M. Harabagiu. Advances in Open Domain Question Answering. Springer, Netherlands, 2006.

Heiner Stuckenschmidt, Anita de Waard, Ravinder Bhogal, Christiaan Fluit, Arjohn Kampman, Jan van Buel, Erik M. van Mulligen, Jeen Broekstra, Ian Crowlesmith, Frank van Harmelen, and Tony Scerri. A Topic-Based Browser for Large Online Resources. In Enrico Motta, Nigel Shadbolt, Arthur Stutt, and Nicholas Gibbins, editors, EKAW, volume 3257 of Lecture Notes in Computer Science, pages 433–448. Springer, 2004. ISBN 3-540-23340-7.

Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: an Introduction. MIT Press, Cambridge, Mass., 1998.

Valentin Tablan, Tamara Polajnar, Hamish Cunningham, and Kalina Bontcheva. User-friendly Ontology Authoring Using a Controlled Language. In 5th Language Resources and Evaluation Conference (LREC), Genoa, Italy, May 2006. ELRA. URL http://gate.ac.uk/sale/lrec2006/clie/clie.pdf.

Valentin Tablan, Danica Damljanovic, and Kalina Bontcheva. A Natural Language Query Interface to Structured Information. In Proceedings of the 5th European Semantic Web Conference (ESWC 2008), volume 5021 of Lecture Notes in Computer Science, pages 361–375, Tenerife, Spain, June 2008. Springer-Verlag New York Inc.

Lappoon R. Tang and Raymond J. Mooney. Using Multiple Clause Constructors in Inductive Logic Programming for Semantic Parsing. In Proceedings of the 12th European Conference on Machine Learning, pages 466–477, Freiburg, Germany, 2001.

Craig W. Thompson, Paul Pazandak, and Harry R. Tennant. Talk to Your Semantic Web. IEEE Internet Computing, 9(6):75–78, 2005.

Thanh Tran, Tobias Mathäß, and Peter Haase. Usability of Keyword-Driven Schema-Agnostic Search. In Lora Aroyo, Grigoris Antoniou, Eero Hyvönen, Annette ten Teije, Heiner Stuckenschmidt, Liliana Cabral, and Tania Tudorache, editors, ESWC (2), volume 6089 of Lecture Notes in Computer Science, pages 349–364. Springer, 2010. ISBN 978-3-642-13488-3.

Thomas S. Tullis and Jacqueline N. Stetson. A Comparison of Questionnaires for Assessing Website Usability. In Usability Professionals' Association Conference, Minneapolis, Minnesota, June 2004. URL http://home.comcast.net/%7Etomtullis/publications/UPA2004TullisStetson.pdf.

Giovanni Tummarello, Renaud Delbru, and Eyal Oren. Sindice.com: Weaving the Open Linked Data. In Proceedings of the 6th International Semantic Web Conference, Busan, Korea, 2007.

Mike Unwalla. AECMA Simplified English. Communicator, Winter 2004.

Ellen M. Voorhees. The TREC-8 Question Answering Track Report. In Proceedings of the Eighth Text REtrieval Conference, pages 77–82, 1999.

Ellen M. Voorhees. Overview of the TREC-9 Question Answering Track. In Proceedings of the Ninth Text REtrieval Conference, pages 71–80, 2000.

Ellen M. Voorhees. Overview of the TREC 2001 Question Answering Track. In Proceedings of the Tenth Text REtrieval Conference, pages 42–51, 2001.

Ellen M. Voorhees. Overview of the TREC 2002 Question Answering Track. In Proceedings of the Eleventh Text REtrieval Conference, pages 115–123, 2002.

Ellen M. Voorhees. Overview of the TREC 2003 Question Answering Track. In Proceedings of the Twelfth Text REtrieval Conference, pages 54–68, 2003.

Ellen M. Voorhees. Overview of the TREC 2004 Question Answering Track. In Proceedings of the Thirteenth Text REtrieval Conference, 2004.

David Waltz. Natural Language Access to a Large Database: an Engineering Approach. In IJCAI'75: Proceedings of the 4th international joint conference on Artificial intelligence, pages 868–872, San Francisco, CA, USA, 1975. Morgan Kaufmann Publishers Inc.

David L. Waltz. An English Language Question Answering System for a Large Relational Database. Commun. ACM, 21(7):526–539, 1978. ISSN 0001-0782. doi: http://doi.acm.org/10.1145/359545.359550.

Chong Wang, Miao Xiong, Qi Zhou, and Yong Yu. PANTO: A Portable Natural Language Interface to Ontologies. In The Semantic Web: Research and Applications, pages 473–487. Springer, 2007. doi: 10.1007/978-3-540-72667-8_34.

Haofen Wang, Qiaoling Liu, Thomas Penin, Linyun Fu, Lei Zhang, Duc Thanh Tran, Yong Yu, and Yue Pan. Semplore: A Scalable IR Approach to Search the Web of Data. Journal of Web Semantics, 7(3), 2009.

David H. D. Warren and Fernando C. N. Pereira. An Efficient Easily Adaptable System for Interpreting Natural Language Queries. Computational Linguistics, 8:110–122, July 1982. ISSN 0891-2017. URL http://portal.acm.org/citation.cfm?id=972942.972944.

William C. Watt. Habitability. American Documentation, pages 338–351, July 1968.

Nick Webb and Bonnie L. Webber. Special Issue on Interactive Question Answering: Introduction. Natural Language Engineering, 15(1):1–8, 2009. ISSN 1351-3249. doi: http://dx.doi.org/10.1017/S1351324908004877.

Daniel Weld. Invited Talk: “Machine Reading: from Wikipedia to the Web”. In Gil and Noy [2009]. ISBN 978-1-60558-658-8.

Terry Winograd. Understanding Natural Language. Academic Press, Inc., Orlando, FL, USA, 1972. ISBN 0127597506.

Yuk Wah Wong and Raymond J. Mooney. Learning for Semantic Parsing with Statistical Machine Translation. In Robert C. Moore, Jeff A. Bilmes, Jennifer Chu-Carroll, and Mark Sanderson, editors, HLT-NAACL. The Association for Computational Linguistics, 2006.

Yuk Wah Wong and Raymond J. Mooney. Learning Synchronous Grammars for Semantic Parsing with Lambda Calculus. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL-2007), pages 960–967. Association for Computational Linguistics, 2007.

William A. Woods, Robert M. Kaplan, and Bonnie Lynn Nash-Webber. The Lunar Sciences Natural Language Information System: Final Report. Technical Report BBN Report 2378, Bolt Beranek and Newman Inc., Cambridge, Massachusetts, 1972.

Karl L. Wuensch. An Introduction to Within Subjects Analysis of Variance, 2009. URL http://core.ecu.edu/psyc/wuenschk/MV/RM-ANOVA/ws-anova.doc.

John M. Zelle and Raymond J. Mooney. Learning Semantic Grammars with Constructive Inductive Logic Programming. In Proceedings of the Eleventh National Conference on Artificial Intelligence, pages 817–822. Morgan Kaufmann, 1993.

John M. Zelle and Raymond J. Mooney. Learning to Parse Database Queries Using Inductive Logic Programming. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, AAAI'96, pages 1050–1055. AAAI Press, 1996. ISBN 0-262-51091-X. URL http://portal.acm.org/citation.cfm?id=1864519.1864543.

Luke S. Zettlemoyer and Michael Collins. Learning Context-dependent Mappings from Sentences to Logical Form. In ACL-IJCNLP '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, pages 976–984, Morristown, NJ, USA, 2009. Association for Computational Linguistics. ISBN 978-1-932432-46-6.

Zhiping Zheng. AnswerBus Question Answering System. In Proceedings of the second international conference on Human Language Technology Research, pages 399–404, San Francisco, CA, USA, 2002. Morgan Kaufmann Publishers Inc.

Elizabeth Zolton-Ford. Reducing Variability in Natural-Language Interactions with Computers. In Proceedings of the Human Factors Society 28th Annual Meeting, pages 768–772. The Human Factors Society, 1984.

Victor Zue, Stephanie Seneff, James Glass, Joseph Polifroni, Christine Pao, Timothy J. Hazen, and Lee Hetherington. Jupiter: A Telephone-Based Conversational Interface for Weather Information. IEEE Trans. on Speech and Audio Processing, 8:85–96, 2000.
