Topic Driven Access to Scientific Handbooks
UvA-DARE (Digital Academic Repository)

Topic driven access to scientific handbooks
Caracciolo, C.
Publication date: 2008
Document version: Final published version

Citation for published version (APA):
Caracciolo, C. (2008). Topic driven access to scientific handbooks. SIKS.

UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl)

Topic Driven Access to Scientific Handbooks
Caterina Caracciolo

Doctoral committee
Supervisor: Prof.dr. M. de Rijke
Co-supervisor: Dr. J. Kircz
Co-supervisor: Prof.dr. G. Tamburrini
Other members: Prof.dr. J. van Eijck, Prof.dr. F. van Harmelen, Dr.ir. J. Kamps, Prof.dr. J. Mackenzie Owen, Dr. L. Serafini, Prof.dr. B. Wielinga

Faculteit der Natuurwetenschappen, Wiskunde en Informatica

SIKS Dissertation Series No. 2008-13

The research reported in this thesis has been carried out under the auspices of SIKS, the Dutch Research School for Information and Knowledge Systems. It was carried out at the Informatics Institute of the University of Amsterdam and supported by Elsevier, Amsterdam.
Printed and bound by Delta 2000, Roma
ISBN: 978-90-5776-176-8

To the places of life and to the bicycles that cross them.

Acknowledgments

To my surprise, the moment has arrived. I am writing the acknowledgments of my thesis and I think back on the period of my life that I now call "Amsterdam." Now, in sunny, warm and chaotic Rome, all the study, work and life that I had in Amsterdam seem so far away that it is surprising that it all happened. But it did happen, for real. I would like to thank many, many people and let them know that they have been so important to me. This is the occasion to do so.

The first person to mention is Rosella Gennari, without whom I would never have heard of an open position at the ILLC department of the UvA and I would not be writing these words now. That position was under the supervision of Maarten de Rijke, somebody I only had a vague memory of because of one class I attended when I was an undergraduate exchange student in Amsterdam. I went to that class only out of curiosity; who would have thought he was going to play such an important role in the years to come? Maarten has been a great example and a continuous surprise to me, because of his quick mind and kind manners, his taste in music and his approach to life. With Maarten I learned an uncountable number of things, all of which will stay with me forever.

Together with Maarten, Joost Kircz has followed the entire development of this work. Inspiring conversations, vision, human touch: I had the unique chance to enjoy all of these from him. Guglielmo Tamburrini was my supervisor when I started my PhD in Pisa; he then kindly agreed to continue to follow my work when I moved to Amsterdam and when he moved to Naples.

Other people have been influential in the completion of this thesis.
With Frank van Harmelen I learned to adopt a healthy perspective on things and to stand up for one's own point of view; Jan van Eijck taught me to appreciate the humility and the intellectual courage that should always accompany research work; the Elsevier team showed me the importance of good coordination in carrying out projects of any size. All of them showed me the importance of setting oneself a goal and keeping to it despite any interference.

I would like to thank all the members of my doctoral committee for agreeing to read the manuscript and for providing insightful comments, thereby helping to make my thesis a better piece of work: Jan van Eijck, Frank van Harmelen, Jaap Kamps, John Mackenzie Owen, Luciano Serafini, Bob Wielinga.

A special thanks to Arjen de Vries for generously reading early versions of different chapters of my thesis and commenting on them. Willem van Hage also deserves special thanks, not only for contributing to this work as an undergraduate student, but also for his friendly and enthusiastic presence. David Ahn and Stefan Schlobach read several versions of chapters of this thesis and commented on content and presentation. Maarten van Schie also contributed to this work as an undergraduate student. Andrew Bagdanov read nearly all versions of this manuscript, proofread everything and greatly improved the quality of my English.

I shared a large part of the journey that led to this thesis with dear friends of inestimable value: Juan Heguiabehere, Gabriel Infante Lopez and Gabriele Musillo. Their presence, or even simply thinking of them, always warmed me to the true depths of my soul. Veronica Prada Moroni stayed close to me despite the physical distance between Pisa and Amsterdam; she knows every fold of this story, perhaps better than I do.
I would like to thank Claudia, and Tobias and Frida (and now Malvina!), for their always warm welcome and the little yellow duck; Roberta, and Andrea and Diego, for the conversations, the laughter and the trips to Naples and Bologna; Francesca for her constant support and the esteem she has always affectionately shown me; Roberto Lalli with Haci Ciugo and the Baires life; Laura, Alessandro, Tiziana and Silvia for always having been so indulgent towards my fleeting appearances in Pescara.

My heartfelt hugs go to Rosella Gennari, Carlos Areces, David Ahn, Raffaella Bernardi, Paola Monachesi, Stefano Bocconi, Daniele Moroni, Martina Dominici; to the Spanish crowd I spent many evenings and nights with; to all my colleagues at the ILLC and at the ILPS; to Fransje Enserink who is not here anymore; to Virginie and Pauline who helped me out from a distance.

And Rome? Here I found breathtaking beauty, the contradictions that keep one's spirit alive and a great working environment at the FAO, the Food and Agriculture Organization of the UN. In Rome, I was so lucky to find so many "new" friends: Giorgia, Federica and Simona, ManuManu Manuela, Valentina, Natalia and Francesca, Pietro, Claudio & the splendid Livia, Federico M. and Massimo, Tommaso and Daniela, the fabulous Occhionero and the whole "Ciclopicnic Entertainment," the Roman Critical Mass and the Ciemmona Interplanetaria. A big happy smile goes to all of them.

My family is always with me, and distance does not matter. My parents and my siblings Eligio, Roberto, Paola and Carlo, and their families, Barbara, Lucrezia Luna and Francesco, Teresa and Eugenia: I wish I could hug you all in a single embrace.

Finally, Andy. Together in Amsterdam, Pisa, Florence, Rome and who knows where else. You always believed this moment would come, even when I myself would doubt. Thank you for this and much, much, much more.

Rome, March 2008

Contents

1 Introduction
  1.1 Problem Statement
  1.2 Organization of this Thesis
  1.3 Scope of this Thesis
  1.4 Origins of the Material
2 A Vision on Access to Electronic Scientific Handbooks
  2.1 The Vision
  2.2 Handbooks
  2.3 Searching and Accessing Textual Documents
  2.4 Semantic Structures
  2.5 A Modular Approach to Focused Access
  2.6 Discussion
3 A Browsable Map for Logic and Language
  3.1 User Requirements
  3.2 Design and Content of the Map
  3.3 Topics
  3.4 Relations
    3.4.1 Many Flavors of ISA
    3.4.2 Part-of
    3.4.3 Instance
    3.4.4 Domain Specific Relations
    3.4.5 Non-Hierarchical Relations
  3.5 The LoLaLi Map: Features
  3.6 Discussion
    3.6.1 Dealing with more Relations
    3.6.2 More on Subtopics
  3.7 Editing and Managing the LoLaLi Map
    3.7.1 A Bit of History: First Attempts
    3.7.2 Protégé and RDFS Modeling
    3.7.3 Accessing the Map through a Web Browser
  3.8 Conclusions
4 Interacting with the LoLaLi Map
  4.1 Requirements for the User Interface
  4.2 Related Work on Tree and Graph Visualization
    4.2.1 Global views
    4.2.2 Local views
  4.3 A User Interface for the LoLaLi Map
  4.4 User Studies
    4.4.1 The Setting
    4.4.2 The Questionnaire
    4.4.3 Results
    4.4.4 Discussion
  4.5 Conclusions
5 Looking for Link Targets
  5.1 Passage Retrieval
    5.1.1 Structural Passage Retrieval
    5.1.2 Fixed Size Passage Retrieval
    5.1.3 Semantic Passage Retrieval
  5.2 Linear Topic Segmentation: Two Algorithms
    5.2.1 TextTiling
    5.2.2 C99
  5.3 Building the Ground Truth