Personal Knowledge Models with Semantic Technologies
Max Völkel

Bibliographic information
Detailed bibliographic data is available on the Internet at http://pkm.xam.de.

Cover design: Stefanie Miller

© 2010 Max Völkel, Ritterstr. 6, 76133 Karlsruhe

All rights reserved. This work, including all individual contributions and illustrations it contains, is protected by copyright. Any use not expressly permitted by copyright law requires the prior consent of the author. This applies in particular to reproduction, adaptation, translation, evaluation by databases, and storage and processing in electronic systems. Further versions of this work and further licence information are listed at http://pkm.xam.de.

Dissertation approved by the Faculty of Economics (Fakultät für Wirtschaftswissenschaften) of the Karlsruhe Institute of Technology (KIT) for the academic degree of Doctor of Economics (Dr. rer. pol.), submitted by Dipl.-Inform. Max Völkel.

Date of the oral examination: 14 July 2010
Referee: Prof. Dr. Rudi Studer
Co-referee: Prof. Dr. Klaus Tochtermann
Examiner: Prof. Dr. Gerhard Satzger
Chair of the examination committee: Prof. Dr. Christine Harbring

Abstract

Following the ideas of Vannevar Bush (1945) and Douglas Engelbart (1963), this thesis explores how computers can help humans to be more intelligent. More precisely, the idea is to reduce the limitations of cognitive processes with the help of knowledge cues, which are external reminders of previously experienced internal knowledge. A knowledge cue is any kind of symbol, pattern or artefact that its creator makes with the intent of later using it to re-evoke a previously experienced mental state. The main processes in creating, managing and using knowledge cues are analysed.
Based on the resulting knowledge cue life-cycle, an economic analysis of costs and benefits in Personal Knowledge Management (PKM) processes is performed.

The main result of this thesis is a meta-model for representing knowledge cues, called Conceptual Data Structures (CDS). It consists of three parts: (1) a simple, expressive data model; (2) a small relation ontology that unifies the relations found in the cognitive models of tools used for PKM tasks, e.g., documents, mind-maps, hypertext, or semantic wikis; and (3) an interchange format for structured text together with a corresponding wiki syntax. Together, these three parts allow knowledge cues to be represented in a unified way at varying degrees of granularity (number and size of items), structuredness (relations between items), and formality (fraction of items typed with items from a meta-model).

The CDS model has been implemented in Java. Based on this reference implementation, several tools for personal knowledge management have been created (one by the author of this work and two external tools). All three tools have been used in a comparative evaluation of 125 person-hours. In the evaluation, the CDS data model was successfully used to represent and use (retrieve) artefacts of different degrees of formalisation in a uniform fashion. Although still research prototypes, the CDS Tools had interaction efficiency and usability ratings comparable to those of Semantic MediaWiki (SMW). Using CDS Tools, users produced significantly more non-trivial triples than with SMW. The created relation and concept hierarchies can be re-used in a semantic desktop.

Acknowledgements

I thank my professor, Prof. Dr. Rudi Studer, for letting me pursue what I wanted to pursue and for taking care of my environment. I thank the whole research group, which has a fascinating attitude of open criticism and honest feedback. Of course, I also thank Prof. Dr. Klaus Tochtermann for his valuable feedback.
I thank Dr. Andreas Abecker for a prompt and valuable review of the complete thesis.

I thank (in order of appearance) Alexander Grossoul, Mike Sibler, Sebastian Gerke, Benjamin Heitmann, Markus Göbel, Andreas Kurz, Sebastian Döweling, and Mustafa Yilmaz, who helped to research and develop various parts of this work with their student research projects (Studienarbeiten) and diploma theses (Diplomarbeiten). I additionally thank (in order of appearance) the students who helped to develop the CDS code base, especially Werner Thiemann, Daniel Clemente, Konrad Völkel, Andreas Kreidler, Björn Kaidel, and Daniel Scharrer. Furthermore, I thank the students who helped to evaluate the final CDS tools.

I would also like to thank the many people who reviewed this thesis at various stages. All mistakes that are left are my fault, not theirs. I thank Jens Wissmann and Peter Wolf for reviewing chapter 4. I thank Konrad Völkel for reviewing chapters 1, 3 and 4. I thank Mark Hefke for providing valuable feedback on the economic analysis. I thank David Elsweiler for checking the language and style of various parts of the thesis.

I thank Heiko Haller for many enthusiastic day-long discussions about the obscurest details of the CDS model. His constant nagging questions helped to shape the CDS model and API. I thank Stephan Bloehdorn and Denny Vrandecic for many critical discussions at our favourite Sushi place. And I thank Anke and my parents for being patient with me during the long times when I was sitting at my desk.

Part of this work has been funded by the European Commission in the context of the IST Integrated Project NEPOMUK: The Social Semantic Desktop [1], FP6-027705. Another part of this work has been done in “WAVES - Wissensaustausch bei der verteilten Entwicklung von Software” [2], funded by BMBF, Germany. Yet another part of this work was supported by the European Commission under contract FP6-507482 in the project Knowledge Web [3].
The expressed content is the view of the author but not necessarily the view of any sponsor.

[1] http://nepomuk.semanticdesktop.org/ (accessed 06.01.2010)
[2] http://waves.fzi.de/ (accessed 06.01.2010)
[3] http://knowledgeweb.semanticweb.org (accessed 06.01.2010)

    Presumably man’s spirit should be elevated if he can better review his shady past and analyze more completely and objectively his present problems. He has built a civilization so complex that he needs to mechanize his records more fully if he is to push his experiment to its logical conclusion and not merely become bogged down part way there by overtaxing his limited memory.

    Vannevar Bush (1945, p. 108)

Contents

1 Introduction
  1.1 Readers guide
  1.2 Motivation
    1.2.1 Focus on the individual knowledge worker
    1.2.2 Limits of the individual
    1.2.3 External representations
    1.2.4 Automating symbol manipulation
    1.2.5 Economic considerations
    1.2.6 Summary
  1.3 Research questions and contributions
  1.4 Solution overview

2 Foundations
  2.1 Modelling, models and meta-models
  2.2 Documents
  2.3 Desktop operating system and the file metaphor
  2.4 Hypertext and the World Wide Web
  2.5 Software engineering
  2.6 Semantic technologies
  2.7 Note-taking
  2.8 Personal Information Management
  2.9 Wikis
  2.10 Mind- and Concept Maps
  2.11 Tagging and Web 2.0
  2.12 Knowledge acquisition
  2.13 Human-computer interaction

3 Analysis and Requirements
  3.1 Use cases in PKM
  3.2 Processes in PKM
    3.2.1 Existing process models
    3.2.2 Knowledge cue life-cycle
  3.3 Economic analysis
    3.3.1 Costs and benefits without tools
    3.3.2 Costs and benefits with tools
    3.3.3 Detailed analysis
    3.3.4 Summary and conclusions
  3.4 Requirements for Knowledge Models from literature
    3.4.1 Interaction for codify and augment process
    3.4.2 Interaction for retrieval process
    3.4.3 Expressivity
  3.5 Analysis of relations from conceptual models of PKM-tools
    3.5.1 Documents
    3.5.2 Hypertext
    3.5.3 File explorer
    3.5.4 Data structures
    3.5.5 Mind- and Concept Maps
    3.5.6 Collaborative information organisation tools
    3.5.7 Summary of common relations
  3.6 Knowledge representation
    3.6.1 Data exchange formats
    3.6.2 Ontology and schema languages
    3.6.3 Common relations
  3.7 Requirements summary
  3.8 Conclusions

4 Conceptual Data Structures
  4.1 CDS data model
    4.1.1 Informal description and design rationale
    4.1.2 Formal definition
    4.1.3 Queries
    4.1.4 Operations
  4.2 CDS relation ontology
    4.2.1 Informal description
    4.2.2 Formal definition
  4.3 Syntax