Personal Knowledge Models with Semantic Technologies
Total Page:16
File Type:pdf, Size:1020Kb
Max Völkel Personal Knowledge Models with Semantic Technologies Personal Knowledge Models with Semantic Technologies Max Völkel 2 Bibliografische Information Detaillierte bibliografische Daten sind im Internet über http://pkm. xam.de abrufbar. Covergestaltung: Stefanie Miller Herstellung und Verlag: Books on Demand GmbH, Norderstedt c 2010 Max Völkel, Ritterstr. 6, 76133 Karlsruhe This work is licensed under the Creative Commons Attribution- ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Fran- cisco, California, 94105, USA. Zur Erlangung des akademischen Grades eines Doktors der Wirtschaftswis- senschaften (Dr. rer. pol.) von der Fakultät für Wirtschaftswissenschaften des Karlsruher Instituts für Technologie (KIT) genehmigte Dissertation von Dipl.-Inform. Max Völkel. Tag der mündlichen Prüfung: 14. Juli 2010 Referent: Prof. Dr. Rudi Studer Koreferent: Prof. Dr. Klaus Tochtermann Prüfer: Prof. Dr. Gerhard Satzger Vorsitzende der Prüfungskommission: Prof. Dr. Christine Harbring Abstract Following the ideas of Vannevar Bush (1945) and Douglas Engelbart (1963), this thesis explores how computers can help humans to be more intelligent. More precisely, the idea is to reduce limitations of cognitive processes with the help of knowledge cues, which are external reminders about previously experienced internal knowledge. A knowledge cue is any kind of symbol, pattern or artefact, created with the intent to be used by its creator, to re- evoke a previously experienced mental state, when used. The main processes in creating, managing and using knowledge cues are analysed. Based on the resulting knowledge cue life-cycle, an economic analysis of costs and benefits in Personal Knowledge Management (PKM) processes is performed. The main result of this thesis is a meta-model for representing knowledge cues, which is called Conceptual Data Structures (CDS). It consists of three parts: (1) A simple, expressive data-model; (2) A small relation ontology which unifies the relations found in the cognitive models of tools used for PKM tasks, e. g., documents, mind-maps, hypertext, or semantic wikis. (3) An interchange format for structured text together with corresponding wiki syntax. These three parts together allow representing knowledge cues in varying degrees of granularity (number and size of items), structuredness (relations between items), and formality (fraction of items typed with items from a meta-model) in a unified way. The CDS model has been implemented in Java. Based on this reference implementation several tools for personal knowledge management have been created (one by the author of this work and two external tools). All three tools have been used in a comparative evaluation with 125 person-hours. In the evaluation, the Conceptual Data Structures (CDS) data model has successfully been used to represent and use (retrieve) artefacts in a uni- form fashion that are in different degrees of formalisation. Although still research prototypes, the CDS Tools had interaction efficiency and usability ratings compared to Semantic MediaWiki (SMW). Using CDS Tools, users produced significantly more non-trivial triples than with SMW. The created Relation and concept hierarchies can be re-used in a semantic desktop. 4 Acknowledgements I thank my professor, Prof. Dr. Rudi Studer, for letting me pursue what I wanted to pursue, taking care of my environment. I thank the whole research group which has a fascinating attitude of open criticism and honest feedback. Of course, I also thank Prof. Dr. Klaus Tochtermann for his valuable feedback. I thank Dr. Andreas Abecker for a prompt and valuable review of the complete thesis. I thank (in order of appearance) Alexander Grossoul, Mike Sibler, Sebas- tian Gerke, Benjamin Heitmann, Markus Göbel, Andreas Kurz, Sebastian Döweling, and Mustafa Yilmaz who helped to research and develop various parts of this work with their student research projects (Studienarbeiten) and diploma thesis’ (Diplomarbeiten). I thank additionally (in order of appearance) the students that helped to develop the CDS code base, espe- cially Werner Thiemann, Daniel Clemente, Konrad Völkel, Andreas Krei- dler, Björn Kaidel, and Daniel Scharrer. Furthermore, I thank the students that helped to evaluate the final CDS tools. I would also like to thank a lot of people for reviewing this thesis in various stages. All mistakes that are left are my fault, not theirs. I thank Jens Wissmann and Peter Wolf for reviewing chapter 4. I thank Konrad Völkel for reviewing chapters 1, 3 and 4. I thank Mark Hefke for providing valuable feedback on the economic analysis. I thank David Elsweiler for language and style checking various parts of the thesis. I thank Heiko Haller for many enthusiastic day-long discussions about the obscurest details of the CDS model. His constant nagging questions helped to shape the CDS model and API. I thank Stephan Bloehdorn and Denny Vrandecic for many critical dis- cussions at our favourite Sushi place. And I thank Anke and my parents for being patient with me in the long times where I was sitting at my desk. Part of this work has been funded by the European Commission in the context of the IST Integrated Project NEPOMUK1 – The Social Semantic Desktop, FP6-027705. Another part of this work has been done in “WAVES - Wissensaustausch bei der verteilten Entwicklung von Software”2, funded by BMBF, Germany. Yet another part of this work was supported by the European Commission under contract FP6-507482 in the project Knowledge Web3. The expressed content is the view of the author but not necessarily the view of any sponsor. 1http://nepomuk.semanticdesktop.org/ (accessed 06.01.2010) 2http://waves.fzi.de/ (accessed 06.01.2010) 3http://knowledgeweb.semanticweb.org (accessed 06.01.2010) 5 Presumably man’s spirit should be elevated if he can better re- view his shady past and analyze more completely and objectively his present problems. He has built a civilization so complex that he needs to mechanize his records more fully if he is to push his experiment to its logical conclusion and not merely become bogged down part way there by overtaxing his limited memory. Vannevar Bush (1945, p. 108) Contents 1 Introduction 9 1.1Readersguide........................... 9 1.2Motivation............................ 12 1.2.1 Focusontheindividualknowledgeworker....... 13 1.2.2 Limitsoftheindividual................. 18 1.2.3 External representations ................. 20 1.2.4 Automatingsymbolmanipulation............ 23 1.2.5 Economicconsiderations................. 28 1.2.6 Summary......................... 30 1.3 Research questions and contributions . ........... 30 1.4Solutionoverview......................... 32 2 Foundations 35 2.1 Modelling, models and meta-models .............. 36 2.2Documents............................ 39 2.3Desktopoperatingsystemandthefilemetaphor....... 42 2.4HypertextandtheWorldWideWeb.............. 42 2.5Softwareengineering....................... 45 2.6Semantictechnologies...................... 46 2.7Note-taking............................ 51 2.8PersonalInformationManagement............... 53 2.9 Wikis ............................... 54 2.10Mind-andConceptMaps.................... 56 2.11 Tagging and Web 2.0 ....................... 57 2.12Knowledgeacquisition...................... 59 2.13Human-computerinteraction.................. 60 3 Analysis and Requirements 61 3.1UsecasesinPKM........................ 61 3.2ProcessesinPKM........................ 66 3.2.1 Existing process models ................. 66 3.2.2 Knowledgecuelife-cycle................. 69 3.3Economicanalysis........................ 78 3.3.1 Costsandbenefitswithouttools............ 80 3.3.2 Costsandbenefitswithtools.............. 81 3.3.3 Detailedanalysis..................... 84 3.3.4 Summaryandconclusions................ 90 3.4RequirementsforKnowledgeModelsfromliterature..... 93 Contents 7 3.4.1 Interaction for codify and augment process ...... 93 3.4.2 Interaction for retrieval process .............103 3.4.3 Expressivity ........................104 3.5 Analysis of relations from conceptual models of PKM-tools . 105 3.5.1 Documents........................109 3.5.2 Hypertext.........................109 3.5.3 File explorer .......................111 3.5.4 Datastructures......................112 3.5.5 Mind-andConceptMaps................113 3.5.6 Collaborativeinformationorganisationtools......114 3.5.7 Summaryofcommonrelations.............114 3.6 Knowledge representation ....................115 3.6.1 Dataexchangeformats..................115 3.6.2 Ontologyandschemalanguages.............116 3.6.3 Commonrelations....................119 3.7Requirementssummary.....................120 3.8Conclusions............................120 4 Conceptual Data Structures 123 4.1CDSdatamodel.........................124 4.1.1 Informal description and design rationale .......125 4.1.2 Formaldefinition.....................133 4.1.3 Queries..........................138 4.1.4 Operations........................140 4.2CDSrelationontology......................143 4.2.1 Informal description ...................144 4.2.2 Formaldefinition.....................148 4.3Syntaxandstructuredtext...................152 4.3.1 Structuredtextinterchangeformat(STIF)......153 4.3.2 Fromsyntaxtostructuredtext.............156 4.3.3 FromstructuredtexttoCDS..............157 4.3.4 Summary.........................162 4.4UsingCDS............................163 4.4.1 Authoringandstepwiseformalisation..........163