Adaptive Hypertext. The Shattered Document Approach
Mário Amado Alves
Departamento de Ciência de Computadores da Faculdade de Ciências da Universidade do Porto
2013-03-13

Thesis submitted to the Faculdade de Ciências da Universidade do Porto for the degree of Doctor in Computer Science (Doutor em Ciência de Computadores).

To the memory of my parents Irene and Marius.

Abstract

We study how adaptive hypertext may improve the utilization of large online documents. We put forth the interrelated concepts of shattered documents and renoding: splitting a document into components smaller than the page, called noogramicles, and creating each page as a new assemblage of noogramicles each time it is accessed. The adaptation comes from learning the navigation patterns of the users (authors and readers), and is manifested in the assemblage of pages. Another essential trait of our work is the use of user simulation for testing our hypotheses. We created software simulators and conducted experiments with them to compare several adaptive and non-adaptive configurations. Yet another important aspect of our work is the study and adoption of the technique of spreading activation to explore the network database of the learnt model of travels. We carried out a quantitative evaluation based on utilization quality measures adapted to the problem: session size and session cost.

Resumo

We study how adaptive hypertext can improve the use of large online documents. We present the interrelated concepts of the shattered document and renoding (re-creation of nodes): splitting the document into components smaller than the page, called noogramicles, and creating each page as a new assemblage of noogramicles each time it is accessed, thereby creating new nodes in the hypertext network as use progresses. The adaptation comes from learning the navigation patterns of the users (authors and readers) and is manifested in the assemblage of the pages. Another essential trait of our work is the use of simulation to test our hypotheses. We created software simulators and ran experiments with them to compare several adaptive and non-adaptive configurations. Yet another important aspect of this work is the study and adoption of the technique of spreading activation to explore the network database of the model of learnt travels. We carried out a quantitative evaluation based on utilization quality measures adapted to the problem: session size, session cost.

Résumé

We study how adaptive hypertext can improve the use of large online documents. We present the interrelated concepts of the shattered document and renoding (re-creation of nodes): the separation of the document into components smaller than the page, called noogramicles, and the creation of each page as a new assemblage of noogramicles each time it is accessed, thus creating new nodes in the hypertext network as use progresses. The adaptation comes from learning the navigation patterns of the users (authors and readers) and shows itself in the assemblage of the pages. Another essential trait of our work is the recourse to simulation in order to test our hypotheses. We created software simulators and ran experiments with them to compare several adaptive and non-adaptive configurations. Another important aspect of this work is the study and adoption of the technique of spreading activation to exploit the network database of the model of learnt travels. We carried out a quantitative evaluation supported by utilization quality measures adapted to the problem: session size, session cost.

Acknowledgements

I love deadlines. I like the whooshing sound they make as they fly by.
Douglas Adams

I am extremely indebted to my advisor Doctor Alípio Jorge and co-advisor Doctor Zé Paulo Leal. Alípio is the wisest person I know. Every piece of his abundant advice was entirely pertinent and apt. Had I followed it all, this thesis would have been a work of absolute perfection. Alípio's support of my work stood unabated through my unending stream of missed deadlines. I was as surprised as I was thankful for this continued support. I was surprised because Alípio is the wisest person I know, and I would expect even a mildly clever person to spot a clearly lost cause and hastily detach themselves from the dead weight. Paradoxically, the fact that you are now reading these lines on an acceptable if imperfect thesis is a result of Alípio being the wisest person I know.

I am indebted to Professor Pavel Brazdil for being the excellent leader of LIAAD - INESC Porto (formerly NIAAD - LIAAC), the laboratory where this whole business started.

I am indebted to Rodolfo Matos, the prolific sysadmin of LIAAD, for his prompt support and enduring friendship.

I am grateful to all NIAAD and DCC members for their support and comradeship.

I am indebted to the Fundação para a Ciência e Tecnologia for supporting four years of doctoral research.

I am indebted to Ada Europe, APPIA, Prolearn, and Universidade Aberta for conference-going support.
I am grateful to Cecília.

With the grace of God.

Contents

Abstract
Resumo
Résumé
Acknowledgements
List of Tables
List of Figures
1 Introduction
  1.1 Context
    1.1.1 Rationale for hypertextualization
    1.1.2 Limitations of hypertext
  1.2 Solutions
    1.2.1 Our solution
    1.2.2 Hypotheses of this work
  1.3 Main contributions
  1.4 Structure of the text
    1.4.1 Notation
2 Hypertext
  2.1 Definition
  2.2 Terminology
  2.3 Forms of hypertext
    2.3.1 The words used
    2.3.2 Items vs. connections
    2.3.3 The Monographic Principle
    2.3.4 Summary of hypertext history
    2.3.5 The document dogma
    2.3.6 Information search: the impossible that is done
    2.3.7 Aporias of adaptation
    2.3.8 Minor issues
  2.4 Structure of documents
    2.4.1 Traditional document structure
    2.4.2 Standard hypertextualization
  2.5 Learning systems
    2.5.1 Adaptive hypertext techniques
    2.5.2 Learning Systems highlights
  2.6 Summary
3 A new model for adaptive hypertext
  3.1 Motivation
    3.1.1 Information, not documents
    3.1.2 Guidelines for adaptive hypertext
  3.2 Model design
    3.2.1 The Shattered Documents model
    3.2.2 Adaptive information, and author as first reader
    3.2.3 Interface design
    3.2.4 Detailed design with a network data model
  3.3 Techniques and tools reused
    3.3.1 A unified model of spreading activation
    3.3.2 A didactical example
    3.3.3 Benefits of spreading activation for information retrieval
    3.3.4 The generic model
    3.3.5 About the implementation
    3.3.6 Leaky Capacitor Model (LCM)
    3.3.7 Reverberative Circles (RC)
    3.3.8 Waterline
  3.4 Algorithms
    3.4.1 Overview
    3.4.2 Formalization
    3.4.3 Start page algorithms
    3.4.4 Recentring algorithms
    3.4.5 Learning algorithms
  3.5 Summary
4 Experimental methodology
  4.1 Simulation
    4.1.1 Formalization
  4.2 Experiments
  4.3 Parameter settings
    4.3.1 The document
    4.3.2 One thousand nodes
  4.4 Evaluation methodology and measures
    4.4.1 Session size
    4.4.2 Session cost
  4.5 Statistics
    4.5.1 Common and bottom line statistics
    4.5.2 Statistics in the Outcomes tables
    4.5.3 Statistics in the Results tables
    4.5.4 Statistics in the Evolution tables
    4.5.5 Alternate terms
  4.6 Summary
5 Results
  5.1 Main results
    5.1.1 Session size and success rate
    5.1.2 Evolution
  5.2 Testing link types
  5.3 Testing adaptative techniques
  5.4 Summary
6 Conclusions
  6.1 Main conclusion
  6.2 The pros and cons of simulating
  6.3 Paths not taken
  6.4 Future work
A Electronic archive
B Program listings
  B.1 Package Arm05_Model (spec)
  B.2 Package Arm05_Model (body)
  B.3 Procedure Arm05_Model.Get_Info (body only)
  B.4 Package Kasim2 (spec)
  B.5 Package Kasim2 (body)
  B.6 Package Kasim2.Activation (spec)
  B.7 Package Kasim2.Activation (body)
  B.8 Package Kasim2.Comparate (spec)
  B.9 Package Kasim2.Comparate (body)
  B.10 Procedure Kasim2.Comparate.Experiment (body only)
  B.11 Package Kasim2.Markov (spec)
  B.12 Package Kasim2.Markov (body)
  B.13 Package Kasim2.Markov_With_Heuristics (body)
  B.14 Package Kasim2.Markov_With_Heuristics (body)
  B.15 Package Kasim2.Structural (spec)