Computing with Meaning by Operationalising Socio-Cognitive Semantics
Robert McArthur
BSc(Hons), GradCertEd(HigherEd), MInfTech

2007

SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
IN THE SCHOOL OF INFORMATION SYSTEMS
QUEENSLAND UNIVERSITY OF TECHNOLOGY

Abstract

This thesis is motivated by the desire to provide technological solutions to enhance human awareness in information processing tasks. The need is pressing. Paradoxically, as information piles up, people become less and less aware due to a perceived scarcity of cognitive resources. As a consequence, specialisations become ever more specialised, and projects and individuals in organisations become ever more insular. Technology can enhance awareness by informing the individual about what is happening outside their speciality. Systems which can assist people in these ways need to make sense of human communication. The computer system must know what it is processing; it must follow a socio-cognitive framework and reason with it. It must compute with meanings, not symbolic surface structures.

The hypothesis of the thesis is that knowledge potentially useful for enhancing awareness can be derived from interactions between people using computational models based on socio-cognitive semantics. The goals are to determine whether an appreciable approximation of conceptual spaces can be realised through semantic spaces, and whether such semantic spaces can develop representations of meaning which have the potential to enhance the awareness of users. The two thesis questions are how well the socio-cognitive framework of Gärdenfors can be brought into operational reality and, if a bridge can be made, what practical issues are involved.

The theory of conceptual spaces of Peter Gärdenfors is combined with methods from cognitive science for creating geometric spaces to represent meaning. Hyperspace Analogue to Language and Latent Semantic Analysis are used as exemplars of the cognitive science algorithms. The algorithms are modified by a variety of syntactic processing schemes to overcome a paucity of data and hence a lack of expressivity in representations of meaning: part-of-speech tagging, index expressions and anaphora resolution are effected and incorporated into the semantic space.

The practical element of the thesis consists of five case studies. These are developed in two parts: studies describing how meaning changes and evolves in semantic spaces, and studies describing semantic space applications featuring knowledge discovery. These studies are in a variety of domains with a variety of data: online communities of interest using a mailing list, a health-based mailing list, organisational blogs, “hallway chatter”, and organisational email. The data consist of real-world utterances that provide the situational factors that cognitive systems need to answer queries and provide context. The amounts of data are significantly less than previously used by semantic space methods, hence the need for syntactic assistance. The particular problems examined in the case studies are corporate expertise management, social network discovery, tracking ebbs and flows of topics, and noticing the change in a person’s sense-of-self over time. These are significantly different to those usually examined using semantic spaces.

The key differentiator of this work stems from its focus on the geometrically-based computational realisation of meaning. This thesis takes semantic spaces out of the closet and into real-world information technology applications, with a roadtest in real life.

Publications

These publications have direct relevance to this thesis. All are significantly or wholly represented in at least one chapter, especially those describing case studies. In respect of this thesis, first authorship indicates a contribution of at least 60%, although I am happy to say they are a result of a true collaboration in ideas and penmanship.
• McArthur, R., Bruza, P., Kralik, D. and Warren, J. Projecting computational sense of self: A study of transition in a chronic illness online community. In Proceedings of the 39th Hawaii International Conference on System Sciences, IEEE Computer Society Press, January 2006. [106]

• McArthur, R., Bruza, P. and Song, D. Policy Conformance in the Corporate Blog Space. In the peer-reviewed Workshop on Policy Management on the Web, WWW Conference. 2005. Tokyo, Japan. [116]

• McArthur, R. and Bruza, P. Discovery of Implicit and Explicit Connections between People using Email Utterance. In Proceedings of the Eighth European Conference on Computer-Supported Cooperative Work (ECSCW). 2003. Helsinki, Finland: Kluwer. [111]

• McArthur, R. and Bruza, P. Discovery of Tacit Knowledge and Topical Ebbs and Flows within the Utterances of Online Community, in Chance Discovery, Ohsawa, Y. and McBurney, P. (Eds). 2003, Springer. pp. 115-131. [112]

• McArthur, R. and Bruza, P. Dimensional Representations of Knowledge in Online Community, in Chance Discovery, Ohsawa, Y. and McBurney, P. (Eds). 2003, Springer. pp. 98-112. [110]

• McArthur, R. and Bruza, P. Finding Tacit Knowledge in Online Communities, in the peer-reviewed Workshop on Knowledge Sharing under Distributed Circumstance, Communities and Technologies Conference, Verburg, R. and Ridder, J.A.D. (Eds). 2003, NWO-MES: Amsterdam. [114]

• McArthur, R. and Bruza, P. Position paper on finding social networks using tacit and explicit knowledge, presented to the peer-reviewed Workshop on Social Network Analysis, European Conference on Computer-Supported Cooperative Work (ECSCW), 2003. [113]

• McArthur, R. and Bruza, P. Position paper on defining online community, accepted to the peer-reviewed Second ECSCW Workshop on Community Knowledge, European Conference on Computer-Supported Cooperative Work (ECSCW), 2001. This paper was not physically presented due to travel restrictions following the “911” attacks. [108]

• McArthur, R. and Bruza, P.D. Finding tacit knowledge in online communities, in Proceedings of the Eighth Australasian Document Computing Symposium (ADCS’03), 2003. [115]

• McArthur, R. and Bruza, P.D. Knowledge representation in online community, in Proceedings of the Sixth Australasian Document Computing Symposium (ADCS’01), 2001. [107]

• McArthur, R. and Bruza, P.D. The ABC’s of Online Community, in Proceedings of the First Asia Pacific conference on Web Intelligence: Research and Development, 2001. Published as Lecture Notes in Artificial Intelligence, v2198, Springer, Berlin, pp. 141-147. [109]

The following paper was one in which I had peripheral involvement but which still involves ideas developed in and through those represented in this thesis:

• Song, D., Bruza, P., McArthur, R. and Mansfield, T., Enabling Management Oversight in Corporate Blog Space. To be published in the Proceedings of the AAAI-2006 Spring Symposium on Computational Approaches to Analyzing Weblogs, Stanford University, California, USA, March 27-29, 2006. [149]

Table of Contents

1 Introduction
  1.1 The problem
  1.2 Relationship to other areas
  1.3 The solution cauldron
  1.4 Thesis questions and contributions
  1.5 Methods
  1.6 Format of the thesis
2 Socio-cognitive semantic representation theory
  2.1 “Socio-cognitive”
  2.2 Gärdenfors’ Socio-cognitive basis of meaning
  2.3 Other approaches to cognitive semantic representation
3 Socio-cognitive semantic representation practice
  3.1 Practice
    3.1.1 Representation
    3.1.2 Pre-semantic context
    3.1.3 Associational inference
  3.2 From practice to practicality
    3.2.1 Software
    3.2.2 Intellectual property
4 Dynamics of Meaning
  4.1 Utterance
    4.1.1 Communication artefacts
    4.1.2 Community
  4.2 The Dynamics of Meaning due to time
    4.2.1 The Harry Proa case study
    4.2.2 The “Sense-of-self” case study
  4.3 The Dynamics of Meaning due to context
    4.3.1 The Sun blogs case study
5 Knowledge Discovery Applications
  5.1 Quantitative approaches to semantic knowledge representation
  5.2 The How’s it going with John? case study
    5.2.1 Data set
    5.2.2 Methods
    5.2.3 Results
    5.2.4 Discussion
    5.2.5 Conclusion
  5.3 Social Network Discovery case study
    5.3.1 Background
    5.3.2 Scenario
    5.3.3 Data set
    5.3.4 Methods
    5.3.5 Results
    5.3.6 Discussion
    5.3.7 Conclusion
6 General Discussion and Conclusion
  6.1 Thesis questions addressed
    6.1.1 How well can a bridge from the socio-cognitive framework of Gärdenfors to an operational reality be made?
    6.1.2 If a bridge can be made, what are the practical issues involved?
  6.2 Claimed contributions
  6.3 Conclusions
A Practical Lessons Learnt
  A.1 Lessons for practitioners – methods
  A.2 Lessons for practitioners – domains and problems
  A.3 Lessons for practitioners – theories
B Transition in chronic illness
C Case Study Feature Matrix
D Connecting theory and practice
Index
Author Index

List of Figures

1.1 Relationship to other academic areas
1.2 The high level practical workflow
1.3 The schematic of the thesis chapters
2.1 Gärdenfors’ three level model
3.1 Example of the use of index expressions in HAL+
3.2 Illustration of singular value decomposition from 3D down to 2D
3.3 Example of an index expression
3.4 Example of differences in the information flow and cosine metrics
3.5 Difference between Euclidean and non-Euclidean distance metrics
4.1 Visualisation of a definition of ‘online community’
4.2 Visualisation of movement of sense-of-self against two axes of interest
4.3 Number of blog entries.
4.4 Number of blog entries mentioning “dtrace” for each day.
4.5 Expertise around “dtrace”
4.6 Expertise around “golf”
5.1 Nonaka and Takeuchi’s SECI model.