Embedding a Semantic Network in a Word Space

Total Page:16

File Type:pdf, Size:1020Kb

Embedding a Semantic Network in a Word Space Embedding a Semantic Network in a Word Space Richard Johansson and Luis Nieto Pina˜ Sprakbanken,˚ Department of Swedish, University of Gothenburg Box 200, SE-40530 Gothenburg, Sweden richard.johansson, luis.nieto.pina @svenska.gu.se { } Abstract while the latter has seen much interest lately, their respective strengths and weaknesses are still being We present a framework for using continuous- debated (Baroni et al., 2014; Levy and Goldberg, space vector representations of word meaning 2014). The most important relation defined in a vec- to derive new vectors representing the mean- ing of senses listed in a semantic network. It tor space between the meaning of two words is sim- is a post-processing approach that can be ap- ilarity: a mouse is something quite similar to a rat. plied to several types of word vector represen- Similarity of meaning is operationalized in terms of tations. It uses two ideas: first, that vectors for geometry, by defining a distance metric. polysemous words can be decomposed into Symbolic representations seem to have an advan- a convex combination of sense vectors; sec- ondly, that the vector for a sense is kept sim- tage in describing word sense ambiguity: when a ilar to those of its neighbors in the network. surface form corresponds to more than one concept. This leads to a constrained optimization prob- For instance, the word mouse can refer to a rodent lem, and we present an approximation for the or an electronic device. Vector-space representations case when the distance function is the squared typically represent surface forms only, which makes Euclidean. it hard to search e.g. for a group of words similar We applied this algorithm on a Swedish se- to the rodent sense of mouse or to reliably use the mantic network, and we evaluate the quality vectors in classifiers that rely on the semantics of of the resulting sense representations extrinsi- the word. There have been several attempts to create cally by showing that they give large improve- vectors representing senses, most of them based on ments when used in a classifier that creates lexical units for FrameNet frames. some variant of the idea first proposed by Schutze¨ (1998): that senses can be seen as clusters of similar contexts. Recent examples in this tradition include 1 Introduction the work by Huang et al. (2012) and Neelakantan et Representing word meaning computationally is cen- al. (2014). However, because sense distributions are tral in natural language processing. Manual, often highly imbalanced, it is not clear that context knowledge-based approaches to meaning represen- clusters can be reliably created for senses that occur tation maps word strings to symbolic concepts, rarely. These approaches also lack interpretability: which can be described using any knowledge rep- if we are interested in the rodent sense of mouse, resentation framework; using the relations between which of the vectors should we use? concepts defined in the knowledge base, we can in- In this work, we instead derive sense vectors by fer implicit facts from the information stated in a embedding the graph structure of a semantic net- text: a mouse is a rodent, so it has prominent teeth. work in the word space. By combining two com- Conversely, data-driven meaning representation plementary sources of information – corpus statis- approaches rely on cooccurrence patterns to derive tics and network structure – we derive useful vec- a vector representation (Turney and Pantel, 2010). tors also for concepts that occur rarely. The method, There are two classes of methods that compute word which can be applied to context-counting as well vectors: context-counting and context-predicting; as context-predicting spaces, works by decompos- 1428 Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the ACL, pages 1428–1433, Denver, Colorado, May 31 – June 5, 2015. c 2015 Association for Computational Linguistics ing word vectors as linear combinations of sense variables correspond to the occurrence probabilities vectors, and by pushing the sense vectors towards of the senses, but strictly speaking this is only the their neighbors in the semantic network. This in- case when the vectors are built using simple context tuition leads to a constrained optimization problem, counting. Since the mix gives an estimate of which for which we present an approximate algorithm. sense is the most frequent in the corpus, we get a We applied the algorithm to derive vectors for strong baseline for word sense disambiguation (Mc- the senses in a Swedish semantic network, and we Carthy et al., 2007) as a bonus; see our followup evaluated their quality extrinsically by using them paper (Johansson and Nieto Pina,˜ 2015) for a dis- as features in a semantic classification task – map- cussion of this. ping senses to their corresponding FrameNet frames. We can now formalize the intuition above: the When using the sense vectors in this task, we saw a weighted sum of distances between each sense and large improvement over using word vectors. its neighbors is minimized, while satisfying the mix constraint for each lemma. We get the following 2 Embedding a Semantic Network constrained optimization program: The goal of the algorithm is to embed the seman- minimize wijk∆(E(sij),E(nijk)) E,p tic network in a geometric space: that is, to asso- i,j,k ciate each sense sij with a sense embedding, a vec- X subject to pijE(sij) = F (li) i tor E(sij) of real numbers, in a way that reflects the ∀ topology of the semantic network but also that the Xj (1) vectors representing the lemmas are related to those p = 1 i ij ∀ corresponding to the senses. We now formalize this Xj intuition, and we start by introducing some notation. p 0 i, j ij ≥ ∀ For each lemma li, there is a set of possible senses si1, . , simi for which li is a surface realization. The mix constraints make sure that the solution Furthermore, for each sense sij, there is a neighbor- is nontrivial. In particular, a very large number hood consisting of senses semantically related to sij. of words are monosemous, and the procedure will Each neighbor nijk of sij is associated with a weight leave the embeddings of these words unchanged. wijk representing the degree of semantic relatedness 2.1 An Approximate Algorithm between sij and nijk. How we define the neighbor- hood, i.e. our notion of semantical relatedness, will The difficulty of solving the problem stated in Equa- obviously have an impact on the result. In this work, tion (1) obviously depends on the distance function we simply assume that it can be computed from the ∆. Henceforth, we focus on the case where ∆ is network, e.g. by picking a number of hypernyms the squared Euclidean distance. This is an impor- and hyponyms in a lexicon such as WordNet. We tant special case that is related to a number of other then assume that for each lemma li, we have a D- distances or similarities, e.g. cosine similarity and dimensional vector F (li) of real numbers; this can Hellinger distance. In this case, (1) is a quadratically be computed using any method described in Section constrained quadratic problem, which is NP-hard in 1. Finally, we assume a distance function ∆(x, y) general and difficult to handle with off-the-shelf op- that returns a non-negative real number for each pair timization tools. We therefore resort to an approx- of vectors in RD. imation; we show empirically in Sections 3 and 4 The algorithm maps each sense sij to a sense em- that it works well in practice. bedding, a real-valued vector E(sij) in the same The approximate algorithm works in an online vector space as the lemma embeddings. The lemma fashion by considering one lemma at a time. It ad- and sense embeddings are related through a mix con- justs the embeddings of the senses as well as their straint: F (li) is decomposed as a convex combi- mix in order to minimize the loss function nation p E(s ), where the p are picked 2 j ij ij { ij} Li = wijk E(sij) E(nijk) . (2) from the probability simplex. Intuitively, the mix k − k P Xjk 1429 The embeddings of the neighbors nijk of the sense 3 Application to Swedish Data are kept fixed at each such step. We iterate through The algorithm described in Section 2 was applied to the whole set of lemmas for a fixed number of Swedish data: we started with lemma embeddings epochs or until the objective is unchanged. computed from a corpus, and then created sense em- Furthermore, instead of directly optimizing with beddings by using the SALDO semantic network respect to the sense embeddings (which involves (Borin et al., 2013). The algorithm was run for a mi D scalars), the sense embeddings (and there- · few epochs, which seemed to be enough for reach- fore also the loss Li) can be computed analytically if ing a plateau in performance; the total runtime of the the mix variables pi1, . , pim are given, so we have i algorithm was a few minutes. reduced the optimization problem to one involving m 1 scalars, i.e. it is univariate in most cases. i − 3.1 Creating Lemma Embeddings Given a sense sij of a lemma li, we define the We created a corpus of 1 billion words downloaded weighted centroid of the set of neighbors of sij as from Sprakbanken,˚ the Swedish language bank.1 k wijkE(nijk) The corpora are distributed in a format where the cij = . (3) text has been tokenized, part-of-speech-tagged and k wijk P lemmatized.
Recommended publications
  • Probabilistic Topic Modelling with Semantic Graph
    Probabilistic Topic Modelling with Semantic Graph B Long Chen( ), Joemon M. Jose, Haitao Yu, Fajie Yuan, and Huaizhi Zhang School of Computing Science, University of Glasgow, Sir Alwyns Building, Glasgow, UK [email protected] Abstract. In this paper we propose a novel framework, topic model with semantic graph (TMSG), which couples topic model with the rich knowledge from DBpedia. To begin with, we extract the disambiguated entities from the document collection using a document entity linking system, i.e., DBpedia Spotlight, from which two types of entity graphs are created from DBpedia to capture local and global contextual knowl- edge, respectively. Given the semantic graph representation of the docu- ments, we propagate the inherent topic-document distribution with the disambiguated entities of the semantic graphs. Experiments conducted on two real-world datasets show that TMSG can significantly outperform the state-of-the-art techniques, namely, author-topic Model (ATM) and topic model with biased propagation (TMBP). Keywords: Topic model · Semantic graph · DBpedia 1 Introduction Topic models, such as Probabilistic Latent Semantic Analysis (PLSA) [7]and Latent Dirichlet Analysis (LDA) [2], have been remarkably successful in ana- lyzing textual content. Specifically, each document in a document collection is represented as random mixtures over latent topics, where each topic is character- ized by a distribution over words. Such a paradigm is widely applied in various areas of text mining. In view of the fact that the information used by these mod- els are limited to document collection itself, some recent progress have been made on incorporating external resources, such as time [8], geographic location [12], and authorship [15], into topic models.
    [Show full text]
  • Semantic Memory: a Review of Methods, Models, and Current Challenges
    Psychonomic Bulletin & Review https://doi.org/10.3758/s13423-020-01792-x Semantic memory: A review of methods, models, and current challenges Abhilasha A. Kumar1 # The Psychonomic Society, Inc. 2020 Abstract Adult semantic memory has been traditionally conceptualized as a relatively static memory system that consists of knowledge about the world, concepts, and symbols. Considerable work in the past few decades has challenged this static view of semantic memory, and instead proposed a more fluid and flexible system that is sensitive to context, task demands, and perceptual and sensorimotor information from the environment. This paper (1) reviews traditional and modern computational models of seman- tic memory, within the umbrella of network (free association-based), feature (property generation norms-based), and distribu- tional semantic (natural language corpora-based) models, (2) discusses the contribution of these models to important debates in the literature regarding knowledge representation (localist vs. distributed representations) and learning (error-free/Hebbian learning vs. error-driven/predictive learning), and (3) evaluates how modern computational models (neural network, retrieval- based, and topic models) are revisiting the traditional “static” conceptualization of semantic memory and tackling important challenges in semantic modeling such as addressing temporal, contextual, and attentional influences, as well as incorporating grounding and compositionality into semantic representations. The review also identifies new challenges
    [Show full text]
  • Knowledge Graphs on the Web – an Overview Arxiv:2003.00719V3 [Cs
    January 2020 Knowledge Graphs on the Web – an Overview Nicolas HEIST, Sven HERTLING, Daniel RINGLER, and Heiko PAULHEIM Data and Web Science Group, University of Mannheim, Germany Abstract. Knowledge Graphs are an emerging form of knowledge representation. While Google coined the term Knowledge Graph first and promoted it as a means to improve their search results, they are used in many applications today. In a knowl- edge graph, entities in the real world and/or a business domain (e.g., people, places, or events) are represented as nodes, which are connected by edges representing the relations between those entities. While companies such as Google, Microsoft, and Facebook have their own, non-public knowledge graphs, there is also a larger body of publicly available knowledge graphs, such as DBpedia or Wikidata. In this chap- ter, we provide an overview and comparison of those publicly available knowledge graphs, and give insights into their contents, size, coverage, and overlap. Keywords. Knowledge Graph, Linked Data, Semantic Web, Profiling 1. Introduction Knowledge Graphs are increasingly used as means to represent knowledge. Due to their versatile means of representation, they can be used to integrate different heterogeneous data sources, both within as well as across organizations. [8,9] Besides such domain-specific knowledge graphs which are typically developed for specific domains and/or use cases, there are also public, cross-domain knowledge graphs encoding common knowledge, such as DBpedia, Wikidata, or YAGO. [33] Such knowl- edge graphs may be used, e.g., for automatically enriching data with background knowl- arXiv:2003.00719v3 [cs.AI] 12 Mar 2020 edge to be used in knowledge-intensive downstream applications.
    [Show full text]
  • Large Semantic Network Manual Annotation 1 Introduction
    Large Semantic Network Manual Annotation V´aclav Nov´ak Institute of Formal and Applied Linguistics Charles University, Prague [email protected] Abstract This abstract describes a project aiming at manual annotation of the content of natural language utterances in a parallel text corpora. The formalism used in this project is MultiNet – Multilayered Ex- tended Semantic Network. The annotation should be incorporated into Prague Dependency Treebank as a new annotation layer. 1 Introduction A formal specification of the semantic content is the aim of numerous semantic approaches such as TIL [6], DRT [9], MultiNet [4], and others. As far as we can tell, there is no large “real life” text corpora manually annotated with such markup. The projects usually work only with automatically generated annotation, if any [1, 6, 3, 2]. We want to create a parallel Czech-English corpora of texts annotated with the corresponding semantic network. 1.1 Prague Dependency Treebank From the linguistic viewpoint there language resources such as Prague Dependency Treebank (PDT) which contain a deep manual analysis of texts [8]. PDT contains annotations of three layers, namely morpho- logical, analytical (shallow dependency syntax) and tectogrammatical (deep dependency syntax). The units of each annotation level are linked with corresponding units on the preceding level. The morpho- logical units are linked directly with the original text. The theoretical basis of the treebank lies in the Functional Gener- ative Description of language system [7]. PDT 2.0 is based on the long-standing Praguian linguistic tradi- tion, adapted for the current computational-linguistics research needs. The corpus itself is embedded into the latest annotation technology.
    [Show full text]
  • Universal Or Variation? Semantic Networks in English and Chinese
    Universal or variation? Semantic networks in English and Chinese Understanding the structures of semantic networks can provide great insights into lexico- semantic knowledge representation. Previous work reveals small-world structure in English, the structure that has the following properties: short average path lengths between words and strong local clustering, with a scale-free distribution in which most nodes have few connections while a small number of nodes have many connections1. However, it is not clear whether such semantic network properties hold across human languages. In this study, we investigate the universal structures and cross-linguistic variations by comparing the semantic networks in English and Chinese. Network description To construct the Chinese and the English semantic networks, we used Chinese Open Wordnet2,3 and English WordNet4. The two wordnets have different word forms in Chinese and English but common word meanings. Word meanings are connected not only to word forms, but also to other word meanings if they form relations such as hypernyms and meronyms (Figure 1). 1. Cross-linguistic comparisons Analysis The large-scale structures of the Chinese and the English networks were measured with two key network metrics, small-worldness5 and scale-free distribution6. Results The two networks have similar size and both exhibit small-worldness (Table 1). However, the small-worldness is much greater in the Chinese network (σ = 213.35) than in the English network (σ = 83.15); this difference is primarily due to the higher average clustering coefficient (ACC) of the Chinese network. The scale-free distributions are similar across the two networks, as indicated by ANCOVA, F (1, 48) = 0.84, p = .37.
    [Show full text]
  • Structure at Every Scale: a Semantic Network Account of the Similarities Between Very Unrelated Concepts
    De Deyne, S., Navarro D. J., Perfors, A. and Storms, G. (2016). Structure at every scale: A semantic network account of the similarities between very unrelated concepts. Journal of Experimental Psychology: General, 145, 1228-1254 https://doi.org/10.1037/xge0000192 Structure at every scale: A semantic network account of the similarities between unrelated concepts Simon De Deyne, Danielle J. Navarro, Amy Perfors University of Adelaide Gert Storms University of Leuven Word Count: 19586 Abstract Similarity plays an important role in organizing the semantic system. However, given that similarity cannot be defined on purely logical grounds, it is impor- tant to understand how people perceive similarities between different entities. Despite this, the vast majority of studies focus on measuring similarity between very closely related items. When considering concepts that are very weakly re- lated, little is known. In this paper we present four experiments showing that there are reliable and systematic patterns in how people evaluate the similari- ties between very dissimilar entities. We present a semantic network account of these similarities showing that a spreading activation mechanism defined over a word association network naturally makes correct predictions about weak sim- ilarities and the time taken to assess them, whereas, though simpler, models based on direct neighbors between word pairs derived using the same network cannot. Keywords: word associations, similarity, semantic networks, random walks. This work was supported by a research grant funded by the Research Foundation - Flanders (FWO), ARC grant DE140101749 awarded to the first author, and by the interdisciplinary research project IDO/07/002 awarded to Dirk Speelman, Dirk Geeraerts, and Gert Storms.
    [Show full text]
  • Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models
    Untangling Semantic Similarity: Modeling Lexical Processing Experiments with Distributional Semantic Models. Farhan Samir Barend Beekhuizen Suzanne Stevenson Department of Computer Science Department of Language Studies Department of Computer Science University of Toronto University of Toronto, Mississauga University of Toronto ([email protected]) ([email protected]) ([email protected]) Abstract DSMs thought to instantiate one but not the other kind of sim- Distributional semantic models (DSMs) are substantially var- ilarity have been found to explain different priming effects on ied in the types of semantic similarity that they output. Despite an aggregate level (as opposed to an item level). this high variance, the different types of similarity are often The underlying conception of semantic versus associative conflated as a monolithic concept in models of behavioural data. We apply the insight that word2vec’s representations similarity, however, has been disputed (Ettinger & Linzen, can be used for capturing both paradigmatic similarity (sub- 2016; McRae et al., 2012), as has one of the main ways in stitutability) and syntagmatic similarity (co-occurrence) to two which it is operationalized (Gunther¨ et al., 2016). In this pa- sets of experimental findings (semantic priming and the effect of semantic neighbourhood density) that have previously been per, we instead follow another distinction, based on distri- modeled with monolithic conceptions of DSM-based seman- butional properties (e.g. Schutze¨ & Pedersen, 1993), namely, tic similarity. Using paradigmatic and syntagmatic similarity that between syntagmatically related words (words that oc- based on word2vec, we show that for some tasks and types of items the two types of similarity play complementary ex- cur in each other’s near proximity, such as drink–coffee), planatory roles, whereas for others, only syntagmatic similar- and paradigmatically related words (words that can be sub- ity seems to matter.
    [Show full text]
  • Ontology-Based Approach to Semantically Enhanced Question Answering for Closed Domain: a Review
    information Review Ontology-Based Approach to Semantically Enhanced Question Answering for Closed Domain: A Review Ammar Arbaaeen 1,∗ and Asadullah Shah 2 1 Department of Computer Science, Faculty of Information and Communication Technology, International Islamic University Malaysia, Kuala Lumpur 53100, Malaysia 2 Faculty of Information and Communication Technology, International Islamic University Malaysia, Kuala Lumpur 53100, Malaysia; [email protected] * Correspondence: [email protected] Abstract: For many users of natural language processing (NLP), it can be challenging to obtain concise, accurate and precise answers to a question. Systems such as question answering (QA) enable users to ask questions and receive feedback in the form of quick answers to questions posed in natural language, rather than in the form of lists of documents delivered by search engines. This task is challenging and involves complex semantic annotation and knowledge representation. This study reviews the literature detailing ontology-based methods that semantically enhance QA for a closed domain, by presenting a literature review of the relevant studies published between 2000 and 2020. The review reports that 83 of the 124 papers considered acknowledge the QA approach, and recommend its development and evaluation using different methods. These methods are evaluated according to accuracy, precision, and recall. An ontological approach to semantically enhancing QA is found to be adopted in a limited way, as many of the studies reviewed concentrated instead on Citation: Arbaaeen, A.; Shah, A. NLP and information retrieval (IR) processing. While the majority of the studies reviewed focus on Ontology-Based Approach to open domains, this study investigates the closed domain.
    [Show full text]
  • Web Search Result Clustering with Babelnet
    Web Search Result Clustering with BabelNet Marek Kozlowski Maciej Kazula OPI-PIB OPI-PIB [email protected] [email protected] Abstract 2 Related Work In this paper we present well-known 2.1 Search Result Clustering search result clustering method enriched with BabelNet information. The goal is The goal of text clustering in information retrieval to verify how Babelnet/Babelfy can im- is to discover groups of semantically related docu- prove the clustering quality in the web ments. Contextual descriptions (snippets) of docu- search results domain. During the evalua- ments returned by a search engine are short, often tion we tested three algorithms (Bisecting incomplete, and highly biased toward the query, so K-Means, STC, Lingo). At the first stage, establishing a notion of proximity between docu- we performed experiments only with tex- ments is a challenging task that is called Search tual features coming from snippets. Next, Result Clustering (SRC). Search Results Cluster- we introduced new semantic features from ing (SRC) is a specific area of documents cluster- BabelNet (as disambiguated synsets, cate- ing. gories and glosses describing synsets, or Approaches to search result clustering can be semantic edges) in order to verify how classified as data-centric or description-centric they influence on the clustering quality (Carpineto, 2009). of the search result clustering. The al- The data-centric approach (as Bisecting K- gorithms were evaluated on AMBIENT Means) focuses more on the problem of data clus- dataset in terms of the clustering quality. tering, rather than presenting the results to the user. Other data-centric methods use hierarchical 1 Introduction agglomerative clustering (Maarek, 2000) that re- In the previous years, Web clustering engines places single terms with lexical affinities (2-grams (Carpineto, 2009) have been proposed as a solu- of words) as features, or exploit link information tion to the issue of lexical ambiguity in Informa- (Zhang, 2008).
    [Show full text]
  • The Use of a Semantic Network in a Deductive Question-Answering System
    THE USE OF A SEMANTIC NETWORK IN A DEDUCTIVE QUESTION-ANSWERING SYSTEM James R. McSkimin .Jack Minker Bell Telephone Laboratories Department of Computer Science Columbus, Ohio University of Maryland 43209 College Park, Maryland 20742 Abstract tions. Thus, it should be possible to deter• mine that a query such as "Who is the person The use of a semantic network to aid in the who is both father and the mother of a given deductive search process of a Question-Answering individual?n is not answerable. System is described. The semantic network is (3) To identify those queries which have a known based on an adaptation of the predicate calculus. maximum number of solutions so as to termi• It makes available user-supplied, domain-dependent nate searches for additional answers once the info mat ion so as to permit semantic data to be known fixed number is found. used during the search process. The semantic network discussed in this paper Three ways are discussed in which semantic is described briefly in Section 2. In Section 3, information may be used. These are: we describe how the semantic network may be used (a) To apply semantic information during the to solve some of the problems associated with the pattern-matching process. above items. An example which illustrates the (b) To apply semantic well-formedness tests to use of the semantic network is presented in query and data inputs. Section 4. A summary of the work and future (c) To determine when subproblems are fully- directions is given in Section 5. solved (i.e., they have no solutions other than a fixed, finite number).
    [Show full text]
  • Semantic Networks: Their Computation and Use for Understanding English Sentences
    Two Semantic Networks: Their Computation and Use for Understanding English Sentences R. F. Simmons The University of Texas In the world of computer science, networks are mathematical and computational structures composed of sets of nodes connected by directed arcs. A semantic net- work purports to represent concepts expressed by natural-language words and phrases as nodes connected to other such concepts by a particular set of arcs called semantic relations. Primitive concepts in this system of semantic networks are word-sense meanings. Primitive semantic relations are those that the verb of a sen- tence has with its subject, object, and prepositional phrase arguments in addition to those that underlie common lexical, classificational and modificational relations. A complete statement of semantic relations would include all those relations that would be required in the total classification of a natural language vocabulary. We consider the theory and model of semantic nets to be a computational theory of superficial verbal understanding in humans. We conceive semantic nodes as representing human verbal concept structures and semantic relations connecting two such structures as representing the linguistic processes of thought that are used to combine them into natural-language descriptions of events. Some psycholin- guistic evidence supports this theory (Quillian 1968, Rumelhart and Norman, 1971, Collins and Quillian, 1971); but a long period of research will be necessary before 64 Semantic Network Computation and Use for Understanding English Sentences enough facts are available to accept or reject it as valid and useful psychological theory. We are on much stronger ground when we treat semantic networks as a computa- tional linguistic theory of structures and processing operations required for computer understanding of natural language.
    [Show full text]
  • Applications of Network Science to Education Research: Quantifying Knowledge and the Development of Expertise Through Network Analysis
    education sciences Commentary Applications of Network Science to Education Research: Quantifying Knowledge and the Development of Expertise through Network Analysis Cynthia S. Q. Siew Department of Psychology, National University of Singapore, Singapore 117570, Singapore; [email protected] Received: 31 January 2020; Accepted: 3 April 2020; Published: 8 April 2020 Abstract: A fundamental goal of education is to inspire and instill deep, meaningful, and long-lasting conceptual change within the knowledge landscapes of students. This commentary posits that the tools of network science could be useful in helping educators achieve this goal in two ways. First, methods from cognitive psychology and network science could be helpful in quantifying and analyzing the structure of students’ knowledge of a given discipline as a knowledge network of interconnected concepts. Second, network science methods could be relevant for investigating the developmental trajectories of knowledge structures by quantifying structural change in knowledge networks, and potentially inform instructional design in order to optimize the acquisition of meaningful knowledge as the student progresses from being a novice to an expert in the subject. This commentary provides a brief introduction to common network science measures and suggests how they might be relevant for shedding light on the cognitive processes that underlie learning and retrieval, and discusses ways in which generative network growth models could inform pedagogical strategies to enable meaningful long-term conceptual change and knowledge development among students. Keywords: education; network science; knowledge; learning; expertise; development; conceptual representations 1. Introduction Cognitive scientists have had a long-standing interest in quantifying cognitive structures and representations [1,2]. The empirical evidence from cognitive psychology and psycholinguistics demonstrates that the structure of cognitive systems influences the psychological processes that operate within them.
    [Show full text]