Semantic Field of the Words: To the Question of the Theories of Linguistic Semantics
Total Pages: 16
File Type: pdf, Size: 1020 KB
Recommended publications
-
Words and Alternative Basic Units for Linguistic Analysis
Jens Allwood, SCCIIL Interdisciplinary Center, University of Gothenburg; A. P. Hendrikse, Department of Linguistics, University of South Africa, Pretoria; Elisabeth Ahlsén, SCCIIL Interdisciplinary Center, University of Gothenburg
Abstract
The paper deals with words and possible alternatives to words as basic units in linguistic theory, especially in interlinguistic comparison and corpus linguistics. A number of ways of defining the word are discussed and related to the analysis of linguistic corpora and to interlinguistic comparisons between corpora of spoken interaction. Problems associated with words as the basic units, and alternatives to the traditional notion of word as a basis for corpus analysis and linguistic comparisons, are presented and discussed.
1. What is a word?
To some extent, there is an unclear view of what counts as a linguistic word, generally, and in different language types. This paper is an attempt to examine various construals of the concept “word”, in order to see how “words” might best be made use of as units of linguistic comparison. Using intuition, we might say that a word is a basic linguistic unit that is constituted by a combination of content (meaning) and expression, where the expression can be phonetic, orthographic or gestural (deaf sign language). On closer examination, however, it turns out that the notion “word” can be analyzed and specified in several different ways. Below we will consider the following three main ways of trying to analyze and define what a word is: (i) analysis and definitions building on observation and supposed easy discovery; (ii) analysis and definitions building on manipulability; (iii) analysis and definitions building on abstraction.
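As a minimal illustration of why the choice among these construals matters in practice (the toy sentence and the two tokenization rules below are my own, not taken from the paper), the following Python sketch shows how two different operational definitions of "word" give different token and type counts for the same material, which is exactly the kind of discrepancy that affects corpus comparisons:

```python
import re
from collections import Counter

# Toy material; "Isn't" and "New York" illustrate why the unit "word" is not self-evident.
text = "Isn't New York nice? New York isn't small, is it?"

# Construal (a): orthographic words, i.e. whitespace-separated strings with punctuation stripped.
ortho_tokens = [t.strip(".,?!") for t in text.split()]

# Construal (b): lowercased alphabetic strings; clitics like "isn't" split into "isn" + "t".
alpha_tokens = re.findall(r"[a-z]+", text.lower())

print("orthographic: %d tokens, %d types" % (len(ortho_tokens), len(set(ortho_tokens))))
print("alphabetic:   %d tokens, %d types" % (len(alpha_tokens), len(set(alpha_tokens))))
print(Counter(alpha_tokens).most_common(3))  # frequency counts also shift with the definition
```

Corpus statistics such as type/token ratios therefore depend directly on which definition of the word is adopted.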
-
Semantic Differences in Translation: Exploring the Field of Inchoativity
Lore Vandevoorde. language science press. Translation and Multilingual Natural Language Processing 13.
Translation and Multilingual Natural Language Processing
Editors: Oliver Czulo (Universität Leipzig), Silvia Hansen-Schirra (Johannes Gutenberg-Universität Mainz), Reinhard Rapp (Johannes Gutenberg-Universität Mainz)
In this series:
1. Fantinuoli, Claudio & Federico Zanettin (eds.). New directions in corpus-based translation studies.
2. Hansen-Schirra, Silvia & Sambor Grucza (eds.). Eyetracking and Applied Linguistics.
3. Neumann, Stella, Oliver Čulo & Silvia Hansen-Schirra (eds.). Annotation, exploitation and evaluation of parallel corpora: TC3 I.
4. Czulo, Oliver & Silvia Hansen-Schirra (eds.). Crossroads between Contrastive Linguistics, Translation Studies and Machine Translation: TC3 II.
5. Rehm, Georg, Felix Sasaki, Daniel Stein & Andreas Witt (eds.). Language technologies for a multilingual Europe: TC3 III.
6. Menzel, Katrin, Ekaterina Lapshinova-Koltunski & Kerstin Anna Kunz (eds.). New perspectives on cohesion and coherence: Implications for translation.
7. Hansen-Schirra, Silvia, Oliver Czulo & Sascha Hofmann (eds.). Empirical modelling of translation and interpreting.
8. Svoboda, Tomáš, Łucja Biel & Krzysztof Łoboda (eds.). Quality aspects in institutional translation.
9. Fox, Wendy. Can integrated titles improve the viewing experience? Investigating the impact of subtitling on the reception and enjoyment of film using eye tracking and questionnaire data.
10. Moran, Steven & Michael Cysouw. The Unicode cookbook for linguists: Managing writing systems using orthography profiles.
11. Fantinuoli, Claudio (ed.). Interpreting and technology.
12. Nitzke, Jean. Problem solving activities in post-editing and translation from scratch: A multi-method study.
13. Vandevoorde, Lore. Semantic differences in translation.
ISSN: 2364-8899
Semantic differences in translation: Exploring the field of inchoativity. Lore Vandevoorde. language science press. Vandevoorde, Lore. …
-
Polysemy and Metaphor in the Verbs of Perception
Mihaela Georgiana Manasia
ABSTRACT: This paper addresses the idea, recently put forward by several studies in the field of cognitive linguistics, that perception verbs have a polysemous structure motivated by our experience and understanding of the world. Metaphor is not only characteristic of poetic language; on the contrary, it can be found everywhere in everyday language, and the polysemous semantic character of perception verbs, reflected in a wide range of syntactic and constructional alternatives, makes them a motivating field to approach in this respect.
KEY WORDS: polysemy, metaphor, perception verbs, prototypical meaning, metaphorical meaning.
Polysemy represents, within semantics, the term used to characterize the situation in which a word has two or more similar meanings. Despite this very simple definition, the concept of polysemy has been subject to controversies and continues to remain a debatable field in linguistic research. In 1980, the study of polysemy and metaphor expands within cognitive linguistics, especially with Lakoff and Johnson's book Metaphors We Live By. They define polysemy as a systematic relation of meanings. It is perceived as categorization, namely related meanings are organised into categories based on family resemblance. Recent studies in the field of cognitive semantics have tried to put forward that perception verbs have a polysemous structure, motivated by our experience and understanding of the world. Metaphor represents one of the cognitive instruments structuring the way in which we think, perceive and act. […] this variety of meanings and a part of everyday language that affects […] The authors of Metaphors We Live By criticized the classical theory of metaphor as a comparison, describing similarities that already exist.
-
Short-Text Clustering Using Statistical Semantics
Sepideh Seifzadeh, Ahmed K. Farahat, Mohamed S. Kamel and Fakhri Karray
University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1
ABSTRACT
Short documents are typically represented by very sparse vectors in the space of terms. In this case, traditional techniques for calculating text similarity result in measures which are very close to zero, since documents, even very similar ones, have very few or mostly no terms in common. In order to alleviate this limitation, the representation of short-text segments should be enriched by incorporating information about correlation between terms. In other words, if two short segments do not have any common words, but terms from the first segment appear frequently with terms from the second segment in other documents, this means …
1. INTRODUCTION
In social media, users usually post short texts. Twitter limits the length of each Tweet to 140 characters; therefore, developing data mining techniques to handle the large volume of short texts has become an important goal [1]. Text document clustering has been widely used to organize document databases and discover similarity and topics among documents. Short text clustering is more challenging than regular text clustering; due to the sparsity and noise, short texts provide very few contextual clues for applying traditional data mining techniques [2]; therefore, short documents require different or more adapted approaches.
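The enrichment idea sketched in the abstract can be illustrated with a few lines of Python. This is a sketch of the general strategy rather than the authors' actual method: the two example texts and the term-correlation weights in `cooc` are invented, standing in for statistics that would come from a large background corpus.

```python
from collections import Counter
import math

def cosine(u, v):
    num = sum(u[t] * v[t] for t in set(u) & set(v))
    den = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

stop = {"to", "the"}

def tokenize(s):
    return Counter(t for t in s.split() if t not in stop)

# Two short texts about the same event that share no content terms.
a = tokenize("obama speaks to the media")
b = tokenize("the president greets the press")
print("raw cosine:", cosine(a, b))  # 0.0 (no terms in common)

# Invented term-correlation weights standing in for co-occurrence statistics
# gathered from a large background corpus.
cooc = {"obama": {"president": 0.7}, "president": {"obama": 0.7},
        "media": {"press": 0.8}, "press": {"media": 0.8}}

def enrich(vec):
    out = Counter(vec)
    for term, count in vec.items():
        for related, assoc in cooc.get(term, {}).items():
            out[related] += assoc * count  # add mass for correlated terms
    return out

print("enriched cosine:", round(cosine(enrich(a), enrich(b)), 3))  # clearly above zero
```

With the enriched vectors, the two segments become comparable even though their surface vocabularies are disjoint, which is the property short-text clustering needs.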
Distributional Semantics
Distributional semantics is a research area that develops and studies theories and methods for quantifying and categorizing semantic similarities between linguistic items based on their distributional properties in large samples of language data. The basic idea of distributional semantics can be summed up in the so-called Distributional hypothesis: linguistic items with similar distributions have similar meanings.
1 Distributional hypothesis
The distributional hypothesis in linguistics is derived from the semantic theory of language usage, i.e. words that are used and occur in the same contexts tend to purport similar meanings. The underlying idea that “a word is characterized by the company it keeps” was popularized by Firth.[2] The Distributional Hypothesis is the basis for statistical semantics. Although the Distributional Hypothesis originated in linguistics,[3] it is now re…
… by populating the vectors with information on which text regions the linguistic items occur in; paradigmatic similarities can be extracted by populating the vectors with information on which other linguistic items the items co-occur with. Note that the latter type of vectors can also be used to extract syntagmatic similarities by looking at the individual vector components. The basic idea of a correlation between distributional and semantic similarity can be operationalized in many different ways. There is a rich variety of computational models implementing distributional semantics, including latent semantic analysis (LSA),[8] Hyperspace Analogue to Language (HAL), syntax- or dependency-based models,[9] random indexing, semantic folding[10] and various variants of the topic model.[1] Distributional semantic models differ primarily with respect to the following parameters:
• Context type (text regions vs. …
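The distributional hypothesis is easy to demonstrate on a toy scale. The sketch below (my own illustration, not part of the article) builds word-context co-occurrence vectors from four invented sentences and compares them with cosine similarity; words used in similar contexts, such as "cat"/"dog" and "milk"/"water", come out as more similar than unrelated pairs.

```python
from collections import defaultdict, Counter
import math

corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
    "a child drinks milk",
]

# Word-context vectors: co-occurrence counts within a symmetric window of 2 tokens.
window = 2
vectors = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                vectors[word][tokens[j]] += 1

def cosine(u, v):
    num = sum(u[t] * v[t] for t in set(u) & set(v))
    den = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return num / den if den else 0.0

for w1, w2 in [("cat", "dog"), ("milk", "water"), ("cat", "water")]:
    print(w1, w2, round(cosine(vectors[w1], vectors[w2]), 2))
```

Full-scale distributional models refine this recipe with larger corpora, association weighting, and dimensionality reduction such as LSA, which is where the parameters listed above come into play.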
-
S&P 2012 Lecture 3: Semantic Change
4.43201 Semantics & Pragmatics 2012, Lecture 3: Semantic Change
Today's Menu:
1. Semantic shift: various processes of semantic change (widening/narrowing, amelioration/pejoration) and their causes
2. Grammaticalization (both semantic and structural change) and its causes.
Last week, we zoomed in on the ‘technical features’ of our ‘spinning wheel’; we examined the smallest units that make it up (word-meanings), and grouped them by resemblance (synonyms/antonyms; homonymy/polysemy) and contiguity (hyponymy/hypernymy and meronymy/holonymy). Thus, we established the nature of ‘systemic’ lexical relations between word-meanings because of resemblance or contiguity between them, as perceived by our minds. This week, while still focused on the ‘technical specifications’ of the language tool, we will ‘zoom out’ a little, and view its smallest units in the 4th dimension of all existence – Time. Our task this week is to establish how word-meanings change over time, and to explain why they do so.
1. Semantic Change: How do word-meanings change over time?
In historical/diachronic linguistics, semantic change refers to a change in denotative, socially shared word meaning. Semantic shift is the general way of referring to any unspecified semantic change. Major types of semantic change may be viewed as:
• Widening (generalization) – a shift to a more general meaning: e.g., in Middle English, bridde meant a ‘small bird’; later, bird came to be used in a general sense and the word fowl, formerly the more general word, was restricted to the sense of ‘farm birds bred especially for consumption’;
• Narrowing (specification) – a shift towards a more specific concept: the opposite of widening, or expansion.
-
The Semantics and Pragmatics of Polysemy: a Relevance-Theoretic Account
Ingrid Lossius Falkum
Thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy, University College London, January 2011.
I, Ingrid Lossius Falkum, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis.
Abstract
This thesis investigates the phenomenon of polysemy: a single lexical form with two or multiple related senses (e.g. catch the rabbit/order the rabbit; lose a wallet/lose a relative; a handsome man/a handsome gift). I develop a pragmatic account of polysemy within the framework of Sperber and Wilson’s relevance theory, where new senses for a word are constructed during on-line comprehension by means of a single process of ad hoc concept construction, which adjusts the meanings of individual words in different directions. While polysemy is largely unproblematic from the perspective of communication, it poses a range of theoretical and descriptive problems. This is sometimes termed the polysemy paradox. A widely held view in lexical semantics is that word meanings must consist of complex representations in order to capture the sense relations involved in polysemy. Contrary to this view, I argue that a conceptual atomist approach, which treats word meanings as unstructured atoms and thereby avoids the range of problems associated with decompositional theories of word meaning, may be at least as able to account for polysemy when paired with an adequate pragmatic theory. My proposed solution to the polysemy paradox is to treat polysemy as a fundamentally communicative phenomenon, which arises as a result of encoded lexical concepts being massively underdetermining of speaker-intended concepts, and is grounded in our pragmatic inferential ability.
-
Colexification and Semantic Change in Colour Terms in Sino-Tibetan and Indo-European Languages
Kajsa Söderqvist. Bachelor Thesis in General Linguistics. Supervisor: Gerd Carling. Autumn semester 2016/2017, University of Lund, Centre for Languages and Literature.
Abstract
Colour terms are a highly interesting field when investigating linguistic universals and how languages vary cross-linguistically. Colour semantics, the investigation of the meaning of colour, consists largely of two opposing sides: the universalists, proposing that colour terms are universal (Berlin & Kay 1969), and the relativists, claiming a variation in meaning cross-linguistically (Wierzbicka 2008). Lexical semantic change, a highly changeable field, is defined as change in the meaning of the concepts connected to a lexical item, and a typical pattern of change is words becoming polysemous (Durkin 2009). To gain an expanded picture and understanding of a term, historical investigation and etymological research into its derived concepts is a useful resource. Biggam (2012) points out that colour terms in particular are less stable and that historical colour terms tend to have broader coverage than modern terms, which makes them an interesting object of investigation. The focus of this thesis is consequently to investigate and contrast the synchronic colexifications and diachronic derivations of ten colour terms in ten Sino-Tibetan and ten Indo-European languages. A dataset in DiACL (Carling 2017) has been constructed to gather the collected lexemes, followed by manual extraction to semantic networks for a visual representation (Felbaum 2012). The lexical meanings have then been grouped into semantic classifications (Haspelmath & Tadmor 2009) for further analysis. The results showed very small overlap of colexified lexical meanings for each colour term in the diachronic perspective, but a conformity of semantic categories between the families.
-
LarKC FP7 – 215535 D1.2.2 Improved Operational Framework
LarKC, The Large Knowledge Collider: a platform for large-scale integrated reasoning and Web-search
FP7 – 215535, D1.2.2 Improved Operational Framework
Coordinator: Gaston Tagni
With contributions from: Gaston Tagni (VUA), Zhisheng Huang (VUA), Stefan Schlobach (VUA), Annette ten Teije (VUA), Frank van Harmelen (VUA), Barry Bishop (UIBK), Florian Fischer (UIBK), Vassil Momtchev (Ontotext), Yi Zeng (WICI), Yan Wang (WICI), Yi Huang (Siemens), Georgina Gallizo (HLRS), Matthias Assel (HLRS), Jose Quesada (MPI)
Quality Assessor: Michael Witbrock (CycEur)
Quality Controller: Reto Krummenacher (UIBK)
Document Identifier: LarKC/2008/D1.2.2. Class Deliverable: LarKC EU-IST-2008-215535. Version: 1.0.0. Date: March 30, 2010. State: final. Distribution: public.
Executive Summary
This deliverable is the second of a series of three deliverables aimed at defining an Operational Framework for scalable reasoning in LarKC, and as such is a continuation of the work reported in Deliverable D1.2.1 – Initial Operational Framework (M7). The main contributions of this document are: first, a detailed analysis and discussion of different plug-ins currently being developed in LarKC in the context of the different technical work packages, along with a discussion of several re-use possibilities between plug-ins and their sub-components. The second contribution is the specification of a series of design patterns aimed at supporting the development of plug-ins and other components in LarKC to achieve reasoning at Web-scale.
-
Semantic Excel: an Introduction to a User-Friendly Online Software Application for Statistical Analyses of Text Data
Sikström, S.*, Kjell, O. N. E. & Kjell, K., Department of Psychology, Lund University, Sweden
*Correspondence to: [email protected]
Acknowledgements: We would like to thank Igor Marchetti for suggesting improvements on earlier drafts of this article.
Abstract
Semantic Excel (www.semanticexcel.com) is an online software application with a simple, yet powerful interface enabling users to perform statistical analyses on texts. The purpose of this software is to facilitate statistical testing based on words, rather than numbers. The software comes with semantic representations, i.e., ordered sets of numbers describing the semantic similarity between words/texts, generated by Latent Semantic Analysis. These semantic representations are based on large datasets from Google N-grams for a dozen of the most commonly used languages in the world. This small-by-big data approach enables users to conduct analyses of small data enhanced by semantic knowledge from big data. First, we describe the theoretical foundation of these representations. Then we show the practical steps involved in carrying out statistical calculations using these semantic representations in Semantic Excel. This includes calculation of semantic similarity scores (i.e., computing a score describing the semantic similarity between two words/texts), semantic t-tests (i.e., statistically testing whether two sets of words/texts differ in meaning), semantic-numeric correlations (i.e., statistically examining the relationship between words/texts and a numeric variable) and semantic predictions (i.e., using statistically trained models to predict numerical values from words/texts).
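To make the notion of a semantic similarity score concrete, here is a small Python sketch. It is not Semantic Excel's code: the three-dimensional word vectors are invented stand-ins for the LSA-based representations the application ships with. It only illustrates the underlying idea of representing a text as the average of its word vectors and comparing two texts by cosine similarity; the semantic t-tests and correlations described above then apply conventional statistics on top of representations like these.

```python
import math

# Invented 3-dimensional stand-ins for the LSA-based representations that Semantic Excel
# ships with (the real ones are high-dimensional and trained on Google N-gram data).
space = {
    "happy":   [0.90, 0.10, 0.20],
    "glad":    [0.80, 0.20, 0.10],
    "joyful":  [0.85, 0.15, 0.20],
    "sad":     [0.10, 0.90, 0.30],
    "unhappy": [0.15, 0.85, 0.20],
}

def text_vector(text):
    """Represent a text as the mean of the vectors of its known words."""
    vecs = [space[w] for w in text.lower().split() if w in space]
    dim = 3
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def semantic_similarity(text1, text2):
    u, v = text_vector(text1), text_vector(text2)
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

print(round(semantic_similarity("happy glad", "joyful"), 2))       # high: related meanings
print(round(semantic_similarity("happy glad", "sad unhappy"), 2))  # lower: contrasting meanings
```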
-
Neural Methods Towards Concept Discovery from Text via Knowledge Transfer
DISSERTATION presented in partial fulfillment of the requirements for the degree Doctor of Philosophy in the Graduate School of The Ohio State University, by Manirupa Das, B.E., M.S., Graduate Program in Computer Science and Engineering, The Ohio State University, 2019.
Dissertation Committee: Prof. Rajiv Ramnath, Advisor; Prof. Eric Fosler-Lussier, Advisor; Prof. Huan Sun
© Copyright by Manirupa Das 2019
ABSTRACT
Novel contexts, consisting of a set of terms referring to one or more concepts, often arise in real-world querying scenarios such as a complex search query into a document retrieval system or a nuanced subjective natural language question. The concepts in these queries may not directly refer to entities or canonical concept forms occurring in any fact-based or rule-based knowledge source such as a knowledge base or ontology. Thus, in addressing the complex information needs expressed by such novel contexts, systems using only such sources can fall short. Moreover, hidden associations meaningful in the current context may not exist in a single document, but in a collection, between matching candidate concepts having different surface realizations, via alternate lexical forms. These may refer to underlying latent concepts, i.e., existing or conceived concepts or semantic classes that are accessible only via their surface forms. Inferring these latent concept associations in an implicit manner, by transferring knowledge from the same domain (within a collection), or from across domains (different collections), can potentially better address such novel contexts. Thus latent concept associations may act as a proxy for a novel context.
-
arXiv:1003.1141v1 [cs.CL]
Journal of Artificial Intelligence Research 37 (2010) 141-188. Submitted 10/09; published 02/10.
From Frequency to Meaning: Vector Space Models of Semantics
Peter D. Turney ([email protected]), National Research Council Canada, Ottawa, Ontario, Canada, K1A 0R6
Patrick Pantel ([email protected]), Yahoo! Labs, Sunnyvale, CA, 94089, USA
Abstract
Computers understand very little of the meaning of human language. This profoundly limits our ability to give instructions to computers, the ability of computers to explain their actions to us, and the ability of computers to analyse and process text. Vector space models (VSMs) of semantics are beginning to address these limits. This paper surveys the use of VSMs for semantic processing of text. We organize the literature on VSMs according to the structure of the matrix in a VSM. There are currently three broad classes of VSMs, based on term–document, word–context, and pair–pattern matrices, yielding three classes of applications. We survey a broad range of applications in these three categories and we take a detailed look at a specific open source project in each category. Our goal in this survey is to show the breadth of applications of VSMs for semantics, to provide a new perspective on VSMs for those who are already familiar with the area, and to provide pointers into the literature for those who are less familiar with the field.
1. Introduction
One of the biggest obstacles to making full use of the power of computers is that they currently understand very little of the meaning of human language.
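As a concrete instance of the first of these three classes, the term–document matrix, the following Python sketch (my own illustration, not code from the survey) builds a tiny tf-idf weighted term-document matrix and measures document similarity with cosine:

```python
import math
from collections import Counter

docs = {
    "d1": "the judge sentenced the defendant in court",
    "d2": "the court acquitted the defendant",
    "d3": "the striker scored a goal in the match",
}

# Term frequencies (the columns of a term-document matrix).
tf = {name: Counter(text.split()) for name, text in docs.items()}
vocab = sorted({term for counts in tf.values() for term in counts})

# Inverse document frequency down-weights terms that occur in every document.
idf = {t: math.log(len(docs) / sum(1 for counts in tf.values() if t in counts)) for t in vocab}

# tf-idf weighted document vectors over the shared vocabulary.
matrix = {name: [tf[name][t] * idf[t] for t in vocab] for name in docs}

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

print(round(cosine(matrix["d1"], matrix["d2"]), 2))  # higher: shared legal vocabulary
print(round(cosine(matrix["d1"], matrix["d3"]), 2))  # near zero: different topics
```

Word–context and pair–pattern matrices follow the same recipe with different choices of rows and columns, which is what yields the three classes of applications the survey describes.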