Mitteilungen der DGfS Nr. 79


Mitteilungen der DGfS, No. 79, June 2014

Contents

37th Annual Meeting of the DGfS
  Working groups (AGs) for the 37th Annual Meeting of the DGfS
    AG 1: Strong versus weak prosodic position: possible variation and relevance for grammar
    AG 2: Exact repetition in grammar and discourse
    AG 3: What drives syntactic computation? Alternatives to formal features
    AG 4: VO OV: Korrelationen der Kopf-Komplement-Abfolge in Grammatik und Lexikon
    AG 5: Co- and Subordination in German and Other Languages
    AG 6: The prosody and meaning of (non-)canonical questions across languages
    AG 7: Universal biases on phonological acquisition and processing
    AG 8: Normalität in der Sprache
    AG 9: Varieties of Positive Polarity Items
    AG 10: Perspective-taking
    AG 11: Big data: new opportunities and challenges in language acquisition research
    AG 12: The development of iconic gestures as resources in language acquisition
    AG 13: Proportions and Quantities
    AG 14: Modelling conditionality
36th Annual Meeting of the DGfS
  Reports of the working groups of the 36th Annual Meeting of the DGfS
    Experimental and theoretical approaches to relative clauses reconciled
    Summary of the DGfS Workshop Demonstratives
    Clausal complementation and (non)factivity
    The Syntax and Semantics of Particles
    Categories and Categorization
    Language in Historical Contact situations (LHC)
    Probleme der syntaktischen Kategorisierung
    Converging Evidence?
    Sichtbare und hörbare Morphologie
    Pejoration
    Web Data as a Challenge for Theoretical Linguistics and Corpus Design
    Grammatical categories in macro- and microcomparative linguistics
    Labels and Roots
  Minutes of the DGfS general meeting on 6 March 2014 in Marburg
  Treasurer's report
  Switch to the SEPA Core Direct Debit scheme
  Report of the editors of the Zeitschrift für Sprachwissenschaft
  Report of the Computational Linguistics section (Sektion Computerlinguistik)
Addresses
  Executive board
  Advisory board
  Programme committee
  Press officer
  Computational Linguistics section (Sektion Computerlinguistik)
  Editors of the ZS
  Contact addresses

37th Annual Meeting of the DGfS, 4 to 6 March 2015, at Universität Leipzig

Conference theme: Grammatische Modellierung und sprachliche Verschiedenheit (Grammatical modelling and linguistic diversity)

Information on the programme, accommodation, registration, the AG schedule, the doctoral forum, the teacher-training initiative and the CL tutorial will be available shortly on the conference website.

Contact
Prof. Dr. Barbara Stiebels
Institut für Linguistik
Universität Leipzig
Beethovenstr. 15
04159 Leipzig
Tel. +341/97-37604
Website: conference.uni-leipzig.de/dgfs2015

Working groups (AGs) for the 37th Annual Meeting of the DGfS

Accepted AGs for the DGfS 2015 in Leipzig

AG 1: Renate Raffelsiefen & Marzena Zygis
  Strong versus weak prosodic position: possible variation and relevance for grammar
AG 2: Rita Finkbeiner & Ulrike Freywald
  Exact repetition in grammar and discourse
AG 3: Dennis Ott & Radek Šimík
  What drives syntactic computation? Alternatives to formal features
AG 4: Balthasar Bickel, Walter Bisang, Gisbert Fanselow, Hubert Haider
  VO OV: Korrelationen der Kopf-Komplement-Abfolge in Grammatik und Lexikon
AG 5: Ingo Reich & Augustin Speyer
  Co- and Subordination in German and Other Languages
AG 6: Bettina Braun, Nicole Dehé, Daniela Wochner, Beste Kamali, Hubert Truckenbrodt
  The prosody and meaning of (non-)canonical questions across languages
AG 7: Dinah Baer-Henney, Natalie Boll-Avetisyan
  Universal biases on phonological acquisition and processing
AG 8: Bergische Universität Wuppertal), Franz d’Avis
  Normalität in der Sprache
AG 9: Mingya Liu & Gianina Iordăchioaia
  Varieties of Positive Polarity Items
AG 10: Stefan Hinterwimmer, Petra B. Schumacher, Hanna Weiland
  Perspective-taking
AG 11 (short AG): Christina Bergmann, Alex Cristia, Sho Tsuji
  Big data: new opportunities and challenges in language acquisition research
AG 12 (short AG): Friederike Kern & Katharina Rohlfing
  The development of iconic gestures as resources in language acquisition
AG 13 (short AG): Ulrich Sauerland
  Proportions and Quantities
AG 14 (short AG): Eva Csipak, Ryan Bochnak
  Modelling conditionality

AG proposals not accepted

• Can pragmatics take over? – Issues at the syntax-pragmatics interface
• Referential Expressions and Discourse Structure
• Impairments in oral and written language production
• Multiple functions of modifiers
• Linguistic representations and processes in bilinguals
• Mismatches between syntax and semantics
• Discourse-related features: what they are, and what they mean

AG 1: Strong versus weak prosodic position: possible variation and relevance for grammar

Renate Raffelsiefen (IDS Mannheim, FU Berlin) and Marzena Zygis (ZAS & HU Berlin)

Both phoneticians and phonologists have found reason to distinguish "strong" and "weak" positions, referring to constituents of the prosodic hierarchy. These include higher constituents, whose boundaries align with morphosyntactic boundaries, as well as lower constituents such as the foot and the syllable. Strength is commonly associated with initial positions and with stress, whereas weakness is associated with non-prominent positions.

Reference to strong versus weak positions has been invoked in articulatory phonetics (target overshoot, i.e. enhancement of the duration and/or magnitude of articulatory gestures, in strong positions versus target undershoot in weak positions) as well as in auditory phonetics (a lower rate of misperception in strong positions versus a higher rate in weak positions). It has also been invoked to account for potential contrast, with more distinctiveness being associated with strong positions (cf. the notions of "positional faithfulness" and "positional markedness" in Optimality Theory).

Reference to "strong" versus "weak" positions appears to be universally grounded in prominence, and it seems to be taken for granted that a position considered strong for the purposes of one area of phonetics or phonology is also strong for the purposes of the others. There is nevertheless evidence of disparity. For instance, the word-initial position is associated with strong potential contrast by Beckman (1998), whereas Trubetzkoy links both margin positions of words to low contrastiveness (e.g. neutralization of the voicing contrast for all consonants in word-initial position in Erzya Mordvin, Trubetzkoy 1958: 212ff). Similarly, the word-initial position is associated with target overshoot (e.g. aspiration of voiceless plosives) in English or German, but it also exhibits fewer contrasts in fricatives than, for instance, the foot-internal position. The latter nonetheless exhibits target undershoot (flapping in American English).

In view of these discrepancies, the workshop will provide a forum for phonologists and phoneticians to discuss associations between segmental phenomena and prosodic positions from a cross-linguistic point of view, focusing on questions like:

- Which prosodic positions need to be distinguished in terms of weakness versus strength to account for what sort of phenomenon (enhancement of articulatory gestures, perceptual discriminability, potential contrast)?
- To what extent do these phenomena overlap?
- Is there evidence that weak versus strong positions could be language-specific?
- What are the implications for the modeling of grammar, e.g. is there a need to distinguish a phonemic level (contrast) from phonetics, the latter modeled as implementation?

AG 2: Exact repetition in grammar and discourse

Rita Finkbeiner
FB 05, Deutsches Institut
Universität Mainz
Jakob-Welder-Weg 18
55099 Mainz
fi[email protected]
Tel. 06131-39 25512 / Fax 06131-39 23366

Ulrike Freywald
Institut für Germanistik
Universität Potsdam
Am Neuen Palais 10
14469 Potsdam
[email protected]
Tel. 0331-977 4221 / Fax 0331-977 4245

Most linguists will agree that iteration is a pervasive phenomenon in language and an important notion for linguistic analysis. Traditionally, the process of repetition is related to the domains of text and discourse and associated with specific pragmatic effects (e.g., emphasis), while the process of reduplication is restricted to the domains of phonology and morphology and associated with specific semantic effects (e.g., intensification). In phonological and syntactic theory, reduplication has mainly been discussed as a local copying process, while in typology it has been described as a morphological marker of inflection or word formation. Repetition phenomena, in contrast, have been claimed to apply above the word level. In interactional linguistics, the focus has been on functions of repetition such as the marking of agreement and disagreement. In recent years, however, one has come to realize that the borderline