Corpus-Based Research in the Humanities
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Compilation of a Swiss German Dialect Corpus and Its Application to Pos Tagging
Compilation of a Swiss German Dialect Corpus and its Application to PoS Tagging Nora Hollenstein Noemi¨ Aepli University of Zurich University of Zurich [email protected] [email protected] Abstract Swiss German is a dialect continuum whose dialects are very different from Standard German, the official language of the German part of Switzerland. However, dealing with Swiss German in natural language processing, usually the detour through Standard German is taken. As writing in Swiss German has become more and more popular in recent years, we would like to provide data to serve as a stepping stone to automatically process the dialects. We compiled NOAH’s Corpus of Swiss German Dialects consisting of various text genres, manually annotated with Part-of- Speech tags. Furthermore, we applied this corpus as training set to a statistical Part-of-Speech tagger and achieved an accuracy of 90.62%. 1 Introduction Swiss German is not an official language of Switzerland, rather it includes dialects of Standard German, which is one of the four official languages. However, it is different from Standard German in terms of phonetics, lexicon, morphology and syntax. Swiss German is not dividable into a few dialects, in fact it is a dialect continuum with a huge variety. Swiss German is not only a spoken dialect but increasingly used in written form, especially in less formal text types. Often, Swiss German speakers write text messages, emails and blogs in Swiss German. However, in recent years it has become more and more popular and authors are publishing in their own dialect. Nonetheless, there is neither a writing standard nor an official orthography, which increases the variations dramatically due to the fact that people write as they please with their own style. -
Distant Reading, Computational Stylistics, and Corpus Linguistics the Critical Theory of Digital Humanities for Literature Subject Librarians David D
CHAPTER FOUR Distant Reading, Computational Stylistics, and Corpus Linguistics The Critical Theory of Digital Humanities for Literature Subject Librarians David D. Oberhelman FOR LITERATURE librarians who frequently assist or even collaborate with faculty, researchers, IT professionals, and students on digital humanities (DH) projects, understanding some of the tacit or explicit literary theoretical assumptions involved in the practice of DH can help them better serve their constituencies and also equip them to serve as a bridge between the DH community and the more traditional practitioners of literary criticism who may not fully embrace, or may even oppose, the scientific leanings of their DH colleagues.* I will therefore provide a brief overview of the theory * Literary scholars’ opposition to the use of science and technology is not new, as Helle Porsdam has observed by looking at the current debates over DH in the academy in terms of the academic debate in Great Britain in the 1950s and 1960s between the chemist and novelist C. P. Snow and the literary critic F. R. Leavis over the “two cultures” of science and the humanities (Helle Porsdam, “Too Much ‘Digital,’ Too Little ‘Humanities’? An Attempt to Explain Why Humanities Scholars Are Reluctant Converts to Digital Humanities,” Arcadia project report, 2011, DSpace@Cambridge, www.dspace.cam.ac.uk/handle/1810/244642). 54 DISTANT READING behind the technique of DH in the case of literature—the use of “distant reading” as opposed to “close reading” of literary texts as well as the use of computational linguistics, stylistics, and corpora studies—to help literature subject librarians grasp some of the implications of DH for the literary critical tradition and learn how DH practitioners approach literary texts in ways that are fundamentally different from those employed by many other critics.† Armed with this knowledge, subject librarians may be able to play a role in integrating DH into the traditional study of literature. -
KB-Driven Information Extraction to Enable Distant Reading of Museum Collections Giovanni A
KB-Driven Information Extraction to Enable Distant Reading of Museum Collections Giovanni A. Cignoni, Progetto HMR, [email protected] Enrico Meloni, Progetto HMR, [email protected] Short abstract (same as the one submitted in ConfTool) Distant reading is based on the possibility to count data, to graph and to map them, to visualize the re- lations which are inside the data, to enable new reading perspectives by the use of technologies. By contrast, close reading accesses information by staying at a very fine detail level. Sometimes, we are bound to close reading because technologies, while available, are not applied. Collections preserved in the museums represent a relevant part of our cultural heritage. Cataloguing is the traditional way used to document a collection. Traditionally, catalogues are lists of records: for each piece in the collection there is a record which holds information on the object. Thus, a catalogue is just a table, a very simple and flat data structure. Each collection has its own catalogue and, even if standards exists, they are often not applied. The actual availability and usability of information in collections is impaired by the need of access- ing each catalogue individually and browse it as a list of records. This can be considered a situation of mandatory close reading. The paper discusses how it will be possible to enable distant reading of collections. Our proposed solution is based on a knowledge base and on KB-driven information extraction. As a case study, we refer to a particular domain of cultural heritage: history of information science and technology (HIST). -
Collection of Usage Information for Language Resources from Academic Articles
Collection of Usage Information for Language Resources from Academic Articles Shunsuke Kozaway, Hitomi Tohyamayy, Kiyotaka Uchimotoyyy, Shigeki Matsubaray yGraduate School of Information Science, Nagoya University yyInformation Technology Center, Nagoya University Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan yyyNational Institute of Information and Communications Technology 4-2-1 Nukui-Kitamachi, Koganei, Tokyo, 184-8795, Japan fkozawa,[email protected], [email protected], [email protected] Abstract Recently, language resources (LRs) are becoming indispensable for linguistic researches. However, existing LRs are often not fully utilized because their variety of usage is not well known, indicating that their intrinsic value is not recognized very well either. Regarding this issue, lists of usage information might improve LR searches and lead to their efficient use. In this research, therefore, we collect a list of usage information for each LR from academic articles to promote the efficient utilization of LRs. This paper proposes to construct a text corpus annotated with usage information (UI corpus). In particular, we automatically extract sentences containing LR names from academic articles. Then, the extracted sentences are annotated with usage information by two annotators in a cascaded manner. We show that the UI corpus contributes to efficient LR searches by combining the UI corpus with a metadata database of LRs and comparing the number of LRs retrieved with and without the UI corpus. 1. Introduction Thesaurus1. In recent years, such language resources (LRs) as corpora • He also employed Roget’s Thesaurus in 100 words of and dictionaries are being widely used for research in the window to implement WSD. -
Deutsche Literaturgeschichte Von Den Anfängen Bis Zur Gegenwart
metzlerverlag.de W. Beutin, M. Beilein, W. Emmerich, C. Kanz, B. Lutz, V. Meid, M. Opitz, C. Opitz-Wiemers, R. Schnell, P. Stein, I. Stephan Deutsche Literaturgeschichte Von den Anfängen bis zur Gegenwart Von den Merseburger Zaubersprüchen bis zu Juli Zeh u.a. Aktualisierte und erweiterte Kapitel zur Literatur und zum Literaturbetrieb der Gegenwart Mit Bibliografie, Personen- und Werkregister Von den mittelalterlichen Sängern und Epikern über Martin Opitz, Gotthold Ephraim Lessing, Friedrich Schiller und Johann Wolfgang von Goethe, über Heinrich Heine, Georg Büchner und Bertolt Brecht bis Günter Grass, Martin Walser, Uwe Tellkamp, Herta Müller und Ursula Krechel. Alle namhaften Schriftsteller sind erfasst: Die Literaturgeschichte fängt Lyrik, Roman, Prosa und andere literarische Gattungen und Strömungen im Spiegel der Epochen ein, zeigt die Autorinnen und Autoren, ihr Schaffen und den Literaturbetrieb in enger Verflechtung mit dem gesellschaftlichen, kulturellen und politischen Zeitgeist. Ein lebendiges Nachschlagewerk, das 9., aktualisierte und erweiterte Aufl. 2019, durch die gelungene Verknüpfung von Text und Illustrationen bei Neugierigen und Kennern XI, 799 S. gleichermaßen für großes Lesevergnügen sorgt. - Die Neuauflage schreibt die Kapitel zur Gegenwartsliteratur und zum Literaturbetrieb fort. Printed book Hardcover [1]32,99 € (D) | 33,91 € (A) | CHF 36,50 eBook [2]24,99 € (D) | 24,99 € (A) | CHF 29,00 Available from your library or springer.com/shop Order online at springer.com / or for the Americas call (toll free) 1-800-SPRINGER / or email us at: [email protected]. / For outside the Americas call +49 (0) 6221-345-4301 / or email us at: [email protected]. The first € price and the £ and $ price are net prices, subject to local VAT. -
Download Download
VÁCLAV BLAŽEK Masaryk University Fields of research: Indo-European studies, especially Slavic, Baltic, Celtic, Anatolian, Tocharian, Indo-Iranian languages; Fenno-Ugric; Afro-Asiatic; etymology; genetic classification; mathematic models in historical linguistics. AN ATTEMPT AT AN ETYMOLOGICAL ANALYSIS OF PTOLEMY´S HYDRONYMS OF EASTERN BALTICUM Bandymas etimologiškai analizuoti Ptolemajo užfiksuotus rytinio Baltijos regiono hidronimus ANNOTATION The contribution analyzes four hydronyms from Eastern Balticum, recorded by Ptole- my in the mid-2nd cent. CE in context of witness of other ancient geographers. Their re- vised etymological analyses lead to the conclusion that at least three of them are of Ger- manic origin, but probably calques on their probable Baltic counterparts. KEYWORDS: Geography, hydronym, etymology, calque, Germanic, Baltic, Balto-Fennic. ANOTACIJA Šiame darbe analizuojami keturi Ptolemajo II a. po Kr. viduryje užrašyti rytinio Bal- tijos regiono hidronimai, paliudyti ir kitų senovės geografų. Šių hidronimų etimologinė analizė parodė, kad bent trys iš jų yra germaniškos kilmės, tačiau, tikėtina, baltiškų atitik- menų kalkės. ESMINIAI ŽODŽIAI: geografija, hidronimas, etimologija, kalkė, germanų, baltų, baltų ir finų kilmė. Straipsniai / Articles 25 VÁCLAV BLAŽEK The four river-names located by Ptolemy to the east of the eastuary of the Vistula river, Χρόνος, Ῥούβων, Τουρούντος, Χε(ρ)σίνος, have usually been mechanically identified with the biggest rivers of Eastern Balticum according to the sequence of Ptolemy’s coordinates: Χρόνος ~ Prussian Pregora / German Pregel / Russian Prególja / Polish Pre- goła / Lithuanian Preglius; 123 km (Mannert 1820: 257; Tomaschek, RE III/2, 1899, c. 2841). Ῥούβων ~ Lithuanian Nemunas̃ / Polish Niemen / Belorussian Nëman / Ger- man Memel; 937 km (Mannert 1820: 258; Rappaport, RE 48, 1914, c. -
IXA-Pipe) and Parsing (Alpino)
Details from a distance? CLIN26 Amsterdam A Dutch pipeline for event detection Chantal van Son, Marieke van Erp, Antske Fokkens, Paul Huygen, Ruben Izquierdo Bevia & Piek Vossen CLOSE READING DISTANT READING NewsReader & BiographyNet Apply the detailed analyses typically associated with close (or at least non-distant) reading to large amounts of textual data ✤ NewsReader: (financial) news data ✤ Day-by-day processing of news; reconstructing a coherent story ✤ Four languages: English, Spanish, Italian, Dutch ✤ BiographyNet: biographical data ✤ Answering historical questions with digitalized data from Biography Portal of the Netherlands (www.biografischportaal.nl) using computational methods http://www.newsreader-project.eu/ http://www.biographynet.nl/ Event extraction and representation 1. Intra-document processing: Dutch NLP pipeline generating NAF files ✤ pipeline includes tokenization, parsing, SRL, NERC, etc. ✤ NLP Annotation Format (NAF) is a multi-layered annotation format for representing linguistic annotations in complex NLP architectures 2. Cross-document processing: event coreference to convert NAF to RDF representation 3. Store the NAF and RDF in a KnowledgeStore Event extraction and representation 1. Intra-document processing: Dutch NLP pipeline generating NAF files ✤ pipeline includes tokenization, parsing, SRL, NERC, etc. ✤ NLP Annotation Format (NAF) is a multi-layered annotation format for representing linguistic annotations in complex NLP architectures 2. Cross-document processing: event coreference to convert NAF to RDF 3. -
Exonyms – Standards Or from the Secretariat Message from the Secretariat 4
NO. 50 JUNE 2016 In this issue Preface Message from the Chairperson 3 Exonyms – standards or From the Secretariat Message from the Secretariat 4 Special Feature – Exonyms – standards standardization? or standardization? What are the benefits of discerning 5-6 between endonym and exonym and what does this divide mean Use of Exonyms in National 6-7 Exonyms/Endonyms Standardization of Geographical Names in Ukraine Dealing with Exonyms in Croatia 8-9 History of Exonyms in Madagascar 9-11 Are there endonyms, exonyms or both? 12-15 The need for standardization Exonyms, Standards and 15-18 Standardization: New Directions Practice of Exonyms use in Egypt 19-24 Dealing with Exonyms in Slovenia 25-29 Exonyms Used for Country Names in the 29 Repubic of Korea Botswana – Exonyms – standards or 30 standardization? From the Divisions East Central and South-East Europe 32 Division Portuguese-speaking Division 33 From the Working Groups WG on Exonyms 31 WG on Evaluation and Implementation 34 From the Countries Burkina Faso 34-37 Brazil 38 Canada 38-42 Republic of Korea 42 Indonesia 43 Islamic Republic of Iran 44 Saudi Arabia 45-46 Sri Lanka 46-48 State of Palestine 48-50 Training and Eucation International Consortium of Universities 51 for Training in Geographical Names established Upcoming Meetings 52 UNGEGN Information Bulletin No. 50 June 2106 Page 1 UNGEGN Information Bulletin The Information Bulletin of the United Nations Group of Experts on Geographical Names (formerly UNGEGN Newsletter) is issued twice a year by the Secretariat of the Group of Experts. The Secretariat is served by the Statistics Division (UNSD), Department for Economic and Social Affairs (DESA), Secretariat of the United Nations. -
Alsace Wine Route
ALSACE WINE ROUTE CASTLES, VILLAGES & VINEYARDS OF ALSACE ALSACE WINE ROUTE - SELF GUIDED WALKING HOLIDAY SUMMARY Enjoy a gentle wander through the vineyards of Alsace. Medieval villages of gingerbread houses, hilltop castles, charming hotels and wonderful food and wine. As you walk between the picturesque towns of this region you will discover the best of Alsace. The perfectly kept vineyards, welcoming cellars, gourmet restaurants, quiet forest trails and sweeping views of the Vosges mountains. A relaxed walking holiday in Alsace is perfect if you enjoy your walking combined with the good things in life. Tucked away in the eastern corner of France between the Vosges Mountains and the German border Alsace retains a rich Germanic character, a hangover from the many times that it has been under German control. The architecture, food and wine are more German than French in character. Vine covered hillsides generously sprinkled with picturesque villages, castles and churches make Alsace a delight to explore on foot. Tour: Alsace Wine Route Code: WFSAWR Type: Self-Guided Walking Holiday Price: See Website Single Supplement: See Website HIGHLIGHTS Dates: April - October Days: 9 Days (7 Walking Days) Easy walking between picture perfect Alsatian towns and villages. Nights: 8 Nights Start/Finish: Sélestat / Turckheim Wonderful views of the Vosges Mountains, Rhine and Black Forest. Grade: Easy to Moderate Enjoying the marvellous vins d’Alsace and regional specialities in each village. Stumbling across medieval Chateaux’s at almost every twist and turn of the trail. Charming and friendly traditional hotels and auberges. Is It For Me? · The Alsace Wine Route is the ideal mix of easy walking, great wine, good food, picture perfect towns and charming hotels. -
Sunday, June 30Th Departure from U.S. for Frankfurt. Monday, July
Sunday, June 30th Saturday, July 6th Departure from U.S. for Frankfurt. Thursday, July 4th Ribeauvillé / Haut-Koenigsbourg Gertwiller / Strasbourg Breakfast at the hotel then depart for a guided visit of Ribeauvillé. th Monday, July 1st Breakfast at the hotel then a guided visit of the Spice Bread Located at the foot of “le massif du Taennchel” in the center of the Transportation via luxury bus from Frankfurt to Alsace. Museum of Gertwiller, The world capital of the “Pain D’épice” wine region in between Strasbourg and Mulhouse, Ribeauvillé is a Arrival at the hotel in Obernai. Enjoy a welcome light lunch and (spice bread) for the past two centuries. This collection offers a big wine city with a medieval charm where life is a pleasure. aperitif, check-in and get settled. A guided tour of the town to get number of artifacts from years of Alsatian rural living (agriculture, Visit the “Park des Cigognes”, Storks Park, the symbol of Alsace. your bearings, then an early dinner. cooking, religion, old jobs), as well as popular Alsatian Art, which Lunch at the 3 Michelin Star, Auberge de l’Ill in Illhaeusern, all have to do with biscuit making, chocolate work….etc… one of France’s “Best of the Best”. nd Tuesday, July 2 Lunch at the world renowned ”Crocodile” in Strasbourg. Depart for the Castle of Haut-Koenigsbourg by the “Route des Obernai with Huskykart Acitivities Then discover the city of Strasbourg, with “La Grande Ile” which vins”(the wine route). Breakfast at the hotel, then a “first ever” in a Huskykart. -
Juli Zeh Other People [Unterleuten]
Juli Zeh Other People [Unterleuten] Outline + Sample Translation Literary Fiction Luchterhand 640 pages March 2016 It looks as if Linda, who is only ever called the "horse woman", has found paradise for herself and her stallion Bergamotte. Unspoiled countryside, a romantic cottage and endless space in the little village of Unterleuten promise to be an idyll. The peaceful life goes out of joint when an investment company decides to erect a wind park close by. Meiler, a city real estate speculator spoilt by and used to success, and Gombrowski, a farmer owning a large estate, are both willing to provide the necessary land for the project so as to cash in on the subsidies to the tune of a few million euros. The only rub is that the land of the two gentlemen is just a few square meters too small. The missing plot is on Linda's property of all places, which she sees as her big opportunity of making her dream of setting up her own stud farm come true. While she is skilfully manipulating the two men and playing them one against the other, Gombrowski's archenemy Kron, a crutch-wielding one-time Communist and hard-boiled troublemaker, does all he can to prevent the wind park being built. It is not long before jealousy and greed, craftiness and intrigues lead to all hell breaking loose in Unterleuten. With a keen feel for all that is human and a subtle sense of humour, the author gradually exposes the true nature of her protagonists, thus making even the apparently noblest of motives seem dubious. -
German Studies Association Newsletter ______
______________________________________________________________________________ German Studies Association Newsletter __________________________________________________________________ Volume XLIV Number 1 Spring 2019 German Studies Association Newsletter Volume XLIV Number 1 Spring 2019 __________________________________________________________________________ Table of Contents Letter from the President ............................................................................................................... 2 Letter from the Executive Director ................................................................................................. 5 Conference Details .......................................................................................................................... 8 Conference Highlights ..................................................................................................................... 9 Election Results Announced ......................................................................................................... 14 A List of Dissertations in German Studies, 2017-19 ..................................................................... 16 Letter from the President Dear members and friends of the GSA, To many of us, “the GSA” refers principally to a conference that convenes annually in late September or early October in one city or another, and which provides opportunities to share ongoing work, to network with old and new friends and colleagues. And it is all of that, for sure: a forum for