Creating and Working with Spoken Language Corpora in Exmaralda
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
A Conversion Tool for Spoken Language Transcription with a Pivot File in TEI Christophe Parisse, Carole Etienne, Loïc Liégeois
TEICORPO: a conversion tool for spoken language transcription with a pivot file in TEI Christophe Parisse, Carole Etienne, Loïc Liégeois To cite this version: Christophe Parisse, Carole Etienne, Loïc Liégeois. TEICORPO: a conversion tool for spoken language transcription with a pivot file in TEI. Journal of the Text Encoding Initiative, TEI Consortium, In press. halshs-03043572 HAL Id: halshs-03043572 https://halshs.archives-ouvertes.fr/halshs-03043572 Submitted on 7 Dec 2020 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. TEICORPO: a conversion tool for spoken language transcription with a pivot file in TEI Christophe Parisse, INSERM, Modyco-CNRS-Nanterre University, France - [email protected] Carole Etienne, ICAR-CNRS, Lyon, France - [email protected] Loïc Liégeois, University of Paris, France - [email protected] Abstract: CORLI is a consortium of Huma-Num, the French national infrastructure dedicated to the technical support and promotion of digital humanities. The goal of CORLI is to promote and provide tools and information for good and efficient research practices in corpus linguistics and especially spoken language corpora. Because of the time required to collect and transcribe spoken language resources, their number is limited and thus corpora need to be interoperable and reusable in order to improve research on various themes (phonology, prosody, interaction, syntax, textometry…). -
Exmaralda Partitur-Editor Manual Version 1.5.1
EXMARaLDA Partitur-Editor Manual Version 1.5.1 Last update: 20th October, 2011 By: Thomas Schmidt EXMARaLDA Partitur-Editor – Manual Table of Contents TABLE OF CONTENTS I. PRELIMINARY REMARKS .......................................................................................................................................... 7 XML, EXMARaLDA and the Partitur-Editor............................................................................................................................. 7 “Words of Caution” ......................................................................................................................................................................... 8 II. USER INTERFACE ......................................................................................................................................................... 10 III. PANELS ............................................................................................................................................................................... 14 A. Keyboard............................................................................................................................................................................... 14 B. Link panel ............................................................................................................................................................................. 15 C. Audio/Video panel ............................................................................................................................................................. -
Annotation Tool Toolbox How to Gloss/Annotate in Toolbox
Annotation tool Toolbox how to gloss/annotate in Toolbox Regensburg DOBES summer school Language Documentation Sebastian Drude 2011-09 Topics 1. Data and Annotation (Theory) 2. Annotation Tools (Overview and Comparison) 3. Intro to Interlinearization (not time-aligned) 1. Excurse: Text- vs. sentence-based databases 4. Time-aligned annotation 1. ELAN generated annotation 2. Excurse: Regular Expressions 3. Excurse: UNICODE and UTF-8 4. Transcriber generated annotation; Conversions 5. Round-trip configuration ELAN--Toolbox Data and Annotation Data Data is always data FOR something, or at least OF something – usually it is a systematic representation of physical states and events In linguistics, primary data is a direct representation or result of speech events, for instance a written text or, in partiuclar, an audio/video recording of a speech event Data and Annotation Annotation Annotation of data is a symbolic representation of properties of the state/event represented in the data In linguistics, the most common and basic types of annotation are a transcription and a translation of the linguistic expressions represented in primary data (e.g., an a/v recording) Data and Annotation Global vs. unit-oriented Annotation Global or holistic annotation represents properties of the event as a whole and is part of the metadata Unit-oriented annotation refers to specific parts of the data, in particular, utterances of individual sentences or words or sounds etc. We speak of individual annotations (plural) Data and Annotation Secondary and derived data If unit-oriented annotation is directly based on primary data (such as a written text or a audio or video recording), then it is secondary data Annotation of secondary data would be tertiary data, and so forth recursively In sum, all unit-o. -
Semi-Automatic Transcription of Interviews
Analysis and Optimization of Spatial and Appearance Encodings of Words and Sentences Semi-Automatic Transcription of Interviews Thomas Lüdi ChristianSemester Vögeli Thesis May 2014 Master Thesis SS 2005 Superviser Dr. Beat Pfister Prof. Dr. Markus Gross Adviser Prof. Dr. Lothar Thiele Eidgenössische Technische Hochschule Zürich Swiss Federal Institute of Technology Zurich Abstract Description of a system that automatically generates EXMARaLDA-Partitions for selected au- dio files. It uses MacSpeech Scribe to transcribe audio files and a compiled Matlab function to calculate timing information for the transcribed text. i ii Acknowledgement I would like to thank my superviser Dr. Beat Pfister for his support and suggestions during this project. Especially his insight on how a user might use the application was a great help. iii iv Contents 1. Introduction1 2. Components3 2.1. Applescript....................................3 2.2. EXMARaLDA..................................4 2.3. MacSpeech Scribe................................5 2.4. Audio Hijack Pro.................................5 2.5. MATLAB.....................................5 3. EXMARaLDA Transcriber7 3.1. Transcribing Files.................................7 3.1.1. Scripting MacSpeech Scribe.......................7 3.1.2. Acquiring Audio Using Audio Hijack Pro................8 3.1.3. Getting Timing Information using Matlab................8 3.1.4. Generating the EXMARaLDA-Partition.................9 3.1.5. Feedback to the User...........................9 3.2. Additional Functions...............................9 3.2.1. Free Disk Space.............................9 3.2.2. Reconstruct Wav-Files.......................... 11 4. Conclusion and Outlook 13 4.1. Further Work................................... 13 A. Appendix 15 A.1. Handbook..................................... 15 A.1.1. Introduction................................ 15 v Contents A.1.2. Installation................................ 15 A.1.3. General Operation............................ 16 A.1.4. -
Standards and Protocols for Visual Materials
Ref. Ares(2017)5589715 - 15/11/2017 Project Number: 693349 Standard and protocols for visual materials Giuseppe Airò Farulla, Pietro Braione, Marco Indaco, Mauro Pezzè, Paolo Prinetto CINI, Aster SpA Version 1.2 – 15/11/2017 Lead contractor: Universitat Pompeu Fabra Contact person: Josep Quer Departament de Traducció i Ciències del Llenguatge Roc Boronat, 138 08018 Barcelona Spain Tel. +34-93-542-11-36 Fax. +34-93-542-16-17 E-mail: [email protected] Work package: 3 Affected tasks: 3.5 Nature of deliverable1 R P D O Dissemination level2 PU PP RE CO 1 R: Report, P: Prototype, D: Demonstrator, O: Other 2 PU: public, PP: Restricted to other programme participants (including the commission services), RE Restrict- ed to a group specified by the consortium (including the Commission services), CO Confidential, only for members of the consor- tium (Including the Commission services) Deliverable D3.11: Standard and protocols for visual materials Page 2 of 45 COPYRIGHT © COPYRIGHT SIGN-HUB Consortium consisting of: UNIVERSITAT POMPEU FABRA Spain UNIVERSITA' DEGLI STUDI DI MILANO-BICOCCA Italy UNIVERSITEIT VAN AMSTERDAM Netherlands BOGAZICI UNIVERSITESI Turkey CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE France UNIVERSITE PARIS DIDEROT - PARIS 7 France TEL AVIV UNIVERSITY Israel GEORG-AUGUST-UNIVERSITAET GÖTTINGEN Germany UNIVERSITA CA' FOSCARI VENEZIA Italy CONSORZIO INTERUNIVERSITARIO NAZIONALE PER L'INFORMATICA Italy CONFIDENTIALITY NOTE THIS DOCUMENT MAY NOT BE COPIED, REPRODUCED, OR MODIFIED IN WHOLE OR IN PART FOR ANY PURPOSE WITHOUT WRITTEN PERMISSION FROM THE SIGN-HUB CONSORTIUM. IN ADDITION TO SUCH WRITTEN PERMISSION TO COPY, REPRODUCE, OR MODIFY THIS DOCUMENT IN WHOLE OR PART, AN ACKNOWLEDGMENT OF THE AUTHORS OF THE DOCUMENT AND ALL APPLICABLE PORTIONS OF THE COPYRIGHT NOTICE MUST BE CLEARLY REFERENCED ALL RIGHTS RESERVED. -
Exmaralda Partitur-Editor Manual Version 1.6
EXMARaLDA Partitur-Editor Manual Version 1.6 Last update: October, 2016 Main contributor: Thomas Schmidt Further contributors: Tara Al-Jaf, Anne Ferger, Carolin Frontzek, Karolina Kaminska, Heidemarie Sambale, EXMARaLDA Partitur-Editor – Manual Table of Contents TABLE OF CONTENTS I. PRELIMINARY REMARKS .............................................................................................. 9 XML, EXMARaLDA and the Partitur-Editor ................................................................................. 9 “Words of Caution” ....................................................................................................................... 10 II. USER INTERFACE ............................................................................................................ 12 III. PANELS ............................................................................................................................... 16 A. Keyboard ............................................................................................................................... 16 B. Link panel .............................................................................................................................. 18 C. Audio/Video panel ................................................................................................................ 19 D. Quick media open panel ........................................................................................................ 21 E. Praat panel ............................................................................................................................