RI: Medium: New Tools and Methods for Very-Large-Scale Phonetics Research

Project Summary

The field of phonetics has experienced two revolutions in the last century: the advent of the sound spectrograph in the 1950s and the application of computers beginning in the 1970s. Today, advances in digital multimedia, networking and mass storage are promising a third revolution: a movement from the study of small, individual datasets to the analysis of published corpora that are thousands of times larger. These new bodies of data are badly needed to enable the field of phonetics to develop and test hypotheses across languages and across the many types of individual, social and contextual variation. Allied fields such as sociolinguistics and psycholinguistics ought to benefit even more. However, in contrast to speech technology research, speech science has so far taken relatively little advantage of this opportunity, because access to these resources for phonetics research requires tools and methods that are now incomplete, untested, and inaccessible to most researchers. Our research aims to fill this gap by integrating, adapting and improving techniques developed in speech technology research and database research.

The intellectual merit: The most important innovation is robust forced alignment of digital audio with phonetic representations derived from orthographic transcripts, using HMM methods developed for speech recognition technology. Existing forced-alignment techniques must be improved and validated for robust application to phonetics research. There are three basic challenges to be met: orthographic ambiguity; pronunciation variation; and imperfect transcripts (especially the omission of disfluencies). Reliable confidence measures must be developed, so as to allow regions of bad alignment to be identified and eliminated or fixed. Researchers need an easy way to get a believable picture of the distribution of transcription and measurement errors, so as to estimate confidence intervals and to determine the extent of any bias that may be introduced. And in addition to solving these problems for English, we need to show how to apply the same techniques to a range of other languages, each of which presents new problems.

In addition to more robust forced alignment, researchers also need improved techniques for creating, sharing, searching, and maintaining the databases that result from applying these techniques on a large scale. Previous research has established a workable framework for the database issues involved, and some implementations are now in use in speech technology research; but these approaches need to be extended and adapted to meet the needs of phonetics researchers.
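To make the orthographic-ambiguity and pronunciation-variation challenges concrete, the following minimal Python sketch shows how an aligner's front end might expand an orthographic word string into candidate phone sequences. The toy dictionary, ARPAbet-style labels, and function names are illustrative assumptions, not the project's actual lexicon or code.

```python
from itertools import product

# Toy pronouncing dictionary: each word maps to one or more phone-string
# variants (entries are illustrative, not drawn from any real lexicon).
PRON_DICT = {
    "the":    [["DH", "AH"], ["DH", "IY"]],          # reduced vs. full vowel
    "record": [["R", "EH", "K", "ER", "D"],          # noun reading
               ["R", "IH", "K", "AO", "R", "D"]],    # verb reading
}

def candidate_pronunciations(words):
    """Yield every phone-sequence hypothesis for a word string.

    A forced aligner would score each hypothesis against the audio and
    keep the best one; this sketch only enumerates the search space.
    """
    variants = [PRON_DICT[w.lower()] for w in words]
    for combo in product(*variants):
        yield [phone for pron in combo for phone in pron]

for seq in candidate_pronunciations(["the", "record"]):
    print(" ".join(seq))
```

A forced aligner scores each candidate against the audio and keeps the best-scoring one; on real utterances the Cartesian product must be pruned or represented as a lattice, since it grows exponentially with the number of ambiguous words.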
The broader impacts: The proposed research will help the field of phonetics to enter a new era: conducting research using very large speech corpora, in the range from hundreds of hours to hundreds of thousands of hours. It will also enhance research in other language-related fields, not only within phonetics but also in neighboring disciplines such as speech technology, sociolinguistics and linguistic anthropology. And this effort to enable new kinds of research also brings up a number of research problems that are interesting in their own right. Speech technology will benefit because a better understanding of phonetic variation will enable the creation of systems that are truly robust to the range of speakers they must deal with, thereby making modern user interfaces more accessible to the entire population. Sociolinguistics and linguistic anthropology will gain new tools to map out populations based on their speech patterns, ultimately helping our society better understand the diversity of linguistic behaviors and associated cultural manifestations it encompasses.

Key Words: speech science; corpus phonetics; acoustic modeling; pronunciation variation; phonetic databases; forced alignment.

1. Introduction

The field of phonetics has experienced two revolutions in the last century: the advent of the sound spectrograph in the 1950s and the application of computers beginning in the 1970s. Today, advances in digital multimedia, networking and mass storage are promising a third revolution: a movement from the study of small, mostly artificial datasets to the analysis of published corpora of natural speech that are thousands of times larger.

Peterson & Barney's influential 1952 study of American English vowels was based on measurements from a total of less than 30 minutes of speech. Many phonetic studies have been based on the TIMIT corpus, originally published in 1991, which contains just over 300 minutes of speech. Since then, much larger speech corpora have been published for use in technology development: collections of transcribed conversational telephone speech in English, published by the Linguistic Data Consortium (LDC), now total more than 300,000 minutes, for example. And many even larger collections are now becoming accessible, from sources such as oral histories, audio books, political debates and speeches, podcasts, and so on. To give just one example, the historical archive of U.S. Supreme Court oral arguments (http://www.oyez.org/) comprises about 9,000 hours (540,000 minutes) of transcribed audio.

These very-large-scale bodies of data make it possible to use natural speech in developing and testing hypotheses across the many types of individual, social, regional, temporal, textual and contextual variation, as well as across languages. All the sciences of spoken language stand to benefit, not only within linguistics but also in psychology, in clinical applications, and in the social sciences.

However, in contrast to speech technology research, speech science has so far taken relatively little advantage of this opportunity, because access to the resources for very-large-scale phonetics research requires tools and methods that are now incomplete, untested, and inaccessible to most researchers. Transcripts in ordinary orthography, typically inaccurate or incomplete in various ways, must be turned into detailed and accurate phonetic transcripts that are time-aligned with the digital recordings. And information about speakers, contexts, and content must be integrated with phonetic and acoustic information, within collections involving tens of thousands of speakers and billions of phonetic segments, and across collections with differing sorts of metadata that may be stored in complex and incompatible formats.

Our research aims to solve these problems by integrating, adapting and improving techniques developed in speech technology research and database research. The most important technique is forced alignment of digital audio with phonetic representations derived from orthographic transcripts, using Hidden Markov Model (HMM) methods developed for speech recognition technology. Our preliminary results, described below, convince us that this approach will work.
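As a concrete picture of what forced alignment computes, here is a minimal sketch of its dynamic-programming core, assuming an acoustic model has already produced frame-level phone log-probabilities. The flat transition model, array shapes, and random toy data are simplifying assumptions, not the HMM configuration the project would actually use.

```python
import numpy as np

def force_align(logprob, phones):
    """Align a known phone sequence to a sequence of frames.

    logprob[t, p] is log P(frame t | phone label p), as an acoustic model
    would supply. Returns (starts, confidence): starts[i] is the first
    frame of phones[i]; confidence is the best path's mean per-frame
    log-probability, a crude score for flagging suspect alignments.
    """
    T, N = logprob.shape[0], len(phones)
    D = np.full((T, N), -np.inf)        # best score: frame t inside phone i
    back = np.zeros((T, N), dtype=int)  # 1 = advanced from phone i-1 at t
    D[0, 0] = logprob[0, phones[0]]
    for t in range(1, T):
        for i in range(N):
            stay = D[t - 1, i]
            enter = D[t - 1, i - 1] if i > 0 else -np.inf
            D[t, i] = max(stay, enter) + logprob[t, phones[i]]
            back[t, i] = int(enter > stay)
    starts, i = [], N - 1               # trace back the boundary frames
    for t in range(T - 1, 0, -1):
        if back[t, i]:
            starts.append(t)
            i -= 1
    return [0] + starts[::-1], D[T - 1, N - 1] / T

# Toy usage: 40 random frames over 5 phone types, a 3-phone transcript.
rng = np.random.default_rng(0)
logprob = np.log(rng.dirichlet(np.ones(5), size=40))
starts, conf = force_align(logprob, [2, 0, 4])
```

Real systems refine this core with context-dependent acoustic models, pronunciation variants, and stronger confidence measures (for example, likelihood ratios against an unconstrained phone loop), but the boundary-recovery logic is the same.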
However, forced-alignment techniques must be improved and validated for robust application in phonetics research. There are three basic challenges to be met: orthographic ambiguity; pronunciation variation; and imperfect transcripts (especially the omission of disfluencies). Speech technology researchers have addressed all of these problems, but their solutions have been optimized to decrease word error rates in speech recognition, and must be adapted instead to decrease error and bias in selecting and time-aligning phonetic transcriptions. In particular, reliable confidence measures must be developed, so as to allow regions of uncertain segment choice or bad alignment to be identified and eliminated or fixed, and to give a believable estimate of the distribution of errors in the resulting data. And in addition to solving these problems for English, we need to show how to apply the same techniques to a range of other languages, with different phonetic and orthographic problems. In particular, widely used languages like Mandarin and Arabic have inherent ambiguities in their writing systems that make the mapping from written form to pronunciation more difficult (lack of word segmentation in Mandarin, non-encoding of short vowels in Arabic script).

Researchers also need improved techniques for dealing with the resulting datasets. This is partly a question of scale: techniques that work well on small datasets may become unacceptably slow, or fail completely, when dealing with billions of phonetic segments and hundreds of millions of words. There are also issues of consistency: different corpora, even from the same source, typically have differing sorts of metadata, and may be laid out in quite different ways. Finally, there are issues about how to deal with multiple layers of possibly asynchronous annotation, since along with phonetic segments, words, and speaker information, some datasets may have manual or automatic annotation of syntactic, semantic or pragmatic categories. Researchers need a coherent model of these varied, complex, and multidimensional databases, with methods to retrieve relevant subsets in a suitably combinatoric way (a minimal sketch of such a cross-tier query appears below). Approaches to these problems were developed at LDC under NSF awards 9983258, “Multidimensional Exploration of Linguistic Databases”, and 0317826, “Querying linguistic databases”, with key ideas documented in Bird and Liberman (2001); we propose to adapt and improve these results for the needs of phonetics research.

The proposed research will help the field of phonetics enter a new era: conducting research using very large speech corpora, in the range from hundreds of hours to hundreds of thousands of hours.
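As promised above, here is a minimal sketch of a cross-tier query in the spirit of the annotation-graph model of Bird and Liberman (2001): annotations on every tier are labeled time intervals, and queries combine tiers through temporal relations such as containment. The tier names, fields, and sample values are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    tier: str       # "phone", "word", "speaker", ...
    label: str
    start: float    # seconds
    end: float

# A few toy annotations spanning three tiers of one recording.
annotations = [
    Segment("speaker", "A",      0.00, 2.50),
    Segment("word",    "corpus", 0.40, 1.05),
    Segment("phone",   "K",      0.40, 0.52),
    Segment("phone",   "AO",     0.52, 0.70),
    Segment("phone",   "R",      0.70, 0.78),
    Segment("phone",   "P",      0.78, 0.90),
    Segment("phone",   "AH",     0.90, 0.98),
    Segment("phone",   "S",      0.98, 1.05),
]

def within(inner, outer):
    """Temporal containment, the basic relation for cross-tier queries."""
    return outer.start <= inner.start and inner.end <= outer.end

def phones_in_word(word_label):
    """Every phone token inside any token of the given word."""
    words = [s for s in annotations if s.tier == "word" and s.label == word_label]
    return [p for p in annotations if p.tier == "phone"
            and any(within(p, w) for w in words)]

print([p.label for p in phones_in_word("corpus")])
# ['K', 'AO', 'R', 'P', 'AH', 'S']
```

At the scale of billions of segments, the same temporal relations would be evaluated against indexed storage (interval trees or a relational database) rather than by linear scans; that is precisely the scale-and-consistency problem described above.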