Automatic Detection of Code-Switching Style from Acoustics

Total Page:16

File Type:pdf, Size:1020Kb

Automatic Detection of Code-Switching Style from Acoustics Automatic Detection of Code-switching Style from Acoustics SaiKrishna Rallabandi*, Sunayana Sitaram, Alan W Black* *Language Technologies Institute, Carnegie Mellon University, USA Microsoft Research India [email protected], [email protected], [email protected] Abstract switching affects co-articulation and context de- pendent acoustic modeling (Elias et al., 2017). Multilingual speakers switch between lan- Therefore, developing systems for such speech re- guages displaying inter sentential, intra quires careful handling of unexpected language sentential, and congruent lexicalization switches that may occur in a single utterance. We based transitions. While monolingual hypothesize that in such scenarios it would be de- ASR systems may be capable of recogniz- sirable to condition the recognition systems on the ing a few words from a foreign language, type (Muysken, 2000) or style of language mixing they are usually not robust enough to han- that might be expected in the signal. In this paper, dle these varied styles of code-switching. we present approaches to detecting code-switching There is also a lack of large code-switched ‘style’ from acoustics. We first define style of an speech corpora capturing all these styles utterance based on two metrics that indicate the making it difficult to build code-switched level of mixing in the utterance: CodeMixing In- speech recognition systems. We hypothe- dex(CMI) and CodeMixing Span Index. Based on size that it may be useful for an ASR sys- these, we classify each mixed utterance into 5 style tem to be able to first detect the switch- classes. We also obtain an utterance level acous- ing style of a particular utterance from tic representation for each of the utterances using acoustics, and then use specialized lan- a variant of SoundNet. Using this acoustic repre- guage models or other adaptation tech- sentation as features, we try to predict the style of niques for decoding the speech. In this utterance. paper, we look at the first problem of de- tecting code-switching style from acous- 2 Related Work tics. We classify code-switched Spanish- English and Hindi-English corpora using Prior work on building Acoustic and Language two metrics and show that features ex- Models for ASR systems for code-switched speech tracted from acoustics alone can distin- can be categorized into the following approaches: guish between different kinds of code- (1) Detecting code-switching points in an utter- switching in these language pairs. ance, followed by the application of monolingual acoustic and language models to the individual Index Terms: speech recognition, code- segments (Chan et al., 2004; Lyu and Lyu, 2008; switching, language identification Shia et al., 2004). (2)Employing a shared phone 1 Introduction set to build acoustic models for mixed speech with standard language models trained on code- Code-switching refers to the phenomenon where switched text (Imseng et al., 2011; Li et al., 2011; bilingual speakers alternate between the languages Bhuvanagiri and Kopparapu, 2010; Yeh et al., while speaking. It occurs in multilingual soci- 2010). (3) Training Acoustic or Language mod- eties around the world. As Automatic Speech els on monolingual data in both languages with lit- Recognition (ASR) systems are now recognizing tle or no code-switched data (Lyu et al., 2006; Vu conversational speech, it becomes important that et al., 2012; Bhuvanagirir and Kopparapu, 2012; they handle code-switching. Furthermore, code- Yeh and Lee, 2015). We attempt to approach this 76 Proceedings of The Third Workshop on Computational Approaches to Code-Switching, pages 76–81 Melbourne, Australia, July 19, 2018. c 2018 Association for Computational Linguistics Class CMI Hi-En Utts En-Es Utts Class Description Hi-En En-Es C1 0 6771 41624 S1 Mono En 5413 27960 C2 0-0.15 13986 2284 S2 Mono Hi/Es 0 12749 C3 0.15-0.30 492 2453 S3 En Matrix 626 2883 C4 0.30-0.45 8865 1025 S4 Hi/Es Matrix 36454 1986 C5 0.45-1 2496 1562 S5 Others 8307 3345 Table 1: Distribution of CMI classes for Hinglish Table 2: Distribution of span based classes for and Spanglish Hinglish and Spanglish. Note that the term ‘Ma- trix’ is used just here notionally to indicate larger word span of the language. problem by first identifying the style of code mix- ing from acoustics. This is similar to the problem of language identification from acoustics, which is ferent styles based on two metrics: (1) Code Mix- typically done over the span of an entire utterance. ing index (Gamback and Das, 2014) which at- Deep Learning based methods have recently tempts to quantify the codemixing based on the proven very effective in speaker and language word counts and (2) CodeMixed Span information recognition tasks. Prior work in Deep Neural Net- which attempts to quantify codemixing of an utter- works (DNN) based language recognition can be ance based on the span of participating languages. grouped into two categories: (1) Approaches that 3.1 Categorization based on Code Mixing use DNNs as feature extractors followed by sep- Index arate classifiers to predict the identity of the lan- guage (Jiang et al., 2014; Matejka et al., 2014; Code Mixing Index (Gamback and Das, 2014) was Song et al., 2013) and (2) Approaches that em- introduced to quantify the level of mixing between ploy DNNs to directly predict the language ID the participating languages in a codemixed utter- (Richardson et al., 2015b,a; Lopez-Moreno et al., ance. CMI can be calculated at the corpus and ut- 2014). Although DNN based systems outper- terance level. We use utterance CMI, which is de- form the iVector based approaches, the output de- fined as: cision is dependent on the outcome from every − f g frame. This limits the real time deployment capa- wm(N(x) maxLi2L tLi (x)) + wpP (x) Cu(x) = 100 bilities for such systems. Moreover, such systems N(x) (1) typically use a fixed contextual window which where N is the number of languages, t are the spans hundreds of milliseconds of speech while the Li tokens in language L , P is the number of code language effects in a code-switched scenario are i alternation points in utterance x and w and w suprasegmental and typically span a longer range. m p are weights. In our current study, we quantize the In addition, the accuracies of such systems, espe- range of codemixed index ( 0 to 1) into 5 styles and cially ones that employ some variant of iVectors categorize each utterance as shown in Table 1.A drop as the duration of the utterance is reduced. We CMI of 0 indicates that the utterance is monolin- follow the approach of using DNNs as utterance gual. We experimented with various CMI ranges level feature extractors. Our interest is in adding and found that the chosen ranges led to a reason- long term information to influence the recognition able distribution within the corpus. For example, model, particularly at the level of the complete ut- the C2 CMI class in Hindi-English code switched terance, representing stylistic aspects of the degree data has utterances such as start and style of code-switching throughout the utter- "पंधरा पे कये थे यारा (’started at fifteen, ances. पंधरा पे यार अभी तो कुछ नही आ" eleven or fifteen but buddy nothing has happened 3 Style of Mixing and Motivation so far’). The C4 class on the other hand, has ut- terances such as ”actual म आज यह rainy season का Multiple metrics have been proposed to quantify मौसम था ना" (’actually the weather today was like codemixing (Guzmán et al., 2017; Gambäck and rainy season, right?’). An example of a C5 utter- Das, 2014) such as span of the participating lan- ance is ”ohh English अछा English कौनसा favourite guages, burstiness and complexity. For our cur- singer मतलब English म?" (’Ohh English, ok who rent study, we categorize the utterances into dif- is your favorite English singer?’) 77 3.2 Categorization based on Span of codemixing While CMI captures the level of mixing, it does not take to account the span information (regularity) of mixing. Therefore, we use language span in- formation (Guzmán et al., 2017) to categorize the utterances into 5 different styles as shown in Ta- ble 2. We divide each utterance based on the span of the participating languages into five classes - monolingual English, monolingual Hindi or Span- ish, classes where the two languages are dominant Figure 1: Architecture for style modeling using (70% or more) and all other utterances. The classes modified Soundnet S3 and S4 indicate that the primary language in the utterance has a span of at least 70% with respect 4 Experimental Setup to the length of utterance. This criterion makes these classes notionally similar to the construct of 4.1 Data ‘matrix’ language. However, we do not consider We use code-switched Spanish English (referred any information related to the word identity in this to as Spanglish hereafter) released as a part of Mi- approach. As we can see from both the CMI and ami Corpus (Deuchar et al., 2014) for training and span-based classes, the distributions of the two lan- testing. The corpus consists of 56 audio record- guage pairs are very different. The Spanglish data ings and their corresponding transcripts of infor- contains much more monolingual data, while the mal conversation between two or more speakers, Hinglish data is predominantly Hindi matrix with involving a total of 84 speakers. We segment the English embeddings. The Hinglish data set does files based on the transcriptions provided and ob- not have monolingual Hindi utterances which is tain a total of 51993 utterances. For Hinglish, due to the way the data was selected, as explained we use an in-house speech corpus of conversa- in Section 4.1.
Recommended publications
  • AUTHOR a Longitudinal Study of the Language Development Of
    DOCUMENT RESUME ED 247 706 EC 170 061 AUTHOR Mohay, Heather TITLE A Longitudinal Study of the Language Development of Three Deaf Children of Hearing Parents. PUB DATE , Jun 81 NOTE 26p.,.; Paper presented at a Seminar on Early Intervention or Young Hearing-Impaired Children (Mt. Gravatt., Queensland, Australia, June 15-16,-1981). For the proceedings, see EC 170 057. PUB TYPE Reports' = Research/Technical (143) EDRS PRICE MF01/PCO2 Plus Postage. DESCRIPTORS. *Hearing.Impairments; Infants; *Language Acquisition; Longitudinal Studies; *Manual Communication; Sign Language; Young Children (F ABSTRACT A longitudinal study followed the language acquisition'of three deaf infants. Analysis of videotapes recorded in the child's home during informal play was performed in terms of communicative gestures. Results revealed that Ss used a very limited number of hand configurations, locations for signs, and hand and arm, movements. Analysis of the gestures showed that there was a great deal of similarity in the components used by the three Ss, Ss presented consistent patterns in their gestures. Although the single gesture utterance was the most common communicative form, all Ss did combine gestures to form two-gesture utterances. Speech was infrequently used, and all Ss had larger gestural than spoken lexitons at 30 months. It was possible for elements in two gesture utterances and word/gesture combinations to be produced simultaneously. The content of the utterances indicated that they were able to express the same semantic relationships as hearing children at a similar stage of language development. (CL) 9 *********************************************************************** Reproductions supplied by EDRS are the best that can be made from the original document.
    [Show full text]
  • Two-Word Utterances Chomsky's Influence
    Two-Word Utterances When does language begin? In the middle 1960s, under the influence of Chomsky’s vision of linguistics, the first child language researchers assumed that language begins when words (or morphemes) are combined. (The reading by Halliday has some illustrative citations concerning this narrow focus on “structure.”) So our story begins with what is colloquially known as the “two-word stage.” The transition to 2-word utterances has been called “perhaps, the single most disputed issue in the study of language development” (Bloom, 1998). A few descriptive points: Typically children start to combine words when they are between 18 and 24 months of age. Around 30 months their utterances become more complex, as they add additional words and also affixes and other grammatical morphemes. These first word-combinations show a number of characteristics. First, they are systematically simpler than adult speech. For instance, function words are generally not used. Notice that the omission of inflections, such as -s, -ing, -ed, shows that the child is being systematic rather than copying. If they were simply imitating what they heard, there is no particular reason why these grammatical elements would be omitted. Conjunctions (and), articles (the, a), and prepositions (with) are omitted too. But is this because they require extra processing, which the child is not yet capable of? Or do they as yet convey nothing to the child—can she find no use for them? Second, as utterances become more complex and inflections are added, we find the famous “over-regularization”—which again shows, of course, that children are systematic, not simply copying what they here.
    [Show full text]
  • The Utterance, and Otherbasic Units for Second Language Discourse Analysis
    The Utterance, and OtherBasic Units for Second Language Discourse Analysis GRAHAM CROOKES University of Hawaii-Manoa Theselection ofa base unitisan importantdecision in theprocess ofdiscourse analysis. A number of different units form the bases of discourse analysis systems designed for dealing with structural characteristics ofsecondlanguage discourse. Thispaperreviews the moreprominentofsuch unitsand provides arguments infavourofthe selection ofone inparticular-the utterance. INTRODUCTION Discourse analysis has been an accepted part of the methodology of second language (SL) research for some time (see, for example, Chaudron 1988: 40-5; Hatch 1978; Hatch and Long 1980; Larsen-Freeman 1980), and its procedures are becoming increasingly familiar and even to some extent standardized. As is well-known, an important preliminary stage in the discourse analysis of speech is the identification of units relevant to the investigation within the body of text to be analysed, or the complete separation of the corpus into those basic units. In carrying out this stage of the analysis, the varying needs of second language discourse analysis have caused investigators to make use ofall of the traditional grammatical units of analysis (morpheme, word, clause, etc.), as well as other structural or interactional features of the discourse (for example, turns), in addition to a variety of other units defined in terms of their functions (for example, moves). These units have formed the bases of a number of different analytic systems, which canmainly beclassified as either structural orfunctional (Chaudron 1988), developed to address differing research objectives. In carrying out structural discourse analyses of oral text, researchers have been confronted with the fact that most grammars are based on a unit that is not defined for speech, but is based on the written mode of language-that is, the sentence.
    [Show full text]
  • Stochastic Language Generation in Dialogue Using Factored Language Models
    Stochastic Language Generation in Dialogue Using Factored Language Models Franc¸ois Mairesse∗ University of Cambridge ∗∗ Steve Young University of Cambridge Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of pre-generated utterances, or (b) using statistics to determine the generation decisions of an existing generator. Both approaches rely on the existence of a hand- crafted generation component, which is likely to limit their scalability to new domains. The first contribution of this article is to present BAGEL, a fully data-driven generation method that treats the language generation task as a search for the most likely sequence of semantic concepts and realization phrases, according to Factored Language Models (FLMs). As domain utterances are not readily available for most natural language generation tasks, a large creative effort is required to produce the data necessary to represent human linguistic variation for nontrivial domains. This article is based on the assumption that learning to produce paraphrases can be facilitated by collecting data from a large sample of untrained annotators using crowdsourcing—rather than a few domain experts—by relying on a coarse meaning representation. A second contribution of this article is to use crowdsourced data to show how dialogue naturalness can be improved by learning to vary the output utterances generated for a given semantic input. Two data- driven methods for generating paraphrases in dialogue are presented: (a) by sampling from the n-best list of realizations produced by BAGEL’s FLM reranker; and (b) by learning a structured perceptron predicting whether candidate realizations are valid paraphrases.
    [Show full text]
  • Annotation of Utterances for Conversational Nonverbal Behaviors
    Annotation of Utterances for Conversational Nonverbal Behaviors Allison Funkhouser CMU-RI-TR-16-25 Submitted in partial fulfillment of the requirements for the degree of Masters of Science in Robotics May 2016 Masters Committee: Reid Simmons, Chair Illah Nourbakhsh Heather Knight Abstract Nonverbal behaviors play an important role in communication for both humans and social robots. However, hiring trained roboticists and animators to individually animate every possible piece of dialogue is time consuming and does not scale well. This has motivated previous researchers to develop automated systems for inserting appropriate nonverbal behaviors into utterances based only on the text of the dialogue. Yet this automated strategy also has drawbacks, because there is basic semantic information that humans can easily identify that is not yet accurately captured by a purely automated system. Identifying the dominant emotion of a sentence, locating words that should be emphasized by beat gestures, and inferring the next speaker in a turn-taking scenario are all examples of data that would be useful when animating an utterance but which are difficult to determine automatically. This work proposes a middle ground between hand-tuned animation and a purely text-based system. Instead, untrained human workers label relevant semantic information for an utterance. These labeled sentences are then used by an automated system to produce fully animated dialogue. In this way, the relevant human-identifiable context of a scenario is preserved without requiring workers to have deep expertise of the intricacies of nonverbal behavior. Because the semantic information is independent of the robotic platform, workers are also not required to have access to a simulation or physical robot.
    [Show full text]
  • Utterance Structure in Initial L2 Acquisition
    Utterance structure in initial L2 acquisition Jacopo Saturno language Eurosla Studies 2 science press EuroSLA Studies Editor: Gabriele Pallotti Associate editors: Amanda Edmonds, Université de Montpellier; Ineke Vedder, University of Amsterdam In this series: 1. Pérez Vidal, Carmen, Sonia López­Serrano, Jennifer Ament & Dakota J. Thomas­Wilhelm (eds.). Learning context effects: Study abroad, formal instruction and international immersion classrooms 2. Saturno, Jacopo. Utterance structure in initial L2 acquisition. ISSN: 2626­2665 Utterance structure in initial L2 acquisition Jacopo Saturno language science press Saturno, Jacopo. 2020. Utterance structure in initial L2 acquisition (Eurosla Studies 2). Berlin: Language Science Press. This title can be downloaded at: http://langsci-press.org/catalog/book/265 © 2020, Jacopo Saturno Published under the Creative Commons Attribution 4.0 Licence (CC BY 4.0): http://creativecommons.org/licenses/by/4.0/ ISBN: 978-3-96110-261-7 (Digital) 978-3-96110-262-4 (Hardcover) ISSN: 2626-2665 DOI: 10.5281/zenodo.3889998 Source code available from www.github.com/langsci/265 Collaborative reading: paperhive.org/documents/remote?type=langsci&id=265 Cover and concept of design: Ulrike Harbort Typesetting: Sebastian Nordhoff Proofreading: Amir Ghorbanpour, Andreas Hölzl, Brett Reynolds, Jeroen van de Weijer, Lachlan Mackenzie, Sebastian Nordhoff, Tom Bossuyt Fonts: Libertinus, Arimo, DejaVu Sans Mono Typesetting software:Ǝ X LATEX Language Science Press Xhain Grünberger Str. 16 10243 Berlin, Germany langsci-press.org Storage and cataloguing done by FU Berlin Contents Acknowledgements v 1 Introduction: Rationale and research questions 1 1.1 Word order vs. inflectional morphology .............. 3 1.2 Task effects: structured tests vs. semi-spontaneous production .
    [Show full text]
  • An Analysis of Code Switching in the Novel 9 Summers 10 Autumns
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Vivid Journal of Language and Literature AN ANALYSIS OF CODE SWITCHING IN THE NOVEL 9 SUMMERS 10 AUTUMNS Wenny Yuliani [email protected], 0810733112 English Department, Faculty of Humanities, Andalas University ABSTRAK Kajian ini membahas tipe dan fungsi alih kode dalam percakapan-percakapan pada novel 9 Summers 10 Autumns karya Iwan Setyawan. Metode yang digunakan untuk mengumpulkan data adalah simak bebas libat cakap (Sudaryanto 1993), sedangkan teori yang diterapkan dalam menganalisis data adalah teori dari Gumperz (1982) dan Hoffman (1996). Hasil analisis data dipaparkan dengan menggunakan metode formal dan informal. Dari data yang dianalisis ditemukan tiga tipe alih kode yang muncul dalam ujaran, yaitu 14 inter-sentential switching, 4 intra-sentential switching dan 2 emblematic switching. Selain itu ditemukan empat fungsi alih kode, yakni 7 message qualification, 5 adressee specification, 5 personalization, dan 3 interjection. Dari enam fungsi alih kode yang dikemukakan Gumperz, fungsi alih kode quotation dan reiteration tidak muncul dalam data. Hal ini disebabkan karena karakter tidak pernah mengutip ujaran orang lain ketika dia berujar dengan penutur lain dan juga tidak mengulang ujaran dari satu bahasa ke bahasa lainya. Kata kunci: alih kode, tipe alih kode, fungsi alih kode ABSTRACT This study is aimed to describe the types and the functions of code switching which are uttered by the characters in the novel 9 Summers 10 Autumns. In collecting the data, non participatory observational method (Sudaryanto 1993) was applied. In analyzing the data, Hoffman’s theory (1996) is referred to analyze the types of code switching and Gumperz’s theory (1982) is referred to analyze the functions of the code switching.
    [Show full text]
  • The Semantics of Onomatopoeic Speech Act Verbs
    The Semantics of Onomatopoeic Speech Act Verbs I-Ni Tsai Chu-Ren Huang Institute of Linguistics (Preparatory Office), Institute of Linguistics (Preparatory Office), Academia Sinica Academia Sinica No. 128 Academic Sinica Road, Sec 2. No. 128 Academic Sinica Road, Sec 2. Nangkang, Taipei 115, Taiwan Nangkang, Taipei 115, Taiwan [email protected] [email protected] Abstract This paper attempts to explore the semantics of onomatopoeic speech act verbs. Language abounds with small bits of utterances to show speaker's emotions, to maintain the flow of speech and to do some daily exchange routines. These tiny vocalizations have been regarded as vocal gestures and largely studied under the framework of 'interjection'. In this paper, the emphasis is placed on the perlocutionary force the vocal tokens contain. We describe their conventionalized lexical meaning and term them as onomatopoeic speech act verb. An onomatopoeic speech act verb refers to a syntactically independent monomorphemic utterance which performs illocutionary or perlocutionary forces. It is normally directed at the listener, which making the recipient to do something or to solicit recipient's response or reaction. They are onomatopoeic because most of them are imitation of the sounds produced by doing some actions. 1 Introduction Based on the Speech Act Theory (Austin 1992; Reiss 1985; Searle 1969, 1975, 1979), speech act is what speaker performs when producing the utterance. Researchers suggest that speech act is not only an utterance act but contains the perlocutionary force. If the speaker has some particular intention when making the utterance such as committing to doing something or expressing attitude or emotions, the speech act is said to contain illocutionary force.
    [Show full text]
  • Chapter 9. Utterances and Their Meanings
    CHAPTER 9 UTTERANCES AND THEIR MEANINGS Meanings are constructed situationally by the participants in interaction, as they construe intent in each other’s uttered words. A well-known story (said to be a favorite of both Vygotsky and Bakhtin) tells of a group of sailors having a nuanced exchange by repeating the same expletive to each other, but with a different intonation and timing at each turn. This polysemousness of words is equally to be found in an office memo announcing a change in reporting procedures that leaves the recipients wondering what the real meaning is—from enacting a corporate shake-up, to disciplining a co-worker, to a power-grab by a manager, to simply creating an efficiency. Much water-cooler time may be devoted to examining the nuances of expression or sharing other contexting information until a stable social meaning is agreed on, which will then guide the behavior of all concerned. To put it explicitly, meaning is not a property of language in itself, and is not immanent in language. Meaning is what people construe using the prosthesis of language, interpreted within specific contexts of use. To understand meaning, we need to take utterance and people’s construal of utterance as our fundamental units of analysis. VOLOSINOV AND HIS CIRCLE’S PROPOSAL FOR AN UTTERANCE-BASED LINGUISTICS Volosinov in Marxism and the Philosophy of Language (1929/1973), foreshadowed by comments in his earlier work on Freud (1927/1987), argued that linguistics should be grounded in utterance, rather than in the formal structure of language. Utterance was the natural unit of speech and communication, with each utterance taking shape within a recognizable form (that is, a speech genre), directed to a specific audience (what Bakhtin, 1984a, 1986, was to call addressivity), and in response to prior utterances.
    [Show full text]
  • Utterance Production
    Utterance Production LCSD56 Psycholinguistics Production It's rather nice in the sense that where the mews is on the way up to Kilburn it's a cheaper area it's not a select shopping centre by any means and there are lots of council houses and flats and I think it's fantastic because they're very nice looking flats and everything has been fairly well designed and you can go up there and shop reasonably 2 Production – actual delivery Its just + rather nice in the sense that where ++ the mews is erm ++ on ++ on the way up to Kilburn + you've got th- ++ well its its more of a er a cheaper + its not a select shopping centre by any means and there're lots of + council houses and flats and ++ erm + I mean I + think its del- its fantastic because you can go up there and they're very + nice looking flats and everything its its been fairly well designed + 'nd you can go up there and + and shop reasonably. What are the main differences in pausing between actual and ideal delivery? Is there any pattern to the repetitions? 3 Production – actual delivery Its just + rather nice in the sense that where ++ the mews is erm ++ on ++ on the way up to Kilburn + you've got th- ++ well its its more of a er a cheaper + its not a select shopping centre by any means and there're lots of + council houses and flats and ++ erm + I mean I + think its del- its fantastic because you can go up there and they're very + nice looking flats and everything its its been fairly well designed + 'nd you can go up there and + and shop reasonably.
    [Show full text]
  • The Neat Summary of Linguistics
    The Neat Summary of Linguistics Table of Contents Page I Language in perspective 3 1 Introduction 3 2 On the origins of language 4 3 Characterising language 4 4 Structural notions in linguistics 4 4.1 Talking about language and linguistic data 6 5 The grammatical core 6 6 Linguistic levels 6 7 Areas of linguistics 7 II The levels of linguistics 8 1 Phonetics and phonology 8 1.1 Syllable structure 10 1.2 American phonetic transcription 10 1.3 Alphabets and sound systems 12 2 Morphology 13 3 Lexicology 13 4 Syntax 14 4.1 Phrase structure grammar 15 4.2 Deep and surface structure 15 4.3 Transformations 16 4.4 The standard theory 16 5 Semantics 17 6 Pragmatics 18 III Areas and applications 20 1 Sociolinguistics 20 2 Variety studies 20 3 Corpus linguistics 21 4 Language and gender 21 Raymond Hickey The Neat Summary of Linguistics Page 2 of 40 5 Language acquisition 22 6 Language and the brain 23 7 Contrastive linguistics 23 8 Anthropological linguistics 24 IV Language change 25 1 Linguistic schools and language change 26 2 Language contact and language change 26 3 Language typology 27 V Linguistic theory 28 VI Review of linguistics 28 1 Basic distinctions and definitions 28 2 Linguistic levels 29 3 Areas of linguistics 31 VII A brief chronology of English 33 1 External history 33 1.1 The Germanic languages 33 1.2 The settlement of Britain 34 1.3 Chronological summary 36 2 Internal history 37 2.1 Periods in the development of English 37 2.2 Old English 37 2.3 Middle English 38 2.4 Early Modern English 40 Raymond Hickey The Neat Summary of Linguistics Page 3 of 40 I Language in perspective 1 Introduction The goal of linguistics is to provide valid analyses of language structure.
    [Show full text]
  • Utterance-Level Multimodal Sentiment Analysis
    Utterance-Level Multimodal Sentiment Analysis Veronica´ Perez-Rosas´ and Rada Mihalcea Louis-Philippe Morency Computer Science and Engineering Institute for Creative Technologies University of North Texas University of Southern California [email protected], [email protected] [email protected] Abstract Riloff, 2005; Esuli and Sebastiani, 2006) or large annotated datasets (Maas et al., 2011). Given the During real-life interactions, people are accelerated growth of other media on the Web and naturally gesturing and modulating their elsewhere, which includes massive collections of voice to emphasize specific points or to videos (e.g., YouTube, Vimeo, VideoLectures), im- express their emotions. With the recent ages (e.g., Flickr, Picasa), audio clips (e.g., pod- growth of social websites such as YouTube, casts), the ability to address the identification of Facebook, and Amazon, video reviews are opinions in the presence of diverse modalities is be- emerging as a new source of multimodal coming increasingly important. This has motivated and natural opinions that has been left al- researchers to start exploring multimodal clues for most untapped by automatic opinion anal- the detection of sentiment and emotions in video ysis techniques. This paper presents a content (Morency et al., 2011; Wagner et al., 2011). method for multimodal sentiment classi- In this paper, we explore the addition of speech fication, which can identify the sentiment and visual modalities to text analysis in order to expressed in utterance-level visual datas- identify the sentiment expressed in video reviews. treams. Using a new multimodal dataset Given the non homogeneous nature of full-video consisting of sentiment annotated utter- reviews, which typically include a mixture of posi- ances extracted from video reviews, we tive, negative, and neutral statements, we decided show that multimodal sentiment analysis to perform our experiments and analyses at the ut- can be effectively performed, and that the terance level.
    [Show full text]