Anne Boyle David Descriptive Grammar of Bangla
edited by Thomas J. Conners and Dustin A. Chacón
DE GRUYTER Copyright © 2015 University of Maryland. All rights reserved.
This material is based upon work supported, in whole or in part, with funding from the United States Government. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the University of Maryland, College Park and/or any agency or entity of the United States Government. | v
| For my kunjus, Guenevere Adimanthi and Rosalind Kanmani.
Foreword
It is remarkable that, in this age of unprecedented global communication and interac- tion, the majority of the world’s languages are as yet not adequately described. With- out basic grammars and dictionaries, these languages and their communities of speak- ers are in a real sense inaccessible to the rest of the world. This state of affairs is anti- thetical to today’s interconnected global mindset. This series, undertaken as a critical part of the mission of the University of Mary- land Center for Advanced Study of Language (CASL), is directed at remedying this problem. One goal of CASL’s research is to provide detailed, coherent descriptions of languages that are little studied or for which descriptions are not available in En- glish. Even where grammars for these languages do exist, in many instances they are decades out of date or limited in scope or detail. While the criticality of linguistic descriptions is indisputable, the painstaking work of producing grammars for neglected and under-resourced languages is often insuffi- ciently appreciated by scholars and graduate students more enamored of the latest the- oretical advances and debates. Yet, without the foundation of accurate descriptions of real languages, theoretical work would have no meaning. Moreover, without profes- sionally produced linguistic descriptions, technologically sophisticated tools such as those for automated translation and speech-to-text conversion are impossible. Such research requires time-consuming labor, meticulous description, and rigorous analy- sis. It is hoped that this series will contribute, however modestly, to the ultimate goal of making every language of the world available to scholars, students, and language lovers of all kinds. I would like to take this opportunity to salute the linguists at CASL and around the world who subscribe to this vision as their life’s work. It is truly a noble endeavor.
Richard D. Brecht Founding Executive Director University of Maryland Center for Advanced Study of Language
Series Editors’ Preface
This series arose out of research conducted on several under-described languages at the University of Maryland Center for Advanced Study of Language. In commencing our work, we were surprised at how many of the world’s major languages lack ac- cessible descriptive resources such as reference grammars and bilingual dictionaries. Among the ongoing projects at the Center is the development of such resources for various under-described languages. This series of grammars presents some of the lin- guistic description we have undertaken to fill such gaps. The languages covered by the series represent a broad range of language families and typological phenomena. They are spoken in areas of international significance, some in regions associated with political, social, or environmental instability. Provid- ing resources for these languages is therefore of particular importance. However, these circumstances often make it difficult to conduct intensive, in-country fieldwork. In cases where such fieldwork was impractical, the authors of that grammar have relied on close working relationships with native speakers, and, where possible, corpora of naturalistic speech and text. The conditions for data-gathering—and hence our approach to it—vary with the particular situation. We found the descriptive state of each language in the series to be different from that of the others: in some cases, much work had been done, but had never been col- lected into a single overview; in other cases, virtually no materials in English existed. Similarly, the availability of source material in the target language varies widely: in some cases, literacy and media are very sparse, while for other communities plentiful written texts exist. The authors have worked with the available resources to provide descriptions as comprehensive as these materials, the native speaker consultants, and their own corpora allow. One of our goals is for these grammars to reach a broad audience. For that reason the authors have worked to make the volumes accessible by providing extensive ex- emplification and theoretically neutral descriptions oriented to language learners as well as to linguists. All grammars in the series, furthermore, include the native orthog- raphy, accompanied where relevant by Romanization. While they are not intended as pedagogical grammars, we realize that in many cases they will supply that role as well. Each of the grammars is presented as a springboard to further research, which for every language continues to be warranted. We hope that our empirical work will provide a base for theoretical, comparative, computational, and pedagogical develop- ments in the future. We look forward to the publication of many such works.
Claudia M. Brugman Thomas J. Conners Anne Boyle David Amalia E. Gnanadesikan
Preface
This grammar started as a sketch morphology written several years ago so as to build a morphological parser for Bangla. Once the parser was finished, we put the sketch aside, until, at the urging of colleagues, we took it up again, in hopes of producing a more thorough grammar. There are a number of high-quality reference grammars of Bangla already, most recently the excellent and comprehensive Bengali: A Comprehen- sive Grammar, by Hanne-Ruth Thompson. Among others are Chatterji’s monumental Origin and Development of the Bengali Language and W.L. Smith’s Bengali Reference Grammar. Dimock, Battacharji, and Chatterjee’s Introduction to Bengali, Part 1 and Clint Seely’s Intermediate Bangla (Bengali) are both fine pedagogical grammars of the language; William Radice’s Teach Yourself Bengali is a good albeit briefer one. There remain, however, far fewer up-to-date resources in English than one would expect for a language of at least 200 million speakers. In the preliminary stages, our main consultant was Dr. Clinton B. Seely, Profes- sor Emeritus at the University of Chicago. Other scholars we conferred with include Dr. Tista Bagchi, Professor of Linguistics at the University of Delhi and native speaker from Kolkata; Dr. Miriam Klaiman; Dr. Susan Steele, whose work crucially informed our de- scription of complex predicates; and Dr. Hanne-Ruth Thompson, whose gracious gen- erosity in offering to share her 1999 grammar with us was thwarted in the endbythe repeated vagaries of the postal service. We also worked with native speakers Dr. Muna Hossain, Denis D’Rozario, and Fharzana Elankumaran, all from Dhaka. Without the diligent assistance and scholarship of the above authors, scholars, and consultants, as well as many others, we would not have been able to begin this task. Dr. Seely continued with us for a time during the second stage of writing. His love for and extensive knowledge of the Bangla language and its vast literature and his patience in helping me better understand both are things I am most grateful for. We also thank him for his generosity in allowing us to use his examples and in providing new ones. We were joined in this second stage by two other experts. University of Mary- land University College undergraduate and native Bangla speaker Sonia Purification’s conscientious, insightful work, proofreading our text and supplying examples, has immensely improved this book. University of Maryland linguistics graduate student Dustin Chacón, PhD 2015, also joined us; his grammatical knowledge of and broad field experience with Bangla have been invaluable. A word at this point about whythis volume has a separate author and editors: the three of us have diverse expertise and varying levels of understanding of the language. We felt our respective backgrounds— mine in morphology and historical and South Asian linguistics; Thomas Conners’ in syntax and fieldwork, and Dustin Chacón’s in Bangla syntax and fieldwork—would complement each other and enrich the volume. We cannot hope to improve upon either Chatterji’s or Thompson’s comprehen- sive works and did not try. Among the features our grammar adds to the corpus of Bangla research are a chapter on the typology of Bangla that situates it within the xii | Preface
South Asian language area, a formal grammar which can be used to feed a morpho- logical parser,¹ and data presented in both native orthography and transcription. We provide paradigms and extensive examples, complete with full interlinearization: a Bangla script line, a phonemic transcription, a morpheme-by-morpheme gloss line, and a free translation. Native orthography is often omitted from descriptive grammars, but is particularly useful not only to the language expert but also to the language learner. The data for this grammar come from a wide range of printed resources, com- plemented by fieldwork and work here at the University of Maryland with native speaker consultants. In our description we have tried to be theory-neutral without being simplistic. Any abstract description of a language is necessarily informed by theory at some level. We aim to be theoretically informed in as broad a way as possible, such that the descrip- tions and explications in this grammar will be of use not only to descriptive linguists, but to others from a variety of theoretic backgrounds. However, our primary loyalty is to the Bangla language, not to a particular theoretic approach to Language. A descriptive grammar is never really finished. Two areas in particular that we wish we could devote more time to are Bangla phonology and complex predicates. There remains much work to be done on Bangla, and we view this volume as a spring- board for scholars to continue working on this fascinating language in all its varieties. Many people have helped in the publication of this book. The authors would like to thank all our colleagues at the University of Maryland Center for the Advanced Study of Language for their support—in particular, CASL’s Director of Research, Amy Weinberg; its founding Executive Director, Richard Brecht; our dearly missed retired colleague David Cox; and our Area Director, Mike Maxwell, the creator of the parser, whose de- tailed comments and questions kept us clear and honest in our descriptions. Others who have had a part in this project include Aric Bills, Evelyn Browne, Claudia Brugman, Erin Crabb, Melissa Fox, Amalia Gnanadesikan, and Jessica Shamoo. Dr. Gnanade- sikan’s painstaking indexing, taken on at the last minute, is especially appreciated. We are also grateful to the anonymous reviewer whose suggestions and corrections helped make this a much better book. And finally, I would like to thank my editors and co-authors for their diligence and devotion to this stimulating but sometimes frustrating project, for correcting me whenever necessary, and for keeping their senses of humor throughout. All of the peo- ple mentioned above have worked hard to ensure this book contains as few infelicities as possible; any remaining are my responsibility.
Anne Boyle David
1 Available online for download to purchasers of this volume. Contents
Foreword | vii Series Editors’ Preface | ix Preface | xi 1 About this Grammar | 1 1.1 Overview | 1 1.2 Scope of this book | 1 1.3 Tables and examples | 1 1.3.1 Order of elements in a gloss | 2 1.4 Abbreviations and symbols | 3
2 The Bangla Language | 5 2.1 Population of speakers | 5 2.2 History and classification | 5 2.3 Dialectal variation | 9 2.4 The Bangla script | 10
3 Phonology and Orthography | 13 3.1 Introduction | 13 3.2 Bangla phonemes | 14 3.2.1 Vowels | 14 3.2.2 Consonants | 16 3.3 Other phonology | 18 3.3.1 Phonotactics | 18 3.3.1.1 Vowels | 18 3.3.1.1.1 Occurrence constraints and height neutralization | 19 3.3.1.1.2 Anticipatory assimilation | 19 3.3.1.1.3 Progressive assimilation | 22 3.3.1.1.4 Sanskritic vowel mutation | 23 3.3.1.2 Consonants | 23 3.3.1.3 Syllable structure | 24 3.3.2 Prosody | 24 3.4 Romanized transcription and Bangla orthography | 25 3.4.1 Introduction: our transcription system | 25 xiv | Contents
3.4.2 Orthography of Bangla vowels | 26 3.4.2.1 Vowel length in the orthography | 26 3.4.2.2 Vowel letters and vowel diacritics | 26 3.4.2.3 The vowel letter অ and the inherent vowel | 27 3.4.2.4 The vowel letter এ and its diacritic ◌ | 29 3.4.3 Orthography of Bangla consonants | 29 3.4.3.1 Nasals | 29 3.4.3.2 Sibilants | 30 3.4.3.3 Consonant conjuncts | 30 3.4.3.4 য/য়/◌ | 33 3.4.3.5 ◌ঃ | 34 3.5 Our transcription system | 34
4 Bangla as a South Asian Language | 41 4.1 Typological convergence | 42 4.1.1 Phonology | 42 4.1.2 Complex predicates | 43 4.1.2.1 Conjunct verbs | 44 4.1.2.2 Compound verbs | 46 4.1.3 Oblique case-marked subjects | 47 4.1.4 Reduplication & onomatopoeia | 47 4.1.5 Quotatives | 48 4.2 Typological divergence | 48 4.2.1 Phonology | 48 4.2.2 Ergativity | 50 4.2.3 Classifiers | 50
5 Nouns | 53 5.1 Nominal categories | 53 5.2 Nominal inflection | 53 5.2.1 Nominal markers | 54 5.2.2 Noun paradigms | 57 5.2.3 A note on orthography of case markers | 60 5.3 Allomorphy in noun inflection | 60 5.3.1 Nominative marker allomorphy | 61 5.3.1.1 Singular | 61 5.3.1.2 Plural | 61 5.3.2 Genitive marker allomorphy | 63 5.3.2.1 Singular | 63 5.3.2.2 Plural | 65 5.3.3 Objective marker allomorphy | 66 5.3.3.1 Singular | 66 Contents | xv
5.3.3.2 Plural | 66 5.3.4 Locative marker allomorphy | 67 5.4 Use of case and number markers | 68 5.4.1 Nominative | 68 5.4.1.1 Nominative case proper | 68 5.4.1.2 Unmarked nouns | 69 5.4.2 Objective | 70 5.4.3 Genitive | 72 5.4.4 Locative | 75 5.4.5 Plural number | 76 5.5 Noun derivation | 78 5.5.1 Deriving nouns from adjectives | 78 5.5.2 Deriving nouns from nouns | 79
6 Pronouns and Other Pro-forms | 81 6.1 Introduction | 81 6.2 Pronominal morphology | 82 6.2.1 Pronominal stems | 82 6.2.2 Pronominal case-marking suffixes | 82 6.2.3 Rules of stem allomorphy | 85 6.3 Personal pronouns (including demonstratives) | 87 6.3.1 First person pronouns | 87 6.3.2 Second person pronouns | 87 6.3.3 Third person pronouns | 90 6.4 Relative and correlative pronouns | 95 6.5 Demonstrative pronouns | 99 6.6 Reflexive pronouns | 100 6.7 Interrogative pronouns | 103 6.8 Indefinite pro-forms | 105 6.8.1 Indefinite pronouns and pro-forms | 105 6.8.2 Quantifying pro-forms | 108 6.8.2.1 Declinable quantifying pro-forms | 108 6.8.2.2 Indeclinable quantifying pro-forms | 109
7 Noun Modifiers | 111 7.1 Introduction | 111 7.2 Adjectives | 111 7.2.1 About adjectives | 111 7.2.2 Comparison of adjectives | 112 7.2.2.1 Comparatives | 112 7.2.2.2 Superlatives | 115 7.2.3 Historically derived adjectives | 116 xvi | Contents
7.2.3.1 Adjectives derived from adverbs | 116 7.2.3.2 Adjectives derived from nouns | 116 7.2.3.3 Adjectives derived from verbs | 119 7.3 Noun modification via other parts of speech | 119 7.4 Determiners | 121 7.4.1 Demonstratives | 121 7.4.2 Quantifiers | 123 7.4.2.1 Number names | 124 7.4.2.1.1 Inventory and representation | 124 7.4.2.1.2 Expressions involving number names | 129 7.4.2.2 Other quantifiers | 133 7.4.2.2.1 Inventory | 133 7.4.2.2.2 Interrogative quantifiers | 135 7.4.2.2.3 Indefinite Quantifiers | 135 7.4.3 Classifiers | 135 7.4.3.1 Definition | 135 7.4.3.2 Inventory of classifiers | 136 7.4.3.2.1 -টা /-ṭa/ ∼ - ট /-ṭe/ ∼ - টা /-ṭo/ | 136 7.4.3.2.2 - /-ṭi/ | 137 7.4.3.2.3 -জন /-jɔn/ | 138 7.4.3.2.4 - েলা /-gulo/ | 140 7.4.3.2.5 - িল /-guli/ | 140 7.4.3.2.6 -খানা /-khana/ | 141 7.4.3.2.7 -খািন /-khani/ | 142 7.4.3.2.8 - /-ṭuku/, - ক /-ṭuk/, - ন /-ṭukun/, - িন /-ṭukuni/ | 142 7.4.3.3 Functions of classifiers | 142 7.4.3.4 Frozen classifiers | 147 7.4.3.4.1 গাছা /-gacha/, -গািছ /-gachi/ | 147 7.4.3.4.2 -ফালা /-fala/, -ফািল /-fali/ | 148
8 Other Word Classes and Processes | 149 8.1 Interrogative words | 149 8.2 Adverbs | 153 8.3 Postpositions and prepositions | 155 8.3.1 Postpositions | 155 8.3.1.1 Postpositions requiring the genitive case | 156 8.3.1.2 Postpositions requiring no particular case | 160 8.3.1.3 Postpositions requiring the objective case | 162 8.3.1.4 Postpositions with optional genitive case | 162 8.3.2 Prepositions | 163 8.4 Conjunctions | 163 8.4.1 Coordinating conjunctions | 164 Contents | xvii
8.4.2 Subordinating conjunctions | 167 8.5 Particles or clitics | 168 8.5.1 The particle -ই /-i/ | 169 8.5.2 The particle -ও /-o/ | 170 8.5.3 The particle ত, তা /to/ | 171 8.5.4 The particle বা /ba/ | 172 8.5.5 The particle য /je/ | 172 8.5.6 The particle যা /ja/ | 173 8.5.7 The interrogative particle িক /ki/ | 173 8.6 Reduplication | 173 8.6.1 Reduplication of whole words | 174 8.6.1.1 Repetition of verbs | 174 8.6.1.2 Repetition of other parts of speech | 174 8.6.1.3 Reduplicative expressives | 175 8.6.2 Partial reduplication | 176 8.6.2.1 Partial reduplication with initial consonant insertion | 176 8.6.2.2 Partial reduplication with final vowel change | 177 8.7 Lengthened consonants | 178
9 Verbs | 179 9.1 Inflectional features | 179 9.1.1 Verbal categories | 179 9.1.2 Personal, tense, and aspect suffixes | 179 9.1.3 Verbal stem allomorphy | 182 9.2 Verb conjugation classes | 183 9.2.1 Class 1: (C)VC-; V ≠ /a/ | 184 9.2.2 Class 2: (C)aC- | 184 9.2.3 Class 3: (C)V-; V ≠ a | 185 9.2.4 Class 4: (C)a- | 186 9.2.5 Class 5: (C)ɔ(i)- or (C)a(i)- | 186 9.2.6 Class 6: (C)VCa- or (C)Vwa- | 188 9.2.7 Class 7: (C)VCCa- or (C)VVCa- (“three-letter” verbs) | 188 9.3 Verb paradigms | 189 9.3.1 Simple present | 189 9.3.1.1 Morphology of the simple present | 189 9.3.1.2 Uses of the simple present | 191 9.3.2 Present imperative | 192 9.3.2.1 Morphology of the present imperative | 192 9.3.2.2 Uses of the present imperative | 194 9.3.3 Present imperfect | 194 9.3.3.1 Morphology of the present imperfect | 194 9.3.3.2 Uses of the present imperfect | 196 xviii | Contents
9.3.4 Present perfect | 197 9.3.4.1 Morphology of the present perfect | 197 9.3.4.2 Uses of the present perfect | 199 9.3.5 Simple future | 200 9.3.5.1 Morphology of the simple future | 200 9.3.5.2 Uses of the simple future | 202 9.3.6 Future imperative | 203 9.3.6.1 Morphology of the future imperative | 203 9.3.6.2 Uses of the future imperative | 205 9.3.7 Simple past | 205 9.3.7.1 Morphology of the simple past | 205 9.3.7.2 Uses of the simple past | 207 9.3.8 Conditional/past habitual | 208 9.3.8.1 Morphology of the conditional/past habitual | 208 9.3.8.2 Uses of the conditional/past habitual | 210 9.3.9 Past imperfect | 211 9.3.9.1 Morphology of the past imperfect | 211 9.3.9.2 Uses of the past imperfect | 213 9.3.10 Past perfect | 213 9.3.10.1 Morphology of the past perfect | 213 9.3.10.2 Uses of the past perfect | 215 9.4 Irregular verbs | 216 9.4.1 আছ- /ach-/ ‘to be present, exist’ | 216 9.4.2 দওয়া /dewa/ ‘to give’ | 217 9.4.3 নওয়া /newa/ ‘to take’ | 218 9.4.4 ন- /nɔ-/ ‘not to be, not to exist’ | 218 9.4.5 যাওয়া /jawa/ ‘to go’ | 219 9.4.6 আসা /aśa/ ‘to come’ | 221 9.5 Non-finite forms | 222 9.5.1 Perfect participle | 224 9.5.1.1 Morphology of perfect participles | 224 9.5.1.2 Uses | 224 9.5.2 Imperfect participle | 227 9.5.2.1 Morphology of imperfect participles | 227 9.5.2.2 Uses | 227 9.5.3 Conditional participle | 231 9.5.3.1 Morphology of the conditional participle | 232 9.5.3.2 Uses | 232 9.5.4 Verbal noun | 233 9.5.4.1 Morphology of verbal nouns | 234 9.5.4.1.1 Common form | 234 9.5.4.1.2 Alternate form | 234 Contents | xix
9.5.4.2 Uses | 234 9.6 Causatives | 239 9.6.1 Morphology of causatives | 239 9.6.2 Causatives of pseudo-causative verbs | 239 9.6.3 Triple causatives | 240 9.7 Negation | 242 9.7.1 না /na/ | 242 9.7.1.1 As a negator of verbs | 242 9.7.1.2 Other uses of না /na/: | 244 9.7.2 নই /nei/ ‘is not’ | 245 9.7.3 ন- /nɔ-/ ‘not to be, not to exist’ (the negative copula) | 245 9.7.4 িন /ni/ (the perfect negative) | 246 9.7.5 নারা /nara/ | 246
10 Syntax | 247 10.1 Word order and clause structure | 247 10.1.1 Scrambling | 247 10.1.2 The two be verbs | 250 10.1.2.1 আছ- /ach-/ ‘be’ | 251 10.1.3 Questions | 252 10.1.3.1 Question marker | 252 10.1.3.2 Wh-phrases | 253 10.1.3.2.1 Wh-phrase structure | 254 10.2 Noun phrase structure | 255 10.2.1 Word order | 255 10.2.1.1 Adjective placement | 256 10.2.2 Headless noun phrases | 257 10.2.3 Definiteness marking | 258 10.2.4 Quantifiers and classifiers | 259 10.2.4.1 Bare nouns | 259 10.2.4.1.1 Nouns with and without classifiers | 259 10.2.4.1.2 Floating quantifiers | 262 10.2.4.2 “The whole” | 262 10.2.4.3 Indefinite number | 263 10.2.5 Associative plurals | 264 10.3 Verbal phrase structure | 264 10.3.1 Valency | 264 10.3.1.1 Passives | 265 10.3.1.2 Causatives | 266 10.3.2 Light verb constructions | 268 10.3.2.1 Subjects and light verbs | 269 10.3.2.2 Scrambling | 269 xx | Contents
10.3.2.3 Light verb inventory | 271 10.3.3 Conjunct verbs | 273 10.3.3.1 Uses of conjunct verbs | 273 10.3.3.2 Selection | 274 10.3.4 Imperfect participles | 275 10.3.4.1 Other uses of the imperfect participle | 277 10.4 Postpositions | 278 10.5 Subordinate clauses | 281 10.5.1 Perfect participles as subordinators | 281 10.5.2 Conditionals | 282 10.5.3 Relative and correlative clauses | 284 10.5.3.1 Modifying nouns | 287 10.5.4 Complement clauses | 288 10.5.5 Other subordinate clauses | 290 10.6 Non-canonical case-marking | 291 10.6.1 Oblique subjects vs. nominative subjects | 292 10.6.2 লাগা /laga/ ‘to strike’ | 293 10.6.3 Oblique and nominative pairs | 293 10.6.4 Expressing possession with an oblique subject | 296 10.6.5 Deontic modals | 296 10.6.6 Objective case | 297 10.7 Negation | 298
References Cited | 301
A The Digital Grammar | 305 A.1 Overview | 305 A.2 Audience | 306 A.3 More on uses of this grammar | 307 A.3.1 The grammar as a basis for computational tools | 307 A.3.1.1 Building a parser and generator | 308 A.3.2 The grammar as a description | 310 A.4 Spell correction | 311 A.5 Grammar adaptation | 312 A.5.1 Manual grammar building | 312 A.5.2 Automated grammar adaptation | 313 A.6 Formatting the grammar for viewing | 314
B Unicode Representation | 317 B.1 Diacritics | 317 B.2 Normalization | 317 Index | 321 List of Tables | xxi
List of Figures
1.1 In-line text example | 1 1.2 Interlinear example | 2
2.1 Map of Bangla-speaking area | 6
List of Tables
3.1 Bangla vowels ...... 14 3.2 Representative Final VV̯ and -VV Sequences ...... 16 3.3 Bangla consonants ...... 18 3.4 Transcription of consonant conjuncts ...... 31 3.5 Transcription of vowel letters and consonants ...... 35 3.6 Other Unicode symbols used in Bangla ...... 39
5.1 Nominal markers ...... 56 5.2 Singular nouns ...... 58 5.3 Plural human nouns ...... 59 5.4 Plural (definite) non-human nouns ...... 59
6.1 Pronominal human suffixes ...... 82 6.2 Pronominal non-human suffixes ...... 83 6.3 First person pronouns; stem = am- ...... 87 6.4 Second person honorific pronouns; stem = apn- ...... 88 6.5 Second person familiar pronouns; stem = tom- ...... 89 6.6 Second person intimate pronouns; stem = to- ...... 89 6.7 Third person human honorific pronouns; stems = en-, on-, tan- (KCB); ena-, una-, tan- (DCB) ...... 91 6.8 Third person human non-honorific pronouns; stems = e-, o-, ta- . 92 xxii | List of Tables
6.9 Third person non-human pronouns, without classifiers; stems = e-, o-,ta-...... 93 6.10 Third person non-human pronouns, with classifiers; stems = e-, o-, ta- 94 6.11 Relative and correlative human honorific pronouns; stems = jan-, tan- 96 6.12 Relative and correlative human non-honorific pronouns; stems = ja-, ta-...... 97 6.13 Relative and correlative non-human pronouns; singular stems = ja-, ta-...... 98 6.14 Reflexive pronouns ...... 101 6.15 Interrogative pronoun ক /ke/ ‘who?’ ...... 104 6.16 Interrogative pronoun িক/কী /ki/ ‘what?’ ...... 104 6.17 Interrogative quantifiers ...... 105 6.18 Forms of the indefinite human pronoun কউ /keu/ ‘someone’ . . . . 106 6.19 Forms of the indefinite nonhuman pronounিক /kichu/ ‘something, anything’ ...... 107 6.20 Indefinite pronouns and pro-forms with negative ...... 107 6.21 সবাই /śɔbai/ ‘all, everyone’ ...... 108 6.22 অেনেক /ɔneke/ ‘many persons’ ...... 108 6.23 সকেল /śɔkole/ ‘everyone’ ...... 109
7.1 Nominative deictics with -ই /-i/ ...... 122 7.2 Deictic derived forms ...... 122 7.3 Bangla numerals and number names ...... 125 7.4 Abbreviations of ordinal numbers ...... 129 7.5 Indefinite quantifiers ...... 135 7.6 Functions of classifiers when attached to noun phrases ...... 145 7.7 Word types that take classifiers ...... 147
8.1 Interrogatives ...... 149 8.2 কত /kɔto/ and কয় /koe/ derivatives ...... 151 8.3 Interrogative pronoun ক /ke/ ‘who’ ...... 152 8.4 Common uses of inclusive particle-ই /-i/ ...... 169 8.5 Common uses of inclusive/concessive particle-ও /-o/ ...... 170 8.6 Common uses of inclusive/concessive particle-ও /-o/ ...... 171
9.1 Personal and tense suffixes ...... 181 9.2 Simple present; tense suffix = Ø ...... 190 9.3 Present imperative; tense suffix = Ø ...... 193 9.4 Present imperfect; tense suffix = Ø; aspect- -/-ছ- suffix = /-(c)ch-/ . 195 9.5 Present perfect; tense suffix = Ø; aspect-ছ- suffix = /-ch-/ ...... 198 9.6 Future tense; tense suffix-ব- = /-b-/ ...... 201 9.7 Future imperative; tense suffix-ব- = /-b-/ ...... 204 9.8 Simple past; tense suffix-ল- = /-l-/ ...... 206 List of Tables | xxiii
9.9 Conditional/past habitual; tense/aspect suffix-ত- = /-t-/ ...... 209 9.10 Past imperfect; tense suffix-ইল- = /-il-/; aspect suffix- -/-ছ- = /-(c)ch-/ ...... 212 9.11 Past perfect; tense suffix-ইল- = /-il-/; aspect suffix-ছ- = /-ch-/ . . . 214 9.12 Simple present: আছ- /ach-/ ‘be present; exist’ ...... 216 9.13 Simple past: আছ- /ach-/ ‘be present; exist’ ...... 217 9.14 Simple present: দওয়া /dewa/ ‘to give’ ...... 217 9.15 Simple present: দওয়া /newa/ ‘to give’ ...... 218 9.16 Present ...... 219 9.17 Simple past: যাওয়া /jawa/ ‘to go’ ...... 219 9.18 Present perfect: যাওয়া /jawa/ ‘to go’ ...... 220 9.19 Past perfect: যাওয়া /jawa/ ‘to go’ ...... 220 9.20 Present imperative:আসা /aśa/ ‘to come’ ...... 221 9.21 Simple past: আসা /aśa/ ‘to come’ ...... 221 9.22 Simple past: আসা /aśa/ ‘to come’, irregular variants ...... 222 9.23 Non-finite verb forms ...... 223 9.24 Triple causative ...... 240 9.25 Triple causative with variant meanings ...... 241
10.1 Light verbs ...... 272 10.2 Relatives and correlatives ...... 285 10.3 Morphologically complex relatives and correlatives ...... 286
1 About this Grammar
1.1 Overview
With well over 200 million native speakers, Bangla has a lot of variation. It may best be described as a continuum of varieties spoken in Bangladesh and India (with smaller populations elsewhere). There is a standard written variety of Bangla, based on that from Kolkata, but accepted in both India and Bangladesh. In the spoken language, however, the range of dialects includes members that are not mutually intelligible. See Section 2.3 for a brief discussion of these other dialects.
1.2 Scope of this book
This grammar describes what can be termed Standard Colloquial Bangla. This can be thought of as representing a non-geographically marked, idealized variety contain- ing words and grammatical features most commonly found throughout the Bangla- speaking area. Where relevant and identifiable, we make reference to two broad di- alectal varieties, one centered on the Bangladeshi city of Dhaka, and the other on the Indian (West Bengal) city of Kolkata. We call them Dhaka Colloquial Bangla (DCB) and Kolkata Colloquial Bangla (KCB) respectively.
1.3 Tables and examples
In our description we use both in-line text examples and interlinear text examples. In- line text examples are used when a single form is being referenced or explicated in the text. The format is as follows: the first element is in native script; the second is inour Romanized, largely phonemic transcription (between slashes); and the third section provides an English gloss (in single quotation marks). Figure 1.1 illustrates an in-line example.
Native script Gloss
বািড়ওয়ািল /baṛiwali/ ‘landlady’
Romanization
Figure 1.1: In-line text example 2 | About this Grammar
The format for an interlinear example is as follows: the first line is an utterance in native script; the second line renders the utterance in our Romanization; the third line provides a word-for-word translation, including any grammatical category labels; and the fourth line gives a free translation into English. Figure 1.2 illustrates and labels these parts of an interlinear example.
Native script আনীকা শাহীরেক একটা উপহার িদেয়েছ। Romanization anika śahir-ke æk-ṭa upohar diye-ch-Ø-e Gloss Aniqua Shaheer-OBJ one-CLF present give.PRF-PRF-PRS-3.NHON Free translation ‘Aniqua gave Shaheer a present.’
Figure 1.2: Interlinear example
A word about our style of morpheme glossing is in order here. We want our read- ers to have the option of seeing clearly what grammatical facts each morpheme is en- coding, so we have elected to err on the side of over-specification, particularly when glossing verbs. The glosses therefore reflect our grammatical analyses in detail. For example, in Figure 1.2, we have glossed two morphemes as perfect—bothিদেয়- /diye-/ and -ছ- /-ch-/. This reflects the fact that-ছ- /-ch-/ is the perfect affix, and that perfect- ness is also encoded by the verb stem; we want this fact to be transparent. (See Section 9.3.4 and Section 9.3.10 for a description of present and past perfect verb formation.) In the above example, we have also illustrated the fact that the present tense is encoded by the absence of a morpheme, denoted here by the null sign Ø. This approach can make for some unwieldy morpheme gloss lines, but we believe the trade-off in transparency and the ability to line up each morpheme with thegram- matical category it encodes is worth it.
1.3.1 Order of elements in a gloss
The order of grammatical categories in a gloss is as follows: VERBS: aspect.tense.personnumber.formality NOUNS/PRONOUNS/ADJS: category.person.formality.case Abbreviations and symbols | 3
1.4 Abbreviations and symbols
Where possible, morpheme glosses in this grammar follow the Leipzig Glossing Rules,¹ a set of formatting conventions widely adopted in the linguistics community. Commonly used abbreviations and symbols in this grammar include the follow- ing: *: non-existent or unacceptable form, grammatical construction, or reading %: a form, grammatical construction, or reading that not all speakers accept as grammatical ∼: variation in forms (within or across dialects) [ ]: non-overt element (in a translation or morpheme gloss) / /: phonemic transcription [ ]: phonetic transcription -: morpheme boundary in a transcription or gloss-line .: a period indicates a mismatch between the number of Bangla elements and the number of elements in the English gloss Ø: zero morpheme < >: a grapheme 1: first person 2: second person 3: third person A: transitive subject ADJ: adjective ADJZ: adjectivizer C: consonant CAUS: causative CLF: classifier CMPL: complementizer COND: conditional participle COMP: comparative DCB: Dhaka colloquial Bangla DIM: diminutive DM: discourse marker EMPH: emphatic enclitic FAM: familiar FUT: future GEN: genitive HAB: habitual HON: honorific
1 http://www.eva.mpg.de/lingua/resources/glossing-rules.php 4 | About this Grammar
HUM: human IMIT: imitative (for onomatopoeic forms) IMP: imperative INCL: inclusive/concessive enclitic INDF: indefinite marker INT: intensifier INTM: intimate intr.: intransitive IPF: imperfect IPFP: imperfect participle KCB: Kolkata colloquial Bangla lit.: literally LIT: literary usage or form (used only on dialect tags) LOC: locative LVC: light verb construction N: noun NEG: negative NHON: non-honorific NMLZ: nominalizer NOM: nominative NP: noun phrase NSTD: non-standard O: transitive object OBJ: objective case PRFP: perfect participle PL: plural PRS: present PRF: perfect PRT.RED: partial/echoic reduplication PST: simple past PST.HAB: conditional/past habitual Q: question particle REDUP: reduplication REINF: reinforcer S: intransitive subject SG: singular SUP: superlative tr.: transitive V: vowel VN: verbal noun 2 The Bangla Language
2.1 Population of speakers
Bangla, or—as it is still known to many English speakers—Bengali [ISO:ben], is spoken over a continuous swath of land in the northwest of South Asia, and also off the coast of India in the Andaman and Nicobar Islands. It is the national language of Bangladesh (with 110,000,000 speakers reported in the 2001 census¹) and the official language of the Indian states of West Bengal and Tripura, as well as of the Barak Valley region of Assam. In addition, it is a minority language in the Indian states of Jharkhand, Bihar, Meghalaya, Mizoram, and Nagaland; and, with 26% of the speaker population, the most-spoken language in the Indian union territory of the Andaman and Nicobar Is- lands (ahead of Hindi, Tamil, Telugu, Malayalam, Nicobarese, Kurux/Oraon, Munda, and Kharia).² The 2001 census reported 82,5000,000 total native speakers of Bangla in India. It is also spoken by large diaspora populations in the United States, Canada, Aus- tralia, the United Kingdom, Nepal, Pakistan, and several of the Persian Gulf states. Re- liable numbers for diasporic speakers are unavailable, but worldwide there are close to 200 million native speakers of Bangla, and a total of 250 million, counting non-native speakers.³ Both of these figures place Bangla among the ten most-spoken languages in the world.
2.2 History and classification
Bangla is a member of the Indo-Aryan sub-group of the Indo-European language fam- ily. Scholars agree that Bangla, along with Asamiya [ISO:asm]⁴ and Oriya [ISO:ori], belongs to the Eastern, or Māgadhī, branch of Indo-Aryan. The degree of their genetic relatedness to the so-called “Bihari” languages of the Indo-Aryan group—Maithili [ISO: mai], Magahi [ISO:mag], and Bhojpuri [ISO:bho]—is more controversial. Over the years scholars have alternately grouped those latter three in the Western branch, as well as in the Eastern branch, and also as their own branch of Indo-Aryan. Thanks, however, to the “mixed dialectal ancestry of most NIA [New Indo-Aryan] languages” (Masica 1991, 460), a precise and detailed historical taxonomy of the Indo-Aryan languages is probably not achievable. That is, the persistent mutual influence of various dialects, over thousands of years, has led to “the entire Indo-Aryan realm (except for Sinhalese)
1 http://www.ethnologue.com/country/BD/languages 2 Report of the Commissioner for Linguistic Minorities; www.nclm.nic.in; http://nclm.nic.in/shared/linkimages/NCLM47thReport.pdf, p. 233; languages are listed in descending order of speakers. 3 Ethnologue. 4 Formerly, Assamese. 6 | The Bangla Language
Figure 2.1: Map of Bangla-speaking area History and classification | 7 constitut[ing] one enormous dialectal continuum” (Masica 1991, 25). This grammar therefore does not come down either way on the question of the Bangla language’s connection to the Bihari languages; however, we remain conservative in our assump- tions. Therefore, when we make reference to the Māgadhi or Māgadhan languages, we are referring only to Bangla, Asamiya, and Oriya. Nevertheless the recent histories and relatedness of Bangla, Asamiya, and Oriya may be more reconstructible than the histories either of their predecessors or their con- temporary NIA cousins. Dasgupta (2003, 352) remarks that consensus on the Māgadhī branch’s history “may emerge once the contemporary enterprise of producing serious descriptions of the modern languages has achieved its objectives.” Their hypotheti- cal Middle Indo-Aryan proto-language is known as Eastern, or Māgadhī, Apabhraṃśa. Scholars agree that the three branches of this unattested language began to diverge from each other somewhere around the end of the first millennium of the Common Era (CE). Due to similarities in structure and morphology—for example, an innovated height distinction among mid vowels and no number category among verbs— Bangla and Asamiya are considered to be more closely related to each other than either is to Oriya. In other words, they broke off together from the Māgadhī tree and diverged from each other later than from Oriya. The two also share a common script, other than a cou- ple of letters that differ. All three languages retain some degree of mutual intelligibility with each other. Texts of Buddhist devotional songs known as the Charyapada and dated to be- tween the tenth and twelfth centuries represent a language often referred toas Old Bangla, but scholars are divided on how much the Māgadhī languages had differen- tiated at this point and whether the evidence is therefore sufficient to justify this ter- minology. Dasgupta (2003, 354), for example, uses the term Old X to avoid implying more than he feels confident is currently knowable about whether the songs represent a pre- or post-split state of the Māgadhī group. According to him, texts subsequent to those songs, beginning around the start of the fourteenth century, reflect a lan- guage uncontroversially designable as Middle Bangla, and include ground-breaking treatises on logic, translations of the Rāmāyaṇa and the Mahābhārata, and devotional verses of bhakti poetry, a genre which grew out of the eponymous medieval Hindu de- votional movement that originated in southern India (Cutler 1987) and promoted per- sonal union with the divine. What we call Modern Bangla gradually developed from this middle language and began to be standardized in the seventeenth century. In the years following the transition into Modern Bangla, a split in the language arose, culminating in a full-blown state of diglossia by sometime in the nineteenth century, during which time the High (H) form of Bangla, orসা ভাষা /śadhu bhāśa/ ‘elegant/correct/refined language’,⁵ became the standard form for almost all writing. Drawing on earlier states of the language, yet differing from them, it contained heav-
5 In English, Shadhu Bhasha, or Shadhu for short. 8 | The Bangla Language ily Sanskritized vocabulary and preserved certain older morphology—for example, in conjunctive participles and pronouns—from the later part of the Middle Bangla period (Dasgupta 2003, 353). Masica (1991, 57) describes Shadhu Bangla as “based on the spo- ken language of the fourteenth and fifteenth centuries.” But the ascendancy of theH form did not last long. A Low (L) form,চিলত ভাষা /colit bhāśa/ ‘colloquial/ current lan- guage’,⁶ based largely on the colloquial standard of the pre-Independence educated classes from Kolkata and its environs, began to compete for prominence as a written standard. Thanks in part to energetic promotion by Bangla literary figures, including the novelist and Nobel laureate Rabindranath Tagore,⁷ Cholit Bangla began to predom- inate in the early part of the twentieth century. For some time the two lects, H and L, vied with each other, but by the end, the strong form of Shadhu Bangla had all but disappeared, although it can still be found “in the editorial columns of a newspaper here and there” (Seely 2002/2006, 1). However, the diglossic situation for Bangla is more complicated than this story implies. Cholit Bangla diverged from Shadhu in both lexical and morphological ele- ments, but morphology received more attention in the movement towards introducing the vernacular speech into writing. Tagore’s writing, for example, while it contained much colloquial morphology, retained a highly Sanskritic vocabulary, and this lexical register difference can still be seen in Bangla writing, including in newspapers, where the lexicon may differ substantially from that of spoken forms of the language, even while the morphology is unquestionably fully Cholit. To complicate the picture further, many non-standard dialects happen to be conservative, a characteristic that can make them resemble Shadhu. The old diglossia also makes its presence felt in its influence on the phonology of educated speakers, who will use conservative pronunciations of words they know to be direct borrowings from Sanskrit., or tatsamas. Masica (1991, 58), citing Narula (1955), points to regional integration’s role in stim- ulating the standardization of many of the modern Indo-Aryan languages’ vernacu- lar varieties, with Bangla among those he emphasizes. Regional and other social fac- tors associated with certain of these languages lent themselves to the development of a collective identity among the speakers. In the case of Bangla, post-Independence clashes over national language status no doubt increased feelings of linguistic solidar- ity in the middle part of the twentieth century. Following independence from Britain in 1947 and the partition of the British-dominated sub-continent into India and Pak- istan—including the further partition of Bengal along religious lines, into the Indian state of West Bengal and the Pakistani province of East Bengal (later, East Pakistan)— the Pakistani government declared Urdu the official language of the entire country. This language policy included their non-contiguous, eastern province, which angered its majority Bangla-speaking inhabitants. Protests in favor of elevating the status of
6 In English, Cholit Bhasha, or Cholit for short. 7 Masica (1991, 57); Thompson (2010, 16). Dialectal variation | 9
Bangla erupted into a clash on February 21, 1952, in which seven students were shot and killed by police. The shootings galvanized the movement for granting equal status to Bangla, and in 1956, the Pakistani government gave in to pressure and declared Bangla the other national language of Pakistan. But the whole episode of the Language Movement had further exacerbated discontent, strengthening a rising nationalist movement that cul- minated in the 1971 Bangladesh Liberation War, then independence from Pakistan and the formation of the nation of Bangladesh. The 21st of February, known asভাষা িদবস /bhāśa dibɔś/ ‘Language Day’, is a national holiday in Bangladesh; in November 1999, UNESCO declared it International Mother Language Day as well.⁸
2.3 Dialectal variation
There is considerable dialect variation in spoken Bangla, extending in some cases to mutual unintelligibility, as Bangla can arguably be called a dialect continuum rather than a single language (a continuum which, in a larger sense, blends into Asamiya). These dialects include Barik, Bhatiari, Chirmar, Hajong [ISO:haj], Kachari-Bengali, Kha- ria Thar [ISO:ksy], Lohari-Malpaharia, Mal Paharia [ISO:mkb], Musselmani, Rajbangsi [ISO:rjs], Rajshahi, Samaria, Saraki, Siripuria (Kishanganjia), and Sylheti [ISO:syl] in India; Barisal, Chittagonian [ISO:ctg], Hajong, Khulna, Mymensingh, Noakhali, Raj- bangsi, Sylheti, and Tangchangya [ISO:tnv] in Bangladesh; and finally, Barik, Bha- tiari, Chirmar, Kachari-Bengali, Lohari-Malpaharia, Musselmani, Rajbangsi, Rajshahi, Samaria, Saraki, and Siripuria in Nepal.⁹ Several of the above are considered by some to be separate languages. All of those accompanied by ISO designations, in fact, are labeled as separate languages by Eth- nologue. Sylheti, for example,, is a good candidate for being considered another lan- guage, rather than a dialect of Bangla. Similarly, the Rajbangsi dialect could be con- sidered a dialect of Asamiya on linguistic grounds, but is grouped with Bangla due to political considerations, because many of its speakers are citizens of West Bengal and Bangladesh and identify culturally with Bangla. And Chittagonian, which Masica (1991, 25) describes as differing from Standard Bangla “more than Assamese itself”, is nevertheless classified as a dialect of Bangladeshi Bangla. Even further afield, Chākmā [ISO:ccp], spoken in the hills of Chittagong and in Myanma, is sometimes labeled a di- alect of Bangla, although it is probably a creole of Bangla and the Tibeto-Burman Chin language (Masica 1991, 27). These and other non-standard dialects of Bangla are under-documented; our fo- cus in this grammar is on the standardized dialects of Kolkata and Dhaka, which we call Kolkata Colloquial Bangla (KCB) and Dhaka Colloquial Bangla (DCB). These two di-
8 http://www.un.org/en/events/motherlanguageday/ 9 Masica (1991, 25) and Ethnologue (http://www.ethnologue.com/). Accessed on 9 February 2015. 10 | The Bangla Language alects, while they differ in numerous small ways, are indisputably the same language: highly mutually intelligible and recognized by Bangla speakers as the standard in both speaking and writing. When a feature explicated in the grammar is common to both varieties, as most features are, no indication will be given—this is the default. Only where there are distinct forms for the two varieties will they be marked accordingly.
2.4 The Bangla script
The Bangla script is descended from the northern variety of Brāhmī, a script which originated in India and from which most modern South Asian scripts are descended, as well as many of those of South East Asia. Like other Brāhmī-descended South Asian scripts such as Devanagari, it is an alphasyllabary , or an abugida: similar in some ways to alphabetic systems used in other scripts (such as the Cyrillic script, or the Roman script used to write English and many other European languages), particularly in their representation of consonants. It is written from left to right and has no distinction be- tween upper- and lower-case letters. Speakers of Asamiya use a script that is identical with the exception of one or two consonant graphemes. The abugida systems of South Asia, as opposed to the writing systems typical of Eu- ropean languages differ from alphabets in that they are organized by syllables rather than by segments; hence, their handling of vowel representation differs also. One ma- jor difference is that each vowel grapheme of an abugida—with one exception—has two different forms. One form, called in English a full vowel, vowel letter, or indepen- dent vowel; is used in syllables that lack an initial consonant: that is, it occurs word- initially or following other vowels (see Section 3.4.2). The other form, called a vowel di- acritic, vowel sign, vowel mark, or dependent vowel sign; is used when the vowel sound comes after a consonant, in which case—crucially—it cannot stand on its own, butis represented as an appendage to the previous consonant. For this reason, we prefer the term vowel diacritic, since this restriction that it must appear with another character is inherent in the meaning of diacritic. When being referred to, a vowel diacritic is repre- sented in the following fashion— ◌ী , where the dotted circle to the right of the vowel sign represents the space where the consonant would be. The exception to vowels having two different written forms is the অvowel /ɔ ∼ o/, called the inherent vowel, a concept that is also typical across South Asian abugida scripts. It has only an initial form—a vowel letter—but no vowel diacritic: it is unwrit- ten when it follows a consonant. In most languages that follow this convention, the inherent vowel is /a/, but sound changes in Bangla have both rounded and also split /a/ into two phonemes: it is now realized most often as /ɔ/ and otherwise as /o/. So, for example, the Bangla letter for the consonant /k/ is written asক and pronounced as /kɔ/ or /ko/ (depending on the word), while the representation of /k/ followed by any other consonant includes a vowel diacritic:িক /ki/, ক /ke/, and so on. Note that the graphic ordering of the signs does not always map to the ordering of the sounds in The Bangla script | 11 speech, as the two previous graphemes show. This orthographic practice, while straightforward in Devanagari Sanskrit, causes some ambiguities in the representation of Bangla, where many instances of the inher- ent vowel have been lost due to sound change. Consequently the representation of Cɔ and Co in the Bangla script is homographic, as in বার /bar/ ‘time’ and বার /baro/ ‘twelve’. In situations where eliminating ambiguity is desirable, such as in children’s reading primers, the consonant can be marked with a special symbol called a hoshonto (corresponding to the virama in Devanagari), as in বা /bar/ ‘time’ (versus বার /baro/ ‘twelve’). The hoshonto tells the reader that the consonant is pronounced without any vowel following, as when the consonant in question is followed by another consonant, or when it appears word-finally. In practice the hoshonto is almost always omitted in written Bangla; however, the resulting ambiguity does not present difficulties to native speakers. A third characteristic of these South Asian abugidas is that the characters repre- senting a sequence of two or more consonants are often combined into what looks like a single character. This character is referred to as a conjunct letter. It is treated in the Unicode¹⁰ system of representing text on computers as a matter of rendering. That is, the sequence of consonants is typed on the keyboard as a sequence of characters, but appears visually (on the screen and in printed form) as a single conjunct letter. The computer’s internal representation of the consonant sequence corresponds to the typed-in sequence of characters, not to the visual appearance. Some dictionaries treat certain conjuncts (such as ক+ষ= /k + ś = kś/ ‘khiyo’) as letters in their own right. Chapter 3 describes Bangla orthography in more depth.
10 Unicode is a standard computer representation of the characters making up written text. The current standard is described in The Unicode Consortium, 2014 (Unicode Consortium n.d.).
3 Phonology and Orthography
3.1 Introduction
The best thorough treatment of Bangla phonetics and phonology remains Ferguson and Chowdhury (1960).¹ That article, however, is unfortunately only a beginning— the authors themselves call more than once for further systematic investigation and laboratory research. Their work is long overdue for an expansion and update—most notably with respect to both experimentation and more work on the non-standard dialects—a task unfortunately beyond the scope of the present volume. Instead, we aim only for a clear and linguistically precise overview of Bangla phonology as cur- rently understood. A common problem in descriptions of Bangla is the conflating of phonology with orthography. Bangla spelling, like English spelling, is conservative, a circumstance that has led to idiosyncrasies in the mapping between sound and letter. Explaining this mapping is an intricate and difficult task, but accuracy and clarity suffer when other- wise excellent scholars fail to distinguish between the language and its script. Thomp- son (2010, 40), for example, when speaking about the Bangla script, tells her readers that “Bangla is a syllabic language...” This statement doesn’t really tell us anything useful about Bangla: all languages are syllabic. What she means is that the Bangla writing system is syllable-based. Bangla itself still needs to be thought of in terms of phonemes when treating the phonology. Similarly, Radice (2003, 26) says that a par- ticular grapheme (◌ )”lengthens the sound of the consonant to which it is attached [emphasis mine],” when he means that it signifies lengthening of the consonant. (This particular vagary of Bangla spelling is due to umlaut, which we discuss in Section 3.3.1.1.)² This chapter treats both Bangla phonology and Bangla orthography, as well as the relationship between the two. The following two sections, Section 3.2 and Section 3.3, deal strictly with the phonology and for that reason, use only the International Pho- netic Alphabet (IPA), unless citing morphemes or lexemes, in which case we use our transcription system (Section 3.5). The last two sections, Section 3.4 and Section 3.5, de- scribe Bangla orthography as it represents the phonology of the language; that discus- sion makes use of both the Bangla script and this grammar’s transcription system, as well as IPA where necessary. Forward slashes (//) are used to represent phonemes—as is customary in linguistics, but here we do this with our own transcription symbols as well as with IPA symbols—and angled brackets (< >) are used to represent graphemes.
1 This is not to denigrate Khan (2008) or Kar (2010), but their works focus on particular aspects of Bangla phonology. 2 Granted, Radice’s goal is not to offer rigorous linguistic description, but to instruct a lay-audience on Bangla pronunciation. 14 | Phonology and Orthography
3.2 Bangla phonemes
3.2.1 Vowels
The vowel chart in Table 3.1 shows the Bangla vowels, oral and nasal.
i u ĩ ũ
e o ẽ õ
æ ɔ æ̃ ɔ̃
a ã
Table 3.1: Bangla vowels
As can be seen in the chart, all seven vowels have a nasal counterpart. Nasalization does not carry a heavy functional load in Bangla: oral vowels occur far more frequently than nasal ones, which in addition are mostly restricted to initial syllables; that is, prosodically prominent ones (see Section 3.3.2). There is no oral-nasal contrast next to nasal consonants, as all vowels are phonetically nasalized in that position (Dasgupta 2003). Bangla vowels do not have phonemic length. The reflexes—that is, descendants— of the Indo-Aryan short-long pairs /i, iː/ and /u, uː/ have merged, while the reflex of the short member of the Indo-Aryan pair /a, aː/ has rounded to /ɔ/ ∼ /o/, thus ob- scuring the original length distinction between the two. This merger is common to the Māgadhī sub-family of Indo-Aryan (Section 2.2). Bangla orthography does not reflect these changes, as is discussed in Section 3.4.2.1. The low-high distinction among mid vowels (both /e æ/ and /o ɔ/) is one of the in- novations shared by Bangla and Asamiya that separates them from Oriya. Some Bang- ladeshi dialects do not distinguish between /æ/ and /e/. Diphthongs are an unfinished topic in Bangla phonological research. A compre- hensive account, underpinned by phonetic and phonological experimentation, remains to be done. What follows is a summary of what has already been said about diphthongs and glides in Bangla. Masica (1991, 116) remarks of the Indo-Aryan family that “the Eastern languages have the greatest number of true diphthongs (as well as disyllabic vowel sequences)”. The diphthongs /oi̯/ and /ou̯/ are the ones most often listed in phoneme inventories— probably under the influence of Bangla orthography, which bestows on them acon- Bangla phonemes | 15 creteness by granting them unique representation in the script.³ But no agreement exists on Bangla’s diphthong inventory. Most authors name many more than two, in- cluding Ferguson and Chowdhury (1960, 42), with 17; Hai (1975), cited by Kar (2010, 19), with 31; Sarkar (1985), also cited by Kar (2010, 17), with 17; Chatterji (1986), in Kar (2010, 18), with 25; Dasgupta (2003, 356), with 18; and Thompson (2010, 27), with 25. The ques- tion of “true” diphthongs versus disyllabic vowel sequences, that is, of whether a given pair of adjacent vowels consists of vowel+offglide (VV̯) or vowel-hiatus-vowel (VV), along with the potential of an intervening glide (VV̯V)—sometimes optional, some- times obligatory—has not been fully explored for Bangla. Most authors do not make these distinctions, and Thompson (2010, 27) is explicit that she is “disregarding the fact that some of these combinations produce one syllable, others produce two.” Sarkar, Kar, Ferguson and Chowdhury, and Dasgupta, however, all draw a clear distinction between VV̯ and VV sequences. Dasgupta (2003, 357) acknowledges the interference of orthography when he remarks that “[c]rucial contrasts that the orthography con- ceals distinguish diphthongs...from sequences with an optional semivowel buffer....” An accurate and thorough description will have to consider syllabicity and distinguish between a true diphthong (VV̯) and a VV sequence. Both Dasgupta (2003, 356–357) and Ferguson and Chowdhury (1960, 39–42) assert the existence of two mid glides, /e̯/ and /o̯/, in addition to the two high glides /y/ and /w/, or /i̯/ and /u̯/. Their claim for a high/mid contrast among front glides and for a syllabic/non-syllabic contrast between these glides and their vowel counterparts is sound: they provide the following data as evidence: জা /ja/ ‘sister-in-law’ যাই /ja/ ‘go.IMP’ জাই /ja-i/ ‘sister-in-law-EMPH’ যাই /ja-i̯/ ‘I/we go [go-PRS-1]’ যায় /ja-e̯/ ‘he/she goes [go-PRS-3.NHON]’ জাও /ja-o/ ‘sister-in-law-INCL’ যাও /ja-o̯/ ‘you go [go-PRS-2.FAM]’ They admit, however, that “[e]xamples of a final contrast u-u̯ are rare” and do not provide one. Table 3.2 is adapted from a table in Ferguson and Chowdhury (1960, 42) and in- cludes all the VV̯ and VV combinations listed by Ferguson and Chowdhury and Das- gupta (2003, 357). In neither paper do the authors claim to have listed all the pos- sible diphthongs and VV sequences in Bangla, so the chart may not be comprehen- sive. A blank cell means they did not have anything to say about the existence of the vowel combination; a dash in the cell means Ferguson and Chowdhury assert probable non-existence of that combination. There are 18 diphthongs listed here altogether, in
3 Another influence comes via Sanskrit, which likewise has Devanagari symbols for the historical analogs of those two diphthongs. 16 | Phonology and Orthography columns 3, 5, 6, and 8. They come from the analyses of Ferguson and Chowdhury and Dasgupta; our study lacks the scope to definitively comment on their claims overall.
i i̯ e e̯ u̯ o o̯
a ai ai̯ ae ae̯ au̯ ao̯
æ æe̯ æo æo̯
ɔ ɔe ɔe̯ ɔo̯
e ei ei̯ ee ee̯ eu̯ eo ―
o oi oi̯ oe oe̯ ou̯ oo oo̯
i ii ii̯ ie ― iu̯ io ―
u ui ui̯ ue ― ― uo ―
Table 3.2: Representative Final VV̯ and -VV Sequences
Unfortunately, Dasgupta’s coverage of diphthongs is brief and dense, as is that of Kar (2010); while the description in Ferguson and Chowdhury, though longer, is not long enough and was written over fifty years ago. Furthermore, none of them of- fer experimental data, and little phonological evidence. Bangla diphthongs remain a fruitful area for further investigation, especially given their large inventory.
3.2.2 Consonants
As shown in Table 3.3, Standard Bangla has 31–37 consonants, depending on how one counts. With 16 stops and four affricates spanning five places of articulation, Bangla is obstruent-heavy, except in the area of fricatives . Four of the seven fricatives in the table—/v/, /s/, /z/, and /ʒ/—exist only in loanwords or in certain, more educated vari- eties of the standard language; hence the parentheses. But language contact may well be in the process of changing this phonemic picture of Bangla. As many as fifty-five years ago, in fact, Ferguson and Chowdhury (1960, 34–35) were assigning phonemic status to /s/ for some varieties of Bangla, due to fully assimilated English loanwords such as cinema, and that list of assimilated loans has only grown: many dialects have lexical items such asবাস /bas/ ‘bus’ orসাইেকল /saikel/ ‘bicycle’. They also show that Bangla phonemes | 17 many speakers (all of those in their study, in fact) had isolated but phonemic instances of /s/ in “purely Bengali material”; that is, in non-borrowings. The nasal consonants once had five places of articulation as well, but three of the five original nasal consonants of pre-Bangla have undergone a merger: the palatal, retroflex, and dental nasals all merged to a dental. The dental /n/ has palataland retroflex allophones before palatal and retroflex stops. Similarly, the three voiceless sibilants—originally a palatal, a retroflex, and a dental— have merged to a palatal /ʃ/, with a dental allophone [s] before dental consonants. Note that some Bangladeshi dialects have /s/ rather than /ʃ/ as their sibilant. This sibilant merger is another change common to the three members of the Māgadhī group, al- though Asamiya has the velar fricative /x/ instead of /ʃ/. Both Ferguson and Chowdhury (1960, 33) and Dasgupta (2003, 359) address the question of the independence of /ḍ/ and /ṛ/, as most of the time, the two are in comple- mentary distribution: /ḍ/ occurs initially, as a geminate, and in certain clusters such as /nd/̣; while /ṛ/ occurs in final position, intervocalically, and in certain other clus- ters, such as /mṛ/. But they do contrast in several words, including their respective letter names ড /ḍɔ/ and ড় /ṛɔ/; as well as the words কানাডা /kanaḍa/ ‘Canada’ versus কানাড়া /kanaṛa/ ‘Kannada [language]; [name of a] raga’ and রিডও /reḍio/ ‘radio’ ver- sus রিড়ও /reṛio/ ‘even castor [castor-INCL]’⁴ and a number of others. Most minimal pairs include an English loanword in which English /d/ has been adapted as Bangla /ḍ/, so it appears that language contact may have helped revitalize a fading contrast. In eastern dialects, /ṛ/ merges with /r/. Words with the consonant /ɖʰ/ are quite rare. As for /pʰ/, this phoneme presents a complicated descriptive situation. First, only some dialects have retained the phoneme /pʰ/ at all: most of them, including DCB, have /f/ in words where /pʰ/ historically oc- curred, as well as in loanwords where /f/ would once have been nativized with /pʰ/; for example, ফান /fon/ ‘telephone’. (The phoneme /f/ has entered the language through borrowing, possibly contributing to this loss of /pʰ/ in many Bangla dialects.) Nev- ertheless, /pʰ/ is still usually reported as a phoneme of Bangla in general, and simi- larly,
4 Although Dasgupta calls his Canada-Kannada example “somewhat far-fetched”, he nevertheless maintains it is “surely decisive”. 18 | Phonology and Orthography represented as
Palato- Labial Dental Retroflex Velar Glottal alveolar
Stops
Unaspirated p b t̪ d̪ ʈ ɖ k g
Aspirated (pʰ) bʰ t̪ʰ d̪ʰ ʈʰ ɖʰ kʰ gʰ
Affricates
Unaspirated tʃ dʒ
Aspirated tʃʰ dʒʰ
Fricatives f (v) (s z) ʃ (ʒ) h
Nasals m n ŋ
Laterals l
Flaps ɾ ɽ
w j Glides (o̯) e̯
Table 3.3: Bangla consonants
3.3 Other phonology
3.3.1 Phonotactics
3.3.1.1 Vowels The most salient characteristic of Bangla vowel behavior is a particular instability in the height of non-high vowels, especially the mid vowels, and especially as the result of assimilation. Bangla vowels exhibit both kinds of assimilation, progressive (left-to- right) and regressive (right-to-left), or anticipatory, with the latter being much more common. Details on the mutable quality of the mid vowels /e æ o ɔ/ take up most of this Other phonology | 19 section, but in Section 3.3.1.1.3 we also describe anticipatory assimilation of the low vowel /a/. We start, however, with another vowel alternation—height neutralization of mid vowels in unstressed position.
3.3.1.1.1 Occurrence constraints and height neutralization The vowel /æ/ generally occurs only in initial—that is, prosodically prominent—syllables (Dasgupta 2003); for example,এক /æk/ ‘one’, butঅেনক /ɔnek/ ‘much’, (from the pri- vative prefixঅন- /ɔ(n)-/). It is occasionally heard word-finally in what are perceived as “learned words or in bookish pronunciation of colloquial words ending in /-e/” (Fer- guson and Chowdhury 1960, 37). In KCB and other western dialects, the vowel /ɔ/, too, is mostly restricted to initial syllables, and does not occur at all word-finally, except in monosyllabic words; for example,শ /śɔ/ ‘hundred’ and হন /hɔn/ ‘you (HON) are’. This constraint gives pairs such asশ /śɔ/ ‘hundred’ versusএকশ /ækśo/ ‘one hundred’, andমল /mɔl/ ‘dirt’ versus অমল /ɔmol/ ‘free from dirt’. But /ɔ/ is more common in final syllables in DCB, where one does hear, for example,একশ /ækśɔ/ ‘one hundred’. These occurrence restrictions on /æ/ and /ɔ/ can be characterized as the result of height neutralization in unstressed syllables. In other words, low mid vowels tend to be restricted to syllabically prominent positions, at least in the western dialects. The absence of final /æ/ even in monosyllabic words is probably a consequence of itsrela- tive rarity as a phoneme in the language.⁵ As mentioned above, nasalized vowels are also almost entirely restricted to initial syllables.
3.3.1.1.2 Anticipatory assimilation Much more noteworthy than height neutralization in the Bangla vowel system is the tendency of mid vowels towards assimilatory raising. Vowel height assimilation has operated as a historical process tied to changes in the language over time, and also still exists as a productive, synchronic process. The process raises the mid vowels /e æ o ɔ/ one step, to /i e u o/ respectively, when the subsequent syllable contains a high vowel. It is important at the outset of a discussion on Bangla vowel assimilation to clarify some critical distinctions in both terminology and the actual processes. First, regard- ing processes: diachronic and synchronic vowel assimilation differ qualitatively, as Masica (1991, 208) reminds us. Diachronic processes are historical changes that occur in the structure and rules of a language over the long term, resulting in a permanently different structure to that language; while synchronic processes involve the applica-
5 According to Chatterji (1926, 410), /æ/ is a fairly recent addition to the phoneme inventory, having only arisen in the language towards the end of the Middle Bangla period, at the earliest. 20 | Phonology and Orthography tion of currently productive rules to particular forms during the production of those forms. Masica observes that ”both [diachronic and synchronic vowel assimilation] are significantly present [emphasis his]” in Bangla vowel alternation. We see diachrony in many of the vowel alternations in Bangla verb paradigms (Section 9.2), the result of completed sound changes; for example, in the forms of the verb কনা /kena/ ‘to buy’:
• Present tense; no alternation: কেনা /ken-Ø-o/ ‘buy-PRS-2.FAM’
• Future tense; alternation: িকনেব /kin-b-e/ ‘buy-FUT-2.FAM’
In this case, the Bangla future suffix used -ইব-to be /-ib-/; it is now-ব /-b-/ due to syncope (loss) of the vowelই /i/. Previously, a synchronic, phonological process of anticipatory vowel assimilation raised a low or mid vowel before the suffixal high vowel. This rule lost its conditioning environment when the /i/ deleted, and as a result, became fossilized in the language as a morphophonemic alternation. The new future marker for Bangla verbs is now the suffix /-b-/ plus vowel raising, when applicable. In a diachronic scenario such as this one, umlaut is the term given to both the changes in the vowels of the affected lexemes and the morphophonemic process of vowel alterna- tion that the change has engendered. Vowel raising in the verbal system is complex. In addition to now-opaque, morpho- logically conditioned raising, there are instances among the verbs where the phono- logical conditioning environment has remained. In those cases, however, we would still regard the rule as morphophonemic rather than phonological, for a couple of reasons—first, its application is morphologically conditioned—it operates absolutely only on verbs—and second, it acts in tandem with the umlaut of the verbal system. In the words of Anttila (1989, 117), “synchrony repeats history directly” in the vowel raising of the Bangla verbal system. Outside of the verbal system, vowel raising is a purely phonological process, in which the conditioning environment remains transparent, as can be seen in 3.1 through 3.6, which illustrate anticipatory raising. Dasgupta (2003, 357) calls this the “normal or dominant” pattern. As 3.1 through 3.4 show, Bangla orthography does not reflect the raising of /ɔ/ to /o/.
(3.1) নট nɔṭ actor-Ø ‘actor’ Other phonology | 21
(3.2) নটী noṭ-i actor-FEM ‘actress’
(3.3) একচি শ æk-cɔlliś one-forty ‘41’
(3.4) য়ি শ põi-triś (> pɔ̃e + triś) five-thirty ‘35’ Notice that both vowels of the first element য়ি শ in /põi-triś/ ‘35’ above undergo raising: first /e/ goes to /i/, which then triggers /ɔ̃/ to raise to /õ/. Raising is not obligatory, as it is in verbs, except with low mid vowels. Compare the reduplicating form in 3.5, where the high mid vowel /e/ fails to raise before /i/ in the second component, with that in 3.6, where /æ̃/ must raise:
(3.5) লখািলিখ lekha-lekhi write-REDUP ‘writing back and forth’
(3.6) ষা- িষ ghæ̃śa-ghẽśi approach-REDUP ‘rubbing shoulders with each other’ Application of raising among forms other than verbs varies with both dialect and register; in particular, vowel raising cannot occur in Shadhu writing, but in many cases, may be either present or absent in colloquial speech and writing, so that many lexemes have two forms, one with and one without raising. Using the latter form gives one’s language a more formal feel. Vowel raising does not apply to English loanwords. A second important clarification when discussing Bangla vowel assimilation con- cerns terminology. Bangla vowel raising has been variously referred to as metaphony, vowel raising, vowel harmony, and vowel (height) assimilation. Metaphony is a problem- atic term within linguistics because there is not enough agreement on what it covers; for example, it often obscures the synchronic-diachronic distinction we make above. And since the vowel alternations are due to an articulatory process, not a categorical 22 | Phonology and Orthography constraint, we avoid the term vowel harmony. We prefer either vowel raising or vowel (height) assimilation, as they are more generally applicable and also theory-neutral.
3.3.1.1.3 Progressive assimilation Another Bangla vowel alternation that receives less mention in most descriptions is right-to-left assimilation, which, unlike the anticipatory kind, affects the lowvowel /a/. Dasgupta (2003, 358) refers to it as the “counter-normal pattern”, and says of it, “like the normal pattern, [it is] omnipotent in the conjugation and a lexical choice- ridden set of tendencies outside it.” Unlike anticipatory assimilation, however, this alternation affects the vowel /a/, changing it—in non-verbs—to /e/ following asylla- ble with /i/ and to /o/ following a syllable with /u/. Examples 3.7 through 3.10 from Dasgupta’s discussion show adjectives that are etymologically /a/-final. But compare the final vowels in 3.7 and 3.8, which have low vowels in the preceding syllable, to those of 3.9 and 3.10, whose preceding syllable contains a high vowel:
(3.7) পাকা paka ‘ripe’
(3.8) ভাঙা bhaŋa ‘broken’
(3.9) িভেজ bhije ‘wet’
(3.10) েলা bhulo ‘forgetful’ Among verbs with a high vowel in the syllable preceding the verbal noun suffix -আন /-ano/, the suffix is expressed as /ono/. In contrast with the pattern described above for non-verbs, /a/ is always raised to /o/—never to /e/—regardless of the back- ness of the conditioning vowel:
(3.11) িচবান / িচেবান cibano/cibono ‘to chew’ Other phonology | 23
(3.12) দৗড়ান / দৗেড়ান douṛano/douṛono ‘to run’ Dasgupta does not raise the subject of productivity, so it is not clear whether this is a morphophonemic alternation or a still-productive process.
3.3.1.1.4 Sanskritic vowel mutation The effects of an entirely defunct process of vowel alternations can be seen among Bangla vowels in Sanskrit-derived vowel mutations. They originated in the old Indo- European vowel gradations known as ablaut, which is unrelated to Bangla vowel as- similation. A few remnants of it can be seen in derivational forms, as with the pair below. See Section 7.2.3 for a description.
(3.13) িব ান biggæn ‘science’
(3.14) ব ািনক boiggænik ‘scientific’
3.3.1.2 Consonants The following consonants cannot occur word-initially: /ŋ/, /ɽ/, and /ɖʰ/; as well as /w/, /j/, /o̯/, and /e̯/ (in other words, the entire class of glides). The consonants /ŋ/, /ɾ/, /ɽ/, and /h/ do not occur as geminates. The phoneme /h/ does not occur word-finally except in very careful speech and in certain interjections. The aspirated consonants occur orthographically in final position, but the contrast with unaspirated consonants greatly reduced, almost non-existent (Ferguson and Chowdhury 1960). Medial consonant clusters are very common and can be of almost any combination. Word-initial consonant clusters are less common, but do occur, in the following forms:
1. CC: dental stop + /l̪/ or /ɾ/ For example, /tɾ/
2. CC: /ʃ/ + dental stop or dental liquid (/l̪/ or /ɾ/); in this position, /ʃ/ = [s] For example, /ʃɾ/ (= [sɾ])
3. CCC: /ʃ/ + stop + /l̪/ or /ɾ/; in this position, /ʃ/ = [s] For example, /ʃtɾ/ (= [stɾ]) 24 | Phonology and Orthography
Final clusters are exceedingly rare and are heard only in loanwords by educated speak- ers.
3.3.1.3 Syllable structure Kar (2010, 24), citing Sarkar (1986), gives sixteen possible syllable structures for Bangla, which he lists in descending order of frequency, seen below. Thompson (2010, 40) gives only nine, indicated in Kar’s list by boldface font. Her schema includes glides (represented by <ŷ>) as a separate category from vowels or consonants; in illustrat- ing the intersection between her list and Kar’s, I have interpreted Thompson’s glides as representing either vowels or consonants, since she does not distinguish between syllabicity and non-syllabicity in V-V sequences (Thompson 2010, 27). CV CVC V VC VV CVV CCV CCVC CVVC CCVV CCVVC CVCC CCCV CCCVC VVC CCCVV
3.3.2 Prosody
Bangla is a stress-accent language. Stress is not phonemic: all Bangla words have ini- tial stress—a rule Hayes and Lahiri (1991, 55) call “inviolable”. It also has a phonolog- ically assigned phrasal stress pattern, which assigns stress to the left-most non-clitic word in the phrase. People sometimes refer to the Bangla’s phrasal stress as a pitch accent. Several recent papers⁶ have looked at intonational stress patterns in Bangla and their interaction with focus and boundary-marking, but more work remains to be done on this topic.
6 Hayes and Lahiri (1991) and Khan (2007), among others. Romanized transcription and Bangla orthography | 25
Bangla has no tones.
3.4 Romanized transcription and Bangla orthography
3.4.1 Introduction: our transcription system
Although Bangla orthography is for the most part consistent, the complexity of the rules and their significant departure from other Indic conventions, combined with some unpredictability and the issue of the inherent vowel, make correct pronuncia- tion difficult for the non-native speaker. In previous treatments, the usual two options for transcription into a Roman script have been: 1. A letter-for-letter transliteration (with the exception that the inherent vowel— not explicitly represented in the Bangla—is also assigned a Roman symbol) One advantage of this approach is that it helps preserve Bangla orthography in readers’ minds, which is especially useful for pedagogical purposes. Another is that it keeps morphological (and historical) relationships between forms trans- parent.⁷ The disadvantage of one-to-one transliteration between the Bangla and Roman scripts is that it presents a steep learning curve for most readers, since only those already acquainted with Bangla know how to pronounce the words at first sight. Further, in many of these transliteration schemas, the choices of which Roman letters to use are influenced by the Romanization of Sanskrit to the point of obscuring the actual Bangla; for example, the use of to represent the in- herent vowel, actually realized in speech as /ɔ/ or /o/.
2. A phonemic transcription that ignores Bangla orthography. (A phonemic, or broad, transcription offers a one-to-one correspondence between symbol and sound that is frequently lacking in most languages’ orthographies, although it does not represent the pronunciation in all its fine detail, as a phonetic, or nar- row, transcription does.) The advantage is that readers know immediately how to pronounce the word. The disadvantages are that readers do not know from the transcriptions how to spell the words in Bangla, and moreover, some morphological relationships may be obscured. Since our goal is to produce a resource that can be referred to quickly and effi- ciently, rather than a teaching tool, our transcription is for the most part phonemic.
7 William Radice puts a modified transliteration schema to good use in the first section ofhis Teach Yourself Bangla. With his method, students begin to internalize the rules of pronunciation before they even know the script (Radice 2003). 26 | Phonology and Orthography
In one case, however, we have chosen to give a narrower transcription to aid pronun- ciation. Although the ɔ/o distinction is neutralized before high vowels—i.e., the real- ization of the inherent vowel when followed by a high vowel is predictably [o]—we have chosen to transcribe it phonetically, and we therefore use
3.4.2 Orthography of Bangla vowels
3.4.2.1 Vowel length in the orthography As mentioned in Section 3.2.1, vowel length as a distinctive feature has been lost in Bangla. However, the conservative orthography retains a remnant of the distinction in the three vowel grapheme pairsই /ঈ ; উ/ ঊ; and অ/ আ. Strict transliteration schemes, as well as more conservatively-minded Romanizations, will still render these respectively as , <ī>; , <ū>; and , <ā>. With respect to the first two pairs, sounds that were once long and short versions of /i/ and /u/ have merged to /i/ and /u/; as for the two vowels represented by the third pair, the length distinction has given way to a distinc- tion in quality. Since the first two differences have no connection to pronunciation, our transcription does not represent them but Romanizes bothই and ঈ as and both উ and ঊ as . With the third pair, we use <ɔ> or
3.4.2.2 Vowel letters and vowel diacritics As mentioned above in Section 2.4, every vowel grapheme has two different forms, or allographs: a vowel letter and a vowel diacritic. Vowel letters occur at the beginnings of all vowel-initial words, as in আ /alu/ ‘potato’ and also word-internally after another vowel, as in কউ /keu/ ‘someone’, as well as after a consonant when that consonant represents C+inherent vowel. An example of the latter is the wordকই /koi/ ‘where?’ Romanized transcription and Bangla orthography | 27 written with the vowel letter ই ; this may be compared with the wordিক /ki/ ‘what?’ written with the corresponding vowel diacritic ি◌ . The dashed circle in the previous sentence indicates that the symbol it accompa- nies is in fact a diacritic; that is, a symbol that cannot occur alone. The circle’s position shows where the host symbol would be placed in relation to the diacritic. Bangla or- thography generally uses vowel diacritics to represent vowel-initial suffixes, unless a vowel-initial suffix attaches to a vowel-final stem, কাথাও asin /kotha-o/ ‘somewhere [where-INCL]’. In general, only the graphemesই ,উ , and ও
3.4.2.3 The vowel letter অ and the inherent vowel Due to sound changes and a conservative orthography, Bangla spelling does not al- ways distinguish the phonemes /o/ and /ɔ/. Although the Bangla alphabet does have the letter ও and its diacritic ◌াto represent some instances of /o/, the vowel letterঅ , used at the beginnings of words, and its null non-initial counterpart (the inherent vowel mentioned above in Section 2.4 and Section 3.4.1) are often used to represent both /ɔ/ and/o/. We see an example of a minimal pair inকের /kore/ ‘having done’ versus কের /kɔre/ ‘(he/she) does’. This ambiguity can cause difficulty for non-native speakers when reading the language. Although the spelling is not always a guide to which pronunciation of the inherent vowel is appropriate, there are some rules and tendencies that can often help:
1. Due to Bangla Vowel Raising (Section 3.3.1.1.2), /o/ and /ɔ/ are neutralized to /o/ when the vowel in the next syllable is high, as inনদী /nodi/ ‘river’. Because this is a phonological rule, this neutralization always holds, except in initial syllables. See (5) below for more on this exception.
2. Word-initially, the vowel is almost always pronounced [ɔ], unless (a) there is a following high vowel, and (b) the initial vowel is not the privative morphemeঅ /ɔ/ ‘not, un-’; e.g., অনীকা /onika/ ‘Onika [proper name]’, in which case the Vowel Raising rule applies. There are a few lexical exceptions to this initial- syllable neutralization to /ɔ/ that are predictable from the orthography. (See the last item of this list.) 28 | Phonology and Orthography
3. When a word contains two or more instances of these vowels, the sequence ɔ...o is much more likely than o...ɔ; e.g.,শহর is pronounced /śɔhor/ , not */śohɔr/.
4. The vowel /ɔ/ does not occur word-finally in KCB except in monosyllabic words (see Section 3.3.1.1.1). A handful of exceptions to this exist: the final vowel in the word হযবরল /hɔyɔbɔrɔlɔ/ ‘all out of order, mixed up’—a word that literally consists of the names of several letters of the Bangla alphabet recited out of order—and the second person informal present imperative of class 3 verbs with an ɔ-stem (see Section 9.3.2).⁸
5. In words spelled with an inherent vowel in the first syllable and a yɔ-phola (see below under Section 3.4.3.4) in the next, the first vowel represents an /o/; e.g., অ /onno/ ‘other’. This latter spelling reflects a time before the sound change Cj > CC, when the high glide /j/ conditioned raising of the vowel that preceded the nasal. This is a feature of certain lexemes, resulting from a previous sound change, not a synchronic phonological rule. It is a quirk of Bangla’s conserva- tive spelling that this tendency can be expressed as an orthographic rule, simi- lar to the way we are taught in English that silent
Some important points about Bangla spelling in connection with the inherent vowel:
• Many Bangla words that end in /o/ have variant spellings: either (1) with the inher- ent vowel or (2) with the regular vowel diacritic for /o/, ◌া ; e.g.,ভাল ∼ ভােলা /bhalo/ ‘good’.
• Some writers will use an apostrophe in the alternating verb forms with /o/, as in ক’ র /kore/ ‘having done’ versus কের /kɔre/ ‘(he/she) does’. The forms where this happens are imperfect participles (Section 9.5.2), conditional conjunctives (Section 9.5.3), second-person future imperatives (Section 9.3.6), and the perfect participles of অ-stems (Section 9.5.1).
• Many sources on Bangla, especially older ones, misleadingly transcribe the inher- ent vowel as , because of its etymological connection to the Sanskrit short /a/ (and probably also under the influence of Hindi). Our transcription aims to render pronunciation as transparently as possible, which is why we do not follow this con- vention. On the other hand, most Romanizations of Bangla proper nouns also do so,
8 Seely (2010), pc. Romanized transcription and Bangla orthography | 29
as, for example, in the name of the famous Bengali director,সত িজত রায় Satyajit Ray. The Romanization of his name actually illustrates three anachronisms in con- ventional Bangla transcription: for /ʃ/, for /o/ or /ɔ/, and Cy for CC. His first name is pronounced /ʃottoʤit/, not /satjʌʤit/,⁹ as most non-Bangla speaking West- erners would render it. We do retain these spelling conventions when using Bangla proper nouns in our English text.
• The consonant
3.4.2.4 The vowel letter এ and its diacritic ◌ Bangla orthography also fails to distinguish between the phonemes /æ/ and /e/, repre- senting them both with the vowel letter এ or its diacritic ◌ . Because these vowels are phonemically distinct—except in cases of neutralization resulting from raising—our transcription represents this difference even though the Bangla writing system does not. In certain words, the sound /æ/ can be also represented orthographically with the letter◌ , orয-ফলা /yɔ-fola/ (see Section 3.4.3.4).
3.4.3 Orthography of Bangla consonants
In the case of the consonants, even more so than the vowels, Bangla orthography does not reflect phonological changes. As a result, there are several rules that wouldmake a direct transliteration problematic for those unfamiliar with the script and its conven- tions. For example, retroflex nasals have merged with dental nasals, yet there is stilla separate retroflex nasal letter in the script. The following sections list this and other vagaries of Bangla consonants and de- scribe how we address them in our transcriptions.
3.4.3.1 Nasals Because of the merger of three of the pre-Bangla nasals (Section 3.2.2), the graphemes , , and —historically standing for a palatal, a retroflex, and a dental nasal respectively— all represent a dental nasal in the modern script and are therefore all transcribed as /n/ in this grammar. The other two nasal consonants are /m/ and /◌ং /ŋ/.
9 Transcription here is in IPA. 30 | Phonology and Orthography
3.4.3.2 Sibilants A historical merger among the sibilants—palatal, retroflex, and dental—similar to that of the nasals, has likewise led to there being three graphemes for the single phoneme /ʃ/: , , and . Traditionally is Romanized as <ś> (to represent its originally palatal character), as <ṣ> (because formerly it represented a retroflex), and as . We rep- resent all three with <ś> in our transcription, with the following exceptions. First, loan- words with /s/ are usually pronounced with [s] in the modern standard dialects, and so we represent them as such; for example,হাসপাতাল /haspatal/ ‘hospital’. Second, sibi- lants are realized phonetically as dental when they occur before dental consonants. Although this variation is conditioned and therefore not phonemic, we represent it in our transcription, both because /s/ is an emerging phoneme in the language and be- cause contemporary convention falls overwhelmingly on this side as well. Only and actually occur in this position; the following comprise all the possible combinations:
• : represented as
• : represented as
• : represented as
• : represented as
• : represented as
• : represented as
3.4.3.3 Consonant conjuncts Due to loss or assimilation of the second member of many consonant clusters, the con- servative Bangla spelling system has conjunct consonants whose wholes do not always equal the sum of their parts. What were consonant clusters historically, but are now in speech either single consonants or geminates, are still represented by symbols com- prised of their earlier constituent parts; furthermore, these conjuncts can represent different sounds depending on whether they are initial or medial. For example:
• , which would be transliterated as <śb>, actually represents /ʃ/ initially and /ʃʃ/ medially
• , which would be transliterated as
The types of Bangla consonant conjuncts and our phonemic transcriptions of them are listed in Table 3.4. Romanized transcription and Bangla orthography | 31
Table 3.4: Transcription of consonant conjuncts
Conjunct or Phonemic conjunct Transliteration Examples transcription template